MDS FEST 3.0

Blazing Fast I/O of Data in the Cloud

Kevin Wang, Founding Engineer, Eventual Computing

ChanChan Mao, Developer Relations, Eventual Computing

I/O from remote storage consistently bottlenecks large-scale data processing. Even listing hundreds of thousands of S3 files can take minutes, while reading them can be more challenging than processing. Learn strategies to overcome these performance limitations.

Talk overview

I/O from remote storage is a consistent bottleneck for large-scale data processing workloads. When you have hundreds of thousands of files in S3, even listing those files can take several minutes and become a bottleneck! Reading those files can be even more painful than actually processing them.

Daft DataFrames are built for the cloud and feature many optimizations that make them extremely efficient at reading and working with cloud storage. In this talk, we will showcase and explain some of the optimizations built into Daft's Rust I/O layer, which are exposed to users through a familiar Python DataFrame interface.
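To make the listing bottleneck concrete: S3's ListObjectsV2 call returns at most 1,000 keys per request, and each page can only be requested after the previous one returns, so listing hundreds of thousands of files under one prefix means hundreds of serial round trips. The sketch below is not Daft's implementation; it is a small standard-library simulation, assuming a hypothetical `FakeBucket` class and a made-up per-request latency, that shows one generic way to reduce wall-clock time by listing independent prefixes concurrently.

```python
import time
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 1000            # S3 ListObjectsV2 returns at most 1,000 keys per request
REQUEST_LATENCY_S = 0.005   # made-up round-trip latency, for illustration only


class FakeBucket:
    """Hypothetical in-memory stand-in for an object store (not a real client)."""

    def __init__(self, keys):
        self._keys = sorted(keys)

    def list_page(self, prefix, start_after=""):
        """Return up to PAGE_SIZE keys under `prefix` that sort after `start_after`."""
        time.sleep(REQUEST_LATENCY_S)  # pretend this is a network round trip
        matching = [k for k in self._keys if k.startswith(prefix) and k > start_after]
        return matching[:PAGE_SIZE]


def list_prefix(bucket, prefix):
    """List one prefix page by page; each request depends on the previous page."""
    keys, last = [], ""
    while True:
        page = bucket.list_page(prefix, last)
        keys.extend(page)
        if len(page) < PAGE_SIZE:   # a short page means nothing is left
            return keys
        last = page[-1]


def list_sequential(bucket, prefixes):
    """List every prefix one after another."""
    return [k for p in prefixes for k in list_prefix(bucket, p)]


def list_concurrent(bucket, prefixes, workers=8):
    """Pagination inside a prefix stays serial, but separate prefixes overlap."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        per_prefix = pool.map(lambda p: list_prefix(bucket, p), prefixes)
        return [k for keys in per_prefix for k in keys]


if __name__ == "__main__":
    prefixes = [f"part={p}/" for p in range(8)]
    bucket = FakeBucket(f"{p}file-{i:05d}.parquet" for p in prefixes for i in range(2500))

    for name, fn in [("sequential", list_sequential), ("concurrent", list_concurrent)]:
        start = time.perf_counter()
        found = fn(bucket, prefixes)
        print(f"{name}: {len(found)} keys in {time.perf_counter() - start:.3f}s")
```

Both functions return the same keys in the same order; only the number of round trips that overlap in time differs. The timings printed depend entirely on the simulated latency and say nothing about real S3 or Daft performance.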
