Bigger Data With Ease Using Apache Arrow

January 21, 2021

The Apache Arrow project enables data scientists using R, Python, and other languages to work with large datasets efficiently and at interactive speed. Arrow is so fast at some workflows that it seems to defy reality, or at least the limits of R's capabilities. This talk examines the characteristics of the Arrow project that let it redefine what is possible in R. It also highlights some of the latest developments in the arrow R package, including how to query and manipulate multi-file datasets, and presents strategies for speeding up workflows by up to 100x.

Additional Videos

Neal Richardson, Lucy D'Agostino McGowan, ZJ, and Garrick Aden-Buie Q&A

Neal Richardson and Lucy D'Agostino McGowan Q&A


About the speaker

Currently Director of Engineering at Ursa Labs / RStudio. Previously led product and engineering at Crunch.io. Holds a Ph.D. in Political Science from the University of California, Berkeley.