Scaling R with Spark - Javier Luraschi

This talk introduces new features in sparklyr that enable real-time data processing, brand-new modeling extensions, and significant performance improvements. The sparklyr package provides an R interface to Apache Spark, enabling data analysis and modeling on large datasets through familiar packages like dplyr and broom.

About the Author

Javier Luraschi

Javier is a Software Engineer with experience in technologies ranging from desktop, web, mobile, and backend to augmented reality and deep learning applications. He previously worked for Microsoft Research and SAP and holds a double degree in Mathematics and Software Engineering.
