Part 1 - Introducing an R interface for Apache Spark

August 9, 2017 Edgar Ruiz

RStudio recently announced a new open-source package called sparklyr that connects R to Spark through a full-fledged dplyr backend, with support for the entirety of Spark’s MLlib. Because Spark can interact with distributed data with little latency, it is becoming an attractive tool for working with large datasets in an interactive environment. In addition to processing distributed data, Spark incorporates a variety of other tools, including stream processing, graph computation, and a distributed machine learning framework. Some of these tools are available to R programmers via the sparklyr package.

In this four-part series, we’ll discuss how to leverage Spark’s capabilities in a modern R environment. The sparklyr Series:

  1. Introducing an R interface for Apache Spark
  2. Extending Spark using sparklyr and R
  3. Advanced Features of sparklyr
  4. Understanding Spark and sparklyr deployment modes 

About the Author

Edgar Ruiz

Edgar is the author and administrator of the https://db.rstudio.com web site and the current administrator of the sparklyr web site, https://spark.rstudio.com. He is the author of the Data Science in Spark with sparklyr cheatsheet, a co-author of the dbplyr package, and the creator of the dbplot package.
