Data wrangling with R and RStudio

Before an R program can look for answers, your data must be cleaned up and converted to a form that makes information accessible.

October 24, 2016

A recent New York Times article reported: “Data scientists, according to interviews and expert estimates, spend from 50 percent to 80 percent of their time mired in the mundane labor of collecting and preparing data, before it can be explored for useful information.”

Before an R program can look for answers, your data must be cleaned up and converted to a form that makes information accessible. In this webinar, you will learn how to use the `dplyr` and `tidyr` packages to streamline the data wrangling process. You’ll learn to:

  • Spot the variables and observations within your data
  • Quickly derive new variables and observations to explore
  • Reshape your data into the layout that works best for R
  • Join multiple data sets together
  • Use group-wise summaries to explore hidden levels of information within your data
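As a rough illustration of the steps listed above, here is a minimal sketch in R. It is not taken from the webinar materials: the tables `cases_wide` and `regions`, their columns, and all values are invented for the example. It also assumes a version of `tidyr` that provides `pivot_longer()` (1.0.0 or later); older versions would use `gather()` for the same reshaping step.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one row per city, one column per year.
# Here the year columns hold values of a variable (year), not variables.
cases_wide <- tibble(
  city   = c("A", "B", "C"),
  `2014` = c(10, 20, 30),
  `2015` = c(15, 25, 35)
)

# Hypothetical lookup table: one row per city.
regions <- tibble(
  city       = c("A", "B", "C"),
  region     = c("north", "north", "south"),
  population = c(100, 200, 500)
)

# Reshape: move the year columns into rows so that each row is
# one observation (a city in a given year).
cases_long <- cases_wide %>%
  pivot_longer(
    cols      = c(`2014`, `2015`),
    names_to  = "year",
    values_to = "cases"
  )

# Join the lookup table, then derive a new variable from existing ones.
cases_rate <- cases_long %>%
  left_join(regions, by = "city") %>%
  mutate(rate = cases / population)

# Group-wise summary: one row per region.
region_summary <- cases_rate %>%
  group_by(region) %>%
  summarise(
    total_cases = sum(cases),
    mean_rate   = mean(rate)
  )

print(region_summary)
```

Each step returns a new data frame, so the intermediate results (`cases_long`, `cases_rate`) can be inspected on their own before moving on to the next step.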

About the speaker

Garrett is the author of Hands-On Programming with R and co-author of R for Data Science and R Markdown: The Definitive Guide. He is a Data Scientist at RStudio and holds a Ph.D. in Statistics, but specializes in teaching. He’s taught people how to use R at over 50 government agencies, small businesses, and multibillion-dollar global companies, and he’s designed RStudio’s training materials for R, Shiny, R Markdown, and more. Garrett wrote the popular lubridate package for dates and times in R and creates the RStudio cheatsheets.