Understanding PCA using Shiny and Stack Overflow data – Julia Silge

February 26, 2018


Principal component analysis (PCA) is a powerful approach for exploring high-dimensional data, but it can be challenging for learners to comprehend. In this talk, I will walk through a practical and interactive explanation of what PCA is and how it works. As a case study, I’ll explore a domain that many data analysts and data scientists are familiar with: programming languages and technologies, as understood through traffic to Stack Overflow questions. We will explore how interactive visualization using Shiny gives us insight into the complex, real-world relationships in high-dimensional datasets.

About the speaker

Julia Silge
Data Scientist, Stack Overflow

I love making beautiful charts, the statistical programming language R, Jane Austen, black coffee, and red wine.

I studied physics and astronomy, finishing my PhD in 2005. I worked in academia (teaching and doing research) and ed tech before moving into data science and discovering R. Now, my background in astronomy, physics, and education has given me a strong foundation for using data to answer interesting questions and communicate about technical topics with diverse audiences. I wrote a book with my collaborator Dave about text mining with R.
