Making better spaghetti (plots): Exploring the individuals in longitudinal data with the brolgar package - Dr. Nicholas Tierney - 🍝📈

January 30, 2020 Nicholas Tierney
There are two main challenges of working with longitudinal (panel) data: 1) visualising the data, and 2) understanding the model. Visualising longitudinal data is challenging because you often get a "spaghetti plot", where a line is drawn for each individual. Overlaid in one plot, the lines can look like a bowl of spaghetti, and even with a small number of subjects these plots are too overloaded to read easily. For similar reasons, it is difficult to relate model predictions back to an individual and keep the context of what the model means for that individual. For both visualisation and modelling, it is challenging to capture interesting or unusual individuals, which are often lost in the noise. Better tools, and a more diverse set of grammar and verbs, are needed to visualise and understand longitudinal data and models and to capture individual experiences.

In this talk, I introduce the R package **brolgar** (BRowse over Longitudinal data Graphically and Analytically in R), which provides new tools, verbs, and grammar to identify and summarise interesting individual patterns in longitudinal data. The package extends ggplot2 with custom facets and builds on the new tidyverts time series packages to explore longitudinal data efficiently.
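To make the contrast concrete, here is a minimal sketch of a spaghetti plot next to brolgar's custom facets and a per-individual summary. It is not taken from the talk. It assumes the CRAN releases of brolgar and ggplot2, and the names `heights`, `facet_sample()`, `facet_strata()`, `features()`, and `feat_five_num` reflect the released package as I understand it, not anything stated in this abstract.

```r
# Sketch only: assumes the CRAN releases of brolgar and ggplot2 are installed.
library(brolgar)
library(ggplot2)

# `heights` is an example dataset shipped with brolgar (assumption: current
# CRAN naming). Each country is one "individual" with a height series over time.
p_spaghetti <- ggplot(heights,
                      aes(x = year, y = height_cm, group = country)) +
  geom_line()

# Custom facet: show only a few randomly sampled individuals in each panel,
# so single lines can be followed by eye.
p_sampled <- p_spaghetti + facet_sample()

# Custom facet: assign every individual to one of a fixed number of groups
# and draw one panel per group, so all individuals are shown somewhere.
p_strata <- p_spaghetti + facet_strata()

# One row of summary statistics (a five-number summary) per individual,
# which can then be used to pick out unusual individuals.
height_summary <- features(heights, height_cm, feat_five_num)
```

Printing `p_spaghetti` shows the overloaded plot, while `p_sampled` and `p_strata` spread the same lines across panels. `height_summary` can be filtered or sorted like any other data frame to find individuals worth a closer look.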

About the Author

Nick Tierney completed his PhD in Statistics at QUT under Kerrie Mengersen. He is now a Lecturer at Monash University, working with Di Cook and Rob Hyndman. His research aims to improve data analysis workflow. Crucial to this work is producing high quality software to accompany each research idea. His work so far has focussed on the importance of knowing your data (visdat), and on creating principles and tools that make it easier to work with, explore, and model missing data (naniar).

Nick loves how the R programming language has transformed his world. He is a proud member of the rOpenSci community, and co-host of the rstats podcast Credibly Curious, with Saskia Freytag.

Nick is a keen outdoors person, and loves hiking and rock climbing, taking photos, singing karaoke, and drinking coffee.
