Putting the Fun in Functional Data: A tidy pipeline to identify routes in NFL tracking data - Dani Chu

January 30, 2020 Dani Chu
Currently, many hours in football are spent watching game film to manually label the routes run on passing plays. With tracking data, each route can be described as a sequence of spatiotemporal measurements whose length varies with the duration of the play. These data can be conveniently analyzed using nested columns with tidyr and purrr. We demonstrate how model-based curve clustering, using Bernstein polynomial basis functions (i.e., Bézier curves) fit with the Expectation-Maximization algorithm, can cluster route trajectories. Each cluster can then be labelled to obtain a route name for each route and to create route trees for all receivers. The clusters and routes can be visualized using ggplot2 and seen developing over time using gganimate.
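To make the curve representation concrete, here is a minimal sketch of how routes of different lengths can be summarized by the same number of Bézier control points through a least-squares fit on the Bernstein basis. It is written in Python rather than the R tidyverse pipeline the talk describes, and it covers only the basis-fit step, not the Expectation-Maximization clustering. The function names, the cubic degree, the rescaling of play time to [0, 1], and the synthetic route are all assumptions made for illustration, not details taken from the talk.

```python
from math import comb


def bernstein_basis(degree, t):
    """Values of the degree+1 Bernstein basis polynomials at t in [0, 1]."""
    return [comb(degree, k) * t**k * (1 - t) ** (degree - k)
            for k in range(degree + 1)]


def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    `a` is n x n; `b` is n x d (one column per coordinate).
    """
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [[aug[i][j] / aug[i][i] for j in range(n, len(aug[i]))]
            for i in range(n)]


def fit_bezier(points, degree=3):
    """Least-squares Bezier control points for one route of (x, y) samples.

    Assumes samples are evenly spaced in time; time is rescaled to [0, 1]
    so plays of different durations share one parameterization.
    """
    m = len(points)
    ts = [i / (m - 1) for i in range(m)]
    basis = [bernstein_basis(degree, t) for t in ts]
    n = degree + 1
    # Normal equations: (B^T B) P = B^T X
    btb = [[sum(row[i] * row[j] for row in basis) for j in range(n)]
           for i in range(n)]
    btx = [[sum(row[i] * p[d] for row, p in zip(basis, points))
            for d in range(2)] for i in range(n)]
    return solve_linear_system(btb, btx)


def bezier_point(control_points, t):
    """Evaluate the Bezier curve defined by control_points at t in [0, 1]."""
    weights = bernstein_basis(len(control_points) - 1, t)
    return tuple(sum(w * cp[d] for w, cp in zip(weights, control_points))
                 for d in range(2))


# Synthetic route (hypothetical): straight upfield, then a break to one side.
true_controls = [(0.0, 0.0), (0.0, 5.0), (0.0, 10.0), (6.0, 10.0)]
short_route = [bezier_point(true_controls, i / 11) for i in range(12)]
long_route = [bezier_point(true_controls, i / 29) for i in range(30)]

# Both routes reduce to four control points despite different lengths.
short_fit = fit_bezier(short_route)
long_fit = fit_bezier(long_route)
```

Because every route collapses to a fixed-size set of control points, trajectories with different numbers of tracking frames become directly comparable, which is the property a curve-clustering model needs.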

About the Author

Dani Chu

Dani Chu is a second-year master's student in statistics at Simon Fraser University. He plans to graduate in December 2019 and will join the Hockey Research & Development team with NHL Seattle in January. Recently, he completed an internship in the NBA's Basketball Strategy & Analytics department. At SFU, he is co-president of the SFU Sports Analytics Club with Lucas Wu and Matthew Reyers. Along with Lucas, Matt, and James Thomson, he won the College Division of the 2019 NFL Big Data Bowl and the 2018 Sacramento Kings Case Competition. Dani has also interned as a statistician at Best Buy Canada and Fraser Health Authority.
