FlatironKitchen: How we overhauled a Frankensteinian SQL workflow with the Tidyverse

FlatironKitchen: How we overhauled a Frankensteinian SQL workflow with the Tidyverse to enable fast, reproducible, elegant analyses of electronic health records.

January 30, 2020

The increasing availability of real-world electronic health record (EHR) data is revolutionising how pharma companies develop Personalized Healthcare (PHC) solutions. However, the scale and complexity of EHR data pose major challenges in deriving fit-for-purpose insights systematically and efficiently. The conventional approach, where siloed programmers write (or copy and paste) thousands of lines of undocumented, untested, unconnected SAS and SQL code for every research project, is bad for business and ultimately for patients.

Our team threw out the conventional approach and turned to R and the Tidyverse. The result is FlatironKitchen, a modern R package enabling end-to-end EHR analyses in a cohesive, user-centric platform. FlatironKitchen allows users to “pipe their way” from database connections, to calculating derived variables, to running statistical analyses, to creating stunning visualisations. All of the technical details are fully documented and seamlessly automated, allowing users to focus only on meaningful functions that are fit for purpose for EHR analyses. The result: FlatironKitchen code is so simple that it tells a step-by-step, human-readable story about what the data scientist is doing, a far cry from the Frankensteinian SQL/SAS code of the past.

FlatironKitchen represents the best of both worlds in pharmaceutical data science. It gives expert data scientists a library of unit-tested, customisable functions for implementing existing procedures and designing new ones. Simultaneously, it reduces barriers so that those who are ‘coding insecure’ can, finally, work directly with data. FlatironKitchen’s simple, easy-to-use syntax, combined with its training library of tutorials, vignettes and lessons made possible through RMarkdown, has shown itself to be truly empowering.

In addition to showcasing FlatironKitchen, we share lessons learned and give a call to action for other pharma companies to embrace R.

About the speaker

Nathaniel is a decision and data scientist with a passion for inspiring, enabling, and supporting others to make insightful data-driven decisions. An avid R fanatic, he has taught university-level R courses for 5+ years, written several R packages such as FFTrees, written an introductory e-book on R titled “YaRrr! The Pirate’s Guide to R”, presented at R conferences such as R/Medicine and useR!, and co-founded an R-focused data science start-up, “The R Bootcamp”, in Basel, Switzerland.

Currently, he works as a Senior Data Scientist in Roche Personalised Health Care where he develops and applies R solutions to data science questions using electronic health records data.

Outside of R, Nathaniel is mostly known for his love of hot sauce, his passion for patient-focused healthcare, and his tendency to turn into a pile of goo when he sees a baby animal.