My organization currently has over 250 oceanographic sensors deployed around the coast of Nova Scotia, Canada. Together, these generate around 4 million rows of data every year. I was shocked to discover that my colleagues were manually compiling, formatting, and analyzing these data using hundreds of Excel spreadsheets. This process was highly time-consuming and error-prone, and it lacked traceability. To improve this workflow, I developed an R package that reduced processing time by 95%. The package has since become integral to our data pipeline, supporting quality control, analysis, visualization, and report generation in R Markdown. The resulting datasets have already proven invaluable to industry leaders looking to invest in Nova Scotia’s coastal resources.
Talk materials are available at https://github.com/dempsey-CMAR/2022_rstudio_conf.
Danielle Dempsey is a Research Scientist and resident “R nerd” at the Centre for Marine Applied Research in Nova Scotia, Canada. Danielle enjoys developing code to automate repetitive tasks and improve workflows. She has written several R packages to facilitate ocean data wrangling and analysis, and has designed Shiny dashboards for interactive data visualization.