bigrquery, and incorporating R into a GCP Data Pipeline

November 17, 2019 @NickCanova Nicholas

@NickCanova wrote:

Hi all,

I am a longtime R and Tidyverse user, and I recently joined a company where our data team of 3 uses GCP extensively for our data pipelines and data analysis. Our data engineer is piping data from our data sources into BigQuery, and I am creating many views / tables in BigQuery from the data.

I must say BigQuery is great, and it handles big data sources of GB-sized tables with ease, but I'd also like to use R on some of the smaller tables saved in BigQuery. There are some fairly clear pros and cons to each: BigQuery has the power to handle big data, while R offers flexibility when working with smaller data. I'd like to introduce R into our stack to handle what it's good at. When I proposed this, I received the following feedback:

"deploying r onto our pipelines environment will not be straightforward. we'd have to make a special kubernetes pod just for r and maintain that. also r isn't as performant and library management isn't as easy. it isn't a language built for building production pipelines."

With that said, I am interested in hearing whether anybody here has successfully integrated R / BigQuery into a production GCP data pipeline. I don't have enough knowledge of data engineering or of our data pipeline to know better than our engineer on this. He seems convincing on R's weaknesses in this regard (library management, not as performant, lots of setup work); however, I am highly proficient in R and am confident that introducing R into our stack / pipeline would be helpful if my team allowed it.

Any thoughts or related experiences on this would be greatly appreciated, thanks!
