Question on managing long and heavy R code (one script vs. multiple pieces)

August 1, 2018 @taras Taras Kaduk

@taras wrote:

I have a fairly long, multi-stage piece of R code that takes a lot of resources and time to run.
During development, I broke it down into multiple stages, with a separate R script for each stage. Each script starts by loading an .RData file from the previous stage and ends by saving another .RData file for the next stage.

Basically, I'd first write the import part, save it. Then write the tidying part, save it. Then write the modeling part. Save it. And so on. It saved me a lot of time at the stages closer to the end.
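
A minimal sketch of the checkpoint pattern described above. The file names, the toy data, and the three stages shown here are hypothetical stand-ins, not the actual project code; in the real workflow each stage would live in its own script.

```r
# Hypothetical checkpoint locations, for illustration only.
checkpoint_dir <- tempdir()
import_file <- file.path(checkpoint_dir, "01_import.RData")
tidy_file   <- file.path(checkpoint_dir, "02_tidy.RData")

# Stage 1 (import): toy data stands in for the real, slow import step.
raw <- data.frame(id = 1:5, value = c(2.5, NA, 3.1, 4.8, NA))
save(raw, file = import_file)

# Stage 2 (tidying): starts from the previous checkpoint, not from scratch.
rm(raw)
load(import_file)                    # restores the object `raw`
tidy <- raw[!is.na(raw$value), ]     # drop rows with missing values
save(tidy, file = tidy_file)

# Stage 3 (modeling): starts from the tidy checkpoint.
rm(raw, tidy)
load(tidy_file)                      # restores the object `tidy`
fit <- lm(value ~ id, data = tidy)
```

Because each stage only needs the file written by the stage before it, a later stage can be rerun without repeating the expensive earlier ones.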


Now that the code is almost done and ready to work, I am wondering:

  • whether I should stitch it all back into one R script and remove the intermediate "checkpoints" (creating one reproducible piece of code), or keep the stages separate (since each stage produces a tangible result and potentially useful data, it would be easier for someone to pick up at any chosen stage instead of recreating everything from scratch);
  • in general, whether breaking down long code this way is good practice to begin with, and whether there is a better approach;
  • whether using .RData instead of CSV is frowned upon (it was easier to save and later load .RData objects, as they preserve the R objects exactly, including column types and attributes).
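
To illustrate the last point, here is a small sketch comparing a CSV round trip with an .RData round trip. The data frame and file paths are made up for the example; it only assumes base R functions (`write.csv`, `read.csv`, `save`, `load`).

```r
# Toy data with a factor (custom level order) and a Date column.
df <- data.frame(
  group = factor(c("a", "b", "a"), levels = c("b", "a")),
  day   = as.Date(c("2018-08-01", "2018-08-02", "2018-08-03"))
)

csv_file   <- tempfile(fileext = ".csv")
rdata_file <- tempfile(fileext = ".RData")

# CSV round trip: values survive, but types and attributes do not.
write.csv(df, csv_file, row.names = FALSE)
from_csv <- read.csv(csv_file)

# .RData round trip: load into a separate environment so `df` is not overwritten.
env <- new.env()
save(df, file = rdata_file)
load(rdata_file, envir = env)
from_rdata <- env$df

# Compare column classes before and after each round trip.
classes <- list(
  original = sapply(df, class),
  csv      = sapply(from_csv, class),
  rdata    = sapply(from_rdata, class)
)
print(classes)
```

After the CSV round trip the `day` column is no longer a `Date` and the custom factor level order is gone, so both would have to be restored by hand; the .RData copy comes back with the same classes as the original.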

Posts: 8

Participants: 6
