pins and S3 use case / best practice

February 26, 2020 @glynnfoster Glynn Foster

@glynnfoster wrote:

Hi all,

I have a question about pins and a use case with AWS S3 boards, where I'm not sure what the recommended approach is. Here at Montoux, we develop a web-based SaaS platform for life insurance companies to model their portfolios from an actuarial analysis point of view. We're developing a number of data science models in R, and we're figuring out the mechanics of productising these into our platform.

We've previously been using s3mpi (https://github.com/robertzk/s3mpi) to fetch data from S3 and cache it locally. Some of the data we consume is pretty large, so this has worked well for exploratory analysis. However, we feel like most of the community traction is around pins, so we're looking at how we can use it.

One area that's still unclear to us is how we should approach data that hasn't yet been cached/pinned. For example, it may have been uploaded directly by a user, or produced as part of some other data processing, i.e. the data.txt metadata file hasn't been produced. One way to approach this would be to use aws.s3 to pull the file and then pin it, but this seems somewhat inefficient, and it would be nice if there were a way for pins to populate the cache from an existing S3 object. Is there anything I'm missing here, or is this use case outside the scope of pins?
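
For concreteness, the pull-then-pin workaround might look roughly like the sketch below. It assumes the pins API as of early 2020 (`board_register_s3()`, `pin()`, `pin_get()`) together with `aws.s3::save_object()`. The helper name `pin_existing_s3_object`, the bucket names, and the object key are all hypothetical, and AWS credentials are assumed to be available through the usual environment variables.

```r
# Sketch only: the helper name, buckets, and keys are made up for illustration.
pin_existing_s3_object <- function(object_key, source_bucket, pin_name,
                                   board = "s3") {
  # Temporary local file that keeps the object's extension (if it has one)
  ext <- tools::file_ext(object_key)
  tmp <- tempfile(fileext = if (nzchar(ext)) paste0(".", ext) else "")
  on.exit(unlink(tmp), add = TRUE)

  # Step 1: pull the raw object, which has no pins metadata, down from S3
  aws.s3::save_object(object = object_key, bucket = source_bucket, file = tmp)

  # Step 2: pin the local copy, which uploads it to the board along with
  # the data.txt metadata
  pins::pin(tmp, name = pin_name, board = board)

  invisible(pin_name)
}

# Usage (not run here, since it needs credentials and real buckets):
# pins::board_register_s3(name = "s3", bucket = "my-pins-bucket")
# pin_existing_s3_object("uploads/portfolio.csv", "my-raw-bucket", "portfolio")
# portfolio_path <- pins::pin_get("portfolio", board = "s3")
```

The inefficiency is visible here: the object is downloaded from S3 and then uploaded to S3 again just so that the metadata gets written.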

I'd really appreciate any experiences or recommendations anyone has.

Thanks!
Glynn

Posts: 4

Participants: 2