Reading feather files from S3 from EC2 instance on Connect

August 30, 2019 @hlendway Heather

@hlendway wrote:

I am running RStudio Connect version 1.7.4.1-7 with R 3.5.1 on an EC2 instance. My Shiny apps read large feather files, the largest being almost 2 GB. When I read the files locally, feather is by far the fastest format, but on the server feather files take more than 7 times as long as they do locally, and RData and Rds are faster. Below are the benchmark times for local runs and for the Connect server.

Local results, where feather is the fastest:

expr           min    lq  mean  median    uq   max  neval
readCSV         13    13    14      13    14    16     10
readrCSV       5.1   5.2   5.7     5.4   5.9   7.5     10
fread          1.8   1.8   1.9     1.9   2.1   2.3     10
loadRdata      7.3   7.4   7.6     7.5   7.7   7.8     10
readRds        7.3   7.4   7.5     7.5   7.6   7.8     10
readFeather    1.3   1.3   1.7     1.5   1.8   3.6     10
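
For reference, a benchmark like the one above could be set up roughly as follows. This is a minimal sketch, not the code actually used for these numbers: the sample data frame, the temporary file paths, and the mapping of each label to a reader function (for example, readrCSV to readr::read_csv) are assumptions. Only the six labels and the 10 evaluations per expression come from the table.

```r
# Sketch only: the data and file paths are hypothetical stand-ins.
library(microbenchmark)

# Small stand-in data set; the real files are far larger (up to ~2 GB).
df <- data.frame(id = seq_len(1e5), value = rnorm(1e5))

out_dir      <- tempdir()
csv_path     <- file.path(out_dir, "data.csv")
rdata_path   <- file.path(out_dir, "data.RData")
rds_path     <- file.path(out_dir, "data.rds")
feather_path <- file.path(out_dir, "data.feather")

# Write the same data once in every format being compared.
write.csv(df, csv_path, row.names = FALSE)
save(df, file = rdata_path)
saveRDS(df, rds_path)
feather::write_feather(df, feather_path)

# Time each reader 10 times, matching neval = 10 in the table.
results <- microbenchmark(
  readCSV     = read.csv(csv_path),
  readrCSV    = readr::read_csv(csv_path, col_types = readr::cols()),
  fread       = data.table::fread(csv_path),
  loadRdata   = load(rdata_path),
  readRds     = readRDS(rds_path),
  readFeather = feather::read_feather(feather_path),
  times = 10
)
print(results)
```

Running the same script on the Connect server, against files already copied to local disk, would help show how much of the slowdown comes from reading the file and how much from pulling it from S3.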

Here are the results run from Connect, which includes pulling the file from S3:
[Image: benchmark results from the Connect server]

In the logs I do see that I frequently get the following error:

Error in value[[3L]](cond) : IO error: Memory mapping file failed
06/20 02:19:46.934
Calls: local ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>
06/20 02:19:46.934
Execution halted

These 3 issues seem to be somewhat related to what I'm seeing, but it sounds like the problem lies with feather files, and it appears the issues will likely not be resolved.



Because all my apps have fairly large data sets, I'm hoping I can find a solution that gets me much closer to the 1.7-second average read time that I see locally. 14 seconds is not ideal for the end-user experience. Any thoughts on this would be appreciated!

Posts: 11

Participants: 4
