Organizations of all sizes use cloud services for data science to mitigate challenges such as:

  • Long delays and high startup costs for new projects and new data science teams
  • Obstacles to collaboration between organizations or groups
  • High costs of computing infrastructure, including hardware, software and manpower
  • Difficulty scaling to meet variable demand
  • Excessive time and costs moving the data to the analysis

Choose your Cloud Strategy based on your Data Science needs

Depending on the circumstances of your organization and what specific challenges you are trying to address, there are multiple cloud options to consider:

  • Hosted and Software as a Service (SaaS) offerings:
    A fully hosted service, such as RStudio Cloud, can minimize the cost and time required to start up a new project, workshop or class.
  • Deployment to a Virtual Private Cloud (VPC) provider:
    Deploying software on a major cloud platform such as Amazon Web Services (AWS) or Azure can provide the full flexibility and customization of on-premises software.
  • Cloud Marketplace Offerings:
    Pre-built applications offered on services such as AWS Marketplace and Azure make it easier to get started, because the images are built and tested by the vendor. Cloud marketplaces may also provide streamlined purchasing options that open up access to IT infrastructure budgets and make the total cost of ownership easier to track for your organization.
  • Fully-Managed Services:
    These offerings, such as RStudio on Amazon SageMaker or Azure ML, provide the convenience and scalability of the cloud while offloading the maintenance and administration to the cloud provider or a third party.
  • Data Science in Your Data Lake:
    By embedding your data science tools into your existing data platform, you can run your computations close to the data, minimize overhead, and tie easily into your data pipeline.

RStudio supports your Data Science Cloud Strategy

Regardless of which approach you choose, RStudio provides multiple options to support your cloud journey.

Simplify and reduce startup costs with a SaaS solution:

Promote collaboration and instruction between organizations and groups:

Mitigate high costs of computing infrastructure:

Scale to meet variable demand. In addition to the above options (marketplaces, fully-managed services, VPCs, Docker and Cloud storage), RStudio's pro products provide specific functionality to help:

  • RStudio Workbench's Launcher integrates with Kubernetes, Slurm, and other HPC environments
  • RStudio provides Helm charts to help you manage your Kubernetes configurations
  • RStudio Connect provides many options to scale and tune performance, including being part of an autoscaling group. These options allow Connect to deliver dashboards, Shiny applications, and other types of content to large numbers of users
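
To make "scaling to meet variable demand" concrete, here is a minimal sketch of a generic proportional scaling rule of the kind used by Kubernetes' Horizontal Pod Autoscaler. It is not the logic of RStudio Connect or the Launcher; the function name, parameters, and limits are hypothetical and chosen only for illustration.

```python
import math


def desired_replicas(current_replicas, current_load, target_load,
                     min_replicas=1, max_replicas=10):
    """Return a replica count that moves per-replica load toward the target.

    current_load and target_load are per-replica metric values
    (for example, average CPU percent). All names here are illustrative.
    """
    if target_load <= 0:
        raise ValueError("target_load must be positive")
    # Proportional rule: ceil(current_replicas * current_load / target_load)
    raw = math.ceil(current_replicas * current_load / target_load)
    # Clamp to the configured bounds so the pool never vanishes or grows unbounded.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    print(desired_replicas(2, 90, 60))
    print(desired_replicas(4, 10, 60))
```

In practice the cloud platform or cluster scheduler applies a rule like this for you; the sketch only shows why capacity can follow demand up and down instead of being sized for the peak.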

Minimize data movement. By running your computations close to the data, you can minimize overhead and tie your data science directly into your data pipelines:

  • Run your data science tools on your cloud provider, whether in marketplaces, fully-managed services or VPCs as listed above, to help minimize data movement
  • Native R interface to Spark: sparklyr lets you filter and aggregate Spark datasets and streams, bring the results into R for analysis and visualization, train models at scale, and productionize machine learning pipelines in Spark. Learn more at spark
  • Connect to cloud-based data storage, such as Snowflake, Redshift or S3, using RStudio Professional Drivers
  • Use Amazon EFS (Elastic File System) as your shared file system for RStudio Team
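
The idea of running computations close to the data can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a remote engine such as Spark or a cloud data warehouse; the table, columns, and values are hypothetical. It compares pulling every row to the client against aggregating where the data lives and transferring only the summary.

```python
import sqlite3

# Hypothetical data: an in-memory SQLite table stands in for a large remote dataset.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (carrier TEXT, dep_delay REAL)")
rows = [("AA", 5.0), ("AA", 15.0), ("UA", 0.0), ("UA", 30.0), ("DL", -2.0)]
conn.executemany("INSERT INTO flights VALUES (?, ?)", rows)

# Approach 1: move every row to the client, then aggregate locally.
all_rows = conn.execute("SELECT carrier, dep_delay FROM flights").fetchall()
totals = {}
for carrier, delay in all_rows:
    running_sum, count = totals.get(carrier, (0.0, 0))
    totals[carrier] = (running_sum + delay, count + 1)
local_means = {c: s / n for c, (s, n) in totals.items()}

# Approach 2: aggregate inside the data store and move only the summary rows.
pushed_rows = conn.execute(
    "SELECT carrier, AVG(dep_delay) FROM flights GROUP BY carrier"
).fetchall()
pushed_means = dict(pushed_rows)

rows_moved_local = len(all_rows)      # one row per flight
rows_moved_pushed = len(pushed_rows)  # one row per carrier
print(rows_moved_local, rows_moved_pushed)
```

Both approaches give the same per-carrier means, but the second transfers one row per group rather than one row per record, which is where the savings come from as the dataset grows.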

More resources:

  • Read the blog post, "Where does RStudio fit into your Cloud Journey"
  • Watch the webinar, "Why Data Science in the Cloud?", copresented with RStudio partner ProCogia
  • Learn more about RStudio Cloud, where you can get started for free, or check our Available Plans. If you are interested in using RStudio Cloud for teaching, watch the webinar, "Teaching R online with RStudio Cloud"

Set up a call with RStudio

RStudio Cloud stories from your peers