4 Scaling with the compute layer

This chapter covers

  • Designing scalable infrastructure that allows data scientists to handle computationally demanding projects
  • Choosing a cloud-based compute layer that matches your needs
  • Configuring and using compute layers in Metaflow
  • Developing robust workflows that handle failures gracefully

What are the most fundamental building blocks of all data science projects? First, by definition, data science projects use data: every machine learning and data science project needs at least a small amount of it. Second, the science part of data science implies that we don’t merely collect data but use it for something; that is, we compute something using data. Correspondingly, data and compute are the two ...
