Scaling and deploying Dask
This chapter covers
- Creating a Dask Distributed cluster on Amazon Web Services (AWS) using Docker and Elastic Container Service
- Using a Jupyter Notebook server and Elastic File System to store and access data science notebooks and shared datasets in AWS
- Using the Distributed client object to submit jobs to a Dask cluster
- Monitoring execution of jobs on the cluster using the Distributed monitoring dashboard
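To preview the client workflow covered in this chapter, a minimal sketch of submitting work through the Distributed client might look like the following. This assumes `dask.distributed` is installed; it uses an in-process cluster (`Client(processes=False)`) as a stand-in for a deployed cluster, where you would instead pass the scheduler's address (the address shown in the comment is hypothetical).

```python
from dask.distributed import Client

# An in-process cluster stands in for a real remote scheduler here.
# On a deployed cluster you would pass its address instead, e.g.
# Client("tcp://scheduler-host:8786")  (hypothetical address).
client = Client(processes=False)

# submit ships a function call to a worker and returns a Future
future = client.submit(sum, range(10))

# result() blocks until the worker finishes and fetches the value
print(future.result())  # 45

client.close()
```

The same `Client` object works unchanged whether it points at an in-process cluster or a multi-node deployment, which is what makes the transition from local prototyping to a real cluster largely transparent.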
Up to this point, we’ve been working with Dask in local mode. This means that everything we’ve asked Dask to do has been executed on a single computer. Running Dask in local mode is very useful for prototyping, development, and ad-hoc exploration, but we can still quickly reach the performance limits of a single ...