November 2018
Intermediate to advanced
360 pages
9h 36m
English
The previous code is still quite slow, so we will now use parallel processing to speed up our data analysis. Our first approach uses Dask, a Python library for scalable parallel computing: most code that runs on your laptop should also scale to a large cluster. Dask is a fairly low-level, Python-specific approach. Later in this chapter, we will discuss an alternative that is higher-level and language-agnostic.
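To give a rough idea of what this style of parallelism looks like, here is a minimal sketch using `dask.delayed`. It is not the chapter's analysis code: the data, the chunk size, and the `chunk_sum_of_squares` function are hypothetical stand-ins for an expensive per-chunk computation.

```python
from dask import delayed


def chunk_sum_of_squares(chunk):
    """Hypothetical stand-in for an expensive per-chunk computation."""
    return sum(x * x for x in chunk)


# Toy data split into independent chunks (assumed values, not from the chapter).
data = list(range(1000))
chunks = [data[i:i + 250] for i in range(0, len(data), 250)]

# delayed() records each call in a task graph instead of running it immediately.
partial_results = [delayed(chunk_sum_of_squares)(chunk) for chunk in chunks]
total = delayed(sum)(partial_results)

# compute() hands the graph to a scheduler, which may run independent tasks in parallel.
result = total.compute()
print(result)
```

Because the computation is described as a graph of tasks rather than executed directly, the same code can in principle be handed to a different scheduler, such as the one provided by `dask.distributed`, without rewriting the analysis itself.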