Chapter 8. How to Evaluate Dask’s Components and Libraries

It’s hard, although possible, to build reliable systems out of unreliable components.1 Dask is a largely community-driven open source project, and its components evolve at different rates. Not all parts of Dask are equally mature; even the components we cover in this book have different levels of support and development. While Dask’s core parts are well maintained and tested, other parts do not receive the same level of maintenance.
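One small, concrete starting point when comparing components is to see which of them are installed in your environment and at what version. The following is a minimal sketch using only the Python standard library; the `PACKAGES` list and the `installed_versions` helper are illustrative names of our own, not part of Dask, and an installed version number is only a rough signal, not a measure of maturity.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names as published on PyPI; this list is an assumption,
# so adjust it to the libraries you actually depend on.
PACKAGES = ["dask", "distributed", "dask-ml", "xarray", "sparse", "pint"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(PACKAGES).items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing these versions against each project's release history can hint at how actively a component is maintained, though it is no substitute for reading the project's issue tracker and documentation.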

Still, there are already dozens of popular libraries specifically for Dask, and the open source Dask community is growing around them. This gives us some confidence that many of these libraries are here to stay. Table 8-1 shows a non-exhaustive list of foundational libraries in use and their relation to the core Dask project. It is meant as a road map for users and is not an endorsement of individual projects. Though we haven’t attempted to cover all the projects shown here, we offer evaluations of some individual projects throughout the book.

Table 8-1. Libraries frequently used with Dask
Category: Dask project
Libraries:
  • Dask
  • Distributed
  • dask-ml

Category: Data structures (extend functionality, specific scientific data handling, or deployment hardware options of Dask built-in data structures)
Subcategory: Functionalities and convenience
Libraries:
  • xarray: adds axis labels for Dask array
  • sparse: an efficient implementation for sparse arrays and matrices, often found in ML and deep learning
  • pint: scientific ...
