Scaling Python with Dask

by Holden Karau, Mika Kimmins
July 2023
Intermediate to advanced
223 pages
5h 24m
English
O'Reilly Media, Inc.
Content preview from Scaling Python with Dask

Chapter 11. Machine Learning with Dask

Now that you know Dask’s many different data types, computation patterns, deployment options, and libraries, we are ready to tackle machine learning. You will quickly find that ML with Dask is intuitive, as it runs in the same Python environment as many other popular ML libraries. Much of the heavy lifting is done by Dask’s built-in data types and distributed schedulers, which makes writing code an enjoyable experience.

This chapter will primarily use the Dask-ML library, a robustly supported ML library from the Dask open source project, but we will also highlight other libraries, such as XGBoost and scikit-learn. The Dask-ML library is designed to run both on clusters and locally. Dask-ML provides familiar interfaces by extending many common ML libraries. ML differs from many of the tasks discussed so far in that it requires the framework (here Dask-ML) to coordinate work more closely. In this chapter we’ll show some of the ways you can use Dask-ML in your own programs, and we’ll also offer tips.

Since ML is such a wide and varied discipline, we are able to cover only some of the situations where Dask-ML is useful. This chapter will discuss some common work patterns, such as exploratory data analysis, random splits, featurization, regression, and deep learning inference, from the perspective of a practitioner ramping up on Dask. If you don’t see your particular library or use case represented, it may still ...

ISBN: 9781098119867