Scaling Python with Dask

by Holden Karau, Mika Kimmins
July 2023
Intermediate to advanced
223 pages
5h 24m
English
O'Reilly Media, Inc.

Chapter 10. Dask with GPUs and Other Special Resources

Sometimes the answer to our scaling problem isn’t throwing more computers at it; it’s throwing different types of resources at it. One example of this might be ten thousand monkeys trying to reproduce the works of Shakespeare, versus one Shakespeare.1 While performance varies, some benchmarks have shown up to an 85% improvement in model training times when using GPUs over CPUs. Continuing its modular tradition, the GPU logic of Dask is found in the libraries and ecosystem surrounding it. The libraries can either run on a collection of GPU workers or parallelize work over different GPUs on one host.

Most work we do on the computer is done on the CPU. GPUs were created for displaying video, a task that involves large amounts of vectorized floating-point (i.e., non-integer) operations. With vectorized operations, the same operation is applied in parallel on large sets of data, like a map. Tensor Processing Units (TPUs) are similar to GPUs, except that they are not also used for graphics.
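To make the "like a map" comparison concrete, here is a minimal sketch that contrasts an element-by-element loop with a vectorized call. It runs on the CPU with NumPy, which is our choice for illustration rather than something the chapter prescribes, and the function names and sample values are hypothetical. GPU array libraries such as CuPy expose a similar array-at-a-time interface, so the same style of expression carries over to accelerators.

```python
import numpy as np


def scale_and_shift_loop(values, scale, shift):
    """Apply the operation one element at a time, like an explicit map."""
    return [v * scale + shift for v in values]


def scale_and_shift_vectorized(values, scale, shift):
    """Apply the same operation to the whole array in one vectorized expression."""
    return values * scale + shift


# Hypothetical sample data: five single-precision floats.
data = np.arange(5, dtype=np.float32)

looped = scale_and_shift_loop(data.tolist(), 2.0, 1.0)
vectorized = scale_and_shift_vectorized(data, 2.0, 1.0)

print(looped)
print(vectorized)
```

Both versions compute the same values; the vectorized form simply states the operation once for the entire array, which is the shape of work that GPUs and TPUs are built to run in parallel.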

For our purposes, in Dask, we can think of GPUs and TPUs as specializing in offloading large vectorized computations, but there are many other kinds of accelerators. While much of this chapter is focused on GPUs, the same general techniques, albeit with different libraries, apply to other accelerators. Other kinds of specialized resources include NVMe drives, faster (or larger) RAM, TCP/IP offload, Just-a-Bunch-of-Disks expansion ports, and ...

Publisher Resources

ISBN: 9781098119867