Scaling Python with Dask

by Holden Karau, Mika Kimmins
July 2023
Intermediate to advanced
223 pages
5h 24m
English
O'Reilly Media, Inc.
Content preview from Scaling Python with Dask

Chapter 6. Advanced Task Scheduling: Futures and Friends

Dask’s computational flow follows these four main logical steps, which can happen concurrently and recursively for each task:

  1. Collect and read the input data.

  2. Define and build the compute graph representing the set of computations that needs to be performed on the data.

  3. Run the computation (this happens when you run .compute()).

  4. Pass the result as data to the next step.
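The four steps above can be sketched with dask.delayed. This is a minimal illustration rather than code from the book: the functions load, square_all, and total are hypothetical, the file name is a placeholder that is never opened, and the synchronous scheduler is chosen only to keep the example self-contained.

```python
import dask


@dask.delayed
def load(path):
    # Step 1: stand-in for collecting input data.
    # Hypothetical: no file is actually read; we return fixed numbers.
    return list(range(5))


@dask.delayed
def square_all(values):
    return [v * v for v in values]


@dask.delayed
def total(values):
    return sum(values)


# Step 2: calling delayed functions only builds the task graph.
# Nothing has been executed at this point.
graph = total(square_all(load("numbers.txt")))

# Step 3: compute() forces execution of the whole graph.
# Step 4: inside the graph, each task's result is passed to the next task.
result = graph.compute(scheduler="synchronous")
print(result)  # 30
```

Until compute() is called, graph is only a description of the work, which is what gives Dask the chance to optimize it.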

Now we introduce more ways to control this flow with futures. So far, you have mostly seen lazy operations in Dask, where Dask doesn’t do the work until something forces the computation. This pattern has a number of benefits, including allowing Dask’s optimizer to combine steps when doing so makes sense. However, not all tasks are well suited to lazy evaluation. One common pattern that fits it poorly is fire-and-forget, where we call a function for its side effect and do not necessarily care about the output. Trying to express this with lazy evaluation (e.g., dask.delayed) results in unnecessary blocking to force computation. When lazy evaluation is not what you need, you can explore Dask’s futures. Futures can be used for much more than just fire-and-forget, and you can return results from them. This chapter will explore a number of common use cases for futures.
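As a rough sketch of the difference, the following uses an in-process Dask distributed client to submit work eagerly. The functions record_event and add_one are hypothetical, and the client settings (a threads-only local cluster with the dashboard disabled) are assumptions made to keep the example small, not recommendations.

```python
from dask.distributed import Client, fire_and_forget


def record_event(name):
    # Hypothetical function called only for its side effect.
    print(f"recorded {name}")


def add_one(x):
    return x + 1


# Assumption: a local, threads-only cluster is enough for illustration.
with Client(processes=False, dashboard_address=None) as client:
    # Fire-and-forget: submit the task and do not wait for or keep its result.
    # The task may still be pending or running when the next line executes.
    fire_and_forget(client.submit(record_event, "startup"))

    # Futures can also return results: submit() starts the work right away,
    # and result() blocks only when the value is actually needed.
    future = client.submit(add_one, 41)
    value = future.result()

    # map() submits one task per input; gather() collects all the results.
    futures = client.map(add_one, range(3))
    values = client.gather(futures)

print(value)   # 42
print(values)  # [1, 2, 3]
```

Unlike dask.delayed, nothing here waits for a compute() call: each submit() schedules its task immediately.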

Note

You may already be familiar with futures from Python. Dask’s futures are an extension of Python’s concurrent.futures library, allowing you to use them in its place. Similar ...


Publisher Resources

ISBN: 9781098119867