Skip to Content
Scaling Python with Dask
book

Scaling Python with Dask

by Holden Karau, Mika Kimmins
July 2023
Intermediate to advanced
223 pages
5h 24m
English
O'Reilly Media, Inc.
Content preview from Scaling Python with Dask

Preface

We wrote this book for data scientists and data engineers familiar with Python and pandas who are looking to handle larger-scale problems than their current tooling allows. Current PySpark users will find that some of this material overlaps with their existing knowledge of PySpark, but we hope they still find it helpful, and not just for getting away from the Java Virtual Machine (JVM).

If you are not familiar with Python, some excellent O’Reilly titles include Learning Python and Python for Data Analysis. If you and your team are more frequent users of JVM languages (such as Java or Scala), while we are a bit biased, we’d encourage you to check out Apache Spark along with Learning Spark (O’Reilly) and High Performance Spark (O’Reilly).

This book is primarily focused on data science and related tasks because, in our opinion, that is where Dask excels the most. If you have a more general problem that Dask does not seem to be quite the right fit for, we would (with a bit of bias again) encourage you to check out Scaling Python with Ray (O’Reilly), which has less of a data science focus.

A Note on Responsibility

As the saying goes, with great power comes great responsibility. Dask and tools like it enable you to process more data and build more complex models. It’s essential not to get carried away with collecting data simply for the sake of it, and to stop to ask yourself if including a new field in your model might have some unintended real-world implications. You don’t ...

Become an O’Reilly member and get unlimited access to this title plus top books and audiobooks from O’Reilly and nearly 200 top publishers, thousands of courses curated by job role, 150+ live events each month,
and much more.

Read now

Unlock full access

More than 5,000 organizations count on O’Reilly

AirBnbBlueOriginElectronic ArtsHomeDepotNasdaqRakutenTata Consultancy Services

QuotationMarkO’Reilly covers everything we've got, with content to help us build a world-class technology community, upgrade the capabilities and competencies of our teams, and improve overall team performance as well as their engagement.
Julian F.
Head of Cybersecurity
QuotationMarkI wanted to learn C and C++, but it didn't click for me until I picked up an O'Reilly book. When I went on the O’Reilly platform, I was astonished to find all the books there, plus live events and sandboxes so you could play around with the technology.
Addison B.
Field Engineer
QuotationMarkI’ve been on the O’Reilly platform for more than eight years. I use a couple of learning platforms, but I'm on O'Reilly more than anybody else. When you're there, you start learning. I'm never disappointed.
Amir M.
Data Platform Tech Lead
QuotationMarkI'm always learning. So when I got on to O'Reilly, I was like a kid in a candy store. There are playlists. There are answers. There's on-demand training. It's worth its weight in gold, in terms of what it allows me to do.
Mark W.
Embedded Software Engineer

You might also like

Scaling Python with Ray

Scaling Python with Ray

Holden Karau, Boris Lublinsky

Publisher Resources

ISBN: 9781098119867Errata Page