About this book

Who should read this book

The goal of this book is to teach a scalable style of programming. To do that, we’ll cover a wider range of material than you might be familiar with from other programming or technology books. Where other books might cover a single library, this book covers many libraries: both standard-library modules, such as functools and itertools, and third-party libraries, such as toolz, pathos, and mrjob. Where other books cover just one technology, this book covers many, including Hadoop, Spark, and Amazon Web Services (AWS). The choice to cover a broad range of technologies reflects the fact that to scale your code, you need to be able to adapt to new situations. Across all the technologies, however, I emphasize ...

