Book description
The exponentially increasing volume and complexity of data make scalability and reliability increasingly challenging issues. But while modern systems contain multi-core CPUs and GPUs that have the potential for parallel computing, many Python tools weren't designed to leverage this parallelism. Using Dask to parallelize Python workflows delivers a competitive advantage by reducing turnaround time, freeing you to work on more interesting or complex data problems.
With this essential guide at your side, you'll be able to:
- Deploy Dask on the cloud or on-prem
- Scale your Python code to bigger datasets and CPU-intensive workflows
- Speed up data pipelines that often take weeks or months to execute
- Overcome the limits of serial computing on your local machine (or system of machines)
- Use the examples provided to scale your workflows, whether you're working with NumPy, pandas, scikit-learn, PyTorch, XGBoost, or other tools
- Develop a specialized data science library that leverages parallel and distributed computing
- Scale computations to a cluster of machines and to the cloud securely and efficiently
And much more.
Table of contents
- 1. Understanding the Architecture of Dask DataFrames
- 2. How to Work with Dask DataFrames
  - Reading Data into a Dask DataFrame
  - Processing Data with Dask DataFrames
    - Converting to Parquet files
    - Materializing results in memory with compute
    - Materializing results in memory with persist()
    - Repartitioning Dask DataFrames
    - Filtering Dask DataFrames
    - Setting the Index
    - Joining Dask DataFrames
    - Mapping Custom Functions
    - groupby aggregations
    - Memory usage
    - Tips on managing memory
    - Converting to number columns with to_numeric
    - Vertically union Dask DataFrames
  - Writing Data with Dask DataFrames
  - Summary
Product information
- Title: Dask: The Definitive Guide
- Author(s):
- Release date: September 2023
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098117085