Python Data Analysis - Third Edition

by Avinash Navlani, Ivan Idris
February 2021
Content level: Beginner to intermediate
478 pages
8h 38m
English
Packt Publishing
Content preview from Python Data Analysis - Third Edition
Parallel Computing Using Dask

Dask is one of the simplest ways to process your data in parallel. The platform is for pandas lovers who struggle with large datasets. Dask offers scalability similar to Hadoop and Spark, and the same flexibility that Airflow and Luigi provide. Dask can be used to work on pandas DataFrames and NumPy arrays that cannot fit into RAM. It splits these data structures into smaller pieces and processes them in parallel, while requiring only minimal changes to your code. It can run locally and make use of the full power of your laptop. We can also deploy it on large distributed systems, just as we deploy Python applications. Because Dask processes data in parallel, it can complete the work in less time. It also scales the computation power of your workstation ...



ISBN: 9781789955248