Data Science with Python and Dask by Jesse Daniel

3 Introducing Dask DataFrames

This chapter covers

  • Defining structured data and determining when to use Dask DataFrames
  • Exploring how Dask DataFrames are organized
  • Inspecting DataFrames to see how they are partitioned
  • Dealing with some limitations of DataFrames

In the previous chapter, we started exploring how Dask uses DAGs to coordinate and manage complex tasks across many machines. However, we looked only at a few simple examples that used the Delayed API to illustrate how Dask code relates to the elements of a DAG. In this chapter, we’ll take a closer look at the DataFrame API. We’ll also start working through the NYC Parking Ticket data, following a fairly typical data science workflow. This workflow and its corresponding chapters ...
