Chapter 14. Joining and Concatenating

Data often comes from multiple sources that you will have to connect and combine in a meaningful way. There are multiple ways to combine DataFrames, which we’ll go over in this chapter.

Funnily enough, this is where Polars started. Faced with combining two CSV files in Rust, Ritchie Vink began the journey that ultimately led to where we are now. That history gives the operations in this chapter a special significance.

In this chapter, you’ll learn:

  • How to use df.join() to combine DataFrames based on the values in the DataFrames and the strategies outlined here.

  • How df.join_asof(), a special join, combines DataFrames based on the nearest value in the other DataFrame.

  • How to combine DataFrames using pl.concat(), df.vstack(), df.hstack(), and df.extend().

  • How to combine Series with series.append().

  • The difference between all these methods and when to use each of them.

Joining

To combine ...
