Chapter 1. Introducing Polars
In 2022, we found ourselves in the middle of a challenging project for a client. Their data pipeline was growing out of control. The codebase was a mix of Python and R, with the Python side relying heavily on the pandas package for wrangling all the data. Over time, three major issues emerged: the code was becoming increasingly difficult to maintain, performance had slowed to a crawl, and memory consumption had skyrocketed to over 500 GB. These problems were stifling productivity and pushing the limits of the infrastructure.
Back then, Polars was still relatively unknown, but we had experimented with it and seen some promising results. Convincing the rest of the team to migrate both the pandas and R code to Polars wasn’t easy, but once the switch was made, the impact was immediate. The new data pipeline was much faster, and the memory footprint shrank to just 40 GB, less than a tenth of what it had been.
Thanks to this success, we’re fully convinced of the power of Polars. We wrote this book, Python Polars: The Definitive Guide, to share with you what we’ve learned and help you unlock the same potential in your data workflows.
In this introductory chapter, you’ll learn:
- The main features of Polars
- Why Polars is fast and popular
- How Polars compares to other data processing packages
- Why you should use Polars
- How we have organized this book
- Why we focus on Python Polars
In addition, we’ll demonstrate Polars’ capabilities through a showcase, ...