9 Strings

When most people think of pandas or data analysis in general, they think of numbers. And indeed, much of the work that people do with pandas is with numbers. That’s why pandas is built on top of NumPy, which takes advantage of C’s fast, efficient integers and floats. And that’s why so many of the exercises in this book involve working with numbers.

However, we often have to work with textual data: usernames, product names, sales regions, business units, ticker symbols, and company names are just a few examples. Sometimes the text is central to the analysis you’re doing, such as when you’re preparing data for a text-based machine-learning model. Other times, it’s secondary to the numbers and serves as a description or as categorical data. ...
