Pandas is a remarkable tool for data analysis that aims to be the most powerful and flexible open source data analysis/manipulation tool available in any language. And, as you will soon see, if it doesn't already live up to this claim, it can't be too far off. Let's now take a look:

Importing the iris dataset

You can see from the preceding screenshot that I have imported a classic machine learning dataset, the iris dataset (also available at, using scikit-learn, a library we'll examine in detail later. I then passed the data into a pandas DataFrame, making sure to assign the column headers. ...

Get Python Machine Learning Blueprints - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.