O'Reilly logo

Python: Advanced Predictive Analytics by Joseph Babcock, Ashish Kumar

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Implementing a decision tree with scikit-learn

Now, when we are sufficiently aware of the mathematics behind decision trees, let us implement a simple decision tree using the methods in scikit-learn. The dataset we will be using for this is a commonly available dataset called the iris dataset that has information about flower species and their petal and sepal dimensions. The purpose of this exercise will be to create a classifier that can classify a flower as belonging to a certain species based on the flower petal and sepal dimensions.

To do this, let's first import the dataset and have a look at it:

import pandas as pd
data=pd.read_csv('E:/Personal/Learning/Predictive Modeling Book/My Work/Chapter 7/iris.csv')
data.head()

The datasheet looks as ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required