Implementing a decision tree with scikit-learn
Now that we are familiar with the mathematics behind decision trees, let us implement a simple one using scikit-learn. We will use the widely available iris dataset, which records the species of each flower along with its petal and sepal dimensions. The goal of this exercise is to build a classifier that assigns a flower to a species based on its petal and sepal dimensions.
To do this, let's first import the dataset and have a look at it:
import pandas as pd
data = pd.read_csv('E:/Personal/Learning/Predictive Modeling Book/My Work/Chapter 7/iris.csv')
data.head()
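If the CSV file is not available at that path, a comparable DataFrame can probably be built from the copy of the iris data that ships with scikit-learn. The sketch below is only an illustration under that assumption: the column names produced by load_iris (and the added species column) are not taken from the book's iris.csv and may differ from it.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled iris data stands in for the book's iris.csv.
# Column names here come from scikit-learn and may not match the CSV's headers.
iris = load_iris(as_frame=True)

# iris.frame holds the four measurement columns plus an integer 'target' column
data = iris.frame.copy()

# Add a readable species name next to the integer target (0, 1, 2)
data["species"] = data["target"].map(dict(enumerate(iris.target_names)))

# Show the first five rows, as data.head() does for the CSV version
print(data.head())
```

Either way of loading should give 150 rows, with four measurement columns and a species label for each flower.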
The datasheet looks as ...