Decision tree classification with scikit-learn

scikit-learn provides the DecisionTreeClassifier class, which trains a binary decision tree using either the Gini or the cross-entropy impurity measure. For our example, let's consider a dataset with three features and three classes:

>>> from sklearn.datasets import make_classification
>>> nb_samples = 500
>>> X, Y = make_classification(n_samples=nb_samples, n_features=3, n_informative=3, n_redundant=0, n_classes=3, n_clusters_per_class=1)
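To make the two impurity measures named above more concrete, here is a minimal sketch that computes them by hand for a vector of class labels. The helper names gini_impurity and cross_entropy_impurity are hypothetical (they are not part of scikit-learn), and the base-2 logarithm is an assumption made for illustration:

```python
import numpy as np

def gini_impurity(labels):
    """Gini impurity: 1 - sum_k p_k^2, where p_k is the fraction of class k."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(1.0 - np.sum(p ** 2))

def cross_entropy_impurity(labels):
    """Cross-entropy impurity: sum_k p_k * log2(1 / p_k) (base 2 is an assumption)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(np.sum(p * np.log2(1.0 / p)))

# A pure node (one class only) and an evenly mixed two-class node
pure = np.array([0, 0, 0, 0])
mixed = np.array([0, 0, 1, 1])

print(gini_impurity(pure), cross_entropy_impurity(pure))
print(gini_impurity(mixed), cross_entropy_impurity(mixed))
```

Both measures are zero for a pure node and grow as the classes become more evenly mixed, which is why a tree can use either one to score candidate splits.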

Let's first consider a classification with default Gini impurity:

>>> from sklearn.tree import DecisionTreeClassifier
>>> from sklearn.model_selection import cross_val_score
>>> dt = DecisionTreeClassifier()
>>> print(cross_val_score(dt, X, Y, scoring='accuracy', cv=10).mean())
0.970 ...
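As a possible follow-up, the same cross-validation can be repeated with each impurity measure by setting the criterion parameter of DecisionTreeClassifier. This is only a sketch: the random_state value of 1000 is an arbitrary assumption added for reproducibility (the listing above does not fix a seed), so the scores it prints will generally differ from the one shown above:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Same dataset shape as above; random_state is an assumption, not in the original
X, Y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=3, n_clusters_per_class=1,
                           random_state=1000)

scores = {}
for criterion in ('gini', 'entropy'):
    # 'gini' is the default criterion; 'entropy' selects cross-entropy impurity
    dt = DecisionTreeClassifier(criterion=criterion, random_state=1000)
    scores[criterion] = cross_val_score(dt, X, Y, scoring='accuracy', cv=10).mean()
    print(criterion, round(scores[criterion], 3))
```

On a small, well-separated dataset like this one, the two criteria often give similar accuracies, so the choice between them is usually a matter of experimentation.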
