Decision tree classification with scikit-learn

scikit-learn provides the DecisionTreeClassifier class, which trains a binary decision tree using either the Gini or the cross-entropy impurity measure. In our example, let's consider a dataset with three features and three classes:

>>> from sklearn.datasets import make_classification
>>> nb_samples = 500
>>> X, Y = make_classification(n_samples=nb_samples, n_features=3, n_informative=3, n_redundant=0, n_classes=3, n_clusters_per_class=1)
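To make the two impurity measures concrete, the following is a minimal sketch that computes them by hand for the whole dataset, that is, for the root node before any split. The helper names gini_impurity and cross_entropy_impurity are hypothetical and not part of scikit-learn, and the random_state value is an assumption added only so that the run is reproducible.

```python
import numpy as np
from sklearn.datasets import make_classification

# random_state is an assumption added for reproducibility; it is not in the original example
X, Y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=3,
                           n_clusters_per_class=1, random_state=1000)


def gini_impurity(labels):
    """Hypothetical helper: Gini impurity, 1 - sum_k p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def cross_entropy_impurity(labels):
    """Hypothetical helper: cross-entropy impurity, -sum_k p_k * log2(p_k)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))


# Impurity of the root node, which contains every sample
root_gini = gini_impurity(Y)
root_entropy = cross_entropy_impurity(Y)
print(root_gini, root_entropy)
```

Both measures are zero for a node that contains a single class and reach their maximum when the classes are equally represented, which is why a tree looks for splits that reduce them.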

Let's first consider a classification with the default Gini impurity:

>>> from sklearn.tree import DecisionTreeClassifier
>>> from sklearn.model_selection import cross_val_score
>>> dt = DecisionTreeClassifier()
>>> print(cross_val_score(dt, X, Y, scoring='accuracy', cv=10).mean())
0.970

...
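Since DecisionTreeClassifier supports both impurity measures, one way to compare them is to run the same cross-validation once per criterion. The sketch below is self-contained and assumes a fixed random_state (not part of the original example) for both the dataset and the tree, so the scores it prints will not match the figure quoted above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# random_state is an assumption added for reproducibility
X, Y = make_classification(n_samples=500, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=3,
                           n_clusters_per_class=1, random_state=1000)

# Mean 10-fold cross-validation accuracy for each impurity criterion
scores = {}
for criterion in ('gini', 'entropy'):
    dt = DecisionTreeClassifier(criterion=criterion, random_state=1000)
    scores[criterion] = cross_val_score(dt, X, Y, scoring='accuracy', cv=10).mean()
    print(criterion, scores[criterion])
```

The gap between the two criteria depends on the dataset, so it is worth checking both rather than assuming one is better.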
