print forest.validate(folds)
print "Calculating score for regression tree"
regression = MushroomRegression(data)
print regression.validate(folds)
Running this code produces the following output. In each table, actual is the data
from the training set, which we hold to be true, and preds are the results we got out
of the model we built:
Calculating score for decision tree
[preds     0    1
actual
0        844    0
1          0  781,
preds      0    1
actual
0        834    0
1          0  791,
preds      0    1
actual
0        814    0
1          0  811,
preds      0    1
actual
0        855    0
1          0  770,
preds      0    1
actual
0        861    0
1          0  763]
Calculating score for random forest method
[preds     0    1
actual
0        841    0
1          0  784,
preds      0    1
actual
0        869    0
1          0  756,
preds      0    1
actual
0        834    0
1          0  791,
preds      0    1
actual
0        835    0
1          0  790,
preds      0    1
actual
0        829    0
1          0  795]
Calculating score for regression tree
[0.0, 0.0, 0.0, 0.0, 0.0]
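To make the tables above easier to read, here is a minimal sketch of how one of them
can be turned into a single accuracy number. The helper names confusion_counts,
accuracy, and error_rate are assumptions made for this illustration and are not part
of the chapter's classes; only the four counts are copied from the first decision tree
fold in the output.

```python
from collections import Counter


def confusion_counts(actual, preds):
    """Count how often each (actual, predicted) pair occurs (hypothetical helper)."""
    return Counter(zip(actual, preds))


def accuracy(matrix):
    """Fraction of rows whose prediction matches the actual label."""
    total = sum(matrix.values())
    correct = sum(count for (act, pred), count in matrix.items() if act == pred)
    return correct / float(total)


def error_rate(matrix):
    """Fraction of rows that were misclassified."""
    return 1.0 - accuracy(matrix)


# First decision tree fold from the output above, keyed by (actual, preds).
first_fold = {(0, 0): 844, (0, 1): 0, (1, 0): 0, (1, 1): 781}

print(accuracy(first_fold))    # 1.0
print(error_rate(first_fold))  # 0.0
```

The diagonal entries (0, 0) and (1, 1) are the correct predictions, so a table with
zeros everywhere else gives an accuracy of 1.0 for that fold.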
What you'll notice is that, given this toy example, we are able to create a decision tree
that does exceptionally well: every table has zeros off the diagonal, meaning no
mushroom in any fold was misclassified. Does that mean we should go out to the woods
and eat mushrooms? No, but given the training data and information we gathered, we have
built a highly accurate model for mapping mushrooms to either poisonous or edible!
The resulting decision tree is actually quite fascinating, as you can see in Figure 5-8.
Figure 5-8. e resulting tree from building decision trees
I don't think it's important to discuss what this tree means, but it is interesting to
think of mushroom poisonousness as a function of a handful of decision nodes.
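As a rough illustration of that idea, a fitted tree can be written as a small nested
structure that a function walks from the root to a leaf. The sketch below is
hypothetical throughout: the attribute names, split values, and labels are placeholders
chosen for the example and are not the splits shown in Figure 5-8.

```python
# Hypothetical tree: attributes, values, and labels are placeholders,
# NOT the nodes of the tree in Figure 5-8.
TREE = {
    "attribute": "odor",
    "branches": {
        "foul": "poisonous",
        "none": {
            "attribute": "spore_print_color",
            "branches": {"green": "poisonous", "white": "edible"},
        },
    },
}


def classify(node, mushroom):
    """Follow one branch per decision node until a leaf label is reached."""
    # Internal nodes are dicts; leaves are plain string labels.
    while isinstance(node, dict):
        value = mushroom[node["attribute"]]
        node = node["branches"][value]
    return node


print(classify(TREE, {"odor": "none", "spore_print_color": "white"}))  # edible
print(classify(TREE, {"odor": "foul"}))                                # poisonous
```

Each prediction touches only the attributes along one path, which is why a handful of
nodes is enough to describe the whole function. This sketch raises a KeyError for an
attribute value the tree has no branch for.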