Practical Predictive Analytics by Ralph Winters

Pruning

Pruning is another way to defend against overfitting. If you visually examine a tree and find that you want to stop the growth of branches at a particular point, you can prune all branches below that node, so that all of the results under that node are collapsed into a single node. That may give you more interpretable results for the decision rules displayed at that point.

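The rpart.plot package, a companion to rpart, provides an interactive pruning feature through its prp() function: when you call it with snip=TRUE, you can click on a node in the plotted tree to remove everything below it.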

In the previous states example, you might want to remove all of the nodes below node 70 in order to balance the tree and keep it two levels deep. A balanced tree is often considered desirable in its own right:

 PrunedTree <- prp(y1,type=4, extra=1,snip=TRUE)$obj ...
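The same kind of collapse can also be done without the mouse, which is useful in a script or a non-interactive session. The following is a minimal sketch, not the book's own code: it assumes the kyphosis data set that ships with rpart as a stand-in for the states example, and the names fit, target, and pruned are illustrative. It uses snip.rpart() from rpart to remove every node below a chosen node number, so that node becomes a leaf.

```r
library(rpart)

# Stand-in data: kyphosis ships with rpart (an assumption; the book uses its own data)
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Node numbers are stored as the row names of the tree's frame
nodes <- as.integer(rownames(fit$frame))

# Internal (non-leaf) nodes other than the root are candidates for snipping
is_internal <- as.character(fit$frame$var) != "<leaf>"
internal <- nodes[is_internal & nodes != 1]
stopifnot(length(internal) > 0)

# Choose the first candidate; in practice you would read the number off the plot
target <- internal[1]

# Remove everything below the target node, turning it into a leaf
pruned <- snip.rpart(fit, toss = target)

print(pruned)
```

If you have already snipped interactively with prp(), the list it returns also contains a snipped.nodes element alongside obj, so the node numbers you clicked can be recorded and passed to snip.rpart() to reproduce the result later.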
