R Statistical Application Development by Example Beginner's Guide by Prabhanjan Narayanachar Tattar

Pruning and other finer aspects of a tree

Recall from Figure 14 (Classification tree for the test part of the German credit data problem) that the rules numbered 21, 143, 69, 165, 142, 70, 40, 164, and 16 covered only 20, 25, 11, 11, 14, 12, 28, 19, and 22 observations, respectively. With about 600 observations in total, none of these rules covers even five percent of the data. This is one reason to suspect that we may have overfitted the data. The minsplit option lets us set the minimum number of observations a node must contain before a split is attempted, which limits how few observations a rule can end up covering.
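The effect of minsplit can be sketched as follows. This is a minimal illustration, assuming the tree is fitted with the rpart package (where minsplit is an argument of rpart.control). The kyphosis data bundled with rpart stands in for the German credit data, and the helper leaf_sizes is a hypothetical name introduced only for this example. Note that when minbucket is not given, rpart sets it to round(minsplit/3), so minsplit = 30 also forces every terminal node to hold at least 10 observations.

```r
library(rpart)  # recommended package shipped with standard R installations

# Assumption: kyphosis (bundled with rpart) is a stand-in for the German credit data.
# Loose settings: nodes with as few as 2 observations may be split, no complexity penalty.
loose <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class",
               control = rpart.control(minsplit = 2, cp = 0))

# Stricter settings: a node needs at least 30 observations before a split is attempted.
# minbucket defaults to round(minsplit / 3) = 10 observations per terminal node.
strict <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class",
                control = rpart.control(minsplit = 30, cp = 0))

# Hypothetical helper: number of observations covered by each terminal node (rule).
leaf_sizes <- function(fit) fit$frame$n[fit$frame$var == "<leaf>"]

print(leaf_sizes(loose))   # typically includes rules covering very few observations
print(leaf_sizes(strict))  # every rule covers at least 10 observations
```

Comparing the two printed vectors shows how raising minsplit removes the rules that cover only a handful of observations.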

Another technical way of reducing the complexity of a classification tree is to "prune" the tree. Here, the least important splits are recursively snipped ...
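Pruning can be sketched along these lines. The excerpt is cut off before it states how the least important splits are chosen, so this example assumes rpart's cost-complexity pruning: a deliberately large tree is grown, and prune() then removes the splits that do not justify a chosen complexity parameter cp. Picking the cp with the lowest cross-validated error is one common convention, not necessarily the one the book uses. The kyphosis data again stands in for the German credit data, and n_leaves is a hypothetical helper.

```r
library(rpart)

set.seed(1)  # cross-validation folds are random, so fix the seed for repeatability

# Grow a deliberately large tree (assumption: kyphosis stands in for the credit data).
full <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class",
              control = rpart.control(minsplit = 2, cp = 0))

# cptable lists, for each candidate cp, the tree size and cross-validated error (xerror).
cp_table <- full$cptable
best_cp <- cp_table[which.min(cp_table[, "xerror"]), "CP"]

# Snip off every split whose improvement does not justify a complexity of best_cp.
pruned <- prune(full, cp = best_cp)

# Hypothetical helper: count the terminal nodes (rules) of a fitted tree.
n_leaves <- function(fit) sum(fit$frame$var == "<leaf>")

print(c(full = n_leaves(full), pruned = n_leaves(pruned)))
```

Because prune() only ever removes splits, the pruned tree can never have more rules than the full one.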
