O'Reilly logo

Mastering Python for Data Science by Samir Madhavan

Stay ahead with the world's most comprehensive technology and business learning platform.

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, tutorials, and more.

Start Free Trial

No credit card required

Random forests

We have learned how to create a decision tree but, at times, decision tree models don't hold up well when there are many variables and a large dataset. This is where ensemble models, such as random forest, come to rescue.

A random forest basically creates many decision trees on the dataset and then averages out the results. If you see a singing competition, such as American Idol, or a sporting competition, such as the Olympics, there are multiple judges. The reason for having multiple judges is to eliminate bias and give fair results, and this is what a random forest tries to achieve.

A decision tree can change drastically if the data changes slightly and it can easily overfit the data.

Let's try to create a random forest model and ...

With Safari, you learn the way you learn best. Get unlimited access to videos, live online training, learning paths, books, interactive tutorials, and more.

Start Free Trial

No credit card required