Tuning a random forest model

In the previous recipe, we reviewed how to use the random forest classifier. In this recipe, we'll walk through how to improve its performance by tuning its parameters.

Getting ready

To tune a random forest model, we first need a dataset that's a little more difficult to predict. Then, we'll alter the model's parameters and do some preprocessing to fit the dataset better.

So, let's create the dataset first:

>>> from sklearn import datasets
>>> X, y = datasets.make_classification(n_samples=10000,
...                                     n_features=20,
...                                     n_informative=15,
...                                     flip_y=.5, weights=[.2, .8])
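
To see why this dataset is harder to predict, it can help to inspect what was generated. The following is a minimal sketch, not part of the original recipe: `random_state=0` is an assumption added only to make the run repeatable, and `class_fractions` is a name introduced here for illustration. Because `flip_y=.5` assigns roughly half of the labels at random, the observed class proportions should drift away from the requested 20/80 split.

```python
import numpy as np
from sklearn import datasets

# Same call as in the recipe; random_state=0 is an assumption added
# here so that repeated runs produce the same data.
X, y = datasets.make_classification(n_samples=10000,
                                    n_features=20,
                                    n_informative=15,
                                    flip_y=.5, weights=[.2, .8],
                                    random_state=0)

# 10,000 samples with 20 features each.
print(X.shape)

# Fraction of samples in each class. With flip_y=.5, about half of the
# labels are reassigned at random, so these fractions will not match
# the requested weights of .2 and .8.
class_fractions = np.bincount(y) / len(y)
print(class_fractions)
```

The random relabeling also puts a ceiling on how well any classifier can score on this data, which is worth keeping in mind when judging the tuned model.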

How to do it…

In this recipe, we will do the following:

  1. Create a training and test set. We won't just sail through this recipe like we did in the previous ...
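
The first step, creating a training and test set, might be sketched as follows. This is an illustration under stated assumptions rather than the recipe's own code: the 75/25 split, the `random_state=0` values, and the baseline of 100 trees are all choices made here, and the remaining steps of the recipe are not covered.

```python
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Recreate the dataset from "Getting ready" (random_state is an
# assumption, added for repeatability).
X, y = datasets.make_classification(n_samples=10000,
                                    n_features=20,
                                    n_informative=15,
                                    flip_y=.5, weights=[.2, .8],
                                    random_state=0)

# Hold out 25% of the samples for testing (the split size is an
# assumption, not taken from the recipe).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit an untuned baseline forest on the training portion only.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# Accuracy on data the model has not seen; later tuning can be
# compared against this number.
test_accuracy = rf.score(X_test, y_test)
print("baseline test accuracy:", test_accuracy)
```

Scoring on held-out data gives a reference point, so that any later change to the forest's parameters can be judged by whether it actually improves on this baseline.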
