Statistics for Machine Learning

by Pratap Dangeti
July 2017
Beginner to intermediate
442 pages
10h 8m
English
Packt Publishing

Grid search on random forest

Grid search is performed over the following hyperparameter settings. Readers are encouraged to try other values to explore this space further.

  • Number of trees: (1000, 2000, 3000)
  • Maximum depth: (100, 200, 300)
  • Minimum samples per split: (2, 3)
  • Minimum samples per leaf node: (1, 2)
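
The settings above define 3 × 3 × 2 × 2 = 36 candidate combinations, each of which a grid search fits and scores. The following is a minimal sketch in plain Python that enumerates them. The dictionary keys are assumed to be the matching scikit-learn RandomForestClassifier argument names; the list itself does not state them.

```python
from itertools import product

# Hyperparameter values from the list above. The key names are an
# assumption: they are the RandomForestClassifier arguments that
# correspond to each bullet.
grid = {
    "n_estimators": (1000, 2000, 3000),
    "max_depth": (100, 200, 300),
    "min_samples_split": (2, 3),
    "min_samples_leaf": (1, 2),
}

# Cartesian product of all value tuples: one dict per candidate model.
combinations = [dict(zip(grid, values)) for values in product(*grid.values())]

print(len(combinations))  # 36
print(combinations[0])
```

Each combination is fitted once per cross-validation fold, so with 5-fold cross-validation (an illustrative choice) this grid would require 180 model fits, each with at least 1000 trees. Trying a smaller grid first is a reasonable way to keep the run time manageable.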

Import Pipeline as follows:

>>> from sklearn.pipeline import Pipeline
>>> from sklearn.model_selection import train_test_split, GridSearchCV

The Pipeline class wraps the classifier as a named step (here clf), so that its hyperparameters can be addressed with the clf__ prefix. The grid search then applies each combination of values in turn to determine the best one:

>>> pipeline = Pipeline([('clf', RandomForestClassifier(criterion='gini'))])
>>> parameters = {
...     'clf__n_estimators': (1000, 2000, 3000),
...
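
The same pattern can be sketched end to end as a self-contained script. This is an illustration under stated assumptions, not the book's listing: the data is synthetic (make_classification), the grid values are deliberately much smaller than those listed above so that it runs in seconds, and the choices of cv=3, accuracy scoring, and random_state=42 are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (an assumption; not the book's dataset).
X, y = make_classification(n_samples=200, n_features=10, random_state=42)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Single-step pipeline; the step name 'clf' becomes the parameter prefix.
pipeline = Pipeline(
    [("clf", RandomForestClassifier(criterion="gini", random_state=42))]
)

# Deliberately small grid (2 x 2 x 2 x 2 = 16 combinations).
parameters = {
    "clf__n_estimators": (10, 20),
    "clf__max_depth": (3, 5),
    "clf__min_samples_split": (2, 3),
    "clf__min_samples_leaf": (1, 2),
}

# Fit every combination with 3-fold cross-validation, then refit the
# best one on the full training set.
grid_search = GridSearchCV(pipeline, parameters, cv=3, scoring="accuracy")
grid_search.fit(x_train, y_train)

print("Best parameters:", grid_search.best_params_)
print("Best CV accuracy:", grid_search.best_score_)
print("Test accuracy:", grid_search.score(x_test, y_test))
```

After fitting, best_params_ holds the winning combination keyed by the prefixed names, and cv_results_ holds the per-combination scores for closer inspection.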
ISBN: 9781788295758