Reducing the grid search runtime

The GridSearchCV class can handle an extensive amount of work for you by checking every combination of parameters in your grid specification. However, when the dataset or the grid search space is large, the procedure may take a long time to compute.
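To get a feel for how quickly the work grows, the following minimal sketch counts the combinations in a grid before any model is fitted. The grid values and the five-fold setting are illustrative assumptions, not taken from the text.

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical grid for a support vector classifier (illustrative values only).
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": [0.001, 0.01, 0.1, 1],
    "kernel": ["rbf", "poly"],
}

# ParameterGrid enumerates every combination that an exhaustive search would try.
n_combinations = len(ParameterGrid(param_grid))

# Each combination is fitted once per cross-validation fold (assumed: 5 folds).
cv_folds = 5
n_fits = n_combinations * cv_folds

print(n_combinations, n_fits)
```

Adding one more parameter with four candidate values would multiply both numbers by four, which is why an exhaustive search becomes expensive so quickly.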

A potential remedy is another tool from the model_selection module: RandomizedSearchCV, which randomly draws a sample of the parameter combinations and reports the best combination found.

This has some clear advantages:

  • You can limit the number of computations.
  • You can obtain a good result or, at worst, understand where to focus your efforts in the grid search.
  • RandomizedSearchCV has the same options as GridSearchCV ...
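As a rough illustration of the points above, here is a minimal sketch of a randomized search. The dataset (Iris), the estimator (SVC), the candidate values, and the budget of 10 draws are all assumptions chosen for the example, not the book's own setup.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

# Small built-in dataset, used here only to keep the example quick to run.
X, y = load_iris(return_X_y=True)

# Same kind of specification a grid search would take: 4 * 4 * 2 = 32 combinations.
param_distributions = {
    "C": [0.1, 1, 10, 100],
    "gamma": [0.001, 0.01, 0.1, 1],
    "kernel": ["rbf", "poly"],
}

# n_iter caps the number of sampled combinations (10 instead of all 32).
search = RandomizedSearchCV(
    SVC(),
    param_distributions=param_distributions,
    n_iter=10,
    cv=5,
    scoring="accuracy",
    random_state=1,  # fixed seed so the same combinations are drawn each run
)
search.fit(X, y)

# Best sampled combination and its mean cross-validated accuracy.
print(search.best_params_)
print(search.best_score_)
```

The results in search.cv_results_ can then suggest which region of the parameter space deserves a finer, exhaustive grid search. Besides lists, param_distributions also accepts distribution objects that expose an rvs method, such as those in scipy.stats, which is useful for continuous parameters.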
