Reducing the grid search runtime

GridSearchCV can do a great deal of work for you by checking every combination of parameters in your grid specification. However, when the dataset or the search space is large, the procedure may take a long time to compute.
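To get a feel for how quickly the cost grows, the following minimal sketch counts the combinations in a small grid using ParameterGrid from the model_selection module. The grid values and the five-fold setting are illustrative assumptions, not values taken from the text.

```python
from sklearn.model_selection import ParameterGrid

# Hypothetical grid, chosen only for illustration
param_grid = {
    "C": [0.1, 1, 10, 100],
    "gamma": [0.001, 0.01, 0.1, 1],
    "kernel": ["rbf", "linear"],
}

n_combinations = len(ParameterGrid(param_grid))  # 4 * 4 * 2 combinations
n_folds = 5  # assumed number of cross-validation folds
n_fits = n_combinations * n_folds  # models an exhaustive search would fit

print(n_combinations)  # 32
print(n_fits)          # 160
```

Adding one more parameter with four candidate values would multiply both figures by four, which is why exhaustive search becomes expensive so quickly.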

The model_selection module offers a remedy for this issue: RandomizedSearchCV. Instead of trying every combination, it randomly draws a sample of combinations and reports the best one found.

This has some clear advantages:

  • You can limit the number of computations.
  • You can obtain a good result or, at worst, understand where to focus your efforts in a subsequent grid search.
  • RandomizedSearchCV has the same options as GridSearchCV ...
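
As a rough illustration of the points above, here is a minimal sketch of a randomized search. The dataset (Iris), the estimator (SVC), the parameter distributions, and the n_iter value are all assumptions made for the example, not choices taken from the text.

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hypothetical search space: continuous parameters are sampled from
# distributions, while lists are sampled uniformly
param_distributions = {
    "C": loguniform(1e-2, 1e3),
    "gamma": loguniform(1e-4, 1e0),
    "kernel": ["rbf", "linear"],
}

search = RandomizedSearchCV(
    SVC(),
    param_distributions=param_distributions,
    n_iter=10,        # caps the number of sampled combinations
    cv=5,
    scoring="accuracy",
    random_state=0,   # makes the sampled combinations reproducible
)
search.fit(X, y)

print(search.best_params_)  # best combination among the 10 sampled
print(search.best_score_)   # its mean cross-validated accuracy
```

Here n_iter fixes the budget at 10 combinations regardless of how large the search space is. If the best parameters cluster in one region, a narrower grid around that region can then be explored with GridSearchCV.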
