Wrapping everything in a pipeline

To conclude, we will discuss how to wrap the transformation and selection operations we have seen so far into a single command: a pipeline that takes your data from its source to your machine learning algorithm.

Wrapping all of your data operations into a single command offers some advantages:

  • Your code becomes clearer and more logically structured, because pipelines force you to express every operation as a function (each step is a function).
  • You treat the test data in exactly the same way as the training data, without repeating code and without the risk of introducing mistakes along the way.
  • You can easily grid search the best parameters on all the data pipelines you have devised, not ...
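
The points above can be illustrated with a minimal sketch, assuming scikit-learn's Pipeline and GridSearchCV classes. The synthetic dataset, the step names ("scale", "select", "model"), the choice of estimator, and the parameter values are all illustrative assumptions rather than the book's own example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data, used only for illustration (assumption, not from the book)
X, y = make_classification(n_samples=300, n_features=20,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Each step is a named object; the last one is the learning algorithm
pipeline = Pipeline([
    ("scale", StandardScaler()),                    # transformation
    ("select", SelectKBest(score_func=f_classif)),  # feature selection
    ("model", LogisticRegression(max_iter=1000)),   # estimator
])

# Parameters of any step are addressed as <step name>__<parameter name>
param_grid = {
    "select__k": [5, 10, 20],
    "model__C": [0.1, 1.0, 10.0],
}

# The grid search tunes the whole pipeline, refitting every step per fold
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

# The test data passes through the same fitted steps; nothing is repeated
test_accuracy = search.score(X_test, y_test)
print(search.best_params_, test_accuracy)
```

Because the scaler and the selector are fitted inside each cross-validation fold, the held-out part of every fold never influences the preprocessing, which is harder to guarantee when the steps are applied by hand.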
