Artificial Intelligence for Big Data
by Anand Deshpande, Manish Kumar, Albenzo Coletta, Giancarlo Zaccone
Pipeline
A Pipeline represents a sequence of stages, where every stage is either a transformer or an estimator. The stages run in order, and the input dataset is transformed as it passes through each one. For a transformer stage, the transform() method is called on the dataset; for an estimator stage, the fit() method is called to produce a transformer, whose transform() method is then applied to the dataset.
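The difference between the two kinds of stage can be sketched in plain Python. This is a toy illustration only, not the Spark API: the classes Lowercase, MeanCenter, and MeanCenterModel are hypothetical, and a list of dictionaries stands in for a DataFrame.

```python
from dataclasses import dataclass
from typing import List

# Assumption: a list of row dictionaries stands in for a DataFrame.
Rows = List[dict]


class Lowercase:
    """Hypothetical transformer: exposes transform() and needs no training."""

    def transform(self, rows: Rows) -> Rows:
        return [{**r, "text": r["text"].lower()} for r in rows]


@dataclass
class MeanCenterModel:
    """Hypothetical transformer produced by fitting MeanCenter."""

    mean: float

    def transform(self, rows: Rows) -> Rows:
        return [{**r, "x": r["x"] - self.mean} for r in rows]


class MeanCenter:
    """Hypothetical estimator: fit() learns from the data and returns a transformer."""

    def fit(self, rows: Rows) -> MeanCenterModel:
        return MeanCenterModel(mean=sum(r["x"] for r in rows) / len(rows))


rows = [{"text": "Spark ML", "x": 1.0}, {"text": "Pipeline", "x": 3.0}]

lowered = Lowercase().transform(rows)   # transformer stage: transform() only
model = MeanCenter().fit(lowered)       # estimator stage: fit() yields a transformer
centered = model.transform(lowered)     # the fitted transformer is then applied
```

The estimator itself never changes the data; only the transformer returned by fit() does, which is why a fitted model can later be reused on new data.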
The DataFrame output by one stage is the input to the next stage. A Pipeline is itself an estimator, so running its fit() method produces a PipelineModel, which is a transformer. The PipelineModel contains the same number of stages as the original pipeline. Pipelines and PipelineModels make sure that the test and training data pass through similar feature-processing ...
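The fit-then-transform flow described above can be sketched as a small pure-Python model. It is a simplified illustration under stated assumptions, not Spark's implementation: the Pipeline and PipelineModel classes here are toy stand-ins that borrow the Spark names, Square and MaxScaler are hypothetical stages, and a list of floats stands in for a DataFrame.

```python
from typing import List, Sequence


class Square:
    """Hypothetical transformer stage."""

    def transform(self, xs: List[float]) -> List[float]:
        return [x * x for x in xs]


class MaxScalerModel:
    """Hypothetical transformer holding the maximum learned during fit()."""

    def __init__(self, max_value: float) -> None:
        self.max_value = max_value

    def transform(self, xs: List[float]) -> List[float]:
        return [x / self.max_value for x in xs]


class MaxScaler:
    """Hypothetical estimator stage."""

    def fit(self, xs: List[float]) -> MaxScalerModel:
        return MaxScalerModel(max(xs))


class PipelineModel:
    """Toy fitted pipeline: every stage is a transformer."""

    def __init__(self, stages: Sequence) -> None:
        self.stages = list(stages)

    def transform(self, data: List[float]) -> List[float]:
        for stage in self.stages:
            data = stage.transform(data)
        return data


class Pipeline:
    """Toy pipeline: an estimator whose fit() returns a PipelineModel."""

    def __init__(self, stages: Sequence) -> None:
        self.stages = list(stages)

    def fit(self, data: List[float]) -> PipelineModel:
        fitted = []
        for stage in self.stages:
            if hasattr(stage, "fit"):
                stage = stage.fit(data)    # estimator becomes a transformer
            fitted.append(stage)
            data = stage.transform(data)   # simplification: output feeds the next stage
        return PipelineModel(fitted)


train = [1.0, 2.0, 4.0]
test = [2.0, 3.0]

model = Pipeline([Square(), MaxScaler()]).fit(train)
test_out = model.transform(test)  # scaled by the maximum learned from train
```

Because the fitted model reuses the maximum learned from the training data, the test data goes through the same processing steps with the same learned parameters.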