Why do we need the Pipeline API?
Before digging into the details of the Pipeline API, it helps to understand what a machine learning pipeline is and why we need a Pipeline API.
A machine learning platform cannot be efficient if all it provides is a collection of algorithms for people to use. Machine learning is an involved process with multiple steps, and the learning algorithm itself is just one (though very important) step in that process. As an example, consider text classification: you have a corpus of text, and you want to classify each article as either a sports article or not a sports article. We would like to simplify this to a 1 and a 0, where a 1 ...