Let's kick the tires
This final section introduces the key elements of the training and classification workflow. A test case based on a simple logistic regression illustrates each step of the computational workflow.
Overview of computational workflows
In its simplest form, a computational workflow to perform runtime processing of a dataset is composed of the following stages:
- Loading the dataset from files, databases, or any streaming devices.
- Splitting the dataset for parallel data processing.
- Preprocessing the data with filtering techniques and analysis of variance, and applying penalty and normalization functions whenever necessary.
- Applying the model, either a set of clusters or a set of classes, to classify new data.
- Assessing the quality of the model.
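The five stages above can be sketched in a few lines of plain Scala. This is only a minimal illustration under stated assumptions, not the implementation used later in the book: the names `WorkflowSketch`, `Obs`, `load`, `split`, `bounds`, `normalize`, `classify`, `accuracy`, and `run` are hypothetical, the data is loaded from an in-memory sequence rather than a file or stream, preprocessing is reduced to min-max normalization, and the logistic model uses hand-picked weights instead of trained ones.

```scala
// Hypothetical sketch of the workflow stages; every name here is illustrative.
object WorkflowSketch {
  // One labeled observation: a feature vector and a binary label (0 or 1).
  final case class Obs(features: Vector[Double], label: Int)

  // Stage 1: load the dataset (assumed here to be an in-memory sequence).
  def load(rows: Seq[(Vector[Double], Int)]): Vector[Obs] =
    rows.map { case (x, y) => Obs(x, y) }.toVector

  // Stage 2: split the dataset into partitions that could be processed in parallel.
  def split(data: Vector[Obs], partitionSize: Int): Vector[Vector[Obs]] =
    data.grouped(partitionSize).toVector

  // Per-feature minimum and maximum, computed once over the full dataset
  // so that every partition is scaled with the same bounds.
  def bounds(data: Vector[Obs]): (Vector[Double], Vector[Double]) = {
    require(data.nonEmpty, "dataset must not be empty")
    val dims = data.head.features.indices
    val mins = dims.map(i => data.map(_.features(i)).min).toVector
    val maxs = dims.map(i => data.map(_.features(i)).max).toVector
    (mins, maxs)
  }

  // Stage 3: preprocess one partition with min-max normalization to [0, 1].
  def normalize(mins: Vector[Double], maxs: Vector[Double])(part: Vector[Obs]): Vector[Obs] =
    part.map { o =>
      val scaled = o.features.indices.map { i =>
        val range = maxs(i) - mins(i)
        if (range == 0.0) 0.0 else (o.features(i) - mins(i)) / range
      }.toVector
      o.copy(features = scaled)
    }

  // Stage 4: apply a model. The logistic function with fixed (untrained)
  // weights assigns class 1 when the estimated probability is at least 0.5.
  def classify(weights: Vector[Double], bias: Double)(x: Vector[Double]): Int = {
    val z = weights.zip(x).map { case (w, v) => w * v }.sum + bias
    val p = 1.0 / (1.0 + math.exp(-z))
    if (p >= 0.5) 1 else 0
  }

  // Stage 5: assess the quality of the model as the fraction of correct labels.
  def accuracy(data: Vector[Obs], model: Vector[Double] => Int): Double =
    data.count(o => model(o.features) == o.label).toDouble / data.size

  // Chains the five stages and returns the accuracy on the loaded data.
  def run(rows: Seq[(Vector[Double], Int)], weights: Vector[Double],
          bias: Double, partitionSize: Int): Double = {
    val data = load(rows)
    val (mins, maxs) = bounds(data)
    val partitions = split(data, partitionSize)
    val cleaned = partitions.map(normalize(mins, maxs))
    val model: Vector[Double] => Int = classify(weights, bias)
    accuracy(cleaned.flatten, model)
  }
}
```

Calling `WorkflowSketch.run(rows, Vector(10.0), -5.0, 2)` on a small one-feature dataset runs all five stages in order. Keeping each stage as a separate function means that any one of them, for instance the normalization or the quality metric, can be swapped out without touching the others.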