Let's kick the tires

This final section introduces the key elements of the training and classification workflow. A test case based on a simple logistic regression illustrates each step of the computational workflow.

An overview of computational workflows

In its simplest form, a computational workflow to perform runtime processing of a dataset is composed of the following stages:

  1. Loading the dataset from files, databases, or any streaming devices.
  2. Splitting the dataset for parallel data processing.
  3. Preprocessing the data with filtering techniques and analysis of variance, and applying penalty and normalization functions where necessary.
  4. Applying the model—either a set of clusters or classes—to classify new data.
  5. Assessing the quality of the model. ...
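The stages listed above can be sketched in a few lines of plain Scala. This is only an illustrative outline, not the implementation used in the test case: the object `WorkflowSketch`, the in-memory dataset, the partition size, and the fixed logistic weights are all hypothetical, and a real workflow would train the weights rather than hard-code them. The normalization bounds are computed over the whole dataset before splitting so that every partition is scaled consistently.

```scala
// Hypothetical sketch of the five workflow stages; all names and data are illustrative.
object WorkflowSketch {
  type Features = Vector[Double]
  final case class Obs(features: Features, label: Int)
  final case class Bounds(mins: Vector[Double], maxs: Vector[Double])

  // 1. Load: stands in for reading from a file, a database, or a stream.
  def load(): Vector[Obs] = Vector(
    Obs(Vector(1.0, 10.0), 0),
    Obs(Vector(2.0, 20.0), 0),
    Obs(Vector(8.0, 80.0), 1),
    Obs(Vector(9.0, 90.0), 1)
  )

  // 2. Split into fixed-size partitions that could be processed in parallel.
  def split(data: Vector[Obs], size: Int): Vector[Vector[Obs]] =
    data.grouped(size).toVector

  // Per-feature minimum and maximum over the whole dataset.
  def bounds(data: Vector[Obs]): Bounds = {
    val cols = data.map(_.features).transpose
    Bounds(cols.map(_.min), cols.map(_.max))
  }

  // 3. Preprocess: min-max normalization of each feature to [0, 1].
  def normalize(b: Bounds)(o: Obs): Obs = {
    val scaled = o.features.zipWithIndex.map { case (x, i) =>
      val range = b.maxs(i) - b.mins(i)
      if (range == 0.0) 0.0 else (x - b.mins(i)) / range
    }
    o.copy(features = scaled)
  }

  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // 4. Apply a logistic model (weights assumed to be already trained).
  def classify(weights: Features, bias: Double)(x: Features): Int = {
    val z = weights.zip(x).map { case (w, xi) => w * xi }.sum + bias
    if (sigmoid(z) >= 0.5) 1 else 0
  }

  // 5. Assess quality: fraction of predictions that match the labels.
  def accuracy(predicted: Seq[Int], expected: Seq[Int]): Double =
    predicted.zip(expected).count { case (p, e) => p == e }.toDouble / expected.size

  def run(): Double = {
    val raw = load()
    val b = bounds(raw)
    val partitions = split(raw, 2)
    val cleaned = partitions.map(_.map(normalize(b)))
    // Hypothetical weights and bias, chosen by hand for this toy dataset.
    val model: Features => Int = classify(Vector(4.0, 4.0), -4.0)
    val predictions = cleaned.flatMap(_.map(o => model(o.features)))
    accuracy(predictions, raw.map(_.label))
  }

  def main(args: Array[String]): Unit =
    println(s"accuracy = ${run()}")
}
```

Each stage is a separate pure function, so any one of them (for example the loader or the quality metric) can be swapped without touching the others.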
