Chapter 7. Training Systems

ML training is the process by which we transform input data into models. We take a set of input data, almost always preprocessed and stored in an efficient way, and process it through a set of ML algorithms. The output is a representation of that data, called a model, that we can integrate into other applications. For more details on what a model is, see Chapter 3.

A training algorithm describes the specific steps by which software reads data and updates a model to try to represent that data. A training system, on the other hand, describes the entire set of software surrounding that algorithm. The simplest ML training system runs as a single process on a single computer: it reads data, cleans it and imposes some consistency on it, applies an ML algorithm to it, and produces a model whose values reflect what it learned from the data. Training on a single computer is by far the simplest way to build a model, and the large cloud providers do rent powerful single-machine configurations. Note, though, that many interesting production uses of ML process a significant amount of data and, as a result, might benefit from significantly more than one computer. Distributing processing brings scale but also complexity.
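To make the distinction between a training algorithm and a training system concrete, here is a minimal single-process sketch. It is illustrative only: the inline CSV data, the function names (`read_rows`, `clean`, `train`, `run_training_system`), and the choice of linear regression trained by gradient descent are all assumptions made for this example, not anything the chapter prescribes.

```python
import csv
import io

# Hypothetical raw input. A real system would read from files or a data store.
RAW_CSV = """x,y
1,3
2,5
,7
3,7
4,nine
4,9
"""


def read_rows(text):
    """Read step: parse raw CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Cleaning step: keep only rows whose fields both parse as numbers."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append((float(row["x"]), float(row["y"])))
        except (TypeError, ValueError):
            continue  # Skip rows with missing or malformed values.
    return cleaned


def train(examples, learning_rate=0.05, epochs=2000):
    """Training algorithm: batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    # The "model" here is just the learned parameters.
    return {"w": w, "b": b}


def run_training_system(text):
    """Training system: everything around the algorithm, in one process."""
    return train(clean(read_rows(text)))


model = run_training_system(RAW_CSV)
print(model)
```

In this sketch only `train` is the training algorithm. The reading, cleaning, and orchestration around it are what make up the (very small) training system, and those are the parts that grow most when processing is distributed across machines.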

In part because of our broad conception of what an ML training system is, ML training systems may have less in common with one another across different ...
