2 OVERVIEW OF THE MACHINE LEARNING PROCESS

In this chapter, we give an overview of the steps involved in machine learning (ML), starting from a clear goal definition and ending with model deployment. The general steps are shown schematically in Figure 2.1. We also discuss issues related to data collection, cleaning, and preprocessing. We explain the notion of data partitioning, in which models are trained on a set of training data and their performance is then evaluated on a separate set of validation data, and how this practice helps avoid overfitting. Finally, we illustrate the steps of model building by applying them to data.


FIGURE 2.1 Schematic of the machine learning process
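
To make the idea of data partitioning concrete, the following is a minimal sketch in Python. It is not taken from the book: the function name `partition`, the 60/40 split, the seed, and the toy record IDs are all assumptions chosen for illustration. It shows only the mechanics of randomly dividing records into a training set and a separate validation set so that no record appears in both.

```python
import random


def partition(records, train_fraction=0.6, seed=1):
    """Randomly split records into (training, validation) lists.

    Hypothetical helper for illustration; the 60/40 default is an assumption.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy data: 10 record IDs standing in for the rows of a dataset.
records = list(range(10))
train, valid = partition(records, train_fraction=0.6, seed=1)

print("training size:", len(train))
print("validation size:", len(valid))
print("overlap:", set(train) & set(valid))
```

A model would be fit using only `train` and then scored on `valid`. Because the validation records played no part in fitting, a model that has merely memorized the training data will tend to score noticeably worse on them, which is how partitioning helps expose overfitting.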

2.1 INTRODUCTION

In Chapter 1, we saw some very general definitions of machine learning. In this chapter, we introduce a variety of machine learning methods. The core of this book focuses on what has come to be called predictive analytics: the tasks of classification and prediction, as well as pattern discovery, which have become key elements of a business analytics function in most large firms. These terms are described next.

2.2 CORE IDEAS IN MACHINE LEARNING

Classification

Classification is perhaps the most basic form of predictive analytics. The recipient of an offer can respond or not respond. An applicant for a loan can repay on time, repay late, or declare bankruptcy. A credit card transaction can be normal ...
