CHAPTER 7Supervised Learning—Classification Using Logistic Regression

What Is Logistic Regression?

In the previous chapter, you learned about linear regression and how you can use it to predict future values. In this chapter, you will learn another supervised machine learning algorithm—logistic regression. Unlike linear regression, logistic regression does not try to predict the value of a numeric variable given a set of inputs. Instead, the output of logistic regression is the probability of a given input point belonging to a specific class. The output of logistic regression always lies in [0,1].

To understand the use of logistic regression, consider the example shown in Figure 7.1. Suppose that you have a dataset containing information about voter income and voting preferences. For this dataset, you can see that low‐income voters tend to vote for candidate B, while high‐income voters tend to favor candidate A.

Illustration depicting the use of logistic regression of a dataset containing information about voter income and voting preferences.

Figure 7.1: Some problems have binary outcomes

With this dataset, you would be very interested in trying to predict which candidate future voters will vote for based on their income level. At first glance, you might be tempted to apply what you have just learned to this problem; that is, using linear regression. Figure 7.2 shows what it looks like when you apply linear regression to this problem.

Figure 7.2: Using linear regression to solve the voting preferences problem ...

Get Python Machine Learning now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.