Linear Classification Models

In this section, we’ll look at a few popular linear classification models.

Logistic Regression

Suppose that you were trying to estimate the probability of a certain outcome (which we’ll call A) for a categorical variable with two values. You could try to predict the probability of A as a linear function of the predictor variables, assuming y = c₀ + c₁x₁ + c₂x₂ + ... + c_nx_n = Pr(A). The problem with this approach is that the value of y is unconstrained, while a probability must lie between 0 and 1. A good approach for dealing with this problem is to pick a function for y that varies between 0 and 1 for all possible predictor values. If we were to use that function as a link function in a generalized linear model, then we could build a model that estimates the probability of different outcomes. That is the idea behind logistic regression.
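The contrast between the two approaches can be sketched in R. This is only an illustration on simulated data: the variable names (x, a, d), the coefficients used to generate the data, and the seed are all assumptions, not part of the original example. It fits a plain linear model and a logistic regression to the same binary outcome, then predicts at a few extreme predictor values.

```r
set.seed(1)  # arbitrary seed so the simulated data is reproducible

# Simulated data (an assumption for illustration, not from the text):
# one predictor x and a binary outcome a (1 means outcome A occurred).
n <- 200
x <- rnorm(n)
a <- rbinom(n, size = 1, prob = plogis(-0.5 + 2 * x))
d <- data.frame(x = x, a = a)

# Naive approach: model Pr(A) directly as a linear function of x.
linear_fit <- lm(a ~ x, data = d)

# Logistic regression: a generalized linear model with a logit link.
logistic_fit <- glm(a ~ x, family = binomial(link = "logit"), data = d)

# Predict at some extreme predictor values.
new_x <- data.frame(x = c(-5, 0, 5))
linear_pred <- predict(linear_fit, newdata = new_x)
logistic_pred <- predict(logistic_fit, newdata = new_x, type = "response")

print(linear_pred)    # not constrained to lie between 0 and 1
print(logistic_pred)  # always between 0 and 1
```

Passing type = "response" to predict() returns estimated probabilities; the default for a glm fit is the scale of the linear predictor (the log odds).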

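In a logistic regression, the relationship between the predictor variables and the probability that an observation is a member of a given class is given by the logistic function:

Pr(A) = 1 / (1 + e^(−η)), where η = c₀ + c₁x₁ + c₂x₂ + ... + c_nx_n

The logit function (which is used as the link function) is the inverse of the logistic function. It maps a probability p to the log of the odds:

logit(p) = log(p / (1 − p))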
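As a minimal sketch, both functions can be written directly in R and checked against each other. The helper names logistic and logit and the sample values in eta are hypothetical, chosen only for illustration; plogis() and qlogis() are the equivalent functions in base R's stats package.

```r
# Logistic function: maps any real-valued linear predictor to (0, 1).
logistic <- function(eta) 1 / (1 + exp(-eta))

# Logit function: maps a probability in (0, 1) back to the real line.
logit <- function(p) log(p / (1 - p))

eta <- c(-4, -1, 0, 1, 4)  # illustrative values of the linear predictor
p <- logistic(eta)

# The two functions are inverses of each other.
all.equal(logit(p), eta)

# Base R provides the same functions as plogis() and qlogis().
all.equal(p, plogis(eta))
all.equal(logit(p), qlogis(p))
```

Because the logit is the link function, the coefficients of a fitted logistic regression are on the log-odds scale, and applying the logistic function to the linear predictor recovers a probability.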

Let’s take a look at a specific example of logistic regression. In particular, let’s look at the field goal data set. Each time a kicker attempts a field goal, there is a chance that the ...


R in a Nutshell by Joseph Adler
