Chapter 16. Logistic Regression
16.0 Introduction
Despite being called a regression, logistic regression is actually a widely used supervised classification technique. Logistic regression and its extensions, like multinomial logistic regression, allow us to predict the probability that an observation is of a certain class using a straightforward and well-understood approach. In this chapter, we will cover training a variety of classifiers using scikit-learn.
16.1 Training a Binary Classifier
Problem
You need to train a simple classifier model.
Solution
Train a logistic regression in scikit-learn using LogisticRegression:
# Load libraries
from sklearn.linear_model import LogisticRegression
from sklearn import datasets
from sklearn.preprocessing import StandardScaler

# Load data with only two classes
iris = datasets.load_iris()
features = iris.data[:100, :]
target = iris.target[:100]

# Standardize features
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)

# Create logistic regression object
logistic_regression = LogisticRegression(random_state=0)

# Train model
model = logistic_regression.fit(features_standardized, target)
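Once trained, the model can be used to classify a new observation. The following is a minimal sketch rather than part of the solution itself: the measurements in new_observation are hypothetical values chosen only for illustration, and the sketch assumes the new observation should be rescaled with the same fitted scaler before prediction.

```python
# Load libraries
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Load data with only two classes (same setup as the solution)
iris = datasets.load_iris()
features = iris.data[:100, :]
target = iris.target[:100]

# Standardize features and train the model
scaler = StandardScaler()
features_standardized = scaler.fit_transform(features)
model = LogisticRegression(random_state=0).fit(features_standardized, target)

# Hypothetical new observation in the original (unscaled) units
new_observation = [[5.0, 3.4, 1.5, 0.2]]

# Apply the scaling learned from the training data
new_observation_standardized = scaler.transform(new_observation)

# Predict the class label
predicted_class = model.predict(new_observation_standardized)

# Predict the probability of each of the two classes
predicted_probabilities = model.predict_proba(new_observation_standardized)

print(predicted_class)
print(predicted_probabilities)
```

predict returns one label per observation, while predict_proba returns one row per observation with a column for each class; each row sums to 1.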
Discussion
Despite having “regression” in its name, a logistic regression is actually a widely used binary classifier (i.e., the target vector can take only two values). In a logistic regression, a linear model (e.g., $\beta_0 + \beta_1 x$) is passed through a logistic (also called sigmoid) function, $\frac{1}{1+e^{-z}}$, such that: