13. Generalized Linear Models
13.1 Introduction
Not every response variable will be continuous, so a linear regression will not be the correct model in every circumstance. Some outcomes may contain binary data (e.g., sick, not-sick), or even count data (e.g., how many heads will I get). A general class of models called “generalized linear models” (GLM) can account for these types of data, yet still use a linear combination of predictors.
13.2 Logistic Regression
When you have a binary response variable, logistic regression is often used to model the data. Here’s some data from the American Community Survey (ACS) for New York.
import pandas as pd acs = pd.read_csv('../data/acs_ny.csv') print(acs.columns)
Get Pandas for Everyone: Python Data Analysis, First Edition now with the O’Reilly learning platform.
O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.