The vast majority of problems in applied data science require regression modeling. That is, you have a response variable (y) that you want to model or predict as a function of a vector of inputs or covariates (x). This chapter introduces the basic framework and language of regression. We will build on this material throughout the rest of the book.

Linear Models

A basic but powerful regression strategy is to deal in averages and lines. We model the conditional mean for y given x as


Here, x = [1, x1, x2, ..., xp] is a vector of covariates ...
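
To make the idea of "averages and lines" concrete, here is a minimal sketch of fitting a linear conditional mean by least squares. It assumes the simplest form of the model, E[y | x] = x'β, with a coefficient vector β = [β0, β1, ..., βp] matching the covariate vector x = [1, x1, ..., xp]. The symbol β, the function names `fit_linear_mean` and `predict_mean`, and the small data set are all hypothetical choices made for illustration; they are not taken from the text.

```python
def fit_linear_mean(rows, y):
    """Least-squares estimate of beta in the assumed model E[y | x] = x'beta.

    rows: list of covariate lists [x1, ..., xp], WITHOUT the leading 1.
    y:    list of responses, one per row.
    """
    # Prepend 1 to every row so that x = [1, x1, ..., xp] (intercept first).
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])

    # Augmented normal equations [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("covariates are collinear; X'X is singular")
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]

    # The last column now holds the solution beta = [beta0, ..., betap].
    return [A[i][k] for i in range(k)]


def predict_mean(beta, row):
    """Fitted conditional mean x'beta for one covariate list [x1, ..., xp]."""
    return beta[0] + sum(b * float(v) for b, v in zip(beta[1:], row))


# Hypothetical data: five observations, two covariates.
rows = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0], [3.0, 3.0], [4.0, 2.0]]
y = [0.4, 3.1, 4.6, 5.4, 8.1]

beta_hat = fit_linear_mean(rows, y)
print("estimated coefficients:", beta_hat)
print("fitted mean at x1=2, x2=2:", predict_mean(beta_hat, [2.0, 2.0]))
```

Solving the normal equations by hand keeps the sketch self-contained. In practice you would more likely call a library routine such as `numpy.linalg.lstsq`, which is numerically more stable.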
