# 5 Statistical Modeling

## 5.1 Concepts in Regression

What is statistical modeling?

• It is a formalization of relationships between variables in the form of mathematical equations.
• It describes how one or more random variables are related to one or more other variables.
• The variables are not deterministically but stochastically related.

### Example

• Height and age are probabilistically distributed among humans.
• They are stochastically related: knowing that a person is 30 years old changes the probability that the person is 4 feet tall, and knowing that a person is 13 years old changes the probability that the person is 6 feet tall.
• Model 1
• $\text{height}_i = b_0 + b_1\,\text{age}_i + \varepsilon_i$, where $b_0$ is the intercept, $b_1$ is the parameter that age is multiplied by to get a prediction of height, $\varepsilon_i$ is the error term, and $i$ indexes the subject.
• Model 2
• $\text{height}_i = b_0 + b_1\,\text{age}_i + b_2\,\text{sex}_i + \varepsilon_i$, where the variable sex is dichotomous.
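
As a rough illustration of Model 1 and Model 2, the sketch below simulates height data and estimates the parameters by ordinary least squares using NumPy. The "true" parameter values, the age range, the noise level, and the 0/1 coding of sex are all assumptions made up for this example; they do not come from real data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical "true" parameters, chosen only for this simulation
b0, b1, b2 = 80.0, 6.0, 5.0  # intercept (cm), cm per year of age, shift when sex == 1

age = rng.uniform(2, 16, size=n)    # ages in years
sex = rng.integers(0, 2, size=n)    # dichotomous variable coded 0/1
eps = rng.normal(0.0, 4.0, size=n)  # error term: the stochastic part of the relationship
height = b0 + b1 * age + b2 * sex + eps

# Model 1: height_i = b0 + b1 * age_i + eps_i
X1 = np.column_stack([np.ones(n), age])
coef1, *_ = np.linalg.lstsq(X1, height, rcond=None)

# Model 2: height_i = b0 + b1 * age_i + b2 * sex_i + eps_i
X2 = np.column_stack([np.ones(n), age, sex])
coef2, *_ = np.linalg.lstsq(X2, height, rcond=None)

print("Model 1 (b0, b1):", coef1)
print("Model 2 (b0, b1, b2):", coef2)
```

Because of the error term, the estimates land near the simulated parameter values without matching them exactly, which is what it means for the variables to be stochastically rather than deterministically related.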

Regression models involve the following components and take the following common forms:

• The unknown parameters
• The independent variables, X
• The dependent variable, Y
• $Y = a + bX$ is the simplest form of regression
• Linear regression: $Y = a + bX + \varepsilon$
• Multiple regression (often loosely called multivariate regression): $Y = a + bX_1 + cX_2 + \varepsilon$
• Logistic regression: $\ln\left(\frac{p}{1 - p}\right) = a + bX$
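
In the logistic form, the left-hand side is the log-odds of the outcome, so the linear predictor a + bX has to be transformed back to obtain a probability. The following minimal sketch shows that round trip using only the standard library. The coefficient values and the helper names `logit` and `inv_logit` are hypothetical, chosen for illustration.

```python
import math

def logit(p):
    """Log-odds: ln(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inv_logit(z):
    """Inverse of logit: maps a value of a + b*X back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only
a, b = -4.0, 0.5

for x in (0, 8, 16):
    z = a + b * x      # linear predictor, on the log-odds scale
    p = inv_logit(z)   # implied probability of the outcome
    print(f"X={x:2d}  log-odds={z:+.1f}  p={p:.3f}")
```

The linear predictor can take any real value, while the implied probability always stays strictly between 0 and 1.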

### Example

Okun’s Law: The relationship between an economy’s unemployment rate and its gross national product (GNP). Economist Arthur Okun developed this idea, ...
