5 Statistical Modeling
5.1 Concepts in Regression
What is statistical modeling?
- It is a formalization of relationships between variables in the form of mathematical equations.
- It describes how one or more random variables are related to one or more other variables.
- The variables are not deterministically but stochastically related.
Reading: Statistical Modeling: The Two Cultures, http://projecteuclid.org/download/pdf_1/euclid.ss/1009213726
Example
- Height and age are probabilistically distributed among humans.
- They are stochastically related: knowing a person's age changes the probability of each possible height without fixing the height exactly. Knowing that a person is 30 years old lowers the chance that the person is 4 feet tall; knowing that a person is 13 years old lowers the chance that the person is 6 feet tall.
- Model 1
- height_i = b0 + b1·age_i + ε_i, where b0 is the intercept, b1 is the coefficient that age is multiplied by to get a prediction of height, ε_i is the error term, and i indexes the subject.
- Model 2
- height_i = b0 + b1·age_i + b2·sex_i + ε_i, where the variable sex is dichotomous.
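To make Model 1 concrete, the following is a minimal sketch that simulates heights from the model and then recovers b0 and b1 by ordinary least squares. The helper name `fit_simple_ols`, the "true" values 80, 6, and 5, the age range, and the sample size are all assumptions chosen for illustration; they do not come from real data.

```python
import random


def fit_simple_ols(x, y):
    """Return (b0, b1) that minimize squared error for y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    # Intercept: makes the fitted line pass through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical "true" values, chosen only for illustration (height in cm).
TRUE_B0, TRUE_B1, NOISE_SD = 80.0, 6.0, 5.0

rng = random.Random(0)  # fixed seed so the run is repeatable
ages = [rng.uniform(2, 16) for _ in range(500)]
# Stochastic relation: people of the same age differ in height
# because each one gets an independent draw of the error term.
heights = [TRUE_B0 + TRUE_B1 * age + rng.gauss(0, NOISE_SD) for age in ages]

b0_hat, b1_hat = fit_simple_ols(ages, heights)
print(f"estimated intercept b0: {b0_hat:.2f}")
print(f"estimated slope b1:     {b1_hat:.2f}")
```

The estimates should land close to, but not exactly on, the values used to generate the data, which is the error term at work. Model 2 would add a 0/1 column for sex as a second predictor, which requires a multiple-regression solver rather than this closed form.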
Regression models involve the following quantities:
- The unknown parameters
- The independent variables, X
- The dependent variable, Y
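Common forms of regression:
- Simplest form: Y = a + bX
- Linear regression: Y = a + bX + E, where E is the error term
- Multivariate regression (more than one predictor; strictly speaking, multiple regression): Y = a + bX1 + cX2 + E
- Logistic regression: ln(p/(1 − p)) = a + bX, where p is the probability of the outcome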
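The logistic form models the log-odds ln(p/(1 − p)) as a linear function of X, so a probability is recovered by inverting that transformation. A minimal sketch of the two directions follows; the function names `logit` and `inverse_logit` and the coefficients `A` and `B` are hypothetical, picked only to show the mechanics.

```python
import math


def logit(p):
    """Log-odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1 - p))


def inverse_logit(z):
    """Map a log-odds value z back to a probability between 0 and 1."""
    return 1 / (1 + math.exp(-z))


# Hypothetical coefficients for ln(p/(1 - p)) = A + B * x.
A, B = -4.0, 0.5

for x in (0, 8, 16):
    log_odds = A + B * x
    p = inverse_logit(log_odds)
    print(f"x = {x:2d}  log-odds = {log_odds:5.1f}  p = {p:.3f}")
```

The linear predictor a + bX can take any real value, while the probability it maps to always stays strictly between 0 and 1. That is why the log-odds, rather than p itself, is modeled as linear in X.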
Example
Okun’s Law: the relationship between an economy’s unemployment rate and its gross national product (GNP). Economist Arthur Okun developed this idea, ...