December 2018
684 pages
We will use a slightly more complicated model to illustrate Markov chain Monte Carlo inference:
formula = 'income ~ sex + age + I(age ** 2) + hours + educ'
Patsy's identity function, I(), allows us to use regular Python expressions to create new variables on the fly. Here, we square age to capture the non-linear relationship in which additional experience adds less income later in life.
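To see what I() does to the design matrix, the following minimal sketch builds one from a small, made-up DataFrame. The three age values are hypothetical and only the age terms of the formula are used; it is an illustration, not the dataset or code used in the chapter.

```python
import pandas as pd
from patsy import dmatrix

# Hypothetical toy data; only the column name 'age' comes from the formula above.
df = pd.DataFrame({'age': [25.0, 40.0, 60.0]})

# Inside I(), ** is evaluated as ordinary Python exponentiation.
# Outside I(), patsy would interpret ** as a formula operator instead.
design = dmatrix('age + I(age ** 2)', data=df, return_type='dataframe')

# The design matrix holds an intercept, age, and the squared-age column.
print(design.columns.tolist())

# The last column contains age squared for each row.
squared = design.iloc[:, -1].tolist()
print(squared)
```

The squared term is created while the design matrix is built, so the original DataFrame does not need an extra column.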
Note that variables measured on very different scales can slow down the sampling process. Hence, we first apply sklearn's scale() function to standardize the age, hours, and educ variables.
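The standardization step could look roughly like the sketch below. The DataFrame and its values are invented for illustration; only the column names age, hours, and educ and the use of sklearn's scale() come from the text.

```python
import pandas as pd
from sklearn.preprocessing import scale

# Hypothetical toy data with the three continuous predictors named in the text.
df = pd.DataFrame({
    'age': [25.0, 40.0, 60.0, 33.0],
    'hours': [40.0, 20.0, 50.0, 45.0],
    'educ': [12.0, 16.0, 9.0, 13.0],
})

cols = ['age', 'hours', 'educ']

# scale() centers each column to zero mean and rescales it to unit variance
# (using the population standard deviation, i.e. ddof=0).
df[cols] = scale(df[cols])

print(df[cols].mean().round(6))
print(df[cols].std(ddof=0).round(6))
```

After this step all three predictors are on a comparable scale, which is the property the sampler benefits from.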
Once we have defined our model with the new formula, we are ready to perform inference to approximate the posterior distribution. MCMC sampling algorithms are ...