# Cross-validating our model

Now, before we cheat and look at our answer key, let's see how well this solution predicts data it hasn't seen. To do this, I wrote the following fairly large test:

```python
import pandas


def final_model_cross_validation_test():
    # Score the model against the data it was fitted on.
    df = pandas.read_csv('./generated_data.csv')
    df['predicted_dependent_var'] = 25.6266 \
        + 2.7083*df['ind_var_a'] \
        - 1.5527*df['ind_var_b'] \
        - 0.3917*df['ind_var_c'] \
        - 0.2006*df['ind_var_e'] \
        + 5.6450*df['ind_var_b'] * df['ind_var_c']
    # Absolute error of each prediction on the training data.
    df['diff'] = (df['dependent_var'] - df['predicted_dependent_var']).abs()
    print(df['diff'])
    print('===========')
    # Apply the same coefficients to the held-out cross-validation data.
    cv_df = pandas.read_csv('./generated_data_cv.csv')
    cv_df['predicted_dependent_var'] = 25.6266 \
        + 2.7083*cv_df['ind_var_a'] \
        - 1.5527*cv_df['ind_var_b'] \
        - 0.3917*cv_df['ind_var_c'] ...
```
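Printing every absolute difference makes for a lot of output to eyeball. One way to condense the comparison is to reduce each data set to a single mean absolute error. The sketch below is not the book's code: it reuses the coefficients from the test above, but `predict`, `mean_absolute_error`, `make_row`, and every row value are hypothetical, made up purely for illustration, and it uses plain dictionaries rather than the CSV files so that it runs on its own.

```python
def predict(row):
    """Apply the fitted model (coefficients copied from the test above)."""
    return (25.6266
            + 2.7083 * row['ind_var_a']
            - 1.5527 * row['ind_var_b']
            - 0.3917 * row['ind_var_c']
            - 0.2006 * row['ind_var_e']
            + 5.6450 * row['ind_var_b'] * row['ind_var_c'])


def mean_absolute_error(rows):
    """Average of |actual - predicted| over a list of rows."""
    diffs = [abs(row['dependent_var'] - predict(row)) for row in rows]
    return sum(diffs) / len(diffs)


def make_row(a, b, c, e, noise):
    """Hypothetical helper: build a row whose actual value is the
    model's prediction plus a known amount of noise."""
    row = {'ind_var_a': a, 'ind_var_b': b, 'ind_var_c': c, 'ind_var_e': e}
    row['dependent_var'] = predict(row) + noise
    return row


# Made-up stand-ins for generated_data.csv and generated_data_cv.csv.
training_rows = [
    make_row(1.0, 2.0, 0.5, 3.0, 0.1),
    make_row(0.0, 1.0, 1.0, 2.0, -0.3),
    make_row(2.0, 0.5, 0.0, 1.0, 0.2),
]
cv_rows = [
    make_row(1.5, 1.0, 2.0, 0.0, 0.4),
    make_row(0.5, 0.0, 1.0, 4.0, -0.2),
]

training_error = mean_absolute_error(training_rows)
cv_error = mean_absolute_error(cv_rows)
print('training MAE: %.4f' % training_error)
print('cross-validation MAE: %.4f' % cv_error)
```

If the cross-validation error comes out much larger than the training error, that usually suggests the model has fitted quirks of the training data rather than the underlying relationship.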
