Test driving our model
To begin, we need a framework for scoring our model in a test. It will look like the following:
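import pandas
import sklearn.metrics
import statsmodels.formula.api as smf
import numpy as np

def logistic_regression_test():
    # DataFrame.from_csv has been removed from pandas; read_csv with
    # index_col=0 mirrors its old default of using the first column as the index.
    df = pandas.read_csv('./generated_logistic_data.csv', index_col=0)

    # Fit a logistic regression of y on a single predictor.
    generated_model = smf.logit('y ~ variable_d', df)
    generated_fit = generated_model.fit()

    # roc_curve returns (false positive rates, true positive rates, thresholds).
    roc_data = sklearn.metrics.roc_curve(df['y'], generated_fit.predict(df))
    auc = sklearn.metrics.auc(roc_data[0], roc_data[1])

    print(generated_fit.summary())
    print("AUC score: {0}".format(auc))
    assert auc > .6, 'AUC should be significantly above random'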
The previous code also includes a first stab at a model. Because we generated the data, we ...
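To make the `auc > .6` threshold in the test more concrete, here is a minimal sketch of what an AUC score measures: the probability that a randomly chosen positive example receives a higher predicted score than a randomly chosen negative one, with ties counted as half. This is not how `sklearn.metrics.auc` is implemented (it integrates the ROC curve with the trapezoidal rule), but the two approaches should agree on the result. The function name `auc_from_scores` and the toy data are hypothetical and are not taken from the generated CSV.

```python
def auc_from_scores(labels, scores):
    """Chance that a random positive outranks a random negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auc_from_scores(labels, scores)
print("AUC score: {0}".format(toy_auc))
```

A model that scores at random lands near 0.5 on this measure, which is why the test asserts a value comfortably above that baseline.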