Training and testing a model

Let's take the data and divide it into training and test sets:

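>>> # sklearn.cross_validation exists only in scikit-learn < 0.20;
>>> # newer releases provide train_test_split in sklearn.model_selection.
>>> from sklearn import (linear_model, cross_validation,
...                      feature_selection, preprocessing)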
>>> import statsmodels.formula.api as sm
>>> from statsmodels.tools.eval_measures import mse
>>> from statsmodels.tools.tools import add_constant
>>> from sklearn.metrics import mean_squared_error

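>>> # Copy the data frame's values into a NumPy array.
>>> X = b_data.values.copy()
>>> # Features are every column but the last; the target is the last column.
>>> X_train, X_valid, y_train, y_valid = cross_validation.train_test_split(
...     X[:, :-1], X[:, -1], train_size=0.80)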

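We first convert the b_data data frame into an array using values.copy(). We then use the train_test_split function from scikit-learn's cross_validation module to split the data, treating the last column as the target and the remaining columns as features. Setting train_size=0.80 keeps 80% of the rows for training and leaves the remaining 20% for testing. ...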
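To make the split concrete, here is a minimal sketch of the idea behind an 80/20 split, written with the standard library only. It is not scikit-learn's implementation: the helper name simple_train_test_split, the seed parameter, and the sample rows are all made up for illustration.

```python
import random


def simple_train_test_split(rows, train_size=0.80, seed=0):
    """Hypothetical helper: shuffle the rows, then cut at train_size.

    Each row holds the features followed by the target in the last
    position, mirroring the X[:, :-1] and X[:, -1] slices.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)    # randomize the row order
    n_train = int(len(rows) * train_size)   # rows kept for training
    train_idx, valid_idx = indices[:n_train], indices[n_train:]

    X_train = [list(rows[i][:-1]) for i in train_idx]
    y_train = [rows[i][-1] for i in train_idx]
    X_valid = [list(rows[i][:-1]) for i in valid_idx]
    y_valid = [rows[i][-1] for i in valid_idx]
    return X_train, X_valid, y_train, y_valid


# Made-up data: 10 rows with two features and a target in the last column.
rows = [[i, 2 * i, 3 * i] for i in range(10)]
X_train, X_valid, y_train, y_valid = simple_train_test_split(rows)
print(len(X_train), len(X_valid))  # prints: 8 2
```

Shuffling before cutting matters because rows in a data set are often ordered, and an unshuffled cut could place all rows of one kind in the test set. The library function offers more options than this sketch covers.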
