Training and testing a model

Let's take the data and divide it into training and test sets:

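>>> # cross_validation was removed in scikit-learn 0.20; newer releases use model_selection
>>> from sklearn import (linear_model, cross_validation,
...                      feature_selection, preprocessing)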
>>> import statsmodels.formula.api as sm
>>> from statsmodels.tools.eval_measures import mse
>>> from statsmodels.tools.tools import add_constant
>>> from sklearn.metrics import mean_squared_error

>>> X = b_data.values.copy()
>>> X_train, X_valid, y_train, y_valid = cross_validation.train_test_split(
...     X[:, :-1], X[:, -1], train_size=0.80)

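We first convert the data frame into a NumPy array by calling values.copy() on b_data. We then use the train_test_split function from scikit-learn's cross_validation module to split the data: every column except the last is treated as a feature, the last column is the target, and 80% of the rows go to the training set while the remaining 20% form the test set. ...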
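
The split described above can be sketched on its own as follows. This is a minimal illustration, not the book's code: it assumes a recent scikit-learn release, where train_test_split is imported from sklearn.model_selection, and it uses a randomly generated array as a hypothetical stand-in for b_data.values. The random_state argument is added here only to make the result repeatable.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for b_data.values: 100 rows,
# 3 feature columns followed by 1 target column.
rng = np.random.RandomState(0)
X = rng.rand(100, 4)

# Features are every column except the last; the target is the last column.
# train_size=0.80 sends 80% of the rows to the training set.
X_train, X_valid, y_train, y_valid = train_test_split(
    X[:, :-1], X[:, -1], train_size=0.80, random_state=0)

print(X_train.shape, X_valid.shape)  # (80, 3) (20, 3)
```

Because the rows are shuffled before splitting, omitting random_state gives a different partition on each run, although the 80/20 proportions stay the same.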
