December 2018
Beginner to intermediate
684 pages
21h 9m
English
We use 250 folds, each predicting about 2 days of forward returns that follow a historical training period, which gradually increases in length from one fold to the next. Each iteration obtains the appropriate training and test dates from our custom cross-validation function, selects the corresponding features and targets, and then trains and predicts accordingly. We capture the root mean squared error (RMSE) as well as the Spearman rank correlation between actual and predicted values:
nfolds = 250
lr = LinearRegression()
test_results, result_idx, preds = [], [], pd.DataFrame()
for train_dates, test_dates in time_series_split(dates, nfolds=nfolds):
    X_train = model_data.loc[idx[train_dates], features]
    y_train = model_data.loc[idx[train_dates] ...
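Because the listing above is cut off, the following self-contained sketch may help illustrate the general pattern it describes: an expanding training window, a short test block, and per-fold RMSE and Spearman rank correlation. It is not the book's implementation. The helper `expanding_window_split`, the `spearman` function, the synthetic data, and the fold sizes are all assumptions made for illustration, and plain NumPy least squares stands in for scikit-learn's `LinearRegression`.

```python
import numpy as np


def expanding_window_split(n_obs, n_folds, test_size):
    """Yield (train_idx, test_idx) pairs, oldest fold first.

    Hypothetical helper (not the book's time_series_split): each test
    block holds `test_size` consecutive observations, and the training
    set is every observation that precedes it.
    """
    first_test_start = n_obs - n_folds * test_size
    if first_test_start <= 0:
        raise ValueError("not enough observations for the requested folds")
    for fold in range(n_folds):
        start = first_test_start + fold * test_size
        yield np.arange(start), np.arange(start, start + test_size)


def spearman(a, b):
    """Spearman correlation as the Pearson correlation of ranks.

    Assumes no ties, which holds for the continuous synthetic data below.
    """
    rank_a = np.argsort(np.argsort(a))
    rank_b = np.argsort(np.argsort(b))
    return np.corrcoef(rank_a, rank_b)[0, 1]


# Synthetic features and targets (assumed purely for illustration).
rng = np.random.default_rng(42)
n_obs, n_features = 600, 3
X = rng.normal(size=(n_obs, n_features))
y = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(scale=0.5, size=n_obs)

fold_results = []
for train_idx, test_idx in expanding_window_split(n_obs, n_folds=50, test_size=10):
    # Prepend a column of ones so the fit includes an intercept.
    A_train = np.column_stack([np.ones(len(train_idx)), X[train_idx]])
    A_test = np.column_stack([np.ones(len(test_idx)), X[test_idx]])

    # Ordinary least squares fit on the training window only.
    coef, *_ = np.linalg.lstsq(A_train, y[train_idx], rcond=None)
    y_pred = A_test @ coef

    rmse = np.sqrt(np.mean((y[test_idx] - y_pred) ** 2))
    ic = spearman(y[test_idx], y_pred)
    fold_results.append((len(train_idx), rmse, ic))

train_sizes, rmses, ics = map(np.array, zip(*fold_results))
print(f"folds: {len(fold_results)}")
print(f"training size grows from {train_sizes[0]} to {train_sizes[-1]}")
print(f"mean RMSE: {rmses.mean():.3f}, mean Spearman: {ics.mean():.3f}")
```

Each test block starts right after the data used for training, so no future observation leaks into the fit. The sketch uses 10 observations per test block because a rank correlation computed on only a couple of points is not informative.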