December 2018
Beginner to intermediate
684 pages
21h 9m
English
We also construct a custom cross-validation class tailored to the format of the data just created, which has a pandas MultiIndex with two levels, one for the ticker and one for the date:
class OneStepTimeSeriesSplit:
    """Generates tuples of train_idx, test_idx pairs
    Assumes the index contains a level labeled 'date'"""

    def __init__(self, n_splits=3, test_period_length=1, shuffle=False):
        self.n_splits = n_splits
        self.test_period_length = test_period_length
        self.shuffle = shuffle
        self.test_end = n_splits * test_period_length

    @staticmethod
    def chunks(l, chunk_size):
        for i in range(0, len(l), chunk_size):
            yield l[i:i + chunk_size]

    def split(self, X, y=None, groups=None):
        unique_dates = (X.index
                        .get_level_values( ...
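To illustrate the walk-forward idea behind a splitter of this kind, here is a minimal standalone sketch. It is not the body of the class's `split` method. The function name `walk_forward_splits`, the toy tickers `AAA` and `BBB`, and the sample dates are hypothetical. The sketch assumes only what the docstring states: the index contains a level labeled `date`. Each fold tests on the most recent unused block of dates and trains on all strictly earlier dates.

```python
import numpy as np
import pandas as pd


def walk_forward_splits(X, n_splits=3, test_period_length=1):
    """Hypothetical helper: yield (train_idx, test_idx) as positional arrays.

    Assumes X.index is a MultiIndex with a level named 'date'.
    """
    dates = X.index.get_level_values('date')
    # Sort the distinct dates from newest to oldest and keep only
    # as many as the requested test folds need.
    unique_dates = dates.unique().sort_values(ascending=False)
    test_dates_all = unique_dates[:n_splits * test_period_length]

    for start in range(0, len(test_dates_all), test_period_length):
        test_dates = test_dates_all[start:start + test_period_length]
        test_mask = dates.isin(test_dates)
        # Train only on observations strictly before the test window
        train_mask = dates < test_dates.min()
        yield np.flatnonzero(train_mask), np.flatnonzero(test_mask)


# Toy data (made up): 2 tickers x 5 consecutive dates
idx = pd.MultiIndex.from_product(
    [['AAA', 'BBB'], pd.date_range('2018-01-01', periods=5)],
    names=['ticker', 'date'])
X = pd.DataFrame({'feature': np.arange(10.0)}, index=idx)

splits = list(walk_forward_splits(X, n_splits=3, test_period_length=1))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Because the folds run from the newest date backwards, each successive training set is shorter than the previous one. Yielding positional indices makes the pairs usable with `X.iloc[...]`, which is also the form scikit-learn expects from a cross-validation splitter.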