April 2019
Intermediate to advanced
426 pages
11h 13m
English
We are interested in using the earliest five years of pricing data for training our model, and the most recent year of 2018 for testing our predictions. Run the following codes to split our df dataset:
In [ ]: df_train = df['2017':'2013'] df_test = df['2018']
The df_train and df_test variables contain our training and testing data respectively.
An important step in data preprocessing is to normalize the dataset. This will transform input feature values to a mean of zero and a variance of one. Normalization helps to avoid biases during training due to the different scales of input features.
The MinMaxScaler function of the sklearn module helps to transform each feature into a range between -1 and ...