This section explains the transformations that must be applied to the data before it is used in the model.
- One of the first steps in building a model is splitting the data into training and test datasets for model evaluation. Our goal is to use all of the stock quotes from 2000 through 2016 to predict stock trends in 2017-2018. We know from previous sections that we have a total of 4,610 days of stock quotes, but we don't know exactly how many fall in each year. We can use the dataframe's groupBy() function to count the stock quotes per year, as can be seen in the following screenshot:
- 2016 and 2017's combined ...
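
The per-year count and the year-based split described above can be sketched in plain Python. This is only an illustration under stated assumptions: the dates below are made-up sample values, not the real 4,610 quotes, and the names `quote_dates`, `train_dates`, and `test_dates` are hypothetical. On a Spark dataframe that already has a `year` column (an assumed column name), a count of this kind would typically come from `df.groupBy("year").count()`.

```python
from collections import Counter
from datetime import date

# Hypothetical sample of trading dates; the real dataset is far larger.
quote_dates = [
    date(2015, 12, 30), date(2015, 12, 31),
    date(2016, 1, 4), date(2016, 1, 5), date(2016, 1, 6),
    date(2017, 1, 3), date(2017, 1, 4),
    date(2018, 1, 2),
]

# Count quotes per year: a plain-Python analogue of a group-by-year count.
quotes_per_year = Counter(d.year for d in quote_dates)

# Split by year: everything through 2016 trains the model,
# and 2017 onward is held out for testing.
train_dates = [d for d in quote_dates if d.year <= 2016]
test_dates = [d for d in quote_dates if d.year >= 2017]

for yr in sorted(quotes_per_year):
    print(yr, quotes_per_year[yr])
print("train rows:", len(train_dates), "test rows:", len(test_dates))
```

Splitting on the year rather than sampling rows at random keeps every test quote later in time than every training quote, which matches the goal of predicting 2017-2018 from earlier data.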