Feature engineering

As discussed earlier, we want to predict the closing price of the DJIA index for a particular trading day. In this section, we select features for our baseline stock price prediction model based on intuition. We have already generated the training dataset, so we will now load the saved .pkl dataset, perform feature selection, and do some minor data processing. We will also generate a sentiment score for each of the filtered NYTimes news articles and use this score to train our baseline model. We will use the following Python dependencies:

  • numpy
  • pandas
  • nltk

This section has the following steps:

  1. Loading the dataset
  2. Minor preprocessing
  3. Feature selection
  4. Sentiment analysis
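
Before going through each step in detail, the four steps above can be sketched roughly as follows. This is a minimal sketch, not the exact code used later: the column names (date, close, articles) and all helper function names are hypothetical, and the sentiment step assumes NLTK's VADER analyzer, which is one plausible choice for scoring news text with nltk.

```python
import pandas as pd


def load_dataset(path):
    """Step 1: load the pickled training DataFrame from disk."""
    return pd.read_pickle(path)


def preprocess(df, date_col="date"):
    """Step 2: minor preprocessing (assumed here: parse dates, sort by day)."""
    out = df.copy()
    out[date_col] = pd.to_datetime(out[date_col])
    return out.sort_values(date_col).reset_index(drop=True)


def select_features(df, columns):
    """Step 3: keep only the columns chosen by intuition."""
    return df[list(columns)].copy()


def make_vader_scorer():
    """Build a text -> score function using NLTK's VADER (an assumption)."""
    # Imported lazily so the other steps work without nltk installed.
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
    analyzer = SentimentIntensityAnalyzer()
    # "compound" is VADER's overall score, ranging from -1 to 1.
    return lambda text: analyzer.polarity_scores(text)["compound"]


def add_sentiment(df, scorer, text_col="articles"):
    """Step 4: add a sentiment column by scoring each row's article text."""
    out = df.copy()
    # Days without any article text are scored on an empty string.
    out["sentiment"] = out[text_col].fillna("").map(scorer)
    return out
```

A typical call sequence would be load_dataset, then preprocess, then select_features, and finally add_sentiment(df, make_vader_scorer()). Passing the scorer in as a function keeps the sentiment step easy to swap out or to test without downloading the VADER lexicon.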

So, let's begin coding! ...
