Machine Learning Solutions

by Jalaj Thanaki
April 2018
Content level: Beginner to intermediate
566 pages
12h 17m
English
Packt Publishing
Content preview from Machine Learning Solutions

Training the baseline model

As you know, we have selected the RandomForestRegressor algorithm as our baseline and will train it with the scikit-learn library. These are the steps we need to follow:

  1. Splitting the dataset into training and testing sets
  2. Separating the prediction labels for the training and testing sets
  3. Converting the sentiment scores into NumPy arrays
  4. Training the ML model
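
The four steps above can be sketched end to end as follows. This is a minimal illustration, not the book's code: the data is synthetic, and the column names (`sentiment_pos`, `sentiment_neg`, `price`) and the hyperparameters are assumptions chosen only so the example runs on its own.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in data: one row per calendar day, 2007-2016.
# Column names are hypothetical, not taken from the book's dataset.
rng = np.random.default_rng(0)
dates = pd.date_range("2007-01-01", "2016-12-31", freq="D")
df = pd.DataFrame(
    {
        "sentiment_pos": rng.random(len(dates)),
        "sentiment_neg": rng.random(len(dates)),
        "price": rng.normal(100.0, 10.0, len(dates)),
    },
    index=dates,
)

# Step 1: split the rows into training and testing sets by date.
train = df.loc["2007-01-01":"2014-12-31"]
test = df.loc["2015-01-01":"2016-12-31"]

# Step 2: separate the prediction labels (the value to be predicted).
y_train = train["price"]
y_test = test["price"]

# Step 3: convert the sentiment scores into NumPy arrays.
feature_cols = ["sentiment_pos", "sentiment_neg"]
X_train = train[feature_cols].to_numpy()
X_test = test[feature_cols].to_numpy()

# Step 4: train the model and predict on the held-out period.
# n_estimators and random_state are illustrative choices.
model = RandomForestRegressor(n_estimators=20, random_state=0)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```

Because the features here are random noise, the predictions carry no signal; the sketch only shows the shape of the workflow.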

So, let's implement each of these steps one by one.

Splitting the training and testing dataset

We have 10 years of data. For training, we will use the first 8 years, that is, the data from 2007 to 2014. For testing, we will use the remaining 2 years, that is, the data from 2015 and 2016. You can refer to the code snippet in the following screenshot ...
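
A chronological split like the one described here might look like the sketch below. It assumes the data sits in a pandas DataFrame indexed by date; the helper `split_by_year` and the `price` column are hypothetical names introduced for illustration, and the toy frame stands in for the real dataset.

```python
import pandas as pd


def split_by_year(df, last_train_year):
    """Split a date-indexed frame chronologically.

    Rows dated up to and including `last_train_year` go to the training
    set; all later rows go to the testing set.
    """
    years = df.index.year
    train = df[years <= last_train_year]
    test = df[years > last_train_year]
    return train, test


# Toy frame with one row per business day from 2007 through 2016.
dates = pd.date_range("2007-01-01", "2016-12-31", freq="B")
df = pd.DataFrame({"price": range(len(dates))}, index=dates)

# 2007-2014 for training, 2015-2016 for testing.
train, test = split_by_year(df, last_train_year=2014)
```

Splitting by date rather than shuffling keeps every test row later than every training row, so the model is evaluated only on data from after the period it was trained on.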


Publisher Resources

ISBN: 9781788390040