Machine Learning Solutions

by Jalaj Thanaki
April 2018
Content level: Beginner to intermediate
566 pages
12h 17m
English
Packt Publishing
Building the training and testing datasets for the baseline model

In this section, we generate the training and testing datasets. We iterate over the files of our dataset and treat every file whose name starts with 12 as part of the testing dataset; all remaining files form the training dataset. As a result, roughly 90% of our dataset is used for training and roughly 10% for testing. You can refer to the code for this in the following figure:

Figure 5.6: Code snippet for building the training and testing dataset

As you can see, if a filename starts with 12, we treat the content of that file as part of the testing dataset. ...
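
Since the figure is not reproduced in this preview, the following is a minimal sketch of the prefix-based split described above, not the book's actual code. The function name `split_by_prefix`, its parameters, and the assumption that all files sit in a single flat directory are hypothetical and chosen only for illustration.

```python
import os


def split_by_prefix(data_dir, test_prefix="12"):
    """Split the files in data_dir into training and testing documents.

    Hypothetical helper: files whose names start with test_prefix go to
    the testing list, every other file goes to the training list.
    """
    train_docs, test_docs = [], []
    # Sort the names so the result does not depend on filesystem order.
    for name in sorted(os.listdir(data_dir)):
        path = os.path.join(data_dir, name)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        with open(path, encoding="utf-8") as handle:
            text = handle.read()
        if name.startswith(test_prefix):
            test_docs.append(text)
        else:
            train_docs.append(text)
    return train_docs, test_docs
```

Calling `train_docs, test_docs = split_by_prefix("path/to/reviews")` would return two lists of file contents. Note that `str.startswith` matches any name beginning with the prefix, so a file named `120_1.txt` would also land in the testing set; the actual train/test proportion depends on how the files in the dataset are named.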

Publisher Resources

ISBN: 9781788390040