Ensemble Machine Learning Cookbook

by Dipayan Sarkar, Vijayalakshmi Natarajan
January 2019
Content level: Beginner to intermediate
336 pages
7h 58m
English
Packt Publishing
Content preview from Ensemble Machine Learning Cookbook

Getting ready

We start by importing the required libraries:

import os
import glob
import pandas as pd

We set our working folder as follows:

os.chdir("/.../Chapter 11/CS - IMDB Classification")
os.getcwd()

We set our path variable and iterate through the .txt files in the folders. 

Note that we have a subfolder, /txt_sentoken/pos, which holds the TXT files for the positive reviews. Similarly, we have a subfolder, /txt_sentoken/neg, which holds the TXT files for the negative reviews.

The TXT files for the positive reviews are read, and each review is appended to a list. We then use that list to create a DataFrame, df_pos.

path = "/.../Chapter 11/CS - IMDB Classification/txt_sentoken/pos/*.txt"
files = glob.glob(path)
text_pos = []
for p in files:
    file_read ...
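The listing above is cut off in this preview, so the rest of the loop is not shown. As a rough sketch of the same idea (read each matching file, collect the text in a list, build a DataFrame), something like the following could work. The helper name read_reviews, the column names text and label, and the UTF-8 encoding are assumptions made for illustration; they are not taken from the book.

```python
import glob

import pandas as pd


def read_reviews(pattern, label):
    """Read every file matching `pattern` into a DataFrame, one row per review.

    Hypothetical helper: the name, the column names, and the `label`
    argument are illustrative assumptions, not the book's code.
    """
    texts = []
    # glob gives no ordering guarantee, so sort for a reproducible row order.
    for file_path in sorted(glob.glob(pattern)):
        # Assumes the review files are UTF-8 (or plain ASCII) text.
        with open(file_path, "r", encoding="utf-8") as fh:
            texts.append(fh.read())
    # Repeat the label once per review so both columns have equal length.
    return pd.DataFrame({"text": texts, "label": [label] * len(texts)})
```

With the working folder set as described earlier, a call such as df_pos = read_reviews("txt_sentoken/pos/*.txt", 1) would build the positive-review DataFrame, and the same call with the neg subfolder and a different label would build its negative counterpart. Using a with block closes each file as soon as it has been read.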

Publisher Resources

ISBN: 9781789136609