We start by importing the required libraries:
import os
import glob
import pandas as pd
We set our working folder as follows:
os.chdir("/.../Chapter 11/CS - IMDB Classification")
os.getcwd()
We set our path variable to a glob pattern that matches the .txt files in the folder, and then iterate through the matching files.
Each .txt file containing a positive review is read, and its text is appended to a list. We then use that list to create a DataFrame, df_pos.
path = "/.../Chapter 11/CS - IMDB Classification/txt_sentoken/pos/*.txt"
files = glob.glob(path)
text_pos = []
for p in files:
    file_read ...
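Because the listing above is cut off, the following is only a minimal sketch of the read-and-append pattern it describes, not the original code. The helper name load_reviews, the column name "text", and the temporary sample files are assumptions made for illustration; in practice the pattern would be the path variable that points at the pos folder.

```python
import glob
import os
import tempfile

import pandas as pd


def load_reviews(pattern):
    """Read every file matching `pattern` into a one-column DataFrame.

    Hypothetical helper: the name and the "text" column are assumptions.
    """
    texts = []
    # sorted() makes the row order reproducible; glob.glob gives no order guarantee
    for file_path in sorted(glob.glob(pattern)):
        # A context manager closes each file even if reading fails
        with open(file_path, "r", encoding="utf-8") as handle:
            texts.append(handle.read())
    return pd.DataFrame({"text": texts})


# Demonstration on throwaway files, so the sketch runs without the dataset
with tempfile.TemporaryDirectory() as tmp_dir:
    samples = [("a.txt", "a fine film"), ("b.txt", "great acting")]
    for name, body in samples:
        with open(os.path.join(tmp_dir, name), "w", encoding="utf-8") as handle:
            handle.write(body)
    df_pos = load_reviews(os.path.join(tmp_dir, "*.txt"))

print(df_pos.shape)
```

Building the list first and constructing the DataFrame once at the end avoids appending rows to a DataFrame inside the loop, which is considerably slower for a large number of files.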