We will now look at classifying sentiment in NLTK's movie reviews corpus. The complete Jupyter Notebook for this example is available at Chapter02/01_example.ipynb in the book's code repository.
First, we load the movie reviews by sentiment category, which is either positive or negative, using the following code:
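import random
from nltk.corpus import movie_reviews

cats = movie_reviews.categories()
reviews = []
for cat in cats:
    for fid in movie_reviews.fileids(cat):
        # Pair each review's token list with its sentiment label
        review = (list(movie_reviews.words(fid)), cat)
        reviews.append(review)
random.shuffle(reviews)  # mix positive and negative reviews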
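The loading loop above produces a list of (words, label) tuples. As a minimal sketch of that structure, the following builds the same kind of list from a tiny in-memory stand-in and counts the labels. The toy_corpus dictionary and its contents are hypothetical and are not part of NLTK; they only mimic the category-to-documents layout so that the example runs without downloading the corpus.

```python
import random
from collections import Counter

# Hypothetical stand-in for the corpus: category label -> tokenized documents.
# This is illustrative data, not the NLTK movie reviews corpus.
toy_corpus = {
    "neg": [["dull", "plot"], ["weak", "acting"]],
    "pos": [["great", "film"], ["loved", "it"]],
}

reviews = []
for cat in sorted(toy_corpus):
    for words in toy_corpus[cat]:
        # Same shape as in the loading loop: (list of tokens, label)
        reviews.append((list(words), cat))

random.seed(0)  # fixed seed so the shuffle is repeatable
random.shuffle(reviews)

# Count how many reviews carry each label; shuffling does not change this.
label_counts = Counter(label for _, label in reviews)
print(label_counts["pos"], label_counts["neg"])
```

Running the same Counter over the real reviews list is a quick way to confirm that both categories were loaded and that shuffling only changed the order.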
The categories() function returns the two category labels, pos and neg, for positive and negative sentiments, respectively. There are 1,000 reviews in each of the positive and negative ...