Appendix
1. Introduction to Natural Language Processing
Activity 1.01: Preprocessing of Raw Text
Solution
Let's perform preprocessing on a text corpus. To complete this activity, follow these steps:
- Open a Jupyter Notebook.
- Insert a new cell and add the following code to import the necessary libraries:
from nltk import download
download('stopwords')
download('wordnet')
download('punkt')
download('averaged_perceptron_tagger')
from nltk import word_tokenize
from nltk.stem.wordnet import WordNetLemmatizer
from nltk.corpus import stopwords
from autocorrect import Speller
from nltk.wsd import lesk
from nltk.tokenize import sent_tokenize
from nltk import stem, pos_tag
import string
- Read the content of file.txt and store it in a variable named sentence ...
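As a minimal sketch of the file-reading step, the snippet below loads a text file into a single string. The helper name read_text, the UTF-8 encoding, and the assumption that file.txt sits in the notebook's working directory are illustrative choices, not details taken from the book's solution.

```python
from pathlib import Path


def read_text(path, encoding="utf-8"):
    """Return the full contents of a text file as one string.

    Hypothetical helper: the name and the UTF-8 default are assumptions.
    """
    return Path(path).read_text(encoding=encoding)


# Assumes file.txt is in the current working directory; skipped otherwise.
if Path("file.txt").exists():
    sentence = read_text("file.txt")
    print(sentence[:100])  # preview the first 100 characters
```

Reading the whole file into one string keeps the later steps simple, since functions such as sent_tokenize and word_tokenize accept a plain string.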