Appendix

1. Introduction to Natural Language Processing

Activity 1.01: Preprocessing of Raw Text

Solution

Let's perform preprocessing on a text corpus. To complete this activity, follow these steps:

  1. Open a Jupyter Notebook.
  2. Insert a new cell and add the following code to download the required NLTK resources and import the necessary libraries:

    # Download the NLTK resources used in this activity.
    from nltk import download
    download('stopwords')
    download('wordnet')
    download('punkt')
    download('averaged_perceptron_tagger')

    # Import the preprocessing tools.
    from nltk import word_tokenize
    from nltk.stem.wordnet import WordNetLemmatizer
    from nltk.corpus import stopwords
    from autocorrect import Speller
    from nltk.wsd import lesk
    from nltk.tokenize import sent_tokenize
    from nltk import stem, pos_tag
    import string

  3. Read the content of file.txt and store it in a variable named sentence ...
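As a minimal sketch of the part of this step that is stated here (reading file.txt into a variable named sentence), the snippet below uses only the Python standard library. The helper name read_corpus and the UTF-8 encoding are assumptions made for illustration; they do not come from the book's solution.

```python
from pathlib import Path


def read_corpus(path="file.txt", encoding="utf-8"):
    """Return the full contents of a text file as a single string.

    The function name and the UTF-8 default are illustrative assumptions.
    """
    return Path(path).read_text(encoding=encoding)


# Load the corpus only if the file is present in the working directory.
if Path("file.txt").exists():
    sentence = read_corpus("file.txt")
```

Wrapping the read in a small function keeps the path and encoding explicit, so a different corpus file can be swapped in without touching the rest of the notebook.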
