This section helps learners perform the activities in the book. It gives the detailed steps needed to complete each activity and meet the book's objectives.
Chapter 1: Introduction to Natural Language Processing
Activity 1: Generating word embeddings from a corpus using Word2Vec.
- Upload the text corpus from the link mentioned earlier.
- Import word2vec from gensim.models.
from gensim.models import word2vec
- Store the corpus in a variable.
sentences = word2vec.Text8Corpus('text8')
- Fit the word2vec model on the corpus.
model = word2vec.Word2Vec(sentences, size=200)  # in gensim 4.0 and later, this parameter is named vector_size
- Find the most similar word to 'man'.
The output is as follows: ...