The gensim package for creating Word2vec embeddings

We will not implement a full working neural network to perform the word embedding procedure. Instead, we will use a Python package called gensim to do this work for us:

# import the gensim package
import gensim

gensim can take in a corpus of text, run the preceding neural network structure for us, and obtain word embeddings with only a few lines of code. To see this in action, let's import a standard corpus to get started. First, let's set a logger in our notebook so that we can see the training process in a bit more detail:

import logging
logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
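To give a sense of what this format string produces, here is a small sketch that renders a single record with it. The logger name and the message text are made up for illustration; the messages gensim actually emits during training will differ.

```python
import logging

# Same format string as in the basicConfig call above.
LOG_FORMAT = '%(asctime)s : %(levelname)s : %(message)s'

def format_example(message):
    """Render one INFO record with LOG_FORMAT without changing global logging state."""
    # 'example' is a hypothetical logger name used only for this sketch.
    record = logging.LogRecord('example', logging.INFO, '', 0, message, None, None)
    return logging.Formatter(LOG_FORMAT).format(record)

line = format_example('hypothetical training message')
print(line)  # timestamp, then " : INFO : hypothetical training message"
```

Each line therefore starts with a timestamp, followed by the severity level and the message, which makes it easier to follow how long each training step takes.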

Now, let's create our corpus:

from gensim.models ...
