Building Word2Vec model

In this section, we take a closer look at how to build a Word2Vec model. As mentioned previously, our final goal is a trained model that can generate real-valued vector representations, also called word embeddings, for the input textual data.

During training, we will use the maximum likelihood method (https://en.wikipedia.org/wiki/Maximum_likelihood) to maximize the probability of the next word w_t in the input sentence given the previous words that the model has seen, which we call h.

This maximum likelihood method will be expressed in terms of the softmax function:
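
For reference, the usual way to write this is sketched below. It is a reconstruction under assumptions rather than a formula quoted from the text: V is assumed to denote the vocabulary, w' ranges over every word in V, and score is the function referred to in the next paragraph.

```latex
P(w_t \mid h)
  = \mathrm{softmax}\bigl(\mathrm{score}(w_t, h)\bigr)
  = \frac{\exp\{\mathrm{score}(w_t, h)\}}
         {\sum_{w' \in V} \exp\{\mathrm{score}(w', h)\}}
```

Under the same assumptions, training maximizes the log of this quantity, \log P(w_t \mid h) = \mathrm{score}(w_t, h) - \log \sum_{w' \in V} \exp\{\mathrm{score}(w', h)\}, over the training data.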

Here, the score function computes a value to represent the compatibility ...
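
To make the softmax formulation concrete, here is a minimal sketch in plain Python. It assumes, purely for illustration, that the score is a dot product between a word vector and a context vector. The function names (score, softmax_probabilities, log_likelihood) and the toy vectors are hypothetical and are not taken from the book's code.

```python
import math


def score(word_vec, context_vec):
    """Hypothetical score: dot product of a word vector and a context vector."""
    return sum(w * h for w, h in zip(word_vec, context_vec))


def softmax_probabilities(context_vec, vocab_vectors):
    """Return P(w | h) for every word w in the vocabulary."""
    scores = {w: score(v, context_vec) for w, v in vocab_vectors.items()}
    # Subtracting the maximum score leaves the result unchanged
    # but avoids overflow in exp().
    max_score = max(scores.values())
    exps = {w: math.exp(s - max_score) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


def log_likelihood(target_word, context_vec, vocab_vectors):
    """Log-probability of the target word given the context h."""
    return math.log(softmax_probabilities(context_vec, vocab_vectors)[target_word])


# Toy vocabulary with made-up 2-dimensional vectors (assumed values).
vocab_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.8, 0.2],
    "car": [-1.0, 0.5],
}
h = [1.0, 0.0]  # made-up context representation

probs = softmax_probabilities(h, vocab_vectors)
for word, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{word}: {p:.3f}")
```

Note that the normalizing sum runs over the whole vocabulary, so the cost of one probability grows linearly with the vocabulary size. For a real vocabulary this is the expensive part of the computation.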
