Building a Word2Vec model
In this section, we will go into more detail on how to build a Word2Vec model. As mentioned previously, our final goal is a trained model that can generate real-valued vector representations, also called word embeddings, for the input text.
During training, we will use the maximum likelihood method (https://en.wikipedia.org/wiki/Maximum_likelihood) to maximize the probability of the next word w_t in the input sentence given the previous words that the model has seen, which we call h.
This maximum likelihood method will be expressed in terms of the softmax function:
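The equation itself is not reproduced in this excerpt. In the standard formulation, and assuming a score function written score(w_t, h) and a vocabulary V (both names are notational assumptions here), the softmax is usually written along these lines:

```latex
P(w_t \mid h) = \operatorname{softmax}\bigl(\operatorname{score}(w_t, h)\bigr)
             = \frac{\exp\bigl(\operatorname{score}(w_t, h)\bigr)}
                    {\sum_{w' \in V} \exp\bigl(\operatorname{score}(w', h)\bigr)}
```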
Here, the score function computes a value to represent the compatibility ...
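As a rough illustration of the softmax over scores described above, the following minimal sketch turns compatibility scores into a probability distribution over a toy vocabulary. The dot-product `score` function, the three-word vocabulary, the embedding values, and the context vector `h` are all made up for illustration; they are assumptions, not the model built in this section.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def score(word_vec, context_vec):
    """Hypothetical score function: dot product of word and context vectors."""
    return sum(w * c for w, c in zip(word_vec, context_vec))

# Toy vocabulary with made-up 2-dimensional embeddings (illustrative only).
embeddings = {
    "cat": [1.0, 0.0],
    "dog": [0.5, 0.5],
    "car": [-1.0, 0.0],
}
h = [1.0, 0.0]  # made-up vector standing in for the previous words

vocab = list(embeddings)
raw_scores = [score(embeddings[w], h) for w in vocab]
probs = dict(zip(vocab, softmax(raw_scores)))

for word, p in probs.items():
    print(f"P({word} | h) = {p:.3f}")
```

Words whose vectors align more closely with h receive a higher score and therefore a higher probability. Note that the normalizing sum runs over the whole vocabulary, which becomes expensive when the vocabulary is large.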