Extensions to the word embeddings algorithms

The original paper by Mikolov and others, published in 2013, discusses several extensions that can further improve the performance of the word embedding learning algorithms. Although these extensions were initially introduced for skip-gram, they are applicable to CBOW as well. Also, since we already saw that CBOW outperformed skip-gram in our example, we will use CBOW to understand all of the extensions.

Using the unigram distribution for negative sampling

It has been found that negative sampling performs better when the negative samples are drawn from certain distributions rather than from the uniform distribution. One such distribution is the unigram distribution. The unigram distribution assigns each word a probability proportional to its frequency in the corpus, so frequent words are picked as negative samples more often than rare ones. In practice, Mikolov and others found that distorting the unigram distribution by raising it to the 3/4th power and renormalizing works even better.
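
To make this concrete, here is a minimal NumPy sketch of drawing negative samples from the 3/4-power-smoothed unigram distribution. The five-word vocabulary and its counts are hypothetical, purely for illustration:

```python
import numpy as np

# Hypothetical word counts for a toy five-word vocabulary (word ID -> count);
# in practice these would be computed over the whole training corpus.
word_counts = np.array([50, 30, 12, 5, 3], dtype=np.float64)

# Plain unigram distribution: each word's probability is proportional
# to how often it appears in the corpus.
unigram_probs = word_counts / word_counts.sum()

# Mikolov and others found that raising the unigram distribution to the
# 3/4th power and renormalizing works even better in practice.
smoothed_probs = unigram_probs ** 0.75
smoothed_probs /= smoothed_probs.sum()

def sample_negatives(num_samples, rng=None):
    """Draw negative-sample word IDs from the smoothed unigram distribution."""
    rng = rng or np.random.default_rng()
    return rng.choice(len(smoothed_probs), size=num_samples, p=smoothed_probs)

# For each positive (target, context) pair, we would draw a handful of
# negative word IDs, for example:
print(sample_negatives(5))
```

Note how the 3/4th power pulls the distribution toward uniform: very frequent words are sampled somewhat less often, and rare words somewhat more often, than their raw frequencies would dictate.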
