A lot of research has gone into creating better word embedding models, in particular by skipping the step of learning a probability function over sequences of words. One of the most popular ways to do this is via word2vec (http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf and https://arxiv.org/pdf/1301.3781.pdf). To create embedding vectors with a word2vec model, we'll need a simple neural network that has the following:

- An input layer with as many neurons as there are words in the vocabulary, which receives a word as a one-hot encoded vector.
- A hidden layer whose size equals the desired embedding dimension; its weight matrix holds the embedding vectors we're after.
- An output layer, again with as many neurons as there are words in the vocabulary, followed by a softmax that predicts the surrounding context words.
We'll train the network with gradient ...
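To make this concrete, here is a minimal sketch of that idea: a skip-gram-style network trained with plain gradient descent in NumPy. The toy corpus, embedding size, window, learning rate, and epoch count are all made-up values for illustration, and the full-softmax update shown here is the simplest variant rather than a production implementation.

```python
import numpy as np

# Toy corpus; in practice this would be a large tokenized text (assumption for illustration).
corpus = "the quick brown fox jumps over the lazy dog".split()

vocab = sorted(set(corpus))
word_to_idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)   # vocabulary size
D = 10           # embedding dimension (small, hypothetical value)
window = 2       # context window size
lr = 0.05        # learning rate
epochs = 500

rng = np.random.default_rng(0)
W_in = rng.normal(scale=0.1, size=(V, D))   # input -> hidden weights: the word embeddings
W_out = rng.normal(scale=0.1, size=(D, V))  # hidden -> output weights

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Build (center word, context word) training pairs from the corpus.
pairs = []
for i in range(len(corpus)):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if i != j:
            pairs.append((word_to_idx[corpus[i]], word_to_idx[corpus[j]]))

for epoch in range(epochs):
    loss = 0.0
    for center, context in pairs:
        h = W_in[center]                # hidden layer: a lookup of the one-hot input
        probs = softmax(h @ W_out)      # softmax output over the whole vocabulary
        loss -= np.log(probs[context])

        # Gradients of the cross-entropy loss (full softmax, no negative sampling).
        grad_out = probs.copy()
        grad_out[context] -= 1.0        # dL/d(output scores)
        grad_in = W_out @ grad_out      # dL/d(hidden activation)

        # Plain stochastic gradient descent update.
        W_out -= lr * np.outer(h, grad_out)
        W_in[center] -= lr * grad_in

    if epoch % 100 == 0:
        print(f"epoch {epoch}: loss {loss / len(pairs):.4f}")

# After training, each row of W_in is the embedding vector for the corresponding word.
print("embedding for 'fox':", W_in[word_to_idx["fox"]])
```

Real word2vec implementations avoid computing the full softmax over a large vocabulary by using tricks such as negative sampling or the hierarchical softmax, but the structure above captures the essence: the embeddings are simply the trained input-to-hidden weights.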