April 2017
Intermediate to advanced
318 pages
7h 40m
English
Let us now look at the CBOW word2vec model. Recall that the CBOW model predicts the center word given the context words. Thus, in the first tuple of the following example, the CBOW model must predict the output word love given the context words I and green:
([I, green], love)
([love, eggs], green)
([green, and], eggs)
...
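As a rough illustration, the tuples above can be generated with a sliding window over the tokens. This is only a sketch: the helper name `cbow_pairs`, the window size of one word on each side, and the five-token list (taken from the words that appear in the tuples) are assumptions made for the example, not code from the text.

```python
def cbow_pairs(tokens, window=1):
    """Return (context_words, center_word) tuples for a list of tokens.

    Hypothetical helper for illustration; `window` is the number of
    context words taken from each side of the center word.
    """
    pairs = []
    # Only positions with a full window on both sides are used as centers.
    for i in range(window, len(tokens) - window):
        left = tokens[i - window:i]
        right = tokens[i + 1:i + 1 + window]
        pairs.append((left + right, tokens[i]))
    return pairs


# The five tokens that appear in the example tuples.
tokens = ["I", "love", "green", "eggs", "and"]
pairs = cbow_pairs(tokens)

for context, center in pairs:
    print(context, "->", center)
```

Each pair holds the context words first and the center word second, so the first element is what the classifier receives and the second is the label it should predict.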
Like the skip-gram model, the CBOW model is a classifier: it takes the context words as input and predicts the target word. Its architecture is somewhat more straightforward than that of the skip-gram model. The input to the model is the set of word IDs for the context words. These word IDs are fed into a common embedding layer that is initialized with small random weights. Each word ID is transformed ...
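The input side described here, context word IDs looked up in one shared embedding table, might be sketched in plain Python as follows. The vocabulary, the embedding size of 4, and the weight range of plus or minus 0.05 are all assumptions chosen for illustration. The sketch stops at the lookup step and does not attempt the rest of the model.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Assumed toy vocabulary: the words from the example tuples.
vocab = ["I", "love", "green", "eggs", "and"]
word_to_id = {word: idx for idx, word in enumerate(vocab)}

EMBED_SIZE = 4  # hypothetical embedding size, for illustration only

# One shared embedding table: a row of small random weights per word ID.
# The range [-0.05, 0.05] is an assumption, not a value from the text.
embedding = [
    [random.uniform(-0.05, 0.05) for _ in range(EMBED_SIZE)]
    for _ in vocab
]


def embed_context(context_words):
    """Map each context word to its ID, then to its row in the shared table."""
    return [embedding[word_to_id[word]] for word in context_words]


# Context words for the first tuple, ([I, green], love).
context_vectors = embed_context(["I", "green"])
print(len(context_vectors), "vectors of size", len(context_vectors[0]))
```

Because every context position reads from the same table, a word receives the same vector whether it appears to the left or to the right of the center word.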