A Word2vec model is a two-layer neural network that takes a text corpus as input and outputs a set of embedding vectors, one per word in the corpus's vocabulary. Two architectures learn these word vectors efficiently with a shallow neural network, as depicted in the following figure:
- The Continuous-Bag-Of-Words (CBOW) model predicts the target word from the average of the context word vectors, so the order of the context words does not matter. CBOW trains faster and tends to be slightly more accurate for frequent terms, but it pays less attention to infrequent words.
- The Skip-Gram (SG) model, by contrast, uses the target word to predict words sampled from its context. It works well with small datasets and represents even rare words or phrases well, though it trains more slowly than CBOW (a minimal training sketch follows this list).
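To make the distinction concrete, here is a minimal sketch using the gensim library (assuming gensim 4.x, where the `sg` parameter selects the architecture: `sg=0` for CBOW, `sg=1` for Skip-Gram). The toy corpus, vector size, and window width are purely illustrative:

```python
from gensim.models import Word2Vec

# Toy corpus: each sentence is a pre-tokenized list of words.
sentences = [
    ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"],
    ["the", "dog", "barks", "at", "the", "fox"],
]

# CBOW (sg=0): the averaged context word vectors predict the target word.
cbow = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=0)

# Skip-Gram (sg=1): the target word predicts each sampled context word.
skipgram = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)

# Both models expose one embedding vector per vocabulary word.
print(cbow.wv["fox"].shape)                      # (50,)
print(skipgram.wv.most_similar("fox", topn=3))   # nearest neighbors by cosine similarity
```

Here `window=2` means two words on each side of the target are treated as context, and `min_count=1` keeps every word despite the tiny corpus; on a real corpus you would raise `min_count` to drop very rare words.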