January 2018
Beginner to intermediate
284 pages
8h 35m
English
We can't feed a word to a neural network directly as a text string. Instead, we need a numerical representation. Suppose we have a vocabulary of 10,000 unique words. Using one-hot encoding, we can represent each word as a vector of length 10,000, with a one in the position corresponding to that word and zeros in all other positions.
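As a minimal sketch of one-hot encoding, the snippet below uses a toy eight-word vocabulary in place of the 10,000-word one. The helper names `build_vocab` and `one_hot` and the sample sentence are illustrative assumptions, not taken from the book.

```python
def build_vocab(words):
    """Map each unique word to an integer index, in order of first appearance."""
    vocab = {}
    for word in words:
        if word not in vocab:
            vocab[word] = len(vocab)
    return vocab


def one_hot(word, vocab):
    """Return a list of length len(vocab) with a 1 at the word's index, 0 elsewhere."""
    vector = [0] * len(vocab)
    vector[vocab[word]] = 1
    return vector


# Hypothetical toy corpus; "the" appears twice but gets a single index.
corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = build_vocab(corpus)

fox_vector = one_hot("fox", vocab)
print(len(vocab))    # vocabulary size, which is also the vector length
print(fox_vector)    # → [0, 0, 0, 1, 0, 0, 0, 0]
```

With a real 10,000-word vocabulary the same function would return a vector of length 10,000 with a single non-zero entry, which is why practical implementations usually store only the index.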
The input to the Skip-Gram model is a single word, one-hot encoded as a vector whose length equals the vocabulary size, V. The target output for each input is determined by the generated pairs.
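One way to picture how generated pairs drive the model's input and output is sketched below. It assumes pairs are formed from each word and its neighbours within a fixed window, which is the usual Skip-Gram setup. The function names, the window size of 1, and the sample sentence are hypothetical choices for illustration.

```python
def skipgram_pairs(tokens, window=1):
    """Return (target, context) pairs for every word and its neighbours in the window."""
    pairs = []
    for i, target in enumerate(tokens):
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is not its own context
                pairs.append((target, tokens[j]))
    return pairs


def encode(pair, vocab):
    """Turn a (target, context) pair into (one-hot input of length V, context index)."""
    target, context = pair
    x = [0] * len(vocab)
    x[vocab[target]] = 1
    return x, vocab[context]


tokens = ["i", "like", "deep", "learning"]
# Assumed indexing scheme: alphabetical order over the unique tokens.
vocab = {word: idx for idx, word in enumerate(sorted(set(tokens)))}

pairs = skipgram_pairs(tokens, window=1)
print(pairs[0])                  # → ('i', 'like')
print(encode(pairs[0], vocab))   # → ([0, 1, 0, 0], 3)
```

Here V is 4, so each input vector has four entries, and the context index marks which of the V output positions the model should predict for that input.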