January 2018
Beginner to intermediate
284 pages
8h 35m
English
The hidden layer, in this case, has no activation function. The connection between the input layer and the hidden layer can be thought of as a weight matrix, $W_{V \times N}$, where V is the vocabulary size and N is the number of neurons in the hidden layer. $W_{V \times N}$ has V rows, one for every word in the vocabulary, and N columns, one for every hidden neuron. N is therefore the length of the embedding vector. A second, auxiliary matrix, $W'_{N \times V}$, connects the hidden layer to the output layer, and the similarity of the word, w, with the hallucinated context word (an out-of-window word) is minimized.
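To make the matrix shapes concrete, here is a minimal NumPy sketch of the forward pass only. The toy sizes (V = 6, N = 3), the random initialization, and the names `W_prime` and `forward` are assumptions made for illustration and are not taken from the text. The dot product is used as the similarity score, which is also an assumption.

```python
import numpy as np

V, N = 6, 3  # assumed toy sizes: vocabulary size and embedding length
rng = np.random.default_rng(0)

# Input-to-hidden weights: V rows (one per word), N columns (one per hidden neuron).
W = rng.normal(scale=0.1, size=(V, N))
# Auxiliary hidden-to-output weights: N rows, V columns (one per vocabulary word).
W_prime = rng.normal(scale=0.1, size=(N, V))


def forward(word_index):
    """Return the hidden vector and the output scores for one input word."""
    one_hot = np.zeros(V)
    one_hot[word_index] = 1.0
    # No activation function: the product simply selects row `word_index` of W,
    # which is that word's embedding.
    hidden = one_hot @ W
    # One dot-product score per vocabulary word.
    scores = hidden @ W_prime
    return hidden, scores


hidden, scores = forward(2)
```

Because the hidden layer is linear, `hidden` is exactly row 2 of `W`, so after training the rows of `W` can be read off directly as word embeddings. The training step that lowers the scores of out-of-window words is left out of this sketch.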