January 2019
One way to overcome the curse of dimensionality is to learn a lower-dimensional, distributed representation of the words (http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf). This distributed representation is created by learning an embedding function that maps the space of words into a lower-dimensional space of word embeddings, as follows:

Words from a vocabulary of size V are transformed into one-hot encoded vectors of size V (each word is encoded uniquely). Then, the embedding function transforms this V-dimensional space into a distributed ...
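
The one-hot encoding and the embedding function described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the five-word vocabulary, the embedding size D = 3, and the names `one_hot`, `embedding_matrix`, and `embed` are hypothetical, and the weights are random placeholders rather than learned values.

```python
import random

V = 5  # vocabulary size (illustrative assumption)
D = 3  # embedding dimension, chosen smaller than V (illustrative assumption)

# Hypothetical toy vocabulary; each word gets a unique index.
vocab = ["the", "cat", "sat", "on", "mat"]
word_to_index = {word: i for i, word in enumerate(vocab)}


def one_hot(word):
    """Return the V-dimensional one-hot vector for a word."""
    vec = [0.0] * V
    vec[word_to_index[word]] = 1.0
    return vec


# Embedding matrix of shape V x D. In a trained model these weights are
# learned; here they are random placeholders.
rng = random.Random(0)
embedding_matrix = [[rng.uniform(-1.0, 1.0) for _ in range(D)] for _ in range(V)]


def embed(word):
    """Multiply the one-hot vector (1 x V) by the embedding matrix (V x D)."""
    x = one_hot(word)
    return [sum(x[i] * embedding_matrix[i][j] for i in range(V)) for j in range(D)]


cat_vector = embed("cat")
```

Because a one-hot vector has a single non-zero entry, the multiplication simply selects one row of the matrix. In this sketch, `embed("cat")` returns the same values as `embedding_matrix[word_to_index["cat"]]`, which is why embedding layers are usually implemented as a table lookup rather than a full matrix product.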