August 2018
Intermediate to advanced
438 pages
12h 3m
English
We start with a query: where is the milk now? The query is encoded as a bag of words, giving a vector q of size V, the vocabulary size. In the simplest case, we use an embedding matrix B (d x V) to convert this vector to a word embedding of size d: u = embeddingB(q).

The input sentences x1, x2, ..., xi are stored in memory using another embedding matrix, A (d x V), with the same dimensions as B: mi = embeddingA(xi). The similarity between the embedded query, u, and each memory, mi, is computed by taking the inner product followed by a softmax: pi = softmax(u^T mi).
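The query embedding and the attention weights over the memories can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the two example sentences, the embedding size d = 4, the randomly initialized matrices, and the helper names bag_of_words and softmax are all made up for the example. In a real model, A and B are learned during training.

```python
import numpy as np

def bag_of_words(tokens, vocab):
    """Count-based bag-of-words vector of size V (hypothetical helper)."""
    vec = np.zeros(len(vocab))
    for token in tokens:
        vec[vocab[token]] += 1.0
    return vec

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Toy data, assumed purely for illustration.
sentences = [
    "sam picked up the milk".split(),
    "sam went to the kitchen".split(),
]
query = "where is the milk now".split()

# Build the vocabulary from every word seen in the sentences and the query.
words = sorted({w for s in sentences + [query] for w in s})
vocab = {w: i for i, w in enumerate(words)}
V, d = len(vocab), 4  # d = 4 is an arbitrary choice

# Random stand-ins for the learned embedding matrices, both d x V.
rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(d, V))
B = rng.normal(scale=0.1, size=(d, V))

u = B @ bag_of_words(query, vocab)                             # query embedding, shape (d,)
m = np.stack([A @ bag_of_words(s, vocab) for s in sentences])  # memories, shape (n, d)
p = softmax(m @ u)                                             # attention over memories, shape (n,)

print(p)
```

Each entry of p is the weight the query places on the corresponding memory, and the entries sum to 1. With untrained random matrices the weights are close to uniform, so they only become meaningful once A and B have been trained.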
The output memory representation is as follows: each xi has a corresponding output vector, c