July 2019
Intermediate to advanced
512 pages
19h 39m
English
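In both the CBOW and skip-gram models, we used the softmax function to compute the probability of a word occurring. But computing this probability with the softmax function is computationally expensive. Say we are building a CBOW model; we compute the probability of a given word in our vocabulary being the target word as follows. Writing $u_j$ for the score the model assigns to the $j$-th word and $V$ for the vocabulary size, this is the standard softmax:

$$\hat{y}_j = \frac{\exp(u_j)}{\sum_{j'=1}^{V} \exp(u_{j'})}$$
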
If you look at the preceding equation, we are basically dividing the exponential of that word's score
with the ...
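
To make the cost concrete, here is a minimal sketch of a softmax probability written in plain Python. The function name `softmax_probability` and the toy scores are hypothetical, chosen only for illustration; they are not taken from the book. The point is that the denominator sums over every word in the vocabulary, so each probability costs work proportional to the vocabulary size.

```python
import math

def softmax_probability(scores, j):
    """Return the softmax probability of word j given one score per vocabulary word."""
    # Subtracting the maximum score keeps exp() from overflowing
    # and leaves the resulting probability unchanged.
    m = max(scores)
    numerator = math.exp(scores[j] - m)
    # The denominator visits every word in the vocabulary,
    # so a single probability needs O(V) exponentials.
    denominator = sum(math.exp(s - m) for s in scores)
    return numerator / denominator

# Hypothetical scores for a toy five-word vocabulary (illustrative values only).
scores = [2.0, 1.0, 0.1, -1.0, 0.5]
probs = [softmax_probability(scores, j) for j in range(len(scores))]
```

With a realistic vocabulary of hundreds of thousands of words, that full pass over `scores` is repeated for every training example, which is what makes the plain softmax expensive.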