If you've been reading carefully, you might have noticed a gap that I haven't closed. Word-embedding models create a vector for each word. BoW models, by contrast, create a vector for each document. So how can we use word-embedding models for document classification?
One naive way might be to take the vectors for all the words in our document and compute their mean. We might interpret this vector as the mean semantic value of the document. In practice, this solution is often used and it can yield good results. However, it is not always superior to BoW models. Consider the phrases "dog bites man" and "man bites dog." Hopefully, you'll agree with me that those are two very different statements; yet because they contain exactly the same words, averaging their word vectors produces exactly the same document vector, so word order is lost entirely.
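Here is a minimal sketch of this mean-pooling approach, using a toy hand-made vector table with hypothetical values in place of a real pretrained model such as word2vec or GloVe:

```python
import numpy as np

# Hypothetical 3-dimensional word vectors; a real pretrained model
# would provide vectors with hundreds of dimensions learned from a
# large corpus.
word_vectors = {
    "dog":   np.array([0.9, 0.1, 0.0]),
    "bites": np.array([0.0, 0.8, 0.3]),
    "man":   np.array([0.2, 0.0, 0.7]),
}

def document_vector(text):
    """Average the vectors of all known words in the document."""
    vectors = [word_vectors[w] for w in text.lower().split()
               if w in word_vectors]
    return np.mean(vectors, axis=0)

v1 = document_vector("dog bites man")
v2 = document_vector("man bites dog")

# Averaging discards word order: both phrases map to the same point.
print(np.allclose(v1, v2))  # True
```

The final check printing True demonstrates the limitation described above: any two documents containing the same multiset of words receive identical mean vectors, no matter how differently those words are arranged.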