3

Word2vec – Learning Word Embeddings

In this chapter, we will discuss a topic of paramount importance in NLP—Word2vec, a data-driven technique for learning powerful numerical representations (that is, vectors) of words or tokens in a language. Languages are complex. This warrants sound language understanding capabilities in the models we build to solve NLP problems. When transforming words to a numerical representation, a lot of methods aren’t able to sufficiently capture the semantics and contextual information that word carries. For example, the feature representation of the word forest should be very different from oven as these words are rarely used in similar contexts, whereas the representations of forest and jungle should be very similar. ...

Get Natural Language Processing with TensorFlow - Second Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.