Hands-On Natural Language Processing with Python by Rajalingappaa Shanmugamani, Rajesh Arumugam

Building a vocabulary for word embedding lookup

We want to create word embeddings for every word in our facts, candidates, and questions. To do so, we read the data and candidates to count the distinct words that need embeddings and to find the maximum sentence lengths. This information is passed to the memory network model to initialize the embedding matrices and input placeholders:

    # Module-level imports this method relies on
    # (in Python 3, reduce lives in functools)
    from functools import reduce
    from itertools import chain

    def build_vocab(self, data, candidates):
        # Build word vocabulary set from all data and candidate words
        vocab = reduce(lambda x1, x2: x1 | x2,
                       (set(list(chain.from_iterable(facts)) + questions)
                        for facts, questions, answers in data))
        vocab |= reduce(lambda x1, x2: x1 | x2,
                        (set(candidate) for candidate in candidates))
        vocab = sorted(vocab)
        ...
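Because the listing above is truncated, the following is a minimal standalone sketch of how the vocabulary size and maximum lengths described in this section could be computed. It is not the book's implementation: the function name `build_vocab_stats`, the returned dictionary keys, the reserved padding index 0, and the toy data are all assumptions made for illustration.

```python
from functools import reduce
from itertools import chain


def build_vocab_stats(data, candidates):
    """Hypothetical helper: collect vocabulary and size statistics.

    data: list of (facts, question, answer) tuples, where facts is a list
          of tokenized sentences and question is a list of tokens.
    candidates: list of tokenized candidate answers.
    """
    # Union of all words seen in facts and questions
    vocab = reduce(lambda a, b: a | b,
                   (set(list(chain.from_iterable(facts)) + question)
                    for facts, question, _ in data))
    # Add the words that appear in the candidate answers
    vocab |= reduce(lambda a, b: a | b, (set(c) for c in candidates))
    vocab = sorted(vocab)

    # Assumption: index 0 is reserved for padding, so words start at 1
    word_idx = {word: i + 1 for i, word in enumerate(vocab)}

    max_fact_len = max((len(f) for facts, _, _ in data for f in facts),
                       default=0)
    max_question_len = max((len(q) for _, q, _ in data), default=0)

    return {
        "word_idx": word_idx,
        "vocab_size": len(word_idx) + 1,  # +1 for the padding index
        "sentence_size": max(max_fact_len, max_question_len),
        "candidate_sentence_size": max((len(c) for c in candidates),
                                       default=0),
        "memory_size": max((len(facts) for facts, _, _ in data), default=0),
    }


# Toy data, invented purely for this example
data = [
    ([["hello"], ["api_call", "italian"]], ["book", "a", "table"], 0),
    ([["good", "morning"]], ["any", "italian", "place"], 1),
]
candidates = [["api_call", "italian"], ["what", "cuisine"]]

stats = build_vocab_stats(data, candidates)
print(stats["vocab_size"], stats["sentence_size"], stats["memory_size"])
```

In a sketch like this, `vocab_size` would determine the number of rows in each embedding matrix, while the length statistics would fix the shapes of the input placeholders.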
