Natural Language Processing with Spark NLP

by Alex Thomas
June 2020
Content level: Beginner to intermediate
364 pages
8h 58m
English
O'Reilly Media, Inc.
Content preview from Natural Language Processing with Spark NLP

Chapter 8. Sequence Modeling with Keras

So far, we have looked at documents as bags-of-words. This is a common, easily achievable approach for many NLP tasks. However, it produces a low-fidelity model of language. The order of words is essential to the encoding and decoding of meaning in language, and to capture that order, we will need to model sequences.

When people refer to sequences in a machine learning context, they are generally talking about sequences of data in which data points are not independent of the data points around them. We can still use features derived from the data point itself, as with general machine learning, but now we can also use the data and labels from the nearby data points. For example, if we are trying to determine whether the token “produce” is being used as a noun or a verb, knowing what words are around it will be very informative. If the token before it is “might,” that indicates “produce” is a verb. If “the” is the preceding token, that indicates it is a noun. These other words give us context.
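As a rough illustration of using nearby data points as features, the following sketch builds a small feature dictionary for one token from its immediate neighbors. The function name `context_features`, the feature keys, and the `<START>` and `<END>` markers are assumptions made for this example; they are not taken from the book or from any library.

```python
def context_features(tokens, i):
    """Return features for tokens[i] drawn from the token and its neighbors.

    Hypothetical helper: the keys and boundary markers are illustrative only.
    """
    return {
        "token": tokens[i].lower(),
        # The preceding token, or a marker when tokens[i] starts the sentence.
        "prev_token": tokens[i - 1].lower() if i > 0 else "<START>",
        # The following token, or a marker when tokens[i] ends the sentence.
        "next_token": tokens[i + 1].lower() if i + 1 < len(tokens) else "<END>",
    }


verb_sentence = ["They", "might", "produce", "more"]
noun_sentence = ["We", "bought", "the", "produce"]

# Same token, different context: "might" suggests a verb, "the" suggests a noun.
print(context_features(verb_sentence, 2))
print(context_features(noun_sentence, 3))
```

A classifier trained on features like these sees the same `token` value in both sentences and has to rely on `prev_token` to tell the two uses apart.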

What if “to” precedes “produce”? That could still indicate either a noun or a verb. We need to look back further. This gives us the concept of windows: the amount of context we want to capture. Many algorithms use a fixed amount of context that is set as a hyperparameter. Some algorithms, such as long short-term memory networks (LSTMs), can learn how long to remember the context.
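The idea of a fixed window can be sketched as a function that returns a set number of tokens on each side of a target token. This is a minimal sketch, assuming a symmetric window and a `<PAD>` placeholder at sentence boundaries; the name `context_window` and its parameters are hypothetical and chosen only for this example.

```python
def context_window(tokens, i, size, pad="<PAD>"):
    """Return the `size` tokens on each side of tokens[i].

    Hypothetical helper: `size` plays the role of the window hyperparameter,
    and `pad` fills positions that fall outside the sentence.
    """
    left = tokens[max(0, i - size):i]
    right = tokens[i + 1:i + 1 + size]
    # Pad so that both sides always have exactly `size` entries.
    left = [pad] * (size - len(left)) + left
    right = right + [pad] * (size - len(right))
    return left, right


tokens = ["They", "want", "to", "produce", "more"]

# A window of 1 sees only "to"; a window of 2 also sees "want".
for size in (1, 2):
    print(size, context_window(tokens, 3, size))
```

With `size=1` the only left context for “produce” is “to,” which is ambiguous; widening the window to 2 brings in “want,” which is the kind of extra context the text describes looking back for.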

Sequence problems come from different domains, and in different data formats. In this book ...

Publisher Resources

ISBN: 9781492047759