Applied Natural Language Processing in the Enterprise

by Ankur A. Patel, Ajay Uppili Arasanipalai
May 2021
Beginner to intermediate
333 pages
8h 45m
English
O'Reilly Media, Inc.

Chapter 6. Recurrent Neural Networks and Other Sequence Models

One of the big themes of this book so far has been transformers. In fact, almost every model we have trained so far has been some member or relative of the transformer family. Even the tokenizers we built and used were constructed with specific transformer architectures in mind.

But transformers aren’t the only model in town.

Transformers themselves are relatively recent: the original paper by Vaswani et al. was first published on arXiv in June 2017 (eons ago in the deep learning community, but not long ago in the span of human history). Before then, nobody was using transformers. So what was the alternative?

Recurrent neural networks (RNNs) were the name of the game back in the day. With all of our talk about how transformers and transfer learning have revolutionized the field, we might have given you the (false) impression that NLP wasn’t really a thing until BERT came out. This is most certainly not the case.

RNNs and their variants were the convolutional neural networks (CNNs) of NLP. In 2015, if you wanted to learn deep learning, most courses introduced CNNs as the “solution” for vision and RNNs as the “solution” for NLP. Perhaps the most salient example of 2015 RNN hype was Andrej Karpathy’s blog post, “The Unreasonable Effectiveness of Recurrent Neural Networks”, which shows character-level RNNs generating surprisingly plausible text, from Shakespeare-style passages to Linux source code, and argues that these models actually work.

RNNs and their variants, unlike transformers, are ...

ISBN: 9781492062561