12 Network design alternatives to RNNs

This chapter covers

  • Working around the limitations of RNNs
  • Adding time to a model using positional encodings
  • Adapting CNNs to sequence-based problems
  • Extending attention to multiheaded attention
  • Understanding transformers

Recurrent neural networks, LSTMs in particular, have been used to classify and work with sequence problems for over two decades. While they have long been reliable tools for the task, they have several undesirable properties. First, RNNs are just plain slow: they take a long time to train, which means waiting around for results. Second, they do not scale well with more layers (making it hard to improve model accuracy) or with more GPUs (making it hard to train them faster). With skip connections ...
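To make the speed problem concrete, here is a minimal PyTorch sketch (sizes are arbitrary, and the `nn.RNNCell` loop stands in for any recurrent layer). Each hidden state depends on the one before it, so the time steps must run one after another; a feedforward layer, by contrast, handles every time step in one batched operation.

```python
import torch
import torch.nn as nn

# Arbitrary illustrative sizes
batch, seq_len, d = 32, 128, 64
x = torch.randn(batch, seq_len, d)

# Recurrent path: T sequential steps, because h_t depends on h_{t-1}.
cell = nn.RNNCell(input_size=d, hidden_size=d)
h = torch.zeros(batch, d)
for t in range(seq_len):       # cannot be parallelized across time
    h = cell(x[:, t, :], h)

# Feedforward path: one batched matmul covers all T steps at once,
# so the work spreads across the GPU.
ff = nn.Linear(d, d)
y = ff(x)
```

That sequential loop is exactly the bottleneck the alternatives in this chapter, positional encodings, CNNs for sequences, and attention, are designed to avoid.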
