August 2021
Intermediate to advanced
752 pages
21h 59m
English
This chapter focuses on a technique known as attention. We start by describing the attention mechanism and how it can be used to improve the encoder-decoder-based neural machine translation architecture from Chapter 14, “Sequence-to-Sequence Networks and Natural Language Translation.” We then describe a mechanism known as self-attention and how the different attention mechanisms can be used to build an architecture known as the Transformer.
Many readers will find attention tricky on the first encounter. We encourage you to try to get through this chapter, but it is fine to skip over the details during the first reading. Focus on understanding the big picture. In particular, do not worry if you feel lost ...
Read now
Unlock full access