January 2025
Beginner to intermediate
432 pages
13h 16m
English
This chapter covers
In the last chapter, we built a Transformer from scratch that can translate between any two languages, based on the paper “Attention Is All You Need.”1 Specifically, we implemented the self-attention mechanism, using query, key, and value vectors to calculate scaled dot product attention (SDPA).
To have a deeper understanding of self-attention and Transformers, we’ll use English-to-French translation as our case study in this ...
Read now
Unlock full access