December 2025
Beginner to intermediate
360 pages
10h 48m
English
This part introduces the foundations of transformer architectures, which have become the backbone of modern generative AI. We begin by comparing transformers and diffusion models, which approach the problem of generating data in very different ways. In chapter 2, you’ll build a transformer from scratch to translate German to English, gaining hands-on experience with the attention mechanism that lets these models capture relationships across sequences.
We then explore practical applications of transformers in computer vision and multimodal tasks. You’ll implement a vision transformer (ViT) to classify images in chapter 3 and build a multimodal transformer to ...