January 2025
Beginner to intermediate
432 pages
13h 16m
English
Part III focuses on text generation.
In chapter 8, you’ll build and train a recurrent neural network to generate text. Along the way, you’ll learn how tokenization and word embedding work. You’ll also learn to generate text autoregressively and to use temperature and top-K sampling to control the creativity of the generated text. In chapters 9 and 10, you’ll build a Transformer from scratch, based on the paper “Attention Is All You Need,” to translate English to French. In chapter 11, you’ll learn to build GPT-2XL, the largest version of GPT-2, from scratch. After that, you’ll learn how to extract the pretrained weights from Hugging Face and load them into your own GPT-2 model. You’ll ...