Understand the architecture that underpins today’s most powerful AI models.
Transformers are the superpower behind large language models (LLMs) like ChatGPT, Gemini, and Claude. Transformers in Action gives you the insights, practical techniques, and extensive code samples you need to adapt pretrained transformer models to new and exciting tasks.
Inside Transformers in Action you’ll learn:
How transformers and LLMs work
Modeling families and architecture variants
Efficient and specialized large language models
Adapt HuggingFace models to new tasks
Automate hyperparameter search with Ray Tune and Optuna
Optimize LLM performance
Advanced prompting and zero/few-shot learning
Text generation with reinforcement learning
Responsible LLMs
Transformers in Action takes you from the origins of transformers all the way to fine-tuning an LLM for your own projects. Author Nicole Koenigstein presents the essential mathematical and theoretical background of the transformer architecture in practical terms through executable Jupyter notebooks. You’ll discover advice on prompt engineering, as well as tried-and-tested methods for optimizing and tuning large language models. Plus, you’ll find unique coverage of AI ethics, specialized smaller models, and the encoder-decoder architecture.
About the Technology
Transformers are the beating heart of large language models (LLMs) and other generative AI tools. These powerful neural networks use a mechanism called self-attention, which enables them to dynamically evaluate the relevance of each input element in context. Transformer-based models can understand and generate natural language, translate between languages, summarize text, and even write code, all with impressive fluency and coherence.
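To make the self-attention idea concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not taken from the book; the function names, the toy dimensions, and the random projection matrices are illustrative assumptions, and real models use learned weights, multiple heads, and tensor libraries.

```python
import math
import random


def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative only).

    x is a sequence of token vectors; w_q, w_k, w_v are hypothetical
    projection matrices standing in for learned parameters.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)  # relevance of every token to every other token
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Each output vector is a relevance-weighted mix of all value vectors.
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy sizes chosen for illustration: 4 tokens, 8-dimensional vectors.
rng = random.Random(0)
seq_len, d_model, d_head = 4, 8, 8
x = random_matrix(seq_len, d_model, rng)
w_q = random_matrix(d_model, d_head, rng)
w_k = random_matrix(d_model, d_head, rng)
w_v = random_matrix(d_model, d_head, rng)

output, weights = self_attention(x, w_q, w_k, w_v)
for row in weights:
    print([round(w, 3) for w in row])
```

Each printed row shows how strongly one token attends to every token in the sequence, which is the "relevance of each input element in context" described above.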
About the Book
Transformers in Action introduces you to transformers and large language models with careful attention to their design and mathematical underpinnings. You’ll learn why architecture matters for speed, scale, and retrieval as you explore applications including RAG and multi-modal models. Along the way, you’ll discover how to optimize training and performance using advanced sampling and decoding techniques, use reinforcement learning to align models with human preferences, and more. The hands-on Jupyter notebooks and real-world examples ensure you’ll see transformers in action as you go.
What's Inside
Optimizing LLM performance
Adapting HuggingFace models to new tasks
How transformers and LLMs work under the hood
Mitigating bias and applying responsible, ethical practices in LLMs
About the Reader
For data scientists and machine learning engineers.
About the Author
Nicole Koenigstein is the Co-Founder and Chief AI Officer at the fintech company Quantmate.
Quotes
An absolute joy to read and learn from. - Ankit Virmani, Google
New insights and high impact applications in almost every chapter. - Hobson Lane, Tangible AI
Finally, the transformers book that prioritizes code over theory! - Olena Sokol, Samsung
A sharp, no-fluff deep dive into transformers. - Priyanka Neelakrishnan, Palo Alto Networks