1 Understanding large language models

This chapter covers

High-level explanations of the fundamental concepts behind large language models (LLMs)
Insights into the transformer architecture from which LLMs are derived
A plan for building an LLM from scratch

Large language models (LLMs), such as those offered in OpenAI’s ChatGPT, are deep neural network models that have been developed over the past few years. They ushered in a new era for natural language processing (NLP). Before the advent of LLMs, traditional methods excelled at categorization tasks such as email spam classification and straightforward pattern recognition that could be captured with handcrafted rules or simpler models. However, they typically underperformed in language ...

Build a Large Language Model (From Scratch) by Sebastian Raschka
