Generative AI on AWS

by Chris Fregly, Antje Barth, Shelbee Eigenbrode
November 2023
Intermediate to advanced
312 pages
8h 15m
English
O'Reilly Media, Inc.
Content preview from Generative AI on AWS

Chapter 3. Large-Language Foundation Models

In Chapter 2, you learned how to perform prompt engineering and leverage in-context learning using an existing foundation model. In this chapter, you will explore how a foundation model is trained, including the training objectives and datasets. While it is uncommon to train your own foundation model from scratch, it is worth understanding how much time, effort, and complexity this compute-intensive process requires.

Training a multibillion-parameter large-language model from scratch, a process called pretraining, requires millions of GPU compute hours, trillions of data tokens, and a lot of patience. In this chapter, you will learn about the empirical scaling laws for model pretraining described in the popular Chinchilla paper.1

When training the BloombergGPT model, for example, researchers used the Chinchilla scaling laws as a starting point but still needed a lot of trial and error, as explained in the BloombergGPT paper.2 With a compute budget of 1.3 million GPU hours, BloombergGPT was trained on a large distributed cluster of GPU instances using Amazon SageMaker.
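To get a feel for how a GPU-hour budget relates to model size, here is a minimal Python sketch of a back-of-the-envelope estimate. It rests on two widely cited rules of thumb that this preview does not state: training compute is roughly 6 × parameters × tokens FLOPs, and a compute-optimal model in the Chinchilla sense uses roughly 20 training tokens per parameter. The per-GPU peak throughput and utilization values are hypothetical placeholders, and the function names are invented for illustration. Do not read the result as BloombergGPT's actual configuration.

```python
import math


def gpu_hours_to_flops(gpu_hours, peak_flops_per_gpu, utilization):
    """Convert a GPU-hour budget into an approximate count of useful training FLOPs."""
    seconds = gpu_hours * 3600.0
    return seconds * peak_flops_per_gpu * utilization


def compute_optimal_size(compute_flops, tokens_per_param=20.0):
    """Solve C = 6 * N * D with D = tokens_per_param * N.

    Returns (N, D): the approximate parameter count and training-token count.
    Both the 6 * N * D cost model and the default ratio of 20 are rules of thumb.
    """
    n_params = math.sqrt(compute_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens


# Hypothetical inputs: only the 1.3 million GPU hours comes from the text.
budget_flops = gpu_hours_to_flops(
    gpu_hours=1.3e6,
    peak_flops_per_gpu=312e12,  # assumed peak throughput per GPU, in FLOP/s
    utilization=0.3,            # assumed fraction of peak actually achieved
)
n_params, n_tokens = compute_optimal_size(budget_flops)
print(f"Compute budget: {budget_flops:.2e} FLOPs")
print(f"Parameters:     {n_params / 1e9:.1f} billion")
print(f"Tokens:         {n_tokens / 1e9:.1f} billion")
```

Because the estimate scales with the square root of the compute budget, doubling the assumed utilization increases the suggested parameter count by only about 41 percent. This is one reason such estimates serve as a starting point rather than a final answer.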

Note

This chapter dives deep into pretraining generative foundation models, which may overwhelm some readers. It’s important to note that you do not need to fully understand this chapter to effectively build generative AI applications. You may find this chapter useful as a reference for some advanced concepts later in this book.

Large-Language Foundation ...



Publisher Resources

ISBN: 9781098159214