LLM Engineer's Handbook

by Paul Iusztin, Maxime Labonne
October 2024
Intermediate to advanced
522 pages
12h 55m
English
Packt Publishing
3

Data Engineering

This chapter explores the LLM Twin project in more depth. We will learn how to design and implement the data collection pipeline that gathers the raw data used in all our LLM use cases, such as fine-tuning or inference. As this is not a book on data engineering, we will keep this chapter short and focus only on what is strictly necessary to collect the required raw data. Starting with Chapter 4, we will concentrate on LLMs and GenAI, exploring their theory and concrete implementation details.

When working on toy projects or doing research, you usually have a static dataset with which you work. But in our LLM Twin use case, we want to mimic a real-world scenario where we must gather and curate the data ourselves. ...

Publisher Resources

ISBN: 9781836200079