LLMOps

by Abi Aryan
July 2025
Intermediate to advanced
284 pages
8h 21m
English
O'Reilly Media, Inc.
Content preview from LLMOps

Chapter 4. Data Engineering for LLMs

In this chapter, you will learn about data engineering, data management practices, and the database tools and systems available. The discussion will be geared toward data, DevOps, and MLOps engineers who want to become LLMOps engineers and/or lead their company’s data engineering efforts. By the end of this chapter, you will have a strong grasp of the foundations of data engineering, as well as best practices for LLMs.

Data Engineering and the Rise of LLMs

In the late 1960s, British computer scientist Edgar F. Codd, fresh from finishing his doctorate in self-replicating computers, was working at IBM. Codd became fascinated by the theory of data arrangement and in 1970 published a paper in Communications of the ACM called “A Relational Model of Data for Large Shared Data Banks,” which introduced what we know today as relational databases. Instead of a single sales table in which each record contains all the information about the products and the customers to whom they’ve been sold, a relational database stores this data in multiple related tables: one for customers, one for products, and one for sales. Before relational databases, something as simple as a change in customer address would require changing all sales records for that customer, which was an expensive operation on mainframes. In a relational database, you change just the customer record, and every sale that references that customer reflects the new address.
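
The customers, products, and sales layout described above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not an example from the book: the table names, column names, sample rows, and helper functions (build_sales_db, sales_with_addresses, change_address) are all assumptions chosen for the demonstration.

```python
import sqlite3


def build_sales_db() -> sqlite3.Connection:
    """Create an in-memory database with three related tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL,
            address     TEXT NOT NULL
        );
        CREATE TABLE products (
            product_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL
        );
        -- A sale stores only references to a customer and a product,
        -- never a copy of the customer's address.
        CREATE TABLE sales (
            sale_id     INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            product_id  INTEGER NOT NULL REFERENCES products(product_id)
        );
        -- Made-up sample data: one customer with two sales.
        INSERT INTO customers VALUES (1, 'Ada', '1 Old Street');
        INSERT INTO products VALUES (1, 'Keyboard'), (2, 'Monitor');
        INSERT INTO sales VALUES (1, 1, 1), (2, 1, 2);
        """
    )
    return conn


def sales_with_addresses(conn: sqlite3.Connection) -> list:
    """Join the three tables to rebuild the 'wide' view of each sale."""
    return conn.execute(
        """
        SELECT s.sale_id, c.name, c.address, p.name
        FROM sales AS s
        JOIN customers AS c ON c.customer_id = s.customer_id
        JOIN products  AS p ON p.product_id  = s.product_id
        ORDER BY s.sale_id
        """
    ).fetchall()


def change_address(conn: sqlite3.Connection, customer_id: int, new_address: str) -> int:
    """Update the single customer row and return how many rows were modified."""
    cur = conn.execute(
        "UPDATE customers SET address = ? WHERE customer_id = ?",
        (new_address, customer_id),
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    db = build_sales_db()
    print("rows modified:", change_address(db, 1, "2 New Road"))
    for row in sales_with_addresses(db):
        print(row)
```

Only one row in customers is written when the address changes, yet both joined sales rows show the new address, because the sales table holds a reference to the customer rather than a copy of the customer's details.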

While it didn’t fascinate anyone at IBM right away, the ...

ISBN: 9781098154196