Data Analysis with LLMs

by Immanuel Trummer
April 2025
Intermediate to advanced
232 pages
6h 56m
English
Manning Publications

Overview

Speed up common data science tasks with AI assistants like ChatGPT and Large Language Models (LLMs) from Anthropic, Cohere, OpenAI, Google, Hugging Face, and more!

Data Analysis with LLMs teaches you to use the new generation of AI assistants and Large Language Models (LLMs) to aid and accelerate common data science tasks.

Learn how to use LLMs to:

  • Analyze text, tables, images, and audio files
  • Extract information from multimodal data lakes
  • Classify, cluster, transform, and query multimodal data
  • Build natural language query interfaces over structured data sources
  • Use LangChain to build complex data analysis pipelines
  • Apply prompt engineering and configure models

Practical throughout, Data Analysis with LLMs takes you from your first prompts to advanced techniques such as creating LLM-based agents for data analysis and fine-tuning existing models. You’ll learn how to extract data, build natural language query interfaces, and much more.

About the Technology
Large Language Models (LLMs) can streamline and accelerate almost any data science task. Master the techniques in this book, and you’ll be able to analyze large amounts of text, tabular and graph data, images, videos, and more with clear natural language prompts and a few lines of Python code.

About the Book
Data Analysis with LLMs shows you exactly how to integrate generative AI into your day-to-day work as a data scientist. In it, Cornell professor Immanuel Trummer guides you through a series of engaging projects that introduce OpenAI’s Python library, tools like LangChain and LlamaIndex, and LLMs from Anthropic, Cohere, and Hugging Face. As you go, you’ll use AI to query structured and unstructured data, analyze sound and images, and optimize the cost and quality of your data analysis process.

What's Inside
  • Classify, cluster, transform, and query multimodal data
  • Build natural language query interfaces over structured data sources
  • Create LLM-based agents for autonomous data analysis
  • Apply prompt engineering and configure models


About the Reader
For data scientists and data analysts who know the basics of Python.

About the Author
Immanuel Trummer is an associate professor of computer science at Cornell University and a member of the Cornell Database Group.

Quotes
Comprehensive, insightful, and packed with hands-on guidance. A must-read!
- Oren Etzioni, Allen Institute for AI

Goes into the deep and fascinating areas that other books gloss over. It will level you up fast.
- Andrew Carr, Cartwheel

Helps you make LLMs an indispensable tool to process data of all types and uncover valuable insights with ease.
- Aditya Parameswaran, University of California, Berkeley

A valuable resource to leverage LLMs for multimodal data analysis.
- Sumit Bhattacharyya, TELUS Health

ISBN: 9781633437647