Hands-On Large Language Models

Book description

AI has acquired startling new language capabilities in just the past few years. Driven by rapid advances in deep learning, language AI systems can write and understand text better than ever before. This trend is enabling new features, products, and entire industries. With this book, Python developers will learn the practical tools and concepts they need to use these capabilities today.

You'll learn how to harness the power of pretrained large language models for use cases like copywriting and summarization; create semantic search systems that go beyond keyword matching; build systems that classify and cluster text, making sense of large document collections at scale; and use existing libraries and pretrained models for text classification, search, and clustering.
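
For a flavor of what that looks like in code, here is a minimal sketch of semantic search with a pretrained embedding model. It assumes the sentence-transformers library is installed; the model name and example texts are illustrative rather than taken from the book.

    # Semantic search sketch: rank documents by meaning, not keyword overlap.
    # The model name below is an illustrative, widely used small embedding model.
    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")

    documents = [
        "The new phone has excellent battery life.",
        "Stock markets fell sharply after the announcement.",
        "A beginner's guide to baking sourdough bread.",
    ]
    query = "smartphone power consumption"

    # Embed the query and documents into the same vector space.
    doc_embeddings = model.encode(documents, convert_to_tensor=True)
    query_embedding = model.encode(query, convert_to_tensor=True)

    # Cosine similarity should rank the battery-life sentence highest,
    # even though it shares no keywords with the query.
    scores = util.cos_sim(query_embedding, doc_embeddings)[0]
    print(documents[int(scores.argmax())])

Chapter 2 builds this idea out into full dense retrieval and reranking pipelines.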

This book also shows you how to:

  • Build advanced LLM pipelines to cluster text documents and explore the topics they belong to
  • Build semantic search engines that go beyond keyword search with methods like dense retrieval and rerankers
  • Identify use cases where these models can provide value
  • Understand the architecture of underlying Transformer models like BERT and GPT
  • Get a deeper understanding of how LLMs are trained
  • Optimize LLMs for specific applications with methods such as generative model fine-tuning, contrastive fine-tuning, and in-context learning (see the sketch after this list)
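
As a taste of that last item, here is a minimal, illustrative sketch of in-context learning: steering a generative model with a few labeled examples placed directly in the prompt. It assumes the transformers library is installed; the model choice, prompt, and labels are placeholders rather than the book's own code.

    # In-context learning: the prompt alone carries the "training" examples.
    # GPT-2 is an illustrative stand-in for a stronger instruction-tuned model.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    prompt = (
        "Review: The food was cold and bland. Sentiment: negative\n"
        "Review: Friendly staff and great coffee! Sentiment: positive\n"
        "Review: Waited an hour for a burnt pizza. Sentiment:"
    )

    # Greedy decoding of a couple of new tokens; the model continues the
    # pattern set by the examples rather than being fine-tuned on them.
    output = generator(prompt, max_new_tokens=2, do_sample=False)
    print(output[0]["generated_text"])

Chapters 1 and 4 develop this prompting approach in depth.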

Table of contents

  Brief Table of Contents (Not Yet Final)
  1. Categorizing Text
    1. Supervised Text Classification
      1. Model Selection
      2. Data
      3. Classification Head
      4. Pre-Trained Embeddings
    2. Zero-shot Classification
      1. Pre-Trained Embeddings
      2. Natural Language Inference
    3. Classification with Generative Models
      1. In-Context Learning
      2. Named Entity Recognition
    4. Summary
  2. Semantic Search
    1. Three Major Categories of Language-Model-based Search Systems
    2. Dense Retrieval
      1. Dense Retrieval Example
      2. Chunking Long Texts
      3. Nearest Neighbor Search vs. Vector Databases
      4. Fine-tuning Embedding Models for Dense Retrieval
    3. Reranking
      1. Reranking Example
      2. Open Source Retrieval and Reranking with Sentence Transformers
      3. How Reranking Models Work
    4. Generative Search
      1. What is Generative Search?
    5. Other LLM Applications in Search
      1. Evaluation Metrics
    6. Summary
  3. Text Clustering and Topic Modeling
    1. Text Clustering
      1. Data
      2. How Do We Perform Text Clustering?
    2. Topic Modeling
      1. BERTopic
      2. Example
      3. Representation Models
      4. Text Generation
      5. Topic Modeling Variations
    3. Summary
  4. Text Generation with GPT Models
    1. Using Text Generation Models
      1. Choosing a Text Generation Model
      2. Loading a Text Generation Model
      3. Controlling the Model Output
    2. Intro to Prompt Engineering
      1. The Basic Ingredients of a Prompt
      2. Instruction-based Prompting
    3. Advanced Prompt Engineering
      1. The Potential Complexity of a Prompt
      2. In-Context Learning: Providing Examples
      3. Chain Prompting: Breaking up the Problem
    4. Reasoning with Generative Models
      1. Chain-of-Thought: Think Before Answering
      2. Tree-of-Thought: Exploring Intermediate Steps
    5. Output Verification
      1. Providing Examples
      2. Grammar: Constrained Sampling
    6. Summary
  5. Multimodal Large Language Models
    1. Transformers for Vision
    2. Multimodal Embedding Models
      1. CLIP: Connecting Text and Images
    3. Making Text Generation Models Multimodal
      1. BLIP-2: Bridging the Modality Gap
      2. Preprocessing Multimodal Inputs
      3. Use Case 1: Image Captioning
      4. Use Case 2: Multimodal Chat-based Prompting
    4. Summary
  6. Tokens & Token Embeddings
    1. LLM Tokenization
      1. How Tokenizers Prepare the Inputs to the Language Model
      2. Word vs. Subword vs. Character vs. Byte Tokens
      3. Comparing Trained LLM Tokenizers
      4. Tokenizer Properties
      5. A Language Model Holds Embeddings for the Vocabulary of its Tokenizer
      6. Creating Contextualized Word Embeddings with Language Models
    2. Word Embeddings
      1. Using Pre-trained Word Embeddings
      2. The Word2vec Algorithm and Contrastive Training
    3. Embeddings for Recommendation Systems
      1. Recommending songs by embeddings
    4. Summary
  7. Creating Text Embedding Models
    1. Embedding Models
    2. What is Contrastive Learning?
    3. SBERT
    4. Creating an Embedding Model
      1. Generating contrastive examples
      2. Train model
      3. In-depth Evaluation
      4. Loss Functions
    5. Fine-tuning an Embedding Model
      1. Supervised
      2. Augmented SBERT
    6. Unsupervised Learning
      1. Transformer-based Denoising AutoEncoder
      2. Domain Adaptation
      3. Generative Pseudo-Labeling
    7. Summary
  About the Authors

Product information

  • Title: Hands-On Large Language Models
  • Author(s): Jay Alammar, Maarten Grootendorst
  • Release date: October 2024
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098150969