AI Search with Open Weight LLMs

Published by O'Reilly Media, Inc.

Content level: Intermediate

Learn how to supercharge your search results with language models, using sentence transformers, optimized retrieval, rank fusion, and re-rankers

Course outcomes:

  • Understand encoder LLMs with open weights and how to adapt them to your needs
  • Be able to perform question answering with existing document corpora as a basis for RAG
  • Learn about specific advantages and disadvantages of existing LLMs
  • Understand how to use MTEB to find suitable embedding and reranking models
  • Be able to build a scalable retrieval pipeline for RAG

Course description:

Hugging Face hosts a vast number of large language models. Many are generic; some are suited to very specialized requirements. The open source community has invested a lot of work in fine-tuning models for specific needs, such as finding answers to given questions or ranking those answers with respect to the question. The result is a growing set of new, more powerful, commercially usable LLMs with open weights.

Join expert Christian Winkler to get a structured and consistent introduction to using LLMs with open weights for retrieving content from large document collections. You’ll learn how to use models to retrieve information, combine the results of different models with rank fusion, and refine the results with dense passage retrieval. To scale this solution, you will use a vector database. You’ll get working, hands-on solutions and explanations of how these models can also excel on less powerful hardware. You’ll also learn about different frontends these models can be plugged into. All code will be provided in the course and on GitHub.

What you’ll learn and how you can apply it

  • Work with open weight LLMs using the transformers and sentence_transformers libraries
  • Choose a suitable base LLM
  • Retrieve text (e.g. as a basis for RAG)
  • Work with multiple models in parallel to dramatically improve results
  • Combine the results with rank fusion
  • Refine the results with a cross-encoder for even better performance

This live event is for you because...

  • You’re a data scientist, ML engineer, or NLP developer.
  • You want to become an expert in text retrieval using large language models.
  • You want to use modern methods for business use cases.

Prerequisites

  • Set up a Google Colab account; alternatively, you can run the software on your own computer. Scaling needs a GPU, and RunPod or other hosting providers are also good options, but you can also run the software from the course on a CPU if you have a bit of patience
  • For local installations, a powerful GPU is necessary
  • Link to Jupyter Notebook
  • Link to GitHub repository
  • A working knowledge of Python and Jupyter notebooks
  • Machine learning and Hugging Face Transformers experience (helpful but not required)

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Sentence embeddings (90 minutes)

  • Presentation: Introduction to similarity; finding semantically similar statements
  • Hands-on exercises: Prepare data; work with an embeddings model using SBERT; use different base models and compare results
  • Q&A
  • Break
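
To give a feel for what "finding semantically similar statements" means in practice, here is a minimal sketch of ranking by cosine similarity. It is not taken from the course material. The three-dimensional vectors are made-up stand-ins for the embeddings a sentence embedding model would produce, and the helper names `cosine_similarity` and `most_similar` are hypothetical.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(query_vec, corpus_vecs):
    """Return the index of the corpus vector closest to the query."""
    scores = [cosine_similarity(query_vec, v) for v in corpus_vecs]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy "embeddings" (assumed values, for illustration only).
corpus = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
query = [0.9, 0.1, 0.0]
best = most_similar(query, corpus)
```

With real embeddings the vectors have hundreds of dimensions, but the ranking step stays the same.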

Advanced document retrieval (90 minutes)

  • Presentation: Vector databases; sparse document vectors (lexical retrieval); rank fusion algorithms and cross-encoders
  • Hands-on exercises: Use an (embedded) vector database; implement a rank fusion algorithm; use cross-encoders; compare with previous results; improve solution with a vector database
  • Use the reciprocal rank fusion algorithm to work with several models in parallel (boosting the performance)
  • Refine the results with cross-encoders
  • Q&A
  • Break
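
As a preview of the rank fusion exercise, the following is a minimal sketch of reciprocal rank fusion, which scores each document as the sum of 1 / (k + rank) over all input rankings. It is an illustration under assumptions, not the course's implementation: the function name, the document ids, and the choice of k = 60 (a commonly used default) are all assumed here.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical result lists from two retrievers (for example, dense and lexical).
dense_results = ["doc_a", "doc_b", "doc_c"]
lexical_results = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([dense_results, lexical_results])
```

Because the method uses only ranks, not raw scores, it can combine retrievers whose score scales are not comparable. The fused list could then be passed to a cross-encoder for re-ranking.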

Using existing software (60 minutes)

  • Presentation: Open source solutions for document retrieval; features of LangChain, txtai and LlamaIndex
  • Hands-on exercises: Install software; use existing software like txtai or LangChain to achieve a similar result; explore deployment options
  • Q&A

Your Instructor

  • Christian Winkler

    Christian Winkler is a professor at the Technical University of Applied Science in Nürnberg, where he concentrates on the latest research in natural language processing and, specifically, in the application of large language models. He coauthored Blueprints for Text Analytics Using Python for O’Reilly and has written many articles about NLP.

Skills covered

  • Search
  • Large Language Models (LLMs)