AI Search with Open Weight LLMs

Intermediate

Learn how to supercharge your search results with language models, using sentence transformers, optimized retrieval, rank fusion, and re-rankers

Course outcomes:

Understand encoder LLMs with open weights and how to adapt them to your needs
Be able to perform question answering with existing document corpora as a basis for RAG
Learn about specific advantages and disadvantages of existing LLMs
Understand how to use MTEB to find suitable embedding and reranking models
Be able to build a scalable retrieval pipeline for RAG

Course description:

There is an incredible number of large language models available on Hugging Face. Many models are generic; some are suited to very specialized requirements. The open source community has invested a lot of work in fine-tuning models for specific needs, such as finding answers to given questions or ranking those answers with respect to the question. This has led to new and more powerful, commercially usable LLMs with open weights, with more arriving all the time.

Join expert Christian Winkler to get a structured and consistent introduction to using LLMs with open weights for retrieving content from large document collections. You’ll learn how to use models to retrieve information, combine the results of different models with rank fusion, and refine the results with dense passage retrieval. To scale this solution, you will use a vector database. You’ll get working, hands-on solutions and explanations of how these models can also excel on less powerful hardware. You’ll also learn about different frontends these models can be plugged into. All code will be provided in the course and on GitHub.

What you’ll learn and how you can apply it

Work with open weight LLMs using the transformers and sentence_transformers libraries
Choose a suitable base LLM
Retrieve text (e.g. as a basis for RAG)
Work with multiple models in parallel to dramatically improve results
Combine the results with rank fusion
Refine the results with a cross-encoder for even better performance

This live event is for you because...

You’re a data scientist, ML engineer, or NLP developer.
You want to become an expert in text retrieval using large language models.
You want to use modern methods for business use cases.

Prerequisites

Set up a Google Colab account; alternatively, you can run the software on your own computer. Scaling needs a GPU, and RunPod or other hosting providers are also good options, but you can also run the course software on a CPU if you have a bit of patience.
For local installations, a powerful GPU is necessary
Link to Jupyter Notebook
Link to GitHub repository
A working knowledge of Python and Jupyter notebooks
Machine learning and Hugging Face Transformers experience (helpful but not required)

Recommended follow-up:

Take Fundamentals of Large Language Models (live online course with Jonathan Fernandes)
Take Fine-Tuning Open Weight Large Language Models (live online course with Christian Winkler)
Read Hands-On Large Language Models (book)
Read Scaling Search and Retrieval for Contextual AI (book)
Read Hands-On RAG for Production (book)

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Sentence embeddings (90 minutes)

Presentation: Introduction to similarity; finding semantically similar statements
Hands-on exercises: Prepare data; work with an embeddings model using SBERT; use different base models and compare results
Q&A
Break
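
The notion of similarity introduced in this session can be illustrated with a minimal sketch. It assumes that sentences have already been turned into embedding vectors; the three-dimensional vectors below are made-up toy values, not the output of any real model, and the helper name `cosine_similarity` is chosen here for illustration only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy "embeddings" (assumption: real models produce hundreds of dimensions).
query = [1.0, 0.0, 1.0]
candidates = {
    "close": [0.9, 0.1, 0.8],
    "far": [0.0, 1.0, 0.0],
}

# Rank candidates by similarity to the query, most similar first.
ranked = sorted(
    candidates,
    key=lambda name: cosine_similarity(query, candidates[name]),
    reverse=True,
)
print(ranked)
```

In practice, an embedding model would supply the vectors; the ranking step stays the same.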

Advanced document retrieval (90 minutes)

Presentation: Vector databases; sparse document vectors (lexical retrieval); rank fusion algorithms and cross-encoders
Hands-on exercises: Use an (embedded) vector database; implement a rank fusion algorithm; use cross-encoders; compare with previous results; improve solution with a vector database
Use the reciprocal rank fusion algorithm to work with several models in parallel (boosting the performance)
Refine the results with cross-encoders
Q&A
Break
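
Reciprocal rank fusion, mentioned in the exercises above, can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the function name, the document IDs, and the two input rankings are hypothetical, and the constant `k=60` is a commonly used default rather than a value taken from the course.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    where rank starts at 1 for the top result.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results from a dense (embedding) and a lexical retriever.
dense_hits = ["doc_b", "doc_a", "doc_c"]
lexical_hits = ["doc_a", "doc_d", "doc_b"]

fused = reciprocal_rank_fusion([dense_hits, lexical_hits])
print(fused)
```

Because the method uses only ranks, not raw scores, it can combine retrievers whose scores are not directly comparable. Documents that appear near the top of several lists move to the front of the fused ranking.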

Using existing software (60 minutes)

Presentation: Open source solutions for document retrieval; features of LangChain, txtai and LlamaIndex
Hands-on exercises: Install software; use existing software like txtai or LangChain to achieve a similar result; explore deployment options
Q&A

Your Instructor

Christian Winkler
Christian Winkler is a professor at the Technical University of Applied Science in Nürnberg, where he concentrates on the latest research in natural language processing and, specifically, in the application of large language models. He coauthored Blueprints for Text Analytics Using Python for O’Reilly and has written many articles about NLP.


Skills covered

Large Language Models (LLMs)
