
Fine-Tuning Open Weight Large Language Models

Published by O'Reilly Media, Inc.

Content level: Intermediate to advanced

Get the most out of BERT, Llama, and Mistral and tailor them to your needs

Course outcomes

  • Understand the different types of LLMs (encoders and decoders) and their limitations
  • Gain insight into transfer learning and how it relates to fine-tuning
  • Understand the challenges related to fine-tuning
  • Perform fine-tuning for the different types of LLMs
  • Know about possible hardware scenarios

Course description

There is an incredible number of large language models available on Hugging Face. Many are generic; some are suited to very specific requirements. The ML community has invested a lot of work in fine-tuning these models for tasks such as question answering or chatbots. But what if you want to fine-tune your own model, for example on company-internal data? What seems very difficult and feasible only for experts with immensely expensive hardware turns out to be not so difficult after all.

Join expert Christian Winkler for a structured and accessible introduction to fine-tuning open weight LLMs using free software. You’ll fine-tune your own model, adapting a base model to your functional needs (such as question answering or domain-specific vocabulary) with code presented in the course and available in a GitHub repo. You’ll also discover how these models can excel on less powerful hardware thanks to new approaches to quantization.
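
As a taste of the quantization topic, here is a minimal sketch of loading an open-weight model in 4-bit precision with Hugging Face Transformers and bitsandbytes; the model name and settings are illustrative assumptions, not necessarily what the course uses:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    model_id = "mistralai/Mistral-7B-v0.1"  # illustrative open-weight model

    # 4-bit NF4 quantization keeps memory usage low enough for a single consumer GPU
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=bnb_config,
        device_map="auto",
    )

    inputs = tokenizer("Fine-tuning lets us", return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=30)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))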

What you’ll learn and how you can apply it

  • Develop Domain-Specific Experts: Move beyond "general" AI by teaching models your industry’s unique vocabulary, from legal compliance to specialized technical support.
  • Reduce Cloud Costs: Stop overpaying for API calls by fine-tuning smaller, open-weight models that perform as well as larger proprietary models for specific tasks.
  • Ensure Data Sovereignty: Apply fine-tuning techniques to internal datasets on your own infrastructure, keeping sensitive company data completely private.
  • Optimize for Consumer Hardware: Deploy your fine-tuned models on accessible hardware like Mac M-series chips or mid-range NVIDIA cards.
  • Enhance RAG Pipelines: Use fine-tuned embedding models to make your Retrieval-Augmented Generation systems significantly more accurate and context-aware.

This live event is for you because...

  • You’re a data scientist, ML engineer, or NLP developer.
  • You want to adapt LLMs to your specific needs.
  • ChatGPT is not enough for you.

Prerequisites

  • Working knowledge of Python and Jupyter Notebooks
  • Basic understanding of machine learning
  • Familiarity with Hugging Face Transformers (helpful but not required)

Recommended preparation:

  • Set up a Google Colab account (for a local installation, a powerful GPU is necessary)


Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Short introduction to LLMs (30 minutes)

  • Group discussion: Which open-source LLMs are you using?
  • Presentation: BERT-like models (classification, inference, embeddings); GPT-like models (text generation); transfer learning as the basis for fine-tuning (illustrated in the sketch below)
  • Q&A
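
The difference between the two model families is visible in just a few lines of code. A minimal sketch, assuming two well-known public checkpoints (not necessarily those used in the course):

    from transformers import pipeline

    # Encoder (BERT-like): classification, inference, embeddings
    classifier = pipeline("text-classification",
                          model="distilbert-base-uncased-finetuned-sst-2-english")
    print(classifier("Fine-tuning is surprisingly accessible."))

    # Decoder (GPT-like): text generation
    generator = pipeline("text-generation", model="gpt2")
    print(generator("Transfer learning means", max_new_tokens=20)[0]["generated_text"])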

Fine-tuning BERT models (60 minutes)

  • Group discussion: Who has fine-tuned models? How long does it take to fine-tune models?
  • Presentation: Classification; embeddings
  • Hands-on exercises: Fine-tune a BERT model; fine-tune an embeddings model with sentence transformers (SBERT); see the sketch below
  • Q&A
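
A minimal sketch of what the classification exercise might look like with the Hugging Face Trainer; the dataset, checkpoint, and hyperparameters are assumptions for illustration, and the embeddings exercise with sentence transformers follows a similar pattern:

    from datasets import load_dataset
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    model_id = "bert-base-uncased"   # assumed base checkpoint
    dataset = load_dataset("imdb")   # assumed example dataset (binary sentiment)

    tokenizer = AutoTokenizer.from_pretrained(model_id)

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

    # Tokenize and keep only a small subset so the example runs quickly
    train_ds = dataset["train"].shuffle(seed=42).select(range(2000)).map(tokenize, batched=True)

    model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

    args = TrainingArguments(
        output_dir="bert-finetuned",
        per_device_train_batch_size=16,
        num_train_epochs=1,
        learning_rate=2e-5,
    )

    trainer = Trainer(model=model, args=args, train_dataset=train_ds)
    trainer.train()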

Fine-tuning GPT models (120 minutes)

  • Presentation: LoRA and PEFT; fine-tuning with Hugging Face Transformers; fine-tuning with the Hugging Face Trainer and the Unsloth framework
  • Hands-on exercises: Fine-tune using the Hugging Face and Unsloth frameworks (see the LoRA sketch below)
  • Q&A
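
A minimal sketch of attaching LoRA adapters with the PEFT library, the parameter-efficient approach this section covers; the checkpoint, target modules, and hyperparameters are illustrative assumptions:

    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForCausalLM

    model_id = "mistralai/Mistral-7B-v0.1"   # illustrative open-weight model
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    lora_config = LoraConfig(
        r=16,                                  # rank of the low-rank update matrices
        lora_alpha=32,
        lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],   # attention projections, typical for Llama/Mistral-style models
        task_type="CAUSAL_LM",
    )

    # The base weights stay frozen; only the small adapter matrices are trained
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()         # typically well under 1% of all weights

The resulting PEFT model can then be passed to a trainer (for example the Hugging Face Trainer or Unsloth's training utilities); training only the small adapters is what keeps fine-tuning feasible on a single GPU.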

Hardware (30 minutes)

  • Group discussion: Where do you want to run your models? Which hardware is most suitable for your usage scenario?
  • Presentation: Running on CPUs; running on GPUs (local or cloud); running on Apple hardware (see the device-selection sketch below)
  • Q&A
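
A minimal sketch of selecting the best available backend in PyTorch, mirroring the CPU, NVIDIA GPU, and Apple Silicon options discussed in this section:

    import torch

    if torch.cuda.is_available():
        device = torch.device("cuda")   # local or cloud NVIDIA GPU
    elif torch.backends.mps.is_available():
        device = torch.device("mps")    # Apple Silicon (M-series) GPU
    else:
        device = torch.device("cpu")    # fallback: works everywhere, just slower

    print(f"Running on: {device}")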

Your Instructor

  • Christian Winkler

    Christian Winkler has worked in NLP for many years and has written numerous articles on the topic. He is co-author of the O’Reilly book Blueprints for Text Analytics Using Python. As a professor at the Technical University of Applied Science in Nürnberg, he concentrates on the latest research in natural language processing, specifically the application of large language models.

Skills covered

  • Prompt Engineering
  • QA / Testing
  • GPT