Create an On-Premises AI Solution
Published by Pearson
Essential skills for hosting GenAI solutions in your IT infrastructure
- Learn essential AI skills for IT operators
- Configure your data center for hosting internal AI solutions
- Tweak systems according to the required use case
AI is changing the way people work. Generic public AI services are widely available, but they don’t meet the requirements of companies that need to work with private data. In this course, expert trainer Sander van Vugt teaches the most effective ways to use AI in corporate data centers and how to provide the facilities required for hosting your own AI solutions. Throughout the course, you will work with common open-source AI solutions and learn how to run them on stand-alone servers or on Kubernetes. By the end of the course, you will have the essential knowledge needed to host GenAI solutions in your own IT infrastructure.
What you’ll learn and how you can apply it
- The 3 fundamentally different ways GenAI can be used
- How to choose the right LLM to integrate with your corporate data
- How to build and run your own AI solution on premises
This live event is for you because...
- You are integrating GenAI with your private data
- You are in IT operations and want to learn how adoption of GenAI changes the way you work
- You want to tweak systems for handling AI workloads in the most efficient way
Prerequisites
- Attendees need to be familiar with the Linux operating system and know how to configure it for handling common tasks
Course Set-up
- Attendees should have a Linux system to follow along with the demos in this course. Ubuntu Linux is recommended, but other Linux distributions are also supported. AI-specific hardware such as a GPU doesn’t have to be present in attendees’ systems, but it does make following along easier.
Recommended Preparation
- Attend: Linux Fundamentals Bootcamp by Sander van Vugt
- Watch: Linux Fundamentals, 2nd Edition, by Sander van Vugt
Recommended Follow-up
- Watch: Linux under the Hood, 2nd Edition, by Sander van Vugt
- Attend: Linux Under the Hood (Updated) by Sander van Vugt
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
DAY 1
Segment 1: Introduction (5 minutes)
Segment 2: GenAI and Machine Learning: An overview (15 minutes)
- What is GenAI
- What is machine learning
- Which other kinds of AI exist?
Segment 3: Using LLMs (50 minutes)
- Using an LLM for inference
- Choosing the right model
- Understanding quantization
- LLM families
- Using Open WebUI
- Lab: Test-drive an LLM with llama.cpp without GPU support
Q&A (5 minutes)
Break (5 minutes)
Segment 4: LLMs and Hugging Face (30 minutes)
- Hugging Face and open-source LLMs
- Demo: Discovering Hugging Face
- How to find the right LLM
Q&A (5 minutes)
Break (10 minutes)
Segment 5: Reducing LLM Usage System Requirements (30 minutes)
- How many parameters do you really need?
- Quantization
- The number of tokens
- Other LLM-based optimizations
- Lab: Tweaking llama.cpp-based LLM inference
Q&A (5 minutes)
Break (10 minutes)
Segment 6: Running LLMs on your own servers (50 minutes)
- Understanding the use of GPUs
- Calculating system requirements
- Using GPU features
- Adding GPU support for containers
- Lab: Running llama.cpp on GPU-based hardware
Segment 7: Tweaking your server for LLM usage (20 minutes)
- Why only Linux really matters
- Optimizing Linux performance for LLM-based tasks
- Lab: Running llama.cpp on GPU-based hardware
Day 1 wrap-up and Q&A (10 minutes)
DAY 2
Segment 8: Running an inference server (70 minutes)
- llama.cpp versus vLLM
- Requirements for using vLLM
- Running vLLM
- Basic configuration and tuning for vLLM
- Lab: Running vLLM
Q&A (5 minutes)
Break (5 minutes)
Segment 9: Using Kubernetes as a platform for GenAI (70 minutes)
- How applications are offered on Kubernetes
- Setting up a small Kubernetes cluster
- Exposing GPUs to Kubernetes applications
- Lab: Running a GPU-based application on Kubernetes
Q&A (5 minutes)
Break (5 minutes)
Segment 10: Adding Data to LLMs: An overview (30 minutes)
- Options for adding data to an LLM
- Prompt-based injection
- RAG
- Fine-tuning an LLM versus adapters
- Training an LLM from scratch
- When to use which
Q&A (5 minutes)
Break (5 minutes)
Segment 11: Using RAG (30 minutes)
- Understanding RAG usage options
- Configuring Open WebUI for RAG
- Testing RAG
- Lab: Configuring RAG
Course wrap-up and Q&A (10 minutes)
Your Instructor
Sander van Vugt
Sander van Vugt has many years of experience working with, writing about, and teaching Linux and open-source topics. He is the author of the best-selling Red Hat RHCSA Cert Guide and the Red Hat RHCSA Complete Video Course, along with many other titles on topics that include RHCE, Bash, Kubernetes, Ansible, and more. Sander also works as a Linux instructor, teaching on-site and online classes for customers around the world.