Create an On-Premises AI Solution

Published by Pearson

Intermediate

Essential skills for hosting GenAI solutions in your IT infrastructure

Learn essential AI skills for IT operators
Configure your data center for hosting internal AI solutions
Tweak systems according to the required use case

AI is changing the way people work. Generic public AI services are widely available, but they don’t meet the requirements for implementing AI within companies, where working with private data is required. In this course, expert trainer Sander van Vugt will teach you the most effective ways to use AI in corporate data centers and how to provide the facilities required for hosting your own AI solutions. Throughout the course, you will work with common open-source AI solutions and learn how to run them on stand-alone servers or on Kubernetes. By the end of the course, you will have the essential knowledge needed to host GenAI solutions in your own IT infrastructure.

What you’ll learn and how you can apply it

The 3 fundamentally different ways GenAI can be used
How to choose the right LLM to integrate with your corporate data
How to build an environment for running your own AI solution on premises

This live event is for you because...

You are integrating GenAI with your private data
You are in IT operations and want to learn how adoption of GenAI changes the way you work
You want to tweak systems for handling AI workloads in the most efficient way

Prerequisites

Attendees need to be familiar with the Linux operating system and know how to configure it for common tasks.

Course Set-up

Attendees should have a Linux system to follow along with the demos in this course. Ubuntu Linux is recommended, but other Linux distributions are also supported. AI-specific hardware such as GPUs doesn’t have to be present in attendees' systems, but it does make following along easier.

Recommended Preparation

Attend: Linux Fundamentals Bootcamp by Sander van Vugt
Watch: Linux Fundamentals, 2nd Edition, by Sander van Vugt

Recommended Follow-up

Watch: Linux under the Hood, 2nd Edition, by Sander van Vugt
Attend: Linux Under the Hood (Updated) by Sander van Vugt

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

DAY 1

Segment 1: Introduction (5 minutes)

Segment 2: GenAI and Machine Learning: An overview (15 minutes)

What is GenAI
What is machine learning
Which other kinds of AI exist?

Segment 3: Using LLMs (50 minutes)

Using an LLM for inference
Choosing the right model
Understanding quantization
LLM families
Using Open WebUI
Lab: Test-drive an LLM with llama.cpp without GPU support

Q&A (5 minutes)

Break (5 minutes)

Segment 4: LLMs and Hugging Face (30 minutes)

Hugging Face and open-source LLMs
Demo: Discovering Hugging Face
How to find the right LLM

Q&A (5 minutes)

Break (10 minutes)

Segment 5: Reducing LLM Usage System Requirements (30 minutes)

How many parameters do you really need?
Quantization
The number of tokens
Other LLM-based optimizations
Lab: Tweaking llama.cpp-based LLM inference

Q&A (5 minutes)

Break (10 minutes)

Segment 6: Running LLMs on your own servers (50 minutes)

Understanding the use of GPUs
Calculating system requirements
Using GPU features
Adding GPU support for containers
Lab: Running llama.cpp on GPU-based hardware

Segment 7: Tweaking your server for LLM usage (20 minutes)

Why only Linux really matters
Optimizing Linux performance for LLM-based tasks
Lab: Running llama.cpp on GPU-based hardware

Day 1 wrap-up and Q&A (10 minutes)

DAY 2

Segment 8: Running an inference server (70 minutes)

llama.cpp versus vLLM
Requirements for using vLLM
Running vLLM
Basic configuration and tuning for vLLM
Lab: Running vLLM

Q&A (5 minutes)

Break (5 minutes)

Segment 9: Using Kubernetes as a platform for GenAI (70 minutes)

How applications are offered on Kubernetes
Setting up a small Kubernetes cluster
Exposing GPUs to Kubernetes applications
Lab: Running a GPU-based application on Kubernetes

Q&A (5 minutes)

Break (5 minutes)

Segment 10: Adding Data to LLMs: An overview (30 minutes)

Options for adding data to an LLM
Prompt-based injection
RAG
Fine-tuning an LLM versus adapters
Training an LLM from scratch
When to use which

Q&A (5 minutes)

Break (5 minutes)

Segment 11: Using RAG (30 minutes)

Understanding RAG usage options
Configuring Open WebUI for RAG
Testing RAG
Lab: Configuring RAG

Course wrap-up and Q&A (10 minutes)

Your Instructor

Sander van Vugt
Sander van Vugt has many years of experience working with, writing about, and teaching Linux and Open Source topics. He is the author of the best-selling Red Hat RHCSA Cert Guide and the Red Hat RHCSA Complete Video Course along with many other titles on topics that include RHCE, Bash, Kubernetes, Ansible and more. Sander also works as a Linux instructor, teaching on-site and online classes for customers around the world.


Skill covered

Artificial Intelligence (AI)
