on-demand course

Introduction to LLM vulnerabilities

with Alfredo Deza

April 2024

Intermediate

1h 25m

English

Pragmatic AI Labs

Overview

This introductory course covers vulnerabilities in Large Language Models (LLMs) and language models in general. It provides a deep dive into the practical applications of LLMs using Azure's AI services.

Upon completion, learners will be able to:

Explain the concept of model replication or model shadowing as a potential attack vector in large language models, and describe methods to mitigate it through techniques like rate limiting and buffering.
Analyze the potential benefits and limitations of using pre-trained LLMs.
Develop strategies for mitigating risks and ethical considerations when deploying LLM-powered applications.
Describe the high-level process of creating a large language model, including data collection, cleaning, and training.
Explain the role of security in large language models and recognize potential security vulnerabilities and attack vectors.
Identify insecure plugin designs in large language model software development kits (SDKs) that could lead to remote execution and implement strategies to secure plugins.

You will learn how to secure your large language model (LLM) applications by addressing potential vulnerabilities. You will explore strategies to mitigate risks from insecure plugin design, including proper input validation and sanitization. Additionally, you will discover techniques to protect against sensitive information disclosure, such as using a redaction service to remove personally identifiable data from prompts and model responses. Finally, you will learn how to actively monitor your application dependencies for security updates and vulnerabilities, ensuring your system remains secure over time.

Week 1: Foundations of Language Models

This week you will get a brief overview of LLMs and how they work.

Learning Objectives

Analyze common types of generative applications and their architectures, including multi-model applications, and understand their challenges and benefits.
Explain the functioning of a multi-model application, including the role of the framework and specialized machine learning models.
Identify the advantages of smaller, specialized models in terms of resource usage, interaction speed, and deployment agility.
Compare and contrast different generative AI application types, such as API-based, embedded models, and multi-model applications, and understand their use cases and challenges.
Recognize the importance of large language models in various real-world applications, including text-based chatting, customer service, content creation, and daily tasks.
Evaluate the benefits and drawbacks of large language models, considering aspects like accuracy, privacy, and potential misuse.
Understand the basics of tokenization, indexing, and probability machines in the context of large language models.
Describe the high-level process of creating a large language model, including data collection, cleaning, and training.
Explain the role of security in large language models and recognize potential security vulnerabilities and attack vectors.
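
The tokenization, indexing, and probability objectives above can be illustrated with a toy sketch. It is not how production LLMs work: it assumes whitespace tokenization and simple bigram counts, and the function names (tokenize, build_index, next_token_probabilities) are hypothetical. It only shows the idea of turning text into tokens, mapping tokens to integer ids, and estimating which token is likely to come next.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Toy tokenizer: lowercase and split on whitespace (real models use subword tokenizers)."""
    return text.lower().split()


def build_index(tokens):
    """Assign each distinct token an integer id in order of first appearance."""
    vocab = {}
    for token in tokens:
        vocab.setdefault(token, len(vocab))
    return vocab


def next_token_probabilities(tokens):
    """Estimate P(next token | current token) from bigram counts."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return {
        current: {tok: n / sum(followers.values()) for tok, n in followers.items()}
        for current, followers in counts.items()
    }


tokens = tokenize("the cat sat on the mat")
index = build_index(tokens)
probabilities = next_token_probabilities(tokens)
```

In this sketch "the" is followed once by "cat" and once by "mat", so each gets an estimated probability of 0.5. A real model learns far richer statistics over much longer contexts.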

Week 2: Language Model Vulnerabilities

This week focuses on model-based vulnerabilities that you can explore with prompts.

Learning Objectives

Explain the concept of model replication or model shadowing as a potential attack vector in large language models, and describe methods to mitigate it through techniques like rate limiting and buffering.
Identify and demonstrate insecure output handling in large language models, and understand the potential security threats and attack vectors associated with it.
Understand prompt injection and its implications for large language models, including how certain applications define the initial behavior of these models and how to exploit implicit system prompts.
Recognize model theft vulnerabilities and understand how handling and access to system components can impact model security, particularly in the context of dynamically loaded models from external sources.
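
Rate limiting, named above as a mitigation for model replication, can be sketched as a per-client sliding window. This is a minimal illustration under stated assumptions, not the course's implementation: the class name SlidingWindowRateLimiter, its parameters, and the in-memory storage are all hypothetical, and a real deployment would typically enforce limits at an API gateway or in a shared store.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be exercised without sleeping
        self._hits = defaultdict(deque)  # client_id -> timestamps of accepted requests

    def allow(self, client_id):
        """Return True and record the request if the client is under its limit."""
        now = self.clock()
        hits = self._hits[client_id]
        # Discard timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

An application would call allow(client_id) before forwarding a prompt to the model and reject the request when it returns False. Slowing down bulk querying this way raises the cost of harvesting enough prompt and response pairs to imitate a model.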

Week 3: System vulnerabilities

This week you will learn how to deal with environments and system-based vulnerabilities as they relate to LLMs.

Learning Objectives

Identify insecure plugin designs in large language model software development kits (SDKs) that could lead to remote execution and implement strategies to secure plugins.
Explain the potential risks of sensitive information disclosure in large language models and implement measures to redact personally identifiable information using HTTP APIs and regular expressions.
Monitor and update dependencies in large language model applications to prevent potential security vulnerabilities and automate the process using tools like GitHub's Dependabot.
Evaluate application vulnerabilities based on the programming language and framework, and implement measures to prevent potential security threats.
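
The regular-expression side of the redaction objective above can be sketched as follows. The patterns are deliberately simple, US-centric assumptions chosen for illustration, and the names PII_PATTERNS and redact are hypothetical. The HTTP API approach mentioned in the objective is not shown here.

```python
import re

# Hypothetical, simplified patterns; real PII detection needs much broader coverage.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact(text):
    """Replace each match of a known PII pattern with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("Email jane.doe@example.com or call 555-123-4567."))
# prints "Email [EMAIL] or call [PHONE]."
```

The same function can be applied to a prompt before it is sent to a model and to the response before it is shown or logged, so that personal data does not cross either boundary.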

Week 4: Other types of vulnerabilities

Learning Objectives

Identify potential security threats and vulnerabilities associated with large and small language models.
Implement strategies to prevent security incidents and make environments more secure.
Recognize the concept of excessive agency in large language models and its potential impacts on functionality.
Explain the denial of service threat for large language models and describe methods to guard against API misuse.

About your instructor

Alfredo Deza has over a decade of experience as a Software Engineer doing DevOps, automation, and scalable system architecture. Before getting into technology he participated in the 2004 Olympic Games and was the first-ever World Champion in High Jump representing Peru. He currently works in Developer Relations at Microsoft and is an Adjunct Professor at Duke University teaching Machine Learning, Cloud Computing, Data Engineering, Python, and Rust. With Alfredo's guidance, you will gain the knowledge and skills to understand and work with vulnerabilities within language models.

Resources

Introduction to Generative AI

Responsible Generative AI and Local LLMs

Practical MLOps book

Publisher Resources

ISBN: 28562236VIDEOPAIML
