Privacy and Security for Large Language Models

Name: Privacy and Security for Large Language Models
Author: Baihan Lin
ISBN: 9781098160845

by Baihan Lin

January 2026

Intermediate to advanced

318 pages

8h 44m

English

O'Reilly Media, Inc.

Includes

Quizzes

Preface
    Who Should Read This Book
    Why I Wrote This Book
    Navigating This Book
    Conventions Used in This Book
    Using Code Examples
    O’Reilly Online Learning
    How to Contact Us
    Acknowledgments
1. Introduction
    The Rise of Large Language Models
    Privacy and Security Concerns in LLMs
    What This Book Covers
    Your Role in This Journey
    Summary
2. Understanding Large Language Models
    Fundamentals of Large Language Models
    Basic Building Blocks of Language Models
    Key Concepts in LLMs
    LLM Architectures
    Transformer Architecture
    Mixture of Experts Architecture
    Popular LLM Models
    Training Techniques for LLMs
    Pre-Training Techniques
    Fine-Tuning Techniques
    Retrieval-Augmented Generation
    Summary
3. Evaluating the Privacy and Security Risks of LLMs
    Privacy Metrics
    Differential Privacy
    Privacy Loss
    k-anonymity
    Privacy Considerations in RAG Systems
    Security Metrics
    Attack Success Rate (ASR)
    False Positive Rate (FPR) for Membership Inference
    Reconstruction Error for Model Inversion
    LLM Privacy and Security Audits
    Simulating Attacks
    LLMPrivacySecurityEvaluator: The All-in-One Auditor
    Modern Evaluation Frameworks and Benchmarks
    Summary
4. Privacy-Preserving Training Techniques
    A Real-World Example of Privacy Breach in the Training Phase
    Synthetic Data for Privacy Evaluation
    How to Apply LLMPrivacySecurityEvaluator on Your Data
    Differential Privacy for LLMs
    The Mathematical Foundation
    Implementing DP-SGD for LLMs
    Privacy Accounting in Practice
    Trade-Offs and Considerations
    Applying Differential Privacy to Retrieval-Augmented Generation
    Federated Learning with LLMs
    The Concept
    Implementing Federated Learning for LLMs
    Advantages and Challenges of Federated Learning
    Homomorphic Encryption in LLMs
    The Concept
    Implementing HE for LLMs
    Advantages and Challenges of Homomorphic Encryption
    Multi-Party Computation for Secure Aggregation
    The Concept
    Implementing MPC with Modern Libraries
    Advantages and Challenges of MPC
    Parameter-Efficient Fine-Tuning for Privacy
    Low-Rank Adaptation
    Quantized Low-Rank Adaptation
    Privacy-Preserving Data Transformation
    Data Anonymization and De-Identification
    Privacy-Preserving Data Augmentation
    Advantages and Challenges of Privacy-Preserving Data Augmentation
    Summary
5. Secure Deployment of LLMs
    Secure Model Hosting and Infrastructure
    Understanding Infrastructure Components
    Isolation Strategies
    Network Security
    Resource Management and Monitoring
    Secure APIs and Communications
    API Design Principles
    Implementation of Secure APIs
    Authentication and Authorization
    Secure Communication
    Secure Model Versioning and Updates
    Model Registry and Version Control
    Secure Update Process
    Summary
6. Adversarial Attacks and Defenses
    Understanding Adversarial Attacks on LLMs
    Taxonomy of Adversarial Attacks on LLMs
    Notable Attack Methods
    Embedding Space Attacks
    LLM Agent Attacks
    Impact of Model Scale and Architecture
    Case Study: Defending Against Jailbreaking Attacks
    Robust Fine-Tuning Techniques
    Adversarial Training
    Robust Optimization Techniques
    Data Augmentation for Robustness
    Prefix-Tuning and Prompt-Based Robustness
    Ensemble Methods
    Certifiably Robust Fine-Tuning
    Red-Teaming LLMs
    Red-Teaming Methodologies
    Implementing a Red-Teaming Program
    Red-Teaming Tools and Frameworks
    Automated Multiround Red-Teaming
    Case Study: Red-Teaming in Practice
    Adversarial Evaluation and Robustness Metrics
    Robustness Benchmarks
    Robustness Under Distribution Shift
    Human-in-the-Loop Evaluation
    Agent-Based Evaluation
    Standardized Attack Success Metrics
    Defense Evaluation Metrics
    Challenges in Robustness Evaluation
    Best Practices
    Future Directions in LLM Robustness
    Summary
7. Ethical Considerations in Fine-Tuning LLMs
    Bias and Fairness Issues in Personalization
    Understanding Bias in Fine-Tuned LLMs
    Measuring Fairness in Fine-Tuned Models
    Bias Mitigation Strategies
    Challenges in Privacy-Preserving Bias Mitigation
    Transparency and Explainability in Fine-Tuned Models
    The Explainability Challenge in LLMs
    Techniques for Explaining LLM Behavior
    Privacy-Preserving Explainability
    Addressing AI Bias with Privacy Constraints
    The Privacy-Fairness Trade-Off
    Group-Aware Privacy Mechanisms
    Bias-Aware Federated Learning
    Privacy-Preserving Bias Auditing
    Summary
8. Navigating the Cultural, Social, and Legal Landscapes
    A New Kind of Socio-Technical Systems
    Riding Amidst an AI-Mediated Cultural Evolution
    The Rise of AI-Generated Content and the Erosion of Trust
    Personalized AI and Identity Crisis in the Age of Surveillance Capitalism
    Existential Questions in Human-Machine Interaction
    Unveiling the Generative AI Supply Chain
    The Emergence of Machine Culture
    Adaptable Legal Frameworks for Regulation and Accountability
    The Case of Copyright and Intellectual Property in the Age of LLMs
    The Case of Data Privacy and Protection in Personalized AI Systems
    The Case of Algorithmic Bias and Discrimination in AI-Powered Decision Making
    The Case of Liability and Accountability in AI-Powered Systems
    Universal Challenges to Techno-Legal Solutionism
    Building a Responsible AI Culture
    AI Safety Beyond Algorithms: The Human Elements
    Summary
9. Building Privacy-Preserving AI Capabilities
    Healthcare AI in Action: Differentially Private Clinical Note Analysis
    The Healthcare Privacy Challenge
    Synthetic Data as a Privacy-Preserving Foundation
    LoRA: Efficient and Privacy-Friendly Fine-Tuning
    Privacy Accounting with RDP
    Real-World Deployment Considerations
    Legal AI in Action: Federated Learning Across Law Firms or Courts
    The Legal Confidentiality Imperative
    Federated Learning Architecture for Legal AI
    Secure Aggregation and Model Updates
    Legal and Ethical Considerations in Federated Legal AI
    Performance and Utility Evaluation
    Building Your Privacy-First AI Capability
    Organizational Readiness and Implementation Strategy
    Team Structure and Technology Decisions
    Governance Integration and Success Measurement
    Preparing for Tomorrow’s Privacy Landscape
    Technology Convergence and Regulatory Evolution
    Market Dynamics and Competitive Positioning
    A Strategic Position for the Future
    Summary
    Conclusion
    The Transformation You’ve Witnessed
    The Path We’re On
    Your Role in Shaping the Future

Index
About the Author

Content preview from Privacy and Security for Large Language Models

Chapter 6. Adversarial Attacks and Defenses

In the previous chapter, you explored the secure deployment of large language models (LLMs) from both engineering and organizational perspectives. You examined the infrastructure considerations, API design patterns, and access control mechanisms that help safeguard these models in production environments. However, even the most carefully deployed system remains vulnerable if the underlying model itself can be manipulated.

This chapter shifts our focus to the fascinating cat-and-mouse game between attackers and defenders in the LLM landscape. You’ll now don the hat of an adversary to understand how these models can be attacked and then pivot to examine the defensive measures that can protect them. Like other deep learning systems, LLMs are vulnerable to adversarial attacks: carefully crafted inputs designed to manipulate the model’s behavior in unintended and potentially harmful ways.

The stakes in this arena are significant. As LLMs become increasingly integrated into critical applications, from financial services and healthcare to content moderation and security systems, their vulnerabilities can lead to severe consequences. An attacker who successfully manipulates an LLM might bypass content filters to generate harmful content, extract private information used during training, or even compromise downstream systems that rely on the model’s outputs.

In this chapter, you’ll explore four key aspects of LLM security. First, ...

Publisher Resources

ISBN: 9781098160838
