Chapter 4. Privacy-Preserving Training Techniques
So far, you have learned how to build LLMs and how to evaluate their privacy and security posture. Now you are going to learn how to keep models trustworthy by building these protections directly into them. In this chapter, you'll explore a class of techniques that allow your AI to train on sensitive information while keeping that information confidential.
Privacy-preserving methods represent a critical frontier in AI development, especially as LLMs increasingly process personal, medical, financial, and other sensitive information. These approaches enable models to extract valuable patterns and insights from data without compromising the confidentiality of individual records or examples. They function by creating mathematical guarantees and cryptographic protections that limit what information can be extracted or inferred from the trained model.
In this chapter, you’ll explore several key techniques that allow AI systems to learn from sensitive information while maintaining strong privacy protections. These methods represent the intersection of machine learning, cryptography, and privacy theory, creating systems that can analyze data they cannot fully “see” in its original form.
We’ll cover five major classes of privacy-preserving techniques: differential privacy, federated learning, homomorphic encryption, multi-party computation, and privacy-preserving ...
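To make the first of these concrete before we dive in, here is a minimal sketch of the Laplace mechanism, the basic building block of differential privacy. The function names, the dataset, and the epsilon values are illustrative assumptions, not part of this chapter's code; the mechanism itself is standard: a count query has sensitivity 1, so adding Laplace noise with scale 1/epsilon yields an epsilon-differentially-private release.

```python
# Illustrative sketch of the Laplace mechanism (epsilon-differential privacy).
# All names here are hypothetical; only the mechanism is standard.
import math
import random


def laplace_noise(scale: float) -> float:
    """Draw one sample from a Laplace(0, scale) distribution via inverse CDF."""
    u = random.random() - 0.5          # uniform in (-0.5, 0.5)
    sign = 1.0 if u >= 0 else -1.0
    return -scale * sign * math.log(1 - 2 * abs(u))


def private_count(records, predicate, epsilon: float) -> float:
    """Release a noisy count satisfying epsilon-differential privacy.

    A counting query changes by at most 1 when one record is added or
    removed (sensitivity 1), so noise with scale 1/epsilon suffices.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)


if __name__ == "__main__":
    # Hypothetical sensitive dataset: patient ages.
    ages = [34, 71, 52, 45, 68, 29, 80, 57]
    # Smaller epsilon -> more noise -> stronger privacy, less accuracy.
    print(private_count(ages, lambda a: a >= 65, epsilon=0.5))
```

Note the trade-off this sketch exposes: with a small epsilon the released count can differ noticeably from the true value, which is exactly the price paid for the mathematical guarantee. The techniques covered in this chapter each manage a version of this privacy-utility tension.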