Adversarial Machine Learning

Name: Adversarial Machine Learning
Author: Jason Edwards
ISBN: 9781394402038

February 2026

Intermediate to advanced

400 pages

17h 17m

English

Wiley

Cover
Table of Contents
Title Page
Copyright
Dedication
Preface
Acknowledgments
From the Author
Introduction
    Why AI Security Matters Now
    Scope of This Book
    Who This Book Is For
    A New Security Discipline
    Looking Ahead
About the Companion Website

1 The Age of Intelligent Threats
    The Rise of AI as a Security Target
    Fragility in Intelligent Systems
    Categories of AI: Predictive, Generative, and Agentic
    Milestones in Adversarial Vulnerability
    Intelligence as an Attack Multiplier
    Why This Book and Who It's For
    Recommendations
    Conclusion
    Key Concepts
2 Anatomy of AI Systems and Their Attack Surfaces
    The Architecture of Predictive, Generative, and Agentic AI
    The AI Development Lifecycle: From Data to Deployment
    Classical Machine Learning vs. Modern AI Pipelines
    Identifying Entry Points: Training, Inference, and Supply Chain
    Security Debt in the Model Development Lifecycle
    Recommendations
    Conclusion
    Key Concepts
3 The Adversary's Playbook
    Threat Actors: Profiles, Motivations, and Objectives
    White-Box Attack Techniques and Methodologies
    Black-Box Attack Techniques and Methodologies
    Gray-Box Attack Techniques and Methodologies
    Operationalizing AI Attacks: Tactical Methodologies and Execution
    Advanced Multi-Stage and Coordinated AI Attacks
    Recommendations
    Conclusion
    Key Concepts
4 Evasion Attacks—Tricking AI Models at Inference
    Core Principles and Mechanisms of Evasion Attacks
    Gradient-Based Evasion Techniques
    Linguistic and Textual Evasion Methods
    Image- and Vision-Based Evasion Techniques
    Evasion Attacks on Time-Series and Sequential Models
    Recommendations
    Conclusion
    Key Concepts
5 Poisoning Attacks—Compromising AI Systems During Training
    Fundamentals and Mechanisms of Training-Time Poisoning
    Label Manipulation and Clean-Label Poisoning Techniques
    Backdoor and Trojan Insertion in Training Data
    Poisoning Attacks on Federated and Distributed Learning Systems
    Poisoning Attacks Against Reinforcement Learning (RL) Systems
    Poisoning Attacks on Transfer Learning and Fine-Tuning Processes
    Recommendations
    Conclusion
    Key Concepts
6 Privacy Attacks—Extracting Secrets from AI Models
    Core Mechanisms and Objectives of AI Privacy Attacks
    Membership Inference Techniques
    Model Inversion Attacks and Data Reconstruction
    Attribute and Property Inference Attacks
    Model Extraction and Functionality Reconstruction
    Exploiting Privacy Leakage Through Prompting Generative AI
    Recommendations
    Conclusion
    Key Concepts
7 Backdoor and Trojan Attacks—Embedding Hidden Behaviors in AI Models
    Fundamental Concepts of AI Backdoors and Trojans
    Backdoor Trigger Design and Optimization
    Data Poisoning Methods for Backdoor Embedding
    Trojan Attacks in Transfer and Fine-Tuning Scenarios
    Embedding Backdoors in Federated and Decentralized Training
    Advanced Trigger Embedding in Generative and Agentic AI Models
    Recommendations
    Conclusion
    Key Concepts
8 The Generative AI Attack Surface
    Architectural Foundations of Large Language Models
    How Generative Architectures Expand Attack Opportunities
    Exploiting Fine-Tuning as an Adversarial Vector
    Prompt Engineering as an Adversarial Exploitation Pathway
    Technical Risks in Retrieval-Augmented Generation Systems
    Leveraging Model Internals for Generative AI Exploitation
    Recommendations
    Conclusion
    Key Concepts
9 Prompt Injection and Jailbreak Techniques
    Technical Foundations of Prompt Injection Attacks
    Direct Prompt Injection Methods and Input Crafting
    Indirect Prompt Injection via External or Retrieved Content
    Jailbreak Techniques and Semantic Boundary Exploitation
    Token-Level and Embedding Space Manipulations
    Contextual and Conversational Injection Strategies
    Recommendations
    Conclusion
    Key Concepts
10 Data Leakage and Model Hallucination
    Technical Mechanisms of Data Leakage in Generative Models
    Membership and Attribute Inference via Generative Outputs
    Model Inversion and Training Data Reconstruction
    Hallucination Exploitation in Generative Outputs
    Prompt-Based Extraction of Memorized Data
    Exploiting Multi-Modal and Cross-Modal Leakage in Generative Models
    Recommendations
    Conclusion
    Key Concepts
11 Adversarial Fine-Tuning and Model Reprogramming
    Technical Foundations of Adversarial Fine-Tuning
    Semantic Perturbation Methods for Adversarial Fine-Tuning
    Embedding Covert Behaviors via Adversarial Prompt Conditioning
    Advanced Trojan Embedding via Fine-Tuning Gradients
    Cross-Model and Transferable Adversarial Fine-Tuning Attacks
    Model Reprogramming via Adversarial Fine-Tuning Techniques
    Recommendations
    Conclusion
    Key Concepts
12 Agentic AI and Autonomous Threat Loops
    Technical Foundations of Agentic AI Systems
    Technical Manipulation of Autonomous Decision Loops
    Exploitation of Agentic Memory and Context Management
    Agentic Tool Integration and External API Exploitation
    Technical Embedding of Autonomous Chain Injection
    Exploitation of Environmental Interactions and Stateful Vulnerabilities
    Recommendations
    Conclusion
    Key Concepts
13 Securing the AI Supply Chain
    Technical Mechanisms of Supply Chain Poisoning in AI Models
    Artifact and Model Checkpoint Contamination Techniques
    Technical Exploitation of Third-Party AI Libraries and Frameworks
    Dataset Provenance and Annotation Manipulation Techniques
    Technical Exploitation of Hosted and Cloud-based Model Infrastructure
    Artifact Repositories and Model Zoo Contamination Methods
    Recommendations
    Conclusion
    Key Concepts
14 Evaluating AI Robustness and Response Strategies
    Technical Foundations of AI Robustness Evaluation
    Metrics for Evaluating AI Security and Robustness
    Robust Optimization Methods and Adversarial Training
    Certified Robustness and Formal Verification Techniques
    Technical Benchmarking Tools and Evaluation Frameworks
    Technical Analysis of Robustness Across Model Architectures and Modalities
    Recommendations
    Conclusion
    Key Concepts
15 Building Trustworthy AI by Design
    Technical Foundations of Security-by-Design in AI Systems
    Robust Embedding and Representation Learning Methods
    Technical Approaches to Adversarially Robust Architectures
    Technical Integration of Formal Verification in Model Design
    Technical Frameworks for Runtime Anomaly Detection and Filtering
    Technical Embedding of Model Interpretability and Transparency
    Recommendations
    Conclusion
    Key Concepts
16 Looking Ahead—Security in the Era of Intelligent Agents
    Technical Foundations of Future Agentic AI Systems
    Emerging Technical Attack Vectors in Agentic Systems
    Technical Exploitation of Multi-Modal and Cross-Domain Agentic Capabilities
    Future Technical Capabilities in Automated Adversarial Generation
    Technical Mechanisms for Evaluating Advanced Agentic Robustness
    Technical Embedding of Ethical Constraints and Safety Mechanisms
    Recommendations
    Conclusion
    Key Concepts
Glossary
Index
End User License Agreement

Content preview from Adversarial Machine Learning

4 Evasion Attacks—Tricking AI Models at Inference

Evasion attacks represent one of the most immediate and operationally dangerous threats in the adversarial machine learning landscape. Unlike poisoning or backdoor attacks that target the training phase, evasion occurs at inference—precisely where AI systems are deployed, trusted, and acting in real time. These attacks exploit the fragile boundaries of learned models, manipulating inputs just enough to induce misclassification without triggering human suspicion or standard validation checks. As AI becomes deeply integrated into security-sensitive environments, from identity verification to autonomous systems, the ability to reliably detect and defend against evasion becomes central to AI risk governance.

Understanding the diverse mechanisms of evasion—from gradient-based perturbation in image classifiers to subtle linguistic manipulation in text models and temporal distortion in time-series systems—is essential to securing AI deployments across modalities. These attacks demonstrate how small changes in surface-level data can produce disproportionately harmful outcomes, revealing latent weaknesses in model generalization, embedding sensitivity, and feature attention. The chapter explores how attackers tailor perturbations to bypass defenses, disrupt detection, and degrade model performance—all while remaining within constraints that preserve realism and operational believability.
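
As a toy illustration of the gradient-based perturbation mentioned above, the following sketch applies a fast-gradient-sign-style step to a hand-written logistic regression. It is not taken from the book: the weights `W`, bias `B`, input `x`, and budget `eps` are made-up values, and the helper names (`score`, `prob`, `predict`, `fgsm`) are hypothetical. It only aims to show how a small, bounded change to each feature can flip a prediction.

```python
import math

# Hypothetical toy model: logistic regression with made-up parameters.
W = [2.0, -3.0, 1.0]
B = 0.1

def score(x):
    """Linear score w . x + b."""
    return sum(w * xi for w, xi in zip(W, x)) + B

def prob(x):
    """Probability of class 1 under the logistic model."""
    return 1.0 / (1.0 + math.exp(-score(x)))

def predict(x):
    """Hard label: 1 if the probability is at least 0.5, else 0."""
    return 1 if prob(x) >= 0.5 else 0

def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)

def fgsm(x, y, eps):
    """One fast-gradient-sign step against the true label y.

    For logistic regression with cross-entropy loss, the gradient of the
    loss with respect to the input is (p - y) * w. Each feature moves by
    at most eps in the direction that increases the loss.
    """
    p = prob(x)
    grad = [(p - y) * w for w in W]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Illustrative input (assumed values) that the model assigns to class 1.
x = [0.5, 0.2, 0.3]
y = 1
eps = 0.2

x_adv = fgsm(x, y, eps)
print("clean prediction:      ", predict(x))
print("adversarial prediction:", predict(x_adv))
print("largest feature change:", max(abs(a - b) for a, b in zip(x_adv, x)))
```

With these assumed values the clean input is classified as 1 and the perturbed input as 0, even though no feature moves by more than `eps`. Real attacks target far larger models and must also respect realism constraints on the input, which this sketch ignores.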

From automated perturbation pipelines to physical-world ...
