Chapter 10. Defending Against Adversarial Inputs
In this chapter, we’ll consider some of the methods that have been proposed for detecting and defending against adversarial example attacks. The good news is that some defenses can work. The bad news is that each defense has limitations, and if the attacker is aware of the method being used, they may be able to adapt their attack to circumvent the defense.
This chapter considers three broad approaches to defense:
- Improve the model: In the first part of this chapter, we'll focus on the model itself and techniques that have been proposed for creating more robust neural networks (a minimal adversarial training sketch follows this list).
- Remove adversarial aspects from input: In “Data Preprocessing”, we'll then look at whether it's possible to render adversarial input benign before it's submitted to the model (see the second sketch below).
- Minimize the adversary’s knowledge: Next, “Concealing the Target” will consider ways in which the adversary's knowledge of the target model and broader processing chain might be reduced to make it more difficult to create successful adversarial examples (see the third sketch below). As highlighted in Chapter 9, target concealment should not be relied upon as a defense.
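To give a flavor of the first approach before we get into the details, here is a minimal sketch of adversarial training using the Fast Gradient Sign Method (FGSM): each training batch is augmented with perturbed copies of itself so the network learns to classify them correctly. It assumes a Keras classifier ending in a softmax layer, integer class labels, pixel values scaled to [0, 1], and an illustrative perturbation size `epsilon`; it is a sketch of the general idea, not the exact procedure discussed later in the chapter.

```python
import tensorflow as tf

def fgsm_examples(model, x, y, epsilon=0.1):
    """Perturb a batch x in the direction that increases the loss (FGSM)."""
    x = tf.convert_to_tensor(x)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = tf.keras.losses.sparse_categorical_crossentropy(y, model(x))
    grad = tape.gradient(loss, x)
    x_adv = x + epsilon * tf.sign(grad)
    return tf.clip_by_value(x_adv, 0.0, 1.0)   # keep pixels in the valid range

def adversarial_training_step(model, optimizer, x, y, epsilon=0.1):
    """Train on a 50/50 mix of clean and adversarially perturbed examples."""
    x_adv = fgsm_examples(model, x, y, epsilon)
    x_mix = tf.concat([x, x_adv], axis=0)
    y_mix = tf.concat([y, y], axis=0)
    with tf.GradientTape() as tape:
        loss = tf.reduce_mean(
            tf.keras.losses.sparse_categorical_crossentropy(y_mix, model(x_mix)))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```

In practice the mixing ratio, the perturbation size, and the attack used to generate the training perturbations all affect how much robustness is gained, and the model tends to become robust only to perturbations similar to those it saw during training.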
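The second approach tries to strip the adversarial perturbation from the input before the model ever sees it. The sketch below applies two commonly proposed transformations to an image with pixel values in [0, 1]: bit-depth reduction (feature squeezing) and a lossy JPEG compression round trip. The function names, the chosen bit depth, and the JPEG quality setting are illustrative assumptions, and, as we'll see, such preprocessing can be circumvented by an attacker who knows it is in place.

```python
import numpy as np
from io import BytesIO
from PIL import Image

def squeeze_bit_depth(x, bits=4):
    """Quantize pixel values in [0, 1] down to 2**bits levels."""
    levels = 2 ** bits - 1
    return np.round(x * levels) / levels

def jpeg_compress(x, quality=75):
    """Round-trip an RGB image in [0, 1] through lossy JPEG compression."""
    img = Image.fromarray((x * 255).astype(np.uint8))
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return np.asarray(Image.open(buf)).astype(np.float32) / 255.0

def preprocess_then_predict(model, x):
    """Apply both transformations before handing the image to the classifier."""
    x_clean = jpeg_compress(squeeze_bit_depth(x))
    return model.predict(x_clean[np.newaxis, ...])
```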
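Finally, minimizing the adversary's knowledge is largely a matter of deployment practice rather than model code, but one simple step is to limit what a query interface reveals. The hypothetical wrapper below returns only the top predicted label, withholding the per-class confidence scores that score-based black-box attacks rely on. As noted above, this raises the attacker's cost but should not be relied upon as a defense on its own.

```python
import numpy as np

def classify_for_external_callers(model, x, class_names):
    """Return only the top-1 label: no scores, no ranking of alternatives."""
    probs = model.predict(np.expand_dims(x, axis=0))[0]
    return class_names[int(np.argmax(probs))]
```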
There’s currently no single solution to this problem, but it is an active area of research. Table 10-1 summarizes the capabilities of the defense techniques described in this chapter at the time of writing.
| Defense | Improve model robustness | Remove adversarial data characteristics | Minimize the adversary’s knowledge | … |
| --- | --- | --- | --- | --- |