Chapter 5. The Principles of Adversarial Input

This chapter looks at some of the core principles underpinning the generation of adversarial examples. We’ll hold off on the detailed mathematics and specific techniques for now, beginning instead by building on the ideas presented in the previous chapters. This discussion uses analogy and approximation to provide an intuitive understanding before delving into the details. The aim is to understand, at a high level, how the addition of adversarial perturbation or an adversarial patch could cause a DNN to return an incorrect result.

To recap:

Adversarial perturbation

A combination of small, imperceptible (or nearly imperceptible) changes distributed across the input data that causes the model to return an incorrect result. For an image, this might be small changes to several disparate pixels scattered across the image.

Adversarial patch

An addition to a specific area (spatial or temporal) of the input data that causes the model to return an incorrect result. An adversarial patch is likely to be perceptible to a human observer, but could be disguised as something benign. A minimal sketch of both forms of modification follows these definitions.
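
To make the two forms of modification concrete, here is a minimal NumPy sketch. It is not an attack in itself: the perturbation below is random noise bounded by a small epsilon, and the patch is a plain gray square, purely to show where and how each modification touches the image. A real adversary would choose the perturbation or patch contents by optimizing against the target model, as later chapters describe. The image shape, the epsilon value, and the patch size are illustrative assumptions.

    import numpy as np

    # Illustrative only: a 224x224 RGB image with pixel values in [0, 1].
    image = np.random.rand(224, 224, 3)

    # --- Adversarial perturbation ---
    # Small changes spread across the whole input, bounded so that no pixel
    # moves by more than epsilon. Here the perturbation is random noise;
    # a real attack would optimize it against the target model.
    epsilon = 8 / 255
    perturbation = np.random.uniform(-epsilon, epsilon, size=image.shape)
    perturbed_image = np.clip(image + perturbation, 0.0, 1.0)

    # --- Adversarial patch ---
    # A visible modification confined to one region of the input. Here a
    # 50x50 gray square is pasted into the top-left corner; a real patch
    # would contain an optimized pattern.
    patch = np.full((50, 50, 3), 0.5)
    patched_image = image.copy()
    patched_image[0:50, 0:50, :] = patch

    # The perturbation is tiny but everywhere; the patch is large but local.
    print("max per-pixel change (perturbation):",
          np.abs(perturbed_image - image).max())
    print("fraction of pixels changed (patch):",
          np.mean(np.any(patched_image != image, axis=-1)))

In both cases the target of the change is the model rather than the human observer: the perturbation stays at or below the threshold of perception, while the patch is perceptible but confined to a region that could pass as something benign, such as a sticker.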

This chapter considers the generation of adversarial perturbation and patches by direct manipulation of digital data. Adversarial patches and perturbation are easier to apply to the input in its digital form, but it may be possible to apply these techniques to the real world (altering traffic signs for autonomous vehicles, for example) to cause the sensor (camera or microphone) to ...
