Chapter 11. Red-Teaming XGBoost

In Chapter 5, we introduced a number of concepts related to the security of machine learning models. Now we will put them into practice. In this chapter, we’ll explain how to hack our own models so that we can add red-teaming to our model debugging repertoire. The main idea of the chapter is that if we know what hackers are likely to try against our model, we can try those attacks first and devise effective defenses. We’ll start with a concept refresher that revisits common ML attacks and countermeasures, then dive into examples of attacking an XGBoost classifier trained on structured data.1 We’ll introduce two XGBoost models: one trained with the standard unexplainable approach, and one trained with constraints and a high degree of L2 regularization. We’ll use these two models to illustrate the attacks and to test whether transparency and L2 regularization are adequate countermeasures. After that, we’ll move on to attacks that are likely to be performed by external adversaries against an unexplainable ML API: model extraction and adversarial example attacks. From there, we’ll try out insider attacks that involve making deliberate changes to an ML modeling pipeline: data poisoning and model backdoors. As a reminder, the chapter’s code examples are available online. Now, let’s get started. Remember to bring your tinfoil hat and your adversarial mindset from Chapter 5.
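To make the two-model setup concrete, here is a minimal sketch of how a constrained, heavily L2-regularized XGBoost classifier can be trained alongside a standard one. This is not the chapter's exact code: the synthetic data, feature count, constraint directions, and hyperparameter values are illustrative placeholders.

```python
# Minimal sketch (not the chapter's exact code): a standard XGBoost classifier
# versus one trained with monotonic constraints and strong L2 regularization.
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic structured data standing in for the chapter's dataset (placeholder).
X, y = make_classification(n_samples=5_000, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Standard, unconstrained model.
unconstrained = xgb.XGBClassifier(n_estimators=200, random_state=42)
unconstrained.fit(X_train, y_train)

# Constrained model: monotone_constraints fixes the direction of each feature's
# effect (+1 increasing, -1 decreasing, 0 unconstrained), and reg_lambda applies
# a high L2 penalty to the leaf weights.
constrained = xgb.XGBClassifier(
    n_estimators=200,
    monotone_constraints=(1, -1, 1, 0, 0),  # illustrative directions only
    reg_lambda=100.0,                       # high L2 penalty (illustrative value)
    random_state=42,
)
constrained.fit(X_train, y_train)
```

Both models can then be scored on the same held-out data and subjected to the same attacks, which is how the chapter compares the standard approach against the constrained, regularized one.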

Note

The web and academic literature abound with examples of, and tools ...
