Chapter 9. Differentially Private Machine Learning
Machine learning (ML) is the process of learning relationships and patterns in a data set, typically with an emphasis on predictive accuracy. Statistical modeling, as discussed in Chapter 8, places greater emphasis on model interpretability. This difference in emphasis happens to form a natural division in DP techniques.
ML model parameters can leak information about the training data, just as they can in statistical modeling. When you privately train a model, your goal is to release parameters/weights for the model that accurately capture the relationship between variables while protecting your sensitive data with the guarantees of differential privacy.
In this chapter, you will learn about a variety of techniques that are typically used to privately train ML models. Stochastic gradient descent (SGD) is a focal point, as it is the workhorse of non-DP ML training.
This chapter assumes a working knowledge of non-DP ML and relies heavily on concepts introduced in Chapters 3 through 6. While this may seem daunting, the chapter starts with a more approachable minimum viable DP-SGD before gradually mixing in more advanced tools.
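To make the starting point concrete, here is a minimal sketch of the two ingredients DP-SGD adds to ordinary SGD: clipping each example's gradient to bound any one record's influence, then adding Gaussian noise before the parameter update. The `dp_sgd_step` helper and all parameter names are hypothetical, not the chapter's implementation, and calibrating `noise_mult` to a concrete privacy budget is deliberately left out here.

```python
import numpy as np

def dp_sgd_step(weights, X, y, lr, clip_norm, noise_mult, rng):
    """One DP-SGD step for least-squares linear regression (illustrative
    sketch only): per-example gradient clipping + Gaussian noise."""
    clipped = []
    for xi, yi in zip(X, y):
        # Per-example gradient of the squared error: 2 * (x.w - y) * x
        g = 2.0 * (xi @ weights - yi) * xi
        # Clip to bound each example's contribution (sensitivity) by clip_norm
        g = g / max(1.0, np.linalg.norm(g) / clip_norm)
        clipped.append(g)
    # Sum the clipped gradients, then add noise scaled to the clipping bound
    noisy_sum = np.sum(clipped, axis=0) + rng.normal(
        scale=noise_mult * clip_norm, size=weights.shape)
    return weights - lr * noisy_sum / len(X)

# Synthetic data: y = X @ true_w + small observation noise
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

w = np.zeros(3)
for _ in range(500):
    w = dp_sgd_step(w, X, y, lr=0.1, clip_norm=1.0, noise_mult=0.5, rng=rng)
```

Even with clipping and noise, the recovered `w` lands close to `true_w` on this easy problem; the chapter's real treatment will show how the clipping norm and noise multiplier trade accuracy against the privacy guarantee.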
The chapter ends with a discussion and examples of frameworks and tools that will help you create DP ML models. Before diving in, we'll motivate the use of DP in this domain by discussing privacy attacks.
Why Make Machine Learning Models Differentially Private?
Suppose you are running a company that sells online educational ...