Between major privacy regulations like the GDPR and CCPA and expensive and notorious data breaches, there has never been so much pressure to ensure data privacy. Unfortunately, integrating privacy into data systems is still complicated. This essential guide will give you a fundamental understanding of modern privacy building blocks, like differential privacy, federated learning, and encrypted computation. Based on hard-won lessons, this book provides solid advice and best practices for integrating breakthrough privacy-enhancing technologies into production systems.
Practical Data Privacy answers important questions such as:
- What do privacy regulations like GDPR and CCPA mean for my data workflows and data science use cases?
- What does "anonymized data" really mean? How do I actually anonymize data?
- How do federated learning and analysis work?
- Homomorphic encryption sounds great, but is it ready for use?
- How do I compare and choose the best privacy-preserving technologies and methods? Are there open-source libraries that can help?
- How do I ensure that my data science projects are secure by default and private by design?
- How do I work with governance and infosec teams to implement internal policies appropriately?
Table of contents
1. Data Governance and Simple Privacy Approaches
- What Is Anonymization?
- Defining Differential Privacy
- Understanding Epsilon: What Is Privacy Loss?
- What Differential Privacy Guarantees, and What It Doesn’t
- Understanding Differential Privacy
- Differential Privacy with the Laplace Mechanism
- Exploring Other Mechanisms: Gaussian Noise for Differential Privacy
- Sensitivity and Privacy Units
- What About k-Anonymity?
3. Building Privacy into Data Pipelines
- How to Build Privacy into Data Pipelines
- Engineering Privacy and Data Governance into Pipelines
- Using Differential Privacy Libraries in Pipelines
- Collecting Data Anonymously
- Working with Data Engineering Team and Leadership
4. Privacy Attacks
- Privacy Attacks: Analyzing Common Attack Vectors
- Data Security
- Probabilistic Reasoning About Attacks
- Data Security Mitigations
5. Privacy-Aware Machine Learning and Data Science
- Using Privacy-Preserving Techniques in Machine Learning
- Open Source Libraries for PPML
- Architecting Privacy in Data and Machine Learning Projects
6. Federated Learning and Data Science
- Distributed Data
- Federated Learning
- Architecting Federated Systems
- Open Source Federated Libraries
- A Federated Data Science Future Outlook
7. Encrypted Computation
- What Is Encrypted Computation?
- When to Use Encrypted Computation
- Types of Encrypted Computation
- Real-World Encrypted Computation
- Getting Started with PSI and Moose
- Imagining a World with Secure Data Sharing
8. Navigating the Legal Side of Privacy
- GDPR: An Overview
- California Consumer Privacy Act (CCPA)
- Other Regulations: HIPAA, LGPD, PIPL, and More!
- Internal Policies and Contracts
- Working with Legal Professionals
- Data Governance 2.0
9. Privacy and Practicality Considerations
- Getting Practical: Managing Privacy and Security Risk
- Practical Privacy Technology: Use-Case Analysis
- Step-by-Step: How to Integrate and Automate Privacy in ML
- Embracing the Future: Working with Research Libraries and Teams
10. Frequently Asked Questions (and Their Answers!)
Encrypted Computation and Confidential Computing
- Is Secure Computation Quantum-Safe?
- Can I Use Enclaves to Solve Data Privacy or Data Secrecy Problems?
- What If I Need to Protect the Privacy of the Client or User Who Sends the Database Query or Request?
- Do Clean Rooms or Remote Data Analysis/Access Solve My Privacy Problem?
- I Want to Provide Perfect Privacy or Perfect Secrecy. Is That Possible?
- How Do I Determine That an Encrypted Computation Is Secure Enough?
- If I Want to Use Encrypted Computation, How Do I Manage Key Rotation?
- What Is Google’s Privacy Sandbox? Does It Use Encrypted Computation?
Data Governance and Protection Mechanisms
- Why Isn’t k-Anonymity Enough?
- I Don’t Think Differential Privacy Works for My Use Case. What Do I Do?
- Can I Use Synthetic Data to Solve Privacy Problems?
- How Should Data Be Shared Ethically or What Are Alternatives to Selling Data?
- How Can I Find All the Private Information That I Need to Protect?
- I Dropped the Personal Identifiers, so the Data Is Safe Now, Right?
- How Do I Reason About Data I Released in the Past?
- I’m Working on a BI Dashboard or Visualization. How Do I Make It Privacy-Friendly?
- Who Makes Privacy Engineering Decisions? How Do I Fit Privacy Engineering into My Organization?
- What Skills or Background Do I Need to Become a Privacy Engineer?
- Why Didn’t You Mention (Insert Technology or Company Here)? How Do I Learn More? Help!
GDPR and Data Protection Regulations
Personal Choices and Social Privacy
- What Email Provider, Browser, and Application Should I Use if I Care About My Privacy?
- My Friend Has an Automated Home or Phone Assistant. I Don’t Want It Listening to Me. What Should I Do?
- I Gave Up on Privacy a Long Time Ago. I Have Nothing to Hide. Why Should I Change?
- Can I Just Sell My Own Data to Companies?
- I Like Personalized Ads. Why Don’t You?
- Is (Fill in the Blank) Listening to Me? What Should I Do About It?
11. Go Forth and Engineer Privacy!
- Surveillance Capitalism and Data Science
- Vast Data Collection and Society
- Fighting Back
- Privacy Champions
- About the Author
- Title: Practical Data Privacy
- Release date: April 2023
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098129460