Practical Data Privacy

by Katharine Jarmul
Released July 2023
Publisher(s): O'Reilly Media, Inc.
ISBN: 9781098129446

Book description

Between major privacy regulations like the GDPR and CCPA and a string of expensive, notorious data breaches, data scientists have never been under so much pressure to ensure data privacy. Unfortunately, integrating privacy into your data science workflow is still complicated. This essential guide offers solid advice and best practices on breakthrough privacy-enhancing technologies such as encrypted learning and differential privacy, as well as a look at emerging technologies and techniques in the field.

Practical Data Privacy answers important questions such as:

  • What do privacy regulations like GDPR and CCPA mean for my project?
  • What does "anonymized data" really mean?
  • Should I anonymize the data? If so, how?
  • Which privacy techniques fit my project and how do I incorporate them?
  • What are the differences and similarities between privacy-preserving technologies and methods?
  • How do I use an open-source library for a privacy-enhancing technique?
  • How do I ensure that my projects are secure by default and private by design?
  • How do I create a plan for internal policies or a specific data project that incorporates privacy and security from the start?

Table of contents

  1. Preface
    1. Who Should Read this Book
      1. Privacy Engineering
    2. Why I Wrote This Book
    3. Navigating This Book
    4. Conventions Used in This Book
    5. Using Code Examples
    6. O’Reilly Online Learning
    7. How to Contact Us
  2. 1. Data Governance and Simple Privacy Approaches
    1. Data Governance: What is it?
    2. Identifying Sensitive Data
      1. Identifying PII
    3. Documenting Data for Use
      1. Finding and Documenting Unknown Data
      2. Basic Data Documentation
      3. Documenting Data Collection
      4. Documenting Data Quality
      5. Documenting Data Security
      6. Documenting Data Privacy
      7. Documenting Data Descriptions
      8. Documenting Data Statistics
      9. Tracking Data Lineage
      10. Data Version Control
    4. Basic Privacy: Pseudonymization for Privacy by Design
    5. Summary
  3. 2. Anonymization
    1. What is anonymization?
    2. Defining Differential Privacy
      1. Understanding Epsilon: What is privacy loss?
      2. What Differential Privacy Guarantees, and What it Doesn’t
    3. Understanding Differential Privacy
      1. Differential Privacy in Practice: Anonymizing the US Census
    4. Differential Privacy with the Laplace Mechanism
      1. Differential Privacy with Laplace: A Naive Attempt
      2. Sensitivity and Error
      3. Privacy Budgets & Composition
    5. Exploring Other Mechanisms: Gaussian Noise for Differential Privacy
      1. Comparing Laplace and Gaussian noise
      2. Real-World Differential Privacy: Debiasing Noisy Results
    6. Differential Privacy: A more nuanced definition
    7. What about k-anonymity?
    8. Summary
  4. 3. Building Privacy into Pipelines
    1. How to Build Privacy into Pipelines
      1. Design Appropriate Privacy Measures
      2. Meet the User Where They Are
      3. Engineer Privacy In
      4. Test and Verify
    2. Engineering Privacy and Data Governance into Pipelines
      1. An Example Data Sharing Workflow
      2. Adding Provenance and Consent Information to Collection
    3. Using Differential Privacy Libraries in Pipelines
    4. Collecting Data Anonymously
      1. Apple’s Differentially Private Data Collection
      2. Google’s Differential Privacy via Separation of Responsibility: Encode, Shuffle, Analyze
      3. Why Chrome’s Original Differential Privacy Method Died
    5. Working with Data Engineering Team and Leadership
      1. Share responsibility
      2. Create workflows with documentation and privacy
      3. Privacy as Core Value Proposition
    6. Summary
