Practicing Trustworthy Machine Learning

Book description

With the increasing use of AI in high-stakes domains such as medicine, law, and defense, organizations spend considerable time and money making ML models trustworthy. Many books on the subject offer deep dives into theories and concepts. This guide provides a practical starting point to help development teams produce models that are more secure, more robust, less biased, and more explainable.

Authors Yada Pruksachatkun, Matthew McAteer, and Subhabrata Majumdar translate best practices from the academic literature on dataset curation and model building into a blueprint for constructing industry-grade trusted ML systems. With this book, engineers and data scientists will gain a much-needed foundation for releasing trustworthy ML applications into a noisy, messy, and often hostile world.

You'll learn:

  • Methods to explain ML models and their outputs to stakeholders
  • How to recognize and fix fairness concerns and privacy leaks in an ML pipeline
  • How to develop ML systems that are robust and secure against malicious attacks
  • Important systemic considerations, like how to manage trust debt and which ML obstacles require human intervention

Table of contents

  1. Preface
    1. Implementing Machine Learning in Production
    2. The Transformer Convergence
    3. An Explosion of Large and Highly Capable ML Models
    4. Why We Wrote This Book
    5. Who This Book Is For
    6. AI Safety and Alignment
    7. Use of HuggingFace PyTorch for AI Models
    8. Foundations
    9. Conventions Used in This Book
    10. Using Code Examples
    11. O’Reilly Online Learning
    12. How to Contact Us
    13. Acknowledgments
  2. 1. Privacy
    1. Attack Vectors for Machine Learning Pipelines
    2. Improperly Implemented Privacy Features in ML: Case Studies
      1. Case 1: Apple’s CSAM
      2. Case 2: GitHub Copilot
      3. Case 3: Model and Data Theft from No-Code ML Tools
    3. Definitions
      1. Definition of Privacy
      2. Proxies and Metrics for Privacy
      3. Legal Definitions of Privacy
      4. k-Anonymity
    4. Types of Privacy-Invading Attacks on ML Pipelines
      1. Membership Attacks
      2. Model Inversion
      3. Model Extraction
    5. Stealing a BERT-Based Language Model
      1. Defenses Against Model Theft from Output Logits
      2. Privacy-Testing Tools
    6. Methods for Preserving Privacy
      1. Differential Privacy
      2. Stealing a Differentially Privately Trained Model
      3. Further Differential Privacy Tooling
      4. Homomorphic Encryption
      5. Secure Multi-Party Computation
      6. SMPC Example
      7. Further SMPC Tooling
      8. Federated Learning
    7. Conclusion
  3. 2. Fairness and Bias
    1. Case 1: Social Media
    2. Case 2: Triaging Patients in Healthcare Systems
    3. Case 3: Legal Systems
    4. Key Concepts in Fairness and Fairness-Related Harms
      1. Individual Fairness
      2. Parity Fairness
      3. Calculating Parity Fairness
    5. Scenario 1: Language Generation
    6. Scenario 2: Image Captioning
    7. Fairness Harm Mitigation
      1. Mitigation Methods in the Pre-Processing Stage
      2. Mitigation Methods in the In-Processing Stage
      3. Mitigation Methods in the Post-Processing Stage
    8. Fairness Tool Kits
    9. How Can You Prioritize Fairness in Your Organization?
    10. Conclusion
    11. Further Reading
  4. 3. Model Explainability and Interpretability
    1. Explainability Versus Interpretability
    2. The Need for Interpretable and Explainable Models
    3. A Possible Trade-off Between Explainability and Privacy
    4. Evaluating the Usefulness of Interpretation or Explanation Methods
    5. Definitions and Categories
      1. “Black Box”
      2. Global Versus Local Interpretability
      3. Model-Agnostic Versus Model-Specific Methods
      4. Interpreting GPT-2
    6. Methods for Explaining Models and Interpreting Outputs
      1. Inherently Explainable Models
      2. Local Model-Agnostic Interpretability Methods
      3. Global Model-Agnostic Interpretability Methods
      4. Explaining Neural Networks
      5. Saliency Mapping
      6. Deep Dive: Saliency Mapping with CLIP
      7. Adversarial Counterfactual Examples
    7. Overcome the Limitations of Interpretability with a Security Mindset
    8. Limitations and Pitfalls of Explainable and Interpretable Methods
    9. Risks of Deceptive Interpretability
    10. Conclusion
  5. 4. Robustness
    1. Evaluating Robustness
    2. Non-Adversarial Robustness
      1. Step 1: Apply Perturbations
      2. Step 2: Defining and Applying Constraints
      3. Deep Dive: Word Substitution with Cosine Similarity Constraints
    3. Adversarial Robustness
      1. Deep Dive: Adversarial Attacks in Computer Vision
      2. Creating Adversarial Examples
    4. Improving Robustness
    5. Conclusion
  6. 5. Secure and Trustworthy Data Generation
    1. Case 1: Unsecured AWS Buckets
    2. Case 2: Clearview AI Scraping Photos from Social Media
    3. Case 3: Improperly Stored Medical Data
    4. Issues in Procuring Real-World Data
      1. Using the Right Data for the Modeling Goal
      2. Consent
      3. PII, PHI, and Secrets
      4. Proportionality and Sampling Techniques
      5. Undescribed Variation
      6. Unintended Proxies
      7. Failures of External Validity
      8. Data Integrity
      9. Setting Reasonable Expectations
      10. Tools for Addressing Data Collection Issues
    5. Synthetically Generated Data
      1. DALL·E, GPT-3, and Synthetic Data
      2. Improving Pattern Recognition with Synthetic Data
      3. Deep Dive: Pre-Training a Model with a Process-Driven Synthetic Dataset
      4. Facial Recognition, Pose Detection, and Human-Centric Tasks
      5. Object Recognition and Related Tasks
      6. Environment Navigation
      7. Unity and Unreal Environments
      8. Limitations of Synthetic Data in Healthcare
      9. Limitations of Synthetic Data in NLP
      10. Self-Supervised Learned Models Versus Giant Natural Datasets
      11. Repurposing Quality Control Metrics for Security Purposes
    6. Conclusion
  7. 6. More State-of-the-Art Research Questions
    1. Making Sense of Improperly Overhyped Research Claims
      1. Shallow Human-AI Comparison Antipattern
      2. Downplaying the Limitations of the Technique Antipattern
      3. Uncritical PR Piece Antipattern
      4. Hyperbolic or Just Plain Wrong Antipattern
      5. Getting Past These Antipatterns
    2. Quantized ML
      1. Tooling for Quantized ML
      2. Privacy, Bias, Interpretability, and Stability in Quantized ML
    3. Diffusion-Based Energy Models
    4. Homomorphic Encryption
    5. Simulating Federated Learning
    6. Quantum Machine Learning
      1. Tooling and Resources for Quantum Machine Learning
      2. Why QML Will Not Solve Your Regular ML Problems
    7. Making the Leap from Theory to Practice
  8. 7. From Theory to Practice
    1. Part I: Additional Technical Factors
      1. Causal Machine Learning
      2. Sparsity and Model Compression
      3. Uncertainty Quantification
    2. Part II: Implementation Challenges
      1. Motivating Stakeholders to Develop Trustworthy ML Systems
      2. Trust Debts
      3. Important Aspects of Trust
      4. Evaluation and Feedback
      5. Trustworthiness and MLOps
    3. Conclusion
  9. 8. An Ecosystem of Trust
    1. Tooling
      1. LiFT
      2. Datasheets
      3. Model Cards
      4. DAG Cards
    2. Human-in-the-Loop Steps
      1. Oversight Guidelines
      2. Stages of Assessment
    3. The Need for a Cross-Project Approach
      1. MITRE ATLAS
      2. Benchmarks
      3. AI Incident Database
      4. Bug Bounties
    4. Deep Dive: Connecting the Dots
      1. Data
      2. Pre-Processing
      3. Model Training
      4. Model Inference
      5. Trust Components
    5. Conclusion
  10. A. Synthetic Data Generation Tools
  11. B. Other Interpretability and Explainability Tool Kits
    1. Interpretable or Fair Modeling Packages
    2. Other Python Packages for General Explainability
  12. Index
  13. About the Authors

Product information

  • Title: Practicing Trustworthy Machine Learning
  • Author(s): Yada Pruksachatkun, Matthew McAteer, Subhabrata Majumdar
  • Release date: January 2023
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098120276