Practicing Trustworthy Machine Learning

by Yada Pruksachatkun, Matthew McAteer, Subho Majumdar
Released January 2023
Publisher(s): O'Reilly Media, Inc.
ISBN: 9781098120276

Book description

With the increasing use of AI in high-stakes domains such as medicine, law, and defense, organizations spend a lot of time and money to make ML models trustworthy. Many books on the subject offer deep dives into theories and concepts. This guide provides a practical starting point to help development teams produce models that are secure, more robust, less biased, and more explainable.

Authors Yada Pruksachatkun, Matthew McAteer, and Subhabrata Majumdar translate best practices in the academic literature for curating datasets and building models into a blueprint for building industry-grade trusted ML systems. With this book, engineers and data scientists will gain a much-needed foundation for releasing trustworthy ML applications into a noisy, messy, and often hostile world.

You'll learn:

  • Methods to explain ML models and their outputs to stakeholders
  • How to recognize and fix fairness concerns and privacy leaks in an ML pipeline
  • How to develop ML systems that are robust and secure against malicious attacks
  • Important systemic considerations, like how to manage trust debt and which ML obstacles require human intervention

Table of contents

  1. Introduction
    1. Implementing Machine Learning in Production
    2. The Transformer Convergence
    3. Why we wrote this book
    4. Who this book is for
    5. Does this book have anything to do with AI safety or AI alignment?
    6. What do I do if I find an omission or error in this book?
    7. Acknowledgements
    8. References
  2. Privacy
    1. Improperly Implemented Privacy Features in Machine Learning: Case Studies
      1. Case #1: Apple’s CSAM
      2. Case #2: GitHub Copilot
      3. Case #3: Model and Data Theft from No-Code ML tools
    2. Definitions
      1. Definition of Privacy
      2. Proxies and metrics for Privacy
      3. Legal Definitions of Privacy
      4. k-anonymity
    3. Types of Privacy-invading Attacks on ML Pipelines
      1. Membership Attacks
      2. Model Inversion
      3. Model Extraction
    4. Deep-Dive Example (with code): Training (and then stealing) an ordinary BERT-based Language Model
      1. Privacy Testing Tooling
    5. Methods for Preserving Privacy
      1. Differential privacy (DP)
      2. Deep-Dive Example (with code): Stealing a differentially privately trained model
      3. Further Differential Privacy Tooling
      4. Homomorphic encryption (HE)
      5. Secure Multi-Party Computation (SMPC)
      6. Deep-Dive Example (with code): Secure Multi-Party Computation (SMPC) Example
      7. Further SMPC Tooling
      8. Federated Learning (FL)
    6. Bringing things together
    7. References
  3. Fairness and Bias
    1. Case #1: Social media
    2. Case #2: Triaging Patients in Healthcare Systems
    3. Case #3: Legal systems
    4. Key Concepts in Fairness and Fairness-Related Harms
      1. Individual Fairness
      2. Parity Fairness
      3. Calculating Parity Fairness
      4. Examples of Parity Fairness Framing and Calculation
    5. Scenario 1: Evaluating Fairness Harms in Language Generation using BOLD Dataset
    6. Scenario 2: Image Captioning
    7. Fairness Harm Mitigation
    8. Mitigation Methods in the Pre-processing Stage
    9. Mitigation Methods in the In-processing Stage
      1. Adversarial bias mitigation
      2. Regularization
    10. Mitigation Methods in the Post-processing Stage
    11. Fairness Toolkits
    12. How can you prioritize fairness in your organization?
    13. Conclusion
    14. References
  4. Model Explainability and Interpretability
    1. Explainability versus interpretability
    2. The need for interpretable and explainable models
    3. Limitations and Pitfalls of Explainable and Interpretable methods
      1. A Possible Tradeoff between Explainability and Privacy
    4. Evaluating the usefulness of Interpretation or explanation methods
      1. Deep Dive: Interpreting large language models like GPT-2
    5. Definitions and Categories
      1. “Black Box”
      2. Global versus Local interpretability
      3. Model-Agnostic versus Model-Specific methods
    6. Methods for Explaining Models and interpreting outputs
      1. Inherently explainable models
      2. Local Model-agnostic Interpretability methods
      3. Global Model-agnostic Interpretability methods
      4. Explaining Neural Networks
      5. Learned features
      6. Saliency mapping
      7. Deep Dive: Saliency mapping with CLIP
      8. Adversarial Counterfactual Examples
      9. Detecting Concepts
    7. Other Explainability/Interpretability Toolkits
      1. Interpretable (“Whitebox”) or Fair Modeling Packages
      2. Other Python Packages for General Explainability
    8. Having a “Security Mindset” to overcome the limitations of interpretability
      1. Risks of Deceptive Interpretability
    9. References
  5. Robustness
    1. Evaluating Robustness
    2. Non-Adversarial Robustness
      1. Step One: Apply Perturbations
      2. Computer Vision
      3. Language
    3. Deep Dive: Data Perturbation in Natural Language Processing
    4. Step Two: Defining and Applying Constraints
      1. Natural Language Processing
      2. Fluency
      3. Preserving semantic meaning
    5. Computer Vision
    6. Deep Dive: Word Substitution Data Augmentation with Cosine Similarity Constraints
    7. Adversarial Robustness
    8. Deep Dive: Adversarial Attacks in Computer Vision
      1. Creating Adversarial Examples
    9. Improving Robustness
      1. Conclusion
  6. From Theory to Practice
    1. Additional technical factors
      1. Causal machine learning
      2. Sparsity and model compression
      3. Uncertainty quantification
    2. Implementation Challenges
      1. Motivation to Develop Trustworthy ML Systems
      2. Important aspects of trust
      3. Evaluation and Feedback
      4. Trustworthiness and MLOps
    3. So What Should you Take Away from this Chapter?