Machine Learning for High-Risk Applications

Book description

The past decade has witnessed wide adoption of artificial intelligence and machine learning (AI/ML) technologies, but a lack of oversight into their implementation has resulted in harmful outcomes that could have been avoided. Before we can realize AI/ML's true benefit, practitioners must understand how to mitigate its risks. This book describes responsible AI, a holistic approach to improving AI/ML technology, business processes, and cultural competencies that builds on best practices in risk management, cybersecurity, data privacy, and applied social science.

It's an ambitious undertaking that requires a diverse set of talents, experiences, and perspectives. Data scientists and nontechnical oversight professionals alike need to be recruited and empowered to audit and evaluate high-impact AI/ML systems. Authors Patrick Hall, James Curtis, and Parul Pandey created this guide for a new generation of auditors and assessors who want to make AI systems better for organizations, consumers, and the public at large.

  • Learn how to create a successful and impactful responsible AI practice
  • Get a guide to existing standards, laws, and assessments for adopting AI technologies
  • Look at how existing roles at companies are evolving to incorporate responsible AI
  • Examine business best practices and recommendations for implementing responsible AI
  • Learn technical approaches for responsible AI at all stages of system development

Table of contents

  1. Preface
    1. Who Should Read This Book
    2. What Readers Will Learn
    3. Alignment with the NIST AI Risk Management Framework
    4. Preliminary Book Outline
      1. Part One
      2. Part Two
      3. How to Succeed in High-risk Applications
    5. Example Datasets
      1. Taiwan Credit Data
      2. Kaggle Chest X-ray Data
    6. Conventions Used in This Book
    7. Using Code Examples
    8. O’Reilly Online Learning
    9. How to Contact Us
    10. Acknowledgments
  2. 1. Contemporary Machine Learning Risk Management
    1. A Snapshot of the Legal and Regulatory Landscape
      1. The Proposed EU AI Act
      2. US Federal Laws and Regulations
      3. State and Municipal Laws
      4. Basic Product Liability
      5. Federal Trade Commission Enforcement
    2. Authoritative Best Practices
    3. AI Incidents
    4. Cultural Competencies for Machine Learning Risk Management
      1. Organizational Accountability
      2. Culture of Effective Challenge
      3. Diverse and Experienced Teams
      4. Drinking Our Own Champagne
      5. Moving Fast and Breaking Things
    5. Organizational Processes for Machine Learning Risk Management
      1. Forecasting Failure Modes
      2. Model Risk Management Processes
      3. Beyond Model Risk Management
    6. Case Study: The Rise and Fall of Zillow’s iBuying
      1. Fallout
      2. Lessons Learned
    7. Resources
  3. 2. Interpretable and Explainable Machine Learning
    1. Important Ideas for Interpretability and Explainability
    2. Explainable Models
      1. Additive Models
      2. Decision Trees
      3. An Ecosystem of Explainable Machine Learning Models
    3. Post-hoc Explanation
      1. Feature Importance
      2. Surrogate Models
      3. Linear Models and Local Interpretable Model-agnostic Explanations
      4. Plots of Model Performance
      5. Cluster Profiling
    4. Stubborn Difficulties of Post-hoc Explanation in Practice
    5. Pairing Explainable Models and Post-hoc Explanation
    6. Case Study: Graded by Algorithm
    7. Resources
  4. 3. Debugging Machine Learning Systems for Safety and Performance
    1. Training
      1. Reproducibility
      2. Data Quality
      3. Model Specification for Real-world Outcomes
      4. The Future of Safe and Robust Machine Learning
    2. Model Debugging
      1. Software Testing
      2. Traditional Model Assessment
      3. Machine Learning Bugs
      4. Residual Analysis
      5. Sensitivity Analysis
      6. Benchmark Models
      7. Remediation: Fixing Bugs
    3. Deployment
      1. Domain Safety
      2. Model Monitoring
    4. Case Study: Death by Autonomous Vehicle
      1. Fallout
      2. An Unprepared Legal System
      3. Lessons Learned
    5. Resources
  5. 4. Managing Bias in Machine Learning
    1. ISO and NIST Definitions for Bias
      1. Systemic Bias
      2. Statistical Bias
      3. Human Biases and Data Science Culture
    2. United States Legal Notions of ML Bias
    3. Who Tends to Experience Bias from ML Systems
    4. Harms That People Experience
    5. Testing for Bias
      1. Testing Data
      2. Traditional Approaches: Testing for Equivalent Outcomes
      3. A New Mindset: Testing for Equivalent Performance Quality
      4. On the Horizon: Tests for the Broader ML Ecosystem
      5. Summary Test Plan
    6. Mitigating Bias
      1. Technical Factors in Mitigating Bias
      2. The Scientific Method and Experimental Design
      3. Bias Mitigation Approaches
      4. Human Factors in Mitigating Bias
    7. Case Study: The Bias Bug Bounty
    8. Resources
  6. 5. Security for Machine Learning
    1. Security Basics
      1. The Adversarial Mindset
      2. CIA Triad
      3. Best Practices for Data Scientists
    2. Machine Learning Attacks
      1. Integrity Attacks: Manipulated Machine Learning Outputs
      2. Confidentiality Attacks: Extracted Information
    3. General AI Security Concerns
    4. Countermeasures
      1. Model Debugging for Security
      2. Model Monitoring for Security
      3. Privacy-enhancing Technologies
      4. Robust Machine Learning
      5. General Countermeasures
    5. Case Study: Real-world Evasion Attacks
      1. Lessons Learned
    6. Resources
  7. 6. Explainable Boosting Machines and Explaining XGBoost
    1. Concept Refresher: ML Transparency
      1. Additivity vs. Interactions
      2. Steps Toward Causality with Constraints
      3. Partial Dependence and Individual Conditional Expectation
      4. Shapley Values
      5. Model Documentation
    2. The GAM Family of Interpretable Models
      1. Elastic Net Penalized GLM with Alpha and Lambda Search
      2. Generalized Additive Models
      3. GA2M and Explainable Boosting Machines
    3. XGBoost with Constraints and Explainable Artificial Intelligence
      1. Constrained and Unconstrained XGBoost
      2. Explaining Model Behavior with Partial Dependence and ICE
      3. Decision Tree Surrogate Models as an Explanation Technique
      4. Shapley Value Explanations
    4. Resources
  8. 7. Debugging a PyTorch Image Classifier
    1. Concept Refresher: Debugging Deep Learning
    2. Debugging a PyTorch Image Classifier
      1. Data Quality and Leaks
      2. Software Testing for Deep Learning
      3. Sensitivity Analysis for Deep Learning
      4. Remediation
    3. Conclusion
    4. Resources
  9. 8. Testing and Remediating Bias with XGBoost
    1. Concept Refresher: Managing ML Bias
    2. Model Training
    3. Evaluating Models for Bias
      1. Testing Approaches for Groups
      2. Individual Fairness
      3. Proxy Bias
    4. Remediating Bias
      1. Pre-processing
      2. In-processing
      3. Post-processing
      4. Model Selection
    5. Conclusion
    6. Resources
  10. 9. Red-teaming XGBoost
    1. Concept Refresher
      1. CIA Triad
      2. Attacks
      3. Countermeasures
    2. Model Training
    3. Attacks for Red-teaming
      1. Model Extraction Attacks
      2. Adversarial Example Attacks
      3. Membership Inference Attacks
      4. Data Poisoning
      5. Backdoors
    4. Conclusion
    5. Resources
  11. About the Authors

Product information

  • Title: Machine Learning for High-Risk Applications
  • Author(s): Patrick Hall, James Curtis, Parul Pandey
  • Release date: June 2023
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098102432