Chapter 4. Technology: Engineering Machine Learning for Human Trust and Understanding
“If builders built houses the way programmers built programs, the first woodpecker to come along would destroy civilization.”
Gerald M. Weinberg
Human users of ML need to trust that any decision made by an ML system is maximally accurate, secure, and stable, and minimally discriminatory. We may also need to understand any decision made by an ML system for compliance, curiosity, debugging, appeal, or override purposes. This chapter discusses many technologies that can help organizations build human trust and understanding into their ML systems. We’ll begin by touching on reproducibility, because without it, you’ll never know whether your ML system is any better or worse today than it was in the past. We’ll then proceed to interpretable models and post hoc explanation, because insight into ML system mechanisms enables debugging of quality, discrimination, security, and privacy problems. After presenting some of these debugging techniques, we’ll close the chapter with a brief discussion of causality in ML.
Reproducibility
Establishing reproducible benchmarks to gauge improvements (or degradation) in accuracy, fairness, interpretability, privacy, or security is crucial for applying the scientific method. Reproducibility can also be necessary for regulatory compliance in certain cases. Unfortunately, the complexity of ML workflows makes reproducibility a real challenge. This section presents a few pointers for increasing reproducibility in your organization’s ML systems.
Metadata
Metadata about ML systems allows data scientists to track all the model artifacts that lead to a deployed model (e.g., datasets, preprocessing steps, models, data and model validation results, human sign-offs, and deployment details). Many of the additional reproducibility steps presented below are just specific ways to track ML system metadata. Tracking metadata also allows retracing of what went wrong, throughout the entire ML life cycle, when an AI incident occurs. For a nice open-source tool for tracking metadata, check out TensorFlow’s MLMD.
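If you aren’t ready to adopt a dedicated metadata tool, even a simple structured record written out with each training run captures much of the same value. The sketch below is a minimal, hypothetical example; every field name and value is illustrative rather than any particular tool’s schema.

import json
import time
import uuid

# A minimal, hypothetical record of one training run; adapt the fields to
# whatever artifacts and sign-offs your organization actually tracks.
run_metadata = {
    "run_id": str(uuid.uuid4()),
    "timestamp": time.strftime("%Y-%m-%dT%H:%M:%S"),
    "training_data": {"path": "data/train.csv", "n_rows": 100000},
    "preprocessing": ["impute_median", "one_hot_encode"],
    "model": {"type": "gradient_boosting", "params": {"max_depth": 5}},
    "validation": {"auc": 0.79},
    "sign_off": "jane.doe@example.com",
    "deployment": {"environment": "staging", "endpoint": "/v1/score"},
}

with open("run_metadata.json", "w") as f:
    json.dump(run_metadata, f, indent=2)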
Random Seeds
ML models are subject to something known as the “multiplicity of good models,” or the “Rashomon effect”: unlike more traditional linear models, there can be a huge number of acceptable ML models for any given dataset. ML models also rely on randomness, which can cause unexpected results. These factors conspire to make reproducible outcomes in ML more difficult to achieve than in traditional statistics and software engineering. Luckily, almost all contemporary, high-quality ML software comes with a “seed” parameter to help improve reproducibility. The seed typically starts the random number generator inside an algorithm at the same place every time. The key with seeds is to understand how they work in different packages and then use them consistently.
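Here is a minimal sketch of consistent seeding in Python, assuming a stack built on the standard library, NumPy, and common ML packages; the exact parameters to set will vary with the libraries in your own workflow.

import random

import numpy as np

SEED = 12345

# Seed every source of randomness you can identify, and do it the same way
# on every run.
random.seed(SEED)     # Python standard library
np.random.seed(SEED)  # NumPy and libraries built on top of it

# Many ML libraries also take an explicit seed argument, for example:
# sklearn.ensemble.RandomForestClassifier(random_state=SEED)
# xgboost.XGBClassifier(random_state=SEED)
# torch.manual_seed(SEED); tf.random.set_seed(SEED)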
Version Control
ML code is often highly intricate and typically relies on many third-party libraries or packages. Of course, changes in your code and changes to third-party code can change the outcomes of an ML system. Systematically keeping track of these changes is another good way to increase reproducibility, transparency, and your sanity. Git and GitHub are free and ubiquitous resources for software version control, but there are plenty of other options to explore. Pinning the versions of the ML libraries you use is also very important, as different versions of the same library can lead to differences in performance and accuracy; documenting and controlling those versions will often lead to better reproducibility. Also, remember that tracking changes to large datasets and other ML-related artifacts is different from tracking code changes. In addition to some of the environment tools we discuss in the next subsection, check out Pachyderm or DVC for data versioning.
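As a small, hedged example of version documentation, the snippet below records the exact versions of a few common libraries next to a model artifact; a pinned requirements.txt or a lock file serves the same purpose.

import json
from importlib.metadata import version

# Record the versions of the packages this model actually depends on;
# the package list here is only an example.
pinned = {pkg: version(pkg) for pkg in ["numpy", "pandas", "scikit-learn", "xgboost"]}

with open("model_environment.json", "w") as f:
    json.dump(pinned, f, indent=2)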
Environments
ML models are trained, tested, and deployed in an environment that is determined by software, hardware, and running programs. Ensuring a consistent environment for your ML model during training, testing, and deployment is critical. Different environments will most likely be detrimental to reproducibility (and just a huge pain to handle manually). Happily, many tools are now available to help data scientists and ML engineers preserve their computing environments. For instance, Python, sometimes called the lingua franca of ML, now includes virtual environments for preserving coding environments.
Virtual machines, and more recently, containers, provide a mechanism to replicate the entire software environment in which an ML system operates. When it comes to ML, the container framework is very popular. It can preserve the exact environment a model was trained in and be run later on different hardware—major pluses for reproducibility and easing ML system deployment! Moreover, specialized software has even been developed specifically to address environment reproducibility in data and ML workflows. Check out Domino Data Lab, Gigantum, Kubeflow Pipelines, and TensorFlow Extended to see what these specialized offerings look like.
Hardware
Hardware is the collection of physical components that enables a computer to run, which in turn allows ML code to run and, finally, ML systems to be trained and deployed. Of course, hardware can have a major impact on ML system reproducibility. Basic considerations here include ensuring similarity between the hardware used for training and the hardware used for deployment, and testing ML systems across different hardware with an eye toward reproducibility.
By taking stock of these factors, along with the benchmark models discussed later in this chapter, data scientists, ML and data engineers, and other IT personnel should be able to enhance your organization’s ML reproducibility capabilities. This is just a first step toward being more responsible with ML, but it should also lead to happier customers and faster ML product delivery over an ML system’s lifespan. And once you know your ML system is standing on solid footing, the next big technological step is to start applying interpretable and explainable ML techniques so you can know exactly how your system works.
Interpretable Machine Learning Models and Explainable AI
Interpretability is another basic requirement for mitigating risks in ML. It’s just more difficult to mitigate risks in a black-box system that you don’t understand. Hence, interpretability enables full debuggability. Interpretability is also crucial for human learning from ML results, enabling human appeal and override of ML outcomes, and often for regulatory compliance. Today, there are numerous methods for increasing ML’s interpretability, but they usually fall into two major categories: interpretable ML models and post hoc explanation techniques.
Interpretable Models
For decades, an informal belief in a so-called “accuracy-interpretability tradeoff” led most researchers and practitioners in ML to treat their models as supposedly accurate, but inscrutable, black boxes. In recent years, papers from leading ML scholars and several empirical studies have begun to cast serious doubt on the perceived tradeoff.1 There has been a flurry of papers and software for new ML algorithms that are nonlinear, highly accurate, and directly interpretable. Moreover, “interpretable” as a term has become more associated with these kinds of new models.
New interpretable models are often Bayesian or constrained variants of older ML algorithms, such as the explainable neural network (XNN) pictured in the online resources that accompany this report. In the example XNN, the model’s architecture is constrained to make it more understandable to human operators.
Another key concept is that interpretability is not a binary, on-off switch: XNNs are probably some of the most complex kinds of interpretable models, while scalable Bayesian rule lists, like some other interpretable models, produce architectures and results that are perhaps interpretable enough for business decision makers. Other interesting examples of interpretable ML models include:
Explainable boosting machines (EBMs, also known as GA2M)
Monotonically constrained gradient boosting machines
Skope-rules
Supersparse linear integer models (SLIMs)
RuleFit
Next time you’re starting an ML project, especially if it involves standard structured data sources, evaluate one of these accurate and interpretable algorithms. We hope you’ll be pleasantly surprised.
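As one hedged illustration, the sketch below fits a monotonically constrained gradient boosting machine with XGBoost on hypothetical data; the feature names, data, and constraint directions are ours for illustration, not a recipe for any particular application.

import numpy as np
import xgboost as xgb

# Hypothetical credit-style data: three features where domain knowledge
# dictates the direction of the relationship with the target.
rng = np.random.default_rng(12345)
X = rng.normal(size=(1000, 3))  # e.g., [debt_to_income, payment_history, utilization]
y = (X[:, 0] - X[:, 1] + 0.5 * X[:, 2] + rng.normal(size=1000) > 0).astype(int)

# monotone_constraints: +1 means predictions may only increase with the
# feature, -1 only decrease, 0 unconstrained.
model = xgb.XGBClassifier(
    n_estimators=100,
    max_depth=3,
    monotone_constraints="(1,-1,1)",
    random_state=12345,
)
model.fit(X, y)

Constraints like these encode human domain knowledge directly into the model’s architecture, which is a large part of what makes the resulting model easier to interpret and debug.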
Post hoc Explanation
Post hoc explanations are summaries of model mechanisms and results that are typically generated after ML model training. These techniques are also sometimes called explainable AI (XAI), and they can be roughly broken down into:
- Local feature importance measurements
For example, Shapley values, integrated gradients, and counterfactual explanations. Sometimes also referred to as “local” explanations, these can tell users how each input feature in each row of data contributed to a model outcome. These measures can also be crucial for the generation of adverse action notices in the US financial services industry.
- Surrogate models
For example, local interpretable model-agnostic explanations (LIME) or anchors. These are simpler models fit to the behavior of a more complex ML model, which can then be used to reason about that complex model.
- Visualizations of ML model results
For example, variable importance, accumulated local effect, individual conditional expectation, and partial dependence plots. These plots help summarize many different aspects of ML model results into consumable visualizations. They’re also helpful for model documentation requirements in US financial services applications.
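To make these techniques a little more concrete, here is a minimal sketch that pairs local Shapley-value explanations with partial dependence and ICE plots for a trained tree-based model, assuming the shap package and scikit-learn are available; the objects model and X are assumed to come from your own training code.

import shap
from sklearn.inspection import PartialDependenceDisplay

# Local feature importance: Shapley value estimates for each row and feature
# of a trained tree-based model (assumes `model` and a feature matrix `X`).
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)

# Global summary plot built from those local contributions.
shap.summary_plot(shap_values, X)

# Visualization of model behavior: partial dependence and individual
# conditional expectation (ICE) curves for two features of interest.
PartialDependenceDisplay.from_estimator(
    model, X, features=[0, 1], kind="both"  # "both" overlays PD and ICE
)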
Many of these post hoc explanation techniques can be applied to traditional ML black boxes to increase their interpretability, but these techniques also have to be used with care. They have known drawbacks involving fidelity, consistency, and comprehensibility. Fidelity refers to the accuracy of an explanation. Consistency refers to how much an explanation changes if small changes are made to training data or model specifications. And comprehensibility refers to human understanding of generated explanations. All of these drawbacks must be considered carefully when using XAI techniques. Likely one of the best ways to use post hoc explanations is with constrained and interpretable ML models. Both the constraints and the inherent interpretability can counterbalance concerns related to the validity of post hoc explanations. Pairing interpretable model architectures with post hoc explanation also sets the stage for effective model debugging and ML system testing.
Model Debugging and Testing Machine Learning Systems
Despite all the positive hype, there’s nothing about ML systems that makes them immune to the bugs and attacks that affect traditional software systems. In fact, due to their complexity, drift characteristics, and inherently stochastic nature, ML systems may be even more likely than traditional software to suffer from these kinds of incidents. Put bluntly, current model assessment techniques, like cross validation or receiver operating characteristic (ROC) and lift curves, just don’t tell us enough about all the incidents that can occur when ML models are deployed as part of public-facing, organizational IT systems.
This is where model debugging comes in. Model debugging is a practice that’s focused on finding and fixing problems in ML systems. In addition to a few novel approaches, the discipline borrows from model governance, traditional model diagnostics, and software testing. Model debugging attempts to test ML systems like computer code because ML models are almost always made from code. And it uses diagnostic approaches to trace complex ML model response functions and decision boundaries to hunt down and address accuracy, fairness, security, and other problems in ML systems. This section will discuss two types of model debugging: porting software quality assurance (QA) techniques to ML, and specialized techniques needed to find and fix problems in the complex inner workings of ML systems.
Software Quality Assurance for Machine Learning
ML is software. All the testing that’s done on traditional enterprise software assets should generally be done on ML as well.
Note
This is just the starting point! We give ample consideration to special ML risks in other sections of this report. This subsection simply aims to clarify that we recommend doing all the testing your organization is doing on traditional software assets on ML systems too—and then moving on to address the wide variety of risks presented by ML systems. Yes, that’s a lot of work. With great power comes great responsibility.
Unit tests should be written for data processing, optimization, and training code. Integration testing should be applied to ML system APIs and interfaces to spot mismatches and other issues. And functional testing techniques should be applied to ML system user interfaces and endpoints to ensure that systems behave as expected. Wrapping some of these testing processes and benchmarking into continuous integration/continuous deployment (CI/CD) processes can lead to efficiency gains and even higher ML software quality. To learn more about getting started with simple QA and model debugging, check out Google’s free course, Testing and Debugging in Machine Learning.
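As a small, hedged example of what a unit test for an ML pipeline might look like, the following pytest-style test exercises a hypothetical preprocessing function; the module and function names are placeholders for your own code.

# test_preprocessing.py -- a minimal pytest-style unit test for a
# hypothetical data preprocessing step in an ML pipeline.
import numpy as np
import pandas as pd

from pipeline import impute_and_scale  # hypothetical module and function


def test_impute_and_scale_handles_missing_values():
    raw = pd.DataFrame({"income": [50000.0, np.nan, 75000.0]})
    clean = impute_and_scale(raw)
    # No missing values should survive preprocessing...
    assert not clean["income"].isna().any()
    # ...and the row count should be preserved.
    assert len(clean) == len(raw)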
Specialized Debugging Techniques for Machine Learning
ML does present concerns above and beyond traditional software. As discussed in other report sections, ML poses some very specialized discrimination, privacy, and security concerns. ML systems can also just be wrong. In one famous case, a medical risk system asserted that asthma patients were at lower risk than others of dying from pneumonia. In another shocking instance, a self-driving car that was unprepared to handle jaywalking struck and killed a pedestrian.
Finding these types of bugs does require some specialized approaches, but it’s an absolute must for high-stakes ML deployments. Practical techniques for finding bugs in ML systems tend to be variants of sensitivity analysis and residual analysis. Sensitivity analysis involves simulating data and testing model performance on that data. Residual analysis is the careful study of model errors at training time. These two techniques, when combined with benchmark models, discrimination testing, and security audits, can find logical errors, blind spots, and other problems in ML systems. Of course, once bugs are found, they must be fixed. There are lots of good options for that too, including data augmentation, model assertions, and model editing, among others. For a summary of contemporary model debugging techniques, see Why You Should Care About Debugging Machine Learning Models. For code and examples of debugging an example consumer credit model, check out Real-World Strategies for Model Debugging. The next section will stay on the theme of model debugging and introduce benchmark models in more detail.
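The sketch below shows one bare-bones way to run both techniques against a trained binary classifier; model, X_valid, and y_valid are assumed to exist from your own training code, and the feature name is hypothetical.

import numpy as np
import pandas as pd

# Sensitivity analysis: perturb one feature across a grid of plausible values
# while holding the rest of a row fixed, and watch how predictions respond.
row = X_valid.iloc[[0]].copy()
grid = np.linspace(X_valid["income"].min(), X_valid["income"].max(), 25)
sensitivity = []
for value in grid:
    perturbed = row.copy()
    perturbed["income"] = value
    sensitivity.append(model.predict_proba(perturbed)[0, 1])
# Large, erratic swings across the grid can indicate instability or an
# exploitable attack surface.

# Residual analysis: study the rows the model gets most wrong.
resid = pd.DataFrame({
    "y": y_valid,
    "p": np.clip(model.predict_proba(X_valid)[:, 1], 1e-6, 1 - 1e-6),
})
resid["logloss"] = -(resid["y"] * np.log(resid["p"])
                     + (1 - resid["y"]) * np.log(1 - resid["p"]))
worst = resid.sort_values("logloss", ascending=False).head(20)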
Benchmark Models
Benchmark models are simple, trusted, or transparent models to which ML systems can be compared. They serve myriad risk mitigation purposes in a typical ML workflow, including use in model debugging and model monitoring.
Model Debugging
First, it’s always a good idea to check that a new complex ML model outperforms a simpler benchmark model. Once an ML model passes this baseline test, benchmark models can serve as debugging tools. Use them to test your ML model by asking questions like, “What did my ML model get wrong that my benchmark model got right? And can I see why?” Another important function that benchmark models can serve is tracking changes in complex ML pipelines. Running a benchmark model at the beginning of a new training exercise can help you confirm that you are starting on solid ground. Running that same benchmark after making changes can help to confirm whether changes truly improved an ML model or pipeline. Moreover, automatically running benchmarks as part of a CI/CD process can be a great way to understand how code changes impact complex ML systems.
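A minimal sketch of that workflow, assuming a scikit-learn-style classifier as the complex model and a logistic regression benchmark; the variable names are placeholders for your own training and validation artifacts.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Train a transparent benchmark and confirm the complex model beats it
# (assumes X_train, y_train, X_valid, y_valid, and a trained `ml_model`).
benchmark = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("benchmark AUC:", roc_auc_score(y_valid, benchmark.predict_proba(X_valid)[:, 1]))
print("ML model AUC:", roc_auc_score(y_valid, ml_model.predict_proba(X_valid)[:, 1]))

# Debugging question: which rows did the benchmark get right
# and the complex model get wrong?
bench_right = benchmark.predict(X_valid) == y_valid
ml_wrong = ml_model.predict(X_valid) != y_valid
rows_to_inspect = X_valid[np.asarray(bench_right) & np.asarray(ml_wrong)]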
Model Monitoring
Comparing simpler benchmark models and ML system predictions as part of model monitoring can help to catch stability, fairness, or security anomalies in near real time. Because of its simple mechanisms, an interpretable benchmark model should be more stable, easier to confirm as minimally discriminatory, and harder to hack. So the idea is to score new data with both a highly transparent benchmark model and your more complex ML system, and then compare your ML system’s predictions against the trusted benchmark. If the difference between the two is above some reasonable threshold, fall back to issuing the benchmark model’s prediction or send the row of data for manual processing. Also, record the incident; it might turn out to be meaningful later. (One concern when comparing an ML model against a benchmark model in production is the time it takes to score new data, i.e., increased latency.)
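In code, the comparison can be as simple as the hedged sketch below; the threshold, fallback policy, and logging are placeholders to adapt to your own stack.

# A minimal scoring-time sketch of benchmark-based monitoring.
DISAGREEMENT_THRESHOLD = 0.25

def score_with_fallback(row, ml_model, benchmark_model, log):
    """Score a single-row feature frame with both models and compare."""
    p_ml = ml_model.predict_proba(row)[0, 1]
    p_bench = benchmark_model.predict_proba(row)[0, 1]
    if abs(p_ml - p_bench) > DISAGREEMENT_THRESHOLD:
        # Record the incident; it might turn out to be meaningful later.
        log.warning("ML and benchmark disagree: %.3f vs %.3f", p_ml, p_bench)
        return p_bench  # or route the row for manual review
    return p_ml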
Given the host of benefits that benchmark models can provide, we hope you’ll consider adding them into your training or deployment technology stack.
Discrimination Testing and Remediation
Another critical model debugging step is discrimination testing and remediation. A great deal of effort has gone into these practices over the past several decades. Discrimination tests run the gamut from simple arithmetic, to tests with long-standing legal precedent, to cutting-edge ML research. Approaches used to remediate discrimination usually fall into two major categories:
- Searching across possible algorithmic and feature specifications as part of standard ML model selection.
- Attempting to minimize discrimination in training data, ML algorithms, and in ML system outputs.
These are discussed in more detail below. While picking the right tool for discrimination testing and remediation is often difficult and context sensitive, ML practitioners must make this effort. If you’re using data about people, it probably encodes historical discrimination that will be reflected in your ML system outcomes, unless you find and fix it. This section will present the very basics of discrimination testing and remediation in hopes of helping your organization get a jump start on fighting this nasty problem.
Testing for Discrimination
In terms of testing for ML discrimination, there are two major problems for which to be on the lookout: group disparities and individual disparities. Group disparities occur when a model’s outcome is unfair across demographic groups by some measure or when the model exhibits different accuracy or error characteristics across different demographic groups—most open source packages test for these kinds of disparities.
Individual disparity is a much trickier concept, and if you’re just starting to test for discrimination in ML, it may not be your highest priority. Basically, individual disparity occurs when a model treats individuals differently even though they are similar in all respects except for some demographic information. This can happen in ML for many reasons, such as overly complex decision boundaries, where a person in a historically marginalized demographic group is placed on the harmful side of an ML decision outcome without good reason. It can also happen when an ML system learns to combine some input data features into proxy features for someone’s unique demographic information.
While functionality to find counterfactual or adversarial examples is becoming more common (e.g., Google’s What-If Tool), testing for individual disparity is typically more involved than testing for group disparities. Today, it just takes some snooping around: looking at many individuals in your data, training adversary models or using special training constraints, tracing decision boundaries, and using post hoc explanation techniques to understand whether features in your models are local proxies for demographic variables. Of course, doing all this extra work is never a bad idea, as it can help you understand drivers of discrimination in your ML system, whether they show up as group or individual disparities. And these extra steps can be used later in your ML training process to confirm whether any applied remediation measures were truly successful.
Remediating Discovered Discrimination
If you find discrimination in your ML system, what do you do? The good news is that you have at least two major remediation strategies to apply: one tried and true, the other more cutting edge but potentially a little risky in regulated industries. We’ll provide some details on these strategies below.
Strategy 1
Strategy 1 is the traditional strategy (and safest from a US regulatory perspective). Make sure to use no demographic features in your model training, and simply check standard discrimination metrics (like adverse impact ratio or standardized mean difference) across an array of candidate ML models. Then select the least discriminatory model that is accurate enough to meet your business needs. This is often the strategy used today in highly regulated areas like lending and insurance.
Figure 4-1 illustrates how simply considering a discrimination measure, adverse impact ratio (AIR) for African Americans versus Caucasians in this case, during ML model selection can help find accurate and less discriminatory models. AIR is usually accompanied by the four-fifths rule practical significance test, wherein the ratio of positive outcomes for a historically marginalized demographic versus the positive outcomes for a reference group, often Whites or males, should be greater than 0.8, or four-fifths.
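For concreteness, the adverse impact ratio is just a ratio of positive-outcome rates, as in the hedged sketch below; the group labels, outcomes, and the 0.8 cutoff are illustrative.

import pandas as pd

def adverse_impact_ratio(outcomes, groups, protected, reference):
    """AIR = positive-outcome rate for the protected group divided by the
    positive-outcome rate for the reference group; values below 0.8 fail the
    four-fifths rule of thumb."""
    df = pd.DataFrame({"y": outcomes, "group": groups})
    rate = df.groupby("group")["y"].mean()
    return rate[protected] / rate[reference]

# Hypothetical usage, where 1 = favorable outcome (e.g., loan approved):
# air = adverse_impact_ratio(preds, demo_col, "African American", "Caucasian")
# Flag candidate models with air < 0.8 during model selection.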
Strategy 2
Strategy 2 includes newer methods from the ML, computer science, and fairness research communities.
Fix your data
Today, in less regulated industrial sectors, you’ll likely be able to use software packages that can help you resample or reweight your data so that it brings less discrimination into your ML model training to begin with. Another key consideration here is simply collecting representative data; if you plan to use an ML system on a certain population, you should collect data that accurately represents that population.
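One common data-level technique is reweighing, which assigns each training row a weight so that demographic group and outcome are statistically independent in the weighted data. The sketch below is a minimal version of that idea in the style of Kamiran and Calders; the column names and usage are hypothetical, and dedicated fairness toolkits provide more complete implementations.

import pandas as pd

def reweighing_weights(df, group_col, label_col):
    """Give each (group, label) cell the weight P(group)P(label)/P(group, label)
    so that group and outcome are independent in the weighted training data."""
    n = len(df)
    weights = pd.Series(1.0, index=df.index)
    for _, group_rows in df.groupby(group_col):
        for label, cell_rows in group_rows.groupby(label_col):
            expected = (len(group_rows) / n) * ((df[label_col] == label).mean())
            observed = len(cell_rows) / n
            weights.loc[cell_rows.index] = expected / observed
    return weights

# Hypothetical usage: most estimators accept the result as sample_weight.
# w = reweighing_weights(train_df, "gender", "approved")
# model.fit(X_train, y_train, sample_weight=w)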
Fix your model
ML researchers have developed many interesting approaches to decrease discrimination during ML model training. Some of these might even be permissible in highly regulated settings today but be sure to confer with your compliance or legal department before getting too invested in one of these techniques.
- Regularization
The most aggressive, and perhaps riskiest approach from a regulatory standpoint, is to leave demographic features in your ML model training and decision-making processes, but use specialized methods that attempt to regularize, or down weight, their importance in the model.
- Dual optimization
In a dual optimization approach, demographic features are not typically used in the ML system decision-making process. But, they are used during the ML model training process to down weight model mechanisms that could result in more discriminatory outcomes. If you’re careful, dual optimization approaches may be acceptable in some US regulated settings since demographic information is not technically used in decision making.
- Adversarial debiasing
In adversarial debiasing, two models compete against one another. One ML model will be the model used inside your ML system for decision making; this model usually does not have access to any explicit demographic information. The other model is an adversary model that is discarded after training, and it does have access to explicit demographic information. Training proceeds by first fitting the main model, then seeing if the adversary can accurately predict demographic information from only the main model’s predictions. If the adversary can, the main model uses information from the adversary, but not explicit demographic information, to down weight any hidden demographic information in its training data. This back-and-forth continues until the adversary can no longer predict demographic information based on the main model’s predictions. Like the dual optimization approach, adversarial debiasing may be acceptable in some US regulated settings.
Fix your predictions
Decisions based on low-confidence predictions or harmful decisions affecting historically marginalized demographic groups can be sent for human review. It’s also possible to directly change your ML predictions to make ML systems less discriminatory by some measure. This can potentially be used for already-in-flight ML systems with discrimination problems, as discrimination can be decreased without retraining the system in some cases. But this heavy-handed intervention may also raise regulatory eyebrows in the US consumer finance vertical.
As you can see, there are numerous ways to find and fix discrimination in your ML systems. Use them, but do so carefully. Without discrimination testing and remediation, it’s possible that your ML system is perpetuating harmful, inaccurate, and even illegal discrimination. With these techniques, you’ll still need to monitor your system outcomes for discrimination on ever-changing live data and be on the lookout for unintended side effects. As discussed in the next section, ML systems can be attacked, and in one famous example, hacked to be discriminatory. Or interventions that were intended to diminish discrimination can end up causing harm in the long run.
Securing Machine Learning
Various ML software artifacts, ML prediction APIs, and other ML endpoints can now be vectors for cyber and insider attacks. These ML attacks can negate all the hard work an ML team puts into mitigating risks. After all, once your model is attacked, it’s not your model anymore. And the attackers could have their own agendas regarding accuracy, discrimination, privacy, or stability. This section will present a brief overview of the current known ML attacks and some basic defensive measures your team can use to protect your AI investments.
Machine Learning Attacks
ML systems today are subject to general attacks that can affect any public-facing IT system; specialized attacks that exploit insider access to data and ML code; attacks that exploit external access to ML prediction APIs and endpoints; and trojans that can hide in third-party ML artifacts.
General attacks
ML systems are subject to hacks like distributed denial-of-service (DDoS) attacks and man-in-the-middle attacks.
Insider attacks
Malicious or extorted insiders can change ML training data to manipulate ML system outcomes. This is known as data poisoning. They can also alter code used to score new data, including creating back doors, to impact ML system outputs. (These attacks can also be performed by unauthorized external adversaries but are often seen as more realistic attack vectors for insiders.)
External attacks
Several types of external attacks involve hitting ML endpoints with weird data to change the system’s output. This can be as simple as using strange input data, known as adversarial examples, to game the ML system’s results. Or these attacks can be more specific, say impersonating another person’s data, or using tweaks to your own data to evade certain ML-based security measures. Another kind of external ML attack involves using ML prediction endpoints as designed, meaning simply submitting data to—and receiving predictions from—ML endpoints. But instead of using the submitted data and received predictions for legitimate business purposes, this information is used to steal ML model logic and to reason about, or even replicate, sensitive ML training data.
Trojans
ML systems are often dependent on numerous third-party and open-source software packages, and, more recently, large pretrained architectures. Any of these can contain malicious payloads.
Illustrations of some ML attacks are provided in the online resources that accompany this report. These illustrations are visual summaries of the discussed insider and external ML attacks. For an excellent overview of most known attacks, see the Berryville Machine Learning Institute’s Interactive Machine Learning Risk Framework.
Countermeasures
Given the variety of attacks for ML systems, you may now be wondering about how to protect your organization’s ML and AI models. There are several countermeasures you can use and, when paired with the processes proposed in Chapter 3—bug bounties, security audits, and red teaming—such measures are more likely to be effective. Moreover, there are the newer subdisciplines of adversarial ML and robust ML that are giving the full academic treatment to these subjects.
This section of the report will outline some of the most basic defensive measures you can use to help make your ML system more secure, including general measures, model monitoring for security, and defenses for insider attacks. Also, be sure to follow new work in secure, adversarial, and robust ML, as this subject is evolving quickly.
The basics
Whenever possible, require consumer authentication to access predictions or use ML systems. Also, throttle system response times for large or anomalous requests. Both of these basic IT security measures go a long way in hindering external attacks.
Model debugging
Use sensitivity analysis and adversarial example searches to profile how your ML system responds to different types of data. If you find that your model may be subject to manipulation by certain kinds of input data, either retrain your model with more data, constraints and regularization, or alert those responsible for model monitoring to be on the lookout for the discovered vulnerabilities.
Model monitoring
As discussed elsewhere in the report, models are often monitored for decaying accuracy. But models should also be monitored for adversarial attacks. Because a model could be attacked and made discriminatory, real-time discrimination testing should be conducted if possible. In addition to monitoring for accuracy and discrimination, watching for strange inputs such as unrealistic data, random data, duplicate data, and replayed training data can help to catch external adversarial attacks as they occur. Finally, a general strategy that has also been discussed in other sections is the real-time comparison of the ML system’s results to simpler benchmark model results.
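A rough sketch of what input monitoring might look like at scoring time is shown below; the summary statistics, hashing approach, and thresholds are illustrative choices, not a complete defense.

import pandas as pd

# A rough, illustrative input-integrity check run at scoring time on a
# numeric feature frame. train_min and train_max are assumed to be
# precomputed per-column minimum and maximum Series from training data.
def flag_strange_inputs(batch, train_min, train_max, seen_hashes):
    flags = pd.DataFrame(index=batch.index)
    # Unrealistic or random data: values far outside the training range.
    flags["out_of_range"] = ((batch < train_min) | (batch > train_max)).any(axis=1)
    # Duplicate or replayed rows (including replayed training data).
    row_hashes = pd.util.hash_pandas_object(batch, index=False)
    flags["duplicate"] = row_hashes.isin(seen_hashes)
    seen_hashes.update(row_hashes.tolist())
    return flags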
Thwarting malicious insiders
A strict application of the notion of least privilege, i.e., ensuring all personnel—even “rockstar” data scientists and ML engineers—receive the absolute minimum IT system permissions, is one of the best ways to guard against insider ML attacks. Other strategies include careful control and documentation of data and code for ML systems and residual analysis to find strange predictions for insiders or their close associates.
Other key points in ML security include privacy-enhancing technologies (PETs) to obscure and protect training data and organizational preparation with AI incident response plans. As touched on in Chapter 3, incorporating some defensive strategies—and training on how and when to use them—into your organization’s AI incident response plans can improve your overall ML security. As for PETs, the next section will address them.
Privacy-Enhancing Technologies for Machine Learning
Privacy-preserving ML is yet another research subdiscipline with direct ramifications for the responsible practice of ML. Some of the most promising and practical techniques from this field include federated learning and differential privacy.
Federated Learning
Federated learning is an approach to training ML algorithms across multiple decentralized edge devices or servers holding local data samples, without exchanging raw data. This approach is different from traditional centralized ML techniques where all datasets are uploaded to a single server. The main benefit of federated learning is that it enables the construction of robust ML models without sharing data among many parties. Federated learning avoids sharing data by training local models on local data samples and exchanging parameters between servers or edge devices to generate a global model, which is then shared by all servers or edge devices. Assuming a secure aggregation process is used, federated learning helps address fundamental data privacy and data security concerns.
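At its core, the best-known federated approach, federated averaging, boils down to combining locally trained parameters weighted by each client’s sample count. The sketch below is a conceptual illustration only; production systems should use a dedicated federated learning framework with a secure aggregation protocol.

import numpy as np

# Conceptual federated averaging step: each client trains locally on its own
# data, and only parameter vectors (never raw data) are sent to the server,
# which combines them weighted by local sample counts.
def federated_average(client_weights, client_sizes):
    total = sum(client_sizes)
    return sum(
        (n / total) * np.asarray(w)
        for w, n in zip(client_weights, client_sizes)
    )

# Hypothetical round: three clients return updated coefficient vectors.
# global_w = federated_average([w_a, w_b, w_c], [12000, 4500, 800])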
Differential Privacy
Differential privacy is a system for sharing information about a dataset by describing patterns about groups in the dataset without disclosing information about specific individuals. In ML tools, this is often accomplished using specialized types of differentially private learning algorithms.2 This makes it more difficult to extract sensitive data from training data or the trained model. In fact, an ML model is said to be differentially private if an outside observer cannot tell if an individual’s information was used to train the model. (This sounds great for preventing those data extraction attacks described in the previous section!)
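The classic building block is the Laplace mechanism, sketched below for a simple counting query; differentially private learning algorithms apply the same idea during model training, and the epsilon value here is purely illustrative.

import numpy as np

# The Laplace mechanism: add noise scaled to the query's sensitivity divided
# by the privacy budget epsilon.
def dp_count(values, predicate, epsilon):
    true_count = sum(predicate(v) for v in values)
    sensitivity = 1.0  # one person can change a count by at most 1
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# Smaller epsilon -> more noise -> stronger privacy, lower utility.
# noisy = dp_count(incomes, lambda x: x > 100_000, epsilon=0.5)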
Federated learning, differential privacy, and ML security measures can go hand in hand to add an extra layer of privacy and security to your ML systems. While they will be extra work, they’re very likely worth considering for high-stakes or mission-critical ML deployments.
Causality
We’ll close our responsible ML technology discussion with causality, because modeling causal drivers of some phenomenon, instead of complex correlations, could help address many of the risks we’ve brought up. Correlation is not causation. And nearly all of today’s popular ML approaches rely on correlation, or some more localized variant of the same concept, to learn from data. Yet, data can be both correlated and misleading. For instance, in the famous asthma patient example discussed earlier, having asthma is correlated with greater medical attention, not being at a lower risk of death from pneumonia. Furthermore, a major concern in discrimination testing and remediation is ML models learning complex correlations to demographic features, instead of real relationships. Until ML algorithms can learn such causal relationships, they will be subject to these kinds of basic logical flaws and other problems. Fortunately, techniques like Markov Chain Monte Carlo (MCMC) sampling, Bayesian networks, and various frameworks for causal inference are beginning to pop up in commercial and open-source software ML packages. More innovations are likely on the way, so keep an eye on this important corner of the data world.
Aside from rigorous causal inference approaches, there are steps you can take right now to incorporate causal concepts into your ML projects. For instance, enhanced interpretability and model debugging can lead to a type of “poor man’s causality” where debugging is used to find logical flaws in ML models and remediation techniques such as model assertions, model editing, monotonicity constraints, or interaction constraints are used to fix the flaw with human domain knowledge. Root cause analysis is also a great addition to high-stakes ML workflows. Interpretable ML models and post hoc explanation techniques can now indicate reasons for ML model behaviors, which human caseworkers can confirm or deny. These findings can then be incorporated into the next iteration of the ML system in hopes of improving multiple system KPIs. Of course, all of these different suggestions are not a substitute for true causal inference approaches, but they can help you make progress toward this goal.
1 See https://oreil.ly/gDhzh and https://oreil.ly/Fzilg.
2 See also https://oreil.ly/ESyqR.