Introduction
Are Human Decisions Less Biased Than Automated Ones?
This report takes data science leaders and practitioners through the key challenges of defining fairness and reducing unfair bias throughout the machine learning pipeline. It shows why data science teams need to engage early and authoritatively on building trusted artificial intelligence (AI). It also explains in plain English how organizations should think about AI fairness, as well as the trade-offs between model bias and model accuracy. Much has been written on the social injustice of AI bias, but this report focuses on how teams can mitigate unfair machine bias by using the open source tools available in AI Fairness 360 (AIF360).
Developing unbiased algorithms involves many stakeholders across a company. After reading this report, you will understand the many factors that you need to consider when defining fairness for your use case (such as legal compliance, ethics, and trust). You will also learn that there are several ways to define fairness, and thus many different ways to measure and reduce unfair bias. Although not all bias can be removed, you will learn that organizations should define acceptable thresholds for both model accuracy and bias.
In this report, we provide an overview of how AI bias could negatively affect every industry as the use of algorithms becomes more prevalent. Next, we discuss how to define fairness and the source of bias. Then, we discuss bias in the machine learning pipeline and the current bias mitigation toolkits. We then go into the AI Fairness 360 toolkit with a focus on how to measure bias and attempt to remove it. We conclude with a Python tutorial of a credit-scoring use case and provide some guidance on fairness and bias.
AI Fairness Is Becoming Increasingly Critical
AI is increasingly being used in highly sensitive areas such as health care, hiring, and criminal justice, so there has been a wider focus on the implications of bias and unfairness embedded in it. We know that human decision-making in many areas is biased and shaped by our individual or societal biases, which are often unconscious. One may assume that using data to automate decisions would make everything fair, but we now know that this is not the case. AI bias can come in through societal bias embedded in training datasets, decisions made during the machine learning development process, and complex feedback loops that arise when a machine learning model is deployed in the real world.
Extensive evidence has shown that AI can embed human and societal biases and deploy them at scale. Many experts are now saying that unwanted bias might be the major barrier that prevents AI from reaching its full potential. One such case that has received media attention is that of Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), an algorithm used to predict whether defendants in Broward County, Florida, among other jurisdictions, were likely to commit a crime if they were not remanded. A 2016 investigation by journalists at ProPublica found that the COMPAS algorithm incorrectly labeled African-American defendants as “high-risk” at nearly twice the rate it mislabeled white defendants. This illustrates the significant negative impact an AI algorithm can have on society. So how do we ensure that automated decisions are less biased than human decision-making?
Defining Fairness
Fairness is a complex and multifaceted concept that depends on context and culture. Defining it for an organization’s use case can thus be difficult. There are at least 21 mathematical definitions of fairness.1 These are not just theoretical differences in how to measure fairness; different definitions focus on different aspects of fairness and thus can produce entirely different outcomes. Fairness researchers have shown that it is impossible to satisfy different definitions of fairness at the same time.2 The University of Chicago has created a decision tree that is useful in thinking through how organizations can define fairness.
Many definitions focus either on individual fairness (treating similar individuals similarly) or on group fairness (making the model’s predictions/outcomes equitable across groups). Individual fairness seeks to ensure that similar individuals are treated similarly. Group fairness partitions a population into groups defined by protected attributes and seeks to ensure that statistical measures of outcomes are equal across those groups.
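To make the group fairness notion concrete, here is a minimal sketch, not taken from the report, that computes two common group metrics (statistical parity difference and disparate impact) by hand with NumPy; the predictions and group labels are made-up illustrative values.

```python
# A minimal sketch (not from the report) of two common group fairness metrics,
# computed by hand with NumPy. The predictions and group labels below are
# made-up illustrative values.
import numpy as np

y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])  # 1 = favorable outcome (e.g., loan approved)
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # 0 = privileged group, 1 = unprivileged group

rate_priv = y_pred[group == 0].mean()    # favorable-outcome rate for the privileged group
rate_unpriv = y_pred[group == 1].mean()  # favorable-outcome rate for the unprivileged group

# Statistical parity difference: 0 means both groups receive favorable outcomes
# at the same rate; negative values disadvantage the unprivileged group.
statistical_parity_difference = rate_unpriv - rate_priv

# Disparate impact: the ratio of the two rates; values near 1 indicate parity,
# and the "80% rule" treats values below 0.8 as a warning sign.
disparate_impact = rate_unpriv / rate_priv

print(statistical_parity_difference, disparate_impact)  # -0.5 and 0.33 for these values
```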
Among those who focus on group fairness, there are two opposing worldviews, which can be roughly summarized as “We’re All Equal” (WAE) and “What You See Is What You Get” (WYSIWYG).3 The WAE worldview holds that all groups have the same abilities, while the WYSIWYG worldview holds that the observed data reflect the abilities of the groups.
For example, consider how universities could define group fairness when using SAT scores as a feature for predicting success in college. The WYSIWYG worldview would say that the score correlates well with future success and can be used to compare applicants’ abilities accurately. In contrast, the WAE worldview would say that SAT scores may contain structural biases, so a difference in their distribution across groups should not be mistaken for a difference in the distribution of ability.
In addition, different bias-handling algorithms address different parts of the machine learning life cycle; understanding how, when, and why to use each research contribution is challenging even for experts in algorithmic fairness. As a result, AI stakeholders and practitioners need clarity on how to proceed. Currently the burden is on the practitioners, who face questions such as these:
- Should our data be debiased?
- Should we create new classifiers that learn unbiased models?
- Is it better to correct predictions from the model?
This report helps you to strategize on how to approach such questions for your organization’s specific use case.
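These three questions correspond to the three places in the pipeline where bias can be mitigated: the training data, the learning algorithm, and the model’s predictions. As a rough sketch, the AIF360 toolkit discussed later groups its mitigators along these same lines; the classes shown below are representative choices, not the only options in each family.

```python
# A rough sketch of how the three questions above map onto AIF360's three
# families of bias mitigators. The classes shown are representative examples.

# "Should our data be debiased?" -> pre-processing: transform the training data.
from aif360.algorithms.preprocessing import Reweighing

# "Should we create new classifiers that learn unbiased models?"
# -> in-processing: train with a fairness-aware learning algorithm.
from aif360.algorithms.inprocessing import PrejudiceRemover

# "Is it better to correct predictions from the model?"
# -> post-processing: adjust the predictions of an already trained model.
from aif360.algorithms.postprocessing import EqOddsPostprocessing

# All three families operate on AIF360 dataset objects and follow a familiar
# fit/transform or fit/predict pattern; a fuller example appears later.
```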
Where Does Bias Come From?
Algorithmic bias is often discussed in machine learning, but in most cases the underlying data, rather than the algorithm, is the main source of bias. Consider supervised learning, which is the most common form of machine learning. Its goal is to find a mathematical function that takes data points of numerical, ordinal, or categorical features as inputs and predicts correct labels for those data points. Models are trained on data, and often that data contains human decisions that reflect the effects of societal or historical inequities.
For example, in a credit scoring model, the features might be income, education level, and occupation of an applicant, and the label might be whether an applicant defaults three years later. The algorithm finds the desired function by training on a large set of already labeled examples. Although the function fits the training data, when it is applied to new data points, it must generalize to predict well. In its most basic form, fitting is performed to optimize a criterion such as average accuracy.
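As a minimal sketch of this setup (not the report’s actual credit-scoring tutorial), the following scikit-learn snippet fits a classifier to synthetic credit-like data; the feature names, the random data, and the labeling rule are illustrative assumptions.

```python
# A minimal sketch of the supervised-learning setup described above, using
# scikit-learn and synthetic data. Features and labels are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
n = 1000

# Features: income (in $1,000s), years of education, occupation code.
X = np.column_stack([
    rng.normal(50, 15, n),    # income
    rng.integers(10, 21, n),  # education level
    rng.integers(0, 5, n),    # occupation category
])
# Label: 1 if the applicant defaulted within three years (synthetic rule plus noise).
y = ((X[:, 0] < 40) & (rng.random(n) < 0.7)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit by optimizing a criterion (here, regularized log loss) on the training data...
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# ...then check how well the fitted function generalizes to new data points.
print("test accuracy:", accuracy_score(y_test, model.predict(X_test)))
```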
The biggest problem with machine learning models is that the training distribution does not always match the desired distribution. If the present reality puts certain individuals at a systematic disadvantage, the training data distribution is likely to reproduce that disadvantage rather than reflecting a fairer future. Biases such as those against African-Americans in the criminal justice system and women in employment can be present whenever judges and hiring managers make decisions. These decisions are reflected in the training data and subsequently baked into future machine learning model decisions. Some types of bias that can be introduced into data include the following:
- Sample bias occurs when one population is overrepresented or underrepresented in a training dataset. An example of this would be a recruiting tool that has been predominantly trained on white male job candidates.
- Label bias occurs when the annotation process introduces bias during the creation of training data. For example, the people labeling the data might not represent a diverse group of locations, ethnicities, languages, ages, and genders, and can bring their implicit personal biases into their labels. This can lead to labels that are skewed in ways that yield systematic disadvantages to certain groups.
- Outcome proxy bias occurs when the machine learning task is not specified appropriately. For example, if one would like to predict the likelihood of a person committing a crime, using arrests as a proxy is biased because arrest rates are greater in neighborhoods with more police patrols. Also, being arrested does not imply guilt. Similarly, using the cost of a person to a health system is a biased proxy for the person’s quality of health.
Bias and Machine Learning
Machine learning models make predictions of an outcome for a particular instance. For example, given an instance of a loan application, a model might predict whether an applicant will repay the loan. The model makes these predictions based on a training dataset for which many other instances (other loan applications) and actual outcomes (whether the borrowers repaid) are provided. Thus, a machine learning algorithm will attempt to find patterns, or generalizations, in the training dataset to use when a prediction for a new instance is needed. For example, one pattern might be that if a person has a salary greater than $40,000 and outstanding debt less than $5,000, they will repay the loan. In many domains, this technique, called supervised machine learning, has worked very well.
However, sometimes the patterns such models find are undesirable or even illegal. For example, our loan repayment model might determine that age plays a significant role in the prediction of repayment, because the training dataset happened to have better repayment rates for one age group than for another. This raises two problems. First, the training dataset might not be representative of the true population of people of all age groups. Second, even if it is representative, it is illegal to base a decision on an applicant’s age, regardless of whether this is a good prediction based on historical data.
AIF360 addresses this problem with fairness metrics and bias mitigators. We can use fairness metrics to check for bias in machine learning workflows, and bias mitigators to reduce that bias and produce fairer outcomes. The loan scenario describes an intuitive example of illegal bias. However, not all undesirable bias in machine learning is illegal; bias can also manifest in subtler ways. For example, a loan company might want a diverse portfolio of customers across all income levels and thus will deem it undesirable to make more loans to high-income customers than to low-income customers. Although doing so is not illegal or unethical, it is undesirable for the company’s strategy.
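To make this concrete, here is a condensed sketch in the spirit of the credit-scoring use case mentioned earlier, using AIF360’s German credit dataset with age as the protected attribute. It assumes the aif360 package is installed and that the raw UCI German credit files have been placed where AIF360 expects them (the dataset loader will complain otherwise).

```python
# A condensed sketch: measure bias in the training data with a fairness metric,
# then apply a pre-processing bias mitigator (Reweighing) and measure again.
from aif360.datasets import GermanDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing

# Load the data, marking applicants aged 25 and older as the privileged group.
dataset = GermanDataset(
    protected_attribute_names=['age'],
    privileged_classes=[lambda x: x >= 25],
    features_to_drop=['personal_status', 'sex'],
)
privileged = [{'age': 1}]
unprivileged = [{'age': 0}]

# Fairness metric: difference in favorable-outcome rates between groups.
metric = BinaryLabelDatasetMetric(
    dataset, unprivileged_groups=unprivileged, privileged_groups=privileged)
print("Mean difference before mitigation:", metric.mean_difference())

# Bias mitigator: reweigh the training examples so favorable outcomes are
# balanced across the privileged and unprivileged groups.
rw = Reweighing(unprivileged_groups=unprivileged, privileged_groups=privileged)
dataset_transf = rw.fit_transform(dataset)

metric_transf = BinaryLabelDatasetMetric(
    dataset_transf, unprivileged_groups=unprivileged, privileged_groups=privileged)
print("Mean difference after reweighing:", metric_transf.mean_difference())
```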
Can’t I Just Remove Protected Attributes?
When building machine learning models, many data scientists assume that they can simply remove protected attributes (e.g., race, gender, age) to avoid unfair bias. However, many other features are closely correlated with protected attributes, which makes it possible to reconstruct an attribute such as race even after you drop it from your training set.
An example of this is when Amazon rolled out its Prime Free Same-Day Delivery service for many products on orders over $35. Eleven months after launching the program, Amazon offered same-day service in 27 metropolitan areas. However, a 2016 Bloomberg News analysis found that in six of those major cities, the service areas excluded predominantly black zip codes to varying degrees. In Atlanta, Chicago, Dallas, Washington, Boston, and New York, black residents were about half as likely as white residents to live in neighborhoods with access to Amazon same-day delivery. How did this occur? Amazon focused its same-day rollout on zip codes with a high concentration of Prime members, not on race. Yet in cities where most of those paying members are concentrated in predominantly white parts of town, looking at numbers instead of people did not prevent a data-driven calculation from reinforcing long-entrenched inequality in access to retail services.
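The following is a minimal sketch on synthetic data (all values and feature names are illustrative assumptions) of why dropping the protected attribute is not enough: when another feature such as a zip-code segment is strongly correlated with it, a simple classifier can largely reconstruct the dropped attribute.

```python
# A minimal sketch with synthetic data showing that a dropped protected
# attribute can often be reconstructed from correlated proxy features.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 2000

# Synthetic protected attribute (encoded 0/1) and a correlated proxy feature
# (e.g., a zip-code segment): the two agree 90% of the time by construction.
protected = rng.integers(0, 2, n)
zip_segment = np.where(rng.random(n) < 0.9, protected, 1 - protected)
income = rng.normal(50, 15, n)  # an unrelated feature

X_without_protected = np.column_stack([zip_segment, income])

# Even with the protected attribute removed from the feature set, it can be
# predicted from the proxies it left behind.
scores = cross_val_score(
    LogisticRegression(max_iter=1000), X_without_protected, protected, cv=5)
print("accuracy reconstructing the protected attribute:", scores.mean())
```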
Conclusion
Bias mitigation is not a silver bullet. Fairness is a multifaceted, context-dependent social construct that defies simple definition. The metrics and algorithms in AIF360 can be viewed through the lens of distributive justice and do not capture the full scope of fairness in all situations. The toolkit should be used in only a very limited setting: allocation or risk assessment problems with well-defined protected attributes in which one would like to have some sort of statistical or mathematical notion of sameness. Even then, the code and collateral contained in AIF360 are only a starting point for a broader discussion among multiple stakeholders on overall decision-making workflows.
1 Arvind Narayanan, “21 Fairness Definitions and Their Politics,” tutorial at Conference on Fairness, Accountability, and Transparency, February 2018. https://oreil.ly/MhDrk.
2 J. Kleinberg et al., “Inherent Tradeoffs in the Fair Determination of Risk Scores,” in Proceedings of the 8th Innovations in Theoretical Computer Science Conference, January 2017, 43.1–43.23.
3 S. A. Friedler et al., “On the (Im)possibility of Fairness,” September 2016, arXiv:1609.07236. https://oreil.ly/ps4so.