Notes on fairness and nondiscrimination in machine learning
There has been a lot of work done lately on “fairness in machine learning” and related topics. It cannot be a coincidence that this work has paralleled a rise in political intolerance that is sensitized to issues of gender, race, citizenship, and so on. I more or less stand by my initial reaction to this line of work. But very recently I’ve done a deeper and more responsible dive into this literature and it’s proven to be insightful beyond the narrow problems which it purports to solve. These are some notes on the subject, ordered so as to get to the point.
The subject of whether and to what extent computer systems can enact morally objectionable bias goes back at least as far as Friedman and Nissenbaum’s 1996 article, in which they define “bias” as systematic unfairness. They mean this very generally, not specifically in a political sense (though inclusive of it). Twenty years later, Kleinberg et al. (2016) prove that there are multiple, competing notions of fairness in machine classification which generally cannot be satisfied all at once; they must be traded off against each other. In particular, a classifier that uses all available information to optimize accuracy–one that achieves what these authors call calibration–cannot also have equal false positive and false negative rates across population groups (read: race, sex), properties that Hardt et al. (2016) call “equal opportunity”. This is no doubt inspired by a now very famous ProPublica article asserting that a particular kind of commercial recidivism prediction software was “biased against blacks” because it had a higher false positive rate for black suspects than white offenders. Because bail and parole rates are set according to predicted recidivism, this led to cases where a non-recidivist was denied bail because they were black, which sounds unfair to a lot of people, including myself.
While I understand that there is a lot of high quality and well-intentioned research on this subject, I haven’t found anybody who could tell me why the solution to this problem was to stop using predicted recidivism to set bail, as opposed to futzing around with a recidivism prediction algorithm which seems to have been doing its job (Dieterich et al., 2016). Recidivism rates are actually correlated with race (Hartney and Vuong, 2009). This is probably because of centuries of systematic racism. If you are serious about remediating historical inequality, the least you could do is cut black people some slack on bail.
This gets to what for me is the most baffling aspect of this whole research agenda, one that I didn’t have the words for before reading Barocas and Selbst (2016). A point well-made by them is that the interpretation anti-discrimination law, which motivates a lot of this research, is fraught with tensions that complicate its application to data mining.
“Two competing principles have always undergirded anti-discrimination law: nondiscrimination and antisubordination. Nondiscrimination is the narrower of the two, holding that the responsibility of the law is to eliminate the unfairness individuals experience a the hands of decisionmakers’ choices due to membership in certain protected classes. Antisubordination theory, in contrast, holds that the goal of antidiscrimination law is, or at least should be, to eliminate status-based inequality due to membership in those classes, not as a matter of procedure, but substance.” (Barocas and Selbst, 2016)
More specifically, these two principles motivate different interpretations of the two pillars of anti-discrimination law, disparate treatment and disparate impact. I draw on Barocas and Selbst for my understanding of each:
A judgment of disparate treatment requires either a formal disparate treatment (across protected groups) of similarly situated people, or an intent to discriminate. Since in a large data mining application protected group membership will be proxied by many other factors, it’s not clear if the ‘formal’ requirement makes much sense here. And since machine learning applications only very rarely have racist intent, that option seems challengeable as well. While there are interpretations of these criteria that are tougher on decision-makers (i.e. unconscious intents), these seem to be motivated by antisubordination rather than the weaker nondiscrimination principle.
A judgment of disparate impact is perhaps more straightforward, but it can be mitigated in cases of “business necessity”, which (to get to the point) is vague enough to plausibly include optimization in a technical sense. Once again, there is nothing to see here from a nondiscrimination standpoint, though a nonsubordinationist would rather that these decision-makers have to take correcting for historical inequality into account.
I infer from their writing that Barocas and Selbst believe that nonsubordination is an important principle for nondiscrimination. In any case, they maintain that making the case for applying nondiscrimination laws to data mining effectively requires a commitment to “substantive remediation”. This is insightful!
Just to put my cards on the table: as much as I may like the idea of substantive remediation in principle, I personally don’t think that every application of nondiscrimination law needs to be animated by it. For many institutions, narrow nondiscrimination seems to be adequate if not preferable. I’d prefer remediation to occur through other specific policies, such as more public investment in schools in low-income districts. Perhaps for this reason, I’m not crazy about “fairness in machine learning” as a general technical practice. It seems to me to be trying to solve social problems with a technical fix, which despite being quite technical myself I don’t always see as a good idea. It seems like in most cases you could have a machine learning mechanism based on normal statistical principles (the learning step) and then use a decision procedure separately that achieves your political ends.
I wish that this research community (and here I mean more the qualitative research community surrounding it more than the technical community, which tends to define its terms carefully) would be more careful about the ways it talks about “bias”, because often it seems to encourage a conflation between statistical or technical senses of bias and political senses. The latter carry so much political baggage that it can be intimidating to try to wade in and untangle the two senses. And it’s important to do this untangling, because while bad statistical bias can lead to political bias, it can, depending on the circumstances, lead to either “good” or “bad” political bias. But it’s important, from the sake of numeracy (mathematical literacy) to understand that even if a statistically bad process has a politically “good” outcome, that is still, statistically speaking, bad.
My sense is that there are interpretations of nondiscrimination law that make it illegal to make certain judgments taking into account certain facts about sensitive properties like race and sex. There are also theorems showing that if you don’t take into account those sensitive properties, you are going to discriminate against them by accident because those sensitive variables are correlated with anything else you would use to judge people. As a general principle, while being ignorant may sometimes make things better when you are extremely lucky, in general it makes things worse! This should be a surprise to nobody.
Barocas, Solon, and Andrew D. Selbst. “Big data’s disparate impact.” (2016).
Dieterich, William, Christina Mendoza, and Tim Brennan. “COMPAS risk scales: Demonstrating accuracy equity and predictive parity.” Northpoint Inc (2016).
Friedman, Batya, and Helen Nissenbaum. “Bias in computer systems.” ACM Transactions on Information Systems (TOIS) 14.3 (1996): 330-347.
Hardt, Moritz, Eric Price, and Nati Srebro. “Equality of opportunity in supervised learning.” Advances in Neural Information Processing Systems. 2016.
Hartney, Christopher, and Linh Vuong. “Created equal: Racial and ethnic disparities in the US criminal justice system.” (2009).
Kleinberg, Jon, Sendhil Mullainathan, and Manish Raghavan. “Inherent trade-offs in the fair determination of risk scores.” arXiv preprint arXiv:1609.05807 (2016).