On descent-based discrimination (a reply to Hanna et al. 2020)

In what is likely to be a precedent-setting case, California regulators filed a suit in the federal court on June 30 against Cisco Systems Inc, alleging that the company failed to prevent discrimination, harassment and retaliation against a Dalit engineer, anonymised as “John Doe” in the filing.

The Cisco case bears the burden of making anti-Dalit prejudice legible to American civil rights law as an extreme form of social disability attached to those formerly classified as “Untouchable.” Herein lies its key legal significance. The suit implicitly compares two systems of descent-based discrimination – caste and race – and translates between them to find points of convergence or family resemblance.

A. Rao

There is not much I can add to this article about caste-based discrimination in the U.S. In the lawsuit, a team of high-caste South Asians in California is alleged to have discriminated against a Dalit engineer coworker. The work of the lawsuit is to make caste-based discrimination legible to American civil rights law. It correctly, in my view, draws the connection to race.

This illustrative example prompts me to respond to Hanna et al.’s 2020 “Towards a critical race methodology in algorithmic fairness.” This paper by a Google team included a serious, thoughtful consideration of the argument I put forward with my co-author Bruce Haynes in “Racial categories in machine learning”. I like the Hanna et al. paper, think it makes interesting and valid points about the multidimensionality of race, and am grateful for their attention to my work.

I also disagree with some of their characterization of our argument and one of the positions they take. For some time I’ve intended to write a response. Now is a fine time.

First, a quibble: Hanna et al. describe Bruce D. Haynes as a “critical race scholar” and while he may have changed his mind since our writing, at the time he was adamant (in conversation) that he is not a critical race scholar, but that “critical race studies” refers to a specific intellectual project of racial critique that just happens to be really trendy on Twitter. There are lots and lots of other ways to study race critically that are not “critical race studies”. I believe this point was important to Bruce as a matter of scholarly identity. I also feel that it’s an important point because, frankly, I don’t find a lot of “critical race studies” scholarship persuasive and I probably wouldn’t have collaborated as happily with somebody of that persuasion.

So the fact that Hanna et al. explicitly position their analysis in “critical race” methods is a signpost that they are actually trying to accomplish a much more specifically disciplinarily informed project than we were. Sadly, they did not get into the question of how “critical race methodology” differs from other methodologies one might use to study race. That’s too bad, as it supports what I feel is a stifling hegemony that this particular discourse has over discussions of race and technology.

The Google team is supportive of the most important contribution of our paper: that racial categories are problematic and that this needs to be addressed in the fairness in AI literature. They then go on to argue against our proposed solution of “using an unsupervised machine learning method to create race-like categories which aim to address “historical racial segregation without reproducing the political construction of racial categories.”” (their rendering). I will defend our solution here.

Their first claim:

First, it would be a grave error to supplant the existing categories of race with race-like categories inferred by unsupervised learning methods. Despite the risk of reifying the socially constructed idea called race, race does exist in the world, as a way of mental sorting, as a discourse which is adopted, as a social thing which has both structural and ideological components. In other words, although race is social constructed, race still has power. To supplant race with race-like categories for the purposes of measurement sidesteps the problem.

This paragraph does feel very “critical race studies” to me, in that it makes totalizing claims about the work race does in society in a way that precludes the possibility of any concrete or focused intervention. I think they misunderstand our proposal in the following ways:

  • We are not proposing that, at a societal and institutional level, we institute a new, stable system of categories derived from patterns of segregation. We are proposing that, ideally, temporary quasi-racial categories are derived dynamically from data about segregation in a way that destabilizes the social mechanisms that reproduce racial hierarchy, reducing the power of those categories.
  • This is proposed as an intervention to be adopted by specific technical systems, not at the level of hegemonic political discourse. It is a way of formulating an anti-racist racial project by undermining the way categories are maintained.
  • Indeed, the idea is to sidestep the problem, in the sense that it is an elegant way to reduce the harm that the problem does. Sidestepping is, after all, a way of avoiding a danger. In this case, that danger is the reification of race in large-scale digital platforms (for example).

Next, they argue:

Second, supplanting race with race-like categories depends highly on context, namely how race operates within particular systems of inequality and domination. Benthall and Haynes restrict their analysis to that of spatial segregation, which is to be sure, an important and active research area and subject of significant policy discussion (e.g. [76, 99]). However, that metric may appear illegible to analyses pertaining to other racialized institutions, such as the criminal justice system, education, or employment (although one can readily see their connections and interdependencies). The way that race matters or pertains to particular types of structural inequality depends on that context and requires its own modes of operationalization

Here, the Google team takes the anthropological turn and, like many before them, suggests that a general technical proposal is insufficient because it is not sufficiently contextualized. Besides echoing the general problem of the ineffectualness of anthropological methods in technology ethics, they also mischaracterize our paper by saying we restrict our analysis to spatial segregation. This is not true: in the paper we generalize our analysis to social segregation, as measured, for example, on a social network graph. Naturally, we would (a) be interested in and open to other systems of identifying race as a feature of social structure, and (b) want to tailor the data over which any operationalization technique is applied, where appropriate, to the technical and functional context. At the same time, we are on quite solid ground in saying that race is structural and systemic, and in a sense defined at a holistic societal level, even as it has ramifications in, and is impacted by, the micro and contextual level. Because we approach the problem from a structural sociological perspective, we can imagine a structural technical solution. This is an advantage of the method over a more anthropological one.
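To make the social-graph version of the idea a little more concrete, here is a minimal sketch, for illustration only. It is not the procedure from our paper: spectral bisection stands in for "an unsupervised method", and the function names (`fiedler_partition`, `within_group_share`) and the toy graph are hypothetical. The point is only that group labels can be recomputed from the current pattern of ties each time they are needed, rather than stored as a fixed attribute of a person.

```python
from collections import defaultdict
from math import sqrt


def fiedler_partition(edges, iterations=500):
    """Split an undirected graph into two groups using the sign of an
    approximate Fiedler vector (spectral bisection), found by shifted
    power iteration. Assumes a connected graph; illustrative only."""
    nodes = sorted({node for edge in edges for node in edge})
    n = len(nodes)
    if n == 0:
        return {}
    index = {node: i for i, node in enumerate(nodes)}

    neighbors = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            neighbors[index[u]].add(index[v])
            neighbors[index[v]].add(index[u])
    degree = [len(neighbors[i]) for i in range(n)]

    # Laplacian eigenvalues are at most 2 * max degree, so (shift * I - L)
    # is positive definite and its top eigenvector, once the constant
    # vector is projected out, is the Fiedler vector of L.
    shift = 2 * max(degree) + 1

    # Deterministic, non-constant starting vector (an arbitrary choice).
    x = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # project out the constant vector
        y = [
            (shift - degree[i]) * x[i] + sum(x[j] for j in neighbors[i])
            for i in range(n)
        ]
        norm = sqrt(sum(yi * yi for yi in y))
        if norm == 0:
            break
        x = [yi / norm for yi in y]

    return {node: int(x[index[node]] >= 0) for node in nodes}


def within_group_share(edges, labels):
    """Fraction of ties whose two endpoints fall in the same derived group."""
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph (made up): two tightly knit groups joined by a single tie.
group_a = ["a1", "a2", "a3", "a4"]
group_b = ["b1", "b2", "b3", "b4"]
toy_edges = [(u, v) for g in (group_a, group_b)
             for i, u in enumerate(g) for v in g[i + 1:]]
toy_edges.append(("a4", "b1"))

toy_labels = fiedler_partition(toy_edges)
print(toy_labels, within_group_share(toy_edges, toy_labels))
```

The labels 0 and 1 carry no meaning of their own and are not persisted anywhere; rerunning the function on updated ties yields a fresh partition, which is the sense in which such categories would be temporary.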


Finally, they write:

At the same time we focus on the ontological aspects of race (what is race, how is it constituted and imagined in the world), it is necessary to pay attention to what we do with race and measures which may be interpreted as race. The creation of metrics and indicators which are race-like will still be interpreted as race.

This is a strange criticism given that one of the potential problems with our paper is that the quasi-racial categories we propose are not interpretable. The authors seem to think that our solution involves the institution of new quasi-racial categories at the level of representation or discourse. That’s not what we’ve proposed. We’ve proposed a design for a machine learning system which, we’d hope, would be understood well enough by its engineers to work as an intervention. Indeed, the correlation of the quasi-racial categories with socially recognized racial ones is important if they are to ground fairness interventions; the purpose of our proposed solution is narrowly to allow for these interventions without the reification of the categories.

Enough defense. There is a point the Google team insists on which strikes me as somewhat odd and which, to me, signals a further weakness of their hyper-contextualized method: its inability to generalize beyond the hermeneutic cycles of “critical race theory”.

Hanna et al. list several (seven) different “dimensions of race” based on different ways race can be ascribed, inferred, or expressed. There is, here, the anthropological concern with the individual body and its multifaceted presentations in the complex social field. But they explicitly reject one of the most fundamental ways in which race operates at a transpersonal and structural level, which is through families and genealogy. This is well-intentioned but ultimately misguided.

Note that we have excluded “racial ancestry” from this table. Genetics, biomedical researchers, and sociologists of science have criticized the use of “race” to describe genetic ancestry within biomedical research [40, 49, 84, 122], while others have criticized the use of direct-to-consumer genetic testing and its implications for racial and ethnic identification [15, 91, 113]

In our paper, we take pains to point out responsibly how many aspects of race, such as phenotype, nationality (through citizenship rules), and class signifiers (through inheritance), are connected with ancestry. We, of course, do not mean to equate ancestry with race. Nor, especially, are we saying that there are genetic racialized qualities besides perhaps those associated with phenotype. We are also not saying that direct-to-consumer genetic test data is what institutions should be basing their inference of quasi-racial categories on. Nothing like that.

However, speaking for myself, I believe that an important aspect of how race functions at a social structural level is how it implicates relations of ancestry. A. Rao perhaps puts the point better: race is a system of inherited privilege, and racial discrimination is more often than not discrimination based on descent.

Understanding this about race allows us to see what race has in common with other systems of categorical inequality, such as the caste system. And here was a large part of the point of offering an algorithmic solution: to suggest a system for identifying inequality that transcends the logic of what is currently recognized within the discourse of “critical race theory” and anticipates forms of inequality and discrimination that have not yet been so politically recognized. This will become increasingly an issue when a pluralistic society (or user base of an on-line platform) interacts with populations whose categorical inequalities have different histories and origins besides the U.S. racial system. Though our paper used African-Americans as a referent group, the scope of our proposal was intentionally much broader.


Benthall, S., & Haynes, B. D. (2019, January). Racial categories in machine learning. In Proceedings of the Conference on Fairness, Accountability, and Transparency (pp. 289-298).

Hanna, A., Denton, E., Smart, A., & Smith-Loud, J. (2020, January). Towards a critical race methodology in algorithmic fairness. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency (pp. 501-512).