Digifesto

All the problems with our paper, “Racial categories in machine learning”

Bruce Haynes and I were blown away by the reception to our paper, “Racial categories in machine learning”. This was a huge experiment in interdisciplinary collaboration for us. We are excited about the next steps in this line of research.

That includes engaging with criticism. One of our goals was to fuel a conversation in the research community about the operationalization of race. That isn’t a question that can be addressed by any one paper or team of researchers. So one thing we got out of the conference was great critical feedback on potential problems with the approach we proposed.

This post is an attempt to capture those critiques.

Need for participatory design

Khadijah Abdurahman, of Word to RI, issues a subtweeted challenge to us to present our paper to the hood. (RI stands for Roosevelt Island, in New York City, the location of the recently established Cornell Tech campus.)

One striking challenge, raised by Khadijah Abdurahman on Twitter, is that we should be developing peer relationships with the communities we research. I read this as a call for participatory design. It’s true this was not part of the process of the paper. In particular, Ms. Abdurahman points to a part of our abstract that uses jargon from computer science.

There are a lot of ways to respond to this comment. The first is to accept the challenge. I would personally love it if Bruce and I could present our research to folks on Roosevelt Island and get feedback from them.

There are other ways to respond that address the tensions of this comment. One is to point out that in addition to being an accomplished scholar of the sociology of race and how it forms, especially in urban settings, Bruce is a black man who is originally from Harlem. Indeed, Bruce’s family memoir shows his deep and well-researched familiarity with the life of marginalized people of the hood. So a “peer relationship” between an algorithm designer (me) and a member of an affected community (Bruce) is really part of the origin of our work.

Another is to point out that we did not research a particular community. Our paper was not human subjects research; it was about the racial categories that are maintained by the U.S. federal government and which pervade society in a very general way. Indeed, everybody is affected by these categories. When I and others who look like me are ascribed “white”, that is an example of these categories at work. Bruce and I were very aware of how different kinds of people at the conference responded to our work, and how it was an intervention in our own community, which is of course affected by these racial categories.

The last point is that computer science jargon is alienating to basically everybody who is not trained in computer science, whether they live in the hood or not. And the fact is we presented our work at a computer science venue. Personally, I’m in favor of universal education in computational statistics, but that is a tall order. If our work becomes successful, I could see it becoming part of, for example, a statistical demography curriculum that could be of popular interest. But this is early days.

The Quasi-Racial (QR) Categories are Not Interpretable

In our presentation, we introduced some terminology that did not make it into the paper. We named the vectors of segregation derived by our procedure “quasi-racial” (QR) vectors, to denote that we were trying to capture dimensions that were race-like, in that they captured the patterns of historic and ongoing racial injustice, without being the racial categories themselves, which we argued are inherently unfair categories of inequality.

First, we are not wedded to the name “quasi-racial” and are very open to different terminology if anybody has an idea for something better to call them.

More importantly, somebody pointed out that these QR vectors may not be interpretable. Given that the conference is not only about Fairness, but also Accountability and Transparency, this critique is certainly on point.

To be honest, I have not yet done the work of surveying the extensive literature on algorithm interpretability to get a nuanced response. I can give two informal responses. The first is that one assumption of our proposal is that there is something wrong with how race and racial categories are intuitively understood. Normal people’s understanding of race is, of course, ridden with stereotypes, implicit biases, false causal models, and so on. If we proposed an algorithm that was fully “interpretable” according to most people’s understanding of what race is, that algorithm would likely have racist or racially unequal outcomes. That’s precisely the problem that we are trying to get at with our work. In other words, when categories are inherently unfair, interpretability and fairness may be at odds.

The second response is that educating people about how the procedure works and why it’s motivated is part of what makes its outcomes interpretable. Teaching people about the history of racial categories, and how those categories are both the cause and effect of segregation in space and society, makes the algorithm interpretable. Teaching people about Principal Component Analysis, the algorithm we employ, is part of what makes the system interpretable. We are trying to drop knowledge; I don’t think we are offering any shortcuts.

Principal Component Analysis (PCA) may not be the right technique

An objection from the computer science end of the spectrum was that our proposed use of Principal Component Analysis (PCA) was not well-motivated enough. PCA is just one of many dimensionality reduction techniques–why did we choose it in particular? PCA has many assumptions about the input embedded within it, including that the component vectors of interest are linear combinations of the inputs. What if the best QR representation is a non-linear combination of the input variables? And our use of unsupervised learning, as a general criticism, is perhaps lazy, since in order to validate its usefulness we will need to test it with labeled data anyway. We might be better off with a more carefully calibrated and better motivated alternative technique.
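To make the linearity assumption concrete, here is a minimal sketch of what PCA computes, written in plain Python. It is not the procedure from the paper: the data are synthetic, the three input variables are hypothetical, and the leading component is found by simple power iteration on the covariance matrix rather than by a library routine. The point is only that each score is a weighted sum of the (centered) inputs, which is exactly the assumption the objection targets.

```python
import math
import random


def center(rows):
    """Subtract each column's mean so every variable has mean zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_component(cov, iterations=500):
    """Leading eigenvector of cov (unit length), via power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(rows, component):
    """Score each row as a linear combination of its inputs."""
    return [sum(x * c for x, c in zip(r, component)) for r in rows]


# Synthetic data (an assumption for illustration): two variables driven by
# one shared latent factor, plus a third variable that is pure noise.
rng = random.Random(0)
rows = []
for _ in range(200):
    latent = rng.gauss(0, 1)
    rows.append([
        latent + rng.gauss(0, 0.1),
        2 * latent + rng.gauss(0, 0.1),
        rng.gauss(0, 0.1),
    ])

centered = center(rows)
component = first_component(covariance(centered))
scores = project(centered, component)
print("weights on the inputs:", [round(c, 3) for c in component])
```

The recovered weights put most of the mass on the two variables that share the latent factor and almost none on the noise variable. If the structure of interest were instead a non-linear function of the inputs, no single weight vector of this kind could express it, which is why a non-linear alternative might be worth comparing against.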

These are all fair criticisms. I am personally not satisfied with the technical component of the paper and presentation. I know the rigor of the analysis is not of the standard that would impress a machine learning scholar and can take full responsibility for that. I hope to do better in a future iteration of the work, and welcome any advice on how to do that from colleagues. I’d also be interested to see how more technically skilled computer scientists and formal modelers address the problem of unfair racial categories that we raised in the paper.

I see our main contribution as the raising of this problem of unfair categories, not our particular technical solution to it. As a potential solution, I hope that it’s better than nothing, a step in the right direction, and provocative. I subscribe to the belief that science is an iterative process and look forward to the next cycle of work.

Please feel free to reach out if you have a critique of our work that we’ve missed. We do appreciate all the feedback!

Notes on O’Neil, Chapter 2, “Bomb Parts”

Continuing with O’Neil’s Weapons of Math Destruction on to Chapter 2, “Bomb Parts”. This is a popular book and these are quick chapters. But that’s no reason to underestimate them! This is some of the most lucid work I’ve read on algorithmic fairness.

This chapter talks about three kinds of “models” used in prediction and decision making, with three examples. O’Neil speaks highly of the kinds of models used in baseball to predict the trajectory of hits and determine the optimal placement of people in the field. (Ok, I’m not so good at baseball terms). These are good, O’Neil says, because they are transparent, they are consistently adjusted with new data, and the goals are well defined.

O’Neil then very charmingly writes about the model she uses mentally to determine how to feed her family. She juggles a lot of variables: the preferences of her kids, the nutrition and cost of ingredients, and time. This is all hugely relatable–everybody does something like this. Her point, it seems, is that this form of “model” encodes a lot of opinions or “ideology” because it reflects her values.

O’Neil then discusses recidivism prediction, specifically the LSI-R (Level of Service Inventory–Revised) tool. It asks questions like “How many previous convictions have you had?” and uses that to predict likelihood of future recidivism. The problem is that (a) this is sensitive to overpolicing in neighborhoods, which has little to do with actual recidivism rates (as opposed to rearrest rates), and (b) e.g. black neighborhoods are more likely to be overpoliced, meaning that the tool, which is not very good at predicting recidivism, has disparate impact. This is an example of what O’Neil calls an (eponymous) weapon of math destruction (WMD).

She argues that the three qualities of a WMD are Scale, Opacity, and Damage. Which makes sense.

As I’ve said, I think this is a better take on algorithmic ethics than almost anything I’ve read on the subject before. Why?

First, it doesn’t use the word “algorithm” at all. That is huge, because 95% of the time the use of the word “algorithmic” in the technology-and-society literature is stupid. People use “algorithm” when they really mean “software”. Now, they use “AI System” to mean “a company”. It’s ridiculous.

O’Neil makes it clear in this chapter that what she’s talking about are different kinds of models. Models can be in one’s head (as in her plan for feeding her family) or in a computer, and both kinds of models can be racist. That’s a helpful, sane view. It’s been the consensus of computer scientists, cognitive scientists, and AI types for decades.

The problem with WMDs, as opposed to other, better models, is that the WMD models are unhinged from reality. O’Neil’s complaint is not with use of models, but rather that models are being used without being properly trained using sound sampling on data and statistics. WMDs are not artificial intelligences; they are artificial stupidities.

In more technical terms, it seems like the problem with WMDs is not that they don’t properly trade off predictive accuracy with fairness, as some computer science literature would suggest is necessary. It’s that the systems have high error rates in the first place because the training and calibration systems are poorly designed. What’s worse, this avoidable error is disparately distributed, causing more harm to some groups than others.
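A small sketch of what “disparately distributed” error looks like when you measure it. The records below are made up for illustration and the group labels “A” and “B” are hypothetical; nothing here comes from the LSI-R or from O’Neil’s book. The function computes the false positive rate separately for each group, that is, the share of people who did not reoffend but were flagged anyway.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """records: iterable of (group, predicted_positive, actually_positive).

    Returns {group: false positives / actual negatives} for each group
    that has at least one actual negative.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, predicted, actual in records:
        if not actual:
            negatives[group] += 1
            if predicted:
                false_positives[group] += 1
    return {g: false_positives[g] / negatives[g] for g in negatives}


# Made-up records: both groups have the same number of actual positives
# and negatives, but the tool flags far more of group B's negatives.
records = (
    [("A", True, False)] * 10
    + [("A", False, False)] * 40
    + [("A", True, True)] * 20
    + [("B", True, False)] * 25
    + [("B", False, False)] * 25
    + [("B", True, True)] * 20
)

rates = false_positive_rate_by_group(records)
for group in sorted(rates):
    print(group, rates[group])
```

In this toy data the underlying rates are identical across groups, so the gap in false positive rates is pure avoidable error, the kind that better training and calibration could in principle reduce for everyone.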

This is a wonderful and eye-opening account of unfairness in the models used by automated decision-making systems (note the language). Why? Because it shows that there is a connection between statistical bias, the kind of bias that creates distortions in a quantitative predictive process, and social bias, the kind of bias people worry about politically, and it consistently uses the term in both ways. If there is statistical bias that is weighing against some social group, then that’s definitely, 100% a form of bias.

Importantly, this kind of bias–statistical bias–is not something that every model must have. Only badly made models have it. It’s something that can be mitigated using scientific rigor and sound design. If we see the problem the way O’Neil sees it, then we can see clearly how better science, applied more rigorously, is also good for social justice.

As a scientist and technologist, it’s been terribly discouraging in the past years to be so consistently confronted with a false dichotomy between sound engineering and justice. At last, here’s a book that clearly outlines how the opposite is the case!

Reading O’Neil’s Weapons of Math Destruction

I probably should have already read Cathy O’Neil’s Weapons of Math Destruction. It was a blockbuster of the tech/algorithmic ethics discussion. It’s written by an accomplished mathematician, which I admire. I’ve also now seen O’Neil perform bluegrass music twice in New York City and think her band is great. At last I’ve found a copy and have started to dig in.

On the other hand, as is probably clear from other blog posts, I have a hard time swallowing a lot of the gloomy political work that puts the role of algorithms in society in such a negative light. I encounter it very frequently, and every time feel that some misunderstanding must have happened; something seems off.

It’s very clear that O’Neil can’t be accused of mathophobia or not understanding the complexity of the algorithms at play, which is an easy way to throw doubt on the arguments of some technology critics. Yet perhaps because it’s a popular book and not an academic work of Science and Technology Studies, I haven’t seen its arguments parsed through and analyzed in much depth.

This is a start. These are my notes on the introduction.

O’Neil describes the turning point in her career where she soured on math. After being an academic mathematician for some time, O’Neil went to work as a quantitative analyst for D.E. Shaw. She saw it as an opportunity to work in a global laboratory. But then the 2008 financial crisis made her see things differently.

The crash made it all too clear that mathematics, once my refuge, was not only deeply entangled in the world’s problems but also fueling many of them. The housing crisis, the collapse of major financial institutions, the rise of unemployment–all had been aided and abetted by mathematicians wielding magic formulas. What’s more, thanks to the extraordinary powers that I loved so much, math was able to combine with technology to multiply the chaos and misfortune, adding efficiency and scale to systems I now recognized as flawed.

O’Neil, Weapons of Math Destruction, p.2

As an independent reference on the causes of the 2008 financial crisis, which of course has been a hotly debated and disputed topic, I point to Sassen’s 2017 “Predatory Formations” article. Indeed, the systems that developed the sub-prime mortgage market were complex, opaque, and hard to regulate. Something went seriously wrong there.

But was it mathematics that was the problem? This is where I get hung up. I don’t understand the mindset that would attribute a crisis in the financial system to the use of abstract, logical, rigorous thinking. Consider the fact that there would not have been a financial crisis if there had not been a functional financial services system in the first place. Getting mortgages and paying them off, and the systems that allow this to happen, all require mathematics to function. When these systems operate normally, they are taken for granted. When they suffer a crisis, when the system fails, the mathematics takes the blame. But a system can’t suffer a crisis if it didn’t start working rather well in the first place–otherwise, nobody would depend on it. Meanwhile, the regulatory reaction to the 2008 financial crisis required, of course, more mathematicians working to prevent the same thing from happening again.

So in this case (and I believe others) the question can’t be whether mathematics, but rather which mathematics. It is so sad to me that these two questions get conflated.

O’Neil goes on to describe a case where an algorithm results in a teacher losing her job for not adding enough value to her students one year. An analysis makes a good case that the cause of her students’ scores not going up is that in the previous year, the students’ scores were inflated by teachers cheating the system. This argument was not considered conclusive enough to change the administrative decision.

Do you see the paradox? An algorithm processes a slew of statistics and comes up with a probability that a certain person might be a bad hire, a risky borrower, a terrorist, or a miserable teacher. That probability is distilled into a score, which can turn someone’s life upside down. And yet when the person fights back, “suggestive” countervailing evidence simply won’t cut it. The case must be ironclad. The human victims of WMDs, we’ll see time and again, are held to a far higher standard of evidence than the algorithms themselves.

O’Neil, WMD, p.10

Now this is a fascinating point, and one that I don’t think has been taken up enough in the critical algorithms literature. It resonates with a point that came up earlier, that traditional collective human decision making is often driven by agreement on narratives, whereas automated decisions can be a qualitatively different kind of collective action because they can be based on probabilistic judgments.

I have to wonder what O’Neil would argue the solution to this problem is. From her rhetoric, it seems like her recommendation must be to prevent automated decisions from making probabilistic judgments. In other words, one could raise the evidentiary standard for algorithms so that they were equal to the standards that people use with each other.

That’s an interesting proposal. I’m not sure what the effects of it would be. I expect that the result would be lower expected values of whatever target was being optimized for, since the system would not be able to “take bets” below a certain level of confidence. One wonders if this would be a more or less arbitrary system.
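Here is a toy version of that expectation, under assumptions I am making up for illustration: each candidate decision is a “bet” with a known success probability, a success pays a gain of 1, a failure costs a loss of 1, and the system only takes bets whose probability clears an evidentiary threshold. The probabilities below are hypothetical, not drawn from any real system.

```python
def expected_total(probabilities, threshold, gain=1.0, loss=1.0):
    """Sum of expected payoffs over the bets the system is allowed to take.

    A bet with success probability p is taken only if p >= threshold;
    its expected payoff is p * gain - (1 - p) * loss.
    """
    total = 0.0
    for p in probabilities:
        if p >= threshold:
            total += p * gain - (1 - p) * loss
    return total


# Hypothetical success probabilities for six candidate decisions.
probabilities = [0.55, 0.6, 0.7, 0.8, 0.9, 0.95]

# A system free to act on any favorable odds takes all six bets.
low_bar = expected_total(probabilities, threshold=0.5)

# A system held to a near-"ironclad" standard takes only the last two.
high_bar = expected_total(probabilities, threshold=0.9)

print("threshold 0.5:", round(low_bar, 2))
print("threshold 0.9:", round(high_bar, 2))
```

With symmetric gains and losses, every bet above 0.5 has positive expected payoff, so refusing the ones between 0.5 and 0.9 can only lower the total. The sketch says nothing about who bears the losses from the bets that fail, which is of course the part O’Neil cares about.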

Sadly, in order to evaluate this proposal seriously, one would have to employ mathematics. Which is, in O’Neil’s rhetoric, a form of evil magic. So, perhaps it’s best not to try.

O’Neil attributes the problems of WMDs to the incentives of the data scientists building the systems. Maybe they know that their work affects people, especially the poor, in negative ways. But they don’t care.

But as a rule, the people running the WMD’s don’t dwell on these errors. Their feedback is money, which is also their incentive. Their systems are engineered to gobble up more data and fine-tune their analytics so that more money will pour in. Investors, of course, feast on these returns and shower WMD companies with more money.

O’Neil, WMD, p.13

Calling out greed as the problem is effective and true in a lot of cases. I’ve argued myself that the real root of the technology ethics problem is capitalism: the way investors drive what products get made and deployed. This is a worthwhile point to make and one that doesn’t get made enough.

But the logical implications of this argument are off. Suppose it is true that “as a rule”, algorithms that do harm are made by people responding to the incentives of private capital. (IF harmful algorithm, THEN private capital created it.) That does not mean that there can’t be good algorithms as well, such as those created in the public sector. In other words, there are algorithms that are not WMDs.

So the insight here has to be that private capital investment corrupts the process of designing algorithms, making them harmful. One could easily make the case that private capital investment corrupts and makes harmful many things that are not algorithmic as well. For example, the historic trans-Atlantic slave trade was a terribly evil manifestation of capitalism. It did not, as far as I know, depend on modern day computer science.

Capitalism here looks to be the root of all evil. The fact that companies are using mathematics is merely incidental. And O’Neil should know that!

Here’s what I find so frustrating about this line of argument. Mathematical literacy is critical for understanding what’s going on with these systems and how to improve society. O’Neil certainly has this literacy. But there are many people who don’t have it. There is a power disparity there which is uncomfortable for everybody. But while O’Neil is admirably raising awareness about how these kinds of technical systems can and do go wrong, the single-minded focus and framing risks giving people the wrong idea that these intellectual tools are always bad or dangerous. That is not a solution to anything, in my view. Ignorance is never more ethical than education. But there is an enormous appetite among ignorant people for being told that it is so.

References

O’Neil, Cathy. Weapons of math destruction: How big data increases inequality and threatens democracy. Broadway Books, 2017.

Sassen, Saskia. “Predatory Formations Dressed in Wall Street Suits and Algorithmic Math.” Science, Technology and Society 22.1 (2017): 6-20.

computational institutions as non-narrative collective action

Nils Gilman recently pointed to a book chapter that confirms the need for “official futures” in capitalist institutions.

Nils indulged me in a brief exchange that helped me better grasp a bothersome puzzle.

There is a certain class of intellectuals that insist on the primacy of narratives as a mode of human experience. These tend to be, not too surprisingly, writers and other forms of storytellers.

There is a different class of intellectuals that insists on the primacy of statistics. Statistics does not make it easy to tell stories because it is largely about the complexity of hypotheses and our lack of confidence in them.

The narrative/statistic divide could be seen as a divide between academic disciplines. It has often been taken to be, I believe wrongly, the crux of the “technology ethics” debate.

I questioned Nils as to whether his generalization stood up to statistically driven allocation of resources; i.e., those decisions made explicitly on probabilistic judgments. He argued that in the end, management and collective action require consensus around narrative.

In other words, what keeps narratives at the center of human activity is that (a) humans are in the loop, and (b) humans are collectively in the loop.

The idea that communication is necessary for collective action is one I used to put great stock in when studying Habermas. For Habermas, consensus, and especially linguistic consensus, is how humanity moves together. Habermas contrasted this mode of knowledge aimed at consensus and collective action with technical knowledge, which is aimed at efficiency. Habermas envisioned a society ruled by communicative rationality, deliberative democracy; following this line of reasoning, this communicative rationality would need to be a narrative rationality. Even if this rationality is not universal, it might, in Habermas’s later conception of governance, be shared by a responsible elite. Lawyers and a judiciary, for example.

The puzzle that recurs again and again in my work has been the challenge of communicating how technology has become an alternative form of collective action. The claim made by some that technologists are a social “other” makes more sense if one sees them (us) as organizing around non-narrative principles of collective behavior.

It is, I believe, beyond serious dispute that well-constructed, statistically based collective decision-making processes perform better than many alternatives. In the field of future predictions, Philip Tetlock’s work on superforecasting teams and prior work on expert political judgment has long stood as an empirical challenge to the supposed primacy of narrative-based forecasting. This challenge has not been taken up; it seems rather one-sided. One reason for this may be that the rationale for the effectiveness of these techniques rests ultimately in the science of statistics.

It is now common to insist that Artificial Intelligence should be seen as a sociotechnical system and not as a technological artifact. I wholeheartedly agree with this position. However, it is sometimes implied that to understand AI as a social+ system, one must understand it in narrative terms. This is an error; it would imply that the collective actions made to build an AI system and the technology itself are held together by narrative communication.

But if the whole purpose of building an AI system is to collectively act in a way that is more effective because of its facility with the nuances of probability, then the narrative lens will miss the point. The promise and threat of AI is that it delivers a different, often more effective form of collective or institution. I’ve suggested that computational institution might be the best way to refer to such a thing.

State regulation and/or corporate self-regulation

The dust from the recent debates about regulation versus industrial self-regulation in the data/tech/AI industry appears to be settling. The smart money is on regulation and self-regulation being complementary for attaining the goal of an industry dominated by responsible actors. This trajectory leads to centralized corporate power that is led from the top; it is a Hamiltonian, not Jeffersonian, solution, in Pasquale’s terms.

I am personally not inclined towards this solution. But I have been convinced to see it differently after a conversation today about environmentally sustainable supply chains in food manufacturing. Nestle, for example, has been internally changing its sourcing practices to more sustainable chocolate. It’s able to finance this change from its profits, and when it does change its internal policy, it operates on a scale that’s meaningful. It is able to make this transition in part because non-profits, NGOs, and farmers’ cooperatives lay the groundwork for sustainable sourcing external to the company. This lowers the barriers to having Nestle switch over to new sources–they have already been subsidized through philanthropy and international aid investments.

Supply chain decisions, ‘make-or-buy’ decisions, are the heart of transaction cost economics (TCE) and critical to the constitution of institutions in general. What this story about sustainable sourcing tells us is that the configuration of private, public, and civil society institutions is complex, and that there are prospects for agency and change in the reconfiguration of those relationships. This is no different in the ‘tech sector’.

However, this theory of economic and political change is not popular; it does not have broad intellectual or media appeal. Why?

One reason may be that while it is a critical part of social structure, much of the supply chain is in the private sector, and hence is opaque. This is not a matter of transparency or interpretability of algorithms. This is about the fact that private institutions, by virtue of being ‘private’, do not have to report everything that they do and, probably, shouldn’t. But since so much of what is done by the massive private sector is of public import, there’s a danger of the privatization of public functions.

Another reason why this view of political change through the internal policy-making of enormous private corporations is unpopular is that it leaves decision-making up to a very small number of people–the elite managers of those corporations. The real disparity of power involved in private corporate governance means that the popular attitude towards that governance is, more often than not, irrelevant. Even less so than political elites, corporate elites are not accountable to a constituency. They are accountable, I suppose, to their shareholders, who have material interests disconnected from political will.

This disconnected shareholder will is one of the main reasons why I’m skeptical about the idea that large corporations and their internal policies are where we should place our hopes for moral leadership. But perhaps what I’m missing is the appropriate intellectual framework for how this will is shaped and what drives these kinds of corporate decisions. I still think TCE might provide insights that I’ve been missing. But I am on the lookout for other sources.

Ordoliberalism and industrial organization

There’s a nice op-ed by Wolfgang Münchau in FT, “The crisis of modern liberalism is down to market forces”.

Among other things, it reintroduces the term “ordoliberalism”, a particular Germanic kind of enlightened liberalism designed to prevent the kind of political collapse that had precipitated the war.

In Münchau’s account, the key insight of ordoliberalism is its attention to questions of social equality, but not through the mechanism of redistribution. Rather, ordoliberal interventions primarily affect industrial organization, favoring small to mid-sized companies.

As Germany’s economy remains robust and so far relatively politically stable, it’s interesting that ordoliberalism isn’t discussed more.

Another question that must be asked is to what extent the rise of computational institutions challenges the kind of industrial organization recommended by ordoliberalism. If computation induces corporate concentration, and there are not good policies for addressing that, then that’s due to a deficiency in our understanding of what ‘market forces’ are.

When *shouldn’t* you build a machine learning system?

Luke Stark raises an interesting question, directed at “ML practitioner”:

As an “ML practitioner” in on this discussion, I’ll have a go at it.

In short, one should not build an ML system for making a class of decisions if there is already a better system for making that decision that does not use ML.

An example of a comparable system that does not use ML would be a team of human beings with spreadsheets, or a team of people employed to judge for themselves.

There are a few reasons why a non-ML system could be superior in performance to an ML system:

  • The people involved could have access to more data, in the course of their lives, in more dimensions of variation, than is accessible by the machine learning system.
  • The people might have more sensitized ability to make semantic distinctions, such as in words or images, than an ML system
  • The problem to be solved could be a “wicked problem” that is itself over a very high-dimensional space of options, with very irregular outcomes, such that they are not amenable to various forms of, e.g., linear approximations
  • The people might be judging an aspect of their own social environment, such that the outcome’s validity is socially procedural (as in the outcome of a vote, or of an auction)

These are all fine reasons not to use an ML system. On the other hand, the term “ML” has been extended, as with “AI”, to include many hybrid human-computer systems, which has led to some confusion. So, for example, crowdsourced labels of images provide useful input data to ML systems. This hybrid system might perform semantic judgments over a large scale of data, at a high speed, at a tolerable rate of accuracy. Does this system count as an ML system? Or is it a form of computational institution that rivals other ways of solving the problem, and just so happens to have a machine learning algorithm as part of its process?

Meanwhile, the research frontier of machine learning is all about trying to solve problems that previously haven’t been solved, or solved as well, by alternative kinds of systems. This means there will always be a disconnect between machine learning research, which is trying to expand what it is possible to do with machine learning, and what machine learning should, today, be deployed. Sometimes, research is done to develop technology that is not mature enough to deploy.

We should expect that a lot of ML research is done on things that should not ultimately be deployed! That’s because until we do the research, we may not understand the problem well enough to know the consequences of deployment. There’s a real sense in which ML research is about understanding the computational contours of a problem, whereas ML industry practice is about addressing the problems customers have with an efficient solution. Often this solution is a hybrid system in which ML only plays a small part; the use of ML here is really about a change in the institutional structure, not so much a part of what service is being delivered.

On the other hand, there have been a lot of cases–search engines and social media being important ones–where the scale of data and the use of ML for processing has allowed for a qualitatively different form of product or service. These are now the big deal companies we are constantly talking about. These are pretty clearly cases of successful ML.

computational institutions

As the “AI ethics” debate metastasizes in my newsfeed and scholarly circles, I’m struck by the frustrations of technologists and ethicists who seem to be speaking past each other.

While these tensions play out along disciplinary fault-lines, for example, between technologists and science and technology studies (STS), the economic motivations are more often than not below the surface.

I believe this is to some extent a problem of the nomenclature, which is again the function of the disciplinary rifts involved.

Computer scientists work, generally speaking, on the design and analysis of computational systems. Many see their work as bounded by the demands of the portability and formalizability of technology (see Selbst et al., 2019). That’s their job.

This is endlessly unsatisfying to critics of the social impact of technology. STS scholars will insist on changing the subject to “sociotechnical systems”, a term that means something very general: the assemblage of people and artifacts that are not people. This, fairly, removes focus from the computational system and embeds it in a social environment.

A goal of this kind of work seems to be to hold computational systems, as they are deployed and used socially, accountable. It must be said that once this happens, we are no longer talking about the specialized domain of computer science per se. It is a wonder why STS scholars are so often picking fights with computer scientists, when their true beef seems to be with businesses that use and deploy technology.

The AI Now Institute has attempted to rebrand the problem by discussing “AI Systems” as, roughly, those sociotechnical systems that use AI. This is on the one hand more specific–AI is a particular kind of technology, and perhaps it has particular political consequences. But their analysis of AI systems quickly overflows into sweeping claims about “the technology industry”, and it’s clear that most of their recommendations have little to do with AI, and indeed are trying, once again, to change the subject from discussion of AI as a technology (a computer science research domain) to a broader set of social and political issues that do, in fact, have their own disciplines where they have been researched for years.

The problem, really, is not that any particular conversation is not happening, or is being excluded, or is being shut down. The problem is that the engineering-focused conversation about AI-as-a-technology has grown very large and become an awkward synecdoche for the rise of major corporations like Google, Apple, Amazon, Facebook, and Netflix. As these corporations fund and motivate a lot of research, there’s a question of who is going to get pieces of the big pie of opportunity these companies represent, either in terms of research grants or impact due to regulation, education, etc.

But there are so many aspects of these corporations that are addressed neither by the term “sociotechnical system”, which is just so broad, nor by “AI System”, which is as broad and rarely means what you’d think it does (that the system uses AI is incidental if not unnecessary; what matters is that it’s a company operating in a core social domain via primarily technological user interfaces). Neither of these gets at the unit of analysis that’s really of interest.

An alternative: “computational institution”. Computational, in the sense of computational cognitive science and computational social science: it denotes the essential role of theory of computation and statistics in explaining the behavior of the phenomenon being studied. “Institution”, in the sense of institutional economics: the unit is a firm, which is comprised of people, their equipment, and their economic relations, to their suppliers and customers. An economic lens would immediately bring into focus “the data heist” and the “role of machines” that Nissenbaum is concerned are being left to the side.

The secret to social forms has been in institutional economics all along?

A long-standing mystery for me has been about the ontology of social forms (1) (2): under what conditions is it right to call a particular assemblage of people a thing, and why? Most people don’t worry about this; in literatures I’m familiar with it’s easy to take a sociotechnical complex or assemblage, or a company, or whatever, as a basic unit of analysis.

A lot of the trickiness comes from thinking about this as a problem of identifying social structure (Sawyer, 2000; Cederman, 2005). This implies that people are in some sense together and obeying shared norms, and raises questions about whether those norms exist in their own heads or not, and so on. So far I haven’t seen a lot that really nails it.

But what if the answer has been lurking in institutional economics all along? The “theory of the firm” is essentially a question of why a particular social form–the firm–exists as opposed to a bunch of disorganized transactions. The answers that have come up are quite good.

Take for example Holmstrom (1982), who argues that in a situation where collective outcomes depend on individual efforts, individuals will be tempted to free-ride. That makes it beneficial to have somebody monitor the activities of the other people and have their utility be tied to the net success of the organization. That person becomes the owner of the company, in a capitalist firm.

What’s nice about this example is that it explains social structure based on an efficiency argument; we would expect organizations shaped like this to be bigger and command more resources than others that are less well organized. And indeed, we have many enormous hierarchical organizations in the wild to observe!
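The free-rider logic can be made concrete with a minimal numerical sketch. The functional forms here are my own simplifying assumptions, not Holmstrom’s model: team output is the sum of individual efforts, output is split equally among n members, and each member bears a private effort cost of effort squared over two. All function names are hypothetical.

```python
def payoff(own_effort, others_effort, n):
    """One member's payoff: equal share of team output minus private effort cost.

    Assumed forms (not taken from Holmstrom's paper): output is the sum of
    efforts, and the cost of effort is own_effort**2 / 2.
    """
    team_output = own_effort + others_effort
    return team_output / n - own_effort ** 2 / 2


def team_surplus(effort, n):
    """Total output minus total effort cost when all n members exert `effort`."""
    return n * effort - n * effort ** 2 / 2


# Candidate effort levels from 0.0 to 2.0 in steps of 0.002.
GRID = [2 * i / 1000 for i in range(1001)]


def best_response(others_effort, n):
    """Effort that maximizes one member's own payoff, found by grid search."""
    return max(GRID, key=lambda e: payoff(e, others_effort, n))


def efficient_effort(n):
    """Common effort level that maximizes total team surplus, by grid search."""
    return max(GRID, key=lambda e: team_surplus(e, n))


if __name__ == "__main__":
    for n in (1, 2, 4, 10):
        print(n, best_response(0.0, n), efficient_effort(n))
```

Under these assumptions each member’s privately optimal effort is 1/n, while the effort that maximizes the team’s total surplus is 1 regardless of team size, so the shortfall grows as the team gets bigger. That gap is the room a monitor with a claim on the net outcome has to work with.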

Another theory of the firm is Williamson’s transaction cost economics (TCE) theory, which is largely about the make-or-buy decision. If the transaction between a business and its supplier has “asset specificity”, meaning that the asset being traded is specific to the two parties and their transaction, then any investment from either party will induce a kind of ‘lock-in’ or ‘switching cost’ or, in Williamson’s language, a ‘bilateral dependence’. The more of that dependence, the more a free market relationship between the two parties will expose them to opportunistic hazards. Hence, complex contracts, or in the extreme case outright ownership and internalization, tie the firms together.
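A toy sketch of the make-or-buy comparison, in the spirit of Williamson’s heuristic of governance costs rising with asset specificity. The linear cost curves, the parameter values, and the function names are all hypothetical choices for illustration; the only substantive assumption is that the market is cheaper at zero specificity but its costs rise faster as specificity grows.

```python
def governance_cost(fixed, slope, specificity):
    """Cost of governing a transaction under one mode (assumed linear form)."""
    return fixed + slope * specificity


def choose_governance(specificity, market=(0.0, 2.0), hierarchy=(3.0, 0.5)):
    """Return "buy" if the market is no more costly than in-house, else "make".

    `market` and `hierarchy` are (fixed, slope) pairs. The defaults are
    made-up numbers: the market has no overhead but is exposed to hazards
    that grow quickly with specificity; hierarchy has bureaucratic overhead
    but handles specific assets at lower marginal cost.
    """
    market_cost = governance_cost(*market, specificity)
    hierarchy_cost = governance_cost(*hierarchy, specificity)
    return "buy" if market_cost <= hierarchy_cost else "make"


if __name__ == "__main__":
    for k in (0.0, 1.0, 2.0, 3.0, 5.0):
        print(k, choose_governance(k))
```

With these made-up parameters the two cost lines cross at a specificity of 2: below that the firm buys on the market, above it the firm internalizes the transaction.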

I’d argue: bilateral dependence and the complex ‘contracts’ that connect entities are very much the stuff of “social forms”. Cooperation between people is valuable; the relation between people who cooperate is valuable as a consequence; and so both parties are ‘structurated’ (to mangle a Giddens term) individually into maintaining the reality of the relation!

References

Cederman, L.E., 2005. Computational models of social forms: Advancing generative process theory. American Journal of Sociology, 110(4), pp.864-893.

Holmstrom, Bengt. “Moral hazard in teams.” The Bell Journal of Economics (1982): 324-340.

Sawyer, R. Keith. “Simulating emergence and downward causation in small groups.” Multi-agent-based simulation. Springer Berlin Heidelberg, 2000. 49-67.

Williamson, Oliver E. “Transaction cost economics.” Handbook of new institutional economics. Springer, Berlin, Heidelberg, 2008. 41-65.

Transaction cost economics and privacy: looking at Hoofnagle and Whittington’s “Free”

As I’ve been reading about transaction cost economics (TCE) and independently scrutinizing the business model of search engines, it stands to reason that I should look to the key paper holding down the connection between TCE and privacy, Hoofnagle and Whittington’s “Free: Accounting for the Costs of the Internet’s Most Popular Price” (2014).

I want to preface the topic by saying I stand by what I wrote earlier: that at the heart of what’s going on with search engines, you have a trade of attention; it requires imagining that the user has attention-time as a scarce resource. The user has a query and has the option to find material relevant to the query in a variety of ways (like going to a library). Often (!) they will do so in a way that costs them as little attention as possible: they use a search engine, which gives an almost instant and often high-quality response; they are also shown advertisements which consume some small amount of their attention, but less than they would expend searching through other means. Advertisers pay the search engine for this exposure to the user’s attention, which funds the service that is “free”, in dollars (but not in attention) to the users.

Hoofnagle and Whittington make a very different argument about what’s going on with “free” web services, which includes free search engines. They argue that the claim that these web services are “free” is deceptive because the user may incur costs after the transaction on account of potential uses of their personal data. An example:

The freemium business model Anderson refers to is popular among industries online. Among them, online games provide examples of free services with hidden costs. By prefacing play with the disclosure of personal identification, the firms that own and operate games can contact and monitor each person in ways that are difficult for the consumer to realize or foresee. This is the case for many games, including Disney’s “Club Penguin,” an entertainment website for children. After providing personal information to the firm, consumers of Club Penguin receive limited exposure to basic game features and can see numerous opportunities to enrich their play with additional features. In order to enrich the free service, consumers must buy all sort of enhancements, such as an upgraded igloo or pets for one’s penguin. Disney, like others in the industry, places financial value on the number of consumers it identifies, the personal information they provide, and the extent to which Disney can track consumer activity in order to modify the game and thus increase the rate of conversion of consumers from free players to paying customers.

There are a number of claims here. Let’s enumerate them:

  1. This is an example of a ‘free’ service with hidden costs to users.
  2. The consumer doesn’t know what the game company will do with their personal information.
  3. In fact, the game will use the personal information to personalize pitches for in-game purchases that ‘enrich’ the free service.
  4. The goal of the company is to convert free players to paying customers.

Working backwards, claim (4) is totally true. The company wants to make money by getting their customers to pay, and they will use personal information to make paying attractive to the customers (3). But this does not mean that the customer is always unwitting. Maybe children don’t understand the business model when they begin playing Club Penguin, but especially today parents certainly do. App Stores, for example, now label apps when they have “in-app purchases”, which is a pretty strong signal. Perhaps this is a recent change due to some saber rattling by the FTC, which to be fair would be attributable as a triumph to the authors if this article had influence on getting that to happen. On the other hand, this is a very simple form of customer notice.

I am not totally confident that (2), (3), and (4), even if true, entail (1): that there are “hidden costs” to free services. Elsewhere, Hoofnagle and Whittington raise more convincing examples of “costs” to release of PII, including being denied a job and resolving identity theft. But being convincingly sold an upgraded igloo for your digital penguin seems so trivial. Even if it’s personalized, how could it be a hidden cost? It’s a separate transaction, no? Do you or do you not buy the igloo?

Parsing this through requires, perhaps, a deeper look at TCE. According to TCE, agents are boundedly rational (they can’t know everything) and opportunistic (they will make an advantageous decision in the moment). Meanwhile, the world is complicated. These conditions imply that there’s a lot of uncertainty about future behavior, as agents will act strategically in ways that they can’t themselves predict. Nevertheless, agents engage in contracts with some kinds of obligations in them in the course of a transaction. TCE’s point is that these contracts are always incomplete, meaning that there are always uncertainties left unresolved in contracts that will need to be negotiated in certain contingent cases. All these costs of drafting, negotiating, and safeguarding the agreement are transaction costs.

Take an example of software contracting, which I happen to know about from personal experience. A software vendor gets a contract from a client to do some customization. The client and the vendor negotiated some sort of scope of work ex ante. But always(!), the client doesn’t actually know what they want, and if the vendor delivers on the specification literally the client doesn’t like it. Then begins the ex post negotiation as the client tries to get the vendor to tweak the system into something more usable.

Software contracting often resolves this by getting off the fixed-price contracting model and onto a time-and-materials contract that allows billing by hours of developer time. Alternatively, the vendor can internalize the costs into the contract by inflating the cost “estimates” to cover for contingencies. In general, this all amounts to having more contract and a stronger relationship between the client and vendor, a “bilateral dependency” which TCE sees as a natural evolution of the incomplete contract under several common conditions, like “asset specificity”, which means that the asset is specialized to a particular transaction (or the two agents involved in it). Another term for this is lock-in, or the presence of high switching costs, though this way of thinking about it reintroduces the idea of a classical market for essentially comparable goods and services that TCE is designed to militate against. This explains how technical dependencies of an organization become baked in more or less constitutionally as part of the organization, leading to the robustness of the installed base of a computing platform over time.

This ebb and flow of contract negotiation with software vendors was a bit unsettling to me when I first encountered it on the job, but I think it’s safe to say that most people working in the industry accept this as How Things Work. Perhaps it’s the continued influence of orthodox economics that makes this all seem inefficient somehow, and TCE is the right way to conceptualize things that makes better sense of reality.

But back to the Penguins…

Hoofnagle and Whittington make the case that sharing PII with a service that then personalizes its offerings to you creates a kind of bilateral dependence between service and user. They also argue that loss of privacy, due to the many possible uses of this personal information (some nefarious), is a hidden cost that can be thought of as an ex post transaction cost that is a hazard because it has not been factored into the price ex ante. The fact that this data is valuable to the platform/service for paying their production costs, which is not part of the “free” transaction, is an indication that this data is a lot more valuable than consumers think it is.

I am still on the fence about this.

I can’t get over the feeling that successfully selling a user a personalized, upgraded digital igloo is such an absurd example of a “hidden cost” that it belies the whole argument that these services have hidden costs.

Splitting hairs perhaps, it seems reasonable to say that Club Penguin has a free version, which is negotiated as one transaction. Then, conditional on the first transaction, it offers personalized igloos for real dollars. This purchase, if engaged in, would be another, different transaction, not an ex post renegotiation of the original contract with Disney. This small difference changes the cost of the igloo from a hidden transaction cost into a normal, transparent cost. So it’s no big deal!

Does the use of PII create a bilateral dependence between Disney and the users of Club Penguin? Yes, in a sense. Any application of attention to an information service, learning how to use it and getting it to be part of your life, is in a sense a bilateral dependence with a switching cost. But there are so many other free games to play on the internet that these costs seem hardly hidden. They could just be understood as part of the game. Meanwhile, we are basically unconcerned with Disney’s “dependence” on the consumer data, because Disney can get new users easily (unless the user is a “whale”, who actually pays the company). And any “dependence” Disney has on particular users is a hidden cost for Disney, not for the user, and who cares about Disney.

The cases of identity theft or job loss are strange cases that seem to have more to do with freaky data reuse than what’s going on with a particular transaction. Purpose binding notices and restrictions, which are becoming the norm through generalized GDPR compliance, seem adequate to deal with these cases.

So, I have two conclusions:

(1) Maybe TCE is the right lens for making an economic argument for why purpose binding restrictions are a good idea. They make transactions with platforms less incomplete, avoiding the moral hazard of ex post use of data in ways that incur asymmetrically unknown effects on users.

(2) This TCE analysis of platforms doesn’t address the explanatorily powerful point that attention is part of the trade. In addition to being concretely what the user is “giving up” to the platform and directly explaining monetization in some circumstances, the fact that attention is “sticky” and creates some amount of asset-specific learning is a feature of the information economy more generally. Maybe it needs a closer look.

References

Hoofnagle, Chris Jay, and Jan Whittington. “Free: accounting for the costs of the internet’s most popular price.” UCLA L. Rev. 61 (2013): 606.