Digifesto

A few brief notes towards “Procuring Cybersecurity”

I’m shifting research focus a bit and wanted to jot down a few notes. The context for the shift is that I have the pleasure of organizing a roundtable discussion for NYU’s Center for Cybersecurity and Information Law Institute, working closely with Thomas Streinz of NYU’s Guarini Global Law and Tech.

The context for the workshop is the steady feed of news about global technology supply chains and how they are not just relevant to “cybersecurity”, but in some respects are constitutive of cyberinfrastructure and hence the field of its security.

I’m using “global technology supply chains” rather loosely here, but this includes:

  • Transborder personal data flows as used in e-commerce
  • Software- (and Infrastructure-)-as-a-Service being marketed internationally (Google services used abroad, for example)
  • Enterprise software import/export
  • Electronics manufacturing and distribution.

Many concerns about cybersecurity as a global phenomenon circulate around the imagined or actual supply chain. These are sometimes national security concerns that result in real policy, as when Australia recently banned Huawei and ZTE from supplying 5G network equipment for fear that it would provide a vector of interference from the Chinese government.

But the nationalist framing is certainly not the whole story. I’ve heard anecdotally that after the Snowden revelations, Microsoft internally began to see the U.S. government as a cybersecurity “adversary”. Corporate tech vendors naturally don’t want to be known as vectors for national surveillance, as this cuts into their global market share.

Governments and corporations have different cybersecurity incentives and threat models. These models intersect and themselves create the dynamic cybersecurity field. For example, the Chinese government has viewed foreign software vendors as cybersecurity threats and has responded by mandating source code disclosure. But as this is a vector of potential IP theft, foreign vendors have balked, seeing the mandate itself as a threat (Ahmed and Weber, 2018). Complicating things further, a defensive “cybersecurity” measure can also serve the goal of protecting domestic technology innovation, which can be framed as providing a nationalist “cybersecurity” edge in the long run.

What, if anything, prevents a total cyberwar of all against all? One answer is trade agreements that level the playing field, or at least establish rules for the game. Another is open technology and standards, which provide an alternative field driven by the benefits of interoperability rather than proprietary interest and secrecy. Is it possible to capture any of this in an accurate model or theory?

I love having the opportunity to explore these questions, as they are at the intersection of my empirical work on software supply chains (Benthall et al., 2016; Benthall, 2017) and my theoretical work on data economics in my dissertation. My hunch for some time has been that there’s a dearth of solid economic theory for the contemporary digital economy, and this is one way of getting at that.

References

Ahmed, S., & Weber, S. (2018). China’s long game in techno-nationalism. First Monday, 23(5). 

Benthall, S., Pinney, T., Herz, J. C., Plummer, K., & Rostrup, S. (2016). An ecological approach to software supply chain risk management. In 15th Python in Science Conference.

Benthall, S. (2017, September). Assessing software supply chain risk using public data. In 2017 IEEE 28th Annual Software Technology Conference (STC) (pp. 1-5). IEEE.

Why STS is not the solution to “tech ethics”

“Tech ethics” are in (1) (2) (3), and a popular refrain at FAT* this year was that sensitivity to social and political context is the solution to the problems of unethical technology. How do we bring this sensitivity to technical design? Dobbe and Ames, as well as Selbst et al. (2019), variously argue that the answer is the techniques of Science and Technology Studies (STS). Value Sensitive Design (VSD) (Friedman and Bainbridge, 2004) is one typical STS technique proposed for bringing this political awareness into the design process. In general, there is broad agreement that computer scientists should be working with social scientists when developing socially impactful technologies.

In this blog post, I argue that STS is not the solution to “tech ethics” that it tries to be.

Encouraging computer scientists to collaborate with social science domain experts is a great idea. My paper with Bruce Haynes (1) (2) (3) is an example of this kind of work. In it, we drew from the sociology of race to inform a technical design that addressed the unfairness of racial categories. Significantly, in my view, we did not use STS in our work. Because the social injustices we were addressing were due to broad-reaching social structures and politically constructed categories, we used sociology to elucidate what was at stake and what sorts of interventions would be a good idea.

It is important to recognize that there are many different social sciences dealing with “social and political context”, and that STS, despite its interdisciplinarity, is only one of them. This is easily missed in an interdisciplinary venue in which STS is active, because STS is somewhat activist in asserting its own importance in these venues. In a sense, STS frequently positions itself as a reminder to blindered technologists that there is a social world out there. “Let me tell you about what you’re missing!” That’s its shtick. Because of this positioning, STS scholars frequently get a seat at the table with scientists and technologists. It’s a powerful position, in a sense.

What STS scholars tend to ignore is how and when other kinds of social scientists involve themselves in the process of technical design. For example, at FAT* this year there were two full tracks of Economic Models. Economic Models. Economics is a well-established social scientific discipline that has tools for understanding how a particular mechanism can have unintended effects when put into a social context. In economics, this is called “mechanism design”. It addresses what Selbst et al. might call the “Ripple Effect Trap”: the fact that a system in context may have effects that are different from the intentions of its designers. I’ve argued before that wiser economics is something we need to better address technology ethics, especially if we are talking about technology deployed by industry, which is most of it! But despite deep and systematic social scientific analysis of secondary and equilibrium effects at the conference, these peer-reviewed works are not acknowledged by STS interventionists. Why is that?

As usual, quantitative social scientists are completely ignored by STS-inspired critiques of technologists and their ethics. That is too bad, because at the scale at which these technologies are operating (mainly, we are discussing civic- or web-scale automated decision making systems that are inherently about large numbers of people), fuzzier debates about “values” and contextualized impact would surely benefit from quantitative operationalization.

The problem is that STS is, at its heart, a humanistic discipline, a subfield of anthropology. Even when STS does not deny the utility, truth, or value of mathematization or quantification entirely, as a field of research it is methodologically skeptical about such things. In the self-conception of STS, this methodological relativism is part of its ethnographic rigor. This ethnographic relativism is more or less entirely incompatible with formal reasoning, which aspires to universal internal validity. At a moralistic level, it is this aspiration to universal internal validity that so bedevils the STS scholar: the mathematics is inherently distinct from an awareness of the social context, because social context can only be understood in its ethnographic particularity.

This is a false dichotomy. There are other social sciences that address social and political context without the restrictive assumptions of STS. Some of these are quantitative, but not all of them are. There are qualitative sociologists and political scientists with great insights into social context who are not disciplinarily allergic to the standard practices of engineering. In many ways, these kinds of social sciences are far more compatible with the process of designing technology than STS! For example, the sociology we draw on in our “Racial categories in machine learning” paper is variously: Gramscian racial hegemony theory, structuralist sociology, Bourdieusian theories of social capital, and so on. Significantly, these theories are not based exclusively on ethnographic method. They come from disciplines that happily mix historical and qualitative scholarship with quantitative research. The object of study is the social world, and part of the purpose of the research is to develop politically useful abstractions from it that generalize and can be measured. This is the form of social science that is compatible with quantitative policy evaluation, the sort of thing you would want to use if, for example, you wanted to understand the impact of an affirmative action policy.

Given the widely acknowledged truism that public sector technology design often encodes and enacts real policy changes (a point made in Deirdre Mulligan’s keynote), it would make sense to understand the effects of these technologies using the methodologies of policy impact evaluation. That would involve enlisting the kinds of social scientific expertise relevant to understanding society at large!

But that is absolutely not what STS has to offer. STS is, at best, offering a humanistic evaluation of the social processes of technology design. The ontology of STS is flat, and its epistemology and ethics are immediate: the design decision comes down to a calculus of “values” of different “stakeholders”. Ironically, this is a picture of social context that often seems to neglect the political and economic context of that context. It is not an escape from empty abstraction. Rather, it insists on moving from clear abstractions to more nebulous ones, “values” like “fairness”, maintaining that if the conversation never ends and the design never gets formalized, ethics has been accomplished.

This has proven, again and again, to be a rhetorically effective position for research scholarship. It is quite popular among “ethics” researchers that are backed by corporate technology companies. That is quite possibly because the form of “ethics” that STS offers, for all of its calls for political sensitivity, is devoid of political substance. It offers an apples-to-apples comparison of “values” without considering the social origins of those values and the way those values are grounded in political interests that are not merely about “what we think is important in life”, but real contests over resource allocation. The observation by Ames et al. (2011) that people’s values with respect to technology vary with socio-economic class is a terribly relevant, Bourdieusian lesson in how the standpoint of “values sensitivity” may, when taken seriously, run up against the hard realities of political agonism. I don’t believe STS researchers are truly naive about these points; however, in their rhetoric of design intervention, conducted in labs but isolated from the real conditions of technology firms, there is an idealism that can only survive under the self-imposed severity of STS’s own methodological restrictions.

Independent scholars can take up this position and publish daring pieces, winning the moral high ground. But that is not a serious position to take in an industrial setting, or when pursuing generalizable knowledge about the downstream impact of a design on a complex social system. Those empirical questions require different tools, albeit far more unwieldy ones. Complex survey instruments, skilled data analysis, and substantive social theory are needed to arrive at solid conclusions about the ethical impact of technology.

References

Ames, M. G., Go, J., Kaye, J. J., & Spasojevic, M. (2011, March). Understanding technology choices and values through social class. In Proceedings of the ACM 2011 conference on Computer supported cooperative work (pp. 55-64). ACM.

Friedman, B., & Bainbridge, W. S. (2004). Value sensitive design.

Selbst, A. D., Friedler, S., Venkatasubramanian, S., & Vertesi, J. (2019, January). Fairness and abstraction in sociotechnical systems. In ACM Conference on Fairness, Accountability, and Transparency (FAT*).

All the problems with our paper, “Racial categories in machine learning”

Bruce Haynes and I were blown away by the reception to our paper, “Racial categories in machine learning“. This was a huge experiment in interdisciplinary collaboration for us. We are excited about the next steps in this line of research.

That includes engaging with criticism. One of our goals was to fuel a conversation in the research community about the operationalization of race. That isn’t a question that can be addressed by any one paper or team of researchers. So one thing we got out of the conference was great critical feedback on potential problems with the approach we proposed.

This post is an attempt to capture those critiques.

Need for participatory design

Khadijah Abdurahman, of Word to RI, issued a subtweeted challenge to us to present our paper to the hood. (RI stands for Roosevelt Island, in New York City, the location of the recently established Cornell Tech campus.)

One striking challenge, raised by Khadijah Abdurahman on Twitter, is that we should be developing peer relationships with the communities we research. I read this as a call for participatory design. It’s true this was not part of the process of the paper. In particular, Ms. Abdurahman points to a part of our abstract that uses jargon from computer science.

There are a lot of ways to respond to this comment. The first is to accept the challenge. I would personally love it if Bruce and I could present our research to folks on Roosevelt Island and get feedback from them.

There are other ways to respond that address the tensions of this comment. One is to point out that in addition to being an accomplished scholar of the sociology of race and how it forms, especially in urban settings, Bruce is a black man who is originally from Harlem. Indeed, Bruce’s family memoir shows his deep and well-researched familiarity with the life of marginalized people of the hood. So a “peer relationship” between an algorithm designer (me) and a member of an affected community (Bruce) is really part of the origin of our work.

Another is to point out that we did not research a particular community. Our paper was not human subjects research; it was about the racial categories that are maintained by the U.S. federal government and which pervade society in a very general way. Indeed, everybody is affected by these categories. When I and others who look like me are ascribed “white”, that is an example of these categories at work. Bruce and I were very aware of how different kinds of people at the conference responded to our work, and how it was an intervention in our own community, which is of course affected by these racial categories.

The last point is that computer science jargon is alienating to basically everybody who is not trained in computer science, whether they live in the hood or not. And the fact is we presented our work at a computer science venue. Personally, I’m in favor of universal education in computational statistics, but that is a tall order. If our work becomes successful, I could see it becoming part of, for example, a statistical demography curriculum that could be of popular interest. But this is early days.

The Quasi-Racial (QR) Categories are Not Interpretable

In our presentation, we introduced some terminology that did not make it into the paper. We named the vectors of segregation derived by our procedure “quasi-racial” (QR) vectors, to denote that we were trying to capture dimensions that were race-like, in that they captured the patterns of historic and ongoing racial injustice, without being the racial categories themselves, which we argued are inherently unfair categories of inequality.

First, we are not wedded to the name “quasi-racial” and are very open to different terminology if anybody has an idea for something better to call them.

More importantly, somebody pointed out that these QR vectors may not be interpretable. Given that the conference is not only about Fairness, but also Accountability and Transparency, this critique is certainly on point.

To be honest, I have not yet done the work of surveying the extensive literature on algorithm interpretability to give a nuanced response. I can give two informal responses. The first is that one assumption of our proposal is that there is something wrong with how race and racial categories are intuitively understood. Normal people’s understanding of race is, of course, ridden with stereotypes, implicit biases, false causal models, and so on. If we proposed an algorithm that was fully “interpretable” according to most people’s understanding of what race is, that algorithm would likely have racist or racially unequal outcomes. That’s precisely the problem that we are trying to get at with our work. In other words, when categories are inherently unfair, interpretability and fairness may be at odds.

The second response is that educating people about how the procedure works and why it’s motivated is part of what makes its outcomes interpretable. Teaching people about the history of racial categories, and how those categories are both the cause and effect of segregation in space and society, makes the algorithm interpretable. Teaching people about Principal Component Analysis, the algorithm we employ, is part of what makes the system interpretable. We are trying to drop knowledge; I don’t think we are offering any shortcuts.

Principal Component Analysis (PCA) may not be the right technique

An objection from the computer science end of the spectrum was that our proposed use of Principal Component Analysis (PCA) was not well-motivated enough. PCA is just one of many dimensionality reduction techniques; why did we choose it in particular? PCA embeds many assumptions about its input, including that the component vectors of interest are linear combinations of the inputs. What if the best QR representation is a non-linear combination of the input variables? And our use of unsupervised learning, as a general criticism, is perhaps lazy, since in order to validate its usefulness we will need to test it with labeled data anyway. We might be better off with a more carefully calibrated and better motivated alternative technique.
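To make the object of this critique concrete, here is a minimal, hypothetical sketch of PCA-style dimensionality reduction of the kind discussed above. It is not the procedure from the paper; the feature names and data are invented for illustration.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Hypothetical per-neighborhood features standing in for variables related to
# spatial and economic segregation (500 census tracts, 6 features).
X = rng.normal(size=(500, 6))

# Standardize first, since PCA is sensitive to the scale of each feature.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Keep the top components as candidate "quasi-racial" (QR) vectors.
pca = PCA(n_components=2)
qr_vectors = pca.fit_transform(X)

# Each component is, by construction, a *linear* combination of the inputs,
# which is exactly the limitation the critique points to.
print(pca.explained_variance_ratio_)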

These are all fair criticisms. I am personally not satisfied with the technical component of the paper and presentation. I know the rigor of the analysis is not of the standard that would impress a machine learning scholar and can take full responsibility for that. I hope to do better in a future iteration of the work, and welcome any advice on how to do that from colleagues. I’d also be interested to see how more technically skilled computer scientists and formal modelers address the problem of unfair racial categories that we raised in the paper.

I see our main contribution as the raising of this problem of unfair categories, not our particular technical solution to it. As a potential solution, I hope that it’s better than nothing, a step in the right direction, and provocative. I subscribe to the belief that science is an iterative process and look forward to the next cycle of work.

Please feel free to reach out if you have a critique of our work that we’ve missed. We do appreciate all the feedback!

Notes on O’Neil, Chapter 2, “Bomb Parts”

Continuing with O’Neil’s Weapons of Math Destruction on to Chapter 2, “Bomb Parts”. This is a popular book and these are quick chapters. But that’s no reason to underestimate them! This is some of the most lucid work I’ve read on algorithmic fairness.

This chapter talks about three kinds of “models” used in prediction and decision making, with three examples. O’Neil speaks highly of the kinds of models used in baseball to predict the trajectory of hits and determine the optimal placement of people in the field. (Ok, I’m not so good at baseball terms). These are good, O’Neil says, because they are transparent, they are consistently adjusted with new data, and the goals are well defined.

O’Neil then very charmingly writes about the model she uses mentally to determine how to feed her family. She juggles a lot of variables: the preferences of her kids, the nutrition and cost of ingredients, and time. This is all hugely relatable–everybody does something like this. Her point, it seems, is that this form of “model” encodes a lot of opinions or “ideology” because it reflects her values.

O’Neil then discusses recidivism prediction, specifically the LSI-R (Level of Service Inventory–Revised) tool. It asks questions like “How many previous convictions have you had?” and uses the answers to predict the likelihood of future recidivism. The problem is that (a) this is sensitive to overpolicing in neighborhoods, which has little to do with actual recidivism rates (as opposed to rearrest rates), and (b) black neighborhoods, for example, are more likely to be overpoliced, meaning that the tool, which is not very good at predicting recidivism, has disparate impact. This is an example of what O’Neil calls an (eponymous) weapon of math destruction (WMD).

She argues that the three qualities of a WMD are Scale, Opacity, and Damage. Which makes sense.

As I’ve said, I think this is a better take on algorithmic ethics than almost anything I’ve read on the subject before. Why?

First, it doesn’t use the word “algorithm” at all. That is huge, because 95% of the time the use of the word “algorithmic” in the technology-and-society literature is stupid. People use “algorithm” when they really mean “software”. Now, they use “AI System” to mean “a company”. It’s ridiculous.

O’Neil makes it clear in this chapter that what she’s talking about are different kinds of models. Models can be in ones head (as in her plan for feeding her family) or in a computer, and both kinds of models can be racist. That’s a helpful, sane view. It’s been the consensus of computer scientists, cognitive scientists, and AI types for decades.

The problem with WMDs, as opposed to other, better models, is that the WMD models are unhinged from reality. O’Neil’s complaint is not with the use of models, but rather that models are being used without being properly trained on sound data sampling and statistics. WMDs are not artificial intelligences; they are artificial stupidities.

In more technical terms, it seems like the problem with WMDs is not that they don’t properly trade off predictive accuracy with fairness, as some computer science literature would suggest is necessary. It’s that the systems have high error rates in the first place because the training and calibration systems are poorly designed. What’s worse, this avoidable error is disparately distributed, causing more harm to some groups than others.
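As a toy illustration of that last point (entirely hypothetical numbers, not drawn from the book), here is how avoidable statistical error can fall unevenly across groups:

import numpy as np

rng = np.random.default_rng(1)
n = 10_000
group = rng.integers(0, 2, size=n)      # two hypothetical groups, 0 and 1
y_true = rng.integers(0, 2, size=n)     # true outcome (e.g., actual recidivism)

# A badly designed predictor: its error rate is much higher for group 1.
error_prob = np.where(group == 1, 0.35, 0.10)
flipped = rng.random(n) < error_prob
y_pred = np.where(flipped, 1 - y_true, y_true)

for g in (0, 1):
    mask = group == g
    print(f"group {g}: error rate = {np.mean(y_pred[mask] != y_true[mask]):.2f}")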

This is a wonderful and eye-opening account of unfairness in the models used by automated decision-making systems (note the language). Why? Because it shows that there is a connection between statistical bias, the kind of bias that creates distortions in a quantitative predictive process, and social bias, the kind of bias people worry about politically; the book consistently uses the term in both senses. If there is statistical bias weighing against some social group, then that’s definitely, 100%, a form of bias.

Importantly, this kind of bias–statistical bias–is not something that every model must have. Only badly made models have it. It’s something that can be mitigated using scientific rigor and sound design. If we see the problem the way O’Neil sees it, then we can see clearly how better science, applied more rigorously, is also good for social justice.

As a scientist and technologist, it’s been terribly discouraging in the past years to be so consistently confronted with a false dichotomy between sound engineering and justice. At last, here’s a book that clearly outlines how the opposite is the case!

Reading O’Neil’s Weapons of Math Destruction

I probably should have already read Cathy O’Neil’s Weapons of Math Destruction. It was a blockbuster of the tech/algorithmic ethics discussion. It’s written by an accomplished mathematician, which I admire. I’ve also now seen O’Neil perform bluegrass music twice in New York City and think her band is great. At last I’ve found a copy and have started to dig in.

On the other hand, as is probably clear from other blog posts, I have a hard time swallowing a lot of the gloomy political work that puts the role of algorithms in society in such a negative light. I encounter it very frequently, and every time I feel that some misunderstanding must have happened; something seems off.

It’s very clear that O’Neil can’t be accused of mathophobia or of not understanding the complexity of the algorithms at play, which is an easy way to throw doubt on the arguments of some technology critics. Yet perhaps because it’s a popular book and not an academic work of Science and Technology Studies, I haven’t seen its arguments parsed through and analyzed in much depth.

This is a start. These are my notes on the introduction.

O’Neil describes the turning point in her career where she soured on math. After being an academic mathematician for some time, O’Neil went to work as a quantitative analyst for D.E. Shaw. She saw it as an opportunity to work in a global laboratory. But then the 2008 financial crisis made her see things differently.

The crash made it all too clear that mathematics, once my refuge, was not only deeply entangled in the world’s problems but also fueling many of them. The housing crisis, the collapse of major financial institutions, the rise of unemployment–all had been aided and abetted by mathematicians wielding magic formulas. What’s more, thanks to the extraordinary powers that I loved so much, math was able to combine with technology to multiply the chaos and misfortune, adding efficiency and scale to systems I now recognized as flawed.

O’Neil, Weapons of Math Destruction, p.2

As an independent reference on the causes of the 2008 financial crisis, which of course has been a hotly debated and disputed topic, I point to Sassen’s 2017 “Predatory Formations” article. Indeed, the systems that developed the sub-prime mortgage market were complex, opaque, and hard to regulate. Something went seriously wrong there.

But was it mathematics that was the problem? This is where I get hung up. I don’t understand the mindset that would attribute a crisis in the financial system to the use of abstract, logical, rigorous thinking. Consider the fact that there would not have been a financial crisis if there had not been a functional financial services system in the first place. Getting mortgages and paying them off, and the systems that allow this to happen, all require mathematics to function. When these systems operate normally, they are taken for granted. When they suffer a crisis, when the system fails, the mathematics takes the blame. But a system can’t suffer a crisis if it didn’t start working rather well in the first place; otherwise, nobody would depend on it. Meanwhile, the regulatory reaction to the 2008 financial crisis required, of course, more mathematicians working to prevent the same thing from happening again.

So in this case (and I believe others) the question can’t be, whether mathematics, but rather which mathematics. It is so sad to me that these two questions get conflated.

O’Neil goes on to describe a case where an algorithm results in a teacher losing her job for not adding enough value to her students one year. An analysis makes a good case that the reason her students’ scores did not go up is that in the previous year, the students’ scores were inflated by teachers cheating the system. This argument was not considered conclusive enough to change the administrative decision.

Do you see the paradox? An algorithm processes a slew of statistics and comes up with a probability that a certain person might be a bad hire, a risky borrower, a terrorist, or a miserable teacher. That probability is distilled into a score, which can turn someone’s life upside down. And yet when the person fights back, “suggestive” countervailing evidence simply won’t cut it. The case must be ironclad. The human victims of WMDs, we’ll see time and again, are held to a far higher standard of evidence than the algorithms themselves.

O’Neil, WMD, p.10

Now this is a fascinating point, and one that I don’t think has been taken up enough in the critical algorithms literature. It resonates with a point that came up earlier: traditional collective human decision making is often driven by agreement on narratives, whereas automated decisions can be a qualitatively different kind of collective action because they can act on probabilistic judgments.

I have to wonder what O’Neil would argue the solution to this problem is. From her rhetoric, it seems like her recommendation must be to prevent automated decisions from being made on probabilistic judgments. In other words, one could raise the evidentiary standard for algorithms so that it is equal to the standards that people use with each other.

That’s an interesting proposal. I’m not sure what the effects of it would be. I expect that the result would be lower expected values of whatever target was being optimized for, since the system would not be able to “take bets” below a certain level of confidence. One wonders if this would be a more or less arbitrary system.
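As a back-of-the-envelope sketch (toy numbers, with well-calibrated confidences assumed; not a serious evaluation), one can see how forbidding action below a confidence threshold forgoes bets that still have positive expected value:

import numpy as np

rng = np.random.default_rng(2)
confidence = rng.uniform(0.5, 1.0, size=10_000)  # the system's stated confidence

# Assume confidences are calibrated and a correct decision is worth +1,
# an incorrect one -1, so expected value per case is 2 * confidence - 1 > 0.
expected_value = 2 * confidence - 1

for threshold in (0.5, 0.7, 0.9):
    acted_on = confidence >= threshold
    print(f"threshold {threshold}: cases acted on = {acted_on.sum()}, "
          f"total expected value = {expected_value[acted_on].sum():.0f}")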

Sadly, in order to evaluate this proposal seriously, one would have to employ mathematics. Which is, in O’Neil’s rhetoric, a form of evil magic. So, perhaps it’s best not to try.

O’Neil attributes the problems of WMDs to the incentives of the data scientists building the systems. Maybe they know that their work affects people, especially the poor, in negative ways. But they don’t care.

But as a rule, the people running the WMD’s don’t dwell on these errors. Their feedback is money, which is also their incentive. Their systems are engineered to gobble up more data and fine-tune their analytics so that more money will pour in. Investors, of course, feast on these returns and shower WMD companies with more money.

O’Neil, WMD, p.13

Calling out greed as the problem is effective and true in a lot of cases. I’ve argued myself that the real root of the technology ethics problem is capitalism: the way investors drive what products get made and deployed. This is a worthwhile point to make and one that doesn’t get made enough.

But the logical implications of this argument are off. Suppose it is true that, “as a rule”, harmful algorithms are made by people responding to the incentives of private capital. (IF harmful algorithm, THEN private capital created it.) That does not mean that there can’t be good algorithms as well, such as those created in the public sector. In other words, there are algorithms that are not WMDs.

So the insight here has to be that private capital investment corrupts the process of designing algorithms, making them harmful. One could easily make the case that private capital investment corrupts and makes harmful many things that are not algorithmic as well. For example, the historic trans-Atlantic slave trade was a terribly evil manifestation of capitalism. It did not, as far as I know, depend on modern day computer science.

Capitalism here looks to be the root of all evil. The fact that companies are using mathematics is merely incidental. And O’Neil should know that!

Here’s what I find so frustrating about this line of argument. Mathematical literacy is critical for understanding what’s going on with these systems and how to improve society. O’Neil certainly has this literacy. But there are many people who don’t have it. There is a power disparity there which is uncomfortable for everybody. But while O’Neil is admirably raising awareness about how these kinds of technical systems can and do go wrong, the single-minded focus and framing risks giving people the wrong idea that these intellectual tools are always bad or dangerous. That is not a solution to anything, in my view. Ignorance is never more ethical than education. But there is an enormous appetite among ignorant people for being told that it is so.

References

O’Neil, Cathy. Weapons of math destruction: How big data increases inequality and threatens democracy. Broadway Books, 2017.

Sassen, Saskia. “Predatory Formations Dressed in Wall Street Suits and Algorithmic Math.” Science, Technology and Society 22.1 (2017): 6-20.

computational institutions as non-narrative collective action

Nils Gilman recently pointed to a book chapter that confirms the need for “official futures” in capitalist institutions.

Nils indulged me in a brief exchange that helped me better grasp at a bothersome puzzle.

There is a certain class of intellectuals that insist on the primacy of narratives as a mode of human experience. These tend to be, not too surprisingly, writers and other forms of storytellers.

There is a different class of intellectuals that insists on the primacy of statistics. Statistics does not make it easy to tell stories because it is largely about the complexity of hypotheses and our lack of confidence in them.

The narrative/statistic divide could be seen as a divide between academic disciplines. It has often been taken to be, I believe wrongly, the crux of the “technology ethics” debate.

I questioned Nils as to whether his generalization stood up to statistically driven allocation of resources; i.e., those decisions made explicitly on probabilistic judgments. He argued that in the end, management and collective action require consensus around narrative.

In other words, what keeps narratives at the center of human activity is that (a) humans are in the loop, and (b) humans are collectively in the loop.

The idea that communication is necessary for collective action is one I used to put great stock in when studying Habermas. For Habermas, consensus, and especially linguistic consensus, is how humanity moves together. Habermas contrasted this mode of knowledge aimed at consensus and collective action with technical knowledge, which is aimed at efficiency. Habermas envisioned a society ruled by communicative rationality, deliberative democracy; following this line of reasoning, this communicative rationality would need to be a narrative rationality. Even if this rationality is not universal, it might, in Habermas’s later conception of governance, be shared by a responsible elite. Lawyers and a judiciary, for example.

The puzzle that recurs again and again in my work has been the challenge of communicating how technology has become an alternative form of collective action. The claim made by some that technologists are a social “other” makes more sense if one sees them (us) as organizing around non-narrative principles of collective behavior.

It is, I believe, beyond serious dispute that well-constructed, statistically based collective decision-making processes perform better than many alternatives. In the field of future predictions, Philip Tetlock’s work on superforecasting teams and his prior work on expert political judgment has long stood as an empirical challenge to the supposed primacy of narrative-based forecasting. This challenge has not been taken up; it seems rather one-sided. One reason for this may be that the rationale for the effectiveness of these techniques rests ultimately in the science of statistics.

It is now common to insist that Artificial Intelligence should be seen as a sociotechnical system and not as a technological artifact. I wholeheartedly agree with this position. However, it is sometimes implied that to understand AI as a sociotechnical system, one must understand it in narrative terms. This is an error; it would imply that the collective actions taken to build an AI system and the technology itself are held together by narrative communication.

But if the whole purpose of building an AI system is to collectively act in a way that is more effective because of its facility with the nuances of probability, then the narrative lens will miss the point. The promise and threat of AI is that it delivers a different, often more effective form of collective or institution. I’ve suggested that “computational institution” might be the best way to refer to such a thing.

State regulation and/or corporate self-regulation

The dust from the recent debates about regulation versus industrial self-regulation in the data/tech/AI industry appears to be settling. The smart money is on regulation and self-regulation being complementary for attaining the goal of an industry dominated by responsible actors. This trajectory leads to centralized corporate power that is led from the top; it is a Hamiltonian, not Jeffersonian, solution, in Pasquale’s terms.

I am personally not inclined towards this solution. But I have been convinced to see it differently after a conversation today about environmentally sustainable supply chains in food manufacturing. Nestle, for example, has been internally changing its sourcing practices towards more sustainable chocolate. It’s able to finance this change from its profits, and when it does change its internal policy, it operates on a scale that’s meaningful. It is able to make this transition in part because non-profits, NGOs, and farmers’ cooperatives lay the groundwork for sustainable sourcing external to the company. This lowers the barriers to having Nestle switch over to new sources: they have already been subsidized through philanthropy and international aid investments.

Supply chain decisions, ‘make-or-buy’ decisions, are at the heart of transaction cost economics (TCE) and critical to the constitution of institutions in general. What this story about sustainable sourcing tells us is that the configuration of private, public, and civil society institutions is complex, and that there are prospects for agency and change in the reconfiguration of those relationships. This is no different in the ‘tech sector’.

However, this theory of economic and political change is not popular; it does not have broad intellectual or media appeal. Why?

One reason may be that while much of the supply chain is a critical part of social structure, it sits in the private sector, and hence is opaque. This is not a matter of transparency or interpretability of algorithms. This is about the fact that private institutions, by virtue of being ‘private’, do not have to report everything that they do and, probably, shouldn’t. But since so much of what is done by the massive private sector is of public import, there’s a danger of the privatization of public functions.

Another reason why this view of political change through the internal policy-making of enormous private corporations is unpopular is that it leaves decision-making up to a very small number of people: the elite managers of those corporations. The real disparity of power involved in private corporate governance means that the popular attitude towards that governance is, more often than not, irrelevant. Even less so than political elites, corporate elites are not accountable to a constituency. They are accountable, I suppose, to their shareholders, who have material interests disconnected from political will.

This disconnect between shareholder interest and political will is one of the main reasons why I’m skeptical about the idea that large corporations and their internal policies are where we should place our hopes for moral leadership. But perhaps what I’m missing is the appropriate intellectual framework for how this will is shaped and what drives these kinds of corporate decisions. I still think TCE might provide insights that I’ve been missing. But I am on the lookout for other sources.

Ordoliberalism and industrial organization

There’s a nice op-ed by Wolfgang Münchau in FT, “The crisis of modern liberalism is down to market forces”.

Among other things, it reintroduces the term “ordoliberalism“, a particular Germanic kind of enlightened liberalism designed to prevent the kind of political collapse that had precipitated the war.

In Münchau’s account, the key insight of ordoliberalism is its attention to questions of social equality, but not through the mechanism of redistribution. Rather, ordoliberal interventions primarily affect industrial organization, favoring small to mid-sized companies.

As Germany’s economy remains robust and so far relatively politically stable, it’s interesting that ordoliberalism isn’t discussed more.

Another question that must be asked is to what extent the rise of computational institutions challenges the kind of industrial organization recommended by ordoliberalism. If computation induces corporate concentration, and there are not good policies for addressing that, then that’s due to a deficiency in our understanding of what ‘market forces’ are.

When *shouldn’t* you build a machine learning system?

Luke Stark raises an interesting question, directed at “ML practitioner”:

As an “ML practitioner” in on this discussion, I’ll have a go at it.

In short, one should not build an ML system for making a class of decisions if there is already a better system for making that decision that does not use ML.

An example of a comparable system that does not use ML would be a team of human beings with spreadsheets, or a team of people employed to judge for themselves.

There are a few reasons why a non-ML system could be superior in performance to an ML system:

  • The people involved could have access to more data, in the course of their lives, in more dimensions of variation, than is accessible by the machine learning system.
  • The people might have a more sensitive ability to make semantic distinctions, such as in words or images, than an ML system
  • The problem to be solved could be a “wicked problem” that ranges over a very high-dimensional space of options, with very irregular outcomes, such that it is not amenable to various forms of approximation (e.g., linear approximation)
  • The people might be judging an aspect of their own social environment, such that the outcome’s validity is socially procedural (as in the outcome of a vote, or of an auction)

These are all fine reasons not to use an ML system. On the other hand, the term “ML” has been extended, as with “AI”, to include many hybrid human-computer systems, which has led to some confusion. So, for example, crowdsourced labels of images provide useful input data to ML systems. This hybrid system might perform semantic judgments over a large scale of data, at high speed, at a tolerable rate of accuracy. Does this system count as an ML system? Or is it a form of computational institution that rivals other ways of solving the problem, and just so happens to have a machine learning algorithm as part of its process?
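As a toy illustration of the kind of hybrid arrangement described above (hypothetical data, not any particular deployed system): crowd workers supply noisy labels, a majority vote aggregates them, and a classifier is trained on the result. The machine learning algorithm is only one component of the overall institution.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n_items, n_workers = 1000, 5

X = rng.normal(size=(n_items, 10))          # item features
true_labels = (X[:, 0] > 0).astype(int)     # hidden ground truth

# Each crowd worker labels every item, with a 20% chance of error.
errors = rng.random((n_items, n_workers)) < 0.2
worker_labels = np.where(errors, 1 - true_labels[:, None], true_labels[:, None])

# Majority vote over workers produces the training labels for the ML component.
crowd_labels = (worker_labels.mean(axis=1) > 0.5).astype(int)

model = LogisticRegression().fit(X, crowd_labels)
print("accuracy against ground truth:", model.score(X, true_labels))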

Meanwhile, the research frontier of machine learning is all about trying to solve problems that previously haven’t been solved, or haven’t been solved as well, by alternative kinds of systems. This means there will always be a disconnect between machine learning research, which is trying to expand what it is possible to do with machine learning, and what machine learning should, today, be deployed. Sometimes, research is done to develop technology that is not mature enough to deploy.

We should expect that a lot of ML research is done on things that should not ultimately be deployed! That’s because until we do the research, we may not understand the problem well enough to know the consequences of deployment. There’s a real sense in which ML research is about understanding the computational contours of a problem, whereas ML industry practice is about addressing the problems customers have with an efficient solution. Often this solution is a hybrid system in which ML only plays a small part; the use of ML here is really about a change in the institutional structure, not so much a part of what service is being delivered.

On the other hand, there have been a lot of cases, search engines and social media being important ones, where the scale of data and the use of ML for processing have allowed for a qualitatively different form of product or service. These are now the big-deal companies we are constantly talking about. These are pretty clearly cases of successful ML.

computational institutions

As the “AI ethics” debate metastasizes in my newsfeed and scholarly circles, I’m struck by the frustrations of technologists and ethicists who seem to be speaking past each other.

While these tensions play out along disciplinary fault-lines, for example, between technologists and science and technology studies (STS), the economic motivations are more often than not below the surface.

I believe this is to some extent a problem of nomenclature, which is again a function of the disciplinary rifts involved.

Computer scientists work, generally speaking, on the design and analysis of computational systems. Many see their work as bounded by the demands of the portability and formalizability of technology (see Selbst et al., 2019). That’s their job.

This is endlessly unsatisfying to critics of the social impact of technology. STS scholars will insist on changing the subject to “sociotechnical systems”, a term that means something very general: the assemblage of people and artifacts that are not people. This, fairly, removes focus from the computational system and embeds it in a social environment.

A goal of this kind of work seems to be to hold computational systems, as they are deployed and used socially, accountable. It must be said that once this happens, we are no longer talking about the specialized domain of computer science per se. It is a wonder why STS scholars are so often picking fights with computer scientists, when their true beef seems to be with businesses that use and deploy technology.

The AI Now Institute has attempted to rebrand the problem by discussing “AI Systems” as, roughly, those sociotechnical systems that use AI. This is on the one hand more specific: AI is a particular kind of technology, and perhaps it has particular political consequences. But their analysis of AI systems quickly overflows into sweeping claims about “the technology industry”, and it’s clear that most of their recommendations have little to do with AI, and indeed are trying, once again, to change the subject from discussion of AI as a technology (a computer science research domain) to a broader set of social and political issues that do, in fact, have their own disciplines where they have been researched for years.

The problem, really, is not that any particular conversation is not happening, or is being excluded, or is being shut down. The problem is that the engineering focused conversation about AI-as-a-technology has grown very large and become an awkward synecdoche for the rise of major corporations like Google, Apple, Amazon, Facebook, and Netflix. As these corporations fund and motivate a lot of research, there’s a question of who is going to get pieces of the big pie of opportunity these companies represent, either in terms of research grants or impact due to regulation, education, etc.

But there are so many aspects of these corporations that are not addressed by either the term “sociotechnical system”, which is just so broad, or “AI System”, which is as broad and rarely means what you’d think it does (that the system uses AI is incidental if not unnecessary; what matters is that it’s a company operating in a core social domain via primarily technological user interfaces). Neither of these gets at the unit of analysis that’s really of interest.

An alternative: “computational institution”. “Computational”, in the sense of computational cognitive science and computational social science: it denotes the essential role of the theories of computation and statistics in explaining the behavior of the phenomenon being studied. “Institution”, in the sense of institutional economics: the unit is a firm, which comprises people, their equipment, and their economic relations to their suppliers and customers. An economic lens would immediately bring into focus “the data heist” and the “role of machines” that Nissenbaum is concerned are being left to the side.