On the AI Safety and Ethics Debate: From Political to Scientific Answers

by Sebastian Benthall

I am working on AI safety and ethics research again! I’ve contributed to the “Toward Causal Foundations of Safe AGI” sequence on Alignment Forum, and will soon be shifting my research focus back to using agent-based models to improve software accountability. This is an exciting field to work in, in part because there is no shortage of spicy takes by smart people about how dangerous AI is and how important it is to get it right. There’s also more than a little controversy. I want to unpack this controversy and present my own take, which (a) is not one I find expressed by others, and (b) has evolved since I last wrote about it.

Three positions

With many caveats, because many people writing about this topic have researched it much more carefully than I have, I think it’s worth sketching a few different positions on why AI is potentially dangerous and what we should do about it.

The first position is that Advanced AI Poses A Global Catastrophic Existential Risk (AIX). This is a position made famous by Yudkowsky and Bostrom, and occasionally echoed by scientific luminaries. The original idea is that an autonomous, self-improving AI that is misaligned will grow in power in pursuit of its goals and slay humanity. The argument goes, basically, that since the slaying of humanity is the worst thing that could happen, this is terribly alarming, and so resources should be mustered to make sure that it never happens.

This position has many critics (I wrote a critique once). But that hasn’t stopped a lot of philanthropic activity directed at preventing this existential risk. Partly as a result of that, the theory of AI X-Risk has grown in sophistication from its original versions (I’ll get to this). There is now a lot of interesting research about AI alignment and corrigibility.

The second position is that AI Is (What We Don’t Like About) Capitalism (AIC), and especially “Big Tech” understood as very large, powerful businesses. One of the more fun articulations of this view is by Ted Chiang. A lot of AI policy has this flavor; Meredith Whittaker provided a political economic critique of AI for the AI Now Salon. An important part of this argument is that AI is not, actually, very autonomous. Rather, in its current manifestations, it depends on the cloud computing offerings of a handful of large companies. So AI is not risky to humanity as a whole. Rather, it is risky to those who are not benefiting from it as an industrial process. These critics suggest that this is most people. It is also risky because the idea of AI masks the social and commercial relations that constitute it. As has been said many times, “artificial intelligence is neither artificial nor intelligent”. (I’ve read this very line in recent work by Kate Crawford and Evgeny Morozov, but Googling for it finds this observation in articles going back at least to 2016 if not earlier).

The third position is that AI Offends Liberal Values (AIL). I mean “liberal” very broadly here, in precisely the sense that Jake Goldenfein uses in our paper about AI. I mean that AI threatens to be inegalitarian (“unfair”), to upset the democratic process of self-determination (“misinformation”), to violate individual autonomy (“manipulation”), and so on. These liberal values are core to how Western democracies operate and are tied up with a lot of real legal liabilities. So there are plenty of commercial and reputational incentives to work on these kinds of problems, and many do.

A lot of the “debate” about AI safety and ethics concerns which of these views — AIX, AIC, or AIL — is either more correct or, more to the point, more deserving of our scarce resources: our attention; our labor, if we are researchers or practitioners; our philanthropic donations; our political prioritization. Richards et al.’s recent piece in NOEMA is an example: it argues, as many do, that AIX is a distraction from the more pressing concerns of the AIL position.

Where do I stand on these issues?

It’s political

First, I observe that these different positions have different constituencies. AIX has been a popular position with people raising funding from the extremely wealthy. My suspicion is that the extremely wealthy, being humans, quite rationally do not want humanity to be slain by AI, and so this is a way for them to make philanthropic donations without being altruistic. That AIX has become associated with Effective Altruism is, in this light, perhaps ironic, though of course, from the perspective of saving humanity, we are truly all in it together.

A great deal of thought has gone into how likely an AI X-risk scenario is in fact. But most people will prioritize based on what is more personally salient. Especially given the uncertainty around how truly remote the possibility of an AI X-risk scenario is, people are more likely to be motivated by their comparative advantage with respect to other humans. So AIL is more successful at orienting the priorities of liberal governments and the industrial corporations that thrive within a liberal state, because for these entities what matters is AI’s legitimacy among the body politic and as a part of these institutions. And AIC is more compelling to those who, for whatever reason, empathize with those who are not well rewarded by capitalism, or who are rewarded for their scathing critique of it.

So to some extent these AI debates are the epiphenomenal discourse and signalling of stakeholders occupying different socioeconomic habitus with respect to the phenomenon of AI. Is it possible to put this politics aside?

These three positions are not, actually, mutually exclusive. (What we don’t like about) capitalism may indeed pose a threat to liberal values, and even an existential threat to humanity, and perhaps this problem should be where we focus more of our scarce resources. There are a number of obstacles to pursuing this line of inquiry:

  • (a) those who control the preponderance of scarce resources are winners under capitalism and so are going to experience capitalism as legitimizing liberalism, not as a threat to it (i.e., they will not see AIL as urgent),
  • (b) it has nothing to do with AI, and all of the component arguments (AIX, AIL, and even AIC) gain prestige because they are about AI, which notionally is what’s creating so much economic value right now,
  • (c) the probability of an existential risk from capitalism is truly remote, because capitalism is driven by human libido; once AI is identified with capitalism, the estimated probability of AIX falls, and
  • (d) there is a powerful ideological view that capitalism improves material abundance, promotes liberal values, and supports the viability of humanity; this view might be right, in which case most of the noise about AI safety and ethics is, in the grand scheme of things, just people complaining or protecting themselves from legal or reputational liability!

On the other hand, it looks like recent work on potential AI X-risk scenarios has been moving away from the unipolar singularity problem and towards problems of failure to coordinate between multiple actors, including the failure to regulate corporate entities as they grow more intelligent. Andrew Critch has written about multi-polar failure as a result of supply chain miscoordination. The DeepMind AGI Safety team seems to think the most likely X-risk scenario involves a failure to co-regulate or adjust because of the deception or lack of transparency of some agents as they build to dangerous levels of intelligence.

This is significant. For years, the AIX position has focused on the purely technical aspects of AI and how these might pose a danger. However, now even AIX researchers and advocates are seeing how the worst problems with AI can be due to failures of socioeconomic organization. This means they have much more in common with the AIL and AIC positions.

We need more exact social science

My own personal frustration with the state of the AI safety and ethics debate is that it raises problems that demand both the rigor of the exact sciences (mathematics, computer science, and so on) and direct engagement with social and economic phenomena that have been, properly speaking, the object of the social sciences. But the social sciences do not seem equipped to address these questions in a serious way, and so we have endless punditry and speculation.

For the past two years I have been working as a National Science Foundation fellow to try to economically model the effects of personal data flows in the economy. This is just one subproblem of the AI safety and ethics gestalt. I can’t say I have succeeded in my original objective, despite having a wide range of methodologies at my disposal, and great collaborators.

Roughly speaking, in two years I’ve been unable to successfully bridge three quite different disciplinary camps:

  • Realist legal scholars and sociologists of technology, who are frequently capable of noticing and putting into words how AI technology interacts with people, how business drives these operations, and so on. But they rarely provide analytic (mathematical) rigor, and so it is hard to empirically test their theories or use them in technical design.
  • Computer scientists, who are extremely rigorous in their designs and validations but most often avoid speculation or theorizing about the social processes that contextualize the use of computation, let alone constitute it. Computer scientists know why artificial intelligence is useful: because it can perform computation that humans cannot perform without it.
  • Economists, who are the most practiced at modeling economic systems in their complexity, but who remain quite bizarrely attached to rational expectations and unbounded rationality as a disciplinary pillar, despite these assumptions having been known to be nonsense for many decades. The core issue with artificial intelligence, as it is constituted economically, is that it is useful because our human intelligence is limited. This human limitation is precisely what economists are trained to avoid thinking carefully about. (A toy sketch of this point follows this list.)
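
To make that last point concrete, here is a minimal toy sketch in Python (my own illustration, not anything from the work discussed in this post). An agent repeatedly chooses among options with unknown payoffs: an unboundedly rational benchmark evaluates every option, a boundedly rational agent can only inspect a few, and giving the bounded agent a noisy predictor (a stand-in for AI) recovers part of the gap. The parameters K, BUDGET, and NOISE are arbitrary assumptions for illustration.

```python
# Toy illustration (not from any cited work): the value of prediction under bounded rationality.
# An agent picks one of K options per round, each with an unknown payoff in [0, 1].
import random

random.seed(0)

K = 20         # number of options available each round (assumed for illustration)
ROUNDS = 1000  # number of simulated rounds
BUDGET = 3     # how many options a bounded agent can actually evaluate per round
NOISE = 0.2    # standard deviation of the AI-like predictor's error

def simulate():
    oracle, bounded, assisted = 0.0, 0.0, 0.0
    for _ in range(ROUNDS):
        payoffs = [random.random() for _ in range(K)]

        # Unbounded rationality: evaluate everything and take the true maximum.
        oracle += max(payoffs)

        # Bounded rationality: inspect only a small random subset of options.
        sample = random.sample(range(K), BUDGET)
        bounded += max(payoffs[i] for i in sample)

        # Bounded rationality plus a noisy predictor: rank all options by
        # predicted payoff, then inspect only the top BUDGET candidates.
        predicted = [(p + random.gauss(0, NOISE), i) for i, p in enumerate(payoffs)]
        shortlist = [i for _, i in sorted(predicted, reverse=True)[:BUDGET]]
        assisted += max(payoffs[i] for i in shortlist)

    for name, total in [("oracle", oracle), ("bounded", bounded), ("AI-assisted", assisted)]:
        print(f"{name:12s} average payoff: {total / ROUNDS:.3f}")

if __name__ == "__main__":
    simulate()
```

The point of the sketch: for the fully informed benchmark, the predictor adds nothing, because everything is already evaluated. The prediction machine only has value because the agent’s evaluation budget is limited, which is exactly the human limitation that unbounded-rationality models assume away.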

So, in order to properly test and validate the hypotheses raised by realistic observers of AI in society and the economy, there has to be a conversation between two fields that intellectually want nothing to do with each other: computer science and economics.

Of course, I am being somewhat glib. There are people working at the intersection of computer science and economics. There are economists who work on bounded rationality. There are folks doing difficult empirical validation of realistic social and legal theories of the impact of AI. But this is hard work that requires a paradoxical combination of intellectual humility and ambition. To the extent that the outcomes of such research are uncertain, it does not fall easily into any of the political camps of AIX, AIC, or AIL. I am wondering who else is trying to do this work, and how I can work with them.