existential risk | Digifesto

July 21, 2023

On the AI Safety and Ethics Debate: From Political to Scientific Answers

I am working on AI safety and ethics research again! I’ve contributed to the “Toward Causal Foundations of Safe AGI” sequence on Alignment Forum, and will soon be shifting my research focus back to using agent-based models to improve software accountability. This is an exciting field to work in, in part because there is no shortage of spicy takes by smart people about how dangerous AI is and how important it is to get it right. There’s also more than a little controversy. I want to unpack this controversy and present my own take, which (a) is not one I find expressed by others, and (b) has evolved since I last wrote about it.

Three positions

With many caveats, because many people writing about this topic are much more carefully researched than I am, I think it’s worth sketching a few different positions on why AI is potentially dangerous and what we should do about it.

The first position is that Advanced AI Poses A Global Catastrophic Existential Risk (AIX). This is a position made famous by Yudkowsky and Bostrom, and occasionally echoed by scientific luminaries. The original idea is that an autonomous, self-improving AI that is misaligned will grow in power in pursuit of its goals and slay humanity. The argument goes, basically, that since the slaying of humanity is the worst thing that could happen, this is terribly alarming, and so resources should be mustered to make sure that it never happens.

This position has many critics (I wrote a critique once). But that hasn’t stopped a lot of philanthropic activity directed at preventing this existential risk. Partly as a result of that, the theory of AI X-Risk has grown in sophistication from its original versions (I’ll get to this). There is now a lot of interesting research about AI alignment and corrigibility.

The second position is that AI Is (What We Don’t Like About) Capitalism (AIC), and especially “Big Tech” understood as very large, powerful businesses. One of the more fun articulations of this view is by Ted Chiang. A lot of AI policy has this flavor; Meredith Whittaker provided a political economic critique of AI for the AI Now Salon. An important part of this argument is that AI is not, actually, very autonomous. Rather, in its current manifestations, it depends on the cloud computing offerings of a handful of large companies. So AI is not risky to humanity as a whole. Rather, it is risky to those who are not benefiting from it as an industrial process. These critics suggest that this is most people. It is also risky because the idea of AI masks the social and commercial relations that constitute it. As has been said many times, “artificial intelligence is neither artificial nor intelligent”. (I’ve read this very line in recent work by Kate Crawford and Evgeny Morozov, but Googling for it finds this observation in articles going back at least to 2016, if not earlier.)

The third position is that AI Offends Liberal Values (AIL). I mean “liberal” very broadly here, in precisely the sense that Jake Goldenfein and I use in our paper about AI. I mean that AI threatens to be inegalitarian (“unfair”), to upset the democratic process of self-determination (“misinformation”), to violate individual autonomy (“manipulation”), and so on. These liberal values are core to how Western democracies operate and are tied up with a lot of real legal liabilities. So there are plenty of commercial and reputational incentives to work on these kinds of problems. So many do.

A lot of the “debate” about AI safety and ethics concerns which of these views — AIX, AIC, or AIL — is either more correct or, more to the point, is more deserving of our scarce resources: our attention; our labor, if we are researchers or practitioners; our philanthropic donations; our political prioritization. Richards et al.’s recent piece in NOEMA is an example: it argues, like many do, that AIX is a distraction from the pressing AIL position.

Where do I stand on these issues?

It’s political

First, I observe that these different positions have different constituencies. AIX has been a popular position with people raising funding from the extremely wealthy. My suspicion is that the extremely wealthy, being humans, quite rationally do not want humanity to be slain by AI, and so this is a way for them to make philanthropic donations without being altruistic. That AIX has become associated with Effective Altruism is, in this light, perhaps ironic, though of course, from the perspective of saving humanity, we are truly all in it together.

A great deal of thought has gone into how likely an AI X-risk scenario is in fact. But most people will prioritize based on what is more personally salient. Especially given the uncertainty around how truly remote the possibility of AI X-threat is, people are more likely to be motivated by their comparative advantage with respect to other humans. So, AIL is more successful at orienting the priorities of liberal governments and the industrial corporations that thrive within a liberal state, because for these entities what matters is AI’s legitimacy among the body politic and as a part of these institutions. And AIC is more compelling to those who, for whatever reason, empathize with those that are not well rewarded by capitalism, or who are rewarded by their scathing critique of it.

So to some extent these AI debates are the epiphenomenal discourse and signalling of stakeholders occupying different socioeconomic habitus with respect to the phenomenon of AI. Is it possible to put this politics aside?

These three positions are not, actually, mutually exclusive. (What we don’t like about) capitalism may indeed even pose a threat to liberal values and even an existential threat to humanity, and perhaps this problem should be where we focus more of our scarce resources. There are a number of obstacles to pursuing this line of inquiry:

(a) those that control the preponderance of scarce resources are winners under capitalism and so are going to experience capitalism as legitimizing of liberalism, not as a threat to it (i.e., they will not see AIL as urgent),
(b) it has nothing to do with AI, and all of the component arguments (AIX, AIL, and even AIC) gain prestige because they are about AI, which notionally is what’s creating so much economic value right now,
(c) the existential risk probability of capitalism is truly remote, because capitalism is driven by human libido; once AI is identified with capitalism the probability of AIX reduces, and
(d) there is a powerful ideological view that capitalism improves material abundance and promotes liberal values and the viability of humanity; this view might be right, in which case most of the noise about AI safety and ethics in the grand scheme of things is just people complaining or protecting themselves from legal or reputational liability!

On the other hand, it looks like recent work on potential AI X-risk scenarios has been moving away from the unipolar singularity problem and towards problems of failure to coordinate between multiple actors, including the failure to regulate corporate entities as they grow more intelligent. Andrew Critch has written about multi-polar failure as a result of supply chain miscoordination. The DeepMind AGI Safety team seems to think the most likely X-risk scenario involves a failure to co-regulate or adjust because of the deception or lack of transparency of some agents as they build to dangerous levels of intelligence.

This is significant. For years, the AIX position has focused on the purely technical aspects of AI and how these might pose a danger. However, now even AIX researchers and advocates are seeing how the worst problems with AI can be due to failures of socioeconomic organization. This means they have much more in common with the AIL and AIC positions.

We need more exact social science

My own personal frustration with the state of the AI safety and ethics debate is that it raises problems that both demand the rigor of the exact sciences (mathematics, computer science, and so on) and directly address social and economic phenomena that have been, properly speaking, the object of the social sciences. But the social sciences do not seem equipped to address these questions in a serious way, and so we have endless punditry and speculation.

For the past two years I have been working as a National Science Foundation fellow to try to economically model the effects of personal data flows in the economy. This is just one subproblem of the AI safety and ethics gestalt. I can’t say I have succeeded in my original objective, despite having a wide range of methodologies at my disposal, and great collaborators.

Roughly speaking, in two years I’ve been unable to successfully bridge three quite different disciplinary camps:

Realist legal scholars and sociologists of technology, who frequently are capable of noticing and putting into words how AI technology interacts with people, how business drives these operations, and so on. But they rarely provide analytic (mathematical) rigor and so it is hard to empirically test their theories or use them in technical design.
Computer scientists, who are extremely rigorous in their designs and validations but most often avoid speculation or theorizing about the social processes that contextualize the use of computation, let alone constitute it. Computer scientists know why artificial intelligence is useful: because it can perform computation that humans cannot perform without it.
Economists, who are the most practiced at modeling economic systems in their complexity but who remain quite bizarrely attached to rational expectations and unbounded rationality as a disciplinary pillar, despite this being known to be nonsense for many decades. The core issue with artificial intelligence, as it is constituted economically, is that it is useful because our human intelligence is limited. This human limitation is precisely what economists are trained to avoid thinking carefully about.

So, in order to properly test and validate the hypotheses raised by realistic observers of AI in society and the economy, there has to be a conversation between two fields that intellectually want nothing to do with each other: computer science and economics.

Of course, I am being somewhat glib. There are people working at the intersection of computer science and economics. There are economists who work on bounded rationality. There are folks doing difficult empirical validation of realistic social and legal theories of the impact of AI. But this is hard work that requires a paradoxical combination of intellectual humility and ambition. To the extent that the outcomes of such research are uncertain, it does not fall easily into any of the political camps of AIX, AIC, or AIL. I am wondering who else is trying to do this work, and how I can work with them.

March 27, 2017

More assessment of AI X-risk potential

I’ve been stimulated by Luciano Floridi’s recent article in Aeon, “Should we be afraid of AI?”. I’m surprised that this issue hasn’t been settled yet, since it seems like “we” have the formal tools necessary to solve the problem decisively. But nevertheless this appears to be the subject of debate.

I was referred to Kaj Sotala’s rebuttal of an earlier work by Floridi which his Aeon article was based on. The rebuttal appears in this APA Newsletter on Philosophy and Computers. It is worth reading.

The issue that I’m most interested in is whether or not AI risk research should constitute a special, independent branch of research, or whether it can be approached just as well by pursuing a number of other more mainstream artificial intelligence research agendas. My primary engagement with these debates has so far been an analysis of Nick Bostrom’s argument in his book Superintelligence, which tries to argue in particular that there is an existential risk (or X-risk) to humanity from artificial intelligence. “Existential risk” means a risk to the existence of something, in this case humanity. And the risk Bostrom has written about is the risk of eponymous superintelligence: an artificial intelligence that gets smart enough to improve its own intelligence, achieve omnipotence, and end the world as we know it.

I’ve posted my rebuttal to this argument on arXiv. The one-sentence summary of the argument is: algorithms can’t just modify themselves into omnipotence because they will hit performance bounds due to data and hardware.

A number of friends have pointed out to me that this is not a decisive argument. They say: don’t you just need the AI to advance fast enough and far enough to be an existential threat?

There are a number of reasons why I don’t believe this is likely. In fact, I believe that it is provably vanishingly unlikely. This is not to say that I have a proof, per se. I suppose it’s incumbent on me to work it out and see if the proof is really there.

So: Herewith is my Sketch Of A Proof of why there’s no significant artificial intelligence existential risk.

Lemma: Intelligence advances due to purely algorithmic self-modification will always plateau due to data and hardware constraints, which advance more slowly.

Proof: This paper.

As a consequence, all artificial intelligence explosions will be sigmoid. That is, starting slow, accelerating, then decelerating, then growing so slowly as to be asymptotic. Let’s call the level of intelligence at which an explosion asymptotes the explosion bound.
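
To make the shape concrete, here is a minimal sketch of a sigmoid explosion, assuming a logistic curve as the functional form. The logistic form and the names `explosion_level`, `growth_rate`, `bound`, `rate`, and `midpoint` are illustrative assumptions, not something taken from the paper; the argument only needs growth that eventually decelerates toward a bound.

```python
import math


def explosion_level(t, bound=1.0, rate=1.0, midpoint=0.0):
    """Logistic curve: intelligence level at time t, saturating at `bound`."""
    return bound / (1.0 + math.exp(-rate * (t - midpoint)))


def growth_rate(t, bound=1.0, rate=1.0, midpoint=0.0):
    """Derivative of the logistic curve: small early, largest at the midpoint, small late."""
    level = explosion_level(t, bound, rate, midpoint)
    return rate * level * (1.0 - level / bound)


# Slow start, acceleration, deceleration, then near-flat growth below the bound.
for t in (-6, -3, 0, 3, 6):
    print(t, round(explosion_level(t, bound=10.0), 3), round(growth_rate(t, bound=10.0), 3))
```

In this toy version the explosion bound is just the `bound` parameter: however large `rate` is made, the level never exceeds it.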

There’s empirical support for this claim. Basically, we have never had a really big intelligence explosion due to algorithmic improvement alone. Looking at the impressive results of the last seventy years, most of the impressiveness can be attributed to advances in hardware and data collection. Notoriously, Deep Learning is largely just decades-old artificial neural network technology repurposed to GPUs on the cloud. Which is awesome and a little scary. But it’s not an algorithmic intelligence explosion. It’s a consolidation of material computing power and sensor technology by organizations. The algorithmic advances fill those material shoes really quickly, it’s true. This is precisely the point: it’s not the algorithms that are the bottleneck.

Observation: Intelligence explosions are happening all the time. Most of them are small.

Once we accept the idea that intelligence explosions are all bounded, it becomes rather arbitrary where we draw the line between an intelligence explosion and some lesser algorithmic intelligence advance. There is a real sense in which any significant intelligence advance is a sigmoid expansion in intelligence. This would include run-of-the-mill scientific discoveries and good ideas.

If intelligence explosions are anything like virtually every other interesting empirical phenomenon, then they are distributed according to a heavy tail distribution. This means a distribution with a lot of very small values and a diminishing probability of higher values that nevertheless assigns some probability to very high values. Assuming intelligence is something that can be quantified and observed empirically (a huge ‘if’ taken for granted in this discussion), we can (theoretically) take a good hard look at the ways intelligence has advanced. Look around you. Do you see people and computers getting smarter all the time, sometimes in leaps and bounds but most of the time minutely? That’s a confirmation of this hypothesis!

The big idea here is really just to assert that there is a probability distribution over intelligence explosion bounds that all actual intelligence explosions are being drawn from. This follows more or less directly from the conclusion that all intelligence explosions are bounded. Once we posit such a distribution, it becomes possible to take expected values of its values and of functions of its values.
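
As a rough illustration of what positing such a distribution buys us, here is a sketch that draws explosion bounds from a heavy-tailed distribution and takes expected values over the draws. The Pareto family, the shape parameter `alpha=1.5`, the threshold of 10, and the helper names are all assumptions made for the example; nothing in the argument pins down the actual distribution.

```python
import random
import statistics


def sample_explosion_bounds(n, alpha=1.5, seed=0):
    """Draw n hypothetical explosion bounds from a Pareto (heavy-tailed) distribution."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [rng.paretovariate(alpha) for _ in range(n)]


def expected_value(samples, f=lambda x: x):
    """Monte Carlo estimate of E[f(X)] from the samples."""
    return sum(f(x) for x in samples) / len(samples)


bounds = sample_explosion_bounds(10_000)
mean_bound = expected_value(bounds)
median_bound = statistics.median(bounds)
# Expected value of an indicator function: the share of "large" explosions.
share_large = expected_value(bounds, lambda x: 1.0 if x > 10.0 else 0.0)

print("median bound:", median_bound)
print("mean bound:", mean_bound)
print("share of bounds above 10:", share_large)
print("largest bound drawn:", max(bounds))
```

Most draws are small, the mean sits well above the median, and a few draws are very large, which is the pattern described above: lots of minute advances and the occasional leap.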

Empirical claim: Hardware and sensing advances diffuse rapidly relative to their contribution to intelligence gains.

There’s a material, socio-technical analog to Bostrom’s explosive superintelligence. We could imagine a corporation that is working in secret on new computing infrastructure. Whenever it has an advance in computing infrastructure, the AI people (or increasingly, the AI-writing-AI) develop programming that maximizes its use of this new technology. Then it uses that technology to enrich its own computer-improving facilities. When it needs more…minerals…or whatever it needs to further its research efforts, it finds a way to get them. It proceeds to take over the world.

This may presently be happening. But evidence suggests that this isn’t how the technology economy really works. No doubt Amazon (for example) is using Amazon Web Services internally to do its business analytics. But also it makes its business out of selling out its computing infrastructure to other organizations as a commodity. That’s actually the best way it can enrich itself.

What’s happening here is the diffusion of innovation, which is a well-studied phenomenon in economics and other fields. Ideas spread. Technological designs spread. I’d go so far as to say that it is often (perhaps always?) the best strategy for some agent that has locally discovered a way to advance its own intelligence to figure out how to trade that intelligence to other agents. Almost always that trade involves the diffusion of the basis of that intelligence itself.

Why? Because there are independent intelligence advances of varying sizes happening all the time, so there’s actually a very competitive market for innovation that quickly devalues any particular gain. A discovery, if hoarded, will likely be discovered by somebody else. The race to get credit for any technological advance at all motivates diffusion and disclosure.

The result is that the distribution of innovation, rather than concentrating into very tall spikes, is constantly flattening and fattening itself. That’s important because…

Claim: Intelligence risk is not due to absolute levels of intelligence, but relative intelligence advantage.

The idea here is that since humanity is composed of lots of interacting intelligent sociotechnical organizations, any hostile intelligence is going to have a lot of intelligent adversaries. If the game of life can be won through intelligence alone, then it can only be won with a really big intelligence advantage over other intelligent beings. It’s not absolute intelligence but intelligence inequality that we need to worry about.

Consequently, the more intelligence advances (i.e., technologies) diffuse, the less risk there is.
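
A toy sketch of this claim, under strong simplifying assumptions: intelligence is a single number per agent, and diffusion means every agent closes a fixed fraction of its gap to the current leader at each step. The update rule, the starting levels, and the names `diffuse` and `relative_advantage` are hypothetical, chosen only to show the mechanism.

```python
def diffuse(levels, rate=0.5):
    """One diffusion step: every agent closes a fraction `rate` of its gap to the leader."""
    leader = max(levels)
    return [x + rate * (leader - x) for x in levels]


def relative_advantage(levels):
    """Leader's level divided by the runner-up's level."""
    top, second = sorted(levels, reverse=True)[:2]
    return top / second


# One agent starts far ahead of the others.
levels = [1.0, 2.0, 4.0, 16.0]
history = [relative_advantage(levels)]
for _ in range(5):
    levels = diffuse(levels)
    history.append(relative_advantage(levels))

print(history)
```

The leader’s absolute level never changes in this sketch, yet its advantage over the runner-up shrinks at every step, which is the sense in which diffusion reduces risk even while overall intelligence rises.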

Conclusion: The chance of an existential risk from an intelligence explosion is small and decreasing all the time.

So consider this: globally, there’s tons of investment in technologies that, when discovered, allow for local algorithmic intelligence explosions.

But even if we assume these algorithmic advances are nearly instantaneous, they are still bounded.

Lots of independent bounded explosions are happening all the time. But they are also diffusing all the time.

Since the global intelligence distribution is always fattening, that means that the chance of any particular technological advance granting a decisive advantage over others is decreasing.

There is always the possibility of a fluke, of course. But if there were going to be a humanity-destroying technological discovery, it would probably have already been invented and destroyed us. Since it hasn’t, we have a lot more resilience to threats from intelligence explosions, not to mention a lot of other threats.

This doesn’t mean that it isn’t worth trying to figure out how to make AI better for people. But it does diminish the need to think about artificial intelligence as an existential risk. It makes AI much more comparable to a biological threat. Biological threats could be really bad for humanity. But there’s also the organic reality that life is very resilient and human life in general is very secure precisely because it has developed so much intelligence.

I believe that thinking about the risks of artificial intelligence as analogous to the risks from biological threats is helpful for prioritizing where research effort in artificial intelligence should go. Just because AI doesn’t present an existential risk to all of humanity doesn’t mean it doesn’t kill a lot of people or make their lives miserable. On the contrary, we are in a world with both a lot of artificial and non-artificial intelligence and a lot of miserable and dying people. These phenomena are not causally disconnected. A good research agenda for AI could start with an investigation of these actually miserable people and what their problems are, and how AI is causing that suffering or alternatively what it could do to improve things. That would be an enormously more productive research agenda than one that aims primarily to reduce the impact of potential explosions which are diminishingly unlikely to occur.

August 27, 2015

The relationship between Bostrom’s argument and AI X-Risk

One reason why I have been writing about Bostrom’s superintelligence argument is because I am acquainted with what could be called the AI X-Risk social movement. I think it is fair to say that this movement is a subset of Effective Altruism (EA), a laudable movement whose members attempt to maximize their marginal positive impact on the world.

The AI X-Risk subset, which is a vocal group within EA, sees the emergence of a superintelligent AI as one of several risks that is notable because it could ruin everything. AI is considered to be a “global catastrophic risk” unlike more mundane risks like tsunamis and bird flu. AI X-Risk researchers argue that because of the magnitude of the consequences of the risk they are trying to anticipate, they must raise more funding and recruit more researchers.

While I think this is noble, I think it is misguided for reasons that I have been outlining in this blog. I am motivated to make these arguments because I believe that there are urgent problems/risks that are conceptually adjacent (if you will) to the problem AI X-Risk researchers study, but that the focus on AI X-Risk in particular diverts interest away from them. In my estimation, as more funding has been put into evaluating potential risks from AI many more “mainstream” researchers have benefited and taken on projects with practical value. To some extent these researchers benefit from the alarmism of the AI X-Risk community. But I fear that their research trajectory is thereby distorted from where it could truly provide maximal marginal value.

My reason for targeting Bostrom’s argument for the existential threat of superintelligent AI is that I believe it’s the best defense of the AI X-Risk thesis out there. In particular, if valid the argument should significantly raise the expected probability of an existentially risky AI outcome. For Bostrom, it is likely a natural consequence of advancement in AI research more generally because of recursive self-improvement and convergent instrumental values.

As I’ve informally workshopped this argument I’ve come upon this objection: Even if it is true that a superintelligent system would not for systematic reasons become an existentially risky singleton, that does not mean that somebody couldn’t develop such a superintelligent system in an unsystematic way. There is still an existential risk, even if it is much lower. And because existential risks are so important, surely we should prepare ourselves for even this low probability event.

There is something inescapable about this logic. However, the argument applies equally well to all kinds of potential apocalypses, such as enormous meteors crashing into the earth and biowarfare-produced zombies. Without some kind of accounting of the likelihood of these outcomes, it’s impossible to budget rationally.

Moreover, I have to call into question the rationality of this counterargument. If Bostrom’s arguments are used in defense of the AI X-Risk position but then the argument is dismissed as unnecessary when it is challenged, that suggests that the AI X-Risk community is committed to their cause for reasons besides Bostrom’s argument. Perhaps these reasons are unarticulated. One could come up with all kinds of conspiratorial hypotheses about why a group of people would want to disingenuously spread the idea that superintelligent AI poses an existential threat to humanity.

The position I’m defending on this blog (until somebody convinces me otherwise–I welcome all comments) is that a superintelligent AI singleton is not a significantly likely X-Risk. Other outcomes that might be either very bad or very good, such as ones with many competing and cooperating superintelligences, are much more likely. I’d argue that it’s more or less what we have today, if you consider sociotechnical organizations as a form of collective superintelligence. This makes research into this topic not only impactful in the long run, but also relevant to problems faced by people now and in the near future.
