Digifesto


For fairness in machine learning, we need to consider the unfairness of racial categorization

Pre-prints of papers accepted to the upcoming 2019 Fairness, Accountability, and Transparency (FAT*) conference are floating around Twitter. From the looks of it, many of these papers add a wealth of historical and political context, which I feel is a big improvement.

A noteworthy paper, in this regard, is Hutchinson and Mitchell’s “50 Years of Test (Un)fairness: Lessons for Machine Learning”, which puts recent ‘fairness in machine learning’ work in the context of closely analogous debates from the 1960s and ’70s concerning the use of tests that could be biased due to cultural factors.

I like this paper a lot, in part because it is very thorough and in part because it tees up a line of argument that’s dear to me. Hutchinson and Mitchell raise the question of how to properly think about fairness in machine learning when the protected categories invoked by nondiscrimination law are themselves social constructs.

Some work on practically assessing fairness in ML has tackled the problem of using race as a construct. This echoes concerns in the testing literature that stem back to at least 1966: “one stumbles immediately over the scientific difficulty of establishing clear yardsticks by which people can be classified into convenient racial categories” [30]. Recent approaches have used Fitzpatrick skin type or unsupervised clustering to avoid racial categorizations [7, 55]. We note that the testing literature of the 1960s and 1970s frequently uses the phrase “cultural fairness” when referring to parity between blacks and whites.

They conclude that this is one of the areas where there can be a lot more useful work:

This short review of historical connections in fairness suggests several concrete steps forward for future research in ML fairness: Diving more deeply into the question of how subgroups are defined, suggested as early as 1966 [30], including questioning whether subgroups should be treated as discrete categories at all, and how intersectionality can be modeled. This might include, for example, how to quantify fairness along one dimension (e.g., age) conditioned on another dimension (e.g., skin tone), as recent work has begun to address [27, 39].

This is all very cool to read, because this is precisely the topic that Bruce Haynes and I address in our FAT* paper, “Racial categories in machine learning” (arXiv link). The problem we confront in this paper is that the racial categories we are used to using in the United States (White, Black, Asian) originate in the white supremacy that was enshrined into the Constitution when it was formed and perpetuated since then through the legal system (with some countervailing activity during the Civil Rights Movement, for example). This puts “fair machine learning” researchers in a bind: either they can use these categories, which have always been about perpetuating social inequality, or they can ignore the categories and reproduce the patterns of social inequality that prevail in fact because of the history of race.

In the paper, we propose a third option. First, rather than reify racial categories, we propose breaking race down into the kinds of personal features that get inscribed with racial meaning. Phenotype properties like skin type and ocular folds are one such set of features. Another set are events that indicate position in social class, such as being arrested or receiving welfare. Another set are facts about the national and geographic origin of one’s ancestors. These facts about a person are clearly relevant to how racial distinctions are made, but are themselves more granular and multidimensional than race.

The next step is to detect race-like categories by looking at who is segregated from whom. We propose an unsupervised machine learning technique that works with the distribution of the phenotype, class, and ancestry features across spatial tracts (as when considering where people physically live) or across a social network (as when considering people’s professional networks, for example). Principal component analysis can identify which race-like dimensions capture the greatest amounts of spatial and social separation. We hypothesize that these dimensions will encode the ways racial categorization has shaped the social structure in tangible ways; these effects may include both politically recognized forms of discrimination and forms of discrimination that have not yet been surfaced. These dimensions can then be used to classify people in race-like ways as input to fairness interventions in machine learning.
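To make this concrete, here is a minimal sketch of the PCA step, using entirely synthetic and illustrative data (nothing here is from the paper): tract-level features whose strongest axis of co-variation stands in for a “race-like” dimension of segregation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature matrix: one row per spatial tract, columns are
# tract-level averages of phenotype, class, and ancestry features.
# (All names and numbers here are illustrative assumptions.)
n_tracts, n_features = 500, 6
X = rng.normal(size=(n_tracts, n_features))

# Inject a synthetic axis of segregation: two feature columns co-vary
# strongly across tracts, mimicking a race-like dimension.
segregation = rng.normal(size=n_tracts)
X[:, 0] += 2.0 * segregation
X[:, 1] += 2.0 * segregation

# PCA via the SVD of the centered feature matrix.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / np.sum(S**2)

# The leading component loads heavily on the co-varying features,
# i.e., it recovers the dimension along which tracts are most separated.
print(explained[0])        # share of variance on the leading component
print(np.abs(Vt[0, :2]))   # loadings on the two segregated features
```

The same mechanics would apply to a social-network version, with network positions or embeddings in place of tract-level averages.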

A key part of our proposal is that race-like classification depends on the empirical distribution of persons in physical and social space, and so is not fixed. This operationalizes the way that race is socially and politically constructed without reifying the categories in terms that reproduce their white supremacist origins.

I’m quite stoked about this research, though obviously it raises a lot of serious challenges in terms of validation.


“the privatization of public functions”

An emerging theme from the conference on Trade Secrets and Algorithmic Systems was that legal scholars have become concerned about the privatization of public functions. For example, the use of proprietary risk assessment tools instead of the discretion of judges who are supposed to be publicly accountable is a problem. More generally, use of “trade secrecy” in court settings to prevent inquiry into software systems is bogus and moves more societal control into the realm of private ordering.

Many remedies were proposed. Most involved some kind of disclosure and audit to experts. The most extreme form of disclosure is making the software and, where it’s a matter of public record, training data publicly available.

It is striking to me to be encountering the call for government use of open source systems because…this is not a new issue. The conversation about federal use of open source software was alive and well over five years ago. Then, the arguments were about vendor lock-in; now, they are about accountability of AI. But the essential problem of whether core governing logic should be available to public scrutiny, and the effects of its privatization, have been the same.

If we are concerned with the reliability of a closed and large-scale decision-making process of any kind, we are dealing with problems of credibility, opacity, and complexity. The prospects of an efficient market for these kinds of systems are dim. These market conditions are also the conditions that determine the sustainability of open source infrastructure. Failures in sustainability are manifest as software vulnerabilities, which are one of the key reasons why governments are warned against OSS now, though comparing the vulnerabilities of OSS against those of proprietary software is methodologically fraught.

Subjectivity in design

One of the reasons why French intellectuals have developed their own strange way of talking is that they have implicitly embraced a post-Heideggerian phenomenological stance which deals seriously with the categories of experience of the individual subject. Americans don’t take this sort of thing so seriously because our institutions have been more post-positivist and now, increasingly, computationalist. If post-positivism makes the subject of science the powerful bureaucratic institution able to leverage statistically sound and methodologically responsible survey methodology, computationalism makes the subject of science the data analyst operating a cloud computing platform with data sourced from wherever. These movements are, probably, increasingly alienating to “regular people”, including humanists, who are attracted to phenomenology precisely because they already have all the tools for it.

To the extent that humanists are best informed about what it really means to live in the world, their position must be respected. It is really out of deference to the humble (or, sometimes, splendidly arrogant) representatives of the human subject as such that I have written about existentialism in design, which is really an attempt to ground technical design in what is philosophically “known” about the human condition.

This approach differs importantly from “human centered design” because human centered design treats design as an empirically rigorous task that demands sensitivity to the particular needs of situated users. This is wise and perfectly fine except for one problem: it doesn’t scale. And as we all know, the great and animal impulse of technological progress, especially today, is to develop the one technology that revolutionizes everything for everyone, becoming new essential infrastructure that reveals a new era of mankind. Human centered designers have everything right about design except for the maniacal ambition of it, without which it will never achieve technology’s paramount calling. So we will put it to one side and take a different approach.

The problem is that computationalist infrastructure projects, and by this I’m referring to the Googles, the Facebooks, the Amazons, the Tencents, the Alibabas, etc., are essentially about designing efficient machines, and so they ultimately become about objective resource allocation in one sense or another. The needs of the individual subject are not as relevant to the designers of these machines as are the behavioral responses of their users to their user interfaces. What will result in more clicks, more “conversions”? Asking users what they really want, on the scale that it would affect actual design, is secondary and frivolous when A/B testing can optimize practical outcomes as efficiently as it does.

I do not mean to cast aspersions at these Big Tech companies by describing their operations so baldly. I do not share the critical perspective of many of my colleagues who write as if they have discovered, for the first time, that corporate marketing is hypocritical and that businesses are mercenary. This is just the way things are; what’s more, the engineering accomplishments involved are absolutely impressive and worth celebrating, as is the business management.

What I would like to do is propose that a technology of similar scale can be developed according to general principles that nevertheless make more adept use of what is known about the human condition. Rather than being devoted to cheap proxies of human satisfaction that address only the user’s objective condition, I’m proposing a service that delivers something tailored to the subjectivity of the user.

Existentialism in Design: Comparison with “Friendly AI” research

Turing Test [xkcd]

I made a few references to Friendly AI research in my last post on Existentialism in Design. I positioned existentialism as an ethical perspective that contrasts with the perspective taken by the Friendly AI research community, among others. This prompted a response by a pseudonymous commenter (in a sadly condescending way, I must say) who linked me to a post, “Complexity of Value”, on what I suppose you might call the elite rationalist forum Arbital. I’ll take this as an invitation to elaborate on how I think existentialism offers an alternative to the Friendly AI perspective on ethics in technology, and particularly the ethics of artificial intelligence.

The first and most significant point of departure between my work on this subject and Friendly AI research is that I emphatically don’t believe the most productive way to approach the problem of ethics in AI is to consider the problem of how to program a benign Superintelligence. This is for reasons I’ve written up in “Don’t Fear the Reaper: Refuting Bostrom’s Superintelligence Argument”, which sums up arguments made in several blog posts about Nick Bostrom’s book on the subject. This post goes beyond the argument in the paper to address further objections I’ve heard from Friendly AI and X-risk enthusiasts.

What superintelligence gives researchers is a simplified problem. Rather than deal with many of the inconvenient contingencies of humanity’s technically mediated existence, superintelligence makes these irrelevant in comparison to the limiting case where technology not only mediates, but dominates. The question asked by Friendly AI researchers is how an omnipotent computer should be programmed so that it creates a utopia and not a dystopia. It is precisely because the computer is omnipotent that it is capable of producing a utopia and is in danger of creating a dystopia.

If you don’t think superintelligences are likely (perhaps because you think there are limits to the ability of algorithms to improve themselves autonomously), then you get a world that looks a lot more like the one we have now. In our world, artificial intelligence has been incrementally advancing for maybe a century now, starting with the foundations of computing in mathematical logic and electrical engineering. It proceeds through theoretical and engineering advances in fits and starts, often through the application of technology to solve particular problems, such as natural language processing, robotic control, and recommendation systems. This is the world of “weak AI”, as opposed to “strong AI”.

It is also a world where AI is not the great source of human bounty or human disaster. Rather, it is a form of economic capital with disparate effects throughout the total population of humanity. It can be a source of inspiring serendipity, banal frustration, and humor.

Let me be more specific, using the post that I was linked to. In it, Eliezer Yudkowsky posits that a (presumably superintelligent) AI will be directed to achieve something, which he calls “value”. The post outlines a “Complexity of Value” thesis. Roughly, this means that the things that we want AI to do cannot be easily compressed into a brief description. For an AI to not be very bad, it will need to either contain a lot of information about what people really want (more than can be easily described) or collect that information as it runs.

That sounds reasonable to me. There’s plenty of good reasons to think that even a single person’s valuations are complex, hard to articulate, and contingent on their circumstances. The values appropriate for a world dominating supercomputer could well be at least as complex.

But so what? Yudkowsky argues that this thesis, if true, has implications for other theoretical issues in superintelligence theory. But does it address any practical questions of artificial intelligence problem solving or design? That it is difficult to mathematically specify the whole of value or normativity, and that to attempt to do so one would need a lot of data about humanity in its particularity, is a point that has long been apparent to ethical philosophy. It’s a surprise, or perhaps a disappointment, only to those who must mathematize everything. Articulating this point in terms of Kolmogorov complexity does not particularly add to the insight so much as translate it into an idiom used by particular researchers.

Where am I departing from this with “Existentialism in Design”?

Rather than treat “value” as a wholly abstract metasyntactic variable representing the goals of a superintelligent, omniscient machine, I’m approaching the problem more practically. First, I’m limiting myself to big sociotechnical complexes wherein a large number of people have some portion of their interactions mediated by digital networks and data centers and, why not, smartphones and even the imminent dystopia of IoT devices. This may be setting my work up for obsolescence, but it also grounds the work in potential action. Since these practical problems rely on much of the same mathematical apparatus as the more far-reaching problems, there is a chance that a fundamental theorem may arise from even this applied work.

That restriction on hardware may seem banal, but it frames the particular philosophical question that I am interested in. The motivation for considering existentialist ethics in particular is that it suggests new kinds of problems that are relevant to ethics but which have not been considered carefully or solved.

As I outlined in a previous post, many ethical positions are framed either in terms of consequentialism, evaluating the utility of a variety of outcomes, or deontology, concerned with the consistency of behavior with more or less objectively construed duties. Consequentialism is attractive to superintelligence theorists because they imagine their AIs to have the ability to cause any consequence. The critical question is how to give it a specification that leads to the best or adequate consequences for humanity. This is a hard problem, under their assumptions.

Deontology is, as far as I can tell, less interesting to superintelligence theorists. This may be because deontology tends to be an ethics of human behavior, and for superintelligence theorists human behavior is rendered virtually insignificant by superintelligent agency. But deontology is attractive as an ethics precisely because it is relevant to people’s actions. It is intended as a way of prescribing duties to a person like you and me.

With Existentialism in Design (a term I may go back and change in all these posts at some point; I’m not sure I love the phrase), I am trying to do something different.

I am trying to propose an agenda for creating a more specific goal function for a limited but still broad-reaching AI, assigning something to its ‘value’ variable, if you will. Because the power of the AI to bring about consequences is limited, its potential for success and failure is also more limited. Catastrophic and utopian outcomes are not particularly relevant; performance can be evaluated in a much more pedestrian way.

Moreover, the valuations internalized by the AI are not to be done in a directly consequentialist way. I have suggested that an AI could be programmed to maximize the meaningfulness of its choices for its users. This is introducing a new variable, one that is more semantically loaded than “value”, though perhaps just as complex and amorphous.

Particular to this variable, “meaningfulness”, is that it is a feature of the subjective experience of the user, or human interacting with the system. It is only secondarily or derivatively an objective state of the world that can be evaluated for utility. To unpack it into a technical specification, we will require a model (perhaps a provisional one) of the human condition and what makes life meaningful. This may well include such things as autonomy, the ability to make one’s own choices.

I can anticipate some objections along the lines that what I am proposing still looks like a special case of more general AI ethics research. Is what I’m proposing really fundamentally any different than a consequentialist approach?

I will punt on this for now. I’m not sure of the answer, to be honest. I could see it going one of two different ways.

The first is that yes, what I’m proposing can be thought of as a narrow special case of a more broadly consequentialist approach to AI design. However, I would argue that the specificity matters because of the potency of existentialist moral theory. The project of specifying the latter as a kind of utility function suitable for programming into an AI is in itself a difficult and interesting problem, without it necessarily overturning the foundations of AI theory itself. It is worth pursuing at the very least as an exercise and beyond that as an ethical intervention.

The second case is that there may be something particular about existentialism that makes encoding it different from encoding a consequentialist utility function. I suspect, but leave to be shown, that this is the case. Why? Because existentialism (which I haven’t yet gone into much detail describing) is largely a philosophy about how we (individually, as beings thrown into existence) come to have values in the first place and what we do when those values or the absurdity of circumstances lead us to despair. Existentialism is really a kind of phenomenological metaethics in its own right, one that is quite fluid and resists encapsulation in a utility calculus. Most existentialists would argue that at the point where one externalizes one’s values as a utility function as opposed to living as them and through them, one has lost something precious. The kinds of things that existentialism derives ethical imperatives from, such as the relationship between one’s facticity and transcendence, or one’s will to grow in one’s potential and the inevitability of death, are not the kinds of things a (limited, realistic) AI can have much effect on. They are part of what has been perhaps quaintly called the human condition.

To even try to describe this research problem, one has to shift linguistic registers. The existentialist and AI research traditions developed in very divergent contexts. This is one reason to believe that their ideas are new to each other, and that a synthesis may be productive. In order to accomplish this, one needs a charitably considered, working understanding of existentialism. I will try to provide one in my next post in this series.

More assessment of AI X-risk potential

I’m been stimulated by Luciano Floridi’s recent article in Aeon “Should we be afraid of AI?”. I’m surprised that this issue hasn’t been settled yet, since it seems like “we” have the formal tools necessary to solve the problem decisively. But nevertheless this appears to be the subject of debate.

I was referred to Kaj Sotala’s rebuttal of an earlier work by Floridi which his Aeon article was based on. The rebuttal appears in this APA Newsletter on Philosophy and Computers. It is worth reading.

The issue that I’m most interested in is whether or not AI risk research should constitute a special, independent branch of research, or whether it can be approached just as well by pursuing a number of other more mainstream artificial intelligence research agendas. My primary engagement with these debates has so far been an analysis of Nick Bostrom’s argument in his book Superintelligence, which tries to argue in particular that there is an existential risk (or X-risk) to humanity from artificial intelligence. “Existential risk” means a risk to the existence of something, in this case humanity. And the risk Bostrom has written about is the risk of eponymous superintelligence: an artificial intelligence that gets smart enough to improve its own intelligence, achieve omnipotence, and end the world as we know it.

I’ve posted my rebuttal to this argument on arXiv. The one-sentence summary of the argument is: algorithms can’t just modify themselves into omnipotence because they will hit performance bounds due to data and hardware.

A number of friends have pointed out to me that this is not a decisive argument. They say: don’t you just need the AI to advance fast enough and far enough to be an existential threat?

There are a number of reasons why I don’t believe this is likely. In fact, I believe that it is provably vanishingly unlikely. This is not to say that I have a proof, per se. I suppose it’s incumbent on me to work it out and see if the proof is really there.

So: Herewith is my Sketch Of A Proof of why there’s no significant artificial intelligence existential risk.

Lemma: Intelligence advances due to purely algorithmic self-modification will always plateau due to data and hardware constraints, which advance more slowly.

Proof: This paper.

As a consequence, all artificial intelligence explosions will be sigmoid. That is, starting slow, accelerating, then decelerating, and finally growing so slowly as to be asymptotic. Let’s call the level of intelligence at which an explosion asymptotes the explosion bound.
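As a minimal illustration of what a bounded, sigmoid explosion looks like, here is a toy logistic curve; the functional form and parameter values are assumptions chosen for illustration, not a model from the literature.

```python
import math

def explosion(t, bound, rate=1.0, midpoint=0.0):
    """Logistic ('sigmoid') intelligence-growth curve: starts slow,
    accelerates, decelerates, and asymptotes at `bound`.
    All parameters here are illustrative assumptions."""
    return bound / (1.0 + math.exp(-rate * (t - midpoint)))

# Growth is fastest near the midpoint and negligible far past it;
# the curve approaches but never reaches its explosion bound.
early  = explosion(-10, bound=100.0)   # well before the takeoff
middle = explosion(0,   bound=100.0)   # mid-explosion: half the bound
late   = explosion(10,  bound=100.0)   # near the asymptote
print(early, middle, late)
```

The point of the sketch is just the shape: however fast the middle phase looks, the curve is pinned beneath its bound.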

There’s empirical support for this claim. Basically, we have never had a really big intelligence explosion due to algorithmic improvement alone. Looking at the impressive results of the last seventy years, most of the impressiveness can be attributed to advances in hardware and data collection. Notoriously, Deep Learning is largely just decades old artificial neural network technology repurposed to GPU’s on the cloud. Which is awesome and a little scary. But it’s not an algorithmic intelligence explosion. It’s a consolidation of material computing power and sensor technology by organizations. The algorithmic advances fill those material shoes really quickly, it’s true. This is precisely the point: it’s not the algorithms that’s the bottleneck.

Observation: Intelligence explosions are happening all the time. Most of them are small.

Once we accept the idea that intelligence explosions are all bounded, it becomes rather arbitrary where we draw the line between an intelligence explosion and some lesser algorithmic intelligence advance. There is a real sense in which any significant intelligence advance is a sigmoid expansion in intelligence. This would include run-of-the-mill scientific discoveries and good ideas.

If intelligence explosions are anything like virtually every other interesting empirical phenomenon, then they are distributed according to a heavy tail distribution. This means a distribution with a lot of very small values and a diminishing probability of higher values that nevertheless assigns some probability to very high values. Assuming intelligence is something that can be quantified and observed empirically (a huge ‘if’ taken for granted in this discussion), we can (theoretically) take a good hard look at the ways intelligence has advanced. Look around you. Do you see people and computers getting smarter all the time, sometimes in leaps and bounds but most of the time minutely? That’s a confirmation of this hypothesis!

The big idea here is really just to assert that there is a probability distribution over intelligence explosion bounds from which all actual intelligence explosions are drawn. This follows more or less directly from the conclusion that all intelligence explosions are bounded. Once we posit such a distribution, it becomes possible to take expected values of functions of its draws.
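As a toy version of this move, here is a sketch that posits a Pareto distribution over explosion bounds and estimates expectations by Monte Carlo. The choice of distribution family, the tail index, and the “decisive advantage” threshold are all hypothetical, illustrative assumptions, not empirical estimates.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assume (hypothetically) that explosion bounds are heavy-tailed:
# many small advances, a diminishing but nonzero probability of
# very large ones. Pareto is one standard heavy-tailed choice.
alpha = 2.5                               # tail index (illustrative)
bounds = 1.0 + rng.pareto(alpha, size=100_000)

# Once such a distribution is posited, expectations of functions of
# the bound can be estimated by averaging over draws, e.g. the chance
# that a given explosion exceeds some hypothetical threshold for a
# decisive advantage.
threshold = 10.0
p_decisive = np.mean(bounds > threshold)
mean_bound = bounds.mean()
print(mean_bound, p_decisive)
```

The qualitative upshot of any such heavy-tailed assumption is the same: the typical explosion is small, and decisive outliers have small but nonzero probability.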

Empirical claim: Hardware and sensing advances diffuse rapidly relative to their contribution to intelligence gains.

There’s an material, socio-technical analog to Bostrom’s explosive superintelligence. We could imagine a corporation that is working in secret on new computing infrastructure. Whenever it has an advance in computing infrastructure, the AI people (or increasingly, the AI-writing-AI) develops programming that maximizes its use of this new technology. Then it uses that technology to enrich its own computer-improving facilities. When it needs more…minerals…or whatever it needs to further its research efforts, it finds a way to get them. It proceeds to take over the world.

This may presently be happening. But evidence suggests that this isn’t how the technology economy really works. No doubt Amazon (for example) is using Amazon Web Services internally to do its business analytics. But it also makes its business out of selling its computing infrastructure to other organizations as a commodity. That’s actually the best way it can enrich itself.

What’s happening here is the diffusion of innovation, which is a well-studied phenomenon in economics and other fields. Ideas spread. Technological designs spread. I’d go so far as to say that it is often (perhaps always?) the best strategy for some agent that has locally discovered a way to advance its own intelligence to figure out how to trade that intelligence to other agents. Almost always that trade involves the diffusion of the basis of that intelligence itself.

Why? Because since there are independent intelligence advances of varying sizes happening all the time, there’s actually a very competitive market for innovation that quickly devalues any particular gain. A discovery, if hoarded, will likely be discovered by somebody else. The race to get credit for any technological advance at all motivates diffusion and disclosure.

The result is that the distribution of innovation, rather than concentrating into very tall spikes, is constantly flattening and fattening itself. That’s important because…

Claim: Intelligence risk is not due to absolute levels of intelligence, but relative intelligence advantage.

The idea here is that since humanity is composed of lots of interacting intelligent sociotechnical organizations, any hostile intelligence is going to have a lot of intelligent adversaries. If the game of life can be won through intelligence alone, then it can only be won with a really big intelligence advantage over other intelligent beings. It’s not about absolute intelligence; it’s intelligence inequality we need to worry about.

Consequently, the more intelligence advances (i.e., technologies) diffuse, the less risk there is.

Conclusion: The chance of an existential risk from an intelligence explosion is small and decreasing all the time.

So consider this: globally, there’s tons of investment in technologies that, when discovered, allow for local algorithmic intelligence explosions.

But even if we assume these algorithmic advances are nearly instantaneous, they are still bounded.

Lots of independent bounded explosions are happening all the time. But they are also diffusing all the time.

Since the global intelligence distribution is always fattening, that means that the chance of any particular technological advance granting a decisive advantage over others is decreasing.

There is always the possibility of a fluke, of course. But if there was going to be a humanity destroying technological discovery, it would probably have already been invented and destroyed us. Since it hasn’t, we have a lot more resilience to threats from intelligence explosions, not to mention a lot of other threats.

This doesn’t mean that it isn’t worth trying to figure out how to make AI better for people. But it does diminish the need to think about artificial intelligence as an existential risk. It makes AI much more comparable to a biological threat. Biological threats could be really bad for humanity. But there’s also the organic reality that life is very resilient and human life in general is very secure precisely because it has developed so much intelligence.

I believe that thinking about the risks of artificial intelligence as analogous to the risks from biological threats is helpful for prioritizing where research effort in artificial intelligence should go. Just because AI doesn’t present an existential risk to all of humanity doesn’t mean it doesn’t kill a lot of people or make their lives miserable. On the contrary, we are in a world with both a lot of artificial and non-artificial intelligence and a lot of miserable and dying people. These phenomena are not causally disconnected. A good research agenda for AI could start with an investigation of these actually miserable people and what their problems are, and how AI is causing that suffering or alternatively what it could do to improve things. That would be an enormously more productive research agenda than one that aims primarily to reduce the impact of potential explosions which are diminishingly unlikely to occur.

artificial life, artificial intelligence, artificial society, artificial morality

“Everyone” “knows” what artificial intelligence is and isn’t and why it is and isn’t a transformative thing happening in society and technology and industry right now.

But the fact is that most of what “we” “call” artificial intelligence is really just increasingly sophisticated ways of solving a single class of problems: optimization.

Essentially what’s happened in AI is that all empirical inference problems can be modeled as Bayesian problems, which are then solved using variational inference methods, which are essentially just turning the Bayesian statistic problem into a solvable form of an optimization problem, and solving it.

Advances in optimization have greatly expanded the number of things computers can accomplish as part of a weak AI research agenda.

Frequently these remarkable successes in Weak AI are confused with an impending revolution in what used to be called Strong AI but which now is more frequently called Artificial General Intelligence, or AGI.

Recent interest in AGI has spurred a lot of interesting research. How could it not be interesting? It is also, for me, extraordinarily frustrating research because I find the philosophical precommitments of most AGI researchers baffling.

One insight that I wish made its way more frequently into discussions of AGI comes from the late Francisco Varela, who argued that you can’t really solve the problem of artificial intelligence until you have solved the problem of artificial life. This is for the simple reason that only living things are really intelligent in anything but the weak sense of being capable of optimization.

Once being alive is taken as a precondition for being intelligent, the problem of understanding AGI implicates a profound and fascinating problem of understanding the mathematical foundations of life. This is a really amazing research problem that for some reason is never ever discussed by anybody.

Let’s assume it’s possible to solve this problem in a satisfactory way. That’s a big If!

Then a theory of artificial general intelligence should be able to show why some artificial living organisms are intelligent and others are not. I suppose what’s most significant here is the shift from thinking of AI in terms of “agents”, a term so generic as to be perhaps at the end of the day meaningless, to thinking of AI in terms of “organisms”, which suggests a much richer set of preconditions.

I have similar grief over contemporary discussion of machine ethics. This is a field with fascinating, profound potential. But much of what machine ethics boils down to today are trolley problems, which are as insipid as they are troublingly intractable. There’s other, better machine ethics research out there, but I’ve yet to see something that really speaks to properly defining the problem, let alone solving it.

This is perhaps because for a machine to truly be ethical, as opposed to just being designed and deployed ethically, it must have moral agency. I don’t mean this in some bogus early Latourian sense of “wouldn’t it be fun if we pretended seatbelts were little gnomes clinging to our seats” but in an actual sense of participating in moral life. There’s a good case to be made that the latter is not something easily reducible to decontextualized action or function, but rather has to do with how one participates more broadly in social life.

I suppose this is a rather substantive metaethical claim to be making. It may be one that’s at odds with the common ideological training in Anglophone countries, where it’s relatively popular to discuss AGI as a research problem. It has more in common, intellectually and philosophically, with continental philosophy than with analytic philosophy, whereas “artificial intelligence” research is in many ways a product of the latter. This perhaps explains why these two fields are today rather disjoint.

Nevertheless, I’d happily make the case that the continental tradition has developed a richer and more interesting ethical tradition than what analytic philosophy has given us. Among other reasons, this is because of how it is able to situate ethics as a function of a more broadly understood social and political life.

I postulate that what is characteristic of social and political life is that it involves the interaction of many intelligent organisms. Which of course means that to truly understand this form of life and how one might recreate it artificially, one must understand artificial intelligence and, transitively, artificial life.

Only once artificial society is sufficiently well understood could we then approach the problem of artificial morality, or how to create machines that truly act according to moral or ethical ideals.