Digifesto

Tag: artificial intelligence

I’m building something new

Push has come to shove, and I have started building something new.

I’m not building it alone, thank God, but it’s a very petite open source project at the moment.

I’m convinced that it is actually something new. Some of my colleagues are excited to hear that what I’m building will exist for them shortly. That’s encouraging! I also tried to build this thing in the context of another project that’s doing something similar. I was told that I was being too ambitious and couldn’t pull it off. That wasn’t exactly encouraging, but was good evidence that I’m actually doing something new. I will now do this thing and take the credit. So much the better for me.

What is it that I’m building?

Well, you see, it’s software for modeling economic systems. Or sociotechnical systems. Or, broadly speaking, complex systems with agents in them. Also, doing statistics with those models: fitting them to data, understanding what emergent properties occur in them, exploring counterfactuals, and so on.

I will try to answer some questions I wish somebody would ask me.

Q: Isn’t that just agent-based modeling? Why aren’t you just using NetLogo or something?

A: Agent-based modeling (ABM) is great, but it’s a very expansive term that means a lot of things. Very often, ABMs consist of agents whose behavior is governed by simple rules rather than directed towards accomplishing goals. That notion of “agent” in ABM is almost entirely opposed to the notion of “agent” used in AI — propagated by Stuart Russell, for example. To AI people, goal-directedness is essential for agency. I’m not committed to rational behavior in this framework — I’m not an economist! But I do require that agents’ decision rules can be trained with respect to their goals.

There are a couple other ways in which I’m not doing paradigmatic ABM with this project. One is that I’m not focused on agents moving in 2D or 3D space. Rather, I’m much more interested in settings defined by systems of structural equations, and so in more continuous state spaces. I’m basing this work on years of contributing to heterogeneous agent macroeconomics tooling, and my frustration with that paradigm. So, no turtles on patches. I anticipate that spatial and even geospatial extensions to what I’m building would be really cool and useful. But I’m not there yet.

I think what I’m working on is ABM in the extended sense that Rob Axtell and Doyne Farmer use the term, and I hope to one day show them what I’m doing and for them to think it’s cool.

Q: Wait, is this about AI agents, as in Generative AI?

A: Ahaha… mainly no, but a little yes. I’m talking about “agents” in the more general sense used before the GenAI industry tried to make the word about them. I don’t see Generative AI or LLMs as a fundamental part of what I’m building. However, I do see what I’m building as a tool for evaluating the economic impact and trustworthiness of GenAI systems by modeling their supply chains and social consequences. And I can imagine deeper integrations with “(generative) agentic AI” down the line. I am building a tool, and an LLM might engage it through “tool use”. It’s also possible, I suppose, to make the agents inside the model use LLMs somehow, though I don’t see a good reason for that at the moment.

Q: Does it use AI at all?

A: Yes! I mean, perhaps you know that “AI” has meant many things and much of what it has meant is now considered quite mundane. But it does use deep learning, which is something that “AI” means now. In particular, part of the core functionality that I’m trying to build into it is a flexible version of the deep learning econometrics methods invented not too long ago by Lilia and Serguei Maliar. I hope to one day show this project to them, and for them to think it’s cool. Deep learning methods have become quite popular in economics, and this is in some ways yet another deep-learning-economics project. I hope it has a few features that distinguish it.
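To give a flavor of what deep learning buys you here, below is a toy sketch in the general spirit of simulation-based deep learning solution methods for dynamic economic models. It is not the Maliars’ actual algorithm and not code from my project; it just parameterizes a consumption rule with a small neural network and trains it, by gradient descent through a simulated lifecycle, to maximize expected discounted utility.

```python
# Toy sketch: solve a consumption-savings problem by training a neural network
# decision rule on simulated paths. Illustrative only; all parameter values
# are made up.
import torch
import torch.nn as nn

torch.manual_seed(0)

class ConsumptionRule(nn.Module):
    """Maps wealth to a consumption share in (0, 1)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

    def forward(self, wealth):
        return torch.sigmoid(self.net(wealth.unsqueeze(-1))).squeeze(-1)

beta, R, T, batch = 0.96, 1.03, 50, 1024   # discount factor, gross return, horizon, sample size
policy = ConsumptionRule()
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

for step in range(500):
    w = torch.full((batch,), 2.0)          # initial wealth
    lifetime_utility = torch.zeros(batch)
    for t in range(T):
        share = policy(w)                  # consumption share chosen by the neural net
        c = share * w + 1e-8               # consumption (epsilon guards against log(0))
        lifetime_utility = lifetime_utility + (beta ** t) * torch.log(c)
        income = torch.exp(0.1 * torch.randn(batch))   # i.i.d. lognormal income shock
        w = R * (w - c) + income           # budget constraint
    loss = -lifetime_utility.mean()        # maximize expected discounted utility
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The important feature is that the decision rule is just a trainable function of the state, so the same machinery generalizes to richer state spaces and to multiple agents.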

Q: How is this different from somebody else’s deep learning economics analysis package?

A: Great question! There are a few key ways that it’s different. One is that it’s designed around a clean separation between model definition and solution algorithms. There will be no model-specific solution code in this project. It’s truly intended to be a library, comparable to scikit-learn, but for systems of agents. In fact, I’m calling this project scikit-agent. You heard it here first!

Separating the model definitions from the solution algorithms means that there’s a lot more flexibility in how models are composed. This framework is based on the idea that parts of a model can be “blocks” which can be composed into more complex models. The “blocks” are bundles of structural equations, which can include state, control, and reward variables.

These ‘blocks’ are symbolically defined systems or environments. “Solving” the agent strategies in the multi-agent environment will be done with deep learning, otherwise known as artificial neural networks. So I think that it will be fair to call this framework a “neurosymbolic AI system”. I hope that saying that makes it easier to find funding for it down the line :)
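To make the “block” idea concrete, here is a purely hypothetical sketch. This is not the actual scikit-agent API, just an illustration of bundling shocks, structural equations, controls, and rewards so that a generic solver can act on them.

```python
# Hypothetical illustration of a model "block": a bundle of structural
# equations with designated control and reward variables. Not a real API.
from dataclasses import dataclass, field

@dataclass
class Block:
    name: str
    shocks: dict = field(default_factory=dict)     # exogenous random variables
    dynamics: dict = field(default_factory=dict)   # structural equations: variable -> function
    controls: list = field(default_factory=list)   # decision variables chosen by an agent
    reward: dict = field(default_factory=dict)     # reward variables the agent cares about

# A one-period consumption block: the agent observes market resources m,
# chooses consumption c, and carries assets a into the next period.
consumption_block = Block(
    name="consumption",
    dynamics={
        "m": lambda a, R, y: R * a + y,   # market resources from assets and income
        "a": lambda m, c: m - c,          # end-of-period assets
    },
    controls=["c"],
    reward={"u": lambda c: c ** 0.5},     # utility of consumption
)
```

Because a block is symbolic data rather than solution code, a solver is free to compose blocks into a larger model and then train the controls of each agent however it likes.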

Q: That sounds a little like causal game theory, or multi-agent influence diagrams. Are those part of this?

A: In fact, yes, so glad you asked. I think there’s a deep equivalence between multi-agent influence diagrams and ABM/computational economics which hasn’t been well explored. There are small notational differences that keep these communities from communicating. There are also a handful of substantively difficult theoretical issues that need to be settled with respect to, say, under what conditions a dynamic structural causal game can be solved using multi-agent reinforcement learning. These are cool problems, and I hope the thing I’m building implements good solutions to them.

Q: So, this is a framework for modeling dynamic Pearlian causal models, with multiple goal-directed agents, solving those models for agent strategies, and using those models econometrically?

A: Exactly.

Q: Does the thing you are building have any practical value? Or is it just more weird academic code?

A: I have high hopes that this thing I’m building could have a lot of practical value. Rigorous analysis of complex sociotechnical and economic systems remains a hard problem in finance, for example, as well as in public policy, insurance, international relations, and other fields. I do hope what I’m building interfaces well with real data to help with significant decision-making. These are problems that Generative AI is quite bad at, I believe. I’m trying to build a strong, useful foundation for working with statistical models that include agents in them. This is more difficult than regression or even ‘transformer’-based learning from media, because the agents are solving optimization problems inside the model.

Q: What are the first applications you have in mind for this tool?

A: I’m motivated to build this because I think it’s needed to address questions in technology policy and design. This is the main subject of my NSF-funded research over the past several years. Here are some problems I’m actively working on which overlap with the scope of this tool:

  • Integrating Differential Privacy and Contextual Integrity. I have a working paper with Rachel Cummings where we use structural causal games (SCGs) to set up the parameter tuning of a differentially private system as a mechanism design problem. The tool I’m building will be great at representing SCGs. With it, the theoretical technique can be used in practice.
  • Understanding the Effects of Consumer Finance Policy. I’m working with an amazing team on a project modeling consumer lending and the effects of various consumer protection regulations. We are looking at anti-usury laws, nondiscrimination rules, forgiveness of negative information, the use of alternative data by fintech companies, and so on. This policy analysis involves comparing a number of different scenarios and looking for what regulations produce which results, robustly. I’m building a tool to solve this problem.
  • AI Governance and Fiduciary Duties. A lot of people have pointed out that effective AI governance requires an understanding of AI supply chains. AI services rely on complex data flows through multiple actors, which are often imperfectly aligned and incompletely contracted. They also depend on physical data centers and the consumption of energy. This raises many questions around liability and quality control that wind up ultimately being about institutional design rather than neural network architectures. I’m building a tool to help reason through these institutional design questions. In other words, I’m building a new way to do threat modeling on AI supply chain ecosystems.

Q: Really?

A: Yes! I sometimes have a hard time wrapping my own head around what I’m doing, which is why I’ve written out this blog post. But I do feel very good about what I’m working on at the moment. I think it has a lot of potential.

Notes about “Data Science and the Decline of Liberal Law and Ethics”

Jake Goldenfein and I have put up on SSRN our paper, “Data Science and the Decline of Liberal Law and Ethics”. I’ve mentioned it on this blog before as something I’m excited about. It has also been several months since we finalized it, and I wanted to quickly jot down some notes about it, based on considerations going into it and since then.

The paper was the result of a long and engaged collaboration with Jake which started from a somewhat different place. We considered the question, “What is sociopolitical emancipation in the paradigm of control?” That was a mouthful, but it captured what we were going for:

  • Like a lot of people today, we are interested in the political project of freedom. Not just freedom in narrow, libertarian senses that have proven to be self-defeating, but in broader senses of removing social barriers and systems of oppression. We were ambivalent about the form that would take, but figured it was a positive project almost anybody would be on board with. We called this project emancipation.
  • Unlike a certain prominent brand of critique, we did not begin from an anthropological rejection of the realism of foundational mathematical theory from STEM and its application to human behavior. In this paper, we did not make the common move of suggesting that the source of our ethical problems is one that can be solved by insisting on the terminology or methodological assumptions of some other discipline. Rather, we took advances in, e.g., AI as real scientific accomplishments that are telling us how the world works. We called this scientific view of the world the paradigm of control, due to its roots in cybernetics.

I believe our work is making a significant contribution to the “ethics of data science” debate because it is quite rare to encounter work that is engaged with both projects. It’s common to see STEM work with no serious moral commitments or valence. And it’s common to see the delegation of what we would call emancipatory work to anthropological and humanistic disciplines: the STS folks, the media studies people, even critical X (race, gender, etc.) studies. I’ve discussed the limitations of this approach, however well-intentioned, elsewhere. Often, these disciplines argue that the “unethical” aspect of STEM is because of its methods, discourses, etc. To analyze things in terms of their technical and economic properties is to lose the essence of ethics, which is aligned with anthropological methods that are grounded in respectful, phenomenological engagement with their subjects.

This division of labor between STEM and anthropology has, in my view (I won’t speak for Jake) made it impossible to discuss ethical problems that fit uneasily in either field. We tried to get at these. The ethical problem is instrumentality run amok because of the runaway economic incentives of private firms combined with their expanded cognitive powers as firms, a la Herbert Simon.

This is not a terribly original point and we hope it is not, ultimately, a fringe political position either. If Martin Wolf can write for the Financial Times that there is something threatening to democracy about “the shift towards the maximisation of shareholder value as the sole goal of companies and the associated tendency to reward management by reference to the price of stocks,” so can we, and without fear that we will be targeted in the next red scare.

So what we are trying to add is this: there is a cognitivist explanation for why firms can become so enormously powerful relative to individual “natural persons”, one that is entirely consistent with the STEM foundations that have become dominant in places like, most notably, UC Berkeley as “data science”. And, we want to point out, the consequences of that knowledge, which we take to be scientific, run counter to the liberal paradigm of law and ethics. This paradigm, grounded in individual autonomy and privacy, is largely the paradigm animating anthropological ethics! So we are, a bit obliquely, explaining why the data science ethics discourse has gelled in the ways that it has.

We are not satisfied with the current state of ‘data science ethics’ because, to the extent that it clings to liberalism, we fear that it misses and even obscures the point, which can best be understood in a different paradigm.

We left as unfinished the hard work of figuring out what the new, alternative ethical paradigm that took cognitivism, statistics, and so on seriously would look like. There are many reasons beyond the conference publication page limit why we were unable to complete the project. The first of these is that, as I’ve been saying, it’s terribly hard to convince anybody that this is a project worth working on in the first place. Why? My view of this may be too cynical, but my explanations are that either (a) this is an interdisciplinary third rail because it upsets the balance of power between different academic departments, or (b) this is an ideological third rail because it successfully identifies a contradiction in the current sociotechnical order in a way that no individual is incentivized to recognize, because that order incentivizes individuals to disperse criticism of its core institutional logic of corporate agency, or (c) it is so hard for any individual to conceive of corporate cognition, because of how it exceeds the capacity of human understanding, that speaking in this way sounds utterly speculative to a lot of people. The problem is that it requires attributing cognitive and adaptive powers to social forms, and a successful science of social forms is, at best, in the somewhat gnostic domain of complex systems research.

Complex systems research is rarely engaged in technology policy, but I think it’s the frontier.

References

Benthall, Sebastian and Goldenfein, Jake, Data Science and the Decline of Liberal Law and Ethics (June 22, 2020). Ethics of Data Science Conference – Sydney 2020 (forthcoming). Available at SSRN: https://ssrn.com/abstract=

computational institutions as non-narrative collective action

Nils Gilman recently pointed to a book chapter that confirms the need for “official futures” in capitalist institutions.

Nils indulged me in a brief exchange that helped me better grasp at a bothersome puzzle.

There is a certain class of intellectuals that insist on the primacy of narratives as a mode of human experience. These tend to be, not too surprisingly, writers and other forms of storytellers.

There is a different class of intellectuals that insists on the primacy of statistics. Statistics does not make it easy to tell stories because it is largely about the complexity of hypotheses and our lack of confidence in them.

The narrative/statistic divide could be seen as a divide between academic disciplines. It has often been taken to be, I believe wrongly, the crux of the “technology ethics” debate.

I questioned Nils as to whether his generalization stood up to statistically driven allocation of resources; i.e., those decisions made explicitly on probabilistic judgments. He argued that in the end, management and collective action require consensus around narrative.

In other words, what keeps narratives at the center of human activity is that (a) humans are in the loop, and (b) humans are collectively in the loop.

The idea that communication is necessary for collective action is one I used to put great stock in when studying Habermas. For Habermas, consensus, and especially linguistic consensus, is how humanity moves together. Habermas contrasted this mode of knowledge aimed at consensus and collective action with technical knowledge, which is aimed at efficiency. Habermas envisioned a society ruled by communicative rationality, deliberative democracy; following this line of reasoning, this communicative rationality would need to be a narrative rationality. Even if this rationality is not universal, it might, in Habermas’s later conception of governance, be shared by a responsible elite. Lawyers and a judiciary, for example.

The puzzle that recurs again and again in my work has been the challenge of communicating how technology has become an alternative form of collective action. The claim made by some that technologists are a social “other” makes more sense if one sees them (us) as organizing around non-narrative principles of collective behavior.

It is I believe beyond serious dispute that well-constructed, statistically based collective decision-making processes perform better than many alternatives. In the field of future predictions, Philip Tetlock’s work on superforecasting teams, and his prior work on expert political judgment, has long stood as an empirical challenge to the supposed primacy of narrative-based forecasting. This challenge has not been taken up; it seems rather one-sided. One reason for this may be that the rationale for the effectiveness of these techniques rests ultimately in the science of statistics.

It is now common to insist that Artificial Intelligence should be seen as a sociotechnical system and not as a technological artifact. I wholeheartedly agree with this position. However, it is sometimes implied that to understand AI as a sociotechnical system, one must understand it in narrative terms. This is an error; it assumes that the collective action that builds an AI system, and the technology itself, are held together by narrative communication.

But if the whole purpose of building an AI system is to collectively act in a way that is more effective because of its facility with the nuances of probability, then the narrative lens will miss the point. The promise and threat of AI is that it delivers a different, often more effective, form of collective or institution. I’ve suggested that “computational institution” might be the best way to refer to such a thing.

The Data Processing Inequality and bounded rationality

I have long harbored the hunch that information theory, in the classic Shannon sense, and social theory are deeply linked. It has proven to be very difficult to find an audience for this point of view or an opportunity to work on it seriously. Shannon’s information theory is widely respected in engineering disciplines; many social theorists who are unfamiliar with it are loath to admit that something from engineering should carry essential insights for their own field. Meanwhile, engineers are rarely interested in modeling social systems.

I’ve recently discovered an opportunity to work on this problem through my dissertation work, which is about privacy engineering. Privacy is a subtle social concept but also one that has been rigorously formalized. I’m working on formal privacy theory now and have been reminded of a theorem from information theory: the Data Processing Inequality. What strikes me about this theorem is that it captures a point that comes up again and again in social and political problems, though it’s a point that’s almost never addressed head on.

The Data Processing Inequality (DPI) states that for three random variables, $X$, $Y$, and $Z$, arranged in a Markov chain such that $X \rightarrow Y \rightarrow Z$, we have $I(X,Z) \leq I(X,Y)$, where $I$ stands for mutual information. Mutual information is a measure of how much two random variables carry information about each other. If $I(X,Y) = 0$, the variables are independent. $I(X,Y) \geq 0$ always; that’s just a mathematical fact about how it’s defined.

The implications of this for psychology, social theory, and artificial intelligence are I think rather profound. It provides a way of thinking about bounded rationality in a simple and generalizable way–something I’ve been struggling to figure out for a long time.

Suppose that there’s a big world out there, $W$, and there’s an organism, or a person, or a sociotechnical organization within it, $Y$. The world is big and complex, which implies that it has a lot of informational entropy, $H(W)$. Through whatever sensory apparatus is available to $Y$, it acquires some kind of internal sensory state. Because this organism is much smaller than the world, its entropy is much lower. There are many fewer possible states that the organism can be in, relative to the number of states of the world: $H(W) \gg H(Y)$. This in turn bounds the mutual information between the organism and the world: $I(W,Y) \leq H(Y)$.

Now let’s suppose the actions that the organism takes, $Z$, depend only on its internal state. It is an agent, reacting to its environment. Well, whatever these actions are, they can only be as calibrated to the world as the agent had capacity to absorb the world’s information. I.e., $I(W,Z) \leq H(Y) \ll H(W)$. The implication is that the more limited the mental capacity of the organism, the more its actions will be approximately independent of the state of the world that precedes it.
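As a sanity check on this chain of reasoning, here is a small numerical sketch (all distributions are made up for illustration) that builds a discrete Markov chain $W \rightarrow Y \rightarrow Z$ with a large world and a small organism, and confirms that $I(W,Z) \leq I(W,Y) \leq H(Y)$:

```python
# Numerical illustration of the Data Processing Inequality on a toy chain
# W -> Y -> Z. Distributions are randomly generated for illustration only.
import numpy as np

rng = np.random.default_rng(0)

n_w, n_y, n_z = 32, 4, 4                   # the world has many more states than the organism

p_w = rng.dirichlet(np.ones(n_w))                       # P(W)
p_y_given_w = rng.dirichlet(np.ones(n_y), size=n_w)     # P(Y|W): a lossy "sensor"
p_z_given_y = rng.dirichlet(np.ones(n_z), size=n_y)     # P(Z|Y): the "action rule"

p_wy = p_w[:, None] * p_y_given_w          # joint P(W, Y)
p_wz = p_wy @ p_z_given_y                  # joint P(W, Z): Z depends on W only through Y

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(p_xy):
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy(p_xy.sum(axis=1)) + entropy(p_xy.sum(axis=0)) - entropy(p_xy.ravel())

print("H(W)   =", entropy(p_w))
print("H(Y)   =", entropy(p_wy.sum(axis=0)))
print("I(W;Y) =", mutual_information(p_wy))
print("I(W;Z) =", mutual_information(p_wz))
# Expect: I(W;Z) <= I(W;Y) <= H(Y) << H(W)
```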

There are a lot of interesting implications of this for social theory. Here are a few cases that come to mind.

I've written quite a bit here (blog links) and here (arXiv) about Bostrom’s superintelligence argument and why I’m generally not concerned with the prospect of an artificial intelligence taking over the world. My argument is that there are limits to how much an algorithm can improve itself, and these limits put a stop to exponential intelligence explosions. I’ve been criticized on the grounds that I don’t specify what the limits are, and that if the limits are high enough then maybe relative superintelligence is possible. The Data Processing Inequality gives us another tool for estimating the bounds of an intelligence based on the range of physical states it can possibly be in. How calibrated can a hegemonic agent be to the complexity of the world? It depends on the capacity of that agent to absorb information about the world; that can be measured in information entropy.

A related case is a rendering of Scott’s Seeing Like a State arguments. Why is it that “high modernist” governments failed to successfully control society through scientific intervention? One reason is that the complexity of the system they were trying to manage vastly exceeded the complexity of the centralized control mechanisms. Centralized control was very blunt, causing many social problems. Arguably, behavioral targeting and big data centers today equip controlling organizations with more informational capacity (more entropy), but they still get it wrong sometimes, causing privacy violations, because they can’t model the entirety of the messy world we’re in.

The Data Processing Inequality is also helpful for explaining why the world is so messy. There are a lot of different agents in the world, and each one only has so much bandwidth for taking in information. This means that most agents are acting almost independently from each other. The guiding principle of society isn’t signal, it’s noise. That explains why there are so many disorganized heavy tail distributions in social phenomena.

Importantly, if we let the world at any time slice be informed by the actions of many agents acting nearly independently from each other in the slice before, then that increases the entropy of the world. This increases the challenge for any particular agent to develop an effective controlling strategy. For this reason, we would expect the world to get more out of control the more intelligent agents are on average. The popularity of the personal computer perhaps introduced a lot more entropy into the world, distributed in an agent-by-agent way. Moreover, powerful controlling data centers may increase the world’s entropy, rather than reducing it. So even if, for example, Amazon were to try to take over the world, the existence of Baidu would be a major obstacle to its plans.

There are a lot of assumptions built into these informal arguments and I’m not wedded to any of them. But my point here is that information theory provides useful tools for thinking about agents in a complex world. There’s potential for using it for modeling sociotechnical systems and their limitations.

Managerialism as political philosophy

Technologically mediated spaces and organizations are frequently described by their proponents as alternatives to the state. From David Clark’s maxim of Internet architecture, “We reject: kings, presidents and voting. We believe in: rough consensus and running code”, to cyberanarchist efforts to bypass the state via blockchain technology, to the claims that Google and Facebook, as they mediate between billions of users, are relevant non-state actors in international affairs, to Lessig’s (1999) ever-prescient claim that “Code is Law”, there is undoubtedly something going on with technology’s relationship to the state which is worth paying attention to.

There is an intellectual temptation (one that I myself am prone to) to take seriously the possibility of a fully autonomous technological alternative to the state. Something like a constitution written in source code has an appeal: it would be clear, precise, and presumably based on something like a consensus of those who participate in its creation. It is also an idea that can be frightening (Give up all control to the machines?) or ridiculous. The example of The DAO, the Ethereum ‘distributed autonomous organization’ that raised millions of dollars only to have them stolen in a technical hack, demonstrates the value of traditional legal institutions which protect the parties that enter contracts with processes that ensure fairness in their interpretation and enforcement.

It is more sociologically accurate, in any case, to consider software, hardware, and data collection not as autonomous actors but as parts of a sociotechnical system that maintains and modifies them. This is obvious to practitioners, who spend their lives negotiating the social systems that create technology. For those for whom it is not obvious, there are reams of literature on the social embeddedness of “algorithms” (Gillespie, 2014; Kitchin, 2017). These themes are recited again in recent critical work on Artificial Intelligence; there are those who wisely point out that a functioning artificially intelligent system depends on a lot of labor (those who created and cleaned data, those who built the systems they are implemented on, those who monitor the system as it operates) (Kelkar, 2017). So rather than discussing the role of particular technologies as alternatives to the state, we should shift our focus to the great variety of sociotechnical organizations.

One thing that is apparent, when taking this view, is that states, as traditionally conceived, are themselves sociotechnical organizations. This is, again, an obvious point well illustrated in economic histories such as Beniger (1986). Communications infrastructure is necessary for the control and integration of society, let alone effective military logistics. The relationship between the state and the industrial actors developing this infrastructure, whether it be building roads, running a postal service, laying rail or telegraph wires, telephone wires, satellites, Internet protocols, and now social media, has always been interesting, and a story of great fortunes and shifts in power.

What is apparent after a serious look at this history is that political theory, especially liberal political theory as it developed from the 1700s onward as a theory of the relationship between individuals bound by social contract, emerging from nature to develop a just state, leaves out essential scientific facts about how society has ever been governed. Control of communications and control infrastructure has never been equally dispersed and has always been a source of power. Late modern rearticulations of liberal theory and reactions against it (Rawls and Nozick both) leave out technical constraints on the possibility of governance and even the constitution of the subject on which a theory of justice would have its ground.

Were political theory to begin from a more realistic foundation, it would need to acknowledge the existence of sociotechnical organizations as a political unit. There is a term for this view, “managerialism”, which, as far as I can tell, is used somewhat pejoratively, like “neoliberalism”. As an “-ism”, it’s implied that managerialism is an ideology. When we talk about ideologies, what we are doing is looking from an external position onto an interdependent set of beliefs in their social context and identifying, through genealogical method or logical analysis, how those beliefs are symptoms of underlying causes that are not precisely as represented within those beliefs themselves. For example, one critiques neoliberal ideology, which purports that markets are the best way to allocate resources and advocates for the expansion of market logic into more domains of social and political life, by pointing out that markets are great for reallocating resources to capitalists, who bankroll neoliberal ideologues, but that many people who are subject to neoliberal policies do not benefit from them. While this is a bit of a parody of both neoliberalism and the critiques of it, you’ll catch my meaning.

We might avoid the pitfalls of an ideological managerialism (I’m not sure what those would be, exactly, having not read the critiques) by taking from it, to begin with, only the urgency of describing social reality in terms of organization and management, without assuming any particular normative stake. It will be argued that this is not a neutral stance, because to posit that there is organization, and that there is management, is to offend certain kinds of (mainly academic) thinkers. I get the sense that this offendedness is similar to the offense taken by certain critical scholars to the idea that there is such a thing as scientific knowledge, especially social scientific knowledge. Namely, it is an offense taken at the idea that a patently obvious fact entails one’s own ignorance of otherwise very important expertise. This is encouraged by the institutional incentives of social science research. Social scientists are required to maintain an aura of expertise even when their particular sub-discipline excludes from its analysis the very systems of bureaucratic and technical management that its university depends on. University bureaucracies are, strangely, in the business of hiding their managerialist reality from their own faculty, as alternative avenues of research inquiry are of course compelling in their own right. When managerialism cannot be contested on epistemic grounds (because the bluff has been called), it can be rejected on aesthetic grounds: managerialism is not “interesting” to a discipline, perhaps because it does not engage with the personal and political motivations that constitute it.

What sets managerialism apart from other ideologies, however, is that when we examine its roots in social context, we do not discover a contradiction. Managerialism is not, as far as I can tell, successful as a popular ideology. Managerialism is attractive only to that rare segment of the population that works closely with bureaucratic management. It is here that the technical constraints of information flow and its potential uses, the limits of autonomy especially as it confronts the autonomies of others, the persistence of hierarchy despite the purported flattening of social relations, and so on become unavoidable features of life. And though one discovers in these situations plenty of managerial incompetence, one also comes to terms with why that incompetence is a necessary feature of the organizations that maintain it.

Little of what I am saying here is new, of course. It is only new in relation to more popular or appealing forms of criticism of the relationship between technology, organizations, power, and ethics. So often the political theory implicit in these critiques is a form of naive egalitarianism that sees a differential in power as an ethical red flag. Since technology can give organizations a lot of power, this generates a lot of heat around technology ethics. Starting from the perspective of an ethicist, one sees an uphill battle against an increasingly inscrutable and unaccountable sociotechnical apparatus. What I am proposing is that we look at things a different way. If we start from general principles about technology and its role in organizations–the kinds of principles one would get from an analysis of microeconomic theory, artificial intelligence as a mathematical discipline, and so on–one can try to formulate managerial constraints that truly confront society. These constraints are part of how subjects are constituted and should inform what we see as “ethical”. If we can broker between these hard constraints and the societal values at stake, we might come up with a principle of justice that, if unpopular, may at least be realistic. This would be a contribution, at the end of the day, to political theory, not as an ideology, but as a philosophical advance.

References

Beniger, James R. The Control Revolution: Technological and Economic Origins of the Information Society. Harvard University Press (1986).

Bird, Sarah, et al. “Exploring or Exploiting? Social and Ethical Implications of Autonomous Experimentation in AI.” (2016).

Gillespie, Tarleton. “The relevance of algorithms.” Media technologies: Essays on communication, materiality, and society 167 (2014).

Kelkar, Shreeharsh. “How (Not) to Talk about AI.” Platypus, 12 Apr. 2017, blog.castac.org/2017/04/how-not-to-talk-about-ai/.

Kitchin, Rob. “Thinking critically about and researching algorithms.” Information, Communication & Society 20.1 (2017): 14-29.

Lessig, Lawrence. “Code is law.” The Industry Standard 18 (1999).

artificial life, artificial intelligence, artificial society, artificial morality

“Everyone” “knows” what artificial intelligence is and isn’t and why it is and isn’t a transformative thing happening in society and technology and industry right now.

But the fact is that most of what “we” “call” artificial intelligence is really just increasingly sophisticated ways of solving a single class of problems: optimization.

Essentially what’s happened in AI is that all empirical inference problems can be modeled as Bayesian problems, which are then solved using variational inference methods, which essentially turn the Bayesian statistics problem into a tractable optimization problem and solve that.
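To make that claim concrete, here is the standard variational formulation, which is textbook material rather than anything specific to the systems I’m complaining about: posterior inference over latent variables $z$ given data $x$ is replaced by maximizing a lower bound on the evidence over a tractable family of distributions $Q$.

```latex
% Variational inference as optimization: approximate the posterior p(z|x) by
% the member q* of a tractable family Q that maximizes the evidence lower bound.
\log p(x) \;\geq\; \mathcal{L}(q)
  = \mathbb{E}_{q(z)}\!\left[\log p(x, z)\right] - \mathbb{E}_{q(z)}\!\left[\log q(z)\right],
\qquad
q^{*} = \arg\max_{q \in Q} \mathcal{L}(q).
```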

Advances in optimization have greatly expanded the number of things computers can accomplish as part of a weak AI research agenda.

Frequently these remarkable successes in Weak AI are confused with an impending revolution in what used to be called Strong AI but which now is more frequently called Artificial General Intelligence, or AGI.

Recent interest in AGI has spurred a lot of interesting research. How could it not be interesting? It is also, for me, extraordinarily frustrating research because I find the philosophical precommitments of most AGI researchers baffling.

One insight that I wish made its way more frequently into discussions of AGI is an insight made by the late Francisco Varela, who argued that you can’t really solve the problem of artificial intelligence until you have solved the problem of artificial life. This is for the simple reason that only living things are really intelligent in anything but the weak sense of being capable of optimization.

Once being alive is taken as a precondition for being intelligent, the problem of understanding AGI implicates a profound and fascinating problem of understanding the mathematical foundations of life. This is a really amazing research problem that for some reason is never ever discussed by anybody.

Let’s assume it’s possible to solve this problem in a satisfactory way. That’s a big If!

Then a theory of artificial general intelligence should be able to show how some artificial living organisms are and others are not intelligent. I suppose what’s most significant here is the shift from thinking of AI in terms of “agents”, a term so generic as to be perhaps at the end of the day meaningless, to thinking of AI in terms of “organisms”, which suggests a much richer set of preconditions.

I have similar grief over contemporary discussion of machine ethics. This is a field with fascinating, profound potential. But much of what machine ethics boils down to today are trolley problems, which are as insipid as they are troublingly intractable. There’s other, better machine ethics research out there, but I’ve yet to see something that really speaks to properly defining the problem, let alone solving it.

This is perhaps because for a machine to truly be ethical, as opposed to just being designed and deployed ethically, it must have moral agency. I don’t mean this in some bogus early Latourian sense of “wouldn’t it be fun if we pretended seatbelts were little gnomes clinging to our seats” but in an actual sense of participating in moral life. There’s a good case to be made that the latter is not something easily reducible to decontextualized action or function, but rather has to do with how one participates more broadly in social life.

I suppose this is a rather substantive metaethical claim to be making. It may be one that’s at odds with common ideological trainings in Anglophone countries where it’s relatively popular to discuss AGI as a research problem. It has more in common, intellectually and philosophically, with continental philosophy than analytic philosophy, whereas “artificial intelligence” research is in many ways a product of the latter. This perhaps explains why these two fields are today rather disjoint.

Nevertheless, I’d happily make the case that the continental tradition has developed a richer and more interesting ethical tradition than what analytic philosophy has given us. Among other reasons, this is because of how it is able to situate ethics as a function of a more broadly understood social and political life.

I postulate that what is characteristic of social and political life is that it involves the interaction of many intelligent organisms. Which of course means that to truly understand this form of life and how one might recreate it artificially, one must understand artificial intelligence and, transitively, artificial life.

Only once artificial society is sufficiently well understood could we then approach the problem of artificial morality, or how to create machines that truly act according to moral or ethical ideals.

autonomy and immune systems

Somewhat disillusioned lately with the inflated discourse on “Artificial Intelligence” and trying to get a grip on the problem of “collective intelligence” with others in the Superintelligence and the Social Sciences seminar this semester, I’ve been following a lead (proposed by Julian Jonker) that perhaps the key idea at stake is not intelligence, but autonomy.

I was delighted when searching around for material on this to discover Bourgine and Varela’s “Towards a Practice of Autonomous Systems” (pdf link) (1992). Francisco Varela is one of my favorite thinkers, though he is a bit fringe on account of being both Chilean and unafraid of integrating Buddhism into his scholarly work.

The key point of the linked paper is that for a system (such as a living organism, but we might extend the idea to a sociotechnical system like an institution or any other “agent” like an AI) to be autonomous, it has to have a kind of operational closure over time–meaning not that it is closed to interaction, but that its internal states progress through some logical space–and that it must maintain its state within a domain of “viability”.

Though essentially a truism, I find it a simple way of thinking about what it means for a system to preserve itself over time. What we gain from this organic view of autonomy (Varela was a biologist) is an appreciation of the fact that an agent needs to adapt simply in order to survive, let alone to act strategically or reproduce itself.

Bourgine and Varela point out three separate adaptive systems in most living organisms:

  • Cognition. Information processing that determines the behavior of the system relative to its environment. It adapts to new stimuli and environmental conditions.
  • Genetics. Information processing that determines the overall structure of the agent. It adapts through reproduction and natural selection.
  • The immune system. Information processing that identifies invasive micro-agents that would threaten the integrity of the overall agent. It creates internal antibodies to shut down internal threats.

Sean O Nuallain has proposed that one’s sense of personal self is best thought of as a kind of immune system. We establish a barrier between ourselves and the world in order to maintain a cogent and healthy sense of identity. One could argue that to have an identity at all is to have a system of identifying what is external to it and rejecting it. Compare this with psychological ideas of ego maintenance and Jungian confrontations with “the Shadow”.

At a social organizational level, we can speculate that there is still an immune function at work. Left and right wing ideologies alike have cultural “antibodies” to quickly shut down expressions of ideas that pattern match to what might be an intellectual threat. Academic disciplines have to enforce what can be said within them so that their underlying theoretical assumptions and methodological commitments are not upset. Sociotechnical “cybersecurity” may be thought of as a kind of immune system. And so on.

Perhaps the most valuable use of the “immune system” metaphor is that it identifies a mid-range level of adaptivity that can be truly subconscious, given whatever mode of “consciousness” you are inclined to point to. Social and psychological functions of rejection are in a sense a condition for higher-level cognition. At the same time, this pattern of rejection means that some information cannot be integrated materially; it must be integrated, if at all, through the narrow lens of the senses. At an organizational or societal level, individual action may be rejected because of its disruptive effect on the total system, especially if the system has official organs for accomplishing more or less the same thing.

Instrumentality run amok: Bostrom and Instrumentality

Narrowing our focus onto the crux of Bostrom’s argument, we can see how tightly it is bound to a much older philosophical notion of instrumental reason. This comes to the forefront in his discussion of the orthogonality thesis (p.107):

The orthogonality thesis
Intelligence and final goals are orthogonal: more or less any level of intelligence could in principle be combined with more or less any final goal.

Bostrom goes on to clarify:

Note that the orthogonality thesis speaks not of rationality or reason, but of intelligence. By “intelligence” we here mean something like skill at prediction, planning, and means-ends reasoning in general. This sense of instrumental cognitive efficaciousness is most relevant when we are seeking to understand what the causal impact of a machine superintelligence might be.

Bostrom maintains that the generality of instrumental intelligence, which I would argue is evinced by the generality of computing, gives us a way to predict how intelligent systems will act. Specifically, he says that an intelligent system (and specifically a superintelligence) might be predictable because of its design, because of its inheritance of goals from a less intelligent system, or because of convergent instrumental reasons. (p.108)

Return to the core logic of Bostrom’s argument. The existential threat posed by superintelligence is simply that the instrumental intelligence of an intelligent system will invest in itself and overwhelm any ability by us (its well-intentioned creators) to control its behavior through design or inheritance. Bostrom thinks this is likely because instrumental intelligence (“skill at prediction, planning, and means-ends reasoning in general”) is a kind of resource or capacity that can be accumulated and put to other uses more widely. You can use instrumental intelligence to get more instrumental intelligence; why wouldn’t you? The doomsday prophecy of a fast takeoff superintelligence achieving a decisive strategic advantage and becoming a universe-dominating singleton depends on this internal cycle: instrumental intelligence investing in itself and expanding exponentially, assuming low recalcitrance.

This analysis brings us to a significant focal point. The critical missing formula in Bostrom’s argument is (specifically) the recalcitrance function of instrumental intelligence. This is not the same as recalcitrance with respect to “general” intelligence or even “super” intelligence. Rather, what’s critical is how much a process dedicated to “prediction, planning, and means-ends reasoning in general” can improve its own capacities at those things autonomously. The values of this recalcitrance function will bound the speed of superintelligence takeoff. These bounds can then inform the optimal allocation of research funding towards anticipation of future scenarios.
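For reference, this is Bostrom’s own informal takeoff relation from his discussion of intelligence explosion kinetics, restated here as I read it rather than quoted: the rate of growth in intelligence is optimization power divided by recalcitrance. My point is that the recalcitrance relevant to the argument is that of instrumental intelligence specifically.

```latex
% Bostrom's informal takeoff relation (restated): intelligence growth equals
% optimization power over recalcitrance.
\frac{dI}{dt} \;=\; \frac{O(I)}{R(I)}
% When the system supplies much of its own optimization power, O grows with I,
% and the shape of R(I) for instrumental intelligence determines whether the
% growth is explosive or levels off.
```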


In what I hope won’t distract from the logical analysis of Bostrom’s argument, I’d like to put it in a broader context.

Take a minute to think about the power of general purpose computing and the impact it has had on the past hundred years of human history. As the earliest digital computers were informed by notions of artificial intelligence (c.f. Alan Turing), we can accurately say that the very machine I use to write this text, and the machine you use to read it, are the result of refined, formalized, and materialized instrumental reason. Every programming language is a level of abstraction over a machine that has no ends in itself, but which serves the ends of its programmer (when it’s working). There is a sense in which Bostrom’s argument is not about a near future scenario but rather is just a description of how things already are.

Our very concepts of “technology” and “instrument” are so related that it can be hard to see any distinction at all. (c.f. Heidegger, “The Question Concerning Technology“) Bostrom’s equating of instrumentality with intelligence is a move that makes more sense as computing becomes ubiquitously part of our experience of technology. However, if any instrumental mechanism can be seen as a form of intelligence, that lends credence to panpsychist views of cognition as life. (c.f. the Santiago theory)

Meanwhile, arguably the genius of the market is that it connects ends (through consumption or “demand”) with means (through manufacture and services, or “supply”) efficiently, bringing about the fruition of human desire. If you replace “instrumental intelligence” with “capital” or “money”, you get a familiar critique of capitalism as a system driven by capital accumulation at the expense of humanity. The analogy with capital accumulation is worthwhile here. Much as in Bostrom’s “takeoff” scenarios, we can see how capital (in the modern era, wealth) is reinvested in itself and grows at an exponential rate. Variable rates of return on investment lead to great disparities in wealth. We today have a “multipolar scenario” as far as the distribution of capital is concerned. At times people have advocated for an economic “singleton” through a planned economy.

It is striking that the contemporary analytic philosopher and futurist Nick Bostrom contemplates the same malevolent force in his apocalyptic scenario as does Max Horkheimer in his 1947 treatise “Eclipse of Reason”: instrumentality run amok. Whereas Bostrom concerns himself primarily with what is literally a machine dominating the world, Horkheimer sees the mechanism of self-reinforcing instrumentality as pervasive throughout the economic and social system. For example, he sees engineers as loci of active instrumentalism. Bostrom never cites Horkheimer, let alone Heidegger. That there is a convergence of different philosophical sub-disciplines on the same problem suggests that there are convergent ultimate reasons which may triumph over convergent instrumental reasons in the end. The question of what these convergent ultimate reasons are, and what their relationship to instrumental reasons is, is a mystery.