We need a theory of collective agency to guide data intermediary design

Last week Jake Goldenfein and I presented some work-in-progress to the Centre for Artificial Intelligence and Digital Ethics (CAIDE) at the University of Melbourne. The title of the event was “Data science and the need for collective law and ethics”; what that title perhaps masks is our shift toward the problem of data intermediaries. I wanted to write a bit about how we’re thinking about these issues.

This work builds on our paper “Data Science and the Decline of Liberal Law and Ethics”, which was accepted by a conference that was then canceled due to COVID-19. In retrospect, it’s perhaps for the best that the conference was canceled. The “decline of liberalism” theme fit the political moment when we wrote the piece, when Trump and Sanders were contenders for the presidency of the U.S., and authoritarian regimes appeared to be providing a new paradigm for governance. Now, Biden is the victor and it doesn’t look like liberalism is going anywhere. We must suppose that our project will take place in a (neo)liberal context.

Our argument in that work was that many of the ideas animating the (especially Anglophone) liberalism we see in the legal systems of the U.S., the U.K., and Australia have been inadequate to meaningfully regulate artificial intelligence. This is because liberalism imagines a society of rational individuals appropriating private property through exchanges on a public market and acting autonomously, whereas today we have a wide range of agents with varying levels of bounded rationality, many of which are “artificial” in Herbert Simon’s sense of being computer-enabled firms, tied together in networks of control, not least of these being privately owned markets (the platforms). Essentially, loopholes in liberalism have allowed a quite different form of sociotechnical ordering to emerge because that political theory did not take into account a number of rather recently discovered scientific truths about information, computing, and control. Our project is to tackle this disconnect between theory and actuality, and to try to discover what’s next in terms of a properly cybernetic political theory that advances the goal of human emancipation.

Picking up where our first paper left off, this has gotten us looking at data intermediaries. This is an area where there has been a lot of work! We were particularly inspired by Mozilla’s Data Futures review of different forms of data intermediary institutions, including data coops, data trusts, data marketplaces, and so on. There is a wide range of ongoing experiments with alternative forms of “data stewardship” or “data governance”.

Our approach has been to try to frame and narrow down the options based on normative principles, legal options, and technical expertise. Rather than asking empirically what forms of data governance have been attempted, we are wondering: what ought the goals of a data intermediary be, given the facts about cybernetic agency in the world we live in? How could such an institution accomplish what has been lost by the inadequacies of liberalism?

Our thinking has led us to the position that what has prevented liberalism from regulating the digital economy is its emphasis on individual autonomy. We draw on the new consensus in privacy scholarship that individual “notice and choice” is an ineffective way to guarantee consumer protection in the digital economy. Not only do bounded rationality constraints prevent consumers from understanding what they are agreeing to, but the ability of firms to control consumers’ choice architecture has also dwarfed the meaningfulness of whatever rationality individuals do have. Meanwhile, it is now well understood (perhaps most recently by Pistor (2020)) that personal data is valuable only when it is cleaned and aggregated. This makes the locus of economic agency around personal data necessarily a collective one.

This line of inquiry leads us to a deep question to which we do not yet have a ready answer, which is “What is collective emancipation in the paradigm of control?” Meaning, given what we know about the “sciences of the artificial”, control theory, theory of computation and information, etc., with all of its challenges to the historical idea of the autonomous liberal agent, what does it mean for a collective of individuals to be free and autonomous?

We got a lot of good feedback on our talk, especially from discussant Seth Lazar, who pointed out that there are many communitarian strands of liberalism that we could look to for normative guides. He mentioned, for example, Elizabeth Anderson’s relational egalitarianism. We asked Seth whether he thought that the kind of institution that guaranteed the collective autonomy of its members would have to be a state, and he pointed out that that was a question of whether or not such a system would be entitled to use coercion.

There’s a lot to do on this project. While it is quite heady and philosophical, I do not think that it is necessarily only an abstract or speculative project. In a recent presentation, Vincent Southerland proposed that one solution to the problematic use of algorithms in criminal sentencing would be for “the community” of those advocating for equity in the criminal justice system to operate their own automated decision systems. This raises an important question: how could and should a community govern its own technical systems in order to support what, in Southerland’s case, is an abolitionist agenda? I see this as a very aligned project.

There is also a technical component to the problem. Because of economies of scale and the legal climate, more and more computation is moving onto proprietary cloud systems. Most software now is provided “as a service”. It’s unclear what this means for organizations that would try to engage in self-governance, even when these organizations are autonomous state entities such as municipalities. In some conversations, we have considered what modifications of the technical ideas of the “user agent”, security firewalls and local networks, and hybrid cloud infrastructure would enable collective self-governance. This is the pragmatic “how?” that follows our normative “what?” and “why?” questions, but it is no less important to implementing a prototype solution.


Benthall, Sebastian and Goldenfein, Jake, Data Science and the Decline of Liberal Law and Ethics (June 22, 2020). Available at SSRN: https://ssrn.com/abstract=3632577 or http://dx.doi.org/10.2139/ssrn.3632577

Narayanan, A., Toubiana, V., Barocas, S., Nissenbaum, H., & Boneh, D. (2012). A critical look at decentralized personal data architectures. arXiv preprint arXiv:1202.4503.

Pistor, K. (2020). Rule by data: The end of markets? Law & Contemp. Probs., 83, 101.