Research update: to study the economy of personal data

by Sebastian Benthall

I have not been writing here for some time because of strokes of good luck that have been keeping me busy.

I’ve been awarded a Social Behavioral and Economic Sciences (SBE) Post-Doctoral Research Fellowship (“SPRF” in total) by the National Science Foundation.

This is a lot of words to write out, but they sum up to a significant change in my research and role that I’m still adjusting to.

First, I believe this means that I am a social scientist of some kind. What kind? It’s not clear. If I could have my choice, it would be “economist”. But since Economics is a field widely known for gatekeeping, and I do not have an Economics degree, I’m not sure I can get away with this.

Nevertheless, my SPRF research project is an investigation into the economics of data (especially personal data) using methods that are built on those used in orthodox and heterodox economics.

The study of the economics of personal data is coming from my dissertation work and the ongoing policy research I’ve done at NYU School of Law’s Information Law Institute. Though my work has touched on many other fields — computer science and the design of information systems; sociology and the study of race and networked publics; philosophy and law — at the end of the day the drivers of “technology’s” impact on society are businesses operating according to an economic logic. This is something that everybody knows, but that few academic researchers are in a position to admit, because many of the scholars who think seriously about these issues are coming from other disciplines.

For better or for worse, I have trouble sticking to a tunnel when I can’t see the intellectual daylight at the end of it.

So how can we study the economy of personal data?

I would argue (and this is something that most Economists will balk at) that the tools currently available to study this economy are insufficient for the task. Who am I to say such a thing? Nobody special.

But weighing in my favor is the argument that even the tools used by Economists to study the macroeconomy are insufficient for the task. This point was made decisively by the 2008 Financial Crisis, which blindsided the economic establishment. One of the reasons why Economics failed was that the discipline had deeply entrenched, oversimplified assumptions in its economic models. One of these was representative agent modeling, which presumed to model the entire economy with a single “representative agent” for a sector or domain. This makes the economist’s calculations easier but is clearly unrealistic, and indeed it’s the differences between agents that create much of the dynamism and pitfalls of the economy. Hence the rise in heterogeneous agent modeling (HAM), which is explicit about the differences between agents with respect to things like, for example, wealth, risk aversion, discount factor, level of education, and so on.
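
To make the contrast concrete, here is a toy sketch in Python. It is not HARK code and does not use any Econ-ARK API. The square-root consumption rule and the lognormal wealth distribution are assumptions chosen only for illustration. It compares the average consumption predicted by a single agent holding mean wealth with the average over a population whose wealth differs.

```python
import math
import random


def consumption(wealth: float) -> float:
    """Hypothetical concave consumption rule (an assumption, not an estimate)."""
    return math.sqrt(wealth)


def representative_agent_consumption(wealths: list[float]) -> float:
    """Collapse the population into one agent holding the mean wealth."""
    mean_wealth = sum(wealths) / len(wealths)
    return consumption(mean_wealth)


def heterogeneous_agent_consumption(wealths: list[float]) -> float:
    """Apply the rule to every agent, then average the results."""
    return sum(consumption(w) for w in wealths) / len(wealths)


# Assumed wealth distribution: lognormal, drawn with a fixed seed.
rng = random.Random(0)
wealths = [rng.lognormvariate(0.0, 1.0) for _ in range(10_000)]

rep = representative_agent_consumption(wealths)
het = heterogeneous_agent_consumption(wealths)

print(f"representative agent: {rep:.3f}")
print(f"heterogeneous agents: {het:.3f}")
```

Because the assumed rule is concave, the representative agent overstates average consumption whenever wealth is unequal (this is Jensen's inequality), and the two numbers coincide only when every agent holds the same wealth. Real HAM models are far richer than this, but the gap between the two figures is the kind of information a representative agent throws away.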

It was my extraordinary good fortune to find an entry into the world of HAM via the Econ-ARK software project (Carroll et al., 2018; Benthall and Seth, 2020), which needed a software engineer enthusiastic about open source scientific tools at a moment when I was searching for a job. Econ-ARK’s HAM toolkit, HARK, has come a long way since I joined the project in late 2019. And it still has quite a ways to go. But it’s been a tremendously rewarding project to be involved with, in no small part because it has been a hands-on introduction to the nitty-gritty of contemporary Economics methods.

It’s these tools which I will be extending with insights from my other work, which is grounded more in computer science and legal scholarship, in order to model the data economy. Naturally, the economy for personal data depends on the heterogeneity of consumers: it is the differences between consumers that make the trade in personal information possible and relevant. And while there are many notational and conventional differences between the orthodox Economics methods and the causal Bayesian frameworks that I’ve worked in before, these methods in fact share a logical core that makes them commensurable.

I’ve mentioned both orthodox and heterodox economics. By this I mean to draw a distinction between the core of the Economics discipline, which in my understanding is still tied to rational expectations and general equilibria (meaning the idea that agents know what to expect from the market and act accordingly), and heterodox views that find these assumptions to be dangerously unrealistic. This is truly a sore spot for Economics. As the trenchant critiques of Mirowski and Nik-Khah (2017) reveal, these core assumptions commit Economists to many absurd conclusions; however, they are loath to abandon them lest they lose the tight form of rigor which they have demanded in order to maintain a kind of standardization within the discipline. Rational expectations aligns economics with engineering disciplines, like control theory and artificial intelligence, which makes its methods more in demand. Equilibrium theories give Economics a normative force and excuses when its predictions do not pan out. However, the 2008 Financial Crisis embarrassed these methods, and now the emerging HAM techniques include not only a broadened form of rational agent modeling, but also a much looser paradigm of Agent-Based Modeling (ABM) that allows for more realistic dynamics with boundedly rational agents (Bookstaber, 2017).

Today, the biggest forces in the economy are precisely those that have marshaled information to their advantage in a world with heterogeneous agents (Benthall and Goldenfein, 2021). Economic agents differ both horizontally — like consumers of different demographic categories such as race and sex — and vertically — as consumers and producers of information services have different relationships to personal data. As I explore in forthcoming work with Salome Viljoen (2021), the monetization of personal data has always been tied to the financial system, first via credit reporting, and later through the financialization of consumer behavior through digital advertising networks. And yet the macroeconomic impact of the industries that profit from these information flows, which now account for the largest global companies, is not understood because of disciplinary blinders that Economics has had for decades and is only now trying to shed.

I’m convinced the research is well motivated. The objection, which comes from my most well-meaning mentors, is that the work is too difficult or in fact impossible. Introducing heterogeneously bounded rationality into economic modeling creates a great deal of modeling and computational complexity. Calibrating, simulating, and testing such models is expensive, and progress requires a great deal of technical thinking about how to compute results efficiently. There are also many social and disciplinary obstacles to this kind of work: for the reasons discussed above, it’s not clear where this work belongs.

However, I consider myself immensely fortunate to have a real, substantive, difficult problem to work on, and enough confidence from the National Science Foundation that they support my trying to solve it. It’s an opportunity of a lifetime and, to be honest, as a researcher who has often felt at the fringes of a viable scholarly career, a real break. The next steps are exciting and I can’t wait to see what’s around the corner.

References

Benthall, S., & Goldenfein, J. (2021, May). Artificial Intelligence and the Purpose of Social Systems. In Proceedings of the 2021 AAAI/ACM Conference on AI Ethics and Society (AIES’21).

Benthall, S., & Seth, M. (2020). Software Engineering as Research Method: Aligning Roles in Econ-ARK.

Benthall, S., & Viljoen, S. (2021). Data Market Discipline: From Financial Regulation to Data Governance. J. Int’l & Comparative Law. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3774418

Bookstaber, R. (2017). The end of theory. Princeton University Press.

Carroll, C. D., Kaufman, A. M., Kazil, J. L., Palmer, N. M., & White, M. N. (2018). The Econ-ARK and HARK: Open Source Tools for Computational Economics. In Proceedings of the 17th Python in Science Conference (pp. 25-30).

Mirowski, P., & Nik-Khah, E. (2017). The knowledge we have lost in information: the history of information in modern economics. Oxford University Press.

Digifesto