The relationship between scientific research, higher education, and open source software has evolved considerably over the last several years. Today, it’s fair to say that most industrially relevant “data science” practice depends on open source software that was originally built for scientific research. This has in turn legitimized that software; universities now teach open source data science libraries in their undergraduate curricula. In computer science and, by very loose extension, other hard sciences, releasing a high-quality software tool is a recognizable academic contribution. We’ve come a long way.
The social sciences have perhaps been slower to take the software turn for many notable reasons. One major reason for this is the broad and disparate nature of the social sciences. A related reason is the disciplinary incompatibility of many social sciences with computational modeling. Abutting this academic resistance to software-based social research, however, is the wide adoption of industrial methods for managing and learning from social data. Arguably, the main industrial drivers of data science have always been social science applications, albeit those within a narrow range. Human-Computer Interaction, Computer Supported Cooperative Work, Management Science, Operations Research, and other business-applicable fields have flourished in recent years in ways that traditional “social sciences” such as Sociology, Anthropology, and History have not.
Enter Economics, widely regarded as the hardest (most quantitative) of the social sciences. If there were ever a social scientific field that could make the transition to an effective software stack, it would be Econ. In addition to what is in principle a methodological resonance, there is also the plausible link between efficient research tools and industrial applications.
Indeed, the beginnings of an open source economics field are underway. There’s an Open Source Economics Lab at the University of Chicago. There’s a NumFOCUS-sponsored non-profit, QuantEcon, supporting basic economics tools and associated with Nobel Prize winner Thomas Sargent. There’s Econ-Ark, a different economics toolkit funded by the Sloan Foundation. There’s the Dolo project, and so on.
In this loose taxonomy of scientific software maturity, developed at an NSF-funded workshop on Scientific Software Incubators, these projects range from Stage 1 (developed by a single software team for internal use), through Stage 2 (developed by multiple software teams for internal use), to Stage 3 (a self-governing community deliberately supporting a broader community).
These are, it must be said, so far small efforts in the field of economics. One explanation for “Why?” comes from the Charter of the nascent Journal for Open Source Economics (JOSEcon). The motivations described in that charter, summarized here, make a compelling argument for a high-impact journal that requires sound software engineering behind the computational tools in its submissions.
- There are computational and numerical methods in economics research with many benefits:
    - More expressive than purely analytically tractable models
    - Ability to support parameter estimation/model fitting
- Software development practice among economics researchers is currently weak
    - Mainly informal code transfer with little effective code reuse
    - Publication standards are not guaranteeing reproducibility
    - Lots of reinventing the wheel
    - Potential of a replicability crisis
- The solution is a change in incentive structure
    - JOSEcon aims to be a high prestige journal that requires better software practices for submissions
    - A submission includes:
        - A well-documented software package
        - Short script or notebook demonstrating functionality
        - A couple pages of prose of applicability
    - Could be new research, or a replication of existing research
    - Submissions are citable for academic credit towards e.g. tenure
At the moment, there seems to be a bit of a chicken-and-egg problem. Software engineering skills are in short supply among economists. So it’s unlikely that a journal that requires sound software practices behind its submissions will quickly become prominent in the field. On the other hand, it’s possible that the infrastructure for general-purpose scientific publishing will accommodate computational research and it will be left to economists to take advantage of it after the way has been prepared ahead of them.
Current proposals may lack conceptual clarity about software engineering and its precise relationship with academic publication. The incentives and needs of the two fields are subtly different in ways besides how academic research values citations. The library and dependency structure of software depends critically on functional modularity. Arguably, research publications are organized around a more narrative structure. The logic of presentation of a research publication is rarely going to fit the most efficient architecture of computational modeling.
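To make the contrast concrete, here is a minimal sketch of what "functional modularity" could look like for a toy economic model. It is an illustration only, not how QuantEcon, Econ-Ark, or Dolo are actually organized; every name in it (`crra_utility`, `golden_section_max`, `two_period_savings`) is hypothetical, and the two-period savings problem is an assumed example chosen because it has a known closed-form answer. The preference specification, the numerical optimizer, and the model itself are kept as separate units so that each could in principle be reused elsewhere.

```python
import math
from functools import partial
from typing import Callable


def crra_utility(c: float, rho: float = 2.0) -> float:
    """Preference unit: CRRA utility, with log utility when rho == 1."""
    if c <= 0:
        return float("-inf")  # non-positive consumption is never optimal
    if rho == 1.0:
        return math.log(c)
    return c ** (1.0 - rho) / (1.0 - rho)


def golden_section_max(f: Callable[[float], float],
                       lo: float, hi: float, tol: float = 1e-7) -> float:
    """Numerical unit: generic 1-D maximizer for a unimodal function.

    It knows nothing about economics, so any model can depend on it.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)  # lower interior probe
        d = a + inv_phi * (b - a)  # upper interior probe
        if f(c) > f(d):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0


def two_period_savings(wealth: float, beta: float, R: float,
                       utility: Callable[[float], float]) -> float:
    """Model unit: choose savings s to maximize u(wealth - s) + beta * u(R * s)."""
    def objective(s: float) -> float:
        return utility(wealth - s) + beta * utility(R * s)

    return golden_section_max(objective, 0.0, wealth)


if __name__ == "__main__":
    # Swap preferences without touching the model or the solver.
    log_utility = partial(crra_utility, rho=1.0)
    print(two_period_savings(1.0, 0.95, 1.03, log_utility))
    print(two_period_savings(1.0, 0.95, 1.03, crra_utility))
```

Under log utility the optimal savings are `beta * wealth / (1 + beta)`, which gives a simple check on the numerical answer. A paper would most likely present all three pieces together as one narrative, which is exactly the tension between presentation and architecture described above.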
All this points to a fascinating intellectual problem at the core of the effort: what is the right architecture for computational economics software tools? Is an economic model a functional unit of logic? Or is it a narrative for presentation? Can the logical units be efficiently decomposed and reused?