The ontology of software, revisited
by Sebastian Benthall
I’m now a software engineer again after many years doing and studying other things. My first-person experience, my phenomenological relationship with this practice, is different this time around. I’ve been meaning to jot down some notes based on that fresh experience. Happily, there’s resonance with topics of my academic focus as well. I’m trying to tease out these connections.
To briefly recap: There’s a recurring academic discourse around technology ethics. Roughly speaking, it starts with a concern about a newish technology that has media or funding agency interest. Articles then get written capitalizing on this hot topic; these articles are fractured according to the disciplinary background of their authors.
- Engineers try to come up with an improved version of the technology.
- Lawyers try to come up with ways to regulate the production and use of the technology, broadly speaking.
- Organizational sociologists come up with institutional practices (‘ethics boards’, ‘contestability’) which would prevent the technology from being misused.
- Critical theorists argue that the technology would be less worrisome if representational desiderata within the field of technology production were better satisfied.
- … and so on.
This is a very active and interesting discourse, but from my (limited) perspective, it rarely impacts industry practice. This isn’t because people in industry don’t care about the ethical implications of their work. It’s because people in industry are engaged full-time in a different discourse. This is the discourse of industry practitioners.
My industrial background is in software development and data science. Obviously there are other kinds of industrial work: hardware, biotech, etc. But it’s fair to say that a great deal of the production of “technology” in the 21st century is, specifically, software development. And my point here is that software development has its own field of discourse that is rich and vivid and a full-time job to keep up with. Here are some examples of what I’m getting at:
- There is always-already a huge world of communication between engineers about what technologies are interesting, how to use them effectively, how they compare with prior technologies, the implications of these trends for technical careers, and so on. Browse Hacker News. Look at industry software conferences.
- There’s also a huge world of industrial discussion about the social practices of software development. A lot of my knowledge of this is a bit dated. But as I come back to industry, I find myself looking back to now-classic sources on how-to-work-effectively-on-software. I’m linking to articles from Joel Spolsky’s blog. I’m ordering a copy of Fred Brooks’s classic The Mythical Man-Month.
- I’m reading documentation, endlessly, about how to configure and use the various SaaS, IaaS, PaaS, etc. tools that are now necessary parts of full-stack development. When the documentation is limited, I’m engaging with customer service people of technical products, who have their own advice, practices, etc.
This is a complex world of literature and practice. Part of what makes it complex is that it is always-already densely documented and self-referential, enacted by smart and literate people, most of whom are quite socially skilled. It’s people working full-time jobs in a field that is now over 40 years old.
I’ve argued in other posts that if we want to solve the ‘technology ethics’ problem, we should see it as an economic problem. At a high level, I still believe that’s true. I want to qualify that point, though, and say: now that I’m back in a more engaged position with respect to the field of technical production, I believe there are institutional/organizational ways to address broader social concerns through interventions on engineering practice.
What is missing, in my view, is a sincere engagement with the nitty-gritty of engineering practice itself. I know there are anthropologists who think they do this. I haven’t read anybody whose writing really does, and I believe the reason is that anthropologists writing for other academic anthropologists are not going to produce what would actually be useful here: a guide for product and project management that would likely recapitulate a lot of conventional (but too often ignored) wisdom about software engineering “best practices”, such as documentation, testing, and the articulation of use cases. These are the kinds of things that improve technical quality in a real way.
Now that I write this, I recall that the big ethics research teams at, say, Google, do stuff like this. It’s great.
I was going to say something about the ontology of software.
Recall: I have a position on the ontology of data, which I’ve called Situated Information Flow Theory (SIFT). I worked hard on it. According to SIFT, an information flow is a causal flow situated in a network of other causal relations. The meaning of the information depends on that causally defined situation.
What then is software?
“Software” refers to sets of instructions written by people in a specialized “programming” language as text data, which is then interpreted or compiled by a machine. In paradigmatic industrial practice (I’m simplifying, bear with me), these instructions will ultimately be used to control the behavior of a machine that interfaces with the world in a real-time, consequential way. This latter machine is referred to, internally, as being “in production”.
When you’re programming a technical product, first you write software “in development”. You are writing drafts of code. You get your colleagues to review it. You link up the code you wrote to the code the other team wrote and you see if it works together. There is a long and laborious process of building tests for new requirements and fixing the code so that it meets those requirements. There are designs, and redesigns, of internal- and external-facing features. The complexity of the total task is divided up into modules; the boundaries of those modules shift over time. The social structure of the team adapts as new modules become necessary.
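To make the “tests for new requirements” step concrete, here is a minimal sketch in Python. The discount feature, the function name, and the requirement itself are all hypothetical, invented for illustration; real industrial test suites are far larger, but the ritual is the same: the requirement gets written down as a test, and the code is fixed until the test passes.

```python
# A hypothetical new requirement: "support a 10% welcome discount."
# The test encodes the requirement; development continues until it passes.
import unittest


def apply_discount(total: float, code: str) -> float:
    """Apply a discount code to an order total (hypothetical feature)."""
    if code == "WELCOME10":
        return round(total * 0.90, 2)
    return total


class TestWelcomeDiscount(unittest.TestCase):
    def test_welcome_code_takes_ten_percent_off(self):
        self.assertEqual(apply_discount(100.00, "WELCOME10"), 90.00)

    def test_unknown_code_changes_nothing(self):
        self.assertEqual(apply_discount(100.00, "BOGUS"), 100.00)


if __name__ == "__main__":
    unittest.main()
```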
There is an isomorphism, a well-documented phenomenon in organizational social theory (Conway’s Law, sometimes called the “mirroring hypothesis”), between the technology being created and the social structure that creates it. The team structure mirrors the software architecture.
When the pieces are in place adequately enough, and when investors and management have grown impatient enough, the software is finally “deployed to production”. It “goes live”. What was an internal exercise is now a process with reputational consequences for the business, as well as possibly real consequences for the users of the technology.
Inevitably, the version of the product “in production” is not complete. There are errors. There are new features requested. So the technology firm now organizes itself around several “cycles” running at different frequencies in parallel. There’s a “development cycle” of writing new software code. There’s a “release cycle” of packaging new improvements into bundles that are documented and tested for quality. The releases are deployed to production on a schedule. Different components may have different development and release cycles. The impedance match or mismatch between these cycles becomes its own source of robustness or risk. (I’ve done some empirical research work on this.)
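A toy simulation can make the cycle-frequency point vivid. Everything below is invented for illustration, the component names and periods especially: two components accumulate changes at the same rate, but because one releases weekly and the other monthly, the monthly component ships larger, riskier bundles. That is the impedance mismatch in miniature.

```python
# Toy model of parallel development and release cycles (all numbers hypothetical).
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    release_period_days: int  # how often a bundle ships to production
    unreleased: int = 0       # changes waiting for the next release
    releases: list = field(default_factory=list)


def simulate(components, days):
    for day in range(1, days + 1):
        for c in components:
            c.unreleased += 1  # the development cycle: one change lands per day
            if day % c.release_period_days == 0:
                c.releases.append((day, c.unreleased))  # the release cycle: ship the bundle
                c.unreleased = 0


frontend = Component("frontend", release_period_days=7)
backend = Component("backend", release_period_days=28)
simulate([frontend, backend], days=56)

print(frontend.releases)  # eight small releases of 7 changes each
print(backend.releases)   # two large releases of 28 changes each
```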
What does this mean for the ontology of software?
The first thing it means is that the notion of software as a static artifact, something like either a physical object (like a bicycle) or a publication (like a book), is mostly irrelevant to what’s happening. The software production process depends on the fluidity of source code. When software is deployed “as a service”, it is dubious that it qualifies as a “creative work” subject to copyright law, except by virtue of legal inertia. Something totally different is going on.
The second thing it means is that the live technical product is an ongoing institutional accomplishment. It’s absurd to ever say that humans are not “in the loop”. This is one of the big insights of the critical/anthro reaction to “Big Tech” in the past five years or so. But it has also been common knowledge within the industry for fifteen years or so.
The third thing it means is that software is the structuring of a system of causal relations. Software, when it’s deployed, determines what causes what. See above for the definition of the nature of information: it’s a causal flow situated in other causal relations. The link between software and information is then quite clear and direct. Software (as far as it goes) is a definition of a causal situation.
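As a toy rendering of this claim (my own illustration, not a formal statement of SIFT), a deployed program can be read as a declaration of which events are permitted to cause which others; an information flow is then a path situated in that causal graph. The event names here are hypothetical.

```python
# A toy causal graph "declared" by a piece of software: edges say what can cause what.
causes = {
    "user_click": ["api_request"],
    "api_request": ["db_write", "audit_log"],
    "db_write": ["email_notification"],
}


def downstream(event, graph):
    """Everything an event can causally influence, i.e. where information from it can flow."""
    frontier, reached = [event], set()
    while frontier:
        for nxt in graph.get(frontier.pop(), []):
            if nxt not in reached:
                reached.add(nxt)
                frontier.append(nxt)
    return reached


print(downstream("user_click", causes))
# {'api_request', 'db_write', 'audit_log', 'email_notification'}
```

Changing the software changes the graph; that is the sense in which software structures a system of causal relations.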
The fourth thing it means is that software products are the result of agreement between people. Software only makes it into production if it has gotten there through processes agreed upon by the team that deploys it. The strength of software is in the collective input that went into it. In a sense, software is much more like a contract, in legal terms, than it is like a creative work. In the extended network of human and machine actors, software is the result of, the expression of, self-regulation first. Only secondarily does it, in Lessig’s terms, become a regulatory force more broadly.
What is software? Software is a form of social structure.