reflective data science: technical, practical and emancipatory interests?

by Sebastian Benthall

As Cathryn Carson is currently my boss at Berkeley’s D-Lab, it seems like it behooves me to read her papers. Thankfully, we share an interest in Habermasian epistemology. Today I read her “Science as instrumental reason: Heidegger, Habermas, Heisenberg.”

Though I can barely do justice to the paper, I’ll try to summarize: it grapples with the history of how science became constructed as purely instrumental project (a mode of inquiry that perfects means without specifying particular ends) through the interactions between Heisenberg, the premier theoretical physicist of Germany at his time, and Heidegger, the great philosopher, and then later the response to Heidegger by Habermas.

Heisenberg, most famous perhaps for the Heisenberg Uncertainty Principle, was himself reflective on the role of the scientist within science, and identified the limits of the subject and measurement within physics. But far from surpassing an older metaphysical idea of the subject-object divide, this only entrenched the scientist further, according to Heidegger. This is because scientist qua scientist never encounters the world in a way that is not tied up in the scientific, technical mode and so elludes pure being. While that may simply mean that pure being is left for philosophers and scientists are allowed to go on with their instrumental project, this mode of inquiry becomes insufficient when scientists were called on to comment on nuclear proliferation policy.

Such policy decisions are questions of praxis, or practical action in the human (as opposed to natural) world. Habermas was concerned with the hermeneutic epistemology of praxis, as well as the critical epistemology of emancipation, which are more the purview of the social sciences. Habermas tends to segment these modes of inquiry from each other, without (as far as I’ve encountered so far) anticipating a synthesis.

In data science, we see the broadly positivist, statistical, analytic treatment of social data. In its commercial applications to sell ads or conduct high-speed trading, we could say on a first pass that the science serves the technical human interest: prediction and control for some unspecified end. But that would be misleading. The breadth of methodological options available to the data scientist mean the the methods are often very closely tailored to the particular ends and conditions of the project. Data science as a method is an instrument. But the results of commercial data science are by and large not nomological (identifying laws of human behavior), but rather an immediately applied idiography. Or, more than an applied idiography, data science provides a probabilistic profile of its diverse subjects–an electron cloud of possibilities that the commercial data scientist uses to steer behavior en masse.

Of course, the uncertainty principle applies here as well: the human subject reacts to being measured, has the potential to change direction when they see that they are being targetted with this ad or that.

Further complicating the picture is that the application of ‘social technology’ of commercially driven data science is praxis, albeit in an apolitical sense. Enmeshed in a thick and complex technological web, nevertheless showing an ad and having it be clicked on is a move in the game of social relations. It is a handshake between cyborgs. And so even commercial data science must engage in hermeneutics, if Habermas is correct. Natural language processing provides the uncomfortable edge case here: can we have a technology that accomplishes hermeneutics for us? Apparently so, if a machine can identify somebody’s interest in a product or service from their linguistic output.

Though jarring, this is easier to cope with intellectually if we see the hermeneutic agent as a socio-technical system, as opposed to a purely technical system. Cyborg praxis will includes statistical/technical systems made of wires and silicon, just as meatier praxis includes statistical/technical systems made of proteins and cartilege.

But what of emancipation? This is the least likely human interest to be advanced by commercial interests. If I’ve got my bearings right, the emancipatory interest in the (social) sciences comes from the critical theory tradition, perhaps best exemplified in German thought by the Frankfurt School. One is meant to be emancipated by such inquiry from the power of the capitalist state. What would it mean for there to be an emancipatory data science?

I was recently asked out of the blue in an email whether there were any organizations using machine learning and predictive analytics towards social justice interests. I was ashamed to say I didn’t know of any organizations doing that kind of work. It is hard to imagine what an emancipatory data science would look like. An education or communication about data scientific techniques might be emancipatory (I was trying to accomplish something like this with Why Weird Twitter, for what its worth), but that was a qualitative study, not a data scientific one.

Taking our cue from above, an emancipatory data science would have to use data science methods towards the human interest of emancipation. For this, we would need to use the methods to understand the conditions of power and dependency that bind us. Difficult as an individual, it’s possible that these techniques could be used to greater effect by an emancipatory sociotechnical organization. Such an organization would need to be concerned with its own autonomy as well as the autonomy of others.

The closest thing I can imagine to such a sociotechnical system is what Kelty describes as the recursive public: the loose coallition of open source developers, open access researchers, and others concerned with transforming their social, economic, and technical conditions for emancipatory ends. Happily, the D-Lab’s technical infrastructure team appears to be populated entirely by citizens of the recursive public. Though this is naturally a matter of minor controversy within the lab (its hard to convince folks who haven’t directly experienced the emancipatory potential of the movement of its value), I’m glad that it stands on more or less robust historical grounds. While the course I am co-teaching on Open Collaboration and Peer Production will likely not get into critical theory, I expect that the exposure to more emancipated communities of praxis will make something click.

What I’m going for, personally, is a synthetic science that is at once technical and engaged in emancipatory praxis.