thinking about computational social science

by Sebastian Benthall

I’m facing a challenging paradox in how to approach my research.

On the one hand, we have the trend of increasing instrumentation of society. From quantified self to the Internet of things to Netflix clicks to the fully digitized archives of every newspaper, we have more data than we’ve ever had before to ask fundamental social scientific questions.

That should make it easier to research society and infer principles about how it works. But there is a long-standing counterpoint in the social sciences that claims that all social phenomena are sui generis and historically situated. If no social phenomenon generalizes, then it shouldn’t be possible to infer anything from the available data, no matter how much of it there is.

One view is that we should only be able to infer stuff that isn’t very interesting at all. One name for this view is “punctuated equilibrium.” The national borders of countries don’t move around…until they do. Regimes don’t change…until they do. It’s the ability to predict these kinds of political events that Philip Tetlock has called “expert political judgment.” The Good Judgment Project is a test to see what properties make a person or team of people good at this kind of task.

What now seems like many years ago I wrote a book review of Tetlock’s book. In that review, I pointed out a facet of Tetlock’s research I found most compelling but underdeveloped: that the best predictors he found were algorithmic predictors that drew their conclusions from linear regressions drawn from just the top three or so salient features in the data.

Six or so years later, Big Data is a powerful enough industrial and political phenomenon that academic social science feels it needs to catch up. But to a large extent industrial data science is still about using pretty basic statistical models drawn from physics (that assume that everything stands in Gaussian relations to everything else, say), or otherwise applying a broad range of modeling techniques and aggregating them under statistical boosting. This is great for edging out the competition on selling ads.

But it tells us nothing about the underlying structure of what’s going on in society. And it’s possible that the fact that we haven’t done any better is really a condemnation of the whole process of social science in general. The data we are getting, rather than making us understand what’s going on around us better, is perhaps just proving to us that it’s a complex chaotic system. If so, the better we understand it, the more we will lose our confidence in our ability to predict it.

Historically, we’ve been through all this before. The mid-20th century saw the expansion of the scope of Norbert Wiener’s cybernetics from electrical engineering of homeostatic machines to modeling of the political system and the economy as complex feedback systems. Indeed, cybernetics was intended as a theory of steering systems by thinking about their communications mechanisms. (Wikipedia: “The word “cybernetics” comes from the Greek word κυβερνητική (kyverni̱tikí̱, “government”), i.e. all that are pertinent to κυβερνώ (kyvernó̱), the latter meaning to “steer,” “navigate” or “govern,” hence κυβέρνησις (kyvérni̱sis, “government”) is the government while κυβερνήτης (kyverní̱ti̱s) is the governor or the captain.”) These models were on some level interesting and intuitive, even beautiful in their ambition. But they failed in their applications because social systems did not obey the kind of regularity that systems engineered for reliable equilibria did.

The difficulty with applying these theories that acknowledge the complexity of the social system to reality is that they are only explanatory in retrospect because of the path dependence of history. That’s pretty close to rendering them pseudoscientific.

Nevertheless, there are countless pressing societal challenges–climate change, unfair crime laws, war, political crisis, public health policy–on which social scientific research must be brought to bear, because there is a dimension to them which is a problem of predicting social action.

It is possible (I wonder if it’s necessary) that there are laws–perhaps just local laws–of social activity. Most people certainly believe there are. Business strategy, for example, depends on so much theorizing about the market and the relationships between different companies and their products. If these laws exist, they must be operationalizable and discoverable in the data itself.

But there is the problem of the researcher’s effect on the system being observed and, even more confounding, the effect of the researcher’s discovery on the system itself. When a social system becomes self-aware through a particular theoretical lens, it can change its behavior. (I’ve heard that Milton Friedman’s monetarist economics were fantastically predictive of economic growth in the United States right up until he published them.)

If reflexivity contributes to social entropy, then it’s not clear what the point of any social research agenda is.

The one exception I can think of is if an empirical principle of social organization is robust under social reflection. The goal would be to define an equilibrium state worth striving for, so that the society in question can accept it harmoniously as a norm.

This looks like relevant prior work–a lucky Google hit.
