Digifesto

Tag: data processing inequality

social structure and the private sector

The Human Cell

Academic social scientists leaning towards the public intellectual end of the spectrum love to talk about social norms.

This is perhaps motivated by the fact that these intellectual figures are prominent in the public sphere. The public sphere is where these norms are supposed to solidify, and these intellectuals would like to emphasize their own importance.

I don’t exclude myself from this category of persons. A lot of my work has been about social norms and technology design (Benthall, 2014; Benthall, Gürses and Nissenbaum, 2017).

But I also work in the private sector, and it’s striking how differently things look from that perspective. It’s natural for academics who participate more in the public sphere than the private sector to be biased in their view of social structure. From the perspective of being able to accurately understand what’s going on, you have to think about both at once.

That’s challenging for a lot of reasons, one of which is that the private sector is a lot less transparent than the public sphere. In general the internals of actors in the private sector are not open to the scrutiny of commentariat onlookers. Information is one of the many resources traded in pairwise interactions; when it is divulged, it is divulged strategically, introducing bias. So it’s hard to get a general picture of the private sector, even though it accounts for a much larger proportion of the social structure than the public sphere does. In other words, public spheres are highly over-represented in analysis of social structure due to the availability of public data about them. That is worrisome from an analytic perspective.

It’s well worth making the point that the public/private dichotomy is problematic. Contextual integrity theory (Nissenbaum, 2009) argues that modern society is differentiated among many distinct spheres, each bound by its own social norms. Nissenbaum actually has a quite different notion of norm formation from, say, Habermas. For Nissenbaum, norms evolve over social history, but may be implicit. Contrast this with Habermas’s view that norms are the result of communicative rationality, which is an explicit and linguistically mediated process. The public sphere is a big deal for Habermas. Nissenbaum, a scholar of privacy, rejects the idea of the ‘public sphere’ simpliciter. Rather, social spheres self-regulate, and privacy, which she defines as appropriate information flow, is maintained when information flows according to these multiple self-regulatory regimes.

I believe Nissenbaum is correct on this point of societal differentiation and norm formation. This nuanced understanding of privacy as the differentiated management of information flow challenges any simplistic notion of the public sphere. Does it challenge a simplistic notion of the private sector?

Naturally, the private sector doesn’t exist in a vacuum. In the modern economy, companies are accountable to the law, especially contract law. They have to pay their taxes. They have to deal with public relations and are regulated as to how they manage information flows internally. Employees can sue their employers, etc. So just as the ‘public sphere’ doesn’t permit a total free-for-all of information flow (some kinds of information flow in public are against social norms!), so too does the ‘private sector’ not involve complete secrecy from the public.

As a hypothesis, we can posit that what makes the private sector different is that the relevant social structures are less open in their relations with each other than they are in the public sphere. We can imagine an autonomous social entity like a biological cell. Internally it may have a lot of interesting structure and organelles. Its membrane prevents this complexity leaking out into the aether, or plasma, or whatever it is that human cells float around in. Indeed, this membrane is necessary for the proper functioning of the organelles, which in turn allows the cell to interact properly with other cells to form a larger organism. Echoes of Francisco Varela.

It’s interesting that this may actually be a quantifiable difference. One way of modeling the difference between the internal and external-facing complexity of an entity is using information theory. The more complex internal state of the entity has higher entropy than the membrane. The fact that the membrane causally mediates interactions between the internals and the environment limits information flow between them; this is captured by the Data Processing Inequality. The restricted information flow between the system internals and externals is quantified as lower mutual information between the two domains. At zero mutual information, the two domains are statistically independent of each other.
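As a rough sketch of how that could be written down: let $S$ be the entity’s internal state, $M$ the state of its membrane, and $E$ the environment. These symbols are labels I’m introducing for illustration, and the key assumption is that the environment depends on the internals only through the membrane, so that the three form a Markov chain. Under that assumption, the Data Processing Inequality together with the fact that mutual information can never exceed the entropy of either variable gives:

```latex
S \rightarrow M \rightarrow E
\quad\Longrightarrow\quad
I(S,E) \;\leq\; I(S,M) \;\leq\; H(M)
```

So if the membrane has much lower entropy than the internals, most of the internal complexity cannot be visible from outside, whatever the environment does with what it observes.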

I haven’t worked out all the implications of this.

References

Benthall, Sebastian. (2015) Designing Networked Publics for Communicative Action. Jenny Davis & Nathan Jurgenson (eds.) Theorizing the Web 2014 [Special Issue]. Interface 1.1.

Sebastian Benthall, Seda Gürses and Helen Nissenbaum (2017), “Contextual Integrity through the Lens of Computer Science”, Foundations and Trends® in Privacy and Security: Vol. 2: No. 1, pp 1-69. http://dx.doi.org/10.1561/3300000016

Nissenbaum, H. (2009). Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press.


The Data Processing Inequality and bounded rationality

I have long harbored the hunch that information theory, in the classic Shannon sense, and social theory are deeply linked. It has proven to be very difficult to find an audience for this point of view or an opportunity to work on it seriously. Shannon’s information theory is widely respected in engineering disciplines; many social theorists who are unfamiliar with it are loath to admit that something from engineering should carry essential insights for their own field. Meanwhile, engineers are rarely interested in modeling social systems.

I’ve recently discovered an opportunity to work on this problem through my dissertation work, which is about privacy engineering. Privacy is a subtle social concept but also one that has been rigorously formalized. I’m working on formal privacy theory now and have been reminded of a theorem from information theory: the Data Processing Inequality. What strikes me about this theorem is that it captures a point that comes up again and again in social and political problems, though it’s a point that’s almost never addressed head on.

The Data Processing Inequality (DPI) states that for three random variables $X$, $Y$, and $Z$ arranged in a Markov chain such that $X \rightarrow Y \rightarrow Z$, we have $I(X,Z) \leq I(X,Y)$, where $I$ stands for mutual information. Mutual information is a measure of how much two random variables carry information about each other. If $I(X,Y) = 0$, that means the variables are independent. $I(X,Y) \geq 0$ always; that’s just a mathematical fact about how it’s defined.
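For readers who would rather see the inequality than take it on faith, here is a minimal numerical sketch. It builds a small discrete Markov chain from randomly generated conditional distributions and computes both mutual informations directly from the joint probability tables. The alphabet sizes, the random seed, and the helper names `mutual_information` and `random_channel` are all arbitrary choices made for illustration.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information in bits, computed from a 2-D joint probability table."""
    joint = np.asarray(joint, dtype=float)
    p_row = joint.sum(axis=1, keepdims=True)   # marginal of the row variable
    p_col = joint.sum(axis=0, keepdims=True)   # marginal of the column variable
    mask = joint > 0                           # skip zero cells (0 * log 0 = 0)
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (p_row @ p_col)[mask])))

def random_channel(rng, n_in, n_out):
    """Random row-stochastic matrix: row i is p(output | input = i)."""
    m = rng.random((n_in, n_out))
    return m / m.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
p_x = rng.dirichlet(np.ones(4))          # arbitrary distribution over 4 values of X
p_y_given_x = random_channel(rng, 4, 3)  # X -> Y
p_z_given_y = random_channel(rng, 3, 5)  # Y -> Z (Z sees X only through Y)

joint_xy = p_x[:, None] * p_y_given_x    # p(x, y)
joint_xz = joint_xy @ p_z_given_y        # p(x, z) = sum_y p(x, y) p(z | y)

i_xy = mutual_information(joint_xy)
i_xz = mutual_information(joint_xz)
print(f"I(X,Y) = {i_xy:.4f} bits, I(X,Z) = {i_xz:.4f} bits")
```

Changing the seed or the alphabet sizes changes the numbers, but $I(X,Z)$ should never come out larger than $I(X,Y)$, because $Z$ is computed from $Y$ alone.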

The implications of this for psychology, social theory, and artificial intelligence are I think rather profound. It provides a way of thinking about bounded rationality in a simple and generalizable way–something I’ve been struggling to figure out for a long time.

Suppose that there’s a big world out there, $W$, and there’s an organism, or a person, or a sociotechnical organization within it, $Y$. The world is big and complex, which implies that it has a lot of informational entropy, $H(W)$. Through whatever sensory apparatus is available to $Y$, it acquires some kind of internal sensory state. Because this organism is much smaller than the world, its entropy is much lower. There are many fewer possible states that the organism can be in, relative to the number of states of the world: $H(W) \gg H(Y)$. This in turn bounds the mutual information between the organism and the world: $I(W,Y) \leq H(Y)$.

Now let’s suppose the actions that the organism takes, $Z$, depend only on its internal state. It is an agent, reacting to its environment. Well, whatever these actions are, they can only be as calibrated to the world as the agent had capacity to absorb the world’s information. That is, $W \rightarrow Y \rightarrow Z$ is a Markov chain, so by the DPI, $I(W,Z) \leq I(W,Y) \leq H(Y) \ll H(W)$. The implication is that the more limited the mental capacity of the organism, the more its actions will be approximately independent of the state of the world that precedes it.
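Here is a toy version of that argument, offered as a sketch under heavy assumptions rather than a model of any real organism. The world $W$ is uniform over 256 states (8 bits), the agent perceives it by coarse-graining into a small number of internal states $Y$, and its actions $Z$ come from a randomly generated policy that looks only at $Y$. The function name `agent_bound`, the sizes, and the policy itself are all hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

def mutual_information(joint):
    """Mutual information in bits, computed from a 2-D joint probability table."""
    joint = np.asarray(joint, dtype=float)
    p_row = joint.sum(axis=1, keepdims=True)
    p_col = joint.sum(axis=0, keepdims=True)
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log2(joint[mask] / (p_row @ p_col)[mask])))

N_WORLD = 256                               # assumed world size: H(W) = 8 bits
p_w = np.full(N_WORLD, 1.0 / N_WORLD)

def agent_bound(n_internal, n_actions=16, seed=0):
    """Return (H(Y), I(W,Z)) for an agent with n_internal internal states."""
    rng = np.random.default_rng(seed)
    # Perception: deterministic coarse-graining of W into n_internal bins.
    y_of_w = np.arange(N_WORLD) * n_internal // N_WORLD
    p_y_given_w = np.eye(n_internal)[y_of_w]        # shape (N_WORLD, n_internal)
    # Policy: an arbitrary stochastic map from internal state to action.
    policy = rng.random((n_internal, n_actions))
    policy /= policy.sum(axis=1, keepdims=True)
    joint_wy = p_w[:, None] * p_y_given_w           # p(w, y)
    joint_wz = joint_wy @ policy                    # p(w, z), via Y only
    return entropy(joint_wy.sum(axis=0)), mutual_information(joint_wz)

results = {k: agent_bound(k) for k in (2, 4, 16, 64)}
for k, (h_y, i_wz) in results.items():
    print(f"{k:3d} internal states: H(Y) = {h_y:.2f} bits, I(W,Z) = {i_wz:.4f} bits")
```

However many actions the agent has available, $I(W,Z)$ stays below $H(Y)$, and $H(Y)$ is capped by the logarithm of the number of internal states. Giving the agent more internal states raises the ceiling; it does not guarantee that a given policy makes use of it.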

There are a lot of interesting implications of this for social theory. Here are a few cases that come to mind.

I've written quite a bit here (blog links) and here (arXiv) about Bostrom’s superintelligence argument and why I’m generally not concerned with the prospect of an artificial intelligence taking over the world. My argument is that there are limits to how much an algorithm can improve itself, and these limits put a stop to exponential intelligence explosions. I’ve been criticized on the grounds that I don’t specify what the limits are, and that if the limits are high enough then maybe relative superintelligence is possible. The Data Processing Inequality gives us another tool for estimating the bounds of an intelligence based on the range of physical states it can possibly be in. How calibrated can a hegemonic agent be to the complexity of the world? It depends on the capacity of that agent to absorb information about the world; that can be measured in information entropy.

A related case is a rendering of Scott’s Seeing Like a State arguments. Why is it that “high modernist” governments failed to successfully control society through scientific intervention? One reason is that the complexity of the system they were trying to manage vastly outsized the complexity of the centralized control mechanisms. Centralized control was very blunt, causing many social problems. Arguably, behavioral targeting and big data centers today equip controlling organizations with more informational capacity (more entropy), but they still get it wrong sometimes, causing privacy violations, because they can’t model the entirety of the messy world we’re in.

The Data Processing Inequality is also helpful for explaining why the world is so messy. There are a lot of different agents in the world, and each one only has so much bandwidth for taking in information. This means that most agents are acting almost independently from each other. The guiding principle of society isn’t signal, it’s noise. That explains why there are so many disorganized heavy tail distributions in social phenomena.

Importantly, if we let the world at any time slice be informed by the actions of many agents acting nearly independently from each other in the slice before, then that increases the entropy of the world. This increases the challenge for any particular agent to develop an effective controlling strategy. For this reason, we would expect the world to get more out of control the more intelligent agents are on average. The popularity of the personal computer perhaps introduced a lot more entropy into the world, distributed in an agent-by-agent way. Moreover, powerful controlling data centers may increase the world’s entropy, rather than reducing it. So even if, for example, Amazon were to try to take over the world, the existence of Baidu would be a major obstacle to its plans.

There are a lot of assumptions built into these informal arguments and I’m not wedded to any of them. But my point here is that information theory provides useful tools for thinking about agents in a complex world. There’s potential for using it for modeling sociotechnical systems and their limitations.