The GDPR and the future of the EU

In privacy scholarship and ‘big data’ engineering circles, much is being made of the EU’s General Data Protection Regulation (GDPR). It is probably the strongest regulation yet passed protecting personal data in a world of large-scale, global digital services. What makes it particularly fearsome is the extraterritoriality of its applicability. It applies to controllers and processors operating in the EU whether or not the data processing itself is done in the EU, and it applies to the processing of data whose subjects are in the EU whether or not the controller or processor is in the EU. In short, it protects the data of people in the EU, no matter where the organization using the data is.

This is interesting in light of the fact that the news is full of intimations that the EU might collapse as a result of the French election. Prediction markets currently favor Macron, but he faces a strong contender in Le Pen, who is against the Eurozone.

The GDPR is scheduled to go into effect in 2018. I wonder what its jurisdiction will be once it goes into effect. A lot can happen between now and then.

A big, sincere THANK YOU to the anonymous reviewer who rejected my IC2S2 submission

I submitted an abstract to IC2S2 this year. It was a risky abstract to submit: I was trying to enter a new field; the extended abstract was capped at three pages; and I had some sketches of an argument in mind that were far too large in scope and informed mainly by my dissatisfaction with other fields.

I got the most wonderful negative review from an anonymous reviewer: a careful dissection of my roughshod argument and firm pointers to literature (some of it quite old) where my naive intuitions had already been addressed. It was a brief and expertly written literature review of precisely the questions that I had been grasping at so poorly.

There have been moments in my brief research career where somebody has stepped in out of the blue and put me squarely on the right path. I can count them on one hand. This is one of them. I have enormous gratitude towards these people; my gratitude is not lessened by the anonymity of this reviewer. Likely this was a defining moment in my mental life. Thank you, wherever you are. You’ve set a high bar and one day I hope to pay that favor forward.

Three possibilities of political agency in an economy of control

I wrote earlier about three modes of social explanation: functionality, which explains a social phenomenon in terms of what it optimizes; politics, which explains a social phenomenon in terms of multiple agents working to optimize different goals; and chaos, which explains a social phenomenon in terms of the happenings of chance, independent of the will of any agent.

A couple of notes on this before I go on. First, this view of social explanation is intentionally aligned with mathematical theories of agency widely used in what is broadly considered ‘artificial intelligence’ research and even more broadly acknowledged under the rubrics of economics, cognitive science, multi-agent systems research, and the like. I am willfully opting into the hegemonic paradigm here. If years in graduate school at Berkeley have taught me one pearl of wisdom, it’s this: it’s hegemonic for a reason.

A second note is that when I say “social explanation”, what I really mean is “sociotechnical explanation”. This is awkward, because the only reason I have to make this point is an artificial distinction between technology and society that exists much more as a social distinction between technologists and–what should one call them?–socialites than as an actual ontological distinction. Engineers can, must, and do constantly engage societal pressures; they must bracket off these pressures in some aspects of their work to achieve the specific demands of engineering. Socialites can, must, and do adopt and use technologies in every aspect of their lives; they must bracket these technologies in some aspects of their lives in order to achieve the specific demands of mastering social fashions. The social scientist, qua socialite who masters specific social rituals, and the technologist, qua engineer who masters a specific aspect of nature, naturally advertise their mastery as autonomous and complete. The social scholar of technology, qua socialite engaged in arbitrage between communities of socialites and communities of technologists, naturally advertises their mastery as an enlightened view over and above the advertisements of the technologists. To the extent this is all mere advertising, it is all mere nonsense. Currency, for example, is surely a technology; it is also surely an artifact of socialization as much if not more than it is a material artifact. Since the truly ancient invention of currency and its pervasion of the fabric of social life, there has been no society that is not sociotechnical, and there has been no technology that is not sociotechnical. A better word for the sociotechnical would be one that indicates its triviality, how it actually carries no specific meaning at all. It signals only that one has matured to the point that one disbelieves advertisements. We are speaking scientifically now.

With that out of the way…I have proposed three modes of explanation: functionality, politics, and chaos. They refer to specific distributions of control throughout a social system. The first refers to the capacity of the system for self-control. The second refers to the capacity of the components of the system for self-control. The third refers to the absence of control.

I’ve written elsewhere about my interest in the economy of control, or in economies of control, plurally. Perhaps the best way to go about studying this would be an in-depth review of the available literature on information economics. Sadly, I am at this point a bit removed from this literature, having gone down a number of other rabbit holes. In as much as intellectual progress can be made by blazing novel trails through the wilderness of ideas, I’m intent on documenting my path back to the rationalistic homeland from which I’ve wandered. Perhaps I bring spices. Perhaps I bring disease.

One of the questions I bring with me is the question of political agency. Is there a mathematical operationalization of this concept? I don’t know of one. What I do know is that it is associated most with the political mode of explanation, because this mode of explanation allows for the existence of politics, by which I mean agents engaged in complex interactions for their individual and sometimes collective gain. Perhaps it is the emergent dynamics of individuals’ shifting constitution into collectives that best captures what is interesting about politics. These collectives serve functions, surely, but what function? Is it a function with any permanence or real agency? Or is it a specious functionality, only a compromise of the agents that compose it, ready to be sabotaged by a defector at any moment?

Another question I’m interested in is how chaos plays a role in such an economy of control. There is plenty of evidence to suggest that entropy in society, far from being a purely natural consequence of thermodynamics, is a deliberate consequence of political activity. Brunton and Nissenbaum have recently given the name obfuscation to some kinds of political activity that are designed to mislead and misdirect. I believe this is not the only reason why agents in the economy of control work actively to undermine each other’s control. To some extent, the distribution of control over social outcomes is zero-sum. It is certainly so at the Pareto boundary of such distributions. But I posit that part of what makes economies of control interesting is that they have a non-Euclidean geometry that confounds the simple aggregations that make Pareto optimality a useful concept. Whether this hunch can be put persuasively remains to be seen.

What I may be able to say now is this: there is a sense in which political agency in an economy of control is self-referential, in that what is at stake for each agent is not utility defined exogenously to the economy, but rather agency defined endogenously to the economy. This gives economic activity within it a particularly political character. For purposes of explanation, this enables us to consider three different modes of political agency (or should I say political action), corresponding to the three modes of social explanation outlined above.

A political agent may concern itself with seizing control. It may take actions which are intended to direct the functional orientation of the total social system of which it is a part to be responsive to its own functional orientation. One might see this narrowly as adapting the total system’s utility function to be in line with one’s own, but this is to partially miss the point. It is to align the agency of the total system with one’s own, or to make the total system a subsidiary to one’s agency. (This demands further formalization.)

A political agent may instead be concerned with interaction with other agents in a less commanding way. I’ll call this negotiation for now. The autonomy of other agents is respected, but the political agent attempts a coordination between itself and others for the purpose of advancing its own interests (its own agency, its own utility). This is not a coup d’etat. It’s business as usual.

A political agent can also attempt to actively introduce chaos into its own social system. This is sabotage. It is an essentially disruptive maneuver. It is action aimed to cause the death of function and bring about instead emergence, which is the more positive way of characterizing the outcomes of chaos.

Varela’s modes of explanation and the teleonomic

I’m now diving deep into Francisco Varela’s Principles of Biological Autonomy (1979). Chapter 8 draws on his paper with Maturana, “Mechanism and biological explanation” (1972). Chapter 9 draws heavily from his paper, “Describing the Logic of the Living: adequacies and limitations of the idea of autopoiesis” (1978).

I am finding this work very enlightening. Somehow it bridges between my interests in philosophy of science right into my current work on privacy by design. I think I will find a way to work this into my dissertation after all.

Varela has a theory of different modes of explanation of phenomena.

One form of explanation is operational explanation. The categories used in these explanations are assumed to be components in the system that generated the phenomena. The components are related to each other in a causal and lawful (nomic) way. These explanations are valued by science because they are designed so that observers can best predict and control the phenomena under study. This corresponds roughly to what Habermas identifies as technical knowledge in Knowledge and Human Interests. In an operational explanation, the ideas of purpose or function have no explanatory value; rather the observer is free to employ the system for whatever purpose he or she wishes.

Another form of explanation is symbolic explanation, which is a more subtle and difficult idea. It is perhaps better associated with phenomenology and social scientific methods that build on it, such as ethnomethodology. Symbolic explanations, Varela argues, are complementary to operational explanations and are necessary for a complete description of “living phenomenology”, which I believe Varela imagines as a kind of observer-inclusive science of biology.

To build up to his idea of the symbolic explanation, Varela first discusses an earlier form of explanation, now out of fashion: teleological explanation. Teleological explanations do not support manipulation, but rather “understanding, communication of intelligible perspective in regard to a phenomenal domain”. Understanding the “what for” of a phenomenon, what its purpose is, does not tell you how to control the phenomenon. While it may help regulate one’s expectations, Varela does not see this as its primary purpose. Communicability motivates teleological explanation. This resonates with Habermas’s idea of hermeneutic knowledge, what is accomplished through intersubjective understanding.

Varela does not see these modes of explanation as exclusive. Operational explanations assume that “phenomena occur through a network of nomic (lawlike) relationships that follow one another. In the symbolic, communicative explanation the fundamental assumption is that phenomena occur through a certain order or pattern, but the fundamental focus of attention is on certain moments of such an order, relative to the inquiring community.” But these modes of explanation are fundamentally compatible.

“If we can provide a nomic basis to a phenomenon, an operational description, then a teleological explanation only consists of putting in parenthesis or conceptually abbreviating the intermediate steps of a chain of causal events, and concentrating on those patterns that are particularly interesting to the inquiring community. Accordingly, Pittendrigh introduced the term teleonomic to designate those teleological explanations that assume a nomic structure in the phenomena, but choose to ignore intermediate steps in order to concentrate on certain events (Ayala, 1970). Such teleologic explanations introduce finalistic terms in an explanation while assuming their dependence in some nomic network, hence the name teleo-nomic.”

A symbolic explanation that is consistent with operational theory, therefore, is a teleonomic explanation: it chooses to ignore some of the operations in order to focus on relationships that are important to the observer. There are coherent patterns of behavior which the observer chooses to pay attention to. Varela does not use the word ‘abstraction’, though as a computer scientist I am tempted to. Varela’s domains of interest, however, are complex physical systems often represented as dynamical systems, not the kind of well-defined chains of logical operations familiar from computer programming. In fact, one of the upshots of Varela’s theory of the symbolic explanation is a criticism of naive uses of “information” in causal explanations that are typical of computer scientists.

“This is typical in computer science and systems engineering, where information and information processing are in the same category as matter and energy. This attitude has its roots in the fact that systems ideas and cybernetics grew in a technological atmosphere that acknowledged the insufficiency of the purely causalistic paradigm (who would think of handling a computer through the field equations of thousands of integrated circuits?), but had no awareness of the need to make explicit the change in perspective taken by the inquiring community. To the extent that the engineering field is prescriptive (by design), this kind of epistemological blunder is still workable. However, it becomes unbearable and useless when exported from the domain of prescription to that of description of natural systems, in living systems and human affairs.”

This form of critique makes its way into a criticism of artificial intelligence by Winograd and Flores, presumably through the Chilean connection.

More assessment of AI X-risk potential

I’ve been stimulated by Luciano Floridi’s recent article in Aeon, “Should we be afraid of AI?”. I’m surprised that this issue hasn’t been settled yet, since it seems like “we” have the formal tools necessary to solve the problem decisively. But nevertheless this appears to be the subject of debate.

I was referred to Kaj Sotala’s rebuttal of an earlier work by Floridi which his Aeon article was based on. The rebuttal appears in this APA Newsletter on Philosophy and Computers. It is worth reading.

The issue that I’m most interested in is whether or not AI risk research should constitute a special, independent branch of research, or whether it can be approached just as well by pursuing a number of other more mainstream artificial intelligence research agendas. My primary engagement with these debates has so far been an analysis of Nick Bostrom’s argument in his book Superintelligence, which tries to argue in particular that there is an existential risk (or X-risk) to humanity from artificial intelligence. “Existential risk” means a risk to the existence of something, in this case humanity. And the risk Bostrom has written about is the risk of eponymous superintelligence: an artificial intelligence that gets smart enough to improve its own intelligence, achieve omnipotence, and end the world as we know it.

I’ve posted my rebuttal to this argument on arXiv. The one-sentence summary of the argument is: algorithms can’t just modify themselves into omnipotence because they will hit performance bounds due to data and hardware.

A number of friends have pointed out to me that this is not a decisive argument. They say: don’t you just need the AI to advance fast enough and far enough to be an existential threat?

There are a number of reasons why I don’t believe this is likely. In fact, I believe that it is provably vanishingly unlikely. This is not to say that I have a proof, per se. I suppose it’s incumbent on me to work it out and see if the proof is really there.

So: Herewith is my Sketch Of A Proof of why there’s no significant artificial intelligence existential risk.

Lemma: Intelligence advances due to purely algorithmic self-modification will always plateau due to data and hardware constraints, which advance more slowly.

Proof: This paper.

As a consequence, all artificial intelligence explosions will be sigmoid. That is, starting slow, accelerating, then decelerating, then growing so slowly as to be asymptotic. Let’s call the level of intelligence at which an explosion asymptotes the explosion bound.
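This sigmoid shape is easy to sketch. Here is a toy model in Python using the logistic function, with a bound, growth rate, and midpoint that are purely illustrative choices of mine, not anything derived from the argument:

```python
import math

def logistic(t, bound=100.0, rate=1.0, midpoint=10.0):
    """Toy sigmoid: slow start, acceleration, then a plateau at `bound`."""
    return bound / (1.0 + math.exp(-rate * (t - midpoint)))

# Gains near the midpoint are large; gains late in the curve are tiny.
early_gain = logistic(11) - logistic(10)
late_gain = logistic(31) - logistic(30)
assert late_gain < early_gain
assert logistic(20) < 100.0  # the curve never exceeds its explosion bound
```

The point of the sketch is only that an explosion following such a curve has a built-in ceiling: however fast the middle phase looks, the asymptote (the explosion bound) is fixed by the parameters, which here stand in for data and hardware constraints.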

There’s empirical support for this claim. Basically, we have never had a really big intelligence explosion due to algorithmic improvement alone. Looking at the impressive results of the last seventy years, most of the impressiveness can be attributed to advances in hardware and data collection. Notoriously, Deep Learning is largely just decades-old artificial neural network technology repurposed for GPUs in the cloud. Which is awesome and a little scary. But it’s not an algorithmic intelligence explosion. It’s a consolidation of material computing power and sensor technology by organizations. The algorithmic advances fill those material shoes really quickly, it’s true. This is precisely the point: it’s not the algorithms that are the bottleneck.

Observation: Intelligence explosions are happening all the time. Most of them are small.

Once we accept the idea that intelligence explosions are all bounded, it becomes rather arbitrary where we draw the line between an intelligence explosion and some lesser algorithmic intelligence advance. There is a real sense in which any significant intelligence advance is a sigmoid expansion in intelligence. This would include run-of-the-mill scientific discoveries and good ideas.

If intelligence explosions are anything like virtually every other interesting empirical phenomenon, then they are distributed according to a heavy tail distribution. This means a distribution with a lot of very small values and a diminishing probability of higher values that nevertheless assigns some probability to very high values. Assuming intelligence is something that can be quantified and observed empirically (a huge ‘if’ taken for granted in this discussion), we can (theoretically) take a good hard look at the ways intelligence has advanced. Look around you. Do you see people and computers getting smarter all the time, sometimes in leaps and bounds but most of the time minutely? That’s a confirmation of this hypothesis!

The big idea here is really just to assert that there is a probability distribution over intelligence explosion bounds that all actual intelligence explosions are being drawn from. This follows more or less directly from the conclusion that all intelligence explosions are bounded. Once we posit such a distribution, it becomes possible to take expected values of functions of its values.
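To make the heavy-tail picture concrete, here is a small Python sketch. The choice of a Pareto distribution, its shape parameter, and the thresholds are all my own illustrative assumptions, not anything established about intelligence; the sketch only shows the qualitative pattern the argument relies on: most draws are small, very large draws are rare but possible, and the distribution still has a well-defined expected value:

```python
import random

random.seed(0)

# Illustrative assumption: explosion bounds drawn from a Pareto distribution
# with shape alpha = 3 (support starts at 1): many small values, rare big ones.
alpha = 3.0
samples = [random.paretovariate(alpha) for _ in range(100_000)]

frac_small = sum(s < 2.0 for s in samples) / len(samples)   # most draws are modest
frac_huge = sum(s > 20.0 for s in samples) / len(samples)   # rare but nonzero
mean_bound = sum(samples) / len(samples)                    # finite expected value
```

For this shape parameter, roughly seven draws in eight fall below 2, while draws above 20 occur about once in eight thousand samples; the expected value is finite (1.5 for alpha = 3), which is what licenses talk of expected values over the distribution of bounds.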

Empirical claim: Hardware and sensing advances diffuse rapidly relative to their contribution to intelligence gains.

There’s a material, sociotechnical analog to Bostrom’s explosive superintelligence. We could imagine a corporation that is working in secret on new computing infrastructure. Whenever it has an advance in computing infrastructure, the AI people (or, increasingly, the AI-writing-AI) develop programming that maximizes the use of this new technology. Then it uses that technology to enrich its own computer-improving facilities. When it needs more…minerals…or whatever it needs to further its research efforts, it finds a way to get them. It proceeds to take over the world.

This may presently be happening. But evidence suggests that this isn’t how the technology economy really works. No doubt Amazon (for example) is using Amazon Web Services internally to do its business analytics. But it also makes its business out of selling its computing infrastructure to other organizations as a commodity. That’s actually the best way it can enrich itself.

What’s happening here is the diffusion of innovation, which is a well-studied phenomenon in economics and other fields. Ideas spread. Technological designs spread. I’d go so far as to say that it is often (perhaps always?) the best strategy for some agent that has locally discovered a way to advance its own intelligence to figure out how to trade that intelligence to other agents. Almost always that trade involves the diffusion of the basis of that intelligence itself.

Why? Because since there are independent intelligence advances of varying sizes happening all the time, there’s actually a very competitive market for innovation that quickly devalues any particular gain. A discovery, if hoarded, will likely be discovered by somebody else. The race to get credit for any technological advance at all motivates diffusion and disclosure.

The result is that the distribution of innovation, rather than concentrating into very tall spikes, is constantly flattening and fattening itself. That’s important because…
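One way to see why diffusion matters is a crude simulation, entirely my own construction with made-up parameters: agents receive random heavy-tailed intelligence advances, and each round every agent closes some fraction of its gap to the current leader (the diffusion). With diffusion, the leader’s edge over the median agent stays small; without it, the edge accumulates:

```python
import random

random.seed(1)

def leader_edge(rounds=200, n=50, diffusion=0.9):
    """Each round one random agent gets a heavy-tailed advance; then every
    agent closes `diffusion` of its gap to the current leader."""
    levels = [0.0] * n
    for _ in range(rounds):
        levels[random.randrange(n)] += random.paretovariate(3.0)
        best = max(levels)
        levels = [x + diffusion * (best - x) for x in levels]
    return max(levels) - sorted(levels)[n // 2]  # leader's edge over the median

edge_with_diffusion = leader_edge(diffusion=0.9)
edge_without = leader_edge(diffusion=0.0)
```

In this toy world, diffusion continually flattens the distribution of intelligence, so no single lucky draw translates into a lasting relative advantage, which is the intuition behind the next claim.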

Claim: Intelligence risk is not due to absolute levels of intelligence, but relative intelligence advantage.

The idea here is that since humanity is composed of lots of interacting intelligent sociotechnical organizations, any hostile intelligence is going to have a lot of intelligent adversaries. If the game of life can be won through intelligence alone, then it can only be won with a really big intelligence advantage over other intelligent beings. It’s not about absolute intelligence; it’s intelligence inequality we need to worry about.

Consequently, the more intelligence advances (i.e., technologies) diffuse, the less risk there is.

Conclusion: The chance of an existential risk from an intelligence explosion is small and decreasing all the time.

So consider this: globally, there’s tons of investment in technologies that, when discovered, allow for local algorithmic intelligence explosions.

But even if we assume these algorithmic advances are nearly instantaneous, they are still bounded.

Lots of independent bounded explosions are happening all the time. But they are also diffusing all the time.

Since the global intelligence distribution is always fattening, that means that the chance of any particular technological advance granting a decisive advantage over others is decreasing.

There is always the possibility of a fluke, of course. But if there was going to be a humanity destroying technological discovery, it would probably have already been invented and destroyed us. Since it hasn’t, we have a lot more resilience to threats from intelligence explosions, not to mention a lot of other threats.

This doesn’t mean that it isn’t worth trying to figure out how to make AI better for people. But it does diminish the need to think about artificial intelligence as an existential risk. It makes AI much more comparable to a biological threat. Biological threats could be really bad for humanity. But there’s also the organic reality that life is very resilient and human life in general is very secure precisely because it has developed so much intelligence.

I believe that thinking about the risks of artificial intelligence as analogous to the risks from biological threats is helpful for prioritizing where research effort in artificial intelligence should go. Just because AI doesn’t present an existential risk to all of humanity doesn’t mean it doesn’t kill a lot of people or make their lives miserable. On the contrary, we are in a world with both a lot of artificial and non-artificial intelligence and a lot of miserable and dying people. These phenomena are not causally disconnected. A good research agenda for AI could start with an investigation of these actually miserable people and what their problems are, and how AI is causing that suffering or alternatively what it could do to improve things. That would be an enormously more productive research agenda than one that aims primarily to reduce the impact of potential explosions which are diminishingly unlikely to occur.

Lenin and Luxemburg

One of the interesting parts of Scott’s Seeing Like a State is a detailed analysis of Vladimir Lenin’s ideological writings juxtaposed with one of his contemporary critics, Rosa Luxemburg, who was a philosopher and activist in Germany.

Scott is critical of Lenin, pointing out that while his writings emphasize the role of a secretive intelligentsia commanding the raw material of an angry working class through propaganda and a kind of middle management tier of revolutionarily educated factory bosses, this is not how the revolution actually happened. The Bolsheviks took over an empty throne, so to speak, because the czars had already lost their power fighting Austria in World War I. This left Russia headless, with local regions ruled by local autonomous powers. Many of these powers were in fact peasant and proletarian collectives. But others may have been soldiers returning from war and seizing whatever control they could by force.

Luxemburg’s revolutionary theory was much more sensitive to the complexity of decentralized power. Rather than expecting the working class to submit unquestioningly to top-down control and coordinate in mass strikes, she acknowledged the reality that decentralized groups would act in an uncoordinated way. This was good for the revolutionary cause, she argued, because it allowed the local energy and creativity of workers’ movements to act effectively and contribute spontaneously to the overall outcome. Whereas Lenin saw spontaneity in the working class as leading inevitably to their being coopted by bourgeois ideology, Luxemburg believed the spontaneous, authentic action of autonomously acting working-class people was vital to keeping the revolution unified and responsive to working-class interests.