correcting an error in my analysis

by Sebastian Benthall

There is an error in my last post, where I was thinking through the interpretation of the 25,000,000 hit number reported for the Buzzfeed blue/black/white/whatever dress post. In that post I assumed that the distribution of viewers would be the standard one you see in on-line participation: a power law distribution with a long tail. Depending on which way you hold the diagram, the “tail” is either the enormous number of instances that only occur once (in this case, a visitor who goes to the page once and never again) or it’s the population of instances that have bizarrely high occurrences (like that one guy who hit refresh on the page 100 times, and the woman that looked at the page 300 times, and…). You can turn one tail into the other by turning the histogram sideways and shaking really hard.

The problem with this analysis is that it ignores the data I’ve been getting from a significant subset of people who I’ve talked to about this in passing, which is that because the page contains some sort of well-crafted optical illusion, lots of people have looked at it once (and seen it as, say, a blue and black dress) and then looked at it again, seeing it as white and gold. In fact the article seems designed to get the reader to do just this.

If I’m being somewhat abstract in my analysis, it’s because I’ve refused to go click on the link myself. I have read too much Adorno. I hear the drumbeat of fascism in all popular culture. I do not want to take part in intelligently designed collective effervescence if I can help it. This is my idiosyncrasy.

But this inferred stickiness of the dress image has consequences for the traffic analysis. I’m sure that whoever is actually looking at the metrics on the article is tracking repeat versus unique visitors. I wonder how deliberately the image was created with the idea of maximizing repeat visitations in mind, and what the observed correlation between repeat and unique visitors is. Repeated visits suggest sustained interest over time, whereas “mere” virality is a momentary spread of information over space. If you see content as a kind of property and sustained traffic over time as the value of that property, it makes sense to try to create things with staying power. Memetic globules forever gunking the crisscrossed manifold of attention. Culture.
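To make the repeat versus unique distinction concrete, here is a minimal sketch of how a raw hit count splits into the two. It assumes a hypothetical log with one visitor ID per page view; the function name `summarize_hits` and the sample IDs are invented for illustration, and I have no idea how Buzzfeed actually stores or counts its traffic.

```python
from collections import Counter

def summarize_hits(visitor_ids):
    """Split a raw hit count into unique and repeat visitors.

    visitor_ids: one entry per page view (hypothetical log format).
    """
    visits_per_visitor = Counter(visitor_ids)
    total_hits = sum(visits_per_visitor.values())
    unique_visitors = len(visits_per_visitor)
    # Visitors who came back at least once.
    repeat_visitors = sum(1 for n in visits_per_visitor.values() if n > 1)
    return {
        "total_hits": total_hits,
        "unique_visitors": unique_visitors,
        "repeat_visitors": repeat_visitors,
        # Hits beyond each visitor's first look.
        "repeat_hits": total_hits - unique_visitors,
    }

# Made-up log: "a" looks three times, "b" twice, "c" and "d" once each.
hits = ["a", "b", "a", "c", "a", "b", "d"]
summary = summarize_hits(hits)
print(summary)
```

The point is only that the same headline hit number is compatible with very different mixes of unique and repeat visitors.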

Does this require a different statistical distribution to process properly? Is Cosma Shalizi right after all, and are these “power law” distributions just overhyped log-normal distributions? What happens when the generative process has a stickiness term? Is that just reflected in the power law distribution’s exponent? One day I will get a grip on this. Maybe I can do it working with mailing list data.
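One way to start poking at the stickiness question is a toy simulation. The sketch below is entirely my own assumption, not a fitted model: each visitor gets a heavy-tailed baseline number of visits (a floored Pareto draw with shape `alpha`), and then with probability `stickiness` takes one extra look, as someone who wants to see the dress change color might. The names `simulate_visits`, `alpha`, and `stickiness` are made up for illustration.

```python
import random
from collections import Counter

def simulate_visits(n_visitors, alpha=1.5, stickiness=0.0, seed=0):
    """Toy model: heavy-tailed baseline visits plus an optional second look."""
    rng = random.Random(seed)
    visits = []
    for _ in range(n_visitors):
        # Baseline: floored Pareto draw, always at least 1 visit.
        base = int(rng.paretovariate(alpha))
        # Stickiness term: one extra visit with probability `stickiness`.
        # The draw happens even when stickiness is 0 so that runs with
        # the same seed share the same baseline visits.
        extra = 1 if rng.random() < stickiness else 0
        visits.append(base + extra)
    return visits

baseline = simulate_visits(10_000, stickiness=0.0)
sticky = simulate_visits(10_000, stickiness=0.4)

# Number of visitors at each visit count, for the first few counts.
baseline_hist = Counter(baseline)
sticky_hist = Counter(sticky)
for k in range(1, 6):
    print(k, baseline_hist[k], sticky_hist[k])
```

In this toy version the stickiness term moves mass out of the single-visit bin and into the low repeat counts. Whether that reads as a changed exponent or as a different distribution altogether is exactly what I would want to test against real data.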

I’m writing this because over the weekend I was talking with a linguist and a philosopher about collective attention, a subject of great interest to me. It was the linguist who reported having looked at the dress twice and seeing it in different colors. The philosopher had not seen it. The latter’s research specialty was philosophy of mind, a kind of philosophy I care about a lot. I asked him whether in cases of collective attention the mental representation supervenes reductively on many individual minds or on more than that. He said that this is a matter of current debate but that he wants to argue that collective attention means more than my awareness of X, and my awareness of your awareness of X, ad infinitum. Ultimately I’m a mathematical person, and I am happy to treat the limit of the infinite process as a thing in itself, with its relationship to what it reduces to mediated by the logic of infinitesimals. But perhaps even this is not enough. I gave the philosopher my recommendation of Søren Brier and Ulanowicz, who together I think provide the groundwork needed for an ontology of macroorganic mentality and representation. The operationalization of these theories is the goal of my work at Glass Bead Labs.