miller and page | Digifesto

Why disorganized heavy tail distributions?

I wrote too soon.

Miller and Page (2009) do indeed address “fat tail” distributions explicitly in the same chapter on Emergence discussed in my last post.

However, they do not touch on the possibility that fat tail distributions might be log normal distributions generated by the Central Limit Theorem, as is well-documented by Mitzenmacher (2004).

Instead, they explicitly make a different case. They argue that there are two kinds of complexity:

disorganized complexity, complexity where extreme values balance each other out to create average aggregate behavior according to the Law of Large Numbers and Central Limit Theorem.
organized complexity, where positive and negative feedback can result in extreme outcomes, best characterized by power law or “heavy tail” distributions. Preferential attachment is an example of a feedback based mechanism for generating power law distributions (in the specific case of network degrees).

Indeed, this rough breakdown of possible scientific explanations (the relatively orderly null-hypothesis world of normal distributions, and the chaotic, more accurately rendered world of heavy tail distributions) was the one I had before I started studying complex systems and statistics more seriously in grad school.

Only later did I come to the conclusion that this is a pervasive error, because of the ease with which log normal distributions (which may be “disorganized”) can be confused with power law distributions (which tend to be explained by “organized” processes). I am a bit disappointed that Miller and Page repeat this error, but then again their book was written in 2009. I wonder whether the methodological realization (which I assume I’m not alone in, as I hear it confirmed informally in conversations with smart people sometimes) is relatively recent.
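To make the source of the confusion concrete, here is a short worked comparison of the two densities on logarithmic axes, along the lines of the argument in Mitzenmacher (2004). The symbols $\mu$, $\sigma$, $\alpha$, and $C$ are the usual parameters of the two families and are my notation, not Miller and Page's.

```latex
% Log normal density and its logarithm
f_{\mathrm{LN}}(x) = \frac{1}{x \sigma \sqrt{2\pi}}
    \exp\!\left( -\frac{(\ln x - \mu)^2}{2\sigma^2} \right)

\ln f_{\mathrm{LN}}(x) = -\frac{(\ln x)^2}{2\sigma^2}
    + \left( \frac{\mu}{\sigma^2} - 1 \right) \ln x
    - \ln\!\left( \sigma \sqrt{2\pi} \right) - \frac{\mu^2}{2\sigma^2}

% Power law density and its logarithm
f_{\mathrm{PL}}(x) = C x^{-\alpha}
\qquad
\ln f_{\mathrm{PL}}(x) = -\alpha \ln x + \ln C
```

On a log-log plot the power law is exactly a straight line, while the log normal is a parabola in $\ln x$. When $\sigma^2$ is large, the quadratic term is small over a wide range of $x$, so the parabola can look nearly straight across the range actually observed in the data.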

Because this is something so rarely discussed in focus, I think it may be worth pondering exactly why disorganized heavy tail distributions are not favored in the literature. There are several reasons I can think of, which I’ll offer informally here as possibilities or hypotheses.

One reason that I’ve argued for before here is that organized processes are more satisfying as explanations than disorganized processes. Most people are not very good at thinking about probabilities (Tetlock and Gardner (2016) have a great, accessible discussion of why this is the case). So to the extent that the Law of Large Numbers or Central Limit Theorem has true explanatory power, it may not be the kind of explanation most people are willing to entertain. This apparently includes scientists. Rather, a simple explanation in terms of feedback may be the kind of thing that feels like a robust scientific finding, even if there’s something spurious about it when viewed rigorously. (This is related, I think, to arguments about the end of narrative in social science.)

Another reason why disorganized heavy tail distributions may be underutilized as scientific explanations is that it is counter-intuitive that a disorganized process can produce such extreme inequality in outcomes.

This has to do with the key transformation that is the difference between a normal and a log normal distribution. A normal distribution is a bell-shaped distribution one gets when one adds a large number of independent random variables.

The log normal distribution is a heavy tail distribution one gets by multiplying a large number of positively valued independent random variables. While it does have a bell or hump, the top of the bell (the mode) is not at the arithmetic mean but below it, because the bell is skewed: its right-hand side stretches out much further than its left. But this is not necessarily because of the dominance of any particular factor (as would be expected if, for example, a single factor were involved in a positive feedback loop). Rather, it is the mathematical fact that many factors multiplied together occasionally produce extraordinarily high values which creates the heavy right-hand side of the bell.
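A minimal simulation sketch of this multiplicative process may help. The factor distribution (uniform on [0.5, 1.5]), the number of factors, and the helper names are assumptions chosen purely for illustration; any positive factors whose logarithms have finite variance should behave similarly.

```python
import numpy as np

def simulate_products(n_samples=100_000, n_factors=50, seed=0):
    """Multiply n_factors i.i.d. positive factors for each of n_samples outcomes."""
    rng = np.random.default_rng(seed)
    # Hypothetical factor distribution: uniform on [0.5, 1.5], mean 1.
    factors = rng.uniform(0.5, 1.5, size=(n_samples, n_factors))
    return factors.prod(axis=1)

def skewness(x):
    """Sample skewness: the third standardized moment."""
    z = (x - x.mean()) / x.std()
    return float((z ** 3).mean())

products = simulate_products()
logs = np.log(products)  # log of a product = sum of logs, so the CLT applies here

print("skewness of products:     ", skewness(products))
print("skewness of log(products):", skewness(logs))
print("mean of products:  ", products.mean())
print("median of products:", np.median(products))
```

No single factor is privileged in this sketch, yet the products are strongly right-skewed and their mean sits well above their median, while the logarithms of the products look close to symmetric, as the Central Limit Theorem suggests they should.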

One way to put it is that rather than having a “deep” positive feedback loop where a single factor amplifies itself many times over, disorganized heavy tails have “shallow” positive feedback where each of many factors has a single and simultaneous amplifying effect on the impact of all the others. This amplification effect is, like multiplication itself, commutative, which means that no single factor can be considered to be causally prior to the others.

Once again, this defies specificity in an explanation, which may be for some people an explanatory desideratum.

But these extreme values are somehow ones that people demand specific explanations for. This is related, I believe, to the desire for a causal lever with which people can change outcomes, especially their own personal outcomes.

There’s an important political question implicated by all this, which is: why are wealth and power concentrated in the hands of the very few?

One explanation that must be considered is the possibility that society is accumulated history, and over thousands of years an innumerable number of independent factors have affected the distribution of wealth and power. Though rather disorganized, these factors amplify each other multiplicatively, resulting in the distribution that we see today.

The problem with this explanation is that it seems there is little to be done about this state of affairs. A person can affect a handful of the factors that contribute to their own wealth or the wealth of another, but if there are thousands of them then it’s hard to get a grip. One must view the other as simply lucky or unlucky. How can one politically mobilize around that?

References

Miller, John H., and Scott E. Page. Complex adaptive systems: An introduction to computational models of social life. Princeton University Press, 2009.

Mitzenmacher, Michael. “A brief history of generative models for power law and lognormal distributions.” Internet Mathematics 1.2 (2004): 226-251.

Tetlock, Philip E., and Dan Gardner. Superforecasting: The art and science of prediction. Random House, 2016.

The Law: Miller and Page on Emergence, and statistics in social science

I’m working now through Complex Adaptive Systems by Miller and Page and have been deeply impressed with the clarity with which they lay out key scientific principles.

In their chapter on “Emergence”, they discuss the key problem in science of accounting for how some phenomena emerge from lower level phenomena. In the hard sciences, examples include how the laws and properties of chemistry emerge from the laws and properties of particles as determined by physics. It has been suggested that the psychological states of the mind emerge from the physical states of the brain. In social sciences, there is the open question of how social forms emerge from individual behavior.

Miller and Page acknowledge that “unfortunately, emergence is one of those complex systems ideas that exists in a well-trodden, but relatively untracked, bog of discussions”. Epstein’s (2006) treatment of it is particularly aggressive, as he takes aim at early emergence theorists who used the term in a kind of mystifying sense and then attempts to replace this usage with his own much more concrete one.

So far in my reading on the subject there has been a lack of mathematical rigor in the treatment of the subject, but I’ve been impressed now with what Miller and Page specifically bring to bear on the problem.

Miller and Page provide two clear criteria for an emergent phenomenon:

“Emergence is a phenomenon whereby well-formulated aggregate behavior arises from localized, individual behavior.
“Such aggregate behavior should be immune to reasonable variations in the individual behavior.”

Significantly, their first example of such an effect comes from statistics: it’s the Law of Large Numbers and related theorems like the Central Limit Theorem.

These are basic theorems in statistics about the properties of a sample of random variables. The Law of Large Numbers states that the average of a large number of samples will converge on the expected value of one sample. The Central Limit Theorem states that the distribution of the sum of many independent and identically distributed random variables (with finite variance) will tend towards a normal (or Gaussian) distribution, whatever the distribution of the underlying variables is.
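Both theorems can be checked numerically with a small sketch. The underlying distribution here (exponential with mean 1, which is strongly skewed) and the sample sizes are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_means(n_terms, n_repeats=10_000):
    """Average n_terms i.i.d. exponential(1) draws, repeated n_repeats times."""
    draws = rng.exponential(scale=1.0, size=(n_repeats, n_terms))
    return draws.mean(axis=1)

def skewness(x):
    """Sample skewness: the third standardized moment (0 for a normal distribution)."""
    z = (x - x.mean()) / x.std()
    return float((z ** 3).mean())

results = {}
for n in (1, 10, 100, 400):
    means = sample_means(n)
    # Record the center, spread, and skewness of the averages for this n.
    results[n] = (float(means.mean()), float(means.std()), skewness(means))
    print(n, results[n])
```

As the number of terms grows, the averages cluster ever more tightly around the expected value of 1 (the Law of Large Numbers), and their skewness shrinks towards zero as their distribution approaches the symmetric normal shape (the Central Limit Theorem), even though each individual draw comes from a very lopsided distribution.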

Though mathematically these are statements about random variables and their aggregate value, Miller and Page correctly generalize from this to say that these Laws apply to the relationship between individual behavior and aggregate patterns. The emergent phenomena here (the mean or distribution of outcomes) fulfill their criteria for emergent properties: they are well formed and depend less and less on individual behavior the more individuals there are involved.

These Laws are taught in Statistics 101. What is under-emphasized, in my experience, is the extent to which these Laws are determinative of social phenomena. Miller and Page cite an intriguing short story by Robert Coates, entitled “The Law” (1956), that explores the idea of what would happen if the Law of Large Numbers gave out. Suddenly traffic patterns would be radically unpredictable as the number of people on the road, or in a shopping mall, or outdoors enjoying nature, would be far from average far more often than we’re used to. Absurdly, the short story ends when the statistical law is at last adopted by Congress. This is absurd because of course this is one Law that affects all social and physical reality all the time.

Where this fact crops up less frequently than it should is in discussions of the origins of distributions of wide inequality. Physicists have for a couple decades been promoting the idea that the highly unequal “long tail” distributions found in society are likely power law distributions. Clauset, Shalizi, and Newman have developed a statistical test which, when applied, demonstrates that the empirical support for many of these claims isn’t truly there. Often these distributions are empirically closer to a log normal distribution, which can be explained by the Central Limit Theorem when one combines variables through multiplication rather than addition. My own small and flawed contribution to this long and significant line of research is here.
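To see how easily a log normal sample can pass for a power law, consider the naive check of fitting a straight line to the upper tail on log-log axes. The sketch below is emphatically not the method of Clauset, Shalizi, and Newman (which relies on maximum likelihood fitting and goodness-of-fit and likelihood ratio tests); it illustrates the kind of eyeball evidence their test is meant to replace. The synthetic data, the parameters, and the function name `loglog_tail_fit` are assumptions for illustration.

```python
import numpy as np

def loglog_tail_fit(samples, tail_fraction=0.1):
    """Least-squares line through the empirical CCDF of the upper tail on log-log axes.

    Returns (slope, r_squared). This is the naive check, not a valid power law test.
    """
    x = np.sort(samples)[::-1]             # sort in descending order
    k = int(len(x) * tail_fraction)        # number of tail points kept
    tail = x[:k]
    ccdf = np.arange(1, k + 1) / len(x)    # P(X >= x_j) estimated by rank j / n
    log_x, log_ccdf = np.log(tail), np.log(ccdf)
    slope, intercept = np.polyfit(log_x, log_ccdf, 1)
    residuals = log_ccdf - (slope * log_x + intercept)
    r_squared = 1.0 - residuals.var() / log_ccdf.var()
    return float(slope), float(r_squared)

rng = np.random.default_rng(2)
# Synthetic data, assumed for illustration: log normal with mu = 0, sigma = 2.
lognormal_samples = rng.lognormal(mean=0.0, sigma=2.0, size=100_000)
slope, r_squared = loglog_tail_fit(lognormal_samples)
print(f"slope = {slope:.2f}, R^2 = {r_squared:.3f}")
```

A straight line fits the tail of this purely log normal sample quite well, which is why a good-looking line on a log-log plot is weak evidence for a power law on its own.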

As far as explanatory hypotheses go, the immutable laws of statistics have advantages and disadvantages. Their advantage is that they are always correct. The disadvantage of these Laws in particular is that they do not lend themselves to narrative explanation, which means they are in principle excluded from those social sciences that hold themselves to argument via narration. Narration, it is argued, is more interesting and compelling for audiences not well-versed in the general science of statistics. Since many social sciences are interested in discussion of inequality in society, this seems to put these disciplines at odds with each other. Some disciplines, the ones converging now into computational social science, will use these Laws and be correct, but uninteresting. Other disciplines will ignore these laws and be incorrect but more compelling to popular audiences.

This is a disturbing conclusion, one that I believe strikes deeply at the heart of the epistemic crisis affecting politics today. No wonder we have “post-truth” media and “fake news” when our social scientists can’t even bring themselves to accept the inconvenience of learning basic statistics. I’m not speaking out of abstract concern here. I’ve encountered this problem personally and quite dramatically myself through my early dissertation work. Trying to make this very point proved so anathema to the way social sciences have been constructed that I had to abandon the project for lack of comprehending faculty support. This is despite The Law, as Coates refers to it whimsically, being well known and “on the books” for a very, very long time.

It is perhaps disconcerting to social scientists that their fields of expertise may be characterized well by the same kind of laws, grounded in mathematics, that determine chemical interactions and the evolution of biological ecosystems. And indeed there is a strong discourse around downward causation in social systems that discusses the ways in which individuals in society may be different from individual random variables in a large sample. However, a clear understanding of statistical generative processes must be brought to bear on the understanding of social phenomena as a kind of null hypothesis. These statistical laws are due high prior probability, in the Bayesian sense. I hope to discover one day how to formalize this intuitively clear conclusion in more authoritative, mathematical terms.

References

Benthall, S. “Testing Generative Models of Online Collaboration with BigBang.” Proceedings of the 14th Python in Science Conference (2015): 182–189. Available at https://conference.scipy.org/proceedings/scipy2015/sebastian_benthall.html.

Benthall, Sebastian. “Philosophy of computational social science.” Cosmos and History: The Journal of Natural and Social Philosophy 12.2 (2016): 13-30.

Coates, Robert M. 1956. “The Law.” In The World of Mathematics, Vol. 4, edited by James R. Newman, 2268-71. New York: Simon and Schuster.

Clauset, Aaron, Cosma Rohilla Shalizi, and Mark E. J. Newman. “Power-law distributions in empirical data.” SIAM Review 51.4 (2009): 661-703.

Epstein, Joshua M. Generative social science: Studies in agent-based computational modeling. Princeton University Press, 2006.

Miller, John H., and Scott E. Page. Complex adaptive systems: An introduction to computational models of social life. Princeton University Press, 2009.

Sawyer, R. Keith. “Simulating emergence and downward causation in small groups.” Multi-agent-based simulation. Springer Berlin Heidelberg, 2000. 49-67.


July 11, 2017

Why disorganized heavy tail distributions?

July 10, 2017

The Law: Miller and Page on Emergence, and statistics in social science