Digifesto

Tag: philip tetlock

System 2 hegemony and its discontents

Recent conversations have brought me back to the third rail of different modalities of knowledge and their implications for academic disciplines. God help me. The chain leading up to this is: a reminder of how frustrating it was trying to work with social scientists who methodologically reject the explanatory power of statistics, an intellectual encounter with a 20th century “complex systems” theorist who also didn’t seem to understand statistics, and the slow realization that’s been bubbling up for me over the years that I probably need to write an article or book about the phenomenology of probability, because I can’t find anything satisfying about it.

The hypothesis I am now entertaining is that probabilistic or statistical reasoning is the intellectual crux, disciplinarily. What we now call “STEM” is all happy to embrace statistics as its main mode of empirical verification. This includes the use of mathematical proof for “exact” or a priori verification of methods. Sometimes the use of statistics is delayed or implicit; there is qualitative research that is totally consistent with statistical methods. But the key to this whole approach is that the fields, in combination, are striving for consistency.

But not everybody is on board with statistics! Why is that?

One reason may be that statistics is difficult to learn and execute. Doing probabilistic reasoning correctly is at times counter-intuitive, which means that, quite literally, it can make your head hurt to think about it.
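
To make the counter-intuitiveness concrete, here is a minimal sketch (in Python, with illustrative numbers I have made up) of the classic base-rate problem: a test that looks “99% accurate” applied to a rare condition still yields mostly false positives, a result that reliably trips up intuition.

```python
# A minimal sketch of the classic base-rate problem.
# The numbers are illustrative assumptions, not from any particular study.

prevalence = 0.001          # 1 in 1,000 people actually have the condition
sensitivity = 0.99          # P(test positive | condition)
false_positive_rate = 0.01  # P(test positive | no condition)

# Bayes' rule: P(condition | positive) =
#   P(positive | condition) * P(condition) / P(positive)
p_positive = sensitivity * prevalence + false_positive_rate * (1 - prevalence)
p_condition_given_positive = sensitivity * prevalence / p_positive

print(f"P(condition | positive test) = {p_condition_given_positive:.3f}")
# ~0.09: a "99% accurate" test leaves roughly a 9% chance of actually
# having the condition, because true cases are so rare.
```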

There is a lot of very famous empirical cognitive psychology that has explored this topic in depth. The heuristics and biases research program of Kahneman and Tversky was critical for showing that human behavior rarely accords with decision-theoretic models of mathematical, probabilistic rationality. An intuitive, “fast”, prereflective form of thinking (“System 1”) is capable of making snap judgments but is prone to biases such as the availability heuristic and the representativeness heuristic.

A couple of general comments can be made about System 1. (These are taken from Tetlock’s review of this material in Superforecasting.) First, a hallmark of System 1 is that it takes whatever evidence it is working with as given; it never second-guesses it or questions its validity. Second, System 1 is fantastic at providing verbal rationalizations and justifications of anything that it encounters, even when these can be shown to be disconnected from reality. Many colorful studies of split-brain cases, but also many other lab experiments, show the willingness people have to make up stories to explain anything, and their unwillingness to say, “this could be due to one of a hundred different reasons, or a mix of them, and so I don’t know.”

The cognitive psychologists also describe a System 2 cognitive process that is more deliberate and reflective. Presumably, this is the system that is sometimes capable of statistical or otherwise logical reasoning. And a big part of statistical reasoning is questioning the source of your evidence. A robust application of System 2 reasoning is capable of overcoming System 1’s biases. At the level of institutional knowledge creation, the statistical sciences consist mainly of the formalized, shared results of System 2 reasoning.
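
As a toy illustration of what it means to question the source of one’s evidence, the sketch below (Python; the reliability model and all numbers are my own invention, not drawn from the cognitive science literature) runs a Bayesian update in which a report counts for less and less as the source’s reliability drops, until a worthless source leaves the prior untouched.

```python
# A toy model of "questioning the source of your evidence": an observation
# reaches us through a source of reliability r; with probability (1 - r)
# the source's report is noise (a coin flip). All numbers are illustrative.

def posterior(prior, p_e_given_h, p_e_given_not_h, reliability):
    """P(H | source reports the evidence), discounted for source reliability."""
    p_report_h = reliability * p_e_given_h + (1 - reliability) * 0.5
    p_report_not_h = reliability * p_e_given_not_h + (1 - reliability) * 0.5
    numerator = p_report_h * prior
    return numerator / (numerator + p_report_not_h * (1 - prior))

prior = 0.2                              # initial credence in hypothesis H
p_e_given_h, p_e_given_not_h = 0.8, 0.1  # how diagnostic the raw evidence would be

for r in (1.0, 0.7, 0.3, 0.0):
    p = posterior(prior, p_e_given_h, p_e_given_not_h, r)
    print(f"reliability {r:.1f}: P(H | report) = {p:.2f}")
# A fully reliable source moves credence from 0.20 to about 0.67; a worthless
# source moves it nowhere. System 1, which "takes the evidence as given",
# behaves as if r were always 1.
```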

Tetlock’s work, from Expert Political Judgment onward, is remarkable for showing that deference to one or the other cognitive system is to some extent a robust personality trait. Famously, those of the “hedgehog” cognitive style, who apply System 1 and a simplistic theory of the world to interpret everything they experience, are especially bad at predicting the outcomes of political events (which are surely the outputs of ‘complex systems’), whereas the “fox” cognitive style, which is more cautious about considering evidence and coming to judgments, outperforms them. It seems that Tetlock’s analysis weighs in favor of System 2 as a way of navigating complex systems.

I would argue that there are academic disciplines, especially those grounded in Heideggerian phenomenology, that see the “dominance” of institutions (such as academic disciplines) that are based around accumulations of System 2 knowledge as a problem or threat.

This reaction has several different guises:

  • A simple rejection of cognitive psychology, which identified the System 1/System 2 distinction, as “behaviorism”. (This obscures the way cognitive psychology was a major break away from behaviorism in the 1950s.)
  • A call for more “authentic experience”, couched in language suggesting ownership or the true subject of one’s experience, contrasting this with the more alienated forms of knowing that rely on scientific consensus.
  • An appeal to originality: System 2 tends to converge; my System 1 methods can come up with an exciting new idea!
  • The interpretivist methodological mandate for anthropological sensitivity to the “emic”, or directly “lived”, experience of research subjects. This mandate sometimes blurs several individually valid motivations, such as: when emic experience is the subject matter in its own right, but (crucially) with the caveat that the results are not generalizable; when emic sensitivity is identified via the researcher’s reflexivity as a condition for research access; or when the purpose of the work is to surface or represent otherwise underrepresented views.

There are ways to qualify or limit these kinds of methodologies or commitments that make them entirely above reproach. However, under these limits, their conclusions are always fragile. According to the hegemonic logic of System 2 institutions, a consensus of those thoroughly considering the statistical evidence can always supersede the “lived experience” of some group or individual. This is, at the methodological level, simply the idea that while we may make theory-laden observations, when those theories are disproved, those observations are invalidated as having been influenced by erroneous theory. Indeed, mainstream scientific institutions take this kind of procedural objectivity as their duty. There is no such thing as science unless a lot of people are often being proven wrong.

This provokes a great deal of grievance. “Who made scientists, an unrepresentative class of people and machines disconnected from authentic experience, the arbiter of the real? Who are they to tell me I am wrong, or my experiences invalid?” And this is where we start to find trouble.

Perhaps most troubling is how this plays out at the level of psychodynamic politics. To have one’s lived experiences rejected, especially lived experiences of trauma, and especially when those experiences are rejected wrongly, is deeply disturbing. One of the more powerful political tendencies of recent years has been the idea that whole classes of people are systematically subject to this treatment. This is one reason, among others, for influential calls for recalibrating the weight given to the experiences of otherwise marginalized people. This is what Furedi calls the therapeutic ethos of the Left. It is slightly different from, though often conflated with, the idea that recalibration is necessary to let in relevant data that was otherwise being excluded from consideration. This latter consideration comes up in a more managerialist discussion of creating technology that satisfies diverse stakeholders (…customers) through “participatory” design methods. The ambiguity of the term “bias”–does it mean a statistical error, or does it mean any tendency of an inferential system at all?–is sometimes leveraged to accomplish this conflation.

It is in practice very difficult to disentangle the different psychological motivations here. This is partly because they are deeply personal and mixed even at the level of the individual. (Highlighting this is why I have framed this in terms of the cognitive science literature). It is also partly because these issues are highly political as well. Being proven right, or wrong, has material consequences–sometimes. I’d argue: perhaps not as often as it should. But sometimes. And so there’s always a political interest, especially among those disinclined towards System 2 thinking, in maintaining a right to be wrong.

So it is hypothesized (perhaps going back to Lyotard) that at an institutional level there’s a persistent heterodox movement that rejects the ideal of communal intellectual integrity. Rather, it maintains that the field of authoritative knowledge must contain contradictions and disturbances of statistical scientific consensus. In Lyotard’s formulation, this heterodoxy seeks “legitimation by paralogy”, which suggests that its telos is at best a kind of creative intellectual emancipation from restrictive logics, generative of new ideas, but perhaps at worst a heterodoxy for its own sake.

This tendency has an uneasy relationship with the sociopolitical motive of a more integrated and representative society, which is often associated with the goal of social justice. If I understand these arguments correctly, the idea is that, in practice, legitimized paralogy is a way of giving the underrepresented a platform. This has the benefit of visibly increasing representation. Here, paralogy is legitimized as a means of affirmative action, but not as a means of improving system performance objectively.

This is a source of persistent difficulty and unease, as the paralogical tendency is never capable of truly emancipating itself, but rather, in its recuperated form, is always-already embedded in a hierarchy that it must deny to its initiates. Authenticity is subsumed, via agonism, to a procedural objectivity that proves it wrong.

computational institutions as non-narrative collective action

Nils Gilman recently pointed to a book chapter that confirms the need for “official futures” in capitalist institutions.

Nils indulged me in a brief exchange that helped me better grasp at a bothersome puzzle.

There is a certain class of intellectuals that insist on the primacy of narratives as a mode of human experience. These tend to be, not too surprisingly, writers and other forms of storytellers.

There is a different class of intellectuals that insists on the primacy of statistics. Statistics does not make it easy to tell stories because it is largely about the complexity of hypotheses and our lack of confidence in them.

The narrative/statistic divide could be seen as a divide between academic disciplines. It has often been taken to be, I believe wrongly, the crux of the “technology ethics” debate.

I questioned Nils as to whether his generalization stood up to statistically driven allocation of resources, i.e., decisions made explicitly on the basis of probabilistic judgments. He argued that, in the end, management and collective action require consensus around narrative.

In other words, what keeps narratives at the center of human activity is that (a) humans are in the loop, and (b) humans are collectively in the loop.

The idea that communication is necessary for collective action is one I used to put great stock in when studying Habermas. For Habermas, consensus, and especially linguistic consensus, is how humanity moves together. He contrasted this mode of knowledge, aimed at consensus and collective action, with technical knowledge, which is aimed at efficiency. Habermas envisioned a society ruled by communicative rationality and deliberative democracy; following this line of reasoning, that communicative rationality would need to be a narrative rationality. Even if this rationality is not universal, it might, in Habermas’s later conception of governance, be shared by a responsible elite. Lawyers and a judiciary, for example.

The puzzle that recurs again and again in my work has been the challenge of communicating how technology has become an alternative form of collective action. The claim made by some that technologists are a social “other” makes more sense if one sees them (us) as organizing around non-narrative principles of collective behavior.

It is, I believe, beyond serious dispute that well-constructed, statistically based collective decision-making processes perform better than many alternatives. In the field of future predictions, Philip Tetlock’s work on superforecasting teams, and his prior work on expert political judgment, has long stood as an empirical challenge to the supposed primacy of narrative-based forecasting. That challenge has largely not been taken up; the contest seems rather one-sided. One reason for this may be that the rationale for the effectiveness of these techniques rests ultimately in the science of statistics.
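
As a toy illustration of what “statistically based collective decision-making” can mean here, the sketch below (Python, with invented forecasts) pools a small team’s probability judgments by simple averaging and scores everything with the Brier score, the accuracy measure Tetlock reports (shown in its simple single-probability form). This is not the Good Judgment Project’s actual aggregation algorithm, just the basic logic.

```python
# A toy sketch of statistically based collective forecasting: pool several
# probability forecasts by averaging, then score forecasts against outcomes
# with a Brier score. All numbers below are invented for illustration.

def brier(forecasts, outcomes):
    """Mean squared error between probability forecasts and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)

# Three hypothetical forecasters' probabilities for four yes/no questions.
team = [
    [0.9, 0.3, 0.6, 0.2],
    [0.7, 0.1, 0.8, 0.4],
    [0.8, 0.5, 0.5, 0.1],
]
outcomes = [1, 0, 1, 0]  # what actually happened

pooled = [sum(fs) / len(fs) for fs in zip(*team)]  # simple unweighted average

for i, forecasts in enumerate(team):
    print(f"forecaster {i}: Brier = {brier(forecasts, outcomes):.3f}")
print(f"pooled team:  Brier = {brier(pooled, outcomes):.3f}")
# Because squared error is convex, the pooled forecast's Brier score is never
# worse than the average of the individual scores (Jensen's inequality).
```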

It is now common to insist that Artificial Intelligence should be seen as a sociotechnical system and not as a technological artifact. I wholeheartedly agree with this position. However, it is sometimes implied that to understand AI as a sociotechnical system, one must understand it in narrative terms. This is an error; it would imply that the collective actions taken to build an AI system, and the technology itself, are held together by narrative communication.

But if the whole purpose of building an AI system is to act collectively in a way that is more effective because of its facility with the nuances of probability, then the narrative lens will miss the point. The promise and threat of AI is that it delivers a different, often more effective, form of collective or institution. I’ve suggested that computational institution might be the best way to refer to such a thing.

Notes on Clark Kerr’s “The ‘City of Intellect’ in a Century for Foxes?”, in The Uses of the University 5th Edition

I am in my seventh and absolutely, definitely last year of a doctoral program and so have many questions about the future of higher education and whether or not I will be a part of it. For insight, I have procured an e-book copy of Clark Kerr’s The Uses of the University (5th Edition, 2001). Clark Kerr was president of the University of California system and became famous, among other things, for his candid comments on university administration, which included such gems as

“I find that the three major administrative problems on a campus are sex for the students, athletics for the alumni and parking for the faculty.”

…and…

“One of the most distressing tasks of a university president is to pretend that the protest and outrage of each new generation of undergraduates is really fresh and meaningful. In fact, it is one of the most predictable controversies that we know. The participants go through a ritual of hackneyed complaints, almost as ancient as academe, while believing that what is said is radical and new.”

The Uses of the University is a collection of lectures on the topic of the university, most of which were given in the second half of the 20th century. The most recent edition contains a lecture given in the year 2000, after Kerr had retired from administration, anticipating the future of the university in the 21st century. The title of the lecture is “The ‘City of Intellect’ in a Century for Foxes?”, and it is encouragingly candid and prescient.

To my surprise, Kerr approaches the lecture as a forecasting exercise. Intriguingly, Kerr employs the hedgehog/fox metaphor from Isaiah Berlin in a lecture about forecasting five years before the publication of Tetlock’s 2005 book Expert Political Judgment, which used the fox/hedgehog distinction to cluster properties that were correlated with political experts’ predictive power. Kerr’s lecture is structured partly as the description of a series of future scenarios, reminiscent of scenario planning as a forecasting method. I didn’t expect any of this, and it goes to show perhaps how pervasive scenario thinking was as a 20th century rhetorical technique.

Kerr makes a number of warnings about the university in the 21st century, especially in contrast with its glory in the 20th century. He makes a historical case for this: universities in the 20th century thrived on new universal access to students, federal investment in universities as the sites of basic research, and general economic prosperity. He doesn’t see these as guaranteed in the 21st century, though he also makes the point that in official situations, the only thing a university president should do is discuss the past with pride and the future with apprehension. He has a rather detailed analysis of the incentives guiding this rhetorical strategy as part of the lecture, which makes you wonder how much salt to take the rest of the lecture with.

What are the warnings Kerr makes? Some are a continuation of the problems universities experienced in the 20th century. Military and industrial research funding changed the role of universities, pushing them away from liberal arts education and toward the research shop. This was not a neutral process. Undergraduate education suffered, and in 1963 Kerr predicted that this slackening of the quality of undergraduate education would lead to student protests. He was half right; students instead turned their attention externally, to politics. Under these conditions, there grew to be a great tension between the “internal justice” of a university that attempted to maintain equality among its faculty and the permeation of external forces that made more of the professoriate face outward. A period of attempted reforms through “participatory democracy” was “a flash in the pan”, resulting mainly in “the creation of courses celebrating ethnic, racial, and gender diversities.” “This experience with academic reform illustrated how radical some professors can be when they look at the external world and how conservative when they look inwardly at themselves–a split personality”.

This turn to industrial and military funding and the shift of universities away from training in morality (theology), traditional professions (medicine, law), self-chosen intellectual interest for its own sake, and entrance into elite society towards training for the labor force (including business administration and computer science) is now quite old–at least 50 years. Among other things, Kerr predicts, this means that we will be feeling the effects of the hollowing out of the education system that happened as higher education deprioritized teaching in favor of research. The baby boomers who went through this era of vocational university education become, in Kerr’s analysis, an enormous class of retirees by 2030, putting new strain on the economy at large. Meanwhile, without naming computers and the Internet, Kerr acknowledged that the “electronic revolution” is the first major change to affect universities for three hundred years, and could radically alter their role in society. He speaks highly of Peter Drucker, who in 1997 was already calling the university “a failure” that would be made obsolete by long-distance learning.

An intriguing comment on aging baby boomers, which Kerr discusses under the heading “The Methuselah Scenario”, is that the political contest between retirees and new workers may break down partly along racial lines: “Nasty warfare may take place between the old and the young, parents and children, retired Anglos and labor force minorities.” Almost twenty years later, this line makes me wonder how much current racial tensions are connected to age and aging. Have we seen the baby boomer retirees rise as a political class to vigorously defend the welfare state from plutocratic sabotage? Will we?

Kerr discusses the scenario of the ‘disintegration of the integrated university’. The old model of medicine, agriculture, and law integrated into one system is coming apart as external forces become controlling factors within the university. Kerr sees this in part as a source of ethical crises for universities.

“Integration into the external world inevitably leads to disintegration of the university internally. What are perceived by some as the injustices in the external labor market penetrate the system of economic rewards on campus, replacing policies of internal justice. Commitments to external interests lead to internal conflicts over the impartiality of the search for truth. Ideologies conflict. Friendships and loyalties flow increasingly outward. Spouses, who once held the academic community together as a social unit, now have their own jobs. “Alma Mater Dear” to whom we “sing a joyful chorus” becomes an almost laughable idea.”

A factor in this disintegration is globalization, which Kerr identifies with the mobility of those professors who are most able to get external funding. These professors have increased bargaining power and can use “the banner of departmental autonomy” to fight among themselves for industrial contracts. Without oversight mechanisms, “the university is helpless in the face of the combined onslaught of aggressive industry and entrepreneurial faculty members”.

Perhaps most fascinating for me, because it resonates with some of my more esoteric passions, is Kerr’s section on “The fractionalization of the academic guild”. Subject matter interest breaks knowledge into tiny disconnected topics: “Once upon a time, the entire academic enterprise originated in and remained connected to philosophy.” The tension between “internal justice” and the “injustices of the external labor market” creates a conflict over monetary rewards. Poignantly, “fractionalization also increases over differing convictions about social justice, over whether it should be defined as equality of opportunity or equality of results, the latter often taking the form of equality of representation. This may turn out to be the penultimate ideological battle on campus.”

And then:

“The ultimate conflict may occur over models of the university itself, whether to support the traditional or the “postmodern” model. The traditional model is based on the enlightenment of the eighteenth century–rationality, scientific processes of thought, the search for truth, objectivity, “knowledge for its own sake and for its practical applications.” And the traditional university, to quote the Berkeley philosopher John Searle, “attempts to be apolitical or at least politically neutral.” The university of postmodernism thinks that all discourse is political anyway, and it seeks to use the university for beneficial rather than repressive political ends… The postmodernists are attempting to challenge certain assumptions about the nature of truth, objectivity, rationality, reality, and intellectual quality.”

“… Any further politicization of the university will, of course, alienate much of the public at large. While most acknowledge that the traditional university was partially politicized already, postmodernism will further raise questions of whether the critical function of the university is based on political orientation rather than on nonpolitical scientific analysis.”

I could go on endlessly about this topic; I’ll try to be brief. First, as per Lyotard’s early analysis of the term, postmodernism is as much a result of the permeation of the university by industrial interests as anything else. Second, we are seeing, right now in Congress and on the news, the eroded trust that a large portion of the public has in university “expertise”, as they assume (having perhaps internalized a reductive version of the postmodern message, despite or maybe because of the fact that they were taught by teaching assistants instead of professors) that the professoriate is politically biased. And now the students are in revolt over Free Speech again as a result.

Kerr entertains for a paragraph the possibility of a Hobbesian doomsday free-for-all over the university before considering more mundane possibilities such as a continuation of the status quo. Adapting to new telecommunications (including “virtual universities”), new amazing discoveries in biological sciences, and higher education as a step in mid-career advancement are all in Kerr’s more pragmatic view of the future. The permeability of the university can bring good as well as bad as it is influenced by traffic back and forth across its borders. “The drawbridge is now down. Who and what shall cross over it?”

Kerr counts five major wildcards determining the future of the university. The first is overall economic productivity; the second is fluctuations in the returns to higher education. The third is the United States’ role in the global economy, “as other nations or unions of nations (for example, the EU) may catch up with and even surpass it. The quality of education and training for all citizens will be [central] to this contest. The American university may no longer be supreme.” The fourth is student unrest turning universities into the “independent critic”. And the fifth is the battles within the professoriate, “over academic merit versus social justice in treatment of students, over internal justice in the professional reward system versus the pressures of external markets, over the better model for the university–modern or post-modern.”

He concludes with three wishes for the open-minded, cunning, savvy administrator of the future, the “fox”:

  1. Careful study of new information technologies and their role.
  2. “An open, in-depth debate…between the proponents of the traditional and the postmodern university instead of the sniper shots of guerilla warfare…”
  3. An “in-depth discussion…about the ethical systems of the future university”. “Now the ethical problems are found more in the flow of contacts between the academic and the external worlds. There have never been so many ethical problems swirling about as today.”

Loving Tetlock’s Superforecasting: The Art and Science of Prediction

I was a big fan of Philip Tetlock’s Expert Political Judgment (EPJ). I read it thoroughly; in fact a book review of it was my first academic publication. It was very influential on me.

EPJ is a book that is troubling to many political experts because it basically says that most so-called political expertise is bogus, and that what isn’t bogus is fairly limited. It makes this argument with far more meticulous data collection and argumentation than I am able to do justice to here. I found it completely persuasive and inspiring. It wasn’t until I got to Berkeley that I met people who had vivid negative emotional reactions to this work. They seem mainly to have been political experts who do not like having their expertise assessed in terms of its predictive power.

Superforecasting: The Art and Science of Prediction (2016) is a much more accessible book that summarizes the main points from EPJ and then discusses the results of Tetlock’s Good Judgment Project, which was his answer to an IARPA challenge in forecasting political events.

Much of the book is an interesting history of the United States Intelligence Community (IC) and the way its attitudes towards political forecasting have evolved. In particular, the shock of the failure of the predictions around Weapons of Mass Destruction that led to the Iraq War was a direct cause of IARPA’s interest in forecasting and its funding of the Good Judgment Project, despite the possibility that the project’s results would be politically challenging. IARPA comes out looking like a very interesting and intellectually honest organization solving real problems for the people of the United States.

Reading this has been timely for me because: (a) I’m now doing what could be broadly construed as “cybersecurity” work, professionally, (b) my funding is coming from U.S. military and intelligence organizations, and (c) the relationship between U.S. intelligence organizations and cybersecurity has been in the news a lot lately in a very politicized way because of the DNC hacking aftermath.

Since so much of Tetlock’s work is really just about applying mathematical statistics to the psychological and sociological problem of developing teams of forecasters, I see the root of it as the same mathematical theory one would use for any scientific inference. Cybersecurity research, to the extent that it uses sound scientific principles (which it must, since it’s all about the interaction between society, scientifically designed technology, and risk), is grounded in these same principles. And at its best the U.S. intelligence community lives up to this logic in its public service.

The needs of the intelligence community with respect to cybersecurity can be summed up in one word: rationality. Tetlock’s work is a wonderful empirical study in rationality that’s a must-read for anybody interested in cybersecurity policy today.