Digifesto

Imre Lakatos and programming as dialectic

My dissertation is about the role of software in scholarly communication. Specifically, I’m interested in the way software code is itself a kind of scholarly communication, and how the informal communications around software production represent and constitute communities of scientists. I see science as a cognitive task accomplished by the sociotechnical system of science, including both scientists and their infrastructure. Looking particularly at scientists’ use of communications infrastructure such as email, issue trackers, and version control, I hope to study the mechanisms of the scientific process much like a neuroscientist studies the mechanisms of the mind by studying neural architecture and brainwave activity.

To get a grip on this problem I’ve been building BigBang, a tool for collecting data from open source projects and readying it for scientific analysis.

I have also been reading background literature to give my dissertation work theoretical heft and to procrastinate from coding. This is why I have been reading Imre Lakatos’ Proofs and Refutations (1976).

Proofs and Refutations is a brilliantly written book about the history of mathematical proof. In particular, it is an analysis of informal mathematics through an investigation of the letters written by mathematicians working on proofs about the Euler characteristic of polyhedra in the 18th and 19th centuries.

In the early 20th century, formal logic was axiomatized through the work of Russell, Whitehead, and others; prior to this, mathematical argumentation had less formal grounding. As a result, mathematicians would argue not just substantively about the theorem they were trying to prove or disprove, but also about what constitutes a proof, a conjecture, or a theorem in the first place. Lakatos demonstrates this by condensing 200+ years of scholarly communication into a fictional, impassioned classroom dialog where characters representing mathematicians throughout history banter about polyhedra and proof techniques.

What’s fascinating is how convincingly Lakatos presents the progress of mathematical understanding as an example of dialectical logic. Though he doesn’t use the word “dialectical” as far as I’m aware, he tells the story of the informal logic of pre-Russellian mathematics through dialog. But this dialog is designed to capture the timeless logic behind what’s been said before. It takes the reader through the thought process of mathematical discovery in abbreviated form.

I’ve had conversations with serious historians and ethnographers of science who would object strongly to the idea of a history of a scientific discipline reflecting a “timeless logic”. Historians are apt to think that nothing is timeless. I’m inclined to think that the objectivity of logic persists over time much the same way that it persists over space and between subjects, even illogical ones, hence its power. These are perhaps theological questions.

What I’d like to argue (but am not sure how) is that the process of informal mathematics presented by Lakatos is strikingly similar to that used by software engineers. The process of selecting a conjecture, then of writing a proof (which for Lakatos is a logical argument whether or not it is sound or valid), then having it critiqued with counterexamples, which may either be global (counter to the original conjecture) or local (counter to a lemma), then modifying the proof, then perhaps starting from scratch based on a new insight… all this reads uncannily like the process of debugging source code.
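
To make the parallel concrete, here is a toy sketch of the Lakatosian cycle in Python. This is my own invented example, not anything from the book beyond its famous case: Euler’s conjecture, a global counterexample (the “picture frame” polyhedron), and lemma incorporation as a patch, which plays out exactly like a failing test and a bug fix.

```python
def euler_characteristic(V, E, F):
    return V - E + F

# Conjecture: for all polyhedra, V - E + F == 2.
def conjecture(p):
    return euler_characteristic(p["V"], p["E"], p["F"]) == 2

cube = dict(V=8, E=12, F=6, simply_connected=True)
picture_frame = dict(V=16, E=32, F=16, simply_connected=False)  # a torus

assert conjecture(cube)
assert not conjecture(picture_frame)  # global counterexample: the "bug"

# Lemma incorporation: restrict the claim's domain to exclude the
# counterexample, just as one patches code after a failing test.
def conjecture_v2(p):
    return not p["simply_connected"] or conjecture(p)

assert conjecture_v2(cube) and conjecture_v2(picture_frame)
```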

The argument for this correspondence is strengthened by later work in the theory of computation and complexity theory. I learned this theory so long ago I forget whom to attribute it to, but much of the foundational work in computer science was the establishment of a correspondence between classes of formal logic and classes of programming languages. So in a sense it’s uncontroversial within computer science to consider programs to be proofs.
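
The result I have in mind is usually credited to Curry and Howard: a program inhabiting a type is a proof of the proposition that the type encodes. Python’s type hints are far too weak to enforce this, but they can at least illustrate the shape of the correspondence (a sketch, not a proof system):

```python
from typing import Callable, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")

def modus_ponens(f: Callable[[A], B], a: A) -> B:
    # From "A implies B" and "A", conclude "B": applying the function
    # is performing the inference.
    return f(a)

def compose(f: Callable[[A], B], g: Callable[[B], C]) -> Callable[[A], C]:
    # Transitivity of implication: (A -> B) and (B -> C) yield (A -> C).
    return lambda a: g(f(a))
```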

As I write I am unsure whether I’m simply restating what’s obvious to computer scientists in an antiquated philosophical language (a danger I feel every time I read a book, lately) or if I’m capturing something that could be an interesting synthesis. But my point is this: if programming language design and the construction of progressively more powerful software libraries are akin to the expansion of formal mathematical knowledge from axiomatic grounds, then the act of programming itself is much more like the informal mathematics of the pre-Russellian mathematicians: unaxiomatic, with proofs in play without necessarily being sound. When we use a software system, we are necessarily depending on a system of imperfect proofs that we fix iteratively through discovered counterexamples (bugs).

Is it fair to say, then, that whereas the logic of software is formal, deductive logic, the logic of programming is dialectical logic?

Bear with me; let’s presume it is. That’s a foundational idea of my dissertation work. Proving or disproving it may or may not be out of scope of the dissertation itself, but it’s where it’s ultimately headed.

The question is whether it is possible to develop a formal understanding of dialectical logic through a scientific analysis of software collaboration (see a mathematical model of collective creativity). If this could be done, we could then build better software or protocols to assist this dialectical process.

Discourse theory of law from Habermas

There has been at least one major gap in my understanding of Habermas’s social theory which I’m just filling now. The position Habermas reaches toward the end of Theory of Communicative Action vol. 2 and develops further in Between Facts and Norms (1992) is the discourse theory of law.

What I think went on is that Habermas eventually gave up on deliberative democracy in its purest form. After a career of scholarship on the public sphere, the ideal speech situation, and communicative action, fully developing the lifeworld as the ground for legitimate norms, he eventually had to make a concession to “the steering media” of money and power as necessary for the organization of society at scale. But at the intersection between lifeworld and system is law. Law serves as a transmission belt between the legitimate norms established by civil society and “system”; at its best it is both efficacious and legitimate.

Law is ambiguous: it can serve legitimate citizen interests united in communicative solidarity, and it can also serve entrenched, powerful interests. But it’s where the action is, because it’s where Habermas sees the ability of the lifeworld to counter-steer the whole political apparatus toward legitimacy, including shifting the balance of power between lifeworld and system.

This is interesting because:

  • Habermas is like the last living heir of the Frankfurt School mission and this is a mature and actionable view nevertheless founded in the Critical Theory tradition.
  • If you pair it with Lessig’s Code is Law thesis, you get a framework for thinking about how technical mediation of civil society can be legitimate but also efficacious. I.e., code can be legitimized discursively through communicative action. Arguably, this is how a lot of open source communities work, as well as standards bodies.
  • Thinking about managerialism as a system of centralized power that provides a framework of freedoms within it, Habermas seems to be presenting an alternative model where law or code evolves with the direct input of civil stakeholders. I’m fascinated by where Nick Doty’s work on multistakeholderism in the W3C is going and think there’s an alternative model in there somewhere. There’s a deep consistency in this, noted a while ago (2003) by Froomkin but largely unacknowledged as far as I can tell in the Data and Society or Berkman worlds.

I don’t see in Habermas anything about funding the state. That would mean acknowledging military force and the power to tax. But this is progress for me.

References

Zurn, Christopher. “Discourse Theory of Law.” In Jürgen Habermas: Key Concepts, edited by Barbara Fultner.

Some research questions

Last week was so interesting. Some weeks you just get exposed to so many different ideas that it’s hard to integrate them. I’ve tried to articulate what’s been coming up as a result: several difficult questions.

  • Assuming trust is necessary for effective context management, how does one organize sociotechnical systems to provide social equity in a sustainable way?
  • Assuming an ecology of scientific practices, what are appropriate selection mechanisms (or criteria)? Are they transcendent or immanent?
  • Given the contradictory character of emotional reality, how can psychic integration occur without rendering one dead or at least very boring?
  • Are there limitations of the computational paradigm imposed by data science as an emerging pan-constructivist practice coextensive with the limits of cognitive or phenomenological primitives?

Some notes:

  • I think that two or three of the questions above may be in essence the same question, in that they can be formalized into the same mathematical problem, with the same solution in each case.
  • I really do have to read Isabelle Stengers and Nancy Nersessian. Based on the signals I’m getting, they seem to be the people most on top of their game in terms of understanding how science happens.
  • I’ve been assuming that trust relations are interpersonal but I suppose they can be interorganizational as well, or between a person and an organization. This gets back to a problem I struggle with in a recurring way: how do you account for causal relationships between a macro-organism (like an organization or company) and a micro-organism? I think it’s when there are entanglements between these kinds of entities that we are inclined to call something an “ecosystem”, though I learned recently that this use of the term bothers actual ecologists (no surprise there). The only things I know about ecology are from reading Ulanowicz papers, but those have been so on point and beautiful that I feel I can proceed with confidence anyway.
  • I don’t think there’s any way to get around having at least a psychological model to work with when looking at these sorts of things. A recurring and promising angle is that of psychic integration. Carl Jung, who has inspired clinical practices that I can personally vouch for, and Gregory Bateson both understood the goal of personal growth to be the integration of disparate elements. I’ve learned recently from Turner’s The Democratic Surround that Bateson was a more significant historical figure than I thought, unless Turner’s account of history is a glorification of intellectuals who appeal to him, which is entirely possible. Perhaps more importantly to me, Bateson inspired Ulanowicz, and so these theories are compatible; Bateson was also a cyberneticist following Wiener, who was prescient and either foundational to contemporary data science or a good articulator of its roots. But there is also a tie-in to constructivist epistemology. DiSessa’s epistemology, building on Piaget but embracing what he calls the computational metaphor, understands the learning of math and physics as the integration of phenomenological primitives.
  • The purpose of all this is ultimately protocol design.
  • This does not pertain directly to my dissertation, though I think it’s useful orienting context.


Notes on The Democratic Surround; managerialism

I’ve been greatly enjoying Fred Turner’s The Democratic Surround partly because it cuts through a lot of ideological baggage with smart historical detail. It marks a turn, perhaps, in what intellectuals talk about. The critical left has been hung up on neoliberalism for decades while the actual institutions that are worth criticizing have moved on. It’s nice to see a new name for what’s happening. That new name is managerialism.

Managerialism is a way to talk about what Facebook and the Democratic Party and everybody else providing a highly computationally tuned menu of options is doing without making the mistake of using old metaphors of control to talk about a new thing.

Turner is ambivalent about managerialism, perhaps because he’s at Stanford and so occupies an interesting position in the grand intellectual matrix. He’s read his Foucault, he explains when he speaks in public, though he is sometimes criticized for not being critical enough. I think ‘critical’ intellectuals may find him confusing because he’s not deploying the same ‘critical’ tropes that have been used since Adorno even though he’s writing sometimes about Adorno. He is optimistic, or at least writes optimistically about the past, or at least writes about the past in a way that isn’t overtly scathing, which is just more upbeat than a lot of writing nowadays.

Managerialism is, roughly, the idea of a technocratically bounded space of complex interactive freedom as a principle of governance or social organization. In The Democratic Surround, Turner provides a historical analysis of a Bauhaus-initiated multimedia curation format, the ‘surround’, to represent managerialist democracy in the same way Foucault provided a historical analysis of the Panopticon to represent surveillance. He is attempting to implant a new symbol into the vocabulary of political and social thinkers that we can use to understand the world around us, while giving it a rich and subtle history that expands our sense of its possibilities.

I’m about halfway through the book. I love it. If I have a criticism, it’s that everything in it is a managerialist surround and sometimes his arguments seem a bit stretched. For example, here’s his description of how John Cage’s famous 4’33” is a managerialist surround:

With 4’33”, as with Theater Piece #1, Cage freed sounds, performers, and audiences alike from the tyrannical wills of musical dictators. All tensions–between composer, performer, and audience; between sound and music; between the West and the East–had dissolved. Even as he turned away from what he saw as more authoritarian modes of composition and performance, though, Cage did not relinquish all control of the situation. Rather, he acted as an aesthetic expert, issuing instructions that set the parameters for action. Even as he declined the dictator’s baton, Cage took up a version of the manager’s spreadsheet and memo. Thanks to his benevolent instructions, listeners and music makers alike became free to hear the world as it was and to know themselves in that moment. Sounds and people became unified in their diversity, free to act as they liked, within a distinctly American musical universe–a universe finally freed of dictators, but not without order.

I have two weaknesses as a reader. One is a soft spot for wicked vitriol. Another is an intolerance of rhetorical flourish. The above paragraph is rhetorical flourish that doesn’t make sense. Saying that 4’33” is a manager’s spreadsheet is just about the most nonsensical metaphor I could imagine. In a universe with only fascists and managerialists, I guess 4’33” is more like a memo. But there are so many more apt musical metaphors for unification in diversity in music. For example, a blues or jazz band playing a standard. Literally any improvisational musical form. No less quintessentially American.

If you bear with me and agree that this particular point is poorly argued and that John Cage wasn’t actually a managerialist and was in fact the Zen spiritualist that he claimed to be in his essays, then either Turner is equating managerialism with Zen spiritualism or Turner is trying to make Cage a symbol of managerialism for his own ideological ends.

Either of these is plausible. Steve Jobs was an I Ching enthusiast like Cage. Stewart Brand, the subject of Turner’s last book, From Counterculture to Cyberculture, was a back-to-the-land commune enthusiast before he became a capitalist digerati hero. Running through Turner’s work is the demonstration of the cool origins of today’s world that’s run by managerialist power. We are where we are today because democracy won against fascism. We are where we are today because hippies won against whoever. Sort of. Turner is also frank about capitalist recuperation of everything cool. But this is not so bad. Startups are basically like co-ops: worker-owned until the VCs get too involved.

I’m a tech guy, sort of. It’s easy for me to read my own ambivalence about the world we’re in today into Turner’s book. I’m cool, right? I like interesting music and read books on intellectual history and am tolerant of people despite my connections to power, right? Managers aren’t so bad. I’ve been a manager. They are necessary. Sometimes they are benevolent and loved. That’s not bad, right? Maybe everything is just fine because we have a mode of social organization that just makes more sense now than what we had before. It’s a nice happy medium between fascism, communism, anarchism, and all the other extreme -ism’s that plagued the 20th century with war. People used to starve to death or kill each other en masse. Now they complain about bad management or, more likely, bad customer service. They complain as if the bad managers are likely to commit a war crime at any minute but that’s because their complaints would sound so petty and trivial if they were voiced without the use of tropes that let us associate poor customer service with deliberate mind-control propaganda or industrial wage slavery. We’ve forgotten how to complain in a way that isn’t hyperbolic.

Maybe it’s the hyperbole that’s the real issue. Maybe a managerialist world lacks catastrophe and so is so frickin’ boring that we just don’t have the kinds of social crises that a generation of intellectuals trained in social criticism have been prepared for. Maybe we talk about how things are “totally awesome!” and totally bad because nothing really is that good or that bad and so our field of attention has contracted to the minute, amplifying even the faintest signal into something significant. Case in point, Alex from Target. Under well-tuned managerialism, the only thing worth getting worked up about is that people are worked up about something. Even if it’s nothing. That’s the news!

So if there’s a critique of managerialism, it’s that it renders the managed stupid. This is a problem.

textual causation

A problem that’s coming up for me as a data scientist is the problem of textual causation.

There has been significant interesting research into the problem of extracting causal relationships between things in the world from text about those things. That’s an interesting problem but not the problem I am talking about.

I am talking about the problem of identifying when a piece of text has been the cause of some event in the world. So, did the State of the Union address affect the stock prices of U.S. companies? Specifically, did the text of the State of the Union address affect the stock price? Did my email cause my company to be more productive? Did specifically what I wrote in the email make a difference?

A trivial example of textual causation (if I have my facts right–maybe I don’t) is the calculation of Twitter trending topics. Millions of users write text. That text is algorithmically scanned and under certain conditions, Twitter determines a topic to be trending and displays it to more users through its user interface, which also uses text. The user interface text causes thousands more users to look at what people are saying about the topic, increasing the causal impact of the original text. And so on.
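
For illustration, here is a minimal sketch of such a mechanism. This is my guess at the general shape of trend detection, not Twitter’s actual algorithm: a topic “trends” when its rate in the current window spikes well above its rate in a baseline window.

```python
from collections import Counter

def trending(current_window, baseline_window, spike_factor=3.0, min_count=5):
    now, before = Counter(current_window), Counter(baseline_window)
    total_now = sum(now.values()) or 1
    total_before = sum(before.values()) or 1
    topics = []
    for topic, count in now.items():
        rate_now = count / total_now
        rate_before = (before[topic] + 1) / total_before  # add-one smoothing
        if count >= min_count and rate_now / rate_before >= spike_factor:
            topics.append(topic)
    return topics

print(trending(["#rain"] * 40 + ["#lunch"] * 10,
               ["#rain"] * 5 + ["#lunch"] * 50))
# ['#rain'] -- #rain spiked relative to its baseline; #lunch did not
```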

These are some challenges to understanding the causal impact of text:

  • Text is an extraordinarily high-dimensional space with tremendous irregularity in the distribution of features (see the toy demonstration after this list).
  • Textual events are unique not just because the probability of any particular utterance is so low, but also because the context of an utterance is informed by all the text prior to it.
  • For the most part, text is generated by a process of unfathomable complexity and interpreted likewise.
  • A single ‘piece’ of text can appear and reappear in multiple contexts as distinct events.
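
As a toy demonstration of the first point (an invented three-document corpus, nothing more): the dimensionality of the feature space approaches the number of tokens, and a handful of words dominate while most occur only once.

```python
from collections import Counter

corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "colorless green ideas sleep furiously",
]

vocab = Counter(w for doc in corpus for w in doc.split())
print(len(vocab), "dimensions for", sum(vocab.values()), "tokens")
print(vocab.most_common(3))  # a few words dominate; the rest appear once
```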

I am interested in whether it is possible to get a grip on textual causation mathematically and with machine learning tools. Bayesian methods theoretically can help with the prediction of unique events. And the Pearl/Rubin model of causation is well integrated with Bayesian methods. But is it possible to use the Pearl/Rubin model to understand unique events? The methodological uses of Pearl/Rubin I’ve seen are all about establishing type causation between independent occurrences. Textual causation appears to be as a rule a kind of token causation in a deeply integrated contextual web.
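
To make the contrast concrete, here is a simulated type-causation sketch in the Pearl/Rubin spirit. Everything in it is invented for illustration: a binary text feature T (say, whether an email mentions a deadline), an outcome Y (a reply), and a confounder Z (the sender’s seniority). Backdoor adjustment recovers the average effect of the feature type; it says nothing about what one particular email, with these particular words, caused.

```python
import random

random.seed(0)

def simulate():
    z = random.random() < 0.5                      # confounder
    t = random.random() < (0.8 if z else 0.2)      # seniors mention deadlines more
    y = random.random() < 0.3 + 0.2 * t + 0.3 * z  # true effect of t is 0.2
    return z, t, y

data = [simulate() for _ in range(100_000)]

def rate(rows):
    return sum(y for _, _, y in rows) / len(rows)

# Naive contrast is confounded by Z and overstates the effect.
naive = rate([r for r in data if r[1]]) - rate([r for r in data if not r[1]])

# Backdoor adjustment: average the within-stratum contrasts, weighted by
# P(Z = z), which is 0.5 by construction here.
adjusted = sum(
    0.5 * (rate([r for r in data if r[0] == z and r[1]]) -
           rate([r for r in data if r[0] == z and not r[1]]))
    for z in (True, False)
)
print(round(naive, 3), round(adjusted, 3))  # roughly 0.38 vs. 0.2
```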

Perhaps this is what makes the study of textual causation uninteresting. If it does not generalize, then it is difficult to monetize. It is a matter of historical or cultural interest.

But think about all the effort that goes into communication at, say, the operational level of an organization. How many jobs require “excellent communication skills.” A great deal of emphasis is placed not only on whether communication happens, but on how people communicate.

One way to approach this is using the tools of linguistics. Linguistics looks at speech and breaks it down into components and structures that can be scientifically analyzed. It can identify when there are differences in these components and structures, calling these differences dialects or languages.

analysis of content vs. analysis of distribution of media

A theme that keeps coming up for me in work and conversation lately is the difference between analysis of the content of media and analysis of the distribution of media.

Analysis of content looks for the tropes, motifs, psychological intentions, unconscious historical influences, etc. of the media. Over Thanksgiving a friend of mine was arguing that the Scorpions were a dog whistle to white listeners because that band made a deliberate move to distance themselves from the influence of black music on rock. Contrast this with Def Leppard. He reached this conclusion by listening carefully to the beats and contextualizing them in historical conversations that were happening at the time.

Analysis of distribution looks at information flow and the systemic channels that shape it. How did the telegraph change patterns of communication? How did television? Radio? The Internet? Google? Facebook? Twitter? Ello? Who is paying for the distribution of this media? How far does the signal reach?

Each of these views is incomplete. Just as data underdetermines hypotheses, media underdetermines its interpretation. In both cases, a more complete understanding of the etiology of the data/media is needed to select between competing hypotheses. We can’t truly understand content unless we understand the channels through which it passes.

Analysis of distribution is more difficult than analysis of content because distribution is less visible. It is much easier to possess and study data/media than it is to possess and study the means of distribution. The means of distribution are a kind of capital. Those that study it from the outside must work hard to get anything better than a superficial view of it. Those on the inside work hard to get a deep view of it that stays up to date.

Part of the difficulty of analysis of distribution is that the system of distribution depends on the totality of information passing through it. Communication involves the dynamic engagement of both speakers and an audience. So a complete analysis of distribution must include an analysis of content for every piece of implicated content.

One thing that makes the content analysis necessary for analysis of distribution more difficult than what passes for content analysis simpliciter is that the former needs to take into account incorrect interpretation. Suppose you were trying to understand the popularity of Fascist propaganda in pre-WWII Germany and were interested in how the state owned the mass media channels. You could initially base your theory simply on how people were getting bombarded by the same information all the time. But you would at some point need to consider how the audience was reacting. Was it stirring feelings of patriotic national identity? Did they experience communal feelings with others sharing similar opinions? As propaganda offered interpretations of Shakespeare claiming he was secretly German and denounced other works as “degenerate art,” did the audience believe this content analysis? Did their belief in the propaganda allow them to continue to endorse the systems of distribution in which they took part?

This shows how the question of how media is interpreted is a political battle fought by many. Nobody fighting these battles is an impartial scientist. Since one gets an understanding of the means of distribution through impartial science, and since this understanding of the means of distribution is necessary for correct content analysis, we can dismiss most content analysis as speculative garbage, from a scientific perspective. What this kind of content analysis is instead is art. It can be really beautiful and important art.

On the other hand, since distribution analysis depends on the analysis of every piece of implicated content, distribution analysis is ultimately hopeless without automated methods for content analysis. This is one reason why machine learning techniques for analyzing text, images, and video are such a hot research area. While the techniques for optimizing supply chain logistics (for example) are rather old, the automated processing of media is a more subtle problem precisely because it involves the interpretation and reinterpretation by finite subjects.

By “finite subject” here I mean subjects that are inescapably limited by the boundaries of their own perspective. These limits are what makes their interpretation possible and also what makes their interpretation incomplete.

things I’ve been doing while not looking at twitter

Twitter was getting me down so I went on a hiatus. I’m still on that hiatus. Instead of reading Twitter, I’ve been:

  • Reading Fred Turner’s The Democratic Surround. This is a great book about the relationship between media and democracy. Since a lot of my interest in Twitter has been because of my interest in the media and democracy, this gives me those kinds of jollies without the soap opera trainwreck of actually participating in social media.
  • Going to arts events. There was a staging of Rhinoceros at Berkeley. It’s an absurdist play in which a small French village is suddenly stricken by an epidemic wherein everybody is transformed into a rhinoceros. It’s probably an allegory for the rise of Communism or Fascism but the play is written so that it’s completely ambiguous. Mainly it’s about conformity in general, perhaps ideological conformity but just as easily about conformity to non-ideology, to a state of nature (hence, the animal form, rhinoceros.) It’s a good play.
  • I’ve been playing Transistor. What an incredible game! The gameplay is appealingly designed and original, but beyond that it is powerfully written and atmospheric. In many ways it can be read as a commentary on the virtual realities of the Internet and the problems with them. Somehow there was more media attention to GamerGate than to this one actually great game. Too bad.
  • I’ve been working on papers, software, and research in anticipation of the next semester. Lots of work to do!

Above all, what’s great about unplugging from social media is that it isn’t actually unplugging at all. Instead, you can plug into a smarter, better, deeper world of content where people are more complex and reasonable. It’s elevating!

I’m writing this because some time ago it was a matter of debate whether or not you can ‘just quit Facebook’ etc. It turns out you definitely can and it’s great. Go for it!

(Happy to respond to comments but won’t respond to tweets until back from the hiatus)

prediction and computational complexity

To the extent that an agent is predictable, it must:

  • be observable, and
  • have a knowable internal structure

The first implies that the predictor has collected data emitted by the agent.

The second implies that the agent has internal structure and that the predictor has the capacity to represent that structure.

In general, we can say that people do not have the capacity to explicitly represent other people very well. People are unpredictable to each other. This is what makes us free. When somebody is utterly predictable to us, their rigidity is a sign of weakness or stupidity. They are following a simple algorithm.

We are able to model the internal structure of worms with available computing power.

As we build more and more powerful predictive systems, we can ask: is our internal structure in principle knowable by this powerful machine?

This is different from the question of whether or not the predictive machine has data from which to draw inferences. Though of course the questions are related in their implications.

I’ve tried to make progress on modeling this with limited success. Spiros has just told me about binary decision diagrams, which are a promising lead.
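
For what it’s worth, here is a toy sketch of the idea (my own minimal version, not a serious BDD library): an ordered decision graph built by Shannon expansion, reduced so that redundant tests disappear and identical subgraphs coincide, which is what lets a boolean function’s representation be far smaller than its truth table.

```python
def mk(var, low, high):
    if low == high:          # reduction: a test with identical branches is redundant
        return low
    return (var, low, high)  # tuples compare by value, so equal subgraphs coincide

def build(f, variables, env=()):
    # Shannon expansion: f = (not x and f[x:=0]) or (x and f[x:=1])
    if not variables:
        return f(dict(env))
    x, rest = variables[0], variables[1:]
    return mk(x,
              build(f, rest, env + ((x, False),)),
              build(f, rest, env + ((x, True),)))

# Example: the three-variable majority function.
maj = build(lambda v: sum((v["a"], v["b"], v["c"])) >= 2, ("a", "b", "c"))
print(maj)  # a small shared graph instead of a 2^3-row truth table
```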

objective properties of text and robot scientists

One problem with having objectivity as a scientific goal is that it may be humanly impossible.

One area where this comes up is in the reading of a text. To read is to interpret, and it is impossible to interpret without bringing one’s own concepts and experience to bear on the interpretation. This introduces partiality.

This is one reason why Digital Humanities are interesting. In Digital Humanities, one is using only the objective properties of the text–its data as a string of characters and its metadata. Semantic analysis is reduced to a study of a statistical distribution over words.
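
A trivial illustration of what that reduction looks like (toy texts, standard bag-of-words arithmetic): two documents compared purely by the statistics of their surface tokens, with no interpretation anywhere in the loop.

```python
import math
from collections import Counter

def word_distribution(text):
    words = text.lower().split()
    return {w: n / len(words) for w, n in Counter(words).items()}

def cosine(p, q):
    dot = sum(p[w] * q.get(w, 0.0) for w in p)
    norm = lambda d: math.sqrt(sum(v * v for v in d.values()))
    return dot / (norm(p) * norm(q))

a = word_distribution("the proof of the pudding is in the eating")
b = word_distribution("the proof is in the pudding")
print(cosine(a, b))  # similarity judged purely from word counts
```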

An odd conclusion: the objective scientific subject won’t be a human intelligence at all. It will need to be a robot. Its concepts may never be interpretable by humans because any individual human is too small-minded or restricted in their point of view to understand the whole.

Looking at the history of cybernetics, artificial intelligence, and machine learning, we can see the progression of a science dedicated to understanding the abstract properties of an idealized, objective learner. That systems such as these underlie the infrastructure we depend on for the organization of society is a testament to their success.