Digifesto

on courage in the face of failure developing bluestocking

It would be easy to be discouraged by early experiments with bluestocking.

sb@lebenswelt:~/dev/bluestocking$ python factchecker.py "Courage is what makes us. Courage is what divides us. Courage is what drives us. Courage is what stops us. Courage creates news. Courage demands more. Courage creates blame. Courage brings shame. Courage shows in school. Courage determines the cool. Courage divides the weak. Courage pours out like a leak. Courage puts us on a knee. Courage makes us free. Courage makes us plea. Courage helps us flee. Corey Fauchon"
Looking up Fauchon
Lookup failed
Looking up shame
Looking up news
Looking up puts
Lookup failed
Looking up leak
Lookup failed
Looking up stops
Lookup failed
Looking up Courage
Looking up helps
Lookup failed
Looking up divides
Lookup failed
Looking up shows
Lookup failed
Looking up demands
Lookup failed
Looking up pours
Lookup failed
Looking up brings
Lookup failed
Looking up weak
Lookup failed
Looking up drives
Lookup failed
Looking up free
Looking up blame
Lookup failed
Looking up Corey
Lookup failed
Looking up plea
Lookup failed
Looking up knee
Looking up flee
Lookup failed
Looking up cool
Looking up school
Looking up determines
Lookup failed
Looking up like
Looking up us
Lookup failed
Looking up creates
Lookup failed
Looking up makes
Lookup failed
Building knowledge base
Querying knowledge base with original document
Consistency: 0
Contradictions: []
Supported: []
Novel: [(True, 'helps', 'flee'), (True, 'helps', 'us'), (True, 'determines', 'cool'), (True, 'like', 'leak'), (True, 'puts', 'knee'), (True, 'puts', 'us'), (True, 'pours', 'leak'), (True, 'pours', 'like'), (True, 'brings', 'shame'), (True, 'drives', 'us'), (True, 'stops', 'us'), (True, 'creates', 'blame'), (True, 'creates', 'news'), (True, 'Courage', 'shame'), (True, 'Courage', 'news'), (True, 'Courage', 'puts'), (True, 'Courage', 'leak'), (True, 'Courage', 'stops'), (True, 'Courage', 'helps'), (True, 'Courage', 'divides'), (True, 'Courage', 'shows'), (True, 'Courage', 'demands'), (True, 'Courage', 'pours'), (True, 'Courage', 'brings'), (True, 'Courage', 'weak'), (True, 'Courage', 'drives'), (True, 'Courage', 'free'), (True, 'Courage', 'blame'), (True, 'Courage', 'plea'), (True, 'Courage', 'knee'), (True, 'Courage', 'flee'), (True, 'Courage', 'cool'), (True, 'Courage', 'school'), (True, 'Courage', 'determines'), (True, 'Courage', 'like'), (True, 'Courage', 'us'), (True, 'Courage', 'creates'), (True, 'Courage', 'makes'), (True, 'us', 'knee'), (True, 'us', 'flee'), (True, 'us', 'plea'), (True, 'us', 'free'), (True, 'Corey', 'Fauchon'), (True, 'makes', 'plea'), (True, 'makes', 'free'), (True, 'makes', 'us'), (True, 'divides', 'weak'), (True, 'divides', 'us'), (True, 'shows', 'school')]

But, then again, our ambitions are outlandish. Nevertheless, there is a silver lining:

sb@lebenswelt:~/dev/bluestocking$ python factchecker.py "The sky is not blue."
Looking up blue
Looking up sky
Building knowledge base
Querying knowledge base with original document
Consistency: -1
Contradictions: [(True, 'sky', 'blue')]
Supported: []
Novel: []
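The mechanics behind that score are simple enough to sketch. Here is a toy version, where the triple format and function names are my illustrative guesses, not the actual bluestocking implementation:

```python
# A toy version of the consistency check, loosely mirroring the
# factchecker.py output above. The triple format and names are
# illustrative guesses, not bluestocking's actual code.

def consistency(doc_facts, kb_facts):
    """Score a document's (positive, subject, obj) facts against a KB.

    A document fact whose negation is in the KB is a contradiction
    (reported as the conflicting KB fact); a fact found verbatim is
    supported; anything else is novel.
    """
    kb = set(kb_facts)
    contradictions, supported, novel = [], [], []
    for (pos, subj, obj) in doc_facts:
        if (not pos, subj, obj) in kb:
            contradictions.append((not pos, subj, obj))
        elif (pos, subj, obj) in kb:
            supported.append((pos, subj, obj))
        else:
            novel.append((pos, subj, obj))
    score = len(supported) - len(contradictions)
    return score, contradictions, supported, novel

# "The sky is not blue" extracts to a negated fact, while the knowledge
# base built from the lookups holds the positive one, so the score
# comes out -1.
score, contra, supp, nov = consistency(
    doc_facts=[(False, 'sky', 'blue')],
    kb_facts=[(True, 'sky', 'blue')],
)
```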

averting the techno-apocalypse

I was worried when I wrote this that I was exaggerating the phenomenon of literati denouncing technical progress. Then I happened upon this post by a pseudonymous Mr. Teacup, which echoes themes from Morozov’s review.

(At a company Christmas party, we exchanged Secret Santa gifts drawn from each other’s Amazon wish lists. I received Žižek’s In Defense of Lost Causes, and was asked by the Ivy-League educated hacker founder what the book was about. I explained that the book’s lost cause was Enlightenment values, and he was totally shocked by this because he had never heard that they were even in doubt – a typical example of hackers’ ignorance of intellectual trends outside their narrow fields of engineering expertise. But this naivety may explain why some parts of the public finds Silicon Valley’s pseudo-revolutionary marketing message so compelling – their hostility to the humanities has, for good or ill, spared them the influence of postmodernity, so that they are the only segment of society that unselfconsciously adopts universal-emancipatory rhetoric. Admittedly, this rhetoric is misleading and conceals a primarily capitalist agenda. Nonetheless, the public’s misrecognition of Silicon Valley’s potential to liberate also contains a moment of truth.)

All of this is true. But it’s also a matter of perspective. The “narrow fields of engineering expertise” require, to some extent, an embrace of Enlightenment values and universal-emancipatory rhetoric. Meanwhile, the humanities, which have adopted a kind of universal-problematic rhetoric (in which intellectual victory is achieved by labeling something as ‘problematic’), are themselves insulated. Can it be truthfully said that such rhetoric is an ‘intellectual trend’ outside of the narrow fields of highbrow wordslinging?

I wouldn’t know, as I’ve been exposed enough to both sides to have gotten both bugs. And, I’d guess, so has Mr. Teacup, who writes in what I believe is a hyperintellectualized parody:

The reader will find in these pages a repository of chronologically-arranged personal writings on topics at turns varied and repetitious, circulating around certain themes: the Internet and the problematics of New Media; Capitalism; Anti-Capitalism; Psychoanalysis; Film; the works of Žižek, Lacan and others; etc.

…while the author is in fact a web professional living in this century.

I think Mr. Teacup does a good job of diagnosing some of the roots of technophobia. The technophobe denies that the technologists are in fact transforming society because they believe change is possible and are terrified that it will occur, while the technologist is happy to say that Things are Changing–but just as they Always Have, though perhaps much more significantly in their era. (Isn’t the rate of technological change “increasing”? Isn’t that a natural consequence of Moore’s law?)

Those who domesticate social change are telling us that nothing is going to happen: “Yes, things will change, but don’t worry about it! Society will adjust and everything will go back to normal.” This is true conservatism. But some are afraid, because they believe change can really happen. (For example, the Tea Party is the only political group that believes in socialism, while progressives continually deny that it is a possibility.)

What if the converse is also true: those who believe in change are afraid, and this is not the same as opposing it. The technophobic nightmare scenarios of machines spinning out of control is not a delusional fantasy. On the contrary, it gives us an extremely accurate psychological representation of what genuine social change entails. The radical step is to simply endorse it. From the standpoint of the old ways, the birth of the New must be subjectively experienced as an apocalyptic event.

So, Morozov‘s loathing of the Hybrid Reality Institute is due to what again? A legitimate fear that technological change will usher in an autocratic regime that is run by technocratic industrialists without democratic consent. Mr. Teacup writes:

This reveals the general problem with deconstructing the human-technology binary: it frequently undermines legitimate grievances about the coercive uses of technology. People are not that stupid, they don’t oppose technology because they don’t realize they are always-already technologically mediated. They oppose technology because they do realize it – this is what makes it a crucial site of political resistance.

The problem, though, is that technophobia, however entertainingly it is articulated, will do nothing to stop technical change, because (as it’s already been conceded) the people responsible for technical change don’t bother reading expansive critiques informed by the intellectual trends in the humanities. Rather, it seems that technologists are developing their own intellectual tradition based on theories of the Singularity and individual rationality. A more mathematized, libertarian, and pragmatic great-grandchild of Enlightenment thought.

The question for those concerned with the death of democratic politics or the rise of technocolonialism, then, has got to be: how do you do better than whining? Given that technological change is going to happen, how can it be better steered towards less “problematic” ends?

The difficulty with this question is that it is deeply sociotechnical. Meaning, it’s a question where social and technical problems are interleaved so densely that it requires expertise from both sides of the aisle. Which means that the literati and digerati are going to have to respectfully talk to each other.

Protected: The jealousy of the literati in the Hybrid Age


The link between computation asymmetry and openness

I want to jot something down while it is on my mind. It’s rather speculative, but may wind up being the theme of my thesis work.

I’ve written here about computational asymmetry in the economy. The idea is that when different agents are endowed with different capacities to compute (or are differently boundedly rational), that difference can grow into an extreme inequality (power-law distributed, as income is) as computational power is stockpiled as a kind of capital accumulation.

Whereas a solution to unequal income is redistribution and a solution to unequal physical power is regulation against violence, for computational asymmetry there is a simpler solution: “openness” in the products of computation. In particular, high quality data goods–data that is computationally rich (has more logical depth)–can be made available as public goods.

There are several challenges to this idea. One is the problem of funding. How do you encourage the production of costly public goods? The classic answer is state funding. Today we have another viable option, crowdfunding.

Another involves questions of security and privacy. Can a policy of ‘openness’ lead to problematic invasions of privacy? Viewing the problem in light of computational asymmetry sheds light on this dynamic. Privacy should be a privilege of the disempowered, openness a requirement of the powerful.

In an ideal economy, agents are rewarded for their contribution to social welfare. For high quality data goods, openness leads to the maximum social welfare. So in theory, agents should be willing to adopt an open policy of their own volition. What has prevented them in the past is transaction costs and the problem of incurred risk. As institutions that reduce transaction costs and absorb risks get better, the remaining problems will be ones of regulation of noncompetitive practices.

Why federally funded software should be open source

Recently, open access to government funded research has gained attention and traction. Britain and Europe have both announced that they will make research they fund open access. In the United States, a community-driven effort has pushed a White House petition to the Obama administration for a similar policy. We may be experiencing a sea change.

Perhaps on the coattails of this movement, Open Source for America has launched a petition asking for a similar policy regarding federally funded software development: share all government-developed software under an open source license.

This is a really good idea.

Unfortunately, software development and government IT procurement are so misunderstood that this is not likely to excite those who aren’t somehow directly affected by the issue. That is too bad, because every American stands to benefit from this sort of change. That makes it important for those of us who do understand to act.

I’ll try to illustrate why this is important with a story, or really a template of a story. This is a story told in countless cases of government software procurement:

ACRNM, a federal agency, has realized that its database management system and its user interface have not been updated since the late 90’s, because building them the last time was such a headache. The system never really worked the way they wanted it to, and the vendor who built it for them has since vanished off the face of the earth. Desperate and beleaguered, ACRNM finally gets the budget together to build a new system, and puts out a bid.

Vendors that have navigated the prerequisite bureaucratic maze flock to this bid, knowing victory will be lucrative. Among them is FUBAR Enterprise Solutions. They know that whatever they build, they have a revenue stream for life. Not only does ACRNM have an enormous internal incentive to declare the new system a success to justify their budget, but they also have nobody to turn to for help with their software when it inevitably fails but FUBAR. FUBAR can continue extorting ACRNM for cash until ACRNM gives up, and the cycle continues.

What is wrong with this picture? Let’s count the problems:

  • FUBAR has ACRNM by the (pardon me, there’s really no other way to put this) balls. The term is vendor lock-in. The second ACRNM installs their system, FUBAR becomes a parasite on the government leeching taxpayer money. This is because the software is proprietary. No other company is legally allowed to fix or modify FUBAR’s proprietary system, so FUBAR faces no competition and so can charge through the nose. If the software were open source, ACRNM could turn to other contractors to repair their system, lowering total costs.
  • ACRNM has to do its work with worse software. Remember, this is a government agency that we pay taxes to for their services. With so much government activity boiling down to bureaucratic information processing, and so much innovation in software engineering and design, and so much budgetary pressure, you would think that the federal government would leap at technological innovation. But proprietary contracting causes the government to cripple itself at a tipping point.
  • Today, government agencies like ACRNM are wising up and turning to open source solutions. But it’s a slow, slow process. This is partly because FUBAR and its buddy companies, after so many years of this relationship with government, are now an entrenched lobby that will sow Fear, Uncertainty, and Doubt about open source alternatives if they can get away with it. In recent years, since open source has become more mainstream, these companies are admitting the viability of open source compatibility and mixed solutions. They see the writing on the wall. They will of course fight an open source purchasing mandate with everything they have.
  • Few governmental problems are unique. If ACRNM is paying for a new custom software solution, there are likely many other agencies–at the federal, state, or local level–with a similar problem. Civic Commons has already jumped on this opportunity by trying to facilitate technology reuse across city governments. If ACRNM invests in an open source solution, then other agencies can seek out that solution and adapt it to their needs, reducing government IT costs overall.
  • As we’ve discussed, open source software creates a competitive market for services. That makes an open source mandate a job creation program. Every new open technology is an opportunity for several small businesses to open. These are businesses that share the fixed costs of market entry and add value through technology consulting and custom development. Jobs customizing existing open source solutions can be well-paid with even an entry-level programming skill set, and are a good way to build a lasting career in the technology sector. Federal investment in open source software builds our national supply of technology skill faster than proprietary investment.
  • Lastly, but certainly not least, is the possible reuse of open source technology by the private sector. Just as federally funded research contributes to growth in America’s scientific industry, federal investment in software provides a foundation for stronger tech companies. Openness in both cases expands the impact of the funding.

So, to recap: if this sort of policy passes, the winners are government employees, taxpayers, entry-level workers with a minimum of technical skills, and the tech industry in general. The losers (in the short term) are those existing companies that have the federal government locked into custom proprietary software contracts.

I want to make a point clear: I am talking specifically about new software development in this post. Purchasing licenses for existing proprietary software is a different story.

Brian Carver, professor at UC Berkeley School of Information, has offered this clarification of what an open source mandate could look like:

  1. An unambiguous policy and awareness that all software created by
    federal employees as part of their job duties is not subject to copyright
    at all and is born in the public domain, and therefore not subject to any
    license terms at all, including a FOSS license.
  2. Given 1, the federal government should either just use github/bitbucket
    or set up a similar repository to share all such federal government
    software that is in the public domain.
  3. When the federal government contracts with developers for software,
    there should be an unambiguous policy that all such software must be
    licensed under a FOSS license unless subject to a specifically-requested
    exemption (national security, military, etc.)

A central election issue is the size and role of government in the economy. Politicians on the right advocate for smaller government and a strong private sector with competitive markets. Politicians on the left advocate for government’s active investment in the economy.

Proprietary government-developed software is the worst of both worlds: inefficient government spending to create parasitic, uncompetitive companies that don’t invest their technology back into the economy. An open source mandate would give us the best of both worlds: efficient government spending that shrinks government (by easing overhead) while investing in new technology and competitive businesses.

The movement for open access to government funded research is strong and winning victories around the world. Maybe we can do the same for government funded software development.

The Shame or Shine Lotto

Consider the following Massively Multiplayer On-line Game:

  • The game is strictly opt in. Nobody is forced to play the game.
  • Upon joining, some set of personal details is tracked and saved by the game. Purchasing data, tax records, …hell, legal record, personal messages?
  • Once per day, N players are selected at random and the data available on them are released into the public domain.
  • Members can look up to see whether others are playing the game. In addition to identifying information, they can see what information a player has agreed to have tracked.

It’s the Shame or Shine Lotto! Every day, there is a chance you will be roasted or toasted for the information you’ve agreed to uncertainly share.

Would you play this game?
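Before answering, note that the odds compound quickly: if N of M players are drawn each day, the chance of surviving d days unexposed is (1 − N/M)^d, so the chance of ever being exposed is 1 − (1 − N/M)^d. A quick back-of-the-envelope sketch (the numbers are arbitrary):

```python
# Back-of-the-envelope odds for the Shame or Shine Lotto: if N of M
# players are drawn each day, a player's chance of ever being exposed
# climbs with each day of play. Numbers below are purely illustrative.

def exposure_probability(n_drawn, n_players, days):
    """Probability of being selected at least once over `days` drawings."""
    daily_survival = 1 - n_drawn / n_players
    return 1 - daily_survival ** days

# With 10 of 100,000 players drawn daily, a year of play already
# carries a few percent risk of exposure.
p_year = exposure_probability(10, 100_000, 365)
```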

We need help naming a software project

Speaking of computational argumentation, Dave Kush and I are starting a software project and we need a name for it.

The purpose of the software is to extract information from a large number of documents, and then merge this information together into a knowledge base. We think this could be pretty great because it would support:

  • Conflict detection and resolution. In the process of combining information from many sources into a single knowledge base, the system should be able to mark conflicts of information. That would indicate an inconsistency or controversy between the documents, which could be flagged for further investigation.
  • Naturally queryable aggregate knowledge. We anticipate being able to build a query interface that is a natural extension of this system: just run the query through the extraction process and compare the result for consistency with the knowledge base. This would make the system into a “dissonance engine,” useful for opposition research or the popping of filter bubbles.
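To make the merge-and-flag idea above concrete, here is a minimal sketch of conflict detection during knowledge base construction. Everything in it (the triple format, the function names) is a guess at one possible design, not code we have actually written:

```python
# A sketch of the merge step described above: fold extracted facts
# from many documents into one knowledge base, flagging conflicts as
# they arise. The (positive, subject, obj) triple format is a design
# guess for illustration only.

def merge_documents(extracted):
    """Merge per-document fact lists, flagging conflicting claims.

    `extracted` maps a document id to a list of (positive, subject,
    obj) facts. Returns the merged knowledge base and a list of
    conflicts, each recording which two documents disagree and about
    what.
    """
    kb = {}        # (subject, obj) -> (positive, source_doc)
    conflicts = []
    for doc_id, facts in extracted.items():
        for (pos, subj, obj) in facts:
            key = (subj, obj)
            if key in kb and kb[key][0] != pos:
                conflicts.append((subj, obj, kb[key][1], doc_id))
            else:
                kb[key] = (pos, doc_id)
    return kb, conflicts

# Two documents making opposite claims about the sky get flagged
# for further investigation.
kb, conflicts = merge_documents({
    'doc_a': [(True, 'sky', 'blue')],
    'doc_b': [(False, 'sky', 'blue')],
})
```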

I should say that neither of us knows exactly what we are doing. But Dave’s almost got his PhD in human syntax so I think we’ve got a shot at building a sweet parser. What’s more, we’ve got the will and plan. It will be open source, of course, and we’re eager for collaborators.

We have one problem:

We don’t know what to call it.

I can’t even make the GitHub account for our code until we have a good name. And until then we’ll be sending Python scripts to each other as email attachments and that will never get anywhere.

Please help us. Tell us what to name our project. If we use your name, we’ll do something awesome for you some day.

Scratch that. We’re calling it Bluestocking. The GitHub repo is here.

Computational Asymmetry

I’ve written a paper with John Chuang about “Computational Asymmetry in Strategic Bayes Networks” to open a conversation about an economic and social issue: computational asymmetry. By this I mean the problem that some agents–people, corporations, nations–have access to more computational power than others.

We know that computational power is a scarce resource. Computing costs money, whether we buy our own hardware or rent it on the cloud. Should we be concerned with how this resource gets distributed in society?

One could argue that the market will lead to an efficient distribution of computing power, just like it leads to an efficient distribution of brown shoes or butter. But that argument only makes sense if computational power is not associated with externalities that would cause systematic market failure.

This isn’t likely. We know that information asymmetry can wreak havoc on market efficiency. Arguably, computational asymmetry is another form of information asymmetry: it allows some parties to get important information, faster. Or perhaps a better way to put it is that with more computing power, you can get more knowledge out of the information you already have.

In the paper linked above, we show that in some game theoretic situations with complex problems, more computationally powerful players can beat their opponents using only their superior silicon. Suppose organizations use computing power to gain an economic advantage, and then use their winnings to invest in more computing power. You could see how this cycle would lead to massive inequality.
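That feedback loop is easy to caricature in a toy simulation. This sketch is not from the paper; the parameters are arbitrary. Agents win contests in proportion to their computing power and reinvest their winnings in more of it:

```python
# A toy rich-get-richer simulation of the cycle described above:
# agents win contests in proportion to their computing power, then
# reinvest winnings into more computing power. All parameters are
# arbitrary; this is a caricature, not the model from the paper.
import random

def simulate(n_agents=100, rounds=2000, seed=0):
    rng = random.Random(seed)
    power = [1.0] * n_agents
    for _ in range(rounds):
        # Two random agents compete; the stronger one wins more often.
        a, b = rng.sample(range(n_agents), 2)
        p_a_wins = power[a] / (power[a] + power[b])
        winner = a if rng.random() < p_a_wins else b
        power[winner] *= 1.05  # winnings reinvested in more compute
    return sorted(power, reverse=True)

power = simulate()
# Share of total computing power held by the top 10% of agents.
top_share = sum(power[:10]) / sum(power)
```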

I don’t think this situation is far fetched. In fact, we may already be living it. Consider that computing power is carried not just by hardware availability, but by software and human capital. What are the most powerful forces in United States politics today? Is it Wall Street, with its bright minds and high-frequency traders? Or Silicon Valley, crunching data and rolling out code? Or technocratic elites in government? President Obama has a large team of software developers available to build whatever data mining tools he needs. Does Mexico have the same skills and tools at its disposal? Does Nigeria? There is asymmetry here. How will this power imbalance manifest itself in twenty years? Fifty years?

Henry Farrell (George Washington University) and Cosma Rohilla Shalizi (Carnegie Mellon/The Santa Fe Institute) have recently put out a great paper about Cognitive Democracy, a political theory that grapples with society’s ability to solve complex problems. Following Hayek, who maintains that the market will efficiently solve complex economic problems, and Thaler and Sunstein, who believe that a paternalistic hierarchy can solve problems in a disinterested way, Farrell and Shalizi argue that a radical democracy can solve problems in a way that diffuses unequal power through people’s confrontation with other viewpoints. This requires that open argumentation and deliberation be an effective information-processing mechanism. They advocate for greater experimentation with democratic structure over the Internet, with the goal of eventually re-designing democratic institutions.

I love the concept of cognitive democracy and their approach. However, if their background assumptions are correct then computational asymmetry poses a problem. Politics is the negotiation of adversarial interests. If argumentation is a computational process (which I believe it is), then even a system of governance based on free speech and collective intelligence could be manipulated or overpowered by a computational titan. In such a system, whoever holds the greatest gigahertz gets a bigger piece of the derived social truth. As we plunge into a more computationally directed world, that should give us pause.

Defining information with Dretske

I prepared these slides to present Fred Dretske’s paper “The Epistemology of Belief” to a class I’m taking this semester, ‘Concepts of Information’, taught by Paul Duguid and Geoff Nunberg.

Somewhere along the line I realised that if I was put on earth for one reason and one reason only, it was to make slide decks about epistemology.

I’ve had a serious interest in philosophy as a student and as a…hobbyist? can you say that?…for my entire thinking life. I considered going to graduate school for it before tossing the idea for more practical pursuits. So it comes as a delightful surprise that I’ve found an opportunity to read and work with philosophy at a graduate level through my program.

A difficult issue for a “School of Information” is defining what information is. I’ve gathered from conversations with faculty that there is an acknowledged intellectual tussle over the identity of iSchools which hinges in part on the meaning of the word. There seem to me to be roughly two ideologies at play: on the one hand, the cyberneticist ideology that sought to unify Shannon’s information theory, computer science, management science, economics, AI, and psychology under a coherent definition of information; on the other, the softer social science view that ‘information’ is a polysemous term which refers variously to newspapers and to the stuff mediated by “information technology” in a loose sense, but primarily to a social phenomenon.

As I’ve been steeped in the cyberneticist tradition but still consider myself literate in English and capable of recognizing social phenomena, it bothers me that people don’t see all this as just talking about the same thing in different ways.

I figured coming into the program that this was an obvious point that was widely accepted. It’s in a way nice to see that this is controversial and the arguments for this view are either unknown, unarticulated, or obscure, because that means I have some interesting work ahead of me.

This slide deck was a first stab at the problem: tying Dretske’s persuasive account of a qualitative definition of ‘information about’ to the relevant concept of Shannon’s information theory. I hope to see how far I can push this in later work. (At the point where it proves impossible, as opposed to merely difficult or non-obvious, we’ll have discovered something new!)

The instability of adversarial roles

One topic I’m interested in researching is the automated detection of certain kinds of social (or anti-social) activity on the internet. This paper, “Visualizing the Signatures of Social Roles in Online Discussion Groups” by Welser et al., is a good example of a stab at the problem. Can we look at data from a mailing list and identify the most helpful person on it? Welser et al. think so, and they develop a preliminary model for detecting them.

That’s all well and good until the roles get more complicated. A great way to make things more complicated is by introducing an adversarial relationship into the mix. The internet is rife with adversity, in the form of flame warriors, trolls, and spammers. There is much more benign disagreement as well, but this is probably comparatively rare. Or (this is a broad claim based in cynicism, not research) people are so likely to take disagreement and conflict personally or dismissively that much legitimate conflict on the Internet is probably seen as flaming, trolling, or spamming.

The problem is that it is very hard to pin down the definitions of these terms. This isn’t just a conceptual problem. It’s also a problem for engineering solutions around these roles. Spam filtering, for example, depends on a certain model of what counts as spam. While training a classifier based on a user’s subjective classification makes lots of sense in some circumstances (like a mail filter), in other cases the line may be less clear. Trolling, meanwhile, can be death to a web-based community. One could argue that Pinterest has been successful partly because it has been able to keep the trolls out. But in other contexts, where vigorous debate is encouraged, standards of ‘trolling’ may differ dramatically.
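For concreteness, the classifier in a mail filter is often something like a word-count Naive Bayes model trained on the user's own labels. Here is a textbook sketch, not any particular product's implementation, and its weakness is exactly the point: the model is only as good as the labeler's subjective notion of spam.

```python
# A textbook word-count Naive Bayes spam filter, of the kind a mail
# filter might train on a user's subjective "spam"/"ham" labels.
# A sketch for illustration, not any particular product's code.
import math
from collections import Counter

class NaiveBayesFilter:
    def __init__(self):
        self.counts = {'spam': Counter(), 'ham': Counter()}
        self.totals = {'spam': 0, 'ham': 0}

    def train(self, label, text):
        """Update word counts from a message the user labeled."""
        for word in text.lower().split():
            self.counts[label][word] += 1
            self.totals[label] += 1

    def score(self, text):
        """Log-odds that `text` is spam, with add-one smoothing."""
        vocab = len(set(self.counts['spam']) | set(self.counts['ham']))
        logodds = 0.0
        for word in text.lower().split():
            p_spam = (self.counts['spam'][word] + 1) / (self.totals['spam'] + vocab)
            p_ham = (self.counts['ham'][word] + 1) / (self.totals['ham'] + vocab)
            logodds += math.log(p_spam / p_ham)
        return logodds

f = NaiveBayesFilter()
f.train('spam', 'win free money now')
f.train('ham', 'lunch meeting tomorrow')
is_spam = f.score('free money') > 0  # positive log-odds means "spam"
```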

Horse_ebooks is spam content (or, spam detection evasion content) that has turned into a viral meme. Trolls sometimes become accepted members of a web community, understood to be entertaining and serving as rites of passage to n00bs. So these roles are not necessarily fixed.

With so much data about such roles available for analysis, research into these questions could teach us a lot more about human communication and conflict.