Another rant about academia and open source
by Sebastian Benthall
A few weeks ago I went to a great talk by Victoria Stodden about how there’s a crisis of confidence in scientific research that depends on heavy computing. Long story short, because the data and code aren’t openly available, the results aren’t reproducible. That means there’s no check on prior research, and bad results can slip through and be the foundation for future work. This is bad.
Stodden’s solution was to push forward within the scientific community and possibly in legislation (i.e., as a requirement on state-funded research) for open data and code in research. Right on!
Then, something intriguing: somebody in the audience asked how this relates to open source development. Stodden, who just couldn’t stop saying amazing things that needed to be said that day, answered by saying that scientists have a lot to learn from the “open source world”, because they know how to build strong communities around their (open) work.
Looking around the room at this point, I saw several scientists toying with their laptops. I don’t think they were listening.
It’s a difficult thing coming from an open source background and entering academia, because the norms are close, but off.
The other day I posted some criticism of, and questions about, Bruno Latour, a theorist with a lot of influence in the department, to an informal departmental mailing list. The reactions to that thread ranged pretty much all across the board, but one of the surprising ones was along the lines of “I’m not going to do your work for you by answering your question about Latour.” In other words, RTFM. Except, in this case, “the manual” was a book or two of dense academic literature in a field that I was just beginning to dip into.
I don’t want to make too much of this response, since there were a lot of extenuating circumstances, but it did strike me as an indication of one of the cultural divides between open source development and academic scholarship. In the former, you want as many people as possible to understand and use your cool new thing because that enriches your community and makes you feel better about your contribution to the world. For some kinds of scholars, being the only one who understands a thing is a kind of distinction that gives you pride and job opportunities, so you don’t really want other people to know as much as you about it.
Similarly for computationally heavy sciences: if you think your job is to get grants to fund your research, you don’t really want anybody picking through it and telling you your methodology was busted. In an Internet Security course this semester, I’ve had the pleasure of reading John McHugh’s Testing Intrusion Detection Systems: A Critique of the 1998 and 1999 DARPA Off-line Intrusion Detection System Evaluation as Performed by Lincoln Laboratory. In this incredible paper, McHugh explains why a particular DARPA-funded Lincoln Labs Intrusion Detection research paper is BS, scientifically speaking.
In open source development, we would call McHugh’s paper a bug report. We would say, “McHugh is a great user of our research because he went through and tested for all these bugs, and even has recommendations about how to fix them. This is fantastic! The next release is going to be great.”
In the world of security research, Lincoln Labs complained to the publisher and got the article pulled.
Ok, so security research is a new field with a lot of tough phenomena to deal with and not a ton of time to read up on 300 years of epistemology, philosophy of science, statistical learning theory, or each other’s methodological critiques. I’m not faulting the research community at all. However, it does show some of the trouble that happens in a field that is born out of industry and military funding concerns without the pretensions or emphasis on reproducible truth-discovery that you get in, say, physics.
All of this, it so happens, is what Lyotard describes in his monograph, The Postmodern Condition (1979). Lyotard argues that because of cybernetics and information technologies, because of Wittgenstein, because of the “collapse of metanarratives” that would make anybody believe in anything silly like “truth”, there’s nothing left to legitimize knowledge except Winning.
You can win in two ways: you can research something that helps somebody beat somebody else up or consume more, so that they give you funding. Or you can win by not losing, by pulling some wild theoretical stunt that puts you out of range of everybody else so that they can’t come after you. You become good at critiquing things in ways that sound smart, and tell people who disagree with you that they haven’t read your canon. You hope that if they call your bluff and read it, they will be so converted by the experience that they will leave you alone.
Some, but certainly not all, of academia seems like this. You can still find people around who believe in epistemic standards: rational deduction, dialectical critique resolving to a consensus, sound statistical induction. Often people will see these as just a kind of meta-methodology in service to a purely pragmatic ideal of something that works well or looks pretty or makes you think in a new way, but that in itself isn’t so bad. Not everybody should be anal about methodology.
But these standards are in tension with the day to day of things, because almost nobody really believes that they are after true ideas any more. It’s so easy to be cynical or territorial.
What seems to be missing is a sense of common purpose in academic work. Maybe it’s the publication incentive structure, maybe it’s because academia is an ideological proxy for class or sex warfare, maybe it’s because of a lot of big egos, maybe it’s the collapse of meta-narratives.
In FOSS development, there’s a secret ethic that’s not particularly well articulated by either the Free Software Movement or the Open Source Initiative, but which I believe is shared by a lot of developers. It goes something like this:
I’m going to try to build a totally great new thing. It’s going to be a lot of work, but it will be worth it because it’s going to be so useful and cool. Gosh, it would be helpful if other people worked on it with me, because this is a lonely pursuit and having others work with me will help me know I’m not chasing after a windmill. If somebody wants to work on it with me, I’m going to try hard to give them what they need to work on it. But hell, even if somebody tells me they used it and found six problems in it, that’s motivating; that gives me something to strive for. It means I have (or had) a user. Users are awesome; they make my heart swell with pride. Also, bonus, having lots of users means people want to pay me for services or hire me or let me give talks. But it’s not like I’m trying to keep others out of this game, because there is just so much that I wish we could build and not enough time! Come on! Let’s build the future together!
I think this is the sort of ethic that leads to the kind of community building that Stodden was talking about. It requires a leap of faith: that your generosity will pay off and that the world won’t run out of problems to be solved. It requires self-confidence because you have to believe that you have something (even something small) to offer that will make you a respected part of an open community without walls to shelter you from criticism. But this ethic is the relentlessly spreading meme of the 21st century and it’s probably going to be victorious by the start of the 22nd. So if we want our academic work to have staying power we better get on this wagon early so we can benefit from the centrality effects in the growing openly collaborative academic network.
I heard David Weinberger give a talk last year on his new book Too Big to Know, in which he argued that “the next Darwin” was going to be actively involved in social media as a research methodology. Tracing their research notes will involve an examination of their inbox and Facebook feed to see what conversations were happening, because just so much knowledge transfer is happening socially and digitally and it’s faster and more contextual than somebody spending a weekend alone reading books in a library. He’s right, except maybe for one thing, which is that this digital dialectic (or pluralectic) implies that “the next Darwin” isn’t just one dude, Darwin, with his own ‘-ism’ and pernicious Social adherents. Rather, it means that the next great theory of the origin of species is going to be built by a massive collaborative effort in which lots of people will take an active part. The historical record will show their contributions not just with the clumsy granularity of conference publications and citations, but with the minute granularity of thousands of traced conversations. The theory itself will probably be too complicated for any one person to understand, but that’s OK, because it will be well architected and there will be plenty of domain experts to go to if anyone has problems with any particular part of it. And it will be growing all the time and maybe competing with a few other theories. For a while people might have to dual boot their brains until somebody figures out how to virtualize Foucauldian Quantum Mechanics on an Organic Data Splicing ideological platform, but one day some crazy scholar-hacker will find a way.
“Cool!” they will say, throwing a few bucks towards the Kickstarter project for a musical instrument that plays to the tune of the uncollapsed probabilistic power dynamics playing out between our collated heartbeats.
Does that future sound good? Good. Because it’s already starting. It’s just an evolution of the way things have always been, and I’m pretty sure based on what I’ve been hearing that it’s a way of doing things that’s picking up steam. It’s just not “normal” yet. Generation gap, maybe. That’s cool. At the rate things are changing, it will be here before you know it.
Um… Seb? Are you on Planet TOS? Because if you’re not, we need to get your stuff on there. This is exactly, EXACTLY what I’ve been seeing too, and… thank god I’m not the only one! You articulate it beautifully.
Wow, thanks Mel! Glad I’m not the only one seeing it too :) I don’t think I’m on Planet TOS–how do I get on it? I just added the feed to my reader.
Also, I’m lurking on the mailing list still. Got some neat new members!
I’ve noticed the territorial attitude as well, and I kept trying to explain this to my industry counterparts, but I couldn’t quite find the words. Of course it isn’t everybody, but it definitely exists. Great insight on the reasons behind this egocentric mentality!
Sebastian, I love your point about the next Darwin.
Also, I must agree: Victoria Stodden is awesome.
Thanks, David! That means a lot to me.
You’ve made a very important point, and I hope the next generation of scientists is open-minded enough to live up to better, less selfish standards.
Also: your summary of the Lyotard idea was in a way slightly scary: if there is no metanarrative to legitimize an argument, surely the relentless logic of modern capitalism is all that’s left: advertising/PR on the one hand, and smart bombs on the other!
Thanks, and I’m glad you agree! You yourself seem to have a very interesting blog (based on skimming a few entries). I’ve added it to my reader.
Yes, Lyotard’s conception of post-modernism is scary. I think we need to own up to how deep we are in it before we can find our way out of it. Dialectical synthesis seems promising, but labor intensive. What do you think?
I agree it’s a major problem. Around 6 years ago I discovered postmodernism, and I was fascinated by it as a description of the (intellectual) world. But now the fragmentariness is scary, especially when confronted with the unstoppable logic of markets: “all that is sold melts into air”, after all!
I’m a somewhat religious person — in a very heterodox way — so my hope is for a new and powerful synthesis of science and religion that manages to find a balance between rigor, human solidarity and kindness, and a sense of awe and reverence.
I’ve been playing around with the phrase “radical de-centering” in my head. I feel like religions are good at making you think the world does not revolve around your own idiosyncratic thought-bubble. It’s this feeling that needs to be rediscovered. Perhaps the pendulum needs to swing in favor of some kind of new Modernism. One with a heart, perhaps! Occupy, and everything it spawned, gives me a tiny bit of hope.
A lot of this resonates for me as well. I like the idea of “modernism with a heart”.
When you say “synthesis of science and religion”, what aspects of each are you interested in preserving, and where do you see the conflict between them that must be overcome?
There does seem to be something in the air: Occupy, a lot of people feeling like people on the Internet in particular are coming towards a kind of collective identity, a hopefulness for a substantive ideology despite the challenges. I wonder what the biggest obstacles are to seeing something like that realized.
[…] Benthall’s post about academic vs open culture reminded me that I’d like to track my fellow FOSS-to-academia migrants somewhere. Seb and […]
[…] want to come back to Seb’s blog post here because he’s given the best summary of the open source mentality in one paragraph that […]
Enjoyed your thoughts, Sebastian, and particularly your characterization of the motivation of the prototypical FOSS developer. I’d be delighted if such attitudes and approaches would permeate culture, well beyond open source software development and similar endeavors, by the 22nd Century. We may well find that a raft of oppositional human motivations make this slower going than we might both hope, however.
Heard Victoria Stoddert talk today for the first time, not far from the I-School – in the Stat Dept. in Evans. Her talk was a handy overview of five ‘hows,’ to help move us closer to reproducible research in the computational domain of research. Among others, these included the growing availability of cloud-based tools for sharing code, data, and methodology (e.g. sharing one’s runtime parameters and exact sequences of computational steps performed); and nascent requirements on the part of some funders and scientific journals to motivate such sharing.
Your comment on Darwin is what motivated this response: there’s an excellent overview of the world of ideas that led to ‘Darwinism’ – and beyond – in Loren Eiseley’s “Darwin’s Century” (http://www.amazon.com/dp/0385081413 for reader reviews; http://0-www.worldcat.org.l0.apu.edu/oclc/168989 for Library holdings). Theories of evolution of living organisms – and of natural selection as a driving force behind evolution – were, from what I gathered from Eiseley’s book, truly a decentralized, and often unguided, community effort over several centuries and in multiple countries, rather than what we might now conceive of as one dude’s ‘ism,’ accompanied by a posse of adherents. By the now almost quaint 18th and 19th Century mechanisms of handwritten letters, books, journal articles, and occasional in-person talks, in an era long before blogs or social networks, these ideas gained form, and then ultimately, momentum.
Thanks for your thoughts. Do you remember any of those cloud-based tools off the top of your head?
That’s fascinating to hear about Darwin. I guess the fear of a lot of academics is that some Darwin will get all the credit for their incremental work towards some grand conclusion. It’s nice to see that intellectual historians remedy this somewhat. Hopefully, as you say, by the 22nd century their job will be a lot easier.
I think it’s cool that you work on open source projects and on Berkeley’s collaborative tooling. I didn’t know there were people here working on that. I’d definitely say that it is as relevant as, if not more relevant than, any academic’s work to the realization of the dream. I’d be interested in hearing more about what’s going on on your end.
Victoria Stodden included a representative listing of cloud- and desktop-based tools for sharing research code, data, and methodology on p. 14 of her talk on April 25, 2012, accessible via her Talks page:
Or directly via:
BerkeleyNeymanApril252012-STODDEN.pdf
Victoria Stoddert => Stodden in the post above. With apologies to Professor Stodden, I seem to be repeatedly making this mental transposition. :-(
Her slides from that April 25, 2012 talk, among many others, can be readily found on her Talks page:
One of the more interesting parts of her talk (see slide 12), and relevant to the “community building ethic” you describe in your blog post, Sebastian, was a discussion of the results of her survey on the behavioral impediments to sharing code, data, and methodology related to research findings.
These included concerns related to: the effort and time required to package up and document these artifacts to make them useful to others; the prospect of providing ongoing support to those using these artifacts; having others use one’s work without attribution, once it has been made so freely and comprehensively available; and a variety of institutional and intellectual property-related issues. These are clearly some of the ‘real world’ barriers to taking a FOSS-like approach to these artifacts.
Thanks for this.
Congrats on your post; you touch on an extremely interesting topic. I would like to highlight the example of Matt Todd at the University of Sydney; he just started an open research project to find a feasible treatment against malaria. The results of the research will be public at least in the initial phases. A great example to follow.
Here is a brief reflection on the theme of open science, posted at http://www.ideasforchange.com/en:
The right to access existing drugs at a reasonable price is a long-standing demand in developing countries and is about to be achieved, at least partially, in three different but complementary ways.
The industry cooperates and donates
Invited by Bill Gates, the largest pharmaceutical companies in the world have reached an agreement to share their research in order to eradicate neglected tropical diseases. They have also agreed to significantly increase donations of medicines for illnesses that are obsolete in the West. Bravo! Thank you.
The (Indian) State regulates
Nexavar, a drug for kidney and liver cancer patented by Bayer, will be produced in India under the name Sorafenat by the local company Natco Pharma. Indian patent law allows local manufacturing if, three years after launch, the drug is not accessible to the general public at an “affordable” price.
A month of treatment with Bayer’s product costs a patient 4,000€. The generic alternative will be available for a fraction of that price: 134€/month. Roughly 30 times less.
This means that over 95% of the product’s price does not directly correspond to production costs. Instead, this amount includes R&D, marketing, financial and structural costs. Bayer offered its medicine for 475€ to certified sick patients. The authorized business will pay 6% of sales in royalties. A necessary exception. Applicable everywhere else? Sustainable?
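As a quick sanity check on the figures quoted above, here is a minimal Python sketch. It uses only the two monthly prices from the text; the variable names are illustrative, and treating the generic price as an upper bound on production cost is an assumption rather than something the source states.

```python
# Monthly prices per patient as quoted in the text, in euros.
branded_price = 4000.0
generic_price = 134.0

# How many times cheaper the generic is than the branded drug.
price_ratio = branded_price / generic_price

# Share of the branded price left over once the generic price is subtracted.
# Assumption: the generic price is an upper bound on production cost.
non_production_share = 1 - generic_price / branded_price

print(f"Generic is about {price_ratio:.1f} times cheaper")
print(f"Share of branded price above the generic price: {non_production_share:.1%}")
```

Under that assumption the ratio lands just under 30, and the "over 95%" claim holds with some margin.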
Dr. Matt Todd, an active professor at the School of Chemistry of the University of Sydney, has developed a methodology for open research that has already shown results: a low-cost alternative way to produce a medicine that treats bilharzia, a parasitic disease which affects millions of people who do not have access to clean water and sanitation systems.
“The challenge was that the medicine had to be produced at a very low cost and that was a challenge that academia was not going to solve”, states Todd. So he tried something different: he openly published his lab notes online while he advanced in his research and this proved to be crucial in the process.
In May he begins, with three years of Australian government funding, an open research project to find a feasible treatment against malaria, working with scientists worldwide who share results in real time without worrying about patents. He believes that open science can achieve significant discoveries in the initial phases, before clinical trials. Free, without pardon or permission, for all.
Open as well to existing businesses for collaboration, investment or cost reduction. Inspiration to redefine their strategy and activities. Energy to drive new business models and new structures based on the common resource. Fresh air.
Thanks for this information. It’s very encouraging to hear about so much prestigious work being done in open science.
[…] niche, domain-specific applications themselves. But scientists who know some programming often aren’t very familiar with best practices in software development, and that includes licensing – many scientists, […]
What happens when you have a department that is focused on a particular vision, say like the Chicago School in economics? This would seem to solve many of the common purpose problems and create more collaboration intra group (community building), but perhaps at the expense of other issues like filter bubbles and group level winning activity which could have even more problematic outcomes. Don’t have strong thoughts on this, but seems like there could be interesting tensions between a collaborative community and diversity of thought (and does it matter what you unify over – methodology, questions that matter, starting facts, attitude?).
That’s a great point. Huh.
So, there’s this research showing that diversity in a group leads to better decision-making on complex problems…
…but you need SOME common language or vision in order to get people united around what the problem you’re trying to solve is. Or do you?
Or, suppose you have some members of a decision-making team that are in it for themselves, are trying to win, individually, and others who are not. Is that dimension of diversity necessarily a problem?
I think you really nailed something critical.
[…] is what gives us hope for the revolutionary new kind of science BIDS is beginning. Two years ago, this was a fringe idea. Perlmutter may have just made it […]