Digifesto

November 20, 2011

Occupy and ‘liberation technology’

I’m moved by an anonymous letter covering and commenting on the UC Davis pepper spraying aftermath.

The video speaks for itself, but I wanted to write about a particular point the letter makes:

Various searches related to UC Davis and pepper spraying were the *top searches on Google* in the US today — think of what that means.

this all happened on a day when virtually no news (except Demi and Ashton’s divorce or the 30 year old Natalie Wood death investigation) gets reported on mainstream outlets. This *all* happened online, and drew a huge national audience in the process, enough so to force a major university into damage control freakout.

Last week I attended a talk at Berkeley’s CITRIS center on Internet and Democracy, with EFF’s Jillian York and Evgeny Morozov, author of The Net Delusion. The speakers discussed the role of the internet as ‘liberation technology’ in countries overthrowing dictatorships. Morozov was skeptical about promoting the use of the web for these purposes without extensive regional expertise. He also noted that many of these regimes import surveillance technology from the West. York agreed. But neither was able to speak much to the use of technology by the Occupy movement, and neither was able to address the actual mechanics of how the web helps (or not). Instead, both agreed that, yes, the Internet is a factor, but people on the ground and organizing in person were also very important.

There must be a more satisfying answer than this. Thankfully, the Occupy movement provides a great case study in the role of technology in movement building. Conditions in the U.S. are obviously different from those in the Arab Spring. There is relatively little fear of censorship on the web, free speech is well protected (within limits–see below), and Internet access and use is very high. So what we’d expect is for the impact of web technology on movement building to be stronger in the U.S. than in, say, the Middle East or Belarus.

What the anonymous letter writer points out is that the mass action required “to force a major university into damage control freakout” could happen almost overnight and even when the events in question are under-served by the mass media. It’s not clear whether the web activity encourages the movement on the ground or the other way around. Or rather, I don’t think there’s any question that the two activities feed off each other. But the real upshot is that combined, they simply make damage control impossible. You can’t hide the fact that you are beating kids up in the U.S. Full stop.

This is all possible because we have such great freedom of speech in this country. The irony is that what’s getting so much attention is the repression of free speech. In the U.S., it’s OK to complain on the Internet. It’s not yet OK to “encamp” as a symbolic political act. Encampment gets you pepper sprayed in the mouth.

This isn’t a contradiction, so much as a demonstration of why free speech is valuable and how it is won. Our ability to publish videos and articles freely on-line and use social media to express dissent is allowing the activity at the frontier of free speech to gain resonance. My colleague Kartik Date explained today that the effect of the encampments, met by violence, is to force the viewer to make a decision. Do you support the students, or do you support the forces breaking them? The health and safety technicalities of pitching a tent in a park become insignificant if people are hospitalized with broken ribs.

So what is the technology doing? It’s increasing the velocity of information from the events that demand that we make a decision to the people looking on. Exposure to edge cases makes us, as onlookers, realize that the world does not divide into the categories that we expect. It forces the boundary of right and wrong to curve and swell.

This is why it would not be enough for the Occupy movement to limit itself to innocuous speech like blog posts or op-eds. If it did, it wouldn’t really ‘speak’ to the national audience at all, because its statements would be lost in the steady drone of information we filter out. Yes, there are radicals who think corporations are at fault. Yes, there are radicals who think the government is at fault. So what? This news changes nothing for me.

But when I hear the stories about how students are being hospitalized and demonstrating their peaceful commitment in response, I can’t remain neutral. I am now a supporter. And all because of content shared very rapidly through the web. Multiply this effect, and it’s clear how liberation technology can work to expand a movement.

November 14, 2011

Measuring Occupy Steam

The Economist recently blogged that the Occupy movement may be losing steam, based on the number of posts per day on the We Are the 99% Tumblr blog.

The author explains the appeal of this metric here, arguing that since updating a site is more effortful than using a Twitter hashtag, it is a better indicator of involvement.

While it’s definitely worth making the distinction between on-line buzz and meat activity, using just one web site as an indicator seemed shady to me. Who knows what could be influencing that Tumblr? Maybe it’s just the site that’s lost steam, since by now anybody who is likely to look at it probably (a) has already and (b) gets the point.

What about using a more aggregate measure of how much people care about the Occupy movement? Here’s an easy one to grab: the number of Google searches for ‘occupy’.

You can see spikes corresponding to some major Occupy events:

October 15th, the peak, was Occupy’s Global Day of Action
October 27th, another high, came right after an Oakland occupier got brained by a police tear gas canister.
November 3rd was Oakland’s Occupy-induced general strike
The last little bump on November 10th corresponds to the Colbert coverage of the police brutality on Berkeley’s campus

Yes, searches are in decline. But the numbers suggest that as long as protesters can keep things eventful–by causing an economic ruckus or getting beat up–they will stay on the public radar.
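
For anyone who wants to pick out spikes like these from a downloaded time series rather than by eye, here is a rough sketch. It assumes the data is just a sequence of daily values, and the names `SearchSpikes`, `localPeaks`, and `toySeries` are made up for illustration. The numbers are toy values, not real search figures.

```scala
object SearchSpikes {
  // Indices of strict local maxima: points higher than both of their neighbours.
  // The first and last points are never reported, since each has only one neighbour.
  def localPeaks(series: Seq[Double]): Seq[Int] =
    (1 until series.length - 1).filter { i =>
      series(i) > series(i - 1) && series(i) > series(i + 1)
    }

  def main(args: Array[String]): Unit = {
    // Toy values for illustration only; these are NOT real Google search data.
    val toySeries = Seq(10.0, 40.0, 25.0, 30.0, 20.0, 22.0, 15.0)
    println(localPeaks(toySeries))
  }
}
```

A strict local maximum is a crude definition of a spike. On noisy data you would probably smooth the series first or require a minimum height before matching peaks to events.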

November 4, 2011

To Google Reader users

A lot of friends of mine were avid Google Reader users. For some of them, it was their primary social media tool. They had built a strong community around it. Naturally, they were attached to its user interface, features, and workflows. It was home to them.

Google recently ‘redesigned’ Google Reader in a way that blatantly forced Reader users to adopt Google+ as their social media platform. A lot of Reader devotees are pissed about this. They want their old technology back.

My first response to this is: What did you expect? What made Reader so special? It was just the first of several experiments in social media that Google’s used to edge into Facebook’s market. (Reader, Buzz, Wave, now Google+). Of course, the industry logic is that your community should be dumped onto the newer platform, so that Google can capture the network effects of your participation. Your community is what will make their new technology so valuable to them!

Still not happy?

The problem is that Google Reader was a corporately operated platform, not a community operated one. You may not have known that you had other options. There are a lot of social media communities that have a lot of self-control, Metafilter being a particularly great one. (Incidentally, Ask Metafilter has a good guide to Reader alternatives.) There is also a lot of energy going into open source social media tools.

The most prominent of these is Diaspora, which raised a ridiculous amount of funding on Kickstarter when the New York Times wrote about its being a project. I stopped following it after the first press buzz, but maybe it’s time to start paying attention to it again. Since its community has recently announced that it is not vaporware, I decided to go ahead and join the diasp.org pod.

To my surprise, it’s pretty great! Smooth, intuitive interface, fast enough, seems to have all the bells and whistles you’d want and not a lot of cruft–basically all the stuff I care about on Google+. I’ve got a public profile. Plus, it has great tools for data export in case I want to pick up and move to a different pod.

Looking into it, Diaspora does not yet work as an RSS reader, though there is an open issue for it. A bit of a missed opportunity, IMO. Some other people are building an open-source Reader clone in response, which could more directly solve the Reader problem. Whatever the current technical limitations, though, they can be surmounted by some creative piping between services.

The point that I hope stands is that there is a hidden cost to a community investing in a technical infrastructure when it is being maintained by those that do not value your community. People’s anger at the Reader redesign demonstrates the value of the open source alternatives.

October 29, 2011

Notes on Open Access for academic works

You could read this blog post, or you could watch this YouTube video and get about 50% of the written information.

I attended a meeting last week about Open Access publishing at Berkeley. As is well-known now, most academic publishing is a ruthless industry that stifles innovation by making it expensive to access academic journals. (Never mind for a minute that this industry is only possible because of academia’s unhealthy dependence on these journals as a currency of prestige.) Thankfully, principles of ‘openness’ are swiftly descending on the academy.

Three interesting things came up in the meeting. The first was the existence of hybrid open access journals. These try to bridge the gap between open access journals (which generally allow authors to maintain copyright and make works available on the web) and normal journals by charging authors a premium for making their articles openly available in an otherwise closed journal.

This sounds good for about two seconds until you think about it and realize that the publisher is essentially ransoming the openness of the work, and making the author incur the cost. Often the fees charged by hybrid publishers for openness are exorbitant.

It’s worth noting that open access publishers tend to charge authors for publication as well. Also, in many cases universities or their libraries have started subsidizing their faculty to publish openly. (That makes sense, since it cuts down on library subscription costs!)

The difference between open access and closed publishing, then, appears to be that in the case of open access publishing, authors (or the university they are associated with? unclear) get to maintain copyright. Also, the fees tend to be more reasonable. These may be related. The openness of the content means that publishers don’t reap monopoly profits, so it doesn’t make sense for the OA journal to charge academics for profit lost due to open content. OA journals will run leaner. They will also, in a just world, be more competitive, but that will require a shift in the way academics view prestige as being associated with a journal’s name or ‘impact’.

Which brings me to impact rankings. Since academics need to compete on how good their research is, they need a filter mechanism for demonstrating the value of their work. This has traditionally been benchmarked against journal publication, and in particular which journals one publishes in. Journals are ranked by various estimates of impact factor–who reads it, who cites it, who takes it seriously.

I haven’t looked into it carefully, but I would be willing to bet that the definition of impact factor is viciously circular to the advantage of any existing journal with “high impact.” It is precisely this estimation of “high impact” that gives journals the power to get academics to provide free content (articles) and free labor (editors) and then charge libraries for access to the tune of extraordinary profits.

This is a bad system. The solution, article level metrics (where the use and impact of the work itself, not the impact of the journal in which it is published, is considered what’s valuable–maybe a no-brainer), is being pushed forward by the Public Library of Science, a leading Open Access publisher, but at the time of this writing article level metrics are covered by only the stubbiest of lonesome stubs on Wikipedia.

The other interesting thing I learned was about the growing trend of prestigious universities mandating that faculty publish open access. Harvard, MIT, Princeton, Stanford, and Duke are apparently on board for this already. By the domino logic of academic prestige competition, this means a sea change is afoot.

There are some objections to this trend that are quickly countered. The main one appears to come from the humanities, where there are many small “society-based” journals that use a traditional business model to publish works. In my imagination, these journals are a bit like private poetry magazines, or n+1.

As a result, these university-wide open access mandates come with a strong opt-out clause. Faculty can get permission to publish in a closed way, if they really really want to.

Then why is this a big deal? It turns out that it’s about bargaining power. It’s not that Harvard, MIT, and the rest are no longer publishing in Nature or other big name journals. It’s just that they can negotiate special deals with the major publishers to allow the universities to maintain copyright. With that copyright, they can then publish the works on-line with a university based publishing tool.

What does this mean for other schools? Well, it means that open access journals are going to become more legitimate and traditional journals are in trouble unless they can change their business models. And it means that more and more universities are going to have an easier time using their bargaining power to change the way academic publishing works.

At Berkeley (and I believe this is generally true of other universities) the decision to go open access is a faculty decision, to be made at the Faculty Senate. I didn’t get a sense from the meeting when these meetings take place or how likely the faculty was to take the dive, but I’d like to look into it more.

October 4, 2011

Responding to “Declaration of the Occupation”

I’ve been reserving judgment on Occupy Wall Street. I’ve recently left New York, and so while I know some people close to the action, I am merely a social media voyeur on another coast. I’m friends with both bankers and radicals and don’t see that as a problem. I know that OWS is significant, but is it right?

Thankfully, the NYC General Assembly has released a Declaration of the Occupation of New York City. This seems like a good thing since it deflects the main criticism of the movement: that it’s so unfocused in its intentions that there is nothing to take seriously.

I’m going to pick through this declaration and see what I can find.

As we gather together in solidarity to express a feeling of mass injustice, we must not lose sight of what brought us together. We write so that all people who feel wronged by the corporate forces of the world can know that we are your allies.

Hmm. This is not a great start. They are uncritically invoking this “corporate forces” rhetoric despite the weakness of the idea. Literally, corporations are boring. Lots of do-goody non-profits are corporations.

So there’s a sense that there are some big corporations that are at fault, that have wronged people, but which are they? How does one distinguish the wrongful corporations from the ones that are just going about their business?

As this protest started as an occupation of Wall Street, I would have guessed that it was objecting to the financial services industry in particular. This does not appear to be the case.

As one people, united, we acknowledge the reality: that the future of the human race requires the cooperation of its members; that our system must protect our rights, and upon corruption of that system, it is up to the individuals to protect their own rights, and those of their neighbors; that a democratic government derives its just power from the people,…

Brilliant. I’m on board.

…but corporations do not seek consent to extract wealth from the people and the Earth…

This may be a subtle point, but corporate existence depends on lots of consensual behavior. Consumers consent to consume, employees consent to work.

Generally, our modern capitalism affords consumers an absurdly wide range of consumption options–though this is limited by access to technology, available disposable income, and availability of transportation. No doubt some corporations extort those with more limited options by controlling retail inventories (though the possibility of this control is being increasingly disrupted by technology). But are these consumers the ones driving the ‘corporate forces’ that the Wall Street demonstrators are protesting? Arguably, the more these consumers are vulnerable to extortion, the less they can be the driving market of ‘corporate forces’. So, of the 99% who are purportedly represented by the demonstrators, probably a good number–say, those in the 2%-50% bracket–are consumers who are consensually complicit in the triumph of corporate power (whatever that is).

Labor extortion is a more serious problem. But again, however much corporations depend on a captive labor force, these workers will be exploitable because they are replaceable. Higher skilled workers contributing to a corporation will individually contribute more to corporate success. They will also be more mobile, implying once again consent.

My point is this: blaming ‘corporations’ acting ‘without consent’ for economic problems clouds our individual agency in our choices when participating in the economy. My guess is that most of the people who read this far into this post (all six of you) are to some extent consensually complicit in ‘corporate power.’ It’s worth keeping that in mind, if only because the possibility of progress through traditional political channels is so dismal right now.

…and that no true democracy is attainable when the process is determined by economic power.

I think I agree with this. I’m not sure what a ‘true democracy’ is or whether or not I’d want one. California, my new home, seems to have gutted itself through its referendum system, which is probably the most ‘true democratic’ system out there. Then again, that’s partly due to the influence of economic power over referenda, which seems unavoidable. By this logic, referenda are not truly democratic.

What about true representative democracy? I used to have faith in Clean Elections, a system of public campaign finance reform. The Supreme Court ruled part of this legislation unconstitutional on free speech grounds, which means we are truly hosed.

The most promising thing to come out of the recent protests, in my view, is the possibility of a constitutional convention to fix, among other things, campaign financing and corporate personhood. Lawrence Lessig is involved. It’s cool. It’s quite possibly the Best Policy Outcome of Occupy Wall Street.

Incidentally, this is along the lines of what Seymour Lachman proposed to fix New York State’s broken legislature in Three Men in a Room. Maybe if they really wanted to stick it to Wall Street they could go two for one.

We come to you at a time when corporations, which place profit over people, self-interest over justice, and oppression over equality, run our governments. We have peaceably assembled here, as is our right, to let these facts be known.

These statements are vague and hyperbolic. Corporations are composed of and operate in service of people. They also vary in their evilness. See Michael Porter’s Harvard Business Review article on Creating Shared Value for a view on how boringly mainstream notions of corporate social responsibility have become, and moreover how the creation of social value is once again coming to be seen as an important source of economic value.

None of this is to say that it’s a good thing that economically powerful corporations exert undue influence over our government. But not all corporations are up to these games, and many non-corporate organizations that are just as self-interested, wealth-driven, and oppressive exert political influence. Our political and economic problems are due to a wider, thicker network of power than this Declaration would have you believe.

They have taken our houses through an illegal foreclosure process, despite not having the original mortgage.

I don’t understand this one.

They have taken bailouts from taxpayers with impunity, and continue to give Executives exorbitant bonuses.

Ok, true.

They have perpetuated inequality and discrimination in the workplace based on age, the color of one’s skin, sex, gender identity and sexual orientation.

Ok, yes. However, these inequalities and discriminations are endemic to society at large and are not particular to ‘corporations’. As has been noted, the Occupy Wall Street movement itself has internal problems with race and gender.

They have poisoned the food supply through negligence, and undermined the farming system through monopolization.

Not sure about the poisoning part. But yes, good god yes, monopolized subsidized corporate farming is bad.

Wall Street is a silly place to organize an agribusiness protest, but if this declaration makes anything clear, it is that the “Occupy Wall Street” movement is no longer just about Wall Street.

They have profited off of the torture, confinement, and cruel treatment of countless animals, and actively hide these practices.

Ugh. Animal rights. Now you’ve got me considering lobsters again.

They have continuously sought to strip employees of the right to negotiate for better pay and safer working conditions.

Ok, sure. I mean, some of them. Others have been innovating in the workplace and treating their employees well. Is this clause designed to attract generic labor discontent? Interesting that they have placed an emphasis on working conditions rather than general unemployment, which is probably the bigger issue at the moment. Is it any less logical to blame corporations for unemployment?

They have held students hostage with tens of thousands of dollars of debt on education, which is itself a human right.

I had to think about this one.

Yes, it is terrible that students are hostage with education debt. If I were a student in debt in New York right now, I would be pissed off and in the streets.

But corporations…are they the ones holding students hostage? By what mechanism? Is it because corporations aren’t providing universal college education, or is it because they aren’t employing enough people out of college? Or is this a critique of the companies that provide student loans?

This is one of the most emotionally compelling clauses in the document, to me. Not because it makes any sense, but because it is tragic. Nothing makes it clearer that Occupy Wall Street is made up of, in part, desperate students who have been promised opportunity and are now angry about their prospects for the future. Holding corporations accountable for fixing this future is almost hopeless, but what hope is there?

They have consistently outsourced labor and used that outsourcing as leverage to cut workers’ healthcare and pay.

I’ve heard that outsourcing in a lot of industries is declining, actually. This seems like more general misplaced worker angst. This worries me because it can turn so quickly to anti-immigration rhetoric. (Note that immigration is one issue that this declaration steers very clear of. Why is that?)

They have influenced the courts to achieve the same rights as people, with none of the culpability or responsibility.

Well, at least some of the culpability and responsibility. Limited liability protects shareholders but you can still sue a company.

They have spent millions of dollars on legal teams that look for ways to get them out of contracts in regards to health insurance.

I got nothing.

They have sold our privacy as a commodity.

Sure, yes, ok. Though it’s ironic that the Occupy Wall Street protests have benefited so much from social media like Facebook, which are precisely the corporations selling our privacy. See above note about our responsibility/complicity in the economy.

But, ok, Tumblr is probably not evil. I mean, it’s just so cute.

They have used the military and police force to prevent freedom of the press. They have deliberately declined to recall faulty products endangering lives in pursuit of profit.

Hmmm.

They determine economic policy, despite the catastrophic failures their policies have produced and continue to produce.
They have donated large sums of money to politicians, who are responsible for regulating them.

See above.

They continue to block alternate forms of energy to keep us dependent on oil.

Ok.

They continue to block generic forms of medicine that could save people’s lives or provide relief in order to protect investments that have already turned a substantial profit.

Oh, hey, an intellectual property issue! Glad to see someone snuck this one in!

They have purposely covered up oil spills, accidents, faulty bookkeeping, and inactive ingredients in pursuit of profit.
They purposefully keep people misinformed and fearful through their control of the media.
They have accepted private contracts to murder prisoners even when presented with serious doubts about their guilt.
They have perpetuated colonialism at home and abroad. They have participated in the torture and murder of innocent civilians overseas.
They continue to create weapons of mass destruction in order to receive government contracts. *

These are bad. I’m getting dizzy. The asterisk goes to this footnote: *These grievances are not all-inclusive. Looks like some issues didn’t make the cut.

To the people of the world,

We, the New York City General Assembly occupying Wall Street in Liberty Square, urge you to assert your power.

Exercise your right to peaceably assemble; occupy public space; create a process to address the problems we face, and generate solutions accessible to everyone.

Peaceably assemble, check. Occupy public space, check.

Create a process to address the problems we face…

Oh dear.

There are a hell of a lot of problems.

But I think this may be the secret sauce of the document.

As somebody way outside what’s going on, one of the best pieces for getting insight into what’s going on has been Nathan Schneider’s This is Just Practice article. Nathan’s frustration with social media getting the credit for the movement’s success is interesting. He thinks that it doesn’t do justice to the social connections and dialog that are happening on the ground. Contrast with this article from Tech President saying that it was skilled social media usage that has led the originally tiny movement to grow and gain national attention.

This video from CBC news is also revealing. Drew Hombine, sleepily representing OWS, dismisses the accusation that the movement’s policy goals are unfocused on the grounds that the point is to “build an ideal society in the heart of darkness.” It’s the establishment of a fresh venue for grassroots dialog, not any particular policy position, that’s the goal.

So, what does that add up to? There is a core of the movement that is training itself in grassroots, consensus-based activism. And there is a much larger surrounding network of people watching and legitimizing via social media. That’s powerful. If it actually gets people to think and act differently in society, it doesn’t matter what the legislative policy response is. (Though constitutional reform, if not mangled by special interests in the process, would be a huge plus.)

What could the outcomes be?

One possibility is that it could change the way people engage with the economy as consumers and laborers. If you are living in a tent in a public space and depending on your fellow activists for help, as people in Liberty Square seem to be doing, then you are divesting to some extent from the system that allows for “corporate greed”. If the core group could promote a low-consumption lifestyle or a targeted consumer boycott through its digital social network, that could have a significant economic impact. Similarly, they could make it so socially painful to work for certain kinds of organizations that mobile, high-skilled workers choose to find other jobs in order to avoid social stigma.

So much for the economy. There is the wider phenomenon of the de-legitimation of the government. When asked if OWS was at all related to the Tea Party, Hombine claimed (to my surprise) that yes, it was. “Original Tea Party members — members of the party before it was taken over by corporate influence — are with us.”

That’s consistent with a lot of Tea Party rhetoric, which is vitriolic against the “permanent political class.” To anyone who would like insight into the Tea Party ideology, I’d recommend Codevilla’s The Ruling Class: How They Corrupted America and What We Can Do About It. Like the OWS movement, it defends the Tea Party’s inconsistent political positions on the grounds that what they are rejecting is the entire political system in its current incarnation.

For Codevilla, the Tea Party’s tactic is to use its grassroots appeal within the Country Class (contrasted with the Ruling Class of left-leaning elite university graduates) to elect new representatives that are not part of the permanent political class. Once in the legislature, these representatives can cause nuisance and get demands met.

Interestingly, OWS appears to be in the business of setting up alternative governance and communication structures, and causing a nuisance from without, not from within. This is the anarchist agenda.

Will these tactics of dissent merge as we approach the next election? Some have argued that it will be hard for mainstream groups like labor unions or MoveOn.org to work with the anarchists within OWS. Time will tell.

To all communities that take action and form groups in the spirit of direct democracy, we offer support, documentation, and all of the resources at our disposal.

Join us and make your voices heard!

Godspeed. You’re evolving. If passion about animal rights gets you out of bed and into General Assembly meetings, more power to you. I hope that the dialog within the movement will cause it to work out its internal contradictions and maybe wring out the weak ‘corporate forces’ rhetoric in favor of something more actionable, something that takes more responsibility for the way things are. I’m eager to see where you go.

September 21, 2011

Notes on using the neo4j-scala package, Part 1

Encouraged by the reception of last week’s hacking notes, I’ve decided to keep experimenting with Neo4j and Scala. Taking Michael Hunger’s advice, I’m looking into the neo4j-scala package. My goal is to port my earlier toy program to this library to take advantage of more Scala language features.

These are my notes from stumbling through it. I’m halfway through.

To start with, I had trouble wrangling the dependencies. Spoiled by scripting languages, I’ve been half-assing my way around Maven for years, so I got burned a bit.

What happened was that in earlier messing around in my project, I had installed an earlier version of neo4j-scala from a different github repository. Don’t use that one. At the time of this writing, FaKoD’s version is much more up to date and featureful.

I was getting errors that looked like this:

> [error] error while loading Neo4jWrapper, Scala signature Neo4jWrapper has
> wrong version
> [error]  expected: 5.0
> [error]  found: 4.1 in
> /home/sb/.ivy2/cache/org.neo4j/neo4j-scala/bundles/neo4j-scala-0.9.9-SNAPSHOT.jar(org/neo4j/scala/Neo4jWrapper.class)

The only relevant web pages I could find on this suggested that the problem had to do with having compiled the dependency in a different version of Scala. Since I had the Ubuntu package installed, which is pegged at 2.7.7, this seemed plausible. I went through a lot of flailing to reinstall Scala and rebuild the package, but to no avail.

That wasn’t the problem. Rather, when I asked him about it, FaKoD patiently pointed out that the older library has version 0.9.9-SNAPSHOT, whereas the newer one is version 0.1.0-SNAPSHOT. So, my sbt build configuration file has this line now:

libraryDependencies += "org.neo4j" % "neo4j-scala" % "0.1.0-SNAPSHOT"

Thanks to FaKoD’s walking me through these problems, I stopped getting cryptic errors and could start hacking.

Here’s what I had to start with, copying out of one of the neo4j-scala’s tests:

import org.neo4j.kernel.EmbeddedGraphDatabase
import org.neo4j.graphdb._
import collection.JavaConversions._
import org.neo4j.scala.{EmbeddedGraphDatabaseServiceProvider, Neo4jWrapper}

class Krow extends Neo4jWrapper with EmbeddedGraphDatabaseServiceProvider {
}

Running this in sbt, I get this error:

[error] /home/sb/dev/krow/src/main/scala/Krow.scala:6: class Krow needs to be abstract, /
since method neo4jStoreDir in trait EmbeddedGraphDatabaseServiceProvider /
of type => String is not defined

That’s because EmbeddedGraphDatabaseServiceProvider (this code is written by a German, I gather) has an abstract method that I haven’t defined.

What I find neat is that this is an abstract method–it’s a function that takes no arguments and returns a String. But Scala seems smart enough to allow this to be defined either by a method or, more naturally, by a variable. So, this compiles:

class Krow extends Neo4jWrapper with EmbeddedGraphDatabaseServiceProvider {
  val neo4jStoreDir = "var/graphdb"
}

but so does this:

class Krow extends Neo4jWrapper with EmbeddedGraphDatabaseServiceProvider {
  def neo4jStoreDir = {
    val a = "var/"
    val b = "graphdb"

    a + b
  }
}

(Functions in Scala can be defined by a block of code in curly braces, with the last expression evaluated and returned.)
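To see the val-or-def point without pulling in Neo4j, here is a minimal sketch. StoreDirProvider, ValStore, DefStore and StoreDirDemo are made-up names standing in for the real trait and my Krow class; only the shape of the abstract member is taken from the error message above.

```scala
// Hypothetical stand-in for EmbeddedGraphDatabaseServiceProvider:
// one abstract, parameterless member that returns a String.
trait StoreDirProvider {
  def storeDir: String
}

// Implemented with a val: evaluated once, when the object is constructed.
class ValStore extends StoreDirProvider {
  val storeDir = "var/graphdb"
}

// Implemented with a def: the body runs on every access.
class DefStore extends StoreDirProvider {
  def storeDir = {
    val a = "var/"
    val b = "graphdb"
    a + b // the last expression of the block is the result
  }
}

object StoreDirDemo {
  // Callers cannot tell which way the member was implemented.
  def describe(p: StoreDirProvider): String = "store at " + p.storeDir
}
```

Both classes satisfy the trait, and StoreDirDemo.describe gives the same string for either one.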

Next, I worked on rewriting my toy app, using this unittest as a guide.

Here was the code from my original experiment:

    var first : Node = null
    var second : Node = null

    val neo: GraphDatabaseService = new EmbeddedGraphDatabase("var/graphdb")
    var tx: Transaction = neo.beginTx()

    implicit def string2relationshipType(x: String) = DynamicRelationshipType.withName(x)

    try {
      first = neo.createNode()
      first.setProperty("name","first")

      second = neo.createNode()
      second.setProperty("name","second")

      first.createRelationshipTo(second, "isRelatedTo" : RelationshipType)

      tx.success()
      println("added nodes")
    } catch {
      case e: Exception => println(e)
    } finally {
      tx.finish() // wrap in try, finally   

      println("finished transaction 1")
    }

You could see why I would like it to be more concise. Here’s a first pass on what neo4j-scala let me whittle it down to:

    var first : Node = null
    var second : Node = null

    withTx {
      neo =>
          first = neo.gds.createNode()  
          first.setProperty("name","first")

          second = neo.gds.createNode()
          second.setProperty("name","second")

          first --> "isRelatedTo" --> second
    }

There is a lot of magic going on and it took me a while to get my head around it.

The point of withTx is to wrap around the try/success/finally pattern needed for most Neo4j transactions. Here’s the code for it:

  def withTx[T <: Any](operation: DatabaseService => T): T = {
    val tx = synchronized {
      ds.gds.beginTx
    }
    try {
      val ret = operation(ds)
      tx.success
      return ret
    } finally {
      tx.finish
    }
  }

Coming from years of JavaScript and Python, it was tough getting my head around this type signature. The syntax alone is daunting. But what I think it comes down to is this:

withTx takes a type parameter, T, which can be a subclass (<:) of Any.
It takes an argument, operation, which must be a function from something of type DatabaseService to something of type T.
It returns type T.

In practice, this means that the function can be called in a way that’s agnostic to the return type of its argument. But what is this DatabaseService argument?
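Here is a rough sketch of the same shape with the database faked out, so the type parameter can be seen at work. FakeTx, FakeDb, TxSketch and TxDemo are invented for illustration and are not neo4j-scala classes; unlike the real withTx, this version takes the database as an explicit first argument.

```scala
// FakeTx is a made-up stand-in for Neo4j's Transaction; it only records calls.
class FakeTx {
  var succeeded = false
  var finished = false
  def success(): Unit = { succeeded = true }
  def finish(): Unit = { finished = true }
}

// FakeDb is a made-up stand-in for DatabaseService.
class FakeDb {
  var lastTx: FakeTx = null
  def beginTx(): FakeTx = { lastTx = new FakeTx; lastTx }
}

object TxSketch {
  // Run `operation` against the database and hand back whatever it returns,
  // whatever type T that happens to be.
  def withTx[T](db: FakeDb)(operation: FakeDb => T): T = {
    val tx = db.beginTx()
    try {
      val ret = operation(db)
      tx.success() // only reached if operation did not throw
      ret
    } finally {
      tx.finish() // always runs
    }
  }
}

object TxDemo {
  val db = new FakeDb
  // T is inferred separately at each call site.
  val n: Int = TxSketch.withTx(db) { _ => 42 }
  val s: String = TxSketch.withTx(db) { _ => "hello" }
}
```

The two calls in TxDemo use the same function but get back an Int and a String respectively, which is all the T in the signature is saying.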

In neo4j-scala, DatabaseService is a trait that wraps a Neo4j GraphDatabaseService. Then a GraphDatabaseServiceProvider wraps the DatabaseService. Application code is, as far as I can tell, expected to inherit from both Neo4jWrapper, which handles the syntactic sugar, and a GraphDatabaseServiceProvider that provides the context for the sugar.

Which means that somewhere deep in the structure of our main object there is a DatabaseService that has real ultimate power over the database. withTx will find it for us, but we need to send it an operation that binds to it.

neo4j-scala also provides this helpful method, which operates in the context where that DatabaseService is available:

  def createNode(implicit ds: DatabaseService): Node = ds.gds.createNode

createNode’s argument is implicit and so is plucked otherwise unbidden from its environment. And since Scala lets you call methods that have no arguments without parentheses, we can shorten the code further.

    withTx {
      implicit neo =>
          first = createNode
          first.setProperty("name","first")

          second = createNode
          second.setProperty("name","second")

          println("added nodes")

          // uses neo4j-scala special syntax
          first --> "isRelatedTo" --> second
    }

Notice that I had to put an implicit before neo in this code. When I didn’t, I got this error:

[error] /home/sb/dev/krow/src/main/scala/Krow.scala:23: /
 could not find implicit value for parameter ds: org.neo4j.scala.DatabaseService
[error]           first = createNode

What I think is happening is that in order to make the DatabaseService, neo, available as an implicit argument of the createNode method, we have to mark it as available with the implicit keyword.

See this page for reference:

The actual arguments that are eligible to be passed to an implicit parameter fall into two categories:

* First, eligible are all identifiers x that can be accessed at the point of the method call without a prefix and that denote an implicit definition or an implicit parameter.
* Second, eligible are also all members of companion modules of the implicit parameter’s type that are labeled implicit.
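The first rule can be tried out in isolation. In this sketch Ctx, createLabel and withCtx are hypothetical stand-ins for DatabaseService, createNode and withTx; nothing here comes from neo4j-scala itself.

```scala
// Ctx is a hypothetical stand-in for DatabaseService.
class Ctx(val name: String)

object ImplicitParamSketch {
  // Like createNode: the only parameter list is implicit, so callers may omit it.
  def createLabel(implicit ctx: Ctx): String = "node in " + ctx.name

  // Mirrors withTx: passes a Ctx to the function it is given.
  def withCtx[T](operation: Ctx => T): T = operation(new Ctx("graphdb"))

  // Marking the lambda parameter `implicit` makes it eligible inside the body.
  val viaImplicit: String = withCtx { implicit ctx => createLabel }

  // Without the marker, the argument has to be passed by hand.
  val viaExplicit: String = withCtx { ctx => createLabel(ctx) }
}
```

Dropping the implicit keyword from the first lambda while still writing a bare createLabel should reproduce the "could not find implicit value" error.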

The other interesting thing going on here is this line:

          first --> "isRelatedTo" --> second

This makes a Neo4j relationship between first and second of type “isRelatedTo.”

I have no idea how the code that makes this happen works. Looking at it hurts my head. I think there may be black magic involved.
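For what it is worth, here is one way an arrow syntax like this could be built from an implicit conversion. This is a guess at the general technique, not a reading of the neo4j-scala source, and every name in it (N, Pending, NodeOps, ArrowSketch) is made up.

```scala
import scala.language.implicitConversions

// Made-up stand-in for a node; not a Neo4j type.
class N(val name: String) {
  var edges: List[(String, N)] = Nil
}

// Half-built relationship: the start node and type are known, the end is not.
class Pending(from: N, relType: String) {
  def -->(to: N): N = {
    from.edges = (relType, to) :: from.edges
    to // returning the end node lets a chain continue
  }
}

// Wrapper that adds --> to a node.
class NodeOps(from: N) {
  def -->(relType: String): Pending = new Pending(from, relType)
}

object ArrowSketch {
  // Lets the compiler wrap an N in NodeOps when it sees `-->` called on it.
  implicit def node2ops(n: N): NodeOps = new NodeOps(n)

  val first = new N("first")
  val second = new N("second")

  // Parsed as (first --> "isRelatedTo") --> second.
  first --> "isRelatedTo" --> second
}
```

The trick, if this is anything like the real thing, is that --> is just a method name, and the first arrow returns an intermediate object that knows how to handle the second.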

This has been slow going, since I’m learning as I’m going. I’m not done yet, though. The code I’m converting had some code to do a short traversal between my two nodes, printing their names. I’m going to leave that to Part 2.


September 15, 2011

Neo4j and Scala hacking notes

This week at FOSS4G, though it has nothing in particular to do with geospatial (…yet), I’ve started hacking around with the graph database Neo4j in Scala because I’m convinced both are the future. I’ve had almost no experience with either.

Dwins kindly held my hand through this process. He knows a hell of a lot about Scala and guided me through how some of the language features could help me work with the Neo4j API. In this post, I will try to describe the process and problems we ran into and parrot his explanations.

I wrote some graphical hello world code to test things out in a file called Krow.scala (don’t ask). I’ll walk you through it:

import org.neo4j.kernel.EmbeddedGraphDatabase
import org.neo4j.graphdb._
import collection.JavaConversions._

I wanted to code against an embedded database, rather than code against the Neo4j server, because I have big dreams of combining Neo4j with some other web framework and don’t like having to start and stop databases. So I needed EmbeddedGraphDatabase, which implements the GraphDatabaseService interface, and persists its data to a directory of files.

I’ll talk about the JavaConversions bit later.

object Krow extends Application {

I am a lazy programmer who only bothers to encapsulate things into software architecture at the last minute. I’m also spoiled by Python and JavaScript and intimidated by the idea of code compilation. So initially I wanted to write this as an interpreted script so I wouldn’t have to think about it. But I’ve heard great things about sbt (simple-build-tool) so I figured I’d try it out.

Using sbt was definitely worth it, if only because it is really well documented and starting up my project with it got me back into the mindset of Java development enough to get Dwins to explain Maven repositories to me again. Adding dependencies to an sbt project involves writing Scala itself, which is a nice way to ease into the language.

But running my project in sbt meant I needed a main method on my lame-o script. Ugh. That sounded like too much work for me, and args: Array[String] looks ugly and typing it ruins my day.

Dwins recommended I try using Scala’s Application trait. He explained that this would take code from an object’s body and do some magic to turn it into a main method. Rock on!

Of course, I didn’t bother to check the documentation or anything. Otherwise, I would have seen this:

The Application trait can be used to quickly turn objects into executable programs, but is not recommended.

For a language that is so emphatically Correct in its design, I give a lot of credit to whoever it was that had the balls to include this language feature so that newbs could hang themselves on it. If they hadn’t, I wouldn’t have had to confront hard truths about threading. (That’s foreshadowing)

  println("start")

  val neo: GraphDatabaseService = new EmbeddedGraphDatabase("var/graphdb")

Sweet, a database I don’t have to start and stop on the command line! This var/graphdb directory is made in the directory in which I run the program (for me, using sbt run).

  var tx: Transaction = neo.beginTx()

  var first : Node = null
  var second : Node = null

  try {
    first = neo.createNode()
    first.setProperty("name","first")
    
    second = neo.createNode()
    second.setProperty("name","second")

    first.createRelationshipTo(second, "isRelatedTo")

    tx.success()
  } finally {
    println("finished transaction 1")
  }

What I’m trying to do with this code is make two nodes and a relationship between them. Should be simple.

But it turns out that with Neo4j, all modifications to the database have to be done in a transaction context, and for that you have to do this business of creating a new Transaction:

A programmatically handled transaction. All modifying operations that work with the node space must be wrapped in a transaction. Transactions are thread confined. Transactions can either be handled programmatically, through this interface, or by a container through the Java Transaction API (JTA). The Transaction interface makes handling programmatic transactions easier than using JTA programmatically. Here’s the idiomatic use of programmatic transactions in Neo4j:
 Transaction tx = graphDb.beginTx();
 try
 {
        ... // any operation that works with the node space
     tx.success();
 }
 finally
 {
     tx.finish();
 }
 

No big deal.

This bit of code was a chance for Dwins to show me a Scala feature that makes the language shine. Check out this line:

first.createRelationshipTo(second, "isRelatedTo")

If you check the documentation for this method, you can see that I’m not using it as expected. The Java type signature is:

Relationship createRelationshipTo(Node otherNode, RelationshipType type)

where RelationshipType is a Neo4j concept that’s what it sounds like. I suppose it is important to set these apart from mere Properties for performance on traversals or something. RelationshipTypes can be created dynamically and seem to more or less exist in the ether, but you need to provide them when you create a relationship. All relationships are of a type.

In terms of their data content, though, RelationshipTypes are just wrappers around strings. Rather than doing this wrapping in the same line that I createRelationship, Scala lets me establish a conversion from strings to RelationshipTypes in an elegant way.

You see, I lied. The above code would not have compiled had I not also included this earlier in the object’s definition:

  implicit def string2relationshipType(x: String) = DynamicRelationshipType.withName(x)

This code uses Scala’s implicit conversions to define a conversion between Strings and RelationshipTypes.

DynamicRelationshipType.withName(x) is one of Neo4j’s ways of making a new RelationshipType. Scala’s type inference means that the compiler knows that string2relationshipType returns a RelationshipType.

Since I used the implicit keyword, Scala knows that when a String is used in a method that expects a RelationshipType, it can use this function to convert it on the fly.

Check out all that majesty. Thanks, Scala!
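The same mechanism can be shown without Neo4j on the classpath. In this sketch RelType and relate are made-up stand-ins for RelationshipType and createRelationshipTo; only the conversion pattern is the one described above.

```scala
import scala.language.implicitConversions

// RelType is a made-up stand-in for Neo4j's RelationshipType.
case class RelType(name: String)

object ConversionSketch {
  // Plays the role of createRelationshipTo: it insists on a RelType.
  def relate(from: String, to: String, t: RelType): String =
    from + " -[" + t.name + "]-> " + to

  // The conversion the compiler is allowed to insert on its own.
  implicit def string2relType(x: String): RelType = RelType(x)

  // A String is passed where a RelType is expected, so the compiler
  // rewrites the argument to string2relType("isRelatedTo").
  val described: String = relate("first", "second", "isRelatedTo")
}
```

Remove the implicit keyword and the last line stops compiling with a type mismatch, which is the "I lied" part of the story.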

Ok, so now I want to show that I was actually able to get something into the database. So here’s my node traversal and printing code.

  tx = neo.beginTx()

  try{
    
    val trav : Traverser = first.traverse(Traverser.Order.BREADTH_FIRST,
                                          StopEvaluator.END_OF_GRAPH,
                                          ReturnableEvaluator.ALL,
                                          "isRelatedTo",
                                          Direction.BOTH)

    for(node <- trav){
      println(node.getProperty("name"))
    }
    tx.success()
  } finally {
    tx.finish()
    println("finished transaction 2")
  }

  neo.shutdown()

  println("done")

}

Two observations:

traverse takes a lot of arguments, most of which seem to be these awkwardly specified static variables. I bet there’s a way to use Scala features to wrap that and make it more elegant.
Check out that for loop. Concise syntax that takes an iterator. There’s one catch: Traverser is a java.lang.Iterable, whereas the loop syntax requires a scala.collection.Iterable. Remember that import collection.JavaConversions._ line? That imported an implicit conversion from Java to Scala iterables.
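A small sketch of the Java-to-Scala iterable bridge, using a plain Java list in place of a Traverser. Note the swap: this uses scala.collection.JavaConverters and an explicit .asScala call rather than the implicit JavaConversions import in the post, because JavaConversions was deprecated and then removed in later Scala releases. IterableSketch and its members are illustrative names.

```scala
import scala.collection.JavaConverters._

object IterableSketch {
  // A plain java.lang.Iterable, standing in for Traverser.
  val javaNames: java.lang.Iterable[String] =
    java.util.Arrays.asList("first", "second")

  // asScala wraps the Java iterable so the for syntax can consume it.
  val upper: List[String] =
    (for (name <- javaNames.asScala) yield name.toUpperCase).toList
}
```

With the implicit flavor the .asScala disappears and the loop looks exactly like the one in Krow.scala; the explicit flavor just makes the conversion visible.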

All in all, pretty straightforward stuff, I thought. Here’s what I got when I used sbt to run this project:

> run
[info] Compiling 1 Scala source to /home/sb/dev/krow/target/scala-2.9.1.final/classes...
[warn] there were 1 deprecation warnings; re-run with -deprecation for details
[warn] one warning found
[info] Running Krow 
start
finished transaction 1
finished transaction 2

That’s not what I wanted! Not only did I not get any printed acknowledgement of the nodes that I had made in the database, but the program hangs and doesn’t finish.

What the hell?!

Asking Dwins about it, he tells me sagely about threads. Transactions need to be run in a single thread. The Application trait does a lot of bad stuff with threads. To be technically specific about it, it does…some really bad stuff with threads. I thought I had a handle on it when I started writing this blog post but instead I’m just going to copy/paste from the Application trait docs, which I should have read in the first place.

In practice the Application trait has a number of serious pitfalls:

* Threaded code that references the object will block until static initialization is complete. However, because the entire execution of an object extending Application takes place during static initialization, concurrent code will always deadlock if it must synchronize with the enclosing object.

Oh. Huh. That’s interesting.

It is recommended to use the App trait instead.

Now you’re talking. Let me just change that line to object Krow extends App { and I’ll be cooking in no…

> run
[info] Compiling 1 Scala source to /home/sb/dev/krow/target/scala-2.9.1.final/classes...
[warn] there were 1 deprecation warnings; re-run with -deprecation for details
[warn] one warning found
[info] Running Krow 
start
finished transaction 1
finished transaction 2

…time.

God dammit. There’s something else about App, which runs all the object code at initialization, which is causing a problem I guess. I asked Dwins what he thought.

Too much magic.

I guess I’m going to have to write a main method after all.

After some further messing around with the code, I have something that runs and prints the desired lines.

While the code would compile, I wound up having to explicitly name the RelationshipType type in the calls where I was trying to implicitly convert the strings; otherwise I got exceptions like this:

java.lang.IllegalArgumentException: Expected RelationshipType at var args pos 0, found isRelatedTo

Does that make it an explicit conversion?
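A plausible reading, consistent with the "var args" wording in the exception but not something I have verified against the Neo4j source: when a parameter accepts any object, a String already fits, so the compiler has no reason to apply the conversion. The type ascription ("isRelatedTo" : RelationshipType) is what forces it. In the sketch below RelTag and kinds are made-up stand-ins for RelationshipType and a varargs method.

```scala
import scala.language.implicitConversions

// RelTag is a made-up stand-in for Neo4j's RelationshipType.
case class RelTag(name: String)

object AscriptionSketch {
  implicit def string2relTag(x: String): RelTag = RelTag(x)

  // Like a Java method taking Object...: every argument already type-checks,
  // so no implicit conversion is ever needed to make the call compile.
  def kinds(args: Any*): List[String] =
    args.toList.map {
      case _: RelTag => "RelationshipType"
      case _: String => "String"
      case _         => "other"
    }

  // Passed through untouched as a String.
  val plain: List[String] = kinds("isRelatedTo")

  // The ascription demands a RelTag, so the conversion is applied first.
  val ascribed: List[String] = kinds("isRelatedTo": RelTag)
}
```

So the conversion itself is still the implicit one; the ascription only tells the compiler which type is wanted at that spot.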

Overall, hacking around with this makes me excited about both Scala and Neo4j despite the setbacks and wrangling.

Complete working code appended below.

import org.neo4j.kernel.EmbeddedGraphDatabase
import org.neo4j.graphdb._
import collection.JavaConversions._

object Krow {

  println("start")

  def main(args: Array[String]){

    var first : Node = null
    var second : Node = null

    val neo: GraphDatabaseService = new EmbeddedGraphDatabase("var/graphdb")
    var tx: Transaction = neo.beginTx()
      
    implicit def string2relationshipType(x: String) = DynamicRelationshipType.withName(x)

    try {
      first = neo.createNode()
      first.setProperty("name","first")
    
      second = neo.createNode()
      second.setProperty("name","second")

      first.createRelationshipTo(second, "isRelatedTo" : RelationshipType)
      
      tx.success()
      println("added nodes")
    } catch {
      case e: Exception => println(e)
    } finally {
      tx.finish() // wrap in try, finally   
      
      println("finished transaction 1")
    }

    tx = neo.beginTx()

    try{
    
      val trav : Traverser = first.traverse(Traverser.Order.BREADTH_FIRST,
                                            StopEvaluator.END_OF_GRAPH,
                                            ReturnableEvaluator.ALL,
                                            "isRelatedTo" : RelationshipType,
                                            Direction.BOTH)

      for(node <- trav){
        println(node.getProperty("name"))
      }
      tx.success()
    } finally {
      tx.finish()
      println("finished transaction 2")
    }

    neo.shutdown()

    println("done")
  }
}


September 9, 2011

Open Source is Software’s Labor Movement

Software developers often have high starting salaries, but wind up not getting paid as much as managers and others who work in software. Why is this? I’d argue that it’s not because software development is any less essential or important to the software business. Rather, the reason is that software developers that work for a proprietary company give away the rights to their work to their employer.

No matter how much the proprietary developer is getting paid, they gradually experience employer lock-in. Why is this? They are the natural experts of the software that they write. This expertise is valuable. So, developers get more and more valuable to a company the longer they work for it.

Employers give developers raises, but not all is well. If the proprietary developer leaves their job, they will generally not be able to find as high-paying a job elsewhere, because they will not have the same relevant expertise. That means that there is no incentive for their current employer to pay them for all they contribute. The employer only needs to pay them enough to keep them from quitting.

That means that if you have been working for a long time for a proprietary company, you probably aren’t getting paid enough or getting the benefits you deserve.

Now consider open source developers. By now it is clear that there is a large market for open source companies and freelance consulting. If you are a developer, you should get into that market.

If you are an open source developer, then you get to have the same access to the fruits of your labor as your employer. You are not alienated from your labor, in Marx’s sense of the word. Your employer can never take away the public record of your contributions to an open source project or your standing in the community. And, importantly, the skills you have learned from your hard work are transferable to any other job using the software you have developed.

The result is that even more so than other software developers, open source developers can have their pick of jobs. That means employers have to compete for them even more.

While many companies will shower their developers with perks to keep them on board, savvy open source developers will demand time to do core community work that provides them the intangible social relationships and technical skills that make them more valuable workers–for anyone using their software. And they will demand that new code be vetted through core community processes whenever possible, as opposed to being one off applications.

Proprietary software companies should fear these trends. More and more developers are going to choose open source jobs, so proprietary companies are going to have to pay more for worse programming talent. (This is one of many reasons why proprietary software is doomed.)

Open source companies, on the other hand, should embrace this trend and do their best to satisfy their developers’ demands. Ultimately, these companies’ value will be in the superiority of their technology and their developers. The developers want their software to be excellent, because it is the software and their involvement in it, not the company employing them, that guarantees them employment. In the open source economy, this loyalty is not misplaced. Rather, employing companies need to align themselves with it. In so doing, they will achieve technological superiority and community fame. They may err from this path at their peril.


August 30, 2011

Remixes free, originals not

I’m interested in whether others have experienced this phenomenon: a new pop (or indie pop? I can’t tell the difference any more) song gets some recognition. While it is difficult to find the original song for free on the internet, music blogs post remixes of the song for free downloading.

I’m pretty sure that this is illegal. Whatever. The question is why this is a recurring phenomenon. First thoughts:

For the original song, there is incentive for the recording studio to crack down on distribution, and it seems that for the most part, they do.
For remixes, there is less incentive. Why is that? Here are some possibilities:
- They aren’t going to make any money off the remix anyway, so why bother enforcing access to it?
- The remix is going to drive up sales of the original song by increasing people’s exposure to it, so the recording studio has reason to let the remix run free.

Any other ideas?

According to copyright law, the original copyright holder has rights to derivative works. I’m assuming that most of these remixes are made and distributed without the original copyright holder’s permission, though maybe that’s wrong. Maybe I’m just a sap, believing the myth of the underground digital remix artist, when in fact there are base economic motives in play. There could easily be pseudonymous remix artists who ply their trade in coordination with music studios, making dance remixes of songs more or less to generate an “underground” following.

That wouldn’t be a bad thing, though as a pattern it would have the risk of crowding out original music from the music “market” through free remixes. Obviously, that’s not sustainable, which I suppose is why music blogs seem to have a time limit on how long they keep content up. Suppose: after an incubation period, the underground effect ceases to increase sales because the song has already gone “mainstream.”

If this is how things work, there’s something elegant but also diabolical about this pattern. As a mechanized process, the underground becomes nothing more than a channel through which things emerge. Cool can be a state of being on a gray market fringe, but that fringe is just a flower crafted by a larger organism to attract pollinating bees. Aficionados become part of the ecosystem, rather than advancers of it. Does the system continue to evolve?


August 23, 2011

Academic holy wars and conference acceptances

I’m inspired by Mel Chua’s recent posts about the culture shock of entering academia from open source. I don’t have her humility about it and so am convinced that they are doing a lot of things wrong more or less from the get-go, so I’m more or less looking for problems. That said, one came up at lunch with an old friend who’s finishing up his PhD.

My friend Joe reports that when papers are submitted to conferences, academic holy war disputes will sometimes affect whether they get accepted.

Ok, maybe that doesn’t sound like much of a surprise, but it’s an interesting mechanism.

According to Joe, conference papers are reviewed by attendees. Informally, somebody who gets their paper accepted is required to review 3 or so other papers. Nothing bad there.

However, when there is a “holy war” — a major division within the field about a basic theoretical or methodological issue — these religious persuasions will affect the reviews and lead to some papers being rejected despite what we could suppose to be their objective merits.

Is this bad? Is it any different from the open source process? I think so.

But not because of the dispute itself. There’s got to be some substance to these kinds of theoretical and methodological differences. Gosh, open source is full of divisive holy wars, and in general they are a good thing, since competing camps race to innovate and prove that Python is a better programming language than Ruby, or whatever.

The difference in the academic domain is that these conferences are a bottleneck for publication and accreditation, and the conference process itself is not easily forked. So the paper selection process is not merely curational, in the sense of selecting papers of interest for the attendees. Rather, rejected papers are silenced and discredited.

Some balance has to be struck. There has to be some venue for a real conflict of ideas, because unless you fight the holy war, how can you find out who is right? On the other hand, since individuals’ reputations are tied to the success of their religion, there is an incentive for doctrinaire and skulduggerous rejection of opposing papers without regard to how much these papers contribute to the field.

What’s the solution? We could imagine a more open, web-based unconference system for accepting papers. There could be the same requirement that one has to review other papers in order to get one’s own paper included. Reviews can include rating metadata that affects a paper’s prominence within the conference; reviewers could also be rated (for their comments) to give them additional clout within the community. Then you could track discrepancies in people’s ratings on controversial items to detect where the holy wars are and correct for them statistically when awarding credit.