Digifesto

Ascendency and overhead in networked ecosystems

Ulanowicz (2000) proposes, in information-theoretic terms, several metrics for ecosystem health, where one models an ecosystem as, for example, a trophic network. Principal among them is ascendency, which is a measure of the extent to which energy flows in the system are predictably structured, weighted by the total energy of the system. He believes that systems tend towards greater ascendency in expectation, and that this is predictive of ecological ‘succession’ (and to some extent ecological fitness). On the other hand, overhead, which is unpredictability (perhaps, inefficiency) in energy flows (“free energy”?), is important for the system’s resiliency towards external shocks.

At least in the papers I’ve read so far, Ulanowicz is not mathematically specific about the mechanism that leads to greater ascendency, though he sketches some explanations. Autocatalytic cycles within the network reinforce their own positive perturbations and mutations, drawing in resources from external sources, crowding out and competing with them. These cycles become agents in themselves, exerting what Ulanowicz suggests is Aristotelian final or formal causal power on the lower level components. In this way, freely floating energy is drawn into structures of increasing magnificence and complexity.
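
To make the two quantities concrete, here is a minimal sketch in Scala. It assumes the usual textbook definitions: ascendency as total throughput times the average mutual information of the flows, and overhead as development capacity minus ascendency. The object name, the method names, and the two toy networks are my own inventions for illustration, not anything taken from Ulanowicz's papers.

```scala
object AscendencySketch {
  // flows(i)(j) is the energy flowing from compartment i to compartment j.
  type Flows = Vector[Vector[Double]]

  def total(t: Flows): Double = t.map(_.sum).sum
  def rowSums(t: Flows): Vector[Double] = t.map(_.sum)
  def colSums(t: Flows): Vector[Double] =
    t.head.indices.map(j => t.map(row => row(j)).sum).toVector

  // A = sum_ij T_ij * log( T_ij * T.. / (T_i. * T_.j) )
  def ascendency(t: Flows): Double = {
    val all = total(t)
    val outflow = rowSums(t)
    val inflow = colSums(t)
    val terms = for {
      i <- t.indices
      j <- t(i).indices
      if t(i)(j) > 0
    } yield t(i)(j) * math.log(t(i)(j) * all / (outflow(i) * inflow(j)))
    terms.sum
  }

  // C = - sum_ij T_ij * log( T_ij / T.. )
  def capacity(t: Flows): Double = {
    val all = total(t)
    val terms = for {
      row <- t
      flow <- row
      if flow > 0
    } yield -flow * math.log(flow / all)
    terms.sum
  }

  // Overhead is whatever part of the capacity is not ascendency.
  def overhead(t: Flows): Double = capacity(t) - ascendency(t)

  // Toy network 1: a perfectly predictable three-compartment cycle.
  val cycle: Flows = Vector(
    Vector(0.0, 1.0, 0.0),
    Vector(0.0, 0.0, 1.0),
    Vector(1.0, 0.0, 0.0))

  // Toy network 2: every compartment feeds every compartment equally.
  val uniform: Flows = Vector.fill(3)(Vector.fill(3)(1.0))
}
```

In the cycle every flow is fully determined by where it starts, so all of the capacity is ascendency and the overhead is zero. In the uniform network knowing the source tells you nothing about the destination, so the ascendency is zero and everything is overhead.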

I’m reminded of Bataille’s The Accursed Share, in which he attempts to account for societal differences and the arc of human history through society’s use of its excess energy. “The sexual act is in time what the tiger is in space,” he says, insightfully. The tiger, as an apex predator, is a flame that clings brilliantly to the less glamorous ecosystem that supports it. That is why we adore tigers. And yet the tiger’s existence is fragile, as it depends on both the efficiency and stability of the rest of its network. When its environment is disturbed, it is the first to suffer.

Ulanowicz cites himself suggesting that a similar framework could be used to analyze computer networks. I have not read his account yet, though I anticipate several difficulties. He suggests that data flows in a computer network are analogous to energy flows within an ecosystem. That has intuitive appeal, but obscures the fact that some data is more valuable than others. A better analogy might be money as a substitute for energy. Or maybe there is a way to reduce both to a common currency, at least for modeling purposes.

Econophysics has been gaining steam, albeit controversially. Without knowing anything about it but based just on statistical hunches, I suspect that this comes down to using more complex models on the super duper complex phenomenon of the economy, and demonstrating their success there. In other words, I’m just guessing that the success of econophysics modeling is due to the greater degrees of freedom in the physics models compared to non-dynamic, structural equilibrium models. However, since ecology models the evolutionary dynamics of multiple competing agents (and systems of those agents), it’s possible that those models could capture quite a bit of what’s really going on and even be a source of strategic insight.

Indeed, economics already has a sense of stable versus unstable equilibria that resonates with the idea of stability of ecological succession. These ideas translate into game theoretic analysis as well. As we do more work with Strategic Bayesian Networks or other constructs to model equilibrium strategies in a networked, multi-agent system, I wonder if we can reproduce Ulanowicz’s results and use his ideas about ascendency (which, I’ve got to say, are extraordinary and profound) to provide insight into the information economy.

I think that will require translating the ecosystem modeling into Judea Pearl’s framework for causal reasoning. Having been indoctrinated in Pearl’s framework in much of my training, I believe that it is general enough to subsume Ulanowicz’s results. But I have some doubt. In some of his later writings Ulanowicz refers explicitly to a “Hegelian dialectic” between order and disorder as a consequence of some of his theories, and between that and his insistence on his departure from mechanistic thinking over the course of his long career, I am worried that he may have transcended what it’s possible to do even with the modeling power of Bayesian networks. The question is: what then? It may be that once one’s work sublimates beyond our ability to model explicitly and intervene strategically, it becomes irrelevant. (I get the sense that in academia, Ulanowicz’s scientific philosophizing is a privilege reserved for someone tenured who, late in their career, is free to make their peace with the world in their own way.) But reading his papers is so exhilarating to me. I’ve had no prior exposure to ecology before this, so his papers are packed with fresh ideas. So while I don’t know how to justify it to any of my mentors or colleagues, I think I just have to keep diving into it when I can, on the side.

@#$%! : variance annotations in Scala’s unsound parameterized types

[error] /home/sb/ischool/cs294/hw3/src/main/scala/TestScript.scala:32: type mismatch;
[error] found : Array[wikilearn.TestScript.parser.Page]
[error] required: Array[wikilearn.WikiParser#Page]
[error] Note: wikilearn.TestScript.parser.Page <: wikilearn.WikiParser#Page, but class Array is invariant in type T.
[error] You may wish to investigate a wildcard type such as `_ <: wikilearn.WikiParser#Page`. (SLS 3.2.10)

wtf, Scala.  You know exactly what I’m trying to do here.

EDIT: I sent a link to the above post to David Winslow. He responded with a crystal clear explanation that was so great I asked him if I could include it here. This is it, below:

It’s a feature, not a bug :) This is actually the specific issue that Dart had in mind when they put this note in the language spec:

The type system is unsound, due to the covariance of generic types. This is a deliberate choice (and undoubtedly controversial). Experience has shown that sound type rules for generics fly in the face of programmer intuition. It is easy for tools to provide a sound type analysis if they choose, which may be useful for tasks like refactoring.

Which of course caused some hubbub among the static typing crowd.

The whole issue comes down to the variance annotations of type parameters. Variance influences how type parameters relate to the subtyping relationships of parameterized types:

Given types A and B, A is a supertype of B
trait Invariant[T] means there is no subtype relationship between Invariant[A] and Invariant[B]. (Either could be used as an Invariant[_] though)
trait Covariant[+T] means Covariant[A] is a supertype of Covariant[B]
trait Contravariant[-T] means Contravariant[A] is a subtype of Contravariant[B].

The basic rule of thumb is that if you produce values of type T, you can be covariant in T, and if you consume values of type U, you can be contravariant in type U. For example, Function1 has two type parameters, the parameter type A and the result type T. It is contravariant in A and covariant in T. An (Any => String) can be used where a (String => Any) is expected, but not the other way around.
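
A minimal sketch of that rule of thumb. Producer and Consumer are made-up traits for illustration (they are not part of the Scala library); the last two lines use the real Function1 variance described above.

```scala
object VarianceSketch {
  // Only hands out values of type T, so it can be covariant in T.
  trait Producer[+T] { def get: T }

  // Only takes in values of type T, so it can be contravariant in T.
  trait Consumer[-T] { def put(t: T): String }

  val stringProducer: Producer[String] = new Producer[String] { def get = "hello" }
  // Covariance: a Producer[String] is usable as a Producer[Any].
  val anyProducer: Producer[Any] = stringProducer

  val anyConsumer: Consumer[Any] = new Consumer[Any] { def put(t: Any) = "got " + t }
  // Contravariance: a Consumer[Any] is usable as a Consumer[String].
  val stringConsumer: Consumer[String] = anyConsumer

  // Function1 is contravariant in its argument and covariant in its result.
  val describe: Any => String = x => "value: " + x
  val widened: String => Any = describe
}
```

Swapping any of these assignments around (for example, assigning anyProducer to a Producer[String]) should be rejected by the compiler.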

So, what about the type parameter for Array[T]? Among other operations, Arrays provide:

class Array[T] {
  def apply(i: Int): T // "producing" a T
  def update(i: Int, t: T): Unit // "consuming" a T
}

When the type parameter appears in both contravariant and covariant positions, the only option is to make it invariant.

Now, it’s interesting to note that in the Java language Arrays are treated as if they are covariant. This means that you can write a Java program that doesn’t use casts, passes the typechecker, and generates a type error at runtime; the body of main() would look like:

String[] strings = new String[1];
Object[] objects = strings;
objects[0] = Integer.valueOf(0); // the runtime error occurs at this step, but even if it didn't: 
System.out.println(strings[0]); // what happens here?

Anyway, the upshot is that immutable collections only use their types in covariant positions (you can get values out, but never insert) so they are much handier. Does your code work better if you replace your usage of Array with Vector? Alternatively, you can always provide the type parameter when you construct your array. Array("") is an Array[String], but Array[AnyRef]("") is an Array[AnyRef].
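
Here is a small sketch of those suggestions, plus the wildcard type the compiler message hinted at. Animal and Cat are hypothetical stand-ins for WikiParser#Page and its subtype; none of this is the original homework code.

```scala
object ArrayInvarianceSketch {
  class Animal { def name: String = "animal" }
  class Cat extends Animal { override def name: String = "cat" }

  // Vector is covariant, so a Vector[Cat] is accepted here.
  def namesOfVector(xs: Vector[Animal]): Vector[String] = xs.map(_.name)

  // Array is invariant, so we ask for "an array of some subtype of Animal".
  def namesOfArray(xs: Array[_ <: Animal]): List[String] = xs.toList.map(_.name)

  val catArray: Array[Cat] = Array(new Cat, new Cat)
  val catVector: Vector[Cat] = Vector(new Cat, new Cat)

  // val animals: Array[Animal] = catArray   // rejected: Array is invariant in T

  val fromArray: List[String] = namesOfArray(catArray)
  val fromVector: Vector[String] = namesOfVector(catVector)

  // Giving the type parameter at construction time also works.
  val widened: Array[Animal] = Array[Animal](new Cat)
}
```

Which of the three to pick is a judgment call: the wildcard keeps the Array, the Vector changes the collection, and the explicit type parameter only helps if you control where the array is built.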

Bash script for converting all .wav files in a directory to .mp3

I’ve been working with music files lately trying to get Steve Morrell’s music online. In the process I’ve had to convert his albums, which I’ve ripped in .wav format, to .mp3.

To accomplish this, I’ve written a short bash script. It requires a number of tricks I wasn’t familiar with and had to look up.

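#!/bin/bash

# Split on newlines only, so file names containing spaces survive the loop.
SAVEIFS=$IFS
IFS=$(echo -en "\n\b")

for file in $(ls *wav)
do
  name=${file%%.wav}
  lame -V0 -h -b 160 --vbr-new "$name.wav" "$name.mp3"
done

# Restore the original field separator.
IFS=$SAVEIFS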

Though it isn’t recommended, I did the for loop on ls because I wanted to limit it to .wav files. But that means the script chokes on file names with spaces unless you swap out the IFS variable.

I used LAME for the conversion.

Hadoop with Scala: hacking notes

I am trying to learn how to use Hadoop. I am trying to learn to program in Scala. I mostly forget how to program in Java. In this post I will take notes on things that come up as I try to get my frickin’ code to compile so I can run a Hadoop job.

There was a brief window in my life when I was becoming a good programmer. It was around the end of my second year as a professional software engineer that I could write original code to accomplish novel tasks.

Since then, the tools and my tasks have changed. For the most part, my coding has been about projects for classes, really just trying to get a basic competence in commodity open tools. So, my “programming” consists largely of cargo-culting code snippets and trying to get them to run in a slightly modified environment.

Right now I’ve got an SBT project; I’m trying to write a MapReduce job in Scala that will compile as a .jar that I can run on Hadoop.

One problem I’m having is there are apparently several different coding patterns for doing this, and several frameworks that are supposed to make my life easier. These include SMR, Shadoop, and Scalding. But since I’m doing this for a class and I actually want to learn something about how Hadoop works, I’m worried about working at too high a level of abstraction.

So I’m somewhat perversely taking the Scala Wordcount example from jweslley’s Shadoop and making it dumber. I.e., not using Shadoop.

One thing that has been confusing as hell is that Hadoop has a Mapper interface and a Mapper class, both with map() functions (1,2), but those functions have different type signatures.

I started working with some other code that used the second map() function. One of the arguments to this function is of type Mapper.Context. I.e., the Context class is a nested member of the Mapper class. Unfortunately, referencing this class within Scala is super hairy. I saw a code snippet that did this:

override def map(key:Object, value:Text, context:Mapper[Object,Text,Text,IntWritable]#Context) = {
    for (t <-  value.toString().split("\\s")) {
      word.set(t)
      context.write(word, one)
    }
  }

But I couldn’t get this thing to compile. Kept getting this awesome error:

type Context is not a member of org.apache.hadoop.mapred.Mapper[java.lang.Object,org.apache.hadoop.io.Text,org.apache.hadoop.io.Text,org.apache.hadoop.io.IntWritable]

Note the gnarliness here. It’s not super clear whether or how Context is parameterized by the type parameters of Mapper. The docs for the Mapper class make it seem like you can refer to Context without type parameterization within the code of the class extending Mapper. But I didn’t see that until I had deleted everything and tried a different track, which was to use the Mapper interface in a class extending MapReduceBase.
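
For what it’s worth, the # in Mapper[...]#Context is Scala’s type projection syntax for a class nested inside another class. A sketch of how it behaves, using hypothetical Outer and Inner classes rather than anything from Hadoop:

```scala
object ProjectionSketch {
  class Outer(val id: String) {
    // Every Outer instance gets its own Inner type.
    class Inner(val label: String) {
      def describe: String = id + "/" + label
    }
  }

  val a = new Outer("a")
  val b = new Outer("b")

  // Type projection: accepts an Inner belonging to any Outer instance.
  def describeAny(inner: Outer#Inner): String = inner.describe

  // Path-dependent type: accepts only Inners created from `a`.
  def describeFromA(inner: a.Inner): String = inner.describe

  val fromA = new a.Inner("x")
  val fromB = new b.Inner("y")

  // describeFromA(fromB) would not compile: b.Inner is not a.Inner.
}
```

So the projection only makes sense when the outer type really has a nested class by that name, which is what the compiler was complaining about above.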

Oddly, this interface hides the Context mechanic and instead introduces the Reporter class as a final argument to map(). I find this less intimidating for some reason. Probably because after years of working in Python and JavaScript my savviness around the Java type hierarchy is both rusty and obsolete. With the added type magicalness of Scala to add complexity to the mix, I think I’ve got to steer towards the dumbest implementation possible. And at the level I’m at, it looks like I don’t ever have to touch or think about this Reporter.

So, starting with the example from Shadoop, now I just need to decode the Scala syntactic sugar that Shadoop provides to figure out what the hell is actually going on.

Consider:

  class Map extends MapReduceBase with Mapper[LongWritable, Text, Text, IntWritable] {

    val one = 1

    def map(key: LongWritable, value: Text, output: OutputCollector[Text, IntWritable], reporter: Reporter) =
      (value split " ") foreach (output collect (_, one))
  }

This is beautifully concise code. But since I want to know something about the underlying program I’m going to uglify it by removing the implicit conversions provided by Shadoop.

The Shadoop page provides a Java equivalent for this, but that’s not really what I want either. For some reason I demand the mildly more concise syntax of Scala over Java but not the kind of condensed, beautiful syntax Scala makes possible with additional libraries.

This compiles at least:

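  import org.apache.hadoop.io.{IntWritable, LongWritable, Text}
  import org.apache.hadoop.mapred.{MapReduceBase, Mapper, OutputCollector, Reporter}

  class Map extends MapReduceBase with Mapper[LongWritable, Text, Text, IntWritable] {

    // Reuse a single Writable for the constant count of 1.
    val one = new IntWritable(1)

    def map(key: LongWritable, value: Text, output: OutputCollector[Text, IntWritable], reporter: Reporter) = {
      val line = value.toString

      // Emit a (word, 1) pair for every space-separated token.
      for (word <- line.split(" ")) {
        output.collect(new Text(word), one)
      }
    }
  }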

What I find a little counterintuitive about this is that the OutputCollector doesn’t act like a dictionary, overwriting the key-value pair with each call to collect(). I guess since I’m making a new Text object with each new entry, that makes sense even if the collector is implemented as a hash map of some kind. (Shadoop hides this mechanism with implicit conversions, which is rad of course.)
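
A plain-Scala sketch of why collect() doesn’t overwrite anything, with no Hadoop involved and with made-up function names. As far as I understand the model, the map step just emits pairs, the framework groups them by key between the two steps, and only then does reduce see all the values for a key:

```scala
object WordCountSketch {
  // Map step: emit a (word, 1) pair for every token; nothing is overwritten.
  def mapPhase(line: String): Seq[(String, Int)] =
    line.split(" ").toSeq.filter(_.nonEmpty).map(word => (word, 1))

  // Shuffle step: group the emitted values by key.
  def shuffle(pairs: Seq[(String, Int)]): Map[String, Seq[Int]] =
    pairs.groupBy(_._1).map { case (key, kvs) => (key, kvs.map(_._2)) }

  // Reduce step: fold all the values for one key into a single count.
  def reducePhase(key: String, values: Seq[Int]): (String, Int) =
    (key, values.reduceLeft(_ + _))

  def wordCount(lines: Seq[String]): Map[String, Int] =
    shuffle(lines.flatMap(mapPhase)).map { case (key, values) => reducePhase(key, values) }
}
```

For example, WordCountSketch.wordCount(Seq("a b a", "b a")) gives Map("a" -> 3, "b" -> 2).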

Next comes the reducer. The Shadoop code is this:

def reduce(key: Text, values: Iterator[IntWritable],
            output: OutputCollector[Text, IntWritable], reporter: Reporter) = {
  val sum = values reduceLeft ((a: Int, b: Int) => a + b)
  output collect (key, sum)
}

Ok, so there’s a problem here. The whole point of using Scala to code a MapReduce job is so that you can use Scala’s built-in reduceLeft function inside the reduce method of the Reducer. Because functional programming is awesome. By which I mean using built-in functions for things like map and reduce operations is awesome. And Scala supports functional programming, in at the very least that sense. And MapReduce as a computing framework is at least analogous to that paradigm in functional programming, and even has the same name. So, OMG.

Point being, no way in hell am I going to budge on this minor aesthetic point in my toy code. Instead, I’m going to brazenly pillage jweslley’s source code for the necessary implicit type conversion.

  implicit def javaIterator2Iterator[A](value: java.util.Iterator[A]) = new Iterator[A] {
    def hasNext = value.hasNext
    def next = value.next
  }

But not the other implicit conversions that would make my life easier. That would be too much.

Unfortunately, I couldn’t get this conversion to work right. Attempting to compile the code gives me the following error:

[error] /home/cc/cs294/sp13/class/cs294-bg/hw3/wikilearn/src/main/scala/testIt/WordCount.scala:33: type mismatch;
[error]  found   : java.util.Iterator[org.apache.hadoop.io.IntWritable]
[error]  required: Iterator[org.apache.hadoop.io.IntWritable]
[error]       val sum = (values : scala.collection.Iterator[IntWritable]).reduceLeft (
[error]                  ^

It beats me why this doesn’t work. In my mental model of how implicit conversion is supposed to work, the java.util.Iterator[IntWritable] should be caught by the parameterized implicit conversion (which I defined within the Object scope) and converted no problemo.

I can’t find any easy explanation of this on-line at the moment. I suspect it’s a scoping issue or a limit to the parameterization of implicit conversions. Or maybe because Iterator is a trait, not a class? Instead I’m going to do the conversion explicitly in the method code.

After fussing around for a bit, I got:

    def reduce(key: Text, values: java.util.Iterator[IntWritable],
      output: OutputCollector[Text, IntWritable], reporter: Reporter) = {
      // Wrap the Java iterator in a Scala one by hand.
      val svals = new scala.collection.Iterator[IntWritable] {
        def hasNext = values.hasNext
        def next = values.next
      }
      val sum = (svals: scala.collection.Iterator[IntWritable]).reduceLeft(
        (a: IntWritable, b: IntWritable) => new IntWritable(a.get() + b.get())
      )
      output collect (key, sum)
    }

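…or, more cleanly, unwrapping each IntWritable to an Int as soon as it is read (which also sidesteps any trouble if Hadoop hands back the same IntWritable instance on every call to next):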

    def reduce(key: Text, values: java.util.Iterator[IntWritable],
      output: OutputCollector[Text, IntWritable], reporter: Reporter) = {

      val svals = new scala.collection.Iterator[Int]{
        def hasNext = values.hasNext
        def next = values.next.get
      }

      val sum = (svals : scala.collection.Iterator[Int]).reduceLeft (
        (a: Int, b: Int) => a + b
      )
      output collect (key, new IntWritable(sum))
    }
  }

I find the Scala syntax for defining the abstract methods of a type inline pretty great (I hadn’t encountered it before). Since Iterator[A] is a trait with abstract members, you define the methods next and hasNext inside the curly braces, which creates an anonymous class on the spot. What an elegant way to let people subclass abstract types in an ad hoc way!
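
The same trick in isolation, as a sketch that runs without Hadoop. The names asScalaIterator and sumAll are made up, and java.lang.Integer stands in for IntWritable:

```scala
object IteratorWrapSketch {
  // An anonymous class: fill in Iterator's two abstract methods in place.
  def asScalaIterator[A](it: java.util.Iterator[A]): Iterator[A] = new Iterator[A] {
    def hasNext: Boolean = it.hasNext
    def next(): A = it.next()
  }

  // Unwrap each boxed value as it is read, then fold with reduceLeft.
  def sumAll(values: java.util.Iterator[java.lang.Integer]): Int =
    asScalaIterator(values).map(_.intValue).reduceLeft(_ + _)

  // A small Java list to try it on.
  def sample(): java.util.Iterator[java.lang.Integer] = {
    val xs = new java.util.ArrayList[java.lang.Integer]()
    xs.add(Integer.valueOf(1))
    xs.add(Integer.valueOf(2))
    xs.add(Integer.valueOf(3))
    xs.iterator()
  }
}
```

Note that reduceLeft throws on an empty iterator, which is fine for a reducer only because it is never handed a key with no values.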

There’s one more compile error I had to bust around. This line was giving me noise:

conf setOutputFormat classOf[TextOutputFormat[_ <: WritableComparable, _ <: Writable]]

It was complaining that WritableComparable needed a type parameter. Not confident I could figure out exactly which parameter to set, I just made the signature tighter.

conf setOutputFormat classOf[TextOutputFormat[Text, IntWritable]]

Only then did I learn that JobConf is a deprecated way of defining jobs. So I rewrote the WordCount object into a class implementing the Tool interface, using this Java snippet as an example to work from. To do that, I had to learn that to write a class that extends two interfaces in Scala, you need to use an “extends X with Y” syntax. Also, for trivial conditionals Scala dispenses with Java’s ternary X ? Y : Z operator in favor of a single line if (X) Y else Z. Though I will miss the evocative use of punctuation in the ternary construct, I’ve got to admit that Scala is keeping it real classy here.
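
Both bits of syntax in miniature. Base, Named, Runner and Job are hypothetical stand-ins (in the Hadoop case the shape would be a class extending Configured with Tool), and the argument check is only an illustration:

```scala
object SyntaxSketch {
  class Base
  trait Named { def name: String }
  trait Runner { def run(args: Array[String]): Int }

  // One superclass after `extends`, then each further interface after `with`.
  class Job extends Base with Named with Runner {
    def name: String = "wordcount"

    // if/else is an expression, so it does the job of Java's X ? Y : Z.
    def run(args: Array[String]): Int = if (args.length == 2) 0 else 2
  }
}
```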

Wait…ok, so I just learned that most of the code I was cargo culting was part of the deprecated coding pattern, which means I now have to switch it over to the new API. I learned this from somebody helpful in the #hadoop IRC channel.

[23:31] <sbenthall> what's the deal with org.apache.hadoop.mapreduce.Mapper and org.apache.hadoop.mapred.Mapper ??
[23:31] <sbenthall> is one of them deprecated?
[23:31] <sbenthall> which should I be using?
[23:32] <QwertyM> sbenthall: Use the new API
[23:32] <QwertyM> sbenthall: (i.e. mapreduce.*) both are currently supported but eventually the (mapred.*) may get deprecated
[23:32] <sbenthall> ok thanks QwertyM
[23:33] <QwertyM> sbenthall: as a reference, HBase uses mapreduce.* APIs completely for its provided MR jobs; and I believe Pig too uses the new APIs
[23:33] <sbenthall> is MapReduceBase part of the old API?
[23:33] <QwertyM> sbenthall: yes, its under the mapred.* package
[23:33] <sbenthall> ok, thanks.

Parachuting into the middle of a project has its drawbacks, but it’s always nice when a helpful community member can get you up to speed. Even if you’re asking near midnight on a Sunday.

Wait. I realize now that I’ve come full circle.

See, I’ve been writing these notes over the course of several days. Only just now am I realizing that I’m now going back to where I started, with the Mapper class that takes the Context parameter that was giving me noise.

Looking back at the original error, it looks like that too was a result of mixing two APIs. So maybe I can now safely shift everything BACK to the new API, drawing heavily on this code.

It occurs to me that this is one of those humbling programming experiences when you discover that the reason why your thing was broken was not the profound complexity of the tool you were working with, but your own stupidity over something trivial. This happens to me all the time.

Thankfully, I can’t ponder that much now, since it’s become clear that the instructional Hadoop cluster on which we’ve been encouraged to do our work is highly unstable. So I’m going to take the bet that it will be more productive for me to work locally, even if that means installing Hadoop locally on my Ubuntu machine.

I thought I was doing pretty good with the installation until I got to the point of hitting the “ON” switch on Hadoop. I got this:

sb@lebensvelt:~$ /usr/local/hadoop/bin/start-all.sh 
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-sb-namenode-lebensvelt.out
localhost: ssh: connect to host localhost port 22: Connection refused
localhost: ssh: connect to host localhost port 22: Connection refused
starting jobtracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-sb-jobtracker-lebensvelt.out
localhost: ssh: connect to host localhost port 22: Connection refused

I googled around and it looks like this problem is due to not having an SSH server running locally. Since I’m running Ubuntu, I went ahead and followed these instructions. In the process I managed to convince my computer that I was undergoing a man-in-the-middle attack between myself and myself.

I fixed that with

$ ssh-keygen -R localhost

and successfully got Hadoop running with

$ /usr/local/hadoop/bin/start-all.sh 

only to be hung up on this error

$ hadoop fs -ls
Warning: $HADOOP_HOME is deprecated.

ls: Cannot access .: No such file or directory.

which somebody who runs an Indian matrimony search engine had run into and documented the fix for. (Right way to spell it is

hadoop fs -ls .

With an extra dot.)

There’s a point to me writing all this out, by the way. An important part of participation in open source software, or the hacker ethic in general, is documenting one’s steps so that others who follow the same paths can benefit from what you’ve gone through. I’m going into a bit more detail about this process than is really helpful because in my academic role I’m dealing with a lot of social scientist types who really don’t know what this kind of work entails. Let’s face it: programming is a widely misunderstood discipline which seems like an utter mystery to those that aren’t deeply involved in it. Much of this has to do with the technical opacity of the work. But another part of why it’s misunderstood is that problem solving in the course of development depends on a vast and counter-intuitive cyberspace of documentation (often generated from code comments, so written by some core developer), random blog posts, chat room conversations, and forum threads. Easily 80% of the work when starting out on a new project like this is wrestling with all the minutiae of configuration on a particular system (operating system and hardware contingent, in many cases) and the idioms of the programming language and environment.

The amount of time it takes to invest in any particular language or toolkit necessarily creates a tribalism among developers because their identities wind up being intertwined with the tools they use. As I hack on this thing, however incompetently, I’m becoming a Scala developer. That’s similar to saying that I’m becoming a German speaker. My conceptual vocabulary, once I’ve learned how to get things done in Scala, is going to be different than it was before. In fact, that’s one of the reasons why I’m insisting on teaching myself Scala in the first place–because I know that it is a conceptually deep and rigorous language which will have something to teach me about the Nature of Things.

Some folks in my department are puzzled at the idea that technical choices in software development might be construed as ethical choices by the developers themselves. Maybe it’s easier to understand that if you see that in choosing a programming language you are in many ways choosing an ontology or theoretical framework through which to conceive of problem-solving. Of course, choice of ontology will influence one’s ethical principles, right?

But I digress.

So I have Hadoop running on my laptop now, and a .jar file that compiles in SBT. So now all I need to do is run the .jar using the hadoop jar command, right?

Nope, not yet…

Exception in thread "main" java.lang.NoClassDefFoundError: scala/ScalaObject

OK, so the problem is that I haven’t included scala-library.jar on my Hadoop runtime classpath.

I solved this by making a symbolic link from the Hadoop /lib directory to the .jar in my Scala installation.

ln -s /usr/local/share/scala-2.9.2/lib/scala-library.jar /usr/local/hadoop/lib/scala-library.jar

That seemed to work, only now I have the most mundane and inscrutable of Java errors to deal with:

Exception in thread "main" java.lang.NullPointerException
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

I had no idea how to proceed from here. This time, no helpful folks in the #hadoop channel helped me either.

So once again I switched to a new hunk of code to work from, this time the WordCount.scala file from Derrick Cheng’s ScalaOnHadoop project. Derrick posted a link to this project on our course’s Piazza earlier, which was awesome.

Another digression. There’s a lot of talk now about on-line education. People within the university context feel vaguely threatened by MOOCs, believing that there will be a superstar effect that advantages first movers, but many are uncomfortable making that first move. Taek Lim in my department is starting to study user interfaces to support collaboration in on-line learning.

My own two cents on this are that the open software model is about as dynamic a collaborative environment as you can get. At the point when people started to use our course discussion management system, Piazza, as a forum to discuss the assignment, almost as if it were an open source mailing list, we started to get a lot more out of it and learned a lot from each other. I’m not the first person to see this potential of the open source model for education, of course. I’m excited to be attending the POSSE workshop, which is about that intersection, this summer. At this rate, it looks like I will be co-teaching a course along these lines in the Fall targeted at the I School Masters students, which is exciting!

So, anyway, I’m patching Derrick Cheng’s code. I’m not working with BIDMat yet so I’m leaving that out of the build, so I remove all references to that and I get the thing to compile…and run! I got somebody else’s Scala WordCount to run!

This seems like such a triumph. It’s taken a lot of tinkering and beating my head against a wall to get this far. (Not all at once–I wrote this post over the course of several days. I realize that it’s a little absurd.)

But wait! I’m not out of the woods yet. I check the output of my MapReduce job with

hadoop fs -cat test-output/part-r-00000

and the end of my output file looks like this:

§	1
§	1
§	1
§	1
§	1
§	1
§	1
§	1
§	1
Æschylus	1
Æsop	1
Æsop	1
Æsop	1
É	1
Élus,_	1
à	1
à	1
à	1
æons	1
æsthetic	1
è	1
è	1
è	1
état_.	1
The	1
The	1

What’s going on here? Well, it looks like I successfully mapped each occurrence of each word in the original text to a key-value pair. But something went wrong in the reduce step, which was supposed to combine all the occurrences into a single count for each word.

That is just fine for me, because I’d rather be using my own Reduce function. Because it uses Scala’s functional reduceLeft, which is sweet! Why even write a Map Reduce job in a functional programming language if you can’t use a built-in language reduce in the Reduce step?

Ok, mine doesn’t work either.

Apparently, the reason for this is that the type signature I’ve been using for the Reducer’s reduce method has been wrong all along. And when that happens, the code compiles but Reducer runs its default reduce function, which is the identity function.
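
Here is a sketch of that failure mode with Hadoop taken out of the picture. BaseReducer is a made-up stand-in for Hadoop’s Reducer, whose new-API reduce method takes a java.lang.Iterable of values rather than the java.util.Iterator used by the old API; everything else is hypothetical:

```scala
object OverrideSketch {
  class BaseReducer {
    // Default behaviour: pass every value through unchanged.
    def reduce(key: String, values: Iterable[Int]): List[Int] = values.toList

    // Stand-in for the framework, which only ever calls the signature above.
    def run(key: String, values: List[Int]): List[Int] = reduce(key, values)
  }

  class Accidental extends BaseReducer {
    // Iterator instead of Iterable: this compiles, but it is a new overload,
    // not an override, so run() never calls it.
    def reduce(key: String, values: Iterator[Int]): List[Int] = List(values.sum)
  }

  class Intended extends BaseReducer {
    // With `override`, the compiler checks that the signature really matches.
    override def reduce(key: String, values: Iterable[Int]): List[Int] = List(values.sum)
  }
}
```

Calling run("the", List(1, 1, 1)) on an Accidental gives back List(1, 1, 1), just like my output file, while an Intended gives List(3). Putting override in front of Accidental’s method would have turned the silent bug into a compile error.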

It’s almost (almost!) as if it would have made more sense to start by just reading the docs and following the instructions.

Now I have edited the map and reduce functions so that they have the right type signatures. To get this right exactly, I looked at a different file. I also tinkered

At last, it works.

Now, at last, I understand how this Context member class works. The problem was that I was trying to use it with the mapred.Mapper class from the old API. So much of my goose chase was due to not reading things carefully enough.

On the other hand, I feel enriched by the whole meandering process. Realizing that my programming faults were mine and not due to the complexity of the tools I was using paradoxically gives me more confidence in my understanding of the tools moving forward. And engaging with the distributed expertise on the subject–through StackOverflow, documentation generated from the original coders, helpful folks on IRC, blog posts, and so on–is much more compelling when one is driven by concrete problem-solving goals, even when those goals are somewhat arbitrary. Had I learned to use Hadoop in a less circuitous way, my understanding would probably be much more brittle. I am integrating new knowledge of Hadoop, Scala, and Java (it’s been a long time) with existing background knowledge. After a good night’s sleep, with any luck it will be part of my lifeworld!

This is the code I wound up with, by the way. I don’t suggest you use it.

the technical political spectrum?

Since the French Revolution, we have had the Left/Right divide in politics.

Probably seven or so years ago, some people got excited about thinking about a two-dimensional political spectrum. There were Economic and Social dimensions. You could be in one of four quadrants: Libertarian, Social Democrat, Totalitarian, or Conservative.

Technology is getting more political and politicized. Have we figured out the spectrum yet?

Because there’s been a lot of noise about their beef, let’s assume as a first pass that O’Reilly and Morozov give us some sense of the space. The problem is that there’s a good chance the “debate” between them is giving off a lot more heat than light, so it’s not clear if there’s a substantive political difference.

Let me try to take a constructive crack at it. I don’t think I’m going to get it right, but I’m curious to know how much this resonates and if others would map things differently.

A two-dimensional representation of the continuum of technical politics, with unscientifically plotted representatives


Some people think that “technology”, by which most people mean technology companies, should be replacing more and more of the functions of government. I think the peer progressives are in this camp, as are the institutionalized nudgers in the UK Conservative party, who would prefer to shrink the state. There’s a fair argument that the “open government” people are trying to shrink government by giving non-state actors the ability to provide services that the state might otherwise provide. Through free flow of information and greater connectivity, we can spur vibrancy in civil society and perfect the market.

Others think that the state needs to have a strong role in regulating technology companies to make sure they don’t abuse their power. There’s a lot of that going around in my department at UC Berkeley. These people see the democratic state as the best representative of citizens’ interests. The FTC and Congress need to help ensure, e.g., people’s privacy. Maybe Morozov is in here somewhere. Monopoly concentrations of technical power are threatening to the public interest; technical platforms should be decentralized and controlled so that politics is not overwhelmed by an illegitimate technocracy.

Another powerful group, the Copyright lobby, is economically threatened by new technology and so wants to restrict its use. Telecom companies would like to effectively meter flow of information. Maybe it’s a stretch, but perhaps we could include the military-industrial complex and its desire to instrument the Web for surveillance purposes in this camp as well. These groups tend to not want technology to change, or to tightly control that technology.

Then there’s the Free Software movement. And Stanford’s Liberation Technology folks, if I understand them correctly. And maybe Anonymous is in here somewhere. Pro-technology, generally skeptical of both state and corporate interests.

So maybe what’s going on is that we have a two-dimensional political space.

In one dimension, we have Centralization versus Decentralization. Richly interconnected platforms managed by an elite with tight arrangements for data sharing, versus a much more loosely connected set of networks where the lines of power are less clear.

In the other dimension, we have Unrestricted versus Controlled. Either the technical organizations should be free to pursue their own interests, or they should be regulated by non- (or at least less) technical political forces, such as the state.

What do you think?

the social intelligence of spotted hyenas

The best thing I did today was stop by for the beginning of Kay Holekamp’s talk on “Social Complexity and the Evolution of Intelligence.”

Her work involves researching spotted hyenas.

Spotted hyenas live in clans of about a hundred hyenas, which contain several matrilineal kinship groups each. Female hyenas have an observable social hierarchy that is caused by and a cause of survival “fitness”. Male hyenas migrate to a different clan before reproducing.

This is very similar to the social structure of certain primates, like baboons.  It is nothing like the social structure of cats and dogs (hyenas are somewhere in between the two, closer to cats.)

What’s interesting about the research is that results about the social cognitive capabilities of primates are, without exception, reproducible in spotted hyenas.

That means that the same capacities for social intelligence have been achieved by multiple species through convergent evolution.

deep thoughts by jack handy

Information transfer just is the coming-into-dependence of two variables, which under the many worlds interpretation of quantum mechanics means the entanglement of the “worlds” of each variable (and, by extension, the networks of causally related variables of which they are a part). Information exchange collapses possibilities.
This holds up whether you take a subjectivist view of reality (and probability–Bayesian probability properly speaking) or an objectivist view. At their (dialectical?) limit, the two “irreconcilable” paradigms converge on a monist metaphysics that is absolutely physical and also ideal. (This was recognized by Hegel, who was way ahead of the game in a lot of ways.) It is the ideality of nature that allows it to be mathematized, though it’s important to note that mathematization does not exclude engagement with nature through other modalities, e.g. the emotional, the narrative, etc.
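One way to make the "coming-into-dependence of two variables" concrete is mutual information, which is zero when two variables are independent and grows as they become dependent. What follows is only a minimal sketch under my own assumptions: the function name `mutual_information` and the toy observations are made up for illustration, and the estimate is the naive plug-in one from observed frequencies.

```python
from collections import Counter
from math import log2

def mutual_information(pairs):
    """Plug-in estimate of I(X;Y) in bits from a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (x, y) pair
    px = Counter(x for x, _ in pairs)      # marginal counts of x
    py = Counter(y for _, y in pairs)      # marginal counts of y
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # Each term compares the joint probability with the product of marginals.
        mi += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi

# Hypothetical toy data: two unrelated coin flips vs. two perfectly coupled ones.
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]
dependent = [(0, 0), (1, 1), (0, 0), (1, 1)]

print(mutual_information(independent))  # 0.0 bits: no information shared
print(mutual_information(dependent))    # 1.0 bit: knowing one fixes the other
```

In the dependent case, half of the four joint possibilities have been ruled out, which is the sense in which information exchange "collapses possibilities".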

This means that characterizing the evolution of networks of information exchange by their physical properties (limits of information capacity of channels, etc.) is something to be embraced to better understand their impact on e.g. socially constructed reality, emic identity construction, etc. What the mathematics provide is a representation of what remains after so many diverse worlds are collapsed.

A similar result, representing a broad consensus, might be attained dialectically, specifically through actual dialog. Whereas the mathematical accounting is likely to lead to reduction to latent variables that may not coincide with the lived experience of participants, a dialectical approach is more likely to result in a synthesis of perspectives at a higher level of abstraction. (Only a confrontation with nature as the embodiment of unconscious constraints is likely to force us to confront latent mechanisms.)

Whether or not such dialectical synthesis will result in a singular convergent truth is unknown, with various ideologies taking positions on the matter as methodological assumptions. Haraway’s feminist epistemology, eschewing rational consensus in favor of interperspectival translation, rejects a convergent (scientific, and she would say masculine) truth. But does this stand up to the simple objection that Haraway’s own claims about truth and method transcend individual perspective, making her guilty of performative contradiction?

Perhaps a deeper problem with the consensus view of truth, which I heard once from David Weinberger, is that the structure of debate may have fractal complexity. The fractal pluralectic can fray into infinite and infinitesimal disagreement at its borders. I’ve come around to agreeing with this view, uncomfortable as it is. However, within the fractal pluralectic we can still locate a convergent perspective based on the network topology of information flow. Some parts of the network are more central and brighter than others.

A critical question is to what extent the darkness and confusion in the dissonant periphery can be included within the perspective of the central, convergent parts of the network. Is there necessarily a Shadow? Without the noise, can there be a signal?

Bay Area Rationalists

There is an interesting thing happening. Let me just try to lay down some facts.

There are a number of organizations in the Bay Area right now up to related things.

  • Machine Intelligence Research Institute (MIRI). Researches the implications of machine intelligence on the world, especially the possibility of super-human general intelligences. Recently changed their name from the Singularity Institute due to the meaninglessness of the term Singularity. I interviewed their Executive Director (CEO?), Luke Muehlhauser, a while back. (I followed up on some of the reasoning there with him here).
  • Center for Applied Rationality (CFAR). Runs workshops training people in rationality, applying cognitive science to life choices. Trying to transition from appearing to pitch a “world-view” to teaching a “martial art” (I’ve sat in on a couple of their meetings). They aim to grow out a large network of people practicing these skills, because they think it will make the world a better place.
  • Leverage Research. A think-tank with an elaborate plan to save the world. Their research puts a lot of emphasis on how to design and market ideologies. I’ve been told that they recently moved to the Bay Area to be closer to CFAR.

Some things seem to connect these groups. First, socially, they all seem to know each other (I just went to a party where a lot of members of each group were represented.) Second, the organizations seem to get the majority of their funding from roughly the same people–Peter Thiel, Luke Nosek, and Jaan Tallinn, all successful tech entrepreneurs turned investors with interest in stuff like transhumanism, the Singularity, and advancing rationality in society. They seem to be employing a considerable number of people to perform research on topics normally ignored in academia and spread an ideology and/or set of epistemic practices. Third, there seems to be a general social affiliation with LessWrong.com; I gather a lot of the members of this community originally networked on that site.

There’s a lot that’s interesting about what’s going on here. A network of startups, research institutions, and training/networking organizations is forming around a cluster of ideas: the psychological and technical advancement of humanity, being smarter, making machines smarter, being rational or making machines to be rational for us. It is as far as I can tell largely off the radar of “mainstream” academic thinking. As a network, it seems concerned with growing to gather into itself effective and connected people. But it’s not drawing from many established bases of effective and connected people (the academic establishment, the government establishment, the finance establishment, “old boys’ networks” per se, etc.) but rather is growing its own base of enthusiasts.

I’ve had a lot of conversations with people in this community now. Some, but not all, would compare what they are doing to the starting of a religion. I think that’s pretty accurate based on what I’ve seen so far. Where I’m from, we’ve always talked about Singularitarianism as “eschatology for nerds”. But here we have all these ideas–the Singularity, “catastrophic risk”, the intellectual and ethical demands of “science”, the potential of immortality through transhumanist medicine, etc.–really motivating people to get together, form a community, advance certain practices and investigations, and proselytize.

I guess what I’m saying is: I don’t think it’s just a joke any more. There is actually a religion starting up around this. Granted, I’m in California now and as far as I can tell there are like sixty religions out here I’ve never heard of (I chalk it up to the low population density and suburban sprawl). But this one has some monetary and intellectual oomph behind it.

Personally, I find this whole gestalt both attractive and concerning. As you might imagine, diversity is not this group’s strong suit. And its intellectual milieu reflects its isolation from the academic mainstream in that it lacks the kind of checks and balances afforded by multidisciplinary politics. Rather, it appears to have more or less declared the superiority of its methodological and ideological assumptions to its satisfaction and convinced itself that it’s ahead of the game. Maybe that’s true, but in my own experience, that’s not how it really works. (I used to share most of the tenets of this rationalist ideology, but have deliberately exposed myself to a lot of other perspectives since then [I think that taking the Bayesian perspective seriously necessitates taking the search for new information very seriously]. Turns out I used to be wrong about a lot of things.)

So if I were to make a prediction, it would go like this. One of these things is going to happen:

  • This group is going to grow to become a powerful but insulated elite with an expanded network and increasingly esoteric practices. An orthodox cabal seizes power where they are able, and isolates itself into certain functional roles within society with a very high standard of living.
  • In order to remain consistent with its own extraordinarily high epistemic standards, this network starts to assimilate other perspectives and points of view in an inclusive way. In the process, it discovers humility and starts to adapt proactively and in a decentralized way, losing its coherence but perhaps becoming a general influence on the preexisting societal institutions rather than a new one.
  • Hybrid models. Priesthood/lay practitioners. Or denominational schism.

There is a good story here, somewhere. If I were a journalist, I would get in on this and publish something about it, just because there is such a great opportunity for sensationalist exploitation.

Spaghetti, meet wall (on public intellectuals, activists in residence, and just another existential crisis of a phd student)

I have a backlog of things I’ve been planning to write about. It’s been a fruitful semester for me, in a number of ways. At the risk of being incoherent, I thought I’d throw some of my spaghetti against the Internet wall. It’s always curious to see what sticks.

One of the most fascinating intellectual exchanges of the past couple months for me was what I’d guess you could call the Morozov/Johnson debate. Except it wasn’t a debate. It was a book, then a book review (probably designed to sell a different book), and a rebuttal. It was fantastic showmanship. I have never felt so much like I was watching a boxing match while reading stuff on the Internet.

But what really made it for me was the side act of Henry Farrell taking Morozov to task. Unlike the others, I’ve met Farrell. He was kind enough to talk to me about his Cognitive Democracy article (which I was excited about) and academia in general (“there are no academic jobs for brilliant generalists”) last summer when I was living in DC. He is very smart, and not showy. What was cool about his exchange with Morozov was that it showed how a debate that wasn’t designed to sell books could still leak out into the public. There’s still a role for the dedicated academic, as a watchdog on public intellectuals who one could argue have to get sloppier to entertain the public.

An intriguing fallout from (or warm up to?) the whole exchange was Morozov casually snarking Nick Grossman’s title, “Activist in Residence” at a VC fund, in a tweet (“Another sign of the coming Apocalypse? Venture capital firms now have “activists in residence”?”), which then triggered some business press congratulating Nick for being “out in the streets”. Small world: I used to work with Nick at OpenPlans, and can vouch for his being a swell guy with an experienced and nuanced view of technology and government. He has done a lot of pioneering, constructive work on open governance applications–just the sort of constructive work a hater like Morozov would hate if he looked into it some. Privately, he’s told me he’s well aware of the potential astroturfing connotations of his title.

I got mixed feelings about all this. I’m suspicious of venture capital for the kind of vague “capital isn’t trustworthy” reasons you pick up in academia. Activism is sexy, lobbyists are not, and so if you can get away with calling your lobbyist an activist in residence then clearly that’s a step up.

But I think there’s something a little more going on here, which has to do with the substance of the debate. As I understand it, the Peer Progressives believe that social and economic progress can happen through bottom-up connectivity supported by platforms that are potentially run for profit. If you’re a VC, you’d want to invest in one of them platforms, because they are The Future. Nevertheless, you believe stuff happens by connecting people “on the ground”, not targeting decision-makers who are high in a hierarchy.

In Connected, Christakis and Fowler (or some book like it, I’ve been reading a lot of them lately and having a hard time keeping track) make the interesting argument that the politics of protesters in the streets and lobbyists aren’t much different. What’s different is the centrality of the actor in the social network of governance. If you know a lot of senators, you’re probably a lobbyist. If you have to hold a sign and shout to have your political opinions heard, then you might be an activist.
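To make "centrality of the actor" a bit less hand-wavy, here is a rough sketch using degree centrality, the simplest of several centrality measures and not necessarily the one the book has in mind. The network below is entirely made up for illustration, and `degree_centrality` is my own helper, not taken from any library.

```python
from collections import defaultdict

def degree_centrality(edges):
    """Fraction of the other nodes each node is directly connected to."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops; they say nothing about who you know
        neighbors[a].add(b)
        neighbors[b].add(a)
    n = len(neighbors)
    return {node: len(adj) / (n - 1) for node, adj in neighbors.items()}

# Hypothetical governance network: who talks to whom.
edges = [
    ("lobbyist", "senator_a"),
    ("lobbyist", "senator_b"),
    ("lobbyist", "senator_c"),
    ("senator_a", "senator_b"),
    ("activist", "senator_c"),
]

scores = degree_centrality(edges)
print(scores["lobbyist"])  # 0.75: directly connected to 3 of the 4 other actors
print(scores["activist"])  # 0.25: connected to only 1 of the 4 other actors
```

On this toy graph the two actors could hold identical politics; what separates them is only how many decision-makers they can reach directly.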

I wonder who Nick talks to. Is he schmoozing with the Big Players? Or is he networking the base and trying to spur coordinated action on the periphery? I really have no idea. But if it were the latter, maybe that would give credibility to his title.

Another difference between activists and lobbyists is their authenticity. I have no doubt that Nick believes what he writes and advocates for. I do wonder how much he restrains himself based on his employers’ interests. What would prove he was an activist, not a lobbyist, would be if he were given a longer leash and allowed to speak out on controversial issues in a public way.

I’m mulling over all of this because I’m discovering in grad school that as an academic, you have to pick an audience. Are you targeting your work at other academics? At the public? At the press? At the government? At industry? At the end of the day, you’re writing something and you want somebody else to read it. If I’m lucky, I’ll be able to build something and get some people to use it, but that’s an ambitious thing to attempt when you’re mainly working alone.

So far some of my most rewarding experiences writing in academia have been blogging. It doesn’t impress anybody important but a traffic spike can make you feel like you’re on to something. I’ve been in a world of open work for a long time, and just throwing the spaghetti and trying to see what sticks has worked well for me in the past.

But if you try to steer yourself deeper into the network, the stakes get higher. Things get more competitive. Institutions are more calcified and bureaucratic and harder to navigate. You got to work to get anywhere. As it should be.

Dang, I forgot where I was going with this.

Maybe that’s the problem.

twitter’s liberal bias

Pew Research Center recently put out a report on Twitter’s liberal bias. It argues that overall sentiment on certain political events on Twitter is far to the left of national surveys. Also, people on Twitter complain a lot (are more negative). That makes sense, because Twitter has only 13% of the country even looking at it, and only 3% of the population posting or retweeting. And many of these people are younger.
