Tag: context

“Context, Causality, and Information Flow: Implications for Privacy Engineering, Security, and Data Economics” <– My dissertation

In the last two weeks, I’ve completed, presented, and filed my dissertation, and commenced as a doctor of philosophy. In a word, I’ve PhinisheD!

The title of my dissertation is attention-grabbing, inviting, provocative, and impressive:

“Context, Causality, and Information Flow: Implications for Privacy Engineering, Security, and Data Economics”

If you’re reading this, you are probably wondering, “How can I drop everything and start reading that hot dissertation right now?”

Look no further: here is a link to the PDF.

You can also check out this slide deck from my “defense”. It covers the highlights.

I’ll be blogging about this material as I break it out into more digestible forms over time. For now, I’m obviously honored by any interest anybody takes in this work and happy to answer questions about it.

Nissenbaum the functionalist

Today in Classics we discussed Helen Nissenbaum’s Privacy in Context.

Most striking to me is that Nissenbaum’s privacy framework, contextual integrity theory, depends critically on a functionalist sociological view. A context is defined by its information norms and violations of those norms are judged according to their (non)accordance with the purposes and values of the context. So, for example, the purposes of an educational institution determine what are appropriate information norms within it, and what departures from those norms constitute privacy violations.

I used to think teleology was dead in the sciences. But recently I learned that it is commonplace in biology and popular in ecology. Today I learned that what amounts to a State Philosopher in the U.S. (Nissenbaum’s framework has been more or less adopted by the FTC) maintains a teleological view of social institutions. Fascinating! Even more fascinating that this philosophy corresponds well enough to American law as to be informative of it.

From a “pure” philosophy perspective (which is, I will admit, simply a vice of mine), it’s interesting to contrast Nissenbaum with…oh, Horkheimer again. Nissenbaum sees ethical behavior (around privacy at least) as being behavior that is in accord with the purpose of one’s context. Morality is given by the system. For Horkheimer, the problem is that the system’s purposes subsume the interests of the individual, who alone is the agent able to determine what is right and wrong. Horkheimer is a founder of the Frankfurt School, arguably the intellectual ancestor of progressivism. Nissenbaum grounds her work in Burke and her theory is admittedly conservative. Privacy is violated when people’s expectations of privacy are violated–this is coming from U.S. law–and that means people’s contextual expectations carry more weight than an individual’s free-minded beliefs.

The tension could be resolved when free individuals determine the purpose of the systems they participate in. Indeed, Nissenbaum quotes Burke in his approval of established conventions as being the result of accreted wisdom and rationale of past generations. The system is the way it is because it was chosen. (Or, perhaps, because it survived.)

Since Horkheimer’s objection to “the system” is that he believes instrumentality has run amok, thereby causing the system to serve a purpose nobody intended for it, his view is not inconsistent with Nissenbaum’s. Nissenbaum, building on Dworkin, sees contextual legitimacy as depending on some kind of political legitimacy.

The crux of the problem is the question of what information norms comprise the context in which political legitimacy is formed, and what purpose this context or system serves.

Privacy, trust, context, and legitimate peripheral participation

Privacy is important. For Nissenbaum, what’s essential to privacy is control over context. But what is context?

Using Luhmann’s framework of social systems–ignoring for a moment e.g. Habermas’ criticism and accepting the naturalized, systems theoretic understanding of society–we would have to see a context as a subsystem of the total social system. In so far as the social system is constituted by many acts of communication–let’s visualize this as a network of agents, whose edges are acts of communication–then a context is something preserved by configurations of agents and the way they interact.

Some of the forces that shape a social system will be exogenous. A river dividing two cities or, more abstractly, distance. In the digital domain, the barriers of interoperability between one virtual community infrastructure and another.

But others will be endogenous, formed from the social interactions themselves. An example is the gradual deepening of trust between agents based on a history of communication. Perhaps early conversations are formal, stilted. Later, an agent takes a risk, sharing something more personal–more private? It is reciprocated. Slowly, a trust bond, an evinced sharing of interests and mutual investment, becomes the foundation of cooperation. The Prisoner’s Dilemma is solved the old-fashioned way.

Following Carey’s logic that communication as mere transmission, when sustained over time, becomes communication as ritual and the foundation of community, we can look at this slow process of trust formation as one of the ways that a context, in Nissenbaum’s sense, perhaps, forms. If Anne and Betsy have mutually internalized each other’s interests, then information flow between them will by and large support the interests of the pair, and Betsy will have little incentive to reveal private information in a way that would be detrimental to Anne.

Of course this is a huge oversimplification in lots of ways. One way is that it does not take into account the way the same agent may participate in many social roles or contexts. Communication is not a single edge from one agent to another in many circumstances. Perhaps the situation is better represented as a hypergraph. One reason why this whole domain may be so difficult to reason about is the sheer representational complexity of modeling the situation. It may require the kind of mathematical sophistication used by quantum physicists. Why not?
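To make the hypergraph idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the agent names beyond Anne and Betsy, the context labels, and the helper functions are all hypothetical, and it models nothing more than "an act of communication is a set of participants tagged with a context."

```python
from collections import defaultdict

# Hypothetical data: each communication act is a hyperedge, i.e. a set of
# participating agents, labeled with the context in which it occurs.
acts = [
    ("friendship", {"Anne", "Betsy"}),
    ("friendship", {"Anne", "Betsy"}),
    ("mailing_list", {"Anne", "Carl", "Dana"}),
    ("mailing_list", {"Betsy", "Carl"}),
]


def contexts_by_agent(acts):
    """Map each agent to the set of contexts they take part in."""
    roles = defaultdict(set)
    for context, participants in acts:
        for agent in participants:
            roles[agent].add(context)
    return dict(roles)


def shared_history(acts, a, b):
    """Count, per context, the communication acts that include both agents."""
    counts = defaultdict(int)
    for context, participants in acts:
        if a in participants and b in participants:
            counts[context] += 1
    return dict(counts)


roles = contexts_by_agent(acts)
history = shared_history(acts, "Anne", "Betsy")
```

Even this toy version shows the two complications at once: the same agent shows up in several contexts, and a single act can involve more than two agents, which an ordinary graph edge cannot express.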

Not having that kind of insight into the problem yet, I will continue to sling what the social scientists call ‘theory’. Let’s talk about an existing community of practice, where the practice is a certain kind of communication. A community of scholars. A community of software developers. Weird Twitter. A backchannel mailing list coordinating a political campaign. A church.

According to Lave and Wenger, the way newcomers gradually become members and oldtimers of a community of practice is legitimate peripheral participation. This is consistent with the model described above characterizing the growth of trust through gradually deepening communication. Peripheral participation is low-risk. In an open source context, this might be as simple as writing a question to the mailing list or filing a bug report. Over time, the agent displays good faith and competence. (I’m disappointed to read just now that Wenger ultimately abandoned this model in favor of a theory of dualities. Is that a Hail Mary for empirical content for the theory? Also interested to follow links on this topic to a citation of von Krogh 1998, whose later work found its way onto my Open Collaboration and Peer Production syllabus. It’s a small world.

As I write this, I’ve begun reading a fascinating paper by Hildreth and Kimble (2002) and have now lost my thread. Can I recover?)

Some questions:

  • Can this process of context-formation be characterized empirically through an analysis of e.g. the timing dynamics of communication (cf. Thomas Maillart’s work)? If so, what does that tell us about the design of information systems for privacy?
  • What about illegitimate peripheral participation? Arguably, this blog is that kind of participation–it participates in a form of informal, unendorsed quasi-scholarship. It is a tool of context and disciplinary collapse. Is that a kind of violation of privacy? Why not?

notes on innovation in journalism

I’ve spent the better part of the past week thinking hard about journalism. This is due largely to two projects: further investigation into Weird Twitter, and consulting work I’ve been doing with the Center for Investigative Reporting. Journalism, the trope goes, is a presently disrupted industry. It’s fair to say it’s a growing research interest for me. So here’s the rundown on where things seem to be at.

Probably the most rewarding thing to come out of the fundamentally pointless task of studying Weird Twitter, besides hilarity, is getting a better sense of the digital journalism community. I’ve owed Ethnography Matters a part 2 for a while, and it seems like the meatiest bone to pick is still on the subject of the attention economy. The @horse_ebooks/Buzzfeed connection drives that nail in deeper.

I find content farming pretty depressing and only got more depressed reading Dylan Love’s review of MobileWorks that he crowdsourced to crowdworkers using MobileWorks. I mean, can you think of a more dystopian world than one in which the press is dominated by mercenary crowdworkers pulling together plausible-sounding articles out of nowhere for the highest bidder?

I was feeling like the world was going to hell until somebody told me about Oximity, which is a citizen journalist platform, as opposed to a viral advertising platform. Naturally, this has a different flavor to it, though it is less monetized/usable/populated. Hmm.

I spend too much time on the Internet. That was obvious when attending CIR’s Dissection:Impact events on Wednesday and Thursday. CIR is a foundation-funded non-profit that actually goes and investigates things like prisons, migrant farm workers, and rehab clinics. The people there really turned my view of things around, as I realized that there are still people out there dedicated to using journalism to do good in the world.

There were three interesting presentations with divergent themes.

One was a presentation of ConText, a natural language and network processing toolkit for analyzing the discussion around media. It was led by Jana Diesner at the I School at Urbana-Champaign. Her dissertation work was on covert network analysis to detect white collar criminals. They have a thoroughly researched impact model, and the software, though currently unusable by humans, combines best practices in text and network analysis. They intend to release it as an open source academic tool for researchers.

Another was a presentation by Harmony Institute, which has high-profile clients like MTV. Their lead designer walked us through a series of compelling mockups of ImpactSpace, an impact analysis tool that shows the discussion around an issue as “constellations” through different “solar systems” of ideas. Their project promises to identify how one can frame a story to target swing viewers. But they were not specific about how they would get and process the data. They intend to make demos of their service available on-line, and market it as a product.

The third presentation was by CIR itself, which has hired a political science post-doc to come up with an analysis framework. They focused on a story, “Rape in the Fields”, about sexual abuse of migrant farm workers. These people tend not to be on Twitter, but the story was a huge success on Univision. Drawing mainly on qualitative data, the framework considers “micro”, “meso”, and “macro” impact. Micro interactions might be eager calls to the original journalist for more information, or powerful anecdotes of how somebody hurt had felt healed when they were able to tell their story to the world.

Each team has their disciplinary bias and their own strengths and weaknesses. But they are tackling the same problem: trying to evaluate the effectiveness of media. They know that data is powerful: CIR uses it all the time to find stories. They will sift through a large data set, look for anomalies, and then carefully investigate. But even when collaborative science, including “data science” components, is effectively used to do external facing research, the story gets more difficult, intellectually and politically, when it turns that kind of thinking reflexively on itself.

I think this story sounds a lot like the story of what’s happening in Berkeley. A disrupted research organization struggles to understand its role in a changing world under pressure to adapt to data that seems both ubiquitous and impoverished.
Does this make you buy into the connection between universities and journalism?

If it does, then I can tell you another story about how software ties in. If not, then I’ve got deeper problems.

There is an operational tie: D-Lab and CIR have been in conversation about how to join forces. With the dissolution of disciplines, investigative reporting is looking more and more like social science. But it’s the journalists who are masters of distribution and engagement. What can we learn about the impact of social science research from journalists? And how might the two be better operationally linked?

The New School sent some folks to the Dissection event to talk about the Open Journalism program they are starting soon.

I asked somebody at CIR what he thought about Buzzfeed. He explained that it’s the same business model as HuffPo–funding real journalism with the revenue from the crappy clickbait. I hope that’s true. I wonder if they would suffer as a business if they only put out clickbait. Is good journalism anything other than clickbait for the narrow segment of the population that has expensive taste in news?

The most interesting conversation I had was with Mike Corey at CIR, who explained that there are always lots of great stories, but that the problem was that newspapers don’t have space to run all the stories; they are an information bottleneck. I found this striking because I don’t get my media from newspapers any more, and it revealed that the shifting of the journalism ecosystem is still underway. Thinking this through…

In the old model, a newspaper (or radio show, or TV show) had a limited budget to distribute information, and so competed for prestige with creativity and curatorial prowess. Naturally they targeted different audiences, but there was more at stake in deciding what to and what not to report. (The unintentional past tense here just goes to show where I am in time, I guess.)

With web publishing, everybody can blog or tweet. What’s newsworthy is what gets sifted through and picked up. Moreover, this can be done experimentally on a larger scale than…ah, interesting. Ok, so individual reporters wind up building a social media presence that is effectively a mini-newspaper and…oh dear.

One of the interesting phrases that came out of the discussion at the Dissection event was “self-commodification”–the tendency of journalists to need to brand themselves as products, artists, performers. Watching journalists on Twitter is striking partly because of how these constraints affect their behavior.

Putting it another way: what if newspapers had unlimited paper on which to print things? How would they decide to sort and distribute information? This is effectively what Gawker, Buzzfeed, Techcrunch, and all the rest of the web press are up to. Hell, it’s what the Wall Street Journal is up to, as older, more prestigious brands are pressured to compete. This causes the much-lamented decline in the quality of journalism.

Ok, ok, so what does any of this mean? For society, for business. What is the equilibrium state?