## Tag: contextual integrity

### Towards a Synthesis of Differential Privacy and Contextual Integrity

At last week’s 3rd Annual Symposium on Applications of Contextual Integrity, there was a lively discussion of a top-of-mind concern for computer scientists seeking to work with Contextual Integrity (CI): how does CI relate to differential privacy (DP)? Yan Shvartzshnaider encouraged me to write up my own comments as a blog post.

## Differential Privacy (DP)

Differential privacy (Dwork, 2006) is a widely studied paradigm of computational privacy. It is a mathematical property of an algorithm or mechanism $A$ operating on a database, which dictates that the output of the mechanism depends only slightly on any one individual data subject’s data. This is most often expressed mathematically as

$Pr[A(D_1) \in S] \leq e^\epsilon \cdot Pr[A(D_2) \in S]$

where $D_1$ and $D_2$ are databases that differ only in the contents of one data point corresponding to a single individual, and $S$ is any set of outputs of the mechanism.

A key motivation for DP is that each individual should, in principle, be indifferent to whether or not they are included in the DP database, because their impact on the result is bounded by a small value, $\epsilon$.
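For intuition, the canonical way to achieve this guarantee for a counting query is the Laplace mechanism. The sketch below is a minimal illustration, not production code (function names are mine, and real deployments need more care, e.g. secure noise sampling):

```python
import random

def laplace_noise(scale):
    # The difference of two i.i.d. exponential draws is Laplace-distributed.
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def dp_count(records, predicate, epsilon):
    """Release a count satisfying epsilon-DP via the Laplace mechanism.

    A counting query has sensitivity 1: adding or removing one person's
    record changes the true count by at most 1, so Laplace noise with
    scale 1/epsilon suffices for the guarantee above.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

Smaller $\epsilon$ means more noise and stronger indifference for each individual; the open question is what value of $\epsilon$ a social context actually warrants.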

There are many, many variations of DP that differ based on assumptions about the generative model of the data set, the privacy threat model, and other ways of relaxing the indifference constraint. However, the technical research on DP is often silent on some key implementation details, such as how to choose the privacy budget $\epsilon$. There are some noteworthy industrial applications of DP, but they may use egregiously high values of $\epsilon$. There are also several reasons to believe that DP is not getting at a socially meaningful sense of privacy, but rather is merely a computationally convenient thing to research and implement.

## Contextual Integrity (CI)

Contextual Integrity (Nissenbaum, 2009/2020) aims to capture what is socially meaningful about privacy. It defines privacy as appropriate information flow, where appropriateness means alignment with norms based in social context. Following Walzer (2008)’s vision of society divided into separate social spheres, CI recognizes that society is differentiated into many contexts, such as education, healthcare, the workplace, and the family, and that each context has different norms about personal information flow that are adapted to that context’s purpose. For example, the broadly understood rules that doctors keep their patients’ medical information confidential, but can share records with the patient’s consent with other medical specialists, are information norms that hold in the context of healthcare. CI provides a template for understanding information norms, parameterized in terms of:

• Sender of the personal information
• Receiver of the personal information
• Subject of the personal information — the “data subject” in legal terms
• The attribute of the data subject that is referred to or described in the personal information.
• The transmission principle, the normative rule governing the conditions under which the above parameterized information flow is (in)appropriate. Examples of transmission principles include reciprocity, confidentiality, and consent.
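These parameters lend themselves to a simple formalization. The sketch below is illustrative only: the names and the toy norm are mine, and the rigorous formal treatment is the temporal logic of Barth et al. (2006). It models a flow as a five-tuple and a norm as a predicate over flows:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class InformationFlow:
    sender: str                  # who sends the personal information
    receiver: str                # who receives it
    subject: str                 # whom the information is about
    attribute: str               # what aspect of the subject it describes
    transmission_principle: str  # the condition governing the flow

def healthcare_referral_norm(flow: InformationFlow) -> bool:
    """A toy norm: physicians may share a patient's medical history
    with a specialist, provided the patient has consented."""
    return (flow.sender == "physician"
            and flow.receiver == "specialist"
            and flow.attribute == "medical_history"
            and flow.transmission_principle == "with_patient_consent")

referral = InformationFlow("physician", "specialist", "patient",
                           "medical_history", "with_patient_consent")
```

A flow that matches no contextual norm is, on this simple reading, a violation of contextual integrity.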

Though CI is a theory based in social, philosophical, and legal theories of privacy, it has had uptake in other disciplines, including computer science. These computer science applications have engaged CI deeply and contributed to it by clarifying the terms and limits of the theory (Benthall et al., 2017).

CI has perhaps been best used by computer scientists thus far as a way of conceptualizing the privacy rules of sectoral regulations such as HIPAA, GLBA, and COPPA (Barth et al., 2006) and commercial privacy policies (Shvartzshnaider et al., 2019). However, a promise of CI is that it can address social expectations that have not yet been codified into legal language, helping to bridge between technical design, social expectation, and legal regulation in new and emerging contexts.

## Bridging Between DP and CI

I believe it’s safe to say that whereas DP has been widely understood and implemented by computer scientists, it has not sufficed as either a theory or practice to meet the complex and nuanced requirements that socially meaningful privacy entails. On the other hand, while CI does a better job of capturing socially meaningful privacy, it has not yet been computationally operationalized in a way that makes it amenable to widespread implementation. The interest at the Symposium in bridging DP and CI was due to a recognition that CI has defined problems worth solving by privacy-oriented computer scientists who would like to build on their deep expertise in DP.

What, then, are the challenges to be addressed by a synthesis of DP and CI? These are just a few conjectures.

Social choice of epsilon. DP is a mathematical theory that leaves open the key question of the choice of privacy budget $\epsilon$. DP researchers would love a socially well-grounded way to choose its numerical value. CI can theoretically provide that social expectation, except for the fact that social norms are generally not expressed with such mathematical precision. Rather, social norms (and legal rules) use less granular terms like confidentiality and consent. A DP/CI synthesis might involve a mapping from natural language privacy rules to numerical values for tuning DP.
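In its crudest imaginable form, such a mapping might look like a lookup table. The values below are pure placeholders of my own invention; no accepted mapping from transmission principles to privacy budgets exists:

```python
# Purely illustrative placeholders: no standard mapping from CI
# transmission principles to DP privacy budgets exists today.
PRINCIPLE_TO_EPSILON = {
    "confidentiality": 0.1,  # strict: output nearly insensitive to any one person
    "consent": 1.0,          # moderate: some individual influence tolerated
    "reciprocity": 2.0,      # looser: mutual disclosure is already expected
}

def budget_for(transmission_principle: str, default: float = 0.5) -> float:
    """Look up a privacy budget for a natural-language privacy rule."""
    return PRINCIPLE_TO_EPSILON.get(transmission_principle, default)
```

The real research problem is justifying such numbers empirically, for instance from surveys of contextual privacy expectations.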

Being explicit about context. DP is attractive precisely because it is a property of a mechanism that does not depend on the system’s context (Tschantz et al., 2020). But this is also its weakness. Key assumptions behind the motivation of DP, such as that the data subjects’ qualities are independent of each other, are wrong in many important privacy contexts. Variations of DP have been developed to, for example, adapt to the fact that genetically or socially related people will have similar data, but the choice of which variant to use should be tailored to the conditions of the social context. CI can inform DP practice by clarifying which contextual conditions matter and how to map these to DP variations.

DP may only address a subset of CI’s transmission principles. The rather open concept of transmission principle in CI does a lot of work for the theory by making it extensible to almost any conceivable privacy norm. Computer scientists may need to accept that DP will only be able to address a subset of CI’s transmission principles: those related to negative rules of personal information flow. Indeed, some have argued that CI’s transmission principles include rules that will always be incompletely auditable from a computer science perspective (Datta et al., 2011). DP scholars may need to accept the limits of DP and see CI as a new frontier.

Horizontal data relations and DP for data governance. Increasingly, legal privacy scholars are becoming skeptical that socially meaningful privacy can be guaranteed to individuals alone. Because any individual’s data can enable an inference that has an effect on others, even those who are not in the data set, privacy may not properly be an individual concern. Rather, as Viljoen (forthcoming) argues, these horizontal relationships between individuals via their data make personal data a democratic concern properly addressed with a broader understanding of collective or institutional data governance. This democratic data approach is quite consistent with CI, which was among the first privacy theories to emphasize the importance of socially understood norms as opposed to privacy as individual “control” of data. DP can no longer rely on its motivating idea that individual indifference to inclusion in a data set is sufficient for normative, socially meaningful privacy. However, some DP scholars have already begun to expand their expertise and address how DP can play a role in data governance (Zhang et al., 2020).

DP and CI are two significant lines of privacy research that have not yet been synthesized effectively. That presents an opportunity for researchers in either subfield to reach across the aisle and build new theoretical and computational tools for socially meaningful privacy. In many ways, CI has worked to understand the socially contextual aspects of privacy, preparing the way for more mathematically oriented DP scholars to operationalize them. However, DP scholars may need to relax some of their assumptions and open their minds to make the most of what CI has to offer computational privacy.

## References

Barth, A., Datta, A., Mitchell, J. C., & Nissenbaum, H. (2006, May). Privacy and contextual integrity: Framework and applications. In 2006 IEEE symposium on security and privacy (S&P’06) (pp. 15-pp). IEEE.

Benthall, S., Gürses, S., & Nissenbaum, H. (2017). Contextual integrity through the lens of computer science. Now Publishers.

Datta, A., Blocki, J., Christin, N., DeYoung, H., Garg, D., Jia, L., Kaynar, D., & Sinha, A. (2011, December). Understanding and protecting privacy: Formal semantics and principled audit mechanisms. In International Conference on Information Systems Security (pp. 1-27). Springer, Berlin, Heidelberg.

Dwork, C. (2006, July). Differential privacy. In International Colloquium on Automata, Languages, and Programming (pp. 1-12). Springer, Berlin, Heidelberg.

Nissenbaum, H. (2020). Privacy in context. Stanford University Press.

Shvartzshnaider, Y., Pavlinovic, Z., Balashankar, A., Wies, T., Subramanian, L., Nissenbaum, H., & Mittal, P. (2019, May). Vaccine: Using contextual integrity for data leakage detection. In The World Wide Web Conference (pp. 1702-1712).

Shvartzshnaider, Y., Apthorpe, N., Feamster, N., & Nissenbaum, H. (2019, October). Going against the (appropriate) flow: A contextual integrity approach to privacy policy analysis. In Proceedings of the AAAI Conference on Human Computation and Crowdsourcing (Vol. 7, No. 1, pp. 162-170).

Tschantz, M. C., Sen, S., & Datta, A. (2020, May). Sok: Differential privacy as a causal property. In 2020 IEEE Symposium on Security and Privacy (SP) (pp. 354-371). IEEE.

Viljoen, S. (forthcoming). Democratic data: A relational theory for data governance. Yale Law Journal.

Walzer, M. (2008). Spheres of justice: A defense of pluralism and equality. Basic books.

Zhang, W., Ohrimenko, O., & Cummings, R. (2020). Attribute Privacy: Framework and Mechanisms. arXiv preprint arXiv:2009.04013.

### Hildebrandt (2013) on double contingency in Parsons and Luhmann

I’ve tried to piece together double contingency before, and am finding myself re-encountering these ideas in several projects. I just now happened on this very succinct account of double contingency in Hildebrandt (2013), which I wanted to reproduce here.

> Parsons was less interested in personal identity than in the construction of social institutions as proxies for the coordination of human interaction. His point is that the uncertainty that is inherent in the double contingency requires the emergence of social structures that develop a certain autonomy and provide a more stable object for the coordination of human interaction. The circularity that comes with the double contingency is thus resolved in the consensus that is consolidated in sociological institutions that are typical for a particular culture. Consensus on the norms and values that regulate human interaction is Parsons’s solution to the problem of double contingency, and thus explains the existence of social institutions. As could be expected, Parsons’s focus on consensus and his urge to resolve the contingency have been criticized for its ‘past-oriented, objectivist and reified concept of culture’, and for its implicitly negative understanding of the double contingency.

This paragraph says a lot: about “the problem” posed by “the double contingency”, about the possibility of a solution through consensus around norms and values, and about the rejection of Parsons. It is striking that in the first pages of this article, Hildebrandt begins by challenging “contextual integrity” as a paradigm for privacy (a nod, if not a direct reference, to Nissenbaum (2009)), astutely pointing out that this paradigm makes privacy a matter of delinking data so that it is not reused across contexts. Nissenbaum’s contextual integrity theory depends rather critically on consensus around norms and values; the appropriateness of information norms is a feature of sociological institutions accountable ultimately to shared values. The aim of Parsons, and to some extent also Nissenbaum, is to remove the contingency by establishing reliable institutions.

The criticism of Parsons as being ‘past-oriented, objectivist and reified’ is striking. It opens the question whether Parsons’s concept of culture is too past-oriented, or whether some cultures, more than others, may be more past-oriented, rigid, or reified. Consider a continuum of sociological institutions ranging from the rigid, formal, bureaucratized, and traditional to the flexible, casual, improvisational, and innovative. One extreme of this continuum is better conceptualized as “past-oriented” than the other. Furthermore, when cultural evolution becomes embedded in infrastructure, that culture is no doubt more “reified” not just conceptually, but actually, via its transformation into durable and material form. That Hildebrandt offers this criticism of Parsons perhaps foreshadows her later work on the problems of smart information communication infrastructure (Hildebrandt, 2015). Smart infrastructure poses, to those with this orientation, a problem in that it reduces double contingency by being, in fact, a reification of sociological institutions.

“Reification” is a pejorative word in sociology. It refers to a kind of ideological category error with unfortunate social consequences. The more positive view of this kind of durable, even material, culture would be found in Habermas, who would locate legitimacy precisely in the process of consensus. For Habermas, the ideals of legitimate consensus through discursively rational communicative action find their imperfect realization in the sociological institution of deliberative democratic law. This is the intellectual inheritor of Kant’s ideal of “perpetual peace”. It is, like the European Union, supposed to be a good thing.

So what about Brexit, so to speak?

Double contingency returns with a vengeance in Luhmann, who famously “debated” Habermas (a more true follower of Parsons), and probably won that debate. Hildebrandt (2013) discusses:

> A more productive understanding of double contingency may come from Luhmann (1995), who takes a broader view of contingency; instead of merely defining it in terms of dependency he points to the different options open to subjects who can never be sure how their actions will be interpreted. The uncertainty presents not merely a problem but also a chance; not merely a constraint but also a measure of freedom. The freedom to act meaningfully is constraint [sic] by earlier interactions, because they indicate how one’s actions have been interpreted in the past and thus may be interpreted in the future. Earlier interactions weave into Luhmann’s (1995) emergent social systems, gaining a measure of autonomy — or resistance — with regard to individual participants. Ultimately, however, social systems are still rooted in double contingency of face-to-face communication. The constraints presented by earlier interactions and their uptake in a social system can be rejected and renegotiated in the process of anticipation. By figuring out how one’s actions are mapped by the other, or by social systems in which one participates, room is created to falsify expectations and to disrupt anticipations. This will not necessarily breed anomy, chaos or anarchy, but may instead provide spaces for contestation, self-definition in defiance of labels provided by the expectations of others, and the beginnings of novel or transformed social institutions. As such, the uncertainty inherent in the double contingency defines human autonomy and human identity as relational and even ephemeral, always requiring vigilance and creative invention in the face of unexpected or unreasonably constraining expectations.

Whereas Nissenbaum’s theory of privacy is “admittedly conservative”, Hildebrandt’s is grounded in a defense of freedom, invention, and transformation. If either Nissenbaum or Hildebrandt were more inclined to contest the other directly, this might be privacy scholarship’s equivalent of the Habermas/Luhmann debate. However, this is unlikely to occur because the two scholars operate in different legal systems, reducing the stakes of the debate.

We must assume that Hildebrandt, in 2013, would have approved of Brexit, the ultimate defiance of labels and expectations against a Habermasian bureaucratic consensus. Perhaps she also, as would be consistent with this view, has misgivings about the extraterritorial enforcement of the GDPR. Or maybe she would prefer a global bureaucratic consensus that agreed with Luhmann; but this is a contradiction. This psychologistic speculation is no doubt unproductive.

What is more productive is the pursuit of a synthesis between these poles. As a liberal society, we would like our allocation of autonomy; we often find ourselves in tension with the bureaucratic systems that, according to rough consensus and running code, are designed to deliver to us our measure of autonomy. Those that overstep their allocation of autonomy, such as those who participated in the most recent Capitol insurrection, are put in prison. Freedom coexists with law and even order in sometimes uncomfortable ways. There are contests; they are often ugly at the time, however much they are glorified retrospectively by their winners as a form of past-oriented validation of the status quo.

References

Hildebrandt, M. (2013). Profile transparency by design?: Re-enabling double contingency. Privacy, due process and the computational turn: The philosophy of law meets the philosophy of technology, 221-46.

Hildebrandt, M. (2015). Smart technologies and the end (s) of law: novel entanglements of law and technology. Edward Elgar Publishing.

Nissenbaum, H. (2009). Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press.

### Privacy of practicing high-level martial artists (BJJ, CI)

Continuing my somewhat lazy “ethnographic” study of Brazilian Jiu Jitsu, an occurrence the other day illustrates something interesting about BJJ that reflects privacy as contextual integrity.

Spencer (2016) has accounted for the changes in martial arts culture, and especially Brazilian Jiu Jitsu, due to the proliferation of video on-line. Social media is now a major vector for skill acquisition in BJJ. It is also, in my gym, part of the social experience. There are a few dedicated accounts on social media platforms that share images and video from practice. There is a group chat where gym members cheer each other on, share BJJ culture (memes, tips), and communicate with the instructors.

Several members have been taking pictures and videos of others in practice and sharing them to the group chat. These are generally met with enthusiastic acclaim and acceptance. The instructors have also been inviting in very experienced (black belt) players for one-off classes. These classes are opportunities for the less experienced folks to see another perspective on the game. Because it is a complex sport, there are a wide variety of styles and in general it is exciting and beneficial to see moves and attitudes of masters besides the ones we normally train with.

After some videos of a new guest instructor were posted to the group chat, one of the permanent instructors (“A”) asked members not to do this:

A: “As a general rule of etiquette, you need permission from a black belt and esp if two black belts are rolling to record them training, be it drilling not [sic] rolling live.”

A: “Whether you post it somewhere or not, you need permission from both to record then [sic] training.”

B: “Heard”

C: “That’s totally fine by me, but im not really sure why…?”

B: “I’m thinking it’s a respect thing.”

A: “Black belt may not want footage of him rolling or training. as a general rule if two black belts are training together it’s not to be recorded unless expressly asked. if they’re teaching, that’s how they pay their bills so you need permission to record them teaching. So either way, you need permission to record a black belt.”

A: “I’m just clarifying for everyone in class on etiquette, and for visiting other schools. Unless told by X, Y, [other gym staff], etc., or given permission at a school you’re visiting, you’re not to record black belts and visiting upper belts while rolling and potentially even just regular training or class. Some schools take it very seriously.”

C: “OK! Totally fine!”

D: “[thumbs up emoji] gots it :)”

D: “totally makes sense”

A few observations on this exchange.

First, there is the intriguing point that for martial arts black belts teaching, their instruction is part of their livelihood. The knowledge of the expert martial arts practitioner is hard-earned and valuable “intellectual property”, and it is exchanged through being observed. Training at a gym with high-rank players is a privilege that lower ranks pay for. The use of video recording has changed the economy of martial arts training. This has in many ways opened up the sport; it also opens up potential opportunities for the black belt in producing training videos.

Second, this is framed as etiquette, not as a legal obligation. I’m not sure what the law would say about recordings in this case. It’s interesting that as a point of etiquette, it applies only to videos of high belt players. Recording low belt players doesn’t seem to be a problem according to the agreement in the discussion. (I personally have asked not to be recorded at one point at the gym when an instructor explicitly asked to be recorded in order to create demo videos. This was out of embarrassment at my own poor skills; I was also feeling badly because I was injured at the time. This sort of consideration does not, it seems, currently operate as privacy etiquette within the BJJ community. Perhaps these norms are currently being negotiated or are otherwise in flux.)

Third, there is a sense in which high rank in BJJ comes with authority and privileges that do not require any justification. The “trainings are livelihood” argument does not apply directly to general practice rolls; the argument is not airtight. There is something else about the authority and gravitas of the black belt that is being preserved here. There is a sense of earned respect. Somehow this translates into a different form of privacy (information flow) norm.

References

Spencer, D. C. (2016). From many masters to many Students: YouTube, Brazilian Jiu Jitsu, and communities of practice. Jomec Journal, (5).

### social structure and the private sector

*The Human Cell*

Academic social scientists leaning towards the public intellectual end of the spectrum love to talk about social norms.

This is perhaps motivated by the fact that these intellectual figures are prominent in the public sphere. The public sphere is where these norms are supposed to solidify, and these intellectuals would like to emphasize their own importance.

I don’t exclude myself from this category of persons. A lot of my work has been about social norms and technology design (Benthall, 2015; Benthall, Gürses and Nissenbaum, 2017).

But I also work in the private sector, and it’s striking how differently things look from that perspective. It’s natural for academics who participate more in the public sphere than the private sector to be biased in their view of social structure. From the perspective of being able to accurately understand what’s going on, you have to think about both at once.

That’s challenging for a lot of reasons, one of which is that the private sector is a lot less transparent than the public sphere. In general, the internals of actors in the private sector are not open to the scrutiny of commentariat onlookers. Information is one of the many resources traded in pairwise interactions; when it is divulged, it is divulged strategically, introducing bias. So it’s hard to get a general picture of the private sector, even though it accounts for a much larger proportion of social structure than the public sphere. In other words, public spheres are highly over-represented in analyses of social structure due to the availability of public data about them. That is worrisome from an analytic perspective.

It’s well worth making the point that the public/private dichotomy is problematic. Contextual integrity theory (Nissenbaum, 2009) argues that modern society is differentiated among many distinct spheres, each bound by its own social norms. Nissenbaum actually has a quite different notion of norm formation from, say, Habermas. For Nissenbaum, norms evolve over social history, but may be implicit. Contrast this with Habermas’s view that norms are the result of communicative rationality, which is an explicit and linguistically mediated process. The public sphere is a big deal for Habermas. Nissenbaum, a scholar of privacy, rejects the idea of the ‘public sphere’ simpliciter. Rather, social spheres self-regulate and privacy, which she defines as appropriate information flow, is maintained when information flows according to these multiple self-regulatory regimes.

I believe Nissenbaum is correct on this point of societal differentiation and norm formation. This nuanced understanding of privacy as the differentiated management of information flow challenges any simplistic notion of the public sphere. Does it challenge a simplistic notion of the private sector?

Naturally, the private sector doesn’t exist in a vacuum. In the modern economy, companies are accountable to the law, especially contract law. They have to pay their taxes. They have to deal with public relations and are regulated as to how they manage information flows internally. Employees can sue their employers, etc. So just as the ‘public sphere’ doesn’t permit a total free-for-all of information flow (some kinds of information flow in public are against social norms!), so too does the ‘private sector’ not involve complete secrecy from the public.

As a hypothesis, we can posit that what makes the private sector different is that the relevant social structures are less open in their relations with each other than they are in the public sphere. We can imagine an autonomous social entity like a biological cell. Internally it may have a lot of interesting structure and organelles. Its membrane prevents this complexity leaking out into the aether, or plasma, or whatever it is that human cells float around in. Indeed, this membrane is necessary for the proper functioning of the organelles, which in turn allows the cell to interact properly with other cells to form a larger organism. Echoes of Francisco Varela.

It’s interesting that this may actually be a quantifiable difference. One way of modeling the difference between the internal and external-facing complexity of an entity is using information theory. The more complex internal state of the entity has higher entropy than the membrane. The fact that the membrane causally mediates interactions between the internals and the environment limits information flow between them; this is captured by the Data Processing Inequality. The lack of information flow between the system internals and externals is quantified as lower mutual information between the two domains. At zero mutual information, the two domains are statistically independent of each other.
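This can be checked numerically. The sketch below is a toy model of my own construction: it treats the membrane M as a noisy channel in a Markov chain X → M → Y and verifies that the environment Y learns no more about the internals X than the membrane itself carries, i.e. I(X;Y) ≤ I(X;M), as the Data Processing Inequality requires:

```python
import math

def mutual_information(joint):
    """I(X;Y) in bits, from a joint distribution {(x, y): probability}."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in joint.items() if p > 0)

def flip_channel(bit, flip_prob):
    """A binary symmetric channel: pass the bit through, flipped with some probability."""
    return {bit: 1 - flip_prob, 1 - bit: flip_prob}

# Markov chain X -> M -> Y: the environment Y observes the internal
# state X only through the membrane M. X is a uniform random bit.
joint_xm, joint_xy = {}, {}
for x in (0, 1):
    for m, pm in flip_channel(x, 0.1).items():          # internals -> membrane
        joint_xm[(x, m)] = joint_xm.get((x, m), 0.0) + 0.5 * pm
        for y, py in flip_channel(m, 0.1).items():      # membrane -> environment
            joint_xy[(x, y)] = joint_xy.get((x, y), 0.0) + 0.5 * pm * py

i_xm = mutual_information(joint_xm)  # information crossing into the membrane
i_xy = mutual_information(joint_xy)  # information reaching the environment
```

Each noisy hop strictly reduces the mutual information, which is the quantitative sense in which the membrane shields internal complexity from the environment.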

I haven’t worked out all the implications of this.

References

Benthall, S. (2015). Designing networked publics for communicative action. In J. Davis & N. Jurgenson (Eds.), Theorizing the Web 2014 [Special issue]. Interface, 1(1).

Benthall, S., Gürses, S., & Nissenbaum, H. (2017). Contextual integrity through the lens of computer science. Foundations and Trends in Privacy and Security, 2(1), 1-69. http://dx.doi.org/10.1561/3300000016

Nissenbaum, H. (2009). Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press.

### Contextual Integrity as a field

There was a nice small gathering of nearby researchers (and one important call-in) working on Contextual Integrity at Princeton’s CITP today. It was a nice opportunity to share what we’ve been working on and make plans for the future.

There was a really nice range of different contributions: systems engineering for privacy policy enforcement, empirical survey work testing contextualized privacy expectations, a proposal for a participatory design approach to identifying privacy norms in marginalized communities, a qualitative study on how children understand privacy, and an analysis of the privacy implications of the Cybersecurity Information Sharing Act, among other work.

What was great is that everybody was on the same page about what we were after: getting a better understanding of what privacy really is, so that we can design policies, educational tools, and technologies that preserve it. For one reason or another, the people in the room had been attracted to Contextual Integrity. Many of us have reservations about the theory in one way or another, but we all see its value and potential.

One note of consensus was that we should try to organize a workshop dedicated specifically to Contextual Integrity, and widening what we accomplished today to bring in more researchers. Today’s meeting was a convenience sample, leaving out a lot of important perspectives.

Another interesting thing that happened today was a general acknowledgment that Contextual Integrity is not a static framework. As a theory, it is subject to change as scholars critique and contribute to it through their empirical and theoretical work. A few of us are excited about the possibility of a Contextual Integrity 2.0, extending the original theory to fill theoretical gaps that have been identified in it.

I’d articulate the aspiration of the meeting today as being about letting Contextual Integrity grow from a framework into a field: a community of people working together to cultivate something, in this case, a kind of knowledge.