Digifesto

Tag: privacy

Hildebrandt (2013) on double contingency in Parsons and Luhmann

I’ve tried to piece together double contingency before, and am finding myself re-encountering these ideas in several projects. I just now happened on this very succinct account of double contingency in Hildebrandt (2013), which I wanted to reproduce here.

Parsons was less interested in personal identity than in the construction of social institutions as proxies for the coordination of human interaction. His point is that the uncertainty that is inherent in the double contingency requires the emergence of social structures that develop a certain autonomy and provide a more stable object for the coordination of human interaction. The circularity that comes with the double contingency is thus resolved in the consensus that is consolidated in sociological institutions that are typical for a particular culture. Consensus on the norms and values that regulate human interaction is Parsons’s solution to the problem of double contingency, and thus explains the existence of social institutions. As could be expected, Parsons’s focus on consensus and his urge to resolve the contingency have been criticized for its ‘past-oriented, objectivist and reified concept of culture’, and for its implicitly negative understanding of the double contingency.

This paragraph says a lot: about “the problem” posed by “the double contingency”, about the possibility of a solution through consensus around norms and values, and about the rejection of Parsons. It is striking that in the first pages of this article, Hildebrandt begins by challenging “contextual integrity” as a paradigm for privacy (a nod, if not a direct reference, to Nissenbaum (2009)), astutely pointing out that this paradigm makes privacy a matter of delinking data so that it is not reused across contexts. Nissenbaum’s contextual integrity theory depends rather critically on consensus around norms and values; the appropriateness of information norms is a feature of sociological institutions accountable ultimately to shared values. The aim of Parsons, and to some extent also Nissenbaum, is to remove the contingency by establishing reliable institutions.

The criticism of Parsons as being ‘past-oriented, objectivist and reified’ is striking. It opens the question of whether Parsons’s concept of culture is too past-oriented, or whether some cultures, more than others, are more past-oriented, rigid, or reified. Consider a continuum of sociological institutions ranging from the rigid, formal, bureaucratized, and traditional to the flexible, casual, improvisational, and innovative. One extreme of this continuum is better conceptualized as “past-oriented” than the other. Furthermore, when cultural evolution becomes embedded in infrastructure, that culture is no doubt more “reified”, not just conceptually but actually, via its transformation into durable and material form. That Hildebrandt offers this criticism of Parsons perhaps foreshadows her later work on the problems of smart information communication infrastructure (Hildebrandt, 2015). Smart infrastructure poses, to those with this orientation, a problem in that it reduces double contingency by being, in fact, a reification of sociological institutions.

“Reification” is a pejorative word in sociology. It refers to a kind of ideological category error with unfortunate social consequences. The more positive view of this kind of durable, even material, culture would be found in Habermas, who would locate legitimacy precisely in the process of consensus. For Habermas, the ideal of legitimate consensus through discursively rational communicative action finds its imperfect realization in the sociological institution of deliberative democratic law. This is the intellectual inheritor of Kant’s ideal of “perpetual peace”. It is, like the European Union, supposed to be a good thing.

So what about Brexit, so to speak?

Double contingency returns with a vengeance in Luhmann, who famously “debated” Habermas (a truer follower of Parsons), and probably won that debate. Hildebrandt (2013) elaborates:

A more productive understanding of double contingency may come from Luhmann (1995), who takes a broader view of contingency; instead of merely defining it in terms of dependency he points to the different options open to subjects who can never be sure how their actions will be interpreted. The uncertainty presents not merely a problem but also a chance; not merely a constraint but also a measure of freedom. The freedom to act meaningfully is constraint [sic] by earlier interactions, because they indicate how one’s actions have been interpreted in the past and thus may be interpreted in the future. Earlier interactions weave into Luhmann’s (1995) emergent social systems, gaining a measure of autonomy — or resistance — with regard to individual participants. Ultimately, however, social systems are still rooted in double contingency of face-to-face communication. The constraints presented by earlier interactions and their uptake in a social system can be rejected and renegotiated in the process of anticipation. By figuring out how one’s actions are mapped by the other, or by social systems in which one participates, room is created to falsify expectations and to disrupt anticipations. This will not necessarily breed anomy, chaos or anarchy, but may instead provide spaces for contestation, self-definition in defiance of labels provided by the expectations of others, and the beginnings of novel or transformed social institutions. As such, the uncertainty inherent in the double contingency defines human autonomy and human identity as relational and even ephemeral, always requiring vigilance and creative invention in the face of unexpected or unreasonably constraining expectations.

Whereas Nissenbaum’s theory of privacy is “admittedly conservative”, Hildebrandt’s is grounded in a defense of freedom, invention, and transformation. If Nissenbaum and Hildebrandt were more inclined to contest each other directly, this might be privacy scholarship’s equivalent of the Habermas/Luhmann debate. However, this is unlikely to occur because the two scholars operate in different legal systems, reducing the stakes of the debate.

We must assume that Hildebrandt, in 2013, would have approved of Brexit, the ultimate defiance of labels and expectations against a Habermasian bureaucratic consensus. Perhaps she also, as would be consistent with this view, has misgivings about the extraterritorial enforcement of the GDPR. Or maybe she would prefer a global bureaucratic consensus that agreed with Luhmann; but this is a contradiction. This psychologistic speculation is no doubt unproductive.

What is more productive is the pursuit of a synthesis between these poles. As a liberal society, we would like our allocation of autonomy; we often find ourselves in tension with the bureaucratic systems that, according to rough consensus and running code, are designed to deliver to us our measure of autonomy. Those who overstep their allocation of autonomy, such as those who participated in the most recent Capitol insurrection, are put in prison. Freedom coexists with law and even order in sometimes uncomfortable ways. There are contests; they are often ugly at the time, however much they are glorified retrospectively by their winners as a form of past-oriented validation of the status quo.

References

Hildebrandt, M. (2013). Profile transparency by design?: Re-enabling double contingency. Privacy, due process and the computational turn: The philosophy of law meets the philosophy of technology, 221-46.

Hildebrandt, M. (2015). Smart technologies and the end(s) of law: Novel entanglements of law and technology. Edward Elgar Publishing.

Nissenbaum, H. (2009). Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press.

Autonomy as link between privacy and cybersecurity

A key aspect of the European approach to privacy and data protection regulation is that it is rooted in the idea of an individual’s autonomy. Unlike the American view of privacy, which suggests that privacy is important only because it implies some kind of substantive harm (such as reputational loss or discrimination), European law understands that personal data matters because of its relevance to a person’s self-control.

Autonomy etymologically is “self-law”. It is traditionally associated with the concept of rationality and the ability to commit oneself to duty. My colleague Jake Goldenfein argues that autonomy is the principle that one has the power to express one’s own narrative about oneself, and for that narrative to have power. Uninterpretable and unaccountable surveillance, “nudging”, manipulation, profiling, social sorting, and so on are all in a sense an attack on autonomy. They interfere with the individual’s capacity to self-rule.

It is rarer to connect the idea of autonomy to cybersecurity, though here the etymology of the words also weighs in favor of it. Cyber- has its root in the Greek kybernetes: steersman, governor, pilot, or rudder. To be secure means to be free from threat. So cybersecurity for a person or organization is the freedom of their self-control (their steering) from external threat. Cybersecurity is the condition of being free to control oneself, that is, to be autonomous.

Understood in this way, privacy is just one kind of cybersecurity: the cybersecurity of the individual person. We can speak additionally of the cybersecurity of an infrastructure, such as a power grid, or of an organization, such as a bank, or of a device, such as a smartphone. What both the privacy and cybersecurity discussions implicate are questions of the ontology of the entities involved and their ability to control themselves and control each other.

The diverging philosophical roots of U.S. and E.U. privacy regimes

For those in the privacy scholarship community, there is an awkward truth that European data protection law is going in a different direction from U.S. federal privacy law. A thorough realpolitical analysis of how the current U.S. regime regarding personal data has been constructed over time to advantage large technology companies can be found in Cohen’s Between Truth and Power (2019). There is, to be sure, a corresponding story to be told about EU data protection law.

Adjacent, somehow, to the operations of political power are the normative arguments leveraged both in the U.S. and in Europe for their respective regimes. Legal scholarship, however remote from actual policy change, remains a form of moral inquiry. It is possible, still, that through the professional training of lawyers and policy-makers, some form of ethical imperative can take root. Democratic interventions into the operations of power, while unlikely, are still in principle possible: but only if education stays true to principle and does not succumb to mere ideology.

This is not easy for educational institutions to accomplish. Higher education certainly is vulnerable to politics. A stark example of this was the purging of Marxist intellectuals from American academic institutions under McCarthyism. Intellectual diversity in the United States has suffered ever since. However, this was only possible because Marxism as a philosophical movement is extraneous to the legal structure of the United States. It was never embedded at a legal level in U.S. institutions.

There is a simple historical reason for this. The U.S. legal system was founded under a different set of philosophical principles, and that philosophical lineage still impacts us today. The Founding Fathers were primarily influenced by John Locke. Locke rose to prominence in Britain when the Whigs, a new bourgeois class of Parliamentarian merchant leaders, rose to power, contesting the earlier monarchy. Locke’s political contributions were a treatise pointing out the absurdity of the Divine Right of Kings, the prevailing political ideology of the time, and a second treatise arguing for a natural right to property based on the appropriation of nature. This latter political philosophy was very well aligned with Britain’s new national project of colonialist expansion. With the founding of the United States, it was enshrined into the Constitution. The liberal system of rights that we enjoy in the U.S. is founded in the Lockean tradition.

Intellectual progress in Europe did not halt with Locke. Locke’s ideas were taken up by David Hume, who introduced arguments so agitating that they famously woke Immanuel Kant, in Germany, from his “dogmatic slumber”, leading him to develop a new, highly systematic account of morality and epistemology. Among the innovations in this work was the idea that human freedom is grounded in the dignity of being an autonomous person. The source of dignity is not based in a natural process such as the tilling of land. It is rather based on transcendental facts about what it means to be human. The key to morality is treating people as ends, not means; in other words, not using people as tools to other aims, but as aims in themselves.

If this sounds overly lofty to an American audience, it’s because this philosophical tradition has never taken hold in American education. In both the United Kingdom and the United States, Kantian philosophy has always been outside the mainstream. The tradition of Locke, through Hume, has continued on in what philosophers call “analytic philosophy”. This philosophy has taken on the empiricist view that the only source of knowledge is individual experience. It has transformed over centuries but continues to orbit around the individual and their rights, grounded in pragmatic considerations, and learning normative rules using the case-by-case approach of Common Law.

From Kant, a different “continental philosophy” tradition produced Hegel, who produced Marx. We can trace from Kant’s original arguments about how morality is based on the transcendental dignity of the individual to the moralistic critique that Marx made against capitalism. Capitalism, Marx argued, impugns the dignity of labor because it treats it like a means, not an end. No such argument could take root in a Lockean system, because Lockean ethics has no such prescription against treating others instrumentally.

Germany lost its way at the start of the 20th century. But the post-war regime, funded by the Marshall Plan and directed by U.S. constitutional scholars as well as repatriated German intellectuals, had the opportunity to rewrite the country’s system of governance. They did so along Kantian lines: with statutory law, reflecting a priori rational inquiry, instead of empiricist Common Law. They were able to enshrine into their system the Kantian basis of ethics, with its focus on autonomy.

Many of the intellectuals influencing the creation of the new German state were “Marxist” in the loose sense that they were educated in the German continental intellectual tradition which, at that time, included Marx as one of its key figures. By the mid-20th century they had naturally surpassed this ideological view. However, as a consequence, the McCarthyist attack on Marxism had the effect of also purging some of the philosophical connection between German and U.S. legal education. Kantian notions of autonomy are still quite foreign to American jurisprudence. Legal arguments in the United States draw instead on a vast collection of other tools based on a much older and more piecemeal way of establishing rights. But are any of these tools up to the task of protecting human dignity?

The EU is very much influenced by Germany and the German legal system. The EU has the Kantian autonomy ethic at the heart of its conception of human rights. This philosophical commitment has recently expressed itself in the EU’s assertion of data protection law through the GDPR, whose transnational enforcement clauses have brought this centuries-old philosophical fight into contemporary debate in jurisdictions whose legal systems predate the neo-Kantian innovations of the Continental states.

The puzzle facing American legal scholars is this: while industry advocates and representatives tend to object to the strength of the GDPR, arguing that it is unworkable and/or based on poorly defined principles, the data protections that it offers seem so far to be compelling to users, and the shifting expectations around privacy it has in part induced are having effects on democratic outcomes (such as the CCPA). American legal scholars now have to try to make sense of the GDPR’s rules and find a normative basis for them. How can these expansive ideas of data protection, which some have had the audacity to argue constitute a new right (Hildebrandt, 2015), be grafted onto the Common Law, empiricist legal system in a way that gives them the legitimacy of being an authentically American project? Is there a way to explain data protection law that does not require the transcendental philosophical apparatus which, if adopted, would force the American mind to reconsider in a fundamental way the relationship between individuals and the collective, labor and capital, and other cornerstones of American ideology?

There may or may not be. Time will tell. My own view is that the corporate powers, which flourished under the Lockean judicial system because of the weaknesses in that philosophical model of the individual and her rights, will instinctively fight what is in fact a threatening conception of the person as autonomous by virtue of their transcendental similarity with other people. American corporate power will not bother to make a philosophical case at all; it will operate in the domain of realpolitik so well documented by Cohen. Even if this is so, it is notable that so much intellectual and economic energy is now being exerted in the friction around so powerful an idea.

References

Cohen, J. E. (2019). Between Truth and Power: The Legal Constructions of Informational Capitalism. Oxford University Press, USA.

Hildebrandt, M. (2015). Smart technologies and the end(s) of law: Novel entanglements of law and technology. Edward Elgar Publishing.

search engines and authoritarian threats

I’ve been intrigued by Daniel Griffin’s tweets lately, which have been about situating some upcoming work of his and Deirdre Mulligan’s regarding the experience of using search engines. There is a lively discussion about the experience of those searching for information and the way they respond to misinformation or extremism that they discover through organic use of search engines and media recommendation systems. This is apparently how the concern around “fake news” has developed in the HCI and STS world since it became an issue shortly after the 2016 election.

I do not have much to add to this discussion directly. Consumer misuse of search engines is, to me, analogous to consumer misuse of other forms of print media. I would assume the best solution to it is education in the complete sense, and the problems with the U.S. education system are, despite all good intentions, not HCI problems.

Wearing my privacy researcher hat, however, I have become interested in a different aspect of search engines and the politics around them, one that is less obvious to the consumer and therefore less popularly discussed, but that I fear is more pernicious precisely because it is not part of the general imaginary around search. This is the tracking of search engine activity, and what it means for this activity to be in the hands not just of benevolent organizations such as Google, but also of malevolent organizations such as Bizarro World Google*.

Here is the scenario, so to speak: for whatever reason, we begin to see ourselves in a more adversarial relationship with search engines. I mean “search engine” here in the broad sense, including Siri, Alexa, Google News, YouTube, Bing, Baidu, Yandex, and all the more minor search engines embedded in web services and appliances that do something more focused than crawl the whole web. By “search engine” I mean the entire UX paradigm of the query into the vast unknown of semantic and semiotic space that contemporary information access depends on. In all these cases, the user is at a systematic disadvantage in the sense that their query is a data point among many others. The task of the search engine is to predict the desired response to the query and provide it. In return, the search engine gets the query, tied to the identity of the user. That is one piece of a larger mosaic; to be a search engine is to have a picture of a population and their interests, and the mandate to categorize and understand those people.

In Western neoliberal political systems the central function of the search engine is realized as a commercial transaction facilitating other commercial transactions. My “search” is a consumer service; I “pay” for this search by giving my query to the adjoined advertising function, which allows other commercial providers to “search” for me, indirectly, through the ad auction platform. It is a market with more than just two sides. There’s the consumer who wants information and may be tempted by other information. There are the primary content providers, who satisfy consumer content demand directly. And there are secondary content providers who want to intrude on consumer attention in a systematic and successful way. The commercial, ad-enabled search engine reduces transaction costs for the consumer’s search and sells a fraction of that attentional surplus to the advertisers. When the balance is struck right, the consumer is happy enough with the trade.

Part of the success of commercial search engines is the promise of privacy, in the sense that the consumer’s queries are entrusted secretly to the engine, and this data is not leaked or sold. Wise people know not to put into email things that they would not want, in the worst case, exposed to the public. Unwise people are more common than wise people, and ill-considered emails are written all the time. Most unwise people do not come to harm from this, because privacy in email is a de facto standard; it is the very security of email that makes the possibility of its being leaked alarming.

So too with search engine queries. “Ask me anything,” suggests the search engine, “I won’t tell”. “Well, I will reveal your data in an aggregate way; I’ll expose you to selective advertising. But I’m a trusted intermediary. You won’t come to any harms besides exposure to a few ads.”

That is all a safe assumption until it isn’t, at which point we must reconsider the role of the search engine. Suppose that, instead of living in a neoliberal democracy where the free search for information was sanctioned as necessary for the operation of a free market, we lived in an authoritarian country organized around the principle that disloyalty to the state should be crushed.

Under these conditions, the transition of a society into one that depends for its access to information on search engines is quite troubling. The act of looking for information is a political signal. Suppose you are looking for information about an extremist, subversive ideology. To do so is to flag yourself as a potential threat to the state. Suppose that you are looking for information about a morally dubious activity. To do so is to make yourself vulnerable to kompromat.

Under an authoritarian regime, curiosity and free thought are a problem, and a problem that is readily identified by one’s search queries. Further, an authoritarian regime benefits if the risks of searching for the ‘wrong’ thing are widely known, since that suppresses inquiry. Hence, the very vaguely announced and, in fact, implausible-to-implement Social Credit System in China does not need to exist to be effective; people need only believe it exists for it to have a chilling and organizing effect on behavior. That is the lesson of the Foucauldian panopticon: it doesn’t need a guard sitting in it to function.

Do we have a word for this function of search engines in an authoritarian system? We haven’t needed one in our liberal democracy, which perhaps we take for granted. “Censorship” does not apply, because what’s at stake is not speech but the ability to listen and learn. “Surveillance” is too general. It doesn’t capture the specific constraints on acquiring information, on being curious. What is the right term for this threat? What is the term for the corresponding liberty?

I’ll conclude with a chilling thought: when at war, all states are authoritarian, to somebody. Every state has an extremist, subversive ideology that it watches out for and tries in one way or another to suppress. Our search queries are always of strategic or tactical interest to somebody. Search engine policies are always an issue of national security, in one way or another.

The California Consumer Privacy Act of 2018: a deep dive

I have given the California Consumer Privacy Act of 2018 a close read.

In summary, the act grants consumers a right to request that businesses disclose the categories of information about them that they collect and sell, and gives consumers the right to request that businesses delete their information and to opt out of its sale.

What follows are points I found particularly interesting. Quotations from the Act (that’s what I’ll call it) will be in bold. Questions (meaning, questions that I don’t have an answer to at the time of writing) will be in italics.

Privacy rights

SEC. 2. The Legislature finds and declares that:
(a) In 1972, California voters amended the California Constitution to include the right of privacy among the “inalienable” rights of all people. …

I did not know that. I was under the impression that in the United States, the ‘right to privacy’ was a matter of legal interpretation, derived from other more explicitly protected rights. A right to privacy is enumerated in Article 12 of the Universal Declaration of Human Rights, adopted in 1948 by the United Nations General Assembly. There’s something like a right to privacy in Article 8 of the 1950 European Convention on Human Rights. California appears to have followed their lead on this.

In several places in the Act, it specifies that exceptions may be made in order to be compliant with federal law. Is there an ideological or legal disconnect between privacy in California and privacy nationally? Consider the Snowden/Schrems/Privacy Shield issue: exchanges of European data to the United States are given protections from federal surveillance practices. This presumably means that the U.S. federal government agrees to respect EU privacy rights. Can California negotiate for such treatment from the U.S. government?

These are the rights specifically granted by the Act:

[SEC. 2.] (i) Therefore, it is the intent of the Legislature to further Californians’ right to privacy by giving consumers an effective way to control their personal information, by ensuring the following rights:

(1) The right of Californians to know what personal information is being collected about them.

(2) The right of Californians to know whether their personal information is sold or disclosed and to whom.

(3) The right of Californians to say no to the sale of personal information.

(4) The right of Californians to access their personal information.

(5) The right of Californians to equal service and price, even if they exercise their privacy rights.

It is only recently that I’ve become attuned to the idea of privacy rights. Perhaps this is because I am from a place that apparently does not have them. A comparison that I believe should be made more often is the comparison of privacy rights to property rights. Clearly privacy rights have become as economically relevant as property rights. But currently, property rights enjoy a widespread acceptance and enforcement that privacy rights do not.

Personal information defined through example categories

“Information” is a notoriously difficult thing to define. The Act gets around the problem of defining “personal information” by repeatedly providing many examples of it. The examples are themselves rather abstract and are implicitly “categories” of personal information. Categorization of personal information is important to the law because under several conditions businesses must disclose the categories of personal information collected, sold, etc. to consumers.

SEC. 2. (e) Many businesses collect personal information from California consumers. They may know where a consumer lives and how many children a consumer has, how fast a consumer drives, a consumer’s personality, sleep habits, biometric and health information, financial information, precise geolocation information, and social networks, to name a few categories.

[1798.140.] (o) (1) “Personal information” means information that identifies, relates to, describes, is capable of being associated with, or could reasonably be linked, directly or indirectly, with a particular consumer or household. Personal information includes, but is not limited to, the following:

(A) Identifiers such as a real name, alias, postal address, unique personal identifier, online identifier Internet Protocol address, email address, account name, social security number, driver’s license number, passport number, or other similar identifiers.

(B) Any categories of personal information described in subdivision (e) of Section 1798.80.

(C) Characteristics of protected classifications under California or federal law.

(D) Commercial information, including records of personal property, products or services purchased, obtained, or considered, or other purchasing or consuming histories or tendencies.

Note that protected classifications (1798.140.(o)(1)(C)) include race, which is a socially constructed category (see Omi and Winant on racial formation). The Act appears to be saying that personal information includes the race of the consumer. Contrast this with information as identifiers (see 1798.140.(o)(1)(A)) and information as records (1798.140.(o)(1)(D)). So “personal information” in one case is a property of a person (and a socially constructed one at that); in another case it is a specific syntactic form; in another case it is a document representing some past action. The Act is very ontologically confused.

Other categories of personal information include (continuing this last section):


(E) Biometric information.

(F) Internet or other electronic network activity information, including, but not limited to, browsing history, search history, and information regarding a consumer’s interaction with an Internet Web site, application, or advertisement.

Devices and Internet activity will be discussed in more depth in the next section.


(G) Geolocation data.

(H) Audio, electronic, visual, thermal, olfactory, or similar information.

(I) Professional or employment-related information.

(J) Education information, defined as information that is not publicly available personally identifiable information as defined in the Family Educational Rights and Privacy Act (20 U.S.C. section 1232g, 34 C.F.R. Part 99).

(K) Inferences drawn from any of the information identified in this subdivision to create a profile about a consumer reflecting the consumer’s preferences, characteristics, psychological trends, preferences, predispositions, behavior, attitudes, intelligence, abilities, and aptitudes.

Given that the main use of information is to support inferences, it is notable that inferences are dealt with here as a special category of information, and that sensitive inferences are those that pertain to behavior and psychology. This may be narrowly interpreted to exclude some kinds of inferences that may be relevant and valuable but not so immediately recognizable as ‘personal’. For example, one could infer from personal information the ‘position’ of a person in an arbitrary multi-dimensional space that compresses everything known about a consumer, and use this representation for targeted interventions (such as advertising). Or one could interpret it broadly: since almost all personal information is relevant to ‘behavior’ in a broad sense, an inference from it is also ‘about behavior’, and is therefore protected.
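To make the narrow reading concrete, here is a minimal sketch of the kind of latent ‘position’ I have in mind. Everything in it is hypothetical (synthetic features, arbitrary dimensions); the point is only that targeting can key off a compressed representation that names no recognizable psychological trait.

import numpy as np
from sklearn.decomposition import PCA

# Hypothetical sketch: compress consumer features into a latent "position"
# and target on it. Synthetic data; no claim about any real system.
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 50))    # everything known about 1,000 consumers

positions = PCA(n_components=5).fit_transform(features)   # 5-dimensional "position"

# Targeting keys off the latent position rather than any named characteristic;
# whether this counts as an inference "about behavior" is the interpretive question.
target_segment = positions[:, 0] > 1.0
print(target_segment.sum(), "consumers targeted")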

Device behavior

The Act focuses on the rights of consumers and deals somewhat awkwardly with the fact that most information collected about consumers is collected indirectly through machines. The Act acknowledges that sometimes devices are used by more than one person (for example, when they are used by a family), but it does not deal easily with other forms of sharing arrangements (e.g., an open Wi-Fi hotspot) or with the problems of identifying which person a particular device’s activity is “about”.

[1798.140.] (g) “Consumer” means a natural person who is a California resident, as defined in Section 17014 of Title 18 of the California Code of Regulations, as that section read on September 1, 2017, however identified, including by any unique identifier. [SB: italics mine.]

[1798.140.] (x) “Unique identifier” or “Unique personal identifier” means a persistent identifier that can be used to recognize a consumer, a family, or a device that is linked to a consumer or family, over time and across different services, including, but not limited to, a device identifier; an Internet Protocol address; cookies, beacons, pixel tags, mobile ad identifiers, or similar technology; customer number, unique pseudonym, or user alias; telephone numbers, or other forms of persistent or probabilistic identifiers that can be used to identify a particular consumer or device. For purposes of this subdivision, “family” means a custodial parent or guardian and any minor children over which the parent or guardian has custody.

Suppose you are a business that collects traffic information and website behavior connected to IP addresses, but you don’t go through the effort of identifying the ‘consumer’ who is doing the behavior. In fact, you may collect a lot of traffic behavior that is not connected to any particular ‘consumer’ at all, but is rather the activity of a bot or crawler operated by a business. Are you on the hook to disclose personal information to consumers if they ask for their traffic activity? Does it matter whether or not they provide their IP address?

Incidentally, while the Act seems comfortable defining a Consumer as a natural person identified by a machine address, it also happily defines a Person as “proprietorship, firm, partnership, joint venture, syndicate, business trust, company, corporation, …” etc. in addition to “an individual”. Note that “personal information” is specifically information about a consumer, not a Person (i.e., business).

This may make you wonder what a Business is, since these are the entities that are bound by the Act.

Businesses and California

The Act mainly details the rights that consumers have with respect to businesses that collect, sell, or lose their information. But what is a business?

[1798.140.] (c) “Business” means:
(1) A sole proprietorship, partnership, limited liability company, corporation, association, or other legal entity that is organized or operated for the profit or financial benefit of its shareholders or other owners, that collects consumers’ personal information, or on the behalf of which such information is collected and that alone, or jointly with others, determines the purposes and means of the processing of consumers’ personal information, that does business in the State of California, and that satisfies one or more of the following thresholds:

(A) Has annual gross revenues in excess of twenty-five million dollars ($25,000,000), as adjusted pursuant to paragraph (5) of subdivision (a) of Section 1798.185.

(B) Alone or in combination, annually buys, receives for the business’ commercial purposes, sells, or shares for commercial purposes, alone or in combination, the personal information of 50,000 or more consumers, households, or devices.

(C) Derives 50 percent or more of its annual revenues from selling consumers’ personal information.

This is not a generic definition of a business, just as the earlier definition of ‘consumer’ is not a generic definition of a consumer. This definition of ‘business’ is a sui generis definition for the purposes of consumer privacy protection, as it defines businesses in terms of their collection and use of personal information. The definition explicitly limits the applicability of the law to businesses above certain thresholds.
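As a rough sketch (my own paraphrase of the quoted definition, and certainly not legal advice), the structure of the test is a conjunction of being a for-profit entity that does business in California and handles personal information, with a disjunction over the three thresholds:

# Sketch of the 1798.140.(c)(1) structure only; names and types are my own.
def is_covered_business(for_profit: bool,
                        does_business_in_california: bool,
                        handles_personal_information: bool,
                        annual_gross_revenue: float,
                        annual_records_bought_or_sold: int,   # consumers, households, or devices
                        revenue_share_from_selling_pi: float) -> bool:
    meets_a_threshold = (
        annual_gross_revenue > 25_000_000                 # threshold (A)
        or annual_records_bought_or_sold >= 50_000        # threshold (B)
        or revenue_share_from_selling_pi >= 0.5           # threshold (C)
    )
    return (for_profit
            and does_business_in_california
            and handles_personal_information
            and meets_a_threshold)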

There does appear to be a lot of wiggle room and potential for abuse here. Consider: the Mirai botnet had, by one estimate, 2.5 million devices compromised. Say you are a small business that collects site traffic. Suppose the Mirai botnet targets your site with a DDoS attack. Suddenly, your business collects information on millions of devices, and the Act comes into effect. Now you are on the hook for disclosing consumer information. Is that right?

An alternative reading of this section would recall that the definition (!) of consumer, in this law, is a California resident. So maybe the thresholds in 1798.140.(c)(B) and 1798.140.(c)(C) refer specifically to Californian consumers. Of course, for any particular device, information about where that device’s owner lives is personal information.

Having 50,000 California customers or users is a decent threshold for defining whether or not a business “does business in California”. Given the size and demographics of California, you would expect many of the major Chinese technology companies, Tencent for example, to have 50,000 Californian users. This brings up the question of extraterritorial enforcement, which gave the GDPR so much leverage.

Extraterritoriality and financing

In a nutshell, it looks like the Act is intended to allow Californians to sue foreign companies. How big a deal is this? The penalties for noncompliance are civil penalties at a fixed price per violation (presumably per individual violation), not a share of profits, but you could imagine them adding up:

[1798.155.] (b) Notwithstanding Section 17206 of the Business and Professions Code, any person, business, or service provider that intentionally violates this title may be liable for a civil penalty of up to seven thousand five hundred dollars ($7,500) for each violation.

(c) Notwithstanding Section 17206 of the Business and Professions Code, any civil penalty assessed pursuant to Section 17206 for a violation of this title, and the proceeds of any settlement of an action brought pursuant to subdivision (a), shall be allocated as follows:

(1) Twenty percent to the Consumer Privacy Fund, created within the General Fund pursuant to subdivision (a) of Section 1798.109, with the intent to fully offset any costs incurred by the state courts and the Attorney General in connection with this title.

(2) Eighty percent to the jurisdiction on whose behalf the action leading to the civil penalty was brought.

(d) It is the intent of the Legislature that the percentages specified in subdivision (c) be adjusted as necessary to ensure that any civil penalties assessed for a violation of this title fully offset any costs incurred by the state courts and the Attorney General in connection with this title, including a sufficient amount to cover any deficit from a prior fiscal year.

1798.160. (a) A special fund to be known as the “Consumer Privacy Fund” is hereby created within the General Fund in the State Treasury, and is available upon appropriation by the Legislature to offset any costs incurred by the state courts in connection with actions brought to enforce this title and any costs incurred by the Attorney General in carrying out the Attorney General’s duties under this title.

(b) Funds transferred to the Consumer Privacy Fund shall be used exclusively to offset any costs incurred by the state courts and the Attorney General in connection with this title. These funds shall not be subject to appropriation or transfer by the Legislature for any other purpose, unless the Director of Finance determines that the funds are in excess of the funding needed to fully offset the costs incurred by the state courts and the Attorney General in connection with this title, in which case the Legislature may appropriate excess funds for other purposes.

So, just to be concrete: suppose a business collects personal information on 50,000 Californians and does not disclose that information. California could then sue that business for $7,500 * 50,000 = $375 million in civil penalties, a share of which (20 percent, per 1798.155.(c)(1)) goes into the Consumer Privacy Fund, whose purpose is to cover the cost of further lawsuits. The process funds itself. If it makes any extra money, it can be appropriated for other things.
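The back-of-the-envelope arithmetic, with the figures of this hypothetical scenario:

# Back-of-the-envelope penalty arithmetic; the numbers are hypothetical.
PENALTY_PER_VIOLATION = 7_500        # 1798.155.(b) cap for intentional violations
affected_consumers = 50_000          # hypothetical, matching the coverage threshold

total_penalty = PENALTY_PER_VIOLATION * affected_consumers
fund_share = 0.20 * total_penalty    # 1798.155.(c)(1): 20% to the Consumer Privacy Fund

print(f"total: ${total_penalty:,}")      # total: $375,000,000
print(f"fund:  ${fund_share:,.0f}")      # fund:  $75,000,000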

Meaning, I guess, that this Act basically funds a sustained program of investigations and fines. You could imagine that this starts out with just some lawyers responding to civil complaints. But consider the scope of the Act, and how it means that any business in the world not properly disclosing information about Californians is liable to be fined. Suppose that some kind of blockchain- or botnet-based entity starts committing surveillance in violation of this Act on a large scale. What kind of technical investigative capacity is necessary to enforce this kind of thing worldwide? Does this become a self-funding cybercrime investigative unit? How are foreign actors who are responsible for such things brought to justice?

This is where it’s totally clear that I am not a lawyer. I am still puzzling over the meaning of 1798.155.(c)(2), for example.

“Publicly available information”

There are more weird quirks to this Act than I can dig into in this post, but one that deserves mention (as homage to Helen Nissenbaum, among other reasons) is the stipulation about publicly available information, which does not mean what you think it means:

(2) “Personal information” does not include publicly available information. For these purposes, “publicly available” means information that is lawfully made available from federal, state, or local government records, if any conditions associated with such information. “Publicly available” does not mean biometric information collected by a business about a consumer without the consumer’s knowledge. Information is not “publicly available” if that data is used for a purpose that is not compatible with the purpose for which the data is maintained and made available in the government records or for which it is publicly maintained. “Publicly available” does not include consumer information that is deidentified or aggregate consumer information.

The grammatical error in the second sentence (the phrase beginning with “if any conditions” trails off into nowhere…) indicates that this paragraph was hastily written and never finished, as if in response to an afterthought. There’s a lot going on here.

First, the sense of ‘public’ used here is the sense of ‘public institutions’ or the res publica. Amazingly and a bit implausibly, government records are considered publicly available only when they are used for purposes compatible with their maintenance. So if a business takes a public record and uses it differently than was originally intended when it was ‘made available’, it becomes personal information that must be disclosed? As somebody who came out of the Open Data movement, I have to admit I find this baffling. On the other hand, it may be the brilliant solution to privacy in public on the Internet that society has been looking for.

Second, the stipulation that “publicly available” does not mean “biometric information collected by a business about a consumer without the consumer’s knowledge” is surprising. It appears to be written with particular cases in mind, perhaps IoT sensing. But why specifically biometric information, as opposed to other kinds of information collected without consumer knowledge?

There is a lot going on in this paragraph. Oddly, it is not one of the ones explicitly flagged for review and revision in the section soliciting public participation on changes before the Act goes into effect in 2020.

A work in progress

1798.185. (a) On or before January 1, 2020, the Attorney General shall solicit broad public participation to adopt regulations to further the purposes of this title, including, but not limited to, the following areas:

This is a weird law. I suppose it was written and passed to capitalize on a particular political moment and crisis (Sec. 2 specifically mentions Cambridge Analytica as a motivation), drafted to best express its purpose and intent, and given the horizon of 2020 to allow for revisions.

It must be said that there’s nothing in this Act that threatens the business models of any American Big Tech companies in any way, since storing consumer information in order to provide derivative ad targeting services is totally fine as long as businesses do the right disclosures, which they are now all doing because of GDPR anyway. There is a sense that this is California taking the opportunity to start the conversation about what U.S. data protection law post-GDPR will be like, which is of course commendable. As a statement of intent, it is great. Where it starts to get funky is in the definitions of its key terms and the underlying theory of privacy behind them. We can anticipate some rockiness there and try to unpack these assumptions before adopting similar policies in other states.

Pondering “use privacy”

I’ve been working carefully with Datta et al.’s “Use Privacy” work (link), which makes a clear case for how a programmatic, data-driven model may be statically analyzed for its use of a proxy of a protected variable, and repaired.

Their system has a number of interesting characteristics, among which are:

  1. The use of a normative oracle for determining which proxy uses are prohibited.
  2. A proof that there is no coherent definition of proxy use which has all of a set of very reasonable properties defined over function semantics.

Given (2), they continue with a compelling study of how a syntactic definition of proxy use, one based on the explicit contents of a function, can support a system of detecting and repairing proxies.
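To make the syntactic/semantic distinction concrete, here is a toy sketch of my own, not the paper’s formalism: two functions with identical input-output behavior, one of which mentions a proxy in its program text without that value influencing the output. A definition of proxy use stated purely over function semantics cannot distinguish them, and a naive syntactic one would flag the first; my understanding is that this is why the paper’s syntactic definition also requires the proxy computation to be associated with the protected attribute and to influence the output.

# Toy illustration only; not the definitions or repair algorithm from Datta et al.
HIGH_RISK_ZIPS = {"94601", "94621"}   # hypothetical stand-in proxy for a protected attribute

def score_a(zip_code: str, income: float) -> float:
    # Mentions the proxy in its text, but the proxy term cancels out,
    # so it has no influence on the output.
    proxy = 1.0 if zip_code in HIGH_RISK_ZIPS else 0.0
    return income / 1000 + proxy - proxy

def score_b(zip_code: str, income: float) -> float:
    # Extensionally identical to score_a; never computes the proxy.
    return income / 1000

# score_a and score_b agree on every input, so any notion of "proxy use"
# defined only over input-output behavior must treat them identically.
assert score_a("94601", 40_000.0) == score_b("94601", 40_000.0)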

My question is to what extent the sources of normative restriction on proxies (those characterized by the oracle in (1)) are likely to favor syntactic proxy use restrictions, as opposed to semantic ones. Since ethicists and lawyers, who are the purported sources of these normative restrictions, are likely to consider any technical system a black box for the purpose of their evaluation, they will naturally be concerned with program semantics. It may be comforting for those responsible for a technical program to be able to, in a sense, avoid liability by assuring that their programs are not using a restricted proxy. But, truly, so what? Since these syntactic considerations do not make any semantic guarantees, will they really plausibly address normative concerns?

A striking result from their analysis, which has perhaps broader implications, is the incoherence of a semantic notion of proxy use. Perhaps sadly but also substantively, this result shows that a certain plausible normative requirement is impossible for a system to fulfill in general. Only restricted conditions make such a thing possible. This seems to be part of a pattern in these rigorous computer science evaluations of ethical problems; see also Kleinberg et al. (2016) on how it is impossible to meet several plausible definitions of “fairness” in risk-assessment scores across social groups except under certain conditions.
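As a toy numeric illustration of the Kleinberg-style tension (my own, not their construction): when base rates differ across groups, even a perfectly calibrated score cannot equalize the average score among true negatives.

import numpy as np

# Toy demonstration with synthetic data; not the theorem or its proof.
rng = np.random.default_rng(0)
base_rate = {"A": 0.5, "B": 0.2}

for group, p in base_rate.items():
    outcomes = rng.random(100_000) < p     # true labels drawn at the group base rate
    scores = np.full(outcomes.shape, p)    # calibrated: among people scored p, a fraction p is positive
    print(group, scores[~outcomes].mean()) # average score among true negatives
# Prints 0.5 for A and 0.2 for B: "balance for the negative class" fails
# even though the score is calibrated within each group.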

The conclusion for me is that this nobly motivated computer science work reveals that what people are actually interested in normatively is not the functioning of any particular computational system. They are rather interested in social conditions more broadly, which are rarely aligned with our normative ideals. Computational systems, by making realities harshly concrete, are disappointing, but it’s a mistake to make that a disappointment with the computing systems themselves. Rather, there are mathematical facts that are disappointing regardless of what sorts of systems mediate our social world.

This is not merely a philosophical consideration or sociological observation. Since the interpretation of laws is part of the process of informing normative expectations (as in a normative oracle), it is an interesting and perhaps open question how lawyers and judges, in their task of legal interpretation, make use of the mathematical conclusions about normative tradeoffs being offered up by computer scientists.

References

Datta, Anupam, et al. “Use Privacy in Data-Driven Systems: Theory and Experiments with Machine Learnt Programs.” arXiv preprint arXiv:1705.07807 (2017).

Kleinberg, Jon, Sendhil Mullainathan, and Manish Raghavan. “Inherent trade-offs in the fair determination of risk scores.” arXiv preprint arXiv:1609.05807 (2016).

Robert Post on Data vs. Dignitary Privacy

I was able to see Robert Post present his article, “Data Privacy and Dignitary Privacy: Google Spain, the Right to Be Forgotten, and the Construction of the Public Sphere”, today. My other encounter with Post’s work was quite positive, and I was very happy to learn more about his thinking at this talk.

Post’s argument was based on the facts of the Google Spain SL v. Agencia Española de Protección de Datos (“Google Spain”) case in the EU, which set off a lot of discussion about the right to be forgotten.

I’m not trained as a lawyer, and will leave the legal analysis to the verbatim text. There were some broader philosophical themes that resonate with topics I’ve discussed on this blog and in my other research. These I wanted to note.

If I follow Post’s argument correctly, it is something like this:

  • According to EU Directive 95/46/EC, there are two kinds of privacy. Data privacy rules over personal data, establishing control and limitations on its use. The emphasis is on the data itself, which is reasoned about analogously to property. Dignitary privacy is about maintaining appropriate communications between people and restricting those communications that may degrade, humiliate, or mortify them.
  • EU rules about data privacy are governed by rules specifying the purpose for which data is used, thereby implying that the use of this data must be governed by instrumental reason.
  • But there’s the public sphere, which must not be governed by instrumental reason, for Habermasian reasons. The public sphere is, by definition, the domain of communicative action, where actions must be taken with the ambiguous purpose of open dialogue. That is why free expression is constitutionally protected!
  • Data privacy, formulated as an expression of instrumental reason, is incompatible with the free expression of the public sphere.
  • The Google Spain case used data privacy rules to justify the right to be forgotten, and in this it developed an unconvincing and sloppy precedent.
  • Dignitary privacy is in tension with free expression, but not incompatible with it. This is because it is based not on instrumental reason, but rather on norms of communication (which are contextual).
  • Future right to be forgotten decisions should be made on the basis of dignitary privacy. This will result in more cogent decisions.

I found Post’s argument very appealing. I have a few notes.

First, I had never made the connection between what Hildebrandt (2013, 2014) calls “purpose binding” in EU data protection regulation and instrumental reason, but there it is. There is a sense in which these purpose clauses are about optimizing something that is externally and specifically defined before the privacy judgment is made (cf. Tschantz, Datta, and Wing, 2012, for a formalization).

This approach seems generally in line with the view of a government as a bureaucracy primarily involved in maintaining control over a territory or population. I don’t mean this in a bad way, but in a literal way of considering control as feedback into a system that steers it to some end. I’ve discussed the pervasive theme of ‘instrumentality run amok’ in questions of AI superintelligence here. It’s a Frankfurt School trope that appears to have made its way in a subtle way into Post’s argument.

The public sphere is not, in Habermasian theory, supposed to be dictated by instrumental reason, but rather by communicative rationality. This has implications for the technical design of networked publics that I’ve scratched the surface of in this paper. By pointing to the tension between instrumental/purpose/control based data protection and the free expression of the public sphere, I believe Post is getting at a deep point about how we can’t have the public sphere be too controlled lest we lose the democratic property of self-governance. It’s a serious argument that probably should be addressed by those who would like to strengthen rights to be forgotten. A similar argument might be made for other contexts whose purposes seem to transcend circumscription, such as science.

Post’s point is not, I believe, to weaken these rights to be forgotten, but rather to put the arguments for them on firmer footing: dignitary privacy, or the norms of communication and the awareness of the costs of violating them. Indeed, the facts behind right to be forgotten cases I’ve heard of (there aren’t many) all seem to fall under these kinds of concerns (humiliation, etc.).

What’s very interesting to me is that the idea of dignitary privacy as consisting of appropriate communication according to contextually specific norms feels very close to Helen Nissenbaum’s theory of Contextual Integrity (2009), with which I’ve become very familiar in the past year through my work with Prof. Nissenbaum. Contextual integrity posits that privacy is about adherence to norms of appropriate information flow. Is there a difference between information flow and communication? Isn’t Shannon’s information theory a “mathematical theory of communication”?

The question of whether and under what conditions information flow is communication and/or data is quite deep, actually. More on that later.

For now though it must be noted that there’s a tension, perhaps a dialectical one, between purposes and norms. For Habermas, the public sphere needs to be a space of communicative action, as opposed to instrumental reason. This is because communicative action is how norms are created: through the agreement of people who bracket their individual interests to discuss collective reasons.

Nissenbaum also has a theory of norm formation, but it does not depend so tightly on the rejection of instrumental reason. In fact, it accepts the interests of stakeholders as among several factors that go into the determination of norms. Other factors include societal values, contextual purposes, and the differentiated roles associated with the context. Because contexts, for Nissenbaum, are defined in part by their purposes, this has led Hildebrandt (2013) to make direct comparisons between purpose binding and Contextual Integrity. They are similar, she concludes, but not the same.

It would be easy to say that the public sphere is a context in Nissenbaum’s sense, with a purpose, which is the formation of public opinion (which seems to be Post’s position). Properly speaking, social purposes may be broad or narrow, and specially defined social purposes may be self-referential (why not?), and indeed these self-referential social purposes may be the core of society’s “self-consciousness”. Why shouldn’t there be laws to ensure the freedom of expression within a certain context for the purpose of cultivating the kinds of public opinions that would legitimize laws and cause them to adapt democratically? We could possibly make these frameworks more precise if we could make them a little more formal and could lose some of the baggage; that would be useful theory building in line with Nissenbaum and Post’s broader agendas.

A test of this perhaps more nuanced but still teleological framework (indeed, instrumental, but maybe more properly speaking pragmatic (a la Dewey), in that it can blend several different metaethical categories) is to see if one can motivate a right to be forgotten in a public sphere by appealing to the need for communicative action, and thereby to especially appropriate communication norms around it, and to dignitary privacy.

This doesn’t seem like it should be hard to do at all.

References

Hildebrandt, Mireille. “Slaves to Big Data. Or Are We?” (2013).

Hildebrandt, Mireille. “Location Data, Purpose Binding and Contextual Integrity: What’s the Message?” In Protection of Information and the Right to Privacy - A New Equilibrium?, 31–62. Springer International Publishing, 2014.

Nissenbaum, Helen. Privacy in Context: Technology, Policy, and the Integrity of Social Life. Stanford University Press, 2009.

Post, Robert. “Data Privacy and Dignitary Privacy: Google Spain, the Right to Be Forgotten, and the Construction of the Public Sphere” (April 15, 2017). Duke Law Journal, forthcoming; Yale Law School, Public Law Research Paper No. 598. Available at SSRN: https://ssrn.com/abstract=2953468 or http://dx.doi.org/10.2139/ssrn.2953468

Tschantz, Michael Carl, Anupam Datta, and Jeannette M. Wing. “Formalizing and Enforcing Purpose Restrictions in Privacy Policies.” In 2012 IEEE Symposium on Security and Privacy. IEEE, 2012.

Ohm and Post: Privacy as threats, privacy as dignity

I’m reading side by side two widely divergent law review articles about privacy.

One is Robert Post’s “The Social Foundations of Privacy: Community and Self in Common Law Tort” (1989) (link)

The other is Paul Ohm’s “Sensitive Information” (2014) (link)

They are very notably different. Post’s article diverges sharply from the intellectual milieu I’m used to. It starts with an exposition of Goffman’s view of the personal self as being constituted by ceremonies and rituals of human relationships. Privacy tort law is, in Post’s view, about repairing tears in the social fabric. The closest thing to this that I have ever encountered is Fingarette’s book on Confucianism.

Ohm’s article is much more recent and is in large part a reaction to the Snowden leaks. It’s an attempt to provide an account of privacy that can limit the problems associated with massive state (and corporate?) data collection. It attempts to provide a legally informed account of what information is sensitive, and then suggests that threat modeling strategies from computer security can be adapted to the privacy context. Privacy can be protected by identifying and mitigating privacy threats.
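
To make the flavor of that move concrete, here is a minimal sketch of what treating privacy protection as threat modeling might look like, again in Python. This is not Ohm’s own method; the adversaries, harms, and mitigations below are invented purely for illustration.

```python
# A minimal sketch of privacy threat modeling in the style of computer-security
# threat modeling. Not Ohm's proposal; all entries are hypothetical examples.
from dataclasses import dataclass
from typing import List

@dataclass
class PrivacyThreat:
    sensitive_attribute: str   # the sensitive information at stake
    adversary: str             # who might exploit it
    harm: str                  # the feared consequence
    mitigations: List[str]     # controls that reduce the threat

threat_model = [
    PrivacyThreat("location history", "stalker",
                  "physical safety", ["coarsen location", "limit retention"]),
    PrivacyThreat("health records", "employer",
                  "discrimination", ["access controls", "purpose binding"]),
    PrivacyThreat("browsing history", "data broker",
                  "manipulation / embarrassment", []),
]

# Triage: surface threats with no mitigations yet, much as a security team
# surfaces unmitigated attack paths.
for t in threat_model:
    if not t.mitigations:
        print(f"UNMITIGATED: {t.sensitive_attribute} vs. {t.adversary} -> {t.harm}")
```

The payoff of the analogy is procedural: one enumerates what is sensitive, who might exploit it, and what would blunt the harm, then works down the list of unmitigated items.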

As I get deeper into the literature on Privacy by Design, and observe how privacy-related situations play out in the world and in my own life, I’m struck by the adaptability and indifference of the social world to shifting technological infrastructural conditions. A minority of scholars and journalists track major changes in it, but for the most part the social fabric adapts. Most people, probably necessarily, have no idea what the technological infrastructure is doing and don’t care to know. It can be coopted, or not, into social ritual.

If the swell of scholarship and other public activity on this topic was the result of surprising revelations or socially disruptive technological innovations, these same discomforts have also created an opportunity for the less technologically focused to reclaim spaces for purely social authority, based on all the classic ways that social power and significance play out.

Organizational secrecy and personal privacy as false dichotomy, cf. @FrankPasquale

I’ve turned from page 2 to page 3 of The Black Box Society (I can be a slow reader). Pasquale sets up the dichotomy around which the drama of the book hinges like so:

But while powerful businesses, financial institutions, and government agencies hide their actions behind nondisclosure agreements, “proprietary methods”, and gag rules, our own lives are increasingly open books. Everything we do online is recorded; the only questions left are to whom the data will be available, and for how long. Anonymizing software may shield us for a little while, but who knows whether trying to hide isn’t the ultimate red flag for watchful authorities? Surveillance cameras, data brokers, sensor networks, and “supercookies” record how fast we drive, what pills we take, what books we read, what websites we visit. The law, so aggressively protective of secrecy in the world of commerce, is increasingly silent when it comes to the privacy of persons.

That incongruity is the focus of this book.

This is a rhetorically powerful paragraph, and it captures a lot of the trepidation people have about the power of large organizations relative to themselves.

I have been inclined to agree with this perspective for a lot of my life. I used to be the kind of person who thought Everything Should Be Open. Since then, I’ve developed what I think is a more nuanced view of transparency: some secrecy is necessary. It can be especially necessary for powerful organizations and people.

Why?

Well, it depends on the physical properties of information. (Here is an example of how a proper understanding of the mechanics of information can support the transcendent project as opposed to a merely critical project).

Any time you interact with something or somebody else in a meaningful way, you affect each other’s states in probabilistic space. That means there has been some kind of flow of information. If an organization interacts with a lot of people, it is going to absorb information about a lot of people. Recording this information as ‘data’ is something that has been done for a long time, because that is what allows organizations to do intelligent things vis-à-vis the people they interact with. So businesses, financial institutions, and governments recording information about people is nothing new.
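
To put the “probabilistic space” point a bit more formally (my gloss, not Pasquale’s): if an interaction leaves a person’s state X and an organization’s records Y statistically dependent, then the mutual information

I(X;Y) = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}

is strictly positive, since it is zero only when X and Y are independent. In that precise sense the organization’s records carry information about the person, whether or not anyone calls them ‘data’.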

Pasquale suggests that this recording is a threat to our privacy, and that the secrecy of the organizations that do the recording gives them power over us. But this is surely a false dichotomy. Why? Because if an organization records information about a lot of people, and then doesn’t maintain some kind of secrecy, then that information is no longer private! To, like, everybody else. In other words, maintaining secrecy is one way of ensuring confidentiality, which is surely an important part of privacy.

I wonder what happens if we continue to read The Black Box Society with this link between secrecy, confidentiality, and privacy in mind.

Nissenbaum the functionalist

Today in Classics we discussed Helen Nissenbaum’s Privacy in Context.

Most striking to me is that Nissenbaum’s privacy framework, contextual integrity theory, depends critically on a functionalist sociological view. A context is defined by its information norms, and violations of those norms are judged according to their (non)accordance with the purposes and values of the context. So, for example, the purposes of an educational institution determine what are appropriate information norms within it, and what departures from those norms constitute privacy violations.

I used to think teleology was dead in the sciences. But recently I learned that it is commonplace in biology and popular in ecology. Today I learned that what amounts to a State Philosopher in the U.S. (Nissenbaum’s framework has been more or less adopted by the FTC) maintains a teleological view of social institutions. Fascinating! Even more fascinating that this philosophy corresponds well enough to American law as to be informative of it.

From a “pure” philosophy perspective (which, I will admit, is simply a vice of mine), it’s interesting to contrast Nissenbaum with…oh, Horkheimer again. Nissenbaum sees ethical behavior (around privacy at least) as behavior that is in accord with the purpose of one’s context. Morality is given by the system. For Horkheimer, the problem is that the system’s purposes subsume the interests of the individual, who is alone the agent able to determine what is right and wrong. Horkheimer is a founder of the Frankfurt School, arguably the intellectual ancestor of progressivism. Nissenbaum grounds her work in Burke, and her theory is admittedly conservative. Privacy is violated when people’s expectations of privacy are violated (this comes from U.S. law), and that means people’s contextual expectations carry more weight than an individual’s free-minded beliefs.

The tension could be resolved when free individuals determine the purposes of the systems they participate in. Indeed, Nissenbaum quotes Burke’s approval of established conventions as the accreted wisdom and rationale of past generations. The system is the way it is because it was chosen. (Or, perhaps, because it survived.)

Since Horkheimer’s objection to “the system” is that instrumentality has run amok, causing the system to serve a purpose nobody intended for it, his view is not inconsistent with Nissenbaum’s. Nissenbaum, building on Dworkin, sees contextual legitimacy as depending on some kind of political legitimacy.

The crux of the problem is the question of which information norms constitute the context in which political legitimacy is formed, and what purpose that context or system serves.