Pondering “use privacy”

by Sebastian Benthall

I’ve been working carefully with Datta et al.’s “Use Privacy” work (link), which makes a clear case for how a programmatic, data-driven model may be statically analyzed for its use of a proxy of a protected variable, and repaired.

Their system has a number of interesting characteristics, among which are:

  • The use of a normative oracle for determining which proxy uses are prohibited.
  • A proof that there is no coherent definition of proxy use which has all of a set of very reasonable properties defined over function semantics.

Given (2), they continue with a compelling study of how a syntactic definition of proxy use, one based on the explicit contents of a function, can support a system of detecting and repairing proxies.

My question is to what extent the sources of normative restriction on proxies (those characterized by the oracle in (1)) are likely to favor syntactic proxy use restrictions, as opposed to semantic ones. Since ethicists and lawyers, who are the purported sources of these normative restrictions, are likely to consider any technical system a black box for the purpose of their evaluation, they will naturally be concerned with program semantics. It may be comforting for those responsible for a technical program to be able to, in a sense, avoid liability by assuring that their programs are not using a restricted proxy. But, truly, so what? Since these syntactic considerations do not make any semantic guarantees, will they really plausibly address normative concerns?

A striking result from their analysis which has perhaps broader implications is the incoherence of a semantic notion of proxy use. Perhaps sadly but also substantively, this result shows that a certain plausible normative is impossible for a system to fulfill in general. Only restricted conditions make such a thing possible. This seems to be part of a pattern in these rigorous computer science evaluations of ethical problems; see also Kleinberg et al. (2016) on how it’s impossible to meet several plausible definitions of “fairness” in the risk-assessment scores across social groups except under certain conditions.

The conclusion for me is that what this nobly motivated computer science work reveals is that what people are actually interested in normatively is not the functioning of any particular computational system. They are rather interested in social conditions more broadly, which are rarely aligned with our normative ideals. Computational systems, by making realities harshly concrete, are disappointing, but it’s a mistake to make that a disappointment with the computing systems themselves. Rather, there are mathematical facts that are disappointing regardless of what sorts of systems mediate our social world.

This is not merely a philosophical consideration or sociological observation. Since the the interpretation of laws are part of the process of informing normative expectations (as in a normative oracle), it is an interesting an perhaps open question how lawyers and judges, in their task of legal interpretation, make use of the mathematical conclusions about normative tradeoffs being offered up by computer scientists.


Datta, Anupam, et al. “Use Privacy in Data-Driven Systems: Theory and Experiments with Machine Learnt Programs.” arXiv preprint arXiv:1705.07807 (2017).

Kleinberg, Jon, Sendhil Mullainathan, and Manish Raghavan. “Inherent trade-offs in the fair determination of risk scores.” arXiv preprint arXiv:1609.05807 (2016).