Digifesto

Category: economics

Credit scores and information economics

The recent Equifax data breach brings up credit scores and their role in the information economy. Credit scoring is a controversial topic in the algorithmic accountability community. Frank Pasquale, for example, writes about it in The Black Box Society. Most of the critical writing on the subject points to how credit scoring might be done in a discriminatory or privacy-invasive way. As interesting as those critiques are from a political and ethical perspective, it’s worth reviewing what credit scores are for in the first place.

Let’s model this as we have done in other cases of information flow economics.

There’s a variable of interest, the likelihood that a potential borrower will not default on a loan, x \sim X. Note that any value x sampled from X will lie within the interval [0,1] because it is a probability.

There’s a decision to be made by a bank: whether or not to provide a random borrower a loan.

To keep things very simple, let’s suppose that the bank gets a payoff of 1 if the borrower is given a loan and does not default, and gets a payoff of -1 if the borrower gets the loan and defaults. The borrower gets a payoff of 1 if he gets the loan and 0 otherwise. The bank’s strategy is to avoid giving loans that lead to negative expected payoff. (This is a gross oversimplification of, but is essentially consistent with, the model of credit used by Blöchlinger and Leippold (2006).)

Given a particular x, the expected utility of the bank is:

x (1) + (1 - x) (-1) = 2x - 1

Given the domain of [0,1], this function ranges from -1 to 1, hitting 0 when x = .5.

We can now consider welfare outcomes under conditions of no information flow, total information flow, and partial information flow.

Suppose the bank has no insight into x beyond the prior distribution X. Then the bank’s expected payoff from offering the loan is E[2X - 1] = 2E[X] - 1. If this is above zero, the bank will offer the loan and the borrower gets a positive payoff. If it is below zero, the bank will not offer the loan and both the bank and the potential borrower will get zero payoff. The outcome depends entirely on the prior probability of loan default, and it either rewards all borrowers or none, depending on that distribution.

If the bank has total insight into x, then the outcomes are different. The bank can use the option to reject borrowers for whom x is less than .5 and accept those for whom x is greater than .5. If we see the game as repeated over many borrowers whose chances of paying off their loan are all sampled from X, then the additional knowledge of the bank creates two classes of potential borrowers: one that gets loans and one that does not. This increases inequality among borrowers.

It also increases the utility of the bank. This is perhaps best illustrated with a simple example. Suppose the distribution X is uniform over the unit interval [0,1]. Then the expected value of the bank’s payoff under complete information is

\int_{.5}^{1} 2x - 1 dx = 0.25

which is a significant improvement over the expected payoff of 0 in the uninformed case.
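To sanity-check these numbers, here is a minimal Monte Carlo sketch in Python (the uniform prior is the assumption carried over from the example above):

```python
import numpy as np

# Assumption: repayment probabilities drawn from a uniform prior X ~ U[0, 1].
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=1_000_000)

payoff = 2 * x - 1  # bank's expected payoff from a borrower with repayment probability x

# Uninformed bank: E[2X - 1] = 0 under this prior, so lending to everyone
# (or to no one) yields zero expected payoff.
print(payoff.mean())  # ~0.0

# Fully informed bank: lends only when 2x - 1 > 0, i.e. x > 0.5.
print(np.where(x > 0.5, payoff, 0.0).mean())  # ~0.25, matching the integral
```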

Putting off an analysis of the partial information case for now, suffice it to say that we expect partial information (such as a credit score) to lead to an intermediate result, improving bank profits and differentiating borrowers with respect to the bank’s choice to loan.

What is perhaps most interesting about this analysis is the similarity between it and Posner’s employment market. In both cases, the subject of the variable of interest X is a person’s prospects for improving the welfare of the principal decision-maker upon being selected, where selection also implies a benefit to the subject. Uncertainty about these prospects leads to equal treatment of prospective persons and reduced benefit to the principal. More information leads to differentiated impact on the prospects and greater benefit to the principal.

References

Blöchlinger, A., & Leippold, M. (2006). Economic benefit of powerful credit scoring. Journal of Banking & Finance, 30(3), 851-873.

Information flow in economics

We have formalized three different cases of information economics: Posner’s model of the employment market, price discrimination based on personal information, and the economics of expertise and information services.

What we discovered is that each of these cases has, to some extent, a common form. That form is this:

There is a random variable of interest, x \sim X (that is, a value x sampled from a probability distribution X), that has a direct effect on the welfare outcomes of decisions made by agents in the economy. In our cases this was the aptitude of job applicants, consumers’ willingness to pay, and the utility of receiving a range of different expert recommendations, respectively.

In the extreme cases, the agent at the focus of the economic model could act with extreme ignorance of x, or extreme knowledge of it. Generally, the agent’s situation improves the more knowledgeable they are about x. The outcomes for the subjects of X vary more widely.

We also considered the possibility that the agent has access to partial information about X through the observation of a different variable y \sim Y. Upon observing y, they can make their judgments based on an improved subjective expectation of the unknown variable, P(x \vert y). We assumed that the agent is a Bayesian reasoner, capable of internalizing evidence according to Bayes’ rule, and hence able to compute:

P(X \vert Y) \propto P(Y \vert X) P(X)

However, this depends on two very important assumptions.

The first is that the agent knows the distribution X. This is the prior in their subjective calculation of the Bayesian update. In our models, we have been perhaps sloppy in assuming that this prior probability corresponds to the true probability distribution from which the value x is drawn. We are somewhat safe in this assumption because for the purposes of determining strategy, only subjective probabilities can be taken into account and we can relax the distribution to encode something close to zero knowledge of the outcome if necessary. In more complex models, the difference between agents with different knowledge of X may be more strategically significant, but we aren’t there yet.

The second important assumption is that the agent knows the likelihood function P(Y | X). This is quite a strong assumption, as it implies that the agent knows truly how Y covaries with X, allowing them to “decode” the message y into useful information about x.

It may be best to think of access to and usage of the likelihood function as a rare capability. Indeed, in our model of expertise, the assumption was that the service provider (think doctor) knew more about the relationship between X (appropriate treatment) and Y (observable symptoms) than the consumer (patient) did. In the case of companies that use data science, the idea is that some combination of data and science gives the company an edge over its competitors in knowing the true value of some uncertain property.

What we are discovering is that it’s not just the availability of y that matters, but also the ability to interpret y with respect to the probability of x. Data does not speak for itself.

This incidentally ties in with a point which we have perhaps glossed over too quickly in the present discussion, which is: what is information, really? This may seem like a distraction in a discussion about economics, but it is a question that’s come up in my own idiosyncratic “disciplinary” formation. One of the best intuitive definitions of information is provided by the philosopher Fred Dretske (1981; 1983). I once gave a presentation on Fred Dretske’s view of information and its relationship to epistemological skepticism and Shannon information theory; you can find this presentation here. But for present purposes I want to call attention to his definition of what it means for a message to carry information, which is:

[A] message carries the information that X is a dingbat, say, if and only if one could learn (come to know) that X is a dingbat from the message.

When I say that one could learn that X was a dingbat from the message, I mean, simply, that the message has whatever reliable connection with dingbats is required to enable a suitably equipped, but otherwise ignorant receiver, to learn from it that X is a dingbat.

This formulation is worth mentioning because it supplies a kind of philosophical validation for our Bayesian formulation of information flow in the economy. We are modeling situations where Y is a signal that is reliably connected with X, such that instantiations of Y carry information about the value of X. We might express this in terms of conditional entropy:

H(X|Y) < H(X)

While this is sufficient for Y to carry information about X, it is not sufficient for any observer of Y to consequently know X. An important part of Dretske's definition is that the receiver must be suitably equipped to make the connection.

In our models, the “suitably equipped” condition is represented as the ability to compute the Bayesian update using a realistic likelihood function P(Y \vert X). This is a difficult demand. A lot of computational statistics has to do with the difficulty of tractably estimating the likelihood function, let alone computing it perfectly.
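To make this concrete, here is a minimal sketch of such a Bayesian update over a discrete variable (Python; all the numbers are invented for illustration):

```python
import numpy as np

# Hypothetical discrete setup: three possible values of x and a prior over them.
states = np.array([0.2, 0.5, 0.8])      # possible values of x
prior = np.array([0.3, 0.4, 0.3])       # P(X = x)

# The likelihood P(y | x) for one observed signal y. Having this table is the
# "suitably equipped" condition: without it, observing y tells you nothing about x.
likelihood = np.array([0.1, 0.4, 0.9])  # P(y | X = x), one entry per state

posterior = likelihood * prior          # P(y | X) P(X)
posterior /= posterior.sum()            # normalize to get P(X | y)
print(posterior)                        # belief shifts toward states that make y likely
```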

References

Dretske, F. I. (1983). The epistemology of belief. Synthese, 55(1), 3-19.

Dretske, F. I. (1981). Knowledge and the flow of information. MIT Press.

Economics of expertise and information services

We have now considered two models of how information affects welfare outcomes.

In the first model, inspired by an argument from Richard Posner, there are many producers (employees, in the specific example, but it could just as well be cars, etc.) and a single consumer. When the consumer knows nothing about the quality of the producers, the consumer gets an average quality producer and the producers split the expected utility of the consumer’s purchase equally. When the consumer is informed, she benefits and so does the highest quality producer, to the detriment of the other producers.

In the second example, inspired by Shapiro and Varian’s discussion of price differentiation in the sale of information goods, there was a single producer and many consumers. When the producer knows nothing about the “quality” of the consumers–their willingness to pay–the producer charges all consumers a single profit-maximizing price. This price leaves many customers out of reach of the product, and many others getting a consumer surplus because the product is cheap relative to their demand. When the producer is more informed, they make more profit by selling at personalized prices. This lets the previously unreached customers in on the product at a compellingly low price. It also allows the producer to charge higher prices to willing customers, capturing what was once consumer surplus for themselves.

In both these cases, we have assumed that there is only one kind of good in play. It can vary numerically in quality, which is measured in the same units as cost and utility.

In order to bridge from theory of information goods to theory of information services, we need to take into account a key feature of information services. Consumers buy information when they don’t know what it is they want, exactly. Producers of information services tailor what they provide to the specific needs of the consumers. This is true for information services like search engines but also other forms of expertise like physician’s services, financial advising, and education. It’s notable that these last three domains are subject to data protection laws in the United States (HIPAA, GLBA, and FERPA) respectively, and on-line information services are an area where privacy and data protection are a public concern. By studying the economics of information services and expertise, we may discover what these domains have in common.

Let’s consider just a single consumer and a single producer. The consumer has a utility function \vec{x} \sim X (that is, sampled from a random variable X) specifying the value they get from the consumption of each of m = \vert J \vert products. We’ll denote with x_j the utility awarded to the consumer for the consumption of product j \in J.

The catch is that the consumer does not know X. What they do know is y \sim Y, which is correlated with X in some way that is unknown to them. The consumer tells the producer y, and the producer’s job is to recommend the j \in J that will most benefit them. We’ll assume that the producer is interested in maximizing consumer welfare in good faith because, for example, they are trying to promote their professional reputation, and this is roughly in proportion to customer satisfaction. (Let’s assume they pass on the costs of providing the product to the consumer.)

As in the other cases, let’s consider first the case where the acting party has no useful information about the particular customer. In this case, the producer has to choose their recommendation \hat j based on their knowledge of the underlying probability distribution X, i.e.:

\hat j = arg \max_{j \in J} E[X_j]

where X_j is the probability distribution over x_j implied by X.

In the other extreme case, the producer has perfect information of the consumer’s utility function. They can pick the truly optimal product:

\hat j = arg \max_{j \in J} x_j

How much better off the consumer is in the second case, as opposed to the first, depends on the specifics of the distribution X. Suppose the X_j are all independent and identically distributed. Then an ignorant producer would be indifferent to the choice of \hat j, leaving the expected outcome for the consumer at E[X_j], whereas the larger the number of products m, the more \max_{j \in J} x_j will approach the maximum value in the support of X_j.
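A quick simulation illustrates this gap (Python; the i.i.d. uniform assumption is just for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
for m in [1, 2, 5, 20, 100]:                  # number of products |J|
    x = rng.uniform(0, 1, size=(100_000, m))  # consumer utilities x_j ~ U[0, 1], i.i.d.
    ignorant = x[:, 0].mean()                 # ignorant producer: any fixed j, E[X_j] = 0.5
    informed = x.max(axis=1).mean()           # informed producer: picks arg max_j x_j
    print(m, round(ignorant, 3), round(informed, 3))
# the informed outcome climbs toward 1 (the top of the support) as m grows,
# while the ignorant outcome stays at E[X_j] = 0.5 regardless of m
```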

In the intermediate cases where the producer knows y which carries partial information about \vec{x}, they can choose:

\hat j = arg \max_{j \in J} E[X_j \vert y]

= arg \max_{j \in J} \sum_{x_j} x_j P(X_j = x_j \vert y)

= arg \max_{j \in J} \sum_{x_j} x_j P(y \vert X_j = x_j) P(X_j = x_j)

The precise values of the terms here depend on the distributions X and Y. What we can know in general is that the more informative y is about x_j, the more the likelihood term P(y \vert X_j = x_j) dominates the prior P(X_j = x_j), and the better the condition of the consumer.

Note that in this model, it is the likelihood function P(y \vert X_j = x_j) that is the special information that the producer has. Knowledge of how evidence (a search query, a description of symptoms, etc.) is caused by an underlying desire or need is the expertise the consumers are seeking out. This begins to tie the economics of information to theories of statistical information.
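A minimal sketch of this intermediate recommendation step (Python; the distributions and the likelihood table are invented for illustration, and products are treated independently for simplicity):

```python
import numpy as np

# Hypothetical setup: two products, each with a binary utility level for the consumer.
# Rows are products j; columns are the possible utility values x_j.
utilities = np.array([0.0, 1.0])      # possible values of x_j
prior = np.array([[0.5, 0.5],         # P(X_0 = x) for product 0
                  [0.8, 0.2]])        # P(X_1 = x) for product 1

# Likelihood of the observed query y under each (product, utility) pair;
# this table is the producer's expertise.
likelihood_y = np.array([[0.2, 0.9],  # P(y | X_0 = x)
                         [0.5, 0.5]]) # P(y | X_1 = x): y is uninformative about product 1

posterior = likelihood_y * prior
posterior /= posterior.sum(axis=1, keepdims=True)   # P(X_j = x_j | y), per product
expected_utility = posterior @ utilities            # E[X_j | y] for each product j
print(expected_utility, expected_utility.argmax())  # recommend the j with highest E[X_j | y]
```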

Formalizing welfare implications of price discrimination based on personal information

In my last post I formalized Richard Posner’s 1981 argument concerning the economics of privacy. This is just one case of the economics of privacy. A more thorough analysis of the economics of privacy would consider the impact of personal information flow in more aspects of the economy. So let’s try another one.

One major theme of Shapiro and Varian’s Information Rules (1999) is the importance of price differentiation when selling information goods and how the Internet makes price differentiation easier than ever. Price differentiation likely motivates much of the data collection on the Internet, though it’s a practice that long predates the Internet. Shapiro and Varian point out that the “special offers” one gets from magazines for an extension to a subscription may well offer a personalized price based on demographic information. What’s more, this personalized price may well be an experiment, testing for the willingness of people like you to pay that price. (See Acquisti and Varian, 2005 for a detailed analysis of the economics of conditioning prices on purchase history.)

The point of this post is to analyze how a firm’s ability to differentiate its prices is a function of the knowledge it has about its customers and hence outcomes change with the flow of personal information. This makes personalized price differentiation a sub-problem of the economics of privacy.

To see this, let’s assume there are a number of customers, i \in I, where the number of customers is n = \left\vert{I}\right\vert. Let’s say each has a willingness to pay for the firm’s product, x_i, sampled from an underlying probability distribution x_i \sim X.

Note two things about how we are setting up this model. The first is that it closely mirrors our formulation of Posner’s argument about hiring job applicants. Whereas before the uncertain personal variable was aptitude for a job, in this case it is willingness to pay.

The second thing to note is that whereas it is typical to analyze price differentiation according to a model of supply and demand, here we are modeling the distribution of demand as a random variable. This is because we are interested in modeling information flow in a specific statistical sense. What we will find is that many of the more static economic tools translate well into a probabilistic domain, with some twists.

Now suppose the firm knows X but does not know any specific x_i. Knowing nothing to differentiate the customers, the firm will choose to offer the product at the same price z to everybody. Each customer will buy the product if x_i > z, and otherwise won’t. Each customer that buys the product contributes z to the firm’s utility (we are assuming an information good with near zero marginal cost). Hence, the firm will pick \hat z according to the following function:

\hat z = arg \max_z E[\sum_i z [x_i > z]]

= arg \max_z \sum_i E[z [x_i > z]]

= arg \max_z \sum_i z E[[x_i > z]]

= arg \max_z \sum_i z P(x_i > z)

= arg \max_z \sum_i z P(X > z)

Where [x_i > z] is a function with value 1 if x_i > z and 0 otherwise; this is using Iverson bracket notation.

This is almost identical to the revenue-optimizing strategy of price selection more generally, and it has a number of similar properties. One property is that for every customer for whom x_i > z, there is a consumer surplus of x_i - z, that feeling of joy the customer gets for having gotten something valuable for less than they would have been happy to pay for it. There is also the deadweight loss from the customers for whom z > x_i. These customers get 0 utility from the product and pay nothing to the producer despite their willingness to pay.
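Here is a numerical sketch of this uniform-price case, assuming (purely for illustration) that willingness to pay is distributed X ~ U[0, 1]:

```python
import numpy as np

# Profit-maximizing uniform price: maximize z * P(X > z) = z * (1 - z) for X ~ U[0, 1].
z = np.linspace(0, 1, 1001)
z_hat = z[np.argmax(z * (1 - z))]
print(z_hat)  # ~0.5

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=1_000_000)         # each customer's willingness to pay
buys = x > z_hat
print(buys.mean() * z_hat)                    # producer surplus per customer, ~0.25
print(np.where(buys, x - z_hat, 0.0).mean())  # consumer surplus per customer, ~0.125
print(np.where(~buys, x, 0.0).mean())         # deadweight loss per customer, ~0.125
```

The three quantities sum to E[X] = 0.5 per customer, the total welfare available in this market.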

Now consider the opposite extreme, wherein the producer knows the willingness to pay of each customer x_i and can pick a personalized price z_i accordingly. The producer can price z_i = x_i - \epsilon, effectively capturing the entire demand \sum_i x_i as producer surplus, while reducing all consumer surplus and deadweight loss to zero.

What are the welfare implications of the lack of consumer privacy?

Like in the case of Posner’s employer, the real winner here is the firm, who is able to capture all the value added to the market by the increased flow of information. In both cases we have assumed the firm is a monopoly, which may have something to do with this result.

As for consumers, there are two classes of impact. For those with x_i > \hat z, having their personal willingness to pay revealed to the firm means that they lose their consumer surplus. Their welfare is reduced.

Those consumers with x_i < \hat z discover that they can now afford the product, as it is priced close to their willingness to pay.

Unlike in Posner's case, "the people" here are more equal when their personal information is revealed to the firm because now the firm is extracting every spare ounce of joy it can from each of them, whereas before some consumers were able to enjoy low prices relative to their idiosyncratically high appreciation for the good.

What if the firm has access to partial information about each consumer y_i that is a clue to their true x_i without giving it away completely? Well, since the firm is a Bayesian reasoner they now have the subjective belief P(x_i \vert y_i) and will choose each z_i in a way that maximizes their expected profit from each consumer.

z_i = arg \max_z z P(x_i > z \vert y_i)

The specifics of the distributions X, Y, and P(Y | X) all matter for the particular outcomes here, but intuitively one would expect the results of partial information to fall somewhere between the extremes of undifferentiated pricing and perfect price discrimination.
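As a sketch of what this intermediate case looks like computationally (Python; the discrete posterior is assumed to have already been computed from y_i via Bayes’ rule, and all numbers are hypothetical):

```python
import numpy as np

support = np.array([0.2, 0.5, 0.8])    # possible values of customer i's willingness to pay x_i
posterior = np.array([0.1, 0.3, 0.6])  # assumed P(x_i = v | y_i) for the observed signal y_i

# Expected profit from customer i at price z is z * P(x_i > z | y_i). With discrete
# support, it is enough to check prices just below each support point.
for v in support:
    z = v - 1e-9
    profit = z * posterior[support > z].sum()
    print(round(v, 2), round(profit, 3))
# the firm charges the most profitable such z: a personalized price that moves from
# the uniform-price solution toward perfect discrimination as the posterior sharpens
```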

Perhaps the more interesting consequence of this analysis is that the firm has, for each consumer, a subjective probabilistic distribution of that consumer’s demand. Their best strategy for choosing the personalized price is similar to that of choosing a price for a large uncertain consumer demand base, only now the uncertainty is personalized. This probabilistic version of classic price differentiation theory may be more amenable to Bayesian methods, data science, etc.

References

Acquisti, A., & Varian, H. R. (2005). Conditioning prices on purchase history. Marketing Science, 24(3), 367-381.

Shapiro, C., & Varian, H. R. (1998). Information rules: a strategic guide to the network economy. Harvard Business Press.

Formalizing Posner’s economics of privacy argument

I’d like to take a more formal look at Posner’s economics of privacy argument, in light of other principles in economics of information, such as those in Shapiro and Varian’s Information Rules.

By “formal”, what I mean is that I want to look at the mathematical form of the argument. This is intended to strip out some of the semantics of the problem, which in the case of economics of privacy can lead to a lot of distracting anxieties, often for legitimate ethical reasons. However, there are logical realities that one must face despite the ethical conundrums they cause. Indeed, if there weren’t logical constraints on what is possible, then ethics would be unnecessary. So, let’s approach the blackboard, shall we?

In our interpretation of Posner’s argument, there are a number of applicants for a job, i \in I, where the number of candidates is n = \left\vert{I}\right\vert. Let’s say each is capable of performing at a certain level based on their background and aptitude, x_i. Their aptitude is sampled from an underlying probability distribution x_i \sim X.

There is an employer who must select an applicant for the job. Let’s assume that their capacity to pay for the job is fixed, for simplicity, and that all applicants are willing to accept the wage. The employer must pick an applicant i and gets utility x_i for their choice. Given no information on which to base her choice, she chooses a candidate randomly, which is equivalent to sampling once from X. Her expected value, given no other information on which to make the choice, is E[X]. The expected welfare of each applicant is their utility from getting the job (let’s say it’s 1 for simplicity) times their probability of being picked, which comes to \frac{1}{n}.

Now suppose the other extreme: the employer has perfect knowledge of the abilities of the applicants. Since she is able to pick the best candidate, her utility is \max x_i. Let \hat i = arg\max_{i \in I} x_i. Then the utility for applicant \hat i is 1, and it is 0 for the other applicants.

Some things are worth noting about this outcome. There is more inequality. All expected utility from the less qualified applicants has moved to the most qualified applicant. There is also an expected surplus of (\max x_i) - E[X] that accrues to the totally informed employer. One wonders whether a “safety net” could be provided for those who lose out in this change; if so, it would presumably be funded from this surplus. If the surplus were entirely taxed and redistributed among the applicants who did not get the job, it would provide each rejected applicant with \frac{(\max x_i) - E[X]}{n-1} utility. Adding a little complexity to the model, we could be more precise by computing the wage paid to the worker and identifying whether redistribution could potentially recover the losses of the weaker applicants.

What about intermediary conditions? These get more analytically complex. Suppose that each applicant i produces an application y_i which is reflective of their abilities. When the employer makes her decision, her expectation of the performance of each applicant is

P(x_i \vert y_i) \propto P(y_i \vert x_i)P(x_i)

because naturally the employer is a Bayesian reasoner. She makes her decision by maximizing her expected gain, based on this evidence:

arg \max_{i \in I} E[x_i \vert y_i]

= arg \max_{i \in I} \sum_{x_i} x_i p(x_i \vert y_i)

= arg \max_{i \in I} \sum_{x_i} x_i \frac{p(y_i \vert x_i) p(x_i)}{p(y_i)}

The particulars of the distributions X and Y, and especially P(Y \vert X), matter a great deal to the outcome. But from the expanded form of the equation we can see that the more revealing y_i is about x_i, the more the likelihood term p(y_i \vert x_i) will overcome the prior expectations. It would be nice to be able to capture the impact of this additional information in a general way. One would think that providing limited information about applicants to the employer would result in an intermediate outcome. Under reasonable assumptions, more qualified applicants would be more likely to be hired and the employer would accrue more value from the work.
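One way to see this is with a small simulation (Python; the Gaussian noise model for the applications y_i is an assumption made for illustration, not part of Posner’s argument):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 10, 100_000
x = rng.uniform(0, 1, size=(trials, n))      # applicant aptitudes x_i ~ U[0, 1]

for noise in [None, 1.0, 0.1, 0.0]:          # from no information to perfect information
    if noise is None:
        picks = np.zeros(trials, dtype=int)  # no signal: the choice is effectively random
    else:
        y = x + rng.normal(0, noise, size=x.shape)  # applications y_i = x_i + noise
        # with a flat prior and symmetric noise, ranking applicants by y_i
        # agrees with ranking them by posterior expected aptitude
        picks = y.argmax(axis=1)
    print(noise, round(x[np.arange(trials), picks].mean(), 3))
# the employer's expected utility climbs from E[X] = 0.5 toward
# E[max x_i] = n/(n+1) ~ 0.91 as the signal sharpens
```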

What this goes to show is how one’s evaluation of Posner’s argument about the economics of privacy really has little to do with the way one feels about privacy and much more to do with how one feels about equality and economic surplus. I’ve heard that a similar result has been discovered by Solon Barocas, though I’m not sure where in his large body of work to find it.

From information goods to information services

Continuing to read through Information Rules, by Shapiro and Varian (1999), I’m struck once again by its clear presentation and precise wisdom. Many of the core principles resonate with my experience in the software business when I left it in 2011 for graduate school. I think it’s fair to say that Shapiro and Varian anticipated the following decade of the economics of content and software distribution.

What they don’t anticipate, as far as I can tell, is what has come to dominate the decade after that, this decade. There is little in Information Rules that addresses the contemporary phenomena of cloud computing and information services, such as Software-as-a-Service, Platforms-as-a-Service, and Infrastructure-as-a-Service. Yet these are clearly the kinds of services that have come to dominate the tech market.

That’s an opening. According to a business manager in 2014, there’s no book yet on how to run a SaaS company. While I’m sure that if I were slightly less lazy I would find several, I wonder if they are any good. By “any good”, I mean: would they hold up to scientific standards in their elucidation of economic law, as opposed to being, you know, business books.

One of the challenges of working on this, which has bothered me since I first became curious about these problems, is that there is no good, elegant formalism available for representing competition between computing agents. The best that’s out there is probably in the AI literature. But that literature is quite messy.

Working up from something like Information Rules might be a more promising way of getting at some of these problems. For example, Shapiro and Varian start from the observation that information goods have high fixed (often, sunk) costs and low marginal costs to reproduce. This leads them to the conclusion that the market cannot look like a traditional competitive market with multiple firms selling similar goods but rather must either have a single dominant firm or a market of many similar but differentiated products.

The problem here is that most information services, even “simple” ones like a search engine, are not delivering a good. They are being responsive to some kind of query. The specific content and timing of the query, along with the state of the world at the time of the query, are unique. Consumers may make the same query with varying demand. The value-adding activity is not so much creating the good as it is selecting the right response to the query. And who can say how costly this is, marginally?

On the other hand, this framing obscures something important about information goods, which is that all information goods are, in a sense, a selection of bits from the wide range of possible bits one might send or receive. This leads to my other frustration with information economics, which is that it is insufficiently tied to the statistical definition of information and the modeling tools that have been built around it. This is all the more frustrating because I suspect that in advanced industrial settings these connections have been made and are used with confidence. However, they have been slow to make it into mainstream understanding. There’s another opportunity here.

Shapiro and Varian: scientific “laws of economics”

I’ve been amiss in not studying Shapiro and Varian’s Information Rules: A Strategic Guide to the Network Economy (1998, link) more thoroughly. In my years in the tech industry and academic study, there are few sources that deal with the practical realities of technology and society as clearly as Shapiro and Varian. As I now turn my attention more towards the rationale for various forms of information law and find how much of it is driven by considerations of economics, I have to wonder why this was not something I’ve given more emphasis in my graduate study so far.

The answer that comes immediately to mind is that throughout my academic study of the past few years I’ve encountered a widespread hostility to economics from social scientists of other disciplines. This hostility resembles, though is somewhat different from, the hostility social scientists of other stripes have had (in my experience) for engineers. The critiques have been along the lines that economists are powerful disproportionately to the insight provided by the field, that economists are focused too narrowly on certain aspects of social life to the exclusion of others that are just as important, that economists are arrogant in their belief that their insights about incentives apply to other areas of social life besides the narrow concerns of the economy, that economists mistakenly think their methods are more scientific or valid than those of other social scientists, that economics is in the business of enshrining legal structures into place that give their conclusions more predictive power than they would have in other legal regimes and, as of the most recent news cycle, that the field of economics is hostile to women.

This is a strikingly familiar pattern of disciplinary critique, as it seems to be the same one levied at any field that aims to “harden” inquiry into social life. The encroachment of engineering disciplines and physicists into social explanation has come with similar kinds of criticism. These criticisms, it must be noted, contain at least one contradiction: should economists be concerned about issues besides the economy, or not? But the key issue, as with most disciplinary spats, is the politics of a lot of people feeling dismissed or unheard or unfunded.

Putting all this aside, what’s interesting about the opening sections of Shapiro and Varian’s book is their appeal to the idea of laws of economics, as if there were such laws analogous to laws of physics. The idea is that trends in the technology economy are predictable according to these laws, which have been learned through observation and formalized mathematically, and that these laws should therefore be taught for the benefit of those who would like to participate successfully in that economy.

This is an appealing idea, though one that comes under criticism, you know, from the critics, with a predictability that almost implies a social scientific law. This has been a debate going back to discussions of Marx and communism. Early theorists of the market declared themselves to have discovered economic laws. Marx, incidentally, also declared that he had discovered (different) economic laws, albeit according to the science of dialectical materialism. But the latter declared that the former economic theories hide the true scientific reality of the social relations underpinning the economy. These social relations allowed for the possibility of revolution in a way that an economy of goods and prices abstracted from society did not.

As one form of the story goes, the 20th century had its range of experiments with ways of running an economy. Those most inspired by Marxism had mass famines and other unfortunate consequences. Those that took their inspiration from the continually evolving field of increasingly “neo”-classical economics, with its variations of Keynesianism, monetarism, and the rest, had some major bumps (most recently the 2008 financial crisis) but tend to improve over time with historical understanding and the discovery of, indeed, laws of economics. And this is why Janet Yellen and Mario Draghi are now warning against removing the post-crisis financial market regulations.

This offers an anecdotal counter to the narrative that all economists ever do is justify more terrible deregulation at the expense of the lived experience of everybody else. The discovery of laws of economics can, indeed, be the basis for economic regulation; in fact this is often the case. In point of fact, it may be that this is one of the things that tacitly motivates the undermining of economic epistemology: the fact that if the laws of economics were socially determined to be true, like the laws of physics, such that everybody ought to know them, it would lead to democratic will for policies that would be opposed to the interests of those who have heretofore enjoyed the advantage of their privileged (i.e., not universally shared) access to the powerful truth about markets, technology, etc.

Which is all to say: I believe that condemnations of economics as a field are quite counterproductive, socially, and that the scientific pursuit of the discovery of economic laws is admirable and worthy. Those that criticize economics for this ambition, and teach their students to do so, imperil everyone else and should stop.

Notes on Posner’s “The Economics of Privacy” (1981)

Lately my academic research focus has been privacy engineering, the designing of information processing systems that preserve the privacy of their users. I have been looking at the problem particularly through the lens of Contextual Integrity, a theory of privacy developed by Helen Nissenbaum (2004, 2009). According to this theory, privacy is defined as appropriate information flow, where “appropriateness” is determined relative to social spheres (such as health, education, finance, etc.) that have evolved norms based on their purpose in society.

To my knowledge, most existing scholarship on Contextual Integrity consists of applications of a heuristic process associated with the theory that evaluates the privacy impact of new technology. In this process, one starts by identifying a social sphere (or context, but I will use the term social sphere as I think it’s less ambiguous) and its normative structure. For example, if one is evaluating the role of a new kind of education technology, one would identify the roles of the education sphere (teachers, students, guardians of students, administrators, etc.), the norms of information flow that hold in the sphere, and the disruptions to these norms the technology is likely to cause.

I’m coming at this from a slightly different direction. I have a background in enterprise software development, data science, and social theory. My concern is with the ways that technology is now part of the way social spheres are constituted. For technology to not just address existing norms but deal adequately with how it self-referentially changes how new norms develop, we need to focus on the parts of Contextual Integrity that have heretofore been in the background: the rich social and metaethical theory of how social spheres and their normative implications form.

Because the ultimate goal is the engineering of information systems, I am leaning towards mathematical modeling methods that trade well between social scientific inquiry and technical design. Mechanism design, in particular, is a powerful framework from mathematical economics that looks at how different kinds of structures change the outcomes for actors participating in “games” that involve strategic action and information flow. While mathematical economic modeling has been heavily critiqued over the years, for example on the basis that people do not act with the unbounded rationality such models can imply, these models can be a first step and valuable in a technical context, especially as they establish the limits of a system’s manipulability by non-human actors such as AI. This latter standard makes this sort of model more relevant than it has ever been.

This is my roundabout way of beginning to investigate the fascinating field of privacy economics. I am a new entrant. So I found what looks like one of the earliest highly cited articles on the subject written by the prolific and venerable Richard Posner, “The Economics of Privacy”, from 1981.

[Image: Richard Posner, from Wikipedia]

Wikipedia reminds me that Posner is politically conservative, though apparently he has changed his mind recently, coming out in support of gay marriage and, since the 2008 financial crisis, distancing himself from the laissez faire rational choice economic model that underlies his legal theory. As I have mainly learned about privacy scholarship from more left-wing sources, it was interesting reading an article that comes from a different perspective.

Posner’s opening position is that the most economically interesting aspect of privacy is the concealment of personal information, and that this is interesting mainly because privacy is bad for market efficiency. He raises examples of employers and employees searching for each other and potential spouses searching for each other. In these cases, “efficient sorting” is facilitated by perfect information on all sides. Privacy is foremost a way of hiding disqualifying information–such as criminal records–from potential business associates and spouses, leading to a market inefficiency. I do not know why Posner does not cite Akerlof (1970) on the “market for ‘lemons'” in this article, but it seems to me that this is the economic theory most reflective of this economic argument. The essential question raised by this line of argument is whether there’s any compelling reason why the market for employees should be any different from the market for used cars.

Posner raises and dismisses each objection he can find. One objection is that employers might heavily weight factors they should not, such as mental illness, gender, or homosexuality. He claims that there’s evidence to show that people are generally rational about these things and there’s no reason to think the market can’t make these decisions efficiently despite fear of bias. I assume this point has been hotly contested from the left since the article was written.

Posner then looks at the objection that privacy provides a kind of social insurance to those with “adverse personal characteristics” who would otherwise not be hired. He doesn’t like this argument because he sees it as allocating the costs of that person’s adverse qualities to a small group that has to work with that person, rather than spreading the cost very widely across society.

Whatever one thinks about whose interests Posner seems to side with and why, it is refreshing to read an article that at the very least establishes the trade-offs around privacy somewhat clearly. Yes, discrimination of many kinds is economically inefficient. We can expect the best performing companies to have progressive hiring policies because that would allow them to find the best talent. That’s especially true if there are large social biases otherwise unfairly skewing hiring.

On the other hand, the whole idea of “efficient sorting” assumes a policy-making interest that I’m pretty sure logically cannot serve the interests of everyone so sorted. It implies a somewhat brutally Darwinist stratification of personnel. It’s quite possible that this is not healthy for an economy in the long term. On the other hand, in this article Posner seems open to other redistributive measures that would compensate for opportunities lost due to revelation of personal information.

There’s an empirical part of the paper in which Posner shows that the percentage of black and Hispanic populations in a state is significantly correlated with the existence of state-level privacy statutes relating to credit, arrest, and employment history. He tries to spin this as an explanation of privacy statutes as the result of strongly organized black and Hispanic political organizations successfully lobbying in their interest on top of existing anti-discrimination laws. I would say that the article does not provide enough evidence to strongly support this causal theory. It would be a stronger argument if the regression had taken into account the racial differences in credit, arrest, and employment history state by state, rather than just assuming that this connection is so strong it supports this particular interpretation of the data. However, it is interesting that this variable was more strongly correlated with the existence of privacy statutes than several other variables of interest. It was probably my own ignorance that made me not consider how strongly privacy statutes are part of a social justice agenda, broadly speaking. Considering that disparities in credit, arrest, and employment history could well be the result of other unjust biases, privacy winds up mitigating the anti-signal that these injustices send in the employment market. In other words, it’s not hard to get from Posner’s arguments to a pro-privacy position based, of all things, on market efficiency.

It would be nice to model that more explicitly, if it hasn’t been done yet already.

Posner is quite bullish on privacy tort, thinking that it is generally not so offensive from an economic perspective largely because it’s about preventing misinformation.

Overall, the paper is a valuable starting point for further study in economics of privacy. Posner’s economic lens swiftly and clearly puts the trade-offs around privacy statutes in the light. It’s impressively lucid work that surely bears directly on arguments about privacy and information processing systems today.

References

Akerlof, G. A. (1970). The market for “lemons”: Quality uncertainty and the market mechanism. The Quarterly Journal of Economics, 84(3), 488-500.

Nissenbaum, H. (2004). Privacy as contextual integrity. Wash. L. Rev., 79, 119.

Nissenbaum, H. (2009). Privacy in context: Technology, policy, and the integrity of social life. Stanford University Press.

Posner, R. A. (1981). The economics of privacy. The American economic review, 71(2), 405-409. (jstor)

Capital, democracy, and oligarchy

1. Capital

Bourdieu nicely lays out a taxonomy of forms of capital (1986), including economic capital (wealth), which we are all familiar with, as well as cultural capital (skills, elite tastes) and social capital (relationships with others, especially other elites). By saying that all three categories are forms of capital, what he means is that each “is accumulated labor (in its materialized form or its ‘incorporated,’ embodied form) which, when appropriated on a private, i.e., exclusive, basis by agents or groups of agents, enables them to appropriate social energy in the form of reified or living labor.” In his account, capital in all its forms is what gives society its structure, including especially its economic structure.

[Capital] is what makes the games of society – not least, the economic game – something other than simple games of chance offering at every moment the possibility of a miracle. Roulette, which holds out the opportunity of winning a lot of money in a short space of time, and therefore of changing one’s social status quasi-instantaneously, and in which the winning of the previous spin of the wheel can be staked and lost at every new spin, gives a fairly accurate image of this imaginary universe of perfect competition or perfect equality of opportunity, a world without inertia, without accumulation, without heredity or acquired properties, in which every moment is perfectly independent of the previous one, every soldier has a marshal’s baton in his knapsack, and every prize can be attained, instantaneously, by everyone, so that at each moment anyone can become anything. Capital, which, in its objectified or embodied forms, takes time to accumulate and which, as a potential capacity to produce profits and to reproduce itself in identical or expanded form, contains a tendency to persist in its being, is a force inscribed in the objectivity of things so that everything is not equally possible or impossible. And the structure of the distribution of the different types and subtypes of capital at a given moment in time represents the immanent structure of the social world, i.e. , the set of constraints, inscribed in the very reality of that world, which govern its functioning in a durable way, determining the chances of success for practices.

Bourdieu is clear in his writing that he does not intend this to be taken as unsubstantiated theoretical posture. Rather, it is a theory he has developed through his empirical research. Obviously, it is also informed by many other significant Western theorists, including Kant and Marx. There is something slightly tautological about the way he defines his terms: if capital is posited to explain all social structure, then any social structure may be explained according to a distribution of capital. This leads Bourdieu to theorize about many forms of capital less obvious than wealth, such as symbolic capital, like academic degrees.

The cost of such a theory is that it demands that one begin the difficult task of enumerating different forms of capital and, importantly, the ways in which some forms of capital can be converted into others. It is a framework which, in principle, could be used to adequately explain social reality in a properly scientific way, as opposed to other frameworks that seem more intended to maintain the motivation of a political agenda or academic discipline. Indeed there is something “interdisciplinary” about the very proposal to address symbolic and economic power in a way that deals responsibly with their commensurability.

So it has to be posited simultaneously that economic capital is at the root of all the other types of capital and that these transformed, disguised forms of economic capital, never entirely reducible to that definition, produce their most specific effects only to the extent that they conceal (not least from their possessors) the fact that economic capital is at their root, in other words – but only in the last analysis – at the root of their effects. The real logic of the functioning of capital, the conversions from one type to another, and the law of conservation which governs them cannot be understood unless two opposing but equally partial views are superseded: on the one hand, economism, which, on the grounds that every type of capital is reducible in the last analysis to economic capital, ignores what makes the specific efficacy of the other types of capital, and on the other hand, semiologism (nowadays represented by structuralism, symbolic interactionism, or ethnomethodology), which reduces social exchanges to phenomena of communication and ignores the brutal fact of universal reducibility to economics.

[I must comment that after years in an academic environment where sincere intellectual effort seemed effectively boobytrapped by disciplinary trip wires around ethnomethodology, quantification, and so on, this Bourdieusian perspective continues to provide me fresh hope. I’ve written here before about Bourdieu’s Science of Science and Reflexivity (2004), which was a wake up call for me that led to my writing this paper. That has been my main entrypoint into Bourdieu’s thought until now. The essay I’m quoting from now was published at least fifteen years prior and by its 34k citations appears to be a classic. Much of what’s written here will no doubt come across as obvious to the sophisticated reader. It is a symptom of a perhaps haphazard education that leads me to write about it now as if I’ve discovered it; indeed, the personal discovery is genuine for me, and though it is not a particularly old work, reading it and thinking it over carefully does untangle some of the knots in my thinking as I try to understand society and my role in it. Perhaps some of that relief can be shared through writing here.]

Naturally, Bourdieu’s account of capital is more nuanced and harder to measure than an economist’s. But it does not preclude an analysis of economic capital such as Piketty’s. Indeed, much of the economist’s discussion of human capital, especially technological skill, and its relationship to wages can be mapped to a discussion of a specific form of cultural capital and how it can be converted into economic capital. A helpful aspect of this shift is that it allows one to conceptualize the effects of class, gender, and racial privilege in the transmission of technical skills. Cultural capital is, explicitly in Bourdieu’s account, labor intensive to transmit, and often transmitted informally. Cultural tendencies to transmit this kind of capital preferentially to men instead of women in the family home become a viable explanation for the gender gap in the tech industry. While this is perhaps not a novel explanation, it is a significant one, and Bourdieu’s theory helps us formulate it in a specific and testable way that transcends, as he says, both economism and semiologism, which seems productive when one is discussing society in a serious way.

One could also use a Bourdieusian framework to understand innovation spillover effects, as economists like to discuss, or the rise of Silicon Valley’s “Regional Advantage” (Saxenian, 1996), to take a specific case. One of Saxenian’s arguments (as I gloss it) is that Silicon Valley was more economically effective as a region than Route 128 in Massachusetts because the influx of engineers experimenting with new business models and reinvesting their profits into other new technology industries created a confluence of relevant cultural capital (technical skill) and economic capital (venture capital) that allowed the economic capital to be deployed more effectively. In other words, it wasn’t that the engineers in Silicon Valley were better engineers than the engineers in Route 128; it was that in Route 128 the economic capital was being deployed in a way that was less informed by technical knowledge. [Incidentally, if this argument is correct, then in some ways it undermines an argument put forward recently for setting up a “cyber workforce incubator” for the Federal Government in the Bay Area based on the idea that it’s necessary to tap into the labor pool there. If what makes Silicon Valley is smart capital rather than smart engineers, then that explains why there are so many engineers there (they are following the money) but also suggests that the price of technical labor there may be inflated. Engineers elsewhere may be just as good at being part of a cyber workforce. Which is just to say that when Bourdieusian theory is taken seriously, it can have practical policy implications.]

One must imagine, when considering society thus, that one could in principle map out the whole of society and the distribution of capitals within it. I believe Bourdieu does something like this in Distinction (1979), which I haven’t read–it is sadly referred to in the United States as the kind of book that is too dense to read. This is too bad.

But I was going to talk about…

2. Democracy

There are at least two great moments in history when democracy flourished. They have something in common.

One is Ancient Greece. The account of the polis in Hannah Arendt’s The Human Condition makes the familiar point that the citizens of the Ancient Greek city-state were masters of economically independent households. It was precisely the independence of politics (polis – city) from household economic affairs (oikos – house) that defined political life. Owning capital, in this case land and maybe slaves, was a condition for democratic participation. The democracy, such as it was, was the political unity of otherwise free capital holders.

The other historical moment is the rise of the mercantile class and the emergence of the democratic public sphere, as detailed by Habermas. If the public sphere Habermas described (and to some extent idealized) has been critiqued as being “bourgeois masculinist” (Fraser), that critique is telling. The bourgeoisie were precisely those who were owners of newly activated forms of economic capital–ships, mechanizing technologies, and the like.

If we can look at the public sphere in its original form realistically, through the disillusionment of criticism, rational discourse among capital holders was strategically necessary for the bourgeoisie to make collective decisions about how to allocate their economic capital. Viewed through the objective lens of information processing and pure strategy, the public sphere was an effective means of economic coordination that complemented the rise of the Weberian bureaucracy, which provided a predictable state and also created new demand for legal professionals and the early information workers: clerks and scriveners and such.

The diversity of professions necessary for the functioning of the modern mercantile state created a diversity of forms of cultural capital that could be exchanged for economic capital. Hence, capital diffused from its concentration in the aristocracy into the hands of the widening class of the bourgeoisie.

Neither the Ancient Greek nor the mercantile democracies were particularly inclusive. Perhaps there is no historical precedent for a fully inclusive democracy. Rather, there is precedent for egalitarian alliances of capital holders in cases where that capital is broadly enough distributed to constitute citizenship as an economic class. Moreover, I must insert here that the Bourdieusian model suggests that citizenship could extend through the diffusion of non-economic forms of capital as well. For example, membership in the clergy was a form of capital taken on by some of the gentry; this came, presumably, with symbolic and social capital. The public sphere created opportunities for the public socialite that were distinct from the opportunities of the courtier or courtesan. And so on.

However exclusive these democracies were, Fraser’s account of subaltern publics and counterpublics is of course very significant. What about the early workers’ and women’s movements? Arguably these too can be understood in Bourdieusian terms. There were other forms of (social and cultural, if not economic) capital that workers and women in particular had available, which provided the basis for their shared political interest and political participation.

What I’m suggesting is that:

  • Historically, the democratic impulse has been about uniting the interests of freeholders of capital.
  • A Bourdieusian understanding of capital allows us to maintain this (analytically helpful) understanding of democracy while also acknowledging the complexity of social structure, through the many forms of capital.
  • The complexity of society, through the proliferation of forms of capital, is one of, if not the, main mechanisms for expanding effective citizenship, which is still conditioned on capital ownership even though we like to pretend it’s not.

Which leads me to my last point, which is about…

3. Oligarchy

If a democracy is a political unity of many different capital holders, what then is oligarchy in contrast?

Oligarchy is rule of the few, especially the rich few.

We know, through Bourdieu, that there are many ways to be rich (not just economic ways). Nevertheless, capital (in its many forms) is very unevenly distributed, which accounts for social structure.

To some extent, it is unrealistic to expect the flattening of this distribution. Society is accumulated history and there has been a lot of history and most of it has been brutally unkind.

However, there have been times when capital (in its many forms) has diffused because of the terms of capital exchange, broadly speaking. The functional separation of different professions was one way in which capital was fragmented into many differently exchangeable forms of cultural, social, and economic capitals. A more complex society is therefore a more democratic one, because of the diversity of forms of capital required to manage it. [I suspect there’s a technically specific way to make this point but don’t know how to do it yet.]

There are some consequences of this.

  1. Inequality in the sense of a very skewed distribution of capital and especially economic capital does in fact undermine democracy. You can’t really be a citizen unless you have enough capital to be able to act (use your labor) in ways that are not fully determined by economic survival. And of course this is not all or nothing; quantity of capital and relative capital do matter even beyond a minimum threshold.
  2. The second is that (1) can’t be the end of the story. Rather, to judge whether the capital distribution of e.g. a nation can sustain a democracy, you need to account for many kinds of capital, not just economic capital, and see how these are distributed and exchanged. In other words, it’s necessary to look at the political economy broadly speaking. (But, I think, it’s helpful to do so in terms of ‘forms of capital’.)

One example, which I just learned recently, is this. In the United States, we have an independent judiciary, a third branch of government. This is different from other countries that are allegedly oligarchies, notably Russia but also Rhode Island before 2004. One could ask: is this Separation of Powers important for democracy? The answer is intuitively “yes”, and though I’m sure very smart things have been written to answer the question “why”, I haven’t read them, because I’ve been too busy blogging….

Instead, I have an answer for you based on the preceding argument. It was a new idea for me. It was this: what separation of powers does is it constructs a form of cultural capital, associated with professional lawyers, which is less exchangeable for economic and other forms of capital than in places where the non-independence of the judiciary leads to more regular bribery, graft, and preferential treatment. Because it mediates economic exchanges, this greatly limits the ability of economic capital to bulldoze other forms of capital, and the accompanying social structures (and social strictures) that bind it. It also creates a new professional class who can own this kind of capital and thereby accomplish citizenship.

Coda

In this blog post, I’ve suggested that not everybody who, for example, legally has suffrage in a nominally democratic state is, in an effective sense, a citizen. Only capital owners can be citizens.

This is not intended in any way to be a normative statement about who should or should not be a citizen. Rather, it is a descriptive statement about how power is distributed in nominal democracies. To be an effective citizen, you need to have some kind of surplus of social power; capital is the objectification of that social power.

The project of expanding democracy, if it is to be taken seriously, needs to be understood as the project of expanding capital ownership. This can include the redistribution of economic capital. It can also mean changing the institutions that ground cultural and social capitals in ways that distribute other forms of capital more widely. Diversifying professional roles is one way of doing this.

Nothing I’ve written here is groundbreaking, for sure. It is for me a clearer way to think about these issues than I have had before.