
Formalizing welfare implications of price discrimination based on personal information

In my last post I formalized Richard Posner’s 1981 argument concerning the economics of privacy. But hiring is just one case. A more thorough analysis of the economics of privacy would consider the impact of personal information flow on more aspects of the economy. So let’s try another one.

One major theme of Shapiro and Varian’s Information Rules (1999) is the importance of price differentiation when selling information goods and how the Internet makes price differentiation easier than ever. Price differentiation likely motivates much of the data collection on the Internet, though it’s a practice that long predates the Internet. Shapiro and Varian point out that the “special offers” one gets from magazines for an extension to a subscription may well offer a personalized price based on demographic information. What’s more, this personalized price may well be an experiment, testing for the willingness of people like you to pay that price. (See Acquisti and Varian, 2005 for a detailed analysis of the economics of conditioning prices on purchase history.)

The point of this post is to analyze how a firm’s ability to differentiate its prices is a function of the knowledge it has about its customers, and hence how market outcomes change with the flow of personal information. This makes personalized price differentiation a sub-problem of the economics of privacy.

To see this, let’s assume there are a number of customers for a product, i \in I, where the number of customers is n = \left\vert{I}\right\vert. Each customer i has a willingness to pay for the firm’s product, x_i, sampled independently from an underlying probability distribution: x_i \sim X.

Note two things about how we are setting up this model. The first is that it closely mirrors our formulation of Posner’s argument about hiring job applicants. Whereas before the uncertain personal variable was aptitude for a job, in this case it is willingness to pay.

The second thing to note is that whereas it is typical to analyze price differentiation according to a model of supply and demand, here we are modeling the distribution of demand as a random variable. This is because we are interested in modeling information flow in a specific statistical sense. What we will find is that many of the more static economic tools translate well into a probabilistic domain, with some twists.

Now suppose the firm knows X but does not know any specific x_i. Knowing nothing that differentiates the customers, the firm will offer the product at the same price z to everybody. Each customer will buy the product if x_i > z, and otherwise won’t. Each customer that buys the product contributes z to the firm’s utility (we are assuming an information good with near-zero marginal cost). Hence, the firm will pick \hat z by solving the following optimization:

\hat z = arg \max_z E[\sum_i z [x_i > z]]

= arg \max_z \sum_i E[z [x_i > z]]

= arg \max_z \sum_i z E[[x_i > z]]

= arg \max_z \sum_i z P(x_i > z)

= arg \max_z n z P(X > z)

Here [x_i > z] is Iverson bracket notation: a function with value 1 if x_i > z and 0 otherwise. The last step uses the fact that the x_i are identically distributed, so each of the n terms in the sum is the same.
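As a concrete check, consider a worked example (my choice of distribution, not anything from the original argument): suppose willingness to pay is exponentially distributed with rate \lambda, so that P(X > z) = e^{-\lambda z}. Then the firm maximizes n z e^{-\lambda z}, and

\frac{d}{dz} z e^{-\lambda z} = e^{-\lambda z}(1 - \lambda z) = 0 \implies \hat z = 1/\lambda

so the revenue-maximizing uniform price is exactly the mean willingness to pay.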

This is almost identical to the revenue-maximizing strategy of price selection more generally, and it has a number of similar properties. One property is that every customer for whom x_i > z enjoys a consumer surplus of x_i - z, that feeling of joy the customer gets for having gotten something valuable for less than they would have been happy to pay for it. There is also a deadweight loss from customers for whom z > x_i: these customers get no utility from the product and pay nothing to the producer, despite their positive willingness to pay.
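To make the welfare accounting explicit in the notation we already have:

PS(z) = \sum_i z [x_i > z] \qquad CS(z) = \sum_i (x_i - z) [x_i > z] \qquad DWL(z) = \sum_i x_i [x_i \leq z]

These three quantities always sum to \sum_i x_i, the total surplus available in the market, so any welfare not captured by the producer or enjoyed by consumers is simply lost.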

Now consider the opposite extreme, wherein the producer knows the willingness to pay of each customer x_i and can pick a personalized price z_i accordingly. The producer can price at z_i = x_i - \epsilon, effectively capturing the entire demand \sum_i x_i as producer surplus (in the limit as \epsilon \to 0) while driving both consumer surplus and deadweight loss to zero.
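Here is a minimal simulation sketch of the two regimes in Python. The lognormal choice for X, the grid search, and all the constants are my assumptions for illustration only:

    import numpy as np

    rng = np.random.default_rng(0)

    # Assumption for illustration: willingness to pay is lognormal.
    x = rng.lognormal(mean=0.0, sigma=1.0, size=10_000)

    # Uniform pricing: choose z to maximize z * P(X > z) on a grid.
    grid = np.linspace(0.01, x.max(), 2_000)
    revenue = np.array([z * np.mean(x > z) for z in grid])
    z_hat = grid[revenue.argmax()]

    buyers = x > z_hat
    ps_uniform = z_hat * buyers.sum()        # producer surplus
    cs_uniform = (x[buyers] - z_hat).sum()   # consumer surplus
    dwl_uniform = x[~buyers].sum()           # deadweight loss

    # Perfect discrimination: z_i = x_i - eps captures all demand
    # as producer surplus in the eps -> 0 limit.
    ps_discrim = x.sum()

    print(f"uniform price z_hat = {z_hat:.3f}")
    print(f"producer surplus: {ps_uniform:.1f} (uniform) vs {ps_discrim:.1f} (discriminating)")
    print(f"consumer surplus: {cs_uniform:.1f} -> 0; deadweight loss: {dwl_uniform:.1f} -> 0")

The pattern is always the same in these runs: the discriminating producer’s surplus equals the old producer surplus plus both the consumer surplus and the deadweight loss it used to leave on the table.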

What are the welfare implications of the lack of consumer privacy?

As in the case of Posner’s employer, the real winner here is the firm, which is able to capture all the value added to the market by the increased flow of information. In both cases we have assumed the firm is a monopoly, which may have something to do with this result.

As for consumers, there are two classes of impact. For those with x_i > \hat z, having their personal willingness to pay revealed to the firm means that they lose their consumer surplus. Their welfare is reduced.

Consumers with x_i < \hat z, on the other hand, discover that they can now afford the product, since it is priced just below their willingness to pay.

Unlike in Posner’s case, “the people” here are more equal when their personal information is revealed to the firm, because now the firm is extracting every spare ounce of joy it can from each of them, whereas before some consumers were able to enjoy low prices relative to their idiosyncratically high appreciation for the good.

What if the firm has access to partial information about each consumer, y_i, that is a clue to their true x_i without giving it away completely? Well, since the firm is a Bayesian reasoner, it now has the subjective belief P(x_i \vert y_i) and will choose each z_i to maximize its expected profit from that consumer:

z_i = arg \max_z E[z [x_i > z] \vert y_i] = arg \max_z z P(x_i > z \vert y_i)

The specifics of the distributions X, Y, and P(Y | X) all matter for the particular outcomes here, but intuitively one would expect the results of partial information to fall somewhere between the extremes of undifferentiated pricing and perfect price discrimination.
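As a sketch of what this looks like computationally, here is a minimal example under a model entirely of my own choosing: a Gaussian prior over x_i with a Gaussian noisy signal y_i, which makes the posterior P(x_i \vert y_i) available in closed form.

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)

    # Assumed model (mine, for illustration): x_i ~ N(mu0, tau^2),
    # and the firm observes a noisy clue y_i = x_i + N(0, sigma^2).
    mu0, tau, sigma = 10.0, 3.0, 2.0

    x = rng.normal(mu0, tau, size=5)         # true willingness to pay
    y = x + rng.normal(0.0, sigma, size=5)   # what the firm observes

    # Conjugate Gaussian update: x_i | y_i ~ N(m_i, s^2).
    s2 = 1.0 / (1.0 / tau**2 + 1.0 / sigma**2)
    m = s2 * (mu0 / tau**2 + y / sigma**2)
    s = np.sqrt(s2)

    # Personalized price: z_i = argmax_z z * P(x_i > z | y_i), by grid search.
    grid = np.linspace(0.01, mu0 + 4 * tau, 2_000)
    z = np.array([grid[np.argmax(grid * norm.sf((grid - mi) / s))] for mi in m])

    for xi, yi, zi in zip(x, y, z):
        print(f"x_i = {xi:5.2f}   y_i = {yi:5.2f}   z_i = {zi:5.2f}")

As the signal noise \sigma shrinks, the posterior concentrates on the true x_i and z_i approaches the perfectly discriminating price; as \sigma grows, every z_i collapses toward the single undifferentiated price.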

Perhaps the more interesting consequence of this analysis is that the firm has, for each consumer, a subjective probabilistic distribution of that consumer’s demand. Their best strategy for choosing the personalized price is similar to that of choosing a price for a large uncertain consumer demand base, only now the uncertainty is personalized. This probabilistic version of classic price differentiation theory may be more amenable to Bayesian methods, data science, etc.

References

Acquisti, A., & Varian, H. R. (2005). Conditioning prices on purchase history. Marketing Science, 24(3), 367-381.

Shapiro, C., & Varian, H. R. (1999). Information rules: A strategic guide to the network economy. Harvard Business School Press.

Filtering feeds

About a week ago Subtraction made a long post complaining about the main problem of feed aggregators:

No matter how much I try to organize it, it’s always in disarray, overflowing with unread posts and encumbered with mothballed feeds. … The whole process frustrates me though, mostly because I feel like I shouldn’t have to do it at all. The software should just do it for me.

These are my reactions to this, roughly in order:

  • I feel the pain of feed bloat myself, and know many others who do. It’s another symptom of the internet-enabled information explosion.
  • It’s amazing that we live in an era when a feeling of entitlement about our interactions with web technology isn’t seen as outright ridiculous. It’s true: it does feel surprising that somebody smart hasn’t solved this problem for everybody yet.
  • It probably hasn’t been solved yet because it’s a tough problem. It’s not easy to program a computer to know What I Find Interesting…

…or is it? This is, after all, what various web services have fought to do well for us ever since the dawn of the search engine.  And the results are pretty good right now.  So there must be a good way to solve this problem.

As far as I can tell, there are two successful ways of doing smart filtering-for-people on the internet, both of which are being applied to feeds: algorithmic filtering, where software is trained to predict what’s relevant, and social filtering, where other people’s judgments surface the good stuff.

The most interesting solutions to these kinds of problems are collaborative filtering algorithms that combine both methods. This is why Gmail’s spam filter is so good: it uses the input of its gillions of users to collaboratively train its algorithmic filter. StumbleUpon is probably my favorite implementation of this for general web content, although its closed-ness spooks me out.
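For the curious, here is a toy sketch of the “social” half of that combination: score my unseen items by the votes of users who have tended to agree with me. This is purely illustrative and implies nothing about how Gmail, StumbleUpon, or Melkjug actually work.

    import numpy as np

    # Toy data: rows are users, columns are feed items;
    # +1 = liked, -1 = disliked, 0 = unseen.
    votes = np.array([
        [ 1,  1,  0, -1],   # me
        [ 1,  1,  1, -1],   # user A
        [-1,  0,  1,  1],   # user B
    ])

    me, others = votes[0], votes[1:]

    # Weight each other user by agreement with me on items we both rated.
    overlap = (me != 0) & (others != 0)
    weights = np.where(overlap, me * others, 0).sum(axis=1)

    # Predicted score for my unseen items: weighted sum of others' votes.
    scores = weights @ others
    for item in np.flatnonzero(me == 0):
        print(f"item {item}: predicted score {scores[item]:+d}")

A real system would combine this social signal with per-user trained content filters, but the shape of the computation is the same.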

We’re working on applying collaborative filtering methods to feeds at The Open Planning Project. Specifically, Luke Tucker has been developing Melkjug, an open source collaborative filtering feed aggregator. It’s currently in version 0.2.1. To get involved in the project, check out the Melkjug Project page on OpenPlans.org.