Information flow in economics

by Sebastian Benthall

We have formalized three different cases of information economics:

Information about supply quality. For example, Posner’s case of an employer reviewing job applicants.
Information about consumer willingness-to-pay. As in the case of a firm engaged in personalized price discrimination.
Expertise. Where the service being sold involves special insight that the consumer does not have.

What we discovered is that each of these cases has, to some extent, a common form. That form is this:

There is a random variable of interest, $x \sim X$ (that is, a value $x$ sampled from a probability distribution $X$), that has a direct effect on the welfare outcome of decisions made by agents in the economy. In our cases this was the aptitude of job applicants, consumers’ willingness to pay, and the utility of receiving a range of different expert recommendations, respectively.

In the extreme cases, the agent at the focus of the economic model could act with extreme ignorance of $x$ , or extreme knowledge of it. Generally, the agent’s situation improves the more knowledgeable they are about $x$ . The outcomes for the subjects of $X$ vary more widely.

We also considered the possibility that the agent has access to partial information about $X$ through the observation of a different variable $y \sim Y$. Upon observing $y$, they can base their judgments on an improved subjective expectation of the unknown variable, $P(x \vert y)$. We assumed that the agent was a Bayesian reasoner, capable of internalizing evidence according to Bayes’ rule, and hence able to compute:

$P(X \vert Y) \propto P(Y \vert X) P(X)$
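To make the update concrete, here is a minimal sketch for the job applicant case, assuming a discrete aptitude variable and a single observed test result. The labels (`low`, `high`, `fail`, `pass`), the function name `posterior`, and all of the probabilities are hypothetical and chosen only for illustration.

```python
def posterior(prior, likelihood, y):
    """Return P(X | Y = y) as a dict, given P(X) and P(Y | X)."""
    # Numerator of Bayes' rule: P(y | x) * P(x) for each value x.
    unnormalized = {x: likelihood[x][y] * p_x for x, p_x in prior.items()}
    # The normalizing constant is P(y), the sum of the numerators.
    evidence = sum(unnormalized.values())
    return {x: v / evidence for x, v in unnormalized.items()}

# Hypothetical prior over applicant aptitude, P(X).
prior = {"low": 0.7, "high": 0.3}

# Hypothetical likelihood P(Y | X): how test results depend on aptitude.
likelihood = {
    "low": {"fail": 0.8, "pass": 0.2},
    "high": {"fail": 0.3, "pass": 0.7},
}

post_pass = posterior(prior, likelihood, "pass")
print(post_pass)
```

With these made-up numbers, observing a pass moves the employer's belief that the applicant is high aptitude from 0.3 to roughly 0.6.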

However, this depends on two very important assumptions.

The first is that the agent knows the distribution $X$ . This is the prior in their subjective calculation of the Bayesian update. In our models, we have been perhaps sloppy in assuming that this prior probability corresponds to the true probability distribution from which the value $x$ is drawn. We are somewhat safe in this assumption because for the purposes of determining strategy, only subjective probabilities can be taken into account and we can relax the distribution to encode something close to zero knowledge of the outcome if necessary. In more complex models, the difference between agents with different knowledge of $X$ may be more strategically significant, but we aren’t there yet.

The second important assumption is that the agent knows the likelihood function $P(Y \vert X)$. This is quite a strong assumption, as it implies that the agent knows truly how $Y$ covaries with $X$, allowing them to “decode” the message $y$ into useful information about $x$.

It may be best to think of access to and usage of the likelihood function as a rare capability. Indeed, in our model of expertise, the assumption was that the service provider (think doctor) knew more about the relationship between $X$ (appropriate treatment) and $Y$ (observable symptoms) than the consumer (patient) did. In the case of companies that use data science, the idea is that some combination of data and science gives the company an edge over its competitors in knowing the true value of some uncertain property.

What we are discovering is that it’s not just the availability of $y$ that matters, but also the ability to interpret $y$ with respect to the probability of $x$ . Data does not speak for itself.
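A small sketch can illustrate the point. Assume three agents who all observe the same $y$ but hold different beliefs about the likelihood: one knows the true $P(Y \vert X)$, one cannot interpret $y$ at all, and one has the relationship backwards. The function names, labels, and numbers are hypothetical; the score is simply the probability that the agent's best guess of $x$ is right.

```python
def best_guess(prior, believed_likelihood, y):
    """Most probable x given y, under the agent's believed likelihood."""
    return max(prior, key=lambda x: believed_likelihood[x][y] * prior[x])

def accuracy(prior, true_likelihood, believed_likelihood):
    """Probability, under the true joint P(X, Y), that the guess equals x."""
    total = 0.0
    for x, p_x in prior.items():
        for y, p_y_given_x in true_likelihood[x].items():
            if best_guess(prior, believed_likelihood, y) == x:
                total += p_x * p_y_given_x
    return total

# Hypothetical prior and true likelihood (illustrative numbers only).
prior = {"low": 0.7, "high": 0.3}
true_lik = {"low": {"fail": 0.8, "pass": 0.2},
            "high": {"fail": 0.3, "pass": 0.7}}

# An agent who cannot interpret y treats it as unrelated to x.
flat_lik = {"low": {"fail": 0.5, "pass": 0.5},
            "high": {"fail": 0.5, "pass": 0.5}}

# An agent who has the relationship backwards.
wrong_lik = {"low": {"fail": 0.2, "pass": 0.8},
             "high": {"fail": 0.7, "pass": 0.3}}

acc_true = accuracy(prior, true_lik, true_lik)
acc_flat = accuracy(prior, true_lik, flat_lik)
acc_wrong = accuracy(prior, true_lik, wrong_lik)
print(acc_true, acc_flat, acc_wrong)
```

Under these assumed numbers the agent with the true likelihood guesses correctly about 77% of the time, the agent who cannot interpret $y$ does no better than the prior (about 70%), and the agent with the mistaken likelihood does far worse (about 23%), even though all three see exactly the same data.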

This incidentally ties in with a point which we have perhaps glossed over too quickly in the present discussion: what is information, really? This may seem like a distraction in a discussion about economics, but it is a question that’s come up in my own idiosyncratic “disciplinary” formation. One of the best intuitive definitions of information is provided by philosopher Fred Dretske (1981; 1983). I made a presentation on Fred Dretske’s view of information and its relationship to epistemological skepticism and Shannon information theory; you can find this presentation here. But for present purposes I want to call attention to his definition of what it means for a message to carry information, which is:

[A] message carries the information that X is a dingbat, say, if and only if one could learn (come to know) that X is a dingbat from the message.

When I say that one could learn that X was a dingbat from the message, I mean, simply, that the message has whatever reliable connection with dingbats is required to enable a suitably equipped, but otherwise ignorant receiver, to learn from it that X is a dingbat.

This formulation is worth mentioning because it supplies a kind of philosophical validation for our Bayesian formulation of information flow in the economy. We are modeling situations where $Y$ is a signal that is reliably connected with $X$, such that instantiations of $Y$ carry information about the value of $X$. We might express this in terms of conditional entropy:

$H(X|Y) < H(X)$

While this is sufficient for Y to carry information about X, it is not sufficient for any observer of Y to consequently know X. An important part of Dretske's definition is that the receiver must be suitably equipped to make the connection.
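The entropy condition can be checked numerically for a small discrete case. The sketch below assumes a hypothetical two-valued $X$ and two-valued $Y$ with made-up probabilities, and computes $H(X)$ and $H(X \vert Y) = \sum_y P(y) H(X \vert Y = y)$ in bits. The function names are illustrative, not part of any library.

```python
from math import log2

def entropy(dist):
    """Shannon entropy in bits of a list of probabilities."""
    return -sum(p * log2(p) for p in dist if p > 0)

def conditional_entropy(prior, likelihood):
    """H(X | Y) = sum over y of P(y) * H(X | Y = y)."""
    ys = list(next(iter(likelihood.values())).keys())
    total = 0.0
    for y in ys:
        # Joint probabilities P(x, y) for this y, then P(y) by summing.
        joint = [likelihood[x][y] * prior[x] for x in prior]
        p_y = sum(joint)
        if p_y > 0:
            total += p_y * entropy([j / p_y for j in joint])
    return total

# Hypothetical prior P(X) and likelihood P(Y | X), for illustration only.
prior = {"low": 0.7, "high": 0.3}
informative = {"low": {"fail": 0.8, "pass": 0.2},
               "high": {"fail": 0.3, "pass": 0.7}}
# A signal that is statistically independent of X carries no information.
uninformative = {"low": {"fail": 0.5, "pass": 0.5},
                 "high": {"fail": 0.5, "pass": 0.5}}

h_x = entropy(list(prior.values()))
h_x_given_y = conditional_entropy(prior, informative)
h_x_given_noise = conditional_entropy(prior, uninformative)
print(h_x, h_x_given_y, h_x_given_noise)
```

In this example the informative signal lowers the uncertainty about $X$, while the independent signal leaves it unchanged. Note that the calculation itself uses the true likelihood, so it describes what the signal carries, not what any particular receiver is equipped to extract from it.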

In our models, the “suitably equipped” condition is represented as the ability to compute the Bayesian update using a realistic likelihood function $P(Y \vert X)$ . This is a difficult demand. A lot of computational statistics has to do with the difficulty of tractably estimating the likelihood function, let alone computing it perfectly.

References

Dretske, F. I. (1983). The epistemology of belief. Synthese, 55(1), 3-19.

Dretske, F. (1981). Knowledge and the Flow of Information.

Digifesto