fred dretske | Digifesto

Information flow in economics

We have formalized three different cases of information economics:

- Information about supply quality. For example, Posner’s case of an employer reviewing job applicants.
- Information about consumer willingness-to-pay. As in the case of a firm engaged in personalized price discrimination.
- Expertise. Where the service being sold involves special insight that the consumer does not have.

What we discovered is that each of these cases has, to some extent, a common form. That form is this:

There is a random variable of interest, $x \sim X$ (that is, a value $x$ sampled from a probability distribution $X$ ), that has a direct effect on the welfare outcome of decisions made by agents in the economy. In our cases this was the aptitude of job applicants, consumers’ willingness to pay, and the utility of receiving a range of different expert recommendations, respectively.

In the extreme cases, the agent at the focus of the economic model could act with extreme ignorance of $x$ , or extreme knowledge of it. Generally, the agent’s situation improves the more knowledgeable they are about $x$ . The outcomes for the subjects of $X$ vary more widely.
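The two extremes can be illustrated with a toy version of the hiring case. This is only a sketch under assumed numbers: the prior, the payoffs, and the function names `value_ignorant` and `value_informed` are all hypothetical and do not come from the models discussed here.

```python
# Hypothetical payoffs to the employer: hiring a high-aptitude applicant
# pays 10, hiring a low-aptitude applicant costs 5, not hiring pays 0.
prior = {"high": 0.3, "low": 0.7}
payoff_if_hired = {"high": 10.0, "low": -5.0}

def value_ignorant(prior, payoff):
    """One decision for everyone, based only on the prior expectation."""
    expected = sum(prior[x] * payoff[x] for x in prior)
    return max(0.0, expected)  # hire all applicants or none

def value_informed(prior, payoff):
    """A separate decision for each applicant, made knowing x exactly."""
    return sum(prior[x] * max(0.0, payoff[x]) for x in prior)

ignorant = value_ignorant(prior, payoff_if_hired)
informed = value_informed(prior, payoff_if_hired)
print(ignorant, informed)
```

With these assumed numbers the ignorant employer hires nobody, while the informed employer hires only the high-aptitude applicants. In this setup the informed value can never be lower than the ignorant one, since an informed agent could always ignore what it knows.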

We also considered the possibility that the agent has access to partial information about $x$ through the observation of a different variable $y \sim Y$ . Upon observation of $y$ , they can make their judgments based on an improved subjective expectation of the unknown variable, $P(x \vert y)$ . We assumed that the agent was a Bayesian reasoner and so capable of internalizing evidence according to Bayes’ rule, hence they are able to compute:

$P(X \vert Y) \propto P(Y \vert X) P(X)$
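For a discrete variable, the update above amounts to a multiplication followed by a normalization. The following is a minimal sketch, assuming a two-valued aptitude $x$ and a two-valued signal $y$ ; the function name `bayes_update` and all of the numbers are hypothetical.

```python
def bayes_update(prior, likelihood, y):
    """Return P(x | y) for each x, given P(x) and P(y | x)."""
    # Unnormalized posterior: P(y | x) * P(x)
    unnorm = {x: likelihood[x][y] * p for x, p in prior.items()}
    evidence = sum(unnorm.values())  # P(y), the normalizing constant
    if evidence == 0:
        raise ValueError("observation has zero probability under the model")
    return {x: v / evidence for x, v in unnorm.items()}

# Hypothetical numbers: applicant aptitude x, interview impression y.
prior = {"high": 0.3, "low": 0.7}
likelihood = {
    "high": {"good": 0.8, "bad": 0.2},
    "low": {"good": 0.4, "bad": 0.6},
}
posterior = bayes_update(prior, likelihood, "good")
print(posterior)
```

Here a good impression raises the probability of high aptitude from 0.3 to 6/13, a little under one half. The proportionality sign in the formula hides the division by $P(y)$ , which the code makes explicit as `evidence`.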

However, this depends on two very important assumptions.

The first is that the agent knows the distribution $X$ . This is the prior in their subjective calculation of the Bayesian update. In our models, we have been perhaps sloppy in assuming that this prior probability corresponds to the true probability distribution from which the value $x$ is drawn. We are somewhat safe in this assumption because for the purposes of determining strategy, only subjective probabilities can be taken into account and we can relax the distribution to encode something close to zero knowledge of the outcome if necessary. In more complex models, the difference between agents with different knowledge of $X$ may be more strategically significant, but we aren’t there yet.

The second important assumption is that the agent knows the likelihood function $P(Y \vert X)$ . This is quite a strong assumption, as it implies that the agent truly knows how $Y$ covaries with $X$ , allowing them to “decode” the message $y$ into useful information about $x$ .

It may be best to think of access to, and use of, the likelihood function as a rare capability. Indeed, in our model of expertise, the assumption was that the service provider (think doctor) knew more about the relationship between $X$ (appropriate treatment) and $Y$ (observable symptoms) than the consumer (patient) did. In the case of companies that use data science, the idea is that some combination of data and science gives the company a better estimate of the true value of some uncertain property than its competitors have.

What we are discovering is that it’s not just the availability of $y$ that matters, but also the ability to interpret $y$ with respect to the probability of $x$ . Data does not speak for itself.
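One way to see this is to give two agents the same observation $y$ but different likelihood functions. This is a sketch with made-up numbers: the “expert” is assumed to hold the true likelihood, and the “novice” is assumed to hold a flat likelihood that treats every signal as equally probable under every value of $x$ .

```python
def posterior(prior, likelihood, y):
    """Bayesian update P(x | y) proportional to P(y | x) P(x)."""
    unnorm = {x: likelihood[x][y] * p for x, p in prior.items()}
    z = sum(unnorm.values())
    return {x: v / z for x, v in unnorm.items()}

prior = {"high": 0.5, "low": 0.5}

# Assumed true relationship between x and y, known only to the expert.
true_lik = {
    "high": {"good": 0.9, "bad": 0.1},
    "low": {"good": 0.2, "bad": 0.8},
}
# The novice cannot interpret y: every signal looks equally likely.
flat_lik = {
    "high": {"good": 0.5, "bad": 0.5},
    "low": {"good": 0.5, "bad": 0.5},
}

expert = posterior(prior, true_lik, "good")
novice = posterior(prior, flat_lik, "good")
print(expert["high"], novice["high"])
```

Both agents see the same $y$ , but only the expert’s belief moves. The novice’s posterior is identical to the prior, so the observation is worthless to them.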

This incidentally ties in with a point that we have perhaps glossed over too quickly in the present discussion: what is information, really? This may seem like a distraction in a discussion about economics, but it is a question that has come up in my own idiosyncratic “disciplinary” formation. One of the best intuitive definitions of information is provided by the philosopher Fred Dretske (1981; 1983). I made a presentation on Dretske’s view of information and its relationship to epistemological skepticism and Shannon information theory; you can find this presentation here. But for present purposes I want to call attention to his definition of what it means for a message to carry information, which is:

[A] message carries the information that X is a dingbat, say, if and only if one could learn (come to know) that X is a dingbat from the message.

When I say that one could learn that X was a dingbat from the message, I mean, simply, that the message has whatever reliable connection with dingbats is required to enable a suitably equipped, but otherwise ignorant receiver, to learn from it that X is a dingbat.

This formulation is worth mentioning because it supplies a kind of philosophical validation for our Bayesian formulation of information flow in the economy. We are modeling situations where $Y$ is a signal that is reliably connected with $X$ , such that instantiations of $Y$ carry information about the value of $X$ . We might express this in terms of conditional entropy:

$H(X \vert Y) < H(X)$

While this is sufficient for Y to carry information about X, it is not sufficient for any observer of Y to consequently know X. An important part of Dretske's definition is that the receiver must be suitably equipped to make the connection.
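The entropy condition can be checked numerically for any discrete joint distribution. The following sketch uses a hypothetical joint table over aptitude and interview impression; the helper names `entropy` and `conditional_entropy` are illustrative only.

```python
from math import log2

def entropy(dist):
    """Shannon entropy H(X) in bits for a dict of probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def conditional_entropy(joint):
    """H(X | Y) in bits from a joint table joint[(x, y)] = P(x, y)."""
    p_y = {}
    for (x, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    # H(X | Y) = -sum over (x, y) of P(x, y) * log2 P(x | y)
    return -sum(p * log2(p / p_y[y]) for (x, y), p in joint.items() if p > 0)

def marginal_x(joint):
    """Marginal distribution P(x) from the joint table."""
    p_x = {}
    for (x, y), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p
    return p_x

# Hypothetical joint distribution in which y is reliably connected with x.
joint = {
    ("high", "good"): 0.24, ("high", "bad"): 0.06,
    ("low", "good"): 0.28, ("low", "bad"): 0.42,
}
# A joint distribution with the same marginals but y independent of x.
independent = {
    ("high", "good"): 0.3 * 0.52, ("high", "bad"): 0.3 * 0.48,
    ("low", "good"): 0.7 * 0.52, ("low", "bad"): 0.7 * 0.48,
}

h_x = entropy(marginal_x(joint))
h_x_given_y = conditional_entropy(joint)
h_x_given_y_indep = conditional_entropy(independent)
print(h_x, h_x_given_y, h_x_given_y_indep)
```

In the first table the conditional entropy falls below $H(X)$ , so the signal carries information. In the independent table the two quantities coincide. Note that these numbers describe the signal itself; they say nothing about whether a particular observer knows the joint distribution well enough to exploit it.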

In our models, the “suitably equipped” condition is represented as the ability to compute the Bayesian update using a realistic likelihood function $P(Y \vert X)$ . This is a difficult demand. A lot of computational statistics has to do with the difficulty of tractably estimating the likelihood function, let alone computing it perfectly.
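In the simplest discrete case, a likelihood function can be estimated by counting. This is a rough sketch that assumes the agent has access to labeled pairs $(x, y)$ in which the normally hidden value $x$ was eventually revealed; the sample data and the function name `estimate_likelihood` are hypothetical.

```python
from collections import Counter, defaultdict

def estimate_likelihood(samples):
    """Estimate P(y | x) by relative frequency from (x, y) pairs."""
    counts = defaultdict(Counter)
    for x, y in samples:
        counts[x][y] += 1
    return {
        x: {y: n / sum(c.values()) for y, n in c.items()}
        for x, c in counts.items()
    }

# Hypothetical historical records of (aptitude, interview impression).
samples = (
    [("high", "good")] * 8 + [("high", "bad")] * 2
    + [("low", "good")] * 4 + [("low", "bad")] * 6
)
lik_hat = estimate_likelihood(samples)
print(lik_hat)
```

Even this toy estimator needs a supply of cases where $x$ became known, which is itself a kind of privileged access. With continuous or high-dimensional variables, counting no longer works and the estimation problem becomes much harder.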

References

Dretske, F. I. (1981). Knowledge and the Flow of Information. MIT Press.

Dretske, F. I. (1983). The epistemology of belief. Synthese, 55(1), 3-19.

scientific contexts

Recall:

For Helen Nissenbaum (contextual integrity theory):
- a context is a social domain that is best characterized by its purpose. For example, a hospital’s purpose is to cure the sick and wounded.
- a context also has certain historically given norms of information flow.
- a violation of a norm of information flow in a given context is a potentially unethical privacy violation. This is an essentially conservative notion of privacy, which is balanced by the following consideration…
- Whether or not a norm of information flow should change (given, say, a new technological affordance to do things in a very different way) can be evaluated by how well it serves the context’s purpose.
For Fred Dretske (Knowledge and the Flow of Information, 1981):
- The appropriate definition of information is (roughly) just what it takes to know something. (More specifically: M carries information about X if it reliably transmits what it takes for a suitably equipped but otherwise ignorant observer to learn about X.)
Combining Nissenbaum and Dretske, we see that with an epistemic and naturalized understanding of information, contextual norms of information flow are inclusive of epistemic norms.
Consider scientific contexts. I want to use ‘science’ in the broadest possible (though archaic) sense of the intellectual and practical activity of study or coming to knowledge of any kind. “Science” from the Latin “scire”–to know. Or “Science” (capitalized) as the translated 19th Century German Wissenschaft.
- A scientific context is one whose purpose is knowledge.
- Specific issues of whose knowledge, knowledge about what, and to what end the knowledge is used will vary depending on the context.
- As information flow is necessary for knowledge, which is the purpose of science, the integrity of a scientific context will be especially sensitive to the norms of information flow within (and without) it.
An insight I owe to my colleague Michael Tschantz, in conversation, is that there are several open problems within contextual integrity theory:
- How does one know what context one is in? Who decides that?
- What happens at the boundary between contexts, for example when one context is embedded in another?
- Are there ways for the purpose of a context to change (not just the norms within it)?
Proposal: One way of discovering what a science is is to trace its norms of information flow and to identify its purpose. A contrast between the norms and purpose of, for example, data science and ethnography would be illustrative of both. One approach to this problem could be the kind of qualitative research done by Edwin Hutchins on distributed cognition, which accepts a naturalized view of information (necessary for this framing) and then discovers information flows in a context through qualitative observation.

Digifesto

Tag: fred dretske

September 11, 2017

Information flow in economics

September 9, 2015

scientific contexts