opacity | Digifesto

November 15, 2018

The Crevasse: a meditation on accountability of firms in the face of opacity as the complexity of scale

To recap:

(A1) Beneath corporate secrecy and user technical illiteracy, a fundamental source of opacity in “algorithms” and “machine learning” is the complexity of scale, especially scale of data inputs. (Burrell, 2016)

(A2) The opacity of the operation of companies using consumer data makes those consumers unable to engage with them as informed market actors. The consequence has been a “free fall” of market failure (Strandburg, 2013).

(A3) Ironically, this “free” fall has been “free” (zero price) for consumers; they appear to get something for nothing without knowing what has been given up or changed as a consequence (Hoofnagle and Whittington, 2013).

Comments:

(B1) The above line of argument conflates “algorithms”, “machine learning”, “data”, and “tech companies”, as is common in the broad discourse. That this conflation is possible speaks to the ignorance of the scholarly position on these topics, an ignorance that is implied by corporate secrecy, technical illiteracy, and complexity of scale simultaneously. We can, if we choose, distinguish between these factors analytically. But because, from the standpoint of the discourse, the internals are unknown, the general indication of a ‘black box’ organization is intuitively compelling.

(B1a) Giving in to the lazy conflation is an error because it prevents informed and effective praxis. If we do not distinguish between a corporate entity and its multiple internal human departments and technical subsystems, then we may confuse ourselves into thinking that a fair and interpretable algorithm can give us a fair and interpretable tech company. Nothing about the former guarantees the latter because tech companies operate in a larger operational field.

(B2) The opacity as the complexity of scale, a property of the functioning of machine learning algorithms, is also a property of the functioning of sociotechnical organizations more broadly. Universities, for example, are often opaque to themselves, because of their own internal complexity and scale. This is because the mathematics governing opacity as a function of complexity and scale are the same in both technical and sociotechnical systems (Benthall, 2016).

(B3) If we discuss the complexity of firms, as opposed to the complexity of algorithms, we should conclude that firms that are complex due to scale of operations and data inputs (including number of customers) will be opaque and therefore have strategic advantage in the market against less complex market actors (consumers) with stiffer bounds on rationality.

(B4) In other words, big, complex, data rich firms will be smarter than individual consumers and outmaneuver them in the market. That’s not just “tech companies”. It’s part of the MO of every firm to do this. Corporate entities are “artificial general intelligences” and they compete in a complex ecosystem in which consumers are a small and vulnerable part.

Twist:

(C1) Another source of opacity in data is that the meaning of data comes from the causal context that generates it. (Benthall, 2018)

(C2) Learning causal structure from observational data is hard, both in terms of being data-intensive and being computationally complex (NP-hard in general). (cf. Friedman et al., 1998)

(C3) Internal complexity, for a firm, is not sufficient to be “all-knowing” about the data that is coming into it; the firm has epistemic challenges of secrecy, illiteracy, and scale with respect to external complexity.

(C4) This is why many applications of machine learning are overrated and so many “AI” products kind of suck.

(C5) There is, in fact, an epistemic crevasse between all autonomous entities, each containing its own complexity and constituting a larger ecological field that is the external/being/environment for any other autonomy.
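A rough way to get a feel for the computational side of (C2): before any data is even considered, the number of candidate causal structures over n variables, taken here to be labeled directed acyclic graphs, grows super-exponentially in n. The sketch below counts them with Robinson's recurrence. It is my own illustration and not taken from Friedman et al.; the function name count_dags is made up for this example, and it only shows the size of the search space, not the hardness result itself.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled directed acyclic graphs on n nodes (Robinson's recurrence).

    Illustrative helper, not part of any cited work.
    """
    if n == 0:
        return 1  # the empty graph
    # Inclusion-exclusion over k, the number of nodes with no incoming edges.
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # For n = 1..5 this prints 1, 3, 25, 543, 29281.
    for n in range(1, 6):
        print(n, count_dags(n))
```

Even at a handful of variables, exhaustive comparison of structures is already out of reach for anything but toy problems, which is one way to read the claim that a firm's internal complexity does not make it “all-knowing” about its data.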

To do:

The most promising direction based on this analysis is a deeper read into transaction cost economics as a ‘theory of the firm’. This is where the formalization of the idea that what the Internet changed most are search costs (a kind of transaction cost) should be.

It would be nice if those insights could be expressed in the mathematics of “AI”.

There’s still a deep idea in here that I haven’t yet found the articulation for, something to do with autopoiesis.

References

Benthall, Sebastian. (2016) The Human is the Data Science. Workshop on Developing a Research Agenda for Human-Centered Data Science. Computer Supported Cooperative Work 2016.

Benthall, Sebastian. (2018) Context, Causality, and Information Flow: Implications for Privacy Engineering, Security, and Data Economics. Ph.D. dissertation. Advisors: John Chuang and Deirdre Mulligan. University of California, Berkeley.

Burrell, Jenna. “How the machine ‘thinks’: Understanding opacity in machine learning algorithms.” Big Data & Society 3.1 (2016): 2053951715622512.

Friedman, Nir, Kevin Murphy, and Stuart Russell. “Learning the structure of dynamic probabilistic networks.” Proceedings of the Fourteenth conference on Uncertainty in artificial intelligence. Morgan Kaufmann Publishers Inc., 1998.

Hoofnagle, Chris Jay, and Jan Whittington. “Free: accounting for the costs of the internet’s most popular price.” UCLA L. Rev. 61 (2013): 606.

Strandburg, Katherine J. “Free fall: The online market’s consumer preference disconnect.” U. Chi. Legal F. (2013): 95.

November 17, 2015

“Transactions that are too complex…to be allowed to exist.” cf @FrankPasquale

I stand corrected; my interpretation of Pasquale in my last post was too narrow. Having completed Chapter One of The Black Box Society (TBBS), I see that Pasquale does not take the naive view that all organizational secrecy should be abolished, as I might have once. Rather, his is a more nuanced perspective.

First, Pasquale distinguishes between three “critical strategies for keeping black boxes closed”, or opacity, “[Pasquale’s] blanket term for remediable incomprehensibility”:

“Real secrecy establishes a barrier between hidden content and unauthorized access to it.”
“Legal secrecy obliges those privy to certain information to keep it secret”
“Obfuscation involves deliberate attempts at concealment when secrecy has been compromised.”

Cutting to the chase by looking at the Pasquale and Bracha “Federal Search Commission” (2008) paper that a number of people have recommended to me, it appears (in my limited reading so far) that Pasquale’s position is not that opacity in general is a problem (because there are of course important uses of opacity that serve the public interest, such as confidentiality). Rather, despite these legitimate uses of opacity there is also the need for public oversight, perhaps through federal regulation. The Federal Government serves the public interest better than the imperfect market for search can provide on its own.

There is perhaps a tension between this 2008 position and what is expressed in Chapter 1 of TBBS in the section “The One-Way Mirror,” which gets, I dare say, a little conspiratorial about The Powers That Be. “We are increasingly ruled by what former political insider Jeff Connaughton called ‘The Blob,’ a shadowy network of actors who mobilize money and media for private gain, whether acting officially on behalf of business or of government.” Here, Pasquale appears to espouse a strong theory of regulatory capture from which, were we to insist on consistency, a Federal Search Commission would presumably not be exempt. Hence perhaps the role of TBBS in stirring popular sentiment to put political pressure on the elites of The Blob.

Though it is a digression I will note, since it is a pet peeve of mine, Pasquale’s objection to mathematized governance:

“Technocrats and managers cloak contestable value judgments in the garb of ‘science’: thus the insatiable demand for mathematical models that reframe the subtle and subjective conclusions (such as the worth of a worker, service, article, or product) as the inevitable dictate of salient, measurable data. Big data driven decisions may lead to unprecedented profits. But once we use computation not merely to exercise power over things, but also over people, we need to develop a much more robust ethical framework than ‘the Blob’ is now willing to entertain.”

The sentiment that scientists should not be making political decisions has been articulated at least since Hannah Arendt’s 1958 The Human Condition, which indicates that there is nothing particular to Big Data about this anxiety. And indeed, if we think about ‘computation’ as broadly as mathematized, algorithmic thought, then its use for control over people-not-just-things has an even longer history. Lukacs’ 1923 “Reification and the Consciousness of the Proletariat” is a profound critique of Tayloristic scientific factory management that is getting close to being a hundred years old.

Perhaps a robust ethics of quantification has been in the works for some time as well.

Moving past this, by the end of Chapter 1 of TBBS Pasquale gives us the outline of the book and the true crux of his critique, which is the problem of complexity. Whether or not regulators are successful in opening the black boxes of Silicon Valley or Wall Street (or the branches of government that are complicit with Silicon Valley and Wall Street), their efforts will be in vain if what they get back from the organizations they are trying to regulate is too complex for them to understand.

Following the thrust of Pasquale’s argument, we can see that for him, complexity is the result of obfuscation. It is therefore a source of opacity, which as we have noted he has defined as “remediable incomprehensibility”. Pasquale promises to give us, by the end of the book, a game plan for creating, legally, the Intelligible Society. “Transactions that are too complex to explain to outsiders may well be too complex to be allowed to exist.”

This gets us back to the question we started with, which is whether this complexity and incomprehensibility is avoidable. Suppose we were to legislate against institutional complexity: what would that cost us?

Mathematical modeling gives us the tools we need to analyze these kinds of questions. Information theory, the theory of computation, and complexity theory are all foundational to the technology of telecommunications and data science. People with expertise in understanding complexity and the limits on our ability to control it are precisely the people who make the ubiquitous algorithms which society depends on today. But this kind of theory rarely makes it into “critical” literature such as TBBS.

I’m drawn to the example of The Social Media Collective’s Critical Algorithm Studies Reading List, which lists Pasquale’s TBBS among many other works, because it opens with precisely the disciplinary gatekeeping that creates what I fear is the blind spot I’m pointing to:

This list is an attempt to collect and categorize a growing critical literature on algorithms as social concerns. The work included spans sociology, anthropology, science and technology studies, geography, communication, media studies, and legal studies, among others. Our interest in assembling this list was to catalog the emergence of “algorithms” as objects of interest for disciplines beyond mathematics, computer science, and software engineering.

As a result, our list does not contain much writing by computer scientists, nor does it cover potentially relevant work on topics such as quantification, rationalization, automation, software more generally, or big data, although these interests are well-represented in these works’ reference sections of the essays themselves.

This area is growing in size and popularity so quickly that many contributions are popping up without reference to work from disciplinary neighbors. One goal for this list is to help nascent scholars of algorithms to identify broader conversations across disciplines and to avoid reinventing the wheel or falling into analytic traps that other scholars have already identified.

This reading list is framed as a tool for scholars, which it no doubt is. But if contributors to this field of scholarship aspire, as Pasquale does, for “critical algorithms studies” to have real policy ramifications, then this disciplinary wall must fall (as I’ve argued elsewhere).

November 14, 2015

Is the opacity of governance natural? cf @FrankPasquale

I’ve begun reading Frank Pasquale’s The Black Box Society on the recommendation that it’s a good place to start if I’m looking to focus a defense of the role of algorithms in governance.

I’ve barely started and already found lots of juicy material. For example:

Gaps in knowledge, putative and real, have powerful implications, as do the uses that are made of them. Alan Greenspan, once the most powerful central banker in the world, claimed that today’s markets are driven by an “unredeemably opaque” version of Adam Smith’s “invisible hand,” and that no one (including regulators) can ever get “more than a glimpse at the internal workings of the simplest of modern financial systems.” If this is true, libertarian policy would seem to be the only reasonable response. Friedrich von Hayek, a preeminent theorist of laissez-faire, called the “knowledge problem” an insuperable barrier to benevolent government intervention in the economy.

But what if the “knowledge problem” is not an intrinsic aspect of the market, but rather is deliberately encouraged by certain businesses? What if financiers keep their doings opaque on purpose, precisely to avoid and confound regulation? That would imply something very different about the merits of deregulation.

The challenge of the “knowledge problem” is just one example of a general truth: What we do and don’t know about the social (as opposed to the natural) world is not inherent in its nature, but is itself a function of social constructs. Much of what we can find out about companies, governments, or even one another, is governed by law. Laws of privacy, trade secrecy, the so-called Freedom of Information Act–all set limits to inquiry. They rule certain investigations out of the question before they can even begin. We need to ask: To whose benefit?

There are a lot of ideas here. Trying to break them down:

Markets are opaque.
If markets are naturally opaque, that is a reason for libertarian policy.
If markets are not naturally opaque, then they are opaque on purpose, and that’s a reason to regulate in favor of transparency.
As a general social truth, the social world is not naturally opaque but rather opaque or transparent because of social constructs such as law.

We are meant to conclude that markets should be regulated for transparency.

The most interesting claim to me is what I’ve listed as the fourth one, as it conveys a worldview that is both disputable and which carries with it the professional biases we would expect of the author, a Professor of Law. While there are certainly many respects in which this claim is true, I don’t yet believe it has the force necessary to carry the whole logic of this argument. I will be particularly attentive to this point as I read on.

The danger I’m on the lookout for is one where the complexity of the integration of society, which following Beniger I believe to be a natural phenomenon, is treated as a politically motivated social construct and therefore something that should be changed. It is really only the part after the “and therefore” which I’m contesting. It is possible for politically motivated social constructs to be natural phenomena. All institutions have winners and losers relative to their power. Who would a change in policy towards transparency in the market benefit? If opacity is natural, it would shift the opacity to some other part of society, empowering a different group of people. (Possibly lawyers.)

If opacity is necessary, then perhaps we could read The Black Box Society as an expression of the general problem of alienation. It is way premature for me to attribute this motivation to Pasquale, but it is a guiding hypothesis that I will bring with me as I read the book.
