Classifying Vignettes, Modeling Hybridity

Probabilistic Interfaces

One truism about the digital is that it is “binary”—that its data is all, at some level, 1s and 0s, or that it understands categories as either true or false. At some level this is true. For any given operation, each bit involved must be flipped or not. As a truism intended to convey a deep insight into computation or data, however, the use of “binary” in these formulations seems to us inadequate for describing the ways humans interact with computers—akin to claiming speech is binary because one either makes a sound or doesn’t.1 We do not intellectually engage with speech at the level of individual sound waves, but instead through symbolic systems of language that organize those sound waves into complex meaning. Likewise, we typically interact with the computer through higher-level programming languages that mediate between the computer and its human programmers. Any digital object comprises thousands, millions, or billions of bits, and in their combinations those bits create rich objects that belie a “binary” designation.

Typically those who invoke the binary paradigm are not discussing binary code at all, but instead a higher-level programming language (or even an application) implemented in ways that permit only either/or choices, such as a classification algorithm that marks texts as either poetry or not. This is a consequence of a particular programming decision, however, not an inevitable outcome of the computer as medium. The experiments we outline in this chapter demonstrate the overlapping, probabilistic outcomes that often underlie classification algorithms, and which could well shape the data they output. Here a given text is 83% likely to be poetry, 81% likely to be literary, and 52% likely to be an advertisement. Even the most generous humanists might look at such a result and believe the classifier has failed, but we argue this is a deeply limiting response.
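By way of illustration only, the sketch below shows how a classifier might report a probability for every genre at once rather than a single either/or label. It is not the pipeline used for the experiments in this chapter: it assumes scikit-learn, a handful of invented snippets, and invented genre labels, with a one-vs-rest logistic regression over tf-idf features standing in for whatever model one might actually train.

    # A minimal sketch: one-vs-rest classification over tf-idf features,
    # reporting a probability per genre instead of a single label.
    # The texts, labels, and query below are invented for illustration.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MultiLabelBinarizer

    texts = [
        "O gentle spring, thy blossoms fall like rain",
        "For sale: a fine assortment of spring bonnets, prices low",
        "The legislature adjourned yesterday without passing the bill",
        "She turned from the window, and the letter slipped from her hand",
    ]
    labels = [["poetry"], ["advertisement"], ["news"], ["fiction"]]

    binarizer = MultiLabelBinarizer()
    y = binarizer.fit_transform(labels)  # one indicator column per genre

    model = make_pipeline(
        TfidfVectorizer(),
        OneVsRestClassifier(LogisticRegression(max_iter=1000)),
    )
    model.fit(texts, y)

    # Each genre gets its own score for each text, so a single text can
    # score high for several genres at once.
    probs = model.predict_proba(
        ["Buy the sweetest verses ever sold, three cents a line"]
    )
    for genre, score in zip(binarizer.classes_, probs[0]):
        print(f"{genre}: {score:.2f}")

Because each genre receives its own score, the probabilities need not sum to one, which is what allows a single text to be, say, 83% likely to be poetry and 52% likely to be an advertisement.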

With writers such as Hoyt Long and Richard Jean So (2016), we embrace “a method of reading that oscillates or pivots between human and machine interpretation, each providing feedback to the other” (267). To illustrate this, they offer a hypothetical:

If a machine learning algorithm consistently misclassifies messages from a friend as spam, then the data scientist will want to treat this as an error and find a way to refine his or her model to improve the accuracy of the algorithm. For us, however, the error raises an interpretative question: what made the friend’s message so spam-like? (261)

One answer might be that the spam filter is broken, but another might be that this friend should examine her prose style. In an earlier example in this chapter, we discussed a nineteenth-century newspaper text classified as 83% likely to be poetry and 52% likely to be an advertisement. Such a result raises a number of fascinating questions about the similarities between the structure of poetry and advertisements on the newspaper page (the two have the most white space of any newspaper genres), the vocabulary of both genres, and possibly even topic overlap between verse and commerce. This particular text may be an outlier—a particularly jingling poem or a strangely lyrical ad—or, if there are other similarly classified texts, it may speak to some shared structure between newspaper poems and advertisements that would merit further investigation. These moments where the probabilities seem to point to something improbable are, we argue, precisely the fissures of potential humanistic interest.

Scholars reading about the genre classification experiments described in this chapter and our theorization of the vignette might take issue with our methods or the conclusions we drew from them. Perhaps, because we only classified vignettes along a continuum between news and fiction, we have, through our own prior hypotheses about this genre, failed to account for another genre to which vignettes are actually much more closely related. Certainly these modest experiments leave significant room for new research. In order to pursue that research, however, scholars would need access to our models and our training data: they would need to understand, in short, what “news,” “fiction,” or even “vignette” mean, both to us and to the classifiers we trained.

We want to suggest that for methods such as classification to influence broader humanistic research, we need to develop probabilistic interfaces into our data, along with the critical awareness needed to understand and use such interfaces. In an abundant research environment, we cannot expect digitized books, newspapers, magazines, and other media to be hand-annotated to the level we might wish, were money and time no object. At the same time, we require paths into mass data beyond keyword search: paths that might be based on complex text mining and analysis, but which do not require every humanist to implement those methods. Methods such as classification might offer renewed paths for browsing and reading within mass archives, for identifying subsets of potentially related materials grouped more capaciously than search allows. But this will require a degree of comfort with uncertainty: show me those texts highly likely to be lists, or those likely to be both news AND fiction.
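What might such a probabilistic interface look like in practice? The sketch below is purely illustrative, assuming pandas and an invented table of per-text genre probabilities (the text identifiers and thresholds are made up); the point is only that queries can combine and threshold probabilities rather than demand a single label.

    # A sketch of a "probabilistic interface" query over classifier output.
    # The scores table is invented; in practice it would hold one row per
    # digitized text and one probability column per genre.
    import pandas as pd

    scores = pd.DataFrame(
        {
            "text_id": ["1849-03-12-p2-c3", "1851-07-01-p1-c5", "1856-11-20-p4-c1"],
            "news": [0.91, 0.78, 0.12],
            "fiction": [0.08, 0.74, 0.88],
            "poetry": [0.03, 0.15, 0.83],
            "advertisement": [0.05, 0.10, 0.52],
            "list": [0.02, 0.04, 0.01],
        }
    ).set_index("text_id")

    # "Show me those texts highly likely to be lists ..."
    likely_lists = scores[scores["list"] > 0.8]

    # "... or those likely to be both news AND fiction."
    news_and_fiction = scores[(scores["news"] > 0.7) & (scores["fiction"] > 0.7)]

    print(likely_lists)
    print(news_and_fiction)

A query like the second one surfaces exactly the hybrid texts, those plausibly both news and fiction, that a single-label interface would hide.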

To the humanist, this should feel, ironically, like more solid ground. That is, by rejecting the illusion of certainty that is often and unfairly associated with computational text analysis, we find ourselves returned to the realm of reading closely. The classifier can direct our eyes to a particular corner of a corpus, but from there we have to dig in, begin to read, and consider whether the computational model confirms our sense of literary genre or, as is often our hope, challenges it and pushes us in new and interesting directions. Ultimately, we are not looking for a machine that thinks for us, or even makes it easier to think. Rather, computational classification efforts like those described here are meant to help us decide where to start reading, or where to focus our energies.

Most importantly, computational models help us better understand the way humans make decisions about genre. Human readers make those decisions based on both explicit and implicit ideas about what a given genre means. When a computational model is trained on examples provided by literary historians, it will identify texts that resemble those examples in ways both apparent (because they are similar in the explicit features we associate with the genre) and less apparent (because they are similar in the implicit features we do not immediately associate with the genre). The results of the model are neither wrong nor right, but are instead prompts toward reflection. We can better understand the way we make decisions about genre by seeing those decisions modeled and reflected back to us, allowing us to see more clearly both the whole and its constituent parts.


  1. Braille writing is quite literally a binary system. Dots are either raised on the page or they are not, “flipped” or “not flipped.” However, as with our metaphor drawn from speech, Braille is not ineluctably reductive because of this. The meaning of Braille is not in its individual dots, but in their combinations to form characters, words, and the other building blocks of written language. Like any symbolic system, Braille cannot convey everything (see also concerns about conveying tone in alphabetic writing), but it conveys rich concepts and ideas and would not, we suspect, be marked as reductively “binary” in the ways computation often is.
