Our discussion in the last class meeting was terrific! I have no other words for it: it was terrific. I think this is because the readings for this week were a bit more in our comfort zone as English majors and a bit less involved with the digital world in general.
Matthew Wilkens’ article, “Canons, Close Reading, and the Evolution of Method,” dealt with our old notion of canon(s) and canon construction, and complicated the matter by noting just how many new books are added to the backlog of texts each year. This is what he describes as a “problem of abundance” (250): with ever more novels being published each year, our old way of reading, close reading, needs to evolve to keep up with the sheer volume of books yet to be encountered. The books are able, he posits, to tell us something about the culture that produced them, and since we cannot read them all in order to extract this information, we should be devising modern, technologically assisted means of digesting these incalculable pages. His “anything and everything else” approach to “data mining” these texts takes the form, as he shows the reader through an extended example and figures in the text, of distant reading (for a good explanation click here).
I really do like the idea of distant reading, though as a supplement
to close reading, not as an alternative.
The example that Wilkens provides takes all of the books, popular and obscure, published in a 25-year window starting in 1851, data mines them to find place names, foreign and domestic, and plots these mentions (or multiple mentions) on a map. The purpose of this exercise is to note how often places are named or written about in an era thought by scholars to be dominated by the Northeast. What this example shows is that the culture that produced all of these texts had in mind places overseas, in Europe and Asia, with several mentions of South America and points plotted in Australia. In other words, texts in this period mention these places, shifting the primary focus from New England to the rest of the world—or so it would seem.
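To make concrete what this kind of place-name mining involves, here is a minimal sketch of the idea. Everything in it is my own toy illustration — the gazetteer, the sample texts, and the crude tokenizing are hypothetical stand-ins, not Wilkens’ actual method or data:

```python
from collections import Counter

# A toy gazetteer of place names (hypothetical; a real one would be
# far larger and, as Wilkens notes, would need manual correction).
GAZETTEER = {"Australia", "London", "Boston", "Paris", "Peru"}

def count_place_mentions(text: str) -> Counter:
    """Tally how often each gazetteer place name appears in a text."""
    counts = Counter()
    # Crude tokenizing: strip commas and periods, split on whitespace.
    for word in text.replace(",", " ").replace(".", " ").split():
        if word in GAZETTEER:
            counts[word] += 1
    return counts

# Invented sample "corpus" standing in for thousands of novels.
corpus = [
    "The ship left Boston for London.",
    "From Paris she wrote of Australia and of Peru.",
    "Boston in winter was quiet.",
]

totals = Counter()
for doc in corpus:
    totals.update(count_place_mentions(doc))
# totals now holds corpus-wide mention counts, e.g. Boston appears twice.
```

The counts are what get plotted on the map; note that nothing in this process preserves what the surrounding sentences actually say about each place.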
In our discussion, I had some questions about this practice. The ability of technology to extract these very distinct instances and utterances (is utterance the right word for a text?) shows us, the scholarly reader, what, exactly? That other places are mentioned? What of the context of these mentions? Were they casual in nature? Hostile? Romanticized? What were these utterances, and how do they illustrate the cultural attitudes of the time? The concept seems a bit removed from what reading itself is supposed to be: the encountering of a text. Distant reading, in this sense, seems like a brief flyover. One of my colleagues noted that the way the programs work in this situation is very much like how Google Books works: they find these words and send back results in the context of how they were found, so “Australia” would be found in the context of “Australia is an island peopled entirely by criminals,” as opposed to simply “Australia.”
This example is glib, I know, but the way that I read Wilkens’ article indicates that it is ostensibly correct. Wilkens never once mentioned text in context, only that the methodology for finding place names needed to be tweaked, since several place names coincide with personal proper names and this had to be accounted for. The reader sees three maps showing the locations, but not what these locations mean.
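What my colleague described is essentially the classic keyword-in-context (KWIC) concordance view. A minimal sketch of how such context extraction might work — my own illustration again, not anything from the published example:

```python
def kwic(text: str, keyword: str, window: int = 3) -> list[str]:
    """Return each occurrence of `keyword` with `window` words of
    context on either side, as concordance-style lines."""
    words = text.split()
    lines = []
    for i, w in enumerate(words):
        # Strip trailing punctuation so "Australia," still matches.
        if w.strip(".,;!?\"'") == keyword:
            start = max(0, i - window)
            lines.append(" ".join(words[start:i + window + 1]))
    return lines

# Invented sentence echoing my glib example from class.
sentence = ("Australia is an island peopled entirely by criminals, "
            "or so the pamphlet claimed.")
result = kwic(sentence, "Australia")
```

A real tool would return many such lines across a whole corpus — which is precisely the context the maps, on their own, leave out.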
I appreciated that my colleague mentioned that the results would come in context, but the published example would have benefitted from proffering this information. Another colleague of mine noted that data mining in this manner is helpful beyond the confines of the literature profession, which, again, I am grateful to have heard, seeing as we are all lit. people, and thinking beyond our boundaries is, at least for me, difficult. However, despite these few moments of relief, something was still bothering me: data mining can read a text, and even provide the statistician with context, but it is in NO WAY capable of providing subtext.
What is meant by an utterance or turn of phrase cannot be picked up by a machine (anyone who has heard about how sarcasm doesn’t translate well in an e-mail will know that what I am saying here is true), so there is much that data mining will miss in relation to the culture that produced these texts.
To punctuate my final point above, I cited James Joyce as an
example. This did not go over well. At least not initially. As soon as Joyce was invoked, several of my
colleagues jumped up and shouted (not really):
Joyce is perfect for data mining.
His work is so dense. We would
benefit from DH in his case.
It seemed as though He was the first author of DH data
mining. I may have blasphemed and taken
the Lord’s name in vain.
I continued: yes, his work is dense, and much of its depth lies in the subtext of his writing, the collective, cultural meanings that are revealed only after repeated readings. These tend to differ from person to person, and in the case of universalities, they reveal themselves at different times and to different degrees. How could/would data mining be able to pull back the layers of subtext when these layers are invisible to a mechanical eye? No, we know nothing of Joyce—or at least not all of him—and it is only through reading the text and mining it ourselves that the mysteries of Joyce can be revealed.
Data mining, in this case, would be the mountaintop-removal method, as opposed to the individual prospector. Prospectors are in touch with the environment, and readers are very similar, knowing the terrain of the text. This insider knowledge, as far as the literary profession is concerned, should be at the fore, not the mechanized version. Save that for sociology.