Wednesday, April 08, 2009

The Age of Digital Citation

Peer-to-peer technologies are working to unlock one of the most secretly-guarded rituals of academic citizenship, one that in former times was the most expensive to procure and the most costly to transfer: that is to say, knowledge of the canon itself.

The canon has to be mastered in a process of slow reading and even slower surfing of footnotes that occupies the first three to five years of graduate study leading up to qualifying exams and a dissertation prospectus. Even finding out what the canon is remains part of the work, eased in certain places by official departmental reading lists and historiographical classes, but finally a matter of reading and mastering the minutiae of the scholarly apparatus.

Finding the canon in history, for instance, means careful reading of acknowledgements sections and footnotes correlated with CVs, finding out who worked with whom, which texts appear with frequency, and which are dismissed. Finding the canon in comparative literature is frequently a matter of reading notes from Lacanian seminars in 1960s Paris, deducing from reported conversations the subtext that actually mattered to scholars.

All of these processes depend upon having the time to follow professors, to track them down in office hours, to pay attention to which conversations they listened to, to abstract one’s own canon from the masses. It was never enough to self-train; it was hardly enough merely to read, and visiting bookshops was a way to error rather than fruition. The canon has, until now, been secret, and it has been a matter of personal socialization to even find out what the important names were. And all of this suddenly promises to fold. Google Scholar counts citations and delivers the one true text on the transport revolution cited by scholar after scholar, or the new groundbreaking text that rocketed to a favorite within the last ten years.

The citation databases create new canons, established by numbers. Numbers have power. It is now nearly inevitable that, sooner or later, those numbers will begin to influence hiring decisions. Woe betide the uncited book or ignored article: relevance to disciplinary discourse can be counted and numbered. Scholarship has entered the age of the citation database.

Such highways create residual suburbs on the periphery of common activity. Journals too exclusive or small or specialist to go online, such as Cabinet, which depends on orders of back editions for part of its revenue, upload none of their articles by humanities rockstars, be they ever so bright as Wolfgang Schivelbusch or Marina Warner. Blog entries from para-academic scholars such as Geoff Manaugh of BLDGBLOG, or podcasts by David Harvey, despite their circulation, will show up in Google stats but never in Scholar; they will show up on Zotero if users put them there, but part of their cachet is being known to only a small group of thinkers. One finds these scholarly suburbs by knowing the right people, by following the right idea. Knowing about Cabinet means dedication to a discourse.

There are two levels of citation, two ways of knowing then: the official, the common highway, the established canon of knowledge, now finally unlocked for all. The eighteen-year-old high-school drop-out in Cleveland can learn about the industrial revolution on his own, navigating straight to the top texts if he chooses. Alongside that canon, another and more mysterious one is forming: the secret canon of para-academic, interdisciplinary know-how. The former uses Scholar and Zotero, the tools of the trade. The latter leaves traces on Delicious and Twitter, the tools of public intellectuals. Scholars following the breath of the new will want to have exposure to both.

Interdisciplinary Canons, New Fields

We can look around the curve of time to further consequences of this unlocking of canons. The proximate arenas of effect are interdisciplinarity and the establishment of new fields.

The age of digital citation also makes possible a new age of rampant interdisciplinarity: searching for the origins of urban prisons in the nineteenth century launches the historian into the abundant writing from literature scholars on the same subject. One no longer has to visit the art history department to develop a second field in art history; the list of innovative new texts is within easy grasp. While traditional scholars ignore other fields for the sake of expediency, the easy grasp of interdisciplinary knowledge makes ignoring it merely irresponsible.

Open canons also imply the more rapid establishment of new fields: though scholars had been writing serious studies of the city since Henri Pirenne, it took an operator like Arthur Schlesinger to establish urban history as a field. A Harvard professor could produce a generation of graduate students, a flood of scholarship, a conference, and finally a journal. For most of the twentieth century, such were the criteria necessary to generate a legitimate subfield where most departments hire and teach today.

Navigating the Information Glut

Such interdisciplinary plenitude foreshadows an age of information glut. Even as I write, I too cringe, already worn out from a morning preparing a nineteenth-century cities lecture, lured outside my historical canon by the ready availability of literature scholars’ studies of early detective fiction.

The temptation to meander is a serious one. I could waste hours there, thinking about the difficulty of finding information for the urban police, and the way those searches find their way into the middle-class fascination with Sherlock Holmes; a historical problematic of information glut not unlike my own. The task looks impossible, though, and for the moment I’ve simply avoided the other canon. I’ll concentrate on historians’ accounts of police, and leave their literary imaginal for another day. Here’s the gist of the problem: much like those urban subjects, today’s researchers have the problem of knowing which categories of information are relevant.

A second temptation is to decline responsibility altogether. Clumsy navigation of information results in a glut of citations that don’t actually reflect their user’s experience. That happens in the least efficient way now whenever a scholar cites relevant articles in a footnote without reading them, guessing their content from title or first page alone. Scholars show their finesse at navigating digital canons by citing only the works essential to their argument. A smaller list of citations frequently demonstrates real mastery.

Reducing the canon effectively is always a matter of outsourcing responsibility. Traditionally, the scholar relies upon the advisor for setting the bounds of discovery, the questions of debate, even the nature of inquiry. Derrida gives us the image of Socrates prodding Plato with his stylus to get him going; Socrates comes up with the questions and Plato does all the work. In traditional graduate departments, the student is somewhat relieved of this relationship by the possibility of multiple members of a committee. Barbara Johnson, Derrida’s feminist pupil, gives another image: Moliere’s Agnes in The School for Wives, who gains her freedom by having two teachers and choosing her own path between them.

The age of digital citation raises the possibility of hacking through the canon with other prosthetics than the human teacher. Each of them has their own limits and rewards.

Crowdsourced citation, in its most blunt form, creates simple accounts of which texts are most read. In the world of tagging, however, readers assign labels to a text or passage. Tagclouds rank the labels used by a particular group of users by frequency. A set of crowdsourced labels produces a folksonomy, or the set of terms of greatest interest to that particular set of users. Individual folk publics can emerge, each of them generating their own set of terms. Each advisor and her graduate students can communicate, it seems, in a common language of labels applied to the texts they commonly read. One could sort the entire western canon for texts labeled “governmentality” by students of Patrick Joyce. The crowdsourcing is hypothetically open: our drop-out in Cleveland can theoretically acquaint himself with the canon interpreted by Patrick Joyce and followers by searching Zotero for “governmentality.” He can theoretically contribute his own readings from Mayhew.
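
The ranking described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the (user, text, tag) records, the user names, and the placeholder titles are all invented here, and the functions `tag_cloud` and `texts_with_tag` are hypothetical helpers, not part of Zotero, Delicious, or any real service.

```python
from collections import Counter

# Hypothetical records: (user, text, tag). All names and titles are placeholders.
RECORDS = [
    ("student_a", "Text A", "governmentality"),
    ("student_b", "Text A", "governmentality"),
    ("student_b", "Text B", "liberalism"),
    ("outsider", "Text C", "governmentality"),
    ("student_a", "Text B", "governmentality"),
]


def tag_cloud(records, users=None):
    """Rank tags by how often a given group of users applied them.

    If `users` is None, every user's tags are counted.
    """
    counts = Counter(
        tag for user, _text, tag in records
        if users is None or user in users
    )
    return counts.most_common()


def texts_with_tag(records, tag, users=None):
    """List the texts that a given group of users labeled with `tag`."""
    return sorted({
        text for user, text, t in records
        if t == tag and (users is None or user in users)
    })


# One "folk public": an advisor's circle of students (assumed names).
circle = {"student_a", "student_b"}
cloud = tag_cloud(RECORDS, users=circle)
shared_canon = texts_with_tag(RECORDS, "governmentality", users=circle)
open_canon = texts_with_tag(RECORDS, "governmentality")
```

Restricting `users` to one circle gives that group's folksonomy; leaving it open shows what an outsider's tags add to the same label.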

The generation of new terms in a folksonomy is organic, as well. For another scholar to add another term to the tagcloud, they need only begin tagging abundantly themselves. A body of sympathetic users who adopt a new term can grow and find each other. In the age of digital citation, subfields have the chance to emerge in a new way. They emerge with less certainty and coherence, to be sure, than those directed by graduate advisors, but they emerge nonetheless. Landscape Studies, so long on the periphery of a dozen canons, perhaps only has a chance in the age of digital citation.

With whom does innovation lie in such a setting? With the advisor, to be sure, who launches a generation of students tagging the world through a new taxonomy; but also with the innovator, who dives into the established canon, passionately splicing the world according to a new set of values: leaving behind a trail of texts for a Marxist reading of the eighteenth century or a landscapey reading of the nineteenth.

And here the problem of originality reemerges. For these alternative taxonomies to be persuasive, they must seem relevant to other taxonomists. They must not seem redundant, the mirror of so many Patrick Joyces or T. S. Ashtons who have looked at the literature beforehand.

Another means of sorting through the noise of the digital canon is to outsource the reading to artificial intelligence. A program such as Devonthink, taught by a user to group together the readings and excerpts for a single undergraduate survey, can learn that texts that mention Britain, Adam Smith, and the 1750s belong together. It can even browse JSTOR for new passages of immediate relevance to the topic, excerpt them, and highlight some of the most important words that appear most frequently. The scholar still has to read: but the machine performs the work of the research assistant, diving into the archives and coming out with particular passages neatly marked.
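
How such grouping might work can be suggested with a toy example. The sketch below is not Devonthink's method, which this post does not describe; it swaps in a plain word-overlap (Jaccard) score as a stand-in. The stopword list, the sample notes, and the helper names `terms`, `jaccard`, and `most_similar` are all assumptions made for illustration.

```python
import re

# A tiny, assumed stopword list; a real tool would use a far larger one.
STOPWORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def terms(text):
    """Return the set of lowercase words in `text`, minus stopwords."""
    return {
        word for word in re.findall(r"[a-z0-9]+", text.lower())
        if word not in STOPWORDS
    }


def jaccard(a, b):
    """Share of terms two sets have in common (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def most_similar(query, library):
    """Order document names from most to least similar to `query`."""
    q = terms(query)
    return sorted(
        library,
        key=lambda name: jaccard(q, terms(library[name])),
        reverse=True,
    )


# Invented sample notes, standing in for a scholar's stored excerpts.
LIBRARY = {
    "survey_notes": "Lecture notes on Adam Smith and Britain in the 1750s",
    "prison_notes": "Urban prisons and detective fiction in nineteenth century London",
}
ranking = most_similar("Britain, Adam Smith, and the 1750s", LIBRARY)
```

Because the score rewards only shared vocabulary, a sketch like this returns more of what the scholar already has, which is the limitation taken up next.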

Working with such an apparatus creates the problem of an echo chamber. If you liked this, you will also like something like it. How does the scholar find an alternative telling? Where lies innovation? The answer to this question is probably the same as it has always been in academia: one does something innovative by mastering the canon and looking outside of it. In the age of digital citation, the canon is easier to find than ever, which means that the economy of time can spare more room for reading beyond the canon in search of fresher ideas. The healthy scholar will employ digital prosthetics towards mastering established canons, leaving more energy to spare for creative praxis.

It is an age for the flourishing of scholars who have the time to read deeply and the energy to think outside of the canon. This is what’s scary about it: to keep up in the age of digital citation, scholars will have to master a series of intellectual prostheses – tagging circles, artificial reading bots, quick skimming – that will help them navigate through the masses of texts. The age of digital citation will punish scholars who merely reduplicate the canons of their mentors. This is what’s exciting about it: you no longer have to go to a university to find out what books are on the canon.

The age of digital citation is almost guaranteed to produce a phalanx of interdisciplinary thinkers, skilled synthesists, capable of putting together the big picture from a variety of micro-fields and offering new perspectives on the whole. It will be to the credit of the rest of us if we can accept them.

Labels: academia, academics, citations, crowdsourcing, folksonomy, googlescholar, innovation, scholarship, tagclouds, zotero

7 Comments:

PhDinHistory said...: DEVONthink seems to be a recurring theme in your posts. I haven't tried it, but it sounds fascinating.

I think you're right about numbers and citations. Just wait until we create the algorithms that will track citations in Google Books.

Did you see the article this week in the New York Times about physicists who are creating computer and robot replacements that run lab tests and discover new laws of nature? It appears you are half right. The computers can become research assistants. But they can also discover things that we could never find on our own.

I sure hope we do not end up thinking of our "intellectual prostheses" as black boxes. If that is the case, we may end up like the above-cited physicists who recognize that their computer programs are finding new answers, but who are at a loss to know what question was being answered.

To navigate this new terrain, I think we will need a wide-ranging interdisciplinary knowledge. I can't wait to see who is up to the challenge.; 8:10 PM
Jo Guldi said...: Exciting, PH! The scientists are ahead of us on the web 2.0 curve, that's for sure. The physicists have been writing papers on wikis and sharing citations via Papers for some time now.

Devonthink: yes. I've been hearing about it from geek friends for two years now, and I'm finally starting to use it to help process large historiographic apparati. In practice what that means is storing pdf's and fresh scans (which Devonthink can read and sort), with some minimal sorting, and seeing what patterns emerge. I'm speculating from below the curve on this one -- I can sort of see where it's going, but I'm waiting until I build up a big enough database to do something really cool to report on where I think DevonThink can take us. (DevonAgent, related, but not the same, I've been able to mess around with, but I still need those plugins to apply it to historical databases... send coders!!!); 9:13 PM
Unknown said...: Great post!

There's a lot to wrestle with here. One issue that jumps to mind is how the role of an advisor will change. Their function as a repository of canonical expertise will undoubtedly dim, but I also wonder if we will see a decrease in their authority as mentors. Will the role of doling out career advice or facilitating new professional contacts increasingly fall on the virtual crowdsourced shoulders of Twitter followers or blog subscribers?

It will also be interesting to see the backlash against the shifting status quo. There are a lot of people in academia whose careers rest firmly on maintaining their narrowed "expert" status. If they can't/won't adapt to a new paradigm, there's going to be some ugliness. Finding a way to ease them into the fold of digital academia is going to be a major challenge.

And finally, amen to the call for interdisciplinarity. To me, it's one of the most exciting aspects of digital scholarship - the opportunity (and increasingly, the necessity) to push the limits of your own academic comfort zone and get involved in a variety of fields.; 6:32 AM
PhDinHistory said...: I look forward to your updates on both DEVONthink and DEVONagent. I have scanned about 1,500 archival documents for my dissertation and run them through OCR, in hopes of using text mining to help me write my dissertation. More recently I have downloaded PDFs of more than a thousand newspaper articles, all of which I can easily access through desktop search. It sounds like I may need to invest in this software package. Do you have a sense of whether these programs could be integrated with Zotero or other digital history tools?; 8:53 AM
Jo Guldi said...: PH, gosh, if only they COULD be integrated. The first question is probably: are enough historians using them to make the integration worthwhile? Code follows community. If more of us get on board DT and DA, ManyEyes or GoogleEarth, new code and new hacks will start to emerge from the very everyday activities we're doing.

To make DT/Zotero immediately useful, I'm quarantining my use of each. I only use Zotero for footnote citations; I don't generally use its tagging/sorting/storage capabilities: my other applications do more interesting things. I use delicious to sort (the community is more interesting than Zotero's so far) and DT to find patterns (which Zotero can't do). So long as Zotero doesn't play with them, I'll pass on its other capabilities. The result is a two-step process. I'm downloading the citation to Zotero as I find the pdf, then immediately importing the downloaded pdf's into Devonthink so that I can analyze, tag, and sort them.

With an archive like yours, I'd suggest that DT would make a lot of sense for finding new patterns. For example: I ask DT to look at the pdf's on visuality and Carlyle that I've uploaded for a lecture on "learning to look"; DT tells me that among the top keywords is "panopticism" and immediately pulls the other pdf's that reference the term. If I asked it to, it would automatically generate a folder for those files, with the relevant passages highlighted. Instant outline of one of the paragraphs in my lecture. Instant generation of the notes for a single paragraph of your dissertation.

With your archive of pdf's and scanned documents, you're in a much better position than me to take advantage of such tools. My digital archive is in the process of building precisely because I already created my archives in other ways (delicious, non-digital old-fashioned hand-written notes). If only I'd had these tools when I was dissertating!

Alas, all historical methodology is doomed to be ephemeral. We use these things while they serve our purposes, and all purposes are dependent upon the changing nature of the whole. I guess that's part of the fun.; 9:08 AM
Louis said...: You say DEVONthink can browse JSTOR? How? I haven't been able to get it or DEVONagent to do that?; 9:49 PM
Jo Guldi said...: Louis, I got a JSTOR plugin from a librarian at GMU who's a better coder than I am. Email me directly and I'll pass it on?; 2:05 AM

Inscape
