Landscape organizes everything within sight.

Friday, October 10, 2008

Folksonomy vs. Navigation in Chains

Two tendencies are at war over digital publishing: the trend towards free navigation and the trend towards navigation in chains.

Navigation in chains is a disturbing trend, encouraged accidentally by some of the hardest-working and smartest academics on the web. As RWMG points out in a comment below, Exhibitors and Mass Digitization projects expand different kinds of public access. I've been arguing that mass-digitization projects are the broadest hope for expanding public access to the treasures formerly locked away for the few. Exhibitors indeed play a valuable role by encouraging the expansion of mass digitization back in time, so that nineteenth-century books from the dawn of cheap printing are joined online by medieval manuscripts, Renaissance incunabula, and tablets from ancient Persia.

What I'm worried about is the role of independent navigation and collaborative interpretation in both projects.

Traditionally, online exhibitors have tended to pre-package navigation through their online collections. One can "take the Itinerary" through ancient Rome -- more fun than most primetime shows, sure, but generally not where I'd head first myself. The big problem is that most of the categories first noticed and made available -- the names of major monuments, emperors, and styles of architecture -- are the most over-written subjects in the field.

Where's the fun in that?

Navigate your way through the collection based on known categories, and you won't see anything new. What about the natural curiosity that guides even an amateur through the stacks? What about the undergraduate or grad student who comes to the online exhibit with their own concern -- what Renaissance Rome can tell us about ribbon development, public places, or eminent domain? Such freely arising, spontaneous, and individual questions are the questions that drive individuals to do their own inquiry in the first place, rather than operating as passive consumers of books.

The braver alternative, which few of the exhibitors are using for reasons of authority and control, is to open up the tagging of each manuscript to the public. When you open up each manuscript, you get possibly thousands of overlapping keywords. In digital searching, that's not a problem; in fact, it means that subtler descriptors like mood, use, or background details might be noticed and tagged by people who care about them, leaving room for a later researcher to make headway in categories no one's noticed before.
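To make concrete why thousands of overlapping keywords aren't a problem for a machine, here is a minimal sketch of a tag index. Everything in it is an assumption made for illustration: the `TagIndex` class, the manuscript ids, and the sample tags are hypothetical and don't describe any exhibitor's actual system.

```python
from collections import defaultdict


class TagIndex:
    """Hypothetical folksonomy index: maps each tag to the items carrying it."""

    def __init__(self):
        self._items_by_tag = defaultdict(set)

    def tag(self, item_id, *tags):
        # Anyone may add any tag; duplicates and overlaps simply merge.
        for t in tags:
            self._items_by_tag[t.strip().lower()].add(item_id)

    def search(self, *tags):
        # Return the items that carry every one of the requested tags.
        sets = [self._items_by_tag.get(t.strip().lower(), set()) for t in tags]
        if not sets:
            return set()
        return set.intersection(*sets)


# Made-up ids and tags, standing in for contributions from different visitors.
index = TagIndex()
index.tag("ms-001", "Rome", "monument")
index.tag("ms-001", "street vendors", "poverty")
index.tag("ms-002", "Poverty", "women")

everything_on_poverty = index.search("poverty")
poverty_and_women = index.search("poverty", "women")
```

Because each tag just points at a set of items, the curator's "monument" and a visitor's "street vendors" can sit on the same manuscript without getting in each other's way.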

No less an institution than the Library of Congress has experimented with open tagging. This spring, they opened up 3000 images to open tagging on Flickr. Famously, within the first 24 hours, the public had added 11,000 tags. For visual images, those expanding, publicly-generated tags signify a new kind of searching and category formation hitherto unavailable to visual image researchers, who had to rely on their own eyes, their own skills of analysis, and the clumsy and slow manuscript delivery of image libraries, where images were tagged only with preexisting categories. With open tagging, image searching hypothetically means one could actually look through occurrences of the tag "poverty" in the nineteenth century and learn something. Or look through the tags relating to "women" generated by users and start wondering about the papers that haven't yet been written.

Insofar as critical inquiry -- the engagement with texts, the arguing back against books -- represents one of the fundamental reasons for the humanities, a tendency towards navigation in chains is antithetical to what academics should be doing on the web. Encouraged by an uncritical reverence for the scholar's authority, navigation in chains is structuring a disturbing number of the collections now online.


Tim Hitchcock said...

Who tags the information, or creates the hierarchy, is only a part of the problem. Keyword searching, and searching on tagged information, are both very blunt instruments, and simply reinforce an older style of iterative research. The search engines created by freestanding sites are themselves sad interim solutions to new problems, and will wither as new ways of searching are created (the chains will rot in short order from their own internal weaknesses). And tagging is, again, just one more interim technology (a strategy derived from the 1980s, and well past its sell-by date). All of these creations are based on the notion of the 'library', on ordered information, and you are arguing about who should have the right to order it. I think it is all just so much renaissance detritus (a worthy subject of study, but not a working methodology). What seems to me to be missing are the new tools that allow you to do things differently in the infinite archive. With 100 billion words of digitised text, I want to find the patterns that I cannot imagine, and which even an army of folksonomic taggers could not reveal.

9:16 AM  
J said...

I love you, Tim, but you're being contrary. Folksonomy is valuable as a key to mapping a contemporary set of evolving interests on top of old texts. Thinking _with_ the vast collective brain of other Flickr users of different nationalities, backgrounds, and interests is a pretty interesting way to look at the nineteenth century. Whatever its limits (and as critical thinkers of course we always press against the limits), folksonomy is as good as it gets for jacking directly into contemporary consciousness.

What you're right about, though, is that this is an exciting time for those interested in thinking about and pushing the way we wrap language onto reality.

So a question for you. Which of the following represents a direction you're most excited about?

- text excavation programs like Juxta
- plugging a natural language thesaurus into your keyword searches
- visual programs to synthesize different kinds of search

...or something else?

10:11 AM  
Tim Hitchcock said...

Three things are raising my temperature at the minute. First, the possibility of open-ended pattern matching on large bodies of text using compression algorithms (using the repetitions identified by compression programmes to map abstract similarities across large bodies of text). This is an idea first suggested by Bill Turkel. Second, the creation of visualisation strategies that move beyond text, to sets of relationships. I am keen on creating a 3d environment to represent the lives of a few million 18th c. Londoners - each life being represented by a line between points defined by archive, location in 18th c. London and chronology. And finally, I think the 'archive' is now big enough to start looking for the evolution of concepts across genre and time; and to create a defensible theory of intellectual evolution, that could be represented as change over time; and allow you to both identify the 'mash-up' at the heart of all originality, and the ever borrowed afterlife of every phrase and idea.
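For readers wondering what compression-based pattern matching might look like in practice, here is a rough sketch using one well-known formulation, the normalized compression distance. This is an assumption about how the idea could be tried, not a description of Bill Turkel's method or of any existing project; the function names and the sample sentences are made up for illustration.

```python
import zlib


def compressed_size(text: str) -> int:
    # Size in bytes of the zlib-compressed UTF-8 text, at maximum compression.
    return len(zlib.compress(text.encode("utf-8"), 9))


def ncd(a: str, b: str) -> float:
    """Normalized compression distance between two texts.

    If the compressor finds many shared repetitions, compressing a + b
    costs little more than compressing one of them alone, and the score
    is small. Unrelated texts score closer to 1.
    """
    size_a = compressed_size(a)
    size_b = compressed_size(b)
    size_ab = compressed_size(a + b)
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)


# Invented sample passages, repeated so the compressor has something to work on.
trial = "the prisoner was indicted for stealing a silver watch from the shop. " * 20
sermon = "blessed are they that mourn, for in due season they shall be comforted. " * 20

same_text_score = ncd(trial, trial)
different_text_score = ncd(trial, sermon)
```

Here `same_text_score` should come out well below `different_text_score`. Note that no keywords or tags are involved: the compressor decides for itself which repetitions matter.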

Of course, none of this leads to 'Jacking into contemporary consciousness' - which may be the problem! But it is a kind of history that could not be conceived before the digital archive, and I am keen to witness/do that different kind of scholarly practice, to conceive afresh pretty much everything; and frustratingly, to do it in the few inches between my own ears.

5:52 AM  
