Landscape organizes everything within sight.

Tuesday, March 27, 2007

New Tools in Old Disciplines: Working Magic with Google Books, cont'd


The earlier post about Google Books on this site is creating quite a buzz among librarian and historian communities online -- partly because famed tech blogger Tim O'Reilly reported my having "dissed" the experience of libraries for virtual research, partly because Google Books is so hot, and partly because the image of libraries disappearing for computers raises hackles among academics everywhere.

The fears start flying. Will historians neglect the skills of traditional research because they've discovered the internet? Probably not. We spend years training in arcane research methods, and we make our names by doing something new, which even today generally means finding some measure of unknown documents in the archive. Will the material archives disappear? That's a fear, because any time a university can cut funds, it will, and then, as Rick Prelinger can testify, entire corpuses of periodicals and log books from the eighteenth century are jettisoned in the dumpster. Are internet archives going to be exhaustive? Definitely not, and in no case is every last spare bit of paper -- the forms, the doodles, the enormous maps -- getting scanned. Some of the fears are legitimate, and some of the fears are false. All give evidence of a rapidly changing world.

The real excitement around tools like Google Books is the possibility of applying new tools that are now simply not available with the other kind of text. The word-count and documentation databases I mentioned are now only a dream -- Google's caution with copyright laws puts them out of the realm of possibility for the moment. But should those become possible, they will open up a realm of research possibilities that are now only experimental in the humanities.

To give but one example, it is now possible in the text-searchable, online Oxford English Dictionary to find all words with "road" or "walking" in the definition that had their origin between 1810 and 1840. I discovered a variety of pieces of slang pertaining specifically to the way people walk down the new streets -- suggesting that they were parading, performing, acting in some way so new to the culture that an entire vocabulary had to be invented to explain what they were doing. By traditional methods, most of these would never have turned up; they're too far apart in occurrence, we tend to focus on polemic rather than slang texts, and the shift would have escaped me. This data from the OED is now a major piece of evidence in one of my chapters, allowing me to advance conclusions I would not have been able to make before.
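For readers who think in code, the shape of that dictionary query can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the OED's actual interface: the `Entry` record, its field names, the `find_entries` helper, and the sample entries are all hypothetical placeholders invented here.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    headword: str
    definition: str
    first_cited: int  # year of the earliest recorded use


def find_entries(entries, keywords, start, end):
    """Return entries first cited in [start, end] whose definition mentions any keyword.

    Matching is a plain case-insensitive substring test, so "road" would
    also match "broad"; a real search would want word boundaries.
    """
    lowered = [k.lower() for k in keywords]
    return [
        e for e in entries
        if start <= e.first_cited <= end
        and any(k in e.definition.lower() for k in lowered)
    ]


# Placeholder entries, invented purely for illustration (not real OED data).
sample = [
    Entry("example-a", "to stroll along a road for show", 1822),
    Entry("example-b", "a kind of cooking pot", 1815),
    Entry("example-c", "a style of walking in town", 1790),
]

hits = find_entries(sample, ["road", "walking"], 1810, 1840)
print([e.headword for e in hits])
```

Only the first placeholder entry satisfies both the keyword and the date window; the interesting work, of course, is in reading what the real hits say.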

Similar searches on the Dictionary of National Biography have allowed me to perform acrobatics with the networks of different professionals in the 1780s, people like artisans and innkeepers who rarely turn up in traditional historiography, about whom the data is scarce. These professions make brief appearances in the DNB, and by tracing the lives of a hundred innkeepers in the 1780s, patterns of politics, religious belief, and marriage emerge that suggest that innkeepers, with their access to horses and carriages and strangers, were among the best-connected and most political people in the nation. We are only beginning to see what this kind of research can do.

Doing this sort of number crunching on texts yields amazing results. In the future, historians will demand access to the full text of Google Books for exactly this reason. If Google doesn't provide it, many of its competitors -- including the Internet Archive -- may. So a fertile world of sorting searches is ahead of us.

The rosiest scenario includes tech geeks and academic researchers teaming up to talk about framing the search queries. The raw text in the Dictionary of National Biography, for example, has no fields except the entry for "name" and "years." I have to sort through myself to find the number of children, the profession, the religion, the political beliefs, and the books he wrote. But sorting this kind of material against each other in searches is immensely powerful. Did Quakers have more children than Catholics? I don't know, but the archive does. And if the DNB has too few variables to be the right resource for this sort of search, a variety of local archives and court records around Britain are now going online, with exactly that potential. These include the entire proceedings of the Old Bailey court in London, 1674-1834; the census, 1801-1964; the British Parliamentary Papers, 1688-1905, and the LSE Booth Archive (maps of poverty in the 1860s). More often than not, historians like myself with no technical background are in charge of creating the data fields and search algorithms. We rarely find what we want, because we don't know how to use the technology to get what we want. The marriage of technocrats and historians could be a happy one.
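Once biographical entries have been coded into fields, a question like "Did Quakers have more children than Catholics?" becomes a short grouping exercise. The sketch below assumes the fields have already been extracted by hand; the record layout, the `mean_children_by` helper, and every value in it are hypothetical placeholders, not figures from the DNB.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical hand-coded records; names and numbers are placeholders,
# not real Dictionary of National Biography data.
records = [
    {"name": "Person A", "religion": "Quaker", "children": 4},
    {"name": "Person B", "religion": "Quaker", "children": 6},
    {"name": "Person C", "religion": "Catholic", "children": 3},
    {"name": "Person D", "religion": "Catholic", "children": None},  # unknown
    {"name": "Person E", "religion": None, "children": 2},           # unknown
]


def mean_children_by(records, field):
    """Average the 'children' count per value of `field`, skipping unknowns."""
    groups = defaultdict(list)
    for r in records:
        if r.get("children") is not None and r.get(field):
            groups[r[field]].append(r["children"])
    return {key: mean(values) for key, values in groups.items()}


averages = mean_children_by(records, "religion")
print(averages)
```

Skipping records with missing values keeps the averages honest, but it also means the answer is only as good as the coverage of the fields, which is exactly where the historian's judgment comes in.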

Right now, none of these archives talk to each other, none allow tagging or comments from researchers, and those that have tried to provide fields or tags have done so by hand, over years, at immense expense with little to show. To this sort of labor, some open databases provide a vista of solutions. GoogleBase and Freebase are the two important ones now. In open databases, even small archives can contribute the raw data from their holdings, and anyone -- from the genealogist to the professional historian to the computer scientist looking for a Masters Thesis project -- can start putting together the evidence into interesting patterns, and then sharing those tools with others. Analytic tools like Swivel, Pipes, and DabbleDB can start finding patterns immediately. The miracles to come will happen when data starts talking to data, bubbling into new patterns yet undiscovered -- when we start getting entire life histories of shoemakers and Quaker populations out of the traces they left across a dozen government and local databases, when we start discovering shoemakers across vast swathes of England who knew each other and were talking, and when we start following the spread of religious or political ideas across those networks. We must believe that there are patterns locked in the data that are burning to get out, and we must apply all the tools we have to release them.

Google Books blew my socks off because it was able to contribute something new to my research after I had already circled the world for this information, pillaging a variety of specialty libraries, among them, Harvard's Dumbarton Oaks landscape collection, the Maps Collection and Center for British Art at Yale, the Royal Institute of British Architects Collection, the Victoria and Albert, the British Museum, the Cambridge libraries, and the Public Records Office. I was also going through whatever ILL could bring me through the well-organized mechanisms of the University of California. I've seen ephemera and political documents pertaining to the road that were never looked at by any of the thirty major historians who wrote about the road in the course of the twentieth century. It is utterly a delight, then, to encounter other books that did not turn up in my exhaustive ramble through the traditional methods. New tools in old disciplines can do us a world of magic.


Thursday, March 22, 2007

How Delicious is Changing Academic Research

As of a recent post on Google Books and the research of History, our quiet little blog here on academic history, activism, and spirituality has suddenly gotten more notoriety than it's accustomed to. Hi world! Thanks for stopping by. To carry on with the thread of how information travels for academics, and what the 'net is doing, let's talk about another of my favorite sites for research, del.icio.us.

Delicious is the Rome, Jerusalem, and Paris of my existence as an academic these days. It's where I make my friends, how I get the news, and where I go to trade. All this from a little server that does nothing but share bookmarks in public.

Why? Two reasons it's cool. 1) It sorts things. 2) it makes them public.

1) it sorts things.

For two years I've been using Delicious as an information organizer. It's produced an impressive encyclopedia of the most interesting information, images, articles, citations, books, and subjects on the internet to which I might want to refer. Consider my dissertation tag, under which are a wide variety of online images and Google Books that I'll be using for my research. Not only can I come back to them, but I can also find related subjects -- dissertation material related to walking -- navigating seamlessly from one to another. Compared with the index card system, or my own terrifying piles of articles (even now ornamenting my bookshelf), or even the folders within folders within folders of Word documents, this is a definite improvement.

I've been building a taxonomy -- the way some people use wikis, the way my boyfriend uses that utterly cool personal software, "the brain;" the way my father uses his vertical file, the way my DC friends use their rolodexes -- so I sort out all the information I take in, annexing technology to memory, sorting factoids and spare threads and notable evidence in neat, interlocking piles where I can find information again, draw connections, and create new connections.

The result is a navigable taxonomy of my thoughts. If I want to find my stuff on the history of "walking," the taxonomy already knows that my material on walking is associated with other categories of knowledge which I've tagged nearby.
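The "knowing" here is nothing more mysterious than counting which tags turn up together. A minimal sketch of that idea, assuming bookmarks are stored as a mapping from URL to a set of tags (the `related_tags` helper, the example URLs, and the tag assignments are illustrative assumptions, not Delicious's actual data model or API):

```python
from collections import Counter


def related_tags(bookmarks, tag):
    """Count how often other tags appear on bookmarks that carry `tag`."""
    counts = Counter()
    for tags in bookmarks.values():
        if tag in tags:
            counts.update(t for t in tags if t != tag)
    return counts


# Hypothetical bookmarks: URL -> set of tags.
bookmarks = {
    "https://example.org/1": {"walking", "dissertation", "c19"},
    "https://example.org/2": {"walking", "london", "c19"},
    "https://example.org/3": {"maps", "london"},
}

neighbors = related_tags(bookmarks, "walking")
print(neighbors.most_common())
```

In this toy collection "c19" sits on both walking bookmarks, so it surfaces as the closest neighbor; the more one tags, the richer those neighborhoods become.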

After a year of using delicious for my own bookmarks, helping other people find things becomes remarkably easy. Many of the link lists below are simply cut and pasted from delicious. Lists of citations for colleagues are cut and pasted from delicious into email. The forty American history students I teach are instructed to go to my delicious page for writing help, research help, maps, and images relating to the class.


Second reason delicious is cool:

2) it makes things public.

Not only can you look at your own bookmarks, but you can also look at others'. When you find something noted to be queer and interesting, you can find out what other topics that same person thinks to be queer and interesting.

What's rapidly happening with these shared tags is that academics are finding each other in growing numbers. I have some twenty people in my network, at least half of whom I've never met in real life. They include:

* Javier Arbona, a graduate student in Geography who's also at the University of California, Berkeley
* Travis Brown, a graduate student in literature
* LeahB, an editor at Cabinet Magazine, my favorite periodical
* bibliparis4, a librarian at one of the public universities in Paris

Each of these is another intellectual putting together rarified connections about strange pieces of thought somehow related to my world.

I found them because they were, like me, publicly tagging with some arcane tag that I also use. c19 -- the nineteenth century tag. vernacular -- a tag used by other people who work with ephemera.
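That kind of discovery amounts to intersecting tag sets. The sketch below is a toy version under stated assumptions: the user names, their tags, and the `shared_tag_users` helper are all made up for illustration, and nothing here reflects how Delicious itself surfaces people.

```python
def shared_tag_users(my_tags, others):
    """List (user, shared_tags) pairs, most overlap first, ties broken by name."""
    overlaps = {user: my_tags & tags for user, tags in others.items()}
    return sorted(
        ((user, shared) for user, shared in overlaps.items() if shared),
        key=lambda pair: (-len(pair[1]), pair[0]),
    )


# Hypothetical users and their public tags.
my_tags = {"c19", "vernacular", "walking"}
others = {
    "user_x": {"c19", "vernacular", "maps"},
    "user_y": {"cooking"},
    "user_z": {"c19"},
}

matches = shared_tag_users(my_tags, others)
print(matches)
```

A user who shares two arcane tags ranks ahead of one who shares a single tag, and users with nothing in common drop out entirely.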

Every morning, I log into my delicious network and read the links that my small army of admired, clever, canny, eccentric brains has put together for me.

What's more, I'm developing what I'd consider an actual working relationship with these other scholars. A few of them have added me to their own networks. Day to day, I watch their reactions to Bush, I get a sense of where their research is going, and they get a sense of mine. It's low-level, low-commitment hanging out with high levels of information exchange.

And this is something different than the social activity I know anywhere else on the internet.

Normally, if you want to meet people on the internet, the connections are typically time-limited and action-specific. You want a date, you want sex, you need a friend of a friend for networking in Argentina. You meet up online and then you meet in real life. Or you meet online at Myspace and then, unless you have a crush on the person, forget to ever go back again. But my scholars are folks I'm seeing on a regular basis in the course of my regular research. This is the nearest thing to running into someone else at the card catalog yet.

I don't check in with them. I don't have, nor do I really need, the capacity to send email to them. Some of them I may actually encounter at academic conferences later, and we'll share more of a bond, through our years of doing collaborative research, than many scholars who have labored through the years in adjoining offices.

As Hannah Arendt understood, the modern democratic state happened when people in public spaces began interacting, and thus began taking action together. For this reason, she identified the medieval carnivals and fair days of Europe as the seat of literature, culture, debate, and politics. The rule goes like this: make a public, get action. Today, Delicious does for the internet what open-air markets did for medieval society. Low key, high-information, continuous-formation community building.

All hail the bookmark market.


Wednesday, March 14, 2007

How Google Books is Changing Academic History

Google Book Search is a relatively recent phenomenon... six months ago, right? About six months ago I was pottering around there, finding a few illustrated nineteenth-century texts, a lot of contemporary books for sale, and not much of too much interest.

Six months turns out to be a long time in book land. In that period of time, Book Search has accomplished enough to transform the academic profession.

I was idly trying a search on "roads" to see what sort of a literature would turn up for the period of my dissertation research, 1740-1850. I didn't expect much. I've spent the last two years wandering through the Yale, Harvard, and California libraries, the British Library, Britain's National Archives, and the immense reserves of North American Inter Library Loan reading every book on London, pavement, or travel I could get my hands on.

Surprise. In a single idle search I just added twenty extra full-text books to my list.

Which are, by the way, full-text searchable --

-- and subject to word-count analysis --

-- and replete with full illustrations --

-- and instantly digestible into visuals for powerpoint presentations.

Hallelujah, GoogleBooks. And holy mackerel! Good work.

By now, the first half of the nineteenth century exists in a very complete form on Google Books. In the last six months, while academic history has meandered in its habituated paths of grinding research, the possibilities of scholarship have been utterly transformed.

To give just one example, this little puppy -- Henry Parnell's A Treatise on Roads (1833), one of the key texts for my dissertation -- exists on our campus in Berkeley's transport library, a quaint but understaffed spare room hidden on the third floor of the engineering building, far, far away from where historians ever go. It wasn't actually on the shelf when I got there, so it took some patient emailing with the transport library librarians before the book was found, returned to the correct place, held at the desk for me, to be picked up during the library hours specific to that particular institution (10am-4pm, M-Fr). Wild with enthusiasm at having at last obtained it, I held the volume prisoner at my desk in San Francisco for six straight months, unruffled by overdue notices, until at last the plaintive emails from the circulation desk were too much for me to bear. Research in my world is very often a personal matter of haggling for more time with the particular librarian in question. They're used to us, and I figure they need a good struggle to keep them alert. But thanks to Google Book Search, these days of scavenger-hunt and tug-of-war are drawing to an end.

Time for a professional dialogue about the new kinds of research these texts have opened up. For a very vast vista has erupted before us, and with it, a more serious set of comparative questions as a standard for social history, and new levels of rigor to be expected from the individual researcher. No longer can historians afford to stay in the empty, lonely world of the weary scholar, poring over close readings of dialogue. Time for all those structural analysis skills to come back in full force. Quantitative and open databases of word-count and thematic analyses. Open databases of pictures, tagged by keywords and available for classroom use.

What this signals, by the way, is the opportunity for a new age of scholarship. Cultural and image analysis used to be painfully time-consuming, heavy lifting, involving rare kinds of access, full fellowships, immense travel, and long waits for delicate books. Comparison between different cultural sources was even harder, placing absurd demands on the cultural historian's personal memory and note-taking skills. Cultural historians, despite their many skills, stood second in depth of research on any particular topic to political historians, for whom one visit to a Parliamentary archive and one visit to a personal residence outfitted them with every last detail of historical change. Now all that is changing. Comparing a hundred images is no longer a problem for a year's labor in an out-of-the-way museum reading room. Comparing a hundred personal accounts from working men is no longer a task to eat up a social historian's entire year.

I'm looking forward to seeing what the future holds. Any reports of historians currently putting together databases? Please post them here. In the meantime, check out this afternoon's dissertation links...

  1. Practical Remarks, and Precedents of... - Google Book Search

    legal commentary on new pavement and turnpike legislation in parliament, 1802.


  2. A Treatise on the Law of Ways - Humphry Woolrych, 1829


  3. Steam Carriages on London Roads - Walter Hancock, London, 1838


  4. A Treatise on Roads, Their History - Simeon De Witt Bloodgood - 1838

    from Albany New York - lectures on the history of recent paving, with comments on tolls and despotism


  5. General Rules for Repairing Roads for surveyors on the Holyhead Roads - 1827


  6. Letter to Sir Alexander Muir M'Kenzie on Scottish Roads - McAdam - 1833


  7. A Practical Treatise on Making and Repairing Roads - Edmund Leahy - 1844


  8. Observations on the Formation, State and Condition of Turnpike Roads - A H Chambers - 1820


  9. The Practice of Making & Repairing Roads: - Thomas Hughes - 1838


  10. Rudiments of the Art of Constructing Roads - S Hughes - 1850


  11. A Treatise on Roads - Henry Parnell - 1833


  12. An Act [57 Geo. III. Cap. Xxix] for Better... - Google Book Search

    Metropolis Paving Act, 1817 - Michelangelo Taylor Act (?)


  13. Lights and Shadows of London Life - James Grant - 1842

    description of ethnic ghettos; Jews and Quakers, their neighborhoods and appearances. begging imposters and the typical figures of cant dictionaries


  14. Hydraulia, an Historical and Descriptive... 1835, William Matthews

    a historical description of London's water supply


  15. Sinks of London Laid Open: A Pocket Companion...

    George Cruikshank, 1848. A flash dictionary with excursions through lodging houses, kitchens, hells, etc.

