Wednesday, January 18, 2012

A Developer Goes to AHA2012

Last Sunday I returned from the 2012 meeting of the American Historical Association.  Although I have attended my share of conferences and unconferences--from Lone Star Ruby Con to Dreamforce and Texas State Historical Association to Museum Computer Network--I'd never attended one of the big mid-year academic conferences before.  The experience was strange but fruitful, and I hope I'll be able to attend again.

Let me start with my superficial impressions.  First, historians dress much better than developers do, though they really don't hold a candle to the art gallery folks.  They also are a more reactive audience, although there is very little back-channel conversation on Twitter -- in fact I was informed that typing on laptops would be considered rude! Finally, they are pretty introverted -- more likely to strike up a conversation with a stranger than your average Rubyist, but not by much.

The conference itself is a bit warped by the fact that many of the attendees are there for the sole purpose of conducting job interviews.  This apparently involves a days-long series of thirty-to-ninety-minute interviews designed to figure out which candidates to invite to campus for an on-site interview -- a grueling process for the interviewers and an expensive one for the interviewees.  (The analogous activity in the software world is the phone screen, in which a hiring manager discusses experience and skill-set with a candidate.  Over the phone.)  If most attendees are interviewing, they aren't actually participating in the conference -- I was told that around 12,000 people register, but only 5,000 attend.  This gives AHA a kind of Potemkin village flavor, and it's not unusual to see a panel lecturing to a nearly empty room.  In fact, the last session I attended had five speakers on the podium and only three people in the audience.

Nevertheless, AHA2012 and the associated THATCamp were tremendously productive for me.  There were several opportunities for collaboration, so while I didn't find my dream partner for FromThePage--that institution with a staff of front-end experts and a burning need for transcription software--I did have some really good conversations.  I've been trying to add better support for letters to FromThePage, and Jean Bauer gave me a detailed walk-through of the Project Quincy data model for correspondence.  A lot of people were interested in starting their own crowdsourcing projects, and we've been swapping emails since.  Most importantly, while I was in town I met with the development team behind Scribe and Talk, the open-source tools that power Citizen Science Alliance projects like OldWeather and AncientLives. I'll be posting about that separately.

One of the things that impressed me most about the history world was the potential there is for a programmer to make a big impact. The graduate student I roomed with was an expert with regular expressions--his texts were in Arabic, so the RTL/LTR mix required him to close his eyes as he composed his patterns--but he had no experience with elementary scripting.  In one two-hour hack session, we were able to split a three-hundred-thousand-line medieval biographical dictionary into twenty thousand small files representing individual entries.  With a couple more hours' work, we'd have been able to extract dates, places, names, and other data from these files.  It is a delight for a software engineer to work in a domain where such minimal effort can make such a difference: most of our work deals with obscure edge cases of hard/boring problems, so removing months of tedious manual labor with an hour's worth of programming is incredibly rewarding.
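
For the curious, here is roughly the shape of that kind of splitting script. This is a minimal sketch in Python, not the code we actually wrote: the entry-heading pattern (`ENTRY_START`), the function names, and the output file naming are all assumptions made up for illustration, and a real dictionary would need its own marker for where an entry begins.

```python
import re
from pathlib import Path

# Hypothetical marker: assume each entry begins with a line like "1234 - Name".
# In Python 3, \d on text also matches non-ASCII decimal digits
# (including Arabic-Indic ones).
ENTRY_START = re.compile(r"^\d+\s*-\s")


def split_entries(lines, is_start=ENTRY_START.match):
    """Group lines into entries; a new entry begins at each line where is_start matches.

    Any lines before the first marker (front matter) come back as the first group.
    """
    entries = []
    current = []
    for line in lines:
        if is_start(line) and current:
            entries.append(current)
            current = []
        current.append(line)
    if current:
        entries.append(current)
    return entries


def write_entries(entries, out_dir):
    """Write each entry to its own numbered file and return the list of paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, entry in enumerate(entries, start=1):
        path = out / f"entry_{i:05d}.txt"
        path.write_text("".join(entry), encoding="utf-8")
        paths.append(path)
    return paths
```

To use it, open the big file with `encoding="utf-8"`, pass the file object to `split_entries`, and hand the result to `write_entries`. Pulling out dates, places, and names would be a second pass over the small files with more patterns of the same kind.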

Crowdsourcing History: Collaborative Transcription and Archives, the panel I presented at, seemed to go well.  Moderator Shane Landrum invited the audience to give 3-minute presentations on their own crowdsourcing projects after the presenters finished their 8-minute talks, then he opened the floor for questions.  Although I was skeptical about this format, it worked very well indeed.  In particular, the Q/A period was blessedly free of the self-promoters who plague events like South by Southwest.  Perhaps this can be attributed to the novel format or perhaps it was due to the inherent civility of academic historians -- all I know is that it succeeded.  I felt very fortunate to be among the panelists, who were a Who's Who of manuscript transcription tools, although a couple of prominent projects were not represented because they were too recent to be included in the proposal.  Because the context was already set by my fellow panelists and because the time was so constrained, I decided to concentrate my own talk on one feature of FromThePage: subject indexing through wiki-links.  An abbreviated recap of the presentation is embedded below:

On the whole, I think I'd like to go back to the AHA meeting. The conversations and collaborations made the trip worth the expense, and it was gratifying to finally meet the people behind the big transcription projects face-to-face.  I even managed to learn some fascinating stuff about American history.