Wednesday, January 18, 2012

A Developer Goes to AHA2012

Last Sunday I returned from the 2012 meeting of the American Historical Association.  Although I have attended my share of conferences and unconferences--from Lone Star Ruby Con to Dreamforce and Texas State Historical Association to Museum Computer Network--I'd never attended one of the big mid-year academic conferences before.  The experience was strange but fruitful, and I hope I'll be able to attend again.

Let me start with my superficial impressions.  First, historians dress much better than developers do, though they really don't hold a candle to the art gallery folks.  They also are a more reactive audience, although there is very little back-channel conversation on Twitter -- in fact I was informed that typing on laptops would be considered rude! Finally, they are pretty introverted -- more likely to strike up a conversation with a stranger than your average Rubyist, but not by much.

The conference itself is a bit warped by the fact that many of the attendees are there for the sole purpose of conducting job interviews.  This apparently involves a days-long series of thirty- to ninety-minute interviews designed to figure out which candidates to invite to campus for an on-site interview -- a grueling process for the interviewers and an expensive one for the interviewees.  (The analogous activity in the software world is the phone screen, in which a hiring manager discusses experience and skill-set with a candidate.  Over the phone.)  If most attendees are interviewing, they aren't actually participating in the conference -- I was told that around 12,000 people register, but only 5,000 attend.  This gives AHA a kind of Potemkin village flavor, and it's not unusual to see a panel lecturing to a nearly empty room.  In fact, the last session I attended had five speakers on the podium and only three people in the audience.

Nevertheless, AHA2012 and the associated THATCamp were tremendously productive for me.  There were several opportunities for collaboration, so while I didn't find my dream partner for FromThePage--that institution with a staff of front-end experts and a burning need for transcription software--I did have some really good conversations.  I've been trying to add better support for letters to FromThePage, and Jean Bauer gave me a detailed walk-through of the Project Quincy data model for correspondence.  A lot of people were interested in starting their own crowdsourcing projects, and we've been swapping emails since.  Most importantly, while I was in town I met with the development team behind Scribe and Talk, the open-source tools that power Citizen Science Alliance projects like OldWeather and AncientLives. I'll be posting about that separately.

One of the things that impressed me most about the history world was the potential there is for a programmer to make a big impact. The graduate student I roomed with was an expert with regular expressions--his texts were in Arabic, so the RTL/LTR mix required him to close his eyes as he composed his patterns--but he had no experience with elementary scripting.  In one two-hour hack session, we were able to split a three-hundred-thousand-line medieval biographical dictionary into twenty thousand small files representing individual entries.  With a couple more hours' work, we'd have been able to extract dates, places, names, and other data from these files.  It is a delight for a software engineer to work in a domain where such minimal effort can make such a difference: most of our work deals with obscure edge cases of hard/boring problems, so removing months of tedious manual labor with an hour's worth of programming is incredibly rewarding.
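A splitting job like the one described above can be sketched in a few lines of Python.  This is an illustration only, not the script from that hack session: the entry marker (a line beginning with "###" and a number), the function names, and the output file naming are all assumptions made for the example, and a real dictionary would need a pattern matched to its own entry headings.

```python
import re
from pathlib import Path

# Hypothetical entry marker: assumes each entry starts on a line like "### 123".
# A real text would need a pattern written for its own heading convention.
ENTRY_START = re.compile(r"^###\s*(\d+)")


def split_entries(lines, is_start=ENTRY_START.match):
    """Group lines into entries; a new entry begins wherever is_start matches.

    Any lines that come before the first marker are kept as their own group.
    """
    entries = []
    current = []
    for line in lines:
        if is_start(line) and current:
            entries.append(current)
            current = []
        current.append(line)
    if current:
        entries.append(current)
    return entries


def write_entries(entries, out_dir):
    """Write each entry to its own numbered UTF-8 file and return the paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, entry in enumerate(entries, start=1):
        path = out / f"entry_{number:05d}.txt"
        path.write_text("".join(entry), encoding="utf-8")
        paths.append(path)
    return paths
```

To run it against a file, read the source with `open(path, encoding="utf-8")`, pass the file object to `split_entries`, and hand the result to `write_entries`.  Passing the start-of-entry test in as a function means the same loop works whether entries are marked by a number, a keyword, or some other pattern.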

Crowdsourcing History: Collaborative Transcription and Archives, the panel I presented at, seemed to go well.  Moderator Shane Landrum invited the audience to give 3-minute presentations on their own crowdsourcing projects after the presenters finished their 8-minute talks, then he opened the floor for questions.  Although I was skeptical about this format, it worked very well indeed.  In particular, the Q&A period was blessedly free of the self-promoters who plague events like South by Southwest.  Perhaps this can be attributed to the novel format or perhaps it was due to the inherent civility of academic historians -- all I know is that it succeeded.  I felt very fortunate to be among the panelists, who were a Who's Who of manuscript transcription tools, although a couple of prominent projects were not represented because they were too recent to be included in the proposal.  Because the context was already set by my fellow panelists and because the time was so constrained, I decided to concentrate my own talk on one feature of FromThePage: subject indexing through wiki-links.  An abbreviated recap of the presentation is embedded below:

On the whole, I think I'd like to go back to the AHA meeting. The conversations and collaborations made the trip worth the expense, and it was gratifying to finally meet the people behind the big transcription projects face-to-face.  I even managed to learn some fascinating stuff about American history.


Yvonne Perkins said...

It's great that we are seeing more IT professionals mixing it with historians at THATCamps etc. You are right, there is so much that people with IT skills can help historians with.

It is interesting that some sessions had so few attending them. Those sessions are probably excellent but they are competing with so much else at the same time. Would it be better to have fewer sessions thereby increasing average numbers attending each session? The presenters at poorly attended sessions would probably make better use of their time by instead presenting at their own institution or some other forum in their region. But how do you predict what will be well-attended?

Ben W. Brumfield said...

I really can't say, Yvonne. My previous experience at a history conference was a regional one -- the Texas State Historical Association meeting in 2005. I don't remember any empty rooms in that one, but then again there wasn't that odd interview phenomenon going on either. It's possible that there are incentives that either make the organizers want to pad the number of sessions or make presenters crave panels. If so, those incentives aren't financial, as I'm sure the extra meeting rooms cost the conference money, and I know from experience that presenters still have to pay their registration fees and expenses. Then again, maybe the panels are planned based on the number of registrations, without regard for absenteeism due to interviews.

Despite the creepiness, I'm delighted that there was such an array of choices. I learned some fascinating things about the antebellum American South in the one mostly-empty panel I attended -- compelling stories about facets of life I didn't even know existed. I hope I can follow the work of those scholars, since I've been raving about them ever since.

Francie Diep said...

Hello! I'm a reporter at InnovationNewsDaily and I'd noticed a few interesting crowd-sourced text digitization projects lately. I was so excited to find your blog and especially your notes from MCN 2011. May I set up a time to ask you some questions about these projects? I'm free to talk almost any time today and before 4:30 pm Eastern tomorrow, January 31. Please just email me. Thank you for your consideration.

Francie Diep said...

Hi again, I'm not sure if you're able to access my email through my comment. You can contact me at fdiep at techmedianetwork dot com. Thanks again. -FD