Tuesday, June 15, 2010

Facebook versus Twitter for Crowdsourcing Document Transcription

Last week, I posted a new document to FromThePage and inadvertently conducted a little experiment on how to publicize crowdsourcing. Sometime in the early 1970s, my uncle was on a hunting trip when he was driven into an old, abandoned building by a thunderstorm. While waiting for the weather to moderate, he found a few old documents -- two envelopes bearing Confederate stamps, one of which contained a letter. I photographed this letter and uploaded it to FromThePage while on vacation last week, and that's where the experiment begins.

I use both Facebook and Twitter, but post entirely different material to them. My 347 Facebook friends include most of my high school classmates, several of my friends and classmates from college, much of my extended family, and a few of my friends here in Austin. I mostly post status updates about my personal life, only occasionally sharing links to Julia Brumfield diaries when I read an especially moving passage. My 344 Twitter followers are almost entirely people I've met at or through conferences like THATCamp08 or THATCamp Austin. They consist of academics, librarians, archivists, and programmers -- mostly ones who identify in some way with the "digital humanities" label. I usually tweet about technical/theoretical issues I've encountered in my own DH work. At least a few of my followers even run their own transcription software projects. Given the overlap between interesting content and FromThePage development, I decided to post news of the East Civil War letters to both systems.

My initial tweet/status--posted while I was still cropping the images--got similar responses from both systems. Two people on Twitter and five on Facebook replied, helping me resolve the letter's year to 1862. Here's what I posted on Facebook:

And here's the post on Twitter:

After I got one of the envelopes created in FromThePage, I tested the images out by posting again to Facebook. This update got no response.

The next day, I uploaded the second envelope and letter, then posted to Twitter and Facebook while packing for our return trip.



This time the contrast in responses was striking. I got 3 click-throughs from Twitter in the first three days, and I'm not entirely sure that one of those wasn't me clicking the bit.ly link by accident. While my statistics aren't as good for Facebook click-throughs, there were at least 6 I could identify. More important, however, was the transcription activity -- which is the point of my crowdsourcing project, after all. Within 3 hours of posting the link on Facebook, one very-occasional user had contributed a transcription, and I'd gotten two personal emails requesting reminders of login credentials from other people who wanted to help with the letter.

What accounts for this difference? One possibility is that the archivists and humanists who make up my Twitter following are less likely to get excited about a previously-unpublished Civil War letter -- after all, many of them have their own stacks of unpublished material to transcribe. Another possibility is that the envelope link I posted on Facebook increased people's anticipation and engagement. However, I suspect that the most important difference is that the Facebook link itself was more compelling due to the inclusion of an image of the manuscript page. Images simply attract more attention than dry text, and Facebook's thumbnail service draws potential volunteers in.

My conclusion is that it's worth the effort to build an easy Facebook sharing mechanism into any document crowdsourcing tool, especially if that mechanism provides the scanned document as the image thumbnail.
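
One way such a sharing mechanism might supply the scan as the thumbnail is through Open Graph meta tags, which Facebook reads when it builds a link preview. The sketch below is only an illustration under that assumption, not a description of how FromThePage works: the function name `share_meta_tags`, the page title, and the example.org URLs are all made up for the example.

```python
from html import escape


def share_meta_tags(title: str, page_url: str, image_url: str) -> str:
    """Return Open Graph <meta> tags for one manuscript page.

    Hypothetical helper: the name and arguments are illustrative only.
    """
    properties = {
        "og:type": "article",
        "og:title": title,
        "og:url": page_url,
        # Point the preview image at the scanned page itself.
        "og:image": image_url,
    }
    # Escape values so quotes in a title cannot break the attribute.
    return "\n".join(
        f'<meta property="{name}" content="{escape(value, quote=True)}" />'
        for name, value in properties.items()
    )


if __name__ == "__main__":
    # Placeholder title and URLs, not real FromThePage addresses.
    print(share_meta_tags(
        "Civil War letter, page 1",
        "https://example.org/pages/1",
        "https://example.org/images/page1.jpg",
    ))
```

The generated tags would go in the `<head>` of each transcription page, so that any link a volunteer shares carries the manuscript image along with it.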