Tuesday, July 26, 2011

Can a Closed Crowdsourcing Project Succeed?

Last night, the Zooniverse folks announced their latest venture: Ancient Lives, which invites the public to help analyze the Oxyrhynchus Papyri. The transcription tool meets the high standards we now expect from the team who designed Old Weather, but the project immediately stirred some controversy because of its terms of use:

Sean is referring to this section of the copyright statement (technically, not a terms of use), which is re-displayed from the tutorial:
Images may not be copied or offloaded, and the images and their texts may not be published. All digital images of the Oxyrhynchus Papyri are © Imaging Papyri Project, University of Oxford. The papyri themselves are owned by the Egypt Exploration Society, London. All rights reserved.
Future use of the transcriptions may be hinted at a bit on the About page:
The papyri belong to the Egypt Exploration Society and their texts will eventually be published and numbered in Society's Greco-Roman Memoirs series in the volumes entitled The Oxyrhynchus Papyri.
It should be noted that the closed nature of the project is likely a side effect of UK copyright law, not a policy decision by the Zooniverse team. In the US, a scan or transcription of a public domain work is also public domain and not subject to copyright. In the UK, however, scanning an image creates a copyright in the scan, so the upstream providers are automatically able to restrict downstream use of public domain materials. In the case of federated digitization projects, this can create a situation like that of the Old Bailey Online, where different pieces of a seemingly seamless digital database are owned by entirely different institutions.

I will be very interested to see how the Ancient Lives project fares compared to GalaxyZoo's other successes. If the transcriptions are posted and accessible on their own site, users may not care about the legal ownership of the results of their labor. They've already had 100,000 characters transcribed, so perhaps these concerns are irrelevant for most volunteers.


Chris Lintott said...

Thanks for the (very fair) post about our project. The only things I'd like to add are, firstly, that Ancient Lives represents a radical opening of a collection that until now has been very hard to reach, although it obviously falls a long, long way short of being very open. Secondly, any academic use of the results by the project team will, as with all of our projects, give full credit to volunteer transcribers.

It will indeed be interesting to see what happens!

Anonymous said...

I wouldn't call one cranky tweet a "controversy".

It's a very slick and engaging site with a captivating topic, and I don't disagree with tapping the internet's cognitive surplus (the project on which I'm working aims to do the same thing) to transcribe (or categorize or delineate). I only regret that the results aren't more free and open.

Closed crowdsourcing can indeed succeed, Google's Map Maker being a prime example. The understanding that the work you give away to Google will show up eventually in Google Maps and be available to use via the Maps API has to be a factor in the success. There's some tangible return and in the same currency (maps on the web via API).

Still, I'd rather people were instead donating their time and energy to OpenStreetMap.

Anonymous said...

The more relevant part of UK copyright law is that all unpublished works are under copyright until at least 2039 no matter how old they are.

Lots of UK institutions claim that just by making a straightforward mechanical copy of a document they gain a new copyright. I'm not convinced that the letter of the law agrees with them, but no-one can afford to challenge it.

Alexandra Eveleigh said...

Gavin, that's a bit of an exaggeration! 'Most' unpublished works remain in copyright in the UK until 2039 possibly, but certainly not all. There are also some handy regulations whereby many older literary works can in fact be published, despite remaining in copyright, although I agree the law is extremely frustrating when it comes to archival documents.

Other examples of successful 'closed' crowdsourcing projects might be the commercial genealogy ones, such as Ancestry's World Archives Project.

Ben W. Brumfield said...

Thanks for all the comments. One of the things I worry about as I talk with folks in the UK about digitization projects is that the Crown Copyright/non-Bridgeman regime is going to make this kind of opener (but still restricted) use the best we can hope for. As Chris Lintott points out, Ancient Lives is a massive advance in openness despite the restrictions.

To the extent such things are possible, I think that the case for openness should be made to rights-holders. One of the side effects of restricted use is that a project limits its user contributions to only the kinds of activities permitted. A good example of the downside of this may be seen in the comments to the Slashdot post discussing Ancient Lives. There are three separate comment threads bemoaning the copyright on the images (1, 2, 3). What's interesting is that, in addition to general grouching about closed projects, two of the posts discuss computational approaches to assembling and analyzing the papyrus fragments in terms of "what would happen if we put the images through X?" This is a form of public participation which was likely never envisioned, as it's well outside the scope of the Ancient Lives tool, but it's prohibited nonetheless.