Saturday, June 21, 2008

Workflow: flags, tags, or ratings?

Over the past couple of months, I've gotten a lot of user feedback relating to workflow. Paraphrased, the questions include:
  • How do I mark a page "unfinished"? I've typed up half of it and want to return later.
  • How do I see all the pages that need transcription? I don't know where to start!
  • I'm not sure about the names or handwriting in this page. How do I ask someone else to review it?
  • Displaying whether a page has transcription text or not isn't good enough -- how do we know when something is really finished?
  • How do we ask for a proofreader, a tech savvy person to review the links, or someone familiar with the material to double-check names?
In a traditional, institutional setting, this is handled both through formal workflows (transcription assignments, designated reviewers, and researchers) and through informal face-to-face communication. Neither of these mechanisms is available to volunteer-driven online projects.

The folks at THATCamp recommended I get around this limitation by implementing user-driven ratings, similar to those found at online bookstores. Readers could flag pages as needing review, scribes could flag pages in which they need help, and volunteers could browse pages by quality to look for ways to help out. An additional benefit would be the low barrier to user engagement, since just about anyone can click a button when they spot an error.

The next question is what this system should look like. Possible options are:
  1. Rating scale: Add a one-to-five scale of "quality" to each page.
    • Pros: Incredibly simple.
    • Cons: "Quality" is ambiguous. There's no way to differentiate a page needing content review (e.g. "what is this placename?") from a page needing technical help (e.g. "I messed up the subject links"). Low ratings also carry an almost accusatory tone, which can lead to lots of problems in social software.
  2. Flags: Define a set of attributes ("needs review", "unfinished", "inappropriate") for pages and allow users to set or un-set them independently of each other.
    • Pros: Also simple.
    • Cons: Too precise. The flags I can think of wanting may be very different from those another user wants. If I set up a flag-based data model, it's going to be limited by my preconceptions.
  3. Tags: Allow users to create their own labels for a page.
    • Pros: Most flexible, easy to implement via acts_as_taggable or similar Rails plugins.
    • Cons: Difficult to use. Tech-savvy users are comfortable with tags, but they may be a small proportion of my user base. An additional problem may be the use of non-workflow tags. If a page mentions a dramatic episode, why not tag it with that? (Admittedly this may actually be a feature.)
I'm currently leaning towards a combination of tags and flags: implement tags under the hood, but promote a predefined subset of tags to be accessible via a simple checkbox UI. Users could tag pages however they like, and if I see patterns emerge that suggest common use cases, I could promote those tags as well.
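To make that concrete, here's a minimal sketch of the hybrid approach, assuming an acts_as_taggable-style plugin that exposes a tag_list on the model. The PROMOTED_TAGS constant and the helper methods are names I've made up for illustration -- nothing like this exists in FromThePage yet.

```ruby
# Hybrid tags-plus-flags sketch. Assumes an acts_as_taggable-style
# plugin that provides tag_list; PROMOTED_TAGS and the helper methods
# below are hypothetical, not real FromThePage code.
class Page < ActiveRecord::Base
  acts_as_taggable

  # Tags common enough to earn a checkbox in the UI.
  PROMOTED_TAGS = ['unfinished', 'needs review', 'needs technical help']

  # The subset of this page's tags that map to checkboxes.
  def promoted_tags
    tag_list.select { |tag| PROMOTED_TAGS.include?(tag) }
  end

  # Everything else stays free-form.
  def freeform_tags
    tag_list.reject { |tag| PROMOTED_TAGS.include?(tag) }
  end
end
```

The view would then render one checkbox per promoted tag plus a free-form text field for everything else, and promoting a newly popular tag would just mean adding it to the constant.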

Sunday, June 8, 2008

THATCamp Takeaways

I just got back from THATCamp, and man, what a ride! I've never been to a conference with this level of collaboration before -- neither academic nor technical. Literally nobody was "audience" -- I don't think a single person emerged from the conference without having presented in at least one session and pitched in ideas on half a dozen more.

To my astonishment, I ended up demoing FromThePage in two different sessions, and presented a custom how-to on GraphViz in a third. I was really surprised by the technical savvy of the participants -- just about everyone at the sessions I took part in had done development of one form or another. The feedback on FromThePage was far more concrete than I was expecting, and got me past several roadblocks. Since this is a product development blog, here are the takeaways:
  • Zoom: I've looked at Zoomify a few times in the past, but have never been able to get around the fact that their image-preparation software is incompatible with Unix-based server-side processing. Two different people suggested workarounds for this, which may just solve my zoom problems nicely.

  • WYSIWYG: I'd never heard of the Yahoo WYSIWYG before, but a couple of people recommended it as being especially extensible, and appropriate for linking text to subjects. I've looked over it a bit now, and am really impressed.
  • Analysis: One of the big problems I've had with my graphing algorithm is the noise that comes from members of Julia's household. Because they appear in 99 of 100 entries, they're more related to everything, and (worse) show up on relatedness graphs for other subjects as more closely related than the subjects I'm actually looking for. Over the course of the weekend -- preparing my DorkShorts presentation, discussing it, and explaining the noise quandary in FromThePage -- both the problem and its solution came into focus.
    The noise is due to the unit of analysis being a single diary entry, so the solution is to reduce that unit. Many of the THATCampers suggested alternatives: look for related subjects within the same paragraph, within N words of each other, or even (using natural language toolkits) within the same sentence. (I've sketched a windowed version of this after the list below.)
    It might even be possible to do this without requiring markup of each mention of a subject. One possibility is to automate it by searching the entry text for likely mentions of subjects that have already been identified. This search could be informed by previous matches -- the same data I'm using for the autolink feature. (Inspiration for this comes from Andrea Eastman-Mullins' description of how Alexander Street Press is using well-encoded texts to inform searches of unencoded texts.)
  • Autolink: Travis Brown, whose background is in computational linguistics, suggested a few basic tools for making autolink smarter -- namely, permuting the morphology of a word before the autolink feature looks for matches. This would let me clean up the matching algorithm, which currently does some gross things with regular expressions to approach the same goal. (There's a crude morphology sketch after the list as well.)

  • Workflow: The participants at the Crowdsourcing Transcription and Annotation session were deeply sympathetic to the idea that volunteer-driven projects can't use the same kind of double-keyed, centrally organized workflows that institutional transcription projects use. They suggested a number of ways to use flagging and ratings to accomplish the same goals: rather than assigning transcription to A, identification and markup to B, and proofreading to C, a user-driven rating system would allow scribes or viewers to indicate the quality level of a transcribed entry, marking it with ratings like "unfinished", "needs review", "pretty good", or "excellent". I'd add tools to the page list interface to show entries needing review, or ones that were nearly done, to allow volunteers to target the kind of contributions they were making.
    Ratings would also provide a non-threatening way for novice users to contribute.

  • Mapping: Before the map session, I was manually clicking on Google's MyMaps, then embedding a link within subject articles. Now I expect to attach latitude/longitude coordinates to subjects, then generate maps via KML files. I'm still just exploring this functionality, but I feel like I've got a clue now. (See the KML sketch below.)

  • Presentation: The Crowdsourcing session started brainstorming presentation tools for transcriptions. I'd seen a couple of these before, but never really considered them for FromThePage. Since one of my challenges is making the reader experience more visually appealing, it looks like it might be time to explore some of these.
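A few sketches, to pin down the more algorithmic suggestions. First, the windowed co-occurrence idea from the Analysis item, assuming mentions have already been extracted from an entry as (subject, word-offset) pairs. The data shapes and method name are illustrative, not FromThePage's actual schema.

```ruby
# Count subject co-occurrences within a window of N words rather than
# across a whole diary entry. Input: [subject, word_offset] pairs,
# sorted by offset. These shapes are illustrative, not my real schema.
def windowed_cooccurrence(mentions, window = 25)
  counts = Hash.new(0)
  mentions.each_with_index do |(subject, offset), i|
    mentions[(i + 1)..-1].each do |other, other_offset|
      break if other_offset - offset > window  # sorted, so we can stop
      next if other == subject
      counts[[subject, other].sort] += 1       # canonical pair order
    end
  end
  counts
end

# A household member mentioned 100 words away no longer counts:
# windowed_cooccurrence([['Julia', 3], ['Ben', 10], ['cook', 120]])
#   => {["Ben", "Julia"] => 1}
```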
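Second, a crude version of the morphology idea from the Autolink item. A real implementation would lean on a computational-linguistics toolkit, as Travis suggested; these hand-rolled English rules are only a stand-in to show the shape of the approach.

```ruby
# Generate plausible surface forms of a word before matching, instead
# of one hairy regular expression. Hand-rolled rules for illustration;
# a real morphology library would do this properly.
def surface_forms(word)
  forms = [word]
  forms << word + 's' << word + 'es'  # plurals: cow -> cows, dish -> dishes
  forms << word.sub(/y\z/, 'ies')     # berry -> berries
  forms << word + "'s"                # possessive: Julia -> Julia's
  forms << word + 'ed' << word + 'ing'
  forms.uniq
end

def autolink_candidates(text, subject_word)
  escaped = surface_forms(subject_word).map { |form| Regexp.escape(form) }
  text.scan(/\b(?:#{escaped.join('|')})\b/i)
end

# autolink_candidates("Julia's cows strayed again", "cow") => ["cows"]
```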
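Third, generating KML for the Mapping item looks like a few lines with Builder, the XML library Rails already ships with. The Subject attributes here (name, description, latitude, longitude) are assumptions about a schema I haven't built yet.

```ruby
require 'builder'

# Emit KML for every subject that has coordinates attached. The
# Subject attributes (name, description, latitude, longitude) are
# assumed, not an existing FromThePage schema.
def subjects_to_kml(subjects)
  xml = Builder::XmlMarkup.new(:indent => 2)
  xml.instruct!
  xml.kml(:xmlns => 'http://www.opengis.net/kml/2.2') do
    xml.Document do
      subjects.each do |subject|
        xml.Placemark do
          xml.name subject.name
          xml.description subject.description
          # KML wants coordinates as longitude,latitude
          xml.Point { xml.coordinates "#{subject.longitude},#{subject.latitude}" }
        end
      end
    end
  end
end
```

Pointing Google Maps at a URL serving this file should replace the hand-built MyMaps links entirely.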
Every feature on that list was something I'd considered either out of reach, a dead end, or (in one case) entirely impossible.

Thanks to my fellow THATCampers for all the suggestions, corrections, and enthusiasm. Thanks also to THATCamp for letting an uncredentialed amateur working out of his garage attend. I only hope I gave half as much as I got.