Tuesday, January 4, 2011

Progress Report: GitHub, Archive.org Integration, and General Availability

2010 saw big changes in FromThePage.
  • The Balboa Park Online Collaborative started using FromThePage to transcribe the field notes of herpetologist Laurence Klauber. Perian Sully, Rich Cherry, and all the other folks there have been fantastic to work with: full of enthusiasm and new ideas for the system, and patient with the bugs that we've discovered. This is the first institution to install FromThePage, and their needs have driven a lot of development since October, including:
  • Internet Archive integration: As you can see on the Klauber site, FromThePage now integrates directly with books hosted on the Internet Archive. This means that FromThePage gets to use the BookReader (in modified form) with its spiffy zoom and pan capabilities while delegating the expensive work of image hosting to Archive.org. It also reduces duplication of data and may enhance findability of the transcriptions. Best of all, the tedious process of uploading, assembling, and titling page images can be skipped, as FromThePage now imports the book structure and even the OCRed page titles from Archive.org derivative files.
  • As you can see from that last link, I've moved FromThePage to GitHub, released it under the Affero GPL, and written extensive documentation on the wiki. FromThePage is now officially Free software, available for immediate use.
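
For readers curious what importing book structure from a scan metadata file might involve, here is a minimal Python sketch. It is not FromThePage's actual import code: the sample XML, its element and attribute names (`page`, `leafNum`, `pageType`, `pageNumber`), and the `import_book_structure` function are all assumptions made for illustration, and real Archive.org derivative files may be laid out differently.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for a per-book scan metadata file.
# Element and attribute names are assumptions, not a documented format.
SAMPLE_SCANDATA = """\
<book>
  <pageData>
    <page leafNum="0"><pageType>Cover</pageType></page>
    <page leafNum="1"><pageType>Title</pageType></page>
    <page leafNum="2"><pageType>Normal</pageType><pageNumber>1</pageNumber></page>
    <page leafNum="3"><pageType>Normal</pageType></page>
  </pageData>
</book>
"""


def import_book_structure(xml_text):
    """Return a list of {'leaf': int, 'title': str} dicts in leaf order."""
    root = ET.fromstring(xml_text)
    pages = []
    for page in root.iter("page"):
        leaf = int(page.get("leafNum"))
        number = page.findtext("pageNumber")
        page_type = page.findtext("pageType", default="Normal")
        if number:
            # Prefer the printed page number when the metadata has one.
            title = f"Page {number}"
        elif page_type != "Normal":
            # Otherwise fall back to a special page type such as "Cover".
            title = page_type
        else:
            # Last resort: label the page by its position in the scan.
            title = f"Leaf {leaf}"
        pages.append({"leaf": leaf, "title": title})
    return sorted(pages, key=lambda p: p["leaf"])


if __name__ == "__main__":
    for entry in import_book_structure(SAMPLE_SCANDATA):
        print(entry["leaf"], entry["title"])
```

The point of the sketch is only that page order and page titles can come from metadata that already exists, so nobody has to upload, assemble, and title page images by hand.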
If you're interested in hosting a transcription project on FromThePage, drop me a line at benwbrum@gmail.com and I'll help you get started.