Friday, April 20, 2007

Risks: Why trust me with your stuff?

Why should someone trust me with their data? Why should someone trust me with the fruits of their effort? By "trust", I'm not talking about the possibility that I'll misuse the transcriptions -- I presume that someone's using the software in order to distribute their transcriptions more widely, so keeping it all a secret isn't that big a deal. Rather, I'm talking about the possibility that the computers the software is running on take a dive and never return. Or that the computers keep running but Julia ends up turning into a stagnant project riddled with comment spam the content creators are powerless to fight.

The only answer I can give is that the software must make it unnecessary for a work owner or scribe to trust in the future of that service. I can do this by making sure that the work owner can get their data back out. So at any time, any content authored on the site can be exported in a lossless format -- including transcription source, intermediate transcription documents, annotations, and the RDBMS structure that links it all. (This list does not include the original images, since I presume they have an independent existence on the machines of whoever uploaded them originally, and transferring them would be costly.) If the software itself is open-source, the export could be loaded on another server with no dependencies on my own technical or personal reliability.

So in addition to features for transcription and printing, I need a full export feature.
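
As a rough illustration of the "lossless export" idea above, here is a minimal sketch that dumps every table of a relational database, schema included, to JSON and loads it back into an empty database. Everything specific in it is an assumption made for the example: the use of SQLite, the function names export_database and import_database, the works and pages tables, and the sample rows. It also assumes the columns hold only text and numbers, since binary data would need extra encoding before it could go into JSON.

```python
import json
import sqlite3


def export_database(conn):
    """Return each table's CREATE statement and rows as a JSON-friendly dict."""
    conn.row_factory = sqlite3.Row
    tables = conn.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    dump = {}
    for table in tables:
        name = table["name"]
        rows = conn.execute(f'SELECT * FROM "{name}"').fetchall()
        dump[name] = {
            "schema": table["sql"],  # keeps the structure that links the data
            "rows": [dict(row) for row in rows],
        }
    return dump


def import_database(conn, dump):
    """Recreate the tables and rows from a dump in an empty database."""
    for name, table in dump.items():
        conn.execute(table["schema"])
        for row in table["rows"]:
            columns = ", ".join(f'"{column}"' for column in row)
            placeholders = ", ".join("?" for _ in row)
            conn.execute(
                f'INSERT INTO "{name}" ({columns}) VALUES ({placeholders})',
                list(row.values()),
            )
    conn.commit()


# Hypothetical schema and sample data, invented purely for this example.
source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE works (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE pages (
        id INTEGER PRIMARY KEY,
        work_id INTEGER REFERENCES works(id),
        image_name TEXT,
        transcription_source TEXT
    );
    INSERT INTO works VALUES (1, 'Sample diary');
    INSERT INTO pages VALUES (1, 1, 'page_001.jpg', 'Cold and windy today.');
""")

# Export to a JSON string, then load it on a completely separate "server".
archive = json.dumps(export_database(source), indent=2)
target = sqlite3.connect(":memory:")
import_database(target, json.loads(archive))
```

The export here records image names but not the image files themselves, matching the trade-off described above. A real export would also have to cover whatever lives outside the database, such as intermediate transcription documents stored on disk.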


Sara said...

I think you should rethink the images -- at a minimum you should have the image names (which is implied in the RDBMS data export). At a maximum, you give them a way to pull the images down with the data -- perhaps a nice little package that could be imported in one easy step into another system running your software.

Remind me to share with you the section in 37signals' "Getting Real" on this very topic.

Ben W. Brumfield said...

Assuming that the project is open-source, the ideal export would include everything necessary to re-create the site (including user accounts and edit histories) somewhere else.

I'm running into this with Horizon: as we shut it down, we'll want to take a backup and put it somewhere. That "somewhere" sort of necessitates blogging software -- that or we end up doing a full site-scrape.