Introduction

Agbe-Davies: Manuscript accounts from plantations come in a variety of forms: accounts of plantation residents (enslaved and free) as they frequented local stores; records of the daily expenditures and income realized by slave owners as a direct result of their human property; and accounts tracking economic exchanges between plantation owners and the laborers on whom they depended for their livelihood. The data recorded in these sources present an unparalleled opportunity for scholarly analysis of the economic and social structures that characterized the plantation for people throughout its hierarchy.
The properties of these manuscripts are simultaneously the source of their richness and the font of many challenges. The average American--for all we think we know about our recent plantation past--has little idea of the economic underpinnings of that regime and likewise little sense of how individual men and women may have navigated it. The idea that enslaved people engaged in commercial transactions, were consumers, at the same time that they were treated as chattel property, runs counter to our understanding of what slavery meant and how it was experienced.
Primary documents that challenge these deeply held beliefs are therefore an important resource, not only for researchers but for the general public as well. We have set out to develop a mechanism that delivers these resources to a wider public and enables their participation in transcription, making these sources readable by machines and by people not well versed in 18th- and 19th-century handwriting.
Just to review, for those of you who were not present for our paper in Regensburg: neither of us is a historian. I am an archaeologist and, as is usual in the US, also an anthropologist. I came to texts such as the “Slave Ledger” discussed throughout this presentation with a straightforward question: what were enslaved people buying in 19th-century North Carolina? In this sense, the store records complement the archaeological record, which is my primary interest. Clearly, however, these texts have additional meanings and potential for addressing much more than material culture and consumption. This is exciting for the anthropologist in me. Ben is editing the account books of Jeremiah White Graves, a ledger and miscellany from a Virginia tobacco plantation. We are collaborating to extend the capabilities of Ben’s online transcription tool FromThePage, to unleash the full analytical possibilities embodied in financial records. This paper follows up on our previous contribution by showing how the new version of FromThePage meets the challenges that we outlined in October.
Aims and Problems
In pilot studies I had done before adopting FromThePage, participants cited the need to have the manuscript being transcribed visible at the same time as the transcription window.
One of the problems with transcription is how to treat variations in terminology and orthography. This issue was discussed at the last MEDEA meeting as a difference between historical and linguistic content.
The goal of wiki encoding is quick and easy data entry. What that means is that where possible, users are typing in plain text. Now this is a compromise. It is a compromise between presentational mark-up and semantic mark-up. But fundamentally all editions are compromises. [inaudible]
If the user encounters a line break in the original text, they hit carriage return. That encodes a line-break. For a paragraph break, they encode a blank line. So you end up with something very similar to the old-fashioned typographic facsimile in your transcript.
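The convention described above can be sketched in a few lines. This is an illustrative rendering of the idea only, not FromThePage's actual implementation; the `<p>` and `<lb/>` output elements are my assumption about one plausible target:

```python
def encode_breaks(transcript):
    """Sketch of the convention: a newline in the plain-text transcript
    becomes a line break (<lb/>), and a blank line starts a new paragraph.
    Illustrative only, not FromThePage's actual code."""
    # A blank line separates paragraphs.
    paragraphs = [p for p in transcript.split("\n\n") if p.strip()]
    out = []
    for p in paragraphs:
        # Each remaining newline inside a paragraph marks a line break.
        out.append("<p>" + "<lb/>".join(p.split("\n")) + "</p>")
    return "".join(out)

print(encode_breaks("line one\nline two\n\nnew paragraph"))
# -> <p>line one<lb/>line two</p><p>new paragraph</p>
```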
That's not enough, however -- you have to have some explicit mark-up. That's where we've added lightweight encoding. Most wiki systems support it, but the feature we use most prominently is wikilinks, which are backed by a relational database that records all of the encoding.
In this case, "gunflint" shows up in the category of "arms".
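As a minimal sketch of how such wikilink encoding can be read out of a transcript (assuming the common `[[Canonical Subject|display text]]` wiki syntax; the sample line and subjects are hypothetical), a regular expression suffices:

```python
import re

# Matches [[Canonical Subject|display text]] and the short form [[Subject]].
WIKILINK = re.compile(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]")

def extract_links(text):
    """Return (canonical, display) pairs for every wikilink in `text`."""
    links = []
    for match in WIKILINK.finditer(text):
        canonical = match.group(1).strip()
        # The short form [[Subject]] uses the canonical name as display text.
        display = (match.group(2) or canonical).strip()
        links.append((canonical, display))
    return links

line = "1 [[Gunflint|gunflint]] .10 for [[Joseph A Whitehead|Jos. A Whitehead]]"
print(extract_links(line))
```

Each canonical subject could then be looked up in a category table -- e.g. "Gunflint" mapping to "arms" -- to drive the grouping described above.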
The TEI exporter generates that text with a reference string and a line break. It also generates the personography entry for Joseph A Whitehead from the wiki-link connecting "Jos. A Whitehead" to the canonical name.
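The mapping from a wikilink to TEI output can be sketched as follows. The element structure and the `xml:id` derivation are my assumptions about one reasonable TEI shape, not the exporter's actual output:

```python
import xml.etree.ElementTree as ET

def person_id(canonical):
    # Derive a hypothetical xml:id from the canonical name.
    return canonical.lower().replace(" ", "_").replace(".", "")

def tei_ref(display, canonical):
    """Inline reference: the display text wrapped in a persName pointing
    at the personography entry (illustrative structure only)."""
    el = ET.Element("persName", ref="#" + person_id(canonical))
    el.text = display
    return ET.tostring(el, encoding="unicode")

def tei_person(canonical):
    """Personography entry generated from the canonical name."""
    person = ET.Element("person")
    person.set("xml:id", person_id(canonical))
    name = ET.SubElement(person, "persName")
    name.text = canonical
    return ET.tostring(person, encoding="unicode")

print(tei_ref("Jos. A Whitehead", "Joseph A Whitehead"))
print(tei_person("Joseph A Whitehead"))
```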
Here's an example from the Stagville Account books that Anna has encoded, which shows off the mark-up which we developed for this project. And I have to say that this is "hot code" -- this was developed in February, so we are nowhere near done with it yet.
We needed to come up with some way to encode these tabular records in a semantically meaningful way and to render them usefully. We chose the Markdown sub-flavor of wiki-markup to create this format, which looks vaguely tabular.
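Since the slide itself is not reproduced here, a Markdown-style pipe table of the general kind described might look like this (the account name, column headers, and values are all hypothetical, not taken from the Stagville books):

```markdown
Account of Frederick

| Date   | Item     | Amount |
|--------|----------|--------|
| May 1  | gunflint | .10    |
| May 12 | tobacco  | .25    |
```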
We do something pretty simple: we look at the heading, and if something appears under the same heading in more than one table, we put it in the same column of the spreadsheet we generate. That means some rows will have blank cells, because their tables had different headers. Filtering should allow you to put that together. Here you see Mrs. Henry's Abram's account and Frederick's account, so you can filter those and say you just want to study Frederick's account, or just those two accounts together.
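The header-based alignment just described can be sketched as below. This is a minimal illustration of the idea, not the production exporter; the column names and values are hypothetical, though Abram and Frederick are the accounts mentioned above:

```python
def merge_tables(tables):
    """Align rows from several tables into one grid keyed by shared
    column headers; cells are blank where a table lacked that header.

    `tables` is a list of (headers, rows) pairs, where each row is a
    list of cells parallel to that table's headers."""
    # Union of all headers, preserving first-seen order.
    columns = []
    for headers, _ in tables:
        for h in headers:
            if h not in columns:
                columns.append(h)
    merged = [columns]
    for headers, rows in tables:
        index = {h: i for i, h in enumerate(headers)}
        for row in rows:
            # Blank cell when this table has no such column.
            merged.append([row[index[c]] if c in index else "" for c in columns])
    return merged

abram = (["Date", "Tobacco"], [["1851-05-01", "120 lbs"]])
frederick = (["Date", "Cash"], [["1851-05-02", "$0.50"]])
for line in merge_tables([abram, frederick]):
    print(line)
```

Because every row carries the shared "Date" column, filtering the merged grid down to one account, or to two accounts side by side, is then straightforward.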
We also have the ability to link back to the original page, so that you can get from the spreadsheet back to the textual edition.
- We need to work on encoding dates that are useful for analysis.
- We need to figure out how to integrate subjects and tables in ways that can be used analytically.
- We need to add this table support to our TEI exports.
With that, I turn it back to Anna.
Conclusion

Agbe-Davies: We have conceptualized usability in terms of both process and product. Because FromThePage is designed to facilitate crowdsourced transcription of manuscript accounts, the functionality of the input process is as important as the form the output will take. The resulting transcription will be far more readable for nonspecialist users, while at the same time allowing researchers to perform quantitative analyses on data that would otherwise be inaccessible.
Each of these audiences can contribute to the development of these datasets and use them in creative ways. In my association with Stagville State Historic Site, I have the opportunity to share research findings with the general public, and they are eager to explore these data for themselves and turn them to their own purposes. Teachers can use this material in their classes. Site interpreters and curators can enrich their museums’ content with it. History enthusiasts can get a sense of the primary data that underlies historical scholarship. Researchers can manipulate and examine transcriptions in ways that are both quantitative and qualitative. In a recent paper on crowdsourcing as it applies to archaeological research, a colleague and I wrote, “[there is an] urgent need for access to comparative data and information technology infrastructure so that we may produce increasingly synthetic research. We encourage increased attention not only to the ways that technology permits new kinds of analyses, but also to the ways that we can use it to improve access to scattered datasets and bring more hands to take up these challenges.” A similar argument can be made for the modeling of historic accounts.