Friday 5 February 2016

TEI in the Wild: Mark Twain Project Online (MTPO) - Madiha Zahra Choksi

The Mark Twain Project Online (in collaboration with the California Digital Library) is an example of a digital project that not only puts TEI to good use but also shares its behind-the-scenes process and decision-making. Though I could not find secondary publications on its methods and challenges, the MTPO website includes a detailed “Technical Summary” with extensive information on TEI encoding. It should be said, however, that unlike other archives that use XML encoding and TEI (e.g. The Walt Whitman Archive), the Mark Twain Project does not provide links to the XML source for each individual entry.

The "TEI Encoding" subsection begins by defining TEI encoding and providing a hyperlink to the TEI website for further reading. The following paragraphs describe the MTPO’s method, schema, and challenges.

For the purposes of this project, the MTPO takes critical editions in print form (e.g. letters) and manually transcribes them into TEI-XML documents. Unfortunately, manual transcription is prone to unintentional errors during detailed data entry. To limit these errors, the MTPO established a set of guidelines for encoding challenging passages of text. According to the Technical Summary, the XML encoding process makes use of the Microsoft Office XML schema and its default conversion of data into XML. The MTPO even names the tools it uses to transform the data: Perl and Saxon, an XSLT processor written in Java.
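To give a sense of what such a transcription looks like and how it can be read by a program, here is a minimal sketch in Python. The letter text is invented for illustration and is not an MTPO transcription. The element names (`TEI.2`, `teiHeader`, `opener`, `del`, `add`) are standard TEI P4 elements, but the fragment is simplified, not schema-valid, and does not reproduce the MTPO's local modifications.

```python
import xml.etree.ElementTree as ET

# Hypothetical TEI P4-style fragment. The letter text is invented,
# and the markup is simplified (not a schema-valid MTPO document).
SAMPLE = """<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter (invented)</title></titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div type="letter">
        <opener><dateline>Hartford, 1 Jan.</dateline></opener>
        <p>I have <del>finish</del><add>finished</add> the chapter.</p>
      </div>
    </body>
  </text>
</TEI.2>"""


def reading_text(elem):
    """Return the text of elem, skipping cancelled text inside <del>."""
    if elem.tag == "del":
        return ""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(reading_text(child))
        # Text that follows a child element belongs to the parent.
        parts.append(child.tail or "")
    return "".join(parts)


root = ET.fromstring(SAMPLE)
title = root.findtext("teiHeader/fileDesc/titleStmt/title")
paragraphs = [reading_text(p) for p in root.iter("p")]
deletions = [d.text for d in root.iter("del")]

print(title)
print(paragraphs)
print(deletions)
```

Because the deletion and the addition are marked up rather than silently resolved, the same file can yield both a clean reading text and a record of what the writer crossed out.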

Furthermore, the MTPO openly shares that it implements TEI P4 with some local modifications. These modifications either extend the available meanings and attributes or allow multiple attributes to be classified together. In other words, they add detail that the standard scheme does not capture and are made to meet the project's particular needs, with the aim of giving users the best experience possible.

Figure 1, from the “TEI Encoding” section on the MTPO webpage.

After browsing through the MTPO, I was wowed by the open and intuitive access to the breadth of these texts, all supplemented with extensive notes. The most prominent features made possible by XML and TEI are the indexed search (which uses the eXtensible Text Framework software), the customizable MTPO interface (providing some solace to your eyes), built-in comparisons with facsimiles and originals (if available), citations, embedded links to documents, historical dates, images, and more. Not only do these features give a Mark Twain enthusiast (like myself) free and unlimited access to rare letters and papers, but they also enrich my experience of both reading and researching them.
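As a rough illustration of what "indexed search" means in general, here is a toy inverted index in Python. This is not how the eXtensible Text Framework is implemented. The letter snippets and ids are invented, and the function names `build_index` and `search` are hypothetical.

```python
import re
from collections import defaultdict

# Invented snippets standing in for transcribed letters; the ids are hypothetical.
LETTERS = {
    "letter-1": "The river was high and the boat was late.",
    "letter-2": "I wrote all morning by the river.",
}


def build_index(docs):
    """Map each lowercased word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z']+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every word in the query."""
    words = re.findall(r"[a-z']+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


index = build_index(LETTERS)
print(search(index, "river"))
print(search(index, "river boat"))
```

The index is built once, so each query only looks up words rather than rereading every letter, which is what makes searching a large collection feel instant.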


 "Mark Twain Project: About MTPO." Mark Twain Project. Web. 03 Feb. 2016. 
