Friday 5 February 2016

Jackson Ossea on the Oxford Text Archive's (OTA) use of XML markup




           At first glance, there is nothing particularly distinctive about the Oxford Text Archive (OTA). It is a digital library whose collection consists primarily of digitized English-language humanities documents and manuscripts, such as the works of Shakespeare or the collected poems of Milton. There is nothing in its catalogue that couldn’t be found on other, much more popular electronic resources such as the Internet Archive or Project Gutenberg. These items are available in plain text, EPUB, and Mobi (Kindle) formats so that they can be read on a range of devices.
           
           The difference is that, partly because of its cooperation with the Text Encoding Initiative (TEI), the OTA also makes its collections available in their skeletal XML form. For many readers, this addition would not mean a great deal. They are likely interested only in the content of the text itself, not in the process that brought it from a manuscript in a temperature-controlled library room to the much more accessible form now available on their laptop, tablet or phone.
           
           But allowing this access gives students and researchers the opportunity to scrutinize these texts more objectively. Even the best digitisations of classic texts rely on the work of those who apply the markup, in the same way that an English version of Plato relies on a translator. Both are concerned with communicating the idiosyncrasies of a work to readers who are not fluent in the language and form in which it was created. Making the markup available allows researchers to judge whether the decisions made by the OTA encoders agree with their own interpretation of the text, and even to question their own assumptions in light of the tags the institution chose.
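
           As a rough illustration of what this kind of scrutiny could look like in practice, the sketch below uses Python's standard xml.etree.ElementTree module to count which elements an encoder used in a small TEI-style fragment. The fragment is a hypothetical example written for this post, not a text taken from the OTA, and it leaves out the header a complete TEI document would carry. The function name tag_counts is likewise just an illustrative choice.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical TEI-style fragment (not an actual OTA text).
# <lg> marks a line group, <l> a verse line, <hi> highlighted text.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <lg type="stanza">
        <l>First line of a <hi rend="italic">hypothetical</hi> poem,</l>
        <l>second line of the same stanza.</l>
      </lg>
    </body>
  </text>
</TEI>"""


def tag_counts(xml_text):
    """Count how often each element name appears in an XML string."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for element in root.iter():
        # ElementTree reports namespaced tags as "{namespace}name";
        # keep only the local name after the closing brace.
        local_name = element.tag.split("}", 1)[-1]
        counts[local_name] += 1
    return counts


counts = tag_counts(SAMPLE)
for name, n in sorted(counts.items()):
    print(name, n)
```

           A tally like this only shows which tags were chosen and how often. Deciding whether, say, a passage marked with hi should have been encoded differently is still an interpretive question for the reader.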
           
            The OTA’s website indicates the principles it adheres to in its use of XML, but not in any significant detail. It makes no attempt to clarify whether it finds descriptive or presentational markup more suitable to its needs. All it states is that it follows the TEI guidelines for markup. It does not summarise those guidelines, assuming that users will either already be familiar with them or be perfectly capable of looking them up for themselves.

            By letting users view the markup of its texts, the OTA gives them the opportunity to see the decisions its encoders made. Users can then be more critical of the text rather than passively accepting another scholar’s choices about the markup. They can see those choices for themselves and decide whether this was the most useful way to communicate the content in marked-up form.
