<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title type="main">TEI by Example</title>
        <title type="sub">Module 6: Primary Sources</title>
        <author xml:id="RvdB">Ron Van den Branden</author>
        <editor xml:id="EV">Edward Vanhoutte</editor>
        <editor xml:id="MT">Melissa Terras</editor>
        <sponsor>Association for Literary and Linguistic Computing (ALLC)</sponsor>
        <sponsor>Centre for Data, Culture and Society, University of Edinburgh, UK</sponsor> 
        <sponsor>Centre for Digital Humanities (CDH), University College London, UK</sponsor>
        <sponsor>Centre for Computing in the Humanities (CCH), King’s College London, UK</sponsor>
        <sponsor>Centre for Scholarly Editing and Document Studies (CTB), Royal Academy of Dutch Language and Literature, Belgium</sponsor>
        <funder>
          <address>
            <addrLine>Centre for Scholarly Editing and Document Studies (CTB)</addrLine>
            <addrLine>Royal Academy of Dutch Language and Literature</addrLine>
            <addrLine>Koningstraat 18</addrLine>
            <addrLine>9000 Gent</addrLine>
            <addrLine>Belgium</addrLine>
          </address>
          <email>ctb@kantl.be</email>
        </funder>
        <principal>Edward Vanhoutte</principal>
        <principal>Melissa Terras</principal>
      </titleStmt>
      <publicationStmt>
        <publisher>Centre for Scholarly Editing and Document Studies (CTB), Royal Academy of Dutch Language and Literature, Belgium</publisher>
        <distributor>Centre for Scholarly Editing and Document Studies (CTB), Royal Academy of Dutch Language and Literature, Belgium</distributor>
        <pubPlace>Gent</pubPlace>
        <address>
          <addrLine>Centre for Scholarly Editing and Document Studies (CTB)</addrLine>
          <addrLine>Royal Academy of Dutch Language and Literature</addrLine>
          <addrLine>Koningstraat 18</addrLine>
          <addrLine>9000 Gent</addrLine>
          <addrLine>Belgium</addrLine>
        </address>
        <availability status="free">
          <p>Licensed under a <ref target="http://creativecommons.org/licenses/by-sa/3.0/">Creative Commons Attribution ShareAlike 3.0 License</ref>
                    </p>
        </availability>
        <date when="2010-07-09">9 July 2010</date>
      </publicationStmt>
      <seriesStmt>
        <title>TEI by Example.</title>
        <respStmt>
          <name>Edward Vanhoutte</name>
          <resp>editor</resp>
        </respStmt>
        <respStmt>
          <name>Ron Van den Branden</name>
          <resp>editor</resp>
        </respStmt>
        <respStmt>
          <name>Melissa Terras</name>
          <resp>editor</resp>
        </respStmt>
      </seriesStmt>
      <sourceDesc>
        <p>Digitally born</p>
      </sourceDesc>
    </fileDesc>
    <encodingDesc>
      <projectDesc>
        <p>TEI by Example offers a series of freely available online tutorials walking individuals through the different stages in marking up a document in TEI (Text Encoding Initiative). Besides a general introduction to text encoding, step-by-step tutorial modules provide example-based introductions to eight different aspects of electronic text markup for the humanities. Each tutorial module is accompanied by a dedicated examples section, illustrating actual TEI encoding practice with real-life examples. The theory of the tutorial modules can be tested in interactive tests and exercises.</p>
      </projectDesc>
    </encodingDesc>
    <profileDesc>
      <langUsage>
        <language ident="en-GB">en-GB</language>
      </langUsage>
    </profileDesc>
    <revisionDesc>
      <change when="2020-06-15" who="#RvdB">technical revision</change>
      <change when="2010-07-09" who="#RvdB">release</change>
      <change when="2009-11-30" who="#RvdB">corrected typos</change>
      <change when="2009-06-11" who="#RvdB">editing</change>
      <change when="2009-04-27" who="#RvdB">authoring</change>
    </revisionDesc>
  </teiHeader>
  <text xml:id="TBED06v00" type="tutorials">
    <body>
            <div xml:id="editorialInterventions">
        <head>Editorial Interventions</head>
        <div xml:id="supplied">
          <head>Unclear, Supplied, Omitted Text</head>
          <p>Depending on the quality of the source material or the handwriting, transcription of primary source texts may be more or less straightforward. As any further interpretation of an electronic transcription depends on this first interpretative act, it may be desirable for an encoder to indicate places of uncertainty, either to flag them for further inspection or to take intellectual responsibility for the reading. Text for which the reading is uncertain can be encoded in an <gi>unclear</gi> element. The reason for the unclear reading can be stated in a <att>reason</att> attribute, which takes either a single keyword or a whitespace-separated list of keywords. If the legibility is affected by damage, the cause of the damage can be described in the <att>agent</att> attribute. For example, as our previous transcription of the word <q>dioxide</q> as <q>diacxside</q> was quite uncertain, this could be indicated with the <gi>unclear</gi> element:
            <figure xml:id="example18">
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <TEI>
                  <teiHeader>
                    <fileDesc>
                      <titleStmt>
                        <title>There and Back Again: digital edition</title>
                        <author xml:id="HannaRenton">Hanna Renton</author>
                        <editor xml:id="TBE">The TBE crew</editor>
                        <!--...-->
                      </titleStmt>
                      <!--...-->
                    </fileDesc>
                    <!--...-->
                  </teiHeader>
                  <text>
                    <body>
                      <!--...-->
                      <p>
                        <!--...--> di<unclear reason="illegible" resp="#TBE"><subst hand="#HR">
                            <del rend="overwritten">ox</del>
                            <add>acx</add>
                          </subst></unclear>side <!--...-->
                      </p>
                      <!--...-->
                    </body>
                  </text>
                </TEI>
              </egXML>
              <head type="legend">Signalling unclear text with <gi>unclear</gi>.</head>
            </figure>
          </p>
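          <p>As a minimal sketch of the whitespace-separated form of the <att>reason</att> attribute, the following fragment records two reasons for the same uncertain reading. The keywords <val>overwriting</val> and <val>illegible</val> are illustrative assumptions rather than values prescribed by the TEI; a project would normally define and document its own list:
            <figure>
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <!-- both reason keywords are hypothetical, chosen for illustration -->
                di<unclear reason="overwriting illegible" resp="#TBE">acx</unclear>side
              </egXML>
              <head type="legend">A sketch of several keywords in <att>reason</att> on <gi>unclear</gi> (hypothetical values).</head>
            </figure>
          </p>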
          <p>Similarly, if we decide that the damaged dateline at the start of the document can still be deciphered, the uncertain status of this part of the text can be indicated with an <gi>unclear</gi> element:
            <figure xml:id="example19">
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <dateline>
                  <date when="2008-08-26"><damage agent="stapling" hand="#teacher" unit="chars" quantity="3"><unclear>26/</unclear></damage>8/08</date>
                </dateline>
              </egXML>
              <head type="legend">Combining <gi>damage</gi> with <gi>unclear</gi>.</head>
            </figure>
            ...or without the <gi>damage</gi> element, if this is deemed less important to the transcription:
            <figure xml:id="example20">
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <dateline>
                  <date when="2008-08-26"><unclear reason="damage" agent="stapling" unit="chars" quantity="3">26/</unclear>8/08</date>
                </dateline>
              </egXML>
              <head type="legend">Indicating damage in a <att>reason</att> attribute on <gi>unclear</gi>.</head>
            </figure>
          </p>
          <p>Notice, however, that the use of the <gi>unclear</gi> element implies that the text it encloses is still present in the document source, and still legible to some degree. If the encoder considers text too unclear to be transcribed in any way, he or she may opt to omit this part of the text, and indicate this editorial intervention with a <gi>gap</gi> element. This element contains no transcribed text: its sole purpose is to indicate the omission, possibly with a characterisation of the reason (<att>reason</att>) or, if the omission is due to damage, of the cause of that damage (<att>agent</att>). As with the <gi>damage</gi> element, the extent of the omission can be described in a single string with the <att>extent</att> attribute, or more precisely by combining the <att>unit</att> and <att>quantity</att> attributes.</p>
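          <p>The two ways of recording the extent of an omission can be sketched side by side. Both fragments are hypothetical: the reason <val>illegible</val> and the extent values are assumptions chosen for illustration, not readings taken from the sample text:
            <figure>
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <!-- extent given as one descriptive string (hypothetical value) -->
                <gap reason="illegible" extent="3 characters"/>
                <!-- the same extent split into a unit and a quantity (hypothetical values) -->
                <gap reason="illegible" unit="chars" quantity="3"/>
              </egXML>
              <head type="legend">A sketch of <att>extent</att> versus <att>unit</att> and <att>quantity</att> on <gi>gap</gi>.</head>
            </figure>
            The second form is usually easier to process automatically, since the number and the unit can be read separately.
          </p>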
          <p>For example, on page 3 of our sample text, the phrase <q>Yess!! The door!!!</q> is followed by some words that cannot all be deciphered confidently, as they appear to have been erased by the author. When transcribing this passage, we could opt to encode an informed guess and mark it with the <gi>unclear</gi> element, while leaving out the truly illegible words. This omission can be marked with <gi>gap</gi>:
            <figure xml:id="example21">
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                Yess!! The door!!! <unclear reason="erasure" resp="#TBE"><gap unit="cm" quantity="2.5"/> I got out.</unclear>
              </egXML>
              <head type="legend">Omitting illegible text with <gi>gap</gi>.</head>
            </figure>
          </p>
          <p>Similarly, the damage in the dateline could be deemed too destructive for a confident reading of the day, which may motivate the encoder to leave it out. In this case, too, a <gi>gap</gi> element can be used, either with or without a surrounding <gi>damage</gi> element:
            <figure xml:id="example22">
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <dateline>
                  <date when="2008-08-26"><damage agent="stapling" unit="chars" quantity="3"><gap unit="chars" quantity="3"/></damage>8/08</date>
                </dateline>
              </egXML>
              <head type="legend">Omitting illegible text in a damaged region.</head>
            </figure>
          </p>
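          <p>The variant without a surrounding <gi>damage</gi> element might be sketched as follows. This is an illustration only: it assumes that the cause of the damage is recorded directly on <gi>gap</gi> with <att>reason</att> and <att>agent</att>, in the same way as on <gi>unclear</gi> in the earlier dateline example:
            <figure>
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <dateline>
                  <!-- assumption: damage details are moved from damage onto gap itself -->
                  <date when="2008-08-26"><gap reason="damage" agent="stapling" unit="chars" quantity="3"/>8/08</date>
                </dateline>
              </egXML>
              <head type="legend">A sketch of <gi>gap</gi> without a surrounding <gi>damage</gi> element.</head>
            </figure>
          </p>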
          <p>In contrast, the editor may wish to make a stronger intervention, by supplying text that is lacking from or illegible in the document source. This can be done by wrapping the added text in a <gi>supplied</gi> element. The reason for this editorial addition can be given in a <att>reason</att> attribute. For the dateline example, if the text is considered illegible, but the encoder feels able to reconstruct the date, this can result in the following encoding:
            <figure xml:id="example23">
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <dateline>
                  <date when="2008-08-26"><damage agent="stapling" unit="chars" quantity="3"><supplied resp="#TBE">26/</supplied></damage>8/08</date>
                </dateline>
              </egXML>
              <head type="legend">Encoding editorial additions with <gi>supplied</gi>.</head>
            </figure>
          </p>
          <p>The same element allows us to encode other lacking text as well: at the end of page 2, the final words of some lines are incomplete because they were cut off when the document was xeroxed. These can be reconstructed fairly straightforwardly for the transcription. However, these reconstructions are best signalled with the <gi>supplied</gi> element:<note>Notice the crucial difference between the encoding of text added or deleted by the author or editor of the <emph>source document</emph> on the one hand, and by the encoder of the <emph>electronic</emph> transcription on the other hand. Additions or deletions present in the source should be encoded as <gi>add</gi> and <gi>del</gi>, respectively, while text that has been added or omitted by editorial emendation must be encoded as <gi>supplied</gi> or <gi>gap</gi>, respectively.</note>
            <figure xml:id="example24">
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <p>
                  <!--...-->
                  Nothing was happeni<supplied reason="cutoff-while-xeroxing" resp="#TBE">ng.</supplied>  Then, <quote>Hello, dearie</quote>, an old woman answere<supplied reason="cutoff-while-xeroxing" resp="#TBE">d.</supplied> <quote>Who is it?</quote> I hung up. Uh, oh. How do I get back then?</p>
              </egXML>
              <head type="legend">Providing a reason for an editorial addition, with <att>reason</att> on <gi>supplied</gi>.</head>
            </figure>
          </p>
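          <p>The difference between interventions found in the source document and interventions made by the encoder can be made concrete by placing the two kinds of markup next to each other. The following sketch only reuses fragments from the earlier examples; the comments are added for illustration:
            <figure>
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <!-- change made in the source document by its author: del and add -->
                di<subst hand="#HR"><del rend="overwritten">ox</del><add>acx</add></subst>side
                <!-- text lacking in the source and added by the encoder: supplied -->
                happeni<supplied reason="cutoff-while-xeroxing" resp="#TBE">ng.</supplied>
              </egXML>
              <head type="legend">A sketch contrasting authorial changes (<gi>add</gi>, <gi>del</gi>) with an editorial addition (<gi>supplied</gi>).</head>
            </figure>
          </p>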
          <note type="summary">When text in the source document is still partly legible, but needs interpretation in order to be transcribed, this uncertainty can be expressed by enclosing the text in an <gi>unclear</gi> element. If text is deemed totally illegible, it can be omitted from the transcription, and the omission signalled with a <gi>gap</gi> element (which contains no transcribed text). Both elements can indicate the reason for the editorial intervention (<att>reason</att>), and the cause of any damage (<att>agent</att>). An editor wishing to supply text in the electronic transcription for illegible or lacking text in the source text can encode this supplied text with the <gi>supplied</gi> element. The reason for this intervention can be stated in a <att>reason</att> attribute.</note>
        </div>
        <div xml:id="corrections">
          <head>Corrections</head>
          <p>If we look back at the comparison between the facsimiles (see <ptr type="crossref" target="#figure1"/>) and the initial transcription (see <ptr type="crossref" target="#example1"/>), we notice that many words have been silently corrected by the transcriber. Although some errors had been corrected by the teacher (who can be considered an editor or corrector of the source document), many have slipped through. Depending on the aim of the transcription, such apparent errors may be transcribed as they stand, corrected silently, marked explicitly, or corrected explicitly. All of these practices are perfectly legitimate as long as they are applied consistently and explained in the <gi>editorialDecl</gi> element of the electronic document’s header. An encoder adhering to a more explicit practice will want at least to signal apparent errors, editorial corrections, or both. The TEI provides specific elements for this purpose: <gi>sic</gi>, for indicating apparent errors, and <gi>corr</gi>, for indicating editorial corrections. For example, the sentence <q>Now I know what all the black air is: all the polution.</q> on page 2 could be transcribed as follows:
            <figure xml:id="example25">
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                Now I know what all the black air is: all the <sic>polution</sic>.
              </egXML>
              <head type="legend">Encoding an apparent error with <gi>sic</gi>.</head>
            </figure>
            ...if the encoder is interested in transcribing the source text as accurately as possible, or:
            <figure xml:id="example26">
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                Now I know what all the black air is: all the <corr>pollution</corr>.
              </egXML>
              <head type="legend">Encoding the correction of an apparent error with <gi>corr</gi>.</head>              
            </figure>
            ...if the content matters most (perhaps to ease searching operations in a digital edition of the text). However, both <soCalled>views</soCalled> on the text can be combined in a <gi>choice</gi> element.<note>Again, a crucial distinction must be pointed out between encoding of corrections present in the <emph>source document</emph>, and editorial corrections in the <emph>electronic transcription</emph>. The latter must always be encoded using <gi>sic</gi> and <gi>corr</gi>, possibly wrapped in a <gi>choice</gi> element. Corrections present in the source document must be encoded using combinations of <gi>del</gi> and <gi>add</gi>, possibly grouped in a <gi>subst</gi> element, and preferably specified with attributes identifying the responsible document hand (<att>hand</att>), and the editor responsible for this identification (<att>resp</att>).</note> This enables an encoder to express alternative encodings of the same text. Both views could thus be combined as:
            <figure xml:id="example27">
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                Now I know what all the black air is: all the <choice>
                  <sic>polution</sic>
                  <corr>pollution</corr>
                </choice>.
              </egXML>
              <head type="legend">Combining errors and corrections in <gi>choice</gi>.</head>
            </figure>
          </p>
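          <p>A statement of the chosen correction policy in the header might be sketched as follows. The wording of the paragraph is an assumption made for illustration; the <att>method</att> value <val>markup</val> on <gi>correction</gi> signals that corrections are represented with markup rather than made silently:
            <figure>
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <encodingDesc>
                  <editorialDecl>
                    <!-- the policy text below is hypothetical, not the actual policy of this edition -->
                    <correction method="markup">
                      <p>Apparent errors in the source are transcribed as they stand and marked with sic; editorial corrections are given in corr, and both are grouped in choice.</p>
                    </correction>
                  </editorialDecl>
                </encodingDesc>
              </egXML>
              <head type="legend">A sketch of a correction policy in <gi>editorialDecl</gi> (hypothetical wording).</head>
            </figure>
          </p>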
          <p>The <gi>sic</gi> element may contain all elements that are necessary to represent the original source text, like deletions, damage, and so on. The <q>diacxside</q> fragment on page 2 can thus be corrected as follows:
          <figure xml:id="example28">
            <egXML xmlns="http://www.tei-c.org/ns/Examples">
              <choice>
                <sic>di<subst resp="#TBE">
                    <del hand="#HR" rend="overwritten">ox</del>
                    <add hand="#HR">acx</add>
                  </subst>side</sic>
                <corr>dioxide</corr>
              </choice>
            </egXML>
            <head type="legend">Encoding authorial phenomena inside <gi>sic</gi>.</head>
          </figure>
          </p>
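          <p>Responsibility for an editorial correction can be recorded in the same way as for the other interventions in this section. As a sketch, assuming that the correction was made by the editor identified as <val>#TBE</val> in the earlier header example, a <att>resp</att> attribute could be added to <gi>corr</gi>:
            <figure>
              <egXML xmlns="http://www.tei-c.org/ns/Examples">
                <!-- assumption: #TBE points to the editor declared in the header of the earlier example -->
                Now I know what all the black air is: all the <choice>
                  <sic>polution</sic>
                  <corr resp="#TBE">pollution</corr>
                </choice>.
              </egXML>
              <head type="legend">A sketch of <att>resp</att> on <gi>corr</gi>.</head>
            </figure>
          </p>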
          <note type="summary">Apparent errors in the source text may be indicated explicitly in a <gi>sic</gi> element, or corrected with a <gi>corr</gi> element. Both the original and the correction can be included in the transcription, if they are wrapped in a <gi>choice</gi> element.</note>
        </div>
      </div>
        </body>
  </text>
  <!-- 
        $Date: 2020-07-08 02:33:20 +0200 (Wed, 08 Jul 2020) $
        $Id: TBED06v00.xml 425 2020-07-08 00:33:20Z ron.vandenbranden $  -->
</TEI>