<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:lang="en-GB">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title type="main">TEI by Example</title>
        <title type="sub">Module 0: Introduction to Text Encoding and the TEI</title>
        <author xml:id="EV">Edward Vanhoutte</author>
        <editor xml:id="RvdB">Ron Van den Branden</editor>
        <editor xml:id="MT">Melissa Terras</editor>
        <sponsor>Association for Literary and Linguistic Computing (ALLC)</sponsor>
        <sponsor>Centre for Data, Culture and Society, University of Edinburgh, UK</sponsor>
        <sponsor>Centre for Digital Humanities (CDH), University College London, UK</sponsor>
        <sponsor>Centre for Computing in the Humanities (CCH), King’s College London, UK</sponsor>
        <sponsor>Centre for Scholarly Editing and Document Studies (CTB), Royal Academy of Dutch Language and Literature, Belgium</sponsor>
        <funder>
          <address>
            <addrLine>Centre for Scholarly Editing and Document Studies (CTB)</addrLine>
            <addrLine>Royal Academy of Dutch Language and Literature</addrLine>
            <addrLine>Koningstraat 18</addrLine>
            <addrLine>9000 Gent</addrLine>
            <addrLine>Belgium</addrLine>
          </address>
          <email>ctb@kantl.be</email>
        </funder>
        <principal>Edward Vanhoutte</principal>
        <principal>Melissa Terras</principal>
      </titleStmt>
      <publicationStmt>
        <publisher>Centre for Scholarly Editing and Document Studies (CTB), Royal Academy of Dutch Language and Literature, Belgium</publisher>
        <distributor>Centre for Scholarly Editing and Document Studies (CTB), Royal Academy of Dutch Language and Literature, Belgium</distributor>
        <pubPlace>Gent</pubPlace>
        <address>
          <addrLine>Centre for Scholarly Editing and Document Studies (CTB)</addrLine>
          <addrLine>Royal Academy of Dutch Language and Literature</addrLine>
          <addrLine>Koningstraat 18</addrLine>
          <addrLine>9000 Gent</addrLine>
          <addrLine>Belgium</addrLine>
        </address>
        <availability status="free">
          <p>Licensed under a <ref target="http://creativecommons.org/licenses/by-sa/3.0/">Creative Commons Attribution ShareAlike 3.0 License</ref>.</p>
        </availability>
        <date when="2010-07-09">9 July 2010</date>
      </publicationStmt>
      <seriesStmt>
        <title>TEI by Example.</title>
        <respStmt>
          <name>Edward Vanhoutte</name>
          <resp>editor</resp>
        </respStmt>
        <respStmt>
          <name>Ron Van den Branden</name>
          <resp>editor</resp>
        </respStmt>
        <respStmt>
          <name>Melissa Terras</name>
          <resp>editor</resp>
        </respStmt>
      </seriesStmt>
      <sourceDesc>
        <p>Digitally born</p>
      </sourceDesc>
    </fileDesc>
    <encodingDesc>
      <projectDesc>
        <p>TEI by Example offers a series of freely available online tutorials that walk individuals through the different stages of marking up a document in TEI (Text Encoding Initiative). Besides a general introduction to text encoding, step-by-step tutorial modules provide example-based introductions to eight different aspects of electronic text markup for the humanities. Each tutorial module is accompanied by a dedicated examples section, illustrating actual TEI encoding practice with real-life examples. The theory of the tutorial modules can be tested in interactive tests and exercises.</p>
      </projectDesc>
    </encodingDesc>
    <profileDesc>
      <langUsage>
        <language ident="en-GB">en-GB</language>
      </langUsage>
    </profileDesc>
    <revisionDesc>
      <change when="2020-06-23" who="#RvdB">technical revision</change>
      <change when="2010-07-23" who="#RvdB">fixed broken link and (example) character encoding</change>
      <change when="2010-07-13" who="#RvdB">
                <list>
                    <item>added distinction <gi>gi</gi> — <tag>gi scheme="..."</tag> — <gi>tag</gi>
                    </item>
        <item>final spellcheck</item>
                </list>
            </change>
      <change when="2010-07-08" who="#RvdB">release</change>
      <change when="2009-12-16" who="#EV">Added documentation on how to associate entity declarations with a document instance under 4.2.3.</change>
      <change when="2009-11-20" who="#EV">
                <list>
                    <item>Added new section 4. XML ground rules: to be finished</item>
                    <item>Added new section 5.3 Using TEI: to be revised</item>
                </list>
            </change>
      <change when="2009-06-11" who="#RvdB">reshuffled modules: TBED01v00 has become TBED00v00; updated TBED00v00.xml</change>
      <change when="2009-09-10" who="#EV">Revision</change>
      <change when="2008-02-19" who="#EV">XML-izing text</change>
    </revisionDesc>
  </teiHeader>
  <text xml:id="TBED00v00" type="tutorials">
    <body>
            <div xml:id="textencoding">
        <head>Text Encoding in the Humanities</head>
        <p>Since the earliest uses of computers and computational techniques in the humanities at the end of the 1940s, scholars, projects, and research groups have had to look for systems that could provide representations of data which the computer could process. Computers, as Michael Sperberg-McQueen has reminded us, are binary machines that <quote source="#quoteref1">can contain and operate on patterns of electronic charges, but they cannot contain numbers, which are abstract mathematical objects not electronic charges, nor texts, which are complex, abstract cultural and linguistic objects.</quote> (<ref xml:id="quoteref1" type="bibl" target="#msmq1991">Sperberg-McQueen 1991, 34</ref>). This is clearly seen in the mechanics of early input devices such as punched cards, where a hole at a certain coordinate actually meant a 1 or 0 (true or false) for the character or numeral represented by this coordinate, according to the specific character set of the computer used. Because different computers used different character sets with different numbers of characters, texts first had to be transcribed into the character set of the machine at hand. All characters, punctuation marks, diacritics, and significant changes of type style had to be encoded with an inadequate budget of characters. This resulted in a complex system of <soCalled>flags</soCalled> for distinguishing upper-case and lower-case letters, for coding accented characters, and for marking the start of a new chapter, paragraph, sentence, or word. These <soCalled>flags</soCalled> were also used for adding analytical information to the text, such as word classes and morphological, syntactic, and lexical information. Ideally, each project used its own set of conventions consistently throughout. Since this set of conventions was usually designed on the basis of an analysis of the textual material to be transcribed into machine-readable text, a different corpus of textual material might well require a different set of conventions. The design of these sets of conventions was also heavily dependent on the nature and infrastructure of the project, such as the type of computers, software, and devices (for instance, magnetic tapes of a certain kind) that were available.</p>
        <p>Although several projects were able to produce meaningful scholarly results with this internally consistent approach, the particular nature of each set of conventions or encoding scheme had many disadvantages. Texts prepared in such a proprietary scheme by one project could not readily be used by other projects; software developed for the analysis of such texts consequently could not be used outside the project, owing to the incompatibility of encoding schemes and the lack of standardised hardware. However, as more texts were prepared in machine-readable format, the call for an economical use of resources grew as well. As early as 1967, Martin Kay argued in favour of a <quote source="#quoteref2">standard code in which any text received from an outside source can be assumed to be</quote> (<ref xml:id="quoteref2" type="bibl" target="#kay1967">Kay 1967, 171</ref>). This code would serve as an exchange format, allowing users to apply their own conventions at input and at output (<ref type="bibl" target="#kay1967">Kay 1967, 172</ref>).</p>
      </div>
        </body>
    <back>
      <div type="bibliography">
        <listBibl>
          <bibl xml:id="barnard1988">
                        <author>Barnard, David T.</author>, <author>Cheryl A. Fraser</author>, and <author>George M. Logan</author>. <date>1988</date>. <title level="a">Generalized Markup for Literary Texts</title>. <title level="j">Literary and Linguistic Computing</title> <biblScope unit="volume">3</biblScope> (<biblScope unit="issue">1</biblScope>): <biblScope unit="page">26–31</biblScope>. <idno type="DOI">10.1093/llc/3.1.26</idno>.</bibl>
          <bibl xml:id="barnard1988b">
                        <author>Barnard, David T.</author>, <author>Ron Hayter</author>, <author>Maria Karababa</author>, <author>George M. Logan</author>, and <author>John McFadden</author>. <date>1988</date>. <title level="a">SGML-Based Markup for Literary Texts: Two Problems and Some Solutions</title>. <title level="j">Computers and the Humanities</title> <biblScope unit="volume">22</biblScope> (<biblScope unit="issue">4</biblScope>): <biblScope unit="page">265–276</biblScope>.</bibl>
          <bibl xml:id="berkowitz1986">
                        <author>Berkowitz, Luci</author>, <author>Karl A. Squitier</author>, and <author>William H. A. Johnson</author>. <date>1986</date>. <title level="m">Thesaurus Linguae Graecae, Canon of Greek Authors and Works.</title> <pubPlace>New York/Oxford</pubPlace>: <publisher>Oxford University Press</publisher>.</bibl>
          <bibl xml:id="bray1998">
                        <editor>Bray, Tim</editor>, <editor>Jean Paoli</editor>, and <editor>C. M. Sperberg-McQueen</editor>. <title level="m">Extensible Markup Language (XML) 1.0.</title> W3C Recommendation 10-February-1998. <ptr target="http://www.w3.org/TR/1998/REC-xml-19980210"/> (accessed September 2008).</bibl>
          <bibl xml:id="burnard1988">
                        <author>Burnard, Lou</author>. <date>1988</date>. <title level="a">Report of Workshop on Text Encoding Guidelines</title>. <title level="j">Literary and Linguistic Computing</title> <biblScope unit="volume">3</biblScope> (<biblScope unit="issue">2</biblScope>): <biblScope unit="page">131–133</biblScope>. <idno type="DOI">10.1093/llc/3.2.131</idno>.</bibl>
          <bibl xml:id="burnard2006">
                        <author>Burnard, Lou</author>, and <author>C. M. Sperberg-McQueen</author>. <date>2006</date>. <title level="u">TEI Lite: Encoding for Interchange: an introduction to the TEI Revised for TEI P5 release</title>. February 2006 <ptr target="https://tei-c.org/release/doc/tei-p5-exemplars/html/tei_lite.doc.html"/>.</bibl>
          <bibl xml:id="derose1999">
                        <author>DeRose, Steven J.</author> <date>1999</date>. <title level="a">XML and the TEI</title>. <title level="j">Computers and the Humanities</title> <biblScope unit="volume">33</biblScope> (<biblScope unit="issue">1–2</biblScope>): <biblScope unit="page">11–30</biblScope>.</bibl>
          <bibl xml:id="goldfarb1990">
                        <author>Goldfarb, Charles F.</author> <date>1990</date>. <title level="m">The SGML Handbook</title>. <pubPlace>Oxford</pubPlace>: <publisher>Clarendon Press</publisher>.</bibl>
          <bibl xml:id="hockey1980">
                        <author>Hockey, Susan</author>. <date>1980</date>. <title level="m">Oxford Concordance Program Users’ Manual</title>. <pubPlace>Oxford</pubPlace>: <publisher>Oxford University Computing Service</publisher>.</bibl>
          <bibl xml:id="ide1988">
                        <author>Ide, Nancy M.</author>, and <author>C. M. Sperberg-McQueen</author>. <date>1988</date>. <title level="a">Development of a Standard for Encoding Literary and Linguistic Materials</title>. In <title level="m">Cologne Computer Conference 1988. Uses of the Computer in the Humanities and Social Sciences. Volume of Abstracts.</title> Cologne, Germany, Sept 7–10 1988, p. <biblScope unit="page">E.6-3-4</biblScope>.</bibl>
          <bibl xml:id="ide1995">
                        <author>Ide, Nancy M.</author>, and <author>C. M. Sperberg-McQueen</author>. <date>1995</date>. <title level="a">The TEI: History, Goals, and Future</title>. <title level="j">Computers and the Humanities</title> <biblScope unit="volume">29</biblScope> (<biblScope unit="issue">1</biblScope>): <biblScope unit="page">5–15</biblScope>.</bibl>
          <bibl xml:id="kay1967">
                        <author>Kay, Martin</author>. <date>1967</date>. <title level="a">Standards for Encoding Data in a Natural Language</title>. <title level="j">Computers and the Humanities</title> <biblScope unit="volume">1</biblScope> (<biblScope unit="issue">5</biblScope>): <biblScope unit="page">170–177</biblScope>.</bibl>
          <bibl xml:id="lancashire1996">
                        <author>Lancashire, Ian</author>, <author>John Bradley</author>, <author>Willard McCarty</author>, <author>Michael Stairs</author>, and <author>Terence Russon Wooldridge</author>. <date>1996</date>. <title level="m">Using TACT with Electronic Texts</title>. <pubPlace>New York</pubPlace>: <publisher>Modern Language Association of America</publisher>.</bibl>
          <bibl xml:id="russel1967">
                        <author>Russell, D. B.</author> <date>1967</date>. <title level="m">COCOA: A Word Count and Concordance Generator for Atlas</title>. <pubPlace>Chilton</pubPlace>: <publisher>Atlas Computer Laboratory</publisher>.</bibl>
          <bibl xml:id="msmq1991">
                        <author>Sperberg-McQueen, C. M.</author> <date>1991</date>. <title level="a">Text in the Electronic Age: Textual Study and Text Encoding with examples from Medieval Texts</title>. <title level="j">Literary and Linguistic Computing</title> <biblScope unit="volume">6</biblScope> (<biblScope unit="issue">1</biblScope>): <biblScope unit="page">34–46</biblScope>. <idno type="DOI">10.1093/llc/6.1.34</idno>.</bibl>
          <bibl xml:id="msmq1990">
                        <editor>Sperberg-McQueen, C. M.</editor>, and <editor>Lou Burnard</editor> (eds.). <date>1990</date>. <title level="m">TEI P1: Guidelines for the Encoding and Interchange of Machine Readable Texts</title>. <pubPlace>Chicago/Oxford</pubPlace>: <publisher>ACH-ALLC-ACL Text Encoding Initiative</publisher>. <ptr target="https://tei-c.org/Vault/Vault-GL.html"/> (accessed October 2008).</bibl>
          <bibl xml:id="msmq1993">
                        <editor>Sperberg-McQueen, C. M.</editor>, and <editor>Lou Burnard</editor> (eds.). <date>1993</date>. <title level="m">TEI P2 Guidelines for the Encoding and Interchange of Machine Readable Texts</title>. Draft P2 (published serially 1992–1993); Draft Version 2 of April 1993: 19 chapters. <ptr target="https://tei-c.org/Vault/Vault-GL.html"/> (accessed October 2008).</bibl>
          <bibl xml:id="msmq1994">
                        <editor>Sperberg-McQueen, C. M.</editor>, and <editor>Lou Burnard</editor> (eds.). <date>1994</date>. <title level="m">Guidelines for Electronic Text Encoding and Interchange. TEI P3.</title> <pubPlace>Oxford, Providence, Charlottesville, Bergen</pubPlace>: <publisher>Text Encoding Initiative</publisher>.</bibl>
          <bibl xml:id="msmq1999">
                        <editor>Sperberg-McQueen, C. M.</editor>, and <editor>Lou Burnard</editor> (eds.). <date>1999</date>. <title level="m">Guidelines for Electronic Text Encoding and Interchange. TEI P3. Revised reprint.</title> <pubPlace>Oxford, Providence, Charlottesville, Bergen</pubPlace>: <publisher>Text Encoding Initiative</publisher>.</bibl>
          <bibl xml:id="msmq2002">
                        <editor>Sperberg-McQueen, C. M.</editor>, and <editor>Lou Burnard</editor> (eds.). <date>2002</date>. <title level="m">TEI P4: Guidelines for Electronic Text Encoding and Interchange. XML-compatible edition.</title> XML conversion by Syd Bauman, Lou Burnard, Steven DeRose, and Sebastian Rahtz. <pubPlace>Oxford, Providence, Charlottesville, Bergen</pubPlace>: <publisher>Text Encoding Initiative Consortium</publisher>. <ptr target="https://tei-c.org/Vault/P4/doc/html/"/> (accessed October 2008).</bibl>
          <bibl xml:id="tei2007">
                        <orgName>TEI Consortium</orgName>. <date>2007</date>. <title level="m">TEI P5: Guidelines for Electronic Text Encoding and Interchange</title>. <pubPlace>Oxford, Providence, Charlottesville, Nancy</pubPlace>: <publisher>TEI Consortium</publisher>. <ptr target="https://tei-c.org/Vault/P5/1.0.0/doc/tei-p5-doc/en/html/"/> (accessed October 2008).</bibl>
        </listBibl>
      </div>
    </back>
  </text>
  <!-- 
        $Date: 2020-07-08 02:33:20 +0200 (Wed, 08 Jul 2020) $
        $Id: TBED00v00.xml 425 2020-07-08 00:33:20Z ron.vandenbranden $  -->
</TEI>