The TEI and Current Standards for Structuring Linguistic Data

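The TEI has served for many years as a mature annotation format for corpora of different types, including linguistically annotated data. Although it is based on the consensus of a large community, it does not have the legal status of a standard. During the last decade, efforts have been undertaken to develop definitive de jure standards for linguistic data that not only act as a normative basis for the exchange of language corpora but also address recent advancements in technology, such as web-based standards, and the use of large and multiply annotated corpora. In this article we will provide an overview of the process of international standardization and discuss some of the international standards currently being developed under the auspices of ISO/TC 37, a technical committee called “Terminology and other Language and Content Resources”. After that the relationship between the TEI Guidelines and these specifications, according to their formal model, notation format, and annotation model, will be discussed. The conclusion of the paper provides recommendations for dealing with language corpora.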

Bibliographic Details
Main Author: Maik Stührenberg
Format: Article
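Language: German (deu)
Published: Text Encoding Initiative Consortium 2012-10-01
Series: Journal of the Text Encoding Initiative
Subjects: ISO/TC 37/SC 4; Linguistic Annotation Framework (LAF); Morpho-Syntactic Annotation Framework (MAF); Syntactic Annotation Framework (SynAF); feature structures; standards
Online Access: https://journals.openedition.org/jtei/523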
author Maik Stührenberg
author_facet Maik Stührenberg
author_sort Maik Stührenberg
collection DOAJ
description The TEI has served for many years as a mature annotation format for corpora of different types, including linguistically annotated data. Although it is based on the consensus of a large community, it does not have the legal status of a standard. During the last decade, efforts have been undertaken to develop definitive de jure standards for linguistic data that not only act as a normative basis for the exchange of language corpora but also address recent advancements in technology, such as web-based standards, and the use of large and multiply annotated corpora. In this article we will provide an overview of the process of international standardization and discuss some of the international standards currently being developed under the auspices of ISO/TC 37, a technical committee called “Terminology and other Language and Content Resources”. After that the relationship between the TEI Guidelines and these specifications, according to their formal model, notation format, and annotation model, will be discussed. The conclusion of the paper provides recommendations for dealing with language corpora.
format Article
id doaj-art-a3c958f7e5074c538d68d2a87df113e1
institution Kabale University
issn 2162-5603
language deu
publishDate 2012-10-01
publisher Text Encoding Initiative Consortium
record_format Article
series Journal of the Text Encoding Initiative
spelling doaj-art-a3c958f7e5074c538d68d2a87df113e1 | 2025-01-30T13:56:14Z | deu | Text Encoding Initiative Consortium | Journal of the Text Encoding Initiative | 2162-5603 | 2012-10-01 | 3 | 10.4000/jtei.523 | The TEI and Current Standards for Structuring Linguistic Data | Maik Stührenberg | The TEI has served for many years as a mature annotation format for corpora of different types, including linguistically annotated data. Although it is based on the consensus of a large community, it does not have the legal status of a standard. During the last decade, efforts have been undertaken to develop definitive de jure standards for linguistic data that not only act as a normative basis for the exchange of language corpora but also address recent advancements in technology, such as web-based standards, and the use of large and multiply annotated corpora. In this article we will provide an overview of the process of international standardization and discuss some of the international standards currently being developed under the auspices of ISO/TC 37, a technical committee called “Terminology and other Language and Content Resources”. After that the relationship between the TEI Guidelines and these specifications, according to their formal model, notation format, and annotation model, will be discussed. The conclusion of the paper provides recommendations for dealing with language corpora. | https://journals.openedition.org/jtei/523 | ISO/TC 37/SC 4 | Linguistic Annotation Framework (LAF) | Morpho-Syntactic Annotation Framework (MAF) | Syntactic Annotation Framework (SynAF) | feature structures | standards
spellingShingle Maik Stührenberg
The TEI and Current Standards for Structuring Linguistic Data
Journal of the Text Encoding Initiative
ISO/TC 37/SC 4
Linguistic Annotation Framework (LAF)
Morpho-Syntactic Annotation Framework (MAF)
Syntactic Annotation Framework (SynAF)
feature structures
standards
title The TEI and Current Standards for Structuring Linguistic Data
title_full The TEI and Current Standards for Structuring Linguistic Data
title_fullStr The TEI and Current Standards for Structuring Linguistic Data
title_full_unstemmed The TEI and Current Standards for Structuring Linguistic Data
title_short The TEI and Current Standards for Structuring Linguistic Data
title_sort tei and current standards for structuring linguistic data
topic ISO/TC 37/SC 4
Linguistic Annotation Framework (LAF)
Morpho-Syntactic Annotation Framework (MAF)
Syntactic Annotation Framework (SynAF)
feature structures
standards
url https://journals.openedition.org/jtei/523
work_keys_str_mv AT maikstuhrenberg theteiandcurrentstandardsforstructuringlinguisticdata
AT maikstuhrenberg teiandcurrentstandardsforstructuringlinguisticdata