Utilizing Language Technology in the Documentation of Endangered Uralic Languages
Main Authors: | Ciprian Gerstenberger, Niko Partanen, Michael Rießler, Joshua Wilbur |
---|---|
Format: | Article |
Language: | English |
Published: | Linköping University Electronic Press, 2016-03-01 |
Series: | Northern European Journal of Language Technology |
Online Access: | https://nejlt.ep.liu.se/article/view/1660 |
_version_ | 1832590643293061120 |
---|---|
author | Ciprian Gerstenberger Niko Partanen Michael Rießler Joshua Wilbur |
collection | DOAJ |
description | The paper describes work-in-progress by the Pite Saami, Kola Saami and Izhva Komi language documentation projects, all of which record new spoken language data, digitize available recordings and annotate these multimedia data in order to provide comprehensive language corpora as databases for future research on and for endangered – and under-described – Uralic speech communities. Applying language technology in language documentation helps us to create more systematically annotated corpora, rather than eclectic data collections. Specifically, we describe a script providing interactivity between different morphosyntactic analysis modules implemented as Finite State Transducers and ELAN, a Graphical User Interface tool for annotating and presenting multimodal corpora. Ultimately, the spoken corpora created in our projects will be useful for scientifically significant quantitative investigations on these languages in the future. |
format | Article |
id | doaj-art-f2b181dd025f41aa86c116d521b7c0f6 |
institution | Kabale University |
issn | 2000-1533 |
language | English |
publishDate | 2016-03-01 |
publisher | Linköping University Electronic Press |
record_format | Article |
series | Northern European Journal of Language Technology |
spelling | doaj-art-f2b181dd025f41aa86c116d521b7c0f6; 2025-01-23T10:36:33Z; eng; Linköping University Electronic Press; Northern European Journal of Language Technology; ISSN 2000-1533; 2016-03-01; vol. 4; DOI 10.3384/nejlt.2000-1533.1643; Utilizing Language Technology in the Documentation of Endangered Uralic Languages; Ciprian Gerstenberger (UiT – The Arctic University of Norway, Giellatekno – Saami Language Technology); Niko Partanen (University of Hamburg, Department of Uralic Studies); Michael Rießler (University of Freiburg, Department of Scandinavian Studies); Joshua Wilbur (University of Freiburg, Department of Scandinavian Studies); https://nejlt.ep.liu.se/article/view/1660 |
title | Utilizing Language Technology in the Documentation of Endangered Uralic Languages |
url | https://nejlt.ep.liu.se/article/view/1660 |