Handwritten Text Recognition Best Practice in the Beta maṣāḥǝft workflow

Bibliographic Details
Main Author: Hizkiel Mitiku Alemayehu
Format: Article
Language: deu
Published: Text Encoding Initiative Consortium 2022-02-01
Series: Journal of the Text Encoding Initiative
Online Access: https://journals.openedition.org/jtei/4109
Collection: DOAJ
Description: This contribution describes the workflow used to transcribe manuscripts from the Ethiopian and Eritrean tradition. The goal of the workflow is to obtain a TEI file with an initial text transcription that profits from a wealth of machine-generated information collected through community-based contributions. The author sets out the framework of interest for this effort, then discusses the available state-of-the-art options and the workflow actually implemented. It is argued that, for this specific use case, a workflow that favors expert post-processing in the TEI over refinement of the preprocessing techniques is preferable. Large quantities of text, although not 100% correct, can still be useful when published in a collaboratively edited and open environment, and can provide users with information reusable for research.
Record ID: doaj-art-14a2e3776ab74aa8b2169383b2b6de49
Institution: Kabale University
ISSN: 2162-5603
DOI: 10.4000/jtei.4109
Subjects: Ethiopic literature; Manuscripts; Transkribus