Handwritten Text Recognition Best Practice in the Beta maṣāḥǝft workflow

Bibliographic Details
Main Author: Hizkiel Mitiku Alemayehu
Format: Article
Language: deu
Published: Text Encoding Initiative Consortium 2022-02-01
Series: Journal of the Text Encoding Initiative
Online Access: https://journals.openedition.org/jtei/4109
Description
Summary: This contribution describes the workflow used to transcribe manuscripts from the Ethiopian and Eritrean tradition. The goal of the workflow is to obtain a TEI file with an initial text transcription that profits from a wealth of machine-generated information collected through community-based contributions. The author sets out the framework of interest for this effort, then discusses the available state-of-the-art options and the workflow actually implemented. It is argued that, for this specific use case, a workflow that relies on expert post-processing in the TEI is preferable to one that refines the preprocessing techniques. Large quantities of text, although not 100% correct, can still be used when published in a collaboratively edited and open environment, and can provide users with information reusable for research.
ISSN: 2162-5603