Handwritten Text Recognition Best Practice in the Beta maṣāḥǝft workflow
This contribution describes the workflow used to transcribe manuscripts from the Ethiopian and Eritrean tradition. The goal of the workflow is to obtain a TEI file with an initial text transcription that profits from a wealth of machine-generated information collected through community-based contrib...
Main Author: | Hizkiel Mitiku Alemayehu |
---|---|
Format: | Article |
Language: | deu |
Published: | Text Encoding Initiative Consortium, 2022-02-01 |
Series: | Journal of the Text Encoding Initiative |
Subjects: | Ethiopic literature; Manuscripts; Transkribus |
Online Access: | https://journals.openedition.org/jtei/4109 |
_version_ | 1832578460569042944 |
---|---|
author | Hizkiel Mitiku Alemayehu |
collection | DOAJ |
description | This contribution describes the workflow used to transcribe manuscripts from the Ethiopian and Eritrean tradition. The goal of the workflow is to obtain a TEI file with an initial text transcription that profits from a wealth of machine-generated information collected through community-based contributions. The author sets out the framework of interest of this effort in order to discuss the available state-of-the-art options and the workflow actually implemented. It is argued that, for this specific use case, a workflow that favors expert post-processing in the TEI over refinement of the preprocessing techniques is preferable. Large quantities of text, although not 100% correct, can still be used when published in a collaboratively edited and open environment, and can provide users with information reusable for research. |
format | Article |
id | doaj-art-14a2e3776ab74aa8b2169383b2b6de49 |
institution | Kabale University |
issn | 2162-5603 |
language | deu |
publishDate | 2022-02-01 |
publisher | Text Encoding Initiative Consortium |
record_format | Article |
series | Journal of the Text Encoding Initiative |
title | Handwritten Text Recognition Best Practice in the Beta maṣāḥǝft workflow |
topic | Ethiopic literature Manuscripts Transkribus |
url | https://journals.openedition.org/jtei/4109 |