Exploring the Performance of Tagging for the Classical and the Modern Standard Arabic

Part-of-speech (PoS) tagging is a core component of many natural language processing (NLP) applications. PoS taggers serve as a preprocessing step in various NLP tasks, such as syntactic parsing, information extraction, machine translation, and speech synthesis. In this paper,...

Bibliographic Details
Main Authors: Dia AbuZeina, Taqieddin Mostafa Abdalbaset
Format: Article
Language: English
Published: Wiley 2019-01-01
Series: Advances in Fuzzy Systems
Online Access: http://dx.doi.org/10.1155/2019/6254649
_version_ 1832568115393724416
author Dia AbuZeina
Taqieddin Mostafa Abdalbaset
collection DOAJ
description Part-of-speech (PoS) tagging is a core component of many natural language processing (NLP) applications. PoS taggers serve as a preprocessing step in various NLP tasks, such as syntactic parsing, information extraction, machine translation, and speech synthesis. In this paper, we examine the performance of a Modern Standard Arabic (MSA) based tagger on classical (i.e., traditional or historical) Arabic. We used the Stanford Arabic model tagger to tag the imperative verbs in the Holy Quran. The Stanford tagger's tag set contains 29 tags; this work experimentally evaluates just one of them, VB (imperative verb). The test set contains 741 imperative verbs, which appear in 1,848 positions in the Holy Quran. Although the previously reported accuracy of the Arabic model of the Stanford tagger is 96.26% for all tags and 80.14% for unknown words, the experimental results show an accuracy of only 7.28% for the imperative verbs. This result calls for further research into why tagging is so inaccurate for classical Arabic. The performance decline may indicate the need to distinguish between classical Arabic and MSA training data in NLP tasks.
format Article
id doaj-art-ee48a026b95345faa78937c637f4f76c
institution Kabale University
issn 1687-7101
1687-711X
language English
publishDate 2019-01-01
publisher Wiley
record_format Article
series Advances in Fuzzy Systems
spelling doaj-art-ee48a026b95345faa78937c637f4f76c | 2025-02-03T00:59:50Z | eng | Wiley | Advances in Fuzzy Systems | 1687-7101 | 1687-711X | 2019-01-01 | 2019 | 10.1155/2019/6254649 | 6254649 | Exploring the Performance of Tagging for the Classical and the Modern Standard Arabic | Dia AbuZeina (0) | Taqieddin Mostafa Abdalbaset (1) | College of Information Technology and Computer Engineering, Palestine Polytechnic University, Hebron, State of Palestine | Palestine Technical University–Kadoorie, AL-Aroub Branch, Hebron, State of Palestine | http://dx.doi.org/10.1155/2019/6254649
title Exploring the Performance of Tagging for the Classical and the Modern Standard Arabic
url http://dx.doi.org/10.1155/2019/6254649