Proposed Framework for the Evaluation of Standalone Corpora Processing Systems: An Application to Arabic Corpora
Despite the accessibility of numerous online corpora, students and researchers engaged in the fields of Natural Language Processing (NLP), corpus linguistics, and language learning and teaching may encounter situations in which they need to develop their own corpora. Several commercial and free stan...
Main Authors: | Abdulmohsen Al-Thubaity, Hend Al-Khalifa, Reem Alqifari, Manal Almazrua |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2014-01-01 |
Series: | The Scientific World Journal |
Online Access: | http://dx.doi.org/10.1155/2014/602745 |
_version_ | 1832551832404099072 |
---|---|
author | Abdulmohsen Al-Thubaity; Hend Al-Khalifa; Reem Alqifari; Manal Almazrua |
author_sort | Abdulmohsen Al-Thubaity |
collection | DOAJ |
description | Despite the accessibility of numerous online corpora, students and researchers engaged in the fields of Natural Language Processing (NLP), corpus linguistics, and language learning and teaching may encounter situations in which they need to develop their own corpora. Several commercial and free standalone corpora processing systems are available to process such corpora. In this study, we first propose a framework for the evaluation of standalone corpora processing systems and then use it to evaluate seven freely available systems. The proposed framework considers the usability, functionality, and performance of the evaluated systems while taking into consideration their suitability for Arabic corpora. While the results show that most of the evaluated systems exhibited comparable usability scores, the scores for functionality and performance were substantially different with respect to support for the Arabic language and N-grams profile generation. The results of our evaluation will help potential users of the evaluated systems to choose the system that best meets their needs. More importantly, the results will help the developers of the evaluated systems to enhance their systems and developers of new corpora processing systems by providing them with a reference framework. |
format | Article |
id | doaj-art-c716a1905eb54ff09abf467a9b113f3f |
institution | Kabale University |
issn | 2356-6140; 1537-744X |
language | English |
publishDate | 2014-01-01 |
publisher | Wiley |
record_format | Article |
series | The Scientific World Journal |
spelling | doaj-art-c716a1905eb54ff09abf467a9b113f3f; 2025-02-03T06:00:20Z; eng; Wiley; The Scientific World Journal; 2356-6140; 1537-744X; 2014-01-01; 2014; 10.1155/2014/602745; 602745; Proposed Framework for the Evaluation of Standalone Corpora Processing Systems: An Application to Arabic Corpora; Abdulmohsen Al-Thubaity (0); Hend Al-Khalifa (1); Reem Alqifari (2); Manal Almazrua (3); King AbdulAziz City for Science and Technology, Riyadh 11442, Saudi Arabia; King Saud University, Riyadh 12372, Saudi Arabia; King Saud University, Riyadh 12372, Saudi Arabia; King AbdulAziz City for Science and Technology, Riyadh 11442, Saudi Arabia; Despite the accessibility of numerous online corpora, students and researchers engaged in the fields of Natural Language Processing (NLP), corpus linguistics, and language learning and teaching may encounter situations in which they need to develop their own corpora. Several commercial and free standalone corpora processing systems are available to process such corpora. In this study, we first propose a framework for the evaluation of standalone corpora processing systems and then use it to evaluate seven freely available systems. The proposed framework considers the usability, functionality, and performance of the evaluated systems while taking into consideration their suitability for Arabic corpora. While the results show that most of the evaluated systems exhibited comparable usability scores, the scores for functionality and performance were substantially different with respect to support for the Arabic language and N-grams profile generation. The results of our evaluation will help potential users of the evaluated systems to choose the system that best meets their needs. More importantly, the results will help the developers of the evaluated systems to enhance their systems and developers of new corpora processing systems by providing them with a reference framework.; http://dx.doi.org/10.1155/2014/602745 |
title | Proposed Framework for the Evaluation of Standalone Corpora Processing Systems: An Application to Arabic Corpora |
url | http://dx.doi.org/10.1155/2014/602745 |