Exploring GPT-4 Capabilities in Generating Paraphrased Sentences for the Arabic Language
Paraphrasing means expressing the semantic meaning of a text using different words. Paraphrasing has a significant impact on numerous Natural Language Processing (NLP) applications, such as Machine Translation (MT) and Question Answering (QA). Machine Learning (ML) methods are frequently employed to...
Saved in:
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
MDPI AG
2025-04-01
|
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/8/4139 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Paraphrasing means expressing the semantic meaning of a text using different words. Paraphrasing has a significant impact on numerous Natural Language Processing (NLP) applications, such as Machine Translation (MT) and Question Answering (QA). Machine Learning (ML) methods are frequently employed to generate new paraphrased text, and the generative method is commonly used for text generation. Generative Pre-trained Transformer (GPT) models have demonstrated effectiveness in various text generation tasks, including summarization, proofreading, and rephrasing of English texts. However, GPT-4’s capabilities in Arabic paraphrase generation have not been extensively studied despite Arabic being one of the most widely spoken languages. In this paper, the researchers evaluate the capabilities of GPT-4 in text paraphrasing for Arabic. Furthermore, the paper presents a comprehensive evaluation method for paraphrase quality and developing a detailed framework for evaluation. The framework comprises Bilingual Evaluation Understudy (BLEU), Recall-Oriented Understudy for Gisting Evaluation (ROUGE), Lexical Diversity (LD), Jaccard similarity, and word embedding using the Arabic Bi-directional Encoder Representation from Transformers (AraBERT) model with cosine and Euclidean similarity. This paper illustrates that GPT-4 can effectively produce a new paraphrased sentence that is semantically equivalent to the original sentence, and the quality framework efficiently ranks paraphrased pairs according to quality criteria. |
|---|---|
| ISSN: | 2076-3417 |