Exploring GPT-4 Capabilities in Generating Paraphrased Sentences for the Arabic Language

Paraphrasing means expressing the semantic meaning of a text using different words. Paraphrasing has a significant impact on numerous Natural Language Processing (NLP) applications, such as Machine Translation (MT) and Question Answering (QA). Machine Learning (ML) methods are frequently employed to...

Full description

Saved in:
Bibliographic Details
Main Authors: Haya Rabih Alsulami, Amal Abdullah Almansour
Format: Article
Language:English
Published: MDPI AG 2025-04-01
Series:Applied Sciences
Subjects:
Online Access:https://www.mdpi.com/2076-3417/15/8/4139
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Paraphrasing means expressing the semantic meaning of a text using different words. Paraphrasing has a significant impact on numerous Natural Language Processing (NLP) applications, such as Machine Translation (MT) and Question Answering (QA). Machine Learning (ML) methods are frequently employed to generate new paraphrased text, and the generative method is commonly used for text generation. Generative Pre-trained Transformer (GPT) models have demonstrated effectiveness in various text generation tasks, including summarization, proofreading, and rephrasing of English texts. However, GPT-4’s capabilities in Arabic paraphrase generation have not been extensively studied despite Arabic being one of the most widely spoken languages. In this paper, the researchers evaluate the capabilities of GPT-4 in text paraphrasing for Arabic. Furthermore, the paper presents a comprehensive evaluation method for paraphrase quality and developing a detailed framework for evaluation. The framework comprises Bilingual Evaluation Understudy (BLEU), Recall-Oriented Understudy for Gisting Evaluation (ROUGE), Lexical Diversity (LD), Jaccard similarity, and word embedding using the Arabic Bi-directional Encoder Representation from Transformers (AraBERT) model with cosine and Euclidean similarity. This paper illustrates that GPT-4 can effectively produce a new paraphrased sentence that is semantically equivalent to the original sentence, and the quality framework efficiently ranks paraphrased pairs according to quality criteria.
ISSN:2076-3417