Hybrid Deep Learning and Fuzzy Matching for Real-Time Bidirectional Arabic Sign Language Translation: Toward Inclusive Communication Technologies

Bibliographic Details
Main Authors: Mogeeb A. A. Mosleh, Ahmed A. A. Mohammed, Ezzaldeen E. A. Esmail, Rehab A. A. Mohammed, Basheer Almuhaya
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/11015993/
Description
Summary: Technological advances and AI tools can help address the challenges faced by individuals who are deaf or nonverbal in many areas of social interaction. Existing tools mainly focus on one-way translation, rely on small-vocabulary datasets, require significant computational power, and often lack a real-time implementation. Therefore, a bidirectional real-time translation application for Arabic Sign Language and written Arabic text was developed in this research to improve communication and learning experiences for individuals who are deaf. The proposed system is designed with two primary translation modules: sign-to-text and text-to-sign. The sign-to-text module employs transfer learning models to translate Arabic sign images into text, while the text-to-sign module integrates a fuzzy string-matching tool to convert Arabic text into sign images. The system was customized using six CNN-based deep learning architectures: AlexNet, ResNet152V2, YOLOv8n, Swin Transformer, InceptionV3, and Xception. Additionally, the ArSL dataset and an Arabic data dictionary were employed to enhance the diversity, accuracy, and completeness of the selected CNN models, thereby improving the system’s adaptability across various users and contexts. Experimental evaluations assessed the system’s performance in terms of both accuracy and processing efficiency. The results showed exceptionally high accuracy across all investigated CNN models, with YOLOv8n-cls achieving the highest score of 99.9%, followed by Xception, Swin Transformer, and AlexNet at 99.0%, and InceptionV3 and ResNet152V2 at 98.0%. These closely aligned results are attributed to the inherent characteristics of the dataset, as well as the shared methodologies employed, including preprocessing, data augmentation, cross-validation, and hyperparameter tuning. In terms of real-time adaptability in recognizing each sign image, the InceptionV3, AlexNet, and YOLOv8n models achieved high efficiency, with execution times of 13 ms, 16 ms, and 67 ms, respectively. These results indicate that YOLOv8n stands out among the models for its superior accuracy, albeit at a lower processing speed than InceptionV3 and AlexNet. The findings highlight that integrating deep learning models with fuzzy string-matching techniques yields significant improvements in both accuracy and speed over baseline models, confirming the feasibility of a robust and efficient real-time bidirectional Arabic Sign Language (ArSL) translation system. Thus, the proposed system has strong potential to reduce communication barriers between people who are deaf and hearing individuals.
ISSN: 2169-3536
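
The abstract describes two translation paths: a fuzzy string-matching lookup that maps written Arabic words to sign images (text-to-sign), and CNN classifiers fine-tuned via transfer learning to recognize sign images (sign-to-text). The article does not reproduce its code here, so the following Python sketches are illustrative only; the dictionary entries, file names, similarity cutoff, class count, and backbone choice are assumptions rather than the authors' implementation.

```python
import difflib

# Hypothetical sign dictionary mapping normalized Arabic words to sign-image files.
# Words and file names are placeholders, not entries from the paper's data dictionary.
SIGN_DICTIONARY = {
    "مرحبا": "signs/marhaban.png",
    "شكرا": "signs/shukran.png",
    "سلام": "signs/salam.png",
}

def text_to_signs(sentence: str, cutoff: float = 0.8) -> list[str]:
    """Map each word of an Arabic sentence to the closest dictionary entry,
    tolerating small spelling variations via fuzzy string matching."""
    images = []
    for word in sentence.split():
        matches = difflib.get_close_matches(word, list(SIGN_DICTIONARY), n=1, cutoff=cutoff)
        if matches:
            images.append(SIGN_DICTIONARY[matches[0]])
    return images

print(text_to_signs("مرحبا سلام"))  # -> ['signs/marhaban.png', 'signs/salam.png']
```

For the sign-to-text direction, a minimal transfer-learning setup with one of the listed backbones (Xception is used here as an example) could look like the sketch below; the number of sign classes and the training details are assumptions.

```python
import tensorflow as tf

NUM_CLASSES = 32          # assumed number of ArSL sign classes
IMG_SIZE = (299, 299)     # Xception's default input resolution

# Pretrained backbone with its ImageNet weights, frozen so only the new head trains.
base = tf.keras.applications.Xception(
    weights="imagenet", include_top=False, input_shape=IMG_SIZE + (3,))
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```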