Enhancing communication with elderly and stroke patients based on sign-gesture translation via audio-visual avatars
Communication barriers faced by elderly individuals and stroke patients with speech impairments pose significant challenges in daily interactions. While sign language serves as a vital means of communication, those struggling to speak may encounter difficulties in conveying their messages effectively...
Saved in:
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | De Gruyter, 2025-03-01 |
| Series: | Open Engineering |
| Subjects: | |
| Online Access: | https://doi.org/10.1515/eng-2024-0068 |
| Tags: | |
| Summary: | Communication barriers faced by elderly individuals and stroke patients with speech impairments pose significant challenges in daily interactions. While sign language serves as a vital means of communication, those struggling to speak may encounter difficulties in conveying their messages effectively. This research addresses the issue by proposing a system that generates audio-visual avatars capable of translating sign gestures into written and spoken language, thereby offering a comprehensive communication tool for individuals with special needs. The proposed method integrates YOLOv8, U-Net-based segmentation networks, and MobileNet classifiers to accurately recognize and classify sign gestures. For gesture detection and classification, YOLOv8n was used; for segmentation, traditional U-Net, U-Net with VGG16, and U-Net with MobileNetV2 were applied in a multi-stage image-segmentation scheme; for classification, MobileNetV1 and MobileNetV2 were used. Using the improved first-order motion model, the generated avatars enabled real-time translation of sign motions into text and speech and facilitated interactive conversation in both Arabic and English. The evaluation findings demonstrated the system's value, showing that traditional U-Net produced the best results in gesture segmentation and YOLOv8n performed best in gesture classification. This study contributes to advancing assistive communication technologies, offering insights into optimizing gesture recognition and avatar generation for enhanced communication support in elderly and stroke patient care. The YOLOv8n model achieved a precision of 0.956 and a recall of 0.939 for detecting and classifying gestures. MobileNetV1 and MobileNetV2 achieved classification accuracies of 0.94 and 0.79, respectively. |
|---|---|
| ISSN: | 2391-5439 |
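The precision and recall figures quoted in the summary follow the standard detection-metric definitions: precision = TP / (TP + FP) and recall = TP / (TP + FN). The sketch below is purely illustrative; the confusion counts are hypothetical values chosen so the resulting metrics match the reported numbers, and are not taken from the paper.

```python
# Standard precision/recall definitions, as used when evaluating
# object detectors such as YOLOv8n.
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted detections that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Fraction of ground-truth instances that were found: TP / (TP + FN)."""
    return tp / (tp + fn)

# Hypothetical confusion counts (NOT the paper's data), chosen only to
# reproduce the reported values 0.956 and 0.939.
tp, fp, fn = 478, 22, 31
print(f"precision = {precision(tp, fp):.3f}")  # 0.956
print(f"recall    = {recall(tp, fn):.3f}")     # 0.939
```

A model can score well on one metric while doing poorly on the other (e.g., detecting few gestures but all correctly gives high precision and low recall), which is why the abstract reports both.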