Speech recognition using an English multimodal corpus with integrated image and depth information
Abstract: Traditional English corpora mainly collect information from a single modality and lack multimodal information, resulting in low-quality corpus information and limited recognition accuracy. To solve these problems, this paper proposes to introduce depth...
| Main Author: | Bing Wang |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Nature Portfolio, 2024-11-01 |
| Series: | Scientific Reports |
| Online Access: | https://doi.org/10.1038/s41598-024-78557-2 |
Similar Items
- Bridging language gaps: The role of NLP and speech recognition in oral English instruction
  by: Parul Dubey, et al.
  Published: (2025-06-01)
- Multimodal Emotion Recognition Based on Facial Expressions, Speech, and EEG
  by: Jiahui Pan, et al.
  Published: (2024-01-01)
- Unified Depth-Guided Feature Fusion and Reranking for Hierarchical Place Recognition
  by: Kunmo Li, et al.
  Published: (2025-06-01)
- Small Language Models for Speech Emotion Recognition in Text and Audio Modalities
  by: José L. Gómez-Sirvent, et al.
  Published: (2025-07-01)
- Exploring Speech-Recognition Technology on English Pronunciation Skills: A Qualitative Study in Eleventh Grade of SMAI Miftahul Ulum
  by: Moh. Hidayatulllah, et al.
  Published: (2025-06-01)