Speech recognition using an English multimodal corpus with integrated image and depth information

Abstract: Traditional English corpora mainly collect information from a single modality and lack multimodal information, which lowers corpus quality and limits recognition accuracy. To solve these problems, this paper proposes to introduce depth...

Bibliographic Details
Main Author:	Bing Wang
Format:	Article
Language:	English
Published:	Nature Portfolio 2024-11-01
Series:	Scientific Reports
Subjects:	English multimodal corpus; Speech recognition methods; Depth information; Electronic images
Online Access:	https://doi.org/10.1038/s41598-024-78557-2

Similar Items

Bridging language gaps: The role of NLP and speech recognition in oral english instruction
by: Parul Dubey, et al.
Published: (2025-06-01)

Multimodal Emotion Recognition Based on Facial Expressions, Speech, and EEG
by: Jiahui Pan, et al.
Published: (2024-01-01)

Unified Depth-Guided Feature Fusion and Reranking for Hierarchical Place Recognition
by: Kunmo Li, et al.
Published: (2025-06-01)

Small Language Models for Speech Emotion Recognition in Text and Audio Modalities
by: José L. Gómez-Sirvent, et al.
Published: (2025-07-01)

Exploring Speech-Recognition Technology on English Pronunciation Skills: A Qualitative Study in Eleventh Grade of SMAI Miftahul Ulum
by: Moh. Hidayatulllah, et al.
Published: (2025-06-01)

A WebGL Serious Game for Practicing English Conversations in Public Places Using Speech Recognition
by: Rickman Roedavan, et al.
Published: (2025-05-01)

English listening and speaking ability evaluation model fusing computer vision and speech recognition algorithms
by: Yihui Zeng
Published: (2025-12-01)

Implications extra et pré-linguistiques d’un corpus audiovisuel touchant à l’information spécialisée : quelques réflexions sur le cas du discours sportif télévisé contemporain
by: Camille Lagarde-Belleville
Published: (2019-03-01)

ClinClip: a Multimodal Language Pre-training model integrating EEG data for enhanced English medical listening assessment
by: Guangyu Sun
Published: (2025-01-01)

Improving Speech Recognition Rate through Analysis Parameters
by: Eringis Deividas, et al.
Published: (2014-05-01)

A Self-Evaluated Bilingual Automatic Speech Recognition System for Mandarin–English Mixed Conversations
by: Xinhe Hai, et al.
Published: (2025-07-01)

Indonesian English(?): A Corpus-Based Lexical Analysis
by: Ignatius Tri Endarto
Published: (2018-12-01)

AFT-SAM: Adaptive Fusion Transformer with a Sparse Attention Mechanism for Audio–Visual Speech Recognition
by: Na Che, et al.
Published: (2024-12-01)

Polish Speech and Text Emotion Recognition in a Multimodal Emotion Analysis System
by: Kamil Skowroński, et al.
Published: (2024-11-01)

Multimodal hate speech detection: a novel deep learning framework for multilingual text and images
by: Furqan Khan Saddozai, et al.
Published: (2025-04-01)

Phonetic minimization of the text corpus in Belarusian for the speech synthesis system training
by: S. I. Lysy
Published: (2019-03-01)

Multimodal meaning making in news communication about immigration: using the NewsScape corpus to explore co-verbal images in TV news
by: Christopher Hart
Published: (2024-10-01)

Accents in Speech Recognition through the Lens of a World Englishes Evaluation Set
by: Miguel Del Río, et al.
Published: (2023-12-01)

MemoCMT: multimodal emotion recognition using cross-modal transformer-based feature fusion
by: Mustaqeem Khan, et al.
Published: (2025-02-01)

Assessing Costa Rican children speech recognition by humans and machines
by: Maribel Morales-Rodríguez, et al.
Published: (2022-11-01)

Corpus-Linguistic Analysis of Speech Communities on Anti-Gender Discourse in Slovene
by: Damjan Popič, et al.
Published: (2023-01-01)

Corpus Literacy Training for In-Service English Language Teachers
by: Ahmet Basal, et al.
Published: (2024-01-01)

English translation research based on a multimodal corpus of Cantonese opera: a case study of classic Cantonese opera Red Boat
by: Liping Jiang
Published: (2025-04-01)

Silent speech recognition using visual cascading fusion of tongue-lip movements based on pre-trained and fine-tuned model
by: Chongchong Yu, et al.
Published: (2025-04-01)

Maximizing English Speech on YouTube Videos to Enrich Students’ Vocabulary
by: Ihdal Bayu Pamungkas, et al.
Published: (2022-12-01)

On the corpus of speech samples with errors in the use of Russian as a foreign language: methods of data representation and deep markup parameters
by: S. V. Gusarenko, et al.
Published: (2023-01-01)

Development of speech material for an Armenian speech recognition threshold test
by: Sona Sargsyan, et al.
Published: (2021-09-01)

Shuffling Augmented Decoupled Features for Multimodal Emotion Recognition
by: Sunyoung Cho
Published: (2025-01-01)

Exploring Studies on Multimodal Literacy in English Learning: A Bibliometric Analysis
by: Ivan Samuel Christian, et al.
Published: (2024-11-01)

Discourse markers in the speech of Italian tourist guides: a corpus-based study
by: Iolanda Alfano
Published: (2025-05-01)

Using casual speech phonology in synthetic speech
by: Linda SHOCKEY
Published: (2014-04-01)

Analysis and development of clinically recorded dysarthric speech corpus for patients affected with various stroke conditions
by: Oindrila Banerjee, et al.
Published: (2025-06-01)

Intelligibility and Recognition of Announcer's Speech during Electric Acoustic Conversions in a Transformer
by: A. Y. Shafranov, et al.
Published: (2022-06-01)

Corpus Applications in ELT in Colombia: An Exploratory Survey
by: Rodrigo A. Rodríguez-Fuentes, et al.
Published: (2025-01-01)

SVIT‐SSR: A sEMG‐based vision transformer approach for silent speech recognition
by: Zhao Li, et al.
Published: (2024-11-01)

Speech Emotion Recognition: Humans vs Machines
by: S. Werner, et al.
Published: (2019-12-01)

A Comprehensive Review of Multimodal Emotion Recognition: Techniques, Challenges, and Future Directions
by: You Wu, et al.
Published: (2025-06-01)

Recent advancements in automatic disordered speech recognition: A survey paper
by: Nada Gohider, et al.
Published: (2024-12-01)