Speech recognition using an English multimodal corpus with integrated image and depth information

Abstract Traditional English corpora mainly collect information from a single modality and lack multimodal information, which lowers the quality of the corpus and limits recognition accuracy. To solve these problems, this paper proposes to introduce depth...

Bibliographic Details
Main Author: Bing Wang
Format: Article
Language: English
Published: Nature Portfolio 2024-11-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-024-78557-2