-
1
Global-Local Self-Attention-Based Long Short-Term Memory with Optimization Algorithm for Speaker Identification
Published 2025-01-01 “…The GLSA-LSTM with EN-GWO method achieves an accuracy of 99.36% on the TIMIT dataset and 93.45% on the VoxCeleb 1 dataset, compared with SincNet and Generative Adversarial Network (SincGAN) and Hybrid Neural Network – Support Vector Machine (NN-SVM). …”
Article -
2
An audio steganography algorithm based on segment-STC
Published 2019-07-01 “…Then, the segmented message in each sub-segment was embedded into the optimal submatrix. The TIMIT dataset and author-collected online songs were used as databases to run the experiments. …”
Article -
3
Source cell-phone identification from recorded speech using non-speech segments
Published 2017-07-01 “…Source cell-phone identification has become a hot topic in multimedia forensics. A novel cell-phone identification method was proposed based on the silent segments of recorded speech. Firstly, the silent segments were obtained using an adaptive endpoint detection algorithm. Then, the mean of the Mel frequency coefficients (MFC) was extracted as the characteristic for device identification. Finally, the CfsSubsetEval evaluation function of the WEKA platform was selected with best-first (BestFirst) search, and a support vector machine (SVM) was used for classification. Twenty-three popular cell-phone models were evaluated in the experiment. Experimental results show that the proposed method is feasible and that the average recognition rates are 99.23% and 99.00% on the TIMIT database and the CKC-SD database, respectively. At the same time, the proposed feature was demonstrated to perform better than the MFC features and the Mel frequency cepstrum coefficients (MFCC) features of the speech segments.…”
Article -
4
A Hierarchical Framework Approach for Voice Activity Detection and Speech Enhancement
Published 2014-01-01 “…The effectiveness of the proposed approach is evaluated against other VAD techniques using two well-known databases, namely the TIMIT database and the NOISEX-92 database. Experimental results show that the proposed method performs well under a variety of noisy conditions.…”
Article -
5
Research on Underwater Sound Source Localization Based on NLM-EEMD and FCM-Generalized Quadratic Correlation
Published 2022-01-01 “…Theories such as NLM, EEMD, FCM, and generalized quadratic correlation delay estimation are studied in detail, and speech signals from the test library of the TIMIT standard library are selected for simulation analysis, which verifies the correctness of the delay estimation method. …”
Article -
6
Automated speech therapy through personalized pronunciation correction using reinforcement learning and large language models
Published 2025-03-01 “…Further validation utilized datasets such as TIMIT, LibriTTS, SpeechOcean762, and the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), enabling direct comparisons with contemporary methods. …”
Article