Showing 1 - 4 results of 4 for search '"voice activity detection"'
  1.

    A Hierarchical Framework Approach for Voice Activity Detection and Speech Enhancement by Yan Zhang, Zhen-min Tang, Yan-ping Li, Yang Luo

    Published 2014-01-01
    “…Accurate and effective voice activity detection (VAD) is a fundamental step for robust speech or speaker recognition. …”
    Article
  2.

    Combined wideband speech enhancement method based on statistical model and EMD by Xuan ZHOU, Chang-chun BAO, Bing-yin XIA

    Published 2013-08-01
    “…A combined wideband speech enhancement method based on a statistical model and empirical mode decomposition (EMD) was proposed. First, the statistical model was used to eliminate the main noise component in noisy speech. Then, the residual noise was further suppressed by a post-processing module, a speech enhancement algorithm with voice activity detection (VAD) based on EMD. The advantages of the two methods were thus combined effectively. The performance of the proposed method was evaluated under the ITU-T G.160 standard. The experimental results indicate that the algorithm improves the SNR more effectively in different noise environments than the classical statistical-model approach. Meanwhile, in low-SNR conditions, musical noise is reduced effectively and the speech sounds more comfortable.…”
    Article
  3.

    Source localization based on time delay estimation in complex environment by Da-wei ZHANG, Chang-chun BAO, Bing-yin XIA

    Published 2014-01-01
    “…To improve the performance of source localization in noisy and reverberant environments, a novel time delay estimation (TDE) method was proposed. This method is called acoustical transfer function ratio based on statistical model (ATFR-SM). In the proposed algorithm, a noise reduction method based on the statistical model was adopted to reduce the effect of noise on the acoustical transfer function (ATF). In the ATF method, the power spectral density (PSD) was smoothed and whitened to reduce the effect of reverberation. Voice activity detection (VAD) was used to distinguish speech periods from noise periods, and the TDE was performed in the speech periods to improve estimation accuracy. Moreover, the proposed TDE method and the linear closed-form method for source localization were combined to constitute a source localization system. The performance evaluation shows that, in both noisy and reverberant conditions, the proposed TDE method achieves a lower percentage of abnormal points (PAP) and a lower root mean square error (RMSE) than the reference methods. Meanwhile, the source localization has higher accuracy than the reference methods.…”
    Article
  4.

    Sistem Identifikasi Pembicara Berbahasa Indonesia Menggunakan X-Vector Embedding [Indonesian-Language Speaker Identification System Using X-Vector Embedding] by Alim Misbullah, Muhammad Saifullah Sani, Husaini, Laina Farsiah, Zahnur, Kikye Martiwi Sukiakhy

    Published 2024-08-01
    “…To build the model, features are extracted using MFCC, voice activity detection (VAD) is computed, the features are augmented and normalized using cepstral mean and variance normalization (CMVN), and filtering is applied. …”
    Article