Dual-Channel Deepfake Audio Detection: Leveraging Direct and Reverberant Waveforms
Deepfake content (including audio, video, images, and text) synthesized or modified using artificial intelligence is designed to convincingly mimic real content. As deepfake generation technology advances, detecting deepfake content presents significant challenges. While recent progress has been made...
Main Authors: | Gunwoo Lee, Jungmin Lee, Minkyo Jung, Joseph Lee, Kihun Hong, Souhwan Jung, Yoseob Han |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2025-01-01 |
Series: | IEEE Access |
Subjects: | Deepfake audio detection, dual-channel data, direct waveform, reverberant waveform |
Online Access: | https://ieeexplore.ieee.org/document/10849546/ |
_version_ | 1832576759515578368 |
---|---|
author | Gunwoo Lee Jungmin Lee Minkyo Jung Joseph Lee Kihun Hong Souhwan Jung Yoseob Han |
author_facet | Gunwoo Lee Jungmin Lee Minkyo Jung Joseph Lee Kihun Hong Souhwan Jung Yoseob Han |
author_sort | Gunwoo Lee |
collection | DOAJ |
description | Deepfake content (including audio, video, images, and text) synthesized or modified using artificial intelligence is designed to convincingly mimic real content. As deepfake generation technology advances, detecting deepfake content presents significant challenges. While recent progress has been made in detection techniques, identifying deepfake audio remains particularly challenging. Previous approaches have attempted to capture deepfake features by combining video and audio content; however, these methods are ineffective when video and audio are mismatched due to occlusion. To address this, we propose a novel dual-channel deepfake audio detection model that leverages the direct and reverberant components extracted from raw audio signals, focusing exclusively on audio-based detection without reliance on video content. Across various datasets, including ASVspoof2019, FakeAVCeleb, and sport press conference datasets collected by our group, the proposed dual-channel model demonstrates significant improvements in quantitative metrics such as equal error rate and area under the curve. The implementation is available at https://github.com/gunwoo5034/Dual-Channel-Audio-Deepfake-Detection. |
format | Article |
id | doaj-art-657c7498ddb84b75ae49abf1828b682a |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj-art-657c7498ddb84b75ae49abf1828b682a; 2025-01-31T00:01:39Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 18040-18052; DOI 10.1109/ACCESS.2025.3532775; IEEE document 10849546; Dual-Channel Deepfake Audio Detection: Leveraging Direct and Reverberant Waveforms; Authors: Gunwoo Lee [0] (https://orcid.org/0009-0008-4185-5626), Jungmin Lee [1], Minkyo Jung [2], Joseph Lee [3] (https://orcid.org/0009-0007-3817-0514), Kihun Hong [4] (https://orcid.org/0000-0002-5538-3630), Souhwan Jung [5] (https://orcid.org/0000-0003-2676-3412), Yoseob Han [6] (https://orcid.org/0000-0002-0382-7826); Affiliations, in the order listed: School of Electronic Engineering, Soongsil University, Seoul, Republic of Korea; School of Electronic Engineering, Soongsil University, Seoul, Republic of Korea; School of Electronic Engineering, Soongsil University, Seoul, Republic of Korea; Department of Statistics and Actuarial Science, Soongsil University, Seoul, Republic of Korea; School of Electronic Engineering, Soongsil University, Seoul, Republic of Korea; School of Electronic Engineering, Soongsil University, Seoul, Republic of Korea; School of Electronic Engineering, Soongsil University, Seoul, Republic of Korea; Abstract: Deepfake content (including audio, video, images, and text) synthesized or modified using artificial intelligence is designed to convincingly mimic real content. As deepfake generation technology advances, detecting deepfake content presents significant challenges. While recent progress has been made in detection techniques, identifying deepfake audio remains particularly challenging. Previous approaches have attempted to capture deepfake features by combining video and audio content; however, these methods are ineffective when video and audio are mismatched due to occlusion. To address this, we propose a novel dual-channel deepfake audio detection model that leverages the direct and reverberant components extracted from raw audio signals, focusing exclusively on audio-based detection without reliance on video content. Across various datasets, including ASVspoof2019, FakeAVCeleb, and sport press conference datasets collected by our group, the proposed dual-channel model demonstrates significant improvements in quantitative metrics such as equal error rate and area under the curve. The implementation is available at https://github.com/gunwoo5034/Dual-Channel-Audio-Deepfake-Detection; https://ieeexplore.ieee.org/document/10849546/; Keywords: Deepfake audio detection, dual-channel data, direct waveform, reverberant waveform |
spellingShingle | Gunwoo Lee Jungmin Lee Minkyo Jung Joseph Lee Kihun Hong Souhwan Jung Yoseob Han Dual-Channel Deepfake Audio Detection: Leveraging Direct and Reverberant Waveforms IEEE Access Deepfake audio detection dual-channel data direct waveform reverberant waveform |
title | Dual-Channel Deepfake Audio Detection: Leveraging Direct and Reverberant Waveforms |
title_full | Dual-Channel Deepfake Audio Detection: Leveraging Direct and Reverberant Waveforms |
title_fullStr | Dual-Channel Deepfake Audio Detection: Leveraging Direct and Reverberant Waveforms |
title_full_unstemmed | Dual-Channel Deepfake Audio Detection: Leveraging Direct and Reverberant Waveforms |
title_short | Dual-Channel Deepfake Audio Detection: Leveraging Direct and Reverberant Waveforms |
title_sort | dual channel deepfake audio detection leveraging direct and reverberant waveforms |
topic | Deepfake audio detection dual-channel data direct waveform reverberant waveform |
url | https://ieeexplore.ieee.org/document/10849546/ |
work_keys_str_mv | AT gunwoolee dualchanneldeepfakeaudiodetectionleveragingdirectandreverberantwaveforms AT jungminlee dualchanneldeepfakeaudiodetectionleveragingdirectandreverberantwaveforms AT minkyojung dualchanneldeepfakeaudiodetectionleveragingdirectandreverberantwaveforms AT josephlee dualchanneldeepfakeaudiodetectionleveragingdirectandreverberantwaveforms AT kihunhong dualchanneldeepfakeaudiodetectionleveragingdirectandreverberantwaveforms AT souhwanjung dualchanneldeepfakeaudiodetectionleveragingdirectandreverberantwaveforms AT yoseobhan dualchanneldeepfakeaudiodetectionleveragingdirectandreverberantwaveforms |