Dual-Channel Deepfake Audio Detection: Leveraging Direct and Reverberant Waveforms
Main Authors: |  |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | |
Online Access: | https://ieeexplore.ieee.org/document/10849546/ |
Summary: | Deepfake content, including audio, video, images, and text synthesized or modified using artificial intelligence, is designed to convincingly mimic real content. As deepfake generation technology advances, detecting deepfake content presents significant challenges. While recent progress has been made in detection techniques, identifying deepfake audio remains particularly challenging. Previous approaches have attempted to capture deepfake features by combining video and audio content; however, these methods are ineffective when video and audio are mismatched due to occlusion. To address this, we propose a novel dual-channel deepfake audio detection model that leverages the direct and reverberant components extracted from raw audio signals, focusing exclusively on audio-based detection without reliance on video content. Across various datasets, including ASVspoof2019, FakeAVCeleb, and sports press conference datasets collected by our group, the proposed dual-channel model demonstrates significant improvements in quantitative metrics such as equal error rate and area under the curve. The implementation is available at https://github.com/gunwoo5034/Dual-Channel-Audio-Deepfake-Detection. |
ISSN: | 2169-3536 |
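
As a rough illustration of the record's core idea (a detector with separate branches for the direct-path and reverberant parts of a waveform), the sketch below shows a minimal two-branch PyTorch model. It assumes a simple 1-D CNN encoder per channel, late fusion by concatenation, and random placeholder inputs in place of a real direct/reverberant decomposition; it is not the authors' implementation, which is in the linked repository.

```python
import torch
import torch.nn as nn


class DualChannelDetector(nn.Module):
    """Illustrative two-branch detector: one 1-D CNN encoder per input
    channel (direct-path and reverberant waveform), late fusion by
    concatenation, and a binary bona fide / spoof classifier head."""

    def __init__(self, hidden: int = 64):
        super().__init__()

        def branch() -> nn.Sequential:
            return nn.Sequential(
                nn.Conv1d(1, hidden, kernel_size=129, stride=4, padding=64),
                nn.BatchNorm1d(hidden),
                nn.ReLU(),
                nn.Conv1d(hidden, hidden, kernel_size=65, stride=4, padding=32),
                nn.BatchNorm1d(hidden),
                nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),  # pool over time -> one vector per clip
            )

        self.direct_branch = branch()
        self.reverb_branch = branch()
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),  # single logit: real vs. fake
        )

    def forward(self, direct: torch.Tensor, reverberant: torch.Tensor) -> torch.Tensor:
        # direct, reverberant: (batch, samples) raw waveforms
        d = self.direct_branch(direct.unsqueeze(1)).squeeze(-1)
        r = self.reverb_branch(reverberant.unsqueeze(1)).squeeze(-1)
        return self.classifier(torch.cat([d, r], dim=-1))  # (batch, 1) logit


if __name__ == "__main__":
    # Placeholder inputs: 4 s of 16 kHz audio. A real pipeline would first
    # decompose each recording into direct and reverberant estimates
    # (e.g. with a dereverberation front end), which is not shown here.
    direct_est = torch.randn(2, 64000)
    reverb_est = torch.randn(2, 64000)
    model = DualChannelDetector()
    print(model(direct_est, reverb_est).shape)  # torch.Size([2, 1])
```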