Authenticity at Risk: Key Factors in the Generation and Detection of Audio Deepfakes

Detecting audio deepfakes is crucial to ensure authenticity and security, especially in contexts where audio veracity can have critical implications, such as in the legal, security or human rights domains. Various elements, such as complex acoustic backgrounds, enhance the realism of deepfakes; howe...

Bibliographic Details
Main Authors:	Alba Martínez-Serrano, Claudia Montero-Ramírez, Carmen Peláez-Moreno
Format:	Article
Language:	English
Published:	MDPI AG 2025-01-01
Series:	Applied Sciences
Subjects:	audio deepfake; generation; detection; acoustic context
Online Access:	https://www.mdpi.com/2076-3417/15/2/558

author	Alba Martínez-Serrano, Claudia Montero-Ramírez, Carmen Peláez-Moreno
collection	DOAJ
description	Detecting audio deepfakes is crucial to ensure authenticity and security, especially in contexts where audio veracity can have critical implications, such as in the legal, security or human rights domains. Various elements, such as complex acoustic backgrounds, enhance the realism of deepfakes; however, their effect on the processes of creation and detection of deepfakes remains under-explored. This study systematically analyses how factors such as the acoustic environment, user type and signal-to-noise ratio influence the quality and detectability of deepfakes. For this study, we use the WELIVE dataset, which contains audio recordings of 14 female victims of gender-based violence in real and uncontrolled environments. The results indicate that the complexity of the acoustic scene affects both the generation and detection of deepfakes: classifiers, particularly the linear SVM, are more effective in complex acoustic environments, suggesting that simpler acoustic environments may facilitate the generation of more realistic deepfakes and, in turn, make it more difficult for classifiers to detect them. These findings underscore the need to develop adaptive models capable of handling diverse acoustic environments, thus improving detection reliability in dynamic and real-world contexts.
format	Article
id	doaj-art-b2b0c19671f64fd4906401abe473d491
institution	Kabale University
issn	2076-3417
language	English
publishDate	2025-01-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj-art-b2b0c19671f64fd4906401abe473d491 | 2025-01-24T13:19:49Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2025-01-01 | vol. 15, no. 2, art. 558 | 10.3390/app15020558 | Authenticity at Risk: Key Factors in the Generation and Detection of Audio Deepfakes | Alba Martínez-Serrano, Claudia Montero-Ramírez, Carmen Peláez-Moreno (all: Signal Theory and Communications Department, University Carlos III of Madrid, 28911 Madrid, Spain) | https://www.mdpi.com/2076-3417/15/2/558 | audio deepfake; generation; detection; acoustic context
title	Authenticity at Risk: Key Factors in the Generation and Detection of Audio Deepfakes
topic	audio deepfake; generation; detection; acoustic context
url	https://www.mdpi.com/2076-3417/15/2/558

Similar Items