Novel transfer learning based acoustic feature engineering for scene fake audio detection

Bibliographic Details
Main Authors: Ahmad Sami Al-Shamayleh, Hafsa Riasat, Ala Saleh Alluhaidan, Ali Raza, Sahar A. El-Rahman, Diaa Salama AbdElminaam
Format: Article
Language: English
Published: Nature Portfolio 2025-03-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-025-93032-2
Description
Summary: Audio forensics plays a major role in the investigation and analysis of audio recordings for legal and security purposes. Fake audio attacks that combine speech with scene-manipulated audio pose a sophisticated challenge for fake audio detection. Fake audio detection, a critical technology in modern digital security, addresses the growing threat of manipulated audio content across applications including media, legal evidence, and cybersecurity. This research proposes a novel transfer learning approach for fake audio detection. We used a benchmark dataset, SceneFake, that contains 12,668 audio signal files covering both real and fake scenes. The proposed method first extracts mel-frequency cepstral coefficients (MFCC) and then derives class prediction probability features from them. The resulting transfer feature set, generated by the proposed MfC-RF (MFCC-Random Forest) method, is used in the subsequent experiments. The results show that the random forest method using the MfC-RF features outperformed existing state-of-the-art methods, achieving an accuracy of 0.98. We tuned the hyperparameters of the applied machine learning approaches and used cross-validation to validate the performance results. In addition, the computational complexity is measured. The proposed research aims to improve the accuracy and efficiency of identifying manipulated audio content, thereby contributing to the integrity and reliability of digital communications.
ISSN: 2045-2322
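
The two-stage idea in the summary (MFCC features first, then class prediction probabilities used as a new feature set for a random forest) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state how the MFCCs are summarized per clip, which library was used, or which hyperparameters were tuned. The sketch assumes scikit-learn and NumPy, uses synthetic 13-dimensional vectors as a stand-in for per-clip MFCC summaries rather than SceneFake audio, and all variable names and settings are hypothetical. It does not reproduce the reported accuracy of 0.98.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict, train_test_split

# Synthetic stand-in for per-clip MFCC summaries (assumption: 13 coefficients
# per clip). Real use would compute MFCCs from the audio files instead.
rng = np.random.default_rng(0)
n_per_class, n_mfcc = 100, 13
real_clips = rng.normal(0.0, 1.0, size=(n_per_class, n_mfcc))
fake_clips = rng.normal(0.75, 1.0, size=(n_per_class, n_mfcc))
X_mfcc = np.vstack([real_clips, fake_clips])
y = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = real, 1 = fake

X_train, X_test, y_train, y_test = train_test_split(
    X_mfcc, y, test_size=0.25, stratify=y, random_state=0
)

# Stage 1: a random forest turns MFCC vectors into class-probability features.
# Out-of-fold predictions are used for the training clips, so each clip's
# probabilities come from a model that was not fitted on that clip.
extractor = RandomForestClassifier(n_estimators=50, random_state=0)
train_prob = cross_val_predict(
    extractor, X_train, y_train, cv=5, method="predict_proba"
)

# For unseen clips, the extractor is fitted on all training clips.
extractor.fit(X_train, y_train)
test_prob = extractor.predict_proba(X_test)

# Stage 2: a second random forest is trained on the probability features only.
final_model = RandomForestClassifier(n_estimators=50, random_state=0)
final_model.fit(train_prob, y_train)
accuracy = final_model.score(test_prob, y_test)

print(f"probability feature shape: {train_prob.shape}")
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

Using out-of-fold probabilities for the training clips is a design choice of this sketch, made to limit label leakage into the second stage. Whether the published method does the same is not stated in this record.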