BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism
As forgery techniques can now generate highly realistic videos, traditional convolutional neural network (CNN)-based detection models often struggle to capture subtle forgery features and temporal dependencies. Most existing models focus on feature extraction from static frames, neglecting the temporal correlation in videos, which decreases accuracy in detecting dynamic forged videos.
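The record's abstract describes BMNet as a BiLSTM over per-frame facial-landmark features, followed by multi-head self-attention and a real/fake classifier. The sketch below is a minimal PyTorch illustration of that idea, not the authors' released code: the layer sizes, the mean pooling over the attended sequence, and the flattening of the 68 landmarks into a per-frame feature vector are all assumptions made for illustration.

```python
# Minimal sketch of the BMNet idea: per-frame features from 68 facial landmarks
# are fed to a BiLSTM (temporal dependencies between frames), re-weighted by
# multi-head self-attention, and classified as real or fake.
# Layer sizes, pooling, and the 2-D landmark encoding are illustrative assumptions.
import torch
import torch.nn as nn


class BMNetSketch(nn.Module):
    def __init__(self, landmark_dim: int = 68 * 2, hidden: int = 128,
                 heads: int = 4, num_classes: int = 2):
        super().__init__()
        # BiLSTM models forward and backward temporal context across frames.
        self.bilstm = nn.LSTM(input_size=landmark_dim, hidden_size=hidden,
                              batch_first=True, bidirectional=True)
        # Multi-head self-attention lets the model focus on frames (and the
        # landmark regions encoded in them) that carry localized forgery cues.
        self.mhsa = nn.MultiheadAttention(embed_dim=2 * hidden, num_heads=heads,
                                          batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, landmarks: torch.Tensor) -> torch.Tensor:
        # landmarks: (batch, frames, 68 * 2) flattened x/y coordinates per frame
        seq, _ = self.bilstm(landmarks)          # (batch, frames, 2 * hidden)
        attended, _ = self.mhsa(seq, seq, seq)   # self-attention over the frames
        pooled = attended.mean(dim=1)            # average over the frame axis
        return self.classifier(pooled)           # real-vs-fake logits


# Dummy usage: a batch of 4 clips, 30 frames each.
model = BMNetSketch()
logits = model(torch.randn(4, 30, 68 * 2))       # shape: (4, 2)
```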
Main Authors: | Demao Xiong, Zhan Wen, Cheng Zhang, Dehao Ren, Wenzao Li |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | Deepfake; deepfake video detection; bi-directional long short-term memory network; multi-head self-attention mechanism |
Online Access: | https://ieeexplore.ieee.org/document/10852294/ |
author | Demao Xiong; Zhan Wen; Cheng Zhang; Dehao Ren; Wenzao Li |
---|---|
collection | DOAJ |
description | As forgery techniques can now generate highly realistic videos, traditional convolutional neural network (CNN)-based detection models often struggle to capture subtle forgery features and temporal dependencies. Most existing models focus on feature extraction from static frames, neglecting the temporal correlation in videos, which decreases accuracy in detecting dynamic forged videos. Furthermore, the ability to detect localized forgery features remains insufficient. To address these limitations, we propose a deep forgery detection framework named the BiLSTM Multi-Head Self-Attention Network (BMNet). By leveraging a Bi-Directional Long Short-Term Memory network (BiLSTM) to model temporal dependencies between video frames and a Multi-Head Self-Attention mechanism (MHSA) to capture features from different regions of an image, BMNet identifies dynamic and local forgery features more effectively. In our experiments, we extract features from 68 facial landmarks in each video frame and evaluate detection performance on four datasets: UADFV, FF++, Celeb-DF, and DFDC. The results show significant improvements over traditional methods. Ablation studies further validate the necessity of each network component, and BMNet achieves accuracies of 95.54%, 92.18%, 80.20%, and 84.72% on the FF++, UADFV, Celeb-DF, and DFDC datasets, respectively, indicating its superior performance in deep forgery detection. |
format | Article |
id | doaj-art-21db568b6e754843b8ec95dfc2196bbb |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | IEEE Access, vol. 13, pp. 21547-21556, published 2025-01-01; DOI: 10.1109/ACCESS.2025.3533653 (IEEE document 10852294); ISSN 2169-3536. Authors: Demao Xiong (ORCID 0009-0000-9469-5381), Zhan Wen (ORCID 0000-0002-6717-4289), Cheng Zhang (ORCID 0009-0004-8939-6204), Dehao Ren, Wenzao Li (ORCID 0000-0001-5986-1073), all with the School of Communication Engineering, Chengdu University of Information Technology, Chengdu, China. DOAJ record doaj-art-21db568b6e754843b8ec95dfc2196bbb, last updated 2025-02-06T00:00:22Z. Full text: https://ieeexplore.ieee.org/document/10852294/ |
title | BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism |
topic | Deepfake; deepfake video detection; bi-directional long short-term memory network; multi-head self-attention mechanism |
url | https://ieeexplore.ieee.org/document/10852294/ |
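The abstract above states that features are extracted from 68 facial landmarks in every video frame, but the record does not specify which landmark detector is used. As a hedged illustration of that preprocessing step only, the following sketch uses dlib's standard 68-point shape predictor with OpenCV frame decoding; the model file path and the choice of dlib are assumptions, not the paper's documented pipeline.

```python
# Illustrative preprocessing: extract 68 facial landmarks per frame with dlib.
# The paper does not state which landmark detector BMNet uses; dlib's 68-point
# shape predictor is assumed here purely for demonstration.
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# Path to the pretrained 68-point model (assumed to be available locally).
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")


def landmark_sequence(video_path: str) -> np.ndarray:
    """Return an (n_frames, 68 * 2) array of landmark coordinates for one video."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector(gray)
        if not faces:
            continue  # skip frames where no face is detected
        shape = predictor(gray, faces[0])
        coords = [(p.x, p.y) for p in shape.parts()]           # 68 (x, y) pairs
        frames.append(np.asarray(coords, dtype=np.float32).ravel())
    cap.release()
    return np.stack(frames) if frames else np.empty((0, 68 * 2), dtype=np.float32)
```

A sequence produced this way has the (frames, 136) shape expected by the model sketch shown earlier in this record.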