BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism

Bibliographic Details
Main Authors: Demao Xiong, Zhan Wen, Cheng Zhang, Dehao Ren, Wenzao Li
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Deepfake; deepfake video detection; bi-directional long short-term memory network; multi-head self-attention mechanism
Online Access: https://ieeexplore.ieee.org/document/10852294/
_version_ 1832087939300982784
author Demao Xiong
Zhan Wen
Cheng Zhang
Dehao Ren
Wenzao Li
author_facet Demao Xiong
Zhan Wen
Cheng Zhang
Dehao Ren
Wenzao Li
author_sort Demao Xiong
collection DOAJ
description As forgery techniques can now generate highly realistic videos, traditional convolutional neural network (CNN)-based detection models often struggle to capture subtle forgery features and temporal dependencies. Most existing models focus on feature extraction from static frames and neglect the temporal correlation in videos, which reduces accuracy in detecting dynamically forged videos. Furthermore, their ability to detect localized forgery features remains insufficient. To address these limitations, we propose a deep forgery detection framework named the BiLSTM Multi-Head Self-Attention Network (BMNet). By leveraging a Bi-Directional Long Short-Term Memory network (BiLSTM) to model temporal dependencies between video frames and a Multi-Head Self-Attention mechanism (MHSA) to capture features from different regions of an image, BMNet identifies dynamic and local forgery features more effectively. In our experiments, we extract features from 68 facial landmarks in each video frame and evaluate detection performance on four datasets: UADFV, FF++, Celeb-DF, and DFDC. The results show significant improvements over traditional methods. We further validate the necessity of each network component through ablation studies. BMNet achieves accuracies of 95.54%, 92.18%, 80.20%, and 84.72% on the FF++, UADFV, Celeb-DF, and DFDC datasets, respectively, demonstrating its superior performance in deep forgery detection.
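To make the described pipeline concrete, below is a minimal sketch of a BMNet-style detector in PyTorch: per-frame features built from 68 facial landmarks are passed through a BiLSTM to model temporal dependencies, and multi-head self-attention re-weights the resulting sequence before a small classification head. This is an illustrative assumption, not the authors' released implementation: the name BMNetSketch, the hidden sizes, the attention being applied over the frame sequence (the paper describes attention over image regions), and the mean-pooling head are all placeholders chosen for brevity.

# Illustrative BMNet-style sketch (assumed PyTorch implementation; layer sizes
# and the classification head are not taken from the paper).
import torch
import torch.nn as nn

class BMNetSketch(nn.Module):
    def __init__(self, landmark_dim=68 * 2, lstm_hidden=128, num_heads=4):
        super().__init__()
        # BiLSTM models temporal dependencies across the frame sequence.
        self.bilstm = nn.LSTM(input_size=landmark_dim, hidden_size=lstm_hidden,
                              batch_first=True, bidirectional=True)
        # Multi-head self-attention re-weights the BiLSTM outputs so the model
        # can emphasize locally inconsistent (forged) frames/features.
        self.mhsa = nn.MultiheadAttention(embed_dim=2 * lstm_hidden,
                                          num_heads=num_heads, batch_first=True)
        self.classifier = nn.Sequential(
            nn.LayerNorm(2 * lstm_hidden),
            nn.Linear(2 * lstm_hidden, 1),  # single real-vs-fake logit
        )

    def forward(self, x):
        # x: (batch, frames, landmark_dim), e.g. 68 (x, y) landmarks per frame.
        h, _ = self.bilstm(x)           # (batch, frames, 2 * lstm_hidden)
        a, _ = self.mhsa(h, h, h)       # self-attention over the frame sequence
        pooled = a.mean(dim=1)          # average over the temporal axis
        return self.classifier(pooled)  # (batch, 1) logit

# Example: a batch of 8 clips, 30 frames each, 68 (x, y) landmarks per frame.
model = BMNetSketch()
logits = model(torch.randn(8, 30, 68 * 2))
print(logits.shape)  # torch.Size([8, 1])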
format Article
id doaj-art-21db568b6e754843b8ec95dfc2196bbb
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-21db568b6e754843b8ec95dfc2196bbb
2025-02-06T00:00:22Z | eng | IEEE | IEEE Access | ISSN 2169-3536 | 2025-01-01 | vol. 13, pp. 21547-21556 | DOI 10.1109/ACCESS.2025.3533653 | IEEE document 10852294
BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism
Demao Xiong (https://orcid.org/0009-0000-9469-5381), Zhan Wen (https://orcid.org/0000-0002-6717-4289), Cheng Zhang (https://orcid.org/0009-0004-8939-6204), Dehao Ren, Wenzao Li (https://orcid.org/0000-0001-5986-1073); all authors: School of Communication Engineering, Chengdu University of Information Technology, Chengdu, China
https://ieeexplore.ieee.org/document/10852294/
Deepfake; deepfake video detection; bi-directional long short-term memory network; multi-head self-attention mechanism
spellingShingle Demao Xiong
Zhan Wen
Cheng Zhang
Dehao Ren
Wenzao Li
BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism
IEEE Access
Deepfake
deepfake video detection
bi-directional long short-term memory network
multi-head self-attention mechanism
title BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism
title_full BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism
title_fullStr BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism
title_full_unstemmed BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism
title_short BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism
title_sort bmnet enhancing deepfake detection through bilstm and multi head self attention mechanism
topic Deepfake
deepfake video detection
bi-directional long short-term memory network
multi-head self-attention mechanism
url https://ieeexplore.ieee.org/document/10852294/
work_keys_str_mv AT demaoxiong bmnetenhancingdeepfakedetectionthroughbilstmandmultiheadselfattentionmechanism
AT zhanwen bmnetenhancingdeepfakedetectionthroughbilstmandmultiheadselfattentionmechanism
AT chengzhang bmnetenhancingdeepfakedetectionthroughbilstmandmultiheadselfattentionmechanism
AT dehaoren bmnetenhancingdeepfakedetectionthroughbilstmandmultiheadselfattentionmechanism
AT wenzaoli bmnetenhancingdeepfakedetectionthroughbilstmandmultiheadselfattentionmechanism