BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism
As forgery techniques can now generate highly realistic videos, traditional convolutional neural network (CNN)-based detection models often struggle to capture subtle forgery features and temporal dependencies. Most existing models focus on feature extraction from static frames, neglecting the temporal correlation in videos, which decreases accuracy in detecting dynamic forged videos.
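The record's abstract describes BMNet as a BiLSTM over per-frame facial-landmark features, followed by multi-head self-attention and a real/fake classifier. The sketch below is a minimal PyTorch illustration of that idea, not the authors' released code: the layer sizes, the mean pooling over the attended sequence, and the flattening of the 68 landmarks into a per-frame feature vector are all assumptions made for illustration.

```python
# Minimal sketch of the BMNet idea: per-frame features from 68 facial landmarks
# are fed to a BiLSTM (temporal dependencies between frames), re-weighted by
# multi-head self-attention, and classified as real or fake.
# Layer sizes, pooling, and the 2-D landmark encoding are illustrative assumptions.
import torch
import torch.nn as nn


class BMNetSketch(nn.Module):
    def __init__(self, landmark_dim: int = 68 * 2, hidden: int = 128,
                 heads: int = 4, num_classes: int = 2):
        super().__init__()
        # BiLSTM models forward and backward temporal context across frames.
        self.bilstm = nn.LSTM(input_size=landmark_dim, hidden_size=hidden,
                              batch_first=True, bidirectional=True)
        # Multi-head self-attention lets the model focus on frames (and the
        # landmark regions encoded in them) that carry localized forgery cues.
        self.mhsa = nn.MultiheadAttention(embed_dim=2 * hidden, num_heads=heads,
                                          batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_classes)

    def forward(self, landmarks: torch.Tensor) -> torch.Tensor:
        # landmarks: (batch, frames, 68 * 2) flattened x/y coordinates per frame
        seq, _ = self.bilstm(landmarks)          # (batch, frames, 2 * hidden)
        attended, _ = self.mhsa(seq, seq, seq)   # self-attention over the frames
        pooled = attended.mean(dim=1)            # average over the frame axis
        return self.classifier(pooled)           # real-vs-fake logits


# Dummy usage: a batch of 4 clips, 30 frames each.
model = BMNetSketch()
logits = model(torch.randn(4, 30, 68 * 2))       # shape: (4, 2)
```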
Main Authors: | Demao Xiong, Zhan Wen, Cheng Zhang, Dehao Ren, Wenzao Li |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2025-01-01 |
Series: | IEEE Access |
Subjects: | Deepfake; deepfake video detection; bi-directional long short-term memory network; multi-head self-attention mechanism |
Online Access: | https://ieeexplore.ieee.org/document/10852294/ |
author | Demao Xiong; Zhan Wen; Cheng Zhang; Dehao Ren; Wenzao Li |
---|---|
collection | DOAJ |
description | As forgery techniques can now generate highly realistic videos, traditional convolutional neural network (CNN)-based detection models often struggle to capture subtle forgery features and temporal dependencies. Most existing models focus on feature extraction from static frames, neglecting the temporal correlation in videos, which decreases accuracy in detecting dynamic forged videos. Furthermore, the ability to detect localized forgery features remains insufficient. To address these limitations, we propose a deep forgery detection framework named the BiLSTM Multi-Head Self-Attention Network (BMNet). By leveraging a Bi-Directional Long Short-Term Memory network (BiLSTM) to model temporal dependencies between video frames and a Multi-Head Self-Attention mechanism (MHSA) to capture features from different regions of an image, BMNet identifies dynamic and local forgery features more effectively. In our experiments, we extract features from 68 facial landmarks in each video frame and evaluate detection performance on four datasets: UADFV, FF++, Celeb-DF, and DFDC. The results show significant improvements over traditional methods. Ablation studies further validate the necessity of each network component, and BMNet achieves accuracies of 95.54%, 92.18%, 80.20%, and 84.72% on the FF++, UADFV, Celeb-DF, and DFDC datasets, respectively, indicating its superior performance in deep forgery detection. |
format | Article |
id | doaj-art-21db568b6e754843b8ec95dfc2196bbb |
institution | Kabale University |
issn | 2169-3536 |
language | English |
publishDate | 2025-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | IEEE Access, vol. 13, pp. 21547-21556, published 2025-01-01; DOI: 10.1109/ACCESS.2025.3533653 (IEEE document 10852294); ISSN 2169-3536. Authors: Demao Xiong (ORCID 0009-0000-9469-5381), Zhan Wen (ORCID 0000-0002-6717-4289), Cheng Zhang (ORCID 0009-0004-8939-6204), Dehao Ren, Wenzao Li (ORCID 0000-0001-5986-1073), all with the School of Communication Engineering, Chengdu University of Information Technology, Chengdu, China. DOAJ record doaj-art-21db568b6e754843b8ec95dfc2196bbb, last updated 2025-02-06T00:00:22Z. Full text: https://ieeexplore.ieee.org/document/10852294/ |
title | BMNet: Enhancing Deepfake Detection Through BiLSTM and Multi-Head Self-Attention Mechanism |
topic | Deepfake; deepfake video detection; bi-directional long short-term memory network; multi-head self-attention mechanism |
url | https://ieeexplore.ieee.org/document/10852294/ |
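The abstract above states that features are extracted from 68 facial landmarks in every video frame, but the record does not specify which landmark detector is used. As a hedged illustration of that preprocessing step only, the following sketch uses dlib's standard 68-point shape predictor with OpenCV frame decoding; the model file path and the choice of dlib are assumptions, not the paper's documented pipeline.

```python
# Illustrative preprocessing: extract 68 facial landmarks per frame with dlib.
# The paper does not state which landmark detector BMNet uses; dlib's 68-point
# shape predictor is assumed here purely for demonstration.
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# Path to the pretrained 68-point model (assumed to be available locally).
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")


def landmark_sequence(video_path: str) -> np.ndarray:
    """Return an (n_frames, 68 * 2) array of landmark coordinates for one video."""
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        faces = detector(gray)
        if not faces:
            continue  # skip frames where no face is detected
        shape = predictor(gray, faces[0])
        coords = [(p.x, p.y) for p in shape.parts()]           # 68 (x, y) pairs
        frames.append(np.asarray(coords, dtype=np.float32).ravel())
    cap.release()
    return np.stack(frames) if frames else np.empty((0, 68 * 2), dtype=np.float32)
```

A sequence produced this way has the (frames, 136) shape expected by the model sketch shown earlier in this record.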