A Spatio-Temporal Deep Learning Approach for Efficient Deepfake Video Detection

Deepfake videos have become a major concern in the modern digital media landscape because they undermine the legitimacy of information and communication channels. Humans often find it difficult to distinguish a fake video from a genuine one because of the increasing r...

Bibliographic Details
Main Authors:	Raman Z. Khudhur, Marwan A. Mohammed
Format:	Article
Language:	English
Published:	Koya University 2025-08-01
Series:	ARO-The Scientific Journal of Koya University
Subjects:	Deep learning, DeepFake detection, EfficientNet, Spatio-temporal Modeling
Online Access:	https://test.koyauniversity.org/index.php/aro/article/view/2190

collection	DOAJ
description	Deepfake videos have become a major concern in the modern digital media landscape because they undermine the legitimacy of information and communication channels. Humans often find it difficult to distinguish a fake video from a genuine one because of the increasing realism of facial deepfakes. Identifying this misleading material is the first step in preventing deepfakes from spreading through social media. This work introduces the Spatio-temporal Intelligent Deepfake Detector (STIDD), a deep learning system that combines enhanced spatial and temporal modeling techniques. The proposed framework uses a pre-trained EfficientNetV2-B0 model to extract spatial features from each frame; Bidirectional Long Short-Term Memory (BiLSTM) layers then capture temporal relationships across the video sequence. We evaluate STIDD on the FaceForensics++ (FF++) dataset, covering all five manipulation techniques (DeepFakes, FaceSwap, Face2Face, FaceShifter, and NeuralTextures). STIDD achieves precision, recall, and F1-scores all above 0.99 and a final test accuracy of 99.51% on the combined FF++ test set. These results show that combining EfficientNetV2-B0 spatial feature extraction with bidirectional temporal modeling lets STIDD reach high detection performance while remaining computationally efficient, at just 0.39 giga floating-point operations (GFLOPs) per inference.
id	doaj-art-4bbfbd498c0b4da78da6dc65d722a8be
institution	Kabale University
issn	2410-9355 2307-549X
spelling	doaj-art-4bbfbd498c0b4da78da6dc65d722a8be; 2025-08-20T03:44:33Z; eng; Koya University; ARO-The Scientific Journal of Koya University; 2410-9355; 2307-549X; 2025-08-01; 13; 2; 10.14500/aro.12190; A Spatio-Temporal Deep Learning Approach for Efficient Deepfake Video Detection
	Raman Z. Khudhur; https://orcid.org/0009-0002-3856-2095; Department of Software Engineering, College of Engineering, Salahaddin University, Erbil, Kurdistan Region – F.R. Iraq
	Marwan A. Mohammed; https://orcid.org/0000-0001-9072-3672; Department of Computer Engineering, College of Engineering, Knowledge University, Erbil, Kurdistan Region – F.R. Iraq
	https://test.koyauniversity.org/index.php/aro/article/view/2190; Deep learning; DeepFake detection; EfficientNet; Spatio-temporal Modeling
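
The description above reports precision, recall, F1-score, and test accuracy for STIDD. As a reminder of how these four figures relate to one another, here is a minimal Python sketch that derives them from binary confusion-matrix counts. The counts in the example are hypothetical values chosen for illustration; they are not taken from the article, and the names ConfusionCounts and classification_metrics are assumptions made for this sketch only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with "fake" as the positive class."""
    tp: int  # fake videos flagged as fake
    fp: int  # real videos flagged as fake
    fn: int  # fake videos flagged as real
    tn: int  # real videos flagged as real


def safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute precision, recall, F1-score, and accuracy from the counts."""
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    accuracy = safe_div(c.tp + c.tn, c.tp + c.fp + c.fn + c.tn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only (NOT results from the article).
example = ConfusionCounts(tp=995, fp=4, fn=5, tn=996)
scores = classification_metrics(example)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because F1 is the harmonic mean of precision and recall, it can only exceed 0.99 when both of those scores are themselves close to or above 0.99, which is consistent with the three figures being reported together.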
