Dataset Dependency in CNN-Based Copy-Move Forgery Detection: A Multi-Dataset Comparative Analysis
Convolutional neural networks (CNNs) have become a fundamental tool for copy-move forgery detection because of their ability to identify and analyze manipulated images. Nevertheless, copy-move forgery remains a persistent challenge in digital image forens...
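| Main Authors: | Potito Valle Dell’Olmo, Oleksandr Kuznetsov, Emanuele Frontoni, Marco Arnesano, Christian Napoli, Cristian Randieri |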
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-06-01 |
| Series: | Machine Learning and Knowledge Extraction |
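| Subjects: | copy-move forgery detection; convolutional neural networks; digital image forensics; dataset dependency; regularization techniques; data augmentation |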
| Online Access: | https://www.mdpi.com/2504-4990/7/2/54 |
| _version_ | 1849432054358343680 |
|---|---|
| author | Potito Valle Dell’Olmo; Oleksandr Kuznetsov; Emanuele Frontoni; Marco Arnesano; Christian Napoli; Cristian Randieri |
| author_sort | Potito Valle Dell’Olmo |
| collection | DOAJ |
| description | Convolutional neural networks (CNNs) have become a fundamental tool for copy-move forgery detection because of their ability to identify and analyze manipulated images. Nevertheless, copy-move forgery remains a persistent challenge in digital image forensics, underlining the importance of ensuring the integrity of digital visual content. In this study, we present a systematic evaluation of a CNN specifically designed for copy-move manipulation detection, applied to three datasets widely used in the digital forensics literature: CoMoFoD, Coverage, and CASIA v2. Our experimental analysis revealed significant variability in the results, with accuracy ranging from 95.90% on CoMoFoD to 27.50% on Coverage. We attribute this variability to structural factors of the datasets, such as sample size, the degree of class imbalance, and the intrinsic complexity of the manipulations. We also investigated different regularization techniques and data augmentation strategies to understand their impact on network performance, finding that adopting the L2 penalty and reducing the learning rate increased accuracy by up to 2.5% on CASIA v2, while on CoMoFoD the impact was much more modest (1.3%). Similarly, data augmentation improved performance on large datasets but was ineffective on smaller ones. Our results challenge the idea of universal generalizability of CNN architectures in copy-move forgery detection, highlighting instead how performance is strictly dependent on the intrinsic characteristics of the dataset under consideration. Finally, we propose a series of operational recommendations for optimizing the training process, the choice of dataset, and the definition of robust evaluation protocols, aimed at guiding the development of detection systems that are more reliable and generalizable. |
| format | Article |
| id | doaj-art-2dbd2002ebad401e848fb14df9b7a99f |
| institution | Kabale University |
| issn | 2504-4990 |
| language | English |
| publishDate | 2025-06-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Machine Learning and Knowledge Extraction |
| spelling | doaj-art-2dbd2002ebad401e848fb14df9b7a99f; 2025-08-20T03:27:28Z; eng; MDPI AG; Machine Learning and Knowledge Extraction; ISSN 2504-4990; 2025-06-01; vol. 7, no. 2, art. 54; DOI 10.3390/make7020054; Dataset Dependency in CNN-Based Copy-Move Forgery Detection: A Multi-Dataset Comparative Analysis; Potito Valle Dell’Olmo (Department of Theoretical and Applied Sciences, eCampus University, Via Isimbardi 10, 22060 Novedrate, Italy); Oleksandr Kuznetsov (Department of Theoretical and Applied Sciences, eCampus University, Via Isimbardi 10, 22060 Novedrate, Italy); Emanuele Frontoni (Department of Political Sciences, Communication and International Relations, University of Macerata, Via Crescimbeni, 30/32, 62100 Macerata, Italy); Marco Arnesano (Department of Theoretical and Applied Sciences, eCampus University, Via Isimbardi 10, 22060 Novedrate, Italy); Christian Napoli (Department of Computer, Control, and Management Engineering “Antonio Ruberti”, Sapienza University of Rome, V. Ariosto 25, 00185 Rome, Italy); Cristian Randieri (Department of Theoretical and Applied Sciences, eCampus University, Via Isimbardi 10, 22060 Novedrate, Italy); https://www.mdpi.com/2504-4990/7/2/54; copy-move forgery detection; convolutional neural networks; digital image forensics; dataset dependency; regularization techniques; data augmentation |
| title | Dataset Dependency in CNN-Based Copy-Move Forgery Detection: A Multi-Dataset Comparative Analysis |
| topic | copy-move forgery detection; convolutional neural networks; digital image forensics; dataset dependency; regularization techniques; data augmentation |
| url | https://www.mdpi.com/2504-4990/7/2/54 |