An Efficient Dropout for Robust Deep Neural Networks


Bibliographic Details
Main Authors: Yavuz Çapkan, Aydın Yeşildirek
Format: Article
Language:English
Published: MDPI AG 2025-07-01
Series:Applied Sciences
Subjects:
Online Access:https://www.mdpi.com/2076-3417/15/15/8301
Description
Summary: Overfitting remains a major difficulty in training deep neural networks, especially when good generalization is required in complex classification tasks. Standard dropout is often employed to address this issue; however, its uniform random deactivation of neurons can cause training instability and yields only modest performance gains. This paper proposes an improved regularization technique that combines adaptive sigmoidal dropout with weight amplification, dynamically adjusting neuron deactivation according to weight statistics, activation patterns, and each neuron's history. The dropout process uses a temperature-controlled sigmoid function to determine the deactivation probability and incorporates a "neuron recovery" step that restores important activations. Simultaneously, the method amplifies high-magnitude weights to emphasize salient features during learning. The approach is evaluated on the CIFAR-10 and CIFAR-100 datasets using four distinct CNN architectures, including deep and residual-based models. Results show that the proposed technique consistently outperforms both standard dropout and baseline models without dropout, yielding higher validation accuracy and lower, more stable validation loss on both datasets. In particular, it exhibits superior convergence and generalization on the more challenging CIFAR-100. These findings indicate that the proposed technique can improve model robustness and training efficiency, offering an alternative regularizer for complex classification tasks.
ISSN:2076-3417
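
The abstract describes three ingredients: a temperature-scaled sigmoid that maps a per-neuron statistic to a deactivation probability, a "neuron recovery" step that keeps important activations alive, and amplification of high-magnitude weights. The sketch below illustrates one plausible reading of that mechanism; the scoring rule, the recovery fraction, and the amplification threshold are all assumptions for illustration, not the authors' exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoidal_dropout(activations, temperature=0.5, recover_frac=0.1):
    """Illustrative sketch: neurons whose activation magnitude falls below
    the layer mean get a higher drop probability via a temperature-scaled
    sigmoid; the strongest neurons are then "recovered" (always kept).
    The z-score-based statistic here is an assumed stand-in for the
    paper's weight/activation/history statistics."""
    a = np.abs(activations)
    score = (a.mean() - a) / (a.std() + 1e-8)      # weak neuron -> high score
    p_drop = 1.0 / (1.0 + np.exp(-score / temperature))
    mask = rng.random(a.shape) >= p_drop           # keep with prob 1 - p_drop
    # neuron recovery: unconditionally restore the top fraction of activations
    k = max(1, int(recover_frac * a.size))
    mask[np.argsort(a)[-k:]] = True
    return activations * mask

def amplify_weights(W, threshold=1.0, gain=1.05):
    """Assumed amplification rule: scale weights whose magnitude exceeds
    a threshold by a small gain, emphasizing salient features."""
    return W * np.where(np.abs(W) > threshold, gain, 1.0)

x = rng.normal(size=16)
y = sigmoidal_dropout(x)            # dropped neurons are zeroed
W = amplify_weights(np.array([0.5, 2.0]))
```

A lower `temperature` sharpens the sigmoid, pushing drop probabilities toward 0 or 1, while a higher one approaches uniform random dropout; the recovery step guarantees that, unlike standard dropout, the most informative activations are never silenced.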