A low functional redundancy-based network slimming method for accelerating deep neural networks

Bibliographic Details
Main Authors: Zheng Fang, Bo Yin
Format: Article
Language: English
Published: Elsevier 2025-04-01
Series: Alexandria Engineering Journal
Online Access: http://www.sciencedirect.com/science/article/pii/S1110016824017162
Description
Summary: Deep neural networks (DNNs) have been widely criticized for their large parameter counts and computation demands, which hinder deployment on edge and embedded devices. To reduce the floating point operations (FLOPs) required to run DNNs and accelerate inference, we start from model pruning and pursue this goal by removing useless network parameters. In this research, we propose a low functional redundancy-based network slimming method (LFRNS) that finds and removes functionally redundant filters with a feature clustering algorithm. However, the redundancy of some key features is beneficial to the model, and removing these features limits the model's potential to some extent. Building on this view, we propose a feature contribution ranking unit (FCR unit) that automatically learns each feature map's contribution to the key information over training iterations. The FCR unit assists LFRNS in restoring some important elements from the pruning set, breaking the performance bottleneck of the slimmed model. Our method mainly removes feature maps with similar functions instead of pruning only the unimportant parts, thus preserving the integrity of the features' functions and avoiding network degradation. We conduct experiments on image classification with the CIFAR-10 and CIFAR-100 datasets. Our framework achieves over 2.0× reductions in parameters and FLOPs while keeping the accuracy loss below 1%, and even improves the accuracy of large-volume models. We also apply our method to the vision transformer (ViT) and achieve performance comparable to state-of-the-art methods with nearly 1.5× less computation.
ISSN:1110-0168
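
To make the idea in the abstract concrete, the sketch below illustrates one generic way to detect functionally redundant filters by clustering a convolutional layer's feature maps and keeping one representative filter per cluster. This is not the paper's LFRNS implementation or its FCR unit: the function name `redundant_filter_ids`, the use of k-means on averaged, L2-normalised feature maps, and all parameter values are illustrative assumptions.

```python
# Minimal sketch (assumed, not the paper's code): cluster one conv layer's output
# feature maps by similarity and flag all but one filter per cluster as prunable.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def redundant_filter_ids(feature_maps: torch.Tensor, n_clusters: int):
    """feature_maps: (N, C, H, W) activations of one conv layer for a sample batch.
    Returns indices of filters that could be pruned (all but one per cluster)."""
    n, c, h, w = feature_maps.shape
    # Describe each filter by its batch-averaged, flattened, L2-normalised feature map.
    descriptors = feature_maps.mean(dim=0).reshape(c, -1)
    descriptors = torch.nn.functional.normalize(descriptors, dim=1).cpu().numpy()
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(descriptors)
    represented, prune = set(), []
    for idx, lbl in enumerate(labels):
        if lbl in represented:
            prune.append(idx)        # functionally similar to an already-kept filter
        else:
            represented.add(lbl)     # first filter of this cluster acts as representative
    return prune

# Toy usage with random data
conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
x = torch.randn(8, 3, 32, 32)
with torch.no_grad():
    fmaps = conv(x)
print(redundant_filter_ids(fmaps, n_clusters=8))
```

In a full pruning pipeline, the returned indices would drive channel removal in the conv layer (and the matching input channels of the next layer), followed by fine-tuning; a contribution-ranking step, as the abstract describes for the FCR unit, could then restore selected filters from the pruning set.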