Optimizing Deep Learning Models for Resource‐Constrained Environments With Cluster‐Quantized Knowledge Distillation

Bibliographic Details
Main Authors: Niaz Ashraf Khan, A. M. Saadman Rafat
Format: Article
Language: English
Published: Wiley 2025-05-01
Series: Engineering Reports
Online Access: https://doi.org/10.1002/eng2.70187
Description
Summary: Deep convolutional neural networks (CNNs) are highly effective in computer vision tasks but remain challenging to deploy in resource‐constrained environments due to their high computational and memory requirements. Conventional model compression techniques, such as pruning and post‐training quantization, often compromise model accuracy by decoupling compression from training. Furthermore, traditional knowledge distillation approaches rely on full‐precision teacher models, limiting their effectiveness in compressed settings. To address these issues, we propose Cluster‐Quantized Knowledge Distillation (CQKD), a novel framework that integrates structured pruning with knowledge distillation, incorporating cluster‐based weight quantization directly into the training loop. Unlike existing methods, CQKD applies quantization to both the teacher and student models, ensuring a more effective transfer of compressed knowledge. By leveraging layer‐wise K‐means clustering, our approach achieves extreme model compression while maintaining high accuracy. Experimental results on CIFAR‐10 and CIFAR‐100 demonstrate the effectiveness of CQKD, achieving compression ratios of 34,000× while preserving competitive accuracy: 97.9% on CIFAR‐10 and 91.2% on CIFAR‐100. These results highlight the feasibility of CQKD for efficient deep learning model deployment in low‐resource environments.
ISSN: 2577-8196
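
The summary names layer-wise K-means weight clustering as the core quantization step of CQKD. As a rough, illustrative sketch only (not the authors' published code), the Python snippet below clusters the weights of each convolutional and fully connected layer of a PyTorch model, so that every layer keeps only a small codebook of shared values plus cluster assignments; the function names, the choice of 16 clusters per layer, and the use of scikit-learn's KMeans are assumptions made for illustration.

    import torch
    import torch.nn as nn
    from sklearn.cluster import KMeans

    def cluster_quantize_layer(weight: torch.Tensor, n_clusters: int = 16) -> torch.Tensor:
        # Run K-means on the flattened weights and snap each weight to its nearest centroid.
        flat = weight.detach().cpu().numpy().reshape(-1, 1)
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(flat)
        quantized = km.cluster_centers_[km.labels_].reshape(weight.shape)
        return torch.from_numpy(quantized).to(device=weight.device, dtype=weight.dtype)

    def cluster_quantize_model(model: nn.Module, n_clusters: int = 16) -> None:
        # Layer-wise clustering: each Conv2d / Linear layer gets its own codebook.
        with torch.no_grad():
            for module in model.modules():
                if isinstance(module, (nn.Conv2d, nn.Linear)):
                    module.weight.copy_(cluster_quantize_layer(module.weight, n_clusters))

In a CQKD-style pipeline, clustering of this kind would be applied inside the training loop and to both the teacher and the student models, rather than once after training as in this sketch.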