Optimizing Deep Learning Models for Resource‐Constrained Environments With Cluster‐Quantized Knowledge Distillation
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley, 2025-05-01 |
| Series: | Engineering Reports |
| Subjects: | |
| Online Access: | https://doi.org/10.1002/eng2.70187 |
| Summary: | ABSTRACT Deep convolutional neural networks (CNNs) are highly effective in computer vision tasks but remain challenging to deploy in resource‐constrained environments due to their high computational and memory requirements. Conventional model compression techniques, such as pruning and post‐training quantization, often compromise model accuracy by decoupling compression from training. Furthermore, traditional knowledge distillation approaches rely on full‐precision teacher models, limiting their effectiveness in compressed settings. To address these issues, we propose Cluster‐Quantized Knowledge Distillation (CQKD), a novel framework that integrates structured pruning with knowledge distillation, incorporating cluster‐based weight quantization directly into the training loop. Unlike existing methods, CQKD applies quantization to both the teacher and student models, ensuring a more effective transfer of compressed knowledge. By leveraging layer‐wise K‐means clustering, our approach achieves extreme model compression while maintaining high accuracy. Experimental results on CIFAR‐10 and CIFAR‐100 demonstrate the effectiveness of CQKD, achieving compression ratios of 34,000× while preserving competitive accuracy—97.9% on CIFAR‐10 and 91.2% on CIFAR‐100. These results highlight the feasibility of CQKD for efficient deep learning model deployment in low‐resource environments. |
| ISSN: | 2577-8196 |
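
The summary above describes cluster-based weight quantization via layer-wise K-means, where each layer's weights are replaced by a small set of shared centroids. The snippet below is a minimal, self-contained sketch of that one step only; it is not the paper's implementation, and the function names, cluster count, and toy tensor shapes are illustrative assumptions. In CQKD this clustering would additionally be applied inside the training loop to both the teacher and the student models.

```python
# Minimal sketch of layer-wise cluster-based weight quantization:
# group one layer's weights with 1-D K-means and snap each weight to its
# cluster centroid, so the layer can be stored as a small codebook plus
# integer indices. Names and parameters here are illustrative assumptions.
import numpy as np


def kmeans_1d(values, n_clusters, n_iters=50, seed=0):
    """Plain Lloyd's algorithm on a 1-D array; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids by sampling distinct weight values.
    centroids = rng.choice(values, size=n_clusters, replace=False)
    for _ in range(n_iters):
        # Assign each weight to its nearest centroid.
        labels = np.argmin(np.abs(values[:, None] - centroids[None, :]), axis=1)
        # Recompute each centroid as the mean of its assigned weights.
        for k in range(n_clusters):
            members = values[labels == k]
            if members.size > 0:
                centroids[k] = members.mean()
    return centroids, labels


def cluster_quantize_layer(weights, n_clusters=16):
    """Quantize one layer's weight tensor to n_clusters shared values."""
    flat = weights.ravel().astype(np.float64)
    centroids, labels = kmeans_1d(flat, n_clusters)
    quantized = centroids[labels].reshape(weights.shape)
    return quantized, centroids, labels.reshape(weights.shape)


if __name__ == "__main__":
    # Toy example: quantize a random "conv" weight tensor layer-wise.
    rng = np.random.default_rng(42)
    layer_weights = rng.normal(size=(64, 3, 3, 3))
    q_weights, codebook, codes = cluster_quantize_layer(layer_weights, n_clusters=16)
    print("unique values after quantization:", np.unique(q_weights).size)  # <= 16
    print("mean absolute error:", np.abs(layer_weights - q_weights).mean())
```

With 16 clusters, each weight index needs only 4 bits plus a tiny per-layer codebook, which is where the large storage savings come from; the paper's extreme compression ratios additionally rely on structured pruning and on distilling from a quantized teacher during training.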