Compressing Neural Networks on Limited Computing Resources
Network compression is a crucial technique for applying deep learning models to edge or mobile devices. However, the cost of achieving higher benchmark performance through compression is continuously increasing, making network compression a significant burden, especially for small industries focused on developing compact models.
| Main Authors: | Seunghyun Lee, Dongjun Lee, Minju Hyun, Heeje Kim, Byung Cheol Song |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Subjects: | Deep neural network compression; filter pruning; knowledge transfer; efficient neural networks; lightweight deep learning models |
| Online Access: | https://ieeexplore.ieee.org/document/10988545/ |
| _version_ | 1850189601533067264 |
|---|---|
| author | Seunghyun Lee; Dongjun Lee; Minju Hyun; Heeje Kim; Byung Cheol Song |
| author_facet | Seunghyun Lee; Dongjun Lee; Minju Hyun; Heeje Kim; Byung Cheol Song |
| author_sort | Seunghyun Lee |
| collection | DOAJ |
| description | Network compression is a crucial technique for applying deep learning models to edge or mobile devices. However, the cost of achieving higher benchmark performance through compression is continuously increasing, making network compression a significant burden—especially for small industries focused on developing compact models. Specifically, existing network compression techniques often require extensive computational resources, rendering them impractical for edge devices and small-scale applications. To democratize network compression, we propose a general-purpose framework that combines novel filter pruning and knowledge distillation techniques. First, unlike conventional filter pruning methods based on static heuristics and costly neural architecture search (NAS)-based approaches, our method leverages meta-learning for rapid and fine examination of the importance of each gate. This enables rapid and stable sub-network discovery, significantly improving the pruning process. Second, to minimize the computational cost of knowledge distillation, we introduce a synthetic teacher assistant that leverages precomputed fixed knowledge—referring to the stored feature maps/logits of the teacher network. By leveraging fixed knowledge, we mitigate the cost incurred by the teacher network and facilitate the transmission of fixed knowledge to the student via synthetic teacher assistants, thereby preventing distribution collapse. Our proposed framework dramatically reduces the compression overhead while maintaining high accuracy, achieving a 55.2% reduction in FLOPs of ResNet-50 trained on ImageNet while preserving 76.2% top-1 accuracy with only 199 GPU hours—significantly lower than previous state-of-the-art methods. Overall, our framework democratizes deep learning compression by offering a cost-effective and computationally feasible solution, enabling broader adoption in low-resource environments. |
| format | Article |
| id | doaj-art-de6b40cacc844e4c942d46dd7ddeb89f |
| institution | OA Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| series | IEEE Access |
| spelling | doaj-art-de6b40cacc844e4c942d46dd7ddeb89f; 2025-08-20T02:15:34Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2025-01-01; vol. 13, pp. 80063-80075; DOI 10.1109/ACCESS.2025.3567102; IEEE document 10988545; Compressing Neural Networks on Limited Computing Resources; Seunghyun Lee (https://orcid.org/0000-0001-7139-1764), Dongjun Lee, Minju Hyun (https://orcid.org/0009-0004-5523-4573), Heeje Kim (https://orcid.org/0009-0006-4654-1832), Byung Cheol Song (https://orcid.org/0000-0001-8742-3433), all with the Department of Electrical and Computer Engineering, Inha University, Incheon, Republic of Korea; abstract as in the description field above; https://ieeexplore.ieee.org/document/10988545/; Deep neural network compression; filter pruning; knowledge transfer; efficient neural networks; lightweight deep learning models |
| spellingShingle | Seunghyun Lee; Dongjun Lee; Minju Hyun; Heeje Kim; Byung Cheol Song; Compressing Neural Networks on Limited Computing Resources; IEEE Access; Deep neural network compression; filter pruning; knowledge transfer; efficient neural networks; lightweight deep learning models |
| title | Compressing Neural Networks on Limited Computing Resources |
| title_full | Compressing Neural Networks on Limited Computing Resources |
| title_fullStr | Compressing Neural Networks on Limited Computing Resources |
| title_full_unstemmed | Compressing Neural Networks on Limited Computing Resources |
| title_short | Compressing Neural Networks on Limited Computing Resources |
| title_sort | compressing neural networks on limited computing resources |
| topic | Deep neural network compression; filter pruning; knowledge transfer; efficient neural networks; lightweight deep learning models |
| url | https://ieeexplore.ieee.org/document/10988545/ |
| work_keys_str_mv | AT seunghyunlee compressingneuralnetworksonlimitedcomputingresources AT dongjunlee compressingneuralnetworksonlimitedcomputingresources AT minjuhyun compressingneuralnetworksonlimitedcomputingresources AT heejekim compressingneuralnetworksonlimitedcomputingresources AT byungcheolsong compressingneuralnetworksonlimitedcomputingresources |
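The description field above outlines two mechanisms: learnable per-filter gates whose importance drives pruning (estimated in the paper via meta-learning), and knowledge distillation from precomputed "fixed knowledge" (stored teacher logits/feature maps) so the teacher network does not have to run during student training. The following PyTorch sketch is illustrative only and is not the authors' implementation; the names `GatedConv`, `cache_teacher_logits`, and `distill_step`, the temperature, and the loss weighting are assumptions introduced here.

```python
# Illustrative sketch only (not the paper's code): a per-filter gate module and
# a distillation step that trains the student against teacher logits cached in
# advance ("fixed knowledge"), avoiding teacher forward passes during training.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedConv(nn.Module):
    """Conv layer with one learnable gate per output filter (hypothetical)."""
    def __init__(self, in_ch, out_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.gate = nn.Parameter(torch.ones(out_ch))  # importance per filter

    def forward(self, x):
        # Scale each output channel by its gate; filters whose gates shrink
        # toward zero can later be removed to realize the FLOPs reduction.
        return self.conv(x) * self.gate.view(1, -1, 1, 1)

@torch.no_grad()
def cache_teacher_logits(teacher, loader, device="cpu"):
    """Run the teacher once over the dataset and store its logits."""
    teacher.eval()
    return torch.cat([teacher(x.to(device)).cpu() for x, _ in loader])

def distill_step(student, x, y, cached_logits, optimizer, T=4.0, alpha=0.5):
    """One student update using cached logits; no live teacher needed."""
    logits = student(x)
    kd = F.kl_div(
        F.log_softmax(logits / T, dim=1),
        F.softmax(cached_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    loss = alpha * kd + (1 - alpha) * F.cross_entropy(logits, y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Caching the teacher's outputs trades storage for compute, which is the cost-saving idea the abstract attributes to fixed knowledge; the paper's synthetic teacher assistants and meta-learned gate importance go beyond this minimal sketch.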