Parameter Disentanglement for Diverse Representations
Recent advances in neural network architectures reveal the importance of diverse representations. However, simply integrating more branches or increasing the width for the diversity would inevitably increase model complexity, leading to prohibitive inference costs. In this paper, we revisit the learnable parameters in neural networks and showcase that it is feasible to disentangle learnable parameters to latent sub-parameters, which focus on different patterns and representations. This important finding leads us to study further the aggregation of diverse representations in a network structure. To this end, we propose Parameter Disentanglement for Diverse Representations (PDDR), which considers diverse patterns in parallel during training, and aggregates them into one for efficient inference. To further enhance the diverse representations, we develop a lightweight refinement module in PDDR, which adaptively refines the combination of diverse representations according to the input. PDDR can be seamlessly integrated into modern networks, significantly improving the learning capacity of a network while maintaining the same complexity for inference. Experimental results show great improvements on various tasks, with an improvement of 1.47% over Residual Network 50 (ResNet50) on ImageNet, and we improve the detection results of Retina Residual Network 50 (Retina-ResNet50) by 1.7% Mean Average Precision (mAP). Integrating PDDR into recent lightweight vision transformer models, the resulting model outperforms related works by a clear margin.
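The abstract describes a layer whose learnable parameters are disentangled into several latent sub-parameters that are trained in parallel, combined by a lightweight input-adaptive refinement module, and aggregated into a single operator for inference. Below is a minimal, hypothetical PyTorch sketch of that general idea, written in the spirit of dynamic or re-parameterized convolutions; it is not the authors' implementation, and all names (`PDDRConv2d`, `num_branches`, `refine`, `fuse`) are invented for illustration.

```python
# Illustrative sketch only, not the paper's released code. Assumption: K latent
# sub-kernels share one convolution slot, a tiny pooling+linear module produces
# per-input mixing coefficients during training, and the branches can be
# collapsed into one kernel once the coefficients are frozen for deployment.
import torch
import torch.nn as nn
import torch.nn.functional as F


class PDDRConv2d(nn.Module):
    def __init__(self, in_ch, out_ch, kernel_size=3, num_branches=4, padding=1):
        super().__init__()
        self.num_branches = num_branches
        self.padding = padding
        # K latent sub-kernels ("disentangled" parameters).
        self.sub_weights = nn.Parameter(
            torch.randn(num_branches, out_ch, in_ch, kernel_size, kernel_size) * 0.01
        )
        # Lightweight refinement: global pooling + linear -> K mixing coefficients.
        self.refine = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, num_branches), nn.Softmax(dim=1)
        )
        self.fused_weight = None  # single aggregated kernel used at inference

    def forward(self, x):
        if self.fused_weight is not None:            # inference path: one conv
            return F.conv2d(x, self.fused_weight, padding=self.padding)
        coeff = self.refine(x)                        # (B, K) per-sample coefficients
        # Mix the K sub-kernels per sample, then apply them as a grouped conv.
        w = torch.einsum("bk,koihw->boihw", coeff, self.sub_weights)
        b, c, h, wd = x.shape
        out = F.conv2d(x.reshape(1, b * c, h, wd),
                       w.reshape(-1, c, *w.shape[-2:]),
                       padding=self.padding, groups=b)
        return out.reshape(b, -1, out.shape[-2], out.shape[-1])

    @torch.no_grad()
    def fuse(self, coeff=None):
        """Aggregate the K sub-kernels into a single kernel with fixed coefficients."""
        if coeff is None:
            coeff = torch.full((self.num_branches,), 1.0 / self.num_branches)
        self.fused_weight = torch.einsum("k,koihw->oihw", coeff, self.sub_weights)
```

In this simplified reading, training pays for K parallel branches, while calling `fuse()` with fixed coefficients reduces the layer to a single standard convolution, so inference cost stays that of the plain network.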
| Main Authors: | Jingxu Wang, Jingda Guo, Ruili Wang, Zhao Zhang, Liyong Fu, Qiaolin Ye |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Tsinghua University Press, 2025-05-01 |
| Series: | Big Data Mining and Analytics |
| Subjects: | representation learning; efficient network; computer vision |
| Online Access: | https://www.sciopen.com/article/10.26599/BDMA.2024.9020087 |
| _version_ | 1849304126976950272 |
|---|---|
| author | Jingxu Wang, Jingda Guo, Ruili Wang, Zhao Zhang, Liyong Fu, Qiaolin Ye |
| author_sort | Jingxu Wang |
| collection | DOAJ |
| description | Recent advances in neural network architectures reveal the importance of diverse representations. However, simply integrating more branches or increasing the width for the diversity would inevitably increase model complexity, leading to prohibitive inference costs. In this paper, we revisit the learnable parameters in neural networks and showcase that it is feasible to disentangle learnable parameters to latent sub-parameters, which focus on different patterns and representations. This important finding leads us to study further the aggregation of diverse representations in a network structure. To this end, we propose Parameter Disentanglement for Diverse Representations (PDDR), which considers diverse patterns in parallel during training, and aggregates them into one for efficient inference. To further enhance the diverse representations, we develop a lightweight refinement module in PDDR, which adaptively refines the combination of diverse representations according to the input. PDDR can be seamlessly integrated into modern networks, significantly improving the learning capacity of a network while maintaining the same complexity for inference. Experimental results show great improvements on various tasks, with an improvement of 1.47% over Residual Network 50 (ResNet50) on ImageNet, and we improve the detection results of Retina Residual Network 50 (Retina-ResNet50) by 1.7% Mean Average Precision (mAP). Integrating PDDR into recent lightweight vision transformer models, the resulting model outperforms related works by a clear margin. |
| format | Article |
| id | doaj-art-6a9fe109a21041bba2c370da5ecabe01 |
| institution | Kabale University |
| issn | 2096-0654; 2097-406X |
| language | English |
| publishDate | 2025-05-01 |
| publisher | Tsinghua University Press |
| record_format | Article |
| series | Big Data Mining and Analytics |
| spelling | Jingxu Wang (the College of Information Science and Technology & Artificial Intelligence, State Key Laboratory of Tree Genetics and Breeding, and also with the Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing 210037, China); Jingda Guo (the Research Institute of New Technology, Hillstone Networks, Santa Clara, CA 95054, USA); Ruili Wang (the School of Mathematical and Computational Sciences, Massey University, Auckland 102-904, New Zealand); Zhao Zhang (the Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Hefei University of Technology, Hefei 230009, China); Liyong Fu (the College of Forestry, Hebei Agricultural University, Baoding 071000, China, and also with the Institute of Forest Resource Information Techniques, Chinese Academy of Forestry, Beijing 100091, China); Qiaolin Ye (the College of Information Science and Technology & Artificial Intelligence, State Key Laboratory of Tree Genetics and Breeding, and also with the Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing 210037, China). Parameter Disentanglement for Diverse Representations. Big Data Mining and Analytics (Tsinghua University Press), 2025-05-01, Vol. 8, No. 3, pp. 606-623. ISSN 2096-0654, eISSN 2097-406X. DOI: 10.26599/BDMA.2024.9020087. https://www.sciopen.com/article/10.26599/BDMA.2024.9020087 |
| title | Parameter Disentanglement for Diverse Representations |
| topic | representation learning; efficient network; computer vision |
| url | https://www.sciopen.com/article/10.26599/BDMA.2024.9020087 |