Frozen Weights as Prior for Parameter-Efficient Fine-Tuning
Main Authors:
Format: Article
Language: English
Published: IEEE, 2025-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/10840174/
Summary: In natural language processing and computer vision, fine-tuning large pre-trained models for downstream tasks has become a central paradigm. Full fine-tuning, however, is often prohibitively expensive for many researchers. In recent years, numerous parameter-efficient fine-tuning methods have therefore been proposed to learn incremental updates to the pre-trained weights (e.g., low-rank increments, or adapters that modify the network architecture). Most of these methods, however, learn the incremental parameters from scratch and, viewed from the perspective of full fine-tuning, fail to fully exploit the connection between the incremental changes made during fine-tuning and the frozen weights of the pre-trained model. To harness the pre-trained weights more effectively while acquiring new knowledge during fine-tuning, we propose a novel parameter-efficient approach that reuses the Frozen Weights as a prior (FoWA). We align the incremental matrix with the unitary matrices obtained from the singular value decomposition of the frozen weights, and fine-tune the model with this prior information incorporated. Through the frozen-weight prior, FoWA automatically selects an appropriate rank and decouples the number of trainable parameters from that rank. Extensive experiments on a range of tasks, including natural language processing, question answering, natural language generation, and visual classification, demonstrate the effectiveness of FoWA.
ISSN: 2169-3536
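To make the mechanism in the summary above concrete, here is a minimal PyTorch sketch of one plausible reading of it: the increment of a frozen linear layer is expressed in the basis of the unitary matrices from the SVD of the frozen weight, with only a spectrum-shaped vector trained. The class name `FoWALinear` and the parameter `delta` are illustrative assumptions, not the authors' code, and the paper's actual rank-selection mechanism is not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FoWALinear(nn.Module):
    """Hypothetical sketch: a fine-tuning increment expressed in the
    SVD basis of a frozen linear layer (illustrative only)."""

    def __init__(self, frozen: nn.Linear):
        super().__init__()
        W = frozen.weight.data  # (out_features, in_features), kept frozen
        self.register_buffer("W", W)
        if frozen.bias is not None:
            self.register_buffer("b", frozen.bias.data)  # frozen bias
        else:
            self.b = None

        # Frozen-weight prior: W = U diag(S) V^T, with U and V unitary.
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        self.register_buffer("U", U)
        self.register_buffer("Vh", Vh)

        # Train only a spectrum-shaped vector, initialized to zero so
        # training starts exactly at the pre-trained model. Its size is
        # min(out, in) regardless of what effective rank the learned
        # spectrum settles on, decoupling parameter count from rank.
        self.delta = nn.Parameter(torch.zeros_like(S))

    def forward(self, x):
        # Increment in the frozen SVD basis: dW = U diag(delta) V^T.
        # (U * delta) scales each column of U by the matching entry of delta.
        dW = (self.U * self.delta) @ self.Vh
        return F.linear(x, self.W + dW, self.b)
```

Under this parameterization, wrapping a 768x768 projection leaves only 768 trainable parameters, and an "effective rank" emerges from how many entries of `delta` move away from zero during training rather than from a rank hyperparameter fixed in advance.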