Autocorrelation Matrix Knowledge Distillation: A Task-Specific Distillation Method for BERT Models
Pre-trained language models perform well on a variety of natural language processing tasks. However, their large number of parameters poses significant challenges for edge devices with limited resources, which greatly limits their practical deployment. This paper introduces a simple and effici...
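Although the abstract is cut off, the title points to a distillation objective built on autocorrelation matrices of layer representations. The snippet below is a rough, hypothetical sketch only: it assumes teacher and student hidden states are compared through their token-level autocorrelation (Gram) matrices with an MSE criterion. The record does not spell out the paper's actual loss, so the function names and matching objective here are illustrative assumptions, not the authors' method.

```python
import torch
import torch.nn.functional as F

def autocorr_matrix(h: torch.Tensor) -> torch.Tensor:
    """Token-level autocorrelation (Gram) matrix of hidden states.

    h: (batch, seq_len, hidden) hidden states from one transformer layer.
    Returns a (batch, seq_len, seq_len) token-token similarity matrix.
    """
    h = F.normalize(h, dim=-1)               # make the comparison scale-invariant
    return torch.bmm(h, h.transpose(1, 2))   # (batch, seq_len, seq_len)

def autocorr_distill_loss(student_h: torch.Tensor, teacher_h: torch.Tensor) -> torch.Tensor:
    """MSE between student and teacher autocorrelation matrices.

    Because both matrices are seq_len x seq_len, the teacher and student can
    use different hidden sizes without any projection layer.
    """
    return F.mse_loss(autocorr_matrix(student_h), autocorr_matrix(teacher_h))

# Toy usage: teacher hidden size 768, student hidden size 384 (hypothetical shapes).
teacher_h = torch.randn(2, 16, 768)
student_h = torch.randn(2, 16, 384, requires_grad=True)
loss = autocorr_distill_loss(student_h, teacher_h)
loss.backward()
print(loss.item())
```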
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2024-10-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/14/20/9180 |