Improving Low-Resource Neural Machine Translation With Teacher-Free Knowledge Distillation
Knowledge Distillation (KD) aims to distill the knowledge of a cumbersome teacher model into a lightweight student model. Its success is generally attributed to the privileged information on similarities among categories provided by the teacher model, and in this sense, only strong teacher models ar...
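The abstract refers to the standard teacher–student distillation objective, in which the student is trained on a mix of hard labels and the teacher's temperature-softened output distribution. As a rough, generic illustration only (not the paper's teacher-free variant), a minimal PyTorch sketch of that loss might look like the following; the `temperature` and `alpha` hyperparameters are assumed for illustration:

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, targets, temperature=2.0, alpha=0.5):
    """Generic KD objective: weighted sum of hard-label cross-entropy and the
    KL divergence between temperature-softened teacher and student outputs."""
    # Hard-label term: cross-entropy against the gold targets.
    ce = F.cross_entropy(student_logits, targets)
    # Soft-label term: KL divergence on temperature-scaled distributions,
    # rescaled by T^2 to keep gradient magnitudes comparable.
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kl = F.kl_div(soft_student, soft_teacher, reduction="batchmean") * temperature ** 2
    return alpha * ce + (1.0 - alpha) * kl
```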
| Main Authors: | Xinlu Zhang, Xiao Li, Yating Yang, Rui Dong |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2020-01-01 |
| Series: | IEEE Access |
| Online Access: | https://ieeexplore.ieee.org/document/9257421/ |
Similar Items
- Confidence-Based Knowledge Distillation to Reduce Training Costs and Carbon Footprint for Low-Resource Neural Machine Translation
  by: Maria Zafar, et al.
  Published: (2025-07-01)
- Non‐Autoregressive Translation Algorithm Based on LLM Knowledge Distillation in English Corpus
  by: Fang Ju, et al.
  Published: (2025-01-01)
- Decoupled Time-Dimensional Progressive Self-Distillation With Knowledge Calibration for Edge Computing-Enabled AIoT
  by: Yingchao Wang, et al.
  Published: (2024-01-01)
- Leveraging logit uncertainty for better knowledge distillation
  by: Zhen Guo, et al.
  Published: (2024-12-01)
- Knowledge distillation for spiking neural networks: aligning features and saliency
  by: Yifan Hu, et al.
  Published: (2025-01-01)