Tailored knowledge distillation with automated loss function learning.
Knowledge Distillation (KD) is one of the most effective and widely used methods for compressing large models. Its success has been driven largely by the meticulous design of distillation losses. However, most state-of-the-art KD losses are manually crafted and task-specific, raising...
Saved in:
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Public Library of Science (PLoS), 2025-01-01 |
| Series: | PLoS ONE |
| Online Access: | https://doi.org/10.1371/journal.pone.0325599 |