Head information bottleneck (HIB): leveraging information bottleneck for efficient transformer head attribution and pruning
Abstract: Multi-head attention mechanisms have been widely applied in speech pre-training. However, their roles and effectiveness in various downstream tasks have not been fully explored: attention heads may vary in importance depending on the task. We assume that the attention allocation...
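The record's abstract only gestures at the method, but the core idea of information-bottleneck-based head attribution can be illustrated with a minimal sketch. The PyTorch code below assumes a variational Gaussian gate per attention head, regularized toward a standard-normal prior (the IB compression term); the class name `HeadGate`, the SNR-style importance score, and the training setup are hypothetical illustrations, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class HeadGate(nn.Module):
    """Stochastic per-head gate with an information-bottleneck-style
    KL penalty. Each head's output is multiplied by a noisy gate; heads
    whose learned noise drowns out the gate mean carry little
    task-relevant information and are candidates for pruning."""

    def __init__(self, num_heads: int):
        super().__init__()
        self.mu = nn.Parameter(torch.ones(num_heads))              # gate means
        self.log_sigma = nn.Parameter(torch.full((num_heads,), -3.0))  # gate log-std

    def forward(self, head_outputs: torch.Tensor) -> torch.Tensor:
        # head_outputs: (batch, num_heads, seq_len, head_dim)
        if self.training:
            # Reparameterized sample so gradients flow through mu and sigma.
            eps = torch.randn_like(self.mu)
            gate = self.mu + eps * self.log_sigma.exp()
        else:
            gate = self.mu
        return head_outputs * gate.view(1, -1, 1, 1)

    def kl_penalty(self) -> torch.Tensor:
        # KL( N(mu, sigma^2) || N(0, 1) ), summed over heads: the
        # compression term that pushes gates toward an uninformative prior.
        sigma2 = (2 * self.log_sigma).exp()
        return 0.5 * (self.mu ** 2 + sigma2 - 2 * self.log_sigma - 1).sum()

    def head_importance(self) -> torch.Tensor:
        # Signal-to-noise ratio per head; low values mark prunable heads.
        return self.mu.abs() / self.log_sigma.exp()
```

In this sketch, training would minimize `task_loss + beta * gate.kl_penalty()` for some trade-off weight `beta`, and after convergence heads whose `head_importance()` falls below a threshold would be pruned.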
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | SpringerOpen, 2025-07-01 |
| Series: | EURASIP Journal on Audio, Speech, and Music Processing |
| Subjects: | |
| Online Access: | https://doi.org/10.1186/s13636-025-00411-8 |