Exploring the impact of fixed theta values in RoPE on character-level language model performance and efficiency
Rotary Positional Embedding (RoPE) is a widely used technique in Transformers, influenced by the hyperparameter theta (θ). However, the impact of varying *fixed* theta values, especially the trade-off between performance and efficiency on tasks like character-level modeling, remains under-explored....
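For context, a minimal sketch of where the fixed base theta enters a standard RoPE formulation (this is an illustrative assumption, not the authors' implementation): each 2-D pair i of an embedding of dimension d is rotated at position p by the angle p·θ^(−2i/d), so the fixed base controls how quickly positional phase varies across the sequence.

```python
# Minimal RoPE sketch (standard formulation, not the paper's code).
# The fixed base theta sets per-pair frequencies theta^(-2i/d); a smaller base
# rotates faster, which matters for short character-level contexts.
import numpy as np

def rope_rotate(x, position, theta=10000.0):
    """Apply rotary positional embedding to a vector x of even dimension d."""
    d = x.shape[-1]
    assert d % 2 == 0, "RoPE expects an even embedding dimension"
    # Per-pair angular frequencies: theta^(-2i/d) for i = 0 .. d/2 - 1
    freqs = theta ** (-np.arange(0, d, 2) / d)
    angles = position * freqs                 # rotation angle for each 2-D pair
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[0::2], x[1::2]                 # split into (even, odd) components
    rotated = np.empty_like(x)
    rotated[0::2] = x1 * cos - x2 * sin       # rotate each pair in its plane
    rotated[1::2] = x1 * sin + x2 * cos
    return rotated

# Example: the same query vector at position 5 under two fixed theta values.
q = np.ones(8)
print(rope_rotate(q, position=5, theta=10000.0))
print(rope_rotate(q, position=5, theta=500.0))  # smaller base -> faster rotation
```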
| Main Authors: | Zhigao Huang, Musheng Chen, Shiyan Zheng |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2025-08-01 |
| Series: | Frontiers in Computer Science |
| Online Access: | https://www.frontiersin.org/articles/10.3389/fcomp.2025.1626899/full |
Similar Items
- Optimizing the Learnable RoPE Theta Parameter in Transformers
  by: Zhigao Huang, et al.
  Published: (2025-01-01)
- Analysis of Gearbox Bearing Fault Diagnosis Method Based on 2D Image Transformation and 2D-RoPE Encoding
  by: Xudong Luo, et al.
  Published: (2025-06-01)
- RoBERTa Optimization with Hyperparameter Tuning for Text-Based Emotion Detection
  by: Elvanro Marthen Pusung, et al.
  Published: (2025-02-01)
- Medical named entity recognition based on domain knowledge and position encoding
  by: Shuifa Sun, et al.
  Published: (2025-07-01)
- Screening of multi deep learning-based de novo molecular generation models and their application for specific target molecular generation
  by: Yishu Wang, et al.
  Published: (2025-02-01)