Showing 1 - 5 of 5 results for search 'Musheng Chen', query time: 0.01s
1
Optimizing the Learnable RoPE Theta Parameter in Transformers by Zhigao Huang, Musheng Chen
Published 2025-01-01
Article
2
3
Dynamic Mixture of Experts for Adaptive Computation in Character-Level Transformers by Zhigao Huang, Musheng Chen, Shiyan Zheng
Published 2025-06-01
Article
4
Spectral Adaptive Dropout: Frequency-Based Regularization for Improved Generalization by Zhigao Huang, Musheng Chen, Shiyan Zheng
Published 2025-06-01
Article
5
Spectral momentum integration: hybrid optimization of frequency and time domain gradients by Zhigao Huang, Musheng Chen, Shiyan Zheng
Published 2025-08-01
Article