A Hybrid Scale-Up and Scale-Out Approach for Performance and Energy Efficiency Optimization in Systolic Array Accelerators
| Main Authors: | , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Micromachines |
| Online Access: | https://www.mdpi.com/2072-666X/16/3/336 |
| Summary: | The rapid development of deep neural networks (DNNs), such as convolutional neural networks and transformer-based large language models, has significantly advanced AI applications. However, these advances have introduced substantial computational and data demands, presenting challenges for the development of systolic array accelerators, which excel in tensor operations. Systolic array accelerators are typically developed using two approaches: scale-up, which increases the size of a single array, and scale-out, which involves multiple parallel arrays of fixed size. Scale-up achieves high performance in large-scale matrix multiplications, while scale-out offers better energy efficiency for lower-dimensional matrix multiplications. However, neither approach can simultaneously maintain both high performance and high energy efficiency across the full spectrum of DNN tasks. In this work, we propose a hybrid approach that integrates scale-up and scale-out techniques. We use mapping space exploration in a multi-tenant application environment to assign DNN operations to specific systolic array modules, thereby optimizing performance and energy efficiency. Experiments show that our proposed hybrid systolic array accelerator reduces energy consumption by up to 8% on average and improves throughput by up to 57% on average, compared to TPUv3 across various DNN models. |
| ISSN: | 2072-666X |
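
The scale-up versus scale-out trade-off described in the summary can be illustrated with a small toy model. The sketch below is not the authors' cost model or mapper, since this record does not describe either. It assumes an output-stationary mapping of an (m x k) by (k x n) matrix multiplication onto arrays of `rows` x `cols` processing elements, with no operand sharing between arrays. The function names `pe_utilization` and `operand_words`, and the 128 x 128 and 32 x 32 array sizes, are hypothetical choices made only for illustration.

```python
from math import ceil


def pe_utilization(m, n, rows, cols):
    """Fraction of processing elements doing useful work (toy model).

    Assumes each output tile of size rows x cols occupies one full array,
    so partially filled edge tiles leave processing elements idle.
    """
    tiles = ceil(m / rows) * ceil(n / cols)
    return (m * n) / (tiles * rows * cols)


def operand_words(m, k, n, rows, cols):
    """Input words streamed into the arrays for one GEMM (toy model).

    Assumes every row block of A is re-fetched once per column tile and
    every column block of B once per row tile, with no reuse across arrays.
    """
    a_words = m * k * ceil(n / cols)
    b_words = k * n * ceil(m / rows)
    return a_words + b_words


# Hypothetical configurations with the same total number of processing elements:
# one 128 x 128 array (scale-up) versus sixteen 32 x 32 arrays (scale-out).
SCALE_UP = (128, 128)
SCALE_OUT = (32, 32)

if __name__ == "__main__":
    # Small GEMM: the large array is mostly idle, a small array is fully used.
    print(pe_utilization(32, 32, *SCALE_UP))
    print(pe_utilization(32, 32, *SCALE_OUT))
    # Large GEMM: both are fully used, but the small arrays re-fetch operands more often.
    print(operand_words(1024, 1024, 1024, *SCALE_UP))
    print(operand_words(1024, 1024, 1024, *SCALE_OUT))
```

Under these assumptions, a small operation leaves most of a large array idle while fitting a small array exactly, and a large operation needs several times more operand traffic when split across small arrays. This is one simplified way to see why neither approach is best for every DNN operation, and why assigning each operation to a suitable module can help.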