A Hybrid Scale-Up and Scale-Out Approach for Performance and Energy Efficiency Optimization in Systolic Array Accelerators

The rapid development of deep neural networks (DNNs), such as convolutional neural networks and transformer-based large language models, has significantly advanced AI applications. However, these advances have introduced substantial computational and data demands, presenting challenges for the devel...

Bibliographic Details
Main Authors: Hao Sun, Junzhong Shen, Changwu Zhang, Hengzhu Liu
Format: Article
Language: English
Published: MDPI AG 2025-03-01
Series: Micromachines
Subjects: systolic array; deep neural network; performance optimization; energy efficiency; accelerators
Online Access: https://www.mdpi.com/2072-666X/16/3/336
author Hao Sun
Junzhong Shen
Changwu Zhang
Hengzhu Liu
collection DOAJ
description The rapid development of deep neural networks (DNNs), such as convolutional neural networks and transformer-based large language models, has significantly advanced AI applications. However, these advances have introduced substantial computational and data demands, presenting challenges for the development of systolic array accelerators, which excel in tensor operations. Systolic array accelerators are typically developed using two approaches: scale-up, which increases the size of a single array, and scale-out, which involves multiple parallel arrays of fixed size. Scale-up achieves high performance in large-scale matrix multiplications, while scale-out offers better energy efficiency for lower-dimensional matrix multiplications. However, neither approach can simultaneously maintain both high performance and high energy efficiency across the full spectrum of DNN tasks. In this work, we propose a hybrid approach that integrates scale-up and scale-out techniques. We use mapping space exploration in a multi-tenant application environment to assign DNN operations to specific systolic array modules, thereby optimizing performance and energy efficiency. Experiments show that our proposed hybrid systolic array accelerator reduces energy consumption by up to 8% on average and improves throughput by up to 57% on average, compared to TPUv3 across various DNN models.
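The trade-off described above can be illustrated with a toy model. The sketch below is not the authors' mapping space exploration; it assumes a simple weight-stationary tiling in which a K x N weight matrix is cut into tiles the size of the array, and it scores only the fraction of processing elements (PEs) that hold useful weights. The module names and sizes (`large_256x256`, `small_64x64`) and the helper names are hypothetical, and the model ignores dataflow, memory bandwidth, and energy.

```python
import math


def pe_utilization(k: int, n: int, rows: int, cols: int) -> float:
    """Fraction of PEs holding useful weights when a K x N weight matrix
    is tiled onto a rows x cols array (toy weight-stationary model)."""
    tiles = math.ceil(k / rows) * math.ceil(n / cols)
    return (k * n) / (tiles * rows * cols)


# Hypothetical modules: one scaled-up array and one small scale-out array.
MODULES = {"large_256x256": (256, 256), "small_64x64": (64, 64)}


def pick_module(k: int, n: int, modules: dict) -> str:
    """Return the module name with the highest toy utilization.
    Ties go to the module listed first."""
    return max(modules, key=lambda name: pe_utilization(k, n, *modules[name]))


if __name__ == "__main__":
    for k, n in [(64, 64), (100, 100), (256, 256)]:
        print((k, n), pick_module(k, n, MODULES))
```

Under these assumptions a small operand leaves most of a large array idle, which is the intuition behind assigning lower-dimensional matrix multiplications to smaller arrays; a real mapper would also weigh latency and energy.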
format Article
id doaj-art-a3faa470041c4c20a2711eabb4d44ee5
institution DOAJ
issn 2072-666X
spelling doaj-art-a3faa470041c4c20a2711eabb4d44ee5
Record updated: 2025-08-20T02:42:22Z
Volume 16, Issue 3, Article 336
DOI: 10.3390/mi16030336
Hao Sun: College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
Junzhong Shen: College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
Changwu Zhang: Academy of Military Science, Beijing 100091, China
Hengzhu Liu: College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China
title A Hybrid Scale-Up and Scale-Out Approach for Performance and Energy Efficiency Optimization in Systolic Array Accelerators
topic systolic array
deep neural network
performance optimization
energy efficiency
accelerators
url https://www.mdpi.com/2072-666X/16/3/336