Optimization strategies for multi‐block structured CFD simulation based on Sunway TaihuLight
Abstract: Decomposition and the solver are the main performance bottlenecks of multi‐block structured CFD simulation involving complex industrial configurations such as aero‐engines, shock‐boundary layer interactions, turbulence modeling and so on. In this article, we propose several optimization strateg...
Main Authors: | Xiaojing Lv, Wenhao Leng, Zhao Liu, Chengsheng Wu, Fang Li, Jiuxiu Xu |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2025-01-01 |
Series: | Engineering Reports |
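Subjects: | LU‐SGS; multi‐block; multi‐level decomposition; pipeline; register communication; Sunway TaihuLight |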
Online Access: | https://doi.org/10.1002/eng2.12661 |
_version_ | 1832576650504568832 |
---|---|
author | Xiaojing Lv Wenhao Leng Zhao Liu Chengsheng Wu Fang Li Jiuxiu Xu |
author_facet | Xiaojing Lv Wenhao Leng Zhao Liu Chengsheng Wu Fang Li Jiuxiu Xu |
author_sort | Xiaojing Lv |
collection | DOAJ |
description | Decomposition and the solver are the main performance bottlenecks of multi‐block structured CFD simulation involving complex industrial configurations such as aero‐engines, shock‐boundary layer interactions, turbulence modeling and so on. In this article, we propose several optimization strategies to improve the computing efficiency of multi‐block structured CFD simulation on the Sunway TaihuLight supercomputing system, including: (1) a load‐balancing decomposition approach that combines recursive segmentation of undirected graphs with block mapping for multi‐structured blocks; (2) two‐level parallelism that uses the MPI + OpenACC2.0* hybrid parallel paradigm with various performance optimizations, such as data preprocessing, eliminating unnecessary loops of subroutine calls, the collapse and tile syntax, and memory access optimization between main memory and local data memory (LDM); and (3) a carefully orchestrated pipeline and register communication strategy between computing processor elements (CPEs) to handle the data dependence in LU‐SGS (Lower‐Upper Symmetric Gauss–Seidel). Numerical simulations were conducted to evaluate the proposed optimization strategies. The results show that our parallel implementation provides high load balance and efficiency, achieving a speedup of more than 8× for one loop step and a speedup of more than 2× for strong correlation kernels. |
format | Article |
id | doaj-art-d558f551247348fcbf26ae448e7b70fe |
institution | Kabale University |
issn | 2577-8196 |
language | English |
publishDate | 2025-01-01 |
publisher | Wiley |
record_format | Article |
series | Engineering Reports |
spelling | doaj-art-d558f551247348fcbf26ae448e7b70fe; 2025-01-31T00:22:48Z; eng; Wiley; Engineering Reports; 2577-8196; 2025-01-01; 7; 1; n/a; n/a; 10.1002/eng2.12661; Optimization strategies for multi‐block structured CFD simulation based on Sunway TaihuLight; Xiaojing Lv 0; Wenhao Leng 1; Zhao Liu 2; Chengsheng Wu 3; Fang Li 4; Jiuxiu Xu 5; China Ship Scientific Research Center Wuxi China; China Ship Scientific Research Center Wuxi China; Zhejiang Lab Hangzhou China; China Ship Scientific Research Center Wuxi China; Jiangnan Institute of Computing Technology Wuxi China; Jiangnan Institute of Computing Technology Wuxi China; Decomposition and the solver are the main performance bottlenecks of multi‐block structured CFD simulation involving complex industrial configurations such as aero‐engines, shock‐boundary layer interactions, turbulence modeling and so on. In this article, we propose several optimization strategies to improve the computing efficiency of multi‐block structured CFD simulation on the Sunway TaihuLight supercomputing system, including: (1) a load‐balancing decomposition approach that combines recursive segmentation of undirected graphs with block mapping for multi‐structured blocks; (2) two‐level parallelism that uses the MPI + OpenACC2.0* hybrid parallel paradigm with various performance optimizations, such as data preprocessing, eliminating unnecessary loops of subroutine calls, the collapse and tile syntax, and memory access optimization between main memory and local data memory (LDM); and (3) a carefully orchestrated pipeline and register communication strategy between computing processor elements (CPEs) to handle the data dependence in LU‐SGS (Lower‐Upper Symmetric Gauss–Seidel). Numerical simulations were conducted to evaluate the proposed optimization strategies. The results show that our parallel implementation provides high load balance and efficiency, achieving a speedup of more than 8× for one loop step and a speedup of more than 2× for strong correlation kernels.; https://doi.org/10.1002/eng2.12661; LU‐SGS; multi‐block; multi‐level decomposition; pipeline; register communication; Sunway TaihuLight |
spellingShingle | Xiaojing Lv Wenhao Leng Zhao Liu Chengsheng Wu Fang Li Jiuxiu Xu Optimization strategies for multi‐block structured CFD simulation based on Sunway TaihuLight Engineering Reports LU‐SGS multi‐block multi‐level decomposition pipeline register communication Sunway TaihuLight |
title | Optimization strategies for multi‐block structured CFD simulation based on Sunway TaihuLight |
title_full | Optimization strategies for multi‐block structured CFD simulation based on Sunway TaihuLight |
title_fullStr | Optimization strategies for multi‐block structured CFD simulation based on Sunway TaihuLight |
title_full_unstemmed | Optimization strategies for multi‐block structured CFD simulation based on Sunway TaihuLight |
title_short | Optimization strategies for multi‐block structured CFD simulation based on Sunway TaihuLight |
title_sort | optimization strategies for multi block structured cfd simulation based on sunway taihulight |
topic | LU‐SGS multi‐block multi‐level decomposition pipeline register communication Sunway TaihuLight |
url | https://doi.org/10.1002/eng2.12661 |
work_keys_str_mv | AT xiaojinglv optimizationstrategiesformultiblockstructuredcfdsimulationbasedonsunwaytaihulight AT wenhaoleng optimizationstrategiesformultiblockstructuredcfdsimulationbasedonsunwaytaihulight AT zhaoliu optimizationstrategiesformultiblockstructuredcfdsimulationbasedonsunwaytaihulight AT chengshengwu optimizationstrategiesformultiblockstructuredcfdsimulationbasedonsunwaytaihulight AT fangli optimizationstrategiesformultiblockstructuredcfdsimulationbasedonsunwaytaihulight AT jiuxiuxu optimizationstrategiesformultiblockstructuredcfdsimulationbasedonsunwaytaihulight |