A Graph-Based Learning Framework for Compiler Loop Auto-Vectorization
| Main Authors: | , , , , , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | American Association for the Advancement of Science (AAAS), 2025-01-01 |
| Series: | Intelligent Computing |
| Online Access: | https://spj.science.org/doi/10.34133/icomputing.0113 |
| Summary: | The single instruction multiple data (SIMD) capability in modern processors is critical to improving the performance of current compute-intensive programs. Modern compilers use vectorization techniques to exploit the SIMD capability by detecting data parallelism in scalar source code and transforming a group of scalar instructions into vector-based instructions. In this study, we focus on one of the most common vectorization techniques, loop-based vectorization, which targets loops and optimizes their performance by grouping multiple occurrences of the same operation across loop iterations into a single SIMD instruction. We propose a data-driven graph-based learning framework for automatic vectorization, called autograph, which takes an input program, extracts the loops, and then learns a structured representation to automatically predict the correct vectorization and interleaving factors. Our proposed framework uses deep reinforcement learning to learn an optimal policy (a mapping from observations to actions) for an intelligent agent in a SIMD environment, and automatically injects the predicted vectorization pragmas into the input program. We conducted an extensive evaluation on multiple benchmark datasets and compared the framework against state-of-the-art baselines. Our results show that autograph achieves, on average, a 2.49× performance improvement on Polybench over NeuroVectorizer and a 3.69× improvement over the -O3 baseline. |
| ISSN: | 2771-5892 |
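
The summary refers to predicted vectorization and interleaving factors that are injected into the program as pragmas. As a rough illustration of what such an injected hint could look like, the sketch below uses Clang's `#pragma clang loop` directive. This record does not state which pragma syntax the framework emits, so the directive, the function names `axpy_scalar` and `axpy_hinted`, and the factor values 4 and 2 are all assumptions chosen for illustration only.

```c
#include <stddef.h>

/* Plain loop: the compiler's own cost model decides whether and
 * how to vectorize it. */
void axpy_scalar(int a, const int *x, int *y, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

/* Same loop with an explicit hint. The values are hypothetical
 * stand-ins for a predicted vectorization factor (4) and
 * interleaving factor (2); they are not taken from the article.
 * Compilers that do not recognize the pragma ignore it, so the
 * computed result is the same in either case. */
void axpy_hinted(int a, const int *x, int *y, size_t n) {
#pragma clang loop vectorize_width(4) interleave_count(2)
    for (size_t i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

The hint changes only how the loop is compiled, not what it computes, which is why a learned policy can search over factor values and score each choice by measured run time alone.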