A Graph-Based Learning Framework for Compiler Loop Auto-Vectorization
The single instruction multiple data (SIMD) capability in modern processors is critical to improving the performance of current compute-intensive programs. Modern compilers use vectorization techniques to exploit the SIMD capability, by detecting data parallelism in scalar source code and transformi...
| Main Authors: | Yao Xiao, Nesreen K. Ahmed, Mihai Capotă, Guixiang Ma, Theodore L. Willke, Shahin Nazarian, Paul Bogdan |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | American Association for the Advancement of Science (AAAS), 2025-01-01 |
| Series: | Intelligent Computing |
| Online Access: | https://spj.science.org/doi/10.34133/icomputing.0113 |

| Field | Value |
|---|---|
| _version_ | 1849404354374664192 |
| author | Yao Xiao, Nesreen K. Ahmed, Mihai Capotă, Guixiang Ma, Theodore L. Willke, Shahin Nazarian, Paul Bogdan |
| collection | DOAJ |
| description | The single instruction, multiple data (SIMD) capability of modern processors is critical to improving the performance of compute-intensive programs. Modern compilers exploit this capability through vectorization: they detect data parallelism in scalar source code and transform groups of scalar instructions into vector instructions. In this study, we focus on one of the most common vectorization techniques, loop-based vectorization, which targets loops and optimizes their performance by grouping multiple occurrences of the same operation across loop iterations into a single SIMD instruction. We propose a data-driven, graph-based learning framework for automatic vectorization, called autograph, which takes an input program, extracts its loops, and learns a structured representation to automatically predict the correct vectorization and interleaving factors. The framework uses deep reinforcement learning to learn an optimal policy (a mapping from observations to actions) from an intelligent agent in a SIMD environment, and automatically injects the predicted vectorization pragmas into the input program. We conducted an extensive evaluation on multiple benchmark datasets and compared the results with state-of-the-art baselines. Our results show that autograph achieves, on average, a 2.49× performance improvement on Polybench compared to NeuroVectorizer and 3.69× compared to the baseline -O3. |
| format | Article |
| id | doaj-art-b3142e5dff0549f8b47e6c175b7c1a8b |
| institution | Kabale University |
| issn | 2771-5892 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | American Association for the Advancement of Science (AAAS) |
| record_format | Article |
| series | Intelligent Computing |
| spelling | doaj-art-b3142e5dff0549f8b47e6c175b7c1a8b; 2025-08-20T03:37:01Z; eng; American Association for the Advancement of Science (AAAS); Intelligent Computing; 2771-5892; 2025-01-01; 4; 10.34133/icomputing.0113; A Graph-Based Learning Framework for Compiler Loop Auto-Vectorization; Yao Xiao (Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA); Nesreen K. Ahmed (Cisco AI Research, San Jose, CA, USA); Mihai Capotă (Intel Labs, Hillsboro, OR, USA); Guixiang Ma (Intel Labs, Hillsboro, OR, USA); Theodore L. Willke (Intel Labs, Hillsboro, OR, USA); Shahin Nazarian (Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA); Paul Bogdan (Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA); https://spj.science.org/doi/10.34133/icomputing.0113 |
| title | A Graph-Based Learning Framework for Compiler Loop Auto-Vectorization |
| url | https://spj.science.org/doi/10.34133/icomputing.0113 |
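
The description above mentions injecting vectorization pragmas that carry a vectorization factor and an interleaving factor. The record does not show the pragma syntax the authors use, so the following is only a minimal sketch that assumes Clang's `#pragma clang loop` hints. The function names, the `LOOP_HINT` macro, and the factor values 4 and 2 are hypothetical, chosen for illustration rather than taken from the article.

```c
#include <stddef.h>

/* Assumption: Clang-style loop hints. Other compilers get an empty macro,
 * so the code still builds and computes the same result everywhere. */
#if defined(__clang__)
#define LOOP_HINT _Pragma("clang loop vectorize_width(4) interleave_count(2)")
#else
#define LOOP_HINT
#endif

/* Scalar reference loop: one multiply-add per iteration. */
void saxpy_scalar(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Same loop with an injected hint. The values 4 (vectorization factor)
 * and 2 (interleaving factor) are illustrative, not predicted values. */
void saxpy_hinted(size_t n, float a, const float *x, float *y) {
    LOOP_HINT
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

The hint only asks the compiler to use those factors; the loop's semantics are unchanged, so both functions must produce identical output. A learned policy such as the one described would, under this assumption, choose the two numbers per loop instead of leaving them to the compiler's cost model.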