A Graph-Based Learning Framework for Compiler Loop Auto-Vectorization

The single instruction multiple data (SIMD) capability in modern processors is critical to improving the performance of current compute-intensive programs. Modern compilers use vectorization techniques to exploit the SIMD capability, by detecting data parallelism in scalar source code and transformi...

Bibliographic Details
Main Authors: Yao Xiao, Nesreen K. Ahmed, Mihai Capotă, Guixiang Ma, Theodore L. Willke, Shahin Nazarian, Paul Bogdan
Format: Article
Language: English
Published: American Association for the Advancement of Science (AAAS) 2025-01-01
Series: Intelligent Computing
Online Access: https://spj.science.org/doi/10.34133/icomputing.0113
collection DOAJ
description The single instruction, multiple data (SIMD) capability of modern processors is critical to improving the performance of compute-intensive programs. Modern compilers exploit this capability through vectorization: they detect data parallelism in scalar source code and transform groups of scalar instructions into vector instructions. In this study, we focus on one of the most common vectorization techniques, loop-based vectorization, which targets loops and optimizes their performance by grouping multiple occurrences of the same operation across loop iterations into a single SIMD instruction. We propose a data-driven, graph-based learning framework for automatic vectorization, called autograph, which takes an input program, extracts its loops, and then learns a structured representation to automatically predict the correct vectorization and interleaving factors. The framework uses deep reinforcement learning, in which an intelligent agent in a SIMD environment learns an optimal policy (a mapping from observations to actions), and automatically injects the predicted vectorization pragmas into the input program. We conducted an extensive evaluation on multiple benchmark datasets and compared the framework with state-of-the-art baselines. Our results show that, on Polybench, autograph achieves an average performance improvement of 2.49× over NeuroVectorizer and 3.69× over the -O3 baseline.
format Article
id doaj-art-b3142e5dff0549f8b47e6c175b7c1a8b
institution Kabale University
issn 2771-5892
language English
publishDate 2025-01-01
publisher American Association for the Advancement of Science (AAAS)
record_format Article
series Intelligent Computing
spelling doaj-art-b3142e5dff0549f8b47e6c175b7c1a8b 2025-08-20T03:37:01Z eng American Association for the Advancement of Science (AAAS) Intelligent Computing 2771-5892 2025-01-01 4 10.34133/icomputing.0113 A Graph-Based Learning Framework for Compiler Loop Auto-Vectorization
Yao Xiao: Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA.
Nesreen K. Ahmed: Cisco AI Research, San Jose, CA, USA.
Mihai Capotă: Intel Labs, Hillsboro, OR, USA.
Guixiang Ma: Intel Labs, Hillsboro, OR, USA.
Theodore L. Willke: Intel Labs, Hillsboro, OR, USA.
Shahin Nazarian: Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA.
Paul Bogdan: Department of Electrical and Computer Engineering, Viterbi School of Engineering, University of Southern California, Los Angeles, CA, USA.
title A Graph-Based Learning Framework for Compiler Loop Auto-Vectorization
url https://spj.science.org/doi/10.34133/icomputing.0113
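
The description above says that the framework injects predicted vectorization pragmas into the input program. As a rough illustration of that last step only, the sketch below places a Clang loop hint in front of each for loop in a C source string. It is not the authors' implementation: the helper name inject_pragmas, the regular expression, the sample saxpy loop, and the fixed factors (4 and 2) are all assumptions made for this example, and this record does not state which pragma syntax the framework emits. A real system would predict the factors per loop instead of applying one pair everywhere.

```python
import re

# Assumption: Clang's loop-hint syntax. The catalog record does not say which
# pragma form the framework actually emits.
PRAGMA = "#pragma clang loop vectorize_width({vf}) interleave_count({ilf})"

# Matches a line that starts a C for loop and captures its indentation.
FOR_LINE = re.compile(r"^(\s*)for\s*\(")


def inject_pragmas(source: str, vf: int, ilf: int) -> str:
    """Return `source` with a loop hint inserted before every for loop.

    vf is the vectorization factor and ilf the interleaving factor. Both are
    passed in by the caller here; in a learned system they would be predicted.
    """
    out = []
    for line in source.splitlines():
        match = FOR_LINE.match(line)
        if match:
            # Reuse the loop's indentation so the pragma lines up with it.
            out.append(match.group(1) + PRAGMA.format(vf=vf, ilf=ilf))
        out.append(line)
    return "\n".join(out)


# Hypothetical input program: a single data-parallel loop.
C_LOOP = """\
void saxpy(int n, float a, const float *x, float *y) {
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}"""

annotated = inject_pragmas(C_LOOP, vf=4, ilf=2)
print(annotated)
```

With these hints, the compiler is asked to process four iterations of the loop per vector instruction and to interleave two such vector iterations, which is the grouping of the same operation across loop iterations that the description refers to.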