MLCRP: ML-Based GPU Cache Performance Modeling Featured With Reuse Profiles

Bibliographic Details
Main Authors: Minjung Cho, Eui-Young Chung
Format: Article
Language:English
Published: IEEE 2025-01-01
Series:IEEE Access
Subjects:
Online Access:https://ieeexplore.ieee.org/document/11082128/
_version_ 1849420325283954688
author Minjung Cho
Eui-Young Chung
author_facet Minjung Cho
Eui-Young Chung
author_sort Minjung Cho
collection DOAJ
description Accurate cache performance prediction is critical for designing efficient memory hierarchies in high-performance computing systems. While cycle-accurate simulators provide high accuracy, they incur significant computational cost and time, making them impractical for large-scale design space exploration. Analytical models are faster but lack accuracy in complex cache scenarios. This paper proposes MLCRP, a machine-learning-based GPU cache performance prediction framework that uses the reuse profile (RP) as its key feature. The RP captures memory access locality as a histogram of reuse distances. MLCRP consists of three main stages: data preparation, training, and inference. In the data preparation stage, synthetic RP-based traces are generated from parameterized distributions to simulate diverse and non-stationary memory access patterns. In the training stage, a regression-based ML model learns the relationship between RP features, cache configurations, and performance metrics such as miss rate and miss status holding register (MSHR) merge rate. Finally, we propose a method to extract RP features from real GPU application traces, enabling the trained model to predict their cache performance. Experimental results demonstrate that MLCRP significantly improves prediction accuracy over existing analytical models, keeping the mean absolute error (MAE) within 5%. Furthermore, it reduces simulation time by an average of four orders of magnitude compared to cycle-accurate simulators. Combining the speed of analytical models with the accuracy of simulation, MLCRP offers a scalable and generalizable solution for GPU cache modeling.
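The reuse profile named in the abstract — a histogram of reuse distances over a memory access trace — can be illustrated with a short sketch. This is not the paper's implementation: it is a naive O(n·m) version for clarity (practical profilers use tree-based counting, e.g. Olken's algorithm), and the `line_size` parameter and example trace are assumptions for illustration.

```python
from collections import defaultdict

def reuse_profile(trace, line_size=128):
    """Build a reuse profile: a histogram of reuse distances.

    The reuse distance of an access is the number of *distinct* cache
    lines touched since the previous access to the same line;
    first-touch (cold) accesses are binned under float('inf').
    """
    last_pos = {}               # cache line -> index of its previous access
    history = []                # sequence of cache lines accessed so far
    profile = defaultdict(int)  # reuse distance -> count
    for i, addr in enumerate(trace):
        line = addr // line_size
        if line in last_pos:
            # distinct lines touched strictly between the two accesses
            distance = len(set(history[last_pos[line] + 1 : i]))
            profile[distance] += 1
        else:
            profile[float('inf')] += 1  # cold (compulsory) access
        last_pos[line] = i
        history.append(line)
    return dict(profile)

# Three cold accesses, then two reuses that each skip over 2 other lines
trace = [0, 128, 256, 0, 128]
print(reuse_profile(trace, line_size=128))  # → {inf: 3, 2: 2}
```

For a fully associative LRU cache with C lines, an access with reuse distance d hits iff d < C, which is why this histogram, combined with the cache configuration, is a natural input feature for a miss-rate model.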
format Article
id doaj-art-dbbbc0ffc9ac453dbd39ff022771ee00
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-dbbbc0ffc9ac453dbd39ff022771ee00
indexed 2025-08-20T03:31:47Z
language eng
publisher IEEE
container IEEE Access, ISSN 2169-3536, 2025-01-01, Vol. 13, pp. 126610-126622
doi 10.1109/ACCESS.2025.3589802
ieee_document 11082128
title MLCRP: ML-Based GPU Cache Performance Modeling Featured With Reuse Profiles
author Minjung Cho (https://orcid.org/0000-0002-6443-9878), Department of Electrical and Electronic Engineering, Yonsei University, Seoul, Republic of Korea
author Eui-Young Chung (https://orcid.org/0000-0003-2013-8763), Department of Electrical and Electronic Engineering, Yonsei University, Seoul, Republic of Korea
url https://ieeexplore.ieee.org/document/11082128/
keywords Cache memory; reuse distance; reuse profile; machine learning; train data generation
spellingShingle Minjung Cho
Eui-Young Chung
MLCRP: ML-Based GPU Cache Performance Modeling Featured With Reuse Profiles
IEEE Access
Cache memory
reuse distance
reuse profile
machine learning
train data generation
title MLCRP: ML-Based GPU Cache Performance Modeling Featured With Reuse Profiles
title_full MLCRP: ML-Based GPU Cache Performance Modeling Featured With Reuse Profiles
title_fullStr MLCRP: ML-Based GPU Cache Performance Modeling Featured With Reuse Profiles
title_full_unstemmed MLCRP: ML-Based GPU Cache Performance Modeling Featured With Reuse Profiles
title_short MLCRP: ML-Based GPU Cache Performance Modeling Featured With Reuse Profiles
title_sort mlcrp ml based gpu cache performance modeling featured with reuse profiles
topic Cache memory
reuse distance
reuse profile
machine learning
train data generation
url https://ieeexplore.ieee.org/document/11082128/
work_keys_str_mv AT minjungcho mlcrpmlbasedgpucacheperformancemodelingfeaturedwithreuseprofiles
AT euiyoungchung mlcrpmlbasedgpucacheperformancemodelingfeaturedwithreuseprofiles