Memory Prefetching Evaluation of Scientific Applications on a Modern HPC Arm-Based Processor

Memory prefetching is a well-known technique for mitigating the negative impact of memory access latencies on memory bandwidth. This problem has become more pressing as improvements in memory bandwidth have not kept pace with increases in computational power. While much existing work has been devote...

Bibliographic Details
Main Authors: Nam Ho, Carlos Falquez, Antoni Portero, Estela Suarez, Dirk Pleiter
Format: Article
Language: English
Published: IEEE 2025-01-01
Series: IEEE Access
Subjects: Memory prefetcher; high performance computing; computer simulation
Online Access: https://ieeexplore.ieee.org/document/11003053/
author Nam Ho
Carlos Falquez
Antoni Portero
Estela Suarez
Dirk Pleiter
collection DOAJ
description Memory prefetching is a well-known technique for mitigating the negative impact of memory access latencies on memory bandwidth. This problem has become more pressing as improvements in memory bandwidth have not kept pace with increases in computational power. While much existing work has been devoted to finding appropriate prefetching techniques for specific workloads, few provide insight into the behavior of scientific applications to better understand the impact of prefetchers. This paper investigates the impact of hardware prefetchers on the latest Arm-based high-end processor architectures. In this work, we investigate memory access patterns by analyzing locality properties and visualizing delta and repetitive address patterns. A deeper understanding of memory access patterns allows the use of the appropriate prefetcher and reaching a better correlation between access pattern properties and prefetcher performance. This can guide future co-design efforts. We evaluated traditional and innovative prefetchers using a gem5-based model of Arm Neoverse V1 cores. The model features a 16-core architecture, using Amazon’s Graviton 3 processor as a hardware reference, but substituting DDR5 by high bandwidth memory (HBM2). We performed a detailed prefetching evaluation focusing on stencil, sparse matrix-vector multiplication, and Breadth-First Search kernels. These kernels represent a broad range of the applications running on today’s High-Performance Computing (HPC) systems, which are sensitive to memory performance.
format Article
id doaj-art-2defa946a8ac4e198d07381cb1730379
institution Kabale University
issn 2169-3536
language English
publishDate 2025-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj-art-2defa946a8ac4e198d07381cb1730379 | 2025-08-20T03:47:41Z | eng | IEEE | IEEE Access | 2169-3536 | 2025-01-01 | 13 | 85898 | 85926 | 10.1109/ACCESS.2025.3569533 | 11003053
Nam Ho (0) https://orcid.org/0000-0002-6973-4120
Carlos Falquez (1) https://orcid.org/0000-0003-0382-7743
Antoni Portero (2) https://orcid.org/0000-0003-1319-6404
Estela Suarez (3) https://orcid.org/0000-0003-0748-7264
Dirk Pleiter (4) https://orcid.org/0000-0001-7296-7817
(0) Jülich Supercomputing Centre, Institute for Advanced Simulation, Forschungszentrum Jülich GmbH, Jülich, Germany
(1) Jülich Supercomputing Centre, Institute for Advanced Simulation, Forschungszentrum Jülich GmbH, Jülich, Germany
(2) Jülich Supercomputing Centre, Institute for Advanced Simulation, Forschungszentrum Jülich GmbH, Jülich, Germany
(3) Jülich Supercomputing Centre, Institute for Advanced Simulation, Forschungszentrum Jülich GmbH, Jülich, Germany
(4) Division of Computational Science and Technology, EECS, KTH Royal Institute of Technology, Stockholm, Sweden
title Memory Prefetching Evaluation of Scientific Applications on a Modern HPC Arm-Based Processor
topic Memory prefetcher
high performance computing
computer simulation
url https://ieeexplore.ieee.org/document/11003053/