PyPOD-GP: Using PyTorch for accelerated chip-level thermal simulation of the GPU

Bibliographic Details
Main Authors: Neil He, Ming-Cheng Cheng, Yu Liu
Format: Article
Language: English
Published: Elsevier 2025-05-01
Series: SoftwareX
Online Access: http://www.sciencedirect.com/science/article/pii/S2352711025001141
Collection: DOAJ
Description: The rising demand for high-performance computing (HPC) has made full-chip dynamic thermal simulation in many-core GPUs critical for optimizing performance and extending device lifespans. Proper orthogonal decomposition (POD) with Galerkin projection (GP) has been shown to offer high accuracy and massive runtime improvements over direct numerical simulation (DNS). However, previous implementations of POD-GP use MPI-based libraries like PETSc and FEniCS and face significant runtime bottlenecks. We propose a PyTorch-based POD-GP library (PyPOD-GP), a GPU-optimized library for chip-level thermal simulation. PyPOD-GP achieves over 23.4× speedup in training and over 10× speedup in inference on a GPU with over 13,000 cores, with just 1.2% error over the device layer.
ISSN: 2352-7110
Record ID: doaj-art-01016b463d3d4b8bb3bce12e35284a73
Volume: 30
Article number: 102147
DOI: 10.1016/j.softx.2025.102147
Author affiliations:
Neil He: Department of Mathematics, Yale University, United States of America
Ming-Cheng Cheng: Department of Electrical and Computer Engineering, Clarkson University, United States of America
Yu Liu (corresponding author): Department of Electrical and Computer Engineering, Clarkson University, United States of America
Subjects: GPU thermal simulation; Proper Orthogonal Decomposition (POD); Finite element method; PyTorch; Galerkin projection