System-Technology Co-Optimization for Dense Edge Architectures Using 3-D Integration and Nonvolatile Memory

High-performance edge artificial intelligence (Edge-AI) inference applications aim for high energy efficiency, memory density, and small form factor, requiring a design-space exploration across the whole stack—workloads, architecture, mapping, and co-optimization with emerging technology....

Bibliographic Details
Main Authors: Leandro M. Giacomini Rocha, Mohamed Naeim, Guilherme Paim, Moritz Brunion, Priya Venugopal, Dragomir Milojevic, James Myers, Mustafa Badaroglu, Marian Verhelst, Julien Ryckaert, Dwaipayan Biswas
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
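Subjects: 3-D partitioning; edge artificial intelligence (Edge-AI); nonvolatile memory (NVM); system-technology co-optimization (STCO); systolic array; voltage-gated spin-orbit torque (VGSOT)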
Online Access: https://ieeexplore.ieee.org/document/10750212/
_version_ 1841526330302136320
author Leandro M. Giacomini Rocha
Mohamed Naeim
Guilherme Paim
Moritz Brunion
Priya Venugopal
Dragomir Milojevic
James Myers
Mustafa Badaroglu
Marian Verhelst
Julien Ryckaert
Dwaipayan Biswas
author_sort Leandro M. Giacomini Rocha
collection DOAJ
description High-performance edge artificial intelligence (Edge-AI) inference applications aim for high energy efficiency, memory density, and small form factor, requiring a design-space exploration across the whole stack: workloads, architecture, mapping, and co-optimization with emerging technology. In this article, we present a system-technology co-optimization (STCO) framework that interfaces with workload-driven system scaling challenges and physical design-enabled technology offerings. The framework is built on three engines that provide physical design characterization, dataflow mapping optimization, and system efficiency prediction. The framework builds on a systolic array accelerator to provide the design-technology characterization points using the advanced imec A10 nanosheet CMOS node along with emerging, high-density voltage-gated spin-orbit torque (VGSOT) magnetic memories (MRAM), combined with memory-on-logic fine-pitch 3-D wafer-to-wafer hybrid bonding. We observe that the 3-D system integration of a static random-access memory (SRAM)-based design leads to 9% power savings with 53% footprint reduction at iso-frequency with respect to the 2-D implementation for the same memory capacity. Three-dimensional nonvolatile memory (NVM)-VGSOT allows a 4× memory capacity increase with 30% footprint reduction at iso-power compared with 2-D SRAM at 1× capacity. Our exploration with two diverse workloads, image resolution enhancement (FSRCNN) and eye tracking (EDSNet), shows that more resources allow better workload mapping possibilities, which can compensate for the degradation in peak system energy efficiency at high memory capacities. We show that a 25% peak efficiency reduction at 32× memory capacity can lead to 7.4× faster execution with 5.7× higher effective TOPS/W than the 1× memory capacity case on the same technology.
format Article
id doaj-art-226b40e97ae0451ba5bc686ec2cc116c
institution Kabale University
issn 2329-9231
language English
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
spelling doaj-art-226b40e97ae0451ba5bc686ec2cc116c
2025-01-17T00:00:34Z
eng
IEEE
IEEE Journal on Exploratory Solid-State Computational Devices and Circuits
ISSN 2329-9231
2024-01-01
Volume 10, pages 125-134
DOI 10.1109/JXCDC.2024.3496118
IEEE Xplore document 10750212
System-Technology Co-Optimization for Dense Edge Architectures Using 3-D Integration and Nonvolatile Memory
Leandro M. Giacomini Rocha, https://orcid.org/0000-0003-2883-2768, imec, Leuven, Belgium
Mohamed Naeim, https://orcid.org/0009-0006-8202-2159, imec, Leuven, Belgium
Guilherme Paim, https://orcid.org/0000-0001-7809-9563, INESC-ID, Lisbon, Portugal
Moritz Brunion, https://orcid.org/0000-0001-7842-7774, imec, Leuven, Belgium
Priya Venugopal, https://orcid.org/0000-0002-9783-2713, imec, Leuven, Belgium
Dragomir Milojevic, https://orcid.org/0000-0001-5915-5160, Bio, Electro and Mechanical Systems Department (BEAMS), Université Libre de Bruxelles, Brussels, Belgium
James Myers, https://orcid.org/0009-0000-2558-7504, imec, Cambridge, U.K.
Mustafa Badaroglu, https://orcid.org/0009-0006-0126-9062, Qualcomm, San Diego, CA, USA
Marian Verhelst, https://orcid.org/0000-0003-3495-9263, KU Leuven, Leuven, Belgium
Julien Ryckaert, imec, Leuven, Belgium
Dwaipayan Biswas, https://orcid.org/0000-0001-7912-3692, imec, Leuven, Belgium
https://ieeexplore.ieee.org/document/10750212/
3-D partitioning
edge artificial intelligence (Edge-AI)
nonvolatile memory (NVM)
system-technology co-optimization (STCO)
systolic array
voltage-gated spin-orbit torque (VGSOT)
title System-Technology Co-Optimization for Dense Edge Architectures Using 3-D Integration and Nonvolatile Memory
topic 3-D partitioning
edge artificial intelligence (Edge-AI)
nonvolatile memory (NVM)
system-technology co-optimization (STCO)
systolic array
voltage-gated spin-orbit torque (VGSOT)
url https://ieeexplore.ieee.org/document/10750212/