FPGA Acceleration With Hessian-Based Comprehensive Intra-Layer Mixed-Precision Quantization for Transformer Models
Recent advances in using FPGAs as co-processors for language-model acceleration, prized for their energy efficiency and flexibility, are constrained by limited memory capacity, which hinders the deployment of transformer-based language models. To address this challenge, we propose...
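The truncated abstract names the core technique: Hessian-based intra-layer mixed-precision quantization. As an illustration only, not a reproduction of the paper's method, the Python sketch below shows one common form such schemes take: estimate per-group curvature of the loss with a Hutchinson trace estimator, then assign wider bit-widths to the more sensitive weight groups within a single layer. The group granularity, the {2, 4, 8}-bit palette, and the toy diagonal Hessian are all assumptions for the sake of the example.

```python
# A minimal sketch (not the paper's implementation) of Hessian-aware
# intra-layer mixed-precision quantization: channel groups within one
# weight matrix whose loss is more curvature-sensitive (larger Hessian
# trace) receive wider bit-widths. The bit palette {2, 4, 8} and the
# diagonal toy Hessian are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

def hutchinson_trace(hvp, dim, n_samples=64):
    """Estimate tr(H) from Hessian-vector products with Rademacher probes."""
    total = 0.0
    for _ in range(n_samples):
        v = rng.choice([-1.0, 1.0], size=dim)
        total += v @ hvp(v)
    return total / n_samples

def quantize_uniform(w, bits):
    """Symmetric uniform quantization of a weight group to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax
    return np.round(w / scale).clip(-qmax, qmax) * scale

# Toy layer: a weight matrix split into 4 channel groups of 64 weights;
# a diagonal Hessian stands in for the true curvature of the task loss.
W = rng.normal(size=(4, 64))
H_diag = np.concatenate([np.full(64, s) for s in (0.1, 1.0, 5.0, 0.3)])

traces = [
    hutchinson_trace(lambda v, g=g: H_diag[g * 64:(g + 1) * 64] * v, 64)
    for g in range(4)
]

# Rank groups by sensitivity: least sensitive get 2 bits, most gets 8.
order = np.argsort(traces)
bits = np.empty(4, dtype=int)
bits[order[:2]] = 2   # low-sensitivity groups: aggressive 2-bit
bits[order[2]] = 4
bits[order[3]] = 8    # highest-sensitivity group keeps 8 bits

W_q = np.stack([quantize_uniform(W[g], bits[g]) for g in range(4)])
print("per-group tr(H):", np.round(traces, 2), " bits:", bits)
```

The point this mirrors is the intra-layer granularity in the article's title: precision varies across weight groups of one matrix rather than per layer, so memory-constrained FPGA deployments can spend bits only where the Hessian says they matter.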
| Main Authors: | Woohong Byun, Jongseok Woo, Saibal Mukhopadhyay |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Series: | IEEE Access |
| Online Access: | https://ieeexplore.ieee.org/document/10973048/ |
Similar Items

- FPGA-QNN: Quantized Neural Network Hardware Acceleration on FPGAs
  by: Mustafa Tasci, et al.
  Published: (2025-01-01)
- An Accelerated FPGA-Based Parallel CNN-LSTM Computing Device
  by: Xin Zhou, et al.
  Published: (2024-01-01)
- Survey of FPGA based recurrent neural network accelerator
  by: Chen GAO, et al.
  Published: (2019-08-01)
- Image Processing Hardware Acceleration—A Review of Operations Involved and Current Hardware Approaches
  by: Costin-Emanuel Vasile, et al.
  Published: (2024-11-01)
- An NVMe-Based Secure Computing Platform With FPGA-Based TFHE Accelerator
  by: Yoshihiro Ohba, et al.
  Published: (2025-01-01)