A Configurable Accelerator for CNN-Based Remote Sensing Object Detection on FPGAs

Bibliographic Details
Main Authors: Yingzhao Shao, Jincheng Shang, Yunsong Li, Yueli Ding, Mingming Zhang, Ke Ren, Yang Liu
Format: Article
Language: English
Published: Wiley 2024-01-01
Series: IET Computers & Digital Techniques
Online Access: http://dx.doi.org/10.1049/2024/4415342
author Yingzhao Shao
Jincheng Shang
Yunsong Li
Yueli Ding
Mingming Zhang
Ke Ren
Yang Liu
collection DOAJ
description Convolutional neural networks (CNNs) are widely used in satellite remote sensing. However, in-orbit satellites, with their limited resources and power budgets, cannot meet the storage and computing requirements of current million-scale artificial intelligence models. This paper proposes a new generation of highly flexible, intelligent CNN hardware accelerator for satellite remote sensing, intended to make the computing platform more lightweight and efficient. A data quantization scheme for INT16 or INT8, based on dynamic fixed-point numbers, is designed and applied to different scenarios. The operation of the systolic array is partitioned into channel blocks, and the calculation method is optimized to increase the utilization of on-chip computing resources and improve computational efficiency. An RTL-level CNN accelerator for field-programmable gate arrays (FPGAs), with data flow scheduled by microinstruction sequences, is then designed. The hardware framework is built on the Xilinx VC709. The results show that, under INT16 or INT8 precision, the system achieves high throughput in most convolutional layers of the network, with an average performance of 153.14 giga operations per second (GOPS) or 301.52 GOPS, respectively, which is close to the system’s peak performance and takes full advantage of the platform’s parallel computing capabilities.
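The description mentions INT16 or INT8 quantization based on dynamic fixed-point numbers but gives no details. As a rough illustration of the general idea only, and not the authors' scheme, the Python sketch below picks a per-tensor fractional length so that the largest magnitude still fits in the chosen bit width, then rounds and saturates. The function names (`choose_frac_bits`, `quantize`, `dequantize`), the sample weights, and the rule for choosing the fractional length are assumptions made for this example.

```python
import math


def choose_frac_bits(values, total_bits):
    """Pick a fractional length so the largest magnitude fits (assumed rule)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Bits needed for the integer part, excluding the sign bit.
    # This can be zero or negative when every value is below 1.
    int_bits = math.floor(math.log2(max_abs)) + 1
    return total_bits - 1 - int_bits


def quantize(values, total_bits):
    """Return signed integer codes and the shared fractional length."""
    frac_bits = choose_frac_bits(values, total_bits)
    scale = 2.0 ** frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    # Round to the nearest code, then saturate to the signed range.
    codes = [min(hi, max(lo, round(v * scale))) for v in values]
    return codes, frac_bits


def dequantize(codes, frac_bits):
    """Map integer codes back to real values."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]


if __name__ == "__main__":
    weights = [0.3, -1.5, 0.75, 1.2]  # made-up sample data
    for bits in (8, 16):
        codes, frac_bits = quantize(weights, bits)
        print(bits, frac_bits, codes, dequantize(codes, frac_bits))
```

Because the fractional length is chosen per tensor rather than fixed globally, tensors with small values keep more fractional precision, which is the usual motivation for a dynamic fixed-point format.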
format Article
id doaj-art-990e8d5ba8004689ab8f441d58c9cd3f
institution Kabale University
issn 1751-861X
language English
publishDate 2024-01-01
publisher Wiley
record_format Article
series IET Computers & Digital Techniques
spelling doaj-art-990e8d5ba8004689ab8f441d58c9cd3f 2025-08-20T03:37:05Z eng Wiley IET Computers & Digital Techniques 1751-861X 2024-01-01 2024 10.1049/2024/4415342
A Configurable Accelerator for CNN-Based Remote Sensing Object Detection on FPGAs
Yingzhao Shao (Xidian University)
Jincheng Shang (Xidian University)
Yunsong Li (Xidian University)
Yueli Ding (China Academy of Space Technology (Xi’an))
Mingming Zhang (China Academy of Space Technology (Xi’an))
Ke Ren (China Academy of Space Technology (Xi’an))
Yang Liu (China Academy of Space Technology (Xi’an))
http://dx.doi.org/10.1049/2024/4415342
title A Configurable Accelerator for CNN-Based Remote Sensing Object Detection on FPGAs
url http://dx.doi.org/10.1049/2024/4415342