A Configurable Accelerator for CNN-Based Remote Sensing Object Detection on FPGAs
Convolutional neural networks (CNNs) have been widely used in satellite remote sensing. However, satellites in orbit with limited resources and power consumption cannot meet the storage and computing power requirements of current million-scale artificial intelligence models. This paper proposes a ne...
| Main Authors: | Yingzhao Shao, Jincheng Shang, Yunsong Li, Yueli Ding, Mingming Zhang, Ke Ren, Yang Liu |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Wiley, 2024-01-01 |
| Series: | IET Computers & Digital Techniques |
| Online Access: | http://dx.doi.org/10.1049/2024/4415342 |
| _version_ | 1849404130300264448 |
|---|---|
| author | Yingzhao Shao, Jincheng Shang, Yunsong Li, Yueli Ding, Mingming Zhang, Ke Ren, Yang Liu |
| author_sort | Yingzhao Shao |
| collection | DOAJ |
| description | Convolutional neural networks (CNNs) have been widely used in satellite remote sensing. However, satellites in orbit, with limited resources and power budgets, cannot meet the storage and computing requirements of current million-scale artificial intelligence models. This paper proposes a new generation of highly flexible, intelligent CNN hardware accelerator for satellite remote sensing in order to make its computing carrier more lightweight and efficient. A data quantization scheme for INT16 or INT8 is designed based on the idea of dynamic fixed-point numbers and is applied to different scenarios. The operation mode of the systolic array is divided into channel blocks, and the calculation method is optimized to increase the utilization of on-chip computing resources and enhance the calculation efficiency. An RTL-level CNN field-programmable gate array (FPGA) accelerator with microinstruction sequence scheduling data flow is then designed. The hardware framework is built upon the Xilinx VC709. The results show that, under INT16 or INT8 precision, the system achieves remarkable throughput in most convolutional layers of the network, with an average performance of 153.14 giga operations per second (GOPS) or 301.52 GOPS, respectively, which is close to the system’s peak performance, taking full advantage of the platform’s parallel computing capabilities. |
| format | Article |
| id | doaj-art-990e8d5ba8004689ab8f441d58c9cd3f |
| institution | Kabale University |
| issn | 1751-861X |
| language | English |
| publishDate | 2024-01-01 |
| publisher | Wiley |
| record_format | Article |
| series | IET Computers & Digital Techniques |
| spelling | doaj-art-990e8d5ba8004689ab8f441d58c9cd3f; 2025-08-20T03:37:05Z; eng; Wiley; IET Computers & Digital Techniques; 1751-861X; 2024-01-01; 2024; 10.1049/2024/4415342; A Configurable Accelerator for CNN-Based Remote Sensing Object Detection on FPGAs; Yingzhao Shao (Xidian University); Jincheng Shang (Xidian University); Yunsong Li (Xidian University); Yueli Ding (China Academy of Space Technology (Xi’an)); Mingming Zhang (China Academy of Space Technology (Xi’an)); Ke Ren (China Academy of Space Technology (Xi’an)); Yang Liu (China Academy of Space Technology (Xi’an)); http://dx.doi.org/10.1049/2024/4415342 |
| title | A Configurable Accelerator for CNN-Based Remote Sensing Object Detection on FPGAs |
| url | http://dx.doi.org/10.1049/2024/4415342 |
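
The description mentions a quantization scheme for INT16 or INT8 based on dynamic fixed-point numbers, but this record does not give the paper's actual scheme. The following is only a minimal generic sketch of the dynamic fixed-point idea, assuming one shared fractional length per tensor chosen from its largest magnitude; the function names `choose_frac_bits`, `quantize`, and `dequantize` are hypothetical and are not taken from the paper.

```python
import math


def choose_frac_bits(values, bits):
    """Pick the largest fractional length fl such that max|v| * 2**fl
    still fits in a signed `bits`-bit integer (assumed per-tensor rule)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bits - 1  # all zeros: any fractional length works
    qmax = 2 ** (bits - 1) - 1
    return math.floor(math.log2(qmax / max_abs))


def quantize(values, bits):
    """Quantize floats to signed `bits`-bit integers with a shared fl."""
    fl = choose_frac_bits(values, bits)
    scale = 2.0 ** fl
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    # Round to the nearest integer step, then saturate to the integer range.
    q = [min(hi, max(lo, round(v * scale))) for v in values]
    return q, fl


def dequantize(q, fl):
    """Map integers back to real values using the stored fractional length."""
    return [x / 2.0 ** fl for x in q]


if __name__ == "__main__":
    weights = [0.5, -1.25, 3.0]  # illustrative values, not from the paper
    for bits in (8, 16):
        q, fl = quantize(weights, bits)
        print(bits, fl, q, dequantize(q, fl))
```

In this sketch the fractional length adapts to the data range, so tensors with small values keep more fractional precision than tensors with large values, which is the general motivation for dynamic rather than static fixed-point formats.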