Greedy Prefetch for Reducing Off-Chip Memory Accesses in Convolutional Neural Network Inference
The high parameter and memory access demands of CNNs highlight the need to reduce off-chip memory accesses. While recent approaches have improved data reuse to lessen these accesses, simple and efficient prefetching methods are still lacking. This paper introduces a greedy prefetch method that uses...
Saved in:
| Main Authors: | Dengtian Yang, Lan Chen |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-02-01 |
| Series: | Information |
| Subjects: | deep learning; greedy prefetch; accelerator |
| Online Access: | https://www.mdpi.com/2078-2489/16/3/164 |
| Field | Value |
|---|---|
| _version_ | 1850203739861811200 |
| author | Dengtian Yang; Lan Chen |
| author_facet | Dengtian Yang; Lan Chen |
| author_sort | Dengtian Yang |
| collection | DOAJ |
| description | The high parameter and memory access demands of CNNs highlight the need to reduce off-chip memory accesses. While recent approaches have improved data reuse to lessen these accesses, simple and efficient prefetching methods are still lacking. This paper introduces a greedy prefetch method that uses data repetition to optimize the prefetching route, thus decreasing off-chip memory accesses. The method is also implemented in a hardware simulator to organize a deployment strategy with additional optimizations. Our deployment strategy outperforms recent works, with a maximum data reuse improvement of 1.98×. |
| format | Article |
| id | doaj-art-76923f0d092e4e9db788e96adeffc619 |
| institution | OA Journals |
| issn | 2078-2489 |
| language | English |
| publishDate | 2025-02-01 |
| publisher | MDPI AG |
| record_format | Article |
| series | Information |
| spelling | doaj-art-76923f0d092e4e9db788e96adeffc6192025-08-20T02:11:26ZengMDPI AGInformation2078-24892025-02-0116316410.3390/info16030164Greedy Prefetch for Reducing Off-Chip Memory Accesses in Convolutional Neural Network InferenceDengtian Yang0Lan Chen1Institute of Microelectronics of the Chinese Academy of Sciences, Beijing 100029, ChinaInstitute of Microelectronics of the Chinese Academy of Sciences, Beijing 100029, ChinaThe high parameter and memory access demands of CNNs highlight the need to reduce off-chip memory accesses. While recent approaches have improved data reuse to lessen these accesses, simple and efficient prefetching methods are still lacking. This paper introduces a greedy prefetch method that uses data repetition to optimize the prefetching route, thus decreasing off-chip memory accesses. The method is also implemented in a hardware simulator to organize a deployment strategy with additional optimizations. Our deployment strategy outperforms recent works, with a maximum data reuse improvement of 1.98×.https://www.mdpi.com/2078-2489/16/3/164deep learninggreedy prefetchaccelerator |
| spellingShingle | Dengtian Yang; Lan Chen; Greedy Prefetch for Reducing Off-Chip Memory Accesses in Convolutional Neural Network Inference; Information; deep learning; greedy prefetch; accelerator |
| title | Greedy Prefetch for Reducing Off-Chip Memory Accesses in Convolutional Neural Network Inference |
| title_full | Greedy Prefetch for Reducing Off-Chip Memory Accesses in Convolutional Neural Network Inference |
| title_fullStr | Greedy Prefetch for Reducing Off-Chip Memory Accesses in Convolutional Neural Network Inference |
| title_full_unstemmed | Greedy Prefetch for Reducing Off-Chip Memory Accesses in Convolutional Neural Network Inference |
| title_short | Greedy Prefetch for Reducing Off-Chip Memory Accesses in Convolutional Neural Network Inference |
| title_sort | greedy prefetch for reducing off chip memory accesses in convolutional neural network inference |
| topic | deep learning; greedy prefetch; accelerator |
| url | https://www.mdpi.com/2078-2489/16/3/164 |
| work_keys_str_mv | AT dengtianyang greedyprefetchforreducingoffchipmemoryaccessesinconvolutionalneuralnetworkinference AT lanchen greedyprefetchforreducingoffchipmemoryaccessesinconvolutionalneuralnetworkinference |