Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures
Field-programmable gate arrays (FPGAs) and other reconfigurable computing (RC) devices have been widely shown to have numerous advantages including order of magnitude performance and power improvements compared to microprocessors for some applications. Unfortunately, FPGA usage has largely been limi...
Saved in:
| Main Authors: | , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: |
Wiley
2010-01-01
|
| Series: | International Journal of Reconfigurable Computing |
| Online Access: | http://dx.doi.org/10.1155/2010/652620 |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| _version_ | 1850234667377098752 |
|---|---|
| author | James Coole Greg Stitt |
| author_facet | James Coole Greg Stitt |
| author_sort | James Coole |
| collection | DOAJ |
| description | Field-programmable gate arrays (FPGAs) and other reconfigurable computing (RC) devices have been widely shown to have numerous advantages including order of magnitude performance and power improvements compared to microprocessors for some applications. Unfortunately, FPGA usage has largely been limited to applications exhibiting sequential memory access patterns, thereby prohibiting acceleration of important applications with irregular patterns (e.g., pointer-based data structures). In this paper, we present a design pattern for RC application development that serializes irregular data structure traversals online into a traversal cache, which allows the corresponding data to be efficiently streamed to the FPGA. The paper presents a generalized framework that benefits applications with repeated traversals, which we show can achieve between 7x and 29x speedup over pointer-based software. For applications without strictly repeated traversals, we present application-specialized extensions that benefit applications with highly similar traversals by exploiting similarity to improve memory bandwidth and execute multiple traversals in parallel. We show that these extensions can achieve a speedup between 11x and 70x on a Virtex4 LX100 for Barnes-Hut n-body simulation. |
| format | Article |
| id | doaj-art-1660972765d643b4ae0c8ecbc1a776e6 |
| institution | OA Journals |
| issn | 1687-7195 1687-7209 |
| language | English |
| publishDate | 2010-01-01 |
| publisher | Wiley |
| record_format | Article |
| series | International Journal of Reconfigurable Computing |
| spelling | doaj-art-1660972765d643b4ae0c8ecbc1a776e62025-08-20T02:02:33ZengWileyInternational Journal of Reconfigurable Computing1687-71951687-72092010-01-01201010.1155/2010/652620652620Traversal Caches: A Framework for FPGA Acceleration of Pointer Data StructuresJames Coole0Greg Stitt1Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611-6550, USADepartment of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611-6550, USAField-programmable gate arrays (FPGAs) and other reconfigurable computing (RC) devices have been widely shown to have numerous advantages including order of magnitude performance and power improvements compared to microprocessors for some applications. Unfortunately, FPGA usage has largely been limited to applications exhibiting sequential memory access patterns, thereby prohibiting acceleration of important applications with irregular patterns (e.g., pointer-based data structures). In this paper, we present a design pattern for RC application development that serializes irregular data structure traversals online into a traversal cache, which allows the corresponding data to be efficiently streamed to the FPGA. The paper presents a generalized framework that benefits applications with repeated traversals, which we show can achieve between 7x and 29x speedup over pointer-based software. For applications without strictly repeated traversals, we present application-specialized extensions that benefit applications with highly similar traversals by exploiting similarity to improve memory bandwidth and execute multiple traversals in parallel. We show that these extensions can achieve a speedup between 11x and 70x on a Virtex4 LX100 for Barnes-Hut n-body simulation.http://dx.doi.org/10.1155/2010/652620 |
| spellingShingle | James Coole Greg Stitt Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures International Journal of Reconfigurable Computing |
| title | Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures |
| title_full | Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures |
| title_fullStr | Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures |
| title_full_unstemmed | Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures |
| title_short | Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures |
| title_sort | traversal caches a framework for fpga acceleration of pointer data structures |
| url | http://dx.doi.org/10.1155/2010/652620 |
| work_keys_str_mv | AT jamescoole traversalcachesaframeworkforfpgaaccelerationofpointerdatastructures AT gregstitt traversalcachesaframeworkforfpgaaccelerationofpointerdatastructures |