Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures

Field-programmable gate arrays (FPGAs) and other reconfigurable computing (RC) devices have been widely shown to have numerous advantages including order of magnitude performance and power improvements compared to microprocessors for some applications. Unfortunately, FPGA usage has largely been limi...

Full description

Saved in:
Bibliographic Details
Main Authors: James Coole, Greg Stitt
Format: Article
Language:English
Published: Wiley 2010-01-01
Series:International Journal of Reconfigurable Computing
Online Access:http://dx.doi.org/10.1155/2010/652620
Tags: Add Tag
No Tags, Be the first to tag this record!
_version_ 1850234667377098752
author James Coole
Greg Stitt
author_facet James Coole
Greg Stitt
author_sort James Coole
collection DOAJ
description Field-programmable gate arrays (FPGAs) and other reconfigurable computing (RC) devices have been widely shown to have numerous advantages including order of magnitude performance and power improvements compared to microprocessors for some applications. Unfortunately, FPGA usage has largely been limited to applications exhibiting sequential memory access patterns, thereby prohibiting acceleration of important applications with irregular patterns (e.g., pointer-based data structures). In this paper, we present a design pattern for RC application development that serializes irregular data structure traversals online into a traversal cache, which allows the corresponding data to be efficiently streamed to the FPGA. The paper presents a generalized framework that benefits applications with repeated traversals, which we show can achieve between 7x and 29x speedup over pointer-based software. For applications without strictly repeated traversals, we present application-specialized extensions that benefit applications with highly similar traversals by exploiting similarity to improve memory bandwidth and execute multiple traversals in parallel. We show that these extensions can achieve a speedup between 11x and 70x on a Virtex4 LX100 for Barnes-Hut n-body simulation.
format Article
id doaj-art-1660972765d643b4ae0c8ecbc1a776e6
institution OA Journals
issn 1687-7195
1687-7209
language English
publishDate 2010-01-01
publisher Wiley
record_format Article
series International Journal of Reconfigurable Computing
spelling doaj-art-1660972765d643b4ae0c8ecbc1a776e62025-08-20T02:02:33ZengWileyInternational Journal of Reconfigurable Computing1687-71951687-72092010-01-01201010.1155/2010/652620652620Traversal Caches: A Framework for FPGA Acceleration of Pointer Data StructuresJames Coole0Greg Stitt1Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611-6550, USADepartment of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611-6550, USAField-programmable gate arrays (FPGAs) and other reconfigurable computing (RC) devices have been widely shown to have numerous advantages including order of magnitude performance and power improvements compared to microprocessors for some applications. Unfortunately, FPGA usage has largely been limited to applications exhibiting sequential memory access patterns, thereby prohibiting acceleration of important applications with irregular patterns (e.g., pointer-based data structures). In this paper, we present a design pattern for RC application development that serializes irregular data structure traversals online into a traversal cache, which allows the corresponding data to be efficiently streamed to the FPGA. The paper presents a generalized framework that benefits applications with repeated traversals, which we show can achieve between 7x and 29x speedup over pointer-based software. For applications without strictly repeated traversals, we present application-specialized extensions that benefit applications with highly similar traversals by exploiting similarity to improve memory bandwidth and execute multiple traversals in parallel. We show that these extensions can achieve a speedup between 11x and 70x on a Virtex4 LX100 for Barnes-Hut n-body simulation.http://dx.doi.org/10.1155/2010/652620
spellingShingle James Coole
Greg Stitt
Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures
International Journal of Reconfigurable Computing
title Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures
title_full Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures
title_fullStr Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures
title_full_unstemmed Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures
title_short Traversal Caches: A Framework for FPGA Acceleration of Pointer Data Structures
title_sort traversal caches a framework for fpga acceleration of pointer data structures
url http://dx.doi.org/10.1155/2010/652620
work_keys_str_mv AT jamescoole traversalcachesaframeworkforfpgaaccelerationofpointerdatastructures
AT gregstitt traversalcachesaframeworkforfpgaaccelerationofpointerdatastructures