A Self-Checking Hardware Journal for a Fault-Tolerant Processor Architecture
We introduce a specialized self-checking hardware journal used as the centerpiece of our design strategy to build a processor tolerant to transient faults. Fault tolerance here relies on error detection techniques in the processor core together with journalization and rollback executi...
Main Authors: | Mohsin Amin, Abbas Ramazani, Fabrice Monteiro, Camille Diou, Abbas Dandache |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2011-01-01 |
Series: | International Journal of Reconfigurable Computing |
Online Access: | http://dx.doi.org/10.1155/2011/962062 |
_version_ | 1832553965147914240 |
---|---|
author | Mohsin Amin, Abbas Ramazani, Fabrice Monteiro, Camille Diou, Abbas Dandache |
author_sort | Mohsin Amin |
collection | DOAJ |
description | We introduce a specialized self-checking hardware journal used as the
centerpiece of our design strategy to build a processor tolerant to transient
faults. Fault tolerance here relies on error detection techniques in
the processor core together with journalization and rollback execution to
recover from erroneous situations. Effective rollback recovery is possible
thanks to the hardware journal and the choice of a stack computing architecture
for the processor core instead of the usual RISC or CISC. The main objective of
the journalization and the self-checking hardware journal is to prevent data not
yet validated from being sent to the main memory, and to allow fast rollback of
execution when a fault occurs. The main memory, assumed to be fault secure in
our model, contains only valid (uncorrupted) data obtained from fault-free
computations. Error control coding techniques are used both in the processor
core to detect errors and in the hardware journal to protect the temporarily
stored data from changes induced by transient faults. Implementation results
on an FPGA of the Altera Stratix-II family clearly show the relevance of the
approach, in terms of both performance/area tradeoff and fault tolerance
effectiveness, even at high error rates. |
format | Article |
id | doaj-art-b1fa2a0886b342c3882c4115b6b9da4c |
institution | Kabale University |
issn | 1687-7195 1687-7209 |
language | English |
publishDate | 2011-01-01 |
publisher | Wiley |
record_format | Article |
series | International Journal of Reconfigurable Computing |
spelling | doaj-art-b1fa2a0886b342c3882c4115b6b9da4c; 2025-02-03T05:52:42Z; eng; Wiley; International Journal of Reconfigurable Computing; 1687-7195; 1687-7209; 2011-01-01; 2011; 10.1155/2011/962062; 962062; A Self-Checking Hardware Journal for a Fault-Tolerant Processor Architecture; Mohsin Amin (LICM Laboratory, University Paul Verlaine, Metz, 7 rue Marconi, 57070 Metz, France); Abbas Ramazani (Electrical Engineering Department, Engineering Faculty Lorestan, University Khorramabad, Iran); Fabrice Monteiro (LICM Laboratory, University Paul Verlaine, Metz, 7 rue Marconi, 57070 Metz, France); Camille Diou (LICM Laboratory, University Paul Verlaine, Metz, 7 rue Marconi, 57070 Metz, France); Abbas Dandache (LICM Laboratory, University Paul Verlaine, Metz, 7 rue Marconi, 57070 Metz, France); http://dx.doi.org/10.1155/2011/962062 |
title | A Self-Checking Hardware Journal for a Fault-Tolerant Processor Architecture |
title_sort | self checking hardware journal for a fault tolerant processor architecture |
url | http://dx.doi.org/10.1155/2011/962062 |
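
The journalization and rollback scheme summarized in the description can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' hardware design: the `Journal` class, its method names, and the use of a single parity bit as the error-detecting code are all hypothetical choices made for illustration (the record says only that "error control coding techniques" are used). The model shows the two properties the description names: unvalidated data never reaches main memory, and a detected error discards the pending writes so execution can restart from the last validated state.

```python
from dataclasses import dataclass, field


def parity(value: int) -> int:
    """Even-parity bit of a non-negative integer (assumed check code)."""
    return bin(value).count("1") & 1


@dataclass
class Journal:
    """Hypothetical software model of a self-checking write journal."""
    memory: dict                                  # main memory, assumed fault secure
    entries: dict = field(default_factory=dict)   # addr -> (value, parity bit)

    def write(self, addr: int, value: int) -> None:
        # Writes are buffered with their check bit; memory is not touched.
        self.entries[addr] = (value, parity(value))

    def read(self, addr: int) -> int:
        # Pending (unvalidated) data shadows main memory.
        if addr in self.entries:
            value, check = self.entries[addr]
            if parity(value) != check:
                raise ValueError(f"journal entry at {addr} is corrupted")
            return value
        return self.memory.get(addr, 0)

    def rollback(self) -> None:
        # Discard every unvalidated write; memory keeps its last valid state.
        self.entries.clear()

    def validate(self) -> bool:
        # Commit only if every buffered entry still passes its check.
        if any(parity(v) != c for v, c in self.entries.values()):
            self.rollback()
            return False
        for addr, (value, _) in self.entries.items():
            self.memory[addr] = value
        self.entries.clear()
        return True


if __name__ == "__main__":
    mem = {0: 10}
    journal = Journal(mem)
    journal.write(0, 11)
    print(mem[0])              # memory is unchanged until validation
    print(journal.validate())  # commit succeeds, memory now holds 11
    journal.write(0, 12)
    # Simulate a transient fault flipping one bit of the stored value.
    journal.entries[0] = (13, journal.entries[0][1])
    print(journal.validate())  # error detected, pending write discarded
    print(mem[0])
```

A single parity bit detects only an odd number of flipped bits per entry; a real design would likely use a stronger code, which this sketch does not attempt to model.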