Detecting Silent Data Corruptions in Aerospace-Based Computing Using Program Invariants

Bibliographic Details
Main Authors: Junchi Ma, Dengyun Yu, Yun Wang, Zhenbo Cai, Qingxiang Zhang, Cheng Hu
Format: Article
Language: English
Published: Wiley 2016-01-01
Series: International Journal of Aerospace Engineering
Online Access: http://dx.doi.org/10.1155/2016/8213638
Description
Summary: Soft errors caused by single event upsets are a severe challenge to aerospace-based computing. Silent data corruption (SDC) is one outcome of a soft error: the program generates erroneous output with no indication that anything went wrong. SDC is the most insidious outcome and is very difficult to detect. To address this problem, we design and implement an invariant-based system called Radish. Invariants describe properties of a program that hold at a given point; for example, that the value of a variable equals a constant. Radish first extracts invariants at key program points and converts them into assertions. It then hardens the program by inserting the assertions into the source code. When a soft error occurs, an affected assertion evaluates to false at run time and warns the user of the error. To increase SDC coverage, we further propose an extension of Radish, named Radish_D, which applies a software-based instruction duplication mechanism to protect the code sections that the assertions do not cover. Experiments using architectural fault injection show that Radish achieves high SDC coverage with very low overhead. Furthermore, Radish_D provides higher SDC coverage than either Radish or pure instruction duplication.
ISSN: 1687-5966, 1687-5974
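
The two mechanisms named in the summary, invariant-derived assertions and duplication of uncovered code, can be illustrated with a minimal sketch. This is not the Radish implementation: the toy kernel, the names `SoftErrorDetected`, `flip_bit`, `scaled_sum`, and `duplicated_sum`, and the way the fault is injected are all assumptions made for illustration, and the duplication here is done at the level of a whole computation rather than individual instructions.

```python
class SoftErrorDetected(Exception):
    """Raised when a hardening check fails (hypothetical name)."""


def flip_bit(value: int, bit: int) -> int:
    """Simulate a single event upset by flipping one bit of an integer."""
    return value ^ (1 << bit)


def scaled_sum(samples, fault_bit=None):
    """Toy kernel hardened with an invariant-derived assertion.

    Assumed invariant: at the check point, `gain` always equals 4.
    If `fault_bit` is given, a bit flip is injected into `gain` first.
    """
    gain = 4
    if fault_bit is not None:
        gain = flip_bit(gain, fault_bit)  # injected fault

    # Assertion inserted from the invariant "gain == 4".
    if gain != 4:
        raise SoftErrorDetected("invariant gain == 4 violated")

    total = 0
    for s in samples:
        total += s
    return total * gain


def duplicated_sum(samples, fault_bit=None):
    """Duplication-style check for code that no invariant covers.

    The result is computed twice and the two copies are compared.
    If `fault_bit` is given, only the primary copy is corrupted.
    """
    primary = sum(samples)
    shadow = sum(samples)
    if fault_bit is not None:
        primary = flip_bit(primary, fault_bit)  # injected fault
    if primary != shadow:
        raise SoftErrorDetected("duplicated results disagree")
    return primary
```

In a fault-free run both functions return their normal result; with an injected bit flip, each raises `SoftErrorDetected` instead of silently returning a corrupted value, which is the behavior the summary attributes to the hardened program.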