A non-conservative software-based approach for detecting illegal CFEs caused by transient faults

Diego Rodrigues; Ghazaleh Nazarian; Álvaro Moreira; Luigi Carro; Georgi Gaydadjiev

doi:10.1109/DFT.2015.7315166

Data-Intensive Systems

Research output: Chapter in Book/Conference proceedings/Edited volume › Conference contribution › Scientific › peer-review

3 Citations (Scopus)

15 Downloads (Pure)

Abstract

Software-based methods for detecting control-flow errors caused by transient faults usually insert protecting instructions at both the beginning and the end of basic blocks. These methods are conservative in nature, in the sense that they assume that all blocks have the same probability of being the target of control-flow errors. Because of that assumption, they can lead to a considerable increase in both memory and performance overhead at execution time. In this paper, we propose a static analysis that provides more refined information about which basic blocks can be the target of control-flow errors caused by single-bit flips. This information can then be used to guide a program transformation in which only susceptible blocks have to be protected. We implemented the static analysis and program transformation in the context of the LLVM framework and performed an extensive fault injection campaign. Our experiments show that this less conservative approach can potentially lead to gains both in memory usage and in execution time while keeping high fault coverage.
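The idea of protecting only susceptible blocks can be illustrated with a toy model. The sketch below is not the analysis from the paper: it assumes, purely for illustration, that a fault flips exactly one bit of a legal branch target address, and it marks a block as susceptible when its start address is one such flip away. The function names, the 32-bit address width, and the block layout are all hypothetical.

```python
from typing import Dict, Set


def single_bit_neighbors(address: int, width: int = 32) -> Set[int]:
    """Return every address that differs from `address` in exactly one bit."""
    return {address ^ (1 << bit) for bit in range(width)}


def susceptible_blocks(block_starts: Dict[str, int],
                       branch_targets: Set[int],
                       width: int = 32) -> Set[str]:
    """Blocks whose start address is one bit flip away from a legal target.

    Assumed fault model (not taken from the paper): a transient fault
    corrupts a single bit of a branch target address.
    """
    by_address = {addr: name for name, addr in block_starts.items()}
    hit: Set[str] = set()
    for target in branch_targets:
        for corrupted in single_bit_neighbors(target, width):
            name = by_address.get(corrupted)
            if name is not None:
                hit.add(name)
    return hit


# Hypothetical layout: four basic blocks at made-up start addresses.
blocks = {"entry": 0x1000, "loop": 0x1004, "body": 0x1013, "exit": 0x1040}
# Legal branch destinations in this made-up program.
targets = {0x1004, 0x1040}

print(sorted(susceptible_blocks(blocks, targets)))
```

In this layout only `entry` lies a single bit flip away from a legal target, so under the assumed model it would be the only block given protecting instructions, while a conservative scheme would instrument all four.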

Original language	English
Title of host publication	Proceedings of the 2015 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015
Publisher	Institute of Electrical and Electronics Engineers (IEEE)
Pages	221-226
Number of pages	6
ISBN (Electronic)	9781509003129
DOIs	https://doi.org/10.1109/DFT.2015.7315166
Publication status	Published - 2 Nov 2015
Event	28th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015 - Amherst, United States; Duration: 12 Oct 2015 → 14 Oct 2015

Conference

Conference	28th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015
Country/Territory	United States
City	Amherst
Period	12/10/15 → 14/10/15

Keywords

Fault tolerance
Reliability, availability, and serviceability

Access to Document

10.1109/DFT.2015.7315166

Cite this

Rodrigues, D., Nazarian, G., Moreira, Á., Carro, L., & Gaydadjiev, G. (2015). A non-conservative software-based approach for detecting illegal CFEs caused by transient faults. In Proceedings of the 2015 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015 (pp. 221-226). Article 7315166. Institute of Electrical and Electronics Engineers (IEEE). https://doi.org/10.1109/DFT.2015.7315166

@inproceedings{a7664a9916f34113b9bfe19400299940,

title = "A non-conservative software-based approach for detecting illegal CFEs caused by transient faults",

abstract = "Software-based methods for detecting control-flow errors caused by transient faults usually insert protecting instructions at both the beginning and the end of basic blocks. These methods are conservative in nature, in the sense that they assume that all blocks have the same probability of being the target of control-flow errors. Because of that assumption, they can lead to a considerable increase in both memory and performance overhead at execution time. In this paper, we propose a static analysis that provides more refined information about which basic blocks can be the target of control-flow errors caused by single-bit flips. This information can then be used to guide a program transformation in which only susceptible blocks have to be protected. We implemented the static analysis and program transformation in the context of the LLVM framework and performed an extensive fault injection campaign. Our experiments show that this less conservative approach can potentially lead to gains both in memory usage and in execution time while keeping high fault coverage.",

keywords = "Fault tolerance; Reliability, availability, and serviceability",

author = "Diego Rodrigues and Ghazaleh Nazarian and {\'A}lvaro Moreira and Luigi Carro and Georgi Gaydadjiev",

year = "2015",

month = nov,

day = "2",

doi = "10.1109/DFT.2015.7315166",

language = "English",

pages = "221--226",

booktitle = "Proceedings of the 2015 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015",

publisher = "Institute of Electrical and Electronics Engineers (IEEE)",

address = "United States",

note = "28th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015 ; Conference date: 12-10-2015 Through 14-10-2015",

}

Rodrigues, D, Nazarian, G, Moreira, Á, Carro, L & Gaydadjiev, G 2015, A non-conservative software-based approach for detecting illegal CFEs caused by transient faults. in Proceedings of the 2015 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015, 7315166, Institute of Electrical and Electronics Engineers (IEEE), pp. 221-226, 28th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015, Amherst, United States, 12/10/15. https://doi.org/10.1109/DFT.2015.7315166

A non-conservative software-based approach for detecting illegal CFEs caused by transient faults. / Rodrigues, Diego; Nazarian, Ghazaleh; Moreira, Álvaro et al.
Proceedings of the 2015 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015. Institute of Electrical and Electronics Engineers (IEEE), 2015. p. 221-226 7315166.

TY - GEN

T1 - A non-conservative software-based approach for detecting illegal CFEs caused by transient faults

AU - Rodrigues, Diego

AU - Nazarian, Ghazaleh

AU - Moreira, Álvaro

AU - Carro, Luigi

AU - Gaydadjiev, Georgi

PY - 2015/11/2

Y1 - 2015/11/2

N2 - Software-based methods for detecting control-flow errors caused by transient faults usually insert protecting instructions at both the beginning and the end of basic blocks. These methods are conservative in nature, in the sense that they assume that all blocks have the same probability of being the target of control-flow errors. Because of that assumption, they can lead to a considerable increase in both memory and performance overhead at execution time. In this paper, we propose a static analysis that provides more refined information about which basic blocks can be the target of control-flow errors caused by single-bit flips. This information can then be used to guide a program transformation in which only susceptible blocks have to be protected. We implemented the static analysis and program transformation in the context of the LLVM framework and performed an extensive fault injection campaign. Our experiments show that this less conservative approach can potentially lead to gains both in memory usage and in execution time while keeping high fault coverage.

AB - Software-based methods for detecting control-flow errors caused by transient faults usually insert protecting instructions at both the beginning and the end of basic blocks. These methods are conservative in nature, in the sense that they assume that all blocks have the same probability of being the target of control-flow errors. Because of that assumption, they can lead to a considerable increase in both memory and performance overhead at execution time. In this paper, we propose a static analysis that provides more refined information about which basic blocks can be the target of control-flow errors caused by single-bit flips. This information can then be used to guide a program transformation in which only susceptible blocks have to be protected. We implemented the static analysis and program transformation in the context of the LLVM framework and performed an extensive fault injection campaign. Our experiments show that this less conservative approach can potentially lead to gains both in memory usage and in execution time while keeping high fault coverage.

KW - Fault tolerance

KW - Reliability, availability, and serviceability

UR - http://www.scopus.com/inward/record.url?scp=84962861063&partnerID=8YFLogxK

U2 - 10.1109/DFT.2015.7315166

DO - 10.1109/DFT.2015.7315166

M3 - Conference contribution

AN - SCOPUS:84962861063

SP - 221

EP - 226

BT - Proceedings of the 2015 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015

PB - Institute of Electrical and Electronics Engineers (IEEE)

T2 - 28th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015

Y2 - 12 October 2015 through 14 October 2015

ER -

Rodrigues D, Nazarian G, Moreira Á, Carro L, Gaydadjiev G. A non-conservative software-based approach for detecting illegal CFEs caused by transient faults. In Proceedings of the 2015 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems, DFTS 2015. Institute of Electrical and Electronics Engineers (IEEE). 2015. p. 221-226. 7315166 doi: 10.1109/DFT.2015.7315166
