Proceedings of the National Conference on “Emerging Trends in Electronics & Communications” ISBN: 978 93 83038 19 0
A MITIGATION TECHNIQUE FOR SRAM BASED FPGA
Sadaf Omer, M.A.Raheem, Syeda Yasmeen, Naziya Firdous
Department of Electronics and Communication Engineering
Muffakham Jah College of Engineering and Technology, Hyderabad
Abstract: Power consumption and hardware redundancy are significant considerations in most VLSI design applications. FPGAs offer flexibility but are more susceptible to radiation effects such as single event upsets (SEUs) than ASIC designs. The effects of radiation on them need to be studied, and robust methods of fault tolerance need to be devised. In this paper we compare different mitigation techniques with the Input-Output Logic Based (IOLB) technique. IOLB predicts whether or not a change in the output is expected and thereby corrects the error while reducing hardware and time redundancy. We have designed a 4-bit multiplier using IOLB and compared it with TMR and DWC designs; the power delay product is improved.

Keywords: DWC (Duplication with Comparison), TMR (Triple Modular Redundancy), SEU (Single Event Upset).

I. Introduction
Space research is truly interdisciplinary and has enabled innovations in multiple areas of science and engineering. Its focus is on making things work and on bringing the dreams of mankind to fruition through technologies that mankind can be proud of. It has been shown that electronics in space systems are sensitive to single event effects. The challenge now is the vulnerability of electronic circuits used in space systems to powerful space radiation [1]. Circuits used in space applications need to be protected from the effects of such radiation.

Full-custom hardware designs (Application Specific Integrated Circuits, ASICs) and semi-custom hardware designs (Field Programmable Gate Arrays, FPGAs) can both be used in radiation-prone environments. Though ASICs offer the best performance for space applications, the complexity and cost involved are very high. Also, the functionality of an ASIC is fixed and cannot be changed. ASICs are very good against SEUs (Single Event Upsets) [2]. Field Programmable Gate Arrays based on SRAM (SRAM-FPGAs) have a major role in several application areas due to their high density, low cost and on-field reconfiguration capability. Nevertheless, when they are used in high-reliability applications, and specifically in space applications, Single Event Effects (SEEs) have to be addressed [3]. Single Event Upsets (SEUs) are of particular concern, because in an SRAM-FPGA they alter the logical function of the circuit. FPGAs have the potential to outperform ASICs in space technology for the above reasons. It therefore becomes necessary to study the effects of space radiation on FPGAs and to devise fault-tolerant techniques [4].
In this paper, different fault-tolerant techniques for a 4-bit multiplier are compared. The mitigation techniques presented are Triple Modular Redundancy (TMR), Duplication with Comparison (DWC) with time redundancy, and Input-Output Logic Based design (IOLB), which eliminates the need for the existing dual redundancy.

The rest of the paper is organized as follows. Section II presents the mitigation techniques. Section III shows the simulation and synthesis results. Section IV compares IOLB against the other techniques. Finally, Section V concludes the paper.

Need For Fault Tolerant Technique
An SEU is a transient fault which may cause glitches, i.e., current pulses, to propagate through the circuit [5]. For electronic circuits to be reliable, they must be error-free. A fault-tolerant technique is one that allows a system to keep performing its specified tasks correctly in the presence of errors.
Fig.1 SEUs

Need for IOLB technique
The fault-tolerant techniques TMR and DWC require triple and dual hardware redundancy, whereas IOLB requires less hardware. It also consumes less power and is the least affected by faults. IOLB assesses whether the output is expected to change following a change in the input.

II. Mitigation Techniques
A. Triple Modular Redundancy
The most common technique used within FPGAs is triple modular redundancy (TMR) [6]. In TMR, the hardware is triplicated and a majority vote determines the output of the system. The majority voter outputs the logic value held by at least two of its inputs. If one of the modules becomes faulty, the two remaining fault-free modules mask the result of the faulty module when the majority vote is performed. The area is about three times that of the standard circuit. TMR does not correct the upsets; in the absence of extra refresh logic they accumulate, so scrubbing is done periodically to ensure that faults do not accumulate.
Fig.2 TMR with a Majority Voter Block
Fig.3 Majority Voter Circuit
If two out of the three modules are faulty, TMR provides a faulty output. Faithful functioning is expected only if at least two of the three modules are free from error.
Full TMR of a design requires at least three times the hardware, since three identical copies of the circuit are implemented. Additional logic is also required to implement the majority voters. In the worst case, TMR can require up to six times the area of the original circuit [7].
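The masking behaviour of the majority voter can be illustrated with a small behavioural model. This is only a sketch in Python, not the Verilog used in this work: the names `majority`, `tmr` and `multiply4` are hypothetical, and an upset is modelled simply as an XOR mask applied to one copy's output.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote over three module outputs."""
    return (a & b) | (b & c) | (a & c)


def tmr(module, x, faults=(0, 0, 0)):
    """Run three copies of `module` on the same input and vote.

    Each entry of `faults` is an XOR mask that models bit flips on the
    output of the corresponding copy (0 means the copy is fault-free).
    """
    outputs = [module(x) ^ mask for mask in faults]
    return majority(*outputs)


def multiply4(x):
    """Hypothetical 4-bit multiplier used as the protected module."""
    a, b = x
    return (a & 0xF) * (b & 0xF)


golden = multiply4((9, 13))
# One faulty copy: the two fault-free copies outvote it.
one_fault = tmr(multiply4, (9, 13), faults=(0, 0b100, 0))
# Two copies with the same flipped bit: the voter passes the wrong value.
two_faults = tmr(multiply4, (9, 13), faults=(0b100, 0b100, 0))
```

With a single faulty copy the voted result equals the fault-free product, while two faulty copies that agree on a wrong bit defeat the voter, which matches the limitation described above.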
B. Duplication with Comparison (DWC) with time redundancy
In this method, a dual redundancy scheme is employed along with time redundancy instead of TMR's triple redundancy [8]. Both modules execute the same computations in parallel and their results are compared. An error message is generated if the two results disagree. DWC on its own only detects faults and provides no fault tolerance.
Fig.4 DWC combined with Time Redundancy
Let A = (dr0, dr0 d, dr1, dr1 d). Say dr0 d is 1. This helps infer that block 0 is faulty and block 1 can be voted. The faulty block stays faulty until the next scrubbing cycle. The scrubbing rate should be high enough to ensure that multiple faults do not occur within one cycle [8]. In DWC, if there is a fault on an input line, both modules receive the same incorrect signal and produce the same erroneous result.

C. Input-Output Logic based design [9]
The Input-Output Logic Based (IOLB) method reduces the dual and triple hardware redundancy of DWC and TMR respectively. It does not require duplication or triplication of the combinational logic block to detect and correct an error. The IOLB technique exploits the relation between changes in the inputs and expected changes in the output.
IOLB logic assesses whether the output is expected to change following a change in the input(s). The changes in the inputs and the change in the output are used, along with appropriately designed logic, to generate the error signal, which can then be XOR-ed with the output signal to yield the error-free output.
Fig.5 IOLB design

Combinational logic block
Consider any combinational logic with X1, X2, X3, ..., Xn as inputs and Y as the output. Let Ac denote the change in a variable A: if Ac is 1, there is a change in A, and if Ac is 0, there is no change in A. We first obtain the changes in the input variables (X1c, X2c, X3c, ..., Xnc) by XOR-ing X1, X2, X3, ..., Xn with delayed versions of themselves. We also obtain Yc from Y. Then we consider all possible cases of input-output transitions. There would be 2^(2n+2) cases: given an input state there are 2^n possible transitions (since there are n change variables), the number of input states is itself 2^n (since there are n input variables), and the output Y and its change variable Yc contribute 2^2 states.
For the computation of changes in variables, we take the XOR of a variable with a delayed version of itself, which gives '1' if there has been a change. The value of the delay is random. While computing the change in output Yc from the output Y, we need to take care of the possibility of an error having occurred in the combinational logic. If we just perform the XOR of the delayed and current values of Y, we will get an erroneous Yc if the combinational logic is error-affected (since Y would be erroneous). Hence, if the error signal E is '1', we take the XOR with the NOT of the delayed output when computing the change in output signal Yc.
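The change-based detection and correction described above can be sketched as a behavioural model. This is a minimal Python illustration under stated assumptions, not the design of [9]: the class `IOLBModel` and helper `build_expected_change` are hypothetical names, the "appropriately designed logic" is stood in for by a lookup table over input states and input changes, the output is a single bit, and the circuit is assumed to start in a fault-free state.

```python
from itertools import product


def build_expected_change(f, n):
    """For every previous input state and every pattern of input changes,
    record whether the fault-free output is expected to toggle."""
    table = {}
    for prev in product((0, 1), repeat=n):       # 2^n input states
        for chg in product((0, 1), repeat=n):    # 2^n change patterns
            cur = tuple(p ^ c for p, c in zip(prev, chg))
            table[(prev, chg)] = f(*prev) ^ f(*cur)
    return table


class IOLBModel:
    """Behavioural model of change-based error detection and correction."""

    def __init__(self, f, n, x0):
        self.table = build_expected_change(f, n)
        self.x_prev = tuple(x0)   # delayed inputs
        self.y_prev = f(*x0)      # delayed raw output (assumed fault-free)
        self.e_prev = 0           # delayed error signal

    def step(self, x, y_observed):
        x = tuple(x)
        # Xc: XOR of each input with its delayed version.
        x_chg = tuple(a ^ b for a, b in zip(self.x_prev, x))
        expected = self.table[(self.x_prev, x_chg)]
        # If the delayed output was flagged as erroneous, use its NOT.
        y_ref = self.y_prev ^ self.e_prev
        y_chg = y_observed ^ y_ref               # Yc
        error = y_chg ^ expected                 # E
        corrected = y_observed ^ error           # error-free output
        self.x_prev, self.y_prev, self.e_prev = x, y_observed, error
        return corrected, error


and2 = lambda a, b: a & b
model = IOLBModel(and2, 2, (0, 0))
inputs = [(1, 0), (1, 1), (1, 1), (0, 1)]
upsets = [0, 1, 1, 0]   # 1 means the logic output is flipped at this step
results = [model.step(x, and2(*x) ^ u) for x, u in zip(inputs, upsets)]
```

In this run the error signal follows the injected upsets and the corrected output matches the fault-free AND of the inputs at every step. The lookup table has 2^n x 2^n entries for n inputs; in hardware this would be minimized logic rather than a table.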
III. Simulation & Synthesis Results
To demonstrate the advantages of the IOLB technique, we simulated a 4-bit IOLB multiplier along with 4-bit TMR and DWC multipliers. All the circuits have been implemented in Verilog using NC SIM (Native compiler), Cadence.
Fig.6 Simulation results of 4-bit IOLB Multiplier
Comparing the RTL Compiler results of the 4-bit IOLB multiplier with those of triple modular redundancy and duplication with comparison, it is evident that IOLB achieves the smallest PDP (power delay product), the fewest gates and the smallest area.

Table-1 RTL Compiler Results for IOLB versus TMR and DWC
TYPE   Gates   AREA (µm)   POWER (mW)   DELAY (ns)   PDP (pJ)
TMR    157     2.910       0.201        1.397        0.280
DWC    101     1.975       0.142        1.366        0.193
IOLB   46      0.898       0.063        1.268        0.080
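The PDP column of Table-1 is the product of the power and delay columns, which can be checked with a few lines of Python. The dictionary layout and the helper name `pdp_pj` are assumptions made for this sketch; the numbers are the ones reported in Table-1.

```python
# Power (mW) and delay (ns) as reported in Table-1.
results = {
    "TMR":  {"power_mw": 0.201, "delay_ns": 1.397},
    "DWC":  {"power_mw": 0.142, "delay_ns": 1.366},
    "IOLB": {"power_mw": 0.063, "delay_ns": 1.268},
}


def pdp_pj(power_mw, delay_ns):
    """Power delay product: mW * ns = 1e-3 W * 1e-9 s = 1e-12 J = 1 pJ."""
    return power_mw * delay_ns


pdp = {name: pdp_pj(**row) for name, row in results.items()}
for name, value in pdp.items():
    print(f"{name}: {value:.4f} pJ")
```

The recomputed values agree with the tabulated PDP figures to within the rounding of the reported power and delay.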
Fig.7 RTL Compiler Synthesis results of 4-bit IOLB Multiplier

IV. Comparison of TMR, DWC, IOLB
In Tables 2, 3 and 4, Mi indicates the ith module. If Mi is 0, the module is fault-free, and if Mi is 1, there is a fault in it. In TMR, a module is triplicated and faithful performance is expected only if at least two of the three instances are free from error.
Table-2 TMR Technique
M1   M2   M3   TMR output
0    0    0    Faithful
0    0    1    Faithful
0    1    0    Faithful
0    1    1    Not faithful
1    0    0    Faithful
1    0    1    Not faithful
1    1    0    Not faithful
1    1    1    Not faithful
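The scenario counting behind Table 2, and behind the DWC and IOLB tables of this section, can be reproduced by enumerating every fault pattern. The sketch below is a Python illustration only: the predicate names are hypothetical, every pattern is taken to be equally likely as in this analysis, and the IOLB predicate simply encodes the paper's statement that the IOLB logic corrects a faulty module.

```python
from fractions import Fraction
from itertools import product


def tmr_faithful(faults):
    """TMR is faithful when at least two of the three modules are fault-free."""
    return sum(faults) <= 1


def dwc_faithful(faults):
    """DWC-CED fails only when both modules are faulty."""
    return sum(faults) <= 1


def iolb_faithful(faults):
    """Paper's model: the IOLB logic corrects a fault in the single module."""
    return True


def probability(n_modules, faithful):
    """Fraction of equally likely fault patterns that give a faithful output."""
    cases = list(product((0, 1), repeat=n_modules))
    good = sum(1 for case in cases if faithful(case))
    return Fraction(good, len(cases))


summary = {
    "TMR": probability(3, tmr_faithful),
    "DWC-CED": probability(2, dwc_faithful),
    "IOLB": probability(1, iolb_faithful),
}
```

Under these assumptions the enumeration gives 1/2 for TMR, 3/4 for DWC-CED and 1 for IOLB.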
Table 2 identifies eight possible scenarios for TMR, each of which could occur with equal probability. As can be seen from Table 2, TMR works in 4 out of the 8 possible cases, because at least two of the three instances of the module must be fault-free for faithful functioning.

Table-3 DWC Technique
M1   M2   DWC output
0    0    Faithful
0    1    Faithful
1    0    Faithful
1    1    Not faithful

From Table-3, it can be inferred that DWC-CED works in 3 out of the 4 possible cases: it works in all cases except when both instances of the module (i.e., both M1 and M2) are affected.

Table-4 IOLB Technique
M1   IOLB output
0    Faithful
1    Faithful

We present a similar study in Table-4 for the IOLB technique. The IOLB technique uses only one instance of the main module. If there is a fault in the module, the IOLB logic corrects it, so a faithful output is obtained in both cases. We thus conclude from the above analysis that IOLB is the least likely to be affected when compared with TMR and DWC.

Hence, from the above analysis, the probability of faithful functioning of the three techniques is given in Table-5.

Table-5 Probability of faithful functioning
Technique   Probability of faithful functioning
TMR         1/2
DWC-CED     3/4
IOLB        1

V. CONCLUSION
Recently, research work has been done on fault-tolerant techniques to mitigate SEUs in FPGAs, because of the increasing use of SRAM-based FPGAs in space applications. Addressing the same problem, we have proposed a mitigation technique, IOLB, to implement fault-tolerant FPGA designs. The overall fault-tolerant technique is divided into two phases: error detection and error correction. The IOLB technique provides a mechanism to predict whether or not a change in the output is expected, as a function of changes in the inputs.

The Input-Output logic based technique (IOLB) provides an efficient design which uses the least number of gates, fewer than one-third of the gates used by TMR (46 versus 157 in Table-1). Experimental results show that the IOLB technique has lower area, consumes less power, and has higher reliability than DWC and the conventional TMR-based approach. It also has a lower PDP (Power Delay Product) than DWC and TMR.

VI. References
[1] Allan H. Johnston and Steven M. Guertin, “The Effects of Space Radiation on Linear Integrated Circuits,” Jet Propulsion Laboratory, California Institute of Technology.
[2] R. Koga, W. R. Crain, K. B. Crawford, S. J. Hansel, S. D. Pinkerton, and T. K. Tsubota, “The Impact of ASIC Devices on the SEU Vulnerability of Space-Borne Computers,” The Aerospace Corporation, El Segundo, CA 90245-4691, 30 January 1994.
[3] P.E. Dodd and L.W. Massengill, “Basic Mechanisms and Modeling of Single-Event Upset in Digital Microelectronics,” IEEE Trans. on Nucl. Sci., vol. 50, n. 3, pp. 583-602, June 2003.
[4] “Virtex 2.5 V Field Programmable Gate Arrays,” DS003, v2.5, Product Specification, 2 Apr. 2001, Xilinx; http://direct.xilinx.com/bvdocs/publications/ds003.pdf.
[5] D. Alexandrescu, L. Anghel, and M. Nicolaidis, “New Methods for Evaluating the Impact of Single Event Transients in VDSM ICs,” Proc. IEEE Int’l Symp. Defect and Fault Tolerance in VLSI Systems, IEEE CS Press, 2002, pp. 99-107.
[6] C. Carmichael, Triple Module Redundancy Design Techniques for Virtex® Series FPGA: Application Notes 197. San Jose, USA: Xilinx, 2000.
[7] M. J. Wirthlin, N. Rollins, M. Caffrey, and P. Graham, “Hardness by design techniques for field programmable gate arrays,” in Proc. 11th Annu. NASA Symp. VLSI Design, Coeur d’Alene, ID, May 2003, pp. WA11.1–WA11.6.
[8] Kastensmidt, Luigi, Carro and Reis, “Fault-Tolerance Techniques for SRAM based FPGAs,” Springer, 2006.
[9] Aditya Srinivas Timmaraju, Deshmukh Aniket Anand, Mohammed Amir Khan, Zafar Ali Khan, “Input-Output Logic based Fault-Tolerant Design Technique for SRAM-based FPGAs.”