Control-Flow Error Detection Using Combining Basic and Program-level Checking in Commodity Multi-core Architectures
Navid Khoshavi, Hamid R. Zarandi, Mohammad Maghsoudloo
Department of Computer Engineering and Information Technology
Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran
{navid.khoshavi, h_zarandi, m.maghsoudloo}@aut.ac.ir
Abstract: This paper presents a software-based technique that detects control-flow errors (CFEs) by combining basic-level control-flow checking with the inherent redundancy of commodity multi-core processors. The proposed detection technique consists of two phases: basic-level and program-level control-flow checking. Basic-level control-flow error detection is achieved by inserting additional instructions into the program at design time according to its control-flow graph. Previous research shows that modern superscalar microprocessors already contain significant amounts of redundancy. Program-level control-flow checking detects CFEs by leveraging this existing redundancy, so the cost of adding extra redundancy for fault tolerance is eliminated. To evaluate the proposed technique, three workloads (quick sort, matrix multiplication and linked list) were run on a multi-core processor, and a total of 6000 transient faults were injected into the processor. The results show the advantage of the proposed technique over conventional control-flow error detection techniques in terms of performance overhead, memory overhead and detection capability.
Keywords: control-flow checking; control-flow error detection; on-line testing; multi-core processor; inherent redundancy
I. INTRODUCTION
Improvements in CMOS technology have reduced transistor sizes and voltage levels. These reductions are coupled with increased sensitivity of microprocessors to transient faults. Transient faults, one of the major threats to modern microprocessors, are induced by energetic particle strikes, such as high-energy neutrons from cosmic rays and alpha particles from decaying radioactive impurities in packaging and interconnect materials. It has been shown that a considerable fraction of transient faults, between 33% and 77%, manifest as control-flow errors, for example errors in the program counter (PC), address circuits, or steering and control logic [15].
A Control-Flow Error (CFE) is said to have occurred if the processor executes an incorrect sequence of instructions [2]. CFEs are divided into two types: intra-node and inter-node. An intra-node CFE is an illegal movement within a basic block, which is a branch-free group of instructions terminated by a branch. An inter-node CFE is an illegal movement between two different basic blocks (nodes), or an illegal movement from a basic block to an unused region of memory, which is called a partition block [2].
Control-Flow Checking (CFC) is a key method for monitoring the flow of a program. It partitions the given program code into basic blocks and then adds redundant hardware or software components that check the correct execution flow of the program and consequently handle any possible type of CFE. Two main groups of CFC methods [2] have been designed for detecting different types of CFE in a program code: 1) hardware-based approaches, which rely on adding custom hardware, and 2) software-based approaches, which rely on purpose-built software to achieve error detection.
One typical hardware-based solution is to use external hardware such as a watchdog (checker) processor to monitor the activities carried out by the main processor. As soon as any misbehavior is observed, suitable error containment procedures are activated [1], [7]. Such redundant hardware, however, costs area and power.
Software-based techniques make use of redundant software and are based on assigning a signature to each basic block. Signatures are calculated at run time and then compared with the original ones, which were calculated at design time. Numerous software-based error detection techniques have been devised to detect processor errors [2], [5], [6], [8], [17]. Most previous software methods have focused on just one specific type of CFE (intra-node or inter-node), and their reported detection coverage and overheads rest on this assumption [10]. It has been shown that results obtained by ignoring either type of CFE are not credible [1]. Moreover, because the area efficiency and power efficiency of modern processors enable them to stay within their area, power and thermal budgets, adding redundant hardware for fault tolerance or error handling would undermine these benefits. Therefore, redundant software is a more convenient option than redundant hardware, especially for commodity systems.
Recently, the use of Commercial Off-The-Shelf (COTS) superscalar processors such as MIPS processors has increased in industrial, embedded, real-time and space applications [12]. The trend in manufacturing COTS processors is toward multi-core architectures. The key observation made by previous research [13] is that modern superscalar microprocessors already contain significant amounts of redundancy. This existing redundancy can be exploited to tolerate CFEs and removes the cost of adding extra redundancy for fault tolerance. Furthermore, since COTS processors are often used in special-purpose applications, the program can run on a particular core and idle cores can be scheduled for error detection.
To detect CFEs in the basic-level phase, the program code is partitioned into basic blocks and extra instructions are added to each basic block to check the execution control flow. The program-level CFE detection phase is a complementary technique that is executed at tolerable time intervals. This phase introduces a new concept, the macro basic block, which is a specific number of basic blocks that depends on the workload type. The proposed technique activates program-level CFE detection at the end of each macro basic block. To detect CFEs in this phase, two identical copies of the program, both equipped with the basic-level CFE detection technique, are executed simultaneously as independent threads on different cores. The states of the two cores are compared at each interval, and any mismatch indicates that a CFE has occurred. The proposed technique exploits the inherent redundancy of multi-core processors for error detection and guarantees detection of soft errors in instruction memory that may result in a CFE, independent of the CFE type. Moreover, our approach needs no knowledge of program structure, such as thread interactions or control/data dependencies between threads.
A dual-core processor based on MIPS is used to evaluate the proposed technique. The MIPS architecture was chosen for the cores because of its popularity and widespread use. Many devices are based on the MIPS architecture: Apple AirPort wireless LAN access points, Cisco Systems routers (7200 series), the Sony media server (Vaio VGX-X90P), video game consoles (Sony PlayStation 2 and PSP), SGI (Silicon Graphics, Inc.) systems, and Windows CE devices. Three well-known workloads, quick sort, matrix multiplication and linked list, are run on the MIPS-based dual-core processor. These workloads are assembled and simulated in the MARS assembler and simulator (ver. 4.1) [11]. To assess the proposed technique, simulation-based fault injection is employed and a total of 6000 transient faults are injected into the processor.
Figure 1. The control-flow graph example
The structure of this paper is as follows: Section II introduces the basic-level control-flow error detection technique. Program-level control-flow error detection is described in Section III. Experimental results, including performance and memory overheads, are reported in Section IV. Section V describes future work. Finally, Section VI concludes the paper.
II. BASIC-LEVEL CONTROL-FLOW ERROR DETECTION
A program P can be represented by a control-flow graph composed of a set of nodes V and a set of edges E, P={V, E}, where V={v1,…,vi,…,vn} and E={e1,…,ei,…,em}. Each node vi represents a basic block and each edge ei represents a branch bri,j from vi to vj. A branch bri,j is illegal if it is not included in E; such a branch indicates a CFE, like br2,1 in Fig. 1.
To detect CFEs at the basic level, a unique signature is first assigned to each basic block at design time. Next, a run-time signature is defined and continuously updated in the executed nodes. The pre-assigned and run-time signatures are then compared at the end of each basic block, and any mismatch indicates that a CFE has occurred. Fig. 2 shows the instructions added to the basic blocks to implement the method. If an illegal jump occurs before the added instructions, so that control is transferred illegally, the CFE can be detected by comparing the value stored in the run-time signature with the one calculated at design time. As shown in Fig. 2, CFE1 is an inter-node CFE and CFE2 is an intra-node CFE. While the basic-level CFE detection approach detects inter-node CFEs well, it is not powerful enough to detect intra-node CFEs. Program-level CFE detection is devised to tackle intra-node CFEs, as well as inter-node CFEs that are not detected in the basic-level CFE detection phase.
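As an illustration of the definitions above, the following Python sketch replays a sequence of executed basic blocks against a small CFG and reports the first failed signature check. The CFG, the signature values and the predecessor-based update rule are assumptions made for this example only; the exact instructions inserted by the proposed technique are those of Fig. 2 and are not reproduced here.

```python
# Hypothetical CFG P = {V, E}; node ids and signature values are illustrative
# and are not taken from Fig. 1.
EDGES = {(1, 2), (1, 3), (2, 4), (3, 4), (4, 1)}
NODES = sorted({v for edge in EDGES for v in edge})
SIGNATURE = {v: 0x10 + v for v in NODES}  # unique design-time signature per block


def is_legal(src, dst):
    """A branch br(src, dst) is legal only if it is an edge of the CFG."""
    return (src, dst) in EDGES


def first_detected_cfe(trace):
    """Replay a list of executed block ids and return the position of the
    first block whose end-of-block check fails, or None if no check fails.

    Assumed rule: on entry to a block, the run-time signature g is updated to
    the block's signature only if g currently holds the signature of a legal
    predecessor; at the end of the block, g is compared with the block's
    design-time signature.
    """
    g = SIGNATURE[trace[0]]  # run-time signature starts at the entry block
    for pos in range(1, len(trace)):
        cur = trace[pos]
        legal_pred_sigs = {SIGNATURE[s] for (s, d) in EDGES if d == cur}
        if g in legal_pred_sigs:      # entry instructions: update the signature
            g = SIGNATURE[cur]
        if g != SIGNATURE[cur]:       # exit instructions: compare signatures
            return pos
    return None


print(first_detected_cfe([1, 2, 4, 1, 3, 4]))  # legal path
print(first_detected_cfe([1, 2, 1]))           # contains the illegal branch br(2, 1)
```

Under this assumed rule, an illegal jump from a block back to its own start leaves the run-time signature unchanged and is not flagged, which is in line with the limited intra-node coverage of basic-level checking.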
III. PROGRAM-LEVEL CONTROL-FLOW ERROR DETECTION
Figure 2. CFE detection through added instructions

As mentioned above, although the trend in manufacturing COTS processors is toward multi-core architectures, the application is often run on a particular core. To detect CFEs that are not detected in the basic-level CFE detection phase, a redundant copy of the main program is specified. To achieve load balancing and reduce the probability of corrupting both the redundant and the primary program, the redundant program is executed on the other core of the processor. These two identical copies of the program are executed simultaneously as independent threads on different cores. The processor is interrupted at tolerable time intervals, and the Program_state_comparison function is called to compare the states of the programs. Any mismatch indicates that a CFE has occurred. Intra-node CFEs often cause data errors in the associated processor. These data errors can be detected when the program states of the two cores are compared during program-level checking.
Comparing the states of the programs is a time-consuming process. To reduce the performance overhead of this phase, we lengthen the comparison interval by defining a macro basic block, which is a specific number of basic blocks that depends on the workload type. Program-level CFE checking is activated at the end of each macro basic block. In addition, we limit the comparison of program state to intermediate calculations, results and some core registers. The comparison function is applied only to the main program to keep the system consistent.
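The program-level phase can be pictured with the following Python sketch. It is a simplified model under stated assumptions: the two copies are stepped in lockstep inside one loop rather than on two cores, each basic block is three additions on an accumulator, the macro basic block size of 4 is arbitrary, and an intra-node CFE is modeled as one skipped instruction. Only the name Program_state_comparison comes from the text; everything else is hypothetical.

```python
MACRO_BLOCK_SIZE = 4  # hypothetical number of basic blocks per macro basic block


def program_state_comparison(main_state, redundant_state):
    """Return True when the compared parts of both program states match."""
    return main_state == redundant_state


def execute_block(state, block_id, skipped=None):
    """Run one toy basic block of three instructions on the given state.
    `skipped` is the index of an instruction jumped over by an intra-node CFE."""
    for idx in range(3):
        if idx == skipped:
            continue
        state["acc"] += block_id * 10 + idx + 1
    state["blocks_done"] += 1


def run_dual(num_blocks, faulty_block=None):
    """Run the main and redundant copies and compare their states at the end
    of every macro basic block. Returns the index of the block at whose end a
    mismatch is detected, or None if no mismatch is seen."""
    main = {"acc": 0, "blocks_done": 0}
    redundant = {"acc": 0, "blocks_done": 0}
    for b in range(num_blocks):
        execute_block(main, b, skipped=1 if b == faulty_block else None)
        execute_block(redundant, b)
        if (b + 1) % MACRO_BLOCK_SIZE == 0:
            if not program_state_comparison(main, redundant):
                return b
    return None


print(run_dual(8))                  # fault-free run
print(run_dual(8, faulty_block=1))  # data error caught at the next comparison
```

A larger macro basic block means fewer comparisons and lower run-time cost, at the price of a longer delay between the fault and its detection.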
IV. EXPERIMENTAL RESULTS
To assess the effectiveness of the proposed approach, we used a dual-core MIPS processor [12] that is based on the miniMIPS processor [14] and is described in VHDL. Our experiments used three workload programs written in assembly language: Quick Sort (10 elements, QS), Matrix Multiplication (4x4, MM) and Linked List (20 elements, LL). These codes were generated in the MARS IDE (v 4.1) assembler and simulator [11]. We assumed that the faults affecting the dual-core MIPS processor are single event upsets (SEUs). The effects of SEUs are modeled with the bit-flip fault model, which modifies the content of a single storage cell during program execution. For the purpose of this paper we are mainly interested in faults affecting the execution control flow. For this reason, faults are injected into a subset of all possible locations, such as the program counter, to produce CFEs in the program. To measure the fault detection capability of our approach, about 6000 faults were injected into the processor in simulation during execution of the three workload programs. To assess the fault-tolerance properties of the proposed technique, the effects of the injected faults are examined. Table I reports the error detection coverage of the considered approaches, which is calculated as follows:

$$\text{Error detection coverage} = \frac{\#\,\text{detected CFEs}}{\#\,\text{detected CFEs} + \#\,\text{undetected CFEs}} \times 100 \qquad (1)$$

As shown in Table I, the detection coverage of the proposed technique is higher than that of the other techniques. Our approach detects both inter-node and intra-node CFEs and exploits the inherent redundancy of multi-core processors for error detection.

Figure 3. Comparison of performance overhead in different methods (bar chart of % performance overhead for CFCSS, YACCA, CEDA and the proposed technique on QS, MM, LL and their average)
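The bit-flip fault model used in the experiments can be sketched in a few lines of Python. This is only an illustration: the 32-bit register width, the helper names and the example program counter value are assumptions, and the actual injection in the paper is performed on the VHDL model of the processor.

```python
import random


def flip_bit(value, bit, width=32):
    """Invert one bit of a `width`-bit storage cell (single bit-flip model)."""
    if not 0 <= bit < width:
        raise ValueError("bit index outside the register width")
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def inject_pc_fault(pc, rng, width=32):
    """Flip one randomly chosen bit of the program counter.
    Returns the corrupted value and the index of the flipped bit."""
    bit = rng.randrange(width)
    return flip_bit(pc, bit, width), bit


rng = random.Random(0)   # fixed seed so a fault campaign can be repeated
pc = 0x00400000          # illustrative program counter value
faulty_pc, bit = inject_pc_fault(pc, rng)
print(hex(faulty_pc), bit)
```

Flipping the same bit a second time restores the original value, which makes it easy to check that each injected fault changes exactly one bit.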
Figure 4. Comparison of memory overhead in different methods (bar chart of % memory overhead for CFCSS, YACCA, CEDA and the proposed technique on QS, MM, LL and their average)

TABLE I. COMPARISON OF ERROR DETECTION COVERAGE IN DIFFERENT METHODS
Workloads | Original | Original | CFCSS [2] | CFCSS [2] | YACCA [3] | YACCA [3] | CEDA [6] | CEDA [6] | Proposed Technique | Proposed Technique
 | Wrong Results (%) | Correct Results (%) | UDCFEs (a) (%) | DCFEs (b) (%) | UDCFEs (a) (%) | DCFEs (b) (%) | UDCFEs (a) (%) | DCFEs (b) (%) | UDCFEs (a) (%) | DCFEs (b) (%)
QS | 92.4 | 7.6 | 15.9 | 84.1 | 5.5 | 94.5 | 4.9 | 95.1 | 1.8 | 98.2
MM | 93.7 | 6.3 | 14.7 | 85.3 | 3.2 | 96.8 | 3.1 | 96.9 | 1.1 | 98.9
LL | 91.3 | 8.7 | 16.7 | 83.3 | 6.3 | 93.7 | 6.0 | 94.0 | 2.2 | 97.8
Average | 92.5 | 7.5 | 15.8 | 84.2 | 5.0 | 95.0 | 4.7 | 95.3 | 1.7 | 98.3
a. Undetected CFEs. b. Detected CFEs.
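The coverage figures in Table I can be cross-checked with a short Python sketch. It assumes that error detection coverage is the share of detected CFEs among all CFEs, which matches the fact that the UDCFEs and DCFEs columns sum to 100% for every technique. The fault counts passed to detection_coverage are hypothetical; the percentages are copied from the DCFEs columns of the table.

```python
def detection_coverage(detected, undetected):
    """Coverage in percent, assuming coverage = detected / (detected + undetected)."""
    total = detected + undetected
    if total == 0:
        raise ValueError("no CFEs were observed")
    return 100.0 * detected / total


# DCFEs (%) per workload (QS, MM, LL), copied from Table I.
DCFE_PERCENT = {
    "CFCSS": [84.1, 85.3, 83.3],
    "YACCA": [94.5, 96.8, 93.7],
    "CEDA": [95.1, 96.9, 94.0],
    "Proposed": [98.2, 98.9, 97.8],
}

# Average over the three workloads, rounded to one decimal as in the table.
AVERAGE = {name: round(sum(vals) / len(vals), 1) for name, vals in DCFE_PERCENT.items()}

print(AVERAGE)
print(detection_coverage(982, 18))  # hypothetical counts: 982 detected, 18 missed
```

The recomputed averages agree with the Average row of Table I.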
The performance overhead is the performance degradation caused by applying a method; Fig. 3 compares the performance overhead of the different methods. Both the detection coverage and the overheads strongly affect the effectiveness of a method. The performance overhead of the proposed technique is about 39.3%, which is lower than that of CFE detection techniques such as CFCSS and YACCA because of fewer checking instructions and concurrent execution on different cores. For all the CFE detection techniques, the performance overhead of quick sort is higher than that of the other two workloads. This may be due to its recursive structure and consecutive call and return instructions.
Fig. 4 compares the memory overhead percentages of the programs after the methods are applied. The memory overhead consists of the instructions added at the beginning and at the end of the basic blocks and the instructions added to implement the Program_state_comparison function. Compared with the other workloads, the linked list program has many basic blocks that each contain few instructions. Consequently, for all the CFE detection techniques, the memory overhead of linked list is higher than that of the other two workloads.
V. FUTURE WORK
Although the technique works well for small, simple programs, some programs are likely to encounter situations where main memory (or shared caches, buses, etc.) is required by both copies of the program at once. The technique needs to be extended to cope with such situations, either by resynchronizing the programs or by removing the need for them to run exactly in parallel.
VI. CONCLUSIONS
In this paper, a software technique was proposed that detects CFEs using the inherent redundancy of multi-core architectures. The proposed technique guarantees detection of soft errors in instruction memory that may result in a CFE, independent of the CFE type. Detecting both intra-node and inter-node control-flow errors is the primary goal, in order to achieve high detection coverage together with a significant reduction in the imposed overheads. Fault injection experiments on a dual-core MIPS processor show that the proposed technique detects CFEs in about 98.3% of the cases on average. The low performance overhead and high error detection coverage make the suggested solution a good candidate for use in multi-core architectures.
REFERENCES
[1] M. Fazeli, R. Farivar and S. G. Miremadi, "Error Detection Enhancement in PowerPC Architecture-based Embedded Processors," Journal of Electronic Testing: Theory and Applications, vol. 24, pp. 21-33, 2008.
[2] N. Oh, P. Shirvani and E. J. McCluskey, "Control-Flow Checking by Software Signatures," IEEE Transactions on Reliability, vol. 51, no. 2, pp. 111-122, 2002.
[3] O. Goloubeva, M. Rebaudengo, M. Sonza Reorda and M. Violante, "Soft-error Detection Using Control Flow Assertions," 18th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems, pp. 57-62, 2003.
[4] H. R. Zarandi, M. Maghsoudloo and N. Khoshavi, "Two Efficient Software Techniques to Detect and Correct Control-flow Errors," 16th IEEE Pacific Rim International Symposium on Dependable Computing, pp. 141-148, 2010.
[5] R. Venkatasubramanian, J. P. Hayes and B. T. Murray, "Low-cost On-line Fault Detection Using Control Flow Assertions," 9th IEEE International On-Line Testing Symposium, pp. 137-143, 2003.
[6] R. Vemu and J. A. Abraham, "CEDA: Control-flow Error Detection through Assertions," 12th IEEE International On-Line Testing Symposium, pp. 151-158, July 2006.
[7] A. Rajabzadeh and S. G. Miremadi, "CFCET: A Hardware-Based Control Flow Checking Technique in COTS Processors Using Execution Tracing," Elsevier Journal of Microelectronics and Reliability, vol. 46, pp. 959-972, 2006.
[8] Y. Sedaghat, S. G. Miremadi and M. Fazeli, "A Software-Based Error Detection Technique Using Encoded Signature," 21st IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems, pp. 389-400, 2006.
[9] P. Bernardi, L. V. Bolzani, M. Rebaudengo, M. Sonza Reorda, F. Vargas and M. Violante, "On-line Detection of Control-Flow Errors in SoCs by means of an Infrastructure IP core," 35th International Conference on Dependable Systems and Networks, pp. 50-58, 2005.
[10] R. Vemu, S. Gurumurthy and J. A. Abraham, "ACCE: Automatic Correction of Control-flow Errors," IEEE International Test Conference, pp. 1-10, 2007.
[11] K. Vollmar and P. Sanderson, "A MIPS Assembly Language Simulator Designed for Education," Journal of Computing Sciences in Colleges, pp. 95-101.
[12] I. Faraji, M. Didehban and H. R. Zarandi, "Analysis of Transient Faults on a MIPS-based Dual-Core Processor," International Conference on Availability, Reliability and Security, 2010.
[13] D. Gizopoulos, M. Psarakis, S. V. Adve, P. Ramachandran, S. K. Hari, D. Sorin, A. Meixner, A. Biswas and X. Vera, "Architectures for Online Error Detection and Recovery in Multicore Processors," Proceedings of Design, Automation and Test in Europe, 2011.
[14] OpenCores, www.opencores.org/project/minimips.
[15] J. Ohlsson, M. Rimen and U. Gunneflo, "A Study of the Effects of Transient Fault Injection Into a 32-bit RISC with Built-in Watchdog," 22nd International Symposium on Fault Tolerant Computing, pp. 316-325, 1992.
[16] C. Bolchini, A. Miele, M. Rebaudengo, F. Salice, D. Sciuto, L. Sterpone and M. Violante, "Software and Hardware Techniques for SEU Detection in IP Processors," Journal of Electronic Testing: Theory and Applications, vol. 24, no. 1-3, pp. 35-44, 2008.
[17] R. Vemu and J. A. Abraham, "Budget-dependent Control-flow Error Detection," 14th IEEE International On-Line Testing Symposium, pp. 73-78, 2008.