Improving Test Coverage
Hybrid-SBST Methodology for Efficient Testing of Processor Cores Nektarios Kranitis, Andreas Merentitis, George Theodorou, and Antonis Paschalis University of Athens
Dimitris Gizopoulos University of Piraeus
Because of their complexity and various architectural features, today's commercial processor cores limit the effectiveness of a single test methodology. However, self-test programs based on deterministic software-based self-test (SBST) methodologies, combined with verification-based self-test programs and supplemented by directed random test-program generation, prove very effective as a test strategy.

Aggressive semiconductor fabrication processes have resulted in state-of-the-art processor designs and SoCs built around processor cores that contain more than a billion transistors and operate at gigahertz frequencies. Deep-submicron geometries and complex speed-enhancing mechanisms produce excellent performance, but serious testability challenges arise. New types of defects require the deployment of at-speed tests to achieve high test quality. Traditional processor at-speed manufacturing tests based on functional ATE cannot be considered an economically viable scheme. Several software-based self-test (SBST) methodologies have been proposed as an effective alternative or supplement for the manufacturing test of microprocessors and embedded processors in SoCs (see the "Related work" sidebar). SBST is a nonintrusive approach that embeds a "software tester" in the form of a self-test program in a processor's on-chip memory. The ATE loads the program into the processor's on-chip memory. During test application, the processor executes this self-test program at its normal operational frequency, thereby achieving at-speed testing. Test responses collected from the processor are stored in the on-chip data memory. Finally, the ATE unloads the test responses from the on-chip memory.

Modern microprocessors integrate large caches on the same die, which enables self-test program execution from the on-chip cache, provided there is a cache loader mechanism to load the test program and unload the test response. Thus, by changing the external ATE's role from actual test application to that of an interface with the on-chip memory before and after the test, SBST achieves the goal of at-speed testing using low-speed, low-cost external ATE. SBST is a scalable, portable, and reusable methodology for high-quality testing that incurs virtually zero performance, power, and circuit area overhead. In addition, because the vehicles for applying SBST programs are existing processor instructions, at-speed testing is feasible without the risk of thermal damage due to excessive signal activity in special test modes. Furthermore, by using the processor's instruction set architecture (ISA) and complying with all the restrictions enforced by both the ISA and the designers' decisions, SBST avoids overtesting for faults that don't appear during normal circuit operation and saves valuable yield. Despite these significant advantages, however, most present forms of SBST represent a semi-automated approach that requires test engineer expertise for self-test program development. Modern commercial processors are characterized by a high level of complexity, and their architectural features introduce test challenges that no single SBST methodology can effectively address. Additionally, SBST methodologies have their individual advantages and disadvantages. For example, random test-program generation (RTPG) has a relatively low test development cost but results in
0740-7475/08/$25.00 © 2008 IEEE. Copublished by the IEEE CS and the IEEE CASS. IEEE Design & Test of Computers
many execution cycles and cannot address random-pattern-resistant faults. On the other hand, high-level RTL test program generation achieves high fault coverage with few test cycles but requires expertise from the test engineer for test pattern generation and deployment. So, only by effectively combining the available SBST strategies can we capitalize on the advantages of each to achieve high test quality in a reasonable number of cycles and without prohibitive test development cost. In this article, we introduce a hybrid-SBST methodology for efficient testing of commercial processor cores that effectively uses the advantages of various SBST methodologies. Self-test programs based on deterministic structural SBST methodologies (using high-level test development and gate-level-constrained ATPG test development) combined with verification-based self-test code development and directed RTPG constitute a very effective H-SBST test strategy. The proposed methodology applies directed RTPG as a supplement to improve overall fault coverage results after component-based self-test code development has been performed. An advantage of this strategy is that it avoids the use of large RTPG programs that result in an excessive number of cycles and prohibitive test application time during manufacturing test. We have developed a test program following these principles and applied it to a commercial, fully pipelined benchmark that has been used for industrial applications (OpenRISC 1200). Experimental results showing test coverage of more than 92% demonstrate the effectiveness of the proposed methodology.
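The two-stage strategy just described (component-based deterministic tests first, directed RTPG last) can be sketched as a simple loop. This is an illustrative model only; the function name, the per-component coverage gains, and the stopping numbers are assumptions, not figures from the article.

```python
# Toy model of the H-SBST top-level flow: deterministic per-component
# self-test development in priority order, then directed RTPG as a final
# supplement until the coverage goal is met or gains saturate.
def hybrid_sbst(components, coverage_goal=0.92):
    """components: list of (name, coverage_gain) sorted by test priority."""
    program, coverage = [], 0.0
    # Stage 1: component-based deterministic self-test code development.
    for name, gain in components:
        program.append(f"deterministic test for {name}")
        coverage = min(1.0, coverage + gain)
        if coverage >= coverage_goal:
            return program, coverage
    # Stage 2: directed RTPG supplements the remaining undetected faults.
    for gain in [0.02, 0.01, 0.0]:  # diminishing returns until saturation
        if coverage >= coverage_goal or gain == 0.0:  # goal reached or saturated
            break
        program.append("directed RTPG segment")
        coverage = min(1.0, coverage + gain)
    return program, coverage

progs, cov = hybrid_sbst([("register file", 0.20), ("MAC unit", 0.40),
                          ("ALU", 0.12), ("control unit", 0.18)])
```

Because RTPG runs last, the test program stays small: random segments are appended only while they still raise coverage toward the goal.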
H-SBST methodology

A single test methodology is insufficient for today's processors. Moreover, different processor component architectures and implementations, used as IP cores to reduce development time and achieve short time to market, have specific test requirements. In this article, we adopt information extraction (phase A in Figure 1) and component classification and test priority (phase B) from previous research,1,2 as these are both keys to low-cost test development. However, to address the challenges of modern commercial processor cores, we enhance the self-test development (phase C). We combine self-test programs based on deterministic structural SBST methodologies (using high-level test development1,3 and gate-level-constrained ATPG test development4,5) with verification-based self-test code development and
January/February 2008
directed RTPG. Together, these methodologies constitute a very effective H-SBST test strategy. The overall methodology, shown in Figure 1, effectively extends the methodology presented by Kranitis et al.,1 so that it provides high test quality in nonregular functional components and control-oriented components.
Information extraction, component classification, and test priority

Phase A begins by gathering all available information provided by the programmers' manuals, including ISA constraints. Test engineers can also explore software development suites (compilers, instruction set simulators) to retrieve information and constraints. Even if such characteristics aren't described in detail, careful examination of the ISA can yield valuable information. For example, the existence of multiply-accumulate (MAC) instructions implies the existence of the corresponding functional component—in this case, a multiplier-accumulator. Moreover, the length of certain fields in the instructions indicates the lengths of buses and registers in the processor. Even more detailed knowledge can be gained directly from the processor description at the RTL, if this is available to test engineers. Finally, CAD tools can provide valuable reports; for example, most synthesis tools can report specific IP building blocks used during synthesis (such as DesignWare Foundation Library components) as well as their architecture. After phase A, the set M of processor components (also called modules under test, or MUTs, in related work) is identified. During phase B, processor components C ∈ M are classified as functional or control (both types can be either visible or hidden to the assembly programmer). Data path functional components are further classified as data or address.2 Processor components are also ranked according to their test priority. Data path functional components implementing data-related logic should be assigned a higher test priority because of their better controllability and observability, and also because the components in this category contain a considerable number of processor faults.

Proposed H-SBST program development

Following phases A and B, we target the processor component C ∈ M that is ranked highest in phase B. We select a suitable test strategy for each component C.
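The classification and ranking of phase B can be modeled compactly. A minimal sketch, in which the component records, the priority scores, and the fault-count figures are illustrative assumptions (the fault counts echo the case-study table later in the article):

```python
# Phase B model: classify each component in the set M as functional or
# control (functional ones further as data or address path) and rank by
# test priority, breaking ties by estimated fault count.
def classify_and_rank(components):
    """components: dicts with 'name', 'kind' ('functional'|'control'),
    'path' ('data'|'address'|None), and a fault-count estimate."""
    def priority(c):
        # Data path functional components get top priority: better
        # controllability/observability and a large share of the faults.
        if c["kind"] == "functional" and c["path"] == "data":
            score = 2
        elif c["kind"] == "functional":
            score = 1
        else:
            score = 0
        return (score, c["faults"])
    return sorted(components, key=priority, reverse=True)

ranked = classify_and_rank([
    {"name": "ALU",           "kind": "functional", "path": "data",    "faults": 23653},
    {"name": "PC generation", "kind": "functional", "path": "address", "faults": 7308},
    {"name": "control unit",  "kind": "control",    "path": None,      "faults": 5537},
    {"name": "MAC unit",      "kind": "functional", "path": "data",    "faults": 78076},
])
```

With these inputs, the data path functional components (MAC unit, then ALU) come first, and the control unit is deferred to the end, mirroring the test-priority argument above.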
If C is a functional component (see Figure 1), then we follow a deterministic self-test code development approach. If the functional component is character-
Related work

The advantages of software-based self-test (SBST) make it a very attractive testing approach, so it's not surprising that many SBST methodologies have been proposed; a comprehensive survey is available elsewhere.1 SBST methodologies fall mainly into two categories. The first includes SBST methodologies that have a high level of abstraction and are functional in nature.2-4 The second category includes those that are structural in nature, with structural, fault-driven test development.5-8
Functional SBST methodologies

A common characteristic of functional SBST methodologies is their almost exclusive use of randomized instructions or operands. Shen and Abraham proposed a functional test methodology that generates a random sequence of instructions, enumerating all the combinations of the operations and systematically selecting operands.2 They applied the methodology to a GL85 processor. Batcher and Papachristou proposed a self-test methodology that combines the execution of microprocessor instructions with a small amount of on-chip hardware used for creating random instruction sequences.3 They applied the methodology to a DLX processor. Parvathala et al. proposed an automated functional self-test methodology called functional random instruction testing at speed (FRITS) that is based on the composition of random instruction sequences with pseudorandom data generated by software linear-feedback shift registers.4 They used the on-chip cache for applying the tests. Constraints were extracted and built into the generator to guarantee generation of valid instruction sequences, thus ensuring that no cache misses or bus access cycles were produced. They applied the methodology to the Intel Pentium 4 processor.
Structural SBST methodologies

Corno et al. proposed an automated test development methodology based on evolutionary-theory techniques.5 They demonstrated the methodology on an 8051 8-bit microcontroller. Chen and Dey proposed an SBST methodology in which pseudorandom pattern sequences are developed for each processor component, taking into consideration manually extracted constraints imposed by the processor's instruction set. They demonstrated the methodology on a simple processor called Parwan. Chen et al. proposed a methodology that extends their previous work by
automating the complex constraint extraction phase and emphasizing ATPG-based rather than pseudorandom test development.6 To derive a model of the logic surrounding the module under test (MUT), they applied statistical regression analysis to RTL simulation results obtained using manually coded instruction templates. The derived model is converted into a virtual constraint circuit (VCC), and ATPG is applied iteratively to the VCC together with the MUT. Chen et al. applied this methodology to the combinational logic in the execution stage of a processor from Tensilica (Xtensa). Kranitis et al. proposed a high-level structural SBST methodology based on the ISA and the processor RTL description.7 They showed that small deterministic test sets, deployed through compact test routines, provide significant improvement when applied to the same simple processor—Parwan—used in Chen and Dey's work. Kranitis et al. addressed low-cost SBST challenges by defining different test priorities for processor components, showing that high-level self-test code development based on the processor's ISA and RTL description can lead to low test cost without sacrificing high fault coverage.8 This development is independent of the gate-level implementation. They applied this methodology to two processors: Plasma, with a simple three-stage pipeline, and a MIPS R3000 application-specific instruction set processor (ASIP) with a five-stage pipeline, the latter designed using the ASIP Meister design environment. Paschalis and Gizopoulos proposed a classification and test priority scheme more fine-grained than that used by Kranitis et al.8 and identified the most effective self-test routines suitable for online periodic testing.9 Psarakis et al. identified testability hot spots in processor pipeline logic and proposed an SBST methodology that enhances existing SBST programs8 so as to target the pipeline logic more effectively.10 They applied this methodology to the miniMIPS and OpenRISC 1200 processor cores.
Recent SBST methodologies

In recent SBST work, Gurumurthy et al. proposed a novel technique that works at the RTL and maps module-level precomputed test patterns to instruction sequences using formal verification techniques.11 The technique uses Boolean difference formulation, linear temporal logic, and bounded model checking to map module-level test responses into instruction sequences.
After the application of 36,750 random instructions, the technique was applied to the OpenRISC 1200 processor to target remaining hard-to-detect faults. It increased the processor's fault coverage from 68% to 82%. Wen et al. proposed an SBST methodology that uses random test-program generation (RTPG) as a baseline, with deterministic target test-program generation (TTPG) as a supplement.12 Simulation-based TTPG, performed much like the methodology described by Chen et al.,6 uses arithmetic and Boolean learning techniques instead of statistical regression to develop learned models for the logic surrounding the MUT. Wen et al. applied the methodology to the controller and the arithmetic logic unit (ALU) of the OpenRISC 1200 processor. When RTPG is applied to the controller component, fault coverage saturates at around 62.14%; on the other hand, TTPG generates 134 valid test patterns and detects 4,967 faults, including all the faults that RTPG can detect. The result is controller fault coverage of 69.39%. For the ALU, after the application of 100 K RTPG test patterns, TTPG is applied; together they achieve ALU fault coverage of 94.94%. In both methodologies targeting commercial RISC processors such as the OpenRISC 1200,11,12 however, the use of RTPG as a baseline results in a large number of test responses that need to be stored and an excessive number of test cycles.
References

1. D. Gizopoulos, A. Paschalis, and Y. Zorian, Embedded Processor-Based Self-Test, Frontiers in Electronic Testing, vol. 28, Kluwer Academic, 2004.
2. J. Shen and J. Abraham, "Native Mode Functional Test Generation for Processors with Applications to Self-Test and Design Validation," Proc. Int'l Test Conf. (ITC 98), IEEE CS Press, 1998, pp. 990-999.
3. K. Batcher and C. Papachristou, "Instruction Randomization Self Test for Processor Cores," Proc. 17th IEEE VLSI Test Symp. (VTS 99), IEEE Press, 1999, pp. 34-40.
4. P. Parvathala, K. Maneparambil, and W. Lindsay, "FRITS—A Microprocessor Functional BIST Method," Proc. Int'l Test Conf. (ITC 02), IEEE CS Press, 2002, pp. 590-598.
5. F. Corno et al., "Fully Automatic Test Program Generation for Microprocessor Cores," Proc. Design, Automation and Test in Europe (DATE 03), IEEE CS Press, 2003, pp. 1006-1011.
6. L. Chen et al., "A Scalable Software-Based Self-Test Methodology for Programmable Processors," Proc. 40th Design Automation Conf. (DAC 03), ACM Press, 2003, pp. 548-553.
7. N. Kranitis et al., "Instruction-Based Self-Testing of Processor Cores," Proc. 20th IEEE VLSI Test Symp. (VTS 02), IEEE CS Press, 2002, pp. 223-228.
8. N. Kranitis et al., "Software-Based Self-Testing of Embedded Processors," IEEE Trans. Computers, vol. 54, no. 4, Apr. 2005, pp. 461-475.
9. A. Paschalis and D. Gizopoulos, "Effective Software-Based Self-Test Strategies for On-Line Periodic Testing of Embedded Processors," IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 1, Jan. 2005, pp. 88-99.
10. M. Psarakis et al., "Systematic Software-Based Self-Test for Pipelined Processors," Proc. 43rd Design Automation Conf. (DAC 06), ACM Press, 2006, pp. 393-398.
11. S. Gurumurthy, S. Vasudevan, and J.A. Abraham, "Automatic Generation of Instruction Sequences Targeting Hard-to-Detect Structural Faults in a Processor," Proc. IEEE Int'l Test Conf. (ITC 06), IEEE CS Press, 2006, pp. 1-9.
12. C.H.P. Wen et al., "Simulation-Based Target Test Generation Techniques for Improving the Robustness of a Software-Based Self-Test Methodology," Proc. IEEE Int'l Test Conf. (ITC 05), IEEE CS Press, 2005, pp. 936-945.

ized by inherent regularity (like several computational, interconnect, or storage components), we can reuse our test library.1 The test library contains test algorithms in pseudocode for generic functional components, tailored to the ISA of the processor under test. These algorithms generate small, precomputed test sets, which provide high fault coverage for many types and architectures of processor components independently of the gate-level implementation. The test library routines exploit the possible test vectors' regularity and algorithmic nature, resulting in efficient, compact code loops. Processor functional components with
inherent regularity include register files, multipliers and shifters, and various adder and comparator architectures. Multiplexers and all types of registers fall into this category as well. For example, if the CAD synthesis tool reports a component (multiplier, shifter, and so on) with an architecture supported by our test library, we reuse the self-test algorithm in the self-test program. Assuming that the functional component’s exact architecture is not known, we then apply all of our test library’s available test algorithms through an iterative trial-and-error approach. Significantly, components targeted by this method needn’t be visible at
the RTL. For example, forwarding multiplexers are implied in pipelined processors, and test library self-test code targeting them is added even if the corresponding components are hidden. If the approach just described doesn't provide acceptable fault coverage (because either a suitable self-test routine doesn't exist in our test library or a functional component is not inherently regular), then we follow a different strategy. We perform self-test development based on constrained ATPG, similar to the method described elsewhere for functional components.4,5 Moreover, when fault coverage results obtained with a test library self-test program don't satisfy the requirements, constrained ATPG can be applied only to the undetected faults. Constrained ATPG involves several steps. First, we extract constraints imposed on the component by its surrounding logic. Next, we describe these constraints as virtual constraint circuits (VCCs). The VCCs enforce microarchitectural and ISA-imposed constraints on the combinational component under test, providing constraint information in the form of logic circuits. Using constrained ATPG for components that have microarchitectural and ISA constraints has the added benefit that the ATPG tool itself determines the upper bound of structurally testable faults, thus defining the lower bound of the functionally untestable faults. However, constrained ATPG is not very efficient in terms of code size, since each test pattern generated by the ATPG tool usually requires numerous instructions in the form of a test instruction template in order to be applied and to propagate the results to the data memory.

Figure 1. Overall hybrid software-based self-test (H-SBST) methodology. (FC: functional coverage; ISA: instruction set architecture; MUT: module under test.)
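The pattern-to-instruction mapping just described can be sketched in a few lines. This is a hedged illustration, not the article's actual template generator: the generic load/store mnemonics, register names (r1-r3), and operand widths are invented for the example.

```python
# Sketch of mapping one constrained-ATPG test pattern to an instruction
# sequence: load the operands, apply the test instruction to the MUT, and
# propagate the response to data memory for later unloading by the ATE.
def map_pattern_to_template(mnemonic, op_a, op_b, result_addr):
    """Return assembly-like lines applying one ATPG pattern at processor level."""
    return [
        f"load  r1, #{op_a:#010x}",      # operand A taken from the ATPG pattern
        f"load  r2, #{op_b:#010x}",      # operand B taken from the ATPG pattern
        f"{mnemonic} r3, r1, r2",        # apply the test instruction to the MUT
        f"store r3, {result_addr:#06x}", # propagate the response to data memory
    ]

seq = map_pattern_to_template("add", 0xA5A5A5A5, 0x5A5A5A5A, 0x1000)
```

Because every ATPG pattern expands to several instructions like these, code size grows quickly, which is exactly the efficiency drawback noted above.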
If C is a control component, we follow a coverage-driven, verification-based self-test code development approach. In this article, we consider a more comprehensive strategy than that described by Kranitis et al.1 Verification-based functional test routines aimed at maximizing functional-coverage metrics are generated in an iterative process that is repeated until high coverage levels are achieved in all functional metrics. The increase in these metrics serves as a guide for improving fault coverage in control components. Moreover, increasing the functional-coverage metrics in control-oriented components results in increased coverage for the components that are the destination of the control signals. In this article, we consider the following metrics, usually supported by industry-standard hardware description language (HDL) simulators:

- Statement coverage is a measure of the number of executable statements within the model that have been executed during a simulation run. Executable statements are those that perform a definite action during runtime and do not include components, compile directives, or declarations.
- Branch coverage counts the execution of expressions and case statements that affect the control flow of HDL execution.
- Condition coverage is an extension of branch coverage that breaks down branch conditions into the elements that make the result true or false.
- Expression coverage is the same as condition coverage, but instead of covering branch decisions, it covers concurrent signal assignments.
- Toggle coverage provides the ability to count and collect changes of state on specified nodes, such as Verilog nets and registers and VHDL signals.
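The first two metrics in the list above can be illustrated with a toy trace-based computation. The statement identifiers, the branch model (counting each taken/not-taken decision), and the traces are invented for the sketch; real HDL simulators compute these metrics internally.

```python
# Toy statement and branch coverage over an execution trace.
def statement_coverage(executable_stmts, trace):
    """Fraction of executable statements hit at least once in the trace."""
    return len(executable_stmts & set(trace)) / len(executable_stmts)

def branch_decision_coverage(branches, trace):
    """Fraction of branch decisions observed; each branch has two
    possible decisions, taken (True) and not taken (False)."""
    observed = set(trace)  # (branch, decision) pairs seen during simulation
    return len(observed) / (2 * len(branches))

stmt_cov = statement_coverage({"s1", "s2", "s3", "s4"},
                              ["s1", "s2", "s2", "s4"])       # s3 never hit
br_cov = branch_decision_coverage({"b1", "b2"},
                                  [("b1", True), ("b1", False),
                                   ("b2", True)])             # b2 never not-taken
```

The uncovered statement (s3) and the missing decision (b2 not taken) are exactly the spots where uncontrolled stuck-at faults could hide, which is why the methodology drives these metrics toward 100%.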
Statement coverage verifies that all statements of the RTL code are triggered. This is very important because it lets us reduce uncontrolled stuck-at faults that might result from the test program’s missing certain statements in the RTL code. Specifically, because the synthesis tool translates every statement with a definite action into proper circuit logic, it’s best to cover, or hit, all the statements and thereby reduce uncontrolled faults. Branch coverage acts similarly but covers branch decisions instead of RTL statements. The functional metrics record the number of times a statement was activated or a branch decision was
made, which lets us enhance our test routines to achieve more activations. For example, certain branch decisions seldom occur, resulting in some uncontrolled faults; nevertheless, code targeting those decisions can be added even if the corresponding statements have already been hit at least once. As for condition and expression coverage, we use these metrics when statement and branch coverage are low, because they let us detect the cause of the miss. If a complicated branch condition is never triggered, condition coverage can be used to find which element of the condition is responsible. Finally, toggle coverage determines whether a register gets both 0→1 and 1→0 transitions, thus triggering a possible stuck-at-0 or stuck-at-1 fault, respectively. After we detect the particular signal or signals, we identify their source in the RTL code, and we generate code setting them to appropriate values to hit the targeted condition.
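The toggle-coverage check just described amounts to looking for both transition directions on every monitored node. A minimal sketch, with node names and the sampled value trace invented for the example:

```python
# Toggle coverage: a node is covered once it has made both a 0->1 and a
# 1->0 transition, which is what exposes possible stuck-at-0/1 faults.
def toggle_coverage(trace):
    """trace: dict mapping node name -> list of sampled 0/1 values.
    Returns the set of nodes that toggled in both directions."""
    covered = set()
    for node, values in trace.items():
        rises = any(a == 0 and b == 1 for a, b in zip(values, values[1:]))
        falls = any(a == 1 and b == 0 for a, b in zip(values, values[1:]))
        if rises and falls:
            covered.add(node)
    return covered

cov = toggle_coverage({
    "carry_flag": [0, 1, 1, 0],  # rises and falls: fully covered
    "stuck_node": [0, 0, 0, 0],  # never toggles: a stuck-at-0 fault would hide here
})
```

Nodes missing from the returned set are the ones whose RTL sources must be traced so that code driving them to the opposite value can be added.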
The basic idea of SBST makes this necessary: unlike in verification, the internal state of the processor is not directly observable, so propagation to memory is imperative. Self-test program development proceeds with the component C ∈ M that has the highest test priority. After each iteration, processor-level fault simulation evaluates the total fault coverage and estimates collateral coverage. The iterative process repeats on the remaining components until acceptable processor-level fault coverage is achieved. If all components in the set M have been targeted and the processor-level fault coverage is not acceptable, we proceed to directed random self-test code development for the remaining undetected faults. An
architectural validation suite automatically generates directed random self-test programs, and these programs employ instruction templates using pseudorandom operands. Furthermore, tunable parameters let a verification or test engineer tune the self-test program using defined biases (biases on instructions, instruction fields, addresses, registers, and so on) to control the degree of randomness. RTPG can increase total fault coverage when component-based self-test code development cannot proceed further. Thus, an important aspect of the proposed methodology is that it can test a processor core in its entirety by adopting an effective test approach for each processor component. Even the problem of difficult-to-test control-oriented components can be addressed through verification-based functional test routines that maximize functional-coverage metrics. Finally, the use of inherent regularities of functional components results in increased efficiency and reduced test application cost compared with other SBST methodologies. In the proposed H-SBST flow, we apply directed RTPG as the final step, supplementing the component-based self-test code development and thereby improving overall fault coverage results. The proposed methodology generates relatively few test instructions before fault coverage reaches the goal or saturation occurs. So, another advantage of this strategy is that it avoids the application of RTPG as a first step, because RTPG can result in large test programs and increased test application time during manufacturing test. Moreover, applying RTPG as the final step lets test engineers make an easy trade-off between instructions and cycles on the one hand and fault coverage on the other, depending on the specific test quality requirements.
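A biased random generator of the kind described above can be sketched as follows. The mnemonic list, the weight values, and the register range are assumptions for illustration; they are not the article's OPVS implementation.

```python
import random

# Directed RTPG sketch: instruction templates are drawn with tunable
# biases (weights) and filled with pseudorandom register operands.
def directed_rtpg(n, biases, seed=0):
    """biases: dict mapping mnemonic -> weight; larger weight = drawn
    more often. A fixed seed makes the test program reproducible."""
    rng = random.Random(seed)
    mnemonics = list(biases)
    weights = [biases[m] for m in mnemonics]
    program = []
    for _ in range(n):
        m = rng.choices(mnemonics, weights=weights)[0]
        rd, ra, rb = (rng.randrange(1, 32) for _ in range(3))
        program.append(f"{m} r{rd}, r{ra}, r{rb}")
    return program

# Bias the program toward the multiply-accumulate path.
prog = directed_rtpg(100, {"l.mac": 5, "l.add": 1, "l.and": 1})
```

Raising a mnemonic's weight steers the random program toward the logic that still has undetected faults, which is the "directed" part of directed RTPG.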
Case study: OpenRISC 1200 processor

OpenRISC 1200 is a publicly available processor core distributed under the General Public License agreement. Its numerous industrial applications include a speech recognizer by Voxi, an SoC design by Flextronics Semiconductor, and a multimedia chip from Vivace Semiconductor that will run Linux 2.6. The current version is a five-stage pipelined, 32-bit RISC processor core with Harvard architecture. OpenRISC 1200 implements the Orbis32 ISA, in which basic digital-signal-processing capabilities are supported by default. It is configurable and can support from 48 to 58 core instructions. We synthesized a configuration of the OpenRISC CPU core supporting 56 instructions and
a MAC unit. The synthesis uses only two-input gates, two-input multiplexers, and flip-flops. The netlist generated after synthesis contains 44,476 gates and 2,021 state-holding elements. The total number of stuck-at faults is 190,702, of which 186,209 faults are reported structurally testable by the fault simulation tool. Because only the faults that are structurally testable are considered during fault simulation, the percentages reported in the case study constitute the test coverage—that is, the quotient of the detected faults divided by the number of structurally testable faults (structurally testable faults = total faults - undetectable faults). Table 1 shows component fault allocation, structurally testable faults, detected faults, and the resulting test coverage percentage for the various components of the OpenRISC 1200 CPU. We partitioned OpenRISC 1200 processor components into three groups, according to the self-test code development strategy followed for each group. Group 1 consists of the parallel multiplier, register file, and functional parts of the pipeline logic (registers and forwarding multiplexers). Group 2 includes the ALU and the MAC adder. Control components fall into group 3. According to the proposed methodology, the three main functional components—the register file, the MAC unit, and the ALU—have the highest test priority. We use deterministic self-test code development to target these functional components. We test the register file by reusing a deterministic self-test routine from our test library that exploits the register file's inherent regularity, as described by Kranitis et al.,1 and we achieve almost complete (99.9%) test coverage. The MAC unit includes a (32 × 32)-bit multiplier and a 64-bit adder, and contains 11,023 gates.
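The test-coverage quotient just defined can be checked against the totals reported in the text and in Table 1:

```python
# Test coverage per the article's definition:
# structurally testable = total faults - undetectable faults,
# test coverage = detected / structurally testable.
total_faults = 190_702          # total stuck-at faults in the netlist
structurally_testable = 186_209 # reported by the fault simulation tool
detected = 171_879              # CPU-level detected faults (Table 1 total)

undetectable = total_faults - structurally_testable
test_coverage_pct = 100 * detected / structurally_testable
```

This reproduces the article's headline figure: roughly 92.3% test coverage over 186,209 structurally testable faults.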
The multiplier, a component with architecture supported by our test library, can be tested effectively using SBST deterministic self-test code development, as described elsewhere.1,3 Moreover, to achieve acceptable fault coverage for the entire MAC, test engineers should target the 64-bit adder as a component with considerable test priority because of the number of faults it contains. To accomplish this with the fewest possible test patterns, we use accumulator-based compaction for the deterministic multiplier patterns. The MAC adder compacts the multiplier responses. Compaction’s main purpose is to achieve high fault coverage in the adder; a reduced number of responses is an added benefit. All remaining adder faults were detected by seven ATPG patterns that were subsequently mapped to proper
Table 1. Component fault allocation and test coverage statistics for OpenRISC 1200.*

Component                    Structurally testable faults    Detected faults    Test coverage (%)
MAC unit                     78,076                          77,295             99.0
Register file                37,008                          37,004             99.9
ALU                          23,653                          21,737             91.9
Exception unit               9,237                           5,117              55.4
Special-purpose registers    9,145                           6,639              72.6
PC generation                7,308                           5,781              79.1
Control unit                 5,537                           5,162              93.2
Load/store unit              4,558                           4,293              94.2
Operand multiplexers         3,374                           3,339              99.0
Writeback multiplexers       2,678                           2,198              82.1
Instruction fetch unit       2,536                           597                23.5
Remaining logic              3,099                           2,717              87.7
Total CPU                    186,209                         171,879            92.3

*MAC: multiply-accumulate; ALU: arithmetic logic unit; PC: program counter.
instruction sequences to be applied at the processor level. Test coverage for the entire MAC unit is 99.0%. The concept of supporting various configurations applies not only to the number of components that constitute OpenRISC 1200 but also to their characteristics and functionality. For example, the ALU contains optional functions that can be activated or deactivated, along with alternative implementations for certain core instructions. This choice affects the RTL coding style of those components, resulting in a behavioral high-level RTL description that leaves implementation choices to the synthesis tool and increases the uncertainty about the architecture of the functional components that implement various operations. Moreover, this choice limits the amount of regularity in the ALU, reducing the benefits of high-level deterministic methods and making constrained ATPG fine-tuned to the gate level more appropriate. In the current implementation with 56 instructions, the ALU contains 3,430 gates; there are 23,653 structurally testable faults. We have adopted a deterministic, constrained ATPG methodology similar to that described by Chen et al.4 and Wen et al.5 The ATPG patterns are mapped to proper instruction sequences that provide all the necessary operands, apply the test instruction, and propagate the results to the data memory. Figure 2 shows a code snippet representing the template used for the add-with-carry (ADDC) instruction, and another showing the mapping of the settable fields to the ATPG test patterns. Due to collateral coverage, the operand multiplexers component was sufficiently tested (99.0% test coverage). We used deterministic self-test code development for the pipeline registers and forwarding multiplexers as well. Pipeline logic is not listed explicitly as an entity in Table 1, because it is distributed among many processor components. Test coverage of 100% was reported for the testable data part of the pipeline registers. Test coverage for the address part of the pipeline registers was 84%, partly because of constraints related to memory mapping and partly because their controllability and observability are reduced compared with those of their data-related counterparts. After targeting the highest-test-priority functional components, we didn't consider the overall test coverage of 89.3% sufficient, so we proceeded to coverage-driven, verification-based self-test code development for the control components. The OpenRISC 1200 control unit in the current implementation contains 1,536 gates, including 148 state-holding elements. There are 5,537 structurally testable faults. As described in the methodology, control parts are targeted by verification-based functional test routines that aim at maximizing functional-coverage metrics. An automated, directed RTPG OR1200 processor verification suite (OPVS) that we developed for OpenRISC 1200 generates part of these test routines automatically. Our OPVS automatically generates random self-test programs based on templates and instruction biases; however, proper handwritten instruction sequences target a few corner cases. Functional-coverage metrics can serve as a guideline in this process. A code snippet produced by the OPVS for the ADDC instruction appears in Figure 3. We extended the OPVS so that verification-
Figure 2. Template of the add-with-carry (ADDC) instruction for constrained ATPG: ATPG test patterns (a), mapping to settable fields (b), ADDC instruction template (c), and ATPG test pattern application for ADDC (d).
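Since Figure 2 itself is not reproduced here, the following hedged sketch illustrates the idea of mapping an ATPG test pattern to an ADDC instruction sequence: load the two operands, force the carry flag, apply the test instruction, and propagate the response to data memory. The register allocation, the carry-setup trick, the result base register r31, and the pattern format are our assumptions for illustration, not the article's actual template.

```python
# Hypothetical pattern-to-template mapping for l.addc (OpenRISC-style
# assembly emitted as text). One ATPG pattern = (operand A, operand B,
# carry-in); 'slot' selects the memory word receiving the response.
def addc_test_sequence(op_a: int, op_b: int, carry_in: bool, slot: int) -> str:
    """Emit assembly that loads one ATPG pattern, sets the carry flag,
    applies l.addc, and stores the response to data memory."""
    lines = [
        # build the 32-bit operands from 16-bit immediates
        f"l.movhi r3, {op_a >> 16:#06x}",
        f"l.ori   r3, r3, {op_a & 0xFFFF:#06x}",
        f"l.movhi r4, {op_b >> 16:#06x}",
        f"l.ori   r4, r4, {op_b & 0xFFFF:#06x}",
    ]
    if carry_in:
        # 0xFFFFFFFF + 0xFFFFFFFF overflows a 32-bit add, forcing carry = 1
        lines += [
            "l.movhi r5, 0xffff",
            "l.ori   r5, r5, 0xffff",
            "l.add   r5, r5, r5",
        ]
    else:
        lines += ["l.add   r5, r0, r0"]  # 0 + 0 clears the carry flag
    lines += [
        "l.addc  r6, r3, r4",            # instruction under test
        f"l.sw   {4 * slot}(r31), r6",   # unload the response to memory
    ]
    return "\n".join(lines)

# One ATPG pattern applied through the template:
print(addc_test_sequence(0x80000000, 0x7FFFFFFF, carry_in=True, slot=0))
```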
Table 2. Verification-based self-test code improvement on functional metrics and test coverage.

                                 Control unit                  Load/store unit
Functional metrics          Routines       Routines       Routines       Routines
and test coverage           targeting      targeting      targeting      targeting
                            groups 1 and 2 group 3        groups 1 and 2 group 3
Statement coverage (%)          84.9          98.8            84.2          100.0
Branch coverage (%)             79.8          98.4            89.7           97.6
Toggle coverage (%)             86.7          98.5            70.3           97.6
Test coverage (%)               79.2          90.9            83.2           91.2
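Of the functional metrics in Table 2, toggle coverage is the easiest to state precisely. A minimal sketch, under the simplifying assumption that a bit counts as covered once it has been observed at both logic 0 and logic 1 (real simulators track 0-to-1 and 1-to-0 transitions separately); the trace format is ours:

```python
# Simplified toggle-coverage metric: the fraction of register bits
# observed at both logic values over a list of sampled values.
def toggle_coverage(trace: list[int], width: int) -> float:
    seen0 = [False] * width
    seen1 = [False] * width
    for value in trace:
        for bit in range(width):
            if (value >> bit) & 1:
                seen1[bit] = True
            else:
                seen0[bit] = True
    covered = sum(a and b for a, b in zip(seen0, seen1))
    return 100.0 * covered / width

# An 8-bit register whose test program only ever exercises the low nibble:
print(toggle_coverage([0x00, 0x0F, 0x05], width=8))  # → 50.0
```

A directed routine that also drives the upper nibble would raise this figure, which is exactly the kind of improvement the "routines targeting group 3" columns show.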
IEEE Design & Test of Computers, January/February 2008
based code includes instructions for propagating the results to memory, in compliance with the SBST concept. The resulting code appears in Figure 3 as well.

Figure 3. Template of the ADDC instruction for verification (a) and the modified ADDC instruction template for manufacturing test (b).

Table 2 shows experimental results before and after the application of the proposed verification-based functional-testing approach for two control-oriented components, the control unit and the load/store unit (LSU). The table shows the relation between an increase in functional metrics and test coverage improvement. We also applied verification-based functional testing to the remaining components that have significant control parts: the exception unit, the special-purpose registers, and the instruction fetch unit.

After targeting the highest-test-priority functional components and the control components, we didn't consider the overall test coverage of 91.2% sufficient, so we deployed directed RTPG, using the OPVS to increase test coverage at the processor level. Initially, the bias was set so that all instructions had an equal chance to be generated. Under this scenario, we generated 294 test instructions that are incorporated in templates for setting the operands and for storing the result to the data memory. Each instruction was used approximately five times with pseudorandom operands. Additionally, to stress hazard detection and pipeline control logic, we generated 254 instructions containing various combinations of control and data hazards in adjacent lines, taking into account multicycle operations. This code supplements the deterministic self-test code targeting pipeline registers and forwarding multiplexers.

The incremental test coverage results in Figure 4 and the test program statistics in Table 3 were derived after applying the SBST strategies constituting the proposed H-SBST methodology. The first step, applying deterministic self-test code development reusing test library routines, targets the parallel multiplier, the register file, and the pipeline logic (group 1). Because of the compact coding style characterizing most of the test library routines (including loop-based pattern generation and deployment), this step requires only 1,041 instructions. The number of cycles (98,433) includes 38,367 processor initialization cycles that are not directly related to SBST. Application of constrained ATPG (step 2) follows, targeting the ALU as well as the remaining faults of the MAC adder (group 2). At this point, there are 12,135 instructions, and the combined program requires 191,638 cycles. Step 3, verification-based self-test code development, serves in testing the control components (group 3); the number of instructions increases to 13,039, and the number of cycles to 195,739. When component-based self-test code development cannot proceed further, directed RTPG, comprising 294 test instructions and 254 instructions stressing pipeline control logic (RTPG 1), is deployed to improve coverage at the processor level. This achieves test coverage of 91.6%, using a test program of 15,693 instructions and requiring 210,167 cycles. If this result is insufficient, and if a further increase in the number of instructions and cycles is an acceptable trade-off, then a second RTPG program can be added. Test coverage reaches 92.3% after the application of an additional 2,500 RTPG test instructions (RTPG 2), while the number of instructions reaches 31,728 and the required number of cycles increases to 355,461. Finally, Figure 4 shows that applying 5,000 extra test instructions (RTPG 3) results in a marginal improvement in test coverage (0.2%) while roughly doubling the total number of instructions (74,262) and the required number of cycles (646,185), thereby indicating saturating behavior.

Functional fault coverage is far higher than the 92.3% test coverage reported by the fault simulator, because many of the undetected structurally testable faults are functionally untestable for the specific implementation. For example, since OpenRISC 1200 is a configurable processor, many of the special-purpose registers exist only in specific implementations when the caches and the memory management unit and translation look-aside buffer components are
Figure 4. Incremental application of H-SBST and test coverage increase for the OpenRISC 1200. (RTPG: random test-program generation.)
activated, as described by the specification for the OpenRISC 1000 family. These components were not activated in our case study configuration, so the corresponding faults residing in the special-purpose registers, the instruction fetch unit, the exception unit, and the writeback multiplexers were functionally untestable for this configuration. Furthermore, functionally untestable faults exist in other components as well. For example, the program counter (PC) generation component incorporates one incrementer that calculates PC+4 (the next value of the PC) and one adder that performs the effective-address calculation for branch instructions. Both of these subcomponents include functionally untestable faults because of memory-mapping constraints (restrictions on the memory code segment) imposed to comply with the OpenRISC Reference Platform. Memory regions beyond the specified limits cannot be accessed by programs during the processor's normal operation, making the corresponding address logic faults functionally untestable.
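The memory-map argument can be made concrete: a gate-level test pattern for the branch effective-address adder is only applicable by software if the resulting target stays inside the legal code segment. A minimal sketch of that applicability check, with placeholder segment bounds (the OpenRISC Reference Platform's actual limits are not reproduced here):

```python
# Why some address-logic faults are functionally untestable: patterns
# whose branch target falls outside the legal code segment can never
# be applied by a program running normally. Bounds are assumptions.
CODE_SEGMENT = range(0x0000_0100, 0x0020_0000)  # placeholder legal region

def functionally_applicable(pc: int, offset: int) -> bool:
    """A branch pattern (pc, offset) is software-applicable only if its
    effective address lies in the region reachable in normal operation."""
    target = (pc + offset) & 0xFFFF_FFFF  # 32-bit effective address
    return target in CODE_SEGMENT

# Three hypothetical ATPG patterns for the effective-address adder:
patterns = [(0x0000_2000, 0x100),      # lands in the code segment
            (0x0000_2000, -0x3000),    # wraps below the segment
            (0x001F_FF00, 0x2000)]     # overshoots the segment
applicable = [p for p in patterns if functionally_applicable(*p)]
```

Faults detectable only by the two rejected patterns would be counted as undetected by the fault simulator, yet they can never affect normal operation.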
Table 3. Incremental application of H-SBST and test program statistics for OpenRISC 1200.

Step   Self-test code development strategy             Total          Clock      CPU test
                                                       instructions   cycles     coverage (%)
1      Test library targeting group 1                       1,041      98,433       84.9
2      Constrained ATPG targeting group 2                  12,135     191,638       89.3
3      Verification-based strategy targeting group 3       13,039     195,739       91.2
4      RTPG 1                                              15,693     210,167       91.6
5      RTPG 2                                              31,728     355,461       92.3
6      RTPG 3                                              74,262     646,185       92.5
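The saturation argument falls directly out of Table 3: coverage gained per 1,000 added instructions shrinks sharply for the later RTPG steps. A short computation over the table's own figures (the per-step derivation is ours):

```python
# Marginal coverage gain per 1,000 added instructions, step over step,
# using the Table 3 figures. The later RTPG steps buy almost nothing.
steps = [  # (strategy, total instructions, CPU test coverage %)
    ("Test library (group 1)",        1_041, 84.9),
    ("Constrained ATPG (group 2)",   12_135, 89.3),
    ("Verification-based (group 3)", 13_039, 91.2),
    ("RTPG 1",                       15_693, 91.6),
    ("RTPG 2",                       31_728, 92.3),
    ("RTPG 3",                       74_262, 92.5),
]

for (_, prev_instr, prev_cov), (label, instr, cov) in zip(steps, steps[1:]):
    gain_per_k = 1000 * (cov - prev_cov) / (instr - prev_instr)
    print(f"{label:28s} +{cov - prev_cov:.1f}% ({gain_per_k:.2f}%/k instructions)")
```

RTPG 1 still yields roughly 0.15% per thousand instructions, while RTPG 3 yields under 0.01%, which is the saturating behavior Figure 4 shows graphically.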
embedded processors, and processor-based systems, complementing standard DFT approaches. Hybrid SBST effectively combines deterministic structural SBST methodologies (using high-level test development and gate-level constrained-ATPG test development) with verification-based self-test code development for control components and directed RTPG. Intense research efforts are necessary in the near future to overcome challenges related to software-based self-test automation and its application to emerging processor architectures.
References
1. N. Kranitis et al., "Software-Based Self-Testing of Embedded Processors," IEEE Trans. Computers, vol. 54, no. 4, Apr. 2005, pp. 461-475.
2. A. Paschalis and D. Gizopoulos, "Effective Software-Based Self-Test Strategies for On-Line Periodic Testing of Embedded Processors," IEEE Trans. Computer-Aided Design of Integrated Circuits and Systems, vol. 24, no. 1, Jan. 2005, pp. 88-99.
3. N. Kranitis et al., "Instruction-Based Self-Testing of Processor Cores," Proc. 20th IEEE VLSI Test Symp. (VTS 02), IEEE CS Press, 2002, pp. 223-228.
4. L. Chen et al., "A Scalable Software-Based Self-Test Methodology for Programmable Processors," Proc. 40th Design Automation Conf. (DAC 03), ACM Press, 2003, pp. 548-553.
5. C.H.P. Wen et al., "Simulation-Based Target Test Generation Techniques for Improving the Robustness of a Software-Based Self-Test Methodology," Proc. IEEE Int'l Test Conf. (ITC 05), IEEE CS Press, 2005, pp. 936-945.

Nektarios Kranitis is an adjunct lecturer in the Department of Informatics and Telecommunications at the University of Athens. His research interests include design and test of digital-system, microprocessor, embedded-processor, and SoC architectures. Kranitis has a PhD in computer science from the University of Athens. He is a member of the IEEE Computer Society and the IEEE.

Andreas Merentitis is a PhD candidate in computer science in the Department of Informatics and Telecommunications at the University of Athens. His research interests include low-energy self-testing of microprocessor cores, online testing, and reliability of reconfigurable systems. Merentitis has an MS in microelectronics from the University of Athens. He is a student member of the IEEE Computer Society and the IEEE.

George Theodorou is a PhD candidate in computer science in the Department of Informatics and Telecommunications at the University of Athens. His research interests include online self-testing of microprocessor cores. Theodorou has an MS in microelectronics from the University of Athens. He is a student member of the IEEE Computer Society and the IEEE.

Antonis Paschalis is an associate professor in the Department of Informatics and Telecommunications at the University of Athens. His research interests include logic design and architecture, VLSI testing, processor testing, and hardware fault tolerance. Paschalis has a PhD in computer science from the University of Athens. He is a Golden Core member of the IEEE Computer Society and a member of the IEEE.

Dimitris Gizopoulos is an assistant professor in the Department of Informatics at the University of Piraeus, Greece. His research interests include microprocessors and microprocessor-based systems design, test, and fault tolerance; embedded-systems design, test, and reliability; and fault modeling, self-testing, and online testing of digital systems. Gizopoulos has a PhD in computer science from the University of Athens. He is associate editor of IEEE Transactions on Computers and IEEE Design & Test, a Golden Core member of the IEEE Computer Society, and a senior member of the IEEE.

Direct questions and comments about this article to Nektarios Kranitis, University of Athens, Dept. of Informatics and Telecommunications, Panepistimiopolis, Ilissia 15784, Athens, Greece; [email protected].

For further information on this or any other computing topic, please visit our Digital Library at http://www.computer.org/csdl.