Test Patterns of Multiple SIC Vectors: Theory and ... - IEEE Xplore

614

IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 21, NO. 4, APRIL 2013

Test Patterns of Multiple SIC Vectors: Theory and Application in BIST Schemes Feng Liang, Luwen Zhang, Shaochong Lei, Guohe Zhang, Kaile Gao, and Bin Liang

Abstract— This paper proposes a novel test pattern generator (TPG) for built-in self-test. Our method generates multiple singleinput change (MSIC) vectors in a pattern, i.e., each vector applied to a scan chain is an SIC vector. A reconfigurable Johnson counter and a scalable SIC counter are developed to generate a class of minimum transition sequences. The proposed TPG is flexible to both the test-per-clock and the test-per-scan schemes. A theory is also developed to represent and analyze the sequences and to extract a class of MSIC sequences. Analysis results show that the produced MSIC sequences have the favorable features of uniform distribution and low input transition density. The performances of the designed TPGs and the circuits under test with 45 nm are evaluated. Simulation results with ISCAS benchmarks demonstrate that MSIC can save test power and impose no more than 7.5% overhead for a scan design. It also achieves the target fault coverage without increasing the test length. Index Terms— Built-in self-test (BIST), low power, single-input change (SIC), test pattern generator (TPG).

I. I NTRODUCTION

B

UILT-IN SELF-TEST (BIST) techniques can effectively reduce the difficulty and complexity of VLSI testing, by introducing on-chip test hardware into the circuit-under-test (CUT). In conventional BIST architectures, the linear feedback shift register (LFSR) is commonly used in the test pattern generators (TPGs) and output response analyzers. A major drawback of these architectures is that the pseudorandom patterns generated by the LFSR lead to significantly high switching activities in the CUT [1], which can cause excessive power dissipation. They can also damage the circuit and reduce product yield and lifetime [2], [3]. In addition, the LFSR usually needs to generate very long pseudorandom sequences in order to achieve the target fault coverage in nanometer technology. A. Prior Work Several advanced BIST techniques have been studied and applied. The first class is the LFSR tuning. Girard et al. analyzed the impact of an LFSR’s polynomial and seed selection on the CUT’s switching activity, and proposed a method to select the LFSR seed for energy reduction [4].

Manuscript received July 18, 2011; revised April 5, 2012; accepted April 12, 2012. Date of publication May 14, 2012; date of current version March 18, 2013. This work was supported in part by the Fundamental Research Funds for the Central Universities under Grant xjj20100053 and the National Natural Science Foundation of China under Grant 61006033. The authors are with the School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an 710049, China (e-mail: [email protected]). Digital Object Identifier 10.1109/TVLSI.2012.2195689

The second class is low-power TPGs. One approach is to design low-transition TPGs. Wang and Gupta used two LFSRs of different speeds to control those inputs that have elevated transition densities [5]. Corno et al. provided a lowpower TPG based on the cellular automata to reduce the test power in combinational circuits [6]. Another approach focuses on modifying LFSRs. The scheme in [7] reduces the power in the CUT in general and clock tree in particular. In [8], a low-power BIST for data path architecture is proposed, which is circuit dependent. However, this dependency implies that nondetecting subsequences must be determined for each circuit test sequence. Bonhomme et al. [9] used a clock gating technique where two nonoverlapping clocks control the odd and even scan cells of the scan chain so that the shift power dissipation is reduced by a factor of two. The ring generator [10] can generate a single-input change (SIC) sequence which can effectively reduce test power. The third approach aims to reduce the dynamic power dissipation during scan shift through gating of the outputs of a portion of the scan cells. Bhunia et al. [11] inserted blocking logic into the stimulus path of the scan flip-flops to prevent the propagation of the scan ripple effect to logic gates. The need for transistors insertion, however, makes it difficult to use with standard cell libraries that do not have power-gated cells. In [12], the efficient selection of the most suitable subset of scan cells for gating along with their gating values is studied. The third class makes use of the prevention of pseudorandom patterns that do not have new fault detecting abilities [13]–[15]. These architectures apply the minimum number of test vectors required to attain the target fault coverage and therefore reduce the power. However, these methods have high area overhead, need to be customized for the CUT, and start with a specific seed. Gerstendorfer et al. also proposed to filter out nondetecting patterns using gate-based blocking logics [16], which, however, add significant delay in the signal propagation path from the scan flip-flop to logic. Several low-power approaches have also been proposed for scan-based BIST. The architecture in [17] modifies scan-path structures, and lets the CUT inputs remain unchanged during a shift operation. Using multiple scan chains with many scanenable (SE) inputs to activate one scan chain at a time, the TPG proposed in [18] can reduce average power consumption during scan-based tests and the peak power in the CUT. In [19], a pseudorandom BIST scheme was proposed to reduce switching activities in scan chains. Other approaches include LT-LFSR [20], a low-transition random TPG [21], and the weighted LFSR [22]. The TPG in [20] can reduce the transitions in the scan inputs by assigning the same

1063-8210/$31.00 © 2012 IEEE

LIANG et al.: TEST PATTERNS OF MULTIPLE SIC VECTORS

615

value to most neighboring bits in the scan chain. In [21], power reduction is achieved by increasing the correlation between consecutive test patterns. The weighted LFSR in [22] decreases energy consumption and increases fault coverage by adding weights to tune the pseudorandom vectors for various probabilities.

B. Contribution and Paper Organization This paper presents the theory and application of a class of minimum transition sequences. The proposed method generates SIC sequences, and converts them to low transition sequences for each scan chain. This can decrease the switching activity in scan cells during scan-in shifting. The advantages of the proposed sequence can be summarized as follows. 1) Minimum transitions: In the proposed pattern, each generated vector applied to each scan chain is an SIC vector, which can minimize the input transition and reduce test power. 2) Uniqueness of patterns: The proposed sequence does not contain any repeated patterns, and the number of distinct patterns in a sequence can meet the requirement of the target fault coverage for the CUT. 3) Uniform distribution of patterns: The conventional algorithms of modifying the test vectors generated by the LFSR use extra hardware to get more correlated test vectors with a low number of transitions. However, they may reduce the randomness in the patterns, which may result in lower fault coverage and higher test time [23]. It is proved in this paper that our multiple SIC (MSIC) sequence is nearly uniformly distributed. 4) Low hardware overhead consumed by extra TPGs: The linear relations are selected with consecutive vectors or within a pattern, which has the benefit of generating a sequence with a sequential decompressor. Hence, the proposed TPG can be easily implemented by hardware. The rest of this paper is organized as follows. In Section II, the proposed MSIC-TPG scheme is presented. The principle of the new MSIC sequences is described in Section III. In Section IV, the properties of the MSIC sequences are analyzed. In Section V, experimental methods and results on test power, fault coverage, and area overhead are provided to demonstrate the performance of the propsoed MSIC-TPGs. Conclusions are given in Section VI.

II. P ROPOSED MSIC-TPG S CHEME This section develops a TPG scheme that can convert an SIC vector to unique low transition vectors for multiple scan chains. First, the SIC vector is decompressed to its multiple codewords. Meanwhile, the generated codewords will bit-XOR with a same seed vector in turn. Hence, a test pattern with similar test vectors will be applied to all scan chains. The proposed MSIC-TPG consists of an SIC generator, a seed generator, an XOR gate network, and a clock and control block.

Fig. 1. (a) Symbolic simulation of an MSIC pattern for scan chains. (b) Symbolic representation of an MSIC pattern.

A. Test Pattern Generation Method Assume there are m primary inputs (PIs) and M scan chains in a full scan design, and each scan chain has l scan cells. Fig. 1(a) shows the symbolic simulation for one generated pattern. The vector generated by an m-bit LFSR with the primitive polynomial can be expressed as S(t) = S0 (t)S1 (t)S2 (t), . . . , Sm−1 (t) (hereinafter referred to as the seed), and the vector generated by an l-bit Johnson counter can be expressed as J (t) = J0 (t)J1 (t)J2 (t), . . . , Jl−1 (t). In the first clock cycle, J = J0 J1 J2 , . . . , Jl−1 will bit-XOR with S = S0 S1 S2 , . . . , S M−1 , and the results X 1 X l+1 X 2l+1 , . . . , X (M−1)l+1 will be shifted into M scan chains, respectively. In the second clock cycle, J = J0 J1 J2 , . . . , Jl−1 will be circularly shifted as J = Jl−1 J0 J1 , . . . , Jl−2 , which will also bit-XOR with the seed S = S0 S1 S2 , . . . , S M−1 . The resulting X 2 X l+2 X 2l+2 , . . . , X (M−1)l+2 will be shifted into M scan chains, respectively. After l clocks, each scan chain will be fully loaded with a unique Johnson codeword, and seed S0 S1 S2 , . . . , Sm−1 will be applied to m PIs. Since the circular Johnson counter can generate l unique Johnson codewords through circular shifting a Johnson vector, the circular Johnson counter and XOR gates in Fig. 1 actually constitute a linear sequential decompressor. B. Reconfigurable Johnson Counter According to the different scenarios of scan length, this paper develops two kinds of SIC generators to generate Johnson vectors and Johnson codewords, i.e., the reconfigurable Johnson counter and the scalable SIC counter.

616


1) If SE = 0, the count from the adder is stored to the k-bit subtractor. During SE = 1, the contents of the k-bit subtractor will be decreased from the stored count to all zeros gradually. 2) If SE = 1 and the contents of the k-bit subtractor are not all zeros, M-Johnson will be kept at logic 1 (0). 3) Otherwise, it will be kept at logic 0 (1). Thus, the needed 1s (0s) will be shifted into the M-bit shift register by clocking CLK2 l times, and unique Johnson codewords will be applied into different scan chains. For example, after full-scan design, ISCAS’89 s13207 has 10 scan chains whose maximum scan length is 64. To implement a scalable SIC counter as shown in Fig. 2(b), it only needs 6 D-type flip-flops (DFFs) for the adder, 6 DFFs for the subtractor, 10 DFFs for a 10-bit shift register for 10 scan chains, 6 multiplexers, and additional 19 combinational logic gates. The equivalent gates are 204 in total. For a 64-bit Johnson counter, it needs 64 DFFs, which are about 428 equivalent gates. The overhead of a MSIC-TPG can thus be effectively decreased by using the scalable SIC counter. Generally, the gate overhead of a MSIC-TPG can be estimated by the number of DFFs (NDFF ) used 2log2l NDFF = m + M + 2log2l = (m + M) 1 + (1) m+M

Fig. 2. SIC generators. (a) Reconfigurable Johnson counter. (b) Scalable SIC counter. (c) Waveforms of the scalable SIC counter.

For a short scan length, we develop a reconfigurable Johnson counter to generate an SIC sequence in time domain. As shown in Fig. 2(a), it can operate in three modes. 1) Initialization: When RJ_Mode is set to 1 and Init is set to logic 0, the reconfigurable Johnson counter will be initialized to all zero states by clocking CLK2 more than l times. 2) Circular shift register mode: When RJ_Mode and Init are set to logic 1, each stage of the Johnson counter will output a Johnson codeword by clocking CLK2 l times. 3) Normal mode: When RJ_Mode is set to logic 0, the reconfigurable Johnson counter will generate 2l unique SIC vectors by clocking CLK2 2l times. C. Scalable SIC Counter When the maximal scan chain length l is much larger than the scan chain number M, we develop an SIC counter named the “scalable SIC counter.” As shown in Fig. 2(b), it contains a k-bit adder clocked by the rising SE signal, a k-bit subtractor clocked by test clock CLK2, an M-bit shift register clocked by test clock CLK2, and k multiplexers. The value of k is the integer of log2 (l − M). The waveforms of the scalable SIC counter are shown in Fig. 2(c). The k-bit adder is clocked by the falling SE signal, and generates a new count that is the number of 1s (0s) to fill into the shift register. As shown in Fig. 2(b), it can operate in three modes.

where m, M, and l are the seed number, scan chain number, and the maximum scan length, respectively. If l is doubled, the number of DFFs is only increased by 2/(m + M) times. If 2i−1 < l < 2i (i is a natural number), the number of DFFs does not vary with the CUTs’ sizes. For example, as will be shown in Table II, for ISCAS’89 benchmarks, when the seed number increases from 20 to 38 and the maximum scan length increases from 54 to 87, their area overheads change only from 397.40 to 488.29 μm2 . D. MSIC-TPGs for Test-per-Clock Schemes The MSIC-TPG for test-per-clock schemes is illustrated in Fig. 3(a). The CUT’s PIs X 1 − X mn are arranged as an n × m SRAM-like grid structure. Each grid has a two-input XOR gate whose inputs are tapped from a seed output and an output of the Johnson counter. The outputs of the XOR gates are applied to the CUT’s PIs. A seed generator is an m-stage conventional LFSR, and operates at low frequency CLK1. The test procedure is as follows. 1) The seed generator generates a new seed by clocking CLK1 one time. 2) The Johnson counter generates a new vector by clocking CLK2 one time. 3) Repeat 2 until 2l Johnson vectors are generated. 4) Repeat 1–3 until the expected fault coverage or test length is achieved. E. MSIC-TPGs for Test-per-Scan Schemes The MSIC-TPG for test-per-scan schemes is illustrated in Fig. 3(b). The stage of the SIC generator is the same as the maximum scan length, and the width of a seed generator is


617

sequential decompressor, facilitating hardware implementation. Another requirement is that the MSIC sequence should not contain any repeated test patterns, because repeated patterns could prolong the test time and reduce test efficiency [5]. Finally, uniformly distributed patterns are desired to reduce the test length (number of patterns required to achieve a target fault coverage) [21]. This section aims to extract a class of test sequences that meets these requirements. A. Algebraic Model of MSIC Sequence Assume there are m PIs and M scan chains in a full scan design, and each scan chain has l scan cells. At time t, a test vector applied to the ith scan chain can be expressed as ⎛ ⎞ l Mi (t, j )x j −1 ⎠ x (i−1)l+m (2) Ci (t, x) = ⎝ j =1

where i ∈(1, M), and Mi (t, j ) is the content of the jth scan cell in the ith scan chain. Assume an m-bit LFSR with the primitive polynomial generates a vector S(t) = S0 (t)S1 (t)S2 (t), . . . , Sm−1 (t), and an l-bit Johnson counter generates a vector J (t) = J0 (t)J1 (t)J2 (t), . . . , Jl−1 (t). These two vectors can be used to construct the vector for the ith scan chain as ⎧ ⎫ l ⎨ ⎬ Si (t)x j −1 + Ji (t, x) x (i−1)l+m (3) Ci (t, x) = ⎩ ⎭ j =1

where “+” denotes the bit-XOR operation, and Ji (t, x) is the tth codeword of the Johnson vector J (t) = J0 (t)J1 (t)J2 (t), . . . , Jl−1 (t). More precisely Fig. 3.

MSIC-TPGs for (a) test-per-clock and (b) test-per-scan schemes.

not smaller than the scan chain number. The inputs of the XOR gates come from the seed generator and the SIC counter, and their outputs are applied to M scan chains, respectively. The outputs of the seed generator and XOR gates are applied to the CUT’s PIs, respectively. The test procedure is as follows. 1) The seed circuit generates a new seed by clocking CLK1 one time. 2) RJ_Mode is set to “0”. The reconfigurable Johnson counter will operate in the Johnson counter mode and generate a Johnson vector by clocking CLK2 one time. 3) After a new Johnson vector is generated, RJ_Mode and Init are set to 1. The reconfigurable Johnson counter operates as a circular shift register, and generates l codewords by clocking CLK2 l times. Then, a capture operation is inserted. 4) Repeat 2–3 until 2l Johnson vectors are generated. 5) Repeat 1–4 until the expected fault coverage or test length is achieved. III. P RINCIPLE OF MSIC S EQUENCES The main objective of the proposed algorithm is to reduce the switching activity. In order to reduce the hardware overhead, the linear relations are selected with consecutive vectors or within a pattern, which can generate a sequence with a

J1 (t, x) = J0 (t, x)+J1 (t, x)x+ · · · +Jl−2 (t, x)x l−2 +Jl−1 (t, x)x l−1 J2 (t, x) = Jl−1 (t, x)+J0 (t, x)x+ · · · +Jl−3 (t, x)x l−2 +Jl−2 (t, x)x l−1 .. . Jl−1 (t, x) = J2 (t, x)+J3 (t, x)x+ · · · +J0 (t, x)x l−2 +J1 (t, x)x l−1 Jl (t, x) = J1 (t, x)+J2 (t, x)x+ · · · +Jl−1 (t, x)x l−2 +J0 (t, x)x l−1 .

(4)

As described in Section II-E [Figs. 1 and 3(b)], the seed circuit generates a new seed by clocking the slow clock CLK1 one time. When RJ_Mode is set to “0,” the reconfigurable Johnson counter will operate in the Johnson counter mode and generate a Johnson vector by clocking the normal clock CLK2 one time. After a new Johnson vector is generated, RJ_Mode and Init are set to 1. The reconfigurable Johnson counter will then operate as a circular shift register and generate l codewords by clocking CLK2 l times. Meanwhile, M of these codewords will be shifted into M scan chains. A capture operation is then inserted. Therefore, the time required for a Johnson vector to be applied to the CUT is l normal clock cycles. Since an l-stage Johnson counter can generate 2l different Johnson vectors, the period of a Johnson

618


sequence is actually 2l 2 normal clock cycles in this paper. Therefore, the constraint equations of S and J are given by S(t) = S([t/2l 2 ])

J (t + 2l 2 ) = J (t)

(5)

where [t/2l 2 ] denotes the integer part of t/2l 2 . With these considerations, the kth pattern applied to the combinational part of the CUT, i.e., X (tk , x), can be expressed as X (k) = X (tk , x) m Sq−1 (tk )x q−1 = q=1

+

⎧ M ⎨ l ⎩

i=1

j =1

⎫ ⎬

Si (tk )x j −1 + Ji (tk , x) x (i−1)l+m ⎭

(6)

where tk = [t/l]. A sequence satisfying (3)–(6) is called an MSIC sequence. Theorem 1: The period of a MSIC sequence is 2l(2m − 1). Proof: The pth pattern applied to the combinational part of the CUT can be expressed as X ( p) = X (t p , x) m Sq−1 (t p )x q−1 = ⎧ M ⎨ l

B. Example

⎫ ⎬

q=1

+

i=1

⎩

Si (t p )x j −1 + Ji (t p , x) x (i−1)l+m . ⎭

j =1

The bit-XOR result of the pth pattern and the kth pattern can be expressed as X (t p , x)+X (tk , x) m = (Sq−1 (t p ) + Sq−1 (tk ))x q−1 ⎧ M ⎨ l

q=1

+

i=1

+

⎩

M

(Si (tk ) + Si (tp ))x j −1

j =1

⎫ ⎬ ⎭

x (i−1)l+m

Ji (t p , x) + Ji (tk , x) x (i−1)l+m .

(7)

i=1

If S(t p ) = S(tk ), the following expressions are true: m

⎧

i=1

⎩

(Sq−1 (t p ) + Sq−1 (tk ))x q−1 = 0

q=1

M ⎨ l

(Si (tk ) + Si (tp ))x j −1

j =1

2l and 2m − 1, respectively, the period of an MSIC sequence defined in (6) is thus 2l(2m − 1). The theorem shows that the generated MSIC sequence can be applied to a CUT as a consecutive sequence. This makes the proposed TPG flexible to different lengths of scan chains. The tradeoff between the target fault coverage and the overhead determines the design process of MSIC-TPGs. In fact, the number 2l(2m − 1) is large enough for general CUTs, which can achieve the target fault coverage. Hence, the numbers of Johnson counter stages and LFSR stages can be selected according to the demand. Let s and j be the number of LFSR stages and the number of Johnson counter stages, respectively. For test-per-clock schemes, the product of s and j only needs to satisfy the condition s × j ≥ m such that 2 j (2s − 1) can achieve the target fault coverage. For test-per-scan schemes, j usually equals to l, and s is selected as half the number of PIs or a fixed number which is not smaller than the scan chain number. The principle of selecting s is to reduce the overhead and achieve the target fault coverage. Experimental results in this paper will show that the selection of s based on these principles has negligible impact on the test length.

⎫ ⎬ ⎭

x (i−1)l+m = 0.

(8)

Therefore, (7) is not zero for unique seeds. If S(t p ) = S(tk ), (7) can be simplified as M Ji (t p , x) + Ji (tk , x) x (i−1)l+m . X (t p , x)+X (tk , x) =

(9)

i=1

From (3), we know that (9) is not zero for unique Johnson vectors. Combining (8) and (9), we can conclude that (7) is zero for a maximum length LFSR sequence. Since the period of an l-bit Johnson sequence and an m-bit LFSR sequence are

Suppose there are four scan chains whose lengths are 8 in a full-scan design. The 8-bit Johnson codewords and 4-bit (or more bits) seeds can be used to generate an MSIC sequence. A MSIC sequence with seed 01101 is . . . . chain1 .. chain2 .. chain3 .. chain4 .. PIs . . . . X (1) 10000000 .. 10111111 .. 11011111 .. 00010000 .. 01101 . . . . X (2) 11000000 .. 10011111 .. 11001111 .. 00011000 .. 01101 . . . . X (3) 11100000 .. 10001111 .. 11000111 .. 00001110 .. 01101 . . . . X (4) 11110000 .. 10000111 .. 11000011 .. 00001111 .. 01101 ... . . . . X (15) 00000001 .. 01111111 .. 10111111 .. 00100000 .. 01101 . . . . X (16) 00000000 .. 11111111 .. 11111111 .. 00000000 .. 01101. For (25 − 1) unique seeds, there are 31 such blocks. Therefore, there are 16 × 31 = 496 unique patterns in this MSIC sequence. Also, the transition between patterns X(t) and X(t+1) is one for every scan chain, and transitions for scan chain 2 to scan in “1001111” are two per clock. Seed “01101” is kept unchanged for patterns X (1) to X (16). However, if the seed circuit and the Johnson counter are clocked by the same frequency clocks, a sequence based on (3) is as follows: chain1 X (1) 01111011 X (2) 01011101 X (3) 00101110 X (5) 00000111 ...

.. . .. . .. . . .. .. .

chain2 11001110 11100111 11110111 01010101

.. . .. . .. . . .. .. .

chain3 11000111 01100011 10110001 11011000

.. . .. . .. . . .. .. .

chain4 11100000 11010000 11101100 11110100

.. . .. . .. . . .. .. .

PIs 10001 → 11010 01101 → 01111 00111 → 11101 01110 → 00001

. . . . X (32) 01111011 .. 11001110 .. 11000111 .. 11100000 .. 10001 → 11010 .


619

TABLE I S WITCHING A CTIVITY D URING S HIFTING CUT

TL

s13207 s15850 s38417 s35932 s38584 Average

100 000 100 000 100 000 200 100 000 NA

WSAavg MSIC LFSR 2561 4477 2403 4531 2664 11 318 1031 8580 5268 11 194 NA NA

WSApk MSIC LFSR 5291 8021 4352 7126 7064 16 304 2543 14 625 3911 15 949 NA NA

WSAavg Red. % MSIC [3] [20] 43 59 45 47 NA 58 76 54 56 88 65 56 53 49 59 61 56 54

WSApk Red. % MSIC [3] 34 41 39 NA 57 39 83 55 25 51 47 46

It can be seen that X (1) and X (32) are the same. The period of this sequence is thus 31. The transitions in each scan chain or between neighbor patterns would be higher. For a combinational circuit, its m PIs can be split into M segments. Equation (6) can also be used to generate an MSIC sequence for it. The theory and examples above show that there are enough unique MSIC patterns to achieve the target fault coverage. Also, there are linear relations between these patterns or within a pattern. IV. P ROPERTIES OF MSIC S EQUENCES A. Switching Activity Reduction For test-per-clock schemes, M segments of the CUT’s primary inputs are applied with M unique SIC vectors. The mean input transition density of the CUT is close to 1/l. For test-per-scan schemes, the CUT’s PIs are kept unchanged during 2l 2 shifting-in clock cycles, and the transitions of a Johnson codeword are not greater than 2. Therefore, the mean input transition density of the CUT during scan-in operations is less than 2/l. For the five largest ISCAS’89 benchmarks with 20 evenly distributed scan chains, Table I compares the weighted switching activity (WSA) of scan cells during shifting with the proposed MSIC sequence and the conventional LFSR sequence, respectively. The column labeled “TL” lists the number of test patterns applied to the CUT. The next four columns are the average WSA (WSAavg) of all shifting operations and the maximum WSA in a clock cycle (WSApk). The last five columns show the percentages of reductions over LFSR, where the performances of the reordering method in [3] and the LT-RTPG method in [20] are also included. Compared to LFSR sequence, the proposed MSIC sequence can reduce the average WSA by 43%–88% and peak WSA by 25%–83%. Compared with the methods in [3] and [20], the MSIC sequence also shows higher reduction ratios. Therefore, the CUT’ transitions with an MSIC sequence can be effectively reduced.

Fig. 4. (a) Forms of test vectors for scan chain 1. (b) Numbers of 1s in 638 scan cells of s13207. (c) Test length-fault coverage curves for c2670.

B. Uniform Distribution of MSIC Patterns If test patterns are not uniformly distributed, there might be some inputs that are assigned the same values in most test patterns. Hence, faults that can only be detected by patterns that are not generated may escape, leading to low fault coverage. Therefore, uniformly distributed test patterns

are desired for most circuits in order to achieve higher fault coverage [5]. The form of MSIC vectors with seed S0 S1 S2 , . . . , S M−1 for scan chain 1 is shown in Fig. 4(a). If the test length L is a

620


TABLE II OVERHEAD OF TPGs

CUT

PIs

Circuit size Gates+ DFFs

c2670 c3540 c5315 c6288 c7552 Average s13207 s15850 s38417 s35932 s38584 Average

233 50 178 32 207 NA 62 77 28 35 38 NA

1193+0 1669+0 2307+0 2406+0 3512+0 NA 7951+638 9772+534 22 179+1636 12 204+1728 192 53+1426 NA

MSIC-TPG

LFSR [21]

Width/bit Area SIC 16 5 13 7 11 NA 64 54 82 87 72 NA

Seed 15 10 14 5 16 NA 32 38 20 20 20 NA

μm2 745.53 219.68 592.32 169.89 601.48 465.78 421.38 488.29 397.4 481.15 482.04 454.05

157 27 51 21 45 60.2 6.4 7.5 2 2.5 2.3 4.14

Total power Peak power μW μW 1019 235 27.17 70 296.7 32 13.91 17 620.5 54 24.05 27 196.3 8 12.07 3 819.9 55 24.32 18 590.48 76.8 20.30 27 351.8 4.8 17.5 2.7 381.3 5.8 18.25 2.8 306.5 1.6 25.29 1.2 326.4 1.6 25.51 1.3 323.4 1.6 25.3 1.2 3.08 22.37 1.84 337.88

multiple of 2l, all MSIC patterns can be divided into L/2l such blocks. The probability of 1 (or 0) applied to each scan cell is 0.5. For example, the number of S0 (S0 ) is l in the first column. If the test length L is not a multiple of 2l, the difference exists only in the last L − i nteger (L/2l) patterns. So the difference in the number of 1s and 0s applied to each scan cell does not exceed l. Therefore, the probability of 1 (or 0) for a scan cell is |probabilty(i ) − 0.5| ≤ l/L

(10)

where i ∈ (0, 1). Fig. 4(b) shows the numbers of 1s in 638 scan cells of s13207. The test length is 9974, and the number of 1s applied to each scan cell is between 4730 and 5194. The probabilities of 1 for all scan cells are between 0.4974 and 0.526. For all of the five largest ISCAS’89 benchmarks, simulation results with different initial seeds show that the probabilities of 1 for all scan cells are between 0.465 and 0.535. This shows that an MSIC sequence is nearly uniformly distributed, and the initial seed state has negligible impact on the uniform distribution of MSIC pattern. C. Relationship Between Test Length and Fault Coverage The test length of conventional LFSR methods is related to the initial test vector. In other words, the number of patterns to hit the target fault coverage depends on the initial vector in conventional LFSR TPGs [21]. Fig. 4(c) shows the test length-fault coverage relationships of c2670 with a conventional LFSR and with a MSIC sequence, respectively. The fault model in Fig. 4(c) is stuck at. The width of the seed and the Johnson counter is provided in Table II. For the LFSR sequences, the rates of growth of fault coverage vary greatly with the TPG’s initial states, as shown by the curves marked with “Best case, LFSR” and “Worst case, LFSR.” For MSIC sequences, the numbers of patterns to hit the target fault coverage with different initial seeds are nearly the same, as shown by curves marked with “Best case, MSIC” and “Worst case, MSIC.” Also, the rate of growth of the fault coverage with the MSIC sequence is very close to that with the LFSR in the best case. For example, the fault

Area μm2 1224.58 267.46 935.76 172.94 1088.04 737.75 401.62 498.48 178.18 223.38 246.64 309.66

258 33 81 22 82 95.2 6.2 7.7 0.9 1.1 1.2 3.4

Total power μW 163.1 423 37.68 46 124 112 24.84 7 144.3 105 98.78 138.6 52.81 7.6 65.38 10.1 23.77 1.1 29.67 1.2 32.66 1.6 40.85 4.3

Peak power μW 1060 244 322.5 35 780.1 67 222.4 9 908.1 61 658.6 83.2 454.1 6.2 552.7 8.5 214.7 1.1 268 1.3 299.5 1.5 357.8 3.7

A

NA 15 NA NA NA 15 NA NA 14 NA 12 13

P 21.87 19.51 16.94 NA NA 19.44 14 NA NA NA 18.6 16.3

[24] A

NA NA NA NA NA NA 11.3 8.3 3.8 NA 4.2 6.9

TABLE III FAULT C OVERAGE R ANGE W ITH THE S AME T EST L ENGTH CUT c2670 c3540 c5315 c6288 c7552 s13207 s15850 s38417 s35932 s38584

TL 1988 1052 1129 112 1024 9942 9533 9601 4800 9645

SFCmin 89.03 94.36 99.02 98.45 94.98 91.3 89.16 91.35 97.49 96.5

SFCmax 92.84 97.65 99.62 99.63 96.46 93.66 94.21 91.93 99.99 97.65

δrange −1.58%∼2.63% −0.74%∼2.71% −0.31%∼0.30% −1.00%∼0.18% −0.48%∼1.07% −1.23%∼1.32% −1.73%∼3.83% −0.34%∼0.29% −2.44%∼0.06% −0.76%∼0.42%

coverage curves rise to 91.50 and 91.24% after 1988 patterns with the MSIC sequence and “Best case, LFSR,” respectively. Hence, the uniform distribution of MSIC patterns alleviates the dependency relationship between the test length and TPG’s initial states. The above conclusion can also be confirmed from another perspective. Table III shows the fault coverage range that MSIC can achieve with the same test length for a given CUT. For every benchmark CUT, 50 fault simulations have been carried out with 50 different seeds. Columns labeled SFCmin and SFCmax refer to the min and max stuck-at fault coverage with the same test length, which is under the column “TL”. The column “δrange ” refers to the relative error range of the fault coverage for a given CUT. The i th relative error, i.e., δi , can be given by SFCi − δi = 1 50

1 50

50

50

SFCi

i=1

× 100%

(11)

SFCi

i=1

where SFCi is the SFC value of the i th simulation for a given CUT. It can be seen from this table that the maximum absolute value of relative error is less than 3.83%, which shows that the TL is independent of the seed.


621

TABLE IV FAULT C OVERAGE C OMPARISON CUT

TL

MSIC-TPG SFC TFC

LFSR SFC TFC

[7]

[21]

TL

SFC

TL

SFC

c2670 c3540 c5315 c6288 c7552

1988 1052 1129 112 1024

91.5 97.65 99.62 98.45 95.55

82.46 92.08 95.99 93.8 89.68

91.24 97.8 98.73 98.22 95.21

70.23 90.66 57.97 92.92 80.17

NA NA NA 87 8491

NA NA NA 99.56 94.08

1988 1052 1111 NA NA

91.5 97.8 99.7 NA NA

s13207 s15850 s38417 s35932 s38584

9942 9533 9601 4800 9645

93.66 94.21 91.35 97.59 97.16

81.35 80.93 89.04 62.99 81.08

90.9 90.81 91.66 96.5 94.24

80.04 65.37 85.81 63.86 80.01

9942 9533 9601 NA 9645

92.4 90.6 91.7 NA 94.1

78832 96413 116208 NA 79360

97.7 98.8 97.5 NA 99.3

TABLE V T OTAL AND P EAK P OWER R EDUCTION OF CUT S CUT

Ptot /uW MSIC LFSR

c2670 c3540 c5315 c6288 c7552

19.9 46.6 55.1 274.8 69.6

38.55 81.44 110 366.2 137

Average

93.2

s13207 s15850 s38417 s35932 s38584

359.8 509.9 136.4 1312 1494

Average

762.4

Ptot

Ppeak /uW MSIC LFSR

Ppeak Ptot

48.4 42.8 50 25 49.2

312.4 755.5 821.8 1994 1012

433.1 918.3 1157 2363 1502

146.63

43.08

979.1

1274.6

24.5

32

11

462.1 648.6 2083 1858 1983

22.14 21.38 34.52 29.39 24.66

3986 5370 12980 20190 18000

5082 6471 19190 20790 19560

21.6 17 32.4 3 8

47 29 NA NA 15

39 30 NA NA 36

1406.9

26.41

12105

14218

16.4

30.3

35

V. P ERFORMANCE A NALYSIS To analyze the performance of the proposed MSIC-TPG, experiments on ISCAS’85 benchmarks and standard full-scan designs of ISCAS’89 benchmarks are conducted. The performance simulations are carried out with the Synopsys Design Analyzer and Prime Power. Fault simulations are carried out with Synopsys Tetramax. Synthesis are carried out with Nangate library based on 45-nm typical technology. The test frequency is 100 MHz, and the power supply voltage is 1.1 V. The test application method is test-per-clock for ISCAS’85 benchmarks and test-per-scan for ISCAS’89 benchmarks. The number of scan chains is 10 for s13207 and s15850, and 20 for s38417, s35932 and s38584.

A. Performances of MSIC-TPGs Table II summarizes the simulation results about MSICTPGs. The first three columns are the name, the number of PIs, and the size of the CUT. The next two columns are the widths of the seed generator and the SIC counter. The columns under “MSIC-TPG” and “LFSR” are the test overheads of the two methods in area, total power, and peak power, where the columns labeled “” are the corresponding

27.9 17.7 29 15.6 32.6

47 14 48 21 NA

[25] Ppeak 6 1 25 14 NA

overhead percentages. The results of [21] and [24] are also included for comparison. For ISCAS’85 benchmarks, the area overheads of MSIC and LFSR are 21%–157% and 22%–258%, respectively. The MSIC-TPG thus incurs less area overhead than the LFSR. Taking c2670’s TPGs for example, the LFSR TPG contains 233 DFFs and 3 XOR gates, while the MSIC-TPG contains 233 XOR gates in the XOR array, 16 DFFs in the Johnson counter, 15 DFFs and 3 XOR gate in the seed generator, and 67 additional logic gates for the control and clock circuit. The number of total equivalent gates is about 2309 for the conventional LFSR and 1089 for the MSIC-TPG. For ISCAS’89 benchmarks, the seed width is selected as a half the number of Pls for s13207 and s15850, and 20 for s38417, s35932, and s38584. As shown in Fig. 3(b), the primary inputs are applied with the outputs of the XOR gates and seed generator, respectively. Simulation results with 45-nm technology show that TPGs’ area overheads are 2.0%–7.5% with MSIC-TPGs, and 0.9%–7.7% with the conventional LFSRs. Both methods do not have a high impact on area overhead of the CUT. Although the sizes of benchmarks vary more than 30 times in Table II, the total power of the MSIC-TPG only varies from 12.07 to 27.17 μW and from 17.5 to 25.51 μW for ISCAS’85

622


and ISCAS’89 benchmarks, respectively. The corresponding average total power overhead percentage is only 1.84% for ISCAS’89 benchmarks because of their larger sizes, and the average power overhead percentage is 27% for ISCAS’85 benchmarks. In terms of peak power, both the MSIC-TPG and the LFSR-TPG have negligible impacts for ISCAS’89 benchmarks. However, the LFSR consumes much more total power than the MSIC-TPG for ISCAS’85 benchmarks. The TPG in [21] imposes 12%–15% more area overhead, and consumes 14%–21.87% less total power against the conventional LFSR. The TPG in [24] has an average of 6.9% more area overhead than the conventional LFSR. Compared with the TPGs in [21] and [24], the MSIC-TPG imposes compatible area overhead and consumes less total power. B. Fault Coverage Comparison Table IV shows the fault coverages for ISCAS benchmarks with the MSIC-TPG, the conventional LFSR, and the methods in [7] and [21]. In this table, the columns labeled SFC, TFC, and TL refer to the stuck-at fault coverage, transition fault coverage, and test length, respectively. The TFC values of this table correspond to launch-on-shift test patterns. In order to achieve fair comparisons, for ISCAS’89 benchmarks, our TLs are chosen to be the same as those in [7]. For the first three ISCAS’85 benchmarks, our TLs are chosen according to [21]. The last two ISCAS’85 benchmarks are selected to have similar fault coverages as [7]. Compared with the conventional LFSR, the MSIC-TPG achieves similar stuck-at fault coverage for ISCAS’85 benchmarks and higher stuck-at fault coverage for ISCAS’89 benchmarks except for s38417. It also achieves higher transition fault coverage for both ISCAS’85 and ISCAS’89 benchmarks except for s35392. Compared to the LT-LFSR method in [21], it can be seen from the ISCAS’85 benchmarks that, with the same TL, the MSIC-TPG achieves similar test efficiency. Compared to the TPG in [7], the results with ISCAS’89 benchmarks show that, with the same TL, the MSIC-TPG has higher efficiency, except for s38417. Note that, for ISCAS’89 benchmarks, the results of our method and those of [21] in the table should not be compared directly because of the different TLs. C. Average and Peak Power Reduction Table V compares the total and peak power reductions of CUTs with the MSIC-TPG and with the LFSR-TPG, where columns labeled “Ptot” and “Ppeak” refer to the CUT’s total power and peak power with the MSIC-TPG, respectively. Columns labeled “Ptot” and “Ppeak” refer to the percentages of different methods’ total power reduction and peak power reduction with respect to the LFSR method, respectively. The results of [25] are also included for comparison. For ISCAS’85 benchmarks, the MSIC-TPG saves 25%–50.0% total power and 15.6%–32.6% peak power against the conventional LFSR. All total power reductions and peak power reductions are higher than those with variable-length ring counter in [25].

For ISCAS’89 benchmarks, the MSIC-TPG can save 21.38%–34.52% total power and 3%–32.4% peak power against the conventional LFSR. The total power reductions and peak power reductions are not higher than those with the variable-length ring counter in [25]. One reason for this is that the power simulation results for the MSIC-TPG are with 45-nm technology, whereas results in [25] are with 0.25-μm technology. Experimental results show that leak power contributes about 10% to total power with 45-nm technology, but contributes little to total power with 0.18- or 0.25-μm technology. VI. C ONCLUSION This paper has proposed a low-power test pattern generation method that could be easily implemented by hardware. It also developed a theory to express a sequence generated by linear sequential architectures, and extracted a class of SIC sequences named MSIC. Analysis results showed that an MSIC sequence had the favorable features of uniform distribution, low input transition density, and low dependency relationship between the test length and the TPG’s initial states. Combined with the proposed reconfigurable Johnson counter or scalable SIC counter, the MSIC-TPG can be easily implemented, and is flexible to test-per-clock schemes and test-per-scan schemes. For a test-per-clock scheme, the MSIC-TPG applies SIC sequences to the CUT with the SRAM-like grid. For a test-perscan scheme, the MSIC-TPG converts an SIC vector to low transition vectors for all scan chains. Experimental results and analysis results demonstrate that the MSIC-TPG is scalable to scan length, and has negligible impact on the test overhead. R EFERENCES [1] Y. Zorian, “A distributed BIST control scheme for complex VLSI devices,” in 11th Annu. IEEE VLSI Test Symp. Dig. Papers, Apr. 1993, pp. 4–9. [2] P. Girard, “Survey of low-power testing of VLSI circuits,” IEEE Design Test Comput., vol. 19, no. 3, pp. 80–90, May–Jun. 2002. [3] A. Abu-Issa and S. Quigley, “Bit-swapping LFSR and scan-chain ordering: A novel technique for peak- and average-power reduction in scan-based BIST,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 28, no. 5, pp. 755–759, May 2009. [4] P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, J. Figueras, S. Manich, P. Teixeira, and M. Santos, “Low-energy BIST design: Impact of the LFSR TPG parameters on the weighted switching activity,” in Proc. IEEE Int. Symp. Circuits Syst., vol. 1. Jul. 1999, pp. 110–113. [5] S. Wang and S. Gupta, “DS-LFSR: A BIST TPG for low switching activity,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 21, no. 7, pp. 842–851, Jul. 2002. [6] F. Corno, M. Rebaudengo, M. Reorda, G. Squillero, and M. Violante, “Low power BIST via non-linear hybrid cellular automata,” in Proc. 18th IEEE VLSI Test Symp., Apr.–May 2000, pp. 29–34. [7] P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, and H. Wunderlich, “A modified clock scheme for a low power BIST test pattern generator,” in Proc. 19th IEEE VTS VLSI Test Symp., Mar.–Apr. 2001, pp. 306–311. [8] D. Gizopoulos, N. Krantitis, A. Paschalis, M. Psarakis, and Y. Zorian, “Low power/energy BIST scheme for datapaths,” in Proc. 18th IEEE VLSI Test Symp., Apr.–May 2000, pp. 23–28. [9] Y. Bonhomme, P. Girard, L. Guiller, C. Landrault, and S. Pravossoudovitch, “A gated clock scheme for low power scan testing of logic ICs or embedded cores,” in Proc. 10th Asian Test Symp., Nov. 2001, pp. 253–258.


[10] C. Laoudias and D. Nikolos, “A new test pattern generator for high defect coverage in a BIST environment,” in Proc. 14th ACM Great Lakes Symp. VLSI, Apr. 2004, pp. 417–420. [11] S. Bhunia, H. Mahmoodi, D. Ghosh, S. Mukhopadhyay, and K. Roy, “Low-power scan design using first-level supply gating,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 13, no. 3, pp. 384–395, Mar. 2005. [12] X. Kavousianos, D. Bakalis, and D. Nikolos, “Efficient partial scan cell gating for low-power scan-based testing,” ACM Trans. Design Autom. Electron. Syst., vol. 14, no. 2, pp. 28-1–28-15, Mar. 2009. [13] P. Girard, L. Guiller, C. Landrault, and S. Pravossoudovitch, “A test vector inhibiting technique for low energy BIST design,” in Proc. 17th IEEE VLSI Test Symp., Apr. 1999, pp. 407–412. [14] S. Manich, A. Gabarro, M. Lopez, J. Figueras, P. Girard, L. Guiller, C. Landrault, S. Pravossoudovitch, P. Teixeira, and M. Santos, “Low power BIST by filtering non-detecting vectors,” J. Electron. Test.-Theory Appl., vol. 16, no. 3, pp. 193–202, Jun. 2000. [15] F. Corno, M. Rebaudengo, M. Reorda, and M. Violante, “A new BIST architecture for low power circuits,” in Proc. Eur. Test Workshop, May 1999, pp. 160–164. [16] S. Gerstendorfer and H.-J. Wunderlich, “Minimized power consumption for scan-based BIST,” in Proc. Int. Test Conf., Sep. 1999, pp. 77–84. [17] A. Hertwing and H. Wunderlich, “Low power serial built-in self-test,” in Proc. Eur. Test Workshop, 1998, pp. 49–53. [18] S. Wang and W. Wei, “A technique to reduce peak current and average power dissipation in scan designs by limited capture,” in Proc. Asia South Pacific Design Autom. Conf., Jan. 2007, pp. 810–816. [19] N. Basturkmen, S. Reddy, and I. Pomeranz, “A low power pseudorandom BIST technique,” in Proc. IEEE Int. Conf. Comput. Design: VLSI Comput. Process., Sep. 2002, pp. 468–473. [20] S. Wang and S. Gupta, “LT-RTPG: A new test-per-scan BIST TPG for low switching activity,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 25, no. 8, pp. 1565–1574, Aug. 2006. [21] M. Nourani, M. Tehranipoor, and N. Ahmed, “Low-transition test pattern generation for BIST-based applications,” IEEE Trans. Comput., vol. 57, no. 3, pp. 303–315, Mar. 2008. [22] X. Zhang, K. Roy, and S. Bhawmik, “POWERTEST: A tool for energy conscious weighted random pattern testing,” in Proc. 12th Int. Conf. VLSI Design, Jan. 1999, pp. 416–422. [23] S. F. Q. Abdallatif and S. Abu-Issa, “Multi-degree smoother for low power consumption in single and multiple scan-chains BIST,” in Proc. 11th Int. Symp. Qual. Electron. Design, Apr. 2010, pp. 689–696. [24] S. Chun, T. Kim, and S. Kang, “A new low energy BIST using a statistical code,” in Proc. Asia South Pacific Design Autom. Conf., Mar. 2008, pp. 647–652. [25] B. Zhou, Y.-Z. Ye, Z.-L. Li, X.-C. Wu, and R. Ke, “A new low power test pattern generator using a variable-length ring counter,” in Proc. Qual. Electron. Design, Mar. 2009, pp. 248–252.

623

Feng Liang received the B.E. degree from Zhengzhou University, Henan, China, in 1998, and the M.E. and Ph.D. degrees from Xi’an Jiaotong University (XJU), Xi’an, China, in 2001 and 2008, respectively. He is currently an Associate Professor with the School of Electronic and Information Engineering, XJU. His current research interests include VLSI designs, test, and computer architecture.

Luwen Zhang received the B.E. degree in electronic science and technology from the Xi’an University of Technology, Xi’an, China, in 2009. He is currently pursuing the M.E. degree in electrical engineering from Xi’an Jiaotong University, Xi’an. His current research interests include VLSI testing.

Shaochong Lei received the M.S.E. and Ph.D. degrees from Xi’an Jiaotong University (XJU), Xi’an, China, in 1990 and 2007, respectively. He is currently an Associate Professor with the Microelectronics Engineering Department, XJU. His current research interests include VLSI computeraided designs, system-on-chip testing, and design-for-test.

Guohe Zhang received the B.S. degree in microelectronics and the Ph.D. degree in electronic science and technology from Xi’an Jiaotong University (XJU), Xi’an, China, in 2003 and 2008, respectively. He is currently with the Department of Microelectronics, XJU. His current research interests include low-power and high-speed CMOS circuit designs, semiconductor device designs, and modeling.

Kaile Gao received the B.E. degree from the Changchun University of Science and Technology, Changchun, China, in 2010. He is currently pursuing the M.E. degree in electrical engineering with Xi’an Jiaotong University, Xi’an, China. His current research interests include design-for-test.

Bin Liang received the B.E. degree in electrical engineering from the Xi’an University of Technology, Xi’an, China, in 2010. He is currently pursuing the M.E. degree in electrical engineering with Xi’an Jiaotong University, Xi’an. His current research interests include design-for-test, system-on-chip testing, and VLSI testing.