JOURNAL OF ELECTRONIC TESTING: Theory and Applications 21, 539–549, 2005 © 2005 Springer Science + Business Media, Inc. Manufactured in The United States.
Concurrent Error Detection in a Bit-Parallel Systolic Multiplier for Dual Basis of GF($2^m$)
CHIOU-YNG LEE Program Coordination Department, Chunghwa Telecommunication Laboratories, Chung-Pei Road, Chung-Li, Tao-Yuan, Taiwan 320, R.O.C.
[email protected]
CHE WUN CHIOU Department of Information and Computer Science, Ching Yun University, 229, Chien-Hsin Rd., Chung-Li, Taoyuan, 320 Taiwan, R.O.C.
[email protected]
JIM-MIN LIN Department of Information Engineering, Feng Chia University, Taichung City 407, Taiwan, R.O.C.
[email protected]
Received September 15, 2004; Revised January 26, 2005 Editor: H.J. Wunderlich
Abstract. Finite fields are widely used in error-correcting codes and cryptography. Among finite field arithmetic operations, multiplication is the most important and the most complicated. A multiplier with concurrent error detection capability is therefore needed. In this paper, a concurrent error detection scheme is presented for the bit-parallel systolic dual basis multiplier over GF($2^m$) of Fenn et al. [7]. Compared with Fenn's multiplier, the proposed method increases the space complexity by about 27% and the latency by one extra clock cycle. Our analysis shows that all single stuck-at faults can be detected concurrently.
Keywords: finite fields, multiplier, fault-tolerant computing, fault detection, cryptography, single stuck-at fault
1. Introduction
In recent years, finite field arithmetic operations in GF($2^m$) have been widely used in coding theory [18], cryptography [17], digital signal processing [3, 23], switching theory [1], and pseudorandom number generation [26]. Among the finite field arithmetic operations, multiplication is the most important, complex, and time consuming. Various GF($2^m$) multipliers have therefore received considerable attention in the literature [6, 11, 19]. For cryptographic applications, such multipliers may require millions of logic gates. From the VLSI technology point of view, semiconductor manufacturers try to ensure that their products are reliable, yet it is nearly impossible for a system to be free of faults at all times. A fundamental problem in estimating reliability is whether a system will function in a prescribed manner in a given environment for a given period of time. Hence, this article investigates on-line techniques for error detection in bit-parallel systolic multipliers over GF($2^m$).
In VLSI designs, systolic architectures are fundamentally suited to rapid computation and rely on regular circuitry to perform arithmetic operations over finite fields GF($2^m$). Their common nature supports architectural characteristics such as concurrency, I/O balance, and simple and regular design. There are three popular types of bases over finite fields: polynomial basis (PB), normal basis (NB), and dual basis (DB). For example, most bit-parallel systolic PB multipliers perform array-type multiplication, because polynomial multiplication modulo a primitive (irreducible) polynomial has regular and simple operations. Generally, the array algorithms are classified as least-significant-bit first (LSB-first) and most-significant-bit first (MSB-first) schemes, such as those by Wang [25] and Yeh [27], both of which use regularly connected identical cells and require a latency of 3m clock cycles. Lee et al. [15, 16] used the inner product operation to implement efficient systolic multipliers with low-latency and low-complexity architectures for fields defined by all-one and equally-spaced polynomials. Lee [13, 14] proposed two types of low-complexity bit-parallel systolic multipliers in which the field GF($2^m$) is constructed from trinomials. Fenn et al. [7] suggested a bit-parallel systolic DB multiplier over GF($2^m$). None of these multipliers, however, supports concurrent error detection.

To output error-free values, many error-detection schemes have been reported for symmetrical cryptosystems [2, 10] and asymmetrical cryptosystems [4, 9]. Fenn et al. [8] proposed an on-line detection method for bit-serial multipliers in GF($2^m$) using parity prediction schemes. By employing the same parity prediction scheme, Reyhani-Masoleh and Hasan [24] provided error detection methods for bit-parallel and bit-serial polynomial basis multipliers in GF($2^m$). The major problem in using parity checking is that it takes a long time to generate parity.
Therefore, this method cannot provide on-line detection capability in systolic array multipliers with bit-parallel output. To overcome this problem, Chiou [5] used the REcomputing with Shifted Operands (RESO) method [21, 22] to provide concurrent error detection for a polynomial basis multiplier using all-one polynomials. Unfortunately, irreducible all-one polynomials are very rare: only 13 values of m exist for m less than or equal to 100. Recently, a bit-parallel systolic DB multiplier over GF($2^m$) was proposed by Fenn et al. [7]. Based on this multiplier, this paper investigates a novel parity prediction scheme for concurrent error detection. All single stuck-at faults in the proposed multiplier are detected on-line. The new scheme can be used for a field defined by any irreducible binary polynomial.
2. Conventional Bit-Parallel Dual Basis Multiplier Over GF($2^m$)
It is assumed that the reader is familiar with the basic concepts of finite fields; for more information, readers may refer to [20]. In the following paragraphs, the dual basis multiplication from [7] is briefly reviewed.

It is well known that the finite field GF($2^m$) can be viewed as a vector space of dimension $m$. Suppose that the finite field GF($2^m$) is generated by an irreducible polynomial $P(x) = p_0 + p_1 x + \cdots + p_{m-1} x^{m-1} + x^m$ over GF(2) of degree $m$. Let $\alpha$ be a root of $P(x)$. Any element $A$ in GF($2^m$) can be given as $A = a_0 + a_1 \alpha + \cdots + a_{m-1} \alpha^{m-1}$, where these coefficients are over GF(2) and the basis $\{1, \alpha, \ldots, \alpha^{m-1}\}$ is termed the canonical basis (standard or polynomial basis).

Definition 1 ([16]). Let $\{1, \alpha, \ldots, \alpha^{m-1}\}$ (abbreviated $\{\alpha^i\}$) and $\{\beta_0, \beta_1, \ldots, \beta_{m-1}\}$ (abbreviated $\{\beta_i\}$) be two bases in GF($2^m$), let $f$: GF($2^m$) → GF(2) be a linear function, and let $\gamma \in$ GF($2^m$), $\gamma \neq 0$. The bases $\{\alpha^i\}$ and $\{\beta_j\}$ are then said to be dual with respect to $f$ and $\gamma$ if

\[
f(\gamma \alpha^i \beta_j) =
\begin{cases}
1 & \text{if } i = j, \\
0 & \text{if } i \neq j.
\end{cases}
\tag{1}
\]

Therefore, $\{\beta_j\}$ is called the dual basis of $\{\alpha^i\}$. Using Definition 1, one can find that, for $A \in$ GF($2^m$),

\[
A = \sum_{i=0}^{m-1} a_i \alpha^i = \sum_{i=0}^{m-1} a_i^* \beta_i,
\tag{2}
\]

where $a_i$ and $a_i^*$ are the coordinates of $A$ with respect to the polynomial basis and its dual basis, respectively. Assume that the field GF($2^m$) is constructed from $P(x) = p_0 + p_1 x + \cdots + p_{m-1} x^{m-1} + x^m$ over GF(2), so that every element is given by $A = a_0 + a_1 \alpha + \cdots + a_{m-1} \alpha^{m-1}$. Then, its corresponding DB element can be represented as $A = f(\gamma A)\beta_0 + f(\gamma \alpha A)\beta_1 + \cdots + f(\gamma \alpha^{m-1} A)\beta_{m-1}$.
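As an illustration of Definition 1 and Eq. (2), the following minimal sketch computes dual basis coordinates in GF($2^4$). It assumes $P(x) = x^4 + x + 1$, takes the linear function $f$ to be the field trace, and sets $\gamma = 1$; these choices and the helper names `gf_mul`, `trace` and `dual_coords` are illustrative assumptions and are not taken from [7].

```python
# Sketch only: GF(2^4) with P(x) = x^4 + x + 1 (assumed), f = trace (assumed),
# gamma = 1 (assumed). Elements are 4-bit integers in the polynomial basis,
# with bit i holding the coefficient a_i of alpha^i.
M = 4
POLY = 0b10011   # x^4 + x + 1
ALPHA = 0b0010   # the root alpha of P(x)


def gf_mul(a, b):
    """Multiply two field elements (shift-and-add with reduction by POLY)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> M:          # degree reached M: reduce modulo P(x)
            a ^= POLY
    return r


def trace(x):
    """Trace from GF(2^M) to GF(2): x + x^2 + x^4 + ...; returns 0 or 1."""
    t, y = 0, x
    for _ in range(M):
        t ^= y
        y = gf_mul(y, y)
    return t


def dual_coords(a, gamma=1):
    """Return [f(gamma * alpha^i * A) for i = 0..M-1], i.e. the a_i^* of Eq. (2)."""
    coords, power = [], 1   # power runs through alpha^0, alpha^1, ...
    for _ in range(M):
        coords.append(trace(gf_mul(gamma, gf_mul(power, a))))
        power = gf_mul(power, ALPHA)
    return coords


# The elements whose dual coordinates are unit vectors form the dual basis.
by_coords = {tuple(dual_coords(x)): x for x in range(1 << M)}
betas = [by_coords[tuple(int(i == j) for i in range(M))] for j in range(M)]
print(betas)  # [9, 4, 2, 1]
```

Under these assumptions the dual basis is $\{\beta_0, \beta_1, \beta_2, \beta_3\} = \{1 + \alpha^3, \alpha^2, \alpha, 1\}$. A different choice of $f$ or $\gamma$ would give a different dual basis.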
Since $P(\alpha) = 0$, we can write

\[
\begin{aligned}
\alpha^{m} &= \sum_{i=0}^{m-1} p_i \alpha^i = \sum_{i=0}^{m-1} p_i^{(0)} \alpha^i, \\
\alpha^{m+1} &= \sum_{i=0}^{m-1} p_i^{(0)} \alpha^{i+1}
 = p_{m-1}^{(0)} p_0^{(0)} + \sum_{i=1}^{m-1} \left( p_{i-1}^{(0)} + p_{m-1}^{(0)} p_i^{(0)} \right) \alpha^i
 = \sum_{i=0}^{m-1} p_i^{(1)} \alpha^i, \\
&\;\;\vdots \\
\alpha^{2m-2} &= \sum_{i=0}^{m-1} p_i^{(m-2)} \alpha^i,
\end{aligned}
\]

where these coefficients are elements of GF(2). Let $B = b_0 \beta_0 + b_1 \beta_1 + \cdots + b_{m-1} \beta_{m-1}$ be represented in the dual basis. Applying the linear function to both sides of each identity in the above equations yields

\[
f(\gamma \alpha^i B) = b_i, \quad \text{for } i = 0, 1, \ldots, m-1,
\tag{3}
\]

and

\[
f(\gamma \alpha^i B) = \sum_{j=0}^{m-1} p_j^{(i-m)} b_j, \quad \text{for } i = m, m+1, \ldots, 2m-2.
\tag{4}
\]

Given the above assumptions, let two elements $A$ and $B$ in GF($2^m$) be represented by

\[
A = \sum_{i=0}^{m-1} a_i \alpha^i, \qquad B = \sum_{i=0}^{m-1} b_i \beta_i.
\]

Assume that the element $C = \sum_{j=0}^{m-1} c_j \beta_j$ is the product of $A$ and $B$, and let $b_i = f(\gamma \alpha^i B)$, for $i = 0, 1, \ldots, 2m-2$. With the matrix-vector representation, the product $C$ can be represented as follows:

\[
\begin{bmatrix}
b_0 & b_1 & \cdots & b_{m-1} \\
b_1 & b_2 & \cdots & b_m \\
\vdots & \vdots & \ddots & \vdots \\
b_{m-1} & b_m & \cdots & b_{2m-2}
\end{bmatrix}
\begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_{m-1} \end{bmatrix}
=
\begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_{m-1} \end{bmatrix}.
\tag{5}
\]

In the above equation, the $m \times m$ matrix is called a Hankel matrix. To explore the bit-parallel DB multiplier over GF($2^m$), the product $C$ using Eqs. (3)–(4) can be rewritten as

\[
C = a_0 B + a_1 \alpha B + \cdots + a_{m-1} \alpha^{m-1} B = \sum_{i=0}^{m-1} a_i C_i,
\tag{6}
\]

where

\[
C_i = \alpha^i B = c_{i,0} \beta_0 + c_{i,1} \beta_1 + \cdots + c_{i,m-1} \beta_{m-1},
\qquad
c_{i,j} = f(\gamma \alpha^{i+j} B) = b_{i+j}.
\]
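The Hankel matrix-vector product of Eq. (5) can be sketched in software as follows. The extra coordinates $b_m, \ldots, b_{2m-2}$ are generated with the recurrence $b_{m+i} = \sum_j p_j b_{i+j}$, which follows from $\alpha^m = \sum_j p_j \alpha^j$. This is a bit-level model rather than the systolic hardware; the function name `dual_basis_multiply` and the example field GF($2^4$) with $P(x) = x^4 + x + 1$ are assumptions made for illustration.

```python
def dual_basis_multiply(a, b, p):
    """Return the dual basis coordinates c of C = A*B.

    a: polynomial basis coordinates [a_0, ..., a_{m-1}] of A
    b: dual basis coordinates [b_0, ..., b_{m-1}] of B
    p: coefficients [p_0, ..., p_{m-1}] of P(x) (the leading 1 is implicit)
    """
    m = len(a)
    b_ext = list(b) + [0] * m        # room for b_m, ..., b_{2m-1}
    c = [0] * m
    for i in range(m):
        for j in range(m):
            c[j] ^= a[i] & b_ext[i + j]              # c_j += a_i * b_{i+j}
            b_ext[m + i] ^= b_ext[i + j] & p[j]      # b_{m+i} += b_{i+j} * p_j
    return c


# Example in GF(2^4) with P(x) = x^4 + x + 1, i.e. p = [1, 1, 0, 0] (assumed).
p = [1, 1, 0, 0]
a = [1, 1, 0, 0]    # A = 1 + alpha in the polynomial basis
b = [1, 0, 0, 1]    # B = beta_0 + beta_3 in the dual basis
c = dual_basis_multiply(a, b, p)
print(c)  # [1, 0, 1, 0]
```

If $f$ is taken to be the trace and $\gamma = 1$ (an assumption, not a choice stated in the text), then $B = \beta_0 + \beta_3$ equals $\alpha^3$, and the result matches the dual coordinates of $(1 + \alpha)\alpha^3 = \alpha^3 + \alpha + 1$.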
As stated above, the following algorithm is presented for dual basis multiplication.

Algorithm for dual basis multiplication
Input: A = [a_0, a_1, ..., a_{m−1}], B = [b_0, b_1, ..., b_{m−1}] and P = [p_0, p_1, ..., p_{m−1}, 1]
Output: C = [c_0, c_1, ..., c_{m−1}] = AB
For i = 0 to m − 1 { c_i = 0 }
For i = 0 to m − 1 {
    b_{m+i} = 0
    For j = 0 to m − 1 {
        c_j = c_j + a_i b_{i+j}
        b_{m+i} = b_{i+j} p_j + b_{m+i}
    }
}

According to the above algorithm, Fig. 1 shows the bit-parallel systolic DB multiplier, which incorporates $m^2$ identical cells. Each cell is composed of two 2-input AND gates, two 2-input XOR gates, and seven 1-bit latches, as shown in Fig. 2. This circuit requires a latency of 3m clock cycles, and the maximum computation delay of each cell is that of one 2-input AND gate, one 2-input XOR gate and one 1-bit latch. This DB multiplier proposed by Fenn et al. [7] is suitable for VLSI implementation. However, such a multiplier cannot detect any existing single stuck-at faults. To overcome this problem, we will modify this multiplier to have the concurrent error detection capability using the parity prediction scheme and describe it in the next section.

Fig. 1. The bit-parallel systolic DB multiplier over GF($2^m$).
Fig. 2. The detailed circuit for the cell $Q_{i,j}$ in Fig. 1.

3. Proposed Concurrent Error Detection Algorithm

In this section, we investigate a novel concurrent error detection (CED) algorithm for dual basis multiplication over GF($2^m$). The single stuck-at fault model is assumed throughout this paper. From the previous section, the dual basis multiplication in Eq. (6) involves two operations: computing $b_{m+i}$ for $0 \le i \le m-2$ and the original multiplication. Hence, two cases of error detection in this algorithm are discussed as follows.

Case 1. Error detection on computing $b_{m+i}$ for $0 \le i \le m-2$

Let us consider the $b_{m+i}$ computations. From Eq. (6), $b_{m+i}$ can be obtained by

\[
b_{m+i} = b_i p_0 + b_{1+i} p_1 + \cdots + b_{m-1+i} p_{m-1}.
\]

In the following, assume that the fault is modeled as a single stuck-at fault, which appears to be the most common model used for logical faults. For this fault model, a fault in a logic gate (i.e., XOR, AND, etc.) results in one of its inputs or outputs being fixed to either a logic 0 (stuck-at-0, or s-a-0 for short) or a logic 1 (stuck-at-1, or s-a-1) [27]. To compute $b_{m+i}$ as given by the above equation, the cell $Q_{i,j}$ in Fig. 1 uses one 2-input AND gate and one 2-input XOR gate, as shown in Fig. 2. The fault behaviors in the cell $Q_{i,j}$ can be classified into the following five cases.

(a) If the input signal $p_j$ at the AND-1 gate has a stuck-at fault, then the cells in the ith row compute

\[
b_{m+i} =
\begin{cases}
\sum_{k=0}^{j-1} b_{k+i} p_k + \sum_{k=j+1}^{m-1} b_{k+i} p_k & \text{for s-a-0,} \\
\sum_{k=0}^{j-1} b_{k+i} p_k + b_{i+j} + \sum_{k=j+1}^{m-1} b_{k+i} p_k & \text{for s-a-1.}
\end{cases}
\]
(b) If the input signal $b_{i+j}$ at the AND-1 gate has a stuck-at fault, then the cells in the ith row compute

\[
b_{m+i} =
\begin{cases}
\sum_{k=0}^{j-1} b_{k+i} p_k + \sum_{k=j+1}^{m-1} b_{k+i} p_k & \text{for s-a-0,} \\
\sum_{k=0}^{j-1} b_{k+i} p_k + p_j + \sum_{k=j+1}^{m-1} b_{k+i} p_k & \text{for s-a-1.}
\end{cases}
\]

(c) If the input signal $b_{m+i}$ at the XOR-3 gate has a stuck-at fault, then the cells in the ith row compute

\[
b_{m+i} =
\begin{cases}
\sum_{k=j+1}^{m-1} b_{k+i} p_k & \text{for s-a-0,} \\
b_{j+i} p_j + \sum_{k=j+1}^{m-1} b_{k+i} p_k & \text{for s-a-1.}
\end{cases}
\]

(d) If the output signal of the AND-1 gate has a stuck-at fault, then the cells in the ith row compute

\[
b_{m+i} =
\begin{cases}
\sum_{k=0}^{j-1} b_{k+i} p_k + \sum_{k=j+1}^{m-1} b_{k+i} p_k & \text{for s-a-0,} \\
\sum_{k=0}^{j-1} b_{k+i} p_k + 1 + \sum_{k=j+1}^{m-1} b_{k+i} p_k & \text{for s-a-1.}
\end{cases}
\]

(e) If the output signal of the XOR-3 gate has a stuck-at fault, then cells in the ith row for computing $b_{m+i}$ can be computed as follows
\[
b_{m+i} =
\begin{cases}
\sum_{k=j+1}^{m-1} b_{k+i} p_k & \text{for s-a-0,} \\
1 + \sum_{k=j+1}^{m-1} b_{k+i} p_k & \text{for s-a-1.}
\end{cases}
\]

Detection of such errors is discussed below. Let $P_B$ be the parity bit of the element $B$, defined as

\[
P_B = b_0 + b_1 + \cdots + b_{m-1},
\]

where the symbol "+" denotes the modulo-2 addition (i.e., the exclusive-OR operation).

Theorem 1. Let $B = \sum_{i=0}^{m-1} b_i \beta_i$ and $P_B = \sum_{i=0}^{m-1} b_i$ be the dual basis representation and its parity bit of $B$, respectively. The predicted parity bit of $\alpha B$ can then be represented as

\[
\hat{P}_{\alpha B} = b_0 p_0 + \sum_{i=1}^{m-1} b_i \bar{p}_i,
\tag{7}
\]

where $\bar{p}_i$ denotes the 1's complement of $p_i$.

Proof: From Eq. (6), one can obtain that $\alpha B = b_1 \beta_0 + b_2 \beta_1 + \cdots + b_m \beta_{m-1}$. Thus, the predicted parity bit of $\alpha B$ is calculated by

\[
\hat{P}_{\alpha B} = b_1 + b_2 + \cdots + b_m.
\tag{8}
\]

From Eq. (4), it is straightforwardly obtained that computing $b_m$ is performed as

\[
b_m = f(\gamma \alpha^m B)
    = f\!\left( \gamma \sum_{i=0}^{m-1} p_i \alpha^i B \right)
    = \sum_{i=0}^{m-1} p_i f(\gamma \alpha^i B)
    = \sum_{i=0}^{m-1} p_i b_i.
\tag{9}
\]

Substituting Eq. (9) into Eq. (8), the predicted parity bit of $\alpha B$ can be formed as

\[
\hat{P}_{\alpha B} = b_0 p_0 + \sum_{i=1}^{m-1} b_i (1 + p_i) = b_0 p_0 + \sum_{i=1}^{m-1} b_i \bar{p}_i.
\]

Therefore, assume that the parity $P_B$ is pre-computed. Since $\hat{P}_{\alpha B} = \sum_{i=0}^{m-1} b_{1+i}$ and $\hat{P}_B = P_B$, we have

\[
\hat{P}_{\alpha B} + \hat{P}_B = b_m + b_0.
\]

Thus, let us define the parameter $\hat{e}_{b_m}$ as the following function; such a parameter can be used for the detection of $b_m$ computations:

\[
\hat{e}_{b_m} = \hat{P}_{\alpha B} + \hat{P}_B + b_m + b_0.
\tag{10}
\]

The parameter $\hat{e}_{b_m} = 1$ indicates the presence of the faulty $b_m$ computation. Analogously, the error detection of $b_{m+i}$ can be performed using the following function

\[
\hat{e}_{b_{m+i}} = \hat{P}_{\alpha^i B} + \hat{P}_{\alpha^{i+1} B} + b_{m+i} + b_i,
\tag{11}
\]

where

\[
\hat{P}_{\alpha^i B} = b_{i-1} p_0 + \sum_{j=1}^{m-1} b_{i+j-1} \bar{p}_j.
\tag{12}
\]
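The check of Eqs. (11) and (12) can be exercised with a small bit-level model. The sketch below is not the hardware of [7]; the helper names (`extend`, `predicted_parity`, `row_flag`) and the example polynomial $P(x) = x^4 + x + 1$ are assumptions for illustration. It computes $b_m, \ldots, b_{2m-2}$, evaluates the flag of Eq. (11) for each row, and then repeats the check after flipping one computed $b_{m+i}$ to emulate the effect of a fault on that output.

```python
def extend(b, p):
    """Append b_m, ..., b_{2m-2} using b_{m+i} = sum_j b_{i+j} p_j (mod 2)."""
    m = len(b)
    b_ext = list(b)
    for i in range(m - 1):
        b_ext.append(sum(b_ext[i + j] & p[j] for j in range(m)) & 1)
    return b_ext


def predicted_parity(b_ext, p, i):
    """Predicted parity of alpha^i * B: P_B for i = 0, Eq. (12) for i >= 1."""
    m = len(p)
    if i == 0:
        return sum(b_ext[:m]) & 1             # P_B, assumed pre-computed
    acc = b_ext[i - 1] & p[0]
    for j in range(1, m):
        acc ^= b_ext[i + j - 1] & (p[j] ^ 1)  # p[j] ^ 1 is the complement of p_j
    return acc


def row_flag(b_ext, p, i):
    """Error flag of Eq. (11) for row i (0 <= i <= m - 2); 1 signals an error."""
    m = len(p)
    return (predicted_parity(b_ext, p, i) ^ predicted_parity(b_ext, p, i + 1)
            ^ b_ext[m + i] ^ b_ext[i])


p = [1, 1, 0, 0]                 # P(x) = x^4 + x + 1 (assumed example)
b_ext = extend([1, 0, 1, 1], p)
fault_free = [row_flag(b_ext, p, i) for i in range(len(p) - 1)]

faulty = list(b_ext)
faulty[len(p) + 1] ^= 1          # emulate a wrong b_{m+1}
with_fault = [row_flag(faulty, p, i) for i in range(len(p) - 1)]
print(fault_free, with_fault)
```

In the fault-free run every flag is 0, and after the flip the flag of row 1, whose $b_{m+1}$ was corrupted, becomes 1. The flip is a simplification that models only the visible effect of a fault on one output bit, not the gate-level cases (a) to (e).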
Case 2. Error detection on original multiplication

Since $G = aB$ is given by $G = \sum_{i=0}^{m-1} g_i \beta_i$, where $g_i = a b_i$, it is straightforwardly obtained that the output $G$ is the logical 0 when $a = 0$ and $B$ when $a = 1$. Let $B = \sum_{i=0}^{m-1} b_i \beta_i$ and $P_B = \sum_{i=0}^{m-1} b_i$ be the dual basis representation of $B$ and its parity bit, respectively. Then the parity bit of the output $G$ with the input signals $a$ and $B$ is given by $P_G = \sum_{i=0}^{m-1} g_i$, where $g_i = a b_i$, $0 \le i \le m-1$, are the coordinates of $G$. Thus, the predicted parity bit of the output $G$ can be expressed as

\[
\hat{P}_G = a \hat{P}_B.
\tag{13}
\]

First, computing $\alpha^i B$ from Eq. (6) is carried out by $\alpha^i B = b_i \beta_0 + b_{i+1} \beta_1 + \cdots + b_{i+m-1} \beta_{m-1}$. It is easily computed that the predicted parity bit of $\alpha^i B$ is $\hat{P}_{\alpha^i B} = b_{i-1} p_0 + \sum_{j=1}^{m-1} b_{i+j-1} \bar{p}_j$. Thus, the predicted parity bit of the product $C = a_0 B + a_1 \alpha B + \cdots + a_{m-1} \alpha^{m-1} B$ can be calculated by

\[
\hat{P}_C = \sum_{i=0}^{m-1} a_i \hat{P}_{\alpha^i B}.
\tag{14}
\]

The parity bit of the original product $C$ is straightforwardly obtained from Eq. (6), i.e.,

\[
P_C = \sum_{i=0}^{m-1} c_i, \quad \text{where } c_i = \sum_{j=0}^{m-1} a_j b_{i+j}.
\]

Finally, errors in the product $C$ can be detected by comparing the actual parity $P_C$ with the predicted parity $\hat{P}_C$, that is,

\[
\hat{e}_C = \hat{P}_C + P_C.
\tag{15}
\]

The parameter $\hat{e}_C = 1$ indicates an existing single stuck-at fault.

As stated above, both Eqs. (11) and (15) are used to detect errors at the multiplier output. If the $b_{m+i}$ computations are error-free, then $\hat{e}_{b_{m+i}} = 0$, whereas $\hat{e}_{b_{m+i}} = 1$ flags an existing single stuck-at fault. Notably, when a single stuck-at fault occurs in the $b_{m+i}$ computations, this fault is not injected into the output of the $b_{m+j}$ computations, where $i \neq j$. Based on Eq. (11), we can detect all single stuck-at faults occurring in the $b_{m+i}$ computations. When a single stuck-at fault occurs in the $c_i$ computations, this fault is not injected into the output of the $c_j$ computations, where $i \neq j$. Therefore, for detecting all single stuck-at faults in the entire multiplier, $\hat{e}_C = 1$ flags the presence of a single stuck-at fault. Since Eqs. (11) and (15) are over GF(2), the values of $\hat{e}_{b_{m+i}}$ and $\hat{e}_C$ detect not only a single stuck-at fault, but also any odd number of single stuck-at faults. By a similar argument, an even number of single stuck-at faults occurring in cells in the ith row will not be detected by $\hat{e}_{b_{m+i}}$. Similarly, the value $\hat{e}_C$ cannot detect an even number of single stuck-at faults occurring in cells in the ith column.

4. Concurrent Error Detection in Bit-Parallel DB Systolic Multiplier Over GF($2^m$)
Two types of parity prediction functions were discussed in the previous section. Using these parity functions, we detect errors in the bit-parallel systolic dual basis multiplier over GF($2^m$). Based on the dual basis multiplier architecture shown in Fig. 1, the modified bit-parallel systolic dual basis multiplier with concurrent error detection is shown in Fig. 3. The modified multiplier consists of $(m+1) \times m$ cells, comprising $m \times (m-1)$ U-cells, $(m+1)$ V-cells and $(m-1)$ W-cells. The detailed circuits for the U, W and V cells are depicted in Figs. 4, 5 and 6, respectively.

The U-cells in the ith row, where $0 \le i \le m-2$, perform the following functions: computing $b_{m+i}$, $\alpha^i B$, and $\hat{P}_{\alpha^{i+1} B} = b_i p_0 + \sum_{j=1}^{m-1} b_{i+j} \bar{p}_j$, which is Eq. (12) with $i$ replaced by $i+1$. When both $b_{m+i}$ and $\hat{P}_{\alpha^{i+1} B}$ have been calculated in the U-cells, the W-cell uses four signals, $b_i$, $b_{m+i}$, $\hat{P}_{\alpha^{i+1} B}$ and $\hat{P}_{\alpha^i B}$, to carry out error detection of the $b_{m+i}$ computations, where $\hat{P}_{\alpha^i B}$ is computed in the U-cells in the (i−1)th row. Notably, $\hat{P}_B = P_B$ if $i = 0$. In addition, the W-cells use $\hat{P}_{\alpha^i B}$ and $a_i$, for $0 \le i \le m-2$, to calculate the predicted parity $\hat{P}_C$. In the last row, the $V_{m-1,j}$ cells for $0 \le j \le m-1$ perform both the $\alpha^{m-1} B$ and $P_C$ computations. Finally, the $V_{m-1,m}$ cell is responsible for the $a_{m-1} \hat{P}_{\alpha^{m-1} B}$ computation and error detection for the entire multiplier. According to the above configuration, this circuit requires a latency of 3m + 1 clock cycles. The maximum computation delay of each cell is that of one 2-input AND gate, one 2-input XOR gate, and one 1-bit latch.

Throughout the proposed multiplier, the single stuck-at fault model is assumed. Let the cell in the ith row and the jth column of the array in Fig. 3 be denoted by $X_{i,j}$, where "X" represents one of the three cell types U, V and W. Suppose that the $X_{i,j}$-cell is faulty. The faulty behavior can be classified into the following seven cases and proven detectable.
Fig. 3. The bit-parallel systolic DB multiplier over GF($2^4$) with concurrent error detection.
Fig. 4. The detailed $U_{i,j}$-cell circuit in Fig. 3.
(1) Error on $p_j$. The output signal $p_j$ of the cell $U_{i,j}$ uses a go-through line from the input signal $p_j$; thus an error on it is easily detected by comparing the primary input signal $p_j$ with the primary output signal $p_j$ of the array in Fig. 3. Therefore, the error on the signal $p_j$ can be neglected.
(2) Error on $a_i$. The output signal $a_i$ of the cell $U_{i,j}$ is also a go-through line from the input signal $a_i$; thus an error on it is easily detected. Therefore, the error on the signal $a_i$ can be neglected.
(3) Error on $b_{i+j}$. The output signal $b_{i+j}$ of the cell $U_{i,j}$ also uses a go-through line from the input signal $b_{i+j}$; thus an error on it is easily detected. Therefore, the error on the signal $b_{i+j}$ can be neglected.
(4) Error on $p_j$. The output signal $p_j$ of the cell $U_{i,j}$ also uses a go-through line from the input signal $p_j$; thus an error on it is easily detected. Therefore, the error on the signal $p_j$ can be neglected.
(5) Error on $c_j$. If an error occurs on the input signal $c_j$ of the faulty cell $U_{i,j}$, it will corrupt the output signal $c_j$; that is, it influences the final result on the output signal $c_j$. The parity bit of $C$ is computed in the V-cells using $P_C = \sum_{j=0}^{m-1} c_j$, and the predicted parity bit of $C$ is computed in the W-cells using $\hat{P}_C = \sum_{i=0}^{m-1} a_i \hat{P}_{\alpha^i B}$. After the actual parity $P_C$ and the predicted parity $\hat{P}_C$ are compared at the $V_{m-1,m}$-cell, this error is detected.

Fig. 5. The detailed $W_{i,m}$-cell circuit in Fig. 3.
Fig. 6. The detailed $V_{m,j}$-cell circuit in Fig. 3.
Table 1. Comparison of various multipliers with error detection.

Multipliers                  Fenn et al. [8]             Reyhani-Masoleh and Hasan [24]   Fig. 3
Output                       Bit-serial output           Bit-parallel output              Bit-parallel output
Basis                        Polynomial                  Polynomial                       Dual
Generated polynomial         General form                General form                     General form
Fault coverage               45–52%                      50%                              100%
Concurrent error detection   No                          No                               Yes
Time overhead                ⌈log2 m⌉ XOR gate delays    ⌈log2 m⌉ XOR gate delays         1 clock cycle
(6) Error on $b_{m+i}$ or $\hat{P}_{\alpha^{i+1} B}$. If an error occurs on the input signal $b_{m+i}$ of the faulty cell $U_{i,j}$, it will corrupt only the output signal $b_{m+i}$. Therefore, by Eq. (11), $\hat{e}_{b_{m+i}}$ in the $W_{i,m}$ cell becomes logical one; that is, the error is detected. Similarly, an error on $\hat{P}_{\alpha^{i+1} B}$ in the cell $U_{i,j}$ can also be detected using Eq. (11).
(7) Errors on $\hat{P}_C$ or $P_C$. If an error occurs on the input signal $\hat{P}_C$ in the cell $W_{i,m}$, then $\hat{e}_C$ in the cell $V_{m-1,m}$ becomes logical one by Eq. (15); that is, the error is detected. Similarly, if the output signal $P_C$ in the $V_{m-1,j}$-cell is faulty, the error is detected in the $V_{m-1,m}$-cell using Eq. (15).

As stated above, the proposed concurrent error detection scheme uses two parity predictions, $\hat{P}_C$ and $\hat{P}_{\alpha^i B}$, to detect errors in the entire multiplier. The advantages of the proposed multiplier are as follows:
(1) All single stuck-at faults can be detected by using Eqs. (11) and (15).
(2) The proposed multiplier with concurrent error detection takes only one extra clock cycle compared with the dual basis multiplier in [7] without concurrent error detection.
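A bit-level model can illustrate how Eq. (15) flags a corrupted product bit. The sketch below is a software model under stated assumptions, not the circuit of Fig. 3: the function names are hypothetical, the example field is GF($2^4$) with $P(x) = x^4 + x + 1$, and a fault is emulated simply by flipping one output bit $c_j$.

```python
def multiply_with_parity(a, b, p):
    """Return (c, predicted parity of C) for the dual basis product C = A*B.

    The predicted parity follows Eq. (14): sum_i a_i * P(alpha^i B), where the
    parity of alpha^i B is P_B for i = 0 and Eq. (12) for i >= 1.
    """
    m = len(a)
    b_ext = list(b) + [0] * m
    c = [0] * m
    for i in range(m):
        for j in range(m):
            c[j] ^= a[i] & b_ext[i + j]              # c_j += a_i * b_{i+j}
            b_ext[m + i] ^= b_ext[i + j] & p[j]      # b_{m+i} += b_{i+j} * p_j
    pred_c = 0
    for i in range(m):
        if i == 0:
            par = sum(b[:m]) & 1                     # P_B
        else:
            par = b_ext[i - 1] & p[0]
            for j in range(1, m):
                par ^= b_ext[i + j - 1] & (p[j] ^ 1)  # complement of p_j
        pred_c ^= a[i] & par
    return c, pred_c


def error_flag(c, pred_c):
    """Eq. (15): compare the actual parity of C with the predicted parity."""
    return (sum(c) & 1) ^ pred_c


p = [1, 1, 0, 0]                      # P(x) = x^4 + x + 1 (assumed example)
c, pred_c = multiply_with_parity([1, 1, 0, 1], [1, 0, 1, 1], p)
clean = error_flag(c, pred_c)         # fault-free product

c_faulty = list(c)
c_faulty[2] ^= 1                      # emulate a fault that corrupts c_2
flagged = error_flag(c_faulty, pred_c)
print(clean, flagged)  # 0 1
```

Flipping two output bits at once would leave the parity unchanged and the flag at 0, which matches the remark in the text that an even number of errors escapes a parity check.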
5. Conclusion

Recently, Fenn et al. [7] proposed a new method for designing a bit-parallel systolic dual basis multiplier over GF($2^m$). This paper investigates a parity prediction scheme for concurrent error detection in a multiplier designed according to that method. The analysis shows that the proposed multiplier can detect all single-cell faults concurrently. As a comparison, most existing multipliers [8, 24] with parity prediction schemes require at least ⌈log2 m⌉ gate delays for error detection. However, only about half of the faults assumed in these multipliers are detectable. The reason is that a single stuck-at fault in bit-serial multipliers produces multiple errors after m clock cycles, and the number of effective errors resulting from the single stuck-at fault is either odd or even; only an odd number of errors can be detected. Our proposed multiplier overcomes this problem. Table 1 shows a comparison of various multipliers in GF($2^m$) with error detection capability. A circuit comparison between the proposed systolic multiplier and Fenn's multiplier [7] is given in Table 2. Although the proposed method increases the space overhead by about 27% and the latency by one extra clock cycle compared with Fenn's multiplier [7], Fenn's multiplier cannot support concurrent error detection. Moreover, the proposed architecture has attractive features for high-speed VLSI system design, such as regularity, modularity, and concurrency. The architecture has applications in error correction codes and public-key cryptography.

Table 2. Comparison of two bit-parallel systolic multipliers for the dual basis of GF($2^m$).

Multipliers                  Fenn et al. [7]    Fig. 3
Number of cells              m^2                m × (m + 1)
Cell complexity                                 U-cell   W-cell   V-cell
  2-input AND                2                  3        1        1
  2-input XOR                2                  3        4        2
  1-bit latches              7                  10       6        2
Fault coverage               0                  100%
Computation time per cell    T_A + T_X          T_A + T_X
Latency (unit = cycles)      3m                 3m + 1
Transistor count             96m^2 + 48m        122m^2 + 66m − 60
Acknowledgments The authors are grateful to the referees for their valuable suggestions.
References
1. B. Benjauthrit and I.S. Reed, "Galois Switching Functions and Their Applications," IEEE Trans. Computers, vol. C-25, pp. 78–86, 1976.
2. G. Bertoni, L. Breveglieri, I. Koren, P. Maistri, and V. Piuri, "Error Analysis and Detection Procedures for a Hardware Implementation of the Advanced Encryption Standard," IEEE Trans. Computers, vol. 52, no. 4, pp. 492–505, 2003.
3. R.E. Blahut, Fast Algorithms for Digital Signal Processing, Reading, Mass.: Addison-Wesley, 1985.
4. D. Boneh, R.A. DeMillo, and R.J. Lipton, "On the Importance of Eliminating Errors in Cryptographic Computations," Journal of Cryptology, vol. 14, pp. 101–119, 2001.
5. C.W. Chiou, "Concurrent Error Detection in Array Multipliers for GF($2^m$) Fields," IEE Electronics Letters, vol. 38, no. 14, pp. 688–689, 2002.
6. C.W. Chiou, L.C. Lin, F.H. Chou, and S.F. Shu, "Low Complexity Finite Field Multiplier Using Irreducible Trinomials," Electronics Letters, vol. 39, no. 24, pp. 1709–1711, 2003.
7. S.T.J. Fenn, M. Benaissa, and O. Taylor, "Dual Basis Systolic Multipliers for GF($2^m$)," IEE Computers and Digital Techniques, vol. 144, no. 1, pp. 43–46, 1997.
8. S. Fenn, M. Gossel, M. Benaissa, and D. Taylor, "On-Line Error Detection for Bit-Serial Multipliers in GF($2^m$)," Journal of Electronic Testing: Theory and Applications, vol. 13, pp. 29–40, 1998.
9. M. Joye, A.K. Lenstra, and J.-J. Quisquater, "Chinese Remaindering Based Cryptosystems in the Presence of Faults," Journal of Cryptology, vol. 12, pp. 241–245, 1999.
10. R. Karri, G. Kuznetsov, and M. Goessel, "Parity-Based Concurrent Error Detection of Substitution-Permutation Network Block Ciphers," in Proc. of CHES 2003, Springer LNCS 2779, pp. 113–124, 2003.
11. Ç.K. Koç and B. Sunar, "Low-Complexity Bit-Parallel Canonical and Normal Basis Multipliers for a Class of Finite Fields," IEEE Trans. Computers, vol. 47, no. 3, pp. 353–356, 1998.
12. P.K. Lala, Fault Tolerant and Fault Testable Hardware Design. Prentice Hall, 1985.
13. C.Y. Lee, "Low Complexity Bit-Parallel Systolic Multiplier Over GF($2^m$) Using Irreducible Trinomials," IEE Proc.-Comput. Digit. Tech., vol. 150, no. 1, pp. 39–42, 2003.
14. C.Y. Lee, "Low-Latency Bit-Parallel Systolic Multiplier for Irreducible $x^m + x^n + 1$ with gcd(m, n) = 1," IEICE Transactions on Fundamentals, vol. E86-A, no. 11, pp. 2844–2852, 2003.
15. C.Y. Lee, E.H. Lu, and L.F. Sun, "Low-Complexity Bit-Parallel Systolic Architecture for Computing $AB^2 + C$ in a Class of Finite Field GF($2^m$)," IEEE Trans. Circuits and Systems II, pp. 519–523, 2001.
16. C.Y. Lee, E.H. Lu, and J.Y. Lee, "Bit-Parallel Systolic Multipliers for GF($2^m$) Fields Defined by All-One and Equally-Spaced Polynomials," IEEE Trans. Computers, vol. 50, no. 5, pp. 385–393, 2001.
17. R. Lidl and H. Niederreiter, Introduction to Finite Fields and Their Applications, New York: Cambridge Univ. Press, 1994.
18. F.J. MacWilliams and N.J.A. Sloane, The Theory of Error-Correcting Codes, Amsterdam: North-Holland, 1977.
19. E.D. Mastrovito, "VLSI Architectures for Multiplication Over Finite Field GF($2^m$)," Applied Algebra, Algebraic Algorithms, and Error-Correcting Codes, in Proc. Sixth Int'l Conf., AAECC-6, T. Mora (ed.), Rome, pp. 297–309, July 1988.
20. A.J. Menezes (ed.), Applications of Finite Fields, Boston: Kluwer Academic, 1993.
21. J.H. Patel and L.Y. Fung, "Concurrent Error Detection in ALU's by Recomputing with Shifted Operands," IEEE Trans. Computers, vol. C-31, no. 7, pp. 589–595, 1982.
22. J.H. Patel and L.Y. Fung, "Concurrent Error Detection in Multiply and Divide Arrays," IEEE Trans. Computers, vol. C-32, no. 4, pp. 417–422, 1983.
23. I.S. Reed and T.K. Truong, "The Use of Finite Fields to Compute Convolutions," IEEE Trans. Information Theory, vol. IT-21, no. 2, pp. 208–213, 1975.
24. A. Reyhani-Masoleh and M.A. Hasan, "Error Detection in Polynomial Basis Multipliers Over Binary Extension Fields," in Proc. of Cryptographic Hardware and Embedded Systems (CHES 2002), LNCS 2523, pp. 515–528, 2003.
25. C.L. Wang and J.L. Lin, "Systolic Array Implementation of Multipliers for GF($2^m$)," IEEE Trans. Circuits and Systems II, vol. 38, pp. 796–800, 1991.
26. C.C. Wang and D. Pei, "A VLSI Design for Computing Exponentiation in GF($2^m$) and its Application to Generate Pseudorandom Number Sequences," IEEE Trans. Computers, vol. 39, no. 2, pp. 258–262, 1990.
27. C.S. Yeh, S. Reed, and T.K. Truong, "Systolic Multipliers for Finite Fields GF($2^m$)," IEEE Trans. Computers, vol. C-33, pp. 357–360, 1984.

Chiou-Yng Lee received the Bachelor's degree (1986) in Medical Engineering and the M.S. degree in Electronic Engineering (1992), both from the Chung Yuan Christian University, Taiwan, and the Ph.D. degree in Electrical Engineering from Chang Gung University, Taiwan, in 2001. Since 1988, he has been a research associate with Chunghwa Telecommunication Laboratory in Taiwan, where he joined the department of project planning. He has taught courses in related fields at Ching Yun University. His research interests include computations in finite fields, error-control coding, signal processing, and digital transmission systems. He is a member of the IEEE and the IEEE Computer Society, and became an honor member of Phi Tao Phi in 2001.

Che Wun Chiou received his B.S. degree in Electronic Engineering from Chung Yuan Christian University in 1982, and the M.S. and Ph.D. degrees in Electrical Engineering from National Cheng Kung University in 1984 and 1989, respectively. From 1990 to 2000, he was with the Chung Shan Institute of Science and Technology in Taiwan. He joined the Department of Electronic Engineering, Ching Yun University in 2000. He is currently Dean of the Division of Continuing Education and Professor of Electronic Engineering at Ching Yun University. His current research interests include fault-tolerant computing, computer arithmetic, parallel processing, and cryptography.

Jim-Min Lin was born on March 5, 1963 in Taipei, Taiwan. He received the B.S. degree in Engineering Science and the M.S. and the Ph.D. degrees in Electrical Engineering, all from National Cheng Kung University, Tainan, Taiwan, in 1985, 1987, and 1992, respectively. Since February 1993, he has been an Associate Professor at the Department of Information Engineering, Feng Chia University, Taichung City, Taiwan. His research interests include Operating Systems, Software Integration/Reuse, Embedded Systems, Software Agent Technology, and Testable Design.