A 54x54-BIT MULTIPLIER WITH A NEW REDUNDANT BINARY BOOTH'S ENCODING

Nurettin Besli and R. G. Deshmukh
Department of Electrical and Computer Engineering
Florida Institute of Technology, Melbourne, FL 32901
(besli, rgd) @ee.fit.edu

Abstract

A fast multiplier with a new Redundant Binary Booth's Encoding (RBBE) for Radix-8 has been developed. The new RBBE enables one to directly generate all Redundant Binary Partial Products (PPs), including the hard multiple of 3 x Multiplicand, with no additional circuitry or time delay. Also, the use of Transmission Gates (TGs) in the Partial Product Selection Circuit permits the rapid generation of PPs. In this work, a word size of 54 bits is chosen to comply with the IEEE double-precision floating-point standard, which consists of one sign bit and 53 bits of fraction. Thus, the RBBE unit for Radix-8 generates 18 Redundant Binary PPs to be accumulated by a Redundant Binary Adder (RBA) tree. Due to the simplicity of the necessary interconnections between the adders, the RBA tree has a regular layout. The output of the RBA tree is in the Redundant Binary (RB) form, which needs to be converted to Standard Binary (SB) form. Although this conversion can be performed by a Carry Look Ahead (CLA) adder, a better conversion circuit has been designed to take advantage of the RB number system. The proposed design contains a converter of a higher speed and a lower transistor count. As a result, the proposed fast multiplier accepts two 54-bit SB numbers and produces a final product as a 108-bit SB number.

Keywords: Computer arithmetic, Booth's algorithm, Signed-Digit numbers, Redundant Binary numbers, and Multiplier.

Proceedings of the 2002 IEEE Canadian Conference on Electrical & Computer Engineering 0-7803-7514-9/02/$17.00 © 2002 IEEE

1. INTRODUCTION

As high-speed computing systems become increasingly in demand for a wide range of scientific applications such as signal processing and computer graphics, faster and more precise arithmetic units, especially multipliers, have gained importance. The need for more accurate results has increased the required bit length of the operation. For example, the multiplication of two double-precision floating-point numbers, both of which consist of one sign bit and 53 bits of fraction according to the IEEE standard, requires a 54x54-bit multiplier. Therefore, a fast 54x54-bit multiplier with a new Redundant Binary Booth's Encoding is proposed in this paper.

Multiplication involves two basic operations: 1) the generation of partial products and 2) their accumulation [1]. Therefore, to design a fast multiplier, either the number of partial products or the time needed to accumulate them should be reduced. Various algorithms for reducing the number of partial products have been proposed, and one of the earliest is Booth's algorithm [2]. In Booth's algorithm, no partial product is generated for a group of consecutive zeroes in the multiplier. For a group of m consecutive ones starting at bit position i, two partial products are generated: the m additions of partial products are replaced by the addition of a partial product of weight $2^{m+i}$ and the subtraction of a partial product of weight $2^{i}$. Booth's algorithm performs the encoding serially. Hence, the Modified Booth's Algorithm [3], which performs the encoding in parallel, was proposed later to design a fast multiplier. In Section 3, the Modified Booth's algorithm with various radices is discussed briefly.

After the partial products are generated, they must be accumulated to obtain the final product. A fast multi-operand adder such as the Wallace tree [4] or the Carry-Save-Adder (CSA) tree using multi-input counters or compressors [5,6] should be employed for high-speed accumulation. However, tree structures generally have very irregular interconnections that complicate the implementation and, more importantly, result in area-inefficient layouts. One alternative is to replace the CSAs by Redundant Binary Signed Digit (RBSD) adders [7,8], which operate on RBSD numbers. Each RBSD digit consists of two bits, a positive bit and a negative bit, to represent three possible digit values (0, ±1). In this encoding, the value of a digit is the negative bit subtracted from the positive bit. Like the CSA, the RBSD adder takes a fixed amount of time to generate a result, independent of the word length, because a carry propagates by at most one bit position. Since a three-input/two-output adder (a 3-2 compressor,


which is also called a CSA) is most commonly used in the Wallace tree, the partial product reduction rate is only 3/2. A reduction rate of 2/1 can be accomplished by using RBSD adders. In addition, the RBSD adder is more suitable for VLSI design because of its regular layout.

The combination of the Modified Booth's encoding and the Redundant Binary Signed-Digit (RBSD) adders has been used in the past to form multipliers [8-11]. However, they have been used as separate units: a Modified Booth's encoding unit generates the intermediate partial products in Standard Binary (SB), and a second unit converts and reduces these into partial products in RBSD to be accumulated by RBSD adders. The proposed multiplier uses a Radix-8 (Booth-3) RBSD Booth's encoding which directly generates the RBSD partial products by taking advantage of the special encoding of RBSD numbers. The new design simplifies both the generation of the hard multiples and the negation of the multiples. Therefore, the use of a high-radix Booth's encoding such as Radix-8 becomes feasible, and a larger reduction in the number of partial products to be summed can be achieved.

The RBSD partial products can be applied directly to the RBA tree, which generates the result in RBSD form. To obtain a final result in SB form, a Carry-Propagate addition is necessary to convert an RBSD result to an SB number. Many special adders have been developed for this conversion based on the principles of Carry-Look-Ahead (CLA), Carry-Select, Carry-Save and Manchester adders [8,12,13]. In this work, a new converter is proposed to take advantage of the RB number system. It is explained in Section 4. Moreover, to improve the time delay and area efficiency of our design, fast gates such as Transmission Gates (TGs), 2-input NAND gates and inverters have been utilized.

In the next section, the RBSD number system is described. In Section 3, the new Radix-8 RBSD Booth's encoding is discussed. In Section 4, the new multiplier architecture is described as a whole. Finally, in Section 5, the conclusions are summarized.

2. REDUNDANT BINARY SIGNED DIGIT NUMBER REPRESENTATION (RBSD) AND RBSD ADDER

The Signed-Digit (SD) number system was first proposed by Avizienis [7] to make it possible to perform carry-free addition. An SD number in radix r is based on a redundant set of signed digits {-a, -(a-1), ..., -1, 0, 1, ..., a-1, a}. The highest digit value a is restricted to $\lceil (r+1)/2 \rceil \le a \le r-1$, which means $r \ge 3$. However, SD with r = 2 can also accomplish a carry-free addition by examining the present and the next lower bits [14]. SD numbers with radix 2 are also called Redundant Binary Signed-Digit (RBSD) numbers, on which our work is based. In the case of r = 2, a = 1, and a redundant set of three digits {0, ±1} is used to represent numbers in this system. It is common to use the form $\bar{x}$ for representing a negative SD digit, where $\bar{x} = -x$. A radix-r SD integer of the form $X = (x_{n-1} x_{n-2} \ldots x_1 x_0)$ has the value given in Eqn. (1).

$X = \sum_{i=0}^{n-1} x_i r^i$    (1)

For example, if r = 2 and $X = (0\ 1\ \bar{1}\ \bar{1})$, then the value of X is as follows:

$X = 0 \cdot 2^3 + 1 \cdot 2^2 + \bar{1} \cdot 2^1 + \bar{1} \cdot 2^0 = 1$    (2)

The same value can also be represented with X = (0 0 0 1), which means that the same number can be represented by more than one digit combination. Because of this, the RBSD number system carries the designation Redundant.

In the RBSD representation, the three possible digits {0, ±1} can be encoded by using 2 bits. The two best-known encodings are 1) Sign-and-Magnitude and 2) Positive-and-Negative representations. In the Sign-and-Magnitude encoding, as in standard Sign-and-Magnitude, the most significant of the two bits is the sign and the other is the value. For example, $\bar{1}$ = (1 1) and 1 = (0 1). However, in the proposed work, the focus is on the Positive-and-Negative encoding, which takes the most significant of the two bits as the positive bit ($x^+$) and the other as the negative bit ($x^-$). Therefore, the value of each digit is calculated according to Eqn. (3), and all possible combinations are listed in Table 1.

$x = x^+ - x^-$    (3)

Table 1. RBSD encoding

x+   x-   Digit value
0    0     0
0    1    -1
1    0    +1
1    1     0

To accomplish a carry-propagate-free addition with RBSD numbers, a Redundant Binary Adder (RBA) is used. The RBA accepts two RBSD numbers and generates one RBSD result. In other words, the RBA reduces 4 inputs to 2 outputs, giving the same reduction rate as a 4-2 compressor. Various Redundant Binary Adders (RBAs) have been designed for both encodings [7,8,10]. In this work, the RBA proposed by Makino et al. [8] is used. This RBA is optimized for speed and area efficiency by employing Transmission Gates, as shown in Fig. 1.
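The number representation of this section can be illustrated with a minimal Python sketch. It assumes the Positive-and-Negative encoding of Eqn. (3) and the value definition of Eqn. (1); the function names are hypothetical and chosen only for illustration, and the adder circuit itself is not modeled.

```python
# Illustrative sketch only: helper names are assumptions, not from the paper.

def digit_value(x_plus, x_minus):
    """Digit value in Positive-and-Negative encoding, per Eqn. (3)."""
    return x_plus - x_minus


def sd_value(digits, radix=2):
    """Value of an SD number per Eqn. (1); digits are most significant first."""
    value = 0
    for d in digits:
        value = value * radix + d
    return value


def to_bit_pairs(digits):
    """Encode digits in {-1, 0, 1} as (x_plus, x_minus) pairs.

    Zero is encoded as (0, 0) here; (1, 1) also has the value zero.
    """
    table = {1: (1, 0), 0: (0, 0), -1: (0, 1)}
    return [table[d] for d in digits]


a = [0, 1, -1, -1]   # the example of Eqn. (2)
b = [0, 0, 0, 1]     # a second representation of the same value
print(sd_value(a), sd_value(b))
print(to_bit_pairs(a))
```

Both digit strings evaluate to the same integer, which is the redundancy the section describes.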

Figure 1. RBA CMOS circuit diagram [8]

3. GENERATING PARTIAL PRODUCTS

3.1 Modified Booth's Algorithm

The Modified Booth's Algorithm [3,6], which performs the encoding in parallel, divides the multiplier into overlapping groups of a predefined number of bits and selects exactly one partial product (PP) for each group from a specific set of multiples of the multiplicand. Generating a smaller number of partial products to be summed for the final result yields a faster multiplication and a smaller area requirement.

The most common Booth's encoding is the Radix-4 Booth's encoding, due to the ease and simplicity of generating the required multiples of the multiplicand (M), which are selected from the set {±2M, ±M, 0}. All of these multiples are obtainable directly by a simple wire connection to the multiplicand. The Radix-4 Booth's encoding, which is also called Booth-2 to indicate the reduction rate of the number of partial products, is performed by dividing the multiplier into groups of 2 bits with an overlapping third bit from the lower group. The encoding occurs in parallel because the selection of the multiples depends entirely on the appropriate three bits.

Booth's algorithm is not constrained to Radix-4 [15]. It is possible to use a radix greater than 4 with a correspondingly larger reduction in the number of partial products. In Radix-8 Booth's encoding (Booth-3), the multiplier is partitioned into groups of 3 bits with an overlapping fourth bit. Each group is decoded in parallel to select a certain partial product from the set {±4M, ±3M, ±2M, ±M, 0}, as shown in Table 2.

Table 2. The Booth's Encoding for Radix-8 (Booth-3)

A drawback of this algorithm is the complexity of the Booth decoding and the PP selection logic. In addition, the 3M multiple, which is referred to as a hard multiple, cannot be obtained by simple shifting and/or complementation; a full Carry-Propagate addition (CPA) is required instead. Because the generation of all the partial products cannot be guaranteed until the 3M multiple is produced, the latency of the multiplier increases.
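The Radix-8 grouping described in Section 3.1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the decoder of the paper: the multiplier is treated as a two's-complement word whose width is a multiple of 3, and the function name `booth3_digits` is hypothetical.

```python
# Illustrative sketch only: assumes a two's-complement multiplier whose
# width is a multiple of 3; the function name is an assumption.

def booth3_digits(y, nbits):
    """Radix-8 Booth digits of y, least significant group first.

    Each digit lies in -4..4 and selects a multiple of the multiplicand.
    """
    assert nbits % 3 == 0
    y &= (1 << nbits) - 1          # keep the low nbits of the word
    digits = []
    prev = 0                       # overlapping bit below bit 0
    for i in range(0, nbits, 3):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        b2 = (y >> (i + 2)) & 1
        digits.append(-4 * b2 + 2 * b1 + b0 + prev)
        prev = b2                  # top bit overlaps into the next group
    return digits


digits = booth3_digits(74, 9)
print(digits)
print(sum(d * 8**i for i, d in enumerate(digits)))
print(len(booth3_digits(0, 54)))   # number of groups for a 54-bit multiplier
```

For a 54-bit multiplier the loop produces 18 groups, which matches the 18 partial products quoted in the paper.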

3.2. A Novel Redundant Binary Booth's Encoding (RBBE)

As mentioned earlier, the use of the Modified Booth's encoding for Radix-8 and over is not feasible due to the difficulties in generating hard multiples and the complex decoding logic. Hard multiples cannot be generated by just shifting or connecting wires in SB form and require a Carry-Propagate addition. Also, all negative partial products must be sign-extended, which means that additional hardware and routing will be needed regardless of the radix value. The proposed RBSD Booth's encoding design presents a way to overcome these limitations.

The RBSD Booth's encoding uses the same encoding table as the Modified Booth's encoding, so both must generate the same multiples. However, the definition of RBSD in Eqn. (3) can be utilized to generate multiples, especially hard multiples. As seen from Table 3, a positive non-hard multiple can be generated by connecting the positive bits ($x^+$) of the RBSD partial product to this multiple and the negative bits ($x^-$) to zero (ground). A negative non-hard multiple can be generated by connecting the negative bits ($x^-$) of the RBSD partial product to this multiple and the positive bits ($x^+$) to zero.


Table 3. The RBSD Booth's Encoding for Radix-8 (Booth-3)

Group bits   Overlap bit   X+     X-
...
011          0             +4M    +M
011          1             +4M    0
...
111          0             0      +M
111          1             0      0
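The selection rule of Section 3.2 can be sketched in Python for an unsigned multiplicand. This is an illustrative model under stated assumptions, not the TG-based selection circuit: the function names are hypothetical, words are modeled as plain integers, and the positive and negative words are combined through Eqn. (3) bit by bit.

```python
# Illustrative sketch only: unsigned multiplicand, integer words instead of
# wires; function names are assumptions, not from the paper.

def rbsd_partial_product(digit, m):
    """Return (x_plus, x_minus) such that x_plus - x_minus == digit * m.

    Only shifted copies of m are used, so no carry-propagate addition
    is needed, not even for the hard multiple 3M.
    """
    mag = abs(digit)
    if mag == 3:
        pos, neg = m << 2, m       # hard multiple: 3M = 4M - 1M
    else:
        pos, neg = mag * m, 0      # 0, M, 2M, 4M are wired shifts of m
    # A negative multiple is obtained by swapping the two words.
    return (pos, neg) if digit >= 0 else (neg, pos)


def rbsd_digits(x_plus, x_minus, width):
    """Digit string (most significant first) of the RBSD word, per Eqn. (3)."""
    return [((x_plus >> i) & 1) - ((x_minus >> i) & 1)
            for i in range(width - 1, -1, -1)]


pos, neg = rbsd_partial_product(3, 74)   # M = 74
print(pos, neg, pos - neg)
print(rbsd_digits(pos, neg, 9))
```

With M = 74 the sketch selects 4M = 296 for the positive word and 1M = 74 for the negative word, and their difference is 3M = 222.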

Although these connections are correct for unsigned multiplicands, when the numbers are in 2's complement form the most significant bit must be connected to the most significant negative bit ($x^-$) of the RBSD partial product for positive multiples and to the positive bit ($x^+$) for negative multiples. In the case of a hard multiple, the two proper multiples are connected to the bits of the RBSD partial product. As an example, the generation of the hard multiple 3M by using the RBSD representation is shown in Fig. 2.

$M = (0\,0\,1\,0\,0\,1\,0\,1\,0)_2 = (074)_{10}$
$3M = (0\,1\,1\,0\,1\,1\,1\,1\,0)_2 = (222)_{10}$
$X = 3M = 4M - 1M$, with $X^+ = 4M$ and $X^- = 1M$
$X^+ = 4M = (1\,0\,0\,1\,0\,1\,0\,0\,0)_2 = (296)_{10}$
$X^- = 1M = (0\,0\,1\,0\,0\,1\,0\,1\,0)_2 = (074)_{10}$
$X = 3M = (1\,0\,\bar{1}\,1\,0\,0\,0\,\bar{1}\,0)_{RBSD} = (222)_{10}$
$X = (256 \cdot 1 + 32 \cdot 1) - (64 \cdot 1 + 2 \cdot 1) = (222)_{10}$

Figure 2. An example of generating a RBSD hard multiple

The negation of an RBSD number can be done according to Eqn. (4) or Eqn. (5) as follows:

$-x = -(x^+ - x^-) \Rightarrow -x = (x^- - x^+)$    (4)

$-x = ((-x^+) - (-x^-)) = ((-x^+ + 1) - (-x^- + 1)) \Rightarrow -x = (\overline{x^+} - \overline{x^-})$    (5)

The first one involves cross connection of the positive and negative bits based on Eqn. (4). The second one requires complementation of all bits based on Eqn. (5). Either one could be used to simplify a design. Moreover, because negative digit values exist, there is no need for sign extension of the RBSD negative multiples in the partial product accumulation, which reduces the hardware.

Figure 3. RBSD Booth-3 Partial Product Generator

The proposed RBSD Booth's decoder does not add any hardware because the same number of multiple selection signals is required as in the SB Modified Booth's decoder. Instead of a 3M-selection signal, a 1M-selection signal for the negative part of the RBSD number has to be generated, as listed in Table 3. The new RBSD Booth-3 decoder circuit is optimized for transistor count and speed by employing TGs and fast EXOR gates. In addition, the inputs from the multiplier are EXORed in overlapping pairs. This reduces the number of combinations in the Partial Product Selection Table by half and simplifies the decoding circuit. Also, to achieve strong logic levels at a low supply voltage, TGs with complementary pass-transistors are used instead of nMOS pass-transistors.

4. RBSD BOOTH MULTIPLIER (RBSD-BM)

As mentioned earlier, multipliers essentially consist of the PP Generation Unit, the Carry-Free Addition Unit for the accumulation of PPs, and the Carry-Propagate Addition Unit for the summation of the intermediate results or the conversion. The block diagram of the new 54x54-bit RBSD Booth-3 multiplier is shown in Fig. 4. The RBSD Booth-3 PP generation unit produces 18 RBSD partial products to be accumulated by a Redundant Binary Adder tree. The depth of the tree is five and the output of the tree is a 108-bit RBSD number. Every level of the tree retires a certain number of RBSD digits starting from the least significant digit. As the correct RBSD digits appear, they can be converted to SB bits in parallel with the accumulation process. This leads to a reduction in the conversion time.
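Two details of the multiplier organization in Section 4 can be checked with a short Python sketch: the five-level depth of the adder tree for 18 partial products, and the borrow rule used by the RBSD to SB converter (+1 stops a borrow, -1 generates one, 0 propagates the incoming one). The function names are hypothetical, the tree model assumes that an unpaired operand is passed unchanged to the next level, and the converter is modeled as a serial loop rather than as the multiplexer circuit of the paper.

```python
# Illustrative sketch only: names and the serial loop are assumptions.

def rba_tree_levels(num_operands):
    """Levels needed when each RBA merges two RBSD operands into one."""
    levels = 0
    while num_operands > 1:
        num_operands = (num_operands + 1) // 2   # an odd operand is passed up
        levels += 1
    return levels


def rbsd_to_sb(digits_lsb_first):
    """Convert RBSD digits (least significant first) to SB bits.

    Returns the bit list and the final borrow; a final borrow of 1 would
    indicate a negative value.
    """
    bits = []
    borrow = 0
    for d in digits_lsb_first:
        bits.append(abs(d) ^ borrow)   # result bit of (digit - borrow)
        if d == 1:
            borrow = 0                 # +1 stops the borrow
        elif d == -1:
            borrow = 1                 # -1 generates a borrow
        # d == 0: the incoming borrow propagates unchanged
    return bits, borrow


print(rba_tree_levels(18))             # 18 -> 9 -> 5 -> 3 -> 2 -> 1

# RBSD form of 222 from Fig. 2, written least significant digit first.
bits, borrow = rbsd_to_sb([0, -1, 0, 0, 0, 1, -1, 0, 1])
print(sum(b << i for i, b in enumerate(bits)), borrow)
```

The serial loop only demonstrates that the borrow rule gives the correct binary result; the speed advantage claimed in the paper comes from how the borrow signals are generated in hardware, which this sketch does not model.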

Figure 4. RBSD Booth-3 Multiplier (block diagram: 54-bit multiplicand, RBA tree, RBSD product, converter, 108-bit product)

The RBSD numbers can be converted to Standard Binary by a Carry-Look-Ahead adder [8,16,17] or by other Carry-Propagate adders. The proposed converter enables a borrow signal to be generated directly from the RBSD digits, unlike a CLA, which requires Propagate and Generate signals. If a digit is positive one (+1), the propagation of the borrow is stopped. If it is negative one (-1), a borrow is generated. Lastly, if it is zero (0), the borrow signal from the next lower digit is propagated. The converter circuit based on this idea has been built for the new multiplier. Fig. 5 shows the RBSD to SB converter for 8 bits.

Figure 5. 8-bit RBSD to SB Converter

In Fig. 5, multiplexers with 6 TGs, 2-input NOR gates and inverters are used. The critical path has four multiplexers. However, they are not connected in series in such a way that the output of one TG drives the source node of the next TG. This is important because such a series connection degrades the propagating signal and results in a slower design.

5. CONCLUSIONS

A new 54x54-bit multiplier with RBSD Booth-3 encoding has been designed. The proposed system consists of the RBSD Booth's partial product generator, the Redundant Binary Adder tree and the Carry-Propagate adder to find the final result in standard binary form. The multiplier with the RBSD Booth-3 encoding has three essential advantages over the Modified Booth's encoding: 1) The RBSD Booth-3 encoding does not need to generate any hard multiples. 2) In the standard Booth's encoding, the negation of multiples and the sign extension of the partial products for accumulation are necessary, which requires extra hardware and time delay, whereas the new design requires neither. 3) With the use of the Redundant Binary Adder tree, a more regular layout for the accumulation of partial products is possible.

References

[1] I. Koren, Computer Arithmetic Algorithms, Prentice Hall, 1993.
[2] A. D. Booth, "A Signed Binary Multiplication Technique," Quarterly J. Mech. Appl. Math., vol. 4, part 2, pp. 236-240, 1951.
[3] O. L. MacSorley, "High-Speed Arithmetic in Binary Computers," Proc. of the IRE, vol. 49(1), pp. 67-91, Jan. 1961.
[4] C. S. Wallace, "A Suggestion for a Fast Multiplier," IEEE Trans. Electron. Comp., vol. EC-13, pp. 14-17, Feb. 1964.
[5] L. Dadda, "Some schemes for parallel multipliers," Alta Frequenza, vol. 34, pp. 343-356, 1965.
[6] M. Mehta, V. Parmar and E. Swartzlander, "High-speed multiplier design using multi-input counter and compressor circuits," IEEE Proc. of the 10th Symp. on Computer Arithmetic (ARITH10), pp. 43-50, 1991.
[7] A. Avizienis, "Signed Digit Number Representation for Fast Parallel Arithmetic," IRE Trans. Electron. Computers, vol. EC-10, pp. 389-400, Sept. 1961.
[8] H. Makino, Y. Nakase, H. Suzuki, H. Morinaka, H. Shinohara and K. Mashiko, "An 8.8-ns 54x54-Bit Multiplier with High Speed Redundant Binary Architecture," IEEE J. Solid-State Circuits, vol. 31, No. 6, pp. 773-783, June 1996.

[9] T. N. Rajashekhara and O. Kal, "Fast multiplier design using redundant signed-digit numbers," Int. J. Electronics, vol. 69, No. 3, pp. 359-368, 1990.
[10] S. Kuninobu, T. Nishiyama, H. Edamatsu, T. Taniguchi, and N. Takagi, "Design of high speed MOS multiplier and divider using redundant binary representation," IEEE Proc. of the 8th Symp. on Computer Arithmetic (ARITH8), pp. 80-86, May 1987.
[11] X. Huang, W. Liu, and B. W. Y. Wei, "A High-Performance CMOS Redundant Binary Multiplication-and-Accumulation (MAC) Unit," IEEE Trans. Circuits and Syst. I, vol. 41, pp. 33-39, Jan. 1994.
[12] T. N. Rajashekhara and O. Kal, "Conversion from signed-digit to radix complement representation," Int. J. Electronics, vol. 69, No. 6, pp. 717-721, 1990.
[13] S.-M. Yen, C.-S. Laih, C.-H. Chen and J.-Y. Lee, "An Efficient Redundant-Binary Number to Binary Number Converter," IEEE J. Solid-State Circuits, vol. 27, No. 1, pp. 109-112, January 1992.
[14] D. S. Phatak and I. Koren, "Hybrid Signed-Digit Number Systems: A Unified Framework for Redundant Number Representations With Bounded Carry Propagation Chains," IEEE Trans. Computers, vol. 43, No. 8, pp. 880-891, August 1994.
[15] G. Bewick and M. J. Flynn, "Binary Multiplication Using Partially Redundant Multiples," Technical Report CSL-TR-92-528, Stanford University, June 1992.
[16] N. Ohkubo, T. Shinbo, T. Yamanaka, A. Shimizu, K. Sasaki and Y. Nakagome, "A 4.4-ns 54x54-b Multiplier Using Pass-Transistor Multiplexer," IEEE J. Solid-State Circuits, vol. 30, No. 3, pp. 251-257, March 1995.
[17] K.-W. Shin, B.-S. Song and K. Bacrania, "A 200-MHz Complex Number Multiplier Using Redundant Binary Arithmetic," IEEE J. Solid-State Circuits, vol. 33, No. 6, pp. 904-909, June 1998.