
A Novel Redundant Binary Signed-Digit (RBSD) Booth's Encoding

Nurettin Besli; Florida Institute of Technology; Melbourne, FL
R. G. Deshmukh; Florida Institute of Technology; Melbourne, FL

Keywords: Computer arithmetic, Fast adders, Multipliers, Booth's algorithm, Redundant binary and/or Signed-digit numbers.

ABSTRACT

Signed-Digit number representation has been used to build fast multipliers because it allows carry-free addition and a more regular layout. Booth's encoding and its variations are also employed to design fast multipliers by reducing the number of partial products in the multiplication. However, Booth's encoding is not efficient above Radix-4 because of the difficulty of generating the necessary hard multiples, such as 3 x Multiplicand, and the time delay caused by the decoding circuits. To date, the literature reports the use of the Modified Booth's Encoding (MBE) and Redundant Binary Signed-Digit (RBSD) adders together in multiplier designs. However, they have been used as separate units: the MBE unit generates the intermediate partial products (PP) in Standard Binary (SB) form, and a second unit converts and reduces these PPs into RBSD partial products to be accumulated by RBSD adders. This paper presents a novel Redundant Binary Signed-Digit Booth's Encoding (RBBE) for a multiplier, which directly generates the RBSD partial products and allows the use of Booth's encoding for Radix-4 and Radix-8 without the need to generate any hard multiples. For RBBE above Radix-8, the number of hard multiples is significantly reduced. Moreover, negation in RBSD requires only crossing the wires of the two bits of each digit and needs neither a carry-propagate operation nor sign extension. Therefore, the generation of negative multiples and the multiplication of 2's complement numbers in RBSD form can be done without additional hardware. This leads to a faster and smaller multiplier. An RBSD adder tree is used to accumulate the RBSD partial products, and the result is in RBSD form. Although a carry-propagate addition is necessary for the conversion from RBSD to SB, this is not a disadvantage relative to an SB multiplier, because the accumulation of SB partial products also requires a carry-propagate addition to obtain the final result from the intermediate Sum and Carry at the last stage.

1. INTRODUCTION

0-7803-7252-2/02/$10.00 © 2002 IEEE

With the constant growth of computer applications such as signal processing and computer graphics, fast arithmetic units, especially multipliers, are increasingly required. Advances in VLSI technology have given designers the freedom to integrate many complex components, which was not possible in the past. Various high-speed multipliers have been proposed and realized [1-11 and 14-17]. Multiplication involves two basic operations: 1) the generation of partial products and 2) their accumulation [1]. Therefore, to design a fast multiplier, either the number of partial products or the time needed to accumulate them should be reduced.

Various algorithms for reducing the number of partial products have been proposed. One of the earliest is Booth's algorithm [2]. In Booth's algorithm, no partial product is generated for a group of consecutive zeroes in the multiplier. For a group of m consecutive ones starting at bit position i, only two partial products are generated: the m additions of partial products are replaced by the addition of a partial product of weight $2^{i+m}$ and the subtraction of a partial product of weight $2^{i}$. Booth's algorithm performs the encoding serially. Hence, the Modified Booth's Algorithm [3], which performs the encoding in parallel, was proposed and implemented to design a fast multiplier. The modified Booth's algorithms with various radixes are discussed briefly in Section 3.

After the partial products are generated, they must be accumulated to obtain the final product. A fast multi-operand adder, such as the Wallace tree [4] or the Carry-Save-Adder (CSA) tree using multi-input counters and compressors [5,6], should be employed for high-speed accumulation. However, tree structures generally have very irregular interconnections that complicate the implementation and, more importantly, result in area-inefficient layouts. One alternative is to replace the CSAs by Redundant Binary Signed-Digit (RBSD) adders [7,8], which operate on RBSD numbers. Each RBSD digit consists of two bits, a positive bit and a negative bit, to represent the three possible digit values {0, ±1}.
In this encoding, the value of a digit is the positive bit minus the negative bit. Like the CSA, the RBSD adder takes a finite amount of time to generate the result because a carry can propagate at most one position. Since a three-input/two-output adder (the 3-2 compressor, which is also called a CSA) is most commonly used in the Wallace tree, its partial product reduction rate is only 3/2. A reduction rate of 2/1 can be achieved by using RBSD adders. In addition, the RBSD adder is more suitable for VLSI design because of its regular layout.
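The string-of-ones property of Booth's original algorithm described above can be illustrated with a short Python sketch. This is only a behavioural model under stated assumptions: the function names `booth_radix2` and `digits_value` are hypothetical and do not come from the paper, and the multiplier is treated as an unsigned n-bit number with one zero appended on the left.

```python
def booth_radix2(multiplier: int, n: int) -> list[int]:
    """Radix-2 Booth digits (least significant first), each in {-1, 0, +1}.

    Assumption: `multiplier` is an unsigned n-bit value; one extra zero bit
    is appended on the left so that the last digit is non-negative.
    """
    bits = [(multiplier >> i) & 1 for i in range(n)] + [0]
    digits = []
    prev = 0  # implicit zero to the right of the least significant bit
    for b in bits:
        digits.append(prev - b)  # -1 at the start of a run of ones, +1 just past its end
        prev = b
    return digits


def digits_value(digits: list[int]) -> int:
    """Weighted sum of the digits, digit i having weight 2**i."""
    return sum(d * (1 << i) for i, d in enumerate(digits))


# A run of m = 4 ones starting at bit i = 2: 0111100 = 2**6 - 2**2 = 60.
run = booth_radix2(0b0111100, 7)
```

In this example only two digits of `run` are non-zero (a -1 at position 2 and a +1 at position 6), which corresponds to one subtraction and one addition instead of four additions.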

Proceedings IEEE Southeastcon 2002 426

The combination of the Modified Booth's encoding and Redundant Binary Signed-Digit (RBSD) adders has been used in the past to form multipliers [8-11]. However, they have been used as separate units: the Modified Booth's encoding unit generates the intermediate partial products in SB, and a second unit converts and reduces these into partial products in RBSD to be accumulated by RBSD adders. The proposed RBSD Booth's encoding directly generates the RBSD partial products by taking advantage of the special encoding of RBSD numbers. The new design simplifies both the generation of the hard multiples and the negation of the multiples. Therefore, the use of high-radix Booth's encoding such as Radix-8 becomes feasible, and a further reduction in the number of partial products to be summed can be achieved. The RBSD partial products can be applied directly to an RBA tree, which generates the result in RBSD form. To obtain a final result in SB form, a carry-propagate addition is necessary to convert the RBSD result to an SB number. A fast conversion can be done by using one of the Carry-Look-Ahead (CLA) adders reported in Reference [8] or a special converter for RBSD [12, 13]. The proposed RBSD Booth's encoding will be discussed in greater detail in Section 3.

In the next section, the RBSD number system will be described. In Section 3, the Modified Booth's encoding for various radix values will be introduced first, and then the new RBSD Booth's encoding will be discussed. In Section 4, the new multiplier architecture will be described as a whole. Finally, in Section 5, the conclusions are summarized.

2. REDUNDANT BINARY SIGNED DIGIT NUMBER REPRESENTATION

The Signed-Digit (SD) number system was first proposed by Avizienis [7] to make it possible to perform carry-free addition. An SD number system in radix-r is based on a redundant set of signed digits:

$\{-a, -(a-1), \ldots, -1, 0, 1, \ldots, a-1, a\}$

The highest digit value a is restricted to $\lceil (r+1)/2 \rceil \le a \le r-1$, which means $r \ge 3$. However, SD with r = 2 can also accomplish carry-free addition by examining the present and the next lower bits [14]. SD numbers with radix 2 are also called Redundant Binary Signed-Digit (RBSD) numbers, on which our work is based. In the case of r = 2, a = 1, and a redundant set of three digits {0, ±1} is used to represent numbers in this system. It is common to use $\bar{x}$ to represent a negative SD digit, where $\bar{x} = -x$. A radix-r SD integer of the form $X = (x_{n-1}, x_{n-2}, \ldots, x_0)_r$ has the value given in Eqn. (1).

$X = \sum_{i=0}^{n-1} x_i r^i$    (1)

For example, if r = 2 and $X = (0\,1\,\bar{1}\,\bar{1})$, then the value of X is as follows:

$X = 0 \cdot 2^3 + 1 \cdot 2^2 + \bar{1} \cdot 2^1 + \bar{1} \cdot 2^0 = 1$    (2)
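The value formula in Eqn. (1) and the redundancy of the digit set can be checked with a few lines of Python. This is a minimal sketch; the function name `sd_value` and the most-significant-digit-first list layout are assumptions made for illustration only.

```python
from itertools import product


def sd_value(digits: list[int], radix: int = 2) -> int:
    """Evaluate Eqn. (1) for a signed-digit number given most significant digit first."""
    total = 0
    for d in digits:
        total = total * radix + d  # Horner evaluation of sum(x_i * r**i)
    return total


# The worked example: (0 1 -1 -1) in radix 2 has the value 4 - 2 - 1 = 1.
example = sd_value([0, 1, -1, -1])

# Redundancy: every 4-digit RBSD pattern over {-1, 0, 1} whose value is 1.
ones = [list(p) for p in product((-1, 0, 1), repeat=4) if sd_value(list(p)) == 1]
```

Here `ones` collects all four-digit RBSD patterns with value 1, which shows that a value generally has more than one representation in this number system.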

[Table 1: encoding of an RBSD digit X by its positive bit (X+) and negative bit (X-)]

Table 2: The Booth's Encoding for Radix-4 (Booth-2)
[Partial Product Selection Table; columns: Multiplier Bits, Multiple]

Table 3: The Booth's Encoding for Radix-8 (Booth-3)
[Partial Product Selection Table; columns: Multiplier Bits, Multiple]

The use of the Booth-2 algorithm reduces the number of partial products from n to $\lceil (n+1)/2 \rceil$, where n is the length of the operand. The additional 1 in the expression is necessary to ensure that the last partial product is a positive multiple of the multiplicand for unsigned numbers, which is achieved by adding at least one zero to the left of the multiplier. This is not required for signed numbers. All of the multiples required by Booth-2 are obtainable directly by a simple (shifting) wire connection to the multiplicand, as shown in Figure 1. In addition, negative multiples are obtained in 2's complement form by a bit-by-bit complementation of the multiplicand and the insertion of a one at the least significant position of the partial product. Also, each negative multiple requires sign extension up to the leftmost position in the partial product parallelogram.

Figure 1: Booth-2 Partial Product (PP) Generator

The reduction in the number of partial products does not come without cost. Additional hardware is needed for the PP generators, as seen in Figure 1. It is essential to take into account that the benefits from the reduction in the number of partial products may be outweighed by the additional time required to generate the partial products and by the increase in the size of the Booth decoders, especially for small multipliers.

3.1.2. Radix-8 Booth's Encoding (Booth-3)

Booth's algorithm is not constrained to radix-4. It is possible to use a radix higher than 4 with a correspondingly larger reduction in the number of partial products. In radix-8 Booth's encoding (Booth-3), the multiplier is partitioned into groups of 3 bits with an overlapping fourth bit. Each group is decoded in parallel to select a certain partial product from the set {±4M, ±3M, ±2M, ±M, 0}, as shown in Table 3.

A drawback of this algorithm is the need to generate the 3M multiple, which is referred to as a hard multiple: it cannot be obtained by simple shifting and/or complementation, but requires a full carry-propagate addition (CPA). This CPA increases the latency of the multiplier because the generation of all the partial products cannot be guaranteed until the 3M multiple is produced. However, the generation of the 3M multiple can be done in parallel, within the time required to decode the multiplier to select the multiples used in the generation of the partial products. The 3M adder is a special-purpose adder that is designed to add (M + 2M) and allows for circuit optimizations. In addition, the complexity of the partial product generator increases, as shown in Figure 2.

Figure 2: Booth-3 Partial Product Generator

Using this algorithm, the number of partial products is $\lceil (n+1)/3 \rceil$. It is also possible to further reduce the number of partial products by using a higher radix; however, the number of hard multiples increases rapidly with the value of the radix. For example, Booth-4 requires the generation of the multiples {±8M, ±7M, ±6M, ±5M, ±4M, ±3M, ±2M, ±M, 0}. The hard multiples 3M (6M is obtained by shifting 3M), 5M, and 7M each require a separate adder. In addition to the exponential increase in the number of adders required, the Booth-4 encoders are more complex. Therefore, Booth-4 and higher are not cost effective.
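The group decoding used by Booth-2 and Booth-3 can be sketched behaviourally in Python. The function below is an illustrative assumption rather than the paper's circuit: the name `booth_digits` is hypothetical, the multiplier is taken to be an unsigned n-bit number, and each returned digit d stands for the selected multiple d·M.

```python
import math


def booth_digits(multiplier: int, n: int, k: int) -> list[int]:
    """Radix-2**k modified Booth digits, least significant digit first.

    Assumption: `multiplier` is an unsigned n-bit value. Each group has k bits
    plus one overlapping bit from the group below, and there are
    ceil((n + 1) / k) groups, so the last digit is never negative.
    """
    count = math.ceil((n + 1) / k)
    x = multiplier << 1  # implicit zero to the right of the least significant bit
    digits = []
    for j in range(count):
        group = (x >> (k * j)) & ((1 << (k + 1)) - 1)
        body, overlap = group >> 1, group & 1
        if body >= 1 << (k - 1):  # top bit of the group carries negative weight
            body -= 1 << k
        digits.append(body + overlap)
    return digits


# Booth-2 (k = 2) on the 4-bit multiplier 0110: digits -2, +2, 0, i.e. 8 - 2 = 6.
booth2_example = booth_digits(0b0110, 4, 2)
# Booth-3 (k = 3) on an 8-bit multiplier needs ceil(9 / 3) = 3 partial products.
booth3_example = booth_digits(0b10110111, 8, 3)
```

With k = 2 every digit lies in {-2, ..., +2}, so only shifted multiples are selected. With k = 3 the digits ±3 appear, which is where the hard multiple 3M comes from.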

3.2. A Novel Redundant Binary Booth's Encoding (RBBE)

As mentioned earlier, the use of the Modified Booth's encoding for Radix-8 and above is not feasible because of the difficulty of generating hard multiples and the complexity of the decoding logic. In SB form, hard multiples cannot be generated by just shifting or connecting wires; they require a carry-propagate addition. Also, all negative multiples require sign extension, which needs extra hardware and routing regardless of the radix value. The proposed RBSD Booth's encoding design presents a way to overcome these limitations.

Table 4: The RBSD Booth's Multiple Encoding

The RBSD Booth's encoding uses the same encoding table as the Modified Booth's encoding, so both must generate the same multiples. However, we can benefit from the definition of RBSD in Eqn. (3) to generate multiples, especially hard multiples. As seen from Table 4, a positive multiple which is not a hard multiple can be generated by connecting the positive bits ($x^+$) of the RBSD partial product to this multiple and the negative bits ($x^-$) to logic 0. Likewise, a negative multiple which is not a hard multiple can be generated by connecting the negative bits ($x^-$) of the RBSD partial product to this multiple and the positive bits ($x^+$) to logic 0. Although these connections are correct for unsigned multiplicands, when the numbers are in 2's complement form the most significant bit must be connected to the most significant negative ($x^-$) bit of the RBSD number for positive multiples and to the most significant positive ($x^+$) bit for negative multiples. In the case of a hard multiple, the proper two multiples are connected to the two bits of each digit of the RBSD partial product. As an example, Figure 3 shows how to generate the hard multiple 3M by using the RBSD representation.

Multiplicand (M) $= (0\,0\,1\,0\,0\,1\,0\,1\,0)_2 = (074)_{10}$
$3M = (0\,1\,1\,0\,1\,1\,1\,1\,0)_2 = (222)_{10}$
$X = 3M = 4M - 1M$, with $X^+ = 4M$ and $X^- = 1M$
$X^+ = 4M = (1\,0\,0\,1\,0\,1\,0\,0\,0)_2 = (296)_{10}$
$X^- = 1M = (0\,0\,1\,0\,0\,1\,0\,1\,0)_2 = (074)_{10}$
$X = 3M = (1\,0\,\bar{1}\,1\,0\,0\,0\,\bar{1}\,0)_{RBSD} = (222)_{10}$
$X = (256 \cdot 1 + 32 \cdot 1) - (64 \cdot 1 + 2 \cdot 1) = (222)_{10}$

Figure 3: An example of generating a RBSD hard multiple

The negation of an RBSD number can be done in two ways, using Eqn. (4) or (5), as follows:

$-X = -(X^+ - X^-) = (X^- - X^+)$    (4)

$-X = ((-X^+) - (-X^-)) = ((\overline{X^+} + 1) - (\overline{X^-} + 1)) = (\overline{X^+} - \overline{X^-})$    (5)

The first one involves only cross wiring from positive to negative bits and vice versa, and the second one requires only complementation of all bits. Either one could be used to simplify a design. The way a negative number is represented in RBSD eliminates the need for sign extension of the negative multiples in the partial product accumulation, resulting in a reduction of hardware.

3.2.1. Radix-4 Redundant Binary Booth's Encoding

Although the Modified Booth-2 encoding does not require any hard multiples, as seen from Table 2, the negation of 1M and 2M and the sign extension of negative partial products are necessary. In our design, the negation is a simple wire arrangement as explained earlier, and there is no need for sign extension of the negative multiples. Also, the generation of the RBSD partial products (PP) does not require any extra hardware, as shown in Figure 4, compared to the Modified Booth-2 Partial Product Generator logic in Figure 1.

Figure 4: RBSD Booth-2 Partial Product Generator

3.2.2. Radix-8 Redundant Binary Booth's Encoding

The Modified Booth-3 encoding requires the hard multiple 3M, as seen from Table 3, whereas the RBSD Booth-3 encoding does not require any hard multiples for the positive or negative multiple selection, as listed in Table 5.

Table 5: The RBSD Booth's Encoding for Radix-8 (Booth-3)

The hard multiple 3M has been replaced by the RBSD 3M, generated by only simple wire-crossing of the proper multiples as shown in Figure 5. In this case, the +4M and +1M multiples are connected to the positive and the negative bits of the RBSD number, respectively, to form 3M in RBSD representation.
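The wiring idea for RBSD multiples and for negation can be modelled with a pair of bit vectors. The Python sketch below is a behavioural illustration under stated assumptions: the class `RBSD`, the function `rbsd_multiple`, and the wiring table inside it are hypothetical names and choices (the paper's Table 4 is not reproduced here), and the multiplicand is taken to be unsigned.

```python
from typing import NamedTuple


class RBSD(NamedTuple):
    """An RBSD number held as two bit vectors; value = pos - neg."""
    pos: int  # positive bits x+
    neg: int  # negative bits x-

    def value(self) -> int:
        return self.pos - self.neg

    def negate(self) -> "RBSD":
        # Negation by cross wiring: swap the positive and negative bits.
        return RBSD(self.neg, self.pos)

    def digits(self, width: int) -> list[int]:
        """Digits in {-1, 0, 1}, most significant first."""
        return [((self.pos >> i) & 1) - ((self.neg >> i) & 1)
                for i in reversed(range(width))]


def rbsd_multiple(m: int, k: int) -> RBSD:
    """k*M for k in -4..4 using shifts and wiring only (assumed wiring table)."""
    wiring = {0: (0, 0), 1: (m, 0), 2: (m << 1, 0), 3: (m << 2, m), 4: (m << 2, 0)}
    result = RBSD(*wiring[abs(k)])
    return result.negate() if k < 0 else result


three_m = rbsd_multiple(74, 3)  # the multiplicand of the worked example, M = 74
```

No addition is performed when `three_m` is formed: 4M is wired to the positive bits and M to the negative bits, and the subtraction only happens implicitly when the value is read out.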

Figure 5: RBSD Booth-3 Partial Product Generator

In comparison to the Modified Booth-3 design in Figure 2, one advantage of the new RBSD Booth-3 is that all multiples can be generated without carry-propagate addition. Another is the simplicity of the negation of multiples, which enables a reduction in hardware. The partial product generator, as seen in Figure 5, can be realized without any additional time delay and with almost no extra hardware.

As for Booth-4, hard multiples such as 3M, 5M, 6M and 7M are needed, as seen in Table 6. Of these, 3M, 6M and 7M can be easily generated by connecting the proper two multiples to the positive and negative bits of the RBSD number, as shown in Table 4. However, 5M cannot be generated by combining two simple multiples. Thus, we design a simple RBSD adder to form 5M by adding 4M and M. This adder is very simple and carry-propagate-free, as seen in Figure 6. Therefore, while it does not add any delay, additional hardware is necessary.

Figure 6: Redundant Binary Adder for 5 Multiple (5M)

Even though the number of multiples to be generated is reduced by the RBSD Booth's encoding, the Partial Product Generator circuit is still complex for the RBSD Booth-4 encoding. However, the number of hard multiples is reduced from 4 to 1, and the remaining one can be easily formed without carry propagation. Also, the use of transmission gates can make the generator logic cost effective. This is a step forward to the use of higher radix.

4. RBSD BOOTH-3 MULTIPLIER (RBSD-BM)

As mentioned earlier, multipliers essentially consist of the Partial Product Generation Unit, the Carry-Free Addition Unit for the accumulation of partial products, and either the Carry-Propagate Addition Unit for the intermediate sum and carry (for SB numbers) or the Converter Unit from RBSD to SB (for RBSD numbers). As an example, Figure 7 shows a design for a 24x24-bit RBSD Booth-3 multiplier.

Figure 7: 24x24-bit RBSD Booth-3 Multiplier (blocks: RBSD Booth-3 Encoder, RBSD-to-SB Converter, 48-bit Product)

We have already presented the RBSD Booth-3 partial product generation unit in the previous section. The Partial Product Generator produces 8 RBSD partial products, and a Redundant Binary Adder (RBA) tree [8] is used to accumulate the RBSD partial products, as shown in Figure 7. The tree consists of Redundant Binary Adders, and the depth of the tree is three. The result of the RBA tree is a 48-bit number in RBSD form. Therefore, a conversion from RBSD to Standard Binary is required, which is done by a Carry-Look-Ahead adder (CLA) [8,16,17] in this multiplier.
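The data flow of the 24x24-bit multiplier described above can be summarized in a behavioural Python sketch. It is not a bit-level model of the paper's circuits. The assumptions are: the operands are 24-bit 2's complement numbers (which gives exactly 8 Booth-3 digits), each RBSD number is held as a (positive, negative) pair of non-negative integers, the pairwise sums merely stand in for the carry-free Redundant Binary Adders, and the final subtraction stands in for the RBSD-to-SB converter. All function names are hypothetical.

```python
def booth3_digits(y: int, n: int = 24) -> list[int]:
    """Radix-8 Booth digits (least significant first) of an n-bit 2's complement y."""
    assert n % 3 == 0 and -(1 << (n - 1)) <= y < (1 << (n - 1))
    x = (y & ((1 << n) - 1)) << 1  # n-bit pattern plus the implicit zero
    digits = []
    for j in range(n // 3):
        group = (x >> (3 * j)) & 0b1111  # three bits plus the overlapping bit
        body, overlap = group >> 1, group & 1
        digits.append((body - 8 if body >= 4 else body) + overlap)
    return digits


def rbsd_booth3_multiply(m: int, y: int, n: int = 24) -> tuple[int, int]:
    """Return (product, tree_depth) for n-bit 2's complement operands m and y."""
    partial_products = []
    for j, d in enumerate(booth3_digits(y, n)):
        term = d * m * (1 << (3 * j))
        # Value-level stand-in for the RBSD wiring: a (pos, neg) pair.
        partial_products.append((term, 0) if term >= 0 else (0, -term))
    depth = 0
    while len(partial_products) > 1:
        # One tree level: each adder reduces two RBSD operands to one.
        nxt = [(a[0] + b[0], a[1] + b[1])
               for a, b in zip(partial_products[0::2], partial_products[1::2])]
        if len(partial_products) % 2:
            nxt.append(partial_products[-1])
        partial_products = nxt
        depth += 1
    pos, neg = partial_products[0]
    return pos - neg, depth  # RBSD-to-SB conversion: one carry-propagate subtraction


product, depth = rbsd_booth3_multiply(-1234567, 7654321)
```

With 8 partial products the 2-to-1 reduction takes three levels (8 to 4 to 2 to 1), which agrees with the tree depth stated in the text.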

5. CONCLUSION

A new RBSD Booth's Encoding (RBBE) circuit for various radixes has been designed for a fast multiplier. The proposed system consists of the RBSD Booth's partial product generator, which generates the partial products in RBSD form, the Redundant Binary Adder tree, and the CLA adder that produces the final result in standard binary form. The RBSD Booth's Encoding has three essential advantages over the Modified Booth's encoding:

1) RBBE does not need to generate any hard multiples for Radix-4 and Radix-8, and the number of hard multiples is reduced to one for Radix-16. This hard multiple (5M) can be generated by a carry-propagation-free adder.
2) In the standard Booth's encoding, the negation of multiples and the sign extension of the partial products for accumulation are necessary, which requires additional hardware and time delay. The RBBE requires neither.
3) With the use of a Redundant Binary adder tree, a more regular layout for the accumulation of partial products is possible.

Therefore, the RBSD Booth's Encoding enables the use of a high radix value such as Radix-8 with Booth's encoding to design a fast multiplier. In the RBSD Booth's Encoding design, transmission gates can be employed to further improve area efficiency and speed. Moreover, any improvement to the CLA adder will have a positive effect on the RBSD-to-SB number converter and on the whole multiplier system. Optimized special converters based on CLA adders can also be designed by taking advantage of the RBSD number system.

REFERENCES

1. I. Koren, Computer Arithmetic Algorithms, Prentice Hall, 1993.
2. A. D. Booth, "A Signed Binary Multiplication Technique", Quarterly J. Mech. Appl. Math., vol. 4, part 2, pp. 236-240, 1951.
3. O. L. MacSorley, "High-Speed Arithmetic in Binary Computers", Proc. of the IRE, vol. 49(1), pp. 67-91, Jan. 1961.
4. C. S. Wallace, "A Suggestion for a Fast Multiplier", IEEE Trans. Electron. Comp., vol. EC-13, pp. 14-17, Feb. 1964.
5. L. Dadda, "Some schemes for parallel multipliers", Alta Frequenza, vol. 34, pp. 343-356, 1965.
6. M. Mehta, V. Parmar and E. Swartzlander, "High-speed multiplier design using multi-input counter and compressor circuits", IEEE Proc. of the 10th Symp. on Computer Arithmetic (ARITH10), pp. 43-50, 1991.
7. A. Avizienis, "Signed Digit Number Representation for Fast Parallel Arithmetic", IRE Trans. Electron. Computers, vol. EC-10, pp. 389-400, Sept. 1961.
8. H. Makino, Y. Nakase, H. Suzuki, H. Morinaka, H. Shinohara and K. Mashiko, "An 8.8-ns 54x54-Bit Multiplier with High Speed Redundant Binary Architecture", IEEE J. Solid-State Circuits, vol. 31, No. 6, pp. 773-783, June 1996.
9. T. N. Rajashekhara and O. Kal, "Fast multiplier design using redundant signed-digit numbers", Int. J. Electronics, vol. 69, No. 3, pp. 359-368, 1990.
10. S. Kuninobu, T. Nishiyama, H. Edamatsu, T. Taniguchi, and N. Takagi, "Design of high speed MOS multiplier and divider using redundant binary representation", IEEE Proc. of the 8th Symp. Computer Arithmetic (ARITH8), pp. 80-86, May 1987.
11. X. Huang, W. Liu, and B. W. Y. Wei, "A High-Performance CMOS Redundant Binary Multiplication-and-Accumulation (MAC) Unit", IEEE Trans. Circuits and Syst., vol. 41, pp. 33-39, Jan. 1994.
12. T. N. Rajashekhara and O. Kal, "Conversion from signed-digit to radix complement representation", Int. J. Electronics, vol. 69, No. 6, pp. 717-721, 1990.
13. S.-M. Yen, C.-S. Laih, C.-H. Chen and J.-Y. Lee, "An Efficient Redundant-Binary Number to Binary Number Converter", IEEE J. Solid-State Circuits, vol. 27, No. 1, pp. 109-112, January 1992.
14. D. S. Phatak, I. Koren, "Hybrid Signed-Digit Number Systems: A Unified Framework for Redundant Number Representations With Bounded Carry Propagation Chains", IEEE Trans. Computers, vol. 43, No. 8, pp. 880-891, August 1994.
15. G. Bewick, M. J. Flynn, "Binary Multiplication Using Partially Redundant Multiples", Technical Report CSL-TR-92-528, Stanford University, June 1992.
16. N. Ohkubo, T. Shinbo, T. Yamanaka, A. Shimizu, K. Sasaki and Y. Nakagome, "A 4.4-ns 54x54-b Multiplier Using Pass-Transistor Multiplexer", IEEE J. Solid-State Circuits, vol. 30, No. 3, pp. 251-257, March 1995.
17. K.-W. Shin, B.-S. Song, K. Bacrania, "A 200-MHz Complex Number Multiplier Using Redundant Binary Arithmetic", IEEE J. Solid-State Circuits, vol. 33, No. 6, pp. 904-909, June 1998.

BIOGRAPHIES

Nurettin Besli
Department of Electrical and Computer Engineering
Florida Institute of Technology, Melbourne, FL 32901
e-mail: besli@ee.fit.edu

Nurettin Besli received the BS degree and the MS degree in Electronics Engineering from Uludag University, Bursa, Turkey, in 1991 and 1994, respectively, and the MS degree in Computer Engineering from the University of Southern California, Los Angeles, in 1997. Currently, he is a candidate for the Ph.D. degree in the Department of Electrical and Computer Engineering at Florida Institute of Technology, Melbourne. His research interests include Computer Arithmetic, Computer Architecture, Digital Systems Design and VLSI Design. Mr. Besli is a member of IEEE.

R. G. Deshmukh
Department of Electrical and Computer Engineering
Florida Institute of Technology, Melbourne, FL 32901
e-mail:[email protected]

R. G. Deshmukh received a BS in Mathematics and a BE in Electrical Engineering from India and MSEE and Ph.D. degrees from Oklahoma State University, Stillwater, OK. He has been working as Associate Professor of Computer Engineering at Florida Institute of Technology, Melbourne, Florida. His research interests are in High Performance Computer Architecture, Parallel Processing and Image Processing. Mr. Deshmukh is a member of IEEE.
