Carry skip addition follows that for carry select, except the formation of the sums is left until the end:

1. For each block perform both steps simultaneously:
   (a) Form the zero conditional carry (group generate) by calculating the carry-out according to the ripple carry algorithm while holding the carry-in at zero.
   (b) Form the one conditional carry (group transfer) by conjunct-ing (and-ing) all of the transfer bits together.
2. Take the carry-out from the bottom block, and select the correct carry-in for the next block; continue until the top block has a correct carry-in.
3. Perform ripple carry additions in each block to form the final sums.

Figure 3.8 shows an 8 bit carry skip addition composed of four two bit blocks. The first two rows show the operand bits. The third row shows the transfers. The next two rows show the conditional carry-outs. The next row shows the actual block carry outputs. The 0 conditional carry output of the bottom block is an actual carry, so this value is just copied from row to row. The other block carry outputs are selected in series. The next to last row shows the intermediate carries, derived within the blocks in ripple carry fashion. The bottom row shows the sums, which are found by column-wise exclusive-or-ing the two operand bits and the carry bit.

The level of performance of this algorithm is linear, since the carries ripple across N/n blocks and hence are produced serially, where n is the constant number of bits per block.2

2 We use the variable N for adder widths. Small adders can be arrayed to make larger adders; conventionally the width of a block which is a member of such an array is signified with k instead of n.
[Figure 3.8: Carry Skip Addition. Row labels: A bits, B bits, transfers, conditional carries 0, conditional carries 1, actual block carry outs, carries within block.]
$$O(\text{carry\_skip}(\text{worst\_operands})) = N \tag{3.28}$$
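The three carry skip steps can be sketched in Python as follows. This is a minimal illustration, not the thesis's own code: the function name `carry_skip_add`, its parameters, and the choice of transfer as the OR of the operand bits are assumptions made for the example.

```python
def carry_skip_add(a, b, width=8, block=2, carry_in=0):
    """Add two unsigned integers with a one-level carry skip scheme.

    Bit 0 is the least significant bit and the "bottom" block holds bit 0.
    Returns (sum modulo 2**width, carry_out).
    """
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    transfer = [x | y for x, y in zip(a_bits, b_bits)]   # assumed: t = a OR b
    generate = [x & y for x, y in zip(a_bits, b_bits)]

    # Step 1: per block, form both conditional carries.
    group_generate = []   # carry-out with the carry-in held at zero
    group_transfer = []   # AND of all transfer bits in the block
    for lo in range(0, width, block):
        c, t_all = 0, 1
        for i in range(lo, min(lo + block, width)):
            c = generate[i] | (transfer[i] & c)
            t_all &= transfer[i]
        group_generate.append(c)
        group_transfer.append(t_all)

    # Step 2: select the block carry-ins in series, bottom block first.
    block_carry_in = []
    c = carry_in
    for g, t in zip(group_generate, group_transfer):
        block_carry_in.append(c)
        c = g | (t & c)
    carry_out = c

    # Step 3: ripple inside each block from its carry-in to form the sums.
    total = 0
    for k, lo in enumerate(range(0, width, block)):
        c = block_carry_in[k]
        for i in range(lo, min(lo + block, width)):
            total |= (a_bits[i] ^ b_bits[i] ^ c) << i
            c = generate[i] | (transfer[i] & c)
    return total, carry_out
```

For instance, `carry_skip_add(0b10110101, 0b01101011)` returns `(32, 1)`, since 181 + 107 = 288 = 256 + 32. Step 2 is the serial part that makes the evaluation linear in the number of blocks.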
3.6.3 Carry Lookahead

Carry lookahead addition is performed by following these steps:

1. Calculate bit generates and propagates/transfers.
2. Evaluate each carry equation in parallel.
3. Use the carries and propagates to calculate the sums.

Carry lookahead is based on the fact that Boolean logic equations can always be evaluated in two stages of logic.

$$c_1 = g_0 + p_0 c_0 \tag{3.29}$$

$$c_2 = g_1 + p_1 g_0 + p_{1:0} c_0 \tag{3.30}$$

$$c_3 = g_2 + p_2 g_1 + p_{2:1} g_0 + p_{2:0} c_0 \tag{3.31}$$

$$O(\text{carry\_lookahead}(\text{worst\_operands})) = C \tag{3.33}$$

Where C is a constant.
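The flat carry equations can be illustrated with a short Python sketch. It is an assumption-laden example, not the thesis's code: the name `carry_lookahead_add` is hypothetical, and propagate is taken as the exclusive-or of the operand bits. Each carry is computed directly from the bit generates, the propagates, and c_0 as a two-level sum of products, never from a previously computed carry.

```python
def carry_lookahead_add(a, b, width=4, c0=0):
    """Add two unsigned integers using flat (two-level) carry equations.

    Returns (sum modulo 2**width, carry_out).
    """
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    g = [x & y for x, y in zip(a_bits, b_bits)]   # bit generates
    p = [x ^ y for x, y in zip(a_bits, b_bits)]   # bit propagates (assumed xor)

    def group_p(hi, lo):
        """AND of p[lo..hi]; an empty range yields 1."""
        out = 1
        for k in range(lo, hi + 1):
            out &= p[k]
        return out

    carries = [c0]
    for i in range(1, width + 1):
        # c_i = g_{i-1} + p_{i-1} g_{i-2} + ... + p_{i-1:0} c_0
        c = group_p(i - 1, 0) & c0
        for j in range(i):
            c |= g[j] & group_p(i - 1, j + 1)
        carries.append(c)

    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    return total, carries[width]
```

As a usage example, `carry_lookahead_add(0b1011, 0b0110)` returns `(1, 1)`, since 11 + 6 = 17 = 16 + 1. The number of product terms and the fan-in of each term grow with the bit position, which is the unlimited fan-in assumption behind the constant-time claim.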
Such calculations, based on two level and-or evaluation with unlimited fan-in/fan-out, ignore several real effects. This isn't the issue, however, as all adder models which show less than linear order evaluation time are ignoring real effects.3 The issue is whether the model appropriately predicts dominant device effects for the size of adder of interest. In the case of the carry lookahead adder, this is only the case for N of a very few bits.

3 The problem of the propagation of information over a distance is at best linear time.
Chapter 4

Gate Delay Models for the Conventional Adders
... during my study of the adder unit I got the idea of solving virtually all statements - today we speak of data or information - with yes/no values. We realized that this principle could be applied to all computing machine components, especially to the control device, and led to switching algebra with the aid of propositional calculus. ...
K. Zuse
In this chapter we show gate level implementations of the adder algorithms discussed in the previous chapter. After obtaining the net lists1 we derive the worst case path lengths through the net lists.

A net list is often an ASCII file which contains a list of all the gate instances with their pins labeled, along with a list of the named connections (nets) between the pins. Various attributes of the circuit such as device sizes, connection lengths, and parasitics can be attached to the objects in the net list. The connection information in the net list can be formally interpreted as a graph. Hence, the net list of an adder is the nexus of its algorithm, logical topology, gate implementation, and physical parameters. Apparently the net list is an appropriate place to start our study of adder implementations.

1 By gate level we mean that all of the macros in the net list are instantiated as common gates such as nand and nor. Such a net list would typically be expanded out to the transistor level for the gates when needed.
According to the graph interpretation, a gate delay count is the length of a path through the net list.2 Because net lists are computer based representations, functions for describing net list properties such as path lengths can be implemented as computer programs. The functions in this chapter have a syntax akin to a procedure call and accept as input objects such as net lists, nets, pins, and numbers.

In general, the longest path through a net list is not necessarily the slowest to evaluate, since evaluation time is also a function of the device physics and geometries involved, but it is a reasonable heuristic. It was apparent from the previous chapter that the rate limiting sequence of steps is the propagation of the carries. Propagate, generate, and sum formation are constant time operations. Hence, we produce expressions for the worst case carry path lengths of the conventional adders. The approach can be improved upon by using an improved timing model.3

2 In actuality, gate delay counts are not directly proportional to actual evaluation time, since not all the gate delays are the same. Simple counts of path lengths are typically acceptable when looking at the worst case path. An improved timing model uses a combination of the times of the nodes crossed, based on device sizes and loads, instead of a simple count of the arcs traversed. This refinement is necessary when optimizing multiple paths.

3 Of course, accurate estimates of the worst case evaluation time should be extracted from layout along with parasitics, and simulated with SPICE. To obtain the inputs which activate the worst case paths, the paths recognized by a simple timing model as the slowest can be used, but they should not be assumed to be the same. The worst result obtained from SPICE is the worst case evaluation time estimate.
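The graph interpretation of a gate delay count can be sketched in Python. The net list format here is a hypothetical simplification, not the thesis's format: a dictionary maps each gate's output net to the list of nets on its input pins, and any net that no gate drives is treated as a primary input. The function name `gate_delay_count` is likewise an assumption.

```python
from functools import lru_cache


def gate_delay_count(netlist, output):
    """Length, in gates, of the longest path from any primary input to `output`.

    `netlist` maps a gate's output net to the list of its input nets.
    Every gate is assumed to have at least one input and the graph is
    assumed to be acyclic (combinational logic only).
    """
    @lru_cache(maxsize=None)
    def depth(net):
        if net not in netlist:        # primary input: no gate drives it
            return 0
        return 1 + max(depth(n) for n in netlist[net])

    return depth(output)


# A two bit ripple carry adder built from and, or, and xor gates.
ripple2 = {
    "g0": ["a0", "b0"],   # and
    "p0": ["a0", "b0"],   # xor
    "t0": ["p0", "c0"],   # and
    "c1": ["g0", "t0"],   # or
    "g1": ["a1", "b1"],   # and
    "p1": ["a1", "b1"],   # xor
    "t1": ["p1", "c1"],   # and
    "c2": ["g1", "t1"],   # or
    "s1": ["p1", "c1"],   # xor
}
```

With this example net list, `gate_delay_count(ripple2, "c2")` is 5 and `gate_delay_count(ripple2, "s1")` is 4. Every gate counts as one unit of delay, which is exactly the simplification the footnotes caution about.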
Note, there are five distinct usages of and in this thesis (the or operators are analogous). There is the grammatical version in text, 'and'. The propositional logic operator, '∧'. The word for referring to the logical operation, 'and'. The word for referring to a gate which performs this operation, 'and'. Finally, the name of the macro function called out in the net list, 'AND'. All of these forms carry unique information. We have endeavored to use them consistently; however, there were some ambiguous situations, for example, when a schematic was used to show how a net list was implemented.

4.1 Ripple Carry

Figure 4.1 shows three variations of logic for creating the propagate and generate signals; the variations are in the implementation of the transfer and propagate. The xor gates shown in Figure 4.2 cause two gate delays in the worst case. The implementations are all based on:

$$a \oplus b = (a \lor b)\,\overline{(ab)} \tag{4.1}$$

Figure 4.3 shows the implementations of two and three input NAND, two and three input NOR, and the OR-NAND and AND-NOR macros. In general, gates with higher fan-in are slower. The delay is roughly related to the number of series transistors between the output and the power supply rails; this measure is known as the "stack height". Hence, the performance of the OR-NAND and AND-NOR is comparable with the three input gates.
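Assuming equation (4.1) is the identity a XOR b = (a OR b) AND NOT (a AND b), it can be checked exhaustively with a few lines of Python. The helper name `xor_or_nand` is hypothetical and only illustrates the OR-NAND decomposition of exclusive-or.

```python
from itertools import product


def xor_or_nand(a, b):
    """Exclusive-or built as (a OR b) AND NOT (a AND b) on single bits."""
    return (a | b) & (1 - (a & b))


# Truth table as (a, b, output) rows for all four input combinations.
truth_table = [(a, b, xor_or_nand(a, b)) for a, b in product((0, 1), repeat=2)]
```

The resulting `truth_table` matches Python's built-in `a ^ b` on every row, which confirms that an OR-NAND macro followed by an and yields xor.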
Figure 4.1: Generate and Propagate Logic Blocks

Figure 4.2: XOR and XNOR Blocks

[Figure: transistor schematics between Vdd and Vss; panels 2-NAND, 2-NOR, 3-NAND, 3-NOR.]
Neither of the adders shown in figures 6.4 and 6.5 is an optimum carry skip adder. This one level carry skip adder has equal block sizes. A faster adder can be created by varying the block sizes.

A simplified figure is shown in Figure 6.6, where the cells have been packed to the left. This corresponds to a probable layout. Figure 6.7 shows the time diagram for the carry skip adder, with time on the x axis and N (bits) on the y axis. Note, there is one fewer fco delay here than in the worst case speed path through Brent and Kung's adder.

It may be useful to generalize prefix graphs by darkening the circles where the fco logic is. It is apparent from context which cells contain fco logic. The shared logic cells are marked with identical letters (A, B, C, or D).

Many adders are made from Manchester carry chains combined together with restoring inverter stages. Manchester carry chains are nonrestoring dynamic circuits which are square law based in the CMOS technologies. Figure 6.8 shows a hybrid view for such an adder. For figure 6.8 we have assumed that the evaluation time of the Manchester carry chain blocks is $x^2$, while the skip time is $3x$, where x is the "width" of the MCC block. Thus, the skipped carry is seen traveling across the adder in linear time, while the carry evaluating within a block is parabolic. For example, at time 6ns we see that block c has just finished evaluating its first pass, and for the second pass the carry would be between the third and fourth bits.
Figure 8.2: Transforming Standard Ripple to Majerski Style Ripple
Although it is possible to apply Majerski's recursion in a tree because it is associative (see chapters 7 and 8), a complication is that the group p, k, and g signals are still needed in the carry path. The adder can be simplified by using t in place of p; then k is no longer required, just t, but the requirement for group g remains.
8.3 Ling's Adder

Ling suggested using the fco variant and procrastinating the final and. In Ling's paper the subscripts go in the opposite direction; we do not follow the same convention, to be consistent with the rest of the thesis. We can find:

$$H_{i+1} = (g_i \lor c_i) \tag{8.33}$$

$H_{i+1}$ and then the pieces are propagated between bits:

$$H_{i+1} = g_i \lor t_{i-1} H_i \tag{8.34}$$

To make the carry, it must be recovered by assembling $c_{i+1}$ from the pieces:

$$c_{i+1} = t_i H_{i+1} \tag{8.35}$$

8.4 Other Adders

Reed, et al. ... the gamma of equation 8.14, and then ... initial value for carry. In the form