SHACKLEFORD ET AL.

SASIMI 2000, Kyoto April 6-7, 2000

Synthesis of Minimum-Cost Multilevel Logic Networks via Genetic Algorithm

Barry Shackleford
Hewlett-Packard Laboratories, 1501 Page Mill Rd., Palo Alto, CA 94304 U.S.A.

Etsuko Okushi, Mitsuhiro Yasuda, Hisao Koizumi, Katsuhiko Seo
Mitsubishi Electric Corporation, 5-5-1, Ofuna, Kamakura, Kanagawa 247-8501 Japan

Hiroto Yasuura
Graduate School of Engineering Sciences, Kyushu University, Kasuga-shi, 816 Japan

Abstract—The problem of synthesizing a minimum-cost logic network is formulated for a genetic algorithm (GA). In a benchmark against a commercial logic synthesis tool, the GA-generated odd parity circuit required 24 basic cells (BCs) versus 28 BCs for the design produced by the commercial system. A magnitude comparator required 20 BCs versus 21 BCs for the commercial system’s design. Poor temporal performance, however, is the main disadvantage of the GA-based approach. The design of a hardware-based cost function that would accelerate the GA by several thousand times is described.

II. LOGIC MINIMIZATION PROBLEM

Given a Boolean logic function, what is the minimum cost implementation that can be realized for a given target technology? Logic functions are normally expressed in terms of AND, OR, and NOT operators whereas logic circuits are often most economically implemented in terms of NAND (NOT AND) or NOR (NOT OR) functions. This dichotomy has resulted in most logic synthesis systems being divided into a technology-independent phase, where the expression of the logic function is simplified, and a technology-dependent phase, where the simplified logic function is mapped onto the secondary logic functions embodied in the target technology.

Logic function simplification is divided into two-level logic minimization and multilevel logic minimization. Two-level logic is so named because one level is devoted to the AND operation and the other level is devoted to the OR operation. A group of ANDs connected to an OR is called a sum-of-products network. A group of ORs connected to an AND is called a product-of-sums network. Minimal two-level networks are formed by finding a minimal-cost set of prime implicants that covers all terms of the original logic function. A prime implicant is a Boolean subcube that is not contained within any other subcube. Prime implicants can be determined visually by means of a Karnaugh map [8] or via a tabular method that was first proposed by Quine [9] and later improved by McCluskey [10], and which is now often referred to as the Quine-McCluskey method [11].

If we think of a two-level logic network as implementing a logic function “in parallel,” i.e., all of the terms are determined concurrently in the first level of logic and then combined in the second level of logic, then we can think of multilevel logic [12] networks as being implemented “in serial,” i.e., factored subterms are combined with other subterms and inputs to form larger subterms until the logic function is finally realized.

Whereas the Quine-McCluskey method can provide an optimal two-level implementation of a logic function (within the computational constraints imposed by the NP-complete

Keywords—genetic algorithm, logic synthesis, hardware acceleration

I. INTRODUCTION

A genetic algorithm is an optimization process based upon simulating natural evolution. The concept of using genetic algorithms to evolve digital circuit designs has been discussed in [1–3]. Efforts can be broadly divided into functional-level evolution and gate-level evolution. Murakawa [4] describes a VLSI chip containing 15 digital signal processors whose functions and interconnections can be dynamically reconfigured according to a chromosomal data pattern. Thompson [5] describes the evolution of dynamic, asynchronous circuits within a 100-gate section of a field programmable gate array. Closer to our own work, Chattopadhyay [6] describes the synthesis of AND-XOR [7] circuits using a genetic algorithm.

This paper describes our approach in casting the problem of minimal cost logic circuit synthesis as a genetic algorithm. Also described is a means to accelerate the speed of the genetic algorithm in this application by means of a cost function implemented in hardware. Initial simulations are encouraging in that less costly circuits were generated than those synthesized by a state-of-the-art commercial logic synthesis tool.

In the next section we provide a brief survey of the problem of logic minimization. Section III describes the genetic algorithm and problem formulation. The distribution of logic functions in the implementation space is explored in Section IV, and finally Section V details two experiments involving NOR-gate circuit synthesis of an odd parity circuit and a magnitude comparator.

[Figure 1 diagram: a truth table (i1 i2 : f = 00 : 1, 01 : 0, 10 : 0, 11 : 1), the genetic algorithm, a chromosome, and the resulting logic circuit, linked by coverage, cost, and fitness; chromosome length derived from conventional implementation.]

Figure 2 Crossover and Mutation functions.
Crossover template: 1 1 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
Parent0:            1 0 0 0 0 1 0 1 0 0 1 0 1 1 0 0 1 0 1 0
Parent1:            1 1 0 1 0 1 0 1 1 1 0 1 0 0 0 1 0 0 1 1
Child:              1 1 0 1 0 1 0 1 0 0 0 1 0 0 0 0 1 0 1 0
Mutation template:  0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0
Mutated child:      1 1 0 0 0 1 0 1 0 0 1 1 0 0 0 0 1 0 1 0

somes and are stored within an aggregate population array along with their respective fitness values. The purpose of the population is to maintain a wide genetic diversity, which accounts for breadth in the search for a good solution. Depth of search is realized by combining fitter parent chromosomes through a process of crossover and mutation (Fig. 2) to produce new child chromosomes. A survival-of-the-fittest paradigm, in which fitter children randomly replace less-fit members of the population, assures evolutionary advance towards an optimal solution.
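The crossover and mutation steps can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, the convention that a 1 in the crossover template selects the bit from Parent1 is inferred from the bit patterns printed in Fig. 2, and treating mutation as a bit flip wherever the mutation template holds a 1 is an assumption that is not checked against the figure.

```python
from typing import List

Bits = List[int]


def bits(text: str) -> Bits:
    """Parse a space-separated bit string such as '1 0 1'."""
    return [int(ch) for ch in text.split()]


def crossover(template: Bits, parent0: Bits, parent1: Bits) -> Bits:
    """Template-driven crossover: a 1 takes the bit from parent1, a 0 from parent0.

    The selection convention is inferred from the patterns in Fig. 2.
    """
    return [p1 if t else p0 for t, p0, p1 in zip(template, parent0, parent1)]


def mutate(template: Bits, child: Bits) -> Bits:
    """Assumed mutation rule: flip each child bit where the template holds a 1."""
    return [c ^ t for c, t in zip(child, template)]


# Bit patterns copied from Fig. 2.
TEMPLATE = bits("1 1 1 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0")
PARENT0 = bits("1 0 0 0 0 1 0 1 0 0 1 0 1 1 0 0 1 0 1 0")
PARENT1 = bits("1 1 0 1 0 1 0 1 1 1 0 1 0 0 0 1 0 0 1 1")
CHILD = bits("1 1 0 1 0 1 0 1 0 0 0 1 0 0 0 0 1 0 1 0")

child = crossover(TEMPLATE, PARENT0, PARENT1)
```

With these inputs, `child` reproduces the Child row of Fig. 2 bit for bit.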

Figure 1 Generation of minimal-cost logic circuits via genetic algorithm: The genetic algorithm produces chromosomes whose bit patterns represent logic circuits. The resultant circuits are graded as to cost and coverage of the function to be implemented. The circuit’s fitness determines its chance of survival and hence its ability to influence the evolutionary process

set covering subproblem), there is no such assured method for multilevel cost optimization. Instead, current methods rely on heuristics and are roughly divided into rule-based methods and algorithmic methods. Rule-based methods fire specific pattern transformation rules when certain patterns are found within the logic network. LSS [13], first developed by IBM in the late 1970s, is such a system. Initially relying on local transformation rules, the system now contains global optimization methods [14]. SOCRATES [15] is another program using a rule-based approach. Algorithmic methods primarily rely on either algebraic [16] or Boolean [17] operations to simplify the logic function prior to technology mapping. MIS is exemplary of a design system based upon algebraic methods and BOLD represents a design system based upon Boolean operations.

Our approach (Fig. 1) uses a genetic algorithm to evolve minimum-cost circuits from a population of initially random circuits. Evolution is first driven by the relative correctness of the circuits; then, after correct circuits have been generated, evolution is driven by the cost of the circuits. The length of the chromosome, which determines the maximum number of circuit levels, can be established by first implementing the circuit by conventional means and then using that information to establish the maximum number of logic levels to be considered for the GA-based implementation.

III.

A. Survival-Based Genetic Algorithm Description

The GA is made amenable to hardware implementation (Fig. 3) by employing a steady-state population with random parent selection where evolution is promoted by fitter children replacing lesser-fit parents in the population. This is in contrast to Holland’s original GA which uses a generational population scheme with parent selection according to fitness to promote evolution. Initially, a population of np randomly generated chromosomes is created, evaluated (i.e., assigned cost values), and stored in the Population array. A single location in the array contains both the data (nc bits) and cost portions of the chromosome. Since the basis of the algorithm is survival of fitter offspring over less-fit parents, a record of a child’s least-fit parent is kept in the two variables: worst_cost and worst_adrs. As pairs of parents are randomly selected, their cost values are compared and the least-fit parent becomes the new candidate for replacement, with its cost and address being held in worst_cost and worst_adrs respectively. After two parent chromosomes have been randomly selected, a child chromosome is created by the Crossover function (Fig. 2). It is then exposed to the possibility of mutation (Fig. 2). After mutation, the child is ready for evaluation by the problem-specific Cost function. The survival of the child chromosome is determined by comparing its cost value with that of its least-fit parent. If the cost of the new child chromosome is less than worst_cost, then the child data and fitness are stored in the Population array at the location pointed to by worst_adrs. As the process of selection, generation, and survival/replacement continues, the overall cost of the population will decrease and the survival rate of new offspring will eventually drop to zero.
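The steady-state scheme described above can be sketched in software as follows. This is an illustrative sketch under stated assumptions, not the hardware design: the function name `steady_state_ga` is hypothetical, a fixed crossover budget stands in for the evolutionary-stasis test, crossover is taken to be uniform with probability 0.5 per bit, and the first parent is drawn at random before the loop starts.

```python
import random
from typing import Callable, List, Tuple

Chromosome = List[int]


def steady_state_ga(cost: Callable[[Chromosome], int], n_c: int, n_p: int,
                    crossovers: int, mutation_prob: float,
                    seed: int = 0) -> List[Tuple[Chromosome, int]]:
    """Run a survival-based GA and return the final (data, cost) population."""
    rng = random.Random(seed)
    # Initial population: n_p random chromosomes of n_c bits, each with its cost.
    population = []
    for _ in range(n_p):
        data = [rng.randint(0, 1) for _ in range(n_c)]
        population.append((data, cost(data)))

    parent1_adrs = rng.randrange(n_p)  # assumption: seed the first parent randomly
    for _ in range(crossovers):        # assumption: fixed budget instead of stasis test
        # Prior first parent becomes second parent; select a new first parent.
        parent2_adrs = parent1_adrs
        parent1_adrs = rng.randrange(n_p)
        parent1, cost1 = population[parent1_adrs]
        parent2, cost2 = population[parent2_adrs]
        # The least-fit (higher-cost) parent is the replacement candidate.
        if cost1 > cost2:
            worst_cost, worst_adrs = cost1, parent1_adrs
        else:
            worst_cost, worst_adrs = cost2, parent2_adrs
        # Uniform crossover followed by per-bit mutation.
        child = [a if rng.random() < 0.5 else b for a, b in zip(parent1, parent2)]
        child = [b ^ 1 if rng.random() < mutation_prob else b for b in child]
        child_cost = cost(child)
        # Survival: the child replaces the worst parent only if it is strictly cheaper.
        if child_cost < worst_cost:
            population[worst_adrs] = (child, child_cost)
    return population


# Toy usage: the cost is simply the number of 1 bits in the chromosome.
initial = steady_state_ga(sum, n_c=20, n_p=16, crossovers=0, mutation_prob=0.05, seed=1)
evolved = steady_state_ga(sum, n_c=20, n_p=16, crossovers=2000, mutation_prob=0.05, seed=1)
```

Because a child is stored only when it is cheaper than the parent it replaces, the cost held at any population address can never increase, which is the property that drives the overall population cost downward.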

GENETIC ALGORITHM

Genetic algorithms (GAs) [18–20] were described in 1975 by John Holland as a method for finding solutions to difficult optimization problems [21] by means of simulated evolution. Two prerequisites for solving an optimization problem with a GA are (1) the ability to express a potential solution as a bit string and (2) a fitness function to evaluate the goodness of the solution expressed in the bit string. The bit-string potential solutions are termed chromo-


—Initial Population Creation—
for i = 0 to np – 1 do
    chromosome_data = Random(2^nc);
    chromosome_cost = Cost(chromosome_data);
    Population(i) = chromosome;
end for

—Algorithm Body—
while not Evolutionary_Stasis(Population) do
    • prior first-parent becomes second-parent
    parent2 = parent1;
    parent2_adrs = parent1_adrs;
    • new first-parent selection
    parent1_adrs = Random(np);
    parent1 = Population(parent1_adrs);
    • least-fit parent is replacement candidate
    if parent1_cost > parent2_cost then
        worst_cost = parent1_cost; worst_adrs = parent1_adrs;
    else
        worst_cost = parent2_cost; worst_adrs = parent2_adrs;
    end if-then-else
    child_data = Crossover(cut_prob, parent1_data, parent2_data);
    child_data = Mutation(mutation_prob, child_data);
    child_cost = Cost(child_data);
    • survival determination
    if child_cost < worst_cost then
        Population(worst_adrs) = child;
    end if
end while

Figure 3 Genetic algorithm designed for efficient hardware implementation.

B. Chromosome Data Format

A connection matrix similar to that in Fig. 4 can describe any network not incorporating feedback and comprised of similar circuit elements (in this case, NOR gates). The rows of the connection matrix represent either function inputs (the rectangular upper portion of the matrix) or gate outputs (the triangular lower portion of the matrix). The columns represent inputs to the gates. A one in the connection matrix represents a connection from the signal represented by the row to the gate represented by the column. The connection matrix is arranged so that any function input can be connected to any gate and any gate $g_j$ can be connected to any other gate $g_k$ where $k > j$. The full connection matrix is capable of describing all circuits of up to $n_g$ gates and $n_i$ function inputs. The size of the upper, rectangular portion of the matrix is $n_i n_g$ cells. The size of the lower, triangular portion of the matrix is $n_g (n_g - 1) / 2$ cells. Letting each cell in the connection matrix be represented by a single bit in the chromosome, the length $n_c$ of the chromosome is given by

\[ n_c = n_i n_g + \frac{n_g (n_g - 1)}{2}. \]

The number of possible network configurations is $2^{n_c}$. The functionality of the matrix can be easily expanded by adding rows at the top. For example, adding two rows would allow for the specification of one of four different gate types at each level.

Figure 4 Chromosome data format: The chromosome represents a connection matrix of NOR gates and primary inputs to the logic function. A 1 in the chromosome represents a connection (shown by a “•” in the connection matrix). The output of the circuit is defined to be that of the rightmost gate in the connection matrix. (Example shown: chromosome 1 1 0 0 0 1 0 1 0 0 1 1 0 0 0 0 1 0 1 0, with matrix rows i1, i2, g1–g4 and columns g1–g5; in the represented circuit g4 is an unused gate and g3 is connected to g5.)

C. Fitness Function

For the logic minimization problem, higher fitness is associated with lower cost, so we will address the cost function directly. We will take the cost of a logic network to be the sum of the costs of the individual gates comprising the network. For the CMOS technology [22] that we used as our target technology, gate costs are measured in basic cells (BCs). For NAND and NOR gates the cost $C_B$ of a gate $g$ in BCs is given by

\[ C_B(g(\mathit{fanin})) = \begin{cases} \mathit{fanin} + 1 & \text{if } \mathit{fanin} \ge 1 \\ 0 & \text{otherwise.} \end{cases} \]

So, for example, an inverter has a cost of two BCs and a 3-input NAND or NOR gate has a cost of four BCs. If the gate’s fanin is zero, then its cost is also zero. However, we still need to account for the cost of logic networks that don’t correctly implement the supplied logic function. For these networks, we add a penalty cost increment for every instance in which the synthetic network F does not provide the output specified by the source function T when tested against all possible input combinations. By choosing the penalty increment to be the maximum possible cost (i.e., $n_c + n_g$) for a connection matrix we can be assured that a network with $n_e$ errors will have a lower cost than a network with $n_e + 1$ errors. The penalty cost is given by

\[ C_P = (n_c + n_g) \sum_{i=0}^{2^{n_i}-1} \begin{cases} 1 & \text{if } T(i) \ne F(i) \\ 0 & \text{otherwise.} \end{cases} \]

Thus, the cost C for a given chromosome is composed of a


the cost of the fitness function circuit. The cost of the fitness function circuit will be proportional to the size of the connection matrix and will consist of two parts. The first part is the number of 2-input gates (Fig. 5) that effect connections between the gates in the synthetic circuit. Here the cost is simply the number of cells in the connection matrix and can be represented as the sum of a linear series of $n_g$ terms starting at $n_i$ and ending at $n_i + n_g - 1$ where, as before, $n_i$ is the number of inputs to the synthetic circuit and $n_g$ is the number of gates that can be represented by the connection matrix. The second part is the cost of the multi-fanin NOR gates that represent the gates in the synthetic circuit. We can approximate their costs as the cost in basic cells (i.e., fanin + 1) divided by 3 (the cost of a 2-input gate). Again, the total cost can be represented as the sum of a linear series of $n_g$ terms, this time starting at $n_i + 1$ and ending at $n_i + n_g$. Thus, the cost $C_C$ (as measured in equivalent 2-input gates) of the hardware implementation of the connection matrix is given by

Figure 5 Hardware circuit to evaluate any circuit expressed by connection matrix: The circuit is simulated by NOR gates g1–g5 and the circuit interconnections are effected by the AND gates abutted to each NOR gate. Each AND gate is in turn controlled by a bit in the chromosome (bits numbered 1–20 in the connection matrix).
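A software counterpart of this evaluation circuit, combined with the gate cost and penalty cost from the Fitness Function subsection, might look like the sketch below. Several details are assumptions rather than statements from the paper: the function names are hypothetical, the chromosome bits are assumed to be laid out row by row (input rows first, then gate rows, each row covering only the gates it may drive), and a NOR gate with no connections is assumed to output 1.

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple


def decode(chromosome: Sequence[int], n_i: int, n_g: int) -> List[List[int]]:
    """Return the source indices feeding each gate.

    Sources 0..n_i-1 are function inputs; source n_i + j is the output of gate j.
    Assumed bit order: row-major over the connection matrix.
    """
    expected = n_i * n_g + n_g * (n_g - 1) // 2
    if len(chromosome) != expected:
        raise ValueError(f"expected {expected} bits, got {len(chromosome)}")
    fanins: List[List[int]] = [[] for _ in range(n_g)]
    bits = iter(chromosome)
    for row in range(n_i + n_g - 1):
        # Input rows reach every gate; the row of gate j reaches gates k > j only.
        first_col = 0 if row < n_i else row - n_i + 1
        for col in range(first_col, n_g):
            if next(bits):
                fanins[col].append(row)
    return fanins


def evaluate(fanins: List[List[int]], inputs: Tuple[int, ...]) -> int:
    """Simulate the NOR network; the output is that of the last (rightmost) gate."""
    signals = list(inputs)
    for sources in fanins:
        # Assumption: a NOR gate with no connected inputs outputs 1.
        signals.append(0 if any(signals[s] for s in sources) else 1)
    return signals[-1]


def network_cost(chromosome: Sequence[int], n_i: int, n_g: int,
                 target: Callable[[Tuple[int, ...]], int]) -> int:
    """Penalty cost plus intrinsic cost in basic cells."""
    fanins = decode(chromosome, n_i, n_g)
    # Gate cost is fanin + 1 for a connected gate and 0 for an unused gate.
    intrinsic = sum(len(sources) + 1 for sources in fanins if sources)
    errors = sum(evaluate(fanins, row) != target(row)
                 for row in product((0, 1), repeat=n_i))
    return (len(chromosome) + n_g) * errors + intrinsic


# Example chromosome from Fig. 4 (n_i = 2, n_g = 5, 20 bits).
FIG4 = [1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
xnor = lambda row: int(row[0] == row[1])
fig4_cost = network_cost(FIG4, 2, 5, xnor)
```

Under these assumptions the Fig. 4 chromosome decodes to four connected two-input NOR gates that reproduce the truth table of Fig. 1 (output 1 for inputs 00 and 11), so it incurs no penalty and `fig4_cost` is the intrinsic cost alone.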

penalty cost plus the intrinsic cost of the network:

\[ C = C_P + \sum_{i=1}^{n_g} C_B(g_i). \]

\[ C_C = \frac{n_g (n_g + 2 n_i - 1)}{2} + \frac{n_g (n_g + 2 n_i + 1)}{6} \]

where the left and right halves of the RHS of the equation represent the first and second parts respectively. However, the circuit shown in Fig. 5 can only evaluate a single row in the function’s truth table. To evaluate the entire truth table would take $2^{n_i}$ machine cycles. It is possible, though, to trade computation cycles for increased hardware cost. Each doubling of the quantity of connection matrix circuits will reduce the number of computation cycles by half. By implementing $2^{n_i}$ connection matrix circuits (one for each row in the truth table), a synthetic circuit’s compliance with the target logic function can be evaluated in one machine cycle.

Fig. 6 shows a pipelined implementation of the concept. Each stage of the pipeline evaluates one row of the truth table. If the output of the synthetic circuit does not match the target function at any stage, the penalty cost is incremented and passed to the next stage. The final stage of the pipeline combines the penalty cost with the intrinsic cost of the synthetic circuit. The penalty cost comprises the most significant portion of the solution cost with the actual cost in BCs comprising the least significant portion. The hardware cost of this type of fitness function is plotted in Fig. 7 as a function of $n_i$ and $n_g$.

Since the cost in BCs for a gate is effectively the number of inputs plus one for the output, the actual cost of the circuit represented by the chromosome can be determined by summing the number of inputs with the number of outputs. A carry-save adder connected to the chromosome will calculate the number of inputs since each 1 in the chromosome represents a gate input. Gate outputs can be detected by ORing bits in the chromosome that are associated with a single column in the connection matrix. The outputs of these $n_g$ OR gates are summed in the same carry-save adder to produce the intrinsic circuit cost. The cost of hardware fitness function $C_F$ increases expo-

D. Speeding Up the GA

A major drawback of a GA is its slow speed when emulated on a conventional computer. Our solution to the speed problem was to implement the GA as a pipelined processor [30], [31] with each stage of the pipeline executing one aspect of the algorithm such as parent selection, crossover-mutation, fitness determination, etc. The net result is that one crossover operation is completed per clock cycle. So, for example, a GA machine running at 100 MHz will execute $10^8$ crossovers per second. However, this requires that each stage of the pipeline, including the solution fitness/cost evaluation, be able to produce a result on each clock cycle. For a complex cost function, such as the NOR gate network cost function considered in this paper, a combination of parallel implementation and/or further pipelining can be used to attain the necessary performance. In the next subsection, we will consider these techniques as applied to the hardware implementation of the NOR gate network cost function.

E. Hardware Implementation of Cost Function

Fig. 5 shows the hardware circuit to assess the functionality of any 5-gate NOR network with two function inputs. The NOR gates of the hypothetical circuit are simulated by NOR gates g1–g5. Each NOR gate is connected to all previous NOR gates and all function inputs through AND gates that govern each connection. The AND gates are in turn controlled by bits in the chromosome register that represent bits in the connection matrix. Manufacturers typically rate integrated circuits by the number of equivalent 2-input NAND or NOR gates that can be contained. So, we will use the same measure in defining

IV. DISTRIBUTION OF LOGIC FUNCTIONS WITHIN THE CONNECTION MATRIX IMPLEMENTATION SPACE

Let us consider three GA-based optimization problems, each with a 69-bit chromosome: The first is a graph partitioning problem with 69 vertices, the second is a set partitioning problem with 69 sets, and the third is a circuit implementation problem with a 69-bit connection matrix. For the graph partitioning problem, each bit in the chromosome represents the partition state of a vertex. All but two (the all-1s and all-0s solutions) of the $2^{69}$ solutions are legitimate solutions. The fitnesses will vary, but all of the solutions are legitimate in that they divide the graph into two partitions. For the set partitioning problem, each bit in the chromosome represents the inclusion of one of the 69 sets in the solution. It is possible to have illegitimate solutions in that all elements from the union of the 69 sets may not be included in the solution. However, it is always possible to eventually move from an illegitimate solution to a legitimate solution by changing 0-bits in the chromosome to 1-bits. The circuit synthesis problem, however, resides in a more highly constrained space: A chromosome representing an illegitimate solution (i.e., a circuit not implementing the desired logic function) cannot be transformed into a legiti-

Figure 7 Equal-cost contours for the single-cycle fitness function circuit are plotted as a function of $n_i$ and $n_g$. The shaded area indicates the region for a set of 18 curves ($n_i$ = 3, 4, … , 20) plotting the size of the circuit space as a function of $n_g$. The following example illustrates the interpretation of the graph: Point (a) represents the cost (approximately $10^6$ gates) of the fitness function for a connection matrix with 50 gates and 9 inputs. Extending this point upwards and reflecting off of the number-of-circuits curve for $n_i$ = 9 at point (b), we find the total number of circuits (c) to be $2^{1675}$. The chromosome length is $\log_2$ of this quantity (i.e., 1675 bits). The difference between the number of logic functions for 9 inputs ($2^{512}$) and the number of circuits that could be represented by the chromosome ($2^{1675}$) is indicated by (d).
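The numbers quoted in the Fig. 7 example can be reproduced from the chromosome-length formula, the connection-matrix cost $C_C$, and the single-cycle fitness function cost $C_F = 2^{n_i} C_C$. The short sketch below does this with exact rational arithmetic; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def chromosome_length(n_i: int, n_g: int) -> int:
    """n_c = n_i * n_g + n_g * (n_g - 1) / 2."""
    return n_i * n_g + n_g * (n_g - 1) // 2


def connection_matrix_cost(n_i: int, n_g: int) -> Fraction:
    """C_C in equivalent 2-input gates: connection gates plus simulated NOR gates."""
    connections = Fraction(n_g * (n_g + 2 * n_i - 1), 2)
    nor_gates = Fraction(n_g * (n_g + 2 * n_i + 1), 6)
    return connections + nor_gates


def fitness_function_cost(n_i: int, n_g: int) -> Fraction:
    """C_F = 2^n_i * C_C: one connection-matrix circuit per truth-table row."""
    return 2 ** n_i * connection_matrix_cost(n_i, n_g)


n_c = chromosome_length(9, 50)       # chromosome length for point (a) of Fig. 7
c_c = connection_matrix_cost(9, 50)  # cost of one connection-matrix circuit
c_f = fitness_function_cost(9, 50)   # cost of the single-cycle fitness function
```

For 9 inputs and 50 gates this gives a 1675-bit chromosome and a fitness function cost of 1,152,000 equivalent gates, consistent with the figure's "approximately $10^6$ gates". Note that the first term of $C_C$ equals the chromosome length, since one connection gate is needed per chromosome bit.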

Figure 6 Pipelined hardware implementation of the fitness function that will produce one chromosome evaluation per machine cycle. CM: connection matrix; CSA: carry-save adder. Gate-present detectors OR the chromosome bits belonging to each column of the connection matrix (g1: bits 1, 6; g2: bits 2, 7, 11; g3: bits 3, 8, 12, 15; g4: bits 4, 9, 13, 16, 18; g5: bits 5, 10, 14, 17, 19, 20).

nentially with the number of function inputs as given by

\[ C_F = 2^{n_i} \cdot C_C. \]


mate solution by simply changing 0s to 1s. Thus from a genetic algorithm perspective, the circuit synthesis problem is harder than either the graph partitioning problem or the set partitioning problem.

The questions then arise: What is the size of the function space and how does it relate to the circuit space? What is the distribution of functions within the function space? What is the most difficult function to synthesize a circuit for and why? Finally, is a GA capable of synthesizing this worst case function?

In the next subsection we will discuss the number of possible logic functions as related to the number of possible circuits. In subsection B, we will explain the notation used to describe logic functions, and in subsection C, we will describe experiments to determine the distribution of functions in the circuit space. Then, in section V, we will describe synthesis experiments based, in part, on the findings of subsection C.

A.

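Figure 8 Notation to describe logic function. The notation derives directly from the function’s truth table and is written as a hexadecimal subscript. (Example shown: the three-input ODD function, whose output f is 1 if the number of inputs that are 1 is odd. For inputs i1 i2 i3 = 000, 001, 010, 011, 100, 101, 110, 111 the truth table outputs are 0, 1, 1, 0, 1, 0, 0, 1. Read from the 111 row down to the 000 row, the outputs form 1001 0110, i.e., hexadecimal digits 9 and 6, so the function is written F96.)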
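The naming rule can be expressed compactly in code. The sketch below is an illustration only: `function_name` is a hypothetical helper, and it relies on the convention stated in the paper that the all-zero input row supplies the least significant bit of the function number.

```python
from itertools import product
from typing import Callable


def function_name(f: Callable[..., int], n_i: int) -> str:
    """Build the hexadecimal function label from the truth table output column."""
    number = 0
    # Rows are enumerated 00...0, 00...1, ..., 11...1; row 0 is the least significant bit.
    for row, inputs in enumerate(product((0, 1), repeat=n_i)):
        number |= (f(*inputs) & 1) << row
    digits = max(1, 2 ** n_i // 4)  # one hex digit per four truth table rows
    return f"F{number:0{digits}X}"


odd_parity = lambda i1, i2, i3: (i1 + i2 + i3) % 2
even_parity = lambda i1, i2, i3: 1 - (i1 + i2 + i3) % 2

odd_name = function_name(odd_parity, 3)
even_name = function_name(even_parity, 3)
```

Here `odd_name` is `F96` and `even_name` is `F69`, the labels used in the text for the three-input odd and even functions; a function whose output is always 1 is labeled `FFF` under the same rule.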

Number of Logic Functions

The number of different logic functions increases exponentially with the number of variables (inputs). For a binary function with $n_i$ inputs, the number of possible logic functions $n_f$ is given as

\[ n_f = 2^{2^{n_i}}. \]

Indeed, this number increases very rapidly: For three inputs, there are only 256 possible functions; for four inputs, the number of possible functions increases to 65,536; for five inputs, the number is over four billion. On the other hand, the number of possible circuits for a full connection matrix, $2^{n_c}$, starts out large, but its exponent only increases as $n_g^2$, where $n_g$ is the number of gates available to the connection matrix. What can we conclude from this? We can say for certain that if

\[ 2^{n_i} > n_i n_g + \frac{n_g^2 - n_g}{2}, \]

then not every function of $n_i$ inputs can be implemented with a circuit comprised of $n_g$ or fewer gates. This follows from the fact that the term on the left represents the number of rows in the truth table and that the term on the right represents the number of connection points in the connection matrix. So, if $2^{\mathrm{LHS}} > 2^{\mathrm{RHS}}$ then the number of possible functions is greater than the number of possible circuits.

B. Notation for Logic Functions

In order to succinctly refer to different logic functions, we have adopted a notational system that derives directly from the function’s truth table output column, which forms a binary number that is written as a hexadecimal subscript (Fig. 8). The all-zero input row is the position of the least significant bit of the binary function number.

C. Random Generation of Circuits

A 10-gate NOR connection matrix was evaluated, yielding a 75-bit chromosome and an implementation space of $3.8 \times 10^{22}$ circuits. Since the implementation spaces are too large to evaluate exhaustively, $10^7$ random connection matrices were created for each case and the resulting circuit functions are plotted as a histogram in Fig. 9. Function FFF was the most prevalent function generated, accounting for 50 percent of circuits. The output of this function is always 1, regardless of the input state. Two of the four functions not to appear in any of the $3 \times 10^7$ random circuits were F96 and F69, the 3-input odd and even functions. Our reasoning as to why circuits for these two functions should be so rare is presented in Section V-A.

Figure 9 Distribution of 3-input logic functions within the implementation space defined by circuits with a maximum of ten NOR gates. (Histogram of fraction of total, on a logarithmic scale from 1.0 down to $10^{-7}$, versus logic function from F00 to FFF, with F96 marked.)

V. CIRCUIT SYNTHESIS EXPERIMENTS

We performed synthesis experiments on two logic functions: (1) a three-bit odd parity function, and (2) a two-bit magnitude comparator. The implementation technology in all cases was a library composed only of NOR gates with the gate cost in basic cells (BCs) being computed as the gate’s fanin plus one. For each function we took the cheaper of either the sum-of-products or the product-of-sums to use as a baseline

Figure 10 Three-input odd parity function F96. (a) Synopsys Design Compiler-generated circuit: cost = 28 BCs. (b) GA-generated circuit: cost = 24 BCs.

implementation. Then using the Synopsys Design Compiler,1 representing the state-of-the-art in commercial multilevel synthesis, we synthesized a minimum-cost circuit for each function. Finally, we synthesized circuits for each function using the GA-based system shown in Fig. 1. For both functions, the least costly circuits were produced by the genetic algorithm.

Figure 11 Cost vs. crossover count of surviving chromosomes for parity circuit synthesis. (Cost in BCs plotted against crossover count ×$10^5$; the strata are labeled legitimate circuits, 1 error, and 2 errors.)

A. Odd Parity Function

The odd parity function is particularly difficult to minimize. This can be seen by viewing the function plotted onto a Karnaugh map: The checkerboard pattern of an equal number of 1s and 0s has no adjacent terms that can be grouped for minimization. Each minterm of the odd parity function is thus an essential prime implicant that in turn is a function of all the input variables.

Fig. 10a shows the multilevel circuit synthesized by the Synopsys Design Compiler. It is possible to identify two subfunctions within the 11-gate network: Gates g1–g5 as well as gates g6–g11 each form a two-input even parity function. The second even parity function, however, has a more expensive implementation than the first. Overall, the circuit has a delay of seven gate-levels, a maximum gate fanin of two, a maximum gate fanout of two, and a cost of 28 BCs.

Fig. 10b shows the multilevel circuit synthesized by the genetic algorithm. Overall, the circuit has a delay of five gate-levels, a maximum gate fanin of three, a maximum gate fanout of two, and a cost of 24 BCs.

Fig. 11 shows a plot of circuit cost vs. crossover count of surviving chromosomes for the odd parity function. The stratification of the plot is due to the penalty cost function for circuits that fail to satisfy all rows of the function’s truth table. The bottom stratum represents the costs of circuits that properly implement the odd parity function. The next stratum up (costs around 100 BCs) represents circuits that satis-

Figure 12 Two-bit greater-than comparator function. (a) Synopsys Design Compiler-generated circuit: cost = 21 BCs. (b) GA-generated circuit: cost = 20 BCs.

fied all but one row of the odd parity truth table. Successive strata are due to additional errors in satisfying the function.

B. Magnitude Comparator Function

The magnitude comparator is an arithmetic function that compares the magnitudes of two unsigned two-bit binary integers, $(i_1 i_2)$ and $(i_3 i_4)$. The function output is 1 when $(i_1 \cdot 2 + i_2 \cdot 1) > (i_3 \cdot 2 + i_4 \cdot 1)$.

The multilevel circuit synthesized by the Synopsys Design Compiler (Fig. 12a) required eight gates, but only 21 BCs. Overall, the maximum delay was five gate levels with a maximum fanin of three and a maximum fanout of four.

Fig. 12b shows the least costly circuit found by the genetic algorithm. Overall, the maximum delay was five gate levels with a maximum fanin of three, a maximum fanout of two, and a cost of 20 BCs.

Fig. 13 shows a plot of circuit cost vs. crossover count of surviving chromosomes for the magnitude comparator function. As with Fig. 11, the stratification of the plot is due to the penalty cost function for circuits that fail to satisfy all

1 Version 1997.01-44683 – May 27, 1997


REFERENCES

[1] H. de Garis, “Genetic programming: Artificial nervous systems, artificial embryos and embryological electronics,” Schwefel and Manner, eds., Parallel Problem Solving from Nature: Proc. of PPSN I, pp. 117–123, Springer-Verlag, 1991.
[2] T. Higuchi, T. Niwa, T. Tanaka, H. Iba, H. de Garis, and T. Furuya, “Evolving hardware with genetic learning: A first step towards building a Darwin machine,” Proceedings of the Second International Conference on the Simulation of Adaptive Behavior (SAB92), MIT Press 1993.
[3] H. Hemmi, J. Mizoguchi, and J. Shimohara, “Development and evolution of hardware behaviors,” Brooks and Maes, eds., Artificial Life IV, pp. 371–376, MIT Press, 1994.
[4] M. Murakawa, S. Yoshizawa, I. Kajitani, and T. Higuchi, “On-line adaptation of neural networks with evolvable hardware,” Proceedings of the Seventh International Conference on Genetic Algorithms, Morgan Kaufmann, pp. 792–799, 1997.
[5] A. Thompson, “Silicon evolution,” J.R. Koza et al., eds, Genetic Programming 1996: Proceedings of the First Annual Conference (GP96), pp. 444–452, MIT Press, 1996.
[6] S. Chattopadhyay, S. Roy, and P.P. Chaudhuri, “Synthesis of highly testable fixed-polarity AND-XOR canonical networks—A genetic algorithm-based approach,” IEEE Transactions on Computers, vol. 45, no. 4, pp. 487–490, April 1996.
[7] T. Sasao and P. Besslich, “On the complexity of MOD-2 sum PLAs,” IEEE Transactions on Computers, vol. 39, no. 2, pp. 262–266, 1990.
[8] M. Karnaugh, “A map method for synthesis of combinatorial logic circuits,” Transactions of the AIEE, Communications and Electronics, vol. 72, part I, pp. 593–599, November 1953.
[9] W. V. Quine, “The problem of simplifying truth functions,” American Math. Monthly, vol. 59, pp. 521–531, October 1952.
[10] E. J. McCluskey Jr., “Minimization of boolean functions,” Bell System Technical Journal, vol. 35, no. 6, pp. 1417–1444, November 1956.
[11] D.D. Gajski, Principles of Digital Design, Prentice Hall, 1997.
[12] R. Brayton, G.D. Hachtel, and A. Sangiovanni-Vincentelli, “Multilevel logic synthesis,” Proceedings of the IEEE, vol. 78, no. 2, pp. 264–300, February 1990.
[13] J. Darringer, W. Joyner, L. Berman, and L. Trevillyan, “Logic synthesis through local transformations,” IBM Journal of Research and Development, vol. 25, no. 4, pp. 272–280, July 1981.
[14] J. Darringer, D. Brand, J. Gerbi, W. Joyner, and L. Trevillyan, “LSS: A system for production logic synthesis,” IBM Journal of Research and Development, vol. 28, no. 5, pp. 537–545, September 1984.
[15] K. Bartlett, W. Cohen, A.J. De Geus, and G.D. Hachtel, “Synthesis of multilevel logic under timing constraints,” IEEE Transactions on Computer-Aided Design of Integrated Circuits, CAD-5, no. 4, pp. 582–595, October 1986.
[16] R. Brayton, R. Rudell, A. Sangiovanni-Vincentelli, and A. Wang, “MIS: A multiple-level logic optimization system,” IEEE Transactions on Computer-Aided Design of Integrated Circuits, CAD-6, no. 6, pp. 1062–1081, November 1987.
[17] S. Devadas, A.R. Wang, A.R. Newton, and A. Sangiovanni-Vincentelli, “Boolean decomposition in multilevel logic optimization,” IEEE Journal of Solid State Circuits, vol. 24, no. 2, pp. 399–408, April 1989.
[18] J. H. Holland, “Adaptation in Natural and Artificial Systems,” University of Michigan Press, 1975. (Second edition: MIT Press, 1992.)
[19] D. E. Goldberg, “Genetic Algorithms in Search, Optimization, and Machine Learning,” Addison-Wesley, 1989.
[20] M. Mitchell, “An Introduction to Genetic Algorithms,” MIT Press, 1996.
[21] K.A. De Jong and W.M. Spears, “Using genetic algorithms to solve NP-complete problems,” Proceedings of the Third International Conference on Genetic Algorithms, Morgan Kaufmann, pp. 124–132, 1989.
[22] Mitsubishi Electric Corporation. ‘94 Mitsubishi Semiconductor Data Book—CMOS Gate Array 0.8 µm, 1993 (in Japanese).
[23] B. Shackleford, E. Okushi, M. Yasuda, H. Koizumi, K. Seo, and T. Iwamoto, “Hardware framework for accelerating the execution speed of a genetic algorithm,” IEICE Transactions on Electronics, vol. E80-C, no. 7, pp. 962–969, July 1997.

Figure 13 Cost vs. crossover count of surviving chromosomes for magnitude comparator circuit synthesis. In this experiment, 10⁶ crossovers define an epoch, after which the inverse logic function is sought. (Axes: cost in BCs vs. crossover count ×10⁶; labeled strata: legitimate circuits, 1 error, 2 errors.)
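The stratified costs and the epoch-based inversion described for Fig. 13 can be illustrated with a short Python sketch. It is not the authors' implementation. The penalty weight of 100 BCs per failed truth-table row is a guess read off the plot's strata, and the names `comparator_truth_table`, `penalized_cost`, and `target_for` are hypothetical. A candidate circuit is modeled simply as a callable with a separately supplied gate cost. The sketch also assumes that "inverse logic function" means the complemented output, and that the target alternates between the function and its complement from one epoch to the next.

```python
from itertools import product

# Assumed penalty weight: the plot's strata suggest roughly 100 BCs per failed
# truth-table row, but the actual constant is not stated in this section.
ERROR_PENALTY_BCS = 100


def comparator_truth_table():
    """Map (i1, i2, i3, i4) to 1 when 2*i1 + i2 > 2*i3 + i4, else 0."""
    return {
        bits: int(2 * bits[0] + bits[1] > 2 * bits[2] + bits[3])
        for bits in product((0, 1), repeat=4)
    }


def invert(table):
    """Complement every output of a truth table."""
    return {bits: 1 - out for bits, out in table.items()}


def penalized_cost(circuit, gate_cost_bcs, target):
    """Return (cost, errors): gate cost plus a fixed penalty per wrong row."""
    errors = sum(circuit(*bits) != out for bits, out in target.items())
    return gate_cost_bcs + ERROR_PENALTY_BCS * errors, errors


def target_for(crossover_count, base_table, epoch_length=10**6):
    """Alternate between the function and its complement each epoch."""
    epoch = crossover_count // epoch_length
    return base_table if epoch % 2 == 0 else invert(base_table)


if __name__ == "__main__":
    table = comparator_truth_table()

    def exact(i1, i2, i3, i4):
        return int(2 * i1 + i2 > 2 * i3 + i4)

    def high_bits_only(i1, i2, i3, i4):
        # Ignores the low-order bits, so it fails when i1 == i3 and i2 > i4.
        return int(i1 > i3)

    print(penalized_cost(exact, 20, table))           # (20, 0)
    print(penalized_cost(high_bits_only, 20, table))  # (220, 2)
```

A penalty larger than the cost of any legitimate circuit keeps every erroneous candidate above the bottom stratum, which is consistent with the banding described for the plot.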

rows of the function’s truth table. The bottom stratum represents the costs of circuits that properly implement the function.

In contrast to the odd parity circuit synthesis experiment, where the cost function was unchanged throughout a run, we explored a technique in which the target logic function was inverted every 10⁶ crossovers. This is similar to the genetic algorithm technique known as punctuated equilibria. The purpose of this technique is to force genetic diversity into the population.

VI. CONCLUSION

The problem of minimal-cost, multilevel logic circuit synthesis has been formulated as a genetic algorithm (GA). For two trial functions, the GA produced lower-cost circuits than a state-of-the-art commercial logic synthesis system. The lowest-cost circuit for the first function, a three-bit odd parity function, cost 24 basic cells (BCs) versus 28 BCs for the commercial system. The second function, a two-bit magnitude comparator, was also less costly at 20 BCs versus 21 BCs. However, the major drawback of the GA-based method is its poor temporal performance when implemented on a general-purpose computer. By implementing the chromosome evaluation circuit in hardware and then integrating it into a previously described GA hardware framework [23], the performance of the GA can be accelerated by three to four orders of magnitude.

VII. ACKNOWLEDGMENT

We are grateful to Mr. Michio Komoda of the Semiconductor Group at Mitsubishi Electric Corporation for his assistance in performing the Synopsys logic synthesis experiments.
