Exploiting the Selfish Gene Algorithm for Evolving Hardware Cellular Automata
F. Corno, M. Sonza Reorda, G. Squillero
Politecnico di Torino, Dipartimento di Automatica e Informatica, Torino, Italy
http://www.cad.polito.it/
Abstract- Testing is a key issue in the design and production of digital circuits, and the adoption of Built-In Self Test techniques is increasingly popular. This paper shows an application in the field of Electronic CAD of the Selfish Gene algorithm, an evolutionary algorithm based on a recent interpretation of the Darwinian theory. A three-phase optimization algorithm is exploited for determining the structure of a Built-In Self Test architecture that is able to achieve good fault coverage results with a reduced area overhead. Experimental results show that the attained fault coverage is substantially higher than what can be obtained by previously proposed methods with comparable area requirements.
1 Introduction
This paper presents a new approach to the identification of a Built-In Self Test structure based on Cellular Automata that is able to achieve good fault coverage results with a reduced area overhead. We adopt a three-phase algorithm which exploits the Selfish Gene algorithm [10] in its optimization tasks instead of a more traditional GA, and this allows us to obtain good results, effectively halving the area occupation with respect to previous techniques. Section 2 summarizes the Selfish Gene algorithm. Section 3 introduces some basic concepts about Cellular Automata and Built-In Self Test. Section 4 describes the proposed method and reports some experimental results. Section 5 draws some conclusions.
0-7803-6375-2/00/$10.00 ©2000 IEEE.
2 Background
2.1 Selfish Gene
The Selfish Gene algorithm (SG) [8] [10] is an evolutionary optimization algorithm based on the interpretation of the Darwinian theory given by the English biologist Richard Dawkins. In the selfish gene biological theory, the population can be simply seen as a pool of genes where the number of individuals and their specific identities are not of interest. Therefore, the SG resorts to a statistical characterization of the population, by representing and evolving some statistical parameters only. Evolution proceeds in discrete steps: two groups of individuals, called clans [10], are isolated from the original population; then, the two clans are evolved separately for some time and their champions are selected; finally, the two champions are compared in a tournament and the winner's offspring is allowed to spread back into the original population. As the biological theory suggests, while evaluation is performed at the phenotypic level, selection is performed at the level of genes. Clans simulate the mechanism of allopatric speciation: new species arise in very small populations that become isolated from their parental group. Speciation in these small groups is very rapid by evolutionary standards. Clan champions are the result of an isolated evolution that amplifies gene linkage: adopting this speciation, the algorithm tests linkages between different genes and propagates the useful ones back to the main population. In the SG, an individual is identified by its N genes. The list of genes is called the genome and a position in the genome is termed a locus. Each locus $l$ can be occupied by several different genes, and the candidates are called the gene alleles. Alleles are indicated with $a_{li}$ ($i = 1 \ldots n_l$), where $l$ is the locus and $n_l$ is the number of genes competing to occupy locus $l$. Each allele may appear more or less frequently in the individuals composing the population. Let $f_{li}$ be the frequency of allele $a_{li}$ in the population. We will use the term polarization to indicate the genetic variability inside a population. A population where, in each locus $l$, the frequency of a given gene $p_l$ is $f_{l p_l} = 1$ while all its alleles have a frequency of zero is said to be completely polarized. On the contrary, when all frequencies are set to $f_{li} = 1 / n_l$, the population is completely non-polarized.
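The statistical representation of the population can be illustrated with a short Python sketch. It is not the authors' implementation: clans and mutation are omitted, and the fixed reward/penalty step `eps`, as well as all function names, are assumptions made for illustration. Each locus holds a list of allele frequencies; two individuals are sampled from these frequencies, compared in a tournament, and the winner's alleles gain frequency at the expense of the loser's.

```python
import random

def non_polarized(allele_counts):
    """Completely non-polarized population: f_li = 1/n_l at every locus."""
    return [[1.0 / n] * n for n in allele_counts]

def fully_polarized(allele_counts, chosen):
    """Completely polarized population: frequency 1 for one allele per locus."""
    return [[1.0 if i == c else 0.0 for i in range(n)]
            for n, c in zip(allele_counts, chosen)]

def sample_individual(freqs, rng):
    """Draw a genome: at each locus, pick an allele index by its frequency."""
    return [rng.choices(range(len(f)), weights=f)[0] for f in freqs]

def tournament_step(freqs, fitness, eps, rng):
    """One simplified SG step (assumed update rule; no clans, no mutation)."""
    a = sample_individual(freqs, rng)
    b = sample_individual(freqs, rng)
    winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
    for locus, (w, l) in enumerate(zip(winner, loser)):
        if w == l:
            continue
        # Move frequency mass from the losing allele to the winning one,
        # never driving a frequency below zero.
        delta = min(eps, freqs[locus][l])
        freqs[locus][w] += delta
        freqs[locus][l] -= delta
    return winner
```

Because frequency mass is only moved between alleles of the same locus, the frequencies of each locus keep summing to one. Starting from `non_polarized(...)` or `fully_polarized(...)` corresponds to the two extreme initial states discussed in this section.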
It should be noted that a completely polarized population is not composed of identical individuals, because the SG mechanism of mutation always provides a certain amount of variability. Different loci may exhibit different levels of polarization. At the end of the evolution process, the population would certainly be highly polarized. The final frequency of a gene measures, in some sense, its success in its battle for
occupying the locus. However, at the beginning of the evolution process, the frequency assumes a different meaning. A completely non-polarized initial population lets the SG start from an unspecified state. Harik, Lobo and Goldberg in 1998, analyzing a simplified algorithm on binary encoded problems, demonstrated that representing the population as a probability distribution is operationally equivalent to the order-one behavior of the simple GA with uniform crossover [12]. Thus, we can intuitively maintain that the SG behavior is GA-like when the initial state is completely non-polarized. On the other hand, starting from a completely polarized population makes the SG initially behave like a Random Mutation Hill Climber (RMHC). However, after the first generations, the population will likely become less polarized, and the SG will gradually differentiate from a pure RMHC. At the end of the evolution process, the population will probably be polarized around a different, and hopefully better, point in the solution space, and therefore the SG will progressively become more and more RMHC-like again. It is possible to select different initial frequencies for each locus. Therefore, it is possible to shape the behavior of the SG on each dimension of the solution space independently, starting the search process from a partially defined solution. This powerful characteristic is one of the main reasons for choosing the SG for this particular optimization problem. Further details about the SG algorithm are available in [8], while biological motivations are analyzed in [9]. Clans were introduced in [10] as an extension to the algorithm for dealing with more complex fitness landscapes.
2.2 Cellular Automata
Cellular Automata (CA) are finite state machines defined as uniform arrays of cells, sometimes named sites, in an n-dimensional space [21].
A CA evolves in discrete steps, with the next value of one cell determined by its value and that of a set of cells called the neighbor cells. The function used to compute the value of a cell is called the cell rule. When different cells implement different rules, the CA is hybrid (HCA). Otherwise it is said to be uniform. A rule is linear if it can be expressed as a set of linear operations (e.g., additions over GF(2): EXOR gates). A linear CA is a CA where only linear rules are used. CA that are both linear and hybrid are often referred to as LHCA. Sometimes, the cell rule is denoted as a number that stands for the decimal representation of the truth table of the function.
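The numeric naming of rules can be made concrete with a small Python sketch. It assumes the usual numbering for three-input rules, in which the inputs (left, self, right) form a 3-bit index into the binary expansion of the rule number; the function names are illustrative only.

```python
def rule_output(rule, left, self_, right):
    """Next cell value under a three-input rule given as a number 0..255."""
    index = (left << 2) | (self_ << 1) | right
    return (rule >> index) & 1

def rule_number(func):
    """Rule number of a Boolean function func(left, self, right)."""
    number = 0
    for index in range(8):
        left, self_, right = (index >> 2) & 1, (index >> 1) & 1, index & 1
        number |= func(left, self_, right) << index
    return number

def is_linear(rule):
    """True if the rule is an EXOR of a subset of its inputs, i.e. it is
    additive over GF(2): f(a xor b) == f(a) xor f(b) for all input triples."""
    bit = lambda index: (rule >> index) & 1
    return all(bit(a ^ b) == bit(a) ^ bit(b)
               for a in range(8) for b in range(8))
```

For example, `rule_number(lambda l, s, r: l ^ r)` gives 90 and `rule_number(lambda l, s, r: l ^ s ^ r)` gives 150, and both rules pass `is_linear`.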
2.3 Built-In Self Test
Testing an electronic device aims at detecting faults possibly introduced during the manufacturing process [1]. The test is usually performed by applying a test pattern (an appropriate
sequence of stimuli, generated by an Automatic Test Pattern Generator) to the Unit Under Test (UUT) and comparing its output response to the expected one. Normally, special devices called Automatic Test Equipments (ATEs) are used to feed the UUT and to analyze the response. The cost of these devices, which must be able to apply high volumes of stimuli at a high speed, heavily impacts the cost and economic feasibility of the test process. On the contrary, in a Built-In Self Test (BIST) design, the generation and application of input stimuli, as well as the analysis of the resulting response, are embedded in the system. Self test is an attractive perspective and BIST architectures are increasingly popular in the electronic CAD field.
Figure 1: Typical Architecture of a BIST circuit.
Figure 1 shows the typical architecture of a BIST circuit. Four blocks are added to the original circuit: the Input Pattern Generator (IPG) generates the test pattern to be applied to the UUT; the Output Response Evaluator (ORA) compares the UUT response to the expected one and eventually generates the Good/Faulty (G/F) output; a multiplexer (MUX) switches UUT inputs from the external Primary Inputs to the internal IPG-generated ones; the BIST Controller, activated by a Normal/Test (N/T) input, manages the whole system. The effectiveness of a BIST architecture is usually evaluated considering which percentage of faults it is able to detect in the UUT (the Fault Coverage) and how much area the new blocks introduce into the system. If the UUT does not contain any memory element, it is said to be combinational. In this case, input stimuli generated by the IPG do not need to be applied in a given order because the response to an input does not depend on previously applied ones. Alternatively, the UUT may contain
memory elements and behave like a Finite State Machine (FSM). In this case, it is called sequential and an ordered sequence of input stimuli is needed. The complexity of the IPG depends on the type of UUT. For a combinational UUT, a pseudo-random generator usually suffices. Several improvements to the basic structure were proposed in the CAD literature, but mainly to reduce the length of the test or the area overhead [1]. On the other hand, designing an IPG for a sequential UUT is still an open problem. Storing a predetermined sequence in a ROM would achieve good fault coverage, but this BIST architecture is merely theoretical because the area overhead would be unacceptable. On the contrary, even weighted pseudo-random IPGs with multiple weight sets are not usually able to achieve acceptable fault coverage. Commonly, in BIST architectures sequential UUTs are transformed into combinational ones. This transformation is performed by replacing memory elements with more complex cells that behave as memory elements during normal operations but act as inputs and outputs during test [17] [16]. However, there are several reasons not to modify the UUT. First, with this strategy it is hard to test the memory elements themselves. Additionally, transforming memory elements into more complex cells may introduce critical delays inside the circuits. Moreover, in the case of proprietary cores, test engineers may not have access to the UUT design and, therefore, are prevented from modifying it.
2.4 Built-In Self Test and Cellular Automata
CA have been studied by hardware designers as both IPG [13] [14] and ORA [19]. For testing combinational UUTs, CA have been proposed as pseudo-random IPGs. The most frequently used architecture is a linear hybrid CA: the 90/150 LHCA [4], which is composed of cells implementing either rule 90 (two-input EXOR gate) or rule 150 (three-input EXOR gate). 90/150 LHCA can produce maximum-length pseudo-random sequences [4] while exhibiting very low cross-correlation between bits [13]. Different architectures were analyzed in [14]. CA have also been proposed for reproducing a set of unordered input vectors [2] [18]. However, although these methods are effective for combinational UUTs, they are not useful for sequential ones. When designing an IPG for a sequential UUT, an unordered set of stimuli is almost useless. Previous attempts to use CA to reproduce ordered input vectors [2] limited themselves to proving the difficulty of attaining any useful result. Experimental data we gathered confirm that it is nearly impossible to embed a given sequence in a CA if it is longer than some tens of vectors. The problem was partially solved in the past [7] by means of a Genetic Algorithm that, instead of trying to reproduce a given sequence, computed an optimal set of CA rules directly interacting with a fault simulator. This approach was shown to generate satisfactory results in terms of fault coverage [5], but area overheads were higher than competing approaches because it exploits CA cells with four possible states, requiring two flip-flops each. To reduce area overhead, in [6] the same CA was used both as IPG and as ORA. However, for some classes of circuits this approach causes a significant decrease in the attained fault coverage. The first attempt to exploit the Selfish Gene polarization effects was presented in [11], where the SG evolves a CA to be used as ORA, starting from the optimal 90/150 LHCA. However, in [11] the clan mechanism was not implemented.
3 Evolved Hardware Cellular Automata
This paper proposes a new methodology for evolving the optimal CA-based IPG in a BIST architecture for a sequential UUT. The next section describes the proposed BIST architecture. Sections 3.2 to 3.5 detail the optimization algorithm.
3.1 Architecture
The architecture presented here implements the IPG with a hybrid one-dimensional CA with cyclic boundary conditions and 1-bit cells. The CA has as many cells as the number of circuit inputs. The output of each cell is connected to one input of the UUT through the MUX. Thus, the test pattern applied to the UUT can be seen as a matrix of $N \times n_{PI}$ cells, where $N$ is the length of the test and $n_{PI}$ is the number of UUT primary inputs. The output of the i-th cell drives the i-th UUT input and forms the i-th column of the test pattern. Each cell $S_i$ (e.g., Figure 2) is fed with the outputs of the two adjacent cells $S_{i-1}$ and $S_{i+1}$ and with its previous value. Cell rules are generic three-input Boolean functions selected in the space of all possible $2^{(2^3)} = 256$ different rules. Let $f_i(l, s, r)$ denote the rule implemented by the i-th cell ("l" for "left neighbor", "r" for "right neighbor" and "s" for "self").
3.2 Overall Algorithm
The overall algorithm is composed of three different steps: a preliminary analysis, a first optimization step and a second optimization step. Both optimization steps exploit the SG algorithm, but while the latter takes advantage of polarization and clans, the former utilizes the SG in its simplest form.
The analysis phase examines a test sequence produced by an Automatic Test Pattern Generator to determine the optimal percentage of 1's in each column of the test pattern. Then it selects for each CA cell the set of rules that, according to equation (1), honor the percentage of 1's of the given pattern. Rule $f_i(l, s, r)$ honors a given percentage $\bar{p}_i$ of 1's when $\bar{p}_i + \varepsilon \ge p_i \ge \bar{p}_i - \varepsilon$ ($\varepsilon$ is usually set to 0.05). This analysis is, in some way, similar to the one performed for implementing a weighted pseudo-random IPG for combinational UUTs. Although this analysis is less significant for sequential UUTs, experimental results indicate that it can improve performance in the following steps.
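A possible reading of this selection step is sketched below in Python. Two points are assumptions rather than statements of the paper: the conditional probability of a rule, given the two neighbor values, is obtained by averaging over the cell's own value with the target probability of that cell, and the neighbor probabilities are taken from the target percentages of the adjacent columns (with cyclic wrap-around). Function names are illustrative.

```python
def ones_percentage(pattern):
    """Fraction of 1's in each column of a test pattern (list of rows)."""
    n_rows = len(pattern)
    return [sum(row[i] for row in pattern) / n_rows
            for i in range(len(pattern[0]))]

def cond_prob(rule, v_l, v_r, p_self):
    """P(next value = 1 | left = v_l, right = v_r), averaging over the
    cell's own value (assumed to be 1 with probability p_self)."""
    out_self0 = (rule >> ((v_l << 2) | v_r)) & 1
    out_self1 = (rule >> ((v_l << 2) | 2 | v_r)) & 1
    return (1 - p_self) * out_self0 + p_self * out_self1

def predicted_ones(rule, p_left, p_self, p_right):
    """Probability of a 1 as a weighted sum over the four neighbor
    configurations, in the spirit of equation (1)."""
    total = 0.0
    for v_l in (0, 1):
        for v_r in (0, 1):
            w_l = p_left if v_l else 1 - p_left
            w_r = p_right if v_r else 1 - p_right
            total += w_l * w_r * cond_prob(rule, v_l, v_r, p_self)
    return total

def candidate_rules(targets, eps=0.05):
    """For each cell, the rules whose predicted percentage of 1's lies
    within eps of the target percentage of that column."""
    n = len(targets)
    return [[rule for rule in range(256)
             if abs(predicted_ones(rule, targets[(i - 1) % n], targets[i],
                                   targets[(i + 1) % n]) - targets[i]) <= eps]
            for i in range(n)]
```

With all targets equal to 0.5, rules 90 and 150 are among the candidates of every cell, while the constant rules 0 and 255 are excluded.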
Figure 2: Architecture of the cell $S_i$
3.4 Phase 2: First optimization
The first optimization step aims at selecting a CA able to produce a maximum-length test pattern using only rules that honor the optimal 1's percentage. Since the optimization is performed evaluating the CA for 1,000 clock cycles only, by maximum-length test pattern we mean a test pattern composed of $\min(2^{n_{PI}} - 1, 1000)$ different vectors. This task is performed exploiting the SG algorithm in its basic form. The SG is left to evolve for 50,000 generations starting from a completely non-polarized population. The fitness function simply counts how many states the CA goes through before entering a loop. Since the UUT is not simulated, the required CPU time is not significant compared to the last phase.
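The fitness function of this phase can be sketched as follows in Python. The rule encoding (inputs left, self, right forming a 3-bit index into the rule number), the choice of seed, and the function names are assumptions for illustration; the cap of 1,000 clock cycles follows the text.

```python
def ca_step(state, rules):
    """One clock cycle of a hybrid 1-D CA with cyclic boundary conditions."""
    n = len(state)
    return tuple(
        (rules[i] >> ((state[(i - 1) % n] << 2)
                      | (state[i] << 1)
                      | state[(i + 1) % n])) & 1
        for i in range(n))

def states_before_loop(seed, rules, max_cycles=1000):
    """Number of distinct states visited before the CA repeats a state,
    evaluated for at most max_cycles clock cycles."""
    seen = set()
    state = tuple(seed)
    while state not in seen and len(seen) < max_cycles:
        seen.add(state)
        state = ca_step(state, rules)
    return len(seen)
```

A CA made only of rule 204 (each cell keeps its value) scores 1, while a CA made only of rule 170 (each cell copies its right neighbor) on four cells seeded with a single 1 scores 4.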
Figure 3: Overall algorithm
3.3 Phase 1: Analysis
The analysis phase aims at reducing the search space in the following phases by selecting a subset of $n_s$ rules from the 256 possible ones for each CA cell. Let $p_i$ be the probability that the i-th cell contains the value 1, and let $p_{f_i \mid l=v_l, r=v_r}$ denote the conditional probability that the next value of the i-th cell is 1 when it is programmed with rule $f_i$, given that its left and right neighbor cells hold values $v_l$ and $v_r$, respectively. The probability $p_i$ can therefore be expressed as:
$$
\begin{aligned}
p_i ={} & p_{i-1} \cdot p_{i+1} \cdot p_{f_i \mid l=1, r=1} \\
 & + (1 - p_{i-1}) \cdot p_{i+1} \cdot p_{f_i \mid l=0, r=1} \\
 & + p_{i-1} \cdot (1 - p_{i+1}) \cdot p_{f_i \mid l=1, r=0} \\
 & + (1 - p_{i-1}) \cdot (1 - p_{i+1}) \cdot p_{f_i \mid l=0, r=0}
\end{aligned}
\qquad (1)
$$
3.5 Phase 3: Second optimization
The second optimization phase aims at selecting the optimal CA for implementing the IPG in the BIST architecture. In this phase, the algorithm exploits both polarization and the clan mechanism. The population is polarized by assigning, in each CA cell, a frequency of $f = 0.25$ to the rule selected in the second phase (i.e., the rule that generates a maximum-length pattern honoring the 1's percentage). All other rules selected in the first phase are given a frequency equal to $f = 0.50 / (n_s - 1)$. Thus, in the starting population 75% of individuals will be CA with rules that honor the 1's percentage of a good test pattern. All remaining rules are given a cumulative frequency of 0.25 (see Figure 4). The feedback factor was set to $10^{\dots}$ for intra-clan conflicts and to $10^{\dots}$ for inter-clan ones to speed up the evolution process. The clan similarity factor, i.e., the indicator of how much a clan is isolated from the whole population, depends on the size of the CA through an inverse exponential function. As a result, the number of rules that are left completely free to evolve in clans ranges from 3 to 10.
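The initial polarization of Section 3.5 can be written down for a single CA cell as in the following Python sketch. The even split of the residual 0.25 among the remaining rules is an assumption (the text only gives the cumulative value), and the function name and error handling are illustrative.

```python
def initial_frequencies(phase2_rule, phase1_rules, n_rules=256):
    """Initial allele frequencies for one CA cell (one locus of the genome).

    phase2_rule: the rule chosen by the first optimization (frequency 0.25).
    phase1_rules: the n_s rules selected by the analysis phase; the others
    among them share 0.50, i.e. 0.50 / (n_s - 1) each.
    All remaining rules share 0.25 (assumed to be split evenly).
    """
    selected = set(phase1_rules)
    if phase2_rule not in selected or not 2 <= len(selected) < n_rules:
        raise ValueError("need 2 <= n_s < n_rules and phase2_rule among them")
    n_s = len(selected)
    n_rest = n_rules - n_s
    freqs = []
    for rule in range(n_rules):
        if rule == phase2_rule:
            freqs.append(0.25)
        elif rule in selected:
            freqs.append(0.50 / (n_s - 1))
        else:
            freqs.append(0.25 / n_rest)
    return freqs
```

For example, with `phase2_rule=90` and `phase1_rules=[90, 150, 60, 102, 105]`, rule 90 starts at 0.25, each of the other four selected rules at 0.125, and the 251 remaining rules share the last 0.25.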
Figure 4: Rule polarization (25% Phase 2 rule; 50% all other Phase 1 rules; 25% all other rules)
4 Experimental Results
We implemented the described algorithm in C and ran it on a Sun Ultra 5/333 with 256 Mbytes of memory. The ISCAS'89 circuits [3], and the ones known as Addendum to the ISCAS'89 benchmark set¹, have been used to evaluate the effectiveness of our approach. In Table 1 we give some preliminary experimental results: for each circuit, we first designed the optimal 90/150 LHCA according to [4], and then computed the fault coverage attained by fault simulating the sequence obtained by making the CA evolve for 100,000 clock cycles. Results obtained by the Evolved Hardware CA selected using the methodology presented in this paper are reported in the last section of the table. Column "FC%" reports the fault coverage attained by simulating the whole BIST architecture for 10,000 clock cycles. We limit our BIST to one tenth of the length used for the 90/150 LHCA to increase its performance and because no significant improvements in the attained fault coverage are usually found after the first 10,000 vectors. However, although the BIST circuit runs for only 10,000 clock cycles, for efficiency reasons the fitness computation function considers only the first 1,000 vectors generated by each CA. The third step of the algorithm executes 5,000 iterations, unless 100% fault coverage is reached earlier, and the best result evaluated is returned. CPU times are reported in the last column and range from a few hours to about 20 hours per experiment. They are mostly due to fault simulation; the CPU time used in the first and second phases of our algorithm is negligible. The analysis of these results shows that the Selfish Gene algorithm is effectively able to improve, in most cases, on the industry-standard solution based on the 90/150 LHCA. The generated CA has approximately the same area as the 90/150 LHCA, therefore designers are provided with a much better fault coverage at no additional hardware cost.
On the other hand, when compared with [11], the proposed method always reaches equal or better fault coverage, showing the effectiveness of the new three-phase approach and the usefulness of the clan mechanism.
Table 1: column groups 90/150 LHCA, [11] and EHCA; reported quantities FC% and CPU [h].
5 Conclusions
We described a three-phase evolutionary algorithm based on the Selfish Gene algorithm for the solution of a crucial problem in the field of Electronic CAD: the identification of the best structure for a CA in charge of generating the input vectors within a BIST structure. The approach makes use of three different optimization phases and exploits the specific characteristics of the SG: the ease of dealing with multi-valued genomes where different loci are defined over different domains, and the possibility of behaving in a hill-climbing fashion on only some axes of the search space, starting the search process from a partially defined solution. Experimental results show that the method is able to identify very good solutions: with the generated CA it is possible to reach a fault coverage much higher than the one obtained with standard engineering practice and higher than the one obtained with a simple one-phase evolutionary optimization algorithm equally based on the Selfish Gene algorithm [11]. This work demonstrates the effectiveness of the three-phase approach and the efficacy of the Selfish Gene algorithm in a real industrial problem. Testing of embedded circuits is made possible with a BIST approach that does not require any intervention from the outside beyond test activation and result gathering.
¹ These benchmark circuits can be downloaded from the CAD Benchmarking Laboratory at the address http://www.cbl.ncsu.edu/benchmarks/
Acknowledgments
The authors wish to thank Denis Vironda for implementing the algorithm and performing the experiments.
References
[1] M. Abramovici, M. A. Breuer, A. D. Friedman: Digital systems testing and testable design, Computer Science Press, New York (USA), 1990
[2] S. Boubezari, B. Kaminska, “A Deterministic Built-In Self-Test Generator Based on Cellular Automata Structures,” IEEE Trans. on Comp., Vol. 44, No. 6, June 1995, pp. 805-816
[3] F. Brglez, D. Bryant, K. Kozminski, “Combinational profiles of sequential benchmark circuits,” International Symposium on Circuits And Systems, 1989, pp. 1929-1934
[4] K. Cattell, S. Zhang, “Minimal Cost One-Dimensional Linear Hybrid Cellular Automata of Degree Through 500”, JETTA, Journal of Electronic Testing, Theory and Applications, Kluwer, 1995, pp. 255-258
[5] S. Chiusano, F. Corno, P. Prinetto, M. Sonza Reorda, “Cellular Automata for Sequential Test Pattern Generation”, VTS’97: IEEE VLSI Test Symposium, Monterey CA (USA), April 1997, pp. 60-65
[6] F. Corno, N. Gaudenzi, P. Prinetto, M. Sonza Reorda, “On the Identification of Optimal Cellular Automata for Built-In Self-Test of Sequential Circuits”, VTS’98: 16th IEEE VLSI Test Symposium, Monterey, California (USA), April 1998
[7] F. Corno, P. Prinetto, M. Sonza Reorda, “A Genetic Algorithm for Automatic Generation of Test Logic for Digital Circuits”, IEEE International Conference On Tools with Artificial Intelligence, Toulouse (France), November 1996
[8] F. Corno, M. Sonza Reorda, G. Squillero, “The Selfish Gene Algorithm: a New Evolutionary Optimization Strategy”, SAC’98: 13th Annual ACM Symposium on Applied Computing, Atlanta, Georgia (USA), February 1998, pp. 349-355
[9] F. Corno, M. Sonza Reorda, G. Squillero, “A New Evolutionary Algorithm Inspired by the Selfish Gene Theory”, ICEC’98: IEEE International Conference on Evolutionary Computation, May 1998, pp. 575-580
[10] F. Corno, M. Sonza Reorda, G. Squillero, “Optimizing Deceptive Functions with the SG-Clans Algorithm”, CEC’99: 1999 Congress on Evolutionary Computation, Washington DC (USA), July 1999, pp. 2190-2195
[11] F. Corno, M. Sonza Reorda, G. Squillero, “Evolving Cellular Automata for Self-Testing Hardware,” to be published at 3rd International Conference on Evolvable Systems: From Biology to Hardware (ICES-2000), Edinburgh (UK), April 2000
[12] G. R. Harik, F. G. Lobo, D. E. Goldberg, “The Compact Genetic Algorithm,” 1998 IEEE International Conference on Evolutionary Computation, 1998, pp. 323-327
[13] P. D. Hortensius, R. D. McLeod, W. Pries, D. M. Miller, H. C. Card, “Cellular Automata-Based Pseudorandom Number Generators for Built-In Self-Test,” IEEE Trans. on Computer-Aided Design, Vol. 8, No. 8, August 1989, pp. 842-859
[14] P. D. Hortensius, R. D. McLeod, B. W. Podaima, “Cellular Automata Circuits for Built-In Self Test,” IBM Journal of Research and Development, Vol. 34, No. 2/3, March 1990, pp. 389-405
[15] JETTA, Journal of Electronic Testing, Theory and Applications, Special Issue on Partial Scan Methods, Volume 7, Numbers 1/2, August/October 1995
[16] B. Konemann, J. Mucha, G. Zwiehoff, “Built-In Logic Block Observation Technique,” Proc. IEEE International Test Conference, October 1979, pp. 37-41
[17] A. Krasniewski, S. Pilarski, “Circular Self-Test Path: A low-cost BIST Technique for VLSI circuits,” IEEE Trans. on CAD, Vol. 8, No. 1, January 1989, pp. 46-55
[18] J. van Sas, F. Catthoor, H. De Man, “Cellular Automata Based Deterministic Self-Test Strategies for Programmable Data Paths,” IEEE Transaction on Computer-Aided Design, Vol. 13, No. 7, July 1994, pp. 940-949
[19] M. Serra, T. Slater, J. C. Muzio, D. M. Miller, “The Analysis of One-Dimensional Linear Cellular Automata and Their Aliasing Properties,” IEEE Transaction on Computer-Aided Design, Vol. 9, No. 7, July 1990, pp. 767-778
[20] T. Toffoli, N. Margolus, “Cellular Automata Machines: A New Environment for Modeling,” MIT Press, Cambridge (USA), 1987
[21] S. Wolfram, “Statistical Mechanics of Cellular Automata,” Rev. Mod. Phys. 55, 1983, pp. 601-644