Efficient Design Validation Based on Cultural Algorithms

Weixin Wu and Michael S. Hsiao
Dept. of Electrical & Computer Engineering, Virginia Tech, Blacksburg, VA
{wuw, mhsiao}@vt.edu

Abstract

We introduce a new semi-formal design validation framework to justify hard-to-reach corner-case states. We propose a cultural learning technique to identify the swarming of domain knowledge during the search. In addition, our guidance strategy abstracts sets of partitioned state variables, from which pre-images are computed to capture the expanded portions of the state space related to a target state. Experimental results show that our approach is more effective at reaching hard-to-reach states than existing methods.

1. Introduction

Simulation-based design validation has been widely used because of its scalability and its ease of application to a wide range of circuits. However, generating vectors that cover corner cases in complex hardware designs remains hard. Formal techniques using reachability analysis and model checking [1] can, in theory, analyze all corner cases, but they do not scale well to large designs. To mitigate the weaknesses of the two approaches, researchers have proposed several hybrid techniques that attempt to combine and complement the strengths of formal methods and simulation. Among these, abstraction-guided simulation has been a particularly promising technique, in which formal methods are applied to an abstract model of the original circuit, and the resulting analysis is used to guide the simulator towards a target state of interest. The authors of [2] were early pioneers of such an approach. Most research in this direction uses abstract pre-images computed from the abstract target states as an approximate distance metric to guide the simulation engine towards concrete target states [3, 4, 5]. In [3], high-level design information is used to identify modules that closely interact with the property being verified and to build an abstract circuit. Pre-images of the target property are computed on this abstract circuit, and the resulting distance information is used to guide a simulator towards the target in a greedy manner. One drawback of this approach is that the simulator can get stuck in a local optimum (a dead-end state) because of inaccuracies in the abstract pre-image distances.

The authors of [4] propose abstraction refinement as a way to overcome the problems caused by inaccurate distance information. In their work, the abstract circuit is refined iteratively, i.e., some of the abstracted state variables are restored so that the abstract circuit more closely resembles the concrete circuit, until the simulator is able to reach the target. However, with each refinement, the cost of performing formal analysis on the circuit grows exponentially. The authors of [5] address the local-optima problem with a guidance strategy. At each abstract pre-image ring (each onion ring), a set of previously visited states is stored in a "bucket". The guidance strategy randomly adjusts the balance between search and backtrack: the approach starts from the closest (lowest-numbered) onion ring with a non-empty bucket and flips a fair coin; heads means continuing simulation from a random state in that bucket, and tails means moving on to the next non-empty bucket. While this approach avoids some local optima, hard-to-reach states remain difficult to reach.

1.1 Motivation Behind Our Approach

Previous work on abstraction-guided simulation usually couples the guiding mechanism with a random vector generator. Clearly, the random vector generator could be replaced with something more intelligent. Evolutionary algorithms such as genetic algorithms (GAs) [6] can replace the random approach and have shown some success in simulation-based ATPG, e.g., [7]. However, a GA may be regarded as a "local heuristic" in the sense that it does not use global knowledge: each chromosome (individual) in the GA does not share its knowledge with others during the search. We believe that a more intelligent search strategy can yield far better results. We propose a new search engine based on the concept of a cultural algorithm [8]. Our approach operates in two spaces: the population space and the belief space. The population space is similar to a GA, containing generations of individuals (test vectors). In the belief space, the domain knowledge acquired by the swarm of individuals across generations is analyzed and stored. In our work, this knowledge consists of two types of bit patterns extracted from the population space via data mining. The domain knowledge guides the generation of new individuals, allowing our approach to rapidly evolve input vectors that converge to the target state.

We also propose a guidance strategy that avoids dead-end states without incurring the exponential cost increases of abstraction refinement. Our method utilizes the concept of state space decomposition [9]. It directly extracts several partitioned sets of state variables from the gate-level netlist such that they are highly correlated to the target state.

∗ Supported in part by NSF Grants 0305881 and 0417340, and SRC Grant 2005TJ-1359.001.


The pre-images for each partition set are then computed. Since the variables in these partition sets are highly correlated to the target state, the pre-images represent the abstract behavior of a diverse expanse of the target state space, thus providing a more accurate distance metric than a single abstract pre-image. As a result, we do not need to refine the abstract models. Moreover, the abstract model retains only the state variables of interest, abstracting the rest as pseudo primary inputs. The sizes of the partitioned sets are limited, so the computation and storage costs of this method are small, which makes our approach scalable to large circuits. Experimental results show that our framework is more effective at reaching hard-to-reach states than other methods.

The rest of the paper is organized as follows. Section 2 provides the preliminaries for this work, including genetic algorithms, cultural algorithms, data mining, and state variable abstraction. Section 3 details the proposed framework and its components. Experimental results are reported in Section 4, and Section 5 concludes the paper.

2. Preliminaries

2.1. Genetic Algorithms

Genetic algorithms (GAs) [6] belong to the class of evolutionary computing. Problems are solved via an evolutionary process that results in a best (most fit) solution (survivor). The basic terminology is as follows. An individual is a candidate solution for the problem at hand, and a population is a set of individuals. Each individual is associated with a fitness, which measures the quality of that individual for solving the problem. Selection, crossover, and mutation are the typical GA operations used to reproduce a new population from an existing one, and a generation is produced each time a new population is reproduced from the previous one via these operations. A population of individuals evolves over a number of generations using selection, crossover, and mutation, and the average fitness of a generation is expected to improve over the previous generations. The evolutionary process ends when the desired target is achieved or a preset amount of resources has been exhausted.

2.2. Cultural Algorithms

Cultural algorithms were developed by Robert G. Reynolds as a complement to the metaphor used by evolutionary algorithms, which had focused mainly on genetic and natural selection concepts [8]. Cultural algorithms operate at two levels: (1) a micro-evolutionary level, which consists of the genetic material that an offspring inherits from its parents, and (2) a macro-evolutionary level, which consists of the knowledge acquired by the individuals through generations. This knowledge is used to guide the behavior of the individuals that belong to a certain population.

Figure 1 illustrates the basic framework of a cultural algorithm, which operates in two spaces: the Population Space and the Belief Space. First, the individuals in the Population Space are evaluated with a performance function obj(). An acceptance function accept() then determines which individuals are to impact the Belief Space. The experiences of those chosen elites are used to update the knowledge/beliefs of the Belief Space via the function update(), which represents the evolution of beliefs. Next, the beliefs are used to influence the evolution of the population: new individuals are generated under the influence of the beliefs. The two feedback paths of information, one through the accept() and influence() functions, and the other through individual experience and the obj() function, create a system of dual inheritance of both population and belief. The population component and the belief space interact with and support each other, in a manner analogous to the evolution of human culture.

Figure 1. Basic framework of cultural algorithms: the Population Space and the Belief Space are connected through the obj(), accept(), update(), influence(), select(), and generate() functions.
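To make this dual-inheritance loop concrete, the following minimal sketch shows one way the accept(), update(), and influence() functions can be wired together. It is written in Python purely for illustration (our engine itself is implemented in C++), and the bit-counting objective and the particular accept/update/influence rules below are toy choices of this illustration, not the functions used in Section 3.

    import random

    def obj(ind):                      # performance function: count of 1-bits (toy objective)
        return sum(ind)

    def accept(population, k=4):       # acceptance: keep the k fittest individuals as elites
        return sorted(population, key=obj, reverse=True)[:k]

    def update(belief, elites):        # belief: per-bit frequency of 1s among the elites
        n = len(elites[0])
        return [sum(e[i] for e in elites) / len(elites) for i in range(n)]

    def influence(belief, n_bits):     # generate a new individual biased by the beliefs
        return [1 if random.random() < belief[i] else 0 for i in range(n_bits)]

    def cultural_algorithm(n_bits=16, pop_size=20, generations=30):
        population = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
        belief = [0.5] * n_bits        # initially uninformative beliefs
        for _ in range(generations):
            elites = accept(population)
            belief = update(belief, elites)
            population = [influence(belief, n_bits) for _ in range(pop_size)]
        return max(population, key=obj)

    if __name__ == "__main__":
        best = cultural_algorithm()
        print(best, obj(best))

The two feedback paths appear here as the accept()/update() path into the belief vector and the influence() path back into the population.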

2.3. Data Mining

Data mining has been defined as "the nontrivial extraction of implicit, previously unknown, and potentially useful information from data" [10]. Mining frequent patterns plays an essential role in many data mining tasks, which try to find interesting patterns in databases, such as association rules, correlations, sequences, and clusters. Among all data mining techniques, association rule mining is one major sub-area. An association rule is the most common form of local-pattern discovery; it finds interesting patterns in a database that probably cannot be articulated explicitly. Association rule mining retrieves all highly correlated relations in the database.

First, we give the notation commonly used in association rule mining. Let I = {x1, ..., xn} be a set of distinct items. A set X ⊆ I with |X| = k is called a k-itemset. Let D be a multi-set of transactions T, where T ⊆ I. A transaction T supports an itemset X if X ⊆ T. The fraction of transactions in D that support X is called the support of X, denoted supp(X). An itemset X is called frequent if its support exceeds a given minimal support threshold, i.e., supp(X) ≥ minsupp. An association rule is an implication X ⇒ Y, where X, Y ⊆ I and X ∩ Y = ∅. The support of an association rule is supp(X ⇒ Y) = supp(X ∪ Y), and every association rule is assigned a confidence conf(X ⇒ Y) = supp(X ∪ Y)/supp(X). For the purpose of association rule generation, it suffices to find all frequent itemsets. The search space of all itemsets can be represented by a subset lattice.

Figure 2 shows an example lattice over the set of items I = {x1, x2, x3, x4}. The bold line is an example of an actual support boundary and separates the frequent itemsets in the upper part from the infrequent ones in the lower part. The task of discovering all frequent itemsets is quite challenging: the search space grows exponentially with |I|, so it is not practical to compute the support of every subset of I.

Figure 2. The lattice for the itemsets over I = {x1, x2, x3, x4}.

Instead, the downward closure property of itemset support is employed by most association rule mining algorithms to traverse the lattice as little as possible.

Theorem 1 (Downward Closure Property): Given a transaction database D over I, let X, Y ⊆ I be two itemsets. Then X ⊆ Y ⇒ supp(Y) ≤ supp(X).

Hence, if an itemset is infrequent, all of its supersets must be infrequent. With this pruning, the search space is drastically reduced. Apriori [11] is a very efficient algorithm for mining association rules; it combines breadth-first search with counting of candidate occurrences. Using the downward closure property of itemset support, Apriori prunes the search space as follows: for a candidate itemset K, examine all of its subsets of size |K| − 1; whenever at least one of those subsets is infrequent, prune K without counting its support. For example, applying Apriori to the itemsets I in Figure 2, after 2 mining iterations we obtain the infrequent 2-itemsets {x2, x3} and {x3, x4}. In the 3-itemset mining iteration, we only need to count the support of the 3-itemset {x1, x2, x4}; every other 3-itemset has an infrequent subset and can be pruned directly.
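The following Python sketch illustrates the Apriori idea with downward-closure pruning; the transaction database at the bottom is an invented toy example, and practical implementations use far more efficient candidate generation and counting.

    from itertools import combinations

    def apriori(transactions, minsupp):
        """Return all frequent itemsets (as frozensets) with support >= minsupp."""
        n = len(transactions)
        support = lambda items: sum(1 for t in transactions if items <= t) / n
        singletons = {frozenset([x]) for t in transactions for x in t}
        frequent = {}
        level = {s for s in singletons if support(s) >= minsupp}
        k = 1
        while level:
            frequent.update({s: support(s) for s in level})
            # candidate (k+1)-itemsets are unions of frequent k-itemsets ...
            candidates = {a | b for a in level for b in level if len(a | b) == k + 1}
            # ... pruned whenever one of their k-subsets is infrequent (downward closure)
            candidates = {c for c in candidates
                          if all(frozenset(s) in frequent for s in combinations(c, k))}
            level = {c for c in candidates if support(c) >= minsupp}
            k += 1
        return frequent

    if __name__ == "__main__":
        db = [{"x1", "x2", "x4"}, {"x1", "x2"}, {"x1", "x4"}, {"x2", "x4"}, {"x1", "x2", "x4"}]
        for itemset, s in sorted(apriori(db, 0.4).items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
            print(sorted(itemset), round(s, 2))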

2.4. State Variable Abstraction

An efficient state variable abstraction should capture the necessary transitions that lead to a target state. For example, consider the state transition graph in Figure 3, which has 4 state variables (FF1, FF2, FF3, FF4). The pre-images of two state variable partition sets, (FF1, FF2) and (FF3, FF4), are shown in Figure 4. It is clear that the partition set (FF3, FF4) constitutes a better abstract distance metric. After a set of logic simulations, all states that reach the target state are stored in a database; Table 1 shows a portion of such a database. If we mine frequent itemsets from this database, the important partition set (FF3, FF4) will be captured.

Figure 3. State transition graph from the initial state to the target state.

Figure 4. Pre-images of the partition sets (FF1, FF2) and (FF3, FF4).

Table 1. Mining database

          FF1   FF2   FF3   FF4
    v1     0     0     0     0
    v2     0     1     0     0
    v3     1     0     0     0
    v4     1     1     0     0
    ...   ...   ...   ...   ...
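As a small illustration of how mining the database of Table 1 singles out (FF3, FF4), the sketch below treats each (state variable, logic value) pair as an item and scores every variable pair by its best-supported joint assignment. The rows are exactly the Table 1 example, and the pairwise scoring is a simplification of the full frequent-itemset mining used in Section 3.2.1.

    from itertools import combinations

    # Rows of Table 1: logic values of (FF1, FF2, FF3, FF4) in states that reached the target.
    rows = [
        (0, 0, 0, 0),   # v1
        (0, 1, 0, 0),   # v2
        (1, 0, 0, 0),   # v3
        (1, 1, 0, 0),   # v4
    ]
    ffs = ["FF1", "FF2", "FF3", "FF4"]

    def support(assignment):
        """assignment: dict {ff_index: value}; fraction of rows matching every entry."""
        hits = sum(1 for r in rows if all(r[i] == v for i, v in assignment.items()))
        return hits / len(rows)

    for i, j in combinations(range(4), 2):
        best = max(support({i: vi, j: vj}) for vi in (0, 1) for vj in (0, 1))
        print(ffs[i], ffs[j], best)
    # (FF3, FF4) scores 1.0: the pair is fixed at (0, 0) in every target-reaching state,
    # so frequent-itemset mining retains it as a partition set.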

3. The Proposed Approach

3.1. Overall Framework

As our aim is to reach corner cases of the design more efficiently, we formulate a framework based on cultural algorithms in which data mining is used to identify the swarmed intelligence among the individuals across generations. Data mining is also used to compute the abstraction sets of state variables. The basic flow of our approach is shown in Figure 5. In the abstraction portion, 10000 randomly generated vectors are logic-simulated from an arbitrary starting state, and the logic values of the state variables obtained from the simulation form the mining database. The mining algorithms are then applied to the database to extract state variable partition sets that are highly correlated to reaching the target states. Next, the pre-images of each partition set are computed via a satisfiability (SAT) engine. These sets of pre-images are used as abstract distance metrics to guide the simulation engine towards the target state.

Once the partitions are formed, a cultural algorithm based logic simulator is guided from the initial state(s) in the following manner. Every time the GA evolves the population for 3 generations, the cultural algorithm evaluates selected individuals in the population space. The evaluation process learns two types of bit patterns in the input vectors, favored patterns (FPs) and less-favored patterns (LPs), via data mining. These patterns are updated into the belief space, where they influence the generation of descendant individuals over the next 3 generations.

Figure 5. Our framework: the abstraction engine takes the concrete model and the target state, randomly simulates 10000 vectors, mines state variable partition sets, and computes pre-images for each partition set; the cultural algorithm based simulation engine generates the first population of PI vectors, maintains a priority queue, and iterates until the target states are reached.

3.2. Abstraction engine

3.2.1 Partitioning state variables

In our abstraction approach, each input vector assigns a logic value of 1 or 0 to all primary inputs (PIs) and state variables. After 10000 logic simulations, all input sequences that reach the target state are recorded in the mining database as follows: the columns denote the state variables, the rows denote the input vectors, and the entries denote logic values. An example mining database is shown in Table 1. Unlike the general data mining setting, in which an item is either included in or excluded from a transaction, in our work both logic values (0 and 1) must be considered. The Apriori algorithm is employed to mine the frequent itemsets from the database, and these itemsets form the abstracted state partition sets. Generally, a larger partition set represents a less abstract circuit, and the pre-images calculated from it more accurately resemble the state transitions of the concrete model; in the extreme case where the partition set includes all state variables, the calculated pre-images are the complete pre-images of the concrete circuit. However, large partition sizes exponentially increase the cost of pre-image computation and lead to state explosion. Meanwhile, a partition set that is too small may be insufficient to capture the paths necessary to transition to the target state. Multiple small, parallel partition sets provide a more accurate distance metric while keeping the computation and storage costs of the pre-images small; Figure 6 illustrates the idea. In our experiments, the mining support threshold is set to minsupp = 0.2, and we perform 8 mining iterations. The partitioned state sets are chosen from the mined 5-itemsets to 8-itemsets such that their overlap is less than half of their size.

Figure 6. One vs. multiple parallel partition sets: a single large set of abstract state variables causes state explosion in the N-timeframe ILA, whereas multiple small sets of abstract state variables reach a fixed point.
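A sketch of how the partition sets might be chosen from the mined frequent itemsets is given below. The size bounds (5 to 8 variables) and the overlap requirement come from the text, but the greedy selection order, the overlap test against the smaller set, and the cap on the number of sets are simplifying assumptions of this illustration.

    def select_partition_sets(frequent_itemsets, max_sets=10):
        """Greedily pick 5- to 8-variable itemsets whose pairwise overlap is
        less than half the size of the smaller set (a simplified reading of the rule)."""
        candidates = [s for s in frequent_itemsets if 5 <= len(s) <= 8]
        candidates.sort(key=len, reverse=True)          # prefer larger (less abstract) sets
        chosen = []
        for cand in candidates:
            if all(len(cand & c) < min(len(cand), len(c)) / 2 for c in chosen):
                chosen.append(cand)
            if len(chosen) == max_sets:
                break
        return chosen

    if __name__ == "__main__":
        mined = [frozenset(f"FF{i}" for i in range(a, a + k))
                 for a in (1, 4, 7, 10) for k in (5, 6, 8)]     # hypothetical mined itemsets
        for s in select_partition_sets(mined):
            print(sorted(s))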

3.2.2 Computing pre-images

After extracting N state variable partition sets M = {m1, m2, ..., mN}, we compute the abstract pre-images for each partition set using a satisfiability (SAT) engine. The abstract pre-images are computed to a fixed point for each partition set mi. This is feasible because the size of each partition set is limited to a maximum of 8 variables, so there can be at most 256 distinct abstract states in the abstract pre-images. The lower graph in Figure 6 helps to illustrate the procedure. The computation of the abstract pre-images for partition set mi is tightened by constraining the variables that appear in the other partition sets mj (j = 1, ..., N, j ≠ i) to values in their respective abstract pre-images. The entire abstract pre-image computation is iterated twice over the N partition sets at each time-frame, so as to tighten the abstract pre-images of each partition set.
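The structure of this computation can be sketched as below. The SAT-based pre-image query itself is abstracted into a callback, so only the fixed-point and two-pass tightening loops described above are shown; the toy demonstration at the bottom uses an artificial one-partition "decrement" transition purely to exercise the loop.

    def tighten_preimages(partitions, preimage, target_cubes, max_rings=256):
        """Compute onion-ring pre-images for every partition set, constraining each
        set's computation with the rings already known for the other sets.

        partitions   : identifiers m_1 ... m_N
        preimage     : callable(partition, ring, other_rings) -> set of abstract states
                       (a SAT query in the paper; an abstract callback here)
        target_cubes : dict partition -> abstract target states (ring 0)
        """
        rings = {m: [set(target_cubes[m])] for m in partitions}
        for _ in range(2):                                  # iterate twice, as in the text
            for m in partitions:
                others = {o: rings[o] for o in partitions if o != m}
                while len(rings[m]) < max_rings:
                    new_states = preimage(m, rings[m][-1], others) - set().union(*rings[m])
                    if not new_states:                      # fixed point reached for this set
                        break
                    rings[m].append(new_states)
        return rings

    if __name__ == "__main__":
        def toy_preimage(m, ring, others):                  # toy abstract transition: s -> s - 1
            return {v + 1 for v in ring if v < 3}
        print(tighten_preimages(["m1"], toy_preimage, {"m1": {0}}))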

3.3. Vector Generation

Our vector generator is based on a cultural algorithm engine. Algorithm 1 shows the pseudo-code of our framework.

Algorithm 1 Cultural algorithm based logic simulator
    initial population = random PI vectors
    initial priority queue = reset/initial state
    initial belief space = empty
    while !(target reached) and !(timeout) do
        for every 3 populations do
            start state = top of priority queue
            for every population do
                for each individual in the population do
                    simulate individual and record each visited state
                    if dist(visited state) ≤ dist(start state) then
                        push visited state to the priority queue
                    end if
                    calculate fitness value of this individual
                end for
                perform selection/crossover operations
                apply knowledge in belief space to repair operation
                evolve a new generation of population
            end for
            evaluate each population
            update input pattern FPs and LPs in belief space
        end for
        if no progress for certain cycles then
            start state = next state of priority queue
        end if
    end while

An individual is a sequence of primary input vectors (a test sequence). The length L of an individual is set equal to the structural sequential depth, where the sequential depth is defined as the minimum number of state variables on a path between the PIs and the furthest gate. The population size is set to P = 16 × √L. The initial population of individuals is randomly generated, and the search starts from an initial (reset) state. For every 3 populations, a best reached state is chosen. The cultural algorithm evaluates all 3 populations and updates the belief space with the newly learned knowledge, and the knowledge in the belief space is applied to the GA's repair operation when generating the next 3 populations. Our cultural learning framework is shown in Figure 7.
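For concreteness, a minimal Python sketch of the individual encoding and population sizing is shown below; the sequential depth and PI count passed in at the bottom are placeholder values, not taken from any benchmark.

    import math
    import random

    def make_population(seq_depth, num_pis, rng=random.Random(0)):
        """An individual is a sequence of L PI vectors, with L equal to the structural
        sequential depth; the population size is P = 16 * sqrt(L)."""
        L = seq_depth
        P = int(16 * math.sqrt(L))
        new_individual = lambda: [[rng.randint(0, 1) for _ in range(num_pis)]
                                  for _ in range(L)]
        return [new_individual() for _ in range(P)]

    if __name__ == "__main__":
        pop = make_population(seq_depth=25, num_pis=8)      # placeholder circuit parameters
        print(len(pop), "individuals, each a sequence of", len(pop[0]), "PI vectors")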

Figure 7. Cultural algorithm framework: the GA evolves 3 populations at a time; Evaluate() and Update() place the learned FPs and LPs into the belief space, and Repair() applies them to the next 3 populations.

3.3.1 Fitness of individuals

The fitness value of an individual is determined by the best state s reached by the individual during logic simulation, as given by

$fit_s = \sum_{i=1}^{N} (M_i - d_i)$    (1)

where N is the number of partition sets, M_i is the maximum distance value of the i-th set, and d_i is the distance of the ring onto which the state s maps in the i-th set of abstract pre-images. Figure 8 illustrates an example of the fitness calculation. The fitness value is essentially the estimated abstract distance from the state to the target; the best state s is the state with the smallest abstract distance to the target state, i.e., the maximal fitness value.

Figure 8. Estimated distance of a concrete state: the concrete state (e.g., 1010) is mapped onto the pre-image rings of each partition set (set 1 through set N), and the per-set distances are combined into the fitness (e.g., Fitness = 2 + ... + 1).
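Equation (1) translates almost directly into code. In the sketch below, each partition set is represented by a projection function and its list of pre-image rings; the 4-bit state, the projections, and the ring contents are toy values chosen to mirror Figure 8, not data from a real circuit.

    def abstract_distance(abstract_state, rings):
        """Index of the pre-image ring that the projected state falls into;
        states outside all rings get the maximum distance."""
        for d, ring in enumerate(rings):
            if abstract_state in ring:
                return d
        return len(rings)

    def fitness(concrete_state, partition_sets):
        """Equation (1): fit_s = sum_i (M_i - d_i) over all partition sets."""
        total = 0
        for project, rings in partition_sets:
            M = len(rings)                               # maximum distance value of this set
            d = abstract_distance(project(concrete_state), rings)
            total += M - d
        return total

    if __name__ == "__main__":
        # Two toy partition sets over a 4-bit state string such as "1010".
        set1 = (lambda s: s[0] + s[3], [{"11"}, {"10"}, {"00", "01"}])
        setN = (lambda s: s[2:],       [{"11"}, {"10", "01"}, {"00"}])
        print(fitness("1010", [set1, setN]))             # larger value = closer to the target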

3.3.2 Selection, crossover and repair operations

We use a fitness-proportionate selection procedure: individuals are randomly selected from the current generation of the population, with more fit individuals having a greater probability of being selected. For two selected individuals, the center point of each is chosen as the crossover point to generate two new individuals.

The repair operation fixes bad genes (bits) in new individuals. The two types of input patterns (FPs and LPs) from the belief space are applied in the repair phase: the bits of each individual must be consistent with the FPs, while no LP is allowed to occur. For example, suppose an individual is a 4-bit vector (b1, b2, b3, b4), an FP is (x, 1, 0, x), and an LP is (1, x, x, 1), where x denotes a don't-care bit. Consider a new individual (1, 1, 1, 1). In the repair phase, the logic value of b3 is flipped to satisfy the FP; then one bit among b1 and b4 is randomly selected and its logic value flipped to eliminate the LP. After repair, the new individual is (0, 1, 0, 1).
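The repair step can be sketched as follows; the code reproduces the 4-bit example above, with None standing for a don't-care position. Possible conflicts between overlapping FPs and LPs are not handled here, which is a simplification of this illustration rather than a property of our engine.

    import random

    X = None   # don't-care position in a pattern

    def matches(bits, pattern):
        return all(p is X or b == p for b, p in zip(bits, pattern))

    def repair(bits, fps, lps, rng=random.Random(0)):
        """Force every favored pattern (FP), then break every less-favored pattern (LP)."""
        bits = list(bits)
        for fp in fps:                          # make the individual consistent with each FP
            for i, p in enumerate(fp):
                if p is not X:
                    bits[i] = p
        for lp in lps:                          # an LP must not occur: flip one of its fixed bits
            if matches(bits, lp):
                i = rng.choice([j for j, p in enumerate(lp) if p is not X])
                bits[i] ^= 1
        return bits

    if __name__ == "__main__":
        fp = (X, 1, 0, X)                       # the FP from the example above
        lp = (1, X, X, 1)                       # the LP from the example above
        print(repair([1, 1, 1, 1], [fp], [lp])) # e.g., [0, 1, 0, 1]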

3.3.3 Evaluating populations

In the evaluation phase, the mean fitness μ_fit and the standard deviation σ_fit of a population are calculated as

$\mu_{fit} = \frac{1}{P} \sum_{i=1}^{P} fit_i$    (2)

$\sigma_{fit} = \sqrt{\frac{1}{P} \sum_{i=1}^{P} (fit_i - \mu_{fit})^2}$    (3)

where P is the size of the population. The population is then divided at the points (μ_fit − σ_fit/2) and (μ_fit + σ_fit/2) into 3 regions, as shown in Figure 9. The knowledge in both the favored region and the less-favored region is learned to update the belief space.

Figure 9. Evaluation of a population: individuals are plotted by fitness and partitioned into a less-favored region, a middle region, and a favored region.
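Equations (2) and (3) and the three-way split are straightforward to implement; the sketch below uses Python's statistics module and an invented list of fitness values.

    import statistics

    def split_population(individuals, fitnesses):
        """Split a population at mu - sigma/2 and mu + sigma/2 (Equations (2) and (3))
        into less-favored, middle, and favored regions."""
        mu = statistics.fmean(fitnesses)                 # Equation (2)
        sigma = statistics.pstdev(fitnesses)             # Equation (3), population std. dev.
        lo, hi = mu - sigma / 2, mu + sigma / 2
        less_favored = [ind for ind, f in zip(individuals, fitnesses) if f < lo]
        favored      = [ind for ind, f in zip(individuals, fitnesses) if f > hi]
        middle       = [ind for ind, f in zip(individuals, fitnesses) if lo <= f <= hi]
        return less_favored, middle, favored

    if __name__ == "__main__":
        pop = [f"ind{i}" for i in range(8)]
        fits = [2, 3, 3, 5, 6, 6, 9, 10]                 # invented fitness values
        lf, mid, fav = split_population(pop, fits)
        print(len(lf), len(mid), len(fav))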

3.3.4 Updating the belief space

Data mining is employed to extract the frequent patterns FPs and LPs from the favored and less-favored regions, respectively. The database D includes all individuals of a population, and D_f / D_l denotes the portion of D whose individuals reside in the favored / less-favored region. The mining support threshold is set to minsupp = 0.2, and mining is performed on D_f and D_l to retrieve the FPs and LPs. Only maximal frequent itemsets are kept, so there are no duplicate FPs or LPs. The confidence of an itemset X in the FPs or LPs is calculated as

$conf(X, D_f) = \frac{supp(X, D_f)}{supp(X, D)}$    (4)

$conf(X, D_l) = \frac{supp(X, D_l)}{supp(X, D)}$    (5)

The confidence threshold is set to 1.2, so the mined FPs and LPs are frequent only within their own region rather than in the entire database. This guarantees that the mined patterns are significant. Such patterns are updated into the belief space.
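The sketch below shows the region-confidence test of Equations (4) and (5) on a toy population. For brevity it enumerates only fixed-size bit patterns instead of mining maximal frequent itemsets, so it is a simplified stand-in for the actual update step.

    from itertools import combinations, product

    def supp(pattern, db):
        """pattern: dict {bit_index: value}; fraction of vectors in db that match it."""
        if not db:
            return 0.0
        return sum(all(v[i] == b for i, b in pattern.items()) for v in db) / len(db)

    def mine_region_patterns(D, region, minsupp=0.2, minconf=1.2, size=2):
        """Patterns that are frequent in a region and at least minconf times more
        frequent there than in the whole population (Equations (4)/(5))."""
        width = len(D[0])
        patterns = []
        for idxs in combinations(range(width), size):
            for vals in product((0, 1), repeat=size):
                p = dict(zip(idxs, vals))
                s_region, s_all = supp(p, region), supp(p, D)
                if s_region >= minsupp and s_all > 0 and s_region / s_all >= minconf:
                    patterns.append(p)
        return patterns

    if __name__ == "__main__":
        D  = [[1, 1, 0, 0], [1, 1, 0, 1], [0, 1, 0, 0],
              [0, 0, 1, 1], [1, 0, 1, 1], [0, 0, 1, 0]]   # toy population of 4-bit individuals
        Df = D[:3]                                        # individuals in the favored region
        print(mine_region_patterns(D, Df))                # candidate FPs for the belief space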

4. Experimental Results

We implemented the proposed approach in C++. Experiments were conducted on a number of benchmarks: ISCAS89, ITC99, a PicoJava processor from Sun [12], and two in-house designs, a Finite Impulse Response filter (fir) and an Infinite Impulse Response filter (iir). We ran the experiments on Linux machines with a 3.2 GHz Pentium 4 CPU and 2 GB of RAM. The target states are hard-to-reach corner states that were aborted by an ATPG; an aborted state is a state that the ATPG failed to justify within its resource limits.

Table 2. Results of large circuits (run times in seconds; TO = timeout at 10800 seconds; Onion-ring, GA, and CA are abstraction-guided)

    Circuit     # FFs   RND     ATPG     Onion-ring   GA     CA
    s35932      1728    TO      TO       4578         585    179
    s38584      1452    TO      TO       5271         2219   262
    b11         31      TO      7362     1944         656    48
    b12         121     TO      4675     932          204    52
    pjava-icu   415     TO      8691     3325         475    101
    pjava-iu    1244    TO      TO       8729         3505   348
    fir         202     TO      11034    1814         324    73
    iir         252     TO      TO       2623         1901   109

We chose 4 of the hardest-to-justify states for each benchmark and report the total run times. In the PicoJava processor, the target states are taken from both the Instruction Cache Unit (pjava-icu) and the Instruction Unit (pjava-iu). The run time limit was set to 3 hours. Table 2 compares our guided ATPG with other methods. For each circuit, the number of state variables is given first. Results for the random approach are reported next, followed by the results of a GA-based ATPG similar to [7]. The last three columns are abstraction-guided methods: an onion-ring based approach similar to [5], a genetic algorithm based guidance, and our cultural algorithm based guidance. The table shows that, in all instances, our cultural algorithm based search outperforms all other methods. Consider pjava-iu, which has 1244 flip-flops: both the random and the GA-based ATPG timed out, while among the three abstraction-guided methods, the onion-ring based strategy took 8729 seconds, the GA-based strategy took 3505 seconds, and our CA-based strategy took only 348 seconds, at least an order of magnitude improvement over the other techniques.

Looking into pjava-iu more closely, Figure 10 shows the improvement in the fitness values of the three abstraction-guided methods for one aborted state in pjava-iu. The x-axis denotes the number of simulation cycles and the y-axis denotes the fitness, which measures how close the current state is to the target state in terms of the computed abstract pre-images. As shown in the figure, the fitness of the CA-based method converges to the target state in a much smaller number of cycles than the other two methods. The trends are similar for the other circuits as well.

Figure 10. Fitness improvements for pjava-iu: fitness versus the number of simulation cycles for the CA, GA, and onion-ring guided methods.

5. Conclusion

We have presented a new and powerful framework based on cultural algorithms to justify hard-to-reach target states in sequential circuits. Data mining is used both to extract the intelligence of the swarm of individuals and to compute the partition sets that form the abstract model. This technique efficiently guides the search to converge to difficult target states. Another advantage of our approach is that no refinement of the abstract circuits is needed, which reduces the computational cost. The experimental results show that our technique is very effective at reaching hard-to-reach states compared with previous methods.

References

[1] J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J. Hwang. Symbolic model checking: 10^20 states and beyond. Proc. IEEE Symp. on Logic in Computer Science, 1990, pp. 428-439.
[2] C. H. Yang and D. L. Dill. Validation with guided search of the state space. Proc. Design Automation Conf., 1998, pp. 599-604.
[3] S. Shyam and V. Bertacco. Distance-guided hybrid verification with GUIDO. Proc. Conf. on Design, Automation and Test in Europe, 2006, pp. 1211-1216.
[4] K. Nanshi and F. Somenzi. Guiding simulation with increasingly refined abstract traces. Proc. Design Automation Conf., 2006, pp. 737-742.
[5] F. M. D. Paula and A. J. Hu. An effective guidance strategy for abstraction-guided simulation. Proc. Design Automation Conf., 2007, pp. 63-68.
[6] J. H. Holland. Adaptation in Natural and Artificial Systems. Ann Arbor, MI: University of Michigan Press, 1975.
[7] M. S. Hsiao, E. M. Rudnick, and J. H. Patel. Sequential circuit test generation using dynamic state traversal. Proc. European Conf. on Design and Test, 1997, pp. 22-28.
[8] R. G. Reynolds. An introduction to cultural algorithms. Proc. Conf. on Evolutionary Programming, 1994, pp. 131-139.
[9] H. Cho, G. D. Hachtel, E. Macii, B. Plessier, and F. Somenzi. Algorithms for approximate FSM traversal based on state space decomposition. IEEE Trans. on Computer-Aided Design of Integrated Circuits and Systems, Dec. 1996, pp. 1465-1478.
[10] W. Frawley, G. Piatetsky-Shapiro, and C. Matheus. Knowledge Discovery in Databases: An Overview. AI Magazine, 1992, pp. 213-228.
[11] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. I. Verkamo. Fast discovery of association rules. Advances in Knowledge Discovery and Data Mining, 1996, pp. 307-328.
[12] Sun Microsystems. PicoJava technology. http://www.sun.com/microelectronics/communitysource/picojava.

