Theory and Applications of Nonlinear Cellular Automata In VLSI Design
A report in partial fulfillment for the Degree of Doctor of Philosophy in Computer Science and Technology
Sukanta Das
Under the supervision of Dr. Biplab K Sikdar Asst. Professor Dept. of Computer Science and Technology Bengal Engineering And Science University, Shibpur
Department of Computer Science and Technology Bengal Engineering And Science University, Shibpur West Bengal, India – 711103 June, 2006
Department of Computer Science & Technology Bengal Engineering And Science University, Shibpur P.O. Botanic Garden, Howrah–711 103
CERTIFICATE OF APPROVAL
Certified that the Thesis entitled Theory and Applications of Nonlinear Cellular Automata In VLSI Design is a record of bonafide work carried out by Sukanta Das under my supervision and guidance. In my opinion, the thesis has fulfilled the requirements of the degree of Doctor of Philosophy in Computer Science and Technology of the Bengal Engineering And Science University, Shibpur. The work has reached the standard necessary for submission and to the best of my knowledge the results embodied in this thesis have not been submitted for the award of any other degree or diploma.
(Dr. Biplab K Sikdar) Asst. Professor Department of Computer Science & Technology Bengal Engineering And Science University, Shibpur Howrah - 711103, India
To those who are deprived of even primary education
Acknowledgment
No individual, however brilliant he or she might be, can create something without the help of the people around him or her. These very words I have learned from my life. I cannot really mention the names of all who have helped and guided me to prepare this thesis. It is the outcome of the co-operation, help, guidance and motivation of a lot of people.

At the very outset, I express my gratitude to my supervisor Dr. Biplab K Sikdar, who understands my strengths and weaknesses and my emotions, and who has motivated me towards research with proper guidance. In the same breath, I would like to express my gratitude to Prof. P Pal Chaudhuri. He has changed my attitude towards research and taught me to be disciplined in innovative research. I would like to specially thank my co-researchers Niloy-da, Sourav, Chandrama, Pradipta, Subhayan and Monalisa-di. They have been wonderful. In the early days Niloy-da guided me in carrying on the research activity. I would also like to express my heartfelt thanks to two members of our research team, Debdas and Anirban. Their cerebral contributions helped the thesis take a proper shape. I am grateful to Dr. Shipra Das (Bit), Asst. Professor, Department of Computer Science & Technology of this University, who guided me to find a new research problem. I have also received immense pleasure working with a number of students during this period.

I am grateful to the HODs of Computer Science & Technology of this University: Prof. P. K. Nandi, Prof. Amit Das and Prof. Uma Bhattacharya. I am specially grateful to Dr. H. Rahaman, Head, Department of Information Technology of this University, who has encouraged me to complete my thesis work. I would also like to thank Prof. S S Barat, Director, Purabi Das School of Information Technology, for his moral support. I thank all the laboratory staff and other staff for their support and service.

I would like to express my heartfelt respect to my father. He is the major architect of all my success. I would also like to express the same to my late jyathu (uncle), who inspired me to proceed to higher education, and my late dadu (grandfather). I would like to extend my gratitude to the two important ladies in my life: my mother, who always takes care of even my slightest inconvenience, and my didi (elder sister), who has been my friend, teacher and motivator from my childhood. I convey my deep gratitude to all my relatives, specially my jamaibabu for his continuous encouragement and moral support. Lastly, I would like to express my heartfelt thanks to my sunu-bunu (sister) and sunu-bhai (brother), without whom I cannot be complete in life.

It is my great pleasure to acknowledge the wishes of friends and well-wishers, both in academic and non-academic spheres. I convey my sincere thanks to them, in particular Debasis, Ashoke, Sujit, Arindam, Bapi, Kalyan, Kanika, Amiya, Uttam-da and Sudipta. They all inspired me to cross each hurdle. I thank the members of Bharatia Bigyan O Yuktibadi Samity, who have been beside me for the last 10 years and taught me a lot.

I must be thankful to all my teachers, who have built me up to reach here. Specially, I would like to express my deep respect to my great teacher Sri Hirendra Nath Laskar, and to Prof. Samir Roy. Lastly, I would like to extend my sincere thanks to those unknown persons who silently smooth my life by providing the services for basic needs.

Today I am feeling lucky. I, being a citizen of India, where most of the people cannot cross the boundary of primary education, am going to submit this thesis for the highest degree of a University. I do believe the prime reason behind this is not my intelligence but the socio-economic structure of our country. I, therefore, dedicate this thesis to those whom the socio-economic structure has deprived of even primary education.
Dated: June, 2006 Bengal Engineering And Science University, Shibpur ....................... Sukanta Das
Abstract
In recent years, Cellular Automata (CA) have emerged as an attractive modeling tool for various applications, such as pattern recognition, image processing, data compression, encryption and, especially, VLSI design and test. However, all such applications have utilized a special class of CA, called linear/additive CA. Since linear CA limit the search space, we may not reach the best result while searching for the solution to a problem. Nonlinear CA can be an alternative to linear/additive CA for achieving desired solutions in different applications. However, nonlinear CA are yet to be characterized well enough to fit the design for modeling an application. This thesis targets the characterization of nonlinear CA and utilizes the huge search space of nonlinear CA in developing applications in VLSI design.

The interconnections among the CA cells (CA rules) are completely classified for efficient synthesis of reversible cellular automata. An analytical framework is developed to explore the properties of CA rules for 3-neighborhood 1-dimensional CA. It is found that in two-state 3-neighborhood CA, the CA rules fall into 6 groups depending on their potential to form reversible CA. The proposed classification of CA rules enables synthesis of reversible CA in linear time.

An efficient design of Pseudo-Random Pattern Generators (PRPGs), based on nonlinear reversible CA, is also reported. The performance of the PRPG is evaluated with the battery of diehard tests. It is found that the proposed PRPG is the best among state-of-the-art designs in terms of its randomness quality. The structure of the proposed nonlinear CA based PRPG is utilized to design a cost-optimal Test Pattern Generator (TPG) for a CUT (Circuit Under Test). The TPG can avoid patterns prohibited to the CUT and can ensure better fault efficiency compared to existing designs. Further, we exploit the scalable structure of the nonlinear CA in designing TPGs for multiple cores without investing in separate hardware for the different TPGs.

The thesis reports a new BIST (Built-In Self-Test) structure, referred to as the UBIST (Universal BIST). UBIST can generate any one of the four kinds of test patterns: (i) pseudo-random, (ii) pseudo-exhaustive, (iii) pseudo-random without PPS (Prohibited Pattern Set), and (iv) deterministic.

Finally, the nonlinear CA theory is employed to address the issue of data services in cellular mobile networks. CA act as an efficient query processor, resulting in a hardwired solution to data service. The CA based query processor is found to be twice as fast as the state-of-the-art designs based on soft computation.
Contents

1 Introduction
  1.1 Motivation for the Research Undertaken in the Thesis
  1.2 Objectives of the Thesis
  1.3 Organization of the Thesis

2 A Survey On Cellular Automata And PRPGs
  2.1 Cellular Automata
    2.1.1 Early phase of development
    2.1.2 Simplification of cellular automata structure
    2.1.3 Exploring reversible cellular automata
  2.2 1-dimensional CA characterization
    2.2.1 Linear/Additive CA
    2.2.2 Nonlinear CA
  2.3 CA Applications
  2.4 Application of CA in VLSI domain
  2.5 Pseudo-Random Number Generation
    2.5.1 Pseudo-random number generators
    2.5.2 Pseudo-Random Pattern Generation with CA
    2.5.3 Application of PRPG in VLSI circuit testing
  2.6 Conclusion

3 Characterization of Nonlinear Cellular Automata
  3.1 Cellular Automata Basics
  3.2 Characterization of CA rules
    3.2.1 Rule Min Term (RMT)
    3.2.2 Reachability Tree
    3.2.3 Identification of group CA
    3.2.4 Synthesis of a Group CA
  3.3 Group rules
  3.4 Classification of group rules
    3.4.1 Formation of class
    3.4.2 Class relationship between Ri and Ri+1
  3.5 Characterization of irreversible CA
  3.6 Conclusion

4 Nonlinear Cellular Automata Based Pseudo-Random Pattern Generator
  4.1 Cellular Automata Revisited
  4.2 Requirements for the PRPG
  4.3 Randomness property in CA rules
    4.3.1 Local randomness in a CA cell
    4.3.2 Global randomness of a CA
  4.4 Synthesis of PRPG
  4.5 Randomness Quality of PRPG
  4.6 Evolution of PRPG
    4.6.1 Fitness Function
    4.6.2 Crossover
    4.6.3 Mutation
    4.6.4 Evolution of the design
  4.7 Conclusion

5 Design of On-Chip TPG around Nonlinear CA based PRPG
  5.1 PRPG based On-Chip TPG
  5.2 Nonlinear CA based on-chip TPG
    5.2.1 Fault Efficiency
    5.2.2 Hardware overhead
  5.3 Cost optimal design of PRPG for TPG
    5.3.1 The design
    5.3.2 PRPG evolution for cost optimal design
    5.3.3 Feasibility of the design
    5.3.4 Randomness quality of the cost optimal PRPG
    5.3.5 Fault efficiency
  5.4 Hardware implementation
  5.5 TPG for multiple cores with scalable PRPG
    5.5.1 Design of n2-bit PRPG from n1-bit PRPG
    5.5.2 Design of n-bit, m-bit and (n + m)-bit TPG for multiple cores
  5.6 Conclusion

6 Design of Universal BIST Structure
  6.1 Linear Cellular Automata
    6.1.1 Group CA Characteristics
    6.1.2 Synthesis of Group CA
  6.2 Design Specifications for Universal BIST
    6.2.1 Pseudo-Random Pattern Generation
    6.2.2 Pseudo-Exhaustive Pattern Generation
    6.2.3 PRPG without Prohibited Pattern Set
    6.2.4 Deterministic Test Set Generation
  6.3 Design of Linear CA based UBIST
    6.3.1 Degree of Pseudo-Exhaustiveness
    6.3.2 UBIST Design Algorithm
    6.3.3 Performance of linear CA based UBIST
  6.4 Nonlinear CA based PRPG without PPS
    6.4.1 Overview of the design
    6.4.2 Avoiding Prohibited Patterns
    6.4.3 Quality of the TPG without PPS
  6.5 Nonlinear CA based DTSG
    6.5.1 Design Options for DTSG
    6.5.2 The design of nonlinear CA based DTSG
    6.5.3 DTSG design algorithm
    6.5.4 Performance of nonlinear CA based DTSG
  6.6 UBIST Hardware
  6.7 Conclusion

7 Nonlinear CA Based Design of Data Service Scheme
  7.1 Data services in cellular mobile network
  7.2 Overview of the query processing scheme
    7.2.1 The micro cells
    7.2.2 The data service scheme
  7.3 Nonlinear non-group CA with multiple attractors
  7.4 Design of Query Processor
    7.4.1 Partitioning of a cell in micro cells
    7.4.2 Synthesis of CA for query processor
  7.5 Synthesis of CA through reverse engineering
  7.6 Experimental results
    7.6.1 Feasibility of the design
    7.6.2 Performance of query processor
  7.7 Removal of conflict
  7.8 Effectiveness of the design
  7.9 Conclusion

8 Conclusion
  8.1 Main Contributions
  8.2 Future Extension
List of Figures

2.1 Structure of PRPG based test logic
2.2 Structure of pseudo-random pattern testing with data compression
3.1 Block diagram of an n-cell null boundary CA.
3.2 Block diagram of an n-cell periodic boundary CA.
3.3 Implementation of null boundary CA with FFs and combinational logic circuits.
3.4 State transitions of a reversible CA <105, 177, 170, 75>.
3.5 State transitions of an irreversible CA <105, 177, 171, 75>.
3.6 Determination of next state.
3.7 Reachability Tree for the CA <105, 129, 171, 65>.
3.8 Reachability tree for the CA <90, 15, 85, 15>.
3.9 Compressed reachability tree for the CA <90, 15, 85, 15>.
3.10 Reachability tree for a non-group CA <90, 85, 15, 15> designed with group rules.
3.11 Determination of class relationship
4.1 CA states as the pseudo-random patterns
4.2 Scalable PRPG design
4.3 Crossover of CA1 and CA2 resulting CA3 and CA4
4.4 Crossover of CA1 = <9, 165, 90, 80> and CA2 = <6, 240, 102, 17>
4.5 Comparison of randomness property of maxlength CA and PRPG
5.1 For c7552: improvement in fault coverage with the number of test patterns. Dotted line is for maxlength linear TPG, continuous line is for nonlinear TPG.
5.2 For s6669: improvement in fault coverage with the number of test patterns. Dotted line is for maxlength linear TPG, continuous line is for nonlinear TPG.
5.3 Mutation of CA <9, 165, 90, 80>, resulting the CA <9, 90, 90, 80>
5.4 Cost per cell vs. No. of generations (Approach I)
5.5 Comparison of randomness property of Max Length CA and PRPG
5.6 Architecture of TPG for multiple cores
5.7 Architecture of TPG implementing n-PI, m-PI and (n + m)-PI
6.1 An n-cell linear CA
6.2 A 4-cell linear maximal length CA
6.3 A 7-cell linear group CA with cycle structure [1(1), 1(7), 1(15), 1(105)]
6.4 T-matrix of a non-maximal length CA
6.5 PEPG with 6 PIs (A, B, C, D, E, F) and 2 POs (PO1 & PO2)
6.6 The Prohibited/Deterministic Pattern set
6.7 Variation of free space with the cardinality of PPS.
6.8 Variation of run length with the cardinality of pattern set
6.9 DPE vs. performance
6.10 Nonlinear CA based PRPG without PPS
6.11 No. of test patterns without PPS vs. n (size of CA)
6.12 No. of iterations required vs. n to get PPS free TPG
6.13 A deterministic test pattern set
6.14 DTSG with single CA
6.15 DTSG with multiple CA
6.16 Finding Initial Solution for SA
6.17 |DTS| vs. Number of seeds to cover random DTS
6.18 CA size vs. seed to cover MINTEST
6.19 An n-cell linear CA based UBIST
6.20 An n-cell nonlinear CA based UBIST
7.1 Cellular network architecture
7.2 Partitioning of a cell S into micro cells
7.3 Query code generated from an MU
7.4 Block diagram of a query processor
7.5 State transitions of a CA with rule vector <10, 69, 204, 68>
7.6 Partitioning scheme
7.7 Network cell S with object hospitals
7.8 Selection of RMTs to map attractors
7.9 Selection of RMTs to construct basin
7.10 Randomly generated graph from for 4-bit patterns
7.11 Removal of conflict by extending the neighborhood
List of Tables

3.1 Truth table for rule 90, 150 and 75
3.2 Binary values of the CA <105, 128, 171, 65> cell rules
3.3 Relationship between RMTs of cell i and cell (i + 1) for next state computation
3.4 RMTs of the CA <90, 15, 85, 15> rules
3.5 List of group rules
3.6 Class Table
3.7 Formation of class relationship between Ri and Ri+1
3.8 Class relationship of Ri and Ri+1
3.9 First Rule Table
3.10 Last Rule Table
4.1 RMTs of rule 90 and 150
4.2 Class Relationship of Ri and Ri+1 maintaining Property 1 and Property 2
4.3 First rule table maintaining Property 1
4.4 Last rule table maintaining Property 2
4.5 Diehard tests
4.6 Randomness test I (for n = 63)
4.7 Randomness test II
4.8 Results of GA evolution
5.1 Comparison of Test Results
5.2 Area overhead of basic gates (GENLIB)
5.3 Cost of group rules (cost within bracket)
5.4 Cost of first and last cell rules
5.5 Results of GA evolution for minimal cost PRPG synthesis
5.6 Randomness test for minimal cost PRPG (Approach I)
5.7 Fault efficiency of cost optimal PRPGs
5.8 Comparison of area overhead
6.1 The results of BISTPG for PRPG without PPS
6.2 Randomness Test (Linear PRPG without PPS)
6.3 Results of BISTPG for generation of randomly generated DTS
6.4 Results of BISTPG for generating DTS for benchmark circuits
6.5 Results of BISTPG design without PPS
6.6 Performance of DTSG
6.7 Performance of DTSG targeting hard-to-detect faults
7.1 Success rate of the proposed design
7.2 Performance in terms of service
7.3 Effectiveness of the CA based design
Chapter 1
Introduction
Newton was the creator of the classical world of certainty that was constructed with Newtonian physics. Laplace and Kepler were the principal architects of that world. Nature in the classical world was viewed as unconditionally deterministic [274]. The dream of the creators of the classical world was to measure everything with arbitrary precision. A number of researchers worked in this direction with determination till the end of the nineteenth century. However, the dream caved in suddenly in the early phase of the twentieth century. Heisenberg introduced the principle of uncertainty, the pillar of quantum mechanics. A few still argued for the classical world, but the last bastion of certainty finally collapsed with the emergence of chaos theory.

Before the advent of chaos, the operative word was order. The word disorder was prohibited in the language of science. Chaos introduced irregularity into regularity and captured the imagination of all of us. Newton and Kepler were wedded to the harmony of the world. That they missed the uncertainty hiding in the dark in no way diminishes their genius. It took all their innate ability to choose and pick classical controversies from among the many challenges. Chaos theory obliterates their concept of determinism. Things in nature depend extremely critically on initial conditions. A minute change in the initial state can produce massive changes in a system's subsequent behavior. A tiny perturbation may completely alter the result. A butterfly flapping its wings in the Amazonian forest can trigger torrential rains in New Delhi. In truth, to a scientist, chaos does not necessarily mean lack of order. Rather, it denotes indeterminacy, the impossibility of making long-term predictions.

The field of chaos did not really take off until the 1950s, thanks to an unexpected ally: Cellular Automata (CA). This tool came to be essential to the study of chaotic systems. Cellular Automata are sufficiently simple to allow detailed mathematical analysis, yet sufficiently complex to exhibit chaos in dynamical systems.

The story of cellular automata dates back to the 1940s with Ulam. This mathematician was pursuing the evolution of graphic constructions generated by simple rules. The base of this construction was a two-dimensional space divided into cells, a sort of grid. John von Neumann, relying on Turing's work, took an interest in the theory of self-reproducing automata and worked on the conception of a self-reproducing machine. Von Neumann [294] envisioned the modeling of self-reproducing automata empowered to simulate bacterial growth, the growth of patterns on seashells, fluid dynamics, and the voting patterns of individuals who make decisions based on their local neighbors. The most important milestone in the development history of Cellular Automata is due to Wolfram [300]. The simplified homogeneous structure of CA motivated a number of researchers [51] to undertake the study of CA behavior amenable to matrix algebraic analysis. It allows a programmer to specify extremely simple rules for local interaction to model very complex systems.

The VLSI era has cultivated more enthusiasm among CA researchers. The simple structure of CA, with local interconnections, is ideally suited for VLSI implementation. As a cost-effective alternative to existing tools, CA are gaining tremendous importance in modeling different applications, such as image processing, language recognition, pattern recognition, circuit testing, and the study of fractals and chaos [51]. These computing models are limited to Galois field GF(2) CA, in which each cell is capable of storing either 0 or 1.
1.1 Motivation for the Research Undertaken in the Thesis
The GF(2) CA, conventionally called linear/additive CA, can be mapped into matrix algebraic form and can be characterized efficiently with the existing methods of linear algebra. The ease of characterizing linear/additive CA attracted researchers to develop successful CA applications in Pseudo-Random Pattern Generation (PRPG), VLSI circuit testing, cryptography, pattern recognition, etc. However, GF(2) CA limit the search space while targeting different applications. In spite of this serious bottleneck with linear/additive CA, nonlinear CA applications have not yet come under the focus of CA research, especially in the VLSI era. The unavailability of an efficient characterization methodology restricts researchers from choosing nonlinear CA to model VLSI applications. Further, it seems that nonlinear CA are hard to realize in hardware and may incur more implementation cost than GF(2) CA. However, a proper implementation technique for nonlinear CA may require minimal cost. For example, a conventional PRPG is designed with a 90/150 linear CA [130]. Each cell of a 90/150 CA demands at least one XOR logic gate, whereas the nonlinear CA rule 15 requires only a NOT gate to be implemented. Therefore, a proper choice of CA rules in designing the PRPG may reduce the overall cost. However, no research work has been done in this direction, since little attention has been paid to characterizing nonlinear CA.

In this scenario, we have undertaken this research work targeting the systematic development of nonlinear CA theory and its applications. We have concentrated on characterizing the next state logic functions of CA cells. Complete characterization of the next state logic opens new areas of CA applications suited to VLSI design. The next section elaborates the objectives of the thesis.
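The gate-count remark about rules 90, 150 and 15 can be checked with a small sketch. It assumes the standard Wolfram numbering, in which bit (4·left + 2·self + right) of the rule number is the next state for that neighborhood; the helper names `rule_output` and `matches` are hypothetical and are not taken from the thesis.

```python
# Minimal sketch, assuming standard Wolfram rule numbering for
# two-state 3-neighborhood CA. Helper names are illustrative only.

def rule_output(rule: int, left: int, cell: int, right: int) -> int:
    """Next state of a cell: bit (4*left + 2*cell + right) of the rule number."""
    index = (left << 2) | (cell << 1) | right
    return (rule >> index) & 1

# All eight (left, self, right) neighborhood configurations.
NEIGHBORHOODS = [(l, c, r) for l in (0, 1) for c in (0, 1) for r in (0, 1)]

def matches(rule: int, gate_fn) -> bool:
    """True if the rule's truth table equals the given Boolean function."""
    return all(rule_output(rule, l, c, r) == gate_fn(l, c, r)
               for l, c, r in NEIGHBORHOODS)

# Rule 90 is a two-input XOR of the neighbors.
rule_90_is_xor2 = matches(90, lambda l, c, r: l ^ r)
# Rule 150 is a three-input XOR of left, self and right.
rule_150_is_xor3 = matches(150, lambda l, c, r: l ^ c ^ r)
# Rule 15 is simply the complement of the left neighbor (one NOT gate).
rule_15_is_not_left = matches(15, lambda l, c, r: 1 - l)

print(rule_90_is_xor2, rule_150_is_xor3, rule_15_is_not_left)
```

Under that numbering assumption, all three comparisons hold, which is consistent with the observation that a rule 15 cell needs no XOR gate.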
1.2 Objectives of the Thesis
There are two functional profiles of this research work: (1) characterization of nonlinear cellular automata and (2) demonstration of its effectiveness in a number of application areas. The characterization of nonlinear CA concentrates on the investigation of local CA rules, as well as the global dynamics of CA. For the application domain, we choose the field of VLSI design. Therefore, the objectives of the thesis are
• to characterize CA rules that are the building blocks of a CA,
• to design a methodology to characterize the nonlinear CA,
• to develop a tool for characterization (analysis/synthesis) of nonlinear CA,
• to employ nonlinear CA in the field of VLSI circuit testing,
• to explore applications of nonlinear CA in the fields related to VLSI design.
1.3 Organization of the Thesis
To fulfill the objectives of the thesis, the tasks handled are reported in the following order.

In order to study the reported research work in related fields, a comprehensive survey is undertaken in Chapter 2. The survey covers two aspects: (i) CA and their applications in diverse fields; and (ii) pseudo-random number generators, their evolution and application in the field of VLSI circuit testing, since nonlinear CA based pseudo-random pattern generators for VLSI circuit testing are the focus application of this thesis.

Nonlinear CA and their characterization are introduced in Chapter 3. This chapter provides the theoretical background of the thesis. It reports a tree based method to characterize the CA rules as well as the CA as a whole. The analysis for both reversible and non-reversible nonlinear CA is provided.

The theoretical background for the design of a pseudo-random pattern generator (PRPG) around reversible nonlinear CA is developed in Chapter 4. The conditions to be satisfied by the CA rules to construct a CA based PRPG are also reported. A genetic algorithm (GA) based method is introduced to obtain a PRPG of better quality.

We report the applications of PRPG in VLSI circuit testing in Chapter 5. The applications we choose for the current research are:
• Design of a PRPG based Test Pattern Generator (TPG) for a Circuit Under Test (CUT).
• Utilization of the huge search space of nonlinear CA for a cost-optimal design of PRPG that incurs less hardware cost and also shows good randomness quality.
• Design of a PRPG to realize a scalable TPG.

Chapter 6 proposes a new BIST (Built-In Self-Test) structure, referred to as universal BIST (UBIST). The design handles all of the following four classes of test pattern generation:
1. pseudo-random,
2. pseudo-exhaustive,
3. pseudo-random test patterns without prohibited pattern set (PPS), and
4. deterministic test set.

The thesis also addresses the issue of data services in cellular mobile networks. Nonlinear CA are employed to design the query processor. Such a hardwired query processing technology is dealt with in Chapter 7.

Chapter 8 concludes the thesis and identifies future research directions.
Chapter 2
A Survey On Cellular Automata And PRPGs
This thesis work targets the characterization of 1-dimensional 3-neighborhood nonlinear Cellular Automata (CA) and its applications in the VLSI domain. Against this background, this chapter provides a comprehensive survey of CA, its evolution, and its applications in diverse fields. As we concentrate on CA applications in the VLSI domain, especially the design of PRPGs for VLSI circuit testing, a brief survey of PRPGs is also reported in this chapter.
2.1
Cellular Automata
One of the major goals of computer science is to provide abstract models of concrete computers. It demands expressiveness (that is, whatever aspects of the computer are deemed relevant should be captured by the model) and accuracy (that is, whatever one can prove about the model should be true about the computer). The introduction of Cellular Automata (CA) is an important development in this direction. Cellular Automata are more expressive than Turing machines [288], as they provide explicit means for modeling parallel computation on a space-time background. Initially, CA were used chiefly as “toy models” for the phenomenology associated with dissipative (macroscopically irreversible) processes. The typical topics were biological organization [150], self-reproduction [294], chemical reactions [212], and visual pattern processing [95]. Over time, however, CA research attracted great interest due to its simplicity and its power to model physical phenomena. The evolution (phases of development) of cellular automata and their applications in various fields is described next.
2.1.1
Early phase of development
The concept of CA was initiated in the early 1950s by J. von Neumann and Stan Ulam [294] for modeling biological self-reproduction. Von Neumann’s observation was that a cellular automaton can be computationally universal. The proposed CA structure involved 5-neighborhood interactions among the cells, with 29 states per cell. Later, in [273], Thatcher proposed improvements in the construction of von Neumann’s cellular space with a quite complex initial configuration, at least in terms of the number of cells required. The works that targeted structural simplification are [17, 22, 53, 63]. In [53], Lee reduced the state count of a CA to 16, essentially keeping the flavor of the von Neumann model. Subsequently, Codd [63] proposed a reduction in the required number of states to only 8. Arbib, in his work [17], provided a simple description of self-reproducing CA, whereas Banks [22] worked with a CA having 4 states per cell. All these CA were 2-dimensional with a 5-neighborhood configuration (self and four orthogonal neighbors). The 9-neighborhood (Moore neighborhood) CA with two states per cell had also been shown to be capable of universal computation in [191]. Besides two dimensions, Cellular Automata on multi-dimensional grids are proposed in [176, 242]. The grid has either a null or a periodic boundary. In null boundary configurations the boundary cells are assumed to have null (logic ‘0’) dependencies, whereas in a periodic boundary the grid is considered to be folded in some dimensions [25, 204]. The concept of an intermediate boundary for terminal cells had also been proposed in [51, 204]. In 1962, Moore proved that if a state of a CA has more than one predecessor, then there must be a state without any predecessor [191]. The converse was proved by Myhill in 1963 [201]. The states with no predecessor are called Garden of Eden states and were considered a common phenomenon of CA. That is, all CA were considered irreversible.
The reversibility of CA, where the initial state of a CA recurs after a number of time steps, was explicitly addressed only in 1972 by Richardson [238], and Amoroso & Patt [16].
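The notions of predecessor, Garden of Eden state and reversibility can be made concrete for a small finite CA. The following is a minimal sketch, assuming a uniform two-state 3-neighborhood CA identified by its Wolfram rule number and a finite array of n cells; the function names are hypothetical and not taken from the cited works. For a finite CA, the global map is a bijection exactly when no state lacks a predecessor, so exhaustive enumeration suffices for small n.

```python
from itertools import product

def step(state, rule, boundary="null"):
    """Apply one synchronous update of a uniform 3-neighborhood, 2-state CA."""
    n = len(state)
    nxt = []
    for i in range(n):
        if boundary == "periodic":
            left, right = state[(i - 1) % n], state[(i + 1) % n]
        else:  # null boundary: missing neighbors read as logic 0
            left = state[i - 1] if i > 0 else 0
            right = state[i + 1] if i < n - 1 else 0
        # The neighborhood (left, self, right) indexes one bit of the rule number.
        idx = (left << 2) | (state[i] << 1) | right
        nxt.append((rule >> idx) & 1)
    return tuple(nxt)

def garden_of_eden(rule, n, boundary="null"):
    """Return the states of an n-cell CA that have no predecessor."""
    all_states = list(product((0, 1), repeat=n))
    reachable = {step(s, rule, boundary) for s in all_states}
    return [s for s in all_states if s not in reachable]

def is_reversible(rule, n, boundary="null"):
    """A finite CA is reversible exactly when every state has a predecessor."""
    return not garden_of_eden(rule, n, boundary)
```

For example, `garden_of_eden(0, 4)` lists every non-zero state, since rule 0 sends all states to the all-zero state, while `is_reversible(204, 4)` holds because rule 204 copies each cell unchanged.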
2.1.2
Simplification of cellular automata structure
A considerable amount of theoretical research had been performed on cellular automata since the days of von Neumann. However, von Neumann’s CA were never simulated or implemented on a computer due to their complexity. Therefore, there was a persistent demand for simpler and more practical CA architectures that could be used to model diverse application areas. The first notable simplification was achieved in the 1970s. John Conway proposed his famous Game of Life [113], which received widespread interest among researchers. The Moore neighborhood had been utilized to model the Game of Life [113]. However, the revolution in the history of cellular automata happened in the early 1980s. Stephen Wolfram [307] studied in detail a family of simple 1-dimensional cellular automata that could simulate complex behaviors [176, 300, 302, 303, 304]. The proposed CA structure was viewed as a discrete lattice of two-state cells, with 3-neighborhood
dependency (self, left and right neighbors). This structure attracted a large section of researchers working in diverse fields, and a specialized class of 1-dimensional CA, called linear/additive CA, gained primary attention [51]. The states of linear/additive CA were assumed to be elements of the Galois Field GF(q), where q is the number of states [176] of a CA cell. Such CA became popular especially in VLSI applications. A mathematical model based on matrix algebra was developed for the analysis of linear/additive CA behavior [76, 243]. Das in [76] reported that a 3-neighborhood linear/additive CA could be represented by a tri-diagonal matrix. Recent advancements in this direction are the design of GF(2^p) CA, developed around the Galois extension field GF(2^p) [221, 222, 257], and the hierarchical CA GF((2^p)^q)... structures. A cell of the GF(2^p) CA consists of p memory elements and can store values from the set {0, 1, ..., 2^p - 1}. The properties of CA with varying (non-uniform) neighborhoods for the CA cells had also been studied in [139, 313]. A number of works [154, 248, 47] proposed iterative cellular automata, where only a particular cell is provided with an input, for language recognition. In conventional CA, the next state function (rule) is deterministic in nature. There are also variations where the rule set can be probabilistic [29, 32, 118, 143, 169, 281] or fuzzy [5, 6, 38, 39, 101, 102, 170]. The nature of the next state functions can vary significantly in such CA. However, there are some defined standard rule sets across different applications, e.g., Linear Rules [51], Diffusion Rules [56], etc. The transition phenomenon in the simplified CA is predominantly synchronous in nature, that is, the cells get updated at the same instant of time. A few interesting works on asynchronous CA [247, 285] have been published recently to suit the modeling of events that are asynchronous in nature.
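The tri-diagonal matrix representation mentioned above can be illustrated for the two linear rules 90 (XOR of the left and right neighbors) and 150 (XOR of left, self and right). The following is a minimal sketch, assuming a null boundary and a hybrid rule vector that contains only these two rules; the helper names are hypothetical. It checks that the cell-by-cell rule update and the matrix-vector product over GF(2) agree.

```python
from itertools import product

def rule_step(state, rules):
    """Update a null-boundary hybrid CA cell by cell from Wolfram rule numbers."""
    n = len(state)
    nxt = []
    for i, rule in enumerate(rules):
        left = state[i - 1] if i > 0 else 0
        right = state[i + 1] if i < n - 1 else 0
        idx = (left << 2) | (state[i] << 1) | right
        nxt.append((rule >> idx) & 1)
    return nxt

def characteristic_matrix(rules):
    """Tri-diagonal matrix over GF(2) for a null-boundary CA built from rules 90 and 150."""
    n = len(rules)
    T = [[0] * n for _ in range(n)]
    for i, rule in enumerate(rules):
        if rule not in (90, 150):
            raise ValueError("only the linear rules 90 and 150 are handled in this sketch")
        T[i][i] = 1 if rule == 150 else 0  # rule 150 also depends on the cell itself
        if i > 0:
            T[i][i - 1] = 1                # dependence on the left neighbor
        if i < n - 1:
            T[i][i + 1] = 1                # dependence on the right neighbor
    return T

def matrix_step(T, state):
    """Next state as the matrix-vector product T * state over GF(2)."""
    return [sum(t * s for t, s in zip(row, state)) % 2 for row in T]

rules = [90, 150, 90, 150]
T = characteristic_matrix(rules)
agree = all(matrix_step(T, list(s)) == rule_step(list(s), rules)
            for s in product((0, 1), repeat=len(rules)))
```

Here `agree` evaluates to true over all 16 states of the 4-cell example, which is the sense in which the matrix captures the whole next-state function of a linear CA.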
2.1.3
Exploring reversible cellular automata
Reversibility in cellular automata states was first addressed by Richardson [238], and Amoroso & Patt [16] (Section 2.1.1). Following that, a number of theoretical works were published on reversibility [15, 135, 177, 178, 179, 206, 244, 318]. In spite of the acceleration gained in CA research after the introduction of reversibility in CA states, for many years the most widely accepted reversible CA exhibited extremely simple orbits: the longest orbit was of period 2, analyzed through brute-force enumeration [281]. It was thought to be difficult to decide whether an arbitrary cellular automaton would be reversible. Except for 1-dimensional CA, no systematic procedure for testing reversibility was even known [16, 143]. Toffoli in 1976 proved the existence of reversible CA that were computation and construction universal [276]. He further reported in 1977 that each cellular automaton of dimension d could be embedded into a reversible cellular automaton of dimension d + 1 [277]. However, reversibility of 1-dimensional CA has been the key focus of interest for decades. Some interesting results on 1-dimensional two-state reversible CA were reported in [16]. Researchers searched for reversible CA; none were found for the
neighborhood of 2 and 3. The search stopped at neighborhood 4 with exactly eight cases (out of 65,536), where all the CA cells followed the same local rule (uniform rule). Second order techniques and partitioning schemes [281], block permutation [145], etc. were also employed in identifying reversible cellular automata from a large space of CA. The study of the group properties of reversible CA was reported by Lio Liberti [165]. The interesting properties of reversible CA had attracted researchers for a long time to model a number of applications in hydrodynamics, dynamical systems, heat conduction, wave scattering, nucleation, dendritic growth, physical modeling, etc. [281]. The dynamical properties of reversible cellular automata were investigated in [157, 174]. For VLSI applications, a special class of reversible CA, referred to as the linear/additive group CA structure, had been developed [51]. Recently, the concept of Quantum-dot Cellular Automata (QCA) with a 1-dimensional reversible array of an infinite number of cells has been proposed as a quantum computer [298]. Inokuchi and Mizoguchi introduced a notion of cyclic QCA with finite cell arrays [133, 134]. QCA have potential applications to the statistical mechanics of lattice systems and the ultraviolet regularization of quantum field theories. The reversible structure of QCA appears as the natural model of computation extending the well-developed theory of classical cellular automata into the quantum domain. Different variations of cellular automata have lent versatility to the modeling power of the tool. In order to gain insight into the modeling accuracy of a CA based simulation tool, characterization of CA state transition behavior is of great importance. The next section presents a brief history of 1-dimensional cellular automata and the methodologies developed to characterize such CA.
2.2
1-dimensional CA characterization
A detailed characterization of CA dynamics enables understanding of the emergent behavior and computational capacity of the system [67, 117]. Borrowing the concept from the field of continuous dynamical systems, Wolfram [302] first classified CA into four broad categories:
• Class 1: CA which evolve to a homogeneous state;
• Class 2: those which evolve to simple separated periodic structures;
• Class 3: those which exhibit chaotic or pseudo-random behavior; and
• Class 4: those which yield complex patterns of localized structures and are capable of universal computation [303].
Based upon these broad classes, detailed categorization of CA had been proposed by Li et al. [164], and Gutowitz [122]. On the other hand, Walker [296] examined a family
of sparsely connected Boolean nets to characterize the CA machines. Classification of CA based on the structure of attractors was made by Kurka [156]. The methodologies proposed for characterization of CA behavior can be grouped into two categories: characterization of Additive/Linear CA and of Nonlinear CA. The linear/additive CA are amenable to detailed characterization through linear algebraic tools [51]. On the other hand, due to the absence of a standard mathematical tool, there have been varied efforts with different parameters to characterize nonlinear CA.
2.2.1
Linear/Additive CA
Characterization of linear/additive CA is much simpler than that of the other classes of CA due to its correspondence with state-of-the-art algebraic models. Since the next state function of such a CA cell follows the operations in a Galois Field, the properties of the Galois Field can be applied to characterize its state transition behavior. The application of CA in the VLSI domain has made GF(2) CA the most popular variant [78, 77]. Characterization of the Linear Feedback Shift Register (LFSR) [267] and linear machines [97] has provided the platform for characterization of linear/additive CA. The initial works in this direction concentrated on finding a suitable algebraic representation of linear CA, that is, the Characteristic Matrix [74, 78]. Prior to that, the attempts at characterization with dipolynomials [146, 176], based on Graph Theoretic Properties [269], failed to represent the hybrid CA with different neighborhood configurations for different CA cells. The matrix algebraic tool and the corresponding minimal and characteristic polynomials of the characteristic matrix have opened up interesting features of the CA. The first important finding is the categorization of additive CA into Group (reversible) and Non-Group (irreversible) CA.
Group CA: The most effective application of Group CA is the generation of pseudo-random patterns. Serra et al. [253] report that a maximal length Group CA, with all non-zero states lying in a single cycle, produces the best quality of pseudo-random patterns. Such maximal length CA are designed with only two CA rules (neighborhood configurations), 90 and 150, and their characteristic polynomial is primitive [41]. It implies that a maximal length CA cannot be designed as a periodic boundary CA, since its characteristic polynomial can be factored [27, 204]. Serra et al. [252] use a version of the Lanczos tri-diagonalization method over GF(2) to synthesize a maximal length CA. Simplified versions are reported in [272, 41].
It is further generalized over GF(q) by Muzio et al. [40]. Recently, Makato shows that for certain lattice sizes, maximal length CA can be generated only through rule 90 [180]. A more detailed characterization of Group CA is given in [52, 243]. The phase-shift properties of CA cells, important for analyzing pseudo-random patterns, are studied in [203, 196]. Besides one-dimensional null boundary CA, characterization of two-dimensional CA had been reported in [58, 61, 42]. The pseudo-randomness quality of the patterns generated from two-dimensional CA and related
characterization are reported in [283].
Non-Group CA: The non-group CA were initially dismissed as degenerate cases while characterizing the group CA [267]. However, in recent times, they have been receiving much attention [46, 49, 55, 31, 107, 239] from researchers working in diverse fields. The isomorphism of tree structures of non-group CA [176, 51, 48] brings out two important results. Firstly, it is mapped to a table structure with its cyclic states pointing to the address of the table [49]. Secondly, the linear and complemented variants of a non-group CA produce interesting symmetry within themselves [202]. To formalize the behavior of this symmetry, Chakraborty et al. introduce the concept of Dual CA [46]. Scientists realize that non-group CA in fact have more potential than Group CA. Some interesting classes of non-group CA have been studied extensively, such as Multiple Attractor Cellular Automata (MACA) [49], Depth-1* Cellular Automata (D1*CA) [58], and Single Attractor Cellular Automata (SACA) [91], which have been used in a wide range of functions like hashing [110], classification [227], designing testable FSM [60], authentication [91], etc. Characterization of non-group CA employing Boolean Decision Diagrams is proposed by Chattopadhyay et al. [50]. Recently, Cho proposes characterization of dual properties of non-group CA [54]. In order to extend the versatility of additive CA and to analyze a physical system at different levels of hierarchy, work on GF(q) based architectures with q > 2 has been reported. Cattel and Muzio provide an introductory analysis of CA behavior over GF(q) [40]. Further, a number of researchers have shown interest in extending CA research based on the theory of finite fields to GF(m), where m is a positive integral power of 2 [214, 215, 222, 257]. Investigation to develop a CA based hierarchical modeling tool is the major focus of this research.
Paul [221] introduces the theory of GF(2^p) cellular automata designed over the Galois extension field GF(2^p). The characterization is further extended to GF((2^p)^q) CA to arrive at a hierarchical structure [258].
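The maximal length property described for Group CA in this section can be checked directly on a small example. The following is a minimal sketch, assuming a 4-cell null-boundary CA over GF(2) with the rule vector <90, 150, 90, 150>; the bit-packed state encoding and the function names are illustrative choices, not taken from the cited works. The characteristic polynomial of this example works out to x^4 + x + 1, which is primitive, so every non-zero state should lie on a single cycle of length 2^4 - 1 = 15.

```python
def next_state(x, diag, n):
    """One step of an n-cell null-boundary 90/150 CA with the cells packed into the bits of x.

    Bit i of diag is set when cell i follows rule 150 (it also depends on itself)
    and clear when it follows rule 90.
    """
    mask = (1 << n) - 1
    # Shifted copies supply the two neighbors; masking enforces the null boundary.
    return ((x << 1) ^ (x >> 1) ^ (x & diag)) & mask

def cycle_length(x0, diag, n):
    """Number of steps until x0 recurs, or None if x0 does not lie on a cycle."""
    x = x0
    for steps in range(1, (1 << n) + 1):
        x = next_state(x, diag, n)
        if x == x0:
            return steps
    return None

rules = [90, 150, 90, 150]
n = len(rules)
diag = sum(1 << i for i, r in enumerate(rules) if r == 150)
lengths = {cycle_length(x, diag, n) for x in range(1, 1 << n)}
```

With this rule vector, `lengths` contains the single value 15. Setting `diag = 0` (uniform rule 90) on the same 4 cells gives shorter cycles, for instance length 6 from the state with only cell 0 set.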
2.2.2
Nonlinear CA
The limiting factor for characterization of nonlinear CA is, unlike linear/additive CA, the absence of a proper mathematical model for such a machine. A few works have been reported on simulating nonlinear CA based on the theory of linear CA [190, 189]. These works take advantage of linear algebraic tools to characterize the wide variety of nonlinear CA state transitions. The major thrust in characterizing nonlinear CA has been to study the dynamics of a CA as it evolves in time. The emergent patterns in such systems show some form of globally coordinated behavior. A detailed idea of the dynamics of the CA helps in understanding the emergent behavior. Simultaneously, it targets analysis of the computational capacity of the system [117, 71]. Wootters et al. [309], McIntosh [182], Barbe [24], Voorhees [295] and Jen [137] have contributed a lot towards such
characterization with different sets of tools and parameters. The parameters defined by the researchers to classify the CA state space are of two categories: local and global parameters.
Local Parameters: The analysis of CA behavior through local parameters implies that the nature of the CA’s emergent behavior is studied based upon the rule (neighborhood) configuration of the CA. The most important one is Langton’s λ parameter [159]. It looks into the fraction of 1s in the binary rule configuration of a CA cell. The equivalent idea of internal homogeneity was introduced earlier by Walker [297]. Several interesting works [186, 217] regarding the critical value of λ (λ_c), the value of λ around which the CA rule changes from order to chaos [159], have been reported. The other notable local parameters proposed are the Z parameter [311, 313], which looks into the distribution of 1s and 0s in the rule table, the parameter proposed by Zwick [321], and the P parameter [311, 312] to characterize the hybrid CA.
Global Parameters: The global parameters characterize a CA with regard to the Garden of Eden (non-reachable) states, attractor basins, the entropy of the evolved patterns, etc. Investigation of the Garden of Eden started in the 1970s [201, 16]; further developments in this direction are noted in [143, 144]. Kaneko [141] has introduced an information theoretic approach to characterize the complexity of Garden of Eden states in terms of their volumes, stability against noise, information storage capacity, etc. A recent work by Wuensche [312] reports that CA rules can be classified into ordered, complex or chaotic rules based on measures such as G-Density (bushiness of the Garden of Eden), In-Degree Frequency, etc. To accurately model a dynamical system, Lyapunov Exponents (defined for dynamical system lattices) [142, 255] and the Derrida Plot [93], which measures the divergence of trajectories based on Hamming Distance, are proposed.
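Langton’s λ parameter, described under the local parameters above as the fraction of 1s in the binary rule configuration, reduces to a bit count for a two-state CA. The following is a minimal sketch, assuming the rule is given by its Wolfram rule number and that 0 is the quiescent state; the function name is hypothetical.

```python
def langton_lambda(rule, neighborhood=3):
    """Fraction of 1s among the 2**neighborhood entries of a two-state rule table."""
    entries = 2 ** neighborhood
    if not 0 <= rule < 2 ** entries:
        raise ValueError("rule number out of range for this neighborhood size")
    return bin(rule).count("1") / entries
```

For the 3-neighborhood rules 90 and 150 this gives 0.5, since both rule tables contain four 1s out of eight entries, while rule 0 gives 0.0.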
Formal language theory to characterize CA evolution is also introduced in [301]. A number of authors have concentrated on the computational universality of CA [73, 182] as a part of CA behavior. Further studies on the computational capabilities have been reported in [9, 22, 63, 153, 191, 275]. The basic motivation for characterization of a CA state space is to evaluate the global behavior from the given local rules of the CA. However, the inverse problem of deducing the local rules from a given global behavior is extremely difficult [36]. In general, synthesis of a CA to model a given state space is NP-Hard [140]. There are some efforts, with limited success, that identify the CA structure from the given attractor basins. Notable among those are the works reported by Wuensche [310], Manor Askenazi [18], and John Meyer [200]. The most popular way to systematically deduce global behavior from the local rules depends on evolutionary computation techniques. The initial work on CA evolution is reported by Packard and his colleagues [217, 237]. Koza [155] has applied a genetic algorithm for the generation of simple random numbers. The first major publication projecting the genetic algorithm for evolving CA is due to Mitchell et al. [186].
Later this concept is refined and reaffirmed by a series of works [70, 79, 80]. In order to circumvent the sampling error with respect to random selection of the initial configuration of an evolutionary process, Paredis [220] proposes a co-evolution process where both the CA and the initial configuration (IC) are simultaneously evolved. Juille and Pollack [138] modify the co-evolutionary setup by introducing a limit on the selection of ICs. Pagie and Hogeweg [218] embed the co-evolutionary model in a 2D grid and introduce an extension of the fitness function used to evaluate the ICs. The evolutionary process on two-dimensional CA to perform the Density Classification Task is pursued in [194]. The task of complex computation cannot be fully modeled through uniform CA [36]. To map such complex tasks to CA, the use of hybrid CA is a necessity. The convergence problem of the Genetic Algorithm with hybrid CA has been addressed through the introduction of a parallel genetic algorithm [35, 261, 271, 282]. Capcarre et al. [37] have presented a detailed study on the dynamics of hybrid CA evolution while solving three standard tasks: synchronization, density classification and random number generation. Further, theoretical insights regarding evolutionary dynamics of CA are drawn while generating Deterministic Patterns [140] and Pseudo-Random Patterns for VLSI circuit testing [66, 67]. In a recent work, Maji et al. [171] consider the state transition diagram of a CA as a graph, which is optimized through evolutionary schemes like Genetic Algorithm/Simulated Annealing to arrive at the desired hybrid CA. The interesting properties of cellular automata have motivated researchers for decades to employ CA in various applications. A brief description of these applications follows in the next section.
2.3
CA Applications
Since the days of von Neumann, a number of researchers have targeted cellular automata (CA) for modeling various applications. A simple local structure with complex global behavior is the key reason for such attraction [308]. Consequently, researchers from diverse fields have identified CA dynamics with the problems in their own fields. CA have been used to model biological systems [14, 99, 235, 291], the kinetics of molecular systems and crystal growth [217], and dynamical systems as diverse as the interaction of particles and the clustering of galaxies [247]. In the field of computer science, CA based methods have been employed to model von Neumann (self-reproducing) machines [306] and even parallel processing architectures [173, 230]. Beyond the domain of natural science, CA have also been used to study the system properties of other diverse fields [106]. Such diversity has not only taken the research on cellular automata to a new height, but also attracted researchers from diverse fields to join and exploit the world of cellular automata. The following broad areas are the domains of CA applications.
CA Games: Conway and his colleagues [30] illustrated how extremely simple CA
rules can be used to characterize very complex system behaviors such as the ‘Game of Life’. The game was made popular through Martin Gardner [113, 114]. Their motivation was to design a simple set of rules to study the macroscopic behavior of a population. The other important applications in this direction were the games that provide insights into synchronization problems, such as the Firing Squad [96], Firing Mob [72] and Queen Bee [264].
CA as Parallel Computing Machine: The application of CA in machine design was proposed for building parallel multipliers [19, 64], parallel processing computers [173, 230], prime number sieves [100], and also sorting machines [208]. The design of fault-tolerant computing machines around CA was reported in [209, 293]. CA based machines, CAMs (CA Machines), had been developed by Toffoli and others [278]. The structure of such machines, having a high degree of parallelism, was ideally suited for simulation of complex systems [280]. Applications of 2-dimensional CA were extended to image processing and pattern recognition in [240, 266]. The publication of von Neumann’s work in the late 1950s [294] had raised the possibility of using self-replicating machines to perform complex computations [262]. Self-replicating structures are exploited to solve an NP-complete problem [57]. Researchers also present cellular automata as a typical nanometer-scale classical computer [28].
CA For Modeling Physical Systems: The most widely used application of cellular automata is the modeling of physical systems. CA have been considered as an alternative to differential equations in modeling the laws of physics [279]. CA models with an emphasis on spin systems [69, 167, 228, 290], models for various forms of regular, dendritic and random growth [216], models for pattern formation in reaction-diffusion systems [168, 212, 299], modeling of hydrodynamical systems [105], etc. are some of the major applications investigated so far.
A number of applications have been reported in the fields of DNA sequences and the solution of differential equations [34, 211, 265, 319]. A successful application of cellular automata is the modeling of the immune system [33, 44] explored by Celada, Seiden, and De Boer et al. Detecting genetic disorders of cancerous cells [192, 193], developing drug therapy for HIV infection [219, 256], ecosystem modeling [236], modeling the nature of fish migration [246], and the growth of vegetable populations [21] are the major applications of CA as a physical system model. Cellular Automata have also been used to model chemical processes, for analyzing poisoning of a surface during catalysis [56], the driven diffusion system, where the external field biases the movement of each species [116], the solidification process [166], alloy formation [216], etc. Further, the phenomena of coalescence of clouds, fog, and atmospheric pollution are important modeling problems for CA [231]. Lattice gas automata are also well known CA models emulating physical systems [56].
CA for simulating Society:
The great diversity of cellular automata has also
attracted the social science community. James M. Sakoda developed a CA based model in the social sciences, the Checkerboard Model of Social Interaction, in 1971 [241]. Thomas Schelling has proposed a CA model to analyze the segregation processes of individuals [245]. Peter S. Albin explicitly classifies the checkerboard models as CA in [13]. The other important CA applications in socio-economic systems are the use of one-dimensional CA to analyze pricing in a spatial setting [147], two-dimensional CA for two-person games [210], etc. The artificial societies described by Epstein and Axtell [98] are also based on CA machines. A CA based modeling tool for social dynamics is described by Gaylord and D’Andra in [115]. The recent developments in this domain are the voters model [198], the homogeneous society model [20], etc. employing cellular automata. The theoretical framework of such modeling techniques is found in [124].
Pattern Recognition: The advent of the Neural Net [126, 127] popularized the use of machine intelligence in recognizing patterns. However, the inherently dense structure of a Neural Network is not suitable for VLSI implementation. The sparse network of CA has attracted researchers to target CA for pattern recognition [186, 289, 296]. The notable works in this domain are the CA based model of associative memory and its application to pattern recognition [45, 136, 195, 232]. Biologists prefer CA to recognize genetic disorders in cells leading to cancer [192, 193]. Concepts from the discipline of Biology are borrowed to develop CA models for clustering of data. In [49], it is observed that a special class of CA, referred to as MACA, behaves as a natural classifier and can be effective for VLSI and other applications. The linear CA and their application in pattern recognition and pattern classification are reported in [107, 111, 260]. Pattern recognition in the framework of nonlinear CA is also reported in [108, 111, 172, 254].
One of the major application areas of CA, especially for additive/linear CA, is the VLSI domain. A review of such application areas is presented in the separate section that follows.
2.4
Application of CA in VLSI domain
The simplicity, modularity and cascadable structure of additive CA make it acceptable to the VLSI industry. There are different applications, ranging from the VLSI test domain to the design of authentication hardware, where CA have been proposed as a low cost alternative to state-of-the-art design methodologies. Following are some of the important applications of CA in the VLSI domain:
Error correcting codes: CA-based Error Correcting Codes (CAECC) were introduced by Chowdhuri et al. [59]. The complexity of CAECC had been shown to be less than that of the well known Hsiao code [131]. Both the single byte error correcting and double byte error detecting codes, proposed in [62], were found to be superior to the state-of-the-art schemes. The scheme was further enriched by Paul
[223] through the introduction of GF(2^p) CA.
Design of CA based Cipher system and Authentication: In [205], Nandi et al. have presented an elegant scheme for CA based cipher system design. The design incurs low cost in comparison to the state-of-the-art designs. Both block and stream ciphering strategies designed with programmable cellular automata (PCA) had been reported in [23, 104, 249]. Dasgupta et al. [91] have proposed an ASIC for Message Authentication. Further improvement, by inserting an invisible watermark in images, is reported in [199] with GF(2^p) CA.
CA based Compression technology: Bhattacharya et al. [31] have proposed methods to perform text compression employing cellular automata technology. A novel technique of deriving CA transforms for compression and encryption [158] is proposed by Olu Lafe. Lafe has shown that cellular automata are capable of generating billions of orthogonal, semi-orthogonal, bi-orthogonal, and non-orthogonal bases, ideal for generating Walsh, Hadamard, Haar & Wavelet Transforms. CA based transforms are also proposed by Paul [224, 225, 226] and Shaw [254] for developing efficient schemes for image and document compression.
VLSI Design and Test: Class 3 CA, proposed by Wolfram [302], are found to be suitable for pseudo-random pattern generation [302] useful for VLSI design and test. Hortensius [130] proposes the hybrid CA based Pseudo-Random Pattern Generator (PRPG) for built-in self-test in VLSI circuits. Subsequently, the performance of CA based PRPGs has been enriched in [58, 74, 250, 251, 253, 286, 287]. Chowdhuri [58], Das [74, 77], Tsalide et al. [286, 287] and Sikdar [257] have also proposed CA as a framework for Built-In Self-Test (BIST) structures. CA based low power BIST is proposed in [65, 148]. Applications of CA for deterministic test pattern generation are investigated by Albicki et al. [10, 11, 12] and Das et al. [74, 75].
The cyclic property of CA is utilized to generate the specific set of patterns. Subsequently, Nandi [202] has established CA as a universal test pattern generator. Both additive and non-additive CA have been exploited as efficient signature analyzers [129] due to their lower aliasing. Serra et al. investigate the aliasing property of 1-dimensional linear CA in [253]. Similar work has also been reported by Das [74] and Misra [184]. One of the pioneering works in CA based design of VLSI test logic is CALBO (Cellular Automata Logic Block Observer) [183], a structure analogous to BILBO (Built-In Logic Block Observer) popularly used in VLSI circuit testing. The testable design of FSM around cellular automata is implemented in [185, 187]. The Synthesis-For-Testability (SFT) technique for FSM is proposed by Choudhuri and co-researchers [46, 60]. They employ a particular class of CA, referred to as D1*CA, for such a synthesis. The special class of CA referred to as MACA has been efficiently employed to diagnose a faulty subcircuit within a chip [259]. The Quantum-dot Cellular Automata (QCA) is the youngest member of the CA family and
16 V LSI design around QCA are getting popularity among the designers [103, 213]. The most of the CA applications referred so far in V LSI domain employ linear/additive CA. This is because of – (i) ease characterization of linear/additive CA, and (ii) lack of proper tool for characterization of the nonlinear CA. The prime focus of this thesis work is to target characterization of nonlinear CA for its effective applications in V LSI domain. One such important application targeted is the design of P RP G. We, therefore, presents a brief overview on pseudo-random number generation schemes.
2.5 Pseudo-Random Number Generation

Randomness in nature has puzzled thinkers for centuries. They took great interest in the behavior and the source of randomness. Beyond its theoretical interest, randomness has been studied for different applications in scientific work. Random numbers were first generated by hand, by throwing dice, dealing out cards, or drawing numbered balls [151]. In due course, mechanized devices were built for quick generation of random numbers. The Electronic Random Number Indicator Equipment (ERNIE) was used for many years in lottery. However, all such machines were costly, inefficient, and inappropriate for practical applications. The invention of the computer in the 1940s stimulated interest in using arithmetic operations for random number generation, which overcomes the inadequacy of mechanical methods. However, the sequences of numbers generated in this way are not truly random in the statistical sense and are referred to as pseudo-random numbers [151, 160]. Whenever a sequence of numbers is generated using arithmetic logic or an algorithm, each number in the sequence is completely dependent on its predecessor(s). That is, computer generated numbers cannot be independent of each other. In fact, the sequence of numbers is predictable.

Definition 2.1 A sequence of numbers $R_1, R_2, \cdots, R_N$ constitutes a random sample of size $N$ if each number has an equal probability of being selected, and these $N$ random variables are independent.
2.5.1 Pseudo-random number generators

Pseudo-random numbers are important for different applications, such as Monte Carlo simulation and cryptography [151]. The first pseudo-random number generator was proposed by John von Neumann and Metropolis in 1946 [292]. The basic idea of the method (the midsquare method) was to take the middle digits of the square of the previous number as the next number. The midsquare method was later shown to be a poor source of random numbers. A more practical approach to generating pseudo-random numbers, namely the Linear Congruential Method (LCG), was proposed by D. H. Lehmer in 1949 [163].
LCG is the most popular pseudo-random number generator in use [161]. A sequence of random integers $X_1, X_2, \cdots$ is defined by the following recurrence relation:

$$X_{i+1} = (a X_i + c) \bmod m \qquad (2.1)$$

The maximum possible period length in LCG is $m$, and the randomness quality of the sequence depends completely on $m$, $a$, $c$, and $X_0$. A further improvement in randomness quality is obtained when the linear congruential generator is changed to a quadratic congruential generator [151, 68], where

$$X_{i+1} = (d X_i^2 + a X_i + c) \bmod m \qquad (2.2)$$

The generalized form of LCG, however, is:

$$X_i = (a_1 X_{i-1} + \cdots + a_k X_{i-k}) \bmod m \qquad (2.3)$$

For a prime $m$ and properly chosen $a_i$s, the sequence has a maximal period length of $m^k - 1$ [151]. For a number of applications, a random sequence of zeros and ones is desired. Such sequence generators are generally called Pseudo-Random Pattern Generators (PRPGs). The PRPG obtained from Equation 2.3 with $m = 2$ is the Linear Feedback Shift Register (LFSR) or Tausworthe generator [270]. An LFSR can be implemented in either the Fibonacci or the Galois form. The Galois implementation is more attractive for hardware and is generally faster than the Fibonacci one [151]. Cellular automata (CA) are also considered for the generation of efficient PRPGs. The next subsection reviews the development of CA based PRPGs.
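As a small illustration of Equation 2.1 and of the Galois form of an LFSR, the following Python sketch may help. The parameter values (a = 5, c = 3, m = 16 for the LCG, and a 4-bit register with feedback mask 0b1100) are toy assumptions chosen only so that the periods can be checked by hand; they are not taken from the cited works, and the function names are hypothetical.

```python
from itertools import islice


def lcg(seed, a, c, m):
    """Yield X_1, X_2, ... from X_{i+1} = (a*X_i + c) mod m (Equation 2.1)."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def galois_lfsr(seed, taps, nbits):
    """Yield successive states of a right-shifting Galois LFSR.

    `taps` is the feedback mask XORed into the register whenever the bit
    shifted out is 1; `seed` is assumed to be a nonzero nbits-wide value.
    """
    state = seed & ((1 << nbits) - 1)
    while True:
        out = state & 1
        state >>= 1
        if out:
            state ^= taps
        yield state


def period(gen, start):
    """Count the steps until the generator first returns to `start`."""
    for steps, value in enumerate(gen, start=1):
        if value == start:
            return steps


# Toy parameters (assumptions), small enough to verify by hand.
lcg_period = period(lcg(0, a=5, c=3, m=16), 0)
lfsr_period = period(galois_lfsr(1, taps=0b1100, nbits=4), 1)
first_lcg = list(islice(lcg(0, a=5, c=3, m=16), 5))
```

With these toy values the LCG visits all m = 16 residues before repeating, and the 4-bit LFSR cycles through all 15 nonzero states, which matches the maximal period $m^k - 1$ for m = 2 and k = 4.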
2.5.2 Pseudo-Random Pattern Generation with CA

The randomness property of CA was first investigated by Wolfram [305], who focused on the randomness of the patterns generated by a CA designed with uniform rule 30. It was shown that the CA are superior to the LFSR in terms of randomness quality. Since the 1980s, CA based design of PRPG has been an active field of research. In the last 20 years, 1-dimensional CA based PRPGs have been studied extensively. Hortensius reports the randomness property of nonuniform (hybrid) CA designed with the rules 90/150 and 30/45 [128, 130]. It has been shown that the 90/150 hybrid CA has better potential to generate pseudo-random sequences than the 30/45 CA. In 1996, Sipper and Tomassini [263], and in 1999, Tomassini et al. [284], evolved 50-cell CA with a mixture of rules 90, 150 & 165 and of rules 90, 105, 150 & 165, respectively. The test results show that these two nonuniform CA based PRPGs are better than those designed only with 90/150 and 30/45 [130]. The 2-dimensional (2-D) CA based PRPGs were proposed by Chowdhuri et al. [61] in 1994. Their experimental results suggest that the 2-D CA are superior to 1-D CA in terms of randomness quality. Following that, Tomassini et al. [283] evolved several 8 × 8 2-D CA based PRPGs to show better performance of the CA based PRPGs over that of LFSRs. Further improvement in randomness quality of nonuniform CA is noted with controllable cellular automata (CCA) [120] and hierarchical cellular automata [258].

[Figure 2.1: Structure of PRPG based test logic (blocks: PRPG, Circuit Under Test, Reference Unit, Comparator, Display).]

The 1-D CA introduced so far are in 3-neighborhood, i.e., the next state of a cell of such a CA depends on the states of its left and right neighbors in addition to its current state. Moreover, the CA based PRPGs proposed so far are designed around the linear CA; nonlinear CA have not been explored properly. This research work proposes an efficient design of PRPG in the framework of nonlinear CA, targeting its application in VLSI circuit testing.
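To make the idea of a 90/150 hybrid CA used as a pattern generator concrete, here is a minimal Python sketch. The 4-cell rule vector <90, 150, 90, 150>, the null boundary and the seed are illustrative assumptions, not a design taken from the cited works; the helper names are hypothetical.

```python
def step_90_150(state, rules):
    """One step of a null-boundary hybrid CA whose cells use rule 90 or 150.

    `state` is a list of 0/1 cell values, leftmost cell first.
    """
    n = len(state)
    nxt = []
    for i, rule in enumerate(rules):
        left = state[i - 1] if i > 0 else 0         # null boundary on the left
        right = state[i + 1] if i < n - 1 else 0    # null boundary on the right
        bit = left ^ right                          # rule 90: XOR of the two neighbors
        if rule == 150:
            bit ^= state[i]                         # rule 150 also XORs the cell itself
        nxt.append(bit)
    return nxt


def patterns(seed, rules, count):
    """Return `count` successive patterns, starting with the seed itself."""
    out, state = [], list(seed)
    for _ in range(count):
        out.append(tuple(state))
        state = step_90_150(state, rules)
    return out


rules = [90, 150, 90, 150]            # illustrative rule vector (assumption)
seq = patterns([0, 0, 0, 1], rules, 15)
distinct = len(set(seq))
```

For this particular small example the 15 patterns are all distinct and the CA then returns to the seed, so every nonzero 4-bit pattern is produced once per cycle. Other rule vectors of the same size need not have this property.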
2.5.3 Application of PRPG in VLSI circuit testing

Pseudo-Random Pattern Generators (PRPGs) play an important role in the field of VLSI circuit testing [26], and their use has long been common practice in industry [125]. Figure 2.1 shows the classical structure for PRPG based testing of VLSI circuits. The reference unit of Figure 2.1 is either a golden unit (a known good unit) or a good machine simulator computing the correct response to the input stimulus. The comparator compares the actual response from the circuit under test (CUT) with the expected response from the reference unit. A schematic diagram for a random test logic structure with data compression is shown in Figure 2.2. For the scheme to work, a reference signature from the fault free circuit is supplied.

[Figure 2.2: Structure of pseudo-random pattern testing with data compression (blocks: PRPG, CUT, Data Compressor, Reference, Comparator, Display; signals: Start/Stop, Ready).]

PRPG suits better for the on-chip logic design. The on-chip test pattern generator (TPG) designs are reported in [8, 94, 125]. Linear Feedback Shift Registers (LFSRs) are widely used, both in industry and research, as PRPGs for VLSI circuit testing [26, 152, 229]. However, cellular automata (CA) are established as an efficient PRPG alternative to LFSR [51]. Researchers have found interest in CA due to its high quality of randomness, as well as its regular and cascadable structure. Different CA structures are proposed to improve the efficiency of the CA based test pattern generator. Hortensius [128, 130] proposes a hybrid CA based pattern generator. Its efficiency is further improved in [51]. The 2-dimensional GF(2) CA have also been explored in [58] for test pattern generation. A weighted random BIST structure has been proposed in [123]. A CA based weighted random pattern generation scheme is developed in [207]. Recently, low cost hardware for generating pseudo-random test patterns with a Ring Generator is reported in [197]. The quality of CA based PRPG is further improved with the introduction of Hierarchical CA (HCA) [258]. It has been shown that the randomness quality as well as the fault efficiency of HCA is better than that of other CA based designs [257], at marginal extra cost.
2.6 Conclusion

The historical development of Cellular Automata (CA) theory and its applications in various fields are reviewed in this chapter. The chapter also surveys the various aspects of random number generation schemes, as well as the development of Pseudo-Random Pattern Generators (PRPGs) and their applications in the field of VLSI circuit testing. Over the years, the linear/additive CA has been considered a powerful modeling tool in VLSI design and test. However, such a special class of CA limits the search space when employed for an application. This motivates us to explore the potential of nonlinear CA in modeling applications from the VLSI domain. It requires a detailed characterization of nonlinear CA rules and of the state space of nonlinear CA, as reported in the chapter that follows.
Chapter 3
Characterization of Nonlinear Cellular Automata

Since the invention of the homogeneous structure of Cellular Automata (CA) [294], it has been employed for modeling a diversity of physical systems. To get better insight into a physical system, the CA structure was in due course simplified by restricting it to local interactions among the cells [300]. The simplified structure of [300] is a 1-dimensional CA, each cell having two states (0/1), with uniform 3-neighborhood (self, left neighbor and right neighbor) dependencies among the CA cells. It effectively introduces modularity in a CA structure. Though a number of works [307] have shown that the 1-dimensional 3-neighborhood CA exhibit excellent performance while modeling physical systems, it is hard to accept that the interacting objects in a dynamical system obey the same local rule (homogeneity) during its evolution. To model such a variety of physical systems, the non-homogeneous CA structure (also called hybrid CA) has evolved as an alternative choice. A number of researchers have, therefore, turned their attention to hybrid CA [41, 51, 130] since the 1980s and explored potential designs with 1-dimensional hybrid CA, especially in the VLSI domain [51]. A detailed characterization of hybrid CA and its applications in the VLSI domain [25, 130] have been reported in [51]. All such applications [25, 130] are developed around the linear/additive CA structure. However, the linear/additive CA limit the search space for the design while modeling an application with CA. For effective modeling of applications from diverse fields, the nonlinear CA can be a better alternative than the linear/additive CA. However, due to the lack of available results on characterization, the nonlinear CA are not considered in VLSI applications or in other fields. The above scenario motivates us to undertake research on the characterization of nonlinear CA targeting VLSI design and test.

We report an explicit characterization of nonlinear CA in this chapter, with special attention to VLSI design and test. The preliminary version of this characterization has been reported in [89, 88]. The characterization of individual cell rules and of the CA as a whole is reported in the subsequent sections. To facilitate such characterization of CA, we include the basics of cellular automata in the following section.
[Figure 3.1: Block diagram of an n-cell null boundary CA.]

[Figure 3.2: Block diagram of an n-cell periodic boundary CA.]
3.1 Cellular Automata Basics

A Cellular Automaton (CA) consists of a number of cells organized in the form of a lattice. It evolves in discrete space and time, and can be viewed as an autonomous finite state machine (FSM). Each cell stores a discrete variable at time $t$ that refers to the present state of the cell. The next state of the cell at $(t + 1)$ is affected by its state and the states of its neighbors at time $t$. In this work, we concentrate on 3-neighborhood CA (self, left and right neighbors), where each CA cell has two states, 0 or 1. Therefore, the next state $S_i^{t+1}$ of the $i$th CA cell is specified by the next state function $f_i$ as

$$S_i^{t+1} = f_i(S_{i-1}^t, S_i^t, S_{i+1}^t) \qquad (3.1)$$

where $S_{i-1}^t$, $S_i^t$ and $S_{i+1}^t$ are the present states of the left neighbor, self and right neighbor of the $i$th CA cell at time $t$.

The collection of states of the cells $S^t = (S_1^t, S_2^t, \cdots, S_n^t)$ at time $t$ is the present state of a CA. Therefore, the next state of an $n$-cell CA is determined as

$$S^{t+1} = (f_1(S_0^t, S_1^t, S_2^t),\ f_2(S_1^t, S_2^t, S_3^t),\ \cdots,\ f_n(S_{n-1}^t, S_n^t, S_{n+1}^t)) \qquad (3.2)$$

If $S_0^t = S_n^t$ and $S_{n+1}^t = S_1^t$ (that is, the left neighbor of the leftmost cell is the rightmost cell and vice versa), then the CA is referred to as a periodic boundary CA. On the other hand, if $S_0^t = 0$ (null) and $S_{n+1}^t = 0$ (null), the CA is null boundary. Block diagrams of the $n$-cell null boundary CA and periodic boundary CA are shown in Figure 3.1 and Figure 3.2 respectively. Figure 3.3 shows the schematic diagram of a two-state 3-neighborhood null boundary CA. Each CA cell is implemented with a flip-flop (FF) and a combinational logic realizing the next state function.
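Equation 3.2 and the two boundary conditions can be sketched in a few lines of Python. In this sketch each next state function $f_i$ is supplied as an 8-bit rule number whose bit at position (4·left + 2·self + right) is the next state, which is the truth-table convention the text describes next; the function name `next_state` and the `boundary` argument are hypothetical.

```python
def next_state(state, rules, boundary="null"):
    """Apply Equation 3.2 once to a two-state 3-neighborhood CA.

    state    -- list of 0/1 values, S_1 first
    rules    -- list of rule numbers (0..255), one per cell
    boundary -- "null" (S_0 = S_{n+1} = 0) or "periodic" (S_0 = S_n, S_{n+1} = S_1)
    """
    n = len(state)
    if boundary == "periodic":
        padded = [state[-1]] + list(state) + [state[0]]
    elif boundary == "null":
        padded = [0] + list(state) + [0]
    else:
        raise ValueError("boundary must be 'null' or 'periodic'")
    nxt = []
    for i in range(n):
        left, own, right = padded[i], padded[i + 1], padded[i + 2]
        index = 4 * left + 2 * own + right      # neighborhood read as a 3-bit number
        nxt.append((rules[i] >> index) & 1)     # output bit of the rule for that input
    return nxt


null_result = next_state([1, 0, 0, 0], [90] * 4, boundary="null")
periodic_result = next_state([1, 0, 0, 0], [90] * 4, boundary="periodic")
```

The two results differ only in the last cell: under the periodic boundary the rightmost cell sees the leftmost cell as its right neighbor, whereas under the null boundary it sees a constant 0.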
If the next state function (combinational logic) of the $i$th cell is expressed in the form of a truth table, then the decimal equivalent of its output is conventionally referred to as the 'Rule' $R_i$ [300]. In a two-state 3-neighborhood CA, there can be a total of $2^8$ (256) rules. Three such rules, 90, 150, and 75, are illustrated in Table 3.1. The first row of the table lists the possible $2^3$ (8) combinations of the present states of the $(i-1)$th, $i$th and $(i+1)$th cells at time $t$. The last three rows indicate the next states of the $i$th cell at $(t+1)$ for the rules 90, 150 and 75 respectively.

[Figure 3.3: Implementation of null boundary CA with FFs and combinational logic circuits.]

Table 3.1: Truth table for rules 90, 150 and 75

Present state      111  110  101  100  011  010  001  000   Rule
(RMT)              (7)  (6)  (5)  (4)  (3)  (2)  (1)  (0)
(i)   Next state     0    1    0    1    1    0    1    0     90
(ii)  Next state     1    0    0    1    0    1    1    0    150
(iii) Next state     0    1    0    0    1    0    1    1     75

Note: RMT stands for Rule Min Term. The value 0/1 noted in the 3rd/4th/5th row shows the output of the three-variable switching function.

From Table 3.1, we can also form the next state combinational logic corresponding to a rule. That is,

Rule 90: $S_i^{t+1} = S_{i-1}^t \oplus S_{i+1}^t$

Rule 150: $S_i^{t+1} = S_{i-1}^t \oplus S_i^t \oplus S_{i+1}^t$

Rule 75: $S_i^{t+1} = S_i^t \cdot (S_{i-1}^t \oplus S_{i+1}^t) + \overline{S_{i-1}^t} \cdot \overline{S_i^t}$
The next state functions $f_i$s for the rules 90 and 150 employ XOR logic. These rules are called linear rules. On the other hand, rule 75 is a nonlinear one. Out of the total 256 rules, there are only 14 rules that employ the XOR/XNOR logic function; these are referred to as linear/additive rules. The other rules employ nonlinear logic functions (AND, OR, etc.).

Definition 3.1 The set of rules $R = \langle R_1, R_2, \cdots, R_i, \cdots, R_n \rangle$ that configure the cells of a CA is called the rule vector.

Definition 3.2 Whenever all the $R_i$s ($i = 1, 2, \cdots, n$) of a rule vector $R$ are linear/additive, the CA is referred to as a Linear/Additive CA; otherwise the CA is a Nonlinear one.
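The rule numbering and the count of 14 linear/additive rules can be checked with a short Python sketch. It assumes that "linear/additive" means an XOR of a nonempty subset of the three neighborhood variables (linear) or the complement of such an XOR (additive, XNOR); the helper names `rule_of`, `rmt_output` and `additive_rules` are hypothetical.

```python
from itertools import product


def rmt_output(rule, rmt):
    """Next-state bit of `rule` for Rule Min Term `rmt` (0..7)."""
    return (rule >> rmt) & 1


def rule_of(func):
    """Rule number of a Boolean function func(left, own, right) -> 0/1."""
    return sum(func(l, s, r) << (4 * l + 2 * s + r)
               for l, s, r in product((0, 1), repeat=3))


# The three next state functions written out in the text.
rule_90 = rule_of(lambda l, s, r: l ^ r)
rule_150 = rule_of(lambda l, s, r: l ^ s ^ r)
rule_75 = rule_of(lambda l, s, r: (s & (l ^ r)) | ((1 - l) & (1 - s)))


def additive_rules():
    """Rules given by the XOR of a nonempty subset of neighbors, or its complement."""
    found = set()
    for a, b, c in product((0, 1), repeat=3):
        if (a, b, c) == (0, 0, 0):
            continue  # the empty subset would give a constant function
        for invert in (0, 1):
            found.add(rule_of(lambda l, s, r: (a & l) ^ (b & s) ^ (c & r) ^ invert))
    return sorted(found)


linear_additive = additive_rules()
```

Under this assumption the enumeration yields seven XOR rules and their seven complements, and rule 75 is not among them.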
Definition 3.3 A CA is uniform if $R_1 = R_2 = \cdots = R_n$; otherwise the CA is hybrid.

The sequence of states generated (state transitions) during its evolution with time directs the CA behavior (Figure 3.4 and Figure 3.5). The state transition diagram of a CA may contain cyclic and non-cyclic states (a state is called cyclic if it lies in a cycle) and, based on this, the CA can be categorized as either reversible or irreversible.

[Figure 3.4: State transitions of a reversible CA <105, 177, 170, 75>.]
Definition 3.4 A CA is reversible if it contains only cyclic states in its state transition diagram; otherwise the CA is irreversible.

In a reversible CA, the initial CA state repeats after a certain number of time steps (Figure 3.4). Therefore, all the states of a reversible CA are reachable from some other state and each state has exactly one predecessor. On the other hand, in an irreversible CA (Figure 3.5), there are some states which are not reachable from any other state (non-reachable states). Moreover, some states of such a CA have more than one predecessor [191, 201]. For example, the states marked as 5 and 13 in Figure 3.5 are the non-reachable states, whereas 15 and 7 have more than one predecessor. The non-reachable states of an irreversible CA form the Garden of Eden. The reversible linear/additive CA forms a cyclic group [51], and so is popularly called a group CA. Similarly, the irreversible linear/additive CA is referred to as a non-group CA. In view of the structural similarity between the state transition diagram of a reversible (irreversible) CA and that of a linear/additive group (non-group) CA, we refer to a reversible (irreversible) CA as a group (non-group) CA in the subsequent discussions.

[Figure 3.5: State transitions of an irreversible CA <105, 177, 171, 75>.]

The basic component of a CA is its cell rules. The behavior of the CA state transitions depends on (i) the CA rules that configure the cells, and (ii) the sequence of rules that form the CA. The next section reports characterization of the CA rules.
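For small CA, Definition 3.4 can be checked directly by enumerating all $2^n$ states. The following Python sketch does this by brute force for the two 4-cell CA of Figures 3.4 and 3.5; it is an exhaustive check for illustration only (exponential in n), and the function names are hypothetical. States are encoded as integers with cell 1 as the most significant bit, which is assumed to be the labeling used in the figures.

```python
def next_state(state, rules):
    """One step of a null-boundary, two-state, 3-neighborhood CA."""
    padded = [0] + list(state) + [0]
    return [(rules[i] >> (4 * padded[i] + 2 * padded[i + 1] + padded[i + 2])) & 1
            for i in range(len(state))]


def transition_table(rules):
    """Map every state (integer, cell 1 = most significant bit) to its successor."""
    n = len(rules)
    table = {}
    for value in range(2 ** n):
        bits = [(value >> (n - 1 - i)) & 1 for i in range(n)]
        nxt = next_state(bits, rules)
        table[value] = int("".join(map(str, nxt)), 2)
    return table


def non_reachable(rules):
    """States with no predecessor (the Garden of Eden states)."""
    table = transition_table(rules)
    return sorted(set(table) - set(table.values()))


def is_reversible(rules):
    """On a finite state set, every state is cyclic iff no state lacks a predecessor."""
    return not non_reachable(rules)


reversible_ca = is_reversible([105, 177, 170, 75])
garden_of_eden = non_reachable([105, 177, 171, 75])
table_irreversible = transition_table([105, 177, 171, 75])
predecessors_of_7 = sorted(s for s, t in table_irreversible.items() if t == 7)
```

The enumeration agrees with the text: the first CA has no non-reachable state, while in the second CA states 5 and 13 have no predecessor and state 7 has two.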
3.2 Characterization of CA rules

This section reports the characterization of CA cell rules that helps to identify a reversible CA, or group CA. A tree based method is proposed to characterize the CA rules as well as to synthesize a group CA in linear time. The number of 1s in a rule plays a key role in determining the reversibility of a CA. Depending on the number of 1s and 0s in its 8-bit binary representation, a CA rule can be classified as balanced or unbalanced, as defined below.

Definition 3.5 A rule is Balanced if it contains an equal number of 1s and 0s in its 8-bit binary representation; otherwise it is an Unbalanced rule.

The rules shown in Table 3.1 are balanced rules. Each of them has four 1s and four 0s in its 8-bit binary representation. On the other hand, rule 171, with five 1s in its 8-bit representation (10101011), is an unbalanced rule. In order to facilitate characterization of CA rules, we further introduce the following terminology.
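Definition 3.5 amounts to a population count on the 8-bit rule number. A minimal Python sketch, with the hypothetical helper name `is_balanced`:

```python
def is_balanced(rule):
    """True if the 8-bit representation of `rule` has four 1s and four 0s."""
    return bin(rule & 0xFF).count("1") == 4


balanced_flags = {rule: is_balanced(rule) for rule in (90, 150, 75, 171)}
balanced_count = sum(is_balanced(rule) for rule in range(256))
```

Of the 256 rules, exactly $\binom{8}{4} = 70$ are balanced in this sense.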
3.2.1 Rule Min Term (RMT)

From the viewpoint of Switching Theory, a combination of the present states (as noted in the 1st row of Table 3.1) can be viewed as the Min Term of a 3-variable ($S_{i-1}^t, S_i^t, S_{i+1}^t$) switching function. Therefore, each column of the first row of Table 3.1 is referred to as a Rule Min Term (RMT). The column 011 in the truth table (Table 3.1) is the 3rd RMT. The next states corresponding to this RMT are 1 for Rules 90 and 75, and 0 for Rule 150. The characterization reported in this chapter is based on the analysis of RMTs of CA rules.

Definition 3.6 A rule $R_i'$ is the complement rule of $R_i$ if each RMT value of $R_i'$ is the complement of the corresponding RMT value of $R_i$. Therefore, $R_i + R_i' = 255$. For example, rules 90 and 165 are the complement rules of each other.

Relationship among RMTs: The RMTs of a rule dictate the next state of the CA cell configured with that rule. Therefore, the next state of a CA is determined by the RMTs of all the cell rules. However, the RMTs of two consecutive cell rules $R_i$ and $R_{i+1}$ are related while the CA changes its state from time $t$ to $(t+1)$. The following discussion illustrates the relationship between the cell rules.

Table 3.2: Binary values of the cell rules of the CA <105, 129, 171, 65>

RMT            111  110  101  100  011  010  001  000   Rule
               (7)  (6)  (5)  (4)  (3)  (2)  (1)  (0)
First cell      d    d    d    d    1    0    0    1     105
Second cell     1    0    0    0    0    0    0    1     129
Third cell      1    0    1    0    1    0    1    1     171
Fourth cell     d    1    d    0    d    0    d    1      65

Let us say 0011 is the present state (Figure 3.6) of a 4-cell CA with rule vector <105, 129, 171, 65>. The RMTs of the 4 rules are noted in Table 3.2. As we are considering null boundary CA, we omit the don't care bits (marked by d in the table) of the leftmost and rightmost cell rules 105 and 65 respectively. The don't care RMTs will never appear while the CA changes its state. Since the CA is in 3-neighborhood, an RMT can be considered as the 3-bit window $(i-1, i, i+1)$. Further, the 3-bit window for the $(i+1)$th cell can be found from the window of the $i$th cell with a 1-bit right shift. As we assume the present state of the CA is $b_1 b_2 b_3 b_4 = 0011$, the 3-bit window for the first cell (leftmost cell) of the null boundary CA of Table 3.2 is $0 b_1 b_2 = 000$ (Figure 3.6). The next state of the first cell is, therefore, given by RMT 0 (Table 3.2), that is, 1. To find the next state of the second cell, the window is shifted right by 1 bit position, giving $b_1 b_2 b_3 = 001$. Hence the next state of the second cell is 0, the value of RMT 1 of the second rule (Table 3.2). Similarly, after a further 1-bit right shift, the window becomes $b_2 b_3 b_4 = 011$. Therefore, the next state of the third cell is 1 (Table 3.2). Finally, the window of the fourth cell is $b_3 b_4 0 = 110$ (RMT 6), so its next state is 1. This results in the CA state transition from 0011 to 1011.
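The sliding 3-bit window used in the walk-through above can be reproduced with a few lines of Python. This is only an illustrative sketch; the helper names `rmt_windows`, `next_state` and `complement_rule` are hypothetical, and states are written as strings of 0/1 with cell 1 first.

```python
def rmt_windows(state):
    """RMT index seen by each cell of a null-boundary CA in `state` (a 0/1 string)."""
    padded = "0" + state + "0"                    # null boundary on both ends
    return [int(padded[i:i + 3], 2) for i in range(len(state))]


def next_state(state, rules):
    """Next state obtained by reading, for each cell, the rule bit at its RMT."""
    return "".join(str((rule >> rmt) & 1)
                   for rule, rmt in zip(rules, rmt_windows(state)))


def complement_rule(rule):
    """Rule whose every RMT value is inverted (Definition 3.6)."""
    return 255 - rule


rules = [105, 129, 171, 65]
windows = rmt_windows("0011")
successor = next_state("0011", rules)
```

For the state 0011 the four windows select RMTs 0, 1, 3 and 6, and reading the corresponding bits of rules 105, 129, 171 and 65 gives the successor 1011, as in the text.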
[Figure 3.6: Determination of next state. For the present state 0011 (null boundary), the windows of cells 1 to 4 select RMTs 0, 1, 3 and 6, giving the next state 1011.]

It can be observed that, if the RMT window for the $i$th cell is $(b_{i-1} b_i b_{i+1})$, $b_i = 0/1$, then the RMT window for the $(i+1)$th cell will be either $(b_i b_{i+1} 0)$ or $(b_i b_{i+1} 1)$. In other words, if the $i$th CA cell changes its state following RMT $k$ (the decimal equivalent of $b_{i-1} b_i b_{i+1}$) of rule $R_i$, then the $(i+1)$th cell will generate its next state following RMT $2k \bmod 8$ ($b_i b_{i+1} 0$) or $(2k + 1) \bmod 8$ ($b_i b_{i+1} 1$) of rule $R_{i+1}$. This relationship between the RMTs of $R_i$ and $R_{i+1}$ while computing the next state of a CA is shown in Table 3.3. The relation noted in the table plays an important role in characterizing the behavior of CA configured with different cell rules. We propose the concept of the Reachability Tree in the following subsection to formalize the characterization.

Table 3.3: Relationship between RMTs of cell $i$ and cell $(i + 1)$ for next state computation

RMT at ith rule     RMTs at (i+1)th rule
0                   0, 1
1                   2, 3
2                   4, 5
3                   6, 7
4                   0, 1
5                   2, 3
6                   4, 5
7                   6, 7
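The relation between consecutive RMTs is a one-line computation. A small Python sketch (the name `next_rmts` is hypothetical) that regenerates Table 3.3 and checks it against the window sequence 0, 1, 3, 6 of the state 0011:

```python
def next_rmts(k):
    """RMTs of rule R_{i+1} that can follow RMT k of rule R_i."""
    return ((2 * k) % 8, (2 * k + 1) % 8)


table_3_3 = {k: next_rmts(k) for k in range(8)}

# Consecutive windows of the state 0011 in a null-boundary CA.
windows = [0, 1, 3, 6]
consistent = all(b in next_rmts(a) for a, b in zip(windows, windows[1:]))
```

Note that RMTs k and k + 4 always yield the same pair, since the leading bit of the window is shifted out.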
3.2.2 Reachability Tree

The Reachability Tree is defined to characterize the CA states. It is a binary tree and represents the reachable states of a CA. Each node of the tree is constructed with RMT(s) of a rule. The left edge of a node is considered the 0-edge and the right edge the 1-edge. The number of levels of the reachability tree for an $n$-cell CA is $(n + 1)$. The root node is at Level 0 and the leaf nodes are at Level $n$. The nodes of Level $i$ are constructed following the selected RMTs of $R_{i+1}$ for the next state computation of cell $(i + 1)$. The number of leaf nodes in the tree denotes the number of reachable states of the CA. A sequence of edges from the root to a leaf node, representing an $n$-bit binary string, is a reachable state, where the 0-edge and the 1-edge represent 0 and 1 respectively.

[Figure 3.7: Reachability Tree for the CA <105, 129, 171, 65>.]

Figure 3.7 shows the reachability tree for a CA with rule vector <105, 129, 171, 65> (the RMTs of the CA rules are noted in Table 3.2). The decimal numbers within a node (Figure 3.7) at level $i$ represent the RMTs of the CA cell rule $R_{i+1}$ following which the cell $(i + 1)$ may change its state. The RMTs of a rule for which we follow the 0-edge or the 1-edge are noted in brackets. For example, the root node (level 0) is constructed with RMTs 0, 1, 2 and 3, as cell 1 can change its state following any one of the RMTs 0, 1, 2, and 3. As the state of the left neighbor of cell 1 is always 0, the RMTs 4, 5, 6 and 7 are don't cares for cell 1. For the RMTs 1 (001) and 2 (010) of rule 105 (Table 3.2), the next state is 0, and it is 1 for the RMTs 0 (000) and 3 (011). Therefore, at level 1, the node after the 0-edge of level 0 contains the RMTs 2, 3, 4 and 5 (Figure 3.7 and Table 3.3). As the RMTs 2, 3, 4 and 5 of the second cell rule (129) are all 0 (Table 3.2), this node does not have a 1-edge (dotted line in Figure 3.7). It signifies that any state starting with 01 (edge sequence AB, BE) is non-reachable. On the other hand, 0010 (edge sequence AB, BD, DI, IP), 0011, etc. are reachable states of the CA.

Definition 3.7 Two RMTs are equivalent if both result in the same set of RMTs effective for the next level of the Reachability Tree.

For example, the RMTs 0 and 4 are equivalent as both result in the same set of effective RMTs {000 = 0, 001 = 1} (Table 3.3) for the next level of the Reachability Tree. Similarly, the RMTs 1 & 5, 2 & 6, and 3 & 7 are equivalent.
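The construction of the reachability tree can be sketched as a level-by-level walk in which each node carries a set of RMTs. The Python below is a minimal, uncompressed version for small null-boundary CA (it enumerates up to $2^n$ prefixes, so it is for illustration only); the function names are hypothetical.

```python
def next_rmts(rmts):
    """RMTs available at the next cell, following Table 3.3."""
    out = set()
    for k in rmts:
        out.update(((2 * k) % 8, (2 * k + 1) % 8))
    return out


def reachable_states(rules):
    """Enumerate reachable states of a null-boundary CA by walking its reachability tree."""
    n = len(rules)
    frontier = [("", {0, 1, 2, 3})]       # root: left neighbor of cell 1 is always 0
    for level, rule in enumerate(rules):
        last = level == n - 1
        new_frontier = []
        for prefix, rmts in frontier:
            if last:
                # right neighbor of cell n is 0, so odd RMTs are don't cares
                rmts = {k for k in rmts if k % 2 == 0}
            for bit in (0, 1):
                edge = {k for k in rmts if ((rule >> k) & 1) == bit}
                if edge:                  # an empty edge is a missing branch
                    new_frontier.append((prefix + str(bit), next_rmts(edge)))
        frontier = new_frontier
    return sorted(prefix for prefix, _ in frontier)


reachable = reachable_states([105, 129, 171, 65])
reachable_group = reachable_states([90, 15, 85, 15])
```

For the CA <105, 129, 171, 65> this walk finds no state beginning with 01 and lists 0010 and 0011 among the reachable states, in line with the discussion of Figure 3.7.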
Definition 3.8 Two RMTs are siblings at level $i + 1$ if they result from the same RMT at level $i$ of the Reachability Tree.

The RMTs 0 and 1 are sibling RMTs as these two result either from RMT 0 or from RMT 4 (Table 3.3). If a node of the Reachability Tree associates an RMT $k$, it also associates the sibling of $k$.

Theorem 3.1 The reachability tree for a group CA is balanced.

Proof: Since all states of a group CA are reachable, the number of leaf nodes in the Reachability Tree for the $n$-cell group CA is $2^n$ (the number of states). Therefore, the tree is balanced as it is a binary tree of $(n + 1)$ levels. □

Example 3.1 Let us consider the 4-cell CA <90, 15, 85, 15> noted in Table 3.4. The reachability tree of the CA is shown in Figure 3.8. The tree is balanced and the CA is a group CA.

Table 3.4: RMTs of the rules of the CA <90, 15, 85, 15>

RMT            111  110  101  100  011  010  001  000   Rule
               (7)  (6)  (5)  (4)  (3)  (2)  (1)  (0)
First cell      d    d    d    d    1    0    1    0      90
Second cell     0    0    0    0    1    1    1    1      15
Third cell      0    1    0    1    0    1    0    1      85
Fourth cell     d    0    d    0    d    1    d    1      15

[Figure 3.8: Reachability tree for the CA <90, 15, 85, 15>.]

The above discussions point to the fact that a group CA (non-group CA) can be identified by constructing the reachability tree for the CA. If the number of non-reachable states in a Reachability Tree is zero, then we can conclude that the CA is a group CA. However, computation of the number of non-reachable states involves exponential complexity when the CA is a group CA. No method is known that computes the number of non-reachable states of a CA even in polynomial time. In this work, we propose an algorithm that can identify a group (non-group) CA in $O(n)$ time. We also report a linear time solution to synthesize an $n$-cell group CA. The following theorem guides the design of such a solution.

Theorem 3.2 The reachability tree of a 3-neighborhood null boundary CA is balanced if each edge, except the leaf edges, results from exactly two RMTs of the corresponding rule.

Proof: Let us consider that an intermediate edge $l$ results from a single RMT $k$ of a rule. Now the following two cases may arise:

(i) The edge $l$ is between level $(n - 2)$ and level $(n - 1)$ (predecessor to the leaf edge): that is, the edge $l$ connects a node of level $(n - 2)$ with one of its child nodes at level $(n - 1)$. Therefore, the child node at level $(n - 1)$ contains the RMTs $\{2k \bmod 8, (2k + 1) \bmod 8\}$. Since it is a node at level $(n - 1)$, the node corresponds to the CA cell rule $R_n$. As the CA is null boundary, RMT $(2k + 1) \bmod 8$ does not exist. Hence the tree is unbalanced, as only one edge can be generated from a single RMT.

(ii) The edge $l$ is any other intermediate edge: in this case, the very next edges of $l$ will result from RMT $2k \bmod 8$ or from RMT $(2k + 1) \bmod 8$. If both the RMTs have the same value for that particular rule, then the tree becomes unbalanced. Otherwise, there exist two edges and each results from a single RMT. The process may continue till the predecessor of the leaf node is reached. That is, the tree may remain balanced till the predecessor of the leaf node, and there are then a number of edges whose next level edge is a leaf edge resulting from a single RMT. Hence the tree is unbalanced by Case (i). □

Example 3.2 Consider the 4-cell CA <90, 15, 85, 15> of Example 3.1. The CA is a group CA. Each intermediate edge of the reachability tree (Figure 3.8) results from exactly two RMTs. The RMTs are noted within brackets.

Corollary 3.1 All the nodes, except the leaves, of the reachability tree for a group CA are constructed with 4 RMTs.
Proof: Since both the 0-edge and the 1-edge of a node of the reachability tree for a group CA result from exactly 2 RMTs (Theorem 3.2), the node is constructed with 4 RMTs. □

There may be $2^i$ nodes at level $i$ of the reachability tree for an $n$-cell CA, $i \le n$. However, not all the nodes are unique. Two or more similar nodes at a level produce the same subtree. The reachability tree, therefore, contains a number of similar subtrees. For simplicity, we can show only one instance of a subtree, replacing the other similar subtrees of the reachability tree. Such a reachability tree is referred to as a compressed reachability tree. Figure 3.9 is the compressed reachability tree of Figure 3.8. A dotted line in the figure points to the similar subtree. The following theorem characterizes the nodes at each level of a reachability tree.

Theorem 3.3 At each level, except the root, of the reachability tree for a group CA, there are 2 or 4 unique nodes.
[Figure 3.9: Compressed reachability tree for the CA <90, 15, 85, 15>.]

Proof: Each node of the reachability tree for a group CA is constructed with 4 RMTs (Corollary 3.1) and the sibling RMTs (Definition 3.8) are associated with the same node. Since there are 4 sets of sibling RMTs (0 & 1, 2 & 3, 4 & 5, and 6 & 7), 3 different organizations of RMTs for the nodes are possible: {0, 1, 2, 3} & {4, 5, 6, 7}, {0, 1, 4, 5} & {2, 3, 6, 7}, and {0, 1, 6, 7} & {2, 3, 4, 5}. This implies that if a node at level $i$ is constructed with $N_1$ = {0, 1, 2, 3}, then there exists another node at that level constructed from $N_2$ = {4, 5, 6, 7}. Therefore, the minimum number of unique nodes at a level of the reachability tree of a group CA is 2. It is obvious from Theorem 3.2 that 2 out of the 4 RMTs (Corollary 3.1) of a node in the reachability tree for a group CA are $d$ ($d = 0/1$) and the remaining 2 are $d'$. Therefore, 2 RMTs of $N_1$ or $N_2$ are $d$, and the other 2 are $d'$. So, another two nodes may be possible at level $i$, taking 2 RMTs that produce $d$ from $N_1$ and another 2 RMTs from $N_2$. Hence the maximum number of possible nodes at a level of the reachability tree for a group CA is 4. □

Example 3.3 Consider the group CA of Example 3.1. Figure 3.9 shows the unique nodes of the reachability tree for the group CA. Each level except the root contains 2 unique nodes.

Based on the above discussions, we next propose a method for identification of the group properties of a CA, followed by the synthesis scheme for an $n$-cell group CA in Section 3.2.4.
3.2.3 Identification of group CA
This subsection develops an algorithm that checks whether a CA is a group CA. The algorithm (IdentifyGroup) scans a CA rule vector from left to right and constructs the compressed reachability tree. It looks for an edge of the reachability tree that is associated with a number of RMTs other than 2. If there is any such edge, then the CA is non-group (Theorem 3.2). The algorithm uses a structure S, which is an array of sets; the number of sets in S is denoted by nos. The rule vector scanned by the algorithm is a two-dimensional array Rule[n][8], where Rule[i][j] holds the next-state value of RMT j of rule $R_i$.
Algorithm 3.1 IdentifyGroup
Input: n (CA size), Rule[n][8] (CA)
Output: group or non-group
Step 1: Find
    (a) S[1] = {j}, where Rule[1][j] = 0 and 0 ≤ j ≤ 3,
    (b) S[2] = {j}, where Rule[1][j] = 1 and 0 ≤ j ≤ 3.
    (c) If |S[1]| ≠ |S[2]|, report the CA as non-group and exit.
    (d) Set nos := 2.
Step 2: For i = 2 to n − 1
    2.1 For j = 1 to nos
        Determine the 4 RMTs of the next level node from S[j] using Table 3.3.
        Distribute these 4 RMTs into S'[2j − 1] and S'[2j], such that S'[2j − 1] and S'[2j] contain the RMTs that are 0 and 1 respectively for the i-th rule.
        If |S'[2j − 1]| ≠ |S'[2j]|, then report the CA as non-group and exit.
    2.2 Replace RMTs 4, 5, 6 and 7 by the equivalent RMTs 0, 1, 2 and 3 respectively in each S'[k].
    2.3 If |S'[k]| = 1 for any k, report the CA as non-group and exit.
    2.4 Remove duplicate sets from S' and assign the sets of S' to S.
    2.5 nos := number of sets in S.
Step 3: For j = 1 to nos
    Determine the next 4 RMTs of S[j], of which 2 are don't cares since it is the last rule.
    If both of the remaining RMTs are 0, or both are 1, for the rule, then report the CA as non-group and exit.
Step 4: Report the CA as group.
Example 3.4 This example illustrates the execution steps of Algorithm 3.1. Consider the CA <90, 15, 85, 15> of Table 3.4. From Step 1 of Algorithm 3.1, we get S[1] = {0, 2} and S[2] = {1, 3}. In Step 2, when i = 2, we obtain S'[1] = {4, 5}, S'[2] = {0, 1}, S'[3] = {6, 7} and S'[4] = {2, 3}. Since each set of S' contains exactly 2 RMTs, no decision (group or non-group) can be taken at this stage. S' is then modified (Step 2.2) to S'[1] = {0, 1}, S'[2] = {0, 1}, S'[3] = {2, 3} and S'[4] = {2, 3}. Each set of S' still contains exactly 2 RMTs. S' is now reduced by removing the duplicates and then assigned to S. Therefore, S[1] = {0, 1} and S[2] = {2, 3}. When i = 3, S'[1] = {1, 3}, S'[2] = {0, 2}, S'[3] = {5, 7} and S'[4] = {4, 6}. Hence the modified S' is S'[1] = {1, 3}, S'[2] = {0, 2}, S'[3] = {1, 3} and S'[4] = {0, 2}. Assigning the reduced S' to S, we get S[1] = {1, 3} and S[2] = {0, 2}. Step 3 then results in S'[1] = {6}, S'[2] = {2}, S'[3] = {4} and S'[4] = {0}. Each set of S' contains a single RMT; that is, the numbers of 0s and 1s among RMTs 2 and 6, and among RMTs 0 and 4, are the same. So, the CA is a group CA (Step 4). □

Complexity: Step 2 is the main loop of Algorithm 3.1. It contains a sub-loop with nos iterations. The maximum value of nos is 4, as the maximum possible number of unique nodes at level i is 4 (Theorem 3.3). Therefore, the execution time of the algorithm depends only on n. Hence the complexity of the group CA identification algorithm (Algorithm 3.1) is O(n).
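The steps of Algorithm 3.1 can be sketched in Python as follows. This is a minimal illustration rather than the thesis implementation: it assumes a null-boundary CA with n ≥ 2 cells, takes each rule as its decimal rule number (bit j of the number is the next state of RMT j), and encodes Table 3.3 by the observation that RMT r is followed by RMTs 2r mod 8 and (2r mod 8) + 1. The names `identify_group`, `next_level_rmts` and `is_bijective` are hypothetical; `is_bijective` is an exhaustive cross-check added only for comparison and is exponential in n.

```python
from itertools import product


def rmt_bit(rule: int, rmt: int) -> int:
    """Next-state value that `rule` assigns to RMT `rmt` (0..7)."""
    return (rule >> rmt) & 1


def next_level_rmts(rmts) -> set:
    """RMTs of the next cell that can follow the given RMTs (Table 3.3)."""
    out = set()
    for r in rmts:
        base = (2 * r) % 8          # equivalent RMTs r and r + 4 give the same base
        out.update((base, base + 1))
    return out


def identify_group(rules: list) -> bool:
    """Sketch of Algorithm 3.1; returns True when the CA is a group CA."""
    n = len(rules)
    # Step 1: only RMTs 0..3 are effective for the first rule.
    zeros = frozenset(r for r in range(4) if rmt_bit(rules[0], r) == 0)
    ones = frozenset(r for r in range(4) if rmt_bit(rules[0], r) == 1)
    if len(zeros) != len(ones):
        return False
    level = {zeros, ones}
    # Step 2: intermediate rules R_2 .. R_(n-1).
    for i in range(1, n - 1):
        new_level = set()
        for s in level:
            node = next_level_rmts(s)
            for value in (0, 1):
                edge = {r for r in node if rmt_bit(rules[i], r) == value}
                folded = frozenset(r % 4 for r in edge)      # Step 2.2
                # Steps 2.1 and 2.3: an edge needs 2 non-equivalent RMTs.
                if len(edge) != 2 or len(folded) != 2:
                    return False
                new_level.add(folded)                        # Step 2.4 (set removes duplicates)
        level = new_level
    # Step 3: RMTs 1, 3, 5 and 7 are don't cares for the last rule.
    for s in level:
        effective = [r for r in next_level_rmts(s) if r % 2 == 0]
        if len({rmt_bit(rules[-1], r) for r in effective}) != 2:
            return False
    return True


def is_bijective(rules: list) -> bool:
    """Exhaustive check: every state must have a distinct next state."""
    n = len(rules)
    images = set()
    for state in product((0, 1), repeat=n):
        cells = (0,) + state + (0,)                          # null boundary
        images.add(tuple(
            rmt_bit(rules[i], 4 * cells[i] + 2 * cells[i + 1] + cells[i + 2])
            for i in range(n)))
    return len(images) == 2 ** n


print(identify_group([90, 15, 85, 15]))   # CA of Example 3.4
```

Because the sets at each level are stored in a Python `set`, duplicate removal (Step 2.4) happens implicitly, and at most 4 sets are carried from one level to the next, which is what keeps the scan linear in n.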
3.2.4 Synthesis of a Group CA
Synthesis of a group CA is exactly the reverse of the analysis reported in the previous subsection. We propose an efficient synthesis algorithm (SynthesizeGroupCA 1) for such a CA. The input to the algorithm is n, the size of the CA to be synthesized, and the output is an n-cell group CA. The algorithm determines the CA cell rules by fixing the RMTs of rule $R_i$, i = 1, 2, ..., n. The RMTs are set in such a way that each edge of the reachability tree results from two RMTs (Theorem 3.2). The algorithm also uses the two-dimensional array Rule[n][8] noted in the previous subsection.

Algorithm 3.2 SynthesizeGroupCA 1
Input: n.
Output: An n-cell group CA R.
Step 1: Distribute two 0s and two 1s arbitrarily in the least significant 4 RMTs of Rule[1].
    Let S[1] = {j} for Rule[1][j] = 0 and S[2] = {j} for Rule[1][j] = 1 (0 ≤ j ≤ 3).
    Set nos := 2.
Step 2: For i = 2 to n − 1
    2.1 For j = 1 to nos
        Determine the 4 RMTs of the next level node from S[j] using Table 3.3.
        Distribute two 0s and two 1s arbitrarily in these 4 RMTs such that equivalent RMTs of the same node do not receive the same value.
        Store the RMTs that are 0 and 1 in S'[2j − 1] and S'[2j] respectively.
    2.2 Replace RMTs 4, 5, 6 and 7 by the equivalent RMTs 0, 1, 2 and 3 respectively in each S'[k].
    2.3 Remove duplicate sets from S' and assign the sets of S' to S.
    2.4 nos := number of sets in S.
Step 3: For j = 1 to nos
    Determine the next 4 RMTs of S[j], of which 2 are don't cares since it is the last rule.
    Distribute 0 and 1 randomly in the two effective RMTs for Rule[n].
Step 4: Report the CA as an n-cell group CA.
Complexity: Step 2 of the algorithm contains a loop that depends on n (the number of cells in the CA to be synthesized). The sub-loop under Step 2 and the loop in Step 3 iterate nos times. Since the maximum value of nos is a constant (4), the running time of the algorithm depends only on n. Hence, the complexity of the synthesis algorithm is O(n).

Example 3.5 This example illustrates the synthesis of a 4-cell group CA following Algorithm 3.2. Suppose that (i) two 0s and two 1s are distributed arbitrarily in the least significant 4 RMTs of the first rule, (ii) S[1] = {0, 2} (i.e., RMTs 0 and 2 are 0), and (iii) S[2] = {1, 3} (i.e., RMTs 1 and 3 are 1). Therefore, the number of nodes at level 1 is 2: one is generated from the edge that comes from {0, 2} and the other from the edge that comes from {1, 3}. Hence, the first rule is 10. The RMTs for these nodes are {0, 1, 4, 5} and {2, 3, 6, 7} (Step 2.1). Two 0s and two 1s are randomly distributed in each set so that equivalent RMTs (RMT 0 & 4, 1 & 5, etc.) do not receive the same value while they constitute a node. Suppose RMTs 0, 1, 2 and 3 are 1, and RMTs 4, 5, 6 and 7 are 0. Hence, S'[1] = {4, 5} & S'[2] = {0, 1} and S'[3] = {6, 7} & S'[4] = {2, 3} (Step 2.1). The 2nd cell rule is therefore 15. The number of sets is 4, and each set produces a node for the next level. Since RMTs 0 & 4, 1 & 5, 2 & 6, and 3 & 7 are equivalent, we replace RMT 4 by 0, 5 by 1, 6 by 2 and 7 by 3 (Step 2.2). Therefore, S'[1] & S'[2] are equivalent, as are S'[3] & S'[4], and the number of unique sets is 2: {0, 1} & {2, 3} (Step 2.3 and Step 2.4).

Similarly, the nodes for the next level, generated from S[1] = {0, 1} and S[2] = {2, 3}, are {0, 1, 2, 3} and {4, 5, 6, 7}. Consider S'[1] = {1, 3} & S'[2] = {0, 2} and S'[3] = {5, 7} & S'[4] = {4, 6}. That is, RMTs 1, 3, 5 and 7 are 0, and RMTs 0, 2, 4 and 6 are 1. Hence the 3rd cell rule is 85. The number of unique sets is again 2: {1, 3} & {0, 2}. The unique nodes for the next level, that is, the predecessors of the leaves, are {2, 3, 6, 7} and {0, 1, 4, 5}. However, RMTs 1, 3, 5 and 7 are don't cares, as this is the rightmost cell of a null boundary CA. Therefore, the effective RMTs of the nodes are {2, 6} and {0, 4}. The values 1 and 0 are distributed randomly over the RMTs of each set. Suppose RMTs 4 and 6 are 0, and RMTs 0 and 2 are 1. Hence the last cell rule is 5 (Step 3). Therefore, the synthesized 4-cell CA is <10, 15, 85, 5>. This CA is equivalent to <90, 15, 85, 15> (Example 3.4). The reachability tree and the compressed reachability tree for the CA are shown in Figure 3.8 and Figure 3.9 respectively.

From Theorem 3.2 and Algorithm 3.1, it can be observed that each rule of a CA plays an important role in determining the group/non-group behavior of the CA. The CA rules are, therefore, further classified as group rules and non-group rules to facilitate the characterization of CA behavior.

Definition 3.9 A rule is a Non-group Rule if its presence in a rule vector makes the CA non-group. Otherwise, the rule is a Group Rule.
Characterization of group rules also leads to a synthesis scheme for group CA. The following section reports such characterization of CA rules and the synthesis scheme.
3.3 Group rules
The group rules are the basic building blocks of a group CA. This section analyzes the properties of group rules in 3-neighborhood dependency.

Theorem 3.4 An unbalanced rule is a non-group rule.

Proof: Let R = <$R_1, R_2, \cdots, R_i, \cdots, R_n$> be a CA in which $R_i$ is an unbalanced rule, and let R'' = <$R_1, R_2, \cdots, R_i'', \cdots, R_n$> be a group CA. All the rules of R and R'' are the same except the i-th rule. We have to prove that R is non-group due to the presence of $R_i$. The reachability tree of R is balanced up to the (i − 1)-th level, as R'' is a group CA with the same rules as R up to the (i − 1)-th cell. Since $R_i$ is unbalanced, there exists at least one node at the (i − 1)-th level that has a child resulting from 1 RMT (or 3 RMTs). This implies that the tree is unbalanced (Theorem 3.2). Therefore, the CA with rule vector R is non-group. Hence the proof. □

Alternative proof: The above theorem can also be proved by considering the basic structure of non-group CA state transition diagrams (Figure 3.5), which contain states with more than one predecessor. Let the i-th rule $R_i$ of a rule vector R = <$R_1, R_2, \cdots, R_i, \cdots, R_n$> be an unbalanced rule, and let the next state value of the i-th cell for k of the RMTs of $R_i$ be $d_i$, where $d_i$ = 0/1 and k > 4. Then there are $k \cdot 2^{n-3}$ current states whose next state belongs to the set S = {$\cdots d_i \cdots$}. The maximum possible number of such next states is $2^{n-1}$. Since $k \cdot 2^{n-3} > 2^{n-1}$ for k > 4, the number of next states is less than the number of current states. This implies that at least one state in S has more than one predecessor. Therefore, a CA with an unbalanced rule is a non-group CA. □

Example 3.6 The 4-cell CA with rule vector <105, 177, 170, 75> is a group CA. Therefore, all four rules are group rules. On the other hand, the CA with rule vector <105, 177, 171, 75> is a non-group CA (Figure 3.5). The presence of rule 171 (binary value 10101011) makes the CA non-group. That is, 171 is a non-group rule. Rule 171 is unbalanced: the number of 1s among its RMTs is 5.

There are $\binom{8}{4} = 70$ balanced CA rules in 3-neighborhood dependency. However, not all of them are group rules. Only 62 are group rules and the remaining 8 are balanced non-group rules. The following theorem characterizes the balanced non-group rules.
Theorem 3.5: A balanced rule with the same value for RMTs 0, 2, 3, 4, or RMTs 0, 4, 6, 7, or RMTs 0, 1, 2, 6, or RMTs 0, 1, 3, 7 is a non-group rule.

Proof: Let the RMTs of a balanced rule r be clustered as g1 = {0, 2, 3, 4} and g2 = {1, 5, 6, 7}, where the value of each RMT in g1 is d and the value of each RMT in g2 is d' (d = 0/1). The following four cases may arise.

Case I: r is the first rule of a rule vector. Since RMTs 4, 5, 6 and 7 are don't cares for the first rule in a null boundary CA, the clustering of RMTs effectively becomes g1 = {0, 2, 3} and g2 = {1}. Hence, the 0-edge (1-edge) at the first level of the reachability tree results from either 3 or 1 (1 or 3) RMTs of r. Therefore, the tree is unbalanced (Theorem 3.2). Hence the CA designed with r is non-group; that is, r is a non-group rule.

Case II: r is the second rule. Suppose the first rule is balanced over its least significant 4 RMTs 0, 1, 2 and 3. The possible clusterings of RMTs that form the 0-edge and 1-edge from the root of the reachability tree are f1 = [{0, 1} & {2, 3}], f2 = [{0, 2} & {1, 3}], and f3 = [{0, 3} & {1, 2}]. That is, for f1, if RMTs 0 and 1 are 1, then RMTs 2 and 3 are 0. Therefore, the RMTs of the level 1 nodes of the tree are:
    for f1: {0, 1, 2, 3} and {4, 5, 6, 7},
    for f2: {0, 1, 4, 5} and {2, 3, 6, 7}, and
    for f3: {0, 1, 6, 7} and {2, 3, 4, 5}.
If the RMTs of the first rule are clustered like f1 or f3, the nodes at level 2 result from one or three RMTs, due to the clustering of the RMTs of r. This implies that the tree is unbalanced and the CA is non-group. If the RMTs of the first rule are clustered like f2, the RMTs of r for the level 1 nodes are [{0, 4} & {1, 5}] and [{2, 3} & {6, 7}]. Therefore, the edges of the reachability tree result from RMTs 0 & 4, RMTs 1 & 5, RMTs 2 & 3, and RMTs 6 & 7. Since RMTs 0 & 4 (similarly 1 & 5) are equivalent (Definition 3.7), two nodes at level 2 are constructed with only 2 RMTs. This violates Corollary 3.1. Hence the CA is non-group.

Case III: r is the i-th rule. Suppose the reachability tree of the CA is balanced up to level (i − 1). Since RMTs 0 & 4, 1 & 5, 2 & 6, and 3 & 7 are equivalent, we can assume without loss of generality that the level i nodes are generated from RMTs 0, 1, 2 and 3. But any combination of these RMTs leads to an unbalanced tree (Case II).

Case IV: r is the n-th rule. Since RMTs 1, 3, 5 and 7 are don't cares for the last rule, the clustering of the RMTs of r effectively becomes g1 = {0, 2, 4} and g2 = {6}. Suppose the reachability tree is balanced up to level (n − 1). Then a number of nodes at level (n − 1) contain two effective RMTs (out of the 4 effective RMTs) that both belong to g1. These nodes have only a single child. This leads to an unbalanced reachability tree, and the CA becomes non-group.

The above discussion implies that the CA is non-group due to the presence of r. Hence r is a non-group rule. With similar logic, it can also be shown that a balanced rule with any of the following RMT clusterings
    a. {0, 4, 6, 7} & {1, 2, 3, 5}
    b. {0, 1, 2, 6} & {3, 4, 5, 7}
    c. {0, 1, 3, 7} & {2, 4, 5, 6}
is a non-group rule. Hence the proof. □

Corollary 3.2 The number of balanced non-group CA rules in 3-neighborhood is 8.

Proof: As there are 4 clusterings of RMTs that lead to a balanced non-group CA rule (Theorem 3.5) and each clustering corresponds to 2 CA rules, the total number of such balanced non-group rules is 4 × 2 = 8. □

From the above discussion, the balanced non-group rules are 29, 46, 71, 116, 139, 184, 209 and 226. The remaining 62 of the 70 balanced rules are the group rules, which are listed in Table 3.5. Only these 62 group rules can form a reversible (group) CA. However, an arbitrary sequence of group rules in a CA rule vector does not necessarily result in a group CA. The following theorem provides the theoretical justification of this fact.
Table 3.5: List of group rules

     15    58    99   142   170   210
     23    60   101   147   172   212
     27    75   102   149   177   216
     30    77   105   150   178   225
     39    78   106   153   180   228
     43    83   108   154   195   232
     45    85   113   156   197   240
     51    86   114   163   198
     53    89   120   165   201
     54    90   135   166   202
     57    92   141   169   204
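The counts behind Theorem 3.5, Corollary 3.2 and Table 3.5 can be reproduced with a few lines of Python. The sketch below is only an illustration: it assumes the usual convention that bit j of the decimal rule number is the next state of RMT j, and the names `CLUSTERS` and `rule_from_ones` are hypothetical helpers, not part of the thesis.

```python
from itertools import combinations

# The four RMT clusterings of Theorem 3.5 (RMTs that share one value).
CLUSTERS = [(0, 2, 3, 4), (0, 4, 6, 7), (0, 1, 2, 6), (0, 1, 3, 7)]


def rule_from_ones(rmts) -> int:
    """Decimal rule number whose next state is 1 exactly for the given RMTs."""
    return sum(1 << r for r in rmts)


# All balanced rules: exactly four of the eight RMTs are 1.
balanced = sorted(rule_from_ones(c) for c in combinations(range(8), 4))

# Each clustering gives two rules: the cluster set to 1, or its complement set to 1.
non_group = sorted(
    rule
    for cluster in CLUSTERS
    for rule in (rule_from_ones(cluster), 255 - rule_from_ones(cluster))
)

group_rules = [r for r in balanced if r not in non_group]

print(len(balanced))      # 70
print(non_group)          # [29, 46, 71, 116, 139, 184, 209, 226]
print(len(group_rules))   # 62
```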
Theorem 3.6 Only a specific sequence of group rules forms a group CA.

Proof: A CA is a group CA if its reachability tree is balanced. An arbitrary sequence of group rules may not result in a balanced reachability tree. Hence, only a specific sequence of group rules can form a group CA. □

Alternative proof: Suppose the group rules of an n-cell CA are configured in such a way that the CA loaded with any seed produces only two types of states, {$\cdots d_i d_{i+1} \cdots$} and {$\cdots d_i' d_{i+1}' \cdots$}, where $d_i$ (= 0/1) is the state of the i-th cell and $d_i'$ is its complement. Therefore, for the $2^n$ current states, the next states belong to S = {$\cdots d_i d_{i+1} \cdots$, $\cdots d_i' d_{i+1}' \cdots$}. The maximum possible cardinality of S is $2 \times 2^{n-2} = 2^{n-1}$. Since the number of next states is less than the number of current states, there exists at least one state in S with more than one predecessor. Therefore, the CA is non-group. Hence an arbitrary sequence of group rules cannot form a group CA; only a specific sequence of group rules does. □

Example 3.7 The CA <90, 15, 85, 15> is a group CA (Example 3.4). However, the CA R = <90, 85, 15, 15> is a non-group CA even though each of the rules in R is a group rule (Table 3.5). The reachability tree for R is shown in Figure 3.10.

Theorem 3.6 indicates that the group rules are interrelated: the sequence of group rules that forms a group CA follows a specific relation. The next section reports a classification of the 62 group rules based on the relation that must be followed to form a rule sequence for a group CA.
3.4 Classification of group rules
The previous section showed that there are specific relations among the group rules that must be respected while forming a group CA. This section identifies those relations and classifies the 62 group rules in order to find the sequence of rules for a group CA rule vector. The classification of rules is presented in Subsection 3.4.1, and Subsection 3.4.2 establishes the relationship among the rules of different classes.
Figure 3.10: Reachability tree for a non-group CA <90, 85, 15, 15> designed with group rules.
3.4.1 Formation of class
Suppose the rules $R_1, R_2, \cdots, R_i$ are selected for cell 1, cell 2, ..., cell i respectively to form an n-cell group CA satisfying Theorem 3.1 and Theorem 3.2. Further, let S be the set of all group rules (|S| = 62). CA cell (i + 1) can then support a set of rules $S_j \subseteq S$ such that any rule of $S_j$ can be selected as $R_{i+1}$ while still satisfying Theorems 3.1 and 3.2. We refer to the class of the (i + 1)-th cell, which supports the rules of $S_j$, as C. The term class is used interchangeably for cell (i + 1) and for $S_j$. Therefore, the class of $S_j$ is also C.

Lemma 1 There are 6 possible classes of group CA cells in 3-neighborhood dependency.

Proof: Each node of the reachability tree of a group CA contains 4 RMTs (Corollary 3.1). Since the sibling RMTs are associated with the same node in the reachability tree and there are 4 sets of sibling RMTs (0 & 1, 2 & 3, 4 & 5, and 6 & 7), 3 different organizations of RMTs for the nodes are possible. The organizations are {0, 1, 2, 3} & {4, 5, 6, 7}, {0, 1, 4, 5} & {2, 3, 6, 7}, and {0, 1, 6, 7} & {2, 3, 4, 5}. Therefore, if the reachability tree contains a node with RMTs {0, 1, 2, 3} at the i-th level, it also contains a node with RMTs {4, 5, 6, 7}. Each level of the reachability tree of a group CA can have either 2 or 4 unique nodes (Theorem 3.3). Whenever there are only 2 unique nodes at a level, the RMTs of the nodes are organized as one of the 3 possible combinations of RMTs. In that case, the rule $R_{i+1}$ is declared to be of class I, II, or III respectively. On the other hand, if there are 4 unique nodes at the level, then the RMTs of the nodes are organized as two of the 3 possible combinations of RMTs. Whenever the nodes are organized like class I & II, I & III, or II & III, the class of that cell is declared as IV, V, or VI respectively. Therefore, there are 6 classes of group rules. □
Table 3.6: Class Table

Class | RMTs of nodes | Rules
I   | {0, 1, 2, 3}, {4, 5, 6, 7} | 51, 53, 54, 57, 58, 60, 83, 85, 86, 89, 90, 92, 99, 101, 102, 105, 106, 108, 147, 149, 150, 153, 154, 156, 163, 165, 166, 169, 170, 172, 195, 197, 198, 201, 202, 204
II  | {0, 1, 4, 5}, {2, 3, 6, 7} | 15, 30, 45, 60, 75, 90, 105, 120, 135, 150, 165, 180, 195, 210, 225, 240
III | {0, 1, 6, 7}, {2, 3, 4, 5} | 15, 23, 27, 39, 43, 51, 77, 78, 85, 86, 89, 90, 101, 102, 105, 106, 113, 114, 141, 142, 149, 150, 153, 154, 165, 166, 169, 170, 177, 178, 204, 212, 216, 228, 232, 240
IV  | {0, 1, 2, 3}, {4, 5, 6, 7}, {0, 1, 4, 5}, {2, 3, 6, 7} | 60, 90, 105, 150, 165, 195
V   | {0, 1, 2, 3}, {4, 5, 6, 7}, {0, 1, 6, 7}, {2, 3, 4, 5} | 51, 85, 86, 89, 90, 101, 102, 105, 106, 149, 150, 153, 154, 165, 166, 169, 170, 204
VI  | {0, 1, 4, 5}, {2, 3, 6, 7}, {0, 1, 6, 7}, {2, 3, 4, 5} | 15, 90, 105, 150, 165, 240
Rules under each class: Each node of the reachability tree for a group CA is constructed with 4 RMTs (Corollary 3.1), and both edges (0-edge and 1-edge) of the node result from 2 RMTs (Theorem 3.2); that is, 2 of the 4 RMTs are 0 and the others are 1. Therefore, the RMTs of a node may be grouped in $\binom{4}{2} = 6$ different ways. However, the RMTs of the nodes of class II cannot be grouped in all of these 6 ways. For class II (the RMT partition is {0, 1, 4, 5} & {2, 3, 6, 7}), each node contains equivalent RMTs (Definition 3.7): 0 & 4 and 1 & 5 in one node, 2 & 6 and 3 & 7 in the other. Equivalent RMTs contribute the same set of RMTs to the next level. If two equivalent RMTs are grouped together to generate a node for the next level, the number of RMTs of that node becomes 2, which makes the CA non-group (Corollary 3.1). Therefore, equivalent RMTs under the same node cannot be grouped to produce d (d = 0/1) simultaneously. Hence only 4 of the 6 groupings of RMTs are possible in each node for class II, and the number of group rules of class II is 4 × 4 = 16. Since equivalent RMTs are not associated with the same node for classes I and III, all 6 groupings are possible for each node. Hence the number of rules for each of those classes is 6 × 6 = 36. The classes and the corresponding rules are given in Table 3.6. The rules under classes IV, V and VI are the common rules between I & II, I & III, and II & III respectively.
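The counting argument above can be checked mechanically. The following Python sketch is an illustration under the assumption that bit j of a decimal rule number is the next state of RMT j; the helper names `NODES`, `splits_node` and `rules_of` are hypothetical. A rule is placed in a class when it sends exactly two RMTs of every node of that class to 0 and two to 1, without putting two equivalent RMTs (k and k + 4) on the same edge.

```python
# Node organizations of the three basic classes (Table 3.6).
NODES = {
    "I": [(0, 1, 2, 3), (4, 5, 6, 7)],
    "II": [(0, 1, 4, 5), (2, 3, 6, 7)],
    "III": [(0, 1, 6, 7), (2, 3, 4, 5)],
}


def splits_node(rule: int, node) -> bool:
    """True if each edge of `node` gets 2 RMTs and no edge pairs equivalent RMTs."""
    for value in (0, 1):
        edge = [r for r in node if ((rule >> r) & 1) == value]
        if len(edge) != 2 or abs(edge[0] - edge[1]) == 4:
            return False
    return True


def rules_of(nodes) -> list:
    """All rules (0..255) that split every node of a class in a valid way."""
    return [rule for rule in range(256)
            if all(splits_node(rule, node) for node in nodes)]


classes = {name: rules_of(nodes) for name, nodes in NODES.items()}
# Classes IV, V and VI hold the rules common to two basic classes.
classes["IV"] = [r for r in classes["I"] if r in classes["II"]]
classes["V"] = [r for r in classes["I"] if r in classes["III"]]
classes["VI"] = [r for r in classes["II"] if r in classes["III"]]

all_group_rules = sorted(set(classes["I"]) | set(classes["II"]) | set(classes["III"]))

for name in ("I", "II", "III", "IV", "V", "VI"):
    print(name, len(classes[name]), classes[name])
print(len(all_group_rules))
```

Under these assumptions the class sizes come out as 36, 16, 36, 6, 18 and 6, and the union of classes I, II and III contains 62 rules, in line with Table 3.5 and Table 3.6.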
Figure 3.11: Determination of class relationship. (a) Next rule class is I (two unique nodes for $R_{i+1}$ at level i). (b) Next rule class is IV (four unique nodes for $R_{i+1}$ at level i).
3.4.2 Class relationship between $R_i$ and $R_{i+1}$
This section determines the relationship between the classes of $R_i$ and $R_{i+1}$. From the known $R_i$ and its class, we can find the class of $R_{i+1}$. Let the class of $R_i$ be I (Figure 3.11). Then two unique nodes with RMTs {0, 1, 2, 3} and {4, 5, 6, 7} are available at the (i − 1)-th level of the reachability tree. Now suppose the RMTs of $R_i$ are clustered as {0, 1, 4, 5} and {2, 3, 6, 7}, where the RMTs of a set have the same value, either 0 or 1. In Figure 3.11(a), the RMTs {0, 1, 4, 5} are taken as 0, and the RMTs {2, 3, 6, 7} as 1. Therefore, the RMTs are grouped as (0, 1), (2, 3), (4, 5) and (6, 7). Each edge of the nodes results from one of these groups. Hence the two edges connecting the node with RMTs {0, 1, 2, 3} to its children result from (0, 1) and (2, 3), and the two children (at the next level) of that node have RMTs {0, 1, 2, 3} and {4, 5, 6, 7} (Table 3.3, Figure 3.11(a)). Similarly, the children of the other node, with RMTs {4, 5, 6, 7}, are constructed with RMTs {0, 1, 2, 3} and {4, 5, 6, 7}; that is, these nodes are the same as the other two children. Therefore, the next level of the reachability tree contains two unique nodes with RMTs {0, 1, 2, 3} and {4, 5, 6, 7} (Figure 3.11(a)). Hence the class of $R_{i+1}$ is I.

Further, if the RMTs of $R_i$ are grouped as (0, 1), (2, 3), (4, 6) and (5, 7) (Figure 3.11(b)), the nodes of level i generated from the node of level (i − 1) with RMTs {0, 1, 2, 3} have RMTs {0, 1, 2, 3} and {4, 5, 6, 7}. The other two nodes at level i, generated from the node with RMTs {4, 5, 6, 7}, have RMTs {0, 1, 4, 5} and {2, 3, 6, 7} (Figure 3.11(b)). In this case, the next level of the reachability tree contains four unique nodes with RMTs {0, 1, 2, 3}, {4, 5, 6, 7}, {0, 1, 4, 5} and {2, 3, 6, 7}. This organization of RMTs has the properties of both Class I and Class II. Therefore, the class of $R_{i+1}$ is IV.

Table 3.7 partly displays the relationship between group rules. Only 3 classes, I, II and IV, are selected to illustrate the relationship. The first column shows the class of $R_i$. Column 2 notes the RMTs of the unique nodes at level (i − 1), whereas Column 3 shows the grouping of RMTs for $R_i$. The RMTs of the unique nodes at level i are shown in Column 4. Based on the unique nodes at level i, the class of $R_{i+1}$ is decided and is reported in Column 5.

Table 3.7: Formation of class relationship between $R_i$ and $R_{i+1}$

(1) Class of $R_i$ | (2) RMTs of unique nodes at level (i − 1) | (3) Groupings of RMTs at level (i − 1) | (4) RMTs of unique nodes at level i | (5) Class of $R_{i+1}$
I  | {0, 1, 2, 3}, {4, 5, 6, 7} | (0, 1), (2, 3), (4, 5), (6, 7) | {0, 1, 2, 3}, {4, 5, 6, 7} | I
I  | {0, 1, 2, 3}, {4, 5, 6, 7} | (0, 2), (1, 3), (4, 6), (5, 7) | {0, 1, 4, 5}, {2, 3, 6, 7} | II
I  | {0, 1, 2, 3}, {4, 5, 6, 7} | (0, 3), (1, 2), (4, 7), (5, 6) | {0, 1, 6, 7}, {2, 3, 4, 5} | III
I  | {0, 1, 2, 3}, {4, 5, 6, 7} | {(0, 1), (2, 3), (4, 6), (5, 7)} or {(0, 2), (1, 3), (4, 5), (6, 7)} | {0, 1, 2, 3}, {4, 5, 6, 7}, {0, 1, 4, 5}, {2, 3, 6, 7} | IV
II | {0, 1, 4, 5}, {2, 3, 6, 7} | (0, 1), (4, 5), (2, 3), (6, 7) | {0, 1, 2, 3}, {4, 5, 6, 7} | I
IV | {0, 1, 2, 3}, {4, 5, 6, 7}, {0, 1, 4, 5}, {2, 3, 6, 7} | (0, 1), (2, 3), (4, 5), (6, 7) | {0, 1, 2, 3}, {4, 5, 6, 7} | I
IV | {0, 1, 2, 3}, {4, 5, 6, 7}, {0, 1, 4, 5}, {2, 3, 6, 7} | {(0, 1), (2, 3), (4, 6), (5, 7)} or {(0, 2), (1, 3), (4, 5), (6, 7)} | {0, 1, 2, 3}, {4, 5, 6, 7}, {0, 1, 4, 5}, {2, 3, 6, 7} | IV

The details of the relationship among the classes are reported in Table 3.8. The first and second columns of the table represent the class of the i-th cell and the rule $R_i$ respectively, whereas the class of the (i + 1)-th cell corresponding to this pair (the class of the i-th cell and $R_i$) is noted in the last column.

Table 3.8: Class relationship of $R_i$ and $R_{i+1}$

Class of $R_i$ | $R_i$ | Class of $R_{i+1}$
I   | 51, 60, 195, 204 | I
I   | 85, 90, 165, 170 | II
I   | 102, 105, 150, 153 | III
I   | 53, 58, 83, 92, 163, 172, 197, 202 | IV
I   | 54, 57, 99, 108, 147, 156, 198, 201 | V
I   | 86, 89, 101, 106, 149, 154, 166, 169 | VI
II  | 15, 30, 45, 60, 75, 90, 105, 120, 135, 150, 165, 180, 195, 210, 225, 240 | I
III | 15, 51, 204, 240 | I
III | 85, 105, 150, 170 | II
III | 90, 102, 153, 165 | III
III | 23, 43, 77, 113, 142, 178, 212, 232 | IV
III | 27, 39, 78, 114, 141, 177, 216, 228 | V
III | 86, 89, 101, 106, 149, 154, 166, 169 | VI
IV  | 60, 195 | I
IV  | 90, 165 | IV
IV  | 105, 150 | V
V   | 51, 204 | I
V   | 85, 170 | II
V   | 102, 153 | III
V   | 86, 89, 90, 101, 105, 106, 149, 150, 154, 165, 166, 169 | VI
VI  | 15, 240 | I
VI  | 105, 150 | IV
VI  | 90, 165 | V

It can be observed that a rule can be a member of more than one class. For example, rules 90, 105, 150 and 165 are members of all 6 classes. Such rules are referred to as complete rules.

Definition 3.10 A rule is complete if it is a member of each class.

First and last rule: The class identification of rules is applicable to both null boundary and periodic boundary CA. In this work, we concentrate only on 1-dimensional 3-neighborhood null boundary CA. RMTs 4, 5, 6 and 7 are don't cares for $R_1$, as the present state of the left neighbor of cell 1 (the leftmost cell of a CA) is always 0. So, there are only 4 effective RMTs (0, 1, 2, 3) for $R_1$. Similarly, RMTs 1, 3, 5 and 7 are don't care RMTs for $R_n$. The effective RMTs for $R_n$ are, therefore, 0, 2, 4 and 6. That is, rules 105 and 9 are equivalent if selected as $R_1$. Similarly, rules 75 and 65 are effectively the same when chosen for the n-th CA cell. Therefore, there are $2^{2^2} = 16$ effective rules for the leftmost ($R_1$) as well as for the rightmost ($R_n$) cell. The above discussion is formalized in the following corollary.

Corollary 3.3 If R = <$R_1, R_2, \cdots, R_n$> is a group CA, then $R_1$ and $R_n$ must be balanced over their 4 effective RMTs.

Proof: Suppose the first rule is unbalanced over its 4 effective RMTs; that is, let the next state for 3 of the 4 effective RMTs of $R_1$ be d (d = 0/1). Then there are $3 \cdot 2^{n-2}$ current states whose next state belongs to S = {$d \cdots$}. The maximum possible number of such next states is $2^{n-1}$. Since the number of next states is less than the number of current states, at least one state in S has more than one predecessor. Hence the CA is non-group, because $R_1$ is unbalanced over its 4 effective RMTs. Therefore, to form a group CA, $R_1$ must be balanced over its 4 effective RMTs. With similar logic, it can also be proved that $R_n$ has to be balanced over its 4 effective RMTs. □

The corollary signifies that the unbalanced rule 3 is a group rule when it is selected as $R_1$: rule 3 is balanced over its effective (least significant) 4 RMTs. There are $\binom{4}{2} = 6$ rules (out of the 16 effective rules for $R_1$) that are balanced over their least significant 4 RMTs. Table 3.9 identifies these 6 rules and the corresponding class of rule $R_2$ for the second CA cell.

Table 3.9: First Rule Table

Rules for $R_1$ | Groupings of RMTs | RMTs of nodes for level 2 | Class of $R_2$
3, 12 | (0, 1), (2, 3) | {0, 1, 2, 3}, {4, 5, 6, 7} | I
5, 10 | (0, 2), (1, 3) | {0, 1, 4, 5}, {2, 3, 6, 7} | II
6, 9  | (0, 3), (1, 2) | {0, 1, 6, 7}, {2, 3, 4, 5} | III

A similar consideration holds for $R_n$. Table 3.10 lists the 6 group rules (5, 17, 20, 65, 68, 80) for $R_n$. The group CA synthesis scheme was proposed in Algorithm 3.2 (Section 3.2.4). However, a relatively simpler method to synthesize a group CA can be developed based on Tables 3.8, 3.9 and 3.10. For example, consider the synthesis of a 4-cell group CA and say rule 9 is selected randomly as $R_1$ from Table 3.9. Therefore, the class (obtained from Table 3.9) of the 2nd cell rule is III.
From Class III of Table 3.8, say rule 177 is selected randomly as $R_2$. The class of $R_3$ is then found to be V, since the class of the 2nd cell rule is III and $R_2$ = 177 (Table 3.8). We select rule 170 as $R_3$ from Class V in consultation with Table 3.8. The class of the last cell is, therefore, II. Rule 65 is selected randomly for $R_4$ from Class II of Table 3.10. Therefore, the 4-cell group CA is R = <9, 177, 170, 65>. The formal algorithm, of O(n) complexity, to synthesize a group CA is presented below.
Table 3.10: Last Rule Table

Rule class for $R_n$ | Rule set for $R_n$
I   | 17, 20, 65, 68
II  | 5, 20, 65, 80
III | 5, 17, 68, 80
IV  | 20, 65
V   | 17, 68
VI  | 5, 80
Algorithm 3.3 SynthesizeGroupCA 2
Input: n (CA size), Tables 3.8, 3.9 and 3.10.
Output: A group CA, that is, the rule vector R = <$R_1, R_2, \cdots, R_n$>.
Step 1: Pick the first rule $R_1$ randomly from Table 3.9 and set the class of $R_2$.
    C := Class of $R_2$ (C ∈ {I, II, III}).
Step 2: For i := 2 to n − 1 repeat Step 3 and Step 4.
Step 3: From class C of Table 3.8, pick a rule randomly as $R_i$.
Step 4: Find the class C' of the (i + 1)-th cell rule from Table 3.8 (C' ∈ {I, II, III, IV, V, VI}) based on the rule $R_i$ and its class C.
    Assign C := C'.
Step 5: From class C of Table 3.10, pick a rule as $R_n$.
Step 6: Form the rule vector R = <$R_1, R_2, \cdots, R_n$>.
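A Python sketch of this table-driven synthesis is given below. It is an illustration, not the thesis implementation: instead of storing Table 3.8 explicitly, the hypothetical helper `next_class` recomputes each class transition from the node organizations of Table 3.6 and the RMT succession of Table 3.3 (folded RMT k is followed by RMTs 2k and 2k + 1), while Tables 3.9 and 3.10 are entered as small dictionaries. The sketch assumes a null-boundary CA with n ≥ 2 cells and rule numbers whose bit j is the next state of RMT j; `is_bijective` is an exhaustive cross-check added only for testing.

```python
import random
from itertools import product

PARTITIONS = {
    "I": ({0, 1, 2, 3}, {4, 5, 6, 7}),
    "II": ({0, 1, 4, 5}, {2, 3, 6, 7}),
    "III": ({0, 1, 6, 7}, {2, 3, 4, 5}),
}
COMPOSITE = {"IV": ("I", "II"), "V": ("I", "III"), "VI": ("II", "III")}
FIRST_RULES = {3: "I", 12: "I", 5: "II", 10: "II", 6: "III", 9: "III"}    # Table 3.9
LAST_RULES = {"I": (17, 20, 65, 68), "II": (5, 20, 65, 80),               # Table 3.10
              "III": (5, 17, 68, 80), "IV": (20, 65), "V": (17, 68), "VI": (5, 80)}


def class_nodes(cls):
    """Unique nodes (sets of 4 RMTs) of a class, as in Table 3.6."""
    parts = COMPOSITE.get(cls, (cls,))
    return frozenset(frozenset(node) for p in parts for node in PARTITIONS[p])


CLASS_OF = {class_nodes(c): c for c in ("I", "II", "III", "IV", "V", "VI")}


def bit(rule, rmt):
    return (rule >> rmt) & 1


def next_class(cls, rule):
    """Class of cell i+1 for class `cls` and rule R_i; None if the rule is not in the class."""
    nodes = set()
    for node in class_nodes(cls):
        for value in (0, 1):
            edge = [r for r in node if bit(rule, r) == value]
            folded = {r % 4 for r in edge}            # RMTs 4..7 replaced by 0..3
            if len(edge) != 2 or len(folded) != 2:    # edge must hold 2 non-equivalent RMTs
                return None
            nodes.add(frozenset(2 * k + d for k in folded for d in (0, 1)))
    return CLASS_OF[frozenset(nodes)]


def synthesize_group_ca(n, rng):
    """Sketch of Algorithm 3.3 with class transitions computed on the fly."""
    first = rng.choice(sorted(FIRST_RULES))           # Step 1
    rules, cls = [first], FIRST_RULES[first]
    for _ in range(n - 2):                            # Steps 2 to 4
        candidates = [r for r in range(256) if next_class(cls, r) is not None]
        rule = rng.choice(candidates)
        rules.append(rule)
        cls = next_class(cls, rule)
    rules.append(rng.choice(LAST_RULES[cls]))         # Step 5
    return rules


def is_bijective(rules):
    """Exhaustive reversibility check (exponential in n, for testing only)."""
    n = len(rules)
    images = set()
    for state in product((0, 1), repeat=n):
        cells = (0,) + state + (0,)
        images.add(tuple(bit(rules[i], 4 * cells[i] + 2 * cells[i + 1] + cells[i + 2])
                         for i in range(n)))
    return len(images) == 2 ** n


ca = synthesize_group_ca(6, random.Random(1))
print(ca, is_bijective(ca))
```

Scanning all 256 rules at every cell keeps the sketch short; a lookup table keyed by class, as in Table 3.8, would avoid that constant-factor cost.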
The earlier discussions target reversible (group) CA. However, irreversible (non-group) CA have a wide application domain of their own. Characterization of the state transition behavior of such CA is reported in the following section.
3.5 Characterization of irreversible CA
This section presents a scheme that characterizes the states of an irreversible CA. It identifies the non-reachable states and also computes the number of non-reachable states of an irreversible (non-group) CA in O(n) time. Further characterization results are reported in Chapter 7, which describes the application of such CA to data services in cellular mobile networks. The proposed scheme for characterizing irreversible CA states is a generalization of Algorithm 3.1 reported in Section 3.2.3. The theoretical aspects of this characterization are formulated in the following theorems.

Theorem 3.7: An n-cell non-group CA contains at least $2^{n-3}$ non-reachable states.

Proof: In 3-neighborhood, $\frac{1}{8}$ of the total states are determined by each of the 8 RMTs of the i-th CA cell rule $R_i$. Since the CA is non-group, there is at least one RMT of $R_i$ that causes an unbalanced reachability tree for the CA. Therefore, $\frac{1}{8}$ of the total states are non-reachable. Hence, the number of non-reachable states is at least $\frac{2^n}{8} = 2^{n-3}$. □

Theorem 3.8: An n-cell non-group CA constructed only with balanced rules contains at least $2^{n-2}$ non-reachable states.

Proof: Suppose the reachability tree for an n-cell non-group CA, configured only with balanced rules, is balanced up to the i-th level and rule $R_i$ is responsible for the loss of balance. Since $R_i$ is balanced, there exist at least 2 RMTs that make the tree unbalanced. As $\frac{1}{8}$ of the total states are determined by an RMT, the number of non-reachable states for such a CA is at least $\frac{1}{8} + \frac{1}{8} = \frac{1}{4}$ of the total states. Hence an n-cell non-group CA configured with balanced rules contains at least $2^{n-2}$ non-reachable states. Hence the proof. □

Corollary 3.4 An n-cell linear/additive non-group CA contains at least $2^{n-2}$ non-reachable states.

Proof: Since a linear/additive rule is balanced, the result follows directly from Theorem 3.8. □

Computing the number of non-reachable states: The following algorithm (CalNonReachableStates) is a generalization of Algorithm 3.1. Algorithm 3.1 terminates as soon as it detects the existence of a non-reachable state, whereas CalNonReachableStates (Algorithm 3.4) scans the whole rule vector of a CA and then decides on the number of non-reachable states. The algorithm uses the variables S, an array of sets, and nos, the number of sets in S. The arrays oldWeight and newWeight are used to store the number of states that may be reachable and the number of states that are non-reachable respectively. The number of non-reachable states is stored in the variable NS.

Algorithm 3.4 CalNonReachableStates
Input: n (CA size), Rule[n][8] (CA).
Output: number of non-reachable states.
Step 1: Find S[1] = {j}, where Rule[1][j] = 0 and 0 ≤ j ≤ 3, and S[2] = {j}, where Rule[1][j] = 1 and 0 ≤ j ≤ 3.
    If S[i] = φ, where i = 1, 2, then set NS := $2^{n-1}$, oldWeight[1] := $2^{n-1}$ and nos := 1.
    Otherwise, set NS := 0, oldWeight[1] := oldWeight[2] := $2^{n-1}$ and nos := 2.
Step 2: For i = 2 to n − 1
    2.1 For j = 1 to nos
        Determine the RMTs of the next level node from S[j] using Table 3.3.
        Distribute these RMTs of the i-th rule having next state value 0 into S'[2j − 1] and those having next state value 1 into S'[2j].
        Set newWeight[2j − 1] := newWeight[2j] := oldWeight[j]/2.
        If S'[k] = φ, then set NS := NS + newWeight[k], where k = 2j − 1, 2j.
    2.2 Replace RMTs 4, 5, 6 and 7 by the equivalent RMTs 0, 1, 2 and 3 respectively in each S'[k].
    2.3 If S'[k] = S'[k'] for any k', then set oldWeight[k] := newWeight[k] + newWeight[k'];
        otherwise, set oldWeight[k] := newWeight[k].
    2.4 Assign the unique sets of S' to S, and find nos := number of sets in S.
Step 3: For j = 1 to nos
    Determine the next RMTs of S[j], of which the don't care RMTs (1, 3, 5 and 7) are invalid since it is the last rule.
    Distribute these RMTs of the last rule having next state value 0 into S'[2j − 1] and those having next state value 1 into S'[2j].
    If S'[k] = φ, then set NS := NS + oldWeight[j]/2, where k = 2j − 1, 2j.
Step 4: Report the value of NS as the number of non-reachable states of the CA.
Complexity: Since Algorithm 3.4 uses a main loop in Step 2 that depends on n, and the maximum value of nos is constant, the execution time of the algorithm is dependent on n. Therefore, the complexity of the above algorithm is clearly O(n).
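For small n, the count reported by Algorithm 3.4 can be cross-checked by exhaustive enumeration. The following is a minimal brute-force sketch, not Algorithm 3.4 itself: it takes exponential time, assumes a null-boundary CA (the missing neighbors of the first and last cells are fixed at 0), and the function names are illustrative only.

```python
from itertools import product


def next_state(rule_vector, state):
    """One step of a null-boundary, 3-neighborhood hybrid CA.

    rule_vector[i] is the rule number (0..255) of cell i; state is a tuple of bits.
    """
    n = len(rule_vector)
    nxt = []
    for i, rule in enumerate(rule_vector):
        left = state[i - 1] if i > 0 else 0        # null boundary on the left
        right = state[i + 1] if i < n - 1 else 0   # null boundary on the right
        rmt = (left << 2) | (state[i] << 1) | right
        nxt.append((rule >> rmt) & 1)              # next state value of this RMT
    return tuple(nxt)


def count_non_reachable(rule_vector):
    """Count the states that have no predecessor by enumerating all 2^n states."""
    n = len(rule_vector)
    reachable = {next_state(rule_vector, s) for s in product((0, 1), repeat=n)}
    return 2 ** n - len(reachable)
```

Under these assumptions, a 3-cell CA with rule 90 in every cell is non-group and has 4 non-reachable states, which respects the $2^{n-2}$ bound of Theorem 3.8, while a group CA yields 0.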
3.6
Conclusion
This chapter reports the detailed characterization of 1-dimensional 3-neighborhood hybrid CA. The concept of the reachability tree is introduced to characterize the CA. An O(n) time scheme is proposed to identify a reversible (group) CA. A linear time solution is also proposed for the synthesis of reversible CA through classification of all the 256 CA rules into 6 classes. Further, a scheme to compute the number of non-reachable states of an irreversible CA is proposed with O(n) complexity. The characterization of the attractors and basins is reported in Chapter 7. One of the applications of the nonlinear CA theory developed in this chapter is the design of pseudo-random pattern generators (PRPGs). The synthesis of a CA based PRPG is reported in the next chapter.
Chapter 4
Nonlinear Cellular Automata Based Pseudo-Random Pattern Generator

The PRPG plays an important role in various computational fields: Monte Carlo techniques, Brownian dynamics, zero knowledge proofs, and stochastic optimization methods such as simulated annealing and genetic algorithms. Further, PRPGs are of significant importance in VLSI circuit testing, cryptography, etc. [26, 51, 92, 130]. The PRPGs for VLSI applications are traditionally implemented with Linear Feedback Shift Registers (LFSRs) [26] and linear Cellular Automata (CA) [51, 130]. The GLFSR [229], Phase-Shift LFSR [233], weighted random number generators [123, 207], Hierarchical CA [258] and ring generators [197] have also been reported for the design of efficient PRPGs. The LFSRs, maximal length linear CA and ring generators are also utilized for the scalable design of PRPG structures in limited cases [268, 197]. However, maximal length linear CA/LFSRs are found to be the base structure for today's PRPGs, especially in VLSI applications. In a linear CA based PRPG, designed with an n-cell maximal length CA [51], all the $(2^n - 1)$ non-zero patterns can be generated starting from any initial pattern (seed). It is established that such a PRPG is comparatively better than those designed with the LFSR [51, 130], in terms of the pseudo-randomness quality of the generated patterns. However, the conventional CA based PRPG design suffers from the following drawbacks:
• It needs an n-degree primitive polynomial to synthesize an n-cell maximal length CA. The computation involved in finding a primitive polynomial is of exponential complexity.
• The list of n-degree primitive polynomials is available only for n ≤ 500 [2].
• Even if the n-degree primitive polynomial is available, the complexity of synthesizing an n-cell maximal length CA from a primitive polynomial is $O(n^3)$ [41, 111].
• Scalable design, that is, the design of an (n + 1)-cell PRPG from the available design of an n-cell PRPG, is most difficult to realize.

In this chapter, we introduce an efficient Pseudo-Random Pattern Generator (PRPG) developed around the nonlinear CA, synthesized in Chapter 3, with linear time complexity. The theoretical basis for the better randomness of such a CA based PRPG is also reported. A number of works [51, 119, 130] have been reported to characterize the randomness property of CA. However, such schemes mostly rely on statistical analysis of the randomness quality. On the other hand, the work reported in this chapter provides a complete characterization of the CA rules employed for the design of the PRPG, in terms of the randomness quality. The major outcomes of this chapter can be summarized as follows:

• An analytical framework has been provided to characterize the nonlinear CA rules generating a global function that represents a PRPG of excellent quality.
• The proposed characterization has enabled a formal proof of why a linear CA with 90/150 rules displays high quality of pseudo-randomness [51, 130].
• A scheme has been proposed for the synthesis of an n-cell CA based scalable PRPG in O(n) time [90]. The complexity of designing an (n + 1)-cell PRPG from an available n-cell PRPG structure is constant (two time steps).
• It has been experimentally verified that the quality of randomness of the proposed PRPG, designed with O(n) complexity, is at least as good as that of a maximal length linear CA based PRPG involving $O(n^3)$ design complexity.
• A genetic algorithm based scheme is developed to evolve PRPGs with improved randomness quality. It has been observed that all the empirical tests proposed in DIEHARD [175] are passed by a 45-cell CA. This is less than the minimum number of cells (i.e., 55 cells) needed for a 1-D linear CA to pass the DIEHARD tests.

Chapter 3 reports the details of the nonlinear cellular automata theory that is employed in the current design. However, for ease of reference, the following section revisits the topics of CA relevant to the design of the PRPG.
4.1
Cellular Automata Revisited
Details of the CA structure are noted in Section 3.1 of Chapter 3. A CA is viewed as an autonomous finite state machine (Figure 3.3 of Chapter 3). In our context, each CA cell has two states, 0 or 1, and the CA is a 3-neighborhood 1-dimensional CA.
The next state of a CA cell i is

$$S_i^{t+1} = f_i(S_{i-1}^t, S_i^t, S_{i+1}^t)$$

where $S_{i-1}^t$, $S_i^t$ and $S_{i+1}^t$ are the present states of the left neighbor, self and right neighbor of the i-th cell at time t. $f_i$ is the next state function and can be expressed in the form of a truth table (Table 3.1). The decimal equivalent of the function denotes the rule $R_i$ for the i-th CA cell.
The sequence of rules $R = \langle R_1, R_2, \cdots, R_i, \cdots, R_n \rangle$ that configures the CA cells is called the rule vector. If all the $R_i$s (i = 1, 2, ..., n) employ XOR/XNOR logic, the CA is a linear/additive CA; otherwise it is a nonlinear CA. The analytical framework reported in Chapter 3 to characterize a nonlinear CA introduces the following terminology.

Balanced rule [Definition 3.5]: A rule is Balanced if it contains an equal number of 1s and 0s in its 8-bit binary representation; otherwise it is Unbalanced. Rule 90 (01011010) is a balanced rule. On the other hand, rule 171 (10101011), with five 1s in its binary representation, is unbalanced.

RMT: A combination of the present states (as noted in the 1st row of Table 3.1) can be viewed as the Min Term of a 3-variable ($S_{i-1}^t, S_i^t, S_{i+1}^t$) switching function and is, therefore, referred to as the Rule Min Term (RMT).

A CA can be either a reversible (group) CA or an irreversible (non-group) CA. The state transition diagram of a group CA contains only cyclic states (Figure 3.4 of Chapter 3), whereas a non-group CA has both cyclic and non-cyclic states (Figure 3.5 of Chapter 3). If all the states except the all-0s state (for a linear CA) lie in a single cycle, then it is a maximal length CA. The group CA is employed for the design of the proposed PRPG. The requirements for such a design are reported in the following section.
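The rule and RMT terminology above maps directly onto bit operations. The following is a small sketch, with illustrative function names, assuming the usual convention that bit k of the 8-bit rule number holds the next state value of RMT k.

```python
def rmt_values(rule):
    """Next state values of a rule, indexed by RMT 0..7."""
    return [(rule >> rmt) & 1 for rmt in range(8)]


def is_balanced(rule):
    """A rule is balanced if exactly four of its eight RMTs map to 1."""
    return sum(rmt_values(rule)) == 4


# All balanced rules among the 256 three-neighborhood rules.
balanced_rules = [rule for rule in range(256) if is_balanced(rule)]
```

With this convention, `format(90, "08b")` gives the binary representation `01011010` quoted above (RMT 7 on the left, RMT 0 on the right), and rule 171 is reported as unbalanced.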
4.2
Requirements for the PRPG
The basic objective of the proposed design is to synthesize an n-bit PRPG structure that can easily be extended to an (n + 1)-bit PRPG. The PRPG is developed around the cascadable structure of CA involving nonlinear rules. Therefore, the synthesis scheme should result in a CA that generates a large number of states (patterns) without repetition (Figure 4.1). Further, the generated patterns ($p_1, p_2, \cdots, p_i, \cdots, p_j, \cdots$) should exhibit good quality of pseudo-randomness among themselves.

Figure 4.1: CA states as the pseudo-random patterns (P1, P2, P3, ..., Pi, ..., Pj)

A maximal length CA is the right choice for the PRPG and can generate high quality pseudo-random patterns [51]. The length of the maximal length cycle is exponentially large ($2^n - 1$). The probability of occurrence of a pattern, while running a maximal length CA, is $\frac{1}{2^n}$ [51]. However, the complexity of synthesizing an n-cell maximal length CA is $O(n^3)$, and the scalable design of a PRPG with maximal length CA is hard to realize. In common applications, it is impractical to run the full length ($2^n - 1$) cycle of a CA to generate pseudo-random patterns. For a large value of n, rather than searching for a maximal length cycle, a CA having a large cycle length (in the current design, $O(2^{15})$), with rules that suffice to maintain high quality of pseudo-randomness, can satisfy the requirements of the PRPG. That is, a non-maximal length group CA that can generate a large number of patterns without repetition is the target of our current design. The complexity of the design scheme and the scalable architecture of the design are reported in Section 4.4. The following theorem justifies the use of group CA for the proposed PRPG.

Theorem 4.1: A group CA shows better randomness in its generated states than a non-group CA.

Proof: In a non-group CA, a number of states are non-reachable and a number of states have multiple predecessors. Since several states are mapped into a single state, a non-group CA has a definite bias towards particular states. Therefore, a random change of seeds may produce the same next state in a non-group CA. On the other hand, a random change of seeds produces different next states in a group CA, as each state of the group CA is mapped into a unique next state. Therefore, a group CA shows better randomness quality in its generated states than a non-group CA. □

The randomness in the patterns generated from a CA depends on the randomness in the next states generated from a CA cell in different time steps. That is, the randomness property of the CA rule that configures a CA cell controls the quality of a CA based PRPG.

The above discussion identifies the following objectives, which are to be fulfilled to satisfy the requirements for the design of a scalable PRPG:
1. synthesize a nonlinear group CA having a large cycle,
2. ensure that the rules that configure the CA cells generate random states, and
3. find a sequence of CA rules, for an n-cell PRPG, such that the selected rule sequence can also be employed to design an (n + 1)-cell PRPG.

Synthesis of group CA is described in Section 3.2.4 and Section 3.4.2 of Chapter 3. The next two sections report the characterization of the CA rules, targeting realization of objectives 2 and 3.
4.3
Randomness property in CA rules
The rule $R_i$, selected for a CA cell i, determines the interconnection of the cell with its neighbors, that is, the next state (either 0 or 1) generated out of it. This section reports the characterization of $R_i$ in respect of its contribution toward the state transition behavior of a CA as well as the randomness properties of the patterns generated out of the CA states. The value (0/1) of an RMT in the CA rule $R_i$ dictates the randomness in a CA cell (local randomness) configured with $R_i$. It ultimately affects the randomness in the states generated by the CA as a whole (global randomness). The analysis of RMTs to evaluate the randomness in CA rules, that is, the local randomness in CA cells, is reported next.
4.3.1
Local randomness in a CA cell
This subsection reports the analysis of the next state function of nonlinear CA rules to evaluate the local randomness in a cell. Their mapping to the global function that specifies the state transition behavior of an n-cell CA is described in the next subsection. A CA cell displays local randomness in its generated states if the states (0 and 1) are equally probable when the change of states in the neighboring cells is arbitrary. The following theorems and the subsequent discussions characterize the CA rules displaying local randomness.

Theorem 4.2: A CA cell configured with a balanced rule exhibits better randomness than one configured with an unbalanced rule.

Proof: Let a CA cell be configured with an unbalanced rule, and let the next state of the cell corresponding to k of the RMTs be d, where k > 4 and d = 0/1. The probability that the next state is d is then $p = \frac{k}{8}$, so $p > \frac{1}{2}$ as k > 4. That is, the values 0 and 1 are not uniformly distributed over the 8 possible cases. Hence the randomness in the states generated out of an unbalanced CA rule suffers. □

Theorem 4.3: A CA cell whose next state is dependent on $k_1$ neighbors exhibits better randomness than a cell having dependency on $k_2$ neighbors, where $k_1 > k_2$.

Proof: Let us consider that the next states of the i-th cells, $cell_i^1$ and $cell_i^2$, of two CA are dependent on $k_1$ and $k_2$ neighbors respectively, where $k_1 > k_2$. So, the next state of $cell_i^1$ will be affected when any one of its $k_1$ neighbors changes its state. Similarly, the next state of $cell_i^2$ will be affected if one of its $k_2$ neighbors changes its state. Let both the CA be loaded with the same seed (S). The CA cells will generate the states (0/1) based on their neighborhood dependencies. Say S is changed to S' so that at least one bit of S' that corresponds to one of the $k_1$ neighbors of $cell_i^1$ differs from S, while all the bits corresponding to the $k_2$ neighbors of $cell_i^2$ remain unchanged. That is, at most $(k_1 - k_2)$ bits of S are different in S'. Therefore, $cell_i^2$ is stuck to the same state, whereas $cell_i^1$ may change its state following the state changes of its neighbors. This implies that the random change in the seed (S) does not affect $cell_i^2$. Therefore, the patterns generated from $cell_i^1$ display better randomness than those of $cell_i^2$. □

Corollary 4.1: A k-neighborhood CA generates better pseudo-random patterns than a 3-neighborhood CA, where k > 3.

Proof: The proof directly follows from Theorem 4.3. □
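Both quantities used in Theorems 4.2 and 4.3 can be read off the 8-bit rule. The sketch below, with illustrative function names, computes the bias $p = \frac{k}{8}$ of a rule and the set of neighbors its next state actually depends on, assuming RMT k = 4·left + 2·self + right and bit k of the rule number as its next state value.

```python
def next_state_bias(rule):
    """Probability p = k/8 of the more frequent next state value of a rule."""
    ones = bin(rule).count("1")
    return max(ones, 8 - ones) / 8


def dependencies(rule):
    """Neighbors whose state change alters the next state for at least one RMT."""
    masks = {"left": 4, "self": 2, "right": 1}   # bit of each neighbor inside an RMT
    deps = set()
    for name, mask in masks.items():
        # Compare every RMT with the RMT obtained by flipping this neighbor.
        if any(((rule >> rmt) & 1) != ((rule >> (rmt ^ mask)) & 1) for rmt in range(8)):
            deps.add(name)
    return deps
```

For instance, a balanced rule such as 90 has a bias of 0.5, the unbalanced rule 171 has a bias of 5/8, and rule 15 is found to depend on the left neighbor only.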
In 3-neighborhood dependency, the next state of a CA cell may not depend on all its 3 neighbors. For example, if a cell is configured with rule 15, its next state depends only on the left neighbor. Therefore, in order to display good quality of pseudo-randomness, the rule for a CA cell should satisfy certain intrinsic properties. The subsequent discussion points to such properties.

Let the current state of the j-th CA cell at time t be z and the next state function for the cell be $f_j$. The next state of the cell can be any one of the following four alternatives

$$d_0 = f_j(0, z, 0); \quad d_1 = f_j(0, z, 1); \quad d_2 = f_j(1, z, 0); \quad d_3 = f_j(1, z, 1)$$

depending on the present states of its neighbors, where $d_i = 0/1$ (i = 0, 1, 2, 3). The following three cases may arise.

Case I: The values of all the $d_i$s are identical. That is, $d_0 = d_1 = d_2 = d_3 = d$ (say). This indicates that the j-th cell has no dependency on any neighbor. Therefore, the probability of getting stuck to the same state is $p(d|z) = 1$. Hence, $p(d'|z) = 0$, where $d'$ is the complement of d. It implies that if Case I is true, the randomness of the generated patterns strongly suffers.

Case II: Let us assume that 3 out of the 4 $d_i$s are the same. Without loss of generality, say $d_0 = d_1 = d_2 = d$. Therefore, $p(d|z) = \frac{3}{4}$ and $p(d'|z) = \frac{1}{4}$. Here also the state z of the j-th cell has a bias towards d, which leads to poor randomness.

Case III: Let us further consider that 2 of the 4 $d_i$s are the same. If $d_0 = d_1 = d$ (i.e., $d_2 = d_3 = d'$), then d does not depend on the right neighbor of the j-th cell. It means that the random state changes of the (j + 1)-th cell do not affect the next state (d) of the j-th cell. That is, d has a definite bias towards the combination of the (j − 1)-th and j-th cells. This results in poor randomness in the generated patterns. Similarly, for $d_0 = d_2$ (i.e., $d_1 = d_3$), randomness suffers.

Table 4.1: RMTs of rule 90 and 150

| RMT      | 111 (7) | 110 (6) | 101 (5) | 100 (4) | 011 (3) | 010 (2) | 001 (1) | 000 (0) |
|----------|---------|---------|---------|---------|---------|---------|---------|---------|
| Rule 90  | 0       | 1       | 0       | 1       | 1       | 0       | 1       | 0       |
| Rule 150 | 1       | 0       | 0       | 1       | 0       | 1       | 1       | 0       |

The following conclusions are evident from the earlier discussion in respect of the local randomness generated in a CA cell due to its rule:

(i) If $d_0 \neq d_1$ (i.e., $d_2 \neq d_3$) for the j-th rule of a CA, the randomness of the patterns generated by the CA improves. It signifies that different next state values for the RMTs 0z0 & 0z1 (and 1z0 & 1z1) of the j-th rule lead to better randomness. Therefore, if z = 0, the next state values corresponding to the RMTs 000 (0) & 001 (1) (and 100 (4) & 101 (5)) are to be different. Similarly, if z = 1, the next state values corresponding to the RMTs 2 & 3 (and 6 & 7) should also be different. This fact leads to the following property of a rule that ensures generation of good quality random patterns.

Property 1: For good quality of random patterns, the RMTs of each pair (0 & 1), (2 & 3), (4 & 5) and (6 & 7) should exhibit different next state values.

(ii) Further, if $d_0 \neq d_2$ (i.e., $d_1 \neq d_3$) for the j-th rule of a CA, the randomness of the patterns generated by the CA is also improved. This points to the following property.

Property 2: For good quality of random patterns, the RMTs of each pair (0 & 4), (1 & 5), (2 & 6) and (3 & 7) should exhibit different next state values.

For example, rule 90 maintains Property 1, as the next state values for the RMTs 0 & 1 (2 & 3, 4 & 5 and 6 & 7) are different (Table 4.1). Rule 90 also maintains Property 2, as the RMTs of each pair (0 & 4), (1 & 5), (2 & 6) and (3 & 7) show different next state values.

Corollary 4.2: A CA configured with 90/150 rules displays good quality of pseudo-randomness.

Proof: Since both the rules 90 and 150 maintain Property 1 and Property 2 (Table 4.1), a CA configured with these two rules shows better randomness quality.
□

Theorem 4.4: If a rule $R_i$ maintains Property 1 (Property 2), then its complement rule $R_i'$ also maintains Property 1 (Property 2).

Proof: Since $R_i$ maintains Property 1 (Property 2), the next state values of an RMT pair (x and y) differ, that is, if RMT x produces d (= 0/1) then RMT y produces $d'$. Each RMT of $R_i$ is the complement of the corresponding RMT of $R_i'$ (Definition 3.6). Therefore, the next state values of RMTs x and y of $R_i'$ are $d'$ and d respectively. This implies that $R_i'$ maintains Property 1 (Property 2). □

Corollary 4.3: A balanced rule follows Property 1 (Property 2) for either zero (Property not maintained), 2 (partially maintained) or 4 (fully maintained) RMT pairs.

Proof: There are 4 RMT pairs that are to be checked for Property 1 (Property 2) in an 8-bit rule. Therefore, the possible number of RMT pairs for which the next states differ is 0, 1, 2, 3 or 4. However, different next states within exactly 1 or 3 RMT pairs necessarily imply that the rule is unbalanced, since each differing pair contributes one 1 and each non-differing pair contributes zero or two 1s, which makes the total number of 1s odd. Hence, a balanced rule follows the condition for Property 1 (Property 2) for either 0, 2 or 4 RMT pairs. □

Out of the 256 CA rules, only the complete rules (Definition 3.10), that is, rules 90, 105, 150 and 165, fully maintain both Property 1 and Property 2. However, it has been found experimentally that even if only a few rules in a CA rule vector maintain Property 1 and Property 2 partially, the CA shows good randomness behavior (Section 4.5). Therefore, to synthesize the proposed nonlinear CA based PRPG with better randomness quality, we choose the balanced rules that hold Property 1 and Property 2 fully or partially.
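Property 1 and Property 2 can be checked mechanically for any rule. The following sketch assumes that bit k of the rule number is the next state value of RMT k; the names `P1_PAIRS`, `P2_PAIRS`, `differing_pairs` and `property_status` are illustrative. Note that "partial" here covers any count from 1 to 3, whereas for balanced rules Corollary 4.3 restricts it to exactly 2 pairs.

```python
P1_PAIRS = [(0, 1), (2, 3), (4, 5), (6, 7)]   # RMT pairs differing in the right neighbor
P2_PAIRS = [(0, 4), (1, 5), (2, 6), (3, 7)]   # RMT pairs differing in the left neighbor


def differing_pairs(rule, pairs):
    """Number of RMT pairs whose two next state values differ."""
    return sum(((rule >> a) & 1) != ((rule >> b) & 1) for a, b in pairs)


def property_status(rule, pairs):
    """Classify a rule as maintaining a property fully, partially or not at all."""
    count = differing_pairs(rule, pairs)
    if count == 4:
        return "full"
    if count == 0:
        return "none"
    return "partial"


# Rules that fully maintain both properties.
complete_rules = [
    rule for rule in range(256)
    if property_status(rule, P1_PAIRS) == "full"
    and property_status(rule, P2_PAIRS) == "full"
]
```

Enumerating all 256 rules this way yields exactly the four complete rules 90, 105, 150 and 165 named above.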
4.3.2
Global randomness of a CA
The earlier subsection targets the characterization of the randomness property of a CA cell. However, the randomness of the patterns $p_1, p_2, \cdots, p_i, \cdots$ (Figure 4.1) generated from the CA based PRPG depends on the randomness of the states generated by the CA, that is, the global randomness property of the CA states. Theorem 4.1 points to the fact that the proposed PRPG is to be developed around the group CA. Therefore, while selecting a CA cell rule for the proposed PRPG, the design should target the construction of a group CA. It has been shown in Section 3.3 of Chapter 3 that a specific sequence of rules, constituting the rule vector of a CA, plays the key role in determining the cyclic behavior of the CA. Therefore, the following guidelines are to be followed for selecting the CA rules while designing a PRPG.

C1: The CA cell rules should be group rules.
C2: The CA cell rules should satisfy Property 1 and Property 2 (Section 4.3.1).
C3: The CA should have a sufficiently large cycle of length $O(2^c)$. The value of c depends on the number of random patterns required for a particular application. For the current design, we set the lower limit of c as 15.
C4: The selected rule sequence for the CA should result in a scalable structure, that is, the rule sequence selected for an n-cell group CA should also be usable for designing the (n + 1)-cell group CA.

The next section describes the synthesis of the proposed PRPG around the nonlinear group CA.
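Guideline C3 can be checked for a candidate rule vector by measuring the length of the cycle on which a seed lies. The sketch below is a direct simulation under the assumption of a null-boundary CA; the function names are illustrative, and the step bound $2^n$ only makes sense for small n or as a guard against non-cyclic seeds.

```python
def next_state(rule_vector, state):
    """One step of a null-boundary, 3-neighborhood hybrid CA on a tuple of bits."""
    n = len(rule_vector)
    nxt = []
    for i, rule in enumerate(rule_vector):
        left = state[i - 1] if i > 0 else 0
        right = state[i + 1] if i < n - 1 else 0
        rmt = (left << 2) | (state[i] << 1) | right
        nxt.append((rule >> rmt) & 1)
    return tuple(nxt)


def cycle_length(rule_vector, seed):
    """Number of steps after which the seed recurs.

    Raises ValueError if the seed does not recur within 2^n steps,
    which can only happen when the seed is a non-cyclic state.
    """
    state = seed
    for steps in range(1, 2 ** len(rule_vector) + 1):
        state = next_state(rule_vector, state)
        if state == seed:
            return steps
    raise ValueError("seed is not on a cycle (non-group CA?)")
```

For the small rule vector ⟨9, 150, 45, 65⟩, this simulation places the all-0s seed on a cycle of length 13 and the remaining three states on a cycle of length 3; for a practical design the same check would be compared against the $2^{15}$ lower limit.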
4.4
Synthesis of PRPG
This section describes an efficient synthesis scheme for the nonlinear CA based PRPG structure. The synthesis scheme for nonlinear group CA has been reported in Chapter 3. The proposed PRPG synthesis algorithm is developed around Algorithm 3.3 to include the solution to the issues (C2, C3) noted in Section 4.3.2. It selects group rules for the rule vector $R = \langle R_1, \cdots, R_i, \cdots, R_n \rangle$ of a PRPG, following three rule tables, Table 4.2, Table 4.3 and Table 4.4, that maintain Property 1 and/or Property 2 (Section 4.3.1). Not all the group rules noted in Table 3.5 (Chapter 3) maintain Property 1 and/or Property 2, even partially. Therefore, Table 3.8, Table 3.9 and Table 3.10 are modified to Table 4.2, Table 4.3 and Table 4.4 respectively. Table 4.2 contains only the group rules that maintain Property 1 and Property 2 fully or partially (Corollary 4.3), whereas the first cell rules of Table 4.3 and the last cell rules of Table 4.4 fully maintain Property 1 and Property 2 respectively. Example 4.1 illustrates the synthesis of a CA based 4-bit PRPG with the help of Table 4.2, Table 4.3 and Table 4.4. The formal algorithm is presented in Algorithm 4.1.

Example 4.1 Synthesis of a 4-cell PRPG: Consider that rule 9 is selected randomly as $R_1$ from Table 4.3. The class (obtained from Table 4.3) of the 2nd cell rule is therefore III. From Class III of Table 4.2, rule 150 is selected randomly as $R_2$. Rule 150 maintains Property 1 and Property 2 fully (Table 4.2). The class of $R_3$ is found to be II (Table 4.2), since the class of the 2nd cell rule is III and $R_2$ = 150. The next rule (the 3rd rule) is to be selected in such a way that the class of the last cell (i.e., the 4th cell) is not V, as Table 4.4 does not contain any rule of class V for $R_n$. However, for the current example the class of the 4th cell is always I (Table 4.2). We select rule 45 as $R_3$ from Table 4.2. The class of the last cell is, therefore, I, and say rule 65 is selected randomly for $R_4$ from Table 4.4. Therefore, the 4-cell PRPG is $R = \langle 9, 150, 45, 65 \rangle$.

Algorithm 4.1 PRPG-Synthesis
Input: n (CA size), Tables 4.2, 4.3 and 4.4.
Output: A PRPG, that is, the rule vector $R = \langle R_1, R_2, \cdots, R_n \rangle$.
Step 1: Pick the first rule $R_1$ randomly from Table 4.3 and set the class of $R_2$: C := class of $R_2$ (C ∈ {I, II, III} of Table 4.3).
Step 2: For i := 2 to n − 2, repeat Step 3 and Step 4.
Step 3: From class C of Table 4.2, pick a rule as $R_i$.
Step 4: Find the class C' of the (i + 1)-th cell rule from Table 4.2 (C' ∈ {I, II, III, IV, V, VI}) based on the rule $R_i$ and its class C. Assign C := C'.
Step 5: From class C of Table 4.2, pick a rule as $R_{n-1}$ such that the class of $R_n$ is not V. Find the class C ∈ {I, II, III, IV, VI} of $R_n$ from Table 4.2.
Step 6: Pick a rule from Table 4.4 as $R_n$ based on class C.
Step 7: Output the rule vector $R = \langle R_1, R_2, \cdots, R_n \rangle$.
Complexity: The overhead of the proposed PRPG synthesis scheme is the time required to construct Tables 4.2 to 4.4 and the storage required for the tables. However, the construction of the tables is a one-time cost, and the storage required is insignificant for such small tables. Algorithm 4.1 executes a loop (Step 2 to Step 4) to synthesize the desired PRPG. The number of iterations of this loop is proportional to n, the size of the PRPG. All other operations within the algorithm demand constant time. Therefore, the proposed scheme synthesizes a PRPG, of arbitrary length, in linear time (O(n)).

Table 4.2: Class relationship of R_i and R_{i+1} maintaining Property 1 and Property 2

| Class of R_i | R_i | Property 1 | Property 2 | Class of R_{i+1} |
|---|---|---|---|---|
| I | 60, 195 | - | full | I |
| I | 85, 90, 165, 170 | full | full for 90, 165 | II |
| I | 102, 105, 150, 153 | full | full for 105, 150 | III |
| I | 53, 58, 83, 92, 163, 172, 197, 202 | partial | partial | IV |
| I | 54, 57, 99, 108, 147, 156, 198, 201 | partial | partial | V |
| I | 86, 89, 101, 106, 149, 154, 166, 169 | full | partial | VI |
| II | 15, 30, 45, 60, 75, 90, 105, 120, 135, 150, 165, 180, 195, 210, 225, 240 | full for 90, 105, 150, 165; partial for 30, 45, 75, 120, 135, 180, 210, 225 | full | I |
| III | 15, 240 | - | full | I |
| III | 85, 105, 150, 170 | full | full for 105, 150 | II |
| III | 90, 102, 153, 165 | full | full for 90, 165 | III |
| III | 23, 43, 77, 113, 142, 178, 212, 232 | partial | partial | IV |
| III | 27, 39, 78, 114, 141, 177, 216, 228 | partial | partial | V |
| III | 86, 89, 101, 106, 149, 154, 166, 169 | full | partial | VI |
| IV | 60, 195 | - | full | I |
| IV | 90, 165 | full | full | IV |
| IV | 105, 150 | full | full | V |
| V | 85, 170 | full | - | II |
| V | 102, 153 | full | - | III |
| V | 86, 89, 90, 101, 105, 106, 149, 150, 154, 165, 166, 169 | full | full for 90, 105, 150, 165; partial for the rest | VI |
| VI | 15, 240 | - | full | I |
| VI | 105, 150 | full | full | IV |
| VI | 90, 165 | full | full | V |

Table 4.3: First rule table maintaining Property 1

| Rules for R_1 | Class of R_2 |
|---|---|
| 5, 10 | II |
| 6, 9 | III |

Table 4.4: Last rule table maintaining Property 2

| Class for R_n | Rules for R_n |
|---|---|
| I | 20, 65 |
| II | 5, 20, 65, 80 |
| III | 5, 80 |
| IV | 20, 65 |
| VI | 5, 80 |

Scalable design: The major advantage of the PRPG synthesis scheme proposed in this section is that an (n + 1)-cell PRPG can be synthesized in two time steps if an n-cell PRPG structure is known. For example, consider the 4-cell PRPG $R = \langle 9, 150, 45, 65 \rangle$ of Example 4.1. To synthesize a 5-cell PRPG from R, the synthesis tool has to select the rules $R_4$ and $R_5$ for the 4th and 5th CA cells respectively. The rules $R_1$ (= 9), $R_2$ (= 150) and $R_3$ (= 45) for cells 1, 2 and 3 are taken from the rule vector R of the 4-cell PRPG (Figure 4.2). As the class of $R_3$ is II (Example 4.1), the class of the 4th cell rule is I (Table 4.2). We select $R_4$ = 149 randomly from Class I of Table 4.2. The class of the 5th cell rule $R_5$ is then identified as VI. The random selection from Class VI of Table 4.4 gives $R_5$ = 80. Therefore, the resulting 5-cell PRPG is $\langle 9, 150, 45, 149, 80 \rangle$. The design requires only two time steps, for the selection of $R_4$ and $R_5$. The next section evaluates the efficiency of the PRPG designed with Algorithm 4.1 in respect of its randomness quality.
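The table-driven selection of Algorithm 4.1 and the two-step extension can be sketched as follows. This is a minimal illustration, not the original implementation: the dictionaries `FIRST`, `LAST` and `NEXT` are a transcription of Tables 4.3, 4.4 and 4.2 as read here and should be treated as an assumption, all function names are hypothetical, and n ≥ 3 is assumed.

```python
import random

# Table 4.3: first rule R1 -> class of R2 (transcription is an assumption).
FIRST = {5: "II", 10: "II", 6: "III", 9: "III"}
# Table 4.4: class of Rn -> candidate last rules.
LAST = {"I": [20, 65], "II": [5, 20, 65, 80], "III": [5, 80],
        "IV": [20, 65], "VI": [5, 80]}
G86 = [86, 89, 101, 106, 149, 154, 166, 169]
# Table 4.2: class of Ri -> list of (rules, class of Ri+1).
NEXT = {
    "I": [([60, 195], "I"), ([85, 90, 165, 170], "II"),
          ([102, 105, 150, 153], "III"),
          ([53, 58, 83, 92, 163, 172, 197, 202], "IV"),
          ([54, 57, 99, 108, 147, 156, 198, 201], "V"), (G86, "VI")],
    "II": [([15, 30, 45, 60, 75, 90, 105, 120, 135, 150,
             165, 180, 195, 210, 225, 240], "I")],
    "III": [([15, 240], "I"), ([85, 105, 150, 170], "II"),
            ([90, 102, 153, 165], "III"),
            ([23, 43, 77, 113, 142, 178, 212, 232], "IV"),
            ([27, 39, 78, 114, 141, 177, 216, 228], "V"), (G86, "VI")],
    "IV": [([60, 195], "I"), ([90, 165], "IV"), ([105, 150], "V")],
    "V": [([85, 170], "II"), ([102, 153], "III"),
          (G86 + [90, 105, 150, 165], "VI")],
    "VI": [([15, 240], "I"), ([105, 150], "IV"), ([90, 165], "V")],
}


def next_class(cls, rule):
    """Class of the (i+1)-th cell rule, given the class and rule of cell i."""
    for rules, nxt in NEXT[cls]:
        if rule in rules:
            return nxt
    raise ValueError(f"rule {rule} is not listed for class {cls}")


def pick(cls, rng, avoid_v=False):
    """Pick a rule of class cls, optionally skipping rules that lead to class V."""
    choices = [r for rules, nxt in NEXT[cls] for r in rules
               if not (avoid_v and nxt == "V")]
    return rng.choice(choices)


def close(prefix, cls, rng):
    """Steps 5 and 6: append R_{n-1} (of class cls) and R_n to the prefix."""
    rule = pick(cls, rng, avoid_v=True)
    return prefix + [rule, rng.choice(LAST[next_class(cls, rule)])]


def classes(rules):
    """Classes of cells 2 .. len(rules)+1 implied by a rule prefix R1, R2, ..."""
    out = [FIRST[rules[0]]]
    for rule in rules[1:]:
        out.append(next_class(out[-1], rule))
    return out


def synthesize(n, rng):
    """Algorithm 4.1 (sketch): O(n) selection of an n-cell rule vector."""
    rules = [rng.choice(list(FIRST))]            # Step 1
    cls = FIRST[rules[0]]
    for _ in range(2, n - 1):                    # Steps 2-4: cells 2 .. n-2
        rule = pick(cls, rng)
        rules.append(rule)
        cls = next_class(cls, rule)
    return close(rules, cls, rng)                # Steps 5-6


def extend(rules, rng):
    """(n+1)-cell vector from an n-cell one: keep R1..R_{n-1}, pick two rules."""
    prefix = rules[:-1]
    return close(prefix, classes(prefix)[-1], rng)
```

A typical use would be `v = synthesize(32, random.Random(1))` followed by `extend(v, random.Random(2))`; only the last two rules of the extended vector are newly selected, which mirrors the "two time steps" argument above.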
4.5
Randomness Quality of PRPG
This section reports the results of a number of empirical tests that show the randomness quality of the patterns generated by the PRPG designed in Section 4.4. All the empirical tests reported here are included in the Diehard battery of tests. The Diehard battery, proposed by George Marsaglia in 1996, is presently considered more stringent than the classical tests proposed so far. It contains different tests, and each test produces a set of p values. The p value of a test may take any value from 0 to 1. Whenever a set of patterns is tested for its randomness quality, the test fails if the p values are either 0 or 1. For random patterns, the p values lie between 0 and 1. However, in our experimentation, we declare that a test succeeds if the p values are between 0.025 and 0.975 [175].

Figure 4.2: Scalable PRPG design. A 4-cell PRPG with rules ⟨9, 150, 45, 65⟩ (classes of cells 2 to 4: III, II, I) is extended to a 5-cell PRPG with rules ⟨9, 150, 45, 149, 80⟩ (classes of cells 2 to 5: III, II, I, VI).

Table 4.5: Diehard tests

| No. | Name of Tests |
|---|---|
| 1 | Birthday Spacings |
| 2 | Overlapping Permutations |
| 3 | Ranks of 31x31 and 32x32 matrices |
| 4 | Ranks of 6x8 Matrices |
| 5 | The Bitstream Test |
| 6 | Monkey Tests OPSO, OQSO, DNA |
| 7 | Count the 1's in a Stream of Bytes |
| 8 | Count the 1's in Specific Bytes |
| 9 | Parking Lot Test |
| 10 | Minimum Distance Test |
| 11 | The 3DSpheres Test |
| 12 | The Squeeze Test |
| 13 | Overlapping Sums Test |
| 14 | Runs Test |
| 15 | The Craps Test |
Table 4.5 notes the statistical tests available in Diehard. Test numbers 3 and 6 contain two and three different tests respectively. That is, there are 18 statistical tests in total in the Diehard battery. A detailed description of each test is provided in [175]. The p values of a particular experiment, for n = 63, are shown in Table 4.6. The column Max notes the results of the tests with a maximal length linear CA, whereas the column PRPG reports the results of the proposed PRPG. We have experimented with different n (size of the PRPG), varying from 20 to 300. Table 4.7 contains the results for n = 24 and n = 63. For each n, a number of experiments with different seeds have been performed for each test. We declare a test as pass if the test succeeds in at least 75% of the cases; otherwise the test fails. The results of extensive experimentation establish that the randomness quality of the patterns generated by the proposed nonlinear CA based PRPG is at least as good as that of maximal length linear CA/LFSR based designs.

Table 4.6: Randomness test I (for n = 63)

| Name of Test | Max p val | Max status | PRPG p val | PRPG status |
|---|---|---|---|---|
| Birthday Spacings | .634983 | success | .680642 | success |
| Overlapping Permutations | 1.0000 | failure | .974141 | success |
| Ranks of 31x31 matrix | 0.000 | failure | .701 | success |
| Ranks of 32x32 matrix | 0.000 | failure | .127 | success |
| Ranks of 6x8 Matrices | 0.000 | failure | 0.000 | failure |
| The Bitstream Test | .170012 | success | .183236 | success |
| Monkey Test OPSO | .5436 | success | 1.0000 | failure |
| Monkey Test OQSO | .1214 | success | .2365 | success |
| Monkey Test DNA | 0.000 | failure | .7651 | success |
| Count the 1's in a Stream of Bytes | .130782 | success | 1.0000 | failure |
| Count the 1's in Specific Bytes | 1.0000 | failure | .613167 | success |
| Parking Lot Test | .650000 | success | 1.0000 | failure |
| Minimum Distance Test | .900000 | success | .086369 | success |
| The 3DSpheres Test | 1.0000 | failure | .956771 | success |
| The Squeeze Test | .105987 | success | .433171 | success |
| Overlapping Sums Test | 0.0903 | success | 1.0000 | failure |
| Runs Test | 1.0000 | failure | 1.0000 | failure |
| The Craps Test | 1.0000 | failure | 1.0000 | failure |

The PRPG synthesized by Algorithm 4.1 may not be the best one for a particular n. The huge search space of nonlinear CA opens up the possibility of a large number of PRPG structures for a given n. We next propose a Genetic Algorithm (GA) based scheme to evolve the PRPGs, for obtaining an optimum structure.
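The two decision rules used in these experiments, success of a single run and pass over several seeds, can be written down compactly. The sketch below uses illustrative function names and assumes that the bounds 0.025 and 0.975 are exclusive, which the text does not state explicitly.

```python
def succeeds(p, low=0.025, high=0.975):
    """A single run succeeds if its p value lies strictly between the two bounds."""
    return low < p < high


def passes(p_values, threshold=0.75):
    """A test passes if it succeeds for at least 75% of the seeds tried."""
    wins = sum(succeeds(p) for p in p_values)
    return wins >= threshold * len(p_values)
```

For example, a test with p values 0.5, 0.5, 0.5 and 1.0 over four seeds succeeds in exactly 75% of the cases and is therefore declared as pass.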
4.6
Evolution of PRPG
A number of PRPG structures can be synthesized with Algorithm 4.1 for a given n. However, not all the PRPGs are of the same randomness quality. We, therefore, use the GA (Genetic Algorithm) framework to find a better design for the PRPG. The parameters for GA evolution are described in the following subsections.
Table 4.7: Randomness test II

Name of Test                          n = 24           n = 63
                                      Max     PRPG     Max     PRPG
Birthday Spacings                     pass    pass     pass    pass
Overlapping Permutations              pass    pass     pass    pass
Ranks of 31x31 and 32x32 matrices     pass    pass     pass    pass
Ranks of 6x8 Matrices                 pass    pass     pass    pass
The Bitstream Test                    fail    fail     fail    fail
Monkey Tests OPSO, OQSO, DNA          fail    fail     pass    fail
Count the 1's in a Stream of Bytes    pass    pass     pass    pass
Count the 1's in Specific Bytes       fail    fail     pass    pass
Parking Lot Test                      pass    pass     pass    pass
Minimum Distance Test                 pass    pass     pass    pass
The 3DSpheres Test                    pass    pass     pass    pass
The Squeeze Test                      fail    fail     pass    pass
Overlapping Sums Test                 fail    fail     pass    pass
Runs Test                             pass    pass     pass    pass
The Craps Test                        pass    pass     pass    pass

4.6.1 Fitness Function
The randomness quality of the patterns generated from a PRPG is measured in the framework of DIEHARD. In the best case, a PRPG passes all the 15 tests of DIEHARD. If the number of tests passed by a PRPG is x, then we define

$\text{randomness factor} = \frac{x}{15}$

for the PRPG. The scheme has no bias to DIEHARD, and it can easily adopt any other test parameter to evaluate the efficiency of a PRPG. Since we are considering only the randomness quality of a PRPG to evaluate its performance, the randomness factor can be considered as the fitness function for GA evolution.
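The pass criterion and the randomness factor described above can be sketched in a few lines of Python. This is only an illustration: the function names and the dictionary layout of the per-seed outcomes are assumptions made here, and how a single DIEHARD run is judged a success is taken as given.

```python
def declared_pass(outcomes, threshold=0.75):
    """Return True when at least `threshold` of the seeded runs succeeded."""
    if not outcomes:
        raise ValueError("at least one run is required")
    return sum(outcomes) / len(outcomes) >= threshold


def randomness_factor(per_test_outcomes, total_tests=15):
    """Compute x / 15, where x is the number of tests declared as pass.

    `per_test_outcomes` maps a test name to a list of booleans, one per seed
    (hypothetical layout, not taken from the thesis).
    """
    x = sum(declared_pass(runs) for runs in per_test_outcomes.values())
    return x / total_tests
```

With four seeds per test, a test that succeeds three times is declared as pass, while a test that succeeds twice is declared as fail.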
4.6.2 Crossover
The crossover operation implemented in this work is similar to conventional single point crossover, with the crossover point set randomly. Figure 4.3 describes the crossover technique. CA3 and CA4 are the offspring of the parents CA1 and CA2. Since the proposed PRPG is developed around the group CA, the offspring should also be group CA. However, the offspring resulting directly from this crossover may not satisfy the properties of a group CA. Therefore, to fix the population for the next generation, we need to modify the offspring of the crossover operation to make them group CA. The following example illustrates the steps.
Example 4.2 Let us consider that two CA, CA1 = < 9, 165, 90, 80 > and CA2 = < 6, 240, 102, 5 >, are selected for crossover at 2 (crossover point between cell 2 and cell 3, Figure 4.4). Therefore, the offspring CA3 = < 9, 165, 102, 5 > and CA4 = < 6, 240, 90, 80 > are generated directly by the crossover operation. The next task of the crossover process is to check whether CA3 and CA4 can be members of the next generation, that is, whether CA3 and CA4 are group CA or not. It can be observed that the class of cells 2, 3 and 4 of CA1 is III, whereas for CA2 these are III, I and III respectively. CA3 takes the rules for cell 1 and cell 2 from CA1 and the rest from CA2. Hence, the class of cell 2 of CA3 is III (from CA1). As the rules of cell 3 and cell 4 are selected from CA2, these two cell rules are of class I and III respectively. However, cell 2 of CA3, with rule 165 and class III, does not point to 102 as a rule of class I for cell 3 (Table 4.2). The rule 165 for cell 2 is, therefore, to be replaced by a rule that follows the relation. There are 2 such rules, 15 and 240, for which the classes of $R_i$ and $R_{i+1}$ are III and I respectively (Table 4.2). Let us say we select rule 15. Therefore, CA3 is modified to < 9, 15, 102, 5 > and considered as a candidate for the next generation. CA4, however, does not require any such modification.
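The tail-swapping step of Example 4.2 can be sketched as follows. The function name is hypothetical, and the rule vectors are represented as plain Python lists. The subsequent repair step, which replaces a rule so that the offspring is again a group CA, depends on the class relations of Table 4.2 and is deliberately not reproduced here.

```python
def single_point_crossover(parent_a, parent_b, point):
    """Swap the rule tails of two equal-length rule vectors after `point` cells."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same number of cells")
    if not 0 < point < len(parent_a):
        raise ValueError("crossover point must fall between two cells")
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b


ca1 = [9, 165, 90, 80]
ca2 = [6, 240, 102, 5]
ca3, ca4 = single_point_crossover(ca1, ca2, 2)
print(ca3, ca4)  # [9, 165, 102, 5] [6, 240, 90, 80]
```

The two printed vectors are the unrepaired offspring CA3 and CA4 of the example.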
[Figure 4.5: Comparison of randomness property of maxlength CA and PRPG. Vertical axis: No. of Tests Passed (ticks 5, 10, 15); horizontal axis: n, with Max and PRPG shown for n = 24, n = 45 and n = 63.]
4.6.3 Mutation
We implement single point mutation on the present population to form the population for the next generation. A CA is randomly selected from the current population as the candidate, and then rule $R_i$ of the ith cell is replaced by a rule that holds Property 1 and/or Property 2 and simultaneously maintains the properties of a group CA.
4.6.4 Evolution of the design
This section reports the results of the evolution scheme implemented for designing the PRPG. We measure the randomness quality of the PRPGs in each generation, and it is found that a PRPG resulting from the scheme displays better randomness quality than the linear maximal length CA. It is observed that an n = 45 cell nonlinear CA based PRPG, evolved with the proposed scheme, passes all the 15 tests of Diehard, whereas a 1-D linear CA that passes all the Diehard tests should have at least 55 cells. A comparative study of the randomness quality of the proposed PRPG and the maximal length linear CA based PRPG is shown in Figure 4.5. From the study of GA evolutions, we have set the associated parameters to derive the next population (NP) out of the present population (PP). The population size for each generation is set to 50. We follow the elitist model, that is, the 10 best solutions of PP are forwarded to NP. Of the remaining 40 solutions of NP, 20 are generated by crossover operations, 10 by the mutation operation, and 10 are newly generated group CA. The experimentation is done for different values of n, and it is found that the proposed GA based synthesis scheme converges, yielding the desired PRPG, for all n.
Table 4.8: Results of GA evolution

CA size    Generation Number    # test passed (DIEHARD)
24         0                    4
24         5                    5
24         10                   6
24         20                   9
24         30                   11
24         40                   11
24         50                   11
45         0                    9
45         5                    10
45         10                   10
45         20                   11
45         30                   12
45         40                   15
63         0                    4
63         5                    10
63         10                   10
63         20                   14
63         30                   14
63         40                   14
63         50                   14

Table 4.8 shows the sample results for n = 24, 45 and 63. Column 2 of Table 4.8 indicates the generation number of the GA run. The 0th generation indicates the run with the initial population. The maximum number of randomness tests passed by the population (PRPG) of a generation is noted in Column 3.
4.7 Conclusion
This chapter introduces nonlinear CA as Pseudo-Random Pattern Generators (PRPGs). The proposed PRPG synthesis scheme is of O(n) time complexity. The randomness quality of the patterns generated by the synthesized PRPG is shown to be at least as good as that of the maximal length linear CA/LFSR based designs traditionally used for the PRPG. The randomness quality of the proposed PRPG is further improved by introducing the GA framework. It is found that a 45-cell CA resulting from the GA evolution can pass all the DIEHARD tests, which establishes the better performance of the design in comparison to the linear CA based design. The proposed PRPG can be employed for designing an on-chip test pattern generator for VLSI circuits. The details of such an application of the nonlinear CA based PRPG are reported in the chapter that follows.
Chapter 5
Design of On-Chip TPG around Nonlinear CA based PRPG
The Pseudo-Random Pattern Generator (PRPG), designed around the modular structure of nonlinear cellular automata (CA), has been reported in Chapter 4. The design has been guided by theoretical analysis of CA rules. PRPGs are traditionally implemented with LFSR or maximal length linear CA. It is established that the CA based PRPG is superior to the LFSR based PRPG in terms of randomness quality [130]. We have, therefore, compared the randomness quality of the proposed PRPG with that of the maximal length CA based PRPG in Chapter 4. The randomness quality is tested with the battery of Diehard tests [175]. It has been established in Chapter 4 that the quality of the proposed nonlinear CA based PRPG is better than that of the conventional maximal length CA based designs. It is found that a 45-cell PRPG designed around nonlinear CA can pass all the Diehard tests, which is fewer than the minimum number of cells (i.e. 55) required to pass the Diehard tests for state-of-the-art CA based PRPGs [121]. Further, the nonlinear CA based PRPG is highly scalable, that is, an n-cell PRPG can easily be extended to an (n + 1)-cell PRPG.
In the era of VLSI, PRPGs play a key role in the field of circuit testing. The PRPG acts as an on-chip Test Pattern Generator (TPG) to detect faults in a Circuit Under Test (CUT) [26, 51]. A report on state-of-the-art designs is noted in Section 5.1. This chapter reports the design of a TPG for a CUT employing the PRPG proposed in Chapter 4. The proposed design and its effectiveness are reported in Section 5.2. An important objective in designing an on-chip TPG is that the area overhead of the TPG be minimum. We utilize the huge search space of nonlinear CA to design such a TPG. The design of the cost optimal TPG is reported in Section 5.3. The hardware requirement for the TPG and a comparison with other available designs are reported in Section 5.4.
One of the important advantages of the nonlinear CA based PRPG is that the PRPG is scalable. This scalability property is effectively utilized to design a TPG for multiple cores. The design is reported in Section 5.5.
5.1 PRPG based On-Chip TPG
Pseudo-random pattern generators (PRPGs) are widely used in the field of VLSI circuit testing as efficient test pattern generators (TPGs) [26, 130]. The PRPG generates a large volume of patterns to test different CUTs (Circuits under Test) of a VLSI chip that may be accessed through a full or partial scan path. On-chip TPGs are traditionally implemented with a Linear Feedback Shift Register (LFSR) [26, 152, 229]. However, cellular automata (CA) are established as an efficient alternative to LFSR [51]. Different CA structures have been proposed to improve the efficiency of the CA based TPG. A hybrid CA based TPG was proposed in [128, 130]. The efficiency of the TPG was further improved in [51]. The 2-dimensional GF(2) CA have also been explored in [58] for test pattern generation. A weighted random TPG structure has been proposed in [123]. A CA based weighted random pattern generation scheme is developed in [207]. Recently, low cost hardware for generating pseudo-random test patterns with a Ring Generator has been reported in [197]. The quality of the CA based PRPG is further improved with the introduction of Hierarchical CA (HCA) [258]. It has been shown that the randomness quality as well as the fault efficiency of HCA is better than that of the other CA based designs [257], with marginal additional cost. The nonlinear CA based PRPG is proposed in Chapter 4 of this work. It is established that the randomness quality of the nonlinear CA based PRPG is the best among the existing designs. Such a PRPG can also be utilized as the on-chip TPG for VLSI designs. The effectiveness of such a TPG is evaluated in the next section.
5.2 Nonlinear CA based on-chip TPG
The nonlinear CA based PRPG, reported in Chapter 4, is employed to design the on-chip TPG for VLSI circuits. In the proposed design, the PRPG is synthesized using Algorithm 4.1. The larger cycle of the PRPG is exploited to generate the test patterns for a circuit under test (CUT). A large sequence of patterns, generated from the PRPG, can be used for testing the CUT. However, without loss of generality, $2^{15}$ test patterns are assumed to be sufficient for a CUT with an arbitrary number of PIs. We next report the fault efficiency of the TPG designed around the nonlinear CA.
5.2.1 Fault Efficiency
The fault simulation, considering the single stuck-at fault model, is done for a large number of ISCAS benchmark circuits in the framework of the Cadence Verifault fault simulator. The sample results, showing the consistency in performance of the proposed design in terms of fault coverage, are noted in Figures 5.1 and 5.2. The fault coverage is expressed as

$\text{fault coverage} = \frac{\text{Total no. of detected faults}}{\text{Total no. of faults in the CUT}}$,

while the FFs of sequential circuits are assumed to be initialized to 0.

Figure 5.1: For c7552: improvement in fault coverage with the number of test patterns. Dotted line is for maxlength linear TPG, continuous line is for nonlinear TPG.

Figure 5.2: For s6669: improvement in fault coverage with the number of test patterns. Dotted line is for maxlength linear TPG, continuous line is for nonlinear TPG.

Figure 5.1 depicts the improvement in fault coverage with the increase in the number of test patterns generated by the maximal length linear TPG (dotted line) and the nonlinear CA based TPG (continuous line) for the c7552 benchmark circuit. Similar results for the s6669 benchmark circuit are shown in Figure 5.2. The graphs shown in the figures establish that the performance of the proposed nonlinear CA based TPG is effectively as good as that of the maximal length linear CA based design. Table 5.1 further compares the fault coverage figures shown by the maximal length linear CA (Column 4) and the proposed TPG (Column 5). A benchmark circuit is tested with the same number of test vectors for both the designs. Column 3 indicates the number of test vectors applied for a test. In each design scheme, for a circuit, we generate different sets of test vectors with 5 different random seeds. A CUT is then tested with the 5 sets of test vectors. The fault coverage shown in a column is the best among these 5 results. Table 5.1 reports the test results of 26 combinational and sequential circuits. It can be observed that the fault coverage of the proposed TPG is (i) the same or better for 18 cases (marked with *), and (ii) worse for 8 cases, compared with the results obtained with the maxlength linear CA. It confirms that the proposed on-chip TPG is effectively as powerful as the conventional maximal length CA based design. Further, in some cases the proposed TPG achieves consistently better fault efficiency. The effectiveness of the design is further evaluated through estimation of the hardware requirement for the TPG.
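The 18 versus 8 split quoted above can be recomputed from the fault coverage figures of Table 5.1. The sketch below is only a cross-check: it assumes that, in each pair of coverage values, the first belongs to the maximal length CA and the second to the proposed TPG, and the variable names are chosen here for illustration.

```python
# (circuit, max-length CA coverage %, proposed TPG coverage %), read from Table 5.1
TABLE_5_1 = [
    ("s1196", 94.85, 95.25), ("s1238", 89.67, 90.04), ("s967", 98.22, 98.22),
    ("s1423", 56.50, 51.16), ("s1269", 99.18, 99.56), ("s3271", 98.99, 98.99),
    ("c6288", 99.51, 99.46), ("c1908", 99.41, 99.41), ("s5378", 67.63, 68.06),
    ("s641", 85.63, 85.65), ("s713", 81.41, 81.24), ("s35932", 61.91, 79.95),
    ("c432", 98.67, 99.24), ("c432m", 83.57, 84.68), ("c499", 98.95, 98.81),
    ("c499m", 97.78, 97.56), ("c1355", 98.98, 99.24), ("c1355m", 92.23, 92.23),
    ("s3384", 91.78, 91.95), ("s4863", 91.83, 94.29), ("c3540", 95.85, 95.85),
    ("c880", 99.47, 99.05), ("s991", 95.06, 94.95), ("s6669", 99.97, 99.99),
    ("c7552", 94.25, 94.44), ("c2670", 84.60, 84.57),
]

# Circuits where the proposed TPG matches or exceeds the maximal length CA.
same_or_better = [name for name, max_len, tpg in TABLE_5_1 if tpg >= max_len]
worse = [name for name, max_len, tpg in TABLE_5_1 if tpg < max_len]
print(len(same_or_better), len(worse))  # 18 8
```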
5.2.2 Hardware overhead
An n-cell maximal length CA based TPG requires n FFs and combinational logic to realize the rules of the CA. The combinational logic for each CA cell is implemented with an XOR gate (at most 3 inputs). For example, implementing rule 150 for a CA cell requires a 3-input XOR gate, whereas rule 90 requires a 2-input XOR. In the proposed design of an n-cell TPG around nonlinear CA, the hardware requirement is (i) n FFs, as required for the maximal length CA based TPG, and (ii) combinational logic implemented with AND/OR/XOR/NOT/... gates to realize the nonlinear rules for the CA cells. For example, rule 166 requires two 2-input AND, one 2-input OR, one 2-input XOR and one NOT gate, whereas implementation of rule 15 needs only one NOT gate.

Table 5.1: Comparison of Test Results

Circuit Name    # PI    # Test Vector    Fault Coverage (%)
                                         Max Len    TPG
s1196           14      12000            94.85      95.25*
s1238           14      10000            89.67      90.04*
s967            16      9000             98.22      98.22*
s1423           17      15000            56.50      51.16
s1269           18      1200             99.18      99.56*
s3271           26      10000            98.99      98.99*
c6288           32      60               99.51      99.46
c1908           33      4000             99.41      99.41*
s5378           35      8000             67.63      68.06*
s641            35      2000             85.63      85.65*
s713            35      2000             81.41      81.24
s35932          35      14000            61.91      79.95*
c432            36      400              98.67      99.24*
c432m           36      4000             83.57      84.68*
c499            41      600              98.95      98.81
c499m           41      2000             97.78      97.56
c1355           41      1500             98.98      99.24*
c1355m          41      12000            92.23      92.23*
s3384           43      8000             91.78      91.95*
s4863           49      8000             91.83      94.29*
c3540           50      3500             95.85      95.85*
c880            60      2500             99.47      99.05
s991            65      6000             95.06      94.95
s6669           83      4500             99.97      99.99*
c7552           207     12000            94.25      94.44*
c2670           233     2000             84.60      84.57

A sample comparison of the hardware requirements for our proposed TPG and the conventional maximal length CA based TPG is reported next. For a CUT, the gate area of the 48-cell proposed TPG is found to be 643. The maximal length CA based TPG of the CUT requires a gate area of 608. That is, the proposed TPG adds $\frac{643-608}{643} = 5.44\%$ overhead. Further, for a 256-cell TPG of a CUT, the gate area is 1203 with nonlinear CA, whereas it is 1146 for the maximal length CA based design. Therefore, the added overhead for this example is 4.97%. That is, the area overhead for implementing the nonlinear CA based TPG is marginally higher than that of a maximal length CA based design. The area per gate, estimated with GENLIB, is reported in Table 5.2.

Table 5.2: Area overhead of basic gates (GENLIB)

Gate    # input    area
NOT     1          1
AND     2          2
OR      2          2
XOR     2          2.5
XNOR    2          3

Since we are utilizing the huge search space of nonlinear CA, a number of PRPGs can be synthesized for a TPG (Algorithm 4.1). However, the PRPGs may not all have the same area overhead. The following section targets synthesis of a PRPG for a TPG that maintains a good pseudo-randomness quality and also incurs optimal area overhead.
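To make the per-cell combinational logic concrete, the sketch below evaluates a rule vector as a 3-neighbourhood lookup in the 8-bit rule number, which is the numbering under which rule 90 is a 2-input XOR, rule 150 a 3-input XOR and rule 15 a single NOT. The function names are hypothetical. Reading the missing neighbours of the first and last cells as 0 (null boundary) is an assumption of this sketch, and the brute-force group check is only practical for very small n.

```python
from itertools import product


def next_bit(rule, left, self_, right):
    """Look up the next state of a cell in the 8-bit rule number."""
    return (rule >> (4 * left + 2 * self_ + right)) & 1


def step(state, rules):
    """Apply one clock to an n-cell CA; missing neighbours are read as 0 (assumed)."""
    padded = [0] + list(state) + [0]
    return [next_bit(r, padded[i], padded[i + 1], padded[i + 2])
            for i, r in enumerate(rules)]


def is_group(rules):
    """Brute-force check that `step` is a bijection on all 2^n states."""
    n = len(rules)
    images = {tuple(step(s, rules)) for s in product((0, 1), repeat=n)}
    return len(images) == 2 ** n
```

Under these assumptions, `is_group([9, 165, 90, 80])` and `is_group([6, 240, 102, 80])` both hold, in line with the rule vectors being used as group CA in the text.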
5.3 Cost optimal design of PRPG for TPG
The nonlinear CA based PRPG has been proposed (Chapter 4) to overcome the limitations of linear CA based conventional designs. The PRPG proposed in Chapter 4 also exhibits excellent randomness quality in the generated patterns and has been employed for the design of the on-chip TPG in the earlier section. Further, as linear CA limit the search space for PRPG design, a cost optimal design for an arbitrary value of n may not be achievable with linear CA. In this section, we target the desired cost optimal solution with nonlinear CA. The cost of a TPG normally refers to the area overhead, power consumption, delay, etc., of the TPG. The area overhead is the major concern of test engineers while designing the on-chip TPG for a circuit. We, therefore, consider area overhead for the cost optimal design of the PRPG based TPG. The proposed cost optimal PRPG for the TPG, employing the theory of nonlinear CA, is synthesized with Algorithm 4.1. A Genetic Algorithm (GA) framework is proposed to evolve the desired PRPG structure; it reduces the search space for CA and ensures the cost optimal solution, while simultaneously improving the randomness quality.
5.3.1 The design
An n-cell PRPG requires n FFs and the combinational logic to generate the next state of the CA employed for the design of the PRPG. Since n FFs are mandatory to implement any n-cell CA, we concentrate only on the cost of combinational logic in the optimal design process. We consider the cost of combinational logic as the cost of a CA rule. The proposed cost optimal PRPG design methodology, therefore, targets synthesis of a CA with the rules that involve minimal combinational logic, that is, rules of minimal weight. The weight of a rule is determined by finding the minimum number of gates (AND, OR, NOT, ...) required to implement the next state logic of the rule. The cost (i.e., area overhead) of a rule is then estimated with GENLIB (Table 5.2). For example, the CA rule 166 requires two 2-input AND, one 2-input OR, one 2-input XOR and one NOT gate. Its area overhead is 9.5 (Table 5.2). For the implementation of rule 15, on the other hand, we need only one NOT gate, that is, the area overhead of rule 15 is only 1. The costs (area overhead) of the group rules are noted in Table 5.3, whereas the costs of the first and last cell rules are noted in Table 5.4.

Table 5.3: Cost of group rules (cost within bracket)

15(1.0)    23(9.5)    27(8.0)    30(10.5)   39(8.0)    43(9.5)    45(9.5)    51(1.0)
53(8.0)    54(10.5)   57(8.5)    58(8.0)    60(2.5)    75(9.5)    77(10.5)   78(8.0)
83(8.0)    85(1.0)    86(10.5)   89(10.5)   90(2.5)    92(8.0)    99(5.5)    101(9.5)
102(2.5)   105(6.0)   106(9.5)   108(9.5)   113(9.5)   114(8.0)   120(9.5)   135(10.0)
141(7.0)   142(9.5)   147(10.0)  149(10.0)  150(5.0)   153(3.0)   154(10.0)  156(10.0)
163(7.0)   165(3.0)   166(9.5)   169(10.0)  170(0.0)   172(7.0)   177(7.0)   178(9.5)
180(9.5)   195(3.0)   197(7.0)   198(7.5)   201(10.0)  202(7.0)   204(0.0)   210(9.5)
212(9.5)   216(7.0)   225(10.0)  228(7.0)   232(8.5)   240(0.0)

Table 5.4: Cost of first and last cell rules

First cell rule    3(1.0)    5(1.0)     6(2.5)     9(3.0)     10(0.0)    12(0.0)
Last cell rule     5(1.0)    17(1.0)    20(2.5)    65(3.0)    68(0.0)    80(0.0)

The above discussion dictates that, to synthesize a nonlinear CA based optimal cost PRPG, the CA rules selected for the design should satisfy the following criterion [87]: the rules used to construct the PRPG should be such that the weight of each rule is optimal. The following example illustrates the proposed cost optimal design of PRPG.
Example 5.1 The 4-bit PRPG1 = < 9, 165, 90, 80 >, synthesized using Algorithm 4.1, requires 3 gates (one 2-input XOR and two 2-input XNORs) to implement the combinational logic for the rules. Its cost is 8.5 (Tables 5.3 and 5.4) and it is not a cost optimal solution. However, for the 4-cell PRPG2 = < 6, 240, 102, 80 >, only two 2-input XOR gates are required to implement the combinational logic, that is, the cost is only 5. Therefore, PRPG2 is a better option than PRPG1 in terms of hardware implementation cost for a 4-bit PRPG.
A PRPG constructed with minimal weight rules of Table 5.3 and Table 5.4, however, may not show better randomness quality. Therefore, the synthesis of a PRPG with optimal weight rules, simultaneously satisfying the desired randomness quality, opens a large search space. In the current design, we employ the GA (Genetic Algorithm) framework to guide the search for synthesizing a PRPG with the desired quality. We next report the evolution of such a PRPG.
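The cost bookkeeping of this subsection can be sketched as below. The gate areas are those of Table 5.2 and the per-rule costs are copied from Tables 5.3 and 5.4 for the rules of Example 5.1 only; the dictionary and function names are assumptions introduced for illustration.

```python
# Gate areas from Table 5.2 (GENLIB).
GATE_AREA = {"NOT": 1.0, "AND": 2.0, "OR": 2.0, "XOR": 2.5, "XNOR": 3.0}


def logic_area(gate_counts):
    """Area of a rule's next-state logic, given a {gate name: count} mapping."""
    return sum(GATE_AREA[gate] * count for gate, count in gate_counts.items())


# Per-rule costs copied from Tables 5.3 and 5.4 (only the rules used below).
FIRST_CELL_COST = {6: 2.5, 9: 3.0}
GROUP_RULE_COST = {90: 2.5, 102: 2.5, 165: 3.0, 240: 0.0}
LAST_CELL_COST = {80: 0.0}


def prpg_cost(rules):
    """Sum the rule costs of a rule vector: first cell, inner cells, last cell."""
    first, *middle, last = rules
    return (FIRST_CELL_COST[first]
            + sum(GROUP_RULE_COST[r] for r in middle)
            + LAST_CELL_COST[last])


print(logic_area({"AND": 2, "OR": 1, "XOR": 1, "NOT": 1}))  # 9.5 (rule 166)
print(prpg_cost([9, 165, 90, 80]), prpg_cost([6, 240, 102, 80]))  # 8.5 5.0
```

The two totals agree with the costs quoted for PRPG1 and PRPG2 in Example 5.1.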
5.3.2 PRPG evolution for cost optimal design
The evolution of PRPG using the GA framework has been reported in Section 4.6 of Chapter 4. However, the only focus of that evolution is the randomness quality. For the current design, the PRPG is evolved to reach a cost optimal design while maintaining a good randomness quality. Therefore, the GA framework reported in Section 4.6 is utilized, with a minor modification, to attain the cost optimal design of PRPG. The details of the framework with the modification are reported here for ease of reference.
Fitness function
Fitness of a CA (PRPG) depends on two factors: the randomness quality of the patterns generated from the CA (randomness factor) and the area overhead of the CA (area factor).
Randomness Property: The randomness properties of patterns generated by the PRPGs, for different values of n, are studied based on the metric proposed in Diehard [175]. Diehard offers 15 different statistical tests. In the best case, a PRPG passes all the 15 tests. If x is the number of tests passed by a PRPG, then its

$\text{randomness factor} = \frac{x}{15}$

Area overhead: An n-cell CA requires n FFs and combinational logic to generate the next state for each FF. The combinational logic for a CA cell is implemented with AND/OR/XOR/NOT/... gates to realize its rule. Since an FF is mandatory for the implementation of a CA cell, only the overhead of combinational logic can vary for different rules. The average gate area (overhead) of combinational logic for the 62 group rules is 6.016129; let $C = 6.016129 \times n$ be the average cost of the combinational logic for an n-cell CA. Now, if the actual gate area of an n-cell CA is G, then

$\text{area factor} = 1 - \frac{G}{C}$

defines the fitness of the CA/PRPG in terms of area overhead. Therefore, the fitness of a CA (PRPG), considering its randomness quality as well as the area overhead, is determined as:

$\text{fitness} = \frac{\text{randomness factor} + \text{area factor}}{2} \times 100\%$    (5.1)
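A minimal sketch of this fitness computation follows, assuming Equation 5.1 reads as the mean of the randomness factor and the area factor expressed as a percentage. The constant 6.016129 is the average rule area quoted in the text; the function and parameter names are hypothetical.

```python
AVG_RULE_AREA = 6.016129  # average gate area of the 62 group rules, as quoted in the text


def area_factor(gate_area, n_cells):
    """1 - G/C, with C the average combinational cost of an n-cell CA."""
    average_cost = AVG_RULE_AREA * n_cells
    return 1.0 - gate_area / average_cost


def fitness(tests_passed, gate_area, n_cells, total_tests=15):
    """Mean of randomness factor and area factor, returned as a percentage."""
    randomness = tests_passed / total_tests
    return (randomness + area_factor(gate_area, n_cells)) / 2 * 100
```

Under this reading, `fitness(4, 88.5, 24)` and `fitness(15, 152.5, 45)` come out within rounding of the values 32.68 and 71.83 listed for those rows in Table 5.5.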
Crossover
The crossover technique implemented in Section 4.6 is adopted for the evolution of the PRPG targeting cost optimization. Figure 4.3 and Figure 4.4 describe the crossover process, whereas Example 4.2 illustrates the technique with two 4-bit PRPGs.
Mutation
We implement single point mutation on the present population to form the population for the next generation. A CA is randomly selected from the current population as the candidate for mutation. A rule $R_i$ of that CA is then arbitrarily selected and replaced by a rule that incurs less hardware implementation cost and simultaneously ensures that the CA remains a group CA. The following example illustrates the mutation process.
[Figure 5.3: Mutation of CA < 9, 165, 90, 80 >, resulting in the CA < 9, 90, 90, 80 >. Candidate CA: rules 9, 165, 90, 80 with costs 3, 3, 2.5, 0 (total 8.5). Mutated CA: rules 9, 90, 90, 80 with costs 3, 2.5, 2.5, 0 (total 8.0). Cells 2, 3 and 4 are of class III in both CA; the mutation is applied at cell 2.]

Example 5.2 Let us consider that the PRPG < 9, 165, 90, 80 > is selected as the candidate chromosome for mutation. The classes of cells 2, 3 and 4 of the CA are III, III and III respectively (Tables 4.2, 4.3, 4.4) and its cost is 8.5 (Example 5.1). Assume that the second cell of the CA is selected for mutation. The classes of both the second and the third cell are III. There are 4 rules (90, 102, 153, 165) for which the classes of $R_i$ and $R_{i+1}$ are both III (Table 4.2). Therefore, if the rule 165 is replaced by any one of the 4 rules, the resultant CA will remain a group CA. We replace the rule 165 by rule 90. The rule 90 maintains Property 1 and Property 2, and incurs less cost (cost 2.5) compared to rule 165 (cost 3). Therefore, the mutated CA is < 9, 90, 90, 80 > and its cost becomes 8. The mutation technique is described in Figure 5.3.
Synthesis of optimal cost PRPG
From the study of GA evolutions, we have set the associated parameters to derive the next population (NP) out of the present population (PP). The population size for each generation is set to 50. We follow the elitist model, that is, the 10 best solutions of PP are forwarded to NP. Of the remaining 40 solutions of NP, 20 are generated by crossover operations, 10 by the mutation operation, and 10 are newly generated PRPGs. In this work, we adopt two approaches to evolve the CA for the desired PRPG structure. In Approach I, the evolution process uses the fitness function considering both the randomness factor and the area factor (Equation 5.1), whereas in Approach II only the area factor is considered for the fitness function during evolution of the CA for the PRPG. The steps for evolving the CA based PRPG (Approach I and Approach II) are summarized in the following algorithm.
Algorithm 5.1 Minimal Cost PRPG
Input: PRPG size (n), Approach I or II
Output: CA rule vector (PRPG)
Step 1: Generate 50 PRPGs as the initial population. Consider the initial population as PP.
Step 2: Repeat Step 3 to Step 8 for 50 times.
Step 3: Calculate fitness (Approach I) or area factor (Approach II) of each member of PP.
Step 4: Select the best 10 CA from PP for NP.
Step 5: Generate 20 CA for NP using crossover on PP.
Step 6: Generate 10 CA for NP using mutation on PP.
Step 7: Generate 10 new PRPGs for NP.
Step 8: PP ← NP
Step 9: Calculate fitness of the members of the final population (Approach II). Report the CA rule vector with maximum fitness.
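The control flow of Algorithm 5.1 can be sketched as a generic elitist loop. All operators are passed in as callables and are hypothetical placeholders: `new_prpg` stands for a synthesis routine such as Algorithm 4.1, `crossover` and `mutate` for the group-preserving operators described above, and `fitness` for either Equation 5.1 (Approach I) or the area factor alone (Approach II). How parents are picked for crossover and mutation is not specified in the text, so uniform random selection is assumed here.

```python
import random


def next_population(pp, fitness, crossover, mutate, new_prpg, rng):
    """Steps 3 to 7: build the next population NP (50 members) from PP."""
    ranked = sorted(pp, key=fitness, reverse=True)            # Step 3
    np_ = ranked[:10]                                         # Step 4: 10 elites
    while len(np_) < 30:                                      # Step 5: 20 by crossover
        parent_a, parent_b = rng.sample(pp, 2)
        np_.extend(crossover(parent_a, parent_b, rng))
    np_ = np_[:30]
    np_ += [mutate(rng.choice(pp), rng) for _ in range(10)]   # Step 6: 10 by mutation
    np_ += [new_prpg(rng) for _ in range(10)]                 # Step 7: 10 new PRPGs
    return np_


def evolve(new_prpg, fitness, crossover, mutate, generations=50, seed=0):
    """Steps 1, 2, 8 and 9: run the loop and report the fittest rule vector."""
    rng = random.Random(seed)
    pp = [new_prpg(rng) for _ in range(50)]                   # Step 1
    for _ in range(generations):                              # Step 2
        pp = next_population(pp, fitness, crossover, mutate, new_prpg, rng)  # Step 8
    return max(pp, key=fitness)                               # Step 9
```

Because the 10 best members are copied unchanged, the best fitness in the population cannot decrease from one generation to the next, which is the property the elitist model is meant to guarantee.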
To obtain a solution from Algorithm 5.1 for a 45-bit PRPG, Approach I takes approximately 10 hours (on an Intel P-4 3.0 GHz machine), whereas the execution time to find the desired solution in Approach II is only 29 minutes. However, Approach I results in a solution with better randomness quality. On the other hand, the PRPG of Approach II incurs less cost than the PRPG resulting from Approach I. The detailed experimental results validating the effectiveness of the proposed PRPG synthesis scheme are reported in the following subsections. Subsection 5.3.3 comments on the feasibility of the design, whereas Subsection 5.3.4 reports the randomness quality of the PRPG synthesized from Algorithm 5.1.
5.3.3 Feasibility of the design
To evaluate the effectiveness of the PRPG synthesis scheme reported in Algorithm 5.1, we run the algorithm for different values of n (PRPG size) and then observe its convergence, resulting in the desired PRPG. Table 5.5 shows sample results for n = 24, 45 and 63. Column 2 of Table 5.5 indicates the generation number of the GA evolution. The 0th generation indicates the run with the initial population. The next three columns report the results of the design in Approach I (Section 5.3.2), whereas the last three columns of Table 5.5 display the performance in Approach II. The number of randomness tests passed, the gate area (cost) of the PRPG, and the fitness of the best solution are noted in columns 3, 4 and 5 respectively. Since the PRPGs designed from Approach II are evolved based on the area factor, and the number of tests passed by a PRPG is not checked in each generation, the fitness is computed only after the 50th generation. Columns 6, 7 and 8 report the results for Approach II.

Table 5.5: Results of GA evolution for minimal cost PRPG synthesis

                  Approach I                        Approach II
n    Generation   # test passed   Cost    fitness   # test passed   Cost    fitness
24   0            4               88.5    32.68     -               89.0    -
24   5            5               89.0    35.84     -               88.5    -
24   10           6               81.0    44.00     -               78.5    -
24   20           9               88.5    49.35     -               78.5    -
24   30           11              88.5    56.01     -               75.0    -
24   40           11              78.5    59.48     -               75.0    -
24   50           11              78.5    59.48     9               75.0    57.36
45   0            9               172.5   48.14     -               172.5   -
45   5            10              169.5   52.03     -               156.5   -
45   10           10              168.0   52.31     -               150.0   -
45   20           11              156.5   57.76     -               144.5   -
45   30           12              155.0   61.37     -               138.0   -
45   40           15              152.5   71.83     -               135.5   -
45   50           15              152.0   72.0      11              135.5   61.73
63   0            4               244.5   31.58     -               240.5   -
63   5            10              250.5   50.80     -               235.5   -
63   10           10              244.5   51.58     -               225.0   -
63   20           14              244.5   64.91     -               220.0   -
63   30           14              240.0   65.50     -               212.5   -
63   40           14              235.5   66.08     -               208.5   -
63   50           14              225.0   68.44     9               208.5   52.47

It is to be noted from Table 5.5 that the GA converges towards the final solution in both the approaches. The average cost of combinational logic per CA cell of the PRPG in different generations is shown in Figure 5.4. We have synthesized PRPGs of different sizes by running Algorithm 5.1 for a fixed number of generations. The cost/cell of Figure 5.4 is the average per cell cost of all such PRPGs. Figure 5.4 and Table 5.5 point to the fact that if the number of generations (Step 2 of Algorithm 5.1) is increased, then the cost of the synthesized PRPG may be further reduced, simultaneously improving the randomness quality of the design (Approach I). The results on randomness quality of the cost optimal PRPG are reported next.
5.3.4 Randomness quality of the cost optimal PRPG
It has been established in Chapter 4 that the randomness quality of the nonlinear CA based PRPG, synthesized using Algorithm 4.1, is excellent and at least as good as that of the maximal length CA. The PRPGs have further been evolved to obtain better randomness quality, and it has been shown in Chapter 4 that the evolved PRPG is the best among all other conventional PRPGs in terms of randomness quality. This subsection reports the randomness quality of the optimal cost PRPG synthesized from Algorithm 5.1.

[Figure 5.4: Cost per cell vs. No. of generations (Approach I). Horizontal axis: # Generation (0 to 50); vertical axis: Cost/cell (3 to 4).]

Since the maximal length linear CA based PRPG is the best among all other conventional PRPGs implemented with LFSR, ring generator, etc. in terms of randomness quality [130], we compare the randomness quality of the proposed PRPG with that of the maximal length CA based PRPG. However, a special category of maximal length CA, referred to as the minimal cost maximal length CA, that incurs less hardware cost compared to the conventional maximal length CA has been reported in [43]. We, therefore, also compare the randomness quality of the proposed PRPG with that of the minimal cost maximal length CA. The experimentation is done for different n (size of the PRPG), varying from 20 to 300. Table 5.6 contains the results for n = 24 and 63. The columns under the headings Max, MinMax and PRPG represent the conventional maximal length CA, the minimal cost maximal length CA, and the proposed PRPG with Approach I respectively. For each n, every test has been run a number of times with different seeds. We declare a test as pass if it succeeds in at least 75% of the cases; otherwise the test fails. A comparative study of the proposed cost optimal PRPG with the maximal length CA and minimal cost maximal length CA based designs is further shown in Figure 5.5. PRPG (1) and PRPG (2) of Figure 5.5 represent the proposed PRPG with Approach I and the proposed PRPG with Approach II respectively. We avoid comparison with the Ring generator as the Ring generators do not pass the tests of Diehard for n = 24 to 128.
The results of extensive experimentation establish that the randomness quality of the patterns generated by the proposed nonlinear CA based optimal cost PRPG (Approach I) is better than that of maximal length linear CA/LFSR based designs. The PRPG can efficiently be employed as an on-chip TPG. The following subsection reports the fault efficiency of the TPG.

Table 5.6: Randomness test for minimal cost PRPG (Approach I)

Name of Test                n = 24                    n = 63
                            Max    MinMax   PRPG      Max    MinMax   PRPG
Overlapping sum             pass   fail     pass      pass   fail     pass
Runs                        pass   pass     pass      pass   fail     pass
3D Spheres                  pass   pass     pass      pass   pass     pass
Parking Lot                 pass   fail     pass      pass   pass     pass
Birthday Spacings           pass   fail     pass      pass   pass     pass
Count-the-1's               pass   fail     pass      pass   pass     pass
Binary Rank 6x8             pass   pass     pass      pass   fail     pass
Binary Rank 31 & 32         pass   fail     pass      pass   pass     pass
Count-the-1's (Sp. Byte)    fail   fail     fail      pass   fail     pass
Bit stream                  fail   fail     fail      fail   pass     fail
The Craps                   pass   pass     pass      pass   fail     pass
Minimum Distance            fail   fail     pass      pass   pass     pass
Overlapping Permu           pass   fail     fail      pass   fail     pass
DNA                         fail   fail     fail      pass   fail     pass
The Squeeze                 pass   fail     pass      pass   fail     pass

[Figure 5.5: Comparison of randomness property of Max Length CA and PRPG. Vertical axis: No. of Tests Passed (ticks 5, 10, 15); horizontal axis: n, with Max, MinMax, PRPG (1) and PRPG (2) shown for n = 24, n = 45 and n = 63.]
5.3.5
Fault efficiency
The fault efficiency of the nonlinear CA based on-chip TPG has been reported in Section 5.2.1. Table 5.1 has shown the experimental results on benchmark circuits. The fault coverage achieved by the nonlinear CA based on-chip TPG is considered as the target for the current design. Table 5.7 reports the experimental results on the cost optimal PRPG based TPG. We have applied the same number of test vectors to evaluate the fault efficiency of the TPGs. Column 4 of Table 5.7 shows the target fault coverage that is achieved by the nonlinear CA based on-chip TPG (column 5 of Table 5.1). Columns 5 and 6 report the fault coverage attained by the PRPGs designed with Approach I and Approach II respectively. It is clear from the table that the fault coverage attained by all the PRPGs designed with Approach I and by most of the PRPGs designed with Approach II is either the same as or better than the target (marked by *).
Table 5.7: Fault efficiency of cost optimal PRPGs
Circuit    # PI   # Test     Target    Fault Coverage (%)
Name              Vector               Approach I   Approach II
s1196      14     12000      95.25     95.25*       95.25*
s1238      14     10000      90.04     90.04*       89.67
s967       16     9000       98.22     98.22*       98.22*
s1423      17     15000      51.16     56.50*       51.16*
s1269      18     1200       99.56     99.56*       99.56*
s3271      26     10000      98.99     98.99*       98.99*
c6288      32     60         99.46     99.51*       99.46*
c1908      33     4000       99.41     99.41*       98.36
s5378      35     8000       68.06     68.06*       67.63
s641       35     2000       85.65     85.65*       85.65*
s713       35     2000       81.24     81.41*       81.24*
s35932     35     14000      79.95     79.95*       78.06
c432       36     400        99.24     99.24*       99.24*
c432m      36     4000       84.68     84.68*       84.68*
c499       41     600        98.81     98.95*       98.81*
c499m      41     2000       97.56     97.78*       96.58
c1355      41     1500       99.24     99.24*       99.24*
c1355m     41     12000      92.23     92.23*       91.86
s3384      43     8000       91.95     91.95*       91.95*
s4863      49     8000       94.29     94.29*       94.23
c3540      50     3500       95.85     95.85*       95.85*
c880       60     2500       99.05     99.47*       99.05*
s991       65     6000       94.95     95.06*       94.95*
s6669      83     4500       99.99     99.99*       99.99*
c7552      207    12000      94.44     94.44*       94.44*
c2670      233    2000       84.57     84.60*       84.47
Table 5.8: Comparison of area overhead
# bit   Parameters         Ring        LFSR        MinMaxCA    PRPG
24      Total Elements     96          64          85          88
        Net Utilization    1.81268%    1.61128%    2.13998%    2.21551%
        Gate Count         249         201         327         333
        Delay (in ns)      6.172       6.172       6.224       6.253
32      Total Elements     96          84          114         101
        Net Utilization    2.41691%    2.1148%     2.87009%    2.54279%
        Gate Count         337         265         445         363
        Delay (in ns)      6.172       6.172       6.224       6.172
45      Total Elements                 118         160         161
        Net Utilization                2.97079%    4.02819%    4.05337%
        Gate Count                     369         621         625
        Delay (in ns)      6.172       6.224       6.253       6.253
48      Total Elements     146         126         171         169
        Net Utilization    3.67573%    3.17225%    4.30513%    4.25478%
        Gate Count         513         393         636         651
        Delay (in ns)      6.172       6.172       6.224       6.224
However, in the design of the cost optimal solution of the PRPG, we do not consider the actual cost, that is, the actual area, delay, etc., of a CA. The next section reports the hardware implementation of the PRPG and estimates the cost of such an implementation.
5.4
Hardware implementation
This section reports the area and delay overhead of the proposed PRPG when implemented in hardware. It is then compared with traditional PRPGs developed around the maximal length CA, LFSR, and Ring Generators. The overhead of a design is estimated by running the Xilinx synthesis tool [3]. Xilinx Spartan3 is used as the target device family, and all the designs are implemented on the FPGA device XC3S50-4PQ208. Table 5.8 reports the summary of the experimentation. The first column of Table 5.8 notes the size of the pattern generator. The parameters on which the comparison among the Ring Generator [197], LFSR [26], minimal cost maximal length CA [43], and the proposed PRPG is based are mentioned in Column 2. The parameters total elements, net utilization, and gate count are considered for the comparison of area overhead. The percentage of net utilization quantifies the area overhead of a design; a smaller value indicates less area overhead. The delay parameter of a design signifies the maximum time required for an output to appear after a clock is generated. The columns under the headings Ring, LFSR, MinMaxCA, and PRPG in Table 5.8 display the simulation results for the Ring Generator, LFSR, minimal cost maximal length CA, and the proposed PRPG (Approach I) respectively. It is clear from Table 5.8 that the area overhead of the proposed PRPG is almost similar to that of the minimal cost maximal length CA. However, the randomness quality of the proposed nonlinear CA based PRPG is better than that of the minimal cost maximal length CA/maximal length CA. The area overheads of the Ring Generator and the LFSR are normally less than that of the PRPG. However, the randomness quality suffers for the Ring Generator and the LFSR.
Figure 5.6: Architecture of TPG for multiple cores (schematic: cells 1, ..., n1 − 1, n1, n1 + 1, ..., n2, each a flip-flop (FF) with IN/OUT and next state logic f1, ..., fn1, ..., fn2; the boundary inputs at both ends are 0; a switching FF is attached at cell n1)
5.5
TPG for multiple cores with scalable PRPG
An important drawback of the traditional PRPGs, implemented with LFSR or linear maximal length CA, is that it is difficult to design an (n + 1)-bit PRPG from an available design of an n-bit PRPG. The major advantage of the nonlinear CA based PRPG synthesis scheme (Algorithm 4.1 of Chapter 4) is that the structure of an n-bit PRPG can be utilized to design an (n + 1)-bit PRPG, in constant time, without sacrificing the randomness quality of the PRPG. Such a scalable PRPG structure can be employed to design the on-chip TPGs for a VLSI chip implementing multiple cores. It replaces the disparate TPG hardware of the different cores with a single one and results in a drastic reduction of the cost of the test logic. This section reports the design of such a TPG for a CUT (Circuit Under Test) having multiple cores. We next address the approach of designing an n2-bit PRPG from an n1-bit PRPG, where n2 > n1.
5.5.1
Design of n2-bit PRPG from n1-bit PRPG
In the proposed design, a PRPG for the TPG of a CUT having n1 PIs is so designed that its structure can further be utilized to design a TPG for a CUT with n2 PIs, where n2 > n1. The architecture of such a scalable design is illustrated next. Figure 5.6 gives the general architecture of the scalable TPG structure. It can act as an n1-cell as well as an n2-cell TPG (n2 > n1). At the n1-th cell of the design we need to add a switch, implemented with an AND gate and a flip-flop (FF), as shown in the figure. While considering the structure as the n1-cell TPG, the FF is reset to 0: since we are using null boundary CA, the right neighbor of the n1-th cell is to be 0. For implementing the n2-cell TPG, utilizing the same structure, the switching FF is set to 1 so that the next state logic of the n1-th cell can depend on the present state of the (n1 + 1)-th cell. To realize the n1-cell and n2-cell PRPGs for the TPG, we first select rules for the cells 1, 2, ..., (n1 − 1) following Algorithm 4.1 of Chapter 4. The n1-th rule is to be selected in such a way that it can act as the last cell of the n1-cell PRPG as well as the n1-th cell of the n2-cell PRPG. This rule selection process is illustrated with the following example.
Example 5.3 This example illustrates the design of a TPG structure, where n1 = 4 and n2 = 7. To design the 4-cell and 7-cell CA based PRPGs for the TPG structure, the rules for the 4-cell PRPG are selected first following Algorithm 4.1. Let the rule vector of the CA be < 9, 149, 105, 20 >. The rules 20 and 150 are equivalent when considered as the last CA cell rule. Moreover, rule 150 satisfies Property 1 and Property 2 of Section 4.3.1 for maintaining randomness quality. Therefore, we can select 150 in place of 20 as the 4th CA cell rule and can continue the selection of the remaining rules for the 7-cell PRPG following Algorithm 4.1. Say, the resulting rule vector for the 7-cell PRPG is < 9, 149, 105, 150, 106, 150, 65 >. It gives the rule vector of the desired TPG structure. To realize the 4-cell TPG, the right neighbor of the 4th cell is set to 0 by resetting the switching FF.
The TPG design for multiple cores can also be implemented for cores with n PIs (Primary Inputs), m PIs and (n + m) PIs respectively. The design follows in the next subsection.
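The switching idea of Example 5.3 can be checked with a small simulation. The sketch below is illustrative only: it assumes the usual rule numbering in which the next state of a cell is bit (4·left + 2·self + right) of its rule, and the names step, switch_cell and switch_ff are hypothetical, introduced here to model the AND gate and flip-flop of Figure 5.6.

```python
from itertools import product

def same_as_last_cell(rule_a, rule_b):
    """True if two rules agree whenever the right neighbour is 0 (RMTs 0, 2, 4, 6)."""
    return all(((rule_a >> rmt) & 1) == ((rule_b >> rmt) & 1) for rmt in (0, 2, 4, 6))

def step(rules, state, switch_cell=None, switch_ff=1):
    """One step of a 3-neighbourhood null-boundary CA.

    switch_cell / switch_ff are hypothetical names modelling the switch:
    the right-neighbour input of cell `switch_cell` (1-based) is ANDed with switch_ff.
    """
    n = len(state)
    nxt = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 0
        right = state[i + 1] if i < n - 1 else 0
        if switch_cell is not None and i == switch_cell - 1:
            right &= switch_ff
        rmt = 4 * left + 2 * state[i] + right
        nxt.append((rules[i] >> rmt) & 1)
    return nxt

RULES_4 = [9, 149, 105, 20]                  # 4-cell PRPG of Example 5.3
RULES_7 = [9, 149, 105, 150, 106, 150, 65]   # 7-cell PRPG of Example 5.3

def first_four_match():
    """With the FF reset to 0, cells 1-4 of the 7-cell structure follow the 4-cell PRPG."""
    for bits in product((0, 1), repeat=7):
        state = list(bits)
        gated = step(RULES_7, state, switch_cell=4, switch_ff=0)
        if gated[:4] != step(RULES_4, state[:4]):
            return False
    return True

print(same_as_last_cell(20, 150), first_four_match())
```

Only the behaviour of the first four cells is compared; cells 5 to 7 keep running in this model, which does not matter when the structure is used as the 4-cell TPG.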
5.5.2
Design of n-bit, m-bit and (n + m)-bit TPG for multiple cores
In the proposed design, the PRPG based TPG is so designed that it can act as an n-bit TPG, an m-bit TPG as well as an (n + m)-bit TPG. The design is similar to the one reported in the earlier subsection. However, the TPG reported in Subsection 5.5.1 can be utilized for two cores with n1 PIs and n2 PIs (n2 > n1), whereas the proposed TPG can test three cores with n PIs, m PIs and (n + m) PIs. To implement such a TPG structure, an n-cell PRPG is first synthesized following Algorithm 4.1. The rightmost cell of the n-cell PRPG should act as the n-th cell of the (n + m)-cell PRPG, and the first cell of the m-cell PRPG is selected in such a way that it can also act as the (n + 1)-th cell of the (n + m)-cell PRPG. The other cells of the m-cell PRPG are then selected following Algorithm 4.1. The architecture of the TPG is shown in Figure 5.7. Two switches, implemented with flip-flops (FF1 and FF2) and AND gates, are employed to facilitate the design. Whenever FF1 and FF2 are both set to 0, the TPG can act as an n-cell and an m-cell TPG separately. However, when the TPG acts as the (n + m)-cell TPG, both the flip-flops FF1 and FF2 are set to 1. The following example illustrates the design.
Figure 5.7: Architecture of TPG implementing n-PI, m-PI and (n + m)-PI (schematic: cells 1, ..., n − 1, n, n + 1, ..., (n + m), each a flip-flop (FF) with IN/OUT and next state logic f1, ..., fn, fn+1, ..., f(n+m); the boundary inputs at both ends are 0; switching flip-flops FF1 and FF2 sit between cell n and cell n + 1)
Example 5.4 This example illustrates the TPG design for three cores with 4 PIs, 3 PIs and 7 PIs. Let us consider that the 4-cell PRPG < 9, 150, 45, 65 > is synthesized following Algorithm 4.1 as the TPG for the 4-PI CUT. Rules 65 and 105 are equivalent when considered as the last cell rule. Further, rule 105 maintains Property 1 and Property 2. Therefore, rule 105 can be considered as the fourth cell rule for the desired 7-cell TPG. However, it is found that the class of the fourth cell of the 4-cell PRPG is I. To design the 7-cell TPG, the fifth cell is to be selected in such a way that the cell can act as the first cell of the 3-bit PRPG. Since the class of the fourth cell is I and the rule is 105, we select 165 as the fifth cell rule (Table 4.2). Rule 165 and rule 5 are equivalent when they are treated as the first cell rule of Chapter 4. The class of the next cell (that is, the sixth cell of the 7-cell PRPG and the second cell of the 3-cell PRPG) is found to be II (Tables 4.2 and 4.3). We select the remaining cell rules from Tables 4.2, 4.3 and 4.4 to reach the 7-cell PRPG < 9, 150, 45, 105, 165, 105, 20 > for the TPG of a 7-PI CUT. Therefore, the PRPG can be utilized for the TPGs of the 4-PI, 3-PI and 7-PI cores (CUT).
The effectiveness of the TPG proposed in Subsection 5.5.1 and Subsection 5.5.2 is evaluated with different ISCAS benchmark circuits. The performance of the TPG is already reported in Section 5.2.
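The two-switch structure of Example 5.4 can be sketched in the same way. The code below is an illustration under stated assumptions, not the thesis implementation: FF1 is assumed to gate the right-neighbour input of cell n and FF2 the left-neighbour input of cell n + 1, the parameter name split_after is hypothetical, and the 3-cell rule vector < 5, 105, 20 > is inferred from the example (rule 5 being the first-cell equivalent of rule 165).

```python
from itertools import product

def step(rules, state, split_after=None, ff1=1, ff2=1):
    """One step of a 3-neighbourhood null-boundary CA with two optional switches.

    Assumption: ff1 gates the right input of cell `split_after`,
    ff2 gates the left input of cell `split_after + 1` (cells are 1-based).
    """
    n = len(state)
    nxt = []
    for i in range(n):
        left = state[i - 1] if i > 0 else 0
        right = state[i + 1] if i < n - 1 else 0
        if split_after is not None:
            if i == split_after - 1:
                right &= ff1
            if i == split_after:
                left &= ff2
        nxt.append((rules[i] >> (4 * left + 2 * state[i] + right)) & 1)
    return nxt

RULES_7 = [9, 150, 45, 105, 165, 105, 20]   # 7-cell PRPG of Example 5.4
RULES_4 = [9, 150, 45, 65]                  # 4-cell PRPG of Example 5.4
RULES_3 = [5, 105, 20]                      # 3-cell PRPG, inferred from the example

def splits_into_two_tpgs():
    """With FF1 = FF2 = 0 the structure behaves as the 4-cell and the 3-cell PRPG side by side."""
    for bits in product((0, 1), repeat=7):
        s = list(bits)
        joint = step(RULES_7, s, split_after=4, ff1=0, ff2=0)
        if joint != step(RULES_4, s[:4]) + step(RULES_3, s[4:]):
            return False
    return True

print(splits_into_two_tpgs())
```

With both flip-flops set to 1 the same function reduces to the plain 7-cell PRPG, which corresponds to the (n + m)-cell mode.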
5.6
Conclusion
This chapter reports the design of TPG for test applications. It is based on the nonlinear CA based PRPG reported in Chapter 4. The experimental results provided in this chapter establish the superiority of the proposed TPG over the state-of-the-art designs. The PRPG is further evolved to reach a cost optimal solution for TPG. Moreover, the scalable structure of the PRPG is utilized to design the TPGs for a CUT having multiple cores.
Chapter 6
Design of Universal BIST Structure
Over the last decade, test technology has achieved reasonable maturity. The design for testability (DFT) techniques such as full/partial scan are in place, coupled with high performance testers [4]. However, with the introduction of deep sub-micron technology and the System-on-Chip (SoC) design methodology, the cost of testers and the area overhead of scan design become prohibitive. Further, the problem of at-speed testing of high speed circuits cannot be adequately handled with scan path based DFT. The self-test designs incorporating the BIST (Built-In Self-Test) methodology appear as the alternative solution to this problem.
The BIST architecture requires the addition of two basic hardware blocks to a circuit under test (CUT): a test pattern generator (TPG) and a response analyzer [7]. The following issues guide the design of a BISTPG (Built-In Self-Test Pattern Generator) for a CUT: (1) the cost of test generation, (2) the quality of the generated test patterns, measured in terms of fault coverage of the BISTed CUT, and (3) the cost of test application. Due to its simplicity, small test generation time, and low area overhead, the Pseudo Random Pattern Generation (PRPG) scheme is usually implemented in the design of a BISTPG. The earlier chapter deals with such a design of on-chip TPG for a BIST structure around the nonlinear cellular automata (CA) based PRPG. However, a TPG implemented with a PRPG suffers from the following drawbacks: (i) it fails to detect pseudo-random pattern resistant faults; and (ii) it may generate patterns that are declared prohibited for a CUT, in the sense that such a pattern may place the CUT in an unstable state or may even damage the circuit components [84, 112].
The problem (ii) can be solved by developing a PRPG that avoids the generation of the Prohibited Pattern Set (PPS). In the subsequent discussion we shall refer to this scheme as PRPG without PPS [109, 112]. On the other hand, the problem (i) can be solved through on-chip generation of some deterministic test patterns for the pseudo-random pattern resistant faults of the CUT. Such a scheme, the Deterministic Test Set Generation (DTSG) technique, utilizes the test patterns (test set) identified by the well known test generation algorithms, such as PODEM, FAN, etc. Different methods have been proposed for generating such deterministic patterns on-chip [8]. The Pseudo-Exhaustive Pattern Generation (PEPG) scheme [181] is generally employed for a circuit where each of the Primary Outputs (POs) of the CUT is a function of only a subset of the Primary Inputs (PIs). A vector space theoretic analysis of linear CA state transition behavior to generate pseudo-exhaustive test patterns has been proposed in [221]. PEPG based on LFSR is reported in [181].
Against this background, this chapter deals with the design of a Universal BIST (UBIST) structure that can generate test patterns for each of the above four cases: (i) PRPG, (ii) PRPG without PPS, (iii) PEPG, and (iv) DTSG. We provide two alternative designs of the UBIST: one is based on linear CA, and the other is developed around nonlinear CA. The linear CA based UBIST utilizes the analytical framework of vector subspaces generated by a linear CA [221], whereas the characterization of CA rules guides the design of the nonlinear CA based UBIST. The theory of cellular automata has been introduced in Chapter 3. The characterization of nonlinear CA has also been reported in the same chapter. We next briefly introduce the tools for the characterization of linear CA to enable the proposed design steps.
Figure 6.1: An n-cell linear CA (schematic: cells 1, ..., i − 1, i, i + 1, ..., n, each a flip-flop (FF) with Input and Output)
6.1
Linear Cellular Automata
If all the cells of a cellular automaton (CA) use XOR as their next state logic, then the CA is linear; otherwise it is nonlinear (Definition 3.2). Figure 6.1 depicts the general structure of an n-cell linear CA. The CA is characterized by an n × n characteristic matrix (T), whose elements are
$$T_{ij} = \begin{cases} 1, & \text{if the next state of the } i^{th} \text{ cell depends on the present state of the } j^{th} \text{ cell} \\ 0, & \text{otherwise.} \end{cases}$$
$$T_1 = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}_{4 \times 4}$$
Characteristic polynomial: $x^4 + x^3 + 1$
a) T-matrix of a maximum-length group CA
b) The state-transition diagram of a maximum-length group CA (state 0 forms a cycle by itself; the 15 non-zero states 1 to 15 lie on a single cycle)
Figure 6.2: A 4-cell linear maximal length CA
The state transition of a linear GF(2) CA is defined by
$$f_{t+1}(x) = T \times f_t(x) \qquad (6.1)$$
and
$$f_{t+m}(x) = T^m \times f_t(x) \qquad (6.2)$$
for the $(t+1)^{th}$ and $(t+m)^{th}$ time instant respectively. For a linear group CA, $\det[T] \neq 0$. In this chapter we concentrate on the design of UBIST with a set of group CA. Characterization of such CA follows.
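Equation 6.1 and the group condition on T can be tried out on the 4-cell CA of Figure 6.2. The following is a minimal sketch, assuming the state is a column vector with cell 1 first; the helper names step, cycle_length and det_gf2 are introduced here for illustration.

```python
from itertools import product

# Characteristic matrix T1 of the 4-cell CA of Figure 6.2.
T1 = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

def step(T, state):
    """One application of Equation 6.1 over GF(2): next = T x state."""
    return tuple(sum(t & s for t, s in zip(row, state)) % 2 for row in T)

def cycle_length(T, seed):
    """Number of steps until the CA returns to the seed (terminates for a group CA)."""
    state, length = step(T, seed), 1
    while state != seed:
        state, length = step(T, state), length + 1
    return length

def det_gf2(T):
    """Determinant over GF(2) by Gaussian elimination (1 means T is invertible)."""
    m = [row[:] for row in T]
    n = len(m)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col]), None)
        if pivot is None:
            return 0
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            if m[r][col]:
                m[r] = [a ^ b for a, b in zip(m[r], m[col])]
    return 1

nonzero = [s for s in product((0, 1), repeat=4) if any(s)]
print(det_gf2(T1), {cycle_length(T1, s) for s in nonzero})
```

Every non-zero seed returns to itself after the same number of steps, which is the maximal length behaviour shown in Figure 6.2.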
6.1.1
Group CA Characteristics
The state transition diagram of a linear group CA consists of a set of cycles (referred to as cycle structure) represented by $[\mu_1(k_1), \mu_2(k_2), \cdots, \mu_m(k_m)]$, where $k_i$ is the cycle length and $\mu_i$ is the number of cycles with cycle length $k_i$. An n-cell group CA is a maximal length CA if all the states except the all zero state lie in a single cycle (Figure 6.2), that is, the cycle structure is $[1(1), 1(2^n - 1)]$. Otherwise, the CA is a non-maximal length CA (Figure 6.3). The characteristic polynomial of a group CA is $A(x) = |T + Ix|$. If the CA is a maximal length CA, then $A(x)$ is a primitive polynomial. In other words, a primitive polynomial produces a maximal length CA. The following theorem guides the design of our linear CA based UBIST.
Theorem 6.1 The possible number of n-cell maximal length CA may vary from $\phi(2^n - 1)/n$ to $2(\phi(2^n - 1)/n)$, where $\phi$ is Euler's totient function.
Proof: For a given n, there are $\phi(2^n - 1)/n$ primitive polynomials [1]. Each primitive polynomial produces at least one and at most two maximal length CA [41]. Hence there are at least $\phi(2^n - 1)/n$ and at most $2(\phi(2^n - 1)/n)$ maximal length CA for a given n. □
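The bounds of Theorem 6.1 are easy to evaluate for small n. The sketch below is only an illustration; euler_phi and max_length_ca_bounds are names introduced here, and the totient is computed by plain trial division.

```python
def euler_phi(m):
    """Euler's totient function by trial division."""
    result, p = m, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result

def max_length_ca_bounds(n):
    """Lower and upper bound of Theorem 6.1 on the number of n-cell maximal length CA."""
    primitive_polynomials = euler_phi(2 ** n - 1) // n
    return primitive_polynomials, 2 * primitive_polynomials

for n in (3, 4, 5, 7):
    print(n, max_length_ca_bounds(n))
```

For n = 4, for instance, there are two primitive polynomials, so the theorem gives between two and four maximal length CA.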
$$T = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 \end{bmatrix}$$
Characteristic polynomial: $\phi(x) = (x^4 + x + 1)(x^3 + x + 1)$
Cycles: I (cycle length = 1): 0000000; II (cycle length = 7): includes 1101101, 0110100, 1011001; III (cycle length = 15): includes 0000110, 0000010, 0001111, 0001001, 0000111; IV (cycle length = 105): includes 0010001 (1st), 0100110 (11th), 1111011 (12th)
Cycle structure = [1(1), 1(7), 1(15), 1(105)]
Figure 6.3: A 7-cell linear group CA with cycle structure [1(1), 1(7), 1(15), 1(105)]
Non Maximal Length CA: A non-maximal length CA can be formed by combining m primitive polynomials. The characteristic polynomial $\phi(x)$ of such a non-maximal length CA is
$$\phi(x) = \phi_1(x) \cdot \phi_2(x) \cdots \phi_m(x) \qquad (6.3)$$
where each $\phi_i(x)$ is a primitive polynomial. Each primitive polynomial contributes a cyclic subspace, $I_i$. The entire state space V of a group CA is the direct sum of the subspaces, that is, $V = I_1 + I_2 + \cdots + I_m$. A subspace $I_i$ corresponds to a cycle structure. The complete cycle structure of the state space V is computed with the help of the following theorems [51, 111].
Theorem 6.2 Consider two CA with characteristic polynomials $\phi_1(x)$ and $\phi_2(x)$ that have the cycle structures $A_1$ and $A_2$, where $A_1 = [1(1), \mu_1(k_1), \cdots, \mu_n(k_n)]$ and $A_2 = [1(1), \mu'_1(k'_1), \cdots, \mu'_m(k'_m)]$. The cycle structure of a CA with characteristic polynomial $\phi(x) = \phi_1(x)\phi_2(x)$ is the product of each $i^{th}$ term of $A_1$ with each $j^{th}$ term of $A_2$, $i = 1, 2, \cdots, n$; $j = 1, 2, \cdots, m$. The product of $\mu_i(k_i)$ and $\mu'_j(k'_j)$ produces $\mu(k)$, where $\mu = \mu_i \mu'_j \gcd(k_i, k'_j)$ and $k = \mathrm{lcm}(k_i, k'_j)$.
Theorem 6.3 For an n-cell CA with characteristic polynomial $\phi(x) = \phi_1(x)\phi_2(x) \cdots \phi_m(x)$, where $\phi_i(x)$ is a primitive polynomial of degree $n_i$, the largest cycle has length $> 2^{n-1}$ if and only if the $n_i$'s are mutually prime.
Example 6.1 $(x^3 + x + 1)$, $(x^4 + x + 1)$ and $(x^5 + x^2 + 1)$ are three primitive polynomials of degree 3, 4 and 5 respectively, and these degrees are mutually prime. The T-matrix with characteristic polynomial $(x^3 + x + 1)(x^4 + x + 1)(x^5 + x^2 + 1)$ is noted in Figure 6.4. The cycle structure of the CA is [1(1), 1(7), 1(15), 1(31), 1(105), 1(217), 1(465), 1(3255)]. Hence the length (3255) of the largest cycle is greater than $2^{12-1} = 2^{11}$.
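The product rule of Theorem 6.2 can be written down directly. The sketch below is illustrative: a cycle structure is represented as a dictionary from cycle length to number of cycles (a representation chosen here, not taken from the thesis), and the 1(1) term of each structure takes part in the product like every other term.

```python
from collections import Counter
from math import gcd

def primitive_structure(degree):
    """Cycle structure [1(1), 1(2^degree - 1)] contributed by a primitive factor."""
    return {1: 1, 2 ** degree - 1: 1}

def combine(a1, a2):
    """Theorem 6.2: mu(k) terms with mu = mu_i * mu_j * gcd(k_i, k_j), k = lcm(k_i, k_j)."""
    out = Counter()
    for k1, mu1 in a1.items():
        for k2, mu2 in a2.items():
            g = gcd(k1, k2)
            out[k1 * k2 // g] += mu1 * mu2 * g
    return dict(out)

seven_cell = combine(primitive_structure(3), primitive_structure(4))
twelve_cell = combine(seven_cell, primitive_structure(5))
print(sorted(seven_cell.items()))
print(sorted(twelve_cell.items()))
```

The first result corresponds to the 7-cell CA of Figure 6.3 and the second to the 12-cell CA of Example 6.1; in both cases the cycle lengths weighted by their counts add up to the full state space.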
$$T = \begin{bmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}$$
Figure 6.4: T-matrix of a non-maximal length CA
Theorem 6.4 If a primitive polynomial of degree m appears as a factor (raised to any arbitrary power) of the characteristic polynomial of the CA matrix, then the CA exhaustively covers at least one set of m-bit positions in a cycle of length $2^m - 1$.
Therefore, the above theorems imply that if the characteristic polynomial of a CA is $\phi(x) = \phi_1(x)\phi_2(x) \cdots \phi_m(x)$, and $\phi_i(x)$ is a primitive polynomial of degree $n_i$, then the CA exhaustively covers at least m sets of $n_i$-bit positions in a cycle of length $(2^{n_i} - 1)$. We next briefly report the synthesis of a T-matrix for a linear group CA. The details of the synthesis process have been reported in [111].
6.1.2
Synthesis of Group CA
This subsection provides an overview of the group CA synthesis scheme from a given characteristic polynomial $\phi(x) = \phi_1(x) \cdot \phi_2(x) \cdots \phi_m(x)$, where each $\phi_i(x)$ is primitive. The synthesis of a linear CA (T-matrix) from a primitive polynomial is done with the help of the algorithm proposed by Cattell & Muzio [41]. The following theorem formalizes the synthesis of a CA from $\phi(x)$.
Theorem 6.5 [111] If the characteristic matrices $T(\phi_i(x))$ with characteristic polynomials $\phi_i(x)$ are arranged in block diagonal form, then the resultant matrix T has the characteristic polynomial $\phi(x)$, where
$$T = \begin{bmatrix} T(\phi_1(x)) & 0 & \cdots & 0 \\ 0 & T(\phi_2(x)) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & T(\phi_m(x)) \end{bmatrix}$$
The T-matrix of Figure 6.4 illustrates the result of Theorem 6.5. Here, the $\phi_i(x)$ are $\phi_1(x) = x^3 + x + 1$, $\phi_2(x) = x^4 + x + 1$ and $\phi_3(x) = x^5 + x^2 + 1$. The next section describes the design specifications of the UBIST for both the linear CA based design and the nonlinear CA based design.
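The block diagonal arrangement of Theorem 6.5 can be reproduced for the matrix of Figure 6.4. The sketch below does not implement the synthesis algorithm of [41]; it assumes that each block is a tridiagonal (3-neighbourhood) matrix and takes the three block diagonals as read off Figure 6.4. Polynomials are encoded as integers whose bit k is the coefficient of x^k, and all function names are introduced here for illustration.

```python
def tridiagonal(diagonal):
    """3-neighbourhood block: given diagonal, ones on both off-diagonals."""
    n = len(diagonal)
    return [[diagonal[i] if i == j else int(abs(i - j) == 1) for j in range(n)]
            for i in range(n)]

def char_poly(diagonal):
    """|T + Ix| over GF(2) of a tridiagonal block, via p_k = (x + d_k) p_{k-1} + p_{k-2}."""
    prev, cur = 0, 1
    for d in diagonal:
        prev, cur = cur, (cur << 1) ^ (cur if d else 0) ^ prev
    return cur

def block_diagonal(blocks):
    """Arrange the blocks along the diagonal, zeros elsewhere (Theorem 6.5)."""
    n = sum(len(b) for b in blocks)
    T = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, value in enumerate(row):
                T[offset + i][offset + j] = value
        offset += len(b)
    return T

def step(T, state):
    return tuple(sum(t & s for t, s in zip(row, state)) % 2 for row in T)

def cycle_length(T, seed):
    state, length = step(T, seed), 1
    while state != seed:
        state, length = step(T, state), length + 1
    return length

# Block diagonals as they appear in Figure 6.4 (degrees 3, 4 and 5).
DIAGONALS = [(0, 1, 1), (0, 1, 0, 1), (1, 1, 1, 1, 0)]
T = block_diagonal([tridiagonal(d) for d in DIAGONALS])

print([bin(char_poly(d)) for d in DIAGONALS])
print(cycle_length(T, (1,) * 12))
```

The all-ones seed has a non-zero component in every block, so its cycle length is the least common multiple of the three block periods, which is the largest cycle of Example 6.1.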
6.2
Design Specifications for Universal BIST
The UBIST structure proposed in this thesis is designed with linear CA, as well as around nonlinear CA. However, the design specifications differ for the two cases. For linear CA, the design essentially boils down to the synthesis of the T-matrix of the CA from the design specifications identified in the following subsections. The design specifications for linear CA are expressed in terms of the availability of factor polynomials in the characteristic polynomial of the T-matrix. Each of the factor polynomials represents a vector subspace. On the other hand, the target of the nonlinear CA based design is to synthesize a rule vector for an n-cell CA that can generate the desired test patterns.
6.2.1
Pseudo-Random Pattern Generation
This mode of test generation applies a set of randomly generated patterns as the test vectors to a CUT. In a pseudo-random test, each input bit has equal probability of being either 0 or 1. The linear CA based PRPG, implemented with a maximal length CA, generates good quality pseudo-random patterns, as reported in [51]. The nonlinear CA based PRPG is reported in Chapter 4 of this thesis. Design Specification: To generate n-bit pseudo-random patterns, the linear CA based UBIST should have a T-matrix whose characteristic polynomial has a primitive factor polynomial of degree n. For a nonlinear CA based design, the PRPG is synthesized using Algorithm 4.1.
6.2.2
Pseudo-Exhaustive Pattern Generation
The CA based pseudo-exhaustive test pattern generation scheme is reported in [51, 221], along with the design of non-maximal length group CA and the identification of the CA cells that generate pseudo-exhaustive patterns. The following example illustrates the underlying principle. Let there be a 6-PI (Primary Input) CUT as shown in Figure 6.5. PO1 is a function of 4 PIs (A, B, C and D) while PO2 depends on 3 PIs (D, E and F). To test the subcircuit covered by PO1, we need to supply all possible combinations of 4-bit patterns from a TPG on the PIs A, B, C and D. On the other hand, to test the PO2 cone, it is necessary to supply all possible 3-bit patterns on D, E and F. The CA of Figure 6.3 with cycle structure [1(1), 1(7), 1(15), 1(105)] can be used to test this circuit. The CA seeded with a pattern of the 15 (7) length cycle, operated for 15 (7) clock cycles, generates exhaustive patterns to test the PO1 (PO2) cone. The pseudo-exhaustive patterns are generated from the last 4 bits (first 3 bits). Alternatively, the three bit pseudo-exhaustive patterns can also be covered by the 4-bit pseudo-exhaustive set of the 15 length cycle.
Figure 6.5: PEPG with 6 PIs (A, B, C, D, E, F) and 2 POs (PO1 & PO2)
Design Specification: The above discussion sets the design specification of the UBIST for PEPG. To generate pseudo-exhaustive patterns of, say, $n_i$ bits, the UBIST should be configured as a CA whose characteristic polynomial has a primitive factor of degree $n_i$. The T-matrix of the example CA of Figure 6.3 has the factor polynomials $(x^4 + x + 1)$ and $(x^3 + x + 1)$ of degree 4 and 3 respectively.
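The pseudo-exhaustive behaviour described above can be observed on the 7-cell CA of Figure 6.3. The sketch below is illustrative only. It assumes that T is entered with row i listing the cells on which cell i depends (the T_ij convention of Section 6.1), which is the reading under which the cycles quoted in the text are reproduced; states are written as bit strings with cell 1 first, and the seeds 0000110 and 0110100 are patterns that the text places on the 15 length and 7 length cycles.

```python
# T-matrix of the 7-cell group CA of Figure 6.3 (row i = dependencies of cell i).
T = [
    [0, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 1],
]

def step(state):
    """Next state of the CA; state is a string of '0'/'1' with cell 1 first."""
    bits = [int(c) for c in state]
    return "".join(str(sum(t & b for t, b in zip(row, bits)) % 2) for row in T)

def cycle_of(seed):
    """All states on the cycle through seed, in order of traversal."""
    states, s = [seed], step(seed)
    while s != seed:
        states.append(s)
        s = step(s)
    return states

cycle_15 = cycle_of("0000110")   # seed on the 15 length cycle
cycle_7 = cycle_of("0110100")    # seed on the 7 length cycle

last_four = {s[3:] for s in cycle_15}    # patterns seen on cells 4-7
first_three = {s[:3] for s in cycle_7}   # patterns seen on cells 1-3

print(len(cycle_15), len(last_four))
print(len(cycle_7), len(first_three))
```

In both cases the observed field takes every non-zero value exactly once per cycle; the all-zero pattern is not produced on these cycles.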
6.2.3
PRPG without Prohibited Pattern Set
A solution to the problem of designing a PRPG without PPS (Prohibited Pattern Set) is reported in [109, 112]. The underlying principle of that solution is reviewed below. The basic scheme is refined in this work and adopted for the design of the UBIST. The prohibited patterns are of two types [109].
Type I: These are random patterns about which the test designer has gathered information in the process of testing, normally by running ATPG (Automatic Test Pattern Generator). Patterns of this type may not display any correlation whatsoever. The test engineer also may not have any idea why those patterns place the CUT in an undesirable state. For a CUT, the cardinality of this class of PPS is very small.
Type II: This type of PPS is formed out of some prohibited functions (PFs) on a subset of the primary inputs (PIs) of the CUT. The test designer or ATPG identifies those PFs while analyzing the logic of the CUT. For example, let us assume that out of the 7 PIs of a CUT, four PIs a, b, c and d are associated with the prohibited function defined as ab'c'd + abcd = 1. So the number of prohibited patterns resulting from the PF is $2 \times 2^3 = 16$.
Though the PPS of a CUT can be of two types, under careful study it is observed that both the types can be mapped into a single one [109]. The only difference between these two types is that the prohibited function (PF) of Type II can result in a large number of patterns in the PPS. In the current work, the PPS of both the types are simply referred to as the PPS and are handled by a single scheme. To address the issue of PPS while generating the pseudo-random test patterns for a CUT, we propose a solution with non-maximal length group CA.
Pattern set, P =
0 0 0 0 1 1 0
0 0 0 0 0 1 0
0 0 0 1 0 0 1
0 0 0 0 1 1 1
0 0 0 1 1 1 1
0 1 1 0 1 0 0
1 1 0 1 1 0 1
1 0 1 1 0 0 1
0 1 0 0 1 1 0
0 0 1 0 0 0 1
Figure 6.6: The Prohibited/Deterministic Pattern set (one 7-bit pattern per row)
The following two terms are introduced to designate the cycles of a non-maximal length group CA that are to be considered for designing a PRPG without PPS.
Definition 6.1 Target Cycle (TC): The cycle of largest length generated by the CA and mostly free from patterns of the PPS.
Definition 6.2 Redundant Cycle (RC): The cycles other than the TC. These are redundant in the sense that they are not used for PRPG, and the prohibited patterns are mainly covered by these cycles.
The following example illustrates the basic scheme for designing the PRPG without PPS. It will help to identify the design specifications and design objectives for such a PRPG designed with linear CA.
Example 6.2 Let the PPS of a CUT contain the 10 prohibited patterns shown in Figure 6.6. The CA used for the PRPG without PPS is the one shown in Figure 6.3. Out of the given PPS, the cycle II (length = 7) contains 3 prohibited patterns PPS1 = {0110100, 1101101, 1011001}, whereas the prohibited patterns PPS2 = {0000110, 0000010, 0001001, 0000111, 0001111} are covered in cycle III (length = 15). The remaining 2 prohibited patterns {0010001, 0100110} fall in cycle IV and are separated by a distance of 10 time steps, that is, $T^{10}(0010001) = (0100110)$. The cycles II and III are called RCs (Definition 6.2). The cycle IV is called the Target Cycle (TC) and can be employed for the generation of test patterns without PPS. To avoid the two prohibited patterns of cycle IV, the CA is loaded with 0100110 and can run for L = 94 time steps to generate a test pattern sequence starting from 1111011, that is, the state following 0100110. The free space available in the TC for PRPG is greater than 75% of the length available in the corresponding maximal length CA, that is, $2^7 - 1 = 127$.
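The cycle assignment of Example 6.2 can be reproduced by enumerating the state space of the 7-cell CA. The following is a sketch under the same assumptions as before: T is entered with row i listing the cells on which cell i depends, states are bit strings with cell 1 first, and the pattern set is the one of Figure 6.6. The helper names are introduced here.

```python
from collections import Counter
from itertools import product

# T-matrix of the 7-cell group CA of Figure 6.3 (row i = dependencies of cell i).
T = [
    [0, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 1, 1],
]
# Prohibited pattern set of Figure 6.6.
PPS = ["0000110", "0000010", "0001001", "0000111", "0001111",
       "0110100", "1101101", "1011001", "0100110", "0010001"]

def step(state):
    bits = [int(c) for c in state]
    return "".join(str(sum(t & b for t, b in zip(row, bits)) % 2) for row in T)

def cycle_of(seed):
    states, s = [seed], step(seed)
    while s != seed:
        states.append(s)
        s = step(s)
    return states

def all_cycles():
    seen, cycles = set(), []
    for bits in product("01", repeat=7):
        s = "".join(bits)
        if s not in seen:
            cycle = cycle_of(s)
            cycles.append(cycle)
            seen.update(cycle)
    return cycles

cycles = all_cycles()
structure = Counter(len(c) for c in cycles)                  # cycle length -> count
covered = Counter(len(c) for c in cycles for p in PPS if p in c)
target = max(cycles, key=len)                                # Target Cycle (TC)
i, j = target.index("0010001"), target.index("0100110")
distance = (j - i) % len(target)
# States generated after loading 0100110, up to the state just before 0010001.
free_run = [target[(j + k) % len(target)] for k in range(1, len(target) - distance)]

print(dict(structure), dict(covered), distance, len(free_run))
```

The free run starts at the state that follows 0100110 and contains no member of the PPS, which is the portion of the TC that the example uses for pattern generation.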
Design Specification: The above discussion identifies that the UBIST, supporting PRPG without PPS, should be configurable as a CA where the factors of the characteristic polynomial of its T-matrix are primitive polynomials of low degree (Figure 6.3). Each of the factors of the characteristic polynomial generates a vector subspace of smaller dimension that covers the PPS. Further, following the results of Theorem 6.2 and Theorem 6.3, it has to be ensured that the resulting n-cell CA also has a cycle of length greater than $2^{n-1}$. So, the design objectives for an efficient PRPG without PPS can be noted as follows.
Design Objectives: (i) the PPS should be covered by the Redundant Cycles (RCs), (ii) the Target Cycle (TC) should have a large cycle length ($> 2^{n-1}$) to ensure the pseudo-random quality of the patterns generated from an n-cell CA, and (iii) the members of the PPS, if any exist on the TC, should be covered within a short distance so that the majority of the TC is available for PRPG without PPS.
6.2.4
Deterministic Test Set Generation
Let us assume that the P P S noted under last subsection for P RP G without P P S be the Deterministic Test Set (DT S) for a CU T and to be generated by the U BIST . Then the design specification for the Deterministic Test Set Generation (DST G) will remain as it is noted in the last subsection for P RP G without P P S. That is, U BIST supporting generation of a given DT S should be configurable as a CA where the factors of characteristic polynomial of the T −matrix are primitive polynomials of low degree. However, for the DT SG, the design objectives undergo following modifications. Design Objectives: (i) The DT S should be covered by the minimum number of smaller length cycles, and (ii) The minimum number of extra patterns (redundant patterns) should be allowed to cover the DT S. The following terminologies are defined for proper justification of the above design objectives for DT SG. Definition 6.3 Run length (R): The run length (R) for DT SG is the number of time steps (CA states traversed) required to cover the given DT S. Definition 6.4 Seed (S): Seed S is a pattern loaded in the DT SG for its traversal to generate the given DT S. Each discrete graph in the state transition diagram of the CA employed for the DT SG requires at least a single seed to initiate the CA state traversal. The Design Objectives reported above are to minimize the R and number of S to generate the desired DT S. The following example illustrates the basic scheme for a linear CA based design. Example 6.3 Consider the pattern set P of Figure 6.6 as the DT S. The group CA shown in Figure 6.3 can also be selected as the candidate DT SG. Out of the
given 10 patterns of Figure 6.6, cycle II (length = 7) contains 3 test patterns, DTS1 = {0110100, 1101101, 1011001}, and the test patterns DTS2 = {0000110, 0000010, 0001001, 0000111, 0001111} are covered by cycle III (length = 15). The remaining 2 patterns, DTS3 = {0010001, 0100100}, fall in cycle IV and are separated by a distance of 10 time steps. The run lengths R1, R2 and R3 to generate DTS1, DTS2 and DTS3 are 4, 11 and 11 respectively if the CA is loaded with the seeds 1101101, 0000110 and 0010001 respectively. Therefore, the run length (R) to generate the DTS of Figure 6.6 is 26 (R1 + R2 + R3) with three seeds.
Based on the design specification and design objectives identified in this section, we next proceed to design the UBIST structure around the linear CA. The preliminary design of the linear CA based UBIST is reported in [83].
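The run-length bookkeeping of Example 6.3 can be sketched as follows. This is only an illustration, assuming that a cycle is available as an ordered list of states and that the run length counts the seed state itself (which agrees with R3 = 11 for two patterns that are 10 time steps apart). The function name and the 30-state placeholder cycle are hypothetical and do not come from Figure 6.3.

```python
def min_run_length(cycle, targets):
    """Return (seed, run_length) for the shortest stretch of consecutive
    states on `cycle` (states listed in traversal order) that covers every
    state in `targets`. The run length counts the seed itself."""
    length = len(cycle)
    pos = sorted(i for i, s in enumerate(cycle) if s in targets)
    if len(pos) != len(set(targets)):
        raise ValueError("some target patterns are not on this cycle")
    best_seed, best_run = None, None
    for j, start in enumerate(pos):
        # Seed at one target and run up to the target that precedes it cyclically.
        prev = pos[j - 1]
        run = (prev - start) % length + 1
        if best_run is None or run < best_run:
            best_seed, best_run = cycle[start], run
    return best_seed, best_run


# Placeholder cycle: two targets that are 10 steps apart need 11 states.
toy_cycle = list(range(30))
print(min_run_length(toy_cycle, {0, 10}))
```

Summing the per-cycle results over all cycles that hold test patterns gives the total run length R, and the number of such cycles gives the number of seeds.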
6.3
Design of Linear CA based UBIST
The analysis of the design specifications for the PRPG / PEPG / PRPG without PPS / DTSG leads to the following design guidelines: (1) the UBIST should be configurable as a non-maximal length group CA generating small length cycles, each representing a vector subspace, and (2) the characteristic polynomial of the non-maximal length group CA is to be $\phi(x) = \phi_1(x)\phi_2(x)\cdots\phi_k(x)$, where $\phi_i(x)$ is a primitive polynomial of degree $n_i$ ($i = 1, 2, \cdots, k$). Each primitive polynomial of degree $n_i$ produces a cycle of length $2^{n_i} - 1$. Combining these, as noted in Theorem 6.5, results in a non-maximal length CA of the desired cycle length. The largest cycle length generated by such a CA is governed by the results of Theorem 6.2 and Theorem 6.3. These guidelines lead to the following fundamental problem, addressed in the next subsection [83]: find the match between the given PPS/DTS and the patterns generated in the smaller length cycles of the CA selected for PRPG without PPS and DTSG.
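To make guideline (2) concrete, the sketch below lists the cycle lengths that arise when blocks with primitive characteristic polynomials are combined. It assumes that the T-matrix is a direct sum (block diagonal) of maximal length blocks; Theorem 6.5 is not reproduced here, so this structure and the function name are assumptions rather than the construction used in the thesis.

```python
from functools import reduce
from itertools import combinations
from math import gcd, prod


def lcm(a, b):
    return a * b // gcd(a, b)


def cycle_structure(degrees):
    """Map cycle length -> number of cycles for a direct sum of blocks whose
    characteristic polynomials are primitive with the given degrees."""
    structure = {1: 1}  # the all-zero state is a cycle of length 1
    for r in range(1, len(degrees) + 1):
        for subset in combinations(range(len(degrees)), r):
            # States whose non-zero components are exactly the blocks in `subset`.
            periods = [2 ** degrees[i] - 1 for i in subset]
            length = reduce(lcm, periods)
            states = prod(periods)
            structure[length] = structure.get(length, 0) + states // length
    return structure


print(cycle_structure([3, 4]))
```

For degrees (3, 4) this yields cycles of length 1, 7, 15 and 105, which is consistent with the cycle lengths 7 and 15 quoted in Example 6.3. When the degrees are relatively prime, the periods $2^{n_i} - 1$ are also relatively prime, so the largest cycle length is simply their product.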
6.3.1
Degree of Pseudo-Exhaustiveness
Each small length cycle of a CA is characterized by Theorem 6.4. It specifies that a CA whose characteristic polynomial has an m-degree primitive polynomial as a factor exhaustively generates at least one set of m bit positions in the vectors of the subspace of dimension $2^m - 1$. Therefore, the necessary condition for a given pattern set to be fully covered by a cycle of length $2^m - 1$ is that it should have an m-bit pseudo-exhaustive field. Against this background, we introduce the concept of Degree of Pseudo-Exhaustiveness (DPE), defined below.
Definition 6.5 Degree of Pseudo-Exhaustiveness (DPE): The DPE of an m-bit field, designated as $F^m$, in an n-bit pattern set P of cardinality c is given
by: $DPE(F^m) = \frac{UP}{c} \times 100$, where UP = number of unique m-bit non-zero patterns in $F^m$.
Let m different partitions be possible for n. Further, consider that for the $r^{th}$ partition n can be expressed as the sum of $k_r$ integers, $n = n_1 + n_2 + \cdots + n_{k_r}$, and let $p_i$ be the number of unique non-zero patterns in $F^{n_i}$ ($i = 1, 2, \cdots, k_r$). Hence, the DPE of the pattern set for this partition is $\min(p_i)/c$. Therefore, $DPE(P) = \max_r(\min_i(p_i))/c$, where $i = 1, 2, \cdots, k_r$ and $r = 1, 2, \cdots, m$.
In the proposed design, we concentrate on those partitions of n where $n_i$ and $n_j$ are relatively prime to each other for any i and j, $i \neq j$. This is necessary to satisfy the condition noted in Theorem 6.3.
Example 6.4 Consider the vector set P defined in Figure 6.6. P has a 7-bit field, and 7 can be partitioned into (3, 4) and (2, 5). $p_1 = 5$ and $p_2 = 8$ for the first partition, whereas $p_1 = 3$ and $p_2 = 9$ for the second partition. Therefore, DPE(P) = max(min(5, 8), min(3, 9))/10 = 5/10 = 50%. This indicates the maximum number of vectors in P that may be accommodated in a cycle of a 7-bit CA. The characteristic matrix of the CA indicated in Figure 6.3 shows that one of its vector subspaces (cycle II) covers 5 vectors out of the desired 10 vectors.
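The DPE computation can be sketched as below. This is a minimal illustration under stated assumptions: the fields of a partition are taken as contiguous bit positions read from left to right, each part is at least 2 bits wide, and every partition is tried in a single (non-decreasing) order. The text does not fix these details, and the function names and the 5-bit sample set are hypothetical.

```python
from math import gcd


def coprime_partitions(n, min_part=2):
    """Partitions of n into two or more pairwise relatively prime parts,
    each at least `min_part`, listed once in non-decreasing order."""
    found = []

    def extend(remaining, smallest, parts):
        if remaining == 0:
            if len(parts) >= 2:
                found.append(tuple(parts))
            return
        for part in range(smallest, remaining + 1):
            if all(gcd(part, q) == 1 for q in parts):
                extend(remaining - part, part, parts + [part])

    extend(n, min_part, [])
    return found


def dpe(patterns, n):
    """Return (best partition, DPE in percent) for equal-length bit strings."""
    best_partition, best_min = None, -1
    for parts in coprime_partitions(n):
        start, counts = 0, []
        for width in parts:
            field = {p[start:start + width] for p in patterns}
            field.discard("0" * width)      # only non-zero patterns count
            counts.append(len(field))
            start += width
        if min(counts) > best_min:
            best_partition, best_min = parts, min(counts)
    if best_partition is None:
        raise ValueError("no admissible partition for this n")
    return best_partition, 100.0 * best_min / len(patterns)


print(coprime_partitions(7))
print(dpe(["01101", "10010", "11000", "01111"], 5))
```

For n = 7 the enumeration returns the two partitions (2, 5) and (3, 4) used in Example 6.4.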
6.3.2
UBIST Design Algorithm
As described in the earlier sections, the design of a CA for the PRPG and PEPG is straightforward, so the UBIST design algorithm concentrates only on the design of the PRPG without PPS and the DTSG. The basic steps of the algorithm are noted below, and each step is elaborated after Algorithm 6.1.
Algorithm 6.1 UBIST DESIGN LinearCA
Input: A pattern set (PPS or DTS).
Output: The desired UBIST.
Step 1. Construct the initial database of maximal length CA.
Step 2. Make an intelligent guess based on the DPE of the PPS/DTS.
Step 3. Synthesize an n-cell non-maximal length CA from $n_i$-cell ($n_i < n$) maximal length CA.
Step 4. Search for the appropriate CA for the problem and output it.
1. Construction of Initial Database
For any n, primitive polynomials are generated. A number of CA are generated from the primitive polynomials using the Cattell-Muzio algorithm reported in [41]. All the maximal length $n_i$-cell CA ($i = 3, 4, \cdots$) are stored in a database. The number of maximal length CA for an n, as per Theorem 6.1, is at least $\phi(2^n - 1)$, where $\phi$ is Euler's totient function [1].
2. Intelligent Guess based on the DPE of the PPS/DTS
This step selects $n_1, n_2, \cdots, n_{k_r}$ (Section 6.3.1), where $n = n_1 + n_2 + \cdots + n_{k_r}$,
depending on the DPE. The algorithm presented below gives a fairly accurate result for Step 2. It tests the partitions of n to arrive at an intelligent guess of the partition point, based on the DPE.
Algorithm 6.2 Find Partition
Input: n-bit pattern set P (PPS/DTS).
Output: $(n_1, n_2, \cdots, n_{k_r})$.
Step I: Determine all possible $n_i$, $i = 1, 2, \cdots, k_r$, for any $k_r$, such that $n_1 + n_2 + \cdots + n_{k_r} = n$ and the $n_i$ are pairwise relatively prime.
Step II: For each partition $(n_1, n_2, \cdots, n_{k_r})$, find $P_j$ = DPE of the pattern set for the $j^{th}$ partition.
Step III: Output the $(n_1, n_2, \cdots, n_{k_r})$ for which $P_j$ is maximum.
3. CA Synthesis
In this step, the appropriate non-maximal length CA are synthesized for the desired PRPG/DTSG. The algorithm reported below performs the synthesis.
Algorithm 6.3 Synthesize LinearCA
Input: (1) The database created in Step 1, and (2) the output partition from Algorithm 6.2.
Output: A set of non-maximal length group CA.
Step I: For each $n_i$ of $(n_1, n_2, \cdots, n_k)$, repeat Step II to Step III.
Step II: Corresponding to each $n_i$, pick one maximal length CA from the initial database.
Step III: Form a non-maximal length n-cell CA from the $n_i$-cell CA ($n = n_1 + n_2 + \cdots + n_k$), using Theorem 6.5.
Step IV: Repeat Step II and Step III until all the maximal length CA corresponding to each $n_i$ have been covered.
4. Search for Solution
Step 3 of Algorithm 6.1 results in a set of candidate non-maximal length group CA for the UBIST (Algorithm 6.3). To pick the one most effective for the UBIST, we optimize the search in Step 4 of Algorithm 6.1. For the PRPG without PPS of the UBIST we implement Search I, whereas Search II is considered for the design of the DTSG.
Search I: PRPG without PPS
The search procedure for the PRPG without PPS ensures coverage of the maximum number of states (prohibited patterns) in the smaller cycles so that the largest cycle can be employed for the PRPG. That is, the largest cycle can be kept free of the given PPS. To evaluate this criterion, a performance function (pf) is defined as

$pf(PPS) = \frac{sp}{2^n}$    (6.4)

where sp is the amount of free space, that is, the largest stretch of the largest cycle that contains no prohibited pattern. Let the set $\{k_1, k_2, \cdots, k_m\}$ denote the cycle lengths of a group CA. A prohibited pattern p lies on a cycle of length $k_i$ if $T^{k_i} \cdot p = p$, where T is the characteristic matrix of the CA. Based on the above formulation, the following algorithm evaluates pf and finds the desired CA with the highest pf.
Algorithm 6.4 search CA forPPS
Input: A set of CA with their cycle structures, PPS.
Output: The desired CA with maximum pf.
Step I: Select one CA.
Step II: Sort the set of cycle lengths such that $k_j < k_i$ if $j < i$.
Step III: For i = 1 to m − 1 { Calculate $T^{k_i}$. Remove from the PPS every pattern p ∈ PPS for which $T^{k_i} \cdot p = p$. }
Step IV: Locate the remaining patterns on the largest cycle.
Step V: Calculate the largest free space without a prohibited pattern, and pf (Relation 6.4).
Step VI: Repeat Step I to Step V for each CA.
Step VII: Output the CA with maximum pf.
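Steps II to V of Algorithm 6.4 can be sketched for a single candidate CA as follows. To keep the example short, the CA is represented by a next-state function `step` rather than by its characteristic matrix, so the test $T^{k_i} \cdot p = p$ becomes "applying `step` $k_i$ times returns p". The function names and the 16-state toy permutation are hypothetical stand-ins, and a single largest cycle is assumed.

```python
def returns_after(step, p, k):
    """Analogue of T^k . p = p: True if k applications of `step` give p back."""
    s = p
    for _ in range(k):
        s = step(s)
    return s == p


def free_space_pf(step, n, cycle_lengths, pps):
    """pf of Relation 6.4 for one group CA with the given cycle lengths."""
    remaining = set(pps)
    lengths = sorted(cycle_lengths)
    for k in lengths[:-1]:                   # all cycles except the largest
        remaining -= {p for p in remaining if returns_after(step, p, k)}
    if not remaining:                        # largest cycle is entirely free
        return lengths[-1] / 2 ** n
    start = next(iter(remaining))            # walk once around the largest cycle
    sp, run, s = 0, 0, step(start)
    while s != start:
        if s in remaining:
            sp, run = max(sp, run), 0
        else:
            run += 1
        s = step(s)
    return max(sp, run) / 2 ** n


def toy_step(s):
    """Toy 4-bit permutation with cycles (0), (1 2 3) and (4 5 ... 15)."""
    if s == 0:
        return 0
    if s <= 3:
        return s % 3 + 1
    return 4 + (s - 3) % 12


print(free_space_pf(toy_step, 4, [1, 3, 12], {2, 5, 7}))
```

In the toy run, pattern 2 is absorbed by the cycle of length 3, patterns 5 and 7 stay on the largest cycle, and the longest prohibited-free stretch has 9 states, giving pf = 9/16. Repeating this evaluation for every candidate CA and keeping the maximum corresponds to Steps VI and VII.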
Search II: DTSG
For the DTSG the objective is to find a CA that can generate the patterns of the DTS with minimum run length (R) and minimum number of seeds (s). However, minimizing R and minimizing the number of seeds are conflicting goals. We have, therefore, designed a function that ensures a balance between the two. The rationale behind the function is stated next. In general, the cost of a seed is assumed to be higher than that of increased run length. Seeds are stored in memory, which incurs a storage cost. Further, a seed has to be fetched from memory and loaded into the CA during testing of a CUT, and the memory access time is significantly higher than the time needed to run the on-chip CA for one extra cycle. Based on this consideration, the performance function for selecting a CA for the DTSG (UBIST) is defined as

$pf(DTS) = \frac{p}{R\,(1 + (s - 1)^2 / p)}$    (6.5)
where p is the number of patterns in the DTS, R is the run length and s is the number of seeds applied. Algorithm 6.5 selects the CA for which the pf value is maximum. It first locates the (deterministic) patterns in the different cycles. The number of cycles required to accommodate the patterns indicates the number of seeds. The total number of patterns generated to cover the deterministic test patterns is the run length.
Algorithm 6.5 search CA forDTS
Input: A set of CA with their cycle structures, DTS.
Output: The desired CA with maximum pf.
Step I: Select one CA.
Step II: Locate the patterns on the different cycles.
Step III: Count the number of cycles and report it as the number of seeds (s).
Step IV: Find the maximum run ($R_i$) required on each cycle to cover the other patterns on it.
Step V: Calculate the total run length $R = R_1 + R_2 + \cdots$.
Step VI: Calculate pf (Relation 6.5).
Step VII: Repeat Step I to Step VI for each CA.
Step VIII: Output the CA with maximum pf.
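As a small worked illustration of Relation 6.5, the sketch below evaluates pf for the figures of Example 6.3 (p = 10, R = 26, s = 3). Applying the relation to that example, and the second (R = 40, s = 2) candidate used for comparison, are assumptions made here for illustration only; the function name is hypothetical.

```python
def pf_dts(p, run_length, seeds):
    """Relation 6.5: p / (R * (1 + (s - 1)^2 / p))."""
    return p / (run_length * (1 + (seeds - 1) ** 2 / p))


three_seeds = pf_dts(10, 26, 3)   # 10 / (26 * 1.4)
two_seeds = pf_dts(10, 40, 2)     # hypothetical alternative: fewer seeds, longer run
print(round(three_seeds, 4), round(two_seeds, 4))
```

The function reaches its maximum of 1 when a single seed generates the p patterns in exactly p time steps, and the quadratic term in (s − 1) penalizes additional seeds more heavily as their number grows.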
6.3.3
Performance of linear CA based UBIST
This subsection reports the experimental setup and the results for both designs, PRPG without PPS and DTSG. The following is a summary of the experimental results, showing the success rate, the percentage of free space achieved and the randomness quality of the PRPG without PPS. The results for the DTSG are described in the last part of this section.
Performance of PRPG without PPS
1. Success rate: Table 6.1 summarizes the results of designing the BISTPG that generates good quality pseudo-random patterns while avoiding generation of the given PPS. Real-life PPS data for a CUT is proprietary in nature and not usually available. In its absence, the experimentation is done with randomly generated PPS. The value of n and the cardinality of the PPS are noted in columns 1 & 2. Column 3 denotes the length of the TC (Target Cycle) used for the PRPG. The percentage of free space available for test generation, without encountering any prohibited pattern, is shown in column 4. The results of Table 6.1 are shown for |PPS| = 10 and 15.
2. Free space achieved: The available free space on the largest cycle reduces as the number of prohibited patterns increases. Figure 6.7 depicts the effect of an increase in |PPS| on the available free space for a particular value of n (number of bits in a pattern ∈ PPS). We display two results (Figure 6.7), for n = 10 and 15, to illustrate the scenario.
3. Study of randomness property: The randomness of the patterns generated by the TC, for different values of n, is studied based on the metric proposed in [151]. The comparative study of the randomness quality of the patterns generated by the BISTPG and by the corresponding maximal length CA is presented in Table 6.2. We run the CA with different seeds of the TC. A 'pass' implies that the test succeeds in at least 75% of the cases.
The results establish that the randomness quality of the patterns generated by the TC of the BISTPG is as good as that of the maximal length CA.
Performance of DTSG
The experimentation is conducted on synthesized data, followed by the DTS for benchmark circuits.
Synthesized Data: The following is a summary of the experimental results on randomly synthesized data. It shows the run length and the number of seeds required to cover the DTS.
1. Table 6.3 summarizes the results on randomly generated DTS. The value of n and the cardinality of the pattern set are noted in columns 1 & 2. Columns 3 & 4 show the run length and the number of seeds required to generate the patterns.
Table 6.1: The results of BISTPG for PRPG without PPS (Linear CA based design)
(1) #Cell   (2) |PPS|   (3) TC      (4) Free space %
8           10          217         59.76
9           10          381         62.30
10          10          1023        56.73
11          10          1953        74.31
12          10          3255        51.96
13          10          8001        67.78
14          10          15841       55.02
15          10          27559       51.53
16          10          63457       74.16
17          10          131071      57.78
18          10          262143      70.41
19          10          458745      57.70
20          10          1040257     56.55
21          10          2097151     58.41
22          10          4063201     65.62
8           15          225         50.14
9           15          465         48.63
10          15          889         45.41
11          15          1905        33.64
12          15          3937        39.11
13          15          8191        39.13
14          15          15841       41.00
15          15          31705       50.00
16          15          59055       44.90
17          15          82677       34.80
18          15          259969      37.70
19          15          458745      45.70
20          15          1040257     45.54
21          15          1966065     49.51
22          15          3138051     42.65
Figure 6.7: Variation of free space with the cardinality of PPS. (Plot of pf, in % of total space, against the cardinality of PPS, with curves for n = 10 and n = 15.)
Table 6.2: Randomness Test (Linear PRPG without PPS)
Random Test         n = 9             n = 15 to 20
                    Max     BISTPG    Max     BISTPG
Gap test            pass    pass      pass    pass
Run test            pass    fail      pass    pass
Serial corr test    pass    pass      pass    pass
Equidist. test      fail    fail      fail    fail
Auto-corr test      pass    pass      pass    pass
Cross-corr test     pass    fail      pass    pass
Table 6.3: Results of BISTPG for generation of randomly generated DTS (Linear CA based design)
(1) #cell (n)   (2) cardinality   (3) run length   (4) #seed
8               25                113              3
9               25                281              2
10              25                551              2
11              25                355              2
12              25                27               5
13              25                3336             2
8               50                177              3
9               50                267              2
10              50                321              3
11              50                762              2
12              50                1008             2
13              50                162              4
The results of Table 6.3 are shown for pattern sets of cardinality 25 and 50.
2. Figure 6.8 shows a sample graph indicating the increase in run length (R) with the increase in cardinality of the DTS. The results are derived for n = 10-bit patterns.
3. Figure 6.9 displays the relation between the degree of pseudo-exhaustiveness (DPE) and the performance (pf). A higher pf value for the given DTS signifies a smaller run length (R) and fewer seeds (s). Figure 6.9 shows that an increase in DPE also increases the value of pf; that is, it reduces the run length and the number of seeds needed to generate the given DTS.
DTS of Benchmark Circuits: For this experimentation, we initially consider the DTS of a benchmark circuit for 100% fault coverage. However, the DTS given for a benchmark may not have an acceptable DPE; that is, the probability of inclusion of the DTS in the smaller cycles of a CA may be low. Therefore, if the DPE value for a DTS is below a threshold, we modify the DTS while ensuring that (1) the fault coverage of the circuit is still 100%, confirmed through fault simulation, and (2) the DPE of the pattern set is increased. Table 6.4 shows the performance of the BISTPG for the DTS of benchmark circuits. Column 1 shows the name of the circuit, whereas columns 2 and 3 indicate the number of bits in a pattern and the cardinality of the DTS respectively. The run length and the number of seeds required to generate the DTS are reported in columns 4 & 5. Columns 6 & 7 report the run length and the number of seeds required after modification of the DTS to improve its DPE. The results of Table 6.4 show that the performance of the BISTPG is quite high for the DTS with improved DPE.
Figure 6.8: Variation of run length with the cardinality of pattern set. (Plot of run length R against the cardinality of DTS, for n = 10.)
Figure 6.9: DPE vs. performance. (Plot of pf, in %, against the Degree of Pseudo-Exhaustiveness (DPE), for |DTS| = 25.)
Table 6.4: Results of BISTPG for generating DTS for benchmark circuits (Linear CA based design)
Circuit name   n     |P|    Before DPE modification   After DPE modification
                            R        s                R        s
s27f           7     14     60       2                60       2
s386f          13    142    1925     2                1450     2
s298f          17    70     2827     3                198      3
s208f          18    70     1068     1                1068     1
s820f          23    284    4278     4                3057     4
s832f          23    277    4021     4                2809     4
s344f          24    46     5027     4                2387     5
The experimental results shown in this section point to the fact that the linear CA based UBIST can be a better solution to the problem of designing an on-chip TPG for a CUT. We further improve the performance of such a design by exploring the huge search space of nonlinear CA in the next sections. The components of the UBIST are the PRPG, PEPG, PRPG without PPS and DTSG. The performance of the nonlinear CA based PRPG is reported in Chapter 4. The design of a PEPG around nonlinear CA falls back to the design of a PEPG with linear CA rules and exhibits performance similar to the linear CA based design. Therefore, in the following sections we report the nonlinear CA based design of the PRPG without PPS and of the DTSG.
6.4
Nonlinear CA based PRPG without PPS
This section proposes the design of a PRPG around nonlinear CA that avoids generation of a given set of prohibited patterns (PPS). The objective of this design is reported in Section 6.2. Preliminary results on the nonlinear CA based PRPG without PPS have been provided in [84, 82, 86]. An overview of the design is given next.
6.4.1
Overview of the design
Let us consider the state transition diagram of a 5-cell nonlinear group CA with rule vector < 5, 105, 153, 154, 5 >, shown in Figure 6.10(b). It contains 2 cycles, of length 8 and 24. Further, assume that we are considering the design of a TPG for a CUT with 5 Primary Inputs (PIs). The PPS of the CUT contains 5 prohibited patterns, as shown in Figure 6.10(a). Out of the given 5 prohibited patterns, 3 fall in the cycle of length 8 and two in the larger cycle of length 24 (the prohibited patterns in the cycles are shown as shaded nodes). The 5-cell CA of Figure 6.10 can act as the desired TPG for the given CUT developed around the nonlinear CA. It avoids the PPS when the CA is loaded with 3 (00011) and run for 21 time steps. The 21 patterns, starting from 3 and ending at 9, can be the desired test patterns for the CUT, with no prohibited pattern included. That is, these 21 patterns, approximately 66% of the total number ($2^5 = 32$) of states of the CA, may be applied to the CUT as pseudo-random test patterns. The number of such patterns (CA states) increases exponentially as the number of PIs of a CUT increases. However, in practical situations the desired number of pseudo-random test patterns is, in general, not more than 30,000 for a sufficiently large circuit. Therefore, the design objectives reported in Section 6.2.3 for the PRPG without PPS can be redefined as follows.
Figure 6.10: Nonlinear CA based PRPG without PPS. (a) Prohibited Pattern Set: PPS = {11101 (29), 11111 (31), 10100 (20), 00111 (7), 11100 (28)}. (b) State transition of the 5-cell nonlinear CA (< 5, 105, 153, 154, 5 >), with a redundant cycle of length 8 and a target cycle of length 24.
Design Objectives: The design of a TPG/PRPG without PPS for an n-PI CUT should fulfill the following objectives:
• The generated test patterns should have good quality of randomness.
• The test pattern set without prohibited patterns should be generated from a large cycle (Target Cycle) of a group CA selected as the TPG.
• Prohibited patterns in the target cycle are to be clustered so that a large number of patterns (say, 30,000) can be generated without encountering the prohibited patterns.
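The cycle structure described for Figure 6.10 can be reproduced with a short simulation. The sketch below assumes a null boundary (the cells outside the array are treated as 0) and reads a state as an integer whose most significant bit is the leftmost cell; these conventions, like the function names, are assumptions made for the illustration.

```python
RULES = [5, 105, 153, 154, 5]      # rule vector of Figure 6.10(b)
PPS = {29, 31, 20, 7, 28}          # prohibited patterns of Figure 6.10(a)


def ca_step(state, rules):
    """Next state of a 3-neighbourhood CA (assumed null boundary).
    `state` is an integer whose most significant bit is cell 1."""
    n = len(rules)
    cells = [(state >> (n - 1 - i)) & 1 for i in range(n)]
    nxt = 0
    for i, rule in enumerate(rules):
        left = cells[i - 1] if i > 0 else 0
        right = cells[i + 1] if i < n - 1 else 0
        rmt = (left << 2) | (cells[i] << 1) | right   # neighbourhood as 0..7
        nxt = (nxt << 1) | ((rule >> rmt) & 1)        # rule bit for that neighbourhood
    return nxt


def cycles(rules):
    """Cycle decomposition; valid only for a group CA (a bijective map)."""
    seen, found = set(), []
    for start in range(2 ** len(rules)):
        if start in seen:
            continue
        cycle, s = [], start
        while s not in seen:
            seen.add(s)
            cycle.append(s)
            s = ca_step(s, rules)
        found.append(cycle)
    return found


for cycle in cycles(RULES):
    print(len(cycle), sorted(PPS & set(cycle)))
```

Under these assumptions the simulation yields one cycle of length 8 holding the prohibited patterns 20, 29 and 31, and one cycle of length 24 holding 7 and 28, in agreement with the description above.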
A heuristic scheme to select a PRPG that fulfills the objective of avoiding the PPS is described in the following subsection.
6.4.2
Avoiding Prohibited Patterns
The target cycle of the PRPG is exploited to generate the test pattern set without any prohibited pattern. That is, for the desired TPG we should search for a PRPG in which I. the prohibited patterns are clustered, and II. there is a long sequence of patterns that does not include any prohibited pattern. While searching the CA based PRPG for such a TPG, we assume, without loss of generality, that 30,000 test patterns are sufficient to test a CUT with an arbitrary number of PIs. The limit of 30,000 can easily be extended in practical designs.
I. Clustering of prohibited patterns
It is to be noted that the desired PRPG avoiding a given PPS is a nonlinear group CA. The synthesis of the PRPG is reported in Algorithm 4.1. To find whether the prohibited patterns are clustered in a given PRPG, the following heuristic scheme is implemented for the current design.
Heuristic 1: While selecting a rule $R_i$ of the rule vector $R = \langle R_1, R_2, \cdots, R_i, \cdots, R_n \rangle$ of a group CA for the PRPG, the CA is seeded with an arbitrary pattern $P_j \in PPS$. The next state of the $i^{th}$ CA cell is then compared with the $i^{th}$ bit of a pattern $P_k \in PPS - \{P_j\}$. If such a $P_k$ exists, we repeat the process with $P_k$ as the seed; that is, the next state of the $i^{th}$ cell is compared with the $i^{th}$ bit of a pattern $P_l \in PPS - \{P_j, P_k\}$. This process continues as long as a match is found. If there are $k'$ such patterns $\{P_1, P_2, \cdots, P_{k'}\}$ and $k'$ is close to |PPS|, then it can be assumed that the $i^{th}$ rule of R is fit to cluster the given PPS. If the condition holds for $n'$ rules, where $n/2 < n' \le n$, while designing the n-cell PRPG, then there is a high probability that the patterns belonging to the PPS are clustered in the cycles of the CA and the CA can be the desired TPG. Based on extensive experimentation, we set the value of $k'$ to 3/4 of |PPS|.
II. A sequence with a large number of patterns
Once we get a CA (PRPG) satisfying condition I, the CA is loaded with an arbitrary pattern $P \notin PPS$. Since a large cycle covers most of the state space, it is highly probable that P is in the large cycle of the CA. Now, the CA is run for at most 30,000 time steps and $D_j$ denotes the number of patterns generated by the CA before encountering a prohibited pattern of the PPS. If $D_j$ = 30,000, the CA is selected for the desired TPG. The following algorithm summarizes the design steps for the proposed TPG for the UBIST, built around nonlinear CA.
Algorithm 6.6 TPG WithoutPPS NonlinearCA
Input: PPS, n (CA size).
Output: A group CA ($R = \langle R_1, R_2, \cdots, R_n \rangle$) to be selected as TPG, seed.
Step 1: Synthesize an n-bit PRPG (Algorithm 4.1).
Step 2: Check whether condition I is satisfied (Heuristic 1). If not, go to Step 5.
Step 3: Find the seed and the maximum number of test patterns $D_j$ that the PRPG can generate without encountering any prohibited pattern (condition II).
Step 4: If $D_j$ = 30,000, report the CA as the TPG with the seed and exit.
Step 5: Repeat Steps 1 to 4 at most L (say, 100) times.
Step 6: Report the CA having max($D_j$), along with its seed.
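Step 3 of Algorithm 6.6 can be sketched as below for the 5-cell CA of Figure 6.10. The sketch assumes a null boundary and counts the seed itself as the first generated pattern. Trying every admissible seed exhaustively is feasible only for such a toy size and is an assumption of this illustration, since the text loads an arbitrary seed; all function names are hypothetical.

```python
RULES = [5, 105, 153, 154, 5]      # rule vector of Figure 6.10(b)
PPS = {29, 31, 20, 7, 28}          # prohibited patterns of Figure 6.10(a)


def ca_step(state, rules):
    """Next state of a 3-neighbourhood CA (assumed null boundary)."""
    n = len(rules)
    cells = [(state >> (n - 1 - i)) & 1 for i in range(n)]
    nxt = 0
    for i, rule in enumerate(rules):
        left = cells[i - 1] if i > 0 else 0
        right = cells[i + 1] if i < n - 1 else 0
        rmt = (left << 2) | (cells[i] << 1) | right
        nxt = (nxt << 1) | ((rule >> rmt) & 1)
    return nxt


def free_run(seed, rules, pps, limit=30000):
    """D_j: patterns generated from `seed` (seed included) before the first
    prohibited pattern appears, capped at `limit`."""
    count, s = 0, seed
    while count < limit and s not in pps:
        count += 1
        s = ca_step(s, rules)
    return count


def best_seed(rules, pps, limit=30000):
    """Exhaustive seed search; practical only for very small n."""
    candidates = (s for s in range(2 ** len(rules)) if s not in pps)
    return max(((s, free_run(s, rules, pps, limit)) for s in candidates),
               key=lambda pair: pair[1])


print(best_seed(RULES, PPS))
```

For this CA the best seed is 3 with a prohibited-free run of 21 patterns, matching the overview in Section 6.4.1. Dividing $D_j$ by the assumed limit of 30,000 gives the performance measure used to compare candidate CA.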
The randomness quality of the nonlinear CA based PRPG has been extensively studied in Chapter 4. The quality of the proposed TPG without PPS resulting from Algorithm 6.6 is studied in the following subsection.
6.4.3
Quality of the TPG without PPS
The effectiveness of the proposed TPG design scheme is evaluated in this subsection. Table 6.5 summarizes the results of designing the TPG for the UBIST that generates good quality pseudo-random patterns while avoiding generation of the given PPS. The value of n (number of cells in the TPG) and |PPS|, the cardinality of the PPS chosen for an n, are noted in columns 1 & 2. We have assumed that the desired maximum number of test patterns is 30,000. If a design results in a TPG that generates a maximum of p test patterns without encountering a prohibited pattern, then the performance of the design is defined as

$performance = \frac{p}{30000} \times 100\,\%$    (6.6)
We report the performance in column 3 of Table 6.5. Column 4 reports the number of iterations required to synthesize the desired TPG (Algorithm 6.6). The results noted in Table 6.5 with large values of |PPS| are for Type II prohibited patterns (Section 6.2.3). Such a PPS is formed by choosing a PF (Prohibited Function), represented as randomly generated sum-of-product terms of the PIs of a benchmark circuit. Finally, the PF is converted to a prohibited pattern set. For example, the PPS noted in Table 6.5 for n = 32 (row 22), having |PPS| = 400, is formed from the sum-of-products of selected PIs of the c6288 benchmark circuit. The figures of Table 6.5 imply that the performance of the proposed BISTPG design increases with the size of the CA (n), whereas the number of iterations required to find a desired TPG decreases with n. These scenarios are depicted in Figure 6.11 and Figure 6.12 respectively, with two sets of PPS having cardinalities 15 and 50. Figure 6.11 shows the number of test patterns generated without PPS, for different n, by executing Algorithm 6.6 for a maximum of L = 100 iterations. On the other hand, Figure 6.12 reports the number of iterations required to achieve the maximum number of test patterns, in percent of 30,000 (the assumed limit), for different values of n. The following conclusions are drawn from the results shown in Figures 6.11 and 6.12. (1) As the size of the CA (n) increases, the quality of the TPG improves, and for large n the proposed scheme always ensures the desired TPG.
Table 6.5: Results of BISTPG design without PPS (Nonlinear CA based design)
(1) #Cell (n)   (2) |PPS|   (3) performance (%)   (4) # Iterations
14              10          67.78                 100
15              10          55.02                 100
16              20          64.16                 100
17              20          57.78                 100
18              20          70.41                 100
19              25          77.70                 100
21              25          78.41                 100
22              25          85.62                 100
25              30          90.14                 100
27              30          91.63                 100
29              30          99.41                 100
32              35          100                   68
35              35          100                   61
37              35          100                   40
39              40          100                   20
48              50          100                   17
64              50          100                   10
84              100         100                   13
96              150         100                   10
128             150         100                   5
256             250         100                   2
32              400         100                   80
35              1024        100                   65
48              6144        100                   30
128             11264       100                   10
256             22528       100                   3
(2) As the size of the CA (n) increases, the number of iterations required by Algorithm 6.6 to converge to the desired solution decreases and approaches a single iteration.
6.5
Nonlinear CA based DTSG
The nonlinear CA based PRPG design and its application in designing a TPG have been reported in Chapter 4 and Chapter 5. The previous section describes the design of the TPG/PRPG without PPS for the proposed UBIST structure. In this section, we give an overview of the design of the Deterministic Test Set Generator (DTSG) for the UBIST. The preliminary concepts of this scheme have been reported in [85].
Figure 6.11: No. of test patterns without PPS vs. n (size of CA). (Plot of the number of test patterns, in % of 30,000, against the size of CA (n), with curves for |PPS| = 15 and |PPS| = 50.)
Figure 6.12: No. of iterations required vs. n to get PPS free TPG. (Plot of iterations against the size of CA (n), with curves for |PPS| = 15 and |PPS| = 50.)
P = { 0001101 (13), 0001010 (10), 0110111 (55), 0001100 (12), 1100000 (96), 0111010 (58), 0111001 (57), 0010010 (18), 0001001 (9), 0100010 (34) }
Figure 6.13: A deterministic test pattern set
Let us consider the given DTS 'P' for a 7-PI (Primary Input) CUT containing 10 test patterns, as shown in Figure 6.13. The decimal equivalents of the 7-bit test patterns are noted in brackets. Further, consider Figure 6.14, showing the partial state transition behavior of a 7-cell irreversible (non-group) CA with rule vector < 210, 238, 245, 120, 97, 109, 109 >. The circles of the diagram represent the states of the CA, and a 7-bit state is denoted by its decimal equivalent. The state transition diagram of Figure 6.14(a) contains 5 patterns that belong to P (shown as shaded circles). Similarly, the remaining 5 test patterns of P fall in the state transition diagram of Figure 6.14(b). If we load the CA of Figure 6.14 with 0001010 (10) and allow it to run, it can generate 4 test patterns (10, 18, 58, and 96) in 6 time steps (Figure 6.14(a)). Further, to generate 12, 13, 9, 55, and 57 ∈ P, we need to load the CA with 12. Generation of these 5 test patterns requires 12 time steps. That is, for complete generation of the 10 test patterns ∈ P with the 7-cell CA, we need to load the CA with 3 seeds: 10, 12 and 34. The number of time steps (run length) required to apply the 10 test patterns of P while testing the circuit under test (CUT) is 6 + 12 + 0 = 18 (34 is applied directly to the CUT, so the number of time steps to generate 34 is taken to be 0). That is, the CA of Figure 6.14 with these 3 seeds (10, 12 and 34) can be employed as the on-chip DTSG for a CUT having the DTS of Figure 6.13. The above illustration justifies the specifications for the design of the DTSG as specified in Section 6.2.4. We next formalize the design specifications for the current design with nonlinear CA.
Design Specification: When we load the CA of Figure 6.14 with seed 10 and run it for 6 time steps, the pattern sequence generated by the CA is 6, 18, 58, 126, 50 and 96.
Out of these six patterns, 6, 126 and 50 are not members of P; that is, they are not among the defined deterministic test patterns for the circuit. Such extra patterns are referred to as dummy patterns. The dummy (redundant) patterns add overhead in terms of test application time, whereas the seeds (10, 12 and 34) add storage overhead in designing the on-chip DTSG for the UBIST. Therefore, the quality of a DTSG is to be evaluated in terms of (i) the storage area needed to store the seeds and (ii) the number of time steps (run length) required to generate the complete deterministic test pattern set. For the example DTSG of Figure 6.14, the run length = 18 and the number of seeds = 3. The above discussion signifies that the Design Specification for a nonlinear CA based DTSG is: synthesize a CA (rule vector) that can generate the patterns of a given DTS by traversing the minimum run length (R) and with very few seeds (S). We next present the available design options to meet such specifications for the DTSG.
Figure 6.14: DTSG with single CA. (Panels (a) and (b) show partial state transition diagrams of the 7-cell CA < 210, 238, 245, 120, 97, 109, 109 > that contain the patterns of P.)
6.5.1
Design Options for DTSG
There can be a number of design options for a DTSG to cover the given DTS. For example, in designing the DTSG for P (Figure 6.13) with the CA of Figure 6.14, we can reduce the run length by considering an extra seed, 55. This avoids generation of the 6 dummy patterns, 1 to 125, of Figure 6.14(b). In an alternative design option, we can try to reduce the number of seeds to avoid memory overhead. Such a design drops 34 as a seed, since 34 is not utilized for further generation of test patterns belonging to the DTS. The DTSG thus designed (with two seeds, 10 and 12) covers 90% (9 out of 10) of the DTS and can be accepted if the dummy patterns detect the faults that are otherwise detected exclusively by 34. The above two design options optimize either the run length or the number of seeds in a DTSG. However, in the practical design we set a threshold for both the run length and the number of seeds. The design principles introduced so far consider the single CA based design. To achieve the optimal value of R (run length) and the number of seeds, we can also have a design scheme with multiple CA. The principle of this design is illustrated next with the same example DTS of Figure 6.13.
Figure 6.15: DTSG with multiple CA. (a) CA1 = < 241, 139, 173, 145, 50, 171, 172 >; (b) CA2 = < 100, 161, 58, 214, 40, 114, 242 >.
Let us consider the two CA with rule vectors CA1 = < 241, 139, 173, 145, 50, 171, 172 > and CA2 = < 100, 161, 58, 214, 40, 114, 242 >. The relevant part of the state transition diagram of each CA is shown in Figure 6.15. If CA1 (Figure 6.15(a)) is loaded with 57 and run for 5 time steps, it can generate the test patterns 55, 34, 12, 96 and 10 of P (Figure 6.13). Similarly, we can get the rest of the test patterns of P, that is, 18, 9, 58 and 13, in 3 time steps by considering CA2 with 18 as the seed. Therefore, the desired DTSG for the given DTS of Figure 6.13 can be designed with the two nonlinear CA, CA1 and CA2. The design results in two seeds, 57 and 18, and the number of dummy patterns in this case is 0. That is, the computation of the design cost should consider (i) the storage requirement for the two CA, CA1 and CA2, (ii) the memory to store the two seeds, and (iii) the required number of time steps, i.e., 8. Therefore, for the multiple CA based design of a DTSG, the target is to find the minimum number of CA rule vectors and their seeds that can cover the given DTS within the minimum run length. In general, the design with a single CA requires more hardware to store the seeds and results in a much higher run length. On the other hand, a multiple CA based design demands more hardware for storing the CA rule vectors but needs less hardware for the seeds, with a smaller run length. In the current work, we have considered single CA
108 based DT SG. The detailed design methodology is next described.
6.5.2
The design of nonlinear CA based DTSG
The design of a DTSG for the given DTS of a CUT boils down to the selection of an appropriate nonlinear CA rule vector and the corresponding seeds. However, selecting the desired CA rule vector from the large search space of nonlinear CA is a computationally hard problem [140]. So, we fall back on the guided search method followed in Simulated Annealing (SA) [149]. The evolution scheme of Simulated Annealing encodes a solution in an array of cells referred to as a chromosome. In the current design, a chromosome is the rule vector (i.e., set of rules) of a CA. The length of the chromosome is n × 8 bits for an n-cell CA based design, where a CA rule is represented by its RMTs (Section 3.2). For example, the equivalent code (chromosome) for the 3-cell CA <210, 238, 245> is <11010010 11101110 11110101>. The performance of a solution in the process of SA evolution is measured by a fitness function. The fitness function, the generation of the initial solution for SA, and the mutation of the present solution, for the current design, are introduced next.

Fitness Function: The basic design objective of the DTSG is to select a CA structure (rule vector) that can generate a given deterministic test set (DTS). The selected CA solution should cover the maximum number of patterns ∈ DTS. Moreover, the run length (R) and the number of seeds needed to cover the patterns of the DTS are to be minimized. For the current design, to define the fitness function (F), the maximum allowable number of seeds and the run length are fixed, and F of a solution is defined as the percentage of patterns accommodated by the solution within these limits. That is,

F = \frac{\text{No. of accommodated patterns} \in DTS}{\text{Cardinality of } DTS} \times 100\% \qquad (6.7)
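The chromosome encoding and Equation 6.7 can be sketched in a few lines of Python. This is only an illustration: the function names and the small pattern sets in the usage lines are hypothetical, and the set of covered patterns is assumed to have been obtained elsewhere (by running the CA from its seeds within the allowed run length).

```python
def encode_chromosome(rule_vector):
    """Concatenate the 8-bit RMT encoding of each rule (RMT 7 is the leftmost bit)."""
    return " ".join(format(rule, "08b") for rule in rule_vector)


def fitness(covered_patterns, dts):
    """Equation 6.7: percentage of DTS patterns accommodated by a solution."""
    dts = set(dts)
    return 100.0 * len(dts & set(covered_patterns)) / len(dts)


# The 3-cell example from the text.
chromosome = encode_chromosome([210, 238, 245])

# Hypothetical data: 3 of 4 target patterns are covered.
example_fitness = fitness({10, 12, 55}, {10, 12, 34, 55})
```

Patterns generated by the CA that are not in the DTS (dummy patterns) do not count towards F, which is why the sketch intersects the covered set with the DTS.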
Initial Solution: The initial solution for SA plays an important role in achieving the optimal design of the DTSG. A better choice of initial solution speeds up the convergence of the annealing process. For the current application, we have employed a heuristic to find the initial solution. Extensive experimentation supports the effectiveness of this heuristic. In the proposed heuristic, a disconnected directed graph is constructed with the patterns of the DTS as nodes. While constructing the graph, a new node is attached in such a way that, after the addition of the node, the graph still represents the state transition diagram of a CA. The basic motivation of this technique is to find a CA rule that can accommodate as many patterns from the DTS as possible in its state transition diagram. The following discussion illustrates the technique. Let us consider that the patterns 0111010 and 0100010 are taken arbitrarily from P of
Figure 6.13. Further, assume that these two patterns are consecutive nodes (Node 1 and Node 2) of a directed graph, as shown in Figure 6.16(a). If Figure 6.16(a), Case I, represents the state transition diagram of a 7-cell CA, then Node 2 is the next state of Node 1 of the CA. The rules of the 7-cell CA can be partially determined from the state transition Node 1 to Node 2.

[Figure 6.16: Finding Initial Solution for SA. (a) Case I: Node 1 (0111010), Node 2 (0100010), Node 3 (1011101); Case II: Node 1 (0111010), Node 2 (0100010), Node 3 (0110111). (b) RMTs 111 to 000 of Cell 3 and Cell 6 for Case I (conflict at Cell 6) and Case II]

Figure 6.16(b) describes the formation of CA rules through the selection of RMTs from the nodes. Let us consider cell 3 and cell 6 of the 7-cell CA (the rules are numbered from left to right). The state of cell 3 is 1 at Node 1 and 0 at Node 2. This implies that cell 3 changes its state from 1 to 0 when the combination of the present states of its neighborhood is 111. That is, RMT 111 of the rule of cell 3 is 0 (Figure 6.16(b), Case I). Similarly, for the rule of cell 6, RMT 010 is 1. It is evident from the above discussion that a state transition, Node 1 to Node 2, fixes a single RMT of each cell rule. To find the rule of a CA cell, we need to fix all the 8 RMTs of the cell. Adding new nodes to the partial state transition diagram fulfills this requirement. However, the addition of a node demands consideration of the following two cases.

Case I: Let the pattern 1011101 be inserted as Node 3 in the graph (Figure 6.16(a), Case I). The transition Node 2 to Node 3 signifies that RMT 100 should be 1 for Cell 3. On the other hand, for the combination 010 of the present states of its neighborhood, the next state of Cell 6 is 0. Therefore, RMT 010 of the rule of Cell 6 results in a conflict, as it was already selected as 1 while considering the transition Node 1 to Node 2.

Case II: Let another pattern, 0110111, be chosen as Node 3 (Figure 6.16(b), Case II). Then RMT 100 of cell 3 is 1 and RMT 010 of cell 6 is 1. That is, there is no conflict in selecting the rules for the CA cells 3 and 6. It implies that 0110111 can be
the next state of Node 2 in the partial state transition diagram of the desired CA. The outcome of Case II is a directed graph with the maximum number of patterns from the DTS as nodes. When the process noted in Case II terminates, it gives only partial information about the CA rules (some RMTs of each rule remain unselected). To get a CA rule, the unspecified RMTs are fixed randomly. For example, after the steps shown in Figure 6.16(a), Case II, the rule for Cell 3 can be 00110100, that is, 52. The set of such rules, forming the rule vector R, is the initial solution for the SA. Node 1 of Figure 6.16(a) is one of the seeds in the initial solution. The other seeds are the patterns ∈ DTS that are not in the directed graph. These patterns, in effect, form the disconnected graph (not shown in the figure) of deterministic test patterns.

Mutation: The mutation for SA directs the present solution to converge to an optimal solution. In the proposed design, we employ single point (single RMT) mutation with low probability. The mutation scheme makes only a small change to the present solution. There is a high probability that the patterns of the DTS covered by the present solution remain covered in the mutated solution. Moreover, there is a chance that patterns ∈ DTS not covered by the present solution are covered by the mutated solution. A bit of the chromosome (present solution) representing an RMT that was left unspecified while forming the initial solution is selected randomly for the mutation. The mutation flips the bit from 0 to 1 or from 1 to 0.
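The RMT bookkeeping behind the initial-solution heuristic can be illustrated with a short Python sketch. It is not the thesis implementation: the function names are hypothetical, and a null boundary (missing neighbours read as 0) is assumed for the end cells. The discussion above tracks only cells 3 and 6, so the usage lines inspect just those two cells; a complete check would have to examine every cell returned by `conflicting_cells`.

```python
def rmt_constraints(current, nxt):
    """Map (cell, RMT) -> next-state bit implied by the transition current -> nxt.

    Patterns are bit strings; cells are numbered from 1, left to right.
    Assumption: null boundary, so the end cells see a 0 beyond the edge.
    """
    padded = "0" + current + "0"
    return {(i + 1, int(padded[i:i + 3], 2)): int(bit)
            for i, bit in enumerate(nxt)}


def conflicting_cells(table, current, nxt):
    """Cells whose RMT value implied by current -> nxt contradicts the table."""
    return sorted(cell
                  for (cell, rmt), value in rmt_constraints(current, nxt).items()
                  if table.get((cell, rmt), value) != value)


# Node 1 -> Node 2 fixes one RMT per cell.
table = rmt_constraints("0111010", "0100010")
cell3_rmt_111 = table[(3, 0b111)]   # RMT 111 of cell 3
cell6_rmt_010 = table[(6, 0b010)]   # RMT 010 of cell 6

# Candidate Node 3 patterns of Case I and Case II.
case_1 = conflicting_cells(table, "0100010", "1011101")
case_2 = conflicting_cells(table, "0100010", "0110111")
```

A dictionary keyed by (cell, RMT) keeps the partially specified rules explicit: any key that is absent when the graph construction stops corresponds to an RMT that is later fixed randomly.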
6.5.3
DTSG design algorithm
Based on the basic framework of SA described above, a complete design algorithm is presented in this subsection. To set the initial temperature for SA so as to reach the desired solution, we rely on extensive experimentation. For the current design, it is set to 100. The formal algorithm is provided next.

Algorithm 6.7 DTSG DESIGN NonlinearCA
Input: A pattern set (DTS), maximum number of seeds and run length allowed.
Output: The desired CA for the DTSG, and the seeds.
Step 1: Get the Initial Solution (IS) and calculate its fitness F1 (Equation 6.7).
Step 2: Set Temp = 100 and take IS as the best and current solution.
Step 3: While Temp ≥ 100 execute Step 4 to Step 6.
Step 4: Mutate the current solution and calculate its fitness F2.
Step 5: Let δ = (F2 − F1) be the improvement of fitness.
        If (δ > 0) { accept the solution. If the solution is better than the best solution, consider it as the best solution. }
        Otherwise accept the result with probability e^(δ/Temp).
Step 6: Reduce Temp exponentially.
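The control flow of Algorithm 6.7 can be sketched as a generic annealing loop in Python. Several details here are assumptions rather than the thesis's choices: the stopping threshold `min_temp` and the cooling factor are hypothetical (a schedule that starts at 100 and cools exponentially needs a lower bound to iterate), δ is measured against the fitness of the current solution, and the toy fitness and mutation functions merely stand in for Equation 6.7 and the single-RMT mutation.

```python
import math
import random


def anneal(initial, fitness, mutate, temp=100.0, min_temp=0.01, cooling=0.95, rng=None):
    """Simulated annealing skeleton following Steps 1 to 6 of Algorithm 6.7."""
    rng = rng or random.Random(0)
    current, f_current = initial, fitness(initial)      # Step 1
    best, f_best = current, f_current                   # Step 2
    while temp >= min_temp:                             # Step 3 (assumed threshold)
        candidate = mutate(current, rng)                # Step 4
        f_candidate = fitness(candidate)
        delta = f_candidate - f_current                 # Step 5
        if delta > 0 or rng.random() < math.exp(delta / temp):
            current, f_current = candidate, f_candidate
            if f_current > f_best:
                best, f_best = current, f_current
        temp *= cooling                                 # Step 6: exponential cooling
    return best, f_best


def flip_one_bit(bits, rng):
    """Toy single-point mutation: flip one randomly chosen bit."""
    i = rng.randrange(len(bits))
    return bits[:i] + [1 - bits[i]] + bits[i + 1:]


def percent_ones(bits):
    """Toy fitness in percent, standing in for Equation 6.7."""
    return 100.0 * sum(bits) / len(bits)


start = [0, 1, 0, 0, 1, 0, 0, 0]
best, best_fitness = anneal(start, percent_ones, flip_one_bit)
```

Because the best solution is tracked separately from the current one, accepting a worse candidate at high temperature never loses the best solution found so far.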
6.5.4
Performance of nonlinear CA based DTSG
To study the potential of the nonlinear CA based DTSG design scheme, we have performed experiments for two different cases: (i) there is no restriction on the number of seeds, so the final fitness F should reach 100%, that is, the complete DTS is covered by the DTSG; and (ii) the maximum number of seeds is fixed at 20% of |DTS|. The comparison, in respect of memory overhead, of the nonlinear CA based design with its conventional linear/additive counterpart to achieve 100% fitness is shown in Figure 6.17 and Figure 6.18. The fault efficiency of the DTSG and the memory saving of this design with respect to direct storage of the DTS patterns are given in Table 6.6. In all the experiments, the run length (R) is assumed to be 4 × |DTS| for a seed.

[Figure 6.17: |DTS| vs. Number of seeds to cover random DTS. x-axis: |DTS| (10 to 80); y-axis: No. of seeds (10 to 80); curves: Linear CA and Non-Linear CA; Fitness = 100%, n = 32]

To compare the nonlinear CA based design with its linear counterpart, the experiments are performed on both (i) randomly synthesized DTS, and (ii) the MINTEST test set of a benchmark circuit. Figure 6.17 depicts the number of seeds required to cover the DTS completely in both the linear and the nonlinear CA based designs. We generate 10 randomly synthesized DTS for each cardinality, and the average number of seeds required to cover a DTS is noted. For the graph shown in Figure 6.17, we assume the CA size n = 32. It is clear from the figure that the linear CA based DTSG requires more seeds than the nonlinear CA based DTSG. The experimental results on MINTEST patterns are shown in Figure 6.18. It
can be concluded, from the graphs shown in the figure, that the memory requirement (due to the seeds [1/(seed/|DTS|)] needed to accommodate all the patterns ∈ DTS) in a nonlinear CA based design is much less than in the corresponding linear CA based design. Further, the memory requirement in the nonlinear CA based design is almost independent of n (the CA size).

[Figure 6.18: CA size vs. seed to cover MINTEST. x-axis: n, CA size (10 to 80); y-axis: Seed/|DTS| (0.1 to 1.0); curves: Linear CA and Nonlinear CA; Fitness = 100%]

To find the fault efficiency of the DTSG, the experiments are performed on a number of ISCAS benchmark circuits. The DTS for a circuit is the set of test patterns of MINTEST. Here we assume the maximum limit on the number of seeds to be 20% of |DTS|. In such a design, the DTSG may not cover the complete DTS. Table 6.6 summarizes the performance results of the DTSG. The first column of the table contains the names of the circuits, whereas the second and third columns note the number of primary inputs (PIs) and the cardinality of the DTS of the circuit under test (CUT) respectively. The number of seeds in the optimal design is shown in the fourth column. The fifth column represents the percentage of memory saved by using the nonlinear CA based DTSG in comparison with direct storage of the DTS. To store an n-cell pattern, or for an n-cell CA, n FFs are required. Therefore, the memory saving can be calculated as

\frac{\text{Cardinality of } DTS - (\text{no. of seeds} + 1)}{\text{Cardinality of } DTS} \times 100\% \qquad (6.8)
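As a quick numerical check of Equation 6.8, the following Python sketch recomputes two entries of column 5 of Table 6.6 from columns 3 and 4. The function name is hypothetical. The "+ 1" term is read here as one extra n-bit word alongside the seeds, which is an interpretation rather than something the text states.

```python
def memory_saving(dts_size, num_seeds):
    """Equation 6.8: percentage of memory saved over direct storage of the DTS."""
    return 100.0 * (dts_size - (num_seeds + 1)) / dts_size


saving_s386f = round(memory_saving(63, 11), 2)   # s386f: |DTS| = 63, 11 seeds
saving_c6288 = round(memory_saving(12, 2), 2)    # c6288: |DTS| = 12, 2 seeds
```

Both values agree with the corresponding rows of Table 6.6 (80.95 and 75.00).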
The fault coverage of the DTSG is noted in the sixth column of Table 6.6. The figures in the sixth column represent the percentage of faults covered by the DTSG, taking the fault coverage with MINTEST as 100%. The fault coverage attained by a maximal length CA, with the same run length as the DTSG, is studied and shown in the last column of the table for comparison. It is observed that the DTSG attains almost the same fault coverage as the original DTS. Moreover, it saves, on average, more than 80% of the memory (column 5 of Table 6.6). The run length of a DTSG is chosen to be much greater than |DTS|. However, the number of memory fetches in the DTSG is far smaller than in the normal DTS generation scheme. This effectively reduces the total test application time.

Table 6.6: Performance of DTSG (Nonlinear CA based design)
(1) Circuit   (2) #PI   (3) |DTS|   (4) #seed   (5) Memory saving (%)   (6) FC (%) of DTSG   (7) FC (%) with Max CA
s386f         13        63          11          80.95                   100                  95.83
s349f         14        13          3           69.23                   100                  95.14
s298f         17        23          2           86.96                   100                  97.08
s820f         23        93          10          88.17                   100                  93.29
s832f         23        94          11          87.23                   99.60                91.49
s344f         24        44          3           90.91                   100                  98.53
s526f         24        49          7           83.67                   100                  94.23
s349f         24        13          3           69.23                   100                  98.00
s382f         24        25          3           84.00                   100                  100
s510f         25        54          4           90.74                   100                  99.29
s400f         25        24          4           79.17                   100                  97.42
s1196f        32        113         14          86.73                   100                  93.55
s713f         54        21          4           76.19                   97.60                88.43
s838f         67        75          4           93.33                   100                  57.14
c6288         32        12          2           75.00                   100                  99.37
c1908         33        106         6           93.40                   100                  97.33
c432          36        27          3           85.19                   100                  97.09
c499          41        52          3           92.31                   100                  98.69
c1355         41        84          4           94.05                   100                  98.79
c3540         50        84          7           90.48                   100                  95.18
c880          60        16          4           68.75                   99.15                96.39
c5315         178       37          5           83.78                   99.99                97.79
c7552         207       73          13          80.82                   97.71                93.24
c2670         233       44          8           79.55                   98.22                83.69

The results of designing the desired DTSG, targeting pseudo-random pattern resistant faults, are shown in Table 6.7. In this design, we extract the patterns of MINTEST that detect the faults left undetected by the nonlinear CA based PRPG. The DTSG is then constructed to cover all such deterministic test patterns. We assume no limit on the maximum number of seeds, so as to achieve 100% fitness in Algorithm 6.7. Sample results are reported in Table 6.7. The number of patterns needed to cover pseudo-random pattern resistant faults is shown in column 3,
whereas the number of seeds required to generate such patterns is reported in column 4.

Table 6.7: Performance of DTSG targeting hard-to-detect faults (Nonlinear CA based design)
(1) Circuit   (2) Complete |DTS|   (3) |DTS| to detect hard-to-detect faults   (4) #seeds
s386f         63                   11                                          3
s298f         23                   3                                           1
s832f         94                   13                                          5
s344f         44                   6                                           2
s349f         13                   4                                           1
s510f         54                   5                                           2
s400f         24                   7                                           2
s713f         21                   11                                          3
c1908         106                  21                                          14
c432          27                   6                                           2
c499          52                   6                                           2
c1355         84                   10                                          3
c3540         84                   13                                          4
6.6
UBIST Hardware
The earlier sections provide the design methodologies and performance of linear and nonlinear CA based UBIST. This section reports the hardware realization of UBIST based on linear and nonlinear CA. The hardware implementation of the UBIST structure requires a 3-neighborhood programmable CA (PCA) [51] that can realize any rule vector of the CA. Such a structure can be programmed to generate each of the following four sets of patterns, as required for the test logic of the CUT.
• Pseudo-Random Patterns,
• Pseudo-Exhaustive Patterns,
• Pseudo-Random Patterns without PPS, and
• Deterministic Test Patterns.
For a linear CA based UBIST, a PCA cell should be able to realize any linear rule. It requires a 3-bit shift register, three switches and an XOR gate (Figure 6.19). The shift register holds the dependency information of a cell on its neighbors. Its output controls the switches.
[Figure 6.19: An n-cell linear CA based UBIST. Labels: cells 1, i−1, i, i+1 and n with their inputs and outputs; FF; rule held in a 3-bit shift register]

[Figure 6.20: An n-cell nonlinear CA based UBIST. Labels: cells 1, i−1, i, i+1 and n with their inputs and outputs; FF; 8 × 1 MUX; rule (RMT 7 to RMT 0) held in an 8-bit shift register]

On the other hand, the PCA structure that implements the nonlinear CA based UBIST is shown in Figure 6.20. Each cell of the PCA demands an FF, an 8-to-1 MUX, and an 8-bit register that stores the rule for that cell. Different features of the PCA hardware are reported next:
• Verilog code has been written for the design and simulated using the Cadence Verilog Simulator on a Sun Ultra-60 machine.
• The design has been synthesized and analyzed using Synopsys Design Compiler and Signal Scan.
• The design has been implemented in 0.25 µm CMOS technology.
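A behavioural model of the nonlinear PCA cell described above can be written in a few lines of Python. It is only a software sketch of the idea, not the Verilog design mentioned in the list: the function names are hypothetical and a null boundary is assumed. The three neighbourhood bits act as the select lines of the 8-to-1 MUX, which picks one bit of the 8-bit rule register.

```python
def pca_cell_next(rule, left, own, right):
    """8-to-1 MUX: the neighbourhood (left, own, right) selects one RMT bit of the rule."""
    select = (left << 2) | (own << 1) | right   # RMT index 0..7
    return (rule >> select) & 1


def pca_step(rule_vector, state):
    """One clock of an n-cell PCA; state is a list of bits, cell 1 first.

    Assumption: null boundary (the end cells see 0 beyond the edge).
    """
    padded = [0] + list(state) + [0]
    return [pca_cell_next(rule, padded[i], padded[i + 1], padded[i + 2])
            for i, rule in enumerate(rule_vector)]


# Rule 52 (00110100) from Section 6.5.2: RMT 111 is 0 and RMT 100 is 1.
rmt_111 = pca_cell_next(52, 1, 1, 1)
rmt_100 = pca_cell_next(52, 1, 0, 0)

# One step of the 3-cell CA <210, 238, 245> from state 101.
next_state = pca_step([210, 238, 245], [1, 0, 1])
```

Reprogramming a cell amounts to loading a different 8-bit value into its rule register; the MUX structure itself does not change.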
6.7
Conclusion
CA (linear and nonlinear) are effectively employed to design a new architecture for BIST, referred to as the Universal BIST (UBIST). It can efficiently generate any one of four classes of test patterns: pseudo-random, pseudo-exhaustive, pseudo-random without PPS, and deterministic. Taking advantage of the regular, modular and granular structure of cellular automata, a low cost PCA hardware has also been proposed for the implementation of UBIST. The experimental results reported in this chapter establish that the performance of the nonlinear CA based UBIST is better than that of the linear CA based UBIST. In this thesis work, we target the characterization of nonlinear CA for VLSI applications. The earlier chapters explore the design of test logic for VLSI circuits around nonlinear CA. In the following chapter, we apply the nonlinear CA theory to design a hardwired data service scheme for mobile networks.
Chapter 7
Nonlinear CA Based Design of Data Service Scheme

The earlier chapters have dealt with the theory of nonlinear cellular automata (CA) and its applications in the field of VLSI circuit testing. This chapter proposes the design of an efficient query processor for location dependent data services in a cellular mobile network. The CA hardware acts as a search engine and performs the computation needed to provide the data service at high speed.

The rapid growth of wireless communication and mobile computing has opened up a new class of applications: location-dependent information services (LDIS) [162, 315, 316, 320]. Through LDIS, a mobile client can access location sensitive data such as traffic reports, travel information, and other emergency services. The location dependent information service includes the processing of queries ranging from proximity queries (e.g., Where is the nearest hospital?) to broadcast queries (e.g., Weather forecast for the day). A number of solutions for processing such queries in a cellular mobile network have been reported [188, 234, 315, 317, 320]. The basic assumption of all such solutions is that the data are stored in the backing store of the server at the Base Station (BS). Since in a mobile network an arbitrary number of queries can be raised, and a particular MU (Mobile Unit) may submit several queries, caching [315] at the MU has been proposed to improve the system performance. The MU always checks whether the current query can be answered totally or partially from its cache. This optimizes the power consumption as well as the use of the channel bandwidth of the network. However, cache inconsistency limits the effectiveness of such schemes. A number of solutions have also been reported to handle the cache inconsistency problem in the mobile network [234, 316]. Caching in query processing cannot avoid the dependency on the server at the BS, and this affects the response time for a query/data service in most cases.
The broadcast technique, proposed in [317], is an alternative to individual query processing and can loosen the dependency on the BS server. Broadcast disseminates information to all users to allow simultaneous access to the data. It results in an efficient usage of the scarce bandwidth of the network. However, broadcast compels an MU to listen to the broadcast channel all the time, which is power inefficient for the MU. In [132], the authors proposed an air indexing technique that targets reduction of power consumption by predicting the arrival time of the desired data at the MU. The MU can stay in power saving mode until the requested data arrives. The limitation of the data service schemes that are alternatives to individual query processing is that the user may not be satisfied with the service, since he/she may have to wait for a long time to get the answer to a query. On the other hand, the major bottleneck in individual query processing is the processing time of the BS server, that is, the processing time for the requested data, the searching time for the database, and the time required for a number of disk accesses.

In this scenario, we propose an efficient data service scheme to process individual queries (for user satisfaction) in a mobile network. Cellular Automata technology is employed to design the proposed hardware based scheme, which ensures a reduced processing time for the requested data. The scheme assumes partitioning of a network cell into a number of micro-cells. To submit a query for an object, an MU first determines the micro-cell number of its current position. The micro-cell number (mcell_id) and the desired object code are then sent to the nearest BS, which is equipped with a query processor developed around the CA hardware. The proposed CA based query processor avoids searching the backing store and all sorts of soft computation in responding to the query. It services queries at twice the rate of state-of-the-art soft computation techniques.
The encoding technique employed for the micro-cells provides accurate design of the CA hardware and can handle frequent updating (addition/removal) of data objects in the network. The basics of the data service scheme have been published in [81]. The detailed design is reported in the subsequent sections, following the next section, which introduces the preliminary concepts required for the design of the proposed data service scheme.
7.1
Data services in cellular mobile network
In a mobile environment, there may exist two classes of queries: (i) queries that are to be answered depending on the current location of a mobile unit (MU), and (ii) queries that are independent of the MU's present location. In the current work, we concentrate on the design of a query/data service scheme for Class (i) queries, that is, queries that require location dependent information service (LDIS). Jianliang Xu and Dik Lun Lee identify five types of queries [314] that can be answered depending on the current location of the MU. The relevant three of these five are:
• Local queries: These refer to the queries for which the query results are valid within the current network cell where the MU is located [314].
• Geographically clustered queries: Clustered queries refer to queries with spatial constraints, for example, "list all the hospitals within the 500 km radius" [314].
• Nearest queries: Examples of such queries are "find the nearest gas station", "find the nearest hospital", etc.
The CA based design of the data service scheme proposed in this chapter handles queries of the "nearest queries" type. However, the CA technology can easily be extended to all location dependent queries. The overview of the scheme follows in the next section.

[Figure 7.1: Cellular network architecture. Labels: cells C1 to C12, Base Station (BS), Mobile Switching Center (MSC), wire-line network, wired links, wireless links, overlap region]
7.2
Overview of the query processing scheme
The cellular mobile network (Figure 7.1) is a collection of geometric areas, called network cells. Each cell, in general, is serviced by a base station (BS) placed at its center. A number of BSs are linked to a mobile switching center (MSC). The MSC acts as a gateway [315] from the network to an existing wired network such as PSTN, ISDN, etc. Wireless communication takes place between a BS and a mobile unit (MU), whereas the communication between a BS and an MSC (or between MSCs) is wire-line. In a location-dependent information/data service (LDIS), a data item (answer) can have different values for different locations, and the answer to a query depends on the location from which the query originates. This section introduces the proposed data service scheme for such a system, employing the hardware structure of cellular automata (CA). In the proposed design, a network cell, serviced by a base station (BS), is partitioned into a number of micro cells. When an MU submits a query for an object, it sends the number of the micro cell it belongs to (mcell_id) to the nearest BS, along with the requested object code. In response to the query, the query processor at the BS selects a CA based search engine corresponding to the object code and then reports the answer to the query depending on the mcell_id. The partitioning of a cell into a number
of micro cells and the data service scheme are addressed separately in the following subsections.

[Figure 7.2: Partitioning of a cell S into micro cells. The 16 micro-cells carry the mcell_id codes 0000 (0) to 1111 (15)]
7.2.1
The micro cells
The network cells, as shown in Figure 7.1, are assumed to be hexagonal. For the current design, each such cell is partitioned into a number of square shaped sub-cells referred to as micro-cells (Figure 7.2). The area covered by a micro-cell (dotted boxes) is predefined for a particular cell. Let us assume that 16 micro-cells are considered for a network cell S and that the code of each micro-cell (mcell_id) is as shown in Figure 7.2 (details are given in Section 7.4.1). Prior to submitting a query, an MU determines its current geographical location in consultation with the servicing BS. Based on the current location, the MU can compute the mcell_id of the micro-cell it currently belongs to.
7.2.2
The data service scheme
The proposed data service scheme is developed based on the following two assumptions:
• The required object data/information (for which the query is raised) are stored in a high speed memory. At the time of system startup, the required data are loaded in predefined locations. This makes the system independent of the backing store.
• If an object is nearest to a point in a micro-cell, then it is the nearest object from any point of the micro-cell.
[Figure 7.3: Query code generated from an MU. The query_code is made up of the query_id and the mcell_id]

To raise a query, the MU must send the query code (Figure 7.3) containing a query_id and the mcell_id. The query_id represents the query, such as "where is the nearest hospital?", and it is unique for an object. The mcell_id, on the other hand, points to the micro cell (Figure 7.2) that the MU is currently passing through. Each base station of the mobile network contains a query decoder and a set of CA (rule vectors), CAset (Figure 7.4). A CA ∈ CAset corresponds to a query object (query_id). On receiving the query code (mcell_id and query_id) from an MU, the query decoder decodes the query_id and selects a CA from the CAset. The selected CA is then run for a predefined number of time steps with the mcell_id of the query code as seed (Figure 7.4(a)). The final state of the CA after the run points to the mcell_id where the desired object can be found. The memory address (location) of the desired information related to the object is then computed by an address calculator (AC).

Computation of memory address: To compute the address of the desired object data, the AC pads the final state of the CA (mcell_id) with a base address (Figure 7.4(b)). Selecting a unique base address for each type of object (say, Hospital) is the easiest way to store the information. However, this may lead to memory fragmentation. For example, let us consider that there are 16 micro-cells in a network cell and that a 1 KB memory block is allocated to store the information/data for each object (say, Hospital). Since a hospital may be in any one of the micro-cells, 16 KB of memory has to be reserved in the BS server for this object type. However, if a network cell contains only 3 instances of this object type (Hospital), then only 3 KB of the 16 KB is used and the remaining 13 KB is wasted as memory fragmentation. To minimize the memory fragmentation, we adopt sharing of base addresses.
Let us consider that S_i and S_j are the two sets of mcell_ids that contain the objects of two different types, O1 and O2. If S_i ∩ S_j = ∅, then the same base address can be used to store the data for O1 and O2. The following rule can effectively handle the memory fragmentation.
Rule 1: Use the same base address for k object types if they leave minimum unused memory.
Illustration of the data service scheme: Figure 7.4 describes the architecture of a query processor. It is capable of processing proximity queries on 4 different types of objects, namely Hospital (H), Restaurant (R), Police Station (P), and Bank (B). Let us assume that the four CA, CA_H, CA_R, CA_P, and CA_B (Figure 7.4(a)), are synthesized for H, R, P and B respectively. The CA_H <10, 69, 204, 68> (Figure 7.5),
CA_R, CA_P and CA_B are stored at the addresses 00, 01, 10 and 11 respectively in a BS. If the query "Where is the nearest H?" is raised by an MU, with query_id = 00, from mcell_id 1000 (8), then the query decoder at the nearest BS selects CA_H. CA_H is then loaded with the mcell_id 1000 and is run for a predefined number (3) of steps. This settles CA_H at the final state 1100 (12). The final state (attractor, Figure 7.5), padded with a stored base address at the address calculator (AC), points to the memory address where the information about the nearest hospital (H) is stored. The above discussion points to the fact that the design of the CA based data service scheme demands the synthesis of a CA for each object type, with state transitions similar to those of Figure 7.5. Such a CA is referred to as a non-group CA (Section 3.1 of Chapter 3).

[Figure 7.4: Block diagram of a query processor. (a) Query Processor: the Query Decoder uses the query_id to select CA_H (00), CA_R (01), CA_P (10) or CA_B (11); the selected CA is seeded with the mcell_id and drives an Address Calculator (AC) that addresses the memory. (b) Address Calculator (AC): combines the Base Address with the CA output]

[Figure 7.5: State transitions of a CA with rule vector <10, 69, 204, 68>. The 16 states form four basins, S1 to S4]
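The illustration above can be replayed with a small Python sketch. It is not the hardware design: the names `ca_step` and `answer_query`, the base address 0x4000 and the 1 KB block size are hypothetical, a null boundary is assumed, and "padding with a base address" is modelled simply as base address plus mcell_id times block size.

```python
def ca_step(rule_vector, state):
    """Next state of a 3-neighbourhood CA; state is an integer, cell 1 is the MSB.

    Assumption: null boundary (the end cells see 0 beyond the edge).
    """
    n = len(rule_vector)
    bits = [0] + [(state >> (n - 1 - i)) & 1 for i in range(n)] + [0]
    nxt = 0
    for i, rule in enumerate(rule_vector):
        rmt = (bits[i] << 2) | (bits[i + 1] << 1) | bits[i + 2]
        nxt = (nxt << 1) | ((rule >> rmt) & 1)
    return nxt


def answer_query(ca_set, base_addresses, query_id, mcell_id, steps=3, block_size=1024):
    """Select the CA for query_id, run it from mcell_id, and form a memory address."""
    state = mcell_id
    for _ in range(steps):
        state = ca_step(ca_set[query_id], state)
    # Hypothetical address calculation standing in for the AC.
    return state, base_addresses[query_id] + state * block_size


CA_H = [10, 69, 204, 68]
ca_set = {0b00: CA_H}                 # only the hospital CA is given in the text
base_addresses = {0b00: 0x4000}       # hypothetical base address

target_mcell, address = answer_query(ca_set, base_addresses, 0b00, 0b1000)
attractors = {s for s in range(16) if ca_step(CA_H, s) == s}
```

Under these assumptions the query from mcell_id 1000 (8) passes through 0000 and 0100 and settles at 1100 (12) after 3 steps, and the fixed points of CA_H are the four attractors 2, 3, 12 and 13 of Figure 7.5.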
The details on CA have been introduced in Chapter 3. However, in the current design, we employ only nonlinear non-group CA with a number of attractors. The next section introduces the non-group CA theory relevant to the current design. The design of the query processor developed around the non-group CA is reported in Section 7.4.
7.3
Nonlinear non-group CA with multiple attractors
Figure 3.5 of Chapter 3 depicts the state transition diagram of a non-group CA. There are two cycles in the diagram: state 15 forms a cycle of length 1, and states 7, 3 and 11 form a cycle of length 3. The cycles, in general, are referred to as attractors. However, in the current context, the term attractor refers only to cycles of length 1. The state transition diagram of the non-group CA (Figure 7.5) forms a number of disjoint graphs. Each disjoint graph is a basin. Figure 7.5 contains 4 basins rooted at the attractors 2, 12, 13 and 3. Therefore, the CA seeded with any state of a basin converges to the attractor of that basin, which is one of the 4 stable attractor states. In the proposed design of the query processor, we need to synthesize this class of CA having multiple basins rooted at the attractors. The following theorem characterizes the attractors in 3-neighborhood dependency and guides the synthesis of a CA desirable for the current design.

Theorem 7.1: If $B = b_1 b_2 \cdots b_{i-1} b_i b_{i+1} \cdots b_n$ and $C = c_1 c_2 \cdots c_{i-1} c_i c_{i+1} \cdots c_n$ are two attractors of an n-cell CA and $b_{i-1} = c_{i-1}$, $b_i = c_i$ and $b_{i+1} \oplus c_{i+1} = 1$, then $B' = b_1 b_2 \cdots b_{i-1} b_i c_{i+1} \cdots c_n$ and $C' = c_1 c_2 \cdots c_{i-1} c_i b_{i+1} \cdots b_n$ are also attractors of the CA.

Proof: To prove the theorem, we have to show that the next states of $B' = b_1 b_2 \cdots b_{i-1} b_i c_{i+1} \cdots c_n$ and $C' = c_1 c_2 \cdots c_{i-1} c_i b_{i+1} \cdots b_n$ are $B'$ and $C'$ respectively. Since $B$ and $C$ are two attractors of a 3-neighborhood CA, all the bits of $B'$ and $C'$ except the $i$th and $(i+1)$th are reproduced in the next state. Therefore, if the $i$th and $(i+1)$th bits of $B'$ and $C'$ are also reproduced in the next state, the theorem is proved. Since $B$ and $C$ are two attractors, the RMTs $(b_{i-1} b_i b_{i+1})$ and $(c_{i-1} c_i c_{i+1})$ are $b_i$ and $c_i$ respectively for the $i$th cell. If $B'$ and $C'$ are to be two attractors, then the RMTs $(c_{i-1} c_i b_{i+1})$ and $(b_{i-1} b_i c_{i+1})$ have to be $c_i$ and $b_i$ respectively for the $i$th cell. However, $b_{i-1} = c_{i-1}$ and $b_i = c_i$ imply that the RMTs $(c_{i-1} c_i b_{i+1})$ and $(b_{i-1} b_i c_{i+1})$ are $c_i$ and $b_i$ respectively. Therefore, the rule for the $i$th cell remains unchanged for these two attractors. Similarly, the rule for the $(i+1)$th cell also remains unchanged. Hence the proof. □

It is obvious from the above theorem that the derived attractors are the same as the original ones if $b_k = c_k$ for $1 \le k < i-1$.

Example 7.1: Let 010101 and 000111 be two attractors. For these attractors, the third and
124 fourth bits are same whereas fifth bit varies. Therefore, according to Theorem 7.1, the derived attractors are 010111 and 000101. Corollary 7.1 An n−cell CA, null boundary or periodic boundary, synthesized with 2 arbitrary attractors, can have maximum 2 m+1 attractors, derived from given 2 attractors, where m = b n−1 3 c. Proof : Let us consider, a CA is synthesized from two given attractors B and C. For one set of (i − 1, i, i + 1) that maintains Theorem 7.1 in B and C, the number of attractors is doubled. That is, two pairs of attractors are there. If there is another such set of (i − 1, i, i + 1) each of the two pairs of attractors derives another pair of attractors. Hence, for two such sets of (i − 1, i, i + 1), total number of attractors are 22+1 . However, it is obvious that if i = 2 in Theorem 7.1, then B = B 0 and C = C 0 . That is, no new attractor is derived. Therefore, leaving the left most bit, maximum number of possible (i − 1, i, i + 1) set is b n−1 3 c. Therefore, maximum number of possible attractors derived from the two attractors is 2 m+1 , where m = b n−1 3 c. Hence the proof. 2 Synthesis of a CA from the given states of basins is computationally hard problem [36, 140]. There are few efforts to find the CA structure from attractor basins [200, 310, 18]. However, the success rate is very much limited. Therefore, we propose a synthesis algorithm, reported in Section 7.4.2, based on Theorem 7.1 that results in a better solution to reach to a CA structure for modeling a given basin. The following result is important for the proposed synthesis algorithm. It is reported in Section 6.5.2 of Chapter 6 that if a state transition graph is constructed arbitrarily, the RM T s of CA rules to model the transition graph encounter conf lict. We characterize the conflicts in RM T s in the following theorem. 
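As an aside before the next theorem, the derivation of Theorem 7.1 and the bound of Corollary 7.1 can be checked on Example 7.1. The following is a minimal sketch, not part of the original design: the function names `derive_attractors` and `max_attractors` are hypothetical, attractors are written as bit strings read left to right, and only interior positions $2 \le i \le n-1$ are examined (no wrap-around for a periodic boundary).

```python
def derive_attractors(b, c):
    """Apply Theorem 7.1 at every 1-based position i where it fires.

    Returns (i, B', C') for each i with b_{i-1} == c_{i-1},
    b_i == c_i and b_{i+1} != c_{i+1}.
    """
    assert len(b) == len(c)
    n = len(b)
    derived = []
    for i in range(2, n):                 # cells i-1 and i+1 must both exist
        same_left = b[i - 2] == c[i - 2]  # b_{i-1} == c_{i-1}
        same_mid = b[i - 1] == c[i - 1]   # b_i == c_i
        differ_right = b[i] != c[i]       # b_{i+1} xor c_{i+1} == 1
        if same_left and same_mid and differ_right:
            b_new = b[:i] + c[i:]         # b_1 .. b_i c_{i+1} .. c_n
            c_new = c[:i] + b[i:]         # c_1 .. c_i b_{i+1} .. b_n
            derived.append((i, b_new, c_new))
    return derived


def max_attractors(n):
    """Upper bound of Corollary 7.1: 2^(m+1) with m = floor((n-1)/3)."""
    m = (n - 1) // 3
    return 2 ** (m + 1)


result = derive_attractors("010101", "000111")
print(result)             # [(4, '010111', '000101')]
print(max_attractors(6))  # 4
```

For the 6-bit attractors of Example 7.1 the condition holds only at $i = 4$, so the two given attractors plus the two derived ones reach the bound of 4.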
Theorem 7.2 : A low Hamming distance among a set of binary strings implies that the chances of conflict in RMTs are low when the strings are mapped to represent a CA basin.

Proof : Patterns with a low average Hamming distance share the same combination of $(i-1)$th, $i$th and $(i+1)$th bits in most cases. Therefore, the chances of conflict in the RMTs of the $i$th rule are low, as the RMT $b_{i-1} b_i b_{i+1}$ produces $b_i$ in most cases, where $b_{i-1}$, $b_i$ and $b_{i+1}$ are the $(i-1)$th, $i$th and $(i+1)$th bit values. Hence the proof. □

Example 7.2 For the example state transition graph of Figure 7.5, the lists $S_1$ = {1110=14, 0110=6, 1010=10, 0010=2}, $S_2$ = {1000=8, 0000=0, 0100=4, 1100=12}, $S_3$ = {1001=9, 0001=1, 0101=5, 1101=13}, and $S_4$ = {0111=7, 1111=15, 1011=11, 0011=3} exhibit low Hamming distances among the patterns within a list (the maximum Hamming distance is 2). As a result, the rules of $CA_H$ are obtained without conflict while selecting the values for the RMTs.
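The Hamming-distance claim of Example 7.2 is easy to verify numerically. The sketch below is illustrative only; the helper names `hamming` and `max_pairwise_hamming` are assumptions, and the lists are copied from the example.

```python
from itertools import combinations


def hamming(a, b):
    """Hamming distance between two mcell ids given as integers."""
    return bin(a ^ b).count("1")


def max_pairwise_hamming(ids):
    """Largest Hamming distance between any two ids of one list."""
    return max(hamming(a, b) for a, b in combinations(ids, 2))


LISTS = {
    "S1": [14, 6, 10, 2],
    "S2": [8, 0, 4, 12],
    "S3": [9, 1, 5, 13],
    "S4": [7, 15, 11, 3],
}

for name, ids in LISTS.items():
    print(name, max_pairwise_hamming(ids))  # each list prints 2
```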
Based on the theorems presented in this section, we next introduce the design of the query processor employing cellular automata.
7.4
Design of Query Processor
The architecture of the proposed CA based query processor is shown in Figure 7.4. The design of such a query processor essentially boils down to the synthesis of the CA ($CA_H$, $CA_R$, $\cdots$) employed for handling the queries. A cellular network with $m$ object types demands the synthesis of $m$ CA. Further, the proposed scheme employs the concept of micro cells. Therefore, the query processor design process should target the following tasks.

Task I: Partitioning of a network cell into a number of micro cells.
Task II: Synthesis of CA to answer the nearest queries.

The next subsection deals with the partitioning technique. The CA synthesis scheme is reported in subsection 7.4.2.
7.4.1
Partitioning of a cell in micro cells
A mobile network cell (Figure 7.1) is partitioned into micro cells based on the total coverage area of the cell, the number of instances of different objects, etc. Each micro cell is assigned a micro cell number (mcell id), which is treated as a CA state. Therefore, a network cell should be partitioned in such a way that the Hamming distance among the mcell ids of adjacent micro cells is minimum (Theorem 7.2). We illustrate the encoding scheme employed for the current design, which satisfies this requirement, with an example. Let us consider that a network cell S is divided into 4 partitions (Figure 7.6(a)). The partitions are encoded as 00, 01, 11 and 10 as shown in the figure. Each partition is further divided into 4 and follows the same encoding. Therefore, the network cell now has 16 partitions (micro cells), where each partition is encoded with 4 bits as shown in Figure 7.6(b) (the two boldface bits are due to the partitioning in the second step). We refer to each partitioning as a step. The number of partitioning steps employed for a network depends on the desired number of micro cells. It is evident from Figure 7.6 that the Hamming distances among the codes (mcell ids) of adjacent micro cells are very small.
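The recursive encoding can be sketched as follows. This is only an illustration under stated assumptions: the spatial placement of the codes (00 top-left, 01 top-right, 11 bottom-right, 10 bottom-left) and the choice to append the bits of each later step to the right are guesses, since the exact layout of Figure 7.6 is not recoverable here. The names `QUADRANT_CODE`, `mcell_id`, `partition` and `max_adjacent_distance` are hypothetical.

```python
# Assumed placement of the four codes inside one partitioning step.
QUADRANT_CODE = {(0, 0): "00", (0, 1): "01", (1, 1): "11", (1, 0): "10"}


def mcell_id(row, col, steps):
    """Id of the micro cell at (row, col) in a 2^steps x 2^steps grid."""
    bits = ""
    for level in range(steps - 1, -1, -1):   # coarsest partition first
        r = (row >> level) & 1
        c = (col >> level) & 1
        bits += QUADRANT_CODE[(r, c)]
    return bits


def partition(steps):
    """Grid of mcell ids after the given number of partitioning steps."""
    size = 2 ** steps
    return [[mcell_id(r, c, steps) for c in range(size)] for r in range(size)]


def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))


def max_adjacent_distance(grid):
    """Largest Hamming distance between horizontally or vertically adjacent cells."""
    size = len(grid)
    best = 0
    for r in range(size):
        for c in range(size):
            if c + 1 < size:
                best = max(best, hamming(grid[r][c], grid[r][c + 1]))
            if r + 1 < size:
                best = max(best, hamming(grid[r][c], grid[r + 1][c]))
    return best


grid = partition(2)
for row in grid:
    print(" ".join(row))
print(max_adjacent_distance(grid))  # 2 under the assumed layout
```

Under these assumptions, adjacent micro cells inside one first-step partition differ in 1 bit, and adjacent micro cells across a partition boundary differ in 2 bits.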
7.4.2
Synthesis of CA for query processor
The CA based solution for data service was introduced in the earlier section. It demands the synthesis of a desired CA for each object type, that is, $CA_H$, $CA_R$, etc. (Figure 7.4) for the objects Hospital, Restaurant, etc. This section targets the synthesis of CA for such a design. The following example illustrates the synthesis of $CA_H$. Let us consider the network cell S shown in Figure 7.7. The following are the characteristics of S.

Figure 7.6: Partitioning scheme. (a) First partition. (b) Final partition (16 micro cells).

Figure 7.7: Network cell S with object hospitals

• The whole cell area is covered by 16 micro cells.
• There are 4 hospitals $H_1$, $H_2$, $H_3$ and $H_4$ in S. $H_1$ is in micro cell 2, $H_2$ in 3, $H_3$ in 12, and $H_4$ in 13.
• The nearest hospital from the micro cells 2, 6, 10 and 14 is $H_1$, whereas the nearest hospital from the micro cells 3, 7, 11 and 15 is $H_2$. Similarly, the nearest hospital from the micro cells 0, 4, 8 and 12 is $H_3$, and $H_4$ is nearest to the micro cells 1, 5, 9 and 13.
• The proximity relations described above are represented as the proximity lists $\langle 2, 6, 10, 14 \rangle$, $\langle 3, 7, 11, 15 \rangle$, $\langle 12, 4, 8, 0 \rangle$, and $\langle 13, 9, 1, 5 \rangle$. The micro cell containing the hospital is placed as the first element of a proximity list.
• The collection of proximity lists for an object type (Hospital) is referred to as the (cluster) of that object, that is, $(cluster)_H = \{\langle 2, 6, 10, 14 \rangle, \langle 3, 7, 11, 15 \rangle, \langle 12, 4, 8, 0 \rangle, \langle 13, 9, 1, 5 \rangle\}$.
• For all other object types like Restaurant, Bank, etc., we can have the clusters $(cluster)_R$, $(cluster)_B$, respectively.

The state transition diagram shown in Figure 7.5 is for a 4-cell CA. The CA can act as the $CA_H$ for the example network cell S of Figure 7.7. The state transition diagram of the CA contains 4 disjoint sets of states, that is, the basins $S_1$, $S_2$, $S_3$, and $S_4$. The basins exactly map $(cluster)_H = \{\langle 2, 6, 10, 14 \rangle, \langle 3, 7, 11, 15 \rangle, \langle 12, 4, 8, 0 \rangle, \langle 13, 9, 1, 5 \rangle\}$, where each state of the CA corresponds to a micro cell.

Therefore, the basic task of the synthesis scheme is to find the CA $\langle R_1, R_2, \cdots, R_n \rangle$ that exactly maps the lists of an object cluster to its basins. The selection of $R_i$ for cell $i$ demands fixing the RMTs of $R_i$. We synthesize the desired CA with a reverse engineering technique. However, a limitation of 3-neighborhood CA is that a set of predefined basins may not always be modeled exactly by the CA, even if the states satisfy Theorem 7.2. In that case a few states are displaced into other basins. Extending the neighborhood of the CA cells can be an alternative solution, but it increases the design cost of the query processor. Therefore, we adopt two design schemes for the query processor: one in 3-neighborhood and the other in extended neighborhood (5, 7, $\cdots$). Next we introduce the CA synthesis scheme in 3-neighborhood based on reverse engineering.
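As a software reference for what the synthesized CA is expected to compute, the cluster can be held as a list of proximity lists and flattened into a lookup from every mcell id to the micro cell that holds its nearest object. This is a minimal sketch; the names `CLUSTER_H` and `nearest_object_cell` are hypothetical, and the hardware design itself uses the CA basins rather than a table.

```python
# (cluster)_H from the example: first element of each list holds the hospital.
CLUSTER_H = [
    [2, 6, 10, 14],
    [3, 7, 11, 15],
    [12, 4, 8, 0],
    [13, 9, 1, 5],
]


def nearest_object_cell(cluster):
    """Map every mcell id to the first element (attractor) of its proximity list."""
    lookup = {}
    for proximity_list in cluster:
        home = proximity_list[0]
        for mcell in proximity_list:
            if mcell in lookup:
                raise ValueError(f"mcell id {mcell} appears in two lists")
            lookup[mcell] = home
    return lookup


lookup = nearest_object_cell(CLUSTER_H)
print(lookup[10])  # 2, the micro cell of H1
print(lookup[5])   # 13, the micro cell of H4
```

A CA whose basins match the cluster exactly would drive state 10 to attractor 2 and state 5 to attractor 13, which is the behavior this table describes.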
7.5
Synthesis of CA through reverse engineering
Let us consider the selection of RMTs for the rules of cell 1 and cell 3 (from left to right) of the 4-cell $CA_H$ (null boundary) while mapping the attractors 0010, 1100, 1101 and 0011 of the $CA_H$ basins to $(cluster)_H$.

(i) The state 0010 of the desired CA should be the next state of 0010. For a null boundary CA, this transition sets the RMT 000 of cell 1 to 0 (0010 → 0010 implies that the present states of the neighbors of cell 1 form the combination 000 and the next state is 0). Similarly, it sets the RMT 010 of cell 3 to 1 (Figure 7.8).
(ii) For the attractor state 1100, the RMT 011 of cell 1 is set to 1 and the RMT 100 of cell 3 is set to 0.
(iii) The attractor state 1101 directs that the RMT 011 of cell 1 is to be set to 1. Note that the RMT 011 of cell 1 is already fixed as 1 in (ii). Therefore, this selection of the RMT for cell 1 does not result in any conflict.

Figure 7.8: Selection of RMTs to map attractors (cell 1 and cell 3)

This process may be continued for the other attractors and may fix some more RMTs of the CA cells. No cell encounters a conflict when the RMTs are filled up only from the given attractors, whatever their number. The following theorem states this formally.

Theorem 7.3 : There is no conflict in the RMTs of a cell when a CA is synthesized from an arbitrary number of attractors.

Proof : A conflict arises when a cell is required to have 0 and 1 simultaneously as the next state for a particular combination of the present states of its neighbors. Since the present state and the next state are the same for attractors, the next state of a cell is always the present state of that cell when a CA is synthesized from any number of given attractors. Therefore, no cell encounters a conflict in its RMTs. Hence the proof. □

The next task, after fixing the RMTs for the attractors, is to map the lists of $(cluster)_H$ to the basins of the desired $CA_H$. Figure 7.8 depicts the status of the RMTs for cell 1 and cell 3 after the first elements of the lists in $(cluster)_H$ are mapped to the attractors of the CA basins. Figure 7.9 illustrates the selection of RMTs for the cells 1 and 3 while mapping the first list $\langle 2 = 0010, 6 = 0110, 10 = 1010, 14 = 1110 \rangle$ of $(cluster)_H$ to an attractor basin of the desired CA. The circled RMTs indicate the values selected after the mapping of the attractors. The following steps explain the mapping of a list to a basin.

(i) Say, 10 = 1010 is the predecessor of the attractor 2 = 0010, that is, 0010 is the next state of 1010. This requires the RMT 010 of cell 1 to be 0. For cell 3, the RMT 010 is to be 1, which is already set to 1, and therefore there is no conflict.
(ii) Similarly, we assume that 6 = 0110 is the predecessor of 1010. It fixes the RMT 001 of cell 1 as 1 and the RMT 110 of cell 3 as 1.
(iii) Now, 14 = 1110 can be the predecessor of 0110 only if the RMT 011 of cell 1 is 0. However, it is already set to 1. So, there is a conflict in the RMT 011 of cell 1 (Figure 7.9(a)). The conflict can be avoided by considering 1110 as the predecessor of 1010. It sets the RMT 011 of cell 1 and the RMT 110 of cell 3 to 1. As these two values are already set to 1, there is no conflict in these cases (Figure 7.9(b)).
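The attractor step and steps (i) to (iii) above can be replayed in a few lines. This is a minimal sketch, not the thesis implementation: the helper names `neighborhoods` and `try_transition` are hypothetical, states are bit strings read left to right, and a null boundary is assumed as in the text.

```python
def neighborhoods(state):
    """Yield (0-based cell index, 3-bit RMT string) for a null-boundary CA."""
    padded = "0" + state + "0"
    for i in range(len(state)):
        yield i, padded[i:i + 3]


def try_transition(rmts, present, nxt):
    """Try to add the transition present -> nxt.

    Returns the list of conflicts as (1-based cell number, RMT) pairs.
    The RMT values are committed only when the list is empty.
    """
    conflicts = []
    for i, rmt in neighborhoods(present):
        fixed = rmts[i].get(rmt)
        if fixed is not None and fixed != nxt[i]:
            conflicts.append((i + 1, rmt))
    if not conflicts:
        for i, rmt in neighborhoods(present):
            rmts[i][rmt] = nxt[i]
    return conflicts


rmts = [dict() for _ in range(4)]            # one partial RMT table per cell

# Attractors map to themselves; Theorem 7.3 says this never conflicts.
attractor_conflicts = [try_transition(rmts, a, a)
                       for a in ("0010", "1100", "1101", "0011")]

step_i = try_transition(rmts, "1010", "0010")        # (i)   1010 -> 0010
step_ii = try_transition(rmts, "0110", "1010")       # (ii)  0110 -> 1010
step_iii_bad = try_transition(rmts, "1110", "0110")  # (iii) conflicting choice
step_iii_ok = try_transition(rmts, "1110", "1010")   # (iii) alternative choice

print(step_iii_bad)  # [(1, '011')]
print(rmts[0])       # RMTs fixed so far for cell 1
```

After these steps the RMTs 011, 010, 001 and 000 of cell 1 hold 1, 0, 1 and 0, and the only rejected transition is 1110 → 0110, which collides on the RMT 011 of cell 1.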
Therefore, a proper construction of the state transition diagram to map a list may avoid collisions. However, such a construction is a computationally hard problem. We, therefore, adopt the reverse engineering technique as a solution to the problem. The proposed technique (algorithm) accepts as input the pattern set $P = \{P_1, \cdots, P_i, \cdots, P_k\}$ that represents a cluster, where each $P_i$ ($i = 1, 2, \cdots, k$) is an $n$-bit pattern representing an mcell id. The output of the algorithm is the CA rule vector $R = \langle R_1, R_2, \cdots, R_n \rangle$ specifying the rule number applied to each of the cells. The following three phases illustrate the logic of the synthesis algorithm.
Figure 7.9: Selection of RMTs to construct basin. (a) Conflict in RMT 011 of cell 1. (b) Conflict-free selection for basin $S_1$.

• Phase I: The state transition diagram of a CA can be conceived as a graph consisting of multiple sub-graphs (Figure 7.5), where an attractor basin is portrayed as a sub-graph. In Phase I, we first set the attractors for the basins. A basin represents a proximity list of an object cluster, and the first element (mcell id) of the list is considered as the attractor. After fixing the attractor for each list, we find the additional attractors, if any, of the CA corresponding to the lists (Theorem 7.1). As the next step, we randomly generate the sub-graph for each attractor. The nodes of the sub-graph are the mcell ids of the list. The following example illustrates the underlying principle of Phase I of the synthesis scheme.

Example 7.3 Figure 7.10(a) represents two sub-graphs for $n = 4$. The number of nodes $p$ in each sub-graph is 5 and the heights of the two sub-graphs are 1 and 2 respectively. The patterns $P_1$ = 0000 and $P_2$ = 1111 are the cyclic nodes of the sub-graphs and are considered to be the attractors of the two attractor basins, Basin 1 and Basin 2. The patterns 0001, 0010, 0100, 1000 and 1110, 1101, 1011, 0111 are the transient states. These are mapped into the two attractor basins as shown in Figure 7.10(a).

• Phase II: From the sub-graphs (state transition graph) generated in Phase I, we construct a state transition table. Figure 7.10(b) represents the state transition table derived from the two sub-graphs of Figure 7.10(a). The number of sub-graphs (attractor basins) is equal to the number of lists in a cluster. If the number of nodes of a sub-graph is $p$, then the total number of entries in the state transition table is $k \cdot p$, where $k$ is the number of attractors. For example, in the case of Figure 7.10(a), the number of attractors is $k = 2$ and $p = 5$. Therefore, the total number of entries in the state transition table of Figure 7.10(b) is 10.

Figure 7.10: Randomly generated graph for 4-bit patterns. (a) Directed single cycle sub-graphs (Basin-1, Basin-2). (b) State transition table. (c) Next state function for the 2nd cell (collision at neighborhood 111) and the 3rd cell.

State transition table of Figure 7.10(b):

Basin   Present state   Next state
1       0000            0000
1       0001            0000
1       0010            0000
1       0100            0000
1       1000            0000
2       1111            1111
2       1011            1111
2       0111            1011
2       1101            1011
2       1110            1011

• Phase III: This phase generates the rule vector $R = \langle R_1, R_2, \cdots, R_n \rangle$ of the CA from the state transition table shown in Figure 7.10(b). Let us consider the $i$th cell, whose rule $R_i$ is to be identified. As we consider 3-neighborhood dependency, to identify the rule $R_i$ we concentrate only on 3 columns, the $(i-1)$th, $i$th and $(i+1)$th, of the $k \cdot p$ entries of the state transition table. Suppose that, for a present state configuration of the neighborhood of the $i$th cell, the next state is '0' $n_0$ times and '1' $n_1$ times in the state transition table. If both $n_0$ and $n_1$ are non-zero, the states '0' and '1' collide in formulating the next state of the $i$th cell. Figure 7.10(c) represents the neighborhood configurations along with the next state of the 2nd and 3rd cells for the patterns/states of the two attractor basins. For the 3rd cell, there is no collision between the states '0' and '1' for any of the $2^3$ configurations, whereas for the '111' configuration of the 2nd cell $n_0 = 1$ and $n_1 = 1$, that is, there is an instance of collision. In order to resolve this conflict we introduce the following heuristics.

• If $n_0 \simeq n_1$, the collision between the states '0' and '1' is high. In that case, we randomly decide the next state of the cell.
• If $n_0 \gg n_1$ ($n_0$ .
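Phase III can be sketched on the state transition table of Figure 7.10(b). The code below is a minimal illustration with hypothetical names (`TABLE`, `count_next_states`, `collisions`, `resolve`). Because the second heuristic is incomplete in the text above, the sketch assumes that a clear majority decides the next state and that only an exact tie is decided at random; this is an assumption, not a statement of the original algorithm.

```python
import random
from collections import defaultdict

# State transition table of Figure 7.10(b): (present state, next state).
TABLE = [
    ("0000", "0000"), ("0001", "0000"), ("0010", "0000"),
    ("0100", "0000"), ("1000", "0000"),
    ("1111", "1111"), ("1011", "1111"), ("0111", "1011"),
    ("1101", "1011"), ("1110", "1011"),
]


def count_next_states(table, cell):
    """For a 1-based cell of a null-boundary CA, return {RMT: [n0, n1]}."""
    counts = defaultdict(lambda: [0, 0])
    for present, nxt in table:
        padded = "0" + present + "0"
        rmt = padded[cell - 1:cell + 2]        # bits (i-1), i, (i+1)
        counts[rmt][int(nxt[cell - 1])] += 1
    return dict(counts)


def collisions(counts):
    """RMTs for which both next states 0 and 1 are requested."""
    return {rmt: (n0, n1) for rmt, (n0, n1) in counts.items() if n0 and n1}


def resolve(counts, rng=random):
    """Assumed heuristic: majority value, random choice on an exact tie."""
    rule = {}
    for rmt, (n0, n1) in counts.items():
        if n0 == n1:
            rule[rmt] = rng.choice("01")
        else:
            rule[rmt] = "1" if n1 > n0 else "0"
    return rule


print(collisions(count_next_states(TABLE, 2)))  # {'111': (1, 1)}
print(collisions(count_next_states(TABLE, 3)))  # {}
```

The output agrees with the text: the 2nd cell collides only on the configuration 111 with $n_0 = n_1 = 1$, and the 3rd cell has no collision.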
Since the conflict cannot be avoided completely, a few mcell ids may become members of basins that represent other lists of the cluster. In such cases, the CA resulting from Algorithm 7.1 may not ensure a 100% correct answer to a query. The effectiveness/performance of the design is evaluated in the next section.
7.6
Experimental results
The experimentation is divided into two parts: the feasibility analysis of the design and the performance evaluation of the query processor.
7.6.1
Feasibility of the design
The major task of the proposed design scheme is to find different CA that can support the clustering of mcell ids. However, all the lists may not be modeled by the CA basins. A few mcell ids of a list may erroneously become members of another list when they are modeled by a CA. This implies that a number of queries may not be answered correctly. We therefore define a parameter, the success rate,

$$\text{success rate} = \frac{\text{correctly modeled mcell ids}}{\text{total mcell ids}} \times 100\% \qquad (7.1)$$

which indicates the percentage of queries that can be answered correctly.

Due to the unavailability of real life data, we have experimented with synthesized data. The whole area of a mobile network is arbitrarily partitioned into a number of imaginary areas in such a way that a partition may contain at most one instance of a particular object. Clustering for the objects is done arbitrarily.
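Equation (7.1) can be evaluated directly once each mcell id is known to converge to some attractor. The sketch below is illustrative: `success_rate` and the mapping `basin_of` (mcell id to the attractor reached by the synthesized CA) are hypothetical names, and the two displaced ids in the example are invented purely to exercise the formula.

```python
def success_rate(cluster, basin_of):
    """Percentage of mcell ids that reach the attractor of their own list (Eq. 7.1)."""
    total = 0
    correct = 0
    for proximity_list in cluster:
        attractor = proximity_list[0]          # first element holds the object
        for mcell in proximity_list:
            total += 1
            correct += basin_of[mcell] == attractor
    return 100.0 * correct / total


CLUSTER_H = [[2, 6, 10, 14], [3, 7, 11, 15], [12, 4, 8, 0], [13, 9, 1, 5]]

# A CA that models every list exactly.
perfect = {m: lst[0] for lst in CLUSTER_H for m in lst}

# Hypothetical outcome in which two ids are displaced into other basins.
displaced = dict(perfect)
displaced[6] = 3
displaced[9] = 12

print(success_rate(CLUSTER_H, perfect))    # 100.0
print(success_rate(CLUSTER_H, displaced))  # 87.5
```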
Table 7.1: Success rate of the proposed design

# objects   # mcell id   avg. # instances per object   success rate
4           10           3                             100%
4           16           3                             100%
4           16           4                             90%
4           24           5                             80%
8           16           3                             100%
8           24           3                             95%
8           24           4                             100%
8           32           5                             100%
Experimentation is done assuming different types of objects and also varying the number of clusters of mcell ids. Table 7.1 shows sample results for 4 and 8 objects (column 1). The second column of Table 7.1 depicts the number of partitions (assumed) for a design. The average number of instances per object is noted in column 3. The success rate is reported in column 4. We have experimented with different clusterings of mcell ids for a fixed number of objects and mcell ids. The success rate noted in column 4 for a particular number of objects and partitions (mcell ids) is attained in at least 75% of the cases while searching for solutions. The reported results show that in most cases we successfully find a solution. The performance of the query processor designed around the CA synthesized from Algorithm 7.1 is reported next.
7.6.2
Performance of query processor
Although a number of works related to location dependent data processing have been reported, the schemes more or less use some variation of an indexed structure to facilitate database searching. The specially designed structures such as D-tree, trian-tree, trap-tree, etc., which are intended to suit LDD (Location Dependent Data), have some amount of access latency. Moreover, the response time of those systems suffers seriously. On the other hand, the proposed scheme does not effectively require any database searching. It uses a hardware circuit to retrieve the required information stored in the main memory at the BS. Hence, unlike the state-of-the-art solutions, the data service in the proposed design is very fast. The design requires FFs, the combinational logic, and a decoder (Figure 7.4). The cost of this hardware is even less than the memory required for storing the index files that are essential for traditional query processing systems.

The works related to location dependent query processing reported till date involve soft computation. So, we compare the performance of our CA based data service scheme with that of soft computation based services. For both schemes, we assume that the traffic pattern follows a Poisson distribution. Further, for the solution with soft computation, the following parameter values are chosen: System clock = 2 GHz. RAM speed = 66 MHz. Seek time = 30 ms. Rotational latency = 3. Number of disk accesses to retrieve data of an object = 1.

Table 7.2: Performance in terms of service

λ/ms     Soft comp. performance in %   CA based performance in %
20       0.227                         0.460
10       0.302                         0.587
5        0.454                         0.901
1        1.659                         3.329
0.1      15.284                        30.618
0.07     21.808                        43.635
0.05     30.355                        60.749
0.031    49.334                        97.688
0.030    50.647                        100.00

Table 7.2 reports the performance. The first column indicates the average arrival rate (λ/ms) of queries to the server. Column 2 shows the success rate (the percentage of queries successfully answered; the rest are discarded when the input query buffer is full) in soft computation, whereas the last column reports the success rate of the proposed CA based scheme. It is clear from Table 7.2 that the performance of the proposed CA based scheme is approximately twice that of the soft computing scheme. However, the proposed scheme cannot ensure a 100% success rate (Section 7.6.1) due to the conflicts in RMTs while designing the CA for the query processor. The following section introduces a technique for the removal of conflicts in RMTs through extension of the neighborhood of the CA.
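The "approximately twice" observation can be checked with simple arithmetic on the values of Table 7.2. The list names below are hypothetical; the numbers are copied from the table.

```python
# Values copied from Table 7.2, one entry per arrival rate.
ARRIVAL_RATE = [20, 10, 5, 1, 0.1, 0.07, 0.05, 0.031, 0.030]
SOFT = [0.227, 0.302, 0.454, 1.659, 15.284, 21.808, 30.355, 49.334, 50.647]
CA_BASED = [0.460, 0.587, 0.901, 3.329, 30.618, 43.635, 60.749, 97.688, 100.00]

# Ratio of CA based performance to soft computation performance per row.
ratios = [ca / soft for ca, soft in zip(CA_BASED, SOFT)]

for rate, ratio in zip(ARRIVAL_RATE, ratios):
    print(f"{rate:>6}  {ratio:.2f}")
```

Every ratio lies between 1.9 and 2.1, which is consistent with the statement in the text.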
7.7
Removal of conflict
The limitation of 3-neighborhood CA is that, due to conflicts in RMTs, a set of predefined basins may not always be exactly modeled by the CA, even if the mcell ids maintain a low Hamming distance among them. The conflict in RMTs may result in a CA where a few mcell ids are members of basins that represent other lists of the object cluster. Therefore, the CA resulting from the reverse engineering technique may not ensure a 100% correct answer to a query.

The extension of the neighborhood of the CA ($k \ge 3$) can be an alternative solution to remove the conflict. For example, if there is a conflict in 3-neighborhood dependency, we try to resolve it in 5-neighborhood, and so on. That is, rather than considering 8 RMTs for a CA rule, we have to choose 32 RMTs. Let us consider that the patterns 1110100, 0100100 and 1011101 form a directed graph as shown in Figure 7.11(a). Case I shows that there is a conflict in RMTs while modeling the rule for cell 5 in 3-neighborhood. However, this problem can be solved by extending the neighborhood from 3 to 5. Figure 7.11(b) shows that there is no conflict in RMTs for the same directed graph in 5-neighborhood dependency while considering the rule for CA cell 5.

Figure 7.11: Removal of conflict by extending the neighborhood. Nodes: 1 = 1110100, 2 = 0100100, 3 = 1011101. (a) Case I (3-neighborhood): conflict in the RMTs of cell 5. (b) Case II (5-neighborhood): no conflict for cell 5.

Once all the lists of a cluster are mapped to the CA basins, avoiding conflicts in the RMTs, most of the RMTs of a CA cell rule are fixed. The RMTs that are not decided during the mapping of lists are considered as don't cares (either 0 or 1). In a trial for the example design of Figure 7.5, the 4 RMTs 011, 010, 001, and 000 of cell 1 are fixed as 1, 0, 1, 0 respectively (Figure 7.9) while mapping $(cluster)_H$ to the CA basins. Therefore, the RMTs 111, 110, 101, and 100 of the rule of cell 1 are don't cares. If we choose the don't care values as 0, then the rule for cell 1 of $CA_H$ is 00001010 = 10.

The Synthesis Algorithm: The synthesis of CA, avoiding conflict, is described in Algorithm 7.2. The algorithm draws a state transition diagram (disjoint graph) randomly for a given cluster and then tries to set the rule for each CA cell in 3-neighborhood dependency to satisfy the state transitions. If a conflict is found, the algorithm next tries 5-neighborhood dependency, and so on. Finally, it returns a rule vector (CA) with $k$-neighborhood dependency.
Algorithm 7.2 FindCA
Input: The m number of clusters
Output: The CA rule vector
Step 1: Repeat Step 2 to Step 4 for each cluster.
Step 2: Randomly draw a state transition diagram so that each proximity list of the cluster forms a basin.
Step 3: For i = 1 to n repeat Step 4.
Step 4: For k = 3, 5, 7, · · · repeat the following. Try to find the rule for the ith cell with k-neighborhood. If there is no conflict, store the rule as the ith CA cell rule, and execute Step 4 with the next i. Else try with the next k.
Step 5: Report the rule vector.
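Steps 3 and 4 of Algorithm 7.2 can be sketched as below, together with the don't-care completion used to read off a rule number. This is a hedged illustration rather than the thesis code: all names (`TABLE`, `window`, `rule_for_cell`, `find_ca`, `rule_number`) are hypothetical, the random drawing of Step 2 is replaced by the fixed state transition table of Figure 7.10(b), a null boundary is assumed, and the upper limit `max_k` is an arbitrary safeguard.

```python
# State transition table of Figure 7.10(b): (present state, next state).
TABLE = [
    ("0000", "0000"), ("0001", "0000"), ("0010", "0000"),
    ("0100", "0000"), ("1000", "0000"),
    ("1111", "1111"), ("1011", "1111"), ("0111", "1011"),
    ("1101", "1011"), ("1110", "1011"),
]


def window(state, i, k):
    """k-bit neighborhood of 0-based cell i in a null-boundary CA."""
    r = k // 2
    padded = "0" * r + state + "0" * r
    return padded[i:i + k]


def rule_for_cell(table, i, k):
    """Return {RMT: next bit} for cell i, or None if two rows conflict."""
    rmts = {}
    for present, nxt in table:
        rmt = window(present, i, k)
        if rmts.setdefault(rmt, nxt[i]) != nxt[i]:
            return None
    return rmts


def find_ca(table, n, max_k=7):
    """For each cell, take the smallest odd k that gives a conflict-free rule."""
    rule_vector = []
    for i in range(n):
        for k in range(3, max_k + 1, 2):
            rmts = rule_for_cell(table, i, k)
            if rmts is not None:
                rule_vector.append((k, rmts))
                break
        else:
            raise ValueError(f"no conflict-free rule for cell {i + 1} up to k={max_k}")
    return rule_vector


def rule_number(rmts, k, dont_care="0"):
    """Fill undecided RMTs with dont_care and read the 2^k bits as a number."""
    bits = "".join(rmts.get(format(v, f"0{k}b"), dont_care)
                   for v in range(2 ** k - 1, -1, -1))
    return int(bits, 2)


ca = find_ca(TABLE, 4)
print([k for k, _ in ca])                                                # [5, 5, 3, 5]
print(rule_number({"011": "1", "010": "0", "001": "1", "000": "0"}, 3))  # 10
```

For this particular table only the 3rd cell is conflict-free in 3-neighborhood; the other three cells need 5-neighborhood. The second printed value reproduces the rule 00001010 = 10 obtained for cell 1 of $CA_H$ when the don't cares are set to 0.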
The next section highlights the major experimental results that support the effectiveness of the proposed design.
7.8
Effectiveness of the design
The major task of the proposed data service design is to find different CA whose state transitions can model the clustering of mcell ids. To find the desired CA we choose $k$-neighborhood CA, where $k \ge 3$. Since an increase in neighborhood adds to the design complexity, our target is to find CA rules with the minimum $k$, that is, 3. Table 7.3 reports sample results of the experimentation. The first column of Table 7.3 depicts the number of partition steps (Section 7.4.1) executed to partition a network cell into micro cells. The average number of instances per object type is noted in column 2. We have experimented with different clusterings of mcell ids for a particular number of partition steps. While synthesizing, different cells of a CA may select different neighborhood dependencies. The average neighborhood dependency of the cells of a CA is calculated and reported in column 3 for the null boundary CA. We have also experimented with $k$-neighborhood periodic boundary CA. The result of that experimentation is reported in column 4. The results of column 4 show that the neighborhood dependency is 3 in most cases for a periodic boundary CA based design.

The design of the CA is a one-time cost for a mobile network. A design remains unchanged until an update of the clustering of mcell ids is required. The clusters have to be updated when an instance of an object type is introduced into or removed from the network area. An update may change the set of CA rules (that is, the RMTs of different cell rules). The updated rules can be implemented by simply replacing the content of the RMT register of Figure 6.20. Figure 6.20 of Chapter 6 shows the combinational logic required for a CA cell in 3-neighborhood. It consists of an 8 to 1 MUX and the 8-bit RMT register. For 5-neighborhood, we need a 32 to 1 MUX and a 32-bit RMT register.
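The MUX and RMT register arrangement can be modeled in software as a bit selection. The sketch below is a behavioral illustration only, with hypothetical names `mux_select` and `step`; it assumes that the leftmost neighbor drives the most significant select line, which matches the reading of RMT 011 as bit 3 of the rule number.

```python
def mux_select(rmt_register, inputs):
    """Model a 2^k-to-1 MUX: the k neighbor bits select one bit of the register."""
    select = 0
    for bit in inputs:                 # leftmost neighbor = most significant line
        select = (select << 1) | bit
    return (rmt_register >> select) & 1


def step(rule_vector, state):
    """One clock of an n-cell, 3-neighborhood, null-boundary CA."""
    padded = [0] + list(state) + [0]
    return [mux_select(rule, padded[i:i + 3])
            for i, rule in enumerate(rule_vector)]


# Rule 10 (00001010) of cell 1: RMTs 011, 010, 001, 000 give 1, 0, 1, 0.
print([mux_select(0b00001010, rmt)
       for rmt in [(0, 1, 1), (0, 1, 0), (0, 0, 1), (0, 0, 0)]])  # [1, 0, 1, 0]

# Updating a rule only means loading a new register value; the MUX is unchanged.
print(step([204, 204, 204, 204], [1, 0, 1, 1]))  # rule 204 copies each cell: [1, 0, 1, 1]
```

The same `mux_select` call works for a 5-neighborhood cell when it is given a 32-bit register value and 5 input bits.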
Table 7.3: Effectiveness of the CA based design

# partition steps   avg. # instances per object   avg. # neighborhood (null boundary)   avg. # neighborhood (periodic boundary)
2                   3                             3.0                                   3.0
3                   3                             3.0                                   3.0
4                   3                             3.0                                   3.0
3                   4                             3.0                                   3.0
3                   5                             3.3                                   3.0
3                   6                             3.6                                   3.0
4                   6                             3.3                                   3.0
4                   8                             3.6                                   3.0
4                   10                            3.6                                   3.3
5                   8                             3.3                                   3.0
5                   10                            3.3                                   3.0
5                   12                            3.6                                   3.3

7.9
Conclusion
The work reports a nonlinear CA based data service scheme for location dependent data. It reduces the delay of query service significantly through the introduction of a hardware query processor. The proposed technology applies an altogether different approach, a paradigm shift, to data service and also explores a new domain of CA applications.
Chapter 8
Conclusion

In this concluding chapter, we provide a summary of the major contributions of the work that has been reported in depth in the different chapters of the thesis. We also provide pointers to future research in the field of Cellular Automata (CA), to attract the attention of researchers from diverse disciplines.
8.1
Main Contributions
The prime objective of the current research is to characterize nonlinear cellular automata and to introduce nonlinear CA in the field of VLSI design. This work reports a detailed characterization of nonlinear CA, especially of nonlinear group (reversible) CA (Chapter 3). It classifies the CA rules (interconnections among the CA cells) based on the 1s and 0s in the 8-bit binary representation of each rule. The classification of CA rules eases the synthesis of group CA. Two efficient methods to identify and synthesize an $n$-cell group CA in $O(n)$ time have been proposed in Chapter 3. A characterization of the non-reachable states of a non-group (irreversible) CA is also reported. The fundamental results reported to characterize the state transition behavior of nonlinear CA open up a new dimension to the CA research community.

In Chapter 4 of this thesis, we have proposed the design of a PRPG (Pseudo-Random Pattern Generator) utilizing the fundamental results on nonlinear group CA provided in Chapter 3. The randomness properties of the CA rules are characterized to facilitate the design. The huge search space of nonlinear CA is then exploited to reach the best solution in pseudo-random pattern generation.

The TPGs (Test Pattern Generators) for VLSI circuit testing, based on the PRPG reported in Chapter 4, have been reported in Chapter 5. A cost-optimal design has also been proposed in this chapter. It is established that the nonlinear CA based TPG supports scalable design and can effectively be employed for designing the test structure of a chip with multiple cores.

A Universal BIST (UBIST) structure developed around cellular automata is proposed in Chapter 6. The UBIST can generate any of the four classes of test patterns: (1) pseudo-random, (2) pseudo-exhaustive, (3) pseudo-random without a prohibited pattern set, and (4) deterministic test patterns. Such a design is most desirable to the VLSI test research community.

Finally, in Chapter 7, we address a completely different problem, the issue of Location-Dependent Information Services (LDIS) in a cellular mobile network. We introduce cellular automata as an efficient tool to find a solution to the problem. Nonlinear CA provide a hardware solution to data services/query processing and avoid any sort of soft computation, resulting in a faster response to the queries.
8.2
Future Extension
There are important aspects of the research reported in this thesis that can be extended for the benefit of the CA research community. The following are pointers to future extensions of this work:
1. A detailed theoretical formulation to identify the cycle structure, that is, the length of each cycle of a nonlinear group CA,
2. Research to identify and to synthesize maximal length nonlinear CA,
3. The detailed characterization of nonlinear multiple attractor CA (MACA),
4. Exploring CA based hardwired solutions for other applications, as targeted for data services in cellular mobile networks.
Bibliography

[1] In http://mathworld.wolfram.com/PrimitivePolynomial.html.
[2] List of primitive polynomials. In http://www.csr.uvic.ca/~mserra/CA.html.
[3] Xilinx. In http://www.xilinx.com.
[4] M. Abramovici, M. A. Breuer, and A. D. Friedman. Digital Systems Testing and Testable Design. Jaico Publishing House, 1997.
[5] A. Adamatzky. Hierarchy of Fuzzy Cellular Automata. Fuzzy Sets and Systems, 62:167–174, 1994.
[6] A. Adamatzky. Identification of Cellular Automata. Taylor and Francis Inc., Bristol, 1994.
[7] V. D. Agrawal, C. R. Kime, and K. K. Saluja. A tutorial on built-in-self-test part 1: principles. In IEEE Design & Test of Computers, pages 73–82, March 1993.
[8] S. B. Akers and W. Jansz. Test Set Embedding in a Built-In Self Test Environment. In Proceedings of International Test Conference, pages 257–263, 1989.
[9] Victor Aladyev. Survey of Research in the theory of Homogeneous Structures and their applications. Mathematical Biosciences, 22:121–154, 1974.
[10] A. Albicki and M. Khare. Cellular Automata used for test pattern generation. In Proceedings of ICCAD, pages 56–59, 1987.
[11] A. Albicki and S. K. Yap. Covering a set of test patterns by a cellular automata. Technical report, Comp. Sc. and Engg. Research Review, Univ. of Rochester, USA, 1987.
[12] A. Albicki, S. K. Yap, M. Khare, and S. Pamper. Prospects on Cellular Automata Application to Test Generation. Technical Report EL-88-05, Dept. of Electrical Engg., Univ. of Rochester, USA, 1988.
[13] P. S. Albin. The Analysis of Complex Socioeconomic Systems. Lexington Books, London, 1975.
140 [14] Zoran Aleksic. Computation in Inhomogeneous Celluar Automata. In David Green and Terry Bossomaier, editors, Complex Systems: From Biology to Computation. IOS Press, Amsterdam, 1993. [15] S. Amoroso, G. Cooper, and Y. N. Patt. Some clarifications of the concept of a garden-of-eden configuration. Journal of Computer and System Sciences, 10(1):77–82, 1975. [16] S. Amoroso and Y. N. Patt. Decision procedures for surjectivity and injectivity of parallel maps for tesselation structures. Journal of Computer and System Sciences, 6:448–464, 1972. [17] M. Arbib. Simple self-reproducing universal automata. Information and Control, 9:177–189, 1966. [18] M Askenazi. GeneTool software, Email:
[email protected]. [19] A. J. Atrubin. A one-dimensional real-time iterative multiplier. IEEE Trans. on Computers, EC-14(3):394–399, October 1965. [20] Franco Bagnoli and Carlo Guardiani. Sympatric Speciation Through Assortative Mating in a Long-Range Cellular Automata. In Proceedings of Sixth International Conference on Cellular Automata for Research and Industry, ACRI, The Netherlands, pages 405–414, October 2004. [21] H. Baltzer, W. P. Braun, and W. Kohler. Cellular Automata Model for Vegetable Dynamics. Ecological Modelling, 107:113–125, 1998. [22] E. R. Banks. Information Processing and Transmission in Cellular Automata. PhD thesis, MIT, 1971. [23] Feng Bao. Cryptanalysis of a Partially Known Cellular Automata Cryptosystem. IEEE Trans. on CAD, 53(11):1493–1497, November 2004. [24] A. M. Barbe. A cellular automata ruled by an eccentric conservation law. Physica D, 45:49–62, 1990. [25] P. H. Bardell. Analysis of Cellular Automata Used as Pseudo-random Pattern Generators. In Proceedings of International Test Conference, pages 762–768, 1990. [26] P. H. Bardell, W. H. McAnney, and J. Savir. Built-in Test for VLSI : PseudoRandom Techniques. John Wiley & Sons, 1987. [27] P.H. Bardell and W.H. McAnney. Pseudo-random arrays for built-in tests. IEEE Trans. on Computers, C-35(7):653–658, July 1986. [28] S. C. Benjamin and N. F. Johnson. A Possible Nanometer-scale Computing Device based on an Adding Cellular Automaton. Applied Physics Letters, 1997.
141 [29] C. Bennett and G. Grinstein. Role of irreversibility in stabilizing complex and nonergodic behaviour in locally interacting discrete systems. Phys. Rev. Lett., 55:657–660, 1985. [30] E. R. Berlekamp, J. H. Conway, and R. K. Guy. Winning ways for your mathematical plays, volume 2. Academic Press, 1984. [31] S. Bhattacharjee, J. Bhattacharya, and P Pal Chaudhuri. An Efficient Data Compression Hardware based on Cellular Automata. In Proceedings of Data Compression Conference (DCC95), page 472, 1995. [32] Stephen A Billings and Yingxu Yang. Identification of Probabilistic Cellular Automata. IEEE Transaction on System, Man and Cybernetics, Part B, pages 1–12, 2002. [33] R. J. De Boer and P. Hogeweg. Growth and recruitment in the immune network. In A. F. Perelson and G. Weisbuch, editors, Theoretical and Experimental Insights into Immunology, volume 66, pages 223–247. Springer Verlag, New York, 1992. [34] C. Burks and D. Farmer. Towards modeling DNA sequences as automata. Physica D, 10:157–167, 1984. [35] Erik Cantu-Paz. A Summary on Research of Parallel Genetic Algorithm. Technical Report 95007, Illinois Genetic Algorithms Laboratory, July 1995. [36] M. S. Capcarrere. Cellular Automata and other Cellular System: Design and Evolution. PhD thesis, Swiss Federal Institute of Technology, Luassane, 2002. [37] M. S. Capcarrere, A Tettamanzi, and Moshe Sipper. Statistical Study of a class of Cellular Evolutionary Algorithm. Evolutionary Computation, 7(3):255–274, 1998. [38] G. Cattaneo, P. Flocchini, G. Mauri, and N. Santoro. Fuzzy Cellular Automata and Their Chaotic Behavior. In Proceedings of International Symposium on Nonlinear Theory and its Applications, Hawaii, IEICE, volume 4, pages 1285– 1289, 1993. [39] G. Cattaneo, P. Flocchini, G. Mauri, and N. Santoro. Cellular Automata in Fuzzy Backgrounds. Physica D, 105:105–120, 1997. [40] Kevin Cattel and J. C. Muzio. Analysis of One-Dimensional Linear Hybrid Cellular Automata over GF(q). IEEE Trans. 
on Computers, 45(7):782–792, July 1996. [41] Kevin Cattel and J. C. Muzio. Synthesis of One Dimensional Linear Hybrid Cellular Automata. IEEE Trans. on CAD, 15:325–335, 1996.
142 [42] Kevin Cattell, Shujian Zhang, Micaela Serra, and Jon C. Muzio. 2-by-nn Hybrid Cellular Automata with Regular Configuration: Theory and Application. IEEE Transactions on Computers, 48(3):285–295, March 1999. [43] Kevin Cattell and Shujlan Zhang. Minimal cost one-dimensional linear hybrid cellular automata of degree through 500. JOURNAL OF ELECTRONIC TESTING: Theory and Applications, 6:255–258, January 1995. [44] F. Celada and P. E. Seiden. A computer model of cellular interactions in the immune system. Immun-t, 13:56–62, 1992. [45] M. Chady and R. Poli. Evolution of Cellular-automaton-based Associative Memories. Technical Report no. CSRP-97-15, May 1997. [46] S. Chakraborty, D. Roy Chowdhury, and P. Pal Chaudhuri. Theory and Application of Non-Group Cellular Automata for Synthesis of Easily Testable Finite State Machines. IEEE Trans. on Computers, 45(7):769–781, July 1996. [47] H. J. Chang, O. H. Ibarra, and A. Vergis. On the power of one-way communication. Comm. of the ACM, 35(3):697–726, July 1988. [48] S. Chattopadhyay. Some Studies on Theory and Applications of Additive Cellular Automata. PhD thesis, IIT, Kharagpur,India, 1996. [49] S. Chattopadhyay, S. Adhikari, S. Sengupta, and M. Pal. Highly Regular, Modular, and Cascadable Design of Cellular Automata-Based Pattern Classifier. IEEE Transaction on VLSI Systems, 8(6):724–735, December 2000. [50] S. Chattopadhyay and P. Pal Chaudhuri. Efficient signatures of Boolean Functions for Rapid Matching in Antifuse based FPGA Technology Mapping. In International Conference on Computer Systems and Education, Bangalore, India, June 1994. [51] P Pal Chaudhuri, D Roy Chowdhury, S Nandi, and S Chatterjee. Additive Cellular Automata – Theory and Applications, volume 1. IEEE Computer Society Press, California, USA, ISBN 0-8186-7717-1, 1997. [52] Pabitra Pal Chaudhuri and Arup Roy Chowdhury. Economic Return to the Previous State for any Nearest Neighborhood Reversible Autonomous Finite State Machines. 
Information Sciences, 126(1-4):129–136, 2000. [53] Chester Lee. Synthesis of a Cellular Universal Machine using the 29-state Model of von Neumann. Automata Theory Notes, The University of Michigan Engineering Summer Conferences, 1964. [54] Sung-Jin Cho, Un-Sook Choi, and Han-Doo Kim. Behavior of Complemented TPMACA whose complemented vector is Acyclic in a linear TPMACA. Mathematical and Computer Modelling, 36:979–986, 2002.
143 [55] Sung-Jin Cho, Han-Doo Kim, and Un-Sook Choi. Analysis of trees of complemented CA derived from a linear TPMACA. In Joint Workshop on Combinatorics, 2002. [56] B Chopard and M Droz. Cellular Automata Modelling of Physical Systems. Cambridge University Press, 1998. [57] H. H. Chou and J. A. Reggia. Problem solving during artificial selection of self-replicating loops. Physica D, 115:293–312, 1998. [58] D. Roy Chowdhury. Theory and Applications of Additive Cellular Automata for Reliable and Testable VLSI Circuit Design. PhD thesis, IIT, Kharagpur, India, 1992. [59] D. Roy Chowdhury, S. Basu, I. Sen Gupta, and P. Pal Chaudhuri. Design of CAECC — Cellular automata based error correcting code. IEEE Trans. on Computers, 43(6):759–764, June 1994. [60] D. Roy Chowdhury, S. Chakraborty, B. Vamsi, and P.Pal Chaudhuri. Cellular Automata based synthesis of easily and fully testable FSMs. In Proceedings of ICCAD, pages 650–653, Nov 1993. [61] D. Roy Chowdhury, I. Sen Gupta, and P. Pal Chaudhuri. A class of TwoDimensional Cellular Automata and Applications in Random Pattern Testing. JOURNAL OF ELECTRONIC TESTING: Theory & Applications, 5:67–82, 1994. [62] D. Roy Chowdhury, I. Sen Gupta, and P. Pal Chaudhuri. CA-Based Byte Error Correcting Code. IEEE Trans. on Computers, 44(3):371–382, March 1995. [63] E. F. Codd. Cellular Automata. Academic Press Inc., 1968. [64] S. N. Cole. Real time computation by n-dimensional iterative arrays of finite state machines. IEEE Trans. on Computers, C-18, 1969. [65] F. Corno, M. Rebundengo, M. Sonza Reorda, G. Squillero, and M. Violante. Low power BIST via non-linear hybrid cellular automata. In Proceedings of VLSI Test Symposium, 2002. [66] F. Corno, M. Sonza Reorda, and G. Squillero. The Selfish Gene Algorithm: a New Evolutionary Optimization Strategy. In Proceedings of 13th Annual ACM Symposium on Applied Computing, Atlanta, Georgia (USA), pages 349–355, February 1998. [67] F. Corno, M. Sonza Reorda, and G. Squillero. 
Evolving effective CA/CSTP: BIST architectures for sequential circuits. In Proceedings of the 2001 ACM symposium on Applied computing, pages 345–350. ACM Press, 2001. [68] R. R. Coveyou and R. D. MacPherson. Fourier analysis of uniform random number generators. J. Assoc. Comput. Mach., 14:100–119, 1967.
144 [69] M. Creutz. Deterministic ising dynamics. Annals of Physics, 167:62–72, 1986. [70] J. P. Crutchfield and M. Mitchell. The Evolution of Emergent Computation. In Proceedings of the National Academy of Sciences,USA, volume 93(23), pages 10742–10746, 1995. [71] J P Crutchfield and N H Packard. Symbolic Dynamics of Noisy Chaos. Physica D, 7:201–223, 1983. [72] K. Culik and S. Dube. An efficient solution to the firing mob problem. Theoretical Computer Science, 91:57–69, December 1991. [73] K. Culik, L. P. Hard, and S. Yu. Computation theoretic aspects of cellular automata. Physica D, 45/1-3:357–378, September 1990. [74] A. K. Das. Additive Cellular Automata : Theory and Application as a Built-in Self-test Structure. PhD thesis, IIT, Kharagpur, India, 1990. [75] A. K. Das and P. Pal Chaudhuri. An Efficient On-chip Deterministic Test Pattern Generation Scheme. Euromicro Journal, Microprocessing & Microprogramming, 26:195–204, 1989. [76] A. K. Das and P. Pal Chaudhuri. Efficient Characterization of Cellular Automata. Proceedings of IEE (Part E), 137(1):81–87, January 1990. [77] A. K. Das and P. Pal Chaudhuri. Vector space theoretic analysis of additive cellular automata and its applications for pseudo-exhaustive test pattern generation. IEEE Trans. on Computers, 42(3):340–352, March 1993. [78] A. K. Das, A. Sanyal, and P. Pal Chaudhuri. On the characterization of cellular automata. Information Science, 1991. [79] R. Das, J. P. Crutchfield, M. Mitchell, and J. E. Hanson. Evolving Globally Synchronized Cellular Automata. In Proceedings of Sixth International Conference on Genetic Algorithms, San Fransisco, CA, pages 336–343, 1995. [80] R. Das, M. Mitchell, and J. P. Crutchfield. A Genetic Algorithm discovers Particle Based Computation in Cellular Automata. Parallel Problem Solving from Nature - III, Y. Davidor, H.-P. Schwefel, and R. Mnner (eds.), SpringerVerlag, pages 344–353, 1994. [81] Sukanta Das, Sipra Das(Bit), and Biplab K Sikdar. 
Non-linear Cellular Automata Based Design of Query Processor for Mobile Network. In IEEE SMC conference, Hawaii, volume 3, pages 2751–2756, October 2005. [82] Sukanta Das, Debdas Dey, Subhayan Sen, Biplab K. Sikdar, and P. Pal Chaudhuri. An Efficient Design of Non-linear CA Based PRPG for VLSI Circuit Testing. In Proceedings of ASP-DAC, pages 110–112, 2004.
145 [83] Sukanta Das, Niloy Ganguly, Biplab K. Sikdar, and P Pal Chaudhuri. Design of a Universal BIST (UBIST) Structure. In Proceedings of 16 th International Conference on VLSI Design, pages 161–166, January 2003. [84] Sukanta Das, Anirban Kundu, Subhayan Sen, Biplab K. Sikdar, and P. Pal Chaudhuri. Non-Linear Celluar Automata Based PRPG Design (Without Prohibited Pattern Set) In Linear Time Complexity. In Proceedings of Asian Test Symposium, pages 78–83, 2003. [85] Sukanta Das, Anirban Kundu, and Biplab K. Sikdar. Nonlinear CA Based Design of Test Set Generator Targeting Pseudo-Random Pattern Resistant Faults. In Proceedings of Asian Test Symposium, pages 196–201, 2004. [86] Sukanta Das, Anirban Kundu, Biplab K. Sikdar, and P. Pal Chaudhuri. Design of Nonlinear CA Based TPG Without Prohibited Pattern Set In Linear Time. JOURNAL OF ELECTRONIC TESTING: Theory and Applications, 21:95–109, January 2005. [87] Sukanta Das, Hafizur Rahaman, and Biplab K. Sikdar. Cost Optimal Design of Nonlinear CA Based PRPG for Test Applications. In Proceedings of Asian Test Symposium, pages 284–287, 2005. [88] Sukanta Das and Biplab K Sikdar. Classification of CA Rules Targeting Synthesis of Reversible Cellular Automata. In Accepted for Publication in the Proceedings of International Conference on Cellular Automata for Research and Industry, ACRI, France, 2006. [89] Sukanta Das, Biplab K Sikdar, and P Pal Chaudhuri. Characterization of Reachable/Nonreachable Cellular Automata States. In Proceedings of Sixth International Conference on Cellular Automata for Research and Industry, ACRI, The Netherlands, pages 813–822, October 2004. [90] Sukanta Das, Biplab K. Sikdar, and Parimal Pal Chaudhuri. Nonlinear CA Based Scalable Design of On-Chip TPG for Multiple Cores. In Proceedings of Asian Test Symposium, pages 331–334, 2004. [91] P. Dasgupta, S. Chattopadhyay, and I. Sengupta. An ASIC for Cellular Automata based Message Authentication. 
In Proceedings of 12 th International Conference on VLSI Design, India, pages 538–541, January 1999. [92] D. E. Denning. Cryptography and Data Security. Addison - Wesley Publishing Company, Reading, Mass, 1982. [93] B. Derrida and D. Stauffer. Phase Transitions in Two-Dimensional Kuaffman Cellular Automata Random Network Automata. Europhys. Lett. 2, 739, April 1986.
146 [94] C. Dufaza, H. Viallon, and C. Chevalier. BIST Hardware Generator for Mixed Test Scheme. In Proceedings of European Design and Test conference, Paris, page 424, 1995. [95] M. J. B. Duff and K. Preston Jr. Modern Cellular Automata : Theory and Applications. Plenum Press NY, 1984. [96] E. Moore (editor). Sequential Machines: Selected Papers. Addision-Wesley Publishing Company, Inc., Redwood City, CA, 1964. [97] B. Elspas. The theory of autonomous linear sequential networks. TRE Trans. on Circuits, CT-6(1):45–60, March 1959. [98] J. M. Epstein and R. Axtell. Growing Artificial Societies: Social Science from the Bottom Up. MIT Press, Cambridge, 1996. [99] G. Bard Ermentrout and Leah Edelstein-Keshet. Cellular automata approaches to biological modeling. Journal of Theoretical Biology, 160:97–133, Jan 1993. [100] P. C. Fischer. Generation of primes by a one-dimensional real-time iterative array. J. ACM, 12:388, 1965. [101] P. Flocchini, F. Geurts, A. Mingarelli, and N. Santoro. Convergence and Aperiodicity in Fuzzy Cellular Automata: Revisiting Rule 90. Complexity International, 6, 1998. [102] P. Flocchini, F. Geurts, and N. Santoro. CA-like error propagation in fuzzy CA. Prallel Computing, 23(11):1673–1682, November 1997. [103] L. Fortuna, M. La Rosa, D. Nicolosi, and D. Porto. Nanoscale System Dynamical Behaviors: From Quantum-Dot-Based cell to 1-d Arrays. IEEE Trans. on VLSI Systems, 12(11):1167 – 1173, November 2004. [104] E. Franti, S. Goschin, M. Dascalu, N. Catrina, and M. Dobrin. CRIPTOCEL: Design of Cellular Automata based Cipher Schemes. In Proceedings of International Conference on Communications, Circuits and Systems (ICCCAS), pages 1103 – 1107, 2004. [105] U. Frisch, B. Hasslacher, and Y. Pomeau. Lattice gas automata for the navierstokes equation. Phys. Rev. Lett., 56(14):1505–1508, 1986. [106] S. Galam. Spontaneous Coalition Forming – Why Some are Stable. 
In Proceedings of Fifth International Conference on Cellular Automata for Research and Industry, ACRI, Switzerland, pages 1–9, October 2002. [107] N. Ganguly, S. Dhar, A. K. Roy, B. K. Sikdar, and P Pal Chaudhuri. Cellular Automata Based Hamming Hash Family: Synthesis and Application. In Proceedings of 9th International Conference of Advance Computing and Communication, India, December 2001.
147 [108] N. Ganguly, P. Maji, A. Das, B. K. Sikdar, and P. Pal Chaudhuri. Characterization of Non-Linear Cellular Automata Model for Pattern Recognition. In Proceedings of AFSS International Conference on Fuzzy Systems, Calcutta, India, pages 214–220, 2002. [109] N. Ganguly, A. S. Nandi, S. Das, B. K. Sikdar, and P. Pal Chaudhuri. An Evolutionary Design of Pseudo-Random Test Pattern Generator Without Prohibited Pattern Set (pps). In Proceedings of Asian Test Symposium, pages 260–265, 2002. [110] N. Ganguly, B. K. Sikdar, J. Deb, D. Halder, and P Pal Chaudhuri. Hashing Through Cellular Automata. In Proceedings of 8 th International Conference of Advance Computing and Communication, India, December 2000. [111] Niloy Ganguly. Cellular Automata Evolution : Theory and Applications in Pattern Recognition and Classification. PhD thesis, Bengal Engineering College (a Deemed University), India, 2004. [112] Niloy Ganguly, Biplab K Sikdar, and P Pal Chaudhuri. Design of an On-Chip Test Pattern Generator Without Prohibitited Pattern Set (PPS). In Proceedings of ASP-DAC/VLSI Design 2002, India, pages 689–694, 2002. [113] Martin Gardner. The fantastic combinations of John Conway’s new solitaire game ‘Life’. Scientific American, 223:120–123, October 1970. [114] Martin Gardner. On cellular automata self-reproduction, the garden of eden and the game of ‘Life’. Scientific American, 224(2):112–117, 1971. [115] R. J. Gaylord and L. D’andra. Simulating Society: A Mathematica Toolkit for Modelling Socioeconomic Behavior. Springer: New York, 1998. [116] R. J. Gaylord and P. R. Wellin. Computer Simulation with Mathematica: Exploration in Complex Physical and Biological Systems. Springer: New York, 1995. [117] P Grassberger. Towards a quantitative theory of Self-Generated Complexity. J. Theo. Phys, 25:907938, 1986. [118] G. Grinstein, C. Jayaprakash, and Y. He. Statistical mechanics of probabilistic cellular automata. Phys. Rev. Lett., 55:2527–2530, 1985. 
[119] Sheng-Uei Guan and Syn Kiat Tan. Pseudorandom Number Generation With Self-Programmable Cellular Automata. IEEE Trans. on CAD, 23(7):1095–1101, July 2004. [120] Sheng-Uei Guan and Shu Zhang. An Evolutionary Approach to the Design of Controllable Cellular Automata Structure for Random Number Generation. IEEE Trans. on CAD, 7(1):23–36, February 2003.
148 [121] Sheng-Uei Guan, Shu Zhang, and Therese Quieta. 2-d CA Variation With Asymetric Neighborship for Psedorandom Number Generation. IEEE Trans. on CAD, 23(3):378–388, March 2004. [122] Howard Gutowitz. A hierarchical classification of CA. Physica D, 45:136, 1990. [123] I. Hartmann and G. Kemaitz. How to do weighted random testing for BIST. In Proceedings of ICCAD, pages 568–571, 1993. [124] R. Hegselmann and A. Flache. Understanding Complex Social Dynamics: A Plea for Cellular Automata Based Modelling. Journal of Artificial Societies and Social Simulation, 1(3), June 1998. [125] S. Hellebrand, H. Wunderlich, and A. Hertwig. Mixed-mode BIST using embedded processors. In Proceedings of International Test Conference, pages 195–204, October 1996. [126] J. J. Hopfield. Neural networks and physical system with emergent collective computational abilities. Proceedings of National Academic of Sciences, 79:2554– 2558, 1982. [127] J. J. Hopfield. Pattern Recognition computation using action potential timings for stimulus representations. Nature, 376:33–36, 1995. [128] P. D. Hortensius, R. D. McLeod, and H. C. Card. Parallel random number generation for VLSI systems using cellular automata. IEEE Trans. on Computers, C-38(10):1466–1473, October 1989. [129] P. D. Hortensius, R. D. McLeod, and H. C. Card. Cellular automata based signature analysis for built-in self-test. IEEE Trans. on Computers, C-39(10):1273– 1283, October 1990. [130] P. D. Hortensius, R. D. McLeod, W. Pries, and H. C. Card. Cellular Automata Based Pseudorandom Number Generators for Built-In Self-Test. IEEE Trans. on CAD, 8(8):842–859, August 1989. [131] M. Y. Hsiao. A class of optimal minimum odd-weight-column SEC-DED codes. IBM Journal of Research Development, 14(4):395–401, October 1970. [132] T. Imielinski, S. Viswanathan, and B. R. Badrinath. Data on air : Organization and access. IEEE Trans. on Knowledge and Data Engineering, 9(3), 1997. [133] S. Inokuchi, K. Honda, H. Y. Lee, T. Sato, Y. 
Mizoguchi, and Y. Kawahara. On reversible cellular automata with finite cell array. http://www.math.kyushuu.ac.jp/coe/report/pdf/2005-27.pdf, 2005. [134] S. Inokuchi and Y. Mizoguchi. Generalized Partioned Quantum Cellular Automata and Quantization of Classical CA. Int. Journal of Unconventional Computing, 1:149–160, 2005.
149 [135] G. Jacopini and G. Sontacchi. Reversible parallel computation: an evolving space-model. Theoretical Computer Science, 73:1–42, 1990. [136] E. Jen. Invariant strings and pattern recognizing properties of 1d CA. Journal of Statistical Physics, 43:243265, 1986. [137] E. Jen. Aperiodicity in one-dimensional Cellular Automata. Physica D, 45:3–18, 1990. [138] H. Juille and J. B. Pollack. Coevolutionary Learning and Design of Complex Systems. Advances in Complex System, 2(4):371–394, 2000. [139] J. Jump and J. Kirtane. On the interconnection structure of cellular automata networks. Information Control, 24:74–91, 1974. [140] D. Kagaris and S. Tragoudas. Von Neumann hybrid cellular automata for generating deterministic test sequences. ACM Transactions on Design Automation of Electronic Systems (TODAES), 6(3):308–321, 2001. [141] K. Kaneko. Complexity in basin structures and information processing by transition among attractors. Dynamical systems and nonlinear oscillations, World Scintific, 1986. [142] K. Kaneko. Lyapunov analysis and information flow in coupled map lattices. Physica D, 23:436–447, 1986. [143] J. Kari. Reversibility of 2d cellular automata is undecidable. Physica D, 45:379– 385, 1990. [144] J. Kari. Reversibility and surjectivity problems of cellular automata. Journal of Compututer and System Sciences, 48(1):149–182, February 1994. [145] J. Kari. Representation of reversible cellular automata with block permutations. Mathematical Systems Theory, 29:47–61, 1996. [146] Y Kawahara, S.Kumamoto, Y. Mizoguchi, M. Nohmi, H. Ohtuka, and T. Shoudai. Period Lengths of Cellular Automata on Square Lattices with Rule 90. J. Math. Phy, 36(3):1435–1456, April 1995. [147] D. C. Keenan and M. J. O’Brien. Competition, collusion and chaos. Journal of Economic Dynamics and Control, 17:327–353, 1993. [148] H. Kilic and L. Oktem. Low-Power Test Pattern Generator Design for BIST via Non-Uniform Cellular Automata. 
In International Symposium on VLSI Design, Automation and Test, pages 212–215, 2005. [149] S. Kirkpatrick, C. D. Gelatt Jr., and M. P. Vecchi. Optimization by Simulated Annealing. Science, 220(4598):671–680, 1983.
150 [150] T. Kitagawa. Cell space approaches in Biomathematics. Math. Biosciences, 19:27–71, 1974. [151] Donald E. Knuth. The Art of Computer Programming – Seminumerical Algorithms, volume 2. Pearson Education, third edition, 2000. [152] B. Koenemann. LFSR-Coded Test Patterns for Scan Designs. In Proceedings of IEEE Euro. Test Conf., pages 237–242, 1991. [153] S. R. Kosaraju. On some open problems in the theory of cellular automata. IEEE Trans. on Computers, C-23:561–565, 1974. [154] S. R. Kosaraju. Speed of recognition of context-free languages by array automata. SIAM J. Comput., 4(3):331–340, September 1975. [155] J. R. Koza. Genetic Programming : On the Programming of Computers by means of of Natural Selection. Cambridge MA, MIT Press, 1992. [156] P. Kurka. Languages, Equicontinuity and Attractors in Cellular Automata. Ergodic Theor. Dynamic System, 217:417–433, 1997. [157] P. Kurka. Zero-dimensional dynamical systems, formal languages, and universality. Theory of Computing Systems, 32:423–433, 1999. [158] O. Lafe. Data Compression and Encryption Using Cellular Automata Transforms. page 234, 1996. [159] C. G. Langton. Computation at the Edge of Chaos. Physica D, 42:12–37, 1990. [160] A. M. Law and W. D. Kelton. Simulation Modeling and Analysis. Tata McGrawHill Publishing Company Ltd., third edition, 2003. [161] Pierre L’Ecuyer. Uniform random number generators: A review. In Proceedings of the 1997 Winter Simulation Conference, pages 127–134, 1997. [162] Dik Lun Lee, Jianliang Xu, Baihua Zeng, and Wang-Chien Lee. Data management in location-dependent information services. IEEE Pervasive Computing, 15(2), March/April 2002. [163] D. H. Lehmer. Mathematical methods in large scale computing units. Ann. Comput. Lab. Harvard Univ., 26:141–146, 1951. [164] W. Li, N. H. Packard, and C. G. Langton. Transition Phenomena in Cellular Automata rule space. Physica D, 45:77–94, 1990. [165] Lio Liberti. Structure of the invertible CA transformations group. 
Journal of Computer and System Sciences, 59:521–536, 1999. [166] F. Liu and N. Goldenfeld. Genetic Feature of Late-Stage Crystal Growth. Phys. Rev. A., 42:895–903, 1990.
151 [167] R. Livi, G. Martinez-Mekler, and S. Ruffo. Periodic orbits and long transients in coupled map lattices. Physica D, 45:452–460, 1990. [168] Barry F. Madore and Wendy L. Freedman. Computer simulations of the belousov-zhabotinsky reaction. Science, 222:615–616, 1983. [169] M Mahajan. Studies in Language Classes defined by different Time-Varying Cellular Automata. PhD thesis, Indian Institute of Technology, Madras, 1992. [170] P. Maji and P Pal Chaudhuri. Fuzzy cellular automata for modeling pattern classifier. IEICE Transactions on Information and Systems, E88-D(4):691–702, April 2005. [171] P. Maji, N. Ganguly, and P. Pal Chaudhuri. Error Correcting Capability of Cellular Automata Based Associative Memory. IEEE Transaction on Systems, Man and Cybernetics, Part A, 33(4):466–480, 2003. [172] P. Maji, N. Ganguly, A. Das, B. K. Sikdar, and P. P. Chaudhuri. Study of non-linear cellular automata for pattern recognition. Proceedings of Cellular Automata Conference, Japan, pages 187–192, 2001. [173] F. B. Manning. An approach to highly integrated, computer-maintained cellular arrays. IEEE Trans. on Computers, C-26, 1977. [174] L. Margara, G. Mauri, G. Cattaneo, and E. Formenti. On the dynamical behavior of chaotic cellular automata. Theoretical Computer Science, 217:31–51, 1999. [175] G. Marsaglia. DIEHARD: A battery of tests of randomness. http://stat.fsu.edu/˜ geo/diehard.html, 1996.
In
[176] O. Martin, A. M. Odlyzko, and S. Wolfram. Algebraic Properties of Cellular Automata. Comm. Math. Phys., 93:219–258, 1984. [177] A. Maruoka and M. Kimura. Conditions for injectivity of global maps for tessallation automata. Info. Control, 32:158–162, 1976. [178] A. Maruoka and M. Kimura. Injectivity and surjectivity of parallel maps cellular automata. Journal Computer and System Sciences, 18:47–64, 1979. [179] A. Maruoka and M. Kimura. Conditions for injectivity of global maps for tessallation automata. Theory of Computer Science, 18:269–277, 1982. [180] Makoto Matsumoto. Simple cellular automata as pseudorandom m-sequence generators for built-in self-test. ACM Transactions on Modeling and Computer Simulation (TOMACS), 8(1):31–42, 1998. [181] E. J. McCluskey. Verification Testing - A Pseudoexhaustive Test Technique. IEEE Trans. on Computers, C33(6):541–546, June 1984.
152 [182] H. V. McIntosh. Wolfram’s Class IV automata and a good life. Physica D, 45:105–121, 1990. [183] R. D. McLeod, P. Hortensius, R. Schneider, H. C. Card, G. Bridges, and W. Pries. CALBO - Cellular Automaton Logic Block Observation. In Canadian Conference on VLSI, November 1986. [184] S. Misra. Theory and Application of Additive Cellular Automata for easily testable VLSI circuit design. PhD thesis, IIT, Kharagpur, India, 1992. [185] S. Misra, B. Mitra, and P. Pal Chaudhuri. Synthesis of self-testable sequential logic using programmable cellular automata. In Proceedings of International Conference on VLSI Design, pages 193–198, January 1992. [186] M. Mitchell, P. T. Hraber, and J. P. Crutchfield. Revisiting the Edge of Chaos: Evolving Cellular Automata to Perform Computations. Complex Systems, 7:89– 130, 1993. [187] B. Mitra, P. R. Panda, and P. Pal Chaudhuri. A flexible scheme for state assignment based on characteristics of the FSM. In Proceedings of ICCAD, pages 226–229. California, November 1991. [188] S. Mitra and S. Das(Bit). Query Processing in a Cellular Network - A Database Approach. In Proceedings of IEEE Vehicular Technology Conference, volume 4, pages 2560–2564, 2001. [189] C. Moore. Quasi Linear Cellular Automata. In Santa Fe Institute Working Paper, 1996. [190] C. Moore and T. Puin. Predicting Non-Linear Cellular Automata Quickly by Decomposing Them into Linear Ones. Physica D, 111:27–41, 1997. [191] Edward F. Moore. Machine models of self reproduction. In Arthur W. Burks, editor, Essays on Cellular Automata. University of Illinois Press, Urbana, 1970. [192] J. H. Moore and L. W. Hahn. A Cellular Automata-based Pattern Recognition Approach for Identifying Gene-Gene and Gene-Environment Interactions. American Journal of Human Genetics, 67(52), 2000. [193] J. H. Moore and L. W. Hahn. Multilocus Pattern Recognition using Onedimensional Cellular Automata and Parallel Genetic Algorithms. 
In Proceedings of the Genetic and Evolutionary Computation Conference, 2001. [194] F. J. Morales, J. P. Crutchfield, and M. Mitchell. Evolving two-dimensional Cellular Automata to Perform Density Classification : A report on work in Progress. Parallel Computing, 27:571–585, 2001. [195] K. Morita and S. Ueno. Parallel generation and parsing of array languages using reversable cellular automata. International Journal of Pattern Recognition and Artificial Intelligence, 8:543–561, 1994.
153 [196] G. Mrugalski, J. Rajski, and J. Tyszer. Cellular automata-based test pattern generators with phase shifter. IEEE Trans. on CAD, 19(8):878–893, August 2000. [197] G. Mrugalski, J. Rajski, and J. Tyszer. High Speed Ring Generators and Compactors of Test Data. In Proceedings of VTS, pages 57–62, 2003. [198] H. Muhlenbein and R. Hons. Stochastic Analysis of Cellular Automata and Voter Model. In Fifth International Conference on Cellular Automata for Research and Industry, ACRI, pages 92–103, 2002. [199] Monalisa Mukherjee, Niloy Ganguly, and P Pal Chaudhuri. Cellular automata based authentication (caa). In Proceedings of Fifth International Conference on Cellular Automata for Research and Industry, ACRI, Switzerland, pages 259– 269, 2002. [200] J. E. Myers. Random Boolean Networks - Three Recent Results. Private Communication. [201] J. Myhill. The converse of moore’s garden of eden theorem. In Proceedings of American Mathematical Society, volume 14, pages 685–686, 1963. [202] S. Nandi. Additive Cellular Automata : Theory and Application for Testable Circuit Design and Data Encryption. PhD thesis, IIT, Kharagpur, India, 1994. [203] S. Nandi and P. Pal Chaudhuri. Additive Cellular Automata as on-chip test pattern generator. In Proceedings of Second Asian Test Simposium, November 1993. [204] S. Nandi and P. Pal Chaudhuri. Analysis of periodic and intermediate boundary 90/150 cellular automata. IEEE Trans. on Computer, 45(1):1–12, January 1996. [205] S. Nandi, B. K. Kar, and P. Pal Chaudhuri. Theory and Application of Cellular Automata in Cryptography. IEEE Trans. on Computers, 43(12):346–1357, December 1994. [206] Masakazu Nasu. Local maps inducing surjective global maps of one dimensional tesselation automata. Mathematical Systems Theory, 11:327–351, 1978. [207] Danial J. Neebel and C. R. Kime. Cellular automata for weighted random pattern generation. IEEE Trans. on Computers, 46(11):1219–1229, November 1997. [208] H. Nishio. 
Real time sorting of binary numbers by one-dimensional cellular automata. Technical report, Kyoto University, 1981. [209] H. Nishio and Y. Kobuchi. Fault tolerant cellular space. Journal of Compututer and System Sciences, 11:150–170, 1975.
[210] Martin A. Nowak and Robert M. May. Evolutionary games and spatial chaos. Nature, 359:826–829, 1992.
[211] S. Omohundro. Modeling cellular automata with partial differential equations. Physica D, 10:128–134, 1984.
[212] Y. Oono and M. Kohmoto. A discrete model for chemical turbulence. Phys. Rev. Lett., 55:2927–2931, 1985.
[213] M. Ottavi, V. Vankamamidi, F. Lombardi, S. Pontarelli, and A. Salsano. Design of a QCA Memory with Parallel Read/Serial Write. In IEEE Computer Society Annual Symposium on VLSI, pages 292–294, May 2005.
[214] C. Paar. A New Architecture for a Parallel Finite Field Multiplier with Low Complexity Based on Composite Fields. IEEE Transactions on Computers, 45(7):856–861, 1996.
[215] C. Paar, P. Fleischmann, and P. Roelse. Efficient Multiplier Architectures for Galois Fields GF(2^{4n}). IEEE Transactions on Computers, 47(2):162–170, 1998.
[216] N. H. Packard. Lattice Models for Solidification and Aggregation. In First International Symposium for Science on Form, 1986.
[217] N. H. Packard. Adaptation towards the Edge of Chaos. In J. A. S. Kelso, A. J. Mandell, and M. F. Shlesinger, editors, Dynamic patterns in complex systems, pages 293–301. World Scientific, Singapore, 1988.
[218] L. Pagie and P. Hogeweg. Information Integration and Red Queen Dynamics in Coevolutionary Optimization. In Proceedings of CEC, pages 1260–1267, 2000.
[219] R. Pandey. Cellular automata approach to interacting cellular network models for the dynamics of cell population in an early HIV infection. Physica A, 179:442–470, 1991.
[220] J. Paredis. Coevolving Cellular Automata: Beware of Red Queen! In Proceedings of ICGA VII, pages 393–399, 1997.
[221] K. Paul. Theory and Application of GF(2^p) Cellular Automata. PhD thesis, Bengal Engineering College (a Deemed University), India, 2002.
[222] K. Paul, D. Roy Choudhury, and Parimal Pal Chaudhuri. Theory of Extended Linear Machines. IEEE Transactions on Computers, 51(9):1106–1110, 2002.
[223] K. Paul and D. Roy Chowdhury. Application of GF(2^p) CA in Burst Error Correcting Codes. In Proceedings of International Conference on VLSI Design, pages 562–567, January 2000.
[224] K. Paul, D. Roy Chowdhury, and P. Pal Chaudhuri. Cellular Automata Based Transform Coding for Image Compression. In Proceedings of International Conference on High Performance Computing (HiPC), pages 269–273, December 1999.
[225] K. Paul, D. Roy Chowdhury, and P. Pal Chaudhuri. Scalable Pipelined MicroArchitecture for Wavelet Transform. In Proceedings of International Conference on VLSI Design, pages 144–147, January 2000.
[226] K. Paul, P. Dutta, D. Roy Chowdhury, P. K. Nandi, and P. Pal Chaudhuri. A VLSI Architecture for On-Line Image Decompression using GF(2^8) Cellular Automata. In Proceedings of International Conference on VLSI Design, pages 532–537, January 1999.
[227] K. Paul, A. Roy, P. K. Nandi, B. N. Roy, M. Deb Purkhayastha, S. Chattopadhyay, and P. Pal Chaudhuri. Theory and Application of Multiple Attractor Cellular Automata for Fault Diagnosis. In Proceedings of ATS, Singapore, December 1998.
[228] Y. Pomeau. Invariant in cellular automata. J. Phys. A, 17:L415–L418, 1986.
[229] D. K. Pradhan and Mitrajit Chatterjee. GLFSR–A New Test Pattern Generator for Built-in-Self-Test. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 18(2):319–328, 1999.
[230] K. Preston, M. J. Duff, S. Levialdi, Ph. E. Norgren, and J. I. Toriwaki. Basics of cellular logic with some applications in medical image processing. Proceedings of IEEE, 67(5):826–857, 1979.
[231] A. Provota and C. Nicolis. A Microscopic Aggregation Model of Droplet Dynamics in Warm Clouds. J. Stat. Phys., 74:75–89, 1994.
[232] R. Raghavan. Cellular automata in pattern recognition. Information Science, 70:145–177, 1993.
[233] J. Rajski, N. Tamarapalli, and J. Tyszer. Automated Synthesis of Large Phase Shifters for Built-In Self-Test. In Proceedings of International Test Conference, pages 1047–1056, 1998.
[234] Q. Ren and M. H. Dunham. Using semantic caching to manage location dependent data in mobile computing. In International Conference on Mobile Computing and Networking, Mobicom, pages 210–221, August 2000.
[235] L. V. Reshodko and Z. Drska. Biological systems of cellular organization and their computer models. Journal of Theo. Biology, page 563, 1977.
[236] M. Resnick. Turtles, Termites and Traffic Jams. MIT Press, 1994.
[237] F. C. Richards, T. P. Meyer, and N. H. Packard. Extracting Cellular Automata Rules directly from Experimental Data. Physica D, 45:189–202, 1990.
[238] D. Richardson. Tessellations with local transformations. Journal of Computer and System Sciences, 5:373–388, 1972.
[239] M. Roncken, K. Stevens, and P. Pal Chaudhuri. CA-BIST for Asynchronous Circuits: A Case Study on RAPPID Asynchronous Instruction Length Decoder. In Proceedings of 6th International Symposium on Advanced Research in Asynchronous Circuits and Systems, Eilat, Israel, 2000.
[240] A. Rosenfeld. Picture Languages. Academic, New York, 1979.
[241] J. M. Sakoda. The Checkerboard Model for Social Interaction. J. Math. Socio., 1:119–132, 1971.
[242] Palash Sarkar and Rana Barua. Multi-dimensional σ-automata, π-polynomial and generalized s-matrices. Theoretical Computer Science, 197(1-2):111–138, 1998.
[243] Palash Sarkar and Rana Barua. The set of Reversible 90/150 Cellular Automata Is Regular. Discrete Applied Math, 84(1-3):199–213, 1998.
[244] Tadakazu Sato and Namio Honda. Certain relations between properties of maps of tessellation automata. Journal of Computer and System Sciences, 15:121–145, 1977.
[245] Thomas C. Schelling. Dynamic models of segregation. Journal of Mathematical Sociology, 1(2):143–186, June 1971.
[246] B. Schonfisch and M. Kinder. A Fish Migration Model. In Fifth International Conference on Cellular Automata for Research and Industry, ACRI, pages 210–219, 2002.
[247] B. Schonfisch and A. De Roos. Synchronous and Asynchronous updating in Cellular Automata. Biosystems, 51:123–143, 1999.
[248] J. Seiferas. Observations on nondeterministic multidimensional iterative arrays. ACM Symposium on the Theory of Computing, ACM Press, New York, pages 276–289, 1982.
[249] Subhayan Sen, Chandrama Shaw, Dipanwita Roy Chowdhury, Niloy Ganguly, and P. Pal Chaudhuri. Cellular Automata Based Cryptosystem (CAC). In Proceedings of ICICS, pages 303–314, 2002.
[250] M. Serra. Algebraic analysis and algorithms for linear cellular automata over GF(2) and the applications to digital circuit testing. Congressus Numerantium, 75:127–139, 1990.
[251] M. Serra and G. L. Chen. Pseudo-random pattern generation and fault coverage of delay faults with non linear finite state machines with high entropy. In Proceedings of IEEE On-Line Testing Workshop, Crete, Greece, pages 66–77, 1997.
[252] M. Serra and T. Slater. A Lanczos Algorithm in a Finite Field and its Application. Journal of Combinatorial Mathematics and Combinatorial Computing, pages 11–32, April 1990.
[253] M. Serra, T. Slater, J. C. Muzio, and D. M. Miller. Analysis of one dimensional cellular automata and their aliasing probabilities. IEEE Trans. on CAD, 9(7):767–778, July 1990.
[254] C. Shaw, D. Chatterjee, P. Maji, S. Sen, and P. Pal Chaudhuri. A Pipeline Architecture For Encompression (Encryption + Compression) Technology. In Proceedings of 16th International Conference on VLSI Design, pages 277–282, January 2003.
[255] M. Shereshevsky. Lyapunov Exponent for one-dimensional Cellular Automata. J. Nonlinear Science, 2:1–8, 1992.
[256] H. Sieburg, X. McCutchan, X. Clay, X. Cabalerro, and X. Ostlund. Simulation of HIV infection in artificial immune systems. Physica D, 45:208–227, 1990.
[257] Biplab K. Sikdar. Theory and Applications of Hierarchical Cellular Automata for VLSI Circuit Testing. PhD thesis, Bengal Engineering College (a Deemed University), 2003.
[258] Biplab K. Sikdar, Niloy Ganguly, and P. Pal Chaudhuri. Design of Hierarchical Cellular Automata For On-Chip Test Pattern Generator. IEEE Trans. on CAD, 21(12):1530–1533, December 2002.
[259] Biplab K. Sikdar, Niloy Ganguly, and P. Pal Chaudhuri. Fault Diagnosis of VLSI Circuits with Cellular Automata Based Pattern Classifier. IEEE Trans. on CAD, 24(7):1115–1131, July 2005.
[260] Biplab K. Sikdar, Niloy Ganguly, Purnabha Majumder, and P. Pal Chaudhuri. Design of multiple attractor GF(2^p) cellular automata for diagnosis of VLSI circuits. In Proceedings of 14th International Conference on VLSI Design, pages 454–459, January 2001.
[261] M. Sipper. Evolution of Parallel Cellular Machines: The Cellular Programming Approach. Springer-Verlag, Heidelberg, 1997.
[262] M. Sipper, E. Sanchez, D. Mange, M. Tomassini, A. Pérez-Uribe, and A. Stauffer. A phylogenetic, ontogenetic, and epigenetic view of bio-inspired hardware systems. IEEE Trans. Evolutionary Computation, 1(1):83–97, 1997.
[263] M. Sipper and M. Tomassini. Generating parallel random number generators by cellular programming. Intl. J. Modern Phys., 7(2):180–190, 1996.
[264] A. Smith. Introduction to and Survey of Polyautomata Theory. Automata, Languages, Development, North Holland Publishing Co., 1976.
[265] S. Smith, R. Watt, and R. Hameroff. Cellular automata in cytoskeletal lattices. Physica D, 10:168, 1984.
[266] S. R. Sternberg. Language and architecture for parallel image processing, page 35. North Holland Publishing Co., Amsterdam, 1980.
[267] H. Stone. Linear Machines. Princeton Univ. Press, 1965.
[268] X. Sun, E. Konotopidi, M. Serra, and J. C. Muzio. The concatenation and partitioning of linear finite state machines. International Journal of Electronics, 78(5):809–839, 1995.
[269] K. Sutner. Additive automata on graphs. Complex Systems, 2:649–661, 1988.
[270] R. C. Tausworthe. Random numbers generated by linear recurrence modulo two. Mathematics of Computation, 19:201–209, 1965.
[271] A. Tettamanzi and Marco Tomassini. Evolutionary Algorithms and their Applications. Bio-Inspired Computing Machines: Towards Novel Computing Architectures, 1998.
[272] S. Tezuka and M. Fushimi. A Method of Designing Cellular Automata as Pseudorandom Number Generators for Built-in Self-Test for VLSI. Finite Fields: Theory, Applications and Algorithms, Contemporary Mathematics AMS, 168:363–367, 1994.
[273] J. Thatcher. Universality in Von Neumann cellular model. In Tech. Report 03105-30-T, ORA, University of Michigan, 1964.
[274] Trinh Xuan Thuan. Chaos and Harmony. Oxford University Press, 2001.
[275] T. Toffoli. Cellular Automata Machines. PhD thesis, Univ. of Michigan, 1977.
[276] T. Toffoli. Computation and construction universality of reversible cellular automata. Journal of Computer and System Sciences, 15:213–231, 1977.
[277] T. Toffoli. Computation and construction universality of reversible cellular automata. J. Comput. System Sci., 15:213–231, 1977.
[278] T. Toffoli. CAM: A high-performance cellular automata machine. Physica D, 10:195, 1984.
[279] T. Toffoli. Cellular automata as an alternative to (rather than an approximation of) differential equations in modeling physics. Physica D, 10:117, 1984.
[280] T. Toffoli and N. Margolus. Cellular Automata Machines. The MIT Press, 1987.
[281] T. Toffoli and N. H. Margolus. Invertible cellular automata: A review. Physica D, 45:229–253, 1990.
[282] Marco Tomassini. The Parallel Genetic Cellular Automata: Application to Global Function Optimization. In Proceedings of Fifth International Conference on Artificial Neural Networks and Genetic Algorithm, pages 385–391, 1993.
[283] Marco Tomassini, Moshe Sipper, and Mathieu Perrenoud. On the generation of high-quality random numbers by two-dimensional cellular automata. IEEE Transactions on Computers, 49(10):1146–1151, 2000.
[284] Marco Tomassini, Moshe Sipper, M. Zolla, and Mathieu Perrenoud. Generating high-quality random numbers in parallel by cellular automata. Future Gen. Compt. Syst., 16:291–305, 1999.
[285] Marco Tomassini and Mattias Venzi. Artificially Evolved Asynchronous Cellular Automata for the Density Task. In Proceedings of Fifth International Conference on Cellular Automata for Research and Industry, ACRI, Switzerland, pages 44–55, October 2002.
[286] Ph. Tsalides. Cellular Automata based Built-In Self-Test Structures for VLSI Systems. Elect. Lett., 26(17):1350–1352, 1990.
[287] Ph. Tsalides, T. A. York, and A. Thanailakis. Pseudo-random Number Generators for VLSI Systems based on Linear Cellular Automata. IEE Proc. E. Comput. Digit. Tech., 138(4):241–249, 1991.
[288] Alan Turing. On computable numbers, with an application to the Entscheidungsproblem. Proceedings of London Math. Soc., Ser. 2, 42:230–265, 1936.
[289] P. Tzionas, Ph. Tsalides, and A. Thanailakis. Design and VLSI implementation of a Pattern Classifier using pseudo 2D Cellular Automata. IEE Proc. G, 139(6):661–668, December 1992.
[290] G. Vichniac. Simulating physics with cellular automata. Physica D, 10:96, 1984.
[291] P. Vitanyi. Sexually reproducing cellular automata. Math. Biosciences, 18:23–54, 1973.
[292] John von Neumann. Various techniques used in connection with random digits. Natl. Bur. Std. Appl. Math. Ser., 12:36–38, 1951.
[293] John von Neumann. Probabilistic logics and the synthesis of reliable organisms from unreliable components. J. von Neumann’s Collected Works, A. Taub, Ed., 1963.
[294] John von Neumann. The theory of self-reproducing Automata, A. W. Burks, ed. Univ. of Illinois Press, Urbana and London, 1966.
[295] B. Voorhees. Nearest neighbour cellular automata over Z_2 with periodic boundary conditions. Physica D, 45:26–35, 1990.
[296] C. C. Walker. Attractor dominance patterns in sparsely connected boolean nets. Physica D, 45:441–451, 1990.
[297] C. C. Walker and W. R. Ashby. On the Temporal Characteristics of Behavior in Certain Complex Systems. Kybernetik, 3:100–108, 1966.
[298] J. Watrous. On one-dimensional quantum cellular automata. In 36th Annual Symposium on Foundations of Computer Science, pages 528–537, 1995.
[299] A. Winfree, E. Winfree, and H. Seifert. Organizing centers in a cellular excitable medium. Physica D, 17:109–115, 1985.
[300] S. Wolfram. Statistical mechanics of cellular automata. Rev. Mod. Phys., 55(3):601–644, July 1983.
[301] S. Wolfram. Computation theory of cellular automata. Commun. Math. Phys., 96:15–57, 1984.
[302] S. Wolfram. Universality and Complexity in cellular automata. Physica D, 10:1–35, 1984.
[303] S. Wolfram. Undecidability and intractability in theoretical physics. Phys. Rev. Lett., 54:735–738, 1985.
[304] S. Wolfram. Random sequence generation by cellular automata. Advances in Applied Mathematics, pages 123–169, 1986.
[305] S. Wolfram. Theory and applications of cellular automata. World Scientific, Singapore, 1986. ISBN 9971-50-124-4 pbk.
[306] S. Wolfram. High speed computing: Scientific Application and Algorithm Design, ed. Robert B. Wilhelmson. University of Illinois Press, 1988.
[307] S. Wolfram. Cellular Automata and Complexity: Collected Papers. Addison Wesley, 1994.
[308] S. Wolfram. A New Kind of Science. Wolfram Media, Inc., 2002.
[309] W. K. Wootters and C. G. Langton. Is there a sharp transition for deterministic cellular automata. Physica D, 45:95–104, 1985.
[310] A. Wuensche. Visible Learning: Sculpting the Basin of Attraction Fields of Random Boolean Network. Computing with Logical Neurons, 1993.
[311] A. Wuensche. Complexity in One-D Cellular Automata. Santa Fe Institute Working Paper 94-04-025, 1994.
[312] A. Wuensche. Classifying Cellular Automata Automatically. Santa Fe Institute Working Paper 98-02-018, 1998.
[313] A. Wuensche and M. J. Lesser. The Global Dynamics of Cellular Automata. Santa Fe Institute Studies in the Science of Complexity, Addison Wesley, 1992.
[314] J. Xu and D. L. Lee. Querying Location-dependent Data in Wireless Cellular Environment. In http://www.w3.org/Mobile/posdep/query xujl.html.
[315] J. Xu and D. L. Lee. Querying location-dependent data in wireless cellular environment. In W3C and WAP Workshop on Position Dependent Information Services, February 2000.
[316] J. Xu, X. Tang, and D. L. Lee. Performance analysis of location-dependent cache invalidation schemes for mobile environments. IEEE Trans. on Knowledge and Data Engineering, 15(2):474–488, March/April 2003.
[317] J. Xu, B. Zhang, W. Lee, and D. L. Lee. Energy Efficient Index for Querying Location Dependent Data in Mobile Broadcast Environments. In 19th International Conference on Data Engineering, Bangalore, India, pages 239–250, 2003.
[318] Takeo Yaku. Inverse and injectivity of parallel relations induced by cellular automata. In Proceedings of the American Mathematical Society, volume 58, pages 216–220, 1976.
[319] D. Young. A local activator-inhibitor model of vertebrate skin patterns. Math Biosciences, 72:51–58, 1984.
[320] B. Zheng and D. L. Lee. Processing location dependent queries in a multi-cell wireless environment. In ACM Press, pages 54–65, 2001.
[321] M. Zwick and H. Shu. Set Theoretic Reconstructibility of Elementary Cellular Automata. Advances in Systems Science and Applications, Special Issue, 1:31–36, 1995.
Author’s Publications
1. Sukanta Das and Biplab K Sikdar: Classification of CA Rules Targeting Synthesis of Reversible Cellular Automata. Accepted for Publication in Proceedings of International Conference on Cellular Automata for Research and Industry, ACRI, France, 2006.
2. Chandrama Shaw, Sukanta Das and Biplab K Sikdar: Cellular Automata Based Encoding Technique for Wavelet Transformed Data Targeting Still Image Compression. Accepted for Publication in Proceedings of International Conference on Cellular Automata for Research and Industry, ACRI, France, 2006.
3. Sukanta Das, Hafizur Rahaman and Biplab K Sikdar: Cost Optimal Design of Nonlinear CA Based PRPG for Test Applications. IEEE 14th Asian Test Symposium, Kolkata, December 18-21, 2005.
4. Sukanta Das, Sipra Das (Bit) and Biplab K Sikdar: Non-linear Cellular Automata Based Design of Query Processor for Mobile Network. IEEE SMC conference, Hawaii, October 10-12, pages 2751-2756, 2005.
5. Sukanta Das, Anirban Kundu, Biplab K. Sikdar, P. Pal Chaudhuri: Design of Nonlinear CA Based TPG Without Prohibited Pattern Set In Linear Time. Journal of Electronic Testing: Theory and Applications 21, pages 97-109, 2005.
6. Biplab K. Sikdar, Sukanta Das, Samir Roy, Niloy Ganguly, Debesh K. Das: Cellular Automata Based Test Structures with Logic Folding. VLSI Design, India, pages 71-74, 2005.
7. Sukanta Das, Biplab K. Sikdar, P. Pal Chaudhuri: Characterization of Reachable/Nonreachable Cellular Automata States. International Conference on Cellular Automata for Research and Industry, ACRI, pages 813-822, The Netherlands, 2004.
8. Sukanta Das, Debdas Dey, Subhayan Sen, Biplab K. Sikdar, Parimal Pal Chaudhuri: An efficient design of non-linear CA based PRPG for VLSI circuit testing. ASP-DAC, Japan, pages 110-112, 2004.
9. Sukanta Das, Anirban Kundu, Biplab K. Sikdar: Nonlinear CA Based Design of Test Set Generator Targeting Pseudo-Random Pattern Resistant Faults. Asian Test Symposium, pages 196-201, Taiwan, 2004.
10. Sukanta Das, Biplab K. Sikdar, Parimal Pal Chaudhuri: Nonlinear CA Based Scalable Design of On-Chip TPG for Multiple Cores. Asian Test Symposium, pages 331-334, Taiwan, 2004.
11. Sukanta Das, Anirban Kundu, Subhayan Sen, Biplab K. Sikdar, Parimal Pal Chaudhuri: Non-Linear Cellular Automata Based PRPG Design (Without Prohibited Pattern Set) In Linear Time Complexity. Asian Test Symposium, pages 78-83, China, 2003.
12. Sukanta Das, Niloy Ganguly, Biplab K. Sikdar, Parimal Pal Chaudhuri: Design Of A Universal BIST (UBIST) Structure. VLSI Design, pages 161-166, India, 2003.
13. Niloy Ganguly, Anindyasundar Nandi, Sukanta Das, Biplab K. Sikdar, Parimal Pal Chaudhuri: An Evolutionary Strategy To Design An On-Chip Test Pattern Generator Without Prohibited Pattern Set (PPS). Asian Test Symposium, pages 260-265, Guam, 2002.