Computers and Intelligent Systems

ISAST Transactions on Computers and Intelligent Systems
No. 2, Vol. 1, 2009 (ISSN 1798-2448)

Contents

A. Javadi, M. Mehravar, A. Faramarzi and A. Ahangar-Asr: An Artificial Intelligence Based Finite Element Method ………… 1
D. Peters, D. Raskovic and D. Thorsen: An Energy Efficient Parallel Embedded System for Small Satellite Applications ………… 8
K. Pyragas, V. Pyragas and T. Pyragiene: Control and Synchronization of Dynamical Systems via a Time-Delay Feedback ………… 17
A. Hedman and H. Alm: Testing Image Browsers - An Analysis of Layout and Presentation Factors that Affect Usability ………… 23
N. Bouhmala and X. Cai: A Multilevel Approach for the Satisfiability Problem ………… 29
J. C. Chedjou, K. Kyamakya, M. A. Latif, U. A. Khan, I. Moussa and Do Trong Tuan: Solving Stiff Ordinary Differential Equations and Partial Differential Equations Using Analog Computing Based on Cellular Neural Networks ………… 38
A. Caneco, C. Gracio, S. Fernandes, J. L. Rocha and C. Ramos: Synchronizability and Graph Invariants ………… 47
C. Adams and M. Rodrigues: Dealing with Non-Policy-Conformant Requests in Credential Systems ………… 53
M. Ohba, K. Matsuoka and T. Ohta: Eliciting State Transition Diagrams from Programs Described in a Rule-based Language ………… 58
K. S. Thampatty, M. P. Nandakumar and E. P. Cheriyan: RTRL Based Multivariable Neuro-controller for Non-linear Systems ………… 67
H. Rashidi and Z. Rashidi: A Simple Technique for Implementation of Coroutines in Programming Languages ………… 75
J. Yang, G. Zhao, L. Ray and S. Huang: Analyzing and Correlating Interactive Sessions with One-Dimensional Random Walk to Detect Stepping-Stone Intrusion ………… 78
A. Mutazono, M. Sugano and M. Murata: Self-organizing Anti-phase Synchronization Scheme for Sensor Networks Inspired by Frogs' Calling Behavior ………… 86
N. Auluck: Improving the Schedulability of Hybrid Real Time Heterogeneous Networks of Workstations (NOWs) ………… 94
L. Coppolino, V. Vianello, S. Giordano and M. Belfiore: Using Real Time Data Streaming to Build a GPS Localization System ………… 97
J. Zhang and J. Meng: Improved Pilot Aided Mobile Channel Estimation for OFDM Systems under Narrowband Interference ………… 104
B. Ghavami and H. Pedram: Automatic Slack Matching of Asynchronous Circuits Utilized in the Synthesis Tool Framework ………… 111


ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009. A. Javadi et al.: An Artificial Intelligence Based Finite Element Method

An Artificial Intelligence Based Finite Element Method

Akbar A. Javadi, Moura Mehravar, Asaad Faramarzi and Alireza Ahangar-Asr

Abstract—In this paper, a new approach based on artificial intelligence and evolutionary computing is presented for constitutive modeling of materials in finite element analysis, with potential applications in different engineering disciplines. The approach provides a unified framework for constitutive modeling of complex materials in finite element analysis using evolutionary polynomial regression (EPR). EPR is a data-driven method, based on evolutionary computing, that searches for polynomial structures representing a system. A procedure is presented for the construction of an EPR-based constitutive model (EPRCM) and its integration in the finite element procedure. The main advantage of the EPRCM over conventional and neural network-based constitutive models is that it provides the optimum structure for the material constitutive model, as well as its parameters, directly from raw experimental (or field) data. It can learn nonlinear and complex material behavior without any prior assumption on the constitutive relationship. The proposed approach provides a transparent relationship for the constitutive material model that can readily be incorporated in a finite element model. A procedure is presented for efficient training of the EPR, computing the stiffness matrix using the trained EPR model, and incorporating the EPRCM in a commercial finite element code, ABAQUS. The application of the developed EPR-based finite element method is illustrated through two examples, and the advantages of the proposed method over conventional and neural network-based FE methods are highlighted.

Index Terms—Constitutive Modeling, Data Mining, Evolutionary Computation, Finite Elements

Manuscript received November 2, 2009. Akbar A. Javadi is a Senior Lecturer in Geotechnical Engineering at the University of Exeter, School of Engineering, Mathematics and Physical Sciences, Exeter, EX4 4QF, UK (corresponding author; phone: +44 1392 263640; fax: +44 1392 217965; e-mail: [email protected]). Moura Mehravar is an MSc student in Civil Engineering, Department of Civil Engineering, Azad University, South Tehran Branch, Tehran, Iran (e-mail: [email protected]). Asaad Faramarzi is a PhD student in Geotechnical Engineering at the University of Exeter, School of Engineering, Mathematics and Physical Sciences, Exeter, EX4 4QF, UK (e-mail: [email protected]). Alireza Ahangar-Asr is a PhD student in Geotechnical Engineering at the University of Exeter, School of Engineering, Mathematics and Physical Sciences, Exeter, EX4 4QF, UK (e-mail: [email protected]).

I. INTRODUCTION

Simulation techniques, and in particular the finite element method, have been used successfully to predict the response of systems across a whole range of industries, including aerospace, automotive, biomedical, chemical processing, geotechnical engineering and many others. In such numerical analyses, the behavior of the actual material is approximated with that of an idealized material that deforms in accordance with some constitutive relationship. Therefore, the choice of an appropriate constitutive model that adequately describes the behavior of the material plays an important role in the accuracy and reliability of the numerical predictions. During the past few decades several constitutive models have been developed for various materials. Most of these models involve the determination of material parameters, many of which have little physical meaning [1]. Despite the considerable complexity of constitutive theories, due to the erratic and complex nature of some materials such as soils, rocks, composites, etc., none of the existing constitutive models can completely describe the real behavior of these materials under various stress paths and loading conditions. In conventional constitutive modeling, an appropriate mathematical model is initially selected, and the parameters of the model (material parameters) are identified from appropriate physical tests on representative samples to capture the material behavior. When these constitutive models are used in finite element analysis, the accuracy with which the selected material model represents the various aspects of the actual material behavior affects the accuracy of the finite element prediction. In the past few decades, attempts have been made by a number of researchers to use artificial neural networks (ANNs) to model constitutive material behavior. The application of ANNs to constitutive modeling of concrete was first proposed by Ghaboussi et al. [2].
Ghaboussi and Sidarta [3] presented an improved technique of ANN approximation for learning the mechanical behavior of drained and undrained sand. The role of ANN in constitutive modeling was also studied by a number of other researchers (e.g., [4-9]). These studies indicated that neural network-based constitutive models can capture nonlinear material behavior with high accuracy. While it has been shown that ANNs offer great advantages in constitutive



modeling of materials, they also have some drawbacks. One of the main disadvantages of the NNCM approach is that the optimum structure of the ANN (e.g., number of layers, number of neurons in the hidden layers, transfer functions, etc.) must be identified a priori, which is usually done through a time-consuming trial-and-error procedure. Another drawback of the NNCM approach is the large complexity of the network structure, as it represents the knowledge in terms of a weight matrix and biases that are not accessible to the user [10]. Although one of the main applications of material modeling is in the numerical analysis of boundary value problems, to date not many researchers have considered the integration of neural network based constitutive models (NNCMs) in numerical modeling techniques such as the finite element method [1]. The main reason for this appears to be the considerable difficulty of incorporating a general NNCM in finite element codes [11]. However, more recently it has been shown that an NNCM can be practically incorporated in a finite element code as a material model (e.g., [12], [13]). Hashash et al. described some of the issues related to the numerical implementation of an NNCM in finite element analysis and derived a closed-form solution for the material stiffness matrix of a neural network-based constitutive model [13]. Javadi and his coworkers have carried out extensive research into the application of neural networks to constitutive modeling of complex materials. They have developed an intelligent finite element method (NeuroFE code) based on the incorporation of a back-propagation neural network (BPNN) in finite element analysis (e.g., [14]-[17]). In their work, they used actual material test results to extract the stress-strain relationship and to train the NNCM.
It has been shown that NNCMs trained in this way can be very efficient in learning and generalizing the constitutive behavior of complex materials and give better results (compared with conventional constitutive models) when they are employed in a finite element code to analyse structures or domains made of the material under consideration. In this paper a new approach is introduced for constitutive modeling of complex materials, which integrates numerical and symbolic regression to perform evolutionary polynomial regression (EPR). The strategy uses polynomial structures to take advantage of their favorable mathematical properties. The main idea behind the EPR is to use evolutionary search for exponents of polynomial expressions by means of a genetic algorithm (GA) engine. This allows (i) easy computational implementation of the algorithm, (ii) efficient search for an explicit expression (formula) and (iii) improved control of the complexity of the expression generated [18]. In what follows, the main principles of EPR will be outlined. A procedure is presented for computing the stiffness matrix using the trained EPR model and incorporation of the EPRCM in the finite element software ABAQUS. The application of the developed EPR-based finite element

method is illustrated through two examples.

II. EVOLUTIONARY POLYNOMIAL REGRESSION

Evolutionary polynomial regression (EPR) is a data-driven method based on evolutionary computing, aimed at finding polynomial structures representing a system. A general EPR expression can be presented as [18]:

y = Σ_{j=1}^{n} F(X, f(X), a_j) + a_0   (1)

where y is the estimated vector of outputs of the process; a_j is a constant; F is a function constructed by the process; X is the matrix of input variables; f is a function defined by the user; and n is the number of terms of the target expression. The general functional structure represented by F(X, f(X), a_j) + a_0 is constructed from elementary functions by EPR using a genetic algorithm (GA) strategy. The GA is employed to select the useful input vectors from X to be combined. The building blocks (elements) of the structure of F are defined by the user based on understanding of the physical process. While the selection of feasible structures to be combined is carried out through an evolutionary process, the parameters a_j are estimated by the least squares method.

EPR is a technique for data-driven modeling. In this technique, the combination of the genetic algorithm to find feasible structures and the least squares method to find the appropriate constants for those structures brings several advantages. In particular, the GA allows a global exploration of the error surface relevant to specifically defined objective functions. By using such objective (cost) functions, criteria can be selected to be satisfied through the search process. These criteria can be set in order to (i) avoid over-fitting of models, (ii) push the models towards simpler structures, and (iii) avoid unnecessary terms representative of noise in the data. An interesting feature of EPR is the possibility of obtaining more than one model for a complex phenomenon. A further feature of EPR is the high level of interactivity between the user and the methodology. The user's physical insight can be used to make hypotheses on the elements of the target function and on its structure (Eq. (1)). Selecting an appropriate objective function, assuming pre-selected elements in Eq. (1) based on engineering judgment, and working with dimensional information enable refinement of the final models. A detailed explanation can be found in [18], [19].
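The hybrid search described above (a GA proposing polynomial structures, with least squares fitting the constants a_j for each candidate) can be illustrated with a minimal sketch. This is not the EPR implementation of [18]; the population size, mutation scheme and toy target function are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def design_matrix(X, exponents):
    # One column per polynomial term (products of the inputs raised to the
    # candidate integer exponents), plus a constant column for the bias a0.
    cols = [np.prod(X ** e, axis=1) for e in exponents]
    cols.append(np.ones(len(X)))
    return np.column_stack(cols)

def fit_structure(X, y, exponents):
    # For a fixed structure, the coefficients a_j follow from a linear
    # least squares problem: this is the key idea behind EPR's hybrid search.
    A = design_matrix(X, exponents)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    sse = float(np.sum((A @ coef - y) ** 2))
    return coef, sse

def evolve(X, y, n_terms=3, pop_size=30, generations=40, max_exp=3):
    # GA over integer exponent matrices of shape (n_terms, n_inputs).
    n_inputs = X.shape[1]
    population = [rng.integers(0, max_exp + 1, size=(n_terms, n_inputs))
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Point-mutate one exponent of every individual ...
        children = []
        for parent in population:
            child = parent.copy()
            child[rng.integers(n_terms), rng.integers(n_inputs)] = \
                rng.integers(0, max_exp + 1)
            children.append(child)
        # ... and keep the structures with the smallest residual.
        population = sorted(population + children,
                            key=lambda e: fit_structure(X, y, e)[1])[:pop_size]
    best = population[0]
    return best, fit_structure(X, y, best)

# Toy data generated from y = 2 * x0^2 * x1 + 5; the search should recover
# the exponent pair (2, 1) and drive the residual towards zero.
X = rng.uniform(1.0, 2.0, size=(200, 2))
y = 2.0 * X[:, 0] ** 2 * X[:, 1] + 5.0
best, (coef, sse) = evolve(X, y)
```

A real EPR engine adds crossover, multi-objective control of model complexity and user-defined inner functions f; the sketch keeps only the structure-search/least-squares split.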
III. CONSTITUTIVE MODELLING USING EPR

In constitutive modeling using EPR, the raw experimental or in-situ test data are used directly for training the EPR. In this approach, there is no mathematical model to select, and the EPR learns the constitutive relationships directly



from the raw data during the training process. As a result, there are no material parameters to be identified and, as more data becomes available, the material model can be improved by re-training the EPR using the additional data. Furthermore, the incorporation of an EPR in a finite element procedure avoids the need for complex yielding/failure functions, flow rules, etc. An EPR equation can be incorporated in a finite element code/procedure in the same way as a conventional constitutive model. It can be incorporated using either incremental or total stress-strain strategies. In this study both the incremental and total stress-strain strategies have been successfully implemented in the intelligent finite element model.

A. Input and output parameters

The choice of input and output quantities is determined by both the source of the data and the way the trained EPR model is to be used. A typical scheme to train most neural network based material models includes an input set providing the network with information on the current state (e.g., current stresses and current strains); a forward pass through the neural network then yields the prediction of the next expected state of stress and/or strain relevant to an input strain or stress increment [7]. In this paper, a method is introduced based on EPR for constitutive modeling in finite element analysis. This method takes advantage of the explicit mathematical representation of the relationships in EPR. To evaluate the potential of using EPR to derive functions describing the constitutive behavior of materials, two sets of stress-strain data are employed to train and test the EPR models. In both cases the data is divided into two separate sets. One set is used for training of the EPR model and the other is used for validation, to appraise the generalization capability of the trained EPR model. After training and validation, the best function is selected based on the quality of fit according to the coefficient of determination (COD) and also how well the selected model represents the actual stress-strain behavior. The first set of stress-strain data represents a material with linear elastic behavior. The selected EPR model for the curve passing through the data points is:

σ = 8.136×10^13 ε^5 − 2.839×10^11 ε^4 + 3.115×10^8 ε^3 − 8.285×10^4 ε^2 + 2.10×10^11 ε − 5.921×10^−3   (2)

where ε is the strain and σ is the corresponding stress. The second set of data corresponds to a material with elasto-plastic behavior. After training the EPR, Eq. 3 is selected as the best EPR model based on the COD:

σ = 2.167×10^11 ε − 6.297×10^15 ε^3 + 1.034×10^18 ε^4 − 8.032×10^19 ε^5 + 3.441×10^21 ε^6 − 8.024×10^22 ε^7 + 8.766×10^23 ε^8 − 2.545×10^24 ε^9 − 6.505×10^6   (3)

[Fig. 1. Results of the EPR models' predictions and the original data: (a) linear elastic material, stress (MPa) versus strain (up to about 0.001); (b) elasto-plastic material, stress (MPa) versus strain (up to about 0.025).]

Figs. 1a and 1b show the stress-strain curves predicted by Eqs. 2 and 3 (as marker points) against those expected. It can be seen from these figures that EPR has successfully captured the material behavior with excellent accuracy. The material model in an FE analysis has to provide the material stiffness matrix, also known as the Jacobian. For infinitesimal strain increments (dε), J is the continuum Jacobian, J_c:

J_c = ∂(dσ) / ∂(dε)   (4)

This equation will be employed later to build the stiffness matrix.
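The COD-based validation step described above can be computed directly. The sketch below evaluates the linear elastic EPR model of Eq. 2 against a hypothetical validation set (synthetic data with an assumed 1% noise level over the strain range of Fig. 1a, not the authors' experimental data):

```python
import numpy as np

def epr_stress(eps):
    # The EPR model of Eq. 2 (strain eps dimensionless, stress in Pa).
    return (8.136e13 * eps**5 - 2.839e11 * eps**4 + 3.115e8 * eps**3
            - 8.285e4 * eps**2 + 2.10e11 * eps - 5.921e-3)

def cod(measured, predicted):
    # Coefficient of determination: 1 - SSE / SST.
    sse = np.sum((measured - predicted) ** 2)
    sst = np.sum((measured - np.mean(measured)) ** 2)
    return 1.0 - sse / sst

# Hypothetical validation set: a linear elastic response with E = 2.1e11 Pa
# plus 1% multiplicative noise, over the strain range of Fig. 1a.
rng = np.random.default_rng(1)
eps = np.linspace(1e-5, 1e-3, 50)
measured = 2.1e11 * eps * (1.0 + 0.01 * rng.standard_normal(eps.size))

score = cod(measured, epr_stress(eps))
```

Over this strain range the higher-order terms of Eq. 2 are negligible, so the model is effectively linear with slope 2.1×10^11 Pa and the COD stays close to 1.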

IV. INTELLIGENT FINITE ELEMENTS

The obtained EPRCMs are implemented in the widely used general-purpose finite element code ABAQUS through the user-defined material module (UMAT). The UMAT updates the stresses and provides the material Jacobian matrix for every increment at every integration point [20], [21]. In the developed methodology (EPR-FEM), the EPRCM replaces



the role of a conventional constitutive model. The source of knowledge for the EPR is a set of raw experimental (or in-situ) data representing the mechanical response of the material to applied load. When EPR is used for constitutive description, the physical nature of the input-output data for the EPR is clearly determined by the measured quantities, e.g., stresses, strains, etc. The manner in which the EPRCM is incorporated in an FE code is described in Fig. 2, which also shows the main steps of EPR. The constitutive relationships are generally given in the form [22]:

Δσ = D Δε   (5)

[Fig. 2. The incorporation of the EPR-based material model in the ABAQUS finite element software for an integration point. EPR branch: input data (experimental data, physical insight) → genetic algorithm → mathematical structure → least squares → symbolic function → check based on fitness criteria and/or generation number → EPR constitutive equation. FEA branch: input data (geometry, applied load, initial and boundary conditions) → load increment loop, increasing the applied load incrementally → UMAT calls the EPRCM(s) with the current state of stresses and strains to obtain the next state of stresses and the Jacobian matrix → solve the main equation → iteration loop until convergence → repeat until the whole load is applied → output results.]
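The control flow of the FEA branch in Fig. 2 (load increments, with an inner iteration loop calling the material routine for the stress and the Jacobian) can be sketched for a single stress point. The constitutive function below is a hypothetical hardening law chosen for illustration, not one of the paper's trained EPR models:

```python
def sigma(eps):
    # Hypothetical EPR-style constitutive equation (quadratic hardening, Pa).
    return 2.0e11 * eps - 4.0e13 * eps ** 2

def jacobian(eps):
    # Material Jacobian D = d(sigma)/d(eps), the quantity the UMAT
    # routine must return alongside the updated stress (Eq. 5).
    return 2.0e11 - 8.0e13 * eps

def apply_load(load_steps, tol=1e-8, max_iter=25):
    # Outer loop: increase the applied load incrementally.
    eps = 0.0
    history = []
    for target in load_steps:
        # Inner loop: Newton iteration until the stress residual converges.
        for _ in range(max_iter):
            residual = target - sigma(eps)
            if abs(residual) < tol * abs(target):
                break
            eps += residual / jacobian(eps)   # d(eps) = D^-1 * residual
        history.append(eps)
    return history

# Ramp the applied stress from 20 MPa to 200 MPa in ten increments (Pa).
steps = [2.0e7 * k for k in range(1, 11)]
strains = apply_load(steps)
```

In a real FE run the residual is a global out-of-balance force vector and D enters the assembled stiffness matrix; the increment/iteration structure is the same.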


where D is the material stiffness matrix, also known as the Jacobian. Assuming the material is elastic and isotropic for a load increment, matrix D is given in terms of Young's modulus, E, and Poisson's ratio, ν. For the plane strain case, for example:

D = E / ((1 + ν)(1 − 2ν)) ·
    [ 1−ν   ν     ν     0
      ν     1−ν   ν     0
      ν     ν     1−ν   0
      0     0     0     (1−2ν)/2 ]   (6)

A. Numerical Examples

1) Example 1: This example involves a thick circular cylinder conforming to plane strain conditions. Fig. 3 shows the geometric dimensions and the element discretization employed in the solution; 12 parabolic isoparametric elements have been used. The cylinder is made of a linear elastic material with a Young's modulus of E = 2.1×10^5 N/mm^2 and a Poisson's ratio of 0.3 [22]. This example was deliberately kept simple in order to verify the computational methodology by comparing the results with those of a linear elastic finite element model. The loading case considered involves an internal pressure of 8.0×10^4 kN with boundary conditions as shown in Fig. 3.

The EPR-based finite element model incorporating the trained EPR was used to analyse the behavior of the cylinder under the applied internal pressure. Assuming linear behavior for a small load increment in the nonlinear FE analysis, the tangential elastic modulus of the material at each strain can be obtained from the derivative of Eq. 2. Therefore the EPR-based elastic modulus can be taken as:

E_t = dσ/dε = 4.068×10^14 ε^4 − 1.1356×10^12 ε^3 + 9.345×10^8 ε^2 − 1.657×10^5 ε + 2.1×10^11   (7)

Eq. 7 is used to calculate the stiffness matrix (Eq. 6). During the analysis, the Poisson's ratio was kept constant.
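Eqs. 6 and 7 together define the stiffness update at each strain level. A minimal sketch (an illustration of the procedure, not the authors' UMAT implementation; Eq. 7 is taken in Pa so that E_t(0) = 2.1×10^11 Pa = 2.1×10^5 N/mm^2):

```python
import numpy as np

def tangent_modulus(eps):
    # E_t = d(sigma)/d(eps) from Eq. 7 (Pa).
    return (4.068e14 * eps**4 - 1.1356e12 * eps**3 + 9.345e8 * eps**2
            - 1.657e5 * eps + 2.1e11)

def plane_strain_D(E, nu=0.3):
    # Plane strain stiffness matrix of Eq. 6.
    c = E / ((1.0 + nu) * (1.0 - 2.0 * nu))
    return c * np.array([
        [1.0 - nu, nu,       nu,       0.0],
        [nu,       1.0 - nu, nu,       0.0],
        [nu,       nu,       1.0 - nu, 0.0],
        [0.0,      0.0,      0.0,      (1.0 - 2.0 * nu) / 2.0],
    ])

# At zero strain Eq. 7 reduces to the initial Young's modulus of the
# linear elastic cylinder in Example 1 (Poisson's ratio held constant).
D0 = plane_strain_D(tangent_modulus(0.0), nu=0.3)
```

At each load increment the analysis evaluates E_t at the current strain and rebuilds D, which is exactly the Jacobian handed back to the solver.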

[Fig. 3. FE mesh in a symmetric quadrant of the thick cylinder (dimensions 100 mm and 200 mm; internal pressure P).]

[Fig. 4. Comparison of the results of the EPR-FEM and the standard FEM in terms of (a) radial stress (MPa) and (b) radial displacement (mm), plotted against the radius r (mm).]



The results are compared with those obtained using a standard linear elastic finite element method. Fig. 4 shows the radial displacements and radial stresses along a radius of the cylinder, predicted by the two methods. The results obtained using the EPR-based FEM are in excellent agreement with those attained from the standard finite element analysis. This shows the potential of the developed EPR-based finite element method for deriving constitutive relationships from raw data using EPR and using these relationships to solve boundary value problems.

2) Example 2: The second example is a plane stress beam (Fig. 5) subjected to a uniform pressure. The developed EPR constitutive model (Eq. 3) is used to describe the material behavior. To evaluate the described methodology, the displacement of point A (mid-span) of the beam predicted using the EPR-based FE analysis is compared with that from a conventional FE analysis. For the conventional FE analysis, an elasto-plastic model in ABAQUS, based on the tabulated stress-strain data, is used. For the EPR-based FE analysis, the Young's modulus is determined as:

E_t = dσ/dε = 2.167×10^11 − 1.889×10^16 ε^2 + 4.137×10^18 ε^3 − 4.016×10^20 ε^4 + 2.064×10^22 ε^5 − 5.616×10^23 ε^6 + 7.013×10^24 ε^7 − 2.272×10^25 ε^8   (8)

Fig. 6 shows the load-displacement curves at point A (mid-span of the beam) obtained using the conventional FE model and the developed EPR-based FE model. The results of the EPR-based FEM are in very good agreement with those from the conventional FE analysis using an elasto-plastic model.

[Fig. 6. Comparison of the results of the EPR-based FEM and the conventional FEM: applied pressure (MPa) versus mid-span displacement (mm).]

[Fig. 5. Simply supported beam under uniform pressure (100 mm deep, 600 mm span; point A at mid-span).]

V. CONCLUSIONS

An intelligent finite element method (EPR-FEM) has been developed based on the integration of an EPRCM in a finite element framework. In the developed methodology, the EPRCM is used as an alternative to conventional constitutive models for the material. A procedure is presented for computing the stiffness matrix using the trained EPR model and incorporating the EPRCM in the commercial finite element code ABAQUS. The efficiency and adaptability of the proposed method have been demonstrated by successful application to two boundary value problems. The results of the analyses have been compared with those obtained from conventional FE analyses using linear elastic and elasto-plastic models. The results show that an EPRCM can be successfully implemented in a finite element model as an effective alternative to conventional material models. It is also shown that the stiffness matrix elements can be directly obtained from the EPR stress-strain relationship.

REFERENCES

[1] H.S. Shin, Neural network based constitutive models for finite element analysis, PhD Dissertation, University of Wales, Swansea, UK, 2001.
[2] J. Ghaboussi, J.H. Garrett and X. Wu, Knowledge-based modeling of material behavior with neural networks, Journal of Engineering Mechanics Division, vol. 117, pp. 132-153, 1991.
[3] J. Ghaboussi and D.E. Sidarta, New nested adaptive neural networks (NANN) for constitutive modeling, Computers and Geotechnics, vol. 22, pp. 29-52, 1998.
[4] D. Penumadu and R. Zhao, Triaxial compression behavior of sand and gravel using artificial neural networks, Computers and Geotechnics, vol. 24, pp. 207-230, 1999.
[5] G.W. Ellis, C. Yao, R. Zhao and D. Penumadu, Stress-strain modeling of sands using artificial neural networks, ASCE Journal of Geotechnical Engineering Division, vol. 121, pp. 429-435, 1995.
[6] J.-H. Zhu, M.M. Zaman and S.A. Anderson, Modeling of soil behavior with a recurrent neural network, Canadian Geotechnical Journal, vol. 35, pp. 858-872, 1998.
[7] J. Ghaboussi, D.A. Pecknold, M. Zhang and R.M. Haj-Ali, Autoprogressive training of neural network constitutive models, International Journal for Numerical Methods in Engineering, vol. 42, pp. 105-126, 1998.
[8] D.E. Sidarta and J. Ghaboussi, Constitutive modeling of geomaterials from nonuniform material tests, Computers and Geotechnics, vol. 22, pp. 53-71, 1998.
[9] D. Penumadu and J.L. Chameau, Geomaterial modeling using neural networks, in: N. Kartman, I. Flood, J.H. Garrett (Eds.), Artificial Neural Networks for Civil Engineering: Fundamentals and Applications, ASCE, 1997, pp. 160-184.
[10] A.A. Javadi and M. Rezania, Applications of artificial intelligence and data mining techniques in soil modeling, Geomechanics and Engineering, an International Journal, vol. 1, pp. 53-74, 2009.
[11] H.S. Shin and G.N. Pande, Enhancement of data for training neural network based constitutive models for geomaterials, in: Proceedings of the Eighth International Symposium on Numerical Models in Geomechanics (NUMOG VIII), Rome, Italy, pp. 141-146, 2002.
[12] H.S. Shin and G.N. Pande, On self-learning finite element code based on monitored response of structures, Computers and Geotechnics, vol. 27, pp. 161-178, 2000.
[13] Y.M. Hashash, S. Jung and J. Ghaboussi, Numerical implementation of a neural network based material model in finite element analysis, International Journal for Numerical Methods in Engineering, vol. 59, pp. 989-1005, 2004.
[14] A.A. Javadi, T.P. Tan and M. Zhang, Neural network for constitutive modeling in finite element analysis, Computer Assisted Mechanics and Engineering Sciences, vol. 10, pp. 375-381, 2003.
[15] A.A. Javadi, M. Zhang and T.P. Tan, Neural network for constitutive modeling of material in finite element analysis, in: Proceedings of the Third International Workshop/Euroconference on Trefftz Method, Exeter, UK, pp. 61-62, 2002.
[16] A.A. Javadi, T.P. Tan and A.S.I. Elkassas, An intelligent finite element method, in: Proceedings of the 11th International EG-ICE Workshop, Weimar, Germany, pp. 16-25, 2004.
[17] A.A. Javadi, T.P. Tan and A.S.I. Elkassas, Intelligent finite element method, in: Proceedings of the Third MIT Conference on Computational Fluid and Solid Mechanics, Cambridge, Massachusetts, USA, pp. 347-350, 2005.
[18] O. Giustolisi and D.A. Savic, A symbolic data-driven technique based on evolutionary polynomial regression, Journal of Hydroinformatics, vol. 8, pp. 207-222, 2006.
[19] A. Doglioni, O. Giustolisi, D.A. Savic and B.W. Webb, An investigation on stream temperature analysis based on evolutionary computing, Hydrological Processes, vol. 22, pp. 315-326, 2008.
[20] ABAQUS User Subroutines Reference Manual (version 6.7-1), Dassault Systèmes, 2007.
[21] ABAQUS Theory Manual (version 6.7-1), Dassault Systèmes, 2007.
[22] D.R.J. Owen and E. Hinton, Finite Elements in Plasticity: Theory and Practice, Pineridge Press, Swansea, 1980.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 Daniel Peters, Dejan Raskovic and Denise Thorsen: An Energy Efficient Parallel Embedded System for Small Satellite Applications


An Energy Efficient Parallel Embedded System for Small Satellite Applications

Daniel Peters, Dejan Raskovic and Denise Thorsen

Abstract — An energy efficient parallel embedded system based on two low-power microcontrollers and one or two GPS receivers is developed for small satellite applications. The microcontrollers can be configured to perform parallel processing in cases where extra processing power is needed for high-speed GPS receivers or for advanced filtering algorithms. An original, high-speed DMA-based communication link is developed to exchange large amounts of data between the two microcontrollers. A case study is presented in which the system uses a single antenna to determine the attitude of a small satellite vehicle.

Index Terms — embedded systems, parallel systems, GPS, attitude determination.

I. INTRODUCTION

Small satellite vehicles, especially nanosatellites such as CubeSat [1], have a very limited power budget and are physically small. The standard 1U (one-unit) CubeSat measures only 10×10×10 cm. Designing a versatile and efficient embedded system that can handle multiple tasks is crucial for such satellites. In our case, in addition to the requirement that the system must be able to process and store the data gathered by the sensors, it needs to be capable of determining the satellite's attitude as well. Traditional approaches typically involve specialized GPS receivers with multiple antennas that estimate attitude by comparing the carrier phase of the received signals. Such systems are too big and require too much power to be used in small satellites. Therefore, we designed an embedded system based on two Texas Instruments low-power microcontrollers, with two GPS ports, removable microSD flash storage, and an Organic Light Emitting Diode (OLED) display as a user interface.

II. HARDWARE DESCRIPTION

We built several test systems to use as development and validation platforms for SNR attitude determination. A simplified block diagram of our typical testbed hardware design is shown in Figure 1. The input devices include at least one GPS receiver, but a second GPS receiver may also be used. A second GPS receiver can be used to:
- Simultaneously use and compare different antenna types;
- Simultaneously use and compare different GPS receivers; or
- Calculate a 3-axis attitude estimate using two non-aligned antennas.

A reference may also be connected as an input device. It serves as a ground-truth attitude estimate for comparison, especially during testing, and can be any type of device, such as a magnetometer or a carrier-wave attitude GPS system. The processing hardware comprises two Texas Instruments MSP430 [2] microcontroller units (MCUs). Two MCUs are used in the design to accommodate the large number of external peripherals (multiple GPS receivers, flash memory, high-resolution color OLED display, etc.) and to provide flexibility in terms of available processing power. The onboard data storage allows logging of GPS data for analysis and post-processing at a later time. This allows the testbed to be used as a stand-alone embedded system, not requiring any computer connection while making measurements. The user interface consists of push buttons and an OLED graphical screen. The screen displays the control menu and other selectable parameters. The user interface allows the user to easily configure and operate the testbed according to the desired task. The testbed is powered by a 3.7 V polymer lithium-ion battery. It has a charging circuit and an under-voltage lockout (UVLO) to prevent over-discharge battery damage. A switching boost regulator is used to step up the voltage for the OLED screen, and a linear regulator is used for all the IC supplies. The software of the testbed was developed in the ANSI C programming language using ImageCraft's ICC version 7 for MSP430 (ICCV7430) development environment. Additionally, the Pumpkin Salvo Real-Time Operating System (RTOS) library was used for developing the software. Three main revisions of the testbed were designed and manufactured during the course of hardware development –


The material is based in part upon work supported by the Alaska Space Grant Program and Alaska EPSCoR Program. D. Peters, D. Raskovic and D. Thorsen are with the Department of Electrical and Computer Engineering, University of Alaska Fairbanks, Fairbanks, AK 99775, USA (contact email: [email protected]).

Figure 1: Testbed block diagram

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 Daniel Peters, Dejan Raskovic and Denise Thorsen: An Energy Efficient Parallel Embedded System for Small Satellite Applications

TABLE 1. TESTBED HARDWARE REVISION DIFFERENCES

                                Revision 1.2     Revision 2.0            Revision 2.1
Clock                           DCO              Crystal                 Crystal
Inter-Processor Communication   8-bit I/O        8-bit DMA               16-bit DMA
GPS                             On board         External                External
Flash Storage                   On board flash   Removable SD Card       Removable SD Card
MSP430 MCUs                     2 × F249         1 × F5419 + 1 × F2616   2 × F2616

revisions 1.2, 2.0 and 2.1. Table 1 summarizes the main design differences between board revisions, while Figure 2 shows two different design revisions.

A. Revision 1.2 Design
Revision 1.2 (Figure 2a) was the first PCB that was manufactured. The board was fitted on the corner of a cube so that two patch antennas could be mounted orthogonally. The board performed well for the initial design, and software was developed to compute a real-time attitude solution. The board did, however, have some shortcomings that needed attention. The microcontrollers were designed to use the internal Digitally Controlled Oscillator (DCO) clock source for simplicity, reduced component count, and lower power consumption. However, the internal DCO has much lower frequency accuracy and temperature stability than a crystal. Under normal operating conditions the DCO was able to satisfy our timing requirements, and the results were as expected. The problems became apparent during very low temperature testing in Alaska, when the system became unreliable, presumably because of UART communication frequency mismatches. To improve the stability of the system, and to facilitate the use of our in-house techniques for controlling clock variations under extreme conditions [3], we added an external crystal. Revision 1.2 was designed with the GPS module connections directly on the PCB, which limited the design to one kind of GPS receiver module. A more versatile design would place the GPS module on a small separate PCB with a cable connection to the testbed. The storage solution for Revision 1.2 was two Serial Peripheral Interface (SPI) flash memory ICs. Data stored to flash had to be downloaded via a serial cable to a host computer to be analyzed. Although this was a workable system, removable SD Card flash media was used in later revisions for more convenience and higher capacity.

B. Revision 2.0 Design
Some of the shortcomings of Revision 1.2 were addressed with Revision 2.0. A crystal oscillator was used to solve the communication problems at extreme temperatures. The GPS receiver modules were no longer connected directly to the

testbed, to allow the flexibility of using various types of receivers. A new inter-MCU protocol was also developed for faster communication using Direct Memory Access (DMA) controllers and timer peripherals. Lastly, the newest available MSP430 5xx series MPU was used for its higher clock speed (16 MHz instead of 8 MHz), additional communication ports, and larger memory. However, this choice to use the newest available hardware proved to be problematic due to compatibility issues with the software tools.
1) MSP430 5xx Series Issues
Compatibility issues arose when using the 5xx series MSP430. Several issues consumed considerable troubleshooting time before the culprits were discovered: the watchdog timer and hardware multiplier memory addresses changed for the 5xx series, and a CPU hardware bug caused incorrect program flow while using debugging breakpoints. The address problems arose because all previous variants of the MSP430 used the same addresses for both the watchdog timer and the hardware multiplier. The programmers of the RTOS took this for granted and hard-coded the addresses in the RTOS library.

Figure 2. Testbed PCBs of (a) Revision 1.2, (b) Revision 2.1



Figure 3. Testbed Revision 2.1 block diagram

Thus, when code from the library was executed on the 5xx MPU, sporadic resets would occur whenever the hardware multiplier or watchdog timer was used. The library files are pre-compiled and cannot be modified by the programmer to correct the addresses. Once the watchdog timer address problem was discovered, a request was made to the RTOS vendor to compile a new library. The request was granted, and a new library was generated that did not use the watchdog timer. While this solved the sporadic resets, other strange results persisted until the next problem, the hardware multiplier address change, was also discovered. The RTOS library did not use the hardware multiplier, so as a workaround the hardware multiplier peripheral was disabled by the compiler for the rest of the program. Several other hardware bugs that existed in initial specimens of the MSP430 5xx series also affected our system. Fearing more time-consuming surprises, we abandoned the 5xx series MPU and designed another PCB using the older, yet compatible, 2xx series MPU (also operating at 16 MHz). Using the 5xx without the RTOS libraries would probably have worked, but changing the PCB design was favored over changing the software because of the greater time invested in the software and the desire to keep the RTOS.

C. Revision 2.1 Design
The overall design of Revision 2.1 (Figure 2b) was kept nearly the same as the previous revision, with the exception of eliminating the 5xx series MPU and implementing a 16-bit data bus for the high speed inter-processor communication protocol, described later. Other minor enhancements included adding more status LEDs for debugging purposes, adding a header for power measurements, configuring the power management for soft shutdown control, and adding ground pads for measurement probe ground clips. A detailed block diagram of the final design is shown in Figure 3.
1) Power Management
Power for the testbed is provided either by the 3.7 V lithium battery or by an external 5 V DC source. When the external source is connected and the battery is charging, a P-MOSFET unloads the battery. When the external power source is not connected, current flows from the battery through the P-MOSFET via its body diode. A Low Drop Out (LDO) linear regulator maintains 3.1 V for all the logic circuits. The power distribution of the testbed is controlled through software. Complete control over the power distribution is crucial for an embedded system designed to operate


autonomously, possibly in outer space. Both MCUs have power connected continuously, even when the testbed is switched off. This is the standby mode, in which the MCUs remain in a low-power sleep mode until the power switch is activated. Once the power switch is turned on, the MCUs enter active mode and enable power to the remaining circuits. This power scheme is used because it allows a graceful shutdown of various components and the preservation of collected data. When the power switch is turned off by the user, the MCUs are still able to perform some actions before turning off the power to the various components. The GPS modules, the OLED screen, and the SD Card are the components requiring a graceful shutdown. After the power switch is turned off, but prior to shutdown, the following actions are taken:
• A message is sent to the GPS modules to save ephemeris information to non-volatile memory. This significantly reduces signal acquisition time at the next power-up.
• The OLED screen's 15 V supply is disabled at least 100 ms prior to logic power-off, according to the datasheet's sequencing instructions.
• Any data queued to be written to the SD Card is written out.

Once the preceding actions are completed, the power to the rest of the circuit is removed and the MCUs enter the lowest power mode. Another benefit of the soft shutdown is that the UVLO signals the MCUs when the battery is nearly depleted, at 3.0 V, allowing a graceful shutdown before the battery is completely exhausted. While the testbed is shut down, only a minute amount of current is consumed by the MCUs and by the LDO quiescent current, allowing long battery life in standby mode. Still, after the battery is discharged during normal operation, it should be either recharged or disconnected to prevent a slow over-discharge condition in standby mode.
2) Memory Storage
For flexibility, the interface to the removable SD Card memory is physically connected in two ways, although only one can be used at a time. The first choice is for MCU0 to communicate directly with the SD Card via an SPI link. The MCU must then conduct all the low-level file structure operations necessary for the data to be readable on a computer; open-source software such as ELM FatFs is available to perform these operations. The second choice involves an extra IC called DOSonCHIP™ [4]. This circuit performs the low-level file operations and writes files to the SD Card using a File Allocation Table (FAT) file system. The commands are simplified for the MCU in this case, and MCU1 handles the data storage operations. In the first case, where MCU0 handles the SD Card communication directly, the DOSonCHIP IC should not be populated on the PCB, so as to avoid contention on the SPI signals.

III. SOFTWARE DEVELOPMENT
At the onset of software development, Imagecraft's ICCV7430 [5] was one of the only development environments capable of programming the 16 MHz 2xx series MSP430 MPUs. In addition, ICCV7430 included a binary library for the Tiny version 3.2.3 of Pumpkin Inc.'s Salvo™ RTOS [6], all at a low cost. The testbed task management was complex enough to warrant using an RTOS, as opposed to the traditional superloop programming model. For these reasons, the Imagecraft tools were chosen. NOICE430, a debugging environment, also comes bundled with ICCV7430.

A. Pumpkin Salvo Real-Time Operating System
The Pumpkin Salvo RTOS is designed to provide the benefits of an RTOS while minimizing the resource footprint. This is especially important for low-power, memory-constrained MCUs such as the MSP430. Salvo is a cooperative RTOS, and thus uses very little memory and no per-task stacks. The tradeoff is that the programmer must explicitly manage task switching. Many other RTOSs are pre-emptive, which removes the need for explicit task switching but requires significantly more memory resources. The premise of the RTOS is that a kernel schedules tasks to be run: it decides which tasks have the highest priority and, consequently, the order in which they are executed. Individual tasks are miniature programs running within the overall program. Tasks execute instructions for a specific action, sometimes repetitively, and are implemented as functions in the C environment. Tasks may also include function calls, but the interaction with the kernel must occur within the task itself and not within a function called by the task. Interactions of a task with the kernel take many forms, but the most common are delays, waiting for events, and task switches (yielding control to the kernel so another task may run). A delay allows the kernel to run other tasks until the set amount of time has elapsed in the current task. Waiting for an event suspends the task until something else happens, such as the occurrence of an interrupt or another task setting a semaphore. A semaphore is a mechanism by which resources are shared or program synchronization is achieved. Once the event happens, the kernel resumes the task that is waiting for it. A switch is inserted by the programmer at locations within a task where the kernel is allowed to suspend the task and schedule other tasks. The amount of time it takes for the task to resume after the switch depends on how many other tasks with a higher priority are waiting to run. The priority of each task is defined by the programmer, and when two or more tasks have the same priority, they are run in a round-robin fashion. There are many other features of the Salvo RTOS, including timeouts, event flags, messages, queues, etc., that are beyond the scope of this paper. The Tiny version used for development in this project has a reduced feature set that disables some features, such as timeouts and task message passing, and also makes all tasks have the same priority. Features that are included with the Tiny version include delays and semaphores, with no restriction on the number of tasks or events.

B. Testbed Program Flowchart
The testbed has the capability of sharing the workload between two processors, or using them as a real parallel


system for applications that can be parallelized. For example, in our case study application, attitude determination, the processing of the attitude solution may be performed on either MCU, or by both in parallel. The manner in which the workload is shared is also dictated in part by the physical peripheral connections. For example, the OLED screen is controlled only by MCU1, while other peripherals, such as the SD Card, may be controlled by either MCU. The flowchart in Figure 4 is color coded to indicate which MCU performs each item when at least two GPS modules are connected: blue items are performed by MCU1 only, green items are performed by both MCUs, and red items are performed by either MCU, depending on the hardware configuration.

C. High Speed Inter-Processor Protocol
The MSP430 does not have an external bus for memory access, so all communication must happen via one of the serial peripherals or through the general digital I/O ports. The three available serial peripheral choices are UART, SPI, and I2C. SPI has the fastest theoretical transfer rate, 16 Mbps for a 16 MHz MPU. Rather than using a serial peripheral, using the digital I/O ports controlled by the DMA peripheral and

Figure 4. Testbed program flowchart (modes: SKYVIEW, ATTITUDE, SNR PLOTS, BATTERY, SD CARD)

triggered by a timer peripheral is considered. This protocol is extended to the general case of multiple processors, even though the embedded system presented here uses only two processors. For the MSP430 microcontroller, digital IO normally requires 6 clock cycles for read/write operations of the port. By contrast, the DMA controller can read/write data from/to the ports every 2 clock cycles. When two 8 bit ports are combined on the MSP430 to make a 16 bit data bus, 2 Bytes are transferred per DMA cycle. For a 16 MHz processor, a bandwidth of 128 Mbps is possible. The read/write operations need a synchronization mechanism, which is accomplished by using the timer peripherals to trigger the DMA controllers. The sending MCU uses a DMA channel to write data from a memory array to a single output port. The receiving MCU uses a DMA channel to read data from a single port and fill a memory array. The DMA controllers are set up such that the sending MCU with control of the bus triggers its own DMA controller one clock cycle before the DMA controller of the receiving MCU gets triggered. In this way once the transfer is initiated, each data word remains on the bus for two clock cycles with the read cycles interleaving the send cycles. The timer peripheral is needed as the trigger because the output registers can be set up to make a strobe pulse that is as short as one clock cycle long. This is not possible with a general purpose digital IO on the MSP430. The registers are set up such that the sending MCU will trigger on the rising edge of the strobe via the DMA external trigger, while the receiver will trigger on the falling edge via a timer input capture. The connection diagram is shown in Figure 5 (next page). The data bus can be either 8 bits or 16 bits wide. There is a unique Bus Request signal from each MCU to the Arbiter, and also a unique Bus Grant signal from the Arbiter to each MCU. The arbiter has a crystal oscillator, and sources the clock for all other MCUs. 
The Arbiter is able to enter sleep modes while still providing the clock to the other MCUs. The other MCUs are also able to enter sleep modes without affecting the DMA timing, since there is no extra clock wakeup time when an external clock source is used. The data transfer sequence, illustrated in Figure 7, is as follows:
1. The sending MCU sets up the timer registers to trigger the appropriate destination MCUs.
2. The sending MCU asserts the Bus Request line. (If the sending MCU is the arbiter, steps 2, 3 and 6 are skipped.)
3. When the bus is available, the arbiter asserts the Bus Grant line.
4. The sending MCU starts the timer peripheral, automatically triggering the DMA controller for the desired receiving MCU(s), and its own DMA controller.
5. As soon as the transfer is complete, the sending MCU releases the Bus Request line.
6. When the arbiter senses the Bus Request line has been de-asserted, the Bus Grant line is de-asserted, and the arbiter handles the next bus request.


Figure 7. Data transfer timing for a message of size m (signals: Clock, Bus Request, Bus Grant, DMA Trigger, Data; write and read cycles interleaved)

Figure 6. Protocol transfer rates

Figure 5. High speed inter-processor communication connection diagram

D. High Speed Inter-Processor Protocol Measurements
The data transfer performance was measured for a dual-MCU system running at 16.777 MHz and using a 16-bit data bus.

The operating frequency is slightly higher than the maximum frequency specified by Texas Instruments; our extensive testing [7] shows that this family of microcontrollers is capable of operating at much higher frequencies. A final production system would nevertheless be made to operate at, or even below, the maximum recommended frequency to improve reliability. The message size is set to 2048 bytes. The arbitration overhead is measured to be 6.7 μs, with a total transfer time of 128.8 μs, resulting in a throughput of 127.26 Mb/s. Figure 6 shows the effect of arbitration overhead for varying message sizes. The performance advantage is clear for large message sizes, but is negated by the arbitration overhead when the message size is small. The results are compared to the theoretical transfer over the SPI bus at maximum speed, which is independent of message size, and are normalized to the operating speed of the MCU. In this case a message size greater than 16 bytes gives the DMA protocol a performance advantage over the SPI protocol. The break-even point is, however, not fixed, since the arbitration time is not deterministic and may be longer or shorter. If the application uses message sizes in the vicinity of the break-even point, the extra signal traces needed for the parallel data bus and control lines may not warrant the small gain in performance over the SPI bus. But the advantage becomes very significant when message sizes are large, particularly when one-to-many transfers are involved. The DMA high speed interconnection protocol has applications beyond the testbed developed for the attitude determination system. Parallel processing is especially well suited to wireless sensor networks, because in many sensor networks a sensor node is idle for the vast majority of the time; only when some sporadic event happens is there a need for high-performance processing.
Hence, many sensor network implementations need both extremely low power consumption during idle sleep modes and high-performance capability. The problem is that few processors meet both criteria: many high-performance processors have relatively high idle power consumption, while many low-power processors with minuscule idle power consumption do not have exceptional processing capabilities. An approach to


the problem is to use multiple low-power processors in a performance-scalable architecture. The idle power of the multiple processors combined may still be an order of magnitude lower than that of a high-performance processor, and high-performance computing is achievable if the application is able to make use of parallelism. The high speed DMA protocol developed for the testbed is ideal for a wireless sensor network with multiple processors. The capability of one-to-all transfers, and the high data rates compared to on-board peripherals such as SPI, minimize the communication time overhead associated with multi-processing systems.

E. A Case Study: An Embedded GPS Attitude Determination System
We tested our embedded design in a GPS-based attitude determination system that uses signal-to-noise ratios and a single antenna to determine the attitude of a small satellite vehicle. A SAP antenna was fabricated specifically for this application, using design parameters (ground plane area, inner and outer radius, feed point, etc.) that were shown to produce an antenna with ideal characteristics under the size constraints we were facing. In this study, a Trimble Copernicus GPS receiver [8] is connected to the system to provide the SV LOS vectors and SNR measurements. The calculation used on the system is similar to the initialization equation in [9]. The attitude estimate B = [x, y, z]^T for each epoch is given by Equation (1), where S is the vector of cosines obtained from the antenna SNR-to-α mapping function and W is a weighting matrix of the measurements:

$$
B = \begin{bmatrix} x \\ y \\ z \end{bmatrix}
  = (L^{T} W L)^{-1} L^{T} W S,
\qquad
L = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ \vdots & \vdots & \vdots \\ a_n & b_n & c_n \end{bmatrix},\;
W = \begin{bmatrix} w_1 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & w_n \end{bmatrix},\;
S = \begin{bmatrix} S_1 \\ S_2 \\ \vdots \\ S_n \end{bmatrix}
\tag{1}
$$

The number of SVs used for the measurement, denoted by n, must be at least three. The SNR measurements are normalized to the mean value at α = 0 for each individual satellite to account for transmission power variations. W is optimized to the antenna characteristics and is a function of the off-boresight angle, such that the most accurate regions of the antenna pattern are weighted more heavily. The weighting matrix can be easily calculated for SAP antennas (it is a relatively simple function of the off-boresight angle and the elevation of the SV), but it could be more complex for other types of antennas. Also, since there are currently three different SV antenna configurations in the GPS constellation, a correction factor must also accommodate the differences between their antenna gain patterns. Finally, additional or more complex filtering might be required.
The amount of processing mentioned in the previous paragraph, coupled with the fact that we wanted to be able to use two GPS receivers, multiple antennas, and a range of

peripheral devices, while still leaving enough processing power for this board to be used as the main CubeSat controller, further justifies our decision to design a parallel system consisting of two low-power microcontrollers.

F. File Formats
The attitude determination system saves information to a removable micro SD Card. The data is saved as files in FAT16 format, so as to be readable on a computer. The files are stored in the GPSLOG folder of the SD Card. The naming convention for the files is wwww_n.gps, where wwww is the 4-digit GPS week number and n is a file suffix in case more than one file is recorded during a week. The .GPS file packet consists of the GPS time, followed by the elevation, azimuth, and SNR level for each of the 32 GPS Space Vehicle (SV) Pseudorandom Numbers (PRNs). The packets are wrapped with header and footer words for delineation. Figure 8 shows the data format used; the numbers in parentheses indicate the number of bytes in each field. The SNR, elevation, and azimuth are initially floating-point numbers on the testbed before they are written to the SD Card. To save space, they are converted to fixed-point integers by scaling the values according to Table 2. When the files are analyzed later, the numbers must be divided by the same factors and converted back to floating-point data types.

G. Calibration and Testing
The embedded system and SAP antenna were positioned in a field with a mostly open sky view. GPS SV visibility was unobstructed to approximately 10º above the horizon for all

Figure 8. Testbed output file packet format:
Header (2) | GPS Time (4) | Data (196) | Footer (2)
Data: SV 1 (6) | SV 2 (6) | ... | SV 32 (6)
Per SV: SNR (2) | Elevation (2) | Azimuth (2)


TABLE 2. TESTBED OUTPUT FILE FIELD DESCRIPTION

Field       Data Type   Description
Header      Integer     0x5555
GPS Time    Long        Time in seconds since start of current GPS week
SNR         Integer     = SNR × 10 (dB-Hz or AMU)
Elevation   Integer     = Elevation × 1000 (radians)
Azimuth     Integer     = Azimuth × 1000 (radians)
Footer      Integer     0xAAAA

Figure 9. Raw measurement errors (degrees) for zenith pointing antenna

azimuths. The obstructions below 10º consisted of distant terrain and trees at the perimeter of the field. The antenna was statically set to the zenith direction for a 20 hour logging session at a 1 Hz sample rate. The recorded SNR samples and corresponding off-boresight angles were used to obtain the best-fit linear equation representing the SNR-to-α mapping function for the SAP antenna. The mean of the absolute residuals between the data points and the mapping function was calculated to be 3.3°. To determine the accuracy of the attitude estimates, a Septentrio PolaRx2eH carrier-wave attitude determination GPS receiver [10] was used as a reference. The PolaRx2eH antennas were arranged with a baseline of 1 m, giving a reference accuracy of 0.6° in pitch and 0.2° in azimuth. The estimation error was calculated as the inverse cosine of the dot product of the normalized truth and estimation vectors. Figure 9 shows the time series of the two-axis attitude error over the same 20 hour session used for the antenna calibration. The RMS error was 2.7º, and the maximum error was 9.3º. The zenith pointing results are compared in Table 3 to the two best zenith pointing ground measurements made by Behre [11]. This result is also slightly better than the one achieved with a zenith pointing choke-ring patch antenna in orbital experiments, where RMS errors of 3.2º were recorded [9]. More details about the results obtained using our attitude determination system can be found in [12].

IV. SUMMARY
A dual-microcontroller parallel embedded system was designed and built to allow advanced processing of data and interfacing with multiple peripherals, while keeping the power consumption at a very low level. The system is fully functional and is being used in conjunction with a specialized SAP antenna to perform GPS measurements for SNR attitude estimation. It is small enough to be used in a whole range of nanosatellites,

including a CubeSat. The University of Alaska Fairbanks, through its Student Rocket Project, has completed numerous successful launches of sounding rockets from the Poker Flat Research Range, the world's only scientific rocket launching facility owned by a university. The development of the embedded system described in this paper represents an important step in our transition from launching research payloads using sounding rockets to the use of nanosatellite vehicles.

REFERENCES
[1] CubeSat Community Website, California Polytechnic State University, San Luis Obispo, http://www.cubesat.org/, 2009.
[2] MSP430x2xx Family User's Guide, Texas Instruments Inc., 2008.
[3] D. Raskovic, V. Revuri, D. Giessel, and A. Milenkovic, "Time Synchronization for Wireless Sensor Networks Operating in Extreme Temperature Conditions," Proc. 41st Southeastern Symposium on System Theory (IEEE SSST-2009), University of Tennessee Space Institute, Tullahoma, TN, Mar. 15-17, 2009.
[4] DOSonCHIP™, Embedded File Systems in Silicon, Wearable Inc., http://dosonchip.com/, 2009.
[5] Version 7 C Compiler Tools with Windows IDE for TI MSP430 / MSP430X Microcontrollers, ImageCraft Inc., http://www.imagecraft.com/.
[6] Salvo™ RTOS, Pumpkin Inc., http://www.pumpkininc.com/.
[7] D. Raskovic and D. Giessel, "Dynamic Voltage and Frequency Scaling for On-Demand Performance and Availability of Biomedical Embedded Systems," IEEE Transactions on Information Technology in Biomedicine, 2009, in press.
[8] Copernicus® GPS Receiver, Trimble, http://www.trimble.com/embeddedsystems/copernicus.aspx.
[9] P. Axelrad and C. Behre, "Satellite Attitude Determination Based on GPS Signal-to-Noise Ratio," Proceedings of the IEEE, vol. 87, no. 1, January 1999.
[10] Septentrio Satellite Navigation PolaRx2/2e User Manual V3.2.2, Septentrio Satellite Navigation NV, 2008.
[11] C. Behre, "GPS Based Algorithms for Low Cost Satellite Missions," Ph.D. dissertation, University of Colorado, Boulder, CO, December 1997.
[12] D. Peters, D. Raskovic, and D. Thorsen, "Design and Evaluation of a GPS Attitude Determination System for Small Satellite Applications," International Journal of Navigation and Observation, under review.

Daniel Peters is a graduate student in the Electrical and Computer Engineering Department at the University of Alaska Fairbanks. His work as a Research Assistant includes system development for sounding rockets and small satellites. His areas of interest include embedded systems and navigation.

Dejan Raskovic received the B.S. (dipl. ing.) and M.S. degrees in electrical and computer engineering from the University of Belgrade, Serbia, in 1993 and 1996, respectively, and the Ph.D. degree in computer engineering from the University of Alabama in Huntsville in 2003. He is an Assistant Professor of electrical and computer engineering at the University of Alaska Fairbanks. His current areas of interest include embedded systems, energy-efficient processing, and mobile and body area sensor networks.

Denise Thorsen received her Ph.D. in electrical engineering from the University of Illinois, Urbana, in 1996. She is an Associate Professor of electrical and computer engineering at the University of Alaska Fairbanks. Her research areas include radar techniques for observing the middle atmosphere, including neutral atmospheric wind motions, turbulence, temperature, and electron densities. She also works on the systems design of radars and small satellites used in environmental remote sensing.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 Kestutis Pyragas, Viktoras Pyragas, and Tatjana Pyragiene: Control and Synchronization of Dynamical Systems via a Time-Delay Feedback


Control and Synchronization of Dynamical Systems via a Time-Delay Feedback

Kestutis Pyragas, Viktoras Pyragas, and Tatjana Pyragienė

Abstract—Some methods in the field of chaos control and synchronization involve a time-delay feedback in order to stabilize certain unstable manifolds embedded in chaotic attractors. Here we discuss two such methods: the delayed feedback control algorithm, which aims to stabilize unstable periodic orbits of chaotic systems, and the anticipating synchronization algorithm, which enables real-time forecasting of chaotic states.

Index Terms—Anticipating synchronization of chaos, controlling chaos, time-delay feedback.

I. INTRODUCTION

Models with a time-delay feedback arise naturally in physics, biology, ecology and other fields of science. A time delay often appears in control systems either in the state, the control input, or the measurements. Unlike ordinary differential equations, delay systems are infinite dimensional in nature and their analysis is much more complicated. The time-delay feedback is, in many cases, a source of instability. However, in some specially designed systems the time-delay feedback may manifest itself as a stabilizing factor. In this paper, we discuss two methods in the field of chaos control and synchronization where the time-delay feedback is used for the stabilization of desired unstable states of chaotic systems. The first method is the delayed feedback control (DFC) algorithm, in which the time-delay feedback is used for the stabilization of unstable periodic orbits (UPOs) of chaotic systems. The second method is the algorithm of anticipating synchronization; here the time-delay feedback provides the stabilization of the unstable anticipating manifold of a slave system driven by an identical master system. In both cases we present the main ideas of the methods, their development to date, and our latest achievements.

II. DELAYED FEEDBACK CONTROL ALGORITHM

K. Pyragas is with the Semiconductor Physics Institute, 11 A. Goštauto, LT-01108 Vilnius, Lithuania, and the Faculty of Physics, Vilnius University, LT-10222 Vilnius, Lithuania. Web: http://pyragas.pfi.lt
V. Pyragas is with the Semiconductor Physics Institute, 11 A. Goštauto, LT-01108 Vilnius, Lithuania. Email: [email protected]
T. Pyragienė is with the Semiconductor Physics Institute, 11 A. Goštauto, LT-01108 Vilnius, Lithuania. Email: [email protected]

The DFC algorithm was invented in the early 1990s [1] as a simple, robust, and efficient method to stabilize UPOs in chaotic systems. Nowadays it has become one of the most popular methods in chaos control research [2]. The DFC algorithm is reference-free; it uses a delayed feedback in the form of a signal F(t) = kD(t) that is proportional to the difference

D(t) = y(t) − y(t − τ)    (1)

between the current system state y(t) and its state y(t − τ) delayed by the period τ of the UPO. The UPO may become stable under an appropriate choice of the feedback strength k. Note that only the stability properties of the orbit are changed, while the orbit itself and its period remain unaltered. The method allows a noninvasive stabilization of UPOs in the sense that the control force vanishes when the target state is reached. The controlled system can be treated as a black box, since the method does not require any exact knowledge of either the form of the periodic orbit or the system's equations. The method is especially appealing for experimentalists, since one does not need to know anything about the target orbit beyond its period τ. The DFC algorithm has been successfully applied to a number of real-world problems in diverse experimental systems, including electronic chaotic oscillators, mechanical pendulums, lasers, gas discharge systems, a current-driven ion acoustic instability, a chaotic Taylor-Couette flow, chemical systems, high-power ferromagnetic resonance, helicopter rotor blades, and cardiac systems (cf. [3] for a review). The DFC method has been verified for a large number of theoretical models from different fields. Examples include the stabilization of high-speed semiconductor lasers, buck converters, chaotic systems with dry friction, excitable systems, car-following traffic, economic models, satellite attitude control, the chaotic roll motion of a flooded ship in waves, ensembles of globally coupled oscillators, etc. [3]. More recently the DFC has been proposed to eliminate chaotic oscillations in the microcantilever sensors of a dynamic force microscope [4], in robots with kinematical redundancy [5] and in the Internet [6]. A rich variety of modifications of the DFC have been suggested in order to improve its performance.
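Before surveying these modifications, it is worth noting how little machinery the basic force F(t) = kD(t) needs in practice: a ring buffer of past samples of the observable. The following minimal sketch is ours (hypothetical class name; a uniform sampling step dt is assumed, with τ an integer multiple of dt):

```python
from collections import deque

class DFCForce:
    """Delay-line realization of the DFC force F(t) = k*D(t) with
    D(t) = y(t) - y(t - tau), cf. Eq. (1).

    A minimal sketch under two assumptions of ours: the observable is
    sampled at a fixed step dt, and tau is an integer multiple of dt.
    Until the buffer spans a full delay interval the force is held at zero.
    """

    def __init__(self, k, tau, dt):
        self.k = k
        self.buf = deque(maxlen=round(tau / dt))

    def __call__(self, y):
        # buf[0] is the sample taken tau seconds ago once the buffer is full
        full = len(self.buf) == self.buf.maxlen
        y_delayed = self.buf[0] if full else y
        self.buf.append(y)
        return self.k * (y - y_delayed)
```

On a τ-periodic input the force vanishes once the buffer fills, which is exactly the noninvasiveness property described above.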
Adaptive versions of the DFC with automatic adjustment of the delay time [7]-[9] and control gain [10] have been considered. For spatially extended systems, various modifications based on spatially filtered signals have been analyzed [11]-[13]. The wave character of the dynamics in some systems allows a simplification of the DFC algorithm by replacing the delay line with spatially distributed detectors. Mausbach et al. [14] reported such a simplification for an ionization wave experiment in a conventional cold-cathode glow discharge tube. Due to the dispersion relations, the delay in time is equivalent to a spatial displacement, and the control signal can be constructed without the use of a delay line. Socolar, Sukow, and Gauthier [15] improved the original DFC scheme by using information


from many previous states of the system. This extended DFC (EDFC) scheme achieves stabilization of UPOs with a greater degree of instability [16], [17]. During the past decade, the odd number limitation of DFC techniques has been intensively discussed in the literature. The main statement of the limitation is that any UPO with an odd number of real Floquet multipliers greater than unity can never be stabilized by any DFC technique. The history of this limitation is dramatic. The statement was proven a decade ago [18], [19] and was commonly accepted. However, Fiedler et al. [20] have recently shown by a simple example that this limitation does not hold in general for autonomous systems. To overcome the odd number limitation, a counterintuitive idea based on an unstable controller has been proposed [21]-[23]. The theory of the DFC is difficult because the delayed feedback induces an infinite number of degrees of freedom. Even the linear analysis of such systems is complicated due to the infinite number of Floquet exponents characterizing the stability of the controlled orbits. Nevertheless, some analytical approaches have been developed in the vicinity of various bifurcations of periodic orbits, such as the period-doubling bifurcation [18], [24], the subcritical Hopf bifurcation [21]-[23] and the Neimark-Sacker (discrete Hopf) bifurcation [25]. Note that the linear analysis is insufficient to guarantee experimental success of the DFC algorithm, because the control performance may strongly depend on the basin of attraction of the stabilized state. The basins of attraction can be incredibly complex. Recently the first step has been taken toward understanding a possible mechanism responsible for the formation of basins of attraction in DFC schemes [26], [27]. The idea is based on an analysis of the type of bifurcation at the control boundaries, i.e., at the values of the control gain where the target orbit changes its stability.
It is suggested that a continuous (supercritical) transition at the control boundary should indicate a large basin of attraction, while a discontinuous (subcritical) transition should lead to a small basin. Unfortunately, this approach is not universal and does not guarantee a correct prediction for system parameters far away from the bifurcation point. The lack of a general theory concerning the global properties of DFC systems represents a serious drawback of the method. To improve the global properties of the linear DFC algorithm, several modifications have been proposed. A first heuristic idea was suggested in the original paper [1]. It has been shown that limiting the size of the control force by a simple cutoff increases the basin of attraction of the stabilized orbit. This idea has proved itself in a number of chaotic systems and is now widely used in experimental implementations of the DFC method. An alternative two-step DFC algorithm has been proposed in Ref. [28]. Here, in the first step, one seeks only a rough approach to the desired state: one of the system parameters is detuned for a time, and an extraneous stable periodic orbit is created in the vicinity of the target orbit. In the second step, the target state is reached exactly by restoring the initial value of the parameter and switching on the DFC force. Finally, a nonlinear DFC for systems close to a subcritical Hopf bifurcation has been proposed in Ref. [27]. It has been shown that the basin of

attraction can be enlarged considerably by coupling the control forces through the phase of the signal.

Now we briefly present our recent results concerning the global properties of the DFC algorithm. To improve the global properties we have invoked the ergodicity of chaotic systems. By ergodicity we mean the fact that a trajectory of any chaotic system visits a neighborhood of each periodic orbit with finite probability. The idea of using ergodicity in chaos control research was first formulated in the seminal paper by Ott, Grebogi and Yorke [29] and has been employed in their OGY control algorithm. We have adapted this idea for the DFC algorithm. As in the OGY algorithm, we do not perturb the system until it comes into a small neighborhood of the desired orbit. Using a scalar observable, we develop a technique which allows us to detect the moment when the state of the free system approaches the target orbit. At this moment we activate the DFC force and stabilize the target. The algorithm does not require knowledge of the location of the orbit. For continuous-time systems our algorithm can be easily implemented by means of electronic circuits. Below we demonstrate the main ideas of our approach with two simple examples.

A. Controlling the Hénon map

Time-discrete maps are very convenient dynamical toy models for the analysis of the DFC algorithm. Such systems are easier to handle, since the dimension of the phase space stays finite even if the control loop is included. The trends discovered through the analysis of discrete maps are a good starting point for developing intuition about the behavior of continuous systems. Moreover, in systems with slow dynamics, the schemes for controlling discrete maps may be directly implemented. Consider the Hénon map subjected to the DFC:

x_{n+1} = 1 − a x_n^2 + b y_n + k(x_n − y_n),    (2)
y_{n+1} = x_n.    (3)

The Hénon map is a 2D dynamical system described by the two variables (x_n, y_n). In the following we fix the parameters of the Hénon map at a = 1.5 and b = 0.2. The last term in Eq. (2) describes the DFC force, where k is the feedback gain. The free as well as the controlled Hénon map possesses two fixed points (x_F, y_F) = (x*_1, x*_1) and (x̃_F, ỹ_F) = (x*_2, x*_2), where

x*_{1,2} = {b − 1 ± [(1 − b)^2 + 4a]^{1/2}}/2a.    (4)

In Fig. 1 we show the phase portrait of the free (k = 0) Hénon map. We see that the fixed point (x_F, y_F) is embedded in the chaotic attractor, while the point (x̃_F, ỹ_F) is an extraneous fixed point lying outside the attractor. Our aim is to devise a DFC algorithm that is able to stabilize the fixed point (x_F, y_F) for any initial conditions placed on the strange attractor. From linear analysis of the map (2),(3) it follows that the target fixed point (x_F, y_F) is stable for values of the feedback gain in the interval

(b − 1 + 2a x*_1)/2 < k < b + 1.    (5)


Fig. 1. (Color online.) The phase portrait of the free Hénon map (2),(3) with k = 0 for a = 1.5 and b = 0.2. The black (blue online) dots show the strange attractor. The crossed circle and square denote the target (x_F, y_F) and extraneous (x̃_F, ỹ_F) fixed points, respectively. The straight dashed lines x_n^± = y_n ± ε define the boundaries of inequality (7). The inverse transformation of these lines, shown by dashed parabolas x_n^± = 1 + b y_n − a y_n^2 ± εb, defines the boundaries of inequality (8). The whole region restricted by conditions (7),(8) is marked in gray (yellow online). The identity line x_n = y_n and its inverse transformation are shown by dotted lines; their intersections produce the fixed points. The boundary lines are depicted for ε = 0.7.

An optimal value of the feedback gain, providing the fastest convergence to the target state, is given by

k_op = 2[a x*_1 + 1 − (2a x*_1 + 1 − b)^{1/2}].    (6)
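Equations (4)-(6) are straightforward to evaluate numerically; the short sketch below (our own check, not part of the original exposition) reproduces the numbers used in the rest of this subsection for a = 1.5, b = 0.2:

```python
a, b = 1.5, 0.2                       # Hénon parameters used in the text

# Fixed points, Eq. (4): x*_{1,2} = {b - 1 ± [(1 - b)^2 + 4a]^{1/2}}/2a
s = ((1 - b) ** 2 + 4 * a) ** 0.5
x_star_1 = (b - 1 + s) / (2 * a)      # target fixed point (on the attractor)
x_star_2 = (b - 1 - s) / (2 * a)      # extraneous fixed point

# Stability interval of the feedback gain, Eq. (5), and optimal gain, Eq. (6)
k_min = (b - 1 + 2 * a * x_star_1) / 2
k_max = b + 1
k_op = 2 * (a * x_star_1 + 1 - (2 * a * x_star_1 + 1 - b) ** 0.5)

print(round(k_op, 3))                 # -> 0.566
print(k_min < k_op < k_max)           # -> True
```

The optimal gain indeed falls inside the stability interval (5), and matches the value k_op ≈ 0.566 quoted in the text.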

For the chosen values of the parameters we get k_op ≈ 0.566. In the following we use this value in our control algorithm. The linear stability guarantees that the basin of attraction of the target point for the nonlinear map (2),(3) occupies some region around this point. As in the previous section, we need conditions that allow us to check whether the current state of the free system is in the vicinity of the target state. For a 2D map we require two conditions in order to delineate a region around the target point. Our aim is to formulate these conditions without any knowledge of even the approximate position of the fixed point. Moreover, we suppose that only one scalar variable, say x_n, is available for observation. Then we formulate the desired conditions as |x_n − x_{n−1}| < ε and |x_{n−1} − x_{n−2}| < ε. This means that we check the smallness of the DFC perturbation as if it were applied not only at the current moment n but also at the previous moment n − 1. Since x_{n−1} = y_n, we can rewrite the above conditions as

|x_n − y_n| < ε,    (7)
|x_{n−1} − y_{n−1}| < ε.    (8)

For an M-dimensional map we would write M analogous conditions. Geometrically, conditions (7),(8) delineate regions in the (x_n, y_n) plane in the vicinity of both fixed points. In Fig. 1, the region surrounding the target point is marked in gray (yellow online). The region is bounded by four curves. The inequality (7) defines the region between the two straight lines x_n^± = y_n ± ε parallel to the identity line.

Fig. 2. Histogram of the time N needed to achieve control in the Hénon map (2),(3). P(N) shows the number of successful stabilizations with the given time of control. The parameter ε and the mean time ⟨N⟩ are (a) ε = 2.7, ⟨N⟩ = 19.44; (b) ε = 1.77, ⟨N⟩ = 17.41; (c) ε = 0.7, ⟨N⟩ = 16.93; (d) ε = 0.2, ⟨N⟩ = 59.27. In (a) only 86.92% of initial conditions are successful, while in (c)-(d) a 100% success rate is obtained.

The inequality (8) bounds the region between the two parabolas x_n^± = 1 + b y_n − a y_n^2 ± εb, which represent the inverse Hénon transformation of the above lines. With decreasing ε, the region defined by conditions (7),(8) shrinks, and for suitably small ε it should fit into the basin of attraction of the target point. Note that we do not need an additional condition to exclude the extraneous fixed point (x̃_F, ỹ_F), since this fixed point lies outside the strange attractor. In Fig. 2 we demonstrate the performance of our algorithm for different values of the parameter ε. To gather statistics on the times needed to achieve control, we apply our algorithm for many different initial conditions placed on the strange attractor of the free Hénon map. For a given initial condition (x_0, y_0) the procedure is as follows. First, we set k = 0 and start iterating the free map (2),(3). On each iteration step we check the conditions (7),(8). As soon as these conditions are satisfied, we switch on the control force by setting k = k_op. Then we continue the iterations until the system approaches the desired fixed point with a given accuracy ε_F = 2 × 10⁻³ ≪ ε. Once this happens, we record the total time N (number of iterations) elapsed from the start to the end of this procedure. We repeat this procedure for 10⁴ different initial conditions randomly chosen on the strange attractor and plot a histogram of the time N needed to achieve control. If the value of the parameter ε is too large, then the algorithm fails for some fraction of initial conditions [Fig. 2(a)]. By decreasing the parameter ε we can attain a 100% success rate [Figs. 2(b)-(d)]. However, an unduly small ε can lead to a rather long mean time ⟨N⟩.
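The whole procedure is compact enough to simulate directly. The sketch below (helper names are ours) follows the steps just described with a = 1.5, b = 0.2, k = k_op, ε = 0.7 and ε_F = 2 × 10⁻³: it iterates the free map, waits until conditions (7),(8) are met, and then switches on the DFC force:

```python
A, B = 1.5, 0.2                                  # parameters used in the text

# Fixed point x*_1 from Eq. (4) and optimal feedback gain from Eq. (6)
X1 = (B - 1 + ((1 - B) ** 2 + 4 * A) ** 0.5) / (2 * A)
K_OP = 2 * (A * X1 + 1 - (2 * A * X1 + 1 - B) ** 0.5)

def step(x, y, k):
    """One iteration of the Hénon map with the DFC term k(x_n - y_n), Eqs. (2),(3)."""
    return 1 - A * x * x + B * y + k * (x - y), x

def stabilize(x, y, eps=0.7, eps_f=2e-3, max_iter=5000):
    """Iterate freely until conditions (7),(8) hold, then switch on the DFC.

    Returns the total iteration count N once |x_n - y_n| < eps_f under
    control, or None if control is not achieved within max_iter steps.
    """
    k, prev_d = 0.0, abs(x - y)
    for n in range(1, max_iter + 1):
        x, y = step(x, y, k)
        d = abs(x - y)
        if k == 0.0:
            if d < eps and prev_d < eps:         # conditions (7) and (8)
                k = K_OP                         # activate the control force
        elif d < eps_f:                          # fixed point reached
            return n
        prev_d = d
    return None

# Sample initial conditions on the strange attractor by iterating the free map
x, y = 0.0, 0.0
for _ in range(100):                             # transient onto the attractor
    x, y = step(x, y, 0.0)
times = []
for _ in range(200):
    x, y = step(x, y, 0.0)
    times.append(stabilize(x, y))

ok = [t for t in times if t is not None]
print(f"success rate: {len(ok) / len(times):.2f}, mean N: {sum(ok) / len(ok):.1f}")
```

With ε = 0.7 the run should reproduce the qualitative picture of Fig. 2(c): control is achieved for essentially all sampled initial conditions after a few tens of iterations.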

B. Controlling the Rössler system

To extend our algorithm to continuous-time chaotic systems, let us consider the Rössler equations subjected to the DFC force:

ẋ = −y − z,    (9)
ẏ = x + ay − kD(t),    (10)
ż = b + z(x − c).    (11)

Here x, y, z are the dynamic variables of the Rössler system. The parameters a = 0.2, b = 0.2 and c = 5.7 are chosen such that the system exhibits chaotic behavior. We suppose that y(t) is an observable and the DFC perturbation kD(t) is applied only to the second equation of the Rössler system. Here k is the feedback gain and D(t) denotes the difference (1) of the observable between the current state and the state delayed by the period τ of the UPO. Numerical analysis shows that the linear DFC is able to stabilize the period-1 and period-2 UPOs of the Rössler system for any initial conditions placed on the strange attractor. However, this is not the case for the period-3 UPO with period τ = 17.51. The linear DFC is not able to stabilize this orbit for arbitrary initial conditions, and thus we need a modified algorithm. As in the previous example, we do not perturb the system when its state is far away from the desired orbit, and we activate the control in the form of Eq. (1) when the state approaches the target UPO. The linear analysis of Eqs. (9)-(11) with the perturbation (1) shows that the target orbit is stable in the interval of control gain 0.05 < k < 0.34. The optimal value of the control gain is k_op = 0.06; it provides the fastest convergence to the stabilized orbit. The main problem here is to determine the moment when the state of the free system falls into a small neighborhood of the desired orbit. Having a scalar observable y(t), the standard way to reconstruct the phase space of the system is to introduce delay coordinates. Using an (M + 1)-dimensional delay-coordinate vector r(t) = [y(t), y(t − δt), …, y(t − Mδt)] with delay δt, one can write inequalities analogous to (7),(8) for the Hénon map. Now, to guarantee the closeness of the system state to the desired orbit, instead of (7),(8) we can require |D(t − jδt)| < ε, j = 0, …, M.
These inequalities mean that the difference |D(t)| has to be small at M + 1 equally spaced points in the time interval [t − Δt, t] with Δt = Mδt. For continuous-time systems, we can alternatively require the smallness of a moving average of the difference |D(t)| over the whole interval Δt (the window of the moving average), i.e.,

(1/Δt) ∫_{t−Δt}^{t} |D(s)| ds < ε.

The main advantage of this alternative is that the moving average can be simply estimated electronically. The simplest moving-average filter can be designed as a first-order low-pass filter. As a result, we come to the following modification of the DFC algorithm. To estimate the moving average of the difference |D(t)| we introduce an auxiliary variable w that satisfies the first-order filter equation

τ_w ẇ = |D(t)| − w.    (12)

An asymptotic solution of this equation is

w(t) = (1/τ_w) ∫_{−∞}^{t} e^{(s−t)/τ_w} |D(s)| ds.    (13)

Thus the variable w represents an exponentially weighted moving average of the difference |D(t)|. The characteristic

Fig. 3. Histogram of the time needed to achieve control in the Rössler system for k = k_op = 0.06, τ = 17.51, and τ_w = τ/7. N is the number of periods τ needed for stabilization of the target UPO and P(N) is the number of successful stabilizations with the given time of control. The parameter ε and the mean number ⟨N⟩ of periods needed to achieve control are (a) ε = 0.7, ⟨N⟩ = 6.87; (b) ε = ε_c = 0.33, ⟨N⟩ = 9.54; (c) ε = 0.25, ⟨N⟩ = 11.25; (d) ε = 0.2, ⟨N⟩ = 12.4. In (a) only 79% of initial conditions are successful, while in (c),(d) a 100% success rate is obtained.

window of the moving average is Δt = τ_w. The smallness of this variable indicates the closeness of the system state to the target orbit. The control procedure is as follows. We start from initial conditions placed on the strange attractor of the free Rössler system and integrate Eqs. (9)-(11) and (12) with k = 0 as long as w > ε. As soon as the variable w becomes small, w < ε, we set k = k_op and continue the integration of the system subjected to the DFC. We assume that the control goal is achieved when w decreases to a given value ε_F = 3 × 10⁻² ≪ ε. We repeat this procedure for 10³ different initial conditions and plot in Fig. 3 histograms of the time needed to achieve control. For a suitably chosen filter parameter τ_w = τ/7 and fairly small ε < ε_c ≈ 0.33, the algorithm produces a 100% success rate for any initial conditions. For ε = ε_c the mean time of control amounts to approximately 10 periods of the UPO. We emphasize that our modification is very simple. The closeness of the system state to the target orbit is estimated from the usual DFC signal (1) by means of the standard low-pass filter (12). Thus the modification allows a simple analogue implementation and preserves the main advantages of the usual linear DFC technique.

III. ANTICIPATING CHAOTIC SYNCHRONIZATION

Synchronization of oscillations is a phenomenon common to a large variety of nonlinear dynamical systems in physics, chemistry, and biology [30]. Whereas the first investigation of the synchronization phenomenon goes back to the work of Huygens in 1665, the past decades have witnessed considerable interest in the synchronization of chaotic systems. The behavior of chaotic systems is characterized by instability and, as a result, limited predictability in time. Intuitively it would seem that chaos and synchronization are two mutually exclusive notions. However, it has been shown that synchronization can appear in chaotic systems in many different ways,


including identical, phase, projective, lag, and anticipating synchronization. The latter type of synchronization, introduced by Voss [31] some years ago, is the most counterintuitive. In the case of anticipating synchronization one deals with two systems, a "master" and a "slave", which are coupled unidirectionally via a time-delay feedback in such a manner that the slave system predicts the behavior of the master system. More specifically, the coupling scheme introduced by Voss is as follows:

ṙ_1 = f(r_1),    (14)
ṙ_2 = f(r_2) + K[r_1 − r_2(t − τ)],    (15)

where r_1(t) and r_2(t) are the dynamic vector variables of the master and slave system, respectively, f is a nonlinear vector function, τ is a delay time, and K is a coupling matrix. It is easy to see that the anticipatory synchronization manifold r_2(t) = r_1(t + τ) is a solution of Eqs. (14),(15). For an appropriate choice of τ and K this solution can be stable, i.e., the slave anticipates by an amount τ the output of the master. This phenomenon has been studied numerically for a variety of systems and verified experimentally in electronic circuits [32] and chaotic semiconductor lasers [33]. Implementation of anticipating synchronization as a strategy for real-time forecasting of a given dynamics requires the design of coupling schemes with as large an anticipation time as possible. The analysis performed in Ref. [31] shows that the scheme (14),(15) with a diagonal matrix K is ineffective: its maximum stable anticipation time is much shorter than the characteristic time scales of the system's dynamics. In order to enlarge the prediction time it was proposed to extend Eq. (15) with a chain of N unidirectionally coupled slave systems [34]:

ṙ_i = f(r_i) + K[r_{i−1} − r_i(t − τ)],   i = 2, …, N + 1.    (16)

Formally, the prediction time of this scheme is N times larger than that of the scheme (15). However, it was shown that the chain (16) is unstable to propagating perturbations, and this convective-like instability limits the number of slaves in the chain that can operate in a stable regime [35]. In our recent paper [36], we addressed the question of whether it is possible to considerably prolong the prediction time via a suitable choice of the coupling matrix K. For typical low-dimensional chaotic systems we gave a positive answer. We proposed an algorithm for designing K and showed that the prediction time can be enlarged several times in comparison with the diagonal coupling usually used in the literature. Utilizing the scheme (14),(15) with a single slave system, we obtained a stable anticipation time comparable to the characteristic period of chaotic oscillations. Here we demonstrate the heuristic idea of our algorithm with the Rössler system, which is given by the three-dimensional vector variable r = [x, y, z] and the vector field

f(r) = [−y − z, x + ay, b + z(x − c)].    (17)

In the following we set a = 0.15, b = 0.2, c = 10 and suppose that both r and f are column vectors. Although the Rössler system has two fixed points, the strange attractor originates from the one located close to the origin,

r_0 = [(c − s)/2, (s − c)/2a, (c − s)/2a],

where s = (c^2 − 4ab)^{1/2}. The fixed point is a saddle-focus with an unstable 2D manifold (an unstable spiral) almost coinciding with the (x, y) plane and a stable 1D manifold almost coinciding with the z axis. The phase point of the system spends most of the time in the (x, y) plane, moving along the unstable spiral according to the approximate equations ẋ = −y, ẏ = x + ay. Whenever x approaches the value x ≈ c, the z variable comes into play: the phase point leaves the (x, y) plane for a short time and then returns to the origin via the stable z-axis manifold. Taking into account this topology of the strange attractor, we choose the coupling matrix as K = kQ, where k is a scalar parameter defining the coupling strength and

Q = [[cos α, −sin α, 0], [sin α, cos α, 0], [0, 0, 0]]    (18)

is a 3 × 3 matrix that projects the vector field onto the unstable (x, y) plane and rotates this projection by the angle α = ωτ. Here ω is the frequency of the unstable spiral, which for the Rössler system is ≈ 1. The main advantage of this choice consists in the phase-lag compensation of the time-delay feedback term in Eq. (15). When the system moves along the unstable spiral in the (x, y) plane, the vector Qr_2(t − τ) is in phase with the vector r_2(t) [cf. Fig. 4], and thus the term Kr_2(t − τ) provides a correct negative feedback. We refer to this coupling law as phase-lag compensating coupling (PLCC).

Fig. 4. Phase-lag compensation of the delayed vector r_2(t − τ).

In Fig. 5 we compare the effect of the PLCC with the usual diagonal coupling, K = k diag[1, 1, 1]. The time of reliable prediction for the PLCC is τ ≈ 3.8; it exceeds by four times the maximum prediction time for the diagonal coupling. The characteristic period of chaotic oscillations for the Rössler system is ≈ 6. Thus our algorithm allows us to make a prediction for more than half of this period. We stress that the PLCC enables forecasting of the global dynamics of the system, although the coupling matrix (18) takes into account only the local properties of the phase space.
Indeed, the phase-lag compensation via rotation of the vector field is strictly valid only in the vicinity of the fixed point. It is notable that a similar rotational feedback gain has been recently used by Fiedler et al. [20] to overcome the odd number limitation in the DFC algorithm. The proposed algorithm can be generalized to any Rössler-type dynamical system. It works even for double-scroll chaotic systems [36] such as the Chua circuit or the Lorenz system.
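The master-slave scheme (14),(15) is easy to explore numerically. The sketch below uses a plain Euler integrator with a delay buffer (a rough approximation, adequate for illustration). To stay safely inside the stability range, the run uses diagonal coupling with a deliberately conservative anticipation time τ = 0.5 and k = 0.36 (our choice; the values quoted in Fig. 5 sit at the stability margin). The PLCC matrix of Eq. (18) is included so that the rotated coupling can be swapped in:

```python
import numpy as np

A, B, C = 0.15, 0.2, 10.0                  # Rössler parameters of Sec. III

def f(r):
    """Rössler vector field, Eq. (17)."""
    x, y, z = r
    return np.array([-y - z, x + A * y, B + z * (x - C)])

def plcc(k, omega, tau):
    """Coupling matrix K = k*Q, Eq. (18): project on (x, y), rotate by omega*tau."""
    al = omega * tau
    return k * np.array([[np.cos(al), -np.sin(al), 0.0],
                         [np.sin(al),  np.cos(al), 0.0],
                         [0.0,         0.0,        0.0]])

def run(K, tau, dt=0.01, t_end=300.0):
    """Euler integration of the master-slave scheme (14),(15) with delay buffer."""
    n_tau = round(tau / dt)
    r1 = np.array([1.0, 2.0, 0.0])         # master
    r2 = np.array([1.1, 1.9, 0.1])         # slightly mismatched slave
    hist = [r2.copy()] * (n_tau + 1)       # crude constant history for t < tau
    y1, y2 = [], []
    for _ in range(int(t_end / dt)):
        r2_del = hist[-(n_tau + 1)]        # r2(t - tau)
        dr1 = f(r1)
        dr2 = f(r2) + K @ (r1 - r2_del)
        r1, r2 = r1 + dt * dr1, r2 + dt * dr2
        hist.append(r2.copy())
        y1.append(r1[1]); y2.append(r2[1])
    return np.array(y1), np.array(y2), n_tau

tau = 0.5
y1, y2, n = run(0.36 * np.eye(3), tau)     # diagonal coupling, conservative tau

skip = 15000                               # discard the synchronization transient
err_future = np.mean(np.abs(y2[skip:-n] - y1[skip + n:]))  # y2(t) vs y1(t + tau)
err_now = np.mean(np.abs(y2[skip:] - y1[skip:]))           # y2(t) vs y1(t)
print("anticipation error:", round(err_future, 3), "same-time error:", round(err_now, 3))
```

If the anticipating manifold is stable, the slave's output at time t lies closer to the master's future y_1(t + τ) than to its present y_1(t). Replacing the diagonal matrix by plcc(0.18, 1.0, tau) with a larger τ probes the PLCC regime discussed in the text.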


Fig. 5. Time series y_1(t) of the master (thin line) and y_2(t) of the slave (bold line) Rössler systems. (a) The diagonal coupling with K = 0.36 diag[1, 1, 1] and τ = 0.9. (b) The PLCC with K = 0.18 Q, ω = 1, and τ = 3.6. In both cases the coupling parameters are optimized so as to attain the maximum stable anticipation time.

IV. CONCLUSION

Although the DFC and anticipating synchronization algorithms are similar in the sense that both involve a time-delay feedback, the DFC is much more effective: it provides stabilization of UPOs whose periods are comparable with the characteristic period of the chaotic oscillations. In contrast, the usual anticipating synchronization algorithm is able to stabilize the anticipatory manifold only for small time delays. To prolong this time we have designed a special phase-lag compensating coupling. The difference in efficacy of the two algorithms is due to the different roles of the delayed feedback in each. The DFC law (1) involves both the proportional feedback term y(t) and the delayed feedback term y(t − τ). The stabilization of UPOs is mainly caused by the proportional feedback; the delay term is necessary to make the control force vanish on the desired UPO. In contrast, the feedback law K[r1 − r2(t − τ)] in the anticipating synchronization algorithm (14), (15) does not contain a proportional feedback term. It involves only the delayed feedback term r2(t − τ), and this term alone is responsible for the stabilization of the anticipatory manifold.
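As a concrete, deliberately crude illustration of the DFC law discussed above (the standard Pyragas feedback F(t) = K[y(t − τ) − y(t)], which vanishes on a τ-periodic orbit), the following sketch integrates a Rössler system with such a term added to the y equation. Euler stepping, the gain, the delay, and the step size are illustrative choices, not the paper's optimized values.

```python
import math

def roessler_dfc(K=0.2, tau=5.88, dt=0.01, steps=20000, a=0.2, b=0.2, c=5.7):
    """Euler integration of the Roessler system with a Pyragas-type
    delayed feedback F(t) = K*(y(t - tau) - y(t)) added to the y equation.
    Gain, delay, and step size are illustrative, not the paper's values."""
    delay = max(1, int(tau / dt))
    x, y, z = 1.0, 1.0, 1.0
    y_hist = [y] * delay              # ring buffer holding y over one delay
    forces = []
    for n in range(steps):
        y_tau = y_hist[n % delay]     # y(t - tau); constant during startup
        F = K * (y_tau - y)           # control force; vanishes on a tau-periodic orbit
        dx = -y - z
        dy = x + a * y + F
        dz = b + z * (x - c)
        x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
        y_hist[n % delay] = y
        forces.append(F)
    return forces

forces = roessler_dfc()
assert len(forces) == 20000
assert all(math.isfinite(f) for f in forces)   # trajectory stays bounded
```

Monitoring |F(t)| in such a simulation is the usual diagnostic: the force becomes small when the trajectory settles onto an orbit of period τ.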

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 A. Hedman and H. Alm: Testing Image Browsers - An Analysis of Layout and Presentation Factors that Affect Usability


Testing Image Browsers - An Analysis of Layout and Presentation Factors that Affect Usability

A. Hedman and H. Alm

Abstract—Digital cameras are replacing analogue ones, and electronic photo albums are everywhere on the Internet. Image browsing is used for searching the contents of such albums, but also for a variety of other tasks. Results from image browser studies are sometimes either unexpected or in conflict with previous results. Based on observations made during such experiments, we have identified and analysed a number of visual factors that seem to be relevant to the outcome of the tests. This analysis identifies research gaps that need to be filled in order to support the choice of an appropriate browsing technique in interface design. Some suggestions for future user tests are also given.

Image browsing normally means either or both of the following:

• A set of images is displayed on the screen, and the task is to identify a possible target image, usually by enlarging it with some technique.

• An image, too large to fit the screen at readable size, is displayed, and the task is to investigate it.

Index terms—Fisheye, Fitts' law, icons, image browsing, usability, zoom.

I. INTRODUCTION

Most computer users would agree that screen space can be insufficient, but never too large. One type of task that requires a lot of screen space is viewing images (Figure 1). As digital cameras become more common, people tend to store their images in electronic albums rather than in physical ones. People also increasingly use the Internet to search for information about geographic locations (Figure 2). Normally some browsing technique is used that enlarges images and enables both overviews of entire sets of images and views of selected images in detail. Browsing techniques have been compared in several studies over the years, with varying results. The question of which technique is most efficient gets an answer in each of these studies, but the answers differ from one study to another, and the analyses do not really form a clear pattern that can provide guidelines for future design. Before conducting yet another test, we wanted to look into the details behind what makes a certain technique efficient.

Manuscript received October 9, 2009. This work was supported in part by the European Union, Interreg IV A Nord. Anna Hedman is with the Department of Computer Science and Electrical Engineering, Luleå University of Technology, SE-97187 Luleå, Sweden (phone: +46 920-493067; e-mail: [email protected]). Håkan Alm is with the Department of Human Work Sciences, Luleå University of Technology, SE-97187 Luleå, Sweden (e-mail: [email protected]).

Figure 1 Electronic photo album

Image browsing techniques usually make use of one or more of the following mechanisms:

Click to expand is the same strategy as used on a regular desktop system (Figure 3). The iconic representation of the image can be either a miniature image or a simple symbol. When clicked, a large and readable version of the content is displayed. If the representation is a miniature image, this technique resembles a two-level zoom without animation.

Panning has the same purpose as scrollbars, but is often more efficient to use. For diagonal movements, scrollbars require two operations, while panning can be done with one single movement.

Zooming can be done in several ways: with discrete levels or in a continuous mode, with animation or without. This mechanism is particularly useful in combination with a pan function, since adjusting the view by zooming out and selecting a new target to zoom in on can be very time-consuming and inefficient.


Distortion-oriented techniques are interfaces that provide both focus and context, such as the perspective wall [16], the bifocal view, and all types of fisheye views [8], [15]. Focus is normally moved either by dragging or by clicking.

Multiple views are often used in combination with another technique. A zoom interface in particular can benefit from an overview to prevent the user from getting lost.

Figure 2 Map with overview

There are mainly two ways to improve the efficiency and usability of image browsing: modify the presentation of the data, or modify, or change, the browsing technique used for the task. It seems as if certain image browsing techniques benefit from certain layout and presentation strategies. This paper aims at identifying and understanding factors that affect usability for existing techniques. Perhaps this analysis can help in interpreting the outcome of image browser tests, and give some useful directions on how to conduct future studies with more homogeneous results. First we give a summary of comparisons made, followed by a definition of task completion time for image browsing. After that we analyse the visual factors of image browsing, and discuss the impact of each of them. Finally, we suggest some topics for future work.

II. RELATED WORK

Several studies comparing browsing techniques have been conducted, with various results that do not always support previous work. In a comparison by Beard and Walker [1], interfaces using scrollbars, zoom, and roam (pan), with and without an overview, were compared. They found that all techniques gave better results with the overview, and that scroll bars were significantly slower than the other two techniques. A study by Kaptelinin [14] comparing scroll bars, dragging, a pop-up overview, and a pop-up overview containing a field-of-view indicator showed that interfaces with a pop-up overview were significantly faster than both the scroll-bar interface and the drag interface.

Hornbæk, Bederson, and Plaisant [13] compared zooming interfaces with and without an overview and found that, in contrast to previous results, subjects were faster without the overview. User preference, however, was for having an overview. Combs and Bederson [4] compared four browsers for photographs: a thumbnail browser featuring a horizontal scrollbar, a zoom-and-pan browser, and two 3D browsers. Participants performed better with both 2D browsers than with the 3D browsers; they also made fewer incorrect selections with the 2D browsers, and rated them higher. This contradicts previous results showing that scroll bars are slower than zoom.

In a study on the efficiency of fisheye browsers, Schaffer et al. [18] compared fisheye and zoom-and-replace interfaces on node-link diagrams representing a telephone network. For this task, the fisheye view was significantly faster than the zoom-and-replace. Hornbæk and Frøkjær [12] compared linear, fisheye, and overview-plus-detail interfaces in order to find out how browsers affect the readability of electronic documents. They found that participants performed faster with the fisheye interface, but received higher grades with the overview+detail interface. Contradicting the results indicating that fisheye would be faster than zoom, Donskoy and Kaptelinin [5] found, in a comparison of scroll-bar, zoom, and fisheye interfaces, that the lowest task completion times were achieved with the zooming interface; the fisheye turned out to be the slowest technique. Gutwin and Fedak [9] made a comparison on small screens, finding that a fisheye interface was useful for web navigation tasks and a two-level zoom for a monitoring task, while panning slowed down users regardless of task type.

Figure 3 Desktop WIMP interface

Two studies by Hedman et al. [10] and Hedman [11] compared an iconic interface with zoom-and-pan and a bifocal view. The second study was conducted on a 50" plasma display, but the results were similar. The iconic interface turned out to be the most efficient in both studies. These


experiments, and a number of pilot studies, raised a lot of questions, which form the basis for the analysis presented in this paper. A variety of image browsers was presented by Spence [19], and a description of image browsers is given in the image-browser taxonomy by Plaisant et al. [17]. They describe image browsers in detail, and how different types of browsing tasks make use of different browser characteristics. The most extensive survey of research done within this field so far was made by Cockburn et al. [3]. In their summary, the authors state that it is difficult to provide guidelines from previous research, since many factors influence the results. This paper provides an analysis of visual factors in image browsers. The aim is to give one explanation of why results from previous research differ so much, and to highlight a few things to consider when conducting tests on image browsers.

III. MEASURING USABILITY

User tests involving image browsers are often conducted with task completion time, error rate, and ratings from subjects as measures of usability. In this paper we focus mainly on task completion time. The time it takes to find a target image among others can be divided into three parts:

• The time it takes to recognize a possible target. (Seek)

• The time it takes to access the target. (Access)

• The time it takes to interpret the information once it has been accessed. (Interpret)

The time it takes to recognize a possible target depends on how easy it is to distinguish the target from the rest of the set [2], [21]. A click-to-expand technique is superior to most other techniques if the real target can be located in this phase. However, in a set of data where the target cannot be identified either by its characteristics or by its location, there might not be much to gain by clicking on each object compared to using a zoom-and-pan or a fisheye function. This assumption has not yet been verified, though.

The time it takes to access the target, once it has been selected by the user, depends on the technique used. Most techniques are divided into a sequence of operations: move mouse and press button, or press button and move mouse, or move mouse, press button, and then move mouse again, etc.

The time it takes to interpret the information, once the image has been accessed, depends on the type of data and, of course, on the type of task.
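The three-part decomposition above can be sketched as a simple per-trial timing record, the kind of structure one might log in a user test. The class name and the example numbers are hypothetical, for illustration only.

```python
from dataclasses import dataclass

@dataclass
class BrowsingTrial:
    """One measured browsing trial, split into the three phases:
    Seek, Access, Interpret. Field values below are hypothetical."""
    seek: float       # time to recognize a possible target (s)
    access: float     # time to reach and enlarge the target (s)
    interpret: float  # time to read the accessed content (s)

    @property
    def completion_time(self) -> float:
        # Task completion time is the sum of the three phases.
        return self.seek + self.access + self.interpret

trial = BrowsingTrial(seek=1.2, access=0.6, interpret=2.0)
assert abs(trial.completion_time - 3.8) < 1e-9
```

Logging the phases separately, rather than only the total, makes it possible to attribute differences between browsing techniques to the phase they actually affect.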

IV. LAYOUT AND PRESENTATION OF IMAGES

Image browsing is often discussed in terms of interaction technique or user task. Here we discuss the visual details, i.e. the presentation and layout of images. There are a number of variables that could affect task completion time.

Visual factors can be divided into two groups. The first group consists of variables that are not limited by any other factor mentioned below:

• Display size

• Resolution

• Type of data

• Organization of images

The second group consists of layout and presentation parameters that are limited by the variables of the first group, and that may also limit other variables within the group and create confounding variables. For image browsers, these are:

• Number of images

• Size of images (i.e. size of representations)

• Spacing

• Appearance of images

• Display area usage (visual angle)

• Magnification ratio

In the following subsections, we discuss these variables in more detail. A dependency graph (Figure 8) illustrates the relationships between the variables, and how each variable affects task completion time.

A. Display Size
The size of the display limits the largest possible size of images that can be viewed on the screen, and how large the spacing between images can be. In combination with resolution, display size also affects the number of images that can be viewed at the same time and the magnification ratio between overview and detail view. If the image cannot be viewed fully at a readable size, the time it takes to interpret the contents of an image could benefit from a larger display.

B. Resolution
Resolution affects the maximum number of images that can be displayed in the same area at once, and the appearance of images. Whether the effect on the appearance of images is significant or not depends on how important resolution is to the recognition of various symbols. When it comes to text and numbers, readability is more affected by the size of characters than by resolution [20], which does not necessarily mean that users find size more important than high resolution.

C. Type of Data
Type of data is related to appearance of images. A simple symbol of a distinct colour and shape is clearly different from a complex image with many different colours and patterns. The time it takes to interpret information might be highly dependent on this factor. In user tests aiming at comparing image browsing techniques, time spent in the interpretation phase should be as


short as possible. If the purpose is to find a certain target image, the content of the image should not be too complex. A simple and suitable task for this would be to identify a photo or some simple figure. If the task includes browsing a large image, a more complex image could be used in the test. The complexity of an image varies with the amount of information represented in the image, and with how it is presented. A detailed photo can contain a lot of detail that can nevertheless be taken in all at once; a bus timetable, on the other hand, needs to be searched systematically in order to find certain information. When conducting a user test it is important to keep in mind that, for complex tasks, performance can be affected by the level of user skill.

D. Organization of Images
Organization of images is a matter of layout and sorting. Images can be laid out in various patterns (rows and columns, etc.) on the display. Sorting can be done by some characteristic, such as content or appearance of images (colour, shape, etc.). This factor could affect the visual search phase if the user recognizes the pattern.

E. Number of Images
The number of images can, when the display is full, be increased by decreasing the size of the images. However, there is an upper limit for the number of pictures that, when exceeded, will make the view useless without a proper browsing technique. The largest number of objects that can be viewed at the same time is limited by the size of images, the spacing between images, and the resolution. If the resolution is high, more objects (but of smaller size) can be displayed.

Figure 4 Varied number of images with spacing and image size kept constant

F. Size of Images
Size of images is, according to Fitts' law [7], important to the time it takes to access an image. Besides affecting precision in the target access phase, size could also affect the time it takes to visually find a target or interpret the information. The maximum size of images is limited by the number of images and by the spacing between images. If the magnification ratio plays an important role (for instance when zoom interfaces are tested), the size of the smaller representations of images will be determined by this factor. Alternatively, the magnification ratio will be determined by the size of images, just as spacing and the number of images are. Size of images also affects the appearance of images, since larger images are more easily recognized than smaller ones.

Figure 5 Varied image size with number of images and spacing kept constant

G. Spacing
Spacing affects, and is affected by, similar factors as size of images. The number of images, the size of images, and the size of the display area limit the maximum spacing. Even when using a fixed number and size of images, the display area usage, and thereby also the visual angle, will change if the spacing between the images is changed. With Fitts's law [7] in mind, it would be reasonable to assume that larger spacing would increase task completion time. However, the effects of spacing in an iconic interface were investigated by Everett and Byrne [6], and they found no evidence for this. It remains to be investigated how spacing affects other browsing techniques, and whether different techniques are efficient for different levels of spacing. Furthermore, there might be an interaction with other factors, such as display size (the effects of spacing could be larger on a larger display), type of data, and appearance of icons. In a comparative study, where users could use either a drag function or a click function in the same bifocal interface, it was observed that users seemed to prefer to click [11]. A possible explanation for this could be that objects separated by spacing encourage clicking.

Figure 6 Varied spacing with number of images and image size kept constant

H. Appearance of Images
Appearance of images has an impact on the amount of scaling (magnification ratio) needed in order to interpret the content. Small images, icons, and other symbols are more easily recognized if they have distinctive characteristics, such as colour, shape, and contrast. An image with a colour that is distinctly different from the rest of the set can be found just as quickly in a large set as in a small set [21], [22]. If all images have approximately the same appearance, and there is no text accompanying the images, there is nothing but location to go by in order to find a target image. And if the location is unknown, the user needs to go through the images one by one in order to find a target.

I. Display Area Usage
Display area usage refers to how much of the display is used, counting from the edges of the topmost and leftmost


images, to the lowest and rightmost images. This measure determines the visual angle. When either the spacing, the size of images, or the number of images is changed while the other variables are kept constant, the visual angle changes automatically (see Figures 4, 5, and 6). This means that testing the variables above separately under equal conditions is difficult. On smaller displays, the difference in visual angle between various levels of spacing is rather small. On a 50'' display, however, the area to search can differ by several decimeters in each direction. This could affect both the search phase and the target access phase.

J. Magnification Ratio
The magnification ratio, or zoom factor, is the ratio between the miniature image and the actual image. This factor is particularly important to consider when comparing a zoom interface to other browsing techniques. In a zoom interface, a large zoom factor means a longer distance to move along the z-axis before a readable view is achieved. Furthermore, zooming in on a large high-resolution image could mean that the user loses too much context for the technique to be useful. Panning and dragging operations are affected by the magnification ratio too. If the ratio is high, the actual area to pan in the zoomed-in mode will be large, and the larger this area is, the more surface will lie "outside" the focus area (Figure 7).

Figure 7 Original view and zoomed in view

Panning or dragging the view from one edge of the surface to the opposite one will be problematic. A rate high enough to enable reaching the opposite edge in one hand movement will affect the rate at which the view passes by on the display, and make it hard to see what is on the screen. Small movements, done at a lower rate, require several hand movements back and forth, and more time to go from one edge to the other, which causes fatigue.

Figure 8 illustrates the relationships between the presentation and layout variables. At the bottom of the figure are the four variables from the first group, which are not affected by other variables. Above them are the variables from the second group, which are both affected by and affect other variables. Right below the top line, representing the three steps of a browsing task, is the browsing technique.
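Two of the quantities discussed in Sections IV.I and IV.J, the visual angle of the used display area and the amount of surface lying outside the focus area at a given magnification ratio, are simple to compute. The sketch below assumes a hypothetical 0.6 m viewing distance; the function names and numbers are illustrative, not taken from the paper.

```python
import math

def visual_angle_deg(extent_m, distance_m):
    """Visual angle subtended by the used display area (cf. display
    area usage): 2 * atan(extent / (2 * viewing distance))."""
    return math.degrees(2 * math.atan(extent_m / (2 * distance_m)))

def off_focus_fraction(magnification):
    """Fraction of the zoomed-in surface lying outside the one-screen
    focus area (cf. magnification ratio): at ratio r, the full surface
    covers r^2 screens, of which one is visible."""
    return 1 - 1 / magnification ** 2

# A ~1 m wide layout on a 50" display subtends a much larger visual
# angle than a ~0.3 m layout on a desktop monitor, at the same distance.
assert visual_angle_deg(1.0, 0.6) > 2 * visual_angle_deg(0.3, 0.6)

# At magnification ratio 4, 15/16 of the surface lies outside the focus.
assert abs(off_focus_fraction(4) - 15 / 16) < 1e-12
```

This quantifies the observation that both the search area and the pannable off-screen surface grow quickly with display size and zoom factor.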

V. DISCUSSION

It is clear that different browsing techniques are unequally efficient when it comes to simple tasks, such as accessing an image. For instance, the time it takes to access a selected image with an iconic click-and-expand interface can be calculated using Fitts' law:

t = a + b log2(1 + D/W)    (1)

The time it takes to access a selected image with a zoom interface, regardless of starting point, could be calculated by:

t = a + b log2(1 + D/W) + t_zoom + t_pan    (2)

Either t_zoom or t_pan could be 0, but not both. Hence, accessing a single image cannot be done as quickly with a zoom interface as with an iconic interface.

We have tried to draw attention to the fact that task completion time for image browsing can be affected by many factors. These factors can in turn be affected by other factors, which makes it harder to say exactly what improves usability and what does not. Further testing is needed, but testing image browsers is not an easy task. Assume we take a variable, factor A, into account in a usability test, but factor A is affected by another factor B, which we have not considered. How do we then tell what actually caused the outcome of the test? If not all variables can be controlled, it is harder to predict and explain the outcome, and results may conflict with results from other, similar studies.

Each factor that has not yet been tested should be tested separately. Then the interaction between various factors and browsing techniques should be tested. With this approach, it might be possible to clarify whether certain browsing techniques benefit from certain layout and presentation strategies.

However, there is more than layout and presentation that affects usability for image browsing. Plaisant et al. [17] described a number of different image browsing tasks and which browsing mechanisms are needed for various operations. It is reasonable to assume, just as Cockburn et al. [3] pointed out, that there exists an interaction between type of task and browsing technique. A logical way to continue would be to define the properties of different tasks. If those properties are defined, it will be possible to vary them and to investigate a possible interaction between type of task and browsing technique. Another category of factors to analyse is hardware factors, such as type of input device, acceleration function, usage of buttons, etc.
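The comparison between the two access-time models can be sketched numerically. The constants a and b, and the zoom/pan times, are hypothetical device and user parameters, chosen only to illustrate that the zoom interface carries an unavoidable extra cost.

```python
import math

def fitts_time(d, w, a=0.1, b=0.15):
    """Eq. (1): t = a + b * log2(1 + D/W), with hypothetical constants
    a, b in seconds; d is target distance, w is target width."""
    return a + b * math.log2(1 + d / w)

def zoom_access_time(d, w, t_zoom=0.8, t_pan=0.0, a=0.1, b=0.15):
    """Eq. (2): the same pointing cost plus zoom and/or pan time."""
    assert t_zoom > 0 or t_pan > 0   # either could be 0, but not both
    return fitts_time(d, w, a, b) + t_zoom + t_pan

# A target 400 px away and 40 px wide: the zoom interface is always
# slower by the (positive) zoom/pan overhead.
d, w = 400, 40
assert zoom_access_time(d, w) > fitts_time(d, w)
```

Since t_zoom + t_pan is strictly positive, the iconic interface's access time is a lower bound for the zoom interface's, whatever values the constants take.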
Considering that user experience and task completion time sometimes differ, it is important to investigate subjective satisfaction further. Measures of mental workload can be added to the part of the analysis concerning subjective satisfaction, which would increase knowledge of subjects' attitudes toward different browsing techniques. Research that involves human beings is difficult. There is always a risk of confounding variables and unpredicted


behaviour. Thorough design and analysis of experiments will generate reliable results and a solid base on which to build further research. Hopefully, this contribution is a step in that direction.

Figure 8 Dependency graph of layout and presentation factors

REFERENCES
[1] Beard, D. and Walker II, J. Navigational techniques to improve the display of large two-dimensional spaces. Behaviour and Information Tech., 9(6), 1990, p. 451-466.
[2] Byrne, M. D. Using Icons to Find Documents: Simplicity is Critical. In Proc. Conf. Human Factors in Computing Systems, 1993, p. 446-453.
[3] Cockburn, A., Karlson, A., and Bederson, B. B. A Review of Overview+Detail, Zooming, and Focus+Context Interfaces. ACM Computing Surveys (CSUR), 41(1), 2008, p. 1-31.
[4] Combs, T. and Bederson, B. B. Does zooming improve image browsing? In Proceedings of the ACM Conference on Digital Libraries (DL '99), Berkeley, CA, 1999, p. 130-137.
[5] Donskoy, M. and Kaptelinin, V. Window navigation with and without animation: a comparison of scroll bars, zoom, and fisheye view. Extended Abstracts of Human Factors in Computing Systems (CHI '97), Atlanta, GA, 1997, p. 279-280.
[6] Everett, S. P. and Byrne, M. D. Unintended effects: Varying icon spacing changes users' visual search strategy. In Human Factors in Computing Systems: Proceedings of CHI 2004, New York, 2004, p. 695-702.
[7] Fitts, P. M. The information capacity of the human motor system in controlling the amplitude of movement. Journal of Experimental Psychology, 47, 1954, p. 381-391.
[8] Furnas, G. W. Generalized fisheye views. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '86), Boston, MA, 1986, p. 16-23.
[9] Gutwin, C. and Fedak, C. Interacting with Big Interfaces on Small Screens: a Comparison of Fisheye, Zoom, and Panning Techniques. In Proceedings of Graphics Interface, 2004, p. 19-26.
[10] Hedman, A., Carr, D., and Nässla, H. Browsing Thumbnails: A Comparison of Three Techniques. In Proceedings of the 26th International Conference on Information Technology Interfaces, 2004, p. 353-360.
[11] Hedman, A. Image Browsing on a Large Display. In Proceedings of the 29th International Conference on Information Technology Interfaces, 2007, p. 245-250.
[12] Hornbæk, K. and Frøkjær, E. Reading patterns and usability in visualizations of electronic documents. ACM Transactions on Computer-Human Interaction (TOCHI), 10(2), 2003, p. 119-149.
[13] Hornbæk, K., Bederson, B. B., and Plaisant, C. Navigation patterns and usability of zoomable user interfaces with and without an overview. ACM Transactions on Computer-Human Interaction (TOCHI), 9(4), 2002, p. 362-389.
[14] Kaptelinin, V. A comparison of four window navigation techniques in a 2D browsing task. Conference Companion on Human Factors in Computing Systems (CHI '95), Denver, CO, 1995, p. 282-283.
[15] Leung, Y. K. and Apperley, M. D. A review and taxonomy of distortion-oriented presentation techniques. ACM Transactions on Computer-Human Interaction, 1(2), 1994, p. 126-160.
[16] Mackinlay, J. D., Robertson, G. G., and Card, S. K. The perspective wall: detail and context smoothly integrated. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '91), New Orleans, LA, 1991, p. 173-179.
[17] Plaisant, C., Carr, D., and Shneiderman, B. Image-browser taxonomy and guidelines for designers. IEEE Software, 12(2), 1995, p. 21-32.
[18] Schaffer, D., Zuo, Z., Greenberg, S., Bartram, L., Dill, J., Dubs, S., and Roseman, M. Navigating hierarchically clustered networks through fisheye and full-zoom methods. ACM Transactions on Computer-Human Interaction, 3(2), 1996, p. 162-188.
[19] Spence, R. Information Visualization: Design for Interaction (2nd Edition). Prentice Hall (Pearson), 2007, ISBN 0132065509.
[20] Swaminathan, K. and Sato, K. Interaction design for large displays. ACM interactions, 4(1), 1997, p. 15-24.
[21] Treisman, A. and Gelade, G. A feature-integration theory of attention. Cognitive Psychology, 12, 1980, p. 97-136.
[22] Treisman, A. Search, similarity, and integration of features between and within dimensions. Journal of Experimental Psychology: Human Perception and Performance, 17, 1991, p. 652-676.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 Noureddine Bouhmala and Xing Cai: A Multilevel Approach for the Satisfiability Problem 1


A Multilevel Approach for the Satisfiability Problem

Noureddine Bouhmala(1) and Xing Cai(2,3)

(1) Vestfold University College, P.O. Box 2243, N-3103 Tønsberg, Norway
(2) Center for Biomedical Computing, Simula Research Laboratory, P.O. Box 134, N-1325 Lysaker, Norway
(3) Department of Informatics, University of Oslo, P.O. Box 1080 Blindern, N-0316 Oslo, Norway

Abstract—A large number of problems that occur in VLSI design, knowledge representation, automated learning, and other areas of artificial intelligence are essentially satisfiability problems. The satisfiability problem refers to the task of finding an assignment of variables that makes a Boolean expression true. The growing need for more efficient and scalable algorithms has led to the constant development of new SAT solvers. In this paper, we introduce a new algorithm combining the multilevel paradigm with the GSAT greedy algorithm for solving the satisfiability problem. We present a comparative analysis of the new algorithm's performance on a benchmark set containing both randomized and structured problems from various domains.

Keywords: SAT, GSAT, multilevel optimization, combinatorial optimization.

Corresponding author: N. Bouhmala (tlf.: +47 33031219, fax: +47 3303110, email: [email protected]).

I. INTRODUCTION

The satisfiability problem (SAT) is known to be NP-complete [4] and plays an important role in the fields of VLSI computer-aided design, computing theory, and artificial intelligence. Generally, a SAT problem is defined by a propositional formula Φ = C1 ∧ C2 ∧ ... ∧ Cm, which has m clauses over n Boolean variables. A Boolean variable can take one of the two values true or false. Each clause Cj has the form

    Cj = ( ∨_{k ∈ Ij} xk ) ∨ ( ∨_{k ∈ Īj} x̄k ),

where Ij and Īj denote two distinct sets of indices (i.e., Ij ∩ Īj = ∅). For a Boolean variable x, the symbol x̄ denotes its negation. The objective is to determine whether there exists an assignment under which Φ evaluates to true. Such an assignment, if it exists, is called a satisfying assignment for Φ, and Φ is called satisfiable; otherwise Φ is said to be unsatisfiable. Most SAT solvers use a conjunctive normal form (CNF) representation of the formula Φ. In CNF, the formula is a conjunction of clauses, each clause is a disjunction of literals, and a literal is a Boolean variable or its negation. For example, P ∨ Q is a clause containing

the two literals P and Q. The clause P ∨ Q is satisfied if either P is true or Q is true. When each clause in Φ contains exactly k literals, the restricted SAT problem is called k-SAT. In this paper, we focus on the 3-SAT problem, where each clause contains exactly 3 literals. Since there are two choices for each of the n Boolean variables, the size of the search space is |S| = 2^n. The paper is organized as follows. In Section 2 we review various algorithms for SAT problems. Section 3 explains the basic greedy GSAT algorithm. In Section 4, the multilevel paradigm is described. Section 5 explains the multilevel greedy algorithm. In Section 6, we look at the results from testing the new approach on a test suite of problem instances. Finally, in Section 7 we present some conclusions.

II. METHODS FOR SAT

SAT has been extensively studied due to its simplicity and applicability. The simplicity of the problem coupled with its intractability makes it an ideal platform for exploring new algorithmic techniques. This has led to the development of several meta-heuristics for solving SAT problems, which usually fall into two main categories: systematic algorithms and local search algorithms. Systematic search algorithms are guaranteed to return a solution to a problem if one exists, and to prove it insoluble otherwise. Owing to combinatorial explosion, large and complex SAT problems are hard to solve using systematic algorithms. One way to overcome the combinatorial explosion is to give up completeness; local search algorithms are techniques using this strategy. They typically start with an initial assignment of truth values to the variables, generated randomly or heuristically. Satisfiability can then be formulated as an iterative optimization problem in which the goal is to minimize the number of unsatisfied clauses. Thus, the optimum is obtained when the value of the objective function equals zero, which means that all clauses are satisfied.
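This objective function (the number of unsatisfied clauses) can be sketched in a few lines. The clause encoding (DIMACS-style signed integers) and all names below are illustrative assumptions, not the authors' implementation:

```python
# Sketch of the local-search objective: the number of unsatisfied clauses.
# A clause is a list of DIMACS-style signed integers (k means x_k, -k means
# its negation); this encoding is an illustrative assumption.

def num_unsatisfied(clauses, assignment):
    """assignment[k] holds the truth value of variable k (1-indexed)."""
    return sum(1 for clause in clauses
               if not any(assignment[abs(lit)] == (lit > 0) for lit in clause))

# Phi = (x1 v -x2) ^ (x2 v x3) ^ (-x1 v -x3)
phi = [[1, -2], [2, 3], [-1, -3]]
A = {1: True, 2: True, 3: False}
print(num_unsatisfied(phi, A))  # 0, so A is a satisfying assignment
```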
During each iteration, a new value assignment is selected from the "neighborhood" of the present one by performing a "move". Most local search algorithms use a 1-flip neighborhood relation, which means that two truth
value assignments are considered to be neighbors if they differ in the truth value of only one variable. Performing a move, then, consists of switching the present value assignment with one of the neighboring value assignments. The search terminates if no better neighboring assignment can be found. Note that choosing a fruitful neighborhood, and a method for searching it, is usually guided by intuition; theoretical results that can be used as guidance are sparse. One of the earliest local search algorithms for solving SAT is GSAT [29]. Basically, GSAT begins with a randomly generated assignment of values to the variables, and then uses the steepest descent heuristic to find the new variable-value assignment which best decreases the number of unsatisfied clauses. After a fixed number of moves, the search is restarted from a new random assignment. The search continues until a solution is found or a fixed number of restarts have been performed. An extension of GSAT, referred to as random walk [30], has been realized with the purpose of escaping from local optima. In a random walk step, an unsatisfied clause is selected at random. Then, one of the variables appearing in that clause is flipped, thus effectively forcing the selected clause to become satisfied. The main idea is to decide at each search step whether to perform a standard GSAT step or a random-walk step, with a probability called the walk probability. Another widely used variant of GSAT is the WalkSAT algorithm, originally introduced in [30]. It first picks an unsatisfied clause c at random and then, in a second step, randomly selects one of the variables with the lowest break count appearing in the selected clause. The break count of a variable is defined as the number of clauses that would become unsatisfied by flipping that variable.
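The break count just defined can be computed directly; the sketch below uses an assumed clause encoding (DIMACS-style signed integers) and is not taken from the WalkSAT source:

```python
# Sketch of WalkSAT's break count: the number of clauses a flip of
# variable v would newly make unsatisfied. The clause encoding
# (DIMACS-style signed integers) is an illustrative assumption.

def is_satisfied(clause, assignment):
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def break_count(v, clauses, assignment):
    flipped = dict(assignment)
    flipped[v] = not flipped[v]
    return sum(1 for c in clauses
               if is_satisfied(c, assignment) and not is_satisfied(c, flipped))

phi = [[1, 2], [-1, 3], [-1, -2]]
A = {1: True, 2: False, 3: False}
print(break_count(1, phi, A))  # 1: flipping x1 breaks only (x1 v x2)
```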
If there exists a variable with a break count equal to zero, this variable is flipped; otherwise a variable with minimal break count is selected with a certain probability (the noise probability). It turns out that the choice of unsatisfied clauses, combined with the randomness in the selection of variables, enables WalkSAT to avoid local minima and to better explore the search space. Extensive tests have led to the introduction of new variants of the WalkSAT algorithm, referred to as Novelty and R-Novelty [23]. These two variants use a combination of two criteria when choosing a variable to flip from within an unsatisfied clause. Quite often, these two algorithms can get stuck in local minima and fail to get out. To this end, recent variants have been designed [21][22][14] using a combination of search intensification and diversification mechanisms, leading to good performance on a wide range of SAT instances. Other algorithms [9][12][7][8] have used history-based variable selection strategies in order to avoid flipping the same variable. In parallel to the development of more sophisticated versions of randomized improvement techniques, other methods, based on the idea of modifying the evaluation function [37][15][33][27][28] in order to prevent the search from getting stuck in non-attractive areas of the

underlying search space, have become increasingly popular in SAT solving. The key idea is to associate the clauses of the given CNF formula with weights. Although these clause-weighting SLS algorithms differ in the way clause weights are updated (probabilistically or deterministically), they all choose to increase the weights of all the unsatisfied clauses as soon as a local minimum is encountered. A new approach to clause weighting, known as Divide and Distribute Fixed Weights (DDFW) [16], exploits the transfer of weights from neighboring satisfied clauses to unsatisfied clauses in order to break out from local minima. Recently, a strategy based on assigning weights to variables [25] instead of clauses has greatly enhanced the performance of the WalkSAT algorithm, leading to the best known results on some benchmarks. Lacking theoretical guidelines while being stochastic in nature, the deployment of several metaheuristics involves extensive experiments to find the optimal noise or walk probability settings. To avoid manual parameter tuning, new methods have been designed to automatically adapt parameter settings during the search [22][24], and results have shown their effectiveness for a wide range of problems. The work conducted in [10] introduced Learning Automata (LA) as a mechanism for enhancing local search based SAT solvers, thus laying the foundation for novel LA-based SAT solvers. Finally, a new approach has been proposed that uses an automatic procedure for integrating selected components from various existing solvers in order to build a new efficient algorithm that draws on the strengths of multiple algorithms [38][20].

III. THE GSAT GREEDY ALGORITHM

This section is devoted to explaining the GSAT greedy algorithm and one of its variants as they are embedded into the multilevel paradigm. Basically, the algorithm of GSAT, which is shown in Fig.
1, begins with a randomly generated assignment of values to variables, and then uses the steepest descent heuristic to find the new variable-value assignment which best decreases the number of unsatisfied clauses. After a fixed number of moves, the search is restarted from a new random assignment. The search continues until a solution is found or a fixed number of restarts has been performed. As with any combinatorial optimization, local minima or plateaus (i.e., a set of neighboring states each with an equal number of unsatisfied clauses) in the search space are problematic for greedy algorithms. A local minimum is defined as a state whose local neighborhood does not include a state that is strictly better. The introduction of an element of randomness (i.e., noise) into a local search method is a common practice to increase the success rate of GSAT and improve its effectiveness through diversification [2]. The algorithm of GSAT Random Walk, which is shown in Fig. 1, starts with a randomly chosen assignment. Thereafter, two possible criteria are used in order to select the


Procedure GSAT Random Walk
Input: set of clauses, MAX-TRIES, MAX-FLIPS, and walk probability p;
Output: a satisfying truth assignment, if found;
Begin
  For i := 1 To MAX-TRIES Do
    T ← a random truth assignment;
    For j := 1 To MAX-FLIPS Do
      If T satisfies the clauses Then return T;
      If rnd(0, 1) ≤ p Then
        PossFlips ← a random variable in some unsatisfied clause;
      Else
        PossFlips ← the variables giving the largest decrease in
                    unsatisfied clauses (ties broken at random);
      V ← Pick(PossFlips);
      T ← T with V flipped;
    EndFor
  EndFor
  return "no solution found"
End

Fig. 1. The GSAT Random Walk Algorithm.
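For concreteness, a minimal executable rendering of the procedure in Fig. 1 might look as follows; the clause representation (DIMACS-style signed integers), the helper names, and the deterministic tie-breaking in the greedy step are our own assumptions, not the authors' code:

```python
import random

# Executable sketch of the GSAT Random Walk procedure of Fig. 1.
# Clause encoding, helper names and tie-breaking are assumptions.

def unsat_clauses(clauses, a):
    """Return the clauses not satisfied by assignment a."""
    return [c for c in clauses
            if not any(a[abs(l)] == (l > 0) for l in c)]

def gsat_random_walk(clauses, n_vars, max_tries=10, max_flips=100,
                     p=0.4, rng=random.Random(0)):
    for _ in range(max_tries):                        # MAX-TRIES restarts
        a = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
        for _ in range(max_flips):                    # MAX-FLIPS moves
            unsat = unsat_clauses(clauses, a)
            if not unsat:
                return a                              # T satisfies the clauses
            if rng.random() <= p:                     # walk step
                v = abs(rng.choice(rng.choice(unsat)))
            else:                                     # greedy step
                def unsat_after_flip(w):
                    a[w] = not a[w]
                    count = len(unsat_clauses(clauses, a))
                    a[w] = not a[w]                   # undo the trial flip
                    return count
                v = min(range(1, n_vars + 1), key=unsat_after_flip)
            a[v] = not a[v]
    return None                                       # "no solution found"

# (x1 v -x2)(x2 v x3)(-x1 v -x3)(-x2 v -x3) is satisfiable, e.g. x1=T, x2=T, x3=F
phi = [[1, -2], [2, 3], [-1, -3], [-2, -3]]
sol = gsat_random_walk(phi, 3)
print(sol is not None and not unsat_clauses(phi, sol))
```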

variable to be flipped. The first criterion uses the notion of a "noise" or walk-step probability to select a currently unsatisfied clause at random and flip one of the variables appearing in it, also chosen at random. At each walk-step, at least one unsatisfied clause becomes satisfied. The other criterion uses a greedy search to choose a random variable from PossFlips, the set of variables that, when flipped, achieve the largest decrease (or the least increase) in the total number of unsatisfied clauses. The walk-step strategy may lead to an increase in the total number of unsatisfied clauses even when improving flips would have been possible. In consequence, the chances of escaping from local minima of the objective function are better than with the basic GSAT [31].

IV. THE MULTILEVEL PARADIGM

The multilevel paradigm is a simple technique which, at its core, involves recursive coarsening to produce smaller and smaller problems that are easier to solve than the original one. Figure 2 shows the generic multilevel paradigm in pseudo-code. The multilevel paradigm consists of three phases: coarsening, initial solution, and multilevel refinement. During the coarsening phase, a series of smaller problems is constructed by matching pairs of vertices of the original input problem in order to form clusters. A coarser problem is defined using the clusters, and the coarsening procedure is recursively iterated until a sufficiently small problem is obtained. Computation of an

initial solution is performed on the coarsest level (smallest problem). Finally, the solution found at each level is projected to give an initial solution for the next level and then refined using a chosen local search algorithm. A common feature that characterizes multilevel algorithms is that any solution of any of the coarsened problems is a legitimate solution to the original problem. This is always true as long as the coarsening is achieved in such a way that each of the coarsened problems retains the original problem's global structure. The efficiency of multilevel techniques stems from two main advantages of the paradigm, which enable local search techniques to become much more powerful in the multilevel context. First, by allowing local search schemes to view a cluster of vertices as a single entity, the search becomes restricted to only those configurations in the solution space in which the vertices grouped within a cluster are assigned the same label. During the refinement phase, a local refinement scheme applies a local transformation within the neighborhood (i.e., the set of solutions that can be reached from the current one) of the current solution to generate a new one. As the size of the clusters varies from one level to another, the size of the neighborhood becomes adaptive and allows the possibility of exploring different regions of the search space. Second, the ability of a refinement algorithm to manipulate clusters of vertices provides the search with an efficient mechanism to escape from local minima. Multilevel techniques were first introduced to deal with the graph partitioning problem (GCP) [1] [11] [13] [18] [19] [34] and have proved to be effective in producing high quality solutions at a lower cost than single level techniques.
The traveling salesman problem (TSP) was the second combinatorial optimization problem to which the multilevel paradigm was applied [35][36], showing a clear improvement in the asymptotic convergence of the solution quality. Finally, when the multilevel paradigm was applied to the vehicle routing problem [26], the results were in line with the general trend observed for GCP and TSP: the paradigm enhanced the convergence behaviour of the local search algorithms beyond the best known results.

V. THE MULTILEVEL FRAMEWORK FOR SAT

• Coarsening: The original problem P0 is reduced to a sequence of smaller problems P1, P2, ..., Pm such that |P0| > |P1| > |P2| > ... > |Pm|. It will require at least O(log(N/N')) coarsening phases to coarsen a problem with N variables down to N' variables. Let V_i^c denote the set of variables of Pi that are combined to form the cluster c of Pi+1. During the coarsening phase, a sequence of smaller problems, each with fewer variables, is constructed. Let


The Multilevel Paradigm
Input: problem P0
Output: solution Sfinal(P0)
Begin
  level := 0
  /* Coarsening phase */
  While (Not reached the desired number of levels)
    Plevel+1 := Coarsen(Plevel)
    level := level + 1
  End
  /* Find an initial solution at the coarsest level */
  S(Plevel) := Initial Solution(Plevel)
  /* Uncoarsening and refinement phase */
  While (level > 0)
    Sstart(Plevel-1) := Uncoarsen(Sfinal(Plevel))
    Sfinal(Plevel-1) := Refine(Sstart(Plevel-1))
    level := level - 1
  End
End

Fig. 2. The Multilevel Generic Algorithm.
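The generic loop of Fig. 2 translates almost line by line into code. In this sketch the four phase operations (Coarsen, Initial Solution, Uncoarsen, Refine) are caller-supplied placeholders, since they are problem specific; this rendering is ours, not the authors' implementation:

```python
# Direct rendering of the generic multilevel loop of Fig. 2. The four
# phase operations are caller-supplied placeholders (an illustrative
# assumption, not the authors' code).

def multilevel(problem, levels, coarsen, initial_solution, uncoarsen, refine):
    hierarchy = [problem]
    for _ in range(levels):                           # coarsening phase
        hierarchy.append(coarsen(hierarchy[-1]))
    solution = initial_solution(hierarchy[-1])        # coarsest level
    for level in range(levels, 0, -1):                # uncoarsening and refinement
        projected = uncoarsen(hierarchy[level], solution)
        solution = refine(hierarchy[level - 1], projected)
    return solution

# Smoke test with trivial hooks: "coarsen" halves a list, the rest are identities.
out = multilevel([1, 2, 3, 4], 2,
                 coarsen=lambda p: p[:max(1, len(p) // 2)],
                 initial_solution=list,
                 uncoarsen=lambda p, s: s,
                 refine=lambda p, s: s)
print(out)  # [1]
```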

Procedure Refine
Input: Pi, Ai, MAX-FLIPS;
Output: a satisfying truth assignment, if found;
Begin
  For j := 1 To MAX-FLIPS Do
    If Ai satisfies the clauses Then return Ai;
    PossFlips ← a randomly selected cluster with the largest decrease
                in unsatisfied clauses;
    V ← Pick(PossFlips);
    Ai ← Ai with V flipped;
  EndFor
End

Fig. 3. The GSAT Refinement Algorithm.

P0 denotes the original problem. The next coarser problem P1 is constructed from P0 by collapsing pairs of variables into clusters. The variables are visited in random order. If a variable has not been matched yet, we randomly select another unmatched variable, and a cluster consisting of these two variables is created. Unmatched variables are simply copied to the next level. The newly formed clusters are used to define a new and smaller problem, and the coarsening process is recursively iterated until the size of the problem reaches some desired threshold.
• Initial solution: An initial assignment Am of Pm is easily computed using a random assignment algorithm, which works by randomly assigning to each cluster of the coarsest problem Pm the value true or false.
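The random pairwise matching described above, together with the projection of a coarse assignment back to the member variables, can be sketched as follows; the cluster-to-variables bookkeeping is an assumed representation, not taken from the authors' code:

```python
import random

# Sketch of coarsening by random pairwise matching and of the
# projection rule A_k[u] = A_{k+1}[c]. The cluster-to-variables map
# is an assumed bookkeeping structure.

def coarsen(variables, rng=random.Random(1)):
    order = list(variables)
    rng.shuffle(order)                       # visit variables in random order
    clusters, unmatched = {}, order
    cid = 0
    while unmatched:
        v = unmatched.pop()
        if unmatched:                        # pair v with a random partner
            u = unmatched.pop(rng.randrange(len(unmatched)))
            clusters[cid] = [v, u]
        else:                                # odd variable copied alone
            clusters[cid] = [v]
        cid += 1
    return clusters

def project(clusters, coarse_assignment):
    """Every member variable inherits the value of its cluster."""
    return {v: coarse_assignment[c] for c, vs in clusters.items() for v in vs}

cl = coarsen([1, 2, 3, 4, 5])
A1 = {c: True for c in cl}                   # a coarse-level assignment
A0 = project(cl, A1)
print(sorted(A0) == [1, 2, 3, 4, 5])  # True: every variable got a value
```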





• Projection: Optimization based on the multilevel paradigm attempts to solve the problem through a hierarchy of optimization levels, where the final solution is subject to the outcome of each of the enclosed levels in order. Having optimized the assignment Ak+1 on Pk+1, the assignment must be projected onto its parent Pk. Since each child cluster c of Pk+1 is the result of merging at most two parent clusters from Pk, obtaining Ak from Ak+1 is done by simply assigning to every parent cluster at level k the same value as its child cluster at level k+1 (i.e., Ak[u] = Ak+1[c] for every parent cluster u in V_k^c).
• Refinement: At each level, the assignment from the previous level is projected back to give an initial assignment, which is then further refined. Even though Ak+1 is a local minimum of Pk+1, the projected assignment Ak may not be at a local optimum with respect to Pk. Since Pk is finer, it may still be possible to improve the projected assignment using a version of GSAT adapted to the multilevel paradigm. The idea of GSAT refinement, as shown in Fig. 3, is to use the assignment projected from Pk+1 onto Pk as the initial assignment of GSAT. Since the projected assignment is already a good one, GSAT converges more quickly to a better assignment. At each level, GSAT is allowed to perform MAX-FLIPS iterations before moving to the next finer level. If a solution is not found at the finest level, a new round of coarsening, random initial assignment, and refinement is performed.

VI. EXPERIMENTAL RESULTS

A. Benchmark Instances

To illustrate the potential gains of the multilevel greedy algorithm, we selected a benchmark suite covering different domains, including benchmark instances of the International Competition and Symposium on Satisfiability Testing, held in 1996 in Beijing. These instances are by no means intended to be exhaustive, but rather give an indication of typical performance behavior. All these benchmark instances are known to be hard to solve and are available from the SATLIB website (http://www.informatik.tu-darmstadt.de/AI/SATLIB). All the instances are satisfiable and have been used widely in the literature in order to give an overall picture of the performance of different algorithms. Due to the randomization of the algorithm, the time required for solving a problem instance differs between runs. Therefore, for each problem instance, we ran GSAT and MLVGSAT 100 times with a cutoff parameter (the maximum allowed solving time) set at 300 seconds. All the plots are given in logarithmic scale, showing the evolution of the solution quality based on averaged

results over 100 runs. The test problems include Random-3-SAT, SAT-encoded Graph Coloring Problems, SAT-encoded Logistics Problems, SAT-encoded Block World Planning Problems, and SAT-encoded Quasigroup Problems.

Fig. 4. Random: (Top) Evolution of the best solution on a 600 variable problem with 2550 clauses (f600.cnf). Along the horizontal axis we give the time in seconds, and along the vertical axis the number of unsatisfied clauses. (Bottom) Evolution of the best solution on a 1000 variable problem with 4250 clauses (f1000.cnf).

Fig. 5. Random: Evolution of the best solution on a 2000 variable problem with 8500 clauses (f2000.cnf).

Fig. 6. SAT-encoded graph coloring: (Top) Evolution of the best solution on a 300 variable problem with 1117 clauses (flat100.cnf). (Bottom) Evolution of the best solution on a 2125 variable problem with 66272 clauses (g125-17.cnf).

Fig. 7. SAT-encoded graph coloring: Evolution of the best solution on a 2250 variable problem with 70163 clauses (g125-18.cnf).

Fig. 8. SAT-encoded block world: (Top) Evolution of the best solution on a 116 variable problem with 953 clauses (medium.cnf). (Bottom) Evolution of the best solution on a 459 variable problem with 7054 clauses (huge.cnf).

Fig. 9. SAT-encoded block world: Evolution of the best solution on a 1087 variable problem with 13772 clauses (bw-largeb.cnf).

Fig. 10. SAT-encoded logistics: (Top) Evolution of the best solution on a 843 variable problem with 7301 clauses (logisticsb.cnf). (Bottom) Evolution of the best solution on a 1141 variable problem with 10719 clauses (logisticsc.cnf).

Fig. 11. SAT-encoded logistics: Evolution of the best solution on a 4713 variable problem with 21991 clauses (logisticsd.cnf).

Fig. 12. SAT-encoded quasigroup: (Top) Evolution of the best solution on a 129 variable problem with 21844 clauses (qg6-9.cnf). (Bottom) Evolution of the best solution on a 729 variable problem with 28540 clauses (qg5.cnf).

Fig. 13. SAT-encoded quasigroup: Evolution of the best solution on a 512 variable problem with 148957 clauses (qg1-8.cnf).

Fig. 14. SAT competition Beijing: (Top) Evolution of the best solution on a 410 variable problem with 24758 clauses (4blockb.cnf). (Bottom) Evolution of the best solution on a 8432 variable problem with 31310 clauses (3bitadd-31.cnf).

Fig. 15. SAT competition Beijing: (Top) Evolution of the best solution on a 8704 variable problem with 32316 clauses (3bitadd32.cnf). (Bottom) Evolution of the best solution on a 758 variable problem with 47820 clauses (4blocks.cnf).

B. Experimental Results

Figures 4-15 show individual results, which appear to follow the same pattern within each domain. Overall, at least for the instances tested here, we observe that the search proceeds in two phases. The first phase corresponds to the early part of the search, where both algorithms behave as a hill-climbing method. This phase can be described as a fast-working one, with a large number of the clauses being satisfied. The best assignment climbs rapidly at first and then flattens off as we mount the plateau, marking the start of the second phase. The plateau spans a region of the search space where flips typically leave the best assignment unchanged. The long plateau becomes even more pronounced as the number of flips increases, and occurs specifically when trying to satisfy the last few remaining clauses. The transition to the plateau corresponds to a change to a region where a small number of flips gradually improves the score of the current solution, ending with an improvement of the best assignment. The plateau is rather short with MLVGSAT compared with that of GSAT. The solution projected from one level to the next finer one offers an elegant mechanism to reduce the length of the plateau, as it provides more degrees of freedom that can be used for further improving the best solution. The plots show a time overhead for MLVGSAT, especially for large problems, due mainly to the setting up of data structures at each level. We believe that this initial overhead, which is a common feature of multilevel implementations, can be considerably reduced by a more efficient implementation.

Comparing GSAT with MLVGSAT on small problems (up to 1500 clauses), as shown in Figs. 6-8, we see that both algorithms seem to reach optimal quality solutions. It is not immediately clear which of the two algorithms converges more rapidly; this is probably highly dependent on the choice of the instances in the test suite. For example, the run time required by MLVGSAT for solving instance flat100-239 is more than 12 times higher than the mean run time of GSAT (25 sec vs 2 sec). The situation is reversed when solving the instance block-medium (20 sec vs 70 sec). The difference in the convergence behavior of the two algorithms starts to become more distinctive as the size of the problem increases. All the plots show a clear dominance of MLVGSAT over GSAT throughout the whole run. MLVGSAT shows a better asymptotic convergence, coming to within around 0.008%-0.1% of the optimal solution, whereas GSAT only reaches around 0.01%-11%. The performance of MLVGSAT thus surpasses that of GSAT; although a few of the curves overlay each other closely, MLVGSAT has a marginally better asymptotic convergence.

The quality of the solution may vary significantly from run to run on the same problem instance, due to random initial solutions and subsequent randomized decisions. We have therefore chosen the Wilcoxon rank test to check the level of statistical confidence in the differences between the mean percentage excess deviations from the solution of the two algorithms. The test requires that the absolute values of the differences between the mean percentage excess deviations of the two algorithms are sorted from smallest to largest, and these differences are ranked according to their absolute magnitude. The sums of the ranks are then formed for the negative and positive differences separately. As the size of the trials increases, the rank sum statistic becomes normal. If the null hypothesis is true, the sum of the ranks of the positive differences should be about the same as the sum of the ranks of the negative differences. Using the two-tailed P value, a significant performance difference is granted if the Wilcoxon test is significant for P < 0.05.

TABLE I
WILCOXON STATISTICAL TEST.

Problem       MLVGSAT %EX   GSAT %EX   Null Hypothesis
f600          0.001         0.004      Accept
f1000         0.002         0.009      Accept
f2000         0.002         0.01       Accept
flat100       0             0          Reject
g125-17       0.0002        0.001      Accept
g125-18       0.001         0.11       Accept
logistic-b    0.00008       0.0001     Accept
logistic-c    0.001         0.004      Accept
logistic-d    0.005         0.08       Accept
bw-medium     0             0          Reject
bw-huge       0.0007        0.001      Accept
bw-large-b    0.0002        0.001      Accept
qg6-9         0.001         0.003      Accept
qg5-9         0.001         0.002      Accept
qg1-8         0.0004        0.011      Accept
4block        0.00008       0.0008     Accept
3bitadd-31    0.0008        0.006      Accept
3bitadd-32    0.0007        0.004      Accept
4blocksb      0.00008       0.0002     Accept

Looking at Table I, we observe that the difference in the mean excess deviation from the solution is significant for large problems and remains insignificant for small problems.

VII. CONCLUSIONS

In this paper, we have described and tested a new approach for solving the satisfiability problem based on combining the multilevel paradigm with the GSAT greedy algorithm. The resulting algorithm progressively coarsens the problem, provides an initial assignment at the coarsest level, and then iteratively refines it backward, level by level. In order to get a comprehensive picture of the new algorithm's performance, we used a suite of SAT-encoded problems from various domains. By analyzing the results, we observed that, for the same computational time, MLVGSAT provides better quality solutions than GSAT. The broad conclusion that we may draw from the results is that the multilevel paradigm can speed up GSAT and even improve its asymptotic convergence. Our results indicated that the larger the instance, the higher the difference between the mean percentage excess deviations from the solution. An obvious subject for further work would be the use of more efficient data structures in order to minimize the

overhead during the coarsening and uncoarsening phases. It would be of great interest to further validate or contradict the conclusions of this work by extending the range of problem classes. Finally, other subjects for further work include combining the multilevel paradigm with state-of-the-art heuristics and developing new coarsening strategies.

ACKNOWLEDGMENT

We thank the following students at the University College of Vestfold for implementing the computer codes used in this paper: Lars Kristian Bremnes, Roy Endré Dahl, and Kristian Korsmo.


ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 Noureddine Bouhmala and Xing Cai: A Multilevel Approach for the Satisfiability Problem



ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 J. C. Chedjou et.al.: Solving Stiff Ordinary Differential Equations and Partial Differential Equations Using Analog Computing Based on Cellular Neural Networks

Solving Stiff Ordinary Differential Equations and Partial Differential Equations Using Analog Computing Based on Cellular Neural Networks

J. C. Chedjou1, K. Kyamakya1, M. A. Latif1, U. A. Khan1, I. Moussa2 and Do Trong Tuan3

Abstract— Setting up analog cellular computers based on cellular neural network systems (CNNs) to change the way analog signals are processed is a revolutionary idea, and a proof as well of the high importance devoted to analog simulation methods. We provide an in-depth description of the concept exploiting analog computing based on the CNN paradigm to solve nonlinear and highly stiff ordinary differential equations (ODEs) and partial differential equations (PDEs). We apply our method to the analysis of the dynamics of two systems modeled by complex and stiff equations. The first system consists of three coupled Rössler oscillators in a Master-Slave-Auxiliary configuration. The capability of this coupled system to exhibit regular and chaotic dynamics has been demonstrated so far. The synchronization modes of the coupled system can be exploited in chaotic secure communications. The second system is the Burgers' equation, which is a well-known classical model for analyzing macroscopic traffic flow motions/scenarios. As a proof of concept of the proposed approach, the results obtained in this paper are compared with the results available in the relevant literature (benchmarking), and the proposed concept is validated by the very good agreement obtained. Computation based on the CNN paradigm is advantageous as it provides accurate and ultra-fast solutions of very complex ODEs and PDEs and performs real-time computing.

Keywords— ODE, PDE, Cellular neural network, Field-programmable gate array (FPGA), templates, discretization, coupling, stiffness.

I. INTRODUCTION

During the last sixty years, the theory of digital computation has been based on the concept of the deterministic Turing machine [13]. However, despite the great success of digital computers, the existence of so-called hard/complex problems has revealed some weaknesses or limitations in their computing capabilities. Indeed, digital computers are exposed to transient phenomena, stiffness, accumulation of round-off errors, and divergences or floating-point overflows [7-11] when dealing with the analysis of complex problems/scenarios, which in essence are generally modeled by nonlinear ODEs and/or PDEs in the spatio-temporal domain. Furthermore, solving these complex and stiff equations with digital computers is very time consuming [14].

1. The authors are with the Transportation Informatics Group, Institute of Smart System Technologies, University of Klagenfurt (Austria), phone: +(43)463 2700 3550, fax: +(43)463 2700 3698, email: [email protected], [email protected]
2. The author is with the University of Yaoundé 1 (Cameroon), email: [email protected]
3. The author is with the Faculty of Electronics and Telecommunications, Hanoi University of Technology (Vietnam), email: [email protected]

Analog computers can be considered an alternative means of analyzing the dynamics of complex and stiff systems, since these computers are not exposed to the problems faced by digital computers when simulating complex models (ODEs and/or PDEs). Despite some limitations of analog computers, e.g. lack of universality, the requirement of highly precise electronic components, and the limitation of their dynamic range due to their bias (power supply), the analog method has a great potential to efficiently analyze complex and stiff models and provide ultra-fast and accurate solutions. In particular, an implementation of analog computers on reprogrammable hardware (e.g. FPGA) can be envisaged to improve and optimize the analog computing process. It is worth mentioning that a challenging and up-to-date research question concerns the development of efficient, accurate and ultra-fast (high-speed) computing platforms to solve complex and stiff ODEs and/or PDEs, which are the equations modeling the dynamics/motion of real-life engineering systems, scenarios or events [14]. Concerning the methods/approaches for solving stiff PDEs, some interesting works have been carried out. Ref. [15] presents a learning method based on CNN, i.e. a modified online back-propagation (BP) algorithm. Ref. [16] deals with the implementation of the CNN paradigm on very large scale integrated circuits (VLSI) to solve PDEs. The authors of Ref. [17] consider the discrete CNN paradigm to solve PDEs, with applications to image multi-scale analysis. Ref. [18] presents an emulation of the CNN paradigm on digital platforms to solve PDEs; the idea of fixed-point arithmetic is introduced and exploited to decrease the computing precision, leading to increased computing speed. Ref. [19] focuses on the CNN-based analog computing paradigm to solve PDEs. The paradigm is shown to be flexible in setting boundary conditions and in selecting discretization methodologies as well. Ref. [20] discusses various possibilities of mapping CNN models into PDEs in order to approximate their solutions numerically; it is shown that the mapping process is not a general method, as it is limited to (i.e. only valid for) a specific class of PDEs. The state of the art presents the CNN paradigm as an attractive alternative to conventional numerical computation methods [1-2, 15-20]. Indeed, it has been intensively shown that CNN is an analog computing paradigm which performs ultra-fast calculations and provides accurate results [1, 2]. Interestingly, a speed-up of the analog computing process is possible by an implementation on reprogrammable hardware (i.e. FPGA).



This paper considers the concept of analog computing based on the CNN paradigm to solve complex nonlinear and stiff equations (ODEs and/or PDEs). We provide an in-depth presentation of our concept and apply it to two well-known and stiff nonlinear differential equations, namely the system consisting of three coupled Rössler equations in a Master-Slave-Auxiliary configuration (coupled ODEs) and the Burgers' equation. The first coupled model has been shown to be appropriate for exhibiting regular and chaotic waves which, in their synchronization regime, can be used in secure communications [10]; the second model (Burgers') is the traditional model for describing macroscopic traffic flow scenarios. We explain and show the possibility of deriving appropriate CNN templates to solve complex ODEs and/or PDEs. Using our approach, these equations are mapped to a CNN array in order to facilitate template calculation. Complex/stiff PDEs, on the other hand, are transformed into ODEs having array structures. This transformation is achieved by applying the finite difference method, which is based on the Taylor series expansion. The structure of the paper is as follows. Section 2 provides a brief overview of the CNN paradigm. Section 3 explains the approach to solving complex/stiff ODEs with the CNN paradigm; an application example is provided based on solving coupled Rössler equations with CNN. Section 4 applies the same approach to solve a nonlinear and stiff PDE (Burgers') with CNN. In Section 5, the benchmarking of the proposed concept is presented by comparing our results with those provided in the relevant literature. Section 6 deals with conclusions and outlooks.

II. THE CNN PARADIGM/CONCEPT

The concept of CNN was introduced by Leon O. Chua and Yang [1]. The cell, which is the fundamental building block of the CNN processor, is a lumped circuit containing both linear and nonlinear elements (Fig. 1a). Fig. 1a presents the CNN processor as an array of cells, each characterized by an input, a state and an output. This CNN processor is built using identical analog processing elements called cells [1-2]. These cells can be arranged in a k-dimensional square grid, which is the most commonly used CNN type amongst many others, namely the spherical CNN [4] and the star CNN [3], just to name a few. In these types of CNN, cells are locally connected (i.e. each cell is connected to its neighborhood) via programmable weights called templates. These templates are changed to make the CNN cell array programmable.

Fig. 1a: (a) The basic CNN cell defined as a nonlinear first order circuit, (b) the PWL sigmoid function, and (c) a CNN architecture composed of a two-dimensional array of M×N cells arranged in a

Hence, the essence of the technology based on the CNN paradigm lies in the templates. These templates are sets of matrices which are space-invariant (i.e. cloning templates) if the values of the templates do not depend on the position of the cell [1, 2]. This case is particularly interesting for materializing spatial discretization. A complete (or full) specification of the dynamics of a CNN cell array requires the definition of Dirichlet (or boundary) conditions. The CNN computing platform developed in this paper exploits the structure of the state-controlled CNN (SC-CNN) [1, 2], which is modeled by (1).

$$\frac{dx_i}{dt} = -x_i + \sum_{j=1}^{M}\left[\hat{A}_{ij}\, x_j + A_{ij}\, y_j + B_{ij}\, u_j\right] + I_i \qquad (1)$$

The coefficients $\hat{A}_{ij}$, $A_{ij}$ and $B_{ij}$ are the self-feedback template, feedback template and control template, respectively. The schematic representation of a state-controlled CNN cell coupled to (M-1) neighboring cells is shown in Fig. 1b. $I_i$ is the bias value and $y_i$ is the nonlinear output sigmoid function of each cell. $u_j$ denotes the input value and $x_i$ represents the state of each cell.
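As an illustration of the SC-CNN state equation (1), the following Python sketch integrates a small cell array with an explicit Euler scheme and the standard piecewise-linear CNN output function. The network size, template matrices and integration constants here are arbitrary toy values for demonstration, not templates derived in this paper.

```python
import numpy as np

def pwl_sigmoid(x):
    # Standard CNN piecewise-linear output: y = 0.5*(|x+1| - |x-1|)
    return 0.5 * (np.abs(x + 1.0) - np.abs(x - 1.0))

def sc_cnn_step(x, u, A_hat, A, B, I, dt):
    """One explicit-Euler step of the SC-CNN state equation (1):
    dx_i/dt = -x_i + sum_j [A_hat_ij x_j + A_ij y_j + B_ij u_j] + I_i."""
    y = pwl_sigmoid(x)
    dxdt = -x + A_hat @ x + A @ y + B @ u + I
    return x + dt * dxdt

# Toy 3-cell network with illustrative (not paper-derived) templates
rng = np.random.default_rng(0)
M = 3
A_hat = 0.1 * rng.standard_normal((M, M))  # weak state coupling
A = 1.5 * np.eye(M)                        # self-feedback through the output
B = np.zeros((M, M))                       # no input template
I = np.zeros(M)                            # zero bias
x = rng.standard_normal(M)                 # random initial state
u = np.zeros(M)
for _ in range(1000):
    x = sc_cnn_step(x, u, A_hat, A, B, I, dt=0.01)
print(x)
```

With a self-feedback gain above 1, each cell settles into a saturated state of the PWL output, which is the bistable behavior the CNN literature exploits.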

Fig. 1b: SIMULINK graphical representation of the basic model of a state-controlled CNN cell (SC-CNN) coupled to (M-1) neighbors.

III. SOLVING COMPLEX AND STIFF ODEs WITH CNN

A. Principle for finding CNN templates

According to the general theory in nonlinear dynamics based on the linearization of the vector field [12], complex and stiff ODEs can be described by a unique vector field in a bounded region of $\mathbb{R}^n$ which is the solution of (2):

$$\frac{dx}{dt} = A(x)\,[x - F(x)] \qquad (2)$$

where $A(x)$ is an $n \times n$ matrix function of $x$, $F$ being a mapping of $\mathbb{R}^n$ to itself. It is worth mentioning that (2) can be transformed into the classical model of the Chua equation, in which the components of the matrix $A$ are all linear and $F$ is a piecewise-linear function of the variable $x$. Our approach in this paper transforms complex ODEs into the form described in (2) in order to make them solvable by the CNN paradigm, since it is well known that (2) can easily be mapped into the form of the CNN model in (1). Therefore, a good identification of the equations leads to determining the appropriate CNN templates to solve the complex and stiff ODEs. We consider the case of a system consisting of three identical oscillators of the Rössler type coupled in a Master-Slave-Auxiliary configuration. The master $(x_1, y_1, z_1)$ + slave $(x_2, y_2, z_2)$ + auxiliary $(x_3, y_3, z_3)$ system under



investigation is modeled by the following set of complex and stiff coupled ODEs:

$$\frac{dx_{1,2,3}}{dt} = -\omega_{1,2,3}\, y_{1,2,3} - z_{1,2,3} + \varepsilon_{1,2,3}\left(x_{2,1,1} + x_{3,3,2} - x_{1,2,3}\right) \qquad (3a)$$

$$\frac{dy_{1,2,3}}{dt} = \omega_{1,2,3}\, x_{1,2,3} + a_{1,2,3}\, y_{1,2,3} \qquad (3b)$$

$$\frac{dz_{1,2,3}}{dt} = f_{1,2,3} + z_{1,2,3}\left(x_{1,2,3} - U_{1,2,3}\right) \qquad (3c)$$

where $\omega_i$ are the natural frequencies of the oscillators, $\varepsilon_i$ are the elastic coupling coefficients (couplings through solutions), and $a_i$, $f_i$ and $U_i$ are the system parameters. This coupled system can be assimilated to a sequential system (i.e. a system whose output depends on both the inputs and the previous state of the output). We now provide an in-depth explanation of our approach for solving complex and stiff ODEs with the CNN paradigm. Therefore we consider (3), which is a good prototype of complex and stiff ODEs. We first transform (3) into the form

$$\frac{d}{dt}\begin{bmatrix} x_{1,2,3} \\ y_{1,2,3} \\ z_{1,2,3} \end{bmatrix} = \begin{bmatrix} -\varepsilon_{1,2,3} & -\omega_{1,2,3} & -1 \\ +\omega_{1,2,3} & +a_{1,2,3} & 0 \\ 0 & 0 & -U_{1,2,3} \end{bmatrix} \begin{bmatrix} x_{1,2,3} \\ y_{1,2,3} \\ z_{1,2,3} \end{bmatrix} + \begin{bmatrix} \varepsilon_{1,2,3}\left(x_{2,1,1} + x_{3,3,2}\right) \\ 0 \\ f_{1,2,3} + x_{1,2,3}\, z_{1,2,3} \end{bmatrix} \qquad (4)$$

From (4) one can show the existence of fixed points through (5):

$$\frac{d}{dt}\begin{bmatrix} x_{1,2,3} \\ y_{1,2,3} \\ z_{1,2,3} \end{bmatrix} = 0 \qquad (5)$$

Under a good specification of the parameter settings (coefficients) of (4), one can evaluate the fixed/equilibrium points as follows:

Master system fixed point: $X_1 = \begin{bmatrix} x_{01} & y_{01} & z_{01} \end{bmatrix}^T \qquad (6a)$

Slave system fixed point: $X_2 = \begin{bmatrix} x_{02} & y_{02} & z_{02} \end{bmatrix}^T \qquad (6b)$

Auxiliary system fixed point: $X_3 = \begin{bmatrix} x_{03} & y_{03} & z_{03} \end{bmatrix}^T \qquad (6c)$

It is worth mentioning that the aim is to linearize the vector field around the fixed points. This linearization around a non-zero equilibrium/fixed point provides the possibility of modifying the nonlinear part of the coupled system without changing the qualitative dynamics of the system. This statement can be materialized by $A_{X_{1,2,3}} \rightarrow A_{\bar{X}_{1,2,3}}$. Therefore, (2) can be considered to evaluate the linear part of the vector field at the fixed points. This linear part is represented or materialized by the $3 \times 3$ matrices defined as follows:

$$A_{master} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} \qquad (7a)$$

$$A_{slave} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \\ b_{31} & b_{32} & b_{33} \end{bmatrix} \qquad (7b)$$

$$A_{auxiliary} = \begin{bmatrix} c_{11} & c_{12} & c_{13} \\ c_{21} & c_{22} & c_{23} \\ c_{31} & c_{32} & c_{33} \end{bmatrix} \qquad (7c)$$

from which the corresponding CNN templates are derived under precise values of the coefficients of the model in (3).
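To make the linearization step concrete, the sketch below numerically evaluates the Jacobian of the master Rössler vector field from (3) with the coupling dropped ($\varepsilon = 0$); its entries play the role of the $a_{ij}$ in (7a). The finite-difference helper and the evaluation point are our own illustrative choices; the parameter values are those listed in Section III-C.

```python
import numpy as np

# Master Rössler vector field from (3) with the coupling terms dropped
# (epsilon = 0); omega, a, f, U taken from Section III-C of the paper.
OMEGA, A_PAR, F_PAR, U_PAR = 0.9700, 0.2650, 1.1500, 4.1596

def roessler(v):
    x, y, z = v
    return np.array([
        -OMEGA * y - z,
        OMEGA * x + A_PAR * y,
        F_PAR + z * (x - U_PAR),
    ])

def jacobian(func, v, eps=1e-6):
    """Numerical Jacobian by central differences; its entries correspond
    to the a_ij of the linearization matrix in (7a)."""
    n = len(v)
    J = np.zeros((n, n))
    for j in range(n):
        dv = np.zeros(n)
        dv[j] = eps
        J[:, j] = (func(v + dv) - func(v - dv)) / (2 * eps)
    return J

v0 = np.array([0.0, 0.0, 0.0])   # illustrative evaluation point
print(jacobian(roessler, v0))
```

At the origin the Jacobian reduces to [[0, -ω, -1], [ω, a, 0], [0, 0, -U]], which is exactly the linear matrix appearing in (4).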

B. Design of the complete CNN processor

We now want to design a CNN computing platform to investigate the issues of synchronization in the master-slave-auxiliary system modeled by (3). Full insight into synchronization issues can be achieved if it is possible to perform computations over wide ranges of the values of the system parameters in (3), even in cases where some of these values make the system experience stiffness. The efficiency of the calculations using CNN makes it a good candidate for performing computations in cases of high stiffness, and therefore an appropriate tool to tackle the difficulties faced by the classical numerical approach when dealing with the computation of the model in (3). Using the structure of the elementary/basic CNN cell shown in Fig. 1b, we have designed the complete CNN processor to solve (3). We found that a total of nine basic CNN cells is needed to implement the complete CNN processor on top of SIMULINK to solve (3). Fig. 2a shows the structure of the three basic CNN cells needed to implement the master system. The same structure is needed



to implement both the slave and auxiliary systems, due to the symmetric nature of the coupled systems under investigation.

Fig. 2a: Schematic representation of the 3 CNN cells needed to implement each subsystem (i.e. master, slave or auxiliary) on top of SIMULINK.

Fig. 2b shows the implementation on top of SIMULINK of the complete CNN processor to solve (3). The complete circuit in Fig. 2b is made up of three layers with a total of nine coupled CNN cells (yellow blocks), whereby each layer uses three coupled CNN cells and represents one Rössler oscillator (i.e. the Master, Slave, or Auxiliary subsystem).

Fig. 2b: Schematic representation of the complete CNN processor (master-slave-auxiliary) implemented on top of SIMULINK.

C. Results and proof of concept

To illustrate the concepts, we have chosen the following values of the system parameters: $\omega_1 = 0.9700$, $a_1 = 0.2650$, $f_1 = 1.1500$, $U_1 = 4.1596$, $\varepsilon_1 = 0.0176$; $\omega_2 = 0.9750$, $a_2 = 0.2650$, $f_2 = 1.1500$, $U_2 = 4.2796$, $\varepsilon_2 = 0.0460$; $\omega_3 = 0.9650$, $a_3 = 0.2650$, $f_3 = 1.1500$, $U_3 = 4.2796$.

To verify the state of the complete CNN processor in Fig. 2b, we investigate the states of chaotic and regular synchronization in the coupled system when monitoring the control parameter $\varepsilon_3$. The results obtained from the complete CNN processor are shown in Figs. 3, revealing the possibility for the coupled system to behave in synchrony in the regular state (Fig. 3a) and in the chaotic state (Fig. 3b) as well.

Fig. 3a: Simulation results using CNN on top of SIMULINK showing the projection of the regular attractors of the master and slave systems in the plane $(x_1, x_2)$ for $\varepsilon_3 = 0.0050$. This projection shows the regime of regular synchronization in the master-slave-auxiliary coupled system. The other parameters are defined in the text.

Fig. 3b: Simulation results using CNN on top of SIMULINK showing the projection of the chaotic attractors of the master and slave systems in the plane $(x_1, x_2)$ for $\varepsilon_3 = 0.0155$. This projection shows the regime of chaotic synchronization in the master-slave-auxiliary coupled system. The other parameters are defined in the text.

In order to verify the results obtained from the CNN processor, we use the same sets of system parameters to perform a direct numerical simulation of (3). The direct numerical simulation performed using Turbo-C has led to the results in Figs. 4. The very good agreement obtained when comparing the results in Figs. 3 with the results in Figs. 4 is a proof validating the concept developed in this paper, showing the possibility of using CNNs to solve complex and stiff ODEs.

Fig. 4a: Direct numerical simulation results (Turbo-C) showing the projection of the regular attractors of the master and slave systems in the plane $(x_1, x_2)$ for $\varepsilon_3 = 0.0050$. This projection shows the regime of regular synchronization in the master-slave-auxiliary coupled system. Same values of parameters as in Fig. 3a.

Fig. 4b: Direct numerical simulation results (Turbo-C) showing the projection of the chaotic attractors of the master and slave systems in the plane $(x_1, x_2)$ for $\varepsilon_3 = 0.0155$. This projection shows the regime of chaotic synchronization in the master-slave-auxiliary coupled system. Same values of parameters as in Fig. 3b.
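A direct numerical simulation of (3), along the lines of the Turbo-C verification, can be sketched in Python as below. The parameter values are those from Section III-C (with $\varepsilon_3 = 0.0050$, the regular regime); the RK4 scheme, the initial conditions, and our reading of the coupling indices $x_{2,1,1}$, $x_{3,3,2}$ (oscillator 1 sees $x_2 + x_3$, oscillator 2 sees $x_1 + x_3$, oscillator 3 sees $x_1 + x_2$) are assumptions of this sketch.

```python
import numpy as np

# Parameters from Section III-C; eps[2] selects regular (0.0050) or
# chaotic (0.0155) synchronization.
omega = np.array([0.9700, 0.9750, 0.9650])
a     = np.array([0.2650, 0.2650, 0.2650])
f     = np.array([1.1500, 1.1500, 1.1500])
U     = np.array([4.1596, 4.2796, 4.2796])
eps   = np.array([0.0176, 0.0460, 0.0050])

# Assumed coupling pattern from (3a): oscillator i is driven by the sum
# of the other two x-variables minus its own.
P = [(1, 2), (0, 2), (0, 1)]

def field(s):
    x, y, z = s                       # s has rows [x, y, z]
    coup = np.array([x[p] + x[q] - x[i] for i, (p, q) in enumerate(P)])
    dx = -omega * y - z + eps * coup
    dy = omega * x + a * y
    dz = f + z * (x - U)
    return np.array([dx, dy, dz])

def rk4(s, h):
    k1 = field(s)
    k2 = field(s + h / 2 * k1)
    k3 = field(s + h / 2 * k2)
    k4 = field(s + h * k3)
    return s + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

# Slightly detuned initial states so synchronization is not trivial
s = np.array([[1.0, 1.1, 0.9], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
for _ in range(40000):                # integrate to t = 200 with h = 0.005
    s = rk4(s, 0.005)
print(s[0])                           # x1, x2, x3 after the transient
```

Plotting the pair (x1, x2) collected along such a run is the numerical counterpart of the projections shown in Figs. 4.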

IV. SOLVING NONLINEAR PDEs WITH CNN

Four main variables (discrete or continuous) are considered when solving PDEs: time, the value of the state variable, the interaction of parameters, and space. The overall approach is based on transforming PDEs into ODEs and arranging these ODEs into a form which can be identified with the CNN models for template calculation. PDEs are good prototypes for modeling traffic flow, which is an interesting phenomenon of the modern world. The modeling process, though very important for gaining insight into the traffic dynamics, appears challenging: as we all experience daily, traffic problems are still not well understood. The reasons for this are the dynamic nature of traffic, which is time varying, and the unpredictable or stochastic nature of the traffic dynamics, which can be caused by accidents, road maintenance, or unpredictable events/situations influencing the traffic. These are some examples showing the nonlinear dynamical character of traffic. Thus, nonlinear equations are good candidates to describe the dynamics of traffic. At the macroscopic level, nonlinear PDEs are used for the modeling process. Specifically, the Burgers' equation has been intensively used for this purpose. However, it is well known that deriving exact analytical solutions of complex/stiff nonlinear PDEs is challenging or impossible. The numerical simulation of complex nonlinear PDEs is exposed to transient phenomena, accumulation of round-off errors during computation, and stiffness caused by the high degree of nonlinearity. Such simulation is also very time consuming. A key issue when simulating complex PDEs is the Dirichlet conditions (i.e. boundary conditions), which are generally very difficult to define during the numerical simulation of equations modeling engineering systems. This justifies the need for appropriate and efficient simulation tools which are robust to the problems faced by the classical tools for solving complex PDEs. It is worth noting that this issue is still unsolved by the state of the art, as the available traffic simulation algorithms/tools are slow and provide results with low accuracy. This paper develops a conceptual computing framework for complex/stiff nonlinear ODEs and PDEs. The concept developed is based on the paradigm of CNNs, which could be implemented on FPGA. Amongst equations of practical interest in engineering, the Burgers' equation (8) has been intensively used for many applications (e.g. modeling traffic flow, shock wave propagation, fluid dynamics, and heat propagation) [5]. This is a nonlinear wave equation which can be used to describe the corresponding density waves in several models of traffic flow. Another interesting application of the Burgers' equation is the modeling and optimal control of traffic flow.

$$\frac{\partial u(x,t)}{\partial t} = \alpha\, \frac{\partial^2 u(x,t)}{\partial x^2} - \beta\, u(x,t)\, \frac{\partial u(x,t)}{\partial x} \qquad (8)$$

$\alpha$ and $\beta$ are constants and $u(x,t)$ is the density of the wave. Equation (8) is a nonlinear equation with traveling-wave solutions. In the conservation-law form (i.e. $\alpha = 0$) the Burgers' equation reduces to a first-order hyperbolic PDE. This equation is illustrated by the well-known model of highway traffic theory, namely the Lighthill-Whitham-Richards (LWR) model [6]. It can be shown that the solution of (8) can be envisaged in the form

$$u(x,t) = g[x - vt] \qquad (9a)$$

$$v(x,t) = \alpha + \beta\, u(x,t) \qquad (9b)$$

where $g$ is an unknown function which depends on the initial-boundary value problem (to be specified/defined). The solution in (9) describes a right-moving wave with a velocity $v(x,t)$ depending on the density of the wave. This dependence can lead to many striking effects, namely breaking and the formation of shock fronts; the latter corresponds to the bunching of cars on a highway.



In order to solve the PDE in (8) with the paradigm of CNNs, the spatial discretization method (i.e. the finite difference method) is applied in order to transform the proposed PDE into sets of ODEs and arrange them into a suitable form that the CNN paradigm can solve. Specifically, we first discretize the function spatially by using the Taylor series expansion, and next we use the CNN paradigm to account for the temporal change (i.e. solving the ODEs in the time domain). The second-order Taylor series expansion of the solution of (8) around a fixed point $x_0$ can be envisaged in the form

$$u(x, t) = u(x_0, t) + (x - x_0)\, u'(x_0, t) + \frac{(x - x_0)^2}{2}\, u''(x_0, t) \qquad (10)$$

Considering two neighboring points situated at a distance $\Delta x_0$ around the fixed point $x_0$ (i.e. left and right of $x_0$) leads to the following mathematical formulation:

$$x_i = x_0 \pm \Delta x_0 \qquad (11)$$

(11) can be used to derive new forms of (10) as follows:

$$u(x_0 + \Delta x_0, t) = u(x_0, t) + \Delta x_0\, u'(x_0, t) + \frac{(\Delta x_0)^2}{2}\, u''(x_0, t) \qquad (12a)$$

$$u(x_0 - \Delta x_0, t) = u(x_0, t) - \Delta x_0\, u'(x_0, t) + \frac{(\Delta x_0)^2}{2}\, u''(x_0, t) \qquad (12b)$$

Thus, one can use (12) to deduce the Taylor series approximations of the second and first derivatives around the fixed point $x_0$ as follows:

$$u''(x_0, t) = \frac{u(x_0 + \Delta x_0, t) + u(x_0 - \Delta x_0, t) - 2u(x_0, t)}{(\Delta x_0)^2} \qquad (13a)$$

$$u'(x_0, t) = \frac{u(x_0 + \Delta x_0, t) - u(x_0 - \Delta x_0, t)}{2\Delta x_0} \qquad (13b)$$

Equation (11) can be used to construct a spatial domain which is made up of a number of grid-points arranged in a regular form, $\Delta x_0$ being the distance between them. The spatial domain is thus built from a number of grid-points localized by the positions $x_i$, the index $i$ being an integer. The temporal dynamics at each grid-point is obtained from the following first-order ODE, derived by substituting (13) into (8):

$$\frac{du_i}{dt} = \left(\frac{\alpha}{(\Delta x_0)^2}\right)\left[u_{i+1} - 2u_i + u_{i-1}\right] - \left(\frac{\beta}{2\Delta x_0}\right) u_i \left[u_{i+1} - u_{i-1}\right] \qquad (14)$$

Therefore (14) clearly shows that the analog computing of PDEs is possible by transforming them into ODEs expressed in the form of (14). This form is a set of coupled first-order ODEs, the number of equations being fixed by the index $i$. To solve (14) by the CNN paradigm, we apply some algebraic manipulation on (14) to transform it into the form of $i$ first-order ODEs which are further identified with the SC-CNN model in (1). Here, the total number of cells used to build the complete CNN processor is fixed by the index $i$. Our various numerical computations using the CNN paradigm were very slow, as for the sake of obtaining accurate results we were obliged to use a huge number of CNN cells. It was possible to obtain some sample results up to the maximal number of 10 CNN cells ($i_{max} = 10$). Fig. 5a shows the temporal evolution of the solution of (8) obtained by our CNN processor built with 10 cells for the parameters $\alpha = \frac{1}{2}$ and $\beta = 1$. Using the same set of parameters, we have also performed the direct numerical simulation of (8) in MATLAB. Fig. 5b shows the spatial evolution of the solution of (8), while Fig. 5c shows the spatio-temporal evolution of the solution of (8). The results in Fig. 5a obtained using the CNN simulator were generally very close to the corresponding plots obtained using the simulation in MATLAB. However, a divergence was observed for $i \leq 8$. This divergence can be explained by the fact that the accuracy of the CNN simulator decreases with a decreasing number of cells constituting the CNN simulator.

grid-points). Therefore the time evolution of the solution

Fig. 5a. Results of the CNN simulator showing the temporal evolution of the solution in (8) for α = 1 / 2

u ( x, t ) in (8) is obtained at each grid-point xi . This leads

β = 1 and i = 1 to 10 .

to a general solution

u ( xi , t ) = ui which can be obtained
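The discretization pipeline in (10)-(14) can also be sketched numerically without CNN hardware. The following Python fragment is a minimal illustration under our own assumptions (α = 1/2 and β = 1 as in the paper, but the initial profile, fixed boundaries, and forward-Euler step sizes are illustrative choices, not values from the paper):

```python
# Sketch of the discretization (10)-(14): the Burgers-type PDE
# u_t = alpha * u_xx - beta * u * u_x is reduced to the coupled
# first-order ODEs of (14) via the central differences (13a)-(13b),
# then integrated in time (here by forward Euler rather than a CNN).
import numpy as np

def burgers_rhs(u, alpha, beta, dx):
    """Right-hand side of (14); boundary values are held fixed."""
    du = np.zeros_like(u)
    du[1:-1] = (alpha / dx**2) * (u[2:] - 2.0 * u[1:-1] + u[:-2]) \
             - (beta / (2.0 * dx)) * u[1:-1] * (u[2:] - u[:-2])
    return du

def integrate(u0, alpha=0.5, beta=1.0, dx=0.1, dt=1e-3, steps=1000):
    """Forward-Euler time stepping of the semi-discrete system."""
    u = np.asarray(u0, dtype=float).copy()
    for _ in range(steps):
        u = u + dt * burgers_rhs(u, alpha, beta, dx)
    return u

if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 10)      # 10 grid points, as i = 1..10
    u0 = np.sin(np.pi * x)             # illustrative initial profile
    u = integrate(u0)
    print(u.shape, bool(np.all(np.isfinite(u))))
```

With α·dt/dx² = 0.05 the explicit scheme is stable; refining the grid (a larger number of points i) improves the accuracy, mirroring the cell-count discussion above.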


Fig. 5b. Results of a direct simulation on MATLAB showing the spatial evolution of the solution in (8) for α = 1 / 2 , β = 1 and i = 200

Fig. 5c. Results of a direct simulation on MATLAB showing the spatio-temporal evolution of the solution in (8) for α = 1 / 2 , β = 1 and i = 2000

V. BENCHMARKING

The proof of concept of the approach developed in this paper (i.e. an approach based on the linearization of the vector field around fixed points, combined with the CNN paradigm) is now performed by comparing the results obtained with those available in the relevant literature. We have considered two types of well-known nonlinear and stiff models/equations (an ODE and a PDE) which have already been considered in Refs. [10, 21, 5], in order to make possible a comparison with the results obtained by the approach developed in this paper. Concerning the ODE, the approach developed in this paper has been used to solve a well-known model describing the dynamics of a system of coupled Rössler-type non-identical self-sustained chaotic oscillators, which has been intensively used in ultra-wide-band communication and positioning systems [10, 21]. Our approach has revealed that the model of coupled Rössler equations can exhibit synchronized dynamics both in its regular and chaotic states for specific values/settings of the system parameters. Using different values/settings of the system parameters, similar results were reported in Ref. [10] by both numerical and experimental methods. Therefore, by fine-tuning the parameter settings, the approach developed in this paper can lead to results (e.g. plots) similar to those in Ref. [10]. This conclusion can be drawn from the fact that our approach has been used under the same values/settings of system parameters as in Ref. [21], and similar results were obtained.

Concerning the PDE, we have considered the nonlinear Burgers' equation, a well-known prototype commonly used to describe (in the time domain) the mean density of moving particles along a well-specified axis [5]. This equation has also been presented in this paper as a good prototype for both modeling and simulation of macroscopic traffic scenarios. Although the approach developed in this paper and that developed in Ref. [5] both apply the CNN paradigm to solve Burgers' equation, the main difference is the challenging idea developed in our paper, which consists of linearizing the vector field around fixed points in order to modify the nonlinear part of the model/equation without changing the qualitative dynamics of the system. Using the same values/settings of system parameters as in Ref. [5], a very good agreement has been obtained when comparing the results (e.g. plots) in this paper with those in Ref. [5]. The main contribution of this paper is the proposal of a systematic method to derive, for both nonlinear and stiff ODEs and PDEs, the corresponding templates that make them solvable by the CNN paradigm. It is worth noticing that although the approach based on the CNN paradigm to solve nonlinear and stiff ODEs and PDEs has been intensively developed in the relevant literature [1-6], the process leading to the derivation of the CNN templates is still not clearly explained or shown. The main objective of this paper has been the development of an analytical framework to clarify this issue.

VI. CONCLUSION

This work has presented a general concept for solving complex/stiff equations with the CNN paradigm. A systematic analytical study has been carried out showing the mapping of ODEs and PDEs into the CNN model in order to deduce the values of the corresponding templates.
The concept presented in this work has been applied to two specific types of well-known complex and stiff equations, namely a system of three coupled Rössler oscillators (ODE) and the Burgers' equation (PDE). It has been shown that solving nonlinear ODEs with the CNN paradigm is possible through a linearization process. Further, a space discretization process is essential to transform nonlinear PDEs into ODEs in order to make them solvable by the CNN paradigm; in this paper we used the second-order Taylor series expansion for the discretization. The results obtained using the approach developed in this paper were very close to those in the relevant literature. The parameter settings used in this paper were taken from the relevant literature, so that our results could be compared with existing ones. The limitations we found with the approach developed in this paper are twofold. The first is related to the fact that solving complex/stiff equations with CNNs is possible only by performing a systematic analytical study in order to derive conditions under which the qualitative dynamics of the system are not strongly sensitive to the degree of nonlinearity. This analytical development is very challenging, as the difficulty of the task increases with the degree of nonlinearity of the equation under investigation. An interesting research question would therefore be the development of a general theoretical framework for linearizing nonlinear ODEs. The second


limitation is related to the discretization of PDEs in space. It was observed that the accuracy of the method increases with an increasing number of grid points. However, increasing the number of grid points makes the numerical solution of PDEs with the approach developed in this work very difficult, because the numerical simulation of a huge number of CNN cells on the MATLAB platform is very time consuming (very slow). To optimize the results of the approach proposed in this work, it would be of great interest to consider analog computing based on reprogrammable hardware emulation on FPGAs or GPUs in order to attain an ultra-fast computing process.

REFERENCES
[1] L.O. Chua and L. Yang, "Cellular Neural Networks: Theory", IEEE Trans. on Circuits and Systems, vol. 35, 1988, pp. 1257-1272.
[2] G. Manganaro, P. Arena, L. Fortuna, "Cellular Neural Networks: Chaos, Complexity and VLSI Processing", Springer-Verlag Berlin Heidelberg, 1999, pp. 44-45.
[3] M. Ito and L.O. Chua, "Star cellular neural networks for associative and dynamic memories", International Journal of Bifurcation and Chaos, vol. 14, 2004, pp. 1725-1772.
[4] T. Yang, K. R. Crounse and L.O. Chua, "Spherical cellular nonlinear networks", International Journal of Bifurcation and Chaos, vol. 11, 2001, pp. 241-257.
[5] T. Roska, L.O. Chua, D. Wolf, T. Kozek, R. Tetzlaff, and F. Puffer, "Simulating nonlinear waves and partial differential equations via CNN - Part I: Basic techniques", IEEE Trans. on Circuits and Systems, vol. 42, 1995, pp. 807-815.
[6] J.P. Aubin, A.M. Bayen, P. T. Saint-Pierre, "Computation and control of solutions to the Burgers equation using viability theory", Proceedings of the American Control Conference, vol. 6, 2005, pp. 3906-3911.
[7] J. C. Chedjou, K. Kyamakya, I. Moussa, H.-P. Kuchenbecker, W. Mathis, "Behavior of a Self-Sustained Electromechanical Transducer and Routes to Chaos", Journal of Vibration and Acoustics, ASME Transactions, vol. 128, 2006, pp. 282-293.
[8] J. C. Chedjou, K. Kyamakya, Van Duc Nguyen, I. Moussa and J. Kengne, "Performance evaluation of analog systems simulation methods for the analysis of nonlinear and chaotic modules in communications", ISAST Transactions on Electronics and Signal Processing, vol. 2, 2008, pp. 71-82.
[9] J.C. Chedjou, H. B. Fotsin, P. Woafo, and S. Domngang, "Analog Simulation of the Dynamics of a van der Pol Oscillator Coupled to a Duffing Oscillator", IEEE Transactions on Circuits and Systems-I, vol. 48, 2001, pp. 748-757.
[10] J. C. Chedjou, K. Kyamakya, W. Mathis, I. Moussa, A. Fome, A. V. Fono, "Chaotic Synchronization in Ultra Wide Band Communication and Positioning Systems", Journal of Vibration and Acoustics, Transactions of the ASME, vol. 130, 2008, 011012/1-011012/12.
[11] J. C. Chedjou, "On the Analysis of Nonlinear Electromechanical Systems with Applications", Doctorate Dissertation, University of Hanover, Germany, ISBN 3-8322-3750-X, Shaker Verlag (Germany), March 2005.
[12] R. Brown, "Generalizations of the Chua Equations", IEEE Transactions on Circuits and Systems-I, vol. 40, 1993, pp. 878-884.
[13] Rolf Herken, "The Universal Turing Machine. A Half-Century Survey", ISBN-13: 978-3211826379, Springer, Wien, Feb. 1995.
[14] J. Mailen Kootsey, "Future Directions in Computer Simulations", Bulletin of Mathematical Biology, ISSN 0092-8240 (print), 1522-9602 (online), Springer New York, 1986.
[15] M. J. Aein, H. A. Talebi, "Introducing a training methodology for Cellular Neural Networks solving partial differential equations", International Joint Conference on Neural Networks 2009, ISBN 978-1-4244-3548-7.
[16] K. Hadad, A. Piroozmand, "Application of Cellular Neural Network (CNN) method to the Nuclear Reactor Dynamic Equations", Annals of Nuclear Energy, vol. 34, 2007, pp. 406-416, Elsevier.
[17] F. Corinto, M. Biey, M. Gilli, "Non-linear coupled CNN models for Multiscale Image Analysis", International Journal of Circuit Theory and Applications (special issue on CNN technology), vol. 34, issue 1, 2006, pp. 77-88.
[18] Nagy Zoltan, Vöröshazi Zsolt, Szolgay Peter, "Emulated digital CNN-UM solution of partial differential equations", International Journal of Circuit Theory and Applications, ISSN 0098-9886, vol. 34, 2006, pp. 445-470.
[19] Fausto Sargeni, Vincenzo Bonaiuto, "Programmable CNN Analogue Chip for RD-PDE Multi Method Simulations", Analog Integrated Circuits and Signal Processing, ISSN 0925-1030, vol. 44, issue 3, September 2005, pp. 283-292.
[20] M. Gilli, T. Roska, L. O. Chua, and P.P. Civalleri, "On the relationship between CNNs and PDEs", Proceedings of the 7th IEEE Workshop on Cellular Neural Networks and Their Applications, IEEE Circuits and Systems Society, 2002.
[21] I. Moussa, J. C. Chedjou, K. Kyamakya and Van Duc Nguyen, "Dynamics of a secure communication module based on chaotic synchronizations", ISAST Transactions on Communication and Networking, no. 1, vol. 2, 2008, pp. 14-23.

Jean Chamberlain Chedjou received his doctorate in Electrical Engineering in 2004 from the Leibniz University of Hanover, Germany. He has been a DAAD (Germany) scholar and also an AUF research fellow (postdoc). Since 2000 he has been a Junior Associate researcher in the Condensed Matter section of the ICTP (Abdus Salam International Centre for Theoretical Physics), Trieste, Italy. Currently, he is a senior researcher at the Institute for Smart Systems Technologies of the Alpen-Adria University of Klagenfurt in Austria. His research interests include Electronic Circuits Engineering, Chaos Theory, Analog Systems Simulation, Cellular Neural Networks, Nonlinear Dynamics, Synchronization and related Applications in Engineering. He has authored and co-authored 3 books and more than 40 journal and conference papers.

Kyandoghere Kyamakya obtained his M.S. in Electrical Engineering in 1990 at the University of Kinshasa. In 1999 he received his Doctorate in Electrical Engineering at the University of Hagen in Germany. He then worked for three years as a post-doctoral researcher at the Leibniz University of Hannover in the field of Mobility Management in Wireless Networks. From 2002 to 2005 he was junior professor for Positioning and Location Based Services at Leibniz University of Hannover. Since 2005 he has been full Professor for Transportation Informatics and Director of the Institute for Smart Systems Technologies at the University of Klagenfurt in Austria.

M. Ahsan Latif obtained his MSc in computer sciences in 2001 at the University of Agriculture, Faisalabad, Pakistan. He served the Pakistan defence ministry as a programmer for five years. Currently he is pursuing his Doctorate under Prof. Kyandoghere Kyamakya and Dr. Jean Chamberlain Chedjou. His research domain comprises nonlinear dynamics in image processing, cellular neural networks and solving stiff ODEs and PDEs.

Umair Ali Khan completed his bachelor's degree in Computer Engineering at Quaid-e-Awam University, Nawabshah, Pakistan. He joined the same university as a lecturer in 2005. In 2008, he started his Master's degree in Information Technology at the Alpen-Adria University of Klagenfurt, Austria, where he is also currently working as a research assistant in the Transportation Informatics Group. His research interests include nonlinear dynamics, image processing, cellular neural networks and robotics. He has three international publications, on obstacle detection using cellular neural networks, on benchmarking the traditional genetic algorithm against a novel approach for image processing using cellular neural networks, and on a traffic light controller based on reinforcement learning.


Moussa Ildoko holds an MSc in control and signal processing and a Doctorate degree in Electronics from the University of Valenciennes et du Hainaut-Cambrésis, France, obtained in 1982 and 1985, respectively. He obtained his final Doctorate, the so-called "Doctorat d'Etat" of the French educational system, in 2008 at the University of Yaoundé I (Cameroon). He is currently an Associate researcher at UDETIME (Doctorate School of Electronics, Information Technology, and Experimental Mechanics) at the University of Dschang, Cameroon. Besides, he is a Senior Lecturer at the University of Yaoundé I, Cameroon. His research interests are related to Nonlinear Dynamics, Analog Circuits Design and Chaos-based Secure Communications.

Do Trong Tuan received B.E., M.E., and Dr.Eng. degrees in Electronics and Telecommunications from Hanoi University of Technology, Vietnam, in 1997, 2000 and 2006, respectively. He has been a lecturer at Hanoi University of Technology since 1999. During 2008-2009 he was a post-doctoral fellow at the Institute for Smart-Systems Technologies, University of Klagenfurt, Austria. His fields of research are Mobile Communications, Electronic Navigation, Random/Stochastic Graph Optimization and Cellular Neural Networks.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 A. Caneco et.al.: Synchronizability and Graph Invariants


Synchronizability and Graph Invariants

Acilina Caneco∗, Clara Grácio†, Sara Fernandes‡, J. Leonel Rocha§ and Carlos Ramos¶

∗ Mathematics Unit, DEETC, and CIMA-UE, Instituto Superior de Engenharia de Lisboa, Rua Conselheiro Emídio Navarro, 1, 1949-014 Lisboa, Portugal. Email: [email protected].
† Department of Mathematics, Universidade de Évora and CIMA-UE, Rua Romão Ramalho, 59, 7000-671 Évora, Portugal. Email: [email protected].
‡ Department of Mathematics, Universidade de Évora and CIMA-UE, Rua Romão Ramalho, 59, 7000-671 Évora, Portugal. Email: [email protected].
§ Mathematics Unit, DEQ, Instituto Superior de Engenharia de Lisboa, Rua Conselheiro Emídio Navarro, 1, 1949-014 Lisboa, Portugal. Email: [email protected].
¶ Department of Mathematics, Universidade de Évora and CIMA-UE, Rua Romão Ramalho, 59, 7000-671 Évora, Portugal. Email: [email protected].

Abstract—The wide range of applications to real problems justifies the enormous interest in the study of network complexity. A first objective of research on complex networks is to understand how their structure is affected by the dynamics in progress on them. For instance, it is important to know how the flow of city traffic, the spread of an epidemic, the growth of blogs in the network community, and many other processes are influenced by the network structure. In previous works we studied the synchronizability of a network in terms of the local dynamics [1], [2], assuming that the topology of the graph is fixed. In this paper we are interested in studying the effects of the network structure, i.e., the topology of the graph, on the synchronizability of the network, through the study of some graph parameters, such as the conductance, the clustering coefficient, the performance of the graph and the eigenratio. We define the clustering turning point to identify the formation of clusters and observe its effect on the network synchronization.

KEY-WORDS: synchronizability, networks, graph invariants, conductance, clustering coefficient.

I. INTRODUCTION

With more information being transported at higher data speeds, synchronization of traffic in networks is becoming increasingly important. For instance, in the field of mobile telephony (with the new 3G systems), traffic in the network continues to increase enormously. How can one know when a network is synchronized? This illustrates the great importance of the study of synchronization. Analytic and numerical criteria have been developed for establishing approximate domains of synchronization. Observing the effect of each structural feature on the synchronizability is significant work, since it makes it possible to know the relationships between each typical structural feature (which defines each type of network) and the synchronizability. Then we could obtain an expression for the synchronizability as a function of these properties. Thus, the synchronizability of a specific structure (scale-free, regular, random, small-world) can be largely predicted from its topological features. Synchronization of networks occurs for values of the coupling parameter belonging to a certain interval. The extremes of this interval depend not only on the local dynamics of the nodes, but also on the spectrum of the Laplacian of the associated graph. It is therefore necessary to study a system of graph invariants and how they relate to qualitative properties of the network. We investigate the effect of the conductance, the clustering coefficient, the performance of the graph, and the clustering turning point on the amplitude of the synchronization interval. In a previous work [3] we started investigating the effect of some graph invariants on the synchronizability, expressed by the eigenratio r. For example, the synchronizability decreases as the conductance decreases. But we noticed that a decreasing clustering coefficient implies a decreasing synchronizability only while there is no cluster formation, and the behavior reverses from that moment on. In this work we quantify that moment of cluster formation by defining the clustering turning point, and we investigate its effect on the Laplacian eigenvalues. The basic concepts of graph theory and the Laplacian are introduced in section II. In section III our model for the coupled dynamics on networks is introduced and a quantitative measure of the synchronizability is formulated. In section IV the graph invariants, conductance, clustering efficiency, clustering coefficient and clustering turning point are defined. Finally, section V presents the numerical results that show the effect of these invariants on the amplitude of the network synchronization interval.


II. P RELIMINARIES

Mathematically, networks are described by graphs, and the theory of dynamical networks is a combination of graph theory and nonlinear dynamics. From the point of view of dynamical systems, we have a global dynamical system emerging from the interactions between the local dynamics of the individual elements, while graph theory analyzes the coupling structure. A graph is a pair G = (V(G), E(G)), where V(G) is a nonempty set of N vertices or nodes (N is called the order of the network or of the associated graph) and E(G) is the set of m edges or links eij that connect two vertices vi and vj (m is called the size of the graph) [5]. If the graph is weighted, for each pair of vertices (vi, vj) we set a non-negative weight aij such that aij = 0 if the vertices vi and vj are not connected. If the graph is not weighted, aij = 1 if vi and vj are connected and aij = 0 otherwise. If the graph is not directed, which is the case we will study, then aij = aji. The matrix A = A(G) = [aij], with vi, vj ∈ V(G), is called the adjacency matrix; it carries an entry 1 at the intersection of the i-th row and the j-th column if there is an edge from i to j, and an entry 0 when there is no edge. The degree of a node vi is the number of edges incident on it, represented by ki, that is, ki = Σ_{j=1}^{N} aij.

Consider a network of N coupled dynamical systems described by

ẋi = f(xi) + b Σ_{j=1}^{N} aij Γ (xj − xi), with i = 1, 2, ..., N,    (1)

where b > 0 is the coupling parameter, A = [aij] is the adjacency matrix and Γ = diag(1, 1, ..., 1). Equation (1) can be rewritten as

ẋi = f(xi) + b Σ_{j=1}^{N} lij xj, with i = 1, 2, ..., N,    (2)

where L = [lij] is the Laplacian matrix of the graph. The network (2) achieves complete synchronization if

x1(t) = x2(t) = ... = xN(t) → e(t), as t → ∞,    (3)

where e(t) is a solution of an isolated node (equilibrium point, periodic orbit or chaotic attractor), satisfying ė(t) = f(e(t)). The discretized form of the network equation (2) is

xi(k+1) = f(xi(k)) + b Σ_{j=1}^{N} lij f(xj(k)), with i = 1, 2, ..., N.    (4)

Let hmax be the Lyapunov exponent of each individual n-dimensional node. Then, [8], the network synchronizes if the coupling parameter b belongs to the synchronization interval

(1 − e^(−hmax)) / λ2 < b < (1 + e^(−hmax)) / λN,

where λ2 and λN are, respectively, the smallest nonzero and the largest eigenvalue of the Laplacian matrix.
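To make the synchronization criterion concrete, the sketch below couples four fully chaotic logistic maps (hmax = ln 2) over a complete graph according to the discretized network (4), written here with the standard Laplacian L = D − A (hence a minus sign in the update), and checks that a coupling b inside the interval drives all nodes onto a common trajectory. The map, the graph and all numerical values are illustrative choices, not taken from the paper:

```python
# Coupled map network as in (4), with the sign convention L = D - A:
# x_i(k+1) = f(x_i(k)) - b * sum_j L_ij f(x_j(k)).
# For the fully chaotic logistic map f(x) = 4x(1-x), h_max = ln 2, so the
# synchronization interval is ((1 - 1/2)/lambda_2, (1 + 1/2)/lambda_N).
import numpy as np

def f(x):
    return 4.0 * x * (1.0 - x)

def simulate(A, b, x0, steps=1000):
    """Iterate the coupled map network; return the final node spread."""
    L = np.diag(A.sum(axis=1)) - A          # Laplacian L = D - A
    M = np.eye(len(x0)) - b * L             # one-step coupling matrix
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = M @ f(x)
    return x.max() - x.min()

if __name__ == "__main__":
    N = 4
    A = np.ones((N, N)) - np.eye(N)         # complete graph K4
    lam = np.sort(np.linalg.eigvalsh(np.diag(A.sum(axis=1)) - A))
    lo, hi = 0.5 / lam[1], 1.5 / lam[-1]    # (0.125, 0.375) for K4
    b = 0.2                                 # inside the interval
    spread = simulate(A, b, [0.123, 0.345, 0.567, 0.789])
    print(lo < b < hi, spread < 1e-10)
```

For K4 all nonzero Laplacian eigenvalues equal 4, so the interval is (0.125, 0.375); with b = 0.2 the spread contracts geometrically, since |1 − bλ| · max|f′| = 0.8 < 1.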

Several definitions of the conductance of a graph may be considered, for instance:

φ2(G) = min_{U⊂V} |E(U, V−U)| / min{|E1(U)|, |E1(V−U)|};

φ3(G) = min_{U⊂V} N |E(U, V−U)| / (|U| · |V−U|);

φ4(G) = min_{U⊂V} |E(U, V−U)| / (|U| · log(N/|U|)).

In these definitions, |E(U, V−U)| is the number of edges from U to V−U, and |E1(U)| means the sum of the degrees of the vertices in U, [4]. We will use the second one,

φ(G) = min_{U⊂V} |E(U, V−U)| / min{|E1(U)|, |E1(V−U)|}.

The clustering coefficient measures how densely connected the neighbors of a particular node are; however, it does not tell which neighbors give the largest contributions to the clustering coefficient. The clustering coefficient ci of a vertex vi measures the probability that two neighboring nodes of vi are also neighbors of each other; it measures the local cohesiveness around a node and is given by the proportion of edges between the vertices within its neighborhood, divided by the number of edges that could possibly exist between them. Representing by ki the degree of a vertex vi and by Ni = {vj : eij ∈ E} the neighborhood of vi, then for an undirected graph one can define, [15], the clustering coefficient ci of a vertex vi with degree larger than one by

ci = |{ejk}| / (ki choose 2) = 2 |{ejk}| / (ki (ki − 1)), with ejk ∈ E; vj, vk ∈ Ni.

In terms of the adjacency matrix A = A(G) = [aij], the clustering coefficient can be calculated by

ci = (Σ_{j=1}^{N} Σ_{m=1}^{N} aij ajm ami) / (ki (ki − 1)).

The clustering coefficient of the whole graph can be obtained in two ways. The first is the average over the vertices with degree larger than one,

c = (1/N) Σ_{i, ki>1} ci.

The second one is obtained by computing first the averages of |{ejk}| and (ki choose 2) and then their ratio,

C = (Σ_{i=1}^{N} |{ejk}|) / (Σ_{i=1}^{N} (ki choose 2)), with ejk ∈ E; vj, vk ∈ Ni.

This was the local aspect of clustering. To consider the global aspect, which is also called community structure, it is necessary to consider partitions of the vertex set. A clustering of the graph G is a partition of the vertex set, C^k = {C1, C2, ..., Ck}, and the Ci ⊂ V(G) are called clusters. C^k is called trivial if either k = 1 or all clusters Ci contain only one element. We identify a cluster Ci with the induced subgraph of G, i.e., the graph G(Ci) = {Ci, E(Ci)}. The set of intracluster edges and the set of intercluster edges are defined, respectively, by

Intra(C^k) = ∪_{i=1}^{k} E(Ci)  and  Inter(C^k) = E(G) \ Intra(C^k).

The performance of a clustering should measure the quality of each cluster as well as the cost of the clustering. In [14] this bicriterion is based on a two-parameter definition of an (α, ε)-clustering, where α should measure the quality of the clusters and ε the cost of such a partition, that is, the ratio of the intercluster edges to the total number of edges in the graph.

Definition 1: We call a partition C^k = {C1, C2, ..., Ck} of V an (α, ε)-clustering if:
1) the conductance of each cluster is at least α: Φ(G(Ci)) ≥ α, for all i = 1, ..., k;
2) the fraction of intercluster edges to the total number of edges is at most ε: |Inter(C^k)| / |E(C^k)| ≤ ε.

According to this definition the clustering is good if it maximizes α and minimizes ε. We then introduce a coefficient that accomplishes both optimization problems.
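The invariants above can be computed directly from the adjacency matrix. The sketch below is a brute-force illustration of the definitions (exponential in N, so only for small graphs); the test graph, two triangles joined by one edge, is our own example, not one from the paper:

```python
# Brute-force computation of the local clustering coefficient c_i and
# the conductance phi(G) for a small undirected graph.
from itertools import combinations
import numpy as np

def clustering_coefficient(A, i):
    """c_i = 2|{e_jk}| / (k_i (k_i - 1)), edges among neighbors of i."""
    nbrs = np.flatnonzero(A[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(A[j, m] for j, m in combinations(nbrs, 2))
    return 2.0 * links / (k * (k - 1))

def conductance(A):
    """phi(G) = min_U |E(U, V-U)| / min(|E1(U)|, |E1(V-U)|)."""
    N = A.shape[0]
    deg = A.sum(axis=1)
    best = np.inf
    for r in range(1, N):
        for U in combinations(range(N), r):
            U = list(U)
            W = [v for v in range(N) if v not in U]
            cut = A[np.ix_(U, W)].sum()
            vol = min(deg[U].sum(), deg[W].sum())
            if vol > 0:
                best = min(best, cut / vol)
    return best

if __name__ == "__main__":
    # two triangles {0,1,2} and {3,4,5} joined by the edge 2-3
    A = np.zeros((6, 6), dtype=int)
    for a, b in [(0,1),(1,2),(0,2),(3,4),(4,5),(3,5),(2,3)]:
        A[a, b] = A[b, a] = 1
    print(clustering_coefficient(A, 0))   # 1.0: both neighbors adjacent
    print(conductance(A))                 # 1/7: cut separating triangles
```

For node 0 both neighbors are adjacent, so c0 = 1; the minimizing cut separates the two triangles (one cut edge against degree-sum 7), giving φ(G) = 1/7.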


Definition 2: For an (α, ε)-clustering C, define the performance of C by the ratio

R = ε / α.

This means that a clustering is better if it has a smaller R.

Starting with a complete graph, we delete edges until the formation of clusters becomes visible. It is necessary to identify the situation where the formation of clusters is clear. This turning point, which we call the clustering turning point, is the maximum number of edges the graph may have in order for the formation of clusters to be identifiable.

Fig. 1. Formation process of three clusters.
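Definitions 1 and 2 combine into a small computation: take α as the minimum conductance over the induced cluster subgraphs, ε as the fraction of intercluster edges, and report R = ε/α. The toy graph below (two 4-cycles joined by one edge) and the brute-force conductance helper are our illustrative choices, not the paper's 15-node example:

```python
# Performance R = epsilon/alpha of an (alpha, epsilon)-clustering
# (Definitions 1 and 2), computed by brute force on a toy graph.
from itertools import combinations
import numpy as np

def conductance(A):
    """phi(G) = min_U |E(U, V-U)| / min(|E1(U)|, |E1(V-U)|)."""
    N = A.shape[0]
    deg = A.sum(axis=1)
    best = np.inf
    for r in range(1, N):
        for U in combinations(range(N), r):
            U = list(U)
            W = [v for v in range(N) if v not in U]
            cut = A[np.ix_(U, W)].sum()
            vol = min(deg[U].sum(), deg[W].sum())
            if vol > 0:
                best = min(best, cut / vol)
    return best

def performance(A, clusters):
    """R = epsilon / alpha for the given partition of the vertices."""
    m = A.sum() / 2                                  # total edge count
    intra = sum(A[np.ix_(C, C)].sum() / 2 for C in clusters)
    eps = (m - intra) / m                            # intercluster fraction
    alpha = min(conductance(A[np.ix_(C, C)]) for C in clusters)
    return eps / alpha

if __name__ == "__main__":
    # two 4-cycles 0-1-2-3 and 4-5-6-7 joined by the edge 3-4
    A = np.zeros((8, 8), dtype=int)
    for a, b in [(0,1),(1,2),(2,3),(3,0),(4,5),(5,6),(6,7),(7,4),(3,4)]:
        A[a, b] = A[b, a] = 1
    print(performance(A, [[0, 1, 2, 3], [4, 5, 6, 7]]))  # (1/9)/(1/2) = 2/9
```

Here ε = 1/9 and α = φ(C4) = 1/2, so R = 2/9; a sparser intercluster connection or denser clusters would lower R further, i.e., improve the clustering.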

The formation of a clustering C^k is considered clear when the number of intercluster edges is smaller than the number of vertices:

|Inter(C^k)| < N ⟹ |E(G)| < N + |Intra(C^k)|, where N = |V(G)|.

Definition 3: Let G be a graph, with N = |V(G)|, and a clustering C^k = {C1, C2, ..., Ck}, where |N(Ci)| is the number of vertices of each cluster Ci. The clustering turning point of the clustering C^k is defined by

Cl(C^k) = N + |Intra(C^k)| + 1.

When we delete edges, the clustering coefficient and the conductance decrease. But from the clustering turning point on, the clustering coefficient increases, denoting the formation of clusters. Nevertheless, the conductance decreases further, as it measures the flow of the network.

V. NUMERICAL RESULTS

We study the network synchronization as a function of the connection topology, fixing the local dynamics. We try to understand the relation of some graph invariants with the spectrum of the Laplacian matrix ([4], [6]). We can find a great number of formulas relating graph invariants with the eigenvalues characterizing the synchronization interval, λ2 and λN, but none, as far as we know, for a relation between these eigenvalues and cluster formation, nor for a relation between the conductance and the clustering. We perform an experimental evaluation to observe, for several network types, the effect of the graph conductance φ, the clustering coefficient c, the eigenratio r and the performance R of the (α, ε)-clustering on the cluster formation. We simulate the creation of clusters starting with a complete graph of N vertices, where each vertex is connected to every other one, excluding self-connections. Thus, we have N(N−1)/2 edges, and we delete edges leading to the creation of several clusters. In our case we choose N = 15, and we delete edges leading to the creation of clusterings C^3, C^4 and C^5 with three, four and five clusters, respectively.

For the case of a clustering with three clusters, C^3 = {{1,2,3,4,5}, {6,7,8,9}, {10,11,12,13,14,15}}, Fig. 1 shows four steps of the evolution of the network, starting with a 15-node complete graph with 105 edges. Fig. 2 shows the behavior of the graph invariants, the conductance φ, the performance R and the eigenratio r, with the formation of the three clusters C^3. All these quantities decrease when we delete edges. Fig. 3 shows the behavior of the clustering coefficient c, the conductance φ and the eigenratio r with the formation of the three clusters C^3. When we delete edges, the conductance φ and the eigenratio r decrease. The clustering coefficient c behaves in the same way only in the first part of the process, but increases after the clustering turning point.

Fig. 2. Evolution of the conductance φ, the performance R and the eigenratio r in the formation process of the C^3 clustering, as the number of edges (on the horizontal axis) increases.

Fig. 3. Evolution of the clustering coefficient c, the conductance φ, and the eigenratio r in the formation process of the C^3 clustering, as the number of edges (on the horizontal axis) decreases.

Note that N = |V(G)| = 15, so the clustering turning point is 47.
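The turning-point arithmetic is easy to check. Assuming, as the stated value |Intra(C^3)| = 31 suggests, that each cluster is counted as a complete subgraph, the sketch below reproduces Cl(C^3) = 47 for the 15-node cluster sizes {5, 4, 6}; the helper name and the complete-subgraph assumption are ours:

```python
# Clustering turning point Cl(C^k) = N + |Intra(C^k)| + 1 (Definition 3).
# If every cluster C_i of size n_i is a complete subgraph, then
# |Intra(C^k)| = sum_i C(n_i, 2): for sizes {5, 4, 6} this gives
# 10 + 6 + 15 = 31 intracluster edges, matching the text.
from math import comb

def turning_point(cluster_sizes):
    N = sum(cluster_sizes)
    intra = sum(comb(n, 2) for n in cluster_sizes)  # intracluster edges
    return N + intra + 1

print(turning_point([5, 4, 6]))   # 15 + 31 + 1 = 47, as in the text
```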


¯ ¯ ¯Intra(C 3 )¯ = 31 =⇒ Cl(C 3 ) = 47.

4 clusters {1,2,3},{4,5,6,7}, {8,9,10} e {11,12,13,14,15} 1,20000

This means that, there is a clear formation of this C 3 clustering (Inter(C 3 ) < N ) when E(G) < 47.


Fig. 4 shows four steps of the evolution of the network. We begin with a complete graph with 15 nodes and 105 edges and, after deleting edges, we obtain four clusters, C⁴ = {{1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10}, {11, 12, 13, 14, 15}}.

Fig. 4. Formation process of four clusters.

Fig. 6. Evolution of the clustering coefficient c, the conductance φ, and the eigenratio r in the formation process of the C⁴ clustering, as the number of edges (on the horizontal axis) decreases.

All three quantities (the conductance φ, the performance R and the eigenratio r; see Fig. 8) decrease when we delete edges. Fig. 9 shows the behavior of the clustering coefficient c, the conductance φ and the eigenratio r during the formation of C⁵. When we delete edges, the conductance φ and the eigenratio r decrease. The clustering coefficient c behaves in the same way only in the first part of the process, but increases after the clustering turning point. For this case, we have |Intra(C⁵)| = 15 ⟹ Cl(C⁵) = 31.

Fig. 5 shows the behavior of the graph invariants, the conductance φ, the performance R and the eigenratio r, with the formation of the four clusters C⁴. All these quantities decrease when we delete edges.

This means that there is a clear formation of this C⁵ clustering (Inter(C⁵) < N) when E(G) < 31. In the first part of the process of deleting edges, the clustering coefficient decreases, but after the clustering turning point it increases (see Fig. 9), denoting the beginning of cluster formation.


Fig. 5. Evolution of the conductance φ, the performance R and the eigenratio r in the formation process of the C⁴ clustering, as the number of edges (on the horizontal axis) increases.

Fig. 7. Formation process of five clusters.

For this case, we have |Intra(C⁴)| = 22 ⟹ Cl(C⁴) = 38.


This means that there is a clear formation of this C⁴ clustering (Inter(C⁴) < N) when E(G) < 38; see Fig. 6. Fig. 6 compares the clustering coefficient c, the conductance φ and the eigenratio r. When we delete edges, the conductance φ and the eigenratio r decrease. The clustering coefficient c behaves in the same way only in the first part of the process, but increases after the clustering turning point.

VI. CONCLUSION

It is possible to observe that the clustering turning point also detects a significant change in the values of the conductance. The conductance measures the ability of the system to spread; it is natural that when the clusters form, i.e., when there are considerably fewer links between clusters, the conductance decreases sharply. This is exactly what we see in the computations presented in the above examples. In the last step of the simulations, when the graph becomes disconnected, the conductance is zero. Conversely, when the number of links between clusters is less than the order of the graph (cluster formation), the clustering coefficient increases sharply.
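The bottleneck interpretation of conductance can be made concrete. The sketch below uses one common normalization, the number of edges crossing the cut divided by the volume (total degree) of the smaller side; the paper's exact definition may differ:

```python
def conductance(edges, S, V):
    """phi(S): edges crossing the cut (S, V - S) over the volume of the smaller side."""
    Sbar = V - S
    cut = sum(1 for a, b in edges if (a in S) != (b in S))
    vol = lambda X: sum(1 for a, b in edges for v in (a, b) if v in X)
    return cut / min(vol(S), vol(Sbar))

V = set(range(1, 16))
S = {1, 2, 3, 4, 5}

# complete graph K15: every pair of distinct vertices is an edge
K15 = [(a, b) for a in V for b in V if a < b]
print(conductance(K15, S, V))          # well connected: high conductance (50/70)

# keep only one edge between S and its complement: a bottleneck
bottleneck = [(a, b) for a, b in K15 if (a in S) == (b in S)] + [(5, 6)]
print(conductance(bottleneck, S, V))   # low conductance (1/21)
```

As inter-cluster edges are deleted, the cut count, and hence φ, is driven toward zero, matching the sharp drop observed around the clustering turning point.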

In Fig. 7 we show four steps of the evolution of a network, starting again with a complete graph with 105 edges and deleting edges until we obtain the clustering C⁵ = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}, {10, 11, 12}, {13, 14, 15}}. Fig. 8 shows the behavior of the conductance φ, the performance R and the eigenratio r with the formation of the clustering C⁵.



We can see that the curve of R follows the curve of the eigenratio which, in turn, follows the curve of the conductance. Thus, the three quantities characterize both the synchronizability and the quality of the clustering. These quantities vary in opposite directions: better clustering implies poorer synchronization. For the network (2) the synchronization interval is (5). Fixing the dynamics f in the nodes, the larger the eigenratio r = λ2/λN, the larger the synchronization interval. Our conclusions are based on the observed similar behavior of all four parameters: the graph conductance φ, the clustering coefficient c, the eigenratio r and the performance of the (α, ε)-clustering R.


Fig. 8. Evolution of the conductance φ, the performance R and the eigenratio r in the formation process of the C⁵ clustering, as the number of edges (on the horizontal axis) increases.

ACKNOWLEDGMENT We would like to thank FCT (Portugal), Instituto Superior de Engenharia de Lisboa and CIMA-UE for having in part supported this work.


REFERENCES


[1] A. Caneco, S. Fernandes, C. Grácio and J. L. Rocha, Symbolic dynamics and networks, Proceedings of the International Workshop on Nonlinear Maps and Applications (NOMA'07), 42-45, 2007.
[2] A. Caneco, C. Grácio and J. L. Rocha, Symbolic dynamics and chaotic synchronization in coupled Duffing oscillators, J. Nonlinear Math. Phys., 15, 3, 102-111, 2008.
[3] A. Caneco, S. Fernandes, C. Grácio and J. L. Rocha, Networks synchronizability, local dynamics and some graph invariants, 2009 (submitted).
[4] S. L. Bezrukov, Edge isoperimetric problems on graphs, Theoretical Computer Science, 307, 473-492, 2003.
[5] B. Bollobás and O. M. Riordan, in Handbook of Graphs and Networks: From the Genome to the Internet, Wiley-VCH, 2003.
[6] B. Bollobás, Random Graphs, New York, 1985.
[7] C. Grácio and J. Sousa Ramos, The first eigenvalue of the Laplacian and the conductance of a compact surface, Nonlinear Dynamics, 44, 243-250, 2006.
[8] X. Li and G. Chen, Synchronization and desynchronization of complex dynamical networks: An engineering viewpoint, IEEE Trans. on Circ. Syst. I, Vol. 50, pp. 1381-1390, 2003.
[9] P. N. McGraw and M. Menzinger, Clustering and the synchronization of oscillator networks, Phys. Rev. E, 72, 015101(R), 2005.
[10] B. Mohar, The Laplacian spectrum of graphs, Graph Theory, Combinatorics and Applications, Vol. 2, pp. 871-898, 1991.
[11] S. Fernandes and S. Jayachandran, Conductance in discrete dynamical systems, Proceedings of the XII International Conference on Difference Equations and Applications, July 25-30, 2007 (accepted).
[12] S. Fernandes and J. S. Ramos, Second eigenvalue of transition matrix associated to interval maps, Chaos, Solitons & Fractals, 31(2), 316-326, 2007.
[13] F. Comellas and S. Gago, Synchronizability of complex networks, J. Phys. A: Math. Theor., 40, 4483-4492, 2007.
[14] R. Kannan, S. Vempala and A. Vetta, On clusterings: Good, bad and spectral, Journal of the ACM, 51(3), 497-515, 2004.
[15] D. J. Watts and S. H. Strogatz, Collective dynamics of small-world networks, Nature, 393, 440-442, 1998.

Fig. 9. Evolution of the clustering coefficient c, the conductance φ, and the eigenratio r in the formation process of the C⁵ clustering, as the number of edges decreases. The horizontal axis indicates the number of steps in this process.

This clustering turning point, which evaluates where we can observe the formation of a network clustering, relates the number of intra-cluster edges, the number of inter-cluster edges and the order of the network. We can ask how many edges must be deleted in order to identify subgraphs that we can call clusters; that is, a condition for cluster formation. If clusters form, the synchronizability decreases with the decrease of the clustering coefficient until there are no clusters, but this behavior reverses when the clusters become apparent. One can say that the claim "a large value of the clustering coefficient enhances the synchronization" [13] is true, for example, when one deletes edges following a certain path and in the beginning of cluster formation, but it is not true after the clusters appear [3]. However, we observe that the conductance enables us to evaluate the synchronizability of the network. Low conductance means that there is some bottleneck in the graph, a subset of nodes not well connected (few inter-cluster edges) with the rest of the graph. Our study leads to the following conclusions: 1) A bad clustering implies a larger synchronization interval; 2) The conductance of the underlying graph is a good parameter to characterize the clustering and the synchronization.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 C. Adams and M. Rodrigues: Dealing with Non-Policy-Conformant Requests in Credential Systems


Dealing with Non-Policy-Conformant Requests in Credential Systems

Carlisle M. Adams, Senior Member, IEEE, Marion G. Rodrigues

Abstract—Credential systems allow users to demonstrate possession of specific attributes required for a particular transaction without revealing any other non-essential personal information; thus they play an important role in the construction of privacy-preserving access control infrastructures. However, the ability to conduct transactions in a truly anonymous way can facilitate undesirable activity by legitimate credential owners. This paper proposes a mechanism for limiting such activity without sacrificing the privacy of every credential owner. Furthermore, some freedom of choice remains: the credential verifier can decide whether a particular action would constitute undesired activity with respect to his own policy (and is therefore non-policy-conformant) while the credential owner can decide whether to reveal information that may allow her actions to be tracked. Either party may choose to proceed with or abort the transaction. Aborting the transaction early reveals no information that would compromise the privacy of the requester. This mechanism is flexible and reduces the risk of misuse that may be associated with wide-scale deployment of credential systems.

Index Terms—Credential systems; Privacy; Anonymity; Digital Credentials; Anonymous Credentials; Credential misuse; Policy non-conformance.

I. INTRODUCTION

Credential systems have been proposed and discussed in the academic literature in many different forms (including pseudonym systems [13, 14], digital credentials [2, 3], anonymous credentials [6, 7], and variations of all the above). The primary goal of any credential system is to preserve a user's privacy in transactions (such as resource access requests) that take place over digital networks. Existing credential systems, such as [3] and [7], allow the users to demonstrate possession of the specific attributes required for a particular transaction (for example, citizenship, year of birth, status, or privilege) without revealing any other non-essential personal information. Thus, credential systems play an important role in the construction of privacy-preserving access control infrastructures. But though these may be highly desirable in some contexts, there is a risk that the very ability to conduct transactions in a truly anonymous way can facilitate misuse by legitimate credential owners. In particular, anonymous access to communications networks, system resources, critical infrastructures, or online services may end up allowing questionable or even illegal activities.

This paper proposes a novel mechanism for limiting transaction misuse without sacrificing the privacy of non-misusing credential owners. Furthermore, both parties in a transaction have some freedom of choice: the credential verifier can decide whether a particular action would constitute prohibited behaviour with respect to his own policy and choose to proceed with the request or abort; likewise, the credential owner can decide whether to reveal information that may allow her actions to be tracked and choose to proceed by submitting this information, or abort. Aborting the transaction early reveals no information that would compromise the privacy of the requester, but proceeding with the transaction in question implies acknowledgement by both parties that the credential owner may be tracked and potentially identified. This technique thereby lowers the risk of misuse that may result from wide-scale deployment of credential systems.

Section 2 of this paper provides background concepts. Section 3 discusses misuse of credentials by legitimate credential owners. Section 4 presents the proposal for limiting transaction misuse in some detail, including a discussion of the infrastructural components and protocol messages required for its implementation. Section 5 concludes the paper.

Manuscript received September 29, 2009. This work was supported in part by a grant from the Natural Sciences and Engineering Research Council of Canada (NSERC). C. M. Adams is with the School of Information Technology and Engineering (SITE), University of Ottawa, ON K1N 6N5 Canada (phone: 613-562-5800; fax: 613-562-5664; e-mail: [email protected]). M. G. Rodrigues is an independent consultant in Ottawa, ON Canada.

A. Credential Systems

As referred to in this paper, a credential is an electronic token composed of one or more pieces of information, usually encrypted, which is used to access a resource (e.g., a restricted website) or complete a transaction (e.g., an electronic purchase).
The information held in a credential may simply be an identifier, or it may consist of a number of attributes which signal the owner’s eligibility to use the credential in a particular way. A credential system typically includes an issuer (a trusted Credential Authority) who issues the credentials to the credential owners/holders, and a credential verifier. A credential owner, Alice, uses her credential by showing it to a verifier, Bob, by means of a showing protocol, which may use any of a number of proposed techniques (e.g., a proof-of-knowledge protocol) designed to prove that the credential is valid and appropriate for the transaction.
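As a toy illustration of the proof-of-knowledge step in a showing protocol (a bare Schnorr-style identification, not the specific constructions of [3] or [7], and with artificially small parameters chosen for readability), Alice convinces Bob that she knows the secret behind a public value without revealing it:

```python
import secrets

# toy Schnorr-style proof of knowledge; p = 2q + 1, g generates the order-q subgroup
p, q, g = 2039, 1019, 4

x = secrets.randbelow(q - 1) + 1   # Alice's secret, bound to her credential
y = pow(g, x, p)                   # the public value Bob can see

# showing protocol
k = secrets.randbelow(q - 1) + 1
t = pow(g, k, p)                   # 1. Alice commits
c = secrets.randbelow(q)           # 2. Bob sends a random challenge
s = (k + c * x) % q                # 3. Alice responds

# 4. Bob verifies g^s == t * y^c (mod p) without ever learning x
assert pow(g, s, p) == (t * pow(y, c, p)) % p
print("proof accepted")
```

Real credential systems prove richer statements (possession of signed attributes, selective disclosure) with related but far more elaborate protocols; this sketch only shows the challenge-response shape referred to in the text.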


The concept of credential systems grew out of research by Chaum dealing with electronic cash systems [11, 12], and group signature schemes [15], culminating in pseudonymous credentials [13, 14], which were designed to enable electronic transactions while protecting the privacy of the individual in a manner analogous to stamps, paper money and coins, bus tickets, and so on. Digital credentials, as introduced by Brands [2, 3], could be regarded as a generalization of pseudonymous credentials: as well as providing transactions where the identity of the user need not be revealed, digital credentials allow a holder to selectively disclose only those attributes relevant to the desired transaction. However, Brands' credentials, like Chaum's, are essentially single-use; multiple uses of the same credential can be linked to each other, thereby eroding privacy. Anonymous credentials, proposed by Camenisch and Lysyanskaya [4, 6, 7], sought to address some of the limitations of digital credentials (most particularly, the single-use aspect). Here the user merely demonstrates knowledge of a valid credential rather than revealing all or part of the credential. Other credential systems include work by Chen [16], Damgard [17], Lysyanskaya, et al. [19], Lysyanskaya [20], Persiano and Visconti [22], and Verheul [24]. A generalization of credential systems can be found in [8].

B. Misuse of Credentials by Legitimate Credential Owners

Those who legitimately own or hold valid credentials may misuse them in at least two ways. First, a credential owner who can use the credential to gain access to a protected resource such as an online subscription to a magazine or newspaper may lend her credential (or, more specifically, lend the private information encoded in her credential) to a friend so that the friend can illegitimately use the credential to access the resource.
This practice has been called “credential lending”, “credential sharing”, or “credential transfer” and has been discussed in a number of papers; see, for example, [3, 6, 9]. Second, the credential owner, knowing that her anonymity is guaranteed, may decide to engage in questionable or unacceptable activity involving this resource. For the purposes of this paper, we assume that for any given resource, the resource owner/controller is in a reasonable position to determine what uses of the resource are acceptable and what uses are unacceptable. For example, if Bob owns a communications network, Bob may choose to mandate that certain types of data (e.g., unclassified data, music files, text containing certain keywords, or inappropriate material) are not allowed on his network. Bob’s restriction, or set of restrictions, effectively constitutes an acceptable-use policy for his resource. Anyone that requests access to his network will be informed of his restrictions, but an anonymous user may attempt to sidestep the restrictions by sending an access request that does not conform to the published acceptable-use policy. This paper proposes a mechanism to deal with the second

type of credential misuse. Related work has been done in the areas of digital cash and limited-show credentials where, for example, double-spending a coin or showing a credential more than n times reveals the user's identity [2, 5]. However, in those solutions the acceptable-use policy is implicit rather than explicit (e.g., it is implicitly known by every entity in the system that a coin should only be spent once by a user) and the consequence of policy violation is fixed, immediate, and universal (i.e., the identity of the violator is immediately revealed to arbitrary entities). In the present proposal we seek the flexibility of an explicit, published policy that can be modified at any time by the resource owner. Furthermore, we would like it to be possible for only the resource owner (in conjunction with a single authority, such as the credential issuer) to determine the requester's identity and, in fact, for the resource owner to be able to choose whether and when to do this de-anonymization.

II. LIMITING NON-POLICY-CONFORMANT REQUESTS

A. Goals and Assumptions

We assume that there exists an acceptable-use policy for a resource that is made available (i.e., publicized) to users of the resource. In this paper, a "non-policy-conformant request" is defined to be synonymous with an access request that violates the acceptable-use policy for a given resource. The goal is to propose an automated mechanism for limiting such non-conformant requests in online systems that make use of credentials to provide user anonymity. A user, Alice, must be aware of the acceptable-use policy for a resource and must accept the fact that violating this policy will substantially increase her risk of being identified as the author of the violation. Alice may choose to proceed with the transaction anyway; the technique proposed here does not attempt to prevent this. Our aim is to limit non-conformant requests rather than to prevent them.
Simply increasing the risk of discovery will be sufficient to curb undesired activity in many cases. Note that similar means (i.e., techniques that provide disincentives leading to self-enforcement, rather than actual prevention) have been suggested by researchers in the area of credential transfer; see, for example, [18, 9, 2, 6]. Ultimately, though, the choice is left to Alice: if her anonymity is compromised, it is a consequence of her decision to disclose information that may be used to reveal her identity. Similarly, a choice should be given to Bob, the resource owner/controller. Bob may decide, for reasons that could be unknown and unknowable to the rest of the world, that he will make an exception to his policy in the case of Alice. He may decide that he is willing to let Alice violate the policy today (even if he might change his mind if she asks again tomorrow). Ideally, we would like a mechanism that respects Alice’s freedom as well as Bob’s, and allows each of them to make specific choices in particular situations. The power given to Bob (i.e., to de-anonymize the sender of a non-conformant request) is reasonable, given that he is the owner of the


resource and is the one who decides what requests are acceptable. However, we wish to prevent other arbitrary entities from appropriating this power. Furthermore, even though Bob can learn the requester's identity, the specific technical details of the underlying credential proofs-of-knowledge may be such that he is unable to convince anyone else of the identity, thus limiting Bob's power in a tangible way.

B. The Proposed Technique

We adopt an access control model where the acceptable-use policy is effectively coupled with the access control policy for that resource. Every action on the resource is implemented as a request for that action followed by a Permit/Deny response by the Policy Decision Point (PDP, run by Bob), which is enforced by a Policy Enforcement Point (PEP) that carries out the PDP's wishes. Thus, rather than a conceptual model in which Alice posts confidential data to a site outside the firewall, we envision a model in which Alice makes a request to post the data to this site, and the PDP decides whether or not this should be allowed (according to the relevant policy in place).

Note that we expand the traditional access control paradigm in one important way. Rather than allowing only the binary Permit/Deny decision by the PDP, we include a third choice: "send more information about yourself". The information is in the form of a unique identifier encoded as another attribute in the requester's credential. This identifier, when revealed to the verifier, may be used (possibly in conjunction with other authorities) to break the anonymity of the requester. Once in possession of this identifier, the PDP may choose whether or not to allow the transaction to proceed. The PDP may, in fact, choose to use the identifier to break anonymity or effectively discard this information. The user, Alice, does not know what the PDP will do with her request or her identifier, and therefore must decide whether to risk revealing this information or not.
Thus, there is free choice for both parties, with full knowledge of the possible consequences of each choice. Such a mechanism respects the external factors that may influence a decision in a specific circumstance, while unobtrusively steering most users in most situations to engage in acceptable (i.e., policy-conformant) activities.

C. Protocol Description

Consider a credential system as proposed in [3] or [7], for example. Requesters engage in the showing protocol of the credential system to anonymously prove possession of specific attributes in order to access protected resources in an online distributed environment. The protection of resources is controlled by a policy which may be written in any specified policy language. One such candidate language that has seen use in several environments is the eXtensible Access Control Markup Language (XACML) [21]. Furthermore, consider a user Alice, a resource R, and an owner of this resource Bob (who has implemented a PDP and

PEP to protect access to R). Bob’s access control policy specifies that requesters must possess a specific attribute (such as role, citizenship, or age above a particular value) in order to use this resource. The policy also specifies acceptable uses of this resource (e.g., no unclassified data may be transmitted because it is a classified network). Alice uses a credential to prove to Bob’s PDP that she possesses the required attribute, but otherwise Alice is completely anonymous to the PDP. Now, Alice may be inclined to send unclassified data, in violation of the policy. If she tries to send this data, the PDP may abort the transaction, or may ask for her unique identifier (which can be used to de-anonymize her). She may then choose to abort the transaction, or reveal this identifier. If she reveals it, Bob may or may not transmit her unclassified data. That is, the transaction may be aborted by either party (Alice may abort before sending her identifier, and Bob may abort early (after receiving her request) or late (after receiving her identifier)), or the transaction may proceed to completion. If it is aborted early (by Alice or Bob) then Alice’s anonymity is not compromised. Otherwise (if it is aborted late by Bob, or if it proceeds to completion), Alice could potentially be the subject of an audit, or may face later disciplinary action, because of her choice to send a request that violates the acceptable-use policy. The protocol has the following form:

1. Alice → PEP: Alice submits a request to transmit unclassified data to Bob's PEP, and engages in a proof-of-knowledge protocol with the PEP to prove, for example, that she has role "X".

2. PEP → PDP; PEP → Alice: The PEP forwards the request and the proof-of-knowledge to the PDP. The PDP examines the request with respect to the access control and acceptable-use policy, returning a decision of abort ("Deny") or "send more information about yourself". The PEP returns this decision to Alice.

3. Alice → PEP: Alice decides whether to send her unique identifier to the PEP, or abort. If she decides to send her unique identifier, she engages in a proof-of-knowledge protocol with the PEP to prove that her identifier has value "Y".

4. PEP → PDP; PEP → Network: The PEP forwards this new proof-of-knowledge to the PDP, which may choose to store this information for future use. The PDP may then decide to abort ("Deny"), or allow the transmission by returning "Permit" to the PEP, which sends the specified unclassified data (included as part of Alice's original request) to the network.
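The PDP side of this exchange is essentially a three-valued decision function. The following sketch is illustrative only (the predicate, the audit log, and the final Deny are choices made for the example, not mandated by the paper):

```python
from enum import Enum

class Decision(Enum):
    PERMIT = "Permit"
    DENY = "Deny"
    MORE_INFO = "send more information about yourself"

audit_log = []  # identifiers the PDP chose to store for future use

def pdp_decide(request, conforms, identifier=None):
    """conforms: a predicate implementing Bob's acceptable-use policy."""
    if conforms(request):
        return Decision.PERMIT
    if identifier is None:
        # the third outcome beyond Permit/Deny: ask Alice to de-anonymize
        return Decision.MORE_INFO
    audit_log.append((request, identifier))  # may feed a later audit
    return Decision.DENY  # Bob is equally free to return PERMIT here

# hypothetical policy: no unclassified data on the classified network
no_unclassified = lambda req: req.get("label") != "unclassified"

req = {"action": "transmit", "label": "unclassified"}
print(pdp_decide(req, no_unclassified))                  # asks for more info
print(pdp_decide(req, no_unclassified, identifier="Y"))  # denies and logs "Y"
```

The point of the sketch is the asymmetry: a conformant request is permitted anonymously, while a non-conformant one is only evaluated further once the requester has chosen to reveal her identifier.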

The protocol messages above (i.e., the interaction between Alice and the PEP, and the interaction between the PEP and the PDP) may be implemented in a number of different ways,


but a relatively straightforward way to achieve this is to use the request and response message syntaxes specified in XACML (particularly if XACML is used by Bob as his policy language). Alice begins by sending an XACML <Request> element to the PDP, where the <Resource> field of the <Request> contains a <ResourceContent> element that holds the unclassified data she wishes to send. The proof of Alice's attribute (e.g., that her role is equal to "X") is handled by the credential system showing protocol. The <Request> and proof are forwarded by the PEP to the PDP. Upon rendering a decision, the PDP returns an XACML <Response> element containing a <Result> element with a <Decision> of "Deny" (aborting the transaction), or a <Decision> of "Indeterminate" and a <Status> containing a <StatusCode> of "missing-attribute" and a <StatusMessage> indicating that additional information about Alice (specifically, her unique identifier) is required. If an AttributeId has been standardized for this unique identifier, then a <StatusDetail> containing an <Attribute> element with this AttributeId can be included in the <Status> element to indicate to Alice precisely the information she must supply. The <Response> element is forwarded to Alice by the PEP, at which point Alice may choose to abort or to engage in another showing protocol execution with the PEP to prove that her identifier has value "Y". This proof is forwarded by the PEP to the PDP. Upon rendering a decision, the PDP may return an XACML <Response> element containing a <Result> element with a decision of "Permit" or "Deny". If the decision is "Permit", the PEP can transmit the <ResourceContent> of the original <Request> on the network.

Note that the credential system showing protocol execution between Alice and the PEP (both for proving that her role is "X" and that her unique identifier is "Y") can be accomplished using XACML <Request> and <Response> elements. Assuming that a new <StatusCode> value (for example, "continue") is standardized, the PEP can return a <Response> element containing a <Result> element with a decision of "Indeterminate", a <StatusCode> of "continue", and a <StatusDetail> that contains the showing protocol response message. Alice can then return another <Request> element containing her next showing protocol message values in the <Attribute> elements. This back-and-forth interaction can continue as long as necessary until the credential showing protocol is complete. Thus, no new protocol syntax needs to be defined to carry out the credential proofs of knowledge; the above discussion demonstrates that existing XACML messages are sufficient for this purpose.

III. SECURITY AND PRIVACY OF THE PROPOSED TECHNIQUE

There are a number of considerations relating to the security and privacy of the technique described above.
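To make the message shapes concrete, here is a minimal sketch using Python's xml.etree that builds a skeletal <Request> and the "Indeterminate" <Response>. The namespace URN and the missing-attribute status code are taken from XACML 2.0; any element detail beyond those named in the text is illustrative:

```python
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xacml:2.0:context:schema:os"  # XACML 2.0 context namespace

# Alice's <Request>: a <Resource> whose <ResourceContent> carries the payload
req = ET.Element(f"{{{NS}}}Request")
res = ET.SubElement(req, f"{{{NS}}}Resource")
ET.SubElement(res, f"{{{NS}}}ResourceContent").text = "(data Alice wishes to send)"

# The PDP's first answer: Indeterminate plus a missing-attribute status
resp = ET.Element(f"{{{NS}}}Response")
result = ET.SubElement(resp, f"{{{NS}}}Result")
ET.SubElement(result, f"{{{NS}}}Decision").text = "Indeterminate"
status = ET.SubElement(result, f"{{{NS}}}Status")
ET.SubElement(status, f"{{{NS}}}StatusCode",
              Value="urn:oasis:names:tc:xacml:1.0:status:missing-attribute")

print(ET.tostring(resp).decode())
```

Serializing `resp` yields the Indeterminate/missing-attribute response that signals "send more information about yourself" to Alice.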

A. Unique Identifier

The unique identifier that gets encoded into Alice's credential may or may not be hidden ("blinded") from the credential issuer at the time of credential issuance. However, this identifier must be known to some authority (and linkable by that authority to a true identifier for Alice, such as her name); this differs from credential systems in which no unique identifiers are known to any authorities, but is needed here in order to allow the potential for de-anonymization in the case of credential misuse. Note that this unique identifier, although encoded into the credential, is not revealed in any showing protocol execution unless required by the PDP (i.e., in a non-policy-conformant request situation as outlined in Sections 4.1 – 4.3 above).

It is important to note that different credentials for the same user (signed by the same or by different issuers) may contain different unique identifiers, so that a non-conformant request in one situation may or may not be immediately tied to a non-conformant request in another situation. This is ultimately a deployment decision, but the proposal allows either option (i.e., the same identifier or different identifiers) to be implemented. Using different identifiers eliminates the coordination that would be needed among different credential issuers, but may allow users who have violated the policy in one domain to continue operating in other domains.

B. Unchanged Showing Protocol and XACML Messages

Neither the credential showing protocol nor the XACML request and response messages are syntactically changed in any way from their original specifications. Rather, in the case of the XACML messages, we have simply defined how some of the fields may be used to hold the required information and have suggested the standardization of specific values for existing elements such as <StatusCode>.
Thus, this proposal has no effect on the security or privacy analyses that have already been accomplished for credential systems and XACML exchanges. The loss of privacy for non-policy-conformant users is local (in that it applies to those users only), limited to the resource in question, and at the discretion of the PDP owner.

C. Protection of XACML Messages

XACML messages are not self-protected; thus, it is clear that they must have additional protection in order to acquire the properties of confidentiality and integrity. For example, an OASIS Standard specification is publicly available [1] that describes how XACML messages may be carried in Security Assertion Markup Language (SAML) assertions for security protection.

D. Underlying Transport

As is the case with all credential systems, the credential protocols (i.e., the issuing protocol and the showing protocol) must be carried out over anonymous channels, otherwise the anonymity guaranteed by the credentials themselves will be

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 C. Adams and M. Rodrigues: Dealing with Non-Policy-Conformant Requests in Credential Systems


nullified at the lower layers (e.g., by revealing the IP address in transmitted network packets). The same holds true here: the XACML messages must be transferred between Alice and the PEP over an anonymous channel, such as a mix network [10] or the Tor network [23].

IV. CONCLUSIONS

This paper proposes a mechanism to limit one form of misuse by legitimate owners in a credential system: engaging in transactions that violate the acceptable-use policy of a resource. Under this proposal, credential owners are discouraged from requesting such transactions because the resource owner may be able to de-anonymize them and/or track their activities. The scheme is more flexible than previously proposed misuse-limiting mechanisms and can be implemented with no syntactic modifications to the XACML protocol specification.
One assumption on which this proposal rests is that a resource owner is able to recognize when a request will violate his acceptable-use policy. This can be accomplished through integrity-protected labelling of data (e.g., a given file is "Top Secret" or "Unclassified") or through intelligent content-scanning algorithms. The XACML specification enables both possibilities, but finding efficient and effective techniques for recognizing data is an area in which further research would be beneficial.

REFERENCES
[1] Anderson, A., and Lockhart, H., SAML 2.0 Profile of XACML v2.0, OASIS Standard, Feb. 2005. Available from http://docs.oasis-open.org/xacml/2.0/access_control-xacml-2.0-saml-profile-spec-os.pdf (last accessed March 20, 2009).
[2] Brands, S., Rethinking Public Key Infrastructures and Digital Certificates: Building in Privacy, MIT Press, 2000.
[3] Brands, S., A Technical Overview of Digital Credentials. Available from http://www.cypherspace.org/credlib/brands-technical.pdf (last accessed March 20, 2009).
[4] Camenisch, J., An Efficient Anonymous Credentials System, invited talk given at the 5th International Workshop on Privacy Enhancing Technologies (PET 2005), Cavtat, Croatia, May 30, 2005. Available from http://www.petworkshop.org/2005/workshop/talks/Jan.pdf (last accessed March 20, 2009).
[5] Camenisch, J., Hohenberger, S., Kohlweiss, M., Lysyanskaya, A., and Meyerovich, M., How to Win the Clone Wars: Efficient Periodic n-Times Anonymous Authentication, in Proceedings of the ACM Conference on Computer and Communications Security, Oct. 30 – Nov. 3, 2006, pp. 201-210.
[6] Camenisch, J., and Lysyanskaya, A., An Efficient System for Non-transferable Anonymous Credentials with Optional Anonymity Revocation, Cryptology ePrint Archive: Report 2001/019 (extended version of Eurocrypt 2001 paper), 2001.
[7] Camenisch, J., and Lysyanskaya, A., Signature Schemes and Anonymous Credentials from Bilinear Maps, in Advances in Cryptology: Proceedings of Crypto 2004, Springer LNCS 3152, 2004, pp. 56-72.
[8] Camenisch, J., Sommer, D., and Zimmermann, R., A General Certification Framework with Applications to Privacy-Enhancing Certificate Infrastructures, 21st IFIP TC-11 International Information Security Conference (SEC 2006), Karlstad University, Sweden, May 22-24, 2006.
[9] Canetti, R., Charikar, M., Rajagopalan, S., Ravikumar, S., Sahai, A., and Tomkins, A., Non-Transferable Anonymous Credentials, Patent number 7222362. Available from http://www.freepatentsonline.com/7222362.html (last accessed March 20, 2009).
[10] Chaum, D., Untraceable electronic mail, return addresses, and digital pseudonyms, Communications of the ACM, 24(2), pp. 84-88, February 1981.
[11] Chaum, D., Blind Signatures for Untraceable Payments, Advances in Cryptology: Proceedings of Crypto '82, Plenum Press, pp. 199-203, 1983.
[12] Chaum, D., Blind Signature Systems, Advances in Cryptology: Proceedings of Crypto '83, Plenum Press, p. 153, 1984.
[13] Chaum, D., Security without identification: Transaction systems to make big brother obsolete, Communications of the ACM, 28(10), pp. 1030-1044, October 1985.
[14] Chaum, D., and Evertse, J.-H., A secure and privacy-protecting protocol for transmitting personal information between organizations, in Advances in Cryptology: Proceedings of Crypto '86, Springer-Verlag LNCS 263, pp. 118-167, 1987.
[15] Chaum, D., and van Heyst, E., Group signatures, Advances in Cryptology: Proceedings of Eurocrypt '91, Springer-Verlag LNCS 547, pp. 257-265, 1991.
[16] Chen, L., Access with pseudonyms, Cryptography: Policy and Algorithms, Springer-Verlag LNCS 1029, pp. 232-243, 1995.
[17] Damgard, I., Payment systems and credential mechanisms with provable security against abuse by individuals, Advances in Cryptology: Proceedings of Crypto '88, Springer-Verlag LNCS 403, pp. 328-335, 1990.
[18] Dwork, C., Lotspiech, J., and Naor, M., Digital signets: self-enforcing protection of digital information, in Proceedings of the 28th ACM Symposium on Theory of Computing, 1996, pp. 489-498.
[19] Lysyanskaya, A., Rivest, R., Sahai, A., and Wolf, S., Pseudonym systems, Proceedings of Selected Areas in Cryptography 1999, Springer-Verlag LNCS 1758, pp. 184-199, 2000.
[20] Lysyanskaya, A., Signature Schemes and Applications to Cryptographic Protocol Design, PhD thesis, Massachusetts Institute of Technology, Cambridge, Massachusetts, Sept. 2002.
[21] Moses, T. (Ed.), eXtensible Access Control Markup Language (XACML), Version 2.0, OASIS Standard, Feb. 2005. Available from http://docs.oasis-open.org/xacml/2.0/access_control-xacml-2.0-core-spec-os.pdf (last accessed March 20, 2009).
[22] Persiano, P., and Visconti, I., An Anonymous Credential System and a Privacy-Aware PKI, Proceedings of the 8th Australasian Conference on Information Security and Privacy (ACISP) 2003, Springer-Verlag LNCS 2727, pp. 27-38, 2003.
[23] The Tor Project, Tor: Anonymity Online website: http://www.torproject.org/ (last accessed March 20, 2009).
[24] Verheul, E., Self-blindable credential certificates from the Weil pairing, Proceedings of Asiacrypt 2001, Springer LNCS 2248, pp. 533-551, 2001.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 M. Ohba, K. Matsuoka, and T. Ohta: Eliciting State Transition Diagrams from Programs described in a Rule-based Language


Eliciting State Transition Diagrams from Programs described in a Rule-based Language

Minami Ohba, Koichi Matsuoka, and Tadashi Ohta (Member, IEEE)

Abstract—Since it is easier to describe programs in a rule-based language than in a procedural language, systems described in a rule-based language have been proposed. However, it is difficult to understand the total behavior of such programs. To solve this problem, and to make it easier for service developers to understand the total behavior of programs, it is desirable that program specifications can be elicited automatically from programs described in a rule-based language. As is well known, network service programs can be described based on a state transition model. This paper proposes methods for automatically eliciting state transition diagrams, based on an enhanced state transition model, from programs described in a rule-based language. To evaluate the proposed methods, an experimental system was implemented.
Index Terms—rule-based language, state transition diagram, program specification, automatic elicitation (2.44 Software, 2.1 Programming Languages)

I. INTRODUCTION

It is well known that programs described in a rule-based language require an execution time as much as a hundred times longer than those described in a procedural language. So, outside the field of AI, there were no proposals to describe network service programs in a rule-based language, except for those proposed by the authors [1]. But since a rule-based language is a declarative language, the execution order of a program described in it has nothing to do with the order in which the program is written; a programmer can therefore describe a program from an arbitrary point, regardless of execution order. Moreover, in the case of a rule-based language, even a partial program can be executed. Recently, owing to these advantages, examples of network service programs described in a rule-based language have been presented at conferences [2]-[5].

Manuscript received November 5, 2009.
M. Ohba is a student in the Master's course in the Information Science Engineering Department, Soka University, 1-236, Tangi-cho, Hachioji-shi, Tokyo, Japan (e-mail: [email protected]).
K. Matsuoka is a student in the Doctoral course in the Information Science Engineering Department, Soka University, 1-236, Tangi-cho, Hachioji-shi, Tokyo, Japan (e-mail: [email protected]).
T. Ohta is with the Information Science Engineering Department, Soka University, 1-236, Tangi-cho, Hachioji-shi, Tokyo, Japan (corresponding author; phone: +81-42-691-8481; fax: +81-42-691-9312; e-mail: [email protected]).

However, a rule-based language has some problems. First, it is difficult to discover deficiencies or excesses in the descriptions in programs, which can cause description errors in rules. Second, if the condition for applying a rule is satisfied, the rule is in some cases applied where service developers do not expect it to be applied. This makes it difficult to understand the total behavior of programs; as a result, programs might not be exactly as developers intend them to be. On the other hand, it is well known that network service program specifications can be described using state transition diagrams based on an enhanced state transition model. Therefore, if state transition diagrams can be elicited automatically from programs described in a rule-based language, this can help developers to understand the total behavior of programs and to find description errors in them. So, this paper proposes methods for automatically eliciting state transition diagrams, based on an enhanced state transition model, from programs described in a rule-based language.
In section 2, ESTR, developed by the authors as an example of a rule-based language, and state transition models are explained. In section 3, problems in eliciting state transition diagrams from programs described in the rule-based language are described. In section 4, solutions to the problems are proposed. In section 5, related research is discussed. In section 6, an experimental automatic elicitation system based on the proposed methods and the result of applying it to VoIP are described, together with a problem in applying the proposed system to supplementary services and a basic idea for solving it. It was confirmed that the proposed methods are reasonable.

II. ESTR AND STATE TRANSITION MODEL
A. ESTR
STR (State Transition Rule) is a rule-based language for describing telecommunication services in the style of transition conditions, proposed by Dr. Hirakawa [6].
STR is general enough to describe telecommunication services and is used in [7]. The authors added to STR a part that describes the processing accompanying a state transition (the action-describing part) in order to use STR as a program description language; the resulting language is called ESTR (Enhanced STR). ESTR therefore consists of the STR part and the action-description part. A brief explanation sufficient to understand this paper is given below; for more details, please refer to [8].


In STR, a system state is represented as a set of primitives, each of which indicates the state of a terminal or a relationship between terminals. The arguments of a primitive are actual terminal names. For example, the system state in which terminal A is idle (denoted idle(A)), terminal B is receiving a dial tone (denoted dialtone(B)), and the users of terminals C and D are talking with each other (denoted talk(C,D)) is described as {idle(A), dialtone(B), talk(C,D)} (see Figure 1).
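The set-of-primitives representation above can be sketched directly; the tuple encoding below is our own illustrative choice, not part of STR itself:

```python
# Minimal sketch (our illustration, not the authors' implementation) of an
# STR system state: a set of primitives, each encoded as a tuple whose first
# element is the primitive name and whose remaining elements are terminal names.
def primitive(name, *terminals):
    return (name,) + terminals

# The running example {idle(A), dialtone(B), talk(C,D)}:
state = {
    primitive("idle", "A"),
    primitive("dialtone", "B"),
    primitive("talk", "C", "D"),
}
```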

[Fig. 1. System state: terminal A idle, terminal B receiving a dial tone, terminals C and D talking.]

STR represents each state transition condition as a rule, of the following form:
pre-condition event: post-condition
The pre-condition and post-condition are each described as a set of primitives. The pre-condition represents the condition under which the rule is applied; the event is the trigger of the state transition; the post-condition represents the system state after applying the rule. In a system state, the arguments of primitives are actual terminal names, but in STR the arguments of primitives and events are variables, called terminal variables, so that a rule can be applied to any terminals. Suppose terminal x is receiving a dial tone and terminal y is idle. At this moment, if the user of terminal x dials terminal y (denoted dial(x,y)), terminal x calls terminal y (denoted calling(x,y)). This rule is described as follows:
dialtone(x),idle(y) dial(x,y): calling(x,y)
When an event occurs in the system, a rule which has the same event and whose pre-condition exists in the system state is applied. When a rule is applied, the part of the system state that equals the pre-condition of the applied rule is changed into the post-condition: the pre-condition of the applied rule is deleted from the system state and its post-condition is added. To make it easier for service developers to describe rules, the pre-condition and post-condition represent only the part of the system state related to the service, instead of the full state. So the pre-condition and post-condition do not represent the actual state before or after the transition, respectively. Therefore, it is not possible to elicit a state transition diagram as a service specification directly from the set of rules.
B. State transition model
There are two models for describing a state transition diagram: the basic state transition model (hereafter, basic model) and the enhanced state transition model (hereafter, enhanced model). A state transition diagram as a service specification, as generally used, is based on the enhanced model (Figure 2).

[Fig. 2. State transition based on the enhanced model: from dialtone(A), event dial(A,B) branches on idle(B) to calling(A,B), and on not[idle(B)] to busy(A).]

In the enhanced model, the next state is decided based on the current system state, an event (input signal), and a branch condition described in an analysis part, denoted by a triangle in Figure 2. Suppose, in a telecommunication service, a terminal is receiving a dial tone. The state transitions for the moment when a telephone number is input are shown, based on the enhanced model, in Figure 2. Here the branch condition not[idle(B)] represents any terminal state other than idle(B): dialtone(B), busy(B), calling(B,C), calling(C,B), or talk(B,C), where busy(B) stands for terminal B receiving a busy tone. In the basic model, the branch conditions are described explicitly in each system state. Figure 3 shows the same state transitions as Figure 2, expressed in the basic model.

[Fig. 3. State transition based on the basic model: dial(A,B) from {dialtone(A), idle(B)} leads to calling(A,B); from {dialtone(A), dialtone(B)}, {dialtone(A), busy(B)}, {dialtone(A), calling(B,C)}, or {dialtone(A), talk(B,C)} it leads to busy(A), with the state of B unchanged.]

The processing flow for eliciting a state transition diagram based on the enhanced model is as follows:
Step 1: Elicit state transitions for real system states, based on the basic model, from a set of rules.
Step 2: Convert the state transitions obtained at Step 1 into state transitions as a service specification, based on the enhanced model.
Step 3: Convert the state transitions obtained at Step 2 into a state transition diagram.
In this paper, problems in Step 1 and Step 2 are discussed. The problem in Step 3 is how to draw the diagram, and excellent methods have already been adopted in CAD systems, so it is outside the scope of this paper.
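The rule-application semantics of Section II-A (delete the matched pre-condition from the system state, add the post-condition) can be sketched as follows. This is our own minimal encoding, not the authors' implementation, and for simplicity it assumes that every terminal variable of the rule appears in its event:

```python
# Sketch of STR rule application (illustrative encoding, not the authors'
# code): a primitive is a tuple (name, terminal, ...).  A rule fires when
# the event matches and the instantiated pre-condition is a subset of the
# system state; the matched primitives are then replaced by the
# instantiated post-condition.
def instantiate(primitives, binding):
    return {(p[0],) + tuple(binding[v] for v in p[1:]) for p in primitives}

def apply_rule(state, pre, rule_event, post, event):
    # rule_event e.g. ("dial", "x", "y"); event e.g. ("dial", "A", "B")
    if event[0] != rule_event[0] or len(event) != len(rule_event):
        return None
    binding = dict(zip(rule_event[1:], event[1:]))   # x -> A, y -> B
    pre_i = instantiate(pre, binding)
    if pre_i <= state:                # pre-condition present in the state
        return (state - pre_i) | instantiate(post, binding)
    return None                       # rule not applicable

# Rule from Section II-A: dialtone(x), idle(y)  dial(x,y) : calling(x,y)
pre, post = {("dialtone", "x"), ("idle", "y")}, {("calling", "x", "y")}
state = {("dialtone", "A"), ("idle", "B"), ("talk", "C", "D")}
next_state = apply_rule(state, pre, ("dial", "x", "y"), post, ("dial", "A", "B"))
```

Note how talk(C,D) survives untouched: only the part of the state matched by the pre-condition is rewritten, which is exactly why a pre-condition is not a full system state.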


III. PROBLEMS

First, some definitions necessary to explain the problems are provided.
[Definition 1] Suppose two states si(A,B,C,D) and sj(P,Q,R,S). If the state si(P,Q,R,S), obtained by replacing the terminal variables A, B, C, and D of si with P, Q, R, and S, is the same state as sj(P,Q,R,S), then si(A,B,C,D) is called an equivalent state to sj(P,Q,R,S). An equivalent state is the same state except for terminal names. For example, {idle(A),calling(B,C)} is an equivalent state to {idle(D),calling(E,F)} (Figure 4).

[Fig. 4. Example of an equivalent state: {idle(A), calling(B,C)} and {idle(D), calling(E,F)}.]

That si is equivalent to sj is denoted by si ≡ sj.
[Definition 2] Suppose two events ei(A,B,C,D) and ej(P,Q,R,S). If the event ei(P,Q,R,S), obtained by replacing the terminal variables A, B, C, and D of ei with P, Q, R, and S, is the same event as ej(P,Q,R,S), then ei(A,B,C,D) is an equivalent event to ej(P,Q,R,S), denoted by ei ≡ ej. An equivalent event is the same event except for terminal names. For example, dial(A,B) ≡ dial(C,D).
[Definition 3] Suppose two state transitions ti(A,B,C,D) and tj(P,Q,R,S). If the state transition ti(P,Q,R,S), obtained by replacing the terminal variables A, B, C, and D of ti with P, Q, R, and S, is the same state transition as tj(P,Q,R,S), then ti(A,B,C,D) is an equivalent state transition to tj(P,Q,R,S). A state transition is written as [current state, event, next state]. So, when ti(A,B,C) is [{idle(A),calling(B,C)}, offhook(A), {dialtone(A),calling(B,C)}] and tj(P,Q,R) is [{idle(P),calling(Q,R)}, offhook(P), {dialtone(P),calling(Q,R)}], ti(A,B,C) ≡ tj(P,Q,R) (Figure 5).

[Fig. 5. Example of an equivalent state transition.]

[Definition 4] Suppose two combinations of current state and event, {si(A,B,C,D) ei(A,B,C,D)} and {sj(P,Q,R,S) ej(P,Q,R,S)}. If the combination {si(P,Q,R,S) ei(P,Q,R,S)}, obtained by replacing the terminal variables A, B, C, and D of si and ei with P, Q, R, and S, is the same combination as {sj(P,Q,R,S) ej(P,Q,R,S)}, then {si(A,B,C,D) ei(A,B,C,D)} is an equivalent combination of current state and event to {sj(P,Q,R,S) ej(P,Q,R,S)}.
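Under Definitions 1–4, equivalence amounts to the existence of a one-to-one renaming of terminal names that maps one object onto the other. A brute-force check over all renamings (our own illustration; exponential in the number of terminals, so suitable only for the small states used in these examples) can be sketched as:

```python
# Sketch (our illustration): two states, encoded as sets of (name, terminal,
# ...) tuples, are equivalent iff some bijective renaming of terminal names
# maps one onto the other.
from itertools import permutations

def terminals(state):
    return sorted({t for p in state for t in p[1:]})

def equivalent(si, sj):
    ti, tj = terminals(si), terminals(sj)
    if len(ti) != len(tj):
        return False
    for perm in permutations(tj):
        ren = dict(zip(ti, perm))   # candidate renaming ti -> tj
        if {(p[0],) + tuple(ren[t] for t in p[1:]) for p in si} == sj:
            return True
    return False

# The Definition 1 example: {idle(A),calling(B,C)} and {idle(D),calling(E,F)}
s1 = {("idle", "A"), ("calling", "B", "C")}
s2 = {("idle", "D"), ("calling", "E", "F")}
```

The same check extends to events and transitions by applying the renaming to their terminal arguments as well.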

A. Deletion of equivalent state transitions
The procedure to elicit real state transitions from a set of rules is as follows. First, by creating possible events in turn and applying rules to the initial state, state transitions from the initial state are elicited. The next states of the elicited transitions are saved in a set S. A new state is then taken out of S and possible events are created in turn; in the same way as above, state transitions are elicited by applying rules to that state. If the next state of an elicited transition has not been elicited so far, it is stored in S. These processes are repeated until S is empty.
In this procedure, the same or equivalent state transitions may occur. Since equivalent state transitions are the same except for terminal names, only one of them has to be described in a service specification; the other equivalent state transitions, as well as identical state transitions, should be eliminated.
B. Extraction of analysis part
The state transition diagram elicited in A is based on the basic model. So, to elicit the state transition diagram based on the enhanced model, branch conditions should be removed from the states of the basic-model diagram and moved to the analysis parts of the enhanced-model diagram (the conversion from Figure 3 to Figure 2).

IV. SOLUTIONS

Solutions for the problems described in the previous section are proposed.
A. Deletion of equivalent state transitions
(1) Deletion
In the procedure for eliciting state transitions described in Section III-A, there are two cases where equivalent state transitions occur.
Case 1) In the first case, equivalent state transitions occur when the same rule is applied to equivalent states. It is proven that all state transitions obtained by applying the same rule to equivalent states are equivalent state transitions.
[Lemma 1] If si(A,B,C,D) is an equivalent state to sj(P,Q,R,S), then sin(A,B,C,D) and sjn(P,Q,R,S), obtained by applying the same rule to si(A,B,C,D) and sj(P,Q,R,S), respectively, are equivalent.
(Proof) According to the ESTR specification described in Section II-A, the next state is decided by the current state and the pre-condition and post-condition of the applied rule. Since the current states, pre-conditions, and post-conditions of the applied rules are respectively equivalent to each other, the next states are also equivalent to each other. Please refer to [9],[10] for a more detailed explanation.
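The elicitation loop of Section III-A (create events, apply rules, save unseen next states in S, repeat until S is empty) can be sketched as follows. This is our own illustration: the successor function and the toy transition relation are placeholders standing in for the full rule matcher, and the equivalence-based pruning discussed in this section is omitted for brevity, so only identical states are deduplicated.

```python
# Sketch of the elicitation loop (our encoding, with rule matching reduced
# to a callable): starting from the initial state, fire every applicable
# (event, next-state) pair, exploring each genuinely new state until the
# work set S is empty.  Equivalence pruning (Theorems 1 and 2) is omitted.
from collections import deque

def elicit(initial, successors):
    # successors(state) yields (event, next_state) pairs for applicable rules
    S = deque([initial])
    seen = {initial}
    transitions = []
    while S:                            # "repeated until S is empty"
        s = S.popleft()
        for event, nxt in successors(s):
            transitions.append((s, event, nxt))
            if nxt not in seen:         # only unseen states are explored
                seen.add(nxt)
                S.append(nxt)
    return transitions

# Toy successor relation for one terminal: idle <-> dialtone via offhook/onhook
def succ(state):
    if state == frozenset({("idle", "A")}):
        yield ("offhook", "A"), frozenset({("dialtone", "A")})
    elif state == frozenset({("dialtone", "A")}):
        yield ("onhook", "A"), frozenset({("idle", "A")})

trans = elicit(frozenset({("idle", "A")}), succ)
```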
[Lemma 1] If si(A,B,C,D) is an equivalent state to sj(P,Q,R,S), sin(A,B,C,D) and sjn(P,Q,R,S), which are obtained by applying the same rule to si(A,B,C,D) and sj(P,Q,R,S), respectively, are equivalent. (Proof) According to ESTR specification described A in section II, the next state is decided by the current state, pre-condition, and post-condition of the applied rule. Since current states, pre-conditions and post-conditions of the applied rules are equivalent to each other, respectively, the next states are also equivalent to each other. Please refer to paper [9],[10] for more detailed explanation.


[Theorem 1] If si(A,B,C,D) is equivalent to sj(P,Q,R,S), then ti(A,B,C,D)={si(A,B,C,D) ei(A,B,C,D) sin(A,B,C,D)} and tj(P,Q,R,S)={sj(P,Q,R,S) ej(P,Q,R,S) sjn(P,Q,R,S)}, obtained by applying the same rule to si and sj, respectively, are equivalent.
(Proof) By replacing the terminals P, Q, R, and S of tj(P,Q,R,S) with A, B, C, and D, tj is rewritten as follows:
tj(A,B,C,D) = {sj(A,B,C,D) ej(A,B,C,D) sjn(A,B,C,D)}  ...(1)
Since si(A,B,C,D) is equivalent to sj(P,Q,R,S), by Definition 1:
si(A,B,C,D) = sj(A,B,C,D)  ...(2)
By Lemma 1, sin(A,B,C,D) is an equivalent state to sjn(P,Q,R,S). Therefore, by Definition 1:
sin(A,B,C,D) = sjn(A,B,C,D)  ...(3)
Since ti and tj are obtained by applying the same rule, ei and ej have the same event name. Thus:
ei(A,B,C,D) = ej(A,B,C,D)  ...(4)
Consequently, by equations (1)-(4), ti(A,B,C,D) is the same state transition as tj(A,B,C,D). So, by Definition 3, ti(A,B,C,D) is an equivalent state transition to tj(P,Q,R,S).
Thus, when a state transition of a real state is elicited, if the next state of the transition is equivalent to an existing state, the next state is not stored in S (in the elicitation procedure described in Section III-A), so that equivalent state transitions of Case 1 are not generated.
Case 2)

Suppose the same rule, except for its arguments, is applied to the same current state. In this case, if the two combinations of current state and event are equivalent, the state transitions are equivalent. For example, when the current state is {idle(A),idle(B),idle(C)}, the same rule (except for arguments) is applied for the events offhook(A) and offhook(B), and the two state transitions [{idle(A),idle(B),idle(C)}, offhook(A), {dialtone(A),idle(B),idle(C)}] and [{idle(A),idle(B),idle(C)}, offhook(B), {idle(A),dialtone(B),idle(C)}] are equivalent. In general, suppose a current state, an event, and the next state are Sc(x,y,z), e(x,y,z), and Sn(x,y,z), respectively, and suppose a state transition ti has been obtained when e(P,Q,R) occurs in the state Sc(A,B,C). If, for another event e(U,V,W), a state transition tj is elicited by applying the same rule, the following theorem holds.
[Theorem 2] If {Sc(A,B,C) e(P,Q,R)} and {Sc(A,B,C) e(U,V,W)} are equivalent, ti and tj are also equivalent. Here P, Q, and R, and U, V, and W are each either A, B, or C; however, (P=U)∧(Q=V)∧(R=W) does not hold.
(Proof) If {Sc(A,B,C) e(P,Q,R)} and {Sc(A,B,C) e(U,V,W)} are equivalent, by Definition 4 the following equation holds:
{Sc(A,B,C) e(P,Q,R)} = {Sc(X,Y,Z) e(P,Q,R)}  ...(5)

Here, (X,Y,Z) is obtained from (A,B,C) by replacing terminals in the same way as (U,V,W) is obtained from (P,Q,R). Since Sc(A,B,C) and Sc(X,Y,Z) are states and e(P,Q,R) is an event, the following formulas hold:
Sc(A,B,C) ∩ e(P,Q,R) = Φ (the empty set)
Sc(X,Y,Z) ∩ e(P,Q,R) = Φ
So, by formula (5), the following formula holds:
Sc(A,B,C) = Sc(X,Y,Z)  ...(6)
The elicited state transitions ti and tj are then as follows:
ti(P,Q,R) = {Sc(A,B,C) e(P,Q,R) Sn(P,Q,R)}
tj(U,V,W) = {Sc(A,B,C) e(U,V,W) Sn'(U,V,W)}
By formula (6), the following holds:
tj(U,V,W) = {Sc(X,Y,Z) e(U,V,W) Sn'(U,V,W)}
Since Sc(X,Y,Z) and Sc(A,B,C) are equivalent, by Definition 1, ti and tj are equivalent.
Thus, in Case 2, the following procedure is carried out so as not to elicit equivalent state transitions: when a rule is applied to a state, if the same rule has already been applied to the same state, and the union of the current state and the earlier event is equivalent to the union of the current state and the present event, the state transition is not generated this time.
(2) Elicitation of a state transition diagram for real states
A method for eliciting a state transition diagram for real states without equivalent state transitions is proposed. First, elicit state transitions by applying rules to the initial state. If the same rule has already been applied to the same state, check whether an equivalent state transition would be generated. If the next state of an elicited state transition is the same as, or equivalent to, one of the existing states, it is discarded; otherwise, elicit state transitions by applying rules to that next state in the same manner. Continue until no state transitions can be elicited other than ones that are the same as, or equivalent to, existing state transitions. Detailed explanations of how to set the initial state and how to elicit states and state transitions follow.
a) Initial state
The initial state is defined as the state in which all terminals related to the service are idle. For example, when three terminals are needed to execute the service specification (basic telephone service needs three terminals), the initial state is described as {idle(A),idle(B),idle(C)}.
b) Elicitation of states and state transitions
Step 1: Let S1 and S2 be sets of states, R a set of applicable rules, and T a set of state transitions. The initial value of S1 and of S2 is s0 (the initial state); the initial values of R and T are NULL (the empty set).
Step 2: Take a state s from S1. If there is no state in S1, the elicitation of state transitions is finished.
Step 3: Save in R the rules which satisfy the following condition: all primitive names described in the pre-condition of the rule are involved in s.


Step 4: Take a rule r from R. If there is no rule, the elicitation of state transitions from state s is finished; go to Step 2 to elicit state transitions from another state.
Step 5: Assign real terminals to the terminal variables in the event e of rule r. If there is no new terminal assignment, go to Step 4 to apply another rule.
Step 6: Check whether r can be applied to s if e, with the terminal assignment made in Step 5, occurs. If r cannot be applied, go to Step 5. If the state transition obtained by applying r to s has not been elicited, go to Step 7. Otherwise, let e' be the event of the already elicited state transition; if the union of s and e is equivalent to the union of s and e', go to Step 5.
Step 7: Let s' be the next state when r is applied to s. The state transition from s to s' is represented as (s,e,s'). Generate the state transition (s,e,s') and store it in T. If no state that is the same as, or equivalent to, s' is included in S2, store s' in both S1 and S2. Go to Step 5 to elicit another state transition for another event.
T, obtained in the way described above, contains the state transitions of real states without equivalent state transitions. Figure 6 shows a part of a state transition diagram elicited in this way.

[Figure 6 appears here: a state transition diagram over system states S0–S8 for terminals A and B, with transitions labelled by offhook, onhook, dial, and wrongdial events.]

Fig. 6. State transition diagram for system states

States S8 and S6 look like deadlocks. But the state transitions from S8 and S6 are equivalent to the state transitions from S3 and S1, which are equivalent states to S8 and S6, respectively.
B. Extraction of analysis part
In the state transition diagram obtained in Section IV-A, the states of all terminals related to the service are described. So each state includes primitives that do not represent conditions of state transitions and that, from the viewpoint of the enhanced model, should instead be described in analysis parts.
(1) Delete redundant primitives
Since a primitive that does not represent a condition of a state transition has nothing to do with the transition, it is called a 'redundant primitive'. A redundant primitive exists in both system states, before and after the state transition. But it cannot be said that every primitive which exists in both states is a redundant primitive. A rule application condition is, as mentioned in Section III-A, described in the pre-condition of the rule, and when a rule is applied, the primitives described in its pre-condition are deleted from the system state. So a primitive that represents a condition of the state transition but has nothing to do with the change of the system state must be described in both the pre-condition and the post-condition; in this case, the primitive appears in both system states. For example, suppose terminal A is in dialtone state and terminal B is in busy state. When the user of terminal A dials terminal B, terminal A transits to busy state. The primitive busy(B) exists in the system states before and after this transition, but it is described in the pre-condition and post-condition of the applied rule, and so represents a condition for terminal A to transit to busy state. Conversely, a primitive that does not represent a condition of the state transition and exists in both states is not described in the pre-condition of the applied rule. A concrete example: consider the following rule, which represents a state transition from idle state to dialtone state:
idle(x) offhook(x): dialtone(x)
Suppose the system state is {idle(A),busy(B),talk(C,D)}. If an event offhook(A) occurs, the system state transits to {dialtone(A),busy(B),talk(C,D)}. In this case, busy(B) and talk(C,D) exist in both system states, before and after the state transition, but these primitives are not described in the applied rule. So busy(B) and talk(C,D) do not represent conditions for this state transition.

(2) Extraction of primitives described in the analysis part
A way to find the primitives that should be described in the analysis part of the state transition diagram based on the enhanced model is proposed. There are two types of such primitives.
a) The first type
The first type of primitives are those that exist in both states, before and after the state transition, and are described in the pre-condition of the applied rule. For example, suppose the following rule:
dialtone(x),busy(y) dial(x,y): busy(x),busy(y)
If an event dial(A,B) occurs when the system state is {dialtone(A), busy(B), talk(C,D)}, the system state transits to {busy(A), busy(B), talk(C,D)}. busy(B) exists in both states, before and after the state transition, and is described in the pre-condition of the applied rule; so busy(B) belongs to the first type. On the other hand, talk(C,D) exists in both states but is not described in the pre-condition of the applied rule; so talk(C,D) is not used to decide the next state.
The process for obtaining the primitives that belong to the first type is shown below. Let T be the set of state transitions obtained in Section IV-A(2) above. Make T' empty.


ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 M. Ohba, K. Matsuoka, and T. Ohta: Eliciting State Transition Diagrams from Programs described in a Rule-based Language

Step 1: If T is vacant, the process is ended. Otherwise, take out a state transition ti = (Si, ei, Sin) from T.
Step 2: Take out the set of primitives Pr that exist in both system states, Si and Sin. If Pr does not exist, change ti to ti = (Si, ei, Sin, NULL) and store it in T'. Hereafter, the fourth term in ti means the branch condition to be described in the analysis part of the state transition diagram based on the enhanced model. Go to Step 1.
Step 3: Let Si be Si - Pr and Sin be Sin - Pr, respectively. Let Pr' be the subset of Pr that consists of primitives described in the pre-condition of the rule ri, where ri is the rule applied to cause the state transition ti. If Pr' = φ, change ti to ti = (Si, ei, Sin, NULL) and store it in T'. Go to Step 1.
Step 4: Change ti to ti = (Si, ei, Sin, Pr'), and store it in T'. Go to Step 1.

b) The second type
For the second type of primitives, plural state transitions caused by the same event are considered. Among these, there are state transitions whose states before the transition include the same primitives as well as different primitives, and whose states after the transition differ according to the different primitives included in each state before the transition. In this case, those different primitives represent the conditions of each state transition, while the common primitives show that these state transitions are caused from the same state, represented by the common primitives. A concrete example is described based on Figure 3. The state transitions shown in Figure 3, written precisely, are as follows:

({dialtone(A),idle(B)}, dial(A,B), {calling(A,B)})
({dialtone(A),busy(B)}, dial(A,B), {busy(A),busy(B)})
({dialtone(A),calling(B,C)}, dial(A,B), {busy(A),calling(B,C)})
({dialtone(A),talk(B,C)}, dial(A,B), {busy(A),talk(B,C)})

In this example, dialtone(A) is the common primitive. idle(B), busy(B), calling(B,C), and talk(B,C) are the different primitives and are chosen as primitives to be described in the analysis part.
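The first-type extraction (Steps 1-4 above) can be sketched in Python. The set-based data layout and the function name are illustrative assumptions, not the paper's implementation; states and pre-conditions are represented as frozensets of primitive strings.

```python
def extract_first_type(transitions):
    """For each transition (Si, event, Sin, precondition), keep as the
    branch condition the primitives that appear both in the states before
    and after the transition AND in the pre-condition of the applied rule;
    otherwise record NULL (None)."""
    result = []
    for s_i, event, s_in, precondition in transitions:
        pr = s_i & s_in                    # Step 2: primitives in both states
        if not pr:
            result.append((s_i, event, s_in, None))
            continue
        s_i2, s_in2 = s_i - pr, s_in - pr  # Step 3: remove the common part
        pr2 = pr & precondition            # subset also in the rule's pre-condition
        if not pr2:
            result.append((s_i2, event, s_in2, None))
        else:
            result.append((s_i2, event, s_in2, pr2))  # Step 4
    return result

# Example from the text: dial(A,B) applied by the rule
# "dialtone(x),busy(y) dial(x,y): busy(x),busy(y)".
t = [(frozenset({"dialtone(A)", "busy(B)", "talk(C,D)"}),
     "dial(A,B)",
     frozenset({"busy(A)", "busy(B)", "talk(C,D)"}),
     frozenset({"dialtone(A)", "busy(B)"}))]
print(extract_first_type(t))
```

As in the text, busy(B) survives as the branch condition, while talk(C,D), which is absent from the pre-condition, is discarded.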
The way to obtain the second type of primitives is as follows. Let T = T', where T' has been obtained in the process a) proposed above, and make T' vacant.
Step 1: Take out from T all state transitions {ti} that are triggered by the same event. If there are no plural state transitions that satisfy this condition, the process is ended.
Step 2: Obtain Sc as follows: Sc = ∩Si, the common primitives for all Si. If Sc is vacant, go to Step 4.
Step 3: Let Pci = Si - Sc. For all ti, change them to ti = (Sc, ei, Sin, Pci ∪ Pr'), where Pci ∪ Pr' means the union of Pci and Pr'.
Step 4: Store all ti in T'. Go to Step 1.
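The second-type step can be sketched similarly, assuming states are frozensets of primitive strings and each transition is a tuple (Si, event, Sin, Pr'); the grouping strategy and names are illustrative, not the paper's code.

```python
from collections import defaultdict
from functools import reduce

def extract_second_type(transitions):
    """Group transitions by triggering event; factor out the common
    primitives Sc and move the differing primitives Pci into each
    transition's branch condition (united with any first-type Pr')."""
    by_event = defaultdict(list)
    for t in transitions:
        by_event[t[1]].append(t)
    result = []
    for event, group in by_event.items():
        if len(group) < 2:
            result.extend(group)           # no plural transitions: keep as-is
            continue
        sc = reduce(lambda a, b: a & b, (t[0] for t in group))
        if not sc:
            result.extend(group)           # Step 2: Sc vacant, store unchanged
            continue
        for s_i, ev, s_in, pr in group:
            pci = s_i - sc                 # Step 3: differing primitives
            cond = pci if pr is None else pci | pr
            result.append((sc, ev, s_in, cond))
    return result

# Two of the four dial(A,B) transitions from the text: they share
# dialtone(A); idle(B) and busy(B) become branch conditions.
group = [
    (frozenset({"dialtone(A)", "idle(B)"}), "dial(A,B)",
     frozenset({"calling(A,B)"}), None),
    (frozenset({"dialtone(A)", "busy(B)"}), "dial(A,B)",
     frozenset({"busy(A)", "busy(B)"}), None),
]
for t in extract_second_type(group):
    print(t)
```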

There are state transitions where, for different Pr', the next states are the same. An example is shown. Suppose a state transition from dialtone(A) caused by an event dial(A,B). If Pr' is idle(B), the next state is calling(A,B). If Pr' is any of dialtone(B), calling(B,C), talk(B,C), or busy(B), the next state is busy(A). In this case, dialtone(B), calling(B,C), talk(B,C), or busy(B) can be described as notidle(B), representing that terminal B is in a state other than idle(B). Thus, the well-known state transition diagram for POTS can be obtained (Figure 7).

Fig. 7. State transition diagram for POTS

V. RELATED WORK

For communication systems, some research has been done in which service specifications are elicited from input/output information to/from the system [7],[11]. In [7], an algorithm is proposed for eliciting rules, which represent state transition conditions, from a message sequence chart. In [11], message sequence charts and state transition diagrams for the real system state are elicited. The real system state represents the states of all terminals connected to the system, and there were no descriptions of redundant states. In neither paper is a state transition diagram elicited as a service specification.

VI. EXPERIMENTAL SYSTEM AND EVALUATION

(1) Experimental system
According to the proposed methods described in section 4, an automatic system for eliciting state transition diagrams was experimentally implemented. Figure 8 shows the software structure of the experimental system.
Input: This program block reads a program described in ESTR.
Real State Transition Generation: This program block generates state transitions of real system states


which are based on the basic model. This block consists of the following sub blocks: Rule Selection, State Transition Generation, and Deletion of Equivalent State Transition.
Rule Selection: This sub block selects a rule according to the rule application condition described in section II A.
State Transition Generation: This sub block generates real state transitions by applying the selected rule.
Deletion of Equivalent State Transition: This sub block deletes equivalent state transitions after the real state transitions are generated.
Service State Transition Generation: This program block generates state transitions for a service program specification, which is based on the enhanced model. This block consists of the following sub blocks: Deletion of Redundant Primitives, Extraction of Branch Conditions, and Division of the Next State.
Deletion of Redundant Primitives: This sub block deletes redundant primitives described in state transitions based on the basic model.
Extraction of Branch Conditions: This sub block extracts primitives as branch conditions which should be described in an analysis part.
Division of the Next State: This sub block divides the next state into plural states if needed. As shown in Figure 7, the next state from S3 is divided into two states, S0 and S4.

Fig. 8. Software structure of the system

(2) Application to VoIP
By applying the system to a VoIP service program described in ESTR, a state transition diagram of the service specification was elicited, as shown in Figure 9. The STR parts of the ESTR rules for VoIP are as follows:

rule[0]: idle(x),arqok(x,y) arq(x,y): wsetup(x,y)
rule[1]: idle(x),arqng(x,y) arq(x,y): wrelease(x)
rule[2]: wsetup(x,y),idle(y) setup(x,y): warq(y,x)
rule[3]: wsetup(x,y),notidle(y) setup(x,y): wrelease(x)
rule[4]: warq(y,x),arqok(y,x) arq(y,x): wproc(y,x)
rule[5]: warq(y,x),arqng(y,x) arq(y,x): wrelease(x),wrelease(y)
rule[6]: wproc(y,x) proc(y,x): walert(x,y)
rule[7]: walert(x,y) alert(y,x): wconn(x,y)
rule[8]: wconn(x,y) conn(y,x): talk(x,y)
rule[9]: wsetup(x,y) disc(x): wrelcomp(x)
rule[10]: wrelcomp(x) relcomp(x): wdrq(x)
rule[11]: wdrq(x) drq(x): idle(x)
rule[12]: warq(y,x),wproc(x,y) disc(x): wrelcomp(x),wrelease(y)
rule[13]: wproc(y,x) disc(x): wrelcomp(x),wrelease(y)
rule[14]: walert(x,y) disc(x): wrelcomp(x),wrelease(y)
rule[15]: wconn(x,y) disc(x): wrelcomp(x),wrelease(y)
rule[16]: talk(x,y) disc(x): wrelcomp(x),wrelease(y)
rule[17]: talk(x,y) disc(y): wrelease(x),wrelcomp(y)
rule[18]: wrelease(x) disc(x): wdrq(x)

Fig. 9. State transition diagram for VoIP

Here, arqok(x,y) and arqng(x,y) stand for states in which an admission request from terminal x to terminal y has been accepted and rejected, respectively. arq(x,y) is an admission request message from terminal x to terminal y. wsetup(x,y) stands for a state waiting for a setup message from terminal x to terminal y. wrelease(x) means a state waiting for a release message from terminal x. setup(x,y) is a setup message from terminal x to terminal y. warq(y,x) stands for a state waiting for an admission request message from terminal y to terminal x. wproc(y,x) means a state waiting for a call processing message from terminal y to terminal x. proc(y,x) is a call processing message from terminal y to terminal x. walert(x,y) means a state where terminal x is waiting for an alert message from terminal y. alert(y,x) is an alert message from terminal y to terminal x. wconn(x,y) stands for a state where terminal x is waiting for a connect message from terminal y. conn(y,x) is a connect message from terminal y to terminal x. disc(x) is a disconnect message from terminal x. wrelcomp(x) means a state waiting for a release complete message from terminal x. relcomp(x) is a release complete message from terminal x. wdrq(x) means a


state waiting for a disengage request message. drq(x) is a disengage request message from terminal x.
In addition, by applying the proposed methods to telephone services, such as Call Waiting Service and Terminating Screening Service, and to some non-telephone services, it was confirmed that the proposed methods are useful for eliciting state transition diagrams. However, there is an improvement to be made; the problem is discussed in (3) below.

(3) Dividing the space of states
a) Problem
A state transition diagram as a service specification is described for each service. So, a state transition diagram as a service specification of a supplementary service should be separated from the state transition diagram of POTS. Moreover, since the number of terminals used in a supplementary service is greater than that used in POTS, the number of states generated is far greater than in POTS. This increases the processing time and the amount of memory required to obtain a state transition diagram as a service specification. Thus, a method for eliciting state transitions just for a supplementary service is required.
b) Solution
In papers [12] and [13], respectively, the authors proposed methods for distinguishing between the spaces of the POTS service and a supplementary service, and for identifying a rule which generates a state transition from the POTS space to the space of the supplementary service. By using these methods, the state transitions of POTS and those of the supplementary service can be distinguished.

VII. CONCLUSION

A system for automatically eliciting a state transition diagram based on an enhanced state transition model from a network service program described as a set of rules has been proposed. By implementing an experimental system based on the proposal and applying it to a VoIP service, a correct state transition diagram was elicited.
In addition, by applying the proposed methods to telephone services, such as Call Waiting Service and Terminating Screening Service, and to some non-telephone services, it was confirmed that the proposed methods are useful for eliciting state transition diagrams. Furthermore, items to be refined were clarified and solutions were given. For future work, the proposed method will be evaluated by enhancing the proposed system and applying the enhanced system to many network services, including non-telephone services.

REFERENCES
[1] T. Morinaga, G. Ogose, and T. Ohta, "Active Networks Architecture for VoIP Gateway Using Declarative Language," IEICE Transactions on Communications, Vol. E84-B, No. 12, pp. 3189-3197, Dec. 2001.
[2] Shondip Sen and Rachel Cardell-Oliver, "A Rule-Based Language for Programming Wireless Sensor Actuator Networks using Frequency and Communication," Third Workshop on Embedded Networked Sensors (EmNets 2006), 2006.
[3] Yingwei Luo, Xiaolin Wang, Xinpeng Liu, Zhou Xing, Xiao Pang, and Habio Wang, "A Rule-based Event Handling Model," Proc. of APSCC08, pp. 869-875, Dec. 2008.
[4] Richard Etter, Patricia Dockhorn Costa, and Tom Broens, "A Rule-Based Approach Towards Context-Aware User Notification Service," Proc. of International Conference on Pervasive Services 2006, pp. 281-284, Jun. 2006.
[5] Xian Fei and Evan Magill, "Rule Execution and Event Distribution Middleware for PROSEN-WSN," Proc. of SENSERCOMM 2008, pp. 580-585, Jan. 2008.
[6] Yutaka Hirakawa and Toyofumi Takenaka, "Telecommunication Service Description Using State Transition Rules," Sixth International Workshop on Software Specification and Design, Oct. 1991.
[7] Kazumasa Takami and Yoshihiro Niitsu, "A Specification Description Conversion Method of Message Sequence Charts into Rules in Telecommunication Services," Journal of IPSJ, Vol. 36, No. 5, pp. 1081-1090, May 1995.
[8] Tae Yoneda and Tadashi Ohta, "The Declarative Language STR," Language Constructs for Describing Features, pp. 197-212, Springer, Aug. 2000.
[9] Tae Yoneda and Tadashi Ohta, "Reduction of the Number of Terminal Assignments for Detecting Feature Interactions in Telecommunication Services," Proc. of ICECCS 2000, pp. 202-209, Sep. 2000.
[10] Masahide Nakamura, Yoshiaki Kakuda, and Toru Kikuno, "Feature Interaction Detection Using Permutation Symmetry," Proc. of FIW'98, pp. 187-201, Sep. 1998.
[11] Nancy Griffeth, Yuri Cantor, and Constantinos Djouvas, "Testing a Network by Inferring Representative State Machines from Network Traces," ICSEA 2006, Nov. 2006.
[12] Kenichi Tatekuwa and Tadashi Ohta, "Automatic Deleting Specification Errors which Cause Miss-Detection of Feature Interaction," Proc. of ICFI'05, pp. 320-326, Sep. 2005.
[13] Daiske Harada, Hiroaki Fujiwara, and Tadashi Ohta, "Avoidance of Feature Interactions at Run-Time," Proc. of ICSEA 2006, Oct. 2006.

Minami Ohba was born in 1987. She received the B.S. degree in information system science from Soka University in 2009. Her interests include communication software. Ms. Ohba is a member of IEICE.

Koichi Matsuoka was born in 1985. He received the B.S. and M.S. degrees in information system science from Soka University in 2007 and 2009, respectively. His interests include communication software and networked robots. Mr. Matsuoka is a member of IEICE.



Tadashi Ohta was born in 1945. He received the B.S. and M.S. degrees in electronics engineering and the Ph.D. degree in information engineering from Kyushu University in 1968, 1970, and 1987, respectively. He joined Denden Kosha (now NTT) in 1970. Since then, he had been engaged in the research and development of software systems for electronic switching systems. In 1992 he moved to ATR Communication Systems Research Laboratories, where he was engaged in research on automatic programming and security. In 1996, he moved to Soka University. Prof. Ohta is a member of IEEE, IEICE, and IPSJ, and a fellow of IEICE.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 K. S. Thampatty et.al: RTRL Based Multivariable Neuro-controller for Non-linear Systems


RTRL Based Multivariable Neuro-controller for Non-linear Systems K.C. Sindhu Thampatty, Student Member, IEEE, M. P. Nandakumar and Elizabeth P. Cheriyan

Abstract—This paper presents a new non-linear multivariable control strategy using neural networks. In the proposed controller, the neural network is trained using the Real Time Recurrent Learning (RTRL) algorithm, which uses past and present process information to generate new control actions. A fully connected recurrent network architecture is selected for the controller, in which a feedback connection is provided from the output layer to the input layer through a delay element. Since the synaptic weights of the neurons are adjusted on-line, this controller also has potential applications in real-time control. The proposed control algorithm is tested for both continuous and discrete systems. The simulation results show the potential of the controller to deal with non-linear multivariable systems.

Index Terms—Artificial Neural Network (ANN), Non-linear Control, Multivariable System, Real Time Recurrent Learning Algorithm (RTRL).

I. INTRODUCTION

The ability of ANNs to deal with non-linear systems can be readily exploited in the synthesis of non-linear controllers. Neural networks can learn a sufficiently accurate model and give good non-linear control when the model equations are not known or only partial state information is available. Hence, neural network based controller design is an efficient approach for handling process non-linearities as well as variable interactions. Controllers in general are designed when the dynamic systems have certain unknown parameters. However, this usually requires the approximation that the system under consideration is linear, which is difficult to justify for complex non-linear systems. Most commonly encountered time-varying systems are non-linear in nature. Recent advances in non-linear control theory have inspired the development of adaptive control schemes for these systems. One of the successful areas of application of neural networks is non-linear control, because a neural network is capable of producing a highly non-linear mapping. Most of the results reported in the literature for the adaptive control of non-linear dynamical systems relate to single-input single-output (SISO) systems, but practical systems have multiple inputs and multiple outputs (MIMO). Hence, our interest in this work is to develop a controller for multivariable systems. In multivariable systems, unlike in SISO systems, there exists more than one control loop and hence loop interactions exist. These interactions can cause system instability or result in poor control performance. A multivariable controller can achieve non-interacting control, but

K. C. Sindhu Thampatty (corresponding author), M. P. Nandakumar and Elizabeth P. Cheriyan are with the Department of Electrical Engineering, National Institute of Technology, Calicut, Kerala, India. E-mail: kc− [email protected].

its limitation is that the design methods involve the construction of a mathematical model describing the dynamics of the plant to be controlled. In practice this may not be possible, because an exact model representation of the plant is difficult to obtain. Operating a complex system requires an intelligent controller with adaptive and learning capabilities in the presence of unknown disturbances, unmodeled dynamics and unstructured uncertainties. In control applications, neural networks can be incorporated in a direct strategy or an indirect strategy [1]. In the direct strategy, for a given current state of the system and the target state for the next sampling instant, the network is trained to produce the control action that drives the system to the target state. In the indirect strategy, the network is trained with input-output data from the dynamic system: for a given current state and current control action, the network learns to predict the next state of the system. In this work, a direct control strategy is adopted to design the controller. A number of studies propose methods for the identification and control of non-linear multivariable systems. In complex control system applications, adaptive and robust control techniques have high potential, because the modification of controller parameters is based on convergence and stability constraints, which can cause limitations in the performance of adaptive systems. AL-Zahary [2] explains the realization of neuro-controllers for non-linear multivariable systems, suggesting two extended models of a SISO neuro-controller to realise multivariable systems for identification and control; an unknown system is considered and a control is generated by training the models with available input-output data. The paper [3] explains a control strategy which uses past and present process information to design the multivariable controller.
A feedforward architecture with the back-propagation learning algorithm and one hidden layer is used to develop that controller; the training is off-line with input-output data. A methodology to develop a proper model for the design of a robust controller for multivariable systems is explained in [4], where the controller is designed for structurally ill-conditioned processes. Different aspects of controller design for non-linear systems are available in the literature [5], [6]. One can find a number of applications of neural networks in control systems, which can be trained by using different methods [7], [11]-[14]. In most of these training techniques, retaining information about the infinite past is not possible, which is a drawback for real-time applications. In feedforward neural networks, back-propagation with an adaptive learning rate is the most widely used gradient-based algorithm.


But this algorithm is very sensitive to noise and relatively unstable. Hence there is a need for a real-time recurrent learning algorithm, which stabilises the system and also improves the convergence. A suitable architecture for this type of learning is a dynamic recurrent neuron architecture in which the output of the neuron is fed back to the input through a delay. Different training methods are available for recurrent neurons [12], [15], [17]. A method for non-linear system control using ANNs is suggested in [18], but it is applicable only to SISO systems. In the works reported so far, a fully connected recurrent neural network has seldom been used for the control of non-linear multivariable systems, which is essential for real-time control applications. Thus an intelligent controller is required to replace the conventional controllers. Among the several architectures found in the literature, recurrent neural networks (RNN) involving dynamic elements and internal feedback connections have been considered more suitable for real-time applications. The critical issues in the application of recurrent neural networks are the choice of the network architecture, i.e., the number and type of neurons and the location of feedback loops, and the development of a suitable training algorithm. In this paper we discuss the use of the RTRL algorithm to train the network; it is an optimal algorithm in the sense that it minimises the instantaneous squared error at the output neurons at every discrete time instant, while the network is running. The number of neurons in the output layer is equal to the number of states of the system, and the network is fully connected, i.e., all the outputs from the neurons are fed back to the input through a delay.

II. REAL TIME RECURRENT LEARNING ALGORITHM

This is a forward gradient algorithm, which makes use of a matrix of partial derivatives of the network state values with respect to every weight. The algorithm is based on minimising the instantaneous squared error at the output of the neuron. The main difficulty related to the recursive training of recurrent networks arises from the fact that the output of the network and its partial derivatives with respect to the weights depend on the inputs. In this method, partial derivatives of each node with respect to each weight are computed at every iteration [19]. The method is completely on-line and is simple to implement. Fig. 1 shows the architecture with three states, one bias and two control inputs. All the outputs are fed back to the source nodes through a delay. There are two distinct layers in the network, a concatenated input-feedback layer and a processing layer of computational nodes. If q is the number of states of the system and m is the number of control inputs to the system, then the dynamic system can be represented by the non-linear difference equations given in (1) and (2):

x(n+1) = Φ(Wa x(n) + Wb u(n))   (1)

y(n) = C x(n)   (2)

where Wa is a q × q matrix, Wb is a q × (m+1) matrix and C is a p × q matrix. Let ψ : R^q → R^q be a non-linear map. The

Fig. 1. Structure of a fully connected recurrent network

neural network with m inputs, p outputs and q states can be represented in state-space form as given in (3), (4) and (5):

x(n+1) = [ φ(W1^T ζ(n)), ..., φ(Wj^T ζ(n)), ..., φ(Wq^T ζ(n)) ]^T   (3)

Wj = [ Waj ; Wbj ]   (4)

ζ(n) = [ x(n) ; u(n) ]   (5)

The matrices Wa, Wb, C and the non-linear function φ are interpreted as follows:
• In (1), the total weights are split into two, namely Wa and Wb. The matrix Wa represents the synaptic weights associated with the q neurons in the hidden layer that are fed back as inputs in the input layer. The matrix Wb represents the synaptic weights associated with this hidden layer that are connected to the input sources, including the bias; thus, the bias terms of the hidden neurons are included in Wb. The matrix C represents the synaptic weights of the p output neurons connected to the hidden layers.
• The neurons in the hidden layers use the hyperbolic tangent non-linear function, given by:

φ(x) = (1 − e^(−2x)) / (1 + e^(−2x))   (6)


or a logistic function:

φ(x) = 1 / (1 + e^(−x))   (7)
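As a numerical illustration of the state recursion (1)-(2) together with the activations (6) and (7), a minimal sketch follows; the matrix values and the sizes q = 3, m = 2, p = 1 are illustrative assumptions, not values from the paper.

```python
import numpy as np

def tanh_act(x):
    """Hyperbolic tangent activation, eq. (6)."""
    return (1 - np.exp(-2 * x)) / (1 + np.exp(-2 * x))

def logistic(x):
    """Logistic activation, eq. (7)."""
    return 1 / (1 + np.exp(-x))

q, m, p = 3, 2, 1                             # states, inputs, outputs
rng = np.random.default_rng(0)
Wa = rng.standard_normal((q, q)) * 0.1        # q x q feedback weights
Wb = rng.standard_normal((q, m + 1)) * 0.1    # q x (m+1); bias column included
C = rng.standard_normal((p, q))

x = np.zeros(q)
u = np.array([1.0, 0.5, -0.5])                # [bias, u1(n), u2(n)]
x = tanh_act(Wa @ x + Wb @ u)                 # eq. (1)
y = C @ x                                     # eq. (2)
print(y)
```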

The network consists of a set of N fully connected neurons and a set of M inputs. Let Wki(n) denote the weight associated with the link originating from neuron i towards neuron k at time n. The net input to neuron k, Sk(n), is defined as the weighted sum of all activations in the network. Based on standard RTRL terminology, the output of node k at time (n+1) is calculated as:

yk(n+1) = fk(Sk(n))   (8)

where:

Sk(n) = Σ_{i ∈ N∪M} Wki Zi(n)   (9)

Zk(n) = xk(n) if k ∈ M;  yk(n) if k ∈ N   (10)

The nonlinear activation function f(·) maps to the range [0,1]. The overall network error at time n is defined by an error function J(n) represented as:

J(n) = (1/2) Σ_{k ∈ outputs} [yk(n) − dk(n)]² = (1/2) Σ_{k ∈ outputs} [ek(n)]²   (11)

where dk(n) denotes the desired target value for output k at time n. To execute the RTRL algorithm, three matrices ∧j(n), ∪j(n) and φ(n) are calculated, which are explained in (12), (13) and (15).

1) ∧j(n) is a q × (q+m+1) matrix defined as the partial derivative of the state vector with respect to the weight vector:

∧j(n) = ∂x(n)/∂wj ;  j = 1, 2, ..., q   (12)

2) ∪j(n) is a q × (q+m+1) matrix whose rows are all zero except for the jth row, which is equal to the transpose of the vector ζ(n):

∪j(n) = [0 ; ζ^T(n) ; 0] ;  j = 1, 2, ..., q   (13)

3) φ(n) is a q × q diagonal matrix whose diagonal elements are the partial derivatives of the non-linear activation function:

φ′(wj^T ζ(n)) ;  j = 1, 2, ..., q   (14)

φ(n) = diag(φ′(W1^T ζ(n)), ..., φ′(Wq^T ζ(n)))   (15)

Proceeding through the steps of the LMS algorithm using the steepest descent method, the error and the correction in the synaptic weights can be calculated as:

e(n) = d(n) − Cx(n)   (16)

where e(n) is the error and d(n) is the desired output at the nth instant,

∧j(n+1) = φ(n)[Wa(n) ∧j(n) + ∪j(n)] ;  j = 1, 2, ..., q   (17)

and the correction in weight is given by (18):

∆Wj(n) = ηC ∧j(n) e(n)   (18)

Thus, the RTRL algorithm can be summarized as follows:
1) Initialize the weights: Wj = [Waj ; Wbj], j = 1, 2, ..., q.
2) Set the initial states as x(0) = x0, where x0 is calculated for a particular operating condition.
3) Set ∧j(0) = 0 for j = 1, 2, ..., q.
4) Compute, for n = 0, 1, ...:
∧j(n+1) = φ(n)[Wa(n) ∧j(n) + ∪j(n)], j = 1, 2, ..., q
e(n) = d(n) − Cx(n)
∆Wj(n) = ηC ∧j(n) e(n)

By identifying the partial derivatives of the output function with respect to the weights as sensitivity elements, the sensitivity can be denoted as given in (19):

pkij(n+1) = ∂yk(n+1) / ∂Wij(n)   (19)

where

y = C[x] + D[u]   (20)
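The summarized RTRL update can be sketched compactly in code. The array Lam plays the role of the sensitivity matrices ∧j(n); the sizes, the constant target d(n), the input sequence and the learning rate are illustrative assumptions, not the paper's simulation settings.

```python
import numpy as np

q, m, p = 3, 2, 1                               # states, inputs, outputs
eta = 0.5                                       # learning rate (illustrative)
rng = np.random.default_rng(1)
W = rng.standard_normal((q, q + m + 1)) * 0.1   # rows Wj = [Waj | Wbj]
C = rng.standard_normal((p, q))
x = np.zeros(q)                                 # step 2: initial state
Lam = np.zeros((q, q, q + m + 1))               # step 3: sensitivities ∧j(0) = 0

def dtanh(s):
    """Derivative of the tanh activation, used for φ(n) in eq. (15)."""
    return 1.0 - np.tanh(s) ** 2

for n in range(50):                             # step 4, iterated over time
    u = np.array([1.0, np.sin(0.1 * n), 0.5])   # [bias, u1(n), u2(n)]
    zeta = np.concatenate([x, u])               # ζ(n) = [x(n); u(n)], eq. (5)
    s = W @ zeta
    Phi = np.diag(dtanh(s))                     # eq. (15)
    Wa = W[:, :q]
    for j in range(q):
        U = np.zeros((q, q + m + 1))            # eq. (13): only row j nonzero
        U[j] = zeta
        Lam[j] = Phi @ (Wa @ Lam[j] + U)        # eq. (17)
    x = np.tanh(s)                              # eq. (1)
    d = np.full(p, 0.2)                         # desired output (illustrative)
    e = d - C @ x                               # eq. (16)
    for j in range(q):
        W[j] += eta * (C @ Lam[j]).T @ e        # eq. (18): ∆Wj = ηC ∧j e

print(np.abs(d - C @ x))                        # remaining tracking error
```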

In this algorithm, the storage requirements cannot be reduced, as they constitute a crucial component of the weight update procedure.

III. PROPOSED CONTROLLER

The proposed controller has been designed using the states of the system as the input signals, which are fed back from the output of the neurons through a delay. The control strategy at any instant can be written as:

y(n+1) = F(y(n), u(n), bias)   (21)

where y(n) = [y1(n), y2(n), ..., yp(n)] : output vector


u(n) = [u1(n), u2(n), ..., um(n)] : control vector

When the desired output trajectory is given at any instant, say the (n+1)th instant, the required control input is generated by the controller with the available knowledge of the states at the nth instant. For the purpose of training the network, the desired trajectory can be obtained by any method. The controller generates an appropriate control to achieve the desired state trajectory after on-line training of the network using the RTRL algorithm. The proposed controller is shown in Fig. 2. The number of neurons in the controller is the same as the number of control inputs to the system. Each controller neuron is connected to all the states in the input layer and given an external bias. The system states at the (n+1)th instant can be calculated as:

x(n+1) = φ[Wa x(n) + Wb K]   (22)

where K = [bias; control inputs]. The desired output at the (n+1)th instant is given by:

y(n+1) = C x(n+1)   (23)

The control input generated by the controller at the (n+1)th instant is given by:

u(n+1) = σ( Σi Wui xi(n) )   (24)

where Wui are the synaptic weights associated with the controller neuron, σ is the activation function used in the controller and φ is the activation function used in the RTRL algorithm.

Fig. 2. The proposed controller

IV. CASE STUDY

In this section, we present two simulation studies, one for a SISO system and one for a MIMO system, with the proposed adaptive control algorithm. For training the neural network, all the system state variables are initially assumed to have small values and are given to the neural network. Then the desired system outputs are given as the required closed-loop state trajectories. These desired trajectories can be generated by any conventional method. The synaptic weights of the neurons are first initialized to random values. After the desired outputs are presented to the network, at each instant the error function e(n) is computed and the correction in the synaptic weights is updated using (18). The value of the learning-rate parameter is problem dependent; in this work, a suitable value of 0.5 is used, which gives a fast convergence rate of the learning process. The hyperbolic tangent given in (6) is used as the activation function for the controller and the output neurons.

1) Example 1. Single Input, Single Output (SISO) system

The system considered here is the inverted pendulum, a classical problem in dynamics and control theory, widely used as a benchmark for testing control algorithms. The objective of the controller is to control the angle of the inverted pendulum. The inverted pendulum has its mass above its pivot point and is mounted on a motor-driven cart. The assumption made in the system formulation is that the pendulum moves only in a two-dimensional space. The cart is driven by a motor that exerts a horizontal force F on the cart, which is the control input to the system. Balancing the pendulum in a vertical position in open loop is unstable. The proposed controller in feedback will modify the states of the system by generating a control input u to the system. The main objective of the controller in this system is to keep the pendulum upright in the presence of any disturbance by applying a control force u. The non-linear system equations of the inverted pendulum can be expressed as:

I θ̈ = m g l sin(θ) − m l² θ̈ − m l ẍ cos(θ)   (25)

M ẍ = F − m(ẍ + l θ̈ cos(θ) − l θ̇² sin(θ)) − k ẋ   (26)

where M is the mass of the cart, m is the mass of the pendulum, F is the force applied, k is a constant and l is the length of the pendulum. The system state variables are given by:

x1 = θ
x2 = dθ/dt = θ̇
x3 = x = position of the cart
x4 = dx/dt = ẋ = velocity of the cart

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 K. S. Thampatty et.al: RTRL Based Multivariable Neuro-controller for Non-linear Systems


and the output of the system is given by y = θ.

In this problem, the desired trajectory used for training the network is obtained by the pole placement technique applied to the linearised model of the system, with a small settling time of about 2 seconds and reasonable damping (damping factor ξ = 0.5).

[Fig. 7. Desired outputs vs. Time(Sec)]

[Fig. 3. Theta vs. Time(Sec)]

[Fig. 8. System input generated by the controller vs. Time(Sec)]

[Fig. 4. Angular velocity vs. Time(Sec)]

[Fig. 5. Position of the cart vs. Time(Sec)]

Figures 3 to 6 are the plots of the different system states, and Fig. 7 shows the desired trajectory of the system response, i.e., the desired position of the cart, which is given to the neural network for training. In this example, the tolerance level is set to 0.005. The output of the controller, i.e., the required control input to the system, generated for the desired output, is shown in Fig. 8. Once the desired trajectory shown in Fig. 10 is supplied to the network, the controller acts so as to track the output in a very short time. The proposed controller was tested for different operating conditions (i.e., different initial angles of the pendulum and different initial positions of the cart) and it performs well, which shows its high robustness. The trained neural network can be used as the controller in the feedback circuit.

[Fig. 6. Velocity of the cart vs. Time(Sec)]

2) Example 2: Multiple-Input, Multiple-Output (MIMO) systems: In this example, we consider a discrete model of a turbo-generator with six system states, two inputs and two outputs. The system dynamics are given as:

dδ/dt = s  (27)

ds/dt = b1 (PT − PE)  (28)


[Fig. 9. Actual output generated vs. Time(Sec)]

[Fig. 10. Desired output vs. Time(Sec)]

[Fig. 11. Squared error vs. Time(Sec)]

dEq/dt = b2 (−Eq + b3 s sin(δ − α12) + U1)  (29)

dPT/dt = b4 (−PT + qC)  (30)

dq/dt = b6 (−γ(q) − b5 s + h)  (31)

dh/dt = b7 (−h + U2)  (32)

dW1/dt = 0  (33)

where C and the b's are constants, δ is the angle of rotation of the rotor, s is the slip, PT is the mechanical power of the turbine, PE is the electrical power generated, and Eq is the generated voltage. q is the valve movement of the regulator, U1 and U2 are the controls on voltage and valve position, α12 is the angle of transfer conductivity, and h is the turbine velocity controller signal. The objective of the controller is to make the deviation from the desired generated voltage zero and the deviation of the generator load angular position zero. These are controlled by the throttle valve position and the loading torque of the generator. The desired trajectories for the outputs are supplied to the network, and the network is then trained in such a way that the mean squared error at the output is minimised. Figure 12 shows the desired trajectories of the outputs, i.e., the change in the desired voltage and the change in load angle, supplied to the network for training. When these desired trajectories are supplied to the network, the control inputs to the system, U1 and U2, are generated by the controller in such a way that the squared error is minimum. In this case also the controller response is very fast. Figures 13 and 14 show the efficiency of the proposed controller in tracking the outputs. The control inputs generated by the controller are shown in Fig. 15, and the squared-error graph of the network is shown in Fig. 16. All these simulation results show that the proposed controller is efficient in non-linear system control and, since its response is very fast, it can be used for real-time applications.

[Fig. 12. The desired trajectory supplied to the network]

V. CONCLUSION

Neural networks offer a feasible alternative for system identification and controller design when there are no good, reliable deterministic mathematical models valid in all operating conditions, or when only historical input-output data are available. In this paper, we have described the implementation of an adaptive neural-network-based controller for non-linear, multivariable, dynamic systems. The fully connected dynamic neural network architecture selected for implementation has been trained by the RTRL algorithm. The proposed controller is capable of generating the required control input by modifying the synaptic weights when the desired output trajectory is presented to the network. The quick response makes the ANN-based approach very attractive for on-line applications in non-linear system control. The efficiency of the proposed controller has been demonstrated by applying the method to both continuous and discrete systems, and the results agree with our claim.


[Fig. 13. The actual valve position with controller and the desired trajectory supplied to the network]

[Fig. 14. The actual voltage generated at the output with controller and the desired voltage supplied to the network]

REFERENCES

[1] L. Ender and R. Maciel Filho, "Design of multivariable controller based on neural networks," Computers and Chemical Engineering, vol. 24, pp. 937-943, 2000.
[2] T. A. Al-Zohary, Wahdan, M. A. R. Ghonaimy and A. A. Elshamy, "Adaptive control of nonlinear multivariable systems using neural networks and approximate models."
[3] D. Semino and G. Pannocchia, "Robust multivariable inverse-based controllers: theory and application," Ind. Eng. Chem. Res., vol. 38, pp. 2375-2382, 1999.
[4] X. M. Sun, J. Zhao and W. Wang, "State feedback control for discrete delay systems with controller failures based on average dwell-time method," IET Control Theory Appl., vol. 2, no. 2, pp. 126-132, 2008.
[5] G. R. Duan and H. H. Yu, "Robust pole assignment in high-order descriptor linear systems via proportional plus derivative state feedback," IET Control Theory Appl., vol. 2, no. 4, pp. 277-287, 2008.
[6] B. Zhang, "Parametric eigenstructure assignment by state feedback in descriptor systems," IET Control Theory Appl., vol. 2, no. 4, pp. 303-309, 2008.
[7] B. Filho, E. Cabral and A. Soares, "A new approach to artificial neural networks," IEEE Trans. Neural Networks, vol. 9, no. 6, pp. 1167-1179, 1998.
[8] D. Linkens and Y. Nyongesa, "Learning systems in intelligent control: an appraisal of fuzzy, neural and genetic algorithm control applications," IEE Proc. Control Theory Appl., vol. 134, no. 4, pp. 367-385.
[9] K. S. Narendra and K. Parthasarathy, "Identification and control of dynamical systems using neural networks," IEEE Trans. Neural Networks, vol. 1, pp. 4-27, 1990.
[10] P. J. Werbos, "An overview of neural networks for control," IEEE Control Systems Magazine, no. 1, pp. 40-42.
[11] P. J. Antsaklis, "Neural networks in control systems," IEEE Trans. Energy Conversion, vol. 8, no. 1, pp. 71-77, 1993.
[12] L. Jin, P. Nikiforuk and M. Gupta, "Approximation of discrete-time state space trajectories using dynamic recurrent networks," IEEE Trans. Automatic Control, vol. 40, no. 7, pp. 1266-1270, 1995.
[13] Y. Zhang, G. P. Chen, O. P. Malik and G. S. Hope, "An artificial neural network based adaptive power system stabiliser," IEEE Control Systems Magazine, vol. 12, no. 4, pp. 8-10, 1992.
[14] S. W. Piche, "Steepest descent algorithms for neural network controllers and filters," IEEE Trans. Neural Networks, vol. 5, no. 2, 1994.
[15] B. Pearlmutter, "Gradient calculations for dynamic recurrent neural networks: a survey," IEEE Trans. Neural Networks, vol. 6, no. 3, pp. 1212-1228, 1995.
[16] L. H. Chang, "Design of nonlinear controller for bi-axial inverted pendulum," IET Control Theory Appl., vol. 1, no. 4, pp. 979-986, 2007.
[17] R. J. Williams and D. Zipser, "A learning algorithm for continually running fully recurrent neural networks," Neural Computation, vol. 1, pp. 552-558, 1989.
[18] K. C. Sindhu Thampatty, M. P. Nandakumar and Elizabeth P. Cheriyan, "ANN based adaptive controller tuned by RTRL algorithm for non-linear system control," Second International Workshop on Nonlinear Dynamics and Synchronization (INDS'09), Klagenfurt, Austria, July 20-21, 2009.
[19] S. Haykin, Neural Networks: A Comprehensive Foundation, IEEE Press, 1994.
[20] J. Sarangapani, Neural Network Control of Nonlinear Discrete-Time Systems, Taylor and Francis.

K. C. Sindhu Thampatty received the B.Tech degree in electrical and electronics engineering in 1994 and the M.Tech degree in Energetics in 1996, both from National Institute of Technology Calicut, India. She is currently pursuing the Ph.D degree at National Institute of Technology Calicut, India. She is Assistant Professor at Amrita Viswa Vidyapeetham, Ettimadai, Coimbatore, India. Her teaching and research interests include Control Systems, Power Systems, Neural Networks and Fuzzy Logic.


[Fig. 15. The controller outputs (control input 1 and control input 2) vs. Time(Sec)]

[Fig. 16. Squared error vs. Time(Sec)]

M. P. Nandakumar received the B.Tech degree in electrical and electronics engineering in 1974 and the M.Tech degree in Control Systems in 1977, both from National Institute of Technology Calicut, India, and the Ph.D degree in 1987 from Indian Institute of Technology, Kanpur. He is currently Assistant Professor in Electrical Engineering at National Institute of Technology Calicut, India. His research interests include Control Engineering, Nonlinear Systems, Artificial Neural Networks and Fuzzy Logic.

Elizabeth P. Cheriyan received the B.Tech degree in electrical engineering from Kerala University, India, in 1996, the M.Tech degree in Energetics from National Institute of Technology Calicut, India, in 1998, and the Ph.D degree in 2008 from Indian Institute of Technology, Bombay. She is currently Assistant Professor in Electrical Engineering at National Institute of Technology Calicut, India. Her research interests include Power System Analysis, Control Engineering and Electrical Machines.


ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 H. Rashidi and Z. Rashidi: A Simple Technique for Implementation of Coroutines in Programming Languages

A Simple Technique for Implementation of Coroutines in Programming Languages
Hassan Rashidi*, Zaynab Rashidi**

* Department of Mathematics, Statistics and Computer Science, Allameh Tabatabaee University, Tehran, Iran (Email: [email protected]; [email protected])
** Department of Computer Engineering, Alzahra University, Tehran, Iran (Email: [email protected])

ABSTRACT
In this paper, a simple technique for implementation of coroutines in programming languages is presented. A coroutine is a subprogram that can return control to the caller before completion of execution. There is no direct method for implementation of coroutines in some programming languages such as C/C++. The technique presented in this paper is based on using a static integer variable in a case statement that controls each resumption point at execution time. This technique is applicable to many other programming languages.

KEYWORDS
Programming, Programming languages, Coroutines, Simulation, Distributed Processing.

1. Introduction

Computing has evolved from a solitary activity on a single machine into a federation of cooperating activities that often span the globe. Each activity can be performed by a single function/routine on a single machine. There is a wide variety of functions/routines; one of these is the coroutine. A coroutine is a subprogram that can return control to the caller before completion of execution [1][2][3]. Coroutines are not currently a common control structure in programming languages outside of discrete simulation languages. However, they provide a control structure in many algorithms that is more natural than the ordinary subprogram hierarchy. Coroutines are commonly used in parallel processing, distributed processing and simulation. The question is how we can implement coroutines in imperative languages. In some programming languages, there is no direct method for programmers to use a subprogram as a coroutine. In this paper, a simple technique for implementation of coroutines in the C/C++ programming languages is presented. The remainder of this article is organized as follows. In the next section we give an overview of the relevant work on the matter. In Section 3, a detailed description of the technique is given. The strengths and weaknesses of the technique are presented in a comparison with other solutions in Section 4. Finally, the experimental results of running the approaches are presented in Section 5.

2. Related Work

When a coroutine receives control from another subprogram, it executes partially and then is suspended when it returns control. At a later point, the calling program may resume execution of the coroutine from the point at which execution was previously suspended. Figure 1 shows a simple control transfer between two coroutines A and B. In both coroutines, note that Si is a sequence of statements. When A calls subprogram B as a coroutine, B executes awhile and returns control to A via a Resume A, just as any ordinary subprogram would do. When A again passes control to B via a Resume B, B again executes awhile and returns control to A, just as an ordinary subprogram. Implementation of coroutines can be discussed at two levels: the programming level and inside the programming language itself. There are no important techniques for implementation of coroutines at the programming level. The most common method is that the coroutine structure may be readily simulated in many languages using the goto statement and a resume-point variable specifying the label of the statement at which execution is to resume [1].


Figure 1: Control transfer between Coroutines

For implementation of coroutines inside programming languages, more information is needed. At this level, suppose the simple call-return structure [1]. Pratt et al. (2001) present a method for implementation of coroutines inside programming languages. Their implementation is based on the simple call-return, but they make differences in handling the CIP (Current Instruction Pointer). Each coroutine has an IP location in its activation record, used to provide the value for the CIP on resuming execution. A single activation record is allocated storage statically at the beginning of execution as an extension of the code segment for a coroutine. A single location, called the resume point, is reserved in the activation record to save the old IP (Instruction Pointer) of the CIP when a resume instruction transfers control to another coroutine.

3. The Technique and its details

Our technique is simple and easy to implement by programmers and programming languages. In a short sentence, a coroutine is simply a case statement with a function return as the last statement of each case item. A single static integer variable is used to determine an entry point for each execution. A static variable is a variable whose value is held after returning control to the caller subprogram. The static local variables inside each subprogram are maintained at the end part of the subprogram's code segment. Figure 2 demonstrates a sample of the technique used for implementation of the coroutine shown in Figure 1. At the first call, the static variable A_Control is set to 1 to determine the first entry point. After doing the statements for sequence S1 and before returning control to the caller subprogram, this variable is set to 2. Therefore, at the second entry point, it does the statements of sequence S2. In summary, each time the coroutine is called, the variable is set to an appropriate value to select the appropriate entry point. Note that in the technique, each coroutine can have some formal parameters as inputs. Moreover, the coroutine can return a single value to the caller subprogram.

type A_Coroutine(…)
{
    static int A_Control = 1;
    type ret_val;
    switch (A_Control) {
    case 1: Statements for S1;
            A_Control = 2; return(ret_val);
    case 2: Statements for S2;
            A_Control = 3; return(ret_val);
    case 3: Statements for S3;
            A_Control = 1; return(ret_val);
    }
}
Figure 2: A sample of the technique used

Figure 3 shows a part of the code segment generated by a compiler for programming languages like C/C++ for the program of Figure 2. These languages store and update the value of the static variable at the last location of the code segment. The first instruction fetches the value t of the static variable A_Control. The second instruction jumps to an appropriate point of the code depending on the value t. Therefore, the program jumps to L1, L2 or L3 depending on t = 1, 2, 3 respectively. At each entry point of the code, the program does the instructions for the sequence St, t = 1, 2, 3. Then it sets A_Control to a specific value to determine the next entry point for the next call.


Fetch value t of variable A_Control
Jump to L0+t-1
L0: Jump to L1
    Jump to L2
    Jump to L3
L1: ...Instructions for S1...
    Set A_Control to 2
    Return(ret_val)
L2: ...Instructions for S2...
    Set A_Control to 3
    Return(ret_val)
L3: ...Instructions for S3...
    Set A_Control to 1
    Return(ret_val)
A_Control
Figure 3: The Code generated by the compiler

The control structure of the main function for the example in Figure 1 is depicted in Figure 4. The first statements in main are dedicated to declarations and initialization. After that, the calling functions make a sequence of control transfers as in Figure 1.

type main(…)
{
    Declarations and initialization;
    Call A_Coroutine(…);
    Call B_Coroutine(…);
    Call A_Coroutine(…);
    Call B_Coroutine(…);
    Call A_Coroutine(…);
    Call B_Coroutine(…);
    Call C_Coroutine(…);
}
Figure 4: The Control structure of main

4. The Comparative discussion

In this section, a comparative discussion of the technique is presented. It has some advantages over the techniques in the literature. Firstly, the technique is usable both at the programming level and inside programming languages themselves. Secondly, the technique presented in this paper can be implemented in any language. For some languages, for example Pascal and Fortran, that do not support static variables inside a subprogram, a global variable can be used.

5. Summary and Conclusion

In this paper, a simple technique for implementation of coroutines in programming languages is presented. A coroutine is simply a case statement with a function return as the last statement of each case item. We can have a static/global variable and then set it to the appropriate case entry. Each time the coroutine is called, the case variable is set to point to the next option. Control structures in which subprograms may be invoked either as coroutines or ordinary subprograms, and in which coroutines may be recursive (i.e., may have multiple simultaneous activations), require a more complex implementation.

REFERENCES:
[1]. Pratt T. and Zelkowitz M., "Programming Languages: Design and Implementation", 4th Edition, Prentice Hall, 2001.
[2]. Sebesta R. W., "Concepts of Programming Languages", 6th Edition, Addison Wesley, 2004.
[3]. Tucker A. and Noonan R., "Programming Languages", McGraw Hill, 2002.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 Jianhua Yang et.al.: Analyzing and Correlating Interactive Sessions with One-Dimensional Random Walk to Detect Stepping-Stone Intrusion


Analyzing and Correlating Interactive Sessions with One-Dimensional Random Walk to Detect Stepping-Stone Intrusion Jianhua Yang, Guoqing Zhao, Lydia Ray, Shou-Hsuan Stephen Huang

Abstract—Most network intruders tend to use stepping-stones to reduce the risks of being discovered or captured when attacking or invading other computers. In this paper, we introduce an algorithm to detect stepping-stone intrusion by counting and comparing the number of Send and Echo packets on incoming and outgoing connections respectively. Here we show that stepping-stone intrusion can be modeled as a one-dimensional random-walk process, and we analyze the detection algorithm with random-walk theory. We compared this algorithm with the existing state-of-the-art stepping-stone detection algorithm and found that our algorithm performs better in terms of the number of packets required to determine an intrusion. The analysis shows that this algorithm can resist an intruder's time-perturbation evasion.
Keywords—Intrusion detection, network security, random walk, stepping-stone

Jianhua Yang is with Columbus State University, Columbus, GA 31907 USA (corresponding author; phone: (001)706-565-3520; fax: (001)706-565-3529; e-mail: [email protected]).
Guoqing Zhao is now with Beijing Institute of Petro-Chemical Technology, Beijing 100078 China (e-mail: [email protected]).
Lydia Ray is now with Columbus State University, Columbus, GA 31907 USA (e-mail: [email protected]).
Stephen S.-H. Huang is with University of Houston, TX 77204 USA (e-mail: [email protected]).
Part of this research was presented at the FINA2008 workshop.

I. INTRODUCTION

Stepping-stone intrusion has become the most popular and safest way for intruders to attack other computers [1]. Unlike non-interactive attacks, such as virus attacks, which ultimately crash a system, the purpose of interactive attacks is to steal sensitive and confidential information from the system. The consequences of stepping-stone intrusion are in some measure even more serious than those of other types of attack. But this also motivates computer scientists and researchers to put forward new theories and approaches to detect and prevent this kind of attack. With the development of computer technology, especially since

early 1995, many approaches have been proposed to detect stepping-stone intrusion and defend victims. Among the most popular ones are the content-based thumbprint [2], the time-based approach [1], the deviation-based approach [3], the round-trip time approach [4, 5], and the packet-number-difference-based approach [6, 7, 9]. Staniford-Chen and Heberlein proposed the content-based thumbprint method, which identifies intruders by comparing different sessions for suggestive similarities of connection chains [2]. The major problem of this method is that it cannot be applied to encrypted sessions, because their real contents are not available and therefore it is unable to make a thumbprint, which is a summary of the contents of a session. Zhang and Paxson [1] proposed the time-based approach, which can be used to detect stepping-stones or trace intrusions even if a session is encrypted. However, three major problems come with this approach. First, an interactive session can be manipulated easily by intruders. Second, this approach requires that the packets of connections have precise and synchronized timestamps in order to correlate them properly; this makes it infeasible or impractical to correlate measurements that were taken at different points in a network. Third, Zhang and Paxson were also aware of the fact that a large number of legitimate stepping-stone users routinely traverse a network for a variety of reasons. Yoda and Etoh [3] proposed the deviation-based approach, which is a network-based correlation scheme. This method is based on the observation that the deviation between two unrelated connections is large enough to be distinguished from the deviation of related connections. In addition to the limitations mentioned above, the deviation-based approach has other problems, such as not being efficient or applicable to compressed sessions and padded payloads.
Yung [5] proposed the round-trip-time (RTT) approach to detect stepping-stone intrusion by estimating the downstream length using the gap between a request and its response, as well as the gap between the request and its acknowledgement. The problem with this RTT approach is that it makes inaccurate detections because it cannot compute the two gaps precisely and simultaneously. Blum [7] proposed the packet-number-difference-based (PND-based) approach to detect stepping-stone intrusion


by checking the difference of Send packet numbers between incoming and outgoing connections. It is based upon the idea that if two connections are relayed, the difference should be bounded; otherwise, it is not bounded. This method can resist intruders' evasion, such as time-jittering and chaff-perturbation. Donoho et al. [6] showed for the first time that there are theoretical limits on the ability of attackers to disguise their traffic using evasions during a long interactive session. The major problem with the PND-based approach is that the upper bound on the number of packets that must be monitored is too large, while the lower bound on the number of chaff packets an attacker needs to evade detection is too small. This makes Blum's method very weak in resisting intruders' chaff-perturbation evasion. In this paper, we introduce a novel approach that exploits the optimal numbers of TCP requests and responses to detect stepping-stone intrusion. It is an improvement on Blum's approach because our method not only counts the number of Send packets but also compares the numbers of Send and Echo packets of incoming and outgoing connections respectively. Additionally, the mathematics involved in our analysis tool is different: we analyze the algorithm using one-dimensional random-walk theory, which gives the false negative as well as the false positive errors, whereas Blum analyzed his algorithm with machine learning theory, which gives only the false positive error. Theoretical analysis in this paper shows that the performance of our approach is much better than Blum's approach in terms of the number of packets that need to be monitored at the same confidence, under the assumption that a session is manipulated by either time-jittering or chaff-perturbation evasion.

The rest of the paper is arranged as follows. In Section 2, we present the stepping-stone problem and explain how to model stepping-stone intrusion as a one-dimensional random-walk process. Section 3 presents the stepping-stone detection algorithm (the correlating algorithm). In Section 4, we analyze the performance of this algorithm using one-dimensional random-walk theory, and in Section 5, we present the results in comparison with Blum's approach. Finally, in Section 6, we summarize this work and discuss future work.

II. USING RANDOM WALK PROCESS TO MODEL STEPPING-STONE INTRUSION

To understand what a stepping-stone intrusion is, and where it is detected, let us consider a safety-critical computer system, host hn, as shown in Figure 1, running in a complex network environment. Host hn is open to the Internet; any authenticated user can connect to hn directly or indirectly. Most applications connect to hn directly so as to access it efficiently, except for a few applications which require an indirect connection to hn. However, users who connect to hn indirectly are highly suspected of having hidden intentions rather than accessing the host regularly. We identify these users as intruders. As shown in Figure 1, host h0 is assumed to be used by an intruder, host hi is the sensor where our detection program runs, all the hosts between h0 and hn are called stepping-stones [1], and Cin and Cout are one incoming and one outgoing connection of the sensor, respectively. The detection program can reside at any of the stepping-stone computers. Downstream is from the sensor to hn, and upstream is from the sensor to h0. Stepping-stone intrusion detection is to decide whether host hi is used as a stepping-stone.

[Figure 1. A sample connection chain: h0 → ... → hi-1 → hi (sensor) → hi+1 → ... → hn, with upstream toward h0 and downstream toward hn]

A stepping-stone intrusion may or may not involve an interactive session; in this paper we focus only on cases where sessions are interactive. If an intruder connects to a server remotely via a couple of stepping-stones, most probably the purpose of the intruder is to steal valuable information. Usually intruders need to look through the files at a victim host and sometimes need to open some files to check whether they are what they want. If they find what they are interested in, they may download the files to a local or another computer. Most servers on the Internet have a UNIX or Linux system installed, so intruders need to input UNIX commands, wait for the responses from the victim side, and determine the next step. It is therefore reasonable to assume that a stepping-stone intrusion connection chain is a TCP/IP interactive session. To determine if a computer is used as a stepping-stone, we compare one of the incoming connections with any one of the outgoing connections to see if they are relayed. An incoming connection is one used to connect to host hi, such as the connection Cin in Figure 2; an outgoing connection is one used to connect out to another host from host hi, such as Cout in Figure 2. Each connection carries requests (Send packets) and responses (Echo packets). Previous researchers paid much attention to comparing the Send packets between an incoming and an outgoing connection. In this paper, we focus on comparing both the Send and Echo packets to determine if incoming and outgoing connections are relayed.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 Jianhua Yang et.al.: Analyzing and Correlating Interactive Sessions with One-Dimensional Random Walk to Detect Stepping-Stone Intrusion



Figure 2. Stepping-stone and connections

A stepping-stone intrusion is basically a process of sending requests and receiving responses. Usually, if a user types a UNIX command letter by letter, one or a couple of letters are sent as one packet. While a user types a UNIX command, each Send packet is echoed (replied to) by one or more packets. For example, when typing the command 'ls' to list the files and subdirectories of a directory, there might be one packet (ls), or two packets (l) and (s) sent separately, and there will be one or two packets echoed respectively before an 'Enter' packet is sent. After the 'Enter' packet is sent, many Echo packets come back: the first Echo packet echoes the 'Enter' Send packet, and the other Echo packets carry the result of the command (ls) execution. We call the former kind of Echo packet a command-echo, and the latter a result-echo. Usually the size of a result-echo is bigger than the size of a command-echo. If we could filter out the result-echoes and keep the command-echoes, the number of Send packets would be roughly equal to the number of Echo packets. We use the word 'roughly' because: 1) we cannot guarantee that all the result-echoes are filtered out; and 2) it is also possible that Send or Echo packets are split, combined, or dropped. If two connections are relayed, we cannot conclude that the number of Send packets is exactly the same as the number of command-echo packets, but at least we can say that the difference between these two numbers should be bounded. This is exactly the behavior of a one-dimensional random walk process. Let us examine the rationale for using a one-dimensional random walk to model this process. If we monitor and count the numbers of Send and command-echo packets, ideally these two numbers are equal, and their difference walks neither to the left (negative side) nor to the right (positive side). If a Send packet is monitored, the difference walks one step to the left; otherwise it walks one step to the right. Because of the two reasons discussed above, and other unpredictable factors such as network traffic, this difference walks back and forth randomly, but it should remain bounded if the two connections are relayed; otherwise it will not always be bounded. In other words, the difference may walk left or right, but it never walks out of a boundary if the host is used as a stepping-stone. This boundary is a given threshold which is important in our stepping-stone intrusion detection algorithm, and its value affects the accuracy of detection. The bigger the boundary, the smaller the

false negative error, but the bigger the false positive error. The smaller the boundary, the smaller the false positive error, but the bigger the false negative error. Selecting a proper threshold is a key to the success of the detection algorithm. We give a detailed analysis of the threshold, the false negative error, and the false positive error in Section IV.
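The bounded-difference argument can be checked with a short simulation. This sketch is ours, not the paper's: a relayed pair alternates Sends with command-echoes, so the difference stays near zero, while an unrelated pair of streams has no such coupling and the difference drifts past any fixed boundary.

```python
import random

def walk_stays_bounded(packets, bound):
    """Track delta = (#command-echoes - #sends) over a packet stream and
    return False as soon as delta leaves [-bound, bound], else True."""
    delta = 0
    for p in packets:
        delta += 1 if p == "echo" else -1
        if abs(delta) > bound:
            return False
    return True

random.seed(1)

# Relayed connections: every Send is (roughly) answered by a command-echo,
# so the walk oscillates near zero and stays inside the boundary.
relayed = [p for _ in range(500) for p in ("send", "echo")]

# Unrelated streams: Sends and Echoes arrive with unequal probabilities,
# so the walk drifts steadily and eventually crosses any fixed boundary.
unrelated = [random.choice(["send", "send", "echo"]) for _ in range(1000)]

print(walk_stays_bounded(relayed, bound=4))    # True: difference stays bounded
print(walk_stays_bounded(unrelated, bound=4))  # False: the biased walk drifts out
```

With a larger boundary the unrelated pair survives longer before being rejected, which mirrors the false negative/false positive trade-off discussed above.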

III. CORRELATING ALGORITHM

We use N^(2)_(i,s) to denote the number of requests (Send packets) of an outgoing connection, N^(1)_(i,e) to denote the number of responses (Echo packets) of an incoming connection, and ∆ to denote N^(1)_(i,e) − N^(2)_(i,s). For relayed incoming and outgoing connections, ∆ may vary, but it should always be bounded. The problem of detecting a stepping-stone pair is thus to judge whether the difference between the numbers of Send and command-echo packets of the two connections is always bounded. In other words, for a stepping-stone pair, the following relationship should hold at all times:

    −Ω_Φ ≤ ∆ ≤ Ω_Φ,    (1)

where Ω_Φ is the boundary threshold, which can be set to any integer, usually in the range from 2 to 128 based on our experience and experimental context. To reduce false alarms and misdetections in detecting stepping-stone pairs, we check whether condition (1) holds every time a packet is captured. If we monitor a total of w packets and condition (1) holds at all w checks, we conclude that a stepping-stone intrusion is detected. If condition (1) fails at any one of the w checks, we conclude that the monitored host is not used as a stepping-stone. We thus introduce the following algorithm, called the request-response based detection algorithm (RRB), to detect stepping-stone intrusion. In this algorithm, S_i^(2) represents the Send packet stream of an outgoing connection, E_i^(1) represents the Echo packet stream of an incoming connection, and p_j represents a captured packet.

RRB(S_i^(2), E_i^(1), Ω_Φ, w):
    N_s ← 0; N_e ← 0
    for j ← 1 to w:
        capture the next packet p_j
        if p_j is a Send packet of S_i^(2): N_s ← N_s + 1
        if p_j is a command-echo packet of E_i^(1): N_e ← N_e + 1
        ∆ ← N_e − N_s
        if ∆ < −Ω_Φ or ∆ > Ω_Φ: return FALSE
    return TRUE

In this algorithm, we capture and check up to w packets on the two connections to see whether inequality (1) is satisfied. If there is any one check at which inequality (1) is not satisfied, we



conclude that there is no stepping-stone pair. The conclusion that a host is not a stepping-stone should be drawn only after all the connections have been checked. If condition (1) is satisfied within w checks, we can conclude that there is a stepping-stone; it is then not necessary to check whether condition (1) holds for all the remaining connections. The larger the number of checks w, the higher the confidence of RRB. For a given confidence, which corresponds to a given false positive probability, how do we estimate the least number of packets that need to be monitored on an incoming and an outgoing connection?
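The RRB procedure can be sketched in runnable form as follows. This is our illustrative reimplementation of the description above: packet capture is abstracted into an iterable of pre-classified packets, and the names (`rrb`, `stream`, `kind`) are ours, not the paper's.

```python
def rrb(packets, bound, w):
    """Request-response based detection (RRB) sketch.

    packets: iterable of up to w tuples (stream, kind), where stream is
             "out" (outgoing connection) or "in" (incoming connection)
             and kind is "send" or "echo".
    bound:   the threshold Omega_Phi.
    w:       number of packets to capture and check.

    Returns True if condition (1) held at all w checks (stepping-stone
    pair detected), False otherwise.
    """
    n_send_out = 0   # Send packets of the outgoing connection S_i^(2)
    n_echo_in = 0    # command-echo packets of the incoming connection E_i^(1)
    checked = 0
    for stream, kind in packets:
        if checked == w:
            break
        if stream == "out" and kind == "send":
            n_send_out += 1
        elif stream == "in" and kind == "echo":
            n_echo_in += 1
        delta = n_echo_in - n_send_out
        if delta < -bound or delta > bound:
            return False          # condition (1) violated: not a stepping-stone pair
        checked += 1
    return True                   # condition (1) held w times: stepping-stone detected

# A relayed pair: each outgoing Send is answered by an incoming command-echo.
trace = [("out", "send"), ("in", "echo")] * 50
print(rrb(trace, bound=4, w=100))  # True
```

Feeding it a trace in which outgoing Sends and incoming command-echoes alternate keeps ∆ within the boundary for all w checks, so the pair is reported as relayed.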

IV. PERFORMANCE ANALYSIS

We evaluate the performance of the algorithm RRB by computing the least value of w for a given false positive or false negative detection probability. The performance of RRB can also be evaluated through the false positive and false negative probabilities for a given w. The false negative probability indicates the possibility that condition (1) does not hold even though the two connections are in the same chain; the false positive probability indicates the possibility that condition (1) holds even though the two connections are not in the same chain. For convenience, we use the following notation in the remainder of this paper: p_NC, the false negative probability; p_PC, the false positive probability; δ_N, a given false negative probability; δ_P, a given false positive probability.

To answer the above question, we assume that we only capture Send and Echo packets, so a captured packet is either a Send or an Echo. Suppose that a packet is a Send with probability q and an Echo with probability p. As discussed in Section II, the behavior of the difference ∆ between the numbers of Sends and Echoes can be modeled as a one-dimensional random walk with independent jumps Z_1, Z_2, ..., Z_i, ..., where i is a positive integer. If a Send is captured, ∆ makes a jump Z_i = −1; otherwise it makes a jump Z_i = 1; there is no other choice. Obviously the following equations must be satisfied:

    prob(Z_i = −1) = q,  prob(Z_i = 1) = p,  p + q = 1.    (2)

A. False Negative Probability

The false negative probability p_NC of RRB is the sum of the probabilities that the random walk ∆ hits the lower bound −Ω_Φ or the upper bound Ω_Φ within w moves. We denote the probability that the process hits the lower bound −Ω_Φ within w moves by f^(w)_(0,−Ω_Φ), and the probability that it hits the upper bound Ω_Φ within w moves by f^(w)_(0,Ω_Φ). Based on results from random walk theory [8], we have

    p_NC = f^(w)_(0,−Ω_Φ) + f^(w)_(0,Ω_Φ) ≤ (1/2)[(q/p)^(Ω_Φ/2) + (p/q)^(Ω_Φ/2)] s_1^(w−1),    (3)

where

    s_1 = 1 − p − q + 2(pq)^(1/2) cos(π/(2Ω_Φ)) = 2(pq)^(1/2) cos(π/(2Ω_Φ)).

One special case is when p = q:

    p_NC = f^(w)_(0,−Ω_Φ) + f^(w)_(0,Ω_Φ) ≤ s_1^(w−1) = cos^(w−1)(π/(2Ω_Φ)).    (4)

From inequality (4), we obtain the least packet number w needed for a given false negative probability δ_N when p = q:

    w ≥ log δ_N / log(cos(π/(2Ω_Φ))) + 1.    (5)

The significance of formula (5) is that for a given false negative probability, we can estimate how many packets need to be collected to make a decision. In other words, in order to obtain a given false negative error rate, we can estimate at least how long we need to monitor a host to conclude whether it is used as a stepping-stone. For a fixed false negative probability, the smaller the w, the better the performance of a detection algorithm.

B. False Positive Probability

The false positive probability p_PC of RRB is the probability that the difference ∆ walks within the range [−Ω_Φ, Ω_Φ] for w moves even though the two connections are not relayed. From the results of [8], we obtain

    p_PC = Σ_{k=w+1..∞} [f^(k)_(0,−Ω_Φ) + f^(k)_(0,Ω_Φ)] ≤ (1/2)[(q/p)^(Ω_Φ/2) + (p/q)^(Ω_Φ/2)] Σ_{k=w+1..∞} s_1^(k−1) = (1/2)[(q/p)^(Ω_Φ/2) + (p/q)^(Ω_Φ/2)] s_1^w / (1 − s_1),    (6)

where s_1 is defined as above. When p = q, we have the following simplified result:

    p_PC = cos^w(π/(2Ω_Φ)) / (1 − cos(π/(2Ω_Φ))).    (7)

Similarly, we obtain w from (7) for a given δ_P when p = q as follows:

    w ≥ log[δ_P (1 − cos(π/(2Ω_Φ)))] / log(cos(π/(2Ω_Φ))).    (8)

The significance of formula (8) is that for a given false positive probability, we can compute how many packets need to be collected to make a decision. In other words, in order to obtain a given false positive error rate, we can estimate at least how long we need to monitor a host to conclude whether it is used as a stepping-stone. For a fixed false positive probability, the smaller the w, the better the performance of RRB.

C. Resistance to Time-Perturbation

Time-perturbation [6], a packet-conserving evasion, is a technique used by intruders to evade stepping-stone detection. With time-perturbation, intruders hold some Send packets for a certain time and then release them, in order to make two relayed connections appear unrelated, or two unrelated connections appear relayed. In this way intruders can evade detection approaches that count the numbers of Send packets of the incoming and outgoing connections respectively. However, our approach can resist such time-perturbation evasion. Let us discuss why. As shown in Figure 3, host h_i has one incoming connection C_i1 and one outgoing connection C_i2, and each connection carries one request stream (S_i) and one response stream (E_i). Other approaches usually count the number of packets in S_i^(1) as well as in S_i^(2) and compare these two numbers to determine whether there is a stepping-stone intrusion. If intruders hold the Send packets of connection C_i1, they may evade such detection. We found, however, that the Echo packets cannot be held, because holding the echoed packets would generate many resend requests, which would in turn cause network traffic problems. Our detection algorithm instead compares the number of Send packets of the outgoing connection C_i2 with the number of echoed packets of the incoming connection C_i1. These two numbers are not affected by time-perturbation evasion, because we count the Send packets of S_i^(2) only when the packets are released; we do not care how long the Send packets are held before they are released. For the Echo packets in the stream E_i^(1), as discussed above, we also do not have to worry about time-perturbation. The difference ∆ is therefore not affected by time-perturbation evasion, so this algorithm can resist intruders' time-perturbation evasion.

Figure 3. Illustration of connections and streams of a host
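Formulas (5) and (8) can be evaluated numerically to size the monitoring window before deployment. The sketch below is ours, for the p = q special case; `bound` stands for Ω_Φ and the function names are our own.

```python
import math

def min_packets_false_negative(delta_n, bound):
    """Least w satisfying formula (5): w >= log(delta_N)/log(cos(pi/(2*bound))) + 1."""
    c = math.cos(math.pi / (2 * bound))
    return math.ceil(math.log(delta_n) / math.log(c) + 1)

def min_packets_false_positive(delta_p, bound):
    """Least w satisfying formula (8): w >= log(delta_P*(1 - c))/log(c), c = cos(pi/(2*bound))."""
    c = math.cos(math.pi / (2 * bound))
    return math.ceil(math.log(delta_p * (1 - c)) / math.log(c))

# Example: boundary Omega_Phi = 14 and 1% target error probabilities.
print(min_packets_false_negative(0.01, 14))
print(min_packets_false_positive(0.01, 14))
```

Both bounds grow quickly with the boundary, which matches the observation that a larger Ω_Φ requires monitoring more packets to reach the same error probability.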

V. COMPARISON WITH THE STATE-OF-THE-ART ALGORITHM

The best way to evaluate the performance of an algorithm is to compare it with the best existing algorithm. So far, Blum's approach [7] has been considered the best way to detect stepping-stone intrusion. In this study, we compare the performance of RRB with Blum's DAC in terms of the least number of packets needed for a given false positive rate δ_P; the algorithm that needs fewer captured packets is considered to perform better. We must mention that Blum did not give a false negative error analysis in [7]; thus, we compare the two algorithms only in terms of false positive probability. In order to compare with Blum's DAC, we use the same notation as in [7], where the boundary is denoted by p_∆; in this paper, the boundary is denoted by Ω_Φ.

Let w_B and w be the minimum numbers of packets that must be monitored in order to obtain the same given false positive probability δ_P with DAC and RRB respectively. Our purpose is to compare w_B and w to see which is smaller; the smaller the number, the better the performance of the algorithm. Based on [7], w_B can be computed by the following formula:

    w_B = 2(p_∆ + 1)^2 log(1/δ_P),    (9)

and based on inequality (8), the least value of w can be computed by the following formula:

    w = log[δ_P (1 − cos(π/(2p_∆)))] / log(cos(π/(2p_∆))).    (10)

Figures 4 through 8 show the comparison between the least values of w_B and w with p_∆ varying from 2 to 98 (the Y-axis uses a logarithmic scale), under the fixed δ_P values 0.6, 0.1, 0.01, 0.001, and 0.0001 respectively. Figure 4 shows that RRB performs better than DAC only when p_∆ is under 14; when p_∆ is larger than 14, DAC outperforms RRB. Figure 5 shows that RRB performs better than DAC for p_∆ between 2 and 98; when p_∆ = 14, the least number of packets needed by RRB is about 1000 less than that needed by DAC. As the false positive probability decreases, the difference between w_B and w becomes larger; that is, the higher the required detection accuracy, the larger the advantage of RRB over DAC. This trend can be seen in Figures 6 through 8. From Figure 8 we observe that, under the same condition p_∆ = 14, the least number of packets required by RRB is about 9000 less than that required by DAC to reach the same false positive probability of 0.0001. Based on the comparison results in Figures 4 through 8, we conclude that under high confidence (low false positive probability) without chaff perturbation, RRB outperforms DAC, because RRB needs fewer packets than DAC to reach a conclusion.

Figure 4. Comparison of the number of packets monitored with Blum's method under δ_P = 0.6
Figure 5. Comparison of the number of packets monitored with Blum's method under δ_P = 0.1
Figure 6. Comparison of the number of packets monitored with Blum's method under δ_P = 0.01
Figure 7. Comparison of the number of packets monitored with Blum's method under δ_P = 0.001
Figure 8. Comparison of the number of packets monitored with Blum's method under δ_P = 0.0001

Figures 9 and 10 show the same comparison from a different angle. Figure 9 shows the comparison between DAC and RRB with varying δ_P under the condition p_∆ = 2; it shows that RRB performs better than DAC in terms of the minimum number of packets required, regardless of the false positive probability. Figure 10 shows a variation of Figure 9 under the boundary p_∆ = 64; it shows that DAC outperforms RRB when δ_P is larger than 0.6, and that RRB outperforms DAC when δ_P is less than 0.6. From Figures 4 through 10, we conclude that the larger the boundary, the more packets are needed to reach the same false positive probability, and that under the same boundary, the smaller the false positive probability, the larger the least number of packets that need to be monitored. In general, RRB outperforms DAC in terms of the number of packets needed to detect a stepping-stone intrusion, since under a high false positive probability (low confidence) the performance of a detection algorithm is of little significance anyway.

Figure 9. Comparison between w_B and w with δ_P under p_∆ = 2
Figure 10. Comparison between w_B and w with δ_P under p_∆ = 64

VI. CONCLUSIONS AND FUTURE WORK

In this paper we introduced an algorithm that detects stepping-stone intrusion by counting and comparing the numbers of Send and Echo packets in an incoming and an outgoing connection respectively. We modeled a stepping-stone intrusion as a one-dimensional random walk process and analyzed the detection algorithm using random walk theory. We compared our algorithm with the available state-of-the-art stepping-stone detection algorithm and found that our algorithm performs better in terms of the least number of packets required. The analysis also showed that our algorithm can resist intruders' time-perturbation evasion. A limitation of our approach, shared by other approaches, is a high false positive rate, because some applications use stepping-stones routinely: all intruders use stepping-stones to attack their targets, but this does not mean that every interactive TCP/IP session over stepping-stones belongs to an intruder. Another limitation is that all the analysis in this paper is based on the assumption p = q, which is too strong for the approach to be widely applicable. Weakening this condition, and developing an approach that detects stepping-stone intrusion with a low false positive rate, are the main goals of our future work.

ACKNOWLEDGMENTS

We would like to thank Dr. Marcos Cheney, professor at the University of Maryland Eastern Shore, MD, USA, and the anonymous reviewers for their helpful feedback and comments on this work.



REFERENCES

[1] Y. Zhang, V. Paxson, "Detecting Stepping-Stones," Proc. of the 9th USENIX Security Symposium, Denver, CO, August 2000, pp. 67-81.
[2] S. Staniford-Chen, L. Todd Heberlein, "Holding Intruders Accountable on the Internet," Proc. of the IEEE Symposium on Security and Privacy, Oakland, CA, August 1995, pp. 39-49.
[3] K. Yoda, H. Etoh, "Finding Connection Chain for Tracing Intruders," Proc. of the 6th European Symposium on Research in Computer Security (LNCS 1985), Toulouse, France, September 2000, pp. 31-42.
[4] J. Yang, S. S-H. Huang, "Matching TCP Packets and Its Application to the Detection of Long Connection Chains," Proc. (IEEE) of the 19th International Conference on Advanced Information Networking and Applications (AINA'05), Taipei, Taiwan, China, March 2005, pp. 1005-1010.
[5] K. H. Yung, "Detecting Long Connecting Chains of Interactive Terminal Sessions," Proc. of the International Symposium on Recent Advances in Intrusion Detection, Zurich, Switzerland, September 2002, pp. 1-16.
[6] D. L. Donoho (ed.), "Detecting Pairs of Jittered Interactive Streams by Exploiting Maximum Tolerable Delay," Proc. of the International Symposium on Recent Advances in Intrusion Detection, Zurich, Switzerland, September 2002, pp. 45-59.
[7] A. Blum, D. Song, S. Venkataraman, "Detection of Interactive Stepping-Stones: Algorithms and Confidence Bounds," Proc. of the International Symposium on Recent Advances in Intrusion Detection, Sophia Antipolis, France, September 2004, pp. 20-35.
[8] D. Cox, H. Miller, The Theory of Stochastic Processes. New York, NY: John Wiley & Sons Inc., 1965.
[9] T. He, L. Tong, "Detecting Encrypted Interactive Stepping-Stone Connections," Proc. of the 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing, Toulouse, France, May 2006, pp. 816-819.

Jianhua Yang has been a Member of the IEEE since 2003 and of the ACM since 2006. He was born in Weifang, China, in 1966. He earned his Ph.D. degree in computer science at the University of Houston, Houston, TX, USA in 2006, his Master's degree in computer engineering at Shandong University, Jinan, Shandong, China in 1990, and his Bachelor's degree in electronic engineering at Shandong University, Jinan, Shandong, China in 1987. His major field of study is computer science. He is currently working at the TSYS School of Computer Science, Columbus State University (CSU), Columbus, GA, USA as an Associate Professor. Before joining CSU, he was an Assistant Professor at Bennett College for Women from 2006 to 2008 and at the University of Maryland Eastern Shore from 2008 to 2009, and an Associate Professor at the Beijing Institute of Petro-Chemical Technology, Beijing, China from 1990 to 2000. His current research interests are computer networks and information security. Dr. Yang has published more than 20 research papers in the area of stepping-stone intrusion detection since 2004. He serves as a reviewer for IEEE Transactions on Signal Processing and for Computers and Security, Elsevier. Guoqing Zhao was born in Shanxi, China, in 1964. He earned his Master's degree in computer engineering at the Beijing Institute of Technology, Beijing, China in 2000, and his Bachelor's degree in science at North West University, Xian, Shanxi, China in 1987. His major field of study is computer science. He is currently working at the Department of Computer Science, Beijing Institute of Petro-Chemical Technology, Beijing, China as an Associate Professor. His current research interests are computer networks and information security.

Mr. Zhao has published more than 10 research papers in the area of computer networks and information security since 2005. Lydia Ray was born in India in 1975. She earned her Ph.D. degree in computer science from Louisiana State University, Baton Rouge, Louisiana in 2005, her M.Stat degree in statistics from the Indian Statistical Institute in 1998, and her B.Sc degree in statistics from the University of Calcutta in 1996. Her major field of study at present is computer science. She is currently working as an Assistant Professor at the TSYS Department of Computer Science, Columbus State University, which she joined in Fall 2006. Her research interests are in sensor networks and security. Shou-Hsuan Stephen Huang was born in Taipei, Taiwan, Republic of China. He received his BS degree in mathematics from National Cheng Kung University, Tainan, Taiwan, R.O.C. in 1973, and his MS degree in mathematics from Southern Illinois University, Carbondale, Illinois in 1976. Dr. Huang received his MS and Ph.D. degrees in computer science from the University of Texas, Austin, Texas in 1979 and 1981 respectively. He is a professor of computer science at the University of Houston, Houston, TX. He has served as Director of Graduate Studies, Associate Chair, and Chairperson of the department at the University of Houston. His research interests include data structures and algorithms, computer security, and intrusion detection. Dr. Huang is a senior member of the IEEE Computer Society and a member of the ACM.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 A. Mutazono, M. Sugano, and M. Murata: Self-organizing Anti-phase Synchronization Scheme for Sensor Networks Inspired by Frogs’ Calling Behavior


Self-organizing Anti-phase Synchronization Scheme for Sensor Networks Inspired by Frogs’ Calling Behavior Akira Mutazono, Masashi Sugano, and Masayuki Murata

Abstract—Research on bio-inspired self-organized methods has been carried out in order to control complex network systems in which dynamic topology changes and increases in the number of nodes are expected. Such research aims at applying the robustness and adaptability of biological systems against environmental changes to sensor networks. In this paper, we focus on the calling behavior of Japanese tree frogs, which call alternately with their neighbors in order to increase the probability of mating. This behavior can be applied to phase control, which realizes collision-free transmission scheduling in wireless communication. We propose a self-organizing scheduling scheme inspired by this frog calling behavior for reliable data transmission in wireless sensor networks. Simulation results show that our proposed phase control method is capable of reducing data transmission failures and improves the data collection ratio by up to 24% compared to a random transmission method. Index Terms—self-organization, anti-phase synchronization, sensor network, pulse-coupled oscillator, simulation.

I. INTRODUCTION

Self-organized control inspired by biological systems has been receiving increasing attention as a concept for the realization of high robustness, scalability, and adaptability [1]. Each component of a biological system makes decisions based on local interactions with its neighbors, without receiving directions from a specific leader. Thus, the entire system can respond to changes in a coordinated manner in spite of the self-oriented behavior of the individual components. Such simple mechanisms bring cognitive functionality to the whole system, and self-organized control provides adaptability and robustness [2]. Methods have been proposed to bring the advantages of biological systems to computer networks in such fields as routing [3] and clustering [4]. In the field of time synchronization, pulse-coupled oscillators (PCO) [5] are known to model the behavior of fireflies, which flash in unison with their neighbors. However, most research on the pulse-coupled oscillator model has focused on simultaneous synchronization [6]. Anti-phase synchronization [7] (alternate phase synchronization) is necessary in cases where several terminals need to share common resources. When several terminals process a task by sharing common resources, the load can be balanced

A. Mutazono and M. Murata are with the Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita-shi, Osaka 565-0871, Japan. E-mail: {a-mutazono, murata}@ist.osaka-u.ac.jp. M. Sugano is with the School of Comprehensive Rehabilitation, Osaka Prefecture University, 3-7-30 Habikino, Habikino-shi, Osaka 583-8555, Japan. E-mail: [email protected].

Fig. 1. Japanese tree frog (Hyla japonica).

by applying round-robin scheduling, where each terminal is processed in turn. Similarly, in wireless communication, anti-phase synchronization of transmission scheduling reduces packet loss caused by collisions. As a possible mechanism for realizing anti-phase synchronization, we consider the calling behavior of Japanese tree frogs [8], especially advertisement calling. It is considered that one of the main purposes of this calling behavior is to attract females. If a male calls simultaneously with other frogs, it becomes difficult for a female to distinguish the caller, and therefore the males adjust the timing of their calls [9, 10]. Figure 2 shows an example of anti-phase synchronization observed in the calling of two frogs; the two frogs avoid call collisions by mutually adjusting their timing. We formulate this advertisement calling behavior using the pulse-coupled oscillator model and apply it to phase control for anti-phase synchronization, as well as to transmission scheduling in wireless communication, with the aim of avoiding transmission failures. Conventional scheduling protocols suffer from the overhead of schedule adjustment and lack adaptability, since the schedule is fixed and cannot be rescheduled in accordance with environmental changes. Self-organizing scheduling based on frog calling is expected to solve these problems. In this paper, we propose a self-organizing transmission scheduling scheme inspired by frog calling behavior. We demonstrate that the phase control can achieve anti-phase synchronization in various environments, and we perform a comparative evaluation with DESYNC [11, 12], another distributed anti-phase synchronization technique. The outline of this paper is as follows: Section II provides the motivation and some related work on anti-phase synchronization. Section III introduces details of the mechanisms



Fig. 2. An example of anti-phase synchronization observed in the calling of two frogs. In the beginning, only one frog was calling; the other frog started calling after 23 seconds. The two frogs avoided call collisions by mutually adjusting their timing. (This sound was recorded by Mr. Ikkyu Aihara of Kyoto University.)

Fig. 3. Hidden terminal problem. Terminals B and C may transmit simultaneously because they do not know of each other's existence.

of the phase control method based on frogs' alternate calling behavior. Section IV shows the results of numerical simulations in single-hop networks. We provide a conclusion and present possible extensions in Section V.

II. ALTERNATE PHASE SYNCHRONIZATION FOR SCHEDULING

Research on time synchronization using the pulse-coupled oscillator model has been performed [13]. Those works aim at adjusting the oscillators' phases in unison; however, synchronization that shifts the phases of the oscillators by certain intervals has not previously been considered in detail. We refer to conventional simultaneous synchronization as in-phase synchronization, and to the alternate synchronization we target as anti-phase synchronization. For instance, in-phase synchronization appears in the simultaneous flashing of fireflies, while anti-phase synchronization is seen in Christmas illuminations where colorful bulbs flash alternately, or in the alternate blinking of crossing lamps. Anti-phase synchronization is effective for sharing resources. Round-robin scheduling, for example, assigns the same time slice to each waiting process in turn, without priority; it is considered fair scheduling, since resources are allocated to all processes equally. In the field of wireless communication, TDMA (Time Division Multiple Access), which divides the access period into fixed slots and assigns them for communication, is also a kind of anti-phase synchronization. In TDMA, since it is not necessary to check the channel, the delay is small and a stable transmission speed can be expected. Furthermore, if anti-phase synchronization is applied to a multi-hop network, collisions in the MAC layer of a wireless sensor network can be avoided. We explain the hidden terminal problem as an example of a MAC-layer collision using Figure 3. When terminal B communicates

to terminal A, collisions do not occur because terminal B checks the channel is free (carrier sense) before transmission. However, when terminal C is added here, terminal B and C can not check the channel properly since they are located out of communication range each other. In such case, when two terminals transmit simultaneously, interference takes place at the point of terminal A and the packet does not reach terminal A correctly. This is the hidden terminal problem which can be serious problems in wireless sensor networks. Interference can be reduced if the terminals in the relation of hidden terminal problem adjust transmission schedule by anti-phase synchronization. There are some studies about anti-phase synchronization. DESYNC [11, 12] is a anti-phase synchronization method in distributed manner proposed by Nagpal et al.. Each node adjusts the firing time considering the last and next firing of itself so that the offsets of firings become equal. Even when there are many nodes, iteration of interactions leads whole network to anti-phase synchronized state. But, adjustment of timing in this method relies on information from only two nodes, this structure is not effective to multi-hop network. Stankovic [14] proposed another anti-phase synchronization method. This method adjusts the firing time for rare event detection considering the distribution of sensing region. However, this method needs a lot of calculation resources for building complex polynomial function and location information of the neighboring nodes is necessary for accurate anti-phase synchronization. PDTD (Phase Diffusion Time Division) [15] is a kind of anti-phase synchronization method that performs in a self-organizing manner. This method solves the hidden terminal problem by performing anti-phase synchronization between nodes within interaction range which is twice as large as communication range. III. 
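The midpoint adjustment used by DESYNC can be sketched as follows. This is a simplified, round-based illustration under assumed parameters (unit firing period, jump size `alpha`), not the authors' implementation: each node moves its firing time toward the midpoint of the firing times of its two phase neighbours on the cycle.

```python
def desync_round(times, alpha=0.5, period=1.0):
    """One synchronous round of a DESYNC-style midpoint rule: every node
    jumps a fraction alpha of the way toward the midpoint of its two
    phase neighbours' firing times (with wraparound on the period)."""
    n = len(times)
    ts = sorted(times)
    updated = []
    for i in range(n):
        prev_t = ts[i - 1] - (period if i == 0 else 0.0)            # wrap backwards
        next_t = ts[(i + 1) % n] + (period if i == n - 1 else 0.0)  # wrap forwards
        midpoint = (prev_t + next_t) / 2.0
        updated.append(((1 - alpha) * ts[i] + alpha * midpoint) % period)
    return sorted(updated)

# four nodes with clustered initial firing times
times = [0.0, 0.1, 0.2, 0.5]
for _ in range(50):
    times = desync_round(times)
# circular gaps between consecutive firing times
gaps = [(times[(i + 1) % 4] - times[i]) % 1.0 for i in range(4)]
```

After a few tens of rounds the four firing times become evenly spaced (all gaps near period/4), which is exactly the anti-phase configuration discussed above. Note that each update uses only two neighbours' firings, which is the property that limits DESYNC in multi-hop settings.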
III. TRANSMISSION SCHEDULING INSPIRED BY FROG CALLING

The outline of phase control is shown in Figure 4. A frog calls by making a sound for a certain period of time, then quiets down before repeating the call. If two or more male frogs call at random, the timing of their calls may overlap. In such a case, the calls interfere with each other and the female frog (the mating partner) cannot distinguish between the callers. Therefore, each male frog shifts the timing of its calls by listening to the calls of the other frogs so as to avoid such overlap. After all frogs establish this interaction pattern, call alternation without interference is achieved within the group.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 A. Mutazono, M. Sugano, and M. Murata: Self-organizing Anti-phase Synchronization Scheme for Sensor Networks Inspired by Frogs’ Calling Behavior


Fig. 4. Outline of phase control, which reduces transmission failures by adjusting the transmission timing: calling slots that initially collide are shifted by phase control until the collisions disappear.

Pulse-coupled oscillators are used as models of various synchronization mechanisms in biology. Here, we formulate frog calling behavior with pulse-coupled oscillators. Each oscillator i has a phase φ_i(t) ∈ [0, 1] which evolves in time with firing frequency ω_i:

dφ_i(t)/dt = ω_i. (1)

When the phase reaches 1, the oscillator fires and the phase returns to its initial value:

φ_i(t) → 0 when φ_i(t) = 1. (2)

An oscillator i that is coupled with a firing oscillator j receives a stimulus and changes the firing frequency of its next cycle in accordance with the phase offset Δ_ij = φ_i − φ_j between the coupled oscillators. The oscillator does not change its firing frequency immediately after receiving the stimulus; instead, it memorizes the size of the stimulus and updates the frequency after its own next firing:

ω_i = ω + Δ(Δ_ij), (3)

where Δ(·) is the phase shift function, which generates a repulsive force shifting the phase away from those of the other oscillators. Aihara et al. [16] suggested the following phase shift function:

Δ(Δ_ij) = α sin(2π Δ_ij), (4)

where α is the coupling coefficient of the pulse-coupled oscillator model. When 0 < Δ_ij < 1/2, then Δ(Δ_ij) > 0 and oscillator i raises its firing frequency to extend the phase offset with respect to oscillator j. On the contrary, when 1/2 < Δ_ij < 1, then Δ(Δ_ij) < 0 and oscillator i lowers its firing frequency in order to widen the phase offset with respect to oscillator j. After these interactions, two oscillators are assumed to be in a stable anti-phase synchronized state when the conditions of Eqs. (5) and (6) are fulfilled (Figure 5):

Δ_12 = Δ_21 = 1/2, (5)

ω_1 = ω_2 = ω. (6)

Fig. 5. Phase control mechanism. (a) Each oscillator has its own phase and firing frequency. (b) An oscillator that receives a positive stimulus raises its firing frequency; an oscillator that receives a negative stimulus lowers it. (c) After iterated interactions, the phase offsets between the oscillators become equal and anti-phase synchronization is realized.

We then consider a group G in which N oscillators are all coupled with each other. When oscillator i (i = 1, 2, …, N) fires at time t, it changes its firing frequency ω_i as follows:

ω_i = ω + Δω_i, (7)

Δω_i = Σ_{j ∈ G, j ≠ i} Δ(Δ_ij). (8)

When the phase offsets between consecutively firing oscillators are all equal and the repulsive forces on all oscillators cancel, the group is assumed to be in a stable anti-phase synchronized state. These conditions, which generalize the two-oscillator case, are:

Δ_12 = Δ_23 = … = Δ_N1 = 1/N, (9)

Δω_i = 0 (i = 1, 2, …, N). (10)

It is confirmed that two or three oscillators can be anti-phase synchronized with the phase shift function of Eq. (4) (Figure 6(a), (b)). However, this function cannot anti-phase synchronize four or more oscillators, since they split into a group of two oscillators and a group of three oscillators (Figure 6(c), (d)). The cause is that the phase shift function is symmetric: the repulsive forces cancel, and the oscillators converge to a stable state, in situations where condition (10) is satisfied even though condition (9) is not. To resolve this problem, the stimulus needs to be weighted by the phase distance δ: the smaller the phase distance δ between coupled oscillators, the stronger the stimulus they should receive. For this reason, we adopt the following definition:

δ_ij = min(Δ_ij, 1 − Δ_ij). (11)



Fig. 6. Difficulty of anti-phase synchronization (panels (a)-(d)). Four or more oscillators are divided into a group of two oscillators and a group of three oscillators, and each group is anti-phase synchronized internally.
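The grouping failure illustrated in Fig. 6 is what motivates weighting the stimulus by the phase distance δ of Eq. (11), as used in Eq. (12). The following sketch exercises the reconstructed functions; the exponential weighting form is an assumption recovered from the surrounding description, not a verbatim formula from the paper:

```python
import math

ALPHA = 0.1  # coupling coefficient (illustrative value)

def phase_shift(offset):
    """Basic phase shift function (Eq. (4)): positive (speed up) for
    offsets below 1/2, negative (slow down) for offsets above 1/2."""
    return ALPHA * math.sin(2 * math.pi * offset)

def phase_distance(offset):
    """Phase distance (Eq. (11)): distance between phases on the unit circle."""
    return min(offset, 1.0 - offset)

def weighted_phase_shift(offset):
    """Distance-weighted stimulus in the spirit of Eq. (12): the closer
    the pair (small delta), the stronger the repulsive stimulus."""
    return math.exp(-phase_distance(offset)) * phase_shift(offset)
```

With the weighting, a pair at offset 0.1 receives a noticeably stronger push than a pair near the ideal spacing, which is the property that lets more than three oscillators spread out evenly.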

Fig. 7. Transition of relative phase offset over time (relative phase offset vs. time [sec]).

Δ(Δ_ij) = α e^(−δ_ij) sin(2π Δ_ij). (12)

By using the phase shift function of Eq. (12), conditions (9) and (10) are satisfied simultaneously, regardless of the number of oscillators. Figure 7 shows the process of anti-phase synchronization among 10 oscillators. The phases of the oscillators, dispersed at random in the initial state, are shifted to an anti-phase synchronized state through interactions between coupled oscillators. The phase offset between consecutive oscillators becomes approximately equal at time t = 1.0 s. After this point, although each oscillator still receives stimuli, positive and negative stimuli cancel each other out, and the group maintains a stable state.

IV. EVALUATION

A. Simulation setup

In the simulations, sensor nodes are deployed randomly in a monitoring region with a radius of 10 m, and the timing of data transmission is determined on the basis of a phase that is assigned randomly in the initial state. The communication range of a node is assumed to be 20 m, so the nodes can communicate with all other nodes in the network. Each node carries out sensing every 0.16 seconds and transmits the sensed data to the sink node at a transmission speed of 50 kbps. CSMA/CA is used as the MAC-layer transmission protocol. Data packets include the sensing information and a time stamp, which represents the delay caused by the back-off of CSMA/CA. The packet size is set to 400 bits. Transmitting one data packet therefore takes 8 ms, and the transmitting node takes exclusive control of the communication band during that period.

We use the following evaluation metrics.
• Average Error: the average phase offset error between nodes. The smaller the average error, the higher the accuracy of the synchronization.
• Transmission Failure Probability: the probability of transmission failure caused by back-off overflow in CSMA/CA during a node's communication attempt.
• Data Collection Ratio: the ratio of the number of data packets reaching the sink to the overall number of data packets sent toward the sink by the nodes.

B. Performance of the proposed phase control mechanism

We evaluate the coupling coefficient α, an important parameter of the pulse-coupled oscillator model. In order to obtain suitable parameter settings for a given number of nodes, we estimate the average error after a certain period (20 seconds). Figure 8(a) shows that an average error of 10^-2 is the boundary value for synchronization: if the error falls below this value, the accuracy of synchronization keeps improving with time; otherwise the phases keep fluctuating and do not converge to a stable state. A wide range of coupling coefficients allows the network to reach a stable state when the number of nodes is small, while convergence becomes difficult if the coupling coefficient is too large. This is an effect of the number of coupled nodes: the stimulus becomes stronger as a node is coupled with more nodes and as the coupling coefficient grows. Hence, the simulation shows that overstimulation disturbs convergence to a stable state. Conversely, although small values of the coupling coefficient require longer synchronization times, the system approaches a stable state in a steady manner (Figure 8(b)). These results indicate that anti-phase synchronization requires the coupling coefficient to be set adaptively. The choice of coupling coefficient also depends on the requirements of the particular application: for example, a small coupling coefficient suits delay-tolerant applications, while a large coupling coefficient suits applications that tolerate lower synchronization accuracy. The number of nodes and the data transmission interval also affect the choice. Various factors should thus be considered when setting the coupling coefficient, and these factors change constantly. Therefore, a static coupling coefficient is not sufficient; the parameter should be set dynamically for each node, in a self-organizing manner, in accordance with the number of nodes and the amount of traffic. As this problem is beyond the scope of this work, it is left for future study.

Fig. 8. Evaluation of the coupling coefficient. (a) Relation between the coupling coefficient (α) and the lower bound of the average error for 4, 10 and 15 nodes. (b) Transition of the average error with 10 nodes for coupling coefficient settings α = 0.01, 0.1 and 0.2.

The phase control method also requires scalability in the number of nodes. We evaluate networks in which 4, 10 or 20 nodes are deployed, using a coupling coefficient of 0.06 in all three cases. The transition of the average error over time is shown in Figure 9. A small number of nodes is synchronized within a short period of time. When the number of nodes increases to 10, an equal phase offset is still formed and the nodes synchronize through their interactions, although reaching the synchronized state takes longer than with 4 nodes. However, 20 nodes cannot be synchronized: the average error cannot be controlled sufficiently, so it keeps fluctuating and the oscillation does not converge to a stable state. The reason for this failure is as follows. In this simulation, a node transmits 400 bits of data to the sink at 50 kbps every 160 ms. The transmission of one data packet requires 400 [bits] / 50 [kbps] = 8 [ms]. Since the transmission interval is 160 ms and the time slot is 8 ms, perfect anti-phase synchronization provides alternate transmission for up to 20 nodes. Such a situation is difficult to realize in practice, however, and transmission failures inevitably occur during the synchronization process (Figure 9(b)). A transmission failure prevents the node from broadcasting its firing information; consequently, phase control is not performed properly and the average error increases. The iteration of this process leads to the failure of anti-phase synchronization in the case of 20 nodes. Thus, the number of transmission nodes that can be anti-phase synchronized is constrained by the access period.

Fig. 9. Influence of the number of nodes (4, 10 and 20) on anti-phase synchronization. (a) Transition of average error. (b) Transition of transmission failure probability.

C. Robustness against Perturbations

The reason for adopting a biologically inspired system in this method is its robustness against perturbations. In the wireless communication of sensor networks, radio waves are shadowed by obstacles and fade as a result of interference. A node's energy can be depleted, and a node may cease to function due to such unexpected failures; furthermore, nodes can be added to the network to replace failed ones. In this section, we regard packet loss and the changes in topology induced by the addition and failure of nodes as perturbations, and show that the self-organized anti-phase synchronization method is robust against them.

The influence of packet loss on the average error and on transmission failures is shown in Figure 10. In this simulation, packets are dropped at random according to the packet loss rate and do not reach their destination. In an environment where packet loss hardly ever occurs, each node adjusts its phase to a suitable interval from the other nodes and precise anti-phase synchronization is achieved. Although a few transmission failures appear, this is acceptable, since the nodes have random phases in the initial condition. Even though the synchronization accuracy falls as the packet loss rate increases, the phase offset among nodes at a packet loss rate of 10^-2 is maintained at an acceptable level and data transmission is carried out without failure. Figure 11 shows the results under this condition. The phases move with some fluctuation owing to failures of phase control caused by packet loss (Figure 11(a)); nevertheless, the nodes shift their phases and maintain the synchronized state while being affected by packet loss. In an environment where packet loss occurs frequently (packet loss rate = 10^-1), the nodes cannot achieve sufficient interactions with neighboring nodes for stable anti-phase synchronization, and overlapping phases lead to transmission failures. Although performance is not perfect when packet loss occurs very often, the proposal shows robustness against packet loss.

Fig. 10. Influence of packet loss: average error and number of transmission failures vs. packet loss rate.

Fig. 11. Performance of the network at a packet loss rate of 10^-2: (a) transition of relative phase offset; (b) transition of average error and transmission failure probability.

This uniform dependence on local information is what gives the self-organizing method its robustness against packet loss. In centralized control, by contrast, the influence of packet loss becomes large, since a node on a lower layer of the hierarchy decides its operation based on information from nodes on higher layers. Several countermeasures against packet loss are known, such as ACK (ACKnowledgement), in which a receiving node replies with a reception confirmation to the transmitting node, and FEC (Forward Error Correction), which performs error correction; however, these methods have drawbacks such as an increase in control packets and additional delay. The local, non-hierarchical exchange of information in self-organizing control yields robustness against packet loss without resorting to such measures.

Subsequently, we confirm that the proposed method restores the anti-phase synchronized state by performing phase control after the addition or failure of nodes. Three nodes with random phases are added to the network at 20 seconds, and three nodes fail at 50 seconds after initiation. Figure 12



shows that the 10 nodes with random phases in the initial state immediately converge to a stable anti-phase synchronization state. At 20 seconds, the synchronized state is destroyed by the addition of the new nodes and the average error grows. As the transmissions of the added nodes are at first almost simultaneous with existing ones, transmission failures arise when carrier sensing extends beyond the maximum back-off time of CSMA/CA (Figure 12(b)). However, the nodes adjust their phases in a self-organizing manner in response to the additions, and the anti-phase synchronized state is restored within a short period of time. The same behavior is confirmed in the case of node failures. Self-organized control owes such robustness against topology changes to its intrinsic reliance on local interactions. In centralized control, if a node that plays an important role (such as a cluster head) fails, ordinary nodes stop functioning properly because they no longer receive orders from the central node. In self-organized control, on the other hand, the task is distributed equally among the nodes, and the system is not exposed to the risks associated with centralized control.

Fig. 12. Influence caused by a change of topology (addition and failure of nodes): (a) transition of relative phase offset; (b) transition of average error and transmission failure probability.

Fig. 13. Comparison of data collection ratio: data collection ratio vs. messages per second for Proposal, DESYNC, Random and Ideal TDMA.

D. Comparison with other schemes

In order to understand the features of the proposed method, we perform a comparative evaluation against three other schemes. DESYNC [11] is a distributed anti-phase synchronization scheme that achieves a synchronized state by adjusting the phase on the basis of information from two coupled nodes. Random assigns random transmission timings to the nodes, while Ideal TDMA uses an ideal value that provides optimal scheduling. The MAC layer of the first two methods (DESYNC and Random) is based on CSMA/CA, and the same topology is used in all simulations. Figure 13 shows the influence of traffic, namely the number of data messages generated by each node, on each scheme. The proposed method achieves a high data collection ratio under low traffic by reducing the number of data transmission failures. As the traffic increases, the data collection ratio decreases because transmissions fail when the traffic exceeds the width of the access period. Although the proposed method does not reach the ideal value under such excessive traffic, it maintains a higher data collection ratio than the random control method; the difference is mainly due to the choice of coupling coefficient. An advantage of the proposed method is the feasibility of extension to multi-hop networks, since its stimulus arrives from all nodes, while in DESYNC it arrives from only two nodes. The comparison between self-organizing and distributed control is a crucial point in terms of synchronization stability, extensibility and robustness, and will be examined in future work.

V. CONCLUSION AND POSSIBLE EXTENSIONS

Robustness, adaptability and scalability are essential features for managing complex and diverse networks. In this paper, we introduced a self-organizing scheduling scheme inspired by frog calling behavior as a method for fulfilling such requirements. We performed evaluations through computer simulations of a phase control method, inspired by the alternate calling behavior of Japanese tree frogs, in a single-hop network. The simulation results showed that phase control reduces transmission failures by applying anti-phase synchronization, regardless of the number of nodes. In addition, robustness against packet loss and changes in topology was confirmed: stable anti-phase synchronization was maintained through adaptive responses to perturbations.

Research on anti-phase synchronization is a relatively new field, and several factors remain to be explored. In order to prove the feasibility of convergence to a stable anti-phase synchronized state, a mathematical analysis of the synchronization stability of the phase shift function is necessary. The phase control mechanism should also be improved to extend it to multi-hop networks, which additionally requires considering the hidden terminal problem. A comparative evaluation of the proposed method against distributed methods from the viewpoint of information transmission would highlight the benefits of both approaches.

ACKNOWLEDGMENT

This research was partly supported by the "Global COE (Centers of Excellence) Program" of the Ministry of Education, Culture, Sports, Science and Technology, Japan, and the Sekisui Chemical Grant Program for Research on Manufacturing Based on Learning from Nature.

REFERENCES

[1] H. Kitano, "Biological robustness," Nature Reviews Genetics, vol. 5, no. 11, pp. 826-837, Nov. 2004.
[2] F. Dressler, Self-Organization in Sensor and Actor Networks. Wiley, Jan. 2007.
[3] Y. Zhang, L. Kuhn, and M. Fromherz, "Improvements on ant routing for sensor networks," in Proceedings of the Fourth International Workshop on Ant Colony Optimization and Swarm Intelligence, Jul. 2004, pp. 154-165.
[4] H. Chan and A. Perrig, "ACE: an emergent algorithm for highly uniform cluster formation," in Proceedings of the First European Workshop on Wireless Sensor Networks (EWSN 2004), Jan. 2004, pp. 154-171.
[5] R. E. Mirollo and S. H. Strogatz, "Synchronization of pulse-coupled biological oscillators," SIAM Journal on Applied Mathematics, vol. 50, no. 6, pp. 1645-1662, Dec. 1990.
[6] A. Mutazono, M. Sugano, and M. Murata, "Evaluation of robustness in time synchronization for sensor networks," in Proceedings of the 2nd International Conference on Bio-Inspired Models of Network, Information, and Computing Systems (BIONETICS 2007), Dec. 2007, pp. 89-92.
[7] L.-Y. Cao and Y.-C. Lai, "Antiphase synchronism in chaotic systems," Physical Review E, vol. 58, no. 1, pp. 382-386, Jul. 1998.
[8] I. Aihara, S. Horai, H. Kitahata, K. Aihara, and K. Yoshikawa, "Dynamical calling behaviors experimentally observed in Japanese tree frogs (Hyla japonica)," IEICE Transactions on Fundamentals, vol. E90-A, pp. 2154-2161, 2007.
[9] W. E. Duellman and L. Trueb, Biology of Amphibians. Johns Hopkins University Press, Mar. 1994.
[10] H. C. Gerhardt and F. Huber, Acoustic Communication in Insects and Anurans: Common Problems and Diverse Solutions. University of Chicago Press, Jul. 2002.
[11] J. Degesys, I. Rose, A. Patel, and R. Nagpal, "DESYNC: Self-organizing desynchronization and TDMA on wireless sensor networks," in Proceedings of the 6th International Conference on Information Processing in Sensor Networks (IPSN 2007), Apr. 2007, pp. 11-20.
[12] J. Degesys and R. Nagpal, "Towards desynchronization of multi-hop topologies," in Proceedings of the Second IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO 2008), Oct. 2008, pp. 129-138.
[13] M. B. H. Rhouma and H. Frigui, "Self-organization of pulse-coupled oscillators with application to clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 2, pp. 180-195, Feb. 2001.
[14] Q. Cao, T. Abdelzaher, T. He, and J. Stankovic, "Towards optimal sleep scheduling in sensor networks for rare-event detection," in Proceedings of the 4th International Symposium on Information Processing in Sensor Networks (IPSN 2005), Apr. 2005.
[15] K. Sekiyama, Y. Kubo, S. Fukunaga, and M. Date, "Phase diffusion time division method for wireless communication network," in Proceedings of the 30th Annual Conference of the IEEE Industrial Electronics Society (IECON 2004), vol. 3, Nov. 2004.
[16] I. Aihara, H. Kitahata, K. Yoshikawa, and K. Aihara, "Mathematical modeling of frogs' calling behavior and its possible application to artificial life and robotics," Artificial Life and Robotics, vol. 12, no. 1, pp. 29-32, Mar. 2008.

Akira Mutazono received B.E. and M.E. degrees from Osaka University in 2007 and 2009, respectively. His research interests include synchronization and biology-inspired control for sensor network systems.

Masashi Sugano received M.E. and D.E. degrees from Osaka University. In 1988, he joined Mita Industrial Co., Ltd. (currently Kyocera Mita Corporation). From 1996 to 2003, he was an Associate Professor at Osaka Prefecture College of Health Sciences. In 2003, he moved to the Faculty of Comprehensive Rehabilitation, Osaka Prefecture College of Nursing. From 2005 to 2009, he was with the School of Comprehensive Rehabilitation, Osaka Prefecture University, where he has been a Professor since April 2009. His research interests include performance evaluation of computer networks and sensor networks. He is a member of IEEE, ACM, IEICE, and IPSJ.

Masayuki Murata received M.E. and D.E. degrees from Osaka University. In 1984, he joined the Tokyo Research Laboratory, IBM Japan. From 1987 to 1989, he was an Assistant Professor at the Computation Center, Osaka University. In 1989, he moved to the Department of Information and Computer Sciences, Faculty of Engineering Science. From 1992 to 1999, he was an Associate Professor in the Graduate School of Engineering Science, and since April 1999 he has been a Professor at Osaka University. He moved to the Graduate School of Information Science and Technology in 2004. He has published more than three hundred papers in international and domestic journals and conferences. His research interests include computer communication networks, and performance modeling and evaluation. He is a member of IEEE, ACM, The Internet Society, IEICE, and IPSJ.

ISAST Transactions on Computers and Intelligent Systems, No. 2, Vol. 1, 2009 N. Auluck: Improving the Schedulability of Hybrid Real Time Heterogeneous Network of Workstations (NOWs)


Improving the Schedulability of Hybrid Real-Time Heterogeneous Network of Workstations (NOWs)

Nitin Auluck

Abstract— This paper proposes a duplication-based algorithm for the scheduling of precedence-related hybrid tasks on a network of workstations. The tasks are hybrid in that they consist of real-time tasks (with hard and soft deadlines) and non-real-time tasks. The workstations are heterogeneous in that each task can have potentially diverse execution times on different workstations. The proposed algorithm uses selective task duplication, which enables some tasks to have earlier start times (and hence earlier finish times). As a result, an increased number of subtasks finish execution before their deadlines, which increases the schedulability of the real-time application. Task duplication has been used in the past for reducing the schedule length, albeit for non-real-time systems. In addition, the algorithm is capable of scheduling the application even if a sufficient number of processors is not available. Based on extensive simulation results, it has been observed that the proposed algorithm offers a better success ratio than the existing scheduling algorithms in the literature when there is a significant amount of communication in the system.

Index Terms— Heterogeneous Computing, Integrated Scheduling, Precedence Constraints, Subtask Duplication.

I. INTRODUCTION

Real-time systems are defined as ones in which the timeliness of the availability of the results is as important as their correctness [15]. Based on the seriousness of deadline misses, real-time tasks can broadly be divided into two categories: hard real-time and soft real-time. In a hard real-time system, no lateness is acceptable under any circumstances; even one deadline miss can have catastrophic consequences. It may very well be the case that an application consists of a mix of hard, soft and non-real-time tasks [2], [7], [8], [13], [17]. One example of such a hybrid system is Unmanned Aerial Vehicles (UAVs) such as the Aerosonde robotic aircraft [11]. Scheduling in real-time systems has been a very well researched problem.

Manuscript received on November 3, 2009; accepted on November 16, 2009. N. Auluck is with the Department of Computer Science, Quincy University, Quincy, IL 62301. Email: [email protected]

Various researchers have studied this problem for the single-processor case [10, 12, 14]. Papers [1, 16, 18] address the scheduling of real-time tasks on homogeneous, identical multiprocessors. In contrast, real-time scheduling on heterogeneous multiprocessors has received little attention. Recent papers [5, 6] extend the popular EDF algorithm to a set of heterogeneous multiprocessors; however, they assume that the real-time tasks are independent. Task duplication has been used in real-time scheduling before, albeit for fault-tolerance purposes [9, 19]. We have previously used task duplication as a tool for improving the schedulability of a real-time system consisting of hard real-time tasks [4]. In this paper, the concept is extended to a hybrid system consisting of hard, soft and non-real-time tasks.

The integrated scheduling of hard and soft real-time tasks on multiprocessors has been discussed in [13, 17]. However, these works assume that all processors are identical. Hence, those algorithms will not work for heterogeneous multiprocessor systems, in which each task can have diverse execution costs on different processors, something that our proposed algorithm is capable of handling. The aim of this research is to propose a scheduling algorithm that improves the schedulability of a hybrid real-time system consisting of hard, soft and non-real-time tasks on a heterogeneous network of workstations.

The rest of this paper is organized as follows. Section 2 describes the proposed scheme. The simulation results are discussed in Section 3. Section 4 concludes the paper.

II. PROPOSED SCHEME

This section discusses the proposed scheme IDRTSA (Integrated Duplication-based Real-Time Scheduling Algorithm). The input for the algorithm is the real-time application RT, which consists of the set of hard real-time tasks H, the set of soft real-time tasks S and the set of non-real-time tasks N. These tasks are assumed to be independent. Each task has its own set of subtasks (such as h1, h2 for H; s1, s2 for S; n1, n2 for N). The subtasks are precedence related; hence, a subtask cannot start execution until it has received the results from all its predecessor subtasks. The set of heterogeneous workstations is depicted by W =



{W1, W2, …, Wm}, where m (≥ 1) denotes the number of workstations in the system.

The period of each hard task is denoted by pi. The semantics assumed is that one instance of all subtasks of a task needs to execute every period. The deadline of a task is given by di; the deadline of a periodic task is assumed to be at the end of its period. The schedule is generated for a length equal to the hyper-period (hp), which is defined as the least common multiple of all the task periods.

Input: RT = {H, S, N}, W.
Output: real-time schedule.
1.  partition RT into sets of hard, soft and non real-time tasks.
2.  calculate the hyper-period (hp) from the periods.
3.  if (hp is large)
4.    modify selected periods by a small amount so that hp is reduced.
5.  for (each hi in H) do
6.    calculate est, ect, fpred, fp, last, lact and p_e;
7.    form the queue;
8.    perform cluster generation;
9.    schedule tasks on proper workstations;
10. for (each si in S) do
11.   repeat steps 6 – 8;
12. for (each ni in N) do
13.   repeat steps 6 – 8;
14. if(AW
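The hyper-period used by IDRTSA is the least common multiple of the task periods; a minimal sketch follows (the period values are illustrative, not taken from the paper):

```python
from math import gcd
from functools import reduce

def hyper_period(periods):
    """Least common multiple of all task periods: the schedule length."""
    return reduce(lambda a, b: a * b // gcd(a, b), periods)

# e.g. hard tasks with periods 4, 6 and 10 time units
hp = hyper_period([4, 6, 10])
```

A large hp blows up the schedule table, which is why the algorithm nudges selected periods to shrink it when hp is large: for example, changing a period of 7 to 8 in the set {4, 8, 7} reduces hp from 56 to 8.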