NLq theory: checking and imposing stability of recurrent neural networks for nonlinear modelling

J.A.K. Suykens, J. Vandewalle, B. De Moor

Katholieke Universiteit Leuven, Dept. of Electr. Eng., ESAT-SISTA Kardinaal Mercierlaan 94, B-3001 Leuven (Heverlee), Belgium Tel: 32/16/32 18 02 Fax: 32/16/32 19 70 E-mail: johan.suykens,joos.vandewalle,[email protected] (Author for correspondence: J. Suykens) to appear in IEEE-SP (special issue: NNs for SP), SP EDICS - 6.1.2

Abstract

It is known that many discrete time recurrent neural networks, such as e.g. neural state space models, multilayer Hopfield networks and locally recurrent globally feedforward neural networks, can be represented as NLq systems. Sufficient conditions for global asymptotic stability and input/output stability of NLq systems are available, including three types of criteria: diagonal scaling and criteria depending on diagonal dominance and condition number factors of certain matrices. In this paper, it is discussed how Narendra's dynamic backpropagation procedure, used for identifying recurrent neural networks from I/O measurements, can be modified with an NLq stability constraint in order to ensure globally asymptotically stable identified models. An example illustrates how system identification of an internally stable model, corrupted by process noise, may lead to unwanted limit cycle behaviour and how this problem can be avoided by adding the stability constraint.

Keywords. multilayer recurrent neural networks, NLq systems, global asymptotic stability, LMIs, dynamic backpropagation

This research work was carried out at the ESAT laboratory and the Interdisciplinary Center of Neural Networks ICNN of the Katholieke Universiteit Leuven, in the framework of the Belgian Programme on Interuniversity Poles of Attraction, initiated by the Belgian State, Prime Minister's Office for Science, Technology and Culture (IUAP-17, IUAP-50), and in the framework of the Concerted Action Project MIPS (Model-based Information Processing Systems) of the Flemish Community, ICCoS (Identification and Control of Complex Systems), the Human Capital and Mobility Network (SIMONET) and SCIENCE-ERNSI (SC1-CT92-0779).


1 Introduction

Recently, NLq theory has been introduced as a model-based neural control framework with global asymptotic stability criteria [20, 24]. It consists of recurrent neural network models and controllers in state space form, for which the closed-loop system can be represented in so-called NLq system form. NLq systems are discrete time nonlinear state space models with q layers of alternating linear and static nonlinear operators that satisfy a sector condition. It has been shown how Narendra's dynamic backpropagation, classically used to learn a controller to track a set of specific reference inputs, can be modified with NLq stability constraints. Furthermore, several types of nonlinear behaviour, including systems with a unique equilibrium, multiple equilibria, (quasi)-periodic behaviour and chaos, have been stabilized and controlled using the stability criteria [20].

In this paper we focus on nonlinear modelling applications of NLq theory, instead of control applications. As for tracking problems, where Narendra's dynamic backpropagation [9, 10] has been modified with a closed-loop stability constraint [20], we will modify dynamic backpropagation for system identification with a stability constraint in order to obtain identified recurrent neural networks that are globally asymptotically stable. For linear filters (e.g. IIR filters [16, 17]) this has also been an important issue. We consider the class of discrete time recurrent neural networks that is representable as NLq systems. Examples are e.g. neural state space models and locally recurrent globally feedforward neural networks, which are models with global and local feedback respectively.

In order to check stability of identified models, sufficient conditions for global asymptotic stability of NLq systems are applied [20]. A first condition is called diagonal scaling, which is closely related to diagonal scaling criteria in robust control theory [3, 13]. Checking stability can then be formulated as an LMI (Linear Matrix Inequality) problem, which corresponds to solving a convex optimization problem. A second condition is based on diagonal dominance of certain matrices. Certain results on digital filters with overflow characteristic [8] can be considered as a special case for q = 1 (one layer NLq). Finally, we demonstrate how global asymptotic stability can be imposed on the identified models. This is done by modifying dynamic backpropagation with stability constraints. Besides the diagonal scaling condition, criteria based on condition numbers of certain matrices [20] are proposed for this purpose. In many applications one indeed has the a priori knowledge that the true system is globally asymptotically stable, or one is interested in a stable approximator. It is illustrated with an example that process noise can indeed cause identified models that show limit cycle behaviour, instead of the global asymptotic stability of the true system. We show how this problem can be avoided by applying the modified dynamic backpropagation algorithm.

This paper is organized as follows. In Section 2 we present two examples of discrete time recurrent neural networks that are representable as NLq systems: neural state space models and locally recurrent globally feedforward neural networks. In Section 3 we review the classical dynamic backpropagation paradigm. In Section 4 we discuss sufficient conditions for global asymptotic stability of identified models. In Section 5 Narendra's dynamic backpropagation is modified with NLq stability constraints. In Section 6 an example is given for a system corrupted by process noise.

2 NLq systems

The following discrete time nonlinear state space model is called an NLq system [20]:
$$\begin{cases}
p_{k+1} = \Gamma_1\big(V_1\,\Gamma_2(V_2\cdots\Gamma_q(V_q\,p_k + B_q w_k)\cdots + B_2 w_k) + B_1 w_k\big) \\
e_k = \Lambda_1\big(W_1\,\Lambda_2(W_2\cdots\Lambda_q(W_q\,p_k + D_q w_k)\cdots + D_2 w_k) + D_1 w_k\big)
\end{cases} \tag{1}$$
with state vector $p_k \in \mathbb{R}^{n_p}$, input vector $w_k \in \mathbb{R}^{n_w}$ and output vector $e_k \in \mathbb{R}^{n_e}$. The matrices $V_i$, $B_i$, $W_i$, $D_i$ ($i=1,...,q$) are constant with compatible dimensions; the matrices $\Gamma_i = \mathrm{diag}\{\gamma_i\}$, $\Lambda_i = \mathrm{diag}\{\lambda_i\}$ ($i=1,...,q$) are diagonal with diagonal elements $\gamma_i(p_k,w_k), \lambda_i(p_k,w_k) \in [0,1]$ for all $p_k, w_k$. The term 'NLq' stands for the alternating sequence of linear and nonlinear operations in the q-layered system description. In this Section we first explain the link between NLq systems and multilayer Hopfield networks and then discuss two examples: neural state space models and locally recurrent globally feedforward networks, which are models with global feedback and local feedback respectively. Other examples of NLq systems are e.g. (generalized) cellular neural networks, the discrete time Lur'e problem and linear fractional transformations with a real diagonal uncertainty block [20, 23].
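To make the layered structure of (1) concrete, the following minimal sketch simulates one step of an NLq system when the diagonal operators are induced by tanh nonlinearities (which satisfy the sector [0,1] condition); the function and variable names are illustrative and not taken from the paper.

```python
import numpy as np

def nlq_step(p, w, V, B, W, D):
    """One step of the NLq system (1), with tanh standing in for the
    sector-[0,1] operators: applying Gamma_i to its argument x is realised
    here as tanh(x), i.e. Gamma_i = diag(tanh(x)/x) along the trajectory.
    V, B, W, D are lists of the q layer matrices."""
    q = len(V)
    s = np.tanh(V[-1] @ p + B[-1] @ w)        # innermost layer q
    for i in range(q - 2, -1, -1):            # layers q-1, ..., 1
        s = np.tanh(V[i] @ s + B[i] @ w)
    t = np.tanh(W[-1] @ p + D[-1] @ w)
    for i in range(q - 2, -1, -1):
        t = np.tanh(W[i] @ t + D[i] @ w)
    return s, t                               # (p_{k+1}, e_k)
```

For q = 1 the loops are empty and (1) reduces to a single nonlinear layer, the form that appears again in the digital filter and Hopfield examples below.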

2.1 Multilayer Hopfield networks

NLq systems are closely related to multilayer recurrent neural networks of the form
$$\begin{cases}
p_{k+1} = \sigma_1\big(V_1\,\sigma_2(V_2\cdots\sigma_q(V_q\,p_k + B_q w_k)\cdots + B_2 w_k) + B_1 w_k\big) \\
e_k = \tau_1\big(W_1\,\tau_2(W_2\cdots\tau_q(W_q\,p_k + D_q w_k)\cdots + D_2 w_k) + D_1 w_k\big)
\end{cases} \tag{2}$$
with $\sigma_i(\cdot)$, $\tau_i(\cdot)$ vector valued nonlinear functions that belong to sector $[0,1]$ [28].

Let us illustrate the link between (1) and (2) for the autonomous NLq system (zero external input)
$$p_{k+1} = \Big(\prod_{i=1}^{q} \Gamma_i(p_k)\,V_i\Big)\, p_k$$
by means of the following autonomous Hopfield network with synchronous updating:
$$x_{k+1} = \tanh(W x_k). \tag{3}$$
This can be written as
$$x_{k+1} = \Gamma(x_k)\, W x_k \tag{4}$$
with $\Gamma = \mathrm{diag}\{\gamma_i\}$ and $\gamma_i = \tanh(w_i^T x_k)/(w_i^T x_k)$, which follows from the elementwise notation
$$x_i := \tanh\Big(\sum_j w_{ij} x_j\Big) = \frac{\tanh(\sum_j w_{ij} x_j)}{\sum_j w_{ij} x_j}\, \sum_j w_{ij} x_j = \gamma^{ii} \sum_j w_{ij} x_j. \tag{5}$$
The time index is omitted here because of the assignment operator ':='. The notation $\gamma^{ii}$ means that this corresponds to the diagonal matrix $\Gamma(x_k)$. In case $w_i^T x_k = 0$, de l'Hospital's rule can be applied or a Taylor expansion of $\tanh(\cdot)$ can be taken, leading to $\gamma_i = 1$. In a similar way the multilayer Hopfield neural network
$$x_{k+1} = \tanh(V \tanh(W x_k)) \tag{6}$$
can be written as
$$x_{k+1} = \Gamma_1(x_k)\, V\, \Gamma_2(x_k)\, W x_k \tag{7}$$
because
$$x_i := \tanh\Big(\sum_j v_{ij} \tanh\big(\sum_l w_{jl} x_l\big)\Big) = \tanh\Big(\sum_j v_{ij}\,\gamma_2^{jj} \sum_l w_{jl} x_l\Big) = \gamma_1^{ii} \sum_j v_{ij}\,\gamma_2^{jj} \sum_l w_{jl} x_l. \tag{8}$$
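The equivalence between (3) and (4), including the limit $\gamma_i = 1$ at a zero argument, is easy to check numerically; the sketch below is illustrative only (names are not from the paper).

```python
import numpy as np

def gamma_matrix(W, x, eps=1e-12):
    """Diagonal matrix Gamma(x) of eq. (4): gamma_i = tanh(w_i^T x)/(w_i^T x),
    with gamma_i = 1 when w_i^T x -> 0 (the l'Hospital limit discussed in the text)."""
    z = W @ x
    safe = np.where(np.abs(z) < eps, 1.0, z)        # avoid 0/0
    g = np.where(np.abs(z) < eps, 1.0, np.tanh(safe) / safe)
    return np.diag(g)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
x = rng.standard_normal(4)
# Gamma(x) W x reproduces tanh(W x), i.e. (4) rewrites (3) exactly
assert np.allclose(gamma_matrix(W, x) @ W @ x, np.tanh(W @ x))
```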

2.2 Neural state space models

Neural state space models (Fig.1) for nonlinear system identification have been introduced in [19] and are of the form:
$$\begin{cases}
\hat{x}_{k+1} = W_{AB}\,\tanh(V_A \hat{x}_k + V_B u_k + \beta_{AB}) + K \epsilon_k \\
y_k = W_{CD}\,\tanh(V_C \hat{x}_k + V_D u_k + \beta_{CD}) + \epsilon_k
\end{cases} \tag{9}$$
with estimated state vector $\hat{x}_k \in \mathbb{R}^n$, input vector $u_k \in \mathbb{R}^m$, output vector $y_k \in \mathbb{R}^l$ and zero mean white Gaussian noise input $\epsilon_k$ (corresponding to the prediction error $y_k - \hat{y}_k$). The $W$, $V$ are interconnection matrices with compatible dimensions, the $\beta$ are bias vectors and $K$ is a steady state Kalman gain. If the multilayer perceptrons in the state equation and output equation are replaced by linear mappings, the neural state space model corresponds to a Kalman filter. Defining $p_k = \hat{x}_k$, $w_k = [u_k; \epsilon_k; 1]$ and $e_k = y_k$ in (1), the neural state space model is an NL2 system with $\Gamma_1 = I$, $V_1 = W_{AB}$, $V_2 = V_A$, $B_2 = [V_B\ 0\ \beta_{AB}]$, $B_1 = [0\ K\ 0]$, $\Lambda_1 = I$, $W_1 = W_{CD}$, $W_2 = V_C$, $D_2 = [V_D\ 0\ \beta_{CD}]$, $D_1 = [0\ I\ 0]$. For the autonomous case and zero bias terms, the neural state space model becomes:
$$\hat{x}_{k+1} = W_{AB}\,\tanh(V_A \hat{x}_k). \tag{10}$$
By introducing a new state variable $\xi_k = \tanh(V_A \hat{x}_k)$, this can be written as the NL1 system:
$$\begin{cases}
\hat{x}_{k+1} = W_{AB}\,\xi_k \\
\xi_{k+1} = \tanh(V_A W_{AB}\,\xi_k).
\end{cases} \tag{11}$$
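As an illustration of how (9) is used as a one step ahead predictor, the sketch below evaluates the state and output equations; the argument names mirror (9), but the helper itself is hypothetical and not part of the paper.

```python
import numpy as np

def nssm_step(x_hat, u, eps, W_AB, V_A, V_B, b_AB, K, W_CD, V_C, V_D, b_CD):
    """One step of the neural state space model (9): the predicted output is the
    tanh layer of the output equation, and the innovation eps = y - y_hat drives
    the state equation through the Kalman gain K."""
    y_hat = W_CD @ np.tanh(V_C @ x_hat + V_D @ u + b_CD)
    x_next = W_AB @ np.tanh(V_A @ x_hat + V_B @ u + b_AB) + K @ eps
    return x_next, y_hat
```

Setting the input, innovation and biases to zero gives the autonomous model (10), whose NL1 form (11) is the representation used for the stability checks of Sections 4 to 6.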

2.3 Locally recurrent globally feedforward networks

In [26] the LRGF network (Locally Recurrent Globally Feedforward network) has been proposed, which aims at unifying several existing recurrent neural network models. Starting from the McCulloch-Pitts model, many other architectures have been proposed in the past, with local synapse feedback, local activation feedback, local output feedback, time delayed neural networks etc., e.g. by Frasconi-Gori-Soda, De Vries-Principe and Poddar-Unnikrishnan. The architecture of Tsoi & Back (Fig.2) includes most of the latter architectures. Assuming in Fig.2 that the transfer functions $G_i(z)$ ($i=1,...,n$) (which may have both poles and zeros) have a state space realization $(A^{(i)}, B^{(i)}, C^{(i)})$, an LRGF network can be described in state space form as
$$\begin{cases}
\xi^{(i)}_{k+1} = A^{(i)} \xi^{(i)}_k + B^{(i)} u^{(i)}_k, \quad i = 1,...,n-1 \\
z^{(i)}_k = C^{(i)} \xi^{(i)}_k \\
\xi^{(n)}_{k+1} = A^{(n)} \xi^{(n)}_k + B^{(n)} f\big(\sum_{j=1}^{n} z^{(j)}_k\big) \\
z^{(n)}_k = C^{(n)} \xi^{(n)}_k \\
y_k = f\big(\sum_{j=1}^{n} z^{(j)}_k\big).
\end{cases} \tag{12}$$
Here $u^{(i)}_k, z^{(i)}_k \in \mathbb{R}$ ($i=1,...,n-1$) are the inputs and local outputs of the network, $y_k \in \mathbb{R}$ is the output of the network and $z^{(n)}_k \in \mathbb{R}$ the filtered output of the network. $f(\cdot)$ is a static nonlinearity belonging to sector $[0,1]$. Applying the state augmentation $\zeta_k = f\big(\sum_{j=1}^{n} z^{(j)}_k\big)$, an NL1 system is obtained with state vector $p_k = [\xi^{(1)}_k; \xi^{(2)}_k; ...; \xi^{(n-1)}_k; \xi^{(n)}_k; \zeta_k]$, input vector $w_k = [u^{(1)}_k; ...; u^{(n-1)}_k]$ and matrices
$$V_1 = \begin{bmatrix}
A^{(1)} & & & & 0 & 0 \\
 & A^{(2)} & & & 0 & 0 \\
 & & \ddots & & \vdots & \vdots \\
 & & & A^{(n-1)} & 0 & 0 \\
0 & \cdots & & 0 & A^{(n)} & B^{(n)} \\
C^{(1)}A^{(1)} & C^{(2)}A^{(2)} & \cdots & C^{(n-1)}A^{(n-1)} & C^{(n)}A^{(n)} & C^{(n)}B^{(n)}
\end{bmatrix},
\qquad
B_1 = \begin{bmatrix}
B^{(1)} & & & \\
 & B^{(2)} & & \\
 & & \ddots & \\
 & & & B^{(n-1)} \\
0 & 0 & \cdots & 0 \\
C^{(1)}B^{(1)} & C^{(2)}B^{(2)} & \cdots & C^{(n-1)}B^{(n-1)}
\end{bmatrix}.
$$
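A direct simulation of (12), before the state augmentation, may help to fix ideas; the following sketch treats the local filter states as a list and is purely illustrative (tanh is used as an example sector nonlinearity).

```python
import numpy as np

def lrgf_step(xi, u, A, B, C, f=np.tanh):
    """One step of the LRGF network (12). xi is the list of the n local filter
    states, u the n-1 scalar branch inputs; f is the static sector-[0,1]
    output nonlinearity."""
    n = len(A)
    z = [(C[i] @ xi[i]).item() for i in range(n)]          # local outputs z_k^(i)
    y = f(sum(z))                                          # network output y_k
    xi_next = [A[i] @ xi[i] + B[i] * u[i] for i in range(n - 1)]
    xi_next.append(A[n - 1] @ xi[n - 1] + B[n - 1] * y)    # local output feedback branch
    return xi_next, y
```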

3 Classical dynamic backpropagation

Dynamic backpropagation, according to Narendra & Parthasarathy [9, 10], is a well-known method for training recurrent neural networks. In this Section we briefly review this method for models in state space form, because this fits into the framework of NLq representations. Let us consider discrete time recurrent neural networks that can be written in the form
$$\begin{cases}
\hat{x}_{k+1} = \Phi(\hat{x}_k, u_k, \epsilon_k; \alpha), \quad \hat{x}_0 = x_0 \ \text{given} \\
\hat{y}_k = \Psi(\hat{x}_k, u_k; \beta)
\end{cases} \tag{13}$$
where $\Phi(\cdot)$, $\Psi(\cdot)$ are twice continuously differentiable nonlinear mappings and the weights $\alpha$, $\beta$ are elements of the parameter vector $\theta$, to be identified from $N$ input/output data $Z^N = \{u_k, y_k\}_{k=1}^{N}$:
$$\min_\theta J_N(\theta, Z^N) = \frac{1}{N}\sum_{k=1}^{N} l(\epsilon_k(\theta)). \tag{14}$$
A typical choice for $l(\epsilon_k)$ in this prediction error algorithm is $\frac{1}{2}\epsilon_k^T \epsilon_k$ with prediction error $\epsilon_k = y_k - \hat{y}_k$. For gradient based optimization algorithms one computes the gradient
$$\frac{\partial J_N}{\partial \theta} = \frac{1}{N}\sum_{k=1}^{N} \epsilon_k^T \Big(-\frac{\partial \hat{y}_k}{\partial \theta}\Big). \tag{15}$$
Dynamic backpropagation [9, 10] then makes use of a sensitivity model
$$\begin{cases}
\dfrac{\partial \hat{x}_{k+1}}{\partial \alpha} = \dfrac{\partial \Phi}{\partial \hat{x}_k}\,\dfrac{\partial \hat{x}_k}{\partial \alpha} + \dfrac{\partial \Phi}{\partial \alpha} \\[6pt]
\dfrac{\partial \hat{y}_k}{\partial \alpha} = \dfrac{\partial \Psi}{\partial \hat{x}_k}\,\dfrac{\partial \hat{x}_k}{\partial \alpha} \\[6pt]
\dfrac{\partial \hat{y}_k}{\partial \beta} = \dfrac{\partial \Psi}{\partial \beta}
\end{cases} \tag{16}$$
in order to generate the gradient of the cost function. The sensitivity model is a dynamical system with state vector $\partial \hat{x}_k/\partial \alpha \in \mathbb{R}^n$, driven by the inputs $\partial \Phi/\partial \alpha \in \mathbb{R}^n$ and $\partial \Psi/\partial \beta \in \mathbb{R}^l$, and at its output $\partial \hat{y}_k/\partial \alpha \in \mathbb{R}^l$ and $\partial \hat{y}_k/\partial \beta \in \mathbb{R}^l$ are generated. The Jacobians $\partial \Phi/\partial \hat{x}_k \in \mathbb{R}^{n\times n}$ and $\partial \Psi/\partial \hat{x}_k \in \mathbb{R}^{l\times n}$ are evaluated around the nominal trajectory. Examples and applications of dynamic backpropagation, applied to neural state space models, are discussed in [19, 20, 21]. For aspects of model validation, regularization and pruning of neural network models see e.g. [1, 15, 18].
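The recursion (16) translates directly into code. The sketch below computes the cost gradient (15) for the deterministic case (no $\epsilon_k$ in the state map); the Jacobian callables are assumed to be supplied by the user and all names are illustrative.

```python
import numpy as np

def dynamic_backprop_gradient(theta, x0, U, Y, Phi, Psi,
                              dPhi_dx, dPhi_dth, dPsi_dx, dPsi_dth):
    """Gradient (15) of J_N = (1/N) sum 0.5*||y_k - yhat_k||^2, obtained by
    running the sensitivity model (16) alongside the model (13)."""
    x = np.asarray(x0, dtype=float)
    S = np.zeros((x.size, theta.size))        # state sensitivity d xhat_k / d theta
    grad = np.zeros(theta.size)
    N = len(U)
    for k in range(N):
        yhat = Psi(x, U[k], theta)
        eps = Y[k] - yhat
        dy = dPsi_dx(x, U[k], theta) @ S + dPsi_dth(x, U[k], theta)  # d yhat_k / d theta
        grad += -(dy.T @ eps) / N                                     # eq. (15)
        S = dPhi_dx(x, U[k], theta) @ S + dPhi_dth(x, U[k], theta)   # eq. (16)
        x = Phi(x, U[k], theta)
    return grad
```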

4 Checking global asymptotic stability of identified models

In many applications, recurrent neural networks have been used in order to model systems with a unique equilibrium, multiple equilibria, (quasi)-periodic behaviour or chaos. On the other hand, one is often interested in obtaining models that are globally asymptotically stable, e.g. in case one has this a priori knowledge about the true system or one is interested in such an approximator. In this Section we present two criteria that are sufficient in order to check global asymptotic stability of the identified model, represented as an NLq system.

Theorem 1 [Diagonal scaling] [20]. Consider the autonomous NLq system
$$p_{k+1} = \Big(\prod_{i=1}^{q} \Gamma_i(p_k)\,V_i\Big)\, p_k \tag{17}$$
and let
$$V_{tot} = \begin{bmatrix}
0 & V_2 & & & \\
 & 0 & V_3 & & \\
 & & \ddots & \ddots & \\
 & & & 0 & V_q \\
V_1 & & & & 0
\end{bmatrix}, \qquad V_i \in \mathbb{R}^{n_{h_i}\times n_{h_{i+1}}},\quad n_{h_1} = n_{h_{q+1}} = n_p.$$
A sufficient condition for global asymptotic stability of (17) is to find a diagonal matrix $D_{tot}$ such that
$$\|D_{tot}\, V_{tot}\, D_{tot}^{-1}\|_2^q = \delta_D < 1, \tag{18}$$
where $D_{tot} = \mathrm{diag}\{D_2, D_3, ..., D_q, D_1\}$ and $D_i \in \mathbb{R}^{n_{h_i}\times n_{h_i}}$ are diagonal matrices with nonzero diagonal elements.
Proof: See [20]. □

The condition is based on the Lyapunov function $V(p) = \|D_1 p\|_2$, which is radially unbounded, in order to prove that the origin is a unique equilibrium point. Finding such a diagonal matrix $D_{tot}$ for a given matrix $V_{tot}$ can be formulated as the following LMI (Linear Matrix Inequality) in $D_{tot}^2$:
$$V_{tot}^T\, D_{tot}^2\, V_{tot} < D_{tot}^2. \tag{19}$$
It is well known that this corresponds to solving a convex optimization problem [3, 12, 27]. Similar criteria are known in the field of control theory as 'diagonal scaling' [3, 7, 13]. However, it is also known that such criteria can be conservative. Sharper criteria are obtained by considering a Lyapunov function $V(p) = \|P_1 p\|_2$ with a non-diagonal matrix $P_1$ instead of $D_1$.
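For a given $V_{tot}$, feasibility of the LMI (19) can be tested with any semidefinite programming tool; the following sketch uses cvxpy, which is an assumption of this illustration (the experiments in the paper relied on Matlab routines, see the Software paragraph of Section 6).

```python
import numpy as np
import cvxpy as cp

def diagonal_scaling_feasible(Vtot, margin=1e-6):
    """Check the diagonal scaling LMI (19): does a diagonal D_tot^2 = diag(q) > 0
    exist with V_tot^T D^2 V_tot < D^2? Returns True if a certificate is found."""
    n = Vtot.shape[0]
    q = cp.Variable(n)
    Q = cp.diag(q)                       # Q plays the role of D_tot^2
    constraints = [q >= 1.0,             # harmless normalisation (the LMI is homogeneous in Q)
                   Q - Vtot.T @ Q @ Vtot >> margin * np.eye(n)]
    prob = cp.Problem(cp.Minimize(0), constraints)
    prob.solve()
    return prob.status in ("optimal", "optimal_inaccurate")
```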

The next Theorem is expressed then in terms of diagonal dominance of certain matrices. Let us recall that a matrix $Q \in \mathbb{R}^{n\times n}$ is called diagonally dominant of level $\delta_Q \geq 1$ if the following property holds [20]:
$$q_{ii} > \delta_Q \sum_{j=1,\, j\neq i}^{n} |q_{ij}|, \qquad \forall i = 1,...,n. \tag{20}$$
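For a given matrix, the largest such level is simply the smallest ratio between a diagonal entry and the corresponding off-diagonal absolute row sum; a small helper with illustrative names:

```python
import numpy as np

def dominance_level(Q):
    """Largest delta_Q such that Q satisfies (20); returns inf for a diagonal Q."""
    d = np.diag(Q).astype(float)
    off = np.sum(np.abs(Q), axis=1) - np.abs(d)        # sum_{j != i} |q_ij|
    return float(np.min(np.where(off > 0, d / np.maximum(off, 1e-300), np.inf)))
```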

The following Theorem holds then.

Theorem 2 [Diagonal dominance] [20]. A sufficient condition for global asymptotic stability of the autonomous NLq system (17) is to find matrices $P_i$, $N_i$ such that
$$\prod_{i=1}^{q}\Big(\frac{\delta_{Q_i}}{\delta_{Q_i}-1}\Big)^{1/2}\, \|P_{tot}\, V_{tot}\, P_{tot}^{-1}\|_2^q < 1 \tag{21}$$
with $P_{tot} = \mathrm{blockdiag}\{P_2, P_3, ..., P_q, P_1\}$ and $P_i \in \mathbb{R}^{n_{h_i}\times n_{h_i}}$ full rank matrices. The matrices $Q_i = P_i^T P_i N_i$ are diagonally dominant with $\delta_{Q_i} > 1$ and the $N_i$ are diagonal matrices with positive diagonal elements.
Proof: See [20]. □

In order to check stability of an identified model, i.e. for a given matrix $V_{tot}$, one might formulate an optimization problem in $P_i$ and $N_i$ such that (21) is satisfied. LMI conditions that correspond to (21) are derived in [20]. For the case of neural networks with a sat$(\cdot)$ activation function (linear characteristic with saturation) Theorem 2 can be formulated in a sharper way [20]. In that case it is sufficient to find matrices $P_i$, $N_i$ with:
$$\|P_{tot}\, V_{tot}\, P_{tot}^{-1}\|_2 < 1 \quad \text{such that} \quad \delta_{Q_i} = 1, \quad i = 1,...,q. \tag{22}$$
The latter follows from a result on stability of digital filters with overflow characteristic by Liu & Michel [8], which corresponds to the NL1 system
$$x_{k+1} = \mathrm{sat}(V x_k). \tag{23}$$
A sufficient condition for global asymptotic stability is then to find a matrix $Q = P^T P$ with:
$$\|P V P^{-1}\|_2 < 1 \quad \text{such that} \quad \delta_Q = 1. \tag{24}$$
Also remark that for a linear system
$$x_{k+1} = A x_k \tag{25}$$
one obtains the condition
$$\|P A P^{-1}\|_2 < 1, \tag{26}$$
from the Lyapunov function $V(x) = \|Px\|_2$ with $P$ a full rank matrix. The spectral radius of $A$ corresponds to
$$\rho(A) = \min_{P \in \mathbb{R}^{n\times n}} \|P A P^{-1}\|_2. \tag{27}$$
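For the linear case (25)-(27) the connection with a Lyapunov equation gives a concrete recipe for constructing a suitable $P$; the sketch below (using scipy, an assumption of this illustration) builds $P$ from the solution of $A^T Q A - Q = -I$ and checks (26).

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def stabilizing_P(A):
    """For a Schur-stable A, return a full rank P with ||P A P^{-1}||_2 < 1 (cf. (26)),
    built from Q solving the discrete Lyapunov equation A^T Q A - Q = -I, with Q = P^T P."""
    Q = solve_discrete_lyapunov(A.T, np.eye(A.shape[0]))
    return np.linalg.cholesky(Q).T          # Q = P^T P

A = np.array([[0.5, 0.8],
              [0.0, 0.7]])
P = stabilizing_P(A)
print(np.linalg.norm(P @ A @ np.linalg.inv(P), 2))   # < 1
print(max(abs(np.linalg.eigvals(A))))                # spectral radius, cf. (27)
```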

5 Modified dynamic backpropagation: imposing stability

In this Section we discuss modified dynamic backpropagation algorithms, obtained by adding an NLq stability constraint to the cost function (14). In this way it is possible to identify recurrent neural network models that are guaranteed to be globally asymptotically stable. Based on the condition of Theorem 1, one may solve the following constrained nonlinear optimization problem:
$$\min_{\theta, D_{tot}} J_N(\theta, Z^N) = \frac{1}{N}\sum_{k=1}^{N} l(\epsilon_k(\theta)) \quad \text{such that} \quad V_{tot}(\theta)^T D_{tot}^2\, V_{tot}(\theta) < D_{tot}^2. \tag{28}$$
The cost function is differentiable but the constraint becomes non-differentiable when the two largest eigenvalues of the matrix $V_{tot}(\theta)^T D_{tot}^2 V_{tot}(\theta) - D_{tot}^2$ coincide [14]. Convergent algorithms for such non-convex non-differentiable optimization problems have been described e.g. by Polak & Wardi [14]. The gradient based optimization method makes use of the concept of a generalized gradient [2] for the constraint. An alternative formulation of the problem is:
$$\min_{\theta} J_N(\theta, Z^N) = \frac{1}{N}\sum_{k=1}^{N} l(\epsilon_k(\theta)) \quad \text{such that} \quad \min_{D_{tot}} \|D_{tot}\, V_{tot}(\theta)\, D_{tot}^{-1}\|_2 < 1. \tag{29}$$
The evaluation of the constraint then corresponds to solving a convex optimization problem. Though LMI conditions have been formulated for the condition of diagonal dominance of Theorem 2, the use of these LMI conditions is rather impractical for a modified dynamic backpropagation algorithm. Therefore we will make use of another Theorem in order to impose stability on the identified models.
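As an illustration of the first formulation, the constrained problem (28) can be handed to a general purpose SQP solver by treating the largest eigenvalue of $V_{tot}(\theta)^T D_{tot}^2 V_{tot}(\theta) - D_{tot}^2$ as the constraint function. This is only a sketch (scipy's SLSQP with numerical gradients), not the generalized-gradient algorithm of Polak & Wardi referred to above; cost and Vtot_of_theta are user-supplied hypothetical helpers.

```python
import numpy as np
from scipy.optimize import minimize

def train_with_stability(theta0, d0, cost, Vtot_of_theta):
    """Sketch of (28): minimise the prediction error cost over (theta, d)
    subject to lambda_max(V(theta)^T D^2 V(theta) - D^2) <= 0 with D = diag(d)."""
    n_th = len(theta0)

    def split(z):
        theta = z[:n_th]
        D2 = np.diag(z[n_th:] ** 2)          # D^2 with nonnegative diagonal
        return theta, D2

    def objective(z):
        theta, _ = split(z)
        return cost(theta)

    def margin(z):                            # must be >= 0 at a feasible point
        theta, D2 = split(z)
        V = Vtot_of_theta(theta)
        return -np.max(np.linalg.eigvalsh(V.T @ D2 @ V - D2))

    res = minimize(objective, np.concatenate([theta0, d0]), method="SLSQP",
                   constraints=[{"type": "ineq", "fun": margin}])
    return split(res.x)
```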

Theorem 3 [Condition number factor] [20]. A sufficient condition for global asymptotic stability of the autonomous NLq system (17) is to find matrices $P_i$ such that
$$\prod_{i=1}^{q} \kappa(P_i)\; \|P_{tot}\, V_{tot}\, P_{tot}^{-1}\|_2^q < 1 \tag{30}$$
where $P_{tot} = \mathrm{blockdiag}\{P_2, P_3, ..., P_q, P_1\}$ and $P_i \in \mathbb{R}^{n_{h_i}\times n_{h_i}}$ are full rank matrices. The condition numbers $\kappa(P_i)$ are by definition equal to $\|P_i\|_2 \|P_i^{-1}\|_2$.
Proof: See [20]. □

In practice, this Theorem is used as:
$$\min_{P_{tot}} \max_i \{\kappa(P_i)\} \quad \text{such that} \quad \|P_{tot}\, V_{tot}\, P_{tot}^{-1}\|_2 < 1. \tag{31}$$
The constraint in (31) imposes local stability at the origin. The basin of attraction of the origin is enlarged by minimizing the condition numbers. Even when condition (30) is not satisfied, the basin of attraction can often be made very large (possibly infinitely large, as simulation results suggest). This principle has been demonstrated on stabilizing and controlling systems with one or multiple equilibria, periodic and quasi-periodic behaviour and chaos [20].


Based on (31), dynamic backpropagation is then modified as follows:
$$\min_{\theta, Q_{tot}} J_N(\theta, Z^N) = \frac{1}{N}\sum_{k=1}^{N} l(\epsilon_k(\theta)) \quad \text{such that} \quad
\begin{cases}
V_{tot}(\theta)^T Q_{tot}\, V_{tot}(\theta) < Q_{tot} \\
I < Q_i < c_i^2 I.
\end{cases} \tag{32}$$
The latter LMI corresponds to $\kappa(P_i) < c_i$ with $Q_{tot} = P_{tot}^T P_{tot}$, $Q_i = P_i^T P_i$. An alternative formulation to (32) is:
$$\min_{\theta} J_N(\theta, Z^N) = \frac{1}{N}\sum_{k=1}^{N} l(\epsilon_k(\theta)) \quad \text{such that} \quad \min_{P_{tot}} \|P_{tot}\, V_{tot}(\theta)\, P_{tot}^{-1}\|_2 < 1 \ \text{ with } \ \max_i\{\kappa(P_i)\} < c \tag{33}$$
with $c$ a user-defined upper bound on the condition numbers. For (29)-(33) the difference in computational complexity between classical and modified dynamic backpropagation lies in the LMI constraint, which has to be solved at each iteration step and is $O(m^4 L^{1.5})$ for $L$ inequalities of size $m$ [27]. One can avoid solving LMIs at each iteration step at the expense of introducing additional unknown parameters into the optimization problem, such as $Q_{tot}$ in (32). Finally, for the case of NL1 systems, the problem can be formulated by solving a Lyapunov equation:
N 1X N min J ( ; Z ) = ; 0< > > > > > > > > > < > > > > > > > > > > :

Vtot ()T QVtot () + I = 2 Q min0  s:t: R = Q + I > 0

(34)

(P ) < c; R = P T P:

6 Example: a system corrupted by process noise Problem statement In this example we consider system identi cation of the following neural state space model as true system: 8 < x k+1 = WAB tanh(VA xk + VB uk ) + 'k (35) : yk = WCD tanh(VC xk + VD uk ): It is corrupted by zero mean white Gaussian process noise 'k . I/O data were generated by

11

taking

2

WAB = 4

0:4157 ?0:2006 0:1260 ?0:0237 1:1271 ?0:0401 ?0:6084 0:4073 ?0:2141 0:4840 ?0:2966 ?0:0027 ?0:1986 ?0:6325 0:4208 ?1:0233

3

2

5

VA = 4 2

WCD =



?0:5546 ?0:2603 1:3030 ?1:3587



VC = 4

?0:3152 ?0:2872 ?0:1009 ?0:8550

?0:8392 ?0:7385 ?0:6593 ?0:7005

?0:2323 ?0:5119 ?0:4354 0:4126 ?0:5717 ?0:8109 ?0:0671 0:0592

0:3600 0:5972 1:3145 ?0:2945 ?1:2125 ?0:9360 0:5938 ?0:2655

1:7870 ?1:4743 0:0347 0:2681 0:3255 1:7658 0:5651 ?1:7682

3

2

5

VB = 4

3

2

5

VD = 4

0:1256 1:2334 1:0599 ?1:7554

3

1:6275 ?2:3663 ?0:8700 ?0:7000

3

with zero initial state. The input uk is zero mean white Gaussian noise with standard deviation 5. The standard deviation of the noise 'k was chosen equal to 0.1. A set of 1000 data points was generated in this way, with the rst 500 data for the training set and the next 500 data for the test set. Some properties of the autonomous system are: (WAB VA ) = 0:98 < 1 and ?1 k2 = 1:66 > 1, which means that the origin is locally stable, but global minDtot kDtot Vtot Dtot asymptotic stability is not proven by the diagonal scaling condition. However, simulation of the autonomous system for several initial conditions suggests that the system is globally asymptotically stable (Fig.4). Because the state space representation of neural state space representation is only unique up to a similarity transformation and sign reversal of the hidden nodes [20], we are interested in identifying the true system, not in the sense of nding back the original matrices, but in order to obtain the same qualitative behaviour for the autonomous system of the identi ed model as for the true system.

Application of classical versus modi ed dynamic backpropagation System identi cation by using classical dynamic backpropagation and starting from randomly chosen parameter vectors  (deterministic neural neural state space model with the same number of hidden neurons as (35)), yields identi ed models with limit cycle behaviour for the autonomous system (Fig.5). This e ect is due to the process noise 'k . No problems of this kind were met for the system (35) in the purely deterministic case or in the case of observation noise. In order to impose global asymptotic stability on the identi ed models, modi ed dynamic backpropagation procedure was applied with diagonal scaling (28) and with condition number constraint (34). The autonomous system of (35) can be represented as the NL1 system (11). Fig.3 shows a comparison between classical and modi ed dynamic backpropagation for the error on the training set and test set, starting from 20 random parameter vectors  for the 3 methods. Besides the fact that the identi ed models appear to be globally asymptotically stable for modi ed dynamic backpropagation (Fig.6-7), the performance on 12

5

5

the test data is often better than with respect to classical dynamic backpropagation. One might argue that by taking a Kalman gain into the predictor the limit cycle problem might be avoided as well. However, often the deterministic part is identi ed rst and secondly the Kalman gain, while keeping the other parameters constant. Moreover the stability constraint can also be taken into account for stochastic models.

Software
In the experiments, a quasi-Newton optimization method with BFGS updating of the Hessian [4, 6] was used for classical dynamic backpropagation (function fminu in Matlab [25]). For modified dynamic backpropagation, sequential quadratic programming [4, 6] was used (function constr in Matlab). The gradients were calculated numerically. In the experiments, 70 iteration steps were taken for the optimization (Fig.3). For the case of diagonal scaling the Matlab function psv was used in order to evaluate (28). Other software for solving LMI problems is e.g. LMI lab [5]. For the case of condition numbers, c = 100 was chosen as upper bound in (34).

7 Conclusion

In this paper we discussed checking and imposing stability of discrete time recurrent neural networks for nonlinear modelling applications. The class of recurrent neural networks that is representable as NLq systems has been considered. Sufficient conditions for global asymptotic stability, based on diagonal scaling and diagonal dominance, have been applied for checking stability. Dynamic backpropagation has been modified with NLq stability constraints in order to obtain models that are globally asymptotically stable; for this purpose, criteria based on diagonal scaling and condition number factors have been used. It has been illustrated with an example how identification of a globally asymptotically stable system, corrupted by process noise, may lead to unwanted limit cycle behaviour in the autonomous behaviour of the identified model if one applies classical dynamic backpropagation. The modified dynamic backpropagation algorithm overcomes this problem.


References

[1] Billings S., Jamaluddin H., Chen S., "Properties of neural networks with applications to modelling non-linear dynamical systems," International Journal of Control, Vol.55, No.1, pp.193-224, 1992.
[2] Boyd S., Barratt C., Linear controller design, limits of performance, Prentice-Hall, 1991.
[3] Boyd S., El Ghaoui L., Feron E., Balakrishnan V., Linear matrix inequalities in system and control theory, SIAM (Studies in Applied Mathematics), Vol.15, 1994.
[4] Fletcher R., Practical methods of optimization, second edition, Chichester and New York: John Wiley and Sons, 1987.
[5] Gahinet P., Nemirovskii A., "General-Purpose LMI Solvers with Benchmarks," Proceedings Conference on Decision and Control, pp.3162-3165, 1993.
[6] Gill P.E., Murray W., Wright M.H., Practical Optimization, London: Academic Press, 1981.
[7] Kaszkurewicz E., Bhaya A., "Robust stability and diagonal Liapunov functions," SIAM Journal on Matrix Analysis and Applications, Vol.14, No.2, pp.508-520, 1993.
[8] Liu D., Michel A.N., "Asymptotic stability of discrete time systems with saturation nonlinearities with applications to digital filters," IEEE Transactions on Circuits and Systems-I, Vol.39, No.10, pp.798-807, 1992.
[9] Narendra K.S., Parthasarathy K., "Identification and control of dynamical systems using neural networks," IEEE Transactions on Neural Networks, Vol.1, No.1, pp.4-27, 1990.
[10] Narendra K.S., Parthasarathy K., "Gradient methods for the optimization of dynamical systems containing neural networks," IEEE Transactions on Neural Networks, Vol.2, No.2, pp.252-262, 1991.
[11] Nesterov Y., Nemirovskii A., Interior point polynomial algorithms in convex programming, SIAM (Studies in Applied Mathematics), Vol.13, 1994.
[12] Overton M.L., "On minimizing the maximum eigenvalue of a symmetric matrix," SIAM Journal on Matrix Analysis and Applications, Vol.9, No.2, pp.256-268, 1988.

[13] Packard A., Doyle J., "The complex structured singular value," Automatica, Vol.29, No.1, pp.71-109, 1993.
[14] Polak E., Wardi Y., "Nondifferentiable optimization algorithm for designing control systems having singular value inequalities," Automatica, Vol.18, No.3, pp.267-283, 1982.
[15] Reed R., "Pruning algorithms - a survey," IEEE Transactions on Neural Networks, Vol.4, No.5, pp.740-747, 1993.
[16] Regalia Ph., Adaptive IIR filtering in signal processing and control, New York: Marcel Dekker Inc., 1995.
[17] Shynk J., "Adaptive IIR filtering," IEEE ASSP Magazine, pp.4-21, 1989.
[18] Sjoberg J., Zhang Q., Ljung L., Benveniste A., Delyon B., Glorennec P., Hjalmarsson H., Juditsky A., "Nonlinear black-box modeling in system identification: a unified overview," Automatica, Vol.31, No.12, pp.1691-1724, 1995.
[19] Suykens J.A.K., De Moor B., Vandewalle J., "Nonlinear system identification using neural state space models, applicable to robust control design," International Journal of Control, Vol.62, No.1, pp.129-152, 1995.
[20] Suykens J.A.K., Vandewalle J.P.L., De Moor B.L.R., Artificial neural networks for modelling and control of non-linear systems, Kluwer Academic Publishers, Boston, 1995.
[21] Suykens J.A.K., Vandewalle J., "Learning a simple recurrent neural state space model to behave like Chua's double scroll," IEEE Transactions on Circuits and Systems-I, Vol.42, No.8, pp.499-502, 1995.
[22] Suykens J.A.K., Vandewalle J., "Control of a recurrent neural network emulator for the double scroll," IEEE Transactions on Circuits and Systems-I, Vol.43, No.6, pp.511-514, 1996.
[23] Suykens J.A.K., Vandewalle J., "Discrete time Interconnected Cellular Neural Networks within NLq theory," International Journal of Circuit Theory and Applications (Special Issue on Cellular Neural Networks), Vol.24, pp.25-36, 1996.
[24] Suykens J.A.K., De Moor B., Vandewalle J., "NLq theory: a neural control framework with global asymptotic stability criteria," to appear in Neural Networks.

[25] The MathWorks Inc., "Optimization Toolbox (Version 4.1), User's guide," & "Robust Control Toolbox (Version 4.1), User's guide," Matlab, 1996.
[26] Tsoi A.C., Back A.D., "Locally recurrent globally feedforward networks: a critical review of architectures," IEEE Transactions on Neural Networks, Vol.5, No.2, pp.229-239, 1994.
[27] Vandenberghe L., Boyd S., "A primal-dual potential reduction method for problems involving matrix inequalities," Mathematical Programming, 69, pp.205-236, 1995.
[28] Vidyasagar M., Nonlinear systems analysis, Prentice-Hall, 1993.


Captions of Figures

Figure 1. Neural state space model, which is a discrete time recurrent neural network with multilayer perceptrons for the state and output equations and a Kalman gain for taking process noise into account.
Figure 2. Locally recurrent globally feedforward network of Tsoi & Back, consisting of linear filters $G_i(z)$ ($i = 1,...,n-1$) for local synapse feedback and $G_n(z)$ for local output feedback.
Figure 3. Comparison between dynamic backpropagation (-), modified dynamic backpropagation with diagonal scaling (- -) and with condition number constraint (..) for the error on the training data (*) and test data (o). The errors are plotted with respect to the experiment number; 20 random initial parameter vectors were chosen and the experiments were sorted with respect to the error on the training set.
Figure 4. Behaviour of the true system to be identified. Shown are the state variables of the autonomous system for a randomly chosen initial state and zero noise.
Figure 5. Unwanted limit cycle behaviour of the identified model after applying classical dynamic backpropagation. The I/O data were generated by a globally asymptotically stable system corrupted by process noise. Shown are the state variables of the identified model for the autonomous case.
Figure 6. Model identified by applying modified dynamic backpropagation with the diagonal scaling constraint. The model is guaranteed to be globally asymptotically stable, as illustrated for a randomly chosen initial state. Shown are the state variables of the identified model for the autonomous case.
Figure 7. Model identified by applying modified dynamic backpropagation with the condition number constraint. Local stability at the origin is imposed and its region of attraction is determined by the condition numbers. Shown are the state variables of the identified model for the autonomous case.

[Figure 1]

[Figure 2]

[Figure 3]

[Figure 4]

[Figure 5]

[Figure 6]

[Figure 7]