Model-following neuro-adaptive control design for non-square, non-affine nonlinear systems

R. Padhi, N. Unnikrishnan and S.N. Balakrishnan

R. Padhi is with the Department of Aerospace Engineering, Indian Institute of Science, Bangalore, India. N. Unnikrishnan and S.N. Balakrishnan are with the Department of Mechanical and Aerospace Engineering, University of Missouri, Rolla, USA. E-mail: [email protected]

Abstract: A new model-following adaptive control design technique for a class of non-affine and non-square nonlinear systems using neural networks is proposed. An appropriate stabilising controller is assumed available for a nominal system model. This nominal controller may not be able to guarantee stability/satisfactory performance in the presence of unmodelled dynamics (neglected algebraic terms in the mathematical model) and/or parameter uncertainties present in the system model. In order to ensure stable behaviour, an online control adaptation procedure is proposed. The controller design is carried out in two steps: (i) synthesis of a set of neural networks which capture matched unmodelled (neglected) dynamics or model uncertainties because of parametric variations and (ii) synthesis of a controller that drives the state of the actual plant to that of a desired nominal model. The neural network weight update rule is derived using Lyapunov theory, which guarantees both stability of the error dynamics (in a practical stability sense) and boundedness of the weights of the neural networks. The proposed adaptation procedure is independent of the technique used to design the nominal controller, and hence can be used in conjunction with any known control design technique. Numerical results for two challenging illustrative problems are presented, which demonstrate these features and clearly bring out the potential of the proposed approach.

1 Introduction

The field of artificial neural networks and its application to control systems has seen phenomenal growth in the last two decades. The origin of research on artificial neural networks can be traced back to the 1940s [1]. In 1990, a compiled book was published [2] detailing various applications of artificial neural networks. A good survey paper appeared in 1992 [3], which outlined various applications of artificial neural networks to control-system design. The main philosophy that is exploited in system theory applications is the universal function approximation property of neural networks [4]. Benefits of using neural networks for control applications include their ability to effectively control nonlinear plants while adapting to unmodelled dynamics and time-varying parameters. In 1990, a paper by Narendra and Parthasarathy [5] demonstrated the potential and applicability of neural networks for the identification and control of nonlinear dynamical systems. The authors suggested various methods as well as learning algorithms useful for identification and adaptive control of nonlinear dynamic systems using recurrent neural networks. Since then, Narendra et al. have come up with a variety of useful adaptive control design techniques using neural networks, including applications concerning multiple models [6]. In 1992, Sanner and Slotine [7] developed a direct adaptive tracking control method with Gaussian radial basis function (RBF) networks to compensate for plant nonlinearities. The update process also kept the weights of the neural networks bounded. In 1996, Lewis et al. [8] proposed an online neural network that approximated unknown functions, and it was used in designing a controller for a robot. Their approach avoided some of the limiting assumptions (like linearised models) of traditional adaptive control techniques. More importantly, their theoretical development also provided a Lyapunov stability analysis that guaranteed both tracking performance as well as boundedness of weights. However, the applicability of this technique was limited to systems which could be expressed in the 'Brunovsky form' [9] and which were affine in the control variable (in state space form). A robust adaptive output feedback controller for SISO systems with bounded disturbance was studied by Aloliwi and Khalil [10]. In a more recent paper, an adaptive output feedback control scheme for the output tracking of a class of nonlinear systems was presented by Seshagiri and Khalil [11] using RBF neural networks.

A relatively simpler and popular method of nonlinear control design is the technique of dynamic inversion [12-14], which is essentially based on the philosophy of feedback linearisation [9, 15]. In this approach, an appropriate co-ordinate transformation is carried out to make the system dynamics take a linear form. Linear control design tools are then used to synthesise the controller. A drawback of this approach is its sensitivity to modelling errors and parameter inaccuracies. One way of addressing this problem is to augment the dynamic inversion technique with H-infinity robust control theory [14]. Important contributions have come from Calise et al. in a number of publications [16-20], who have proposed to augment the dynamic inversion technique with neural networks so that the inversion error is cancelled out. The neural networks are trained online using a Lyapunov-based approach

(similar to the approach followed in [7] and [8]). This basic idea has been extended to a variety of cases, namely output-based control design [19, 20], reconfigurable control design [21] and so on. The feasibility and usefulness of this technique has been demonstrated in a number of applications in the field of flight control.

There is a need to develop a universal control design technique to address modelling error issues, which can be applicable not only with dynamic inversion, but also with any other control design technique. Almost all techniques mentioned in the literature are applicable only for certain classes of nonlinear systems (control-affine systems, SISO systems and so on). In this context, a more powerful tool which can be applied to nonlinear systems in a general form will be useful in solving a wide variety of control problems in many engineering applications. Keeping these issues in mind, an approach was first presented in [22], where the authors followed a model-following approach. The idea presented in this reference is to design an 'extra control' online, which when added to a nominal controller leads to overall stability and improves the overall performance of the plant.

The objective of this paper is to present a new systematic approach relying on the philosophy presented in [22] to address the important issues pointed out in the previous paragraph. In this paper, the controller design is carried out in two steps: (i) synthesis of a set of neural networks which capture matched unmodelled dynamics (because of neglected algebraic terms or because of uncertainties in the parameters) and (ii) computation of a controller that drives the state of the actual plant to that of a desired nominal model. The neural network weight update rule is derived using a Lyapunov-based approach, which guarantees both stability of the error dynamics (in a practical stability sense) as well as boundedness of weights of the neural networks. Note that this technique is applicable for non-square and non-affine systems with matched uncertainties in their system equations. In this study, numerical results from two challenging problems are presented. The results obtained from this research bring out the potential of this new approach.

The rest of the paper is organised as follows. In Section 2, the control design technique is laid out and the two-step process proposed in this paper is explained. The neural network structure is discussed and the derivation of the weight update rule is outlined in this section. Simulation studies on two challenging nonlinear problems were carried out. Simulation details and promising results are presented in Section 3. Appropriate conclusions are drawn in Section 4.

2 Control design procedure

The control design procedure proposed in this paper has two main components. In one part, assuming a neural network approximation of the unknown algebraic function (that represents the unmodelled dynamics and/or parametric uncertainties), the objective is to get the solution for the control variable that guarantees model-following. In the other part, the aim is to train the weights of the neural networks, in a stable manner, so as to capture the unknown function that is part of the plant dynamics. These two components of the control design procedure are discussed in detail in the following subsections.

2.1 Problem description

Consider a nominal nonlinear system, which has the following representation

\dot{X}_d = f^*(X_d, U_d)   (1)

where X_d ∈ R^n is the desired state vector and U_d ∈ R^m (m ≤ n) is the nominal control vector of the nominal system. It is assumed that the order of the system n is known and a satisfactory nominal controller U_d has been designed using some standard method (e.g. dynamic inversion technique, optimal control theory, Lyapunov theory and so on) such that this controller meets some desired performance goal. However, (1) may not truly represent the actual plant because (i) there may be neglected algebraic terms in this mathematical model (this study is restricted to this class of unmodelled dynamics) and (ii) the numerical values of the parameters may not perfectly represent the actual plant and this error results in unknown functions in the model. As a consequence, the actual plant is assumed to have the following structure

\dot{X} = f(X, U) + d(X)   (2)

where X ∈ R^n is the state of the actual plant and U ∈ R^m is the modified controller. The unknown algebraic function d(X) ∈ R^n arises because of the two reasons mentioned above. Note that the two functions f^*(X_d, U_d) and f(X, U) may or may not have the same algebraic expressions. However, f(X, U) contains the known part of the dynamics of (2). The task here is to design a modified controller U online in such a way that the states of the actual plant track the respective states of the nominal model. In other words, the goal is to ensure that X → X_d as t → ∞, which ensures that the actual system performs like the nominal system. As a means to achieve this, the aim is to first capture the unknown function d(X), which is accomplished through a neural network approximation \hat{d}(X). A necessary intermediate step towards this end is the definition of an 'approximate system' as follows

\dot{X}_a = f(X, U) + \hat{d}(X) + (X - X_a), \quad X_a(0) = X(0)   (3)

Through this artifice, one can ensure that X → X_a → X_d as t → ∞. Obviously, this introduces two tasks: (i) ensuring X → X_a as t → ∞ and (ii) ensuring X_a → X_d as t → ∞. The reason for choosing an approximate system of the form in (3) is to facilitate meaningful bounds on the errors and weights.

2.2 Control solution (ensuring X_a → X_d)

In this loop, it is assumed that a neural network approximation \hat{d}(X) of the unknown function d(X) is available. The goal in this step is to drive X_a → X_d as t → ∞, which is achieved by enforcing the following first-order asymptotically stable error dynamics

(\dot{X}_a - \dot{X}_d) + K(X_a - X_d) = 0   (4)

where a positive definite gain matrix K is chosen. A relatively easy way of choosing the gain matrix is to have

K = diag(1/\tau_1, \ldots, 1/\tau_n)   (5)

where \tau_i, i = 1, \ldots, n, can be interpreted as the desired time constant for the ith channel of the error dynamics in (4). Substitution of (1) and (3) into (4) leads to

f(X, U) + \hat{d}(X) + (X - X_a) - f^*(X_d, U_d) + K(X_a - X_d) = 0   (6)

Solving for f(X, U) from (6)

f(X, U) = b(X, X_a, X_d, U_d)   (7)

where

b(X, X_a, X_d, U_d) \triangleq \{f^*(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}(X)\}   (8)

The next step is to solve for the control U from (7). A few different cases and issues need to be considered in this context, which are discussed next.

Case 1. If the following conditions are satisfied:
† The system is square, i.e. m = n
† The system dynamics is affine in the control variable, i.e. f(X, U) can be written as

f(X, U) = f_1(X) + [g_1(X)]U   (9)

† [g_1(X(t))]_{n \times n} is non-singular for all t

then, from (7)-(9), U can be obtained in a straightforward manner as

U = [g_1(X)]^{-1}\{b(X, X_a, X_d, U_d) - f_1(X)\}   (10)

Case 2. The question is what if the system is control-affine but non-square? Two cases may arise, i.e. either m > n or m < n. If m > n, a technique that can be made use of is linear programming. Linear programming is the process of optimising a linear objective function subject to a finite number of linear equality and inequality constraints [23]. Control allocation problems in the face of redundant controllers have been dealt with successfully using linear programming in aerospace applications, as shown in [24]. However, if m < n, which is usually the case in many engineering applications, a novel method of introducing extra variables to augment the control vector to make it square is proposed. This technique leads to a square problem that facilitates a solution. From this solution, the components of the augmented control vector that represent the actual controller can be extracted. This idea will be elaborated in the following paragraphs.

When m < n, the number of equations is more than the number of control variables and (6) cannot be solved for U. To find a solution, a slack-variable vector U_s is introduced first. Next, an n × (n − m) matrix C(X) is designed and C(X)U_s is added to the right-hand side of the approximate system (3) to get

\dot{X}_a = [f(X, U) + C(X)U_s] + \hat{d}_a(X, U_s) + (X - X_a), \quad X_a(0) = X(0)   (11)

The following quantities are defined

V \triangleq [U^T \;\; U_s^T]^T   (12)
f_a(X, V) \triangleq [f(X, U) + C(X)U_s]   (13)
\hat{d}_a(X, U_s) \triangleq [d(X) - C(X)U_s]   (14)

Using the definitions in (12)-(14), (11) can be expressed as

\dot{X}_a = f_a(X, V) + \hat{d}_a(X, U_s) + (X - X_a), \quad X_a(0) = X(0)   (15)

Note that (15) defines a square system in X and V and therefore it is feasible to get a solution for V. The first m elements of V represent U. As a part of this process, the control designer needs to obtain \hat{d}_a(X, U_s). A neural network is used in this study for this purpose. \hat{d}_{ai}(X, U_s) can be obtained as the output of a neural network represented by \hat{W}_i^T \Phi_i(X, U_s). Here \hat{W}_i and \Phi_i are the weight vector and basis function vector of a neural network, respectively. [X^T, U_s^T] is the input vector to the neural network. The subscript i stands for each state of the plant model, i.e. each state equation has a separate neural network associated with it. Similar to the expression in (4), the error dynamic equation for a control-affine but non-square system can be written as

f_1(X) + [g_1(X) \;\; C(X)]\begin{bmatrix} U \\ U_s \end{bmatrix} + \hat{d}_a(X, U_s) + (X - X_a) - f^*(X_d, U_d) + K(X_a - X_d) = 0   (16)

f_1(X) + [g_1(X) \;\; C(X)]V + \hat{d}_a(X, U_s) + (X - X_a) - f^*(X_d, U_d) + K(X_a - X_d) = 0   (17)

This leads to the solution

V = [G(X)]^{-1} b_s(X, X_a, X_d, U_d, U_s)   (18)

where

[G(X)] \triangleq [g_1(X) \;\; C(X)]   (19)
b_s(X, X_a, X_d, U_d, U_s) \triangleq \{f^*(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}_a(X, U_s) - f_1(X)\}   (20)
\hat{d}_{ai}(X, U_s) = \hat{W}_i^T \Phi_i(X, U_s), \quad i = 1, \ldots, n   (21)

Note that the function C(X) should be chosen carefully such that the square matrix [G(X)] does not become singular. Choosing such a function C(X), however, is problem dependent and care should be taken while choosing it. It has to be noted that this formulation results in a fixed-point problem in the control solution, because the control vector V contains the vector U_s and the control solution equation (18) also contains U_s on the right-hand side. In (18), U_s is an input to the neural network that approximates the uncertain function. The solution for V is obtained numerically as V_{k+1} = G^{-1}(H - \hat{d}_a(X, V_k)), k = 0, 1, 2, \ldots, where k is the iteration number and H \triangleq \{f^*(X_d, U_d) - K(X_a - X_d) - (X - X_a) - f_1(X)\}. The validity of this solution has been proved using the contraction mapping theorem (see Section 7). The proof containing the conditions required to prove the existence of a unique control solution is given for the most general case, i.e. the non-square, non-affine case.

Case 3. The system dynamics is square (m = n), but not control-affine. In such a situation, the following three options are available.

1. The form of the equation may be such that it may still facilitate a closed-form solution for the control variable.
2. Another option is the use of a numerical technique such as the standard Newton-Raphson technique [25]. With the availability of fast computational algorithms and high-speed processors, fast online numerical solution of algebraic equations is not considered to be an infeasible task. For example, the viability of the Newton-Raphson technique for online applications is discussed in [26, 27], where the authors have used the technique for complex real-life problems. Note that a good initial guess solution can be provided at any time step k as

(U_{guess})_k = \begin{cases} U_d, & k = 1 \\ U_{k-1}, & k = 2, 3, \ldots \end{cases}   (22)

3. Following the idea in [28, 29], a novel method is introduced to deal with a class of control non-affine smooth nonlinear systems of the form \dot{X} = f(X, U), where f is a smooth mapping and f(0, 0) = 0. If the unforced dynamic equation \dot{X} = f(X, 0) \triangleq f_0(X) of a system that falls in this class is Lyapunov stable, the system equation can be represented as

\dot{X} = f_0(X) + g_0(X)U + \sum_{i=1}^{m} u_i(R_i(X, U)U)   (23)

as shown in [28, 29]. In the above-mentioned representation

f_0(X) \triangleq f(X, 0), \quad g_0(X) \triangleq \frac{\partial f}{\partial u}(X, 0) = [g_{01}(X) \;\; \ldots \;\; g_{0m}(X)] \in R^{n \times m}   (24)

and R_i(X, U): R^n × R^m → R^{n \times m} is a smooth mapping for 1 ≤ i ≤ m. The actual plant equation \dot{X} = f(X, U) + d(X) for this class of nonlinear non-affine systems can be expressed as

\dot{X} = f_0(X) + g_0(X)U + \sum_{i=1}^{m} u_i(R_i(X, U)U) + d(X)   (25)

The approximate plant equation now becomes

\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}(X, U) + X - X_a   (26)

In this case, the online neural network output \hat{d}(X, U) captures the uncertainty \sum_{i=1}^{m} u_i(R_i(X, U)U) + d(X). Now the control solution can be obtained from the error dynamic equation (4) between the approximate state and the desired state as

U = [g_0(X)]^{-1}\{b(X, X_a, X_d, U_d, U) - f_0(X)\}   (27)

where

b(X, X_a, X_d, U_d, U) \triangleq \{f^*(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}(X, U)\}   (28)

Note that [g_0(X)] is assumed to be non-singular for all t. Here again it can be seen that (27) constitutes a fixed-point problem. The control solution is obtained numerically using U_{k+1} = [g_0(X)]^{-1}[b(X, X_a, X_d, U_d, U_k) - f_0(X)], k = 0, 1, 2, \ldots, where k is the iteration number. Such a solution is shown to be valid by proving that the mapping in (27) is a contraction mapping. The proofs and conditions that lead to the validity of the solution are given in Section 7 (Appendix).

Case 4. If the system is both non-square and non-affine in control, the approximate plant equation takes the form

\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}_a(X, U, U_s) + X - X_a + C(X)U_s   (29)

which reduces to

\dot{X}_a = f_0(X) + [g_0(X) \;\; C(X)]\begin{bmatrix} U \\ U_s \end{bmatrix} + \hat{d}_a(X, U, U_s) + X - X_a   (30)

with

\hat{d}_a(X, U, U_s) \triangleq \left[ d(X) - C(X)U_s + \sum_{i=1}^{m} u_i(R_i(X, U)U) \right]   (31)

Define V \triangleq [U^T \;\; U_s^T]^T and [G(X)] \triangleq [g_0(X) \;\; C(X)]. The error dynamic equation can be expressed as

f_0(X) + GV + \hat{d}_a(X, U, U_s) + (X - X_a) - f^*(X_d, U_d) + K(X_a - X_d) = 0   (32)

The control can be solved as

V = [G(X)]^{-1} b_s(X, X_a, X_d, U_d, U, U_s)   (33)

b_s(X, X_a, X_d, U_d, U, U_s) \triangleq \{f^*(X_d, U_d) - K(X_a - X_d) - (X - X_a) - \hat{d}_a(X, U, U_s) - f_0(X)\}   (34)

Only the first m elements of V will be needed for the implementation of the control on the actual plant. [G(X)] is assumed to be non-singular for all t. The control solution is obtained numerically using V_{k+1} = G^{-1}(H - \hat{d}_a(X, V_k)), k = 0, 1, 2, \ldots, where k is the iteration number and H \triangleq [f^*(X_d, U_d) - K(X_a - X_d) - (X - X_a) - f_0(X)]. This solution is shown to be valid by proving that the mapping in (33) is a contraction mapping. The detailed proof is provided in Section 7 (Appendix).

2.3 Capturing the unknown function and neural network training (ensuring X → X_a)

In this section, the process of realising the uncertainties in the actual plant equations (which is crucial for controller synthesis) is discussed in detail. The Stone-Weierstrass theorem from classical real analysis can be used to show that certain network architectures possess the universal approximation capability. Networks typically have the desirable properties that larger networks produce less error than smaller networks and that almost all functions can be modelled by neural networks. This makes the authors believe that neural networks are more efficient in approximating complex functions if there are a large number of neurons in the hidden layer.

2.3.1 Selection of neural network structure: An important idea used in this work is to separate all the channels in the system equations. Thus, there will be n independent neural networks to approximate the uncertainties in each of the n channels, which facilitates easier mathematical analysis. Define d(X) \triangleq [d_1(X) \;\; \cdots \;\; d_n(X)]^T, where d_i(X), i = 1, \ldots, n, is the ith component of d(X), which is the uncertainty in the ith state equation. Since each element of d(X) is represented by a separate neural network, each network output can be expressed as \hat{W}_i^T \Phi_i(X). It should be noted here that the neural network input vector may contain the states, the control vector and the slack variable vector. Separation of channels has been carried out in this work to keep the uncertainties in each system equation distinct.
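A minimal sketch of this channel-separated structure, assuming NumPy and a user-supplied basis function; the class and argument names are illustrative and not part of the paper:

```python
import numpy as np

class ChannelNetworks:
    """One linearly parameterised network per state channel: d_hat_i = W_i^T Phi_i(z)."""

    def __init__(self, basis_fn, n_states, n_basis):
        # basis_fn maps the network input z (states, controls, slack variables)
        # to an (n_basis,) vector; a separate weight vector is kept per channel.
        self.basis_fn = basis_fn
        self.W = [np.zeros(n_basis) for _ in range(n_states)]

    def outputs(self, z):
        """Return [W_1^T Phi(z), ..., W_n^T Phi(z)], one approximation per channel."""
        phi = self.basis_fn(z)
        return np.array([w @ phi for w in self.W])
```

Distinct basis vectors Phi_i per channel (as used for the double inverted pendulum example later) could be accommodated by storing one basis function per channel instead of a shared one.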

During system operation, the magnitudes of the uncertain terms in the system equations may be of different orders. In such a case, having one network approximate the uncertainties of the whole system may affect the convergence of the single network. In order to prevent this from happening, all channels were separated.

Trigonometric basis neural networks [30, 31] were used in this study for approximating each of the unknown functions d_i(X). The online 'uncertainty approximating neural network' can be represented by a linearly parameterised feedforward structure. Radial basis functions (RBFs) can be used in these structures because these functions are universal approximators. However, RBFs are very poor at interpolating between their design centres, and in such cases a large number of basis functions are needed. Researchers typically use basis functions constructed from functions that they think richly represent the nature of the unknown terms that are being approximated. There is no standard procedure for choosing basis functions for any application. A Fourier series has the ability to approximate any nonlinear function quite well, and such a choice also makes it application independent. Trigonometric basis neural networks are used in this study for approximating each of the unknown functions, as the authors believe that the trigonometric sine and cosine functions and their combinations have the capability to represent many nonlinear functions well. In order to form the vector of basis functions, the input data is first pre-processed. In the numerical experiments carried out, vectors C_i, i = 1, \ldots, n, which had the structure C_i = [1 \;\; \sin(x_i) \;\; \cos(x_i)]^T, were created. The vector of basis functions was generated as

\Phi = kron(C_n, \ldots, kron(C_3, kron(C_1, C_2)) \ldots)   (35)

where kron(\cdot, \cdot) represents the 'Kronecker product' and is defined in [30] as

kron(Y, Z) = [y_1 z_1 \;\; y_1 z_2 \;\; \cdots \;\; y_n z_m]^T   (36)

where Y ∈ R^n and Z ∈ R^m. The dimension of the neural network weight vector is the same as the dimension of \Phi. The neural network outputs for each of the different cases considered in this study have been tabulated in Table 1.

2.3.2 Training of neural networks: The technique for updating the weights of the neural networks (i.e. training the networks) for accurate representations of the unknown functions d_i(X), i = 1, \ldots, n, is discussed here. Define

e_{a_i} \equiv (x_i - x_{a_i})   (37)

From (2)-(3), the equations for the ith channel can be decomposed as

\dot{x}_i = f_i(X, U) + d_i(X)   (38)
\dot{x}_{a_i} = f_i(X, U) + \hat{d}_{a_i}(X, U, U_s) + e_{a_i}   (39)

Subtracting (39) from (38) and using the definition in (37) gives

\dot{e}_{a_i} = d_i(X) - \hat{d}_{a_i}(X, U, U_s) - e_{a_i}   (40)

From the universal function approximation property of neural networks [31], it can be stated that there exists an ideal neural network with an optimum weight vector W_i and basis function vector \Phi_i(X) that approximates d_i(X) to an accuracy of \epsilon_i, that is

d_i(X) = W_i^T \Phi_i(X) + \epsilon_i   (41)

Let the actual weight of the network used to approximate the uncertainties be \hat{W}_i. The approximated function can then be written as

\hat{d}_{a_i}(X, U, U_s) = \hat{W}_i^T \Phi_i(X, U, U_s)   (42)

Substituting (41)-(42) in (40) leads to

\dot{e}_{a_i} = \tilde{W}_i^T \Phi_i(X, U, U_s) + \epsilon_i - e_{a_i}   (43)

where \tilde{W}_i \triangleq (W_i - \hat{W}_i) is the error between the actual weight and the ideal weight of the neural network. Note that \dot{\tilde{W}}_i = -\dot{\hat{W}}_i since W_i is constant. An important point to be noted here is that the aim of each neural network is to capture the resulting function in each state equation, and not parameter estimation or system identification. The magnitudes of the uncertainties/nonlinearities in the state equations are then used to make the plant track the desired reference trajectory.

Theorem: A stable adaptive weight update rule proposed as follows

\dot{\hat{W}}_i = \gamma_{l_i} e_{a_i} \Phi_i(X, U, U_s) - \gamma_{l_i} \sigma_i \hat{W}_i   (44)

will ensure bounds on the error signal e_{a_i} and the adaptive weights \hat{W}_i of the online networks. Here \gamma_{l_i} is the learning rate of the ith online network and \sigma_i is a sigma-modification factor used to ensure a bound on the network weights.

Proof: Choose a Lyapunov function for each state equation as

v_i = \tfrac{1}{2} e_{a_i}^2 + \tfrac{1}{2} \tilde{W}_i^T \gamma_{l_i}^{-1} \tilde{W}_i   (45)

Taking the derivative of the Lyapunov function

\dot{v}_i = e_{a_i} \dot{e}_{a_i} + \tilde{W}_i^T \gamma_{l_i}^{-1} \dot{\tilde{W}}_i   (46)

On substituting the expression for \dot{e}_{a_i} in (46)

\dot{v}_i = e_{a_i}(\tilde{W}_i^T \Phi_i(X, U, U_s) + \epsilon_i - e_{a_i}) + \tilde{W}_i^T \gamma_{l_i}^{-1} \dot{\tilde{W}}_i   (47)

Table 1: Uncertainties and neural network outputs for different system types

No.  System type              Uncertainty                                            Neural network output
1    Square, affine           d(X)                                                   \hat{d}(X)
2    Non-square, affine       d(X) - C(X)U_s                                         \hat{d}_a(X, U_s)
3    Square, non-affine       d(X) + \sum_{i=1}^{m} u_i(R_i(X, U)U)                  \hat{d}(X, U)
4    Non-square, non-affine   d(X) - C(X)U_s + \sum_{i=1}^{m} u_i(R_i(X, U)U)        \hat{d}_a(X, U, U_s)

If the proposed weight update rule \dot{\hat{W}}_i = \gamma_{l_i} e_{a_i} \Phi_i(X, U, U_s) - \gamma_{l_i} \sigma_i \hat{W}_i is used, the error dynamics of the difference between the optimal weight vector that represents the uncertainty and the weight vector used in the online adaptive network can be represented as

\dot{\tilde{W}}_i = -\gamma_{l_i} e_{a_i} \Phi_i(X, U, U_s) + \gamma_{l_i} \sigma_i \hat{W}_i   (48)

since \dot{\tilde{W}}_i = -\dot{\hat{W}}_i. On substituting (48) in (47)

\dot{v}_i = e_{a_i} \tilde{W}_i^T \Phi_i + e_{a_i}\epsilon_i - e_{a_i}^2 + \tilde{W}_i^T \gamma_{l_i}^{-1}(-\gamma_{l_i}(e_{a_i}\Phi_i - \sigma_i \hat{W}_i)) = e_{a_i}\epsilon_i - e_{a_i}^2 + \sigma_i \tilde{W}_i^T \hat{W}_i   (49)

However

\tilde{W}_i^T \hat{W}_i = \tfrac{1}{2}(2(\tilde{W}_i^T \hat{W}_i)) = \tfrac{1}{2}(2\tilde{W}_i^T(W_i - \tilde{W}_i)) = \tfrac{1}{2}(2\tilde{W}_i^T W_i - 2\tilde{W}_i^T \tilde{W}_i)   (50)

The first term in (50) can be expanded as follows

2\tilde{W}_i^T W_i = \tilde{W}_i^T W_i + \tilde{W}_i^T W_i = \tilde{W}_i^T(\hat{W}_i + \tilde{W}_i) + (W_i - \hat{W}_i)^T W_i = \tilde{W}_i^T \hat{W}_i + \tilde{W}_i^T \tilde{W}_i + W_i^T W_i - \hat{W}_i^T W_i = \hat{W}_i^T(\tilde{W}_i - W_i) + \tilde{W}_i^T \tilde{W}_i + W_i^T W_i   (51)

Equation (50) can now be expressed as

\tilde{W}_i^T \hat{W}_i = \tfrac{1}{2}(\hat{W}_i^T(\tilde{W}_i - W_i) + (\tilde{W}_i^T \tilde{W}_i) + (W_i^T W_i) - 2(\tilde{W}_i^T \tilde{W}_i))   (52)

Equation (52) can be expressed as

\tilde{W}_i^T \hat{W}_i = \tfrac{1}{2}(-(\hat{W}_i^T \hat{W}_i) - (\tilde{W}_i^T \tilde{W}_i) + (W_i^T W_i)) \le \tfrac{1}{2}(-\|\tilde{W}_i\|^2 - \|\hat{W}_i\|^2 + \|W_i\|^2)   (53)

Therefore the last term in (49) can be written in terms of the inequality

\sigma_i \tilde{W}_i^T \hat{W}_i \le -\tfrac{1}{2}\sigma_i\|\tilde{W}_i\|^2 - \tfrac{1}{2}\sigma_i\|\hat{W}_i\|^2 + \tfrac{1}{2}\sigma_i\|W_i\|^2   (54)

The equation for \dot{v}_i becomes

\dot{v}_i \le e_{a_i}\epsilon_i - e_{a_i}^2 - \tfrac{1}{2}\sigma_i\|\tilde{W}_i\|^2 - \tfrac{1}{2}\sigma_i\|\hat{W}_i\|^2 + \tfrac{1}{2}\sigma_i\|W_i\|^2
        \le -\tfrac{e_{a_i}^2}{2} + \tfrac{\epsilon_i^2}{2} - \tfrac{1}{2}\sigma_i\|\tilde{W}_i\|^2 - \tfrac{1}{2}\sigma_i\|\hat{W}_i\|^2 + \tfrac{1}{2}\sigma_i\|W_i\|^2
        = -\tfrac{e_{a_i}^2}{2} + \left(\tfrac{\epsilon_i^2}{2} + \tfrac{1}{2}\sigma_i\|W_i\|^2\right) - \tfrac{1}{2}\sigma_i\|\tilde{W}_i\|^2 - \tfrac{1}{2}\sigma_i\|\hat{W}_i\|^2   (55)

Define

\beta_i \triangleq \tfrac{\epsilon_i^2}{2} + \tfrac{1}{2}\sigma_i\|W_i\|^2   (56)

For \dot{v}_i < 0

\tfrac{e_{a_i}^2}{2} > \beta_i   (57)

or

|e_{a_i}| > \sqrt{2\beta_i}   (58)

Thus, it can be seen that selecting a sufficiently small \sigma_i and choosing a sufficiently good set of basis functions, which will reduce the approximation error \epsilon_i, will help in keeping the error bound small. The error bound for the proposed weight update scheme is \sqrt{2\beta_i}. The following steps will prove that the weight update rule is stable and all the signals in the weight update rule are bounded.

It can be seen from (48) that \dot{\tilde{W}}_i = -\gamma_{l_i} e_{a_i} \Phi_i(X, U, U_s) + \gamma_{l_i} \sigma_i \hat{W}_i. Equation (48) can be expanded as

\dot{\tilde{W}}_i = -\gamma_{l_i} e_{a_i} \Phi_i(X, U, U_s) + \gamma_{l_i} \sigma_i (W_i - \tilde{W}_i) = -\gamma_{l_i}\sigma_i \tilde{W}_i + \gamma_{l_i}(\sigma_i W_i - e_{a_i}\Phi_i(X, U, U_s))   (59)

Define X_c \triangleq \tilde{W}_i, A \triangleq -\gamma_{l_i}\sigma_i, B \triangleq \gamma_{l_i} and U_c \triangleq \sigma_i W_i - e_{a_i}\Phi_i(X, U, U_s). Equation (59) can be expressed as a linear differential equation of the form

\dot{X}_c = AX_c + BU_c   (60)

For the above-mentioned linear time-invariant system with a negative definite A, the solution can be written as

X_c(t) = e^{A(t - t_0)}X_c(t_0) + \int_{t_0}^{t} e^{A(t - \tau)} B U_c(\tau)\, d\tau   (61)

On using the bound \|e^{A(t - t_0)}\| \le k e^{-\lambda(t - t_0)}, the bound on the solution to (60) can be expressed as [32]

\|X_c(t)\| \le k e^{-\lambda(t - t_0)}\|X_c(t_0)\| + \int_{t_0}^{t} k e^{-\lambda(t - \tau)}\|B\|\,\|U_c(\tau)\|\, d\tau \le k e^{-\lambda(t - t_0)}\|X_c(t_0)\| + \frac{k\|B\|}{\lambda}\sup_{t_0 \le \tau \le t}\|U_c(\tau)\|   (62)

Such a system is input-to-state stable [32]. This proves that X_c \triangleq \tilde{W}_i is bounded for all bounded inputs. Since the input to the system in (62) is bounded, \tilde{W}_i is bounded, which proves that \hat{W}_i is bounded as well. This completes the proof.
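For implementation, the update rule (44) is typically integrated alongside the plant. The following is a minimal Euler-discretised sketch of one update step for a single channel, assuming NumPy; the step size dt and the variable names are illustrative:

```python
import numpy as np

def weight_update_step(W_hat, e_a, phi, gamma, sigma, dt):
    """One Euler step of the sigma-modified rule (44) for the i-th channel.

    W_hat : (p,) current weight estimate of the i-th network
    e_a   : scalar tracking error e_ai = x_i - x_ai
    phi   : (p,) basis vector Phi_i(X, U, U_s)
    gamma : learning rate; sigma : sigma-modification factor
    """
    W_dot = gamma * e_a * phi - gamma * sigma * W_hat   # right-hand side of (44)
    return W_hat + dt * W_dot
```

The -gamma * sigma * W_hat term is the sigma modification that keeps the weights bounded, as established in the proof above.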

3 Simulation studies

In this section, two motivating examples that demonstrate the ideas presented in Section 2 are discussed. The examples show that the methodology discussed in this paper can indeed be used to design controllers for complex nonlinear systems.

3.1 Van der Pol problem

As the first exercise, the Van der Pol system [33] was selected. The motivations for selecting it were (i) it is a vector problem, (ii) it is a non-square problem (m = 1, n = 2), (iii) the homogeneous system has an unstable equilibrium at the origin and (iv) the system exhibits limit cycle behaviour. These properties make it a challenging problem for state regulation. The desired system dynamics for this problem is given by

\dot{x}_{1d} = x_{2d}
\dot{x}_{2d} = a(1 - x_{1d}^2)x_{2d} - x_{1d} + (1 + x_{1d}^2 + x_{2d}^2)u_d   (63)

where x_{1d} represents position and x_{2d} represents velocity. The goal was to drive X \triangleq [x_{1d}, x_{2d}]^T → 0 as t → ∞. Formulating a regulator problem, the desired state and control trajectories were obtained using a new method known as 'single network adaptive critic' (SNAC) using the quadratic cost function shown below

\frac{1}{2}\int_0^{\infty}(X^T Q X + r u_d^2)\, dt   (64)

where Q = diag(I_2) and r = 1. Details of the SNAC technique can be obtained from [34]. In the SNAC synthesis for this problem, the critic neural network was made up of two sub-networks, each having a 2-6-1 structure. The plant dynamics was assumed to be

\dot{x}_1 = x_2
\dot{x}_2 = a(1 - x_1^2)x_2 - x_1 + d(X) + (1 + x_1^2 + x_2^2)u   (65)

where d(X) = 2\cos(x_2) was the assumed unmodelled dynamics. Following the discussions in Section 2, the approximate system can be expressed as

\dot{x}_{1a} = x_2 + (x_1 - x_{1a})
\dot{x}_{2a} = a(1 - x_1^2)x_2 - x_1 + \hat{d}(X) + (x_2 - x_{2a}) + (1 + x_1^2 + x_2^2)u   (66)

Since the problem is in a non-square form, the technique mentioned in Section 2 was used and C = [-10 \;\; 10]^T was selected. The approximate system was expressed as

\dot{x}_{1a} = x_2 + (x_1 - x_{1a}) - 10u_s + \hat{d}_{a1}(X, U_s)
\dot{x}_{2a} = a(1 - x_1^2)x_2 - x_1 + 10u_s + \hat{d}_{a2}(X, U_s) + (x_2 - x_{2a}) + (1 + x_1^2 + x_2^2)u   (67)

with \hat{d}_{a1}(X, U_s) expected to approximate the uncertainty 10u_s introduced in the first-state equation as a result of the approximate system being made square, and \hat{d}_{a2}(X, U_s) expected to approximate the uncertainty created by the algebraic sum of d(X) and -10u_s, which are the uncertain terms in the second-state equation of the approximate system. The augmented controller in (18) was expressed as

\begin{bmatrix} u \\ u_s \end{bmatrix} = \begin{bmatrix} 0 & -10 \\ 1 + x_1^2 + x_2^2 & 10 \end{bmatrix}^{-1}\left\{ -\begin{bmatrix} \Delta f_1 \\ \Delta f_2 \end{bmatrix} - K\begin{bmatrix} x_{1a} - x_{1d} \\ x_{2a} - x_{2d} \end{bmatrix} - \begin{bmatrix} x_1 - x_{1a} \\ x_2 - x_{2a} \end{bmatrix} \right\}   (68)

where

\begin{bmatrix} \Delta f_1 \\ \Delta f_2 \end{bmatrix} \triangleq \begin{bmatrix} (x_2 - x_{2d}) + \hat{d}_{a1}(X, U_s) \\ a(1 - x_1^2)x_2 - x_1 + \hat{d}_{a2}(X, U_s) - a(1 - x_{1d}^2)x_{2d} + x_{1d} - (1 + x_{1d}^2 + x_{2d}^2)u_d \end{bmatrix}   (69)

The gain matrix was selected as K = diag(1, 1). After solving for V, the control variable u was extracted from it as the first element of V. For the first iteration in the control solution process, u_s = 0 was used. With this, the explicit expression for u is given by

u = \frac{1}{1 + x_1^2 + x_2^2}\left[-k_1(x_1 - x_{1d}) - (k_2 + 1)(x_2 - x_{2d}) - \Delta f_2 - (x_1 - x_{1a}) - (x_2 - x_{2a})\right]   (70)

The basis function vectors for the two neural networks were selected as

C_1 = [1 \;\; \sin(x_1) \;\; \cos(x_1)]^T, \quad C_2 = [1 \;\; \sin(x_2) \;\; \cos(x_2)]^T, \quad C_3 = [1 \;\; \sin(u_s) \;\; \cos(u_s)]^T
\Phi_1(X) = \Phi_2(X) = kron(C_1, kron(C_2, C_3))   (71)

and the neural network training learning parameters were selected as \gamma_{l_1} = \gamma_{l_2} = 10 and \sigma_1 = \sigma_2 = 1 \times 10^{-6}. In Fig. 1, the resulting state trajectories for the nominal system with the nominal controller, the actual system with the nominal controller and the actual system with the adaptive controller are given. First, it is clear from the plot that the nominal control is doing the intended job (of driving the states to zero) for the nominal plant. Next, it can be observed that if the same nominal controller is applied to the actual plant (with the unmodelled dynamics d(X) = 2\cos x_2), x_1 cannot reach the origin. However, if the adaptive controller is applied, the resulting controller drives the states to the origin by forcing them to follow the states of the nominal system.

Fig. 1 State trajectories against time: (a) state x1 against time; (b) state x2 against time

Fig. 2 Control and uncertainty approximation trajectories against time: (a) control trajectory against time; (b) network approximation of uncertainty in the first-state equation; (c) network approximation of uncertainty in the second-state equation

Fig. 2 illustrates the control trajectories and the neural network approximations of the uncertainties in the two state equations. In Fig. 2a, a comparison between the histories of the nominal control and the adaptive control is presented. Fig. 2b shows the output of the first neural network, \hat{d}_{a1}(X, U_s), tracking the uncertainty in the first-state equation because of -C_1 u_s. From Fig. 2c it can be seen how well the neural network approximates the unknown function (d(X) - C_2 u_s), which is critical in deriving the appropriate adaptive controller.
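To make the computations for this example concrete, the following is a minimal sketch of one pass of the control evaluation, assuming NumPy; it uses the basis (71), the definitions (69) and the augmented controller (68), with W1 and W2 denoting the current weight vectors and u_s taken from the previous pass (u_s = 0 at the first pass, as stated above). The function names and the call signature are illustrative, not from the paper:

```python
import numpy as np

def basis(x1, x2, us):
    # Eq. (71): Kronecker product of the trigonometric component vectors
    c1 = np.array([1.0, np.sin(x1), np.cos(x1)])
    c2 = np.array([1.0, np.sin(x2), np.cos(x2)])
    c3 = np.array([1.0, np.sin(us), np.cos(us)])
    return np.kron(c1, np.kron(c2, c3))          # 27 basis functions

def augmented_control(X, Xa, Xd, ud, us, W1, W2, a, K=np.eye(2)):
    """One evaluation of the augmented controller (68); X, Xa, Xd are length-2 arrays."""
    x1, x2 = X
    x1a, x2a = Xa
    x1d, x2d = Xd
    phi = basis(x1, x2, us)
    d1, d2 = W1 @ phi, W2 @ phi                  # network outputs d_hat_a1, d_hat_a2
    df1 = (x2 - x2d) + d1                        # Delta f_1 of (69)
    df2 = (a*(1 - x1**2)*x2 - x1 + d2            # Delta f_2 of (69)
           - a*(1 - x1d**2)*x2d + x1d - (1 + x1d**2 + x2d**2)*ud)
    G = np.array([[0.0, -10.0],
                  [1.0 + x1**2 + x2**2, 10.0]])  # [g1(X)  C] with C = [-10, 10]^T
    rhs = -np.array([df1, df2]) - K @ (Xa - Xd) - (X - Xa)
    u, us_new = np.linalg.solve(G, rhs)
    return u, us_new
```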

3.2 Double inverted pendulum problem

The next problem considered is a double inverted pendulum [19, 35]. The interesting aspects of this problem are (i) the equations of motion consist of four states, (ii) it is a non-square problem (n = 4, m = 2) and, more important, (iii) it is a problem which is non-affine in the control variable. In this problem, both parameter variation and unmodelled dynamics are considered simultaneously. These characteristics make this problem a sufficiently challenging one to demonstrate that the proposed technique works for complex problems. The nominal system dynamics for this problem is given by [35]

\dot{x}_{1d} = x_{2d}
\dot{x}_{2d} = a_1\sin(x_{1d}) + b_1 + \xi_1\tanh(u_{1d}) + \sigma_1\sin(x_{4d})
\dot{x}_{3d} = x_{4d}
\dot{x}_{4d} = a_2\sin(x_{3d}) + b_2 + \xi_2\tanh(u_{2d}) + \sigma_2\sin(x_{2d})   (72)

where for i = 1, 2 the parameters are defined as

a_i \triangleq \frac{m_i g r}{J_i} - \frac{k r^2}{4 J_i}, \quad b_i \triangleq \frac{k r}{2 J_i}(l - b), \quad \xi_i \triangleq \frac{u_{i\max}}{J_i}, \quad \sigma_i \triangleq \frac{k r^2}{4 J_i}   (73)

In (72)-(73), x_{1d} and x_{2d} denote the desired position and velocity of mass-1, respectively. Similarly, x_{3d} and x_{4d} denote the desired position and velocity of mass-2, respectively. Note that the control variables u_{id} (torques applied by the servomotors) enter the system dynamics in a non-affine fashion. The system parameters and their values are listed in Table 2. The objectives of the nominal controllers were to make x_{1d} and x_{3d} track a reference signal R = \sin(2\pi t/T), with T = 10. Since x_{2d} and x_{4d} are the derivatives of x_{1d} and x_{3d}, respectively, it means that x_{2d} and x_{4d} must track the reference signal \dot{R} = (2\pi/T)\cos(2\pi t/T). The nominal controller was designed using the dynamic inversion technique [36]. A second-order error dynamic equation [(\ddot{X}_d - \ddot{R}) + K_d(\dot{X}_d - \dot{R}) + K_p(X_d - R) = 0] was made use of in the controller design, as the objective was tracking. The gain matrices used were K_d = K_p = I_2. In this problem, parametric uncertainties \Delta a_1 and \Delta a_2 were added to parameters a_1 and a_2, respectively. Functions \tilde{f}_1(X) and \tilde{f}_2(X) were added as unmodelled dynamic terms. The true plant equations now were of the following form

\dot{x}_1 = x_2
\dot{x}_2 = (a_1 + \Delta a_1)\sin(x_1) + b_1 + \xi_1\tanh(u_1) + \sigma_1\sin(x_4) + \tilde{f}_1(X)
\dot{x}_3 = x_4
\dot{x}_4 = (a_2 + \Delta a_2)\sin(x_3) + b_2 + \xi_2\tanh(u_2) + \sigma_2\sin(x_2) + \tilde{f}_2(X)   (74)

To test the robustness of the proposed method, \Delta a_1 and \Delta a_2 were selected to be 20% of their corresponding nominal values. Similarly, \tilde{f}_1(X) and \tilde{f}_2(X) were assumed to be exponential functions of the form K_{m1}e^{\alpha_1 x_1} and K_{m2}e^{\alpha_2 x_3}, respectively, with positive values for \alpha_1 and \alpha_2. Parameters K_{m1} = K_{m2} = 0.1 and \alpha_1 = \alpha_2 = 0.01 were chosen. In this case, the goal for the neural networks was to learn d(X) \triangleq [0 \;\; d_2(X) \;\; 0 \;\; d_4(X)]^T, where d_2(X) = \Delta a_1\sin(x_1) + K_{m1}e^{\alpha_1 x_1} and d_4(X) = \Delta a_2\sin(x_3) + K_{m2}e^{\alpha_2 x_3}. It can be seen that the system dynamics is non-affine in the control variable, and it is also a non-square problem where the number of control variables is less than the number of states.

Table 2: System parameter values

System parameter                            Value   Units
End mass of pendulum 1 (m1)                 2       kg
End mass of pendulum 2 (m2)                 2.5     kg
Moment of inertia (J1)                      0.5     kg m^2
Moment of inertia (J2)                      0.625   kg m^2
Spring constant of connecting spring (k)    100     N/m
Pendulum height (r)                         0.5     m
Natural length of spring (l)                0.5     m
Gravitational acceleration (g)              9.81    m/s^2
Distance between pendulum hinges (b)        0.4     m
Maximum torque input (u1max)                20      N m
Maximum torque input (u2max)                20      N m
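A minimal sketch of the nominal dynamic-inversion controller described above, assuming NumPy; the parameter arrays a, b, xi, sigma follow (73) and Table 2, and the saturation guard before arctanh is an implementation assumption of this sketch, not something stated in the paper:

```python
import numpy as np

def nominal_control(xd, t, a, b, xi, sigma, T=10.0, kp=1.0, kd=1.0):
    """Dynamic-inversion nominal controller for the model (72).

    xd = [x1d, x2d, x3d, x4d]; a, b, xi, sigma are length-2 parameter arrays.
    """
    x1d, x2d, x3d, x4d = xd
    w = 2.0 * np.pi / T
    R, Rdot, Rddot = np.sin(w * t), w * np.cos(w * t), -w**2 * np.sin(w * t)
    # desired accelerations from the second-order error dynamics (Kd = Kp = I2)
    acc1 = Rddot - kd * (x2d - Rdot) - kp * (x1d - R)
    acc2 = Rddot - kd * (x4d - Rdot) - kp * (x3d - R)
    # invert the tanh control nonlinearity channel by channel
    z1 = (acc1 - a[0]*np.sin(x1d) - b[0] - sigma[0]*np.sin(x4d)) / xi[0]
    z2 = (acc2 - a[1]*np.sin(x3d) - b[1] - sigma[1]*np.sin(x2d)) / xi[1]
    z1, z2 = np.clip([z1, z2], -0.99, 0.99)   # illustrative guard to keep arctanh defined
    return np.arctanh(z1), np.arctanh(z2)
```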

Fig. 3 State trajectories against time: (a) state x1 (position of pendulum one) against time; (b) state x2 (velocity of pendulum one) against time; (c) state x3 (position of pendulum two) against time; (d) state x4 (velocity of pendulum two) against time

Applying the transformations given in (23), f_0(X) and [g_0(X)] were defined as follows

f_0(X) \triangleq \begin{bmatrix} x_2 \\ a_1\sin(x_1) + b_1 + \sigma_1\sin(x_4) \\ x_4 \\ a_2\sin(x_3) + b_2 + \sigma_2\sin(x_2) \end{bmatrix}, \quad g_0(X) \triangleq \begin{bmatrix} 0 & 0 \\ \xi_1 & 0 \\ 0 & 0 \\ 0 & \xi_2 \end{bmatrix}   (75)

The actual plant equations were expressed as

\dot{X} = f_0(X) + g_0(X)U + \sum_{i=1}^{m} u_i(R_i(X, U)U) + d(X)   (76)

Since [g_0(X)] is not a square matrix, C(X) = \begin{bmatrix} -10 & 10 & 0 & 0 \\ 10 & 10 & 10 & 10 \end{bmatrix}^T was chosen and a square problem was formulated. The approximate system equation was

\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}(X, U) + X - X_a   (77)

where \hat{d}(X, U) represents \sum_{i=1}^{m} u_i(R_i(X, U)U) + d(X). To make (77) a square system, CU_s is added to (77) and it is rewritten as

\dot{X}_a = f_0(X) + g_0(X)U + \hat{d}_a(X, U, U_s) + X - X_a + CU_s   (78)

where \hat{d}_a(X, U, U_s) is the output of the function-approximating neural networks and represents \sum_{i=1}^{m} u_i(R_i(X, U)U) + d(X) - CU_s. Note that U_s (2 × 1) is the slack variable vector used to create a square control effectiveness matrix to help solve for the real control variable U. The gain matrix for the linear error dynamic equation was selected as K = diag(1/\tau_1, 1/\tau_2, 1/\tau_3, 1/\tau_4) with \tau_1 = \tau_2 = \tau_3 = \tau_4 = 0.2. The control solution vector was obtained as V = [g_0(X) \;\; C]^{-1}(-[f_0(X) + \hat{d}_a(X, U, U_s) + (X - X_a) - \dot{X}_d + K(X_a - X_d)]). The numerical solution was obtained using V_{k+1} = [g_0(X) \;\; C]^{-1}(-[f_0(X) + \hat{d}_a(X, V_k) + (X - X_a) - \dot{X}_d + K(X_a - X_d)]). After solving for V, the first two elements, which made up U, were extracted from V.
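Since the augmented matrix G = [g_0(X) C] must remain non-singular for the control solution above to exist, a quick numerical check can be run for the quoted values; this is a small sketch assuming NumPy, with \xi_1 = u_{1max}/J_1 = 40 and \xi_2 = u_{2max}/J_2 = 32 computed from Table 2:

```python
import numpy as np

def augmented_G(xi1, xi2):
    """Build G = [g0(X)  C] for the double pendulum with the slack matrix quoted above."""
    g0 = np.array([[0.0, 0.0],
                   [xi1, 0.0],
                   [0.0, 0.0],
                   [0.0, xi2]])
    C = np.array([[-10.0, 10.0],
                  [ 10.0, 10.0],
                  [  0.0, 10.0],
                  [  0.0, 10.0]])
    return np.hstack([g0, C])

G = augmented_G(xi1=40.0, xi2=32.0)   # xi_i = u_imax / J_i from Table 2
print(np.linalg.cond(G))              # finite condition number => G is invertible here
```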

Fig. 4 Control trajectories against time: (a) control u1 against time; (b) control u2 against time

Fig. 5 Neural network approximations against time: (a) network approximation of uncertainty in the first state equation; (b) network approximation of uncertainty in the second state equation; (c) network approximation of uncertainty in the third state equation; (d) network approximation of uncertainty in the fourth state equation

The basis function vectors were selected in the following manner

C_1 = [1 \;\; \sin(x_1) \;\; \cos(x_1)]^T, \; C_2 = [1 \;\; \sin(x_2) \;\; \cos(x_2)]^T, \; C_3 = [1 \;\; \sin(u_1)]^T, \; C_4 = [1 \;\; \sin(u_2)]^T, \; C_5 = [1 \;\; \sin(u_{s1})]^T, \; C_6 = [1 \;\; \sin(u_{s2})]^T
\Phi_1(X) = \Phi_2(X) = kron(C_1, kron(C_2, kron(C_3, kron(C_4, kron(C_5, C_6)))))   (79)

C_1 = [1 \;\; \sin(x_3) \;\; \cos(x_3)]^T, \; C_2 = [1 \;\; \sin(x_4) \;\; \cos(x_4)]^T, \; C_3 = [1 \;\; \sin(u_1)]^T, \; C_4 = [1 \;\; \sin(u_2)]^T, \; C_5 = [1 \;\; \sin(u_{s1})]^T, \; C_6 = [1 \;\; \sin(u_{s2})]^T
\Phi_3(X) = \Phi_4(X) = kron(C_1, kron(C_2, kron(C_3, kron(C_4, kron(C_5, C_6)))))   (80)
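A minimal sketch of assembling the basis vector Phi_1 = Phi_2 of (79) by nested Kronecker products, assuming NumPy; Phi_3 = Phi_4 of (80) follows by replacing (x1, x2) with (x3, x4). The function name is illustrative:

```python
import numpy as np
from functools import reduce

def pendulum_basis(x1, x2, u1, u2, us1, us2):
    """Basis vector Phi_1 = Phi_2 of (79): Kronecker product of six component vectors."""
    factors = [np.array([1.0, np.sin(x1), np.cos(x1)]),
               np.array([1.0, np.sin(x2), np.cos(x2)]),
               np.array([1.0, np.sin(u1)]),
               np.array([1.0, np.sin(u2)]),
               np.array([1.0, np.sin(us1)]),
               np.array([1.0, np.sin(us2)])]
    # The Kronecker product is associative, so a left fold reproduces the nested
    # form kron(C1, kron(C2, ...)) used in (79); the result has 3*3*2*2*2*2 = 144 entries.
    return reduce(np.kron, factors)
```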

For this problem, the neural network training parameters selected were \sigma_1 = \sigma_2 = \sigma_3 = \sigma_4 = 1 \times 10^{-6} and \gamma_{l_1} = \gamma_{l_2} = \gamma_{l_3} = \gamma_{l_4} = 20. For the first iteration of the control solution scheme, V = [u_{1d} \;\; u_{2d} \;\; 0 \;\; 0]^T was used. Numerical results from this problem, as obtained by simulating the system dynamics with the fourth-order Runge-Kutta method [25] with step size \Delta t = 0.01, are presented in Figs. 3-5. State trajectories are given in Fig. 3. It can be seen that the nominal controller is inadequate to achieve satisfactory tracking. However, with adaptive tuning, the resulting modified controller does a better job of forcing the state variables to track the reference signals.
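The closed-loop simulation quoted above uses a standard fourth-order Runge-Kutta integration with \Delta t = 0.01. A generic single-step sketch follows; the right-hand-side function f is assumed to evaluate the closed-loop dynamics (plant, approximate system and weight updates stacked together) and is an assumption of this illustration:

```python
def rk4_step(f, t, x, dt=0.01):
    """One fourth-order Runge-Kutta step for x' = f(t, x); x is a NumPy array."""
    k1 = f(t, x)
    k2 = f(t + 0.5 * dt, x + 0.5 * dt * k1)
    k3 = f(t + 0.5 * dt, x + 0.5 * dt * k2)
    k4 = f(t + dt, x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
```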

The nominal and modified controller trajectories are plotted in Fig. 4. These plots indicate that the online adaptation comes up with a significantly different control history, which is the key in achieving the controller goal. An important component of the control design procedure is proper approximation of the unknown functions as neural network outputs \hat{d}_{a_i}(X, U, U_s). These unknown functions and the neural network outputs (approximations) are plotted in Fig. 5. It can be seen how efficiently and accurately the neural networks learn the unknown functions.

4 Conclusions

Dynamic systems and processes are difficult to model accurately and/or their parameters may change with time. It is essential that these unmodelled terms or changes in parameters are captured and used to adapt the controller for better performance. A model-following adaptive controller using neural networks has been developed in this paper for a fairly general class of nonlinear systems which may be non-square and non-affine in the control variable. The nonlinear system for which the method is applicable is assumed to be of known order, but it may contain matched unmodelled dynamics and/or parameter uncertainties. Simulation results have been shown for two challenging problems. The potential of this technique has been demonstrated by applying it to non-square systems (one of which is non-affine in control as well). Another distinct characteristic of the adaptation procedure presented in this paper is that it is independent of the technique used to design the nominal controller, and hence it can be used in conjunction with any known control design technique. This powerful technique can be made use of in practical applications with relative ease.

5 Acknowledgment

This research was supported by NSF-USA grants 0201076 and 0324428.

6 References

1 McCulloch, W.S., and Pitts, W.: 'A logical calculus of the ideas immanent in nervous activity', Bull. Math. Biophys., 1943, 9, pp. 127-147
2 Miller, W.T., Sutton, R., and Werbos, P.J. (Eds.): 'Neural networks for control' (MIT Press, 1990)
3 Hunt, K.J., Zbikowski, R., Sbarbaro, D., and Gawthorp, P.J.: 'Neural networks for control systems - a survey', Automatica, 1992, 28, (6), pp. 1083-1112
4 Barto, A.G., Sutton, R.S., and Anderson, C.W.: 'Neuron-like adaptive elements that can solve difficult control problems', IEEE Trans. Syst. Man Cybern., 1983, SMC-13, (5), pp. 834-846
5 Narendra, K.S., and Parthasarathy, K.: 'Identification and control of dynamical systems using neural networks', IEEE Trans. Neural Netw., 1990, 1, (1), pp. 4-27
6 Chen, L., and Narendra, K.S.: 'Nonlinear adaptive control using neural networks and multiple models'. Proc. American Control Conf., 2000
7 Sanner, R.M., and Slotine, J.J.E.: 'Gaussian networks for direct adaptive control', IEEE Trans. Neural Netw., 1992, 3, (6), pp. 837-863
8 Lewis, F.L., Yesildirek, A., and Liu, K.: 'Multilayer neural net robot controller with guaranteed tracking performance', IEEE Trans. Neural Netw., 1996, 7, (2), pp. 388-399
9 Khalil, H.K.: 'Nonlinear systems' (Prentice-Hall Inc., NJ, 1996, 2nd edn.)
10 Aloliwi, B., and Khalil, H.K.: 'Adaptive output feedback regulation of a class of nonlinear systems: convergence and robustness', IEEE Trans. Autom. Control, 1997, 42, (12), pp. 1714-1716
11 Seshagiri, S., and Khalil, H.K.: 'Output feedback control of nonlinear systems using RBF neural networks', IEEE Trans. Neural Netw., 2000, 11, (1), pp. 69-79
12 Enns, D., Bugajski, D., Hendrick, R., and Stein, G.: 'Dynamic inversion: an evolving methodology for flight control design', Int. J. Control, 1994, 59, (1), pp. 71-91
13 Lane, S.H., and Stengel, R.F.: 'Flight control using non-linear inverse dynamics', Automatica, 1988, 24, (4), pp. 471-483
14 Ngo, A.D., Reigelsperger, W.C., and Banda, S.S.: 'Multivariable control law design for a tailless airplane'. Proc. AIAA Conf. on Guidance, Navigation and Control, AIAA-96-3866, 1996
15 Slotine, J.-J.E., and Li, W.: 'Applied nonlinear control' (Prentice Hall, 1991)
16 Kim, B.S., and Calise, A.J.: 'Nonlinear flight control using neural networks', AIAA J. Guidance Control Dynamics, 1997, 20, (1), pp. 26-33
17 Leitner, J., Calise, A., and Prasad, J.V.R.: 'Analysis of adaptive neural networks for helicopter flight controls', AIAA J. Guidance Control Dynamics, 1997, 20, (5), pp. 972-979
18 McFarland, M.B., Rysdyk, R.T., and Calise, A.J.: 'Robust adaptive control using single-hidden-layer feed-forward neural networks'. Proc. American Control Conf., 1999, pp. 4178-4182
19 Hovakimyan, N., Nardi, F., Calise, A.J., and Lee, H.: 'Adaptive output feedback control of a class of nonlinear systems using neural networks', Int. J. Control, 2001, 74, (12), pp. 1161-1169
20 Hovakimyan, N., Nardi, F., Nakwan, K., and Calise, A.J.: 'Adaptive output feedback control of uncertain systems using single hidden layer neural networks', IEEE Trans. Neural Netw., 2002, 13, (6), pp. 1420-1431
21 Calise, A.J., Lee, S., and Sharma, M.: 'Development of a reconfigurable flight control law for the X-36 tailless fighter aircraft'. Proc. AIAA Conf. on Guidance, Navigation and Control, Denver, CO, 2000
22 Balakrishnan, S.N., and Huang, Z.: 'Robust adaptive critic based neurocontrollers for helicopter with unmodeled uncertainties'. Proc. 2001 AIAA Conf. on Guidance, Navigation and Control, 2001
23 Karloff, H.: 'Linear programming' (Birkhauser Boston, 1991)
24 Paradiso, J.A.: 'A highly adaptable method of managing jets and aerosurfaces for control of aerospace vehicles', J. Guidance Control Dynamics, 1991, 14, (1), pp. 44-50
25 Gupta, S.K.: 'Numerical methods for engineers' (Wiley Eastern Ltd, 1995)
26 Soloway, D., and Haley, P.: 'Aircraft reconfiguration using generalized predictive control'. Proc. American Control Conf., Arlington, VA, USA, 2001, pp. 2924-2929
27 Soloway, D., and Haley, P.: 'Neural generalized predictive control: a Newton-Raphson implementation'. Proc. IEEE CCA/ISIC/CACSD, 1996
28 Lin, W.: 'Stabilization of non-affine nonlinear systems via smooth state feedback'. Proc. 33rd Conf. on Decision and Control, Lake Buena Vista, FL, December 1994
29 Lin, W.: 'Feedback stabilization of general nonlinear control systems: a passive system approach', Syst. Control Lett., 1995, 25, pp. 41-52
on Guidance, Navigation and Control, 2001 23 Karloff, H.: ‘Linear programming’ (Birkhauser Boston, 1991) 24 Paradiso, J.A.: ‘A highly adaptable method of managing jets and aerosurfaces for control of aerospace vehicles’, J. Guidance Control Dynamics, 1991, 14, (1), pp. 44– 50 25 Gupta, S.K.: ‘Numerical methods for engineers’ (Wiley Eastern Ltd, 1995) 26 Soloway, D., and Haley, P.: ‘Aircraft reconfiguration using generalized predictive control’. Proc. American Control Conf., Arlington, VA, USA, 2001, pp. 2924– 2929 27 Soloway, D., and Haley, P.: ‘Neural generalized predictive control: a Newton–Raphson implementation’. Proc. IEEE CCA/ISIC/ CACSD, 1996 28 Lin, W.: ‘Stabilization of non-affine nonlinear systems via smooth state feedback’. Proc. 33rd Conf. on Decision and Control, Lake Buena Vista, FL, December, 1994 29 Lin, W.: ‘Feedback stabilization of general nonlinear control systems: a passive system approach’, Syst. Cont. Lett., 1995, 25, pp. 41– 52 1660

30 Ham, F.M., and Kostanic, I.: ‘Principles of neurocomputing for science and engineering’ (McGraw Hill, Inc., 2001) 31 Hassoun, M.H.: ‘Fundamentals of artificial neural networks’ (MIT Press, Cambridge, MA, 1995) 32 Khalil, H.K.: ‘Nonlinear systems’. (Prentice-Hall Inc., NJ, 2002, 3rd edn.) 33 Yesildirek, A.: ‘Nonlinear systems control using neural networks’. Ph.D thesis, University of Texas, Arlington, 1994 34 Padhi, R., Unnikrishnan, N., and Balakrishnan, S.N.: ‘Optimal control synthesis of a class of nonlinear systems using single network adaptive critics’. Proc. American Control Conf., 2004 35 Spooner, J.T., and Passino, K.M.: ‘Decentralized adaptive control of nonlinear systems using radial basis neural networks’, IEEE Tran. Autom. Control, 1999, 44, (11), pp. 2050–2057 36 Padhi, R., and Balakrishnan, S.N.: ‘Implementation of pilot commands in aircraft control: a new dynamic inversion approach’. Proc. AIAA Guidance, Navigation, and Control Conf., Austin, TX, USA, 2003

7

Appendix

The most general form of the approximate system dynamics will be considered in this proof. A resolution to the fixed-point problem that arises because of the particular structure of the control solution equation is discussed in this section. The approximate plant model can be represented by X_ a ¼ f 0 (X ) þ g0 (X )U þ d^ a (X , U, U s ) þ X  X a þ C(X )U s

(81)

Substitution of (81) in the stable error dynamic equation (X_ a  X_ d ) þ K(X a  X d ) ¼ 0 leads to (f 0 (X ) þ g0 (X )U þ d^ a (X , U, U s ) þ X  X a þ C(X )U s  X_ d ) þ K(X a  X_ d ) ¼ 0

(82)

Equation (82) can be rewritten as 

 g0 (X ) C(X ) V ¼ H  d^ a (X , U, U s )

(83)

where H W ((f 0 (X ) þ (X  X a )  X_ d ) þ K(X a  X_ d )). Define G W [g0(X) C(X)]. From (83), the control vector V can be solved for as V ¼ G 1 (H  d^ a (X , V ))

(84)

Equation (84) represents a fixed point problem to be solved at each instant. Assuming the state vector X to be fixed, let the mapping in (84) be represented by T. Let S be a closed subset of a Banach space x and let T be a mapping that maps S into S. The contraction mapping theorem is as follows [32]. Suppose that kT(x) 2 T( y)k  rkx 2 yk, 8 x, y [ S, 0  r , 1, then † There exists a unique vector x [ S satisfying x ¼ T(x ) † x can be obtained by the method of successive approximation, starting from any arbitrary initial vector in S. A solution to (84) can be obtained by successive approximation if it can be shown that the mapping shown in (84) is a contraction mapping. Let k denote the iteration instant. Equation (84) at instants k and k þ 1 can be IET Control Theory Appl., Vol. 1, No. 6, November 2007

Taking the norm of the right hand side of (87)

expressed as

T

^ (F(X ,V k )  F(X ,V kþ1 ))k kG 1 W

V k ¼ G 1 (H  d^ a (X , V k )) V kþ1

¼ G (H  d^ a (X , V kþ1 )) 1

(85)

^ kk(F(X ,V k )  F(X ,V kþ1 ))k  kG 1 kkW

(88)

The trigonometric basis functions used in this work ensure that

Let

^ kk(F(X,V k )  F(X ,V kþ1 ))k kG 1 kkW

T(V kþ1 )  T(V k ) ¼ G (H  d^ a (X , V kþ1 )) 1

 G (H  dˆ a (X , V k )) 1

^ kkV kþ1  V k k  kG 1 kkW

In order to prove that the mapping in (84) is a contraction mapping, it needs to be shown from the inequality

On simplification and expressing the neural network output in terms of weights and the basis function vector, (86) becomes T

T

^ F(X, V k )  W ^ F(X , V kþ1 )) T(V kþ1 )  T(V k ) ¼ G 1 (W ^ T (F(X, V k )  F(X, V kþ1 )) ¼ G 1 W (87)

IET Control Theory Appl., Vol. 1, No. 6, November 2007

(89)

(86)

^ kkV kþ1  V k k (90) kT(V kþ1 )  T(V k )k  kG 1 kkW 1 ^ k , 1. In order to ensure this That the term kG kkW inequality, the terms used to form the matrix G should be carefully chosen. Since the matrix G is formed from the control effectiveness matrix and the slack matrix C(X), the control designer has control over kG21k. A proper choice of the matrix G will guarantee convergence of the control iteration.

1661