A SUPERVISED MULTIESTIMATION SCHEME FOR DISCRETE ADAPTIVE CONTROL WHICH GUARANTEES ROBUST CLOSED-LOOP STABILITY

M. De la Sen, S. Alonso-Quesada, A. Bilbao-Guillerna and A.J. Garrido

Instituto de Investigación y Desarrollo de Procesos, Facultad de Ciencias, Universidad del País Vasco, Campus de Leioa (Bizkaia). Aptdo. 644 de Bilbao, 48080 Bilbao (SPAIN)

Abstract. A discrete pole-placement-based adaptive controller with multiestimation is synthesised for linear time-invariant plants. A higher-level switching structure between the various estimation schemes supervises the re-parametrization of the adaptive controller in real time. The main benefit of the proposed multiestimation scheme is the improvement of the adaptation transient behaviour, while robust closed-loop stability is proved even in the presence of sufficiently small unmodeled dynamics. The scheme becomes specifically attractive when the various estimators are adaptive identifiers for a plant modelled as possessing different possible amounts of unmodeled dynamics, including different nominal orders and parametrical uncertainties. A conceptually simple higher-level supervisor, based on heuristic updating rules, which estimates on-line the weights of the switching rule between estimation schemes, is also discussed.

Keywords. Discrete adaptive control, multiestimation, stability, switching techniques, supervisory control

I. INTRODUCTION

The challenge of control theory nowadays is to develop control schemes able to achieve good performance in terms of speed, accuracy and stability for increasingly complex systems, including the presence of large uncertainties in the system to be controlled. This paper deals with the problem of improving the transient response of a class of adaptive systems when the plant to be controlled has uncertain parameter values, or parameters that can change abruptly, by using a multiestimation-based adaptive controller. Parallel multiestimation schemes for adaptive control were introduced to improve the adaptation transient performance compared to single estimation schemes while relaxing the classical hypotheses of the ideal case of adaptive control [1]-[6]. Those hypotheses are the knowledge of the high-frequency gain sign, the knowledge of upper bounds for the plant polynomial degrees and of the relative degree of the plant, and the requirement that the plant be minimum-phase. Simulations made in this context showed that this kind of scheme could also be used to improve the transient response of adaptive systems [5]. Since then, multiestimation techniques have been used for both purposes.

In [7] and [8], Narendra et al. use a continuous-time multiestimation scheme to show how adaptive multiestimation control can achieve good performance in the transient of adaptive control systems. They consider $N_e$ adaptive models (or estimation schemes) and an adaptive controller associated with each model. Each identification scheme/adaptive controller pair can be indexed by an integer $i \in N_e$. There is a supervisory index $J(i(\cdot))$ associated with each individual identification scheme. The identification scheme that parameterises the adaptive controller is chosen by a switching rule obtained by evaluating the minimum of the $J(i(\cdot))$, $i \in N_e$. At a finite or infinite sequence of decision times, the supervisory indexes of the identifiers are compared and the controller corresponding to the best index is selected. In [9], a switching supervisory control scheme is used for adaptively stabilising an unknown plant. Mathematical proofs of robust closed-loop stability are given, and it is shown that good control can be achieved asymptotically even if the switching mechanism does not stop in finite time. It is shown in [10] that, for discrete multiestimation systems, there exist decision rules leading to asymptotic convergence of the estimators, under which the supervised control system converges to a reliable identifier. These approaches use, in general, a higher-level switching rule, which acts as a supervisor of the basic parameterised controller and whose role is to compare on-line the performance indexes of all the identifiers in order to choose the best controller parameterised by such an identifier [11]. Other approaches reported in the literature are based on a predefined switching route over the set of controllers [5], [12].

Several proposals have been made for the choice of the supervisory index. In [7] and [8], an accumulated identification error with forgetting factor for continuous-time systems is used. In [13], a discrete-time version of that performance index is used to switch between set-point problem controllers. In [11], the performance index is the norm-squared value of the identification error, and in [14] the time average of the squared identification error is used.
In general, the performance index is defined from the identification error and, possibly, the control effort, which is a natural choice since that index reflects how far a specific identifier is from the real plant behaviour. The main objective of this work is to extend the results of [7], [8] and [9], which concern the continuous-time case, to the discrete-time case while showing how a significant transient response improvement can also be achieved. Mathematical proofs of closed-loop signal boundedness will also be given.

The plant is decomposed into a modelled part plus an unmodeled one that includes different amounts of unmodeled dynamics. Each parameter-adaptive estimator has a different dimensionality according to the number of modes and zeros of the modelled part. Assume that the designer does not know the plant order but only knows upper-bounds for the overall numbers of poles and zeros. Then, a plant free of zeros of at most (unknown) n-th order may be decomposed into modelled parts of orders running from unity to n while the unmodeled parts run from order (n-1) to order zero. Each of those decompositions may be dealt with by its own estimator/adaptive controller pair with its associated treatment of the corresponding unmodeled part. The idea may be extended in a natural way to plants possessing zeros. The parallel multiestimation scheme is subject to switchings between estimator/adaptive controller parametrization pairs to determine the currently active adaptive controller. Exhaustive simulations have corroborated the efficiency of using the multiestimation scheme in the proposed way. The influence of the free-design parameters of the estimation scheme on the closed-loop performance will also be discussed.

The paper is organized as follows. Section II deals with the system to be controlled and the multiestimation and adaptive controller architecture, together with the basic assumptions needed for stability and convergence purposes. A higher-level supervisor for the weights of the identification performance indexes is also discussed, while the switchings between the estimators within the parallel multiestimation scheme are considered as a first-level supervisor. The higher-level supervisor is based on empirical rules that improve the tracking performance by updating on-line the weights of the identification performance indexes. In Section III, the main properties of the identification algorithms, the control law and the closed-loop stability are given.
In Section IV, some computer simulations and their corresponding discussion are presented and finally, conclusions end the paper.

II. PROBLEM STATEMENT

A) Discrete plant

Consider the linear and time-invariant discrete plant

$$A(q^{-1})\,y_k = B(q^{-1})\,u_k \qquad (1)$$

where $u_k$ and $y_k$ are the input and the output sequences, respectively, $q^{-1}$ is the one-step delay operator, and the degrees of the polynomials $A(q^{-1}) = 1 + A'(q^{-1})$ (monic) and $B(q^{-1})$ are at most $n \ge 1$ and $m \ge 0$ ($n > m$), respectively. If the true degrees of $A$ and $B$ are $n_0 < n$ and $m_0 \le n_0 - 1$, then (1) remains valid by using appropriate zero coefficients in the $A$ and $B$ polynomials. It turns out that the plant (1) may be expressed equivalently by the following set of models, whose modelled parts have unity relative degree:

$$A^{(i)}(q^{-1})\,y_k = B^{(i)}(q^{-1})\,u_k + \eta_k^{(i)} \qquad (2)$$

with degrees $\partial A^{(i)} = i$; $\partial B^{(i)} = i - 1$; $i \in N_e \subseteq \{1, 2, \ldots, n_e\}$, where $n_e \le n-1$, so that all the estimation models are of unity relative degree. By using distinct combinations of relative degrees, the number of possible models may be increased to $n_e > n-1$. The signal $\eta_k^{(i)}$ represents the unmodeled dynamics of the respective nominal model $A^{(i)}(q^{-1})\,y_k^{*} = B^{(i)}(q^{-1})\,u_k^{*}$ ($i = 1, 2, \ldots, n-1$), with $A = A^{(n)}$, $B = B^{(n)}$ and $\eta^{(n)} \equiv 0$ defining the perfectly modelled plant description. If the plant (1) is subject to additive noise, then it may be included in the various signals $\eta_k^{(i)}$ in (2). The following standard assumptions are made:

Assumptions A.1.

(1) The polynomial degree upper-bounds $n$ and $m$ are known.

(2) All the unstable plant zeros (if any) are known and are also zeros of the reference model.

(3) The reference model $H_m(z) = B_m(z)/A_m(z)$ is exponentially stable, i.e. all the roots of $A_m(z)$ satisfy $|z| \le 1 - \delta$ for some $\delta \in (0, 1]$.

(4) There exists a known convex and compact subset $D_i \subseteq \Re^{2n_i}$ of the parameter space to which the real nominal plant parameter vectors in (2) belong, so that for all plant parameterisations in $D_i$ the polynomials $A^{(i)}$ and $B^{(i)}$ are coprime for all $i \in N_e$. In particular, note that $D \equiv D_n \subset \Re^{n+m+1} \subseteq \Re^{2n}$.

(5) There exist real constants $\sigma^{(i)} \in (0, 1)$, $\alpha_0^{(i)} \ge 0$ and $\alpha^{(i)} \ge 0$ such that

$$|\eta_k^{(i)}| \le \bar{\eta}_k^{(i)} = \alpha^{(i)} \rho_k^{(i)} + \alpha_0^{(i)} \le \alpha_1 \rho_k^{(i)} + \alpha_0$$

where $\rho_k^{(i)} = \sup_{0 \le j \le k} \left( (\sigma^{(i)})^{k-j} \, \|\varphi_j^{(i)}\| \right)$ for all $i \in N_e$.
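The supremum in Assumption A.1 (5) does not have to be recomputed over the whole past at every sample: directly from its definition, $\rho_k = \max(\sigma \rho_{k-1}, \|\varphi_k\|)$. The following minimal Python sketch propagates the bound recursively. The function name and the choice of passing precomputed regressor norms are illustrative assumptions, not part of the paper.

```python
def rho_sequence(phi_norms, sigma):
    """Return rho_k = sup_{0<=j<=k} sigma**(k-j) * ||phi_j|| for every k.

    phi_norms: iterable of nonnegative regressor norms ||phi_0||, ||phi_1||, ...
    sigma:     constant in (0, 1), as in Assumption A.1 (5).
    (Both argument names are hypothetical, chosen for this sketch.)
    """
    if not 0.0 < sigma < 1.0:
        raise ValueError("sigma must lie in (0, 1)")
    rho = []
    previous = 0.0  # neutral start value because all norms are >= 0
    for norm in phi_norms:
        # Older terms are discounted by sigma; the newest enters undiscounted.
        previous = max(sigma * previous, norm)
        rho.append(previous)
    return rho
```

Multiplying the result by $\alpha^{(i)}$ and adding $\alpha_0^{(i)}$ then gives the upper-bound sequence $\bar{\eta}_k^{(i)}$ used by the dead-zone.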

Note that the plant zeros polynomial $B(z)$ can be expressed uniquely as $B = B^{+} B^{-}$, where $B^{+}$ is monic and $B^{-}$ contains all zeros of $B$ that satisfy $|z| > 1 - \delta$ for some $\delta \in (0, 1]$ (in particular, $B^{-}$ includes all unstable roots of $B$). Assumption A.1 (4) implies that the plant is controllable. It will then be used to project the estimates of $A$ and $B$ onto $D$ so as to ensure the controllability of all the estimation models for all time. Assumption A.1 (5) implies that the contribution of the unmodeled dynamics to the output in all the estimates grows no faster than linearly with the supremum of the total regressor. In particular, it implies that $|\eta_k^{(i)}| \le \alpha_1 \rho_k + \alpha_0$, where

$$\rho_k = \sup_{0 \le j \le k} \left( \sigma^{k-j} \, \|\varphi_j^{(n)}\| \right)$$

with $\alpha_1 = \mathrm{Max}\left(\alpha^{(i)},\ i \in N_e\right)$, $\alpha_0 = \mathrm{Max}\left(\alpha_0^{(i)},\ i \in N_e\right)$ and $\sigma = \mathrm{Max}\left(\sigma^{(i)},\ i \in N_e\right)$. Assumption A.1 (5) will then be crucial to prove the closed-loop stability of the adaptive system.

B) Parallel multiestimation scheme

A parallel multiestimation scheme is proposed based on the various uncertain plant models (2) for the case when neither the plant (1) nor the degrees of $A$ and $B$ are known. Each estimation scheme possesses a relative adaptation dead-zone so as to freeze the adaptation when the absolute identification error is sufficiently small relative to a known upper-bound of the absolute value of the contribution of the unmodeled dynamics. At the same time, each estimator tentatively parameterizes the adaptive controller separately at all times through a specific "ad hoc" diophantine equation based on the nominal part of each of the various estimates. The main idea behind this philosophy is to switch at appropriate sampling instants between the various estimates so as to appropriately reparameterise the pole-placement-based adaptive controller. Such a strategy potentially allows one to deal with larger or smaller amounts of unmodeled dynamics, since the nominal true plant is of unknown order and relative order, and to enhance the adaptation transient performance by using at each time interval the appropriate estimation/controller parameterisation pair. The closed-loop stability is guaranteed if the time interval between consecutive switches exceeds a minimum residence time. To simplify the exposition, a parallel multiestimation scheme with a set of only n estimators of relative degree unity is used, based on the plant models (2), as follows:

$$\hat{y}_k^{(i)} = -\hat{A}_k^{(i)\prime}\, y_k + \hat{B}_k^{(i)}\, u_k = \hat{\theta}_k^{(i)T} \varphi_k^{(i)} \qquad (3)$$

$$\hat{A}_k^{(i)}(q^{-1})\, y_k = \hat{B}_k^{(i)}(q^{-1})\, u_k + e_k^{(i)} \qquad (4)$$

for $i = 1, 2, \ldots, n$ and all $k \ge 0$, where $\hat{A}_k^{(i)}(q^{-1}) = 1 + \hat{A}_k^{(i)\prime}(q^{-1})$ and $\hat{B}_k^{(i)}(q^{-1})$ are the estimates of the polynomials $A(q^{-1})$ and $B(q^{-1})$, respectively, and $e_k^{(i)}$ is the i-th identification error for the k-th sample, which is given by:

$$e_k^{(i)} = y_k - \hat{y}_k^{(i)} = \hat{A}_k^{(i)} y_k - \hat{B}_k^{(i)} u_k = y_k - \hat{\theta}_k^{(i)T} \varphi_k^{(i)} = \tilde{\theta}_k^{(i)T} \varphi_k^{(i)} + \eta_k^{(i)} \qquad (5)$$

where $\hat{y}_k^{(i)}$ is the i-th estimate of the output for the k-th sample. $\hat{\theta}_k^{(i)}$ and $\varphi_k^{(i)}$ are, respectively, the estimate of the nominal plant parameter vector and the associated regressor, defined by

$$\theta^{(i)} = \left(-a_1, \ldots, -a_i, b_0, \ldots, b_{i-1}\right)^T ; \quad \varphi_k^{(i)} = \left(y_{k-1}, \ldots, y_{k-i}, u_{k-1}, \ldots, u_{k-i}\right)^T \qquad (6)$$

for $i \in N_e$, with $\tilde{\theta}_k^{(i)} = \theta^{(i)} - \hat{\theta}_k^{(i)}$ the corresponding parametrical error. Note that in the perfectly modelled case, (2) becomes identical to $y_k = \theta^T \varphi_k$, with output estimate $\hat{y}_k = \hat{y}_k^{(n)} = \hat{\theta}_k^{(n)T} \varphi_k$, provided that $N_e$ is extended to include the true plant order, with

$$\theta = \theta^{(n)} = \left(-a_1, \ldots, -a_n, 0, \ldots, 0, b_0, \ldots, b_m\right)^T \qquad (7.a)$$

$$\varphi_k = \varphi_k^{(n)} = \left(y_{k-1}, \ldots, y_{k-n}, u_{k-1}, \ldots, u_{k-d+1}, u_{k-d}, \ldots, u_{k-n}\right)^T \qquad (7.b)$$

It is typical in control problems to know a compact subset $D \subseteq \Re^{n+m+1}$ of the parameter space to which the nominal real plant parameter vector belongs. This knowledge allows the estimates to be projected onto such a domain. If the estimation algorithm starts running with a nominal estimated vector far away from the real plant parameter vector, then the transient will show large deviations from the desired output, resulting in unsuitable performance. In this work we have chosen a parallel multiestimation scheme to improve the transient response of the adaptive system. The architecture of the multiestimation scheme is represented in Figure 1:


Figure 1. Multiestimation scheme

There exist n estimation algorithms running in parallel (i.e. at each sampling time $t_k$ every algorithm gives the estimated parameter vector $\hat{\theta}_k^{(i)}$ and the estimated plant output $\hat{y}_k^{(i)}$, $i \in N_e$, based on past plant input and output measurements). The algorithms differ from one another in the initialization of the estimated parameter vector and/or in the kind of estimation algorithm, and together they form the so-called multiestimation scheme. There also exist n adaptive controller parameterisations (with only one being in operation at each time) such that the i-th adaptive controller is parameterised at every sampling instant by the i-th estimation algorithm. Thus, every identification algorithm/adaptive controller pair is indexed by a single integer $i \in N_e$. Denote by $c_k$ the integer in $N_e$ that defines the controller (parameterised by its respective identification algorithm) which is active (i.e. connected to the plant for control purposes) at the current time. A switching rule based on the identification errors $e_k^{(i)} = y_k - \hat{y}_k^{(i)} = \varphi_k^{(i)T} \tilde{\theta}_k^{(i)} + \eta_k^{(i)}$ ($i \in N_e$) of the n estimation algorithms chooses at each sampling time $t_k$ the individual estimation scheme which parameterizes the controller that is actually connected in feedback to the plant at time $kT$.

Remarks.

1. At each time, only one parameterisation of the adaptive controller, obtained from one of the estimates of the parallel multiestimation scheme, is in operation generating the control input.

2. All the estimation algorithms are always running in parallel to calculate all the estimated plant outputs. Also, each respective adaptive controller is updated for all time, although only the active controller generates the plant input.
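As an illustration of one estimator of the parallel bank, the sketch below builds the regressor of (6) for a model of order i and performs one least-squares step with a relative dead-zone of the form stated in (8.a)-(8.c). It is a hedged sketch only: the function names, the argument layout and the use of NumPy are assumptions made for this example, and the bound `eta_bar` is taken as given.

```python
import numpy as np

def regressor(y_past, u_past, order):
    """Regressor (y_{k-1},...,y_{k-i}, u_{k-1},...,u_{k-i}) for model order i.

    y_past[0] is y_{k-1}, y_past[1] is y_{k-2}, and so on (same for u_past).
    """
    return np.concatenate([np.asarray(y_past[:order], dtype=float),
                           np.asarray(u_past[:order], dtype=float)])

def dead_zone_rls_step(theta, P, phi, y, eta_bar, mu):
    """One dead-zone least-squares step; returns (theta, P, error, s).

    eta_bar: known upper bound on the unmodeled-dynamics contribution.
    mu:      dead-zone design parameter, mu > 1.
    """
    e = y - theta @ phi                      # identification error
    if abs(e) <= mu * eta_bar:
        s = 0.0                              # inside the dead-zone: freeze
    else:
        s = 1.0 - mu * eta_bar / abs(e)      # relative adaptation gain
    denom = 1.0 + phi @ P @ phi
    theta_new = theta + s * (P @ phi) * e / denom
    P_new = P - s * np.outer(P @ phi, phi @ P) / denom
    return theta_new, P_new, e, s
```

Running the bank then amounts to calling `dead_zone_rls_step` once per sample for each order i with its own `theta`, `P` and `eta_bar`, independently of which controller is currently active.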



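C) Parallel multiestimation scheme

The proposed parameter-adaptive algorithm with relative adaptation dead-zone is:

$$\hat{\theta}_{k+1}^{(i)} = \hat{\theta}_k^{(i)} + \frac{s_k^{(i)} P_k^{(i)} \varphi_k^{(i)} e_k^{(i)}}{1 + \varphi_k^{(i)T} P_k^{(i)} \varphi_k^{(i)}} \qquad (8.a)$$

$$P_{k+1}^{(i)} = P_k^{(i)} - \frac{s_k^{(i)} P_k^{(i)} \varphi_k^{(i)} \varphi_k^{(i)T} P_k^{(i)}}{1 + \varphi_k^{(i)T} P_k^{(i)} \varphi_k^{(i)}} ; \quad P_0^{(i)} = P_0^{(i)T} > 0 \qquad (8.b)$$

$$s_k^{(i)} = \begin{cases} 0 & \text{if } |e_k^{(i)}| \le \mu^{(i)} \bar{\eta}_k^{(i)} \\ 1 - \dfrac{\mu^{(i)} \bar{\eta}_k^{(i)}}{|e_k^{(i)}|} & \text{otherwise} \end{cases} \qquad (8.c)$$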

for some real design parameters $\mu^{(i)} > 1$, all $i \in N_e$ and all integers $k \ge 0$. The relative dead-zone (8.c) freezes the adaptation of the parameters and of the covariance matrix in (8.a)-(8.b) if the identification error is sufficiently small relative to the available upper-bound function of the contribution of the unmodeled dynamics. This is the basic mechanism which gives the estimates their attractive properties and allows the stabilisation of the closed-loop system in the presence of unmodeled dynamics satisfying Assumption A.1 (5). Those features are discussed in Section III.

D) Basic adaptive controller

The transfer function of the reference model is

$$H_m(z) = \frac{B^{-}(z)\, B'_m(z)\, A_{0i}(z)}{A_m(z)\, A_{0i}(z)}$$

where $B'_m(z)$ contains the free-design reference model zeros, $B^{-}(z)$ is formed by the unstable (assumed known) plant zeros and $A_{0i}(z)$ are closed-loop stable pole-zero cancellations, introduced when necessary to guarantee that the relative degree of the reference model is not less than that of the plant, so that the synthesised controller is causal. All the controllers are based on pole placement (see, for instance, [15]), whose basic scheme is displayed in the following Figure 2:


Figure 2. Basic adaptive controller

Then, we will consider for each i-th controller the polynomials $R_k^{(i)}$, $S_k^{(i)}$ and $T$ ($T$ depends only on the reference model zeros polynomial, which has constant coefficients), where $T = B'_m A_{0i}$ and $R_k^{(i)}$ (monic), $S_k^{(i)}$ are the unique solutions, with degrees fulfilling

$$\deg(R_k^{(i)}) = 2n - i, \quad \deg(S_k^{(i)}) = i - 1 \quad \text{and} \quad \deg(A_m A_{0i}) = 2n \qquad (9)$$

of the polynomial diophantine equation

$$\hat{A}_k^{(i)} R_k^{(i)} + \hat{B}_k^{(i)} S_k^{(i)} = \hat{B}_k^{(i)+} A_m A_{0i} \;\Leftrightarrow\; \hat{A}_k^{(i)} R_{1k}^{(i)} + B^{-} S_k^{(i)} = A_m A_{0i} \qquad (10)$$

with $R_k^{(i)} = \hat{B}_k^{(i)+} R_{1k}^{(i)}$, since $B^{-}(z)$ is known and $R_k$, $R_{1k}$, $B^{+}$ and $\hat{B}_k^{(i)+}$ are monic, and with $\deg(\hat{A}_k^{(i)}) = i$ and $\deg(\hat{B}_k^{(i)}) = i - 1$ for all $i \in N_e$ at every sampling instant if the relative degree of the i-th estimation model is unity. Assumption A.1 (4) is extended in a natural way to the multiestimation scheme by using "a posteriori" projection of the estimates when necessary, as follows:

Assumption A.2. It is assumed that $\hat{\theta}_k^{(i)} \in D_i$ for all $k \ge 0$; then $(\hat{A}_k^{(i)}, \hat{B}_k^{(i)})$ is a coprime pair over $D_i$ for all $k \ge 0$ and all $i \in N_e$; i.e. all the estimation schemes are controllable for all time. ❏
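Coprimeness is what makes a diophantine equation of the type (10) uniquely solvable under a degree constraint on $S$. As a hedged illustration, the sketch below solves the generic equation $A R + B S = C$ for coprime $A$, $B$ by stacking the coefficient-matching conditions into a square linear system. The function name, the descending-power coefficient convention and the generic right-hand side `C` are assumptions of this example; it does not reproduce the factorisation of $\hat{B}$ into $\hat{B}^{+} B^{-}$ used in the paper.

```python
import numpy as np

def solve_diophantine(A, B, C):
    """Solve A*R + B*S = C with deg R = deg C - deg A and deg S = deg A - 1.

    Polynomials are coefficient sequences in descending powers. A and B must
    be coprime, with deg B < deg A and deg C >= 2*deg A - 1.
    """
    A, B, C = (np.atleast_1d(np.asarray(p, dtype=float)) for p in (A, B, C))
    na, nb, nc = len(A) - 1, len(B) - 1, len(C) - 1
    if nb >= na or nc < 2 * na - 1:
        raise ValueError("degree conditions violated")
    nr, ns = nc - na, na - 1
    M = np.zeros((nc + 1, nc + 1))
    for j in range(nr + 1):
        unit = np.zeros(nr + 1)
        unit[j] = 1.0
        M[:, j] = np.convolve(A, unit)            # column for coefficient j of R
    for j in range(ns + 1):
        unit = np.zeros(ns + 1)
        unit[j] = 1.0
        col = np.convolve(B, unit)                # column for coefficient j of S
        M[nc + 1 - len(col):, nr + 1 + j] = col   # align low-order terms
    x = np.linalg.solve(M, C)                     # singular if A, B share a root
    return x[:nr + 1], x[nr + 1:]
```

When `A` and `C` are both monic, the leading coefficient of the returned `R` equals one, in line with the monic requirement on $R_k^{(i)}$.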


The above assumption is useful provided that each estimation scheme/adaptive controller parameterisation pair is associated with a different plant operation point. This is often the case, for instance, in some chemical engineering processes. Now, it is necessary to elucidate how to choose the current adaptive controller (in other words, which controller is active at each sampling instant) from the family of parallel controller parameterisations so that the adaptation transients are acceptable in practice while the closed-loop scheme remains globally stable. Thus, the basic pole-placement adaptive controller (see Figure 2) is reparameterised by one of the estimators of the multiestimation scheme during appropriate time intervals of lengths not less than a minimum residence time. A switching law (supervisor) calculates the switching times between the various estimators, subject to a residence time, which is used as a mechanism to reparameterise on-line the basic adaptive controller that is in operation generating the control input. The operation mode of such a supervisor is discussed in the sequel.

D.1 Switching rule in the parallel multiestimation scheme (First-level Supervision)

The objective of the supervisor is to evaluate the performance of the possible controllers connected to the plant with the aim of choosing the current controller from the set of parallel controllers. The subsequent supervision scheme is a mechanism for calculating the consecutive switching times between the various estimators in order to reparameterise on-line the basic adaptive controller. The proposed specific performance index has the form:

$$J_k^{(i)} = \sum_{l=k-M}^{k} \lambda^{k-l} \left[ \alpha \left( y_l - \hat{y}_l^{(i)} \right)^2 + (1 - \alpha) \left( u_l - \hat{u}_l^{(i)} \right)^2 \right] \qquad (11)$$

for all $i \in N_e$, where $\hat{y}_k^{(i)}$ and $\hat{u}_k^{(i)}$ are, respectively, the i-th predicted output and input, given by:

$$\hat{y}_k^{(i)} = \left( 1 - \hat{A}_k^{(i)}(q^{-1}) \right) y_k + \hat{B}_k^{(i)}(q^{-1})\, u_k \qquad (12.a)$$

$$\hat{u}_k^{(i)} = \left( 1 - R_k^{(i)}(q^{-1}) \right) u_k + T(q^{-1})\, u_{ck} - S_k^{(i)}(q^{-1})\, y_k \qquad (12.b)$$

where $u_{ck}$ is the bounded reference sequence, $0 < \alpha \le 1$ and $M$ is an integer large enough for the performance evaluation to be meaningful. Note that (11) has two additive terms. The first one is a measure of the long-term accuracy of each identification algorithm, where the forgetting factor $\lambda \in (0, 1]$ establishes the effective memory of the index in rapidly changing environments. The second one weights, with the same forgetting factor, the quadratic error between the input sequence and its estimated values. Now, the switching rule for the basic adaptive controller reparameterisation is obtained from the performance index (11) as follows. Let the sequence of switching sampling times be denoted by $T_S = \{ t^{(1)}, t^{(2)}, \ldots, t^{(\pi)} \}$, where $\pi$, which may be finite or countably infinite, is the number of switchings and $\left( t^{(i+1)} - t^{(i)} \right) \ge \tau_r = N_r T$ (a known minimum residence time) for all $t^{(i)}, t^{(i+1)} \in T_S$.

Thus, the $c_k$-th estimation scheme, with $c_k \in N_e$, which parameterizes the basic adaptive controller for all $k \ge 0$, is updated at the switching times in $T_S$ as follows. Assume that the last switching time for the controller re-parameterisation was $t^{(i)}$, an integer multiple of the sampling period $T$. Thus, for each current k-th sampling time, define the auxiliary integer variable:

$$\bar{c}_k = \mathrm{Arg}\left[ i : J_k^{(i)} = \min_{j \in N_e} \left( J_k^{(j)} \right) ;\ i \in N_e \right], \quad \text{all integer } k \ge 1$$

if $kT \ge t^{(i)} + \tau_r$ then $c_k \leftarrow \bar{c}_k$ (it indexes the current active controller parameterisation from one of the estimation algorithms); otherwise $c_k \leftarrow c_{k-1}$ end__if

if $c_k \ne c_{k-1}$ then modify $t^{(i+1)} \leftarrow kT$ and $T_S \leftarrow \{ T_S, t^{(i+1)} \}$ end__if

A minimum residence time which ensures closed-loop stability always exists for any unforced time-varying linear system consisting of a set of stable linear time-varying configurations. Switches between those configurations at intervals exceeding such a time ensure global stability. These ideas will be discussed in the next section in relation to the closed-loop stability provided by the proposed multiestimation-based adaptive controller when the residence time is known. If it is unknown then, since it always exists under weak assumptions, it is estimated by successive on-line increments via closed-loop performance tests until an upper-bound is obtained in finite time, so that stability is still ensured.
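The first-level supervision above combines two ingredients: the windowed index (11) and a minimum-index selection that is only allowed to change the active controller once the residence time has elapsed. A minimal Python sketch under stated assumptions follows: the function names, the zero-based model indices and the representation of the residence time as a number of samples `n_r` are choices made for this illustration.

```python
import numpy as np

def performance_index(y_err, u_err, alpha, lam):
    """Index (11) over one window of errors, ordered oldest first, newest last.

    y_err[l] = y_l - yhat_l and u_err[l] = u_l - uhat_l for one estimator.
    """
    y_err = np.asarray(y_err, dtype=float)
    u_err = np.asarray(u_err, dtype=float)
    ages = np.arange(len(y_err) - 1, -1, -1)       # k - l for each window entry
    weights = lam ** ages                          # forgetting factor lambda^(k-l)
    return float(np.sum(weights * (alpha * y_err**2 + (1.0 - alpha) * u_err**2)))

def select_controller(J, c_prev, k, k_last_switch, n_r):
    """Return (active index, sample of last switch) under a residence time.

    J:   sequence of current indices J_k^(i), one per estimator.
    n_r: minimum number of samples between consecutive switches.
    """
    best = int(np.argmin(J))
    if best != c_prev and k - k_last_switch >= n_r:
        return best, k                             # switch and record the time
    return c_prev, k_last_switch                   # keep the active controller
```

For example, with `J = [3.0, 1.0, 2.0]`, the active index 0 and the last switch at sample 0, a residence time of 5 samples keeps controller 0 at `k = 3` and switches to controller 1 at `k = 10`.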


D.2. High-level Supervision

A higher-level supervisor, of optional use, is now proposed on top of the above one, which supervises the switching of the parameterisation of the active controller. This new supervisor selects on-line the value of the weighting factor $\alpha$, which becomes time-varying as a result. The identification performance indices (11) are assumed to take the more general form:

$$J_k^{(i)} = \sum_{l=k-M}^{k} \lambda^{k-l} \left[ \alpha_k \left( y_l - \hat{y}_l^{(i)} \right)^2 + (1 - \alpha_k) \left( u_l - \hat{u}_l^{(i)} \right)^2 \right] \qquad (13)$$

where $\lambda \in (0, 1]$ is a forgetting factor and $\alpha_k \in (0, 1]$ is updated on-line at sampling instants that are multiples of a sufficiently large integer $M'$ (with $M' \ne M$) by evaluating the closed-loop tracking/control effort performance index

$$J'_k = \sum_{l=k-M'}^{k} \left[ \gamma \left( y_l - y_{ml} \right)^2 + (1 - \gamma)\, u_l^2 \right] \qquad (14)$$

where $y_{mk}$ is the uniformly bounded reference output sequence, for some prescribed weight $\gamma \in (0, 1]$, such that if $k = jM'$ for any nonnegative integer $j$ then $\alpha_{k+i} = \alpha_k \in [\alpha_{min}, \alpha_{max}] \subset (0, 1]$; $i = 1, 2, \ldots, (j+1)M'$, according to the following simple empirical rules:

If $\alpha_k$ is increasing (decreasing) and the value of the performance index $J'_k$ is decreasing, then continue with the same action on $\alpha_k$; otherwise change the updating action to "decrease" ("increase") $\alpha_k$.

The conceptual idea is to continue with the current correcting action if the tracking performance is improving and to change it otherwise. The current value of $\alpha_k$ is obtained by adding or subtracting a quantity $\pm \Delta\alpha_k = \pm (\text{or } \mp)\, \Delta\alpha_k / m_f$ to $\alpha_{k-M'}$ if $k = jM'$. The value $m_f$ is a scaling factor, or modulation factor, which allows the variation rate to be increased or reduced and which can optionally be re-updated on-line. Typically:

. $m_f = 1$ if the variation modulus is meant to be constant, with $\Delta\alpha_{(.)}$ being constant.

. if $m_f < 1$ then $\Delta\alpha_{(.)}$ is increased (typically when the initial increment $\Delta\alpha_0$ is small) and,

. if $m_f > 1$ then $\Delta\alpha_{(.)}$ is decreased (typically when $\Delta\alpha_0$ is large).

Projection is used when the updated value lies outside $[\alpha_{min}, \alpha_{max}] \subset (0, 1]$. The objective of the choice $M' \ne M$ is to prevent both supervisions from operating at the same speed, which could lead to conflicting decisions. Experience from worked examples suggests that one of the horizons should be roughly twice the size of the other. More formally, the algorithm can be stated in the following way:

Step 0 (Initialization): Set $\alpha_0 \in [\alpha_{min}, \alpha_{max}]$; $1 \ge \alpha_{max} \ge \alpha_{min} > 0$, $0 < \Delta\alpha_0 \le \mathrm{Min}(\alpha_{max} - \alpha_0, \alpha_0 - \alpha_{min})$, and

$$m_{f0} \begin{cases} < 1 & \text{if } \Delta\alpha_0 < m_1 \\ = 1 & \text{if } \Delta\alpha_0 \in [m_1, m_2] \\ > 1 & \text{if } \Delta\alpha_0 \ge m_2 > m_1 \end{cases}$$

with $m_{f0} \in (1 - \varepsilon, 1 + \varepsilon)$, $\varepsilon > 0$, $m_2 \ge m_1 > 0$.

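Step 1: for $k \leftarrow 1$, set $\alpha \leftarrow \alpha_0$, $\Delta\alpha \leftarrow \Delta\alpha_0$, $m_f \leftarrow m_{f0}$, $\Delta m_f \leftarrow \Delta m_{f0} > 0$ (a very small real constant).

Step 2 ($\alpha$-refinement):

for__all $k \ge M'$ do

Label $\Delta m_f$: if $J'_{k-M'} > J'_{k-2M'}$ then $\Delta m_f \leftarrow (-\Delta m_f)$ end__if

$m_f \leftarrow \mathrm{Projection}_{(1-\varepsilon,\, 1+\varepsilon)} \leftarrow (m_f + \Delta m_f)$

Label $\Delta\alpha$: if $J'_k > J'_{k-M'}$ then $\Delta\alpha \leftarrow (-\Delta\alpha / m_f)$ end__if

$\alpha \leftarrow \mathrm{Projection}_{[\alpha_{min},\, \alpha_{max}]} \leftarrow (\alpha + \Delta\alpha)$

$\alpha_k \leftarrow \alpha$, $m_{f k} \leftarrow m_f$

end__for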
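The empirical rule behind the $\alpha$-refinement (keep the current correcting action while the tracking index improves, reverse it otherwise, and project onto the admissible interval) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the modulation factor `mf` is treated here as a fixed parameter rather than being re-updated on-line.

```python
def tracking_index(y, y_ref, u, gamma):
    """Index (14) over one window of outputs, reference outputs and inputs."""
    return sum(gamma * (yl - ym) ** 2 + (1.0 - gamma) * ul ** 2
               for yl, ym, ul in zip(y, y_ref, u))

def refine_alpha(alpha, d_alpha, J_now, J_prev, alpha_min, alpha_max, mf=1.0):
    """One supervisory update of the weight alpha; returns (alpha, d_alpha).

    J_now, J_prev: tracking index at the current and previous decision times.
    mf:            modulation factor (held constant in this sketch).
    """
    if J_now > J_prev:
        # Tracking got worse: reverse the correcting action, scaled by mf.
        d_alpha = -d_alpha / mf
    # Apply the step and project onto [alpha_min, alpha_max].
    alpha = min(max(alpha + d_alpha, alpha_min), alpha_max)
    return alpha, d_alpha
```

Calling `refine_alpha` only at samples that are multiples of $M'$, and holding the weight constant in between, reproduces the piecewise-constant behaviour of $\alpha_k$ described above.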

Note that if $M' \ge M$ then the higher-level supervisor ($\alpha$-supervision) operates at the same or a slower rate than the first-level supervisor (switchings between estimator and controller parameterisation pairs to decide the active adaptive controller). Otherwise, the $\alpha$-supervision operates at a faster rate. The adjustment between both rates depends highly on the application and on the designer's "a priori" knowledge. In the examples of Section IV, exhaustive worked examples showed that the choices $M' \ge lM$ ($l = 1, 2$) were not practical, since the $\alpha$-supervision via the loss function (14) did not lead to a significant improvement of the tracking performance. The reason is that the steady state is reached at a high rate. However, the choice $M' \approx M/2$ led to a significant improvement of the adaptation transients. The index that determines when the operation has to be changed is built from the tracking error, since that signal is a measure of how 'good' the system behaviour is. Note that if $\alpha_{max} = \alpha_{min}$ in the initialization of the above algorithm, then the multiestimation scheme reduces to the particular case of a (high-level) unsupervised multiestimation scheme with constant values.

From Assumption A.2, each equation (10), for all $i \in N_e$, has a unique solution for all $k \ge 0$ under the degree constraints (9). The control law is

$$R_k u_k = T u_{ck} - S_k y_k \qquad (15)$$

where $(R_k, S_k) = \left( R_k^{(c_k)}, S_k^{(c_k)} \right) \in \left\{ \left( R_k^{(i)}, S_k^{(i)} \right) ;\ 1 \le i \le n \right\}$, i.e. at each time, the pair $(R_k, S_k)$ is defined by $\left( R_k^{(i)}, S_k^{(i)} \right)$ for some $i = 1, 2, \ldots, n$, and then the control input is generated by the corresponding i-th adaptive controller parameterisation as $u_k = u_k^{(c_k)}$ with $c_k \in N_e$, all $k \ge 0$.
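As an illustration of how a control law of the form (15) yields the input sample, the sketch below solves $R(q^{-1}) u_k = T(q^{-1}) u_{ck} - S(q^{-1}) y_k$ for $u_k$ with $R$ monic. It assumes that all three polynomials are expressed in the delay operator, with the coefficient at index j multiplying $q^{-j}$; this convention, the function name and the argument layout are assumptions of the example rather than statements of the paper.

```python
def control_input(R, S, T, u_past, y_hist, uc_hist):
    """Return u_k from R(q^-1) u_k = T(q^-1) uc_k - S(q^-1) y_k, R monic.

    R, S, T:  coefficient lists, index j multiplying q^-j (R[0] must be 1).
    u_past:   past inputs,  u_past[j-1] = u_{k-j} for j >= 1.
    y_hist:   outputs,      y_hist[j]   = y_{k-j} for j >= 0.
    uc_hist:  references,   uc_hist[j]  = uc_{k-j} for j >= 0.
    """
    if R[0] != 1:
        raise ValueError("R must be monic")
    if (len(u_past) < len(R) - 1 or len(y_hist) < len(S)
            or len(uc_hist) < len(T)):
        raise ValueError("signal histories are too short")
    feedback_u = sum(r * u for r, u in zip(R[1:], u_past))   # (R - 1) u_k
    feedforward = sum(t * uc for t, uc in zip(T, uc_hist))   # T uc_k
    feedback_y = sum(s * y for s, y in zip(S, y_hist))       # S y_k
    return feedforward - feedback_y - feedback_u
```

In the supervised scheme, `R` and `S` would be the pair computed from the estimator indexed by $c_k$, so a switch changes only the coefficients passed to this function.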

III. PROPERTIES OF THE ESTIMATION AND CLOSED-LOOP STABILITY

A) Boundedness and convergence results of the parallel multiestimation scheme

In this work, all the recursive identification algorithms associated with the multiestimation scheme will be of standard least-squares type. The various algorithms differ in the initialization of the estimates vector. For the multiestimation scheme, the following result, proved in Appendix A, holds:

Theorem 1. The combined estimated parameter vector from the multiestimation scheme, $\hat{\theta}_k = \hat{\theta}_k^{(c_k)}$, leading to the output estimate $\hat{y}_k = \hat{y}_k^{(c_k)}$, has the following properties for each $i \in N_e$ irrespective of the control law:


1) lim

k →∞

2) lim

k →∞

3) lim

k →∞

4) V

(θ˜ ϕ ) P ϕ ) [1 + (1 − s )ϕ )s (θ˜ ϕ ) =0 s



(i ) T k

(i ) k

(i) k

(1 − s 1 + (1 − s )ϕ ( i) k

( i) k

(i ) T

(i ) T k

(i ) T k

T

and

V

(i ) k

( i) k

( i)

( i) ∞

→V

]

(i ) P (i) k ϕk

]

θ˜ (ki ) ≤ V

)

(i ) T

(i) k

( i) k

(i) T k

(i) k

= θ˜ (k i ) P k−1

(i ) k

(i) P (i) k ϕk

=0

2

(i) k

(2 − s )s (θ˜ ϕ ϕ ) [1 + (1 − s )ϕ

P (ik )

(i) T k

P (ik ) ϕ (i) k

( i) k



(i ) 2 k

k

2

( i) k

( i) k

(i ) T

( i) k

(i ) 0

2

=0

∈[ 0 , ∞ ) as

(

∑ 1 + (1 − s )ϕ ( i) k

k= 0

(i ) k

T

( i) 0

)

s (ki ) ˜θ (ik )T ϕ (ki )

l

k → ∞ . Also,

and θˆ

( i) 0

is bounded for all k ≥ 0 if P

are bounded

2

V 0 − V∞ =



k= 0

l

∆Vk ⇒

(

∑ 1 + (1 − s )ϕ

k =0

k

)

T k

Pk ϕk

s k θ˜ Tk ϕ k

2


V 0 ≥

θ˜ k

(P )

λ max

2

for all k ≥ 0 since the maximum eigenvalue of the decreasing

k

positive semidefinite covariance matrix sequence is positive and bounded. The convergence of the estimates and parametrical errors to finite limits are proved as follows. Note from Property 3 that: s k ˜θ Tk ϕ k ϕ Tk P k ϕ

(2 − s ) s 1 + ( 1 − s )ϕ

T k

s k θ˜ Tk ϕ

k

k

×

k

θ˜ Tk ϕ k

k

k

Pkϕ

→ 0 as k → ∞

k

so that s k ˜θ Tk ϕ k

0←

T k

ϕ Pkϕ



k

T k

1+ ϕ P k ϕ k

→ 0 as k → ∞ since 0 ≤ s k ≤ 1 for all k ≥ 0

and/or

(2 − s ) s 0← 1 + (1 − s )ϕ k

θ˜ Tk ϕ

k

k

T k

k

Pkϕ



k

s k θ˜ Tk ϕ k 1 + ϕ Tk P k ϕ

→ 0 as k → ∞ since 0 ≤ s k ≤ 1 for all k ≥ 0 k

Thus, the vector of estimates and associate parametrical errors converge asymptotically to finite limits for all i ∈N e from (8.a) and Properties 5 have been proved. To

prove

˜θ ( 0← ϕ

lim

k →∞

T k

T k

ϕk

Properties

)

2

Pk ϕk

(α (θ˜ k

T k

≥ ϕ

k

(θ˜

T k

6,

ϕk

)

2

from

Property

1

and

the

fact

that

2

1 + ϕ Tk P k ϕ k

) )=

note

lim

k →∞

→ 0 as k → ∞ that

(α ( e k

k

−ηk

) ) = 0. 2

As a result, either $\alpha_k \to 0$ or $e_k - \eta_k \to 0$ as $k \to \infty$. But if $e_k \to \eta_k$ as $k \to \infty$, then $\alpha_k \to 0$ as $k \to \infty$ from (8.c). Thus, $\alpha_k \to 0$ as $k \to \infty$. On the other hand, if the i-th regressor is bounded, then each i-th estimator verifies the above asymptotic properties by replacing $\alpha_k$ with $s_k$. Thus, Property 6 has been fully proved. ❏


APPENDIX B. Intermediate auxiliary results and proof of Theorem 2

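The following intermediate result will be used in the proof of Theorem 2:

Lemma B.1. There exists a sufficiently large finite integer $k_0 \ge 0$ such that

$$\max_{i \in N_e} \left( \left| e_k^{(i)} \right| \right) \le \mu\, \alpha \sup_{0 \le j \le k} \left( \sigma^{k-j} \|\varphi_j\| \right) + \left( \mu\, \alpha_0 + C \right) \quad \text{for all } k \ge k_0$$

for some real constant $C > 0$, with $\mu = \max_{i \in N_e} \left( \mu^{(i)} \right) > 1$, $\sigma = \max_{i \in N_e} \left( \sigma^{(i)} \right) \in (0, 1)$, $\alpha = \max_{i \in N_e} \left( \alpha^{(i)} \right) \ge 0$ and $\alpha_0 = \max_{i \in N_e} \left( \alpha_0^{(i)} \right) \ge 0$, where $\varphi_j$ denotes the highest dimensional regressor $\varphi_j^{(n_e)}$ in the multiestimation scheme.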
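As an illustration of how the right-hand side of Lemma B.1 behaves, the sketch below evaluates $\mu \alpha \sup_{0 \le j \le k} (\sigma^{k-j} \|\varphi_j\|) + \mu \alpha_0 + C$ along a sequence of regressor norms. The function names `weighted_sup` and `error_bound` and all numerical values are hypothetical and are not taken from the paper; the code only assumes the bound has the form stated in the lemma.

```python
from typing import List, Sequence


def weighted_sup(norms: Sequence[float], sigma: float) -> List[float]:
    """Return m_k = sup_{0<=j<=k} sigma**(k-j) * norms[j] for every k.

    Uses the recursion m_k = max(sigma * m_{k-1}, norms[k]), which is
    valid because the norms are nonnegative.
    """
    if not 0.0 < sigma < 1.0:
        raise ValueError("sigma must lie in (0, 1)")
    out: List[float] = []
    m = 0.0
    for n in norms:
        if n < 0.0:
            raise ValueError("norms must be nonnegative")
        m = max(sigma * m, n)
        out.append(m)
    return out


def error_bound(norms: Sequence[float], mu: float, alpha: float,
                alpha0: float, sigma: float, c: float) -> List[float]:
    """Evaluate mu*alpha*m_k + mu*alpha0 + c for every k (hypothetical helper)."""
    return [mu * alpha * m + mu * alpha0 + c
            for m in weighted_sup(norms, sigma)]


if __name__ == "__main__":
    # Illustrative values only: mu > 1, sigma in (0, 1), alpha, alpha0 >= 0, c > 0.
    print(error_bound([1.0, 0.0, 4.0, 0.0], mu=2.0, alpha=0.5,
                      alpha0=0.25, sigma=0.5, c=1.0))
```

The recursion makes the fading memory of the bound explicit: an isolated large regressor raises the bound immediately, and its contribution then decays geometrically at rate $\sigma$.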

Proof: From Theorem 1 (Property 5), all the estimates have finite limits so that

$$\frac{s_k^{(i)}\, \tilde\theta_k^{(i)T} \varphi_k^{(i)}}{1 + \varphi_k^{(i)T} P_k^{(i)} \varphi_k^{(i)}} \to 0 \quad \text{as } k \to \infty$$

for all $i \in N_e$ from (8.a). Thus, it follows that $s_k^{(i)} \to 0$ and/or $\tilde\theta_k^{(i)T} \varphi_k^{(i)} \to 0$ and/or $\|\varphi_k^{(i)}\|$ diverges as $k \to \infty$ for any given $i \in N_e$. Those cases are discussed separately, namely:

a. If $s_k^{(i)} \to 0$ as $k \to \infty$, then the modulus of the adaptation error is arbitrarily close to $\mu^{(i)} \eta_k^{(i)}$, so that for any sufficiently large finite $k_0 \ge 0$ there exists a positive real constant $C^{(i)}(k_0)$, dependent on $k_0$ with $C^{(i)}(k_0) \to 0$ as $k_0 \to \infty$, such that $|e_k^{(i)}| \le \mu^{(i)} \eta_k^{(i)} + C^{(i)}$ for all $k \ge k_0$ and any $i \in N_e$, and the result follows directly.

b. If $\tilde\theta_k^{(i)T} \varphi_k^{(i)} \to 0$, then there exists a positive real constant $C^{(i)}(k_0)$ such that $|e_k^{(i)}| \le \eta_k^{(i)} + C^{(i)} \le \mu^{(i)} \eta_k^{(i)} + C^{(i)}$ for all $k \ge k_0$ and any sufficiently large finite $k_0 \ge 0$. Thus, the absolute i-th identification error satisfies the same inequality as above for $k \ge k_0$ and the result follows.

c. Note from Theorem 1 (Property 2) that

$$0 \leftarrow \frac{\left(1 - s_k^{(i)}\right) s_k^{(i)}\, \tilde\theta_k^{(i)T} \varphi_k^{(i)} \varphi_k^{(i)T} \tilde\theta_k^{(i)}}{1 + \varphi_k^{(i)T} P_k^{(i)} \varphi_k^{(i)}} \le \frac{\left(1 - s_k^{(i)}\right) s_k^{(i)} \left(\tilde\theta_k^{(i)T} \varphi_k^{(i)}\right)^2}{1 + \left(1 - s_k^{(i)}\right) \varphi_k^{(i)T} P_k^{(i)} \varphi_k^{(i)}} \to 0 \quad \text{as } k \to \infty$$

Now, if $\|\varphi_k^{(i)}\|$ diverges as $k \to \infty$, then $\tilde\theta_k^{(i)} \to 0$ and $\tilde\theta_k^{(i)T} \varphi_k^{(i)}$ is uniformly bounded (since the parametrical errors are bounded from Theorem 1) and/or either $s_k^{(i)} \to 0$ or $s_k^{(i)} \to 1$ as $k \to \infty$ for any $i \in N_e$.

If $\tilde\theta_k^{(i)} \to 0$ with $\tilde\theta_k^{(i)T} \varphi_k^{(i)}$ uniformly bounded and/or if $s_k^{(i)} \to 0$ as $k \to \infty$, then from $e_k^{(i)} = \tilde\theta_k^{(i)T} \varphi_k^{(i)} + \eta_k^{(i)}$ and from (8.c), respectively, it follows that $|e_k^{(i)}| \le \mu^{(i)} \eta_k^{(i)} + C^{(i)}$ for any sufficiently large $k \ge k_0$ and some $C^{(i)}(k_0) > 0$.

If $s_k^{(i)} \to 1$ as $k \to \infty$, then $|e_k^{(i)}| \ge \mu^{(i)} \eta_k^{(i)} \ge \eta_k^{(i)}$ for sufficiently large finite $k$, so that

$$\alpha_k^{(i)} \left( |e_k^{(i)}| - \eta_k^{(i)} \right)^2 \ge \alpha_k^{(i)} \left( e_k^{(i)2} + \eta_k^{(i)2} - 2\, |e_k^{(i)}|\, \eta_k^{(i)} \right) \ge \alpha_k^{(i)} \left( \left( 1 - \frac{2}{\mu^{(i)}} \right) e_k^{(i)2} + \eta_k^{(i)2} \right)$$

for sufficiently large finite $k$. Since the first term converges asymptotically to zero from Property 6 of Theorem 1, it follows that $0 \leftarrow |e_k^{(i)}| \leftarrow \eta_k^{(i)}$ as $k \to \infty$. Thus, one still has $|e_k^{(i)}| \le \mu^{(i)} \eta_k^{(i)} + C^{(i)}$ for any sufficiently large $k \ge k_0$ and some $C^{(i)}(k_0) > 0$. The result has been proved.
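The chain of inequalities used for the case $s_k^{(i)} \to 1$ in the proof above can be checked term by term. The sketch below drops the estimator superscript $(i)$ and assumes $\eta_k \ge 0$ and $\mu > 1$, as in the dead-zone condition $|e_k| \ge \mu\, \eta_k$ quoted in that case.

```latex
% Assumptions: \eta_k \ge 0, \mu > 1 and |e_k| \ge \mu \eta_k, so that \eta_k \le |e_k| / \mu.
\begin{aligned}
\left( |e_k| - \eta_k \right)^2
  &= e_k^2 + \eta_k^2 - 2\, |e_k|\, \eta_k \\
  &\ge e_k^2 + \eta_k^2 - \frac{2}{\mu}\, e_k^2
     && \text{since } |e_k|\, \eta_k \le \frac{e_k^2}{\mu} \\
  &= \left( 1 - \frac{2}{\mu} \right) e_k^2 + \eta_k^2 .
\end{aligned}
```

Multiplying through by $\alpha_k \ge 0$ preserves the inequalities and gives the bound stated in the proof.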

Proof of Theorem 2. Since $G_k^{(i)}$ has bounded entries with its eigenvalues in $|z| < 1$ for all $k \ge 0$ and satisfies Assumption A.3, there exist real constants $K \ge 1$ (norm-dependent) and $\rho \in (\rho_0, 1)$ such that

$$\left\| \prod_{k=k_0+1}^{l} G_k^{(c_k)} \right\| \le K \rho^{\,l}$$

provided that the residence time is sufficiently large [Lemma 1 (ii)]. Now, from (16), Assumption A.1 (5) and Lemma B.1, one gets, since $v_k$ is uniformly bounded and $\|x_k\| \ge \|\varphi_k^{(i)}\|$ for all $i \in N_e$ and all $k \ge k_0 \ge 0$:

$$\begin{aligned}
\|x_{k+N}\| &\le K \rho^N \|x_k\| + \frac{1}{1-\rho} \left( \mu\, \alpha \sup_{0 \le l \le k+N-1} \left( \|x_l\| \right) + \mu\, \alpha_0 + C + K_1' \right) \\
&\le K \rho^N \|x_k\| + \frac{1}{1-\rho} \left( K_1 \sup_{0 \le l \le k+N-1} \left( \|x_l\| \right) + K_2 \right)
\end{aligned} \tag{B.1}$$

for some sufficiently large finite integer $k_0 \ge 0$ and some real constants $K_1' > 0$, $K_1 \ge 0$ (monotonically increasing with $\mu \alpha$) and $K_2 \ge 0$, since

$$\sum_{l=0}^{j} \left\| \prod_{k=k_0+1}^{l} G_k^{(c_k)} \right\| \le \frac{K}{1-\rho}$$

for all integers $0 \le j \le \infty$. Now, proceed by contradiction. Assume that there is a diverging subsequence $\{ \|x_k\| \}_{k \in N_0}$, with $N_0$ being a subset of infinite cardinal of the set of nonnegative integers, such that $k \in N_0$, $(k+N) \in N_0$, and

$$\|x_{k+N}\| \ge \|x_k\| \ge \frac{K K_1}{1-\rho} \max_{0 \le j \le k+N} \left( \|x_j\| \right).$$

If the state of (16)-(17) diverges, such a subsequence always exists for appropriate sufficiently large $k \in N_0$ ($\ge k_0$) and $(k+N) \in N_0$, for some $N$ satisfying

$$N > \frac{|\ln \varepsilon_0| + \ln K}{|\ln \rho|}$$

for any prescribed real constant $\varepsilon_0 \in (0, 1-\rho)$, since $\varepsilon_0$ and $\rho \in [\rho_0, 1)$ have negative natural logarithms for $1 > \rho_0 > 0$. Thus, from (B.1), provided that $1 > \dfrac{K K_1}{1-\rho} + \varepsilon_0$:

$$\|x_{k+N}\| \le \left( 1 - \left( K \rho^N + \frac{K K_1}{1-\rho} \right) \right)^{-1} K K_2$$
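The lower bound on the horizon $N$ in the proof above can be read as the requirement that the contraction factor satisfies $K \rho^N < \varepsilon_0$. That reading is an interpretation adopted here, not a statement from the paper. Under it, the smallest admissible $N$ can be computed as in the following sketch; the function name `min_horizon` and the numerical values are hypothetical.

```python
import math


def min_horizon(K: float, rho: float, eps0: float) -> int:
    """Smallest integer N with K * rho**N < eps0.

    Assumes K >= 1, 0 < rho < 1 and 0 < eps0 < 1 - rho, so that
    ln(eps0) < 0 and ln(rho) < 0, and the condition is equivalent to
    N > (|ln eps0| + ln K) / |ln rho|.
    """
    if K < 1.0:
        raise ValueError("K must be >= 1")
    if not 0.0 < rho < 1.0:
        raise ValueError("rho must lie in (0, 1)")
    if not 0.0 < eps0 < 1.0 - rho:
        raise ValueError("eps0 must lie in (0, 1 - rho)")
    bound = (abs(math.log(eps0)) + math.log(K)) / abs(math.log(rho))
    n = max(math.floor(bound), 0)
    # Correct for floating-point rounding of the logarithms in either direction.
    while n > 0 and K * rho ** (n - 1) < eps0:
        n -= 1
    while K * rho ** n >= eps0:
        n += 1
    return n


if __name__ == "__main__":
    # Illustrative values only.
    N = min_horizon(K=2.0, rho=0.5, eps0=0.25)
    print(N, 2.0 * 0.5 ** N)
```

A larger overshoot constant $K$ or a contraction rate $\rho$ closer to 1 both lengthen the horizon needed before the transition product becomes smaller than $\varepsilon_0$.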
