October 2013
Improving Transient Performance of Adaptive Control Architectures using Frequency-Limited System Error Dynamics
arXiv:1309.6693v1 [math.DS] 25 Sep 2013
by Tansel Yucelen Department of Mechanical and Aerospace Engineering Missouri University of Science and Technology 400 W. 13th St., Rolla, MO 65401 PHONE: (573) 341-7702 FAX: (573) 341-6899
[email protected] Gerardo De La Torre School of Aerospace Engineering Georgia Institute of Technology 270 Ferst Dr., Atlanta, GA 30332 PHONE: (404) 385-4940 FAX: (404) 894-2760
[email protected]
Eric N. Johnson School of Aerospace Engineering Georgia Institute of Technology 270 Ferst Dr., Atlanta, GA 30332 PHONE: (404) 385-2519 FAX: (404) 894-2760
[email protected]
Abstract We develop an adaptive control architecture to achieve stabilization and command following of uncertain dynamical systems with improved transient performance. Our framework consists of a new reference system and an adaptive controller. The proposed reference system captures a desired closed-loop dynamical system behavior modified by a mismatch term representing the high-frequency content between the uncertain dynamical system and this reference system, i.e., the system error. In particular, this mismatch term allows to limit the frequency content of the system error dynamics, which is used to drive the adaptive controller. It is shown that this key feature of our framework yields fast adaptation without incurring high-frequency oscillations in the transient performance. We further show the effects of design parameters on the system performance, analyze closeness of the uncertain dynamical system to the unmodified (ideal) reference system, discuss robustness of the proposed approach with respect to time-varying uncertainties and disturbances, and make connections to gradient minimization and classical control theory. Key Words: Uncertain dynamical systems; stabilization and command following; adaptive control; frequency-limited system error dynamics; transient and steady-state performance guarantees
1.
Introduction Although fixed-gain robust control design approaches can deal with uncertain dynamical
systems, they require the knowledge of uncertainty bounds. Characterization of these bounds is not trivial due to practical constraints, because it requires extensive and costly verification and validation procedures. Furthermore, in the face of high uncertainty levels, system faults, or structural damage, these approaches may fail to satisfy a given system stabilization or command following requirement. On the other hand, adaptive control design approaches can effectively deal with these sources of uncertainties and require less modeling information than do fixed-gain robust control approaches. These facts make adaptive control theory a candidate for many applications. The control framework of this work builds on a well-known and important class of adaptive controllers, specifically, model reference adaptive controllers. Whitaker et al. [1, 2] originally proposed the model reference adaptive control concept. In particular, model reference adaptive control schemes have three major components, namely, a reference system (model), an update law, and a controller (Figure 1.1). The reference system, in the classical sense, captures a desired closed-loop dynamical system behavior for which its output (resp., state) is compared with the output (resp., state) of the uncertain dynamical system. This comparison results in a system error signal used to drive the update law online. Then, the controller adapts feedback gains to minimize this error signal using Command
Reference System Uncertain Dynamical System
Controller
Update Law
–
System Error
Figure 1.1: Block diagram of the model reference adaptive control scheme.
1
the information received from the update law. From a practical standpoint, the output (resp., state) of the uncertain dynamical system can be far different from the output (resp., state) of the reference system during transient time (learning phase), although a model reference adaptive control scheme can guarantee that the distance between the uncertain dynamical system and the reference system vanishes asymptotically. This problem, so-called poor transient performance phenomenon, can be solved by increasing the learning rate of the update law, and hence, fast adaptation can be achieved in order to suppress uncertainties rapidly during transient time. However, update laws with high learning rates may yield to signals with high-frequency content, which can violate actuator rate saturation constraints and/or excite unmodeled system dynamics [3, 4] resulting in system instability for practical applications. Hence, a critical trade-off between system stability and control adaptation rate exists in most adaptive control approaches, with some notable exceptions [5–8]. Specifically, the authors in [5] use a low-pass filter that subverts high-frequency oscillations attributable to fast adaptation, and their approach has guaranteed transient and steady-state performance. The authors in [6] present an adaptive control approach based on high learning rates that allows fast adaptation without hindering system robustness. Even though the methodologies documented in [5, 6] are promising, they require the knowledge of a conservative upper bound on the unknown constant gain appearing in their uncertainty parameterization. While this conservative upper bound can be available for some applications, the actual upper bound may change and exceed its conservative estimate, for example, when an aircraft in a flight control application undergoes a sudden change in dynamics. In such circumstances, the performance of these adaptive controllers may be poor, because tuning them online with a new conservative upper bound is not possible. The author in [7] presents a modification to the reference system in order to solve the poor transient performance phenomenon, where a detailed analysis of this approach is given in [8]. Specifically, this modification is constructed by using a modification gain multiplied by the system error that is between the uncertain dynamical system and
2
–
Command
Filter
Reference System
Controller
Uncertain Dynamical System
–
System Error
Update Law
Figure 1.2: Block diagram of the proposed scheme. Note that the reference system is driven not only by the command but also by the difference between the system error and its (low-pass) filtered form representing the high-frequency content of the system error. the modified reference system. In the limit as this modification gain goes to infinity, it is shown that the system error goes to zero in transient time. This approach can be used to effectively suppress uncertainties, however, for example, in the presence of exogenous lowfrequency persistent disturbances, the transient performance of this approach may not be sufficient. Because, this disturbance may not be visible to the update law, since the system error is (sufficiently) small due to a (sufficiently) large modification gain. This work develops an adaptive control architecture to achieve stabilization and command following of uncertain dynamical systems with improved transient performance. The contribution of our framework comes from using a new reference system with an adaptive controller. The proposed reference system captures a desired closed-loop dynamical system behavior modified by a mismatch term representing the high-frequency content between the uncertain dynamical system and this reference system, i.e., the system error (Figure 1.2). In particular, this mismatch term allows to limit the frequency content of the system error dynamics, which is used to drive the adaptive controller. That is, the update law learns through the low-frequency content of the system error, which constitutes a distinction over the approach in [7]. It is shown that this key feature of our framework yields fast adapta-
3
tion without incurring high-frequency oscillations in the transient performance. We further show the effects of design parameters on the system performance, analyze closeness of the uncertain dynamical system to the unmodified (ideal) reference system, discuss robustness of the proposed approach with respect to time-varying uncertainties and disturbances, and make connections to gradient minimization and classical control theory.
2.
Notation The notation used in this paper is fairly standard. Specifically, R denotes the set of real
numbers, Rn denotes the set of n × 1 real column vectors, Rn×m denotes the set of n × m real matrices, R+ (resp., R+ ) denotes the set of positive (resp., nonnegative-definite) real n×n
numbers, Rn×n (resp., R+ ) denotes the set of n × n positive-definite (resp., nonnegative+ definite) real matrices, Sn×n denotes the set of n × n symmetric real matrices, Dn×n denotes the n × n real matrices with diagonal scalar entries, (·)T denotes transpose, (·)−1 denotes inverse, and “,” denotes equality by definition. In addition, we write λmin (A) (resp., λmax (A)) for the minimum (resp., maximum) eigenvalue of the Hermitian matrix A, tr(·) for the trace operator, vec(·) for the column stacking operator, k · k2 for the Euclidian norm, k · k∞ for the infinity norm, and k · kF for the Frobenius matrix norm. Furthermore, for a signal x(t) = [x1 (t), x2 (t), . . . , xn (t)]T ∈ Rn defined for all t ≥ 0, the truncated L∞ norm and the L∞ norm [9, Section 5] are defined as kxτ (t)kL∞ , max1≤i≤n (sup0≤t≤τ |xi (t)|) and kx(t)kL∞ , max1≤i≤n (supt≥0 |xi (t)|), respectively.
3.
Preliminaries Consider the nonlinear uncertain dynamical system given by x˙ p (t) = Ap xp (t) + Bp Λu(t) + Bp δp xp (t) ,
4
xp (0) = xp0 ,
(1)
where xp (t) ∈ Rnp is the accessible state vector, u(t) ∈ Rm is the control input, δp : Rnp → Rm is an uncertainty, Ap ∈ Rnp ×np is a known system matrix, Bp ∈ Rnp ×m is a known control input matrix, and Λ ∈ Rm×m ∩ Dm×m is an unknown control effectiveness matrix. Further+ more, we assume that the pair (Ap , Bp ) is controllable and the uncertainty is parameterized as δp xp = WpT σp xp ,
xp ∈ Rnp ,
(2)
where Wp ∈ Rs×m is an unknown weight matrix and σp : Rnp → Rs is a known basis function1 T of the form σp xp = σp1 xp , σp2 xp , . . . , σps xp . To address command following, let c(t) ∈ Rnc be a given bounded piecewise continuous command and xc (t) ∈ Rnc be the integrator state satisfying x˙ c (t) = Ep xp (t) − c(t),
xc (0) = xc0 ,
(3)
where Ep ∈ Rnc ×np allows to choose a subset of xp (t) to be followed by c(t)2 . Now, (1) can be augmented with (3) as x(t) ˙ = Ax(t) + BΛu(t) + BWpT σp xp (t) +Br c(t),
x(0) = x0 ,
(4)
T T n where x(t) , xT p (t), xc (t) ∈ R , n = np + nc , is the (augmented) state vector, x0 , T T T xp0 , xc0 ∈ Rn , Ap 0np ×nc A , ∈ Rn×n , (5) Ep 0nc ×nc T B , BpT , 0T ∈ Rn×m , (6) nc ×m T Br , 0T ∈ Rn×nc . (7) np ×nc , −Inc ×nc Next, consider the feedback control law given by u(t) = un (t) + ua (t),
(8)
1 For the case where the basis σp xp is unknown, parameterization in (2) can be relaxed, for function example, by considering δp xp = WpT σpnn VpT xp +εnn x , x ∈ Dxp , where Wp ∈ Rs×m and Vp ∈ Rnp ×s p p p are unknown weight matrices, σpnn : Dxp → Rs is a known basis composed of neural networks function m approximators, εnn is an unknown residual error, and Dxp is a compact subset of Rnp [10]. p : Dxp → R 2 For stabilization, the integrator state given by (3) may not be required (i.e., nc = 0) since c(t) ≡ 0 for this case.
5
where un (t) ∈ Rm and ua (t) ∈ Rm are the nominal and adaptive control laws, respectively. Furthermore, let the nominal control law be un (t) = −Kx(t),
K ∈ Rm×n ,
(9)
such that Ar , A − BK is Hurwitz. Using (8) and (9) in (4) yields x(t) ˙ = Ar x(t) + Br c(t) + BΛ ua (t) + W T σ x(t) ,
(10)
where W T , Λ−1 WpT , (Λ−1 − Im×m )K ∈ R(s+n)×m is an unknown (aggregated) weight matrix and σ T x(t) , σpT xp (t) , xT (t) ∈ Rs+n is a known (aggregated) basis function. Considering (10), let the adaptive control law be ˆ T (t)σ x(t) , ua (t) = −W
(11)
ˆ (t) ∈ R(s+n)×m be the estimate of W satisfying the update law where W ˆ˙ (t) = γσ x(t) eT (t)P B, W
ˆ (0) = W ˆ 0, W
(12)
where γ ∈ R+ is the learning rate, e(t) , x(t) − xr (t) is the system error with xr (t) ∈ Rn being the reference state vector satisfying the reference system x˙ r (t) = Ar xr (t) + Br c(t),
xr (0) = xr0 ,
(13)
∩ Sn×n is a solution of the Lyapunov equation3 and P ∈ Rn×n + 0 = AT r P + P Ar + R,
(14)
∩ Sn×n . with R ∈ Rn×n + Now, the system error dynamics is given by using (10), (11), and (13) as ˜ T (t)σ x(t) , e(t) ˙ = Ar e(t) − BΛW 3
e(0) = e0 ,
(15)
Since Ar is Hurwitz, it follows from [11, Section 3.7] that there exists a unique P satisfying (14) for a given R.
6
˜ (t) , W ˆ (t) − W ∈ R(s+n)×m is the weight error and e0 , x0 − xr0 . The update where W law given by (12) can be derived by using Lyapunov analysis by considering the Lyapunov function candidate ˜ V e, W
˜ Λ 21 = eT P e + γ −1 tr W
T
˜ Λ 12 . W
(16)
˜ > 0 for all (e, W ˜ ) 6= (0, 0), and V e, W ˜ is radially unbounded. Note that V 0, 0 = 0, V e, W Now, differentiating (16) and then using (12) and (15) yields ˜ (t) = −eT (t)Re(t) ≤ 0, V˙ e(t), W
(17)
˜ (t) are Lyapunov stable, which guarantees that the system error e(t) and the weight error W and hence, are bounded for all t ∈ R+ . Since σ x(t) is bounded for all t ∈ R+ , it follows ˜ (t) is bounded for all t ∈ R+ . Now, it from (15) that e(t) ˙ is bounded, and hence, V¨ e(t), W follows from Barbalat’s lemma [9, Lemma 8.2] that ˜ (t) = 0, lim V˙ e(t), W
t→∞
(18)
which consequently shows that limt→∞ e(t) = 0. Remark 3.1. Although limt→∞ e(t) = 0, the state vector x(t) can be far different from xr (t) during transient time (learning phase), unless a high learning rate γ is used in the update law (12). To see this, we first let xr0 = x0 in (13), and hence, e(0) = 0 in (15). Note that this condition is realizable since it is assumed that the state vector xp (t) in (1) is ˜ (t) ≤ 0, and hence, accessible [12, Section II.C]. Since V˙ e(t), W ˜ (t) ≤ V e0 , W ˜ 0 = γ −1 kW ˜ 0 Λ 21 k2F , V e(t), W
(19)
p ˜ (t) ≥ λmin (P )ke(t)k22 in (19) yields ke(t)k2 ≤ kW ˜ 0 Λ 12 kF / γλmin (P ). then using V e(t), W p ˜ 0 Λ 21 kF / γλmin (P ). Since k·k∞ ≤ k·k2 holds, and this bound is uniform, then keτ (t)kL∞ ≤ kW Finally, since this holds uniformly in τ , we have p ˜ 0 Λ 21 kF / γλmin (P ), ke(t)kL∞ ≤ kW 7
(20)
which shows that the distance between x(t) and xr (t) can be made arbitrarily small in transient time by resorting to a high learning rate γ. As discussed, however, update laws with high learning rates in the face of large system uncertainties and abrupt changes may yield to signals with high-frequency oscillations, which can violate actuator rate saturation constraints and/or excite unmodeled system dynamics [3, 4] resulting in system instability for practical applications.
4.
Frequency-Limited System Error Dynamics One of the fundamental components of a model reference adaptive control scheme is
the system error e(t). In particular, if the system error e(t) contains any high-frequency oscillations, then the adaptive control law (11) can also have such oscillations, since the update law (12) is driven by this system error e(t). Motivating from this standpoint, our aim is to limit the frequency-content of the system error dynamics (15) during transient-time (learning phase), and hence, to filter out any possible high-frequency oscillations contained in the error signal e(t). For this purpose, let eL (t) ∈ Rn be a low-pass filtered system error of e(t) given by e˙ L (t) = Ar eL (t) + η e(t) − eL (t) ,
eL (0) = 0,
(21)
where η ∈ R+ is a filter gain. Note that since eL (t) is a low-pass filtered system error of e(t), the filter gain η is chosen such that η ≤ η ∗ , where η ∗ ∈ R+ is a design parameter. Next, we add a mismatch term to the system error dynamics (15) in order to enforce a distance condition between the trajectories of the system error e(t) and the trajectories of its low-pass filtered version eL (t). This leads to a minimization problem involving an error criterion capturing the distance between e(t) and eL (t). In particular, consider the cost function given by 1 J e, eL = ke − eL k22 , 2 8
(22)
and note that the negative gradient of (22) with respect to e is given by ∂ −J e(t), eL (t) = − e(t) − eL (t) , ∂e(t)
(23)
which gives the structure of the proposed mismatch term. Using the idea presented in [2,13– 16], we now need to add (23) to the system error dynamics given by (15). For this purpose, we modify the reference system (13) as x˙ r (t) = Ar xr (t) + Br c(t) + κ e(t) − eL (t) ,
xr (0) = xr0 ,
(24)
where κ ∈ R+ , and hence, the system error dynamics is given by using (10), (11), and (24) as ˜T e(t)=A ˙ r e(t)−BΛW (t)σ x(t) −κ e(t) − eL (t) ,
e(0) = e0 .
(25)
Finally, note for the rest of this paper that the update law (12) is driven by the system error e(t) = x(t) − xr (t), where xr (t) is obtained from (24) (not (13)). Remark 4.1. The reference system (24) captures a desired closed-loop dynamical system behavior modified by a mismatch term κ e(t)−eL (t) representing the high-frequency content between the uncertain dynamical system and this reference system. Although this implies a modification of the ideal (unmodified) reference system (13) during transient time, as we see in the following sections, this mismatch term allows to limit the frequency content of the system error dynamics (25), which is used to drive the adaptive controller. In other words, the purpose of our methodology is to prevent the update law from attempting to learn through the high-frequency content of the system error. Remark 4.2. As it is noted, the filter gain η needs to be chosen such that η ≤ η ∗ , where η ∗ needs to be small enough to cut off the high-frequency content of e(t). To see the negative effect of high filter gain, let η be sufficiently large. Then, e(t) − eL (t) ≈ 0 as a consequence of (21), and hence, we approximately recover the ideal (unmodified) reference system given by (13). In this case, the proposed approach converges to a standard model reference adaptive control scheme, which has practical limitations as discussed earlier in the presence of high 9
learning rate γ. Furthermore, as a special case of η = 0, the proposed approach converges to the approach documented in [7], since eL (t) ≡ 0 for all t ∈ R+ as a consequence of (21). Once again, as discussed earlier, this selection for the filter gain η may result in poor transient performance in the presence of exogenous low-frequency persistent disturbances. Therefore, from a practical point of view, this imposes another constraint in the selection of filter gain such that it also needs to satisfy η∗ ≤ η, where η∗ ∈ R+ needs to be large enough in order to suppress the effects of exogenous low-frequency persistent disturbances. This phenomenon is illustrated in the next remark for a special case as well as later in the paper for a more general case. Remark 4.3. To further elucidate the mechanism behind the proposed approach, let np = 1, nc = 0, m = 1, Ap = −α, α ∈ R+ , Bp = α, K = 0, and δp xp = d with d denoting an exogenous low-frequency disturbance. Furthermore, set R = 2 in (14) such that P = α−1 and let all initial conditions be zero. For this special case, the system loop transfer function G(s) (broken at the control input) can be equivalently written as a linear time-invariant dynamical system, and hence, we can resort to classical control theory tools, such as Bode plots, to analyze the closed-loop system with respect to different choices of γ, κ, and η. Specifically, the system loop transfer function is given by s+α+η α γ , G(s) = s s+α+κ+η s+α | {z } | {z } C(s)
(26)
P(s)
where C(s) and P(s) denote the controller and the plant, respectively. Furthermore, for the standard model reference adaptive controller (η is sufficiently large, or simply, κ = 0), note that C(s) =
γ . s
For the controller C(s) of proposed approach, since
by a lead compensator
s+α+η , s+α+κ+η
γ s
is multiplied
it can improve stability margins of the closed-loop sys-
tem positively. To further see the effects of γ, κ, and η, consider the Bode plot of G(s) in Figure 4.1. Here, we tune these design parameters in order to obtain a large loop gain at low frequencies (from 0 rad/s to 5/2π rad/s) for good rejection of low-frequency disturbances and a small loop gain at high frequencies to avoid injecting too much measure10
Bode Diagram
40
20
Magnitude (dB)
0
−20
γ=100, κ=5, η=0 −40
γ=100, κ=50, η=0 γ=100, κ=50, η=1 −60
γ=100, κ=50, η=10 γ=1000, κ=50, η=0
Phase (deg)
−80 −90
−135
−180 −1 10
0
1
10
10
2
10
Frequency (rad/s)
Figure 4.1: Bode plots of the loop gain transfer function for different γ, κ, and η. ment noise into the plant [17, Section 11.4]. Furthermore, we also would like to have at least a time-delay margin of 0.25 seconds. From Figure 4.1, one can see that the cases (γ, κ, η) = (100, 5, 0), (γ, κ, η) = (100, 50, 10), and (γ, κ, η) = (1000, 50, 0) achieves approximately the same rejection of low-frequency disturbances. However, it should be noted that the case (γ, κ, η) = (1000, 50, 0) amplifies measurement noise excessively in comparison to the cases (γ, κ, η) = (100, 5, 0) and (γ, κ, η) = (100, 50, 10). In addition, it should be also noted that the case (γ, κ, η) = (100, 5, 0) has the poorest time-delay margin of 0.1 sec-
11
onds, whereas the cases (γ, κ, η) = (100, 50, 10) and (γ, κ, η) = (1000, 50, 0) have time-delay margins of 0.25 and 0.14 seconds, respectively. Therefore, one can conclude that the case (γ, κ, η) = (100, 50, 10) achieves good rejection of low-frequency disturbances like the other two cases, has the maximum time-delay margin, and does not inject measurement noise as compared to the case (γ, κ, η) = (1000, 50, 0). This shows the significance of having additional design parameters η and κ in the control design process. Moreover, the effect of increasing κ alone can be depicted from the cases (γ, κ, η) = (100, 5, 0) and (γ, κ, η) = (100, 50, 0). Specifically, it deteriorates the rejection properties of low-frequency disturbances. That is the reason why we increased the adaptation gain γ in the case (γ, κ, η) = (1000, 50, 0) for achieving the same level of low-frequency disturbance rejection characteristics, however, as noted, this amplifies the measurement noise excessively and has less time-delay margin as compared to the case (γ, κ, η) = (100, 50, 10). Finally, the effect of increasing η to a moderate value can be seen from the cases (γ, κ, η) = (100, 50, 0), (γ, κ, η) = (100, 50, 1), and (γ, κ, η) = (100, 50, 10). That is, we can recover the desired low-frequency disturbance rejection characteristics without increasing γ, and hence, without amplifying the measurement noise.
5.
Transient and Steady-State Performance Guarantees This section establishes transient and steady-state performance properties of the proposed
adaptive control architecture. For this purpose, consider e(t) = x(t) − xr (t) with xr (t) ˜ (t) = W ˆ (t) − W . Furthermore, let the ideal (unmodified) reference satisfying (24) and W system4 be x˙ ri (t) = Ar xri (t) + Br c(t),
xri (0) = xr0 ,
(27)
where xri (t) ∈ Rn being the ideal reference state vector. Finally, let x˜(t) , xr (t) − xri (t) be the deviation error from the ideal reference system with xr (t), once again, satisfying (24). 4
To prevent any abuse of notation, we redefine the ideal (unmodified) reference system in (13) as (27).
12
Then, the system error, weight update error, low-pass filtered system error, and the deviation error dynamics are, respectively, given by (25), (21), ˜˙ (t) = γσ x(t) eT (t)P B, W
˜ (0) = W ˜ 0, W x˜˙ (t) = Ar x˜(t) + κ e(t) − eL (t) , x˜(0) = 0,
(28) (29)
˜0 , W ˆ 0 − W . The next theorem presents the first result of this paper. where W Theorem 5.1. Consider the nonlinear uncertain dynamical system given by (1) subject to (2), the (modified) reference system given by (24), and the feedback control law given by ˜ (t), eL (t), x˜(t) given by (25), (8) along with (9), (11), and (12). Then, the solution e(t), W ˜ 0 , 0, 0 ∈ Rn × R(s+n)×m × Rn × Rn and (21), (28), and (29) is Lyapunov stable for all e0 , W t ∈ R+ , and limt→∞ x(t) − xri (t) = 0. For t ∈ R+ , in addition, s ! r V κλmax (P ) kx(t) − xri (t)kL∞ ≤ 1+ , (30) λmin (P ) 2ξλmin (R) 1
˜ 0 Λ 2 k2 + λmax (P )ke0 k22 . where ξ ∈ (0, 1) and V , γ −1 kW F ˜ (t), eL (t), x˜(t) given by (25), Proof. To show Lyapunov stability of the solution e(t), W ˜ 0 , 0, 0 ∈ Rn × R(s+n)×m × Rn × Rn and t ∈ R+ , consider the (21), (28), and (29) for all e0 , W Lyapunov function candidate ˜ , eL , x˜ = V e, W ˜ +η −1 κeT P eL + 2ξκ−1 λ−1 (P )λmin (R)˜ V ∗ e, W xT P x˜, L max
(31)
˜ is given by (16), and note that V ∗ (0, 0, 0, 0) = 0, V ∗ e, W ˜ , eL , x˜ > 0 for all where V e, W ˜ , eL , x˜ 6= (0, 0, 0, 0), and V ∗ e, W ˜ , eL , x˜ is radially unbounded. Differentiating (31) e, W along the trajectories of (25), (21), (28), and (29) yields −1 −1 xT (t)R˜ x(t) V˙ ∗ (·)=−eT (t)Re(t) − η −1 κeT L (t)ReL (t) − 2ξκ λmax (P )λmin (R)˜ −1 −2κeT (t)P eH (t) + 2κeT xT (t)P eH (t) L (t)P eH (t) + 4ξλmax (P )λmin (R)˜ −1 −1 =−eT (t)Re(t) − η −1 κeT xT (t)R˜ x(t) L (t)ReL (t) − 2ξκ λmax (P )λmin (R)˜ h i 1 1 −1 −2κeT xT (t)P 2 P 2 eH (t) . H (t)P eH (t) + 2ξλmax (P )λmin (R) 2˜
13
(32)
where eH (t) , e(t) − eL (t)5 . Now, consider 1
1
2˜ xT P 2 P 2 eH ≤ µ˜ xT P x˜ +
1 T e P eH , µ H
µ ∈ R+ ,
(33)
which follows from Young’s inequality [18, Fact 1.4.7]. Using (33) in the last term of (32) and then etting µ = ξκ−1 λ−1 max (P )λmin (R) in (33) yields V˙ ∗ (·)≤−λmin (R)ke(t)k22 − η −1 κλmin (R)keL (t)k22 −2ξκ−1 λ2min (R)λ−1 x(t)k22 , max (P ) 1 − ξ k˜
(34)
and hence, since 1 − ξ > 0 in (34) by the definition of ξ, V˙ ∗ (·) ≤ 0. Therefore, the solu ˜ (t), eL (t), x˜(t) given by (25), (21), (28), and (29) is Lyapunov stable for all tion e(t), W ˜ 0 , 0, 0 ∈ Rn × R(s+n)×m × Rn × Rn and t ∈ R+ . e0 , W To show limt→∞ x(t)−xri (t) = 0, note that σ x(t) is bounded for all t ∈ R+ , and hence, ˜ (t), eL (t), e(t) ˙ is bounded. Furthermore, since e˙ L (t) and x˜˙ (t) are also bounded, then V¨ ∗ e(t), W x˜(t) is bounded for all t ∈ R+ . Now, it follows from Barbalat’s lemma [9, Lemma 8.2] that ˜ (t), eL (t), x˜(t) = 0, limt→∞ V˙ ∗ e(t), W
(35)
which consequently shows that lim e(t) = 0,
(36)
lim eL (t) = 0,
(37)
lim x˜(t) = 0.
(38)
x(t) − xri (t) = x(t) − xr (t) + xr (t) − xri (t) = e(t) + x˜(t),
(39)
lim x(t) − xri (t) = 0.
(40)
t→∞ t→∞
t→∞
Therefore, it follows from
that
t→∞ 5
Since eL (t) is a low-pass filtered system error of e(t), eH (t) represents the high-frequency content of the system error.
14
˜ (t), eL (t), x˜(t) ≤ 0 for all t ∈ R+ , this implies that Finally, since V˙ ∗ e(t), W ˜ 0 , 0, 0 . ˜ (t), eL (t), x˜(t) ≤ V ∗ e0 , W V ∗ e(t), W
(41)
˜ 0 , 0, 0 ≤ V and Using V ∗ e0 , W ˜ (t), eL (t), x˜(t) ≥ λmin (P )ke(t)k2 , V ∗ e(t), W 2
(42)
p in (41) yields ke(t)k2 ≤ λ−1 min (P )V . Since k · k∞ ≤ k · k2 , and this bound is uniform, then p −1 keτ (t)kL∞ ≤ λmin (P )V , and hence, ke(t)kL∞ ≤
q λ−1 min (P )V ,
(43)
is a direct consequence due to the fact that the former expression holds uniformly in τ . ˜ 0 , 0, 0 ≤ V and Similarly, using V ∗ e0 , W ˜ (t), eL (t), x˜(t) ≥ 2ξκ−1 λ−1 V ∗ e(t), W x(t)k22 , max (P )λmin (P )λmin (R)k˜
(44)
yields r k˜ x(t)kL∞ ≤
1 −1 −1 ξ λmin (R)λ−1 min (P )κλmax (P )V . 2
(45)
Now, it follows from (39) that kx(t) − xri (t)kL∞ ≤ ke(t)kL∞ + k˜ x(t)kL∞ ,
(46)
and hence, (30) is a direct consequence of using (43) and (45) in (46). This completes the proof. Remark 5.1. Even though the proposed architecture is predicated on a modified refer ence system given by (24), Theorem 5.1 shows that limt→∞ x(t) − xri (t) = 0, that is the (augmented) state vector x(t) of (4) asymptotically converges to the ideal reference state vector xri (t) of (27). Furthermore, during transient time (learning phase), the worst-case transient performance bound between x(t) and xri (t) is given by (30). To further elucidate this performance bound, we let xr0 = x0 in (24), and hence, e(0) = 0 in (25). As 15
noted in Remark 3.1, since it is assumed that the state vector xp (t) in (1) is accessible, p ˜ 0 Λ 21 kF / λmin (P ) and this condition is realizable [12, Section II.C]. Now, denoting V1 , kW q V2 , 12 ξ −1 λ−1 min (R)λmin (P ), it follows from (30) that 1 1 kx(t) − xri (t)kL∞ ≤ γ − 2 V1 1 + κ 2 V2 ,
(47)
for all t ∈ R+ . The performance bound in (47) implies that the distance between x(t) and xri (t) can be made arbitrarily small in transient time by resorting to a high learning rate γ, similar to Remark 3.1 for the standard model reference adaptive control scheme. However, as we see in the next section, by increasing κ, we make the distance between e(t) and eL (t) sufficiently small in transient time, and hence, a high learning rate γ subject to a high κ does not yield to signals with high-frequency oscillations. Finally, it should be also noted from (47) that keeping γ constant but increasing κ may result in a larger distance between x(t) and xri (t), and therefore, both should be increased simultaneously in order to keep this distance consistent during transient time.
6.
Suppressing High-Frequency System Error Dynamics in Transient Time We now show that the high-frequency content of the system error eH (t) = e(t) − eL (t)
can be effectively suppressed as one increases κ design parameter of the modified reference system (24). The next theorem presents the second result of this paper. Theorem 6.1. Consider the system error dynamics given by (25) and the low-pass filtered system error dynamics given by (21). Then, eH (t, κ) = e−κt eH0 + O(κ−1 ), holds for a sufficiently high κ, where eH0 , e(0) − eL (0) = e0 6 . 6
Recall that eL (0) = 0 in (21).
16
(48)
Proof. Let ε , κ−1 . Then, (25) and (21) can be equivalently written as ˜T εe(t)=εA ˙ r e(t) − εBΛW (t)σ x(t) − e(t) − eL (t) , e˙ L (t)=Ar eL (t) + η e(t) − eL (t) .
(49) (50)
Since setting ε = 0 results in 0 = e(t) − eL (t), e˙ L (t) = Ar eL (t),
(51) (52)
then the system given by (49) and (50) is said to be the singularly perturbed model form, where e(t) = eL (t) captures the isolated root. To shift the quasi steady-state of e(t) to the origin, consider eH (t) = e(t) − eL (t),
(53)
as a change of variables, which yields deH (τ ) = −eH (τ ), dτ
eH (0) = eH0 ,
(54)
where τ is related to the original t through τ = t/ε. Now, as a direct consequence of [9, Theorem 11.2] e(t, ε) = eAr t eL (0) + e−t/ε eH0 + O(ε) = e−t/ε eH0 + O(ε), eL (t, ε) = eAr t eL (0) + O(ε) = O(ε),
(55) (56)
which leads to the result given by (48). This completes the proof. Remark 6.1. Note that (52) and (54) are referred as the reduced-order system and the boundary-layer system, respectively, where the former describes the asymptotic behavior and the latter describes the transient behavior. In particular, Theorem 6.1 shows that the transient high-frequency content of the system error eH (t) is globally exponentially stable for a sufficiently high κ, and hence, it vanishes in a fast manner. If, in addition, xr0 = x0 in (24), then it follows from (48) that eH (t, κ) = O(κ−1 ).
17
7.
Extensions to Time-Varying Uncertainties This section discusses robustness properties of the proposed adaptive control architec-
ture with respect to time-varying uncertainties. For this purpose, we relax the uncertainty parametrization given by (2) to δp t, xp = WpT (t)σp xp ,
xp ∈ Rnp ,
(57)
where Wp (t) ∈ Rs×m is an unknown time-varying weight matrix subject to kWp (t)kF ≤ ˙ p (t)kF ≤ w˙ p.max , w˙ p,max ∈ R+ 7 . Note that, as a consequence wp,max , wp,max ∈ R+ , and kW of (57), the unknown (aggregated) weight matrix has the form W (t)T , Λ−1 WpT (t), (Λ−1 − Im×m )K ∈ R(s+n)×m for this case. We introduce the following definition for developing the main results of this section. Definition 7.1. Let φ : Rn → R be a continuously differentiable convex function given by φ(θ) ,
2 (εθ + 1)θT θ − θmax , 2 εθ θmax
(58)
where θmax ∈ R is a projection norm bound imposed on θ ∈ Rn and εθ > 0 is a projection tolerance bound. Then, the projection operator Proj : Rn × Rn → Rn is defined by y, if φ(θ) < 0, y, if φ(θ) ≥ 0 and φ0 (θ)y ≤ 0, Proj(θ, y) , φ0T (θ)φ0 (θ)y y − φ(θ), φ0 (θ)φ0T (θ) if φ(θ) ≥ 0 and φ0 (θ)y > 0,
(59)
where y ∈ Rn . Remark 7.1. It follows from Definition 7.1 that (θ − θ∗ )T (Proj(θ, y) − y) ≤ 0,
θ∗ ∈ Rn ,
(60)
holds [19]. The definition of the projection operator can be generalized to matrices as Projm (Θ, Y ) = Proj(col1 (Θ), col1 (Y )), . . . , Proj(colm (Θ), colm (Y )) , 7
(61)
If we let the first entry of the basis function σp xp to be the bias term, then this parameterization also captures the effect of exogenous time-varying disturbances.
18
where Θ ∈ Rn×m , Y ∈ Rn×m , and coli (·) denotes i-th column operator. In this case, for a given Θ∗ ∈ Rn×m , it follows from (60) that m X tr (Θ−Θ∗ )T (Projm (Θ, Y )−Y ) = coli (Θ−Θ∗ )T Proj(coli (Θ), coli (Y ))−coli (Y ) ≤ 0, i=1
(62) holds. Throughout this section, we assume that the projection norm bound imposed on each column of Θ ∈ Rn×m is θmax . ˆ (t) ∈ R(s+n)×m is the estimate Next, let the adaptive control law be given by (11), where W of W (t) satisfying the update law h i ˆ (t), σ x(t) eT (t)P B , ˆ˙ (t) = γProjm W W
ˆ (0) = W ˆ 0, W
(63)
with γ ∈ R+ being the learning rate, e(t) , x(t) − xr (t) being the system error such that xr (t) ∈ Rn satisfies (24), and P ∈ Rn×n ∩ Sn×n being a solution of (14). + We note here that the system error, weight update error, low-pass filtered system error, and the deviation error dynamics are, respectively, given by (25), (21), h i ˆ (t), σ x(t) eT (t)P B −W ˙ (t), ˜˙ (t) = γProjm W W
˜ (0) = W ˜ 0, W
(64)
˜0 , W ˆ 0 − W . Furthermore, since Wp (t) and W ˙ p (t) are bounded, there and (29), where W ˙ (t)kF ≤ w˙ max for all exists norm bounds wmax and w˙ max such that kW (t)kF ≤ wmax and kW ˆ (t) is predicated on the projection operator t ∈ R+ . Similarly, since the update law for W ˜ (t)kF ≤ w˜max for all and W (t) is bounded, there also exists a norm bound w˜max such that kW t ∈ R+ . Finally, as it is standard for the projection operator-based update laws, we assume ˆ (0)kF ≤ wˆmax , where wˆmax is the projection norm bound imposed on W ˆ (t). The that kW next theorem presents the third result of this paper. Theorem 7.1. Consider the nonlinear uncertain dynamical system given by (1) subject to (57), the (modified) reference system given by (24), and the feedback control law given by ˜ (t), eL (t), x˜(t) given by (25), (8) along with (9), (11), and (63). Then, the solution e(t), W
19
˜ 0 , 0, 0 ∈ Rn × R(s+n)×m × Rn × Rn (21), (64), and (29) is uniformly bounded for all e0 , W and t ∈ R+ , with ultimate bound kx(t) − xri (t)kL∞ ≤
r
s ! ρV κλmax (P ) 1+ , λmin (P ) 2ξλmin (R)
(65)
where ξ ∈ (0, 1) and ρV , γ
−1
2 kΛkF w˜max
1 + 4γ
−1
2 kΛkF w˙ max λ−2 min (R)
1 + κλ2max (P )λmin (R)ξ −1 (1 − ξ)−2 2
1 + ηκ−1 λmax (P )
! .
(66)
Proof. Consider the Lyapunov-like function candidate given by (31). Then, using the property of the projection operator given by (60) and following similar steps in the proof of Theorem 5.1, we have V˙ ∗ · ≤−c1 ke(t)k22 − c2 keL (t)k22 − c3 k˜ x(t)k22 + c4 ,
(67)
where c1 , λmin (R), c2 , η −1 κλmin (R), c3 , 2ξκ−1 ·λ2min (R)λ−1 max (P )(1 − ξ), and c4 , 2 2 2γ −1 kΛkF w˜max w˙ max , and hence, either ke(t)k2 ≥ ψ1 , c4 /c1 or keL (t)k2 ≥ ψ2 , c4 /c2 2 ˆ (t) is predicated on or k˜ x(t)k2 ≥ ψ3 , c4 /c3 yields V˙ ∗ (·) ≤ 0. Since the update law for W ˜ (t)k ≤ w˜max . That is, V˙ ∗ (·) ≤ 0 outside the compact set defined the projection operator, kW by ˜ (t), eL (t), x˜(t)) ∈ Rn × R(s+n)×m × Rn × Rn : ke(t)k2 ≤ ψ1 DV , {(e(t), W ˜ (t), eL (t), x˜(t)) ∈ Rn or keL (t)k2 ≤ ψ2 or k˜ x(t)k2 ≤ ψ3 } ∪ {(e(t), W ˜ (t)k ≤ w˜max }. ×R(s+n)×m × Rn × Rn : kW
(68)
˜ (t), eL (t), x˜(t) cannot grow outside DV , and hence, Therefore, V ∗ e(t), W ˜ (t), eL (t), x˜(t) ≤ max ˜ ˜ , eL , x˜ = ρV . V ∗ e(t), W (e, W (e,W ,eL ,˜ x)∈DV
(69)
Now, by following the final steps in the proof of Theorem 5.1, (65) is immediate. This completes the proof. 20
Remark 7.2. By following an identical analysis to [8, Theorem 9], the results of Theorem 7.1 can be further extended to show that the Lyapunov-like function candidate given by (31) exponentially converges to the set defined by the ultimate bound (65).
8.
Illustrative Numerical Example Consider the nonlinear dynamical system representing a controlled wing rock aircraft
dynamics model given by x˙ p1 (t) 0 1 xp1 (t) 0 = + Λu(t) + δp (t, xp (t)) , x˙ p2 (t) 0 0 xp2 (t) 1
(70)
where xp1 (0) = 0, xp2 (0) = 0, xp1 represents the roll angle in radians, and xp2 represents the roll rate in radians per second. In (70), δp (t, xp ) and Λ represent uncertainties of the form δp (t, xp ) = α1 sin(t) + α2 xp1 + α3 xp2 + α4 |xp1 |xp2 + α5 |xp2 |xp2 + α6 x3p1 ,
(71)
Λ = 0.75,
(72)
and
where αi , i = 1, . . . , 6, are unknown parameters. For our numerical example, we set α1 = 0.25, α2 = 0.5, α3 = 1.0, α4 = −5.0, α5 = 5.0, and α6 = 10.0. We chose K = [2.0, 2.0, 1.0] for the nominal controller design. For the proposed adaptive control architecture (Theorem T 7.1), σ(x) = 1, xp1 , xp2 , |xp1 |xp2 , |xp2 |xp2 , x3p1 , xT , is chosen as the basis function and we set R = I3 . Figures 8.1–8.4 present the results, where measurement noise is added to the state vector of (70) and α1 is set from 0 to 0.25 both at t = 45 seconds8 for all cases. Here, our aim is to follow a given square-wave roll angle command c(t). Figure 8.1 shows the closed-loop system performance of the standard model reference adaptive control approach (γ = 500, κ = 0, and η = 0). Even though we achieve a satisfactory command following performance with this approach, as discussed in Remark 3.1, its control performance is unacceptable due to high-frequency oscillations and measurement noise amplification. 8
That is, exogenous time-varying disturbance sin(t) is added at t = 45 seconds.
21
xp1(t)
x p 1(t ) [ d eg]
10
c(t)
5 0 −5 −10 0
10
20
30
40
50
60
70
80
90
t [ s]
u(t ) [ d eg]
100 50 0 −50
un(t)
−100
ua(t) 0
10
20
30
40
50
60
70
80
90
t [ s]
Figure 8.1: Command following performance for the standard model reference adaptive control approach (γ = 500, κ = 0, and η = 0). Next, we show the closed-loop system performance of the proposed model reference adaptive control approach (γ = 500, κ = 100, and η = 5) in Figure 8.2. In particular, we achieve a satisfactory command following performance similar to the case in Figure 8.1. However, the control response of our approach is clearly superior as compared to the control response of the standard model reference adaptive control approach in Figure 8.1. This is expected from the proposed theory, and hence, the control response of the proposed approach neither has high-frequency oscillations nor high measurement noise amplification. In order to compare our approach with the approach of [7,8], we set η = 0 in Figure 8.3. In this case, however, the transient performance is not sufficient due to the presence of exogenous low-frequency persistent disturbance. In order to improve the transients of this approach, we increased learning rate to γ = 2000 in Figure 8.4. Even though we now achieve a satisfactory command following performance with this approach, its control response has excessive measurement noise as compared to the control response of the proposed approach in Figure 8.2. Note that this is also observed in Remark 4.3 for the case of a simple example.
22
x (t)
x p 1(t ) [ d eg]
10
p1
c(t)
5 0 −5 −10 0
10
20
30
40
50
60
70
80
90
t [ s]
u(t ) [ d eg]
40 20 0
un(t)
−20
ua(t) −40 0
10
20
30
40
50
60
70
80
90
t [ s]
x p 1(t ) [ d eg]
Figure 8.2: Command following performance for the proposed model reference adaptive control approach (γ = 500, κ = 100, and η = 5).
xp1(t)
10
c(t)
5 0 −5 −10 −15 0
10
20
30
40
50
60
70
80
90
t [ s]
u(t ) [ d eg]
40 20 0
un(t)
−20
ua(t) 0
10
20
30
40
50
60
70
80
90
t [ s]
Figure 8.3: Command following performance for the model reference adaptive control approach of [7,8] (γ = 500, κ = 100, and η = 0).
23
xp1(t)
x p 1(t ) [ d eg]
10
c(t)
5 0 −5 −10 0
10
20
30
40
50
60
70
80
90
t [ s]
u(t ) [ d eg]
40 20 0
un(t)
−20
ua(t) −40 0
10
20
30
40
50
60
70
80
90
t [ s]
Figure 8.4: Command following performance for the model reference adaptive control approach of [7,8] (γ = 2000, κ = 100, and η = 0).
9.
Conclusion We contributed to the previous studies in model reference adaptive control theory by
introducing a new reference system in order to improve the transient performance. By utilizing singular perturbation theory, it is shown that the proposed reference system allows to limit the frequency content of the system error dynamics to yield fast adaptation without incurring high-frequency oscillations in the transient performance. We derived the guaranteed performance bounds, analyzed the effects of design parameters on the system performance, and discussed robustness properties to time-varying uncertainties and disturbances.
10.
Acknowledgment
The authors wish to thank Eugene Lavretsky from the Boeing Company for his constructive suggestions.
24
References [1] H. P. Whitaker, J. Yamron, and A. Kezer, Design of model reference control systems for aircraft. Cambridge, MA: Instrumentation Laboratory, Massachusetts Institute of Technology, 1958. [2] P. V. Osburn, H. P. Whitaker, and A. Kezer, “New developments in the design of adaptive control systems,” Institute of Aeronautical Sciences, 1961. [3] Z. T. Dydek, A. M. Annaswamy, and E. Lavretsky, “Adaptive control and the NASA X-15-3 flight revisited,” IEEE Control Systems Magazine, vol. 30, pp. 32–48, 2010. [4] C. E. Rohrs, L. S. Valavani, M. Athans, and G. Stein, “Robustness of continuous-time adaptive control algorithms in the presence of unmodeled dynamics,” IEEE Transactions on Automatic Control, vol. 30, pp. 881–889, 1985. [5] N. Hovakimyan and C. Cao, L1 adaptive control theory: Guaranteed robustness with fast adaptation. Philadelphia, PA: SIAM, 2010. [6] T. Yucelen and W. M. Haddad, “A robust adaptive control architecture for disturbance rejection and uncertainty suppression with L∞ transient and steady-state performance guarantees,” International Journal of Adaptive Control and Signal Processing (to appear, available online). [7] E. Lavretsky, “Reference dynamics modification in adaptive controllers for improved transient performance,” AIAA Guidance, Navigation, and Control Conference, Portland, OR, 2011. [8] T. E. Gibson, A. M. Annaswamy, and E. Lavretsky, “Adaptive systems with closed-loop reference models: Stability, robustness, and transient performance,” IEEE Transactions on Automatic Control (submitted, available online). [9] H. K. Khalil, Nonlinear systems. Upper Saddle River, NJ: Prentice Hall, 2002. [10] F. L. Lewis, A. Yesildirek, and K. Liu, “Multilayer neural-net robot controller with guaranteed tracking performance,” IEEE Transactions on Neural Networks, vol. 7, pp. 388-399, 1996. [11] W. M Haddad and V. Chellaboina, Nonlinear dynamical systems and control: A Lyapunov-based approach. Princeton, NJ: Princeton University Press, 2008. [12] Z. Han and K. S. Narendra, “New concepts in adaptive control using multiple models,” IEEE Transactions on Automatic Control, vol. 57, pp. 78–89, 2012. [13] T. Yucelen and A. J. Calise, “Kalman filter modification in adaptive control,” AIAA Journal of Guidance, Control, and Dynamics, vol. 33, pp. 426-439, 2010. [14] K. Y. Volyanskyy, W. M. Haddad, and A. J. Calise, “A new neuroadaptive control architecture for nonlinear uncertain dynamical systems: Beyond σ- and e-modifications,” IEEE Transactions on Neural Networks, vol. 20, no. 11, pp. 1707-1723, 2010. [15] A. J. Calise and T. Yucelen, “Adaptive loop transfer recovery,” AIAA Journal of Guidance, Control, and Dynamics (to appear, available online). [16] T. Yucelen and W. M. Haddad, “Low-frequency learning and fast adaptation in model reference adaptive control,” IEEE Transactions on Automatic Control (to appear, available online).
[17] K. J. Astrom and R. M. Murray, Feedback systems: An introduction for scientists and engineers. Princeton, NJ: Princeton University Press, 2008. [18] D. S. Bernstein, Matrix mathematics: Theory, facts, and formulas. Princeton, NJ: Princeton University Press, 2009. [19] J. B. Pomet and L. Praly, “Adaptive nonlinear regulation: Estimation from Lyapunov equation,” IEEE Transactions on Automatic Control, vol. 37, pp. 729–740, 1992.