IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS, VOL. 59, NO. 10, OCTOBER 2012


Robust Adaptive Dynamic Programming for Large-Scale Systems With an Application to Multimachine Power Systems Yu Jiang, Student Member, IEEE, and Zhong-Ping Jiang, Fellow, IEEE

Abstract—This brief presents a new approach to decentralized control design of complex systems with unknown parameters and dynamic uncertainties. A key strategy is to use the theory of robust adaptive dynamic programming and the policy iteration technique. An iterative control algorithm is given to devise a decentralized optimal controller that globally asymptotically stabilizes the system in question. Stability analysis is accomplished by means of the small-gain theorem. The effectiveness of the proposed computational control algorithm is demonstrated via the online learning control of multimachine power systems with governor controllers.

Index Terms—Adaptive dynamic programming (ADP), decentralized control, multimachine power systems, small-gain.

I. INTRODUCTION

IN RECENT years, considerable attention has been paid to the stabilization of large-scale complex systems [7], [17], [21]–[23], as well as to the related consensus and synchronization problems [1], [16], [24], [34]. Examples of large-scale systems arise from ecosystems, transportation networks, and power systems, to name only a few [2], [15], [18], [20], [28], [36]. Often, in real-world applications, precise mathematical models are hard to build, and model mismatches, caused by parametric and dynamic uncertainties, are thus unavoidable. This, together with the exchange of only local system information, makes the design problem extremely challenging in the context of complex networks.

The purpose of this brief is to apply the idea of approximate/adaptive dynamic programming (ADP) [29]–[32] to find online robust optimal stabilizing controllers for a class of complex large-scale systems. Inspired by biological learning behavior, ADP is a methodology for solving optimal control problems via online information, without precisely knowing the system dynamics [13], [14], [19], [27]. In our recent work [4], [5], we have integrated modern nonlinear control techniques with ADP to develop a new framework called robust ADP. An appealing feature of robust ADP is that dynamic uncertainties can be addressed, for the first time, in ADP-based computational adaptive controller design.

Manuscript received March 31, 2012; revised June 5, 2012; accepted July 18, 2012. Date of publication September 7, 2012; date of current version October 12, 2012. This work was supported in part by the National Science Foundation under Grants DMS-0906659 and ECCS-1101401. This brief was recommended by Associate Editor Y. Nishio. The authors are with the Department of Electrical and Computer Engineering, Polytechnic Institute of New York University, Brooklyn, NY 11201 USA (e-mail: [email protected]; [email protected]). Color versions of one or more of the figures in this brief are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TCSII.2012.2213353

In this brief, we intend to extend the robust ADP framework recently developed in [4] and [5] to the decentralized optimal control of a class of large-scale uncertain systems. The controller design for each subsystem only needs to utilize local state variables, without knowing the system dynamics. By integrating a simple version of the cyclic-small-gain theorem [9], asymptotic stability can be achieved by assigning appropriate weighting matrices for each subsystem. As a by-product, certain suboptimality properties can be obtained.

This brief is organized as follows. Section II studies the global and robust optimal stabilization of a class of large-scale uncertain systems. Section III develops the robust ADP scheme for large-scale systems. Section IV presents a novel solution to decentralized stabilization based on the proposed methodology. It is our belief that the proposed design methodology will find wide applications in large-scale systems. Finally, Section V gives some brief concluding remarks.

Throughout this brief, we use vertical bars | · | to denote the Euclidean norm for vectors or the induced matrix norm for matrices. ⊗ indicates the Kronecker product, and vec(A) represents the mn-vector defined by vec(A) = [a_1^T a_2^T · · · a_m^T]^T, where, for each i = 1, 2, . . . , m, a_i ∈ R^n is the ith column of A ∈ R^{n×m}.

II. OPTIMALITY AND ROBUSTNESS

In this section, we first describe the class of large-scale uncertain systems to be studied. Then, we present our novel decentralized optimal controller design scheme. It will also be shown that the closed-loop interconnected system enjoys some suboptimality properties.

A. Description of Large-Scale Systems

Consider the complex large-scale system of which the ith subsystem (1 ≤ i ≤ N) is described by

ẋ_i = A_i x_i + B_i [u_i + Ψ_i(y)]
y_i = C_i x_i,    1 ≤ i ≤ N    (1)

where x_i ∈ R^{n_i}, y_i ∈ R^{q_i}, and u_i ∈ R^{m_i} are the state, the output, and the control input of the ith subsystem; y = [y_1^T, y_2^T, . . . , y_N^T]^T; and A_i ∈ R^{n_i×n_i} and B_i ∈ R^{n_i×m_i} are unknown system matrices. Ψ_i(·) : R^q → R^{m_i} represents unknown interconnections satisfying |Ψ_i(y)| ≤ d_i |y| for all y ∈ R^q, with d_i > 0, Σ_{i=1}^N n_i = n, Σ_{i=1}^N q_i = q, and Σ_{i=1}^N m_i = m. It is also assumed that (A_i, B_i) is a stabilizable pair, i.e., there exists a constant matrix K_i such that A_i − B_i K_i is a stable matrix. Notice that the decoupled system can be written in the compact form


ẋ = A_D x + B_D u    (2)


where x = [x_1^T, x_2^T, . . . , x_N^T]^T ∈ R^n, u = [u_1^T, u_2^T, . . . , u_N^T]^T ∈ R^m, A_D = block diag(A_1, A_2, . . . , A_N) ∈ R^{n×n}, and B_D = block diag(B_1, B_2, . . . , B_N) ∈ R^{n×m}. For system (2), we define the following quadratic cost:

J_D = ∫_0^∞ (x^T Q_D x + u^T R_D u) dτ    (3)

where Q_D = block diag(Q_1, Q_2, . . . , Q_N) ∈ R^{n×n}, R_D = block diag(R_1, R_2, . . . , R_N) ∈ R^{m×m}, Q_i ∈ R^{n_i×n_i}, and R_i ∈ R^{m_i×m_i}, with Q_i = Q_i^T ≥ 0, R_i = R_i^T > 0, and (A_i, Q_i^{1/2}) observable for all 1 ≤ i ≤ N. By linear optimal control theory [12], the minimum cost J_D^* in (3) can be obtained by employing the following decentralized control policy:

u = −K_D x    (4)

where K_D = block diag(K_1, K_2, . . . , K_N) is given by

K_D = R_D^{-1} B_D^T P_D    (5)

and P_D = block diag(P_1, P_2, . . . , P_N) is the unique symmetric positive definite solution of the algebraic Riccati equation

A_D^T P_D + P_D A_D − P_D B_D R_D^{-1} B_D^T P_D + Q_D = 0.    (6)
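Because of the block-diagonal structure, (6) decouples into N subsystem Riccati equations. When A_i and B_i are known, each can be solved by Kleinman's policy iteration [10], which replaces the quadratic equation with a sequence of linear Lyapunov equations; this is the model-based counterpart of the data-driven iteration used in robust ADP. The sketch below (Python/NumPy; the double-integrator data and the initial gain are illustrative, not from this brief) solves each Lyapunov equation via the vec(·)/⊗ identity introduced in Section I:

```python
import numpy as np

def lyap(Ac, W):
    """Solve Ac^T P + P Ac + W = 0 by vectorization:
    (I ⊗ Ac^T + Ac^T ⊗ I) vec(P) = -vec(W), with vec(.) column-stacking."""
    n = Ac.shape[0]
    I = np.eye(n)
    M = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    P = np.linalg.solve(M, -W.flatten(order="F")).reshape((n, n), order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off

def kleinman(A, B, Q, R, K0, n_iter=25):
    """Policy iteration for A^T P + P A - P B R^{-1} B^T P + Q = 0,
    starting from a stabilizing gain K0 (Kleinman, 1968)."""
    K = K0
    for _ in range(n_iter):
        Ac = A - B @ K                   # closed-loop matrix under current policy
        P = lyap(Ac, Q + K.T @ R @ K)    # policy evaluation (Lyapunov equation)
        K = np.linalg.solve(R, B.T @ P)  # policy improvement
    return P, K

# Illustrative subsystem: double integrator, Q = I, R = 1.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
P, K = kleinman(A, B, np.eye(2), np.array([[1.0]]), np.array([[1.0, 1.0]]))
```

Starting from any stabilizing K0, the iterates converge to the unique stabilizing solution P_i of the corresponding subsystem equation, and K collects the blocks of (5).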

B. Decentralized Stabilization

Now, we analyze the stability of the closed-loop system composed of (1) and the decentralized controller (4). We show that, by selecting appropriate weighting matrices Q_D and R_D, global asymptotic stability can be achieved for the large-scale closed-loop system. To begin with, the following two lemmas are given. Notice that some details in the proofs are omitted due to space limitations; see [6] for the proofs in greater detail.

Lemma 2.1: For any γ_i > 0 and ε_i > 0, let u be the decentralized control policy obtained from (4)–(6) with Q_i ≥ (γ_i^{-1} + 1) C_i^T C_i + γ_i^{-1} ε_i I_{n_i} and R_i^{-1} ≥ d_i^2 I_{m_i}. Then, along the solutions of the closed-loop system consisting of (1) and (4), we have

d/dt (x_i^T γ_i P_i x_i) ≤ −|y_i|^2 − ε_i |x_i|^2 + (γ_i/(N−1)) Σ_{j=1, j≠i}^N |y_j|^2.    (7)

Lemma 2.2: By the cyclic-small-gain theorem [9], the closed-loop system composed of (1) and (4) is globally asymptotically stable at the origin if

γ_{i_1} γ_{i_2} · · · γ_{i_{j+1}} < 1    (8)
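Condition (8) requires that the product of every selection of two or more of the gains γ_i stay below one. Since, for each length, the worst product is attained by the largest gains, it suffices to check prefix products of the gains sorted in decreasing order. A quick sketch (the gain values are illustrative, not from this brief):

```python
def satisfies_cyclic_small_gain(gammas):
    """Check condition (8): the product of any two or more of the
    gains gamma_i must be strictly less than one.  Testing the k
    largest gains for k = 2, ..., N covers the worst case per length."""
    g = sorted(gammas, reverse=True)
    prod = g[0]
    for gk in g[1:]:
        prod *= gk          # product of the k largest gains
        if prod >= 1.0:
            return False    # a violating selection exists
    return True
```

Note that an individual gain may exceed one as long as every product of two or more gains remains below one.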

for all 1 ≤ i_1 < i_2 < · · · < i_{j+1} ≤ N and 1 ≤ j ≤ N − 1.

Let β > 0 be a known constant satisfying

max_{1≤i,j≤N} E′_{qi} E′_{qj} (|B_{ij}| + |G_{ij}|) < β.

Then

|d_i(t)| ≤ (N−1)β Σ_{j=1}^N |Δω_i − Δω_j| ≤ (N−1)^2 β Σ_{j=1}^N |Δω_j|.
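The bound above reduces the unknown interconnection to a single scalar. Given upper bounds on the EMFs E′_{qi} and the admittance magnitudes |B_{ij}|, |G_{ij}|, the pairwise maximum and the coefficient (N−1)^2 β multiplying Σ_j |Δω_j| can be computed directly; a minimal sketch with illustrative numbers (not the ten-machine data of this brief):

```python
def interconnection_gain(Eq, Bij, Gij):
    """Return the pairwise maximum of Eq[i]*Eq[j]*(|B_ij| + |G_ij|)
    (any beta strictly larger than it satisfies the strict inequality)
    and the coefficient (N-1)^2 * max from the disturbance bound above."""
    N = len(Eq)
    m = max(Eq[i] * Eq[j] * (abs(Bij[i][j]) + abs(Gij[i][j]))
            for i in range(N) for j in range(N))
    return m, (N - 1) ** 2 * m

# Illustrative two-machine data (assumed values, not from the brief).
Eq = [1.0, 2.0]
Bij = [[0.0, 0.5], [0.5, 0.0]]
Gij = [[0.0, 0.1], [0.1, 0.0]]
m, coeff = interconnection_gain(Eq, Bij, Gij)
```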

Therefore, the model (29)–(31) is in the form (1) if we define x_i = [Δδ_i(t) Δω_i(t) ΔP_{ei}(t)]^T and y_i = Δω_i(t).

B. Numerical Simulation

A ten-machine power system is considered for numerical studies. In the simulation, generator 1 is used as the reference machine. Governor controllers and ADP-based learning systems are installed on generators 2–10. The parameters of the system are given in [6]. All the parameters, except for the operating points, are assumed to be unknown to the learning system. The weighting matrices are set to Q_i = 1000 I_3 and R_i = 1 for i = 2, 3, . . . , 10.

From t = 0 s to t = 1 s, all the generators were operating at the steady state. At t = 1 s, an impulse disturbance on the active power was added to the network. As a result, the power angles and frequencies started to oscillate. In order to stabilize the system and improve its performance, the learning algorithm was conducted from t = 4 s to t = 5 s. Robust-ADP-based control policies for the generators were applied from t = 5 s to the end of the simulation. Trajectories of the angles and frequencies of generators 2–4 are shown in Figs. 1 and 2. The complete simulation results are given in [6].

V. CONCLUSION

The decentralized control of large-scale uncertain systems using robust ADP has been studied in this brief. A novel decentralized controller design scheme has been presented. The obtained controller globally asymptotically stabilizes the large-scale system and, at the same time, preserves suboptimality properties. In addition, the effectiveness of the proposed methodology has been demonstrated via its application to the online learning control of multimachine power systems with governor controllers.


Fig. 1. Power angle deviations of the generators.

Fig. 2. Power frequencies of the generators.

REFERENCES

[1] M. Chen, "Some simple synchronization criteria for complex dynamical networks," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 53, no. 11, pp. 1185–1189, Nov. 2006.
[2] G. Guo, Y. Wang, and D. J. Hill, "Nonlinear output stabilization control for multimachine power systems," IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 47, no. 1, pp. 46–53, Jan. 2000.
[3] Y. Jiang and Z. P. Jiang, "Computational adaptive optimal control for continuous-time linear systems with completely unknown system dynamics," Automatica, vol. 48, no. 10, pp. 2699–2704, Oct. 2012.
[4] Y. Jiang and Z. P. Jiang, "Robust adaptive dynamic programming," in Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, F. L. Lewis and D. Liu, Eds. New York: Wiley, 2012.
[5] Y. Jiang and Z. P. Jiang, "Robust approximate dynamic programming and global stabilization with nonlinear dynamic uncertainties," in Proc. IEEE Joint Conf. Decision Control Eur. Control Conf., Orlando, FL, Dec. 2011, pp. 115–120.
[6] Y. Jiang and Z. P. Jiang, "Robust adaptive dynamic programming for large-scale systems with an application to multimachine power systems," Polytech. Inst. New York Univ., Brooklyn, NY, Tech. Rep. [Online]. Available: http://files.nyu.edu/yj348/public/papers/2012/tcas12tr.pdf
[7] Z. P. Jiang, "Decentralized control for large-scale nonlinear systems: A review of recent results," J. Dyn. Cont., Discr. Impulsive Syst., vol. 11, no. 4/5, pp. 537–552, 2004.
[8] Z. P. Jiang, A. R. Teel, and L. Praly, "Small-gain theorem for ISS systems and applications," Math. Control, Signals, Syst., vol. 7, no. 2, pp. 95–120, Jun. 1994.
[9] Z. P. Jiang and Y. Wang, "A generalization of the nonlinear small-gain theorem for large-scale complex systems," in Proc. 7th World Congr. Intell. Control Autom., Jun. 2008, pp. 1188–1193.
[10] D. Kleinman, "On an iterative technique for Riccati equation computations," IEEE Trans. Autom. Control, vol. AC-13, no. 1, pp. 114–115, Feb. 1968.
[11] P. Kundur, N. J. Balu, and M. G. Lauby, Power System Stability and Control. New York: McGraw-Hill, 1994.
[12] F. L. Lewis and V. L. Syrmos, Optimal Control. Hoboken, NJ: Wiley, 1995.
[13] F. L. Lewis and D. Vrabie, "Reinforcement learning and adaptive dynamic programming for feedback control," IEEE Circuits Syst. Mag., vol. 9, no. 3, pp. 32–50, 2009.
[14] F. L. Lewis and K. G. Vamvoudakis, "Reinforcement learning for partially observable dynamic processes: Adaptive dynamic programming using measured output data," IEEE Trans. Syst., Man, Cybern. B, Cybern., vol. 41, no. 1, pp. 14–25, Feb. 2011.
[15] Z. Li and G. Chen, "Global synchronization and asymptotic stability of complex dynamical networks," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 53, no. 1, pp. 28–33, Jan. 2006.
[16] Z. Li, Z. Duan, G. Chen, and L. Huang, "Consensus of multiagent systems and synchronization of complex networks: A unified viewpoint," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 1, pp. 213–224, Jan. 2010.
[17] A. N. Michel, "On the status of stability of interconnected systems," IEEE Trans. Circuits Syst., vol. CAS-30, no. 6, pp. 326–340, Jun. 1983.
[18] Y. H. Liu, T. J. Chen, C. W. Li, Y. Z. Wang, and B. Chu, "Energy-based L2 disturbance attenuation excitation control of differential algebraic power systems," IEEE Trans. Circuits Syst. II, Exp. Briefs, vol. 55, no. 10, pp. 1081–1085, Oct. 2008.
[19] J. J. Murray, C. J. Cox, G. G. Lendaris, and R. Saeks, "Adaptive dynamic programming," IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 32, no. 2, pp. 140–153, May 2002.
[20] G. Revel, A. E. León, D. M. Alonso, and J. L. Moiola, "Bifurcation analysis on a multimachine power system model," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 4, pp. 937–949, Apr. 2010.
[21] N. R. Sandell, P. Varaiya, M. Athans, and M. G. Safonov, "Survey of decentralized control methods for large scale systems," IEEE Trans. Autom. Control, vol. AC-23, no. 2, pp. 108–128, Apr. 1978.
[22] D. D. Šiljak, Large-Scale Dynamic Systems: Stability and Structure. New York: North Holland, 1978.
[23] D. D. Šiljak, Decentralized Control of Complex Systems. Orlando, FL: Academic, 1991.
[24] Q. Song and J. Cao, "On pinning synchronization of directed and undirected complex dynamical networks," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 3, pp. 672–680, Mar. 2010.
[25] E. D. Sontag, "Input to state stability: Basic concepts and results," in Nonlinear and Optimal Control Theory, P. Nistri and G. Stefani, Eds. Berlin, Germany: Springer-Verlag, 2007, pp. 163–220.
[26] D. Vrabie, O. Pastravanu, M. Abu-Khalaf, and F. L. Lewis, "Adaptive optimal control for continuous-time linear systems based on policy iteration," Automatica, vol. 45, no. 2, pp. 477–484, Feb. 2009.
[27] F.-Y. Wang, H. Zhang, and D. Liu, "Adaptive dynamic programming: An introduction," IEEE Comput. Intell. Mag., vol. 4, no. 2, pp. 39–47, May 2009.
[28] Y. Wang and D. J. Hill, "Robust nonlinear coordinated control of power systems," Automatica, vol. 32, no. 4, pp. 611–618, Apr. 1996.
[29] P. J. Werbos, "Beyond regression: New tools for prediction and analysis in the behavioural sciences," Ph.D. dissertation, Harvard Univ., Cambridge, MA, 1974.
[30] P. J. Werbos, "Neural networks for control and system identification," in Proc. IEEE 28th Conf. Decision Control, Dec. 1989, pp. 260–265.
[31] P. J. Werbos, "A menu of designs for reinforcement learning over time," in Neural Networks for Control, W. T. Miller, R. S. Sutton, and P. J. Werbos, Eds. Cambridge, MA: MIT Press, 1991, pp. 67–95.
[32] P. J. Werbos, "Approximate dynamic programming for real-time control and neural modeling," in Handbook of Intelligent Control, D. A. White and D. A. Sofge, Eds. New York: Van Nostrand, 1992.
[33] P. J. Werbos, "Intelligence in the brain: A theory of how it works and how to build it," Neural Netw., vol. 22, no. 3, pp. 200–212, Apr. 2009.
[34] X. Yang, J. Cao, and J. Lu, "Stochastic synchronization of complex networks with nonidentical nodes via hybrid adaptive and impulsive control," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 59, no. 2, pp. 371–384, Feb. 2012.
[35] H. Zhang, Q. Wei, and D. Liu, "An iterative adaptive dynamic programming method for solving a class of nonlinear zero-sum differential games," Automatica, vol. 47, no. 1, pp. 207–214, Jan. 2011.
[36] J. Zhou and Y. Ohsawa, "Improved swing equation and its properties in synchronous generators," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 1, pp. 200–209, Jan. 2009.
