
Successive Galerkin Approximation of Nonlinear Optimal Attitude Control

Jonathan Lawton, Randal W. Beard
Electrical and Computer Engineering Department
Brigham Young University, Provo, Utah 84602

Tim McLain
Mechanical Engineering Department
Brigham Young University, Provo, Utah 84602

Abstract

This paper presents the application of the Successive Galerkin Approximation (SGA) to the Hamilton-Jacobi-Bellman equation to obtain solutions of the optimal attitude control problem. Galerkin's method approximates the value function by a truncated Galerkin series expansion, for which a truncated Galerkin basis set is formed. A sufficient number of functions must be included in this basis set to guarantee that the resulting solution is a stabilizing control. Increasing the size of the Galerkin basis improves the quality of the approximation, at the cost of rapid growth in the computational load of the SGA. A major result of this paper is the development of the Galerkin basis set in the context of the optimal attitude control problem.

1 Introduction

The development of control laws to regulate the attitude of spacecraft and aircraft has been the focus of much research. Within this class of problems, the optimal attitude control problem has proven challenging due to its cascade nature. Open-loop solutions have been obtained for the optimal control problem; reference [9] is a recent example of a numerical solution of the state and co-state equations from a Hamiltonian formulation. However, for practical reasons it is of interest to find solutions to the optimal control problem that are in feedback form.

Most feedback optimal control results have performance indices that penalize the angular velocity and either the control effort or the spacecraft attitude. Recall that the rotational equations of motion of a rigid body can be seen as having a dynamic component and a kinematic component in cascade. Penalizing the angular velocity and the torque essentially ignores the kinematic component of the system, producing optimal control results that regulate the angular velocity but not the attitude [11]. Similarly, there are controls that only address the kinematic portion of the system; these results are found by taking the angular velocity to be the control input to the kinematics, with a performance index that penalizes the angular velocity and the attitude [10].

It is desirable to solve the optimal control problem for a performance index that penalizes the angular velocity, the control torque, and the attitude. In [10] this problem is approached by first using feedback linearization on the dynamics and then solving the complete optimal control problem. Another approach is used in reference [6], where an inverse optimal stabilization technique is used to solve the problem. Neither of these techniques can be applied to a large class of penalty functions.

The main result of this paper is to use the recently developed Successive Galerkin Approximation (SGA) algorithm to find stabilizing feedback controls that approximate the optimal attitude control. The SGA was initially described in [1], with a tutorial published in [2]. In [3, 4] it is shown that as the degree of the approximation is improved, the control approaches the optimal control. All numerical techniques for solving the optimal control problem via the Hamilton-Jacobi-Bellman equation are plagued by the "curse of dimensionality," in that the number of computations required to compute the optimal control grows exponentially with the state dimension [5]. The SGA is not free from this curse. Fortunately, by eliminating less significant terms in the control, the computational load can be reduced.

In this paper we use the Rodrigues parameters to represent the attitude of the rigid body. In Section 2 we review the kinematics associated with the Rodrigues parameters as well as the Euler dynamics. Section 3 presents the SGA algorithm in the context of finding approximate solutions to the optimal attitude control problem. In Section 4 we give numerical examples illustrating the results of this paper; comparisons are made between our results, a linear controller, and an optimal controller where the dynamics have been feedback-linearized. Finally, in Section 5, we give our conclusions.

2 System Dynamics and Kinematics

The Euler dynamics and kinematics of a rigid body are given by the system of differential equations [10]

    \dot{\rho} = H(\rho)\omega,
    \dot{\omega} = J^{-1} S(\omega) J \omega + J^{-1} u,    (1)

where \rho \in \mathbb{R}^3 is the vector of Rodrigues parameters, \omega \in \mathbb{R}^3 is the angular velocity vector, and u \in \mathbb{R}^3 is the control torque. The symbol S(\cdot) denotes the skew-symmetric matrix

    S(\omega) = \begin{bmatrix} 0 & \omega_3 & -\omega_2 \\ -\omega_3 & 0 & \omega_1 \\ \omega_2 & -\omega_1 & 0 \end{bmatrix},

and the matrix-valued function H(\cdot) is given by

    H(\rho) = \frac{1}{2}\left(I - S(\rho) + \rho\rho^T\right).

We have chosen to use the Rodrigues parameters in this paper. However, the results we obtain by applying the SGA algorithm to the optimal attitude control problem do not depend on this choice of parameters; any other three-parameter attitude representation would achieve similar results. If we define the state by x = [\rho^T, \omega^T]^T, then the state equations are given by

    \dot{x} = f(x) + g u,    (2)

where

    f(x) = \begin{bmatrix} H(\rho)\omega \\ J^{-1} S(\omega) J \omega \end{bmatrix} \quad \text{and} \quad g = \begin{bmatrix} 0 \\ J^{-1} \end{bmatrix}.

References [1, 2, 3, 4] derive approximations to the Hamilton-Jacobi-Bellman equation for systems of the form given in equation (2).
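For concreteness, the model above can be sketched in a few lines of Python. This is our own illustration, not code from the paper; numpy is assumed, and J stands for whatever inertia matrix is at hand.

    import numpy as np

    def S(w):
        # Skew-symmetric matrix as defined above; note S(w) @ v equals v x w.
        return np.array([[ 0.0,   w[2], -w[1]],
                         [-w[2],  0.0,   w[0]],
                         [ w[1], -w[0],  0.0]])

    def H(rho):
        # Rodrigues-parameter kinematics matrix: H(rho) = 1/2 (I - S(rho) + rho rho^T).
        return 0.5 * (np.eye(3) - S(rho) + np.outer(rho, rho))

    def f(x, J):
        # Drift term of equation (2): [H(rho) w ; J^{-1} S(w) J w].
        rho, w = x[:3], x[3:]
        return np.concatenate([H(rho) @ w, np.linalg.solve(J, S(w) @ (J @ w))])

    def g(J):
        # Input matrix of equation (2): [0 ; J^{-1}].
        return np.vstack([np.zeros((3, 3)), np.linalg.inv(J)])

The state derivative is then f(x, J) + g(J) @ u, matching equation (2).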

3 The Main Result

Given system (2), we wish to find the control u that minimizes the performance index

    V(x(t_0)) = \int_{t_0}^{\infty} l(x(t)) + u^T(x(t)) R u(x(t)) \, dt,

where l(x) : \mathbb{R}^6 \to \mathbb{R} and R \in \mathbb{R}^{3 \times 3}. The optimal control is u^* = -\frac{1}{2} R^{-1} g^T \frac{\partial V^*}{\partial x}(x), where V^* is the solution to the Hamilton-Jacobi-Bellman (HJB) equation

    \mathrm{HJB}(V^*) = \frac{\partial V^{*T}}{\partial x} f + l - \frac{1}{4} \frac{\partial V^{*T}}{\partial x} g R^{-1} g^T \frac{\partial V^*}{\partial x} = 0.    (3)

The optimal control can be iteratively approximated by

    u^{(i)} = -\frac{1}{2} R^{-1} g^T \frac{\partial V^{(i-1)}}{\partial x}(x),

where V^{(i-1)} is the solution of the Generalized Hamilton-Jacobi-Bellman (GHJB) equation [8]

    \mathrm{GHJB}(V^{(i-1)}, u^{(i-1)}) = \frac{\partial V^{(i-1)T}}{\partial x}\left(f + g u^{(i-1)}\right) + l + u^{(i-1)T} R u^{(i-1)} = 0,    (4)

provided that the algorithm is initialized by a control u^{(0)} that is stabilizing on a set \Omega \subset \mathbb{R}^6.

The Successive Galerkin Approximation (SGA) algorithm uses Galerkin's spectral method to form a truncated series approximation V_N^{(i)} = \sum_{j=1}^{N} c_j \phi_j to equation (4), where \{\phi_j\} is a truncated set of basis functions with corresponding coefficients \{c_j\}. The SGA algorithm iteratively approximates the optimal control by

    u_N^{(i)} = -\frac{1}{2} R^{-1} g^T \sum_{j=1}^{N} \frac{\partial \phi_j}{\partial x}(x) \, c_j^{(i)},

where the coefficients c_j^{(i)} are found by solving the system of algebraic equations

    \left\langle \mathrm{GHJB}\!\left(u_N^{(i-1)}, \sum_{j=1}^{N} c_j \phi_j\right), \phi_k(x) \right\rangle = 0, \qquad k = 1, \ldots, N,    (5)

where the projection operator is \langle f, \phi \rangle = \int_{\Omega} f \phi \, dx. The algorithm is guaranteed to generate a stabilizing control provided that the set \{\phi_j\} is large enough and u^{(0)} is stabilizing on a set \Omega \subset \mathbb{R}^6.

All methods of solving the HJB equation are subject to the curse of dimensionality, and the SGA algorithm is no exception: the number of computations required to calculate the coefficients increases exponentially with the state dimension [7]. Fortunately, many of the basis elements \phi_j have coefficients c_j that are very small or zero. These elements can be eliminated from the control without greatly influencing its performance, which in turn reduces the computational load of the algorithm.

The process of choosing appropriate basis functions can be divided into two steps. First, we choose a set B_0 of basis functions that will at least produce a stabilizing control; some guidelines, in addition to trial and error, can be used in making this selection. Then we slowly increase the number of basis elements, rejecting those whose coefficients are either zero or very small, until we are satisfied with the control effort.

To choose the initial basis elements, we exploit the structure of the SGA control as applied to the optimal attitude control problem. We assume that J = \mathrm{diag}(J_1, J_2, J_3) and R = \mathrm{diag}(R_1, R_2, R_3). Then

    u_N^{(i)} = -\frac{1}{2} R^{-1} g^T \frac{\partial V_N^{(i-1)}}{\partial x}(x) = -\frac{1}{2} R^{-1} J^{-1} \frac{\partial V_N^{(i-1)}}{\partial \omega}(x) = -\frac{1}{2} R^{-1} J^{-1} \sum_{j=1}^{N} \frac{\partial \phi_j}{\partial \omega} \, c_j.    (6)
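To make the iteration concrete, the following is a minimal numerical sketch of one SGA step; it is our own illustration, not the authors' implementation. The projection (5) is approximated by quadrature over sample points covering \Omega (the sampling scheme and function names are assumptions), which reduces the GHJB equation to a linear system in the coefficients.

    import numpy as np

    def sga_step(phis, grad_phis, u_prev, f, g, l, R, samples):
        # One SGA iteration: assemble and solve the Galerkin projection (5)
        # of the GHJB equation (4), then return the improved control.
        N = len(phis)
        A = np.zeros((N, N))
        b = np.zeros(N)
        for x in samples:                      # quadrature points covering Omega
            u = u_prev(x)
            xdot = f(x) + g @ u                # closed-loop drift f + g u^(i-1)
            cost = l(x) + u @ R @ u            # l + u^T R u
            phi = np.array([p(x) for p in phis])
            for j in range(N):
                # <dphi_j/dx . (f + g u), phi_k> contributes to column j
                A[:, j] += (grad_phis[j](x) @ xdot) * phi
            b -= cost * phi                    # -<l + u^T R u, phi_k>
        c = np.linalg.solve(A, b)              # Galerkin coefficients c^(i)
        Rinv_gT = np.linalg.inv(R) @ g.T
        def u_next(x):
            dV = sum(cj * gp(x) for cj, gp in zip(c, grad_phis))
            return -0.5 * Rinv_gT @ dV         # u^(i) = -1/2 R^{-1} g^T dV/dx
        return u_next, c

Iterating sga_step from a stabilizing u^{(0)}, and pruning basis elements whose coefficients stay near zero, reproduces the procedure described above.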


We wish to choose basis elements such that the SGA control has at least the same terms as the initial stabilizing control; this should allow us to at least match the performance of u^{(0)}. Specifically, we choose the basis functions so that the terms \frac{1}{2} R^{-1} J^{-1} \frac{\partial \phi_j}{\partial \omega} c_j have the same structure as the individual terms in u^{(0)}. For example, if we choose to initialize the algorithm with the globally asymptotically stabilizing linear control [10]

    u^{(0)} = \begin{bmatrix} -k_1 \rho_1 - k_2 \omega_1 \\ -k_1 \rho_2 - k_2 \omega_2 \\ -k_1 \rho_3 - k_2 \omega_3 \end{bmatrix},    (7)

then choosing the basis elements

    B_0 = \{\rho_1\omega_1, \rho_2\omega_2, \rho_3\omega_3, \omega_1^2, \omega_2^2, \omega_3^2\}

accomplishes this goal. Since V_N^{(i)} is a Lyapunov function, it should be positive definite. The absence of the terms \{\rho_1^2, \rho_2^2, \rho_3^2\} from the basis would make it very difficult for V_N^{(i)} to be positive definite, so we add these terms to B_0 to get

    B_0 = \{\rho_1\omega_1, \rho_2\omega_2, \rho_3\omega_3, \omega_1^2, \omega_2^2, \omega_3^2, \rho_1^2, \rho_2^2, \rho_3^2\}.    (8)

Once this initial basis set is established, we run the SGA algorithm to make sure that it converges to a stabilizing control, adding additional terms to B_0 if necessary. Then we iteratively increase the size of B_0, eliminating terms that make no significant contribution to the control function u^{(i)}.

To summarize, there are four guidelines that we can use when selecting basis functions (the sketch following this list mechanizes them):

1. Find a simple stabilizing control law to initialize the SGA. For attitude control, we may use the linear control u^{(0)} = -k_1 \rho - k_2 \omega, which is globally asymptotically stabilizing [10].
2. Include terms in the initial basis set B_0 such that each term in u^{(0)} is represented.
3. Add terms to the basis set to give V_N^{(i)} a chance to be positive definite.
4. Iteratively run the algorithm, discarding less significant terms, until satisfied with the convergence of the control coefficients.
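As a sketch of how these guidelines might be mechanized (our illustration, under the assumption that the basis consists of monomials in the six states encoded as exponent tuples; prune is a hypothetical helper for guideline 4):

    import numpy as np

    # State ordering: x = [rho1, rho2, rho3, w1, w2, w3].
    # Initial basis B0 of equation (8), one exponent tuple per monomial.
    B0 = [(1,0,0,1,0,0), (0,1,0,0,1,0), (0,0,1,0,0,1),   # rho_i w_i  (guideline 2)
          (0,0,0,2,0,0), (0,0,0,0,2,0), (0,0,0,0,0,2),   # w_i^2      (guideline 2)
          (2,0,0,0,0,0), (0,2,0,0,0,0), (0,0,2,0,0,0)]   # rho_i^2    (guideline 3)

    def monomial(e):
        # phi(x) = prod_i x_i^{e_i}
        return lambda x: float(np.prod(x ** np.array(e)))

    def grad_monomial(e):
        # d phi / d x_i = e_i x_i^{e_i - 1} times the remaining factors
        e = np.array(e)
        def grad(x):
            out = np.zeros(6)
            for i in range(6):
                if e[i] > 0:
                    lowered = e.copy(); lowered[i] -= 1
                    out[i] = e[i] * np.prod(x ** lowered)
            return out
        return grad

    def prune(exponents, coeffs, tol=1e-6):
        # Guideline 4: discard basis elements whose coefficients are negligible.
        return [e for e, cj in zip(exponents, coeffs) if abs(cj) > tol]

These callables plug directly into the sga_step sketch from earlier in this section.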

4 Numerical Example

For comparison purposes, we consider the same system as in reference [10], with J = \mathrm{diag}(10, 6.3, 8.5) N-m s^2 and the initial conditions \rho(0) = [1.4735, 0.6115, 2.5521]^T and \omega(0) = [0, 0, 0]^T. We initialize the SGA with the globally asymptotically stabilizing control [10]

    u^{(0)} = -\rho - 3\omega.

Following the guidelines from Section 3, we found that the following set of basis functions gives good results:

    B = \{\rho_1\omega_1, \omega_1^2, \rho_1^2, \rho_1^3\omega_1, \rho_1^3\omega_1^3, \omega_1^4, \rho_1^2\omega_1^4, \rho_1^4\omega_1^4, \rho_1^4, \rho_1^4\omega_1^2,
         \rho_2\omega_2, \omega_2^2, \rho_2^2, \rho_2^3\omega_2, \rho_2^3\omega_2^3, \omega_2^4, \rho_2^2\omega_2^4, \rho_2^4\omega_2^4, \rho_2^4, \rho_2^4\omega_2^2,
         \rho_3\omega_3, \omega_3^2, \rho_3^2, \rho_3^3\omega_3, \rho_3^3\omega_3^3, \omega_3^4, \rho_3^2\omega_3^4, \rho_3^4\omega_3^4, \rho_3^4, \rho_3^4\omega_3^2\}.    (9)

Now we may compare the control effort derived from basis (9) with other frequently used attitude controls. First we compare our controller with the linear control u^{(0)}. For this example we have chosen the penalty l(x) = \rho^T\rho + \omega^T\omega. Figure 1 compares the performance of \rho_1 for the linear controller and the SGA controller; the trajectories of the other five states are very similar. Similarly, Figure 2 compares the first component of the control torque. It is apparent that the SGA control uses less control effort, with less overshoot, than the linear controller.

[Figure 1: Rodrigues Parameters: SGA control (solid); linear control (dashed).]

[Figure 2: Control Effort: SGA control (solid); linear control (dashed).]
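This setup is easy to reproduce in simulation. A minimal sketch, reusing f and g from the Section 2 sketch and scipy's initial-value solver (the solver and its settings are our choice, not the paper's):

    import numpy as np
    from scipy.integrate import solve_ivp

    J = np.diag([10.0, 6.3, 8.5])              # inertia from the example (N-m s^2)
    x0 = np.array([1.4735, 0.6115, 2.5521,     # rho(0)
                   0.0, 0.0, 0.0])             # omega(0)

    def u0(x):
        # Initializing control u^(0) = -rho - 3 omega [10].
        return -x[:3] - 3.0 * x[3:]

    def closed_loop(t, x):
        return f(x, J) + g(J) @ u0(x)          # f, g as sketched in Section 2

    sol = solve_ivp(closed_loop, (0.0, 25.0), x0, max_step=0.05)
    # sol.y[0] is the rho_1 trajectory, comparable to the linear-control
    # curve in Figure 1.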

Next we compare the SGA control with the optimal control of a system with feedback-linearized dynamics [10]:

    u = -S(\omega) J \omega - 2 J H(\rho) \omega - J(\omega + 2\rho).    (10)

In this example we use the same basis set B with the penalty l(x) = \gamma \rho^T\rho + \omega^T\omega for \gamma = 1, 10, 180. Figure 3 compares the first components of the Rodrigues parameters, and Figure 4 compares the first components of the control effort. It is clear that the optimized feedback-linearized controller requires a great deal of control effort. Figures 3 and 4 also illustrate that as \gamma \to \infty, control law (10) and the SGA control have very similar performance. This can be explained by the fact that increasing \gamma reduces the relative penalty on u: control law (10) is derived by using feedback linearization on the dynamics with a small penalty on the control effort, so as we decrease the penalty on the control the two controllers perform similarly.

[Figure 3: Rodrigues Parameters: \gamma = 1 (dotted); \gamma = 10 (dash-dot); \gamma = 180 (dashed); control law (10) (solid).]

[Figure 4: Control Effort: \gamma = 1 (dotted); \gamma = 10 (dash-dot); \gamma = 180 (dashed); control law (10) (solid).]
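For completeness, control law (10) translates directly to code, again reusing S and H from the Section 2 sketch:

    def u_fblin(x, J):
        # u = -S(w) J w - 2 J H(rho) w - J (w + 2 rho)   -- equation (10)
        rho, w = x[:3], x[3:]
        return -S(w) @ (J @ w) - 2.0 * (J @ (H(rho) @ w)) - J @ (w + 2.0 * rho)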

5 Conclusions

The main result of this paper is the implementation of the Successive Galerkin Approximation (SGA) to approximate the optimal attitude control. This is significant because it approximates the optimal control problem with a penalty on the angular velocity, attitude, and control effort, which in general is a very difficult problem. The method depends on the choice of a truncated Galerkin basis: the larger the truncated basis, the better the approximation, at the expense of a larger computational load. A new result in this paper is the discussion of strategies for choosing significant Galerkin basis functions. We conclude the paper by comparing the performance of the SGA control with a linear control and an optimized feedback-linearized control; the SGA control performs well against both.

References

[1] R. Beard, Improving the Closed-Loop Performance of Nonlinear Systems, PhD thesis, Rensselaer Polytechnic Institute, Troy, New York, 1995.

[2] R. Beard, G. Saridis, and J. Wen, Improving the performance of stabilizing control for nonlinear systems, Control Systems Magazine, 1996.

[3] R. Beard, G. Saridis, and J. Wen, Galerkin approximation of the Generalized Hamilton-Jacobi-Bellman equation, Automatica, 1997.

[4] R. Beard, G. Saridis, and J. Wen, Approximate solutions to the time-invariant Hamilton-Jacobi-Bellman equation, Journal of Optimization Theory and Applications, 96, 1998.

[5] R. E. Bellman, Dynamic Programming, Princeton University Press, Princeton, New Jersey, 1957.

[6] M. Krstic and P. Tsiotras, Inverse optimal stabilization of a rigid spacecraft, IEEE Transactions on Automatic Control, to appear.

[7] J. Lawton and R. Beard, Numerically efficient approximations to the Hamilton-Jacobi-Bellman equation, in Proceedings of the American Control Conference, 1998.

[8] G. N. Saridis and C.-S. G. Lee, An approximation theory of optimal control for trainable manipulators, IEEE Transactions on Systems, Man, and Cybernetics, SMC-9 (1979), pp. 152-159.

[9] H. Schaub, J. L. Junkins, and R. D. Robinett, New penalty functions and optimal control formulation for spacecraft attitude control problems, Journal of Guidance, Control, and Dynamics, 1997.

[10] P. Tsiotras, Stabilization and optimality results for the attitude control problem, Journal of Guidance, Control, and Dynamics, 1996.

[11] P. Tsiotras, M. Corless, and M. Rotea, Optimal control of rigid body angular velocity with quadratic cost, in Proceedings of the 35th Conference on Decision and Control, 1996.