INVERSION BASED CONSTRAINED TRAJECTORY OPTIMIZATION Nicolas Petit, Mark B. Milam, Richard M. Murray

Control and Dynamical Systems, Mail Code 107-81, California Institute of Technology, Pasadena, CA 91125. {npetit,milam,murray}@cds.caltech.edu

Abstract: A computationally efficient technique for the numerical solution of optimal control problems is discussed. This method utilizes tools from nonlinear control theory to transform the optimal control problem to a new, lower dimensional set of coordinates. It is hypothesized that maximizing the relative degree in this transformation is directly related to minimizing the computation time. Firm evidence of this hypothesis is given by numerical experiments. Results are presented using the Nonlinear Trajectory Generation (NTG) software package. Copyright IFAC 2001.

Keywords: Real-time optimization, optimal control, inversion, nonlinear optimization

1. INTRODUCTION

Computationally efficient trajectory optimization is an enabling technology for many new facets of engineering. Formation flying of satellites, see (TechSat 21 n.d.), and trajectory generation of unmanned aerial vehicles, see (Sweetman 1997), are two examples where the tools of real-time trajectory optimization would be extremely useful. In (Milam et al. 2000), a new technique is presented to solve such problems. The technique is based on mapping the system dynamics, objective, and constraints to a lower dimensional space. An optimization problem is solved in the lower dimensional space, and the optimal states and inputs are then recovered from the inverse mapping. However, no concrete criterion was provided to determine the appropriate coordinate change.

In this paper, we address the case of single-input single-output systems and show how the relative degree of the system affects real-time trajectory generation optimization problems. The relative degree of the system directly determines the amount of inversion used in the optimization problem. The computational implications of inversion are investigated. By example, we conclude that more inversion significantly increases the speed of execution with no loss in the rate of convergence. For either open-loop reference trajectory design or receding horizon techniques, this example illustrates that

Work supported in part by the Institut National de Recherche en Informatique et en Automatique (INRIA), AFOSR grant F49620-99-1-0190 and DARPA contract F33615-98-C-3613.

the choice of adequate variables for representing a system and its dynamics is crucial when implementing real-time trajectory generation. Section 2 describes the class of optimization problems under consideration; two classical methods for solving such problems are presented and compared to the new technique. Section 3 provides numerical investigations for an example that exhibits the explicit relation between relative degree and computation time for a single-input single-output system. A summary and future directions for multi-input multi-output systems are provided in the final section.

2. PROBLEM FORMULATION AND PROPOSED METHOD OF SOLUTION

2.1 Optimal Control Problem

Consider the nonlinear control system

    ẋ = f(x) + g(x)u,   R ∋ t ↦ x ∈ Rⁿ,   R ∋ t ↦ u ∈ R        (1)

where all vector fields and functions are real-analytic. It is desired to find a trajectory of (1), i.e. [t0, tf] ∋ t ↦ (x, u)(t) ∈ R^{n+1}, that minimizes the performance index

    J(x, u) = φf(x(tf), u(tf)) + φ0(x(t0), u(t0)) + ∫_{t0}^{tf} L(x(t), u(t)) dt,

where L is a nonlinear function, subject to a vector of initial, final, and trajectory constraints

    lb0 ≤ ψ0(x(t0), u(t0)) ≤ ub0,
    lbf ≤ ψf(x(tf), u(tf)) ≤ ubf,                                (2)
    lbt ≤ S(x, u) ≤ ubt,

respectively. For conciseness, we will refer to this optimal control problem as

    min_{(x,u)} J(x, u)
    subject to                                                    (3)
        ẋ = f(x) + g(x)u,
        lb ≤ c(x, u) ≤ ub.

2.2 Solution Technique

2.2.1. Classical collocation. A numerical approach to solving this optimal control problem is to use the direct collocation method outlined in (Hargraves and Paris 1987). The idea behind this approach is to transform the optimal control problem into a nonlinear programming problem. This is accomplished by discretizing time into a grid of N − 1 intervals

    t0 = t1 < t2 < ... < tN = tf                                  (4)

and approximating the state x and the control input u as piecewise polynomials x̂ and û, respectively. Typically a cubic polynomial is chosen for the states and a linear polynomial for the control on each interval. Collocation is then used at the midpoint of each interval to satisfy Eq. (1). Let x̂(x(t1)ᵀ, ..., x(tN)ᵀ) and û(u(t1), ..., u(tN)) denote the approximations to x and u, respectively, depending on (x(t1)ᵀ, ..., x(tN)ᵀ) ∈ R^{nN} and (u(t1), ..., u(tN)) ∈ Rᴺ, corresponding to the values of x and u at the grid points. Then one solves the following finite dimensional approximation of the original control problem (3):

    min_{y ∈ R^M} F(y) = J(x̂(y), û(y))
    subject to
        x̂˙ − f(x̂(y)) − g(x̂(y)) û(y) = 0,                         (5)
        lb ≤ c(x̂(y), û(y)) ≤ ub,
        for every t = (tj + tj+1)/2,   j = 1, ..., N − 1,

where y = (x(t1)ᵀ, u(t1), ..., x(tN)ᵀ, u(tN)) and M = dim y = (n + 1)N.

2.2.2. Inverse dynamic optimization. (Seywald 1994) suggested an improvement to the previous method (see also (Bryson 1999), page 362). Following this work, one first solves a subset of the system dynamics in (3) for the control in terms of combinations of the state and its time derivative. Then one substitutes for the control in the remaining system dynamics and constraints. Next, all the time derivatives ẋᵢ are approximated by the finite difference approximations

    x̄˙(ti) = (x(ti+1) − x(ti)) / (ti+1 − ti)

to get

    p(x̄˙(ti), x(ti)) = 0,
    q(x̄˙(ti), x(ti)) ≤ 0,     i = 0, ..., N − 1.

The optimal control problem is turned into

    min_{y ∈ R^M} F(y)
    subject to                                                    (6)
        p(x̄˙(ti), x(ti)) = 0,
        q(x̄˙(ti), x(ti)) ≤ 0,

where y = (x(t1)ᵀ, ..., x(tN)ᵀ) and M = dim y = nN. As with the Hargraves and Paris method, this parameterization of the optimal control problem (3) can be solved using nonlinear programming. The dimensionality of this discretized problem is lower than that of the Hargraves and Paris method, where both the states and the input are unknowns. This yields a substantial improvement in the numerical implementation.

2.2.3. New Numerical Approach. In fact, it is usually possible to reduce the dimension of the problem further. Given an output, it is generally possible to parameterize the control and a part of the state in terms of this output and its time derivatives. In contrast to the previous approach, one must use more than one derivative of this output for this purpose. When the whole state and the input can be parameterized with one output, one says that the system is flat; see the work of (Fliess et al. 1995, Fliess et al. 1999). When the parameterization is only partial, the dimension of the subspace spanned by the output and its derivatives is given by r, the relative degree of this output.

Definition 1. ((Isidori 1989)). A single-input single-output system

    ẋ = f(x) + g(x)u
    y = h(x)                                                      (7)

is said to have relative degree r at a point x⁰ if Lg Lf^k h(x) = 0 for all x in a neighborhood of x⁰ and all k < r − 1, and Lg Lf^{r−1} h(x⁰) ≠ 0, where Lf h(x) = Σ_{i=1}^{n} (∂h/∂xᵢ) fᵢ(x) is the derivative of h along f. Roughly speaking, r is the number of times one has to differentiate y before u appears.

Result 1. ((Isidori 1989)). Suppose the system (7) has relative degree r at x⁰. Then r ≤ n. Set

    φ1(x) = h(x)
    φ2(x) = Lf h(x)
    ...
    φr(x) = Lf^{r−1} h(x).

If r is strictly less than n, it is always possible to find n − r more functions φ_{r+1}(x), ..., φ_n(x) such that the mapping

    φ(x) = (φ1(x), ..., φ_n(x))ᵀ

has a Jacobian matrix which is nonsingular at x⁰ and therefore qualifies as a local coordinate transformation in a neighborhood of x⁰. The value at x⁰ of these additional functions can be fixed arbitrarily. Moreover, it is always possible to choose φ_{r+1}(x), ..., φ_n(x) in such a way that Lg φᵢ(x) = 0, for all r + 1 ≤ i ≤ n and all x around x⁰.

The implication of this result is that there exists a change of coordinates x ↦ z = (z1, z2, ..., z_n) such that the system equations may be written as

    ż1 = z2
    ż2 = z3
    ...
    ż_{r−1} = z_r
    ż_r = b(z) + a(z)u
    ż_{r+1} = q_{r+1}(z)
    ...
    ż_n = q_n(z)

where a(z) is nonzero for all z in a neighborhood of z⁰ = φ(x⁰). In these new coordinates, any optimal control problem can be solved by a partial collocation, i.e. collocating only (z1, z_{r+1}, ..., z_n) instead of performing a full collocation of (z1, ..., z_r, z_{r+1}, ..., z_n, u). Inverting the change of coordinates, the state and the input (x1, ..., x_n, u) can be expressed in terms of (z1, ..., z1^{(r)}, z_{r+1}, ..., z_n). This means that once translated into these new coordinates, the original control problem (3) will involve r successive derivatives of z1. It is not realistic to use finite difference approximations as soon as r > 2. In this context, it is convenient to represent (z1, z_{r+1}, ..., z_n) as B-splines. B-splines are chosen as basis functions because of the ease of enforcing continuity across knot points and the ease of computing their derivatives. A pictorial representation of such an approximation is given in Figure 1. Doing so, we get

    z1 = Σ_{i=1}^{p1} B_{i,k1}(t) Cᵢ¹
    z_{r+1} = Σ_{i=1}^{p_{r+1}} B_{i,k_{r+1}}(t) Cᵢ^{r+1}
    ...
    z_n = Σ_{i=1}^{p_n} B_{i,k_n}(t) Cᵢⁿ

with pⱼ = lⱼ(kⱼ − mⱼ) + mⱼ, where B_{i,kⱼ}(t) is the B-spline basis function defined in (de Boor 1978) for the output zⱼ with order kⱼ, Cᵢʲ are the coefficients of the B-spline, lⱼ is the number of knot intervals, and mⱼ is the number of smoothness conditions at the knots. The set (z1, z_{r+1}, ..., z_n) is thus represented by M = Σ_{i ∈ {1, r+1, ..., n}} pᵢ coefficients. In general, w collocation points are chosen uniformly over the time interval [t0, tf] (though optimal knot placements or Gaussian points may also be considered). Both dynamics and constraints will be enforced at the collocation points. The problem can be stated in the following nonlinear programming form:

    min_{y ∈ R^M} F(y)
    subject to
        ż_{r+1}(y) − q_{r+1}(z)(y) = 0
        ...                                                       (8)
        ż_n(y) − q_n(z)(y) = 0    at each of the w collocation points,
        lb ≤ c(y) ≤ ub

where y = (C₁¹, ..., C_{p1}¹, C₁^{r+1}, ..., C_{p_{r+1}}^{r+1}, ..., C₁ⁿ, ..., C_{p_n}ⁿ) and M = Σ_{i ∈ {1, r+1, ..., n}} pᵢ. The coefficients of the B-spline basis functions can be found using nonlinear programming.

2.2.4. Comparisons. Our approach is a generalization of inverse dynamic optimization. Let us summarize the different ways we can write the optimal control problem:

• "Full collocation": solving problem (5) by collocating (x, u) = (x1, ..., x_n, u) without any attempt at variable elimination. After collocation, the dimension of the unknowns space is O(n + 1).
• "Inverse dynamic optimization": solving problem (6) by collocating x = (x1, ..., x_n). Here the input is eliminated from the equations using one derivative of the state. After collocation, the dimension of the unknowns space is O(n).
• "Flatness parametrization" (maximal inversion), our approach: solving problem (8) in the new coordinates, collocating only (z1, z_{r+1}, ..., z_n). Here we eliminate as many variables as possible and replace them using the first r derivatives of z1. After collocation, the dimension of the unknowns space is O(n − r + 1).

3. NUMERICAL INVESTIGATIONS

3.1 The software package NTG

The software package called Nonlinear Trajectory Generation (NTG) is designed to solve optimal control problems using the new approach presented in this paper, see (Milam et al. 2000).
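As a rough illustration of this representation (not NTG's actual API; the names and parameter values below are illustrative), the knot construction and the coefficient count pⱼ = lⱼ(kⱼ − mⱼ) + mⱼ can be sketched with scipy's B-splines:

```python
import numpy as np
from scipy.interpolate import BSpline

def output_spline_knots(t0, tf, l, k, m):
    """Knot vector for an order-k (degree k-1) B-spline on l knot intervals
    with m smoothness conditions at each interior knot point."""
    breaks = np.linspace(t0, tf, l + 1)
    return np.concatenate([np.repeat(breaks[0], k),        # full multiplicity at t0
                           np.repeat(breaks[1:-1], k - m), # interior knots
                           np.repeat(breaks[-1], k)])      # full multiplicity at tf

l, k, m = 5, 6, 4                      # 5 intervals, order 6, C^3 across knots
knots = output_spline_knots(0.0, 1.0, l, k, m)
p = len(knots) - k                     # number of coefficients: l*(k - m) + m
coeffs = np.linspace(0.0, np.pi, p)    # stand-in for the unknowns y of problem (8)

z1 = BSpline(knots, coeffs, k - 1)     # scipy's BSpline takes the degree, k - 1
z1_dot = z1.derivative()               # derivatives are cheap, cf. (de Boor 1978)
tcol = np.linspace(0.0, 1.0, 21)       # w = 21 uniform collocation points
vals, dvals = z1(tcol), z1_dot(tcol)
```

With order kⱼ = 6 and mⱼ = 4 smoothness conditions, the spline is C³ across knot points and each output is determined by p = 5·(6 − 4) + 4 = 14 coefficients.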

Fig. 1. Spline representation of a variable zⱼ(t) on [t0, tf]: a degree kⱼ − 1 polynomial between knot points, mⱼ smoothness conditions at the knot points, and collocation points at which dynamics and constraints are enforced.

NTG has been developed using the C programming language. The sequential quadratic programming package NPSOL by (Gill et al. n.d.) is used as the nonlinear programming solver in NTG. When specifying a problem to NTG, the user is required to state the problem in terms of some choice of outputs and their derivatives. The user is also required to specify the regularity of the variables, the placement of the knot points, the order and regularity of the B-splines, and the collocation points for each output.
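NTG itself is not reproduced here, but the core idea of Section 2 — optimize over output coefficients and recover the state and input by inversion — can be mimicked on a toy double integrator with scipy's SLSQP solver (an SQP method, as NPSOL is). This is only a sketch, with a plain monomial basis standing in for B-splines; the flat output is y = x, so u = ÿ:

```python
import numpy as np
from scipy.optimize import minimize

# Double integrator xdd = u with flat output y = x: steer (0, 0) -> (1, 0)
# on [0, 1] while minimizing the integral of u^2.  The analytic optimum is
# y = 3t^2 - 2t^3, u = 6 - 12t, with cost 12.
N, deg = 31, 7
tt = np.linspace(0.0, 1.0, N)          # collocation points
k = np.arange(deg + 1)

def basis_row(tau, d):
    """d-th derivative of the monomial basis [1, t, ..., t^deg] at time tau."""
    coef = np.ones(deg + 1)
    for j in range(d):
        coef *= np.clip(k - j, 0, None)
    return coef * tau ** np.clip(k - d, 0, None)

A2 = np.vstack([basis_row(t, 2) for t in tt])   # u(t_i) = ydd(t_i) by inversion

def cost(c):                                    # trapezoidal rule for int u^2 dt
    u = A2 @ c
    h = tt[1] - tt[0]
    return h * (0.5 * u[0]**2 + np.sum(u[1:-1]**2) + 0.5 * u[-1]**2)

bc = {'type': 'eq', 'fun': lambda c: np.array([
    basis_row(0.0, 0) @ c,         basis_row(0.0, 1) @ c,
    basis_row(1.0, 0) @ c - 1.0,   basis_row(1.0, 1) @ c])}

res = minimize(cost, np.zeros(deg + 1), method='SLSQP', constraints=[bc])
print(res.success, res.fun)   # cost close to the analytic value 12
```

Only the 8 output coefficients are unknowns; the state x2 = ẏ and the input u = ÿ are recovered by differentiation, exactly as in problem (8).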

3.2 Example

The example presented quantitatively illustrates the implications of transforming the optimization problem to the lowest dimensional space possible. The system under consideration is an academic example, without any particular physical meaning, chosen to contain various nonlinearities. Its triangular form is chosen, without loss of generality, for simplicity of presentation. By considering different outputs with increasing relative degrees, we will write different formulations of the optimal control problem, starting from full collocation and ending with the flatness parameterization. Finally, runs done with these different approaches will be compared.

3.2.1. Presentation of the system. We consider the following fifth order single input dynamics

    ẋ1 = 5x2
    ẋ2 = sin x1 + x2² + 5x3
    ẋ3 = −x1 x2 + x3 + 5x4
    ẋ4 = x1 x2 x3 + x2 x3 + x4 + 5x5
    ẋ5 = −x5 + u

and the following optimal control problem: find [0, 1] ∋ t ↦ (x, u)(t) that minimizes

    J = ∫₀¹ ( x1²(s) + (1/100) u²(s) ) ds                        (9)

subject to the constraints

    (x1, x2, x3, x4, x5, u)(0) = (0, 0, 0, 0, 0, 0)
    (x1, x2, x3, x4, x5, u)(1) = (π, 0, 0, 0, 0, 0)
    ∀i ∈ {1, 2, 3, 4, 5}, |xᵢ| ≤ 100,  |u| ≤ 100.

To solve this problem by collocation, it is possible to use the three different approaches presented in the previous section:

• "Full collocation". One must consider (x1, x2, x3, x4, x5, u) as unknowns.
• "Inverse dynamic optimization". For this example, we can solve for u from the last equation of the dynamics:

    u = ẋ5 + x5.

Thus the whole set of system variables is parameterized by (x1, x2, x3, x4, x5).
• "Flatness parameterization" (maximal inversion). Consider the variable x4. Two differentiations give

    ẋ4 = x1 x2 x3 + x2 x3 + x4 + 5x5
    ẍ4 = 5x2² x3 + (x1 + 1)(sin x1 + x2² + 5x3) x3                (10)
         + (x1 + 1) x2 (−x1 x2 + x3 + 5x4)
         + x1 x2 x3 + x2 x3 + x4 + 5x5 − 5x5 + 5u.

The system with x4 as the output has relative degree 2. By Result 1, it is possible to parameterize the whole system by x4 and 3 more variables; here we can choose (x1, x2, x3, x4) for this parameterization. It is easy to check that when x3 is the output the system has relative degree 3, and the whole system can be parameterized by (x1, x2, x3). Similarly, when x2 is chosen as the output, the system has relative degree 4 and it is possible to parameterize all its variables by (x1, x2). Finally, the system with x1 as output has relative degree 5. The system is flat, i.e. it is possible to parameterize all its variables by x1 only. In the optimal control problem one can replace x2, x3, x4, x5 and u by combinations of x1, ẋ1, ẍ1, x1⁽³⁾, x1⁽⁴⁾, x1⁽⁵⁾. As a summary, the different formulations of the optimal control problem are represented in Figure 2.
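The relative degrees claimed above can be checked mechanically. The following sketch (using sympy; symbolic tools are not part of the paper's toolchain) computes Lie derivatives of each candidate output until the input direction appears:

```python
import sympy as sp

x1, x2, x3, x4, x5 = sp.symbols('x1 x2 x3 x4 x5')
X = [x1, x2, x3, x4, x5]
f = [5*x2,                              # drift vector field of the example
     sp.sin(x1) + x2**2 + 5*x3,
     -x1*x2 + x3 + 5*x4,
     x1*x2*x3 + x2*x3 + x4 + 5*x5,
     -x5]
g = [0, 0, 0, 0, 1]                     # input vector field (u enters via x5)

def lie(h, v):
    # Lie derivative of the scalar h along the vector field v
    return sum(sp.diff(h, X[i]) * v[i] for i in range(5))

def relative_degree(h):
    # differentiate the output until Lg Lf^{r-1} h is nonzero
    r, Lfh = 1, h
    while sp.simplify(lie(Lfh, g)) == 0:
        Lfh = lie(Lfh, f)
        r += 1
    return r

print([relative_degree(h) for h in X])  # [5, 4, 3, 2, 1]
```

This reproduces the table of Figure 2: output x1 gives a flat (relative degree 5) parameterization, x4 gives relative degree 2, and so on.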

    Choice of output | Relative degree | Variables for complete parameterization | Differential equations to be satisfied
    u                | 0               | (x1, x2, x3, x4, x5, u)                 | 5
    x5               | 1               | (x1, x2, x3, x4, x5)                    | 4
    x4               | 2               | (x1, x2, x3, x4)                        | 3
    x3               | 3               | (x1, x2, x3)                            | 2
    x2               | 4               | (x1, x2)                                | 1
    x1               | 5               | (x1)                                    | 0

Fig. 2. The different formulations of the optimal control problem for the example. Top: full collocation. Bottom: flatness parametrization.

3.2.2. Experiments and results. All these formulations of the optimal control problem were coded in C, and symbolic gradients were provided. These problems were solved using the NTG package presented above. Every solution was double checked by a numerical integration of the dynamics of the system. We only considered as valid the solutions providing the following absolute precision: 3.10 ≤ x1(1) ≤ 3.20. This requirement, combined with a search for speed of execution, guided the choices of the NTG parameters. To perform comparisons as fair as possible, we also give in each case the convergence rate, i.e. the percentage of runs ending with an optimal solution satisfying the constraints. This is to show that the use of inversion does not induce any particular degradation in terms of numerical sensitivity. In each case we did 200 runs with random initial conditions. All tests were conducted on a PC under Linux (Red Hat 6.2) with a Pentium III 733 MHz processor. The results detailed in Figure 3 show that the cpu-time is exponentially decreasing with respect to the relative degree. The slowest problem to solve is the fully collocated one and the fastest is the "fully inverted" (flat) one.

Interpretation. NTG relies on NPSOL, the Fortran nonlinear programming package developed by (Gill et al. n.d.). NPSOL solves nonlinear programming problems using a sequential quadratic programming (SQP) algorithm, involving major and minor iterations. At each major iteration, a new quadratic programming (QP) problem is defined that approximates both the nonlinear cost function and the nonlinear constraints. This QP problem is solved during the minor iterations. The overall cpu-time required is closely related to the sum of all the minor iterations. Inspecting the runs, we concluded that the successive QP subproblems are generally well conditioned in all cases.

Fig. 4. log(Number of variables) versus log(cpu-time) for the three approaches (full collocation, inverse dynamic optimization, and flatness parametrization). In each case 200 runs were done with random initial guesses; the variance of the results is represented by error bars. The slope of the linear regression of the mean values of cpu-time is 2.80 (regression line y = 2.80x − 8.51).
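As a cross-check of the reported regression, fitting a line to the log-log mean values of Figure 3 with numpy recovers the slope quoted in Figure 4:

```python
import numpy as np

# Mean values from Fig. 3: number of variables after collocation vs cpu-time (s)
nvars = np.array([90.0, 76.0, 65.0, 43.0, 29.0, 14.0])
cpu   = np.array([70.0, 52.2, 25.7,  5.1,  1.7,  0.50])

slope, intercept = np.polyfit(np.log(nvars), np.log(cpu), 1)
print(round(slope, 2))  # 2.8: cpu-time grows roughly as the cube of problem size
```

The slope near 3 is consistent with the cubic growth of QP cost in the number of variables discussed below.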

In this example, each variable is represented by approximately 15 coefficients. Therefore the number of variables is a decreasing function of the relative degree, see Figure 3. It is known, see (Gill et al. 1994), that the cost of solving a well-conditioned QP problem grows as a cubic function of the number of variables. In Figure 4 one can see that this is a good explanation of the differences in experimental cpu-time, the slope of the linear regression of the mean values of cpu-time versus the number of variables being 2.80.

4. CONCLUSION AND PERSPECTIVES

In this paper we explained the advantage of inversion in constrained trajectory optimization. The chosen example illustrates a general methodology. The role of the relative degree was clearly visible for the fifth order single input nonlinear dynamics considered here, both from a theoretical and a computational point of view. This problem is quite an extreme situation, and is not the most frequent case in engineering. In mechanical problems, for instance, we are more likely to meet systems with two or three degrees of freedom. In the case of fully actuated systems, it is known that these systems are flat, i.e. full inversion is possible and given by the Euler-Lagrange equations

    (d/dt) ∂L(q, q̇)/∂q̇ − ∂L(q, q̇)/∂q = u.
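For a concrete one-degree-of-freedom instance (a torque-actuated pendulum, chosen here purely for illustration), this Euler-Lagrange inversion can be carried out symbolically:

```python
import sympy as sp

t = sp.symbols('t')
q = sp.Function('q')(t)
m, l, g = sp.symbols('m l g', positive=True)

# Lagrangian of a torque-actuated pendulum: kinetic minus potential energy
Lag = sp.Rational(1, 2) * m * l**2 * sp.diff(q, t)**2 + m * g * l * sp.cos(q)

# u = d/dt (dLag/dqdot) - dLag/dq: full inversion from the output q to u
u = sp.diff(sp.diff(Lag, sp.diff(q, t)), t) - sp.diff(Lag, q)
# u equals m*l**2*q'' + m*g*l*sin(q): an explicit function of q and q''
print(sp.simplify(u))
```

The input is an explicit function of the configuration and its derivatives, so the whole trajectory optimization can be posed in q alone.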

    Relative degree                         | 0    | 1    | 2    | 3    | 4    | 5
    Cpu-time (s, average)                   | 70.0 | 52.2 | 25.7 | 5.1  | 1.7  | 0.50
    Number of variables (after collocation) | 90   | 76   | 65   | 43   | 29   | 14
    Rate of convergence (%)                 | 79.5 | 92.5 | 85.0 | 99.5 | 91.5 | 92.0

Fig. 3. Main results. The cpu-time is an exponentially decreasing function of the relative degree. Top: full collocation. Bottom: flatness parametrization.

In the underactuated case, things are more complicated. Consider a simplified model of an Uninhabited Combat Air Vehicle (UCAV), commonly referred to as the planar ducted fan:

    m ẍ cos θ − (m z̈ − mg) sin θ = FXb
    m ẍ sin θ + (m z̈ − mg) cos θ = FZb
    (J/r) θ̈ = FZb

where the 6-dimensional state is (x, ẋ, z, ż, θ, θ̇) and the inputs are FXb and FZb. With (x, z) as output, the system has relative degree (2, 2) because

    ẍ = (1/m)(FXb cos θ + FZb sin θ)
    z̈ = g + (1/m)(−FXb sin θ + FZb cos θ)

and the Jacobian J_{FXb,FZb} = 1/m² ≠ 0, i.e. two independent combinations of the inputs appear in the second derivatives of the outputs. Following our methodology it is possible to parameterize the whole system by (x, z) and 6 − 2 − 2 = 2 other quantities, namely θ, θ̇. This is a straightforward extension of the approach presented here. Any optimal control problem for this system can be solved by a partial collocation, i.e. collocating only (x, z, θ, θ̇). The new formulation involves first and second derivatives of x and z because the relative degree is (2, 2). Further, this system is flat, as shown in (Martin et al. 1996). Choosing z1 = x + (J/(rm)) cos θ and z2 = z − (J/(rm)) sin θ gives a full parameterization of all the variables of the system. Yet this parameterization involves (z1, ż1, ..., z1⁽⁴⁾, z2, ż2, ..., z2⁽⁴⁾). The notion of relative degree is not defined for this example, though the total number of derivatives required, 4 + 4 = 8, which exceeds the dimension of the state, can be deduced from another argument, namely the dynamic extension algorithm. For a general multi-input multi-output system, the question of the number of derivatives required is still open. We are currently working on this multi-input multi-output question from a theoretical point of view and investigating the computational consequences of inversion in this case. It is possible that the best parameterization may not be the flat one; the parameterization is highly dependent on the constraints imposed on the problem. These issues are of interest in our next area of application, real-time trajectory generation for the Caltech ducted fan experiment, see (Milam and Murray 1999).
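As a sanity check on the ducted fan discussion, a short symbolic computation (sympy; the substitution θ̈ = (r/J)FZb follows the model above) confirms that FZb cancels in the second derivatives of z1 and z2, which is why this flat parameterization requires derivatives beyond the second:

```python
import sympy as sp

m, J, r, g = sp.symbols('m J r g', positive=True)
th, thd = sp.symbols('theta thetadot')
FX, FZ = sp.symbols('F_Xb F_Zb')

# Accelerations solved from the ducted fan model in the text
xdd  = (FX*sp.cos(th) + FZ*sp.sin(th)) / m
zdd  = g + (-FX*sp.sin(th) + FZ*sp.cos(th)) / m
thdd = r*FZ/J                               # from (J/r) theta'' = FZb

# Second derivatives of the flat outputs z1 = x + (J/rm) cos(theta),
# z2 = z - (J/rm) sin(theta), obtained by the chain rule
z1dd = xdd + (J/(r*m)) * (-thdd*sp.sin(th) - thd**2*sp.cos(th))
z2dd = zdd - (J/(r*m)) * (thdd*sp.cos(th) - thd**2*sp.sin(th))

# FZb drops out of both: neither input-output pair stops at two derivatives
print(sp.simplify(sp.diff(z1dd, FZ)), sp.simplify(sp.diff(z2dd, FZ)))  # 0 0
```

Only FXb (and θ, θ̇) enters z̈1 and z̈2, so further differentiation is needed before FZb appears, consistent with the count of 4 + 4 = 8 derivatives.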

5. REFERENCES

Bryson, A. E. (1999). Dynamic Optimization. Addison Wesley.
de Boor, C. (1978). A Practical Guide to Splines. Springer-Verlag.
Fliess, M., J. Lévine, P. Martin and P. Rouchon (1995). Flatness and defect of non-linear systems: introductory theory and examples. International Journal of Control 61(6), 1327–1360.
Fliess, M., J. Lévine, P. Martin and P. Rouchon (1999). A Lie-Bäcklund approach to equivalence and flatness of nonlinear systems. IEEE Trans. Auto. Cont. 44(5), 928–937.
Gill, P. E., W. Murray and M. A. Saunders (1994). Large-scale SQP Methods and their Application in Trajectory Optimization. Chap. 1. Birkhäuser Verlag.
Gill, P. E., W. Murray, M. A. Saunders and M. Wright (n.d.). User's Guide for NPSOL 5.0: A Fortran Package for Nonlinear Programming. Systems Optimization Laboratory, Stanford University, Stanford, CA 94305.
Hargraves, C. and S. Paris (1987). Direct trajectory optimization using nonlinear programming and collocation. AIAA J. Guidance and Control 10, 338–342.
Isidori, A. (1989). Nonlinear Control Systems. Springer-Verlag.
Martin, P., S. Devasia and B. Paden (1996). A different look at output tracking: control of a VTOL aircraft. Automatica 32(1), 101–107.
Milam, M. B. and R. M. Murray (1999). A testbed for nonlinear control techniques: The Caltech Ducted Fan. In: 1999 IEEE Conference on Control Applications.
Milam, M. B., K. Mushambi and R. M. Murray (2000). A new computational approach to real-time trajectory generation for constrained mechanical systems. In: IEEE Conference on Decision and Control.
Seywald, H. (1994). Trajectory optimization based on differential inclusion. J. Guidance, Control and Dynamics 17(3), 480–487.
Sweetman, B. (1997). Fighters without pilots. Popular Science, pp. 97–101.
TechSat 21 (n.d.). http://www.vs.afrl.af.mil/vsd/techsat21/
