Cite published version: Lantoine, G., Russell, R. P., "A Hybrid Differential Dynamic Programming Algorithm for Constrained Optimal Control Problems, Part 2: Application," Journal of Optimization Theory and Applications, Vol. 154, No. 2, 2012, pp. 418-442, DOI: 10.1007/s10957-012-0038-1

A Hybrid Differential Dynamic Programming Algorithm for Constrained Optimal Control Problems. Part 2: Application*

Gregory Lantoine† and Ryan P. Russell‡

Abstract: In the first part of this paper series, a new solver, called HDDP, is presented for solving constrained, nonlinear optimal control problems. In the present paper, the algorithm is extended to include practical safeguards to enhance robustness, and four illustrative examples are used to evaluate the main algorithm and some variants. The experiments involve both academic and applied problems to show that HDDP is capable of solving a wide class of constrained, nonlinear optimization problems. First, the algorithm is verified to converge in a single iteration on a simple multi-phase quadratic problem with trivial dynamics. Successively more complicated constrained optimal control problems are then solved, demonstrating robust solutions to problems with as many as 7 states, 25 phases, 258 stages, 458 constraints, and 924 total control variables. The competitiveness of HDDP with respect to general-purpose, state-of-the-art NLP solvers is also demonstrated.

Key Words: Optimal control problems, differential dynamic programming, nonlinear large-scale problem

AMS Classification: 49L20 - Dynamic programming method

* Acknowledgements: This work was partially supported by Thales Alenia Space, and the authors thank Thierry Dargent for support and collaborations.
† Corresponding author, PhD Candidate, Georgia Institute of Technology, School of Aerospace Engineering, 270 Ferst Dr., Atlanta, Georgia, 30318, USA, [email protected].
‡ Assistant Professor, The University of Texas at Austin, Department of Aerospace Engineering and Engineering Mechanics, 1 University Station C0600, Austin, TX 78712-0235, USA, [email protected].
1 Introduction
Constrained, nonlinear optimal control problems are a major subject of interest and are useful in many fields [1]. The first part of this paper series presents the theoretical foundation of a new algorithm, called HDDP, developed to solve this type of problem. HDDP is a variant of the classical Differential Dynamic Programming technique [2] and relies on successive quadratic expansions of the cost, propagation and constraint functions. HDDP includes several standard nonlinear programming techniques (augmented Lagrangian, trust region, active set) to facilitate the inclusion of constraints in the formulation and to increase robustness. In addition, HDDP is based on a state transition matrix formulation which allows for a decoupling of the dynamics from the optimization, as opposed to other modern DDP variants [3, 4].
The purpose of this second part of the series is to numerically evaluate HDDP by solving a variety of optimal control problems. In addition, several algorithmic extensions of HDDP are introduced and some conclusions on their respective merits are drawn. Comparisons are reported with SNOPT [5] and IPOPT [6], two popular, state-of-the-art, general-purpose solvers.
The paper is organized as follows. Preliminaries are given first, where the problem formulation and an overview of the test cases are discussed. Secondly, some practical algorithmic extensions of HDDP, including several heuristics for safeguarding and improving the overall method, are provided. Then, the implementation of HDDP is validated using a simple linear quadratic test problem. In addition, an Earth-Mars rendezvous transfer problem is investigated to confirm the relationship between HDDP and the Pontryagin minimum principle (see section 4 of Part 1). The three algorithms are also compared in this example. Next, SNOPT and HDDP are used to solve a multi-revolution orbital transfer, and the scalability of both algorithms is studied as the problem size increases. Finally, beyond the linear quadratic example, the multi-phase capability of HDDP is tested on a complex multi-asteroid tour problem.
2 Preliminaries

2.1 Problem Formulation
In this two-part paper series, we consider multi-phase problems of the following generic form. Given a set of $M$ phases divided into $N_i$ stages per phase, minimize the objective function

$$J = \sum_{i=1}^{M} \left[ \sum_{j=1}^{N_i} L_{i,j}(x_{i,j}, u_{i,j}, w_i) + \varphi_i(x_{i,N_i+1}, w_i, x_{i+1,1}, w_{i+1}) \right], \qquad (1)$$

with respect to $u_{i,j}$ and $w_i$ for $i = 1\ldots M$, $j = 1\ldots N_i$, subject to the dynamical equations

$$x_{i,1} = \Gamma_i(w_i), \qquad (2)$$

$$x_{i,j+1} = F_{i,j}(x_{i,j}, u_{i,j}, w_i), \qquad (3)$$

the stage constraints

$$g_{i,j}(x_{i,j}, u_{i,j}, w_i) \le 0, \qquad (4)$$

the phase constraints

$$\psi_i(x_{i,N_i+1}, w_i, x_{i+1,1}, w_{i+1}) = 0, \qquad (5)$$

and the control bounds

$$u^L_{i,j} \le u_{i,j} \le u^U_{i,j}, \qquad w^L_i \le w_i \le w^U_i, \qquad (6)$$
where $N_i$ is the number of stages of the $i$th phase, $x_{i,j} \in \mathbb{R}^{n_{x,i}}$ are the states of dimension $n_{x,i}$ at phase $i$ and stage $j$, $u_{i,j} \in \mathbb{R}^{n_{u,i}}$ are dynamic controls of dimension $n_{u,i}$ at phase $i$ and stage $j$, $w_i \in \mathbb{R}^{n_{w,i}}$ are static controls (or parameters) of dimension $n_{w,i}$ associated with phase $i$, $\Gamma_i : \mathbb{R}^{n_{w,i}} \to \mathbb{R}^{n_{x,i}}$ are the initial functions of each phase, $F_{i,j} : \mathbb{R}^{n_{x,i}} \times \mathbb{R}^{n_{u,i}} \times \mathbb{R}^{n_{w,i}} \to \mathbb{R}^{n_{x,i}}$ are the transition functions that propagate the states across each stage, $L_{i,j} : \mathbb{R}^{n_{x,i}} \times \mathbb{R}^{n_{u,i}} \times \mathbb{R}^{n_{w,i}} \to \mathbb{R}$ are the stage cost functions, $\varphi_i : \mathbb{R}^{n_{x,i}} \times \mathbb{R}^{n_{w,i}} \times \mathbb{R}^{n_{x,i+1}} \times \mathbb{R}^{n_{w,i+1}} \to \mathbb{R}$ are the phase cost functions, $g_{i,j} : \mathbb{R}^{n_{x,i}} \times \mathbb{R}^{n_{u,i}} \times \mathbb{R}^{n_{w,i}} \to \mathbb{R}^{n_{g,i}}$ are the stage constraints, and $\psi_i : \mathbb{R}^{n_{x,i}} \times \mathbb{R}^{n_{w,i}} \times \mathbb{R}^{n_{x,i+1}} \times \mathbb{R}^{n_{w,i+1}} \to \mathbb{R}^{n_{\psi,i}}$ are the (boundary) phase constraints. Note that problems with general inequality phase constraints $\psi_i(x_{i,N_i+1}, w_i, x_{i+1,1}, w_{i+1}) \le 0$ can be reformulated in the above form by introducing slack variables. By convention, $i + 1 = 1$ for $i = M$. We suppose that all the functions are at least twice continuously differentiable, and that their first- and second-order derivatives are available (and possibly expensive to evaluate).
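To make the indexing of (1)-(6) concrete, the following minimal Python sketch rolls the states forward through the transition functions and accumulates the objective and the phase-constraint residuals. The `Phase` container and the function handles are illustrative assumptions on our part, not the HDDP data structures.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Phase:
    """Illustrative container for one phase of the generic problem (1)-(6)."""
    N: int                  # number of stages N_i
    Gamma: Callable         # initial function Gamma_i(w_i)
    F: List[Callable]       # transition functions F_{i,j}(x, u, w), one per stage
    L: List[Callable]       # stage cost functions L_{i,j}(x, u, w), one per stage
    phi: Callable           # phase cost phi_i(x_f, w_i, x_next, w_next)
    psi: Callable           # phase constraints psi_i(x_f, w_i, x_next, w_next)

def evaluate(phases: List[Phase], u, w):
    """Roll out the dynamics (2)-(3) and accumulate the objective (1) and the
    phase-constraint residuals (5). u[i][j] and w[i] hold the control variables."""
    M = len(phases)
    J, x_finals, psi_res = 0.0, [], []
    for i, ph in enumerate(phases):
        x = ph.Gamma(w[i])                      # x_{i,1} = Gamma_i(w_i)
        for j in range(ph.N):
            J += ph.L[j](x, u[i][j], w[i])      # stage cost L_{i,j}
            x = ph.F[j](x, u[i][j], w[i])       # x_{i,j+1} = F_{i,j}(x_{i,j}, u_{i,j}, w_i)
        x_finals.append(x)                      # x_{i,N_i+1}
    for i, ph in enumerate(phases):
        k = (i + 1) % M                         # convention: i + 1 = 1 when i = M
        x_next = phases[k].Gamma(w[k])          # x_{i+1,1} from (2)
        J += ph.phi(x_finals[i], w[i], x_next, w[k])
        psi_res.append(ph.psi(x_finals[i], w[i], x_next, w[k]))
    return J, psi_res
```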
Note that, contrary to HDDP, the optimal control problem formulation of (1) cannot be solved directly with the SNOPT and IPOPT solvers. An additional, cumbersome step is necessary to convert the problem of (1) into a more generic NLP problem and to construct the first-order Jacobian (and possibly the second-order Hessian) of the total objective and of the constraints with respect to all the control variables of the problem. In HDDP, on the other hand, these functional derivatives are treated internally to yield the descent direction at each stage (see Part 1), so no further step is needed. An interface has therefore been developed to greatly facilitate the use of the NLP solvers for the multi-phase optimal control problems we consider. An outline of the implementation of SNOPT and IPOPT to solve the problem formulation of (1) is given in [7] and is beyond the scope of this paper.
2.2 Overview of the Test Cases
As summarized in Table 1, this paper includes numerical test problems of various sizes and difficulty, representing both academic and real-world examples. This diversity is important for a valid assessment of the robustness, performance and capabilities of the different algorithms. Note that all examples (except the linear quadratic problem) use spacecraft dynamics.
Table 1: Characteristics of the test problems.

| Test Problem | # of variables | # of constraints | Number of phases M | Number of stages |
|---|---|---|---|---|
| Linear Quadratic | 36 | 12 | 2 | 10 |
| Earth-Mars | 120 | 46 | 1 | 40 |
| Multi-Revolution | 120 - 900 | 42 - 302 | 1 | 20 - 300 |
| GTOC4 | 924 | 458 | 25 | 258 |
All numerical tests are performed on an Intel Core 2 Duo (2.4 GHz) workstation under Windows XP 32-bit, using the default runtime options of the Intel Visual Fortran compiler (v.11.0.066). In addition, it is not practical to consider for each problem all possible option combinations of HDDP, SNOPT, and IPOPT. Therefore, the following specific settings are used for each solver:
• HDDP (see the first part of the paper series for the definition of each constant): $\epsilon_{opt} = 10^{-7}$, $\epsilon_{feas} = 10^{-5}$, $\Delta_0 = 0.01$, $\sigma_0 = 0.001$, $\kappa = 0.25$, $\epsilon_1 = 0.01$, $k_\sigma = 1.1$, $\epsilon_{SVD} = 10^{-8}$.
• SNOPT: step limit = $10^{-3}$, major feasibility tolerance = $10^{-5}$, major optimality tolerance = $10^{-7}$.
• IPOPT: tol = $10^{-5}$, nlp_scaling_method = none, mu_strategy = adaptive, linear_solver = mumps. If exact second-order derivatives are used, hessian_approximation = exact and step limit = 0.01^a; otherwise hessian_approximation = limited-memory and step limit = $10^{-3}$.

^a This option does not exist in the original IPOPT package, which could bias the results, so we modified the source code to be able to specify the step between two successive iterates. In addition, larger steps are allowed when exact second-order derivatives are used, since the estimated optimal step is expected to be more accurate in that case.
3 Algorithmic Improvements and Options within HDDP
A complete algorithm requires many features to achieve robustness and efficiency. The first part of the paper series presented the theoretical aspects and the main techniques at the core of HDDP. Below, we mention possible practical, additional approaches to enhance the computational robustness and efficiency of HDDP.
3.1 Safeguarding
We recall that at each iteration, the next control iterates are found by applying the following control laws:
$$\delta u_k = A_k + B_k\,\delta x_k + C_k\,\delta w + D_k\,\delta\lambda, \qquad (7a)$$

$$\delta\lambda^+ = A_{\lambda^+} + C_{\lambda^+}\,\delta w^+, \qquad (7b)$$

$$\delta w^+ = A_{w^+} + B_{w^+}\,\delta x^- + C_{w^+}\,\delta w^- + D_{w^+}\,\delta\lambda^-. \qquad (7c)$$
As described in Section 3.3.1 of Part 1, a trust region method is used to restrict the step of the control iterates to ensure that the second-order expansions stay valid. However, only the non-feedback terms $A_k$, $A_{\lambda^+}$, and $A_{w^+}$ of the different control laws are affected by the trust region procedure. The other feedback terms are not rigorously restricted since they depend on the current state and parameter deviations, which are not known a priori in the backward sweep. One way to better control the step length is to take the deviations of the previous forward sweep iterations and use them as a guess to estimate the magnitudes of the feedback terms.
The coefficient matrix terms are then truncated if their associated predicted steps are greater than some fraction of the non-feedback terms. For instance, the feedback matrix $B_k$ in (7a) is reset as:

$$B_k \leftarrow \frac{\eta_1 \|A_k\|}{\max\left(\eta_1 \|A_k\|,\ \|B_k\,\delta x_{prev}\|\right)}\, B_k, \qquad (8)$$
where η1 is a parameter set by the user (η1 = 10 in our implementation). In addition, to prevent occasional divergence, we zero the feedback matrices that have become suspiciously large.
Finally, for the forward run, we have found that step 4 of the algorithm (see Section 3.4.4 of Part 1) for computing the successor nominal policy must be modified. The modification is motivated by the fact that the successor policies, after accounting for all feedback terms, may 'step outside' the quadratic region despite the safeguarding heuristics of the backward sweep. To overcome this problem, the new iterate during the forward sweep is set according to an extra safeguarding rule:

$$\delta u_k \leftarrow \frac{\eta_2 \Delta}{\max\left(\eta_2 \Delta,\ \|\delta u_k\|\right)}\, \delta u_k, \qquad (9)$$
where η2 is another parameter set by the user. Note that adjusting the forward run in a way that violates the feedback law from the backward sweep can lead to discrepancies in the expected and actual reduction. It follows that this strategy must be used as an extreme safeguard, so a high value of η2 is recommended (η2 = 1000 in our implementation) to avoid this case as much as possible. Nevertheless, these safeguarding techniques are included by default in the standard HDDP algorithm to ensure appropriate robustness.
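The two safeguards amount to simple rescalings. The hedged Python sketch below mirrors (8) and (9) with the η values quoted in the text; the function and variable names are ours, and this is a sketch rather than the actual HDDP implementation.

```python
import numpy as np

def truncate_feedback(B_k, A_k, dx_prev, eta1=10.0):
    """Safeguard (8): shrink the feedback matrix B_k if its predicted step
    (based on the previous forward-sweep deviation dx_prev) exceeds a
    multiple eta1 of the non-feedback step A_k."""
    scale = eta1 * np.linalg.norm(A_k)
    predicted = np.linalg.norm(B_k @ dx_prev)
    return scale / max(scale, predicted) * B_k

def clip_forward_step(du_k, trust_radius, eta2=1000.0):
    """Safeguard (9): during the forward sweep, clip the full control update
    (feedforward plus feedback) to at most eta2 times the trust-region radius."""
    bound = eta2 * trust_radius
    return bound / max(bound, np.linalg.norm(du_k)) * du_k
```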
3.2 Trust Region Scaling
We recall that one HDDP iteration requires solving a succession of quadratic subproblems. As noted in Section 3.3.1 of Part 1, each subproblem is formulated as an elliptical trust region problem:

$$\min_{\delta u_k}\ J_{u,k}\,\delta u_k + \frac{1}{2}\,\delta u_k^T J_{uu,k}\,\delta u_k, \quad \text{such that } \|D\,\delta u_k\| \le \Delta, \qquad (10)$$
where $\Delta$ is the current trust region radius, and $D$ is a positive definite diagonal scaling matrix that determines the geometrical shape of the trust region. From our experience, we find that the overall optimization algorithm usually behaves differently for different scaling matrices, particularly if the terms in the matrices differ by orders of magnitude. Unfortunately, for nonlinear optimization problems, it is not clear how to determine the scaling matrix $D$ to obtain good efficiency and robustness (primarily because scaling is more of an art than a science). In fact, the sensitivities of the cost-to-go function with respect to changes in the variables might vary drastically from one iteration to another due to nonlinearities. In our implementation, we offer two alternatives.
The first possibility is to fix the scaling matrix for all iterations. The matrix is set by the user at the start of the optimization (identity matrix by default). The scaling matrix can be determined at the beginning by independently estimating the quadratic region of validity for each variable. To that end, we find the maximum change of each variable that keeps the predicted reduction coinciding with the actual reduction of the nonlinear cost function. This strategy is the default algorithmic option in HDDP.
The second alternative is to reset the scaling matrix at each iteration so that the eigenvalues of the scaled Hessian $D^{-1} J_{uu} D^{-T}$ have a more balanced distribution, which can be viewed as a means of preconditioning the subproblem. In our implementation, we use a simple diagonal Hessian preconditioning:

$$D_{ii} = \sqrt{\max\left(\left|J_{uu}^{ii}\right|,\ \epsilon\right)}, \qquad (11)$$

where $\epsilon$ is the relative machine precision. In our experience, while this strategy works well in some cases, fixing the scaling matrix at the beginning seems to be more robust, especially if the user has some knowledge about the sensitivities of the problem.
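As an illustration of this second alternative, a minimal sketch of the diagonal scaling (11) is given below; `eps` stands for the relative machine precision and the function name is our own.

```python
import numpy as np

def diagonal_trust_region_scaling(Juu):
    """Diagonal preconditioning (11): D_ii = sqrt(max(|Juu_ii|, eps)), so that
    the scaled Hessian inv(D) @ Juu @ inv(D).T has a more balanced spectrum."""
    eps = np.finfo(float).eps
    d = np.sqrt(np.maximum(np.abs(np.diag(Juu)), eps))
    return np.diag(d)
```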
3.3 Treatment of Control Bounds
In the standard HDDP algorithm presented in Part 1, control bounds are treated using a trust-region range-space method. One drawback of this method is that the trust region computation is performed first, and then the resulting shifted Hessian is reduced to account for the constraints. This may lead to numerical difficulties since the trust region step may underestimate the size of the components along the constraints. This undesirable side-effect is especially pronounced when some control bounds are active. For instance, in Figure 1, the left-hand side shows a situation where the unconstrained trust region step is mainly along a direction that violates a control bound. The contribution of the unconstrained variable is much smaller than that of the fixed variable, and thus numerically swamped. On the right-hand side, the corresponding feasible direction left after reduction of the Hessian is artificially small and not representative of a full trust region step.

Figure 1: Negative effect of bounds on trust region step estimations.
To avoid this shortcoming, we use a different method to account specifically for control bounds. First, as before, we compute an unconstrained trust region step $\delta u_k^*$ to estimate the set of active bound constraints. Secondly, the control variables that lie on or outside their bounds are assigned a non-feedback $\delta u_k$ that places them directly on their bounds, and the corresponding feedback matrices are zeroed. Next, the Hessian $J_{uu,k}$ and gradient $J_{u,k}$ are reduced to remove the rows and columns that correspond to the fixed control variables. A second trust region problem is then solved with the reduced Hessian and gradient^b. The full size of the trust region is thus guaranteed to be used on the free control variables. Note that this technique is a special case of null-space methods that construct a reduced Hessian $Z^T J_{uu,k} Z$ and a reduced gradient $Z^T J_{u,k}$, where $Z$ is a full-rank matrix that spans the null space of the active linearized constraints (in other words, $\tilde{g}_{u,k} Z = 0$). Null-space methods are successfully implemented in state-of-the-art NLP solvers [5, 8]. Future work will therefore aim to generalize the outlined procedure to all nonlinear stage constraints.

^b If nonlinear stage constraints are present, they are handled with the range-space method described in the previous subsection.
However, this method to enforce control bounds is more computationally intensive because two trust region computations are necessary. Another idea for the treatment of the control bound constraints is to use the affine scaling interior-point method introduced by Coleman and Li [9]. Interior-point approaches are attractive for problems with a large number of active bounds since the active set does not need to be estimated. In this method, the scaling matrix $D$ of the trust region technique (see (10)) is a diagonal matrix whose diagonal elements are determined by the distance of the control iterates to the bounds and by the direction of the gradient:
$$D_{pp} = \begin{cases}
\dfrac{1}{\sqrt{[u^U_k - u_k]_p}}, & \text{if } [J_{u,k}]_p < 0 \text{ and } [u^U_k]_p < \infty,\\[2mm]
1, & \text{if } [J_{u,k}]_p < 0 \text{ and } [u^U_k]_p = \infty,\\[2mm]
\dfrac{1}{\sqrt{[u_k - u^L_k]_p}}, & \text{if } [J_{u,k}]_p \ge 0 \text{ and } [u^L_k]_p > -\infty,\\[2mm]
1, & \text{if } [J_{u,k}]_p \ge 0 \text{ and } [u^L_k]_p = -\infty.
\end{cases} \qquad (12)$$
In general, the null-space and interior-point methods are both effective for the treatment of bounds. Finally, note that both approaches require starting with a solution that strictly satisfies the bound constraints. It might therefore be necessary to modify the user-provided initial point so that infeasible control components are projected onto the boundary. The range-space method described in the first paper could also be used initially.
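For reference, the affine-scaling matrix of (12) can be sketched as follows; the loop form is for clarity only and assumes, as noted above, that the iterate lies strictly inside its bounds. The function name is ours.

```python
import numpy as np

def coleman_li_scaling(u, Ju, u_low, u_up):
    """Affine-scaling diagonal matrix of (12): each entry depends on the sign of
    the gradient and on the distance of u to the nearest relevant (finite) bound.
    Assumes u is strictly inside its bounds, as required by the method."""
    d = np.ones_like(u)
    for p in range(len(u)):
        if Ju[p] < 0.0:
            if np.isfinite(u_up[p]):
                d[p] = 1.0 / np.sqrt(u_up[p] - u[p])
        else:
            if np.isfinite(u_low[p]):
                d[p] = 1.0 / np.sqrt(u[p] - u_low[p])
    return np.diag(d)
```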
3.4 Filtering Method for Accepting New Iterates
To accept an iterate, an extra case can be distinguished compared to the algorithm presented in Part 1. First, if the predicted-versus-actual reduction ratio $\rho$ belongs to the interval $[1 - \epsilon_1\epsilon_2,\ 1 + \epsilon_1\epsilon_2]$, where $\epsilon_2 \gg 1$, the approximations are not as accurate, but we do not simply throw away the trial iterate. Instead, we give it another chance by testing whether it can be accepted by a filter criterion. The filter concept originates from the observation that the solution of the optimal control problem consists of the two competing aims of minimizing the cost function and minimizing the constraint violations. Hence it can be seen as a bi-objective problem. Fletcher and Leyffer [10] propose the use of a Pareto-based filtering method to treat this problem. A filter $\mathcal{F}$ is a list of pairs $(h, f)$ such that no pair dominates any other. A pair $(h_1, f_1)$ is said to dominate another pair $(h_2, f_2)$ if and only if both $f_1 \le f_2$ and $h_1 \le h_2$. In our case, the pair corresponds to the cost and infeasibility values:
$$h = \frac{1}{M}\sum_{i=1}^{M}\left[\sum_{j=1}^{N_i} L_{i,j}(x_{i,j}, u_{i,j}, w_i) + \varphi_i(x_{i,N_i+1}, w_i, x_{i+1,1}, w_{i+1})\right], \qquad (13a)$$

$$f = \sqrt{\frac{1}{M}\sum_{i=1}^{M} \left\|\psi_i(x_{i,N_i+1}, w_i, x_{i+1,1}, w_{i+1})\right\|^2}. \qquad (13b)$$
A natural requirement for a new iterate is that it should not be dominated by previous iterates. Hence, when $h_{new} < h_k$ or $f_{new} < f_k$ for all $(h_k, f_k) \in \mathcal{F}$, we accept the new iterate and add it to the filter. All entries that are dominated by the new iterate are removed from the filter. The advantage of the filter method in our algorithm is to increase the opportunity for iterates to be accepted, which is likely to accelerate convergence. Note that the definitions in (13a) and (13b) for optimality and feasibility play a significant role in successful Pareto filter implementations. As an example, the true performance index $J$ includes the augmented Lagrangian, whose extra term is not included in the $f$ definition. Therefore, excessively large penalty weights can overwhelmingly favor feasibility, while the optimality moves in the dominated direction. In such cases, we simply rely on the successful iterates that satisfy the more conservative condition that $\rho$ is within the interval $[1 - \epsilon_1,\ 1 + \epsilon_1]$.
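A minimal sketch of the Pareto filter test is shown below; the filter is stored as a plain list of (h, f) pairs and the function names are illustrative.

```python
def dominates(pair_a, pair_b):
    """Pair a = (h_a, f_a) dominates pair b iff h_a <= h_b and f_a <= f_b."""
    return pair_a[0] <= pair_b[0] and pair_a[1] <= pair_b[1]

def filter_accept(filter_list, h_new, f_new):
    """Accept the trial iterate if it is not dominated by any filter entry, i.e.
    h_new < h_k or f_new < f_k for all (h_k, f_k) in the filter. On acceptance,
    entries dominated by the new pair are removed and the new pair is added."""
    candidate = (h_new, f_new)
    if any(dominates(entry, candidate) for entry in filter_list):
        return False, filter_list
    kept = [entry for entry in filter_list if not dominates(candidate, entry)]
    kept.append(candidate)
    return True, kept
```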
3.5 Parallelization of STM Computations
Once the trajectory is integrated, the STMs of each segment can be computed independently from each other. The STM calculations can therefore be executed in parallel on a multicore machine or even a cluster to dramatically reduce the computation time (see Figure 2). This is a major advantage over classical (Riccati-like) formulations, where the derivatives are interconnected and cannot be computed independently. This strategy is not tested in this paper and is left as future work.

Figure 2: Parallelization of STM computations.
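As a rough illustration of the idea (untested here, like the strategy itself), the segment-wise STM evaluations could be dispatched to a worker pool; `compute_segment_stm` is an assumed user-supplied routine returning the first- and second-order STMs of one segment.

```python
from concurrent.futures import ProcessPoolExecutor

def compute_all_stms(segment_states, compute_segment_stm, n_workers=4):
    """Compute the first- and second-order STMs of every segment in parallel.
    compute_segment_stm(x_k) is an assumed function returning (Phi1_k, Phi2_k)
    for the segment starting at state x_k; the segments are independent once
    the nominal trajectory has been integrated."""
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(compute_segment_stm, segment_states))
```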
3.6 Adaptive Mesh Refinement
Many optimal control problems are inherently discontinuous with bang-bang control structures. Since the locations of the switching points are unknown in advance, a fine equally-spaced mesh is required to obtain an accurate solution if the mesh is kept fixed during the optimization process. To use a coarser mesh and reduce the computational cost, HDDP could be implemented with an internal mesh optimization strategy that automatically increases the resolution when the control undergoes large variations in magnitude [11]. Such a refinement can properly describe the optimal control discontinuities by creating a mesh that has nodes concentrated around the switching points. This refinement is not considered in this paper and is left as future work.
3.7 Analytic State Transition Matrices
State transition matrices can be derived analytically for some problems [12]. It is known that spacecraft trajectory optimization software utilizing analytic STMs enjoys impressive speed advantages compared to integrated counterparts [13, 14]. Our HDDP framework offers the possibility to use these analytic STMs, which similarly enables tremendous computational time savings. This promising topic is considered in [15], and analytic STMs are used extensively in the following examples to speed up computations.
4 Validation of HDDP: Linear Quadratic Problem
The first part of the paper series outlined the theory and the mathematical equations that govern the HDDP algorithm. As a preliminary validation check, we propose to solve a linear system with a quadratic performance index and linear constraints. Powell proved that methods based on augmented Lagrangian functions should converge exactly in one iteration for this kind of problem [16]. This follows from the fact that the augmented cost function remains quadratic when linear constraints are included (they are only multiplied by the Lagrange multiplier).
To test the complete algorithm, we consider a simple multi-phase targeting problem with 2 phases ($M = 2$) and 5 stages per phase ($N_1 = N_2 = 5$). The states are governed by the controls only, and the transition functions $F_{i,j}$ (see (3)) acting on each stage are given by:
$$x_{i,j+1} = F_{i,j}(x_{i,j}, u_{i,j}) = \begin{bmatrix} r_{i,j+1} \\ v_{i,j+1} \end{bmatrix} = \begin{bmatrix} r_{i,j} + v_{i,j} \\ v_{i,j} + u_{i,j} \end{bmatrix} \quad \text{for } i = 1\ldots 2,\ j = 1\ldots 5. \qquad (14)$$
The states are the position and velocity, and the controls are directly related to the acceleration. At each stage, the following quadratic cost function $L_{i,j}$ is considered:

$$L_{i,j} = \|u_{i,j}\|^2 \quad \text{for } i = 1\ldots 2,\ j = 1\ldots 5. \qquad (15)$$
The phase constraints $\psi_1$ between the two phases enforce the continuity of the states:

$$\psi_1(x_{1,6}, x_{2,1}) = x_{2,1} - x_{1,6} = 0. \qquad (16)$$
The final constraint $\psi_2$ targets an arbitrary point in space:

$$\psi_2(x_{2,6}) = r_{2,6} - [1, -1, 0]^T = 0. \qquad (17)$$
The initial states of the first phase are fixed: $x_{1,1} = [1, 1, 1, 1, 1, 1]$. The initial guesses of the controls and the first states of the second phase are simply zero: $x_{2,1} = [0, 0, 0, 0, 0, 0]$ and $u_{i,j} = [0, 0, 0]$ for $i = 1\ldots 2$, $j = 1\ldots 5$.
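For concreteness, the short sketch below propagates one phase of this test case with the transition function (14) and accumulates the quadratic stage costs (15); the function and variable names are ours.

```python
import numpy as np

def rollout_phase(r0, v0, controls):
    """Propagate one phase of the linear quadratic problem with (14) and
    accumulate the quadratic stage costs (15). controls has shape (5, 3)."""
    r, v, cost = np.array(r0, float), np.array(v0, float), 0.0
    for u in controls:
        cost += float(np.dot(u, u))      # L_{i,j} = ||u_{i,j}||^2
        r, v = r + v, v + u              # r_{j+1} = r_j + v_j,  v_{j+1} = v_j + u_j
    return r, v, cost

# Example: phase 1 starts from the fixed initial state of the problem with zero controls.
r_end, v_end, J1 = rollout_phase([1, 1, 1], [1, 1, 1], np.zeros((5, 3)))
```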
Figure 3: Controls (left) and states (right) of the optimal solution.
Figure 3 shows the converged solution obtained by HDDP. As expected, HDDP converges to the optimal solution in one iteration (when all the safeguards are fully relaxed). Experiments with other initial conditions and target constraints are consistent, yielding single iteration optimal solutions.
5 Earth-Mars Rendezvous Transfer
An example problem for a classic Earth-Mars rendezvous transfer is presented to compare the solvers and point out the coupling between HDDP and indirect methods. We maximize the final mass of the spacecraft, and the time of flight is fixed and equal to 348.79 days. The spacecraft has a 0.5 N thruster with a specific impulse $I_{sp}$ of 2000 s. The initial mass of the spacecraft is 1000 kg. Planets are considered massless, so only the gravitational force of the Sun is taken into account. As a consequence, we use only one phase to describe the trajectory: $M = 1$. Along the trajectory, the spacecraft state vector is defined by 7 variables: position vector, velocity vector and mass,

$$x = (r, v, m). \qquad (18)$$
We consider a launch date on April 10th, 2007. The corresponding states of the Earth at this date are obtained with the JPL ephemerides DE405: $r_0 = [-140699693, -51614428, 980]$ km, $v_0 = [9.774596, -28.07828, 4.337725\times 10^{-4}]$ km/s. The terminal constraints impose a rendezvous with Mars:

$$\psi_f = \begin{bmatrix} r_f - r_M(t_f) \\ v_f - v_M(t_f) \end{bmatrix}. \qquad (19)$$

From the JPL ephemerides DE405, the targeted states are: $r_M(t_f) = [-172682023, 176959469, 7948912]$ km, $v_M(t_f) = [-16.427384, -14.860506, 9.21486\times 10^{-2}]$ km/s.
The low-thrust spacecraft trajectory is approximated as a series of impulsive $\Delta V$'s connected by coast arcs. From classical two-body mechanics, these arcs are computed analytically using a standard Kepler solver through the "f and g" procedure presented by Bate et al. [17]. The mapping between the states can therefore be defined analytically on each stage with the following closed-form transition function $F_k$:

$$x_{k+1} = \begin{bmatrix} r_{k+1} \\ v_{k+1} \\ m_{k+1} \end{bmatrix} = F_k(x_k, \Delta v_k) = \begin{bmatrix} f\, r_k + g\,(v_k + \Delta v_k) \\ \dot{f}\, r_k + \dot{g}\,(v_k + \Delta v_k) \\ m_k \exp\!\left(-\dfrac{\|\Delta v_k\|}{g_0 I_{sp}}\right) \end{bmatrix}, \qquad (20)$$

where $f$ and $g$ are the Lagrange coefficients. The mass discontinuity, due to the impulse, is obtained from the rocket equation. The corresponding first- and second-order STMs are also computed analytically [18, 15]. A fixed equally-spaced mesh of 40 stages is used, which corresponds to 40 impulses separated by 8.7 days. The initial guess of the controls is zero.
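For illustration, the sketch below implements a stage transition of the form (20): a universal-variable Kepler solve in the spirit of Bate et al. [17] supplies the Lagrange coefficients, the impulse is added to the velocity, and the mass is updated with the rocket equation. The function names, the simple Newton iteration, and the units (km, km/s, s, kg) are our own assumptions; the analytic STMs used by HDDP are not reproduced here.

```python
import numpy as np

def stumpff_C(z):
    if z > 0:  return (1.0 - np.cos(np.sqrt(z))) / z
    if z < 0:  return (np.cosh(np.sqrt(-z)) - 1.0) / (-z)
    return 0.5

def stumpff_S(z):
    if z > 0:
        s = np.sqrt(z);  return (s - np.sin(s)) / s**3
    if z < 0:
        s = np.sqrt(-z); return (np.sinh(s) - s) / s**3
    return 1.0 / 6.0

def lagrange_fg(r0, v0, dt, mu):
    """Universal-variable Kepler solve: returns the Lagrange coefficients
    f, g, fdot, gdot for a ballistic coast of duration dt."""
    r0n = np.linalg.norm(r0)
    vr0 = np.dot(r0, v0) / r0n
    alpha = 2.0 / r0n - np.dot(v0, v0) / mu     # reciprocal of the semi-major axis
    chi = np.sqrt(mu) * abs(alpha) * dt         # initial guess for the universal anomaly
    for _ in range(60):                         # Newton iteration on the universal Kepler equation
        z = alpha * chi**2
        C, S = stumpff_C(z), stumpff_S(z)
        F = (r0n * vr0 / np.sqrt(mu) * chi**2 * C
             + (1.0 - alpha * r0n) * chi**3 * S + r0n * chi - np.sqrt(mu) * dt)
        dF = (r0n * vr0 / np.sqrt(mu) * chi * (1.0 - z * S)
              + (1.0 - alpha * r0n) * chi**2 * C + r0n)
        step = F / dF
        chi -= step
        if abs(step) < 1e-10:
            break
    z = alpha * chi**2
    C, S = stumpff_C(z), stumpff_S(z)
    f = 1.0 - chi**2 / r0n * C
    g = dt - chi**3 / np.sqrt(mu) * S
    rn = np.linalg.norm(f * r0 + g * v0)
    fdot = np.sqrt(mu) / (rn * r0n) * (z * S - 1.0) * chi
    gdot = 1.0 - chi**2 / rn * C
    return f, g, fdot, gdot

def stage_transition(r_k, v_k, m_k, dv_k, dt, mu, isp=2000.0, g0=9.80665e-3):
    """Stage transition in the form of (20): impulsive dv, analytic coast,
    rocket-equation mass update (g0 in km/s^2 for km-based units)."""
    v_plus = v_k + dv_k
    f, g, fdot, gdot = lagrange_fg(r_k, v_plus, dt, mu)
    r_next = f * r_k + g * v_plus
    v_next = fdot * r_k + gdot * v_plus
    m_next = m_k * np.exp(-np.linalg.norm(dv_k) / (g0 * isp))
    return r_next, v_next, m_next
```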
The problem is solved with HDDP, SNOPT and IPOPT. Note that, for IPOPT, two different cases are considered: 1) first- and second-order derivatives provided, and 2) first-order derivatives only provided. Regarding HDDP, we consider the following basic variants:

• standard: the algorithm as described in Part 1 of the paper series.
• unsafe: no safeguards are used.
• scaling: an automatic trust region scaling is used, with components of the scaling matrix updated at each iteration (see Subsection 3.2).
• reduc: the Hessian is reduced to account for control bounds before solving the trust region subproblem (see Subsection 3.3).
• inter: the affine scaling interior-point method is used to treat control bounds (see Subsection 3.3).
• filter: the filtering method is considered for the acceptance or rejection of an iterate (see Subsection 3.4).

The results of the optimizations are given in Table 2. Figures 5, 6 and 7 show the thrust profiles of the optimal solution from each solver. The trajectory of the final optimal solution is given in Figure 4. The values of the Lagrange multipliers of the final constraints are given in Table 3 to test the similarity between HDDP and the NLP solvers.
Figure 4: Optimal Earth-Mars Rendezvous trajectory.

Figure 5: Thrust profile from SNOPT.
Figure 6: Thrust profile from IPOPT (First-order version).

Figure 7: Thrust profile from HDDP (standard version).

Table 2: Comparison of results from different solvers.

| Solver | mf (kg) | # of function calls | # of derivative function calls | CPU Time (s) |
|---|---|---|---|---|
| SNOPT | 598.66 | 439 | 439 | 10 |
| IPOPT 1 | 598.66 | 1821 | 1816 | 90 |
| IPOPT 2 | 598.66 | 304 | 249 | 762 |
| HDDP standard | 598.66 | 1419 | 1360 | 69 |
| HDDP unsafe | FAILED | - | - | - |
| HDDP scaling | 598.66 | 5498 | 4123 | 203 |
| HDDP reduc | 598.66 | 1239 | 1193 | 60 |
| HDDP inter | 598.66 | 3520 | 3018 | 145 |
| HDDP filter | 598.66 | 1329 | 1262 | 64 |
We can see that Figure 5, Figure 6 and Figure 7 show good agreement, so all solvers find the same solution with the same thrust profile. SNOPT is by far the fastest solver. Interestingly, second-order IPOPT requires the fewest number of iterations, but its overall CPU time is the largest. This comes from the fact that the computation and construction of the second-order Hessian of the problem is very expensive. The standard HDDP solver compares reasonably well for this problem. However, HDDP is more intended for large-scale problems, and a more suited example is provided in the next section. As expected, there is a substantial decrease in the number of function calls when the Hessian is reduced before solving the trust-region subproblem (reduc variant). On the other hand, the interior-point method (inter variant) performs the worst out of all the methods. That poor performance may be partly explained by the fact that the scaling matrix of the trust region procedure is likely to change from one iteration to another, which may interfere with the update of the trust region radius. The filter variant appears to decrease the computational time, although not dramatically. This improved performance may be explained by the aggressive nature of this strategy, which tends to accept iterates more frequently. For this problem, the scaling variant deteriorates the performance compared with the standard algorithm. This result illustrates the difficulty in designing automatic scaling procedures. Finally, we can see that the safeguarding techniques are crucial (as expected) since the algorithm does not converge without them. Note in Table 3 that the values of the Lagrange multipliers from HDDP roughly match those of SNOPT and IPOPT, which indicates that this NLP-like feature of HDDP is working well, and the resulting multipliers could be used as efficient guesses for pure direct methods.

Table 3: Comparison of the Lagrange multipliers of the constraints.

| Solver | Lagrange Multipliers |
|---|---|
| SNOPT^c | [0.4804, -1.2011, -0.2510, 0.1151, 1.9604, 0.1265] |
| IPOPT 1 | [0.4802, -1.1941, -0.2492, 0.1173, 1.9472, 0.1255] |
| IPOPT 2 | [0.4810, -1.2037, -0.2511, 0.1145, 1.9643, 0.1262] |
| HDDP standard | [0.5095, -1.2700, -0.2665, 0.1178, 2.0701, 0.13404] |

^c We point out that the signs of the Lagrange multipliers from SNOPT are switched to account for the different conventions.

In addition, we test the validity of the claim of Section 4 of the first part of the paper series regarding the correspondence between the initial values of the co-states and the initial values of $J_x$ (the sensitivities of the performance index with respect to the states) in HDDP. We find that $J_{x,0} = [-0.96759, -1.32018, -8.8556\times 10^{-2}, -0.64969, -1.56202, 0.37153, 6.47488\times 10^{-2}]$. When the problem is solved using an indirect method, we have $\lambda_0 = [-0.87165, -1.14978, -8.75855\times 10^{-2}, -0.54003, -1.40597, 0.33121, -0.52092]$. The HDDP and indirect values are clearly related. The discrepancies likely result from the discretization and the use of approximated dynamics.
In fact, Figure 8 shows the average errors in the initial co-states for the same problem solved with varying numbers of stages. Clearly, the errors decrease rapidly as the number of discretization points increases. The HDDP values for N = 40 are then given as initial guesses to the indirect optimization procedure. It is found that the indirect algorithm converges in only a few iterations. This ease of convergence demonstrates that the HDDP solution can be used as a good initial guess for an indirect formulation.

Figure 8: Co-state error (relative to indirect values) as a function of the number of discretization points.

Figure 9: Distribution of the number of iterations corresponding to 100 random initial guesses.
Finally, the robustness of HDDP is tested by generating 100 random initial guesses. For each stage, assuming uniform distributions, the magnitude and angles of the starting control guesses are randomly selected in the intervals $[0, T_{max}]$ and $[0, 2\pi]$, respectively. It is found that HDDP is able to converge to the same optimal solution for all initial guesses. This result shows that the radius of convergence of HDDP is very large for this problem. The distribution of the number of iterations of the runs is plotted in Figure 9. The mean is 1817 iterations and the standard deviation is 403 iterations. Surprisingly, the number of iterations corresponding to the zero initial guess (see Table 2) is less than the mean number. This observation can be explained by the fact that the non-zero random initial guesses can go significantly beyond the Mars orbit, especially if the random thrust magnitudes are high.
6 Multi-Revolution Orbital Transfer
This example is a more complicated spacecraft trajectory problem concerning the minimum-fuel optimization of a low-thrust orbital transfer from the Earth to a circular orbit. Again, we use only one phase to describe the trajectory: $M = 1$. The $I_{sp}$ is assumed to be constant and equal to 2000 s. The initial states (position, velocity, mass) are the same as in the previous example. The objective is to maximize the final mass. The analytical Kepler model described in the previous example (see (20)) is chosen to propagate the stages. Final constraints enforce that the spacecraft reach a final circular orbit with radius $a_{target} = 1.95$ AU. The square of the eccentricity is used in the second constraint to obtain continuous derivatives:
$$\psi_f = \begin{bmatrix} a_f - a_{target} \\ e_f^2 \end{bmatrix}, \qquad (21)$$

where $a_f$ and $e_f$ can be expressed as functions of the final states:

$$a_f = \frac{1}{2/\|r_f\| - \|v_f\|^2/\mu}, \qquad (22a)$$

$$e_f = \frac{\left(\|v_f\|^2 - \mu/\|r_f\|\right) r_f - \left(r_f^T v_f\right) v_f}{\mu}. \qquad (22b)$$
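A small sketch of the terminal constraint evaluation (21)-(22) is given below; the helper name is ours and, as in the formulation, the second component is the squared norm of the eccentricity vector.

```python
import numpy as np

def terminal_constraints(r_f, v_f, mu, a_target):
    """Evaluate psi_f of (21) from the final position/velocity using (22):
    a_f from the vis-viva relation, e_f from the eccentricity vector."""
    rn = np.linalg.norm(r_f)
    vn2 = np.dot(v_f, v_f)
    a_f = 1.0 / (2.0 / rn - vn2 / mu)                              # (22a)
    e_vec = ((vn2 - mu / rn) * r_f - np.dot(r_f, v_f) * v_f) / mu  # (22b)
    return np.array([a_f - a_target, np.dot(e_vec, e_vec)])        # [a_f - a_target, e_f^2]
```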
To study the influence of the number of revolutions on the optimization process, this problem is solved several times for increasing times of flight. The maximum thrust allowed and the number of stages are modified accordingly so that the problem stays accurate and feasible. The problem is intentionally designed such that the time allowed exceeds that from the minimum time solution. Therefore, an unknown number of coast/thrust arcs will appear in the solution. Such problems with multiple (on the order of ten) bang-bang control switches are known to be particularly challenging for direct optimal control methods.
• Case 1 (≈ 2 revs): TOF = 1165.65 days, N = 40, $T_{max}$ = 0.2 N.
• Case 2 (≈ 5 revs): TOF = 2325.30 days, N = 80, $T_{max}$ = 0.14 N.
• Case 3 (≈ 9 revs): TOF = 4650.60 days, N = 160, $T_{max}$ = 0.05 N.
• Case 4 (≈ 17 revs): TOF = 8719.88 days, N = 300, $T_{max}$ = 0.015 N.

Table 4: Comparison results between HDDP and SNOPT for multi-rev transfers.

| Case | Solver | mf (kg) | # of coast arcs | # of function calls | # of derivative function calls | CPU time (s) |
|---|---|---|---|---|---|---|
| Case 1 | HDDP | 655.65 | 4 | 510 | 452 | 42 |
| Case 1 | SNOPT | 655.63 | 4 | 12638 | 12638 | 315 |
| Case 2 | HDDP | 655.65 | 7 | 1562 | 1026 | 171 |
| Case 2 | SNOPT | 654.35 | 5 | 9431 | 9431 | 832 |
| Case 3 | HDDP | 654.75 | 10 | 2875 | 2048 | 524 |
| Case 3 | SNOPT | 651.43 | 9 | 10321 | 10321 | 2981 |
| Case 4 | HDDP | 651.70 | 15 | 6060 | 4879 | 1689 |
| Case 4 | SNOPT | FAILED | - | - | - | - |
Figure 10: Cost per iteration as a function of time of flight for HDDP and SNOPT.

The solver IPOPT cannot be used to solve these large-scale problems due to memory limitations. The large memory requirements of our implementation of IPOPT are due to the accommodation of the second-order sensitivity calculations. It follows that only the solvers HDDP (standard variant) and SNOPT are used for the optimization. In order to evaluate the robustness, the initial guess of the thrust controls is set to zero for all stages^d. This initial guess is very poor, as the resulting trajectory never leaves the Earth's vicinity. This example is, therefore, particularly challenging to solve, especially for long flight times. Table 4 summarizes the results for the different cases. We can see that HDDP is able to converge in all cases, while SNOPT fails when the time of flight (hence the number of variables) becomes large. These results point out that many-revolution problems become difficult to converge even with the sparse capabilities of SNOPT. We point out here that, for this problem, HDDP is significantly faster in total compute time due to the reduced number of iterations, despite the extra cost of requiring second-order derivatives. In addition, for cases 2 and 3, HDDP and SNOPT appear to converge on different local minima. Figure 10 shows the cost per iteration of HDDP and SNOPT, and demonstrates that SNOPT does indeed suffer from the 'curse of dimensionality' to a greater extent than HDDP. The computational cost of SNOPT increases exponentially (arguably the rate may be considered super-linear due to the sparsity of the problem), while that of HDDP increases only linearly. For a small number of variables, SNOPT is faster per iteration than HDDP since exact second-order derivatives are not computed in SNOPT. However, for a large number of variables, SNOPT becomes slower per iteration than HDDP because SNOPT does not take advantage of the time structure of the problem.

^d In practice, the thrust magnitudes are set to a very small value so that sensitivities with respect to the angles do not vanish.
Figure 11: Trajectory of the case 4 transfer (from HDDP).

Figure 12: In-plane azimuthal thrust angle history of the case 4 transfer (from HDDP).
Figure 13: Thrust profile of case 4 from HDDP (left) and T3D (right).
Figure 14: Evolution of the constraints and associated Lagrange multipliers λ1 and λ2 during optimization: semi-major axis constraint (left) and eccentricity constraint (right).
Details on the solution of case 4 found by HDDP are given in Figure 11 to Figure 14. The trajectory involves nearly 17 revolutions. Figure 14 shows that the radius constraint and associated Lagrange multiplier are approximately converged after about 1/4 of the iterations. During the remaining iterations, the solution slowly improves the eccentricity constraint. In addition, in Figure 13 the results are compared with the indirect solver T3D dedicated to orbital transfers [19]. Since T3D is an indirect method that does not discretize the controls, it gives "exact" locally optimal solutions. The solution produced by T3D is therefore considered as the benchmark solution. The right plot of Figure 13 shows the thrust structure of the T3D solution. Despite the complexity of the structure with multiple exact bang-bang switches, we can see that the T3D and HDDP solutions agree very closely. Note that convergence for T3D was very difficult for this challenging multi-rev problem, requiring a great deal of user intervention.
7 GTOC4 Multi-Phase Optimization
GTOC4 is the fourth edition of the Global Trajectory Optimization Competition (GTOC), first initiated in 2005 by the Advanced Concepts Team of the European Space Agency. GTOC problems are traditionally global low-thrust trajectory optimization problems that seek to find the best thrust profile and sequence of asteroids according to some performance index. GTOC editions are therefore a challenging benchmark for optimization algorithms. In the GTOC4 problem, the spacecraft is constrained to fly by a maximum number of asteroids (from a given list) and then rendezvous with a last asteroid. The primary performance index to be maximized is the number of visited asteroids; when two solutions have the same number of visited asteroids, the secondary performance index is the final mass of the spacecraft. A local optimizer is therefore required to optimize a given sequence of asteroids.
In this problem, the trajectory can be readily broken into several portions connected by the flybys at the asteroids. GTOC4 is therefore a good test case for the multi-phase formulation of HDDP. The spacecraft has a constant specific impulse $I_{sp}$ of 3000 s and its maximum thrust is 0.2 N^e. The initial mass of the spacecraft is 1500 kg and its dry mass is 500 kg. The spacecraft must launch from Earth with a departure excess velocity no greater than 4.0 km/s in magnitude, but with an unconstrained direction. The year of launch must be between 2015 and 2025, and the time of flight for the whole trajectory must not exceed 10 years.

^e The maximum thrust is 0.135 N in the original GTOC4 problem. The authors raise the maximum thrust value because the GTOC4 problem is not feasible with the original maximum thrust value when the analytical Kepler model is used to approximate the low-thrust trajectory.

This problem is defined to be in the same form as the generic formulation presented in (1). We now define all the functions and variables of this formulation. First, the variables are defined in the same way as in the last two examples. A spherical representation of the thrust vector controls is used. The initial function $\Gamma_i$ is defined as:
$$\Gamma_i = \begin{bmatrix} r_{ast,i}(t_{0,i}) \\ v_{ast,i}(t_{0,i}) + V_{\infty,i} \\ m_{0,i} \end{bmatrix}, \qquad (23)$$

where $r_{ast,i}(t_{0,i})$ and $v_{ast,i}(t_{0,i})$ are the position and velocity of the $i$th asteroid of the sequence at the starting time $t_{0,i}$ of phase $i$. Given the definition of the GTOC4 problem and the continuity conditions between the masses and the times of successive phases, the phase constraints have the following form:

$$\psi_i = \begin{bmatrix} r_{f,i} - r_{ast,i+1}(t_{f,i}) \\ t_{f,i} - t_{0,i+1} \\ m_{f,i} - m_{0,i+1} \end{bmatrix} \ \text{for } i = 1\ldots M-1, \qquad
\psi_i = \begin{bmatrix} r_{f,i} - r_{ast,i+1}(t_{f,i}) \\ v_{f,i} - v_{ast,i+1}(t_{f,i}) \end{bmatrix} \ \text{for } i = M. \qquad (24)$$
Again, the analytical Kepler model is used to propagate the spacecraft across each stage, and the thrusts are approximated as impulsive velocity increments at the end of each stage. The trajectory obtained can then be refined using a numerical constant thrust model, but this extra step is not shown here. The initial guess comes from a promising ballistic Lambert solution that gives the asteroid sequence and initial values for all the static parameters wi = [V∞,i , m0,i , t0,i , tf,i ] of each phase. The orbital elements and associated epoch times of the asteroids of the sequence are given in Table 5. The thrust on each stage is set to zero.
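To illustrate the multi-phase wiring, the hedged sketch below evaluates the initial function (23) and the linkage constraints (24) for one phase; the asteroid ephemeris values are passed in directly and all names are illustrative.

```python
import numpy as np

def gamma_i(ast_r, ast_v, v_inf, m0):
    """Initial function (23): start at the asteroid position, add the excess
    velocity V_inf to the asteroid velocity, and attach the phase's initial mass."""
    return np.concatenate([ast_r, ast_v + v_inf, [m0]])

def psi_i_intermediate(x_f, tf_i, t0_next, m0_next, next_ast_r):
    """Intermediate phase constraints of (24): position flyby of the next asteroid,
    plus time and mass continuity with the following phase."""
    r_f, m_f = x_f[0:3], x_f[6]
    return np.concatenate([r_f - next_ast_r, [tf_i - t0_next], [m_f - m0_next]])

def psi_i_final(x_f, next_ast_r, next_ast_v):
    """Last-phase constraints of (24): full position and velocity rendezvous."""
    r_f, v_f = x_f[0:3], x_f[3:6]
    return np.concatenate([r_f - next_ast_r, v_f - next_ast_v])
```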
Table 6 compares the characteristics of the results from HDDP (standard variant) and SNOPT. We can see that HDDP and SNOPT converge to nearly identical but nevertheless different final masses. In addition, for this example, HDDP takes more iterations and is slower than SNOPT. Interestingly, in both cases, the spacecraft uses only about half of the propellant available. Figure 15 depicts two-dimensional and three-dimensional trajectory views of the resulting solution optimized by HDDP. The optimal static parameters of each phase are given in Table 7. Figure 16 shows the resulting thrust and inclination histories. The spacecraft inclination remains low and varies little throughout the trajectory, while the inclinations of the intercepted asteroids vary significantly (see Table 5), which suggests the solution is efficient, as it is well known that changing inclination is fuel expensive. Note that the problem was formulated with 25 phases since this trajectory has 24 asteroid flybys and 1 asteroid rendezvous. This example therefore demonstrates the multi-phase capability of HDDP.

Table 5: Orbital Elements of the bodies encountered in the GTOC4 trajectory.
| Body # | Epoch (MJD) | Semi-major axis a (AU) | Eccentricity e | Inclination i (deg) | Longitude of Ascending Node LAN (deg) | Argument of Periapsis w (deg) | Mean Anomaly MA (deg) |
|---|---|---|---|---|---|---|---|
| 0^f | 54000 | 0.99998804953 | 1.67168116E-2 | 0.885435307E-3 | 175.4064769 | 287.6157754 | 257.6068370 |
| 1 | 54800 | 9.3017131191E-1 | 1.6769455838E-1 | 8.9335359602E-1 | 14.822384375 | 131.38493398 | 275.70393807 |
| 2 | 54800 | 1.084255941 | 3.155808232E-1 | 7.850170754 | 95.26367740 | 264.6332999 | 4.356061282 |
| 3 | 54800 | 1.7552828368 | 5.7945771228E-1 | 6.5141899261 | 13.045124964 | 270.61856651 | 155.74454312 |
| 4 | 54800 | 1.3800997657 | 2.7580784273E-1 | 2.6606667520E-1 | 96.339403680 | 101.42094303 | 229.92816483 |
| 5 | 54800 | 1.7075464883 | 5.2695554262E-1 | 4.2213705005 | 44.554300450 | 87.662123588 | 280.43305520 |
| 6 | 54800 | 1.0006640627 | 6.3230497939E-1 | 2.6484263889 | 19.209151230 | 200.25315258 | 106.72858289 |
| 7 | 54800 | 1.5911507659 | 3.4753312629E-1 | 3.7576988738E-1 | 74.065001060 | 10.415406459 | 169.90158505 |
| 8 | 54800 | 8.6572958591E-1 | 2.3794231521E-1 | 18.696815694 | 302.11000003 | 233.44411915 | 262.07506808 |
| 9 | 54800 | 1.6714664872 | 6.1129594066E-1 | 4.6618217515 | 263.39841256 | 84.928649085 | 272.63494057 |
| 10 | 54800 | 1.3160154668 | 2.1492182204E-1 | 2.7420643449 | 175.90424294 | 353.47104336 | 168.00794806 |
| 11 | 54800 | 9.5081078800E-1 | 3.0065719387E-1 | 1.4145702318 | 93.498333536 | 110.24580112 | 267.39908757 |
| 12 | 54800 | 2.0350529986 | 5.0272870853E-1 | 1.7759725806 | 44.755065897 | 144.09991810 | 218.24463021 |
| 13 | 54800 | 1.2388916078 | 3.7055387373E-1 | 21.681828028 | 73.115325356 | 105.53090047 | 132.76357397 |
| 14 | 54800 | 1.2152271331 | 5.6461074502E-1 | 1.7232805609 | 104.16370212 | 356.45495764 | 183.47161359 |
| 15 | 54800 | 1.0611146623 | 3.0767442711E-1 | 5.6219229406 | 269.68129154 | 80.383012719 | 312.78301349 |
| 16 | 54800 | 9.2123263041E-1 | 3.6297077952E-1 | 1.5474324643 | 347.20714860 | 57.688377044 | 302.98512819 |
| 17 | 54800 | 2.0515997162 | 6.6534478064E-1 | 6.1718765590 | 79.806648798 | 84.811200730 | 115.91149094 |
| 18 | 54800 | 1.2664655353 | 9.2674837663E-1 | 23.703765923 | 39.717681807 | 149.42286711 | 268.01737324 |
| 19 | 54800 | 8.9557654855E-1 | 4.9544188148E-1 | 11.561952262 | 162.89527752 | 139.57717229 | 26.143357706 |
| 20 | 54800 | 9.2467395906E-1 | 2.9779807731E-1 | 3.7631635262 | 203.55546271 | 253.44738625 | 238.74232395 |
| 21 | 54800 | 7.2358966214E-1 | 4.1051576901E-1 | 8.9805388805 | 231.65246288 | 355.50277050 | 121.10107758 |
| 22 | 54800 | 1.0047449862 | 2.9343421704E-1 | 5.2415677063 | 25.948442789 | 280.91259530 | 133.78127639 |
| 23 | 54800 | 7.5828217967E-1 | 3.5895682728E-1 | 33.432860441 | 281.89275262 | 201.48128492 | 275.33499146 |
| 24 | 54800 | 1.7057098943 | 6.8990451045E-1 | 8.7448312990 | 34.400999084 | 99.314851116 | 240.06977412 |
| 25 | 54800 | 1.0327257593 | 6.8786392762E-2 | 2.6459755979E-1 | 21.101512017 | 300.73089876 | 96.412302864 |

^f Body 0 corresponds to the Earth.
Table 6: Comparison results between HDDP and SNOPT for the GTOC4 problem.

| Solver | mf (kg) | # of function calls | # of derivative function calls | CPU time (s) |
|---|---|---|---|---|
| HDDP | 926.2642 | 796 | 586 | 489 |
| SNOPT | 931.1934 | 301 | 301 | 83 |
Table 7: Optimal static parameters for each phase of the GTOC4 trajectory.

| Body 1 # | Body 2 # | V∞ (km/s) | m0 (kg) | t0 (MJD) | tf (MJD) |
|---|---|---|---|---|---|
| 0 | 1 | [0.6056280844E5, 0.6087346971E5, 0.1166261180] | 0.150000000E4 | 0.6056280844E5 | 0.6087346971E5 |
| 1 | 2 | [0.6087346971E5, 0.6105433915E5, 0.1503211339E1] | 0.145702336E4 | 0.6087346971E5 | 0.6105433915E5 |
| 2 | 3 | [0.6105433915E5, 0.6117228617E5, 0.5732629605] | 0.1457023213E4 | 0.6105433915E5 | 0.6117228617E5 |
| 3 | 4 | [0.6117228617E5, 0.6147368910E5, -0.1442777185E2] | 0.1454588435E4 | 0.6117228617E5 | 0.6147368910E5 |
| 4 | 5 | [0.6147368910E5, 0.6155457131E5, -0.3442815132E1] | 0.1388058079E4 | 0.6147368910E5 | 0.6155457131E5 |
| 5 | 6 | [0.6155457131E5, 0.6168479608E5, 0.1285209291E2] | 0.1378572898E4 | 0.6155457131E5 | 0.6168479608E5 |
| 6 | 7 | [0.6168479608E5, 0.6181659422E5, -0.1713959727E2] | 0.1354366965E4 | 0.6168479608E5 | 0.6181659422E5 |
| 7 | 8 | [0.6181659422E5, 0.6202268662E5, 0.5676274520E1] | 0.1346964060E4 | 0.6181659422E5 | 0.6202268662E5 |
| 8 | 9 | [0.6202268662E5, 0.6214606799E5, -0.1229533875] | 0.1333671408E4 | 0.6202268662E5 | 0.6214606799E5 |
| 9 | 10 | [0.6214606799E5, 0.6229888436E5, -0.3339891717E1] | 0.1307106049E4 | 0.6214606799E5 | 0.6229888436E5 |
| 10 | 11 | [0.6229888436E5, 0.6240361798E5, 0.3149314215E1] | 0.1281408461E4 | 0.6229888436E5 | 0.6240361798E5 |
| 11 | 12 | [0.6240361798E5, 0.6260213851E5, -0.4926779996E1] | 0.1281408331E4 | 0.6240361798E5 | 0.6260213851E5 |
| 12 | 13 | [0.6260213851E5, 0.6272707632E5, -0.3888349194E1] | 0.1257511109E4 | 0.6260213851E5 | 0.6272707632E5 |
| 13 | 14 | [0.6272707632E5, 0.6281958835E5, 0.4641538756E1] | 0.1236617638E4 | 0.6272707632E5 | 0.6281958835E5 |
| 14 | 15 | [0.6281958835E5, 0.6291570320E5, 0.1944523210E2] | 0.1224535301E4 | 0.6281958835E5 | 0.6291570320E5 |
| 15 | 16 | [0.6291570320E5, 0.6301755513E5, 0.1423925345E-1] | 0.1173846117E4 | 0.6291570320E5 | 0.6301755513E5 |
| 16 | 17 | [0.6301755513E5, 0.6308978742E5, 0.8867154830E1] | 0.1138044522E4 | 0.6301755513E5 | 0.6308978742E5 |
| 17 | 18 | [0.6308978742E5, 0.6322878110E5, 0.7004391395E1] | 0.1095698752E4 | 0.6308978742E5 | 0.6322878110E5 |
| 18 | 19 | [0.6322878110E5, 0.6336470037E5, 0.1461061798E2] | 0.1068972677E4 | 0.6322878110E5 | 0.6336470037E5 |
| 19 | 20 | [0.6336470037E5, 0.6344120838E5, -0.1187284667E2] | 0.1045183636E4 | 0.6336470037E5 | 0.6344120838E5 |
| 20 | 21 | [0.6344120838E5, 0.6362006008E5, 0.9087521608E1] | 0.1036215390E4 | 0.6344120838E5 | 0.6362006008E5 |
| 21 | 22 | [0.6362006008E5, 0.6376812849E5, -0.2886452763E1] | 0.9980220982E3 | 0.6362006008E5 | 0.6376812849E5 |
| 22 | 23 | [0.6376812849E5, 0.6386880148E5, -0.6331117778E1] | 0.9688235582E3 | 0.6376812849E5 | 0.6386880148E5 |
| 23 | 24 | [0.6386880148E5, 0.6397596122E5, 0.8571439780E1] | 0.9450834093E3 | 0.6386880148E5 | 0.6397596122E5 |
| 24 | 25 | [0.6397596122E5, 0.6421530844E5, 0.1740526569E2] | 0.9262642097E3 | 0.6397596122E5 | 0.6421530844E5 |
Figure 15: GTOC4 trajectory (Earth=blue, flybys=green, rendezvous=red) from HDDP: two-dimensional top view (left) and three-dimensional view (right).
Figure 16: GTOC4 Thrust History (left) and Inclination History (right) from HDDP.
8 Conclusion
This paper series introduces HDDP, a new algorithm intended for the solution of complex optimal control problems. In this second part, we test HDDP on four optimal control problems of varying levels of difficulty. In all cases, we find robust convergence and competitive performance when compared to some existing state-of-the-art NLP solvers. As expected from its formulation, HDDP seems particularly efficient for large-scale problems, where the number of stages substantially exceeds the state and control dimensions. The results also indicate that HDDP has the ability to determine accurate estimates of the adjoint variables. In addition, several algorithmic variants of HDDP are analyzed, and it is found that adding a filtering method and reducing the Hessian to handle control bounds tend to be beneficial. Overall, the standard algorithmic version of the solver described in the first part appears to be reliable and acceptably efficient.
While the practical results of HDDP are very encouraging, there is certainly room for improvement since it is a relatively new algorithm. In particular, the performance of HDDP might be strongly affected by the choice of parameters throughout the optimization process (trust region, penalty update, acceptance criterion...). Our conclusions are restricted to the reported choice of parameters, and some aspects of the algorithm can possibly be improved by varying these features. Another possible area worthy of further investigation is the approximation of the Hessians via a quasi-Newton approach, in case second-order information is not available or too expensive. In addition, the choice of the trust region algorithm used to solve each quadratic subproblem can impact the computational speed and robustness of HDDP, so it might be worthwhile to compare different trust region variants (Levenberg-Marquardt, dog-leg...). Finally, since the examples in this paper are focused on space trajectory problems, it would be useful to investigate the potential of HDDP in different engineering applications, like robotics or chemical engineering.
In summary, the current study and results of the preliminary testing offer the hope that HDDP will prove useful in the increasingly important area of constrained, nonlinear optimal control.
References

1. A. Chinchuluun, P.M. Pardalos, R. Enkhbat, and I. Tseveendorj. Optimization and Optimal Control: Theory and Applications. Volume 39 of Springer Optimization and Its Applications. Springer, 2010.
2. D. H. Jacobson and D. Q. Mayne. Differential Dynamic Programming. Elsevier Scientific, New York, N.Y., 1970.
3. G. J. Whiffen. Static/dynamic control for optimizing a useful objective. Patent No. 6496741, December 2002.
4. C. Colombo, M. Vasile, and G. Radice. Optimal low-thrust trajectories to asteroids through an algorithm based on differential dynamic programming. Celestial Mechanics and Dynamical Astronomy, 105(1):75-112, 2009.
5. P. E. Gill, W. Murray, and M. A. Saunders. SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM Journal on Optimization, 12(4):979-1006, 2002.
6. A. Wachter and L. T. Biegler. On the implementation of a primal-dual interior point filter line search algorithm for large-scale nonlinear programming. Mathematical Programming, 106(1):25-57, 2006.
7. G. Lantoine. A Methodology for Robust Optimization of Low-Thrust Trajectories in Multibody Environments. PhD thesis, School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA, 2010.
8. R. H. Byrd, J. Nocedal, and R. A. Waltz. KNITRO: An integrated package for nonlinear optimization. In Large Scale Nonlinear Optimization, pages 35-59. Springer Verlag, 2006.
9. T. F. Coleman and Y. Li. An interior trust region approach for nonlinear minimization subject to bounds. SIAM Journal on Optimization, 6(2):418-445, 1996.
10. R. Fletcher and S. Leyffer. Nonlinear programming without a penalty function. Numerical Analysis Report NA/195, Department of Mathematics, University of Dundee, Scotland, 1997.
11. S. Jain. Multiresolution Strategies for the Numerical Solution of Optimal Control Problems. PhD thesis, School of Aerospace Engineering, Georgia Institute of Technology, Atlanta, GA, 2008.
12. J. L. Arsenault, K. C. Ford, and P. E. Koskela. Orbit determination using analytic partial derivatives of perturbed motion. AIAA Journal, 8:4-12, 1970.
13. J. A. Sims, P. Finlayson, E. Rinderle, M. Vavrina, and T. Kowalkowski. Implementation of a low-thrust trajectory optimization algorithm for preliminary design. Paper AIAA-2006-674, AAS/AIAA Astrodynamics Specialist Conference and Exhibit, Keystone, CO, August 2006.
14. R. P. Russell and C. A. Ocampo. Optimization of a broad class of ephemeris model Earth-Mars cyclers. Journal of Guidance, Control, and Dynamics, 29(2):354-367, 2006.
15. G. Lantoine and R. P. Russell. A fast second-order algorithm for preliminary design of low-thrust trajectories. Paper IAC-08-C1.2.5, 59th International Astronautical Congress, Glasgow, Scotland, Sep 29 - Oct 3, 2008.
16. M. J. D. Powell. Algorithms for nonlinear constraints that use Lagrangian functions. Mathematical Programming, 14:224-248, 1978.
17. R. Bate, D. Mueller, and J. White. Fundamentals of Astrodynamics. Dover Publications, New York, 1971.
18. E. T. Pitkin. Second transition partial derivatives via universal variables. Journal of the Astronautical Sciences, 13:204, January 1966.
19. T. Dargent and V. Martinot. An integrated tool for low thrust optimal control orbit transfers in interplanetary trajectories. In Proceedings of the 18th International Symposium on Space Flight Dynamics, page 143, Munich, Germany, October 2004. German Space Operations Center of DLR and European Space Operations Centre of ESA.