Sequential Linear Programming for Design of Time-Optimal Controllers

Tarunraj Singh (Professor, [email protected]) and Puneet Singla (Assistant Professor, [email protected])
Department of Mechanical & Aerospace Engineering, University at Buffalo, Buffalo, NY 14260

Abstract— This paper presents a sequential linear programming approach for the determination of time-optimal controllers for nonlinear systems. The sequential linear programming solution is used to update the control profile so as to satisfy the terminal conditions for an assumed maneuver time. A univariate minimization approach which brackets the optimal value of the maneuver time, such as the bisection algorithm, is used in an outer loop to converge to the minimum time. The proposed technique is illustrated on two benchmark problems: the attitude control of a spacecraft and the minimum-time control of a robot.

I. INTRODUCTION

There are numerous applications where completing the control objective in minimum time is beneficial. Minimizing the access time of hard disk drives and rapidly maneuvering manufacturing robots for productivity gains are obvious applications of time-optimal controllers. The design of minimum-time controllers for rest-to-rest maneuvers of flexible structures has been studied by various researchers [1]–[4]; these papers dealt exclusively with linear systems. Time-optimal control of nonlinear systems, such as spacecraft attitude control [5], maneuvering robots [6], and the minimum time to climb of an F4 aircraft [7], are some disparate problems which have been studied in the literature. Meier and Bryson [6] proposed a technique which they referred to as the Switch Time Optimization (STO) algorithm, which generates bang-bang control profiles provided the control is parameterized with the right number of switches. They also propose a technique for estimating the number of switches, which is additional effort required before solving for the time-optimal controller. Billimoria and Wie [5] use a multiple-shooting algorithm to determine the time-optimal control profiles for the three-axis reorientation of a rigid spacecraft. Since the structure of the optimal control profile (the number of switches) is unknown, a preprocessing step is necessary to determine it. Since the optimal control is known to be bang-bang, for a three-input system the control is constrained to lie on the surface of a cube. The optimal control problem is modified by

requiring the control to lie on the surface of a unit p-norm ball. As p is increased, the limiting solution approaches the bang-bang solution. A steepest-ascent approach was used to design the optimal angle-of-attack profile for an F4 aircraft to determine the minimum time to climb [7].

It is clear that various techniques have been used to solve the time-optimal control problem for different applications. This motivates the development of a technique which caters to a large class of problems and precludes the requirement of prior knowledge of the structure of the time-optimal control profile. In Section II, a sequential linear programming approach is proposed for the determination of time-optimal control profiles. It is assumed that the sensitivities of the system dynamics with respect to the states and control exist. The sensitivities are used to determine a time-varying linear model, which is used to pose a linear programming (LP) problem so as to exploit the strength of LP solvers. The solution of the LP problem is used to update the initial estimate of the control profile. This process continues until the terminal constraints are satisfied. A bisection algorithm is used to converge to the optimal maneuver time. Section III illustrates the proposed technique on two benchmark problems. The paper concludes with thoughts on extensions of the proposed technique to other optimal control problems.

II. TIME-OPTIMAL CONTROL

The problem of designing time-optimal control profiles to move a nonlinear dynamic system from a set of initial conditions to another set of boundary conditions, with control constraints, can be stated as:

$$\min J = \int_0^{t_f} dt \qquad (1)$$

subject to

$$\dot{x} = f(x, u), \quad x(0) = x_0, \quad x(t_f) = x_f \qquad (2)$$

$$u_l \le u \le u_u, \quad \forall t \qquad (3)$$

Pontryagin's maximum principle states that for a normal system, the time-optimal control profile is bang-bang. Meier and Bryson [6] proposed a technique, referred to as the switch time optimization algorithm, to determine the locations of the switches; this is a computationally expensive approach. Liu and Wie [2] posed a parameter optimization problem for the determination of the time-optimal control profile for the rest-to-rest motion of linear systems. Singh and Vadali [3] proposed a frequency-domain approach to determine the optimal control profile. These approaches result in parameter optimization problems with nonlinear constraints, and gradient-based optimization algorithms can converge to non-global minima. To alleviate this problem, a Linear Programming (LP) problem can be formulated to determine the time-optimal control profile. This does not require any prior knowledge of the number of switches necessary to parameterize the control profile, since the entire control profile is discretized in time and the magnitude of the control at each instant in time is determined by the optimization problem [8]–[10]. In the next section, we briefly discuss the formulation of the LP problem for time-optimal control of linear dynamical systems.

A. LP for Time Optimal Control of Linear Systems

Let us consider the discrete-time state space representation of a linear dynamical system with a sampling time $T_s$:

$$x(k+1) = G x(k) + H u(k), \quad k \in \mathbb{N} \qquad (4)$$

where $x \in \mathbb{R}^n$ is the state vector and $u \in \mathbb{R}^p$ is the control input vector. The state response for the control input vector $u(k)$ is

$$x(k+1) = G^k x(1) + \sum_{i=1}^{k} G^{k-i} H u(i) \qquad (5)$$

where $x(1)$ represents the initial state of the system. To solve the control problem with specified initial and final states, in addition to the final time ($t_f$), the final state constraint can be represented as

$$x(N+1) = G^N x(1) + \sum_{i=1}^{N} G^{N-i} H u(i) \qquad (6)$$

where the maneuver time $t_f = N T_s$ is discretized into $N$ intervals. Eq. (6) can be rewritten in the standard equality constraint form:

$$x(N+1) - G^N x(1) = A u \qquad (7)$$

where

$$A = \begin{bmatrix} G^{N-1} H & G^{N-2} H & \cdots & H \end{bmatrix}, \quad u = \begin{bmatrix} u(1) & u(2) & \ldots & u(N) \end{bmatrix}^T$$

The limits on the control, specified as

$$u_l \le u(k) \le u_u, \quad k \in \mathbb{N} \qquad (8)$$

can be easily included in the problem formulation. Therefore, for a given final time, one can pose the following LP problem to find a feasible solution satisfying the terminal state and control input constraints:

Minimize:
$$a^T u \qquad (9)$$

subject to
$$A u = b, \quad u_l \le u \le u_u \qquad (10)$$

where

$$a = \begin{bmatrix} 0 & 0 & \ldots & 0 \end{bmatrix}^T, \quad A = \begin{bmatrix} G^{N-1} H & \cdots & H \end{bmatrix}, \quad b = x(N+1) - G^N x(1)$$

Since the linear programming formulation of the time-optimal control problem requires the specification of the final time, the determination of the optimal control profile requires that an initial estimate of the maneuver time be used to determine the feasibility of satisfying all the constraints. If the current estimate of the final time results in an infeasible (feasible) problem, the maneuver time is increased (decreased). This process is carried out iteratively to converge to the minimum-time solution, as illustrated in Fig. 1. The bisection algorithm is well suited to converge to the boundary separating the feasible and infeasible regions over the domain of maneuver times. It can be shown that for the bisection algorithm to converge to within a specified tolerance $\epsilon$ of the true solution requires [3]

$$P = \log_2\left(\frac{t_f^U - t_f^L}{\epsilon}\right) \qquad (11)$$

iterations, where $t_f^U$ and $t_f^L$ are the initially specified upper and lower bounds on the estimated maneuver time. This approach has been shown to work well for linear systems. In the following section, an iterative technique motivated by this approach is proposed to design time-optimal control profiles for general nonlinear systems.
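To make the linear-system formulation concrete, the following sketch (not from the paper; the double-integrator plant, N = 100, and SciPy's `linprog` are illustrative assumptions) builds the A matrix of Eq. (7), poses the feasibility LP of Eqs. (9)-(10), and bisects on the maneuver time:

```python
import numpy as np
from scipy.optimize import linprog

def feasible(tf, N=100, x0=np.array([0.0, 0.0]), xf=np.array([1.0, 0.0])):
    """Check whether xf is reachable from x0 in time tf with |u| <= 1."""
    Ts = tf / N
    # Exact ZOH discretization of the double integrator x1' = x2, x2' = u
    G = np.array([[1.0, Ts], [0.0, 1.0]])
    H = np.array([Ts**2 / 2.0, Ts])
    # Columns of A: G^(N-1) H, G^(N-2) H, ..., H   (Eq. (7))
    A = np.zeros((2, N))
    Gp = np.eye(2)
    for i in range(N - 1, -1, -1):      # fill from the last column backwards
        A[:, i] = Gp @ H
        Gp = G @ Gp
    b = xf - np.linalg.matrix_power(G, N) @ x0
    # Zero cost: Eqs. (9)-(10) are a pure feasibility problem
    res = linprog(c=np.zeros(N), A_eq=A, b_eq=b, bounds=[(-1.0, 1.0)] * N)
    return res.success

def min_time(t_lo=0.1, t_hi=10.0, eps=1e-3):
    """Bisect on the maneuver time; the minimum time lies on the
    feasible/infeasible boundary (Eq. (11) gives the iteration count)."""
    while t_hi - t_lo > eps:
        t_mid = 0.5 * (t_lo + t_hi)
        if feasible(t_mid):
            t_hi = t_mid
        else:
            t_lo = t_mid
    return t_hi
```

For this rest-to-rest unit-displacement maneuver the known minimum time is 2 s (bang-bang with a single switch at t = 1 s), which the bisection recovers to within the tolerance.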

B. Sequential LP for Time Optimal Control of Nonlinear Systems

Let us consider a nonlinear system represented by the state space model

$$\dot{x} = f(x, u) \qquad (12)$$

where $x \in \mathbb{R}^n$ is the state vector and $u \in \mathbb{R}^p$ is the control input vector, subject to the following actuation constraints:

$$u_l \le u(t) \le u_u, \quad \forall t \qquad (13)$$

[Fig. 1. Bisection Algorithm. Flowchart: guess upper and lower bounds on $t_f$; discretize the system with $N$ samples in $[0, t_f]$; solve the linear programming problem for a solution which satisfies the boundary conditions; if feasible, decrease the upper bound, else increase the lower bound; repeat until the bounds converge.]

Like any nonlinear optimal control method, the time-optimal control profile is computed iteratively by assuming an initial control profile $u^0(t)$ and determining the corresponding evolution of the states. To determine the update to the control profile, we need a mechanism which exploits the error in the terminal conditions to perturb the current control profile. Linearizing the nonlinear model about the current states, the system dynamics can be written as

$$\dot{x} + \Delta\dot{x} = f(x, u) + \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial u}\Delta u \qquad (14)$$

which can be simplified to

$$\Delta\dot{x} = \frac{\partial f}{\partial x}\Delta x + \frac{\partial f}{\partial u}\Delta u \qquad (15)$$

Assuming a sampling time $T_s$, the continuous-time model can be rewritten in discrete time as

$$\Delta x(k+1) = G_k \Delta x(k) + H_k \Delta u(k), \quad k = 1, 2, \ldots, N \qquad (16)$$

where $\Delta x$ is the perturbation state vector and $\Delta u$ is the perturbation control input. The state response for the control input $\Delta u(k)$ is

$$\Delta x(k+1) = \left(\prod_{i=1}^{k} G_i\right)\Delta x(1) + H_k \Delta u(k) + \sum_{i=1}^{k-1}\left(\prod_{j=i+1}^{k} G_j\right) H_i \,\Delta u(i) \qquad (17a)$$

where $\Delta x(1)$ represents the initial perturbation state of the system and is zero, since the initial conditions are prescribed. To solve the control problem with specified initial and final states, in addition to the final time ($t_f$), the final state constraint can be represented as

$$\Delta x(N+1) = \sum_{i=1}^{N-1}\left(\prod_{j=i+1}^{N} G_j\right) H_i \,\Delta u(i) + H_N \Delta u(N) \qquad (18)$$

A linear programming problem can now be posed as

Minimize:
$$a^T \Delta u \qquad (19)$$

subject to
$$A \Delta u = b, \quad u_l - u \le \Delta u \le u_u - u \qquad (20)$$

where

$$a = \begin{bmatrix} 0 & 0 & \ldots & 0 \end{bmatrix}^T, \quad \Delta u = \begin{bmatrix} \Delta u(1) & \Delta u(2) & \ldots & \Delta u(N) \end{bmatrix}^T$$

$$A = \begin{bmatrix} \left(\prod_{i=2}^{N} G_i\right) H_1 & \left(\prod_{i=3}^{N} G_i\right) H_2 & \cdots & H_N \end{bmatrix}, \quad b = \Delta x(N+1)$$

$\Delta x(N+1)$ is the difference between the terminal states $x(t_f)$ and the desired final states $x_f$. The proposed algorithm is based upon the fact that the solution to the original nonlinear time-optimal control problem posed by Eqs. (1)-(3) can be approximated by solving the aforementioned linear programming problem recursively. It should be noted that solving the LP problem of Eqs. (19)-(20) at each iteration yields a feasible solution for the linearized system dynamics, which differs from the true nonlinear state constraints. We anticipate that at each iteration the linearization error decreases and, finally, we obtain the solution to the original time-optimal problem. The main steps of the proposed algorithm are given below:

1) Guess the bounds for the final time, $t_f^l$ and $t_f^u$.
2) Initialize $t_f = \frac{t_f^l + t_f^u}{2}$, divide the time interval $[0, t_f]$ into a pre-specified number $N$ of intervals, and guess values for the control variables $u(i)$, $i \in [1, N]$, compatible with the actuator constraints of Eq. (3).
3) Integrate the nonlinear system dynamics, Eq. (2), to compute $x(t_f)$. If the terminal state constraints are satisfied, decrease the final time according to the bisection algorithm and go to Step 2.
4) Else, linearize the nonlinear system dynamics and find a feasible solution by solving the LP problem posed by Eqs. (19)-(20).
5) If the solution to the LP problem exists, modify the guess for the control, $u_{new}(i) = u_{old}(i) + \Delta u(i)$, and go to Step 3.
6) Else, increase the value of $t_f$ according to the bisection algorithm and go to Step 2.

Finally, it should be noted that with the proposed algorithm one can always impose the system dynamics constraints using the continuous differential equations without any approximation, while other nonlinear programming algorithms [11], [12] require the discretization of the system dynamics and the constraints to be written as algebraic equations. Hence, one needs to approximate the continuous-time differential equations with discrete-time difference equations and, as a consequence, the optimal solution is accurate only up to the errors introduced by the discretization process.

III. NUMERICAL RESULTS

To illustrate the proposed technique, we consider the following two benchmark problems.

A. Robotic Arm

The first example is a benchmark problem from the COPS set of large-scale nonlinearly constrained optimization problems [13]. The goal is to determine the control effort required to move a robotic arm between two points in minimum time. This problem was first introduced in the thesis of Monika Mössner-Beigel (Heidelberg University) [13]. The dynamics of the system are:

$$L\rho'' = u_1, \quad I_\theta \theta'' = u_2, \quad I_\phi \phi'' = u_3 \qquad (21)$$

where $L$ is the length of the rigid robotic arm, $\rho$ is the length of the arm from the pivot point, and $(\theta, \phi)$ are the horizontal and vertical angles from the horizontal plane. The prime $()'$ represents the derivative with respect to time. $I_\theta$ and $I_\phi$ denote the moments of inertia of the arm and are defined by

$$I_\theta = \frac{(L - \rho)^3 + \rho^3}{3}\sin^2\phi, \quad I_\phi = \frac{I_\theta}{\sin^2\phi} \qquad (22)$$

The geometrical constraints on the state variables and the actuation constraints on the control variables are given by:

$$\rho(t) \in [0, L], \quad |\theta(t)| \le \pi, \quad 0 \le \phi(t) \le \pi$$
$$|u_1| \le 1, \quad |u_2| \le 1, \quad |u_3| \le 1$$

Further, the desired initial and final conditions for the states are:

$$\rho(0) = \rho(t_f) = 4.5, \quad \theta(0) = 0, \quad \theta(t_f) = \frac{2\pi}{3}$$
$$\phi(0) = \phi(t_f) = \frac{\pi}{4}, \quad \rho'(0) = \rho'(t_f) = 0 \qquad (23)$$
$$\theta'(0) = \theta'(t_f) = \phi'(0) = \phi'(t_f) = 0$$

In our implementation, the control variables are discretized using a zero-order hold over $N$ intervals with a uniform time step. The initial $(N+1)$ discrete values for all the
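For illustration, the dynamics of Eqs. (21)-(22) can be coded as follows (a sketch, not from the paper; the arm length L = 5 is an assumed value, chosen only so that ρ(0) = 4.5 lies in [0, L]):

```python
import numpy as np

def robot_arm_accel(rho, phi, u, L=5.0):
    """Accelerations (rho'', theta'', phi'') from Eqs. (21)-(22).
    L = 5.0 is an assumed arm length. The model is singular when
    sin(phi) = 0; the boundary conditions keep phi near pi/4."""
    I_phi = ((L - rho)**3 + rho**3) / 3.0   # Eq. (22), I_phi = I_theta / sin^2(phi)
    I_theta = I_phi * np.sin(phi)**2
    return u[0] / L, u[1] / I_theta, u[2] / I_phi
```

This is the form needed by Step 3 of the algorithm (integrating the nonlinear dynamics) once the second-order equations are rewritten as a first-order state space model.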

control variables were set to zero, and the initial bounds on the final time were set to $t_f^l = 0$, $t_f^u = 20$. Fig. 2(a) shows the computed control effort as a function of time for $N = 300$. As expected, the time-optimal control for the robot arm is bang-bang. Figs. 2(b) and 2(c) show the evolution of the position states $(\rho, \theta, \phi)$ and velocity states $(\rho', \theta', \phi')$ as a function of time, respectively. From these figures, it is clear that the position states are continuously differentiable while the velocity states are only piecewise differentiable, due to the bang-bang nature of the control variables. Further, Fig. 2(d) shows the computed optimal solution (final time, $t_f$) as a function of the number of discretization steps. We mention that all the computations were carried out on a SONY VAIO notebook (2.16 GHz processor, 1 GB RAM) running MATLAB-14. From these plots, it is clear that the optimal solution converges well with a reasonable number of discretization steps ($N = 100$). Finally, Table I compares the proposed algorithm with other nonlinear programming algorithms. The numbers in Table I correspond to the best solutions obtained by these algorithms, as published in Ref. [13]. From this table, it is clear that the proposed algorithm converges to a better solution in terms of the optimal cost (final time, $t_f$) and the number of optimization variables. Further, as mentioned in Ref. [13], LOQO [14] and SNOPT [11], [12] encounter difficulties and could not compute the solution for some particular numbers of discretization steps.

B. Minimum Time Spacecraft Attitude Control

The second problem is the design of time-optimal control profiles for the attitude control of a spacecraft. The problem of controlling the angular motion, or attitude, of a rigid spacecraft has been widely studied in the literature [1], [4]. If the spacecraft is assumed to be symmetric, equipped with three independent actuators along its three principal axes, and its angular motion is parameterized by Modified Rodrigues Parameters (MRP) [15], then the spacecraft dynamics

TABLE I
COMPARATIVE RESULTS FOR ROBOTIC ARM PROBLEM

Algorithm           | Number of Optimization Variables | Optimal Cost (tf) | Constraint Violation
--------------------|----------------------------------|-------------------|---------------------
LOQO                | 910                              | 9.14267           | 4.7 × 10^-11
MINOS               | 3610                             | 9.14108           | 5.7 × 10^-13
SNOPT               | 3610                             | 9.14101           | 2.1 × 10^-11
Proposed Algorithm  | 600                              | 9.14101           | 4.1 × 10^-8

equations are given as:

$$\dot{\sigma} = \frac{1}{4}\left[(1 - \sigma^T\sigma)I + 2[\tilde{\sigma}] + 2\sigma\sigma^T\right]\omega, \quad I_s\dot{\omega} = u$$

where $\sigma$, $\omega$, and $u$ are the $3 \times 1$ vectors of MRPs, spacecraft angular velocity, and external torque acting on the spacecraft, respectively. For simulation purposes, the spacecraft inertia matrix $I_s$ is assumed to be the identity, and the bounds on the state and control variables are:


$$\sigma(0) = [0.577 \;\; 0.333 \;\; 0]^T, \quad \sigma(t_f) = [0.126 \;\; 0.436 \;\; 1.512]^T$$
$$\omega(0) = \omega(t_f) = [0 \;\; 0 \;\; 0]^T, \quad |u_i(t)| \le 1, \; i = 1, 2, 3$$
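As a quick sanity check on the MRP kinematics above, a minimal sketch (not from the paper) of the rate equation $\dot{\sigma} = \frac{1}{4}[(1 - \sigma^T\sigma)I + 2[\tilde{\sigma}] + 2\sigma\sigma^T]\,\omega$:

```python
import numpy as np

def mrp_rate(sigma, omega):
    """MRP kinematics: sigma_dot = 0.25 * B(sigma) @ omega."""
    s = np.asarray(sigma, dtype=float)
    w = np.asarray(omega, dtype=float)
    # [s~]: the skew-symmetric cross-product matrix of sigma
    s_tilde = np.array([[0.0, -s[2], s[1]],
                        [s[2], 0.0, -s[0]],
                        [-s[1], s[0], 0.0]])
    B = (1.0 - s @ s) * np.eye(3) + 2.0 * s_tilde + 2.0 * np.outer(s, s)
    return 0.25 * B @ w
```

At $\sigma = 0$ the matrix $B$ reduces to the identity, so $\dot{\sigma} = \omega/4$, which provides a convenient check when integrating the spacecraft dynamics in Step 3 of the algorithm.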


To compute the time-optimal attitude control, the continuous-time control variables are once again discretized using a zero-order hold over $N$ intervals. The initial discrete values for all the control variables were set to zero and the initial bounds on the final time were set to $t_f^l = 0$, $t_f^u = 20$. Fig. 3(a) shows the computed control effort as a function of time for $N = 400$. As expected, the time-optimal attitude control is bang-bang in nature. Figs. 3(b) and 3(c) show the spacecraft attitude and angular velocity as functions of time, respectively. Once again, as expected, the spacecraft attitude is continuously differentiable while the spacecraft angular velocity is only piecewise differentiable, due to the bang-bang nature of the control variables. Further, Fig. 3(d) shows the computed optimal solution (final time, $t_f$) as a function of the number of discretization steps. From these plots, it is clear that the optimal solution converges well with a reasonable number of discretization steps ($N = 200$) and with a constraint violation of the order of $10^{-8}$.

[Fig. 2. Simulation Results for Robotic Arm Problem: (a) Optimal Control Inputs; (b) Position Level States; (c) Velocity Level States; (d) Optimal Solution ($t_f$) vs. Number of Discretization Steps ($N$).]

IV. CONCLUSIONS

This paper proposed a sequential linear programming approach for the determination of time-optimal controllers for systems with nonlinear dynamics. Evaluating its performance on benchmark problems, it is clear that

[Fig. 3. Simulation Results for Time Optimal Spacecraft Attitude Control Problem: (a) Optimal Control Inputs; (b) Spacecraft Attitude; (c) Spacecraft Angular Velocity; (d) Optimal Solution ($t_f$) vs. Number of Discretization Steps ($N$).]

it outperforms standard nonlinear programming solvers, as delineated by Dolan et al. in the COPS benchmark set [13]. The obvious benefit of this approach is that no prior knowledge of the structure of the control profile is necessary to initiate the algorithm. Finally, the preliminary results presented here provide compelling evidence for the merits of the proposed approach. The authors are currently studying problems which support singular solutions, and will extend this technique to higher-order hold approximations for the control variables between time steps and to optimal control problems besides the time-optimal one.

REFERENCES

[1] G. Singh, P. T. Kabamba, and N. H. McClamroch. Planar, time-optimal, rest-to-rest slewing maneuvers of flexible spacecraft. Journal of Guidance, Control, and Dynamics, 12(1):71–81, 1989.
[2] W. Liu and B. Wie. Robust time-optimal control of uncertain flexible spacecraft. Journal of Guidance, Control, and Dynamics, 15(3):597–604, 1992.
[3] T. Singh and S. R. Vadali. Robust time-optimal control: A frequency domain approach. Journal of Guidance, Control, and Dynamics, 17(2):346–353, 1994.
[4] J. Ben-Asher, J. A. Burns, and E. M. Cliff. Time-optimal slewing of flexible spacecraft. Journal of Guidance, Control, and Dynamics, 15(2):360–367, 1992.
[5] K. D. Billimoria and B. Wie. Time-optimal three-axis reorientation of a rigid spacecraft. Journal of Guidance, Control, and Dynamics, 16(3):446–452, 1993.
[6] E. B. Meier and A. E. Bryson Jr. Efficient algorithm for time-optimal control of a two-link manipulator. Journal of Guidance, Control, and Dynamics, 13(5):859–866, 1990.
[7] A. E. Bryson Jr. Optimal control, 1950 to 1985. IEEE Control Systems Magazine, 16(3):26–33, 1996.
[8] J.-J. Kim and T. Singh. Desensitized control of vibratory systems with friction: Linear programming approach. Optimal Control Applications and Methods, 25(4):165–180, July 2004.
[9] R. Kased and T. Singh. Rest-to-rest motion of an experimental flexible structure subject to friction: Linear programming approach. In AIAA Guidance, Navigation and Control Conference, San Francisco, CA, Aug. 15-18, 2005.
[10] B. J. Driessen. On-off minimum-time control with limited fuel usage: Near global optima via linear programming. Optimal Control Applications and Methods, 27(3):161–168, 2006.
[11] P. E. Gill, W. Murray, and M. A. Saunders. SNOPT: An SQP algorithm for large-scale constrained optimization. SIAM Journal on Optimization, 12(4):979–1006, 2002.
[12] P. E. Gill, W. Murray, M. A. Saunders, and M. H. Wright. User's Guide for NPSOL (Version 4.0): A Fortran Package for Nonlinear Programming. Stanford University, Report SOL 86-2, 1986.
[13] E. Dolan, J. J. Moré, and T. S. Munson. Benchmarking optimization software with COPS 3.0. Technical Report ANL/MCS-273, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL, 2004.
[14] R. J. Vanderbei. LOQO: An interior point code for quadratic programming. Optimization Methods and Software, 11:451–484, 1999.
[15] J. L. Junkins and P. Singla. How nonlinear is it? A tutorial on nonlinearity of orbit and attitude dynamics. Journal of Astronautical Sciences, 52(1-2):7–60, 2004.