Dynamic Inversion of Nonlinear Maps with Applications to Nonlinear Control and Robotics

by

Neil Holden Getz

B.S. (Columbia University) 1987
B.F.A. (California College of Arts and Crafts) 1975

A dissertation submitted in partial satisfaction of the requirements for the degree of Doctor of Philosophy in Electrical Engineering and Computer Sciences in the GRADUATE DIVISION of the UNIVERSITY of CALIFORNIA at BERKELEY

Committee in charge:

Professor Jerrold E. Marsden, Chair
Professor Charles A. Desoer
Professor Andrew K. Packard

1995
The dissertation of Neil Holden Getz is approved:
University of California at Berkeley
1995
Dynamic Inversion of Nonlinear Maps with Applications to Nonlinear Control and Robotics
Copyright 1995 by Neil Holden Getz
Abstract
Dynamic Inversion of Nonlinear Maps with Applications to Nonlinear Control and Robotics

by

Neil Holden Getz

Doctor of Philosophy in Electrical Engineering and Computer Sciences

University of California at Berkeley

Professor Jerrold E. Marsden, Chair

This dissertation introduces the notion of a dynamic inverse of a nonlinear map. The dynamic inverse is used in the construction of a nonlinear dynamical system, called a dynamic inverter, that asymptotically solves inverse problems with time-varying vector-valued solutions. Dynamic inversion generalizes and extends many previous results on the inversion of maps using continuous-time dynamic systems. By posing the dynamic inverse itself as the solution to an inverse problem, we show how one may solve for a dynamic inverse dynamically while simultaneously using it to solve for the time-varying root of interest.

Dynamic inversion is a continuous-time dynamic computational paradigm that may be incorporated into controllers in order to continuously provide estimates of time-varying parameters necessary for control. This allows nonlinear control systems to be posed entirely in continuous time, replacing discrete root-finding algorithms, as well as discrete algorithms for matrix inversion, with integration. Example applications include solving for the intersection of time-varying polynomials, inversion of nonlinear control systems, regular and generalized inversion of fixed and time-varying matrices, polar decomposition of fixed and time-varying matrices, output tracking of implicitly defined reference trajectories, end-effector tracking control for robotic manipulators, and causal approximate output tracking for nonlinear nonminimum-phase systems.

For the problem of output tracking for nonminimum-phase systems, an internal equilibrium manifold is introduced. This manifold is intrinsic to the class of nonlinear nonminimum-phase systems studied.
Approximate output tracking is achieved by constructing a controller that makes a neighborhood of the internal equilibrium manifold attractive and invariant. Dynamic inversion is incorporated into the controller to provide a continuous estimate of the manifold's location, and this estimate enters the tracking control law. We demonstrate, by application to the tracking problem for the inverted pendulum on a cart, that the resulting internal equilibrium controller significantly outperforms a linear quadratic regulator, even when the linearization of the internal equilibrium controller is made identical to the linear quadratic regulator. We also apply internal equilibrium control to the problem of causing a nonlinear, nonholonomic model of a bicycle to track a time-parameterized trajectory in the ground plane while retaining balance.
Professor Jerrold E. Marsden Dissertation Committee Chair
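The continuous-time root-tracking idea summarized in the abstract — replacing discrete root finding with integration — can be illustrated by a minimal numerical sketch. Everything below is an illustrative assumption rather than an example taken from the dissertation: the map F(θ, t) = θ³ − (2 + sin t), the gain µ, and the forward-Euler integration are chosen purely for demonstration. Because this F is strictly increasing in θ, the identity map may serve as the dynamic inverse G, and the dynamic inverter θ̇ = −µ F(θ, t) tracks the moving root θ∗(t) = (2 + sin t)^(1/3) with a small bounded error.

```python
import math

def F(theta, t):
    """Illustrative time-varying map. Its unique real root is
    theta*(t) = (2 + sin t)**(1/3), since F is strictly increasing
    in theta for every t."""
    return theta**3 - (2.0 + math.sin(t))

def dynamic_inverter(theta0, mu=50.0, dt=1e-3, t_end=10.0):
    """Euler-integrate the dynamic inverter theta' = -mu * F(theta, t).
    Here the dynamic inverse G is the identity, which suffices because
    F is monotone in theta; for large mu the state theta(t) tracks the
    time-varying root theta*(t) with small bounded error."""
    theta, t = theta0, 0.0
    while t < t_end:
        theta += dt * (-mu * F(theta, t))
        t += dt
    return theta

# After the initial transient, theta(t) stays near the moving root.
theta_final = dynamic_inverter(theta0=2.0)
true_root = (2.0 + math.sin(10.0)) ** (1.0 / 3.0)
error = abs(theta_final - true_root)
```

Increasing µ tightens the tracking error, mirroring the bounded-error results of Chapter 2; a nontrivial dynamic inverse G would simply be composed with F inside the integrator.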
For Elise
Contents

List of Figures  viii

List of Tables  xiv

1 Introduction  1
   1.1 Motivation  1
   1.2 Dynamic Inversion  2
   1.3 Contributions of this Dissertation  3
   1.4 Overview of the Thesis  4

2 Dynamic Inversion of Nonlinear Maps  7
   2.1 Introduction  7
       2.1.1 An Informal Introduction to Dynamic Inversion  7
       2.1.2 Previous Work  13
       2.1.3 Main Results  15
       2.1.4 Chapter Overview  15
   2.2 A Dynamic Inverse  16
   2.3 Dynamic Inversion  24
       2.3.1 Dynamic Inversion with Bounded Error  24
       2.3.2 Dynamic Inversion with Vanishing Error  32
   2.4 Dynamic Estimation of a Dynamic Inverse  37
   2.5 Generalizations of Dynamic Inversion  48
   2.6 Chapter Summary  50

3 Dynamic Methods for Polar Decomposition and Inversion of Matrices  52
   3.1 Introduction  52
       3.1.1 Previous Work  52
       3.1.2 Main Results  53
       3.1.3 Chapter Overview  54
   3.2 Inverting Time-Varying Matrices  55
       3.2.1 Left and Right Inversion of Time-Varying Matrices  56
   3.3 Inversion of Constant Matrices  57
       3.3.1 A Comment on Gradient Methods  58
       3.3.2 Dynamic Inversion of Constant Matrices by a Prescribed Time  60
   3.4 Polar Decomposition for Time-Varying Matrices  66
       3.4.1 The Lyapunov Map  68
       3.4.2 Dynamic Polar Decomposition  69
   3.5 Polar Decomposition and Inversion of Constant Matrices  75
   3.6 Chapter Summary  80

4 Tracking Implicit Trajectories  81
   4.1 Introduction  81
       4.1.1 Motivation  82
       4.1.2 Previous Work  83
       4.1.3 Main Results  83
       4.1.4 Chapter Overview  84
   4.2 Problem Definition  84
       4.2.1 System Structure  84
       4.2.2 Internal Dynamics  86
       4.2.3 The Output Space  87
       4.2.4 Output-Bounded Internal Dynamics  88
       4.2.5 The Problem  94
   4.3 Tracking Control  95
       4.3.1 Tracking Explicit Trajectories  95
       4.3.2 Estimating the Implicit Reference Trajectory  96
       4.3.3 Estimating Derivatives of Implicit Trajectories  97
       4.3.4 Combined Dynamic Inverter and Plant  99
       4.3.5 An Implicit Tracking Theorem  99
   4.4 An Example of Implicit Tracking  105
       4.4.1 Simulations  106
   4.5 Chapter Summary  111

5 Joint-Space Tracking of Workspace Trajectories in Continuous Time  112
   5.1 Introduction  112
       5.1.1 Previous Work  112
       5.1.2 Main Results  114
       5.1.3 Chapter Overview  114
   5.2 Problem Definition  115
   5.3 Manipulator Tracking Control Methodologies  116
       5.3.1 Workspace Control of Joint-space Trajectories  120
   5.4 Joint-Space Control of Workspace Trajectories  121
   5.5 A Two-Link Example  125
       5.5.1 Tracking the Other Solution  130
   5.6 Chapter Summary  133

6 Approximate Output Tracking for a Class of Nonminimum-Phase Systems  134
   6.1 Introduction  134
       6.1.1 Limitations on Tracking Performance  135
       6.1.2 The Inversion Problem for Nonlinear Systems  138
       6.1.3 How Dynamic Inversion Will Be Used  138
       6.1.4 Previous Work  138
       6.1.5 Differences in Our Approach  140
       6.1.6 Main Results  141
       6.1.7 Chapter Preview  142
   6.2 Jacobian Linearization and Regions of Attraction  143
       6.2.1 Motivation  143
       6.2.2 The Role of Jacobian Linearization in Nonlinear Control  143
       6.2.3 Different Controllers – Same Linearization  144
       6.2.4 Regions of Attraction  144
   6.3 Problem Description  145
       6.3.1 External/Internal Convertible Form  146
       6.3.2 Properties of E/I Convertible Systems  147
       6.3.3 The Linearization at the Origin  150
       6.3.4 The Zero Dynamics  151
       6.3.5 Conversion of Control Systems to External/Internal Convertible Form  153
       6.3.6 Balance Systems  155
       6.3.7 The Regulation and Tracking Problems  157
       6.3.8 A Comment on Normal Form  158
   6.4 Controlling the External Subsystem  159
       6.4.1 The External Tracking Dynamics  159
   6.5 Controlling the Internal Subsystem  161
       6.5.1 The Internal Tracking Dynamics  161
   6.6 The Internal Equilibrium Manifold  163
       6.6.1 Derivatives Along the Internal Equilibrium Manifold  166
   6.7 Approximate Tracking  167
       6.7.1 Error Coordinates  167
       6.7.2 Analysis of the Internal Equilibrium Controller  167
   6.8 Estimation of the Internal Equilibrium Angle  173
   6.9 Tracking for the Inverted Pendulum on a Cart  175
       6.9.1 An Intuitive Description of the Internal Equilibrium Controller  178
   6.10 Simulations  181
       6.10.1 Regulation Results  182
       6.10.2 Tracking Results  194
   6.11 Discussion  198
   6.12 Chapter Summary  199

7 Automatic Control of a Bicycle  200
   7.1 Introduction  200
       7.1.1 Chapter Overview  201
   7.2 The Model  201
       7.2.1 Assumptions on the Model  202
       7.2.2 Reference Frames and Generalized Coordinates  202
       7.2.3 Inputs and Generalized Forces  206
       7.2.4 Constraints  206
   7.3 Equations of Motion  207
       7.3.1 Practical Simplifications  210
       7.3.2 Conversion to External/Internal Convertible Form  211
       7.3.3 Internal Dynamics of the Bicycle  213
   7.4 External Tracking Controller  215
   7.5 Internal Tracking Controller  215
   7.6 Internal Equilibrium Angle  216
       7.6.1 A Dynamic Inverter for the Internal Equilibrium Angle  217
   7.7 Path Tracking with Balance  218
   7.8 Simulations  218
       7.8.1 Straight Path at Constant Speed  219
       7.8.2 Sinusoidal Path  223
       7.8.3 Circle at Constant Velocity  225
       7.8.4 Figure-Eight Trajectory  227
   7.9 Chapter Summary  230

8 Conclusions  231
   8.1 Review  231
   8.2 Observations  233
       8.2.1 Dynamic Time vs. Computational Time  233
       8.2.2 Realization of Dynamic Inverters  234
   8.3 Future Work  235
       8.3.1 Methods for Producing Dynamic Inverses  235
       8.3.2 Differential-Algebraic Systems  235
       8.3.3 Inverse Kinematics with Singularities  235
       8.3.4 Tracking Multiple Solutions  236
       8.3.5 Tracking Optimal Solutions  236
       8.3.6 Control System Design  236

Bibliography  237

A Notation and Terminology  245

B Some Useful Theorems  250
   B.1 A Comparison Theorem  250
   B.2 Taylor's Theorem  250
   B.3 Singularly Perturbed Systems  251
   B.4 Tracking Convergence for Integrator Chains  252
   B.5 A Converse Theorem  253
   B.6 Uniform Ultimate Boundedness  254

C Partial Feedback Linearization of Nonlinear Control Systems  256
List of Figures

2.1 The map F(θ) where θ∗ is the unique solution to F(θ) = 0. The shaded region represents possible values of the function F(θ).  8
2.2 There exists a line passing through (θ∗, 0) of slope β > 0 such that F(θ) (shown gray) is above the line to the right of θ∗ and below the line to the left of θ∗.  9
2.3 Any function F(θ), Lipschitz in θ, that is transverse to the θ-axis at θ∗ and whose values lie in the shaded regions of either of these graphs may be inverted with the dynamic system (2.13).  11
2.4 The function (θ − 1)^3 with root θ∗ = 1. No line of slope β may be drawn through θ∗ as in Figure 2.2.  12
2.5 The function G[w] := sign(w)|w|^(1/4), a dynamic inverse of F(θ) = (θ − 1)^3.  13
2.6 The composition G[F(θ)] = sign((θ − 1)^3)|(θ − 1)^3|^(1/4). Now we can draw a line of slope β = 1/2 (dashed) through (θ∗, 0) = (1, 0), like the line in Figure 2.2. The dotted curve is F(θ) = (θ − 1)^3.  14
2.7 For any y ∈ Br(t1), the constant matrix D1F̃(y, t1)^(−1) provides a dynamic inverse for F(θ, t) over a sufficiently small interval (t0, t2) containing t1. See Theorem 2.2.13.  22
2.8 The function F(θ, t) (2.65) for t = 0 (solid), t = 1/8 (dotted), and t = 3/8 (dashed).  27
2.9 The upper graph shows the solutions of the dynamic inverter (2.66) for µ = 10 (dashed) and µ = 100 (solid). The initial condition was θ(0) = 3. The lower graph shows the estimation error for the dynamic inverter (2.66) using µ = 10 (dashed) and µ = 100 (solid).  28
2.10 Nonlinear circuit element of Example 2.3.4.  29
2.11 The characteristic Va − Vb versus f(Va − Vb) is strictly monotonic, continuous, and lies in the shaded region. A typical curve is shown. See Example 2.3.4.  29
2.12 Circuit realization of a dynamic inverter. See Example 2.3.4.  30
2.13 Effective characteristic (solid) of the dynamic inverter circuit of Figure 2.12. The nonlinear element's characteristic is indicated in gray. See Example 2.3.4.  31
2.14 The top graph shows solutions of the dynamic inverter (2.89) with E(θ, t) = θ̇∗(t) for µ = 10 (dashed) and µ = 100 (solid), with the actual solution θ∗(t) (dotted). The initial condition was θ(0) = 3. The bottom graph shows the corresponding estimation error.  35
2.15 The top graph shows the state trajectory θ(t) (solid) of the dynamic inverter (2.92), along with the solution θ∗(t) (dotted). The bottom graph shows the error norm |θ(t) − θ∗(t)|.  37
2.16 The solution of interest in Example 2.4.7, θ∗(t) = (x∗(t), y∗(t)), is the intersection (to the right of (0, 0)) of the two cubic curves shown in each of the graphs. This figure shows the pair of cubic curves (2.115) for t ∈ {0, 1, ..., 5}.  43
2.17 The solution of the dynamic inverter of Example 2.4.7 for F(θ, t) = 0, where θ = (x, y). The upper graph shows x(t) versus t (solid) and y(t) versus t (dashed). The lower graph shows x(t) versus y(t) with the initial condition (x(0), y(0)) = (1, 0) marked by the small circle.  45
2.18 The estimation error for the dynamic inverter of Example 2.4.7 as seen through F (2.116), log10 ‖F(θ(t), t)‖∞ versus t in seconds. See Example 2.4.7.  46
2.19 The closed-loop system with dynamic inversion compensator (2.131) with state (Γ, u) and the nonlinear plant (2.122) with state x.  48
3.1 The matrix homotopy H(t).  61
3.2 The matrix homotopy H(t) from I to M with the corresponding solution Γ∗(t), the inverse of H(t).  62
3.3 The homotopy from I to M must remain in GL+(n, R) to be invertible.  63
3.4 Elements of A(t) (see (3.50)). See Example 3.4.3.  73
3.5 Elements of x (top), and Γ (bottom). See Example 3.4.3.  73
3.6 The error log10(‖x̂(t)Λ(t)x̂(t) − I‖∞) indicating the extent to which x̂ fails to satisfy x̂Λ(t)x̂ − I = 0. The ripple from t ≈ 1.8 to t = 8 is due to numerical noise. See Example 3.4.3.  74
3.7 Λ(t) is positive definite and symmetric for all t ∈ [0, 1].  76
3.8 Elements of x(t) (top) and Γ(t) (bottom), for Example 3.5.3.  78
3.9 The base-10 log of the error ‖x̂(t)M M^T x̂(t) − I‖∞, for Example 3.5.3.  79
4.1 Schematic of (4.11).  85
4.2 The cart and ball system.  88
4.3 Three equilibria of the cart and ball.  89
4.4 If ‖(y(t), ẏ(t), ÿ(t))‖ is kept sufficiently small for all t ≥ 0, then the ball remains in the bowl.  90
4.5 Output-bounded internal dynamics.  90
4.6 Some cart-ball systems that do not have output-bounded internal dynamics.  91
4.7 Some cart-ball systems that do have output-bounded internal dynamics.  92
4.8 The zero dynamics vector field φ(η) for the zero dynamics (4.27) of Example 4.2.11. The origin of the zero dynamics is unstable, but η(t) is bounded on [0, ∞) when |η(0)| < 3.  93
4.9 The closed-loop control system [C, P].  95
4.10 If ‖(ξ(0), u(0))‖ < ν and ‖(Yd(t), yd^(r)(t))‖ < δ with ν and δ sufficiently small, then convergence of (ξ(t), u(t)) to (Yd(t), yd^(r)(t)) preserves the upper bound ρ on the internal state η(t).  96
4.11 As (ξ(t), u(t)) converges to (Θ∗(t), θ∗^(r̄)(t)) it must remain in Bκ.  103
4.12 If δ and ν are not sufficiently small, then (ξ(t), u(t)) may converge exponentially to (Θ∗(t), θ∗(t)) but leave the ball Bκ at some time.  104
4.13 If (ξ(0), u(0)) is in Bν, and η(0) is in Bρ, then ‖η(t)‖ < ρ for all t ≥ 0. Bκ is the ball in which (ξ(t), u(t)) must remain in order that η(t) remain in Bρ. Compare to Figure 4.5.  104
4.14 Top: The output y(t) (solid), as well as the implicit reference trajectory θ∗ (dotted), and its estimate θ(t) (dashed) for the simulation of Example 4.4.1. Bottom: The output tracking error y(t) − θ∗(t).  108
4.15 The internal state η(t) for the simulation of Example 4.4.1.  109
4.16 The top graph shows the estimation error θ(t) − θ∗(t) for Example 4.4.1. The bottom graph shows the estimation error Γ(t) − Γ∗(t).  109
4.17 The top left graph shows the phase plot of ξ1 versus ξ2. The top right graph shows θ∗ versus θ̇∗. The lower left graph shows θ versus E1(Γ, θ, t). The lower right graph shows the tracking error phase, ξ1 − θ∗ versus ξ2 − θ̇∗. The symbol 'o' marks the initial conditions for each plot.  110
5.1 A sequence of poses {xd(tk)} along the workspace trajectory is inverted via an inverse-kinematics algorithm. The resulting sequence of joint-space points {θd(tk)} is then splined to form θ̃(t).  117
5.2 The black curve on the left corresponds to the desired end-effector trajectory xd(t). The black dots on the left correspond to points of xd(t) at a discrete sequence of times t1 < t2 < t3 < t4. The black curve on the right corresponds to the inverse kinematic solution θd(t) satisfying F(θd(t)) = xd(t). The black dots on the right correspond to the inverse kinematic solutions θd(tk) satisfying F(θd(tk)) = xd(tk). The white curve on the right corresponds to a time-parameterized spline θ̃(t) through the sequence {θd(tk)}. The white curve on the left is F(θ̃(t)). Note that the error between xd(t) and F(θ̃(t)) is non-uniform, going to zero at the sample points and diverging from xd(t) away from the sample points.  119
5.3 The four robot control strategies are each represented by one of the four arrows. This chapter presents a JCWT strategy, indicated by the black arrow.  121
5.4 A two-link robot arm with joint angles θ = (θ1, θ2), joint torques τ = (τ1, τ2), end-effector position x, desired end-effector position xd, link lengths l1 and l2, and link masses m1 and m2, assumed to be point masses.  125
5.5 Two configurations corresponding to the same end-effector position.  127
5.6 The top left graph shows convergence of the workspace paths: F(θ) (solid), F(θ̄) (dashed), and F(θ∗) (dotted) corresponding to the initial conditions of Table 5.1. The top right graph shows convergence of the joint-space paths: θ (solid), θ̄ (dashed), and θ∗ (dotted). For the top graphs the symbol 'o' marks the initial condition for each trajectory. Of the two bottom graphs, the upper shows the l2-norm of the estimation error e_est = θ̄(t) − θ∗(t), and the lower shows the norm of the tracking error e_track = (θ(t), θ̇(t))^T − (θ∗(t), θ̇∗(t))^T.  129
5.7 The top left graph shows convergence of the workspace paths: F(θ) (solid), F(θ̄) (dashed), and F(θ∗) (dotted) for the other inverse kinematic solution, corresponding to the initial conditions of Table 5.2. Note that the path F(θ∗) is a periodic curve of period 2. The top right graph shows convergence of the joint-space paths: θ (solid), θ̄ (dashed), and θ∗ (dotted) for the other inverse kinematic solution. For the top graphs the symbol 'o' marks the initial condition for each trajectory. Of the two bottom graphs, the upper shows the l2-norm of the estimation error e_est = θ̄(t) − θ∗(t), and the lower shows the l2-norm of the tracking error e_track = (θ(t), θ̇(t))^T − (θ∗(t), θ̇∗(t))^T.  131
5.8 An irregular joint geometry.  133
6.1 A balancing cart-ball system.  135
6.2 A reference trajectory yd(t) such that ẏd(t) ≥ 0 for all t ≥ 0, and such that sup_{t≥0} ‖[yd(t), ẏd(t), ÿd(t)]^T‖ < ε. For this graph t1 = 1, k = 1/(2π). In general we assume t1 > 0 is unknown.  137
6.3 An external/internal convertible system.  147
6.4 The external subsystem Σext(u) of Σ(u) (see also Figure 6.3).  147
6.5 The internal subsystem Σint(x, u) of Σ(u) (see also Figure 6.3).  148
6.6 The plant Σ(u) reconstructed from the internal and external subsystems.  148
6.7 The zero dynamics of Σ(u).  152
6.8 The interconnection of plant Σ(u) and compensator C(v).  157
6.9 The internal tracking controller.  162
6.10 When f(x, α) and g(x, α) are independent of x, then αe may be regarded as a time-varying function of vext.  166
6.11 The internal equilibrium controller causes the error [ex^T, eα^T]^T to converge toward 0 exponentially until it reaches the ball Bb ⊂ R^n. See Proposition 6.7.4.  171
6.12 Inverted pendulum on a cart.  175
6.13 Regulation of the inverted pendulum. The internal equilibrium manifold E(t) is outlined in bold gray in the lower graph. The actual (x1, x2, α1) trajectory of the pendulum is indicated in black, and its projection (x1, x2, αe) onto E(t) is shown in gray.  180
6.14 Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, α1(0) = 10°, α2(0) = 0. Since yd ≡ 0, (ex1, ex2) = (x1, x2).  184
6.15 Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, α1(0) = 20°, α2(0) = 0.  185
6.16 Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, α1(0) = 50°, α2(0) = 0.  186
6.17 Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, α1(0) = 60°, α2(0) = 0.  187
6.18 Regulation trial for initial conditions x1(0) = 0, x2(0) = 0, α1(0) = 85°, α2(0) = 0.  188
6.19 Regulation trial for initial conditions x1(0) = 1, x2(0) = 0, α1(0) = 0, α2(0) = 0.  190
6.20 Regulation trial for initial conditions x1(0) = 8, x2(0) = 0, α1(0) = 0, α2(0) = 0.  191
6.21 Regulation trial for initial conditions x1(0) = 16, x2(0) = 0, α1(0) = 0, α2(0) = 0.  192
6.22 Regulation trial for initial conditions x1(0) = 64, x2(0) = 0, α1(0) = 0, α2(0) = 0.  193
6.23 Tracking trial for initial conditions x(0) = 0, α(0) = 0, with yd(t) = sin(0.2πt), a 0.1 Hz sinusoid.  195
6.24 Tracking trial for initial conditions x(0) = 0, α(0) = 0, with yd(t) = sin(0.4πt), a 0.2 Hz sinusoid.  196
6.25 Tracking trial for initial conditions x(0) = 0, α(0) = 0, with yd(t) = sin(πt), a 0.5 Hz sinusoid.  197
7.1 Side view of the bicycle model with α = 0.  202
7.2 Bicycle model rolled away from upright by angle α. In this figure α is negative.  203
7.3 Leaning bicycle showing the relationship between the steering angle φ and the steering shaft angle ψ. Note that in the figure the roll angle α is negative.  204
7.4 The bicycle model showing body velocities vr and v⊥. Note that the roll angle α in the figure is negative.  205
7.5 Top view of the rear wheel showing the relationships among vr, v⊥, ẋ, and ẏ.  205
7.6 Velocity geometry for constraints.  207
7.7 Target path (xd, yd) = (5t, 0) [m]. The x and y scales are in meters. The bicycle's path in the plane (solid) with the desired straight path (dotted).  220
7.8 Target path (xd, yd) = (5t, 0) meters. The top graph shows the tracking error ‖(x, y) − (xd, yd)‖2 versus t. The second graph shows the steering angle φ. The third graph shows the rear-wheel velocity vr (solid) with desired rear-wheel velocity vrd (dotted). The fourth graph shows the roll angle α (solid) with internal equilibrium roll angle αe (dotted).  221
7.9 Internal equilibrium control causes the bicycle to steer itself so that its roll angle α converges to a neighborhood of the equilibrium roll angle αe, shown as a dashed line.  222
7.10 Sinusoidal target path (xd(t), yd(t)) = (5t, sin((π/5)t)) [m]. The bicycle's path in the plane (solid) with the desired sinusoidal path (dotted).  223
7.11 Sinusoidal target path (xd, yd) = (5t, 2 sin(0.2πt)). The top graph shows the tracking error ‖(x, y) − (xd, yd)‖2. The bottom three graphs show the steering angle φ, the rear-wheel velocity vr (solid) with desired rear-wheel velocity (dotted), and the roll angle α (solid) with internal equilibrium roll angle αe (dotted).  224
7.12 Circular target trajectory with radius 8 meters and tangential velocity 5 meters per second. The first 10 seconds of the bicycle's path in the plane (solid) with the desired circular path (dotted).  225
7.13 Circular target trajectory with radius 8 meters and tangential velocity 5 meters per second. The top graph shows the tracking error ‖(x, y) − (xd, yd)‖2. The bottom three graphs show the steering angle φ, the rear-wheel velocity vr (solid) with desired rear-wheel velocity (dotted), and the roll angle α (solid) with internal equilibrium roll angle αe (dotted).  226
7.14 The bicycle's path in the plane (solid) with the desired figure-eight path (dotted).  228
7.15 Figure-eight target trajectory. The top graph shows the tracking error ‖(x, y) − (xd, yd)‖2. The bottom three graphs show the steering angle φ, the rear-wheel velocity vr (solid) with desired rear-wheel velocity (dotted), and the roll angle α (solid) with internal equilibrium roll angle αe (dotted).  229
List of Tables

4.1 Initial Conditions for the implicit tracking controller simulation. 106
4.2 Parameters for the implicit tracking controller simulation. 107
5.1 The table on the left shows parameters for the simulation of implicit tracking control of a two-link robot arm. The table on the right shows initial conditions. All angles are in radians. 128
5.2 Initial Conditions for the simulation of implicit tracking control of the other solution for a two-link robot arm. All angles are in radians. 130
6.1 Initial conditions for regulation simulations. An asterisk '*' indicates that the corresponding initial conditions are in the region of attraction of the origin for the particular controller. 182
7.1 Physical and gain parameters for the simulations. 219
7.2 Initial conditions for a straight trajectory at constant speed. 219
7.3 Initial conditions for the sinusoidal trajectory at constant speed. 223
7.4 Initial conditions for following a circular trajectory. 225
7.5 Initial conditions for following the figure-eight trajectory. 227
Acknowledgements

Many people have given me help, guidance, freedom, access, and inspiration in my years in graduate school. To all of them I am very grateful. I wish to thank Geneviève Thiébaut, Heather Levien, Mary Byrnes, Carol Block, Chris Colbert, Susan DeVries, Tito Gatchalian, Diane Hsuing, and Jeff Wilkinson for their advice and good-natured support in guiding me through various bureaucratic hurdles of graduate school. I am grateful to my friends Adam Schwartz and Shahram Shahruz for their encouragement and suggestions on my work, and to Ed Nicolson and Steve Burgett for making themselves available for my many questions about LaTeX. Thanks to Max Holm who, through his dedication and perseverance, has kept the robotics lab computer network running. Thanks too to other friends and office-mates, past and present, who have given me the pleasure of their company and companionship. I thank members of my qualifying exam committee — Professors Seth Sanders, Hami Kazerooni, Ron Fearing — who, through their questions, suggestions, and interest gave me that sense of validation that a graduate student needs to cross the threshold from the classical into the unknown. I thank Professors John Wawrzynek, Leon Chua, Shankar Sastry, Alan Lichtenberg, Max Mendel, Karl Hedrick, and Philip Stark, professors who gave me the freedom to chase after my ideas in various independent-study courses over the years. Thanks too to Professor John Canny for allowing me access to the robotics lab computers. I thank NASA and NSF for funding during Fall 1992 and Summer 1993. I also wish to thank NSF for travel awards which allowed me to travel to conferences in Nagoya and New Orleans. To my parents I am grateful for many things. They have been encouraging and patient throughout my re-education from artist to engineer, tolerating my neglect with an understanding, from their own experience, of the level of work necessary to achieve most things of value.
I have seen little of them since I have been in graduate school. I hope to see much more of them in the future. The most fortunate incident in my graduate school years has been my acquaintance with my fellow engineer Dr. Elise Burmeister. For her companionship, understanding, tolerance, patience, patience, and patience I cannot express my gratitude. For many reasons, I do not believe that I would have survived graduate school to graduation without her tenderness, comfort, and good humor. To her I dedicate this dissertation.
xvi To my readers, I am especially grateful: Professor Andrew Packard has supplied me with encouragement and good will at a time when I needed both. The depth of his understanding of control theory is an inspiration that I will carry with me into my professional years. I look forward to his friendship in the years to come. My research advisor, Professor Jerrold Marsden, has given me encouragement, guidance, and most importantly his interest in my ideas. There are a few electrifying moments in one’s research career where ideas snap together, fusing in a flash of revelation, understanding walking calmly out of the light. These are the moments for which researchers struggle and live. With Jerry as my advisor, I have had more than my share of such moments. In an environment where familiarity is too often passed off as understanding, I have known of no one with a deeper sense of what it means to understand something than he. Where others are ready to move on, thinking that all has been seen and assimilated, Jerry says “Wait! There’s more.” There always is more — much more, and without Jerry we might have missed it. I look forward to many more years of collaboration with him as friend, colleague, and mentor. Finally, I wish to express my profound gratitude to Professor Charles Desoer for countless hours of careful reading, thoughtful suggestions, encouragement, and relentless, though always constructive criticism. Since I first began to study the problem of controlling a bicycle back in the late spring of 1992, he has been my constant guide. There is a fearless spirit in great scientists, an uncompromising determination to stand their ground in the face of ferocious Complexity. I have had the privilege of becoming familiar with that spirit in hours of consultation with Professor Desoer, and the thrill of seeing Complexity in retreat. 
When I asked Professor Desoer to be on my dissertation committee, it was because I knew of no better motivation for careful thought and exposition than knowing that he would be reading and criticizing the product of my efforts. This was the best decision I have made in my graduate career. By both example and instruction he has been my most influential guiding force, teaching me how to think and live like an engineer.
Chapter 1

Introduction

1.1 Motivation

Nonlinear equations arise frequently in nonlinear control as well as in many other areas of applied mathematics.[1] They may appear as constraints on a dynamic system, or as equations whose time-varying roots may be important reference signals upon which a control system relies. Numerical estimates of both fixed and time-varying roots can, however, be problematic. Even for R1 → R1 maps, Newton's method[2] is known to fail, for instance, near local minima and maxima. Newton's method is not even applicable to nondifferentiable functions, and when applied to differentiable functions, the differential F′(θ) must be known; this is not always convenient. Secant and regula falsi methods skirt the need for F′(θ) by estimating F′(θ) from values of F(θi) at successive iterates θi, but they too fail when local maxima and minima are encountered. The bisection method is a very general and robust method for approximating roots, but it does not generalize to multiple dimensions. In fact, for Rn → Rn maps, virtually the only method available for solving for roots is Newton-Raphson.

[1] It is common practice to include linear objects in the set of nonlinear objects. This convention will be obeyed throughout this dissertation. Thus, a linear map is also a nonlinear map, but a nonlinear map is not, in general, linear.
[2] See [GMW81] for a review of the various numerical methods mentioned here.

In applying discrete inversion routines in the context of the control of dynamic systems we are faced with another problem. Numerical techniques for the solution of linear and nonlinear equations are usually performed in discrete time, with a sequence of intermediate computations converging toward a solution. This can be a disadvantage in that the employment of such methods, in the context of the control of continuous-time dynamic systems,
necessitates a discrete-time approach to the combination of computation and control, often making implementations and proofs tedious and difficult.

In the context of the control of nonlinear systems, the problem of inversion arises when one wishes to control the output of a control system to track a desired trajectory. One must then "invert" the control system in order to obtain a state trajectory and control which will produce the desired output. For nonminimum-phase systems, i.e. systems with unstable zero dynamics (see Chapter 4, Definition 4.2.5), such inversion presents some difficult and fundamental problems. Methods exist [DPC94, HMS94] for computing exact and approximate inverse solutions. However, in the nonminimum-phase case the resulting inverse trajectory will not, in general, have an initial condition that corresponds to the initial condition of the control system. One must have complete knowledge of the desired output trajectory and have the freedom to preset initial conditions [BL93] of the control system in order to achieve exact output tracking. Such knowledge is often unavailable, and even when available, presetting of initial conditions is usually not practical.

This dissertation has been motivated by the problem of causal inversion of nonlinear nonminimum-phase systems. Indeed, a concrete motivational problem has been the design of a controller for a simple mathematical model of an autonomous bicycle, where we wish to make the bicycle follow a time-parameterized path in the ground plane without falling over. Knowing that we can stabilize any smooth roll-angle trajectory and rear-wheel velocity, how can we choose the roll-angle trajectory and rear-wheel velocity to produce the desired tracking behavior in the plane? As a practical matter we assume that we can only count on knowing a reference trajectory and its derivatives at the present time, not the entire future of the trajectory.
This immediately rules out the utility of presetting initial conditions. For nonminimum-phase systems, it also rules out the possibility of exact tracking of reference trajectories drawn from an open set [GBLL94]. Consequently we will sacrifice some exactness of output tracking in order to construct a controller that provides approximate inversion with bounded internal state.
1.2 Dynamic Inversion

In this dissertation a continuous-time dynamic method for approximating solutions θ∗(t) to nonlinear equations of the form F(θ, t) = 0 will be presented. We call this method and its resulting computational paradigms dynamic inversion. We will associate with F(θ, t) a dynamic system θ̇ = Φ(θ, t) with the crucial property that an arbitrarily
small neighborhood of θ∗ is exponentially attractive. There will be two cases: one where θ∗ itself is attracting, and one where a region about θ∗ is attracting. We will rely upon the notion that any dynamic system having a stable equilibrium may be regarded as a representation of an analog computational architecture for solving an equilibrium equation. For the special case that F(θ, t) = F(θ) we will see that the method presented in this dissertation is considerably more general than the discrete-time iterative methods mentioned above. Differentiability of F(θ) is not required, though convenient when available. Our method is deterred by neither local minima nor local maxima. It generalizes easily to Rn → Rn maps. We will apply dynamic inversion in order to provide solutions for the
problems mentioned in Section 1.1 above.
1.3 Contributions of this Dissertation

The main new results of this dissertation are:
• A methodology for the construction of continuous-time dynamic systems that solve inverse problems having finite-dimensional time-varying solutions.
• A geometric approach to approximate output tracking for a class of nonlinear nonminimum-phase systems.
• A theorem on the effect of affine disturbances on exponentially stable dynamic systems.
• A useful nonlinear characterization of internal dynamics for nonlinear systems which extends notions of internal stability beyond the usual characterization of stable or unstable zero dynamics.

Application of these results has resulted in further contributions of this dissertation:

• Nonlinear dynamic systems that solve for inverses and polar decompositions of fixed and time-varying matrices.
• A class of tracking controllers which allow nonlinear control systems to track implicitly defined trajectories.
• A control methodology for robotic manipulators which allows tracking of workspace trajectories using gains and errors posed in joint-space.
• A tracking controller for balancing two-wheeled vehicles such as bicycles and motorcycles.
1.4 Overview of the Thesis

This dissertation presents a nonlinear dynamic framework for solving a class of inverse problems and applies this framework to a variety of problems that arise in nonlinear control. The dissertation is organized as follows:

• Chapter 2. Dynamic Inversion of Nonlinear Maps. In this chapter the notion of
a dynamic inverse of a map is introduced. Given an inverse problem, where the inverse solution is posed as the root of a time-dependent map, the dynamic inverse of the map is combined with the map to produce a nonlinear dynamic system whose solution asymptotically approximates the root. We present properties of the class of dynamic inverses of a map which allow coupled inverse problems, problems whose solutions depend on each other, to be combined into a single dynamic system that produces all of the coupled solutions. By posing the dynamic inverse itself as the solution to an inverse problem, we show how both the dynamic inverse and the solution to the inverse problem of interest may be solved simultaneously.
• Chapter 3. Dynamic Inversion and Polar Decomposition of Matrices. In
this chapter we present dynamic methods for the inversion and polar decomposition of fixed and time-varying matrices. Four main results are presented. First we show how a time-varying matrix inverse may be tracked given a good initial guess at the inverse at an initial time. In the second result, we show how a fixed matrix may be dynamically inverted given a good initial guess at its inverse and how, for special classes of matrices, the initial guess need not be close to the solution. We then show how dynamic inversion may be applied in order to produce the polar decomposition, as well as the inverse, of a time-varying matrix. This leads to similar results for fixed matrices, though in the case of fixed matrices we show how inversion and polar decomposition may be achieved in finite time rather than asymptotically.
• Chapter 4. Tracking Implicit Trajectories. Output tracking control for systems
having relative degree and stable zero dynamics is a well understood problem when the reference trajectory to be tracked is posed explicitly. When the trajectory is posed
implicitly, however, no such tracking control methodology exists. Here we present such a methodology, combining dynamic inversion methods of Chapter 2 with more conventional tracking control to produce a dynamic controller for tracking implicit trajectories. We also introduce the concept of output-bounded internal dynamics, which is a nonlinear extension of the more common notion of internal stability usually applied to nonlinear systems and inherited from linear systems. We show that application of the implicit tracking controller preserves output-bounded zero dynamics for tracking of an open set of reference outputs.

• Chapter 5. Joint-Space Tracking of Workspace Trajectories in Continuous Time. In this chapter we consider the problem of tracking control of robotic manipulators. Given a desired end-effector reference trajectory the objective is to apply joint forces and torques so that the path of the end-effector of the manipulator converges to the desired reference trajectory. Two standard approaches are first examined. In one, a discrete inverse kinematics algorithm is applied to points along the reference trajectory to create a sequence of points in joint space. These joint-space points are then splined together and a standard tracking controller controls the joint torques in such a way that the splined joint-space trajectory is followed. In a second method, through differentiation of the forward kinematics map, the dynamic equations of the robotic manipulator are transformed into workspace coordinates. Tracking control is then posed directly in the workspace. In contrast, using results on implicit tracking from Chapter 4, we present a controller which allows continuous-time inversion of the forward kinematics so that gains and errors for tracking control may be posed in the joint space.

• Chapter 6. Approximate Output Tracking for a Class of Nonminimum-Phase Systems. For a significant class of nonlinear control systems, a control which
holds the output to be identically zero results in unstable internal dynamics. For such systems exact tracking of output reference trajectories drawn from an open set is not possible without the ability to preset initial conditions if one also wishes to maintain bounded state trajectories. This chapter offers an approach to tracking control which trades off some accuracy of tracking for internal boundedness and stability. The complete history of the output reference trajectory is not assumed known in advance. An internal equilibrium manifold is defined. It is a submanifold of state space with the special property that if the state of the system is near that manifold, then the output
approximately tracks the output reference trajectory. A controller is presented which causes a neighborhood of the manifold to become attractive and invariant. Thus if the manifold is bounded, then the state is bounded and approximate output tracking is achieved. The internal equilibrium controller is applied to the tracking control of the classical problem of the inverted pendulum on a cart. Comparison to a linear quadratic regulator shows a significant increase in performance. Dynamic inversion is incorporated into the controller to provide a signal used to track the location of the internal equilibrium manifold.

• Chapter 7. Automatic Control of a Bicycle. Based on the results of Chapter 6,
an internal equilibrium controller is constructed for the tracking control of a nonlinear nonholonomic nonminimum-phase model of a bicycle or motorcycle. A simple model of a bicycle is presented. Through nonholonomic reduction and manipulation of the rolling constraints, equations of motion amenable to the techniques of Chapter 6 are obtained. Simulation results verify the theory of Chapter 6 while the bicycle tracks a straight line, a sinusoid, a circle, and a figure-eight.
• Chapter 8. Conclusions. The main results of the dissertation are summarized and a number of ideas and problems for future work are presented.
In a number of appendices we include some reference material for the reader including • Appendix A: notation and definitions. • Appendix B: a number of useful theorems drawn from outside sources. • Appendix C: a review of the subject of feedback linearization of nonlinear control systems.
The results contained in Appendices B and C will be pointed to when needed.
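Before moving on, the continuous-time matrix inversion summarized above for Chapter 3 can be previewed with a small numerical sketch. The particular flow used below, Ẋ = −X(AX − I), is an illustrative candidate chosen for this sketch because it has X = A⁻¹ as an equilibrium; it is not necessarily the construction of Chapter 3, and the matrix A, the initial guess, and the step size are all assumptions made for the demonstration.

```python
# Illustrative sketch (not the dissertation's construction): drive X toward
# the inverse of a fixed 2x2 matrix A with the flow X_dot = -X (A X - I),
# integrated by forward Euler from an initial guess near A^(-1).

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def mat_add(P, Q, s=1.0):
    return [[P[i][j] + s * Q[i][j] for j in range(2)] for i in range(2)]

I2 = [[1.0, 0.0], [0.0, 1.0]]
A = [[2.0, 1.0], [0.0, 1.0]]
A_inv = [[0.5, -0.5], [0.0, 1.0]]   # known inverse of A, used only for checking
X = [[0.6, -0.5], [0.0, 1.1]]       # initial guess near A^(-1) (assumption)

dt = 1e-3
for _ in range(20000):                          # integrate to t = 20
    R = mat_add(mat_mul(A, X), I2, s=-1.0)      # residual R = A X - I
    X = mat_add(X, mat_mul(X, R), s=-dt)        # X <- X - dt * X (A X - I)

err = max(abs(X[i][j] - A_inv[i][j]) for i in range(2) for j in range(2))
print(err < 1e-6)
```

The residual Z = AX − I obeys Ż = −(Z + I)Z under this flow, so a sufficiently small initial residual decays to zero; for an arbitrary initial guess convergence is not guaranteed by this sketch.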
Chapter 2

Dynamic Inversion of Nonlinear Maps

2.1 Introduction

In this chapter we describe a continuous-time dynamic methodology for inverting nonlinear maps. We call this methodology dynamic inversion. Given a map[1] F : Rn × R+ → Rn known, by some means, to have a continuous isolated solution θ∗(t) to F(θ, t) = 0, we associate with F(θ, t) another map G[w, θ, t] which we call a dynamic inverse of F(θ, t). The map G[w, θ, t] is characterized by the property that the dynamic system

    θ̇ = −G[F(θ, t), θ, t]                                  (2.1)

has a solution θ(t) which converges asymptotically to the solution θ∗(t).

[1] We will use the terms "map," "mapping," and "function" interchangeably.

2.1.1 An Informal Introduction to Dynamic Inversion

Dynamic inversion is most easily introduced[2] by first considering the problem of finding the root of a real-valued function on the real line.

[2] The precise hypotheses will be developed starting in Section 2.2 below.

A. Consider the function F : R → R; θ ↦ F(θ) illustrated in Figure 2.1. Assume that we do not know the solution θ∗ to F(θ) = 0, but that we would like to find it using a representation of F(θ). The representation of F(θ) may be, e.g., in the form of a closed-form expression, a combination of table-lookup and interpolation, a physical (non-dynamic) system with an input θ and an output F(θ), or any combination of the above. Assume that
Figure 2.1: The map F(θ) where θ∗ is the unique solution to F(θ) = 0. The shaded region represents possible values of the function F(θ).

we know that a unique solution θ∗ exists in the interval [a, b] ⊂ R. The function F(θ) of Figure 2.1 has a number of features which limit the choices of techniques that may be used
to find its root, θ∗ . It is, in places, not differentiable. It also has minima and maxima at points other than θ∗ . We may even be uncertain about the value of F (θ) for θ in certain
regions of [a, b] as indicated by the shaded region of the graph.[3] We will assume, however,
that F(θ) is Lipschitz continuous on [a, b]. Clearly Newton's method and its variants (e.g. secant method, regula falsi) would fail to find the root of this function if the initial guess at the root is not close to θ∗. However, we make the following claim:

Claim 2.1.1 For any initial value θ0 ∈ [a, b], the solution θ(t) to the dynamic system

    θ̇ = −F(θ)                                              (2.2)

converges to the root θ∗ as t → ∞.
Informal Proof of Claim 2.1.1: Consider a solution of (2.2). Assume that F(θ) is such that a solution θ(t) of (2.2) exists for any θ(0) ∈ [a, b]. If θ(0) = θ∗, then F(θ(0)) = 0,

[3] We assume that there exists some k > 0 such that F(θ) ≤ −k < 0 in the region of Figure 2.1 marked by "?".
Figure 2.2: There exists a line passing through (θ∗, 0) of slope β > 0 such that F(θ) (shown gray) is above the line to the right of θ∗ and below the line to the left of θ∗.

so (2.2) works fine for this case. If θ(0) ∈ [a, θ∗], then the vector field −F(θ) pushes the state to the right, towards θ∗. As long as F(θ(t)) < 0 this will continue to be so. Since F(θ) < 0 for all θ ∈ [a, θ∗), the solution θ(t) will flow to θ∗ as t → ∞. Likewise, if θ(0) ∈ [θ∗, b], then −F(θ) pushes the solution θ(t) left to θ∗ as t → ∞.
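The informal argument is easy to check numerically. The sketch below (mine, not the dissertation's) integrates θ̇ = −F(θ) by forward Euler for an illustrative test function with the sign structure of Figure 2.1: nondifferentiable, non-monotone, strictly negative left of the root and strictly positive right of it. The particular F, the root θ∗ = 0.5, the step size, and the horizon are all assumptions made for the demonstration.

```python
import math

THETA_STAR = 0.5  # assumed root of the illustrative test function below

def F(theta):
    # Nondifferentiable (kinks from |sin|), non-monotone, strictly negative
    # to the left of theta* and strictly positive to the right of it --
    # the sign structure of Figure 2.1.
    z = theta - THETA_STAR
    return z + math.copysign(0.4 * abs(math.sin(3.0 * z)), z)

def theta_root_by_dynamic_inversion(theta0, dt=1e-3, T=20.0):
    # Forward-Euler integration of the dynamic inverter theta_dot = -F(theta).
    theta = theta0
    for _ in range(int(T / dt)):
        theta -= dt * F(theta)
    return theta

# Any initial condition in [a, b] flows to the root theta*.
results = [theta_root_by_dynamic_inversion(t0) for t0 in (0.0, 0.2, 0.9)]
print(results)
```

Newton's method started at a point where the wiggles make F locally decreasing could step away from this root; the flow above cannot, since the vector field always points toward θ∗.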
The argument above suggests that for maps similar to F (θ) in Figure 2.1, i.e. maps
whose values are strictly above the abscissa to the right of θ∗ and strictly below the abscissa to the left of θ∗ , θ(t) → θ∗ asymptotically as t → ∞. We will make an additional claim,
however.
Claim 2.1.2 The convergence θ(t) → θ∗, where θ(t) is the solution of (2.2), is in fact exponential; that is, there exist k1 and k2 in R, 0 < ki < ∞, i ∈ {1, 2}, such that for all t > 0,

    |θ(t) − θ∗| ≤ k1 |θ(0) − θ∗| e^{−k2 t}                  (2.3)
The important feature of F(θ) in Figure 2.1 which allows us to make Claim 2.1.2 is illustrated in Figure 2.2. Note that to the right of the root θ∗, the graph of F(θ) is above a line of slope β passing through (θ∗, 0), and to the left of θ∗, the graph of F(θ) is
below the same line. An equivalent expression of this feature is to define z := θ − θ∗ and say that for all z ∈ [a − θ∗, b − θ∗],

    z F(z + θ∗) ≥ β z²                                      (2.4)
Informal Proof of Claim 2.1.2: Let V(θ) := (1/2)(θ − θ∗)² = (1/2)z². Differentiate V(θ) with respect to t to get

    (d/dt) V(θ) = (θ − θ∗) θ̇ = −z F(θ)                     (2.5)

But from (2.4) we have

    −z F(θ) = −z F(z + θ∗) ≤ −β z²                          (2.6)

Note that

    β z² = 2β · (1/2) z² = 2β V(θ)                          (2.7)

Thus (d/dt) V(θ) ≤ −2β V(θ), and therefore

    V(θ(t)) ≤ V(θ(0)) e^{−2βt}                              (2.8)

Insert the definition of V(θ) into (2.8) to get

    (1/2)(θ(t) − θ∗)² ≤ (1/2)(θ(0) − θ∗)² e^{−2βt}          (2.9)

Multiply (2.9) by 2 and take the positive branch of the square root of both sides of the resulting equation to get

    |θ(t) − θ∗| ≤ |θ(0) − θ∗| e^{−βt},  t ≥ 0               (2.10)

which proves the claimed exponential convergence. ∎
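The bound (2.10) can likewise be verified numerically. In the hedged sketch below, the test function, the value β = 0.7, and the integration parameters are illustrative assumptions; the function is constructed so that the sector condition (2.4) holds with that β.

```python
import math

BETA, THETA_STAR = 0.7, 1.0

def F(theta):
    # Illustrative test function: with z = theta - theta* it satisfies
    # z * F(z + theta*) = z^2 * (1 + 0.3 cos(10 z)) >= BETA * z^2,
    # i.e. the sector condition (2.4) with beta = 0.7.
    z = theta - THETA_STAR
    return z * (1.0 + 0.3 * math.cos(10.0 * z))

# Integrate theta_dot = -F(theta) by forward Euler and check the bound (2.10):
# |theta(t) - theta*| <= |theta(0) - theta*| * exp(-BETA * t) at every step.
dt, theta = 1e-4, 3.0
err0 = abs(theta - THETA_STAR)
ok = True
for k in range(int(10.0 / dt)):
    theta -= dt * F(theta)
    bound = err0 * math.exp(-BETA * (k + 1) * dt)
    ok = ok and abs(theta - THETA_STAR) <= bound * 1.001  # tiny slack for Euler error
print(ok)
```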
We call the dynamic system (2.2) a dynamic inverter for θ∗ since it solves F(θ∗) = 0 for θ∗.

B. Let sign(a) be defined by

    sign(a) = 1 if a > 0,  −1 if a < 0                      (2.11)
If we replace (2.4) by

    z sign(F(b) − F(a)) · F(z + θ∗) ≥ β z²                  (2.12)
Figure 2.3: Any function F(θ), Lipschitz in θ, that is transverse to the θ-axis at θ∗ and whose values lie in the shaded regions of either of these graphs may be inverted with the dynamic system (2.13).

then the dynamic inverter

    θ̇ = −sign(F(b) − F(a)) · F(θ)                          (2.13)

will suffice for inversion of functions of the same form as F(θ) in Figure 2.1 or for functions such as −F(θ).
Consider Figure 2.3. Any Lipschitz continuous function F (θ) which is transverse
to the θ-axis at θ∗ and whose values lie in the gray regions of the figure will be dynamically inverted by (2.13), as long as F(θ) is such that a solution of (2.13) exists for all θ(0) ∈ [a, b]. The proof is similar to the proof above, the essential step in the proof coming from the inequality (2.12).

C. Now suppose that we encounter a function

    F(θ) := (θ − c)³                                        (2.14)
which is graphed in Figure 2.4 for c = 1. This function has a well-defined root θ∗ = c, but there does not exist a β > 0 such that (2.12) holds, i.e. there is no line of constant slope β > 0 such that F(θ) fits in either picture of Figure 2.3. However, consider the following observation:

Observation 2.1.3 If G : R → R; w ↦ G[w], is such that

    G[w] = 0 ⟹ w = 0                                        (2.15)
Figure 2.4: The function (θ − 1)³ with root θ∗ = 1. No line of slope β may be drawn through θ∗ as in Figure 2.2.

then

    {G[w] = 0 and F(θ∗) = 0} ⟹ G[F(θ∗)] = 0                 (2.16)

In other words, if θ∗ is a root of F(θ), then θ∗ is also a root of G[F(θ)]. ∎
Observation 2.1.3 affords us the freedom to generalize our dynamic root-solving method to functions such as (2.14). For instance, let G[w] = sign(w)|w|^{1/3}. Then G[F(θ)] = θ − c, which satisfies (2.12) for any β ∈ (0, 1]. So for dynamic inversion of (2.14) we could use

    θ̇ = −G[F(θ)]                                           (2.17)
Note that neither F(θ) nor G[w] need be Lipschitz continuous, but if G[F(θ)] is Lipschitz in θ, then a unique solution θ(t) is guaranteed to exist. We could also have used G[w] := sign(w)|w|^{1/4}, shown in Figure 2.5. The composition G[F(θ)] is shown in Figure 2.6 along with an appropriate line of slope β = 1/2. In fact there are an infinite number of functions G[w] which satisfy

    z G[F(z + θ∗)] ≥ β z²                                   (2.18)
for some β > 0. We call such a function G[w] a dynamic inverse of F (θ) since, in the context of the dynamic system (2.17), G[w] solves F (θ) = 0 with exponential convergence
Figure 2.5: The function G[w] := sign(w)|w|^{1/4}, a dynamic inverse of F(θ) = (θ − 1)³.

of θ(t) → θ∗.
to solve for the root of F (θ), but the bisection method relies upon the fact that θ∗ divides any continuous interval containing θ∗ into two connected sub-intervals, one in which F (θ) is positive, and the other in which F (θ) is negative. The bisection method is defined only for real-valued functions of one variable. On the other hand dynamic inversion, including the criterion (2.18), generalizes easily to maps F : Rn → Rn as well as maps F (θ, t) which depend on time.
2.1.2
Previous Work Continuous-time dynamic methods of solving inverse problems have been around
a long time. Indeed, if x∗ (t) is an isolated asymptotically stable equilibrium solution4 of x˙ = φ(x, t), then x˙ = φ(x, t) can be regarded as a dynamic inverter for solving φ(x, t) = 0
(2.19)
In the areas of adaptive control [SB89] and optimal control [AM90, BH69] dynamical systems have been used to solve for unknown parameters of physical systems. Most 4
By an equilibrium solution of x˙ = φ(x, t) we mean a solution x∗ (t) that satisfies φ(x∗ (t), t) = 0.
Figure 2.6: The composition G[F(θ)] = sign((θ − 1)³)|(θ − 1)³|^{1/4}. Now we can draw a line of slope β = 1/2 (dashed) through (θ∗, 0) = (1, 0), like the line in Figure 2.2. The dotted curve is F(θ) = (θ − 1)³.

results are for linear systems whose parameters are assumed to be slowly varying. Such results may often be used also for nonlinear systems that can be converted to linear systems through state-dependent coordinate transformations [SB89]. With the recent vogue of designing dynamic systems and circuits thought to mimic certain models of computation in the nervous system [Mea89, CR93], there has been a renewed interest in viewing dynamics as computation. Gradient flows, in particular, have been heavily relied upon in the neural network literature [JLS88]. More recently Brockett [Bro91, Bro89] has shown how continuous-time dynamical systems may be used to sort lists and solve linear programming problems. Bloch [Blo85, Blo90] has shown how Hamiltonian systems may be used to solve principal component and linear programming problems. Helmke and Moore in [HM94] review a broad variety of inverse and optimization problems solvable by continuous-time dynamical systems.

The inverse-kinematics problem, in which one wishes to solve for a θ∗(t) satisfying

    F(θ) − xd(t) = 0                                        (2.20)

has given rise to a number of continuous-time dynamic methods [WE84, TD93, NTV91a, NTV91b, NTV94, Tor90b, Tor90a] of solving inverse problems of the form (2.20). The
notion of a dynamic inverse of a nonlinear map, introduced here, generalizes the role of DF(θ)^{−1} · w and DF(θ)^T · w in those methods. In fact we will see that we may use
dynamic inversion itself to determine a dynamic inverse, while simultaneously using that dynamic inverse to solve for a time-varying root of interest. Also, we have developed dynamic inversion around the inversion of maps of the form F (θ, t) which is considerably more general than F (θ)−xd (t). Differentiability of F (θ, t) is not required though it is useful
when available. In Chapter 5 we will apply our methods to the inverse-kinematics problem, in particular to the problem of controlling a robotic arm to track an inverse kinematic solution. In the present chapter, however, we will give some example applications (see for instance Examples 2.4.7 and 2.4.8) of dynamic inversion which are not of a form amenable to prior techniques of using continuous-time dynamics for their solutions.
2.1.3  Main Results

The main results of this chapter are as follows:

i. We define a dynamic inverse for nonlinear maps.

ii. Using a dynamic inverse, we construct a dynamic system that yields an estimate of the root θ*(t) of F(θ, t) = 0, and we prove that the estimation error is bounded as t → ∞.

iii. We construct a derivative estimator for the root θ*(t), and incorporate that estimator into a dynamic system which estimates θ*(t) with vanishing error as t → ∞.

iv. We construct a dynamic system that dynamically solves for a dynamic inverse as the solution to an inverse problem, while simultaneously using that dynamic inverse to produce an estimator for a root θ*(t), where the estimation error is vanishing.
2.1.4  Chapter Overview

In Section 2.2 we will introduce the formal definition of a dynamic inverse of a map. Then in Section 2.3 we will use the dynamic inverse to derive a continuous-time dynamic estimator for time-varying vector-valued roots of nonlinear time-dependent maps. We will prove two theorems which assert that the resulting estimation error may be made arbitrarily small within an arbitrarily short period of time by the adjustment of a single scalar gain. With one theorem we will assert that the estimation error becomes arbitrarily small in finite time; with the other theorem we will assert exponential convergence of an
estimator to the root as t → ∞. We will then show in Section 2.4 how a dynamic inverse itself may be determined dynamically; that is, we will pose both the dynamic inverse itself and the root we seek as the solution to an equation of the form F(θ, t) = 0. In Section 2.5 we will discuss ways in which dynamic inversion may be generalized to cover a broader set of problems. A number of examples will illustrate the application of dynamic inversion in cases where closed-form solutions are readily available, allowing the reader to verify the theory and operation of dynamic inversion. In Example 2.4.7, however, we apply dynamic inversion to solve for the intersection of two time-varying polynomials, a problem whose quasiperiodic solution is not so readily available in closed form. Example 2.4.8 shows, in a more abstract context, how dynamic inversion may be used to construct a dynamic controller for a nonlinear control system.
2.2  A Dynamic Inverse

We begin by defining the dynamic inverse, a definition central to the development
of the methodology presented in this chapter. The dynamic inverse is defined in terms of the unknown root of a map. Later we will show that a dynamic inverse may be obtained without first knowing the root.

Definition 2.2.1 For F : Rⁿ × R+ → Rⁿ; (θ, t) ↦ F(θ, t), let θ*(t) be a continuous isolated solution of F(θ, t) = 0. A map G : Rⁿ × Rⁿ × R+ → Rⁿ; (w, θ, t) ↦ G[w, θ, t] is called a dynamic inverse of F on the ball B_r := {z ∈ Rⁿ | ‖z‖ ≤ r}, r > 0, if

i. G[0, z + θ*(t), t] = 0 for all t ≥ 0 and z ∈ B_r,

ii. the map G[F(θ, t), θ, t] is Lipschitz in θ, piecewise-continuous in t, and

iii. there is a real constant β, with 0 < β < ∞, such that (Dynamic Inverse Criterion)

zᵀG[F(z + θ*(t), t), z + θ*(t), t] ≥ β‖z‖₂²   (2.21)

for all z ∈ B_r. N
In order to emphasize the association of G with a particular solution θ*(t) of F(θ, t) = 0, we will sometimes say that G is a dynamic inverse of F(θ, t) with respect to the solution θ*(t). In order to emphasize the association of a particular parameter β with G, we will sometimes say that G is a dynamic inverse with parameter β. We will also, at times, restrict the domain of t to some subset of R+. Note that the definition of a dynamic inverse does not involve dynamics, though its significance will be in the dynamic context of a dynamic inverter.

Some easily verified properties of the dynamic inverse that will prove useful are the following:

Property 2.2.2 Positive Scalar Times Dynamic Inverse. If G[w, θ, t] is a dynamic inverse of F(θ, t) with parameter β, then for any real μ > 0, μG[w, θ, t] is a dynamic inverse of F(θ, t) with parameter μβ.
N
Property 2.2.3 Many β’s for Each Dynamic Inverse. If G[w, θ, t] is a dynamic inverse of F (θ, t) with parameter β1 , then for any β2 such that 0 < β2 ≤ β1 , G[w, θ, t] is a dynamic inverse of F (θ, t) with parameter β2 .
N
Property 2.2.4 Stacking Decoupled Dynamic Inverses. Assume that G₁(w₁, θ₁, t) is a dynamic inverse of F₁(θ₁, t) with parameter β₁, and G₂(w₂, θ₂, t) is a dynamic inverse of F₂(θ₂, t) with parameter β₂. Let w = (w₁, w₂) and θ = (θ₁, θ₂). Let G and F be defined by

G[w, θ, t] := (G₁(w₁, θ₁, t), G₂(w₂, θ₂, t)),  F(θ, t) := (F₁(θ₁, t), F₂(θ₂, t))   (2.22)

Then G is a dynamic inverse of F with parameter β = min{β₁, β₂}.
N
Property 2.2.5 Dynamic Inverse at Zero. Let F̃(z, t) := F(z + θ*(t), t) and G̃[w, z, t] := G[w, z + θ*(t), t]. Then G[w, θ, t] is a dynamic inverse of F(θ, t) relative to a solution θ*(t) if and only if G̃[w, z, t] is a dynamic inverse of F̃(z, t) relative to z* = 0.
N
Property 2.2.6 Trivial Dynamic Inverse. If G₁(w, θ, t) is a dynamic inverse of F₁(θ, t), then G₂(w) = w is a dynamic inverse of G₁(F₁(θ, t), θ, t). When G[w] = kw, where k ∈ R, 0 < k < ∞, then G[w] is called a trivial dynamic inverse.
N
It will be proven in the next section that if F (θ, t) has a dynamic inverse G[w, θ, t], then for all initial conditions θ(0) in an open neighborhood of θ∗ (0), the integral curves of the vector field −µG[F (θ, t), θ, t] converge exponentially to a neighborhood of θ∗ (t) as t → ∞.
For the case of a scalar-valued F(θ, t) we have the following lemma:

Lemma 2.2.7 Dynamic Inverse for Scalar Functions. Let F : R × R+ → R; (θ, t) ↦ F(θ, t) be C² in θ and continuous in t for all θ in an interval [a, b]. Let θ*(t) be a continuous isolated solution of F(θ, t) = 0. Assume that there exists an r > 0 and a β > 0 such that

(θ − θ*) sign(F(b, t) − F(a, t)) F(θ, t) ≥ β(θ − θ*)²   (2.23)

for all (θ − θ*) ∈ B_r and all t ∈ R+. Then

G[w] := sign(D1F(θ*(0), 0)) · w   (2.24)

is a constant dynamic inverse of F(θ, t).
Proof of Lemma 2.2.7: Since F(θ, t) is C¹ in θ, D1F(θ, t) is well-defined and continuous in θ and t. Since F(θ, t) is continuous and satisfies (2.23) for all t ∈ R+, sign(F(b, t) − F(a, t)) is well-defined and constant for all t, and, furthermore,

sign(D1F(θ*(t), t)) = sign(F(b, t) − F(a, t))   (2.25)

Therefore

sign(D1F(θ*(t), t)) = sign(D1F(θ*(0), 0))   (2.26)

Thus the sign of D1F(θ*(t), t) is an invariant of the isolated solution θ*(t). Now from (2.23) and (2.25) we have

(θ − θ*) sign(D1F(θ*(0), 0)) F(θ, t) ≥ β(θ − θ*)²   (2.27)

so G[w] of (2.24) is a constant dynamic inverse for F(θ, t). □
Remark 2.2.8 Lemma 2.2.7 tells us that for time-varying scalar-valued C¹ functions, we need only pick a sign to produce a dynamic inverse. Typically one knows an interval [a, b] that brackets the solution. Then one need only evaluate F(a, t₁) and F(b, t₂) for any times t₁ ≥ 0 and t₂ ≥ 0 to determine a dynamic inverse. N
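The recipe of Remark 2.2.8 is easy to sketch numerically. In the fragment below, a sketch only, the map F, the bracket [a, b], the gain μ, and the forward-Euler step are all illustrative choices: the sign is fixed from the bracket endpoints and then used as a constant dynamic inverse in an inverter of the kind constructed in Section 2.3.

```python
import math

# Illustrative scalar map with root theta*(t) = sin(t).
def F(theta, t):
    return theta - math.sin(t)

a, b = -2.0, 2.0
# Evaluate F at the bracket endpoints once to fix the sign (Remark 2.2.8).
s = 1.0 if F(b, 0.0) - F(a, 0.0) > 0 else -1.0

# Dynamic inverter theta' = -mu * s * F(theta, t), using the constant
# dynamic inverse G[w] = s * w of Lemma 2.2.7 (forward-Euler integration).
mu, dt, theta = 50.0, 1e-3, 1.5
for k in range(int(10.0 / dt)):
    theta += dt * (-mu * s * F(theta, k * dt))

err = abs(theta - math.sin(10.0))
```

After the initial transient, the state tracks the moving root to within an error of order ‖θ̇*‖/(μβ), consistent with the bounded-error theorem of Section 2.3.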
Dynamic inverses for affine maps are easily obtained, as illustrated by the following example.

Example 2.2.9 Dynamic Inverse for Affine Maps. Let

F(θ, t) = A(θ − u(t))   (2.28)

where A ∈ Rⁿˣⁿ. Then for any matrix B ∈ Rⁿˣⁿ such that BA is positive definite, G[w, θ, t] = B · w is a dynamic inverse of F. The solution θ* of F(θ, t) = 0 is θ*(t) = u(t). It is clear that

zᵀG[F(z + u(t), t), θ, t] = zᵀB(Az) ≥ σ_min(BA)‖z‖₂²   (2.29)

where σ_min(BA) is the smallest singular value of BA. Note that if A is singular, then F given by (2.28) has no dynamic inverse. If A is non-singular, a possible choice of B is Aᵀ. N
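The criterion (2.29) can be checked directly for a concrete matrix. The 2×2 matrix below is an illustrative choice, not taken from the text; with B = Aᵀ, the minimum of zᵀB(Az) over unit vectors z is the smallest eigenvalue of AᵀA, estimated here by sampling.

```python
import math

# Example 2.2.9 with a concrete 2x2 matrix (an illustrative choice) and
# B = A^T, so that BA = A^T A is positive definite.
A = [[2.0, 1.0], [0.0, 1.0]]
B = [[2.0, 0.0], [1.0, 1.0]]  # A transposed

def criterion(z):
    # z^T B (A z): the left-hand side of the Dynamic Inverse Criterion (2.21)
    Az = [A[0][0] * z[0] + A[0][1] * z[1], A[1][0] * z[0] + A[1][1] * z[1]]
    BAz = [B[0][0] * Az[0] + B[0][1] * Az[1], B[1][0] * Az[0] + B[1][1] * Az[1]]
    return z[0] * BAz[0] + z[1] * BAz[1]

# Estimate the parameter beta as the minimum of the criterion over unit
# vectors z; for B = A^T this is the smallest eigenvalue of A^T A.
beta = min(criterion([math.cos(a), math.sin(a)])
           for a in [2 * math.pi * k / 1000 for k in range(1000)])
```

For this A, AᵀA = [[4, 2], [2, 2]] has eigenvalues 3 ± √5, so the sampled β should be close to 3 − √5 ≈ 0.764 and, in particular, strictly positive.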
We will have occasion in Section 2.4 below to choose a dynamic inverse for one inverse problem that depends on the solution to a different, but related, inverse problem. In such cases the combination of the two inverse problems may be viewed as a single inverse problem through the following property of dynamic inverses.

Property 2.2.10 Stacking Coupled Dynamic Inverses. Assume that G₁(w₁; θ₁, θ₂; t) is a dynamic inverse of F₁(θ₁, θ₂, t) with respect to θ₁*(t) for all θ₂ such that (θ₂ − θ₂*(t)) ∈ B_{r₂}, and G₂(w₂; θ₁, θ₂; t) is a dynamic inverse of F₂(θ₁, θ₂, t) with respect to θ₂*(t) for all θ₁ such that (θ₁ − θ₁*(t)) ∈ B_{r₁}. Let θ := (θ₁, θ₂) and w := (w₁, w₂). Then

G[w, θ, t] := (G₁(w₁; θ₁, θ₂; t), G₂(w₂; θ₁, θ₂; t))   (2.30)

is a dynamic inverse of

F(θ, t) := (F₁(θ₁, θ₂, t), F₂(θ₁, θ₂, t))   (2.31)

with respect to (θ₁*(t), θ₂*(t)) for all (θ₁, θ₂) such that (θ₁ − θ₁*, θ₂ − θ₂*) ∈ B_{r₁} × B_{r₂}. N
Sufficient conditions on F(θ, t) under which a dynamic inverse exists are mild. They are given in the following existence lemma.

Lemma 2.2.11 Sufficient Conditions for Existence of a Dynamic Inverse. For F : Rⁿ × R+ → Rⁿ; (θ, t) ↦ F(θ, t), let θ*(t) be a continuous isolated solution of F(θ, t) = 0. Let F(θ, t) be C² in θ and continuous in t. Assume that the following are true:

i. D1F(θ*(t), t) is nonsingular for all t;

ii. D1F(θ*(t), t) and D1F(θ*(t), t)⁻¹ are bounded uniformly in t;

iii. for all z ∈ B_r, D1²F(z + θ*(t), t) is bounded uniformly in t.

Under these conditions there exists an r > 0 independent of t, and a function G : Rⁿ × Rⁿ × R+ → Rⁿ; (w, θ, t) ↦ G[w, θ, t], such that for each t > 0 and for all θ satisfying θ − θ*(t) ∈ B_r, G[w, θ, t] is a dynamic inverse of F(θ, t).
Proof of Lemma 2.2.11: Let

F̃(z, t) := F(z + θ*(t), t)   (2.32)

Since D1F(θ*, t) is invertible for all t ∈ R+, by the inverse function theorem (see [AMR88], Theorem 2.5.7, page 121), for each t ∈ R+ there exists an open neighborhood N_t ⊂ Rⁿ of the origin, and a function F̃⁻¹ : Rⁿ × R+ → Rⁿ; (w, t) ↦ F̃⁻¹[w, t], such that for all z ∈ N_t,

F̃⁻¹[F̃(z, t), t] = z   (2.33)

Let

G[w, θ, t] := F̃⁻¹[w, t]   (2.34)

If there exists an r > 0 such that

B_r ⊂ N_t for all t ∈ R+   (2.35)

then for all z ∈ B_r

zᵀG[F(z + θ*(t), t), θ, t] = zᵀz = ‖z‖₂²   (2.36)

and we may choose G as a dynamic inverse with β satisfying 0 < β ≤ 1. In the absence of items ii and iii of the hypothesis, there is the possibility that no such r exists, e.g. the
largest ball contained in Nt may be B0 in the limit as t → ∞. Assurance that an r > 0 exists is provided by a proposition of Abraham, et al. [AMR88] (Proposition 2.5.6, page 119) regarding size of the ball on which F˜ (z, t) = 0 is solvable. Though that proposition gives explicit bounds on r based on the explicit uniform bounds on D1 F (θ∗ (t), t), D1 F (θ∗ (t), t)−1 , and D12 F (z + θ∗ (t), t), for our purposes it is enough to know that the existence of such uniform bounds is sufficient to guarantee the existence of an r > 0.
Though Lemma 2.2.11 requires F(θ, t) to be C² in θ at θ = θ*(t), this is only a sufficient condition for the existence of a dynamic inverse. That it is not necessary is indicated by the next example.

Example 2.2.12 Consider the piecewise-linear time-varying function

F(θ, t) = { −(θ − u(t)), θ − u(t) ≥ 0;  −(1/2)(θ − u(t)), θ − u(t) < 0 }   (2.37)

where u(t) is a continuous function of t. The solution to F(θ, t) = 0 is θ*(t) = u(t). Let⁵ G[w, θ, t] = G[w] = −w. Then

zᵀG[F̃(z, t)] = −zᵀF(z + u(t), t) = { −zᵀ(−z), z ≥ 0;  −zᵀ(−z/2), z < 0 } ≥ (1/2)‖z‖₂²   (2.38)

where F̃ is as defined in (2.32), so that 0 < β ≤ 1/2. But F(·, t) is not differentiable at θ = θ*(t). N
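The inequality (2.38) can be spot-checked numerically. The choice u(t) = sin(t) below is illustrative; the check samples z over a ball and t over several periods and confirms that z · G[F(z + u(t), t)] dominates (1/2)z².

```python
import math

# Numerical spot-check of (2.38) for the non-differentiable map (2.37)
# with the illustrative choice u(t) = sin(t) and G[w] = -w.
def F(theta, t):
    d = theta - math.sin(t)
    return -d if d >= 0 else -0.5 * d

ok = all(z * (-F(z + math.sin(t), t)) >= 0.5 * z * z - 1e-12
         for t in [0.1 * k for k in range(63)]
         for z in [-1.0 + 0.01 * j for j in range(201)])
```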
Using the exact inverse of F as a dynamic inverse, as in the proof of Lemma 2.2.11, is not very practical since the exact inverse, though always a dynamic inverse, is normally not known. There is reason for hope, however, in the observation that the criterion that G be a dynamic inverse of F is considerably weaker than the criterion that G be an inverse of F in the usual sense. One might guess that a truncated Taylor expansion for F⁻¹ would be a good candidate for G. That this guess is true is verified in the proof of the following theorem.

⁵Throughout we will use the abuse of notation demonstrated by referring to G[w, θ, t] as G[w] when the value of G depends only on w.
Theorem 2.2.13 Fixed Jacobian Inverse as a Dynamic Inverse. Let θ*(t) be a continuous isolated solution of F(θ, t) = 0, where F(θ, t) is C² in θ and C¹ in t. Let F̃(z, t) := F(θ*(t) + z, t). Assume that D1F̃(0, t) is nonsingular, and that D1²F̃(0, t) is bounded. Let t₁ > 0 be a constant. Then there exist t₀ and t₂, with 0 ≤ t₀ < t₁ < t₂, and an r(t₁) ∈ R, r(t₁) > 0, such that for any y ∈ B_{r(t₁)},

G[w] = D1F̃(y, t₁)⁻¹ · w   (2.39)

is a dynamic inverse of F̃(z, t) for all z ∈ B_{r(t₁)} and all t ∈ (t₀, t₂).
Remark 2.2.14 Theorem 2.2.13 tells us that over a sufficiently small time interval, there is an open set of constant matrices such that if M is in that set, then G[w] := M · w is a dynamic inverse of F (θ, t). See Figure 2.7.
Figure 2.7: For any y ∈ B_{r(t₁)}, the constant matrix D1F̃(y, t₁)⁻¹ provides a dynamic inverse for F(θ, t) over a sufficiently small interval (t₀, t₂) containing t₁. See Theorem 2.2.13.
N

Remark 2.2.15 Nearby Jacobian Inverse as a Dynamic Inverse. We may replace t₁ by t in (2.39) to conclude from Theorem 2.2.13 that D1F(θ(t), t)⁻¹ · w is a dynamic
inverse of F(θ, t) for all t ≥ 0, if θ(t) is sufficiently close to θ*(t) for all t ≥ 0. This will prove particularly important later when we use the dynamic inverse in a dynamic context in order to keep θ(t) close to θ*(t). N
Proof of Theorem 2.2.13: Note that since F̃(0, t) = 0 for all t, if F̃(z, t) is Cᵏ in z, then D2ˡF̃(0, t₁) ≡ 0 for l ≤ k (see Appendix A for explanations of notation). Using this, we expand F̃(z, t) in a Taylor series in both variables to get

F̃(z, t) = D1F̃(0, t₁) · z + O(‖z‖², |t − t₁|‖z‖)   (2.40)

For r > 0, let y ∈ B_r ⊂ Rⁿ and expand D1F̃(0, t₁) about y as

D1F̃(0, t₁) = D1F̃(y, t₁) + O(‖y‖)   (2.41)

Substitute (2.41) into (2.40) to get

F̃(z, t) = D1F̃(y, t₁) · z + f(z, t)   (2.42)

where

f(z, t) = O(‖z‖², |t − t₁|‖z‖, ‖y‖‖z‖)   (2.43)

Now consider the dynamic inverse candidate

G[w] = D1F̃(y, t₁)⁻¹ · w   (2.44)

Left-multiply G[F̃(z, t)] by zᵀ and expand F̃ according to (2.42) to get

zᵀG[F̃(z, t)] = zᵀD1F̃(y, t₁)⁻¹F̃(z, t) = zᵀz + zᵀD1F̃(y, t₁)⁻¹f(z, t)   (2.45)

Choose β ∈ (0, 1). If there exists an r ∈ R+ and an interval (t₀, t₂) containing t₁ such that for all z ∈ B_r and all t ∈ (t₀, t₂),

zᵀD1F̃(y, t₁)⁻¹f(z, t) ≥ (β − 1)‖z‖₂²   (2.46)

then

zᵀG[F̃(z, t)] ≥ β‖z‖₂²   (2.47)

implying that G is a dynamic inverse of F̃ on B_r for t ∈ (t₀, t₂). Since f(z, t) satisfies (2.43),

D1F̃(y, t₁)⁻¹ · f(z, t) = O(‖z‖², |t − t₁|‖z‖, ‖y‖‖z‖)   (2.48)

Thus for each t₁ > 0 there is always a sufficiently small r(t₁) > 0 and a sufficiently small interval (t₀, t₂) such that (2.46) is true for the chosen β. □
Remark 2.2.16 Positive-Definite Combinations with the Jacobian Inverse. Let F(θ, t) be C¹ in θ. Then for any matrix-valued function B(θ, t) ∈ Rⁿˣⁿ such that B is continuous in t and

B(θ*, t)D1F(θ*, t) > 0   (2.49)

G[w, θ, t] = B(θ, t) · w is a dynamic inverse of F(θ, t) for all θ sufficiently close to θ*. This includes as special cases B(θ, t) = D1F(θ, t)⁻¹ and B(θ, t) = D1F(θ, t)ᵀ, where ‖θ − θ*‖ is sufficiently small. N
Though it will often be convenient to choose a linear dynamic inverse, a dynamic inverse need not be linear, as shown by the following two examples.

Example 2.2.17 Nonlinear Dynamic Inverse. Let F(θ, t) = (θ − sin(t))³, so that θ* = sin(t). Note that F(θ, t) fails to satisfy the conditions of Lemma 2.2.11. Let G[w] := sign(w)|w|^{1/3}. Then

zᵀG[F(z + θ*, t)] = zᵀz ≥ ‖z‖₂²   (2.50)

so G[w] is a dynamic inverse of F(θ, t). Note that, though G[w] itself is not Lipschitz in w, G[F(θ, t)] = θ − sin(t), which is Lipschitz in θ and continuous in t; thus G[w] satisfies items ii and iii of the dynamic inverse definition, Definition 2.2.1. N
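The cube-root inverse of Example 2.2.17 also works inside an inverter. In the sketch below, the gain, the Euler step, and the derivative feedforward E(t) = cos(t) (which anticipates the estimators of Section 2.3) are illustrative choices; because G[F(θ, t)] = θ − sin(t) exactly, the error obeys ż = −μz even though D1F vanishes at the root, where a Newton-type method would break down.

```python
import math

# The cubic map of Example 2.2.17 with its signed-cube-root dynamic inverse.
def F(theta, t):
    return (theta - math.sin(t)) ** 3

def G(w):  # sign(w) * |w|^{1/3}
    return math.copysign(abs(w) ** (1.0 / 3.0), w)

mu, dt, theta = 10.0, 1e-3, 1.0
for k in range(int(5.0 / dt)):
    t = k * dt
    # theta' = -mu G[F(theta, t)] + cos(t); here G[F] = theta - sin(t).
    theta += dt * (-mu * G(F(theta, t)) + math.cos(t))

err = abs(theta - math.sin(5.0))
```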
Later in Section 2.4 we will show how a dynamic inverse can be determined dynamically, that is, we will find both the root and the dynamic inverse itself using a single dynamic system.
2.3  Dynamic Inversion

In this section we will use the dynamic inverse to construct a dynamic system
whose state is an estimator for the root θ∗ (t) of F (θ, t) = 0. We present two theorems, Theorem 2.3.1 and Theorem 2.3.5, collectively called the dynamic inversion theorem, covering both the case where we have no estimate of θ˙∗ (t) as well as the case in which we do have such an estimate.
2.3.1  Dynamic Inversion with Bounded Error

Suppose we wish to produce an estimate for the root θ*(t) of a map F : Rⁿ × R+ →
Rn . Assume that we have a rough estimate of θ∗ (0), called θ0 , with θ0 − θ∗ (0) ∈ Br , but no
estimator for θ̇*. Theorem 2.3.1 below tells us that once we have found a dynamic inverse G[w, θ, t], we are guaranteed that there always exists a real μ > 0 such that the solution θ(t) to

θ̇ = −μG[F(θ, t), θ, t],  θ(0) = θ₀   (2.51)
approximates θ*(t) arbitrarily closely in an arbitrarily short period of time. The following theorem is quite general in that it allows us to find roots of continuous, but not necessarily differentiable, nonlinear maps.

Theorem 2.3.1 Dynamic Inversion Theorem – Bounded Error. For F : Rⁿ × R+ → Rⁿ; (θ, t) ↦ F(θ, t), let θ*(t) be a continuous isolated solution of F(θ, t) = 0 for all t ∈ R+. Let G : Rⁿ × Rⁿ × R+ → Rⁿ; (w, θ, t) ↦ G[w, θ, t] be a dynamic inverse of F(θ, t) with parameter β for all θ such that θ − θ*(t) is in B_r for all t ∈ R+. Let E : Rⁿ × R+ → Rⁿ; (θ, t) ↦ E(θ, t) be Lipschitz in θ and piecewise-continuous in t. Assume that there exists a γ ∈ R+ such that

‖E(z + θ*(t), t) − θ̇*(t)‖∞ ≤ γ/2   (2.52)

for all t ∈ R+ and z ∈ B_r. Assume also that θ(0) − θ*(0) ∈ B_r. Then for each μ > 0, the solution θ(t) to

Dynamic Inverter with Bounded Error

θ̇ = −μG[F(θ, t), θ, t] + E(θ, t)   (2.53)

satisfies

‖z(t)‖₂ ≤ ‖z(0)‖₂ e^{−μβt/2},  0 ≤ t ≤ t₁;  ‖z(t)‖₂ ≤ γ/μβ,  t > t₁   (2.54)

where z := θ − θ* and

t₁ = (2/μβ) ln(μβ‖z(0)‖₂/γ)   (2.55)
Proof of Theorem 2.3.1: Let z(t) := θ(t) − θ*(t), and assume μ > 0. Transforming (2.53) to z-coordinates and letting F̃(z, t) := F(z + θ*(t), t) and G̃[w, z, t] := G[w, z + θ*(t), t] gives

ż = −μG̃[F̃(z, t), z, t] + E(z + θ*(t), t) − θ̇*(t)   (2.56)

Since G[F(θ, t), θ, t] and E(θ, t) are Lipschitz in θ and piecewise-continuous in t, a solution z(t), t ∈ R+, exists for (2.53). Let

V(z) = (1/2)‖z‖₂²   (2.57)

Then

V̇(z) = zᵀż = −μzᵀG̃[F̃(z, t), z, t] + zᵀ(E(z + θ*(t), t) − θ̇*(t))   (2.58)

By (2.52),

zᵀ(E(z + θ*(t), t) − θ̇*(t)) ≤ (1/2)γ‖z(t)‖₂   (2.59)

Combining (2.58) and (2.59) along with the assumption that G[w, θ, t] is a dynamic inverse of F̃(z, t) with parameter β, we have

V̇(z) ≤ −μβ‖z(t)‖₂² + (1/2)γ‖z(t)‖₂ = −(1/2)μβ‖z(t)‖₂² − (1/2)μβ‖z(t)‖₂² + (1/2)γ‖z(t)‖₂   (2.60)

Therefore, for z(t) satisfying ‖z(t)‖₂ ≥ γ/μβ,

V̇(z) ≤ −(1/2)μβ‖z(t)‖₂² = −μβV(z)   (2.61)

Let y(t) satisfy y(0) = V(z(0)) and ẏ = −μβy. Then

y(t) = y(0)e^{−μβt} = (1/2)‖z(0)‖₂²e^{−μβt}   (2.62)

By Theorem B.1.1 (see Appendix B), V(z(t)) ≤ y(t). As a consequence,

‖z(t)‖₂ ≤ ‖z(0)‖₂e^{−μβt/2}   (2.63)

for all t such that ‖z(t)‖₂ ≥ γ/μβ. If z(0) ∈ B_{γ/μβ}, then since V̇ ≤ 0 on the boundary of B_{γ/μβ}, z(t) can never leave B_{γ/μβ}. If z(0) ∉ B_{γ/μβ}, then z(t) is guaranteed to enter B_{γ/μβ} no later than t₁, where t₁ is the solution to

γ/μβ = ‖z(0)‖₂e^{−μβt₁/2}   (2.64)

namely (2.55). □
The map E(θ, t) in Theorem 2.3.1 may model an estimator for θ̇*. It may also model errors resulting from the representation of F(θ, t) or the presence of noise.

Remark 2.3.2 Note that differentiability of F(θ, t) is not a requirement for application of Theorem 2.3.1. N
Example 2.3.3 Dynamic Inversion of a Piecewise Linear Map – No Derivative Estimate. Consider the map F : [−4, 4] × R+ → R defined by

F(θ, t) = sin(4πt) + { −4 − θ/2, θ < −2;  3θ/2, −2 ≤ θ ≤ 2;  4 − θ/2, θ > 2 }   (2.65)

as shown by the solid line in Figure 2.8. Clearly F(θ, t) is not differentiable with respect to θ for all θ ∈ [−4, 4].
Figure 2.8: The function F(θ, t) (2.65) for t = 0 (solid), t = 1/8 (dotted), and t = 3/8 (dashed).

The unique solution of F(θ, t) = 0 in [−4, 4] is θ*(t) = −(2/3) sin(4πt). A dynamic inverse of F(θ, t) is G[w, θ, t] = w, corresponding to β = 1. A dynamic inverter for F is then

θ̇ = −μF(θ, t)   (2.66)

for any real constant μ > 0, where F(θ, t) is defined by (2.65). For this example we take E(θ, t) ≡ 0, though in a later example, Example 2.3.7, we will construct and use a non-zero E(θ, t).
Figure 2.9: The upper graph shows the solutions of the dynamic inverter (2.66) for μ = 10 (dashed) and μ = 100 (solid). The initial condition was θ(0) = 3. The lower graph shows the estimation error for the dynamic inverter (2.66) using μ = 10 (dashed) and μ = 100 (solid).
The top graph of Figure 2.9 shows the simulated solutions of (2.66) for θ(0) = 3, with µ = 10 and µ = 100. The simulations were done in Matlab [Mat92] using an adaptive step-size fourth and fifth order Runge-Kutta integrator ode45. Each solution can be seen to
converge to a neighborhood of θ*(t); the higher the value of μ, the smaller the neighborhood and the faster the convergence. The estimation error for each of the simulations is shown in the bottom graph of Figure 2.9. N
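The simulation of (2.66) is easy to reproduce. The sketch below substitutes a fixed-step forward-Euler integrator for the adaptive ode45 integrator used in the text; the step size and horizon are illustrative choices, and the two gains match the figure.

```python
import math

# Forward-Euler simulation of the inverter (2.66) for the map (2.65).
def F(theta, t):
    s = math.sin(4 * math.pi * t)
    if theta < -2:
        return s - 4 - theta / 2
    if theta <= 2:
        return s + 1.5 * theta
    return s + 4 - theta / 2

def simulate(mu, dt=1e-4, T=1.5):
    theta = 3.0  # same initial condition as the figure
    for k in range(int(T / dt)):
        theta += dt * (-mu * F(theta, k * dt))
    return theta

theta_star = -(2.0 / 3.0) * math.sin(4 * math.pi * 1.5)  # = 0 at t = 1.5
err10 = abs(simulate(10.0) - theta_star)
err100 = abs(simulate(100.0) - theta_star)
```

As Theorem 2.3.1 predicts, the residual error shrinks roughly in proportion to 1/μ: the μ = 100 run ends an order of magnitude closer to the root than the μ = 10 run.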
As an analog computational paradigm, it is natural to consider the realization of a dynamic inverter in an analog circuit. Example 2.3.4 A Dynamic Inverter Circuit. Consider a nonlinear circuit element, such as a diode, represented schematically in Figure 2.10.
Figure 2.10: Nonlinear circuit element of Example 2.3.4.

Assume that the circuit element is characterized by

i = f(Va − Vb)   (2.67)
where i is the current through the circuit element, and Va and Vb are the voltages at each end of the circuit element as indicated in Figure 2.10. Assume that the characteristic of the circuit element is continuous, strictly monotonic, and lies in the shaded region of the graph of Figure 2.11.
Figure 2.11: The characteristic Va − Vb versus f (Va − Vb ) is strictly monotonic, continuous, and lies in the shaded region. A typical curve is shown. See Example 2.3.4.
Figure 2.12: Circuit realization of a dynamic inverter. See Example 2.3.4.

Now consider the circuit of Figure 2.12, composed of linear resistors, ideal operational amplifiers⁶, and the nonlinear circuit element with characteristic (2.67). The circuit of Figure 2.12 is composed of a number of standard operational amplifier sub-circuits: an integrator, a current-to-voltage converter, an inverting amplifier, and a differential amplifier⁷. To solve for Vout in terms of Vin note the following:

Vout = −(1/(R6C)) ∫₀ᵗ V3 dt + Vout(0)   (2.68)

where Vout(0) is due to any charge on the capacitor at t = 0,

V3 = (R3/R2)(V2 − V1)   (2.69)

V1 = −R1 i   (2.70)

V2 = −(R5/R4) Vin   (2.71)

i = f(Vout)   (2.72)
⁶For an ideal operational amplifier, the open-loop gain of the amplifier is infinite, and no current flows into the + or − terminals. See [CDK87], page 175.
⁷For a review of the characteristics of such circuits see Chua, Desoer, and Kuh [CDK87], Chapter 4.
Substitute Equations (2.70), (2.71), and (2.72) into Equation (2.69), let

R1 = R5/R4   (2.73)

and then substitute the resulting expression for V3 into (2.68) to get

Vout = −(R3R5/(R2R4R6C)) ∫₀ᵗ (f(Vout) − Vin) dt + Vout(0)   (2.74)

Let

μ = R3R5/(R2R4R6C)   (2.75)

Now differentiate Equation (2.74) to get

V̇out = −μ(f(Vout) − Vin)   (2.76)
The differential equation (2.76) is a dynamic inverter that solves for θ* satisfying f(θ*) = Vin; thus the circuit of Figure 2.12 is a realization of a dynamic inverter for the nonlinear circuit element characterized by (2.67). For sufficiently high μ and ‖V̇in(·)‖∞ sufficiently small, after a transient the relationship between Vin and Vout is approximately characterized by the inverse of the characteristic of the nonlinear circuit element, as indicated in Figure 2.13. The larger the value of μ and the smaller the bound on ‖V̇in(·)‖∞, the better the relation between Vin and Vout approximates the inverse of the nonlinear characteristic.

Figure 2.13: Effective characteristic (solid) of the dynamic inverter circuit of Figure 2.12. The nonlinear element's characteristic is indicated in gray. See Example 2.3.4.

Of course, practical realizations of circuits such as that of Figure 2.12 normally require modification to compensate for temperature fluctuations and non-ideal properties of the operational amplifiers. N
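The behavior summarized in Figure 2.13 can be checked in simulation. In the sketch below, everything is an illustrative assumption: tanh stands in for a strictly monotonic element characteristic satisfying the hypotheses of the example, the gain μ stands in for R3R5/(R2R4R6C), a slow sinusoid is used for Vin, and forward Euler replaces the continuous-time integrator.

```python
import math

# Hypothetical strictly monotonic element characteristic (a stand-in).
def f(v):
    return math.tanh(v)

def vin(t):  # slowly varying input voltage
    return 0.5 * math.sin(0.5 * t)

# The inverter ODE (2.76): vout' = -mu (f(vout) - vin(t)), forward Euler.
mu, dt, vout = 200.0, 1e-3, 0.0
for k in range(int(20.0 / dt)):
    vout += dt * (-mu * (f(vout) - vin(k * dt)))

# After the transient, vout should track the inverse characteristic
# f^{-1}(vin) = atanh(vin), as indicated in Figure 2.13.
err = abs(vout - math.atanh(vin(20.0)))
```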
2.3.2  Dynamic Inversion with Vanishing Error
It is often the case that a differentiable representation of F(θ, t) is available. Under this condition an estimator, E(θ, t), for θ̇* may be obtained. Differentiate F(θ*(t), t) = 0 with respect to t to get the identity

D1F(θ*(t), t) θ̇*(t) + D2F(θ*(t), t) = 0   (2.77)

Solve for θ̇*, and replace θ* by θ to get the derivative estimator

E(θ, t) := −D1F(θ, t)⁻¹ D2F(θ, t)   (2.78)

If F(θ, t) is C² in θ, then E(θ, t) becomes an arbitrarily precise estimator of θ̇*(t) as θ approaches θ*(t). Other approximators E(θ, t) satisfying E(θ, t) → θ̇* as θ → θ* are also possible, as we will see in Section 2.4. Derivative estimators of this sort may be incorporated into dynamic inversion in order to produce an estimator θ(t) for θ*(t) that is not only attracted to a neighborhood of θ*(t) in finite time, as in the case of Theorem 2.3.1, but is attracted to θ*(t) itself as t → ∞. We will formalize this fact in the following theorem.

Theorem 2.3.5 Dynamic Inversion Theorem – Vanishing Error. Let θ*(t) be a continuous isolated solution of F(θ, t) = 0, with F : Rⁿ × R+ → Rⁿ; (θ, t) ↦ F(θ, t). Assume that G : Rⁿ × Rⁿ × R+ → Rⁿ; (w, θ, t) ↦ G[w, θ, t] is a dynamic inverse of F(θ, t) on B_r, for some finite β > 0. Let E : Rⁿ × R+ → Rⁿ; (θ, t) ↦ E(θ, t) be locally Lipschitz in θ and continuous in t. Assume that for some constant κ ∈ (0, ∞), E(θ, t) satisfies

‖E(z + θ*(t), t) − θ̇*(t)‖₂ ≤ κ‖z‖₂   (2.79)

for all z ∈ B_r. Let θ(t) denote the solution to the system

Dynamic Inverter with Vanishing Error

θ̇ = −μG[F(θ, t), θ, t] + E(θ, t)   (2.80)
with initial condition θ(0) satisfying θ(0) − θ*(0) ∈ B_r. Then

‖θ(t) − θ*(t)‖₂ ≤ ‖θ(0) − θ*(0)‖₂ e^{−(μβ−κ)t}   (2.81)

for all t ∈ R+, and in particular if μ > κ/β, then θ(t) converges to θ*(t) exponentially as t → ∞.
Proof of Theorem 2.3.5: Let z(t) := θ(t) − θ*(t), F̃(z, t) := F(z + θ*(t), t), and G̃[w, z, t] := G[w, z + θ*(t), t]. Differentiate z(t) = θ(t) − θ*(t) with respect to t, and substitute (2.80) for θ̇ to get

ż = −μG̃[F̃(z, t), z, t] + E(z + θ*(t), t) − θ̇*(t)   (2.82)

Let

V(z) := (1/2)‖z‖₂²   (2.83)

Differentiate V with respect to t to get

(d/dt)V(z) = −μzᵀG̃[F̃(z, t), z, t] + zᵀ(E(z + θ*(t), t) − θ̇*(t))   (2.84)

Then by Definition 2.2.1 and (2.79),

(d/dt)V(z) ≤ −μβ‖z‖₂² + κ‖z‖₂² = −(μβ − κ)‖z‖₂²   (2.85)

so that for z ∈ B_r,

(d/dt)V(z) ≤ −2(μβ − κ)V(z)   (2.86)

Then by Theorem B.1.1 of Appendix B,

V(z) ≤ V(0)e^{−2(μβ−κ)t}   (2.87)

Substitute the right-hand side of (2.83) for V(z) to conclude

‖z(t)‖₂ ≤ ‖z(0)‖₂e^{−(μβ−κ)t}   (2.88)

for all t ≥ 0. □
Remark 2.3.6 Note that as in Theorem 2.3.1, differentiability of F (θ, t) is not required for application of Theorem 2.3.5, though when F (θ, t) is differentiable we may construct E(θ, t) as in (2.78).
N
Example 2.3.7 Dynamic Inversion of a Piecewise Linear Map – With Derivative Estimate. Let F and G be as in Example 2.3.3. Let E(θ, t) := −(8π/3) cos(4πt), which is the time-derivative of the solution θ*(t) = −(2/3) sin(4πt) of F(θ, t) = 0. Use the dynamic inverter

θ̇ = −μF(θ, t) + E(θ, t)   (2.89)

with the same initial condition as before, θ(0) = 3. The top graph of Figure 2.14 shows the simulation results, and the bottom graph of Figure 2.14 shows the estimation error. In this case the errors can be seen to go to zero exponentially. Note that for both dynamic inverters (2.66) and (2.89), each of the solutions θ(t) passes through the point θ = 2, a local maximum of F, at which F(θ(t), t) is not differentiable. In contrast, Newton's method and the gradient method are undefined for non-differentiable functions, and even if we were to make F(θ, t) differentiable in θ by smoothing it, Newton's method would fail due to the local maximum at θ = 2. N
Figure 2.14: The top graph shows solutions of the dynamic inverter (2.89) with E(θ, t) = θ˙∗ (t) for µ = 10 (dashed) and µ = 100 (solid), with the actual solution θ∗ (t) (dotted). The initial condition was θ(0) = 3. The bottom graph shows the corresponding estimation error.
Though we have assumed knowledge of θ̇*(t) in Example 2.3.7, this is most often not a practical assumption. In the next example we show how the derivative estimator (2.78) may be used in the context of Theorem 2.3.5.

Example 2.3.8 Dynamic Inversion with a Derivative Estimate. Let

F(θ, t) = (2 + sin(t)) tan(θ) − cos(t)   (2.90)

Then D1F(θ, t) = (2 + sin(t)) sec²(θ). Using the derivative estimator (2.78) gives

E(θ, t) = −(sin(t) + cos(t) tan(θ)) cos²(θ) / (2 + sin(t))   (2.91)

Let G[w] = w, which corresponds to the dynamic inverse (2.24). Then a dynamic inverter for θ*(t) is

θ̇ = −μG[F(θ, t)] + E(θ, t)
  = −μ((2 + sin(t)) tan(θ) − cos(t)) − (sin(t) + cos(t) tan(θ)) cos²(θ)/(2 + sin(t))   (2.92)

Figure 2.15 shows the results of a simulation of the dynamic inverter (2.92) for θ(0) = 1. The top graph of Figure 2.15 shows θ(t) and θ*(t) (dotted). The bottom graph shows the absolute value of the error between θ(t) and the solution

θ*(t) = arctan(cos(t)/(2 + sin(t)))   (2.93)

Note that θ(0) ≠ θ*(0). N
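The simulation behind Figure 2.15 can be reproduced directly from (2.92). In the sketch below the gain and forward-Euler step size are illustrative choices; the closed-form solution (2.93) serves as the reference.

```python
import math

# Forward-Euler simulation of the dynamic inverter (2.92).
def F(theta, t):
    return (2 + math.sin(t)) * math.tan(theta) - math.cos(t)

def E(theta, t):  # the derivative estimator (2.91)
    return -(math.sin(t) + math.cos(t) * math.tan(theta)) \
        * math.cos(theta) ** 2 / (2 + math.sin(t))

mu, dt, theta = 50.0, 1e-4, 1.0  # theta(0) = 1, as in the figure
for k in range(int(6.0 / dt)):
    t = k * dt
    theta += dt * (-mu * F(theta, t) + E(theta, t))

# Compare with the closed-form solution (2.93) at t = 6.
err = abs(theta - math.atan(math.cos(6.0) / (2 + math.sin(6.0))))
```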
Remark 2.3.9 Dynamic Inversion with Perfect Initial Conditions. If θ(0) = θ∗ (0), then the conditions of Theorem 2.3.5 guarantee that θ(t) ≡ θ∗ (t) for all t ∈ R+ . So in a sense, we need only solve the inverse problem at a single instant t = 0. Then the dynamic inversion takes care of maintaining the inversion for all t.
N
Remark 2.3.10 Maintenance of a State-Dependent Jacobian Inverse as Dynamic Inverse. Let

G[w, θ, t] := D1F(θ, t)⁻¹ · w   (2.94)

where θ is the state of a dynamic inverter. It follows from Lemma 2.2.11 and Theorem 2.3.5 that if μ is sufficiently large, ‖θ(0) − θ*(0)‖ is sufficiently small, and G[w, θ(0), 0] is a dynamic inverse of F(θ, t) at t = 0, then G[w, θ, t] is a dynamic inverse of F(θ, t) for all t > 0 (see also Remark 2.2.15). N
Sec. 2.4
Dynamic Estimation of a Dynamic Inverse
Figure 2.15: The top graph shows the state trajectory θ(t) (solid) of the dynamic inverter (2.92), along with the solution θ∗(t) (dotted). The bottom graph shows the error norm |θ(t) − θ∗(t)|.

Example 2.3.11 will illustrate application of Remark 2.3.10 to the estimation of θ∗(t).

Example 2.3.11 Dynamic Inversion Using State-Dependent Jacobian Inverse. Let w and θ be in Rn. Assume that the assumptions of Lemma 2.2.11 hold. We may obtain an estimator E(θ, t) for θ̇∗ from (2.78). Assume that r has been chosen sufficiently small, and that D2F(θ, t) is sufficiently bounded, so that E(θ, t) satisfies (2.79) for all z ∈ Br. Let

G[w, θ, t] := D1F(θ, t)⁻¹ · w    (2.95)
and assume that r is small enough that G is a dynamic inverse of F on Br . If (θ(0) − θ∗ (0)) ∈ Br , and µ is sufficiently large, then by Theorem 2.3.5 the approximation error z(t) := θ(t) − θ∗ (t) using (2.80) will converge exponentially to zero.
N

2.4 Dynamic Estimation of a Dynamic Inverse

In this section we will show how we can apply the dynamic inversion theorem to the construction of a dynamic system whose state includes both a dynamic inverse of a
particular F as well as an approximation for the root of F. Consideration of the example of dynamic inversion of a time-varying matrix [GM95b] will lead the way.⁸

Example 2.4.1 Inversion of Time-Varying Matrices. Consider the problem of estimating the inverse Γ∗(t) ∈ Rn×n of a time-varying matrix A(t) ∈ GL(n, R), where GL(n, R) denotes the group of invertible matrices in Rn×n. Assume that we have representations for both A(t) and Ȧ(t), and that A(t) is C¹ in t. Let Γ be an element of Rn×n. In order for Γ∗ to be the inverse of A(t), Γ∗ must satisfy

A(t)Γ∗ − I = 0    (2.96)

Let F : Rn×n × R+ → Rn×n; (Γ, t) ↦ F(Γ, t) be defined by

F(Γ, t) := A(t)Γ − I    (2.97)

As usual we will refer to the solution of F(Γ, t) = 0 as Γ∗(t). To obtain an estimator E(Γ, t) for Γ̇∗(t), differentiate AΓ∗ = I with respect to t, solve the resulting expression for Γ̇∗, replace A⁻¹ by Γ∗, and then replace Γ∗ by Γ in the resulting expression to get

E(Γ, t) := −Γ Ȧ(t) Γ    (2.98)
Differentiate F(Γ, t) with respect to Γ to get

D1F(Γ, t) = A(t)    (2.99)

whose inverse is Γ∗. So a choice of dynamic inverse is

G[w, Γ, t] := Γ · w    (2.100)

for Γ sufficiently close to Γ∗ = A(t)⁻¹ and with w ∈ Rn×n. The dynamic inverter for this problem then takes the form
Γ̇ = −µG[F(Γ, t), Γ] + E(Γ, t)    (2.101)

⁸In Chapter 3 we will cover dynamic matrix inversion in more depth.
or, expanded,

Γ̇ = −µΓ(A(t)Γ − I) − Γ Ȧ(t) Γ    (2.102)

and we choose as initial conditions Γ(0) ≈ Γ∗(0) = A(0)⁻¹ so that the estimation error starts small. Theorem 2.3.5 guarantees that for sufficiently large µ, and for Γ(0) sufficiently close to A(0)⁻¹, equation (2.102) will produce an estimator Γ(t) whose error Γ(t) − Γ∗(t) decays exponentially to zero at a rate determined by our choice of µ. Even if we don't know Ȧ(t), we can, by Theorem 2.3.1, take E(Γ, t) to be identically zero and achieve inversion with a bounded error.
N
Remark 2.4.2 Inversion of Time-Varying Matrices. Example 2.4.1 allows one to invert time-varying matrices without calling upon discrete matrix inversion routines. One need only calculate or approximate a single inverse, A(0)⁻¹. The flow of (2.102) then takes care of the inversion for all t > 0.
N
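The flow (2.102) can be tried directly on a concrete path of matrices. The sketch below (Python with NumPy in place of the original Matlab environment; the particular A(t), gain, step size, and perturbation are hypothetical choices) starts near the exact inverse at t = 0 and maintains it along the flow:

```python
import numpy as np

MU = 20.0  # inverter gain; illustrative

def A(t):     # a hypothetical invertible path in GL(2, R)
    return np.array([[2.0 + np.sin(t), 1.0],
                     [0.0, 3.0 + np.cos(t)]])

def Adot(t):  # its time derivative
    return np.array([[np.cos(t), 0.0],
                     [0.0, -np.sin(t)]])

def Gamma_dot(G, t):
    """Right-hand side of the dynamic inverter (2.102)."""
    return -MU * G @ (A(t) @ G - np.eye(2)) - G @ Adot(t) @ G

# RK4 integration from Gamma(0) close to A(0)^{-1}
G = np.linalg.inv(A(0.0)) + 0.05   # small perturbation of the exact inverse
t, h = 0.0, 1e-3
while t < 5.0:
    k1 = Gamma_dot(G, t)
    k2 = Gamma_dot(G + 0.5 * h * k1, t + 0.5 * h)
    k3 = Gamma_dot(G + 0.5 * h * k2, t + 0.5 * h)
    k4 = Gamma_dot(G + h * k3, t + h)
    G = G + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    t += h

print(np.linalg.norm(G - np.linalg.inv(A(t))))  # error has decayed to near zero
```

No discrete matrix inversion is performed along the way; `np.linalg.inv` appears only to set the initial condition and to verify the result.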
Remark 2.4.2 will be expanded in Section 3.2 of Chapter 3.

Remark 2.4.3 Notation. In the following example and in the remainder of this section we will couple two dynamic inverters together: one which estimates the solution θ∗ of F(θ, t) = 0, and the other which solves for a matrix Γ to be used in a dynamic inverse G[w, Γ] of F(θ, t). In order to distinguish between the map, dynamic inverse, and derivative estimator for each of the two problems, we adopt the convention of referring to the map, dynamic inverse, and derivative estimator for Γ by Fγ, Gγ, and Eγ respectively, retaining the designations F, G, and E for the map, dynamic inverse, and derivative estimator for θ.
N
Example 2.4.4 Obtaining a Dynamic Inverse Dynamically. Assume that F(θ, t) satisfies the assumptions of Lemma 2.2.11 with continuous isolated solution θ∗(t). Assume that D1F(θ, t) is C² in θ and C¹ in t. Let Γ ∈ Rn×n denote an estimator for D1F(θ∗, t)⁻¹.
We may then estimate θ̇∗(t) as follows: differentiate F(θ∗, t) = 0, solve for θ̇∗, and substitute Γ for D1F(θ∗(t), t)⁻¹ and θ for θ∗ to obtain an estimator for θ̇∗ in terms of Γ, θ, and t,

E(Γ, θ, t) := −Γ D2F(θ, t)    (2.103)

Assume that E(Γ, θ, t) is Lipschitz in Γ and θ, and piecewise-continuous in t. Writing E(Γ, θ, t) = [Ei(Γ, θ, t)]_{i∈n}, and proceeding as in (2.98), we estimate Γ̇∗ with

Eγ(Γ, θ, t) := −Γ [ (d/dt) D1F(θ, t) |θ̇=E(Γ,θ,t) ] Γ    (2.104)

where

(d/dt) D1F(θ, t) |θ̇=E(Γ,θ,t) := Σ_{i=1}^{n} (∂D1F(θ, t)/∂θi) Ei(Γ, θ, t) + ∂D1F(θ, t)/∂t    (2.105)

In this case, let

Fγ(Γ, θ, t) := D1F(θ, t)Γ − I    (2.106)

and

Gγ[W, Γ] := Γ · W    (2.107)

as in Example 2.4.1 (see (2.100)), with W ∈ Rn×n.
Theorem 2.3.5 now tells us that we may estimate θ∗(t) with the system of coupled nonlinear differential equations

[ Γ̇ ]        [ Γ  0 ] [ Fγ(Γ, θ, t) ]   [ Eγ(Γ, θ, t) ]
[ θ̇ ]  = −µ [ 0  Γ ] [ F(θ, t)    ] + [ E(Γ, θ, t)  ]    (2.108)

with guaranteed exponential convergence of (Γ, θ) to (Γ∗, θ∗).    N

After a definition, we summarize the result of Example 2.4.4 with a theorem.

Definition 2.4.5 For (Γ, θ) ∈ Rn×n × Rn, define the norm ‖(Γ, θ)‖₂ by

‖(Γ, θ)‖₂ := ( Σ_{i,j=1}^{n} |Γij|² + Σ_{i=1}^{n} |θi|² )^{1/2}    (2.109)
Norm (2.109) is thus the l₂ norm of the matrix [Γ, θ], where we consider θ to be a column vector.    N
Theorem 2.4.6 Dynamic Inversion with Dynamic Determination of a Dynamic Inverse. Let F(θ, t) satisfy the assumptions of Lemma 2.2.11. Then for Γ(0) sufficiently close to D1F(θ∗, 0)⁻¹, and θ(0) sufficiently close to θ∗(0), the solution (Γ(t), θ(t)) of

[ Γ̇ ]        [ Γ  0 ] [ D1F(θ, t)Γ − I ]   [ −Γ ((d/dt) D1F(θ, t) |θ̇=−Γ D2F(θ,t)) Γ ]
[ θ̇ ]  = −µ [ 0  Γ ] [ F(θ, t)        ] + [ −Γ D2F(θ, t)                           ]    (2.110)

satisfies (Γ(t), θ(t)) → (D1F(θ∗, t)⁻¹, θ∗(t)) as t → ∞. Furthermore, for sufficiently large µ > 0, the convergence is exponential, i.e. there exist k1 > 0 and k2 > 0 such that

‖(Γ(t), θ(t)) − (Γ∗(t), θ∗(t))‖₂ ≤ k1 ‖(Γ(0), θ(0)) − (Γ∗(0), θ∗(0))‖₂ e^{−k2 t}    (2.111)

for all t ≥ 0, where Γ∗(t) = D1F(θ∗(t), t)⁻¹.

Proof of Theorem 2.4.6: Let

Ḡ[(W, w), Γ] := [ Γ  0 ] [ W ]        and        F̄((Γ, θ), t) := [ Fγ(Γ, θ, t) ]
                [ 0  Γ ] [ w ]                                    [ F(θ, t)    ]    (2.112)

Note that

D1F̄((Γ, θ), t) = [ D1F(θ, t)  ∗          ]
                  [ 0           D1F(θ, t) ]    (2.113)

where ∗ indicates an unspecified n × n block matrix. If Γ is sufficiently close to Γ∗ = D1F(θ∗(t), t)⁻¹, then the product of diag(Γ, Γ) and D1F̄((Γ, θ), t) is positive definite. Thus Ḡ[(W, w), Γ] is a dynamic inverse of F̄((Γ, θ), t) for (Γ, θ) sufficiently close to (Γ∗, θ∗). It follows that Ḡ[(W, w), Γ∗] = D1F̄((Γ∗, θ∗), t)⁻¹ · [Wᵀ, wᵀ]ᵀ, and Ḡ[(W, w), Γ∗] is continuous in its arguments. Hence, for Γ(0) sufficiently close to D1F(θ∗, 0)⁻¹, and θ(0) sufficiently close to θ∗(0), Ḡ[(W, w), Γ] is a dynamic inverse of F̄((Γ, θ), t). Also, for

Ē((Γ, θ), t) := [ Eγ(Γ, θ, t) ]
                [ E(Γ, θ, t)  ]    (2.114)

we have that Ē((Γ∗, θ∗), t) = (Γ̇∗, θ̇∗) and Ē((Γ∗, θ∗), t) is continuous in (Γ, θ). Therefore, by Theorem 2.3.5, for sufficiently large µ > 0, equation (2.110) is a dynamic inverter for (Γ, θ), and (Γ, θ) converges exponentially to (Γ∗, θ∗).
In all of the preceding examples in which dynamic inversion was applied to approximate a root θ∗(t), a closed-form expression for θ∗(t) has been available by inspection of F(θ, t). This has facilitated verification that dynamic inverters do what they are supposed to do. In the following example of dynamic inversion, θ∗(t) is not so easily determined analytically.

Example 2.4.7 Tracking Intersections of Curves. Consider the two time-dependent cubic curves in the x, y plane,

y = (2 + sin(t))x³ + (−1 + (1/3) sin(√2 t))x
y = −(2 + sin(3t))x³ + (1 + (1/4) sin²(5t))x    (2.115)

For each t ≥ 0 it is readily verified that these curves intersect at three points: one point is the origin, one is to the right of (0, 0) at θ∗(t) = (x∗(t), y∗(t)), and one is to the left of the origin at −(x∗(t), y∗(t)). Figure 2.16 shows the two curves and their intersections for six values of t and for x ≥ 0. Let

F(θ, t) = F((x, y), t) = [ y − (2 + sin(t))x³ − (−1 + (1/3) sin(√2 t))x ]
                         [ y + (2 + sin(3t))x³ − (1 + (1/4) sin²(5t))x ]    (2.116)

We will be interested in the solution θ∗(t) = (x∗(t), y∗(t)) of F(θ, t) = 0 to the right of x = 0; the other solutions are (x∗(t), y∗(t)) = (0, 0) and (x∗(t), y∗(t)) = −θ∗(t). We will use Theorem 2.4.6 to track the solution θ∗(t). In this case, Γ ∈ R2×2,

D1F((x, y), t) = [ −3(2 + sin(t))x² − (−1 + (1/3) sin(√2 t))    1 ]
                 [ 3(2 + sin(3t))x² − (1 + (1/4) sin²(5t))      1 ]    (2.117)

D2F((x, y), t) = [ −cos(t)x³ − (√2/3) cos(√2 t)x        ]
                 [ 3 cos(3t)x³ − (5/2) sin(5t) cos(5t)x ]    (2.118)

E(Γ, (x, y), t) = −Γ D2F((x, y), t) =: [ E1 ]
                                       [ E2 ]    (2.119)
Figure 2.16: The solution of interest in Example 2.4.7, θ∗(t) = (x∗(t), y∗(t)), is the intersection (to the right of (0, 0)) of the two cubic curves shown in each of the graphs. This figure shows the pair of cubic curves (2.115) for t ∈ {0, 1, . . ., 5}.

and

(d/dt) D1F((x, y), t) |(ẋ,ẏ)=(E1,E2)
  = [ −3 cos(t)x² − 6(2 + sin(t))x E1 − (√2/3) cos(√2 t)         0 ]
    [ 9 cos(3t)x² + 6(2 + sin(3t))x E1 − (5/2) sin(5t) cos(5t)   0 ]    (2.120)
From (2.116), (2.117), (2.118), and (2.120) we can construct the dynamic inverter (2.110). When t = 0, the root (to the right of (0, 0)) can be obtained by inspection as (x∗(0), y∗(0)) = (1/√2, 0). Thus we could use (x(0), y(0)) = (1/√2, 0) and Γ(0) = D1F((x(0), y(0)), 0)⁻¹ for initial conditions for the dynamic inverter to produce the exact⁹ solution (x∗(t), y∗(t)) for all t ≥ 0. In order to demonstrate an error transient, however, we choose initial conditions

[ x(0) ]   [ 1 ]              [ −1/4  1/4 ]
[ y(0) ] = [ 0 ] ,   Γ(0) =   [ 1/2   1/2 ]    (2.121)

⁹By "exact" for the simulation, we mean exact up to the tolerance of the integrator which, in this example, was 10⁻⁶.
Figure 2.17 shows the results of a simulation of the dynamic inverter using the adaptive step-size Runge–Kutta integrator ode45 from Matlab [Mat92], with µ = 10. The upper graph shows x(t) (solid) and y(t) (dashed) versus t. The lower graph shows (x(t), y(t)) for t ∈ [0, 10]. The root θ∗(t) = (x∗(t), y∗(t)) traces a quasi-periodic curve. Note that if we were to change √2 to 2, for instance, in (2.115) and (2.116), the solution would have a period of 2π. Figure 2.18 shows the log of the approximation error as seen through F, namely log10 ‖F((x(t), y(t)), t)‖∞. The error can be seen to decay to the level of the integrator tolerance, 10⁻⁶, within 2 seconds.
N
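The coupled inverter (2.110) for this example is easy to reproduce numerically. The sketch below (Python/NumPy standing in for the dissertation's Matlab ode45 run; the fixed-step Euler integrator and step size are my own choices) uses the initial conditions (2.121) and µ = 10, and drives ‖F(θ(t), t)‖∞ toward zero as in Figure 2.18:

```python
import numpy as np

MU = 10.0
SQ2 = np.sqrt(2.0)

def Fmap(th, t):                      # the map (2.116)
    x, y = th
    return np.array([y - (2 + np.sin(t)) * x**3 - (-1 + np.sin(SQ2 * t) / 3) * x,
                     y + (2 + np.sin(3*t)) * x**3 - (1 + np.sin(5*t)**2 / 4) * x])

def D1F(th, t):                       # Jacobian (2.117)
    x, _ = th
    return np.array([[-3 * (2 + np.sin(t)) * x**2 - (-1 + np.sin(SQ2 * t) / 3), 1.0],
                     [ 3 * (2 + np.sin(3*t)) * x**2 - (1 + np.sin(5*t)**2 / 4), 1.0]])

def D2F(th, t):                       # partial derivative in t, eq. (2.118)
    x, _ = th
    return np.array([-np.cos(t) * x**3 - (SQ2 / 3) * np.cos(SQ2 * t) * x,
                      3 * np.cos(3*t) * x**3 - 2.5 * np.sin(5*t) * np.cos(5*t) * x])

def dD1F(th, t, E1):                  # total derivative of D1F along theta_dot = E, eq. (2.120)
    x, _ = th
    return np.array([[-3*np.cos(t)*x**2 - 6*(2+np.sin(t))*x*E1 - (SQ2/3)*np.cos(SQ2*t), 0.0],
                     [ 9*np.cos(3*t)*x**2 + 6*(2+np.sin(3*t))*x*E1 - 2.5*np.sin(5*t)*np.cos(5*t), 0.0]])

def rhs(G, th, t):                    # coupled inverter (2.110)
    E = -G @ D2F(th, t)
    Gdot = -MU * G @ (D1F(th, t) @ G - np.eye(2)) - G @ dD1F(th, t, E[0]) @ G
    thdot = -MU * G @ Fmap(th, t) + E
    return Gdot, thdot

G = np.array([[-0.25, 0.25], [0.5, 0.5]])    # Gamma(0), eq. (2.121)
th = np.array([1.0, 0.0])                    # (x(0), y(0)), eq. (2.121)
t, h = 0.0, 1e-3
while t < 3.0:                               # simple Euler steps; small h keeps them stable
    Gd, thd = rhs(G, th, t)
    G, th = G + h * Gd, th + h * thd
    t += h

print(np.max(np.abs(Fmap(th, t))))  # ||F(theta, t)||_inf: near zero
```

With a crude fixed-step integrator the residual settles at the level of the discretization error rather than the 10⁻⁶ tolerance of the original ode45 run, but the exponential transient is the same.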
Figure 2.17: The solution of the dynamic inverter of Example 2.4.7 for F(θ, t) = 0, where θ = (x, y). The upper graph shows x(t) versus t (solid) and y(t) versus t (dashed). The lower graph shows x(t) versus y(t) with the initial condition (x(0), y(0)) = (1, 0) marked by the small circle.
Figure 2.18: The estimation error for the dynamic inverter of Example 2.4.7 as seen through F (2.116): log10 ‖F(θ(t), t)‖∞ versus t in seconds. See Example 2.4.7.
An example of the application of Theorem 2.4.6 to the inversion of robot kinematics will be given in Chapter 5. In the closing example of this chapter we apply Theorem 2.4.6 to the solution of a standard problem in the control of nonlinear systems.

Example 2.4.8 Dynamic Inversion of a Nonlinear Control System. Consider the multi-input, multi-output, time-varying nonlinear control system

ẋ = f(x, t, u)    (2.122)

with x and u in Rn. Assume that f(x, t, u) is C² in its arguments. Let x(t) denote the solution of (2.122). Assume also that D3f(x, t, u) is invertible for all (x, u) in a neighborhood of (0, 0) ∈ Rn × Rn and for all t ≥ 0. Consider also the vector field φ(x, t), assumed to be C² in x and t. Suppose we wish to solve for u such that

φ(x, t) = f(x, t, u)    (2.123)

i.e. we wish to solve for a u(·) that will cause the state x(t) to obey the dynamics

ẋ = φ(x, t)    (2.124)

Define

F(u, t) := f(x(t), t, u) − φ(x(t), t)    (2.125)

Let u∗(t) be a continuous isolated solution, assumed to exist, of F(u, t) = 0. Let

Fγ(Γ, u, t) := D3f(x(t), t, u)Γ − I    (2.126)

so that Γ∗(t) = D3f(x(t), t, u)⁻¹. As in Theorem 2.4.6 let

G[w] = Γ · w    (2.127)

with w ∈ Rn, and

Gγ[W] = Γ · W    (2.128)

with W ∈ Rn×n. To solve for an estimator E(Γ, u, t) for u̇∗, we differentiate F(u, t) = 0 with respect to t, solve for u̇, and replace D1F(u, t)⁻¹ by Γ and ẋ by f(x, t, u) to get

E(Γ, u, t) = −Γ ((D1f(x(t), t, u) − D1φ(x(t), t)) f(x(t), t, u)
              + D2f(x(t), t, u) − D2φ(x(t), t))    (2.129)
Similarly, to solve for an estimator Eγ(Γ, u, t) for Γ̇∗, differentiate Fγ(Γ, u, t) = 0 with respect to t, solve for Γ̇, replace D3f(x, t, u)⁻¹ by Γ, ẋ by f(x, t, u), and u̇ by E(Γ, u, t) to get

Eγ(Γ, u, t) = −Γ (D1,3f(x(t), t, u) · f(x(t), t, u) + D2,3f(x(t), t, u)
               + D3,3f(x(t), t, u) · E(Γ, u, t)) Γ    (2.130)

If u(0) and Γ(0) are sufficiently close to u∗(0) and D3f(x(0), 0, u∗(0))⁻¹ respectively, then a dynamic compensator which produces a u that converges exponentially toward u∗ is

Γ̇ = −µΓ · Fγ(Γ, u, t) + Eγ(Γ, u, t)
u̇ = −µΓ · F(u, t) + E(Γ, u, t)    (2.131)

Furthermore, if we choose u(0) to satisfy F(u(0), 0) = 0, and Γ(0) to satisfy Γ(0) = D3f(x(0), 0, u(0))⁻¹, then x(t) will satisfy (2.124) for all t ≥ 0. Figure 2.19 shows the closed-loop system including the dynamic inversion compensator and the original nonlinear plant (2.122).
Figure 2.19: The closed-loop system with dynamic inversion compensator (2.131), with state (Γ, u), and the nonlinear plant (2.122), with state x.

Note that if a convenient closed form exists for (D3f(x, t, u))⁻¹, or if one is satisfied to use discrete numerical matrix inversion, one could replace Γ by (D3f(x, t, u))⁻¹ and eliminate the Γ̇ equations.
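A scalar sketch makes the construction concrete. In the code below (Python; the plant f, target field φ, gain, and step size are all hypothetical illustrations, and the Γ̇ equation is eliminated as suggested above since D3f is here a scalar with a known reciprocal), the compensated state x(t) is compared against the closed-form solution of ẋ = φ(x, t):

```python
import math

MU = 50.0

def f(x, t, u):                 # hypothetical plant: x' = f(x, t, u)
    return -x + u + 0.1 * math.sin(u)

def phi(x, t):                  # desired dynamics: x' = -2x + sin(t)
    return -2.0 * x + math.sin(t)

def D3f(u):                     # scalar "Jacobian" of f in u; always >= 0.9
    return 1.0 + 0.1 * math.cos(u)

def u_dot(x, u, t):
    """Compensator (2.131) with Gamma replaced by 1/D3f (scalar case)."""
    F = f(x, t, u) - phi(x, t)
    # E from (2.129): here D1f = -1, D2f = 0, D1phi = -2, D2phi = cos(t)
    E = -((-1.0 + 2.0) * f(x, t, u) - math.cos(t)) / D3f(u)
    return -MU * F / D3f(u) + E

# choose u(0) solving F(u, 0) = 0 for x(0) = 1 (a few Newton steps)
x = 1.0
u = -1.0
for _ in range(20):
    u -= (f(x, 0.0, u) - phi(x, 0.0)) / D3f(u)

t, h = 0.0, 1e-4
while t < 4.0:
    x += h * f(x, t, u)         # plant (Euler step)
    u += h * u_dot(x, u, t)     # compensator
    t += h

# closed-form solution of x' = -2x + sin(t), x(0) = 1
x_ref = 1.2 * math.exp(-2.0 * t) + (2.0 * math.sin(t) - math.cos(t)) / 5.0
print(abs(x - x_ref))  # small: x follows the target dynamics
```

Because u(0) is chosen on the solution manifold F(u, 0) = 0 and the derivative estimator is exact, the state tracks the target dynamics up to integration error, as the example predicts.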
N

2.5 Generalizations of Dynamic Inversion

The dynamic inversion theorems, Theorems 2.3.1 and 2.3.5, rely upon the use of quadratic Lyapunov functions, and indeed, the definition of a dynamic inverse, Definition 2.2.1, is tailored for association with a quadratic Lyapunov function. We may generalize dynamic inversion based on more general Lyapunov functions. For instance, consider the following definition.
Definition 2.5.1 General Dynamic Inverse. For F : Rn × R+ → Rn; (θ, t) ↦ F(θ, t), let θ∗(t) be a continuous isolated solution of F(θ, t) = 0. A map G : Rn × Rn × R+ → Rn; (w, θ, t) ↦ G[w, θ, t] is called a dynamic inverse of F on the ball Br := {z ∈ Rn | ‖z‖ ≤ r}, r > 0, if

i. G[0, z + θ∗(t), t] = 0 for all z ∈ Br, t ≥ 0,

ii. the map G[F(θ, t), θ, t] is Lipschitz in θ and piecewise-continuous in t, and

iii. there exists a continuously differentiable function V : [0, ∞) × Br → R; (t, z) ↦ V(t, z) such that for all z ∈ Br,

α1(‖z‖) ≤ V(t, z) ≤ α2(‖z‖)    (2.132)

D1V(t, z) + D2V(t, z) · (θ̇∗(t) − G[F(z + θ∗, t), z + θ∗, t]) ≤ −α3(‖z‖)    (2.133)

where α1(·), α2(·), and α3(·) are of class K (see Appendix A) on [0, r).
N

A more general dynamic inversion theorem follows from Definition 2.5.1.

Theorem 2.5.2 General Dynamic Inversion Theorem. Let θ∗(t) be a continuous isolated solution of F(θ, t) = 0, with F : Rn × R+ → Rn; (θ, t) ↦ F(θ, t). Assume that G : Rn × Rn × R+ → Rn; (w, θ, t) ↦ G[w, θ, t] is a dynamic inverse (Definition 2.5.1) of F(θ, t) on Br. Let θ(t) denote the solution to the system

θ̇ = −G[F(θ, t), θ, t]    (2.134)

with initial condition θ(0) satisfying θ(0) − θ∗(0) ∈ Br. Then θ → θ∗ asymptotically.
Proof of Theorem 2.5.2: Since G[w, θ, t] is assumed to be a dynamic inverse of F(θ, t), there exists a function V(t, z) satisfying (2.132) and (2.133). It follows (see [Kha92], Theorem 4.1, page 169) that the origin z = 0 of the system

ż = −G[F(z + θ∗(t), t), z + θ∗(t), t] + θ̇∗(t)    (2.135)

is uniformly asymptotically stable. Thus θ → θ∗ asymptotically as t → ∞.
Though it is readily apparent that Definition 2.5.1 leads to a more general dynamic inversion theorem, Theorem 2.5.2, with a simple proof, use of the more general definition also imposes the generally difficult requirement of finding a Lyapunov function in order to prove that G is indeed a dynamic inverse of F . In contrast the dynamic inverse criterion of Definition 2.2.1 is often easily verified from familiarity with the inverse problem one is trying to solve. For instance, one often knows that D1 F (θ, t) is invertible for all θ sufficiently close to θ∗ (t). In such cases Definition 2.2.1 leads easily to the constructive methods of, for instance, Theorem 2.4.6. What we would gain in generality by relying upon Definition 2.5.1 we would lose in ease of construction of dynamic inverters for a broad and useful set of inverse problems. Another consideration in our choice of dynamic inverse definition, Definition 2.2.1, is that it leads to exponentially stable systems. Exponentially stable systems are known to maintain their exponential stability under a wide variety of perturbations. This fact has been of profound value in the history of control theory, accounting, for instance, for the wide successes of the application of linear controllers to the control of nonlinear systems. When dynamic inverters are incorporated into control laws, this exponential stability allows one to call upon a variety of well-known results of stability theory in order to conclude exponential stability of the closed loop control system. We will see an example of this in Chapter 4 where we will apply dynamic inversion to construct a tracking controller that will allow tracking of implicitly defined trajectories with exponential convergence. By retaining exponential stability of a closed-loop control system we allow that control system to retain a useful level of robustness with respect to perturbations and modeling errors.
2.6 Chapter Summary

By building upon simple quadratic Lyapunov stability arguments we have developed a methodology for the construction of a class of nonlinear dynamic systems that can solve time-dependent finite-dimensional inverse problems. The notion of a dynamic inverse of a map has been introduced. We have shown a number of ways in which dynamic inverses may be obtained, perhaps the most powerful being through dynamic inversion itself, where the dynamic inverse is solved for at the same time it is being used to track the root of interest. We have shown how derivative estimation can make the difference between an ultimately bounded approximation error and an approximation error that converges exponentially to zero.
For realization of dynamic inversion on a digital computer, an integration method must be chosen. With current digital technology, integration can be slow, particularly for ordinary differential equations of high dimension. Of course this disadvantage might be made less severe by redesigning computers to optimize integration. On the other hand, the reliance of dynamic inversion on integration places all questions of accuracy and rate of convergence squarely in the lap of the chosen integration routine, and the accuracy and convergence properties of discrete integrators are well studied. In the following chapters we will apply dynamic inversion to a variety of problems in mathematics and nonlinear control.
Chapter 3

Dynamic Methods for Polar Decomposition and Inversion of Matrices

3.1 Introduction

In Chapter 2 we introduced a technique in which a dynamic system was used to generate an approximation θ(t) to the solution θ∗(t) of a nonlinear vector equation of the form F(θ, t) = 0. As we saw in Example 2.4.1, one may also pose the inverse of a time-varying matrix as a solution to an equation of the form F(Γ, t) = 0. Square roots and other matrix functions may be posed similarly. Motivated by this realization, in this chapter we will further investigate the use of dynamic inversion to construct dynamic systems that perform matrix inversion as well as polar decomposition.
3.1.1 Previous Work

As in the case of vector equations (see Section 2.1.2), continuous-time dynamic methods of solving matrix equations have appeared before. Any dynamic system on a matrix space for which an asymptotically stable equilibrium exists may be considered to be a dynamic inverter that solves for its equilibrium. Continuous-time dynamic methods for determining eigenvalues date back at least as far as Rutishauser [Rut54, Rut58]. We have already mentioned (see Section 2.1.2) the work of Brockett [Bro91, Bro89], who has shown how one can use matrix differential equations to perform computation often
thought of as being intrinsically discrete, and Bloch [Blo85, Blo90], who has shown how Hamiltonian systems may be used to solve principal component and linear programming problems. Chu [Chu95] has studied the Toda flow as a continuous-time analog of the QR algorithm. Chu [Chu92] and Chu and Driessel [CD91b] have explored the use of differential equations in solving linear algebra problems. Smith [Smi91] and Helmke et al. [HMP94] have constructed dynamical systems that perform singular-value decomposition. Dynamic methods of matrix inversion have also appeared in the artificial neural network literature; see for instance Jang et al. [JLS88] and Wang [Wan93]. For a review of dynamic matrix methods as well as a comprehensive list of references for dynamic approaches to optimization see [HM94]. A dynamic decomposition related to polar decomposition of fixed matrices has also appeared in Helmke and Moore [HM94], though, as the authors point out, their gradient-based method does not guarantee the positive definiteness of the symmetric component of the polar decomposition. Using dynamic inversion we will derive a system that produces the desired inverse and polar decomposition products at any fixed time t1 > 0 with guaranteed positive definiteness of the symmetric component. As far as we know, all prior continuous-time dynamic approaches to inversion of matrix equations use gradient flows. In contrast, dynamic inversion, as we formulate it, does not require the metric needed to define a gradient. Though we will see in Section 3.3.1 that gradient approaches fit well into the dynamic inversion framework, the main results of the present chapter do not rely upon a metric structure.
3.1.2 Main Results

In this chapter, using dynamic inversion, we will join constant matrices with unknown inverses to constant matrices with known inverses through a t-parameterized path of matrices, where t may be identified with time. As the path proceeds from the matrix with the known inverse to the matrix with the unknown inverse, functions of the state of the dynamic inverter exactly track the polar decomposition factors as well as the inverse of the path element. The path is such that when t = 1 the unknown matrix is reached, hence the functions of the state of the dynamic inverter at t = 1 provide the exact desired polar decomposition factors as well as the inverse. By scaling time we may produce the exact desired inverse and polar decomposition by any prescribed time t1 > 0. The main results of this chapter are as follows: we will construct dynamic systems
that

i. invert time-dependent matrices asymptotically,

ii. invert a spectrally restricted matrix by a prescribed time,

iii. invert and decompose any t-dependent invertible matrix into its polar decomposition factors,

iv. invert and decompose any constant nonsingular matrix into its polar decomposition factors by a prescribed time.

Result ii will be obtained from result i using homotopy. Likewise, result iv will be obtained from result iii using homotopy.
3.1.3 Chapter Overview

In Example 2.4.1 of Chapter 2 we examined the application of dynamic inversion to the problem of inverting time-varying matrices, where we assumed that a sufficiently good approximation existed for the inverse of the time-varying matrix at an initial time. In Section 3.2 we will show some further applications of time-varying matrix inversion. Motivated by the desire to obtain the initial inverse dynamically, in Section 3.3 we will consider the problem of inverting constant matrices. By using a matrix homotopy from the identity we will use the results of Section 3.2 to produce exact inversion of a restricted class of constant matrices, including positive definite matrices, by a prescribed time. In Section 3.4 we will consider the polar decomposition of a time-varying matrix. We will show how, starting from a good guess at the initial value of the inverse of the positive definite part of the polar decomposition, we may construct a dynamic system that produces an exponentially convergent estimate of the inverse of the positive definite symmetric part. From this estimate and the original matrix we may obtain the decomposition products as well as the inverse. Then in Section 3.5 we revisit the problem of constant matrix inversion and show how, by combining homotopy with dynamic polar decomposition, we may dynamically produce the polar decomposition factors as well as the inverse of any constant matrix by a prescribed time without requiring an initial guess.¹

¹For the notation of this chapter see Section A of the Appendix.
3.2 Inverting Time-Varying Matrices

We summarize the results of Example 2.4.1 of Chapter 2 in the following theorem.
Recall that GL(n, R) refers to the general linear group of n × n invertible matrices with real
components (see Appendix A for notation), the group operation being matrix multiplication.

Theorem 3.2.1 Dynamic Inversion of Time-Varying Matrices. Let A(t) ∈ GL(n, R) be C¹ in t, with A(t), A(t)⁻¹, and Ȧ(t) bounded on [0, ∞). Let G[w, Γ, t] be a dynamic inverse (see Definition 2.2.1) of F(Γ, t) = A(t)Γ − I for all t ∈ R+, and for all Γ such that Γ − Γ∗ is in Br. Let Γ(t) ∈ Rn×n be the solution to

Γ̇ = −µG[A(t)Γ − I, Γ, t] − Γ Ȧ(t) Γ    (3.1)

with ‖Γ(0) − Γ∗(0)‖ ≤ r < ∞. Then for sufficiently small r, there exist µ̃ > 0, k1 > 0, and k2 > 0 such that for all µ > µ̃, and for all t ≥ 0,

‖Γ(t) − Γ∗(t)‖₂ ≤ k1 ‖Γ(0) − Γ∗(0)‖₂ e^{−k2 t}    (3.2)

In particular, Γ(t) − A(t)⁻¹ → 0 as t → ∞.
Example 3.2.2 A Dynamic Inverter for a Time-Varying Matrix. Let

G[w, t] := A(t)ᵀ · w    (3.3)

Then by Theorem 3.2.1, for sufficiently large constant µ > 0, and for Γ(0) sufficiently close to A(0)⁻¹, the solution Γ(t) of

Dynamic Inverter for a Time-Varying Matrix

Γ̇ = −µA(t)ᵀ(A(t)Γ − I) − Γ Ȧ(t) Γ    (3.4)

approaches A(t)⁻¹ exponentially as t → ∞.    N

See also Example 2.4.1, where the dynamic inverse G[w, Γ] = Γ · w is used instead of G[w, t] = A(t)ᵀ · w.
Example 3.2.3 Dynamic Inversion of a Mass Matrix. Consider a finite dimensional mechanical system modeled by the implicit second order differential equation

M(q)q̈ + N(q, q̇) = 0    (3.5)

Usually the matrix M(q) is positive definite and symmetric for all q, since the kinetic energy, (1/2)q̇ᵀM(q)q̇, is greater than zero for all q̇ ≠ 0. It is often convenient to express such systems in an explicit form, with q̈ alone on the left side of a second order ordinary differential equation. To do so we will invert M(q) dynamically. Let Γ be a symmetric estimator for M(q)⁻¹. Suppose we know M(q(0))⁻¹ approximately. If our approximation is sufficiently close to the true value of M(q(0))⁻¹, then setting Γ(0) to that approximation, and letting µ > 0 be sufficiently large, allows us to apply Theorem 3.2.1. Then the system

Dynamic Inverter for a Mass Matrix

Γ̇ = −µΓ(M(q)Γ − I) − Γ [ ∂Mi,j(q)/∂q · q̇ ]_{i,j∈n} Γ
q̈ = −Γ N(q, q̇)    (3.6)

provides an exponentially convergent estimate of q̈ for all t. Furthermore, if Γ(0) = M(q(0))⁻¹, then Γ(t) = M(q(t))⁻¹ for all t ≥ 0.    N
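The top equation of (3.6) can be tested along a prescribed trajectory. Below is a numerical sketch (Python/NumPy; the 2 × 2 mass matrix, trajectory q(t), gain, and integrator are all hypothetical illustrations, with Ṁ(q(t)) computed in closed form for the chosen trajectory):

```python
import numpy as np

MU = 20.0

def M(t):      # hypothetical mass matrix M(q(t)) along q2(t) = cos(t); positive definite
    q2 = np.cos(t)
    return np.array([[3.0 + np.cos(q2), 1.0 + 0.5 * np.cos(q2)],
                     [1.0 + 0.5 * np.cos(q2), 1.0]])

def Mdot(t):   # d/dt M(q(t)) = (dM/dq2) * q2_dot, with q2_dot = -sin(t)
    q2 = np.cos(t)
    s = np.sin(q2) * np.sin(t)   # d/dt cos(q2(t))
    return np.array([[s, 0.5 * s],
                     [0.5 * s, 0.0]])

# top equation of (3.6): Gamma' = -mu Gamma (M Gamma - I) - Gamma Mdot Gamma
G = np.linalg.inv(M(0.0)) + 0.02     # slightly perturbed (symmetric) initial inverse
t, h = 0.0, 1e-3
while t < 3.0:
    Gd = -MU * G @ (M(t) @ G - np.eye(2)) - G @ Mdot(t) @ G
    G = G + h * Gd
    t += h

print(np.linalg.norm(G - np.linalg.inv(M(t))))  # Gamma tracks M(q(t))^{-1}
```

Note that the symmetric perturbation keeps Γ(0) symmetric, so, as Remark 3.2.4 below observes, Γ(t) remains symmetric along the flow.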
Remark 3.2.4 Symmetry and the Choice of Dynamic Inverse. In Example 3.2.3, M(q) is symmetric, as is its inverse M(q)⁻¹. The right hand side of (3.6) is also symmetric, hence if Γ(0) is symmetric, so will be Γ(t) for all t. If we had chosen G[w, q] := M(q)ᵀ · w as a dynamic inverse (see, for instance, Example 3.2.2) we would not have had this symmetry. The symmetry allows us to cast the top equation of (3.6) on the space S(n, R) of symmetric n × n matrices, thereby reducing the complexity of the dynamic inverter with respect to the nonsymmetric case; what would otherwise be n² equations in (3.6) is reduced to s(n) := n(n + 1)/2 equations.    N
3.2.1 Left and Right Inversion of Time-Varying Matrices

Consider a matrix A(t) ∈ Rm×n. Assume that A(t) is of full rank for all t ≥ 0. We consider two cases. If m ≤ n, then A(t) has a right inverse Γ∗(t) ∈ Rn×m satisfying

F(Γ, t) := A(t)Γ − I = 0    (3.7)
It is easily verified that

G[w] := Γ · w    (3.8)

is a dynamic inverse for F(Γ, t) when Γ is sufficiently close to Γ∗ = A(t)ᵀ(A(t)A(t)ᵀ)⁻¹. Differentiate F(Γ∗, t) = 0 with respect to t, solve for Γ̇∗, and replace Γ∗ by Γ to get the derivative estimator

E(Γ, t) := −Γ Ȧ(t) Γ    (3.9)

Thus a dynamic inverter for right-inversion of a time-varying matrix is

Γ̇ = −µΓ(A(t)Γ − I) − Γ Ȧ(t) Γ    (3.10)

The form of this dynamic inverter may be seen to be identical to (2.102). Alternatively we may use Theorem 3.2.1 to invert A(t)A(t)ᵀ, constructing the right inverse as A(t)ᵀΓ(t). In the case that m ≥ n, A(t) has a left inverse Γ∗(t) which satisfies

F(Γ, t) := Γ A(t) − I = 0    (3.11)

We may use the dynamic inverter (3.10) with A(t) replaced by A(t)ᵀ, and Ȧ(t) replaced by Ȧ(t)ᵀ, to approximate the left inverse of A(t).
3.3 Inversion of Constant Matrices

In this section we consider two methods for the dynamic inversion of constant matrices: one for asymptotic inversion, and the other for inversion in finite time. In Section 3.5, relying on the methods of Section 3.4, we will consider another more complex, but more general, approach to the same problem. Constant matrices may be inverted in a manner similar to the inversion of time-varying matrices as described in the last section. Let

F(Γ) := MΓ − I    (3.12)

Let Γ(t) denote the estimator for the inverse of a constant matrix M, with Γ∗ = M⁻¹ as the solution of F(Γ) = 0. Since M is constant, Γ̇∗ is zero. As a consequence, if Γ(0) is sufficiently close to Γ∗, then a dynamic inverse of F(Γ) (3.12) is G[w, Γ] := Γ · w, and we
can use the dynamic inverter

Dynamic Inverter for Constant Square Matrices

Γ̇ = −µΓ(MΓ − I)    (3.13)

Choosing Γ(0) sufficiently close to Γ∗ assures us that, as Γ(t) flows to Γ∗ = M⁻¹, Γ will not intersect the set of singular matrices.
3.3.1 A Comment on Gradient Methods

As shown in Section 3.2, Example 3.2.2, the function G[w, Γ] := Γ · w is not our only choice of a dynamic inverse G[w, Γ, t] which is linear in w. It is easily verified that G[w] = M^T · w, w ∈ R^{n×n}, is also a dynamic inverse for F(Γ) := M Γ − I, and that for this choice of dynamic inverse we do not need to worry about the dynamic inverse becoming singular; it is valid globally and leads to the dynamic inverter
Dynamic Inverter for Constant Square Matrices

Γ̇ = −µ M^T (M Γ − I)    (3.14)

Γ → M^{-1}
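Since the flow (3.14) is globally convergent (the linear dynamics are governed by −µM^T M, which is Hurwitz for any invertible M), a few lines of numerical integration illustrate it directly. The matrix M, gain, and Euler discretization below are illustrative assumptions, not from the text.

```python
import numpy as np

M = np.array([[2.0, -1.0], [1.0, 1.0]])   # any invertible matrix will do
mu, h = 5.0, 1e-3
Gam = np.zeros((2, 2))        # any initial guess works: the flow converges globally
for _ in range(10000):        # integrate to t = 10 with forward Euler
    Gam += h * (-mu * M.T @ (M @ Gam - np.eye(2)))
assert np.allclose(Gam, np.linalg.inv(M), atol=1e-6)
```

The fixed point of the Euler update is exactly M^{-1}, so the iterate converges to the true inverse up to machine precision.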
Remark 3.3.1 Left and Right Inverses of Constant Matrices. If M has full row rank, with M ∈ R^{m×n}, m ≤ n, then the equilibrium solution Γ∗ of (3.14) is the right inverse M^R := M^T (M M^T)^{-1} of M.

Dynamic Right-Inverter for Constant Matrices

Γ̇ = −µ M^T (M Γ − I)    (3.15)

Γ → M^R
If instead we were to choose F(Γ) := Γ M − I and G[w] := w · M^T, and if M ∈ R^{m×n}, m ≥ n, has full column rank, then the solution Γ∗ would be the left inverse M^L := (M^T M)^{-1} M^T
of M.

Dynamic Left-Inverter for Constant Matrices

Γ̇ = −µ(Γ M − I) M^T    (3.16)

Γ → M^L
The dynamic inverter (3.14) is the standard least squares gradient flow (see [HM94], Section 1.6) for the function Φ : R^{n×n} → R; Γ ↦ Φ(Γ) where

Φ(Γ) := (1/2) ‖M Γ − I‖₂²    (3.17)

It is also the neural-network constant matrix inverter of Wang [Wan93]. Of course other gradient schemes may have the same solution as (3.14) though they may start from gradients of functions other than (3.17) (see, for instance, [JLS88]). In general, artificial neural networks are constructed to dynamically solve for the minimum of an energy function having a unique (at least locally) minimum, i.e. they realize gradient flows.

Connecting Gradient Methods with Dynamic Inversion

In general a dynamic inverter consists of three functions, F, G, and E, as described in Section 2. The function F(Γ, t) is the implicit function to be inverted, G[w, θ, t] is a dynamic inverse for F(Γ, t), and E(θ, t) is an estimator for the derivative with respect to t of the root Γ∗ of F(Γ, t) = 0. In order to relate gradient methods to dynamic inversion we consider the decomposition of a gradient flow system into an E, F, and G forming a dynamic inverter. For instance, let H : R^{n×n} × R → R be a smooth function. A gradient system based on this function is
Gradient System

Γ̇ = −∇H(Γ, t) + (∂/∂t)H(Γ, t)    (3.18)

where ∇ denotes the gradient of H(Γ, t) with respect to Γ. We may always identify gradient systems with dynamic inversion through the trivial dynamic inverse (see Property 2.2.6)

G[w] = w    (3.19)
Then

F(Γ, t) = ∇H(Γ, t)    (3.20)

and

E(Γ, t) = (∂/∂t)H(Γ, t)    (3.21)

Let µ = 1. Then

Γ̇ = −G[F(Γ, t)] + E(Γ, t)    (3.22)
is the same as (3.18). Thus we have decomposed the gradient system (3.18) into an E, F, and G. It is more interesting, however, to find a dynamic inverse G such that if G were changed to the identity map, then the desired root would still be the solution to F(Γ, t) = 0, but the resulting dynamic inverter would not converge to the desired root. For example, identifying F(Γ) = M Γ − I, G[w] = M^T · w, and E = 0 decomposes the gradient flow (3.14) into a dynamic inverter. For arbitrary M ∈ GL(n, R) the stability properties of Γ̇ = −µF(Γ) are unknown. But with G defined as G[w] = M^T · w, Γ̇ = −µG[F(Γ)] has an asymptotically stable equilibrium at Γ∗ = M^{-1}. For a system of the form (3.14) such a decomposition is straightforward. For more complicated gradient systems, however, we have no general methodology for decomposition into E, F, and G.
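The role of G can be made numerically: for an M with an eigenvalue in (−∞, 0), the flow with the trivial dynamic inverse G[w] = w diverges, while the flow with G[w] = M^T · w converges to M^{-1}. The diagonal test matrix and step size below are illustrative choices.

```python
import numpy as np

M = np.diag([-1.0, 2.0])              # invertible, but -M is not Hurwitz
mu, h = 1.0, 1e-2
G1 = np.zeros((2, 2))                  # flow with trivial dynamic inverse G[w] = w
G2 = np.zeros((2, 2))                  # flow with G[w] = M^T w, as in (3.14)
for _ in range(2000):                  # integrate to t = 20
    G1 += h * (-mu * (M @ G1 - np.eye(2)))
    G2 += h * (-mu * M.T @ (M @ G2 - np.eye(2)))

# G1 blows up along the eigendirection with eigenvalue -1; G2 converges.
```

Both flows have M^{-1} as an equilibrium, but only the second makes it asymptotically stable for this M.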
3.3.2 Dynamic Inversion of Constant Matrices by a Prescribed Time

The constant matrix dynamic inverters (3.13) and (3.14) above have the potential disadvantage of producing an exact inverse only asymptotically as t → ∞. One may, however, wish to obtain the inverse by a prescribed time. To obtain inversion by a prescribed time we now consider another method.

If we could create a time-varying matrix H(t) that is invertible by inspection at t = 0, and that equals M at some known finite time t > 0, say t = 1, then perhaps we could use the technique of Section 3.2 for the inversion of time-varying matrices in order to invert H(t). If Γ(0) = H(0)^{-1}, then the solution of the dynamic inverter at time t = 1 will be M^{-1}. We require, of course, that H(t) remain in GL(n, R) as t goes from 0 to 1. One ideal candidate for the initial value of the time-varying matrix is the identity matrix I, since it is its own inverse.

Example 3.3.2 Constant Matrix Inversion by a Prescribed Time Using Homotopy. Let M be a constant matrix in R^{n×n}. We wish to dynamically determine the inverse
of M. Consider the t-dependent matrix,

Matrix Homotopy

H(t) = (1 − t)I + tM    (3.23)

In the space of n × n matrices, t ↦ H(t) describes a t-parameterized curve, or homotopy, of matrices from the identity to M = H(1) as indicated in Figure 3.1; in fact this curve (3.23) is a straight line.
Figure 3.1: The matrix homotopy H(t).
From Theorem 3.2.1 we know how to dynamically invert a time-varying matrix given that we have an approximation of its inverse at time t = 0. If we know the exact inverse at time t = 0, then we may use the dynamic inverter of Theorem 3.2.1 to track the exact inverse of the time-varying matrix for all t ≥ 0. In the present case the inverse at time t = 0 is just the identity I. We may invert H(t) by substituting H(t) for A(t), and Ḣ(t) = M − I for Ȧ(t), in (3.1), setting Γ(0) = I. Since our initial conditions are a precise inverse of H(0), Theorem 3.2.1 tells us that the matrix Γ becomes the precise inverse of M at time t = 1 as shown schematically in Figure 3.2. That is, of course, if H(t) remains nonsingular as t goes from 0 to 1!
Figure 3.2: The matrix homotopy H(t) from I to M with the corresponding solution Γ∗(t), the inverse of H(t).
For a dynamic inverter for this example let

F(Γ, t) := ((1 − t)I + tM)Γ − I

G[w, Γ] := Γ · w    (3.24)

E(Γ) := −Γ(M − I)Γ

Then a dynamic inverter is Γ̇ = −µG[F(Γ, t), Γ] + E(Γ) with Γ(0) = I. Expanded, this is

Prescribed-Time Dynamic Inverter for Constant Matrices

Γ̇ = −µΓ(((1 − t)I + tM)Γ − I) − Γ(M − I)Γ

Another choice of linear dynamic inverse is G[w, t] := ((1 − t)I + tM)^T · w, giving

Prescribed-Time Dynamic Inverter for Constant Matrices

Γ̇ = −µH(t)^T (H(t)Γ − I) − Γ(M − I)Γ
Homotopy-based methods, also called continuation methods, for solving sets of linear and nonlinear equations have been around for quite some time. For a review of
developments prior to 1980 see Allgower and Georg [AG80]. The general idea is that one starts with a problem with a known solution (e.g. the inverse of the identity matrix) and smoothly transforms that problem into a problem with an unknown solution, transforming the known solution in a corresponding manner until the unknown solution is reached. Often it is considerably easier to transform a known solution of a problem into an unknown solution of a closely related problem than to calculate the new solution from scratch. Solution of the roots of nonlinear polynomial equations (see Dunyak et al. [DJW84] and Watson [Wat81] for examples) is a typical example with broad engineering application.
Remark 3.3.3 Requirement for Nonsingular Homotopy. The scheme of Example 3.3.2 requires that there is no t ∈ [0, 1] for which H(t) (3.23) is singular.
Recall that there are two maximal connected open subsets which comprise GL(n, R), namely GL+(n, R) = {M ∈ R^{n×n} | det(M) > 0} and GL−(n, R) = {M ∈ R^{n×n} | det(M) < 0}. [...] by a slight modification of the homotopy (3.23). We summarize our results of this section in the following theorem.
Theorem 3.3.7 Dynamic Inversion of Constant Matrices by a Prescribed Time. For any constant M ∈ GL(n, R), and for any prescribed t1 > 0, if σ(M ) ∩ (−∞, 0) = ∅,
then the solution Γ(t) of the dynamic inverter

Prescribed-Time Dynamic Inverter for Constant Matrices

Γ̇ = −µΓ( ((1 − t/t1)I + (t/t1)M)Γ − I ) − (1/t1)Γ(M − I)Γ    (3.27)

with Γ(0) = I, satisfies Γ(t1) = M^{-1}.
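The prescribed-time inverter can be checked numerically. The sketch below takes t1 = 1, so it coincides with the inverter of Example 3.3.2; the positive definite test matrix, gain, and Euler integration are illustrative assumptions, not from the text.

```python
import numpy as np

M = np.array([[3.0, 1.0], [1.0, 2.0]])   # no eigenvalues in (-inf, 0)
mu, h, N = 50.0, 1e-4, 10000              # integrate t from 0 to t1 = 1
Gam = np.eye(2)                           # Gamma(0) = I = H(0)^{-1}
for k in range(N):
    t = k * h
    H = (1 - t) * np.eye(2) + t * M       # homotopy (3.23)
    # Correction term plus derivative estimator (H-dot = M - I for t1 = 1)
    Gdot = -mu * Gam @ (H @ Gam - np.eye(2)) - Gam @ (M - np.eye(2)) @ Gam
    Gam += h * Gdot
assert np.allclose(Gam, np.linalg.inv(M), atol=1e-3)
```

Because the initial condition is the exact inverse of H(0) and the derivative estimator is exact along the solution, Γ(t) tracks H(t)^{-1} and arrives at M^{-1} at t = 1 up to integration error.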
Remark 3.3.8 Preservation of Symmetry. If M is symmetric, then the right-hand side of (3.27) is also symmetric. Thus if Γ(0) is symmetric, then Γ(t) is symmetric for all t.
Example 3.3.9 Right and Left Inverses of Constant Matrices by a Prescribed Time. Let A ∈ R^{m×n} be a constant matrix with m ≤ n and assume that A has full rank. The right inverse of A is given by A^R := A^T (AA^T)^{-1}. To obtain the right inverse A^R at time t1, we may apply Theorem 3.3.7 replacing M by AA^T, which is positive definite. Then A^T (AA^T)^{-1} = A^T Γ(t1).

Prescribed-Time Dynamic Right-Inversion of a Constant Matrix

Γ̇ = −µΓ( ((1 − t/t1)I + (t/t1)AA^T)Γ − I ) − (1/t1)Γ(AA^T − I)Γ

A^T Γ(t1) = A^R

If a constant A has full column rank, then since A^T A is positive definite, the left inverse A^L := (A^T A)^{-1} A^T may be obtained by substituting A^T A for M in Theorem 3.3.7. Then A^L = Γ(t1)A^T.

Prescribed-Time Dynamic Left-Inversion of a Constant Matrix

Γ̇ = −µΓ( ((1 − t/t1)I + (t/t1)A^T A)Γ − I ) − (1/t1)Γ(A^T A − I)Γ

Γ(t1)A^T = A^L
Theorem 3.3.7 is limited in its utility by the necessity that M have a spectrum which does not intersect (−∞, 0). By appealing to the polar decomposition in Section 3.5 below, we will show that we may, at the cost of a slight increase in complexity, use dynamic inversion to produce an exact inverse of any invertible constant M, irrespective of its spectrum, by any prescribed time t1 > 0.
3.4 Polar Decomposition for Time-Varying Matrices

In this section we will show how dynamic inversion may be used to perform polar decomposition [HJ85] and inversion of a time-varying matrix. We will assume that A(t) ∈ GL(n, R), and that A(t), Ȧ(t), and A(t)^{-1} are bounded on (0, ∞).
Though polar decomposition will be used here largely as a path to inversion, polar
decomposition finds substantial utility in its own right. In particular it is used widely in the study of stress and strain in continuous media. See, for instance, Marsden and Hughes [MH83]. First consider the polar decomposition of a constant matrix M ∈ GL(n, R), M =
P U where U is in the space of n × n orthogonal matrices with real entries, O(n, R), and P
is the symmetric positive definite square root of M M T . Regarding M as a linear operator
Rn → Rn , the polar decomposition expresses the action of M on a vector as a rotation
(possibly with a reflection) followed by a scaling along the eigenvectors of M M T . If M ∈
GL(n, R), then P and U are unique.
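These factors are easy to exhibit numerically for a constant matrix. The sketch below computes X = P^{-1} directly by eigendecomposition of M M^T (a reference computation, not the dynamic inversion scheme developed below), for an arbitrary illustrative invertible M, and verifies the polar factors and the resulting inverse.

```python
import numpy as np

# Hypothetical invertible matrix; X = P^{-1} computed by eigendecomposition
M = np.array([[2.0, 1.0], [0.5, 3.0]])
Lam = M @ M.T                          # Lambda = M M^T, symmetric positive definite
w, V = np.linalg.eigh(Lam)             # Lam = V diag(w) V^T with w > 0
X = V @ np.diag(w ** -0.5) @ V.T       # X = P^{-1}, the inverse symmetric square root
U = X @ M                              # orthogonal factor of the polar decomposition
P = X @ Lam                            # symmetric positive definite factor, P = Lam^{1/2}
assert np.allclose(U @ U.T, np.eye(2))         # U is orthogonal
assert np.allclose(P @ U, M)                   # M = P U
assert np.allclose(U.T @ X, np.linalg.inv(M))  # M^{-1} = U^T P^{-1}
```

The last assertion is the relation A^{-1} = U^T X∗ used in the time-varying setting below.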
Now consider the case of a t-dependent nonsingular square matrix A(t). Since A(t) is nonsingular for all t ≥ 0, A(t)A(t)T is positive definite for all t ≥ 0. For any t ≥ 0, the unique positive definite solution to XA(t)A(t)T X − I = 0 is X∗ (t) = P (t)−1 . Thus if
we know X∗ (t), then from A(t) = P (t)U (t) we can get the orthogonal factor U (t) of the polar decomposition by U (t) = X∗ (t)A(t), as well as the symmetric positive definite part
P (t) = X∗ (t)A(t)A(t)T . We can also obtain the inverse of A(t) as A(t)−1 = U (t)T X∗ (t). Since P (t) is a symmetric n × n matrix, it is parameterized by s(n) := n(n + 1)/2
(3.28)
elements as is its inverse P −1 (t). We will construct the dynamic inverter that produces P −1 (t). Remark 3.4.1 Vector Notation for Symmetric Matrices. It will be convenient for the purposes of this section and the next to adopt a notation that allows us to switch between
matrix representation and vector representation of elements of S(n, R). The convenience of this notation will be seen in Section 3.4.1 to arise from the lack of a convenient matrix form of the inverse of the linear matrix mapping on S(n, R), X ↦ XM + MX, where X and M are in S(n, R).
Choose an ordered basis

β = {β1, . . . , β_{s(n)}}    (3.29)

for S(n, R). For any x ∈ R^{s(n)} there corresponds a unique matrix x̂ ∈ S(n, R), where the correspondence is through the expansion of x̂ in the ordered basis β,

x̂ ≡ (x)^ := Σ_{i=1}^{s(n)} x_i β_i ∈ S(n, R)    (3.30)

Conversely, for any X ∈ S(n, R), let X̌ denote the vector of the expansion coefficients of

X = Σ_{i=1}^{s(n)} x_i β_i    (3.31)

in the basis β, so that

X̌ ≡ (X)ˇ = x    (3.32)

Then

(X̌)^ = X and (x̂)ˇ = x    (3.33)
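For n = 2 the hat and check maps can be written down explicitly. The basis used below matches (3.52) of Example 3.4.3 later in the chapter, and the roundtrip identities (3.33) can be checked directly.

```python
import numpy as np

# Ordered basis for S(2, R): s(2) = 3
beta = [np.array([[1., 0.], [0., 0.]]),
        np.array([[0., 1.], [1., 0.]]),
        np.array([[0., 0.], [0., 1.]])]

def hat(x):                     # R^3 -> S(2, R), expansion in the basis beta
    return sum(xi * b for xi, b in zip(x, beta))

def check(X):                   # S(2, R) -> R^3, coefficients in the basis beta
    return np.array([X[0, 0], X[0, 1], X[1, 1]])

x = np.array([1.0, -2.0, 3.0])
assert np.allclose(check(hat(x)), x)          # (x^)ˇ = x
X = np.array([[4.0, 0.5], [0.5, -1.0]])
assert np.allclose(hat(check(X)), X)          # (Xˇ)^ = X
```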
Let

Λ(t) := A(t)A(t)^T    (3.34)

Let F : R^{s(n)} × R+ → R^{s(n)}; (x, t) ↦ F(x, t) be defined by

F(x, t) := (x̂Λ(t)x̂ − I)ˇ    (3.35)
Let x∗ be a solution of F(x, t) = 0. Then x̂∗ is a symmetric square root of Λ(t).

Nothing in the form of F(x, t) (3.35) enforces the positive definiteness of the solution x̂∗(t), where x∗(t) is the solution of F(x, t) = 0. For instance, for each solution x∗(t) of F(x, t) = 0, −x∗(t) is also a solution. Each solution t ↦ x∗(t) is, however, isolated as long as D1F(x∗, t), where F(x, t) is defined by (3.35), is nonsingular. We will show in the next subsection, Subsection 3.4.1, that the nonsingularity of A(t) implies the nonsingularity of D1F(x∗, t).
3.4.1 The Lyapunov Map

We will use a linear dynamic inverse for F(x, t) (3.35) based upon the matrix inverse of D1F(x∗, t). We will estimate this matrix inverse using dynamic inversion. It is not immediately obvious, however, that D1F(x∗, t) is invertible. In this subsection we will consider the invertibility of D1F(x∗, t). Differentiate

F̂(x, t) = x̂Λ(t)x̂ − I    (3.36)

with respect to x̂ to get

D1F(x, t) = x̂Λ(t) + Λ(t)x̂    (3.37)
The differential D1F(x, t) expressed as a mapping S(n, R) → S(n, R) is

L_{Λ(t)x̂} : Y ↦ L_{Λ(t)x̂}(Y) := YΛ(t)x̂ + x̂Λ(t)Y    (3.38)

The representation of L_{Λ(t)x̂}(Y) on matrices Y expressed as vectors Y̌ ∈ R^{s(n)} in a basis β of S(n, R) is D1F(x, t) · Y̌. Thus the matrix D1F(x, t) is invertible if and only if L_{Λ(t)x̂} is an invertible map. We will refer to a map of the form
L_M : Y ↦ L_M Y := YM + MY    (3.39)

with Y and M in R^{n×n} as a Lyapunov map due to its relation to the Lyapunov equation YM + MY = Q which arises in the study of the stability of linear control systems (see e.g. Horn and Johnson [HJ91], Chapter 4). It may be easily verified that a Lyapunov map (3.39) is linear in Y. It may also be proven that L_M is an invertible map if no two eigenvalues of M add up to zero (see e.g. [HJ91], Theorem 4.4.6, page 270).

Now note that Λ(t)x̂∗ = x̂∗Λ(t) = P(t), which is positive definite and symmetric, having only real-valued and strictly positive eigenvalues. Thus no pair of eigenvalues of Λ(t)x̂∗ sums to zero. Therefore L_{Λ(t)x̂∗} is nonsingular. It follows then that the matrix D1F(x∗, t) is invertible. Since D1F(x, t) is continuous in x, it follows that D1F(x, t) remains invertible for all x in a sufficiently small neighborhood of x∗.

Though numerical inversion of the Lyapunov map has long been a topic of interest in the context of control theory [BS72, GNL79], we do not know of any matrix map L_M^{-1} : S(n, R) → S(n, R), taking matrices to matrices, which inverts L_M. By converting L_M to an s(n) × s(n) matrix, however, and representing elements of S(n, R) as vectors, the inverse L_M^{-1} as a mapping between vector spaces R^{s(n)} → R^{s(n)} can be obtained through standard matrix inversion or, as we will see, dynamic matrix inversion. This is why we sometimes resort to the vector notation of Remark 3.4.1 in referring to elements of S(n, R).
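The conversion of L_M to an s(n) × s(n) matrix can be sketched for n = 2: each column is the check of L_M applied to a basis element. The test matrix M below is an illustrative positive definite choice, for which no two eigenvalues sum to zero, so the matrix representation is invertible.

```python
import numpy as np

beta = [np.array([[1., 0.], [0., 0.]]),
        np.array([[0., 1.], [1., 0.]]),
        np.array([[0., 0.], [0., 1.]])]

def check(X):                   # S(2, R) -> R^3
    return np.array([X[0, 0], X[0, 1], X[1, 1]])

def lyap_matrix(M):
    """s(n) x s(n) matrix representing Y -> YM + MY on S(2, R) in the basis beta."""
    cols = [check(b @ M + M @ b) for b in beta]
    return np.column_stack(cols)

M = np.array([[2.0, 1.0], [1.0, 3.0]])     # positive definite: no two eigenvalues sum to zero
L = lyap_matrix(M)
assert abs(np.linalg.det(L)) > 1e-9        # L_M is invertible

# Solve Y M + M Y = Q for symmetric Q via the vector representation
Q = np.array([[1.0, 0.0], [0.0, 2.0]])
y = np.linalg.solve(L, check(Q))
Y = np.array([[y[0], y[1]], [y[1], y[2]]])
assert np.allclose(Y @ M + M @ Y, Q)
```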
3.4.2 Dynamic Polar Decomposition

The estimator for D1F(x∗, t)^{-1} will be denoted Γ ∈ R^{s(n)×s(n)}, so that

Γ∗ = D1F(x∗, t)^{-1}
(3.40)
Using Γ, we may define a dynamic inverse for F(x, t). Let G : R^{s(n)} × R^{s(n)×s(n)} → R^{s(n)}; (w, Γ) ↦ G[w, Γ] be defined by

G[w, Γ] := ( D1F(x∗, t)^{-1} |_{Γ∗ = Γ} ) · w = Γ · w    (3.41)

for w ∈ R^{s(n)}. This makes G[w, Γ] (3.41) a dynamic inverse for F(x, t) = (x̂Λ(t)x̂ − I)ˇ, as long as Γ is sufficiently close to Γ∗.
To construct an estimator E(x, Γ, t) ∈ R^{s(n)} of ẋ∗, first differentiate F(x∗, t) = 0,

D1F(x∗, t)ẋ∗ + D2F(x∗, t) = 0    (3.42)

and then solve for ẋ∗,

ẋ∗ = −D1F(x∗, t)^{-1} D2F(x∗, t) = −Γ∗ D2F(x∗, t)    (3.43)

Note that D2F(x∗, t) = (x̂∗Λ̇(t)x̂∗)ˇ. Now substitute x and Γ for x∗ and Γ∗ to obtain

E(x, Γ, t) := −Γ(x̂Λ̇(t)x̂)ˇ    (3.44)

To obtain Γ, let F^γ : R^{s(n)} × R^{s(n)×s(n)} × R+ → R^{s(n)×s(n)}; (x, Γ, t) ↦ F^γ(x, Γ, t) be defined by

F^γ(x, Γ, t) := D1F(x, t)Γ − I    (3.45)
A linear dynamic inverse for F^γ(x, Γ, t) is G^γ : R^{s(n)×s(n)} × R^{s(n)×s(n)} → R^{s(n)×s(n)}; (w, Γ) ↦ G^γ[w, Γ] defined by

G^γ[w, Γ] := Γ · w    (3.46)
For an estimator E^γ(x, Γ, t) for Γ̇∗, we differentiate F^γ(x∗, Γ∗, t) = 0 with respect to t, solve for Γ̇∗, and substitute x and Γ for x∗ and Γ∗ respectively to get

E^γ(x, Γ, t) := −Γ ( (d/dt)D1F(x, t) |_{ẋ∗ = E(x,Γ,t)} ) Γ    (3.47)
Combining the E’s, F’s, and G’s from (3.44), (3.35), (3.41), (3.47), (3.45), and (3.46), we obtain the dynamic inverter

ẋ = −µG[F(x, t), Γ] + E(x, Γ, t)
Γ̇ = −µG^γ[F^γ(x, Γ, t), Γ] + E^γ(x, Γ, t)    (3.48)
or in an expanded form

Dynamic Polar Decomposition for Time-Varying Matrices

ẋ = −µΓ(x̂Λ(t)x̂ − I)ˇ − Γ(x̂Λ̇(t)x̂)ˇ
Γ̇ = −µΓ(D1F(x, t)Γ − I) − Γ ( (d/dt)D1F(x, t) |_{ẋ = E(x,Γ,t)} ) Γ    (3.49)

x̂A(t) → U(t)
x̂A(t)A(t)^T → P(t)
A(t)^T (x̂)² → A(t)^{-1}
Initial conditions for the dynamic inverter (3.49) may be set so that x̂(0) ≈ P(0)^{-1} and Γ(0) ≈ D1F((P(0)^{-1})ˇ, 0)^{-1}. Under these conditions x̂(t) converges exponentially to P(t)^{-1}.
Combining the results above with the dynamic inversion theorem, Theorem 2.3.5, gives the following theorem.

Theorem 3.4.2 Dynamic Polar Decomposition of Time-Varying Matrices. Let A(t) be in GL(n, R) for all t ∈ R+. Let the polar decomposition of A(t) be A(t) = P(t)U(t) with P(t) ∈ S(n, R) the positive definite symmetric square root of Λ(t) := A(t)A(t)^T, and U(t) ∈ O(n, R) for all t ∈ R+. Let x be in R^{s(n)}, and let Γ be in R^{s(n)×s(n)}. Let (x(t), Γ(t)) denote the solution of the dynamic inverter (3.49) where F(x, t) is given by (3.35). Then there exists a µ̃ such that if the dynamic inversion gain µ satisfies µ > µ̃, and (x̂(0), Γ(0)) is sufficiently close to (P(0)^{-1}, D1F((P(0)^{-1})ˇ, 0)^{-1}), then

i. Λ(t)x̂(t) exponentially converges to P(t),

ii. x̂(t)A(t) exponentially converges to U(t), and

iii. A(t)^T x̂(t)² exponentially converges to A(t)^{-1}.
An example of the polar decomposition of a 2 × 2 matrix will illustrate application of Theorem 3.4.2.

Example 3.4.3 Polar Decomposition of a Time-Varying Matrix. Let

A(t) := [ 10 + sin(10t)   cos(t) ;  −t   1 ]    (3.50)
In this case x ∈ R³ and Γ ∈ R^{3×3}. We will perform polar decomposition and inversion of A(t) over t ∈ [0, 8], an interval over which A(t) is nonsingular. We will estimate P(t) and U(t) such that A(t) = P(t)U(t), with P(t) ∈ S(2, R) being the positive definite symmetric square root of A(t)A(t)^T, and with U(t) ∈ O(2, R). Let

Λ(t) = [ λ1   λ2 ;  λ2   λ3 ] = A(t)A(t)^T    (3.51)
We choose the ordered basis β of S(2, R) to be

β = { [ 1  0 ; 0  0 ],  [ 0  1 ; 1  0 ],  [ 0  0 ; 0  1 ] }    (3.52)
In this basis we have

F(x, t) = (x̂Λ(t)x̂ − I)ˇ = [ λ1x1² + 2λ2x1x2 + λ3x2² − 1 ;
                             λ1x1x2 + λ2x2² + λ2x1x3 + λ3x2x3 ;
                             λ1x2² + 2λ2x2x3 + λ3x3² − 1 ]    (3.53)
Then

D1F(x, t) = [ 2(λ1x1 + λ2x2)   2(λ2x1 + λ3x2)        0 ;
              λ1x2 + λ2x3      λ1x1 + 2λ2x2 + λ3x3   λ2x1 + λ3x2 ;
              0                2(λ1x2 + λ2x3)        2(λ2x2 + λ3x3) ]    (3.54)
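Formulas (3.53) and (3.54) can be checked against a finite-difference Jacobian; the evaluation point x and the time t below are arbitrary illustrative choices.

```python
import numpy as np

def A(t):   # the matrix of (3.50)
    return np.array([[10 + np.sin(10*t), np.cos(t)], [-t, 1.0]])

t = 0.7
L = A(t) @ A(t).T
l1, l2, l3 = L[0, 0], L[0, 1], L[1, 1]

def F(x):                       # components of (3.53)
    x1, x2, x3 = x
    return np.array([
        l1*x1**2 + 2*l2*x1*x2 + l3*x2**2 - 1,
        l1*x1*x2 + l2*x2**2 + l2*x1*x3 + l3*x2*x3,
        l1*x2**2 + 2*l2*x2*x3 + l3*x3**2 - 1])

def D1F(x):                     # the Jacobian (3.54)
    x1, x2, x3 = x
    return np.array([
        [2*(l1*x1 + l2*x2), 2*(l2*x1 + l3*x2), 0.0],
        [l1*x2 + l2*x3, l1*x1 + 2*l2*x2 + l3*x3, l2*x1 + l3*x2],
        [0.0, 2*(l1*x2 + l2*x3), 2*(l2*x2 + l3*x3)]])

x = np.array([0.3, -0.1, 0.5])
eps = 1e-6
Jnum = np.zeros((3, 3))
for i in range(3):              # central-difference Jacobian of F
    e = np.zeros(3); e[i] = eps
    Jnum[:, i] = (F(x + e) - F(x - e)) / (2 * eps)
assert np.allclose(D1F(x), Jnum, atol=1e-5)
```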
For an estimator for ẋ we have from (3.44)

E(x, Γ, t) = −Γ [ λ̇1x1² + 2λ̇2x1x2 + λ̇3x2² ;
                  λ̇1x1x2 + λ̇2x2² + λ̇2x1x3 + λ̇3x2x3 ;
                  λ̇1x2² + 2λ̇2x2x3 + λ̇3x3² ]    (3.55)

The estimator E^γ for Γ̇∗ is given by (3.47), where

(d/dt)D1F(x, t) |_{ẋ = E(x,Γ,t)} = [ L̇11  L̇12  0 ;  L̇21  L̇22  L̇23 ;  0  L̇32  L̇33 ]    (3.56)
with

L̇11 = 2λ̇1x1 + 2λ1E1(x, Γ, t) + 2λ̇2x2 + 2λ2E2(x, Γ, t)
L̇12 = 2L̇23
L̇21 = λ̇1x2 + λ1E2(x, Γ, t) + λ̇2x3 + λ2E3(x, Γ, t)
L̇22 = λ̇1x1 + λ1E1(x, Γ, t) + 2λ̇2x2 + 2λ2E2(x, Γ, t) + λ̇3x3 + λ3E3(x, Γ, t)
L̇23 = λ̇2x1 + λ2E1(x, Γ, t) + λ̇3x2 + λ3E2(x, Γ, t)
L̇32 = 2L̇21
L̇33 = 2λ̇2x2 + 2λ2E2(x, Γ, t) + 2λ̇3x3 + 2λ3E3(x, Γ, t)    (3.57)

Dynamic inversion using equations (3.49) was simulated using the adaptive step size Runge-Kutta integrator ode45 from Matlab, with the default tolerance of 10^{-6}. The initial conditions were set so that

x̂(0) = Λ(0)^{-1/2} + êx
Γ(0) = D1F(x(0), 0)^{-1}    (3.58)
where ex = [−0.55, 0.04, −2.48]T is an error that has been deliberately added to demonstrate
the error transient of the dynamic inverter. The value of µ was set to 10.
The graph of Figure 3.4 shows the values of the individual elements of A(t). The top graph of Figure 3.5 shows the elements of x(t), the estimator for P(t)^{-1}, and the bottom graph of Figure 3.5 shows the elements of Γ(t).

Figure 3.6 shows log10(‖x̂(t)Λ(t)x̂(t) − I‖∞), indicating the extent to which x̂, the estimator for P(t)^{-1}, fails to satisfy x̂Λ(t)x̂ − I = 0. For estimates of P(t), U(t), and A(t)^{-1} we have

x̂(t)A(t)A(t)^T → P(t),  x̂(t)A(t) → U(t),  and  A(t)^T x̂(t)² → A(t)^{-1}    (3.59)
Figure 3.4: Elements of A(t) (see (3.50)). See Example 3.4.3.
Figure 3.5: Elements of x (top), and Γ (bottom). See Example 3.4.3.
Figure 3.6: The error log10(‖x̂(t)Λ(t)x̂(t) − I‖∞) indicating the extent to which x fails to satisfy x̂Λ(t)x̂ − I = 0. The ripple from t ≈ 1.8 to t = 8 is due to numerical noise. See Example 3.4.3.
Remark 3.4.4 Symmetry of the Dynamic Inverter. It is interesting to note that P(t)^{-1}, besides being a solution to x̂Λ(t)x̂ − I = 0, is also a solution to Λ(t)x̂² − I = 0 as well as x̂²Λ(t) − I = 0. But Λ(t)x̂² − I and x̂²Λ(t) − I are not, in general, symmetric even when Λ(t) and x̂ are symmetric. Though exponential convergence is still guaranteed when using these forms, the flow Γ(t) is not, in general, confined to S(n, R). Using these forms would increase the number of equations in the dynamic inverter by n(n − 1)/2 + (n²)² − s(n)², since not only would the right hand side of the top equation of (3.49) no longer be symmetric, but Γ would be n² × n² rather than s(n) × s(n).
3.5 Polar Decomposition and Inversion of Constant Matrices

In the dynamic inversion techniques of Sections 3.2 and 3.4 we assumed that we had available an approximation of A^{-1}(0) with which to set Γ(0) in the dynamic inversion of A(t). Thus we would need to invert at least one constant matrix, A(0), in order to start the dynamic inverter. Methods of constant matrix inversion presented in Section 3.3 had the potential disadvantage of either producing exact inversion only asymptotically as t → ∞, or of only working on matrices with no eigenvalues in the interval (−∞, 0). The question naturally arises, then: how might we use dynamic inversion to invert any constant matrix so that the exact inverse is available by a prescribed time? In this section, by appealing to both homotopy and polar decomposition, we give an answer to this question.

Let M be in GL(n, R) with

P = P^T > 0,  UU^T = I,  and  M = PU    (3.60)
Helmke and Moore (see [HM94], pages 150–152) have described a gradient flow for the function ‖A − UP‖²,

dP̄/dt = −2P̄ + M^T Ū + Ū^T M
dŪ/dt = Ū P̄ M^T Ū − M P̄    (3.61)

where P̄ and Ū are meant to approximate P and U respectively. Asymptotically, this system produces factors P∗ and U∗ satisfying M − P∗U∗ = 0 for almost all initial conditions P̄(0), Ū(0) as t → ∞. A difficulty with this approach, as the authors point out, is that positive definiteness of the approximator P̄ is not guaranteed.

In this section we describe a dynamic system that provides polar decomposition of any nonsingular constant matrix by any prescribed time, with the positive definiteness
of the estimator of P guaranteed. This will be accomplished by applying Theorem 3.4.2 on dynamic polar decomposition of time-varying matrices to the homotopy

Λ(t) := (1 − t)I + tMM^T    (3.62)
Unlike the homotopy H(t) = (1 − t)I + tM of Section 3.3, the homotopy Λ(t) (3.62) is guaranteed to have a spectrum which avoids (−∞, 0) for any nonsingular M since Λ(t) is a
positive definite symmetric matrix for all t ∈ [0, 1]. The situation is depicted in Figure 3.7.
Figure 3.7: Λ(t) is positive definite and symmetric for all t ∈ [0, 1].

Recall that M is in GL(n, R). For Λ(t) as defined in (3.62) note that Λ(0) = I, Λ(1) = MM^T, and for all t ∈ [0, 1], Λ(t) is positive definite and symmetric. Let P(t)
denote the positive definite symmetric square root of Λ(t). Let the estimator of P^{-1}(t) be x̂ ∈ R^{n×n}. Differentiate Λ(t) (3.62) with respect to t to get

Λ̇(t) = MM^T − I    (3.63)
Now we may apply the dynamic inverter of Section 3.4 in order to perform the polar decomposition of M. As in (3.35), let

F(x, t) := (x̂Λ(t)x̂ − I)ˇ    (3.64)

By inspection it may be verified that x̂∗(0) = I and Γ∗(0) = (1/2)I. If we set x̂(0) = I and Γ(0) = (1/2)I, then Theorem 2.3.5 and the results of the last section assure us that x̂(t) ≡ P(t)^{-1}
for all t ≥ 0, and thus x̂(1) = P^{-1}. Consequently

x̂(1) = P^{-1}
Λ(1)x̂(1) = MM^T x̂(1) = P    (3.65)
x̂(1)M = U
M^T x̂(1)² = M^{-1}
Note that Λ̇(t) = MM^T − I = 0 if and only if M is orthogonal, in which case M^{-1} = M^T.
Combining the results of this section with the results of the last section gives the following theorem.

Theorem 3.5.1 Dynamic Polar Decomposition of Constant Matrices by a Prescribed Time. Let M be in GL(n, R). Let the polar decomposition of M be M = PU with P ∈ S(n, R) the positive definite symmetric square root of MM^T and U ∈ O(n, R). Let x be in R^{s(n)}, and let Γ be in R^{s(n)×s(n)}. Let x(0) = Iˇ and Γ(0) = (1/2)I. Let (x(t), Γ(t)) denote the solution of

Prescribed-Time Dynamic Inverter for Constant Matrices

ẋ = −µG[F(x, t), Γ] + E(x, Γ)
Γ̇ = −µG^γ[F^γ(x, Γ, t), Γ] + E^γ(x, Γ)    (3.66)

where

Λ(t) = (1 − t)I + tMM^T
F(x, t) = (x̂Λ(t)x̂ − I)ˇ
G[w, Γ] = Γ · w
E(x, Γ) = −Γ(x̂(MM^T − I)x̂)ˇ
F^γ(x, Γ, t) = D1F(x, t)Γ − I
G^γ[w, Γ] = Γ · w
E^γ(x, Γ) = −Γ ( (d/dt)D1F(x, t) |_{ẋ = E(x,Γ)} ) Γ

Then for any µ > 0,

MM^T x̂(1) = P,  x̂(1)M = U,  and  M^T x̂(1)² = M^{-1}    (3.67)
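A sketch of the inverter of Theorem 3.5.1 for the matrix of Example 3.5.3 below, with two simplifications that are assumptions of this sketch rather than the scheme (3.66) itself: Γ is obtained by inverting D1F(x, t) directly at each step instead of running the Γ flow, and fixed-step Euler integration stands in for an adaptive integrator.

```python
import numpy as np

M = np.array([[7.0, -3.0], [-24.0, -3.0]])      # the matrix of Example 3.5.3
MMT = M @ M.T
Ld = MMT - np.eye(2)                            # Lambda-dot (3.63), constant in t

def hat(x):                                     # R^3 -> S(2, R)
    return np.array([[x[0], x[1]], [x[1], x[2]]])

def check(X):                                   # S(2, R) -> R^3
    return np.array([X[0, 0], X[0, 1], X[1, 1]])

def D1F(x, Lam):                                # the Jacobian (3.54), lambdas read from Lam
    l1, l2, l3 = Lam[0, 0], Lam[0, 1], Lam[1, 1]
    x1, x2, x3 = x
    return np.array([
        [2*(l1*x1 + l2*x2), 2*(l2*x1 + l3*x2), 0.0],
        [l1*x2 + l2*x3, l1*x1 + 2*l2*x2 + l3*x3, l2*x1 + l3*x2],
        [0.0, 2*(l1*x2 + l2*x3), 2*(l2*x2 + l3*x3)]])

mu, h, N = 20.0, 2e-5, 50000                    # Euler steps from t = 0 to t = 1
x = check(np.eye(2))                            # xhat(0) = I = P(0)^{-1}
for k in range(N):
    t = k * h
    Lam = (1 - t) * np.eye(2) + t * MMT         # the homotopy (3.62)
    Gam = np.linalg.inv(D1F(x, Lam))            # shortcut: invert D1F directly each step
    F = check(hat(x) @ Lam @ hat(x) - np.eye(2))
    E = -Gam @ check(hat(x) @ Ld @ hat(x))      # derivative estimator
    x = x + h * (-mu * Gam @ F + E)

xh = hat(x)                                     # xhat(1) approximates P^{-1}
U = xh @ M                                      # orthogonal factor
Minv = M.T @ xh @ xh                            # M^{-1} = M^T xhat(1)^2
```

The recovered U is orthogonal and M^T x̂(1)² matches the true inverse of M to within the integration error.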
Remark 3.5.2 Polar Decomposition by Any Prescribed Time. As in Theorem 3.3.7 we can force x̂ to equal P^{-1} at any time t1 > 0 by substituting t/t1 for t in Λ(t), and proceeding with the derivation of the dynamic inverter as above. Then x̂(t1) = P^{-1}.
Example 3.5.3 A digital computer simulation of a dynamic inverter for the polar decomposition of a constant 2-by-2 matrix was performed. The integration was performed in Matlab [Mat92] using ode45, an adaptive-step-size Runge-Kutta routine, with the default tolerance of 10^{-6}. The matrix M was chosen randomly to be

M = [ 7   −3 ;  −24   −3 ]    (3.68)
The value of µ was set to 10. The evolution of the elements of x(t) and Γ(t) is shown in Figure 3.8.
Figure 3.8: Elements of x(t) (top) and Γ(t) (bottom), for Example 3.5.3.

Figure 3.9 shows the base 10 log of ‖F(x, t)‖∞ = ‖x̂(t)MM^T x̂(t) − I‖∞, indicating
the extent to which x̂, the estimator for P^{-1}, fails to satisfy x̂Λ(t)x̂ − I = 0, with Λ(1) = MM^T.
Figure 3.9: The base 10 log of the error ‖x̂(t)MM^T x̂(t) − I‖∞, for Example 3.5.3.

The final value (t = 1) of the error ‖x̂(t)Λ(t)x̂(t) − I‖∞ was

‖x̂(1)Λ(1)x̂(1) − I‖∞ = 1.0611 × 10^{-6}
(3.69)
Final values of P, U, and M^{-1} were

P = MM^T x̂(1) = [ 5.2444   −5.5223 ;  −5.5223   23.5479 ]

U = x̂(1)M = [ 0.3473   −0.9377 ;  −0.9377   −0.3473 ]    (3.70)

M^{-1} = M^T x̂(1)² = [ 0.0323   −0.0323 ;  −0.2581   −0.0753 ]
3.6 Chapter Summary

We have seen how the polar decomposition and inversion of time-varying and
constant matrices may be accomplished by continuous-time dynamic systems. Our results are easily modified to provide solutions for time-varying and time-invariant linear equations of the form A(t)x = b. We have also seen that dynamic inversion in the matrix context provides a useful and general conceptual framework through which to view other methods of dynamic computation such as gradient flow methods. In some control problems, dynamic inversion may provide essential signals which can be incorporated into controllers for nonlinear dynamic systems [GM95c]. In those same problems it may also be used for matrix inversion. For example, dynamic inversion will be incorporated into a controller for robotic manipulators in Chapter 5 where the dynamic inverter will produce inverse kinematic solutions necessary for the control law. If inversion of, say, a time-varying mass matrix is also required in the same problem, a dynamic inverter may be augmented to provide that capability too, without interfering with other inversions within the same problem.
Chapter 4

Tracking Implicit Trajectories

4.1 Introduction

In this chapter we consider the problem of controlling the output of a time-invariant
nonlinear control system to track a given implicitly defined reference trajectory. By an implicitly defined reference trajectory is meant a reference trajectory θ∗(t) defined as a particular continuous isolated solution to an equation of the form F(θ, t) = 0. A dynamic inverter (see Chapter 2) will be incorporated into a tracking controller in order to control the output of a nonlinear control system to track such an implicit trajectory.

A standard output-tracking controller for a given nonlinear time-invariant plant having vector relative degree (see Appendix C) relies upon explicit expressions for both an output reference trajectory, as well as the time-derivatives of the output reference trajectory. Given the reference trajectory and its derivatives, exponentially convergent tracking can be guaranteed by feedback linearization (see Section C of Appendix C) followed by standard tracking control for integrator chains (reviewed in Section B.4 of Appendix B). For a simple example, consider the system

ẋ = f(x) + g(x)w,  y = x    (4.1)

with input w, output y, state x, and where x, f(x), g(x), w, and y are in R, and f(0) = 0. Assume g(x) ≠ 0 for all x in a neighborhood of 0. We feedback-linearize system (4.1) by setting

w = (1/g(x)) (−f(x) + u)    (4.2)
in (4.1) to get

ẋ = u    (4.3)
Thus the relationship between the new input u and the output y is through the linear equation

ẏ = u    (4.4)

Then, to make the output y(t) track a reference trajectory yd(t), we set

u = ẏd − β(x − yd)    (4.5)
with β ∈ R, β > 0. Insert this u (4.5) into (4.3) to get

ẋ = ẏd − β(x − yd)    (4.6)
The control (4.5) causes the output y(t) to converge to the reference trajectory y_d(t) exponentially¹. This can be seen by letting e := x − y_d, in which case (4.6) takes the form

    ė = −βe        (4.7)

which is an autonomous linear dynamic system with an exponentially stable equilibrium e = 0. Thus e → 0 as t → ∞. Since e = y − y_d, this implies that y(t) → y_d(t) exponentially as t → ∞.
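A minimal numerical sketch of this loop (the reference y_d(t) = sin t, the gain β = 2, and the forward-Euler step are illustrative assumptions, not from the text):

```python
# Sketch of the feedback-linearized tracking loop (4.6): x' = yd' - beta*(x - yd),
# with the illustrative choices yd(t) = sin(t) and beta = 2.
import math

beta, dt = 2.0, 1e-4
x, t = 1.0, 0.0          # yd(0) = 0, so the initial tracking error is e(0) = 1
for _ in range(50000):   # integrate to t = 5 with forward Euler
    yd, yd_dot = math.sin(t), math.cos(t)
    u = yd_dot - beta * (x - yd)   # tracking law (4.5)
    x += dt * u                    # linearized plant (4.3): x' = u
    t += dt

e = x - math.sin(t)
print(abs(e))   # small: e(0)*exp(-beta*t) plus an O(dt) discretization residue
```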
Now suppose we substitute, for the explicit reference trajectory y_d(t) and its derivatives, estimators for an implicitly defined reference trajectory θ∗(t) and its derivatives, obtained through dynamic inversion. It seems reasonable to expect that the combination of dynamic inverter and controlled plant² will display error dynamics that are at least asymptotically stable. We will prove that this reasonable expectation is indeed correct, and that in fact we can achieve exponentially stable output tracking-error dynamics.
4.1.1 Motivation

The implicit tracking problem has been motivated in part by the problem of controlling robotic manipulators to track inverse-kinematic solutions, an application which is reviewed and explored in the robotic manipulator control context in Chapter 5. In the present chapter we study an implicit tracking problem, defined precisely in Section 4.2, that is more general for two reasons:

i. We consider the implicit output reference trajectory θ∗(t) to be the solution of an equation of the general form F(θ, t) = 0 rather than the form F(θ) − x(t) = 0 commonly used in the inverse-kinematics problem. We allow the possibility that the
¹ By exponential convergence of q(t) ∈ ℝⁿ to r(t) ∈ ℝⁿ we mean that there exist two positive real constants k_1 and k_2 such that ‖q(t) − r(t)‖ ≤ k_1 ‖q(0) − r(0)‖ e^{−k_2 t}.
² We use the standard term "plant" to refer to the dynamic system which we wish to control.
t-dependence of F (θ, t) may arise through dependence of F on the solution of an exogenous dynamic system. ii. We consider that there may be performance limitations due to the possibility of unbounded internal dynamics (defined precisely in Definition 4.2.3 below). Since we will use dynamic inversion in our controller, a primary concern is that any transients induced by the coupling of dynamic inverter to the plant do not cause the internal state of the resulting closed-loop system to become unbounded. Another motivation for study of the implicit tracking problem has been our use of implicit tracking in the control of nonlinear nonminimum-phase systems as we will detail in Chapter 6. That application will, in fact, include a slight variation in the structure of the derivative estimators.
4.1.2 Previous Work

To the best of our knowledge, previous work on the output tracking of implicitly defined reference trajectories has been confined to robotics-related work on implicit trajectories satisfying F(θ) − x(t) = 0. We therefore reserve a discussion of such previous work for the next chapter, where the robot control problem is discussed in some detail.
4.1.3 Main Results

The main results of this chapter are:

i. a useful characterization of acceptable internal behavior for nonlinear control systems, called output-bounded internal dynamics (Definition 4.2.7), in which a bound on the output and its time derivatives implies a bounded internal state,

ii. an algorithm (Algorithm 4.3.1) for constructing estimators of the time derivatives of the implicitly defined reference trajectory,

iii. a dynamic tracking controller that causes nonlinear control systems with output-bounded internal dynamics to track implicitly defined reference trajectories,

iv. an implicit tracking theorem (Theorem 4.3.4) describing conditions under which the implicit tracking controller guarantees exponential output-tracking convergence with a bounded internal state.
4.1.4 Chapter Overview

In Section 4.2 we give a precise definition of the implicit asymptotic tracking problem, and define the class of control systems in which we will be interested. In Section 4.3 we construct a dynamic compensator which (a) produces an explicit estimator for an implicit reference trajectory, and (b) causes the output of a nonlinear plant to converge exponentially to a given implicit reference trajectory. In Section 4.4 we give an example of implicit tracking control for a system that has output-bounded internal dynamics, but unstable zero dynamics. A simulation will illustrate tracking convergence with bounded internal state.
4.2 Problem Definition

In this section, after supplying the necessary assumptions, definitions, and mathematical setting, we precisely define the implicit tracking problem.
4.2.1 System Structure

We will consider nonlinear time-invariant control systems of the form

    ẋ = f(x) + g(x)w
    y = h(x)        (4.8)

with w and y in ℝ^p, and x ∈ ℝ^n.

Assumption 4.2.1 Assume that f : ℝ^n → ℝ^n, g : ℝ^n → ℝ^{n×p}, and h : ℝ^n → ℝ^p are sufficiently smooth in x. Assume also that f(0) = 0 and h(0) = 0. N
Assumption 4.2.2 Vector Relative Degree. Assume that system (4.8) has well-defined vector relative degree³

    r = [r_1, r_2, . . . , r_p]        (4.9)

in a neighborhood of the origin x = 0. N

By Assumption 4.2.2, r_i is the minimum number of times one must differentiate output component y_i in order to see any component w_k, k ∈ p, of the input w.

³ See Section C of Appendix C for a review of vector relative degree.
Figure 4.1: Schematic of (4.11).
Let

    r̄ := max_{i∈p} r_i.        (4.10)

Then r̄ is the highest relative degree of any of the outputs y_i of system (4.8), i.e. the maximum number of times one must differentiate any output y_i(t), i ∈ p, in order to see some component w_k, k ∈ p, of the input w.
Through standard state-dependent coordinate and input transformations (see Section C of Appendix C) we may input/output linearize the plant (4.8) so that it takes the form

    Plant P
    Σ_ext :  ξ̇_i^j = ξ_i^{j+1},  i ∈ p, j ∈ r_i − 1
             ξ̇_i^{r_i} = u_i,  i ∈ p
    Σ_int :  η̇ = α(ξ, η) + β(ξ, η)u
             y_i = ξ_i^1,  i ∈ p        (4.11)

with ξ_i^j ∈ ℝ and u_i ∈ ℝ. It follows from Assumption 4.2.1 that α(0, 0) = 0. The structure of (4.11) is illustrated in Figure 4.1.
4.2.2 Internal Dynamics

Let

    s_r := r_1 + · · · + r_p        (4.12)

be the sum of the relative degrees of the p outputs y_i, i ∈ p. Let

    ξ := [ξ_1^1, ξ_1^2, . . . , ξ_1^{r_1}, ξ_2^1, . . . , ξ_2^{r_2}, . . . , ξ_p^{r_p}]^T ∈ ℝ^{s_r}.        (4.13)

It follows that η is in ℝ^{n−s_r}.
Definition 4.2.3 We refer to the dynamics of η,

    Internal Dynamics of P
    Σ_int : η̇ = α(ξ, η) + β(ξ, η)u        (4.14)

obtained from (4.11), with ξ regarded as an exogenous time-dependent function, as the internal dynamics of (4.11). We refer to η as the internal state. N

Definition 4.2.4 We refer to the dynamics of η, obtained from (4.14) by setting ξ ≡ 0 and u ≡ 0, as the zero dynamics of (4.11),

    Zero Dynamics of P
    η̇ = α(0, η)        (4.15)

N

As pointed out by Isidori and Moog [IM91], other useful definitions of zero dynamics for square multi-input, multi-output nonlinear systems are possible. However, those definitions are equivalent to Definition 4.2.4 for the case of decoupled systems such as (4.11).

Definition 4.2.5 If the zero dynamics of (4.11) are asymptotically stable at η = 0, then (4.11) is a minimum-phase system⁴. Otherwise (4.11) is a nonminimum-phase system. N

⁴ If a transfer function H(s) of a linear system has a zero in the right half of the complex plane, the transfer function, when evaluated for s going along the imaginary axis from −j∞ to +j∞, undergoes a change in phase which is greater, for the same magnitude, than if that zero were replaced by its left-half-plane mirror image; hence the name nonminimum phase [FPEN86].
4.2.3 The Output Space

Let C_p^n[0, ∞) denote the normed space of n-times continuously differentiable ℝ^p-valued functions on [0, ∞). For y(·) ∈ C_p^n[0, ∞), let the norm ‖·‖^{(n)} be defined by

    ‖y(·)‖^{(n)} := sup_{t≥0} {‖y(t)‖_∞, ‖y^{(1)}(t)‖_∞, . . . , ‖y^{(n)}(t)‖_∞}        (4.16)
The open r-ball in C_p^n[0, ∞), using the norm (4.16), is

    B_r^{(n)} := {y(·) ∈ C_p^n[0, ∞) | ‖y(·)‖^{(n)} < r}.        (4.17)

Note that if y(·) is in B_r^{(n)}, then for each t ≥ 0, y^{(i)}(t) ∈ B_r ⊂ ℝ^p, i ∈ n, the r-ball in ℝ^p.

For any particular y(·) ∈ C_p^{r̄}[0, ∞) (see Equation (4.10) for the definition of r̄), define

    Y(t) := [y_1(t), y_1^{(1)}(t), . . . , y_1^{(r_1−1)}(t), y_2(t), . . . , y_2^{(r_2−1)}(t), . . . , y_p^{(r_p−1)}(t)]^T ∈ ℝ^{s_r}        (4.18)

    y^{(r)}(t) := [y_1^{(r_1)}, . . . , y_p^{(r_p)}]^T        (4.19)
Note that due to the structure of P (4.11), if y(·) is the output of P (4.11), then Y (·) ≡ ξ(·) (4.13).
Let

    (Y, u) := [y_1^{(0)}, . . . , y_1^{(r_1−1)}, y_2^{(0)}, . . . , y_p^{(r_p−1)}, u_1, . . . , u_p]^T        (4.20)

Note that (Y, u) = (Y, y^{(r)}). Since y_i^{(r_i)} = u_i, i ∈ p,

    ‖y‖^{(r̄)} = sup_{t≥0} {‖y^{(0)}(t)‖_∞, . . . , ‖y^{(r̄)}(t)‖_∞, ‖u(t)‖_∞}        (4.21)

Thus

    ‖y‖^{(r̄)} < ε ⟺ sup_{t≥0} ‖(Y(t), u(t))‖_∞ < ε        (4.22)

and

    ‖y‖^{(r̄)} < ε ⟹ ‖Y(t)‖_∞ < ε, ∀t ≥ 0.        (4.23)
Assumption 4.2.6 Bounded Implicit Output-Reference Trajectory. Assume that the reference output that we wish to track is a particular continuous isolated solution θ∗(t) ∈ ℝ^p of

    F(θ, t) = 0,        (4.24)

where F : ℝ^p × ℝ₊ → ℝ^p; (θ, t) ↦ F(θ, t) is C^{r̄+2} in θ and t, and satisfies assumptions i, ii, and iii of the dynamic inverse existence lemma, Lemma 2.2.11. Assume, furthermore, that

    ‖θ∗(·)‖^{(r̄)} < δ        (4.25)

for some constant δ > 0. N
4.2.4 Output-Bounded Internal Dynamics

It is useful to define a class of nonlinear control systems having the property that
if the output y(t) and its derivatives up to some finite order are bounded5 to be sufficiently small, then the internal state η of the system is guaranteed to be bounded. If a control system meets such a criterion then one may essentially ignore the internal dynamics if it can be guaranteed that, in a suitable norm, the output and its derivatives are sufficiently small. Minimum-phase systems (see Definition 4.2.5) are such a class, but the class of minimum-phase systems is only a subclass of the class of systems that possess the desired property. We introduce in the present subsection a property called output-bounded internal dynamics. Informally, a system with output-bounded internal dynamics is one in which the internal state is bounded whenever the output and its derivatives up to some finite order are bounded to be sufficiently small. However, output-bounded internal dynamics does not imply that the bound on the internal state goes to zero as the bound on the output and its derivatives goes to zero. Thus, for instance, smooth control systems with stable zero dynamics (see Definition 4.2.4) also have output-bounded internal dynamics, but systems with output-bounded internal dynamics may have unstable zero dynamics. Consider as an example of a system with output-bounded internal dynamics (and unstable zero dynamics) a planar cart upon which is fixed a bowl in which a ball is free to roll. Assume that no energy is dissipated by the interaction of the rolling ball and the bowl, and that the mass of the ball is a point mass located at the center of the ball.
Figure 4.2: The cart and ball system.

⁵ Whenever we say that a signal or state is bounded, we will mean that its norm (4.16) is bounded above.
The position y(t) of the cart is the output of the cart-ball system, and the acceleration6 u of the cart is the input. The position and velocity of the ball comprise the internal state of the cart-ball system. Assume as indicated in Figure 4.2 that the bowl is such that its center is higher than the region immediately surrounding the center, but that to either side of the center there is a relative minimum with respect to height. Figure 4.3 shows the three resulting equilibria: the unstable one at the center of the bowl, and the stable ones to the left and right of center.
Figure 4.3: Three equilibria of the cart and ball.
Objective: We would like to cause the cart to asymptotically track desired output reference trajectories yd (t), while allowing the ball to remain in the bowl. The zero dynamics of the cart-ball system are the dynamics of the ball in the bowl when the cart is held still, i.e. u ≡ 0 and y ≡ 0. Clearly the ball of Figures 4.2 and 4.3 is unstable at the origin (center bottom of bowl), hence the zero dynamics of the cart-ball
system are unstable. If, however, y(t) and its derivatives up to order 2 are kept sufficiently small, then we may guarantee that the ball never leaves the bowl7 . This is what we mean by output-bounded internal dynamics.
⁶ Choosing the acceleration of the cart as the input rather than the force on the cart is equivalent to input-output linearization of the cart-ball system.
⁷ In fact, due to the assumption that the cart rolls along flat ground, as indicated in Figure 4.2, we need only restrict ÿ to be small, but for generality in our definition of output-bounded internal dynamics we ignore this particular property.
Figure 4.4: If ‖(y(t), ẏ(t), ÿ(t))‖ is kept sufficiently small for all t ≥ 0, then the ball remains in the bowl.

Now, regarding the internal dynamics of a class of systems of the form (4.11), we make the following formal definition (see Figure 4.5):
Figure 4.5: Output-bounded internal dynamics.
Output-Bounded Internal Dynamics
Definition 4.2.7 A system of the form (4.11) has output-bounded internal dynamics if there exist real numbers ρ > 0 and κ > 0 such that if ‖y(·)‖^{(r̄)} < κ and ‖η(0)‖ < ρ, then ‖η(t)‖ < ρ for all t ≥ 0. N
Figure 4.6 shows some cart-ball systems that do not have output-bounded internal dynamics. Figure 4.7 shows some cart-ball systems that do have output-bounded internal dynamics. Note the following:
• The cart-ball system ‘i’ in Figure 4.7 is obtained from the cart-ball system ‘c’ in Figure 4.6 by a shift of internal coordinates.
• All cart-ball systems in Figure 4.6 have unstable zero dynamics.
• Cart-ball systems ‘k’ and ‘l’ of Figure 4.7 have unstable zero dynamics, while carts ‘g’, ‘h’, ‘i’, and ‘j’ have stable zero dynamics.
Figure 4.6: Some cart-ball systems that do not have output-bounded internal dynamics.
Figure 4.7: Some cart-ball systems that do have output-bounded internal dynamics.

Remark 4.2.8 Control of carts 'a', 'c', and 'd' of Figure 4.6 represents a subclass of nonminimum-phase control problems, including the inverted pendulum on a cart and the controlled bicycle, that we will consider in some depth in Chapter 6. N
Assumption 4.2.9 Output-Bounded Internal Dynamics. Assume that the control system (4.11) has output-bounded internal dynamics.
N
Remark 4.2.10 Unstable Zero-Dynamics. Under Assumption 4.2.9, system (4.11) may have unstable zero dynamics, i.e. even if (4.11) has output-bounded internal dynamics, the origin of the system η˙ = α(0, η) may be unstable.
N
An example will illustrate Remark 4.2.10.

Example 4.2.11 Output-Bounded Internal Dynamics. Consider the plant

    ξ̇^1 = ξ^2
    ξ̇^2 = u
    η̇ = (η + 3)(η + 1)η(η − 1)(η − 3) + ξ^1 + (ξ^2)^2 + u
    y = ξ^1        (4.26)
Figure 4.8: The zero-dynamics vector field φ(η) for the zero dynamics (4.27) of Example 4.2.11. The origin of the zero dynamics is unstable, but η(t) is bounded on [0, ∞) when |η(0)| < 3.

which is of the form (4.11). The zero dynamics of (4.26) are obtained by setting y ≡ 0 and
u ≡ 0. Setting y ≡ 0 implies ξ^1 = 0 and ξ^2 = 0. Thus the zero dynamics of system (4.26) are (see also (4.15))

    η̇ = (η + 3)(η + 1)η(η − 1)(η − 3) = α(0, η) =: φ(η).        (4.27)
The vector field φ(η) (4.27) of the zero dynamics is graphed in Figure 4.8. It has equilibria at η ∈ {−3, −1, 0, 1, 3}. The zero dynamics are clearly unstable at η = 0, as evidenced by the observation that the slope of φ(η) is positive at η = 0. Note that for all |η(0)| < 3, the solution η(t) of the zero dynamics (4.27) satisfies |η(t)| < 3 for all t ≥ 0.
We claim that if |η(0)| < 2 (corresponding to ρ = 2 in Definition 4.2.7) and ‖y(·)‖^{(2)} < 4 (corresponding to κ = 4 in Definition 4.2.7), then the solution η(t) of

    η̇ = (η + 3)(η + 1)η(η − 1)(η − 3) + ξ^1 + (ξ^2)^2 + u        (4.28)
satisfies |η(t)| < 2 for all t ≥ 0. To wit: take as a Lyapunov function candidate for the internal dynamics (4.28)

    V(η) := (1/2) η².        (4.29)
Differentiate V(η) with respect to t to get

    (d/dt)V(η) = η η̇ = (η + 3)(η + 1)η²(η − 1)(η − 3) + η(ξ^1 + (ξ^2)^2 + u).        (4.30)
Note that ξ^1 = y, ξ^2 = ẏ, and u = ÿ. Consider the interval [−2, 2] = B_2 for η; in particular, consider the endpoints of this interval. It may easily be verified by substituting 2 and −2 for η in (4.30) that if

    |ξ^1(t) + (ξ^2(t))^2 + u| = |y(t) + ẏ(t)² + ÿ| < 30 for all t ≥ 0,        (4.31)

then (d/dt)V(±2) < 0. It follows that if

    ‖y(·)‖^{(2)} < 4,        (4.32)

then η(t) remains in [−2, 2] for all t ≥ 0. Therefore the plant (4.26) has output-bounded internal dynamics about the equilibrium (ξ, η) = (0, 0). N
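The boundedness claim can be probed numerically. In the sketch below (a check of ours, not part of the dissertation; the forcing output y(t) = 2 sin t, which satisfies ‖y‖^{(2)} < 4, the initial condition η(0) = 1.9, and the Euler step are illustrative assumptions), the internal state remains inside [−2, 2] despite the unstable zero dynamics:

```python
# Numerical check of Example 4.2.11: integrate the internal dynamics (4.28)
# under the assumed forcing y(t) = 2*sin(t), for which ||y||^(2) < 4.
import math

def phi(eta):
    # zero-dynamics vector field (4.27)
    return (eta + 3) * (eta + 1) * eta * (eta - 1) * (eta - 3)

dt = 1e-3
eta, t = 1.9, 0.0                  # |eta(0)| < 2, i.e. rho = 2 in Definition 4.2.7
max_abs_eta = abs(eta)
for _ in range(30000):             # integrate to t = 30
    y = 2.0 * math.sin(t)          # xi^1 = y
    y_dot = 2.0 * math.cos(t)      # xi^2 = y'
    y_ddot = -2.0 * math.sin(t)    # u = y''
    eta += dt * (phi(eta) + y + y_dot**2 + y_ddot)   # internal dynamics (4.28)
    t += dt
    max_abs_eta = max(max_abs_eta, abs(eta))

print(max_abs_eta)   # stays below rho = 2 despite the unstable zero dynamics
```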
4.2.5 The Problem

We can now define the problem central to this chapter.

Problem 4.2.12 Asymptotic Implicit Tracking Problem. Let θ∗(·) ∈ B_κ^{(r̄)} ⊂ C_p^{r̄}[0, ∞) be an isolated solution of F(θ, t) = 0. Given the control system P (4.11) having output-bounded internal dynamics, find a compensator C such that for the closed-loop system [C, P] defined by Figure 4.9,

i. y(t) → θ∗(t) asymptotically,

ii. the internal state of [C, P] is bounded. N
In fact the compensator C for tracking implicit trajectories will be dynamic, its dynamic part being a dynamic inverter with state (Γ, θ) as indicated in Figure 4.9 (The state Γ is the part of the dynamic inverter state from which the dynamic inverse is formed as discussed in Chapter 2, Section 2.4). The internal state of the closed loop system [C, P ], i.e. the unobservable state, is (η, Γ, θ). Thus if item ii of Problem 4.2.12 is satisfied, then η(t) is bounded on [0, ∞).
Figure 4.9: The closed-loop control system [C, P].
4.3 Tracking Control

In this section we will construct the implicit tracking controller and prove that it provides exponentially convergent tracking with bounded internal dynamics. In Subsection 4.3.1 we will review tracking control of explicit trajectories for systems of the form (4.11). In Subsection 4.3.2 we will apply dynamic inversion to obtain an estimator θ(t) for an implicit reference output θ∗(t). Since we will, in general, require estimates of higher time derivatives of the reference output θ∗(t), in Subsection 4.3.3 we give an algorithm for obtaining derivative estimators dependent upon the state of a dynamic inverter. Then in Subsection 4.3.4 we will join the dynamic inverter and the derivative estimators to the plant P (4.11) to create the closed-loop system [C, P]. In Subsection 4.3.5 we will prove that the resulting system [C, P] provides exponentially convergent tracking with bounded internal dynamics.
4.3.1 Tracking Explicit Trajectories

For a desired reference trajectory y_d(t) = [y_{d1}(t), . . . , y_{dp}(t)]^T ∈ C_p^{r̄}[0, ∞), let

    Y_d(t) := [y_{d1}(t), . . . , y_{d1}^{(r_1−1)}(t), y_{d2}(t), . . . , y_{d2}^{(r_2−1)}(t), . . . , y_{dp}(t), . . . , y_{dp}^{(r_p−1)}(t)]^T        (4.33)

and

    y_d^{(r)}(t) := [y_{d1}^{(r_1)}(t), . . . , y_{dp}^{(r_p)}(t)]^T ∈ ℝ^p        (4.34)

and assume that y_d(·) is in B_δ^{(r̄)}, equivalently that (Y_d(t), y_d^{(r)}(t)) ∈ B_δ for all t ≥ 0. Let {β_i^j}, i ∈ p, j ∈ r_i, be chosen to be real constant coefficients of the polynomials in s,

    s^{r_i} + Σ_{j=1}^{r_i} β_i^j s^{j−1},  i ∈ p,        (4.35)

such that all roots of the polynomials have strictly negative real parts. It is a standard and elementary result of linear control theory (see Section B.4 of Appendix B) that the choice
of input

    u_i = y_{di}^{(r_i)}(t) − Σ_{k=1}^{r_i} β_i^k (ξ_i^k − y_{di}^{(k−1)}),  i ∈ p        (4.36)
will cause ξ(t) to converge to Y_d(t) with exponentially decaying error. It follows from the form of (4.36) that (ξ(t), u(t)) → (Y_d(t), y_d^{(r)}(t)) exponentially as t → ∞. If δ and ν, with 0 < ν ≤ κ, are sufficiently small, then the exponential convergence of ξ(t) to Y_d(t), together with the assumption that y_d is in B_δ^{(r̄)}, implies that ξ(t) remains in B_κ for all t ≥ 0 (see Figure 4.10). This, combined with Assumption 4.2.9, guarantees that η(t) remains bounded.
Figure 4.10: If ‖(ξ(0), u(0))‖ < ν and ‖(Y_d(t), y_d^{(r)}(t))‖ < δ with ν and δ sufficiently small, then convergence of (ξ(t), u(t)) to (Y_d(t), y_d^{(r)}(t)) preserves the upper bound ρ on the internal state η(t).

In the present case, θ∗(t) is the desired output reference trajectory we would like to track. Of course, if we had an explicit expression for θ∗(t), we could simply substitute θ∗(t) and its derivatives for y_d(t) and its derivatives in (4.36). We assume, however, that such an explicit expression for θ∗(t) may not be available.
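The explicit tracking law above can be sketched for a single output of relative degree 2. Everything specific below is an illustrative assumption rather than the dissertation's own example: the reference y_d(t) = sin t, and the gains β¹ = 2, β² = 3, which make the polynomial (4.35) read s² + 3s + 2 with roots −1 and −2:

```python
# Sketch of the explicit tracking law (4.36) on a double-integrator chain
# xi1' = xi2, xi2' = u, for the assumed reference yd(t) = sin(t).
import math

beta1, beta2 = 2.0, 3.0          # place roots of s^2 + beta2*s + beta1 at -1, -2
dt = 1e-4
xi1, xi2, t = 1.0, 0.0, 0.0      # initial tracking error xi1(0) - yd(0) = 1
for _ in range(100000):          # integrate to t = 10 with forward Euler
    yd, yd1, yd2 = math.sin(t), math.cos(t), -math.sin(t)
    u = yd2 - beta1 * (xi1 - yd) - beta2 * (xi2 - yd1)   # tracking law (4.36)
    xi1, xi2 = xi1 + dt * xi2, xi2 + dt * u
    t += dt

err = abs(xi1 - math.sin(t))
print(err)   # exponentially small tracking error (up to an O(dt) residue)
```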
4.3.2 Estimating the Implicit Reference Trajectory

The implicit reference trajectory θ∗(t) may be estimated by using a dynamic inverter satisfying the assumptions of Theorem 2.3.5 on dynamic inversion with vanishing error. In this chapter we will use a dynamic inverse of the form (2.110), which we assume
satisfies the assumptions of Theorem 2.4.6. This will allow us to determine a dynamic inverse dynamically. For convenience we repeat that dynamic inverter here:

    [ Γ̇ ]        [ Γ  0 ] [ D₁F(θ, t)Γ − I ]   [ −Γ ((d/dt)D₁F(θ, t))|_{θ̇ = −Γ D₂F(θ,t)} Γ ]
    [ θ̇ ] = −μ  [ 0  Γ ] [ F(θ, t)        ] + [ −Γ D₂F(θ, t)                              ]        (4.37)
For notational simplicity, define

    Ḡ[w] := [ Γ  0 ]
             [ 0  Γ ] · w,

    F̄(Γ, θ, t) := [ D₁F(θ, t)Γ − I ] = [ F^γ(Γ, θ, t) ]
                   [ F(θ, t)        ]   [ F(θ, t)      ]        (4.38)

and

    Ē(Γ, θ, t) := [ −Γ ((d/dt)D₁F(θ, t))|_{θ̇ = −Γ D₂F(θ,t)} Γ ] = [ E^γ(Γ, θ, t) ]
                   [ −Γ D₂F(θ, t)                              ]   [ E(Γ, θ, t)   ]        (4.39)
so that the dynamic inverter (4.37) is represented by

    [ Γ̇ ]
    [ θ̇ ] = −μ Ḡ[F̄(Γ, θ, t)] + Ē(Γ, θ, t)        (4.40)

where (Γ∗(t), θ∗(t)) is defined to be a continuous isolated solution of F̄(Γ, θ, t) = 0, with Γ∗(t) ≡ D₁F(θ∗, t)⁻¹. By Theorem 2.4.6, (Γ(t), θ(t)) converges to (Γ∗(t), θ∗(t)) exponentially for sufficiently large μ > 0 if (Γ(0), θ(0)) is sufficiently close to (Γ∗(0), θ∗(0)).
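A minimal numerical sketch of such a dynamic inverter (our own illustration, not from the text: the scalar problem F(θ, t) = θ³ + θ − sin t, the gain μ = 100, and the Euler step are assumptions). The state Γ estimates D₁F(θ∗, t)⁻¹ while θ tracks the root θ∗(t):

```python
# Sketch of the dynamic inverter (4.40) for the assumed scalar problem
# F(theta, t) = theta^3 + theta - sin(t), with D1F = 3*theta^2 + 1 and
# D2F = -cos(t).  Gamma estimates D1F(theta*, t)^{-1}.
import math

mu, dt = 100.0, 1e-4
theta, Gamma, t = 0.5, 1.0, 0.0
for _ in range(100000):                            # integrate to t = 10
    F = theta**3 + theta - math.sin(t)
    D1F = 3.0 * theta**2 + 1.0
    D2F = -math.cos(t)
    E1 = -Gamma * D2F                              # estimator for d(theta*)/dt
    Eg = -Gamma * (6.0 * theta * E1) * Gamma       # estimator for d(Gamma*)/dt
    theta, Gamma = (theta + dt * (-mu * Gamma * F + E1),
                    Gamma + dt * (-mu * Gamma * (Gamma * D1F - 1.0) + Eg))
    t += dt

residual = abs(theta**3 + theta - math.sin(t))     # |F(theta(T), T)|
gamma_err = abs(Gamma * (3.0 * theta**2 + 1.0) - 1.0)
print(residual, gamma_err)   # both small: (theta, Gamma) tracks (theta*, Gamma*)
```

Only integration is performed; no discrete root-finding or matrix inversion appears anywhere in the loop, which is the computational point of the construction.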
4.3.3 Estimating Derivatives of Implicit Trajectories

We will substitute estimators for the time derivatives of θ∗(t) into the tracking law (4.36) in place of the exact time derivatives of θ∗(t). In this subsection we show how to obtain such derivative estimates from F(θ, t) as functions of t, Γ, and θ.

We may obtain an estimator for θ∗^{(k)} for any k ≥ 0 by the following recursive algorithm:
Algorithm 4.3.1 Derivative Estimator Algorithm.

Data:
i. k ∈ ℤ₊.
ii. The function F(θ, t), assumed to be C^k in θ and t.

If k = 1: Let E^1(Γ, θ, t) = −Γ D₂F(θ, t).

If k > 1: E^k(Γ, θ, t) = (d/dt) E^{k−1}(Γ, θ, t)|_{θ̇ = E^1(Γ,θ,t), Γ̇ = E^γ(Γ,θ,t)}

where

    E^γ(Γ, θ, t) = −Γ [ Σ_{i=1}^{n} (∂/∂θ_i) D₁F(θ, t) E_i^1(Γ, θ, t) + D_{2,1}F(θ, t) ] Γ

Output: E^k(Γ, θ, t)
Recall that E^γ(Γ, θ, t) is the estimator for Γ̇∗ (see Section 2.4, Remark 2.4.3). By construction, the estimators E^i(Γ, θ, t) produced by Algorithm 4.3.1 satisfy E^i(Γ∗, θ∗, t) = θ∗^{(i)}(t), and by Assumption 4.2.6, E^i(Γ, θ, t) is continuously differentiable in each of its arguments for Γ and θ sufficiently close to Γ∗ and θ∗.

Remark 4.3.2 Note that in general,

    (d/dt) E^i(Γ, θ, t) ≠ E^{i+1}(Γ, θ, t).        (4.41)

Only at (Γ, θ) = (Γ∗, θ∗) is equality guaranteed. N
Example 4.3.3 Application of the Derivative Estimator Algorithm. Let

    F(θ, t) := k sin(θ) − cos(θ)u        (4.42)

Then

    E^1(Γ, θ, t) = −Γ · D₂F(θ, t) = −Γ(−cos(θ)u̇) = Γ cos(θ)u̇        (4.43)

To get E^2(Γ, θ, t), first

    E^γ(Γ, θ, t) = −Γ · [(−k sin(θ) + cos(θ)u) E^1(Γ, θ, t) + sin(θ)u̇] · Γ        (4.44)
Then,

    E^2(Γ, θ, t) = (d/dt) E^1(Γ, θ, t)|_{Γ̇ = E^γ(Γ,θ,t), θ̇ = E^1(Γ,θ,t)}
                 = (d/dt)[Γ cos(θ)u̇]|_{Γ̇ = E^γ(Γ,θ,t), θ̇ = E^1(Γ,θ,t)}
                 = [Γ̇ cos(θ)u̇ − Γ sin(θ)θ̇u̇ + Γ cos(θ)ü]|_{Γ̇ = E^γ(Γ,θ,t), θ̇ = E^1(Γ,θ,t)}
                 = E^γ(Γ, θ, t) cos(θ)u̇ − Γ sin(θ)E^1(Γ, θ, t)u̇ + Γ cos(θ)ü        (4.45)

N

4.3.4 Combined Dynamic Inverter and Plant

In this subsection we combine the dynamic inverter (4.37) with derivative estimators and the plant (4.11) to get the closed-loop system [C, P]. Let E_i^k(Γ, θ, t) denote the ith component of the vector-valued function E^k(Γ, θ, t). Substitute the estimators E^k(Γ, θ, t) for θ∗^{(k)}, k ∈ r̄, into the control law (4.36) to get the new control law
Implicit Tracking Control Law

    ũ_i(ξ, Γ, θ, t) = E_i^{r_i}(Γ, θ, t) − Σ_{k=1}^{r_i} β_i^k (ξ_i^k − E_i^{k−1}(Γ, θ, t))        (4.46)

Combining plant (4.11), dynamic inverter (4.40), and control law (4.46) gives the closed-loop system

    Implicit Tracking Controller and Plant [C, P]
    ξ̇_i^j = ξ_i^{j+1},  i ∈ p, j ∈ r_i − 1
    ξ̇_i^{r_i} = ũ_i(ξ, Γ, θ, t)
    η̇ = α(ξ, η) + β(ξ, η) ũ(ξ, Γ, θ, t)
    [ Γ̇ ]
    [ θ̇ ] = −μ Ḡ[F̄(Γ, θ, t)] + Ē(Γ, θ, t)        (4.47)
4.3.5 An Implicit Tracking Theorem

Let

    Θ∗(t) := [θ∗1^{(0)}, θ∗1^{(1)}, . . . , θ∗1^{(r_1−1)}, θ∗2^{(0)}, . . . , θ∗2^{(r_2−1)}, . . . , θ∗p^{(r_p−1)}]^T ∈ ℝ^{s_r}.        (4.48)
The following theorem gives sufficient conditions under which the closed-loop system [C, P] (4.47) will solve the asymptotic implicit tracking problem, Problem 4.2.12.

Theorem 4.3.4 Implicit Tracking Theorem. Assume that

i. θ∗(t) is a continuous isolated solution of F(θ, t) = 0, and G[w, Γ, t] is a dynamic inverse of F(θ, t),

ii. plant (4.11) has output-bounded internal dynamics,

iii. θ∗(·) is in B_δ^{(r̄)},

iv. the right-hand side of system (4.47) is C² in its arguments and all of its partial derivatives up to order 2 are bounded for all ξ, η, and (Γ − Γ∗, θ − θ∗) sufficiently small.

If (θ(0) − θ∗(0)), (Γ(0) − Γ∗(0)), (ξ(0) − Θ∗(0)), δ, and η(0) are sufficiently small, then there exist a ν > 0 and a μ̄ > 0 such that for all ξ(0) ∈ B_ν ⊂ ℝ^{s_r} and all positive μ > μ̄, the output y(t) of (4.47) converges exponentially to θ∗(t), while (η, Γ, θ) remains bounded.
Proof of Theorem 4.3.4: After some coordinate changes are applied to [C, P] (4.47), we will prove exponential convergence of Y(t) to Θ∗(t) for sufficiently large μ using singular perturbation theory⁸. We will rely upon a theorem from Khalil [Kha92], restated in Appendix B, to prove exponential convergence of ξ(t) to Θ∗(t). Then we will show that a ν > 0 exists such that if ξ(0) ∈ B_ν ⊂ ℝ^{s_r}, then exponentially convergent tracking with bounded internal dynamics is achieved.
A. Let ε := 1/μ. Define a coordinate change (ξ, η, Γ, θ) → (e, η, w, z) by

    e_i^j = ξ_i^j − θ∗_i^{(j−1)},
    w = Γ − Γ∗,        (4.49)
    z = θ − θ∗,

with η left unchanged. Let u = ũ as defined by (4.46). Through substitution of the error
See Kokotovic, et al. [KHO86] for a review of singular perturbation theory in the context of control theory.
coordinates (4.49) into (4.46), as well as some algebra, we get

    ũ_i(ξ, Γ, θ, t) = E_i^{r_i}(Γ, θ, t) − Σ_{k=1}^{r_i} β_i^k (ξ_i^k − E_i^{k−1}(Γ, θ, t))
                    = θ∗_i^{(r_i)} − Σ_{k=1}^{r_i} β_i^k e_i^k + [E_i^{r_i}(w + Γ∗, z + θ∗, t) − θ∗_i^{(r_i)}]
                      + Σ_{k=1}^{r_i} β_i^k [E_i^{k−1}(w + Γ∗, z + θ∗, t) − θ∗_i^{(k−1)}]        (4.50)

Substitute the resulting expression for ũ into (4.47) to get [C, P] in error coordinates,

    ė_i^j = e_i^{j+1},  i ∈ p, j ∈ r_i − 1
    ė_i^{r_i} = −Σ_{k=1}^{r_i} β_i^k e_i^k + [E_i^{r_i}(w + Γ∗, z + θ∗, t) − θ∗_i^{(r_i)}]
                + Σ_{k=1}^{r_i} β_i^k [E_i^{k−1}(w + Γ∗, z + θ∗, t) − θ∗_i^{(k−1)}]
    η̇ = α(ξ, η) + β(ξ, η) ũ(ξ, w + Γ∗, z + θ∗, t)        (4.51)
    ε [ ẇ ]
      [ ż ] = −Ḡ[F̄(w + Γ∗, z + θ∗, t)] + ε ( Ē(w + Γ∗, z + θ∗, t) − [ Γ̇∗ ]
                                                                      [ θ̇∗ ] )

B. Now consider the error system obtained from (4.51) by omitting the η-dynamics,

    ė_i^j = e_i^{j+1},  i ∈ p, j ∈ r_i − 1
    ė_i^{r_i} = −Σ_{k=1}^{r_i} β_i^k e_i^k + [E_i^{r_i}(w + Γ∗, z + θ∗, t) − θ∗_i^{(r_i)}]
                + Σ_{k=1}^{r_i} β_i^k [E_i^{k−1}(w + Γ∗, z + θ∗, t) − θ∗_i^{(k−1)}]        (4.52)
    ε [ ẇ ]
      [ ż ] = −Ḡ[F̄(w + Γ∗, z + θ∗, t)] + ε ( Ē(w + Γ∗, z + θ∗, t) − [ Γ̇∗ ]
                                                                      [ θ̇∗ ] )
We will show that (4.52) satisfies assumptions i through v of Theorem B.3.1 of Appendix B. Note the following:

i. The origin (e, w, z) = (0, 0, 0) is an equilibrium of (4.52).

ii. The equation obtained from the dynamic inverter part of (4.47) by setting ε = 0, namely

    0 = −Ḡ[F̄(w + Γ∗, z + θ∗, t)]        (4.53)
has an isolated solution at (w, z) = (0, 0).

iii. By assumption, the right-hand side of (4.52) and its partial derivatives up to order 2 are bounded for sufficiently small (e, w, z).

iv. For the linear time-invariant system

    ė_i^j = e_i^{j+1},  i ∈ p, j ∈ r_i − 1
    ė_i^{r_i} = −Σ_{k=1}^{r_i} β_i^k e_i^k        (4.54)

e = 0 is an exponentially stable equilibrium.
v. The origin (w, z) = (0, 0) of

    (d/dτ)[ w ]
          [ z ] = −Ḡ[F̄(w + Γ∗(t), z + θ∗(t), t)]        (4.55)

is exponentially stable uniformly in t. (Equation (4.55) is a differential equation in τ, with solution (w(τ), z(τ)), where t is considered fixed. See Theorem B.3.1 of Appendix B.)

Then by Theorem B.3.1 of Appendix B there exists an ε̄ > 0 such that for all ε < ε̄, (e, w, z) = (0, 0, 0) is an exponentially stable equilibrium of (4.52). Since μ = 1/ε and μ̄ = 1/ε̄, it follows that there exists a μ̄ > 0 such that for all μ > μ̄, the origin (e, w, z) = (0, 0, 0) of (4.52) is exponentially stable. Thus ξ(t) goes to Θ∗(t) exponentially if (ξ(0) − Θ∗(0)), (θ(0) − θ∗(0)), and (Γ(0) − Γ∗(0)) are sufficiently small.
of y(t) to θ∗ (t), and of (Γ (t), θ(t)) to (Γ∗ (t), θ∗ (t)). Since (4.52) does not depend on η, this exponential output error convergence is unaffected by the evolution of η. Nevertheless, we must assure ourselves that η remains bounded. Since the plant (4.11) is assumed to have output-bounded internal dynamics, there (¯ r)
exists a ρ > 0 and a κ > 0 such that ξ(·) ∈ Bκ and η(0) ∈ Bρ imply η(t) ∈ Bρ for all t ≥ 0.
We must show that there exists a ν > 0 such that if ξ(0) is in Bν , then (ξ(t), u(t)) remains in Bκ as ξ(t) converges to Θ∗ (t) (see Figure 4.11).
(r)
By assumption, kθ∗ k(¯r) < δ which implies that k(Θ∗ (t), θ∗ (t)k∞ < δ (see (4.9)) for
each i ∈ p. If δ is sufficiently small, then there exists a ν ≥ δ such that for all (ξ(0), u(0)) in
Bν , ξ(0) is sufficiently close to Θ∗ (t) to satisfy the requirements of exponential convergence
for part B (above) of this proof. However, ξ(0) may converge to Θ(0) in such a way that for some t > 0, k(ξ(t), u(t))k∞ > κ even though the convergence is asymptotic. Thus the norm
Figure 4.11: As (ξ(t), u(t)) converges to (Θ∗(t), θ∗^{(r̄)}(t)), it must remain in B_κ.
‖ξ(t) − Θ∗(t)‖ need only be bounded above by a decaying exponential function. This does not imply that the norm decreases monotonically in t. For example, the function e^{−αt} sin(ωt) converges to zero exponentially, yet its norm is not monotonic in t (see Figure 4.12). We must assure ourselves that there exists a ν sufficiently small so that ‖ξ(0)‖ < ν implies ‖(ξ(t), u(t))‖_∞ < κ for t ≥ 0, in order to preserve the boundedness of η. Figure 4.12 illustrates the situation we would like to avoid.
Exponential convergence of ξ(t) to Θ∗(t) as t → ∞ implies that there exist k_1 > 0 and k_2 > 0 such that

    ‖ξ(t) − Θ∗(t)‖ ≤ k_1 ‖ξ(0) − Θ∗(0)‖ e^{−k_2 t}        (4.56)

for all t ≥ 0. By choosing δ and ν sufficiently small, we can make ‖ξ(0) − Θ∗(0)‖ as small as we please. Therefore, we can guarantee, by choice of δ and ν sufficiently small, that ‖ξ(·)‖^{(r̄)} < κ. Since the plant P has output-bounded internal dynamics, this guarantees that η(t) remains in B_ρ for all t ≥ 0.
Figure 4.12: If δ and ν are not sufficiently small, then (ξ(t), u(t)) may converge exponentially to (Θ∗(t), θ∗^{(r)}(t)) but leave the ball B_κ at some time.
Figure 4.13: If (ξ(0), u(0)) is in Bν and η(0) is in Bρ, then ‖η(t)‖ < ρ for all t ≥ 0. Bκ is the ball in which (ξ(t), u(t)) must remain in order that η(t) remain in Bρ. Compare to Figure 4.5.
4.4
An Example of Implicit Tracking
An example will illustrate application of Theorem 4.3.4 to a nonlinear control
system having unstable zero dynamics but output-bounded internal dynamics.
Example 4.4.1 Implicit Trajectory Tracking. We will construct an implicit tracking controller for the plant (4.26) of Example 4.2.11. Recall that plant (4.26) has output-bounded internal dynamics, but unstable zero dynamics. The tracking problem for plant (4.26) corresponds to the tracking problem for the cart of Figures 4.2 through 4.4, where we wish to cause the cart to track an implicitly defined trajectory without having the ball leave the bowl. Assume that we would like y(t) to track the solution θ∗(t) to F(θ, t) = 0
(4.57)
where
F(θ, t) := (2 + sin(t)) tan(θ/10) − 1.
(4.58)
The equation F(θ, t) = 0 has an explicit solution, θ∗(t) = 10 arctan(1/(2 + sin(t))). This will allow us to verify the performance of the closed-loop system resulting from application of the implicit tracking controller. We will construct the controller, however, as if only an implicit expression F(θ, t) = 0 for the reference trajectory were available.
First we will construct the necessary derivative estimators. We know from Lemma 2.2.7 on dynamic inverses for scalar functions that we can use G(w) = 1·w as a dynamic inverse for this F(θ, t), but for purposes of illustration, and in the manner of Theorem 2.4.6, we will use a dynamically estimated dynamic inverse G(w, Γ) = Γ·w, where Γ∗(t) is the solution to D₁F(θ∗, t)Γ − 1 = 0, and
D₁F(θ, t) = (1/10)(2 + sin(t)) sec²(θ/10).
(4.59)
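A quick numerical check of the explicit solution and of (4.59) can be sketched as follows (illustrative Python, not part of the dissertation; the test times are arbitrary):

```python
import math

def F(theta, t):                    # (4.58)
    return (2 + math.sin(t)) * math.tan(theta / 10) - 1

def theta_star(t):                  # explicit solution of F(theta, t) = 0
    return 10 * math.atan(1 / (2 + math.sin(t)))

def D1F(theta, t):                  # (4.59); sec^2(x) = 1 / cos^2(x)
    return (2 + math.sin(t)) / (10 * math.cos(theta / 10) ** 2)

for t in [0.0, 0.7, 1.9, 3.3, 6.1]:
    assert abs(F(theta_star(t), t)) < 1e-12      # theta_star is a root
    h = 1e-6                                     # finite-difference check of (4.59)
    fd = (F(theta_star(t) + h, t) - F(theta_star(t) - h, t)) / (2 * h)
    assert abs(fd - D1F(theta_star(t), t)) < 1e-6
```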
Use the derivative estimator algorithm, Algorithm 4.3.1, to obtain the estimators
E¹(Γ, θ, t) = −Γ cos(t) tan(θ/10),
(4.60)
and
E²(Γ, θ, t) = −Γ [ ( (1/10) cos(t) sec²(θ/10) + (1/50)(2 + sin(t)) sec²(θ/10) tan(θ/10) E¹(Γ, θ, t) ) E¹(Γ, θ, t) − sin(t) tan(θ/10) + (1/10) cos(t) sec²(θ/10) E¹(Γ, θ, t) ].
(4.61)
We have
(d/dt) D₁F(θ∗, t)|_{θ̇∗ = E¹(Γ,θ,t)} = (1/10) cos(t) sec²(θ/10) + (1/50)(2 + sin(t)) sec²(θ/10) tan(θ/10) E¹(Γ, θ, t).
(4.62)
Then, for estimation of Γ̇∗ we have
Eγ(Γ, θ, t) := −Γ [ (d/dt) D₁F(θ, t)|_{θ̇∗ = E¹(Γ,θ,t)} ] Γ,
(4.63)
where Eγ(Γ∗, θ∗, t) = Γ̇∗.
Let β² = β¹ = 1. For the implicit tracking controller define
ũ(Γ, θ, t) := E²(Γ, θ, t) − β²(ξ² − E¹(Γ, θ, t)) − β¹(ξ¹ − θ).
(4.64)
Combine the dynamic inverter and plant to get
[C, P]:
ξ̇¹ = ξ²
ξ̇² = ũ(Γ, θ, t)
η̇ = (η + 3)(η + 1)η(η − 1)(η − 3) + ξ¹ + (ξ²)² + ũ(Γ, θ, t)
Γ̇ = −µ(D₁F(θ, t)Γ − 1) + Eγ(Γ, θ, t)
θ̇ = −µΓF(θ, t) + E¹(Γ, θ, t)
(4.65)
where ũ(Γ, θ, t) is given by (4.64).
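The estimator half of this construction can be exercised on its own. The sketch below (pure-Python RK4, an illustrative stand-in for the Matlab ode45 run reported in the next subsection) integrates only the dynamic-inverter subsystem (Γ, θ) of (4.65), with µ = 10 and the (Γ, θ) initial conditions of Table 4.1, and checks that θ converges to the known explicit root θ∗(t):

```python
import math

mu = 10.0                                  # Table 4.2

def F(theta, t):                           # (4.58)
    return (2 + math.sin(t)) * math.tan(theta / 10) - 1

def D1F(theta, t):                         # (4.59)
    return (2 + math.sin(t)) / (10 * math.cos(theta / 10) ** 2)

def E1(G, theta, t):                       # (4.60)
    return -G * math.cos(t) * math.tan(theta / 10)

def dD1F(G, theta, t):                     # (4.62), theta-dot replaced by E1
    s2 = 1 / math.cos(theta / 10) ** 2
    return (math.cos(t) * s2 / 10
            + (2 + math.sin(t)) * s2 * math.tan(theta / 10) * E1(G, theta, t) / 50)

def rhs(t, x):
    G, theta = x
    Eg = -G * dD1F(G, theta, t) * G        # (4.63)
    dG = -mu * (D1F(theta, t) * G - 1) + Eg
    dtheta = -mu * G * F(theta, t) + E1(G, theta, t)
    return (dG, dtheta)

x, t, dt = (1.0, 1.0), 0.0, 1e-3           # (Gamma, theta) from Table 4.1
for _ in range(20000):                     # fixed-step RK4 out to t = 20
    k1 = rhs(t, x)
    k2 = rhs(t + dt/2, tuple(xi + dt/2*ki for xi, ki in zip(x, k1)))
    k3 = rhs(t + dt/2, tuple(xi + dt/2*ki for xi, ki in zip(x, k2)))
    k4 = rhs(t + dt, tuple(xi + dt*ki for xi, ki in zip(x, k3)))
    x = tuple(xi + dt/6*(a + 2*b + 2*c + d)
              for xi, a, b, c, d in zip(x, k1, k2, k3, k4))
    t += dt

theta_star = 10 * math.atan(1 / (2 + math.sin(t)))
assert abs(x[1] - theta_star) < 1e-3       # theta tracks theta_star
```

The plant states (ξ, η) are omitted here; the full closed loop (4.65) is what the simulations of Section 4.4.1 integrate.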
4.4.1
Simulations
Figures 4.14 through 4.17 show the results of a simulation of (4.65) with the initial conditions shown in Table 4.1.

variable        ξ¹    ξ²    η    Γ    θ
initial value    3    −1    0    1    1

Table 4.1: Initial conditions for the implicit tracking controller simulation.
The parameters used in the simulation are shown in Table 4.2. The simulation was integrated using ode45, an adaptive step-size, fourth- and fifth-order Runge–Kutta integrator, in Matlab [Mat92].
parameter    β¹    β²    µ
value         1     1    10

Table 4.2: Parameters for the implicit tracking controller simulation.
The top graph of Figure 4.14 shows the output y(t) (solid), the estimator θ(t) (dashed), and the actual reference trajectory θ∗(t) (dotted). Convergence of the three trajectories is readily apparent. The bottom graph of Figure 4.14 shows the output tracking error y(t) − θ∗(t) for the simulation. The decay of the tracking error can be seen.
Figure 4.15 shows the internal state η(t). Note that η was initialized at 0, which is unstable. It can be seen that η settles to a region between one of its zero-dynamics equilibria, η = 1, and a value of η = 1.6. Most importantly, η stays bounded. Figure 4.16 shows the estimation errors for θ∗ (top) and Γ∗ (bottom). Both errors can be seen to decay to zero.
Figure 4.14: Top: The output y(t) (solid), as well as the implicit reference trajectory θ∗(t) (dotted) and its estimate θ(t) (dashed), for the simulation of Example 4.4.1. Bottom: The output tracking error y(t) − θ∗(t).
Figure 4.15: The internal state η(t) for the simulation of Example 4.4.1.
Figure 4.16: The top graph shows the estimation error θ(t) − θ∗ (t) for Example 4.4.1. The bottom graph shows the estimation error Γ (t) − Γ∗ (t).
Figure 4.17 shows four phase plots. The top left plot shows ξ¹ versus ξ². Comparing this to the top right plot, showing θ∗ versus θ̇∗, we can see how the output converges to the implicit reference trajectory. The bottom left plot is not a true phase plot: it shows θ versus E¹(Γ, θ, t). Recall from Remark 4.3.2 that E¹(Γ, θ, t) is guaranteed to equal θ̇∗ only when (Γ, θ) = (Γ∗, θ∗). The plot at the lower right of Figure 4.17 shows the tracking error phase (y − θ∗, ẏ − θ̇∗) = (e¹, e²) as it converges to zero.
Figure 4.17: The top left graph shows the phase plot of ξ¹ versus ξ². The top right graph shows θ∗ versus θ̇∗. The lower left graph shows θ versus E¹(Γ, θ, t). The lower right graph shows the tracking error phase, ξ¹ − θ∗ versus ξ² − θ̇∗. The symbol 'o' marks the initial conditions for each plot.
The combination of plant and compensator can be seen to have behaved as predicted, with exponentially decaying output error, and with bounded internal dynamics.
N
4.5
Chapter Summary
We have defined a useful characterization of internal dynamics for nonlinear systems, which we have called output-bounded internal dynamics. A nonlinear control system with output-bounded internal dynamics has an acceptable form of internal behavior without having stable zero dynamics. We have combined a dynamic inverter with a standard tracking controller, replacing the explicit reference trajectory and its time-derivatives by estimators based on dynamic inversion, to produce a controller for tracking implicit reference trajectories. For systems having output-bounded internal dynamics, we have seen that for suitable initial conditions and gain parameters, the implicit tracking controller keeps the internal dynamics bounded. We have proven, through an appeal to singular perturbation theory, that the combination of nonlinear plant, dynamic inverter, and controller results in exponentially convergent output tracking with bounded internal dynamics for plants having output-bounded internal dynamics. A simulation of a controlled nonlinear system with unstable zero dynamics, but output-bounded internal dynamics, was shown to exhibit the predicted convergence and stability behavior.
We have used the particular dynamic inverter of Theorem 2.4.6. A review of the proof of the implicit tracking theorem, Theorem 4.3.4, easily reveals that this is not the only dynamic inverter that may be used. Any dynamic inverter conforming to the assumptions of Theorem 2.3.5 will do, for the example of Section 4.4 above as well as for the implicit tracking theorem. For instance, if D₁F(θ, t)⁻¹ is indeed available in closed form, then it may be substituted for Γ in the controller equations, and one need not solve a differential equation for Γ.
Chapter 5
Joint-Space Tracking of Workspace Trajectories in Continuous Time
5.1
Introduction
In this chapter we will apply the implicit tracking theorem, Theorem 4.3.4, to the
problem of tracking end-effector trajectories for robotic manipulators. The control of robotic manipulators provides considerable motivation for the application of dynamic inversion to the implicit tracking problem. Inverse kinematics is currently a computationally expensive procedure, limiting, as we will discuss in Section 5.3, the performance of robot controllers for important classes of tasks such as the tracking and stabilization of end-effector trajectories, the task which will be considered here. The limitation is sufficiently severe that many commercial robotic manipulators are designed so that there exist closed-form solutions for their inverse kinematics. Using dynamic inversion, one need only approximately solve the inverse-kinematics problem at a single configuration. A dynamic inverter uses this solution as the initial condition for a dynamical system, the flow of which is the time-varying solution to the inverse-kinematics problem. Coupling this inverse-kinematic solution to a tracking controller for the robot arm, and applying the implicit tracking theorem, Theorem 4.3.4, gives exponentially convergent tracking.
5.1.1
Previous Work
The map F(θ, t) to be inverted in this chapter is of the special form
F(θ, t) = F(θ) − xd(t)
(5.1)
in which an exclusively time-dependent term is combined additively with an exclusively θ-dependent term. The inverse problem of approximating θ∗(t) given xd(t) has received a significant amount of attention in the literature and, indeed, continuous-time algorithms for solving inverse kinematics are classical (see [Nak91]). Wolovich and Elliott [WE84] introduced a set of dynamical equations that solve the inverse kinematics problem x = F(θ), where x is the Cartesian configuration vector and θ is the vector of joint angles. Their dynamical system may be expressed as
θ̇ = −K · DF(θ)ᵀ · (F(θ) − xd(t))
(5.2)
where K is a positive-definite matrix. Viewing this equation through in the dynamic inversion framework, we see immediately that this is a dynamic inverter that solves for the solution to F (θ) − xd (t) = 0
(5.3)
with dynamic inverse KDF (θ) and no derivative estimator. The lack of a derivative estimator in (5.2) limits the performance of the algorithm (5.2) unless kx˙ d (t)k is small, or the leftmost eigenvalue of K is large. In contrast we will use derivative estimation as specified
in Theorem 4.3.4, and we will be concerned with the tracking problem rather than solely with the inverse kinematics problem.
Tchoń and Dulęba introduced a dynamical method for determining inverse kinematics of manipulators; their method may be expressed in the form
θ̇ = −DF(θ)ᵀ · adj(DF(θ) · DF(θ)ᵀ) · F(θ)
(5.4)
where adj(·) refers to the classical adjoint (adjugate). Here DF is assumed to be surjective, allowing for inverse solutions for redundant manipulators. In the dynamic inversion context it is clear that
G[w] := DF(θ)ᵀ · adj(DF(θ) · DF(θ)ᵀ) · w
(5.5)
is being used in (5.4) as a dynamic inverse for F(θ), since G[w] is simply a left inverse of DF multiplied by det(DF·DFᵀ). Assuming that F(θ) is analytic and expanding F(θ) in a Taylor series about θ∗ makes it clear that G[w] is a dynamic inverse of F(θ) in a suitably small neighborhood of θ∗. Note that no derivative estimator E(θ, t) is used. Since Tchoń and Dulęba are concerned with inversion at particular configurations of the end-effector, this is not surprising. Lack of such a derivative estimator, however, limits the utility of their algorithm in the context of tracking end-effector trajectories. Note also that determination of adj(DF·DFᵀ) is similar to determination of DF⁻¹, which is something we would like to avoid.
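To make the preceding flows concrete, here is a sketch of the Wolovich–Elliott flow (5.2) for a hypothetical planar two-link arm with unit link lengths (the arm, target, and gain are illustrative assumptions, not from the dissertation). With xd held fixed, the flow performs pure inverse kinematics and converges to a root of F(θ) − xd = 0:

```python
import math

# theta-dot = -K * DF(theta)^T * (F(theta) - xd), Euler-integrated.
l1 = l2 = 1.0
K = 5.0

def F(th):                           # forward kinematics of the two-link arm
    t1, t2 = th
    return (l1*math.cos(t1) + l2*math.cos(t1+t2),
            l1*math.sin(t1) + l2*math.sin(t1+t2))

def DF(th):                          # Jacobian of F
    t1, t2 = th
    return ((-l1*math.sin(t1) - l2*math.sin(t1+t2), -l2*math.sin(t1+t2)),
            ( l1*math.cos(t1) + l2*math.cos(t1+t2),  l2*math.cos(t1+t2)))

xd = (1.2, 0.8)                      # reachable fixed target pose
th = [0.3, 0.9]                      # initial guess, away from singularities
dt = 1e-3
for _ in range(20000):               # integrate (5.2) to t = 20
    e = tuple(Fi - xi for Fi, xi in zip(F(th), xd))
    J = DF(th)
    th[0] += dt * (-K) * (J[0][0]*e[0] + J[1][0]*e[1])
    th[1] += dt * (-K) * (J[0][1]*e[0] + J[1][1]*e[1])

assert all(abs(Fi - xi) < 1e-6 for Fi, xi in zip(F(th), xd))
```

When xd varies in time, this same flow lags the moving root — precisely the limitation the derivative estimator of Theorem 4.3.4 removes.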
Another continuous-time dynamical treatment of inverse kinematics, similar to that which we describe below, has been presented by Nicosia et al. in [NTV91a]. Their approach too fits well into the framework of dynamic inversion as described in this dissertation. Both G(w, θ) = D₁F(θ, t)⁻¹·w and G(w, θ) = D₁F(θ, t)ᵀ·w are used by those authors as dynamic inverses. Derivative estimation similar to the technique presented in Chapter 4 is also used, though rather than assuming knowledge of the time-derivatives of xd as we will do here, Nicosia et al. use an observer to estimate those derivatives. They also rely heavily upon the availability of D₁F(θ, t)⁻¹. Though such reliance is often feasible in practice, we will not require it, relying instead upon dynamic estimation of a dynamic inverse.
5.1.2
Main Results
The main results of this chapter are as follows:
i. We define four classes of robotic manipulator controllers, based upon whether the errors used for control are in the workspace or the joint-space, and further subdivided by whether the object of the controller is to stabilize a workspace trajectory or a joint-space trajectory.
ii. We introduce a controller that provides joint-space tracking of workspace trajectories. The controller is posed in continuous time. Its digital-computer implementation then requires only integration of an exponentially stable dynamical system.
The heart of this chapter may be viewed as an application of the implicit tracking theorem, Theorem 4.3.4, to robotic manipulator control. In this application, however, internal dynamics are assumed to be absent or ignorable.
5.1.3
Chapter Overview
In Section 5.2, after some necessary definitions, we precisely define the robotic control problem in which we will be interested. In Section 5.3 we then describe some current methods of robot manipulator control, looking very briefly at some of their strengths and shortcomings. In Section 5.4 we apply the implicit tracking theorem, Theorem 4.3.4, to construct an exact tracking controller for the tracking of end-effector trajectories. In Section 5.5 an example of output tracking for a simple model of a two-link robot arm is used to illustrate the application of the implicit tracking theorem.
5.2
Problem Definition
Let the vector of joint angles¹ of the robotic manipulator be denoted θ ∈ Rⁿ,
and the corresponding generalized torques² be τ ∈ Rⁿ. We will concern ourselves with the control of open-chain robotic manipulators³ having equations of motion in the standard form
Robotic Manipulator Dynamic Equations
M(θ)θ̈ = K(θ, θ̇) + τ
(5.6)
where the inertia matrix M(θ) ∈ Rⁿˣⁿ is positive-definite and symmetric for all θ ∈ Rⁿ. The vector K(θ, θ̇) contains all Coriolis, centrifugal, frictional, damping, and gravitational forces. The forward-kinematics map F relates the generalized coordinates θ of the robotic manipulator to the configuration x of the end-effector,
Forward-Kinematic Relation
x = F(θ)
(5.7)
Depending upon the particular manipulator, x may take values in various sets, including subgroups of the special Euclidean group SE(3), the group of positions and orientations in Euclidean 3-space. We call the set of all possible end-effector configurations x the workspace, X. We call each element of the workspace a pose, since each x ∈ X corresponds to a pose of the end-effector. The map F(θ) will be assumed to be C².
Typically the joint-space is a non-Euclidean manifold. For the purposes of this chapter we may view the joint-space, as well as the workspace, which we will assume to be of the same dimension as the joint-space, through charts⁴ from Rⁿ.
¹ By a "joint angle" we mean a parameter uniquely describing a joint configuration. Thus, for instance, the "angle" may parameterize a prismatic as well as a rotary joint. The vector of joint angles is a set of generalized coordinates for the robotic manipulator.
² We will use the term "torques" to mean control forces or control torques as appropriate.
³ By an open-chain robotic manipulator we mean a finite sequence of rigid links, the first link being hinged to the ground, with all successive links hinged to the previous link by a joint. The end of the last link is presumed to be free to move in the workspace.
⁴ See [AMR88], Chapter 3, for a review of manifolds and their associated charts.
For simplicity, we will at first avoid a discussion of redundant manipulators, manipulators having degrees of freedom greater in number than the dimension of the space
in which xd(t) resides. Then in Remark 5.4.4 we will show how our controller may be easily adapted for use with redundant manipulators. Discussion of singularities in the inverse-kinematics problem will also be avoided: we will assume that the inverse-kinematic image of the desired workspace path, i.e. θ∗(t) satisfying F(θ) − xd(t) = 0, does not pass through such a singularity.
The workspace tracking problem considered here is as follows: Problem 5.2.1 Workspace Tracking Problem. Find a control τ (θ, t) such that for all initial conditions θ0 ∈ Rn in an open subset of Rn , the pose x(t) of the end-effector converges exponentially to the desired end-effector trajectory xd (t).
Assumption 5.2.2 Smoothness. Assume that the desired end-effector trajectory xd (t) is C 4 on [0, ∞), that the forward-kinematic map F (θ) is also C 4 on Rn , and that DF (z+θ∗ (t)), DF (z + θ∗ (t))−1 , and D 2 F (z + θ∗ (t)) are bounded uniformly in t for all z ∈ Br .
N
Assumption 5.2.2 will provide the degree of smoothness necessary to invoke the implicit tracking theorem.
Given a particular end-effector pose xp ∈ X, the inverse-kinematics problem is to find θp satisfying xp = F(θp). In general, multiple solutions θp exist. For simplicity we will further restrict the space from which we draw desired output trajectories xd(t) to those output trajectories that have corresponding, though possibly multiple, continuous isolated solutions θ∗(t), i.e. that do not pass through singularities. For robotic manipulators having ni input torques and n degrees of freedom, ni ≤ n, with the rank of DF(θ) being ni, the inverse function theorem (see [AMR88], Section 2.5, page 116) implies that inverse-kinematic solutions are isolated. Exceptions to this isolation occur at discrete singularities where DF(θ) drops rank. Thus, the added restriction on xd(t) is mild. For simplicity in presenting the controller, we will consider only manipulators with ni = n. Since there are in general multiple continuous isolated solutions θ∗(t) of F(θ) =
xd(t), we will assume that a particular one has been chosen. It will be demonstrated below, in Section 5.5, that the choice of initial conditions for the tracking controller determines which inverse-kinematic solution the manipulator follows.
5.3
Manipulator Tracking Control Methodologies
Current techniques of tracking control for robotic manipulators (see [Cra89], [SV89], [MLS94]) can be divided into two classes according to whether tracking-error feedback is
Figure 5.1: A sequence of poses {xd(tk)} along the workspace trajectory is inverted via an inverse-kinematics algorithm. The resulting sequence of joint-space points {θd(tk)} is then splined to form θ̃(t).
realized in terms of joint-space errors (i.e. θ − θd, θ̇ − θ̇d) or workspace errors (i.e. x − xd, ẋ − ẋd). These classes are as follows:
i. Joint-Space Control of Joint-Space Trajectories. A discrete inverse-kinematics algorithm is applied to a time-parameterized sequence of chosen points {xd(tk)}, called via points, along a continuous desired pose trajectory t ↦ xd(t) ∈ X. This discrete inversion produces a corresponding time-parameterized sequence of joint-angle vectors {θd(tk)}. One may then create, via a spline, a smooth time-parameterized curve θ̃(t) ∈ Rⁿ through the sequence {θd(tk)} (see Figure 5.1), and then track θ̃(t) using a tracking controller described in terms of the errors (θ̇ − θ̃̇) and (θ − θ̃), such as
τ = −K(θ, θ̇) + M(θ)[ θ̃̈ − B²(θ̇ − θ̃̇) − B¹(θ − θ̃) ]
(5.8)
where B¹ and B² are positive-definite gain matrices in Rⁿˣⁿ.
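The non-uniform workspace error produced by joint-space splining (made vivid in Figure 5.2 below) can be demonstrated numerically. In this sketch (illustrative assumptions: a hypothetical planar two-link arm with unit link lengths, and piecewise-linear interpolation standing in for the spline), the interpolated joint trajectory matches xd exactly at the via points but drifts from it in between:

```python
import math

def F(th):                                      # two-link forward kinematics
    t1, t2 = th
    return (math.cos(t1) + math.cos(t1+t2), math.sin(t1) + math.sin(t1+t2))

# Choose theta_d(t) first, then define xd(t) := F(theta_d(t)), so the exact
# inverse-kinematic solution is known by construction.
def theta_d(t):
    return (0.5 + 0.4*math.sin(t), 1.0 + 0.3*math.cos(t))

def xd(t):
    return F(theta_d(t))

tk = [0.0, 1.0, 2.0, 3.0]                       # via-point times
via = [theta_d(t) for t in tk]                  # "inverted" via points

def theta_tilde(t):                             # interpolated joint trajectory
    for a, b, pa, pb in zip(tk, tk[1:], via, via[1:]):
        if a <= t <= b:
            s = (t - a) / (b - a)
            return tuple((1-s)*ua + s*ub for ua, ub in zip(pa, pb))
    raise ValueError("t outside interpolation range")

def wserr(t):                                   # workspace error of Figure 5.2
    return max(abs(a - b) for a, b in zip(F(theta_tilde(t)), xd(t)))

assert all(wserr(t) < 1e-12 for t in tk)        # exact at the via points
assert wserr(1.5) > 1e-3                        # but not midway between them
```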
ii. Workspace Control of Workspace Trajectories. To transform the dynamic equations of the robot into workspace coordinates, differentiate the forward-kinematics relationship x = F(θ) twice with respect to t,
ẋ = DF(θ)θ̇
ẍ = ( Σᵢ₌₁ⁿ (∂/∂θᵢ)DF(θ) θ̇ᵢ ) θ̇ + DF(θ)θ̈
(5.9)
and solve for θ̈,
θ̈ = DF(θ)⁻¹ ( ẍ − Σᵢ₌₁ⁿ (∂/∂θᵢ)DF(θ) θ̇ᵢ θ̇ )
(5.10)
Substitute the result (5.10) for θ̈ into the manipulator's dynamical equations (5.6),
M(θ)DF(θ)⁻¹ ẍ = M(θ)DF(θ)⁻¹ ( Σᵢ₌₁ⁿ (∂/∂θᵢ)DF(θ) θ̇ᵢ ) θ̇ + K(θ, θ̇) + τ
(5.11)
and left-multiply both sides of (5.11) by (DF(θ)⁻¹)ᵀ,
(DF(θ)⁻¹)ᵀ M(θ) DF(θ)⁻¹ ẍ = (DF(θ)⁻¹)ᵀ M(θ) DF(θ)⁻¹ ( Σᵢ₌₁ⁿ (∂/∂θᵢ)DF(θ) θ̇ᵢ ) θ̇ + (DF(θ)⁻¹)ᵀ K(θ, θ̇) + (DF(θ)⁻¹)ᵀ τ
(5.12)
Then choose gain matrices B¹ and B² in Rⁿˣⁿ for error feedback in terms of the workspace errors (ẋ − ẋd) and (x − xd) to obtain a tracking controller for tracking the desired xd(t) ∈ X as follows:
τ = −K(θ, θ̇) + M(θ)DF(θ)⁻¹ ( v − Σᵢ₌₁ⁿ (∂/∂θᵢ)DF(θ) θ̇ᵢ θ̇ )
v = ẍd(t) − B²(ẋ − ẋd(t)) − B¹(x − xd(t))
(5.13)
Figure 5.2: The black curve on the left corresponds to the desired end-effector trajectory xd(t). The black dots on the left correspond to points of xd(t) at a discrete sequence of times t1 < t2 < t3 < t4. The black curve on the right corresponds to the inverse-kinematic solution θd(t) satisfying F(θd(t)) = xd(t). The black dots on the right correspond to the inverse-kinematic solutions θd(tk) satisfying F(θd(tk)) = xd(tk). The white curve on the right corresponds to a time-parameterized spline θ̃(t) through the sequence {θd(tk)}. The white curve on the left is F(θ̃(t)). Note that the error between xd(t) and F(θ̃(t)) is non-uniform, going to zero at the sample points and diverging from xd(t) away from the sample points.
In the second class of controllers, ii above (see [MLS94], Section 5.4, for more details on workspace control), one need not solve for an inverse-kinematic solution, though DF(θ) must be inverted. This method too can be undesirable, since the inputs to the manipulator are often joint torques; avoidance of saturation of the joint torques, for instance, is made difficult. If the mass matrix of the manipulator cannot be conveniently inverted symbolically then, once again, a mixed discrete- and continuous-time control scheme is necessitated in order to apply, e.g., Gaussian elimination to solve for θ̈ at the start of each continuous control interval. In addition, since the workspace is usually SE(3), and since no global parameterization of SE(3) exists⁵, this approach can necessitate the overhead of coordinate changes in the controller implementation. However, specifying control gains in
⁵ Quaternion representations of SE(3) can make this problem less serious.
the workspace coordinates can be advantageous for certain combinations of manipulator and task. In fact, the two control strategies above suggest the existence of two more control strategies. By distinguishing between workspace and joint-space control, as well as between workspace and joint-space trajectories we see that there are in fact four distinct strategies, as illustrated in Figure 5.3:
Four Classes of Robotic Manipulator Control
• JCJT. Joint-space control of joint-space trajectories.
• WCWT. Workspace control of workspace trajectories.
• WCJT. Workspace control of joint-space trajectories.
• JCWT. Joint-space control of workspace trajectories.
(5.14)
Strategy JCJT corresponds to i above, and strategy WCWT corresponds to ii above. This chapter describes a JCWT method, symbolized by the black arrow in Figure 5.3, of joint-space control of workspace trajectories based on dynamic inversion and the implicit tracking results of the last chapter. This alternative poses the controller in joint-space while continuously providing an estimate of θ∗(t) satisfying xd(t) = F(θ∗(t)), allowing continuous-time control in joint-space. The continuous-time approach also has the virtue of a degree of independence from the choice of computational machinery. For realization of the control via digital computer, one must choose an integrator in order to integrate the dynamic inverter. The issue of accuracy, however, is made solely a matter of the choice of integrator. Using our method, we also retain the advantage of global control coordinates.
5.3.1
Workspace Control of Joint-Space Trajectories
Workspace control of joint-space trajectories (WCJT) is easily obtained from workspace control of workspace trajectories (WCWT) by replacing xd(t), ẋd(t), and ẍd(t) in (5.13) by
xd(t) = F(θd(t)), ẋd(t) = DF(θd(t))θ̇d(t),
(5.15)
Figure 5.3: The four robot control strategies are represented each by one of the four arrows. This chapter presents a JCWT strategy, indicated by the black arrow.
and
ẍd(t) = ( Σᵢ₌₁ⁿ (∂/∂θᵢ)DF(θd(t)) θ̇d,i(t) ) θ̇d(t) + DF(θd(t)) θ̈d(t).
(5.16)
Though WCJT completes the picture illustrated in Figure 5.3, any advantage in its use is unclear at present. It may be useful in cases where one wishes to control joint motions on a robotic manipulator through exogenous forces applied to the end effector.
5.4
Joint-Space Control of Workspace Trajectories
We now apply dynamic inversion to the problem of tracking workspace trajectories using joint-space control. Given a desired end-effector trajectory t ↦ xd(t), the inverse-kinematic solution θ∗(t) to F(θ) = xd(t) is defined implicitly as a continuous isolated
solution of F(θ, t) = 0, where
F(θ, t) := F(θ) − xd(t)
(5.17)
The use of dynamic inversion for the tracking of implicitly defined trajectories is described in Chapter 4. Those arguments will be specialized here to the case of robotic manipulator control. Note that F(θ, t) is the sum of one θ-dependent term and one t-dependent term. Thus, assuming xd(t) is C² in t, we have the necessary uniformity in t to conclude from the dynamic inverse existence lemma, Lemma 2.2.11, that we may use a dynamic inverse G(w), linear in w and based on (DF(θ))⁻¹.
From the form of the manipulator dynamical equations (5.6), and since M(θ) is positive-definite, it is clear by substitution into (5.6) that the feedback torque
τ = −K(θ, θ̇) + M(θ)v
(5.18)
applied to (5.6) causes the resulting controlled manipulator dynamics
θ̈ = v
(5.19)
to be linear from input to state, as well as decoupled⁶. Let e := θ − θ∗ denote the tracking error between the manipulator configuration θ and the inverse-kinematic solution θ∗. Let βᵢ², βᵢ¹ ∈ R, i ∈ n, be such that the roots of the polynomials in s,
s² + βᵢ²s + βᵢ¹, i ∈ n,
(5.20)
have strictly negative real parts. Suppose we had explicit signals θ∗(t), θ̇∗(t), and θ̈∗(t). Choosing v in (5.19) as
vᵢ := θ̈∗(t)ᵢ − βᵢ²(θ̇ᵢ − θ̇∗(t)ᵢ) − βᵢ¹(θᵢ − θ∗(t)ᵢ) = θ̈∗(t)ᵢ − βᵢ²ėᵢ − βᵢ¹eᵢ
(5.21)
results in controlled manipulator dynamics having exponentially stable tracking error. If the trajectory θ∗(t) were given explicitly, our job would be done. However, we do not have explicit expressions for θ∗(t), θ̇∗(t), and θ̈∗(t), since we do not have an explicit expression
⁶ By the dynamics being decoupled we mean that for each i, θ̈ᵢ = vᵢ.
for θ∗(t). We will construct estimators for θ̇∗ and θ̈∗ that will depend upon the state of a dynamic inverter as well as the desired workspace trajectory xd(t). We may approximate the time derivatives of θ∗(t) using Algorithm 4.3.1. We require approximators E¹(Γ, θ, t) for θ̇∗ and E²(Γ, θ, t) for θ̈∗. Recall that Γ ∈ Rⁿˣⁿ is part of the state of the dynamic inverter used in the construction of a dynamic inverse. For E¹,
E¹(Γ, θ, t) = −Γ · D₂F(θ, t) = Γ ẋd(t)
(5.22)
For Eγ(Γ, θ, t), the estimator of Γ̇∗, we get
Eγ(Γ, θ, t) = −Γ ( D₁,₁F(θ, t) · E¹(Γ, θ, t) + D₂,₁F(θ, t) ) Γ = −Γ ( Σᵢ₌₁ⁿ (∂/∂θᵢ)DF(θ) · E¹(Γ, θ, t)ᵢ ) Γ
(5.23)
Then for E²(Γ, θ, t) we get
E²(Γ, θ, t) = (d/dt) E¹(Γ, θ, t)|_{θ̇ = E¹(Γ,θ,t), Γ̇ = Eγ(Γ,θ,t)} = Eγ(Γ, θ, t) ẋd + Γ ẍd
(5.24)
Summarizing,
Derivative Estimators
E¹(Γ, t) = Γ ẋd(t)
E²(Γ, θ, t) = Eγ(Γ, θ, t) ẋd + Γ ẍd
Eγ(Γ, θ, t) = −Γ ( Σᵢ₌₁ⁿ (∂/∂θᵢ)DF(θ) E¹(Γ, t)ᵢ ) Γ
(5.25)
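The exactness of E¹ on the solution — with Γ = DF(θ∗)⁻¹, Γẋd = θ̇∗, since ẋd = DF(θ∗)θ̇∗ — is easy to verify numerically. This sketch uses a hypothetical planar two-link arm with unit link lengths and a joint trajectory chosen for the test (both are illustrative assumptions, not from the dissertation):

```python
import math

def F(th):                               # two-link forward kinematics
    t1, t2 = th
    return (math.cos(t1) + math.cos(t1+t2), math.sin(t1) + math.sin(t1+t2))

def DF_inv(th):                          # closed-form inverse of the Jacobian
    t1, t2 = th
    a, b = -math.sin(t1) - math.sin(t1+t2), -math.sin(t1+t2)
    c, d = math.cos(t1) + math.cos(t1+t2), math.cos(t1+t2)
    det = a*d - b*c
    return ((d/det, -b/det), (-c/det, a/det))

def theta_star(t):                       # chosen joint trajectory ...
    return (0.5 + 0.4*math.sin(t), 1.0 + 0.3*math.cos(t))

def xd(t):                               # ... defines the workspace trajectory
    return F(theta_star(t))

t, h = 1.3, 1e-6
xdot = tuple((a - b)/(2*h) for a, b in zip(xd(t+h), xd(t-h)))
G = DF_inv(theta_star(t))                # Gamma = DF(theta*)^{-1}
E1 = (G[0][0]*xdot[0] + G[0][1]*xdot[1],
      G[1][0]*xdot[0] + G[1][1]*xdot[1])
tsdot = tuple((a - b)/(2*h) for a, b in zip(theta_star(t+h), theta_star(t-h)))
assert all(abs(e - d) < 1e-6 for e, d in zip(E1, tsdot))   # E1 = theta*-dot
```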
Remark 5.4.1 Notation. For the remainder of this chapter we will let θ denote the joint-angle vector, θ∗ denote the inverse-kinematic solution to F(θ, t) = 0, and (Γ̄, θ̄) denote the state of the dynamic inverter, which includes θ̄ as the estimator for the inverse-kinematic solution. N
Let
vᵢ := Eᵢ²(Γ̄, θ̄, t) − βᵢ²(θ̇ᵢ − Eᵢ¹(Γ̄, t)) − βᵢ¹(θᵢ − θ̄ᵢ)
(5.26)
where, as noted in Remark 5.4.1, we denote the estimators for Γ∗ and θ∗ by Γ̄ and θ̄ respectively. Recall that E¹(Γ∗, t) = θ̇∗ and E²(Γ∗, θ∗, t) = θ̈∗. Also, by Assumption 5.2.2, E¹(Γ∗, t) and E²(Γ∗, θ∗, t) are C² in their arguments.
In order to estimate a linear dynamic inverse G(w, Γ̄) = Γ̄·w for F(θ̄, t), we let
Fγ(Γ̄, θ̄) := DF(θ̄)Γ̄ − I
(5.27)
with Fγ(Γ̄, θ̄) ∈ Rⁿˣⁿ. As in Example 2.4.1, the dynamic inverse of Fγ(Γ̄, θ̄) is Gγ : Rⁿˣⁿ × Rⁿˣⁿ → Rⁿˣⁿ defined by
Gγ(w, Γ̄) = Γ̄ · w
(5.28)
We have already obtained an estimator for Γ̇∗, namely Eγ(Γ, θ, t) in (5.25). Combining estimation, control, and manipulator dynamics, we make the following claim.
Corollary 5.4.2 Joint-Space Controller for Workspace Trajectories. Let F(θ) and xd(t) ∈
X be C 4 . Let θ∗ (t) be a continuous isolated solution of F (θ) = xd (t). Let DF (θ∗ (t)) and
its inverse be bounded for all t, and for all z ∈ Br , r > 0, let D 2 F (z + θ∗ (t)) be bounded. j j ¯ t), and E γ (Γ, θ, t) be Let B j := diag(β , . . . , βn ), j ∈ {1, 2}. Let where E 1 (Γ¯ , t), E 2 (Γ¯ , θ, 1
given by (5.25). Then if (Γ (0), θ(0)) is sufficiently close to (DF (θ∗(0))−1 , θ∗ (0)), the control system
˙ =τ M (θ)θ¨ + K(θ, θ)
(5.29)
˙ + M (θ)v(Γ¯ , θ, ¯ t) τ = −K(θ, θ)
(5.30)
¯ t) = E 2 (Γ¯ , θ, ¯ t) − B 1 (θ˙ − E 1(Γ¯, t)) − B 0 (θ − θ), ¯ v(Γ¯ , θ, # " # " # " γ ¯ Γ¯ − I Γ¯˙ Γ¯ DF (θ) E (Γ, θ, t) = −µ + ¯ − xd (t) θ¯˙ Γ¯ F (θ) E 1 (Γ, θ)
(5.31) (5.32)
causes the joint-angle vector θ(t) to converge to θ∗(t) exponentially as t → ∞.
Remark 5.4.3 Equations (5.32) provide exponentially convergent estimates Γ̄(t) and θ̄(t) of Γ∗(t) and θ∗(t). Equations (5.29) are the equations of motion for the manipulator. Equations (5.30) and (5.31) determine the input τ as a function of θ, Γ̄, and θ̄.
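The closed loop (5.29)–(5.32) can be written as one first-order vector field, suitable for handing to any ODE integrator. The sketch below takes the sign convention M(θ)θ̈ + K(θ, θ̇) = τ, so the computed-torque law cancels K; the `model` container and its member names are assumptions of this sketch, not notation from the text:

```python
import numpy as np

def closed_loop_rhs(state, t, model, mu, B0, B1):
    """Right-hand side of the closed loop (5.29)-(5.32) as one flat ODE.

    `model` is a hypothetical container supplying M(theta), K(theta, dtheta),
    F(theta), DF(theta), xd(t) and the estimators E1, E2, Egamma of (5.25);
    state = (theta, dtheta, Gamma_bar.ravel(), theta_bar).
    """
    n = model.n
    theta  = state[:n]
    dtheta = state[n:2*n]
    Gb     = state[2*n:2*n + n*n].reshape(n, n)
    thb    = state[2*n + n*n:]

    # (5.32): dynamic inverter for (Gamma_bar, theta_bar)
    dGb  = -mu * Gb @ (model.DF(thb) @ Gb - np.eye(n)) + model.Egamma(Gb, thb, t)
    dthb = -mu * Gb @ (model.F(thb) - model.xd(t)) + model.E1(Gb, t)

    # (5.31): track the estimated inverse-kinematic solution
    v = model.E2(Gb, thb, t) - B1 @ (dtheta - model.E1(Gb, t)) - B0 @ (theta - thb)

    # computed-torque law: with the convention M theta'' + K = tau,
    # tau = K + M v cancels the nonlinearity and leaves theta'' = v
    tau = model.K(theta, dtheta) + model.M(theta) @ v
    ddtheta = np.linalg.solve(model.M(theta), tau - model.K(theta, dtheta))

    return np.concatenate([dtheta, ddtheta, dGb.ravel(), dthb])
```

Integrating this vector field is all the controller requires at run time; no discrete root-finding or matrix inversion appears anywhere in the loop.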
Proof of Corollary 5.4.2: This is a straightforward application of Theorem 4.3.4 for the
case of ignorable or nonexistent internal dynamics.
Remark 5.4.4 Redundant Manipulators. In the case of a redundant manipulator, θ is of a dimension m greater than the workspace dimension n. Thus an infinite number of joint-angle vectors θ correspond to any particular end-effector configuration. Assuming that DF(θ) is surjective, a tracking controller of the same form as the controller of Corollary 5.4.2 may be used. The only modification necessary to the controller of Corollary 5.4.2 for tracking with redundant manipulators is that Γ is m × n rather than n × n. In that case Γ∗ is the right inverse of DF(θ∗) (see Chapter 3, Section 3.2.1). The derivative estimators, too, remain of the same form.
5.5 A Two-Link Example

In this section we work through an example of the application of Theorem 4.3.4 to the control of a simple model of a two-link robotic arm, diagrammed in Figure 5.4.
[Figure 5.4: A two-link robot arm with joint angles θ = (θ1, θ2), joint torques τ = (τ1, τ2), end-effector position x, desired end-effector position x_d, link lengths l1 and l2, and link masses m1 and m2, assumed to be point masses.]

The links of the robot arm are assumed rigid and of lengths l1 and l2. The masses
of each link are assumed, for simplicity, to be point masses m1 and m2 located at the distal ends of link 1 and link 2 respectively. The desired position of the end-effector at time t is xd (t). The actual position is x(t). We wish to make the end-effector (end of the second link) track a prescribed trajectory xd (t) in the Euclidean plane. The joint-space of the arm is parameterized by θ ∈ T2 where T2 is the 2-torus. As alluded to earlier in Section 5.3,
for our purposes we may view T2 through a single chart from R2 since neither joint of
the arm will ever undergo a full circular motion due to our choices of xd (t) and initial conditions. We will assume that for i ∈ {1, 2} we may exert a control torque τi at the ith
joint and will denote the vector of input torques by τ ∈ R². In this two-link manipulator case F : R² → R², θ ↦ F(θ), maps the configuration space to the Euclidean plane. Let
c_i := cos(θ_i), c_ij := cos(θ_i + θ_j), s_i := sin(θ_i), and s_ij := sin(θ_i + θ_j), with i, j ∈ {1, 2}. For the two-link arm, the forward-kinematics map is

    F(θ) = [ l1 c1 + l2 c12 ]
           [ l1 s1 + l2 s12 ]        (5.33)
The workspace of the two-link robot arm is the codomain of F , namely {x ∈ R2 : x =
F (θ), θ ∈ T2 }. We wish to determine a τ such that the end-effector position x(t) = F (θ(t))
converges to the desired end-effector position xd (t).
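The map F of (5.33) and its Jacobian DF, which the dynamic inverter must invert, translate directly into code; a sketch:

```python
import numpy as np

def forward_kinematics(theta, l1, l2):
    """The forward-kinematics map F of (5.33) for the two-link arm."""
    t1, t12 = theta[0], theta[0] + theta[1]
    return np.array([l1*np.cos(t1) + l2*np.cos(t12),
                     l1*np.sin(t1) + l2*np.sin(t12)])

def jacobian(theta, l1, l2):
    """DF(theta), the matrix whose inverse the dynamic inverter estimates."""
    t1, t12 = theta[0], theta[0] + theta[1]
    return np.array([[-l1*np.sin(t1) - l2*np.sin(t12), -l2*np.sin(t12)],
                     [ l1*np.cos(t1) + l2*np.cos(t12),  l2*np.cos(t12)]])
```

A finite-difference comparison of `jacobian` against `forward_kinematics` is a quick consistency check on the hand-derived entries.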
The equations of motion for the two-link manipulator (see [Cra89], Section 6.8) are

    M(θ) θ̈ = −V(θ, θ̇) − W(θ) + τ        (5.34)

where

    M11(θ) = l2² m2 + 2 l1 l2 m2 c2 + l1² (m1 + m2)
    M12 = M21 = l2² m2 + l1 l2 m2 c2        (5.35)
    M22 = l2² m2

    V(θ, θ̇) = [ −m2 l1 l2 s2 θ̇2² − 2 m2 l1 l2 s2 θ̇1 θ̇2 ]
               [   m2 l1 l2 s2 θ̇1²                      ]        (5.36)

and

    W(θ) = [ m2 l2 g c12 + (m1 + m2) l1 g c1 ]
           [ m2 l2 g c12                     ]        (5.37)

The matrix M(θ) is a positive-definite symmetric mass matrix, V(θ, θ̇) is the vector of centrifugal and Coriolis forces on the manipulator, and W(θ) is the gravitational force on the point masses of the arm.
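The terms (5.35)–(5.37) can be coded verbatim; a sketch (the function name and argument order are our choices):

```python
import numpy as np

def manipulator_terms(theta, dtheta, l1, l2, m1, m2, g=9.8):
    """Mass matrix M(theta), Coriolis/centrifugal vector V(theta, dtheta),
    and gravity vector W(theta) of (5.35)-(5.37)."""
    c1 = np.cos(theta[0])
    c2, s2 = np.cos(theta[1]), np.sin(theta[1])
    c12 = np.cos(theta[0] + theta[1])
    M = np.array([[l2**2*m2 + 2*l1*l2*m2*c2 + l1**2*(m1 + m2),
                   l2**2*m2 + l1*l2*m2*c2],
                  [l2**2*m2 + l1*l2*m2*c2, l2**2*m2]])
    V = np.array([-m2*l1*l2*s2*dtheta[1]**2 - 2*m2*l1*l2*s2*dtheta[0]*dtheta[1],
                   m2*l1*l2*s2*dtheta[0]**2])
    W = np.array([m2*l2*g*c12 + (m1 + m2)*l1*g*c1,
                  m2*l2*g*c12])
    return M, V, W
```

Symmetry and positive definiteness of M, and the vanishing of V at zero joint velocity, are cheap checks on a transcription like this one.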
Let θ∗(t) be the solution of F(θ, t) = 0, where F(θ, t) := F(θ) − x_d(t). If we were tracking an explicit joint trajectory θ_d(t) we could choose an input torque

    τ = V(θ, θ̇) + W(θ) + M(θ) [ θ̈∗ − B²(θ̇ − θ̇∗) − B¹(θ − θ∗) ]        (5.38)

where B¹ = diag(β1¹, β2¹) and B² = diag(β1², β2²), in order to achieve exponentially convergent tracking. However, the trajectory we wish to track is defined implicitly as the solution θ∗(t) of F(θ) − x_d(t) = 0.
For the simple two-link robotic arm considered in this example, closed-form solutions for the inverse kinematics exist (see Craig [Cra89], p. 122). For demonstration purposes we will use dynamic inversion to invert the kinematics, and we will use the closed form of the inverse kinematics to check our results.
[Figure 5.5: Two configurations corresponding to the same end-effector position.]

For each x in the interior of the workspace, there exist two configurations θ satisfying F(θ) = x, as indicated in Figure 5.5. As long as x_d is kept away from the boundary of the workspace, the two possible inverse-kinematic solutions of F(θ, t) = 0 never intersect⁷.
We will first choose one inverse-kinematic solution, by our choice of initial conditions for dynamic inversion, and track it. Then we will change only our choice of initial conditions (Γ̄(0), θ̄(0)) for the dynamic inverter and track the other inverse-kinematic solution. First we apply the derivative estimation algorithm to get estimators for θ̇∗ and θ̈∗.
For the two-link arm we have as an estimator for θ̇∗(t),

    E^1(Γ, t) := Γ ẋ_d(t)        (5.39)

An estimator for θ̈∗(t) is

    E^2(Γ, θ, t) := E^γ(Γ, θ, t) ẋ_d + Γ ẍ_d        (5.40)

where

    E^γ = −Γ ( ∑_{i=1}^{n} (∂/∂θ_i) DF(θ) · E_i^1(Γ, t) ) Γ

        = −Γ ( [ −l1 c1 − l2 c12   −l2 c12 ] E_1^1(Γ, t) + [ −l2 c12   −l2 c12 ] E_2^1(Γ, t) ) Γ
               [ −l1 s1 − l2 s12   −l2 s12 ]               [ −l2 s12   −l2 s12 ]

⁷ Where the two isolated solutions θ∗ meet, DF(θ∗) is singular.
Let c̄_i = cos(θ̄_i), s̄_i = sin(θ̄_i), c̄12 = cos(θ̄1 + θ̄2), and s̄12 = sin(θ̄1 + θ̄2). A dynamic inverter for this two-link manipulator control problem is

    dΓ̄/dt = −μ Γ̄ ( [ −l1 s̄1 − l2 s̄12   −l2 s̄12 ] Γ̄ − I ) + E^γ(Γ̄, θ̄, t)
                    [  l1 c̄1 + l2 c̄12    l2 c̄12 ]
                                                                        (5.41)
    dθ̄/dt = −μ Γ̄ ( [ l1 c̄1 + l2 c̄12 ] − x_d(t) ) + Γ̄ ẋ_d(t)
                    [ l1 s̄1 + l2 s̄12 ]
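A minimal realization of the inverter (5.41), using the two-link parameters and inverter initial conditions of Table 5.1 (note that the tabulated Γ̄(0) is exactly DF(θ̄(0))^{-1}). The dissertation's simulations used Matlab's ode45; a fixed-step forward-Euler loop is used here only so the sketch is self-contained:

```python
import numpy as np

l1, l2, mu = 3.0, 2.0, 10.0          # parameter values of Table 5.1

def F(th):
    c1, s1 = np.cos(th[0]), np.sin(th[0])
    c12, s12 = np.cos(th[0] + th[1]), np.sin(th[0] + th[1])
    return np.array([l1*c1 + l2*c12, l1*s1 + l2*s12])

def DF(th):
    c1, s1 = np.cos(th[0]), np.sin(th[0])
    c12, s12 = np.cos(th[0] + th[1]), np.sin(th[0] + th[1])
    return np.array([[-l1*s1 - l2*s12, -l2*s12],
                     [ l1*c1 + l2*c12,  l2*c12]])

def xd(t):        # the figure-eight reference (5.43)
    return np.array([3.75*np.cos(np.pi*t), 2 + 1.5*np.sin(2*np.pi*t)])

def xd_dot(t):
    return np.array([-3.75*np.pi*np.sin(np.pi*t), 3*np.pi*np.cos(2*np.pi*t)])

def Egamma(Gb, th, t):
    """E^gamma of (5.25) specialized to the two-link arm."""
    c1, s1 = np.cos(th[0]), np.sin(th[0])
    c12, s12 = np.cos(th[0] + th[1]), np.sin(th[0] + th[1])
    dDF1 = np.array([[-l1*c1 - l2*c12, -l2*c12], [-l1*s1 - l2*s12, -l2*s12]])
    dDF2 = np.array([[-l2*c12, -l2*c12], [-l2*s12, -l2*s12]])
    E1 = Gb @ xd_dot(t)
    return -Gb @ (dDF1*E1[0] + dDF2*E1[1]) @ Gb

# inverter initial conditions of Table 5.1
thb = np.array([0.0, np.pi/2])
Gb = np.array([[0.0, 1/3], [-1/2, -1/3]])

dt = 1e-4
for i in range(int(2.0/dt)):          # forward-Euler integration of (5.41)
    t = i*dt
    dGb = -mu * Gb @ (DF(thb) @ Gb - np.eye(2)) + Egamma(Gb, thb, t)
    dthb = -mu * Gb @ (F(thb) - xd(t)) + Gb @ xd_dot(t)
    Gb, thb = Gb + dt*dGb, thb + dt*dthb

print(np.linalg.norm(F(thb) - xd(2.0)))   # workspace estimation error at t = 2
```

After the initial transient, θ̄ rides the inverse-kinematic solution and Γ̄ rides the Jacobian inverse along it.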
Let

    τ̃(θ, θ̇, Γ̄, θ̄, t) = V(θ, θ̇) + W(θ) + M(θ) [ E^2(Γ̄, θ̄, t) − B²(θ̇ − E^1(Γ̄, t)) − B¹(θ − θ̄) ]        (5.42)
The resulting controller for the two-link arm is given as in Corollary 5.4.2. We choose x_d(t) to be a time-parameterized figure-eight in the workspace,

    x_d(t) = [ 3.75 cos(πt), 2 + 1.5 sin(2πt) ]^T        (5.43)
Figure 5.6 shows the results of a simulation. The integration was performed in Matlab [Mat92] using the adaptive step-size Runge-Kutta integrator ode45. The parameters and initial conditions used in the simulation are shown in Table 5.1 below.

Table 5.1: Parameters and initial conditions for the simulation of implicit tracking control of a two-link robot arm. All angles are in radians.

    parameters:          B¹ = I,  B⁰ = I,  μ = 10,  l1 = 3 [m],  l2 = 2 [m],
                         m1 = 1 [kg],  m2 = 1 [kg],  g = 9.8 [m/s²]
    initial conditions:  θ̄(0) = (0, π/2),  Γ̄(0) = [ 0  1/3 ; −1/2  −1/3 ],
                         θ(0) = (π, −π/2),  θ̇(0) = (0, 0)
[Figure 5.6: Workspace paths (top left): F(θ) (solid), F(θ̄) (dashed), and F(θ∗) (dotted) corresponding to the initial conditions of Table 5.1; axes x1 [m], x2 [m]. Joint-space paths (top right): θ (solid), θ̄ (dashed), and θ∗ (dotted); axes θ1 [rad], θ2 [rad]. For the top graphs the symbol 'o' marks the initial condition for each trajectory. Of the two bottom graphs, the upper shows the l²-norm of the estimation error e_est = θ̄(t) − θ∗(t), and the lower shows the norm of the tracking error e_track = [θ(t), θ̇(t)]^T − [θ∗(t), θ̇∗(t)]^T, both versus t.]
The top left graph of Figure 5.6 shows the resulting end-effector path F(θ) (solid), the desired path x_d = F(θ∗) (dotted), and the image F(θ̄) of the estimator θ̄ for θ∗ through the forward-kinematics map F (dashed). Both F(θ̄) and the path of the end-effector x(t) can be seen to converge to the desired path. The top right graph of Figure 5.6 shows a similar picture, but in joint space. Again, the convergence of both the estimator θ̄ (dashed) for the inverse-kinematic solution and the actual joint angles θ (solid) to the inverse-kinematic solution θ∗ (dotted) corresponding to the desired trajectory can be seen. The upper bottom graph of Figure 5.6 shows the norm of the estimation error ‖θ̄(t) − θ∗(t)‖₂, and the lower bottom graph shows the norm of the tracking error ‖[θ(t), θ̇(t)]^T − [θ∗(t), θ̇∗(t)]^T‖₂ versus time. The particular inverse-kinematic solution chosen was due to the choice of Γ̄(0).
5.5.1 Tracking the Other Solution

We may cause the arm to track the other inverse-kinematic solution simply by choosing a different set of initial conditions for the dynamic inverter. Figure 5.7 shows the results using the same parameters and manipulator initial conditions θ(0) and θ̇(0) as above, but with dynamic-inverter initial conditions θ̄(0) and Γ̄(0) as indicated in Table 5.2.

Table 5.2: Initial conditions for the simulation of implicit tracking control of the other solution for a two-link robot arm. All angles are in radians.

    θ̄(0) = (1.1760, −π/2),  Γ̄(0) = [ −0.3077  0.1282 ; 0.5000  0.3333 ],
    θ(0) = (π, −π/2),  θ̇(0) = (0, 0)
Again, the end-effector path F(θ) (solid), the desired path x_d(t) (dotted), and the image of θ̄ through F in the workspace, F(θ̄) (dashed), are shown in the top left graph of Figure 5.7. The top right graph of Figure 5.7 shows the corresponding joint-space paths, θ (solid), θ∗ (dotted), and θ̄ (dashed), and the bottom graphs show the estimation error ‖θ̄(t) − θ∗(t)‖₂ (upper) and tracking error ‖(θ(t), θ̇(t)) − (θ∗(t), θ̇∗(t))‖₂ (lower). Once again
[Figure 5.7: Workspace paths (top left): F(θ) (solid), F(θ̄) (dashed), and F(θ∗) (dotted) for the other inverse-kinematic solution, corresponding to the initial conditions of Table 5.2; axes x1 [m], x2 [m]. Note that the path F(θ∗) is a periodic curve of period 2. Joint-space paths (top right): θ (solid), θ̄ (dashed), and θ∗ (dotted) for the other inverse-kinematic solution; axes θ1 [rad], θ2 [rad]. For the top graphs the symbol 'o' marks the initial condition for each trajectory. Of the two bottom graphs, the upper shows the l²-norm of the estimation error e_est = θ̄(t) − θ∗(t), and the lower shows the l²-norm of the tracking error e_track = [θ(t), θ̇(t)]^T − [θ∗(t), θ̇∗(t)]^T.]
the tracking error can be seen to converge to zero.
5.6 Chapter Summary

The implicit tracking controller of Chapter 4 has been applied to the robot-control problem of tracking workspace trajectories using joint-space control. This approach provides exponentially convergent tracking of the inverse-kinematic solution corresponding to a continuous end-effector trajectory in the workspace, and hence exponential tracking of the desired end-effector path in the workspace. The controller has been posed in continuous time, using a dynamic inverter to produce approximations of the joint-space signals necessary for control.

Though the two-link robot arm of Section 5.5 had simple rotary joints, it should be kept in mind that dynamic inversion may be used for the inverse kinematics of manipulators with more complex joint geometry than simple prismatic or rotary joints. This includes, for instance, joints such as spherical joints⁸, as well as joints with less regular geometries (see Figure 5.8), like those found in the human body. A more general joint parameterization might require multiple parameters for a single joint. A more general forward-kinematic map F might reflect changes in link length as a function of joint configuration. As long as our assumptions on the rank and smoothness of F(θ) hold, and as long as a continuous isolated solution exists, dynamic inversion and the implicit tracking theorem will work for such manipulators.

[Figure 5.8: An irregular joint geometry.]

⁸ Spherical joints may be modeled as the coincidence of three rotary joints.
Chapter 6
Approximate Output Tracking for a Class of Nonminimum-Phase Systems

6.1 Introduction

In this chapter we study the tracking control problem for a class of time-invariant
nonlinear nonminimum-phase control systems which we call balance systems. A balance system has associated with it a controllable linearization at its origin as well as certain other structural properties which we will exploit. Balance systems are a useful class of models for modeling physical systems for which gravitational balance must be maintained. Some examples of systems which are appropriately modeled as balance systems are bicycles, motorcycles, rockets, winged aircraft, and the inverted pendulum on a cart. The problem with which we will concern ourselves in this chapter is the output tracking problem, where we wish to cause the output of a balance system to track a desired reference trajectory, but we also wish to maintain the internal state within given bounds. We will not assume that balance systems have output-bounded internal dynamics (see Chapter 4, Definition 4.2.7). As an example of a balance system consider the cart and ball system of Figure 6.1. The output of the system is the position of the cart y. The input to the system is the acceleration of the cart u = y¨, and the internal state of the system is the position α and velocity α˙ of the ball relative to a frame fixed to the cart as shown. The ball is modeled as a particle, i.e. zero moment of inertia, and is assumed to be constrained vertically to
remain on the curved surface shown, but free to roll off of either lateral extreme of the cart's surface.

[Figure 6.1: A balancing cart-ball system.]

The origin of the cart-ball system corresponds to the cart being still (ẏ = 0) at y = 0, with the ball at α = 0 and motionless (α̇ = 0). The class of tracking problems which we will consider is represented by the objective of controlling the cart's position y to track a desired trajectory y_d(t) without having the ball slide off of the cart. Critical to the definition of the problem class, however, is that we wish our controller to work for any reference trajectory y_d(·) chosen from an open set of reference trajectories, where the open set contains the trajectory y_d(·) ≡ 0. For our cart-ball system, for instance, we restrict the
reference trajectory yd (·) to satisfy
    sup_{t≥0} ‖ [y_d(t), ẏ_d(t), ÿ_d(t)]^T ‖_∞ < ε        (6.1)

for some ε > 0.
6.1.1 Limitations on Tracking Performance

The tracking performance of systems such as the cart-ball system of Figure 6.1 has some inherent limitations. In contrast to systems with output-bounded internal dynamics, the internal state (the position and velocity of the ball relative to the cart, (α, α̇)) cannot be ignored in the act of tracking an output trajectory, given that we require that the ball remain on the cart. Recall (see Chapter 4) that for feedback linearizable systems with output-bounded internal dynamics, we can achieve exponentially convergent tracking of any sufficiently smooth output reference function y_d if the derivatives of y_d are sufficiently small. In particular, if the initial output tracking error is zero, then the output tracking error is zero for all time. A sufficiently small bound on the derivatives of y_d(t) ensures bounded internal dynamics. This is obviously not true for the cart-ball system above. For
instance, if we wish to track y_d(·) ≡ 0, but the ball is not initially at the origin with zero velocity, then the ball will fall off of the cart.

Causal Exact Tracking

A more subtle limitation of the cart-ball system is that even if we are allowed to choose the initial condition (α(0), α̇(0)) of the ball, we still cannot achieve exact output tracking, with bounded internal state, over an open set of reference trajectories using a causal controller. By a "causal" controller we mean a controller which, at any time t, requires no more information about the reference trajectory y_d(·) than a finite-length vector [y_d(t), y_d^{(1)}(t), . . . , y_d^{(k)}(t)]^T. If we know all of y_d(t) for t ≥ 0 in advance, we can, in some cases, determine an initial state of our control system such that, under the assumption of exact output tracking, the state of the control system remains bounded. But in many cases of interest it is impractical to assume such knowledge. That we cannot achieve exact causal tracking of any output reference trajectory from an open set can be demonstrated as follows: Choose an arbitrarily small ε > 0. For any such ε we can construct a C² reference trajectory y_d(t) such that ẏ_d(t) ≥ 0 for all t ≥ 0
and sup_{t≥0} ‖[y_d(t), ẏ_d(t), ÿ_d(t)]^T‖_∞ < ε. An example is

    y_d(t) = { 0,                                              t ∈ [0, t1]
             { (k/π) ( π(t − t1)/2 − (1/4) sin(2π(t − t1)) ),  t ∈ [t1, t1 + 1]        (6.2)
             { k/2,                                            t ≥ t1 + 1

for t1 ≥ 0. Differentiate y_d(t) to get

    ẏ_d(t) = { 0,                      t ∈ [0, t1]
             { k sin²(π(t − t1)),      t ∈ [t1, t1 + 1]        (6.3)
             { 0,                      t ≥ t1 + 1

and differentiate ẏ_d(t) to get

    ÿ_d(t) = { 0,                                     t ∈ [0, t1]
             { 2πk sin(π(t − t1)) cos(π(t − t1)),     t ∈ [t1, t1 + 1]        (6.4)
             { 0,                                     t ≥ t1 + 1

If k < ε/π, then (6.1) holds. The signals y_d(t), ẏ_d(t), and ÿ_d(t) are graphed in Figure 6.2 for k = 1/(2π), where k is sufficiently small for ε = 1.
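The construction (6.2)–(6.3) is easy to check numerically; a sketch for the case k = 1/(2π), t1 = 1 of Figure 6.2, written so a finite-difference derivative of y_d can be compared against the stated ẏ_d:

```python
import numpy as np

def yd(t, k, t1):
    """The reference trajectory (6.2)."""
    if t <= t1:
        return 0.0
    if t <= t1 + 1:
        s = t - t1
        return (k/np.pi) * (np.pi*s/2 - np.sin(2*np.pi*s)/4)
    return k/2

def yd_dot(t, k, t1):
    """Its derivative (6.3): k sin^2(pi (t - t1)) on [t1, t1 + 1], else 0."""
    if t1 < t < t1 + 1:
        return k * np.sin(np.pi*(t - t1))**2
    return 0.0
```

One can verify that y_d is continuous at t1 and at t1 + 1 (where it reaches k/2), that ẏ_d ≥ 0 everywhere, and that |ÿ_d| peaks at πk, which is where the condition k < ε/π comes from.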
[Figure 6.2: A reference trajectory y_d(t) such that ẏ_d(t) ≥ 0 for all t ≥ 0, and such that sup_{t≥0} ‖[y_d(t), ẏ_d(t), ÿ_d(t)]^T‖ < ε. The three panels show y_d, dy_d/dt, and d²y_d/dt² versus t. For this graph t1 = 1, k = 1/(2π).]

In general we assume t1 > 0 is unknown. Now assume that the cart tracks y_d(t) of (6.2) exactly, but that we do not know t1 in advance. Such tracking is easily accomplished by setting the input u to be

    u = ÿ_d(t) − β1 (ẏ − ẏ_d(t)) − β0 (y − y_d(t))        (6.5)

and setting y(0) = y_d(0) and ẏ(0) = ẏ_d(0), with β1 > 0 and β0 > 0. The cart travels at constant velocity for all t ≥ t1 + 1. Thus if α(t1 + 1) = 0 and α̇(t1 + 1) = 0, then for all
t ≥ t1 + 1 the ball remains at rest at its zero configuration (α, α̇) = (0, 0) relative to the cart. There may or may not exist a solution such that the ball starts on the cart at time t = 0 with some initial velocity and ends up at (α, α̇) = (0, 0) at time t = t1 + 1 without falling off the cart. If a solution does not exist, then obviously there is no initial condition at which we can set the ball so that (α(t1 + 1), α̇(t1 + 1)) = (0, 0). If a solution does exist, then for each choice of t1 that solution is unique. Thus the propitious initial conditions for the ball are unique for each value of t1. But without knowledge of t1 we cannot determine this
unique propitious initial condition, and every other initial condition will cause the ball to eventually fall off of the cart, because the ball does not end up at its zero state (α, α̇) = (0, 0) at time t1 + 1 (after which the cart travels at constant velocity). Therefore, we cannot, in a causal manner, achieve exact tracking while keeping the ball on the cart. Since we can make ε as small as we wish and still define y_d of the form (6.2), it follows that there is no ε > 0 for which we can achieve exact tracking on all y_d(t) satisfying (6.1).
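The exact-tracking step of the argument — the cart output subsystem is a double integrator ÿ = u under the law (6.5) — can be illustrated in a few lines; the gains and the smooth test reference below are illustrative choices, not values from the text:

```python
import numpy as np

def simulate_cart(yd, yd_dot, yd_ddot, T=3.0, dt=1e-4, b0=4.0, b1=4.0):
    """Integrate y'' = u under the tracking law (6.5), starting matched:
    y(0) = yd(0), y'(0) = yd'(0); returns the largest output error seen."""
    y, v = yd(0.0), yd_dot(0.0)
    worst = 0.0
    for i in range(int(T/dt)):
        t = i*dt
        u = yd_ddot(t) - b1*(v - yd_dot(t)) - b0*(y - yd(t))
        y, v = y + dt*v, v + dt*u          # forward-Euler step of y'' = u
        worst = max(worst, abs(y - yd(t + dt)))
    return worst

# a smooth reference: starting matched, the error stays at integration-error level
err = simulate_cart(lambda t: np.sin(t), lambda t: np.cos(t), lambda t: -np.sin(t))
print(err)
```

This is precisely why the output channel itself poses no difficulty: the obstruction in the argument above lies entirely in the unmeasured internal (ball) state, which (6.5) does nothing to govern.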
6.1.2 The Inversion Problem for Nonlinear Systems

Given a control system, the inversion problem is the problem of determining a feasible state trajectory such that the output resulting from that state trajectory is a preassigned output reference trajectory y_d(t). A feasible state trajectory whose corresponding output is the given output reference is referred to as the inverse corresponding to y_d. We seek to construct a causal controller for balance systems with the property that the closed-loop system, consisting of balance system and controller, has output-bounded internal dynamics while providing output tracking. In the present chapter, however, we wish to have the ability to make the bound on the internal state of the closed-loop system arbitrarily small, so that we can satisfy any bound on the internal state. This will require a bound on the reference output y_d of the closed-loop system, as well as on its derivatives.
6.1.3 How Dynamic Inversion Will Be Used

We will use dynamic inversion (see Chapters 2 and 3) to produce an estimate of an implicitly defined function of state called the internal equilibrium angle, α_e. We will use a variation of the derivative estimator algorithm, Algorithm 4.3.1, in order to obtain estimates of the derivatives of α_e with respect to time. Both α_e and its derivative estimates will then be incorporated into a tracking control law as the internal part of an approximate inverse trajectory. By causing the internal state (e.g. (α, α̇)) to approximately track (α_e, α̇_e), the proposed control law will allow approximate tracking with bounded internal dynamics.
6.1.4 Previous Work

The problem of output tracking for linear time-invariant systems was solved by Francis [Fra77]. Isidori and Byrnes [IB90] generalized the result of Francis to the time-invariant nonlinear case. Both results give asymptotic tracking of any member in a family of signals generated by time-invariant autonomous dynamic systems. Though the linear
problem may be solved by solving a set of linear matrix equations, the nonlinear problem requires the solution of a non-trivial set of partial differential equations. Huang and Rugh [HR92a, HR92b] and Krener [Kre92] have studied conditions under which the Byrnes-Isidori equations are solvable. Tornambè [Tor91] presented a controller for single-input single-output feedback linearizable systems using singular perturbation to stabilize a state trajectory compatible with exact tracking. Gurumoorthy and Sanders [GS93] used a singular perturbation approach to stabilize bounded and known state trajectories which yield the desired output exactly. Tracking of exosystem-generated signals using sliding mode control was studied by Gopalswamy and Hedrick [GH93]. In both [GS93] and [GH93] a reversal of a local approximation of the internal dynamics vector field was used in order to approximate the internal part of the inverse trajectory. Such a strategy is akin to the use of dynamic inversion in this chapter, though considerably more local.
Hauser, Sastry, and Meyer [HSM92] studied controllers for non-minimum phase systems for which the transfer function of the linearization has a zero in the right-half plane, but close to the imaginary axis. Approximate feedback linearization was then used for tracking and regulation, by approximating the slightly nonminimum-phase system by one that was minimum phase and feedback linearizable.
Devasia, Paden, and Chen [DPC94] have introduced an iterative method of determining a bounded inverse for a class of nonlinear systems when such a bounded inverse exists. Once the inverse is found one can resort to conventional tracking controllers to stabilize the inverse trajectory [DP94]. In the case of nonminimum-phase systems the inverse constructed in [DPC94] is non-causal in that one must either start the system from the origin at t = −∞, where the inverse is assumed to be 0, or one must preset the initial
conditions of the system. Once the solution is determined one may set the initial conditions of the control system to match the predetermined solution, then use conventional tracking control techniques to stabilize the bounded trajectory. Hunt, Meyer, and Su [HMS94] have also presented constructive methods for finding a bounded inverse compatible with a desired
output. Di Benedetto and Lucibello [BL93] have studied the case where a known solution exists and one is free to choose initial conditions. Again, this is a non-causal solution, since we must know the complete history of the output reference trajectory in order to make the correct choice of initial condition.
6.1.5 Differences in Our Approach

The approach presented in this chapter differs from previous approaches to tracking
for nonlinear nonminimum-phase systems in a number of ways.

i. Unlike [IB90], [GH93], [Fra77], [HR92a], and [GS93], we do not rely upon an autonomous exosystem to produce the output trajectory we wish to track. We are motivated to avoid the autonomous-exosystem assumption by the realization that in the problem of controlling many nonlinear systems, vehicles for instance, reference trajectories do not originate in autonomous dynamic systems.

ii. We do not assume that a bounded internal trajectory exists under exact output tracking conditions.

iii. Unlike Tornambè [Tor91] we do not assume that the control system is feedback linearizable. We do, however, make other weaker assumptions regarding the structure of a partially feedback linearized system.

iv. Our approach is to construct a submanifold of the state-space, called the internal equilibrium manifold, whose geometry depends upon a choice of output error dynamics. By making an open neighborhood of that manifold attractive and invariant, approximate¹ output tracking with bounded internal dynamics is achieved. In fact Grizzle, Di Benedetto, and Lamnabhi-Lagarrigue [GBLL94] have shown that exact causal tracking for nonminimum-phase systems is impossible². In the present case, there is no inverse state-trajectory, corresponding to exact tracking, in the internal equilibrium manifold, but there is an approximate inverse in a neighborhood of the manifold; our controller renders that neighborhood invariant. Tracking the inverse solution corresponds to approximate output tracking.

v. Unlike Francis [Fra77], Devasia et al. [DPC94], and Hunt et al. [HMS94] we do not construct a "particular" control to drive the output along a desired reference trajectory and superimpose a feedback control to stabilize the particular solution. Rather we construct a control that, in a sense, stabilizes a desired output error dynamics.

¹ By approximate tracking is meant tracking with bounded error, where the bound on the error depends on a norm of the reference trajectory, defined below in (6.57).
² Strictly speaking, what Grizzle et al. [GBLL94] showed was that given a nonminimum-phase analytic control system, there does not exist an analytic compensator which provides exact tracking of an open set of trajectories while maintaining internal stability.
vi. Unlike [DPC94], the control scheme presented in this chapter is causal in the case of nonminimum-phase systems, and solves problems of both stabilization and trajectory generation simultaneously. We do not rely upon knowledge of a bounded inverse. Instead, exactness of tracking is sacrificed in favor of assuring boundedness of a solution without the need to assume a priori knowledge of the solution. Our approach also helps to elucidate some of the limitations on the output tracking task for nonminimum-phase systems and the role of geometrical features in those limitations. Though the controller presented provides only approximate tracking, it is approximate tracking of an open set of trajectories. There are many applications in which boundedness of a solution is more important than exactness of tracking. In particular, the present work has been inspired by the problem of controlling a bicycle [Get94, Get95, GM95a], examined in Chapter 7, for which balance must be maintained at all times, and where perfect path-tracking accuracy may be seen as being somewhat less important.
6.1.6 Main Results

The main results of this chapter are:
• We exhibit a new class of systems called external/internal convertible systems³, such that their external dynamics are converted to internal dynamics, and their internal dynamics are made external, by a choice of input coordinate change and a new choice of output.

• We present a subclass of nonminimum-phase external/internal convertible systems called balance systems, which have unstable zero dynamics and controllable linearization at the origin.

• We introduce a manifold associated with the internal dynamics of partially feedback linearizable systems called an internal equilibrium manifold. Controlling the state of the control system to a neighborhood of the internal equilibrium manifold corresponds to "balancing" the ball, on the cart-ball system above, while the cart tracks a desired reference trajectory.

• We describe a causal controller for balance systems which solves for an approximate bounded inverse while simultaneously making a region around that bounded inverse attractive and invariant. We also prove that under appropriate conditions the controller results in approximate tracking of a given reference trajectory and internal dynamics that are bounded above by a class-K function (see Appendix A for definitions and notation) of a bound on the reference output and its derivatives.

• We state and prove a theorem on the stability of exponentially stable systems under affine perturbations.

³ Systems of this type were first introduced, albeit sloppily, in [GH95].
6.1.7 Chapter Preview

In order to set the stage for a comparison of a linear controller and the somewhat more complicated controller we will present in this chapter, we first review, in Section 6.2, the Jacobian linearization⁴ of time-invariant nonlinear control systems. In Section 6.3 we describe more precisely the class of nonlinear systems under consideration, and the problem which we will solve. We will point out some structural features of the class, features which our controller exploits. In Section 6.4 we discuss output tracking control of our system class, ignoring for the moment unstable internal dynamics. In Section 6.5 we discuss the control of the internal dynamics, ignoring, again for the moment, the behavior of the output of the system. In Section 6.6 we define the internal equilibrium manifold, an intrinsic geometric structure associated with the class of control systems which we consider. The internal equilibrium manifold is associated with states for which the internal dynamics are "balanced," in the sense that one balances a broomstick while walking across a room. In Section 6.8 we propose a tracking controller based upon the internal equilibrium manifold. In Section 6.8 we show how dynamic inversion may be applied to the estimation of the internal equilibrium variables. In Section 6.9 we apply the tracking controller to the classical problem of controlling an inverted pendulum on a cart, where the cart position is the output we wish to cause to track a desired trajectory. Finally, in Section 6.10 we simulate the controller applied to the inverted pendulum on a cart and demonstrate a significant performance improvement over results obtained using a linear-quadratic-regulator approach [CD91a].

⁴ We use the term Jacobian linearization rather than linearization in order to distinguish the linearization of a control system at a point in state-space from feedback linearization, where a state-dependent coordinate change renders a nonlinear control system linear from input to output.
6.2 Jacobian Linearization and Regions of Attraction

6.2.1 Motivation

In this chapter we will appeal to geometry as well as Jacobian linearization in order to derive a controller for a class of nonminimum-phase systems. A step in our design process will involve choosing control parameters such that the Jacobian linearization of the controlled nonlinear system is exponentially stable at the origin. Thus, by design, stability in an arbitrarily small neighborhood of the origin will not be an issue. Our controller will be more complex than a linear controller, even though we will assume that the systems we deal with have controllable linearizations at the origin. Therefore, in order to justify the increase in complexity, we will show, through simulation and comparison, that the domain of attraction of the origin is notably larger than the domain of attraction for a standard linear (LQR) controller. In light of this comparison, a brief review of the role of Jacobian linearization in nonlinear control is appropriate.
6.2.2 The Role of Jacobian Linearization in Nonlinear Control

We review Jacobian linearization as applied to a time-invariant nonlinear system of the form

x˙ = f(x, u)        (6.6)

with x ∈ R^n, u ∈ R^m, and with equilibrium x = 0, i.e. f(0, 0) = 0. We will assume that f(x, u) is C^2 in its arguments. Assume a controller of the form u = −Kx, where K ∈ R^{m×n} is as yet undetermined. Expand

x˙ = f(x, −Kx)        (6.7)

in a Taylor expansion about (x, u) = (0, 0), giving

x˙ = (∂f(0,0)/∂x)x − (∂f(0,0)/∂u)Kx + g(x)        (6.8)

where g(x) ∈ R^n satisfies g(x) = O(‖x‖^2). Let

A := ∂f(0,0)/∂x ∈ R^{n×n}  and  B := ∂f(0,0)/∂u ∈ R^{n×m}        (6.9)

and ignore the O(‖x‖^2) term g(x) to get the linearized model of system (6.6) about the origin x = 0,

x˙ = Ax − BKx        (6.10)
144
Approximate Output Tracking
Chap. 6
Assume that the pair (A, B) is controllable. Through standard methods (see [CD91a]) we determine a K such that A − BK has eigenvalues in Co− (see Appendix A for notation), hence for x(0) sufficiently close to the origin, the solution x(t) of (6.7) with such a choice of
K is guaranteed to obey x(t) → 0 as t → ∞. If, in fact, g(x) ≡ 0 in (6.8), then the nonlinear system (6.6) is linear, and the origin is globally exponentially stable for any K such that
σ(A − BK) ∈ Co− . More typically, however, the term g(x) becomes large as kxk increases,
and the largest ball centered at the origin and contained in the domain of attraction of the origin is bounded.
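The linearization step (6.9) can also be carried out numerically when symbolic Jacobians are inconvenient. The following sketch is not from the dissertation: it applies central finite differences to a hypothetical system x˙_1 = x_2, x˙_2 = sin(x_1) + u, then checks that a hand-picked gain K places the eigenvalues of A − BK in Co−.

```python
import numpy as np

def jacobian_linearization(f, n, m, eps=1e-6):
    """Central finite-difference estimate of A = df/dx(0,0), B = df/du(0,0)."""
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    x0, u0 = np.zeros(n), np.zeros(m)
    for j in range(n):
        dx = np.zeros(n); dx[j] = eps
        A[:, j] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m); du[j] = eps
        B[:, j] = (f(x0, u0 + du) - f(x0, u0 - du)) / (2 * eps)
    return A, B

# hypothetical nonlinear system: x1' = x2, x2' = sin(x1) + u
f = lambda x, u: np.array([x[1], np.sin(x[0]) + u[0]])
A, B = jacobian_linearization(f, 2, 1)
K = np.array([[2.0, 2.0]])            # candidate state-feedback gain, u = -Kx
poles = np.linalg.eigvals(A - B @ K)  # closed-loop linearization eigenvalues
print(np.all(poles.real < 0))
```

With this K the closed-loop characteristic polynomial is s^2 + 2s + 1, so both poles sit at −1.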
6.2.3 Different Controllers – Same Linearization

The form of the state feedback u = −Kx is simple and convenient, easily realized by computer. Linear state feedback is, however, only one of an infinite number of functions, all having the same linearization, that we may choose for our controller. For instance, let

ũ(x) = −Kx + u_2(x)        (6.11)

where u_2(x) = O(‖x‖^2). Then a Taylor expansion of x˙ = f(x, ũ(x)) about (x, u) = (0, 0) is

x˙ = (∂f(0,0)/∂x)x + (∂f(0,0)/∂u)(∂ũ(0)/∂x)x + h(x)
   = (∂f(0,0)/∂x)x − (∂f(0,0)/∂u)Kx + h(x)        (6.12)

where h(x) ∈ R^n is O(‖x‖^2).

Equation (6.12) has the same form as equation (6.8) and has the identical linearization (6.10), where A and B are defined as in (6.9). Thus, as long as ũ (6.11) has the linearization −Kx, the systems x˙ = f(x, −Kx) and x˙ = f(x, −Kx + u_2(x)) will behave similarly for initial conditions close to the origin.
6.2.4 Regions of Attraction

The nonlinear term u_2(x) in (6.11) may cause the size and shape of the region of attraction⁵ of the origin of (6.12) to differ widely from the region of attraction of the origin of system (6.7), due to differences between g(x) and h(x) (see (6.8) and (6.12)).

For systems having vector relative degree (see Section C in the Appendix, or Isidori [Isi89], p. 235) in a neighborhood of the origin, and which are fully linearizable (no internal dynamics), for instance, it is possible (see Appendix, Section C) to choose a state-dependent change of coordinates and a control ũ(x) so that h(x) = 0 locally, making the origin of (6.12) globally exponentially stable (as long as the coordinate change is valid globally).

For most nonlinear systems, however, determination of the region of attraction is highly problematic. Only in rare instances can one be specific about the size of the region. Occasionally a Lyapunov function can be found which will give a conservative bound on the region of attraction for very simple systems. The problem of determining the region of attraction of equilibria in nonlinear systems has inspired the creation of a number of simulation-based tools which attempt to ease the burden [Kad94, PC89]. These tools work by integrating the dynamical system for a large set of initial conditions. This method is necessarily approximate, since regions of attraction can be notoriously complex, as in the case of chaotic systems [GH90]. The engineer must bring his or her experience to bear on the interpretation of such data. Computer tools are most effective for systems of two or three dimensions, since the resulting region of attraction is easily visualized. In most cases, however, one must rely upon simulation and physical understanding, as we will do here, in order to determine whether one controller provides a "better" region of attraction than another.

⁵ The region of attraction of an asymptotically stable equilibrium is the set of all initial conditions whose corresponding solutions go to the equilibrium as t → ∞.
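The grid-of-initial-conditions approach behind the simulation-based tools cited above can be caricatured in a few lines. The sketch below is a stand-in (not one of the cited tools): it integrates a hypothetical damped pendulum x˙_1 = x_2, x˙_2 = −sin(x_1) − x_2, whose origin is asymptotically but not globally stable, and counts grid points whose solutions reach the origin.

```python
import numpy as np

def flows_to_origin(x0, dt=0.01, T=30.0, tol=1e-2):
    """Forward-Euler integration; True if the solution lands near the origin."""
    x = np.array(x0, dtype=float)
    for _ in range(int(T / dt)):
        x1, x2 = x
        x = x + dt * np.array([x2, -np.sin(x1) - x2])
    return np.linalg.norm(x) < tol

# coarse grid of initial conditions over a box in state space
grid = [(a, b) for a in np.linspace(-3, 3, 7) for b in np.linspace(-3, 3, 7)]
inside = sum(flows_to_origin(p) for p in grid)
print(inside, "of", len(grid), "grid points converge to the origin")
```

Points that settle at the neighboring equilibria ±2π are (correctly) not counted, illustrating why such estimates must be interpreted with care.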
6.3 Problem Description
In this section we will first define a class of systems called external/internal convertible systems, which we sometimes call E/I convertible for short, and point out a number of properties of such systems. In Section 6.3.1 we define the E/I convertible form. In Section 6.3.2 we describe a number of properties of E/I convertible systems. In Section 6.3.3 we discuss the Jacobian linearization of E/I convertible systems. In Section 6.3.4 we discuss the zero dynamics of E/I convertible systems. In Section 6.3.5 we show how partially linearizable nonlinear control systems may be put into E/I convertible form. Then in Section 6.3.6 balance systems will be defined as a subclass of E/I convertible systems. This will allow us, in Section 6.3.7 to give a precise statement of the tracking problem we wish to solve.
6.3.1 External/Internal Convertible Form

We will consider single-input, single-output, n-dimensional time-invariant nonlinear control systems of the form (see Figure 6.3)
External/Internal Convertible System

Σ(u):   x˙_i = x_{i+1},   i ∈ m − 1
        x˙_m = u
        α˙_i = α_{i+1},   i ∈ p − 1
        α˙_p = f(x, α) + g(x, α)u
        y = x_1        (6.13)
with input u ∈ R, output y ∈ R, state (x, α), with x := (x1 , . . . , xm ) ∈ Rm and α := (α1 , . . .αp ) ∈ Rp , with n = m + p. The coordinates (x, α) are assumed to be defined on
the open ball Br ⊂ Rn about the origin. The origin is assumed to be an equilibrium of the
system, thus f (0, 0) = 0.
Assumption 6.3.1 The functions f(x, α) and g(x, α) are C^n in their arguments for all (x, α) in B_r ⊂ R^{m+p}. N

Assumption 6.3.2 The function g(x, α) is assumed to be nonzero for all (x, α) ∈ B_r. N

Definition 6.3.3 Systems of the form (6.13) satisfying Assumption 6.3.1 are called external/internal convertible systems. N

For convenience we will often refer to external/internal convertible systems as E/I convertible. Figure 6.3 gives a block diagram showing the structure of an E/I convertible system.
Figure 6.3: An external/internal convertible system (block diagram: the external subsystem is an integrator chain from u through x_m, . . . , x_2, x_1 = y; the internal subsystem feeds f(x, α) + g(x, α)u through an integrator chain α_p, . . . , α_2, α_1).
6.3.2 Properties of E/I Convertible Systems

External/internal convertible systems have some useful structural features and properties that we will now describe.
External and Internal Subsystems

We will refer to x ∈ R^m as the external state of Σ(u) (6.13), and to

Σext(u):   x˙_i = x_{i+1},   i ∈ m − 1
           x˙_m = u        (6.14)

as the external subsystem of Σ(u) (6.13). Note that the external state x(t) of Σ(u) is completely determined by y^(0,m−1)(t), because x_i = y^(i−1), i ∈ m (see Figure 6.4). Hence the external state is observable, and if y(t) ≡ 0 for all t, then x(t) ≡ 0 for all t.
Figure 6.4: The external subsystem Σext(u) of Σ(u) (integrator chain u = y^(m) → x_m = y^(m−1) → · · · → x_2 = y^(1) → x_1 = y; see also Figure 6.3).
We will refer to α ∈ R^p as the internal state of Σ(u) (6.13), and to

Σint(x, u):   α˙_i = α_{i+1},   i ∈ p − 1
              α˙_p = f(x, α) + g(x, α)u        (6.15)
as the internal subsystem of Σ(u) (see Figure 6.5).
Figure 6.5: The internal subsystem Σint(x, u) of Σ(u) (x and u feed f(x, α) + g(x, α)u into an integrator chain α_p → · · · → α_2 → α_1; see also Figure 6.3).
The E/I convertible form (6.13) may be constructed from Σint and Σext as shown in Figure 6.6.
Figure 6.6: The plant Σ(u) reconstructed from the internal and external subsystems (Σext(u) produces y and x; x and u feed Σint(x, u), which produces α).
Example 6.3.4 Consider the control system

Σ(u):   x˙_1 = x_2
        x˙_2 = u
        α˙_1 = α_2
        α˙_2 = x + sin(α_1) − cos(α_1) cos(x)u        (6.16)

The external state is [x_1, x_2]^T and the external subsystem is

Σext(u):   x˙_1 = x_2
           x˙_2 = u        (6.17)

The internal state is [α_1, α_2]^T and the internal subsystem is

Σint(x, u):   α˙_1 = α_2
              α˙_2 = x + sin(α_1) − cos(α_1) cos(x)u        (6.18)

N
The Dual Structure

An E/I convertible system is convertible because, with a simple state-dependent input coordinate change and a change of output, the internal system is converted to an external system, and the external system is converted to an internal system. The resulting system will be referred to as the dual of Σ(u). Let

u = g(x, α)^{-1}(v − f(x, α))        (6.19)

define a state-dependent input transformation, u ↔ v. Define λ = α_1 as the dual output. Apply transformation (6.19) to the E/I convertible system (6.13) to get

Σd(v):   x˙_i = x_{i+1},   i ∈ m − 1
         x˙_m = −g(x, α)^{-1} f(x, α) + g(x, α)^{-1} v
         α˙_i = α_{i+1},   i ∈ p − 1
         α˙_p = v
         λ = α_1        (6.20)

Thus the input transformation (6.19), combined with the output assignment λ = α_1, converts the internal dynamics of Σ(u) to the external dynamics of Σd(v), and the external dynamics of Σ(u) to the internal dynamics of Σd(v). This property is the origin of the term external/internal convertible.
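The conversion (6.19)-(6.20) can be spot-checked numerically: after the input transformation, the last internal equation should reduce exactly to α˙_p = v. A minimal sketch with made-up f and g (not taken from the dissertation's examples):

```python
import numpy as np

# hypothetical E/I convertible data with m = p = 2
f = lambda x, a: x[0] + np.sin(a[0])      # internal drift term f(x, alpha)
g = lambda x, a: 2.0 + np.cos(a[0])       # input gain g(x, alpha), nonzero

x = np.array([0.3, -0.1])                 # arbitrary external state
a = np.array([0.2, 0.5])                  # arbitrary internal state
v = 0.7                                   # dual input

u = (v - f(x, a)) / g(x, a)               # input transformation (6.19)
alpha_p_dot = f(x, a) + g(x, a) * u       # internal dynamics of Sigma(u)
print(np.isclose(alpha_p_dot, v))         # alpha_p' = v: the chain is now external
```

The cancellation is exact by construction, which is why Assumption 6.3.2 (g nonzero) is needed.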
6.3.3 The Linearization at the Origin

The Jacobian linearization of Σ(u) (6.13) at the origin (x, α) = (0, 0) has the form
[x˙]   [A_11    0  ] [x]   [B_1]
[α˙] = [A_21  A_22 ] [α] + [B_2] u        (6.21)

with B_1 ∈ R^{m×1}, B_2 ∈ R^{p×1}, A_11 ∈ R^{m×m}, A_21 ∈ R^{p×m}, and A_22 ∈ R^{p×p} defined by

A_11 = [0 1 0 · · · 0; 0 0 1 · · · 0; . . . ; 0 0 0 · · · 1; 0 0 0 · · · 0],
A_21 = [0 · · · 0; . . . ; 0 · · · 0; ∂f(0,0)/∂x]        (6.22)

A_22 = [0 1 0 · · · 0; . . . ; 0 0 0 · · · 1; ∂f(0,0)/∂α]        (6.23)

B_1 = [0, . . . , 0, 1]^T,   B_2 = [0, . . . , 0, g(0,0)]^T        (6.24)

That is, A_11 is the m × m shift matrix (ones on the superdiagonal), A_21 is zero except for its last row ∂f(0,0)/∂x, and A_22 is the p × p shift matrix with last row ∂f(0,0)/∂α.
We may make a few observations regarding the linearization (6.21) of (6.13) at the origin.

Observation 6.3.5 The form of the Jacobian linearization of an E/I convertible system neither precludes nor implies controllability at the origin. For example,

x˙ = u
α˙ = −x + α + u        (6.25)

is not controllable, while

x˙ = u
α˙ = x + u        (6.26)

is controllable. N
Observation 6.3.6 If ∂f(0,0)/∂x = 0 and ∂f(0,0)/∂α = 0, then the linearization is not controllable. For example,

x˙ = u
α˙ = cu        (6.27)

where c ≠ 0 is a constant, is not controllable at the origin. N

Observation 6.3.7 If f and g in Σ(u) (6.13) satisfy ∂f/∂x ≡ 0 and g(0, 0) = 0, then the pair (A, B) (6.21) is not controllable. For example,

x˙ = u
α˙ = α        (6.28)

is not controllable. N
Since, when we define balance systems in Section 6.3.6, we will assume controllability of the linearization (6.21) of Σ(u) (6.13), we know that we may stabilize the origin by linear state feedback. Well-established tools exist for such stabilization, e.g. the linear quadratic regulator (LQR) (see [CD91a]).

Remark 6.3.8 As mentioned in Section 6.2.1, the controller we will describe in this chapter is more complex than a linear controller. We will show that the trade-off for such complexity is a substantial increase in the region of attraction of the origin in the case of regulation, along with a substantial increase in performance in the case of tracking. Our scheme will depend upon n parameters, with the requirement that the linearization of the resulting closed-loop system is stable at the origin. In fact, for comparison purposes, we will later make the linearization of the controlled system identical to that obtained with a linear quadratic regulator. N

6.3.4 The Zero Dynamics

For many single-input, single-output systems the relative degree of the output y
is less than the system dimension n. For such systems we may define zero dynamics, as discussed in Chapter 4. The stability or instability of those zero dynamics is a critical factor in determining the sort of control that may be used for tracking. The zero dynamics of the E/I convertible system Σ(u) (6.13) are
Zero-Dynamics of the E/I Convertible Form

α˙_i = α_{i+1},   i ∈ p − 1
α˙_p = f(0, α)        (6.29)

obtained by restricting the input u and the output y of Σ(u) (6.13) to be identically zero (see Figure 6.7), or equivalently obtained as Σint(0, 0) (see Equation (6.15)).
Figure 6.7: The zero dynamics of Σ(u) (f(0, α) feeds an integrator chain α_p → · · · → α_2 → α_1).

Balance systems will be defined below as having zero dynamics (6.29) that are unstable at α = 0. The zero dynamics of the dual system Σd(v) (6.20) are obtained by restricting λ ≡ 0 and v ≡ 0. The resulting system is the zero dynamics of Σd(v),

Zero-Dynamics of the Dual

x˙_i = x_{i+1},   i ∈ m − 1
x˙_m = −g(x, 0)^{-1} f(x, 0)        (6.30)
The stability or instability of the zero dynamics of Σd(v) is independent of the stability or instability of the zero dynamics of Σ(u). This is demonstrated by the following example.

Example 6.3.9 Special Cases: Internal Stability of Dual Systems. The linear system

x˙ = u
α˙ = x + α + u
y = x        (6.31)

has zero dynamics

α˙ = α        (6.32)

obtained by setting u ≡ 0 and y ≡ 0. This zero dynamics (6.32) is unstable; if α(0) ≠ 0 then |α(t)| → ∞. The system dual to (6.31),

x˙ = −x − α + v
α˙ = v
y = α        (6.33)

however, has the stable zero dynamics

x˙ = −x        (6.34)

The system

x˙ = u
α˙ = −x + u
y = x        (6.35)

has unstable zero dynamics, and its dual system

x˙ = x + v
α˙ = v
y = α        (6.36)

also has unstable zero dynamics. N

6.3.5 Conversion of Control Systems to External/Internal Convertible Form

Single-input single-output time-invariant nonlinear control systems of the form

x̄˙ = f̄(x̄) + ḡ(x̄)ū
ȳ = h̄(x̄)        (6.37)

may be brought into the external/internal convertible form (6.13) in a number of ways. For instance, suppose the output ȳ has relative degree m (see Appendix, Section C, or Isidori [Isi89], Chapter 4). Suppose also that there exists another function φ(x̄) having relative degree p and such that

Φ(x̄) := [L_f̄^0 h̄, . . . , L_f̄^{m−1} h̄, L_f̄^0 φ, . . . , L_f̄^{p−1} φ](x̄)        (6.38)
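The contrast in Example 6.3.9 can be confirmed by direct integration of the two scalar zero dynamics; a small sketch:

```python
import numpy as np

def euler(rhs, z0, dt=0.01, T=5.0):
    """Forward-Euler integration of a scalar ODE z' = rhs(z)."""
    z = z0
    for _ in range(int(T / dt)):
        z = z + dt * rhs(z)
    return z

# zero dynamics of (6.31) with u = 0, y = 0:  alpha' = alpha  (diverges)
alpha_T = euler(lambda a: a, 0.1)
# zero dynamics of the dual (6.33) with v = 0, lambda = 0:  x' = -x  (decays)
x_T = euler(lambda x: -x, 0.1)
print(alpha_T > 1.0, abs(x_T) < 1e-2)
```

After five seconds the unstable solution has grown by roughly e^5 while the dual's has shrunk by the same factor.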
is a diffeomorphism in a neighborhood of each x̄ ∈ B_r ⊂ R^n. Then the coordinate transformation x̄ ↦ Φ(x̄), along with the input transformation

ū = (u − L_f̄^m h̄(x̄)) / (L_ḡ L_f̄^{m−1} h̄(x̄))        (6.39)

brings (6.37) to the convertible form of Σ(u) given in (6.13).

Many underactuated mechanical systems⁶ may be easily brought to convertible form (6.13), as shown by the following example.

Example 6.3.10 Conversion of a Mechanical System to External/Internal Convertible Form. We consider a multi-input, multi-output, underactuated mechanical system in order to suggest to the reader how the standard form (6.13) generalizes to multi-input, multi-output systems.

Let q^1 ∈ R^{n_1} and q^2 ∈ R^{n_2}, with τ^1 ∈ R^{n_1}. Consider the second-order underactuated mechanical system

M(q) [q̈^1; q̈^2] = [F^1(q, q˙); F^2(q, q˙)] + [τ^1; 0],   M(q) = [M_11(q) M_12(q); M_21(q) M_22(q)]        (6.40)
with output y = q^1 and generalized applied force τ^1. The mass matrix M(q) is assumed symmetric and positive-definite, hence the submatrices M_11(q) ∈ R^{n_1×n_1} and M_22(q) ∈ R^{n_2×n_2} are also symmetric and positive-definite. Since M(q) is nonsingular, a unique solution exists for q̈^2 in terms of q^1, q^2, q˙^1, q˙^2, and τ^1. Consider the state-dependent input transformation which generates τ^1 from a new input u,

τ^1 = M_11(q)u + M_12(q)q̈^2 − F^1(q, q˙)        (6.41)

Substitute this expression for τ^1 into the first n_1 equations of (6.40) to get

q̈^1 = u        (6.42)

Substitute (6.42) into the last n_2 equations of (6.40) and solve for q̈^2 to get

q̈^2 = M_22^{-1}(q)F^2(q, q˙) − M_22^{-1}(q)M_21(q)u        (6.43)

which, when substituted into (6.41), gives

τ^1 = (M_11(q) − M_12(q)M_22^{-1}(q)M_21(q))u + M_12(q)M_22^{-1}(q)F^2(q, q˙) − F^1(q, q˙)        (6.44)

⁶ A mechanical system is underactuated when there are fewer generalized forces than generalized coordinates.
Let

x_1 = q^1,  x_2 = q˙^1,  α_1 = q^2,  α_2 = q˙^2        (6.45)

Our mechanical system now takes the form

x˙_1 = x_2
x˙_2 = u
α˙_1 = α_2
α˙_2 = M_22(q)^{-1} F^2(q, q˙) − M_22(q)^{-1} M_21(q)u        (6.46)

with output y = x_1. If M_21(q) is full rank for all q in a neighborhood of the origin, then (6.46) is an E/I convertible system in that neighborhood. Spong [Spo96] calls this full-rank condition on M_21(q) strong inertial coupling.

Special Case: For the special case of n_1 = n_2 = 1, (6.46) is in single-input, single-output E/I convertible form (6.13). N
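The input transformation (6.41)/(6.44) can be checked numerically for the n_1 = n_2 = 1 case: applying the τ^1 of (6.44) to the original dynamics (6.40) must yield q̈^1 = u exactly. The mass matrix and force values below are made-up numbers, frozen at one state (they are not from any system in the dissertation):

```python
import numpy as np

M = np.array([[2.0, 0.5],
              [0.5, 1.0]])        # symmetric positive-definite mass matrix
F = np.array([0.3, -0.2])         # [F1, F2] evaluated at the current state
u = 0.9                           # desired external acceleration q1''

# tau1 from (6.44), scalar case
tau1 = (M[0, 0] - M[0, 1] * M[1, 0] / M[1, 1]) * u \
       + M[0, 1] / M[1, 1] * F[1] - F[0]

# solve the original dynamics M qdd = F + [tau1, 0]
qdd = np.linalg.solve(M, F + np.array([tau1, 0.0]))
print(np.isclose(qdd[0], u))      # the external subsystem sees q1'' = u exactly
```

The second component qdd[1] then agrees with (6.43), here M_22^{-1}F^2 − M_22^{-1}M_21 u = −0.65.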
6.3.6 Balance Systems

In this subsection we define balance systems as a special class of E/I convertible systems by imposing a number of assumptions on the E/I convertible form.

Assumption 6.3.11 The pair (A, B) from the Jacobian linearization (6.21) of Σ(u) (6.13) at (x, α) = (0, 0) is controllable. N

Let α^2 denote [α_2, . . . , α_p]^T. Let (x; α_1, α^2) denote (x, α) with α_1 ∈ R^1 and α^2 ∈ R^{p−1}.

Assumption 6.3.12 Assume that

∂f(0; 0, 0)/∂α_1 > 0        (6.47)

N
Under Assumption 6.3.12 an E/I convertible system is nonminimum-phase. Consider the equation

f(x; α_1, 0) + g(x; α_1, 0)v = 0        (6.48)

Divide both sides of (6.48) by g(x; α_1, 0), which is never zero by Assumption 6.3.2, subtract v, and multiply by −1 to get

−g(x; α_1, 0)^{-1} f(x; α_1, 0) = v        (6.49)

Assumption 6.3.13 Assume that for all (x; α_1, 0) ∈ B_r,

(∂/∂α_1)[g(x; α_1, 0)^{-1} f(x; α_1, 0)] ≠ 0        (6.50)

i.e. the map α_1 ↦ v of (6.49) is strictly monotone. N
The significance of Assumption 6.3.13 relates to the existence of an internal equilibrium manifold, as will be explained in Section 6.6 below.

Definition 6.3.14 An external/internal convertible system which satisfies Assumptions 6.3.11, 6.3.12, and 6.3.13 will be referred to as a balance system. N

Example 6.3.15 A Balance System. Consider the E/I convertible form system

x˙_1 = x_2
x˙_2 = u
α˙ = sin(α) + x_1 + u
y = x_1        (6.51)

whose linearization at the origin is

[x˙_1]   [0 1 0] [x_1]   [0]
[x˙_2] = [0 0 0] [x_2] + [1] u        (6.52)
[α˙ ]   [1 0 1] [α  ]   [1]

with A the 3 × 3 matrix and B the column vector above. It is easily verified that (A, B) is a controllable pair. For system (6.51),

f(x, α) = sin(α) + x_1,   g(x, α) = 1        (6.53)
Thus

∂f(0, 0)/∂α = 1 > 0        (6.54)

so Assumption 6.3.12 is satisfied. Also,

(∂/∂α)[g(x, α)^{-1} f(x, α)] = (∂/∂α)[sin(α) + x_1] = cos(α)        (6.55)

is nonzero for all α ∈ (−π/2, π/2), so Assumption 6.3.13 is satisfied. Thus (6.51) is a balance system for (x, α) ∈ B_{π/2}. N
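Assumptions 6.3.12 and 6.3.13 for Example 6.3.15 can also be spot-checked by finite differences rather than by hand; a small sketch (the sample point x_1 = 0.2 is arbitrary):

```python
import numpy as np

# system (6.51): f(x; alpha, 0) = sin(alpha) + x1, g = 1
f = lambda x1, alpha: np.sin(alpha) + x1

eps = 1e-6
# Assumption 6.3.12: df/dalpha at (0; 0, 0) should be positive (it is cos(0) = 1)
d_alpha_at_0 = (f(0.0, eps) - f(0.0, -eps)) / (2 * eps)

# Assumption 6.3.13: d/dalpha [g^{-1} f] = cos(alpha), nonzero on (-pi/2, pi/2)
alphas = np.linspace(-np.pi / 2 + 0.1, np.pi / 2 - 0.1, 50)
derivs = (f(0.2, alphas + eps) - f(0.2, alphas - eps)) / (2 * eps)
print(d_alpha_at_0 > 0, np.all(np.abs(derivs) > 0.05))
```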
6.3.7 The Regulation and Tracking Problems

In this subsection we define the tracking problem, with the regulation problem as a special case. For n, m ∈ Z⁺, m ≥ n, and y(·) ∈ C^m[0, ∞), let

y^(n,m) := [y^(n), y^(n+1), . . . , y^(m)]^T        (6.56)

Let the class of output reference trajectories y_d(·) be C^n on [0, ∞), with the norm

‖y_d^(0,n)‖_∞ = sup_{t≥0} ‖y_d^(0,n)(t)‖_∞        (6.57)

and the corresponding open ball

B_ε^(n) = { y_d | ‖y_d^(0,n)‖_∞ < ε }        (6.58)

Let C(v) denote the compensator for a balance system Σ(u) (6.13), defined by

C(v):   z˙ = a(x, α, z, v)
        u = b(x, α, z, v)        (6.59)

with x ∈ R^m, α ∈ R^p, z ∈ R^q, v ∈ R^k, u ∈ R^m, and where a(x, α, z, v) and b(x, α, z, v) are smooth vector fields with a(0, 0, 0, 0) = 0 and b(0, 0, 0, 0) = 0. Let [C(v), Σ(u)] denote the interconnection of C(v) and Σ(u), as illustrated in Figure 6.8.
Figure 6.8: The interconnection of plant Σ(u) and compensator C(v) (C(v) produces u from v, its state z, and the plant state (x, α); Σ(u) produces y and (x, α)).
We will concern ourselves with the following problem.

Problem 6.3.16 Asymptotic Approximate Tracking. Let y be the output of [C(v), Σ(u)]. Assume y_d ∈ B_ε^(n). Find integer q ≥ 0, and smooth functions a : R^m × R^p × R^q × R^k → R^q and b : R^m × R^p × R^q × R^k → R^m, such that for all (x(0), α(0), z(0)) in a neighborhood of the origin,

i. lim_{t→∞} ‖y(t) − y_d(t)‖_∞ < κ_1(ε), where κ_1(ε) is a class-K function (see Appendix A) of ε,

ii. for a class-K function κ_2(ε) of ε, if ‖α(0)‖ < κ_2(ε), then ‖α(t)‖_∞ < κ_2(ε) for all t ≥ 0,

iii. the equilibrium (0, 0) of [C(v), Σ(u)] is locally exponentially stable,

iv. (α(t), z(t)) is bounded on R⁺. N
Definition 6.3.17 The Regulation Problem. The regulation problem is the tracking problem, Problem 6.3.16, for y_d ≡ 0. N

The dynamic part of the compensator C(v) that will be described will turn out to be a dynamic inverter.
6.3.8 A Comment on Normal Form

Single-input single-output nonlinear autonomous control systems of the form (6.37) may always be brought into a normal form (see Isidori [Isi89], page 152) in which the input u does not appear in the internal subsystem. This is a result of the fact that the input vector field ḡ (6.37) is one-dimensional and therefore constitutes an involutive vector field distribution. Thus there exists an embedded submanifold of the state space R^n, and an (n−1)-dimensional distribution ∆(x), tangent to the embedded submanifold, such that ḡ is not in ∆(x). By changing the vector field basis in which the system is expressed to the union of ḡ with a smooth basis of ∆(x), the normal form is achieved. Since form (6.13) is a special case of form (6.37), we may also bring (6.13) to normal form. However, the control methodology we will develop in this chapter is not confined to single-input single-output systems. Single-input single-output systems have been chosen only for simplicity of exposition. We will, in fact, rely upon the appearance of u in the internal subsystem. In the multi-input multi-output case, a normal form, in the sense of Isidori [Isi89], exists only
for the restricted class of control systems having involutive input distribution. If the input distribution is indeed involutive, our method is unaffected by this. But by not relying upon involutivity we retain the utility of our control scheme over a much wider variety of systems.
6.4 Controlling the External Subsystem

In this section we describe a tracking controller for the external subsystem Σext(u) (see Equation (6.14)), disregarding, for the moment, the evolution of the internal subsystem Σint(x, u) (6.15). This tracking controller will play a role in our final control law.
6.4.1 The External Tracking Dynamics

We define the external tracking controller by

External Tracking Controller

u_ext = v_ext
v_ext(y_d^(0,m), x) = y_d^(m)(t) − Σ_{i=1}^{m} γ_i (x_i − y_d^(i−1)(t))        (6.60)

where γ_i, i ∈ m, are chosen so that the roots of the polynomial

s^m + Σ_{i=1}^{m} γ_i s^{i−1}        (6.61)

are in Co−.

Remark 6.4.1 The external tracking controller depends solely upon the external state x and on y_d^(0,m)(t). The evolution of the internal state α is ignored. N

In light of Remark 6.4.1 we cannot expect Σ(u_ext), the system (6.13) with u := u_ext, to be internally stable. For instance, if y_d ≡ 0 and Σ(u) has unstable zero dynamics, then by definition of unstable zero dynamics, Σ(u_ext) is not internally stable.
Given a reference output yd (t) ∈ C n , the nominal external dynamics are
Nominal External Dynamics

Σext(u_ext):   x˙_1 = x_2
               . . .
               x˙_{m−1} = x_m
               x˙_m = y_d^(m) − Σ_{i=1}^{m} γ_i (x_i − y_d^(i−1)(t))        (6.62)

We call the vector field vf(Σext(u_ext)) (see Appendix A for notation) the nominal external vector field,

Nominal External Vector Field

N_ext(y_d^(0,m), x) = [x_2, . . . , x_m, y_d^(m) − Σ_{i=1}^{m} γ_i (x_i − y_d^(i−1)(t))]^T        (6.63)
The nominal external dynamics are the dynamics we would like the external system Σext(u) (6.14) to obey if we could ignore the internal α-dynamics.

Example 6.4.2 External Tracking Controller and External Dynamics. Consider the system

x˙_1 = x_2
x˙_2 = u
α˙_1 = α_2
α˙_2 = f(x, α) + g(x, α)u        (6.64)

The external subsystem for this system is

x˙_1 = x_2
x˙_2 = u        (6.65)
Given y_d(t) ∈ C^2, an external tracking controller (6.60) is

u_ext = v_ext = ÿ_d − (x_2 − y˙_d) − (x_1 − y_d)        (6.66)

and the nominal external dynamics (6.62) are

x˙_1 = x_2
x˙_2 = ÿ_d(t) − (x_2 − y˙_d) − (x_1 − y_d)        (6.67)
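A quick simulation, not from the dissertation, of the nominal external dynamics (6.67) for the reference y_d(t) = sin(t):

```python
import numpy as np

dt, T = 0.001, 20.0
x1, x2 = 0.5, 0.0                       # start away from the reference
for k in range(int(T / dt)):
    t = k * dt
    yd, yd1, yd2 = np.sin(t), np.cos(t), -np.sin(t)
    u = yd2 - (x2 - yd1) - (x1 - yd)    # external tracking controller (6.66)
    x1, x2 = x1 + dt * x2, x2 + dt * u  # forward-Euler step of (6.65)
err = abs(x1 - np.sin(T))
print(err < 1e-2)                       # output has converged to y_d
```

The tracking error obeys ë + ė + e = 0, so it decays exponentially regardless of the reference; the internal state, were it simulated, would be ignored entirely, as Remark 6.4.1 warns.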
N

6.5 Controlling the Internal Subsystem

In this section we describe a controller for the internal subsystem Σint(x, u), disregarding, again for the moment, the evolution of the external subsystem Σext(u) and thus the output y(t). Like the external tracking controller, the internal tracking controller will play a crucial role in our final tracking control law.
6.5.1 The Internal Tracking Dynamics

We will associate with the internal subsystem Σint(x, u) the output λ = α_1, which may be identified as the output of the dual system Σd(v) (6.20). Given a reference output λ_d(t) ∈ C^p, the internal tracking controller is

Internal Tracking Controller

u_int(v_int) = g(x, α)^{-1}(v_int − f(x, α))
v_int = λ_d^(p) − Σ_{i=1}^{p} β_i (α_i − λ_d^(i−1))        (6.68)

where β_i, i ∈ p, are chosen so that the roots of the polynomial

s^p − Σ_{i=1}^{p} β_i s^{i−1}        (6.69)

have strictly negative real parts. Given a reference output λ_d(t), the nominal internal dynamics (see Figure 6.9) are Σint(x, u_int(v_int)) (see (6.15) and (6.68)),
Nominal Internal Dynamics

Σint(x, u_int):   α˙_1 = α_2
                  . . .
                  α˙_{p−1} = α_p
                  α˙_p = λ_d^(p) − Σ_{i=1}^{p} β_i (α_i − λ_d^(i−1))        (6.70)

If we could ignore the external dynamics Σext(u_int) of Σ(u_int), the nominal internal dynamics (6.70) are the dynamics we would like the state of Σint(x, u) to obey in order for λ(t) to converge exponentially to λ_d(t).
Figure 6.9: The internal tracking controller (λ_d^(0,p) and α generate v_int = λ_d^(p) − Σ_{i=1}^{p} β_i (α_i − λ_d^(i−1)), which passes through u = g(x, α)^{-1}(v − f(x, α)) into Σint(x, u)).

Example 6.5.1 Internal Tracking Controller and Internal Dynamics. Consider the system (6.64) of Example 6.4.2. Assume λ_d(t) is C^2. An internal tracking controller for system (6.64) is

u_int = g(x, α)^{-1}(v_int − f(x, α))
v_int = λ̈_d − 2(α_2 − λ˙_d) − (α_1 − λ_d)        (6.71)

and the nominal internal dynamics are

α˙_1 = α_2
α˙_2 = λ̈_d − 2(α_2 − λ˙_d) − (α_1 − λ_d)        (6.72)

N
Remark 6.5.2 In our final tracking strategy we will construct a value of the signal λ_d, which we will call α_e, that depends upon y_d^(0,m)(t) and the external state x. We will then use a controller of a form similar to u_int(v_int) (6.68) in order to cause α_1 to approximately track α_e. It will be shown in Section 6.7 that, under appropriate conditions, if α_1 approximately tracks α_e, then y approximately tracks y_d(t). N
6.6 The Internal Equilibrium Manifold

In this section we construct the internal equilibrium angle α_e, actually a function of x and a choice of external controller v_ext. We will use α_e and approximations of its time-derivatives in order to construct an approximate tracking controller for Σ(u). The internal equilibrium angle⁷ α_e is constructed as part of the equilibrium solution of Σint(x, u_ext), the internal subsystem (6.15) with the external tracking controller (6.60) applied. The internal equilibrium equations, Σe, are

Internal Equilibrium Equations

Σe:   0 = α_2
      . . .
      0 = α_p
      0 = f(x, α) + g(x, α) v_ext(y_d^(0,m)(t), x)        (6.73)
The first p − 1 equations of Σe dictate that for the equilibrium solution,

α_2 = . . . = α_p = 0        (6.74)

Again, let

α^2 := [α_2, . . . , α_p]^T        (6.75)

with α = (α_1, α^2). Substitute the solution α^2 = 0 of the first p − 1 equations of Σe into the last equation of Σe, and let (α_e, 0, . . . , 0) be denoted by (α_e, 0) to get

⁷ We use the term angle because in the case of the inverted pendulum on a cart, α_e corresponds to an angle of the pendulum.
Internal Equilibrium Angle

0 = f(x; α_e, 0) + g(x; α_e, 0) v_ext(y_d^(0,m)(t), x)        (6.76)

where we have replaced α_1 in (6.73) by α_e. We call the solution α_e of (6.76) the internal equilibrium angle. Assumption 6.3.13, together with Assumption 6.3.1 and the implicit function theorem, implies that for all x in a neighborhood of the origin there exists an invertible x-dependent map v ↦ α_e(x, v) such that

f(x; α_e(x, v), 0) + g(x; α_e(x, v), 0)v = 0        (6.77)

We will abuse notation by writing α_e(y_d^(0,m), x) = α_e(x, v_ext(y_d^(0,m), x)). Now note (see (6.60)) that v_ext(0, 0) = 0 and v_ext(y_d^(0,m), x) is continuous in y_d^(0,m) and x. Therefore the solution α_e(y_d^(0,m), x) to (6.76) exists for all y_d^(0,m) and x sufficiently small.
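For concreteness, the scalar equilibrium equation (6.76) can be solved pointwise by Newton iteration. The sketch below does this for the balance system (6.51), where (6.76) reduces to sin(α_e) + x_1 + v_ext = 0; the sample values of x_1 and v_ext are arbitrary.

```python
import numpy as np

def alpha_e(x1, v_ext, iters=20):
    """Newton iteration for the internal equilibrium angle of system (6.51)."""
    a = 0.0
    for _ in range(iters):
        r = np.sin(a) + x1 + v_ext    # residual of (6.76) for this system
        a = a - r / np.cos(a)         # Newton step; cos(a) != 0 near the origin
    return a

a_e = alpha_e(x1=0.2, v_ext=0.1)
print(np.isclose(np.sin(a_e) + 0.2 + 0.1, 0.0))  # equilibrium equation satisfied
```

Assumption 6.3.13 is exactly what guarantees the derivative used in the Newton step stays bounded away from zero, so the iteration is well posed near the origin. (The dissertation's own proposal, taken up in Section 6.8, is to track this root continuously by dynamic inversion rather than by a discrete iteration.)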
Assumption 6.6.1 Assume that α_e(x, v) is C^p in x and v, with all mixed partial derivatives up to order p bounded on B_r. N

Remark 6.6.2 Note that

−g(x; α_e, 0)^{-1} f(x; α_e, 0) = v_ext(y_d^(0,m)(t), x)        (6.78)

Equation (6.78) will play a key role later in the analysis of our final controller, where (6.78) will reappear as the first term in a Taylor expansion. N
Remark 6.6.3 Derivatives of the Internal Equilibrium Angle. Even though α_e is the solution to an equilibrium equation, it is not generally true that (d^i/dt^i) α_e = 0. This is because the internal equilibrium equation Σe depends upon the exogenous variables x and y_d^(0,m)(t), which are themselves functions of t. N
Remark 6.6.4 When y_d^(0,m) ≡ 0, α_e is the solution of

f(x; α_e, 0) − g(x; α_e, 0) Σ_{i=1}^{m} γ_i x_i = 0        (6.79)

thus α_e is not identically zero even when y_d^(0,m) ≡ 0. N
Under certain regularity conditions, the implicit equations Σe (6.73) define an m-dimensional submanifold of the n-dimensional state space of Σ(u). More precisely, we define the internal equilibrium manifold as follows:

Internal Equilibrium Manifold

E(t) = { (x, α) ∈ B_r ⊂ R^m × R^p | α_1 = α_e(y_d^(0,m)(t), x), α_2 = . . . = α_p = 0 }        (6.80)
Property 6.6.5 The manifold E(t) is an intrinsic geometric structure associated with the pair (y_d, Σ(u)), having the following properties:

• t-dependence. The internal equilibrium manifold is t-dependent, except in the regulation case where y_d^(i)(t), i ∈ m, are identically zero.

• Dimension. The internal equilibrium manifold is of dimension m and codimension p.

• Graph Property. The internal equilibrium manifold is a t-dependent graph over the m-dimensional x-subspace of the state space of Σext(u).

• Independent of Applied Input. The definition of E(t) is independent of the input u applied to Σ(u), though it is dependent on v_ext(y_d^(0,m)(t), x).

• Regulation Case. When y_d(t) ≡ 0, (x, α) = (0, 0) is in E(t).

• Special Case. If f and g are independent of x, then the level sets of α_e(y_d^(0,m), x) are (m − 1)-dimensional time-varying hyperplanes embedded in the m-dimensional x-subspace of the state space R^n. In this case α_e may be regarded as a function of the value of v_ext. Figure 6.10 illustrates this for the case of x ∈ R^2 and y_d^(0,2) ≡ 0.
Figure 6.10: When f(x, α) and g(x, α) are independent of x, then α_e may be regarded as a time-varying function of v_ext (the figure shows the surface of α_e over the (x_1, x_2)-plane, with level sets along lines of constant v_ext).
6.6.1 Derivatives Along the Internal Equilibrium Manifold

As mentioned above, we may regard E(t) as being a t-dependent graph over the m-dimensional x-subspace R^m of Σext(u). Consider a smooth vector field N : R+ × R^m → R^m; (t, x) ↦ N(t, x). Thus for each t we have a vector field N(t, ·) on R^m. Let h(t, x) be a smooth real-valued function of t and x. The ith Lie derivative (or directional derivative) of h(t, x) along N(t, x), holding t fixed and evaluating at (t, x), is denoted L^i_N h(t, x), and is defined recursively by L_N h(t, x) = L^1_N h(t, x) = dh(t, x) · N(t, x), and L^i_N h(t, x) = L_N(L^{i−1}_N h(t, x)), with L^0_N h(t, x) = h(t, x). Define L̄_N h(t, x) by

    L̄_N h(t, x) := L_N h(t, x) + ∂h/∂t        (6.81)

and similarly define L̄^i_N h(t, x) recursively by

    L̄^i_N h(t, x) := L̄_N L̄^{i−1}_N h(t, x)        (6.82)

Thus L̄^i_N h(t, x) is the ith derivative with respect to t of the function h(t, x) along the solutions of ẋ = N(t, x). The ith derivative of αe(y_d^{(0,m)}, x) along the vector field N(t, x) is then L̄^i_N αe.

Recall the nominal external vector field Next(y_d^{(0,m)}, x) of (6.63). In the next section we will use L̄^i_{Next} αe as an approximator for d^i αe/dt^i.
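As a sanity check on the definitions (6.81)–(6.82), the following sketch compares L̄_N h against a finite-difference quotient of h taken along a solution of ẋ = N(t, x), for a hypothetical scalar example N(t, x) = −x, h(t, x) = t x^2 (all names and numbers are illustrative):

```python
def N(t, x):
    return -x                       # a simple time-invariant vector field

def h(t, x):
    return t * x * x

def Lbar_h(t, x):
    # Lbar_N h = dh/dx * N + dh/dt, per (6.81)
    return t * 2 * x * N(t, x) + x * x

# Compare with the derivative of h along a solution of xdot = N(t, x),
# approximated by one small Euler step.
t, x, dt = 0.3, 1.2, 1e-6
x_next = x + dt * N(t, x)
fd = (h(t + dt, x_next) - h(t, x)) / dt
assert abs(fd - Lbar_h(t, x)) < 1e-4
```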
6.7 Approximate Tracking

In this section we combine the results of the previous sections in order to construct a controller for approximate output tracking of a desired output yd(t) with y_d^{(n)}(t) ∈ B_ε. The internal equilibrium controller is constructed by substituting αe for λd, and L̄^i_{Next} αe for λ_d^{(i)}, in the internal tracking controller (6.68):

Internal Equilibrium Controller:

    ue = uint(ve) = g(x, α)^{−1} ( −f(x, α) + ve )
    ve = L̄^p_{Next} αe − Σ_{i=1}^{p} βi ( α^i − L̄^{i−1}_{Next} αe )        (6.83)

6.7.1 Error Coordinates

It will be convenient to use error coordinates in our analysis of the stability of Σ(ue). Let

    e^i_x = x^i − y_d^{(i−1)}(t),  i ∈ m
    e^i_α = α^i − αe^{(i−1)},  i ∈ p        (6.84)

with the external error ex := [e^1_x, ..., e^m_x]^T and the internal error eα := [e^1_α, ..., e^p_α]^T. Let e := [e^1_x, ..., e^m_x, e^1_α, ..., e^p_α]^T. Note particularly that αe may be regarded as a function of y_d^{(m)} and ex, or of y_d^{(0,m)} and x.

6.7.2 Analysis of the Internal Equilibrium Controller

We will show that the system Σ(ue), defined as the system Σ(u) (6.13) with the input ue (6.83), may be regarded as an exponentially stable system under an affine perturbation (see Definition 6.7.1 below). It will be seen that the internal equilibrium controller approximately decouples the dynamics of ex and eα. Later, in Section 6.8, we will bring in dynamic inversion in order to estimate αe. For the purposes of the present analysis, however, we assume that αe is available explicitly.
Insert ue (6.83) into Σ(u) (6.13) to get

    Σ(ue(ve)):
        ẋ^i = x^{i+1},  i ∈ m−1
        ẋ^m = −g(x, α)^{−1} f(x, α) + g(x, α)^{−1} ve
        α̇^i = α^{i+1},  i ∈ p−1
        α̇^p = ve        (6.85)
where ve is given by (6.83). Expand ve as

    ve = L̄^p_{Next} αe − Σ_{i=1}^{p} βi ( α^i − L̄^{i−1}_{Next} αe )
       = αe^{(p)} − Σ_{i=1}^{p} βi ( α^i − αe^{(i−1)} ) + ( L̄^p_{Next} αe − αe^{(p)} ) + Σ_{i=1}^{p} βi ( L̄^{i−1}_{Next} αe − αe^{(i−1)} )
       = αe^{(p)} − Σ_{i=1}^{p} βi e^i_α + pα(y_d^{(0,n)}, e)        (6.86)

where pα(y_d^{(0,n)}(t), e) = O(‖y_d^{(0,n)}(t)‖, ‖e‖). Note that

    −g(x, α)^{−1} f(x, α) = −g(x; αe, 0)^{−1} f(x; αe, 0) + qx(y_d^{(0,n)}(t), e)        (6.87)

where qx(y_d^{(0,n)}(t), e) = O(‖y_d^{(0,n)}(t)‖, ‖e‖). Substituting from (6.78) gives

    −g(x, α)^{−1} f(x, α) = vext(y_d^{(0,m)}(t), x) + qx(y_d^{(0,n)}(t), e)
                          = y_d^{(m)} − Σ_{i=1}^{m} γi ( x^i − y_d^{(i−1)} ) + qx(y_d^{(0,n)}, e)
                          = y_d^{(m)} − Σ_{i=1}^{m} γi e^i_x + qx(y_d^{(0,n)}(t), e)        (6.88)

It follows that

    ue = −g(x, α)^{−1} f(x, α) + g(x, α)^{−1} ve
       = y_d^{(m)} − Σ_{i=1}^{m} γi ( x^i − y_d^{(i−1)} ) + qx(y_d^{(0,n)}, e) + g(x, α)^{−1} ve
       = y_d^{(m)} − Σ_{i=1}^{m} γi e^i_x + px(y_d^{(0,n)}(t), e)        (6.89)

where

    px(y_d^{(0,n)}(t), e) = qx(y_d^{(0,n)}(t), e) + g(x, α)^{−1} ve = O(‖y_d^{(0,n)}(t)‖, ‖e‖)        (6.90)

Thus we may regard Σ(ue) as

    ẋ^i = x^{i+1},  i ∈ m−1
    ẋ^m = y_d^{(m)} − Σ_{i=1}^{m} γi ( x^i − y_d^{(i−1)}(t) ) + px(y_d^{(0,n)}(t), e)
    α̇^i = α^{i+1},  i ∈ p−1
    α̇^p = αe^{(p)} − Σ_{i=1}^{p} βi ( α^i − αe^{(i−1)} ) + pα(y_d^{(0,n)}(t), e)        (6.91)

or, equivalently, using the internal and external error coordinates (6.84),

    ė^i_x = e^{i+1}_x,  i ∈ m−1
    ė^m_x = −Σ_{i=1}^{m} γi e^i_x + px(y_d^{(0,n)}(t), e)
    ė^i_α = e^{i+1}_α,  i ∈ p−1
    ė^p_α = −Σ_{i=1}^{p} βi e^i_α + pα(y_d^{(0,n)}(t), e)        (6.92)
System (6.92) is in the form of a decoupled and exponentially stable (for suitable choices of the βi's and γi's) error system with an added perturbation. Let

    p(y_d^{(0,n)}, e) := [ 0, ..., 0, px(y_d^{(0,n)}(t), e), 0, ..., 0, pα(y_d^{(0,n)}(t), e) ]^T        (6.93)

be the perturbation vector of (6.92). The perturbation p(y_d^{(0,n)}(t), e) is Lipschitz in (y_d^{(0,n)}(t), e) by our smoothness assumptions. Therefore, there exist k1 > 0 and k2 > 0 such that

    ‖p(y_d^{(0,n)}(t), e)‖∞ ≤ k1 ‖y_d^{(0,n)}(t)‖∞ + k2 ‖e‖∞ ≤ k1 ε + k2 ‖e‖∞        (6.94)
Definition 6.7.1 Affine Perturbation. We call a perturbation with the structure of (6.94), i.e. a perturbation whose norm is bounded by a constant plus a term linear in the norm of the state, an affine perturbation.

Assumption 6.7.2 Assume that the βi and γi have been chosen such that when yd ≡ 0, the origin of Σ(u), as well as the origin of the nominal error dynamics

    ė^i_x = e^{i+1}_x,  i ∈ m−1
    ė^m_x = −Σ_{i=1}^{m} γi e^i_x
    ė^i_α = e^{i+1}_α,  i ∈ p−1
    ė^p_α = −Σ_{i=1}^{p} βi e^i_α        (6.95)

are exponentially stable.
Remark 6.7.3 Stability of the origin of (6.95) does not imply stability of the origin of Σ(ue), because (6.95) is not, in general, the Jacobian linearization of Σ(ue). Instead it is a convenient approximation of the Jacobian linearization with which to construct approximators for αe^{(i)}, i ∈ p, and to study boundedness of the output error.

In practice Assumption 6.7.2 is easy to enforce: one simply chooses positive values of βi and γi that make the Jacobian linearization of Σ(ue) exponentially stable at the origin, and then checks that under these choices the origin of the (linear) nominal error dynamics (6.95) is exponentially stable. In fact we may — and in the inverted pendulum example of Section 6.9 below we will — choose βi, i ∈ p, and γi, i ∈ m, such that the Jacobian linearization of Σ(ue) is identical to that of Σ(−K[x^T, α^T]^T), where K ∈ R^{1×n} is the gain matrix of a linear quadratic regulator. Now we make the following claim.
Proposition 6.7.4 Convergence for the Internal Equilibrium Controller. Assume y_d^{(n)} ∈ B_ε for some ε ≥ 0. If re > 0 and k2 ≥ 0 (see (6.94)) are sufficiently small real numbers, then there exist a t1 ≥ 0 and a class-K function b(ε) such that for all (ex(0), eα(0)) ∈ B_{re}, (ex(t), eα(t)) converges toward zero exponentially with a rate γ > 0 until (ex(t1), eα(t1)) enters B_{b(ε)}. Once (ex(t), eα(t)) enters B_{b(ε)} it remains in B_{b(ε)} thereafter.

Remark 6.7.5 A consequence of Proposition 6.7.4 is that the tracking error y(t) − yd(t) is uniformly ultimately bounded (see Definition B.6.1 of Appendix B), as is the error α^1 − αe.

For brevity, and since ε is assumed fixed, we will refer to b(ε) as b. The proof of Proposition 6.7.4 will follow from Theorem 6.7.6 below. Roughly, Theorem 6.7.6 states that, under suitable conditions, state trajectories of exponentially stable systems subject to affine perturbations converge exponentially toward the origin up until a time t1 when the solution enters a bounded neighborhood of the origin. After time t1 the solution remains forever within the bounded neighborhood.

Theorem 6.7.6 Exponentially Stable Systems Under Affine Perturbations. Consider the perturbed system

    ẋ = f(t, x) + g(t, x)        (6.96)

with x ∈ Br ⊂ R^n, f : R+ × Br → R^n, and g : R+ × Br → R^n. Assume that f and g are piecewise continuous in t and locally Lipschitz in x. Assume that x = 0 is an exponentially stable equilibrium of the nominal system

    ẋ = f(t, x)        (6.97)
Figure 6.11: The internal equilibrium controller causes the error [ex^T, eα^T]^T to converge toward 0 exponentially until it reaches the ball Bb ⊂ R^n. See Proposition 6.7.4.

There exists a Lyapunov function V : R+ × Br → R for (6.97) which, for some ci, i ∈ 4, satisfies

    c1 ‖x‖2² ≤ V(t, x) ≤ c2 ‖x‖2²        (6.98)

    ∂V/∂t + (∂V/∂x) f(t, x) ≤ −c3 ‖x‖2²        (6.99)

    ‖∂V/∂x‖2 ≤ c4 ‖x‖2        (6.100)

for all x ∈ Br. Assume that there exist p1 > 0, p2 > 0, and θ ∈ (0, 1) with

    p1 < θ (c3 − p2 c4) r / c4        (6.101)

such that for all x ∈ Br,

    ‖g(t, x)‖ ≤ p1 + p2 ‖x‖2        (6.102)

If p2 < c3/c4, then for each x(t0) ∈ Br there exist a t1 ≥ 0 and a γ > 0 such that the solutions of (6.96) satisfy

    ‖x(t)‖ ≤ k ‖x(t0)‖ e^{−γ(t−t0)},  t0 ≤ t < t1        (6.103)

and

    ‖x(t)‖ ≤ b,  ∀ t ≥ t1        (6.104)

where

    k = sqrt(c2/c1),   γ = (1 − θ)(c3 − p2 c4) / (2 c2),   b = ( p1 c4 / (θ (c3 − p2 c4)) ) sqrt(c2/c1)        (6.105)
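The content of Theorem 6.7.6 can be seen on a hypothetical scalar example: the nominal system ẋ = −x is exponentially stable, and the perturbation g(t, x) = 0.1 + 0.2 x sin t is affine with p1 = 0.1 and p2 = 0.2. A comparison argument (ẋ ≤ −(1 − p2)x + p1) suggests the trajectory should ultimately stay inside a ball of radius roughly p1/(1 − p2) = 0.125; the sketch below (all numbers illustrative) checks this:

```python
import math

# Perturbed scalar system: xdot = -x + g(t, x), g(t, x) = 0.1 + 0.2*x*sin(t),
# so |g| <= p1 + p2*|x| with p1 = 0.1, p2 = 0.2 — an affine perturbation.
x, dt = 2.0, 0.001
history = []
for k in range(40000):                          # integrate to t = 40
    t = k * dt
    x += dt * (-x + 0.1 + 0.2 * x * math.sin(t))
    history.append(abs(x))

assert max(history[30000:]) < 0.13              # ultimately bounded near 0.125
assert history[30000] < history[0]              # decayed from the initial state
```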
Proof of Theorem 6.7.6: The proof is a variation of classical results on perturbed systems (see Khalil [Kha92], Section 4.5, page 191). By Theorem B.5.1 of Appendix B, a V(·,·) and constants ci, i ∈ 4, satisfying (6.98), (6.99), and (6.100) exist. We use V(t, x) as a Lyapunov function candidate for the perturbed system (6.96). Differentiate V(t, x) along solutions of (6.96), and use (6.100) and (6.102) to get

    V̇(t, x) ≤ −c3 ‖x‖2² + ‖∂V/∂x‖2 ‖g(t, x)‖2
            ≤ −c3 ‖x‖2² + c4 ‖x‖2 ( p1 + p2 ‖x‖2 )
            ≤ −(c3 − p2 c4) ‖x‖2² + p1 c4 ‖x‖2        (6.106)

Let c5 := c3 − p2 c4. Since we assume that p2 < c3/c4, c5 > 0. Therefore

    V̇(t, x) ≤ −c5 ‖x‖2² + p1 c4 ‖x‖2        (6.107)

For θ ∈ (0, 1),

    V̇(t, x) ≤ −(1 − θ) c5 ‖x‖2² − θ c5 ‖x‖2² + p1 c4 ‖x‖2        (6.108)

If

    −θ c5 ‖x‖2² + p1 c4 ‖x‖2 ≤ 0        (6.109)

then

    V̇(t, x) ≤ −(1 − θ) c5 ‖x‖2²        (6.110)

But (6.109) holds if

    ‖x‖2 ≥ p1 c4 / (θ c5) = p1 c4 / (θ (c3 − p2 c4))        (6.111)

By assumption (6.101),

    r > p1 c4 / (θ (c3 − p2 c4))        (6.112)

so there is a subset S of Br on which x ∈ S implies (6.111). Now application of Theorem B.6.2 of Appendix B completes the proof.

We may now prove Proposition 6.7.4.

Proof of Proposition 6.7.4: We may regard Σ(ue) as being of the form

    [ ėx ]   [ Ax  0  ] [ ex ]
    [ ėα ] = [ 0   Aα ] [ eα ] + p(y_d^{(0,n)}(t), e)        (6.113)
where, for y_d^{(n)} ∈ B_ε and ‖e‖ < re, ‖p(y_d^{(0,n)}(t), e)‖ ≤ k1 ε + k2 ‖e‖ for some k1 > 0 and k2 > 0. By choice of the γi and βi, Ax and Aα are Hurwitz. Consequently

    [ ėx ]   [ Ax  0  ] [ ex ]
    [ ėα ] = [ 0   Aα ] [ eα ]        (6.114)

is exponentially stable. Therefore, by Theorem B.5.1 of Appendix B, there exist ci > 0, i ∈ 4, and a Lyapunov function V(·) satisfying (6.98), (6.99), and (6.100). Let θ be a number in the open interval (0, 1). Taking p1 = k1 ε and p2 = k2 in Theorem 6.7.6, if k2 is sufficiently small, then there exists a k1max such that

    k1max = θ (c3 − k2 c4) re / (c4 ε)        (6.115)

Then if k1 ≤ k1max, by Theorem 6.7.6, e = (ex, eα) converges toward 0 until it arrives in Bb ⊂ R^n, where

    b = ( k1 ε c4 / (θ (c3 − k2 c4)) ) sqrt(c2/c1)        (6.116)

after which time it remains in Bb. Since b is linear in ε, b is a class-K function of ε. Thus e(t) is uniformly ultimately bounded by b. This implies that lim_{t→∞} ‖y(t) − yd(t)‖∞ < b. By assumption αe^{(i)}, i ∈ p, is bounded on BY × Br. Since eα is uniformly ultimately bounded, so is α.

Using the internal equilibrium controller requires that yd be C^n, because αe^{(p)} depends on y_d^{(n)}. There is nothing intrinsic to the form of our controller which allows us to guarantee that a sufficiently small k2 (see (6.94)) exists for each problem one might encounter. If necessary, one could estimate an upper bound on k2 for a given ε and re using a computer.

Even when y_d^{(0,m−1)} ≡ 0, conditions on k2 are still required in order to assure exponential stability. By choosing βi, i ∈ p, and γi, i ∈ m, such that Σ(ue) has a stable linearization, however, we know that the origin is exponentially stable in some, possibly small, region about the origin. So in the worst case we have good regulation in a neighborhood of the origin irrespective of k2.
6.8 Estimation of the Internal Equilibrium Angle

If one assumes that a method exists with which to solve equation (6.76) for αe, the internal equilibrium method of Section 6.7 stands on its own without the need to introduce dynamic inversion. Dynamic inversion is well suited, however, to obtaining an estimate of αe from the implicit equation, Equation (6.76), that defines it. By substituting the estimate α̃e for αe in L̄^i_N αe, we obtain estimators for αe^{(i)}. Approximation errors in estimating αe and its derivatives αe^{(i)} may be regarded as additional terms in the perturbation p(y_d^{(0,n)}(t), e).

The implicit equation to be solved is again

    0 = f(x; αe, 0) + g(x; αe, 0) vext        (6.117)
where vext is given by (6.60). We obtain an estimator for α̇e as follows. Differentiate (6.117) with respect to t to get

    0 = (∂f(x; αe, 0)/∂x) ẋ + [ ∂f(x; αe, 0)/∂α^1 + (∂g(x; αe, 0)/∂α^1) vext ] α̇e
        + (∂g(x; αe, 0)/∂x) ẋ vext + g(x; αe, 0) v̇ext        (6.118)

where

    v̇ext = d/dt ( y_d^{(m)} − Σ_{i=1}^{m} γi e^i_x ) = y_d^{(m+1)} − γm ( ue − y_d^{(m)} ) − Σ_{i=1}^{m−1} γi e^{i+1}_x        (6.119)

and ẋ = [x^2, ..., x^m, ue]^T. Therefore

    α̇e = −[ ∂f(x; αe, 0)/∂α^1 + (∂g(x; αe, 0)/∂α^1) vext ]^{−1}
          ( (∂f(x; αe, 0)/∂x) ẋ + (∂g(x; αe, 0)/∂x) ẋ vext + g(x; αe, 0) v̇ext )        (6.120)

and the estimator E for α̇e is obtained by substituting α̃e for αe in the expression above:

    E(x; α̃e, α^2; t) := −[ ∂f(x; α̃e, 0)/∂α^1 + (∂g(x; α̃e, 0)/∂α^1) vext ]^{−1}
          ( (∂f(x; α̃e, 0)/∂x) ẋ + (∂g(x; α̃e, 0)/∂x) ẋ vext + g(x; α̃e, 0) v̇ext )        (6.121)

Let

    F(α̃e, x, t) = f(x; α̃e, 0, ..., 0) + g(x; α̃e, 0, ..., 0) vext(x, t)        (6.122)

A dynamic inverter for αe is now

    α̃̇e = −µ sign(D1 F(α̃e, x, t)) ( f(x; α̃e, 0) + g(x; α̃e, 0) vext ) + E(x; α̃e, α^2; t)        (6.123)

where E(x; α̃e, α^2; t) is as in (6.121). In the multi-input, multi-output context, where αe is a vector, we may use the dynamic inverter of Theorem 2.4.6 to obtain an estimate of αe.
Remark 6.8.1 An Alternate Derivative Estimator. Rather than using E(x, α̃e, α^2, t) as defined by Equation (6.121) above, we can also use L̄_{Next} αe as a derivative estimator for α̇e. This gives a less accurate estimate of α̇e than E(x; α̃e, α^2; t), but is often considerably simpler to compute. We will use this simpler estimator in Chapter 7 when we apply internal equilibrium control to the control of a bicycle.

The dynamic inverter (6.123) provides the dynamic part of an internal equilibrium controller. Thus the dimension of the dynamic inverter chosen is the number q of Problem 6.3.16. For the single-input, single-output case, using the dynamic inverter (6.123), q = 1.
6.9 Tracking for the Inverted Pendulum on a Cart

The classical control problem of the inverted pendulum on a cart [Kai80] will be used to illustrate both the problem in which we are interested and an application of the approximate tracking controller.

Figure 6.12: Inverted pendulum on a cart.

The cart and pendulum system is illustrated in Figure 6.12. The position of the cart is parameterized by x^1 ∈ R, the linear velocity of the pendulum pivot by x^2 = ẋ^1, the angle of the pendulum away from upright by α^1 ∈ (−π/2, π/2) ⊂ S^1, and the angular velocity of the pendulum by α^2 = α̇^1. The mass of the point-mass pendulum is mp, the length of the pendulum is l, which we will set equal to 1 below, and the mass of the cart is mc. The gravitational acceleration is g.
The Lagrangian for the cart-pendulum system is

    L = −mp g l cos(α^1) + (1/2) ( l² mp (α^2)² + (mc + mp)(x^2)² ) + mp l cos(α^1) α^2 x^2        (6.124)

The Euler-Lagrange equations of motion are

    [ mp + mc         l mp cos(α^1) ] [ ẋ^2 ]   [ −l mp (α^2)² sin(α^1) ]   [ τ1 ]
    [ l mp cos(α^1)   l² mp         ] [ α̇^2 ] + [ −g l mp sin(α^1)      ] = [ 0  ]        (6.125)

We will follow Example 6.3.10 in converting (6.125) to E/I convertible form. Apply the input transformation of Example 6.3.10,

    τ1 = ( mp sin²(α^1) + mc ) u + mp sin(α^1) ( g cos(α^1) − l (α^2)² )        (6.126)

to make the dynamics of the cart and pendulum system take the E/I convertible form

    Σ(u):
        ẋ^1 = x^2
        ẋ^2 = u
        α̇^1 = α^2
        α̇^2 = g sin(α^1) − cos(α^1) u        (6.127)

where we have set l = 1.
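A quick numerical check of the input transformation: solving (6.125) for the accelerations and substituting (6.126) should reproduce the E/I convertible form (6.127), i.e. ẋ^2 = u and α̇^2 = g sin(α^1) − cos(α^1) u. The mass and length values below are arbitrary; note that this check requires the "+" sign on the second term of (6.126):

```python
import math

def cartpend_accels(a1, a2, tau, mp, mc, l, grav):
    """Solve the Euler-Lagrange equations (6.125) for (xdd, add)."""
    s, c = math.sin(a1), math.cos(a1)
    # M [xdd; add] = [tau + l*mp*s*a2^2 ; grav*l*mp*s]
    # (velocity and gravity terms moved to the right-hand side)
    m11, m12 = mp + mc, l * mp * c
    m21, m22 = l * mp * c, l * l * mp
    b1, b2 = tau + l * mp * s * a2 * a2, grav * l * mp * s
    det = m11 * m22 - m12 * m21
    return (m22 * b1 - m12 * b2) / det, (m11 * b2 - m21 * b1) / det

# Input transformation (6.126) at an arbitrary state and input.
mp, mc, l, grav = 0.2, 1.0, 1.0, 9.8
a1, a2, u = 0.7, -0.3, 2.5
s, c = math.sin(a1), math.cos(a1)
tau = (mp * s * s + mc) * u + mp * s * (grav * c - l * a2 * a2)
xdd, add = cartpend_accels(a1, a2, tau, mp, mc, l, grav)
assert abs(xdd - u) < 1e-9                      # cart obeys xdot^2 = u
assert abs(add - (grav * s - c * u)) < 1e-9     # pendulum obeys (6.127)
```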
The external tracking controller (6.60) is

    vext(y_d^{(0,2)}(t), x) := ÿd(t) − γ2 ( x^2 − ẏd(t) ) − γ1 ( x^1 − yd(t) )        (6.128)

where γ1 and γ2 are real numbers chosen such that r² + γ2 r + γ1 = 0 has roots with strictly negative real parts.

The internal equilibrium manifold for the pendulum is the set E(t) of all (ex, α) satisfying the system of equations

    Σe:
        0 = α^2
        0 = g sin(α^1) − cos(α^1) vext(y_d^{(0,2)}(t), x)        (6.129)
The internal equilibrium angle αe for the inverted pendulum is the solution to

    g sin(αe) − cos(αe) vext(y_d^{(0,2)}(t), x) = 0        (6.130)

Though this equation has an explicit solution for αe, namely

    αe = arctan( vext(y_d^{(0,2)}(t), x) / g )        (6.131)

we will use dynamic inversion in order to track an estimate of αe.

Note that in this case, since the constraint equations (6.129) depend on x only through vext(y_d^{(0,2)}(t), x), we could think of E(t) as a fixed graph over the space R in which vext resides; i.e. the equations (6.129) describe a relation between α and vext which does not depend on x or t (see Figure 6.10 and Property 6.6.5). For generality we will ignore this, though our observation is reflected in the property that the level sets of E(t) in the ex phase plane, where vext is defined by (6.128), are parallel lines.
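A one-line sanity check that (6.131) indeed solves (6.130) for any value of vext (g = 9.8 assumed):

```python
import math

grav = 9.8
for v in [-20.0, -5.0, -1.0, 0.0, 0.3, 2.0, 9.8, 50.0]:
    ae = math.atan(v / grav)                   # explicit solution (6.131)
    residual = grav * math.sin(ae) - math.cos(ae) * v
    assert abs(residual) < 1e-12               # it satisfies (6.130)
    assert -math.pi / 2 < ae < math.pi / 2     # pendulum short of horizontal
```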
For the inverted pendulum problem we have

    vext(y_d^{(0,2)}(t), x) = ÿd(t) − γ2 ( x^2 − ẏd(t) ) − γ1 ( x^1 − yd(t) )
    L̄_{Next} vext(y_d^{(0,3)}, x) = y_d^{(3)}(t) − γ2 ( vext − y_d^{(2)}(t) ) − γ1 ( x^2 − y_d^{(1)}(t) )
    L̄²_{Next} vext(y_d^{(0,4)}(t), x) = y_d^{(4)}(t) − γ2 ( L̄_{Next} vext − y_d^{(3)}(t) ) − γ1 ( vext − y_d^{(2)}(t) )        (6.132)

For our final control law, (6.137) below, we need L̄_{Next} αe and L̄²_{Next} αe. To obtain these, differentiate the second equation of (6.129) along the vector field

    Next := [ e^2_x ; −γ2 e^2_x − γ1 e^1_x ]        (6.133)

and solve for L̄_{Next} αe and L̄²_{Next} αe to obtain

    L̄_{Next} αe = ( g cos(αe) + sin(αe) vext )^{−1} cos(αe) L̄_{Next} vext
    L̄²_{Next} αe = −( g cos(αe) + sin(αe) vext )^{−1} ( ( −g sin(αe) + cos(αe) vext ) ( L̄_{Next} αe )²
                    + 2 sin(αe) L̄_{Next} vext L̄_{Next} αe − cos(αe) L̄²_{Next} vext )        (6.134)
The internal tracking controller (6.68) for the cart-pendulum system is

    uint(vint) = ( g sin(α^1) − vint ) / cos(α^1)        (6.135)

where

    vint = λ̈d − β2 ( α^2 − λ̇d ) − β1 ( α^1 − λd )        (6.136)

and λd(t) is a desired C² trajectory for α^1. The internal equilibrium control law (6.83) for the cart and pendulum is then

    ue(ve) = uint(ve) = ( g sin(α^1) − ve ) / cos(α^1)
    ve = L̄²_{Next} αe − β2 ( α^2 − L̄_{Next} αe ) − β1 ( α^1 − αe )        (6.137)

Again, the βi's and γi's must be adjusted so that the linearization of Σ(ue) at the origin is stable when u = ue(ve) and yd ≡ 0, with the polynomials s² + β2 s + β1 and s² + γ2 s + γ1 having roots with strictly negative real parts.
From (6.121) we use the estimator for α̇e

    E(x; α̃e, α^2, t) = ( g cos(α̃e) + sin(α̃e) vext )^{−1} cos(α̃e) ( y_d^{(3)} − γ2 ( ue − ÿd ) − γ1 ( x^2 − ẏd ) )        (6.138)
The dynamic inverter for approximating αe is

    α̃̇e = −µ ( g sin(α̃e) − cos(α̃e) vext(y_d^{(0,2)}, x) ) + E(x; α̃e, α^2, t)        (6.139)
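The inverter (6.139) can be exercised on its own by feeding it a hypothetical time-varying signal vext(t) = 2 sin t and comparing the estimate α̃e against the closed-form root (6.131). A forward-Euler sketch; the gain µ = 50, the step size, and g = 9.8 are illustrative assumptions:

```python
import math

grav, mu, dt = 9.8, 50.0, 0.0005

def v(t):
    return 2.0 * math.sin(t)        # hypothetical external signal v_ext(t)

def vdot(t):
    return 2.0 * math.cos(t)

at = 0.0                            # estimate alpha-tilde_e; true root at t=0 is 0
t = 0.0
for _ in range(20000):              # integrate (6.139) to t = 10
    # derivative estimator, cf. (6.138) specialized to this v_ext
    E = math.cos(at) * vdot(t) / (grav * math.cos(at) + math.sin(at) * v(t))
    at += dt * (-mu * (grav * math.sin(at) - math.cos(at) * v(t)) + E)
    t += dt

# The estimate tracks the time-varying root alpha_e(t) = atan(v(t)/g).
assert abs(at - math.atan(v(t) / grav)) < 1e-3
```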
6.9.1 An Intuitive Description of the Internal Equilibrium Controller

The internal equilibrium controller for the inverted pendulum on a cart may be
viewed as follows. For each (e^1_x, e^2_x) and each time t there corresponds a value of the acceleration of the cart ẍ^1 = u which, disregarding the internal dynamics, would make the cart track yd(t) according to the nominal external dynamics. For each such value of u there corresponds an angle αe of the pendulum such that, if u were held constant in time at that value, then αe would be an (unstable) equilibrium. The internal equilibrium controller strategy is to stabilize a region about that equilibrium αe and to follow αe as it changes due to changes in y_d^{(i)}(t), i ∈ 2, as well as motion of the cart. Note that in the case of output regulation (yd ≡ 0), αe depends only on x. The situation is shown schematically in Figure 6.13 for the task of regulation to the origin. In this case x^1 = e^1_x and x^2 = e^2_x. At the top of the figure x^1 is represented by the location of the pendulum pivot, and x^2 is represented by the arrow below the pivot. By the conversion to standard form, we have essentially removed the cart dynamics and have assigned an input which is equal to the acceleration of the cart. Thus we do not show the cart in the drawing, just the pendulum pivot whose acceleration we now control. The time sequence of the pendulum frames shown runs from left to right along the top row of pendula, and continues from right to left along the bottom row. A gray line shows the internal equilibrium angle αe. The bottom part of the drawing shows the error-dynamics phase plane as well as the internal equilibrium manifold E(t). Two trajectories are shown in the phase space. The black trajectory is (e^1_x, e^2_x, α^1), while the gray trajectory is (e^1_x, e^2_x, αe). The internal equilibrium controller steers (e^1_x, e^2_x, α^1) toward (e^1_x, e^2_x, αe). When (e^1_x, e^2_x, α^1) is close to (e^1_x, e^2_x, αe), the actual external vector field vf(Σext(ue)) is close to the nominal external vector field Next = vf(Σext(vext)). Note that as α moves toward E(t), the pendulum pivot moves away from the origin. Only when α gets close to E(t) does the pivot start to move toward the origin. The labels a, b, c, and d mark various points of αe, while the labels 1, 2, 3, and 4 mark various points of the pendulum state. The angles α^1 and αe become approximately equal at b, staying close as αe goes from negative to positive at c, and as ex heads into the origin.
Figure 6.13: Regulation of the inverted pendulum. The internal equilibrium manifold E(t) is outlined in bold gray in the lower graph. The actual (x1 , x2 , α1 ) trajectory of the pendulum is indicated in black, and its projection (x1 , x2 , αe ) onto E(t) is shown in gray.
6.10 Simulations

In this section we compare simulation results for a linear quadratic regulator and for the internal equilibrium controller we have presented. We apply both controllers to the problems of output regulation and tracking for the inverted pendulum on a cart, where the output is the cart position. We will demonstrate that our method is more effective than linear quadratic regulation.

The linear quadratic regulator (see [CD91a] for a review) is of the form

    u = −K [ x^1 − yd(t), x^2 − ẏd(t), α^1, α^2 ]^T = −K [ e^1_x, e^2_x, α^1, α^2 ]^T        (6.140)

where K is chosen to minimize

    ∫_0^∞ ( [x^1, x^2, α^1, α^2] diag(10, 1, 1, 1) [x^1, x^2, α^1, α^2]^T + u² ) dt        (6.141)

subject to the constraint that

    [ ẋ^1 ]   [ 0 1 0 0 ] [ x^1 ]   [  0 ]
    [ ẋ^2 ] = [ 0 0 0 0 ] [ x^2 ] + [  1 ] u        (6.142)
    [ α̇^1 ]   [ 0 0 0 1 ] [ α^1 ]   [  0 ]
    [ α̇^2 ]   [ 0 0 g 0 ] [ α^2 ]   [ −1 ]

Equation (6.142) is the Jacobian linearization of (6.127) for yd ≡ 0. The resulting gain matrix was calculated in Matlab [Mat92] to four decimal places. It is

    K = [ −3.1623  −4.7378  −43.0326  −13.7789 ]        (6.143)

The gain coefficients for the internal equilibrium controller were calculated such that the Jacobian linearization of Σ(ue) is identical to the Jacobian linearization of Σ(−K[x^1, x^2, α^1, α^2]^T) at the origin. Thus ∂ue/∂(x, α) = −K. This leads to

    γ1 = 2.1885,  γ2 = 1.35943,  β1 = 33.2326,  β2 = 13.7789        (6.144)
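The claim ∂ue/∂(x, α) = −K can be checked by finite differences on an implementation of (6.137) that uses the explicit αe = arctan(vext/g) of (6.131) and its chain-rule derivatives along Next (a stand-in for the dynamic-inversion estimate; g = 9.8 assumed):

```python
import math

g1, g2 = 2.1885, 1.35943          # gamma_1, gamma_2 from (6.144)
b1, b2 = 33.2326, 13.7789         # beta_1,  beta_2  from (6.144)
grav = 9.8

def ue(x1, x2, a1, a2):
    """Internal equilibrium control law (6.137), regulation case y_d == 0."""
    v = -g1 * x1 - g2 * x2                        # v_ext, (6.128)
    Lv = -g2 * v - g1 * x2                        # Lbar_Next v_ext, (6.132)
    L2v = -g2 * Lv - g1 * v                       # Lbar^2_Next v_ext, (6.132)
    den = grav * grav + v * v
    ae = math.atan(v / grav)                      # alpha_e, (6.131)
    Lae = grav * Lv / den                         # d/dv atan(v/g) * Lv
    L2ae = grav * L2v / den - 2.0 * grav * v * Lv * Lv / (den * den)
    ve = L2ae - b2 * (a2 - Lae) - b1 * (a1 - ae)
    return (grav * math.sin(a1) - ve) / math.cos(a1)

h = 1e-6
K = [-3.1623, -4.7378, -43.0326, -13.7789]        # LQR gain (6.143)
for i in range(4):
    ep, em = [0.0] * 4, [0.0] * 4
    ep[i], em[i] = h, -h
    grad = (ue(*ep) - ue(*em)) / (2 * h)          # central difference at 0
    assert abs(grad + K[i]) < 5e-3                # d(ue)/d(state_i) == -K_i
```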
Table 6.1: Initial conditions for the regulation simulations. An asterisk '*' indicates that the corresponding initial conditions are in the region of attraction of the origin for the particular controller.

    x1(0)    x2(0)  α1(0)  α2(0)  LQR  IE
    0        0      10°    0      *    *
    0        0      20°    0      *    *
    0        0      30°    0      *    *
    0        0      40°    0      *    *
    0        0      50°    0      *    *
    0        0      60°    0           *
    0        0      70°    0           *
    0        0      80°    0           *
    0        0      85°    0           *
    1 [m]    0      0      0      *    *
    2 [m]    0      0      0      *    *
    4 [m]    0      0      0      *    *
    8 [m]    0      0      0      *    *
    16 [m]   0      0      0           *
    32 [m]   0      0      0           *
    64 [m]   0      0      0           *
    128 [m]  0      0      0
6.10.1 Regulation Results

Two regulation tasks were used in order to demonstrate the enlargement of the region of attraction by the internal equilibrium regulator. In one task the initial conditions were set so that x^1(0), x^2(0), and α^2(0) were zero, while the pendulum angle α^1(0) was incremented in steps of 10° from 10° to 80°, followed by a step of 5° to α^1(0) = 85°. In the second set x^2(0), α^1(0), and α^2(0) were set to zero, while x^1(0) was doubled successively from 1 meter to 128 meters. Table 6.1 summarizes the results. The linear quadratic regulator column is labeled LQR and the internal equilibrium controller column is labeled IE. An asterisk '*' in a controller's column indicates that the initial conditions on the corresponding row were in the region of attraction of the origin for that controller.

Figures 6.14 through 6.18 show results for various initial values of α^1(0), from 10° through 85°, for the regulation problem (yd ≡ 0). Each of these figures is composed of four graphs. The top graph shows the output y versus time, with the LQR output dashed and the internal equilibrium regulator output solid. The left graph of the second row shows the angle α^1 versus time, with the LQR α^1 dashed, the internal equilibrium regulator α^1 solid, and α̃e(t) dotted. The right graph of the second row shows the input ue (solid) to the internal equilibrium controller, as well as the input u (dashed) to the LQR controller. The bottom graph, in (e^1_x, e^2_x, α^1) space, shows the equilibrium manifold E(t) over the (e^1_x, e^2_x) error plane. Three trajectory lines converge to the origin. One trajectory is (e^1_x(t), e^2_x(t), α^1(t)), which is shown in black. Another is (e^1_x(t), e^2_x(t), 0), the projection of (e^1_x(t), e^2_x(t), α^1(t)) onto the error phase plane, which is shown in light gray. A third is the projection of (e^1_x(t), e^2_x(t), α^1(t)) onto E(t), namely (e^1_x(t), e^2_x(t), αe). This trajectory is shown in dark gray, as is the outline of E(t).

In Figure 6.14 the time course of the cart position is virtually identical for both the internal equilibrium controller and the LQR controller. The initial conditions are x^1(0) = 0, x^2(0) = 0, α^1(0) = 10°, α^2(0) = 0. The upper right graph shows convergence of α^1 (solid) to αe (dotted). The lower plot in (e^1_x, e^2_x, α^1) space shows how the trajectory (e^1_x(t), e^2_x(t), α^1(t)) is attracted to E(t), moving first away from the origin but then, as it gets closer to E(t), beginning to close in on the origin. The internal equilibrium manifold E(t) appears flat at this view because the state trajectory is small compared to the curvature of E(t).

In Figure 6.15, with initial conditions x^1(0) = 0, x^2(0) = 0, α^1(0) = 20°, α^2(0) = 0, one can just begin to see a difference between the responses of the IE controller and the LQR controller in the three time plots, as well as the curvature of E(t).

In Figure 6.16, with initial conditions x^1(0) = 0, x^2(0) = 0, α^1(0) = 50°, α^2(0) = 0, the responses of the two controlled systems have diverged substantially. Both, however, perform the regulation function.

In Figure 6.17, with initial conditions x^1(0) = 0, x^2(0) = 0, α^1(0) = 60°, α^2(0) = 0, the LQR controlled system has failed in that α^1 has exceeded 90° at approximately t = 1.5 seconds. The IE controller, however, continues to provide regulation as α^1 approximately tracks αe.

The last trial of this set, Figure 6.18, is for an initial pendulum angle of 85° and shows how the IE regulator pulls the trajectory (e^1_x(t), e^2_x(t), α^1(t)) down to E(t). Approximate tracking of αe by α^1 is apparent, as is regulation to the origin.
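The regulation behavior reported for Figure 6.17 can be reproduced in outline with a forward-Euler sketch of Σ(ue): the nonlinear plant (6.127) under the internal equilibrium controller (6.137) with the gains (6.144), starting from α^1(0) = 60°, where the LQR controller fails. For simplicity the sketch uses the explicit αe of (6.131) and its chain-rule derivatives along Next in place of the dynamic inverter; g = 9.8 and the integration step are assumptions:

```python
import math

g1, g2 = 2.1885, 1.35943          # gamma_1, gamma_2 from (6.144)
b1, b2 = 33.2326, 13.7789         # beta_1,  beta_2  from (6.144)
grav, dt = 9.8, 0.0005

x1, x2 = 0.0, 0.0
a1, a2 = math.pi / 3, 0.0         # alpha^1(0) = 60 degrees
for _ in range(int(40.0 / dt)):
    v = -g1 * x1 - g2 * x2                       # v_ext, (6.128), y_d == 0
    Lv = -g2 * v - g1 * x2                       # Lbar_Next v_ext, (6.132)
    L2v = -g2 * Lv - g1 * v                      # Lbar^2_Next v_ext, (6.132)
    den = grav * grav + v * v
    ae = math.atan(v / grav)                     # alpha_e, (6.131)
    Lae = grav * Lv / den
    L2ae = grav * L2v / den - 2.0 * grav * v * Lv * Lv / (den * den)
    ve = L2ae - b2 * (a2 - Lae) - b1 * (a1 - ae)
    u = (grav * math.sin(a1) - ve) / math.cos(a1)    # (6.137)
    x1, x2 = x1 + dt * x2, x2 + dt * u               # plant (6.127)
    a1, a2 = a1 + dt * a2, a2 + dt * (grav * math.sin(a1) - math.cos(a1) * u)
    assert abs(a1) < math.pi / 2                 # pendulum never goes horizontal
assert max(abs(x1), abs(x2), abs(a1), abs(a2)) < 0.05   # regulated to the origin
```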
Figure 6.14: Regulation trial for initial conditions x^1(0) = 0, x^2(0) = 0, α^1(0) = 10°, α^2(0) = 0. Since yd ≡ 0, (e^1_x, e^2_x) = (x^1, x^2).
Figure 6.15: Regulation trial for initial conditions x^1(0) = 0, x^2(0) = 0, α^1(0) = 20°, α^2(0) = 0.
Figure 6.16: Regulation trial for initial conditions x^1(0) = 0, x^2(0) = 0, α^1(0) = 50°, α^2(0) = 0.
Figure 6.17: Regulation trial for initial conditions x^1(0) = 0, x^2(0) = 0, α^1(0) = 60°, α^2(0) = 0.
Figure 6.18: Regulation trial for initial conditions x^1(0) = 0, x^2(0) = 0, α^1(0) = 85°, α^2(0) = 0.
The second set of regulation simulations, shown in Figures 6.19 through 6.22, corresponds to setting x^2(0) = 0, α^1(0) = 0, and α^2(0) = 0, while stepping x^1(0) through the sequence 1, 8, 16, and 64 (in meters). The arrangement of graphs is the same as for Figures 6.14 through 6.18. Figure 6.19 corresponds to initial conditions x^1(0) = 1, x^2(0) = 0, α^1(0) = 0, α^2(0) = 0. Again, the responses of the IE and LQR controlled systems are virtually identical. The small scale of the trajectory in (e^1_x, e^2_x, α^1) space again makes E(t) appear flat. Figure 6.20 corresponds to initial conditions x^1(0) = 8, x^2(0) = 0, α^1(0) = 0, α^2(0) = 0. The responses of the two controllers have diverged substantially, but both perform their regulation functions. Figure 6.21 shows failure of the LQR regulator as the pendulum angle exceeds π/2 after approximately 0.5 seconds. The IE controller continues to regulate properly as α^1 is attracted to αe. Figure 6.22 shows successful regulation using the IE controller for initial conditions x^1(0) = 64, x^2(0) = 0, α^1(0) = 0, α^2(0) = 0. The curvature of E(t) can clearly be seen, as can its attractiveness to (e^1_x(t), e^2_x(t), α^1(t)).
Figure 6.19: Regulation trial for initial conditions x^1(0) = 1, x^2(0) = 0, α^1(0) = 0, α^2(0) = 0.
Figure 6.20: Regulation trial for initial conditions x^1(0) = 8, x^2(0) = 0, α^1(0) = 0, α^2(0) = 0.
Figure 6.21: Regulation trial for initial conditions x^1(0) = 16, x^2(0) = 0, α^1(0) = 0, α^2(0) = 0.
Figure 6.22: Regulation trial for initial conditions x^1(0) = 64, x^2(0) = 0, α^1(0) = 0, α^2(0) = 0.
6.10.2 Tracking Results

The last group of graphs shows the results for tracking a sinusoidal trajectory. For each simulation the initial conditions were x(0) = 0, α(0) = 0, and α̃e(0) = 0. Figure 6.23 shows tracking results for both the internal equilibrium controller and the LQR controller tracking yd = sin(2π · 0.1 t). Figure 6.24 shows tracking results for yd = sin(2π · 0.2 t). Figure 6.25 shows tracking results for yd = sin(2π · 0.5 t).

Each of Figures 6.23, 6.24, and 6.25 contains six graphs, consisting of two sets of three graphs each. The set of three upper graphs corresponds to the internal equilibrium controller; the set of three lower graphs corresponds to the LQR controller. The top left graph of each set shows the desired output yd(t) (dashed), the actual (simulated) output y (solid), and the tracking error y − yd (dotted). The bottom left graph of each set shows the actual (simulated) pendulum angle α^1 (solid). For the IE set, the internal equilibrium angle αe (dashed) and the error α − αe (dotted) are also shown. The right graph of each set shows the corresponding input u. For Figure 6.25, the LQR controller caused the angle α to exceed π/2 after approximately 2.1 seconds, so the LQR set for that figure includes data only up until that time.

In all cases observed the performance of the IE controller exceeded that of the LQR controller with respect to the tracking error, the peak IE tracking error being approximately one quarter of the LQR tracking error. Figure 6.25 shows failure of the LQR controller to track, the pendulum angle exceeding π/2 at approximately 2.2 seconds. Though the tracking error for the IE controller has increased with respect to Figure 6.24, the pendulum angle remains confined to the interval (−π/2, π/2).
Figure 6.23: Tracking trial for initial conditions x(0) = 0, α(0) = 0, with yd(t) = sin(0.2πt), a 0.1 Hz sinusoid.
Figure 6.24: Tracking trial for initial conditions x(0) = 0, α(0) = 0, with yd(t) = sin(0.4πt), a 0.2 Hz sinusoid.
Figure 6.25: Tracking trial for initial conditions x(0) = 0, α(0) = 0, with yd(t) = sin(πt), a 0.5 Hz sinusoid.
6.11 Discussion
Modifications of the Internal Equilibrium Controller. The internal equilibrium controller may be modified by setting the higher-order Lie derivatives in ue (6.83) to zero. This results in a higher bound on the norm of the perturbation term $p(y_d^{(0,n)}, e)$, and performance suffers as a result; holding the bound on $\|y_d^{(0,n)}\|_\infty$ the same as above, the invariant and attractive neighborhood of E(t) becomes larger, resulting in a larger ultimate bound on the tracking error. For systems in which the bound on $\|y_d^{(0,n)}\|_\infty$ is sufficiently small, this modification can pay off by reducing computational load.
Modifications of the Dynamic Inverter. There is latitude for choice in the construction of the dynamic inverter for approximating αe. As mentioned above in Remark 6.8.1, we could use $\bar L_{N_{ext}}\alpha_e$ to estimate α̇e rather than using (6.121). This is often a computationally simpler choice than (6.121) for the following reason: recall that ue(ve) (6.83) is dependent on $y_d^{(0,n)}(t)$, x, and α. Since vext is dependent on xm, v̇ext is dependent on ue(ve) and hence on $y_d^{(0,n)}(t)$, x, and α. Replacing v̇ext in (6.121) with $\bar L_{N_{ext}}v_{ext}$ gives $\bar L_{N_{ext}}\alpha_e$, which is dependent only on $y_d^{(0,m)}(t)$ and x.
Strict Bounds on Internal Trajectories. When strict bounds are required on ‖α(t)‖, conditions must be such that the attractive and invariant neighborhood of E(t) remains within the strict bounds. It is quite possible for E(t) to obey the bounds while the attractive and invariant neighborhood of E(t) exceeds them.
Extension to Time-Varying Systems. We have applied the internal equilibrium controller to time-invariant systems. This was for simplicity of exposition only. The internal equilibrium controller may be applied to time-varying systems in a similar manner, the only changes in the above arguments being smoothness requirements on f(t, x, α) and g(t, x, α).
Extension to Multi-Input, Multi-Output Systems. Again, for simplicity of exposition, we have confined ourselves to single-input, single-output systems. Extension to multi-input, multi-output systems is straightforward. The internal equilibrium becomes a vector-valued quantity. Conditions on the existence of E(t) in a neighborhood of the origin
are inherited, as above, from the conditions of the implicit function theorem. Details of this extension will appear in future work.
Sec. 6.12
6.12
Chapter Summary
199
Chapter Summary We have introduced an internal equilibrium manifold E(t) and a controller that
makes a neighborhood of E(t) attractive and invariant. This has afforded approximate
output tracking of reference outputs from an open set, while maintaining internal stability. Comparison to the performance of a linear quadratic regulator in the case of the inverted pendulum and cart indicates a substantial increase in the region of attraction of the origin in both regulation and tracking tasks. We have shown that the ultimate bound on both the tracking error and the internal state is a class-K function of the bound on the output
reference trajectory. In the next chapter we will apply internal equilibrium control to tracking control with balance for a nonholonomic model of a bicycle.
Chapter 7

Automatic Control of a Bicycle

7.1 Introduction

In this chapter we derive a controller which, using steering and rear-wheel torque
as controls, causes a model of an automatically controlled riderless bicycle to approximately track a time-parameterized path in the (horizontal) ground plane while retaining its balance. The design of the controller utilizes the results on internal equilibrium control from Chapter 6 in a multi-input, multi-output context on a mechanical system with nonholonomic constraints.

Control of the bicycle is a rich problem offering a number of considerable challenges of current research interest in the area of mechanics and robot control. As modeled here, the bicycle is an underactuated system, subject to nonholonomic contact constraints associated with the rolling constraints on the front and rear wheels. It is unstable (except under certain combinations of fork geometry and speed) when not controlled. It is also, when considered to traverse flat ground, a system subject to symmetries; its Lagrangian and constraints are invariant with respect to translations and rotations in the ground plane.

Though a number of researchers have studied the stability of bicycles and motorcycles under a linear model of rider control [NF72, Far92, Rol72, Sha71, Wei72] (see Hand [Han88] for a comprehensive survey), as far as we know the controller described here is the first controller allowing approximate tracking of arbitrary trajectories while maintaining balance. Von Wissel and Nikoukhah [vWN95] have constructed a method of piecing together optimal trajectories to form paths, even around obstacles, along which a bicycle may be stabilized in traveling from one point to another. Control of balance and roll-angle tracking for the bicycle model used in this chapter has been addressed in [Get94].
This corresponds to internal tracking control (see Chapter 6, Section 6.5), which will be reviewed in Section 7.5 in the context of the bicycle problem. In addition to extending those results to tracking in the plane, we also utilize some new results from Bloch et al. [BKMM94] on the derivation and structure of the equations of motion for nonholonomic systems with symmetries.

Another offering of this chapter is a simple, tractable model of the bicycle. Extant mathematical models of bicycles and motorcycles [Sha71, DN92, Rol72, FSR90, Far92, NDV92] have been designed with the intent of providing a complete physical description of the bicycle or motorcycle, capable of predicting a wide range of vehicle behavior, including some troublesome oscillatory instabilities [Far92] thought to arise from dynamic interactions of frame flexibility, tire-road interactions, and rider control. Instead, the bicycle model offered here, and in [Get94, GM95a], is designed for control of the bicycle in an operating range that might reasonably be realized by an autonomous vehicle. Thus we make a tradeoff of modeling errors, with respect to a more accurate model of a real automatic bicycle, for simplicity and tractability.
7.1.1 Chapter Overview

In Section 7.2 we describe the bicycle model, its generalized coordinates, and its
nonholonomic constraints. In Section 7.3 we derive equations of motion for the bicycle. Through a sequence of coordinate changes and assumptions we convert the equations of motion to the external/internal convertible form (6.13) of Chapter 6. In Section 7.4 we construct an external tracking controller for the bicycle. In Section 7.5 we describe the internal tracking controller for the bicycle and show some interesting paths in the plane which result from its application. In Section 7.6 we define an internal equilibrium angle for the bicycle. In Section 7.7 we describe an internal equilibrium controller for the bicycle. Then in Section 7.8 we show the results of simulations of the internal equilibrium control of a bicycle following a variety of trajectories in the plane.
7.2 The Model

In this section we describe the bicycle model, its generalized coordinates, forces
on the bicycle, and its nonholonomic constraints.
Figure 7.1: Side view of the bicycle model with α = 0. (The figure labels the mass m, the lengths p, c, and b, the torque τ_r, the steering axis, the contact line, and the constrained directions of wheel travel.)
7.2.1 Assumptions on the Model

We consider a simplified bicycle model illustrated in Figure 7.1. The wheels of the
bicycle are considered to have negligible inertial moments, mass, radii, and width, and to roll with neither lateral nor longitudinal slip. The rigid frame of the bicycle is assumed to be symmetric about a plane containing the rear wheel. The bicycle is assumed to have a steering axis fixed in the bicycle's plane of symmetry, and perpendicular to the flat ground when the bicycle is upright. For simplicity we neglect the moments of inertia of the bicycle, i.e. we assume a point-mass bicycle.
7.2.2 Reference Frames and Generalized Coordinates

Consider a ground-fixed inertial reference frame with x and y axes in the ground
plane and z-axis perpendicular to the ground plane in the direction opposite to the force of gravity as shown in Figure 7.2. This frame is called the ground frame. The intersection of the vehicle's plane of symmetry with the ground plane forms its contact-line. The contact-line is rotated about the z-direction by a yaw-angle, θ. The contact-line is considered directed, with its positive direction from the rear to the front of the bicycle. The yaw-angle θ is zero when the contact-line is parallel to the x-axis. The angle that the bicycle's plane of symmetry makes with the vertical direction is the roll-angle, α ∈ (−π/2, π/2). Consider the line of intersection between the plane of the front wheel and the
ground plane.

Figure 7.2: Bicycle model rolled away from upright by angle α. In this figure α is negative.

Let φ ∈ (−π/2, π/2) be the steering-angle between this intersection and
the contact-line as shown in Figure 7.2. We will parameterize the steering angle by

\[
\sigma := \frac{1}{b}\tan(\phi)
\tag{7.1}
\]
and refer to σ as the steering variable. We will see in Section 7.2.4 that use of σ rather than φ to parameterize steering will help to give the nonholonomic rolling constraints of the bicycle a very simple form.

Remark 7.2.1 Note that the steering angle φ is not the angle of rotation of the steering shaft of the bicycle. Call the steering shaft angle ψ (see Figure 7.3). Then ψ and φ are related by

\[
\tan(\phi)\cos(\alpha) = \tan(\psi)
\tag{7.2}
\]

and from (7.1) we have

\[
b\sigma\cos(\alpha) = \tan(\psi)
\tag{7.3}
\]

From (7.2), if α = 0, then φ = ψ. The two angles φ and ψ are shown in Figure 7.3.
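The shaft-angle relations (7.2) and (7.3) can be sketched numerically. The function names, and the wheelbase value b, are illustrative assumptions, not from the dissertation:

```python
import math

b = 1.0  # wheelbase (hypothetical value, for illustration only)

def shaft_angle(phi, alpha):
    """Steering shaft angle psi from the steering angle phi and the
    roll angle alpha, via relation (7.2): tan(phi) cos(alpha) = tan(psi)."""
    return math.atan(math.tan(phi) * math.cos(alpha))

def shaft_angle_from_sigma(sigma, alpha):
    """The same relation expressed with the steering variable
    sigma = tan(phi)/b, i.e. relation (7.3): b sigma cos(alpha) = tan(psi)."""
    return math.atan(b * sigma * math.cos(alpha))
```

As a quick check, at α = 0 the two angles coincide (ψ = φ), and for any roll angle the φ-based and σ-based forms agree.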
Figure 7.3: Leaning bicycle showing the relationship between the steering angle φ and the steering shaft angle ψ. Note that in the figure, the roll-angle α is negative.
N
The coordinates x, y, θ, α, and σ are a complete set of generalized coordinates for the bicycle.
Alternative Generalized Velocities

Corresponding to the generalized coordinates is a set of generalized velocities ẋ, ẏ, θ̇, α̇, and σ̇. When we introduce constraints in Subsection 7.2.4, it will be convenient to use an alternative set of generalized velocities, which we now define. Let vr be the component of the velocity of the rear-wheel contact along the contact-line as measured from the ground frame, and let v⊥ be the component of the velocity of the rear-wheel contact perpendicular to the contact-line and parallel to the ground plane as measured from the ground frame. Both velocities are indicated in Figure 7.4. Note that vr, v⊥, ẋ, and ẏ are related through a rotation by θ, as indicated in Figure 7.5:

\[
\begin{bmatrix} \dot x \\ \dot y \end{bmatrix}
= \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}
\begin{bmatrix} v_r \\ v_\perp \end{bmatrix}
\tag{7.4}
\]
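The rotation (7.4) between body-frame velocities (vr, v⊥) and ground-frame velocities (ẋ, ẏ) can be sketched as follows; the function name is an illustrative assumption:

```python
import math

def body_to_ground(vr, vperp, theta):
    """Rotate body-frame velocities (vr, vperp) by the yaw angle theta
    into ground-frame velocities (xdot, ydot), as in (7.4)."""
    xdot = math.cos(theta) * vr - math.sin(theta) * vperp
    ydot = math.sin(theta) * vr + math.cos(theta) * vperp
    return xdot, ydot
```

For example, a bicycle moving forward at vr = 1 with θ = π/2 has ẋ ≈ 0 and ẏ ≈ 1 in the ground frame.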
Figure 7.4: The bicycle model showing body velocities vr and v⊥. Note that the roll-angle α in the figure is negative.
Figure 7.5: Top view of the rear "wheel" showing the relationships among vr, v⊥, ẋ, and ẏ.
The body frame of the bicycle is taken to have its origin at the rear-wheel contact point, with axes in the directions of vr and v⊥ as shown in Figure 7.4, and with the third axis parallel to the z-axis of the ground frame. The velocity components vr and v⊥ are the velocity of the rear-wheel contact point relative to the ground frame expressed in the body frame, as indicated in Figures 7.4 and 7.5. A complete alternate set of generalized velocities for the bicycle is then vr, v⊥, θ̇, and σ̇.
7.2.3 Inputs and Generalized Forces

Let τ_r be the reaction force that the ground exerts on the bicycle at the rear-wheel contact point (see Figure 7.1); this reaction force τ_r acts along the contact-line as indicated in Figure 7.1. Another torque generator is associated with the steering variable σ; the corresponding generalized torque is τ_σ. The bicycle is also subject to the force of gravity mg acting on the point mass m of the bicycle. Thus our bicycle is riderless and under automatic control, driven by two torque generators.
7.2.4 Constraints

As mentioned in Subsection 7.2.1 the front and rear wheels are assumed to roll
with neither lateral nor longitudinal slip. For simplicity we will deal directly with the component of the contact velocity vr along the contact-line (see Figure 7.2). Front and rear-wheel contacts are constrained to have velocities parallel to the lines of intersection of their respective wheel-planes and the ground-plane, but are free to turn about an axis through the wheel/ground contact and parallel to the z-axis. The large arrows at the front-wheel and rear-wheel contacts in Figures 7.1 through 7.4 indicate the positive directions of the wheel/ground contact velocities. The generalized velocities of the bicycle are partitioned as ṙ = [α̇, vr, σ̇]^T and ṡ = [θ̇, v⊥]^T. In these velocity coordinates the nonholonomic constraints associated with the front and rear wheels, assumed to roll without slipping, are expressed simply by ṡ + A(r, s)ṙ = 0, or
Bicycle Constraints
\[
\underbrace{\begin{bmatrix} \dot\theta \\ v_\perp \end{bmatrix}}_{\dot s}
+ \underbrace{\begin{bmatrix} 0 & -\sigma & 0 \\ 0 & 0 & 0 \end{bmatrix}}_{A(r,s)}
\underbrace{\begin{bmatrix} \dot\alpha \\ v_r \\ \dot\sigma \end{bmatrix}}_{\dot r}
= 0
\tag{7.5}
\]

The first equation of (7.5) says that the yaw-rate θ̇ of the bicycle is the product of the steering variable σ = (1/b) tan(φ) and the rear-wheel velocity vr, i.e. θ̇ = σvr. This follows by inspection of Figure 7.6.
Figure 7.6: Velocity geometry for constraints.

The second equation tells us that v⊥, the component of the velocity of the rear-wheel contact point perpendicular to the plane of that wheel, is zero, i.e. no lateral slip of the rear wheel, hence v⊥ = 0.

The linear map represented by the matrix A(r, s) (7.5) connects the base velocities ṙ = [α̇, vr, σ̇]^T to the fiber¹ velocities ṡ = [θ̇, v⊥]^T. If we know ṙ for all t, we can integrate σ̇ to get σ; then from (7.5) we can reconstruct ṡ for all t.

Remark 7.2.2 Due to symmetries of the constraints with respect to translations and rotations in the plane, A(r, s) depends only on r.
N

Remark 7.2.3 Assuming no wheel slip, the integral

\[
\int_0^t v_r\, dt
\tag{7.6}
\]

is the length of the path of the bicycle at time t, where the path length at time t = 0 is assumed to be zero. The integral

\[
\int_0^t v_\perp\, dt
\tag{7.7}
\]

is always zero since v⊥ is always zero. N

7.3 Equations of Motion

In this section the reduced equations of motion for the bicycle will be derived
using the results of [BKMM94]. For an alternative but equivalent derivation with the same results, using results from [BRM92], see [Get94].

¹For a description of this base-fiber structure of nonholonomic systems with symmetry see [BKMM94].
We choose a body-frame for the bicycle centered at the rear-wheel ground contact, with one axis pointing forward along the line of intersection of the rear wheel plane with the ground, another axis orthogonal to the first and in the ground plane, and an axis normal to the ground, pointing in the direction opposite to gravity (see Figure 7.4). The body frame is a natural frame in which to write the Lagrangian of the bicycle for a number of reasons. In particular, the rolling constraints take on the very simple form seen in (7.5). Let sα := sin(α) and cα := cos(α).

We will associate an unspecified, though presumably known, α- and σ-dependent moment of inertia J(α, σ) of the front wheel about the steering axis, where the σ and α dependence of J(α, σ) is inherited from the σ and α dependence of the steering shaft angle ψ (7.3). We will introduce an assumption in Subsection 7.3.1 below that the steering variable σ is directly controlled, i.e. an input, so the lack of commitment to a specific J(α, σ) will not be a problem and will allow us to consider a broader range of bicycle models than if we were to commit ourselves to a specific J(α, σ). This assumption is easy to justify in practice using a steering motor capable of high torque. Temporarily retaining the structure associated with J(α, σ) will, on the other hand, allow us to use a convenient formulation of the equations of motion [BKMM94]. Thus the kinetic energy associated with the steering axis will be $\tfrac{1}{2}J(\alpha,\sigma)\dot\sigma^2$.

The Lagrangian for the bicycle is constructed from the kinetic energies associated with the point mass and the steering axis, as well as the potential energy of the point mass. It is (see Figures 7.2 and 7.4)
Bicycle Lagrangian
\[
L = -mgp\,c_\alpha + \tfrac{1}{2}J(\alpha,\sigma)\dot\sigma^2
+ \frac{m}{2}\Big( (v_r + p s_\alpha\dot\theta)^2 + (v_\perp - p\dot\alpha c_\alpha + c\dot\theta)^2 + (p\dot\alpha s_\alpha)^2 \Big)
\tag{7.8}
\]
where m is the mass of the bicycle, considered for simplicity to be a point mass. Note that the constraints are not reflected in the structure of L, though we have allowed them to influence our choice of velocity coordinates. This is evidenced by the inclusion of v⊥ in (7.8) which the constraints dictate should be identically zero. Incorporating the constraints (7.5) into the Lagrangian we obtain the constrained Lagrangian for the bicycle
Constrained Bicycle Lagrangian
\[
L_c = -gmp\,c_\alpha + \tfrac{1}{2}J(\alpha,\sigma)\dot\sigma^2
+ \frac{m}{2}\Big( (v_r + p\sigma s_\alpha v_r)^2 + p^2 s_\alpha^2\dot\alpha^2 + (c\sigma v_r - p c_\alpha\dot\alpha)^2 \Big)
\tag{7.9}
\]

obtained by substitution of θ̇ = vrσ and v⊥ = 0 into the unconstrained Lagrangian (7.8). Of course the equations of motion for the constrained Lagrangian are not Lagrange's equations. The correct formulation of the equations of motion based upon the constrained Lagrangian is derived in [BKMM94] and shown to be equivalent to d'Alembert's equations for constrained systems. They are²

\[
\frac{d}{dt}\frac{\partial L_c}{\partial\dot r^i} - \frac{\partial L_c}{\partial r^i} + A^k_i\frac{\partial L_c}{\partial s^k}
= -\frac{\partial L}{\partial\dot s^l}\, C^l_{ij}\,\dot r^j
\tag{7.10}
\]

where the $C^l_{ij}$ denote the components of the curvature of the connection A(r, s),

\[
C^l_{ij} = \frac{\partial A^l_j}{\partial r^i} - \frac{\partial A^l_i}{\partial r^j}
+ A^k_i\frac{\partial A^l_j}{\partial s^k} - A^k_j\frac{\partial A^l_i}{\partial s^k}
\tag{7.11}
\]

When all generalized forces τ_i correspond to so-called base velocities ṙ^i, Equation (7.10) is modified to be

\[
\frac{d}{dt}\frac{\partial L_c}{\partial\dot r^i} - \frac{\partial L_c}{\partial r^i} + A^k_i\frac{\partial L_c}{\partial s^k}
= -\frac{\partial L}{\partial\dot s^l}\, C^l_{ij}\,\dot r^j + \tau_i
\tag{7.12}
\]

The resulting equations of motion (7.12) are reduced in number with respect to the number of generalized coordinates. The reduced equations³ may be expressed as

\[
M(r)\ddot r = K(r,\dot r) + B\tau
\tag{7.13}
\]

where ṙ = [α̇, vr, σ̇]^T, M ∈ R^{3×3}, K ∈ R^3, B ∈ R^{3×2}, and τ = [τ_r, τ_σ]^T. The components of M, K, and B are
\[
M(r) = \begin{bmatrix}
p^2 & -cp\,c_\alpha\sigma & 0 \\
-cp\,c_\alpha\sigma & 1 + c^2\sigma^2 + 2p\sigma s_\alpha + p^2\sigma^2 s_\alpha^2 & 0 \\
0 & 0 & \tfrac{1}{m}J(\alpha,\sigma)
\end{bmatrix}
\tag{7.14}
\]

²Here we use the summation convention where, for example, if s is of dimension m, then $A^k_i\,\partial A^l_j/\partial s^k \equiv \sum_{k=1}^{m} A^k_i\,\partial A^l_j/\partial s^k$.

³The reduced equations of motion (7.13) do not include the structure of the momentum equations [BKMM94], a set of first-order differential equations governing the effects of conserved momenta on the motion of the bicycle. In the case of the bicycle, such effects are governed entirely by the inputs.
\[
K(r,\dot r) = \begin{bmatrix}
gp\,s_\alpha + (1 + p\sigma s_\alpha)p\,c_\alpha\sigma v_r^2 + cp\,c_\alpha v_r\dot\sigma + \frac{1}{2m}\frac{\partial}{\partial\alpha}J(\alpha,\sigma)\dot\sigma^2 \\
-(1 + p\sigma s_\alpha)2p\,c_\alpha\sigma v_r\dot\alpha - cp\sigma s_\alpha\dot\alpha^2 - (c^2\sigma + p s_\alpha(1 + p\sigma s_\alpha))v_r\dot\sigma \\
\frac{1}{2m}\frac{\partial}{\partial\sigma}J(\alpha,\sigma)\dot\sigma^2
\end{bmatrix}
\tag{7.15}
\]

\[
B = \begin{bmatrix} 0 & 0 \\ \tfrac{1}{m} & 0 \\ 0 & \tfrac{1}{m} \end{bmatrix}
\tag{7.16}
\]

7.3.1 Practical Simplifications

We will further reduce our model through practical considerations. First, we will assume that, for the range of values of α, α̇, σ, and σ̇ in which we will be interested, the term $\frac{1}{2m}\frac{\partial}{\partial\alpha}J(\alpha,\sigma)\dot\sigma^2$ from the first entry of K(r, ṙ) is negligible. As stated above, we assume that the steering variable σ is directly controlled, i.e. we assume that we have a steering actuator that will allow us to track any desired σ(t) trajectory exactly. For an automatically controlled bicycle this is a reasonable approximation as long as |σ̈| is sufficiently small. It will be convenient to refer to σ̇ as the input w^σ,

\[
\dot\sigma = w^\sigma
\tag{7.17}
\]
After the above simplifications, the equations of motion take on the simpler form
Reduced Bicycle Equations of Motion
\[
\dot\sigma = w^\sigma
\]
\[
\tilde M(r)\begin{bmatrix} \ddot\alpha \\ \dot v_r \end{bmatrix}
= \tilde K(r) + \tilde B(r)\begin{bmatrix} w^\sigma \\ \tau_r \end{bmatrix}
\tag{7.18}
\]

where

\[
\tilde M(\alpha,\sigma) = \begin{bmatrix}
p^2 & -cp\,c_\alpha\sigma \\
-cp\,c_\alpha\sigma & 1 + (c^2 + p^2 s_\alpha^2)\sigma^2 + 2p\sigma s_\alpha
\end{bmatrix}
\tag{7.19}
\]

\[
\tilde K(\alpha,\dot\alpha,\sigma,v_r) = \begin{bmatrix}
gp\,s_\alpha + (1 + p\sigma s_\alpha)p\,c_\alpha\sigma v_r^2 \\
-(1 + p\sigma s_\alpha)2p\,c_\alpha\sigma v_r\dot\alpha - cp\sigma s_\alpha\dot\alpha^2
\end{bmatrix}
\tag{7.20}
\]

\[
\tilde B(\alpha,\sigma,v_r) = \begin{bmatrix}
cp\,c_\alpha v_r & 0 \\
-(c^2\sigma + p s_\alpha(1 + p\sigma s_\alpha))v_r & 1/m
\end{bmatrix}
\tag{7.21}
\]
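The reduced model (7.19)-(7.21) is small enough to transcribe directly. The sketch below uses hypothetical parameter values (p, c, m are illustrative, not taken from the dissertation's simulations):

```python
import math

# Hypothetical parameter values for illustration: p (height of the point
# mass), c (horizontal offset), m (mass), g (gravity). See Figure 7.1.
p, c, m, g = 1.0, 0.5, 10.0, 9.81

def M_tilde(alpha, sigma):
    """Reduced mass matrix (7.19)."""
    sa, ca = math.sin(alpha), math.cos(alpha)
    return [[p**2, -c*p*ca*sigma],
            [-c*p*ca*sigma, 1 + (c**2 + p**2*sa**2)*sigma**2 + 2*p*sigma*sa]]

def K_tilde(alpha, alpha_dot, sigma, vr):
    """Reduced bias vector (7.20)."""
    sa, ca = math.sin(alpha), math.cos(alpha)
    return [g*p*sa + (1 + p*sigma*sa)*p*ca*sigma*vr**2,
            -(1 + p*sigma*sa)*2*p*ca*sigma*vr*alpha_dot
            - c*p*sigma*sa*alpha_dot**2]

def B_tilde(alpha, sigma, vr):
    """Reduced input matrix (7.21)."""
    sa, ca = math.sin(alpha), math.cos(alpha)
    return [[c*p*ca*vr, 0.0],
            [-(c**2*sigma + p*sa*(1 + p*sigma*sa))*vr, 1.0/m]]
```

Note that M̃ is symmetric, and at α = 0, σ = 0 it reduces to diag(p², 1), consistent with (7.19).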
with inputs w^σ and τ_r.

Remark 7.3.1 The first column of B̃(α, σ, vr) in (7.21) has vr as a multiplicative factor, confirming the intuitive notion that if vr = 0 then the steering input w^σ can have no effect on either α or vr. Thus, as experience predicts, the bicycle is not controllable when vr = 0. Also, as vr gets closer to zero, the steering input w^σ must get larger in order to maintain influence over α and vr. It is practical then to choose controls and initial conditions such that

\[
v_r > v_{min} > 0
\tag{7.22}
\]

The model is equally valid for vr < 0, but we will only consider the vr > 0 case. N
7.3.2 Conversion to External/Internal Convertible Form

The equations (7.18) are not yet in the E/I convertible form we require for application of the results of Chapter 6. We will convert (7.18) to the E/I convertible form (6.13) through a sequence of coordinate changes on both state and input.

A. First we apply a state-dependent input transformation to τ_r which makes the dynamics of vr linear with respect to the input. The second equation of the reduced equations (7.18) is

\[
\tilde M_{21}\ddot\alpha + \tilde M_{22}\dot v_r = \tilde K_2 + \tilde B_{21}w^\sigma + \tilde B_{22}\tau_r
\tag{7.23}
\]

Solve (7.23) for v̇r to get

\[
\dot v_r = \tilde M_{22}^{-1}\left(-\tilde M_{21}\ddot\alpha + \tilde K_2 + \tilde B_{21}w^\sigma\right) + \tilde M_{22}^{-1}\tilde B_{22}\tau_r
\tag{7.24}
\]

Now define a transformation from the input force τ_r to a new input variable w^r with

\[
\tau_r = \tilde M_{22}\tilde B_{22}^{-1}\left(-\tilde M_{22}^{-1}\left(-\tilde M_{21}\ddot\alpha + \tilde K_2 + \tilde B_{21}w^\sigma\right) + w^r\right)
\tag{7.25}
\]

This transformation (7.25) is r-, ṙ-, and w^σ-dependent. Substitute (7.25) into the equations of motion (7.18) to get

\[
\begin{aligned}
\dot\sigma &= w^\sigma \\
\dot v_r &= w^r \\
\tilde M_{11}\ddot\alpha &= \tilde K_1 - \tilde M_{12}w^r + \tilde B_{11}w^\sigma
\end{aligned}
\tag{7.26}
\]

so that the inputs are now w^r and w^σ.
B. We consider the output of the bicycle to be the (x, y) location of the rear-wheel ground contact. We will connect outputs to inputs through differentiation, with respect to t, of the nonholonomic constraints. From Figure 7.5 we can see that ẋ and ẏ are related to vr and θ by

\[
\begin{bmatrix} \dot x \\ \dot y \end{bmatrix}
= \begin{bmatrix} v_r c_\theta \\ v_r s_\theta \end{bmatrix}
\tag{7.27}
\]

Differentiate (7.27) twice with respect to t to get

\[
\begin{bmatrix} x^{(3)} \\ y^{(3)} \end{bmatrix}
= \begin{bmatrix} -2\dot v_r s_\theta\dot\theta - v_r c_\theta\dot\theta^2 \\ 2\dot v_r c_\theta\dot\theta - v_r s_\theta\dot\theta^2 \end{bmatrix}
+ \begin{bmatrix} c_\theta & -v_r s_\theta \\ s_\theta & v_r c_\theta \end{bmatrix}
\begin{bmatrix} \ddot v_r \\ \ddot\theta \end{bmatrix}
\tag{7.28}
\]

Define new inputs

\[
u^r := \ddot v_r = \dot w^r, \qquad
u^\theta := \ddot\theta = \dot v_r\sigma + v_r\dot\sigma = w^r\sigma + v_r w^\sigma
\tag{7.29}
\]

Note that u^θ and u^r are related to w^σ and w^r through integration and a state-dependent coordinate change. Assume that we want (x, y) to track (xd(t), yd(t)) where xd(·) and yd(·) are C⁴. (Later in Section 7.6 we will see the reason for the order of the smoothness requirement on xd(·) and yd(·).) Through (7.28) and the input transformations (7.29) above, the bicycle model takes the form

\[
\begin{aligned}
\begin{bmatrix} x^{(3)} \\ y^{(3)} \end{bmatrix} &=
\begin{bmatrix} -2\dot v_r s_\theta - v_r c_\theta\dot\theta \\ 2\dot v_r c_\theta - v_r s_\theta\dot\theta \end{bmatrix}\dot\theta
+ \begin{bmatrix} c_\theta & -v_r s_\theta \\ s_\theta & v_r c_\theta \end{bmatrix}
\begin{bmatrix} u^r \\ u^\theta \end{bmatrix} \\
\ddot\alpha &= \frac{g}{p}s_\alpha + \frac{1}{p}\left(1 + \frac{p\dot\theta s_\alpha}{v_r}\right)c_\alpha\dot\theta v_r + \frac{c}{p}c_\alpha u^\theta
\end{aligned}
\tag{7.30}
\]

Note that u^θ is the only input directly entering the internal dynamics.
C. In one last step we will achieve E/I convertible form for the bicycle. Define new inputs u^x and u^y by

\[
\begin{bmatrix} u^r \\ u^\theta \end{bmatrix}
= \begin{bmatrix} c_\theta & s_\theta \\ -s_\theta/v_r & c_\theta/v_r \end{bmatrix}
\left(
-\begin{bmatrix} -2\dot v_r s_\theta - v_r c_\theta\dot\theta \\ 2\dot v_r c_\theta - v_r s_\theta\dot\theta \end{bmatrix}\dot\theta
+ \begin{bmatrix} u^x \\ u^y \end{bmatrix}
\right)
\tag{7.31}
\]

This redefinition of inputs converts (7.30) to a multi-input, multi-output version of the external/internal convertible form (6.13) for internal equilibrium control, where we group the θ and vr dynamics with the external dynamics.
External/Internal Convertible Form for the Bicycle
\[
\Sigma_{ext}:\quad
\begin{bmatrix} x^{(3)} \\ y^{(3)} \end{bmatrix}
= \begin{bmatrix} u^x \\ u^y \end{bmatrix}
\tag{7.32}
\]
\[
\Sigma_{int}:\quad
\ddot\alpha = \frac{g}{p}s_\alpha
+ \frac{1}{p}\left(1 + \frac{p\dot\theta s_\alpha}{v_r}\right)c_\alpha\dot\theta v_r
+ \frac{c}{p}c_\alpha
\begin{bmatrix} -s_\theta/v_r & c_\theta/v_r \end{bmatrix}
\left(
-\begin{bmatrix} -2\dot v_r s_\theta - v_r c_\theta\dot\theta \\ 2\dot v_r c_\theta - v_r s_\theta\dot\theta \end{bmatrix}\dot\theta
+ \begin{bmatrix} u^x \\ u^y \end{bmatrix}
\right)
\]

Note that θ and vr, and thus θ̇, θ̈, and v̇r, are functions of x^{(0,3)} and y^{(0,3)} through the relations

\[
\dot x = v_r\cos(\theta), \qquad \dot y = v_r\sin(\theta), \qquad v_r = \sqrt{\dot x^2 + \dot y^2}
\tag{7.33}
\]

Thus (7.32) is equivalent to the combination of the reduced equations of motion (7.19) and the constraints (7.5).

Remark 7.3.2 A count of equations reveals that (7.32) consists of eight equations (we count z^{(k)} = · as k differential equations) as compared to the six differential equations of (7.26), plus two constraints (7.5). The extra two equations are due to the input transformation (7.29). N
7.3.3 Internal Dynamics of the Bicycle

Because only u^θ appears in the internal dynamics of (7.30), it will be more convenient for our purposes to work with the model in the form (7.30) rather than to use the E/I convertible form, where both u^x and u^y appear in the internal dynamics. The internal dynamics from (7.30) are the α-dynamics,

Internal Dynamics of the Bicycle
\[
\ddot\alpha = \frac{g}{p}s_\alpha + \frac{1}{p}\left(1 + \frac{p\dot\theta s_\alpha}{v_r}\right)c_\alpha\dot\theta v_r + \frac{c}{p}c_\alpha u^\theta
\tag{7.34}
\]
Zero-Dynamics of the Bicycle

Recall that the zero dynamics of a control system is the reduced system obtained by restricting the output and input to be identically zero. If we were to choose the reference output (xd(t), yd(t)) = (0, 0) as our "zero" output, we would have no chance of stabilizing the bicycle about this output since, as pointed out in Remark 7.3.1, the bicycle is not controllable when $v_r = \sqrt{\dot x^2 + \dot y^2} = 0$. Through changes of coordinates, however, there is a good deal of choice in what is meant by zeroing the output. For instance, let (xz(t), yz(t)) be a C³ trajectory in R², and redefine the output of the bicycle system to be

\[
\begin{bmatrix} \tilde x \\ \tilde y \end{bmatrix}
= \begin{bmatrix} x - x_z(t) \\ y - y_z(t) \end{bmatrix}
\tag{7.35}
\]

We may then redefine the input to be

\[
\begin{bmatrix} \tilde u^x \\ \tilde u^y \end{bmatrix}
= \begin{bmatrix} u^x - x_z^{(3)}(t) \\ u^y - y_z^{(3)}(t) \end{bmatrix}
\tag{7.36}
\]

In this manner, any sufficiently smooth reference trajectory (xr(t), yr(t)) may be made to be the "zero" output. However, the class of such reference trajectories is divided between those reference trajectories resulting in autonomous zero dynamics, and those resulting in non-autonomous zero dynamics. For a zero output we will choose rectilinear motion along the x-axis at a constant speed vz > vrmin > 0. This keeps the zero dynamics autonomous while retaining controllability. If (x(t), y(t)) ≡ (vz t, 0), then u^x ≡ 0, u^y ≡ 0, vr ≡ vz, θ ≡ 0, and θ̇ ≡ 0. The resulting zero dynamics of the bicycle are

Zero-Dynamics of the Bicycle
\[
\ddot\alpha = \frac{g}{p}\sin(\alpha)
\tag{7.37}
\]
The zero dynamics of the bicycle are the equations of motion for a planar pendulum in a gravitational field, where α = 0 corresponds to the upright position of the pendulum. These are the same zero dynamics we would have obtained if we had naively let the zero output be (xz (t), yz (t)) ≡ (0, 0). We would, in fact, obtain the same zero dynamics if we had chosen our “zero output” to coincide with any inertial frame traveling parallel to the ground. This is a result of the invariance of the Lagrangian and the constraints under translations and rotations in the ground plane.
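The instability of the upright equilibrium of the inverted-pendulum zero dynamics (7.37) can be sketched with a short simulation; the integrator and the parameter value for p are illustrative assumptions:

```python
import math

g, p = 9.81, 1.0  # gravity and pendulum height (p is a hypothetical value)

# Semi-implicit Euler integration of the zero dynamics (7.37),
# alpha'' = (g/p) sin(alpha), starting just off the upright position.
alpha, alpha_dot, dt = 1e-3, 0.0, 1e-3
for _ in range(3000):  # simulate 3 seconds
    alpha_dot += (g / p) * math.sin(alpha) * dt
    alpha += alpha_dot * dt
```

A perturbation of 10⁻³ rad grows by several orders of magnitude within a few seconds, as expected for an inverted pendulum: α = 0 is an unstable equilibrium.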
7.4 External Tracking Controller

Assume we wish to track the output reference trajectory (xd(t), yd(t)) ∈ C⁴. Let γi, i ∈ {1, 2, 3}, be such that

\[
s^3 + \gamma_3 s^2 + \gamma_2 s + \gamma_1
\tag{7.38}
\]
has roots with strictly negative real parts. Inputs u^x and u^y which, ignoring the internal α-dynamics, will achieve this goal comprise an external tracking controller for the bicycle

External Tracking Controller for the Bicycle
\[
\begin{bmatrix} u^x_{ext} \\ u^y_{ext} \end{bmatrix}
= \begin{bmatrix} x_d^{(3)}(t) \\ y_d^{(3)}(t) \end{bmatrix}
- \sum_{i=1}^{3}\gamma_i
\begin{bmatrix} x^{(i-1)} - x_d^{(i-1)}(t) \\ y^{(i-1)} - y_d^{(i-1)}(t) \end{bmatrix}
=: \begin{bmatrix} V^x \\ V^y \end{bmatrix}
\tag{7.39}
\]

Substitute u^x_ext and u^y_ext into (7.31) to get a corresponding value of the input u^θ, which we refer to as u^θ_ext, which produces the same tracking result (x, y) → (xd, yd):

\[
u^\theta_{ext} = \frac{1}{v_r}\begin{bmatrix} -s_\theta & c_\theta \end{bmatrix}
\left(
-\begin{bmatrix} -2\dot v_r s_\theta - v_r c_\theta\dot\theta \\ 2\dot v_r c_\theta - v_r s_\theta\dot\theta \end{bmatrix}\dot\theta
+ \begin{bmatrix} u^x_{ext} \\ u^y_{ext} \end{bmatrix}
\right)
\tag{7.40}
\]
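One channel of the tracking law (7.39) can be sketched as follows; the function name and argument layout are illustrative assumptions:

```python
def v_ext(xd_derivs, x_derivs, gammas):
    """One channel of the external tracking controller (7.39):
    V = xd^(3) - sum_i gamma_i (x^(i-1) - xd^(i-1)), for i = 1, 2, 3.
    xd_derivs = [xd, xd', xd'', xd'''], x_derivs = [x, x', x''],
    gammas = [gamma_1, gamma_2, gamma_3]."""
    u = xd_derivs[3]
    for i in (1, 2, 3):
        u -= gammas[i - 1] * (x_derivs[i - 1] - xd_derivs[i - 1])
    return u
```

When the state derivatives exactly match the reference, the feedback terms vanish and the control reduces to the feedforward term xd^(3); any mismatch is penalized through the gains γi.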
The nominal external dynamics of the bicycle (see (6.63)) are then

Nominal External Dynamics for the Bicycle
\[
\begin{bmatrix} x^{(3)} \\ y^{(3)} \end{bmatrix}
= N_{ext}(x_d^{(0,3)}, y_d^{(0,3)}, x, y)
:= \begin{bmatrix} x_d^{(3)}(t) \\ y_d^{(3)}(t) \end{bmatrix}
- \sum_{i=1}^{3}\gamma_i
\begin{bmatrix} x^{(i-1)} - x_d^{(i-1)}(t) \\ y^{(i-1)} - y_d^{(i-1)}(t) \end{bmatrix}
\tag{7.41}
\]

7.5 Internal Tracking Controller

Let λd(t), t ∈ R⁺, be a desired C² roll-angle trajectory, with λd(t) ∈ (−π/2, π/2)
for all t ≥ 0. Let βi, i ∈ {1, 2}, be such that the roots of the polynomial s² + β₂s + β₁ have
strictly negative real parts. An internal tracking controller for the bicycle is then
Internal Tracking Controller for the Bicycle
\[
u^\theta_{int}(v^\theta_{int}) = \left(\frac{c}{p}c_\alpha\right)^{-1}
\left( v^\theta_{int} - \frac{g}{p}s_\alpha - \frac{1}{p}\left(1 + \frac{p\dot\theta s_\alpha}{v_r}\right)c_\alpha\dot\theta v_r \right)
\tag{7.42}
\]
\[
v^\theta_{int} = \ddot\lambda_d - \beta_2(\dot\alpha - \dot\lambda_d) - \beta_1(\alpha - \lambda_d)
\]
Internal Equilibrium Angle We now define the internal equilibrium angle αe by the implicit relation 1 g 0 = sin(αe ) + p p
pθ˙ sin(αe ) 1+ vr
!
˙ r + c cos(αe )uθ cos(αe )θv ext p
(7.43)
obtained by setting uθ = uθext in the internal dynamics (7.34), where uext is given by (7.40). Set the left hand side of the internal dynamics equation (7.34) to zero and divide (7.43) by cos(αe )/p to get the equivalent but simpler implicit equation for the internal equilibrium angle αe ,
Internal Equilibrium Angle for the Bicycle ! pθ˙ sin(αe ) ˙ 0 = g tan(αe ) + 1 + θvr + cuθext vr
(7.44)
˙ vr , and uθ . Note that αe is an implicit function of θ, ext Equation (7.44) is a trigonometric polynomial in αe , having multiple solutions. We will be interested in only one solution however; the upright (αe ∈ (−π/2, π/2)) solution
closest to 0. We will determine an estimate α ˆ e for αe using dynamic inversion and will
single out the correct solution for αe by choice of initial condition for the dynamic inverter. The internal equilibrium manifold for the bicycle is
Internal Equilibrium Manifold for the Bicycle
\[
E(t) = \left\{ \left(x^{(0,2)}, y^{(0,2)}, \alpha^{(0,1)}\right) \,\middle|\, \alpha = \alpha_e(\dot\theta, v_r, u^\theta_{\mathrm{ext}}),\ \dot\alpha = 0 \right\}
\tag{7.45}
\]

7.6.1  A Dynamic Inverter for the Internal Equilibrium Angle

We will use dynamic inversion to solve for αe in (7.44) given the state of the bicycle (x^{(0,2)}, y^{(0,2)}, α^{(0,1)}).
Let
\[
F(\alpha, \dot\theta, v_r, u^\theta_{\mathrm{ext}}) := g\tan(\alpha)
+ \left(1 + \frac{p\dot\theta\sin(\alpha)}{v_r}\right)\dot\theta v_r + c\, u^\theta_{\mathrm{ext}}
\tag{7.46}
\]
To obtain an estimator for α̇e, differentiate F(αe, θ̇, vr, uθext) with respect to t and solve for α̇e to get
\[
\dot\alpha_e = -\left(g\sec^2(\alpha_e) + p\dot\theta^2\cos(\alpha_e)\right)^{-1}
\left( \frac{p u^\theta \sin(\alpha_e)\, v_r - p\dot\theta \sin(\alpha_e)\, \dot v_r}{v_r}\,\dot\theta
+ \left(1 + \frac{p\dot\theta\sin(\alpha_e)}{v_r}\right)\!\left(u^\theta v_r + \dot\theta \dot v_r\right)
+ c\,\dot u^\theta_{\mathrm{ext}} \right)
\tag{7.47}
\]
Recall that if h(t, (x^{(0,2)}, y^{(0,2)})) is a smooth function of t and the state (x^{(0,2)}, y^{(0,2)}), then
\[
\bar L_{N_{\mathrm{ext}}} h = \frac{\partial h}{\partial (x^{(0,2)}, y^{(0,2)})}\, N_{\mathrm{ext}} + \frac{\partial h}{\partial t}
\tag{7.48}
\]
An estimator E(αe, θ̇, vr, v̇r) for α̇e is then obtained from (7.47) by replacing u̇θext with L̄Next uθext, where Next is the nominal external dynamics (7.41):
\[
E(\alpha_e, \dot\theta, v_r, \dot v_r) = -\left(g\sec^2(\alpha_e) + p\dot\theta^2\cos(\alpha_e)\right)^{-1}
\left( \frac{p u^\theta \sin(\alpha_e)\, v_r - p\dot\theta\sin(\alpha_e)\, \dot v_r}{v_r}\,\dot\theta
+ \left(1 + \frac{p\dot\theta\sin(\alpha_e)}{v_r}\right)\!\left(u^\theta v_r + \dot\theta\dot v_r\right)
+ c\, \bar L_{N_{\mathrm{ext}}} u^\theta_{\mathrm{ext}} \right)
\tag{7.49}
\]
where we remind the reader that θ, θ̇, vr, and v̇r are functions of x^{(0,2)} and y^{(0,2)}. A dynamic inverter for αe is then
\[
\dot{\hat\alpha}_e = -\mu F(\hat\alpha_e, \dot\theta, v_r, u^\theta_{\mathrm{ext}})
+ E(\hat\alpha_e, \dot\theta, v_r, \dot v_r)
\tag{7.50}
\]
where F(α, θ̇, vr, uθext) is given by (7.46).
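When θ̇, vr, and uθext are frozen, the estimator term E in (7.50) vanishes and the inverter reduces to α̂̇e = −μF(α̂e, θ̇, vr, uθext). A minimal numerical sketch of this reduced case follows (forward Euler standing in for ode45; μ and the physical constants are from Table 7.1, while the operating point is our own, chosen to mimic the steady clockwise circling of Section 7.8.3):

```python
import numpy as np

g, p, c, mu = 9.8, 1.0, 0.5, 10.0  # Table 7.1 values

def F(alpha, theta_dot, v_r, u_theta_ext):
    """The implicit relation (7.46); its root in (-pi/2, pi/2) is alpha_e."""
    return (g * np.tan(alpha)
            + (1.0 + p * theta_dot * np.sin(alpha) / v_r) * theta_dot * v_r
            + c * u_theta_ext)

def dynamic_invert_alpha(theta_dot, v_r, u_theta_ext, alpha0=0.0, dt=1e-3, T=5.0):
    """Euler-integrate alpha_hat' = -mu * F(alpha_hat, ...).  Starting with
    alpha0 in (-pi/2, pi/2) singles out the upright solution."""
    a = alpha0
    for _ in range(int(T / dt)):
        a -= dt * mu * F(a, theta_dot, v_r, u_theta_ext)
    return a

# theta_dot = -5/8 rad/s, v_r = 5 m/s: steady clockwise circling
a_e = dynamic_invert_alpha(theta_dot=-0.625, v_r=5.0, u_theta_ext=0.0)
```

Since ∂F/∂α = g sec²α + pθ̇² cos α > 0 on (−π/2, π/2), the constant dynamic inverse −μ suffices and α̂e decays exponentially to the root; at this operating point αe ≈ 0.3 rad, a lean into the turn.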
Note that
\[
\bar L_{N_{\mathrm{ext}}} u^\theta_{\mathrm{ext}}
= \frac{1}{v_r^2}
\begin{bmatrix} -c_\theta\dot\theta v_r + s_\theta\dot v_r & -s_\theta\dot\theta v_r - c_\theta\dot v_r \end{bmatrix}
\left( \begin{bmatrix} V_x \\ V_y \end{bmatrix}
- \begin{bmatrix} -2\dot v_r s_\theta - v_r c_\theta\dot\theta \\ 2\dot v_r c_\theta - v_r s_\theta\dot\theta \end{bmatrix}\dot\theta \right)
+ \frac{1}{v_r}
\begin{bmatrix} -s_\theta & c_\theta \end{bmatrix}
\left( \begin{bmatrix} \bar L_{N_{\mathrm{ext}}} V_x \\ \bar L_{N_{\mathrm{ext}}} V_y \end{bmatrix}
- \begin{bmatrix} -2(u^r_{\mathrm{ext}} s_\theta + \dot v_r c_\theta\dot\theta) - (\dot v_r c_\theta\dot\theta - v_r s_\theta\dot\theta^2 + v_r c_\theta u^\theta_{\mathrm{ext}}) \\ 2(u^r_{\mathrm{ext}} c_\theta - \dot v_r s_\theta\dot\theta) - (\dot v_r s_\theta\dot\theta + v_r c_\theta\dot\theta^2 + v_r s_\theta u^\theta_{\mathrm{ext}}) \end{bmatrix}\dot\theta
- \begin{bmatrix} -2\dot v_r s_\theta - v_r c_\theta\dot\theta \\ 2\dot v_r c_\theta - v_r s_\theta\dot\theta \end{bmatrix} u^\theta_{\mathrm{ext}} \right)
\tag{7.51}
\]
where
\[
\begin{bmatrix} \bar L_{N_{\mathrm{ext}}} V_x \\ \bar L_{N_{\mathrm{ext}}} V_y \end{bmatrix}
= \begin{bmatrix} x_d^{(4)}(t) \\ y_d^{(4)}(t) \end{bmatrix}
- \begin{bmatrix} V_x - x_d^{(3)}(t) \\ V_y - y_d^{(3)}(t) \end{bmatrix}
- \sum_{i=1}^{2} \gamma_i
\begin{bmatrix} x^{(i)} - x_d^{(i)}(t) \\ y^{(i)} - y_d^{(i)}(t) \end{bmatrix}
\tag{7.52}
\]
The C⁴ smoothness required of xd(t) and yd(t) is due to the L̄Next Vx and L̄Next Vy terms, in which the fourth derivatives of the reference trajectories appear.
7.7  Path Tracking with Balance

An internal equilibrium controller for the bicycle is obtained by letting ur = urext and uθ = uθint(ve), with
\[
v_e = \bar L_{N_{\mathrm{ext}}}^2 \alpha_e - \beta_2\!\left(\dot\alpha - \bar L_{N_{\mathrm{ext}}} \alpha_e\right) - \beta_1\!\left(\alpha - \alpha_e\right)
\tag{7.53}
\]
As shown in Chapter 6, the internal equilibrium controller causes a neighborhood of E(t)
to become attractive and invariant and thereby produces approximate tracking, i.e. after a time T , (x(t), y(t)) is close to (xd (t), yd (t)). Our final controller is
Internal Equilibrium Controller for the Bicycle
\[
u_e(v_e) := u^\theta_{\mathrm{int}}(v_e) = \left(\frac{c}{p}\, c_\alpha\right)^{-1}
\left( v_e - \frac{g}{p}\, s_\alpha
- \frac{1}{p}\left(1 + \frac{p\dot\theta s_\alpha}{v_r}\right) c_\alpha\, \dot\theta v_r \right)
\tag{7.54}
\]
\[
v_e := \bar L_{N_{\mathrm{ext}}}^2 \alpha_e - \beta_2\!\left(\dot\alpha - \bar L_{N_{\mathrm{ext}}} \alpha_e\right) - \beta_1\!\left(\alpha - \alpha_e\right)
\]
7.8  Simulations

In this section we show the results of four simulations of the internal equilibrium controller on the bicycle model, using four different reference trajectories: a straight line, a sinusoid, a circle, and a figure-eight, each revealing some capabilities and limitations of internal equilibrium control as applied to the bicycle. All simulations were performed in Matlab [Mat92] using an adaptive step-size Runge-Kutta integrator, ode45. The same physical and control parameters were used in all four simulations; these parameters are shown in Table 7.1.

Table 7.1: Physical and gain parameters for the simulations.
    γ2 = 3    γ1 = 3    γ0 = 1    β1 = 20    β0 = 100    µ = 10    c = 1/2 [m]    p = 1 [m]    g = 9.8 [m/s²]

7.8.1  Straight Path at Constant Speed

For the first simulation the output reference trajectory was along a straight line at constant speed,
\[
(x_d(t), y_d(t)) = (5t, 0)
\tag{7.55}
\]
where the units of length are in meters. The initial conditions for the simulation are shown in Table 7.2.

Table 7.2: Initial conditions for a straight trajectory at constant speed.
    x(0) = 0    y(0) = 5 [m]    ẋ(0) = 2.5 [m/s]    ẏ(0) = 0    ẍ(0) = 0    ÿ(0) = 0    α(0) = 0    α̇(0) = 0
The top graph of Figure 7.7 shows the resulting path in the plane (solid) along with the desired path (dotted). The top graph of Figure 7.8 shows the tracking error ‖(x, y) − (xd, yd)‖₂ versus t.
The second graph of Figure 7.8 shows the steering angle φ versus t. The third graph shows the rear-wheel velocity vr (solid) with the desired rear-wheel velocity vrd = √(ẋd² + ẏd²) (dotted), both versus t. The bottom graph shows the roll-angle α (solid) with internal equilibrium roll-angle αe (dotted), both versus t. Note the countersteering evident in the graph of φ versus t. The steering angle goes positive first, steering the bicycle away from the desired path momentarily in order to cause the bicycle's roll-angle to converge towards the equilibrium roll-angle, which is initially positive. The tracking error for the straight path goes to zero. This behavior will also be seen for the case of the circular path below; it is a result of the internal equilibrium angle αe going to a constant value. Figure 7.9 shows schematically the convergence of α to αe (shown as a dotted line) and the resulting path. The internal equilibrium controller steers the bicycle so that its
roll-angle converges to a neighborhood of αe and approximately tracks αe as αe changes. Approximate tracking of αe causes approximate tracking of the desired rectilinear trajectory in the plane. Figure 7.9 corresponds to the top of Figure 6.13 in Chapter 6, pertaining to the internal equilibrium control of the inverted pendulum.

[Figure 7.7: Target path (xd, yd) = (5t, 0) [m]. The x and y scales are in meters. The bicycle's path in the plane (solid) with the desired straight path (dotted).]
[Figure 7.8: Target path (xd, yd) = (5t, 0) meters. The top graph shows the tracking error ‖(x, y) − (xd, yd)‖₂ versus t. The second graph shows the steering angle φ. The third graph shows the rear wheel velocity vr (solid) with desired rear-wheel velocity (dotted) vrd. The fourth graph shows the roll-angle α (solid) with internal equilibrium roll-angle αe (dotted).]
Figure 7.9: Internal equilibrium control causes the bicycle to steer itself so that its roll angle α converges to a neighborhood of the equilibrium roll angle αe , shown as a dashed line.
7.8.2  Sinusoidal Path

For the second simulation the reference trajectory was a sinusoid,
\[
(x_d(t), y_d(t)) = \left(5t,\ \sin\!\left(\tfrac{1}{5}\pi t\right)\right)
\tag{7.56}
\]
where the unit of length is meters. The initial conditions for this simulation are shown in Table 7.3; these are the same as the initial conditions for the straight path shown in Table 7.2.

Table 7.3: Initial conditions for the sinusoidal trajectory at constant speed.
    x(0) = 0    y(0) = 5 [m]    ẋ(0) = 2.5 [m/s]    ẏ(0) = 0    ẍ(0) = 0    ÿ(0) = 0    α(0) = 0    α̇(0) = 0
Figure 7.10 shows the resulting path in the plane. Figure 7.11 shows the tracking error ‖(x, y) − (xd, yd)‖₂, as well as the steering angle φ, the velocity vr with the desired velocity, and the roll angle α with αe.
[Figure 7.10: Sinusoidal target path (xd(t), yd(t)) = (5t, sin(πt/5)) [m]. The bicycle's path in the plane (solid) with the desired path (dotted).]

Note that the tracking error becomes bounded but non-zero, due to the presence of non-zero higher-order time derivatives, namely x_d^{(3,5)} and y_d^{(3,5)}, of the output reference trajectory.
[Figure 7.11: Sinusoidal target path (xd, yd) = (5t, 2 sin(0.2πt)). The top graph shows the tracking error ‖(x, y) − (xd, yd)‖₂. The bottom three graphs show the steering angle φ, the rear wheel velocity vr (solid) with desired rear-wheel velocity (dotted), and the roll-angle α (solid) with internal equilibrium roll-angle αe (dotted).]
7.8.3  Circle at Constant Velocity

For the third simulation the reference trajectory was
\[
(x_d(t), y_d(t)) = \left(8\sin\!\left(\tfrac{5}{8}t\right),\ 8\cos\!\left(\tfrac{5}{8}t\right)\right)\ [\mathrm{m}]
\tag{7.57}
\]
a circle of radius 8 [m] traversed with a constant linear velocity of 5 [m/s]. The initial conditions for this simulation are shown in Table 7.4.

Table 7.4: Initial conditions for following a circular trajectory.
    x(0) = 0    y(0) = 5 [m]    ẋ(0) = 4.5 [m/s]    ẏ(0) = 0    ẍ(0) = 0    ÿ(0) = 0    α(0) = 0    α̇(0) = 0
Figure 7.12 shows the resulting path in the plane. Figure 7.13 shows the tracking error ‖(x, y) − (xd, yd)‖₂, as well as the steering angle φ, the velocity vr with the desired velocity, and the roll angle α with αe. We could have achieved a circular path by using the internal tracking controller
to cause α to converge to a constant nonzero angle, as was shown in [Get94]. However, in that case we would have no control over the location of the circle in the plane.

[Figure 7.12: Circular target trajectory with radius 8 meters and tangential velocity of 5 meters per second. The first 10 seconds of the bicycle's path in the plane (solid) with the desired circular path (dotted).]
[Figure 7.13: Circular target trajectory with radius 8 meters and tangential velocity of 5 meters per second. The top graph shows the tracking error ‖(x, y) − (xd, yd)‖₂. The bottom three graphs show the steering angle φ, the rear wheel velocity vr (solid) with desired rear-wheel velocity (dotted), and the roll-angle α (solid) with internal equilibrium roll-angle αe (dotted).]
Note that, as in the case of the straight path, the tracking error goes to zero because the external tracking controller is regulating to constant values of vr and θ̇.
7.8.4  Figure-Eight Trajectory

The fourth simulation used a figure-eight reference trajectory,
\[
(x_d(t), y_d(t)) = \left(20\sin(2\pi t/20),\ 10\sin(4\pi t/10)\right)
\tag{7.58}
\]
where the units of length are in meters. The initial conditions for this simulation are shown in Table 7.5.

Table 7.5: Initial conditions for following the figure-eight trajectory.
    x(0) = 0    y(0) = 5 [m]    ẋ(0) = 6 [m/s]    ẏ(0) = 0    ẍ(0) = 0    ÿ(0) = 0    α(0) = 0    α̇(0) = 0
Figure 7.14 shows the resulting path in the plane (solid) along with the reference path (dotted). Figure 7.15 shows the tracking error ‖(x, y) − (xd, yd)‖₂ along with the steering angle φ, rear-wheel velocity vr, and roll-angle α. Note that the tracking error becomes bounded but does not decay to zero, again because the higher-order time derivatives of (xd(t), yd(t)) are time-varying. Due to the significant energy in the higher derivatives of (xd, yd), the tracking error, though it becomes bounded exponentially, does not converge to zero as it does for the circle and the straight line.
[Figure 7.14: The bicycle's path in the plane (solid) with the desired figure-eight path (dotted).]
[Figure 7.15: Figure-eight target trajectory. The top graph shows the tracking error ‖(x, y) − (xd, yd)‖₂. The bottom three graphs show the steering angle φ, the rear wheel velocity vr (solid) with desired rear-wheel velocity (dotted), and the roll-angle α (solid) with internal equilibrium roll-angle αe (dotted).]
7.9  Chapter Summary

Application of internal equilibrium control to the bicycle has resulted in a controller
that provides approximate tracking of smooth reference trajectories while retaining the bicycle’s balance. The ultimate bound on the tracking error has been seen to coincide with the energy in the higher time derivatives of the output reference trajectories, as expected from the theory of Chapter 6. Dynamic inversion has been successfully used to track the internal equilibrium angle for the bicycle, with the result incorporated into the tracking controller. In the case of the bicycle, the internal equilibrium angle at time t is the solution of a trigonometric polynomial equation. Other means than dynamic inversion exist for the solving of trigonometric polynomials. For instance, the internal equilibrium angle equations could have been converted to a standard polynomial in s = tan(αe /2). The resulting fifth order polynomial could then be solved for its five roots using a standard polynomial solver, e.g. roots in Matlab [Mat92]. The real roots could then be converted back to angles through the arctangent function. Then the useful root, the one in (−π/2, π/2), could be picked out of the list of angles providing a solution for the internal equilibrium angle at a time t. We do not mean to imply that this solution, though somewhat arcane, is either slow or inefficient. Such an approach is used with great success in the area of robot inverse kinematics [Man92] for robotic manipulators having rotary joints. In contrast, however, the conceptual simplicity of dynamic inversion, combined with its ease of implementation, its continuous-time dynamic nature, and its accurate results hold substantial appeal and convenience. Dynamic inversion may also be applied whether or not the implicit equation defining αe is a trigonometric polynomial. The simulations have shown that countersteering emerges naturally as a result of the action of the internal equilibrium controller. 
Countersteering may be regarded as the result of tracking the equilibrium roll angle rather than trying to track the path in the plane directly.
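The tan-half-angle alternative described above can be made concrete. With s = tan(αe/2), tan αe = 2s/(1 − s²) and sin αe = 2s/(1 + s²); multiplying (7.44) through by (1 − s²)(1 + s²) is one way of clearing denominators, and it yields the quartic used below (the coefficients are our own working; a different clearing produces the fifth-order polynomial mentioned in the text). numpy.roots stands in for Matlab's roots:

```python
import numpy as np

g, p, c = 9.8, 1.0, 0.5  # Table 7.1 values

def alpha_e_by_roots(theta_dot, v_r, u_theta_ext):
    """Solve (7.44) for the upright internal equilibrium angle via a
    tan-half-angle substitution and a standard polynomial solver."""
    K = theta_dot * v_r + c * u_theta_ext  # the alpha-independent terms of (7.44)
    q = p * theta_dot ** 2
    # -K s^4 + 2(g - q) s^3 + 2(g + q) s + K = 0   (one clearing of (7.44))
    s_roots = np.roots([-K, 2.0 * (g - q), 0.0, 2.0 * (g + q), K])
    alphas = [2.0 * np.arctan(s.real) for s in s_roots if abs(s.imag) < 1e-9]
    upright = [a for a in alphas if -np.pi / 2 < a < np.pi / 2]
    return min(upright, key=abs)  # the upright solution closest to 0

# operating point mimicking the clockwise circle of Section 7.8.3
a_e = alpha_e_by_roots(theta_dot=-0.625, v_r=5.0, u_theta_ext=0.0)
```

Selecting the real root whose angle lies in (−π/2, π/2) plays the same role as the dynamic inverter's choice of initial condition.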
Chapter 8

Conclusions

In this dissertation we have presented a methodology for the construction of nonlinear dynamic systems that produce approximations to the solutions of inverse problems having time-varying vector-valued solutions. We have demonstrated application of these results to a variety of problems in matrix analysis and nonlinear control including matrix inversion, polar decomposition, implicit trajectory tracking, robot control, and the inversion of nonlinear control systems, particularly nonminimum-phase systems. In this final chapter we will briefly review the results of the preceding chapters. Then, after some general observations, we will make some suggestions for future work.
8.1  Review

In this section we review the results of each chapter of this dissertation.
• Chapter 2. Dynamic Inversion of Nonlinear Maps. The dynamic inverse of a nonlinear map was defined. The definition of the dynamic inverse was such that when
the dynamic inverse was composed with a nonlinear map, simple quadratic Lyapunov arguments were all that was necessary to prove convergence of the solution of the resulting dynamic inverter to the desired root. It was shown how individual dynamic inverses and derivative estimators could be combined to produce the solutions to coupled inverse problems using a single dynamic system. This resulted in a dynamic inverter that solved for a dynamic inverse, while simultaneously using the resulting dynamic inverse to solve the main inverse problem of interest. Examples showed how dynamic inversion could provide inversion of time-varying matrices and estimation of
the intersections of time-varying curves. Dynamic inversion was also used to construct a dynamic controller for a nonlinear control system. • Chapter 3. Dynamic Inversion and Polar Decomposition of Matrices. Building upon the use of dynamic inversion in inverting time-varying matrices, we extended
its application to the asymptotic inversion of fixed matrices. Then by constructing a time-parameterized homotopy from the identity to a fixed matrix we showed how we could use our result on asymptotic inversion of time-varying matrices to invert a spectrally restricted matrix in finite time. In order to remove the spectral restrictions on the fixed matrix to be inverted, we constructed a dynamic inverter which asymptotically solved for the inverse of the positive-definite part of the polar decomposition of a time-varying matrix. Through additional matrix multiplications the unitary component of the polar decomposition, as well as the inverse of the decomposed matrix were also produced. Then by using another time-parameterized homotopy from the identity, we obtained polar decomposition of any fixed matrix in finite time. • Chapter 4. Tracking Implicit Trajectories. For a partially feedback linearizable
nonlinear control system, we showed how dynamic inversion could be used to allow
exponentially convergent tracking of output reference trajectories that were implicitly defined. We obtained tracking by substituting into a conventional control law the solution of a dynamic inverter and estimates of the derivatives of that solution, for the actual solution and its derivatives. We showed that as long as the output reference trajectory and its derivatives were suitably bounded, and as long as the initial conditions for the dynamic inverter were sufficiently close to the actual reference trajectory and its derivatives, application of the implicit tracking controller preserved a bound on the internal dynamics of the controlled dynamic system. • Chapter 5. Joint-Space Tracking of Workspace Trajectories in Continuous Time. We showed how robotic manipulator control strategies may be divided up
into four classes, depending upon whether one wishes to stabilize trajectories in the workspace or the joint-space using workspace errors or joint-space errors. Then by applying the implicit tracking controller of Chapter 4, we showed how we could achieve exponentially convergent tracking of workspace trajectories using joint-space errors with no need to call on discrete-time inverse-kinematic algorithms or matrix inversion.
• Chapter 6. Approximate Output Tracking for a Class of Nonminimum-Phase Systems. In this chapter we considered the problem of output tracking for a class of nonlinear nonminimum-phase systems called balance systems. Under suitable conditions, an internal equilibrium manifold, a submanifold of the system's state-space, could be constructed from the internal dynamics of the system. A controller was then constructed which made a neighborhood of the internal equilibrium manifold attractive and invariant. This resulted in approximate tracking of both the output reference trajectory and a bounded internal trajectory, the internal equilibrium angle. Application of the internal equilibrium controller to tracking control for the inverted pendulum on a cart, and comparison to a linear quadratic regulator, demonstrated the performance increase of the internal equilibrium controller over the linear quadratic regulator. • Chapter 7. Automatic Control of a Bicycle. We converted the nonlinear nonholonomic model of a simple bicycle to internal/external convertible form. We showed how
both roll-angle and rear-wheel velocity could be easily controlled, ignoring the resulting path in the ground plane. Then we used the approximate output tracking control methodology of Chapter 6 to construct an internal equilibrium controller for approximate tracking of time-parameterized ground paths while retaining balance. Simulation results verified approximate stabilization of a number of time-parameterized ground paths, along with maintenance of vehicle balance.
8.2  Observations

8.2.1  Dynamic Time vs. Computational Time

Consider the use of discrete inversion routines in the context of a digital implementation of the control of a continuous-time dynamic system. Discrete inversion essentially introduces a computational time axis in addition to the dynamic time axis along which the continuous-time process flows. For instance, assume that we wish to control a plant with a compensator that is represented by a differential-algebraic equation, containing both an ordinary differential equation and a set of implicit algebraic relations. The time axis of both the plant and the differential-equation part of the controller is the dynamic time axis. The differential equations must be integrated with a discrete integrator. Before integration can proceed, however, the set of implicit algebraic equations must be solved for, say, a root.
Using a discrete inverter, this computation of the root proceeds sequentially along a discrete computational time axis. Only when the discrete inverter has completed its last step can the integration proceed along the dynamic time axis. Now consider the same problem where dynamic inversion is used. A dynamic inverter for the solution of the implicit algebraic equations is an ordinary differential equation. It may be appended to the differential equations of the compensator and integrated along with the compensator. Thus the need for a computational time axis in addition to the dynamic time axis is removed. Furthermore, all responsibility for accuracy of the inversion is placed squarely in the lap of the integration routine. Accuracy of integration routines is a well-studied problem.
8.2.2  Realization of Dynamic Inverters

Dynamic inversion is an analog computational paradigm. Consequently many different physical realizations of dynamic inversion are possible, e.g. analog electronic, mechanical, chemical, optical. With the currently prevalent digital computational technology, it is most often the case that differential equations such as the dynamic inversion systems (2.53), (2.80), and (2.108) are solved using numerical computation on digital computers. Consequently a comparison of the continuous estimator and discrete root-finding techniques such as Newton's method would, in the digital domain, be more appropriately made by pairing the continuous estimator with an algorithm for integration of ordinary differential equations. For example, one might compare a dynamic inverter integrated with a Runge-Kutta integrator of a particular order with the Newton-Raphson algorithm. Though such comparisons are beyond the scope of this dissertation, each such pairing of dynamic inverter and integration routine results in a discrete root estimator. Indeed, the direct comparison of a discrete estimator to a continuous estimator may be seen as unfair or inappropriate. The continuous dynamic approach of dynamic inversion, however, has the virtue that it is independent of implementation. It may be realized in an analog as well as a digital manner (through association with an integrator). Its combination with a continuous-time plant and controller in a control system allows a seamless incorporation of root-solving without the need to mix continuous-time and discrete-time analysis.
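To make the pairing concrete, here is one such discrete root estimator for the matrix-inversion problem: a simple matrix dynamic inverter (our own variant in the spirit of Chapter 3, not a transcription of (2.53) or its companions) discretized with fixed-step Euler. Replacing the Euler step with a Runge-Kutta step of some order gives another member of the family.

```python
import numpy as np

def dynamic_invert(A, mu=5.0, dt=1e-3, T=10.0):
    """Integrate Xdot = -mu * X (A X - I), a flow whose stable equilibrium is
    A^{-1}.  X0 = A.T / ||A||_2^2 places X in the basin of attraction (the
    continuous analog of a Newton-Schulz starting guess)."""
    n = A.shape[0]
    I = np.eye(n)
    X = A.T / np.linalg.norm(A, 2) ** 2
    for _ in range(int(T / dt)):
        # one fixed-step Euler step of the dynamic inverter
        X = X - dt * mu * X @ (A @ X - I)
    return X

A = np.array([[2.0, 1.0], [0.5, 3.0]])
X = dynamic_invert(A)  # approximates np.linalg.inv(A)
```

Note that any fixed point of the Euler map with X invertible satisfies AX = I exactly, so the discretization inherits the inverter's equilibrium; only the transient depends on the integrator chosen.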
8.3  Future Work

In this section we highlight a few directions in which the results of this dissertation may be extended in future work.
8.3.1  Methods for Producing Dynamic Inverses

For the most part we have relied upon linear dynamic inverses. For many important inverse problems no linear dynamic inverse exists and nonlinear dynamic inverses must be found. As yet we have no methodology for producing nonlinear dynamic inverses for a variety of problems. Even in the case that a linear dynamic inverse exists, the determination of an expression for the differential D₁F(θ, t) may be impractical. Ways can be developed to determine a dynamic inverse dynamically by a method akin to finite differences, in which one tracks points near the approximator θ of the inverse solution θ* in order to construct a dynamic approximation of a linear dynamic inverse.
8.3.2  Differential-Algebraic Systems

Dynamic inversion allows one to incorporate the time-varying solutions of algebraic equations into dynamical systems. This has been shown in this dissertation in association with implicit trajectory tracking as well as internal equilibrium control. In the general context of differential-algebraic systems, which arise frequently in the domain of constrained mechanical systems, for instance, dynamic inversion may be able to provide a natural context in which to incorporate the solutions of algebraic constraints into dynamics. Constraint enforcement in the integration of differential-algebraic systems and conservative mechanical systems may also be fruitful areas of application.
8.3.3  Inverse Kinematics with Singularities

We have shown how dynamic inversion may be applied to the problem of inverting robot kinematics as long as a continuous isolated solution to the inverse kinematics problem exists. We also showed that dynamic inversion may be used for redundant manipulators, allowing us to arrive at a unique solution. We have avoided the problem of kinematic singularities, points in joint-space where the Jacobian of the forward-kinematics map drops rank. At these points, continuous isolated solutions meet. Dynamic inversion as stated in this dissertation cannot be used at such singularities. We may, however, be able to modify dynamic inversion, incorporating manipulator dynamics, so that in the context of workspace
tracking, the inverse path through such singularities becomes unique based upon the state of the manipulator.
8.3.4  Tracking Multiple Solutions

Dynamic inversion may be used for tracking multiple solutions of an inverse problem. One need only set up a dynamic inverter for each solution. When solutions cross, however, dynamic inversion breaks down. There are many applications where there is a physically sensible way of choosing the continuation of solutions. This sensibility needs to be incorporated into the inversion process.
8.3.5  Tracking Optimal Solutions

Most applications of dynamic inversion in this dissertation have been to solutions of equations for which the number of unknowns is equal to the number of equations. When the number of equations is less than the number of unknowns, other criteria must be used to produce a unique solution. Typically one chooses a cost function on the solution space and uses minimization of that cost function as the additional criterion for uniqueness, arriving at an optimal solution. In future work dynamic inversion will be extended to the tracking of optimal solutions.
8.3.6  Control System Design

Though it is common to implement control systems through either analog or digital circuits and computers, the study of dynamic inversion suggests that this limited view of controllers, flexible though it may be, is not the only view possible. As an analog computational paradigm, dynamic inversion may be realized by a variety of physical processes. In particular, dynamic inversion may be designed into a process or mechanism in order to achieve control without the aid of an electrical digital or analog circuit. This may be particularly useful in the control of very fast processes that happen on time scales too short for circuit realizations of controllers. Such processes occur in, for instance, fast chemical reactions, such as combustion, and in the area of inertial confinement fusion. At the same time, dynamic inversion may provide a way of viewing physical processes as computation. This may lead to new types of computers and new insights into the design of materials and structures.
237
Bibliography [AG80]
E. Allgower and K. Georg. Simplical and continuation methods for approximating fixed points and solutions to systems of equations. SIAM Review, 22(1), January 1980.
[AM90]
B.D.O. Anderson and J.B. Moore. Optimal control : linear quadratic methods. Prentice Hall, Englewood Cliffs, New Jersey, 1990.
[AMR88]
R. Abraham, J.E. Marsden, and T. Ratiu. Manifolds, Tensor Analysis, and Applications, volume 75 of Applied Mathematical Sciences. Springer-Verlag, New York, second edition, 1988.
[BH69]
A.E. Bryson and Y.-C. Ho. Applied optimal control; optimization, estimation, and control. Blaisdell, Waltham, Mass., 1969.
[BKMM94] A.M. Bloch, P.S. Krishnaprasad, J.E. Marsden, and R.M. Murray. Nonholonomic mechanical systems with symmetry. Technical Report CDS 94-013, California Institute of Technology, Pasadena, 1994. Accepted for Archive for Rational Mechanics and Analysis. [BL93]
M.D. Di Benedetto and P. Lucibello. Inversion of nonlinear time-varying systems. IEEE Transactions on Automatic Control, 38(8):1259–1264, August 1993.
[Blo85]
A.M. Bloch. Estimation, principal components, and hamiltonian systems. Systems and Control Letters, 6, 1985.
[Blo90]
A.M. Bloch. Steepest descent, linear programming, and hamiltonian flows. Contemporary Mathematics, 114, 1990.
238 [BRM92]
Bibliography A.M. Bloch, M. Reyhanoglu, and N.H. McClamroch. Control and stabilization of nonholonomic dynamic systems. IEEE Transactions on Automatic Control, 37(11):1746–1757, November 1992.
[Bro89]
R. W. Brockett. Least squares matching problems. Linear Algebra and Its Applications, 122:761–777, Sep-Nov 1989.
[Bro91]
R. W. Brockett.
Dynamical systems that sort lists, diagonalize matrices,
and solve linear programming problems. Linear Algebra and Its Applications, 146:79–91, February 1991. [BS72]
R.H. Bartels and G. H. Stewart. Algorithm 432, solution of the matrix equation AX+XB=C. Communications of the ACM, 15, 1972.
[CD91a]
F.M. Callier and C.A. Desoer. Linear System Theory. Springer Texts in Electrical Engineering. Springer-Verlag, New York, 1991.
[CD91b] M.T. Chu and K.R. Driessel. Constructing symmetric nonnegative matrices with prescribed eigenvalues by differential equations. SIAM Journal on Mathematical Analysis, 22(5), September 1991.

[CDK87] L.O. Chua, C.A. Desoer, and E.S. Kuh. Linear and Nonlinear Circuits. McGraw-Hill, 1987.

[Chu92] M.T. Chu. Matrix differential equations: a continuous realization process for linear algebra problems. Nonlinear Analysis: Theory, Methods and Applications, 18(12), June 1992.

[Chu95] M.T. Chu. Scaled Toda-like flows. Linear Algebra and its Applications, 215, 15 January 1995.

[CR93] L.O. Chua and T. Roska. The CNN paradigm. IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, 40(3), March 1993.

[Cra89] J.J. Craig. Introduction to Robotics: Mechanics and Control. Addison-Wesley, New York, second edition, 1989.

[DJW84] J. P. Dunyak, J. L. Junkins, and L. T. Watson. Robust nonlinear least squares estimation using the Chow-Yorke homotopy method. Journal of Guidance, 7:752–755, 1984.

[DN92] F. Delebecque and R. Nikoukhah. A mixed symbolic-numeric software environment and its application to control systems engineering. Recent Advances in Computer-Aided Control Systems Engineering, 1992.

[DP94] S. Devasia and B. Paden. Exact output tracking for nonlinear time-varying systems. In Proceedings of the 33rd IEEE Conference on Decision and Control, volume 3, pages 2346–2355, Lake Buena Vista, December 1994. IEEE.

[DPC94] S. Devasia, B. Paden, and Degang Chen. Nonlinear inversion-based regulation. In American Control Conference, Baltimore, July 1994. IEEE.

[Far92] F.A. Farouhar. Robust Stabilization of High Speed Oscillations in Single Track Vehicles by Feedback Control of Gyroscopic Moments of Crankshaft and Engine Inertia. PhD thesis, University of California, May 1992.

[FPEN86] G.F. Franklin, J. D. Powell, and A. Emami-Naeini. Feedback Control of Dynamic Systems. Addison-Wesley, Menlo Park, 1986.

[Fra77] B.A. Francis. The linear multivariable regulator problem. SIAM Journal of Control and Optimization, 15, 1977.

[FSR90] G. Franke, W. Suhr, and F. Riess. An advanced model of bicycle dynamics. European Journal of Physics, 11(2):116–121, March 1990.

[GBLL94] J.W. Grizzle, M.D. Di Benedetto, and F. Lamnabhi-Lagarrigue. Necessary conditions for asymptotic tracking in nonlinear systems. IEEE Transactions on Automatic Control, 39(9):1782–1794, September 1994.

[Get94] N. H. Getz. Control of balance for a nonlinear nonholonomic non-minimum phase model of a bicycle. In American Control Conference, Baltimore, June 1994. American Automatic Control Council.

[Get95] N. H. Getz. Internal equilibrium control of a bicycle. In 34th IEEE Conference on Decision and Control, New Orleans, 13-15 December 1995. IEEE.

[GH90] J. Guckenheimer and P. Holmes. Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, volume 42 of Applied Mathematical Sciences. Springer-Verlag, New York, 1990.

[GH93] S. Gopalswamy and J. K. Hedrick. Tracking nonlinear non-minimum phase systems using sliding control. International Journal of Control, 57(5):1141–1158, 1993.

[GH95] N. H. Getz and J. K. Hedrick. An internal equilibrium manifold method of tracking for nonminimum phase systems. In American Control Conference, Seattle, 21-23 June 1995. American Automatic Control Council.

[GM95a] N. H. Getz and J. E. Marsden. Control for an autonomous bicycle. In IEEE International Conference on Robotics and Automation, Nagoya, Aichi, Japan, 21-27 May 1995. IEEE.

[GM95b] N. H. Getz and J. E. Marsden. Dynamical methods for polar decomposition and inversion of matrices. Technical Report 624, Center for Pure and Applied Mathematics, Berkeley, California, 5 January 1995. Submitted to Linear Algebra and its Applications.

[GM95c] N. H. Getz and J. E. Marsden. Tracking implicit trajectories. In IFAC Symposium on Nonlinear Control Systems Design, Tahoe City, 25-28 June 1995. International Federation of Automatic Control.

[GMW81] P.E. Gill, W. Murray, and M.H. Wright. Practical Optimization. Academic Press, 1981.

[GNL79] G.H. Golub, S. Nash, and C. Van Loan. A Hessenberg–Schur method for the problem AX + XB = C. IEEE Transactions on Automatic Control, AC-24, 1979.

[GS93] R. Gurumoorthy and S.R. Sanders. Controlling nonminimum phase nonlinear systems: the inverted pendulum on a cart example. In Proceedings of the 1993 American Control Conference, volume 1, pages 680–685, San Francisco, June 1993. ACC.

[Han88] R. S. Hand. Comparisons and stability analysis of linearized equations of motion for a basic bicycle model. Master's thesis, Cornell, 1988.

[Har82] P. Hartman. Ordinary Differential Equations. Birkhäuser, Boston, second edition, 1982.

[HJ85] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge University Press, New York, 1985.

[HJ91] R. A. Horn and C. R. Johnson. Topics in Matrix Analysis. Cambridge University Press, New York, 1991.

[HM94] U. Helmke and J. B. Moore. Optimization and Dynamical Systems. Communications and Control Engineering. Springer-Verlag, New York, 1994.

[HMP94] U. Helmke, J. B. Moore, and J. E. Perkins. Dynamical systems that compute balanced realizations and the singular value decomposition. SIAM Journal of Matrix Analysis and Its Applications, 15:733–754, July 1994.

[HMS94] L.R. Hunt, G. Meyer, and R. Su. Computing particular solutions. In Proceedings of the 33rd IEEE Conference on Decision and Control, volume 3, pages 2520–2521, Lake Buena Vista, December 1994. IEEE.

[HR92a] J. Huang and W. J. Rugh. An approximation method for the nonlinear servomechanism problem. IEEE Transactions on Automatic Control, 37(9):1395–1398, September 1992.

[HR92b] J. Huang and W. J. Rugh. Stabilization on zero-error manifolds and the nonlinear servomechanism problem. IEEE Transactions on Automatic Control, 37(7):1009–1013, July 1992.

[HSM92] J. Hauser, S. Sastry, and G. Meyer. Nonlinear control design for slightly non-minimum phase systems: application to V/STOL aircraft. Automatica, 28(4):665–679, July 1992.

[IB90] A. Isidori and C. I. Byrnes. Output regulation of nonlinear systems. IEEE Transactions on Automatic Control, 35(2):131–140, 1990.

[IM91] A. Isidori and C. H. Moog. On the nonlinear equivalent of the notion of transmission zeros. In C. I. Byrnes and A. Kursonski, editors, Modelling and Adaptive Control. Springer-Verlag, 1991.

[Isi89] A. Isidori. Nonlinear Control Systems: An Introduction. Springer-Verlag, New York, second edition, 1989.

[JLS88] J.-S. Jang, S.-Y. Lee, and S.-Y. Shin. An optimization network for matrix inversion. In D. Z. Anderson, editor, Neural Information Processing Systems, pages 397–401. American Institute of Physics, New York, 1988.

[Kad94] R.R. Kadiyala. Sys View: a visualization tool for viewing the regions of validity and attraction of nonlinear systems. In IEEE/IFAC Joint Symposium on Computer-Aided Control System Design, Tucson, 7-9 March 1994. IEEE/IFAC.

[Kai80] T. Kailath. Linear Systems. Prentice-Hall, 1980.

[Kha92] H.K. Khalil. Nonlinear Systems. Macmillan, New York, 1992.

[KHO86] P. Kokotovic, H. K. Khalil, and J. O'Reilly. Singular Perturbation Methods in Control: Analysis and Design. Academic Press, London, 1986.

[Kre92] A.J. Krener. The construction of optimal linear and nonlinear regulators. Systems, Models and Feedback: Theory and Applications, 1992.

[Man92] D. Manocha. Algebraic and numeric techniques for modeling and robotics. PhD thesis, University of California at Berkeley, Berkeley, California, 1992.

[Mat92] Matlab. The MathWorks, Inc., Natick, Mass., 1992.

[Mea89] C. Mead. Analog VLSI and Neural Systems. Addison-Wesley, Reading, Mass., 1989.

[MH83] J.E. Marsden and T.J. Hughes. Mathematical Foundations of Elasticity. Prentice-Hall Civil Engineering and Engineering Mechanics Series. Prentice-Hall, Englewood Cliffs, N.J., 1983.

[MLS94] R. M. Murray, Z. Li, and S. S. Sastry. A Mathematical Introduction to Robotic Manipulation. CRC Press, 1994.

[Nak91] Y. Nakamura. Advanced Robotics: Redundancy and Optimization. Addison-Wesley Series in Electrical and Computer Engineering: Control Engineering. Addison-Wesley, Reading, Mass., 1991.

[NDV92] R. Nikoukhah, F. Delebecque, and D. Van Wissel. Software for simulation and control of multibody systems. In Workshop on Dynamics and Control of Multibody Systems. Army Research Office, April 1992.

[NF72] Ju. I. Neimark and N. A. Fufaev. Dynamics of Nonholonomic Systems, volume 33 of Translations of Mathematical Monographs. American Mathematical Society, Providence, Rhode Island, 1972.

[NTV91a] S. Nicosia, A. Tornambè, and P. Valigi. A solution to the generalized problem of nonlinear map inversion. Systems and Control Letters, 17(5), 1991.

[NTV91b] S. Nicosia, A. Tornambè, and P. Valigi. Use of observers for the inversion of nonlinear maps. Systems and Control Letters, 16(6), 1991.

[NTV94] S. Nicosia, A. Tornambè, and P. Valigi. Nonlinear map inversion via state observers. Circuits, Systems, and Signal Processing, 13(5), 1994.

[PC89] T. S. Parker and L. O. Chua. Practical Numerical Algorithms for Chaotic Systems. Springer-Verlag, Berlin, New York, 1989.

[Rol72] R. D. Roland, Jr. Bicycle dynamics, tire characteristics, and rider modelling. Technical report, Cornell Aeronautical Laboratory, Inc., Buffalo, 1972.

[Rut54] H. Rutishauser. Ein infinitesimales Analogon zum Quotienten-Differenzen-Algorithmus. Archiv der Mathematik, 1954.

[Rut58] H. Rutishauser. Solution of eigenvalue problems with the LR-transformation. National Bureau of Standards Applied Mathematics Series, 49, 1958.

[SB89] S. S. Sastry and M. Bodson. Adaptive Control: Stability, Convergence, and Robustness. Prentice Hall, Englewood Cliffs, New Jersey, 1989.

[Sha71] R. S. Sharp, Jr. The stability and control of motorcycles. Mechanical Engineering Science, 13(5), 1971.

[Smi91] S. T. Smith. Dynamical systems that perform the singular value decomposition. Systems and Control Letters, 16(5), 1991.

[Spo96] M.W. Spong. The control of underactuated mechanical systems. In First International Conference on Mechatronics, 26-29 January 1996.

[SV89] M. Spong and M. Vidyasagar. Robot Dynamics and Control. Wiley, 1989.

[TD93] K. Tchon and I. Duleba. On inverting singular kinematics and geodesic trajectory generation for robotic manipulators. Journal of Intelligent and Robotic Systems, 8:325–359, 1993.

[Tor90a] A. Tornambè. An asymptotic observer for solving the inverse kinematic problem. In Proceedings of the American Control Conference, San Diego, 1990.

[Tor90b] A. Tornambè. Use of high-gain observers in the inverse kinematic problem. Applied Mathematical Letters, 3(1), 1990.

[Tor91] A. Tornambè. Asymptotic inverse dynamics of non-linear systems. International Journal of Systems Science, 22(12), 1991.

[vWN95] D. von Wissel and R. Nikoukhah. Obstacle-avoiding trajectory optimization: Example of a riderless bicycle. 23(2), 1995.

[Wan93] J. Wang. A recurrent neural network for real-time matrix inversion. Applied Mathematics and Computation, 55:89–100, 1993.

[Wat81] L. T. Watson. Engineering applications of the Chow-Yorke algorithm. Applied Mathematical Computation, 9:111–133, 1981.

[WE84] W.A. Wolovich and H. Elliot. A computational technique for inverse kinematics. In 23rd IEEE Conference on Decision and Control, Las Vegas, 12-14 December 1984. IEEE.

[Wei72] D. H. Weir. Motorcycle Handling Dynamics and Rider Control and the Effect of Design Configuration on Response and Performance. PhD thesis, University of California at Los Angeles, 1972.
Appendix A
Notation and Terminology

We gather here some notation and definitions used throughout this dissertation.

Z+  The nonnegative integers {0, 1, 2, ...}.

a := b  The expression a := b or b =: a defines a as being equivalent to the expression b.

R  The real numbers.

C  The complex numbers.

R^n  The real n-vectors.

R^{m×n}  The real m × n matrices.

E^n  The n-dimensional Euclidean space formed from R^n and the Euclidean metric ‖x‖ = (Σ_{i=1}^n x_i^2)^{1/2} for x ∈ R^n.

C_-°  The open left half of the complex plane, consisting of all s ∈ C having strictly negative real part.

C_+°  The open right half of the complex plane, consisting of all s ∈ C having strictly positive real part.

k  For any k ∈ Z+, k ≥ 1, let k denote the set of integers {1, 2, ..., k}.

r̄  For a list of positive integers r, r̄ is the largest positive integer in r.

R+  Define R+ := {t ∈ R | t ≥ 0}, the set from which we draw our values of time, t.

∅  The empty set.
sign(a)  For a ∈ R,

    sign(a) = 1 if a > 0,  sign(a) = −1 if a < 0.

We will consider sign(0) to be undefined.

y^(k)(t)  The kth derivative with respect to time, t, of a function y(t) will be denoted y^(k)(t), where k is in Z+, and y^(0)(t) = y(t).

y^(n1,n2)(t)  For y(t) ∈ R^n, and n1, n2 ∈ Z+ with n1 < n2, y^(n1,n2)(t) := [y^(n1), y^(n1+1), ..., y^(n2)](t) ∈ R^{n2−n1+1}.

C_n^k[0, ∞)  The set of k times continuously differentiable functions on the interval [0, ∞).

C^k  Let i := (i1, ..., ik) denote any length k combination of the k integers {1, ..., k}. A map F : X → Y whose partial derivatives

    ∂^k F(x) / (∂x_{i1} ∂x_{i2} ··· ∂x_{ik})

are defined and continuous for all x ∈ X and for all length k combinations i is said to be in C^k on X.

‖x‖_p  The p-norm ‖x‖_p of a vector x ∈ R^n is defined by

    ‖x‖_p = ( Σ_{i=1}^n |x_i|^p )^{1/p}.

The norm ‖φ‖_∞ of a vector-valued function φ : R+ → R^n; t ↦ φ(t) is defined as ‖φ‖_∞ := sup_t ( max_{i∈n} |φ_i(t)| ).
‖(Γ, θ)‖_2  For (Γ, θ) ∈ R^{n×n} × R^n, let

    ‖(Γ, θ)‖_2 := ( Σ_{i,j=1}^n |Γ_ij|^2 + Σ_{i=1}^n θ_i^2 )^{1/2}.    (A.1)

This is the l2 norm of the n × (n + 1) matrix [Γ | θ].

M^R, M^L  If M ∈ R^{m×n}, m ≤ n, is full rank, then M^R := M^T (M M^T)^{−1} ∈ R^{n×m} is the right inverse of M. Note that M M^R = I ∈ R^{m×m}. If M ∈ R^{m×n}, m ≥ n, is full rank, then M^L := (M^T M)^{−1} M^T ∈ R^{n×m} is the left inverse of M, and M^L M = I ∈ R^{n×n}.

B_r  For each dimension n we define the open ball B_r := {x ∈ R^n : ‖x‖ < r}. The choice of a particular norm ‖·‖ will be apparent from context. In order to emphasize the dimension of B_r we will often specify the set of the same dimension of which B_r is a subset, e.g. B_r ⊂ R^n.
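As a quick numerical check of the right- and left-inverse formulas for M^R and M^L above, the following sketch (Python with numpy; the particular matrices are arbitrary full-rank examples, not from the text) verifies M M^R = I and M^L M = I:

```python
import numpy as np

# A full-rank wide matrix M (m = 2 <= n = 3): right inverse M^R = M^T (M M^T)^{-1}.
M = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])
MR = M.T @ np.linalg.inv(M @ M.T)
assert np.allclose(M @ MR, np.eye(2))   # M M^R = I in R^{2x2}

# A full-rank tall matrix N (m = 3 >= n = 2): left inverse N^L = (N^T N)^{-1} N^T.
N = M.T
NL = np.linalg.inv(N.T @ N) @ N.T
assert np.allclose(NL @ N, np.eye(2))   # N^L N = I in R^{2x2}
```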
y(·)  For a function f : A → B, f(·) refers to the function f evaluated on its entire domain A, as opposed to f(x), which refers to the function f evaluated at a single x ∈ A. Thus while f(x) is a value in B, f(·) is an element of the function space of all functions having domain A and codomain B.

‖·‖_(k)  For y : [0, ∞) → R^n, y ∈ C^k, ‖y(·)‖_(k) = sup_{t≥0} { ‖y(t)‖_∞, ‖y^(1)(t)‖_∞, ..., ‖y^(k)(t)‖_∞ }.

B_r^(k)  For y : [0, ∞) → R^n, y ∈ C^k, and real number r > 0, B_r^(k) = { y | ‖y‖_(k) < r }.

vf(Σ)  Given a dynamic system Σ : ẋ = F(x, w, u) with input u and exogenous parameters w, we define vf(Σ) := F(x, w, u), so that vf(Σ) is the vector field associated with the dynamic system Σ.

sα, cα  For an angle α, sα := sin(α) and cα := cos(α).

D_k F(a1, a2, ..., an)  For any map F(a1, a2, ..., an), D_k F(a1, a2, ..., an) is the partial derivative of F with respect to a_k. The lth derivative with respect to the kth argument is denoted D_k^l F(a1, a2, ..., an).

D_{j,k} F(a1, a2, ..., an)  The mixed partial derivative of F with respect to the jth and then the kth argument,

    D_{j,k} F(a1, a2, ..., an) := (∂/∂a_j)(∂/∂a_k) F(a1, ..., a_j, ..., a_k, ..., an).

D_k^l F(a1, a2, ..., an)  The repeated partial derivative of F with respect to the kth argument,

    D_k^l F(a1, a2, ..., an) := (∂/∂a_k) ··· (∂/∂a_k) F(a1, ..., a_k, ..., an)   (l times).

L_f φ(x)  For smooth vector fields f and g, and a real-valued function φ : R^n → R, the Lie derivative of φ in the direction of f, denoted L_f φ, is defined as L_f φ(x) := dφ(x) · f(x). The expression L_g L_f φ(x) denotes L_g(L_f φ(x)), and the k-times repeated Lie derivative L_f L_f ... L_f φ(x) is denoted L_f^k φ. By convention L_f^0 φ(x) ≡ φ(x).
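The defining formula L_f φ(x) = dφ(x) · f(x) can be checked numerically. The sketch below (Python; the field f and function φ are illustrative choices, not from the text) approximates the gradient dφ(x) by central differences and contracts it with f(x):

```python
import numpy as np

def lie_derivative(phi, f, x, eps=1e-6):
    """Numerically approximate L_f phi(x) = dphi(x) . f(x) by central differences."""
    n = len(x)
    grad = np.zeros(n)
    for i in range(n):
        e = np.zeros(n)
        e[i] = eps
        grad[i] = (phi(x + e) - phi(x - e)) / (2 * eps)
    return grad @ f(x)

phi = lambda x: x[0]**2 + x[1]          # phi : R^2 -> R
f = lambda x: np.array([x[1], -x[0]])   # a smooth vector field on R^2

x0 = np.array([1.0, 2.0])
# Analytically, L_f phi(x) = 2 x1 x2 - x1, which equals 3 at x0 = (1, 2).
print(lie_derivative(phi, f, x0))
```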
L̄_f φ(x, t)  For smooth t-dependent vector fields f and g, and a real-valued function φ : R^n × [0, ∞) → R, L̄_f φ is defined as

    L̄_f φ(x, t) := D_1 φ(x, t) · f(x, t) + D_2 φ(x, t).

The expression L̄_g L̄_f φ(x, t) denotes L̄_g(L̄_f φ(x, t)), and the k-times repeated Lie derivative L̄_f L̄_f ... L̄_f φ(x, t) is denoted L̄_f^k φ. By convention L̄_f^0 φ(x, t) ≡ φ(x, t).

Lipschitz  A map f(x) is Lipschitz continuous, or Lipschitz, on an open set S ⊂ R^n if there exists a non-negative real-valued piecewise continuous function g : R+ → R+ such that for all x, y ∈ S, ‖f(x) − f(y)‖ ≤ g(t)‖x − y‖.

O(g(x))  If for some fixed M ∈ R, 0 < M < ∞, and for some positive real-valued function g : R^n → R, it is true that lim_{x→x0} ‖f(x)‖/|g(x)| ≤ M, then we say that f(x) = O(g(x)) as x → x0. If h(x) = Σ_{i=1}^k f_i(x) where f_i(x) = O(g_i(x)), i ∈ k, then we say that h(x) = O(g_1(x), ..., g_k(x)).

T  The interval [−π, π] ⊂ R with the points −π and π identified is the torus, denoted T.
‖A‖_2  The l2 norm on A = [A_ij], i, j ∈ n, A ∈ R^{n×n}, is defined by

    ‖A‖_2 := ( Σ_{i,j∈n} |A_ij|^2 )^{1/2}.    (A.2)

The l∞ norm on A is defined by ‖A‖_∞ := max_{i,j∈n} |A_ij|.

GL(n, R)  The group of nonsingular n-by-n matrices, {M ∈ R^{n×n} | det(M) ≠ 0}.

O(n, R)  The group of orthogonal n-by-n matrices, {M ∈ R^{n×n} | M^T M = I}.

S(n, R)  The vector space of symmetric n-by-n matrices, {M ∈ R^{n×n} | M^T = M}.

SE(3)  The special Euclidean group of transformations R^3 → R^3; x ↦ Rx + p, with x and p in R^3, and where R ∈ R^{3×3} satisfies det(R) = 1 and R R^T = I.

s(n) := n(n + 1)/2  The dimension of S(n, R), i.e. s(n) := (1/2) n(n + 1).

X̌, (X)ˇ, x̂, (x)ˆ  Given any X ∈ S(n, R), and a particular ordered basis {β_1, ..., β_{s(n)}} of S(n, R), assume that X = Σ_{i∈s(n)} x_i β_i with x_i ∈ R. Then X̌ ∈ R^{s(n)}, with X̌ ≡ (X)ˇ := [x_1, ..., x_{s(n)}]^T. Also, given any x ∈ R^{s(n)}, x̂ ≡ (x)ˆ := Σ_{i∈s(n)} x_i β_i ∈ S(n, R).
σ(M)  The spectrum of M ∈ GL(n) is the set of eigenvalues of M and is denoted σ(M).

θ, θ*  Consider a vector θ ∈ R^n, and an associated map F : R^n × R+ → R^n; (θ, t) ↦ F(θ, t). We will associate with θ the symbol θ*, meaning an exact solution of F(θ, t) = 0. When θ is the state of a dynamic system, it may be regarded as an approximator of θ*. Otherwise, θ refers to the first argument of F(θ, t).

F̃(z, t)  Given maps F(θ, t) and G(w, θ, t) and a continuous isolated equilibrium θ*(t) satisfying F(θ*(t), t) = 0 for all t, we define F̃(z, t) := F(z + θ*, t) and G̃(w, z, t) := G(w, z + θ*, t).

class K  A continuous function α : [0, r) → R+ is in class K if α(0) = 0 and α is strictly increasing. Note that the sum of two class K functions is also of class K, as is their product and their composition α_1(α_2(·)). Also, if α(·) is class K then so is its inverse α^{−1}(·).

class KL  A continuous function β : [0, a) × R+ → R+ belongs to class KL if for each fixed t1 ∈ R+, r ↦ β(r, t1) is in class K, and for each fixed r1 ∈ [0, a), t ↦ β(r1, t) is decreasing with respect to t, with β(r, t) → 0 as t → ∞.

End-of-block markers  One marker symbol denotes the end of theorems, corollaries, lemmas, and propositions; a second denotes the end of proofs of theorems, corollaries, lemmas, and propositions; a third denotes the end of remarks, examples, definitions, claims, assumptions, algorithms, and properties.
Appendix B
Some Useful Theorems

In this appendix we gather, for the reader's convenience, some useful theorems and techniques referred to in, but not original to, this dissertation.

B.1  A Comparison Theorem

The following theorem, which we will refer to as the comparison theorem, will prove useful.

Theorem B.1.1 (Comparison Theorem). Let f : R → R; x ↦ f(x) and g : R → R; x ↦ g(x) be Lipschitz in x. Let x(t) denote the solution to ẋ = f(x) with x(0) = x0, and let y(t) denote the solution to ẏ = g(y) with y(0) = y0 ≥ x0. Assume that for all x ∈ R, f(x) ≤ g(x). Then for all t ∈ R+, x(t) ≤ y(t).
Proof: See Hartman [Har82], Theorem 4.1, page 26.
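The ordering guaranteed by the comparison theorem is easy to observe numerically. The sketch below (Python with an assumed forward-Euler integrator; the scalar fields f(x) = −x − 1 ≤ g(x) = −x are illustrative choices, not from the text) integrates both equations from x0 = y0 = 1 and checks x(t) ≤ y(t) along the trajectory:

```python
import numpy as np

def euler(f, x0, dt, steps):
    """Forward-Euler solution of x' = f(x), returned as the sampled trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] + dt * f(xs[-1]))
    return np.array(xs)

f = lambda x: -x - 1.0   # f(x) <= g(x) for all x
g = lambda x: -x

x = euler(f, 1.0, 1e-3, 5000)   # x(0) = 1
y = euler(g, 1.0, 1e-3, 5000)   # y(0) = 1 >= x(0)
assert np.all(x <= y + 1e-9)    # x(t) <= y(t) along the whole trajectory
```

Here the exact solutions are x(t) = 2e^{-t} − 1 and y(t) = e^{-t}, so the ordering can also be confirmed by hand.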
B.2  Taylor's Theorem

We will rely upon Taylor's theorem in some of our arguments. For convenience we include a version here from [GMW81].

Theorem B.2.1 (Taylor's Theorem). If f(x) ∈ C^r, then there exists a scalar α, with 0 ≤ α ≤ 1, such that

    f(x + h) = f(x) + h f′(x) + (1/2) h² f″(x) + ··· + (1/(r−1)!) h^{r−1} f^{(r−1)}(x) + (1/r!) h^r f^{(r)}(x + αh),

where f^{(r)}(x) denotes the rth derivative of f evaluated at x.

B.3  Singularly Perturbed Systems

The following theorem, from Khalil [Kha92], provides sufficient conditions under which one may conclude exponential stability of a singularly perturbed system.

Theorem B.3.1 Consider the system

    ẋ = f(t, x, z, ε)
    ε ż = g(t, x, z, ε)    (B.1)

Assume that for all (t, x, ε) ∈ [0, ∞) × B_r × [0, ε0]:

i. f(t, 0, 0, ε) = 0 and g(t, 0, 0, ε) = 0.

ii. The equation 0 = g(t, x, z, 0) has an isolated solution z = h(t, x) such that h(t, 0) = 0.

iii. The functions f, g, and h and their partial derivatives up to order 2 are bounded for z − h(t, x) ∈ B_ρ.

iv. The origin of ẋ = f(t, x, h(t, x), 0) is exponentially stable.

v. The origin of

    dy/dτ = g(t, x, y + h(t, x), 0)

is exponentially stable uniformly in (t, x).

Then there exists ε̄ > 0 such that for all ε < ε̄, the origin of (B.1) is exponentially stable.
Proof: See Khalil [Kha92], Theorem 8.3, page 467.
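As an illustration of the two-time-scale structure in Theorem B.3.1, the following sketch (Python; the particular slow/fast pair is a hypothetical example, not from the text) simulates ẋ = −2x + z, εż = −(z − x) with ε = 0.01. Here the quasi-steady state is h(t, x) = x, and both the reduced system ẋ = −x and the boundary-layer system dy/dτ = −y are exponentially stable, so the full state should contract to the origin:

```python
import numpy as np

eps = 0.01

def step(x, z, dt):
    # Slow dynamics:  x' = -2x + z.   Fast dynamics:  eps z' = -(z - x).
    return x + dt * (-2 * x + z), z + (dt / eps) * (-(z - x))

x, z = 1.0, -1.0
dt = 1e-4                 # small relative to the fast time scale eps
for _ in range(100000):   # simulate 10 time units
    x, z = step(x, z, dt)

print(abs(x), abs(z))     # both near zero: the origin is exponentially stable
```

The fast variable z first snaps onto the quasi-steady state z ≈ x on the ε time scale, after which both states decay on the slow time scale.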
B.4  Tracking Convergence for Integrator Chains

A standard result of linear control theory, as elementary as it is useful, is the following:

Theorem B.4.1 Consider the control system

    ξ̇_i^j = ξ_i^{j+1},  j ∈ r_i − 1, i ∈ p
    ξ̇_i^{r_i} = u_i,  i ∈ p
    y_i = ξ_i^1,  i ∈ p    (B.2)

Let y_d(t) ∈ C_p^{r̄} where r̄ := max_{i∈p}{r_i}. Let β_i^j ∈ R be such that all of the roots of the polynomials

    s^{r_i} + Σ_{j=1}^{r_i} β_i^j s^{j−1},  i ∈ p    (B.3)

have strictly negative real parts. Then the control

    u_i = y_{d_i}^{(r_i)} − Σ_{k=1}^{r_i} β_i^k ( ξ_i^k − y_{d_i}^{(k−1)} ),  i ∈ p    (B.4)

causes y(t) to converge to y_d(t) exponentially.

Remark B.4.2 The utility of the controller (B.4) in the context of nonlinear control is greatly enhanced by feedback linearization of nonlinear systems (see Appendix C), in which a state-dependent coordinate transformation converts a nonlinear system to the form (B.2), after which (B.4) may be applied in order to render the nonlinear system stable.

Proof of Theorem B.4.1: Define the coordinate change

    e_i^j := ξ_i^j − y_{d_i}^{(j−1)},  i ∈ p, j ∈ r_i    (B.5)

The coordinates e_i^j are referred to as error coordinates. In the error coordinates, the system (B.2) takes the form

    ė_i^j = e_i^{j+1},  j ∈ r_i − 1, i ∈ p
    ė_i^{r_i} = u_i − y_{d_i}^{(r_i)}    (B.6)

and the input (B.4) takes the form

    u_i = y_{d_i}^{(r_i)} − Σ_{k=1}^{r_i} β_i^k e_i^k,  i ∈ p.    (B.7)

Combining (B.6) with (B.7) gives

    ė_i^j = e_i^{j+1},  j ∈ r_i − 1, i ∈ p
    ė_i^{r_i} = − Σ_{k=1}^{r_i} β_i^k e_i^k,  i ∈ p    (B.8)

Equation (B.8) is referred to as the error dynamics of the tracking system. Let e_i := [e_i^1, ..., e_i^{r_i}]^T, i ∈ p. Then (B.8) is equivalent to

    d/dt [e_1, e_2, ..., e_p]^T = A [e_1, e_2, ..., e_p]^T,  A := blockdiag(A_1, A_2, ..., A_p)    (B.9)

where

    A_i := [ 0        1        ···   0          ]
           [ 0        0        ···   0          ]
           [ ⋮        ⋮        ⋱     ⋮          ]
           [ 0        0        ···   1          ]
           [ −β_i^1   −β_i^2   ···   −β_i^{r_i} ]    (B.10)

The matrix A is in block companion form.¹ Its eigenvalues are, therefore, the union of the roots of the polynomials (B.3). By assumption all of those roots have strictly negative real parts. Therefore the origin of the error dynamics is exponentially stable. Exponential stability of the origin of the error dynamics implies that y(t) − y_d(t) goes to zero exponentially.
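A minimal numerical instance of Theorem B.4.1 (Python; the reference y_d(t) = sin t and the pole locations are illustrative choices, not from the text): a double integrator (p = 1, r_1 = 2) driven by the control (B.4) with β^1 = β^2 = 4, which places both roots of (B.3) at s = −2.

```python
import numpy as np

# Single chain (p = 1, r = 2): xi1' = xi2, xi2' = u, y = xi1.
# Gains beta1, beta2 place the roots of s^2 + beta2 s + beta1 at s = -2, -2.
beta1, beta2 = 4.0, 4.0

yd  = lambda t: np.sin(t)    # reference output
yd1 = lambda t: np.cos(t)    # its first derivative
yd2 = lambda t: -np.sin(t)   # its second derivative

dt, T = 1e-3, 10.0
xi = np.array([0.5, 0.0])    # start off the reference
t = 0.0
while t < T:
    # Control (B.4): u = yd'' - beta1 (xi1 - yd) - beta2 (xi2 - yd').
    u = yd2(t) - beta1 * (xi[0] - yd(t)) - beta2 * (xi[1] - yd1(t))
    xi = xi + dt * np.array([xi[1], u])
    t += dt

print(abs(xi[0] - yd(T)))    # tracking error is small after the transient
```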
B.5  A Converse Theorem

The following theorem, from Khalil [Kha92] (Theorem 4.5, page 180), states that if a nonlinear system has an exponentially stable equilibrium, then it has a Lyapunov function on a neighborhood of that equilibrium.

Theorem B.5.1 For x ∈ B_r ⊂ R^n, let x = 0 be an equilibrium of

    ẋ = f(t, x)    (B.11)

where f : R+ × B_r → R^n is C¹ in t and x, and ∂f/∂x is bounded on B_r uniformly in t. Let k, γ, and r0 be positive constants with r0 < r/k. For t0 ≥ 0, assume that for each x(t0) in B_{r0},

    ‖x(t)‖_2 ≤ k ‖x(t0)‖_2 e^{−γ(t−t0)},  ∀ t ≥ t0,    (B.12)

i.e. the origin of (B.11) is exponentially stable uniformly in t. Then there is a function V : R+ × B_r → R, and positive constants c1, c2, c3, and c4, such that

    c1 ‖x‖_2² ≤ V(t, x) ≤ c2 ‖x‖_2²    (B.13)

    ∂V/∂t + (∂V/∂x) f(t, x) ≤ −c3 ‖x‖_2²    (B.14)

    ‖∂V/∂x‖_2 ≤ c4 ‖x‖_2    (B.15)

If f is independent of t, then V can be chosen to be independent of t.

¹ See Horn and Johnson [HJ85], page 147 for a discussion of companion matrices, their properties, and their relation to polynomials.
Proof: See Khalil [Kha92], Theorem 4.5, pages 180-183.
B.6  Uniform Ultimate Boundedness

The following definition is from [Kha92].

Definition B.6.1 The solutions of ẋ = f(t, x) are said to be uniformly ultimately bounded if there exist constants b > 0 and c > 0 such that for each a ∈ (0, c) and each t0 ∈ R+, there exists a T = T(a) > 0 such that

    ‖x(t0)‖ < a ⇒ ‖x(t)‖ < b,  ∀ t > t0 + T    (B.16)

Regarding Definition B.6.1 we have the following theorem, also from Khalil [Kha92].

Theorem B.6.2 (Uniform Ultimate Boundedness). For x ∈ B_r ⊂ R^n, let f : R+ × B_r → R^n be piecewise continuous in t and locally Lipschitz in x. Let V : R+ × B_r → R be C¹ in x and t. Let α_i(·), i ∈ 3, be class K functions such that for each x ∈ B_r,

    α_1(‖x‖) ≤ V(t, x) ≤ α_2(‖x‖)    (B.17)

    ∂V/∂t + (∂V/∂x) f(t, x) ≤ −α_3(‖x‖),  ∀ ‖x‖ ≥ µ > 0    (B.18)

for all t ≥ 0, with µ < α_2^{−1}(α_1(r)). Then there exist a class KL function β(·, ·) and a finite time t1 (dependent on x(t0) and µ) such that for each ‖x0‖ < α_2^{−1}(α_1(r)),

    ‖x(t)‖ ≤ β(‖x(t0)‖, t − t0),  ∀ t0 ≤ t < t1    (B.19)

    ‖x(t)‖ ≤ α_1^{−1}(α_2(µ)),  ∀ t ≥ t1    (B.20)

Also, if α_i(r) = k_i r^c, for k_i > 0 and c > 0, then β(r, s) = k r exp(−γs) with k = (k2/k1)^{1/c} and γ = k3/(k2 c).
Proof: See Khalil [Kha92], Theorem 4.10, page 202.
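The following sketch illustrates uniform ultimate boundedness on a hypothetical scalar example (Python; not from the text): ẋ = −x + d(t) with ‖d‖_∞ ≤ 0.5. Taking V = x²/2 gives V̇ = −x² + x d ≤ −x²/2 whenever |x| ≥ 2‖d‖_∞, so Theorem B.6.2 applies, and a trajectory starting at x(0) = 3 enters and remains in a small ball:

```python
import numpy as np

# x' = -x + d(t) with |d(t)| <= 0.5: solutions are uniformly ultimately bounded.
d = lambda t: 0.5 * np.sin(3.0 * t)

dt = 1e-3
x, t = 3.0, 0.0            # start well outside the ultimate bound
hist = []
while t < 20.0:
    x += dt * (-x + d(t))
    t += dt
    hist.append((t, x))

tail = [abs(xv) for (tv, xv) in hist if tv > 10.0]
print(max(tail))           # ultimately |x(t)| stays small despite x(0) = 3
```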
Appendix C
Partial Feedback Linearization of Nonlinear Control Systems

In this appendix, through state-dependent coordinate and input transformations, we will put a nonlinear time-invariant control system of the form

    ẋ = f(x) + g(x) u
    y = h(x)    (C.1)

into a form which displays a linear relationship between the input u ∈ R^p and the output y ∈ R^p. The state x is assumed to be in R^n. The column vector fields f(x) and g_i(x), i ∈ p, are assumed to be smooth on an open set of R^n, as are the functions h_i(x), i ∈ p. The standard linearization procedure from [Isi89] for autonomous systems will be used.

Let r := [r_1, ..., r_p], with integers r_i ≥ 1. A system (C.1) has vector relative degree [Isi89] r at a point x0 if

i. for all i, j satisfying 1 ≤ i ≤ p, 1 ≤ j ≤ p, for all k < r_i − 1, and for all x in a neighborhood of x0,

    L_{g_j} L_f^k h_i(x) = 0    (C.2)

ii. the p × p decoupling matrix

    β(x) := [ L_{g_1} L_f^{r_1−1} h_1(x)   ···   L_{g_p} L_f^{r_1−1} h_1(x) ]
            [ L_{g_1} L_f^{r_2−1} h_2(x)   ···   L_{g_p} L_f^{r_2−1} h_2(x) ]
            [ ⋮                                  ⋮                          ]
            [ L_{g_1} L_f^{r_p−1} h_p(x)   ···   L_{g_p} L_f^{r_p−1} h_p(x) ]    (C.3)

is nonsingular for all x in a neighborhood of x0.
Assume that (C.1) has well-defined vector relative degree r. This implies that if we successively differentiate y_i(t) with respect to t, then a component u_j(t) of the input u(t) appears for the first time at the r_i-th derivative of y_i. We use this insight as follows. Define the partial coordinate change

    ζ_k^i = ψ_k^i(x) := L_f^{k−1} h_i(x),  i ∈ p, k ∈ r_i.    (C.4)

Let n_r := Σ_{i∈p} r_i. Let φ_i(x), i ∈ {1, ..., n − n_r}, be any smooth functions of x, with η^i = φ_i(x), which complete the partial coordinate change (C.4), and such that (ζ, η) = Ψ(x) := (ψ(x), φ(x)) satisfies Ψ(0) = (0, 0) and |det(DΨ(x))| ≥ ε > 0. Let

    ζ := [ζ_1^1, ζ_2^1, ..., ζ_{r_1}^1, ζ_1^2, ..., ζ_{r_2}^2, ..., ζ_{r_p}^p]^T
       = [y_1, ẏ_1, ..., y_1^{(r_1−1)}, y_2, ..., y_2^{(r_2−1)}, ..., y_p^{(r_p−1)}]^T    (C.5)

In the new coordinate system (C.1) takes the form

    ζ̇_i^j = ζ_i^{j+1},  j ∈ r_i − 1, i ∈ p
    ζ̇_i^{r_i} = α_i(ζ, η) + β_i(ζ, η) u,  i ∈ p
    η̇ = q_1(ζ, η) + q_2(ζ, η) u    (C.6)

    y_i = ζ_i^1,  i ∈ p    (C.7)

with α_i(ζ, η) ∈ R, β_i(ζ, η) ∈ R^{1×p}, q_1(ζ, η) ∈ R^{n−n_r}, and q_2(ζ, η) ∈ R^{(n−n_r)×p}. Note that

    β(Ψ^{−1}(ζ, η)) =: β(ζ, η) = [ β_1(ζ, η) ]
                                 [ β_2(ζ, η) ]
                                 [ ⋮         ]
                                 [ β_p(ζ, η) ]    (C.8)

By the assumption of vector relative degree, β(Ψ^{−1}(ζ, η)) is nonsingular. Let α(ζ, η) := [α_1, ..., α_p]^T(ζ, η). Then defining the control law

    u := β(ζ, η)^{−1} ( v − α(ζ, η) )    (C.9)
puts our system in the form

    ζ̇_i^j = ζ_i^{j+1},  j ∈ r_i − 1, i ∈ p
    ζ̇_i^{r_i} = v_i,  i ∈ p
    η̇ = s_1(ζ, η) + s_2(ζ, η) v
    y_i = ζ_i^1,  i ∈ p    (C.10)

where

    s_1(ζ, η) := q_1(ζ, η) − q_2(ζ, η) β(ζ, η)^{−1} α(ζ, η)    (C.11)

and

    s_2(ζ, η) := q_2(ζ, η) β(ζ, η)^{−1}    (C.12)

The ζ part of the dynamics of system (C.10) is in the form of p integrator chains, each of length r_i, i ∈ p. The η dynamics of (C.10) are the unobservable, internal dynamics of (C.1).
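The construction above can be carried out symbolically. The sketch below (Python with sympy; the pendulum-like system is a hypothetical single-input example, not one treated in the text) computes the relative degree, the decoupling term β(x), and α(x) for ẋ1 = x2, ẋ2 = −sin x1 + u, y = x1, and verifies that the control (C.9) yields ÿ = v:

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
x = sp.Matrix([x1, x2])

# Pendulum-like SISO system: x1' = x2, x2' = -sin(x1) + u, y = x1.
f = sp.Matrix([x2, -sp.sin(x1)])
g = sp.Matrix([0, 1])
h = x1

# Lie derivative L_v phi = dphi(x) . v(x).
lie = lambda v, phi: (sp.Matrix([phi]).jacobian(x) @ v)[0, 0]

# Relative degree 2: L_g h = 0 and L_g L_f h != 0.
assert lie(g, h) == 0
beta = lie(g, lie(f, h))    # decoupling "matrix" (a scalar here)
alpha = lie(f, lie(f, h))   # drift term alpha(x)

# The linearizing control u = beta^{-1} (v - alpha), as in (C.9), gives y'' = v.
v = sp.Symbol('v')
u_lin = (v - alpha) / beta
ydd = lie(f, lie(f, h)) + lie(g, lie(f, h)) * u_lin
assert sp.simplify(ydd - v) == 0
```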