1997 IEEE Conference on Decision and Control (San Diego, CA)
Feedback Stabilization of Nonlinear Systems: a Path Space Iteration Approach
Fernando Lizarralde, John T. Wen, and Liu Hsu
Abstract: In this paper we consider the feedback stabilization of nonlinear systems. The design methodology is based on a class of iterative methods which has recently been proposed for the path planning and feedback stabilization of nonholonomic systems. We show that this scheme guarantees closed-loop asymptotic stability for a class of nonlinear systems.
Fernando Lizarralde is with the Department of Electronic Eng., Eng. School, Federal University of Rio de Janeiro, 21945/970, Rio de Janeiro (Brazil). e-mail: [email protected]. John Wen is with the Center for Advanced Technology in Automation, Robotics and Manufacturing, Dept. of Electrical, Computer and Systems Eng., Rensselaer Polytechnic Institute, Troy, NY 12180. e-mail: [email protected]. Liu Hsu is with the Department of Electrical Eng., COPPE/Federal University of Rio de Janeiro, 21945/970, Rio de Janeiro (Brazil). e-mail: [email protected].

1 Introduction

In this paper we propose a general control procedure based on the path space iterative approach [1, 12, 13, 11]. The key idea of this approach is to transform the problem into a root finding problem which can be solved using the Newton method. The method converges to a feasible path provided that a singular path (a path about which the linearized system is uncontrollable) is not encountered during the iteration. These methods are also referred to in the literature as continuation or homotopy methods, which are widely used in numerical analysis [9, 13]. The appeal of this method lies in its generality and in its ability to include inequality constraints (e.g., saturation limits of actuators). The algorithm is iterative in nature, with no a priori specifiable convergence time.

By using a receding horizon concept (cf. [8]), the feedback stabilization problem can be embedded in this approach. In standard receding horizon control, the current control at state x and time t is obtained by determining on-line the optimal control û(·) over the interval [t, t + T] and setting the current control equal to û(t). Repeating this calculation continuously yields a feedback control (since û(t) clearly depends on the current state x). The difference from the conventional implementation considered in this paper is that the control is executed after one Newton step, rather than after waiting for the iteration to converge at each step (i.e., solving the optimal control problem). This approach was first considered to solve the feedback stabilization problem of nonholonomic systems [5, 7], where inequality constraints (obstacle avoidance and steering angle limits for a car example) can also be taken into account. The proposed approach can be regarded as a methodology for Model Predictive Control (MPC) based on a function-space optimization of control inputs.

It can be shown that the predicted states and the system states converge exponentially to the goal states. The exponential convergence implies a certain amount of robustness with respect to noise and model uncertainty. In order to illustrate the generality of the approach, a challenging problem is considered: the control of an inverted pendulum (cart-pole system).

2 Feedback Stabilization

Consider a time-invariant nonlinear system affine in the control,

  ẋ = f_0(x) + f(x)u      (1)

where f_0(x) and f(x) are smooth vector fields, x ∈ IR^n and u ∈ IR^m. We assume full state measurement and global controllability (i.e., for all x_0, x_f ∈ IR^n there exist T > 0 and u : [0, T] → U such that x(0) = x_0 and x(T) = x_f, where u(t) ∈ U ⊂ IR^m).

An alternate way to view (1) is to regard it as a nonlinear functional mapping of the input function u ∈ L_2^m[t, t + T) to the final state x(t + T):

  x(t + T) = φ(T, x(t), u)      (2)

which indicates the state at time t + T resulting from starting at time t in state x and applying the input function u. The stabilization problem can then be stated as a nonlinear zero-crossing problem:

  e(T, x(t), u) = φ(T, x(t), u) − x_d      (3)

where x_d is the desired equilibrium state. Global controllability means that for all x(0) := x_0 (fixed x_0) and all x_d there is at least one solution u to e(T, x_0, u) = 0 (finite T), which implies that the nonlinear map φ(T, x_0, ·) : L_2^m[0, T] → IR^n is onto for all x_0 ∈ IR^n.

2.1 Path Planning Problem

Many numerical algorithms exist for the solution of this problem with fixed x_0. In general, the solution involves lifting a path connecting the initial e to the desired e = 0 to the u space. Let u^(0) be the first guess of the input function and e(T, x_0, u^(0)) be the corresponding final state error as given by (3). The goal is to iteratively modify u so that e converges to zero. The derivative of e(τ) (:= e(T, x_0, u(τ)), a convenient but also an abuse of notation) with respect to the iteration variable τ is given by:

  de/dτ = ∇_u φ(T, x_0, u) du/dτ      (4)

where ∇_u denotes the Fréchet derivative of φ(T, x_0, ·) with respect to u. If ∇_u φ(T, x_0, u) is full rank, then we can choose the following update rule for u(τ) (see [11] for other choices):

  du/dτ = −α [∇_u φ(T, x_0, u)]† e(τ)      (5)

where α > 0 and [∇_u φ(T, x_0, u)]† denotes the Moore-Penrose pseudoinverse of ∇_u φ(T, x_0, u). This is essentially the continuous version of Newton's method. Equation (5) is an initial value problem in u with a chosen u^(0), which can be solved using any integration method (cf. Davidenko's method in [9]). The gradient ∇_u φ(T, x_0, u) can be computed from the system (1) linearized about the path corresponding to u (see [10]). A sufficient condition for the convergence of the iterative algorithm (5) is that ∇_u φ(T, x_0, u(τ)) be full rank for all τ or, equivalently, that the time-varying linearized system about the path generated by each u(τ) be controllable, i.e., that the system (1) be locally controllable along the trajectory generated by u. Under the full rank condition on ∇_u φ, substituting (5) into (4) yields

  de/dτ = −α e

implying the convergence of e(τ) to zero.

2.2 Receding horizon strategy

Here we present a modification of the above iterative approach to render it in a feedback form. The main idea is to perform the iteration on the variable τ and the execution of the control u(t) simultaneously. At each time, the current state x(t) is used as the initial condition for the map φ(T, ·, u) and the gradient ∇_u φ(T, ·, u) (see (3) and (5)). After the control function u is refined with one Newton step, the control at time t (u := u(t)) is executed to drive the system to a new state, and the procedure repeats. Since the Newton step guarantees that the predicted error is strictly decreasing, it is possible to show the convergence of the state to the desired value.

To describe the above procedure analytically, it is convenient to consider the system discretized in time. Denote the control vector over the k-th time window as

  u^i(k) = [u_1(k), …, u_i(k)],  u^i(k) ∈ IR^{m·i}      (6)

where i is an integer which defines the time window T = ih (h is the sampling period) and u_j(k) (j ∈ [1, i]) is the control at time k + j − 1, i.e., u_j(k) := u(k + j − 1) (u_j(·), u(·) ∈ IR^m). The i-step ahead predicted state error (a discrete-time version of (3)) is defined by:

  e^i(k) = φ^i(x(k), u^i(k)) − x_d      (7)

where the i-step ahead predicted state is denoted by x(k + i) = φ^i(x(k), u^i(k)).
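The iteration (4)-(5) is straightforward to prototype once time is discretized. The sketch below is a hedged illustration, not code from the paper: it uses a toy double integrator (ẋ_1 = x_2, ẋ_2 = u) rather than the paper's example, a forward-Euler endpoint map, a finite-difference gradient in place of the linearized-system computation of [10], and arbitrary horizon and gain choices.

```python
import numpy as np

# Minimal sketch of the path-space Newton iteration of Sec. 2.1 on a toy
# double integrator.  All numerical choices here are illustrative.

h, N = 0.02, 50                       # sampling period; T = N*h
x0 = np.array([0.0, 0.0])             # initial state
xd = np.array([1.0, 0.0])             # desired final state

def endpoint(x0, U):
    """Endpoint map phi(T, x0, u): integrate with forward Euler."""
    x = x0.copy()
    for u in U:
        x = x + h * np.array([x[1], u])
    return x

def error(U):
    """Final state error e(T, x0, u), cf. eq. (3)."""
    return endpoint(x0, U) - xd

def jacobian(U, eps=1e-6):
    """Finite-difference approximation of grad_u phi over the stacked controls."""
    e0, J = error(U), np.zeros((2, len(U)))
    for j in range(len(U)):
        Up = U.copy(); Up[j] += eps
        J[:, j] = (error(Up) - e0) / eps
    return J

U = np.zeros(N)                       # first guess u = 0
for _ in range(3):                    # discrete Newton steps, cf. eq. (5)
    U = U - np.linalg.pinv(jacobian(U)) @ error(U)

print(np.linalg.norm(error(U)))       # final error norm, near machine precision
```

Because the toy system is linear, the finite-difference Jacobian is exact and the pseudoinverse update converges in essentially one step; for a nonlinear system the same loop is repeated until the error is small, subject to the full-rank condition above.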
The initial guess is a constant input u = 0, ∀t ∈ [0, ∞). The path is discretized into 50 increments. The predicted trajectory generated by the control signal u is calculated using a simple Euler approximation, while the control is applied to a real plant which is simulated using a 4th-order Runge-Kutta algorithm. Figure 2 shows the performance for the initial condition θ(0) = π, θ̇(0) = x(0) = ẋ(0) = 0 and the desired final state θ_d = θ̇_d = x_d = ẋ_d = 0. From Fig. 2a one can see that the pendulum swings up in one swing and is then balanced successfully despite the model uncertainty. Figure 2b shows the cart displacement, which is restricted to ±0.6 m. Inequality constraints could be embedded in the proposed algorithm [2, 5] using an interior/exterior penalty function. In [6], control signal and/or cart displacement constraints were considered; there it was observed that the pendulum swings up in several swings (in order to pump energy into the system) and then balances successfully.
Written in full, the one Newton-step update of u^{M+1}(k) is:

  v^{M+1}(k) = u^{M+1}(k) − h k_0 [∇_u φ^{M+1}(x(k), u^{M+1}(k))]† ε(k)      (8)

where ε(k) := e^{M+1}(k) − k_1 e^M(k) and, from (7),

  e^M(k) = φ^M(x(k), u^M(k)) − x_d
  e^{M+1}(k) = φ^{M+1}(x(k), u^{M+1}(k)) − x_d.

Now the first element of this updated control vector v^{M+1}(k) is applied to the actual system:

  x(k + 1) = φ^1(x(k), v_1(k))      (9)

The control vector is then updated and shifted forward by one step:

  u^{M+1}(k + 1) = [u_1(k + 1), …, u_M(k + 1), u_{M+1}(k + 1)] = [v_2(k), …, v_{M+1}(k), v̄]      (10)

where v̄ is the control which characterizes the equilibrium condition f_0(x_d) + f(x_d) v̄ = 0. Note that the update law (8) is slightly modified with respect to (5): the term k_1 e^M(k) was added in order to guarantee that the appended control value v̄ does not increase the predicted error. For a driftless system (f_0(x) = 0), v̄ = 0 would not change the state, so for this class of systems we set v̄ = 0 and k_1 = 0. We now show, in Theorem 2.1 below, that the predicted error e^M(k) converges exponentially to zero.
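The receding-horizon strategy (8)-(10) can likewise be sketched in a few lines. This is a hedged toy illustration, not the paper's cart-pole setup: the plant is the same double-integrator model used for prediction (so the only feedback at work is the replanning itself), the gradient is finite-difference, and k_1 = 0, v̄ = 0 are simplifying choices (the paper selects k_1 by a line search at each step).

```python
import numpy as np

# Sketch of the one-Newton-step receding horizon loop, eqs. (8)-(10),
# on a toy double integrator.  All names and gains are illustrative.

h, M1 = 0.02, 25                          # sampling period, horizon M+1
x0, xd = np.array([0.0, 0.0]), np.array([1.0, 0.0])
k0, k1, vbar = 1.0 / h, 0.0, 0.0          # k0 = 1/h as in Theorem 2.1

def step(x, u):                           # one Euler step of xdot1=x2, xdot2=u
    return x + h * np.array([x[1], u])

def predict_error(x, U):                  # e^i(k), cf. eq. (7)
    for u in U:
        x = step(x, u)
    return x - xd

def jacobian(x, U, eps=1e-6):
    e0, J = predict_error(x, U), np.zeros((2, len(U)))
    for j in range(len(U)):
        Up = U.copy(); Up[j] += eps
        J[:, j] = (predict_error(x, Up) - e0) / eps
    return J

x, U = x0.copy(), np.zeros(M1)
for k in range(200):
    eps_k = predict_error(x, U) - k1 * predict_error(x, U[:-1])
    V = U - h * k0 * np.linalg.pinv(jacobian(x, U)) @ eps_k   # eq. (8)
    x = step(x, V[0])                     # apply the first element, eq. (9)
    U = np.append(V[1:], vbar)            # shift forward, eq. (10)

print(np.linalg.norm(x - xd))             # state reaches the goal
```

With model mismatch between predictor and plant, the replanning at each k is what provides feedback; here the models coincide, so the loop simply illustrates the bookkeeping of (8)-(10).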
Figure 1: Inverted pendulum system. Figure 2: Inverted pendulum response: (a) pendulum angle θ (deg); (b) cart position (m).

Theorem 2.1 Consider the nonlinear system (1) and its discrete endpoint mapping (7). Then, using the update law (8) and the receding horizon strategy (10), if k_0 = 1/h is chosen in (8), the predicted error e^M(k) converges exponentially to zero. Consequently, the system state converges to the desired state x_d.

Proof: For details refer to [6].
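A one-line calculation suggests why the choice k_0 = 1/h makes the Newton step error-decreasing. This is only a first-order sketch under a full-row-rank assumption on ∇_u φ^{M+1} (so that ∇_u φ^{M+1} (∇_u φ^{M+1})† = I), not the complete argument of [6]. Writing e_v for the predicted error evaluated at v^{M+1}(k) and expanding (7) to first order about u^{M+1}(k) after the update (8):

```latex
\[
\begin{aligned}
e_v &\approx e^{M+1}(k) - h k_0\,\nabla_u\phi^{M+1}\,(\nabla_u\phi^{M+1})^{\dagger}\,\varepsilon(k)\\
    &= e^{M+1}(k) - h k_0\,\varepsilon(k)\\
    &= (1 - h k_0)\,e^{M+1}(k) + h k_0 k_1\, e^{M}(k).
\end{aligned}
\]
```

With k_0 = 1/h the first term cancels, and at first order the updated predicted error reduces to k_1 e^M(k), which the line search over k_1 keeps from increasing the error.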
4 Conclusions

This paper presents the application of an iterative approach to the control of a class of nonlinear systems. By using a receding horizon strategy, the approach can be transformed into a feedback form. It is shown that, by coupling the iteration variable and the time variable, the final state error converges exponentially. The generality of the approach allows one to consider problems such as underactuated mechanical systems, dynamic nonholonomic systems, etc. Future work includes parameter adaptation and experimental verification.
2.2.1 Features of the Algorithm

Robustness. The exponential rate of convergence of the final state error implies that there should be some stability robustness with respect to modeling error. This is verified in the simulations.
Possible Extensions. In the off-line version of the algorithm, inequality constraints are handled through either exterior penalty functions [3] or interior penalty functions [13]. Both can be easily included in the feedback modification described above (see [5]). The exponential convergence of the final state error may allow for parameter adaptation if the model-dependent portion can be written in a linear-in-the-parameters form.
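As a purely illustrative sketch of the exterior-penalty idea (the bound, weight, and function names below are assumptions, not the formulations of [3] or [13]): a quadratic penalty on a state bound |x| ≤ x_max can be appended to the endpoint error, so that the same root-finding iteration drives both to zero.

```python
import numpy as np

# Hypothetical exterior quadratic penalty for a state inequality
# constraint |x(t)| <= xmax, appended to the endpoint error vector.

xmax, w = 0.6, 10.0          # illustrative bound and weight

def exterior_penalty(traj):
    """Zero inside the feasible set; grows quadratically outside it."""
    viol = np.maximum(0.0, np.abs(traj) - xmax)
    return w * np.sum(viol ** 2)

def augmented_error(e_final, traj):
    # Stack the endpoint error with the scalar constraint penalty; the
    # Newton iteration then seeks a root of the augmented map.
    return np.append(e_final, exterior_penalty(traj))

print(exterior_penalty(np.array([0.1, 0.5, 0.7])))  # only 0.7 violates
```

Interior (barrier) penalties would instead blow up at the constraint boundary, which keeps iterates strictly feasible but requires a feasible initial guess.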
3 Simulation Results
In order to illustrate the performance of the proposed scheme, we will consider an inverted pendulum system (see Fig. 1). A cart of mass m_c has a uniform pendulum of mass m_p and length l pivoted on its top and is controlled by applying an input force u(t). The dynamic equations of this system are the following:
  (m_c + m_p) ÿ − m_p l cos(θ) θ̈ + m_p l sin(θ) θ̇² = u      (11)
  J θ̈ − m_p l cos(θ) ÿ − m_p l g sin(θ) = 0      (12)

where J = (4/3) m_p l² is the pendulum inertia, g is the gravitational acceleration, θ is the angle of the pendulum with respect to the vertical position, and y is the cart displacement. Solving (11)-(12) for ÿ and θ̈, and letting x_1 = θ, x_2 = θ̇, x_3 = y, x_4 = ẏ be the state variables and x^T = [x_1, x_2, x_3, x_4] the
state vector, allows us to rewrite the dynamic equations in the affine form (1). In order to perform the simulations, the system parameters described in the cart-pole benchmark [4] were considered: m_p = 0.1, m_c = 1, l = 0.5 and g = 9.8. The integration step was chosen to be h = 0.02. The parameter k_1 (from the update law (8)) is chosen to minimize |e^{M+1}(k)| at each k. This scalar was found by using a Golden Section search (other line search techniques could be used).
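The affine-form rewrite of (11)-(12) can be sketched directly. The snippet below uses the benchmark parameters quoted above, but the function names and the RK4 stepper are assumptions for illustration, not code from the paper: at each state it solves the two equations for the accelerations (a 2x2 linear solve) and advances the plant one Runge-Kutta step.

```python
import numpy as np

# Cart-pole dynamics (11)-(12) in state-space form, benchmark params of [4].
mp, mc, l, g = 0.1, 1.0, 0.5, 9.8
J = (4.0 / 3.0) * mp * l**2              # pendulum inertia

def xdot(x, u):
    """State derivative for x = [theta, theta_dot, y, y_dot]."""
    th, thd = x[0], x[1]
    # [[mc+mp, -mp*l*cos(th)], [-mp*l*cos(th), J]] @ [y_dd, th_dd] = b
    A = np.array([[mc + mp,              -mp * l * np.cos(th)],
                  [-mp * l * np.cos(th),  J                  ]])
    b = np.array([u - mp * l * np.sin(th) * thd**2,
                  mp * l * g * np.sin(th)])
    ydd, thdd = np.linalg.solve(A, b)
    return np.array([thd, thdd, x[3], ydd])

def rk4_step(x, u, h=0.02):
    """One 4th-order Runge-Kutta step, as used for the 'real plant'."""
    k1 = xdot(x, u)
    k2 = xdot(x + 0.5 * h * k1, u)
    k3 = xdot(x + 0.5 * h * k2, u)
    k4 = xdot(x + h * k3, u)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

# Upright equilibrium: zero input at theta = 0 leaves the state unchanged.
print(rk4_step(np.zeros(4), 0.0))        # -> [0. 0. 0. 0.]
```

A small perturbation of θ from the upright position produces a positive angular acceleration, as expected for the unstable equilibrium.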
References

[1] A. Divelbiss and J. Wen, "A perturbation refinement method for nonholonomic motion planning," in Proc. 1992 American Control Conference, (Chicago, IL), June 1992.
[2] A. Divelbiss and J. Wen, "A path space approach to nonholonomic motion planning in the presence of obstacles," IEEE Trans. on Robotics and Automation, vol. 13, pp. 443-451, June 1997.
[3] A. Divelbiss and J. Wen, "Trajectory tracking control of a car-trailer system," IEEE Trans. on Control Systems Technology, vol. 5, pp. 269-278, May 1997.
[4] S. Geva and J. Sitte, "A cartpole experiment benchmark for trainable controllers," IEEE Control Systems, vol. 13, pp. 40-51, October 1993.
[5] F. Lizarralde and J. Wen, "Feedback stabilization of nonholonomic systems in presence of obstacles," in Proc. IEEE Int. Conf. on Robotics & Automation, (Minneapolis, MN), 1996.
[6] F. Lizarralde, J. Wen, and L. Hsu, "Feedback stabilization of nonlinear systems: a path space iteration approach," tech. rep., COPPE/UFRJ, Rio de Janeiro (Brazil), August 1997.
[7] F. Lizarralde, J. Wen, and D. Popa, "Feedback stabilization of nonholonomic systems," in Proc. 1996 Conf. on Information Sciences and Systems (CISS'96), (Princeton, NJ), 1996.
[8] D. Mayne and H. Michalska, "Receding horizon control of nonlinear systems," IEEE Trans. on Automatic Control, vol. 35, pp. 814-824, July 1990.
[9] S. Richter and R. DeCarlo, "Continuation methods: Theory and applications," IEEE Trans. on Circuits and Systems, vol. 30, no. 6, pp. 347-352, 1983.
[10] E. Sontag, Mathematical Control Theory. Springer-Verlag, 1990.
[11] E. Sontag, "Control of systems without drift via generic loops," IEEE Trans. on Automatic Control, vol. 40, pp. 1210-1219, July 1995.
[12] E. Sontag and Y. Lin, "Gradient techniques for systems with no drift," in Proc. of Conf. in Signals and Systems, 1992.
[13] H. Sussmann and Y. Chitour, "A continuation method for nonholonomic path finding problems," IMA Workshop on Robotics, Jan. 1993.