ISSN 8756-6990, Optoelectronics, Instrumentation and Data Processing, 2010, Vol. 46, No. 5, pp. 408–413. © Allerton Press, Inc., 2010. Original Russian Text © A.A. Voevoda, A.V. Chekhonadskikh, 2010, published in Avtometriya, 2010, Vol. 46, No. 5, pp. 11–17.

AUTOMATIC CONTROL SYSTEMS IN SCIENTIFIC RESEARCH AND INDUSTRY

Overcoming Nondifferentiability in Optimization Synthesis of Automatic Control Systems

A. A. Voevoda and A. V. Chekhonadskikh

Novosibirsk State Technical University, pr. Karla Marksa 20, Novosibirsk, 630092 Russia
E-mail: [email protected]; [email protected]

Received May 31, 2010

Abstract: We solve the problem of minimizing a nondifferentiable ravine function that characterizes the quality of the pole location in the synthesis of automatic control systems with reduced-order controllers. The relationship between objective functions of the location of the roots of a polynomial and of the roots of its derivatives is studied. The possibility of converting from a nondifferentiable characteristic of the location of the roots of a polynomial to a smooth characteristic of the roots of its derivative is established.

DOI: 10.3103/S8756699011050025

Key words: automatic control system, reduced-order controller, minimization of a nondifferentiable ravine function, graduation of the root set of a polynomial, convex hull.

INTRODUCTION

In the operator description of an automatic control system (ACS), its most important properties are determined by the poles, i.e., the roots of the characteristic polynomial. In the synthesis of full-order systems, whose controller is comparable in complexity to the controlled plant, the designer is free to specify the location of the poles; the theory of such systems has been studied in detail and is presented in many fundamental monographs (e.g., [1]). However, industrial and technological practice is dominated by reduced-order proportional-integral (PI) and proportional-integral-differential (PID) controllers.¹ In recent years, the investigation of the use and tuning of such controllers has been one of the most important directions of the theory of linear systems. Since in this case the number of free parameters is not sufficient to provide an arbitrarily specified location of the poles, there are many approaches to the synthesis of systems applicable under particular conditions. One possible method is optimization: the best location of the poles is to be achieved by choosing the parameters of the controller. However, this process is complicated by the nonconvexity [2] of the majority of the problems that arise and, as a consequence, by their multiextremal nature [3]. A comparative analysis of various approaches [4] has revealed a number of difficulties. Owing to the combination of the above-mentioned factors, the development of the theory of linear systems has focused on two extreme cases: the synthesis of full-order controllers (whose description is close in complexity to the controlled plant) and of reduced-order controllers (which have two or three parameters). The theory of reduced-order systems remains a poorly understood area in which many meaningful problems are still open [2]. The optimization method described in [5] focuses on the synthesis of reduced-order controllers, and a major obstacle to its practical implementation is the nondifferentiability of the objective function.²

¹ With proportional, integral, and differential links.
² More precisely, discontinuities of the second kind in the partial derivatives and an unbounded subdifferential of the objective function. This gives rise to a ravine relief with closely adjacent walls and a set of false extrema: points of stabilization of gradient algorithms on the bottom of the ravine. Thus, in the numerical study of a double pendulum with a PID controller [5], combinations of various techniques, such as varying the step size, finding the direction of the bottom, etc., were used in the critical areas until a reliable result was achieved.



The purpose of the present work is to develop a method for converting from a ravine function of the roots of the original polynomial to a differentiable objective function that admits a relatively simple gradient minimization; this approach relies on Gauss's theorem on the location of the roots of the derivative of a polynomial [6].

DERIVATIVE MAPPING OF THE CORRESPONDENCE BETWEEN THE ROOTS AND COEFFICIENTS

The appearance of discontinuities of the second kind in the derivatives of objective functions that characterize the pole location is due to the well-known relations between the roots and coefficients of polynomials [7]. The coefficients of the reduced (monic) polynomial f_n(s) = s^n + a_{n-1}s^{n-1} + ... + a_1 s + a_0 are elementary symmetric functions of its roots z_1, ..., z_n:

$$a_{n-k} = (-1)^k \sigma_k[z_1, \ldots, z_n] \quad (k = 1, \ldots, n); \qquad a_n = \sigma_0 = 1.$$
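These relations are easy to check numerically. The following sketch (ours, not from the paper; it assumes NumPy and uses a brute-force elementary_symmetric helper) compares the coefficients returned by numpy.poly with (−1)^k σ_k:

```python
import numpy as np
from itertools import combinations

def elementary_symmetric(roots, k):
    """sigma_k: sum of products of all k-element subsets of the roots."""
    return sum(np.prod(c) for c in combinations(roots, k))

roots = np.array([-1.0, -2.0, -3.0 + 1.0j, -3.0 - 1.0j])
n = len(roots)

a = np.poly(roots)            # monic coefficients [1, a_{n-1}, ..., a_0]

for k in range(1, n + 1):
    # np.poly stores a_{n-k} at index k
    assert np.isclose(a[k], (-1) ** k * elementary_symmetric(roots, k))
print("a_{n-k} = (-1)^k sigma_k holds for k = 1..%d" % n)
```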

Since nothing depends on the order of the roots, for the sake of brevity we set σ_k[z_1, ..., z_n] = σ_k^n. Omission of one of the variables z_l in the polynomial σ_k^n, which is equivalent to the substitution z_l = 0, is written as

$$\sigma_k^{n\backslash l} = \left.\sigma_k^n\right|_{z_l=0} = \sigma_k[z_1, \ldots, z_{l-1}, z_{l+1}, \ldots, z_n].$$

From the definition of the polynomial σ_k^n, it is clear that

$$\frac{\partial}{\partial z_j}\sigma_k^n = \sigma_{k-1}^{n\backslash j} \quad \text{and} \quad \frac{\partial}{\partial z_j}\sigma_1^n = 1.$$

Then the Jacobian matrix (∂a_i/∂z_j) has the form

$$\frac{\partial(a_{n-1}, \ldots, a_0)}{\partial(z_1, \ldots, z_n)} =
\begin{pmatrix}
-1 & \cdots & -1 & \cdots & -1 \\
\sigma_1^{n\backslash 1} & \cdots & \sigma_1^{n\backslash k} & \cdots & \sigma_1^{n\backslash n} \\
\vdots & & \vdots & & \vdots \\
(-1)^n \sigma_{n-1}^{n\backslash 1} & \cdots & (-1)^n \sigma_{n-1}^{n\backslash k} & \cdots & (-1)^n \sigma_{n-1}^{n\backslash n}
\end{pmatrix}.$$

It is directly verified that the matrix J^{-1} of the inverse relationship between the coefficients and the roots is a Vandermonde matrix with a diagonal denominator:

$$J^{-1} =
\begin{pmatrix}
(z_1 - z_2)\cdots(z_1 - z_n) & 0 & \cdots & 0 \\
0 & (z_2 - z_1)\cdots(z_2 - z_n) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & (z_n - z_1)\cdots(z_n - z_{n-1})
\end{pmatrix}^{-1}
\begin{pmatrix}
z_1^{n-1} & \cdots & z_1 & 1 \\
z_2^{n-1} & \cdots & z_2 & 1 \\
\vdots & & \vdots & \vdots \\
z_n^{n-1} & \cdots & z_n & 1
\end{pmatrix}.$$
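The breakdown of this inverse map at multiple roots can be illustrated numerically. The sketch below (our illustration, not part of the paper; it uses finite differences on numpy.poly and numpy.roots instead of the symbolic matrices above) estimates the sensitivity of the roots to the constant coefficient as a real root pair is driven together:

```python
import numpy as np

def root_sensitivity(roots, h=1e-9):
    """Finite-difference estimate of max |dz_k / da_0| for the monic
    polynomial with the given roots (the constant term a_0 is perturbed)."""
    a = np.poly(roots)
    z0 = np.sort_complex(np.roots(a))
    a_pert = a.copy()
    a_pert[-1] += h                      # perturb a_0
    z1 = np.sort_complex(np.roots(a_pert))
    return np.max(np.abs(z1 - z0)) / h

# A real pair -1 +/- eps approaching a double root at -1, plus a root at -2.
for eps in (1e-1, 1e-2, 1e-3):
    s = root_sensitivity([-1.0 - eps, -1.0 + eps, -2.0])
    print(f"gap 2*eps = {2 * eps:.0e}:  max |dz/da_0| ~ {s:.1e}")
# The sensitivity grows roughly like 1/(2*eps), mirroring the vanishing
# diagonal factors (z_i - z_j) in J^{-1} as the roots coalesce.
```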

From this it is evident that, for multiple roots of the polynomial, the derivative map of the correspondence between the roots and coefficients has a discontinuity of the second kind, resulting in ravines of the objective functions of the coordinates of the roots [4, 5]. This is the case both for functions of the metric type, which depend on all the roots, and for functions of the graduation type, which depend only on the rightmost of them; we consider the latter.
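As a toy illustration of such a ravine (ours; the second-order family below is not the system considered in the paper), the rightmost-pole function of the polynomial (s + 1)^2 + t has a square-root singularity at the parameter value t = 0, where the pole pair collides:

```python
import numpy as np

def rightmost_real_part(t):
    """F(t) = max Re z_k for the toy family f(s) = (s + 1)^2 + t,
    whose pole pair -1 +/- sqrt(-t) collides into a double pole at t = 0."""
    return np.max(np.roots([1.0, 2.0, 1.0 + t]).real)

for t in (-1e-2, -1e-4, -1e-6, 0.0, 1e-6, 1e-2):
    print(f"t = {t:+.0e}:  F(t) = {rightmost_real_part(t):+.6f}")
# For t <= 0 the rightmost pole is -1 + sqrt(-t), so F has an infinite
# one-sided derivative at t = 0: the "ravine wall" met by gradient methods.
```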


[Fig. 1. Target domains of location of the roots in the modal synthesis of reduced-order systems: (a) the conventional approach is to ensure that the poles fall into a given trapezoid in the plane C; the root hodographs correspond to a change in a certain parameter; (b) the optimization approach is to ensure the leftmost location of the truncated cone that covers all the poles and is determined by the most "inconvenient" of them; the boundaries of cones 1 and 2 are determined by a real root, the boundaries of cone 4 by a complex pair, and those of cone 3 by a root and a pair together.]

ROOT R-GRADUATIONS

The most effective objective function for optimizing the location of the poles is a graduation that includes only the coordinates of the least stable roots. The principal (for this work) requirements to such functions are as follows.³

Definition 1. We call the complex plane R-graduated if it contains a specified family of closed convex domains ordered by inclusion, {B_α | α ∈ R}, such that:
1) B_α ⊂ B_β for α < β;
2) ⋃_{α∈R} B_α = C;
3) Re c ≤ α for each c ∈ B_α.

Definition 2. The graduation of a set M ⊂ C is the quantity α(M) = inf{α | M ⊂ B_α}; the graduation α(f) of a polynomial f(s) is the graduation of the set of its roots.

This definition covers many proven constructions, for example, the families of
– left half-planes, F(a) = max_k Re z_k, where B_α = {s | Re s ≤ α};
– left cones, G(a) = max_k (Re z_k + |Im z_k|), where B_α = {s | Re s + |Im s| ≤ α};
– truncated cones, H_l(a) = max(F(a), G(a) − l), where B_α = {s | Re s ≤ α, Re s + |Im s| ≤ α + l}.
Here a is the vector of the polynomial coefficients. The families of embedded trapezoids and ellipses can also be treated as R-graduations.

Obviously, the graduation of a polynomial coincides with the graduation of one of its roots regarded as a one-element set; this root will be called the α-dominant root. It is natural to require that the graduation be a piecewise smooth function of the coordinates of this root.

Whereas the conventional approach to the synthesis of systems in operator form presupposes a fixed target domain of a standard form (see Fig. 1a), the construction of the R-graduation makes it possible to pose the problem of locating the roots by indicating only the form of the domain (see Fig. 1b). Thus, for all values of the free parameters of the system, the poles are covered by one of the truncated cones, as shown in Fig. 1b; the abscissa of the right edge of the cone is a function of the controller parameters, and its minimization allows one either to find their optimum values or to conclude that the system cannot be stabilized by a controller of the chosen structure and a more complex control is required. Many problems of the synthesis of ACSs can be posed in this form. However, the minimization of the R-graduation function combines all the difficulties mentioned in the Introduction. The problem of representing the critical areas in the parameter space and evaluating the set of possible minima was solved in [5]. However, in a numerical experiment (see footnote 2), the greatest difficulties were associated with the stabilization of the gradient algorithm near the manifolds corresponding to multiple α-dominant roots. A fundamental condition for overcoming this difficulty is the convexity of the sets included in the graduation family {B_α | α ∈ R}.

³ In comparison with the list in [8], it is reduced because the consideration given here is more general.
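To make the three families defined above concrete, here is a minimal sketch (ours, assuming NumPy; the function names F, G, and H mirror the notation above but are not code from the paper) that evaluates the graduations from a coefficient vector:

```python
import numpy as np

def F(a):
    """Half-plane graduation: max Re z_k over the roots of the polynomial a."""
    z = np.roots(a)
    return np.max(z.real)

def G(a):
    """Left-cone graduation: max (Re z_k + |Im z_k|)."""
    z = np.roots(a)
    return np.max(z.real + np.abs(z.imag))

def H(a, l):
    """Truncated-cone graduation H_l = max(F, G - l)."""
    return max(F(a), G(a) - l)

# Poles -2 and -1 +/- 3j: the oscillatory pair dominates the cone graduation.
a = np.poly([-2.0, -1.0 + 3.0j, -1.0 - 3.0j])
print(F(a), G(a), H(a, l=1.0))       # approximately -1.0, 2.0, 1.0
```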


GAUSS'S THEOREM AND R-GRADUATION OF MULTIPLE ROOTS

The property expressed by this theorem was used in [9] to synthesize linear ACSs with reduced-order controllers; for the purposes of that work, it was sufficient to use the weaker assertion that the derivative of a Hurwitz polynomial is a Hurwitz polynomial. A new proof of this assertion, free from complicated geometric and mechanical reasoning, is given in [10]. Because of the importance of the theorem for the present paper, we give its short proof.

Lemma 1. Let f_n(s) be a reduced polynomial (possibly with complex coefficients) and let z_1, ..., z_n be its roots. Then any root z_0 of its derivative satisfies the conditions

$$\min_{1 \le l \le n} \operatorname{Re} z_l \le \operatorname{Re} z_0 \le \max_{1 \le l \le n} \operatorname{Re} z_l, \qquad
\min_{1 \le l \le n} \operatorname{Im} z_l \le \operatorname{Im} z_0 \le \max_{1 \le l \le n} \operatorname{Im} z_l.$$

Proof. We consider the logarithmic derivative of the polynomial f(z):

$$\frac{d}{dz}\ln f(z) = \frac{d}{dz}\ln[(z - z_1)\cdots(z - z_n)] = \frac{1}{z - z_1} + \ldots + \frac{1}{z - z_n}.$$

Since the real parts of a complex quantity and of its inverse have the same sign, for Re z > max_l Re z_l we obtain the inequality

$$\operatorname{Re}\sum_{k=1}^{n}\frac{1}{z - z_k} = \sum_{k=1}^{n}\operatorname{Re}\frac{1}{z - z_k} > 0;$$

similarly, for Re z < min_l Re z_l, we have

$$\operatorname{Re}\sum_{k=1}^{n}\frac{1}{z - z_k} < 0.$$

Consequently, all zeros and all poles of the logarithmic derivative are concentrated in the zone {z | min_l Re z_l ≤ Re z ≤ max_l Re z_l}. Since (d/dz) ln f(z) = f'(z)/f(z), a root z = z_0 of the derivative f'(z) is either a root of the logarithmic derivative or its simple pole [when z_0 = z_k is a multiple root of the polynomial f(z)]. Repeating this argument for the imaginary components (taking into account that the imaginary parts of a complex quantity and of its inverse have opposite signs), we obtain the second inequality of the lemma.

Gauss's Theorem [6]. The convex hull of the roots of a polynomial with complex coefficients contains the convex hull of the roots of its derivative, and the common points of the boundaries of these polygons are multiple roots of the original polynomial.

Proof. First of all, we note that any convex polygon in the plane can be obtained as the intersection of a finite number of half-planes. Therefore, it suffices to establish that if the roots are located in the half-plane P(φ, a) defined by the inequality x cos φ + y sin φ ≥ a for some values of φ and a, then the roots of the derivative are located in the same half-plane. If the roots z_k = x_k + i y_k satisfy the relation that defines the half-plane P(φ, a), then the numbers ξ_k = z_k e^{-iφ} are located in the half-plane P(0, a) defined by the inequality x ≥ a. Since z_1, ..., z_n are roots of the polynomial f_n(s), the numbers ξ_k are roots of the polynomial g_n(s) = f_n(e^{iφ}s). By the lemma, the roots $\tilde{\xi}_1, \ldots, \tilde{\xi}_{n-1}$ of its derivative g'_n(s) = e^{iφ} f'_n(e^{iφ}s) lie in the half-plane P(0, a). Furthermore, since they are obtained by the multiplication $\tilde{\xi}_k = \tilde{z}_k e^{-i\varphi}$ from the roots $\tilde{z}_1, \ldots, \tilde{z}_{n-1}$ of the derivative of the polynomial f_n(s), the numbers $\tilde{z}_1, \ldots, \tilde{z}_{n-1}$ must also lie in the half-plane P(φ, a).

Remark 1. The statement of Lemma 1 cannot be strengthened to reproduce Rolle's theorem (known from classical analysis) in the complex plane. For example, the polynomial

$$f_5(z) = (z - 0.4)(z^2 + 1)(z^2 + 2z + 2) = z^5 + 1.6z^4 + 2.2z^3 + 0.8z^2 + 1.2z - 0.8$$

has the derivative f'_5(z) = 5z^4 + 6.4z^3 + 6.6z^2 + 1.6z + 1.2, whose roots z_{1,2} ≈ −0.6233 ± 0.8132i and z_{3,4} ≈ −0.0167 ± 0.4778i do not fall into the zone {z | 0 ≤ Re z ≤ 0.4} determined by the roots z = ±i and z = 0.4 of the polynomial f_5(z).
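The polynomial of Remark 1 also gives a convenient numerical check of Lemma 1 (a sketch assuming NumPy; not from the paper):

```python
import numpy as np

# f_5(z) = (z - 0.4)(z^2 + 1)(z^2 + 2z + 2) from Remark 1.
f = np.poly([0.4, 1j, -1j, -1 + 1j, -1 - 1j])
df = np.polyder(f)                 # 5z^4 + 6.4z^3 + 6.6z^2 + 1.6z + 1.2

z, zd = np.roots(f), np.roots(df)

# Lemma 1: the real parts of the derivative's roots stay within
# [min Re z_l, max Re z_l] = [-1, 0.4].
assert np.all(zd.real >= z.real.min() - 1e-9)
assert np.all(zd.real <= z.real.max() + 1e-9)

# A Rolle-type strengthening would require a derivative root with real
# part between Re(+-i) = 0 and 0.4; there is none (all are negative).
print(np.sort(zd.real))
```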


Proposition 1. If the α-dominant root of a polynomial has multiplicity r, the graduations of the polynomial and of its derivatives have the same values: α(f) = α(f′) = … = α(f^{[r−1]}).

Proof. By Gauss's theorem, the convex hulls of the roots of the successive derivatives are embedded in the hulls of the previous ones. But the α-dominant root is preserved for all derivatives from the 1st to the (r − 1)th. Consequently, the graduation domains and values for these derivatives and for the polynomial coincide.

Corollary 1. Minimization on the manifold of r-tuple α-dominant roots in the space of the controller parameters makes it possible to convert from the nondifferentiable graduation of the original polynomial to the smooth graduation of its (r − 1)th derivative.

Proof. Because each differentiation reduces the root multiplicity by one, the α-dominant root remains multiple for all derivatives up to the (r − 2)th; hence the discontinuity of the second kind in the Jacobian matrix J^{−1} is retained in the derivative map of the coordinates of the multiple roots as functions of the coefficients and is transferred to the α-graduation as a function of the controller parameters. For the (r − 1)th derivative, the α-dominant root is simple and the function of its α-graduation is smooth.

We note that, by virtue of the implicit function theorem, the coordinates of simple roots are differentiable as functions of the coefficients, regardless of the multiplicity of the other roots: in this case, the denominator of the derivative of the implicit function, ∂z_k/∂a_l = −z_k^l / f′(z_k), is certainly different from zero. A practically meaningful condition for a U-shaped extremum on the bottom of a ravine with a Y-shaped cross section is given below.

Proposition 2. In the minimization of the graduation on the manifold of r-tuple α-dominant roots, a necessary extremum condition is the orthogonality of the gradient of the graduation of the (r − 1)th derivative to the tangent subspace of the manifold.

DIRECTION OF THE BOTTOM OF THE RAVINE AND THE BOTTOM GRADIENT OF R-GRADUATION

If, for some value of the parameter vector p, the characteristic polynomial has multiple roots and the multiplicity of the roots does not increase in a neighborhood of this point of the parameter space, the Jacobian matrix (∂a_i/∂z_j) allows one to find the tangent subspace of the manifold of multiple roots. In the space of the coefficients, the tangent subspace D_a has dimension no greater than the number of distinct roots, since the columns of the matrix corresponding to the multiple roots of the polynomial are the same.⁴ The subspace D_a is generated as the linear hull of the columns ∂a/∂z_j of the Jacobian matrix (the summation is over distinct roots, and the coefficients are the increments Δz_j, 1 ≤ j ≤ n):

$$D_a = \Bigl\{\, \sum_j \frac{\partial a}{\partial z_j}\, \Delta z_j \,\Bigr\}.$$

Next, we consider the derivative mapping (∂a_i/∂p_k) of the dependence of the coefficients of the characteristic polynomial f_n(s) = s^n + a_{n−1}(p)s^{n−1} + … + a_1(p)s + a_0(p) on the vector p of the controller parameters [in the single-channel case, this dependence is linear and the matrix (∂a_i/∂p_k) is numerical]. The tangent subspace to the manifold specified in the space of the coefficients by the dependence a(p) is represented by the total differential

$$da(p) = \sum_k \frac{\partial a}{\partial p_k}\, \Delta p_k.$$

To find the tangent subspace D_p to the manifold of multiple roots in the parameter space, it is sufficient to solve (with respect to Δp_k) the following system, which describes the intersection of the subspaces D_a ∩ da(p):

$$\sum_k \frac{\partial a}{\partial p_k}\, \Delta p_k = \sum_j \frac{\partial a}{\partial z_j}\, \Delta z_j,$$

or, in a more familiar form,

$$\bigl( (\partial a/\partial p) \mid (\partial a/\partial z) \bigr)
\begin{pmatrix} \Delta p \\ -\Delta z \end{pmatrix} = 0.$$

Using the coordinates of the fundamental solution set that correspond to the parameters p, we obtain generators of the subspace D_p; the projection of the vector ∇α(f^{[r−1]}) onto this subspace specifies the desired bottom gradient of the root R-graduation.
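A sketch of this computation (ours; it assumes that the Jacobians ∂a/∂p and ∂a/∂z and the gradient ∇α(f^{[r−1]}) with respect to p have already been evaluated, e.g. by finite differences, with duplicate root columns removed) obtains the Δp directions from the null space of the stacked matrix and projects the gradient onto them:

```python
import numpy as np

def bottom_gradient(da_dp, da_dz, grad_alpha, tol=1e-10):
    """Project grad_alpha, the gradient of alpha(f^[r-1]) w.r.t. the
    controller parameters p, onto the tangent subspace D_p.

    da_dp : (n, m) Jacobian of the coefficients w.r.t. the m parameters.
    da_dz : (n, q) Jacobian w.r.t. the q distinct roots (duplicate
            columns already removed).
    grad_alpha : (m,) gradient vector.
    """
    n, m = da_dp.shape
    stacked = np.hstack([da_dp, da_dz])          # ((da/dp) | (da/dz))
    _, s, vh = np.linalg.svd(stacked)
    rank = int(np.sum(s > tol * s[0]))
    null = vh[rank:].T                           # columns: (dp, -dz) null vectors
    if null.shape[1] == 0:
        return np.zeros(m)                       # no tangent directions
    dp = null[:m, :]                             # Delta-p components span D_p
    u, sv, _ = np.linalg.svd(dp, full_matrices=False)
    basis = u[:, sv > tol]                       # orthonormal basis of D_p
    return basis @ (basis.T @ grad_alpha)

# Tiny synthetic example: 4 coefficients, 3 parameters, 2 distinct roots.
rng = np.random.default_rng(1)
g = bottom_gradient(rng.normal(size=(4, 3)), rng.normal(size=(4, 2)),
                    rng.normal(size=3))
print(g)
```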

⁴ In [11], it is proved that the rank of the Jacobian matrix is equal to the number of distinct roots.
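This rank statement is easy to check numerically; a small sketch (ours, using finite differences rather than the proof in [11]):

```python
import numpy as np

def coeff_jacobian(roots, h=1e-7):
    """Finite-difference Jacobian of the coefficients (a_{n-1}, ..., a_0)
    of the monic polynomial np.poly(roots) with respect to the roots."""
    roots = np.asarray(roots, dtype=float)
    a0 = np.poly(roots)[1:]
    cols = []
    for j in range(roots.size):
        r = roots.copy()
        r[j] += h
        cols.append((np.poly(r)[1:] - a0) / h)
    return np.column_stack(cols)

J = coeff_jacobian([-1.0, -1.0, -2.0])          # double root at -1
print(np.linalg.matrix_rank(J, tol=1e-4))       # 2 = number of distinct roots
```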


CONCLUSIONS

By taking the R-graduation, which depends on the least stable roots of the characteristic polynomial, as a numerical characteristic of the location of the poles of an automatic control system, one can reduce the synthesis of the system to minimizing this quantity. The greatest difficulties in this case are due to the nondifferentiability on the manifolds of multiple roots, which leads to a ravine relief and to stabilization of the gradient descent on the bottom of the ravines. Using the features of the construction of the R-graduation and Gauss's theorem, it is possible to prove that the nondifferentiable graduation of the polynomial roots is equal to the smooth graduation of the roots of the derivative of the polynomial. This makes it possible to find the bottom gradient of the objective function and to use gradient algorithms effectively.

This work was supported by the Ministry of Education and Science (TK No. P694 of 12.09.2009).

REFERENCES

1. C.-T. Chen, Linear System Theory and Design (Holt, Rinehart and Winston, New York, 1984).
2. B. T. Polyak and P. S. Shcherbakov, "Difficult Problems of Linear Control Theory: Some Approaches to Solution," Avtomat. Telemekh., No. 5, 7–46 (2005).
3. A. A. Voevoda and A. V. Chekhonadskikh, "Plurality of Extrema in the Optimization of the System of Characteristic Roots of Automatic Control Systems," Nauch. Vestn. NGTU, No. 2 (31), 197–200 (2008).
4. A. V. Chekhonadskikh, "Metric, Graduation, and Optimization of the Location of the Characteristic Roots of Automatic Control Systems," Nauch. Vestn. NGTU, No. 1 (34), 165–182 (2009).
5. A. A. Voevoda and A. V. Chekhonadskikh, "Optimizing the Location of the Poles of an Automatic Control System with a Reduced-Order Controller," Avtometriya 45 (5), 113–123 (2009) [Optoelectron., Instrum. Data Process. 45 (5), 472–480 (2009)].
6. C. F. Gauss, Opera Omnia (Göttingen, 1886), Vol. 3, p. 112.
7. B. L. van der Waerden, Algebra (Ungar, New York, 1970).
8. A. V. Chekhonadskikh, "On the Step-Differential Optimization of the Roots of the Characteristic Polynomial of Automatic Control Systems," Nauch. Vestn. NGTU, No. 4 (33), 205–208 (2008).
9. A. A. Voevoda and A. I. Meleshkin, "Synthesis of Reduced-Order Controllers," Nauch. Vestn. NGTU, No. 3, 41–58 (1997).
10. A. A. Voevoda, K. N. Ponomarev, and A. V. Chekhonadskikh, "On the Stability of the Derivative of a Stable Polynomial," Nauch. Vestn. NGTU, No. 1 (4), 185–186 (1998).
11. A. V. Chekhonadskikh and A. A. Voevoda, "On Jacobian Matrix Rank of Polynomial Coefficients-Roots Correspondence," in Algebra and Model Theory 5 (Novosibirsk State Techn. Univ., Novosibirsk, 2005), pp. 275–280.
