
8 Canonical Duality Theory: Connections between nonconvex mechanics and global optimization

David Y. Gao¹ and Hanif D. Sherali²

¹ Department of Mathematics, [email protected]
² Grado Department of Industrial and Systems Engineering, [email protected]
Virginia Tech, Blacksburg, VA 24061, USA.

Dedicated to Professor Gilbert Strang on the occasion of his 70th birthday

Summary. This paper presents a comprehensive review and some new developments on canonical duality theory for nonconvex systems. Based on a tri-canonical form for quadratic minimization problems, an insightful relation between canonical dual transformations and nonlinear (or extended) Lagrange multiplier methods is presented. Connections between complementary variational principles in nonconvex mechanics and Lagrange duality in global optimization are also revealed within the framework of the canonical duality theory. Based on this framework, traditional saddle Lagrange duality and the so-called bi-duality theory, discovered in convex Hamiltonian systems and d.c. programming, are presented in a unified way; together, they serve as a foundation for the triality theory in nonconvex systems. Applications are illustrated by a class of nonconvex problems in continuum mechanics and global optimization. It is shown that by the use of the canonical dual transformation, these nonconvex constrained primal problems can be converted into certain simple canonical dual problems, which can be solved to obtain all extremal points. Optimality conditions (both local and global) for these extrema can be identified by the triality theory. Some new results on general nonconvex programming with nonlinear constraints are also presented as applications of this canonical duality theory. This review brings some fundamentally new insights into nonconvex mechanics, global optimization, and computational science.

Key words: Duality, triality, Lagrangian duality, nonconvex mechanics, global optimization, nonconvex variations, canonical dual transformations, critical point theory, semi-linear equations, NP-hard problems, quadratic programming.

Acknowledgement. This work is supported by the National Science Foundation by Grant Numbers DMII-0455807, CCF-0514768, and DMII-0552676.


Advances in Mech. & Math., Vol. III, Gao & Sherali (ed.), Springer, 2006

8.1 Introduction

Complementarity and duality are two inspiring, closely related concepts. Together they play fundamental roles in multi-disciplinary fields of mathematical science, especially in engineering mechanics and optimization. The study of complementarity and duality in mathematics and mechanics has had a long history since the well-known Legendre transformation was formally introduced in 1787. This elegant transformation plays a key role in complementary duality theory. In classical mechanical systems, each energy function defined in a configuration space is linked via the Legendre transformation with a complementary energy in the dual (source) space, through which the Lagrangian and Hamiltonian can be formulated. In static systems, the convex total potential energy leads to a saddle Lagrangian through which a beautiful saddle min-max duality theory can be constructed. This saddle Lagrangian plays a central role in classical duality theory in convex analysis and constrained optimization. In convex dynamic systems, however, the total action is usually a nonconvex d.c. function, that is, the difference of convex kinetic energy and total potential functions. In this case, the classical Lagrangian is no longer a saddle function, but the Hamiltonian is convex in each of its variables. It turns out that instead of the Lagrangian, the Hamiltonian has been extensively used in convex dynamics. From a geometrical point of view, Lagrangian and Hamiltonian structures in convex systems and d.c. programming display an appealing symmetry, which was widely studied by their founders. Unfortunately, such a symmetry in nonconvex systems breaks down. It turns out that in recent times, tremendous effort and attention have been focused on the role of symmetry and symmetry-breaking in Hamiltonian mechanics in order to gain a deeper understanding of nonlinear and nonconvex phenomena (see Marsden and Ratiu, 1995).
The earliest examples of the Lagrangian duality in engineering mechanics are probably the complementary energy principles proposed by Haar and von Kármán in 1909 for elasto-perfect plasticity and by Hellinger in 1914 for continuum mechanics. Since the boundary conditions in Hellinger's principle were clarified by E. Reissner in 1953, the complementary-dual variational principles and methods have been studied extensively for more than 50 years by applied mathematicians and engineers.³ The development of mathematical duality theory in convex variational analysis and optimization has had a similar history since W. Fenchel proposed the well-known Fenchel transformation in 1949. After the revolutionary concepts of super-potential and subdifferentials introduced by J. J. Moreau in 1966 in the study of frictional mechanics, the modern mathematical theory of duality has been well-developed by celebrated mathematicians such as R. T. Rockafellar (1967, 1970), I. Ekeland and R. Temam (1976), F. H. Clarke (1983), and G. Strang (1986). Mathematically speaking, in linear elasticity where the total potential energy is convex, the Hellinger-Reissner complementary variational principle in engineering mechanics is equivalent to a Fenchel-Moreau-Rockafellar type dual variational problem. The so-called generalized complementary variational principle is actually the saddle Lagrangian duality theory, which serves as the foundation for hybrid/mixed finite element methods, and has been subjected to extensive study during the past 40 years (see Strang and Fix (1973), Oden and Lee (1977), Pian and Wu (2006), Han (2006), and the references cited therein). Early in the last century, Haar and von Kármán [70] had already realized that in nonlinear variational problems of continuum mechanics, the direct approaches for solving the minimum potential energy (primal) problem can only provide upper bounding solutions. However, the minimum complementary energy principle (i.e., the maximum Lagrangian dual problem) provides a lower bound (the mathematical proof of the Haar-von Kármán principle was given by Greenberg in 1949 [69]). In safety analysis of engineering structures, the upper and lower bounding approximations to the so-called collapse states of elasto-plastic structures are equally important to engineers. Therefore, the primal-dual variational methods have been studied extensively by engineers for solving nonsmooth, nonlinear problems (see [28, 29, 87], and [56, 55, 63, 64, 59]). The article by Maier et al. (2000) serves as an excellent survey on the developments for applications of the Lagrangian duality in engineering structural mechanics.

³ Eric Reissner (PhD 1938) was a professor in the Department of Mathematics at MIT from 1949 to 1969. According to Gil Strang, since Reissner moved to the Department of Mechanical and Aerospace Engineering at University of California, San Diego in 1969, many applied mathematicians in the field of continuum mechanics, especially solid mechanics, switched from mathematical departments to engineering schools in the United States.
In mathematical programming and computational science, the so-called primal-dual interior point methods are also based on the Lagrangian duality theory, which has emerged as a revolutionary technique during the last 15 years. Complementary to the interior-point methods, the so-called pan-penalty finite element programming developed by Gao in 1988 [21, 22] is indeed a primal-dual exterior-point method. He proved that in rigid-perfectly plastic limit analysis, the exterior penalty functional and the associated perturbation method possess an elegant physical meaning, which led to an efficient dimension re-scaling technique in large-scale nonlinear mixed finite element programming problems [22]. In mathematical programming and analysis, the subject of complementarity is closely related to constrained optimization, variational inequality, and fixed point theory. Through the classical Lagrangian duality, the KKT conditions of constrained optimization problems lead to corresponding complementarity problems. The primal-dual schema has continued to evolve for linear and convex mathematical programming during the past 20 years. However, for nonconvex systems, it is well-known that the KKT conditions are only necessary under certain regularity conditions for global optimality. Moreover, the underlying nonlinear complementarity problems are fundamentally difficult due to the non-monotonicity of the nonlinear operators, and also, many problems in global optimization are NP-hard. The well-developed Fenchel-Moreau-Rockafellar duality theory will produce a so-called duality gap between the primal problem and its Lagrangian dual. Therefore, how to formulate perfect dual problems (with a zero duality gap) is a challenging task in global optimization and nonconvex analysis. Extensions of the classical Lagrangian duality and the primal-dual schema to nonconvex systems are ongoing research endeavors. On the flip side, the Hellinger-Reissner complementary energy principle, emanating from large deformation mechanics, holds for both convex and nonconvex problems. It is very interesting to note that around the same time period of Reissner's work, the generalized potential variational principle in finite deformation elasto-plasticity was proposed independently by Hu Hai-chang (1955) and K. Washizu (1955). These two variational principles are perfectly dual to each other (i.e., with zero duality gap) and play important roles in large deformation mechanics and computational methods. The inner relations between the Hellinger-Reissner and Hu-Washizu principles were discovered by Wei-Zang Chien in 1964 when he proposed a systematic method to construct generalized variational principles in solid mechanics (see [10]).

Mechanics and mathematics have been complementary partners since Newton's time, and the history of science shows much evidence of the beneficial influence of these disciplines on each other. However, the independent developments of complementary-duality theory in mathematics and mechanics for more than a half century have generated a "duality gap" between the two partners. In modern analysis, the mathematical theory of duality was mainly based on the Fenchel transformation. During the last three decades, many modified versions of the Fenchel-Moreau-Rockafellar duality have been proposed. One, the so-called relaxation method in nonconvex mechanics, can be used to solve the relaxed convex problems (see [2, 14, 138]).
However, due to the duality gap, these relaxed solutions do not directly yield real solutions to the nonconvex primal problems. Thus, tremendous efforts have been focused recently on finding the so-called perfect duality theory in global optimization. On the other hand, it seems that most engineers and scientists prefer the classical Legendre transformation. It turns out that their attention has been mainly focused on how to use traditional Lagrange multiplier methods and complementary constitutive laws to correctly formulate complementary variational principles for numerical computational and application purposes. Although the generalized Hellinger-Reissner principle leads to a perfect duality between the nonconvex potential variational problem and its complementary-dual, and has many important consequences in large deformation theory and computational mechanics, the extremality property of this well-known principle, as well as the Hu-Washizu principle, remained an open problem for more than 40 years, and this raised many arguments in large deformation theory and nonconvex mechanics (see [85, 135, 79, 98, 99, 83, 84, 66]). Actually, this open problem was partially solved in 1989 in the joint work of Gao and Strang [62] on nonconvex/nonsmooth variational problems. In order to recover the lost symmetry between the nonconvex primal problem and its dual, they introduced a so-called complementary gap function, which leads to a nonlinear Lagrangian duality theory in fully nonlinear variational problems. They proved that if this gap function is positive on a dual feasible space, the generalized Hellinger-Reissner energy is a saddle-Lagrangian. Therefore, this gap function provides a sufficient condition in nonconvex variational problems. However, the extremality conditions for a negative gap function were ignored until 1997 when Gao [32] got involved with a project on postbuckling problems in nonconvex mechanics. He discovered that if this gap function is negative, the generalized Hellinger-Reissner energy (the so-called super-Lagrangian) is concave in each of its variables, which led to a bi-duality theory. Therefore, a canonical duality theory has gradually developed, first in nonconvex mechanics, and then in global optimization. This new theory is composed mainly of a potentially useful canonical dual transformation and an associated triality theory, whose components comprise a saddle min-max duality and two pairs of double-min, double-max dualities. The canonical dual transformation can be used to formulate perfect dual problems without a duality gap, whereas the triality theory can be used to identify both global and local extrema.

The goal of this paper is to present a comprehensive review on the canonical duality theory within a unified framework, and to expose its role in establishing connections between nonconvex mechanics and global optimization. Applications to constrained nonconvex optimization problems are shown to reveal some important new results that are fundamental to global optimization theory. This paper should be of interest to both the operations research and applied mathematics communities.
In order to make this presentation easy to follow by interdisciplinary readers, our attention here is mainly focused on smooth systems, although some concepts from nonsmooth analysis have been used in later sections.

8.2 Quadratic Minimization Problems

Let us begin with the simplest quadratic minimization problem (in short, the primal problem $(\mathcal{P}_q)$):

$$(\mathcal{P}_q): \quad \min \left\{ P(u) = \tfrac{1}{2} \langle u, Au \rangle - \langle u, f \rangle \; : \; u \in \mathcal{U}_k \right\}, \tag{8.1}$$

where $\mathcal{U}_k$ is an open subset of a linear space $\mathcal{U}$; $A$ is a linear symmetrical operator, which maps each $u \in \mathcal{U}$ into its dual space $\mathcal{U}^*$; the bilinear form $\langle u, u^* \rangle : \mathcal{U} \times \mathcal{U}^* \to \mathbb{R}$ puts $\mathcal{U}$ and $\mathcal{U}^*$ in duality; $f \in \mathcal{U}^*$ is a given input, and $P : \mathcal{U} \to \mathbb{R}$ represents the total cost (action) of the system. The criticality condition $\delta P(u) = 0$ leads to a linear equation

$$Au = f, \tag{8.2}$$



which is called the fundamental equation (or equilibrium equation) in mathematical physics. By the fact that $A : \mathcal{U} \to \mathcal{U}^*$ is a symmetrical operator, we have the following canonical decomposition,

$$A = \Lambda^* D \Lambda, \tag{8.3}$$

where $\Lambda : \mathcal{U} \to \mathcal{V}$ is a so-called geometrical operator, which maps each $u \in \mathcal{U}$ into a so-called intermediate space $\mathcal{V}$, and the symmetrical operator $D$ links $\mathcal{V}$ with its dual space $\mathcal{V}^*$. The bilinear form $\langle v ; v^* \rangle : \mathcal{V} \times \mathcal{V}^* \to \mathbb{R}$ puts $\mathcal{V}$ and $\mathcal{V}^*$ in duality. We distinguish between the notations $\langle \cdot , \cdot \rangle$ and $\langle \cdot \, ; \cdot \rangle$ according to the differences of the dual spaces $\mathcal{U} \times \mathcal{U}^*$ and $\mathcal{V} \times \mathcal{V}^*$ on which they are respectively defined. The mapping $v^* = Dv \in \mathcal{V}^*$ is called the duality equation. The adjoint operator $\Lambda^* : \mathcal{V}^* \to \mathcal{U}^*$, defined by $\langle \Lambda u ; v^* \rangle = \langle u, \Lambda^* v^* \rangle$, is also called the balance operator. Thus, by the use of the intermediate pair $(v, v^*)$, the fundamental equation (8.2) can be split into the so-called tri-canonical form

$$\left. \begin{array}{ll} \text{(a) geometrical equation:} & \Lambda u = v \\ \text{(b) duality equation:} & D v = v^* \\ \text{(c) balance equation:} & \Lambda^* v^* = f \end{array} \right\} \;\Rightarrow\; \Lambda^* D \Lambda u = f. \tag{8.4}$$

In mathematical physics, the duality equation $v^* = Dv$ is also recognized as the constitutive law and the operator $D$ depends on the physical properties of the system considered. The pair $(v, v^*)$ is said to be a canonical dual pair on $\mathcal{V}_a \times \mathcal{V}_a^* \subset \mathcal{V} \times \mathcal{V}^*$ if the duality mapping $D : \mathcal{V}_a \subset \mathcal{V} \to \mathcal{V}_a^* \subset \mathcal{V}^*$ is one-to-one and onto. Generally speaking, most physical variables appear in dual pairs, i.e., there exists a Gâteaux differentiable function $V : \mathcal{V}_a \to \mathbb{R}$ such that the duality relation $v^* = \delta V(v) : \mathcal{V}_a \to \mathcal{V}_a^*$ is invertible, where $\delta V(v)$ represents the Gâteaux derivative of $V$ at $v$. In mathematical physics, such a function is called free energy. Its Legendre conjugate $V^*(v^*) : \mathcal{V}^* \to \mathbb{R}$, defined by the Legendre transformation

$$V^*(v^*) = \operatorname{sta}\{ \langle v ; v^* \rangle - V(v) \; : \; v \in \mathcal{V}_a \}, \tag{8.5}$$

is called complementary energy, where $\operatorname{sta}\{\,\}$ denotes finding stationary points of the statement in $\{\,\}$. In order to study the canonical duality theory, consider the following definition.

Definition 1. A real-valued function $V : \mathcal{V}_a \subset \mathcal{V} \to \mathbb{R}$ is called a canonical function on $\mathcal{V}_a$ if its Legendre conjugate $V^*(v^*)$ can be uniquely defined on $\mathcal{V}_a^* \subset \mathcal{V}^*$ such that the following relations hold on $\mathcal{V}_a \times \mathcal{V}_a^*$:

$$v^* = \delta V(v) \;\Leftrightarrow\; v = \delta V^*(v^*) \;\Leftrightarrow\; \langle v ; v^* \rangle = V(v) + V^*(v^*). \tag{8.6}$$
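For instance, the three relations in (8.6) can be checked by hand for a quadratic function. The short derivation below is only an illustrative sketch; it assumes that $D$ is symmetric and invertible, which is an assumption made here for the example.

```latex
\begin{align*}
V(v) &= \tfrac12 \langle v ; Dv \rangle, \qquad
  v^* = \delta V(v) = Dv \;\Longleftrightarrow\; v = D^{-1} v^*, \\
V^*(v^*) &= \operatorname{sta}_{v} \{ \langle v ; v^* \rangle - V(v) \}
  = \langle D^{-1} v^* ; v^* \rangle - \tfrac12 \langle D^{-1} v^* ; v^* \rangle
  = \tfrac12 \langle D^{-1} v^* ; v^* \rangle, \\
V(v) + V^*(v^*) &= \tfrac12 \langle v ; Dv \rangle + \tfrac12 \langle v ; Dv \rangle
  = \langle v ; v^* \rangle \qquad \text{whenever } v^* = Dv .
\end{align*}
```

The stationary point in the second line is unique because $D$ is invertible, which is what makes the conjugate well defined in this case.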



Clearly, if $D : \mathcal{V}_a \to \mathcal{V}_a^*$ is invertible, the quadratic function $V(v) = \tfrac12 \langle v ; Dv \rangle$ is canonical on $\mathcal{V}_a$ and its Legendre conjugate $V^*(v^*) = \tfrac12 \langle D^{-1} v^* ; v^* \rangle$ is a canonical function on $\mathcal{V}_a^*$. Generally speaking, if $V : \mathcal{V}_a \to \mathbb{R}$ is a canonical function and $v^* = \delta V(v)$, then $(v, v^*)$ is a canonical dual pair on $\mathcal{V}_a \times \mathcal{V}_a^*$. The one-to-one canonical duality relation serves as a foundation for the canonical dual transformation method reviewed in the following sections. The definition of the canonical pairs and functions can be generalized to nonsmooth systems where the Fenchel transformation and sub-differential have to be applied (see [38, 40]). This is discussed in the context of constrained global optimization problems in Section 8 of this paper. In order to study general problems, we denote the linear function $\langle u, f \rangle$ by $U(u)$. If the feasible space $\mathcal{U}_k$ can be written in the form of

$$\mathcal{U}_k = \{ u \in \mathcal{U}_a \mid \Lambda u \in \mathcal{V}_a \}, \tag{8.7}$$

then the problem $(\mathcal{P}_q)$ can be written in a general form

$$(\mathcal{P}): \quad \min \{ P(u) = V(\Lambda u) - U(u) \; : \; u \in \mathcal{U}_k \}. \tag{8.8}$$

This general form covers many problems in applications. In continuum mechanics, the feasible set Uk is usually called the kinetically admissible space. In statics, where the function V (v) is viewed as an internal (or stored) energy and U (u) is considered as an external energy, the cost function P (u) is the so-called total potential and (P) represents a minimal potential variational problem. In dynamical systems if V (v) is considered as a kinetic energy and U (u) is the total potential, then P (u) is called the total action of the system. In this case, the variational problem associated with the general form (P) is the well-known least action principle. A diagrammatic representation of this tri-canonical decomposition is shown in Figure 8.1.

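[Figure 8.1: diagram. Top row: $u \in \mathcal{U}_a \subset \mathcal{U}$ and $\mathcal{U}^* \supset \mathcal{U}_a^* \ni u^*$, paired by $\langle u, u^* \rangle$. Bottom row: $v \in \mathcal{V}_a \subset \mathcal{V}$ and $\mathcal{V}^* \supset \mathcal{V}_a^* \ni v^*$, paired by $\langle v ; v^* \rangle$. $\Lambda$ maps the top-left space down to the bottom-left space; $\Lambda^*$ maps the bottom-right space up to the top-right space.]

Fig. 8.1. Diagrammatic representation for quadratic systems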

The development of the $\Lambda^* D \Lambda$-operator theory was apparently initiated by von Neumann in 1932, and was subsequently extended and put into a more general setting in the studies of complementary variational principles in continuum mechanics by Noble (1966), Rall (1969), Arthurs (1970, 1980), Tonti (1972), Oden and Reddy (1983), and Sewell (1987). In mathematical analysis, the tri-canonical form of $A = \Lambda^* D \Lambda$ has also been used to develop a mathematical theory of duality by Rockafellar (1970), Ekeland and Temam (1976), Toland (1978, 1979), Auchmuty (1983), Clarke (1985), and many others. In the excellent textbook by Strang (1986), the tri-factorization $A = \Lambda^* D \Lambda$ for linear operators can be seen through an application of continuum theories to discrete systems. In what follows, we list some simple examples. More applications can be found in the monograph [38].

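8.2.1 Quadratic Optimization Problems in $\mathbb{R}^n$

First, we consider $\mathcal{U}$ as a finite-dimensional space such that $\mathcal{U} = \mathcal{U}^* = \mathbb{R}^n$. Thus $A : \mathcal{U} \to \mathcal{U}^*$ is a symmetric matrix in $\mathbb{R}^{n \times n}$ and the bilinear form $\langle u, u^* \rangle = u^T u^*$ is simply a dot-product in $\mathbb{R}^n$. By linear algebra, the canonical decomposition $A = \Lambda^* D \Lambda$ can be performed in many ways (see Strang, 1986), where $\Lambda : \mathbb{R}^n \to \mathbb{R}^m$ is a matrix, $D : \mathbb{R}^m \to \mathbb{R}^m$ is a symmetrical matrix, and $\Lambda^* = \Lambda^T$ maps $\mathcal{V}^* = \mathbb{R}^m$ back to $\mathcal{U}^* = \mathbb{R}^n$. The bilinear forms $\langle \cdot , \cdot \rangle$ and $\langle \cdot \, ; \cdot \rangle$ are simply dot products in $\mathbb{R}^n$ and $\mathbb{R}^m$, respectively, that is

$$\langle \Lambda u ; v^* \rangle = \sum_{i=1}^{m} v_i^* \left( \sum_{j=1}^{n} \Lambda_{ij} u_j \right) = \sum_{j=1}^{n} u_j \left( \sum_{i=1}^{m} \Lambda_{ij} v_i^* \right) = \langle u, \Lambda^T v^* \rangle.$$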
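As a concrete finite-dimensional illustration of the decomposition $A = \Lambda^T D \Lambda$ and of the chain (8.4), the following minimal sketch uses a fixed-fixed chain of two masses and three springs. The chain, the spring constants, the trial displacement, and the helper names (`matvec`, `matmul`, `transpose`, `dot`) are assumptions introduced here for illustration only; they do not come from the text.

```python
from fractions import Fraction


def transpose(M):
    """Return the transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*M)]


def matvec(M, x):
    """Return the matrix-vector product M x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def matmul(A, B):
    """Return the matrix product A B."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def dot(x, y):
    """Dot product, used for both bilinear forms <.,.> and <.;.>."""
    return sum(a * b for a, b in zip(x, y))


# Geometrical operator: maps 2 displacements to 3 spring elongations.
Lam = [[1, 0], [-1, 1], [0, -1]]
# Duality operator: assumed (hypothetical) spring constants 1, 2, 1.
D = [[1, 0, 0], [0, 2, 0], [0, 0, 1]]
LamT = transpose(Lam)                    # balance operator Lam^* = Lam^T

A = matmul(LamT, matmul(D, Lam))         # A = Lam^T D Lam

# Tri-canonical chain (8.4) for a trial displacement u.
u = [Fraction(3, 5), Fraction(2, 5)]
v = matvec(Lam, u)                       # (a) geometrical equation
v_star = matvec(D, v)                    # (b) duality equation
f = matvec(LamT, v_star)                 # (c) balance equation

# Adjoint identity <Lam u ; v*> = <u, Lam^T v*>.
lhs = dot(matvec(Lam, u), v_star)
rhs = dot(u, matvec(LamT, v_star))

print("A =", A)
print("f =", f, " A u =", matvec(A, u))
print("adjoint identity holds:", lhs == rhs)
```

Exact rational arithmetic is used so that the identities can be compared with `==` rather than with a floating-point tolerance.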

If the matrix $A$ is positive semi-definite, we can always choose a geometrical operator $\Lambda$ to ensure that the matrix $D \in \mathbb{R}^{m \times m}$ is positive definite. In this case the problem $(\mathcal{P})$ is a convex program and any solution of the fundamental equation $Au = f$ also solves the minimization problem $(\mathcal{P})$. If the matrix $A$ is indefinite, the quadratic function $\tfrac12 \langle u, Au \rangle$ is nonconvex. From linear algebra, it follows then that by choosing a particular linear operator $\Lambda : \mathbb{R}^n \to \mathbb{R}^m$, the matrix $A$ can be written in the tri-canonical form:

$$A = \begin{pmatrix} \Lambda^T, & I \end{pmatrix} \begin{pmatrix} D & 0 \\ 0 & -C \end{pmatrix} \begin{pmatrix} \Lambda \\ I \end{pmatrix}, \tag{8.9}$$

where $D \in \mathbb{R}^{m \times m}$ is positive definite, $C \in \mathbb{R}^{n \times n}$ is positive semi-definite, and $I$ is an identity in $\mathbb{R}^n$. In this case, both $V(v) = \tfrac12 \langle v ; Dv \rangle$ and $U(u) = \tfrac12 \langle u, Cu \rangle + \langle u, f \rangle$ are convex quadratic functions, but

$$P(u) = V(\Lambda u) - U(u) = \tfrac12 \langle \Lambda u ; D \Lambda u \rangle - \tfrac12 \langle u, Cu \rangle - \langle u, f \rangle$$

is a nonconvex d.c. function, that is, a difference of convex functions. In this case, the problem $(\mathcal{P})$ is a nonconvex quadratic minimization and the solution of $Au = f$ is only a critical point of $P(u)$. Nonconvex quadratic programming and d.c. programming are important from both the mathematical and application viewpoints. Sahni (1974) first showed that for a negative definite matrix $A$, the problem $(\mathcal{P})$ is NP-hard.
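The splitting in (8.9) is not unique. The sketch below shows one simple choice, offered as an assumption rather than as the construction intended by the authors: take $\Lambda = I$, $C = cI$ with a shift $c$ larger than the largest absolute row sum of $A$ (a Gershgorin-type bound on the eigenvalues), and $D = A + C$. The function names `dc_split` and `is_positive_definite` are hypothetical.

```python
def dc_split(A):
    """Split a symmetric matrix as A = D - C with D positive definite, C = c*I.

    This corresponds to (8.9) with Lam = I. The shift c exceeds the largest
    absolute row sum of A, which bounds every eigenvalue of A in magnitude.
    """
    n = len(A)
    c = 1 + max(sum(abs(a) for a in row) for row in A)
    C = [[c if i == j else 0 for j in range(n)] for i in range(n)]
    D = [[A[i][j] + C[i][j] for j in range(n)] for i in range(n)]
    return D, C


def is_positive_definite(M):
    """Attempt a Cholesky factorization of a symmetric matrix M.

    The factorization succeeds exactly when M is positive definite.
    """
    n = len(M)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = M[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0:
                    return False
                L[i][i] = s ** 0.5
            else:
                L[i][j] = s / L[j][j]
    return True


A = [[1, 2], [2, -3]]            # indefinite example: determinant is -7
D, C = dc_split(A)

print("D =", D, "positive definite:", is_positive_definite(D))
print("C =", C)
print("A positive definite:", is_positive_definite(A))
```

With this choice both $V(v) = \tfrac12 \langle v ; Dv \rangle$ and $\tfrac12 \langle u, Cu \rangle$ are convex, while their difference reproduces the original indefinite quadratic form.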



This result was also proved by Vavasis (1990, 1991) and by Pardalos (1991). During the last decade, several authors have shown that the general quadratic programming problem (P) is an NP-hard problem in global optimization (cf. Murty and Kabadi, 1987; Horst et al., 2000). It was shown by Pardalos and Vavasis (1990) that even when the matrix A is of rank one with exactly one negative eigenvalue, the problem is NP-hard. In order to solve this difficult problem, much effort has been devoted during the last decade. Comprehensive surveys have been given by Floudas and Visweswaran [19] for quadratic programming, and by Tuy [132] for d.c. optimization.

8.2.2 Variational Problems in Continuum Mechanics

In continuous systems the linear space $\mathcal{U}$ is usually a function space over a time-space domain, and the linear mapping $A$ is a differential operator. In classical Newtonian dynamics, for example, the fundamental equation (8.2) is a second order differential equation

$$Au = -m u'' = f,$$

where $f$ is an applied force field. In this case, $\Lambda = d/dt$ is a linear differential operator, $m > 0$ is a mass density, and $\Lambda^* = -d/dt$ can be defined by integrating by parts over a time domain $T \subset \mathbb{R}$ with boundary $\partial T$:

$$\langle \Lambda u ; v^* \rangle = \int_T u' v^* \, dt = \int_T u \, (-v^*)' \, dt = \langle u, \Lambda^* v^* \rangle,$$

subject to the boundary conditions $u(t) v^*(t) = 0$, $\forall t \in \partial T$. For Newton's law, $D = m$ is a constant and the tri-canonical form $Au = \Lambda^* D \Lambda u = -m u'' = f$ is Newton's equilibrium equation. The quadratic form

$$V(\Lambda u) = \tfrac12 \langle u, Au \rangle = \tfrac12 \langle \Lambda u ; D \Lambda u \rangle = \int_T \tfrac12 m u'^2 \, dt$$

represents the internal (or kinetic) energy of the system, and the linear term

$$U(u) = \int_T u f \, dt$$

represents the external energy of the system. The function $P(u) = V(\Lambda u) - U(u)$ is called the total action, which is a convex functional. For Einstein's law, however, $D = m(t) = m_o / \sqrt{1 - v^2/c^2}$ depends on the velocity $v = u'$, where $m_o > 0$ is a constant and $c$ is the speed of light. In this case, the tri-canonical form $Au = f$ leads to Einstein's theory of special relativity:

$$-\frac{d}{dt} \left( \frac{m_o}{\sqrt{1 - u'^2/c^2}} \, \frac{d}{dt} u \right) = f.$$

The kinetic energy

$$V(v) = \int_T -m_o \sqrt{1 - v^2/c^2} \, dt$$

is no longer quadratic, but is still a convex functional on $\mathcal{V}_a = \{ v \in L^\infty(T) \mid v(t) < c, \ \forall t \in T \}$. By using the canonical dual transformation, the nonlinear minimization problem $(\mathcal{P})$ can be solved analytically (see [39]).

In mass-spring systems, $A = -(m \partial_{tt} + k)$ and the fundamental equation (8.2) has the form:

$$Au = -m u'' - k u = f.$$

The additional term $ku$ represents the spring force and $k > 0$ is a spring constant. In this case, if we let $\Lambda = (\partial_t, 1)^T$ be a vector-valued operator, the second order linear differential operator $A$ can still be written in the $\Lambda^* D \Lambda$ form as

$$A = -\left( m \frac{\partial^2}{\partial t^2} + k \right) = \begin{bmatrix} -\dfrac{\partial}{\partial t}, & 1 \end{bmatrix} \begin{bmatrix} m & 0 \\ 0 & -k \end{bmatrix} \begin{bmatrix} \dfrac{\partial}{\partial t} \\ 1 \end{bmatrix}. \tag{8.10}$$

As evident here, if we let $\Lambda = (\partial_t, 1)^T$ be a vector-valued operator, the operator $D$ is indefinite. However, if we let $\Lambda = \partial_t$, then similar to (8.9), we have $D = m$, which is positive definite. Thus in this dynamical system, we have

$$V(v) = \int_T \tfrac12 m v^2 \, dt, \qquad U(u) = \int_T \left( \tfrac12 k u^2 - u f \right) dt,$$

where the quadratic function $U(u)$ represents the total potential energy. The quadratic functional given by

$$P(u) = V(\Lambda u) - U(u) = \int_T \tfrac12 m u_{,t}^2 \, dt - \int_T \left[ \tfrac12 k u^2 - u f \right] dt \tag{8.11}$$

is the well-known total action, which is again a d.c. functional. Actually, every function $P(u) \in C^2$ is d.c. on any compact convex set $\mathcal{U}_k$, and any d.c. optimization problem can be reduced to the canonical form (see Tuy, 1995):

$$\min \{ V(\Lambda u) \; : \; U(u) \le 0, \ G(u) \ge 0 \}, \tag{8.12}$$

where $V$, $U$, and $G$ are convex functions. In the next section, we demonstrate how the tri-canonical $\Lambda^* D \Lambda$-operator theory serves as a framework for the Lagrangian duality theory.
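Returning to the mass-spring factorization (8.10), the product can be verified term by term. The short check below is a sketch that assumes $m$ and $k$ are constants and that $\Lambda^* = (-\partial_t, 1)$ follows from the same integration by parts used for Newton's law.

```latex
\begin{align*}
\Lambda u &= \begin{pmatrix} \partial_t u \\ u \end{pmatrix}, \qquad
D \Lambda u = \begin{pmatrix} m & 0 \\ 0 & -k \end{pmatrix}
  \begin{pmatrix} \partial_t u \\ u \end{pmatrix}
  = \begin{pmatrix} m \, \partial_t u \\ -k u \end{pmatrix}, \\
\Lambda^* D \Lambda u &= \begin{pmatrix} -\partial_t & 1 \end{pmatrix}
  \begin{pmatrix} m \, \partial_t u \\ -k u \end{pmatrix}
  = -m \, \partial_{tt} u - k u = A u .
\end{align*}
```

The diagonal entries $m > 0$ and $-k < 0$ of $D$ have opposite signs, which is why this vector-valued choice of $\Lambda$ yields an indefinite $D$.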

8.3 Canonical Lagrangian Duality Theory

Classical Lagrangian duality was originally studied by Lagrange in analytical mechanics. In engineering mechanics it has been recognized as the complementary variational principle, and has been studied extensively for several centuries. In this section, we show its connection to constrained optimization/variational problems. In addition to the well-known saddle Lagrangian duality theory, a so-called super-Lagrangian duality is presented within a unified framework, which leads to a bi-duality theorem in d.c. programming and convex Hamiltonian systems. Recall the general primal problem (8.8)

$$(\mathcal{P}): \quad \min \{ P(u) = V(\Lambda u) - U(u) \; : \; u \in \mathcal{U}_k \}, \tag{8.13}$$

where $V : \mathcal{V}_a \subset \mathcal{V} \to \mathbb{R}$ is a canonical function, $U : \mathcal{U}_a \to \mathbb{R}$ is a Gâteaux differentiable function, either linear or canonical, and $\mathcal{U}_k = \{ u \in \mathcal{U}_a \mid \Lambda u \in \mathcal{V}_a \}$ is a convex feasible set. Without loss of generality, we assume that the geometrical operator $\Lambda : \mathcal{U}_a \to \mathcal{V}$ can be chosen in a way such that the canonical function $V : \mathcal{V}_a \to \mathbb{R}$ is convex. By the definition of the canonical function, the duality relation $v^* = \delta V(v) : \mathcal{V}_a \to \mathcal{V}_a^*$ leads to the following Fenchel-Young equality on $\mathcal{V}_a \times \mathcal{V}_a^*$,

$$V(v) = \langle v ; v^* \rangle - V^*(v^*).$$

Substituting this into equation (8.13), the Lagrangian $L(u, v^*) : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ associated with the canonical problem $(\mathcal{P})$ can be defined by

$$L(u, v^*) = \langle \Lambda u ; v^* \rangle - V^*(v^*) - U(u). \tag{8.14}$$

Definition 2 (Canonical Lagrangian). A function $L : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ associated with the problem $(\mathcal{P})$ is called a canonical Lagrangian if it is a canonical function on $\mathcal{V}_a^*$ and a canonical or linear function on $\mathcal{U}_a$.

The criticality condition $\delta L(\bar u, \bar v^*) = 0$ leads to the well-known Lagrange equations:

$$\Lambda \bar u = \delta V^*(\bar v^*), \qquad \Lambda^* \bar v^* = \delta U(\bar u). \tag{8.15}$$

By the fact that $V : \mathcal{V}_a \to \mathbb{R}$ is a canonical function, the Lagrange equations (8.15) are equivalent to $\Lambda^* \delta V(\Lambda \bar u) = \delta U(\bar u)$. If $(\bar u, \bar v^*)$ is a critical point of $L(u, v^*)$, then $\bar u$ is a critical point of $P(u)$ on $\mathcal{U}_k$. Because the canonical function $V$ is assumed to be convex on $\mathcal{V}_a$, the canonical Lagrangian $L(u, v^*)$ is concave on $\mathcal{V}_a^*$. Thus, the extremality conditions of the critical point of $L(u, v^*)$ depend on the convexity of the function $U(u)$. Two important duality theories are associated with the canonical Lagrangian, as exposed in Sections 8.3.1 and 8.3.2 below.

8.3.1 Saddle-Lagrangian Duality

First, we assume that $U(u)$ is a concave function on $\mathcal{U}_a$. In this case, $L(u, v^*)$ is a saddle-Lagrangian, that is, $L(u, v^*)$ is convex on $\mathcal{U}_a$ and concave on $\mathcal{V}_a^*$. By the traditional definition, a pair $(\bar u, \bar v^*)$ is called a saddle point of $L(u, v^*)$ on $\mathcal{U}_a \times \mathcal{V}_a^*$ if

$$L(u, \bar v^*) \ge L(\bar u, \bar v^*) \ge L(\bar u, v^*), \quad \forall (u, v^*) \in \mathcal{U}_a \times \mathcal{V}_a^*. \tag{8.16}$$

The classical saddle-Lagrangian duality theory can be presented precisely by the following theorem.

Theorem 1 (Saddle-Minmax Theorem) Suppose that the function $U : \mathcal{U}_a \to \mathbb{R}$ is concave and there exists a linear operator $\Lambda : \mathcal{U}_a \to \mathcal{V}_a$ such that the canonical Lagrangian $L : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ is a saddle function. If $(\bar u, \bar v^*) \in \mathcal{U}_a \times \mathcal{V}_a^*$ is a critical point of $L(u, v^*)$, then

$$\min_{u \in \mathcal{U}_k} \max_{v^* \in \mathcal{V}_a^*} L(u, v^*) = L(\bar u, \bar v^*) = \max_{v^* \in \mathcal{V}_k^*} \min_{u \in \mathcal{U}_a} L(u, v^*). \tag{8.17}$$

By using this theorem, the dual function $P^d(v^*)$ can be defined as

$$P^d(v^*) = \min_{u \in \mathcal{U}_a} L(u, v^*) = U^\flat(\Lambda^* v^*) - V^*(v^*), \tag{8.18}$$

where $U^\flat : \mathcal{U}^* \to \mathbb{R}$ is a Fenchel conjugate function of $U$ defined by the Fenchel transformation

$$U^\flat(u^*) = \min_{u \in \mathcal{U}_a} \{ \langle u, u^* \rangle - U(u) \}. \tag{8.19}$$

Because $U(u)$ is a concave function on $\mathcal{U}_a$, the Fenchel conjugate $U^\flat$ is also a concave function on $\mathcal{U}_a^* \subset \mathcal{U}^*$. Thus, on the dual feasible space $\mathcal{V}_k^*$ defined by

$$\mathcal{V}_k^* = \{ v^* \in \mathcal{V}_a^* \mid \Lambda^* v^* \in \mathcal{U}_a^* \}, \tag{8.20}$$

the problem, which is dual to $(\mathcal{P})$, can be proposed as the following,

$$(\mathcal{P}^d): \quad \max \{ P^d(v^*) \; : \; v^* \in \mathcal{V}_k^* \}. \tag{8.21}$$

The saddle min-max duality theory leads to the following well-known result.

Theorem 2 (Saddle-Lagrangian Duality Theorem) Suppose that $L(u, v^*) : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ is a canonical saddle Lagrangian and $(\bar u, \bar v^*)$ is a critical point of $L(u, v^*)$. Then $\bar u$ is a global minimizer of $P(u)$, $\bar v^*$ is a global maximizer of $P^d(v^*)$, and

$$\min_{u \in \mathcal{U}_k} P(u) = P(\bar u) = L(\bar u, \bar v^*) = P^d(\bar v^*) = \max_{v^* \in \mathcal{V}_k^*} P^d(v^*). \tag{8.22}$$

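Particularly, for a given $f \in \mathcal{U}_a^*$ such that $U(u) = \langle u, f \rangle$ is a linear function on $\mathcal{U}_a$, the Fenchel-conjugate $U^\flat(u^*)$ can be computed as

$$U^\flat(u^*) = \min_{u \in \mathcal{U}_a} \{ \langle u, u^* \rangle - U(u) \} = \begin{cases} 0 & \text{if } u^* = f, \\ -\infty & \text{otherwise.} \end{cases} \tag{8.23}$$

Its effective domain is $\mathcal{U}_a^* = \{ u^* \in \mathcal{U}^* \mid u^* = f \}$. Thus, the dual feasible space can be well-defined as $\mathcal{V}_k^* = \{ v^* \in \mathcal{V}_a^* \mid \Lambda^* v^* = f \}$, and the dual problem is a concave maximization problem with a linear constraint:

$$(\mathcal{P}^d): \quad \max \{ P^d(v^*) = -V^*(v^*) \; : \; \Lambda^* v^* = f, \ v^* \in \mathcal{V}_a^* \}. \tag{8.24}$$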

By using the Lagrange multiplier $u \in \mathcal{U}_a$ to relax the linear constraint, we have

$$L(u, v^*) = -V^*(v^*) + \langle u, (\Lambda^* v^* - f) \rangle,$$

which is exactly the canonical Lagrangian (8.14) associated with the problem $(\mathcal{P})$ if the Lagrange multiplier $u$ is in $\mathcal{U}_a$ such that $V(\Lambda u)$ is a canonical function on $\mathcal{V}_a$. This shows that the classical Lagrangian can be obtained in two ways:

1. Legendre transformation method (by choosing a proper linear operator $\Lambda$ in $(\mathcal{P})$)
2. Classical Lagrange multiplier method (by relaxing the constraint $\Lambda^* v^* = f$ in $(\mathcal{P}^d)$)

In engineering mechanics, because $V^*$ is called the complementary energy, the constrained problem $\min \{ V^*(v^*) : \Lambda^* v^* = f, \ v^* \in \mathcal{V}_a^* \}$ is also called the complementary variational problem and the Lagrangian $L(u, v^*)$ is called the generalized complementary energy. In computational mechanics, the saddle-Lagrangian duality theory serves as a foundation for mixed and hybrid finite element methods.
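The zero duality gap in (8.22) and the constrained dual (8.24) can be observed numerically on a small convex example. The sketch below is illustrative only: it assumes $V(v) = \tfrac12 \langle v ; Dv \rangle$ with a diagonal positive $D$, a linear $U(u) = \langle u, f \rangle$, and a hypothetical three-spring chain; the names `Lam_u`, `LamT_v`, `P`, and `Pd` are introduced here and are not from the text.

```python
from fractions import Fraction as Fr

Lam = [[1, 0], [-1, 1], [0, -1]]    # geometrical operator, v = Lam u
d = [Fr(1), Fr(2), Fr(1)]           # diagonal of D (assumed positive)
f = [Fr(1), Fr(0)]                  # given input


def Lam_u(u):
    """Geometrical equation: v = Lam u."""
    return [sum(l * x for l, x in zip(row, u)) for row in Lam]


def LamT_v(vs):
    """Balance operator: Lam^T v*."""
    return [sum(Lam[i][j] * vs[i] for i in range(len(Lam)))
            for j in range(len(Lam[0]))]


def P(u):
    """Primal function P(u) = V(Lam u) - <u, f> with V(v) = 1/2 <v ; D v>."""
    v = Lam_u(u)
    return (sum(di * vi * vi for di, vi in zip(d, v)) / 2
            - sum(x * y for x, y in zip(u, f)))


def Pd(vs):
    """Dual function (8.24): P^d(v*) = -1/2 <D^{-1} v* ; v*> on Lam^T v* = f."""
    if LamT_v(vs) != f:
        raise ValueError("v* is not dual feasible")
    return -sum(x * x / di for di, x in zip(d, vs)) / 2


# u_bar solves A u = f for A = Lam^T D Lam = [[3, -2], [-2, 3]].
u_bar = [Fr(3, 5), Fr(2, 5)]
v_bar = [di * vi for di, vi in zip(d, Lam_u(u_bar))]   # v* = D Lam u

print("P(u_bar)  =", P(u_bar))
print("Pd(v_bar) =", Pd(v_bar))

# Other dual-feasible points: (1, 1, 1) spans the null space of Lam^T.
for t in (Fr(-1), Fr(1, 2), Fr(2)):
    vs = [x + t for x in v_bar]
    print("t =", t, " Pd =", Pd(vs), " <= P(u_bar):", Pd(vs) <= P(u_bar))
```

In this example every dual-feasible value stays below the primal minimum and the two meet only at the critical pair, which is the behaviour stated by Theorem 2 for a saddle Lagrangian.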

8.3.2 Super-Lagrangian Duality

If the function $U : \mathcal{U}_a \to \mathbb{R}$ is convex, the canonical Lagrangian $L(u, v^*)$ is concave in each of its variables $u \in \mathcal{U}_a$ and $v^* \in \mathcal{V}_a^*$. However, $L(u, v^*)$ may not be concave in $(u, v^*) \in \mathcal{U}_a \times \mathcal{V}_a^*$ (see examples in [38]). In this case, consider the following definition that was introduced in [38].

Definition 3. A point $(\bar u, \bar v^*)$ is said to be a super-critical (or $\partial^+$-critical) point of $L$ on $\mathcal{U}_a \times \mathcal{V}_a^*$ if

$$L(\bar u, v^*) \le L(\bar u, \bar v^*) \ge L(u, \bar v^*), \quad \forall (u, v^*) \in \mathcal{U}_a \times \mathcal{V}_a^*. \tag{8.25}$$

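A function $L : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ is said to be a super-critical (or $\partial^+$) function on $\mathcal{U}_a \times \mathcal{V}_a^*$ if it is concave in each of its arguments, that is

$$L(\cdot, v^*) : \mathcal{U}_a \to \mathbb{R} \ \text{is concave}, \ \forall v^* \in \mathcal{V}_a^*, \qquad L(u, \cdot) : \mathcal{V}_a^* \to \mathbb{R} \ \text{is concave}, \ \forall u \in \mathcal{U}_a.$$

In particular, if the super-critical function $L : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ is a Lagrange form, it is called a super-Lagrangian.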

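From a duality viewpoint, a point $(\bar{u}, \bar{v}^*)$ is said to be a sub-critical (or $\partial^-$-critical) point of $L$ on $\mathcal{U}_a \times \mathcal{V}_a^*$ if

$$L(\bar{u}, v^*) \ge L(\bar{u}, \bar{v}^*) \le L(u, \bar{v}^*), \quad \forall (u, v^*) \in \mathcal{U}_a \times \mathcal{V}_a^*. \tag{8.26}$$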

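This definition comes from the sub-differential (see [38]):

$$\bar{v}^* \in \partial^- V(\bar{v}) = \{v^* \in \mathcal{V}_a^* \mid V(v) - V(\bar{v}) \ge \langle v - \bar{v}; v^*\rangle, \; \forall v \in \mathcal{V}_a\}.$$

Clearly, $(\bar{u}, \bar{v}^*)$ is a super-critical point of $L$ on $\mathcal{U}_a \times \mathcal{V}_a^*$ if and only if it is a sub-critical point of $-L$ on $\mathcal{U}_a \times \mathcal{V}_a^*$.

Theorem 3 (Super-Lagrangian Duality Theorem [38]) Suppose that there exists a linear operator $\Lambda : \mathcal{U}_a \to \mathcal{V}_a$ such that $L : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ is a super-Lagrangian. If $(\bar{u}, \bar{v}^*) \in \mathcal{U}_a \times \mathcal{V}_a^*$ is a super-critical point of $L(u, v^*)$ on $\mathcal{U}_a \times \mathcal{V}_a^*$, then either the super-maximum theorem in the form

$$\max_{u \in \mathcal{U}_a} \max_{v^* \in \mathcal{V}_k^*} L(u, v^*) = L(\bar{u}, \bar{v}^*) = \max_{v^* \in \mathcal{V}_k^*} \max_{u \in \mathcal{U}_a} L(u, v^*) \tag{8.27}$$

holds, or the super-minimax theorem in the form

$$\min_{u \in \mathcal{U}_a} \max_{v^* \in \mathcal{V}_k^*} L(u, v^*) = L(\bar{u}, \bar{v}^*) = \min_{v^* \in \mathcal{V}_k^*} \max_{u \in \mathcal{U}_a} L(u, v^*) \tag{8.28}$$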

holds.

Based on this super-Lagrangian duality theorem, a dual function to the nonconvex d.c. function $P(u) = V(\Lambda u) - U(u)$ can be formulated as

$$P^d(v^*) = \max_{u \in \mathcal{U}_a} L(u, v^*) = U^\sharp(\Lambda^* v^*) - V^*(v^*), \tag{8.29}$$

where $U^\sharp : \mathcal{U}^* \to \mathbb{R}$ is defined by the super-Fenchel transformation

$$U^\sharp(u^*) = \max\{\langle u, u^*\rangle - U(u) : u \in \mathcal{U}_a\}. \tag{8.30}$$

Suppose that $\mathcal{U}_a^* \subset \mathcal{U}^*$ is an effective domain of $U^\sharp$. Then on the dual feasible space $\mathcal{V}_k^* = \{v^* \in \mathcal{V}_a^* \mid \Lambda^* v^* \in \mathcal{U}_a^*\}$, we have the following result.

Theorem 4 (Bi-Duality Theory [38]) If $(\bar{u}, \bar{v}^*)$ is a super-critical point of $L(u, v^*)$, then either the double-min theorem in the form

$$\min_{u \in \mathcal{U}_k} P(u) = P(\bar{u}) = L(\bar{u}, \bar{v}^*) = P^d(\bar{v}^*) = \min_{v^* \in \mathcal{V}_k^*} P^d(v^*) \tag{8.31}$$

holds, or the double-max theorem in the form

$$\max_{u \in \mathcal{U}_k} P(u) = P(\bar{u}) = L(\bar{u}, \bar{v}^*) = P^d(\bar{v}^*) = \max_{v^* \in \mathcal{V}_k^*} P^d(v^*) \tag{8.32}$$

holds.


The Hamiltonian $H : \mathcal{U}_a \times \mathcal{V}_a^* \to \mathbb{R}$ associated with the Lagrangian is defined by

$$H(u, v^*) = \langle \Lambda u; v^*\rangle - L(u, v^*) = V^*(v^*) + U(u). \tag{8.33}$$

Clearly, if $L(u, v^*)$ is a super-Lagrangian, the Hamiltonian $H(u, v^*)$ is convex in each of its variables and in terms of $H(u, v^*)$, the Lagrange equations (8.15) can be written in the so-called Hamiltonian canonical form:

$$\Lambda u = \delta_{v^*} H(u, v^*), \qquad \Lambda^* v^* = \delta_u H(u, v^*). \tag{8.34}$$

However, this nice symmetrical form and the convexity of the Hamiltonian do not afford new insights into the extremality conditions of the nonconvex problem. The super-Lagrangian duality theory plays an important role in d.c. programming, convex Hamiltonian systems, and global optimization.

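8.3.3 Applications in Quadratic Programming and Commentary

Now, let us consider the nonconvex quadratic programming problem $(\mathcal{P}_q)$ where the cost function is a d.c. function

$$P(u) = \frac{1}{2}\langle \Lambda u; D\Lambda u\rangle - \frac{1}{2}\langle u, Cu\rangle - \langle u, f\rangle$$

as discussed in Section 8.2.1, where $D$ is a positive definite matrix in $\mathbb{R}^{m \times m}$, and $C \in \mathbb{R}^{n \times n}$ is positive semi-definite. Because $U(u) = \frac{1}{2}\langle u, Cu\rangle + \langle u, f\rangle$ in this case is convex, the Lagrangian

$$L(u, v^*) = \langle \Lambda u; v^*\rangle - \frac{1}{2}\langle D^{-1} v^*; v^*\rangle - \frac{1}{2}\langle u, Cu\rangle - \langle u, f\rangle$$

is a super-Lagrangian. By using the super-Fenchel transformation, we have

$$U^\sharp(u^*) = \max_{u \in \mathcal{U}_a} \left\{\langle u, u^* - f\rangle - \frac{1}{2}\langle u, Cu\rangle\right\} = \frac{1}{2}\langle C^+(u^* - f), (u^* - f)\rangle,$$

subject to $u^* - f \in \mathcal{C}(C)$, where $C^+$ is a pseudo-inverse of $C$ and $\mathcal{C}(C)$ represents the column space of $C$. Thus, on the dual feasible space

$$\mathcal{V}_k^* = \{v^* \in \mathcal{V}_a^* \subset \mathbb{R}^m \mid \Lambda^T v^* - f \in \mathcal{C}(C)\}, \tag{8.35}$$

the dual function

$$P^d(v^*) = \frac{1}{2}\langle C^+(\Lambda^* v^* - f), \Lambda^* v^* - f\rangle - \frac{1}{2}\langle D^{-1} v^*; v^*\rangle \tag{8.36}$$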


is also a d.c. function. The bi-duality theorem shows that the optimal values of the primal and dual problems are equal. If $\bar{u}$ solves the primal (either minimization or maximization) and $\Lambda^* \bar{v}^* - f \in \partial^- U(\bar{u})$, then $\bar{v}^*$ solves the dual.

One of the earliest and best known double-min duality schemes was formulated by Toland [128] for the d.c. minimization problem

$$\min\{W(u) - U(u) : u \in \operatorname{dom} W\}, \tag{8.37}$$

where $W(u)$ is an arbitrary function, $U(u)$ is a convex proper lsc function on $\mathbb{R}^n$, and $\operatorname{dom} W$ represents the effective domain of $W$. The dual problem is

$$\min\{U^\sharp(u^*) - W^\sharp(u^*) : u^* \in \operatorname{dom} U^\sharp\}, \tag{8.38}$$

which is also a d.c. minimization problem in $\mathbb{R}^n$. Generalizations were made by Auchmuty (1983) to general nonconvex functionals with a linear operator $\Lambda$. Since then, several important duality concepts have been developed and studied for nonconvex optimization and d.c. programming by Crouzeix (1981), Hiriart-Urruty (1985), Singer (1988), Penot & Volle (1990), Tuy (1990, 1991, 1992), Thach (1993, 1995), and many others. A detailed review on duality in d.c. programming appears in Tuy [132]. Much of the foregoing discussion is based on generalized nonconvex functionals, which are allowed to be extended-real-valued. In order to avoid difficulties such as $\infty - \infty$, a modified version of the double-min duality in optimization was presented in Rockafellar and Wets [107].

It is traditional in the calculus of variations and optimization that the primal problem is always taken to be a minimization problem. However, this tradition somewhat obscures our view of more general problems. In convex Hamiltonian systems where $V(\Lambda u) = \frac{1}{2}\langle \Lambda u, D\Lambda u\rangle$ is a kinetic energy function and $U(u) = \frac{1}{2}\langle u, Cu\rangle + \langle u, f\rangle$ is a total potential energy function, the d.c. function $P(u) = V(\Lambda u) - U(u)$ represents a total action of the system. As pointed out in [16, 38], in the context of dynamical systems, the least action principle is somewhat misleading because the action is a d.c. function that takes minimum and maximum values periodically over the time domain. Both the min- and the max-primal problems have to be considered simultaneously in a period. The bi-duality theorem reveals a periodic behavior of dynamical systems. In two-person game theory, the bi-duality theory shows that the d.c. programming problem has two Nash equilibrium points.

The super-Lagrangian duality and the associated bi-duality theory were first proposed in the monograph [38]. Based on this theory and the tri-canonical form $\Lambda^* D \Lambda$, we reformulated the nonconvex quadratic programming problem in a dual form of (8.36), which is well-defined on the dual feasible space $\mathcal{V}_k^* \subset \mathbb{R}^m$ (8.35). Because $m \le n$, we believe this new dual form will play an important role in nonconvex quadratic programming theory.
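The zero-gap claim for the quadratic d.c. problem can be illustrated with a small numerical sketch. It assumes that $C$ is positive definite (so that $C^+ = C^{-1}$ and the column-space condition in (8.35) is automatically satisfied) and that $\Lambda^T D \Lambda - C$ is invertible; the random data and the function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 2, 3                                   # m <= n, as in the text
Lam = rng.standard_normal((m, n))             # assumed linear operator
D = np.diag(rng.uniform(1.0, 2.0, m))         # positive definite
G = rng.standard_normal((n, n))
C = G @ G.T + n * np.eye(n)                   # assumed positive definite, so C^+ = C^{-1}
f = rng.standard_normal(n)

def P(u):
    """Primal d.c. function P(u) = 0.5<Lam u; D Lam u> - 0.5<u, C u> - <u, f>."""
    v = Lam @ u
    return 0.5 * v @ D @ v - 0.5 * u @ C @ u - u @ f

def P_dual(v_star):
    """Dual d.c. function (8.36) with C^+ replaced by C^{-1}."""
    r = Lam.T @ v_star - f
    return 0.5 * r @ np.linalg.solve(C, r) - 0.5 * v_star @ np.linalg.solve(D, v_star)

def P_dual_grad(v_star):
    """Gradient of P_dual, used only to confirm dual criticality."""
    r = Lam.T @ v_star - f
    return Lam @ np.linalg.solve(C, r) - np.linalg.solve(D, v_star)

# Critical point of P: (Lam^T D Lam - C) u = f, then v* = D Lam u
u_bar = np.linalg.solve(Lam.T @ D @ Lam - C, f)
v_bar = D @ Lam @ u_bar

print(P(u_bar), P_dual(v_bar), np.linalg.norm(P_dual_grad(v_bar)))
```

The two printed function values agree and the dual gradient vanishes at `v_bar`, so a critical point of the $n$-dimensional primal is matched by a critical point of the $m$-dimensional dual with the same value. The sketch does not decide whether that point is a double-min or a double-max.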


8.4 Complementary Variational Principles in Continuum Mechanics

This section presents two simple applications of the canonical Lagrange duality theory in continuum mechanics. The first application shows the connection between the mathematical theory of saddle-Lagrangian duality and the complementary energy variational principles in static linear elasticity, which are well known in solid mechanics and computational mechanics. The second application, of the super-Lagrangian duality theory to convex Hamiltonian systems, may bring some important insights into extremality conditions in dynamic systems.

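8.4.1 Linear Elasticity

Let us consider an elastic material in $\mathbb{R}^3$ occupying a simply connected domain $\Omega \subset \mathbb{R}^3$ with boundary $\Gamma = \partial\Omega = \Gamma_u \cup \Gamma_t$ such that $\Gamma_u \cap \Gamma_t = \emptyset$. On $\Gamma_u$, the boundary displacement $\bar{u}$ is given, whereas on $\Gamma_t$, a surface traction $\bar{t}$ is prescribed. Suppose that the elastic body is subjected to a distributed force field $f$. The equilibrium equation $Au = f$ has the following form,

$$-\frac{\partial}{\partial x_j}\left(D_{ijkl}\frac{\partial u_k(x)}{\partial x_l}\right) = f_i(x), \quad \forall x \in \Omega, \tag{8.39}$$

where $D = \{D_{ijkl}\}$ $(i, j, k, l = 1, 2, 3)$ is a positive definite fourth-order elastic tensor, satisfying $D_{ijkl} = D_{jikl} = D_{klij}$, and Einstein's summation convention over the repeated sub-indices is used here. In this problem, $A = -\operatorname{div} D \operatorname{grad}$ is an elliptic operator, $\Lambda = \operatorname{grad}$ is a gradient, and $v = \operatorname{grad} u$ is called the deformation gradient. Its symmetrical part is an infinitesimal strain tensor, denoted as $\epsilon = \frac{1}{2}(\nabla u + (\nabla u)^T)$. The dual variable $v^* = D\epsilon$ is a stress tensor, usually denoted by $\sigma$. In this infinite dimensional system $\mathcal{U} = L^2(\Omega; \mathbb{R}^3) = \mathcal{U}^*$ and $\mathcal{V} = L^2(\Omega; \mathbb{R}^{3\times 3}) = \mathcal{V}^*$. The bilinear forms are defined by

$$\langle u, f\rangle = \int_\Omega u \cdot f \, d\Omega, \qquad \langle \epsilon, \sigma\rangle = \int_\Omega \epsilon : \sigma \, d\Omega,$$

where $\epsilon : \sigma = \operatorname{tr}(\epsilon \cdot \sigma) = \epsilon_{ij}\sigma_{ij}$. The adjoint operator $\Lambda^*$ in this case is $\Lambda^* = \{-\operatorname{div} \text{ in } \Omega, \; n\,\cdot \text{ on } \Gamma\}$, and $-\operatorname{div}$ is also called the formal adjoint of $\Lambda = \operatorname{grad}$. Let

$$\mathcal{U}_a = \{u \in \mathcal{U} \mid u(x) = \bar{u}(x), \; \forall x \in \Gamma_u\}, \qquad \mathcal{V}_a = \{\epsilon \in \mathcal{V} \mid \epsilon(x) = \epsilon^T(x), \; \forall x \in \Omega\}.$$

Thus on the feasible space, i.e., the so-called kinematically admissible space $\mathcal{U}_k = \{u \in \mathcal{U}_a \mid \Lambda u \in \mathcal{V}_a\}$, the quadratic form

$$P(u) = \int_\Omega \frac{1}{2}(\nabla u) : D : (\nabla u) \, d\Omega - \int_{\Gamma_t} u \cdot f \, d\Gamma \tag{8.40}$$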

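is the so-called total potential of the deformed elastic body. The minimal potential principle leads to the convex variational problem

$$\min\{P(u) : u \in \mathcal{U}_k\}. \tag{8.41}$$

The functional $V(\epsilon) = \frac{1}{2}\langle \epsilon; D\epsilon\rangle$ is called the internal (or stored) potential. Its Legendre conjugate

$$V^*(\sigma) = \{\langle \epsilon, \sigma\rangle - V(\epsilon) \mid \sigma = D : \epsilon\} = \int_\Omega \frac{1}{2}\sigma : D^{-1} : \sigma \, d\Omega$$

is known as the complementary energy in solid mechanics. Because the external potential

$$U(u) = \int_\Omega u \cdot f \, d\Omega + \int_{\Gamma_t} u \cdot \bar{t} \, d\Gamma$$

is linear, the Lagrangian associated with the total potential $P(u)$, as given by

$$L(u, \sigma) = \int_\Omega \left[(\nabla u) : \sigma - \frac{1}{2}\sigma : D^{-1} : \sigma\right] d\Omega - \int_{\Gamma_t} u \cdot \bar{t} \, d\Gamma, \tag{8.42}$$

can be considered as a saddle Lagrangian, which is the well-known generalized Hellinger-Reissner complementary energy. Thus, by the saddle Lagrangian duality, the dual functional $P^d(\sigma)$ is defined by

$$P^d(\sigma) = \min_{u \in \mathcal{U}_a} L(u, \sigma) = U^\flat(\Lambda^* \sigma) - V^*(\sigma),$$

where

$$U^\flat(\Lambda^* \sigma) = \min_u \left\{\int_\Omega (\nabla u) : \sigma \, d\Omega - \int_\Omega u \cdot f \, d\Omega - \int_{\Gamma_t} u \cdot \bar{t} \, d\Gamma\right\} = \begin{cases} \int_{\Gamma_u} \bar{u} \cdot \sigma \cdot n \, d\Gamma & \text{if } -\operatorname{div}\sigma = 0 \text{ in } \Omega, \; \sigma \cdot n = \bar{t} \text{ on } \Gamma_t, \\ -\infty & \text{otherwise.} \end{cases}$$

Thus, on the dual feasible space, that is the so-called statically admissible space defined by

$$\mathcal{V}_k^* = \{\sigma \in \mathcal{V}_a^* \mid -\operatorname{div}\sigma = 0 \text{ in } \Omega, \; \sigma \cdot n = \bar{t} \text{ on } \Gamma_t\},$$

the dual problem for this linear elasticity case is given by

$$\max\left\{P^d(\sigma) = \int_{\Gamma_u} \bar{u} \cdot \sigma \cdot n \, d\Gamma - \int_\Omega \frac{1}{2}\sigma : D^{-1} : \sigma \, d\Omega \; : \; \sigma \in \mathcal{V}_k^*\right\}. \tag{8.43}$$

This is a concave maximization problem with linear constraints. The Lagrange multiplier $u$ for the equilibrium constraints is the solution of the primal problem.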


In continuum mechanics, the functional $-P^d$, denoted by

$$P^c(\sigma) = \int_\Omega \frac{1}{2}\sigma : D^{-1} : \sigma \, d\Omega - \int_{\Gamma_u} \bar{u} \cdot \sigma \cdot n \, d\Gamma,$$

is called the total complementary energy. Thus, instead of the dual problem (8.43), the minimum complementary variational problem $\min\{P^c(\sigma) : \sigma \in \mathcal{V}_k^*\}$ has been extensively studied by engineers, which serves as a foundation for the so-called stress, or equilibrium, finite element methods.

8.4.2 Convex Hamiltonian Systems

Recall the mass-spring dynamical system discussed in Section 2, where the total action is a d.c. function of the form

$$P(u) = V(\Lambda u) - U(u) = \int_T \frac{1}{2} m (u_{,t})^2 \, dt - \int_T \left[\frac{1}{2} k u^2 - u f\right] dt. \tag{8.44}$$

Because the Lagrangian

$$L(u, p) = \int_T \left[u_{,t}\, p - \frac{1}{2} m^{-1} p^2 - \frac{1}{2} k u^2\right] dt - \int_T u f \, dt$$

is not a saddle function, the Hamiltonian

$$H(u, p) = \langle \Lambda u, p\rangle - L(u, p) = \int_T \left[\frac{1}{2} m^{-1} p^2 + \frac{1}{2} k u^2\right] dt + \int_T u f \, dt \tag{8.45}$$

was extensively used in classical dynamical systems. One of the main reasons for this could be that $H(u, p)$ is convex. Thus, the original differential equation $Au = -m u_{,tt} - k u = f$ can be written in the well-known Hamiltonian canonical form:

$$\Lambda u = \delta_p H(u, p), \qquad \Lambda^* p = \delta_u H(u, p). \tag{8.46}$$

However, an important phenomenon has been hiding in the shadow of this convex Hamiltonian for centuries. Because $L(u, p)$ is a super-Lagrangian, the dual action can be formulated as

$$P^d(p) = \max_u L(u, p) = \int_T \frac{1}{2} k^{-1} (p_{,t} - f)^2 \, dt - \int_T \frac{1}{2} m^{-1} p^2 \, dt,$$

which is also a d.c. functional. The bi-duality theory

$$\min P(u) = \min P^d(p), \qquad \max P(u) = \max P^d(p)$$

shows that the well-known least action principle in periodic dynamical systems is actually a misnomer: the periodic solution $u(t)$ does not necessarily minimize the total action $P(u)$; it could be either a minimizer or a maximizer, depending on the time period (see [38]).

8.5 Nonconvex Problems with Double-Well Energy

We now turn our attention to duality theory in nonconvex systems by considering a very simple problem in $\mathbb{R}^n$:

$$(\mathcal{P}_w): \quad \min\left\{P(u) = \frac{1}{2}\alpha\left(\frac{1}{2}|Bu|^2 - \lambda\right)^2 - \langle u, f\rangle \; : \; u \in \mathbb{R}^n\right\}, \tag{8.47}$$

where $B \in \mathbb{R}^{m \times n}$ is a matrix, $\alpha, \lambda > 0$ are positive constants, and $|v|$ denotes the Euclidean norm of $v$. The criticality condition $\delta P(u) = 0$ leads to a coupled nonlinear algebraic system in $\mathbb{R}^n$:

$$\alpha\left(\frac{1}{2}|Bu|^2 - \lambda\right) B^T B u = f. \tag{8.48}$$

Clearly, it is difficult to solve this nonlinear system by direct methods. Also, due to the nonconvexity of $P(u)$, any solution to this nonlinear system satisfies only a necessary condition. The nonconvex function $W(v) = \frac{1}{2}\alpha\left(\frac{1}{2}|v|^2 - \lambda\right)^2$ is a double-well function: for $\lambda > 0$, $W(v)$ has two minimizers and one local maximizer (see Fig. 8.2(a)). The global and local minimizers depend on the input $f$ (see Fig. 8.2(b)).

Fig. 8.2. (a) Graph of $W(u) = \frac{1}{2}(\frac{1}{2}u^2 - \lambda)^2$; (b) panel with curves labeled $f > 0$ and $f < 0$ (panel title not recoverable).

This double-well function has extensive applications in mathematical physics. In phase transitions of shape memory alloys, or in the mathematical theory of super-conductivity, $W(v)$ is the well-known Landau second order free energy, and each of its local minimizers represents a possible phase state of the material. In quantum mechanics, if $v$ represents the Higgs' field strength, then $W(v)$ is the energy. It was discovered in the context of post-buckling analysis of large deformed beam models that the total potential is also a double-well energy (see Gao, [41]), and each potential well represents a possible buckled beam state. More examples can be found in a recent review article [47].


8.5.1 Classical Lagrangian and Duality Gap

If we choose $\Lambda = B$ as a linear operator, the primal function can be written in the traditional form $P(u) = W(Bu) - U(u)$, where $U(u) = \langle u, f\rangle$ is a linear function. Because the duality relation $v^* = \delta W(v) = \alpha(\frac{1}{2}|v|^2 - \lambda)v$ is not one-to-one, the Legendre conjugate

$$W^*(v^*) = \operatorname{sta}\{\langle v, v^*\rangle - W(v) : v \in \mathbb{R}^m\}$$

is not uniquely defined. Thus, the entity $(v, v^*)$ associated with the nonconvex function $W(v)$ is not a canonical dual pair. By using the Fenchel transformation

$$W^\sharp(v^*) = \max\{\langle v, v^*\rangle - W(v) : v \in \mathbb{R}^m\},$$

the traditional Lagrangian (associated with the linear operator $\Lambda = B$) can still be defined as

$$L(u, v^*) = \langle Bu, v^*\rangle - W^\sharp(v^*) - \langle u, f\rangle. \tag{8.49}$$

Thus, the classical Lagrangian duality theory $P^\sharp(v^*) = \min_u L(u, v^*)$ leads to the well-known Fenchel-Rockafellar dual problem

$$(\mathcal{P}^\sharp): \quad \max_{v^* \in \mathbb{R}^m} \{P^\sharp(v^*) = -W^\sharp(v^*) : B^T v^* = f\}. \tag{8.50}$$

This is a linearly constrained concave maximization problem. The Lagrange multiplier for the linear constraint set is $u$. However, due to the nonconvexity of $W(v)$, the Fenchel-Young inequality

$$W(v) + W^\sharp(v^*) \ge \langle v, v^*\rangle$$

leads to a weak duality relation $\min P \ge \max P^\sharp$. The non-zero value $\theta \equiv \min P(u) - \max P^\sharp(v^*)$ is called the duality gap. This duality gap shows that the classical Lagrange multiplier $u$ may not be a solution to the primal problem. Thus, the Fenchel-Rockafellar duality theory can be used mainly for solving convex problems. In order to eliminate this duality gap, many modified Lagrangian dualities have been proposed during recent years (see, for example, Rubinov et al. (2001 - 2005), Goh and Yang (2002), Huang and Yang (2003), Zhou and Yang (2004)). Most of these mathematical approaches are based on penalization of a class of augmented Lagrangian functions. On the other hand, the canonical duality theory addressed in the next section is based on a fundamental truth in physics, namely that physical variables appear in (canonical) pairs. The one-to-one canonical duality relation leads to a perfect duality theory in mathematical physics and global optimization.

8.5.2 Canonical Dual Transformation and Triality Theory

In order to close the duality gap, a canonical duality theory was developed during the last 15 years: first in nonconvex mechanics and analysis (see [62, 63, 32, 33, 38]), then in global optimization (see [38, 40, 46, 49]). The key idea of this theory is to choose a suitable (usually nonlinear) operator $\xi = \Lambda(u)$ such that the nonconvex function $W(u)$ can be written in the canonical form $W(u) = V(\Lambda(u))$, where $V(\xi)$ is a canonical function of $\xi = \Lambda(u)$. For the present nonconvex problem (8.47), instead of $\Lambda = B$, we choose

$$\xi = \Lambda(u) = \frac{1}{2}|Bu|^2, \tag{8.51}$$

which is a quadratic map from $\mathcal{U} = \mathbb{R}^n$ into $\mathcal{V}_a = \{\xi \in \mathbb{R} \mid \xi \ge 0\}$. Thus, the canonical function

$$V(\xi) = \frac{1}{2}\alpha(\xi - \lambda)^2$$

is simply a scalar-valued quadratic function well-defined on $\mathcal{V}_a$, which leads to a linear duality relation $\varsigma = \delta V(\xi) = \alpha(\xi - \lambda)$. Let $\mathcal{V}_a^* = \{\varsigma \in \mathbb{R} \mid \varsigma \ge -\alpha\lambda\}$ be the range of this duality mapping. So $(\xi, \varsigma)$ forms a canonical duality pair on $\mathcal{V}_a \times \mathcal{V}_a^*$, and the Legendre conjugate $V^*$ is also a quadratic function:

$$V^*(\varsigma) = \operatorname{sta}\left\{\langle \xi; \varsigma\rangle - \frac{1}{2}\alpha(\xi - \lambda)^2 : \xi \in \mathcal{V}_a\right\} = \frac{1}{2}\alpha^{-1}\varsigma^2 + \lambda\varsigma.$$

Thus, replacing $W(u) = V(\Lambda(u)) = \langle \Lambda(u); \varsigma\rangle - V^*(\varsigma)$ in $P(u) = W(u) - U(u)$, the so-called total complementary function [62, 38] can be defined by

$$\Xi(u, \varsigma) = \langle \Lambda(u); \varsigma\rangle - V^*(\varsigma) - U(u) = \frac{1}{2}|Bu|^2 \varsigma - \frac{1}{2}\alpha^{-1}\varsigma^2 - \lambda\varsigma - u^T f. \tag{8.52}$$
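As a check on the Legendre conjugate $V^*$ quoted above, the stationarity computation can be written out explicitly. This is a routine verification added for the reader, under the stated definitions of $V$ and $\varsigma$; it is not part of the original derivation.

$$\begin{aligned}
\frac{\partial}{\partial \xi}\left[\xi\varsigma - \tfrac{1}{2}\alpha(\xi - \lambda)^2\right] = 0
\;&\Longrightarrow\; \varsigma = \alpha(\xi - \lambda), \qquad \xi = \alpha^{-1}\varsigma + \lambda, \\
V^*(\varsigma) &= (\alpha^{-1}\varsigma + \lambda)\,\varsigma - \tfrac{1}{2}\alpha\,(\alpha^{-1}\varsigma)^2
= \tfrac{1}{2}\alpha^{-1}\varsigma^2 + \lambda\varsigma .
\end{aligned}$$

The restriction $\xi \ge 0$ translates into $\varsigma \ge -\alpha\lambda$, which is the range $\mathcal{V}_a^*$ used above.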

The criticality condition $\delta\Xi(u, \varsigma) = 0$ leads to the following canonical equilibrium equations

$$\left(\frac{1}{2}|Bu|^2 - \lambda\right) = \alpha^{-1}\varsigma, \tag{8.53}$$

$$\varsigma B^T B u = f. \tag{8.54}$$

Equation (8.53) is actually the inverse duality relation $\xi = \delta V^*(\varsigma)$, which is equivalent to $\varsigma = \alpha(\frac{1}{2}|Bu|^2 - \lambda)$. Thus, equation (8.54) is identical to the Euler equation (8.48). This shows that the critical point of the total complementary function is also a critical point of the primal problem. For a fixed $\varsigma \ne 0$, solving (8.54) for $u$ gives

$$u = \frac{1}{\varsigma}(B^T B)^{-1} f. \tag{8.55}$$

Substituting this result into the total complementary function leads to the canonical dual function

$$P^d(\varsigma) = -\frac{1}{2\varsigma} f^T (B^T B)^{-1} f - \lambda\varsigma - \frac{1}{2}\alpha^{-1}\varsigma^2, \tag{8.56}$$

which is well-defined on the dual feasible space given by

$$\mathcal{V}_k^* = \{\varsigma \in \mathcal{V}_a^* \mid \varsigma \ne 0\} = \{\varsigma \in \mathbb{R} \mid \varsigma \ge -\alpha\lambda, \; \varsigma \ne 0\}.$$

The criticality condition $\delta P^d(\varsigma) = 0$ gives the canonical dual algebraic equation:

$$2\varsigma^2(\alpha^{-1}\varsigma + \lambda) = f^T (B^T B)^{-1} f. \tag{8.57}$$

Theorem 5 (Gao [40]) For any given parameters $\alpha, \lambda > 0$, and vector $f \in \mathbb{R}^n$, the canonical dual function (8.56) has at most three critical points $\bar{\varsigma}_i$ $(i = 1, 2, 3)$ satisfying

$$\bar{\varsigma}_1 > 0 > \bar{\varsigma}_2 \ge \bar{\varsigma}_3. \tag{8.58}$$

For each of these roots, the vector

$$\bar{u}_i = (B^T B)^{-1} f / \bar{\varsigma}_i, \quad \text{for } i = 1, 2, 3, \tag{8.59}$$

is a critical point of the nonconvex function $P(u)$ in Problem (8.47), and we have

$$P(\bar{u}_i) = P^d(\bar{\varsigma}_i), \quad \forall i = 1, 2, 3. \tag{8.60}$$

The original version of this theorem was first discovered in a post-bifurcation problem of a large deformed beam model in 1997 [32], which shows that there is no duality gap between the nonconvex function $P(u)$ and its canonical dual $P^d(\varsigma)$. The dual algebraic equation (8.57) can be solved exactly to obtain all critical points, therefore the vector $\{\bar{u}_i\}$ defined by (8.59) yields a complete set of solutions to the nonlinear algebraic system (8.48).

Let $\tau^2 = f^T (B^T B)^{-1} f$. In algebraic geometry, the graph of the algebraic equation $\tau^2 = 2\varsigma^2(\alpha^{-1}\varsigma + \lambda)$ is the so-called singular algebraic curve in $(\varsigma, \tau)$-space (i.e., the point $\varsigma = 0$ is on the curve; cf. [116]). From this algebraic curve, we can see that there exists a constant $\tau_c$ such that if $\tau^2 > \tau_c^2$, the dual algebraic equation (8.57) has a unique solution $\varsigma > 0$. It has three real solutions if and only if $\tau^2 < \tau_c^2$. It is interesting to note that for $\varsigma > 0$, the total complementary function $\Xi(u, \varsigma)$ is a saddle function and the well-known saddle min-max theory leads to

$$\min_u \max_{\varsigma > 0} \Xi(u, \varsigma) = \Xi(\bar{u}, \bar{\varsigma}) = \max_{\varsigma > 0} \min_u \Xi(u, \varsigma). \tag{8.61}$$

Fig. 8.3. Graph of the dual algebraic equation (8.57) and a geometrical proof of the triality theorem. (Plot of $\tau$ against $\varsigma$, with curves labeled $\tau^2 > \tau_c^2$, $\tau^2 = \tau_c^2$, and $\tau^2 < \tau_c^2$.)

Advances in Mech. & Math., Vol. III, Gao & Sherali (ed.), Springer, 2006

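This means that $\bar{u}_1$ is a global minimizer of $P(u)$ and $\bar{\varsigma}_1$ is a global maximizer of $P^d(\varsigma)$ on the open domain $\varsigma > 0$. However, for $\varsigma < 0$, the total complementary function $\Xi(u, \varsigma)$ is concave in both $u$ and $\varsigma < 0$, i.e., it is a super-critical function. Thus, by the bi-duality theory, we have that either

$$\min_u \max_{\varsigma < 0} \Xi(u, \varsigma) = \Xi(\bar{u}, \bar{\varsigma}) = \min_{\varsigma < 0} \max_u \Xi(u, \varsigma)$$

[...] $\bar{\varsigma}_1 > 0 > \bar{\varsigma}_2 > \bar{\varsigma}_3$ such that $\bar{u}_1$ is a global minimizer, $\bar{u}_2$ is a local minimizer, and $\bar{u}_3$ is a local maximizer of $P(u)$.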
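The route from the dual cubic (8.57) back to the primal critical points (8.59) is short enough to sketch numerically. The matrix `B`, the vector `f`, and the values of `alpha` and `lam` below are illustrative assumptions, chosen so that $B^T B$ is invertible and $\tau^2 < \tau_c^2$; the names `sigmas` and `us` are hypothetical. The sketch checks only what (8.57) to (8.60) state (criticality and equal values) and reports which critical point has the lowest energy; it does not classify the remaining two.

```python
import numpy as np

alpha, lam = 1.0, 1.0                                  # assumed parameters
B = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])     # assumed; B^T B must be invertible
f = np.array([0.3, 0.2])                               # assumed input

A = B.T @ B
Ainv_f = np.linalg.solve(A, f)
tau2 = f @ Ainv_f                                      # tau^2 = f^T (B^T B)^{-1} f

def P(u):
    """Primal double-well function (8.47)."""
    return 0.5 * alpha * (0.5 * (B @ u) @ (B @ u) - lam) ** 2 - u @ f

def grad_P(u):
    """Left side minus right side of the Euler equation (8.48)."""
    return alpha * (0.5 * (B @ u) @ (B @ u) - lam) * (A @ u) - f

def P_dual(s):
    """Canonical dual function (8.56)."""
    return -tau2 / (2.0 * s) - lam * s - s ** 2 / (2.0 * alpha)

# (8.57) written as a cubic: (2/alpha) s^3 + 2 lam s^2 - tau^2 = 0
roots = np.roots([2.0 / alpha, 2.0 * lam, 0.0, -tau2])
sigmas = np.sort(roots[np.abs(roots.imag) < 1e-9].real)[::-1]   # real roots, descending
us = [Ainv_f / s for s in sigmas]                               # (8.59)

for s, u in zip(sigmas, us):
    print(f"sigma={s:+.6f}  |grad P|={np.linalg.norm(grad_P(u)):.1e}  "
          f"P={P(u):+.6f}  P^d={P_dual(s):+.6f}")
```

For this data the cubic has three real roots, each recovered `u` satisfies (8.48) to rounding error, the primal and dual values coincide as in (8.60), and the point associated with the positive root has the smallest value of $P$ among the three.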

8.5.3 Canonical Dual Solutions to Nonconvex Variational Problems

Similar to the nonconvex optimization problem (8.47) with the double-well function, let us now consider the following typical nonconvex variational problem:

$$(\mathcal{P}): \quad \min_{u \in \mathcal{U}_k} \left\{P(u) = \int_0^1 \frac{1}{2}\alpha\left(\frac{1}{2}u'^2 - \lambda\right)^2 dx - \int_0^1 u f \, dx\right\}, \tag{8.64}$$

where $f(x)$ is a given function, $\lambda > 0$ is a parameter, and

$$\mathcal{U}_k = \{u \in L^2[0, 1] \mid u' \in L^4[0, 1], \; u(0) = 0\}$$

is an admissible space. Compared with Problem (8.47), we see that the linear operator $B$ in this case is a differential operator $d/dx$. This variational problem appears frequently in association with phase transitions in fluids and solids, and in post-buckling analysis of large deformed structures. The criticality condition $\delta P(u) = 0$ leads to a nonlinear differential equation in the domain $(0, 1)$ with the natural boundary condition at $x = 1$, that is

$$\left[\alpha u'\left(\frac{1}{2}u'^2 - \lambda\right)\right]' + f(x) = 0, \quad \forall x \in (0, 1), \tag{8.65}$$

$$\alpha u'\left(\frac{1}{2}u'^2 - \lambda\right) = 0 \quad \text{at } x = 1. \tag{8.66}$$

Due to its nonlinearity, a solution to this boundary value problem is not unique. In particular, if $f(x) = 0$, equation (8.65) admits the three real roots $u'(x) \in \{0, \pm\sqrt{2\lambda}\}$. Thus, any zig-zag curve $u(x)$ with slopes in $\{0, \pm\sqrt{2\lambda}\}$ solves the boundary value problem, but may not be a global minimizer of the total energy $P(u)$. This problem illustrates an important fact in nonconvex analysis: the criticality condition is only necessary, but not sufficient, for solving variational problems.

Fig. 8.5. Zig-zag function: Solution to the nonlinear boundary value problem (8.65)

Traditional direct approaches for solving nonconvex variational problems are very difficult, or impossible. However, by using the canonical dual transformation, this problem can be solved completely. To see this, we introduce a new "strain measure"

$$\xi = \Lambda(u) = \frac{1}{2}u'^2,$$

such that the canonical functional

$$V(\xi) = \int_0^1 \frac{1}{2}\alpha(\xi - \lambda)^2 \, dx$$

is convex on $\mathcal{V}_a = \{\xi \in L^2[0, 1] \mid \xi(x) \ge 0 \; \forall x \in (0, 1)\}$, and the duality relation $\varsigma = \delta V(\xi) = \alpha(\xi - \lambda)$ is one-to-one. Thus, its Legendre conjugate can be simply obtained as

$$V^*(\varsigma) = \operatorname{sta}\left\{\int_0^1 \xi\varsigma \, dx - V(\xi) : \xi \in \mathcal{V}_a\right\} = \int_0^1 \left(\frac{1}{2}\alpha^{-1}\varsigma^2 + \lambda\varsigma\right) dx.$$
Similar to (8.52), the total complementary function is

$$\Xi(u, \varsigma) = \int_0^1 \left(\frac{1}{2}u'^2\varsigma - \frac{1}{2}\alpha^{-1}\varsigma^2 - \lambda\varsigma\right) dx - \int_0^1 u f \, dx. \tag{8.67}$$

For a given $\varsigma \ne 0$, the canonical dual functional can be obtained as

$$P^d(\varsigma) = \operatorname{sta}\{\Xi(u, \varsigma) : u \in \mathcal{U}_k\} = -\int_0^1 \left(\frac{\tau^2}{2\varsigma} + \lambda\varsigma + \frac{1}{2}\alpha^{-1}\varsigma^2\right) dx, \tag{8.68}$$

where $\tau(x)$ is defined by

$$\tau = -\int_0^x f(x) \, dx + c, \tag{8.69}$$

and the integration constant $c$ depends on the boundary condition. The criticality condition $\delta P^d(\varsigma) = 0$ leads to the dual equilibrium equation

$$2\varsigma^2(\alpha^{-1}\varsigma + \lambda) = \tau^2. \tag{8.70}$$

This algebraic equation is the same as (8.57), which can be solved analytically as stated below.


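Theorem 7 (Analytical Solutions and Triality Theorem [33, 39]) For any given input function $f(x)$ such that $\tau(x)$ is defined by (8.69), the dual algebraic equation (8.70) has at most three real roots $\bar{\varsigma}_i$ $(i = 1, 2, 3)$ satisfying $\bar{\varsigma}_1(x) > 0 > \bar{\varsigma}_2(x) \ge \bar{\varsigma}_3(x)$. For each $\bar{\varsigma}_i$, the function

$$\bar{u}_i(x) = \int_0^x \frac{\tau}{\bar{\varsigma}_i} \, dx \tag{8.71}$$

is a critical point of the variational problem (8.64). Moreover, $\bar{u}_1(x)$ is a global minimizer, $\bar{u}_2(x)$ is a local minimizer, and $\bar{u}_3(x)$ is a local maximizer; that is

$$P(\bar{u}_1) = \min_u \max_{\varsigma > 0} \Xi(u, \varsigma) = \max_{\varsigma > 0} \min_u \Xi(u, \varsigma) = P^d(\bar{\varsigma}_1); \tag{8.72}$$

$$P(\bar{u}_2) = \min_u \max_{\varsigma \in (\bar{\varsigma}_3, 0)} \Xi(u, \varsigma) = \min_{\varsigma \in (\bar{\varsigma}_3, 0)} \max_u \Xi(u, \varsigma) = P^d(\bar{\varsigma}_2); \tag{8.73}$$

$$P(\bar{u}_3) = \max_u \max_{\varsigma < 0} \Xi(u, \varsigma) = \max_{\varsigma < 0} \max_u \Xi(u, \varsigma) = P^d(\bar{\varsigma}_3). \tag{8.74}$$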
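The pointwise recipe behind (8.69) to (8.71) can be sketched on a grid. The choices below are assumptions made only for illustration: $f(x) = 1$, $\alpha = \lambda = 1$, and $c = 1$ so that $\tau(1) = 0$ is compatible with the natural boundary condition (8.66). The grid stops short of $x = 1$, where $\tau$ vanishes and the positive root degenerates. The helper name `positive_dual_root` is hypothetical.

```python
import numpy as np

alpha, lam = 1.0, 1.0                      # assumed parameters

def positive_dual_root(tau):
    """Positive root of 2 s^2 (s/alpha + lam) = tau^2, i.e. (8.70), for tau != 0."""
    roots = np.roots([2.0 / alpha, 2.0 * lam, 0.0, -tau ** 2])
    real = roots[np.abs(roots.imag) < 1e-9].real
    return real[real > 0].max()

# Assumed data: f(x) = 1, so tau(x) = -x + c with c = 1
x = np.linspace(0.0, 0.99, 100)            # avoid x = 1, where tau = 0
tau = 1.0 - x
sigma1 = np.array([positive_dual_root(t) for t in tau])
slope = tau / sigma1                       # u'(x) = tau / sigma, the integrand in (8.71)

# Trapezoidal rule for u(x) = int_0^x tau / sigma dx, with u(0) = 0
u1 = np.concatenate(([0.0], np.cumsum(0.5 * (slope[1:] + slope[:-1]) * np.diff(x))))

# First integral of (8.65): alpha u' (u'^2 / 2 - lam) should equal tau pointwise
residual = alpha * slope * (0.5 * slope ** 2 - lam) - tau
print(np.max(np.abs(residual)), u1[-1])
```

The residual vanishes to rounding error because $\frac{1}{2}u'^2 - \lambda = \tau^2/(2\varsigma^2) - \lambda = \alpha^{-1}\varsigma$ whenever $\varsigma$ solves (8.70), so the slope $\tau/\bar{\varsigma}_1$ satisfies the integrated Euler equation at every grid point.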

[...] if and only if $\varsigma > 0$. Thus, the total complementary function $\Xi(u, \varsigma)$ given by (8.52) is a saddle function for $\varsigma > 0$. This leads to the saddle min-max duality (8.61) in the triality theory.

Example 2. In the nonconvex variational problem (8.64), the quadratic differential operator $\xi = \Lambda(u) = \frac{1}{2}u'^2$ has a physical meaning. In finite deformation theory, if $u$ is considered as the displacement of a deformed body, then $\xi$ can be considered as a Cauchy-Green strain measure (see the following section). The Gâteaux derivative of the quadratic differential operator $\Lambda(u)$ is $\Lambda_t(u) = u' \, d/dx$. For any given $u \in \mathcal{U}_a$, using integration by parts, we get

$$\langle \Lambda_t(u)u; \varsigma\rangle = \int_0^1 u'^2 \varsigma \, dx = u u' \varsigma \Big|_{x=0}^{x=1} - \int_0^1 u [u'\varsigma]' \, dx = \langle u, \Lambda_t^*(u)\varsigma\rangle,$$

which gives the adjoint operator $\Lambda_t^*$ via

$$\Lambda_t^*(u)\varsigma = \begin{cases} u'\varsigma & \text{on } x = 1, \\ -[u'\varsigma]' & \forall x \in (0, 1). \end{cases}$$

For any given $\varsigma \in \mathcal{V}_a$, the $\Lambda$-conjugate transformation is

$$U^\Lambda(\varsigma) = \operatorname{sta}\{\langle \Lambda(u), \varsigma\rangle - U(u) : u \in \mathcal{U}_k\} = -\int_0^1 \frac{1}{2}\tau^2 \varsigma^{-1} \, dx.$$

The complementary operator in this problem is $\Lambda_c(u) = \Lambda(u) - \Lambda_t(u)u = -\frac{1}{2}u'^2$, which leads to the complementary gap function

$$G_c(u, \varsigma) = \int_0^1 \frac{1}{2}u'^2 \varsigma \, dx.$$

Clearly, this is nonnegative if $\varsigma \ge 0$.

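8.6.2 Extremality Conditions: Triality Theory

In order to study the extremality conditions of the nonconvex problem, we need to clarify the convexity of the canonical function $V(\xi)$. Without loss of generality, we assume that $V : \mathcal{V}_a \to \mathbb{R}$ is convex. Thus, for each $u \in \mathcal{U}_a$, the total complementary function $\Xi(u, \varsigma) = \langle \Lambda(u); \varsigma\rangle - V^*(\varsigma) - U(u) : \mathcal{V}_a^* \to \mathbb{R}$ is concave in $\varsigma \in \mathcal{V}_a^*$. The convexity of $\Xi(\cdot, \varsigma) : \mathcal{U}_a \to \mathbb{R}$ will depend on the geometrical operator $\Lambda(u)$ and the function $U(u)$. We furthermore assume that the function $G_\varsigma(u) = \langle \Lambda(u); \varsigma\rangle - U(u) : \mathcal{U}_a \to \mathbb{R}$ is twice Gâteaux differentiable on $\mathcal{U}_a$ and let

$$\mathcal{G} := \{(u, \varsigma) \in \mathcal{U}_a \times \mathcal{V}_a^* \mid \delta^2 G_\varsigma(u; \delta u^2) \ne 0, \; \forall \delta u \ne 0\}, \tag{8.93}$$

$$\mathcal{G}^+ := \{(u, \varsigma) \in \mathcal{U}_a \times \mathcal{V}_a^* \mid \delta^2 G_\varsigma(u; \delta u^2) > 0, \; \forall \delta u \ne 0\}, \tag{8.94}$$

$$\mathcal{G}^- := \{(u, \varsigma) \in \mathcal{U}_a \times \mathcal{V}_a^* \mid \delta^2 G_\varsigma(u; \delta u^2) < 0, \; \forall \delta u \ne 0\}. \tag{8.95}$$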

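Theorem 9 (Triality Theorem) Suppose that $(\bar{u}, \bar{\varsigma}) \in \mathcal{G}$ is a critical point of $\Xi(u, \varsigma)$ and $\mathcal{U}_o \times \mathcal{V}_o^* \subset \mathcal{U}_k \times \mathcal{V}_k^*$ is a neighborhood of $(\bar{u}, \bar{\varsigma})$. If $(\bar{u}, \bar{\varsigma}) \in \mathcal{G}^+$, then $(\bar{u}, \bar{\varsigma})$ is a saddle point of $\Xi(u, \varsigma)$; that is

$$\min_{u \in \mathcal{U}_o} \max_{\varsigma \in \mathcal{V}_o^*} \Xi(u, \varsigma) = \Xi(\bar{u}, \bar{\varsigma}) = \max_{\varsigma \in \mathcal{V}_o^*} \min_{u \in \mathcal{U}_o} \Xi(u, \varsigma). \tag{8.96}$$

If $(\bar{u}, \bar{\varsigma}) \in \mathcal{G}^-$, then $(\bar{u}, \bar{\varsigma})$ is a super-critical point of $\Xi(u, \varsigma)$, and we have that either

$$\min_{u \in \mathcal{U}_o} \max_{\varsigma \in \mathcal{V}_o^*} \Xi(u, \varsigma) = \Xi(\bar{u}, \bar{\varsigma}) = \min_{\varsigma \in \mathcal{V}_o^*} \max_{u \in \mathcal{U}_o} \Xi(u, \varsigma) \tag{8.97}$$

holds, or

$$\max_{u \in \mathcal{U}_o} \max_{\varsigma \in \mathcal{V}_o^*} \Xi(u, \varsigma) = \Xi(\bar{u}, \bar{\varsigma}) = \max_{\varsigma \in \mathcal{V}_o^*} \max_{u \in \mathcal{U}_o} \Xi(u, \varsigma). \tag{8.98}$$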

Proof. By the assumption on the canonical function $V(\xi)$, we know that $\Xi(u, \varsigma)$ is concave on $\mathcal{V}_a^*$. Because $G_\varsigma(u)$ is twice Gâteaux differentiable on $\mathcal{U}_a$, the theory of implicit functions tells us that if $(\bar{u}, \bar{\varsigma}) \in \mathcal{G}$, then there exists a unique $u \in \mathcal{U}_o \subset \mathcal{U}_k$ such that the dual feasible set $\mathcal{V}_k^*$ is non-empty. If such a point $(\bar{u}, \bar{\varsigma}) \in \mathcal{G}^+$, then $G_\varsigma(u)$ is convex in $u$ and $(\bar{u}, \bar{\varsigma})$ is a saddle point of $\Xi$ on $\mathcal{U}_o \times \mathcal{V}_o^*$. The saddle-Lagrangian duality leads to (8.96). If $(\bar{u}, \bar{\varsigma}) \in \mathcal{G}^-$, then $G_\varsigma(u)$ is locally concave in $u$ and $(\bar{u}, \bar{\varsigma})$ is a super-critical point of $\Xi(u, \varsigma)$ on $\mathcal{U}_o \times \mathcal{V}_o^*$. In this case the bi-duality theory leads to (8.97) and (8.98). $\Box$

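If the geometrical operator $\Lambda(u)$ is a quadratic function and $U(u)$ is either quadratic or linear, then the second order Gâteaux derivative $\delta^2 G_\varsigma(u)$ does not depend on $u$. In this case, we let

$$\mathcal{V}_+^* := \{\varsigma \in \mathcal{V}_a^* \mid \delta^2 G_\varsigma(u) \text{ is positive definite}\}, \tag{8.99}$$

$$\mathcal{V}_-^* := \{\varsigma \in \mathcal{V}_a^* \mid \delta^2 G_\varsigma(u) \text{ is negative definite}\}. \tag{8.100}$$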

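The following theorem provides extremality criteria for critical points of $\Xi(u, \varsigma)$.

Theorem 10 (Tri-Duality Theorem [33, 38]) Suppose that $G_\varsigma(u) = \langle \Lambda(u); \varsigma\rangle - U(u)$ is a quadratic function of $u \in \mathcal{U}_a$ and $(\bar{u}, \bar{\varsigma})$ is a critical point of $\Xi(u, \varsigma)$. If $\bar{\varsigma} \in \mathcal{V}_+^*$, then $\bar{u}$ is a global minimizer of $P(u)$ on $\mathcal{U}_k$ if and only if $\bar{\varsigma}$ is a global maximizer of $P^d(\varsigma)$ on $\mathcal{V}_+^*$, and

$$P(\bar{u}) = \min_{u \in \mathcal{U}_k} P(u) = \max_{\varsigma \in \mathcal{V}_+^*} P^d(\varsigma) = P^d(\bar{\varsigma}). \tag{8.101}$$

If $\bar{\varsigma} \in \mathcal{V}_-^*$, then on the neighborhood $\mathcal{U}_o \times \mathcal{V}_o^* \subset \mathcal{U}_a \times \mathcal{V}_a^*$ of $(\bar{u}, \bar{\varsigma})$, we have that either

$$P(\bar{u}) = \min_{u \in \mathcal{U}_o} P(u) = \min_{\varsigma \in \mathcal{V}_o^*} P^d(\varsigma) = P^d(\bar{\varsigma}) \tag{8.102}$$

holds, or

$$P(\bar{u}) = \max_{u \in \mathcal{U}_o} P(u) = \max_{\varsigma \in \mathcal{V}_o^*} P^d(\varsigma) = P^d(\bar{\varsigma}). \tag{8.103}$$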

This theorem shows that the canonical dual solution $\bar{\varsigma} \in \mathcal{V}_+^*$ provides a global optimality condition for the nonconvex primal problem, whereas the condition $\bar{\varsigma} \in \mathcal{V}_-^*$ provides local extremality conditions.

The triality theory was originally discovered in nonconvex mechanics [32, 37]. Since then, several modified versions have been proposed in nonconvex parametrical variational problems (for quadratic $\Lambda(u)$ and linear $U(u)$ [33]), general nonconvex systems (for nonlinear $\Lambda(u)$ and linear $U(u)$ [38]), global optimization (for general nonconvex functions of type $\Phi(u, \Lambda(u))$ [40], quadratic $U(u)$ [46, 47]), and dissipative Hamiltonian systems (for nonconvex/nonsmooth functions of type $\Phi(u, u_{,t}, \Lambda(u))$ [44]). In terms of the parametrical function $G_\varsigma(u) = \langle \Lambda(u); \varsigma\rangle - U(u)$, the current version (Theorems 9 and 10) can be used for solving the general nonconvex problem (8.75) with the canonical function $U(u)$.
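In the scalar case of Problem (8.47) the sign test on $\bar{\varsigma}$ can be compared directly with the second derivative of $P$. The sketch below assumes $n = m = 1$ and $B = 1$, so that $G_\varsigma(u) = \frac{1}{2}u^2\varsigma - uf$ and $\delta^2 G_\varsigma = \varsigma$; the values of `alpha`, `lam`, and `f` are illustrative and chosen so that three real dual roots exist.

```python
import numpy as np

alpha, lam, f = 1.0, 1.0, 0.2            # assumed values with f^2 < 8 alpha^2 lam^3 / 27

def P(u):
    """Scalar double-well energy: P(u) = (alpha/2)(u^2/2 - lam)^2 - f u."""
    return 0.5 * alpha * (0.5 * u ** 2 - lam) ** 2 - f * u

def d2P(u):
    """Second derivative of P, used to type each critical point."""
    return alpha * (1.5 * u ** 2 - lam)

# Dual algebraic equation (8.57) with B = 1: (2/alpha) s^3 + 2 lam s^2 - f^2 = 0
roots = np.roots([2.0 / alpha, 2.0 * lam, 0.0, -f ** 2])
sigmas = np.sort(roots[np.abs(roots.imag) < 1e-9].real)[::-1]   # descending order
us = f / sigmas                                                # (8.59) with B = 1
labels = ["min" if d2P(u) > 0 else "max" for u in us]

for s, u, tag in zip(sigmas, us, labels):
    print(f"sigma={s:+.4f}  u={u:+.4f}  P={P(u):+.4f}  local {tag}")
```

For these values the positive root gives the critical point with the lowest energy, the larger negative root gives a local minimizer, and the smaller negative root gives a local maximizer. This is consistent with the theorem in this one-dimensional setting only; the sketch makes no claim about $n > 1$.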

8.6.3 Complementary Variational Principles in Finite Deformation Theory

In finite deformation theory, the deformation $u(x)$ is a smooth, vector-valued mapping from an open, simply connected, and bounded domain $\Omega \subset \mathbb{R}^n$ into a deformed domain$^4$ $\omega \subset \mathbb{R}^m$. Let $\Gamma = \partial\Omega = \Gamma_u \cup \Gamma_t$ be the boundary of $\Omega$ such that on $\Gamma_u$, the boundary condition $u(x) = \bar{u}$ is prescribed, whereas on the remaining boundary $\Gamma_t$, the surface traction (external force) $\bar{t}(x)$ is applied. Similar to the nonconvex optimization problem (8.47), the primal problem is to minimize the total potential energy functional:

$$\min\left\{P(u) = \int_\Omega [W(\nabla u) - u \cdot f] \, d\Omega - \int_{\Gamma_t} u \cdot \bar{t} \, d\Gamma \; : \; u = \bar{u} \text{ on } \Gamma_u\right\} \tag{8.104}$$

where the stored energy $W(\mathbf{F})$ is a Gâteaux differentiable function of $\mathbf{F} = \nabla u$, and $f(x)$ is a given force field. Because the deformation gradient $\mathbf{F} = \nabla u \in \mathbb{R}^{n \times m}$ is a so-called two-point tensor, which is no longer a strain measure in finite deformation theory, the stored energy $W(\mathbf{F})$ is usually nonconvex. Particularly, for St. Venant-Kirchhoff material (see [38]), we have

$$W(\mathbf{F}) = \frac{1}{2}\left[\frac{1}{2}(\mathbf{F}^T \mathbf{F} - \mathbf{I})\right] : \mathbf{D} : \left[\frac{1}{2}(\mathbf{F}^T \mathbf{F} - \mathbf{I})\right], \tag{8.105}$$

where $\mathbf{I}$ is an identity tensor in $\mathbb{R}^{n \times n}$. Due to nonconvexity, the duality relation $\tau = \delta W(\mathbf{F})$ is not one-to-one. Although the two-point tensor $\tau \in \mathbb{R}^{m \times n}$ is called the first Piola-Kirchhoff stress, according to Hill's constitutive theory, $(\mathbf{F}, \tau)$ is not considered as a work-conjugate (canonical) strain-stress pair (see [38]). The Fenchel-Rockafellar type dual variational problem is

$$\max\left\{P^\sharp(\tau) = \int_{\Gamma_u} \bar{u} \cdot \tau \cdot n \, d\Gamma - \int_\Omega W^\sharp(\tau) \, d\Omega\right\} \tag{8.106}$$

$$\text{s.t. } -\nabla \cdot \tau^T = f \text{ in } \Omega, \qquad n \cdot \tau^T = \bar{t} \text{ on } \Gamma_t. \tag{8.107}$$

In the case where the stored energy $W(\mathbf{F})$ is convex, then $W^\sharp(\tau) = W^*(\tau)$, which is called the complementary energy in elasticity. In this case, the functional

$$\Pi^c(\tau) = \int_\Omega W^*(\tau) \, d\Omega - \int_{\Gamma_u} \bar{u} \cdot \tau \cdot n \, d\Gamma$$

is the well-known Levinson-Zubov complementary energy. As discussed before, if the stored energy $W(\mathbf{F})$ is nonconvex, the Legendre conjugate $W^*$ is not uniquely defined. It turns out that the Levinson-Zubov complementary variational principle can be used only for solving convex problems (see [29]). Although the Fenchel conjugate $W^\sharp(\tau)$ can be uniquely defined, the Fenchel-Young inequality $W(\mathbf{F}) + W^\sharp(\tau) \ge \langle \mathbf{F}; \tau\rangle$ leads to a duality gap between the minimal potential variational problem (8.104) and its Fenchel-Rockafellar dual (see [29]), i.e., in general,

$$\min P(u) \ge \max P^\sharp(\tau). \tag{8.108}$$

$^4$ If $m = n + 1$, then the deformation $u(x)$ represents a hyper-surface in $m$-dimensional space. Applications of the canonical duality theory in differential geometry were discussed in [65].

Because the criticality condition $\delta P^\sharp(\tau) = 0$ is not equivalent to the primal variational problem, and because weak duality is not appreciated in the field of continuum mechanics, the existence of a perfect (i.e., without a duality gap), pure (i.e., involving only the stress tensor as variational argument) complementary variational principle in finite elasticity has been debated among well-known scientists for more than three decades (see [72, 73, 78, 79, 83, 84, 85, 98, 99, 141]). This problem was finally solved by the canonical dual transformation and triality theory in [29, 37]. Similar to the quadratic operator $\Lambda(u) = \frac{1}{2}|Bu|^2$ (see equation (8.51)) chosen for the nonconvex optimization problem (8.48), we let

\[ E = \Lambda(u) = \frac{1}{2}\left[(\nabla u)^T(\nabla u) - I\right], \tag{8.109} \]

which is a symmetric tensor field in $\mathbb{R}^{n\times n}$. In finite deformation theory, $E$ is the well-known Green-St. Venant strain tensor. Thus, in terms of $E$, the stored energy for St. Venant-Kirchhoff material can be written in the canonical form $W(\nabla u) = V(\Lambda(u))$, and

\[ V(E) = \frac{1}{2} E : D : E \]

is a (quadratic) convex function of the symmetric tensor $E \in \mathbb{R}^{n\times n}$. The canonical dual variable $E^* = \delta V(E) = D\cdot E$ is called the second Piola-Kirchhoff stress tensor, denoted as $T$. The Legendre conjugate

\[ V^*(T) = \frac{1}{2} T : D^{-1} : T \tag{8.110} \]

is also a quadratic function. Let $U_a = \{u \in W^{1,p}(\Omega;\mathbb{R}^3) \mid u = \bar{u} \text{ on } \Gamma_u\}$ (where $W^{1,p}$ is a standard Sobolev space with $p \in (1,\infty)$) and $V_a^* = C(\Omega;\mathbb{R}^{n\times n})$. Replacing $W(\nabla u)$ by its canonical dual transformation $V(\Lambda(u)) = E(u) : T - V^*(T)$, the generalized complementary energy $\Xi : U_a \times V_a^* \to \mathbb{R}$ has the following format,

\[ \Xi(u, T) = \int_\Omega \left[E(u) : T - V^*(T) - u\cdot f\right] d\Omega - \int_{\Gamma_t} u\cdot\bar{t}\, d\Gamma, \tag{8.111} \]

which is the well-known Hellinger-Reissner generalized complementary energy in continuum mechanics. Furthermore, if we replace $V^*(T)$ by its bi-Legendre transformation $E : T - V(E)$, then $\Xi(u, T)$ can be written as

\[ \Xi_{hw}(u, T, E) = \int_\Omega \left[(\Lambda(u) - E) : T + V(E) - u\cdot f\right] d\Omega - \int_{\Gamma_t} u\cdot\bar{t}\, d\Gamma. \tag{8.112} \]



This is the well-known Hu-Washizu generalized potential energy in nonlinear elasticity. The Hu-Washizu variational principle has important applications in computational analysis of thin-walled structures, where the geometrical equation $E = \Lambda(u)$ is usually proposed by certain geometrical hypothesis. Because $\Lambda(u)$ is a quadratic operator, its Gâteaux differential at $\bar{u}$ in the direction $u$ is $\delta\Lambda(\bar{u}; u) = \Lambda_t(\bar{u})u = (\nabla\bar{u})^T(\nabla u)$ and

\[ \Lambda_c(u) = \Lambda(u) - \Lambda_t(u)u = -\frac{1}{2}\left[(\nabla u)^T(\nabla u) + I\right]. \]

By using the Gauss-Green theorem, the balance operator $\Lambda_t^*(u)$ can be defined as

\[ \Lambda_t^*(u)T = \begin{cases} -\nabla\cdot\left[(\nabla u)^T\cdot T\right]^T & \text{in } \Omega, \\ n\cdot\left[(\nabla u)^T\cdot T\right]^T & \text{on } \Gamma. \end{cases} \]

The complementary gap function in this problem is a quadratic functional:

\[ G_c(u, T) = \langle -\Lambda_c(u); T\rangle = \int_\Omega \frac{1}{2}\,\mathrm{tr}\left[(\nabla u)^T\cdot T\cdot(\nabla u) + T\right] d\Omega. \tag{8.113} \]

Thus, the complementary variational problem is to find critical (stationary) points $(\bar{u}, \bar{T})$ such that

\[ P^c(\bar{u}, \bar{T}) = \mathrm{sta}\left\{ \int_\Omega \frac{1}{2}\,\mathrm{tr}\left[(\nabla u)^T\cdot T\cdot(\nabla u) + T\right] d\Omega + \int_\Omega V^*(T)\, d\Omega \right\} \tag{8.114} \]

\[ \text{s.t.}\quad -\nabla\cdot\left[(\nabla u)^T\cdot T\right]^T = f \ \text{in } \Omega, \qquad n\cdot\left[(\nabla u)^T\cdot T\right]^T = \bar{t} \ \text{on } \Gamma_t. \]

The following result is due to Gao and Strang in 1989.

Theorem 11 (Complementary-Dual Variational Principle [62]) If $(\bar{u}, \bar{T})$ is a critical point of the complementary variational problem (8.114), then $\bar{u}$ is a critical point of the total potential energy $P(u)$ defined by (8.104), and

\[ P(\bar{u}) + P^c(\bar{u}, \bar{T}) = 0. \]

Moreover, if the complementary gap function

\[ G_c(u, \bar{T}) \ge 0, \quad \forall u \in U_a, \tag{8.115} \]

then $\bar{u}$ is a global minimizer of $P(u)$ and

\[ P(\bar{u}) = \min_u P(u) = \max_T \min_u \Xi(u, T) = -P^c(\bar{u}, \bar{T}), \tag{8.116} \]

subject to $T(x)$ being positive definite for all $x \in \Omega$.

This theorem shows that the positivity of the complementary gap function $G_c(u, T)$ provides a sufficient condition for a global minimizer; the equality $P(\bar{u}) + P^c(\bar{u}, \bar{T}) = 0$ of Theorem 11 and (8.116) indicate that there is no duality gap between the total potential $P(u)$ and its complementary energy $P^c(u, T)$. The physical significance



is also clear: a finitely deformed material is stable if the second Piola-Kirchhoff stress tensor $T(x)$ is positive definite everywhere in the domain $\Omega$.

The linear operator $B = \nabla$ in this nonconvex variational problem is a partial differential operator; therefore it is difficult to find its inverse. It took more than ten years before the canonical dual problem was finally formulated in [35, 37]. To see this, let us assume that for a given force vector field $\bar{t}$ on the boundary $\Gamma_t$, the first Piola-Kirchhoff stress tensor $\tau(x)$ can be defined by solving the following boundary value problem,

\[ -\nabla\cdot\tau^T(x) = f \ \text{in } \Omega, \qquad n\cdot\tau^T = \bar{t} \ \text{on } \Gamma_t. \tag{8.117} \]

Then the canonical dual functional $P^d(T)$ can be formulated as:

\[ P^d(T) = -\int_\Omega \frac{1}{2}\,\mathrm{tr}\left(\tau\cdot T^{-1}\cdot\tau^T + T\right) d\Omega - \int_\Omega V^*(T)\, d\Omega. \tag{8.118} \]

The criticality condition $\delta P^d(T) = 0$ gives the canonical dual equation

\[ T\cdot\left(2\,\delta V^*(T) + I\right)\cdot T = \tau^T\cdot\tau. \tag{8.119} \]

For St. Venant-Kirchhoff material, $V^*(T) = \frac{1}{2} T : D^{-1} : T$ is a quadratic function and its Gâteaux derivative $\delta V^*(T) = D^{-1}\cdot T$ is linear. In this case, the canonical dual equation (8.119) is a cubic equation, which is similar to the dual algebraic equations (8.57) and (8.70).

Theorem 12 (Pure Complementary Energy Principle [35, 37]) Suppose that for a given force field $\bar{t}(x)$ on $\Gamma_t$, the first Piola-Kirchhoff stress field $\tau(x)$ is defined by (8.117). Then each solution $\bar{T}$ of the canonical dual equation (8.119) is a critical point of $P^d$, the vector defined by the line integral

\[ \bar{u} = \int \tau\cdot\bar{T}^{-1}\, dx \tag{8.120} \]

is a critical point of $P(u)$, and

\[ P(\bar{u}) = P^d(\bar{T}). \]

This theorem presents an analytic solution to the nonconvex potential variational problem (8.104). In the finite deformation theory of elasticity, this pure complementary variational principle is also known as the Gao principle [86], which holds also for the general canonical energy function $V(E)$. Similar to Theorem 9, the extremality of the critical points can be identified by the complementary gap function. Applications of this pure complementary variational principle for solving nonconvex/nonsmooth boundary value problems are illustrated in [37, 38, 57].
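To make the cubic structure of (8.119) concrete, the following minimal sketch treats a one-dimensional, homogeneous special case. This reduction is our own simplifying assumption and is not taken from the references: the stress $\tau$ and the modulus $D$ are scalars, the deformation gradient $F = u'$ is constant, and the energy per unit length is $W(F) - \tau F$ with $W(F) = \frac{1}{2} D\,[\frac{1}{2}(F^2 - 1)]^2$. Under these assumptions (8.119) reduces to $T^2(2T/D + 1) = \tau^2$, (8.120) gives $F = \tau/T$, and (8.118) gives $P^d(T) = -\frac{1}{2}(\tau^2/T + T) - T^2/(2D)$. All function names are hypothetical, and the scan-plus-bisection root finder is only one possible choice.

```python
def stored_energy(F, D):
    """St. Venant-Kirchhoff energy in 1-D: W(F) = 1/2 * D * E^2 with E = (F^2 - 1)/2."""
    E = 0.5 * (F * F - 1.0)
    return 0.5 * D * E * E


def potential(F, tau, D):
    """Primal energy per unit length (assumed 1-D reduction): W(F) - tau * F."""
    return stored_energy(F, D) - tau * F


def dual_energy(T, tau, D):
    """1-D form of the canonical dual functional (8.118)."""
    return -0.5 * (tau * tau / T + T) - T * T / (2.0 * D)


def dual_cubic_roots(tau, D, t_min=-5.0, t_max=5.0, steps=20000, iters=100):
    """Real roots T of T^2 (2T/D + 1) = tau^2, found by a sign-change scan
    on [t_min, t_max] followed by bisection (the scan range is an assumption)."""
    def h(T):
        return T * T * (2.0 * T / D + 1.0) - tau * tau

    roots = []
    step = (t_max - t_min) / steps
    a, ha = t_min, h(t_min)
    for k in range(1, steps + 1):
        b = t_min + k * step
        hb = h(b)
        if (ha > 0) != (hb > 0):          # a root lies in [a, b]
            lo, hi = a, b
            for _ in range(iters):
                mid = 0.5 * (lo + hi)
                if (h(mid) > 0) == (ha > 0):
                    lo = mid
                else:
                    hi = mid
            roots.append(0.5 * (lo + hi))
        a, ha = b, hb
    return roots


if __name__ == "__main__":
    tau, D = 0.1, 1.0
    for T in dual_cubic_roots(tau, D):
        F = tau / T                        # 1-D form of (8.120)
        print(T, F, potential(F, tau, D), dual_energy(T, tau, D))
```

In this reduced setting each dual root $T$ yields a critical deformation gradient $F = \tau/T$ with equal primal and dual energies, and the root with $T > 0$ corresponds to the global minimizer, in line with the positivity condition of Theorem 11.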



8.7 Applications to Semi-Linear Nonconvex Systems

The canonical dual transformation and the associated triality theory can be used to solve many difficult problems in engineering and science. In this section, we present applications for solving the following nonconvex minimization problem

\[ (\mathcal{P}) : \quad \min\left\{ P(u) = W(u) + \frac{1}{2}\langle u, Au\rangle - \langle u, f\rangle : u \in U_k \right\}, \tag{8.121} \]

where $W(u) : U_k \to \mathbb{R}$ is a nonconvex function, and $A : U_a \subset U \to U_a^*$ is a linear operator. If $W(u)$ is Gâteaux differentiable, the criticality condition $\delta P(u) = 0$ leads to a nonlinear Euler equation

\[ Au + \delta W(u) = f. \tag{8.122} \]

The abstract form (8.122) of the primal problem $(\mathcal{P})$ covers many situations. In nonconvex mechanics (cf. [58, 47]), where $U$ is an infinite dimensional function space, the state variable $u(x)$ is a field function, and $A : U \to U^*$ is usually a partial differential operator. In this case, the governing equation (8.122) is a so-called semi-linear equation. For example, in the Landau-Ginzburg theory of superconductivity, $A = \Delta$ is the Laplacian over a given space domain $\Omega \subset \mathbb{R}^n$ and

\[ W(u) = \int_\Omega \frac{1}{2}\alpha\left(\frac{1}{2}u^2 - \lambda\right)^2 d\Omega \tag{8.123} \]

is the Landau double-well potential, in which $\alpha, \lambda > 0$ are material constants. Then the governing equation (8.122) leads to the well-known Landau-Ginzburg equation

\[ \Delta u + \alpha u\left(\frac{1}{2}u^2 - \lambda\right) = f. \]

This semi-linear differential equation plays an important role in material science and physics, including ferroelectricity, ferromagnetism, and superconductivity. In a more complicated case where $A = \Delta + \mathrm{curl}\,\mathrm{curl}$, we have

\[ \Delta u + \mathrm{curl}\,\mathrm{curl}\, u + \alpha u\left(\frac{1}{2}u^2 - \lambda\right) = f, \]

which is the so-called Cahn-Hilliard equation in liquid crystal theory. Due to the nonconvexity of the double-well function $W(u)$, any solution of the semi-linear differential equation (8.122) is only a critical point of the total potential $P(u)$. Traditional direct analysis and related numerical methods for finding the global minimizer of the nonconvex variational problem have proven unsuccessful to date.



In dynamical systems, if $A = -\partial_{,tt} + \Delta$ is a wave operator over a given space-time domain $\Omega \subset \mathbb{R}^n \times \mathbb{R}$, then (8.122) is the well-known nonlinear Schrödinger equation

\[ -u_{,tt} + \Delta u + \alpha u\left(\frac{1}{2}u^2 - \lambda\right) = f. \]

This equation appears in many branches of physics. It provides one of the simplest models of the unified field theory. It can also be found in the theory of dislocations in metals, in the theory of Josephson junctions, as well as in interpreting certain biological processes such as DNA dynamics. In the simplest case where $u$ depends only on time, the nonlinear Schrödinger equation reduces to the well-known Duffing equation

\[ u_{,tt} = \alpha u\left(\frac{1}{2}u^2 - \lambda\right) - f. \]

Even for this one-dimensional ordinary differential equation, an analytic solution is still very difficult to obtain. It is known that this equation is extremely sensitive to the initial conditions and the input (driving force) $f(t)$. Fig. 8.7 displays clearly that for the same given data, two Runge-Kutta solvers in MATLAB produce very different vibration modes and "trajectories" in the phase space $u$-$p$ ($p = u_{,t}$). Mathematically speaking, due to the nonconvexity of the function $W(u)$, very small perturbations of the system's initial conditions and parameters may lead the system to different local minima with significantly different performance characteristics; this is the so-called chaotic phenomenon. Numerical results vary with the methods used. This is one of the main reasons why traditional perturbation analysis and direct approaches cannot successfully be applied to nonconvex systems [47].

Numerical discretization of the nonconvex variational problem $(\mathcal{P})$ in mathematical physics usually leads to a nonconvex optimization problem in finite dimensional space $U = \mathbb{R}^n$, where the field variable $u$ is simply a vector $x \in U$, the bilinear form $\langle x, x^*\rangle = x^T x^* = x\cdot x^*$ is the dot-product of two vectors, and the operator $A : \mathbb{R}^n \to U^* = \mathbb{R}^n$ is a symmetric matrix. In d.c. (difference of convex functions) programming and discrete dynamical systems, the operator $A = A^T \in \mathbb{R}^{n\times n}$ is usually indefinite. The problem (8.121) is then one of global minimization in $\mathbb{R}^n$. In this section, we discuss the canonical dual transformation method for solving this type of problem.

8.7.1 Unconstrained Nonconvex Optimization Problem with Double-Well Energy

First, let us consider an unconstrained global optimization problem in finite dimensional space $U = \mathbb{R}^n$, where $A = A^T \in \mathbb{R}^{n\times n}$ is a matrix, and $W(x)$ is a double-well function of the type $W(x) = \frac{1}{2}\left(\frac{1}{2}|x|^2 - \lambda\right)^2$. Then the primal problem is

Advances in Mech. & Math., Vol. III, Gao & Sherali (ed.), Springer, 2006

[Figure 8.7: two rows of panels, each showing (a) $u(t)$ and (b) the trajectory in phase space $u$-$p$.]

Fig. 8.7. Numerical results by ode23 (top) and ode15s (bottom) solvers in MATLAB.

\[ \min\left\{ P(x) = \frac{1}{2}\left(\frac{1}{2}|x|^2 - \lambda\right)^2 + \frac{1}{2}x^T A x - x^T f : \ \forall x \in U_k = \mathbb{R}^n \right\}. \tag{8.124} \]

The necessary condition $\delta P(x) = 0$ leads to a coupled nonlinear algebraic system

\[ Ax + \left(\frac{1}{2}|x|^2 - \lambda\right)x = f. \tag{8.125} \]

Clearly, a direct method for solving this nonlinear equation with $n$ unknowns is elusive. By choosing the quadratic operator $\xi = \Lambda(x) = \frac{1}{2}|x|^2$, the canonical function $V(\xi) = \frac{1}{2}(\xi - \lambda)^2$ is a quadratic function. By the fact that $\frac{1}{2}|x|^2 = \xi \ge 0$, $\forall x \in \mathbb{R}^n$, the range of the quadratic mapping $\Lambda(x)$ is $V_a = \{\xi \in \mathbb{R} \mid \xi \ge 0\}$. Thus, on $V_a$, the canonical duality relation $\varsigma = \delta V(\xi) = \xi - \lambda$ is one-to-one and the range of the canonical dual mapping $\delta V : V_a \to V^* \subset \mathbb{R}$ is $V_a^* = \{\varsigma \in \mathbb{R} \mid \varsigma \ge -\lambda\}$. It turns out that $(\xi, \varsigma)$ is a canonical pair on $V_a \times V_a^*$ and the Legendre conjugate $V^*$ is also a quadratic function:


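\[ V^*(\varsigma) = \mathrm{sta}\{\xi\varsigma - V(\xi) : \xi \in V_a\} = \frac{1}{2}\varsigma^2 + \lambda\varsigma. \]

For a given $\varsigma \in V_a^*$, the $\Lambda$-conjugate transformation

\[ U^\Lambda(\varsigma) = \mathrm{sta}\left\{ \frac{1}{2}|x|^2\varsigma + \frac{1}{2}x^T A x - x^T f : x \in \mathbb{R}^n \right\} = -\frac{1}{2} f^T (A + \varsigma I)^{-1} f \]

is well-defined on the canonical dual feasible space $V_k^*$, given by

\[ V_k^* = \{\varsigma \in \mathbb{R} \mid \det(A + \varsigma I) \ne 0, \ \varsigma \ge -\lambda\}. \tag{8.126} \]

Thus, the canonical dual problem can be proposed as the following [46]:

\[ (\mathcal{P}^d) : \quad \max\left\{ P^d(\varsigma) = -\frac{1}{2} f^T (A + \varsigma I)^{-1} f - \frac{1}{2}\varsigma^2 - \lambda\varsigma : \ \varsigma \in V_k^* \right\}. \tag{8.127} \]

This is a nonlinear programming problem with only one variable! The criticality condition of this dual problem leads to the dual algebraic equation

\[ \varsigma + \lambda = \frac{1}{2} f^T (A + \varsigma I)^{-2} f. \tag{8.128} \]

For any given $A \in \mathbb{R}^{n\times n}$ and $f \in \mathbb{R}^n$, this equation can be solved by MATHEMATICA. Extremality conditions of these dual solutions can be identified by the following theorem (see [46]).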

Theorem 13 (Gao [46]) If the matrix $A$ has $r$ distinct non-zero eigenvalues such that $a_1 < a_2 < \cdots < a_r$, then the canonical dual algebraic equation (8.128) has at most $2r + 1$ roots $\varsigma_1 > \varsigma_2 \ge \varsigma_3 \ge \cdots \ge \varsigma_{2r+1}$. For each $\varsigma_i$, the vector

\[ x_i = (A + \varsigma_i I)^{-1} f, \quad \forall i = 1, 2, \ldots, 2r + 1, \tag{8.129} \]

is a solution to the semi-linear algebraic equation (8.125) and

\[ P(x_i) = P^d(\varsigma_i), \quad \forall i = 1, \ldots, 2r + 1. \tag{8.130} \]

In particular, the canonical dual problem has at most one global maximizer $\varsigma_1 > -a_1$ in the open interval $(-a_1, +\infty)$, and $x_1$ is a global minimizer of $P(x)$ over $U_k$, that is

\[ P(x_1) = \min_{x \in U_k} P(x) = \max_{\varsigma > -a_1} P^d(\varsigma) = P^d(\varsigma_1). \tag{8.131} \]

Moreover, in each open interval $(-a_{i+1}, -a_i)$, the canonical dual equation (8.128) has at most two real roots $-a_{i+1} < \varsigma_{2i+1} < \varsigma_{2i} < -a_i$, $\forall i = 1, \ldots, 2r + 1$; $\varsigma_{2i}$ is a local minimizer of $P^d$, and $\varsigma_{2i+1}$ is a local maximizer of $P^d(\varsigma)$.


Fig. 8.8. Graph of the primal function P (x1 , x2 ) and its contours


Fig. 8.9. Graph of the dual function P d (ς)

As an example in two-dimensional space, which is illustrated in Fig. 8.8, we simply choose $A = \{a_{ij}\}$ with $a_{11} = 0.6$, $a_{22} = -0.5$, $a_{12} = a_{21} = 0$, and $f = \{0.2, -0.1\}$. For given parameters $\lambda = 1.5$ and $\alpha = 1.0$, the graph of $P(x)$ is a nonconvex surface (see Fig. 8.8a) with four potential wells and one local maximizer. The graph of the canonical dual function $P^d(\varsigma)$ is shown in Fig. 8.9. The canonical dual algebraic equation (8.128) has a total of five real roots:

\[ \bar\varsigma_5 = -1.47 < \bar\varsigma_4 = -0.77 < \bar\varsigma_3 = -0.46 < \bar\varsigma_2 = 0.45 < \bar\varsigma_1 = 0.55, \]

and we have

\[ P^d(\bar\varsigma_5) = 1.15 > P^d(\bar\varsigma_4) = 0.98 > P^d(\bar\varsigma_3) = 0.44 > P^d(\bar\varsigma_2) = -0.70 > P^d(\bar\varsigma_1). \]

By the triality theory, we know that $\bar{x}_1 = (A + \bar\varsigma_1 I)^{-1} f = \{0.17, -2.02\}$ is a global minimizer of $P(x)$; accordingly, $P(\bar{x}_1) = P^d(\bar\varsigma_1) = -1.1$; and that $\bar{x}_5 = \{-0.23, 0.05\}$ and $\bar{x}_3 = \{1.44, 0.10\}$ are local maximizers, whereas $\bar{x}_4 = \{-1.21, 0.08\}$ and $\bar{x}_2 = \{0.19, 1.96\}$ are local minimizers.

The graph of $P^d(\varsigma)$ for a four-dimensional problem is shown in Fig. 8.10. It can be easily seen that $P^d(\varsigma)$ is strictly concave for $\varsigma > -a_1$. Within each interval $-a_{i+1} < \varsigma < -a_i$, $\forall i = 1, 2, \ldots, r$, the dual function $P^d(\varsigma)$ has at most one local minimum and one local maximum. These local extrema can be identified by the triality theory [46].


Fig. 8.10. Graph of the dual function P d (ς) for a four-dimensional problem
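The two-dimensional example above can be checked numerically with a short script. The following is a minimal sketch, not the MATHEMATICA computation used for the figures: it assumes that $A$ is diagonal (as in the example, so that $(A + \varsigma I)^{-1}$ is computed entrywise), scans the feasible half-line $\varsigma \ge -\lambda$ for sign changes of the residual of (8.128), and refines each root by bisection. The function names, the scan range `s_max`, and the grid resolution are our own illustrative choices.

```python
import math

# Data of the two-dimensional example; A is diagonal, so only its diagonal is stored.
A_DIAG = [0.6, -0.5]
F_VEC = [0.2, -0.1]
LAM = 1.5


def dual_residual(s, a, f, lam):
    """h(s) = 1/2 f^T (A + sI)^{-2} f - (s + lam); h vanishes at roots of (8.128)."""
    total = 0.0
    for ai, fi in zip(a, f):
        d = ai + s
        if d == 0.0:                      # s sits exactly on a pole -a_i
            return math.inf
        total += fi * fi / (d * d)
    return 0.5 * total - (s + lam)


def dual_roots(a, f, lam, s_max=10.0, steps=20000, iters=100):
    """Roots of (8.128) in [-lam, s_max] via sign-change scan plus bisection."""
    roots = []
    step = (s_max + lam) / steps
    lo, h_lo = -lam, dual_residual(-lam, a, f, lam)
    for k in range(1, steps + 1):
        hi = -lam + k * step
        h_hi = dual_residual(hi, a, f, lam)
        if (h_lo > 0) != (h_hi > 0):
            u, v = lo, hi
            for _ in range(iters):
                m = 0.5 * (u + v)
                if (dual_residual(m, a, f, lam) > 0) == (h_lo > 0):
                    u = m
                else:
                    v = m
            roots.append(0.5 * (u + v))
        lo, h_lo = hi, h_hi
    return roots


def primal_point(s, a, f):
    """x = (A + sI)^{-1} f as in (8.129), for diagonal A."""
    return [fi / (ai + s) for ai, fi in zip(a, f)]


def primal(x, a, f, lam):
    """P(x) of (8.124) for diagonal A."""
    xi = 0.5 * sum(v * v for v in x)
    quad = 0.5 * sum(ai * v * v for ai, v in zip(a, x))
    return 0.5 * (xi - lam) ** 2 + quad - sum(fi * v for fi, v in zip(f, x))


def dual(s, a, f, lam):
    """P^d(s) of (8.127) for diagonal A."""
    return -0.5 * sum(fi * fi / (ai + s) for ai, fi in zip(a, f)) - 0.5 * s * s - lam * s


if __name__ == "__main__":
    for s in dual_roots(A_DIAG, F_VEC, LAM):
        x = primal_point(s, A_DIAG, F_VEC)
        print(s, x, primal(x, A_DIAG, F_VEC, LAM), dual(s, A_DIAG, F_VEC, LAM))
```

For each root the script prints the primal and dual values side by side, so that the equality (8.130) can be inspected directly; the largest root plays the role of $\bar\varsigma_1$ in Theorem 13.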

The nonconvex function $W(x)$ in (8.121) could take many other forms, for example,

\[ W(x) = \exp\left(\frac{1}{2}|Bx|^2 - \lambda\right), \]

where $B \in \mathbb{R}^{m\times n}$ is a given matrix and $\lambda > 0$ is a constant. In this case, the primal problem $(\mathcal{P})$ is a quadratic-exponential minimization problem

\[ \min\left\{ P(x) = \exp\left(\frac{1}{2}|Bx|^2 - \lambda\right) + \frac{1}{2}x^T A x - x^T f : x \in \mathbb{R}^n \right\}. \]

By letting $\xi = \Lambda(x) = \frac{1}{2}|Bx|^2 - \lambda$, the canonical function $V(\xi) = \exp(\xi)$ is convex and its Legendre conjugate is $V^*(\varsigma) = \varsigma(\ln\varsigma - 1)$. The canonical dual problem was formulated in [60]:

\[ (\mathcal{P}^d) : \quad \max\left\{ P^d(\varsigma) = -\frac{1}{2} f^T [G(\varsigma)]^{-1} f - (\varsigma\ln\varsigma - \varsigma) - \lambda\varsigma : \ \varsigma \in V_+^* \right\}, \]

where $G(\varsigma) = A + \varsigma B^T B$ and the dual feasible space is defined by

\[ V_+^* = \{\varsigma \in \mathbb{R} \mid \varsigma > 0, \ G(\varsigma) \text{ is positive definite}\}. \]

A detailed study of this case was given in [60].



8.7.2 Constrained Quadratic Minimization over a Sphere

If the function $W(x)$ in problem (8.121) is an indicator of a constraint set $U_k \subset \mathbb{R}^n$, that is

\[ W(x) = \begin{cases} 0 & \text{if } x \in U_k, \\ +\infty & \text{otherwise}, \end{cases} \]

then the general problem (8.121) becomes a constrained nonconvex quadratic optimization problem, denoted as

\[ (\mathcal{P}_q) : \quad \min\left\{ P(x) = \frac{1}{2}\langle x, Ax\rangle - \langle x, f\rangle : x \in U_k \right\}. \tag{8.132} \]

General constrained global optimization problems are discussed in the next section. Here, we consider the following quadratic minimization problem with a nonlinear constraint

\[ (\mathcal{P}_q) : \quad \min\ P(x) = \frac{1}{2}x^T A x - f^T x \quad \text{s.t. } |x| \le r, \tag{8.133} \]

where $A = A^T \in \mathbb{R}^{n\times n}$ is a symmetric matrix, $f \in \mathbb{R}^n$ is a given vector, and $r > 0$ is a constant. The feasible space $U_k = \{x \in \mathbb{R}^n \mid |x| \le r\}$ is a closed ball (a solid hyper-sphere) in $\mathbb{R}^n$. This problem often arises as a subproblem in general optimization algorithms (cf. Powell, 2002). Often, in model trust region methods, the objective function in nonlinear programming is approximated locally by a quadratic function. In such cases, the approximation is restricted to a small region around the current iterate. These methods therefore require the solution of quadratic programming problems over spheres. To solve this constrained nonconvex minimization problem by using a traditional Lagrange multiplier method, we have

\[ L(x, \lambda) = \frac{1}{2}x^T A x - f^T x + \lambda(|x| - r). \tag{8.134} \]

For a given $\lambda \ge 0$, the traditional dual function can be defined via the Fenchel-Moreau-Rockafellar duality theory:

\[ P^*(\lambda) = \min\{L(x, \lambda) : x \in \mathbb{R}^n\}, \tag{8.135} \]

which is a concave function of $\lambda$. However, due to the nonconvexity of $P(x)$, we have only the weak duality relationship

\[ \min_{|x| \le r} P(x) \ge \max_{\lambda \ge 0} P^*(\lambda). \]

The duality gap $\theta$ given by the slack in the above inequality is typically nonzero, indicating that the dual solution does not solve the primal problem. On the other hand, the KKT condition leads to a coupled nonlinear algebraic system



\[ Ax + \lambda|x|^{-1}x = f, \quad \lambda \ge 0, \quad |x| \le r, \quad \lambda(|x| - r) = 0. \]

As indicated by Floudas and Visweswaran (1995), due to the presence of the nonlinear sphere constraint, the solution of $(\mathcal{P}_q)$ is likely to be irrational, which implies that it is not possible to compute the solution exactly. Therefore, many polynomial time algorithms have been suggested to compute an approximate solution to this problem (see Sorensen, 1982; Karmarkar, 1990; and Ye, 1992). However, by the canonical dual transformation this problem has been solved completely in [49]. First, we need to reformulate the constraint $|x| \le r$ in the canonical form

\[ \xi = \Lambda(x) = \frac{1}{2}|x|^2. \]

Let $\lambda = \frac{1}{2}r^2$; then the canonical function $V(\Lambda(x))$ can be defined as

\[ V(\xi) = \begin{cases} 0 & \text{if } \xi \le \lambda, \\ +\infty & \text{otherwise}, \end{cases} \]

whose effective domain is $V_a = \{\xi \in \mathbb{R} \mid \xi \le \lambda\}$. Letting $U(x) = x^T f - \frac{1}{2}x^T A x$, the primal problem $(\mathcal{P}_q)$ can be reformulated in the following canonical form:

\[ \min\{\Pi(x) = V(\Lambda(x)) - U(x) : x \in \mathbb{R}^n\}. \tag{8.136} \]

By the Fenchel transformation, the conjugate of $V(\xi)$ is

\[ V^\sharp(\varsigma) = \max_{\xi \in V_a}\{\xi\varsigma - V(\xi)\} = \begin{cases} \lambda\varsigma & \text{if } \varsigma \ge 0, \\ +\infty & \text{otherwise}, \end{cases} \tag{8.137} \]

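whose effective domain is $V_a^* = \{\varsigma \in \mathbb{R} \mid \varsigma \ge 0\}$. The dual feasible space $V_k^*$ in this problem is

\[ V_k^* = \{\varsigma \in \mathbb{R} \mid \varsigma \ge 0, \ \det(A + \varsigma I) \ne 0\}. \]

Thus, for a given $\varsigma \in V_a^*$, the $\Lambda$-conjugate of $U$ can be formulated as

\[ U^\Lambda(\varsigma) = \mathrm{sta}\left\{ \frac{1}{2}|x|^2\varsigma + \frac{1}{2}x^T A x - x^T f : x \in \mathbb{R}^n \right\} = -\frac{1}{2} f^T (A + \varsigma I)^{-1} f, \]

and the problem $(\mathcal{P}_q^d)$, which is perfectly dual to $(\mathcal{P}_q)$, is given by

\[ (\mathcal{P}_q^d) : \quad \max\left\{ P^d(\varsigma) = -\frac{1}{2} f^T (A + \varsigma I)^{-1} f - \lambda\varsigma : \ \varsigma \in V_k^* \right\}. \tag{8.138} \]

The criticality condition $\delta P^d(\bar\varsigma) = 0$ leads to a nonlinear algebraic equation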



\[ \frac{1}{2} f^T (A + \bar\varsigma I)^{-2} f = \lambda. \tag{8.139} \]

Similar to (8.128), this equation can also be solved easily by using MATHEMATICA. Each root $\bar\varsigma_i$ is a critical point of $P^d(\varsigma)$. The following theorem presents a complete set of solutions for this dual problem.

Theorem 14 (Complete Solution to $(\mathcal{P}_q)$ [49]) Suppose that the symmetric matrix $A$ has $p \le n$ distinct eigenvalues, and $i_d \le p$ of them are negative such that $a_1 < a_2 < \cdots < a_{i_d} < 0 \le a_{i_d+1} < \cdots < a_p$. Then for a given vector $f \in \mathbb{R}^n$, the canonical dual problem $(\mathcal{P}_q^d)$ has at most $2i_d + 1$ critical points $\bar\varsigma_i$, $i = 1, \ldots, 2i_d + 1$, satisfying the following distribution law:

\[ \bar\varsigma_1 > -a_1 > \bar\varsigma_2 \ge \bar\varsigma_3 > -a_2 > \cdots > -a_{i_d} > \bar\varsigma_{2i_d} \ge \bar\varsigma_{2i_d+1} > 0. \tag{8.140} \]

For each $\bar\varsigma_i \ge 0$, $i = 1, \ldots, 2i_d + 1$, the vector defined by

\[ \bar{x}_i = (A + \bar\varsigma_i I)^{-1} f \tag{8.141} \]

is a KKT point of the problem $(\mathcal{P}_q)$ and

\[ P(\bar{x}_i) = P^d(\bar\varsigma_i), \quad i = 1, 2, \ldots, 2i_d + 1. \tag{8.142} \]

Moreover, if $i_d > 0$, then the problem $(\mathcal{P}_q)$ has at most $2i_d + 1$ critical points on the boundary of the sphere, that is

\[ \frac{1}{2}|\bar{x}_i|^2 = \lambda, \quad i = 1, \ldots, 2i_d + 1. \tag{8.143} \]

Because $A = A^T$, there exists an orthogonal matrix $R$ ($R^T = R^{-1}$) such that $A = R^T D R$, where $D = (a_i\delta_{ij})$ is a diagonal matrix. For the given vector $f \in \mathbb{R}^n$, let $g = Rf = (g_i)$, and define

\[ \psi(\varsigma) = \frac{1}{2} f^T (A + \varsigma I)^{-2} f = \frac{1}{2}\sum_{i=1}^{p} g_i^2 (a_i + \varsigma)^{-2}. \tag{8.144} \]

Clearly, this real-valued function $\psi(\varsigma)$ is strictly convex within each interval $-a_{i+1} < \varsigma < -a_i$, as well as over the intervals $-\infty < \varsigma < -a_p$ and $-a_1 < \varsigma < \infty$ (see Fig. 8.11). Thus, for a given parameter $\lambda > 0$, the algebraic equation

\[ \psi(\varsigma) = \frac{1}{2}\sum_{i=1}^{p} g_i^2 (a_i + \varsigma)^{-2} = \lambda \tag{8.145} \]

has at most $2p$ solutions $\{\bar\varsigma_i\}$ satisfying $-a_{j+1} < \bar\varsigma_{2j+1} \le \bar\varsigma_{2j} < -a_j$ for $j = 1, \ldots, p - 1$, and $\bar\varsigma_1 > -a_1$, $\bar\varsigma_{2p} < -a_p$. Because $A$ has only $i_d$ negative eigenvalues, the equality $\psi(\varsigma) = \lambda$ has at most $2i_d + 1$ strictly positive roots $\bar\varsigma_i > 0$, $i = 1, \ldots, 2i_d + 1$. By the complementarity condition $\bar\varsigma_i\left(\frac{1}{2}|\bar{x}_i|^2 - \lambda\right) = 0$, we know that the primal problem $(\mathcal{P}_q)$ has at most $2i_d + 1$ KKT points $\bar{x}_i$ on the sphere $\frac{1}{2}|\bar{x}_i|^2 = \lambda$. If $a_{i_d+1} > 0$, the equality $\psi(\varsigma) = \lambda$ may have at most $2i_d$ strictly positive roots. By using the triality theory, the extremality conditions of the critical points of the problem $(\mathcal{P}_q)$ can be identified by the following result.

Fig. 8.11. Graph of $\psi(\varsigma)$, with the level $\psi = \lambda$ marked.

Theorem 15 (Global and Local Extrema [49]) Suppose that $a_1$ is the smallest eigenvalue of $A$. Then the dual problem $(\mathcal{P}_q^d)$ given in (8.138) has a unique solution $\bar\varsigma_1$ over the domain $\varsigma > -a_1 \ge 0$, and $\bar{x}_1$ is a global minimizer of the problem $(\mathcal{P}_q)$, that is

\[ P(\bar{x}_1) = \min_{x \in U_k} P(x) = \max_{\varsigma > -a_1} P^d(\varsigma) = P^d(\bar\varsigma_1). \tag{8.146} \]

If in each interval $(-a_{i+1}, -a_i)$, $i = 1, \ldots, i_d$, the dual algebraic equation (8.139) has two roots $-a_{i+1} < \bar\varsigma_{2i+1} < \bar\varsigma_{2i} < -a_i$, then $\bar\varsigma_{2i}$ is a local minimizer of $P^d(\varsigma)$, and $\bar\varsigma_{2i+1}$ is a local maximizer of $P^d(\varsigma)$ over the interval $(-a_{i+1}, -a_i)$.

Proof. For any given $\varsigma > -a_1$, the matrix $A + \varsigma I$ is positive definite, so the total complementary function $\Xi(x, \varsigma)$ is a saddle function, and the saddle min-max theorem leads to (8.146). The remaining statements in Theorem 15 can be proved by the graph of $P^d(\varsigma)$ (see Fig. 8.12). $\Box$

It is interesting to note that on the effective domain $V_a^*$, the Fenchel-Young equality $V(\xi) = \langle\xi; \varsigma\rangle - V^*(\varsigma) = (\xi - \lambda)\varsigma$ holds true. Thus, on $U_a \times V_a^*$, the total complementary function

Fig. 8.12. Graph of $P^d(\varsigma)$ (axis marks in the original figure: $-a_{2i+1}$, $-a_{2i}$, $-a_1$).

\[ \Xi(x, \varsigma) = \langle\Lambda(x); \varsigma\rangle - V^*(\varsigma) - U(x) = \varsigma\left(\frac{1}{2}|x|^2 - \lambda\right) + \frac{1}{2}x^T A x - x^T f \tag{8.147} \]

can be viewed as the traditional Lagrangian of the quadratic minimization problem with the reformulated (canonical) quadratic constraint 12 |x|2 ≤ λ, which is also called extended Lagrangian (see [38]). This example exhibits a connection between the nonlinear Lagrange multiplier method and the canonical dual transformation. Based on this observation, the traditional Lagrange multiplier method can be generalized to solve constrained global optimization problems.
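The global part of Theorem 15 lends itself to a compact numerical illustration. The sketch below is written under assumptions of our own: $A$ is diagonal (equivalently, the problem has already been rotated into the eigenbasis as in (8.144)), its smallest eigenvalue is negative, and the component of $f$ along the corresponding eigenvector is nonzero, so that $\psi(\varsigma) \to +\infty$ as $\varsigma \downarrow -a_1$. On $(-a_1, \infty)$ the function $\psi$ is strictly decreasing, so the root $\bar\varsigma_1$ of (8.139) can be bracketed and found by bisection. The data and function names are hypothetical.

```python
import math


def sphere_dual_root(a, f, r, iters=200):
    """Root of psi(s) = lam with s > max(0, -min(a)), for A = diag(a), lam = r^2 / 2."""
    lam = 0.5 * r * r

    def psi(s):
        return 0.5 * sum(fi * fi / (ai + s) ** 2 for ai, fi in zip(a, f))

    lo = max(0.0, -min(a))                # psi blows up just to the right of lo (assumed)
    hi = lo + 1.0
    while psi(hi) > lam:                  # enlarge the bracket until psi(hi) <= lam
        hi = lo + 2.0 * (hi - lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:        # bracket cannot shrink further
            break
        if psi(mid) > lam:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sphere_primal(x, a, f):
    """P(x) = 1/2 x^T A x - f^T x for diagonal A."""
    return 0.5 * sum(ai * v * v for ai, v in zip(a, x)) - sum(fi * v for fi, v in zip(f, x))


def sphere_dual(s, a, f, r):
    """P^d(s) of (8.138) for diagonal A."""
    return -0.5 * sum(fi * fi / (ai + s) for ai, fi in zip(a, f)) - 0.5 * r * r * s


if __name__ == "__main__":
    a, f, r = [-1.0, 2.0], [0.5, 1.0], 1.5    # hypothetical data
    s1 = sphere_dual_root(a, f, r)
    x1 = [fi / (ai + s1) for ai, fi in zip(a, f)]   # (8.141)
    print(s1, x1, math.hypot(*x1), sphere_primal(x1, a, f), sphere_dual(s1, a, f, r))
```

The printed norm of $\bar{x}_1$ can be compared with $r$, and the primal and dual values with each other, which is exactly the content of (8.143) and (8.146) for this small instance.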

8.8 General Constrained Global Optimization Problems

In this section, we present an important application of the canonical duality theory to the following general constrained nonlinear programming problem

\[ \min\{-U(x) : x \in U_k\}, \tag{8.148} \]

where $U(x)$ is a Gâteaux differentiable function, either a linear or a canonical function, defined on an open convex set $U_a \subset \mathbb{R}^n$, and the feasible space $U_k$ is a convex subset of $U_a$ defined by

\[ U_k = \{x \in U_a \subset \mathbb{R}^n \mid g_i(x) \le 0, \ i = 1, \ldots, p\}, \]

in which $g_i(x) : U_a \to \mathbb{R}$ are convex functions. We will show the connection between the canonical dual transformation and nonlinear Lagrange multiplier methods, and how to use the triality theory to identify global and local optima.



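8.8.1 Canonical Form and Total Complementary Function

First, we need to put this problem in the framework of the canonical systems. Let the geometrical operator $\xi = \Lambda(x) = \{g_i(x)\} : U_a \to V_a \subset \mathbb{R}^p$ be a vector-valued function. The generalized canonical function

\[ V(\xi) = \begin{cases} 0 & \text{if } \xi \le 0, \\ \infty & \text{otherwise} \end{cases} \]

is an indicator of the convex cone $V_a = \{\xi \in \mathbb{R}^p \mid \xi \le 0\}$. Thus, the canonical form of the constrained problem (8.148) is

\[ \min\{\Pi(x) = V(\Lambda(x)) - U(x) : x \in U_a\}. \]

By the Fenchel transformation, the conjugate of $V(\xi)$ is an indicator of the dual cone $V_a^* = \{\varsigma \in \mathbb{R}^p \mid \varsigma \ge 0\}$, that is

\[ V^\sharp(\varsigma) = \max\{\langle\xi; \varsigma\rangle - V(\xi) : \xi \in \mathbb{R}^p\} = \begin{cases} 0 & \text{if } \varsigma \ge 0, \\ \infty & \text{otherwise}. \end{cases} \]

By the theory of convex analysis we have

\[ \varsigma \in \partial^- V(\xi) \ \Leftrightarrow \ \xi \in \partial^- V^\sharp(\varsigma) \ \Leftrightarrow \ \langle\xi; \varsigma\rangle = V(\xi) + V^\sharp(\varsigma), \tag{8.149} \]

that is, $(\xi, \varsigma)$ is a generalized canonical pair on $U_a \times V_a^*$ [40]. Thus, the extended Lagrangian $\Xi(x, \varsigma) = \langle\Lambda(x); \varsigma\rangle - V^\sharp(\varsigma) - U(x)$ in this problem has a very simple form:

\[ \Xi(x, \varsigma) = -U(x) + \sum_{i=1}^{p} \varsigma_i g_i(x). \tag{8.150} \]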

We can see here that the canonical dual variable $\varsigma \ge 0 \in \mathbb{R}^p$ is nothing but a Lagrange multiplier for the constraints $\Lambda(x) = \{g_i(x)\} \le 0$. Let

\[ I(\bar{x}) := \{i \in \{1, \ldots, p\} \mid g_i(\bar{x}) = 0\} \]

be the index set of the active constraints at $\bar{x}$. By the theory of global optimization (cf. [75]) we know that if $\bar{x}$ is a local minimizer such that $\nabla g_i(\bar{x})$, $i \in I(\bar{x})$, are linearly independent, then the KKT conditions hold:

\[ g_i(\bar{x}) \le 0, \quad \bar\varsigma_i \ge 0, \quad \bar\varsigma_i g_i(\bar{x}) = 0, \quad i = 1, \ldots, p, \tag{8.151} \]

\[ \nabla U(\bar{x}) = \sum_{i=1}^{p} \bar\varsigma_i \nabla g_i(\bar{x}). \tag{8.152} \]

Any point $(\bar{x}, \bar\varsigma)$ that satisfies (8.151)-(8.152) is called a KKT stationary point of the problem (8.148). However, the KKT conditions (8.151)-(8.152) are only



necessary for the minimization problem (8.148). They are sufficient for a constrained global minimum at $\bar{x}$ provided that, for example, the functions $P(x) = -U(x)$ and $g_i(x)$, $i = 1, \ldots, p$, are convex. In constrained global optimization problems, the primal problems may possess many local minimizers due to the nonconvexity of the objective function and constraints. Therefore, sufficient optimality conditions play a key role in developing global algorithms. Here we show that the triality theory can provide such sufficient conditions.

The complementary function $V^\sharp(\varsigma) = 0$, $\forall\varsigma \in V_a^*$; therefore in this constrained optimization problem we have

\[ G_\varsigma(x) = \Xi(x, \varsigma) = -U(x) + \varsigma^T\Lambda(x). \tag{8.153} \]

For a fixed $\varsigma \in V_a^*$, if the parametric function $G_\varsigma : U_a \to \mathbb{R}$ is twice Gâteaux differentiable, the space $\mathcal{G}$ can be written as

\[ \mathcal{G} = \left\{ (x, \varsigma) \in U_a \times V_a^* \ \Big|\ \det\left(\frac{\partial^2 G_\varsigma(x)}{\partial x_i \partial x_j}\right) \ne 0 \right\}. \]

Clearly, for any given $(x, \varsigma) \in \mathcal{G}$, the dual feasible space

\[ V_k^* = \left\{ \varsigma \in V_a^* \ \Big|\ \Lambda_t^*(x)\varsigma = \sum_{i=1}^{p} \varsigma_i\nabla g_i(x) = \nabla U(x), \ \forall x \in U_a \right\} \tag{8.154} \]

is non-empty and the $\Lambda$-conjugate transformation

\[ U^\Lambda(\varsigma) = \mathrm{sta}\{\langle\Lambda(x); \varsigma\rangle - U(x) : \forall x \in U_a\} \]

can be well-formulated on $V_k^*$. Thus, because $V^\sharp(\varsigma) = 0$ on $V_a^*$, the canonical dual problem can be proposed as the following,

\[ \max\{P^d(\varsigma) = U^\Lambda(\varsigma) : \varsigma \in V_k^*\}. \tag{8.155} \]

In the following, we illustrate the foregoing results using some examples.

8.8.2 Quadratic Minimization with Quadratic Constraints

Let $U(x) = x^T f - \frac{1}{2}x^T A x$ and $g(x) = \frac{1}{2}x^T C x - \lambda$ be quadratic functions, where $A$ and $C$ are two symmetric matrices in $\mathbb{R}^{n\times n}$, $f \in \mathbb{R}^n$ is a given vector, and $\lambda \in \mathbb{R}$ is a given constant. Thus the primal problem is:

\[ \min\left\{ P(x) = \frac{1}{2}x^T A x - f^T x : \ \frac{1}{2}x^T C x \le \lambda, \ x \in \mathbb{R}^n \right\}. \tag{8.156} \]

Because we have only one constraint $g(x) = \frac{1}{2}x^T C x - \lambda \le 0$, the extended Lagrangian is simply

\[ \Xi(x, \varsigma) = \frac{1}{2}x^T (A + \varsigma C)x - f^T x - \varsigma\lambda. \tag{8.157} \]

On the dual feasible space $V_k^* = \{\varsigma \in \mathbb{R} \mid \varsigma \ge 0, \ \det(A + \varsigma C) \ne 0\}$, the canonical dual problem (8.155) can be formulated as (see [50]):

\[ \max\left\{ P^d(\varsigma) = -\frac{1}{2} f^T (A + \varsigma C)^{-1} f - \lambda\varsigma : \ \varsigma \in V_k^* \right\}. \tag{8.158} \]

Since in this problem both $\Lambda(x) = \frac{1}{2}x^T C x - \lambda$ and $U(x) = -\frac{1}{2}x^T A x + f^T x$ are quadratic functions, $\delta^2 G_\varsigma = (A + \varsigma C)$. The following result was obtained recently.

Theorem 16 (Gao [50]) Suppose that the matrix $C$ is positive definite, and $\bar\varsigma \in V_a^*$ is a critical point of $P^d(\varsigma)$. If $A + \bar\varsigma C$ is positive definite, the vector $\bar{x} = (A + \bar\varsigma C)^{-1} f$ is a global minimizer of the primal problem (8.156). However, if $A + \bar\varsigma C$ is negative definite, the vector $\bar{x} = (A + \bar\varsigma C)^{-1} f$ is a local minimizer of the primal problem (8.156).

In 2-D space, if we let $a_{11} = 3$, $a_{12} = a_{21} = 0.5$, $a_{22} = -2.0$, and $c_{11} = 1$, $c_{12} = c_{21} = 0$, $c_{22} = 0.5$, the matrix $A = \{a_{ij}\}$ is indefinite, while $C = \{c_{ij}\}$ is positive definite. Setting $f = \{1, 1.5\}$ and $\lambda = 2$, the graph of the canonical function $P(x) = \frac{1}{2}x^T A x - x^T f$ is a saddle surface (see Fig. 8.13), and the boundary of the feasible set $U_k = \{x \in \mathbb{R}^2 \mid \frac{1}{2}x^T C x \le \lambda\}$ is an ellipse (see Fig. 8.13). In this case, the dual problem has four critical points (see Fig. 8.14):

\[ \bar\varsigma_1 = 5.22 > \bar\varsigma_2 = 3.32 > \bar\varsigma_3 = -2.58 > \bar\varsigma_4 = -3.97. \]

Since $\bar\varsigma_1 \in V_+^*$ and $\bar\varsigma_4 \in V_-^*$, the triality theory tells us that $x_1 = \{-0.22, 2.81\}$ is a global minimizer, and $x_4 = \{-1.90, -0.85\}$ is a local minimizer. From the graph of $P^d(\varsigma)$ we can see that $x_2 = \{0.59, -2.70\}$ is a local minimizer, and $x_3 = \{2.0, 0.15\}$ is a local maximizer. We have

\[ P(x_1) = -12.44 < P(x_2) = -4.91 < P(x_3) = 4.03 < P(x_4) = 9.53. \]
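The global statement of Theorem 16 can be illustrated on a small instance. The sketch below deliberately uses hypothetical data of our own rather than the data of the example above, and it covers only the $2\times 2$ case with $A$ indefinite and $C$ positive definite. It first computes the largest root $\varsigma_0$ of $\det(A + \varsigma C) = 0$, so that $A + \varsigma C$ is positive definite for $\varsigma > \varsigma_0$, and then uses bisection on $\varsigma > \max(0, \varsigma_0)$ to solve the criticality condition $\frac{1}{2}\bar{x}^T C \bar{x} = \lambda$ with $\bar{x} = (A + \varsigma C)^{-1} f$. It also assumes that this condition has a root in that range. All function names are hypothetical.

```python
import math


def solve2(G, f):
    """Solve the 2x2 linear system G x = f by Cramer's rule."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(G[1][1] * f[0] - G[0][1] * f[1]) / det,
            (G[0][0] * f[1] - G[1][0] * f[0]) / det]


def shifted(A, C, s):
    """Return the matrix A + s C."""
    return [[A[i][j] + s * C[i][j] for j in range(2)] for i in range(2)]


def qcqp_dual_solve(A, C, f, lam, iters=200):
    """Critical point of (8.158) on the range where A + sC is positive definite."""
    # det(A + sC) = qa s^2 + qb s + qc for symmetric 2x2 matrices A and C.
    qa = C[0][0] * C[1][1] - C[0][1] * C[0][1]
    qb = A[0][0] * C[1][1] + A[1][1] * C[0][0] - 2.0 * A[0][1] * C[0][1]
    qc = A[0][0] * A[1][1] - A[0][1] * A[0][1]
    s0 = (-qb + math.sqrt(qb * qb - 4.0 * qa * qc)) / (2.0 * qa)

    def phi(s):                            # 1/2 x^T C x with x = (A + sC)^{-1} f
        x = solve2(shifted(A, C, s), f)
        return 0.5 * (C[0][0] * x[0] ** 2 + 2.0 * C[0][1] * x[0] * x[1] + C[1][1] * x[1] ** 2)

    lo = max(0.0, s0)
    hi = lo + 1.0
    while phi(hi) > lam:
        hi = lo + 2.0 * (hi - lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:
            break
        if phi(mid) > lam:
            lo = mid
        else:
            hi = mid
    s = 0.5 * (lo + hi)
    return s, solve2(shifted(A, C, s), f)


def qcqp_primal(x, A, f):
    """P(x) = 1/2 x^T A x - f^T x for a symmetric 2x2 matrix A."""
    quad = A[0][0] * x[0] ** 2 + 2.0 * A[0][1] * x[0] * x[1] + A[1][1] * x[1] ** 2
    return 0.5 * quad - f[0] * x[0] - f[1] * x[1]


if __name__ == "__main__":
    A = [[2.0, 1.0], [1.0, -1.0]]          # hypothetical indefinite matrix
    C = [[1.0, 0.0], [0.0, 2.0]]           # hypothetical positive definite matrix
    f, lam = [1.0, 1.0], 1.0
    s_bar, x_bar = qcqp_dual_solve(A, C, f, lam)
    print(s_bar, x_bar, qcqp_primal(x_bar, A, f))
```

For this hypothetical instance the computed point lies on the boundary $\frac{1}{2}x^T C x = \lambda$, and a brute-force comparison over the feasible ellipse gives an independent check of global optimality.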

8.8.3 Quadratic Minimization with Box Constraints

The primal problem solved in this section is finding a global minimizer of a nonconvex quadratic function over a box constraint:

\[ (\mathcal{P}_b) : \quad \min\left\{ P(x) = \frac{1}{2}x^T A x - f^T x : \ \ell^l \le x \le \ell^u \right\}, \tag{8.159} \]

Fig. 8.13. Graph of $P(x)$ (left); contours of $P(x)$ and boundary of $U_k$ (right).

Fig. 8.14. Graph of $P^d(\varsigma)$.

where $x \in \mathbb{R}^n$, and $\ell^l$, $\ell^u$ are two given vectors in $\mathbb{R}^n$. Problems of the form (8.159) appear frequently in partial differential equations, discretized optimal control problems, linear least squares problems, and certain successive quadratic programming methods (cf. Floudas and Visweswaran, 1995). In particular, if $\ell^l = 0$ and $\ell^u = 1$, the problem $(\mathcal{P}_b)$ is directly related to one of the fundamental problems of combinatorial optimization, namely, a continuous relaxation of the problem of minimizing a quadratic function in 0-1 variables.

In order to solve this problem, we need to reformulate the constraints in canonical form. Without loss of generality, we assume that $\ell^l = -1$ and $\ell^u = 1$ (if necessary, a simple linear transformation can be used to convert the problem to this form):

\[ \min\left\{ P(x) = \frac{1}{2}x^T A x - f^T x : \ x_i^2 \le 1, \ i = 1, \ldots, n \right\}. \tag{8.160} \]

The constraint in this problem is a vector-valued quadratic function $\Lambda(x) = \{g_i(x)\} = \{x_i^2 - 1\} \le 0 \in \mathbb{R}^n$. Thus, the canonical dual variable $\varsigma = \{\varsigma_i\}$ should also be a vector in $\mathbb{R}^n$. It has been shown recently that on the dual feasible space,

\[ V_k^* = \{\varsigma \in \mathbb{R}^n \mid \varsigma \ge 0, \ \det(A + 2\,\mathrm{Diag}(\varsigma)) \ne 0\}, \]

where $\mathrm{Diag}(\varsigma) \in \mathbb{R}^{n\times n}$ represents a diagonal matrix with $\varsigma_i$, $i = 1, \ldots, n$, as its diagonal entries, the canonical dual problem is given by (see [53, 54])

\[ \max\left\{ P^d(\varsigma) = -\frac{1}{2} f^T (A + 2\,\mathrm{Diag}(\varsigma))^{-1} f - \sum_{i=1}^{n}\varsigma_i : \ \varsigma \in V_k^* \right\}. \tag{8.161} \]

This dual problem can be solved to obtain all the critical points $\bar\varsigma$. It is shown in [53, 54] that if

\[ \bar\varsigma \in V_+^* = \{\varsigma \in \mathbb{R}^n \mid \varsigma \ge 0, \ A + 2\,\mathrm{Diag}(\varsigma) \text{ is positive definite}\}, \]

then the vector $\bar{x}(\bar\varsigma) = (A + 2\,\mathrm{Diag}(\bar\varsigma))^{-1} f$ is a global minimizer of the primal problem.
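As a small illustration of (8.161), the sketch below maximizes $P^d$ by plain projected gradient ascent for a hypothetical $2\times 2$ instance. This is not the method of [53, 54]; it is only one simple way to reach a critical point in $V_+^*$. It relies on the identity $\partial P^d/\partial\varsigma_i = x_i(\varsigma)^2 - 1$ with $x(\varsigma) = (A + 2\,\mathrm{Diag}(\varsigma))^{-1} f$, uses a fixed step size, and assumes that the iterates stay inside $V_+^*$ (an error is raised otherwise). The data, the starting point, and all names are our own assumptions.

```python
def solve2(G, f):
    """Solve the 2x2 linear system G x = f by Cramer's rule."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(G[1][1] * f[0] - G[0][1] * f[1]) / det,
            (G[0][0] * f[1] - G[1][0] * f[0]) / det]


def box_matrix(A, s):
    """Return A + 2 Diag(s) for a 2x2 matrix A."""
    return [[A[0][0] + 2.0 * s[0], A[0][1]],
            [A[1][0], A[1][1] + 2.0 * s[1]]]


def box_dual_ascent(A, f, s_start, step=0.1, iters=500):
    """Projected gradient ascent on P^d of (8.161), kept inside s >= 0."""
    s = list(s_start)
    for _ in range(iters):
        G = box_matrix(A, s)
        if G[0][0] <= 0.0 or G[0][0] * G[1][1] - G[0][1] * G[1][0] <= 0.0:
            raise ValueError("iterate left the positive definite region V_+^*")
        x = solve2(G, f)
        # Gradient of P^d with respect to s_i is x_i^2 - 1.
        s = [max(0.0, s[i] + step * (x[i] * x[i] - 1.0)) for i in range(2)]
    return s, solve2(box_matrix(A, s), f)


def box_primal(x, A, f):
    """P(x) = 1/2 x^T A x - f^T x for a symmetric 2x2 matrix A."""
    quad = A[0][0] * x[0] ** 2 + 2.0 * A[0][1] * x[0] * x[1] + A[1][1] * x[1] ** 2
    return 0.5 * quad - f[0] * x[0] - f[1] * x[1]


def box_dual(s, A, f):
    """P^d(s) of (8.161) for the 2x2 case."""
    x = solve2(box_matrix(A, s), f)
    return -0.5 * (f[0] * x[0] + f[1] * x[1]) - s[0] - s[1]


if __name__ == "__main__":
    A = [[-1.0, 0.5], [0.5, -1.0]]         # hypothetical indefinite matrix
    f = [2.0, 2.0]
    s_bar, x_bar = box_dual_ascent(A, f, [3.0, 3.0])
    print(s_bar, x_bar, box_primal(x_bar, A, f), box_dual(s_bar, A, f))
```

In this instance the ascent settles at a point where $A + 2\,\mathrm{Diag}(\bar\varsigma)$ is positive definite, and the recovered $\bar{x}$ sits at a vertex of the box, consistent with the sufficient condition quoted from [53, 54]. For harder instances a line search or an interior-point scheme would be a more robust choice than a fixed step.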

8.8.4 Concave Minimization

The primal problem in this case is given by

\[ (\mathcal{P}_c) : \quad \min\{P(x) = -U(x) : Bx \le b, \ x \in \mathbb{R}^n\}, \tag{8.162} \]

where $U(x)$ is a convex, or even nonsmooth, function, and where $B \in \mathbb{R}^{m\times n}$ and $b \in \mathbb{R}^m$ are given. It is well known that this problem is NP-hard. Concave minimization problems constitute one of the most fundamental and intensely studied classes of problems in global minimization. A comprehensive review/survey of the mathematical properties, common applications, and solution methods is given by Benson [7]. By the use of the canonical dual transformation, a perfect dual problem has been formulated in [50]. In order to provide insights into the connection between the canonical dual transformation and the traditional Lagrange multiplier method, we demonstrate here how this perfect dual formulation can also be reproduced by the classical Lagrangian duality approach when executed in a particular fashion inspired by the canonical duality.

First, let us introduce a parameter $\mu$ such that

\[ \min\{P(x) : Bx \le b\} \le \mu \le \max\{P(x) : Bx \le b\}. \]

Then the parameterized canonical form of this problem can be formulated as (see [50])

\[ (\mathcal{P}_\mu) : \quad \min\{P(x) = -U(x) : \ \{U(x) + \mu, \ Bx - b\} \le 0 \in \mathbb{R}^{1+m}, \ x \in \mathbb{R}^n\}. \tag{8.163} \]

In this case, the constraint $g_1(x) = U(x) + \mu$ is convex and $\{g_i(x), \ i = 2, \ldots, m + 1\} = Bx - b$ are linear. By introducing Lagrange multipliers $(\varsigma, y) \in \mathbb{R}^{1+m}$, and letting

302


Va∗ = {(ς, y) ∈ R1+m | ς ≥ 0, y ≥ 0 ∈ Rm }, the Lagrangian dual to the parameterized canonical problem (8.163) is given by Ξ(x, ς, y) = (ς − 1)U (x) + µς + yT (Bx − b). Thus, by the classical Lagrangian duality, the dual problem to (Pµ ) is (LD) :

max {µς − yT b + min{(ς − 1)U (x) + yT Bx}}. x

(ς,y)∈Va∗

(8.164)

Because $U(x)$ is convex, the inner minimization problem in this dual form has a unique solution $\bar{x}$ if $\varsigma > 1$.

Remark 1. Assume that

(1) $U(x)$ is a convex function such that $x^* = \delta U(x)$ is invertible for each $x \in \mathbb{R}^n$, and the Legendre conjugate function $U^*(x^*) = \mathrm{sta}\{x^T x^* - U(x) : \delta U(x) = x^*\}$ is uniquely defined in $\mathbb{R}^n$;

(2) an optimum solution $\bar{x}$ to the problem $(\mathcal{P}_\mu)$ is a KKT solution with Lagrange multipliers $\bar{\varsigma} > 1$, $\bar{y} \ge 0 \in \mathbb{R}^m$.

Let

$$\mathcal{V}_+^* = \{(\varsigma, y) \in \mathbb{R}^{1+m} \mid \varsigma > 1,\ y \ge 0 \in \mathbb{R}^m\}.$$

Under Remark 1, we can thus write $(LD)$ in (8.164) as

$$(LD):\quad \max_{(\varsigma, y) \in \mathcal{V}_+^*} \left\{ \mu\varsigma - y^T b + (\varsigma - 1) \min_x \left\{ \frac{y^T B x}{\varsigma - 1} + U(x) \right\} \right\}. \tag{8.165}$$

Observe that the effect of having introduced $U(x) + \mu \le 0$ is to convexify the inner minimization problem in (8.165). By the definition of the Legendre conjugate, the inner minimum equals $-U^*(B^T y / (1 - \varsigma))$, which, by the assumption of Remark 1, reduces $(LD)$ to the following equivalent dual problem:

$$(\mathcal{P}_\mu^d):\quad \max_{(\varsigma, y) \in \mathcal{V}_+^*} \left\{ P^d(\varsigma, y) = \mu\varsigma - y^T b + (1 - \varsigma)\, U^*\!\left( \frac{B^T y}{1 - \varsigma} \right) \right\}. \tag{8.166}$$

This is the dual problem proposed by the canonical dual transformation in [50]. By the fact that the Legendre conjugate $U^*(x^*)$ of the convex function $U(x)$ is also convex, this canonical dual is a concave maximization problem over the dual feasible space $\mathcal{V}_+^*$, which can be solved uniquely for a given parameter $\mu \in \mathbb{R}$ if $\mathcal{V}_+^*$ is nonempty.

Under Remark 1, note that $\bar{x}$ solves the primal problem $(\mathcal{P}_\mu)$ because $P(\bar{x}) = \mu$, and satisfies the KKT conditions

$$(\bar{\varsigma} - 1)\, \delta U(\bar{x}) + B^T \bar{y} = 0, \tag{8.167}$$

$$B\bar{x} \le b,\quad U(\bar{x}) + \mu = 0,\quad \bar{y}^T (B\bar{x} - b) = 0,\quad \bar{y} \ge 0,\quad \bar{\varsigma} > 1. \tag{8.168}$$

Writing the $(LD)$ in (8.164) as


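$$\max_{(\varsigma, y) \in \mathcal{V}_a^*} P_\theta^d(\varsigma, y),$$

where

$$P_\theta^d(\varsigma, y) = \mu\varsigma - y^T b + \min_x \{ (\varsigma - 1) U(x) + y^T B x \},$$

we get

$$P_\theta^d(\bar{\varsigma}, \bar{y}) = \bar{\varsigma}\mu - b^T \bar{y} + (\bar{\varsigma} - 1) U(\hat{x}) + \bar{y}^T B \hat{x}, \tag{8.169}$$

where $\hat{x}$ satisfies $\delta U(\hat{x}) = B^T \bar{y} / (1 - \bar{\varsigma})$. By (8.167) and the assumed invertibility of the canonical dual relation $x^* = \delta U(x)$, we get $\hat{x} = \bar{x}$. Substituting this into (8.169) and using (8.168) yields $P_\theta^d(\bar{\varsigma}, \bar{y}) = P(\bar{x})$; that is, there is zero duality gap. Furthermore, letting $\mathcal{U}_\mu = \{x \in \mathbb{R}^n \mid Bx \le b,\ -U(x) = \mu\}$, we have the following result.

Theorem 17 (KKT Condition and Global Optimality) Under Remark 1, for a given parameter $\mu$, if $(\bar{\varsigma}, \bar{y}) \in \mathcal{V}_a^*$ is a KKT point of $(\mathcal{P}_\mu^d)$ such that

$$\bar{x}^* = \frac{B^T \bar{y}}{1 - \bar{\varsigma}},$$

then the vector $\bar{x} = \delta U^*(\bar{x}^*)$ is a KKT point of $(\mathcal{P}_\mu)$, and $P(\bar{x}) = P^d(\bar{\varsigma}, \bar{y})$. Moreover, if $\bar{\varsigma} > 1$, then $(\bar{\varsigma}, \bar{y})$ is a global maximizer of $P^d(\varsigma, y)$ on $\mathcal{V}_+^*$, $\bar{x}$ is a global minimizer of $P(x)$ on the feasible space $\mathcal{U}_\mu$, and

$$\min_{x \in \mathcal{U}_\mu} P(x) = \max_{(\varsigma, y) \in \mathcal{V}_+^*} P^d(\varsigma, y).$$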

This example shows again that when a nonconvex constrained optimization problem can be written in a canonical form, the classical Lagrange multiplier method can be used to formulate a perfect dual problem. A detailed study on the canonical duality theory for solving general constrained nonconvex minimization problems and its connections with Lagrangian duality appears in [61]. One advantage of the canonical duality approach is that if the convex $U(x)$ is nonsmooth on $\mathcal{U}_a$, its Fenchel-Legendre conjugate $U^*$ is a smooth function on $\mathcal{U}_a^*$ (see Fig. 8.15). Such an idea has also been used in the study of geometrical dual analysis for solving nonsmooth "shape-preserving" design problems (see [9, 82, 139]).
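The smoothing effect of the Legendre conjugate can be checked numerically on a one-dimensional example. The sketch below assumes the strictly convex, nonsmooth function $U(x) = |x| + \tfrac{1}{2}x^2$, chosen here only for illustration (it is not the function plotted in Fig. 8.15). Its conjugate has the closed form $U^*(x^*) = \tfrac{1}{2}\max(|x^*| - 1, 0)^2$, which the code compares with a brute-force grid maximization; the function names are hypothetical.

```python
def U(x):
    """Assumed example: convex, with a kink (nonsmooth point) at x = 0."""
    return abs(x) + 0.5 * x * x


def conjugate_numeric(s, lo=-5.0, hi=5.0, steps=10001):
    """Approximate U*(s) = sup_x { s*x - U(x) } on a uniform grid."""
    h = (hi - lo) / (steps - 1)
    return max(s * x - U(x) for x in (lo + i * h for i in range(steps)))


def conjugate_closed_form(s):
    """U*(s) = 1/2 * max(|s| - 1, 0)^2 for the assumed U above."""
    t = max(abs(s) - 1.0, 0.0)
    return 0.5 * t * t


def one_sided_slopes(g, at, h=1e-4):
    """Backward and forward difference quotients of g at the point `at`."""
    return (g(at) - g(at - h)) / h, (g(at + h) - g(at)) / h


for s in (-3.0, -1.0, 0.0, 0.5, 2.0):
    print(s, conjugate_numeric(s), conjugate_closed_form(s))

# U has different one-sided slopes at its kink; U* does not at s = 1.
print("slopes of U at 0:  ", one_sided_slopes(U, 0.0))
print("slopes of U* at 1: ", one_sided_slopes(conjugate_closed_form, 1.0))
```

The one-sided slopes of $U$ at the origin differ by about 2, whereas those of $U^*$ agree at the transition point $x^* = 1$, which is the behavior sketched in Fig. 8.15.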

Advances in Mech. & Math., Vol. III, Gao & Sherali (ed.), Springer, 2006

Fig. 8.15. Nonsmooth function and its smooth Legendre conjugate: (a) graph of $U(x)$; (b) graph of the Legendre conjugate $U^*(x^*)$.

8.9 Sequential Canonical Dual Transformation and Solutions to Polynomial Minimization Problems

The canonical dual transformation method can be generalized in different ways to solve the global optimization problem

$$\min\{ P(x) = W(x) - U(x) \;:\; x \in \mathcal{U}_a \} \tag{8.170}$$

with different types of nonconvex functions $W(x) = V(\Lambda(x))$ and geometrical operators $\Lambda$. If the geometrical operator $\Lambda : \mathcal{U} \to \mathcal{V}$ is a general nonlinear, nonconvex mapping, we can continue to use the canonical dual transformation such that the general nonconvex function $W(x)$ can be written in the canonical form (see [38]):

$$W(x) = V(\Lambda(x)) = V_n(\xi_n(\xi_{n-1}(\dots(\xi_1(x))\dots))), \tag{8.171}$$

where $\xi_k(\xi_{k-1})$ is either a convex or a concave function of $\xi_{k-1}$, and we write $V_k(\xi_k) = \xi_{k+1}(\xi_k)$, $k = 1, \dots, n-1$. Thus, the geometrical operator $\Lambda : \mathcal{U} \to \mathcal{V}$ in this problem is a sequential composition of nonlinear mappings $\Lambda^{(k)} : \mathcal{V}_{k-1} \to \mathcal{V}_k$, $k = 1, \dots, n$, $\mathcal{V}_0 = \mathcal{U}$, and $\mathcal{V}_n = \mathcal{V}$, that is,

$$\xi_n(x) = \Lambda(x) = \left[ \Lambda^{(n)} \circ \Lambda^{(n-1)} \circ \cdots \circ \Lambda^{(1)} \right](x).$$

Because each $V_k(\xi_k)$ is a canonical function of $\xi_k$, the canonical duality relation $\varsigma_k = \delta V_k(\xi_k) : \mathcal{V}_k \to \mathcal{V}_k^*$ is one-to-one. It turns out that the Legendre conjugate

$$V_k^*(\varsigma_k) = \langle \xi_k ; \varsigma_k \rangle - V_k(\xi_k)$$


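can be uniquely defined. Letting $\varsigma = \{\varsigma_i\} \in \mathbb{R}^n$, the sequential canonical Lagrangian associated with the general nonconvex problem (8.170) can be written as (see [38])

$$\Xi(x, \varsigma) = \langle \Lambda^{(1)}(x) ; \varsigma_n! \rangle - V_w^*(\varsigma) - U(x), \tag{8.172}$$

where $\varsigma_p! := \varsigma_p \varsigma_{p-1} \cdots \varsigma_2 \varsigma_1$ and

$$V_w^*(\varsigma) = V_n^*(\varsigma_n) + \varsigma_n V_{n-1}^*(\varsigma_{n-1}) + \cdots + \frac{\varsigma_n!}{\varsigma_1} V_1^*(\varsigma_1). \tag{8.173}$$

Thus, the canonical dual problem can be formulated as

$$\max\{ P^d(\varsigma) = U^{\Lambda^{(1)}}(\varsigma) - V_w^*(\varsigma) \;:\; \varsigma \in \mathcal{V}_k^* \}. \tag{8.174}$$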

For certain given canonical functions $V$ and $U$, and the geometrical operator $\Lambda^{(1)}$, the $\Lambda$-conjugate transformation

$$U^{\Lambda^{(1)}}(\varsigma) = \mathrm{sta}\{ \langle \Lambda^{(1)}(x) ; \varsigma_n! \rangle - U(x) \;:\; \delta\Lambda^{(1)}(x)\, \varsigma_n! = \delta U(x) \}$$

can be well-defined on certain dual feasible spaces $\mathcal{V}_k^*$, and the canonical dual variables $\varsigma_k$ linearly depend on $\varsigma_1$. This canonical dual problem can be solved very easily. Two sequential canonical dual transformation methods have been proposed in Chapter 4 of [38]. Applications to general nonconvex differential equations and chaotic dynamical systems have been given in [33, 39].

As an application, let us consider the following polynomial minimization problem

$$\min\{ P(x) = W(x) - x^T f \;:\; x \in \mathbb{R}^n \}, \tag{8.175}$$

where $x = (x_1, x_2, \cdots, x_n)^T \in \mathbb{R}^n$ is a real vector, $f \in \mathbb{R}^n$ is a given vector, and $W(x)$ is a so-called canonical polynomial of degree $d = 2^{p+1}$ (see [38]), defined by

$$W(x) = \frac{1}{2}\alpha_p \left( \frac{1}{2}\alpha_{p-1} \left( \cdots \left( \frac{1}{2}\alpha_1 \left( \frac{1}{2}|x|^2 - \lambda_1 \right)^2 - \lambda_2 \right)^2 \cdots - \lambda_{p-1} \right)^2 - \lambda_p \right)^2, \tag{8.176}$$

where $\alpha_i$, $\lambda_i$ are given parameters. It is known that the general polynomial minimization problem is NP-hard even when $d = 4$ (see [94]). Many numerical methods and algorithms have been suggested recently for finding tight lower bounds of general polynomial optimization problems (see [81, 101]). For the current canonical polynomial minimization problem, the dual problem has been formulated in [52]; that is,

$$(\mathcal{P}^d):\quad \max_{\varsigma} \left\{ P^d(\varsigma) = -\frac{|f|^2}{2\varsigma_p!} - \sum_{k=1}^{p} \frac{\varsigma_p!}{\varsigma_k!} V_k^*(\varsigma_k) \right\}, \tag{8.177}$$

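where

$$\varsigma_1 = \varsigma, \qquad \varsigma_k = \alpha_k \left( \frac{1}{2\alpha_{k-1}} \varsigma_{k-1}^2 - \lambda_k \right), \quad k = 2, \cdots, p. \tag{8.178}$$

In this case, $V_k^*(\varsigma_k)$ is a quadratic function of $\varsigma_k$ defined by

$$V_k^*(\varsigma_k) = \frac{1}{2\alpha_k} \varsigma_k^2 + \lambda_k \varsigma_k.$$

The dual problem is a nonlinear program having only one variable $\varsigma \in \mathbb{R}$, which is much easier to solve than the primal problem. Clearly, for any $\varsigma \ne 0$ and $\varsigma_k^2 \ne 2\alpha_k \lambda_{k+1}$, the dual function $P^d$ is well-defined and the criticality condition $\delta P^d(\varsigma) = 0$ leads to a dual algebraic equation

$$2(\varsigma_p!)^2 (\alpha_1^{-1} \varsigma + \lambda_1) = |f|^2. \tag{8.179}$$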
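The link between the criticality condition $\delta P^d(\varsigma) = 0$ and the dual algebraic equation (8.179) can be checked numerically in the simplest case $p = 1$, $n = 1$. The sketch below assumes the illustrative data $\alpha_1 = 1$, $\lambda_1 = 1$, $f = 2$ (not taken from [52]), for which $\bar{\varsigma} = 1$ solves (8.179) exactly; all function names are hypothetical. It compares a finite-difference derivative of $P^d$ with the residual of (8.179), and evaluates the primal function at $\bar{x} = f/\bar{\varsigma}$.

```python
def dual_value(sigma, f_norm_sq, alpha1, lam1):
    """P^d(sigma) for p = 1, from (8.177): -|f|^2/(2 sigma) - V_1^*(sigma)."""
    return -f_norm_sq / (2.0 * sigma) - (sigma ** 2 / (2.0 * alpha1) + lam1 * sigma)


def dual_residual(sigma, f_norm_sq, alpha1, lam1):
    """Left side minus right side of (8.179) for p = 1."""
    return 2.0 * sigma ** 2 * (sigma / alpha1 + lam1) - f_norm_sq


def dual_derivative_fd(sigma, f_norm_sq, alpha1, lam1, h=1e-6):
    """Central finite-difference approximation of dP^d/dsigma."""
    up = dual_value(sigma + h, f_norm_sq, alpha1, lam1)
    down = dual_value(sigma - h, f_norm_sq, alpha1, lam1)
    return (up - down) / (2.0 * h)


def primal_value(x, f, alpha1, lam1):
    """Double-well primal P(x) = 1/2 alpha1 (x^2/2 - lam1)^2 - x f (scalar x)."""
    return 0.5 * alpha1 * (0.5 * x * x - lam1) ** 2 - x * f


# Assumed illustrative data.
alpha1, lam1, f = 1.0, 1.0, 2.0
f_norm_sq = f * f
sigma_bar = 1.0            # solves 2 sigma^2 (sigma + 1) = 4
x_bar = f / sigma_bar      # primal critical point

print("residual of (8.179):", dual_residual(sigma_bar, f_norm_sq, alpha1, lam1))
print("dP^d/dsigma (finite difference):",
      dual_derivative_fd(sigma_bar, f_norm_sq, alpha1, lam1))
print("P(x_bar) =", primal_value(x_bar, f, alpha1, lam1),
      " P^d(sigma_bar) =", dual_value(sigma_bar, f_norm_sq, alpha1, lam1))
```

For $p = 1$ the derivative of $P^d$ equals the residual of (8.179) multiplied by $-1/(2\varsigma^2)$, so for $\varsigma \ne 0$ the two vanish together.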

Theorem 18 (Complete Solution Set to Canonical Polynomial [52]) For any parameters $\alpha_k$ and $\lambda_k$, $k = 1, \cdots, p$, and input $f$, the dual algebraic equation (8.179) has at most $s = 2^{p+1} - 1$ real solutions: $\bar{\varsigma}^{(i)}$, $i = 1, \cdots, s$. For each dual solution $\bar{\varsigma} \in \mathbb{R}$, the vector $\bar{x}$ defined by

$$\bar{x}(\bar{\varsigma}) = (\bar{\varsigma}_p!)^{-1} f \tag{8.180}$$

is a critical point of the primal problem $(\mathcal{P})$ and $P(\bar{x}) = P^d(\bar{\varsigma})$. Conversely, every critical point $\bar{x}$ of the polynomial $P(x)$ can be written in the form (8.180) for some dual solution $\bar{\varsigma} \in \mathbb{R}$.

In the case that $p = 1$, the nonconvex function $W(x) = \frac{1}{2}\alpha_1 (\frac{1}{2}|x|^2 - \lambda_1)^2$ is a double-well function. The global and local extrema can be identified by the triality theory given in Theorem 6. For the general case of $p > 1$, the sufficient condition for global minimizer was obtained recently in [52].

Theorem 19 (Sufficient Condition for Global Minimizer) Suppose that for any arbitrarily given positive parameters $\alpha_k$, $\lambda_k \ge 0$, $\forall k \in \{1, \cdots, p\}$, $\bar{\varsigma}$ is a solution of the dual algebraic equation (8.179). If

$$\bar{\varsigma} > \varsigma_+ = \sqrt{ 2\alpha_1 \left( \lambda_2 + \sqrt{ \frac{2}{\alpha_2} \left( \lambda_3 + \cdots + \sqrt{ \frac{2}{\alpha_{p-2}} \left( \lambda_{p-1} + \sqrt{ \frac{2}{\alpha_{p-1}} \lambda_p } \right) } \right) } \right) },$$

then $\bar{\varsigma}$ is a global maximizer of $P^d$ on the open domain $(\varsigma_+, +\infty)$, the vector $\bar{x} = (\bar{\varsigma}_p!)^{-1} f$ is a global minimizer of the polynomial minimization problem (8.175), and

$$P(\bar{x}) = \min_{x \in \mathbb{R}^n} P(x) = \max_{\varsigma > \varsigma_+} P^d(\varsigma) = P^d(\bar{\varsigma}). \tag{8.181}$$


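In the case of $p = 2$, the nonconvex function $W(x)$ is a canonical polynomial of degree eight. The dual function $P^d(\varsigma)$ has the form

$$P^d(\varsigma) = -\frac{|f|^2}{2\varsigma\varsigma_2} - \left( \frac{1}{2\alpha_2}\varsigma_2^2 + \lambda_2 \varsigma_2 + \varsigma_2 \left( \frac{1}{2\alpha_1}\varsigma^2 + \lambda_1 \varsigma \right) \right), \tag{8.182}$$

where $\varsigma_2 = \alpha_2 \varsigma^2 / (2\alpha_1) - \lambda_2 \alpha_2$. In this case, the dual algebraic equation (8.179),

$$2\varsigma^2 \left( \frac{\alpha_2}{2\alpha_1}\varsigma^2 - \lambda_2 \alpha_2 \right)^2 \left( \frac{1}{\alpha_1}\varsigma + \lambda_1 \right) = |f|^2, \tag{8.183}$$

has at most seven real roots $\bar{\varsigma}_i$, $i = 1, \cdots, 7$. Let

$$\phi_2(\varsigma) = \pm\varsigma \left( \frac{\alpha_2}{2\alpha_1}\varsigma^2 - \lambda_2 \alpha_2 \right) \sqrt{ 2\left( \frac{1}{\alpha_1}\varsigma + \lambda_1 \right) },$$

and $f = \{0.1, -0.1\}$, $\alpha_1 = 1$, $\alpha_2 = 1$, and $\lambda_2 = 1$. Then, for different values of $\lambda_1$, the graphs of $\phi_2(\varsigma)$ and $P^d(\varsigma)$ are shown in Fig. 8.16. The graphs of $P(x)$ are shown in Fig. 8.17 (for $\lambda_1 = 0$ and $\lambda_1 = 1$) and Fig. 8.18 (for $\lambda_1 = 2$). Since $\varsigma_+ = \sqrt{2\alpha_1 \lambda_2} = \sqrt{2}$, we can see that the dual function $P^d(\varsigma)$ is strictly concave for $\varsigma > \varsigma_+ = \sqrt{2}$. The dual algebraic equation (8.183) has a total of seven real solutions when $\lambda_1 = 2$, and the largest $\varsigma_1 = 2.10 > \varsigma_+$ gives the global minimizer $x_1 = f/\varsigma_1 = \{2.29, -0.92\}$, and $P(x_1) = -1.32 = P^d(\varsigma_1)$. The smallest $\varsigma_7 = -4.0$ gives a local maximizer $x_7 = \{-0.04, 0.02\}$ and $P(x_7) = 4.51 = P^d(\varsigma_7)$ (see Fig. 8.18). Detailed studies on solving general polynomial minimization problems are given in [38, 52, 81, 114, 115].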
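The solution procedure of Theorem 18 can be sketched in a few lines for $p = 2$. The code below uses the parameter values stated in the example ($f = \{0.1, -0.1\}$, $\alpha_1 = \alpha_2 = 1$, $\lambda_2 = 1$, $\lambda_1 = 2$), locates the real roots of (8.183) by a grid scan followed by bisection, maps each root to a primal critical point via (8.180), and evaluates $P$ and $P^d$ there. The root finder, the search interval $[-3, 3]$, and all function names are assumptions of this sketch rather than the method of [52]. For these parameter values the sketch returns seven real roots, which may be compared with the values listed in Fig. 8.16(c).

```python
import math


def sigma_sequence(sigma, alphas, lams):
    """Return [sigma_1, ..., sigma_p] from the recursion (8.178)."""
    seq = [sigma]
    for k in range(1, len(alphas)):
        seq.append(alphas[k] * (seq[-1] ** 2 / (2.0 * alphas[k - 1]) - lams[k]))
    return seq


def product(values):
    out = 1.0
    for v in values:
        out *= v
    return out


def residual(sigma, alphas, lams, f_norm_sq):
    """Left side minus right side of the dual algebraic equation (8.179)."""
    fact = product(sigma_sequence(sigma, alphas, lams))  # sigma_p!
    return 2.0 * fact ** 2 * (sigma / alphas[0] + lams[0]) - f_norm_sq


def dual_value(sigma, alphas, lams, f_norm_sq):
    """P^d(sigma) from (8.177) with V_k^*(s) = s^2/(2 alpha_k) + lambda_k s."""
    seq = sigma_sequence(sigma, alphas, lams)
    fact = product(seq)
    total = -f_norm_sq / (2.0 * fact)
    partial = 1.0  # running product sigma_k!
    for s, a, lam in zip(seq, alphas, lams):
        partial *= s
        total -= (fact / partial) * (s * s / (2.0 * a) + lam * s)
    return total


def primal_value(x, f, alphas, lams):
    """P(x) = W(x) - x^T f with the nested polynomial W of (8.176)."""
    xi = 0.5 * sum(c * c for c in x)
    for a, lam in zip(alphas, lams):
        xi = 0.5 * a * (xi - lam) ** 2
    return xi - sum(c * d for c, d in zip(x, f))


def real_roots(g, lo, hi, steps=6000, tol=1e-12):
    """Bracket sign changes of g on a uniform grid, then refine by bisection."""
    roots = []
    h = (hi - lo) / steps
    a, ga = lo, g(lo)
    for i in range(1, steps + 1):
        b = lo + i * h
        gb = g(b)
        if ga * gb < 0.0:
            left, right, gl = a, b, ga
            while right - left > tol:
                mid = 0.5 * (left + right)
                gm = g(mid)
                if gl * gm <= 0.0:
                    right = mid
                else:
                    left, gl = mid, gm
            roots.append(0.5 * (left + right))
        a, ga = b, gb
    return roots


alphas, lams, f = [1.0, 1.0], [2.0, 1.0], [0.1, -0.1]
f_norm_sq = sum(c * c for c in f)
roots = real_roots(lambda s: residual(s, alphas, lams, f_norm_sq), -3.0, 3.0)
sigma_plus = math.sqrt(2.0 * alphas[0] * lams[1])  # Theorem 19 bound for p = 2

for s in roots:
    fact = product(sigma_sequence(s, alphas, lams))
    x = [c / fact for c in f]  # critical point from (8.180)
    print(f"sigma={s: .4f}  P(x)={primal_value(x, f, alphas, lams): .4f}  "
          f"P^d={dual_value(s, alphas, lams, f_norm_sq): .4f}  "
          f"above sigma_+: {s > sigma_plus}")
```

A grid scan with bisection is used only because it needs no external library; it can miss roots of even multiplicity or roots closer together than the grid spacing, so a polynomial root finder would be a more robust choice in practice.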

8.10 Concluding Remarks

We have presented a detailed review of the canonical dual transformation and its associated triality theory, with specific applications to nonconvex analysis and global optimization problems. Duality plays a key role in modern mathematics and science. The inner beauty of duality theory owes much to the fact that many different natural phenomena can be cast in the unified mathematical framework of Fig. 1. According to the traditional philosophical principle of yin-yang duality, "The Complementarity of One Yin and One Yang is the Dao" (see [31, 80]); that is, the constitutive relations in any physical system should be one-to-one. Niels Bohr realized its value in quantum mechanics. His complementarity theory and philosophy laid a foundation on which the field of modern physics was developed [100]. In nonconvex analysis and optimization, this one-to-one canonical duality relation serves as the foundation for the canonical dual transformation method. For any given nonconvex problem, as long as the geometrical operator $\Lambda$ is chosen properly and the tri-canonical forms can be characterized correctly, the canonical dual transformation can be used to establish elegant theoretical results and to develop efficient algorithms for robust computations. The extended Lagrangian duality and triality theories show promise of having significance in many diverse fields.

As indicated in [38], duality in natural systems is a very broad and rich field. To theoretical scientists and philosophical thinkers as well as great artists, duality has always played a central role in their respective fields. It is really "a splendid feeling to realize the unity of a complex of phenomena that by physical perception appear to be completely separated" (Albert Einstein). It is pleasing to see that more and more knowledgeable researchers and scientists are working in this wonderland and exploring the intrinsic beauty of nature, often revealed via duality theory.

Fig. 8.16. Graphs of the algebraic curve $\phi_2(\varsigma)$ (left) and dual function $P^d(\varsigma)$ (right): (a) $\lambda_1 = 0$: three solutions $\varsigma_3 = 0.22 < \varsigma_2 = 1.37 < \varsigma_1 = 1.45$; (b) $\lambda_1 = 1$: five solutions $\{-0.96, -0.11, 0.096, 1.38, 1.45\}$; (c) $\lambda_1 = 2$: seven solutions $\{-2.0, -1.45, -1.35, -0.072, 0.07, 1.39, 1.44\}$.


Fig. 8.17. Graphs of $P(x)$: (a) $\lambda_1 = 0$; (b) $\lambda_1 = 1$.

Fig. 8.18. Graph of $P(x)$ with $\lambda_1 = 2$

References

1. Arthurs, A.M. (1980). Complementary Variational Principles, Clarendon Press, Oxford.
2. Atai, A.A. and Steigmann, D. (1998). Coupled deformations of elastic curves and surfaces, Int. J. Solids and Structures, 35, 1915-1952.
3. Aubin, J.P. and Ekeland, I. (1976). Estimates of the duality gap in nonconvex optimization, Math. Oper. Res., 1(3), 225-245.
4. Auchmuty, G. (1983). Duality for non-convex variational principles, J. Diff. Equations, 50, pp. 80-145.
5. Auchmuty, G. (1986). Dual variational principles for eigenvalue problems, Proceedings of Symposia in Pure Math., 45, Part 1, 55-71.
6. Auchmuty, G. (2001). Variational principles for self-adjoint elliptic eigenproblems, in Nonconvex/Nonsmooth Mechanics: Modelling, Methods and Algorithms, Gao, D.Y., R.W. Ogden and G. Stavroulakis (eds.), Kluwer Academic Publishers, 2000, 478pp.
7. Benson, H. (1995). Concave minimization: theory, applications and algorithms, in Handbook of Global Optimization, eds. R. Horst and P. Pardalos, Kluwer Academic Publishers, 43-148.
8. Casciaro, R. and Cascini, A. (1982). A mixed formulation and mixed finite elements for limit analysis, Int. J. Solids and Struct., 19, 169-184.
9. Cheng, H., Fang, S.C., and Lavery, J. (2005). Shape-preserving properties of univariate cubic L1 splines, J. Comput. Appl. Math., 174, 361-382.
10. Chien, Wei-zang (1980). Variational Methods and Finite Elements (in Chinese). Science Press, 608pp.
11. Clarke, F.H. (1983). Optimization and Nonsmooth Analysis, John Wiley, New York.
12. Clarke, F.H. (1985). The dual action, optimal control, and generalized gradients, Mathematical Control Theory, Banach Center Publ., 14, PWN, Warsaw, pp. 109-119.
13. Crouzeix, J.P. (1981). Duality framework in quasiconvex programming, in Generalized Convexity in Optimization and Economics, eds. S. Schaible and W.T. Ziemba, Academic Press, 207-226.
14. Dacorogna, D. (1989). Direct Methods in the Calculus of Variations. Springer-Verlag, New York.
15. Ekeland, I. (1977). Legendre duality in nonconvex optimization and calculus of variations, SIAM J. Control and Optimization, 15, 905-934.
16. Ekeland, I. (1990). Convexity Methods in Hamiltonian Mechanics, Springer-Verlag, New York, 247 pp.
17. Ekeland, I. (2003). Nonconvex duality, in Proceedings of IUTAM Symposium on Duality, Complementarity and Symmetry in Nonlinear Mechanics, D.Y. Gao (ed.), Kluwer Academic Publishers, Dordrecht/Boston/London, pp. 13-19.
18. Ekeland, I. and Temam, R. (1976). Convex Analysis and Variational Problems, North-Holland.
19. Floudas, C.A. and Visweswaran, V. (1995). Quadratic optimization, in Handbook of Optimization, R. Horst and P.M. Pardalos (eds), Kluwer Academic Publishers, Dordrecht/Boston/London, pp. 217-270.
20. Gao, D.Y. (1986). Complementarity Principles in Nonsmooth Elastoplastic Systems and Pan-penalty Finite Element Methods, Ph.D. Thesis, Tsinghua University, Beijing, China, 236 pp.
21. Gao, D.Y. (1988). On the complementary bounding theorems for limit analysis, Int. J. of Solids Struct., 24, 545-556.
22. Gao, D.Y. (1988). Panpenalty finite element programming for limit analysis, Computers & Structures, 28, pp. 749-755.
23. Gao, D.Y. (1990). Dynamically loaded rigid-plastic analysis under large deformation, Quart. Appl. Math., 48, pp. 731-739.


24. Gao, D.Y. (1990). On the extremum potential variational principles for geometrical nonlinear thin elastic shell, Science in China (Scientia Sinica) (A), 33 (1), pp. 324-331.
25. Gao, D.Y. (1990). On the extremum variational principles for nonlinear elastic plates, Quart. Appl. Math., 48, pp. 361-370.
26. Gao, D.Y. (1990). Complementary principles in nonlinear elasticity, Science in China (Scientia Sinica) (A) (Chinese Ed.), 33(4), pp. 386-394.
27. Gao, D.Y. (1990). Bounding theorem on finite dynamic deformations of plasticity, Mech. Research Commun., 17, pp. 33-39.
28. Gao, D.Y. (1991). Extended bounding theorems for nonlinear limit analysis, Int. J. Solids Structures, 27, pp. 523-531.
29. Gao, D.Y. (1992). Global extremum criteria for nonlinear elasticity, J. Appl. Math. Physics (ZAMP), 43, 924-937.
30. Gao, D.Y. (1996). Nonlinear elastic beam theory with applications in contact problem and variational approaches, Mech. Research Commun., 23 (1), 11-17.
31. Gao, D.Y. (1996). Complementarity and duality in natural sciences, Philosophical Study in Modern Science and Technology (in Chinese). Tsinghua University Press, Beijing, China, 12-25.
32. Gao, D.Y. (1997). Dual extremum principles in finite deformation theory with applications to post-buckling analysis of extended nonlinear beam theory, Appl. Mech. Rev., 50, 11, November 1997, S64-S71.
33. Gao, D.Y. (1998). Duality, triality and complementary extremum principles in nonconvex parametric variational problems with applications, IMA J. Appl. Math., 61, 199-235.
34. Gao, D.Y. (1998). Bi-complementarity and duality: A framework in nonlinear equilibria with applications to the contact problems of elastoplastic beam theory, J. Appl. Math. Anal., 221, 672-697.
35. Gao, D.Y. (1999). Pure complementary energy principle and triality theory in finite elasticity, Mech. Res. Comm., 26 (1), 31-37.
36. Gao, D.Y. (1999). Duality-Mathematics, Wiley Encyclopedia of Electrical and Electronics Engineering, 6, 68-77.
37. Gao, D.Y. (1999). General Analytic Solutions and Complementary Variational Principles for Large Deformation Nonsmooth Mechanics. Meccanica, 34, 169-198.
38. Gao, D.Y. (2000). Duality Principles in Nonconvex Systems: Theory, Methods and Applications, Kluwer Academic Publishers, Dordrecht/Boston/London, xviii + 454pp.
39. Gao, D.Y. (2000). Analytic solution and triality theory for nonconvex and nonsmooth variational problems with applications, Nonlinear Analysis, 42, 7, 1161-1193.
40. Gao, D.Y. (2000). Canonical dual transformation method and generalized triality theory in nonsmooth global optimization, J. Global Optimization, 17 (1/4), pp. 127-160.
41. Gao, D.Y. (2000). Finite deformation beam models and triality theory in dynamical post-buckling analysis, Int. J. Non-Linear Mechanics, 5, 103-131.
42. Gao, D.Y. (2001). Bi-Duality in Nonconvex Optimization, in Encyclopedia of Optimization, C. A. Floudas and P.D. Pardalos (eds). Kluwer Academic Publishers, Dordrecht/Boston/London, Vol. 1, pp. 477-482.
43. Gao, D.Y. (2001). Tri-duality in Global Optimization, in Encyclopedia of Optimization, C. A. Floudas and P.D. Pardalos (eds). Kluwer Academic Publishers, Dordrecht/Boston/London, Vol. 1, pp. 485-491.

44. Gao, D.Y. (2001). Complementarity, polarity and triality in nonsmooth, nonconvex and nonconservative Hamilton systems, Philosophical Transactions of the Royal Society: Mathematical, Physical and Engineering Sciences, 359, 2347-2367.
45. Gao, D.Y. (2002). Duality and triality in non-smooth, nonconvex and nonconservative systems: A survey, new phenomena and new results, in Nonsmooth/Nonconvex Mechanics with Applications in Engineering, edited by C. Baniotopoulos. Thessaloniki, Greece. pp. 1-14.
46. Gao, D.Y. (2003). Perfect duality theory and complete solutions to a class of global optimization problems, Optimisation, 52 (4-5), 467-493.
47. Gao, D.Y. (2003). Nonconvex semi-linear problems and canonical duality solutions, Advances in Mechanics and Mathematics, Kluwer Academic Publishers, Dordrecht/Boston/London, Vol. II, 261-312.
48. Gao, D.Y. (2004). Complementary variational principle, algorithm, and complete solutions to phase transitions in solids governed by Landau-Ginzburg equation, Mathematics and Mechanics of Solids, 9, 285-305.
49. Gao, D.Y. (2004). Canonical duality theory and solutions to constrained nonconvex quadratic programming, J. Global Optimization, 29, 377-399.
50. Gao, D.Y. (2005). Sufficient conditions and perfect duality in nonconvex minimization with inequality constraints, J. Industrial and Management Optimization, 1, 59-69.
51. Gao, D.Y. (2005). Canonical duality in nonsmooth, concave minimization with inequality constraints, Advances in Nonsmooth Mechanics, a special volume in honor of Professor J.J. Moreau's 80th birthday, P. Alart and O. Maisonneuve (eds). Springer, New York, 305-314.
52. Gao, D.Y. (2006). Complete solutions to a class of polynomial minimization problems, J. Global Optimization, 35, 131-143.
53. Gao, D.Y. (2007). Duality-Mathematics, Wiley Encyclopedia of Electrical and Electronics Engineering, Vol. 6, (Second Edition), John G. Webster (ed.).
54. Gao, D.Y. (2007). Solutions and optimality to box constrained nonconvex minimization problems, J. Indust. and Manage. Optim., 3(2), 293-304.
55. Gao, D.Y. and Cheung, Y.K. (1989). On the extremum complementary energy principles for nonlinear elastic shells. Int. J. Solids & Struct., 26, pp. 683-693.
56. Gao, D.Y. and Hwang, K.C. (1988). On the complementary variational principles for elasto-plasticity. Scientia Sinica (A), 31, pp. 1469-1476.
57. Gao, D.Y. and Ogden, R.W. (2007). Closed-form solutions, extremality and nonsmoothness criteria in a large deformation elasticity problem, to appear in Zeitschrift für angewandte Mathematik und Physik.
58. Gao, D.Y., R.W. Ogden and G. Stavroulakis (2001). Nonsmooth and Nonconvex Mechanics: Modelling, Analysis and Numerical Methods. Kluwer Academic Publishers, Boston/Dordrecht/London, 2001, xliv+471pp.
59. Gao, D.Y. and Onate, E.T. (1990). Rate variational extremum principles for finite elastoplasticity. Appl. Math. Mech., 11(7), pp. 659-667.
60. Gao, D.Y. and Ruan, N. (2007). Complete solutions and optimality criteria for nonconvex quadratic-exponential minimization problem, to appear in Math. Meth. of Oper. Res.
61. Gao, D.Y., Ruan, N., and Sherali, H.D. (2008). Canonical duality theory for solving nonconvex constrained optimization problems, to appear in J. Global Optimization.


62. Gao, D.Y. and Strang, G. (1989). Geometric nonlinearity: Potential energy, complementary energy, and the gap function, Quart. Appl. Math., 47(3), 487-504.
63. Gao, D.Y. and Strang, G. (1989). Dual extremum principles in finite deformation elastoplastic analysis, Acta Appl. Math., 17, pp. 257-267.
64. Gao, D.Y. and Wierzbicki, T. (1989). Bounding theorem in finite plasticity with hardening effect. Quart. Appl. Math., 47, pp. 395-403.
65. Gao, D.Y. and Yang, W.-H. (1995). Multi-duality in minimal surface type problems, Studies in Appl. Math., 95, 127-146.
66. Guo, Z.H. (1980). The unified theory of variational principles in nonlinear elasticity. Archive of Mechanics, 32, pp. 577-596.
67. Gasimov, R.N. (2002). Augmented Lagrangian duality and nondifferentiable optimization methods in nonconvex programming. J. Global Optimization, 24, 187-203.
68. Goh, C.J. and Yang, X.Q. (2002). Duality in Optimization and Variational Inequalities, Taylor and Francis, 329pp.
69. Greenberg, H.J. (1949). On the variational principles of plasticity. Brown University, ONR, NR-041-032, March.
70. Haar, A. and von Kármán, Th. (1909). Zur Theorie der Spannungszustände in plastischen und sandartigen Medien. Nachr. den Gesellsch der Wissensch. zu Göttingen, 204-218.
71. Han, Weimin (2005). A Posteriori Error Analysis Via Duality Theory With Applications in Modeling and Numerical Approximations. Springer Book Series: Advances in Mechanics and Mathematics, Vol. 8, Springer, 302pp.
72. Hellinger, E. (1914). Die allgemeine Ansätze der Mechanik der Kontinua. Encyklopädie der Mathematischen Wissenschaften IV, 4, 602-94.
73. Hill, R. (1978). Aspects of invariance in solids mechanics, Adv. in Appl. Mech., 18, 1-75.
74. Hiriart-Urruty, J.-B. (1985). Generalized differentiability, duality and optimization for problems dealing with difference of convex functions, Appl. Mathematics and Optimization, 6, 257-269.
75. Horst, R., Pardalos, P.M., and Thoai, N.V. (2000). Introduction to Global Optimization, Kluwer Academic Publishers.
76. Hu, H.-C. (1955). On some variational principles in the theory of elasticity and the theory of plasticity, Scientia Sinica, 4, 33-54.
77. Huang, X.X. and Yang, X.Q. (2003). A unified augmented Lagrangian approach to duality and exact penalization. Mathematics of Operations Research, 28, 524-532.
78. Koiter, W.T. (1973). On the principle of stationary complementary energy in the nonlinear theory of elasticity, SIAM J. Appl. Math., 25, 424-434.
79. Koiter, W.T. (1976). On the complementary energy theorem in nonlinear elasticity theory, Trends in Appl. of Pure Math. to Mech., ed. G. Fichera, Pitman.
80. Lao Zhi (400BC). Dao De Jing (or Tao Te Ching), English edition by D.C. Lau, Penguin Classics, 1963.
81. Lasserre, J. (2001). Global optimization with polynomials and the problem of moments. SIAM J. Optimization, 11 (3), 796-817.
82. Lavery, J. (2004). Shape-preserving approximation of multiscale univariate data by cubic L1 spline fits, Comput. Aided Geom. Design, 21, 43-64.
83. Lee, S. J. and Shield, R. T. (1980). Variational principles in finite elastics, J. Appl. Math. Physics (ZAMP), 31, 437-453.

84. Lee, S. J. and Shield, R. T. (1980). Applications of variational principles in finite elasticity, J. Appl. Math. Physics (ZAMP), 31, pp. 454-472.
85. Levinson, M. (1965). The complementary energy theorem in finite elasticity, Trans. ASME, ser. E, J. Appl. Mech., 87, pp. 826-828.
86. Li, S.F. and Gupta, A. (2006). On dual configuration forces, J. of Elasticity, 84:13-31.
87. Maier, G. (1969). Complementarity plastic work theorems in piecewise-linear elastoplasticity, Int. J. Solids and Struct., 5, 261-270.
88. Maier, G. (1970). A matrix structural theory of piecewise-linear plasticity with interacting yield planes, Meccanica, 5, 55-66.
89. Maier, G., Carvelli, V. and Cocchetti, G. (2000). On direct methods for shakedown and limit analysis, Plenary lecture at the 4th EUROMECH Solid Mechanics Conference, Metz, France, June 26-30, European J. of Mechanics, A/Solids, 19, Special Issue, S79-S100.
90. Marsden, J. and Ratiu, T. (1995). Introduction to Mechanics and Symmetry, Springer.
91. Moreau, J.J. (1968). La notion de sur-potentiel et les liaisons unilatérales en élastostatique, C.R. Acad. Sc. Paris, 267 A, 954-957.
92. Moreau, J.J., Panagiotopoulos, P.D. and Strang, G. (1988). Topics in nonsmooth mechanics. Birkhäuser Verlag, Basel-Boston, MA.
93. Murty, K.G. and Kabadi, S.N. (1987). Some NP-complete problems in quadratic and nonlinear programmings, Math. Progr., 39, 117-129.
94. Nesterov, Y. (2000). Squared functional systems and optimization problems. High Performance Optimization (H. Frenk et al, eds), Kluwer Academic Publishers, pp. 405-440.
95. Noble, B. and Sewell, M.J. (1972). On dual extremum principles in applied mathematics, J. Inst. Math. Appl., 9, 123-193.
96. Oden, J. T. and Lee, J. K. (1977). Dual-mixed hybrid finite element method for second-order elliptic problems. Mathematical aspects of finite element methods (Proc. Conf., Consiglio Naz. delle Ricerche (C.N.R.), Rome, 1975). In Lecture Notes in Math., Vol. 606, Springer, Berlin, 275-291.
97. Oden, J.T. and Reddy, J.N. (1983). Variational Methods in Theoretical Mechanics. Springer-Verlag.
98. Ogden, R.W. (1975). A note on variational theorems in non-linear elastostatics. Math. Proc. Camb. Phil. Soc., 77, pp. 609-615.
99. Ogden, R.W. (1977). Inequalities associated with the inversion of elastic stress-deformation relations and their implications, Math. Proc. Camb. Phil. Soc., 81, 313-324.
100. Pais, A. (1991). Niels Bohr's Times, In Physics, Philosophy, and Polity, Clarendon Press, Oxford. 565pp.
101. Parrilo, P. and Sturmfels, B. (2001). Minimizing polynomial functions, Proceedings of DIMACS Workshop on Algorithmic and Quantitative Aspects of Real Algebraic Geometry in Mathematics and Computer Science, (eds. S. Basu and L. Gonzalez-Vega), American Mathematical Society, 2003, pp. 83-100.
102. Pian, T.H.H. and Tong, P. (1980). Reissner's principle in finite element formulations, in Mechanics Today, 5, S. Nemat-Nasser (ed.), Pergamon Press, 377-395.
103. Pian, T.H.H. and Wu, C.C. (2006). Hybrid and Incompatible Finite Element Methods, Chapman & Hall/CRC, 378 pp.
104. Rockafellar, R.T. (1967). Duality and stability in extremum problems involving convex functions, Pacific J. Math., 21, 167-187.


105. Rockafellar, R.T. (1970). Convex Analysis, Princeton University Press.
106. Rockafellar, R.T. (1974). Conjugate Duality and Optimization, SIAM, Philadelphia.
107. Rockafellar, R.T. and Wets, R.J.B. (1998). Variational Analysis, Springer: Berlin, New York.
108. Powell, M.J.D. (2002). UOBYQA: unconstrained optimization by quadratic approximation, Mathematical Programming, Series B, 92 (3), pp. 555-582.
109. Rubinov, A.M. and Yang, X.Q. (2003). Lagrange-Type Functions in Constrained Non-Convex Optimization. Kluwer Academic Publishers, Boston/Dordrecht/London, 285 pp.
110. Rubinov, A.M., Yang, X.Q. and Glover, B.M. (2001). Extended Lagrange and penalty functions in optimization. J. Optim. Theory Appl., 111 (2), 381-405.
111. Penot, J.-P. and Volle, M. (1990). On quasiconvex duality, Mathematics of Operations Research, 14, 597-625.
112. Sahni, S. (1974). Computationally related problems, SIAM J. Comp., 3, 262-279.
113. Sewell, M.J. (1987). Maximum and Minimum Principles, Cambridge Univ. Press, 468pp.
114. Sherali, H.D. and Tuncbilek, C. (1992). A global optimization for polynomial programming problem using a reformulation-linearization technique. J. of Global Optimization, 2, 101-112.
115. Sherali, H.D. and Tuncbilek, C. (1997). New reformulation-linearization technique based relaxation for univariate and multivariate polynomial programming problems. Operations Research Letters, 21(1), 1-10.
116. Silverman, H.H. and Tate, J. (1992). Rational Points on Elliptic Curves, Springer-Verlag.
117. Singer, I. (1998). Duality for optimization and best approximation over finite intersections. Numer. Funct. Anal. Optim., 19, no. 7-8, 903-915.
118. Strang, G. (1979). A minimax problem in plasticity theory, Functional Analysis Methods in Numerical Analysis, M.Z. Nashed (ed.), Springer Lecture Notes 701, 319-333.
119. Strang, G. (1986). Introduction to Applied Mathematics, Wellesley-Cambridge Press, 758 pp.
120. Strang, G. (1984). Duality in the classroom, American Math. Monthly, 91, 250-254.
121. Strang, G. (1983). Maximal flow through a domain, Mathematical Programming, 26, 123-143.
122. Strang, G. (1982). L1 and L∞ and approximation of vector fields in the plane, Nonlinear Partial Differential Equations in Applied Science, H. Fujita, P. Lax, and G. Strang (eds.), Lecture Notes in Num. Appl. Anal. 5, 273-288, Springer, New York.
123. Tabarrok, B. and Rimrott, F.P.J. (1994). Variational methods and complementary formulations in dynamics. Kluwer Academic Publishers: Dordrecht.
124. Temam, R. and Strang, G. (1980). Duality and relaxation in the variational problems of plasticity, J. de Mécanique, 19, 1-35.
125. Thach, P.T. (1993). Global optimality criterion and a duality with a zero gap in nonconvex optimization. SIAM J. Math. Anal., 24, no. 6, 1537-1556.
126. Thach, P. T. (1995). Diewert-Crouzeix conjugation for general quasiconvex duality and applications. J. Optim. Theory Appl., 86, no. 3, 719-743.

127. Thach, P.T., Konno, H. and Yokota, D. (1996). Dual approach to minimization on the set of Pareto-optimal solutions. J. Optim. Theory Appl., 88, no. 3, 689-707.
128. Toland, J.F. (1978). Duality in nonconvex optimization, J. Mathematical Analysis and Applications, 66, 399-415.
129. Toland, J.F. (1979). A duality principle for non-convex optimization and the calculus of variations, Arch. Rational Mech. Anal., 71, 41-61.
130. Tonti, E. (1972). A mathematical model for physical theories, Accad. Naz. dei Lincei, Serie III, LII, I, 175-181; II, 350-356.
131. Tonti, E. (1972). On the mathematical structure of a large class of physical theories, Accad. Naz. dei Lincei, Serie VIII, LII, 49-56.
132. Tuy, H. (1995). D.C. optimization: theory, methods and algorithms, in Handbook of Global Optimization, eds. R. Horst and P. Pardalos, Kluwer Academic Publishers, 149-216.
133. Vavasis, S. (1990). Quadratic programming is in NP, Info. Proc. Lett., 36, 73-77.
134. Vavasis, S. (1991). Nonlinear Optimization: Complexity Issues, Oxford University Press, New York, NY.
135. Veubeke, B.F. (1972). A new variational principle for finite elastic displacements. Int. J. Engineering Science, 10, pp. 745-763.
136. Walk, M. (1989). Theory of duality in mathematical programming, Springer-Verlag, Wien.
137. Wright, M.H. (1998). The interior-point revolution in constrained optimization, in High-Performance Algorithms and Software in Nonlinear Optimization (R. DeLeone, A. Murli, P.M. Pardalos, and G. Toraldo, eds.), 359-381, Kluwer Academic Publishers, Dordrecht, The Netherlands.
138. Ye, Y. (1992). A new complexity result on minimization of a quadratic function with a sphere constraint, in Recent Advances in Global Optimization (C. Floudas and P. Pardalos, eds.), Princeton University Press, Princeton, NJ.
139. Zhao, Y.B., Fang, S.C., and Lavery, J. (2006). Geometric dual formulation of the first derivative based C1-smooth univariate cubic L1 spline functions, to appear in Complementarity, Duality, and Global Optimization, a special issue of J. Global Optimization, D.Y. Gao and H.D. Sherali, Eds.
140. Zhou, Y.Y. and Yang, X.Q. (2004). Some results about duality and exact penalization. J. Global Optimization, 29, 497-509.
141. Zubov, L.M. (1970). The stationary principle of complementary work in nonlinear theory of elasticity, Prikl. Mat. Mech., 34, pp. 228-232.