A Recursive Trust–Region Method for Non–convex Constrained ...

0 downloads 0 Views 316KB Size Report
R. W. Ogden introduced a material law for rubber-like materials (cf., Og- den [1984]). ... Michael Ulbrich, Stefan Ulbrich, and Matthias Heinkenschloss. Global con ...
A Recursive Trust–Region Method for Non–convex Constrained Minimization Christian Groß, Rolf Krause

no. 409

Diese Arbeit ist mit Unterstützung des von der Deutschen Forschungsgemeinschaft getragenen Sonderforschungsbereichs 611 an der Universität Bonn entstanden und als Manuskript vervielfältigt worden. Bonn, Mai 2008

1 A Recursive Trust–Region Method for Non-convex Constrained Minimization Christian Groß1 and Rolf Krause1 Institute for Numerical Simulation, University of Bonn. {gross,krause}@ins.uni-bonn.de

1.1 Introduction The mathematical modelling of mechanical or biomechanical problems involving large deformations or biological materials often leads to highly nonlinear and constrained minimization problems. For instance, the simulation of soft– tissues, as the deformation of skin, gives rise to a highly non–linear PDE with constraints, which constitutes the first order condition for a minimizer of the corresponding non-linear energy functional. Besides of the pure deformation of the tissue, bones and muscles have a restricting effect on the deformation of the considered material, thus leading to additional constraints. Although PDEs are usually formulated in the context of Lebesgue spaces, their numerical solution is carried out using discretizations as, e.g., finite elements. Hence, in the present work we consider the following finite dimensional constrained minimization problem: u ∈ B : J(u) = min!

(M)

where B = {v ∈ Rn | φ ≤ v ≤ φ} and φ < φ ∈ Rn and the possibly nonconvex, but differentiable, energy J : Rn → R. Here, the occurring inequalities have to be understood pointwise. In the context of discretized PDEs, n corresponds to the dimension of the finite element space and may therefore denote a large degree of freedom, as one, for example, obtains after adaptively refining an initially given mesh. As matter of fact, the design of robust and efficient solution methods for problems like (M) is a demanding task. Indeed, globalization strategies, such as trust–region methods (cf., Coleman and Li [1994], Ulbrich et al. [1999]) or line–search algorithms, succeed in computing local minimizers, but are based on the paradigm of Newton’s method. This is, that a sequence of iterates is computed by means of solving linear, but potentially large, systems of equations. The drawback is, that due to the utilization of a globalization strategy

2

Christian Groß and Rolf Krause

the computed corrections generally need to be damped. Hence, for two reasons the convergence of such an approach may slow down. Often the linear systems of equations are ill–conditioned, such that iterative linear solvers tend to converge slowly. Moreover, the step–size limitation may result in another slowdown, as far as the current iterate is not sufficiently close to the minimization problem’s solution. Even if the linear system can be solved efficiently, for instance by employing a Krylov-space method in combination with a good preconditioner, the solely solution of the nonlinear system for the given discretization can stay, due to the step–size limitation, inefficient. In the context of quadratic minimization problems, linear multigrid methods have turned out to be highly efficient since these algorithms are able to resolve also the low frequency contributions of the solution. Similarly, nonlinear multigrids (cf., Nash [2000], Gratton et al. [2006], Gross and Krause [2008]) aim on a better resolution of the low–frequency contributions of non–linear problems. Hence, Gratton et al. [2007] introduced a class of non–linear multilevel algorithms called RMTR∞ (Recursive Multilevel Trust–Region method) to solve problems of the class (M). In the present work, we will introduce a V–cycle variant of the RMTR∞ algorithm. On each level of a given multilevel hierarchy, this algorithm employs a trust–region strategy to solve a constrained nonlinear minimization problem, which arises from particular level dependend representations of J, φ, φ. To prove first–order convergence, we will state less restrictive assumptions on the smoothness of J than used by Gratton et al. [2007]. Moreover, we illustrate the efficiency of our RMTR∞ – algorithm by means of an example from the field of non–linear elasticity in 3d.

1.2 The Multilevel Setting The key concept of the RMTR∞ algorithm, which we will present in Section 1.3, is to minimize on different levels arbitrarily non–convex functions Hk approximating the fine level energy J. The minimization is carried out employing a trust–region strategy which ensures convergence. Corrections computed on coarser levels will be summed up and interpolated which provide possible corrections on the fine level. In particular, on each level m1 pre–smoothing and m2 post–smooting trust–region steps are computed yielding trust–region corrections. In between, a recursion is called yielding a coarse level correction which is the interpolated difference between first and last iterate on the next coarser level. Therefore, we assume that a decomposition of the Rn into a sequence of nested subspaces is given, such as Rn = Rnj ) Rnj−1 ) · · · ) Rn0 . The spaces are connected to each other by full–ranked linear interpolation, restriction and projection operators, i.e., Ik : Rnk → Rnk+1 , Rk : Rnk → Rnk−1 and Pk : Rnk → Rnk−1 . Given these operators and a current fine level iterate, uk+1 ∈ Rnk+1 the nonlinear coarse level model (cf., Nash [2000], Gratton

1 Solution of Non–convex Constrained Minimization Problems

3

et al. [2006], Gross and Krause [2008]) is defined as Hk (uk ) = Jk (uk ) + hδgk , uk − Pk+1 uk+1 i

(1.1)

where we assume that a fixed sequence of nonlinear functions (Jk )k is given, representing J on the coarser levels. Here, the residual δgk ∈ Rnk is given by ( Rk+1 ∇Hk+1 (uk+1 ) − ∇Jk (Pk+1 uk+1 ) if j > k ≥ 0 δgk = 0 if k = j Coarse Level Obstacles In the context of constrained minimization, the fine level obstacles φ, φ have also to be represented on coarser levels. As in the linear case, an admissible coarse level correction, i.e., sk = uk − Pk+1 uk+1 , has to be an admissible correction in the context of the next finer level, i.e., φk+1 − uk+1 ≤ Ik sk ≤ φk+1 − uk+1 . Moreover, in the nonlinear context, Hk is based on an initial iterate Pk+1 uk+1 , and therefore a coarse level obstacle must be chosen such that Pk+1 uk+1 is admissible, i.e., φk ≤ Pk+1 uk+1 ≤ φk . In the remainder, we assume that the chosen coarse level obstacles satisfy φk ≤ Pk+1 uk+1 ≤ φk and φk+1 − uk+1 ≤ Ik sk ≤ φk+1 − uk+1 .

1.3 Recursive Trust–Region Methods The reliable minimization of nonconvex functions Hk crucially depends on a control of the iterates’ quality. Line–search algorithms, as proposed by Nash [2000], scale the length of Newton corrections in order to force convergence to first–order critical points. Similarly, in trust–region algorithms corrections are the solutions of constrained quadratic minimization problems. For a given iterate uk,i ∈ Rnk , where i denotes the current iteration on level k, a correction sk,i is computed as an approximate solution of sk,i ∈ Rnk : ψk,i (sk,i ) = min! w.r.t. ksk,i k∞ ≤ ∆k,i and φk ≤ uk,i + sk,i ≤ φk

(1.2)

Here, ψk,i (s) = h∇Hk (uk,i ), si + 12 hBk,i s, si denotes the trust–region model with Bk,i , a symmetric matrix, possibly approximating the Hessian ∇2 Hk (uk,i ) (if it exists) and ∆k,i is the trust–region radius. On the coarse level, the reduction of Hk−1 starting at the initial iterate uk−1,0 = Pk uk,m1 yields a final coarse level iterate uk−1 . Therefore, the recursively computed correction is sk,m1 = Ik−1 (uk−1 − Pk uk,m1 ). To ensure convergence, corrections are only added to the current iterate, if the so–called contraction rate ρk,i is sufficiently large. The contraction rate

4

Christian Groß and Rolf Krause Trust–Region Algorithm, Input: uk,0 , ∆k,0 , φk , φk , Hk do several times { compute sk,i as an approximate solution of (1.2) if (ρk,i (sk,i ) ≥ η1 ) uk,i+1 = uk,i + sk,i otherwise uk,i+1 = uk,i compute a new ∆k,i+1 }

Algorithm 1: Trust–Region Algorithm compares Hk (uk,i ) − Hk (uk,i + sk,i ) to the reduction which the underlying model predicts. The value of ψk,i (sk,i ) prognoses the reduction induced by corrections computed by means of (1.2). The underlying model for recursively computed corrections sk,m1 is the coarse level energy Hk−1 . Thus, we define ( H (u )−H (u +s ) k k,i k k,i k,i if sk,i computed by (1.2) −ψk,i (sk,i ) ρk,i (sk,i ) = Hk (uk,i )−Hk (uk,i +Ik−1 sk−1 ) otherwise Hk−1 (Pk uk,i )−Hk−1 (Pk uk,i +sk−1 ) Now, a correction will be summed up with the current iterate if ρk,i (sk,i ) ≥ η1 where η1 > 0. In this case, the next trust–region radius ∆k,i+1 will be chosen larger than the current one, i.e., γ3 ∆k,i ≥ ∆k,i+1 ≥ γ2 ∆k,i > ∆k,i with γ3 ≥ γ2 > 1. Otherwise, if ρk,i (sk,i ) < η1 , the correction will be discarded and ∆k,i+1 chosen smaller than ∆k,i , which is ∆k,i > ∆k,i+1 ≥ γ1 ∆k,i , with 0 < γ1 < 1. These four steps, computation of sk,i by means of (1.2), computation of ρk,i , application of sk,i and an update of the trust–region radius are summarized in Algorithm 1. Now, Algorithm 2 on Page 5, introduces the RMTR.C algorithm, which is a V-cycle algorithm with embedded trust–region solver. 1.3.1 Convergence to First-Order Critical Points To show convergence to first–order critical points, we state the following assumptions on Hk , cf. Coleman and Li [1994]. (A1) For a given initial fine level iterate uj,0 ∈ Rnj we assume that the level set Lj = {u ∈ Rnj | φj ≤ u ≤ φj and Jj (u) ≤ Jj (uj,0 )} is compact. Moreover, for all initial coarse level iterates uk,0 ∈ Rnk , we assume that the level sets Lk = {u ∈ Rnk | φk ≤ u ≤ φk and Hk (u) ≤ Hk (uk,0 )} are compact.

1 Solution of Non–convex Constrained Minimization Problems

5

RMTR∞ Algorithm, Input: uk,0 , ∆k,0 , φk , φk , Hk Pre–smoothing call Algorithm 1 with uk,0 , ∆k,0 , φk , φk , Hk receive uk,m1 , ∆k,m1 Recursion compute φk−1 , φk−1 , Hk−1 call RMTR∞ with uk,m1 , ∆k,m1 , φk−1 , φk−1 , Hk−1 , receive sk−1 and compute sk,m1 = Ik−1 sk−1 if (ρk,m1 (sk,m1 ) ≥ η1 ) uk,m1 +1 = uk,m1 + sk,m1 otherwise uk,m1 +1 = uk,m1 compute a new ∆k,m1 +1 Post–smoothing call Algorithm 1 with uk,m1 +1 , ∆k,m1 +1 , φk , φk , Hk receive uk,m2 , ∆k,m2 if (k == j) goto Pre–smoothing otherwise return uk,m2 − uk,0

Algorithm 2: RMTR∞ (A2) For all levels k ∈ {0, . . . , j} we assume that Hk is continuously differentiable on Lk . Moreover, we assume that there exists a constant cg > 0 such that for all iterates uk,i ∈ Lk holds k∇Hk (uk,i )k2 ≤ cg . (A3) For all levels k ∈ {0, . . . , j} there exists a constant cB > 0 such that for all iterates uk,i ∈ Lk and all employed symmetric matrices Bk,i the inequality kBk,i k2 ≤ cB is satisfied. Moreover, computations on level k − 1 are carried out only if kRk Dk,m1 gk,m1 k2 ≥ κg kDk,m1 gk,m1 k2 εφ ≥ kDk,i k∞ ≥ κφ > 0

(AC)

where εφ , κφ , κg > 0, gk,i = ∇Hk (uk,i ) and m1 indexes the recursion. Dk,i is a diagonal matrix given by ( (φk − uk,i )l if (gk,i )l < 0 (Dk,i )ll = (uk,i − φk )l if (gk,i )l ≥ 0 In the remainder, we abbreviate gˆk,i = Dk,i gk,i

6

Christian Groß and Rolf Krause

Finally, we follow Coleman and Li [1994] and assume that corrections computed in Algorithm 1 satisfy ψk,i (sk,i ) < β1 ψk,i (sC k,i )

(CC)

nk where β1 > 0 and sC solves k,i ∈ R

ψk,i (sC k,i ) =

min

2 g t≥0: s=−tDk,i k,i

{ ψk,i (s) : ksk∞ ≤ ∆k,i and φk ≤ uk,i + s ≤ φk }

(1.3)

Now, we can cite Lemma 3.1 from Coleman and Li [1994]. Lemma 1. Let (A1)–(A3) and (AC) hold. Then if sk,i in Algorithm 1 satisfies (CC), we obtain gk,i k22 } −ψk,i (sk,i ) ≥ ckˆ gk,i k22 min{∆k,i , kˆ

(1.4)

To obtain the next results, the number of applied V-cycles will be indexed by ν, so that uνk,i , denotes the i-th iterate on Level k in Cycle ν. Theorem 1. Assume that (A1)–(A3) and (AC) hold. Moreover, assume that in Algorithm 2 at least m1 > 0 or m2 > 0 holds. We also assume that for each sk,i computed in Algorithm 1 equation (CC) holds. Then for each sequence of ν iterates (uνj,i )ν,i we obtain lim inf ν→∞ kˆ gj,i k2 = 0. Proof. We will proof the result by contradiction, i.e., we assume that ∃ε > 0 ν and a sequence (uνj,i )ν,i such that lim inf ν→∞ kˆ gj,i k2 ≥ ε. ν In this case, one can show that ∆j,i → 0. Namely, if ρνj,i ≥ η1 only finitely often holds, we obtain that ∆νj,i is increased finitely often but is infinitely often decreased. On the other hand, for infinitely many iterations with ρνj,i ≥ η1 we obtain ν Hkν (uνk,i ) − Hkν (uνk,i + sνk,i ) ≥ −η1 ψk,i (sνk,i ) ν Now, we exploit Lemma 1, (A1), and kˆ gj,i k2 ≥ ε and obtain for sufficiently large ν that

Hkν (uνk,i ) − Hkν (uνk,i + sνk,i ) ≥ c∆νk,i ≥ cksνk,i k2 → 0 Next, we employ the mean value theorem, i.e., hsνk,i , g νk,i i = Hk (uνk,i + sνk,i ) − Hk (uνk,i ), as well as (A2) and (A3) and obtain for each trust–region correction |predνk,i (sνk,i )||ρνk,i − 1| = |Hkν (uνk,i + sνk,i ) − Hkν (uνk,i ) 1 ν ν ν +hsνk,i , gk,i i + hsνk,i , Bk,i sk,i i| 2 1 ν ν ≤ | hsνk,i , Bk,i sk,i i| + |hsνk,i , g νk,i − gkν (uνk,i )i| 2 1 ≤ cB (∆νk,i )2 + kgνk,i − gkν (uνk,i )k2 ∆νk,i 2

1 Solution of Non–convex Constrained Minimization Problems

7

Due to the convergence of ∆νk,i , and, hence, of (uνk,i )ν,i , and the continuity of ν gk,i we obtain ρνk,i → 1 for i 6= m1 . Hence, on each level, for sufficiently small ∆νj,i , trust–region corrections are successful and applied. One can also show, that for sufficiently small ∆νj,i recursively computed corrections will be computed and applied: Due to ∆νk,0 = ∆νk+1,m1 and ∆νk+1,(m1 +m2 ) ≥ γ1m1 +m2 ∆νk,0 we obtain that there exists a c > 0 such that ∆νk,i ≥ c∆νj,m1 . In turn, (AC) provides that there exists another constant such that ν ν k2 } gk,m k2 min{c∆νk,m1 , kˆ gk,m Hkν (uνk,m1 ) − Hkν (uνk,m1 + sνk,m1 ) ≥ ckˆ 1 2 1 2

(cf., the proof of Lemma 4.4, Gross and Krause [2008]). Now, one can show, cf. Theorem 4.6, Gross and Krause [2008], that the contraction rates for recursively computed corrections also tend to one, i.e., ρνj,m1 → 1. Due to ∆νj,i → 0, we obtain (ρνj,i )ν,i → 1. But this contradicts ∆νj,i → 0 ν and lim inf ν→∞ kˆ gj,i k2 → 0 must hold.  Using exactly the same argumentation as used in Theorem 6.6, Gross and Krause [2008], we obtain convergence to first–order critical points, i.e., ν limν→∞ kˆ gj,i k2 = 0. Theorem 2. Let assumptions (A1)–(A3), (AC) hold, Moreover, assume that al least one pre– or post–smoothing step in Algorithm 2 will be performed and that (CC) holds for each sνk,i computed in Algorithm 1. Then for each sequence ν of iterates (uνj,i )ν,i we obtain limν→∞ kˆ gj,i k2 = 0.

1.4 Numerical Example In this section, we present an example from the field of non–linear elasticity computed with the RMTR∞ algorithm which is implemented in ObsLib++, cf. Krause [2007]. R. W. Ogden introduced a material law for rubber-like materials (cf., Ogden [1984]). The associated stored energy function is highly non-linear due to a penalty term which prevents the inversion of element volumes: W (∇ϕ) = a · trE + b · (trE)2 + c · tr(E 2 ) + d · Γ (det(∇ϕ))

(1.5)

where ϕ = id + u. This function is a polyconvex stored energy function depending on the Green – St. Venant strain tensor E(u), i.e., E(u) = 1 T T + 2 (∇u + ∇u + ∇u ∇u), a penalty function Γ (x) = − ln(x) for x ∈ R , 1 ′ ′ ′′ ′ and a = −dΓ (1), b = 2 (λ − d(Γ (1) + Γ (1))), c = µ + dΓ (1), d > 0. Figure 1.1 shows the results of the computation of a contact problem with parameters λ = 34, µ = 136, and d = 100. In particular, a skewed pressure is applied on a cube’s top side, which yields that the cube is pressed towards an obstacle. Problem (M) with J(u) = W (∇ϕ) was solved using, on the one hand, using a fine level trust–region strategy and, on the other, using our RMTR∞ algorithm with m1 = m2 = 2. Equation (1.2) was (approximately) solved using 10 projected cg–steps.

8

Christian Groß and Rolf Krause

Fig. 1.1. Nonlinear Elastic Contact Problem 823, 875 degrees of freedom. Left image: Deformed mesh and von–Mises stress distribution. Right image: Comparison ν of (kˆ gj,0 k2 )ν computed by our RMTR∞ algorithm (black line) and by a trust–region strategy, performed only on the finest grid (grey line).

References Thomas F. Coleman and Yuying Li. On the convergence of interior-reflective Newton methods for nonlinear minimization subject to bounds. Math. Programming, 67(2, Ser. A):189–224, 1994. ISSN 0025-5610. S. Gratton, M. Mouffe, Ph. L. Toint, and M. Weber-Mendonca. A recursive trust-region method in infinity norm for bound-constrained nonlinear optimization. Technical Report 07/01, Dept of Mathematics, FUNDP, Namur, 2007. S. Gratton, A. Sartenaer, and P. L. Toint. Recursive trust-region methods for multiscale nonlinear optimization. Technical Report 06/01, Cerfacs, 2006. C. Gross and R. Krause. On the convergence of recursive trust–region methods for multiscale non-linear optimization and applications to non-linear mechanics. Technical Report 710, Institute for Numerical Simulation, University of Bonn, March 2008. Submitted. R. Krause. A parallel decomposition approach to non-smooth minimization problems - concepts and implementation. Technical Report 709, Institute for Numerical Simulation, University of Bonn, Germany, 2007. Stephen G. Nash. A multigrid approach to discretized optimization problems. Optim. Methods Softw., 14(1-2):99–116, 2000. ISSN 1055-6788. International Conference on Nonlinear Programming and Variational Inequalities (Hong Kong, 1998). R. W. Ogden. Nonlinear elastic deformations. Ellis Horwood Series: Mathematics and its Applications. Ellis Horwood Ltd., Chichester, 1984. ISBN 0-85312-273-3. Michael Ulbrich, Stefan Ulbrich, and Matthias Heinkenschloss. Global convergence of trust-region interior-point algorithms for infinite-dimensional nonconvex minimization subject to pointwise bounds. SIAM J. Control Optim., 37(3):731–764 (electronic), 1999. ISSN 0363-0129.

Bestellungen nimmt entgegen: Institut für Angewandte Mathematik der Universität Bonn Sonderforschungsbereich 611 Wegelerstr. 6 D - 53115 Bonn Telefon: Telefax: E-mail:

0228/73 4882 0228/73 7864 [email protected]

http://www.sfb611.iam.uni-bonn.de/

Verzeichnis der erschienenen Preprints ab No. 385

385. Albeverio, Sergio; Hryniv, Rostyslav; Mykytyuk, Yaroslav: Factorisation of Non-Negative Fredholm Operators and Inverse Spectral Problems for Bessel Operators 386. Otto, Felix: Optimal Bounds on the Kuramoto-Sivashinsky Equation 387. Gottschalk, Hanno; Thaler, Horst: A Comment on the Infra-Red Problem in the AdS/CFT Correspondence 388. Dickopf, Thomas; Krause, Rolf: Efficient Simulation of Multi-Body Contact Problems on Complex Geometries: A Flexible Decomposition Approach Using Constrained Minimization 389. Groß, Christian; Krause, Rolf: On the Convergence of Recursive Trust-Region Methods for Multiscale Non-Linear Optimization and Applications to Non-Linear Mechanics 390. Nemitz, Oliver; Nielsen, Michael Bang; Rumpf, Martin; Whitaker, Ross: Finite Element Methods on Very Large, Dynamic Tubular Grid Encoded Implicit Surfaces 391. Barbu, Viorel; Marinelli, Carlo: Strong Solutions for Stochastic Porous Media Equations with Jumps 392. Albeverio, Sergio; Cebulla, Christof: Synchronizability of Stochastic Network Ensembles in a Model of Interacting Dynamical Units; erscheint in: Physica A: Statistical Mechanics and its Applications 393. Albeverio, Sergio; Cebulla, Christof: Synchronizability of a Stochastic Version of FitzHughNagumo Type Neural Oscillator Networks 394. Bacher, Kathrin: On Borell-Brascamp-Lieb Inequalities on Metric Measure Spaces 395. Croce, Roberto; Griebel, Michael; Schweitzer, Marc Alexander: Numerical Simulation of Droplet-Deformation by a Level Set Approach with Surface Tension 396. Schweizer, Nikolaus: Spectral Gaps of Simulated Tempering for Bimodal Distributions 397. Koivisto, Juha; Rost, Martin; Alt, Wolfgang: A Phase Field Model for Phospholipid Surfactant 398. Reiter, Philipp; Felix, Dieter; von der Mosel, Heiko; Alt, Wolfgang: Energetics and Dynamics of Global Integrals Modeling Interaction between Stiff Filaments

399. Bogachev, Leonid; Daletskii, Alexei: Poisson Cluster Measures: Quasi-Invariance, Integration by Parts and Equilibrium Stochastic Dynamics 400. Lapid, Erez; Müller, Werner: Spectral Asymptotics for Arithmetic Quotients of SL(n,R)/SO(n) 401. Finis, Tobias; Lapid, Erez; Müller, Werner: On The Spectral Side of Arthur’s Trace Formula II 402. Griebel, Michael; Knapek, Stephan: Optimized General Sparse Grid Approximation Spaces for Operator Equations 403. Griebel, Michael; Jager, Lukas: The BGY3dM Model for the Approximation of Solvent Densities 404. Griebel, Michael; Klitz, Margit: Homogenization and Numerical Simulation of Flow in Geometries with Textile Microstructures 405. Dickopf, Thomas; Krause, Rolf: Weak Information Transfer between Non-Matching Warped Interfaces 406. Engel, Martin; Griebel, Michael: A Multigrid Method for Constrained Optimal Control Problems 407. Nepomnyaschikh, Sergey V.; Scherer, Karl: A Domain Decomposition Method for the Helmholtz-Problem 408. Harbrecht, Helmut: A Finite Element Method for Elliptic Problems with Stochastic Input Data 409. Groß, Christian; Krause, Rolf: A Recursive Trust–Region Method for Non-convex Constrained Minimization

Suggest Documents