Multigrid second-order accurate solution of

Comput Optim Appl (2012) 51:835–866 DOI 10.1007/s10589-010-9358-y

Multigrid second-order accurate solution of parabolic control-constrained problems

S. González Andrade · A. Borzì

Received: 19 April 2010 / Published online: 6 October 2010 © Springer Science+Business Media, LLC 2010

Abstract A mesh-independent and second-order accurate multigrid strategy to solve control-constrained parabolic optimal control problems is presented. The resulting algorithms appear to be robust with respect to change of values of the control parameters and have the ability to accommodate constraints on the control also in the limit case of bang-bang control. Central to the development of these multigrid schemes is the design of iterative smoothers which can be formulated as local semismooth Newton methods. The design of distributed controls is considered to drive nonlinear parabolic models to follow optimally a given trajectory or attain a final configuration. In both cases, results of numerical experiments and theoretical twogrid local Fourier analysis estimates demonstrate that the proposed schemes are able to solve parabolic optimality systems with textbook multigrid efficiency. Further results are presented to validate second-order accuracy and the possibility to track a trajectory over long time intervals by means of a receding-horizon approach.

Supported in part by the Austrian Science Fund SFB Project F3205-N18 “Fast Multigrid Methods for Inverse Problems”. S. González Andrade · A. Borzì () Institut für Mathematik und Wissenschaftliches Rechnen, Karl-Franzens-Universität Graz, Heinrichstr. 36, 8010 Graz, Austria e-mail: [email protected] S. González Andrade e-mail: [email protected] S. González Andrade Research Group on Optimization, Departamento de Matemática, Escuela Politécnica Nacional, Ladrón de Guevara E11-253, Quito, Ecuador A. Borzì Dipartimento e Facoltà di Ingegneria, Palazzo Dell’Aquila Bosco Lucarelli, Università degli Studi del Sannio, Corso Garibaldi 107, 82100 Benevento, Italia


Keywords Multigrid methods · Semismooth Newton method · Parabolic partial differential equations · Optimal control theory

1 Introduction

Real-time optimal control [6] of reaction-diffusion systems represents a challenge in many application fields, such as the control of arrhythmia [9, 34], and requires the development of algorithms that guarantee a control response to an external event within a given time. Therefore, it is mandatory to design control algorithms with optimal computational complexity that are accurate and robust. Recent developments [8, 9] show that a viable strategy towards these aims is represented by space-time collective-smoothing multigrid schemes. In fact, Fourier analysis estimates [13] and results of numerical experiments with linear [8] and nonlinear [9, 10] parabolic control problems demonstrate that space-time multigrid schemes provide optimal control solutions with mesh-independent convergence and robustness with respect to the value of the control parameters and of the nonlinearity of the governing model. Nevertheless, since parabolic control problems result in large-sized algebraic systems, further techniques are in order to reduce the computational time required to determine the optimal control function. A possible strategy is to construct reduced models via, e.g., proper orthogonal decomposition [30] or to consider adaptivity [26, 33]. In this paper we pursue the idea of reducing the size of the algebraic problems by considering uniform high-order discretization of the optimality system characterizing the optimal solution. In fact, the use of these discretization schemes makes it possible to attain the required accuracy with much coarser meshes, which considerably reduces the size of the algebraic problems to be solved, and it avoids the computational overhead of an adaptive approach, which keeps the computational cost at a minimum.
Previous contributions to the multigrid solution of parabolic control problems [8, 10, 13, 16, 20, 23, 24] have focused on first-order time discretization, while higher-order space discretization of steady optimal control problems has been considered in, e.g., [7]. For this reason, in this paper we focus on second-order time discretizations of parabolic control problems that are suitable for implementation in a space-time multigrid framework. This means that, in addition to guaranteeing second-order accuracy, the proposed discretization schemes should accommodate appropriate smoothing strategies. We find that the Crank-Nicolson scheme is not a convenient choice, while multistep backward differencing schemes are advantageous in the design of very efficient pointwise and linewise smoothers. The construction of effective smoothers for higher-order time discretization requires addressing many implementation problems. On the one hand, multistep discretization schemes need to be combined with one-step methods for initialization, and consequently the smoothing scheme applied to the optimality system ought to implement the coupling between one-step and multistep schemes corresponding to the optimality equations with opposite time orientation. On the other hand, the construction of linewise smoothers with multistep schemes requires the efficient solution of block-band systems. Furthermore, these smoothing schemes should accommodate the presence of inequality constraints on the control, which is an open issue in the case of the


multigrid approach to parabolic control problems. In this paper, we present and analyze smoothing schemes that address all these issues. These schemes are formulated based on criteria proposed in [8] and we show that they can be interpreted as local semismooth Newton methods [25, 36, 42]. The resulting smoothers appear to be robust with respect to changes in the values of the control parameters and are able to accommodate constraints on the control, also in the limit case of bang-bang controls. To the best of our knowledge, this is a unique feature of our methodology which could be instrumental to support theoretical investigation of the bang-bang control phenomenon in parabolic problems [11, 18, 44].

In the next section, we formulate control-constrained nonlinear parabolic optimal control problems and the characterization of optimal solutions as solutions of the corresponding optimality systems. In Sect. 3, we discuss the second-order backward differentiation formula and Crank-Nicolson schemes for the discretization of the optimality systems. In the control-unconstrained case, we obtain second-order accuracy estimates in space and time. In Sect. 4, we illustrate the space-time multigrid framework and focus on the construction of efficient pointwise and linewise smoothers. This is a delicate issue considering the opposite time orientation of the state and adjoint equations and the presence of constraints on the control. In the pointwise approach a particular time splitting is introduced to accommodate the opposite time orientation. On the other hand, the linewise smoother requires the development of block-diagonal solvers. For both approaches, we present novel insight that shows that the resulting smoothers can be interpreted as local semismooth Newton schemes. Section 4 is completed by considering the combination of the space-time multigrid scheme with the receding-horizon approach to track trajectories over long time intervals. In Sect. 5 and in the Appendix, two-grid local Fourier analysis tools are given to analyze the convergence properties of the multigrid schemes with pointwise and linewise smoothers. We obtain smoothing-factor and multigrid convergence-factor estimates with typical textbook multigrid efficiency and robustness with respect to the values of the control parameters. In Sect. 6, results of numerical experiments are presented that demonstrate the ability of the proposed multigrid framework to provide efficient second-order accurate solutions to parabolic optimal control problems and to allow the investigation of the bang-bang control phenomenon in control-constrained parabolic problems. A section of conclusions completes this work.

2 Parabolic optimal control problems

We consider time-dependent nonlinear parabolic processes controlled through source terms with the purpose of tracking a desired trajectory, given by $y_d \in L^2(Q)$, or with the objective of reaching a desired terminal state $y_T \in L^2(\Omega)$ at a given final time $T$. In order to obtain an optimal control function, we formulate a distributed parabolic optimal control problem as follows

\[
\left\{
\begin{aligned}
&\min_{u \in U_{ad}} J(y,u) := \frac{\alpha}{2}\|y - y_d\|^2_{L^2(Q)} + \frac{\beta}{2}\|y(\cdot,T) - y_T\|^2_{L^2(\Omega)} + \frac{\nu}{2}\|u\|^2_{L^2(Q)},\\
&-\partial_t y + G(y) + \sigma\Delta y = f + u \quad \text{in } Q = \Omega \times (0,T),\\
&y = y_0 \quad \text{on } \Omega \times \{t = 0\},\\
&y = 0 \quad \text{on } \Sigma = \partial\Omega \times (0,T).
\end{aligned}
\right.
\tag{2.1}
\]



We first discuss the parameters in the objective cost functional $J(y,u)$: $\nu \ge 0$ is the weight of the cost of the control, and $\alpha \ge 0$, $\beta \ge 0$ ($\alpha + \beta > 0$) are control parameters which allow us to achieve the proposed objectives. Indeed, the case $\alpha = 1$, $\beta = 0$ corresponds to tracking without terminal observation. With $\alpha = 0$, $\beta = 1$, the objective is to reach a given final target configuration without any specification of the trajectory that should be followed. Next, we focus on the reaction-diffusion equation. We assume that $\sigma > 0$, $f \in L^2(Q)$, and we select an initial condition $y_0 \in H_0^1(\Omega)$. The nonlinearity $G(y)$ models the reaction kinetics for the state $y$, and $u$ stands for the control function. Regarding the control function, we assume that $u \in U_{ad}$, where $U_{ad} \subset L^2(Q)$ represents the set of admissible controls. If $U_{ad}$ coincides with the whole space $L^2(Q)$, then (2.1) represents a control-unconstrained optimal control problem. Otherwise, the problem (2.1) represents a control-constrained optimal control problem. In this paper, we are interested in bilateral pointwise constraints, so we consider the following set

\[
U_{ad} := \{u \in L^2(Q) : \underline{u}(x,t) \le u(x,t) \le \overline{u}(x,t), \ \text{a.e. in } Q\},
\tag{2.2}
\]

where $\underline{u}$ and $\overline{u}$ are elements of $L^\infty(Q)$. Existence of solutions to the optimal control problem above can be established under suitable conditions for the nonlinearity $G$; see, e.g., [19, 31, 32, 35]. Moreover, the solution to (2.1)–(2.2) is characterized by the following first-order optimality system

\[
\begin{aligned}
-\partial_t y + G(y) + \sigma\Delta y &= f + u && \text{in } Q,\\
\partial_t p + G'(y)p + \sigma\Delta p + \alpha(y - y_d) &= 0 && \text{in } Q,\\
y = 0, \quad p &= 0 && \text{on } \Sigma,\\
(\nu u - p, v - u) &\ge 0 && \text{for all } v \in U_{ad}.
\end{aligned}
\tag{2.3}
\]

System (2.3) is completed with the initial condition $y(x,0) = y_0(x)$ for the state variable (evolving forward in time) and a terminal condition for the adjoint variable (evolving backward in time), given by

\[
p(x,T) = \beta\,(y(x,T) - y_T(x)).
\tag{2.4}
\]

The optimality system corresponding to $\nu = 0$ is discussed in Sect. 6.3. We assume, for a given $u \in L^2(Q)$ and given initial and boundary conditions, that the solution of the reaction-diffusion model is uniquely determined. We denote this dependence by $y = y(u)$ and assume that the mapping $u \mapsto y(u)$ is affine and differentiable. Therefore, we can introduce the so-called reduced cost functional $\hat{J}$ given by

\[
\hat{J}(u) = J(y(u), u).
\tag{2.5}
\]

The gradient of $\hat{J}$ with respect to $u$ is given by $\nabla \hat{J}(u) = \nu u - p(u)$, where $p(u)$ is the solution of the adjoint equation for the given $y(u)$. Our purpose is to develop and analyze a robust and efficient multigrid scheme to solve (2.3) with second-order accuracy in space and time.



3 High-order time discretization

We are concerned with the use of high-order time-discretization schemes for the optimality system (2.3). We consider the second-order backward differentiation formula (BDF2) together with the Crank-Nicolson (CN) method in order to obtain a second-order time discretization scheme. While we focus on second-order time discretization, the techniques presented in this paper can be generalized to higher-order BDF schemes. We remark that CN schemes are strictly non-dissipative but easily oscillatory, in contrast to BDF schemes, which introduce numerical dissipation and thus are more appropriate in a multigrid framework. Therefore, we use the CN scheme only as a second-order one-step method for the purpose of initialization. For a detailed discussion of the BDF and CN schemes, see [2, 17]; for higher-order space discretization of optimality systems, we refer to [7].

To illustrate our approach, we use the framework in [21, 22, 40] and assume that the space domain $\Omega$ is a square and that $\Omega_h$ is a uniform space mesh, where $h$ is the mesh size and $\Omega_h$ defines the set of interior mesh points $(x_i, y_j) = ((i-1)h, (j-1)h)$, $2 \le i, j \le N_x$. On this mesh, $-\Delta_h$ denotes the negative Laplacian approximated by the common five-point stencil including homogeneous Dirichlet boundary conditions. For grid functions $v_h$ and $w_h$ defined on $\Omega_h$, we have the discrete $L^2(\Omega)$-scalar product

\[
(v_h, w_h)_{L^2_h(\Omega_h)} = h^2 \sum_{x \in \Omega_h} v_h(x)\, w_h(x),
\]

with associated norm $|v_h| = (v_h, v_h)^{1/2}_{L^2_h(\Omega_h)}$. Further, let $\delta t = T/N_t$ be the time-step size and define the following space-time mesh

\[
Q_{h,\delta t} = \{(x, t_m) : x \in \Omega_h,\ t_m = (m-1)\delta t,\ 1 \le m \le N_t + 1\}.
\]

For grid functions defined on $Q_{h,\delta t}$, we use the discrete $L^2(Q)$ scalar product with norm $\|v_{h,\delta t}\| = (v_{h,\delta t}, v_{h,\delta t})^{1/2}_{L^2_{h,\delta t}(Q_{h,\delta t})}$. Later, we use $\gamma = \delta t / h^2$.

On the $Q_{h,\delta t}$ grid, $y_h^m$ and $p_h^m$ denote grid functions at time level $m$. The action of the one-step backward and forward time-discretization operators on these functions is defined as follows

\[
\partial^+ y_h^m := \frac{y_h^m - y_h^{m-1}}{\delta t}
\quad \text{and} \quad
\partial^- p_h^m := -\frac{p_h^m - p_h^{m+1}}{\delta t}.
\]

The action of the BDF2 time-difference operators is as follows

\[
\partial^+_{BD}\, y_h^m := \frac{3y_h^m - 4y_h^{m-1} + y_h^{m-2}}{2\,\delta t}
\quad \text{and} \quad
\partial^-_{BD}\, p_h^m := -\frac{3p_h^m - 4p_h^{m+1} + p_h^{m+2}}{2\,\delta t}.
\]

The coefficients in the last two expressions above are given by the classical BDF2 formula (see, e.g., [2]), while the minus sign in the second operator allows us to discretize the adjoint variable taking into account its backward evolution in time.
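As an illustration of the two BDF2 difference operators defined above, the following minimal Python sketch applies them to samples of a smooth test function and measures the error against the exact time derivative. The function names `bdf2_forward`, `bdf2_backward` and `max_errors`, as well as the test function sin(t), are hypothetical choices made for this example only; they are not part of the paper.

```python
import numpy as np

def bdf2_forward(y, dt):
    """Backward-looking BDF2 difference (3 y^m - 4 y^{m-1} + y^{m-2}) / (2 dt).

    Returns values for the time levels m = 2, ..., len(y) - 1 (0-based).
    """
    return (3.0 * y[2:] - 4.0 * y[1:-1] + y[:-2]) / (2.0 * dt)

def bdf2_backward(p, dt):
    """Forward-looking BDF2 difference -(3 p^m - 4 p^{m+1} + p^{m+2}) / (2 dt).

    Returns values for the time levels m = 0, ..., len(p) - 3 (0-based).
    """
    return -(3.0 * p[:-2] - 4.0 * p[1:-1] + p[2:]) / (2.0 * dt)

def max_errors(n_steps):
    """Maximum error of both operators for the test function sin(t) on [0, 1]."""
    t = np.linspace(0.0, 1.0, n_steps + 1)
    dt = t[1] - t[0]
    samples = np.sin(t)
    err_fwd = np.max(np.abs(bdf2_forward(samples, dt) - np.cos(t[2:])))
    err_bwd = np.max(np.abs(bdf2_backward(samples, dt) - np.cos(t[:-2])))
    return err_fwd, err_bwd

if __name__ == "__main__":
    coarse = max_errors(50)
    fine = max_errors(100)
    # Halving the time step should reduce both errors by roughly a factor of 4.
    print(coarse[0] / fine[0], coarse[1] / fine[1])
```

Both operators approximate the time derivative at level m; the sign in front of the second one compensates for the reversed stencil, which is what makes it suitable for the adjoint equation marching backward in time.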



With this setting, the following discrete optimality system is obtained

\[
\begin{aligned}
-\partial^+_{BD}\, y_h^m + G(y_h^m) + \sigma\Delta_h y_h^m &= f_h^m + u_h^m,\\
\partial^-_{BD}\, p_h^m + G'(y_h^m)\,p_h^m + \sigma\Delta_h p_h^m + \alpha(y_h^m - y_{dh}^m) &= 0,\\
(\nu u_h^m - p_h^m,\ v_h^m - u_h^m) &\ge 0 \quad \text{for all } v_h \in U^h_{ad},
\end{aligned}
\tag{3.1}
\]

where $U^h_{ad}$ is a grid function approximation of the admissible set $U_{ad}$. Further, we assume sufficient regularity of the data, $y_d$, $y_T$, and $f$, such that these functions are properly approximated by their values at grid points. As stated before, at $t = \delta t$, represented by $m = 1$, and at $t = T - \delta t$, given by $m = N_t$, we combine the multistep BDF2 method with the CN method. Indeed, for $m = 1$ the adjoint equation is discretized as in (3.1) and the state equation as follows

\[
-\partial^+ y_h^m = \frac{1}{2}\Bigl[\bigl(-\sigma\Delta_h y_h^m - G(y_h^m) + u_h^m + f_h^m\bigr) + \bigl(-\sigma\Delta_h y_h^{m-1} - G(y_h^{m-1}) + u_h^{m-1} + f_h^{m-1}\bigr)\Bigr].
\tag{3.2}
\]

Similarly, for $m = N_t$, the state equation is approximated as in (3.1) and the adjoint equation as

\[
\partial^- p_h^m = \frac{1}{2}\Bigl[\bigl(-\sigma\Delta_h p_h^{m+1} - G'(y_h^{m+1})\,p_h^{m+1} - \alpha(y_h^{m+1} - y_{dh}^{m+1})\bigr) + \bigl(-\sigma\Delta_h p_h^m - G'(y_h^m)\,p_h^m - \alpha(y_h^m - y_{dh}^m)\bigr)\Bigr].
\tag{3.3}
\]
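The combination of a one-step CN start with the multistep BDF2 formula can be illustrated on a scalar model problem. The sketch below is an assumption-laden toy example, not the discretization of the optimality system: it integrates the linear test equation y' = λy with one Crank-Nicolson step followed by BDF2 steps and estimates the observed convergence order. The names `cn_bdf2` and `observed_order` and the choice of test equation are hypothetical.

```python
import math

def cn_bdf2(lam, y0, t_final, n_steps):
    """Integrate y' = lam * y on [0, t_final] with CN start-up and BDF2 steps."""
    dt = t_final / n_steps
    y_prev = y0
    # One-step Crank-Nicolson start-up, needed because BDF2 uses two old levels.
    y_curr = y_prev * (1.0 + 0.5 * dt * lam) / (1.0 - 0.5 * dt * lam)
    for _ in range(n_steps - 1):
        # BDF2: (3 y^{m} - 4 y^{m-1} + y^{m-2}) / (2 dt) = lam * y^{m}
        y_next = (4.0 * y_curr - y_prev) / (3.0 - 2.0 * dt * lam)
        y_prev, y_curr = y_curr, y_next
    return y_curr

def observed_order(lam=-1.0, y0=1.0, t_final=1.0, n_steps=40):
    """Estimate the convergence order from the errors at dt and dt / 2."""
    exact = y0 * math.exp(lam * t_final)
    err_coarse = abs(cn_bdf2(lam, y0, t_final, n_steps) - exact)
    err_fine = abs(cn_bdf2(lam, y0, t_final, 2 * n_steps) - exact)
    return math.log2(err_coarse / err_fine)

if __name__ == "__main__":
    print(observed_order())  # expected to be close to 2
```

The point of the sketch is that a single second-order start-up step does not degrade the overall second-order accuracy of the multistep scheme, which is the role the CN method plays in (3.2) and (3.3).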

Now, we use the theory of Malanowski [32] and the BDF2 estimates in [17] to prove that our approach guarantees a second-order accurate approximation in the case where the constraints on the control are not active. Based on Lemma 1 and Theorem 1.2 of [32] and Sect. 2 of [8], we can state that

\[
\|u^*_{h,\delta t} - R_Q u^*\| \le c\,\bigl(h^2 + (\delta t)^2\bigr),
\tag{3.4}
\]

where $R_Q : H^2(Q) \to L^2_{h,\delta t}(Q)$ is a restriction operator, $u^*$ is the optimal control of (2.3), and $u^*_{h,\delta t}$ solves (3.1). In this framework, the estimate (3.4) is obtained assuming that the constraints on the control are not active; see, e.g., [27, 37] for a case of active constraints.

Next, we follow the approach in [17]. Therefore, we intend to analyze the state and adjoint equations as evolution equations governed by monotone operators that might be perturbed by time-dependent continuous operators. We introduce the operators $A_y : H_0^1(\Omega) \to H^{-1}(\Omega)$ and $A_p : H_0^1(\Omega) \to H^{-1}(\Omega)$ by

\[
A_y v(t) := \sigma\Delta v(t) + G(v(t)), \qquad
A_p w(t) := \sigma\Delta w(T - t) + G'(y(T - t))\,w(T - t),
\]

for all $t \in [0,T]$ and all $v, w \in L^2(0,T; H_0^1(\Omega))$. Here, $y$ stands for the state variable. We consider that $\Delta$, $G$ and $G'(y)$ are Nemytskii operators [38] acting on $L^2(0,T; H_0^1(\Omega))$ via $(\Delta v)(t) = \Delta v(t)$, $(Gv)(t) = G(v(t))$, $(\Delta w)(t) = \Delta w(T - t)$ and $(G'(y)w)(t) = G'(y(T - t))\,w(T - t)$. With these operators, we rewrite the state and the adjoint equations as the following initial-value problems

\[
\begin{aligned}
-\partial_t y + A_y y &= f + u && \text{in } Q,\\
y(x,0) &= y_0(x),
\end{aligned}
\tag{3.5a}
\]

\[
\begin{aligned}
\partial_t p + A_p p &= -\alpha(y - y_d) && \text{in } Q,\\
p(x,T) &= \beta\,(y(x,T) - y_T(x)).
\end{aligned}
\tag{3.5b}
\]

We assume that the functions $G$ and $G'$ are strongly continuous, according to the definition given in [17, p. 41]. Further, we assume that there exist two non-negative constants $\lambda_1$ and $\lambda_2$ such that

\[
(G(v), v)_{L^2} \ge -\lambda_1 \quad \text{and} \quad (G'(w), w)_{L^2} \ge -\lambda_2.
\tag{3.6}
\]

Since the Dirichlet Laplace operator is known to be maximal monotone in $H^2(\Omega) \cap H_0^1(\Omega)$ (see [15, Sect. 3]), the strong continuity assumptions on $G$ and $G'$, and (3.6) guarantee that the operators involved in (3.5) satisfy the hypotheses of [17, Theorem 5.1]. Next, denote $F_1 := f + u^*$, where $u^*$ is the optimal control, here considered a known function. If $F_1 \in L^2(0,T; L^2(\Omega))$ and if $\delta t \le \tau$, with $\tau < T$ sufficiently small, then [17, Theorem 5.1] implies that the following estimate for the problem (3.5a) holds

\[
\max_{2 \le m \le N_t+1} |y(t_m) - y_h^m|^2 \le c\,\Bigl( |y(t_0) - y_0|^2 + |y(\delta t) - y_h^1|^2 + (\delta t)^4\,\|F_1' - y''\| + \delta t\,\|F_1 - R_Q F_1\|^2 \Bigr).
\tag{3.7}
\]

Similarly, we denote $F_2 := -\alpha(y^* - y_d)$, where $y^*$ is the optimal state variable, here considered a known function. If $F_2 \in L^2(0,T; L^2(\Omega))$ and if $\delta t \le \tau$, with $\tau < T$ sufficiently small, then [17, Theorem 5.1] implies that the following estimate for the problem (3.5b) holds

\[
\max_{N_t+1 \le m \le 2} |p(t_m) - p_h^m|^2 \le c\,\Bigl( |p(T) - p_h^{N_t+1}|^2 + |p(t_{N_t}) - p_h^{N_t}|^2 + (\delta t)^4\,\|F_2' - p''\| + \delta t\,\|F_2 - R_Q F_2\|^2 \Bigr).
\tag{3.8}
\]

Here, $'$ stands for the time derivative in the weak sense. The two estimates (3.7) and (3.8) allow us to state that if the initial and terminal conditions and the first-step approximations of the state and adjoint variables are second-order accurate, and if $F_1 - R_Q F_1$ and $F_2 - R_Q F_2$ are at least second-order accurate, then the proposed approach guarantees an optimal order $O(\delta t^2)$ in the discrete $l^2(0,T; L^2)$-norm. Now, since the Crank-Nicolson scheme is second-order accurate, the second-order accuracy of the approximations of $y(x,\delta t)$ and $p(x,T - \delta t)$ is given. Moreover, the estimate (3.4) guarantees that $F_1 - R_Q F_1$ is second-order accurate, which implies that $F_2 - R_Q F_2$ is also second-order accurate, and our claim is proved.

4 The space-time multigrid framework

In this section, we discuss the extension of the space-time collective-smoothing multigrid strategy for parabolic optimal control problems [8, 9, 12] to the case of higher-order discretization and constraints on the control. Collective-smoothing multigrid schemes for control-constrained elliptic control problems are discussed in [7, 11, 12]. For completeness, we recall the space-time multigrid scheme that belongs to the class of nonlinear full approximation storage (FAS) methods [14].

Consider $L$ grid levels indexed by $k = 1, \ldots, L$, where $k = L$ refers to the finest grid. The mesh of level $k$ is denoted by $Q_k = Q_{h_k,\delta t_k}$, where $h_k = h_1/2^{k-1}$ and $\delta t_k = \delta t$, which corresponds to semicoarsening in space. Any operator and variable defined on the discrete space-time cylinder $Q_k$ is indexed by $k$. The optimality system at level $k$ with given initial, terminal, and boundary conditions is represented by the following nonlinear equation

\[
A_k(w_k) = f_k, \qquad w_k = (y_k, u_k, p_k).
\tag{4.1}
\]

As is well known [14, 41], the multigrid strategy combines two complementary schemes. The high-frequency components of the solution error are reduced by a smoothing iteration, denoted by $S_k$ and defined in the following subsection, while the low-frequency error components are effectively reduced by a coarse-grid correction method as defined below.

The action of one multigrid cycle applied to (4.1) can be expressed in terms of a (nonlinear) multigrid iteration operator $B_k$. Starting with an initial approximation $w_k^{(0)}$, the result of one multigrid cycle is then denoted by $w_k = B_k(w_k^{(0)})\,f_k$.

Algorithm 1 (Space-Time Multigrid (STMG) $(\nu_1, \nu_2)$-Cycle) Set $B_1(w_1^{(0)}) \approx A_1^{-1}$ (e.g., iterating with $S_1$ starting with $w_1^{(0)}$). For $k = 2, \ldots, L$ define $B_k$ in terms of $B_{k-1}$ as follows.

1. Set the starting approximation $w_k^{(0)}$.
2. Pre-smoothing. Define $w_k^{(l)}$ for $l = 1, \ldots, \nu_1$, by
   \[ w_k^{(l)} = S_k(w_k^{(l-1)}, f_k). \]
3. Coarse-grid correction. Set
   \[ w_k^{(\nu_1+1)} = w_k^{(\nu_1)} + I_{k-1}^k\bigl(q^\mu - \hat{I}_k^{k-1} w_k^{(\nu_1)}\bigr), \]
   where $q^i$ for $i = 1, \ldots, \mu$ is defined by ($q^0 = 0$)
   \[ q^i = q^{i-1} + B_{k-1}\bigl(\hat{I}_k^{k-1} w_k^{(\nu_1)}\bigr)\Bigl[ I_k^{k-1}\bigl(f_k - A_k(w_k^{(\nu_1)})\bigr) + A_{k-1}\bigl(\hat{I}_k^{k-1} w_k^{(\nu_1)}\bigr) - A_{k-1}(q^{i-1}) \Bigr]. \]
4. Post-smoothing. Define $w_k^{(l)}$ for $l = \nu_1 + 2, \ldots, \nu_1 + \nu_2 + 1$, by
   \[ w_k^{(l)} = S_k(w_k^{(l-1)}, f_k). \]
5. Set $B_k(w_k^{(0)})\,f_k = w_k^{(\nu_1+\nu_2+1)}$.
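To make the structure of the cycle in Algorithm 1 concrete (pre-smoothing, coarse-grid correction with cycle index μ, post-smoothing), here is a deliberately simplified Python sketch. It is not the space-time FAS scheme of this paper: it is a linear correction-scheme multigrid cycle for the one-dimensional Poisson problem with weighted Jacobi smoothing, full weighting and linear interpolation, chosen only because it fits in a few lines. All function names and the model problem are assumptions made for illustration.

```python
import numpy as np

def apply_A(u, h):
    """Apply the 1D negative Laplacian with homogeneous Dirichlet boundaries."""
    Au = 2.0 * u
    Au[1:] -= u[:-1]
    Au[:-1] -= u[1:]
    return Au / h**2

def smooth(u, f, h, sweeps, omega=2.0 / 3.0):
    """Weighted Jacobi sweeps; the diagonal of the operator is 2 / h^2."""
    for _ in range(sweeps):
        u = u + omega * (h**2 / 2.0) * (f - apply_A(u, h))
    return u

def restrict(r):
    """Full weighting from 2 * nc + 1 fine interior points to nc coarse points."""
    return 0.25 * r[0:-2:2] + 0.5 * r[1:-1:2] + 0.25 * r[2::2]

def prolong(e):
    """Linear interpolation from nc coarse points to 2 * nc + 1 fine points."""
    fine = np.zeros(2 * e.size + 1)
    fine[1::2] = e
    padded = np.concatenate(([0.0], e, [0.0]))
    fine[0::2] = 0.5 * (padded[:-1] + padded[1:])
    return fine

def mg_cycle(u, f, h, nu1=2, nu2=2, mu=1):
    """One (nu1, nu2)-cycle; mu = 1 gives a V-cycle and mu = 2 a W-cycle."""
    if u.size == 1:
        return f * h**2 / 2.0                 # exact solve on the coarsest grid
    u = smooth(u, f, h, nu1)                  # pre-smoothing
    r_coarse = restrict(f - apply_A(u, h))    # restricted residual
    e_coarse = np.zeros_like(r_coarse)
    for _ in range(mu):                       # mu coarse-grid iterations
        e_coarse = mg_cycle(e_coarse, r_coarse, 2.0 * h, nu1, nu2, mu)
    u = u + prolong(e_coarse)                 # coarse-grid correction
    return smooth(u, f, h, nu2)               # post-smoothing

if __name__ == "__main__":
    n = 63
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    f = np.pi**2 * np.sin(np.pi * x)
    u = np.zeros(n)
    for cycle in range(5):
        u = mg_cycle(u, f, h)
        print(cycle, np.linalg.norm(f - apply_A(u, h)))
```

In the FAS setting of Algorithm 1 the coarse problem is posed for the full coarse approximation rather than for the error, and the smoother acts collectively on state, adjoint and control; the recursion pattern, however, is the same as in this linear sketch.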

Notice that we can perform $\mu$ two-grid iterations at each working level. For $\mu = 1$ we have a $V(\nu_1, \nu_2)$-cycle and for $\mu = 2$ we have a $W(\nu_1, \nu_2)$-cycle; $\mu$ is called the cycle index [41].

In our implementation, we choose $I_k^{k-1}$ to be the full-weighted restriction operator in space with no averaging in the time direction [41]. The prolongation $I_{k-1}^k$ is defined by bilinear interpolation in space. We choose $\hat{I}_k^{k-1}$ to be straight injection. The intergrid transfer operators do not involve time since we are using semicoarsening. Other choices of intergrid operators and coarsening strategies are possible; see [28].

One important concern for parabolic control problems is the tracking of a desired trajectory over long time intervals. For this purpose, we combine our space-time multigrid scheme with receding-horizon techniques [1, 29]. Results of numerical experiments demonstrate that our receding-horizon space-time multigrid scheme provides an efficient and robust control strategy which is able to solve bang-bang problems.

4.1 Space-time smoothing schemes

A key component of the STMG Algorithm 1 is the smoothing scheme $S_k$, which must be efficient in reducing high-frequency error components and robust with respect to the control parameters. We discuss pointwise and linewise smoothing schemes for constrained- and unconstrained-control problems that are suitable for higher-order time discretization.

In order to develop the smoothing schemes, let us write the discretized optimality system (3.1), in expanded form, for a space-time grid point $(i,j,m)$. We have

\[
\begin{aligned}
&-\Bigl(\tfrac{3}{2} + 4\sigma\gamma\Bigr) y_{ijm} + \sigma\gamma\,\bigl(y_{i+1jm} + y_{i-1jm} + y_{ij+1m} + y_{ij-1m}\bigr) + 2y_{ijm-1} - \tfrac{1}{2}\,y_{ijm-2}\\
&\qquad + \delta t\,G(y_{ijm}) - \delta t\,u_{ijm} - \delta t\,f_{ijm} = 0, \qquad 2 \le m \le N_t + 1,
\end{aligned}
\tag{4.2a}
\]

\[
\begin{aligned}
&-\Bigl(\tfrac{3}{2} + 4\sigma\gamma\Bigr) p_{ijm} + \sigma\gamma\,\bigl(p_{i+1jm} + p_{i-1jm} + p_{ij+1m} + p_{ij-1m}\bigr) + 2p_{ijm+1} - \tfrac{1}{2}\,p_{ijm+2}\\
&\qquad + \delta t\,G'(y_{ijm})\,p_{ijm} + \alpha\,\delta t\,(y_{ijm} - y_{d\,ijm}) = 0, \qquad 0 \le m \le N_t - 1,
\end{aligned}
\tag{4.2b}
\]

\[
(\nu u_{ijm} - p_{ijm}) \cdot (v_{ijm} - u_{ijm}) \ge 0 \quad \text{for all } v_h \in U^h_{ad}.
\tag{4.2c}
\]

In the case of terminal observation, at $t_{N_t+1} = T$, we have (2.4) in place of (4.2b).



Further, since we use the Crank-Nicolson method to calculate the required first steps to initialize the BDF2 method, we also write the expanded form of (3.2) and (3.3) to analyze the cases $m = 1$, corresponding to the instant $\delta t$, and $m = N_t$, corresponding to the instant $T - \delta t$. Therefore, for the case $m = 1$ we approximate the state variable with the following CN discretization

\[
\begin{aligned}
&-(1 + 2\sigma\gamma)\,y_{ijm} + \frac{\sigma\gamma}{2}\bigl(y_{i+1jm} + y_{i-1jm} + y_{ij+1m} + y_{ij-1m}\bigr) + (1 - 2\sigma\gamma)\,y_{ijm-1}\\
&\quad + \frac{\delta t}{2}\,G(y_{ijm}) - \frac{\delta t}{2}\,u_{ijm} + \frac{\sigma\gamma}{2}\bigl(y_{i+1jm-1} + y_{i-1jm-1} + y_{ij+1m-1} + y_{ij-1m-1}\bigr)\\
&\quad + \frac{\delta t}{2}\,G(y_{ijm-1}) - \frac{\delta t}{2}\,u_{ijm-1} - \frac{\delta t}{2}\bigl(f_{ijm-1} + f_{ijm}\bigr) = 0, \qquad m = 1,
\end{aligned}
\tag{4.3}
\]

and we approximate the corresponding adjoint variable $p_{ij2}$ with (4.2b). Moreover, for the case $m = N_t$ we approximate the corresponding state variable $y_{ijN_t}$ with (4.2a), and we calculate the adjoint variable with the following CN discretization

\[
\begin{aligned}
&-(1 + 2\sigma\gamma)\,p_{ijm} + \frac{\sigma\gamma}{2}\bigl(p_{i+1jm} + p_{i-1jm} + p_{ij+1m} + p_{ij-1m}\bigr) + (1 - 2\sigma\gamma)\,p_{ijm+1}\\
&\quad + \frac{\delta t}{2}\,G'(y_{ijm})\,p_{ijm} + \frac{\sigma\gamma}{2}\bigl(p_{i+1jm+1} + p_{i-1jm+1} + p_{ij+1m+1} + p_{ij-1m+1}\bigr)\\
&\quad + \frac{\alpha\,\delta t}{2}\,(y_{ijm} - y_{d\,ijm}) + \frac{\delta t}{2}\,G'(y_{ijm+1})\,p_{ijm+1} + \frac{\alpha\,\delta t}{2}\,(y_{ijm+1} - y_{d\,ijm+1}) = 0, \qquad m = N_t.
\end{aligned}
\tag{4.4}
\]

4.1.1 Pointwise smoothing schemes

In this section, the construction of pointwise smoothing schemes is discussed. We illustrate the construction of these schemes for the BDF2 discretization and consider the optimality system (4.2) at the space-time grid points $(i,j,m)$ for $m = 2, \ldots, N_t - 1$. A similar discussion follows for $m = 1$ and $m = N_t$, where the BDF2 and the CN discretization both appear in the optimality system. If we call $a := \bigl(\tfrac{3}{2} + 4\sigma\gamma\bigr)$ and introduce the following notation

\[
\begin{aligned}
S_{ijm} &:= \sigma\gamma\,\bigl(y_{i+1jm} + y_{i-1jm} + y_{ij+1m} + y_{ij-1m}\bigr) + 2y_{ijm-1} - \tfrac{1}{2}\,y_{ijm-2} - \delta t\,f_{ijm},\\
R_{ijm} &:= \sigma\gamma\,\bigl(p_{i+1jm} + p_{i-1jm} + p_{ij+1m} + p_{ij-1m}\bigr) + 2p_{ijm+1} - \tfrac{1}{2}\,p_{ijm+2} - \delta t\,\alpha\,y_{d\,ijm},
\end{aligned}
\]

we can write the optimality system (4.2) at $(i,j,m)$ as follows

\[
-a\,y_{ijm} + S_{ijm} + \delta t\,G(y_{ijm}) - \delta t\,u_{ijm} = 0,
\tag{4.5a}
\]

\[
-a\,p_{ijm} + R_{ijm} + \delta t\,G'(y_{ijm})\,p_{ijm} + \alpha\,\delta t\,y_{ijm} = 0,
\tag{4.5b}
\]

\[
(\nu u_{ijm} - p_{ijm})(v_{ijm} - u_{ijm}) \ge 0 \quad \text{for all } v_h \in U_{ad,h}.
\tag{4.5c}
\]

This is a nonlinear problem that includes an inequality constraint. To solve this problem, we generalize the scheme proposed in [7, 11] for the case of elliptic control problems. Moreover, we show that this approach can be interpreted as a local semismooth Newton method [25, 36, 42].

Consider (4.5a) and (4.5b). The Jacobian of these two equations is given by

\[
J_{ijm} := \begin{bmatrix} -a + \delta t\,G' & 0\\ \delta t\,(\alpha + G''\,p_{ijm}) & -a + \delta t\,G' \end{bmatrix}
\]

and its inverse is

\[
J_{ijm}^{-1} = \frac{1}{(-a + \delta t\,G')^2} \begin{bmatrix} -a + \delta t\,G' & 0\\ -\delta t\,(\alpha + G''\,p_{ijm}) & -a + \delta t\,G' \end{bmatrix}.
\tag{4.6}
\]

Notice that, in the case of nonmonotone nonlinearity (e.g., singular control problems), we should choose $\delta t$ sufficiently small to guarantee that $(-a + \delta t\,G')^2 \ne 0$. Now, for a given $u_{ijm}$, a classical local Newton update for the state and adjoint variables, $\hat{y}_{ijm}$ and $\hat{p}_{ijm}$, is given by

\[
\begin{pmatrix} \hat{y}\\ \hat{p} \end{pmatrix}_{ijm} = \begin{pmatrix} y\\ p \end{pmatrix}_{ijm} + J_{ijm}^{-1} \begin{pmatrix} r_y\\ r_p \end{pmatrix}_{ijm},
\tag{4.7}
\]

where $(r_y)_{ijm}$ and $(r_p)_{ijm}$ denote the residuals of (4.5a) and (4.5b), respectively. In the case of the BDF2 discretization, these residuals are given by

\[
\begin{aligned}
(r_y)_{ijm} &= a\,y_{ijm} - S_{ijm} - \delta t\,G(y_{ijm}) + \delta t\,u_{ijm}, && \text{for } m = 2, \ldots, N_t + 1,\\
(r_p)_{ijm} &= a\,p_{ijm} - R_{ijm} - \delta t\,G'(y_{ijm})\,p_{ijm} - \alpha\,\delta t\,y_{ijm}, && \text{for } m = 0, \ldots, N_t - 1.
\end{aligned}
\]

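Since $(r_y)_{ijm}$ depends explicitly on $u_{ijm}$, we can write $\hat{p}_{ijm}$ as a function of $u_{ijm}$ as follows

\[
\begin{aligned}
\hat{p}_{ijm}(u_{ijm}) = p_{ijm} &+ \frac{-\delta t\,(\alpha + G''\,p_{ijm})\,\bigl[a\,y_{ijm} - S_{ijm} - \delta t\,G(y_{ijm})\bigr]}{(-a + \delta t\,G')^2}\\
&+ \frac{a\,p_{ijm} - R_{ijm} - \delta t\,G'(y_{ijm})\,p_{ijm} - \alpha\,\delta t\,y_{ijm}}{(-a + \delta t\,G')}
- \frac{\delta t^2\,(\alpha + G''\,p)}{(-a + \delta t\,G')^2}\,u_{ijm}.
\end{aligned}
\tag{4.8}
\]

We use $\hat{p}_{ijm}$ in order to obtain the update for the control variable $u_{ijm}$. Let us recall that the gradient of the objective functional is given by $\nabla \hat{J}(u) = \nu u - p$. Therefore, from $\nu\,\tilde{u}_{ijm} - \hat{p}_{ijm} = 0$, we obtain the auxiliary variable

\[
\begin{aligned}
\tilde{u}_{ijm} = \Bigl(\nu + \frac{\delta t^2\,(\alpha + G''\,p)}{(-a + \delta t\,G')^2}\Bigr)^{-1} \times \Bigl[\, p_{ijm} &+ \frac{-\delta t\,(\alpha + G''\,p_{ijm})\,\bigl[a\,y_{ijm} - S_{ijm} - \delta t\,G(y_{ijm})\bigr]}{(-a + \delta t\,G')^2}\\
&+ \frac{a\,p_{ijm} - R_{ijm} - \delta t\,G'(y_{ijm})\,p_{ijm} - \alpha\,\delta t\,y_{ijm}}{(-a + \delta t\,G')} \Bigr].
\end{aligned}
\tag{4.9}
\]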

Then, the new value for $u_{ijm}$ resulting from the smoothing step is obtained by projection as follows

\[
u_{ijm} =
\begin{cases}
\overline{u}_{ijm} & \text{if } \tilde{u}_{ijm} > \overline{u}_{ijm},\\
\tilde{u}_{ijm} & \text{if } \underline{u}_{ijm} \le \tilde{u}_{ijm} \le \overline{u}_{ijm},\\
\underline{u}_{ijm} & \text{if } \tilde{u}_{ijm} < \underline{u}_{ijm}.
\end{cases}
\tag{4.10}
\]

With $u_{ijm}$ given, we can use (4.7) to obtain new values for $y_{ijm}$ and $p_{ijm}$. An iteration step of this smoothing scheme can be performed in any ordering for the spatial variables $i, j$. However, we need to take into account the opposite time orientation of the state and the adjoint equations. We therefore propose to update the state variable $y$ using the first vector component of (4.7) marching in the forward direction, while the adjoint variable $p$ is updated using the second component of (4.7) marching backwards in time. In this way a robust iteration is obtained. Let us recall that the calculation of the initialization steps $y_{ij2}$ and $p_{ijN_t}$ is carried out in the same way, but using the combination of the Crank-Nicolson scheme with the BDF2 method, as described above.

Algorithm 2 (Projected Time-Splitted Collective Gauss-Seidel Iteration (P-TS-CGS))

1. Set the starting approximation: calculate $y_{ij1}$, $p_{ij1}$, $u_{ij1}$, $y_{ijN_t}$, $p_{ijN_t}$ and $u_{ijN_t}$.
2. For $ij$ in, e.g., lexicographic order do
3. For $m = 2, \ldots, N_t - 1$: compute $(r_y)_{ijm}$, $\tilde{u}_{ijm}$ and the projection $u_{ijm}$. Then, the state update is given by
   \[ y_{ijm}^{(1)} = y_{ijm}^{(0)} + \frac{(r_y)_{ijm}}{(-a + \delta t\,G')}. \]
4. For $m = N_t - 1, \ldots, 2$ (backwards): compute $(r_y)_{ijm}$, $(r_p)_{ijm}$, $\tilde{u}_{ijm}$ and the projection $u_{ijm}$. Then, the adjoint update is given by
   \[ p_{ijm}^{(1)} = p_{ijm}^{(0)} + \frac{(-a + \delta t\,G')\,(r_p)_{ijm} - \delta t\,(\alpha + G''\,p)\,(r_y)_{ijm}}{(-a + \delta t\,G')^2}. \]
5. End.

In the control-unconstrained case, the iteration above applies without projection. In this case a simpler (equivalent) derivation of the pointwise smoothing is possible by eliminating the control variable, enforcing directly the optimality condition $\nu u_h^m - p_h^m = 0$. In this way, the time-splitted collective Gauss-Seidel (TS-CGS) iteration of [8, 9] is obtained.
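The local computation performed at one space-time grid point can be sketched as follows for the linear case G = 0, in which (4.8)–(4.10) and the two components of (4.7) reduce to closed-form expressions. The function name `projected_local_update` and its argument names are hypothetical; the neighbour sums S and R are assumed to be precomputed by the caller, and, unlike Algorithm 2, which updates y in the forward sweep and p in the backward sweep, this sketch returns both updates at once for a single grid point.

```python
def projected_local_update(y, p, S, R, a, dt, alpha, nu, u_lo, u_hi):
    """One projected local update at a grid point for the linear case G = 0.

    y, p       : current state and adjoint values at the grid point
    S, R       : neighbour and right-hand-side sums as in (4.5a)-(4.5b)
    u_lo, u_hi : lower and upper control bounds at the grid point
    Returns the new (y, p, u).
    """
    ry_no_u = a * y - S                    # part of (r_y) that is independent of u
    rp = a * p - R - alpha * dt * y        # adjoint residual (r_p)

    # Auxiliary control from nu * u - p_hat(u) = 0, cf. (4.8)-(4.9) with G = 0.
    p_hat_no_u = p - dt * alpha * ry_no_u / a**2 - rp / a
    u_tilde = p_hat_no_u / (nu + dt**2 * alpha / a**2)

    # Projection onto the admissible interval, cf. (4.10).
    u = min(max(u_tilde, u_lo), u_hi)

    # Newton update (4.7) with the projected control inserted in (r_y).
    ry = ry_no_u + dt * u
    y_new = y - ry / a
    p_new = p + (-a * rp - dt * alpha * ry) / a**2
    return y_new, p_new, u

if __name__ == "__main__":
    # Illustrative parameter values only.
    print(projected_local_update(0.3, 0.1, 0.7, -0.2, 3.5, 0.1, 1.0, 1e-2, -1.0, 1.0))
```

When the bounds are inactive, the returned triple satisfies the three local equations exactly, because the local problem is linear; when a bound is active, the control sits on that bound and the state and adjoint equations are still satisfied for that control value.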



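Now, we analyze the application of a (local) semismooth Newton (SSN) method to (4.5) to show that the resulting iterative scheme is equivalent to the P-TS-CGS scheme. Recall that (4.5c) is equivalent to the following (see [32])

\[
u(x,t) = \max\Bigl\{\underline{u}(x,t),\ \min\Bigl\{\overline{u}(x,t),\ \frac{1}{\nu}\,p(x,t)\Bigr\}\Bigr\}, \quad \text{a.e. in } Q \text{ and for } \nu > 0.
\tag{4.11}
\]

We consider (4.11) at a grid point $(i,j,m)$ and, for ease of illustration, we study the system (4.5) with $G(y) = 0$. Further, we denote (4.5) as the following operator equation

\[
\mathcal{F}(y_{ijm}, p_{ijm}, u_{ijm}) :=
\begin{bmatrix}
-a\,y_{ijm} + S_{ijm} - \delta t\,u_{ijm}\\
-a\,p_{ijm} + R_{ijm} + \alpha\,\delta t\,y_{ijm}\\
u_{ijm} - \max\{\underline{u}_{ijm}, \min\{\overline{u}_{ijm}, \tfrac{1}{\nu}\,p_{ijm}\}\}
\end{bmatrix} = 0.
\tag{4.12}
\]

We can state that both the max and min functions involved in (4.12) are semismooth. Indeed, it is well known (see [25, Lemma 3.1]) that the mappings $y \mapsto \max(0, y)$ and $y \mapsto \min(0, y)$, from $\mathbb{R}^n$ to $\mathbb{R}^n$, $n \in \mathbb{N}$, are Newton differentiable with Newton derivatives given by the diagonal matrices

\[
(\mathcal{G}_{\max})_{ii} := \begin{cases} 1 & \text{if } y_i \ge 0,\\ 0 & \text{if } y_i < 0 \end{cases}
\quad \text{and} \quad
(\mathcal{G}_{\min})_{ii} := \begin{cases} 1 & \text{if } y_i \le 0,\\ 0 & \text{if } y_i > 0, \end{cases}
\qquad i = 1, \ldots, n,
\tag{4.13}
\]

respectively. Thus, [39, Theorem 4.6] implies that the real function $\max\{\underline{u}_{ijm}, \min\{\overline{u}_{ijm}, \tfrac{1}{\nu}\,p_{ijm}\}\}$ is Newton differentiable with respect to $p_{ijm}$, and its Newton derivative is given by

\[
\mathcal{G}_p := \frac{1}{\nu}\,\chi_{A_+}\,\chi_{A_-},
\]

where $\chi_{A_+}$ and $\chi_{A_-}$ are defined by

\[
\chi_{A_+} := \begin{cases} 1 & \text{if } \min\{\overline{u}_{ijm}, \tfrac{1}{\nu}\,p_{ijm}\} \ge \underline{u}_{ijm},\\ 0 & \text{if } \min\{\overline{u}_{ijm}, \tfrac{1}{\nu}\,p_{ijm}\} < \underline{u}_{ijm} \end{cases}
\quad \text{and} \quad
\chi_{A_-} := \begin{cases} 1 & \text{if } \tfrac{1}{\nu}\,p_{ijm} \le \overline{u}_{ijm},\\ 0 & \text{if } \tfrac{1}{\nu}\,p_{ijm} > \overline{u}_{ijm}, \end{cases}
\tag{4.14}
\]

respectively. Consequently, we obtain the semismooth Newton step applied to the operator equation (4.12) as follows

\[
\begin{bmatrix}
-a & 0 & -\delta t\\
\delta t\,\alpha & -a & 0\\
0 & -\tfrac{1}{\nu}\,\chi_{A_+}\chi_{A_-} & 1
\end{bmatrix}_{ijm}
\begin{pmatrix} \delta_y\\ \delta_p\\ \delta_u \end{pmatrix}_{ijm}
=
\begin{pmatrix} r_y\\ r_p\\ r_u \end{pmatrix}_{ijm}.
\tag{4.15}
\]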



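From this system we obtain the following update for the state and adjoint variables

\[
\begin{pmatrix} y\\ p \end{pmatrix}^{(1)}_{ijm}
=
\begin{pmatrix} y\\ p \end{pmatrix}^{(0)}_{ijm}
+
\left( \begin{bmatrix} -a & -\delta t\,\chi_{A_+}\chi_{A_-}\tfrac{1}{\nu}\\ \alpha\,\delta t & -a \end{bmatrix}^{(0)}_{ijm} \right)^{-1}
\begin{pmatrix} a\,y - S + \delta t\,U\\ a\,p - R - \alpha\,\delta t\,y \end{pmatrix}^{(0)}_{ijm},
\tag{4.16a}
\]

where $U_{ijm} := \max\{\underline{u}_{ijm}, \min\{\overline{u}_{ijm}, \tfrac{1}{\nu}\,p^{(0)}_{ijm}\}\}$. The update for the control $u_{ijm}$ results as follows

\[
u^{(1)}_{ijm} = \max\Bigl\{\underline{u}_{ijm},\ \min\Bigl\{\overline{u}_{ijm},\ \frac{1}{\nu}\,p^{(0)}_{ijm}\Bigr\}\Bigr\} + \frac{1}{\nu}\,\chi_{A_+}\chi_{A_-}\,(\delta_p)_{ijm}.
\tag{4.16b}
\]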
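The semismooth Newton step can also be carried out numerically by assembling the 3×3 system (4.15) at one grid point and solving it directly. The following Python sketch does this for G = 0; the function name `ssn_local_step`, the argument names and the use of a dense solver are assumptions made for illustration, and the right-hand side is taken as the negative of the operator in (4.12), which is the sign convention implied by the residuals used in the text.

```python
import numpy as np

def ssn_local_step(y, p, u, S, R, a, dt, alpha, nu, u_lo, u_hi):
    """One local semismooth Newton step for (4.12) with G = 0 at a grid point."""
    # Indicator functions of (4.14), evaluated at the current adjoint value.
    chi_plus = 1.0 if min(u_hi, p / nu) >= u_lo else 0.0
    chi_minus = 1.0 if p / nu <= u_hi else 0.0
    chi = chi_plus * chi_minus

    projected = max(u_lo, min(u_hi, p / nu))

    # Generalized Jacobian of (4.12) with respect to (y, p, u), cf. (4.15).
    jac = np.array([[-a, 0.0, -dt],
                    [dt * alpha, -a, 0.0],
                    [0.0, -chi / nu, 1.0]])
    # Residuals r_y, r_p, r_u (negative of the operator in (4.12)).
    rhs = np.array([a * y - S + dt * u,
                    a * p - R - alpha * dt * y,
                    projected - u])

    dy, dp, du = np.linalg.solve(jac, rhs)
    return y + dy, p + dp, u + du

if __name__ == "__main__":
    # Illustrative parameter values only.
    print(ssn_local_step(0.3, 0.001, 0.0, 0.7, -0.2, 3.5, 0.1, 1.0, 1e-2, -1.0, 1.0))
```

Since the local problem is piecewise linear for G = 0, a single step of this kind already returns the exact local solution whenever the active set detected at the current iterate is the correct one.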

Now, we show that one iteration step given by (4.16) is equivalent to one iteration of the algorithm P-TS-CGS in the sense that the two methods compute the same update for the control, the state and the adjoint variables. Indeed, the local SSN iteration must be performed in the forward time direction to calculate the updates for $y_{ijm}$ and $u_{ijm}$ and in the backward time direction to calculate the updates for $p_{ijm}$. Consider the three possible cases arising in (4.16b).

(i) $\tfrac{1}{\nu}\,p_{ijm} > \overline{u}_{ijm}$. Here, we have that $\chi_{A_-} = 0$, and we obtain that $U_{ijm} := \overline{u}_{ijm}$. Therefore, from (4.16b), we obtain that $u^{(1)}_{ijm} = \overline{u}_{ijm}$ and, from (4.16a), the following updates for $y_{ijm}$ and $p_{ijm}$:

\[
y^{(1)}_{ijm} = y^{(0)}_{ijm} + \frac{(r_y)_{ijm}}{-a}
\quad \text{and} \quad
p^{(1)}_{ijm} = p^{(0)}_{ijm} + \frac{-\alpha\,\delta t\,(r_y)_{ijm} - a\,(r_p)_{ijm}}{a^2},
\tag{4.17}
\]

with $(r_y)_{ijm} := a\,y^{(0)}_{ijm} - S_{ijm} + \delta t\,\overline{u}_{ijm}$ and $(r_p)_{ijm} := a\,p^{(0)}_{ijm} - R_{ijm} - \alpha\,\delta t\,y^{(0)}_{ijm}$.

(ii) $\tfrac{1}{\nu}\,p_{ijm} < \underline{u}_{ijm}$. In this case, we have that $\chi_{A_+} = 0$, since $\min\{\overline{u}_{ijm}, \tfrac{1}{\nu}\,p_{ijm}\} < \underline{u}_{ijm}$. Hence, we have that $U_{ijm} = \underline{u}_{ijm}$ and (4.16b) implies that $u^{(1)}_{ijm} = \underline{u}_{ijm}$. Further, (4.16a) gives the following updates for $y_{ijm}$ and $p_{ijm}$:

\[
y^{(1)}_{ijm} = y^{(0)}_{ijm} + \frac{(r_y)_{ijm}}{-a}
\quad \text{and} \quad
p^{(1)}_{ijm} = p^{(0)}_{ijm} + \frac{-\alpha\,\delta t\,(r_y)_{ijm} - a\,(r_p)_{ijm}}{a^2},
\tag{4.18}
\]

with $(r_y)_{ijm} := a\,y^{(0)}_{ijm} - S_{ijm} + \delta t\,\underline{u}_{ijm}$ and $(r_p)_{ijm} := a\,p^{(0)}_{ijm} - R_{ijm} - \alpha\,\delta t\,y^{(0)}_{ijm}$.



(iii) $\underline{u}_{ijm} \le \frac{1}{\nu}\,p_{ijm} \le \overline{u}_{ijm}$. In this case, we have that $\chi_{A_+} = \chi_{A_-} = 1$. Thus, $U_{ijm} = \frac{1}{\nu}\,p^{(0)}_{ijm}$ and (4.16b) yields that
\[ u^{(1)}_{ijm} = \frac{1}{\nu}\,p^{(0)}_{ijm} + \frac{1}{\nu}\,(\delta_p)_{ijm} = \frac{1}{\nu}\,p^{(0)}_{ijm} + \frac{1}{\nu}\,\big(p^{(1)}_{ijm} - p^{(0)}_{ijm}\big) = \frac{1}{\nu}\,p^{(1)}_{ijm}. \tag{4.19} \]
By solving the system (4.16a), we obtain the following updates for $y_{ijm}$ and $p_{ijm}$
\[ y^{(1)}_{ijm} = \frac{\nu a\,S_{ijm} - \nu\,\delta t\,R_{ijm}}{\nu a^2 + \alpha\,\delta t^2} \quad \text{and} \quad p^{(1)}_{ijm} = \frac{\nu a\,R_{ijm} - \nu\alpha\,\delta t\,S_{ijm}}{\nu a^2 + \alpha\,\delta t^2}. \tag{4.20} \]
Thus, since $\frac{1}{\nu}\,p_{ijm} = \tilde{u}_{ijm}$, the equivalence between the SSN iteration and the P-TS-CGS iteration is clear in the cases (i) and (ii). Furthermore, in the case (iii), since $\tilde{u}_{ijm}$ is given by
\[ \tilde{u}_{ijm} = \frac{a\,R_{ijm} + \delta t\,\alpha\,S_{ijm}}{a^2\,\delta t + \alpha\,\delta t^2}, \]
we obtain the same expressions (4.20) by plugging $\tilde{u}_{ijm}$ into $(r_y)_{ijm}$. Therefore, the equivalence between the P-TS-CGS and the semismooth Newton iteration (4.16a)–(4.16b) is established.

4.1.2 Linewise smoothing schemes

In the regime of small $\sigma$ (or $\gamma$), the P-TS-CGS iteration cannot provide robust smoothing because the coupling in the space direction becomes weak and therefore pointwise relaxation in space is not effective in reducing the high-frequency components of the error. To overcome this problem, block-relaxation of the variables that are strongly connected must be performed. In our case, this means solving for the pairs of state and adjoint variables along the time-direction for each space coordinate. In order to construct a robust linewise smoothing scheme, we proceed in a way similar to that followed for the pointwise approach in order to obtain an approximation for the controls $u_{ijm}$. Thus, we consider the residuals $(r_y)_{ijm}$ and $(r_p)_{ijm}$, for all $m = 0, \ldots, N_t + 1$ at any $(i, j)$ as follows
\[ \big( (r_y)_{ij0}, (r_p)_{ij0}, \ldots, (r_y)_{ijm}, (r_p)_{ijm}, \ldots, (r_y)_{ijN_t+1}, (r_p)_{ijN_t+1} \big) = 0. \tag{4.21} \]

The solution of this problem provides the mapping $u_{ijm} \to y_{ijm}$ and $u_{ijm} \to p_{ijm}$ and, by requiring that the (unconstrained) optimality condition be satisfied, we obtain $\tilde{u}_{ijm}$ followed by projection, thus obtaining a new approximation for the control and consequently the updates for the state and adjoint variables. Let us recall that the residual $(r_y)_{ij0}$ is given by $(r_y)_{ij0} = y_{ij0} - \psi_{ij0}$, where $\psi_{ij0}$ is a function representing the initial condition $y_0$. The introduction of this residual helps us to completely describe the system of equations in the time interval $[0, T]$, making it possible to obtain a desirable structure for the matrices involved in the solution of (4.21). Further, the residual $(r_y)_{ij1}$ is given by the negative of the left-hand side of (4.3) and $(r_y)_{ijm}$, for $2 \le m \le N_t + 1$, are given by the negative of the left-hand side of (4.2a). In the same way, the residuals $(r_p)_{ijm}$, for $0 \le m \le N_t - 1$, are given by the left-hand side of (4.2b), while the residual $(r_p)_{ijN_t}$ corresponds to the negative of the left-hand side of (4.4). Finally, $(r_p)_{ijN_t+1} = \beta\,(y_{ijN_t+1} - y_{T,ijN_t+1}) - p_{ijN_t+1}$, which corresponds to the terminal condition (2.4).

To describe the block Gauss-Seidel procedure, consider the discrete equation (4.21) at any $i, j$ and for all time steps. For each spatial grid point $i, j$, the pair of state and adjoint equations corresponding to $(y, p)$ at a given $m$ corresponds to five $2 \times 2$ blocks for the pairs $(y, p)_{m-2}$, $(y, p)_{m-1}$, $(y, p)_m$, $(y, p)_{m+1}$, $(y, p)_{m+2}$. Considering all time steps, we obtain a block-pentadiagonal system $M w = r$, where $w = (y_h^2, p_h^2, \ldots, y_h^{N_t+1}, p_h^{N_t+1})$ and $r = (r_y(w^2), r_p(w^2), \ldots, r_y(w^{N_t+1}), r_p(w^{N_t+1}))$. The system matrix has the following form
\[ M = \begin{bmatrix} A_0 & D_0 & E_0 & & & & \\ C_1 & A_1 & D_1 & E_1 & & & \\ B_2 & C_2 & A_2 & D_2 & E_2 & & \\ & \ddots & \ddots & \ddots & \ddots & \ddots & \\ & & B_{N_t-1} & C_{N_t-1} & A_{N_t-1} & D_{N_t-1} & E_{N_t-1} \\ & & & B_{N_t} & C_{N_t} & A_{N_t} & D_{N_t} \\ & & & & B_{N_t+1} & C_{N_t+1} & A_{N_t+1} \end{bmatrix}. \tag{4.22} \]

Centered at $t_m$, the entries $B_m$, $C_m$, $A_m$, $D_m$, $E_m$ refer to the variables $(y, p)$ at $t_{m-2}$, $t_{m-1}$, $t_m$, $t_{m+1}$ and $t_{m+2}$, respectively. For $m = 2, \ldots, N_t - 1$, the block $A_m$ is given by
\[ A_m = \begin{bmatrix} -(\tfrac{3}{2} + 4\sigma\gamma) + \delta t\,G' & -\chi_{ijm}\,\frac{\delta t}{\nu} \\ \delta t\,(\alpha + G''\,p) & -(\tfrac{3}{2} + 4\sigma\gamma) + \delta t\,G' \end{bmatrix}, \tag{4.23} \]
where all functions within the brackets $[\,]$ are evaluated at $t_m$, and the indicator function $\chi_{ijm}$ is defined, for $m = 1, \ldots, N_t + 1$, by
\[ \chi_{ijm} := \begin{cases} 1 & \text{if } \underline{u}_{ijm} \le \tilde{u}_{ijm} \le \overline{u}_{ijm}, \\ 0 & \text{otherwise}, \end{cases} \tag{4.24} \]
where $\tilde{u}_{ijm}$ is given by (4.9). Thanks to the introduction of this indicator term, we guarantee a correct updating of the state and adjoint variables, in particular at the grid points where $\underline{u}_{ijm} \le \tilde{u}_{ijm} \le \overline{u}_{ijm}$. Further, at the end of this section, we will show that this is equivalent to a semismooth Newton approach to the solution of problem (4.21).


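The $B_m$, $C_m$, $D_m$ and $E_m$ blocks are given by
\[ B_m = \begin{bmatrix} \tfrac{1}{2} & 0 \\ 0 & 0 \end{bmatrix}, \quad C_m = \begin{bmatrix} -2 & 0 \\ 0 & 0 \end{bmatrix}, \quad D_m = \begin{bmatrix} 0 & 0 \\ 0 & 2 \end{bmatrix} \quad \text{and} \quad E_m = \begin{bmatrix} 0 & 0 \\ 0 & -\tfrac{1}{2} \end{bmatrix}. \tag{4.25} \]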

Clearly, for each time step, the variables neighboring the point $ij$ are taken as constant and contribute to the right-hand side of the system. Next, since $y_{ij1}$ and $p_{ijN_t}$ are approximated by the Crank-Nicolson method, while the corresponding $p_{ij1}$ and $y_{ijN_t}$ are approximated by the BDF2 method, the matrices $A_1$, $C_1$ as well as $A_{N_t}$ and $D_{N_t}$ are defined slightly differently. Actually, the matrices $A_1$ and $A_{N_t}$ have the same structure given in (4.23), except for the following entries: $(A_1)_{11} := -(1 + 2\sigma\gamma) + \frac{\delta t}{2}\,G'$, $(A_1)_{12} := -\chi_{ij2}\,\frac{\delta t}{2\nu}$, $(A_{N_t})_{21} := \frac{\delta t}{2}\,(\alpha + G''\,p)$ and $(A_{N_t})_{22} := -(1 + 2\sigma\gamma) + \frac{\delta t}{2}\,G'$. Further, $C_1$ and $D_{N_t}$ have the same structure given in (4.25) except for the following entries: $(C_1)_{12} = -\chi_{ij1}\,\frac{\delta t}{2\nu}$, $(D_{N_t})_{21} := \frac{\delta t}{2}\,(\alpha + G''\,p)$ and $(D_{N_t})_{22} := (1 - 2\sigma\gamma) + \frac{\delta t}{2}\,G'$. Clearly, all the functions in $A_1$ are evaluated at $t_1 = \delta t$ and the functions in $C_1$ are evaluated at $t_0 = 0$. Moreover, the functions in $A_{N_t}$ are evaluated at $t_{N_t} = T - \delta t$, while the functions in $D_{N_t}$ are evaluated at $t_{N_t+1} = T$.

As stated before, at $t_0$ we impose the condition $y_{ij0} = \psi_{ij0}$, where $\psi_{ij0}$ is a given function representing the initial condition. On the other hand, $p_{ij0}$ is approximated using the BDF2 method. Thus, the block $A_0$ has the same structure defined in (4.23), except for the entries $(A_0)_{11} := 1$ and $(A_0)_{12} := 0$. Clearly, the condition $y_{ij0} = \psi_{ij0}$ allows us to obtain the structure of a pentadiagonal block matrix for the matrix $M$. It remains to discuss the block $A_{N_t+1}$ for β = 0. At $t_{N_t+1} = T$, $p_{ijN_t+1}$ is defined by the terminal condition $\beta\,(y_{ijN_t+1} - y_{T,ijN_t+1}) - p_{ijN_t+1} = 0$ and $y_{ijN_t+1}$ is approximated by the BDF2 method. Thus, the block $A_{N_t+1}$ shares the same structure defined in (4.23), except for the entries $(A_{N_t+1})_{21} := \beta$ and $(A_{N_t+1})_{22} := -1$. Summarizing, our collective t-line relaxation is given by the following algorithm [9, 10].

Algorithm 3 (Projected-Time-Line Collective Gauss-Seidel Iteration (P-TL-CGS))
1. Set the starting approximation.
2. For $ij$ in, e.g., lexicographic order do: calculate $u_{ij}$ by using (4.9) and (4.10), and construct $(r_y)_{ij}$ and $(r_p)_{ij}$. Then
\[ \begin{pmatrix} y \\ p \end{pmatrix}^{(1)}_{ij} = \begin{pmatrix} y \\ p \end{pmatrix}^{(0)}_{ij} + M^{-1} \begin{pmatrix} r_y \\ r_p \end{pmatrix}_{ij}. \]
3. End.

We consider that the residuals $r_y$ and $r_p$ are constructed at $i, j$ and for all $m$ prior to the update. Since the solution in time is exact, no time splitting is required. In the case of the control-unconstrained problem, the algorithm described above does not need the projection step to calculate the control. We use the optimality condition $\nu u_h^m - p_h^m = 0$ in order to eliminate the control in the construction of $(r_y)_{ijm}$



and, consequently, the indicator term $\chi_{ijm}$ is no longer needed. Therefore, the control-unconstrained problem can be numerically solved by using the same Algorithm 3 by setting $\chi_{ijm} = 1$ everywhere, i.e. for all grid points $i, j, m$. When solving control-unconstrained problems, we will refer to Algorithm 3 as the TL-CGS smoother (see [9] for further details).

In Algorithm 3, the problem of how to solve the block-pentadiagonal system $M w = r$ arises. The Thomas algorithm represents a standard method to solve block-tridiagonal systems [43]. In this paper, we propose an algorithm based on a generalization of the Thomas algorithm (see [3–5]) which provides us with a block-pentadiagonal solver of optimal complexity. Although this procedure is standard (Gaussian elimination for banded matrices), for the sake of clarity we present the algorithm used.

Algorithm 4 Block Penta-Diagonal Solver
1. Factorization.
(i) For $m = 0$ calculate $F_0 := A_0^{-1}$, $P_0 := F_0 D_0$ and $Q_0 := F_0 E_0$.
(ii) For $m = 1$ calculate $F_1 := (A_1 - C_1 P_0)^{-1}$, $P_1 := F_1 (D_1 - C_1 Q_0)$ and $Q_1 := F_1 E_1$.
(iii) For $m = 2, \ldots, N_t + 1$ calculate $H_m := C_m - B_m P_{m-2}$ and $F_m := (A_m - H_m P_{m-1} - B_m Q_{m-2})^{-1}$. For $m \le N_t$ calculate $P_m := F_m (D_m - H_m Q_{m-1})$. For $m \le N_t - 1$ calculate $Q_m := F_m E_m$.
2. Intermediate forward solution.
(i) Calculate $g_0 := F_0 r_0$ and $g_1 := F_1 (r_1 - C_1 g_0)$.
(ii) For $m = 2, \ldots, N_t + 1$, calculate $g_m := F_m (r_m - H_m g_{m-1} - B_m g_{m-2})$.
3. Back substitution.
(i) Calculate $w_{N_t+1} := g_{N_t+1}$ and $w_{N_t} := g_{N_t} - P_{N_t} w_{N_t+1}$.
(ii) For $m = N_t - 1, \ldots, 0$, calculate $w_m := g_m - P_m w_{m+1} - Q_m w_{m+2}$.
4. End.

Clearly, the algorithm described above works if all the matrices which need to be inverted are nonsingular. Following [3], we can state that our Algorithm 4 solves a pentadiagonal system of order $N_t$ with $O(4(N_t + 1))$ effort.

We complete this section by analyzing the equivalence between our P-TL-CGS approach and a semismooth Newton scheme.
We require that the controls $u_{ijm}$ are available to construct the residuals $(r_y)_{ijm}$ and $(r_p)_{ijm}$, prior to the calculation of the generalized Newton step, when solving problem (4.21). Similar to Sect. 4.1.1, we consider that the controls $u_{ijm}$ are given by function (4.11). Further, for ease of illustration, we consider the linear case $G(y) = 0$. Therefore, due to the fact that function (4.11) is Newton differentiable (see Sect. 4.1.1), we calculate a semismooth Newton step to solve (4.21) as follows
\[ \begin{pmatrix} \delta_y \\ \delta_p \end{pmatrix}_{ij} = -J_E^{-1} \begin{pmatrix} r_y \\ r_p \end{pmatrix}_{ij}, \quad \text{for } i = 1, \ldots, N_x, \text{ and } j = 1, \ldots, N_y, \tag{4.26} \]



where $J_E$ stands for the generalized Jacobian of $E_{ij}$. We immediately observe that this Jacobian can be written as the following block pentadiagonal matrix
\[ -J_E = \begin{bmatrix} \tilde{A}_0 & D_0 & E_0 & & & & \\ \tilde{C}_1 & \tilde{A}_1 & D_1 & E_1 & & & \\ B_2 & C_2 & \tilde{A}_2 & D_2 & E_2 & & \\ & \ddots & \ddots & \ddots & \ddots & \ddots & \\ & & B_{N_t-1} & C_{N_t-1} & \tilde{A}_{N_t-1} & D_{N_t-1} & E_{N_t-1} \\ & & & B_{N_t} & C_{N_t} & \tilde{A}_{N_t} & D_{N_t} \\ & & & & B_{N_t+1} & C_{N_t+1} & \tilde{A}_{N_t+1} \end{bmatrix}, \tag{4.27} \]
where the diagonal block entries are given by
\[ \tilde{A}_m = \begin{bmatrix} -(\tfrac{3}{2} + 4\sigma\gamma) & -\delta t\,\chi_{A_+}\chi_{A_-}\frac{1}{\nu} \\ \delta t\,\alpha & -(\tfrac{3}{2} + 4\sigma\gamma) \end{bmatrix}, \quad \text{for } m = 2, \ldots, N_t - 1, \tag{4.28} \]
where $\chi_{A_+}$ and $\chi_{A_-}$ are given by (4.14). Next, due to the combination of CN and BDF2 schemes for $m = 1$ and $m = N_t$, we have that the matrices $\tilde{A}_1$, $\tilde{A}_{N_t}$ and $\tilde{C}_1$ are slightly different. Matrices $\tilde{A}_1$ and $\tilde{A}_{N_t}$ have the same structure as (4.28) except for the following components: $(\tilde{A}_1)_{11} := -(1 + 2\sigma\gamma)$, $(\tilde{A}_1)_{12} := -\delta t\,\chi_{A_+}\chi_{A_-}\frac{1}{2\nu}$, $(\tilde{A}_{N_t})_{21} := \frac{\delta t}{2}\,\alpha$ and $(\tilde{A}_{N_t})_{22} := -(1 + 2\sigma\gamma)$. Moreover, matrix $\tilde{C}_1$ shares the structure of the corresponding matrix in (4.25) except for the entries $(\tilde{C}_1)_{11} := -(1 - 2\sigma\gamma)$ and $(\tilde{C}_1)_{12} := -\delta t\,\chi_{A_+}\chi_{A_-}\frac{1}{2\nu}$. Finally, the terminal block $\tilde{A}_{N_t+1}$ has the same components as (4.28), except for the entries $(\tilde{A}_{N_t+1})_{21} := \beta$ and $(\tilde{A}_{N_t+1})_{22} := -1$.

Further, all the other constituent block matrices of $-J_E$ are defined in the same way as the corresponding constituent blocks of the matrix $M$. Moreover, notice that the function $\chi_{A_+}\chi_{A_-}$ can be rewritten as
\[ \chi_{ijm} := \begin{cases} 1 & \text{if } \underline{u}_{ijm} \le \frac{1}{\nu}\,p_{ijm} \le \overline{u}_{ijm}, \\ 0 & \text{otherwise}, \end{cases} \]
which is, by construction, equivalent to the indicator function (4.24). Therefore, the matrix $M$ and the generalized Jacobian $-J_E$ are equivalent. From this argument, we conclude that the algorithm P-TL-CGS is equivalent to the semismooth Newton strategy given by the generalized Newton step (4.26).
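The block elimination of Algorithm 4, which P-TL-CGS applies along each time line, can be sketched in plain Python for $2 \times 2$ blocks as follows. This is a minimal illustration rather than the authors' implementation: the helper names (`mm`, `mv`, `msub`, `vsub`, `inv2`, `block_penta_solve`) are hypothetical, blocks are stored as nested lists, no pivoting is performed, and all matrices to be inverted are assumed nonsingular.

```python
def mm(X, Y):
    """Product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mv(X, v):
    """Product of a 2x2 matrix and a 2-vector."""
    return [X[0][0] * v[0] + X[0][1] * v[1], X[1][0] * v[0] + X[1][1] * v[1]]

def msub(X, Y):
    return [[X[i][j] - Y[i][j] for j in range(2)] for i in range(2)]

def vsub(u, v):
    return [u[0] - v[0], u[1] - v[1]]

def inv2(X):
    """Inverse of a 2x2 matrix (assumed nonsingular)."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det], [-X[1][0] / det, X[0][0] / det]]

ZERO = [[0.0, 0.0], [0.0, 0.0]]

def block_penta_solve(B, C, A, D, E, r):
    """Solve B[m] w[m-2] + C[m] w[m-1] + A[m] w[m] + D[m] w[m+1] + E[m] w[m+2] = r[m].

    All lists have length n = N_t + 2; blocks that would fall outside the
    matrix (B[0], B[1], C[0], D[n-1], E[n-2], E[n-1]) are never read.
    """
    n = len(A)
    F, g = [None] * n, [None] * n
    P, Q, H = [ZERO] * n, [ZERO] * n, [ZERO] * n
    for m in range(n):
        # Step 1 (factorization) and step 2 (forward solution), merged in one sweep.
        Bm = B[m] if m >= 2 else ZERO
        Cm = C[m] if m >= 1 else ZERO
        Pm1, Qm1 = (P[m - 1], Q[m - 1]) if m >= 1 else (ZERO, ZERO)
        Pm2, Qm2 = (P[m - 2], Q[m - 2]) if m >= 2 else (ZERO, ZERO)
        H[m] = msub(Cm, mm(Bm, Pm2))
        F[m] = inv2(msub(msub(A[m], mm(H[m], Pm1)), mm(Bm, Qm2)))
        if m <= n - 2:                      # P_m is needed up to m = N_t
            P[m] = mm(F[m], msub(D[m], mm(H[m], Qm1)))
        if m <= n - 3:                      # Q_m is needed up to m = N_t - 1
            Q[m] = mm(F[m], E[m])
        rhs = r[m]
        if m >= 1:
            rhs = vsub(rhs, mv(H[m], g[m - 1]))
        if m >= 2:
            rhs = vsub(rhs, mv(Bm, g[m - 2]))
        g[m] = mv(F[m], rhs)
    # Step 3: back substitution.
    w = [None] * n
    for m in range(n - 1, -1, -1):
        w[m] = g[m]
        if m <= n - 2:
            w[m] = vsub(w[m], mv(P[m], w[m + 1]))
        if m <= n - 3:
            w[m] = vsub(w[m], mv(Q[m], w[m + 2]))
    return w
```

The work per time step is a fixed number of $2 \times 2$ products and one $2 \times 2$ inversion, so the cost grows linearly with the number of time steps.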

5 Twogrid Fourier analysis of space-time smoothers

In this section, we use local Fourier analysis (LFA) [8, 41] to calculate the smoothing factors $\mu(S_k)$ of the two smoothing iterative schemes obtained above, as well as the convergence factors $\eta(TG_k^{k-1})$ for the corresponding multigrid schemes. For a detailed discussion of how these quantities are constructed, see the Appendix and the references given there.



Fig. 1 Smoothing factor μ(Sk ) of TS-CGS scheme (left) and of the TL-CGS scheme (right) as function of ν and γ , δt = 1/64, α = 1, β = 0 and σ = 1

The smoothing property of a smoothing iteration $S_k$ measures the ability of the smoothing scheme to damp the high-frequency components of the solution error. It is defined as follows
\[ \mu(S_k) = \max\{ r(\hat{Q}_k^{k-1}(\theta)\,\hat{S}_k(\theta)) : \theta \in ([-\pi/2, \pi/2) \times [-\pi, \pi)) \}, \tag{5.1} \]
where $\hat{S}_k(\theta)$ is the Fourier symbol of the smoothing scheme, $\hat{Q}_k^{k-1}(\theta)$ is a projection operator on the space of high-frequency components, and $r$ is the spectral radius.

In Fig. 1, the smoothing factors $\mu(S_k)$ of both the TS-CGS (left) and TL-CGS (right) schemes are depicted as functions of $\nu$ and $\gamma$, to show the dependence of this factor on the optimization parameter and on the discretization parameters. These results demonstrate that the TS-CGS and the TL-CGS schemes, for the BDF2 discretization, have similar smoothing properties. Further, the action of these two schemes appears to be mesh-independent and optimization-parameter independent for a very large range of values of the weight of the cost of the control. These facts are also confirmed by the results of numerical experiments. Notice that for $\sigma = 0$ no spatial coupling is present and the TL-CGS scheme becomes an exact solver.

Next, we report results on the convergence factor of the twogrid scheme. It provides a sharp estimate of the convergence factor of the multigrid scheme with respect to all frequencies of the solution error. The twogrid convergence factor estimate by local Fourier analysis is given as follows
\[ \eta(TG_k^{k-1}) = \sup\{ r(\widehat{TG}_k^{k-1}(\theta)) : \theta \in ([-\pi/2, \pi/2) \times [-\pi, \pi)) \}, \]
where $\widehat{TG}_k^{k-1}(\theta)$ is the Fourier symbol of the twogrid operator and $r$ stands for the spectral radius. We obtain that the convergence factor $\eta$ is almost independent of the value of the weight $\nu$ and of the discretization parameter $\gamma$ for both choices of the smoothing scheme. Notice that this analysis predicts convergence factors that improve for smaller values of the optimization parameter. This is a unique feature of the space-time collective-smoothing multigrid approach. Our estimates obtained with twogrid LFA are sharp and, in order to facilitate comparison with values of convergence factors obtained with numerical experiments, we report in Tables 1 and 2

Table 1 Smoothing factor $\mu(S_k)$ and convergence factor $\eta(TG_k^{k-1})$ for the TS-CGS multigrid scheme ($\nu_1 = \nu_2 = 1$). Parameters: $\delta t = 1/64$, $\sigma = 1$, $\alpha = 1$ and $\beta = 0$

TS-CGS               γ     ν = 10^{-8}   ν = 10^{-6}   ν = 10^{-4}   ν = 10^{-2}
μ(S_k)               32    0.2289        0.4843        0.4516        0.4493
                     48    0.3317        0.4737        0.4502        0.4486
                     64    0.4056        0.4677        0.4494        0.4483
η(TG_k^{k-1})        32    0.0427        0.1317        0.1361        0.1347
                     48    0.0822        0.1352        0.1354        0.1344
                     64    0.1147        0.1368        0.1350        0.1342

Table 2 Smoothing factor $\mu(S_k)$ and convergence factor $\eta(TG_k^{k-1})$ for the TL-CGS multigrid scheme ($\nu_1 = \nu_2 = 1$). Parameters: $\delta t = 1/64$, $\sigma = 1$, $\alpha = 1$ and $\beta = 0$

TL-CGS               γ     ν = 10^{-8}   ν = 10^{-6}   ν = 10^{-4}   ν = 10^{-2}
μ(S_k)               32    0.2289        0.4843        0.4516        0.4493
                     48    0.3317        0.4737        0.4502        0.4486
                     64    0.4056        0.4677        0.4494        0.4483
η(TG_k^{k-1})        32    0.0427        0.1300        0.1282        0.1266
                     48    0.0822        0.1330        0.1303        0.1290
                     64    0.1147        0.1345        0.1313        0.1301

the LFA quantitative estimates of the smoothing factor and the convergence factor of TL-CGS- and TS-CGS-multigrid schemes. This task may be performed using any symbolic package, such as Mathematica.

6 Numerical experiments

In this section, we discuss numerical experiments to validate our space-time multigrid second-order accurate solution of parabolic control-constrained problems. The validation focuses on accuracy, robustness, and efficiency of our solution procedure. For this purpose, we construct exact solutions for unconstrained- and constrained-control problems to compute solution errors. In addition, we measure the convergence factor of the proposed multigrid schemes. In our experiments, the convergence factor $\rho$ is defined as the asymptotic value of the ratio of the norm of the residuals, given by $\|r_y\| + \|r_p\|/\nu$, resulting from two successive multigrid cycles. The stopping criterion in all experiments is $\|r_y\| + \|r_p\|/\nu < 10^{-10}$. In most cases, we choose W-cycles with two pre- and two post-smoothing steps, i.e. $\nu_1 = \nu_2 = 2$, and $h = \frac{1}{4}$ is the coarsest space-mesh size. We take $\Omega = (0, 1) \times (0, 1)$ and $T = 1$. To show the tracking ability of the space-time multigrid approach we report the tracking error $\|y - y_d\|$ and the terminal observation error $|y - y_T|$. In the following examples, we consider three different grids $N_x \times N_y \times N_t$: $64 \times 64 \times 64$, $128 \times 128 \times 128$, and $256 \times 256 \times 256$, which result in $\gamma = 64$, $\gamma = 128$, and $\gamma = 256$, respectively.
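The convergence factor defined above can be estimated from a recorded history of residual norms, as in the following sketch. The function names and the use of a geometric mean over the last few ratios are assumptions made for illustration; the residual history in the usage line is synthetic, not data from the experiments.

```python
import math

def cycle_ratios(residual_norms):
    """Ratios of residual norms from successive multigrid cycles."""
    return [b / a for a, b in zip(residual_norms, residual_norms[1:])]

def asymptotic_factor(residual_norms, tail=3):
    """Estimate rho as the geometric mean of the last `tail` ratios."""
    ratios = cycle_ratios(residual_norms)[-tail:]
    return math.exp(sum(math.log(q) for q in ratios) / len(ratios))

def cycles_needed(rho, initial_residual, tol=1e-10):
    """Number of cycles to bring the residual down to tol, assuming a constant factor rho < 1."""
    return max(0, math.ceil(math.log(tol / initial_residual) / math.log(rho)))

# Synthetic history with a constant reduction factor of 0.07 per cycle.
history = [0.07 ** k for k in range(8)]
rho_estimate = asymptotic_factor(history)
```

With a factor of this size, `cycles_needed` indicates that fewer than ten cycles reduce a unit residual below the tolerance $10^{-10}$ used as stopping criterion.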



Table 3 Accuracy results for the unconstrained control problem. Parameters: $\sigma = 1$, $\alpha = 1$ and $\beta = 0$

ν          N_x × N_y × N_t     γ      ‖y − y_h‖         ‖p − p_h‖          ‖y − R_h y_d‖
10^{-3}    64 × 64 × 64        64     1.41 · 10^{-6}    2.63 · 10^{-7}     8.00 · 10^{-3}
           128 × 128 × 128     128    3.52 · 10^{-7}    6.74 · 10^{-8}     8.00 · 10^{-3}
           256 × 256 × 256     256    9.62 · 10^{-8}    1.71 · 10^{-8}     8.00 · 10^{-3}
10^{-7}    64 × 64 × 64        64     2.44 · 10^{-7}    2.86 · 10^{-11}    8.33 · 10^{-7}
           128 × 128 × 128     128    2.26 · 10^{-8}    7.23 · 10^{-12}    8.00 · 10^{-7}
           256 × 256 × 256     256    2.02 · 10^{-9}    1.82 · 10^{-12}    8.00 · 10^{-7}
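As a quick consistency check of the second-order claim, the observed order of accuracy can be computed from errors on successively halved meshes. The sketch below uses the state and adjoint errors reported in Table 3 for $\nu = 10^{-3}$; the function name `observed_orders` is hypothetical.

```python
import math

def observed_orders(errors):
    """Observed order log2(e_h / e_{h/2}) for errors on successively halved meshes."""
    return [math.log2(coarse / fine) for coarse, fine in zip(errors, errors[1:])]

# Errors from Table 3, nu = 1e-3, on the 64^3, 128^3 and 256^3 grids.
state_errors = [1.41e-6, 3.52e-7, 9.62e-8]
adjoint_errors = [2.63e-7, 6.74e-8, 1.71e-8]

state_orders = observed_orders(state_errors)
adjoint_orders = observed_orders(adjoint_errors)
```

Values close to 2 correspond to the error dropping by about a factor of four each time the mesh sizes are halved.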

6.1 Control-unconstrained problems

We consider an unconstrained tracking problem with $\alpha = 1$, $\beta = 0$ and $\sigma = 1$, considering the linear case $G(y) = 0$ so that an exact solution is easily constructed. We choose the following
\[ y(x, t) = t^2 (1 - t)^2 \sin(\pi x_1) \sin(\pi x_2), \]
\[ p(x, t) = 2\nu\,t\,(1 - t)\,(\pi^2 t^2 - (\pi^2 - 2)\,t - 1) \sin(\pi x_1) \sin(\pi x_2), \]
and the objective function is given by
\[ y_d(x, t) = \big( (1 - t)^2 t^2 + 2\nu\,(2\pi^4 t^4 - 4\pi^3 t^3 + 2(\pi^4 - 3)\,t^2 + 6t - 1) \big) \sin(\pi x_1) \sin(\pi x_2). \]
We take $y_0(x) = y(x, 0)$. The optimality system (2.3) without constraints on the control is solved by the STMG-scheme. The results are reported in Table 3, showing that, when the space and time mesh sizes are halved, the solution errors decrease by approximately a factor of four, thus demonstrating second-order convergence. Notice that the high regularity of the data (polynomial in time) may result in superconvergence phenomena. Also, we show the obtained tracking error. As expected, this error improves for smaller values of the weight of the cost of the control.

Concerning multigrid convergence performance, we present numerical results for an unconstrained optimal control problem with $\sigma = 1$ and $G(y) = \exp(y)$. This nonlinearity models explosive combustion phenomena; see, e.g., [9]. In this case, consider a desired target trajectory given by
\[ y_d(x, t) := (1 + t) \cos(4\pi t)\,(x_1 - x_1^2)(x_2 - x_2^2), \tag{6.1} \]
which is an oscillating function whose amplitude increases linearly with time. For the terminal observation problem, we take $y_T(x) = y_d(x, T)$.

First, we consider the tracking of a desired trajectory, $\alpha = 1$ and $\beta = 0$. The results of this experiment with the TS-CGS and TL-CGS smoothers are shown in Table 4. The values of the convergence factor $\rho$, the values of the norms of the calculated residuals and the norm of the obtained tracking errors are reported, showing the efficiency and robustness of the STMG solver. It is possible to appreciate a similar



Table 4 Numerical results for the tracking problem with TS-CGS and TL-CGS smoothing schemes. Parameters: $\sigma = 1$, $\alpha = 1$, $\beta = 0$. Initial condition for state equation: $y_0 = y(x, 0)$

           ν          γ      ρ        ‖r_y‖              ‖r_p‖              ‖y_h − R_h y_d‖
TS-CGS     10^{-2}    32     0.069    2.19 · 10^{-9}     4.67 · 10^{-10}    4.52 · 10^{-2}
                      64     0.071    4.39 · 10^{-8}     6.82 · 10^{-9}     4.52 · 10^{-2}
                      128    0.074    7.51 · 10^{-8}     5.60 · 10^{-9}     4.52 · 10^{-2}
           10^{-4}    32     0.065    1.18 · 10^{-10}    1.28 · 10^{-12}    3.30 · 10^{-3}
                      64     0.067    1.99 · 10^{-9}     2.44 · 10^{-11}    3.73 · 10^{-3}
                      128    0.070    2.87 · 10^{-9}     3.16 · 10^{-11}    3.93 · 10^{-3}
           10^{-6}    32     0.106    1.14 · 10^{-10}    1.06 · 10^{-13}    6.69 · 10^{-4}
                      64     0.065    1.12 · 10^{-10}    6.74 · 10^{-14}    2.94 · 10^{-4}
                      128    0.064    1.66 · 10^{-9}     1.70 · 10^{-12}    2.49 · 10^{-4}
TL-CGS     10^{-2}    32     0.071    2.99 · 10^{-9}     4.78 · 10^{-10}    4.53 · 10^{-2}
                      64     0.073    5.92 · 10^{-8}     7.11 · 10^{-9}     4.53 · 10^{-2}
                      128    0.074    7.81 · 10^{-8}     5.34 · 10^{-9}     4.53 · 10^{-2}
           10^{-4}    32     0.065    1.43 · 10^{-10}    1.28 · 10^{-12}    3.30 · 10^{-3}
                      64     0.067    2.15 · 10^{-9}     2.45 · 10^{-11}    3.29 · 10^{-3}
                      128    0.069    2.37 · 10^{-9}     2.69 · 10^{-11}    3.30 · 10^{-3}
           10^{-6}    32     0.100    8.92 · 10^{-11}    8.34 · 10^{-14}    6.69 · 10^{-4}
                      64     0.061    8.54 · 10^{-11}    7.93 · 10^{-14}    2.86 · 10^{-4}
                      128    0.056    9.34 · 10^{-10}    1.00 · 10^{-12}    1.82 · 10^{-4}

convergence performance of the two smoothing schemes, which is independent of the values of the mesh-parameter $\gamma$ and of the optimization weight $\nu$. These results also demonstrate the validity and sharpness of the local Fourier analysis presented in Sect. 5 that predicts 'textbook' multigrid convergence rates. We notice that the tracking error improves by taking smaller values of the control weight, as predicted by the LFA.

Next, we consider the case of terminal observation without tracking a given trajectory, $\alpha = 0$ and $\beta = 1$. In Table 5, we show the values of the convergence factor $\rho$, the values of the norms of the calculated residuals and the norm of the obtained terminal error employing the TS-CGS and TL-CGS smoothers. The results show that the two smoothing schemes perform with similar efficiency. Further, the results in Table 5 show robustness with respect to the values of $\gamma$ and $\nu$.

6.2 Control-constrained problems

In this section, we discuss the solution of a control-constrained optimal control problem. The multigrid algorithm with the P-TS-CGS smoothing scheme is tested with an exact solution of the linear case $G = 0$, which is constructed as follows. We consider $\sigma = 1$ and choose $\underline{u}(x, t) = -1/2$ and $\overline{u}(x, t) = 1/2$, and the following state, adjoint



Table 5 Numerical results for the terminal problem with TS-CGS and TL-CGS smoothing schemes. Parameters: $\sigma = 1$, $\alpha = 0$, $\beta = 1$. Initial condition for state equation: $y_0 = y(x, 0)$

           ν          γ      ρ        ‖r_y‖              ‖r_p‖              |y_h − R_h y_T|
TS-CGS     10^{-2}    32     0.070    4.13 · 10^{-10}    9.45 · 10^{-11}    6.29 · 10^{-3}
                      64     0.070    4.61 · 10^{-9}     1.56 · 10^{-9}     6.33 · 10^{-3}
                      128    0.065    1.97 · 10^{-8}     1.26 · 10^{-8}     6.33 · 10^{-3}
           10^{-4}    32     0.080    6.00 · 10^{-11}    8.18 · 10^{-13}    8.76 · 10^{-5}
                      64     0.089    1.53 · 10^{-10}    3.48 · 10^{-12}    8.91 · 10^{-5}
                      128    0.083    7.19 · 10^{-10}    3.93 · 10^{-11}    8.99 · 10^{-5}
           10^{-6}    32     0.101    1.98 · 10^{-10}    2.09 · 10^{-14}    8.82 · 10^{-7}
                      64     0.093    9.67 · 10^{-10}    1.72 · 10^{-13}    8.99 · 10^{-7}
                      128    0.070    2.05 · 10^{-9}     8.82 · 10^{-13}    9.10 · 10^{-7}
TL-CGS     10^{-2}    32     0.064    5.58 · 10^{-9}     6.64 · 10^{-10}    6.29 · 10^{-3}
                      64     0.067    4.76 · 10^{-9}     8.77 · 10^{-10}    6.33 · 10^{-3}
                      128    0.059    1.94 · 10^{-8}     3.68 · 10^{-9}     6.33 · 10^{-3}
           10^{-4}    32     0.062    5.57 · 10^{-9}     8.14 · 10^{-12}    8.76 · 10^{-5}
                      64     0.064    4.82 · 10^{-9}     1.08 · 10^{-11}    8.91 · 10^{-5}
                      128    0.045    2.00 · 10^{-8}     1.18 · 10^{-10}    8.99 · 10^{-5}
           10^{-6}    32     0.062    5.57 · 10^{-9}     8.14 · 10^{-14}    8.82 · 10^{-7}
                      64     0.064    4.82 · 10^{-9}     1.08 · 10^{-13}    8.99 · 10^{-7}
                      128    0.055    2.00 · 10^{-8}     1.19 · 10^{-12}    9.10 · 10^{-7}

and control functions
\[ y(x_1, x_2, t) = (1 - t) \sin(\pi x_1) \sin(\pi x_2), \]
\[ p(x_1, x_2, t) = \nu\,(1 - t) \sin(2\pi x_1) \sin(2\pi x_2), \tag{6.2} \]
\[ u(x_1, x_2, t) = \max\{-0.5, \min\{0.5, p(x_1, x_2, t)/\nu\}\}. \]
The corresponding data is then given by
\[ f = -u - \partial_t y + \Delta y, \qquad y_d = y + \partial_t p + \Delta p. \tag{6.3} \]
The optimality system (2.3) with constraints on the control is solved by the STMG-scheme. The results are reported in Table 6. In this case, in spite of the fact that the control function loses regularity when constraints become active, for the moderate value $\nu = 10^{-3}$ we obtain second-order accuracy in the approximation of the state, the adjoint, and the control variables. For the much smaller value $\nu = 10^{-7}$, corresponding to much larger active sets and steeper gradients of the control function, a worsening of the accuracy of approximation of the control variable can be observed.

In Table 7, we report results concerning the computational performance of the STMG approach with constrained-control problems. We obtain convergence rates



Table 6 Accuracy results for a constrained-control problem: $\sigma = 1$, $\alpha = 1$ and $\beta = 0$

ν          N_x × N_y × N_t     γ      ‖y − y_h‖         ‖p − p_h‖         ‖u − u_h‖
10^{-3}    64 × 64 × 64        64     1.88 · 10^{-4}    4.49 · 10^{-6}    3.56 · 10^{-3}
           128 × 128 × 128     128    5.07 · 10^{-5}    1.10 · 10^{-6}    8.74 · 10^{-4}
           256 × 256 × 256     256    1.32 · 10^{-5}    2.68 · 10^{-7}    2.13 · 10^{-4}
10^{-7}    64 × 64 × 64        64     3.31 · 10^{-5}    1.81 · 10^{-7}    2.52 · 10^{-2}
           128 × 128 × 128     128    7.60 · 10^{-6}    3.03 · 10^{-8}    8.67 · 10^{-3}
           256 × 256 × 256     256    1.85 · 10^{-6}    5.93 · 10^{-9}    2.94 · 10^{-3}

Table 7 Numerical results for the constrained tracking problem with P-TS-CGS and P-TL-CGS smoothing schemes. Parameters: $\sigma = 1$, $\alpha = 1$ and $\beta = 0$. Initial condition for state equation: $y_0 = y(x, 0)$

             ν          γ      ρ        ‖r_y‖              ‖r_p‖              ‖y_h − R_h y_d‖
P-TS-CGS     10^{-2}    32     0.034    1.32 · 10^{-9}     6.56 · 10^{-11}    2.26 · 10^{-1}
                        64     0.030    6.32 · 10^{-9}     3.41 · 10^{-10}    2.29 · 10^{-1}
                        128    0.029    6.17 · 10^{-8}     4.09 · 10^{-9}     2.30 · 10^{-1}
             10^{-4}    32     0.081    3.67 · 10^{-9}     5.86 · 10^{-12}    2.30 · 10^{-3}
                        64     0.029    4.43 · 10^{-10}    5.03 · 10^{-12}    2.29 · 10^{-3}
                        128    0.028    2.87 · 10^{-9}     6.21 · 10^{-11}    2.30 · 10^{-3}
             10^{-6}    32     0.548    7.15 · 10^{-6}     6.27 · 10^{-9}     1.75 · 10^{-4}
                        64     0.293    1.95 · 10^{-8}     1.94 · 10^{-11}    4.35 · 10^{-5}
                        128    0.091    1.76 · 10^{-9}     4.99 · 10^{-13}    2.52 · 10^{-5}
P-TL-CGS     10^{-2}    32     0.034    1.34 · 10^{-9}     6.57 · 10^{-11}    2.26 · 10^{-1}
                        64     0.030    6.30 · 10^{-9}     3.41 · 10^{-10}    2.29 · 10^{-1}
                        128    0.029    6.17 · 10^{-8}     4.09 · 10^{-9}     2.30 · 10^{-1}
             10^{-4}    32     0.082    3.67 · 10^{-9}     5.66 · 10^{-12}    2.29 · 10^{-3}
                        64     0.029    6.82 · 10^{-10}    5.05 · 10^{-12}    2.29 · 10^{-3}
                        128    0.028    3.42 · 10^{-9}     6.21 · 10^{-11}    2.30 · 10^{-3}
             10^{-6}    32     0.503    4.18 · 10^{-3}     2.20 · 10^{-5}     1.30 · 10^{-4}
                        64     0.294    1.56 · 10^{-8}     1.32 · 10^{-11}    3.86 · 10^{-5}
                        128    0.149    2.18 · 10^{-9}     1.74 · 10^{-12}    2.48 · 10^{-5}

that show robustness and typical multigrid efficiency that improves on finer meshes. This is due to the fact that on fine meshes the active sets are better resolved. On the other hand, smaller values of $\nu$ result in larger active sets and steeper gradients of the control function, which make the problem more difficult to solve and explain the worsening of the convergence factor. In Fig. 2, the state $y$ and the control $u$ for two instants are depicted. We observe that at early times the control is active, but as time increases the control becomes non-active, since $p \to 0$ as $t \to T = 1$.



Fig. 2 Control constrained tracking problem: state y (left column) and control u (right column) for t = T /4 and t = 3T /4. Parameters: α = 1, β = 0, ν = 10−7 , γ = 64 and σ = 1

One important concern for parabolic control problems is tracking of a desired trajectory over long-time intervals. For this purpose, the combination of our multigrid method with receding-horizon techniques [1, 29] arises as an efficient strategy. In this section, we use the receding-horizon algorithm developed in [9, Sect. 3.2] in order to show the ability of this approach to track over long-time intervals. We test the receding-horizon algorithm by solving the control-constrained problem with the nonlinearity $G(y) = \exp(y)$ and the following desired trajectory
\[ y_d(x_1, x_2, t) := t \sin(2\pi t)\,(x_1 - x_1^2)(x_2 - x_2^2). \]
We study the tracking of the given trajectory over the time interval $(0, 5)$, considering the following constraints for the control: $\underline{u}(x, t) = -1$ and $\overline{u}(x, t) = 1$. We take $\sigma = 0.01$ and we use the STMG-algorithm with the P-TL-CGS smoothing scheme because this smoothing scheme is efficient and robust with small values of the diffusion coefficient. Our experience shows that fast accurate tracking is obtained taking $\alpha = 1$, $\beta = 0.1$ and $\nu = 10^{-4}$. In this case, the optimal control problem is solved to the required tolerance, on a grid with $\gamma = 64$, by 5 STMG-W(2, 2)-cycles, on average, in each time window. In Fig. 3 (left), the time evolution of the state variable compared to the desired trajectory is depicted, showing accurate tracking ($\|y_h - R_h y_d\| \approx 10^{-4}$). In Fig. 3 (right), the control function is depicted. We observe that initially the control constraints appear to be non-active, but as time increases the constraints become active.



Fig. 3 Receding-horizon solution for the control-constrained tracking problem. Left: time evolution of the state y (solid line) and the desired trajectory yd (dots) at (x1 , x2 ) = (0.5, 0.5). Right: optimal control u at (x1 , x2 ) = (0.5, 0.5)

6.3 Bang-bang control

In this section, we discuss the limit case of $\nu = 0$. This discussion is possible due to the robustness of our STMG approach where the smoothing scheme remains well defined choosing a zero weight of the control. This appears to be a unique feature of our solution strategy. In particular, with our multigrid scheme, we are able to investigate bang-bang control problems that arise, e.g., taking $\nu = 0$ and non-attainable target functions. This is a less investigated subject due to the difficulty of computing bang-bang solutions; see [11, 18, 44]. Following [11, 18] and considering the linear case $G = 0$, we can prove that the solution to (2.1) with $\nu = 0$ exists and is unique. It is characterized by the following optimality system
\[ \begin{aligned} -\partial_t y + \sigma \Delta y &= f + u && \text{in } Q, \\ \partial_t p + \sigma \Delta p + \alpha\,(y - y_d) &= 0 && \text{in } Q, \\ y = 0, \quad p &= 0 && \text{on } \Sigma, \\ p &= \min\{0, p + u - \underline{u}\} + \max\{0, p + u - \overline{u}\} && \text{in } Q \end{aligned} \tag{6.4} \]
with initial condition $y(x, 0) = y_0(x)$ and terminal condition $p(x, T) = \beta\,(y(x, T) - y_T(x))$. We take $f = 0$ and $y_0(x) = 0$, $\underline{u}(x, t) = -1$ and $\overline{u}(x, t) = 1$, and consider the following target trajectory
\[ y_d(x_1, x_2, t) = \sin(2\pi t) \sin(3\pi x_1) \sin(3\pi x_2). \]
With this target function and $\nu = 0$ we obtain a control which is everywhere active, that is, the control is bang-bang. In Fig. 4, the optimal control and the corresponding state for $\nu = 0$ are depicted for two different instants of time.
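The last relation in (6.4) can be examined pointwise. For a control between the bounds it forces the control to the upper bound where $p > 0$ and to the lower bound where $p < 0$, and it leaves the control undetermined where $p = 0$. The following sketch makes this explicit; the function names are hypothetical and the sample values are chosen only for illustration.

```python
def complementarity_residual(p, u, u_lo, u_hi):
    """Residual of p = min{0, p + u - u_lo} + max{0, p + u - u_hi} at one point."""
    return p - (min(0.0, p + u - u_lo) + max(0.0, p + u - u_hi))

def bang_bang_control(p, u_lo, u_hi):
    """Control implied by the sign of the adjoint; None where p vanishes."""
    if p > 0.0:
        return u_hi
    if p < 0.0:
        return u_lo
    return None  # any admissible value satisfies the relation when p = 0

# With bounds -1 and 1 as in the experiment, a positive adjoint selects u = 1.
u_at_point = bang_bang_control(0.3, -1.0, 1.0)
residual_at_point = complementarity_residual(0.3, u_at_point, -1.0, 1.0)
```

A control strictly between the bounds at a point with $p \neq 0$ leaves a nonzero residual, which is why the computed control is bang-bang wherever the adjoint does not vanish.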



Fig. 4 Numerical bang-bang control solutions with ν = 0 at t = T /4 (top) and t = 3T /4 (bottom). The state (left) and the control (right); 128 × 128 × 128 mesh

7 Conclusions

A space-time second-order accurate discretization scheme for control-constrained parabolic control problems was proposed and accuracy estimates for this scheme were discussed. In order to efficiently solve the resulting optimality systems, a multigrid algorithm for nonlinear control-constrained parabolic optimal control problems was discussed. The multigrid strategy was developed based on collective projected Gauss-Seidel schemes, which were proved to be local semismooth Newton methods. Moreover, a detailed local Fourier analysis was presented to study the convergence and smoothing properties of the multigrid schemes. This theoretical investigation and results of numerical experiments demonstrated the accuracy, robustness, and textbook multigrid efficiency of the proposed high-order multigrid solution process. Further numerical experiments showed the ability of the space-time multigrid approach to solve parabolic control-constrained problems in the limit case of bang-bang control and to track over long-time intervals using a receding-horizon technique.

Appendix: Details on twogrid local Fourier analysis

Local Fourier analysis (LFA) is a powerful tool for the analysis and design of efficient multigrid methods for PDE problems [41]. The LFA approach allows one to characterize the convergence and smoothing properties of a multigrid scheme by considering the corresponding representation in the Fourier space; see [7, 8, 13, 41] for more details.



In the LFA framework, one considers infinite grids. On the fine grid, we define the Fourier components $\phi(j, \theta) = e^{i j \cdot \theta}$, where $i$ is the imaginary unit, $j = (j_x, j_t) \in \mathbb{Z} \times \mathbb{Z}$, $\theta = (\theta_x, \theta_t) \in [-\pi, \pi)^2$, and $j \cdot \theta = j_x \theta_x + j_t \theta_t$. These functions are called harmonics. In a semicoarsening setting where coarsening applies only to the space grid, the frequency domain is spanned as follows:

low frequencies: $\phi(\cdot, \theta^{(0,0)})$ with $\theta^{(0,0)} := (\theta_x, \theta_t) \in [-\pi/2, \pi/2) \times [-\pi, \pi)$,

high frequencies: $\phi(\cdot, \theta^{(1,0)})$ with $\theta^{(1,0)} := (\bar{\theta}_x, \theta_t) \in ([-\pi, \pi) \setminus [-\pi/2, \pi/2)) \times [-\pi, \pi)$.

Using semicoarsening, we have that $\phi(j, \theta^{(0,0)}) = \phi(j, \theta^{(1,0)})$ on the coarse grid. Under this setting, the Fourier symbol of the twogrid operator $TG_k^{k-1}$ results in a $4 \times 4$ matrix given by

\[
\widehat{TG}_k^{k-1}(\theta) = \hat{S}_k(\theta)^{\nu_2} \, \widehat{CG}_k^{k-1}(\theta) \, \hat{S}_k(\theta)^{\nu_1}, \tag{7.1}
\]

where $\hat{S}_k(\theta)$ and $\widehat{CG}_k^{k-1}(\theta)$ represent the Fourier symbols of the smoothing operator and of the coarse grid correction given by $CG_k^{k-1} = [I_k - I_{k-1}^k (A_{k-1})^{-1} I_k^{k-1} A_k]$, respectively [41]. Note that here we represent the two main features of the multigrid scheme: the reduction of the high frequency error components by the smoothing operator $S_k$ and the reduction of the low frequency error components by the coarse grid correction. Also, we assume that $(A_{k-1})^{-1}$ exists.

Next, we determine the explicit form of the operator symbols given above. We first analyze the smoothing scheme. Let $w(j) = \sum_\theta \hat{W}_\theta \, \phi_k(j, \theta)$ denote the errors on the space-time grid, where $\hat{W}_\theta := (\hat{y}_\theta, \hat{p}_\theta)$ are the corresponding Fourier coefficients. Here $\sum_\theta$ denotes formal summation in $\theta = (\theta_x, \theta_t) \in ([-\pi/2, \pi/2) \times [-\pi, \pi))$. The action of one smoothing step can be expressed by $\hat{W}_\theta^{(1)} = \hat{S}_k(\theta) \hat{W}_\theta^{(0)}$, where the superscript (0) corresponds to the old approximation of the involved variables, before the smoothing step, and (1) corresponds to the new approximation, after the smoothing step. The operator $\hat{S}_k$ applies to the two equations (4.2a)–(4.2b) acting on low- and high-frequency components, and has the following form

\[
\hat{S}_k(\theta) = \mathrm{diag}\{\hat{s}(\theta^{(0,0)}), \hat{s}(\theta^{(1,0)})\},
\]

where $\hat{s}(\theta)$ is the $2 \times 2$ Fourier symbol of the smoothing scheme for a generic $\theta$.

A way to characterize the smoothing property of the operator $S_k$ is to assume an ideal coarse grid correction which annihilates the low frequency error components and leaves the high frequency error components unchanged. That is, one defines the projection operator $\mathcal{Q}_k^{k-1}$ on $E_k^\theta \times E_k^\theta$ by

\[
\hat{\mathcal{Q}}_k^{k-1}(\theta) = \begin{bmatrix} Q_k^{k-1}(\theta) & 0 \\ 0 & Q_k^{k-1}(\theta) \end{bmatrix},
\quad \text{where} \quad
Q_k^{k-1}(\theta) = \begin{cases} \mathrm{diag}\{0, 0\} & \text{if } \theta = \theta^{(0,0)}, \\ \mathrm{diag}\{1, 1\} & \text{if } \theta = \theta^{(1,0)}. \end{cases}
\]
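The frequency splitting used above pairs each low frequency with a high frequency whose harmonic is indistinguishable from it on the coarse grid. A minimal numerical check of this aliasing is sketched below. It assumes the standard LFA convention that the high-frequency partner is obtained by shifting $\theta_x$ by $\pi$ toward the opposite sign, and that the coarse grid keeps the even space indices $j_x$; the function names and the sample frequency are hypothetical.

```python
import numpy as np

def harmonic(jx, jt, theta_x, theta_t):
    """Fourier component phi(j, theta) = exp(i (jx*theta_x + jt*theta_t))."""
    return np.exp(1j * (jx * theta_x + jt * theta_t))

def high_partner(theta_x):
    """Assumed convention: shift a low frequency by pi toward the opposite sign."""
    return theta_x - np.pi if theta_x >= 0 else theta_x + np.pi

# Sample low frequency (illustrative values only).
theta_x, theta_t = 0.3, -1.1
theta_x_bar = high_partner(theta_x)

# A finite window of the infinite space-time grid.
jx_fine = np.arange(-6, 7)
jt = np.arange(-3, 4)
JX, JT = np.meshgrid(jx_fine, jt, indexing="ij")

low = harmonic(JX, JT, theta_x, theta_t)
high = harmonic(JX, JT, theta_x_bar, theta_t)

# Semicoarsening in space: keep even jx, keep every time index.
coarse = (jx_fine % 2 == 0)
coincide_on_coarse = np.allclose(low[coarse], high[coarse])
differ_on_fine = not np.allclose(low, high)
print(coincide_on_coarse, differ_on_fine)
```

On the odd space indices the two harmonics differ by a sign, which is why both must be tracked on the fine grid and why the twogrid symbol couples them.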



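We calculate $\hat{s}(\theta)$ for the TS-CGS and the TL-CGS smoothing schemes applied to (4.2a)–(4.2b) resulting from the BDF2 discretization and without constraints on the control. To analyze the TS-CGS update procedure at $(i, m)$, we consider that $y_i^{m-1}$ and $y_i^{m-2}$ as well as $p_i^{m+1}$ and $p_i^{m+2}$ have been updated in the previous iteration step. We then obtain that

\[
\hat{s}(\theta) =
\begin{pmatrix}
-\left(\tfrac{3}{2} + 2\sigma\gamma\right) + \sigma\gamma e^{-i\theta_x} + 2e^{-i\theta_t} - \dfrac{e^{-2i\theta_t}}{2} & -\dfrac{\delta t}{\nu} \\[2mm]
\alpha\,\delta t & -\left(\tfrac{3}{2} + 2\sigma\gamma\right) + \sigma\gamma e^{-i\theta_x} + 2e^{i\theta_t} - \dfrac{e^{2i\theta_t}}{2}
\end{pmatrix}^{-1}
\times
\begin{pmatrix}
-\sigma\gamma e^{i\theta_x} & 0 \\
0 & -\sigma\gamma e^{i\theta_x}
\end{pmatrix}.
\]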
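The TS-CGS symbol can be evaluated numerically to estimate a smoothing factor under the ideal coarse grid correction described earlier. The sketch below is not the authors' implementation: it assumes the usual LFA definition of the smoothing factor as the largest spectral radius of $\hat{s}(\theta)$ over the high frequencies, samples the frequencies on a uniform mesh, and uses hypothetical parameter values for $\sigma$, $\gamma$, $\delta t$, $\nu$ and $\alpha$.

```python
import numpy as np

def time_terms(theta_t):
    """BDF2 time contributions: e^{-i theta_t} terms (state row), e^{+i theta_t} terms (adjoint row)."""
    t_y = 2 * np.exp(-1j * theta_t) - 0.5 * np.exp(-2j * theta_t)
    t_p = 2 * np.exp(1j * theta_t) - 0.5 * np.exp(2j * theta_t)
    return t_y, t_p

def fine_block(theta_x, theta_t, sigma, gamma, dt, nu, alpha):
    """2x2 fine-grid symbol block [[a_y, -dt/nu], [alpha*dt, a_p]]."""
    t_y, t_p = time_terms(theta_t)
    c = 2 * sigma * gamma * np.cos(theta_x) - 2 * sigma * gamma - 1.5
    return np.array([[c + t_y, -dt / nu], [alpha * dt, c + t_p]])

def ts_cgs_split(theta_x, theta_t, sigma, gamma, dt, nu, alpha):
    """Matrices M (inverted) and N (right factor) of the TS-CGS symbol s = M^{-1} N."""
    t_y, t_p = time_terms(theta_t)
    d = -(1.5 + 2 * sigma * gamma) + sigma * gamma * np.exp(-1j * theta_x)
    m = np.array([[d + t_y, -dt / nu], [alpha * dt, d + t_p]])
    n = -sigma * gamma * np.exp(1j * theta_x) * np.eye(2)
    return m, n

def ts_cgs_symbol(theta_x, theta_t, sigma, gamma, dt, nu, alpha):
    m, n = ts_cgs_split(theta_x, theta_t, sigma, gamma, dt, nu, alpha)
    return np.linalg.solve(m, n)  # M^{-1} N without forming the inverse

def smoothing_factor(sigma, gamma, dt, nu, alpha, n=32):
    """Largest spectral radius of s(theta) over sampled high frequencies (assumed definition)."""
    theta_x_high = np.concatenate([
        np.linspace(-np.pi, -np.pi / 2, n, endpoint=False),
        np.linspace(np.pi / 2, np.pi, n, endpoint=False),
    ])
    theta_t_all = np.linspace(-np.pi, np.pi, 2 * n, endpoint=False)
    mu = 0.0
    for tx in theta_x_high:
        for tt in theta_t_all:
            s = ts_cgs_symbol(tx, tt, sigma, gamma, dt, nu, alpha)
            mu = max(mu, np.abs(np.linalg.eigvals(s)).max())
    return mu

# Hypothetical parameter values, chosen only for illustration.
params = dict(sigma=1.0, gamma=16.0, dt=1.0 / 64, nu=1e-4, alpha=1.0)
mu = smoothing_factor(**params)
print(f"sampled TS-CGS smoothing factor: {mu:.3f}")
```

The helper `fine_block` is included only to check that the splitting is consistent: adding back the term $\sigma\gamma e^{i\theta_x}$ moved to the right factor recovers the entries $a_y$, $a_p$ of the fine grid symbol.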

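Further, for the case of TL-CGS relaxation, the Fourier symbol of the smoothing operator is given by the following $2 \times 2$ matrix

\[
\hat{s}(\theta) = -\left(A + B e^{-2i\theta_t} + C e^{-i\theta_t} + D e^{i\theta_t} + E e^{2i\theta_t} + \mathcal{I} e^{-i\theta_x}\right)^{-1} \left(\mathcal{I} e^{i\theta_x}\right),
\]

where $A$ is given by (4.23) and the matrices $B$, $C$, $D$ and $E$ are given by (4.25). Further, $\mathcal{I} := \sigma\gamma I$, where $I$ is the $2 \times 2$ identity matrix. Furthermore, since the system is solved for all $m$ at once, we have that $y_i^{m-1}$, $y_i^{m-2}$, $p_i^{m+1}$ and $p_i^{m+2}$ have superindex (0).

Next, we calculate the Fourier symbols of the involved operators in the coarse grid correction $CG_k^{k-1}$. We consider a full-weighting restriction operator whose symbol is given by

\[
\hat{I}_k^{k-1}(\theta) = \frac{1}{2}
\begin{bmatrix}
(1 + \cos(\theta_x)) & 0 & (1 - \cos(\theta_x)) & 0 \\
0 & (1 - \cos(\theta_x)) & 0 & (1 + \cos(\theta_x))
\end{bmatrix}.
\]

For the linear prolongation operator, we have $\hat{I}_{k-1}^k(\theta) = \hat{I}_k^{k-1}(\theta)^T$. The symbol of the fine grid operator is

\[
\hat{A}_k(\theta) =
\begin{bmatrix}
a_y(\theta^{(0,0)}) & -\delta t/\nu & 0 & 0 \\
\alpha\,\delta t & a_p(\theta^{(0,0)}) & 0 & 0 \\
0 & 0 & a_y(\theta^{(1,0)}) & -\delta t/\nu \\
0 & 0 & \alpha\,\delta t & a_p(\theta^{(1,0)})
\end{bmatrix},
\]

where

\[
a_y(\theta) = 2\sigma\gamma \cos(\theta_x) + 2e^{-i\theta_t} - \frac{1}{2} e^{-2i\theta_t} - 2\sigma\gamma - \frac{3}{2}
\]

and

\[
a_p(\theta) = 2\sigma\gamma \cos(\theta_x) + 2e^{i\theta_t} - \frac{1}{2} e^{2i\theta_t} - 2\sigma\gamma - \frac{3}{2}.
\]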

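Then, the symbol of the coarse-grid operator follows:

\[
\hat{A}_{k-1}(\theta) =
\begin{bmatrix}
\dfrac{\sigma\gamma \cos(\theta_x)}{2} + 2e^{-i\theta_t} - \dfrac{1}{2} e^{-2i\theta_t} - \dfrac{\sigma\gamma + 3}{2} & -\delta t/\nu \\[2mm]
\alpha\,\delta t & \dfrac{\sigma\gamma \cos(\theta_x)}{2} + 2e^{i\theta_t} - \dfrac{1}{2} e^{2i\theta_t} - \dfrac{\sigma\gamma + 3}{2}
\end{bmatrix}.
\]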



Notice that on the coarser grid δt remains unchanged, since we do not apply coarsening in the time direction, while γ → γ /4 by coarsening.
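As a quick check of the coarse-grid entries, one can substitute $\gamma/4$ for $\gamma$ in the coefficients of $a_y$ and $a_p$. This is only a sketch of the arithmetic, assuming the coarse symbol keeps the same form as the fine one with $\gamma$ rescaled:

```latex
\[
2\sigma\,\frac{\gamma}{4} = \frac{\sigma\gamma}{2},
\qquad
-2\sigma\,\frac{\gamma}{4} - \frac{3}{2} = -\frac{\sigma\gamma + 3}{2}.
\]
```

These are the coefficient of the cosine term and the constant term of the coarse-grid symbol, while the time-dependent terms $2e^{\mp i\theta_t} - \tfrac{1}{2} e^{\mp 2i\theta_t}$ carry over unchanged because $\delta t$ is not coarsened.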

References

1. Allgöwer, F., Badgwell, T.A., Qin, J.S., Rawlings, J.B., Wright, S.J.: Nonlinear predictive control and moving horizon estimation - An introduction overview. In: Frank, P.M. (ed.) Advances in Control, Highlights of ECC'99, pp. 391–449 (1999), Chap. 5
2. Ascher, U.M., Petzold, L.R.: Computer Methods for Ordinary Differential Equations and Differential-Algebraic Equations. SIAM, Philadelphia (1998)
3. Batista, M.: A method for solving cyclic block penta-diagonal systems of linear equations (2008). arXiv:0803.0874v3
4. Batista, M., Karawia, A.A.: The use of the Sherman-Morrison-Woodbury formula to solve cyclic block tri-diagonal and cyclic block penta-diagonal linear systems of equations. Appl. Math. Comput. 210, 558–563 (2009)
5. Benkert, K., Fischer, R.: An efficient implementation of the Thomas-Algorithm for block pentadiagonal systems on vector computers. In: Shi, Y., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds.) Proceedings of the 7th International Conference on Computer Science, ICCS (2007)
6. Biegler, L.T., Ghattas, O., Heinkenschloss, M., Keyes, D., van Bloemen Waanders, B. (eds.): Real-Time PDE-Constrained Optimization, Computational Science and Engineering, vol. 3. SIAM, Philadelphia (2007)
7. Borzì, A.: High-order discretization and multigrid solution of elliptic nonlinear constrained optimal control problems. J. Comput. Appl. Math. 200, 67–85 (2007)
8. Borzì, A.: Multigrid methods for parabolic distributed optimal control problems. J. Comput. Appl. Math. 157, 365–382 (2003)
9. Borzì, A.: Space-time multigrid methods for solving unsteady optimal control problems. In: Biegler, L.T., Ghattas, O., Heinkenschloss, M., Keyes, D., van Bloemen Waanders, B. (eds.) Real-Time PDE-Constrained Optimization, Computational Science and Engineering, vol. 3. SIAM, Philadelphia (2007), Chap. 5
10. Borzì, A., Griesse, R.: Distributed optimal control of lambda-omega systems. J. Numer. Math. 14, 17–40 (2006)
11. Borzì, A., Kunisch, K.: A multigrid scheme for elliptic constrained optimal control problems. Comput. Optim. Appl. 31, 309–333 (2005)
12. Borzì, A., Schulz, V.: Multigrid methods for PDE optimization. SIAM Rev. 51, 361–395 (2009)
13. Borzì, A., von Winckel, G.: Multigrid methods and sparse-grid collocation techniques for parabolic optimal control problems with random coefficients. SIAM J. Sci. Comput. 31, 2172–2192 (2009)
14. Brandt, A.: Multi-level adaptive solutions to boundary-value problems. Math. Comput. 31, 333–390 (1977)
15. Brézis, H., Crandall, M.-G., Pazy, A.: Perturbations of nonlinear maximal monotone sets in Banach spaces. Commun. Pure Appl. Math. 23, 123–144 (1970)
16. Dreyer, Th., Maar, B., Schulz, V.: Multigrid optimization in applications. J. Comput. Appl. Math. 120, 67–84 (2000)
17. Emmrich, E.: Two-step BDF time discretisation of nonlinear evolution problems governed by monotone operators with strongly continuous perturbations. Comput. Methods Appl. Math. 9, 37–62 (2009)
18. Glashoff, K., Sachs, E.: On theoretical and numerical aspects of the bang-bang principle. Numer. Math. 29, 93–113 (1977)
19. Goldberg, H., Tröltzsch, F.: Second order sufficient optimality conditions for a class of non-linear parabolic boundary control problems. SIAM J. Control Optim. 31, 1007–1027 (1993)
20. Goldberg, H., Tröltzsch, F.: On a SQP–multigrid technique for nonlinear parabolic boundary control problems. In: Hager, W.W., Pardalos, P.M. (eds.) Optimal Control: Theory, Algorithms, and Applications, pp. 154–174. Kluwer Academic, Dordrecht (1998)
21. Hackbusch, W.: Parabolic multigrid methods. In: Glowinski, R., Lions, J.-L. (eds.) Computing Methods in Applied Sciences and Engineering VI. North-Holland, Amsterdam (1984)
22. Hackbusch, W.: Elliptic Differential Equations. Springer, New York (1992)



23. Hackbusch, W.: On the fast solving of parabolic boundary control problems. SIAM J. Control Optim. 17, 231–244 (1979)
24. Hackbusch, W.: Numerical Solution of Linear and Nonlinear Parabolic Optimal Control Problems. Lecture Notes in Control and Information Science, vol. 30. Springer, Berlin (1981)
25. Hintermüller, M., Ito, K., Kunisch, K.: The primal-dual active set strategy as a semi-smooth Newton method. SIAM J. Optim. 13, 865–888 (2003)
26. Hintermüller, M., Hoppe, R.H.W.: Goal-oriented adaptivity in control constrained optimal control of partial differential equations. SIAM J. Control Optim. 47, 1721–1743 (2008)
27. Hinze, M.: A variational discretization concept in control constrained optimization: the linear-quadratic case. Comput. Optim. Appl. 30, 45–63 (2005)
28. Horton, G., Vandewalle, S.: A space-time multigrid method for parabolic partial differential equations. SIAM J. Sci. Comput. 16(4), 848–864 (1995)
29. Ito, K., Kunisch, K.: Asymptotic properties of receding horizon optimal control problems. SIAM J. Control Optim. 40(5), 1585–1610 (2002)
30. Kunisch, K., Volkwein, S.: Proper orthogonal decomposition for optimality systems. ESAIM: Math. Model. Numer. Anal. 42, 1–23 (2008)
31. Lions, J.L.: Optimal Control of Systems Governed by Partial Differential Equations. Springer, Berlin (1971)
32. Malanowski, K.: Convergence of approximations vs. regularity of solutions for convex, control-constrained optimal-control problems. Appl. Math. Optim. 8, 69–95 (1981)
33. Meidner, D., Vexler, B.: Adaptive space-time finite element methods for parabolic optimization problems. SIAM J. Control Optim. 46, 116–142 (2007)
34. Nagaiah, Ch., Kunisch, K., Plank, G.: Numerical solution for optimal control of the reaction-diffusion equations in cardiac electrophysiology. Comput. Optim. Appl. doi:10.1007/s10589-009-9280-3
35. Neittaanmäki, P., Tiba, D.: Optimal Control of Nonlinear Parabolic Systems. Dekker, New York (1994)
36. Qi, L., Sun, J.: A nonsmooth version of Newton's method. Math. Program. 58, 353–368 (1993)
37. Rösch, A.: Error estimates for linear-quadratic control problems with control constraints. Optim. Methods Softw. 21, 121–134 (2006)
38. Showalter, R.E.: Monotone Operators in Banach Space and Nonlinear Partial Differential Equations. Mathematical Surveys and Monographs. AMS, Providence (1997)
39. Stadler, G.: Semismooth Newton and augmented Lagrangian methods for a simplified friction problem. SIAM J. Optim. 15, 39–62 (2004)
40. Thomas, J.W.: Numerical Partial Differential Equations: Finite Difference Methods. Springer, Berlin (1995)
41. Trottenberg, U., Oosterlee, C., Schüller, A.: Multigrid. Academic Press, London (2001)
42. Ulbrich, M.: Semismooth Newton methods for operator equations in function spaces. SIAM J. Optim. 13, 805–842 (2002)
43. Varga, R.S.: Matrix Iterative Analysis. Prentice Hall, New York (1962)
44. Zuazua, E.: Switching control. J. Eur. Math. Soc. (to appear)