An efficient sparse approach to sensitivity generation for large-scale dynamic optimization

Tilman Barz, Stefan Kuntsche, Günter Wozny, Harvey Arellano-Garcia

Chair of Process Dynamics and Operation, Sekr. KWT-9, Berlin Institute of Technology, D-10623 Berlin, Germany
Abstract

In this work, an efficient approach is proposed for the combined step-wise state and sensitivity integration, tailored to the orthogonal collocation on finite elements integration method. The presented algorithm can be adapted to any one-step integration method where gradients are available from the solution of the discretized system. Moreover, it is based entirely on sparse matrix calculus, which makes it especially suitable for large-scale equation systems with a relatively small number of independent variables. The computational effort relative to a pure state integration is small and increases linearly with the number of independent variables. The performance of the developed algorithm has been tested on two case studies: the parameter estimation of a stiff implicit differential equation system with parameters directly connected to the differential states, and the dynamic optimization of a stiff general DAE system.

Keywords: first-order sensitivities, integration of fully implicit dynamic systems, orthogonal collocation on finite elements, large sparse equation systems, differential algebraic equation systems, single shooting
1 Post-print version of the article: Barz, T., Kuntsche, S., Wozny, G., & Arellano-Garcia, H. (2011). An efficient sparse approach to sensitivity generation for large-scale dynamic optimization. Computers & Chemical Engineering, 35(10), 2053-2065. doi: 10.1016/j.compchemeng.2010.10.008. The content is identical to the published paper but without the final typesetting by the publisher.
1. Introduction

General attention has been paid to the computation of sensitivities w.r.t. independent model parameters (or free decisions) in dynamic systems. Besides their use for a wide range of analytic purposes, the main interest lies in their application in gradient-based dynamic optimization, e.g. when the partial discretization approach, also called the sequential approach, is used [2]. Here, only independent variables are considered in the optimization problem formulation and a Nonlinear Programming (NLP) problem is solved. The dynamic process model is then integrated at each iteration of the NLP solver for a given set of independent variables, and the corresponding sensitivities have to be evaluated. Most problems in chemical engineering exhibit a relatively large number of dependent states in comparison to the independent variables, and thus the main effort for the solution of the optimization problem lies in the repeated integration and sensitivity evaluation. Consequently, there is a need for algorithms which can perform sensitivity computations for large-scale models in an efficient and rapid manner [7, 12, 16, 1].

Sensitivities can be calculated in three different ways: by perturbations, by forward integration of the sensitivity equations, or by backward integration of the adjoint system. However, if the number of parameters or free decisions is small in comparison to the number of dependent states, the latter approach is generally not efficient [3]. The most direct way to obtain sensitivities is the perturbation method, also known as external numeric differentiation [1]. It is based on a finite difference approximation, where the integration is carried out repeatedly for slightly perturbed parameters or free decisions. Although easy to implement, it is relatively computationally expensive.
Moreover, since the output depends discontinuously on the adopted step size or order of the integration, the perturbation approach exhibits additional problems when adaptive integrators are used. In order to overcome this issue, the so-called StepFreeze perturbation method was proposed, where the integration grid and all other properties of the integrator are held constant during the repeated integration [17]. Accordingly, information used in the integration steps can be reused directly for the integration of the perturbed system, and thus significant savings in computational effort are achieved.

When forward integration is applied for the evaluation of sensitivities, an extra set of linear equation systems of the same size has to be solved for each free parameter, in addition to the integration of the equation system [19]. The sensitivities can be integrated step-wise together with the model equations. Towards a more efficient sensitivity integration, the idea of reusing information from the state integration has been adopted by various authors. Different algorithms for a combined state and sensitivity integration have been developed as extensions of multi-step backward differentiation formula (BDF) codes. The so-called staggered corrector method has been applied to the DASSL solver, where available information is again reused. In particular, the iteration matrices from the state integration, which are already available from the DAE corrector iteration, are reused. This has been shown to be very efficient
for large sparse systems [7]. An implementation is available in the software DASPK3.0 for the sensitivity analysis of large-scale DAE systems of index up to two [12]. An efficient calculation of sensitivities for initial value problems of large and sparse linear-implicit differential-algebraic equation (DAE) systems of index one has been implemented in the solver DAESOLII [1], which is based on a BDF method with variable order and adjustable step size. Here, the so-called internal numeric differentiation is used, where the adaptive components of the integrator are frozen. Furthermore, a so-called staggered state and sensitivity approach has been proposed by [16] based on an extrapolated linearly implicit Euler discretization scheme. It exploits the fact that, by fixing the step size of the integrator, the factorization of the iteration matrix required for the solution of the DAE states can be reused for the integration of the sensitivities.

In chemical engineering, general DAEs often correspond to the natural statement of the underlying physical models. A reformulation or index reduction is often not convenient and, for large nonlinear problems, may not even be possible [13, 20, 3]. Thus, solvers able to handle general DAEs offer some basic advantages. In this work, collocation methods, also known as fully implicit Gauss forms, are considered, which can be applied for the approximation of general DAEs. Their equivalence to particular implicit Runge-Kutta methods with highest order accuracy and excellent stability properties for index one and higher systems has already been shown [13, 22]. For the application of the full discretization approach in dynamic optimization, Orthogonal Collocation on Finite Elements (OCFE) has been shown to be a valuable method for transforming dynamic equations into algebraic ones. Based on this, efficient solution methods for general differential-algebraic optimization problems have been derived [4, 13, 2, 22].
Applying the partial discretization approach, OCFE can also be used for the direct integration of general implicit and stiff DAEs [11, 20, 21, 18]. Although commonly more expensive than multi-step methods, collocation methods have the special capability of self-starting at high order, which may be especially important for the partial discretization approach when multiple shooting is used or when a high number of discontinuities is present.

In this work, we propose an efficient algorithm for sensitivity generation within the scope of the partial discretization approach, which is tailored to an OCFE-based integration of general DAEs. The developed algorithm can also be used with any one-step integration method and gives exact sensitivity information for the approximate solution. In contrast to other algorithms used together with OCFE integration methods [14, 10], the presented approach is also efficient when integrating general large and sparse DAEs. It has also been adapted for use within the staggered method for state and sensitivity integration presented by other authors and exhibits similar properties in terms of improved computational efficiency.

The remainder of this paper is structured as follows: in section 2, the general discretization scheme used for the partial discretization of dynamic optimization problems is introduced. For this purpose, some important characteristics of sensitivities are discussed for the case of single shooting. Section 3 gives a brief introduction to the OCFE integration method and presents the basic algorithm for the evaluation of sensitivities. The proposed alternative algorithm is derived in section 4. In section 5, the implementation is discussed together with specific details for its use in single shooting. Finally, in order to demonstrate the performance of the algorithm, two large-scale sparse systems are treated, namely a general stiff implicit differential equation system and a general stiff DAE system. Thus, in section 6, a detailed description of the applications is presented, including the solution of the related parameter estimation and control problems as well as performance data.

2. Transformation of dynamic optimization problems

2.1. Discretization scheme

In this work, dynamic optimization problems as defined in Eq. (1) are considered, which are subject to general fully implicit equation systems $f$, equality constraints $g$, and inequality constraints $h$, as well as constant lower and upper bounds on the dependent states $x \in \mathbb{R}^{Nx}$ and the free variables (also referred to as decision variables) $u \in \mathbb{R}^{Nu}$.

$$\min_{u} \ \phi(x, u, t) \tag{1a}$$

s.t.

$$f(\dot{x}, x, u, t) = 0 \ ; \quad x(t_0) = x_0 \tag{1b}$$

$$g(x, u, t) = 0 \tag{1c}$$

$$h(x, u, t) \le 0 \tag{1d}$$

$$x^{\min} \le x \le x^{\max}$$

$$u^{\min} \le u \le u^{\max}$$
In general, the free variables $u$ depend on a set of parameters $p$ and time $t$, i.e. $u(t, p)$, where higher order approximations are realized by polynomials. In order to transform those problems into finite-dimensional nonlinear programming (NLP) problems, the free variables are discretized in a first step (decision grid). Points in time are then defined at which states are evaluated (evaluation points).

• decision grid. The time intervals in which the free decision variables $u_d$ with $d \in [1, Nd]$ are constant define the decision grid. In this work, the decision variables are assumed to be piecewise constant over each grid element. Thus, the parameter $p$ can be removed from the problem.¹

• evaluation points. Discrete points in time at which the state variables $x(t = t_e)$ with $e \in [1, Ne]$ are evaluated are denoted as evaluation points.

¹ In order to apply a higher order approximation to the decision variables $u$, polynomial equations $u = f(t, p)$ can be added to the equation system (1b). The decision variables $u$ turn into dependent states and are then replaced by the respective parameters $p$.
In the proposed implementation, the intervals between evaluation points are independent of the decision grid. Furthermore, during optimization both the positions of the evaluation points and the distances within the decision grid are in general not constant. This is particularly desirable when, e.g., minimizing the time needed to comply with a certain specification in a dynamic process, or when optimizing the duration of a certain control action. In those cases, positions and distances are considered as independent variables in the optimization problem [14].

In this work, sequential optimization based on single shooting is applied. For the necessary integration, a general high order one-step method with step size control is considered. The steps of the integrator are denoted here as the simulation grid.

• simulation grid. The finite sub-intervals $s \in [1, Ns_d]$ used for the discretization in order to integrate the differential equation system define the so-called simulation grid. The sub-intervals depend on the step size control of the integrator. However, since the decision interval bounds imply discontinuities, the ends of the simulation grid have to match the boundaries of the corresponding decision interval [13, 5].

Accordingly, the dependent states $x(t_e)$ are eliminated from the optimization problem, which then depends only on the free decisions $u_d$:

$$\min_{u_1, \cdots, u_{Nd}} \ \phi(x(t_1), \cdots, x(t_{Ne}), u_1, \cdots, u_{Nd}) \tag{2a}$$

s.t.

$$g(x(t_1), \cdots, x(t_{Ne}), u_1, \cdots, u_{Nd}) = 0 \tag{2b}$$

$$h(x(t_1), \cdots, x(t_{Ne}), u_1, \cdots, u_{Nd}) \le 0 \tag{2c}$$

$$u_d^{\min} \le u_d \le u_d^{\max} \ ; \quad \forall \, d \in [1, Nd]$$
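The requirement that the ends of the simulation grid match the decision interval boundaries can be illustrated with a minimal sketch. The helper `clip_step`, its arguments, and the numerical values are hypothetical and not part of the original implementation; the function merely shortens a step proposed by a step size controller so that no integration step crosses a boundary of the decision grid.

```python
def clip_step(t, h_proposed, decision_bounds, tol=1e-12):
    """Return a step size that does not cross the next decision boundary.

    t               : current time
    h_proposed      : step size suggested by the step size control
    decision_bounds : sorted boundaries of the decision grid, including t_f
    """
    # Boundaries strictly ahead of the current time
    ahead = [b for b in decision_bounds if b > t + tol]
    if not ahead:
        raise ValueError("t lies at or beyond the final time")
    # End the step on the boundary if the proposal would overshoot it
    return min(h_proposed, ahead[0] - t)


bounds = [0.0, 1.0, 2.0]                  # two decision intervals (assumed values)
h_inside = clip_step(0.2, 0.3, bounds)    # proposal stays inside the interval
h_clipped = clip_step(0.9, 0.3, bounds)   # proposal is cut at t = 1.0
```

A production integrator would typically also avoid very small remainder steps just before a boundary; that refinement is omitted here.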
Figure 1 shows the discretization scheme when Orthogonal Collocation on Finite Elements (OCFE) with collocation points $c \in [1, Nc]$ is used for the approximation of the differential states. The problem functions $\phi$, $g$, $h$, as well as their first order gradients with respect to the decision variables $u_d$ in a certain interval, are computed by integrating the differential equation system (1b) in a separate simulation subroutine. The gradients of all problem functions in Eqs. (2a, 2b, 2c) can be computed using the chain rule. Thus, the gradient of the objective function reads:

$$\frac{d\phi}{d u_d} = \frac{\partial \phi}{\partial u_d} + \sum_{e=1}^{Ne} \sum_{i=1}^{Nx} \frac{\partial \phi}{\partial x_i(t_e)} \cdot \frac{\partial x_i(t_e)}{\partial u_d} \tag{3}$$
For the computation of (3) it is necessary to determine the sensitivities $\partial x_i(t_e)/\partial u_d$. Generally, with an independent (step size controlled) simulation grid, the values $x_i(t_e)$ and the gradients $\partial x_i(t_e)/\partial u_d$ at specific evaluation points are obtained by polynomial interpolation. Due to the coupling of the states in the equation system (1b), the sensitivities for a specific $x_i(t_e)$ cannot be computed independently. Instead, they have to be evaluated simultaneously with those of all other states, $\partial x(t_e)/\partial u_d$.
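The chain rule of Eq. (3) can be sketched in a few lines. The function name `objective_gradient`, the array layout, and the toy data are assumptions made for illustration only; the sketch presumes that the sensitivities at the evaluation points have already been computed by the integrator.

```python
import numpy as np


def objective_gradient(dphi_du, dphi_dx, dx_du):
    """Total derivative of phi w.r.t. one decision u_d, following Eq. (3).

    dphi_du : (Nu,)         explicit partial derivative of phi w.r.t. u_d
    dphi_dx : (Ne, Nx)      partial derivatives of phi w.r.t. x_i(t_e)
    dx_du   : (Ne, Nx, Nu)  sensitivities of x_i(t_e) w.r.t. u_d
    """
    # Double sum over evaluation points e and states i
    return dphi_du + np.einsum("ei,eiu->u", dphi_dx, dx_du)


# Toy data (assumed): phi = u^2 + sum_e x(t_e)^2 with x(t_e) = u * t_e
u = 3.0
t_e = np.array([1.0, 2.0])
x = u * t_e
grad = objective_gradient(
    dphi_du=np.array([2.0 * u]),
    dphi_dx=(2.0 * x).reshape(2, 1),
    dx_du=t_e.reshape(2, 1, 1),
)
```

For this toy problem $\phi = 6u^2$, so the result can be checked against the analytic derivative $12u$.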
[Figure 1 (schematic): time axis from $t_0$ to $t_f$ with three grids: the evaluation points* for $x(t_e)$, $e = 1, \cdots, Ne$; the decision grid* for $u_d$, $d = 1, \cdots, Nd$; and the simulation grid** for $x_{d,s,c}$ with sub-intervals $s = 1, \cdots, Ns$ and collocation points $c = 0$ and $c = [1, \cdots, Nc]$.
* depends on the formulation of the optimization problem
** possibly not static during optimization (step size controlled); boundaries of decision intervals have to be matched]

Figure 1: Discretization grids used for the numeric solution of a dynamic optimization problem. Radau collocation is applied here for the integration within each decision interval.
2.2. Characteristics of sensitivities in single shooting

The necessary sensitivity information comprises the derivatives of the state variables in a certain decision interval $d = i$ w.r.t. the decisions in the interval $d = j$:

$$S_{i,j}(t_e) = \frac{\partial x_{d=i}(t_e)}{\partial u_{d=j}} \tag{4}$$
From Fig. 1 it can be seen that the integration of the state variables within the decision interval $d = i$ leads to a further discretization over time based on the simulation sub-intervals $s$. The aspects of these integration steps are discussed later in section 3.1. Generally, the diagonal elements $S_{i,i}$ in Eq. (5) represent the so-called local sensitivities in one decision interval, whereas the off-diagonal elements in the lower triangular part, the global sensitivities $S_{i,j}$ with $i > j$, represent dependencies of states on decisions in previous intervals. The off-diagonal elements in the upper triangular part, where $i < j$ holds, stand for dependencies of states on decisions in later intervals. These elements are always equal to zero.

$$S(t_e) = \begin{bmatrix} S_{1,1} & 0 & \cdots & 0 \\ S_{2,1} & S_{2,2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ S_{Nd,1} & S_{Nd,2} & \cdots & S_{Nd,Nd} \end{bmatrix} \tag{5}$$

with

$$S_{i,j}(t_e) = \begin{bmatrix} \dfrac{\partial x_{1,i}}{\partial u_{1,j}} & \cdots & \dfrac{\partial x_{1,i}}{\partial u_{Nu,j}} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial x_{Nx,i}}{\partial u_{1,j}} & \cdots & \dfrac{\partial x_{Nx,i}}{\partial u_{Nu,j}} \end{bmatrix}$$
3. Orthogonal collocation on finite elements for state integration and sensitivity computation

In the sequential optimization approach, it has turned out to be efficient to combine the integration of the model equations with the sensitivity computation. In this manner, one can exploit the fact that both solution strategies involve similar matrix calculus, which is mainly related to the system's Jacobian. In the following section, the applied integration technique and the sensitivity computation approach are presented.

3.1. Integration of the model equations

The integration of the differential equations is based on the discretization using OCFE, see Fig. 1. The resulting nonlinear algebraic equation systems are then solved successively for each element. A detailed description of the use of piecewise polynomial approximations for the solution of DAE systems can be found, e.g., in [8, 4, 13, 14, 20]. The method applies a linearly weighted sum of Lagrange polynomials $L$, whose coefficients are the state values in each sub-interval $s$ at all collocation points $c \in [0, Nc]$.

$$x_{d,s}(t_{d,s,c}) = \sum_{j=0}^{Nc} L_{d,j}(t_{d,s,c}) \cdot x_{d,s,j} = x_{d,s,c}$$

$$\frac{d x_{d,s,c}}{dt} = \frac{d x_{d,s}(t_{d,s,c})}{dt} = \sum_{j=0}^{Nc} \frac{d L_{d,j}(t_{d,s,c})}{dt} \cdot x_{d,s,j} \tag{6}$$

A common approach using OCFE is the Radau collocation method. In that approach, one collocation point, $c = 0$, is placed at the beginning of each element. The elements represent here the simulation sub-intervals $s$, see Fig. 1. The discretized equation system (1b) is defined for one simulation sub-interval, including the initial values $x_{d,s,0}$ and the local decision variables $u_d$, as follows:

$$0 = f_{d,s}(x_{d,s,0}, x_{d,s}, u_d) \quad \text{with} \quad x_{d,s} = [x_{d,s,1}, \cdots, x_{d,s,Nc}]^T \tag{7}$$

The solution of Eq. (7) for each element $s$ within the decision interval $d$ satisfies the equation system at all discrete collocation points $c$. Using the continuity conditions, the initial value problem of the present element $s$ depends on the solution of the prior element $s-1$:

$$x_{d,s,0} = x_{d,s-1,Nc}, \quad s \in [2, Ns_d], \ d \in [1, Nd] \quad \text{and} \quad x_{1,1,0} = x(t_{1,1,0}) = x_0 \tag{8}$$
with $x_0$ being the initial value of the equation system (1b). As shown in Eq. (7), the dependent states $x_{d,s}$ do not include the initial value $x_{d,s,0}$. Instead, the continuity condition for each simulation sub-interval is implicitly part of the
discretized equations. Accordingly, during the step-wise integration, Eq. (7) is solved for $x_{d,s}$ with $x_{d,s,0}$ and $u_d$ considered as constant parameters.² The discretization generates block-diagonal elements in the system's Jacobian with the size of the polynomial order $Nc$. The size of the resulting nonlinear equation system (7) is therefore $Nx \cdot Nc$, i.e. the number of states multiplied by the order of the polynomial approximation. Moreover, it should be noted that first order Radau collocation is equivalent to the implicit Euler discretization method.

3.2. Sensitivity computation

It has generally proved efficient to compute the sensitivities concurrently with the element-wise integration of the model equations. This corresponds to the so-called staggered state and sensitivity integration [16], in which local information regarding the states and decisions in the current integration step is considered. In the staggered approach, the iteration matrix used for the integration can be reused for the evaluation of the sensitivities. Furthermore, the current step can be rejected and recalculated with a different step size, e.g. in case of no convergence. In contrast, since the sensitivities depend on the results from the state integration, using a simultaneous state and sensitivity approach would waste computational resources [16]. [14] and [10] presented an approach in which the staggered concept is adapted to the integration based on OCFE. However, in contrast to the discretization concept discussed in section 2.1, no simulation sub-intervals were considered. The length of the decision intervals directly determines the length of the integration intervals (finite elements used for the integration), see Fig. 2.

[Figure 2 (schematic): time axis from $t_0$ to $t_f$ with a single combined simulation ($x_{d,c}$) and decision ($u_d$) grid, $d = 1, \cdots, Nd$, with collocation points $c = 0$ and $c = [1, \cdots, Nc]$.]

Figure 2: Discretization grid when the decision grid coincides with the simulation (sub-interval) grid for $Ns_d = 1$ (see also Fig. 1). Radau collocation is applied for the integration within each decision interval $d$.

The sensitivity computation is based on the differentiation of Eq. (7) w.r.t. the dependent variables $x_d = [x_{d,1}, \cdots, x_{d,Nc}]^T$ as well as the independent variables $x_{d,0}$ and $u_d$:

$$d f_d = \frac{\partial f_d}{\partial x_{d,0}} \, d x_{d,0} + \frac{\partial f_d}{\partial x_d} \, d x_d + \frac{\partial f_d}{\partial u_d} \, d u_d = 0 \tag{9}$$
² It should be pointed out that any discretization scheme which conforms to the aforementioned conditions can be used together with the sensitivity generation described in sections 3.2 and 4.
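The dependence of $x_d$ on the independent variables can be computed by the implicit function theorem:

$$\underbrace{\frac{d x_d}{d u_d}}_{S_{d,d}} = - \underbrace{\left[ \frac{\partial f_d}{\partial x_d} \right]^{-1}}_{J_d^{-1}} \cdot \underbrace{\frac{\partial f_d}{\partial u_d}}_{K_d} \tag{10}$$

$$\underbrace{\frac{d x_d}{d x_{d,0}}}_{T_d} = - \underbrace{\left[ \frac{\partial f_d}{\partial x_d} \right]^{-1}}_{J_d^{-1}} \cdot \underbrace{\frac{\partial f_d}{\partial x_{d,0}}}_{J^0_d} \tag{11}$$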
Eqs. (10, 11) are linear matrix equation systems. They can be transformed into a series of $Nu$ (Eq. (10)) and $Nx$ (Eq. (11)) independent linear equation systems of size $Nx \cdot Nc$. Thus, the LU decomposition of the local Jacobian $J_d$ has to be performed only once. Moreover, whereas the local sensitivities $S_{d,d}$ are obtained directly from Eq. (10), the global sensitivities (see also Eq. (5)) result from the following matrix multiplication:

$$\underbrace{\frac{\partial x_{d=i}}{\partial u_{d=j}}}_{S_{i,j}} = \underbrace{\frac{\partial x_i}{\partial x_{i,0}}}_{T_i} \cdot \underbrace{\frac{\partial x_{i-1,Nc}}{\partial x_{i-1,0}}}_{\hat{T}_{i-1}} \cdot \underbrace{\frac{\partial x_{i-2,Nc}}{\partial x_{i-2,0}}}_{\hat{T}_{i-2}} \cdots \underbrace{\frac{\partial x_{j+1,Nc}}{\partial x_{j+1,0}}}_{\hat{T}_{j+1}} \cdot \underbrace{\frac{\partial x_{j,Nc}}{\partial u_j}}_{S_{j,j}} \tag{12}$$
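The procedure of Eqs. (10-12) can be sketched for a deliberately simple case. The following is a minimal illustration under stated assumptions, not the implementation used in the paper: a linear test system $\dot{x} = A x + B u$ (hypothetical values), implicit Euler (first order Radau collocation, $Nc = 1$, so that $\hat{T}_d = T_d$), and one integration step per decision interval as in Fig. 2. The function names are invented for this sketch.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu


def local_matrices(J, K, J0):
    """Solve Eqs. (10) and (11) with one sparse LU factorization of J_d."""
    lu = splu(sp.csc_matrix(J))
    S_loc = -lu.solve(np.asarray(K, dtype=float))   # Nu right-hand sides
    T = -lu.solve(np.asarray(J0, dtype=float))      # Nx right-hand sides, dense result
    return S_loc, T


def global_sensitivity(T_list, S_jj):
    """Eq. (12): apply T_{j+1}, ..., T_i successively to S_jj."""
    S = S_jj
    for T in T_list:
        S = T @ S
    return S


# Hypothetical linear test system and step size
A = np.array([[-2.0, 1.0], [0.0, -3.0]])
B = np.array([[1.0], [0.5]])
h = 0.1

# Residual f_d = x_d - x_{d,0} - h (A x_d + B u_d) and its derivatives
J = np.eye(2) - h * A      # df/dx_d
K = -h * B                 # df/du_d
J0 = -np.eye(2)            # df/dx_{d,0}

S_loc, T = local_matrices(J, K, J0)
S_31 = global_sensitivity([T, T], S_loc)   # states in interval 3 w.r.t. u_1
```

Even in this small example, `T` is a full $Nx \times Nx$ matrix obtained from $Nx$ back-substitutions, and every propagation step in `global_sensitivity` is a dense product.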
It is apparent from Eqs. (10-12) that the staggered approach can be applied to OCFE. However, the major drawback when integrating large equation systems is the need to evaluate the local derivatives w.r.t. the initial conditions, $T_d$ with $d \in [j+1, i]$. Note that the hat in $\hat{T}_d$ indicates that only the sensitivities at the last collocation point ($c = Nc$) are used. Thus, $\hat{T}_d$ is a sub-matrix of $T_d$, which is required in Eq. (12) to transfer the influence of local sensitivities to posterior intervals. For large systems, the required solution of $Nx$ linear equation systems in Eq. (11) results in a high computational effort. Moreover, the resulting matrices are not sparse, which is due to the coupled equation system (7). Accordingly, the evaluation of Eq. (12) requires dense matrix multiplications of size $Nx \times Nx$, which become computationally expensive for large $Nx$.

4. Alternative decomposition approach

In the following, an alternative approach is proposed for the evaluation of sensitivities, which has turned out to be more efficient when solving large equation systems. It is generally suited for any one-step integration method; in this work, it has been adapted to OCFE. The idea of the staggered state and sensitivity integration is kept. In contrast to the method described in section 3.2, sparse matrix calculus can be fully exploited, which is crucial for the efficient solution of large equation systems. This means that the method does not involve the direct evaluation of the dense matrices $T_i$. Furthermore, the presence of a step size controlled integrator is considered, which results in an arbitrary number of simulation sub-intervals within each decision interval (section 2, Fig. 1).
4.1. Evaluation of local sensitivities

The sensitivity computation within one decision interval $d$ is considered first. This means that the decisions $u$ are constant. Afterwards, it will be shown that the derived computation scheme can easily be extended to the case where the current decision interval $d$ is left and the current sub-interval enters $d+1$. The general idea is to decompose the decisions $u_d$ in order to match the simulation grid. In doing so, decisions $u_{d,s}$ are obtained for each simulation sub-interval $s$. This corresponds to the step-wise integration of the model equations. For the decomposed decisions, $Ns_d$ individual contributions are obtained:

$$u_{d,1} = u_{d,2} = \cdots = u_{d,Ns_d} = u_d \tag{13}$$
Accordingly, the influences of the decisions in each simulation sub-interval can be considered separately. The dependencies of the states in the current simulation sub-interval $s$ on the decomposed decisions read:

$$x_{d,s}(u_{d,1}, \cdots, u_{d,s-1}, u_{d,s}) \tag{14}$$

The total differential reads:

$$d x_{d,s} = \frac{\partial x_{d,s}}{\partial u_{d,1}} \, d u_{d,1} + \cdots + \frac{\partial x_{d,s}}{\partial u_{d,s-1}} \, d u_{d,s-1} + \frac{\partial x_{d,s}}{\partial u_{d,s}} \, d u_{d,s} \tag{15}$$
It can be seen that $x_{d,s}$ is a function of all individual previous contributions $u_{d,1}, \cdots, u_{d,s-1}$ including the current contribution $u_{d,s}$. Based on Eq. (13), the sensitivities w.r.t. previous contributions can be separated from the sensitivities w.r.t. the contribution in the current simulation sub-interval:

$$\underbrace{\frac{d x_{d,s}}{d u_d}}_{S^{loc}_{d,s}} = \underbrace{\frac{\partial x_{d,s}}{\partial u_{d,1}} + \cdots + \frac{\partial x_{d,s}}{\partial u_{d,s-1}}}_{S^{loc,past}_{d,s}} + \underbrace{\frac{\partial x_{d,s}}{\partial u_{d,s}}}_{S^{loc,cur}_{d,s}} \tag{16}$$
Based on the solution of Eq. (7), the influence of the current contribution $u_{d,s}$ reads:

$$0 = f_{d,s}(u_{d,s}, x_{d,s}(u_{d,s})) \tag{17}$$
In order to consider the influences of the previous simulation sub-intervals, $u_{d,s-1}, u_{d,s-2}, \cdots, u_{d,1}$, the continuity conditions in Eq. (8) are applied. Taking into account the respective initial values $x_{d,s,0}, x_{d,s-1,0}, \cdots, x_{d,2,0}$, the following expressions are obtained:

$$\begin{aligned} 0 &= f_{d,s}\big( x_{d,s-1,Nc}(u_{d,s-1}), \ x_{d,s}(x_{d,s-1,Nc}(u_{d,s-1})) \big) \\ 0 &= f_{d,s}\big( x_{d,s-1,Nc}(x_{d,s-2,Nc}(u_{d,s-2})), \ x_{d,s}(x_{d,s-1,Nc}(x_{d,s-2,Nc}(u_{d,s-2}))) \big) \\ &\ \ \vdots \end{aligned} \tag{18}$$
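It can be seen that this can be continued for all other previous decomposed decisions $u_{d,s-3}, \cdots, u_{d,1}$. Differentiation of (17) w.r.t. the local variables yields:

$$0 = \frac{\partial f_{d,s}}{\partial u_{d,s}} + \frac{\partial f_{d,s}}{\partial x_{d,s}} \cdot \frac{\partial x_{d,s}}{\partial u_{d,s}} \tag{19}$$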
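Differentiating Eqs. (18) w.r.t. all previous contributions results in:

$$\begin{aligned} 0 &= \frac{\partial f_{d,s}}{\partial x_{d,s-1,Nc}} \cdot \frac{\partial x_{d,s-1,Nc}}{\partial u_{d,s-1}} + \frac{\partial f_{d,s}}{\partial x_{d,s}} \cdot \frac{\partial x_{d,s}}{\partial x_{d,s-1,Nc}} \cdot \frac{\partial x_{d,s-1,Nc}}{\partial u_{d,s-1}} \\ 0 &= \frac{\partial f_{d,s}}{\partial x_{d,s-1,Nc}} \cdot \frac{\partial x_{d,s-1,Nc}}{\partial x_{d,s-2,Nc}} \cdot \frac{\partial x_{d,s-2,Nc}}{\partial u_{d,s-2}} + \frac{\partial f_{d,s}}{\partial x_{d,s}} \cdot \frac{\partial x_{d,s}}{\partial x_{d,s-1,Nc}} \cdot \frac{\partial x_{d,s-1,Nc}}{\partial x_{d,s-2,Nc}} \cdot \frac{\partial x_{d,s-2,Nc}}{\partial u_{d,s-2}} \\ &\ \ \vdots \end{aligned} \tag{20}$$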
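For all derivatives in Eqs. (20) w.r.t. all $u_{d,s-k}$ with $k > 1$, we can write:

$$\begin{aligned} 0 = {} & \frac{\partial f_{d,s}}{\partial x_{d,s-1,Nc}} \cdot \prod_{j=1}^{k-1} \frac{\partial x_{d,s-j,Nc}}{\partial x_{d,s-1-j,Nc}} \cdot \frac{\partial x_{d,s-k,Nc}}{\partial u_{d,s-k}} \\ & + \frac{\partial f_{d,s}}{\partial x_{d,s}} \cdot \frac{\partial x_{d,s}}{\partial x_{d,s-1,Nc}} \cdot \prod_{j=1}^{k-1} \frac{\partial x_{d,s-j,Nc}}{\partial x_{d,s-1-j,Nc}} \cdot \frac{\partial x_{d,s-k,Nc}}{\partial u_{d,s-k}} \end{aligned} \tag{21}$$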
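Parts of the product, which result from the chain rule, can be substituted as follows:

$$P_k = \begin{cases} \displaystyle\prod_{j=1}^{k-1} \frac{\partial x_{d,s-j,Nc}}{\partial x_{d,s-1-j,Nc}} & \text{for } k > 1 \\ I_{Nx \times Nx} & \text{for } k = 1 \end{cases} \tag{22}$$

Accordingly, the general expression for all derivatives with respect to all $u_{d,s-k}$ with $k \ge 1$ in Eqs. (20) reads:

$$0 = \frac{\partial f_{d,s}}{\partial x_{d,s-1,Nc}} \cdot P_k \cdot \frac{\partial x_{d,s-k,Nc}}{\partial u_{d,s-k}} + \frac{\partial f_{d,s}}{\partial x_{d,s}} \cdot \frac{\partial x_{d,s}}{\partial x_{d,s-1,Nc}} \cdot P_k \cdot \frac{\partial x_{d,s-k,Nc}}{\partial u_{d,s-k}} \tag{23}$$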
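After summing up all derivatives w.r.t. the local $u_{d,s}$ and all previous contributions $u_{d,s-k}$ for $k \ge 1$, one obtains:

$$\begin{aligned} 0 = {} & \frac{\partial f_{d,s}}{\partial x_{d,s}} \cdot \frac{\partial x_{d,s}}{\partial u_{d,s}} + \frac{\partial f_{d,s}}{\partial u_{d,s}} \\ & + \frac{\partial f_{d,s}}{\partial x_{d,s-1,Nc}} \cdot \sum_{k=1}^{s-1} P_k \cdot \frac{\partial x_{d,s-k,Nc}}{\partial u_{d,s-k}} \\ & + \frac{\partial f_{d,s}}{\partial x_{d,s}} \cdot \frac{\partial x_{d,s}}{\partial x_{d,s-1,Nc}} \cdot \sum_{k=1}^{s-1} P_k \cdot \frac{\partial x_{d,s-k,Nc}}{\partial u_{d,s-k}} \end{aligned} \tag{24}$$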
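This can be rewritten as follows:

$$\begin{aligned} & \underbrace{\left[ \frac{\partial f_{d,s}}{\partial x_{d,s}} \right]}_{J_{d,s}} \cdot \Bigg[ \underbrace{\frac{\partial x_{d,s}}{\partial u_{d,s}}}_{S^{loc,cur}_{d,s}} + \underbrace{\frac{\partial x_{d,s}}{\partial x_{d,s-1,Nc}} \cdot \sum_{k=1}^{s-1} P_k \cdot \frac{\partial x_{d,s-k,Nc}}{\partial u_{d,s-k}}}_{S^{loc,past}_{d,s}} \Bigg] \\ & \qquad = - \Bigg[ \underbrace{\frac{\partial f_{d,s}}{\partial u_{d,s}}}_{K_{d,s}} + \underbrace{\frac{\partial f_{d,s}}{\partial x_{d,s-1,Nc}}}_{J^0_{d,s}} \cdot \underbrace{\sum_{k=1}^{s-1} P_k \cdot \frac{\partial x_{d,s-k,Nc}}{\partial u_{d,s-k}}}_{\hat{S}^{loc}_{d,s-1}} \Bigg] \end{aligned} \tag{25}$$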
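To see why the sum over $k$ in Eq. (25) can be identified with $\hat{S}^{loc}_{d,s-1}$, it helps to write it out for $s = 3$. This is a worked instance based on Eqs. (16) and (22), added for illustration rather than taken from the original derivation:

$$\sum_{k=1}^{2} P_k \cdot \frac{\partial x_{d,3-k,Nc}}{\partial u_{d,3-k}} = \frac{\partial x_{d,2,Nc}}{\partial u_{d,2}} + \frac{\partial x_{d,2,Nc}}{\partial x_{d,1,Nc}} \cdot \frac{\partial x_{d,1,Nc}}{\partial u_{d,1}} = \frac{\partial x_{d,2,Nc}}{\partial u_{d,2}} + \frac{\partial x_{d,2,Nc}}{\partial u_{d,1}} = \hat{S}^{loc}_{d,2}$$

The first term uses $P_1 = I_{Nx \times Nx}$ and the second uses $P_2 = \partial x_{d,2,Nc} / \partial x_{d,1,Nc}$. The sum is therefore the total sensitivity of the last collocation point of sub-interval 2 w.r.t. $u_d$, a quantity that is already known once the previous step has been processed.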
As discussed before, the calculation of the matrices $P_k$ is computationally expensive (see section 3.2). However, in this work, an explicit calculation of these matrices can be avoided efficiently, as shown below. According to Eq. (16), the following integration formula is obtained:

$$[J_{d,s}] \cdot S^{loc}_{d,s} = - \left[ K_{d,s} + J^0_{d,s} \cdot \hat{S}^{loc}_{d,s-1} \right] \tag{26}$$
It can be seen that $S^{loc}_{d,s}$ already represents the desired result, which comprises the sensitivities of $x_{d,s}$ in the current simulation sub-interval w.r.t. the contributions of all (current and previous) decomposed decisions (see Fig. 3, left). It is a function of $\hat{S}^{loc}_{d,s-1}$, which is a sub-matrix of the previous solution $S^{loc}_{d,s-1}$. Accordingly, only information w.r.t. the last collocation point is used. Eq. (26) is evaluated concurrently with the element-wise solution of the model equations (7). For the first simulation sub-interval $s = 1$, the local sensitivities in Eq. (26) are initialized with:

$$\hat{S}^{loc}_{d,s=0} = \hat{S}^{loc}_{d,s-1} = 0_{Nx \times Nu} \tag{27}$$

[Figure 3 (schematic): left panel, one decision interval $d$; right panel, decision intervals $j$ and $d$.]

Figure 3: Algorithmic implementation for the evaluation of local (left) and global (right) sensitivities.
Subsequently, after reaching the last simulation sub-interval $s = Ns_d$, the local sensitivities $S_{d,d}$ are determined as defined in Eq. (5):

$$S_{d,d} = S^{loc}_{d,Ns_d} = \frac{\partial x_{d,Ns_d}}{\partial u_{d,Ns_d}} + \frac{\partial x_{d,Ns_d}}{\partial x_{d,Ns_d-1,Nc}} \cdot \sum_{k=1}^{Ns_d-1} P_k \cdot \frac{\partial x_{d,Ns_d-k,Nc}}{\partial u_{d,Ns_d-k}} \tag{28}$$
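The recursion of Eqs. (26-28) can be sketched for one decision interval as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-state model `g` is hypothetical, the integrator is implicit Euler (first order Radau collocation, $Nc = 1$, so that $\hat{S}^{loc}_{d,s} = S^{loc}_{d,s}$), the step sizes are prescribed instead of controlled, and the function names are invented.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu


def g(x, u):
    """Hypothetical test model x' = g(x, u); not taken from the paper."""
    return np.array([-u * x[0] ** 2, u * x[0] ** 2 - 0.5 * x[1]])


def g_x(x, u):
    """Jacobian of g w.r.t. the states."""
    return np.array([[-2.0 * u * x[0], 0.0], [2.0 * u * x[0], -0.5]])


def g_u(x, u):
    """Jacobian of g w.r.t. the decision (Nx x Nu with Nu = 1)."""
    return np.array([[-x[0] ** 2], [x[0] ** 2]])


def integrate_interval(x0, u, steps):
    """Staggered state and sensitivity integration over one decision interval.

    Each step solves f = x - x_prev - h g(x, u) = 0 (Eq. (7)) by Newton's
    method and then evaluates Eq. (26) with the Jacobian at the solution.
    """
    nx = x0.size
    x_prev = np.array(x0, dtype=float)
    S_hat = np.zeros((nx, 1))                  # initialization, Eq. (27)
    J0 = -sp.identity(nx, format="csr")        # df/dx_{d,s,0}, kept sparse
    for h in steps:
        x = x_prev.copy()
        for _ in range(50):                    # Newton iteration on the states
            J = sp.csc_matrix(np.eye(nx) - h * g_x(x, u))
            dx = splu(J).solve(x - x_prev - h * g(x, u))
            x = x - dx
            if np.linalg.norm(dx) < 1e-13:
                break
        lu = splu(sp.csc_matrix(np.eye(nx) - h * g_x(x, u)))  # J_{d,s}
        K = -h * g_u(x, u)                                    # K_{d,s}
        S_hat = lu.solve(-(K + J0 @ S_hat))                   # Eq. (26)
        x_prev = x
    return x_prev, S_hat                       # S_hat equals S_{d,d}, Eq. (28)


x0 = np.array([1.0, 0.0])
steps = [0.05, 0.1, 0.2]                       # assumed simulation grid
u = 0.8
x_end, S_dd = integrate_interval(x0, u, steps)
```

Per step, only one sparse product $J^0_{d,s} \cdot \hat{S}^{loc}_{d,s-1}$ with $Nu$ columns and $Nu$ back-substitutions are needed; no $Nx \times Nx$ matrix $T_d$ or $P_k$ is formed. The result can be compared with a central finite difference on the same fixed grid, which it should match up to the difference approximation error, since the sensitivities are exact for the discretized system.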
The evaluation of local sensitivities for each decision interval d starts with the initialization of the first simulation sub-interval s = 1 with Eq. (27), followed by the evaluation of Eq. (26) after each successful integration step in Eq. (7).

4.2. Evaluation of global sensitivities

The global sensitivities $S_{i=d,j}$ with i > j in Eq. (5) denote the sensitivities of the states w.r.t. the decision variables in past decision intervals. They are computed by taking the already evaluated sensitivities and transferring their influence to subsequent simulation sub-intervals. The influence of current decisions is not taken into account here; only dependencies of $x_{d,s}$ on the decisions $u_j$ are considered. Thus, using the same procedure as in Eqs. (18-20), the following expression is obtained for the simulation sub-interval s = 2:

$$
0 = \frac{\partial f_{d,s}}{\partial x_{d,s-1,N_c}} \cdot \frac{\partial x_{d,s-1,N_c}}{\partial x_{d,s-2,N_c}} \cdot \frac{\partial x_{d-1,N_{s_{d-1}},N_c}}{\partial u_j}
+ \frac{\partial f_{d,s}}{\partial x_{d,s}} \cdot \frac{\partial x_{d,s}}{\partial x_{d,s-1,N_c}} \cdot \frac{\partial x_{d,s-1,N_c}}{\partial x_{d,s-2,N_c}} \cdot \frac{\partial x_{d-1,N_{s_{d-1}},N_c}}{\partial u_j}
\tag{29}
$$
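In the general case, for any s ≥ 1, the expression reads:

$$
0 = \frac{\partial f_{d,s}}{\partial x_{d,s-1,N_c}} \cdot \prod_{j=1}^{s-1} \frac{\partial x_{d,s-j,N_c}}{\partial x_{d,s-1-j,N_c}} \cdot \frac{\partial x_{d-1,N_{s_{d-1}},N_c}}{\partial u_j}
+ \frac{\partial f_{d,s}}{\partial x_{d,s}} \cdot \frac{\partial x_{d,s}}{\partial x_{d,s-1,N_c}} \cdot \prod_{j=1}^{s-1} \frac{\partial x_{d,s-j,N_c}}{\partial x_{d,s-1-j,N_c}} \cdot \frac{\partial x_{d-1,N_{s_{d-1}},N_c}}{\partial u_j}
\tag{30}
$$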
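The notation of $P_k$ from Eq. (22) is again adopted for $P_{k=s}$. It then follows:

$$
0 = \frac{\partial f_{d,s}}{\partial x_{d,s-1,N_c}} \cdot P_s \cdot \frac{\partial x_{d-1,N_{s_{d-1}},N_c}}{\partial u_j}
+ \frac{\partial f_{d,s}}{\partial x_{d,s}} \cdot \frac{\partial x_{d,s}}{\partial x_{d,s-1,N_c}} \cdot P_s \cdot \frac{\partial x_{d-1,N_{s_{d-1}},N_c}}{\partial u_j}
\tag{31}
$$

This expression can be reformulated to:

$$
\underbrace{\left[\frac{\partial f_{d,s}}{\partial x_{d,s}}\right]}_{J_{d,s}} \cdot
\underbrace{\left[\frac{\partial x_{d,s}}{\partial x_{d,s-1,N_c}} \cdot P_s \cdot \frac{\partial x_{d-1,N_{s_{d-1}},N_c}}{\partial u_j}\right]}_{S^{glob}_{(d,j),s}}
= -\Bigg[\underbrace{\frac{\partial f_{d,s}}{\partial x_{d,s-1,N_c}}}_{J^{0}_{d,s}} \cdot
\underbrace{P_s \cdot \frac{\partial x_{d-1,N_{s_{d-1}},N_c}}{\partial u_j}}_{\hat S^{glob}_{(d,j),s-1}}\Bigg]
\tag{32}
$$

Thus, an integration formula is obtained, which is also graphically depicted in Fig. 3, right:

$$
[J_{d,s}] \cdot [S^{glob}_{(d,j),s}] = -[J^{0}_{d,s} \cdot \hat S^{glob}_{(d,j),s-1}]
\tag{33}
$$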
For d > 1, Eq. (33) has to be evaluated for all j ∈ [1, d − 1]. This can be done jointly with the evaluation of the local sensitivities in Eq. (26) after each successful integration step of Eq. (7). For the first simulation sub-interval s = 1 in every decision interval d, the global sensitivities are initialized in the following way:

$$
\hat S^{glob}_{(d,j),s=0} = \hat S^{glob}_{(d,j),s-1} =
\begin{cases}
\hat S^{loc}_{d-1,N_{s_{d-1}}} & j = d-1 \\
\hat S^{glob}_{(d-1,j),N_{s_{d-1}}} & \text{else}
\end{cases}
\tag{34}
$$

Similar to Eq. (28), one can write for $s = N_{s_d}$:

$$
S_{d,j} = S^{glob}_{(d,j),N_{s_d}}
= \frac{\partial x_{d,N_{s_d}}}{\partial x_{d,N_{s_d}-1,N_c}} \cdot P_s \cdot \frac{\partial x_{d-1,N_{s_{d-1}},N_c}}{\partial u_j}
\tag{35}
$$
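The interplay of Eqs. (26), (27), (33) and (34) can be sketched on a scalar toy problem, dx/dt = -u·x with piecewise constant decisions. Implicit Euler stands in for OCFE as the simplest one-step method; the function name `shoot_with_sensitivities`, the step size and the control values are illustrative assumptions and are not taken from the paper. The sketch only mirrors the bookkeeping of local and global sensitivities.

```python
def shoot_with_sensitivities(x0, controls, n_sub, h):
    """Single shooting for dx/dt = -u*x with piecewise-constant controls.

    Each decision interval d is integrated with n_sub implicit-Euler steps
    of size h. Returns the states at the interval ends and the lower
    triangular list S[d][j] = d x(end of interval d) / d u_j.
    """
    x = x0
    s_hat = []           # s_hat[j]: sensitivity of the current state w.r.t. u_j
    states, S = [], []
    for d, u in enumerate(controls):
        # Eq. (27): the local sensitivity of a new interval starts at zero.
        # Eq. (34): entries j < d are carried over from the end of interval d-1.
        s_hat.append(0.0)
        for _ in range(n_sub):
            # State step: solve f = x_new - x_old + h*u*x_new = 0 for x_new.
            x = x / (1.0 + h * u)
            J = 1.0 + h * u      # df/dx_new
            K = h * x            # df/du, evaluated at the new state
            J0 = -1.0            # df/dx_old
            # Eq. (26): local sensitivity w.r.t. the current decision u_d.
            s_hat[d] = -(K + J0 * s_hat[d]) / J
            # Eq. (33): global sensitivities w.r.t. past decisions u_j, j < d.
            for j in range(d):
                s_hat[j] = -(J0 * s_hat[j]) / J
        states.append(x)
        S.append(list(s_hat))    # row d of the block matrix in Eq. (5)
    return states, S


controls = [0.5, 2.0, 1.0]
states, S = shoot_with_sensitivities(x0=1.0, controls=controls, n_sub=10, h=0.05)
```

For this toy problem the discrete solution is known in closed form, so each entry `S[d][j]` can be compared with `-n_sub * h * states[d] / (1 + h * controls[j])`. In the matrix case the divisions by `J` become linear solves with the Jacobian of the current integration step.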
On the basis of the relations derived above, sensitivities can be evaluated using a step-wise integration based on the OCFE method. All necessary sensitivity information in Eq. (5) can be generated and thus used for single shooting algorithms in dynamic optimization. In the following section, the algorithmic implementation and performance data are discussed.

5. Algorithmic implementation and performance

5.1. Algorithmic details

The algorithm has been implemented in the solver sDACl, a sparse DAE solver based on OCFE. As discussed in section 3.1, the dynamic model is integrated using an orthogonal collocation based discretization along with the element-wise solution of the discretized nonlinear equation system (7). The size of the equation system, Nc·Nx, depends on the order of the polynomial approximation. For general large dynamic equation systems, such as those in chemical engineering, the Jacobian $J_{d,s}$ has a sparse, partly unordered structure. To exploit this fact, the general sparse nonlinear equation solver NLEQ1S ([15]) is used.

The sensitivities $S^{loc}_{d,s}$ and $S^{glob}_{(d,j),s}$ are evaluated after each successful integration step according to Eqs. (26, 33). The linear matrix equation system is set up efficiently by exploiting the sparsity of the derivatives $J_{d,s}$, $J^{0}_{d,s}$, and $K_{d,s}$. If it is not already available from the solution of the current integration step, the exact LU factorization of the sparse local Jacobian $J_{d,s}$ has to be performed only once. The sparse linear matrix equation system is decomposed into a series of Nu systems of sparse linear equations, which are then solved repeatedly using the same LU factorization. The effort required for the evaluation of the local sensitivities $S^{loc}_{d,s}$ within one decision interval increases linearly with the number of decisions Nu. This has already been shown for different approaches to the integration of sensitivities ([19]).
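The reuse of a single factorization for all Nu right-hand sides of Eq. (26) can be sketched as follows. As a deliberately simple sparse pattern, the sketch assumes a tridiagonal, diagonally dominant Jacobian that can be factorized without pivoting, and a diagonal $J^0$; both are assumptions made for illustration, whereas the paper relies on general sparse LU routines. All function names and numbers are hypothetical.

```python
def factor_tridiagonal(lower, diag, upper):
    """LU-factor a tridiagonal matrix once (no pivoting).

    lower[i] = A[i][i-1] (lower[0] unused), diag[i] = A[i][i],
    upper[i] = A[i][i+1] (upper[-1] unused).
    """
    n = len(diag)
    mult = [0.0] * n             # sub-diagonal of the unit lower factor L
    pivot = [0.0] * n            # diagonal of the upper factor U
    pivot[0] = diag[0]
    for i in range(1, n):
        mult[i] = lower[i] / pivot[i - 1]
        pivot[i] = diag[i] - mult[i] * upper[i - 1]
    return mult, pivot, upper


def solve_factored(factors, rhs):
    """Forward and back substitution with an existing factorization."""
    mult, pivot, upper = factors
    n = len(rhs)
    y = [0.0] * n
    y[0] = rhs[0]
    for i in range(1, n):
        y[i] = rhs[i] - mult[i] * y[i - 1]
    x = [0.0] * n
    x[n - 1] = y[n - 1] / pivot[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - upper[i] * x[i + 1]) / pivot[i]
    return x


def sensitivity_step(J_bands, K, J0_diag, S_hat_prev):
    """Solve J * S = -(K + J0 * S_hat_prev) column by column, cf. Eq. (26)."""
    factors = factor_tridiagonal(*J_bands)         # performed only once
    n, n_u = len(K), len(K[0])
    S = [[0.0] * n_u for _ in range(n)]
    for c in range(n_u):                           # Nu substitutions
        rhs = [-(K[i][c] + J0_diag[i] * S_hat_prev[i][c]) for i in range(n)]
        column = solve_factored(factors, rhs)
        for i in range(n):
            S[i][c] = column[i]
    return S


# Illustrative data: Nx = 4 states, Nu = 2 decisions.
J_bands = ([0.0, -1.0, -1.0, -1.0], [4.0, 4.0, 4.0, 4.0], [-1.0, -1.0, -1.0, 0.0])
K = [[1.0, 0.0], [0.0, 2.0], [0.5, 0.5], [0.0, 1.0]]
J0_diag = [-1.0, -1.0, -1.0, -1.0]
S_hat_prev = [[0.1, 0.0], [0.0, 0.2], [0.3, 0.0], [0.0, 0.4]]
S = sensitivity_step(J_bands, K, J0_diag, S_hat_prev)
```

The factorization cost is paid once per integration step, while each additional decision only adds one forward and one back substitution, which is where the linear dependence on Nu comes from.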
In comparison to the plain state integration, the sensitivity evaluation comprises Nu additional back substitutions for the LU decomposition of the Jacobian, which is performed just once. The states and sensitivities at the specified evaluation points te (see Fig. 1) are obtained by interpolation based on the polynomial approximations used for the discretization in Eq. (6). The generic algorithm for the state and sensitivity integration for single shooting with Nd > 1 is shown below in Alg. (1).

Algorithm 1
initialize starting $x_{1,1,0} = x_0$, Eq. (8)
for d = 1, ..., Nd
  initialize local $\hat S^{loc}_{d,s=0} = 0_{N_x \times N_u}$, Eq. (27)
  if (d > 1)
    for j = 1, ..., d − 1
      initialize global $\hat S^{glob}_{(d,j),s=0} = \hat S^{glob}_{(d,j),s-1}$, Eq. (34)
  for s = 1, ..., $N_{s_d}$
    integrate Eq. (7) using $x_{d,s,0}$ and get $x_{d,s}$
    compute Eq. (26) using $\hat S^{loc}_{d,s-1}$ and get local $S^{loc}_{d,s}$
    for j = 1, ..., d − 1
      compute Eq. (33) using $\hat S^{glob}_{(d,j),s-1}$ and get global $S^{glob}_{(d,j),s}$
    if $t_s \le t_e \le t_s + \Delta t_s$
      interpolate at $t_e$ and get $x_i(t_e)$, $\partial x_i(t_e)/\partial u_d$, Eq. (3)
    store $x_{d,s+1,0} = x_{d,s,N_c}$, $\hat S^{loc}_{d,s}$ and $\hat S^{glob}_{(d,j),s}$ ∀ j ∈ [1, d − 1]
  store $S_{d,d} = S^{loc}_{d,N_{s_d}}$ and $S_{d,j} = S^{glob}_{(d,j),N_{s_d}}$ ∀ j ∈ [1, d − 1], Eq. (5)

5.2. Generation of local and global sensitivities for single shooting

In case of d > 1, local sensitivities $S_{i,i}$ and also global sensitivities $S_{i,j}$ have to be evaluated for j = 1, ..., i − 1. Fig. 4 shows the adopted procedure when Nd = 4 decision intervals are considered.

Figure 4: Algorithmic implementation of the state and sensitivity integration for single shooting when d > 1. [State integration and sensitivity evaluation over the simulation sub-intervals from t0 to tf; blocks evaluated per decision interval: d = 1: S1,1; d = 2: S2,1, S2,2; d = 3: S3,1, S3,2, S3,3; d = 4: S4,1, S4,2, S4,3, S4,4.]

From the sensitivity matrix in Eq. (5) it can be seen that the total number of global sensitivities increases quadratically with the number of decision intervals:

$$
\dim(S_{i,j}) = (N_d^2 - N_d)/2
\tag{36}
$$
It has to be noted that this is a direct consequence of the application of single shooting. However, the quadratic increase in Eq. (36) does not necessarily lead to a quadratic increase in the computational effort, because only parts of the whole optimization time horizon have to be integrated for the evaluation of $S_{i,j}$, see Fig. 4. Accordingly, the computational effort depends directly on the spacing of the decision intervals, i.e. their individual time lengths. In order to show the general aspects, an equidistant spacing of all decisions $u_d$ is assumed. In comparison to the case with a single decision interval (d = 1), the relative increase r in the number of necessary sensitivity evaluations is then linear:

$$
r = \frac{1}{N_d} \cdot \sum_{d=1}^{N_d} d = 0.5 \cdot (N_d + 1) \approx 0.5 \cdot N_d
\tag{37}
$$
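The counts behind Eqs. (36) and (37) can be checked with a few lines of arithmetic. The sketch assumes equidistant decision intervals and counts, for every decision interval d, one local and d − 1 global sensitivity blocks, as in Algorithm 1; the function name is hypothetical. Evaluated exactly, the sum in Eq. (37) gives 0.5·(Nd + 1), which approaches 0.5·Nd for large Nd.

```python
def sensitivity_block_counts(n_d):
    """Count sensitivity blocks for n_d equidistant decision intervals."""
    # Global blocks S_{i,j} with i > j in the lower triangular matrix, Eq. (36).
    n_global = sum(i - 1 for i in range(1, n_d + 1))
    # In decision interval d, d blocks are propagated: one local, d-1 global.
    interval_evaluations = sum(d for d in range(1, n_d + 1))
    # Reference: one block propagated over the same horizon (n_d interval lengths).
    ratio = interval_evaluations / n_d
    return n_global, ratio


counts = {n_d: sensitivity_block_counts(n_d) for n_d in (1, 4, 16)}
```

For Nd = 4 this yields 6 global blocks and a ratio of 2.5, so the number of blocks grows quadratically while the relative effort grows only linearly.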
Note that Eq. (37) marks the maximal theoretical increase. Since the solution of the matrix equation system in Eq. (33) involves similar steps, and thus further information can be reused, the practical increase is linear but much smaller (see also Fig. 11).

5.3. Implementation

All computations were done in double precision using the Intel FORTRAN compiler for Microsoft Visual Studio (MVS 2005) on a Windows platform, using an AMD Athlon 64 X2 3800+ personal computer with 1 GB RAM. The function tolerance of the nonlinear equation solver NLEQ1S (FORTRAN 77 implementation) was 10^-10. The LU factorization and the solution of general sparse linear systems were performed using the routines LFTXG and LFSXG, both from the IMSL Fortran Library V6.0. In the following sections, two case studies are carried out in order to assess the feasibility and performance of the developed approaches. The computation time needed for one state integration and sensitivity generation is used as the performance measure.

6. Case studies

6.1. Case study I: Estimation of adsorption parameters in liquid chromatography

The first case study is related to the modeling of high performance liquid chromatography (HPLC). Pulse experiments of the inlet concentrations are performed in order to determine the specific adsorption parameters by solving a parameter estimation problem. The separation of a two-component mixture is considered; geometric data and operating parameters are given in Appendix A, Tab. A.4.
Figure 5: Schematic view of the chromatographic column and balance volumes used for the mathematical model.
6.1.1. Process model and adsorption parameters

A comprehensive and realistic chromatographic column model for preparative and large-scale chromatography is the so-called general multicomponent rate model. It includes different mass transfer mechanisms and can be used with general nonlinear and multicomponent isotherms. Figure 5 depicts the balance volumes used for the mathematical modeling. This results in a coupled PDE system with two sets of mass balance equations for each component i ∈ [1, Ncp]: one for the bulk fluid phase (fl) in axial direction z, and one for the particle phase (p) in radial direction r.

$$
\begin{aligned}
\frac{\partial c_{fl,i}}{\partial t} &= D_{ax,i} \cdot \frac{\partial^2 c_{fl,i}}{\partial z^2} - u \cdot \frac{\partial c_{fl,i}}{\partial z} - \frac{3 \cdot k_{film,i} \cdot (1 - \varepsilon_{tot})}{\varepsilon_{tot} \cdot R_p} \cdot \left( c_{fl,i} - c_{p,i}(r = R_p) \right) \\
\varepsilon_{int} \cdot \frac{\partial c_{p,i}}{\partial t} + (1 - \varepsilon_{int}) \cdot \frac{\partial q_i(c_{fl,i})}{\partial t} &= \varepsilon_{int} \cdot D_i \cdot \frac{1}{r^2} \cdot \frac{\partial}{\partial r}\left( r^2 \cdot \frac{\partial c_{p,i}}{\partial r} \right)
\end{aligned}
\tag{38}
$$

In Eq. (38), three transport phenomena are considered through their respective coefficients: axial dispersion ($D_{ax}$), external film mass transfer ($k_{film}$), and intraparticle diffusion ($D$). Moreover, u represents the interstitial fluid velocity, $R_p$ the particle radius, and $\varepsilon_{int}$ and $\varepsilon_{tot}$ the internal particle porosity and the total porosity, respectively. The relation $q_i = f(c_{fl,i})$ represents the adsorption equilibrium. In this work, the following multicomponent Langmuir isotherm is used:

$$
q_i = \frac{a_i \cdot c_{fl,i}}{1 + \sum_{j=1}^{N_{cp}} b_j \cdot c_{fl,j}}
\tag{39}
$$
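A minimal sketch of Eq. (39) is given here to make the competitive coupling between the components explicit. The parameter values are the "true values" listed in Table 1; the concentrations and the function name `langmuir_loading` are illustrative assumptions, and units are deliberately left out.

```python
def langmuir_loading(c, a, b):
    """Multicomponent Langmuir isotherm, Eq. (39).

    q_i = a_i * c_i / (1 + sum_j b_j * c_j) for all components i.
    """
    if not (len(c) == len(a) == len(b)):
        raise ValueError("c, a and b must have one entry per component")
    denominator = 1.0 + sum(b_j * c_j for b_j, c_j in zip(b, c))
    return [a_i * c_i / denominator for a_i, c_i in zip(a, c)]


# "True" parameter values from Table 1; the concentrations are illustrative.
a_true = [1.2, 8.0]
b_true = [1.5, 10.0]
q = langmuir_loading(c=[0.1, 0.1], a=a_true, b=b_true)
```

Because all components share the same denominator, adding the second component lowers the loading of the first one, which is why the parameters of both components have to be estimated together.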
Table 1: Estimated adsorption parameters in (42).

Langmuir adsorption parameters   true values   initial guess   lower bounds   upper bounds
a1                               1.2           5.4             0.0            50.0
a2                               8.0           1.0             0.0            50.0
b1                               1.5           1.1             0.0            50.0
b2                               10.0          5.5             0.0            50.0

An efficient discretization in space is obtained with the Galerkin finite element (GFE) method with second-order polynomials for the bulk fluid phase, and orthogonal collocation (OC) with third-order polynomials for the particle phase equation ([9]). The resulting implicitly defined differential equation system is of the type:

$$
M(x, p) \cdot \dot x = f(x, t)
\tag{40}
$$

where $p = [a, b]^T$ are the above-mentioned adsorption parameters. The system dimension depends on the number of intervals used for the discretization in axial direction. Note that using OC for the particle phase means that only one element is used in radial direction. Since the axial concentration profiles can be very steep, in particular at the column entry, the robustness and stability of the numerical solution of Eq. (38) depend strongly on the number of elements used for the discretization in axial direction. Especially for parameter estimation, with repeated simulations for different parameter values, a stable numerical solution without oscillations is crucial. Thus, a relatively high number of elements (Ngfe = 50) has been used in the GFE method for the axial discretization of the bulk fluid phase. With Ncp denoting the number of components, the number of equations of the differential equation system is:

$$
N_{eq} = (N_{gfe} \cdot 2 + 1) \cdot 3 \cdot N_{cp}
\tag{41}
$$
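Eq. (41) can be evaluated directly to see how the problem size scales. The combinations (Ngfe = 30, Ncp = 1) and (Ngfe = 100, Ncp = 3) used below are inferred here because they reproduce the sizes of 183 and 1809 equations quoted in Section 6.1.3; the text itself does not state them. The function name is hypothetical.

```python
def number_of_equations(n_gfe, n_cp):
    """Size of the semi-discretized system according to Eq. (41)."""
    return (2 * n_gfe + 1) * 3 * n_cp


sizes = {
    (50, 2): number_of_equations(50, 2),    # two components, Ngfe = 50
    (30, 1): number_of_equations(30, 1),    # inferred smallest case
    (100, 3): number_of_equations(100, 3),  # inferred largest case
}
```

The size grows linearly in both Ngfe and Ncp, so refining the axial grid and adding components multiply each other's effect on the number of equations.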
6.1.2. Parameter estimation problem

For the experimental determination of the specific adsorption parameters in Eq. (39), pulse experiments of the inlet fluid concentrations $c^{in}_{fl,i}$ are performed. The evolution of the outlet concentrations $c^{out}_{fl,i}$ is measured over a time horizon of 10 min. Here, Ne = 100 equidistant measurements are taken, all of which are assumed to be exact and free of noise. The measurement time points are defined as evaluation points in the parameter estimation problem, and the model output $h(p, t_e)$ is fitted to the measurements $y^m(t_e)$. The corresponding bounded optimization problem for the determination of the parameters $p = [a_1, a_2, b_1, b_2]^T$ reads:

$$
\begin{aligned}
\min_{p} \quad & \Phi = \frac{1}{2} \sum_{e=1}^{N_e} \left( y^m(t_e) - h(p, t_e) \right)^2 \\
\text{s.t.} \quad & p^{min} \le p \le p^{max}
\end{aligned}
\tag{42}
$$
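The structure of problem (42) can be sketched with a stand-in model. The exponential `toy_model` below is a hypothetical replacement for the chromatography model output h(p, t_e), and the two parameter values are borrowed from Table 1 purely as illustrative numbers; only the form of the objective and of the bounds follows the text.

```python
import math


def least_squares_objective(p, t_eval, y_meas, model):
    """Phi(p) = 1/2 * sum_e (y_m(t_e) - h(p, t_e))**2, cf. Eq. (42)."""
    return 0.5 * sum((y - model(p, t)) ** 2 for t, y in zip(t_eval, y_meas))


def within_bounds(p, p_min, p_max):
    """Check the box constraints p_min <= p <= p_max of problem (42)."""
    return all(lo <= v <= hi for v, lo, hi in zip(p, p_min, p_max))


def toy_model(p, t):
    """Hypothetical stand-in for h(p, t): a single decaying exponential."""
    amplitude, rate = p
    return amplitude * math.exp(-rate * t)


# 100 equidistant, noise-free "measurements" over 10 min, as in the text.
t_eval = [10.0 * e / 100 for e in range(1, 101)]
p_true = [1.2, 1.5]
p_guess = [5.4, 1.1]
y_meas = [toy_model(p_true, t) for t in t_eval]

phi_true = least_squares_objective(p_true, t_eval, y_meas, toy_model)
phi_guess = least_squares_objective(p_guess, t_eval, y_meas, toy_model)
feasible = within_bounds(p_guess, [0.0, 0.0], [50.0, 50.0])
```

With noise-free data the objective vanishes at the true parameters. A gradient-based NLP solver would additionally need the derivative of h with respect to p at every evaluation point, which is exactly the sensitivity information the integration scheme provides.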
The related initial and optimal parameter values, as well as their respective bounds, are given in Table 1. Figure 6 shows the iterative solution of problem (42) for both components, starting from the initial parameter guess (It.0) up to iteration 10 (It.10), where the objective function reaches a value of Φ = 7.5E-09.
Figure 6: Fitting of liquid outlet concentrations: iterative solution of the parameter estimation problem starting from the initial parameter guess It.0. Component 1 (left); Component 2 (right). [Axes: outlet concentration $c^{out}_{fl,i}$ [mM] over time t [min]; model curves for iterations It.0 to It.10 are compared with the measurements.]
6.1.3. Performance evaluation

The differential equation system in (40) is highly stiff and exhibits steep concentration profiles in time and space. Using high-order approximations, the solution requires a relatively small number of integration steps (simulation sub-intervals). When solving a parameter estimation problem, the free decision variables $u_d$ (denoted as p in problem (42)) are constant over the whole optimization horizon, so that the number of decision intervals is Nd = 1. For the performance evaluation, the system dimension has been varied by changing the number of elements for the discretization in axial direction, Ngfe, as well as the number of components considered in the liquid phase, Ncp (see also Eq. (41)). In addition to the adsorption parameters in problem (42), operational parameters are also considered. The consideration of all these free decision variables u leads to the implicit differential equation system $M(x, u) \cdot \dot x = f(x, u, t)$. The total number of free decision variables, Nu, increases with the number of selected components, Ncp. Due to the step size control, the number of integration steps changes between simulation runs. However, all data refer to 30 simulation sub-intervals with a third-order discretization in time. The problem size varies between 183 and 1809 differential equations, which corresponds to 19215 and 189945 states in the fully discretized system (including the grid for the simulation sub-intervals), respectively. The iteration matrix for one integration step has between 4365 and 65043 nonzero elements.

Fig. 7 illustrates the computation time required for simulation and sensitivity generation. For simulation runs with the same Ncp, the time depends linearly on the problem size, which is due to the full exploitation of sparse calculus. Increasing Ncp increases the complexity of the problem and thus the total number of Newton iterations (NLEQ1S solver) needed for the solution of the nonlinear systems of all integration steps. In this specific case, between 60 and 80 iterations are needed. The absolute computation time for the sensitivity generation is not affected by this complexity and takes around 30-50% of the whole computation time.
Figure 7: Computation time needed for simulation and sensitivity generation (hollow symbols indicate time for state integration only, solid symbols indicate time for state and sensitivity integration). The element number used for the axial discretization, Ngfe, as well as the number of components, Ncp, and constant decisions, Nu, are increased. [Axes: CPU-time [sec] over the number of equations, Neq; series: Ncp = 1, Nu = 9; Ncp = 2, Nu = 15; Ncp = 3, Nu = 21.]
6.2. Case study II: Trajectory planning for changeover in a multicomponent distillation column

In the second case study, a packed distillation column operated under vacuum is considered for the separation of fatty acids. Whereas the distillate stream is processed in a downstream separation unit, the bottom product stream consists of 8 key components, which are subject to strict specification ranges defined by upper and lower bounds. The column is a multi-purpose unit, which means that feed conditions and product specifications may change. After a feed changeover, the objective is to achieve feasible operation in minimum time while reducing off-spec products. This can be achieved by an optimal sequence of control parameters.

Figure 8: Left: Flowsheet of the packed distillation column; Right: Different temperature profiles during product changeover (optimized control strategy).
6.2.1. Process model and changeover scenario

The detailed and experimentally validated dynamic model used in this case study has been found to soundly represent industrial scenarios.³ It considers 8 different organic carbon compounds, complex static and dynamic hold-up relations, energy balances, constant pressure drop, and vapour-liquid equilibrium relations. The column model consists of 411 equations, of which 179 are implicitly defined differential equations and 232 are algebraic equations of a general type, which include detailed physical property relations, see Eq. (1b) and Appendix B. Specific feed conditions and general column parameters for the production scenarios P-1 and P-2 considered here are listed in Appendix C, Tab. C.5. The respective parameters have been found to ensure robust steady-state operation. The manipulated key parameters are the reboiler duty Q̇ and the reflux ratio r. During a product changeover, it is common practice to directly set Q̇ and r to the values defined by the new process conditions, which generally also include a feed changeover. For a change from P-1 to P-2, the conventional strategy is depicted in Figure 9, left. The corresponding evolution of the most critical bottom product components is shown in Fig. 10, left. Note that the components C18:1 and C18:2 (not shown in Fig. 10) are not present during scenario P-1. They only enter after switching to the new feed conditions defined for P-2 (see Table C.5).
Figure 9: Conventional (left) and optimal (right) control strategies for reflux ratio and reboiler duty. [Axes in both panels: reboiler duty Q̇ [MW] and reflux ratio r [/] over time [h]; the left panel marks the time for the changeover, the right panel the final interval of fixed length.]
6.2.2. Dynamic optimization problem

The dynamic optimization problem aims at computing optimal piecewise constant control trajectories for the decisions Q̇ and r in the time horizon in which the changeover takes place. For this purpose, decision intervals d ∈ [1, 15] are defined. Each interval has a variable time length ∆t(d), which means additional degrees of freedom in the optimization problem. The bottom product specifications are considered as hard constraints with lower and upper bounds. Note that the molar fractions in the process model have to be transformed into mass fractions, since the product specifications in Table 2 are given in mass fractions.

³ The changeover scenarios discussed throughout this case study have been taken from an industrial cooperation project with InfraServ GmbH & Co. Knapsack KG. The process model has been validated with data from the industrial site.

Figure 10: Evolution of bottom concentrations during product change over (P-1 to P-2), using conventional controls (left) and optimized controls (right). [Axes: bottom product concentration [kg%] of C12, C14, C18 and C18:1 over time [h]; the specification ranges, the time for the changeover (left) and the evaluation points used for the optimization (right) are marked.]

Table 2: Bounds of the constrained bottom product specifications for all evaluation points in problem (43).

bottom specs. [kg%]   lower bounds   upper bounds
C8                    0.0            3.0
C10                   0.0            3.0
C12                   51.0           58.0
C14                   20.0           24.0
C16                   9.0            13.0
C18                   1.5            6.0
C18:1                 5.0            9.0
C18:2                 1.0            2.5

In order to obtain reliable results, it is most effective to demand feasible operation w.r.t. the product specifications within a certain time window in which the decisions Q̇ and r are held constant. Here, this time window corresponds to an additional decision interval with a fixed length ∆t(d = Nd = 16) = 0.3 h. Within this last interval, uniformly spaced evaluation points e ∈ [1, Ne] with Ne = 15 are defined, at which the inequalities are evaluated. The primary objective is the minimization of time while complying with the bottom product specifications. Secondly, the optimal trajectory should be reasonably smooth in order to avoid unnecessary perturbations during operation. Accordingly, the objective function is composed of the sum over all ∆t and of the weighted quadratic differences of Q̇ and r between neighboring intervals. The problem can be written as follows:

$$
\begin{aligned}
\min_{u} \quad & \Phi = \sum u_3 + \Delta u_1^T \cdot W_1 \cdot \Delta u_1 + \Delta u_2^T \cdot W_2 \cdot \Delta u_2 \\
\text{s.t.} \quad & u^{min} \le u \le u^{max} \\
& h^{min} \le h(u, t_e) \le h^{max}; \quad \forall\, e = 1, \dots, N_e \\
\text{with} \quad & u_1 = [\, \dot Q(1), \dot Q(2), \dots, \dot Q(N_d) \,]^T \\
& u_2 = [\, r(1), r(2), \dots, r(N_d) \,]^T \\
& u_3 = [\, \Delta t(1), \Delta t(2), \dots, \Delta t(N_d - 1) \,]^T
\end{aligned}
\tag{43}
$$

Table 3: Starting values, bounds and weighting factors for all discretized decisions in problem (43).

decisions                starting values   lower bounds   upper bounds   weights
reb. duty Q̇ [MW]         0.61              0.1            1.0            1/5
refl. ratio r [−]        2.6               0.1            10.0           1/40
interval length ∆t [h]   0.1               0.0083         0.1500         −
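The objective of problem (43), as described in the text (total time plus weighted quadratic differences of Q̇ and r between neighboring intervals), can be sketched as follows. The sketch assumes that each diagonal weighting matrix carries the same weight from Table 3 for every neighbor difference; this uniform weighting and the function name `changeover_objective` are assumptions and not statements from the paper.

```python
def changeover_objective(q_dot, reflux, dt, w_q=1.0 / 5.0, w_r=1.0 / 40.0):
    """Time plus smoothness objective in the spirit of problem (43)."""

    def squared_steps(values):
        # Sum of squared differences between neighboring decision intervals.
        return sum((nxt - cur) ** 2 for cur, nxt in zip(values, values[1:]))

    return sum(dt) + w_q * squared_steps(q_dot) + w_r * squared_steps(reflux)


n_d = 16                              # 15 free intervals plus the fixed last one
q_start = [0.61] * n_d                # starting values from Table 3
r_start = [2.6] * n_d
dt_start = [0.1] * (n_d - 1)          # the last interval length is fixed

phi_start = changeover_objective(q_start, r_start, dt_start)
n_free = len(q_start) + len(r_start) + len(dt_start)

# A single step change in the reboiler duty is penalized by the smoothness term.
q_step = [0.61] * 8 + [1.0] * 8
phi_step = changeover_objective(q_step, r_start, dt_start)
```

For the constant starting trajectories only the time term contributes, and counting the entries of the three decision vectors gives 16 + 16 + 15 = 47 free variables.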
The corresponding lower and upper bounds on the decisions, the weighting factors and the starting values are listed in Tab. 3. The objective is linear in the interval lengths ∆t and quadratic in Q̇ and r, where the diagonal matrices W are used for weighting. With the last interval length being fixed, the number of degrees of freedom is 47 and the number of inequality constraints h(u, te) for all components and evaluation points is 120. Figs. 9 and 10, right, show the results: the changeover is finished in 81% of the time needed by the conventional strategy.

6.2.3. Performance evaluation

Case study II is concerned with a dynamic optimization problem that applies single shooting to a fully implicit and stiff DAE system. The computational effort is analyzed by changing the number of decision intervals, Nd, over a fixed time horizon length. For this purpose, the step size has been fixed and the number of simulation sub-intervals set to 25. The system size is 411 with 179 differential and 232 algebraic equations, which corresponds to 20550 states in the fully discretized system (including the grid for the simulation sub-intervals) and an iteration matrix for one integration step with 7694 nonzero elements. The size of the (non-discretized) free decision vector u is three, which is the case when a single decision interval (Nd = 1) is defined for the whole optimization horizon. Fig. 11 shows the theoretical linear increase in computation time according to Eq. (37) for increasing Nd (and thus a quadratic increase in the number of global sensitivities in single shooting, Eq. (36)). The real increase in computation time is linear but much smaller, because most of the computations can be reused (especially the LU factorization of the iteration matrix $J_{d,s}$) once the local sensitivities have been evaluated, see also Eqs. (26, 33).
Figure 11: Computation time needed for simulation and sensitivity generation (hollow symbols indicate time for state integration only, solid symbols indicate time for state and sensitivity integration). The number of equidistant decision intervals is increased and thus the number of global sensitivity generations as well. The maximal theoretical effort is given according to Eq. (37). [Axes: CPU-time [sec] over the number of decision intervals, Nd.]
7. Conclusion

An efficient algorithm for the combined step-wise state and sensitivity integration tailored to the OCFE integration method has been presented. In contrast to other proposed methods, sparse matrix calculus can be fully exploited, which makes it especially efficient for large-scale equation systems with a relatively small number of independent variables. The relative computational effort in comparison to a state integration is small, with a linear increase w.r.t. the number of independent variables. The proposed algorithm can be adapted to any one-step integration method where gradients are available from the solution of the discretized system. It also shares the advantages of the staggered state and sensitivity integration methods proposed by other authors, where the integration steps are frozen for sensitivity generation. It has to be noted, however, that this applies only a partial error control to the DAE states, excluding the sensitivities. Despite this fact, the computed sensitivities are exact w.r.t. the approximate solution of the state variables. Although the OCFE integration method shares the disadvantage of implicit solution methods, which involve the solution of nonlinear equation systems in each integration step, it needs far fewer intervals for the approximation and is self-starting at high order, which is important for a large number of discontinuities or when applying multiple shooting methods. Moreover, the main advantage of the proposed approach in comparison to alternative methods is that no special care needs to be taken regarding the characteristics of general DAEs, whether implicitly defined or of high index. In other words, no special reformulations or assumptions regarding the mass matrix M have to be made when integrating systems of the type $M(x, u, t) \cdot \dot x(x, u, t) = f(x, u, t)$.

The developed algorithm has been applied successfully to the parameter estimation of a stiff implicit differential equation system with parameters directly connected to the differential states. In addition, the dynamic optimization problem of a stiff general DAE system has been solved applying single shooting. It has been shown that the computation of local and global sensitivities, as they result from single shooting techniques, causes only a small linear increase in computation time for an increasing number of elements in the control discretization.

Acknowledgements

The project is supported by the Cluster of Excellence "Unifying Concepts in Catalysis" coordinated by the Berlin Institute of Technology and funded by the German Research Foundation, DFG. T.B. gratefully acknowledges the financial support of BMBF (Federal Ministry of Education and Research of Germany), support code: 03WOPAL4. The authors wish to thank Dr.-Ing. Moritz Wendt for the cooperation and support regarding case study II.
Appendix A. Case study I: Parameters and operating values of the HPLC column
Table A.4: Parameters and operating values of the HPLC column (adapted from [9])
Parameter                          Symbol        Unit         Value
column length                      L_c           [m]          0.090
column diameter                    D_c           [m]          0.025
particle diameter                  D_p           [m]          1.0E-6
total porosity                     ε_tot         [−]          0.70
internal porosity                  ε_int         [−]          0.40
interstitial velocity              u             [cm/min]     8.15
impulse concentration              c^in_fl,1     [mMol]       0.10
                                   c^in_fl,2     [mMol]       0.10
impulse time                       ∆t            [min]        0.10
axial dispersion, i={1,2}          D_ax,i        [cm²/min]    3.45E-1
film transport coef., i={1,2}      k_film,i      [cm/min]     1.02E-2
diffusion coef., i={1,2}           D_i           [cm²/min]    1.28E-7
Appendix B. Case study II: Equation system of the multicomponent distillation column
The packed distillation column is modeled by j = [1, 20] units, including the condenser unit, the transfer units and the reboiler unit. The corresponding state vector for each transfer unit j and all components i ∈ [1, 8] reads: $x = [L_j, V_j, HU_j, p_j, x_{i,j}, y_{i,j}, T_j]^T$. The component mass balances read:
$$\frac{\partial (HU_j \cdot x_{i,j})}{\partial t} = V_{j+1} \cdot y_{i,j+1} + L_{j-1} \cdot x_{i,j-1} - V_j \cdot y_{i,j} - L_j \cdot x_{i,j} + F_j \cdot x^F_{i,j} \quad \forall\, i = [1, 8]$$
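As a minimal sketch of how the component mass balance above can be evaluated for one transfer unit, the following function computes its right-hand side. The function name, the list-based data layout and the numerical values are assumptions made for illustration only; they are not taken from the case study.

```python
def component_balance_rhs(j, i, L, V, x, y, F, xF):
    """Right-hand side of d(HU_j * x_ij)/dt for an interior unit j.

    L, V, F : liquid, vapour and feed molar flows per unit (lists)
    x, y, xF: liquid, vapour and feed mole fractions, indexed [unit][component]
    Illustrative layout; unit j-1 is the unit above, unit j+1 the unit below.
    """
    return (V[j + 1] * y[j + 1][i]    # vapour entering from the unit below
            + L[j - 1] * x[j - 1][i]  # liquid entering from the unit above
            - V[j] * y[j][i]          # vapour leaving
            - L[j] * x[j][i]          # liquid leaving
            + F[j] * xF[j][i])        # feed


# Hypothetical three-unit example with one component and uniform profiles
L = [1.0, 1.0, 1.0]
V = [2.0, 2.0, 2.0]
x = [[0.25], [0.25], [0.25]]
y = [[0.5], [0.5], [0.5]]
xF = [[0.5], [0.5], [0.5]]

no_feed = component_balance_rhs(1, 0, L, V, x, y, [0.0, 0.0, 0.0], xF)
with_feed = component_balance_rhs(1, 0, L, V, x, y, [0.0, 1.0, 0.0], xF)
print(no_feed, with_feed)
```

With uniform flows and compositions and no feed, the inflow and outflow terms cancel, so the holdup of the component does not change; a feed on the unit adds the term F_j · x^F_{i,j}.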
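The vapour liquid equilibrium relations are given as follows:
$$y_{i,j} = x_{i,j} \cdot \frac{p^{LV}}{p_j} \quad \forall\, i = [1, 8] \quad \text{with} \quad p^{LV} = f(T_j, x_{i,j})$$
The relations for the liquid holdup in packed columns are taken from [6]:
$$HU_j = HU_j^{stat} + HU_j^{dyn}$$
$$HU_j^{stat} = 0.033 \cdot \exp\left(-0.22 \cdot \frac{g \cdot \rho_j^L}{\sigma_j^L \cdot a_{geo}^2}\right)$$
$$HU_j^{dyn} = 3.6 \cdot \left(\frac{v_j^L \cdot a_{geo}^{1/2}}{g^{1/2}}\right)^{0.66} \cdot \left(\frac{\eta_j^L \cdot a_{geo}^{3/2}}{\rho_j^L \cdot g^{1/2}}\right)^{0.25} \cdot \left(\frac{\sigma_j^L \cdot a_{geo}^2}{\rho_j^L \cdot g}\right)^{0.1}$$
$$\text{with} \quad v_j^L = \frac{L_j}{\rho_j^L \cdot a_{geo} \cdot 3600} \quad \text{and} \quad \rho_j^L, \sigma_j^L, \eta_j^L = f(T_j, x_{1,j}, \cdots, x_{8,j})$$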
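The holdup correlation above can be sketched as a small function. This is an illustrative transcription only: the function name is hypothetical, the liquid velocity is passed in directly, and the property values in the example are assumed, mass-based SI numbers that are not taken from the case study (which uses molar units and DIPPR property data).

```python
import math


def liquid_holdup(v_L, rho_L, sigma_L, eta_L, a_geo, g=9.81):
    """Static, dynamic and total liquid holdup of one packed unit.

    v_L: liquid velocity, rho_L: liquid density, sigma_L: surface tension,
    eta_L: liquid viscosity, a_geo: geometric packing parameter.
    Units are assumed to be consistent (illustration uses SI).
    """
    hu_stat = 0.033 * math.exp(-0.22 * g * rho_L / (sigma_L * a_geo**2))
    hu_dyn = (3.6
              * (v_L * a_geo**0.5 / g**0.5) ** 0.66
              * (eta_L * a_geo**1.5 / (rho_L * g**0.5)) ** 0.25
              * (sigma_L * a_geo**2 / (rho_L * g)) ** 0.1)
    return hu_stat, hu_dyn, hu_stat + hu_dyn


# Assumed example values, chosen only to exercise the formula
hu_stat, hu_dyn, hu_total = liquid_holdup(
    v_L=0.002, rho_L=800.0, sigma_L=0.025, eta_L=5.0e-4, a_geo=250.0)
print(hu_stat, hu_dyn, hu_total)
```

The static part is bounded by the prefactor 0.033 because the exponent is always negative, while the dynamic part grows with the liquid velocity through the exponent 0.66.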
The pressure loss over the column height is assumed to be constant:
$$p_j = p_{j-1} + \Delta p$$
The summation relations for the liquid and vapour phases read:
$$\sum_{i=1}^{8} x_{i,j} = 1 \quad \text{and} \quad \sum_{i=1}^{8} y_{i,j} = 1$$
Finally, the energy balances read:
$$\frac{\partial (HU_j \cdot u_j^L)}{\partial t} = V_{j+1} \cdot h^V_{j+1} + L_{j-1} \cdot h^L_{j-1} - V_j \cdot h^V_j - L_j \cdot h^L_j + F_j \cdot h^F_j$$
$$\text{with} \quad u_j^L, h_j^L = f(T_j, x_{1,j}, \cdots, x_{8,j}); \quad h_j^F = f(T_j^F, x^F_{1,j}, \cdots, x^F_{8,j}); \quad h_j^V = f(T_j, y_{1,j}, \cdots, y_{8,j})$$
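As a small worked example of the pressure relation p_j = p_{j-1} + ∆p given in this appendix, the sketch below builds the pressure profile over the 20 units. It assumes that unit 1 carries the overhead pressure p_1 = 35.0 mbar and that the increment ∆p = 0.4 mbar from Table C.5 applies between every pair of neighbouring units; the function name is hypothetical.

```python
def pressure_profile(p_top, dp, n_units):
    """Pressures p_1 .. p_n from the recursion p_j = p_(j-1) + dp."""
    pressures = [p_top]
    for _ in range(n_units - 1):
        pressures.append(pressures[-1] + dp)
    return pressures


# Values from Table C.5: p_1 = 35.0 mbar, dp = 0.4 mbar, 20 units
profile = pressure_profile(p_top=35.0, dp=0.4, n_units=20)
print("top unit:   ", profile[0], "mbar")
print("bottom unit:", round(profile[-1], 1), "mbar")
```

Under these assumptions the pressure in the last unit is p_1 + 19 · ∆p, i.e. 35.0 + 7.6 = 42.6 mbar.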
Pure and mixed physical property data for $p^{LV}$, $\rho^L$, $\sigma^L$, $\eta^L$, $u^L$, $h^L$, $h^V$ are calculated using the industry-standard DIPPR databank, which gives the functional dependencies shown above. The superscripts L, V, LV and F denote liquid, vapour, liquid/vapour and feed, respectively. $F_j$ is the molar feed flow entering the transfer unit j. All equations are given in molar units.
Appendix C. Case study II: Operating values and specifications of the distillation column
Table C.5: Operating values and specifications for two different industrial scenarios
                                                      production scenario
Quantity                     Symbol      Unit         P-1        P-2
feed unit number             j = N_F                  8          8
feed flow                    ṁ_F         [kg/h]       8000.0     9282.5
feed temperature             T_F         [°C]         190.0      195.0
feed composition             ξ_F         [kg%]
  C8 (C8H16O2)                                        8.1        8.5
  C10 (C10H20O2)                                      6.1        6.0
  C12 (C12H24O2)                                      48.7       48.0
  C14 (C14H28O2)                                      18.3       17.5
  C16 (C16H32O2)                                      8.6        8.5
  C18 (C18H36O2)                                      10.2       4.0
  C18:1 (C18H34O2)                                    -          6.0
  C18:2 (C18H32O2)                                    -          1.5
overhead pressure            p_1         [mbar]       35.0       35.0
distillate stream            D           [kg/h]       1400.0     1400.0
reflux ratio                 r           [−]          1.5        2.6
condenser sub-cooling        ∆T          [K]          0.0        13.0
pressure drop per tray       ∆p          [mbar]       0.4        0.4
bottom stream                B           [kg/h]       6600.0     7882.0
reboiler duty                Q̇           [kW]         485.5      610.7
[1] Albersmeyer, J., Bock, H. G., January 2009. Efficient sensitivity generation for large scale dynamic optimization. Tech. Rep. SPP1253-01-02, Interdisciplinary Center for Scientific Computing, Heidelberg, Germany, http://www.am.uni-erlangen.de/home/spp1253.
[2] Biegler, L. T., Grossmann, I. E., 2004. Retrospective on optimization. Computers & Chemical Engineering 28 (8), 1169–1192.
[3] Cao, Y., Li, S., Petzold, L., 2002. Adjoint sensitivity analysis for differential-algebraic equations: algorithms and software. Journal of Computational and Applied Mathematics 149 (1), 171–191.
[4] Cuthrell, J. E., Biegler, L. T., 1987. On the optimization of differential-algebraic process systems. AIChE Journal 33 (8), 1257–1270.
[5] Drews, A., Arellano-Garcia, H., 2008. Model-based determination of changing kinetics in high cell density cultures using respiration data. Chemical Engineering Science 63 (19), 4789–4799.
[6] Engel, V., Stichlmair, J., Geipel, W., 1997. A new model to predict liquid holdup in packed columns - using data based on capacitance measurement techniques. In: Distillation and Absorption '97. IChemE Symposium Series, pp. 939–948.
[7] Feehery, W. F., Tolsma, J. E., Barton, P. I., 1997. Efficient sensitivity analysis of large-scale differential-algebraic systems. Applied Numerical Mathematics 25 (1), 41–54.
[8] Finlayson, B. A., 1980. Nonlinear Analysis in Chemical Engineering. McGraw-Hill, New York.
[9] Gu, T., 1995. Mathematical Modeling and Scale-up of Liquid Chromatography. Springer Verlag, Berlin, New York.
[10] Hong, W., Wang, S., Li, P., Wozny, G., Biegler, L. T., 2006. A quasi-sequential approach to large-scale dynamic optimization problems. AIChE Journal 52 (1), 255–268.
[11] Li, P., Arellano-Garcia, H., Wozny, G., Reuter, E., 1998. Optimization of a semibatch distillation process with model validation on the industrial site. Industrial & Engineering Chemistry Research 37 (4), 1341–1350.
[12] Li, S., Petzold, L., 2000. Software and algorithms for sensitivity analysis of large-scale differential algebraic systems. Journal of Computational and Applied Mathematics 125 (1-2), 131–145.
[13] Logsdon, J. S., Biegler, L. T., 1989. Accurate solution of differential-algebraic optimization problems. Industrial & Engineering Chemistry Research 28 (11), 1628–1639.
[14] Logsdon, J. S., Biegler, L. T., 1993. Accurate determination of optimal reflux policies for the maximum distillate problem in batch distillation. Industrial & Engineering Chemistry Research 32 (4), 692–700.
[15] Nowak, U., Weimann, L., December 1991. A family of Newton codes for systems of highly nonlinear equations. Tech. Rep. TR-910, Konrad-Zuse-Zentrum für Informationstechnik Berlin, Berlin, Germany, http://www.zib.de/bib/pub/index.en.html.
[16] Schlegel, M., Marquardt, W., Ehrig, R., Nowak, U., 2004. Sensitivity analysis of linearly-implicit differential-algebraic systems by one-step extrapolation. Applied Numerical Mathematics 48 (1), 83–102.
[17] Støren, S., Hertzberg, T., 1999. Obtaining sensitivity information in dynamic optimization problems solved by the sequential approach. Computers & Chemical Engineering 23 (6), 807–819.
[18] Vaca, M., Monroy-Loperena, R., Jiménez-Gutiérrez, A., 2008. An implementation variant of the polynomial finite difference method with orthogonal collocation and adjustable element length. Computers & Chemical Engineering 32 (12), 3170–3175.
[19] Vassiliadis, V. S., Canto, E. B., Banga, J. R., 1999. Second-order sensitivities of general dynamic systems with application to optimal control problems. Chemical Engineering Science 54 (17), 3851–3860.
[20] Wang, F. S., 2000. A modified collocation method for solving differential-algebraic equations. Applied Mathematics and Computation 116 (3), 257–278.
[21] Wu, B., White, R. E., 2004. One implementation variant of the finite difference method for solving ODEs/DAEs. Computers & Chemical Engineering 28 (3), 303–309.
[22] Zavala, V. M., Laird, C. D., Biegler, L. T., 2008. Fast implementations and rigorous models: Can both be accommodated in NMPC? International Journal of Robust and Nonlinear Control 18 (8), 800–815.