Handling Uncertainty in Model-based Optimal Experimental Design

Tilman Barz, Harvey Arellano-Garcia, and Günter Wozny

Chair of Process Dynamics and Operation, Berlin Institute of Technology, Sekr. KWT-9, Str. des 17. Juni 135, D-10623 Berlin, Germany E-mail: [email protected]

Abstract

In contrast to the majority of published works in the field of model-based optimal experimental design (OED), which focus on numerical studies to demonstrate the validity of the OED approach or on the development of new criteria or numerical approaches, this work is mainly concerned with the experimental application and the practical insights gained from the adaptation of an optimal design framework. The presented work is discussed based on the determination of protein ion-exchange equilibrium parameters. For this purpose, special attention is paid to the explicit modeling of all laboratory steps needed to prepare, implement, and analyze experiments, in order to obtain a realistic definition of the numerical design problem and to formally include experimental restrictions and sources of uncertainty in the problem formulation. Moreover, whereas the effect of erroneous assumptions in the initially assumed parameter values has been covered by various authors, in this work uncertainties are considered in a more general way, including those which arise from an imprecise implementation of optimally planned experiments. In order to compensate for the influence of uncertainties, a feed-back based approach to optimal design is adopted, based on the combination of the parallel and sequential design approaches. Uncertainty identification is done by solving an augmented parameter estimation problem in which deviations in the experimental design are detected and estimated together with the parameter values. It is shown that the influence of uncertainties vanishes along with the iterative refinement of the experiment design variables and estimated parameter values.

Post-print version of the article: Barz, T., Arellano-Garcia, H., & Wozny, G. (2010). Handling uncertainty in model-based optimal experimental design. Industrial & Engineering Chemistry Research, 49(12), 5702-5713. doi: 10.1021/ie901611b. The content is identical to the published paper but without the final typesetting by the publisher.

1. Introduction

Optimal experimental design (OED) techniques have become widely adopted for the development of mechanistic process models in systems engineering. The decision on which experiments to conduct is critical for obtaining accurate parameter estimates and for reducing time and experimental effort. A recent overview of model-based design of experiments and a critical state-of-the-art analysis is given by Franceschini and Macchietto. 1 The generation of an OED comprises primarily the identifiability of the parameters to be estimated. Furthermore, when several possible process models are available, additional aspects such as model distinguishability should also be considered. Most previous works on OED are based on numerical examples that demonstrate their validity or the development of new numerical approaches. There are rather few publications with experimental results that provide relevant insights into the appropriate formulation and solution of OED problems for the practicing engineer, who has to cope with practical limitations, e.g., limited instrument accuracy or restrictions in the operating range and/or equipment. For instance, a practical case study is presented by Franceschini and Macchietto 2 for a biodiesel production process, where the parameters of a complex kinetic network are identified. For this purpose, strategies are proposed to cope with special problems which can occur in the parameter estimation of complex and highly nonlinear systems, such as those which make use of Arrhenius equations.

In this work, a reliable determination of adsorption isotherm parameters for a bioprocess is presented, for which data generation is very time-consuming, labor-intensive, and costly. Since there is no theoretical tool available for the prediction of adsorption isotherms based on physicochemical data, equilibrium models have to be determined experimentally. Moreover, despite the fact that several experimental methods are available, the experimental determination of isotherms is still far from being a routine job. 3 Therefore, special attention has to be paid to the compensation of uncertainties that arise when putting the experimental design framework into practice and dealing with experimental results.

The remainder of the paper is organized as follows. First, the general systematic procedure and the basic numerical principles, from planning to implementation and experiment analysis, are briefly reviewed. Second, special attention is paid to the influence of uncertainties during this procedure. In the subsequent section, the impact of various uncertainties is first demonstrated using a numerical example. Afterward, the formulation of an OED problem for the determination of adsorption isotherm parameters is described in detail. Moreover, the interaction of uncertainties is studied, and the need for their compensation is demonstrated based on a feed-back strategy (sequential approach) together with the identification of the most relevant uncertainties.

1.1 Optimal experimental design problem formulation

OED is aimed at selecting experimental conditions that generate maximum information content for the determination of specific parameters, θ, of an underlying general nonlinear process model, g.

$$ g(x, u, \theta, p, t) = 0 \tag{1} $$

In eq 1, x and u represent dependent state and free design variables, respectively. θ denotes the parameter set to be determined, and p comprises all other independent model and/or experimental design parameters, which are constant (and have a known and assumed value). The parameters θ are estimated by fitting the corresponding simulated model output y to the measurement data y^meas. Assuming an absence of systematic errors, the nonlinear regression reads:

$$ y^{meas} = \underbrace{h(u, \theta, p)}_{=\,y} + \xi_{y} \tag{2} $$

where the measurement errors ξy are assumed to be zero-mean white noise characterized by the measurement covariance matrix MV. When more than one experiment is considered (N^Exp > 1), the objective function of the parameter estimation problem subject to the model equations in eq 1 is given as follows:

$$ \min_{\theta} \sum_{k=1}^{N^{Exp}} \left(y_k^{meas} - y_k(u_k, \theta, p_k)\right)^{T} \cdot \mathrm{MV}_k^{-1} \cdot \left(y_k^{meas} - y_k(u_k, \theta, p_k)\right) \tag{3} $$
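As a minimal sketch of how the objective in eq 3 can be evaluated, the following Python snippet sums the weighted squared residuals over several experiments. The linear `toy_model`, the data values, and the dictionary layout are hypothetical and serve only as an illustration; they are not taken from the paper.

```python
import numpy as np

def wls_objective(theta, experiments, model):
    """Sum over experiments of r_k^T * MV_k^{-1} * r_k (eq 3)."""
    total = 0.0
    for exp in experiments:
        residual = exp["y_meas"] - model(exp["u"], theta, exp["p"])
        # Solve MV_k * z = r_k instead of forming the explicit inverse.
        total += residual @ np.linalg.solve(exp["MV"], residual)
    return total

def toy_model(u, theta, p):
    """Hypothetical linear model y = theta_1 * u + theta_2 (illustration only)."""
    return theta[0] * u + theta[1]

# Two hypothetical experiments with different numbers of measurements.
experiments = [
    {"u": np.array([1.0, 2.0]), "p": None,
     "y_meas": np.array([3.1, 4.9]), "MV": np.diag([0.01, 0.01])},
    {"u": np.array([3.0]), "p": None,
     "y_meas": np.array([7.0]), "MV": np.diag([0.04])},
]

cost = wls_objective(np.array([2.0, 1.0]), experiments, toy_model)
print(cost)
```

In a real estimation, `wls_objective` would be passed to a nonlinear optimizer that varies `theta`; here it is only evaluated once for a fixed guess.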

The accuracy of the estimated parameters θ is characterized by the parameter covariance matrix θ V.



$$ \theta\mathrm{V} \ge F^{-1} = \left[ \sum_{k=1}^{N^{Exp}} F_k \right]^{-1} \tag{4} $$

Equation 4 gives a linear approximation of θV based on the so-called Fisher information matrix F. 4 Its value results from summing all individual contributions Fk. For a single experiment, Fk is calculated from the sensitivities of the measured variables with respect to the parameters, weighted with the inverse of the measurement covariance matrix MVk, as defined in eq 5. 4

$$ F_k = \left( \frac{\partial y_k}{\partial \theta} \right)^{T} \cdot \mathrm{MV}_k^{-1} \cdot \left( \frac{\partial y_k}{\partial \theta} \right) \tag{5} $$
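The following sketch illustrates eqs 4 and 5 numerically: the Fisher information matrix is assembled from per-experiment sensitivity matrices, and its inverse gives the approximation of the parameter covariance matrix. The sensitivity values and measurement covariances are hypothetical numbers chosen for illustration.

```python
import numpy as np

def fisher_information(sensitivities, mv_list):
    """F = sum_k S_k^T * MV_k^{-1} * S_k, with S_k = dy_k/dtheta (eqs 4 and 5)."""
    n_theta = sensitivities[0].shape[1]
    F = np.zeros((n_theta, n_theta))
    for S_k, MV_k in zip(sensitivities, mv_list):
        F += S_k.T @ np.linalg.solve(MV_k, S_k)
    return F

# Hypothetical sensitivities: two experiments, two measurements, two parameters.
S = [np.array([[1.0, 0.0], [0.0, 2.0]]),
     np.array([[1.0, 1.0], [0.0, 1.0]])]
MV = [0.01 * np.eye(2), 0.01 * np.eye(2)]

F = fisher_information(S, MV)
cov_bound = np.linalg.inv(F)  # approximation (lower bound) of the covariance
print(F)
print(cov_bound)
```

The off-diagonal entries of `cov_bound` carry the parameter correlations that the design criteria discussed later act upon.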

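It is, however, sometimes convenient to use the normalized covariance matrix θV, which is obtained using the normalized sensitivities according to eq 6.

$$ \left[ \frac{\partial y_k}{\partial \theta_i} \right] = \frac{\partial y_k}{\partial \theta_i} \cdot \theta_i ; \quad \forall\, i \in N^{\theta} \tag{6} $$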
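A short sketch of the scaling in eq 6, assuming unit measurement covariance and hypothetical sensitivity values: multiplying column i of the sensitivity matrix by θi yields a covariance matrix whose entries are relative to the parameter values.

```python
import numpy as np

theta = np.array([0.30, 0.03])             # parameter values used for scaling
S = np.array([[2.0, 10.0], [4.0, 30.0]])   # hypothetical dy/dtheta
S_norm = S * theta                          # column i is multiplied by theta_i (eq 6)

# With MV = I (an assumption of this sketch), F = S^T S.
V = np.linalg.inv(S.T @ S)
V_norm = np.linalg.inv(S_norm.T @ S_norm)

# The normalized covariance equals V_ij / (theta_i * theta_j).
print(V_norm)
print(V / np.outer(theta, theta))
```

The normalization makes the entries dimensionless, so parameters of very different magnitude contribute comparably to a scalar criterion.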

Equation 4 gives exact values for θV in the case of a process model which is linear in states and parameters. However, for a general nonlinear process model (see eq 1), the approximation represents only a lower bound on the true parameter covariance matrix θV. 5 Together with the assumption that all uncertainties in measurement and model parameters can be represented by a Gaussian probability distribution, eq 4 is a generally accepted approach in parameter estimation and OED problem formulations. 1 It can be seen from eqs 4 and 5 that θV depends not only on the sensitivities of the model output with respect to the parameters, but also on the measurement accuracy defined by MV. In the multivariate case, correlations in sensitivities and measurements are also considered. Generally, high and uncorrelated sensitivities as well as exact and uncorrelated measurements (indicated by small values in MV) will enable an accurate parameter estimation characterized by small values in θV. The formulation of an OED problem is aimed at maximizing the parameter accuracy by minimizing the values in θV. Considering the dependencies of y in eq 2, as well as eqs 4 and 5, this can be accomplished by selecting optimal design variables u for the current parameter estimate θ and all other constant parameters p. The objective function in eq 7 can be formulated using different standard metrics ϕ, which provide a scalar indicator of parameter accuracy.

$$ \min_{u} \; \phi\{\theta\mathrm{V}(u, \theta, p)\} \tag{7a} $$

$$ \max_{u} \; \phi\{F(u, \theta, p)\} \tag{7b} $$

The metrics ϕ can be applied either to the approximated parameter covariance matrix θV or to the Fisher information matrix F. Common metrics are the so-called A-, D-, and E-optimal criteria, whose definitions and a discussion of their suitability can be found elsewhere. 1 It is important to note, however, that the review of the application of different criteria presented by Franceschini and Macchietto 1 is restricted to the case defined in eq 7b. In this work, the definition in eq 7a is used instead, where ϕ applies directly to θV. 6,7 The use of this definition can be interpreted graphically by considering the confidence region of the estimated parameters. The limits of this region are defined by θV and have an ellipsoidal shape. The size of the ellipsoid represents the uncertainties in θ. The A-optimal criterion corresponds to the minimization of the sum of the ellipsoid's principal axes. Its value is given by the matrix trace. In contrast, the often referenced D-optimal criterion minimizes the determinant of θV. Its value is equivalent to the surface area of the confidence ellipsoid. Applied to θV, the D-optimal criterion can lead to thin confidence regions which still possess a significant length in the direction of the major axis of the confidence ellipsoid because of the correlations between parameters. Comparing the A- and D-optimal criteria applied to θV, the A-optimal criterion is deemed to be more suitable for the minimization of parameter cross-correlation, which is especially important when parameters with a physical meaning have to be determined. Finally, the E-optimal criterion aims at minimizing the major principal axis of the confidence ellipsoid, which is obtained from the largest eigenvalue, λi^max. For strongly correlated parameters (with a very large major axis compared to the other axes), the optimal result is similar when applying the A- or the E-optimal criterion. 8 The advantage of the A-optimal criterion is that it does not include discontinuities, which can result if the orientation of the major axis switches during optimization. Based on these facts and using eq 7a instead of eq 7b, the drawbacks of the A-optimal criterion stated by Franceschini and Macchietto, 1 which could cause an appreciable loss of information in the case of high cross-correlation between parameters, do not apply here. When the criterion is applied to the Fisher information matrix F, the off-diagonal elements of the matrix are not considered, and thus correlations are neglected. In contrast, the parameter covariance matrix θV is based on the inverse of F, and thus correlations are considered in the A-optimal criterion. Equation 8 shows the A-optimal criterion applied throughout this work.

$$ \phi\{\theta\mathrm{V}(u, \theta, p)\} = \mathrm{trace}\left[\theta\mathrm{V}(u, \theta, p)\right] = \sum_{i=1}^{N^{\theta}} \lambda_i\left(\theta\mathrm{V}(u, \theta, p)\right) \tag{8} $$
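To make the comparison of the criteria concrete, the sketch below evaluates the A-, D-, and E-optimal metrics on a covariance matrix. The matrix is a hypothetical example of two strongly correlated parameters and is not taken from the paper.

```python
import numpy as np

def design_criteria(cov):
    """Scalar metrics applied to a symmetric parameter covariance matrix."""
    eigvals = np.linalg.eigvalsh(cov)  # real eigenvalues in ascending order
    return {
        "A": float(np.trace(cov)),       # sum of eigenvalues (eq 8)
        "D": float(np.linalg.det(cov)),  # product of eigenvalues
        "E": float(eigvals[-1]),         # largest eigenvalue
    }

# Hypothetical covariance of two strongly correlated parameters.
cov = np.array([[4.0, 1.9],
                [1.9, 1.0]])
crit = design_criteria(cov)
print(crit)
```

For this elongated confidence region the largest eigenvalue accounts for almost the whole trace, which is the situation in which the A- and E-optimal criteria lead to similar designs, while the determinant stays small.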

1.2 The role of uncertainties in the optimal experimental design approach

The determination of model parameters can generally be seen as a sequence of three consecutive steps: (1) computation of the OED variables, u0, based on the current parameter estimate, θ0, and on some assumptions about constant design variables, p0; (2) implementation of the planned experiments, analysis of the measured variables, y^meas, and (if possible) verification of the assumed experimental conditions, u0 → u* and p0 → p*; (3) estimation of new model parameters, θ0 → θ*, together with statistical assessments and (if possible) estimation of the values of the implemented variables, u0 → u* and p0 → p* (see Figure 1).

Figure 1: General solution approach to parameter determination using an OED framework and possible sources of uncertainties. The use of the feed-back loop corresponds to the sequential strategy. [Diagram: (1) planning, (2) implementation (realization of experiments, analysis and verification), and (3) estimation, connected by the sequential/feed-back loop that considers previous experiments; the error in the initial parameter guess, the error in the design variables, and the error in the constant design parameters are marked as sources of uncertainty.]

For the solution of the design problem in eq 7a, an initial guess for the parameters, θ0, is adopted, i.e., the best currently available parameter values. Because of the general nonlinear process model in eq 1, the quality of the computed experimental design depends on the accuracy of θ0. Therefore, the OED solution is usually termed a local optimum 1,9 and is a function of the error in the initial parameter guess, ξθ. After planning and implementation of optimal experiments, the parameter estimation problem in eq 3 is solved and the parameters are then updated, θ* → θ0. As shown in Figure 1, besides uncertainties in the parameter values, ξθ, additional uncertainties related to the practical realization of the experiments can have an influence on the design procedure. The problem of an imprecise implementation of desired decisions is a common phenomenon in control applications (e.g., the realization of a set-point in a control loop which is subject to disturbances). Following Skogestad et al., 10 the term implementation error is adopted to indicate that certain design values deviate from the originally planned or assumed process conditions. These errors usually affect certain experimental conditions with an assumed and fixed value, p0 + ξp, as well as the computed OED variables, u0 + ξu, which cannot be implemented exactly as desired.

$$ \theta^{*} = \theta^{0} + \xi_{\theta}, \qquad u^{*} = u^{0} + \xi_{u}, \qquad p^{*} = p^{0} + \xi_{p} \tag{9} $$

Depending on their sensitivities with respect to the adopted design criterion, all considered uncertainties may cause large additional losses in the optimality of a specific design. Thus, the need for a robust design, which is insensitive to parametric uncertainties, has commonly been discussed. 1,6,11-15 However, the source of uncertainties is usually assigned to unreliable initial guesses of θ. The most widely used approach to cope with uncertainties is the indirect method, 13 which relies on an iterative refinement of the experimental design. The indirect method, also referred to as the sequential design strategy, 16 is described in Figure 1, where a feed-back loop is used. Here, experiments are designed, experimental data collected, and parameters estimated in alternation. The robustness is based on the identification of uncertainties from previously implemented experiments and the step-wise improvement of the values and the confidence regions of the parameters. The sequential approach can provide reliable estimates (high parameter accuracy with small confidence regions) with drastically less experimental effort in the presence of uncertainties. 6,16 However, when multiple pieces of equipment are available, experiments planned in parallel can be advantageous in terms of time and use of resources. The concept of model-based design of parallel experiments is presented by Galvanin et al., 17 together with a novel design criterion which aims at maximizing complementary information by considering different eigenvalues of the information matrix. For robustness with respect to uncertainties, a combination of both strategies, the so-called parallel/sequential approach, can be used, where sets of parallel experiments are designed sequentially. Additionally, it has to be noted that in the case of dynamic experiments, the consistent application of the sequential approach means exploiting available information as soon as possible and, consequently, starting the redesign of experiments online. 18,19 Generally, the application of the sequential design approach means that the design of a certain number n of optimal experiments is considered together with m already planned experiments. Accordingly, eq 10 represents the sum over the constant and variable parts of the Fisher information matrix:

$$ F = \sum_{k=1}^{m} F_k + \sum_{k=m+1}^{m+n} F_k(u_k, \theta, p_k) \tag{10} $$
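The split in eq 10 can be sketched as follows: the information from m already planned experiments is fixed, and only the contribution of the new experiment depends on the design variable. The exponential toy model y = θ1·exp(−θ2·u), the noise level, and the candidate grid are hypothetical choices made for illustration, with n = 1 new experiment selected by a simple grid search.

```python
import numpy as np

def fim_single(u, theta, sigma=0.1):
    """F_k for one scalar measurement of the toy model y = theta1 * exp(-theta2 * u)."""
    decay = np.exp(-theta[1] * u)
    s = np.array([[decay, -theta[0] * u * decay]])  # 1 x 2 sensitivity row
    return s.T @ s / sigma**2

theta0 = np.array([1.0, 0.5])  # current parameter estimate (hypothetical)

# Constant part: m = 2 experiments that were already planned.
F_fixed = fim_single(0.5, theta0) + fim_single(1.0, theta0)

# Variable part: choose the next design value by the A-optimal criterion.
candidates = np.linspace(0.1, 10.0, 100)
a_values = [np.trace(np.linalg.inv(F_fixed + fim_single(u, theta0)))
            for u in candidates]
u_next = candidates[int(np.argmin(a_values))]
print(u_next, min(a_values))
```

In the sequential strategy, `theta0` and `F_fixed` would be updated after each round of experiments and estimation before the next design is computed.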

In contrast to the indirect method, which is based on a feed-back approach, in the direct method uncertainties are defined and considered a priori (feed-forward approach). The consideration of all uncertainties then leads to the maximization of the probability density function of the design objective. As a result, the solution of a single design problem is already robust against uncertainties, e.g., deviations from initially assumed parameters or optimally planned design variables. Consequently, dependent and independent model variables cannot be separated, and thus a more general implicit or so-called error-in-variables model formulation is needed. 15 However, recent approaches to robust design focus on unreliable initial parameter guesses. Two different criteria are used: first, the expected value criterion, which optimizes the design criterion on average, and second, the minimax criterion, which accounts for the worst possible performance. 13 The drawback of the direct method is that both criteria rely on a description of the size of the parameter uncertainty (here the size of the error ξθ), using either a probability distribution in the first case or knowledge of an admissible parameter domain in the second. This information can result either from basic assumptions or, when experiments are designed sequentially, from prior experiments. For the expected value criterion, the objective function reads in the case of eq 7a as follows:

$$ \min_{u} \; E\left[ \phi\{\theta\mathrm{V}(u, \theta, p)\} \right] ; \quad \text{with } \theta \in \Theta \tag{11} $$

where E[·] is the expected value and Θ is the probability space of θ indicating the realizable set of parameter values, which can be defined, e.g., by a uniform or Gaussian distribution function. Applications of this criterion have been presented, e.g., by Walter 13 for a hypothetical first-order example and by Asprey and Macchietto 12 for a small semi-continuous bioreactor model. For the minimax criterion, the following objective function is used in the case of eq 7a:

$$ \min_{u} \max_{\theta} \; \phi\{\theta\mathrm{V}(u, \theta, p)\} ; \quad \text{with } \theta \in \Theta \tag{12} $$

Here, the OED problem is solved for the worst-case parameter set obtained from the solution of the inner maximization problem. In contrast to eq 11, where the design tries to ensure optimality on average (but can be very poor for certain parameters), here the algorithm focuses on the parameter sets which represent the worst case in the domain Θ. Applications of the minimax design have been presented, e.g., by Dette et al. 11 for a classical biological Monod growth model, by Asprey and Macchietto 12 for the example of a small semi-continuous bioreactor model, by Bock et al. 6 for a biochemical problem from enzyme kinetics, and finally by Körkel et al. 7 for the reaction of urethane. In the last two examples, the authors used a modification of eq 12 based on a Taylor expansion with respect to the parameters in the min-max objective function, which can then be solved more efficiently. The main advantage of the minimax approach is certainly the relatively simple definition of parameter uncertainty by simple bounds of the admissible region. The computational time for solving eq 12 is greater than the time needed to compute a non-robust experiment (eq 7a). However, the application of the robust design allows the number of real experiments, and thus the time necessary to identify the parameters, to be reduced. Finally, it should be noted that for the parameter determination (eq 3), the term robustness is linked to a solution which is insensitive to outliers in the data. For numerical experiments, Kostina 20 presents results in which an l1-based parameter estimation performs better than the traditional l2-based approach.
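The two robust criteria in eqs 11 and 12 can be contrasted with a small sampling-based sketch. It assumes a hypothetical toy model y = θ1·exp(−θ2·u), a uniform ±20% parameter domain Θ, and a handful of candidate two-point designs; the expectation and the inner maximization are both approximated over the same parameter samples rather than solved rigorously.

```python
import numpy as np

def a_criterion(u_design, theta, sigma=0.1):
    """trace of the covariance approximation for the toy model y = theta1 * exp(-theta2 * u)."""
    u = np.asarray(u_design, dtype=float)
    decay = np.exp(-theta[1] * u)
    S = np.column_stack([decay, -theta[0] * u * decay])  # sensitivities dy/dtheta
    F = S.T @ S / sigma**2
    return float(np.trace(np.linalg.inv(F)))

rng = np.random.default_rng(0)
theta_nominal = np.array([1.0, 0.5])
# Sampled parameter domain Theta: uniform within +/-20 % of the nominal values.
theta_samples = theta_nominal * rng.uniform(0.8, 1.2, size=(200, 2))

designs = [(0.5, 1.0), (0.5, 3.0), (1.0, 6.0)]  # hypothetical candidate designs
scores = {d: [a_criterion(d, th) for th in theta_samples] for d in designs}

expected_best = min(designs, key=lambda d: np.mean(scores[d]))  # eq 11
minimax_best = min(designs, key=lambda d: np.max(scores[d]))    # eq 12
print(expected_best, minimax_best)
```

The two selections need not coincide: the expected value criterion tolerates a poor result for a few parameter samples, whereas the minimax criterion is governed entirely by the worst sample.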

2. Impact of uncertainties - an illustrative example

The influence of various uncertainties on the information content of an experimental design is first discussed using a simple dynamic process model. The system of ordinary differential equations is taken from Espie and Macchietto 21 and describes a semi-continuous (fed-batch) fermentation of baker's yeast. Using Contois kinetics together with a constant specific death rate, the following equations describe the biomass and substrate balances in the reactor:

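$$ \begin{aligned} \frac{dx_1}{dt} &= (r - p_2 - u_1) \cdot x_1 \\ \frac{dx_2}{dt} &= -\frac{r \cdot x_1}{p_1} + u_1 \cdot (u_2 - x_2) \\ r &= \frac{\theta_1 \cdot x_2}{\theta_2 \cdot x_1 + x_2} \end{aligned} \tag{13} $$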
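A minimal simulation sketch of eq 13 is given below. It uses a hand-written fixed-step Runge-Kutta scheme and holds the inputs u1 and u2 constant over the 30 h horizon, which is a simplifying assumption of this sketch; the chosen input values and initial biomass concentration are illustrative picks from within the ranges stated in the text.

```python
import numpy as np

def contois_rhs(x, u, theta, p):
    """Right-hand side of eq 13 for constant inputs u = (u1, u2)."""
    x1, x2 = x
    r = theta[0] * x2 / (theta[1] * x1 + x2)
    dx1 = (r - p[1] - u[0]) * x1
    dx2 = -r * x1 / p[0] + u[0] * (u[1] - x2)
    return np.array([dx1, dx2])

def simulate(x0, u, theta, p, t_end=30.0, n_steps=3000):
    """Classical fixed-step fourth-order Runge-Kutta integration."""
    h = t_end / n_steps
    x = np.array(x0, dtype=float)
    traj = [x.copy()]
    for _ in range(n_steps):
        k1 = contois_rhs(x, u, theta, p)
        k2 = contois_rhs(x + 0.5 * h * k1, u, theta, p)
        k3 = contois_rhs(x + 0.5 * h * k2, u, theta, p)
        k4 = contois_rhs(x + h * k3, u, theta, p)
        x = x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        traj.append(x.copy())
    return np.linspace(0.0, t_end, n_steps + 1), np.array(traj)

theta_true = (0.30, 0.03)  # "true" Contois parameters from the text
p0 = (0.55, 0.03)          # constants p1, p2 from the text
# Illustrative design: x1(0) = 5 g/L, u1 = 0.10 1/h, u2 = 20 g/L.
times, traj = simulate(x0=(5.0, 0.01), u=(0.10, 20.0), theta=theta_true, p=p0)
print(traj[-1])
```

Sampling `traj` at ten equidistant times would give the simulated measurements whose parameter sensitivities enter the Fisher information matrix.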

where x1 and x2 denote the biomass and substrate concentrations, respectively, both given in g/L. The parameters of the Contois kinetics, θ = [θ1, θ2]^T, are estimated using OED. Here, the "true" parameter set is given as θ* = [0.30, 0.03]^T. The parameters p0 = [0.55, 0.03]^T are assumed to be known constants. Whereas the initial substrate concentration is fixed at x2(t = 0) = 0.01 g/L, the initial biomass concentration, x1(t = 0), can be chosen between 1.0 and 10.0 g/L. Together with the dilution factor of the feed, u1 (range 0.05 to 0.20 h^-1), and the substrate concentration in the feed, u2 (range 5 to 35 g/L), the experimental design vector is u = [u1, u2, x1(t = 0)]^T. Furthermore, it is assumed that the state variables can be measured at equidistant points in time, Mi, with i ∈ {1, ..., 10}, and the measurement vector is y = [x1, x2]^T. An OED problem is formulated which minimizes the A-optimal criterion as defined in eq 8. However, it has to be noted that the resulting problem shows plenty of local minima, such that a combination of a global and a local search algorithm may be suitable for solving this kind of problem. 8 Figure 2 shows the solution when no uncertainties are considered and the true parameter set is known a priori.

Figure 2: OED results based on the "true" parameter set, θ*, and yielding the maximum information content for this design problem (theoretically best design). [Plot panels over time, 0 to 30 h: concentrations of x1 (biomass) and x2 (substrate) in g/L with the measurement points M1 to M10; u1 (dilution) in h^-1; u2 (feed substrate) in g/L.]

As discussed above, the assumption that θ0 is equal to θ* is not realistic. Instead, the initially assumed θ0 will deviate from the "true" parameter set θ*. As a result, the optimal experimental design computed using θ0 is not optimal for θ*. In other words, with larger deviations of θ0 from θ*, the OED differs from the design shown in Figure 2 and the accuracy of θ* deteriorates. In order to account for these deviations, the minimax problem min_u max_θ0 trace[θV(u, θ0, p)] (see also eq 12) has been solved for four different scenarios. The first scenario corresponds to the ideal case θ0 = θ*, where the minimax problem reduces to a deterministic minimization problem. The other three scenarios consider increasing uncertainties in θ0, with θ0 = θ* + ξθi and the corresponding uncertainty intervals ξθ ∈ {±10%, ±20%, ±30%} · θ*, respectively. The problems are solved by a uniform sampling of the uncertain parameter values in their respective intervals together with a repeated solution of the OED problem, incorporating the corresponding worst-case result. Figure 3 shows the 95% confidence regions of the normalized "true" parameters θ* for all four scenarios. The confidence regions indicate the accuracy of θ*. It can be seen that the parameter accuracy, and thus the quality of the experimental design, depends directly on the uncertainties attached to the initially assumed parameter values θ0.

Figure 3: 95% confidence region of the normalized "true" parameter set, θ*, when the "true" parameter set, θ*, is known a priori (θ0 = θ*), and the worst cases (w.c.), when θ0 deviates from the "true" parameter value θ*. [Plot: normalized θ2* versus normalized θ1*, with regions for θ0 = θ* and for the worst cases θ0 ∈ [θ* ± 10%·θ*], [θ* ± 20%·θ*], and [θ* ± 30%·θ*].]

Besides the uncertainties in θ0, implementation errors are also considered, which result from an imprecise realization of the optimally planned decisions u0. Thus, the worst case (w.c.) is again calculated for four different scenarios, namely u* = u0 (no implementation errors) and u* = u0 + ξui with ξu ∈ {±10%, ±20%, ±30%} · u0. The respective parameter accuracy is obtained by a simulation based on a uniform sampling using perturbed values of u0. In the same way, the impact of uncertainties in the assumed parameters p has been studied. Table 1 shows the A-optimal criterion as a scalar indicator of the parameter accuracy for all three considered sources of uncertainty: ξθ, ξu, and ξp.

Table 1: Numeric results for OED with an increasing uncertainty level (values of the A-optimal criterion).

considered uncertainties   nominal ±0%   worst case ±10%   worst case ±20%   worst case ±30%
θ* + ξθ                    0.001759      0.006226          0.012190          0.101032
u0 + ξu                                  0.004318          0.015401          0.024500
p0 + ξp                                  0.002140          0.003020          0.003704

The example demonstrates that even for small uncertainties (deviations from the true parameter set, θ*, from the planned decisions, u0, and from the assumed design conditions, p0), the information content of an OED can degrade significantly. Summing up, it can be concluded that the OED framework will lose its significant effect whenever uncertainty is not explicitly incorporated and/or compensated. The main sources of uncertainty as they occur during the different steps of an OED have been depicted in Figure 1.
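The degradation reported in Table 1 can be expressed as a ratio of each worst-case value to the nominal value. The short calculation below uses the numbers from the table and assumes that the single nominal value (±0%) is the common reference for all three uncertainty sources.

```python
# A-optimal criterion values from Table 1.
nominal = 0.001759  # assumed to be the reference for all three rows
worst_case = {
    "theta": [0.006226, 0.012190, 0.101032],  # theta* + xi_theta
    "u":     [0.004318, 0.015401, 0.024500],  # u0 + xi_u
    "p":     [0.002140, 0.003020, 0.003704],  # p0 + xi_p
}
levels = [10, 20, 30]  # uncertainty level in percent

ratios = {name: [value / nominal for value in values]
          for name, values in worst_case.items()}

for name, row in ratios.items():
    summary = ", ".join(f"+/-{lvl}%: x{r:.1f}" for lvl, r in zip(levels, row))
    print(f"{name:>5}: {summary}")
```

Under this reading, a ±10% error in the initial parameter guess already raises the criterion by a factor of about 3.5, and a ±30% error by a factor of about 57, whereas errors in p have a much weaker effect.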

3. Case study - determination of protein adsorption isotherms

Errors in the initially assumed parameter values will always exist at the beginning of an optimal design for parameter precision. For a strongly nonlinear process model, initial guesses can differ by several orders of magnitude from the finally estimated values, e.g., in equilibrium relations or reaction kinetics. 2,8 Accordingly, it is not always possible to define reasonable intervals for the size of the uncertainty in the initial parameter guesses. Moreover, implementation errors (deviations from initially assumed or optimally planned process conditions) can generally be identified only from measurement data, which become available after the implementation of the optimally planned experiments. Thus, from a practical point of view, the a priori consideration of uncertainties, which is a precondition of the direct approach and implies the solution of a worst-case or minimax OED problem, is deemed not appropriate. In this work we use the indirect method (sequential approach), where erroneous assumptions and uncertainties are compensated by an iterative refinement and repeated solution of OED and parameter estimation problems. For this purpose, the parameter estimation problem is augmented for the identification and partial compensation of implementation errors.

The following case study describes the experimental determination of adsorption isotherms, which are generally used in modeling, scale-up, and control of high performance liquid chromatography. The proteins considered in this work (β-lactoglobulin A and B) are counted among the potentially important commercial proteins which can be extracted from milk whey. 22,23 In bioprocess engineering, ion-exchange chromatography is used for their separation and purification. 24 The adsorbent material used here is the strong anion exchanger Source 30Q from GE-Healthcare (Munich, Germany). It is composed of rigid, mono-dispersed, spherical serine particles with a diameter of 30 µm. However, the wide range of retentivities of the macromolecules makes it difficult to separate them under isocratic conditions. Thus, an additional component (here salt ions) with a variable concentration is used to modulate the eluent strength, i.e., the retentivities of the eluites, so that the chromatographic separation can be improved. This is the so-called non-isocratic operation, which is a relevant operation mode in liquid chromatography. 24 For more detailed information on the biological system, the experimental set-up, the resulting parameter estimation problem, and the evaluation of the results, we refer to Barz et al. 25

3.1 Equilibrium model and parameters

Adsorption isotherms describe the equilibrium for the different components, $N^C$, between pore surface and fluid phase in the macropores of an adsorbent. Thus, equilibrium models link the adsorbed stationary phase concentration, $q_j^{eq}$, to the free liquid concentration, $c_j^{eq}$. Accordingly, the equilibrium model consists of $N^C$ equations and some constant parameters $\theta$:

$$q_j^{eq} = f(c_j^{eq}, \theta); \quad \text{with } j \in \{1, \dots, N^C\} \tag{14}$$

For the explicit consideration of non-isocratic operation in chromatographic protein separation, Brooks and Cramer 26 developed a steric mass action (SMA) ion-exchange equilibrium formalism, which explicitly accounts for the steric hindrance of salt counterions upon protein binding in multicomponent equilibria. In analogy to the variable coefficient multi-component Langmuir isotherms, the SMA formalism can be represented as follows:

$$q_j^{eq} = \frac{\Lambda \cdot \alpha_{j,1} \cdot c_j^{eq}}{\sum_{i} \alpha_{i,1} \cdot (\sigma_i + \nu_i) \cdot c_i^{eq}}; \quad \text{with } j \in \{1, \dots, N^C\} \mathrel{\hat{=}} \{\mathrm{Cl}^-, \beta\text{-LgA}, \beta\text{-LgB}\} \tag{15}$$

Equation 15 gives the equilibrium concentration in the liquid and stationary phase for $N^C$ components. Besides the proteins, salt ions are used here as modulator in the non-isocratic system and are designated as the first component. The SMA parameters are: the stationary phase capacity for the salt counterions, $\Lambda$ [mM], the dimensionless constant for the characteristic charge, $\nu_j$, and the dimensionless steric hindrance factor, $\sigma_j$. The separation factor, $\alpha_{j,1}$, is the variable coefficient and is defined as follows:

$$\alpha_{j,1} = k_{1,j} \cdot \left( \frac{q_1^{eq}}{c_1^{eq}} \right)^{\nu_j - 1}; \quad \text{with } j \in \{1, \dots, N^C\} \tag{16}$$

According to Brooks and Cramer, 26 the SMA parameters for the salt ions ($\mathrm{Cl}^-$) take the values:

$$k_{1,1} = 1; \quad \sigma_1 = 0; \quad \nu_1 = 1 \tag{17}$$

Thus, the vector of unknown parameters is given by

$$\theta = [\Lambda, k_{1,j}, \nu_j, \sigma_j]^T; \quad \text{with } j \in \{2, 3\} \text{ and } \sigma_2 = \sigma_3 \tag{18}$$

Since the proteins β-LgA and β-LgB have a similar structure, the steric factor, $\sigma_j$, is assumed to be equal for both proteins, 27-29 and thus the number of unknown parameters is reduced by one to $N^\theta = 6$.
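The implicit character of eqs 15 and 16 (the separation factor depends on the salt loading itself) can be made concrete with a short numerical sketch. The following Python function is an illustration only and not the authors' implementation. It assumes the salt values of eq 17 and uses the fact that eqs 15-17 are equivalent to $q_j^{eq} = k_{1,j} \cdot r^{\nu_j} \cdot c_j^{eq}$ with $r = q_1^{eq}/c_1^{eq}$, together with the capacity balance $\Lambda = \sum_j (\nu_j + \sigma_j) \cdot q_j^{eq}$, which is solved for $r$ by bisection. The function name and all numerical values are hypothetical and are not taken from the paper.

```python
def sma_loadings(c, lam, k, nu, sigma, rel_tol=1e-12):
    """Adsorbed concentrations q_j^eq for liquid concentrations c (index 0 = salt).

    Assumes k[0] = 1, nu[0] = 1, sigma[0] = 0 for the salt (eq 17).
    """
    n = len(c)

    def capacity_balance(r):
        # r = q_1^eq / c_1^eq; residual of lam = sum_j (nu_j + sigma_j) * q_j
        return sum((nu[j] + sigma[j]) * k[j] * r ** nu[j] * c[j]
                   for j in range(n)) - lam

    # The residual is negative at r = 0 and non-negative at r = lam / c_1,
    # and it increases monotonically in between, so bisection is safe.
    lo, hi = 0.0, lam / c[0]
    while hi - lo > rel_tol * hi:
        mid = 0.5 * (lo + hi)
        if capacity_balance(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    r = 0.5 * (lo + hi)
    return [k[j] * r ** nu[j] * c[j] for j in range(n)]


# Hypothetical values (mM for concentrations and capacity), for illustration only.
c = [100.0, 0.05, 0.05]          # Cl-, beta-LgA, beta-LgB in the liquid phase
lam = 300.0                      # stationary phase capacity
k = [1.0, 0.1, 0.05]             # k_{1,j}
nu = [1.0, 4.0, 4.0]             # characteristic charges
sigma = [0.0, 20.0, 20.0]        # steric factors, sigma_2 = sigma_3
q = sma_loadings(c, lam, k, nu, sigma)
print(q)
```

Inserting the returned loadings into eq 15 reproduces them, which is a convenient consistency check when the isotherm is embedded in a larger estimation code.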

3.2 Experimental design of "static" batch experiments

In the well-tried batch method, thermodynamic equilibrium points are obtained, which characterize the fluid phase concentrations, $c^{eq}$, and the loading of the solid, $q^{eq}$. For one batch, preset amounts of solutes are equilibrated in a closed vessel, which is partially filled with adsorbent. It has to be noted that despite the intrinsic dynamic behavior of batch processes, only the initial and final equilibrium states are analyzed here, and therefore the method is counted among the static methods. A general overview of recent (alternative) methods for the determination of adsorption isotherms is given by Seidel-Morgenstern. 3 The static batch method is well suited for the measurement of multi-component equilibrium data. The major advantage is the exclusion of kinetic effects during the adsorption process, such as the intraparticular mass transfer, and thus no additional parameters have to be determined. On the other hand, the laboratory effort is relatively high due to the time-consuming preparation of each batch, the concentration analysis, and the time required until equilibrium is reached. Because of the limited accuracy, a systematic consideration of uncertainties is compulsory for a precise determination of the adsorbent amount.

The liquid volume in the batch consists of both the free (external) liquid around the adsorbent particles and the (internal) liquid in the particle pores. The adsorbed phase concentration, $q$, refers to the molecules attached to the adsorbent. The volumetric liquid part of the batch volume, $\varepsilon_{tot}$, is calculated as follows:

$$\varepsilon_{tot} = \varepsilon_{ext} + (1 - \varepsilon_{ext}) \cdot \varepsilon_{int} \tag{19}$$

where $\varepsilon_{ext}$ and $\varepsilon_{int}$ are the external or interstitial porosity and the internal or intra-particle porosity, respectively. The intraparticular adsorbent porosity is $\varepsilon_{int} = 0.57$, taken from Wekenborg et al. 30 The external porosity, and thus the value of $\varepsilon_{tot}$, is varied by changing the amount of adsorbent in the batch. During the adsorption process each component has to fulfill the mass balance:

$$\varepsilon_{tot} \cdot c_j^{ini} + (1 - \varepsilon_{tot}) \cdot q_j^{ini} = \varepsilon_{tot} \cdot c_j^{eq} + (1 - \varepsilon_{tot}) \cdot q_j^{eq}; \quad \text{with } j \in \{1, \dots, N^C\} \tag{20}$$

In eq 20, initial concentrations and final equilibrium states are denoted by the superscripts ini and eq, respectively. The adsorption process is initialized by filling a certain amount of dissolved proteins into the batch, which is partially filled with pure adsorbent. Thus, the initial adsorbed concentration of the proteins is $q_j^{ini} = 0$ with $j \in \{2, 3\}$. In contrast, the initial adsorbed salt ion concentration is set to the total adsorbent capacity, $q_1^{ini} = \Lambda$, because of the special pre-treatment of the adsorbent. 25 Each specific experiment aims to adjust the initial solute and adsorbent amounts in the batch so as to implement a certain equilibrium point. The batch volume, $V^{bt}$, is usually determined by the minimum amount of liquid needed for the analysis of the dissolved concentrations. The equilibrium conditions can be selected by choosing the initial amounts of solutes and adsorbent in the batch. Thus, the resulting number of degrees of freedom is $N^C + 1$.
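Equations 19 and 20 translate directly into a few lines of code. The sketch below is an illustration under stated assumptions, not part of the original work: the function names are hypothetical, the concentrations are made-up example values, and eq 20 is simply rearranged to give the loading $q_j^{eq}$ from a measured concentration drop.

```python
def total_porosity(eps_ext, eps_int=0.57):
    """Volumetric liquid fraction of the batch, eq 19."""
    return eps_ext + (1.0 - eps_ext) * eps_int


def equilibrium_loading(c_ini, c_eq, q_ini, eps_tot):
    """Adsorbed concentration q^eq from the component mass balance, eq 20."""
    return q_ini + eps_tot / (1.0 - eps_tot) * (c_ini - c_eq)


# Illustrative numbers only (mM); q_ini = 0 holds for the proteins.
eps_tot = 0.948
q_protein = equilibrium_loading(c_ini=0.10, c_eq=0.04, q_ini=0.0, eps_tot=eps_tot)
print(total_porosity(0.4), q_protein)
```

Because the factor $\varepsilon_{tot}/(1 - \varepsilon_{tot})$ becomes large when little adsorbent is present, small errors in the measured concentration difference are strongly amplified in the computed loading, which is one reason why the uncertainty of $\varepsilon_{tot}$ matters.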

3.3 Preparation, analysis and uncertainties of a batch experiment

The inclusion of practical restrictions is decisive for the formulation of the experimental design problem. Furthermore, the decisions necessary to obtain a desired equilibrium point are not obvious, because the dissolved protein concentration decreases from its initial value to the equilibrium concentration. Therefore, the proposed model is extended by additional equations that describe the basic steps carried out by the laboratory assistant for the batch preparation, which then lead to a desired equilibrium point. In order to achieve maximal accuracy using precise pipettes, components and adsorbent are added to the batch as volumetric liquid solution or suspension (Figure 4), which defines the total batch volume, $V^{bt}$:

$$V^{bt} = V^{sl} + \sum_{i=1}^{N^C} V_i^{std} \tag{21}$$

In eq 21, the superscripts sl and std denote the slurry with the adsorbent in the solvent, and the dissolved standard protein concentrations as well as the solvent enriched with a desired salt concentration, respectively. The total porosity of the slurry, $\varepsilon_{tot}^{sl}$, is defined in the same way as in eq 19. Its value is influenced by changing the amount of adsorbent in the slurry. For a given $\varepsilon_{tot}^{sl}$, the volumetric liquid fraction of the batch is calculated as follows:

$$\varepsilon_{tot} = \frac{\varepsilon_{tot}^{sl} \cdot V^{sl} + \sum_{i=1}^{N^C} V_i^{std}}{V^{bt}} \tag{22}$$

[Figure 4 (schematic): the slurry ($c_1^{sol}$, $\varepsilon_{tot}^{sl}$, $q_1^{0} = \Lambda$), the standard salt solution ($c_1^{std}$), and the two standard protein solutions ($c_1^{sol}, c_2^{std}$ and $c_1^{sol}, c_3^{std}$) are mixed with the volumes $V^{sl}$, $V_1^{std}$, $V_2^{std}$, $V_3^{std}$ to the batch volume $V^{bt}$ (initial state: $c_j^{ini}$, $\varepsilon_{tot}$; equilibrium state: $c_j^{eq}$, $q_j^{eq}$). Samples are analyzed by analytical HPLC, giving $A_j^{std}$, $A_j^{eq}$, and $A_{sum}^{eq}$.]

Figure 4: Preparation of initial and equilibrium state for one single batch experiment.

With the corresponding standard concentrations, $c_j^{std}$ with $j \in \{1, 2, 3\}$, and a constant solvent salt concentration, $c_1^{sol}$, which results from the specific pre-treatment during the batch preparation, the initial liquid salt concentration in the batch is:

$$c_1^{ini} = \frac{c_1^{std} \cdot V_1^{std} + c_1^{sol} \cdot \left( \sum_{i=2}^{N^C} V_i^{std} + \varepsilon_{tot}^{sl} \cdot V^{sl} \right)}{\varepsilon_{tot} \cdot V^{bt}} \tag{23}$$

and the initial protein concentrations are:

$$c_j^{ini} = \frac{c_j^{std} \cdot V_j^{std}}{\varepsilon_{tot} \cdot V^{bt}}; \quad \text{with } j \in \{2, 3\} \tag{24}$$
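The preparation steps in eqs 21-24 can be collected in one small function that maps pipetted volumes to the initial state of the batch. This is a minimal sketch for illustration, not the authors' code. The function name is hypothetical, and the slurry porosity, the solvent salt concentration, and the volumes are assumed example values; only the protein standard concentration of 0.37 mM and the batch size of 4 ml appear in the text.

```python
def batch_initial_state(v_sl, v_std, c_std, c1_sol, eps_sl_tot):
    """Initial batch state from the preparation volumes (index 0 = salt).

    v_std, c_std: volumes and concentrations of the standard solutions.
    Returns (V_bt, eps_tot, [c_1^ini, c_2^ini, ...]).
    """
    v_bt = v_sl + sum(v_std)                               # eq 21
    eps_tot = (eps_sl_tot * v_sl + sum(v_std)) / v_bt      # eq 22
    liquid = eps_tot * v_bt                                # liquid volume in the batch
    # Salt enters with its own standard and with the solvent of slurry and protein standards.
    c1_ini = (c_std[0] * v_std[0]
              + c1_sol * (sum(v_std[1:]) + eps_sl_tot * v_sl)) / liquid   # eq 23
    c_ini = [c1_ini] + [c_std[j] * v_std[j] / liquid
                        for j in range(1, len(v_std))]                    # eq 24
    return v_bt, eps_tot, c_ini


# Assumed example: volumes in ml, concentrations in mM.
v_bt, eps_tot, c_ini = batch_initial_state(
    v_sl=1.0, v_std=[1.0, 1.0, 1.0],
    c_std=[200.0, 0.37, 0.37], c1_sol=50.0, eps_sl_tot=0.8)
print(v_bt, eps_tot, c_ini)
```

For these assumed inputs the batch volume is 4 ml and the liquid fraction is 0.95, so each protein is diluted from 0.37 mM to 0.37/3.8 mM before adsorption starts.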

For all experiments, the standard protein concentration is set to $c_j^{std} = 0.37$ mM with $j \in \{2, 3\}$, which together with the corresponding standard volume determines the maximal initial protein concentration, $c_j^{ini}$. According to the above-mentioned steps (see also Barz et al. 25), a batch experiment is now determined by the three volumes $V^{sl}$, $V_2^{std}$, $V_3^{std}$, and the salt concentration, $c_1^{std}$ (Figure 4). Due to the small batch size of 4 ml, the salt ion concentration has not been measured. The protein concentrations are analyzed using an analytical HPLC by peak area integration of the photometric outlet signal. The concentrations are obtained from the specific peak areas, $A_j^{eq}$, using a linear calibration with the constant parameters $m_j$ and $n_j$ as follows:

$$c_j^{eq} = m_j \cdot A_j^{eq} + n_j; \quad \text{with } j \in \{2, 3\} \tag{25}$$

A total separation of the specific areas of each individual protein has not been achieved. 25 For their independent determination, the overlapping peaks have to be divided manually, which introduces a certain error. However, the sum of the protein peak areas, $A_{sum}^{eq}$, is not affected by this division and possesses a much higher accuracy than each single $A_j^{eq}$. Therefore, it is convenient to consider $A_{sum}^{eq}$ as an additional measurement variable (eq 26), which provides some extra information with a relatively high accuracy.

$$A_{sum}^{eq} = \sum_{j=2}^{N^C} A_j^{eq} \tag{26}$$

The measured variables, $y_k = [A_2^{eq}, A_3^{eq}, A_{sum}^{eq}]^T$, represent the result of one single batch experiment, $k$. In addition, some uncertain parameters can be considered, which result from an imprecise preparation of the batch experiments. Uncertainties can here mainly be ascribed to the prepared standard concentrations $c_1^{std}$, $c_2^{std}$, $c_3^{std}$, and the void fraction, $\varepsilon_{tot}^{sl}$. As shown in Figure 4, the values of the protein standard concentrations can be verified by measurements. This is done through analysis of the corresponding protein peak areas by using the same linear calibration as in eq 25:

$$c_j^{std} = m_j \cdot A_j^{std} + n_j; \quad \text{with } j \in \{2, 3\} \tag{27}$$

Uncertainties in the protein concentration measurement are assigned to the corresponding protein peak areas. The statistical analysis of several repeated calibration curves revealed that a combination of relative and absolute measurement error reproduces the resulting deviations in the protein peak areas quite well. In contrast, the values for $\varepsilon_{tot}^{sl}$ and $c_1^{std}$ represent empirical approximations of their respective uncertainties. They are based on an analysis of the laboratory steps for the preparation of the slurry and the standard salt solution. The corresponding standard deviations used to define the diagonal elements of the measurement covariance matrix, $MV_k$, are shown in Table 2.

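Table 2: Uncertainties defined as uncorrelated zero-mean white noise for measured variables, uncertain experimental conditions, and design parameters.

| variable | standard deviation, $\sigma$ |
|---|---|
| $A_j^{eq}$, $A_j^{std}$; $j \in \{2, 3\}$ | $\sigma = \sigma^{abs} + \sigma^{rel}$; $\sigma^{abs} = 10$; $\sigma^{rel} = 1/80 \cdot A_j$ |
| $A_{sum}^{eq}$ | $\sigma = \sigma^{abs} + \sigma^{rel}$; $\sigma^{abs} = 5$; $\sigma^{rel} = 3/400 \cdot A_{sum}$ |
| $\varepsilon_{tot}^{sl}$ | $\sigma = 1/800$ |
| $c_1^{std}$ | $\sigma = 3/4$ |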
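The combined absolute and relative error model of Table 2 can be evaluated as in the following sketch. It is an illustration only: the function names and the peak areas are hypothetical, and it is assumed here (not stated explicitly in the text) that the diagonal elements of $MV_k$ are the squared standard deviations.

```python
def sigma_peak_area(area, sigma_abs, rel_factor):
    """Standard deviation sigma = sigma_abs + sigma_rel with sigma_rel = rel_factor * area."""
    return sigma_abs + rel_factor * area


def measurement_variances(a2, a3):
    """Assumed diagonal of MV_k for y_k = [A_2^eq, A_3^eq, A_sum^eq]."""
    a_sum = a2 + a3                                        # eq 26
    sigmas = [
        sigma_peak_area(a2, 10.0, 1.0 / 80.0),             # single protein peak
        sigma_peak_area(a3, 10.0, 1.0 / 80.0),             # single protein peak
        sigma_peak_area(a_sum, 5.0, 3.0 / 400.0),          # summed peak area
    ]
    return [s ** 2 for s in sigmas]                        # assumption: variance = sigma^2


# Hypothetical peak areas in arbitrary detector units.
variances = measurement_variances(a2=800.0, a3=400.0)
print(variances)
```

For these example areas the relative standard deviation of the summed area is smaller than that of either single peak, which mirrors the statement that $A_{sum}^{eq}$ is the more accurate measurement.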

3.4 Preliminary studies with different experimental design strategies

OED aims at minimizing the number of batch experiments, in other words, the number of equilibrium points which are necessary in order to obtain an adequate accuracy of the estimate of the SMA parameters, $\theta$. At first glance, a simultaneous design of experiments seems to be the first choice for the described problem, as it corresponds to the standard procedure in regular lab work, where several batches are produced and analyzed in parallel. Here, first of all, different design strategies, including also the sequential and the parallel/sequential approach (see also 1), are compared for a given and fixed initial parameter set, $\theta^0$. Using the A-optimal criterion (see 8), the general OED problem for $k$ experiments with a restricted maximal available amount of proteins, $m_j^{max}$, reads:

$$
\begin{aligned}
\min_{u} \quad & \varphi\{\theta V(u_1, \dots, u_{N^{Exp}}, \theta, p_1, \dots, p_{N^{Exp}})\} \\
\text{s.t.} \quad & \text{eqs 15, 16, 20-26} \\
& 0 \le V^{sl} + V_2^{std} + V_3^{std} \le 4\ \text{ml} \\
& 0 \le V^{sl}; \quad 0 \le V_2^{std}; \quad 0 \le V_3^{std} \\
& 0 \le c_1^{std} \le 400\ \text{mM} \qquad \forall\, k \in \{1, \dots, N^{Exp}\} \\
& 0 \le \sum_{k=1}^{N^{Exp}} \left( V_j^{std} \cdot c_j^{std} \right)_k \le m_j^{max}; \quad j \in \{2, 3\} \\
\text{with} \quad & \theta = [\Lambda, k_{1,j}, \nu_j, \sigma_j]^T; \quad \sigma_2 = \sigma_3; \quad j \in \{2, 3\} \\
& x_k = [\alpha_{j,1}, c_j^{eq}, q_j^{eq}, c_j^{ini}, \varepsilon_{tot}, V_1^{std}]^T; \quad j \in \{1, 2, 3\} \\
& y_k = [A_2^{eq}, A_3^{eq}, A_{sum}^{eq}]^T \\
& u_k = [V^{sl}, V_2^{std}, V_3^{std}, c_1^{std}]^T \\
& p_k = [V^{bt}, c_1^{std}, \varepsilon_{tot}^{sl}, c_2^{std}, c_3^{std}, m_2, m_3, n_2, n_3]^T \qquad \forall\, k \in \{1, \dots, N^{Exp}\}
\end{aligned}
\tag{28}
$$

While the number of parameters remains constant, with $N^\theta = 6$, the number of variables, equations, and inequality constraints depends on the number of experiments to be planned. Accordingly, the number of model equations and states is $N^x = 14 \cdot N^{Exp}$, the number of measurement equations and measured variables is $N^y = 3 \cdot N^{Exp}$, the number of design variables is $N^u = 4 \cdot N^{Exp}$, and the number of constant parameters is $N^p = 9 \cdot N^{Exp}$. The diagonal elements of the measurement covariance matrix, $MV_k$ (see 5), are defined by their respective values given in Table 2. As seen in Figure 4, neither the values nor the uncertainties of the concentrations and void fractions of the standard volumes, nor the calibration parameters, $m_j$, $n_j$, are necessarily independent. These correlations were neglected here, because in some cases neither the implementation nor the analysis of simultaneously planned experiments could be carried out in parallel.

Figure 5 shows the results of theoretical studies concerning the evolution of the A-optimal criterion for different design strategies and fixed parameters $\theta^0$. It can be seen that a minimum of four parallel planned batch experiments is necessary in order to reach identifiability of all six SMA parameters, $\theta$. Thus, up to a total of 20 experiments were additionally planned. However, high-dimensional OED problems commonly exhibit several local minima (e.g. for 10 parallel planned experiments the number of design variables reaches 40). Thus, each problem has been solved repeatedly by a gradient-based SQP algorithm using random start values of the decision variables, $u$, and choosing the best result. 8

[Figure 5: plot of $\varphi$ (A-optimal criterion) versus number of experiments for the OED strategies 4+1+...+1, 4+3+...+3, 5+5+5+5, and 10+10, with a region marked "no identifiability".]

Figure 5: Theoretical studies using different experimental design strategies for a fixed parameter guess, $\theta^0$ (e.g. 4 + 1 + · · · + 1 means: 4 parallel planned experiments and subsequently several single planned experiments).

It should be noted that in Figure 5 the planning of 4 + 1 + · · · + 1 experiments follows in essence the sequential design strategy, while the planning of 10 + 10 experiments can be seen as nearly a parallel strategy. The remaining ones in Figure 5 comply with a combined parallel/sequential strategy. Surprisingly, the evolution of the A-optimal criterion is not influenced by the strategies applied. Even if experiments are repeated using exactly four times the same "optimal" design variables, which were obtained for 5 parallel planned experiments (4 · 5 strategy), the evolution of the A-optimal criterion does not differ significantly. Moreover, arising deviations can be attributed to local optima rather than to actual differences in the A-optimal criterion caused by a specific design strategy. In contrast to this, examples can be found e.g. in Schöneberger et al., 8 where the planning of different experimental designs offers an advantage even for a two-dimensional problem.

In this case study, the A-optimal criterion seems to be independent of the applied strategy. This is mainly because of the low accuracy of the measured data. Simply put, the information content of the first 5 measured equilibrium points is already relatively high in terms of identifiability, but the measured data accuracy is still low. For all applied strategies, it can be concluded that for an increasing number of experiments, the decrease of the A-optimal criterion is mainly due to an increased measured data accuracy based on repeated measurements rather than to new information (e.g. analysis of the adsorption behavior at a different equilibrium point).
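The observation that replicated designs reduce the criterion mainly through averaging can be reproduced with a toy problem. The sketch below is not the SMA design problem of eq 28. It swaps in a hypothetical one-component, two-parameter Langmuir-type model $q = a \cdot c / (1 + b \cdot c)$ with uncorrelated measurement noise, and it assumes the common definition of the A-criterion as the trace of the inverse Fisher information matrix, which may differ from the criterion used in this work by a normalization.

```python
def fisher_information(c_points, a, b, sigma):
    """2x2 Fisher information (f11, f12, f22) for the toy model q = a*c/(1+b*c)."""
    f11 = f12 = f22 = 0.0
    for c in c_points:
        s_a = c / (1.0 + b * c)                      # dq/da
        s_b = -a * c * c / (1.0 + b * c) ** 2        # dq/db
        f11 += s_a * s_a / sigma ** 2
        f12 += s_a * s_b / sigma ** 2
        f22 += s_b * s_b / sigma ** 2
    return f11, f12, f22


def a_criterion(c_points, a, b, sigma):
    """Trace of the inverse Fisher information; infinite if not identifiable."""
    f11, f12, f22 = fisher_information(c_points, a, b, sigma)
    det = f11 * f22 - f12 * f12
    if det <= 1e-12 * f11 * f22:                     # numerically singular matrix
        return float("inf")
    return (f11 + f22) / det                         # trace of the 2x2 inverse


# Hypothetical parameter guess, noise level, and design points.
a, b, sigma = 50.0, 10.0, 0.05
design = [0.02, 0.1]
single = a_criterion(design, a, b, sigma)
repeated = a_criterion(design * 4, a, b, sigma)      # same design carried out four times
print(single, repeated, a_criterion([0.05], a, b, sigma))
```

In this toy setting, repeating the same design four times divides the criterion by four without adding a new equilibrium point, and a single point cannot identify both parameters, which is the analogue of the "no identifiability" region in Figure 5.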

3.5 Experimental results for a combined parallel/sequential strategy

In practice, after each set of parallel planned and implemented optimal experiments, the parameters are updated, $\theta^0 \rightarrow \theta^*$, using the available measurements and solving a parameter estimation problem such as in eq 29. The updated parameters, $\theta^*$, are then used for the solution of the subsequent OED problems. For an exact determination of the SMA parameters, implementation errors have to be considered. 15 For this purpose, the uncertain design and experimental conditions, $p'_k$, of each batch experiment are added as free decisions and are now adjustable parameters in the problem defined in eq 29. This means that the size estimation of the so-called implementation errors is now part of the parameter estimation problem:

$$
\begin{aligned}
\min_{\theta'} \quad & \sum_{k=1}^{N^{Exp}} \left( y_k^{meas} - y_k(\theta', p_k) \right)^T \cdot \widetilde{MV}_k^{-1} \cdot \left( y_k^{meas} - y_k(\theta', p_k) \right) \\
\text{s.t.} \quad & \text{eqs 15, 16, 20-27} \qquad \forall\, k \in \{1, \dots, N^{Exp}\} \\
\text{with} \quad & \theta' = [\theta, p'_1, p'_2, \dots, p'_{N^{Exp}}]^T \\
& \theta = [\Lambda, k_{1,j}, \nu_j, \sigma_j]^T; \quad \sigma_2 = \sigma_3; \quad j \in \{2, 3\} \\
& p'_k = [c_2^{std}, c_3^{std}, \varepsilon_{tot}^{sl}, c_1^{std}]^T \\
& x_k = [\alpha_{j,1}, c_j^{eq}, q_j^{eq}, c_j^{ini}, \varepsilon_{tot}, V_1^{std}]^T; \quad j \in \{1, 2, 3\} \\
& y_k = [A_2^{eq}, A_3^{eq}, A_{sum}^{eq}, A_2^{std}, A_3^{std}, c_1^{std}, \varepsilon_{tot}^{sl}]^T \\
& p_k = [V^{sl}, V_2^{std}, V_3^{std}, V^{bt}, c_1^{std}, m_2, m_3, n_2, n_3]^T \qquad \forall\, k \in \{1, \dots, N^{Exp}\}
\end{aligned}
\tag{29}
$$

where the measurement vector, $y_k$, is now augmented firstly by assumed values for $c_1^{std}$ and $\varepsilon_{tot}^{sl}$, and secondly by the measured peak areas, $A_2^{std}$ and $A_3^{std}$, which represent the protein standard concentrations, $c_2^{std}$ and $c_3^{std}$, using the linear calibration defined in eq 27. The number of parameters and estimated uncertain experimental conditions is then $N^{\theta'} = 6 + 4 \cdot N^{Exp}$, the number of states and model equations is $N^x = 14 \cdot N^{Exp}$, the number of measurement equations and variables is $N^y = 7 \cdot N^{Exp}$, and the number of constant parameters is $N^p = 9 \cdot N^{Exp}$. It should be noted that the constant design parameter in the OED problem in eq 28, $c_1^{std}$, as well as the design variables, $c_2^{std}$, $c_3^{std}$, $\varepsilon_{tot}^{sl}$, are affected by implementation errors, here $\xi^p$ and $\xi^u$, respectively. The values for the definition of $MV_k$ are again taken from Table 2. It has to be noted, however, that a careful selection of the uncertainty size attached to $y_k$ (Table 2) is crucial in order to obtain a meaningful solution of the problem defined in eq 29. In particular, the estimated uncertain experimental conditions, $p'_k$, depend strongly on the uncertainty attached to their assumed values. For instance, $\sigma(A_2^{std})$ influences the result for $c_2^{std}$, and $\sigma(\varepsilon_{tot}^{sl})$ influences the result for $\varepsilon_{tot}^{sl}$. In other words, if the uncertainty of a variable in $p'_k$ is selected too high, a significant deviation from the assumed experimental conditions is possible. Such a deviation can no longer be explained as the correction of a possible implementation fault. Rather, it constitutes an artificial change in the experimental conditions, which is directly used in the parameter fitting problem in eq 29. This is clearly not desirable for an exact determination of the unknown SMA parameter set, $\theta$.

Figure 6 shows the experimental results from the parallel/sequential (5+5+5) strategy. For the first and second five experiments, the initial parameter guess $\theta^0$ used for the experiment planning differs significantly from the parameters $\theta^*$ obtained after implementation and parameter estimation (see also 1). The optimally planned experiment design gives a maximal parameter accuracy (represented by a small value of the A-optimal criterion) only for the initially assumed parameters $\theta^0$. For the different values $\theta^*$ after implementation, the accuracy degrades and the A-optimal criterion increases as depicted in Figure 6. However, the quality of the estimate of $\theta^0$ improves, and thus the error of an imprecise initial parameter estimate diminishes with an increasing number of experiments.

[Figure 6: plot of $\varphi$ (A-optimal criterion) versus number of experiments for the parallel/sequential 5+5+5 strategy, showing $\varphi(\theta^0, u^0, p^0)$ (initially planned) and $\varphi(\theta^*, u^*, p^*)$ (after implementation), with a region marked "no identifiability".]

Figure 6: Experimental results of the parallel/sequential design approach over the implemented number of experiments. Each value of the A-optimal criterion is shown before ($^0$) and after ($^*$) the update of the SMA parameters.

Moreover, as discussed above, the uncertainty cannot be attributed solely to uncertainties in the initial guess of the SMA parameters, $\xi^\theta$. The theoretical influence of the considered possible implementation errors $\xi^p$ and $\xi^u$ is obtained by sampling. The corresponding standard deviations are given in Table 2. In Figure 7, the resulting influence after the parameter update is shown w.r.t. the obtained A-optimal criterion. Here again, the possible error in the design criterion is reduced for an increasing number of experiments. It should be noted, however, that the size of the implementation error can partially be detected by the solution of the augmented parameter estimation problem in eq 29. 15 This approach improves the quality of the SMA parameter estimate, but it does not prevent the degradation of the experimental design criterion.
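The sensitivity of the estimated experimental conditions to their assigned uncertainty can be illustrated with a scalar analogue of eq 29. The sketch below is a toy example and not the estimation problem solved in this work. A single condition $p$ is pulled towards its assumed value with weight $1/\sigma_p^2$ and towards a value implied by a measurement with weight $1/\sigma_y^2$, and the minimizer of this weighted least-squares objective is the precision-weighted mean. The function name and all numbers are hypothetical.

```python
def estimate_condition(p_assumed, sigma_p, y_meas, sigma_y):
    """Minimizer of ((p_assumed - p)/sigma_p)**2 + ((y_meas - p)/sigma_y)**2."""
    w_p = 1.0 / sigma_p ** 2      # weight of the assumed experimental condition
    w_y = 1.0 / sigma_y ** 2      # weight of the measurement-implied value
    return (w_p * p_assumed + w_y * y_meas) / (w_p + w_y)


# Hypothetical case: nominal standard concentration 0.37 mM, while the
# measurement suggests 0.35 mM with a standard deviation of 0.01 mM.
tight = estimate_condition(0.37, sigma_p=0.001, y_meas=0.35, sigma_y=0.01)
loose = estimate_condition(0.37, sigma_p=0.1, y_meas=0.35, sigma_y=0.01)
print(tight, loose)
```

With a small $\sigma_p$ the estimate stays close to the assumed condition, whereas a large $\sigma_p$ lets it follow the data almost completely. In the full problem this freedom is shared with the SMA parameters, which is why an overly large uncertainty acts as an artificial change of the experimental conditions rather than as a correction.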

[Figure 7: plot of $\varphi$ (A-optimal criterion) versus number of experiments for the parallel/sequential 5+5+5 strategy, showing $\varphi(\theta^*, u^* + \xi^u, p^* + \xi^p)$ with annotated values $\sigma_\varphi$ = 2.4E-2, 8.3E-3, and 3.5E-3, and a region marked "no identifiability".]

Figure 7: Influence of implementation errors, $\xi^u$, $\xi^p$, on the experimental results for the parallel/sequential design approach.

3.6 Comparison of conventional and optimal planned experiments

Figure 8 shows both the conventionally planned experiments at two different salt ion concentrations (a total of 40 batches) and the optimally planned experiments (15 experiments). It can be seen that the optimally placed equilibrium points cannot be selected by intuitive reasoning, while the conventionally planned experiments aim at keeping certain liquid concentrations or their ratio constant for several batch experiments. The optimally planned experiments cover a wider range of the salt concentration $c_1^{eq}$, whereas the conventional experiments are performed only at 130 and 150 mM. It can also be noted that the protein concentrations $c_2^{eq}$ and $c_3^{eq}$ in the optimally planned experiments are generally smaller than in the conventional experiments. This is because of the restrictions $m_2^{max}$ and $m_3^{max}$ on the total amount of proteins (see also eq 28).

[Figure 8: two panels showing the equilibrium points in terms of $\mathrm{Cl}^-$: $c_1^{eq}$ [mM], βLgA: $c_2^{eq}$ [mM], and βLgB: $c_3^{eq}$ [mM]; left panel: conventional planned experiments; right panel: optimal planned experiments 1-5, 6-10, and 11-15.]

Figure 8: Liquid concentrations for the equilibrium points of conventional planned experiments (left) and those which result from the solution of the OED problem (right).

For a further characterization of all conducted experiments, the reduction of protein concentrations is analyzed starting from the initial values $c_2^{ini}$ and $c_3^{ini}$ to the equilibrium protein concentrations $c_2^{eq}$ and $c_3^{eq}$ for different adsorbent void fractions $\varepsilon_{tot}$ and a constant initial salt concentration, $c_1^{ini}$, in each batch. In Figure 9, it can thus be seen that the liquid and adsorbed equilibrium protein concentrations $c_2^{eq}$, $c_3^{eq}$, $q_2^{eq}$, $q_3^{eq}$ take different values for constant values of $c_2^{ini}$, $c_3^{ini}$, depending on $\varepsilon_{tot}$. This is basically because of the higher adsorption affinity of β-lgA. The variation of the equilibrium points with increasing values of $\varepsilon_{tot}$ (i.e. decreasing amounts of adsorbent in the batch) is calculated by using the final parameter estimate from Table 3. The experimental results for the different batches are depicted in Figure 10.

[Figure 9: plot of $q_j$ [mM] versus $c_j$ [mM] for β-lgA and β-lgB, showing the paths from the common initial point $[c_2^{ini}, q_2^{ini}] = [c_3^{ini}, q_3^{ini}]$ to the equilibrium points $[c_2^{eq}, q_2^{eq}]$ and $[c_3^{eq}, q_3^{eq}]$, with the labels $\varepsilon_{tot} = 0.948$ and $\varepsilon_{tot} = 0.988$.]

Figure 9: Reduction of dissolved concentrations from initial to equilibrium state for different adsorbent void fractions, $\varepsilon_{tot}$, with a constant initial salt concentration, $c_1^{ini} = 100$ mM, and constant initial protein concentrations, $c_2^{ini} = c_3^{ini} = 0.1$ (computation based on the SMA parameters from Table 3).

It should be noted that while for the conventionally planned experiments (Figure 10, left) the reduction of liquid concentrations is relatively small and nearly constant, it is much higher for the optimally planned experiments because of the higher amounts of adsorbent in each batch. However, due to the limited accuracy of the measured values of $c_2^{ini}$, $c_3^{ini}$ and $c_2^{eq}$, $c_3^{eq}$, both their absolute values and their difference have to be sufficiently high in order to obtain significant results.

The clear advantages of the OED are the reduced experimental effort due to the smaller number of experiments and the higher parameter accuracy because of the higher information content of the measured concentrations. It can be seen from Table 3 that a value of $\varphi^{opt} = 0.148$ is obtained for the 15 optimally designed experiments with a much smaller protein consumption, whereas $\varphi^{conv} = 516.0$ is achieved based on the conventionally planned experiments. The corresponding relative standard deviations $\sigma_{i,i}$, which are adopted from the diagonal elements, and the maximal correlation of the parameters, $e^{max}$, taken from the off-diagonal elements of the approximated parameter covariance matrix in 4 confirm these results. It has to be noted that the OED procedure combines equilibrium points with single (pure A and B) and also mixed protein concentrations. Table 4 shows the initially assumed and finally estimated parameter values $\theta$. Table 5 lists the initially assumed and finally verified experimental conditions $p$ as well as the corresponding design variables $u$, which result from eq 29.

[Figure 10: two panels of $q_j$ [mM] versus $c_j$ [mM] for β-lgA and β-lgB.]

Figure 10: Experimental results for the reduction of the dissolved protein concentrations with different salt concentrations and adsorbent amounts for conventional planned experiments (left) and those which result from the solution of the OED problem (right).

Table 3: Experiment results with initially assumed ($^0$) and updated ($^*$) parameters.

Table 4: Initially assumed and finally estimated parameter values.

Table 5: Estimated deviations in initially assumed (0) and finally estimated (∗) constant experimental conditions pk as well as planned design variables uk, including minimum/maximum/mean values for all conducted experiments. The columns report estimated deviations in constant experimental conditions (|pk^0 − pk^∗|) and implementation errors in design variables (|uk^0 − uk^∗|).

        c2^std    c3^std    εtot^sl   c1^std
min     5.8E-4    3.7E-4    7.3E-5    0.0
max     2.3E-2    1.4E-2    1.9E-2    7.24
mean    6.5E-3    4.0E-3    4.6E-3    1.36
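The quantities used above to compare the designs (relative standard deviations, maximal parameter correlation, and an A-type criterion) can all be read off an approximated parameter covariance matrix. The following is a minimal sketch, not the authors' implementation. It assumes that the relative standard deviation is the square root of a diagonal element divided by the absolute parameter value, that emax is the largest absolute off-diagonal correlation coefficient, and that the A-criterion is the plain trace of the covariance matrix (the paper may use a scaled variant). The function name and all numbers are hypothetical.

```python
import math


def covariance_diagnostics(cov, theta):
    """Summarise an approximated parameter covariance matrix.

    cov   -- square covariance matrix as a list of lists (hypothetical input)
    theta -- estimated parameter values, same ordering as cov
    Returns (relative standard deviations, maximal correlation, trace).
    """
    n = len(theta)
    # Absolute standard deviations from the diagonal elements.
    std = [math.sqrt(cov[i][i]) for i in range(n)]
    # Relative standard deviations (assumed definition: std / |theta|).
    rel_std = [std[i] / abs(theta[i]) for i in range(n)]
    # Largest absolute correlation coefficient among off-diagonal elements.
    e_max = max(
        abs(cov[i][j]) / (std[i] * std[j])
        for i in range(n)
        for j in range(n)
        if i != j
    )
    # A-type criterion (assumed here to be the unscaled trace).
    a_criterion = sum(cov[i][i] for i in range(n))
    return rel_std, e_max, a_criterion


# Illustrative values only; they are not taken from the paper.
cov = [[0.04, 0.01],
       [0.01, 0.09]]
theta = [2.0, 3.0]
rel_std, e_max, phi_a = covariance_diagnostics(cov, theta)
print(rel_std, e_max, phi_a)
```

A smaller trace and a smaller maximal correlation both indicate a more informative set of experiments, which is the sense in which ϕopt and ϕconv are compared in the text.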

Conclusions

When designing experiments, the practical engineer will generally encounter different sources of uncertainty such as implementation errors, deviations from general process conditions (which are assumed to be constant), and erroneous assumptions in initial parameter guesses. The main sources of uncertainty and their size are not known a priori; rather, they are identified after the implementation of experiments. Moreover, when dealing with strongly nonlinear process models, the initially assumed parameter values can differ by several orders of magnitude from the finally estimated values. For these reasons, including uncertainties in the OED problem formulation so as to account for them directly is often not possible. The proposed feedback-based approach (indirect method), which follows the sequential design of experiments, appears to be the most appropriate procedure to partially compensate for possible degradations during the design procedure. While the update of the parameter set is an inherent step in the sequential strategy, special attention should be paid to the identification of deviations in the originally planned design variables and conditions. This can be done either directly by a measurement, which verifies the implemented value, or indirectly by solving an augmented estimation problem, which accounts simultaneously for model parameters and uncertain experimental conditions. In the latter case, estimating the size of the so-called implementation errors becomes part of the parameter estimation problem.

In this work, the results of the case study show a significant improvement in terms of lab workload, number of experiments required, and use of costly chemicals. By explicitly considering all experimental steps, from error-prone procedures in the experiment preparation (e.g. feed preparation) up to the result analysis and verification of the realized optimal experimental conditions, the experimental effort can additionally be reduced by a standardization of the lab work. This is of particular interest when performing parameter identification for different components using the same equipment. Moreover, it has been shown that specific and especially informative process conditions (e.g. a desired equilibrium point or a specific reaction temperature) can often only be influenced indirectly by the experiment design variables. Accordingly, without including these relations in the OED problem formulation and without an iterative refinement of the process model, deviations from the desired experimental conditions are to be expected. Lastly, the A-optimal criterion becomes a valuable choice when applied to the approximation of the parameter covariance matrix. Based on the values of the parameter correlations, it has been demonstrated that both variances and correlations have been significantly reduced in comparison to conventionally planned experiments.

However, independent of the approaches for uncertainty compensation discussed here, model-based OED suffers from one drawback: its unsystematic nature. The optimal experimental conditions are rarely evident in complex models. They are often located at the limits of the operating region, omitting those regions with a low information content for the underlying process model. Consequently, in case of uncertainties related to the model structure (e.g. selection of the appropriate adsorption mechanism), the experiment design is seldom easy to interpret, and it does not substitute for a systematic study over the entire operating range.
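The idea of an augmented estimation problem, in which an uncertain experimental condition is estimated together with the model parameters, can be illustrated with a deliberately simple toy case. The sketch below is not the ion-exchange model of this work. It assumes a hypothetical linear response y = θ (u + δ), where u is the planned design variable and δ is an unknown systematic implementation error shared by all experiments. Substituting a = θ and b = θ δ turns the augmented problem into an ordinary straight-line fit, so both θ and δ can be recovered in closed form. All names and numbers are illustrative assumptions.

```python
def fit_naive(u_planned, y_meas):
    """Estimate theta in y = theta * u, ignoring any implementation error."""
    num = sum(u * y for u, y in zip(u_planned, y_meas))
    den = sum(u * u for u in u_planned)
    return num / den


def fit_augmented(u_planned, y_meas):
    """Estimate theta and delta in y = theta * (u + delta) by least squares.

    With a = theta and b = theta * delta the model is the line y = a*u + b,
    so the usual closed-form regression formulas apply.
    """
    n = len(u_planned)
    mean_u = sum(u_planned) / n
    mean_y = sum(y_meas) / n
    s_uu = sum((u - mean_u) ** 2 for u in u_planned)
    s_uy = sum((u - mean_u) * (y - mean_y) for u, y in zip(u_planned, y_meas))
    theta = s_uy / s_uu              # slope a
    intercept = mean_y - theta * mean_u  # intercept b = theta * delta
    delta = intercept / theta
    return theta, delta


# Synthetic, noise-free data generated with hypothetical "true" values.
theta_true, delta_true = 2.0, 0.3
u_planned = [1.0, 2.0, 3.0, 4.0]
y_meas = [theta_true * (u + delta_true) for u in u_planned]

theta_naive = fit_naive(u_planned, y_meas)
theta_hat, delta_hat = fit_augmented(u_planned, y_meas)
print(theta_naive, theta_hat, delta_hat)
```

In this toy case the naive fit absorbs the implementation error into a biased parameter estimate, whereas the augmented fit separates the two. For nonlinear models with experiment-specific deviations, the same idea would require a numerical least-squares solver and, typically, additional information to keep the enlarged problem identifiable.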

Acknowledgement

The authors gratefully acknowledge the support of Knauer GmbH (Berlin, Germany) and the financial support of the BMBF (Federal Ministry of Education and Research of Germany), support code: 03WOPAL4.


Nomenclature

Latin

j          component index, j = {1, 2, 3} ≙ {Cl−, β-LgA, β-LgB}
Aj         integrated area of HPLC signal (eq-equilibrium, ini-initial state, std-standard, sum-sum signal)
cj         mobile phase concentration, [mM] (eq-equilibrium, ini-initial state, sol-solvent, std-standard)
emax       maximal relative parameter correlation
F          Fisher information matrix
k1,j       equilibrium constant, dimensionless
mj, nj     linear calibration parameters
mj         available protein amount (max-maximum), [mg]
MV         measurement covariance matrix
N          number of: C-components, P-proteins, Exp-experiments, θ-parameters
p          constant model parameters / experimental conditions (0-initially assumed, ∗-true value)
qj         stationary phase concentration, [mM] (eq-equilibrium, ini-initial state)
t          time
u          design variables (0-initially planned, ∗-true value)
V          volume, [ml] (bt-batch, sl-slurry, std-standard)
x          state variables
y          measured state variables (meas-measurement values)

Greek letters & symbols

αj         variable Langmuir coefficient, dimensionless
σ          standard deviation (abs-absolute, rel-relative, {i, i}-diagonal element)
σj         steric factor, dimensionless
ε          porosity (tot-total, ext-external, int-intraparticular, sl-slurry), dimensionless
λ          eigenvalues
Λ          stationary phase capacity (for monovalent salt counterions), [mM]
θ          model parameters (0-initially assumed, ∗-true value)
Θ          variable space / domain of θ
θV         parameter covariance matrix
νj         characteristic charge, dimensionless
ξ          error in: θ-initial parameter guess, p-experimental conditions, u-design variables
ϕ          OED functional

