doi 10.1515/ijcre-2012-0081
International Journal of Chemical Reactor Engineering 2013; 11(1): 1–12
Mahdi Feyzdar, Ahmad Reza Vali, and Valiollah Babaeipour*
Identification and Optimization of Recombinant E. coli Fed-Batch Fermentation Producing γ-Interferon Protein

Abstract: A novel approach to the identification of fed-batch cultivation of E. coli BL21 (DE3) is presented. The process is identified in a system designed for maximum production of γ-interferon protein. The dynamic order of the process is determined by the Lipschitz test, and a multilayer perceptron neural network is used to identify the process from experimental data. The optimal brain surgeon method is applied to reduce the model complexity so that the model can be easily implemented. Validation results, based on the autocorrelation function of the residuals, show good performance of the neural network and make it usable in process analysis.

Keywords: identification, fed-batch fermentation process, neural network, optimization, recombinant E. coli
*Corresponding author: Valiollah Babaeipour, Department of Biochemical Engineering, Malek-Ashtar University of Technology, Tehran, Iran, E-mail: [email protected]
Mahdi Feyzdar, Department of Electrical Engineering, Islamic Azad University, Damavand Branch, Tehran, Iran, E-mail: [email protected]
Ahmad Reza Vali, Malek-Ashtar University of Technology, Tehran, Iran, E-mail: [email protected]
1 Introduction

Because of the rapid development of biotechnology, biotechnological companies compete to increase the quality and productivity of their products. Biological systems are used to produce desired products in biotechnological processes. From the biochemical engineering point of view, the most efficient way to increase production volume is to study process modeling, control, and optimization. Bioprocess identification makes it possible to represent the process characteristics to biochemical engineers. If system modeling and identification are well accomplished, then further analyses such as process control or process optimization can be applied to the system [1]. Incomplete knowledge of biological processes and the low availability of sensor information about internal physiological states lead to the use of intelligent data analysis methods [2]. Therefore, several methods such as neural networks, fuzzy logic, and hybrid structures have been introduced for bioprocess modeling [2–8, 19, 20]. Artificial neural networks (ANNs), with their massive parallelism and learning capabilities, have been shown to approximate any continuous nonlinear function and have been proposed as a powerful tool for model extraction [9]. A properly trained ANN can accurately predict steady-state and dynamic process behavior [10–12]. The multilayer perceptron (MLP) is one of the most popular ANNs and has been successfully applied to a large variety of problems. It is a feed-forward network with one or more hidden layers between the input and output layers [13]. In this study, a two-layer MLP network is used for the identification of recombinant E. coli fed-batch fermentation based on experimental data.

The biotechnological process considered in this paper is fed-batch cultivation of E. coli for producing γ-interferon protein. E. coli is one of the most widely used hosts for the production of recombinant proteins: its rapid growth leads to high-level protein production, and the vast number of studies performed on it has accumulated a great deal of information about it. These two reasons make E. coli more popular and more widely used than other microorganisms. Many high cell density cultivation (HCDC) techniques, such as batch, fed-batch, and continuous operation, have been developed to improve production. The fed-batch process is used very often since it allows a relatively high level of the desired product.
This paper is organized as follows: In Section 2, the considered fed-batch fermentation process is described. In Section 3, the problem formulation of the
process identification is introduced. In Section 4, we present the process identification procedures. In Section 5, the identification results are presented, and finally in Section 6 we draw some conclusions.
2 Process description

2.1 Materials and media

LB (Luria–Bertani) agar medium and M9 medium were used for plate cultivation and seed culture preparation of the E. coli strain BL21 (DE3). M9 modified medium containing (g/L): glucose, 10; K2HPO4, 15; KH2PO4, 7.5; citric acid, 2; (NH4)2SO4, 2.5; MgSO4·7H2O, 2; and 1 ml/L trace element solution was used for fed-batch cultivation. The trace element solution had a composition of (g/L in 1 M HCl): FeSO4·7H2O, 2.8; MnCl2·4H2O, 2; CoSO4·7H2O, 2.8; CaCl2·2H2O, 1.5; CuCl2·2H2O, 0.2; and ZnSO4·7H2O, 0.3 [14]. The feed medium in the fed-batch experiments contained (g/L): glucose, 700; MgSO4·7H2O, 20; and trace element solution, 5 ml/L.
2.2 Analytical procedures

Cell growth was monitored by measuring culture optical density (OD) at 600 nm and dry cell weight (DCW). To determine dry cell weight, 5 ml of fermented broth was centrifuged at 9,000 rpm for 10 min, washed twice with 0.9% (w/v) NaCl isotonic solution, and dried at 105°C to constant weight. Glucose, ammonia, phosphate, and acetate were analyzed enzymatically using various kits (ChemEnzyme Co., I.R. Iran; Boehringer Mannheim/R-Biopharm, Germany). The total expression level of rhIFN-γ was determined by SDS-PAGE using 12.5% (w/v) polyacrylamide. Gels were stained with Coomassie brilliant blue R250 and then quantified by a gel densitometer with an accuracy higher than 95%. Total soluble protein was analyzed by the Bradford method using bovine serum albumin as a standard [15], and the concentration of rhIFN-γ was analyzed by ELISA [14]. For analytical purposes, 10 μg of purified protein was applied to a 4% (w/v) stacking gel and separated on a 12.5% (w/v) resolving gel. Concentrations of rhIFN-γ in the gel fractions were determined by scanning the silver-stained gels with a densitometer. The standard biological assay of purified samples was carried out based on the reduction of the cytopathic effect (CPE) of vesicular stomatitis virus (VSV) on Vero cell lines by serially diluted rhIFN-γ samples, and the results
obtained were then compared with commercial rhIFN-γ (Imukin, Boehringer, Germany). Purification of the samples was carried out by the method of Babaeipour et al. [15]. The stability of the plasmid in the recombinant E. coli strain was determined by aseptic sampling from the bioreactor at different cell densities. Fermentation broth samples, diluted with 0.9% (w/v) NaCl when required, were plated on LB agar plates with and without ampicillin supplementation (three replicates for each case). The fraction of plasmid-containing cells was calculated as the average ratio of viable colonies on LB with ampicillin to those on LB without antibiotic [15].
2.3 Fermentation process description

A batch culture was initially established by the addition of 100 ml of an overnight-incubated seed culture (OD600 = 0.7–1) to the bioreactor containing 900 ml of M9 modified medium. The pH was maintained at 7 by the addition of 25% (w/v) NH4OH or 3 M H3PO4 solutions. Dissolved oxygen was controlled at 30 ± 10% (v/v) of air saturation by manipulating both the inlet air (enriched with pure oxygen) and the agitation rate. Foam was controlled by the addition of a silicone antifoaming reagent. After depletion of the initial glucose in the medium, indicated by a rapid increase in the dissolved oxygen concentration, feeding was initiated. The feeding rate was increased stepwise based on an exponential feeding strategy at the maximum attainable specific growth rate pre-induction, with constant feeding post-induction. By this method, the final cell density reached 107 g/L DCW after 16.5 h for cultures without induction. Plasmid stability was maintained above 95%, by-product concentrations (acetate and lactate) remained under inhibitory levels, and the main components of the medium were kept within permissible ranges. With induction at a cell density of 65 g/L DCW (OD600 = 150), the maximum cell density and rhIFN-γ concentration after 16.5 h were 107 g/L DCW and 26.5 g/L, respectively. This is one of the highest values reported for recombinant protein production to date. The maximum specific yield and productivity of rhIFN-γ were 247 mg rhIFN-γ/g DCW and 1.6 g rhIFN-γ/L, respectively.
3 Neural network formulation

A neural network is a powerful tool for modeling system behavior: the ANN extracts the dependencies
of the output(s) on the input(s). This feature extraction is achieved by training the ANN on a set of examples. The MLP network is the most widely used member of the neural network family, mainly because of its ability to model simple as well as very complex functional relationships, as proven in a large number of practical applications [13]. The MLP is a feed-forward network with one or more layers of nodes between the input and output nodes. The input layer introduces the input values into the network, the hidden layers perform processing and classification of features, and the output layer delivers the network answer(s). In this research, a two-layer network with one hidden layer of 16 neurons and 4 output units was chosen by trial and error. The estimated outputs of the MLP network can be formulated as:

    \hat{y}_i(w, W) = F_i\left( \sum_{j=1}^{q} W_{ij}\, f_j\left( \sum_{l=1}^{m} w_{jl}\, z_l + w_{j0} \right) + W_{i0} \right)   [1]

where f and F are the hyperbolic tangent and linear activation functions of the hidden and output layers, respectively. The weights (matrices w and W) are the adjustable parameters of the network, and they are determined through the training procedure. To obtain the best parameters for the ANN, two different training methods were applied. Let the training set be:

    Z^N = \{ [u(t), y(t)] \mid t = 1, \ldots, N \}   [2]

where u(t) is the set of inputs and y(t) the corresponding set of desired outputs. The objective of training is then to determine a mapping from the set of training data to the set of adjustable parameters \hat{\theta}:

    Z^N \rightarrow \hat{\theta}   [3]

In other words, the training algorithm adjusts the weights so that the value estimated by the ANN, \hat{y}(t), is close to the real output y(t). Training is thus the solution of an optimization problem whose goal is to minimize the difference between the network output and its corresponding real value. The convergence of the algorithm depends on the initial point, and a further problem that can occur is finding a local minimum instead of the global minimum. Hence, in this paper, the following regularized objective function is minimized in the training procedure:

    W_N(\theta, Z^N) = \frac{1}{N} \sum_{t=1}^{N} [y(t) - \hat{y}(t \mid \theta)]^T [y(t) - \hat{y}(t \mid \theta)] + \frac{1}{N}\, \theta^T D \theta   [4]

The matrix D is a diagonal matrix, commonly selected as D = \alpha I [13]. Hence, the weights are found as:

    \hat{\theta} = \arg\min_{\theta} W_N(\theta, Z^N)   [5]

by some kind of iterative minimization scheme:

    \theta^{(i+1)} = \theta^{(i)} + \mu^{(i)} f^{(i)}   [6]

where \theta^{(i)} specifies the current iterate, f^{(i)} the search direction, and \mu^{(i)} the step size. In this paper, the system identification problem is solved in the following steps: (1) model structure selection, (2) determination of the dynamic order, (3) training the network, (4) network architecture optimization, and (5) validating the network.
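As an illustration of eq. [1], a minimal forward pass of such a two-layer MLP can be sketched in Python/NumPy. The function name `mlp_forward` and the random demo weights are hypothetical, not the trained network of this paper:

```python
import numpy as np

def mlp_forward(z, w, w0, W, W0):
    """Two-layer MLP of eq. [1]: tanh hidden layer, linear output layer.

    z  : (m,)   input (regression) vector
    w  : (q, m) hidden-layer weights, w0 : (q,) hidden biases
    W  : (p, q) output-layer weights, W0 : (p,) output biases
    """
    h = np.tanh(w @ z + w0)   # f_j: hyperbolic tangent activations
    return W @ h + W0         # F_i: linear output layer

rng = np.random.default_rng(0)
z = rng.standard_normal(6)                            # demo regressor, m = 6
w, w0 = rng.standard_normal((16, 6)), np.zeros(16)    # 16 hidden neurons, as in the paper
W, W0 = rng.standard_normal((4, 16)), np.zeros(4)     # 4 outputs, as in the paper
y_hat = mlp_forward(z, w, w0, W, W0)
print(y_hat.shape)  # (4,)
```

Training then amounts to adjusting `w, w0, W, W0` so that the criterion of eq. [4] is minimized.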
4 Identification procedures

4.1 Model structure selection

After acquiring a data set, the next step is to select a model structure. Model structure selection for nonlinear systems is more difficult than for linear ones. All nonlinear dynamic input–output models can be written in the form \hat{y}(k) = f(\varphi(k)), where the regression vector \varphi(k) contains previous and possibly current process inputs, previous process or model outputs, and previous prediction errors. The three most common nonlinear model structures are the NARX, NARMAX, and NOE models. In nonlinear systems, the model becomes more complex as the dimension of the input variable space increases; hence, the NARX and NOE models are more widely applied because of their lower dimensionality. Since the equation error is linear in the parameters, the NARX structure allows the use of linear optimization techniques. For nonlinearly parameterized approximators such as MLP networks, nonlinear optimization techniques have to be applied. In such cases, NARX model training is still simpler because no feedback components have to be considered in the gradient calculations, but this advantage over the NOE approach is not very significant [16, 21]. In this study, a NARX model is applied with the following regressor:
    \varphi(k) = [\, u(k-1)\ \cdots\ u(k-m)\ \ y(k-1)\ \cdots\ y(k-m) \,]^T   [7]

where u and y are the input and output signals, respectively, and m is the dynamic order. Once a particular model structure has been selected, the next problem is the number of past signals used in the regression, that is, the dynamic order.
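A regressor of the form of eq. [7] can be assembled from recorded signals as follows (a sketch; the name `narx_regressor` and the demo sequences are illustrative only):

```python
import numpy as np

def narx_regressor(u, y, k, m):
    """Regression vector of eq. [7]: the m most recent past inputs and
    past outputs at time k, most recent first."""
    return np.concatenate([u[k - m:k][::-1], y[k - m:k][::-1]])

u = np.arange(10.0)          # demo input sequence u(0..9)
y = np.arange(10.0) * 0.5    # demo output sequence y(0..9)
phi = narx_regressor(u, y, k=5, m=3)
print(phi)  # [4.  3.  2.  2.  1.5 1. ]
```

In the NARX setting, `phi` would be fed to the MLP to predict y(k), so no feedback of model outputs enters the gradient computation during training.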
4.2 Dynamic order determination

The problem of order determination for nonlinear dynamic systems is still not satisfactorily solved. In studies of nonlinear systems, the model dynamic order is usually selected by trial and error, based on prior expert knowledge about the process. In this paper, however, the He and Asada method for determining the lag space is used. This strategy is based on measured data and makes no presumptions about the intended model architecture or structure. An index is defined based on so-called Lipschitz quotients, which is large if one or several inputs are missing (the larger, the more inputs are missing) and small otherwise. The correct lag space can thus be detected at the point where the Lipschitz index stops decreasing. The Lipschitz quotients in the multi-dimensional case are defined as:

    l_{ij}^{(n)} = \frac{|y(i) - y(j)|}{\sqrt{(\varphi_1(i) - \varphi_1(j))^2 + \cdots + (\varphi_n(i) - \varphi_n(j))^2}}, \quad i, j = 1, \ldots, N; \ i \neq j   [8]

where N is the number of samples in the data set and the superscript n stands for the number of inputs. The Lipschitz index is defined as the maximum occurring Lipschitz quotient, l^{(n)} = \max_{i \neq j} \big( l_{ij}^{(n)} \big) [16].
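The Lipschitz index of eq. [8] is straightforward to compute from data. The sketch below (hypothetical function name, synthetic data) also illustrates the detection principle: dropping a regressor that the output actually depends on raises the index.

```python
import numpy as np

def lipschitz_index(Phi, y):
    """Maximum Lipschitz quotient of eq. [8] over all sample pairs i != j.

    Phi : (N, n) matrix of candidate regression vectors
    y   : (N,)   corresponding outputs
    Assumes all regression vectors are distinct (no zero distances).
    """
    N = len(y)
    dy = np.abs(y[:, None] - y[None, :])                            # |y(i) - y(j)|
    dphi = np.linalg.norm(Phi[:, None, :] - Phi[None, :, :], axis=2)
    mask = ~np.eye(N, dtype=bool)                                   # exclude i == j
    return (dy[mask] / dphi[mask]).max()

# demo: y depends on both columns of Phi, so removing one raises the index
rng = np.random.default_rng(1)
Phi = rng.standard_normal((50, 2))
y = np.sin(Phi[:, 0]) + 0.5 * Phi[:, 1]
full = lipschitz_index(Phi, y)
reduced = lipschitz_index(Phi[:, :1], y)   # one regressor missing
print(full < reduced)  # expected: True (index rises when an input is missing)
```

In practice one evaluates the index for increasing lag spaces and picks the order at which it stops decreasing, as done in Section 5.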
4.3 Network training

Training a neural network model essentially means selecting one model from the set of allowed models that minimizes the cost criterion. Several algorithms are available for training neural network models; most can be viewed as a straightforward application of optimization theory and statistical estimation. Most training algorithms employ some form of gradient descent: the derivative of the cost function with respect to the network parameters is taken, and the parameters are then changed in a gradient-related direction. In this research, network training was carried out with two algorithms, the results were compared, and the better result was used. The two algorithms are back-propagation with momentum (BPM) with a variable learning rate, and Levenberg–Marquardt (LM). In each algorithm, training stops when any of the following conditions occurs:
1. The maximum number of epochs (repetitions) is reached.
2. The performance has been minimized to the goal.
3. The performance gradient falls below a preset parameter (MINGRAD).

Back-propagation: The back-propagation algorithm is simple and easy to apply. It often converges rapidly to a local minimum, but it may not find a global minimum, and in some cases it may not converge at all. To overcome this problem, a momentum term is added to the minimization function. The algorithm updates weight and bias values according to gradient descent with momentum and an adaptive learning rate. With standard steepest descent, the learning rate is held constant throughout the training phase, and the performance of the algorithm is very sensitive to its proper setting. If the learning rate is set too high, the algorithm may oscillate and become unstable; if it is too low, the algorithm takes too long to converge. It is not practical to determine the optimal learning rate before training, because it changes during training as the algorithm moves across the performance surface. In our algorithm, for each epoch, if the performance decreases toward the goal, the learning rate is increased by the factor 1.1. If the performance increases by more than a factor of 1.03, the learning rate is multiplied by the factor 0.85 and the change that increased the performance is not applied. In our experiments, the momentum is constant and set to 0.9. The weight update can be written as:
    W(k) = W(k-1) - \eta(k)\, \frac{\partial E(k)}{\partial W(k-1)} + \alpha\, \Delta W(k-1)   [9]
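The update rule of eq. [9], together with the adaptive learning-rate rules quoted above (×1.1 on improvement, ×0.85 with step rejection on more than 3% degradation, momentum 0.9), can be sketched on a toy quadratic cost. All names and the demo problem are hypothetical:

```python
import numpy as np

def train_bpm(grad_fn, loss_fn, w, epochs=200, lr=0.01, mom=0.9):
    """Back-propagation with momentum and adaptive learning rate (eq. [9]).

    Rules from the text: lr *= 1.1 when the loss decreases; if the loss
    grows by more than a factor 1.03, the step is rejected and lr *= 0.85.
    """
    dw = np.zeros_like(w)
    loss = loss_fn(w)
    for _ in range(epochs):
        step = -lr * grad_fn(w) + mom * dw        # eq. [9]
        new_loss = loss_fn(w + step)
        if new_loss > 1.03 * loss:                # reject step, shrink learning rate
            lr *= 0.85
            dw = np.zeros_like(w)                 # design choice: also reset momentum
            continue
        if new_loss < loss:                       # accept step, grow learning rate
            lr *= 1.1
        w, dw, loss = w + step, step, new_loss
    return w, loss

# demo on a quadratic bowl with minimum at [1, -2]
A = np.array([[3.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, -2.0])
loss_fn = lambda w: 0.5 * (w - b) @ A @ (w - b)
grad_fn = lambda w: A @ (w - b)
w, loss = train_bpm(grad_fn, loss_fn, np.zeros(2))
print(np.round(w, 2))
```

With these rules the iterate settles near the minimum [1, -2]; the exact trajectory depends on when the rejection rule fires.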
Levenberg–Marquardt: Among gradient-based techniques, steepest descent is simple, but its convergence speed is lower than that of the Newton method. The Levenberg–Marquardt method is the standard method for minimizing mean-square error criteria, thanks to its rapid convergence and robustness; it takes advantage of the Newton method at lower computational cost. By setting the Levenberg factor (μ)
to zero, the fast convergence of the Gauss–Newton iteration for small-residual problems is obtained. If μ is too small, the algorithm may diverge; for large values of μ, convergence is very slow. Choosing μ adaptively therefore has some benefits. In this method, the size of the elements of the diagonal matrix added to the Gauss–Newton Hessian is adjusted according to the ratio between the actual decrease and the predicted decrease, and the algorithm tries to keep μ as small as possible. If the cost function decreases, the current step is accepted, and the ratio of the actual decrease to the predicted decrease is checked: μ is decreased if the ratio is acceptable and increased otherwise. If the cost function increases, the step is rejected and the calculations are repeated with an increased μ.
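A minimal sketch of this adaptive-μ scheme for a least-squares problem might look as follows. The 0.75 gain-ratio threshold and the ×0.5/×2 damping factors are illustrative assumptions, not values from the paper:

```python
import numpy as np

def lm_fit(residual_fn, jac_fn, theta, iters=50, mu=3.0):
    """Levenberg-Marquardt sketch: mu is decreased when the actual/predicted
    decrease ratio is good, increased (with the step rejected) otherwise."""
    cost = 0.5 * np.sum(residual_fn(theta) ** 2)
    for _ in range(iters):
        r, J = residual_fn(theta), jac_fn(theta)
        g = J.T @ r
        H = J.T @ J + mu * np.eye(len(theta))      # damped Gauss-Newton Hessian
        step = -np.linalg.solve(H, g)
        new_cost = 0.5 * np.sum(residual_fn(theta + step) ** 2)
        predicted = -(g @ step + 0.5 * step @ (J.T @ J) @ step)  # from the local model
        if new_cost < cost:
            if predicted > 0 and (cost - new_cost) / predicted > 0.75:
                mu *= 0.5                          # good agreement: relax damping
            theta, cost = theta + step, new_cost
        else:
            mu *= 2.0                              # reject step, increase damping
    return theta, cost

# demo: fit y = a * exp(b * x) to exact data with true parameters (2, -1)
x = np.linspace(0, 2, 30)
y = 2.0 * np.exp(-1.0 * x)
residual_fn = lambda th: th[0] * np.exp(th[1] * x) - y
jac_fn = lambda th: np.stack([np.exp(th[1] * x),
                              th[0] * x * np.exp(th[1] * x)], axis=1)
theta, cost = lm_fit(residual_fn, jac_fn, np.array([1.0, 0.0]))
print(np.round(theta, 3))  # approximately [2, -1]
```

Near the solution μ becomes small and the iteration behaves like Gauss–Newton, which is the source of the rapid convergence noted in the text.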
4.4 Network architecture optimization

In any data modeling problem, one faces the modeling dilemma: if a model has too many parameters, it risks over-fitting the training data, which results in poor generalization (on unseen data); if the model has too few parameters, its capability to represent the underlying data distribution may be compromised. One interesting approach to overcoming this problem is pruning: a possibly oversized model is trained and then pruned to the right size according to an optimality criterion. Optimal brain surgeon (OBS) is the most important such strategy, and it is consequently the only method that has been implemented for models of dynamic systems. The purpose is two-fold:
● to select a good topology with improved generalization capability;
● to reduce the model complexity so as to save memory and computation costs and to ease implementation.
OBS belongs to a class of sensitivity-based weight-pruning methods that use second-order derivatives (of some cost function) to eliminate the least "important" weights in a neural network (NN). It not only removes weights but also re-adjusts the remaining weights optimally. The method has been shown to be effective in refining the complex topology of an over-fitted neural network. It was originally proposed by Hassibi and Stork [17], but Hansen and Petersen [18] later derived a modification so that it can handle networks trained according to the regularized criterion. Hansen and Petersen [18] define the saliency as the estimated increase of the unregularized criterion
when a weight is eliminated. With this definition, a new expression for the saliencies is obtained. The saliency for weight j is defined by:

    \zeta_j = \lambda_j\, e_j^T H^{-1}(\theta^*) \frac{1}{N} D \theta^* + \frac{1}{2} \lambda_j^2\, e_j^T H^{-1}(\theta^*) R(\theta^*) H^{-1}(\theta^*) e_j   [10]

where \theta^* specifies the minimum and H(\theta^*) the Gauss–Newton Hessian of the regularized criterion:

    H(\theta^*) = R(\theta^*) + \frac{1}{N} D   [11]

e_j is the jth unit vector, and \lambda_j is the Lagrange multiplier, determined by:

    \lambda_j = \frac{e_j^T \theta^*}{e_j^T H^{-1}(\theta^*) e_j} = \frac{\theta_j^*}{H_{j,j}^{-1}(\theta^*)}   [12]

The constrained minimum (the minimum when weight j is 0) is then found from:

    \delta\theta = \theta - \theta^* = -\lambda_j H^{-1}(\theta^*) e_j   [13]
Notice that for the unregularized criterion, where R(\theta^*) = H(\theta^*), the above algorithm degenerates to the scheme of Hassibi and Stork [17]. A problem with the basic OBS scheme is that it does not rule out networks in which a hidden unit has lost all the weights leading to it while weights still connect it to the output layer, or vice versa. For this reason, an improved version of the OBS method is used in this study. Initially, the saliencies are calculated and the weights are pruned as described above. But when a situation occurs where a unit has only one weight leading to it or one weight leading from it, the saliency for removing the entire unit is calculated instead. This gives rise to some minor extensions of the algorithm (the saliency for the unit is calculated by setting all weights connected to the unit to 0). Define J as the set of indices of the weights leading to and from the unit in question, and let E_J be a matrix of unit vectors corresponding to each element of J. To calculate the saliency for the entire unit, the above expressions are modified to:

    \zeta_J = \lambda_J^T E_J^T H^{-1}(\theta^*) \frac{1}{N} D \theta^* + \frac{1}{2} \lambda_J^T E_J^T H^{-1}(\theta^*) R(\theta^*) H^{-1}(\theta^*) E_J \lambda_J   [14]

    \lambda_J = \left[ E_J^T H^{-1}(\theta^*) E_J \right]^{-1} E_J^T \theta^*   [15]

    \delta\theta = \theta - \theta^* = -H^{-1}(\theta^*) E_J \lambda_J   [16]
When a weight (or unit) has been removed, it is necessary to obtain the new inverse Hessian before proceeding to eliminate further weights. If the network has been retrained, the Hessian must of course be constructed and inverted once more. If, however, the network is not retrained, a simple trick from the inversion of partitioned matrices can be used to approximate the inverse Hessian of the reduced network. Assume that the pruned weights are located at the end of the parameter vector θ. The Hessian is then partitioned as:

    H = \begin{bmatrix} \tilde{H} & h_J \\ h_J^T & h_{JJ} \end{bmatrix}   [17]

where \tilde{H} is the "new" Hessian, which is to be inverted. Partitioning the inverse Hessian in the same way yields:

    H^{-1} = \begin{bmatrix} \tilde{P} & p_J \\ p_J^T & p_{JJ} \end{bmatrix}   [18]

The new inverse Hessian is then determined as the Schur complement of \tilde{P}:

    \tilde{H}^{-1} = \tilde{P} - p_J\, p_{JJ}^{-1}\, p_J^T   [19]
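Equation [19] can be checked numerically: the inverse of the reduced Hessian obtained via the partitioned-inverse trick matches a direct re-inversion. Below, a random symmetric positive-definite matrix stands in for the Hessian and a single weight (the last one) is pruned, so p_JJ is a scalar:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((5, 5))
H = A @ A.T + 5 * np.eye(5)        # symmetric positive-definite "Hessian"

# partition H^-1 as in eq. [18]; the pruned weight sits last, as in eq. [17]
Hinv = np.linalg.inv(H)
P, pJ, pJJ = Hinv[:4, :4], Hinv[:4, 4], Hinv[4, 4]

# eq. [19]: inverse of the reduced Hessian without re-inverting it
H_new_inv = P - np.outer(pJ, pJ) / pJJ

direct = np.linalg.inv(H[:4, :4])  # the expensive route, for comparison
print(np.allclose(H_new_inv, direct))  # True
```

The rank-one update costs O(n^2) instead of the O(n^3) of a fresh inversion, which is what makes pruning many weights between retrainings affordable.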
It is difficult to decide how often the network should be retrained during the pruning session. One choice is to
retrain each time 5% of the weights have been eliminated. The safest, but also the most time-consuming, strategy is of course to retrain the network each time a weight has been eliminated.
4.5 Network validation

For nonlinear systems, validation methods are restricted to criteria in the time domain rather than the frequency domain. The most typical validation test examines the prediction error \varepsilon(t) = y(t) - \hat{y}(t \mid \theta). Although this test is a good one and its result would be acceptable, a further test based on correlation analysis of the residuals has been applied in this paper. This test allows us to detect information that has not been captured by the model and can still be modeled; it also detects unmodeled dynamics or bias in the estimation.
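The residual autocorrelation test can be sketched as follows; the ±1.96/√N band is the usual approximate 95% confidence band for white residuals (function name and synthetic residuals are illustrative):

```python
import numpy as np

def residual_autocorr(eps, max_lag=20):
    """Normalized autocorrelation of the residuals eps(t) = y(t) - y_hat(t).

    For a well-identified model the coefficients at lag > 0 should stay
    inside the approximate 95% band +/- 1.96 / sqrt(N)."""
    eps = eps - eps.mean()
    c0 = np.dot(eps, eps)
    r = np.array([np.dot(eps[:len(eps) - k], eps[k:]) / c0
                  for k in range(max_lag + 1)])
    band = 1.96 / np.sqrt(len(eps))
    return r, band

rng = np.random.default_rng(3)
eps = rng.standard_normal(500)     # white residuals: the ideal outcome
r, band = residual_autocorr(eps)
print(r[0])                        # 1.0 by construction
print(np.mean(np.abs(r[1:]) < band))
```

Residuals that stay inside the band look like white noise, which is exactly the behavior reported for Figure 8 below.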
5 Simulation results

The experimental data of the process used for identification are dissolved oxygen (DO), feeding (F) (see Figure 1), biomass, glucose, acetate, and γ-interferon.
Figure 1 Experimental input data: (a) feeding profile (g/L); (b) dissolved oxygen (%).
Figure 2 Lipschitz index obtained from experimental data for lag space from 1 to 5.
To determine the dynamic order of the model, the Lipschitz test was applied to the experimental data. The Lipschitz index obtained for lag spaces from 1 to 5 is shown in Figure 2. As shown there, the Lipschitz index decreases as the lag space increases. Although a higher dynamic order is more accurate, it increases the complexity of the system, so a compromise must be made in determining the model order. Hence, in this paper, model order 3 was chosen because its corresponding Lipschitz index lies within the acceptable margin. The MLP was trained with the BPM and LM algorithms, and the training results are summarized in Table 1. These results were obtained on a PC with an Intel(R) Core(TM)2 Duo E6550 processor at 2.33 GHz.
Table 1 Training results.

Parameter/method        BPM            LM
Execution time (sec)    30.65          156.34
Performance index       4.54 × 10^-6   5.61 × 10^-7
In the BPM learning method, the momentum term was set to 0.001. The initial values of the Levenberg factor and of the learning rate in the LM and BPM algorithms are not particularly critical, since they are adjusted adaptively; in this paper they were set to 3 and 0.01, respectively. As shown in Table 1, BPM is faster than LM, but comparing the performance indices shows that LM is more accurate than BPM; BPM needs more iterations than LM to reach the same performance index. A compromise must therefore be made in selecting the training method: although BPM is less accurate than LM and needs more iterations to reach an acceptable performance index, it is faster than LM. After network training, superfluous weights were removed according to the OBS strategy to improve the performance of the network; after each weight elimination, the network was retrained for 40 iterations. The network was then tested with the test data set. Since the performance indices obtained with the LM algorithm are better than those of BPM, only the results of the LM algorithm are shown. For each variable, the ANN estimate, the experimental data, and the residual errors are shown in separate figures.
As depicted in Figures 3–6, the residual errors are small enough to be acceptable in this study. In other words, the ANN identification was accomplished well; the system model has sufficient accuracy and can be used in further studies.
As noted above, the most typical validation test uses the residual errors. Distributions of the residual errors are shown in Figure 7. As depicted there, the residual errors are less than 0.1, which is acceptable in this study. Furthermore, a residual validation test based on
Figure 3 ANN estimation result (a) Estimated (dashed line) and experimental (solid line) data of biomass (b) Error between estimated and experimental data of biomass.
Figure 4 ANN estimation result (a) Estimated (dashed line) and experimental (solid line) data of glucose (b) Error between estimated and experimental data of glucose.
Figure 5 ANN estimation result (a) Estimated (dashed line) and experimental (solid line) data of acetate (b) Error between estimated and experimental data of acetate.
Figure 6 ANN estimation result (a) Estimated (dashed line) and experimental (solid line) data of γ-interferon (b) Error between estimated and experimental data of γ-interferon.
correlation analysis has also been applied; the results are depicted in Figure 8. As shown there, the autocorrelation coefficients almost always stay within their standard-deviation bands. The margins, indicated in Figure 8 by red dashed lines, can be considered tight or relaxed depending on the accuracy required by the data analysis.
These results show that the MLP network with the LM learning algorithm gives the best modeling performance and an acceptable error range, confirming that this approach provides an efficient and accurate method for model identification of the E. coli fed-batch cultivation process.
Figure 7 Histogram of the residual errors of the estimated outputs.
Figure 8 Autocorrelation coefficients of estimated outputs.
6 Conclusion and future work

In this paper, a recombinant E. coli fed-batch process has been identified with an MLP neural network. First, the model order was determined by the Lipschitz test; then the MLP was trained with the LM algorithm on the experimental data set, a method chosen for its fast convergence. Finally, model order reduction was performed with an improved version of OBS. The validation results show that this procedure identifies the recombinant E. coli fed-batch cultivation process accurately and can be a powerful tool for the modeling and identification of other biochemical processes.
One area of future research is on-line adaptation, in which the process model is updated during the fermentation, a task that can be performed by neuro-adaptive methods. Obtaining more real-time information about process variables is important for increasing the product quality of an industrial fermentation process, but this information is usually difficult to obtain: suitable instrumentation for real-time measurement and monitoring of biological variables is often absent, so the available information is scarce and lagging. In future work, to overcome this problem, a soft sensor will be used for on-line estimation of immeasurable variables such as the growth rate, an important variable to which production is directly related.
Notations

DO   dissolved oxygen
e    error between desired and estimated outputs, dimensionless
f    hyperbolic tangent activation function, dimensionless
F    linear activation function, dimensionless
H    Gauss–Newton Hessian, dimensionless
H̃    the "new" Hessian, dimensionless
l    Lipschitz quotient, dimensionless
u    set of inputs
t    time, h
W    neural network weights, dimensionless
y    set of desired outputs
ŷ    set of estimated outputs

Greek letters

α    momentum term, dimensionless
θ    adjustable parameters, dimensionless
η    learning rate, dimensionless
μ    Levenberg factor, dimensionless
φ    regression vector, dimensionless
λ    Lagrange multiplier, dimensionless
References

1. Manuel R, Oliveira F. Supervision, control and optimization of biotechnological processes based on hybrid models. PhD thesis, Martin-Luther-University Halle-Wittenberg, 1998.
2. Marenbach P, Brown M. Evolutionary versus inductive construction of neurofuzzy systems for bioprocess modelling. IEE 1997;5:320–5.
3. Undey C, Tatara E, Williams BA, Birol G, Cinar A. Hybrid supervisory knowledge-based system for monitoring penicillin fermentation. In: Proceedings of the American Control Conference, Chicago, USA, 2000:3944–8.
4. Saarela U, Leiviska K, Juuso E. Modelling of a fed-batch fermentation process. Control Engineering Laboratory report, 2003, ISBN 951-42-7083-5.
5. Hodge D, Simon L, Karim MN. Data driven approaches to modeling and analysis of bioprocesses: some industrial examples. In: Proceedings of the American Control Conference, Denver, CO, 2003;3:2062–76.
6. Horiuchi JI, Kishimoto M, Momose H. Hybrid simulation of microbial behavior combining a statistical procedure and fuzzy identification of culture phases. J Ferment Bioeng 1995;79:297–9.
7. Simeonov I, Chorukova E. Neural networks modeling of two biotechnological processes. In: IEEE International Conference on Intelligent Systems, Sofia, Bulgaria, 2004:331–6.
8. Wu Y, Lu J, Xu J, Sun Y. Bioprocess modeling using fuzzy regression clustering and genetic programming. In: Proceedings of the 6th World Congress on Intelligent Control and Automation, Dalian, China, 2006:9337–41.
9. Sjoberg J, Zhang Q, Ljung L, Benveniste A, Deylon B, Glorennec PY, et al. Nonlinear black-box modeling in system identification: a unified overview. Automatica 1995;31:1691–724.
10. Chen S, Billings SA. Neural networks for non-linear dynamic system modelling and identification. Int J Control 1992;56:319–46.
11. Hunt KJ, Sbarbaro D, Zbikowski R, Gawthrop PJ. Neural networks for control systems – a survey. Automatica 1992;28:1083–112.
12. Zhang J. Developing robust neural network models by using both dynamic and static process operating data. Ind Eng Chem Res 2001;40:234–41.
13. Nørgaard M. Neural network based system identification toolbox. Technical Report 00-E-891, Department of Automation, Technical University of Denmark, 2000.
14. Babaeipour V, Shojaosadati SA, Khalilzadeh R, Maghsoodi N, Tabandeh F. Over-production of human interferon-γ by HCDC of recombinant Escherichia coli. Process Biochem 2007;42:112–7.
15. Babaeipour V, Shojaosadati SA, Khalilzadeh R, Maghsoodi N, Tabandeh F. A proposed feeding strategy for over-production of recombinant proteins by E. coli. Biotechnol Appl Biochem 2008;53:655–60.
16. Nelles O. Nonlinear system identification: from classical approaches to neural networks and fuzzy models. Berlin: Springer, 2001.
17. Hassibi B, Stork DG. Second order derivatives for network pruning: optimal brain surgeon. Adv Neural Inform Process Syst 1993;5:164–71.
18. Hansen LK, Petersen MW. Controlled growth of cascade correlation nets. In: Proceedings of ICANN'94 International Conference on Neural Networks, Orlando, Florida, USA, 1994:797–800.
19. Barbu M, Caraman S, Ceanga E. Bioprocess control using a recurrent neural network model. In: IEEE Proceedings of the International Symposium on Intelligent Control, Limassol, Cyprus, 2005:479–84.
20. Bhat NV, McAvoy TJ. Use of neural nets for dynamic modeling and control of chemical process systems. Comput Chem Eng 1990;14:573–83.
21. Ljung L. System identification – theory for the user. New York: Prentice-Hall, 1987.