Research Article
Received: 4 June 2007; Revised: 18 September 2007; Accepted: 23 September 2007; Published online in Wiley InterScience: 21 January 2008
(www.interscience.wiley.com) DOI: 10.1002/cem.1096

Application of different training methodologies for the development of a back propagation artificial neural network retention model in ion chromatography

Tomislav Bolanča*, Štefica Cerjan-Stefanović, Šime Ukić, Marko Rogošić and Melita Luša
Faculty of Chemical Engineering and Technology, University of Zagreb, Marulićev trg 20, 10000 Zagreb, Croatia
* Correspondence: T. Bolanča. E-mail: [email protected]

The reliability of predicted separations in ion chromatography depends mainly on the accuracy of retention predictions. Any model able to improve this accuracy will yield predicted optimal separations closer to reality. In this work artificial neural networks were used for retention modeling of the void peak, fluoride, chlorite, chloride, chlorate, nitrate and sulfate. In order to increase the performance characteristics of the developed model, different training methodologies were applied and discussed. Furthermore, the number of neurons in the hidden layer, the activation function and the number of experimental data used for building the model were optimized in terms of decreasing the experimental effort without disrupting the performance characteristics. This resulted in the superior predictive ability of the developed retention model (average relative error of 0.4533%). Copyright © 2008 John Wiley & Sons, Ltd.

Keywords: artificial neural network; training algorithm; retention modeling; ion chromatography

1. INTRODUCTION

The chromatographic property with the major impact on separation quality is retention. If there is a business expectation to find reasonable separation conditions within a couple of days, then only a dozen or so experiments are possible before time runs out. It is therefore still important to apply different methodologies in order to increase the predictive ability of retention models in the overall optimization process. The first step in the computer-assisted optimization of separation is gathering information about the chromatographic behavior of the compounds in the sample, covering a reasonably wide factor region. This allows the inference of a relationship that describes the retention of each solute as a function of the experimental factors. The accuracy of these predictions is decisive for the reliability of the optimization. A number of references that explain and apply computer-assisted optimization tools to different samples have been published in the last two decades [1–9].

When facing a separation problem where gradient elution is needed, retention modeling in ion chromatography becomes an even more complex problem than in the isocratic elution mode. One possibility for solving the gradient elution optimization problem is to predict retention in the gradient elution mode using isocratic experimental data [10–13]. That model is based on an implicit (integral) relation for the retention times of solutes, t_g, described in terms of measurable properties (capacity factor, k; void time of a column, t_0):

F(t_g, k, t_0) = 0    (1)

Upon the inclusion of the time-independent term k[c] (c denotes the concentration of the eluent competing ion) within the time integral, one may easily switch to the gradient elution result by allowing for the temporal variation of c:

t_0 = \int_0^{t_g} \frac{\mathrm{d}t}{k[c(t)]}    (2)

k[c] can be assumed constant within each step and t_0 can be approximated as

t_0 \approx \frac{t_1}{k_{0,1}} + \frac{t_2 - t_1}{k_{1,2}} + \dots + \frac{t_i - t_{i-1}}{k_{i-1,i}} + \frac{t_{i+1} - t_i}{k_{i,i+1}} = I_{0,1} + \dots + I_{i,i+1} = I_{0,i+1}    (3)

k(c)_{i,i+1} = \frac{k[c(t_i)] + k[c(t_{i+1})]}{2}    (4)

where I represents the approximate cumulative integral. The approximate value of the cumulative integral is calculated stepwise; it is expected to increase in due course of the integration procedure and it will eventually exceed the fixed t_0 value on the left-hand side of Equation (2) at some step t_i. At this point t_g can be easily calculated as

t_g = t_0 + t_i + (t_0 - I_{0,i})\, k(c)_{i,i+1}    (5)
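As an illustration of the stepwise procedure of Equations (2)-(5), the following Python sketch (not taken from the original work) accumulates the integral until it reaches t_0 and then applies Equation (5); the isocratic relationship k(c) and the gradient profile c(t) used here are hypothetical placeholders.

```python
import numpy as np

def predict_gradient_tg(k_of_c, c_of_t, t0, dt=0.01, t_max=200.0):
    """Stepwise evaluation of Equations (2)-(5): accumulate I until it
    exceeds the fixed void time t0, then compute the retention time."""
    t_prev, I = 0.0, 0.0
    while t_prev < t_max:
        t_next = t_prev + dt
        # Equation (4): capacity factor averaged over the step
        k_step = 0.5 * (k_of_c(c_of_t(t_prev)) + k_of_c(c_of_t(t_next)))
        I_next = I + (t_next - t_prev) / k_step          # Equation (3)
        if I_next >= t0:                                  # integral reached t0
            # Equation (5): retention time from the last incomplete step
            return t0 + t_prev + (t0 - I) * k_step
        t_prev, I = t_next, I_next
    raise RuntimeError("solute did not elute within t_max")

# Hypothetical example: log-linear isocratic model and linear gradient (mM)
k_of_c = lambda c: 10 ** (1.2 - 1.0 * np.log10(c))   # assumed k(c) relation
c_of_t = lambda t: 5.0 + 0.5 * t                      # assumed gradient profile
print(predict_gradient_tg(k_of_c, c_of_t, t0=1.5))
```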

The quality of prediction was evaluated more extensively by checking the uncertainty of the isocratic-to-gradient model [13]. It was demonstrated that errors obtained in the isocratic fitting are propagated into the prediction of gradient retention [13]. The predictive ability of such a gradient retention model is therefore limited by the model used for isocratic retention modeling. Thus, it is extremely important to develop an isocratic elution retention model with the highest possible predictive ability.

Several retention models have been proposed and tested in isocratic ion chromatography [14–24]. For the theoretical models it was noted that as the complexity of the model increases, the accuracy and precision both increase, but so does the level of knowledge required to implement the model [24–26]. More complex theoretical retention models are capable of predicting retention times with equal or greater accuracy and precision than the empirical end points model, but the low knowledge requirements and excellent ruggedness of the end points model make it superior for optimization calculations [26]. Artificial neural network retention models show the highest accuracy, precision and robustness [27,28], but more experimental data have to be used for the modeling procedure than for any other model [29].

Of all available types of artificial neural networks, multilayered perceptrons (MLPs) are the most commonly used. There are many algorithms for training MLP networks [30–34]. The popular back-propagation (BP) algorithm [35] is simple but reportedly suffers from slow convergence. Various related algorithms have therefore been introduced to address that problem [30]. Most of them are based on second-order information about the shape of the error surface [36]. Another problem inherent in neural network training is over-fitting: the error on the training set is driven to a very small value, but when new data are presented to the network the error is large. This signifies that the network has memorized the training patterns instead of learning from examples to generalize to new situations. One method for improving network generalization is to use a network that is just large enough to provide an adequate fit; larger networks can create more complex functions and are therefore more susceptible to over-fitting, whereas small networks lack the capacity to over-fit the data. However, there exists no hard and fast technique to know beforehand how large a network should be for a specific application. Regularization through weight decay [37] is a method to redress this problem.

The aim of this work is the development of a suitable artificial neural network retention model that can be used in a variety of applications for method development in ion chromatography, particularly when isocratic data are used for gradient predictions and extremely accurate isocratic models are needed for gradient modeling. MLP artificial neural networks were used to model the retention behavior of the void peak, fluoride, chlorite, chloride, chlorate, nitrate and sulfate as a function of the concentration of OH− in the eluent. Different training algorithms were tested in order to improve the predictive ability of the final retention model: (1) gradient descent algorithm with adaptive learning rate; (2) Fletcher–Reeves conjugate gradient algorithm; (3) Polak–Ribière conjugate gradient algorithm; (4) Powell–Beale conjugate gradient algorithm; (5) quasi-Newton algorithm with Broyden–Fletcher–Goldfarb–Shanno (BFGS) update; and (6) Levenberg–Marquardt algorithm with Bayesian regularization. The activation function, the number of hidden layer neurons and the number of experimental data points used for the training set were optimized until a minimum on the error surface was found.

2. THEORY

The back-propagation algorithm is a gradient descent algorithm in which the network weights are moved along the negative of the gradient of the performance function. There are a number of variations of the basic algorithm, based on heuristics and on standard numerical optimization techniques such as variable learning rate gradient descent, conjugate gradient and Newton methods. The simplest implementation of back-propagation learning updates the network weights and biases in the direction in which the performance function decreases most rapidly, that is, the negative of the gradient. The (k + 1)th iteration of this algorithm, as described in Reference [37], can be written as

x_{k+1} = x_k - \alpha_k g_k    (6)

where x_k is a vector of the current weights and biases, g_k is the current gradient and α_k is the learning rate. The gradient descent algorithm with adaptive learning rate is based on a heuristic technique [38] and is much faster than the standard steepest descent algorithm, as it allows the learning rate to change during the training process. The procedure increases the learning rate, but only to the extent that the network can learn without large error increases.
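A rough Python sketch of such an adaptive-learning-rate variant of Equation (6) is given below; the growth and shrink factors and the error-increase tolerance are illustrative assumptions rather than the exact toolbox settings, and the quadratic error surface exists only to exercise the rule.

```python
import numpy as np

def gd_adaptive(x, grad_fn, err_fn, lr=0.01, grow=1.05, shrink=0.7,
                max_increase=1.04, n_iter=200):
    """Gradient descent (Equation (6)) with a heuristic adaptive learning rate:
    keep and enlarge lr while the error does not grow too much, otherwise
    reject the step and shrink lr."""
    err = err_fn(x)
    for _ in range(n_iter):
        x_new = x - lr * grad_fn(x)          # Equation (6)
        err_new = err_fn(x_new)
        if err_new <= max_increase * err:    # error did not increase too much
            x, err, lr = x_new, err_new, lr * grow
        else:
            lr *= shrink                     # reject the step, reduce the rate
    return x

# Hypothetical quadratic error surface used only to exercise the rule
target = np.array([1.0, -2.0])
err_fn = lambda x: float(np.sum((x - target) ** 2))
grad_fn = lambda x: 2.0 * (x - target)
print(gd_adaptive(np.zeros(2), grad_fn, err_fn))     # approaches [1, -2]
```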

In the conjugate gradient algorithms a search is performed along conjugate directions, which generally produces faster convergence than steepest descent directions. All conjugate gradient algorithms start by searching in the steepest descent direction on the first iteration, Equation (7). A line search is then performed to determine the optimal distance to move along the current search direction, Equation (8):

p_0 = -g_0    (7)

x_{k+1} = x_k + \alpha_k p_k    (8)

The next search direction is determined so that it is conjugate to the previous search directions:

p_k = -g_k + \beta_k p_{k-1}    (9)

The variants of conjugate gradient are distinguished by the manner in which the constant β_k is computed.

* For the Fletcher–Reeves update, the procedure for computation includes the ratio of the squared norm of the current gradient to the squared norm of the previous gradient [37,39]:

\beta_k = \frac{g_k^T g_k}{g_{k-1}^T g_{k-1}}    (10)

* For the Polak–Ribière update, the procedure for computation includes the inner product of the previous change in the gradient with the current gradient, divided by the squared norm of the previous gradient [37,39]:

\beta_k = \frac{\Delta g_{k-1}^T g_k}{g_{k-1}^T g_{k-1}}    (11)

* The Powell–Beale update procedure is characterized by two features: (a) the search direction is reset to the negative of the gradient if there is very little orthogonality left between the current gradient and the previous gradient, which is tested with the inequality (12); (b) the search direction is computed by Equation (9), where the parameter β_k can be computed in several different ways, as in the sketch below.

\left| g_{k-1}^T g_k \right| \ge 0.2 \, \| g_k \|^2    (12)
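The following Python sketch illustrates Equations (7)-(12): the two β_k variants, the Powell–Beale restart test and the conjugate direction update. It is applied to a hypothetical two-dimensional quadratic problem, for which the line search of Equation (8) has a closed-form step; a real trainer would use a general line search.

```python
import numpy as np

def beta_fletcher_reeves(g_new, g_old):
    return (g_new @ g_new) / (g_old @ g_old)              # Equation (10)

def beta_polak_ribiere(g_new, g_old):
    return ((g_new - g_old) @ g_new) / (g_old @ g_old)    # Equation (11)

def cg_direction(g_new, g_old, p_old, beta_fn):
    """Equations (9) and (12): conjugate direction with Powell-Beale restart."""
    if abs(g_old @ g_new) >= 0.2 * (g_new @ g_new):   # little orthogonality left
        return -g_new                                  # restart: steepest descent
    return -g_new + beta_fn(g_new, g_old) * p_old

# Hypothetical quadratic problem: minimise 0.5 x^T A x - b^T x
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])
grad = lambda x: A @ x - b
x = np.zeros(2)
g = grad(x)
p = -g                                                  # Equation (7)
for _ in range(10):
    alpha = -(g @ p) / (p @ A @ p)                      # exact step (quadratic only)
    x = x + alpha * p                                   # Equation (8)
    g_new = grad(x)
    if np.linalg.norm(g_new) < 1e-10:
        break
    p = cg_direction(g_new, g, p, beta_fletcher_reeves)  # swap in beta_polak_ribiere for Eq. (11)
    g = g_new
print(x, np.linalg.solve(A, b))                          # both give the minimiser
```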

Newton's method, as described in References [37,39], is an alternative to the conjugate gradient methods for fast optimization. The basic step of Newton's method is

x_{k+1} = x_k - A_k^{-1} g_k    (13)

where A_k is the Hessian matrix (second derivatives) of the performance index at the current values of the weights and biases. It is complex and expensive to compute the Hessian matrix for feedforward neural networks. Therefore, a quasi-Newton method is used for this purpose, which updates an approximate Hessian matrix at each iteration of the algorithm. The update is computed as a function of the gradient by Equation (14), as illustrated in Reference [39]:

A_{k+1} = A_k + \left(1 + \frac{\Delta g_k^T A_k \Delta g_k}{\Delta g_k^T \Delta x_k}\right) \frac{\Delta x_k \Delta x_k^T}{\Delta x_k^T \Delta g_k} - \frac{A_k \Delta g_k \Delta x_k^T + \left(A_k \Delta g_k \Delta x_k^T\right)^T}{\Delta g_k^T \Delta x_k}    (14)

The Levenberg–Marquardt training algorithm [40] is designed to achieve second-order training speed without computing the Hessian matrix. When the performance function has the form of a sum of squares (as is typical in training feedforward networks), the Hessian matrix can be approximated by Equation (15) and the gradient can be computed by Equation (16), where J is the Jacobian matrix that contains the first derivatives of the network errors with respect to the weights and biases, and e is a vector of network errors:

H = J^T J    (15)

g = J^T e    (16)

The Levenberg–Marquardt algorithm uses this approximation to the Hessian matrix in the following Newton-like update:

x_{k+1} = x_k - \left(J^T J + \mu I\right)^{-1} J^T e    (17)
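A minimal sketch of the update in Equations (15)-(17), applied to a generic nonlinear least-squares fit rather than to the network of this work; the residual and Jacobian functions are hypothetical, and the damping parameter μ is held fixed for simplicity, whereas the full algorithm adapts it between iterations.

```python
import numpy as np

def lm_step(x, residual_fn, jacobian_fn, mu):
    """Equations (15)-(17): damped Gauss-Newton (Levenberg-Marquardt) update."""
    e = residual_fn(x)                       # vector of errors
    J = jacobian_fn(x)                       # derivatives of the errors w.r.t. parameters
    H = J.T @ J                              # Equation (15): Hessian approximation
    g = J.T @ e                              # Equation (16): gradient
    return x - np.linalg.solve(H + mu * np.eye(len(x)), g)   # Equation (17)

# Hypothetical exponential-decay fit used only to exercise the update
t = np.linspace(0.0, 4.0, 30)
y = 2.5 * np.exp(-1.3 * t)
residual_fn = lambda x: x[0] * np.exp(-x[1] * t) - y
jacobian_fn = lambda x: np.column_stack((np.exp(-x[1] * t),
                                         -x[0] * t * np.exp(-x[1] * t)))
x = np.array([1.0, 0.5])
for _ in range(50):
    x = lm_step(x, residual_fn, jacobian_fn, mu=1e-2)
print(x)   # approaches the true parameters [2.5, 1.3]
```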

The problem arising from the fact that larger networks can create more complex functions, and are therefore more prone to over-fitting the data, can be redressed by employing regularization through weight decay [37]. The performance function, which is typically the mean sum of squares of the network errors (MSE, Equation (18)), can be modified by adding a penalty term that consists of the mean of the sum of squares of the network weights and biases (Equations (19) and (20)), where γ is the performance ratio (regularization parameter) and n is the number of network parameters:

MSE = \frac{1}{N} \sum_{i=1}^{N} e_i^2    (18)

MSREG = \gamma \, MSE + (1 - \gamma) \, MSW    (19)

MSW = \frac{1}{n} \sum_{j=1}^{n} w_j^2    (20)

When using MSREG, the performance function causes the network to attain smaller weights and biases, which smoothes the network response and makes it less likely to overfit.
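The regularized performance function of Equations (18)-(20) can be written directly as shown below; the performance ratio γ = 0.9 is an arbitrary illustrative value, whereas Bayesian regularization adapts γ automatically during training.

```python
import numpy as np

def msreg(errors, weights, gamma=0.9):
    """Regularized performance function of Equations (18)-(20)."""
    mse = np.mean(np.square(errors))          # Equation (18): mean squared error
    msw = np.mean(np.square(weights))         # Equation (20): mean squared weight
    return gamma * mse + (1.0 - gamma) * msw  # Equation (19): weighted combination

# Hypothetical numbers: large weights are penalised even when the fit is good
print(msreg(errors=np.array([0.01, -0.02, 0.015]),
            weights=np.array([0.5, -1.2, 4.0, 0.3])))
```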

3. EXPERIMENTAL

3.1. Instrumentation

A Dionex DX600 chromatography system (Sunnyvale, CA, USA), equipped with a quaternary gradient pump (GS50), chromatography module (LC30) and detector module (ED50A), was used in all experiments. The separation and suppressor columns used were a Dionex IonPac AG19 (4 mm × 50 mm) guard column, an IonPac AS19 (4 mm × 250 mm) separation column and an ASRS-ULTRA II 4 mm suppressor, the last of these working in recycle mode. The sample-loop volume was 25 μl, the eluent flow rate was 1.00 ml/min and the temperature was 35°C. The whole system was computer-controlled through Chromeleon 6.70, Build 1820 software. The concentration of OH− in the eluent was varied from 5 to 80 mmol/L and 45 equidistant experimental data points were obtained, each measured in three replicates, providing a total of 135 experimental data points.

3.2. Reagents and solutions

A mixed standard solution of fluoride (30.00 mg/L), chlorite (100.00 mg/L), chloride (60.00 mg/L), chlorate (250.00 mg/L), nitrate (250.00 mg/L) and sulfate (250.00 mg/L) was prepared from the air-dried (at 105°C) salts of the individual anions of p.a. grade (Merck, Darmstadt, Germany). Appropriate amounts of the individual salts were weighed into a 100 ml volumetric flask and dissolved in Milli-Q water. Working standard solutions of fluoride (3.00 mg/L), chlorite (10.00 mg/L), chloride (6.00 mg/L), chlorate (25.00 mg/L), nitrate (25.00 mg/L) and sulfate (25.00 mg/L) were prepared by diluting the appropriate volume of the mixed standard solution in a 100 ml volumetric flask, which was then filled to the mark with Milli-Q water. 18 MΩ cm water (Millipore, Bedford, MA, USA) was used for dilution in all cases.


3.3. Neural networks

The neural network used in this work was a three-layer feedforward neural network. The input layer consists of one neuron representing the concentration of OH− in the eluent. The output layer consists of seven neurons representing the retention time of the void peak and the retention times of the particular anions (fluoride, chlorite, chloride, chlorate, nitrate and sulfate). The training algorithm, the activation function, the number of neurons in the hidden layer and the number of experimental data points used for the training calculations had to be optimized.
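The architecture described above can be summarized as a simple forward pass (Python/NumPy); the weights are random placeholders, the hidden-layer size of 8 is used only as an example, and the hyperbolic tangent hidden activation is one of the options examined later.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(n_in=1, n_hidden=8, n_out=7):
    """Three-layer feedforward network: 1 input (c(OH-)), 7 outputs (retention times)."""
    return {"W1": rng.normal(size=(n_hidden, n_in)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(size=(n_out, n_hidden)), "b2": np.zeros(n_out)}

def forward(net, x):
    h = np.tanh(net["W1"] @ x + net["b1"])    # hidden layer activation
    return net["W2"] @ h + net["b2"]          # linear output layer

net = init_mlp()
print(forward(net, np.array([20.0])))          # 7 predicted retention times (untrained)
```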


Figure 1. Influence of number of hidden layer neurons and number of experimental data points on relative error of the artificial neural network retention model by using logistic sigmoid activation function and different training algorithms: (A) gradient descent algorithm with adaptive learning rate; (B) Powell–Beale conjugate gradient algorithm; (C) Fletcher–Reeves conjugate gradient algorithm; (D) Polak–Ribière conjugate gradient algorithm; (E) quasi-Newton algorithm with Broyden, Fletcher, Goldfarb and Shanno (BFGS) update and (F) Levenberg–Marquardt algorithm with Bayesian regularization. This figure is available in color online at www.interscience.wiley.com/journal/cem

Therefore, the training algorithm was varied between (1) the gradient descent algorithm with adaptive learning rate; (2) the Fletcher–Reeves conjugate gradient algorithm; (3) the Polak–Ribière conjugate gradient algorithm; (4) the Powell–Beale conjugate gradient algorithm; (5) the quasi-Newton algorithm with Broyden, Fletcher, Goldfarb and Shanno update and (6) the Levenberg–Marquardt algorithm with Bayesian regularization; the activation function was varied between logistic sigmoid and hyperbolic tangent; the number of neurons in the hidden layer was varied from 2 to 16; and the number of experimental data points in the training set was varied from 4 to 28.


Figure 2. Influence of number of hidden layer neurons and number of experimental data points on relative error of the artificial neural network retention model by using tangent hyperbolic activation function and different training algorithms: (A) gradient descent algorithm with adaptive learning rate; (B) Powell–Beale conjugate gradient algorithm; (C) Fletcher–Reeves conjugate gradient algorithm; (D) Polak–Ribière conjugate gradient algorithm; (E) quasi-Newton algorithm with Broyden, Fletcher, Goldfarb and Shanno (BFGS) update and (F) Levenberg–Marquardt algorithm with Bayesian regularization. This figure is available in color online at www.interscience.wiley.com/journal/cem

It is preferable that every experimental data point has an equal influence on the neural network model if the training and testing sets are to be representative of the whole design area. For this reason a random function was applied for the selection of the experimental data points used for the training, testing and validation data sets. The input experimental data were scaled so that their mean and standard deviation were 0 and 1, respectively. This was necessary because, although most neural networks can accept input values in any range, they are sensitive only to inputs within a far smaller range.
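A sketch of this random selection and input standardization is given below; the split sizes are illustrative assumptions, and the scaling statistics are taken from the training subset as one possible design choice.

```python
import numpy as np

rng = np.random.default_rng(42)

def split_and_scale(X, Y, n_train=24, n_test=10):
    """Randomly assign points to training/testing/validation and standardize inputs."""
    idx = rng.permutation(len(X))
    tr, te, va = idx[:n_train], idx[n_train:n_train + n_test], idx[n_train + n_test:]
    mean, std = X[tr].mean(axis=0), X[tr].std(axis=0)      # statistics from training set
    scale = lambda A: (A - mean) / std                      # zero mean, unit std dev
    return (scale(X[tr]), Y[tr]), (scale(X[te]), Y[te]), (scale(X[va]), Y[va])

# Hypothetical design: 45 eluent concentrations, 7 recorded retention times each
X = np.linspace(5.0, 80.0, 45).reshape(-1, 1)
Y = rng.random((45, 7))
train, test, val = split_and_scale(X, Y)
print(train[0].mean(), train[0].std())   # approximately 0 and 1
```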


The activation function connecting the input and hidden layers of nodes was varied between the log sigmoid activation function, defined as

f(x) = \frac{1}{1 + e^{-x}}    (21)

and the hyperbolic tangent activation function, defined as

f(x) = \frac{2}{1 + e^{-2x}} - 1    (22)

For the computation of the output activities the linear transfer function was employed:

f(x) = x    (23)
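Written out directly, the candidate activation functions of Equations (21)-(23) are as follows; the numerical check simply confirms that Equation (22) is the hyperbolic tangent.

```python
import numpy as np

logsig = lambda x: 1.0 / (1.0 + np.exp(-x))                # Equation (21)
tansig = lambda x: 2.0 / (1.0 + np.exp(-2.0 * x)) - 1.0    # Equation (22)
purelin = lambda x: x                                      # Equation (23)

x = np.linspace(-3.0, 3.0, 7)
print(np.allclose(tansig(x), np.tanh(x)))                  # True: Equation (22) is tanh(x)
```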

To test the predictive performance of the developed artificial neural network retention model, an independent validation set was used, followed by extensive statistical evaluation. All calculations were performed in the MATLAB 7.0.0 (MathWorks, Sherborn, USA) environment.
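The statistics reported later in Table I (mean, minimal and maximal relative errors and R² per analyte) can be obtained from such a validation set as sketched below; the observed and predicted arrays here are random placeholders.

```python
import numpy as np

def validation_metrics(y_obs, y_pred):
    """Per-analyte relative-error statistics and R^2 (columns = analytes)."""
    rel_err = 100.0 * np.abs(y_pred - y_obs) / y_obs           # relative error, %
    ss_res = np.sum((y_obs - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_obs - y_obs.mean(axis=0)) ** 2, axis=0)
    return {"mean_err_%": rel_err.mean(axis=0),
            "min_err_%": rel_err.min(axis=0),
            "max_err_%": rel_err.max(axis=0),
            "R2": 1.0 - ss_res / ss_tot}

# Hypothetical validation data: 11 eluent levels x 7 analytes
rng = np.random.default_rng(1)
y_obs = rng.uniform(2.0, 30.0, size=(11, 7))
y_pred = y_obs * (1.0 + rng.normal(0.0, 0.005, size=y_obs.shape))
print(validation_metrics(y_obs, y_pred)["mean_err_%"])
```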

4. RESULTS AND DISCUSSION

Figures 1 and 2 illustrate the optimization of the artificial neural network, including the effects of the training algorithm, the activation function, the number of hidden layer neurons and the number of experimental data points used for training. It can be seen (Figures 1 and 2) that the Levenberg–Marquardt training algorithm with Bayesian regularization produces models with the lowest relative error of all the tested algorithms (gradient descent algorithm with adaptive learning rate, Fletcher–Reeves conjugate gradient algorithm, Polak–Ribière conjugate gradient algorithm, Powell–Beale conjugate gradient algorithm and quasi-Newton algorithm with Broyden, Fletcher, Goldfarb and Shanno update) over all domains of the optimized parameters (activation function, number of hidden layer neurons, number of experimental data points used for the training procedure). Moreover, the training process with the least oscillation is found with the Levenberg–Marquardt training algorithm with Bayesian regularization (Figure 2(F)). An optimization surface showing less oscillation increases the possibility of finding the globally optimal conditions on the error hyperplane. This could lead to a significant increase in the predictive ability of the final artificial neural network retention model. In addition, it could enable lowering of the number of experimental data points used for the training procedure without a severe impact on the performance characteristics of the model. Furthermore, the Bayesian regularization procedure allows for the possibility of overlapping the training and validation data sets. This advantage may significantly lower the experimental time and effort by reducing the overall size of the data sets. The features observed point to the Levenberg–Marquardt training algorithm with Bayesian regularization as the optimal training algorithm for artificial neural network retention modeling.

The type of activation function used in the artificial neural network structure can have a significant influence on the retention model performance. One can observe from Figures 1 and 2 that the artificial neural network retention models obtained using the logistic activation function (Figure 1) provide lower relative errors than those obtained using the hyperbolic tangent activation function, for all the training algorithms. The marked exception is the Levenberg–Marquardt training algorithm with Bayesian regularization (Figure 2(F)). This result, in conjunction with the previous discussion about the selection of the training algorithm, indicates the hyperbolic tangent activation function as the optimal one for retention modeling.

Figures 1 and 2 also illustrate the optimization of the number of hidden layer neurons and the number of experimental data points needed for the training set. It is generally preferable to diminish the number of experimental data points used for training in order to reduce the overall experimental effort. The figures show that the number of experimental data points used for the training procedure can be reduced to 16 without severely affecting the predictive ability of the retention model. However, the lowest relative error is obtained by using 24 experimental data points for the training set, indicating that 28 or more experimental data points in the training set could lead to overfitting. The optimal number of neurons in the hidden layer that is compatible with the previously selected optimal conditions (Levenberg–Marquardt training algorithm with Bayesian regularization, hyperbolic tangent activation function, 24 experimental data points for the training set) is 8. Any further increase in the number of neurons in the hidden layer enlarges the relative error and causes overfitting.

In accordance with the previous discussion, the optimal configuration of the artificial neural network for ion chromatographic retention modeling comprises the Levenberg–Marquardt training algorithm with Bayesian regularization, hyperbolic tangent activation function, 24 experimental data points for the training set and 8 neurons in the hidden layer. The predictive ability of the developed retention model is shown in Table I.

Table I. Performance characteristics of the developed artificial neural network retention model

                              Void      Fluoride  Chlorite  Chloride  Sulfate   Chlorate  Nitrate
Mean error/%                  0.6184    0.3807    0.3615    0.4218    0.7982    0.2747    0.3181
Minimal error: value/%        0.0161    0.0034    0.0001    0.0053    0.0059    0.0001    0.0022
Minimal error: c(KOH)/mM      35.68     45.91     59.55     49.32     62.95     64.65     66.36
Maximal error: value/%        1.6309    1.3806    1.3311    1.4199    1.5931    1.4413    1.2224
Maximal error: c(KOH)/mM      5.00      54.43     71.48     71.47     33.98     5.00      5.00
R2                            0.9999    0.9999    0.9999    0.9999    0.9999    0.9999    0.9999

Mean, minimal and maximal errors with the corresponding conditions (c(KOH) in the eluent) for which the minimal and maximal errors are obtained.


The overall average relative error is 0.4533%; the maximal relative error is 1.6309% (obtained for the sulfate peak), which demonstrates extremely good predictive ability. The minimal average relative error, 0.2747%, was obtained for chlorate. Using 5 mM KOH, the elution order of the investigated anions in terms of increasing retention time was: fluoride, chlorite, chloride, chlorate, nitrate, sulfate. Upon increasing the KOH concentration to 80 mM, the elution order changed to: fluoride, chlorite, sulfate, chloride, chlorate, nitrate; that is, the position of sulfate changed significantly. This could have led to its lower predictive ability.

The described artificial neural network retention model can be used for method development in numerous applications in ion chromatography. By applying the model, one can control the selectivity and the time of the chromatographic run simultaneously, without excessive experimentation.

5. CONCLUSIONS

This work presents the development of a back propagation artificial neural network retention model in ion chromatography. Different training methodologies for back propagation neural networks were discussed. It was shown that the optimal training methodology is the Levenberg–Marquardt training algorithm with Bayesian regularization. The activation function, the number of hidden layer neurons and the number of experimental data points used for the training set were optimized in order to obtain an artificial neural network model with superior predictive ability. The optimized artificial neural network retention model showed extremely good predictive ability, as demonstrated by its performance characteristics (overall average relative error of 0.4533%; maximal relative error of 1.6309%). From the results it can be concluded that the developed artificial neural network retention model correlates the data well and can be successfully used for retention modeling in ion chromatography. Moreover, its superior performance characteristics show great potential for the application of the developed model to gradient elution retention modeling by transferring data from the isocratic elution mode. However, the practical application of these enhanced isocratic predictions to the prediction of gradient retention, followed by extensive error analysis, will be considered in the future.

REFERENCES

1. Balke ST. Quantitative Column Liquid Chromatography: A Survey of Chemometric Methods. Elsevier: Amsterdam, 1984.
2. Schoenmakers PJ. Optimisation of Chromatographic Selectivity: A Guide to Method Development. Elsevier: Amsterdam, 1986.
3. Ahuja S. Selectivity and Detectability Optimisation in HPLC. Wiley: New York, 1989.
4. Snyder LR. The future of chromatography: optimizing the separation. Analyst 1991; 116: 1237–1244.
5. Smith RM (ed.). Retention and Selectivity in Liquid Chromatography. Elsevier: Amsterdam, 1995.
6. Lukulay PH, McGuffin VL. The evolution from univariate to multivariate optimization methods in liquid chromatography. J. Microcolumn Sep. 1996; 8: 211–224.
7. Snyder LR. Practical HPLC Method Development (2nd edn). Wiley: New York, 1997.
8. Siouffi AM, Phan-Tan-Luu R. Optimization methods in chromatography and capillary electrophoresis. J. Chromatogr. A 2000; 892: 75–106.
9. Torres-Lapasió JR, García-Álvarez-Coque MC. Levels in the interpretive optimisation of selectivity in high-performance liquid chromatography: a magical mystery tour. J. Chromatogr. A 2006; 1120: 308–321.
10. Schoenmakers PJ, Billiet HAH, de Galan L. Use of gradient elution for rapid selection of isocratic conditions in reversed-phase high-performance liquid chromatography. J. Chromatogr. 1981; 205: 13–30.
11. Quarry MA, Grob RL, Snyder LR. Prediction of precise isocratic retention data from two or more gradient elution runs. Analysis of some associated errors. Anal. Chem. 1986; 58(4): 907–917.
12. Vivó-Truyols G, Torres-Lapasió JR, García-Álvarez-Coque MC. Error analysis and performance of different retention models in the transference of data from/to isocratic/gradient elution. J. Chromatogr. A 2003; 1018: 169–181.
13. Bolanča T, Cerjan-Stefanović Š, Luša M, Rogošić M, Ukić Š. Development of an ion chromatographic gradient elution retention model from isocratic elution experiments. J. Chromatogr. A 2006; 1121: 228–235.
14. Gjerde DT, Schmuckler G, Fritz JS. Anion chromatography with low-conductivity eluents. II. J. Chromatogr. 1980; 187: 35–45.
15. Haddad PR, Cowie CE. Computer-assisted optimization of eluent concentration and pH in ion chromatography. J. Chromatogr. 1984; 303: 321–330.
16. Jenke DR, Pagenkopf GK. Optimization of anion separation by nonsuppressed ion chromatography. Anal. Chem. 1984; 56: 85–88.
17. Jenke DR, Pagenkopf GK. Models for prediction of retention in nonsuppressed ion chromatography. Anal. Chem. 1984; 56: 88–91.
18. Jenke DR. Modeling of analyte behavior in indirect photometric chromatography. Anal. Chem. 1984; 56: 2674–2681.
19. Maruo M, Hirayama N, Kuwamoto T. Ion chromatographic elution behaviour and prediction of the retention of inorganic monovalent anions using a phosphate eluent. J. Chromatogr. 1989; 481: 315–322.
20. Jenke DR. Prediction of retention characteristics of multiprotic anions in ion chromatography. Anal. Chem. 1994; 66: 4466–4470.
21. Hajos P, Horvath O, Denke V. Prediction of retention for halide anions and oxoanions in suppressed ion chromatography using multiple species eluent. Anal. Chem. 1995; 67: 434–441.
22. Novič Mi, Zupan J, Novič Ma. Computer simulation of ion chromatography separation: an algorithm enabling continuous monitoring of anion distribution on an ion-exchange chromatography column. J. Chromatogr. A 2001; 922: 1–11.
23. Madden JE, Avdalovic N, Haddad PR, Havel J. Prediction of retention times for anions in linear gradient elution ion chromatography with hydroxide eluents using artificial neural networks. J. Chromatogr. A 2001; 910: 173–179.
24. Madden JE, Haddad PR. Critical comparison of retention models for optimisation of the separation of anions in ion chromatography: I. Non-suppressed anion chromatography using phthalate eluents and three different stationary phases. J. Chromatogr. A 1998; 820: 65–80.
25. Madden JE, Haddad PR. Critical comparison of retention models for the optimisation of the separation of anions in ion chromatography: II. Suppressed anion chromatography using carbonate eluents. J. Chromatogr. A 1999; 850: 29–41.
26. Madden JE, Avdalovic N, Jackson PE, Haddad PR. Critical comparison of retention models for optimisation of the separation of anions in ion chromatography: III. Anion chromatography using hydroxide eluents on a Dionex AS11 stationary phase. J. Chromatogr. A 1999; 837: 65–74.
27. Bolanča T, Cerjan-Stefanović Š, Regelja M, Regelja H, Lončarić S. Application of artificial neural networks for gradient elution retention modelling in ion chromatography. J. Sep. Sci. 2005; 28: 1427–1433.
28. Bolanča T, Cerjan-Stefanović Š, Regelja M, Regelja H, Lončarić S. Development of an inorganic cations retention model in ion chromatography by means of artificial neural networks with different two-phase training algorithms. J. Chromatogr. A 2005; 1085: 74–85.
29. Havel J, Madden JE, Haddad PR. Prediction of retention times for anions in ion chromatography using artificial neural networks. Chromatographia 1999; 49: 481–488.
30. Maren A, Harston C, Pap R. Handbook of Neural Computing Applications. Academic Press: London, 1990.
31. Pham DT, Sagiroglu S. Three methods of training multi-layer perceptrons to model a robot sensor. J. Eng. Manufacture 1996; 210: 69–76.
32. Alpsan D, Towsey M, Ozdamar O, Tsoi AC, Ghista DN. Efficacy of modified backpropagation and optimisation methods on a real-world medical problem. Neural Netw. 1995; 8(6): 945–962.
33. Tang KW, Pingle G, Srikant G. Artificial neural networks for the diagnosis of coronary artery disease. J. Intell. Syst. 1997; 7: 307–338.
34. Stager F, Agarwal M. Three methods to speed up the training of feedforward and feedback perceptrons. Neural Netw. 1997; 10(8): 1435–1443.
35. Drake R, Packianather MS. A decision tree of neural networks for classifying images of wood veneer. Int. J. Adv. Manuf. Technol. 1998; 14: 280–285.
36. Devika P, Achenie L. On the use of quasi-Newton based training of a feedforward neural network for time series forecasting. J. Intell. Fuzzy Syst. 1995; 3: 287–299.
37. Demuth HB, Beale M. Neural Network Toolbox for Use with MATLAB, User's Guide. The MathWorks, Inc.: Natick, MA, USA, 2004.
38. Vogl TP, Mangis JK, Zigler AK, Zink WT, Alkon DL. Accelerating the convergence of the backpropagation method. Biol. Cybern. 1988; 59: 256–264.
39. Chong EKP, Zak SH. An Introduction to Optimization. Wiley: Singapore, 2004.
40. Hagan MT, Menhaj MB. Training feedforward networks with the Marquardt algorithm. IEEE Trans. Neural Netw. 1994; 5(6): 989–993.
