

J. Sep. Sci. 2013, 36, 279–287

Paul G. Stevenson, Hong Gao, Fabrice Gritti, Georges Guiochon
Department of Chemistry, University of Tennessee, Knoxville, TN, USA
Received August 13, 2012; Revised September 20, 2012; Accepted September 20, 2012

Research Article

Removing the ambiguity of data processing methods: Optimizing the location of peak boundaries for accurate moment calculations

The calculation of the first few moments of elution peaks is necessary to determine the amount of component in the sample (peak area or zeroth moment), the retention factor (first moment), and the column efficiency (second moment). Performing these calculations is a time-consuming and tedious task for the analyst, so data analysis is generally completed by the data stations associated with modern chromatographs. However, data acquisition software is a black box that provides no information to chromatographers on how their data are treated. These results are too important to be accepted on blind faith. The location of the peak integration boundaries is most important. In this manuscript, we explore the relationships between the size of the integration area, the relative position of the peak maximum within this area, and the accuracy of the calculated moments. We found that relationships between these parameters do exist and that computers can be programmed with relatively simple routines to automate the extraction of key peak parameters and to select acceptable integration boundaries. It was also found that the most accurate results are obtained when the S/N exceeds 200.

Keywords: Automated data analysis / Integration boundaries / Statistical peak moments

DOI 10.1002/jssc.201200759

Additional supporting information may be found in the online version of this article at the publisher’s web-site

Correspondence: Professor Georges Guiochon, Department of Chemistry, University of Tennessee, 1420 Circle Drive, Knoxville, TN 37996-1600, USA
E-mail: [email protected]
Fax: +1-865-974-2667

Abbreviation: EMG, exponentially modified Gaussian

© 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim, www.jss-journal.com

Colour Online: See the article online to view Fig. 5 in colour.

1 Introduction

Studies of mass transfer kinetics in high-performance liquid chromatographic systems require at least accurate values of both the first absolute (μ1) and the second central (μ2) moments of eluted peaks [1–4]. Accurate values of the third central moment would also be most useful [5]. Band integration for the calculation of the second (and higher) moments requires correct information on the outer limits of the peak [6, 7]. It is important that these boundaries be accurately determined [8] because it was shown that a 0.5% error in the calculation of μ1 will translate to a 5% error in μ2 and an error of approximately 25% in μ3 [2]. If the integration boundaries are cut short, valuable peak information is lost. If the integration boundaries are overestimated, the results will be skewed by an area corresponding to the baseline noise.

Every year, instrument companies involved in the manufacture of liquid chromatographic instruments sell over a thousand instruments (www.chromatographyonline.com/lcgc/article/articleDetail.jsp?id=690988) [9]. Each of them incorporates a data station that processes the experimental data provided by detectors and calculates peak areas, retention times, column efficiencies, and, in some cases, higher peak moments. These data stations are a black box and analysts do not know the processes that have been applied to their data. Most chromatographers around the world blindly accept these numbers as true. This is strange considering that every parameter in experimental designs is closely monitored and controlled, so why are data processing steps not treated with the same considerations? This situation is dangerous and must change, especially because the peaks eluted from the high-efficiency columns made available by recent progress in column technology are very narrow, so that even a slight degree of peak asymmetry may change the integration result by a significant factor. As a result, the lab-to-lab reproducibility of measurements of column efficiency is no longer satisfactory. For example, the column manufacturer claims that 4.6 × 100 mm columns packed with Poroshell120 EC-C18 have a reduced plate height between 1.45 and 1.65, although we measured values between 1.55 and 1.75 for new columns. For similar 2.1 mm internal diameter columns, the reduced efficiency is claimed to range between 1.9 and 2.1, although we obtained values of 2.1–2.3. The manufacturer claims an RSD of these efficiencies of 2–3% but we found 6–8% [10].

Few papers have been published in this field and instrument manufacturers are not forthcoming about the processes and procedures used to make these measurements [11]. The procedure that we use [12] is different from the classical peak-width at half height method and is more accurate if less precise. More importantly, the algorithms used to correct for baseline drift and noise and to select the integration boundaries are considered confidential information by manufacturers. A previous manuscript [11] examined the performance of one of these software packages. It was found in 2001 that the contribution of the data analysis toward the error was far less than the instrumental contributions. However, some concerns were never clarified, such as the peak area and the zeroth moment (two identically defined terms) having different values. Now that instrument contributions to band broadening are larger than they were 10 years ago, this issue has become most important. If we experience difficulties in obtaining good, reliable data from our data stations, so does everybody else using any data station in the world.

Many important decisions are made daily on the basis of analytical results for which the actual accuracy and precision are unknown and are not properly documented. One of the goals of this report is to clarify the nature of the problems encountered in these calculations and to provide some necessary information on the difficulties encountered in selecting a correct procedure for quantitative measurements of chromatographic data, in implementing it, and in assessing its accuracy and its precision.

When integrating chromatographic peaks and finding their higher order moments, analysts either trust their data station or determine the peak boundaries manually by examining the signal close to the baseline and making a personal, subjective judgement as to where the peak begins and ends.
Once an appropriate decision is made, the data are ported to a program that integrates the peak and provides the peak moments. The accuracy and precision of this procedure depend on the analyst's experience and skill. Because the procedure is tedious and time-consuming, its accuracy is often affected.

In a recent publication [13], we illustrated several possible methods to determine the location of the HPLC peak boundaries. Accurate boundary positions are required for the confident calculation of peak moments. Previously, we assumed that our manual calculations provided correct values of the moments and we did not test the actual accuracy of these techniques. Actually, they cannot be very accurate. Even though strict guidelines can be followed to identify the peak boundary positions, their final locations are still determined largely by the analyst's intuition. Thus, the absolute moments cannot be known and the accuracy of any method developed on this basis cannot be accurately determined.

The peak width multiplier method was shown to provide results consistent with those of manual determinations, and to afford highly precise replicates. To encompass the whole peak, this method assumes a width that is based on the peak width at half height, w1/2. This width is multiplied by an arbitrary value called the width multiplication factor, n, that is defined by the analyst. Accordingly, the boundary width is proportional to the peak width at half height, regardless of the mobile-phase flow rate used for the measurement, according to the following equation

$$ w_{\mathrm{peak}} = w_{1/2} \times n \tag{1} $$

where w_peak is the boundary peak width. Because it is highly unlikely that real peaks are actually symmetrical, the times when the peak begins and ends must be defined by multiplying the result of Eq. (1) by an arbitrary weighting function, w, with:

$$ t_{\mathrm{start}} = t_R - w \times w_{\mathrm{peak}} \tag{2} $$

and

$$ t_{\mathrm{end}} = t_R + (1 - w) \times w_{\mathrm{peak}} \tag{3} $$

where t_R is the peak centroid and 0 < w < 1. In a similar way, Chesler and Cram [2] used increments in the number of data points away from the peak centroid to define the peak boundaries. This method is certainly simplistic; this is also where its strength lies. By not examining the shape of the peaks, this method eliminates the errors caused by imposing thresholds. Also, by assuming a constant amount of baseline relative to the peak width, it ensures that the relative error in each analysis is the same. The simplicity of this method also means that analysts can program their computers to evaluate chromatograms with confidence that their data are handled in the way they wish. This method of measuring the peak moments was shown to provide results in close agreement with those calculated via manual determination of the peak terminating points.

The aim of this manuscript is to describe methods that permit an accurate determination of the peak area, the retention time, and the higher order peak moments of well-resolved chromatographic peaks. The presentation of these methods will permit analysts to judge whether they are appropriate for use in their applications. It will also explain the nature of the difficulties encountered by all methods used for this purpose, whether transparent like this one or kept confidential like most methods implemented in industrial computers. We present these methods as simply as possible to make them accessible to chromatographers who do not have advanced knowledge of computer programming. The methods presented here are particularly suitable for the determination of Van Deemter curves, which requires repetitive analysis of well-resolved peaks. We acknowledge, however, that these methods fail under nonideal conditions that provide incomplete separations. When chromatographers attempt to characterize column performance, the peaks should always be well resolved.
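As an illustration of Eqs. (1)–(3), the boundary rule can be sketched in a few lines of Python. The function name `integration_bounds` and the numerical values in the example are hypothetical and chosen only for demonstration; the calculations in this work were carried out in Mathematica.

```python
def integration_bounds(t_r, w_half, n, w):
    """Return (t_start, t_end) from the width multiplier rule of Eqs. (1)-(3).

    t_r    : peak centroid (min)
    w_half : peak width at half height, w1/2 (min)
    n      : width multiplication factor chosen by the analyst
    w      : weighting factor, 0 < w < 1
    """
    if not 0.0 < w < 1.0:
        raise ValueError("weighting factor w must satisfy 0 < w < 1")
    w_peak = w_half * n                 # Eq. (1): boundary peak width
    t_start = t_r - w * w_peak          # Eq. (2)
    t_end = t_r + (1.0 - w) * w_peak    # Eq. (3)
    return t_start, t_end


# Hypothetical example: centroid at 3 min, w1/2 = 0.1 min, n = 3,
# and w = 0.5, which centers the window on the centroid.
start, end = integration_bounds(3.0, 0.1, 3.0, 0.5)
```

With w = 0.5 the window is symmetric about t_R; a value of w below 0.5 leaves less baseline before the peak and more after it, which is the adjustment wanted for a tailing peak.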
We attempted to find an optimal combination of peak width multipliers and weighting factors that could provide accurate values of μ2 for all the peaks recorded. This was done by comparing the values of μ2 measured for simulated peaks, obtained by adding sequences of experimental noise to a model peak, with the true value of μ2. If this failed, a secondary goal was to determine whether a relationship exists between the peak parameters, the multiplier, and the weighting factor that could be used to predict n and w. This was achieved via a series of simulations whereby peaks of different shapes and sizes were generated and superimposed with experimental noise sequences. μ2 was then calculated with different combinations of w and n. The value of the second central peak moment, μ2, was then compared with the true μ2. These simulations were performed over a range of n and w values and for different peak S/Ns.

Symmetrical chromatographic peaks are well described by the Gaussian function (see Eq. (4)); however, this equation is unable to explain even the slight degree of fronting or tailing that is often observed for experimental chromatograms.

$$ h(t) = h_{\max} \exp\left[\frac{-(t - t_R)^2}{2\sigma^2}\right] \tag{4} $$

where h_max is the peak height, t_R is the retention time of the peak, σ is the peak SD, and t is the time scale of the chromatogram. To account for asymmetrical peaks, the exponentially modified Gaussian (EMG) function was used because it has been used previously in many types of chromatographic applications [14–16]. The EMG function combines Eq. (4) with an exponential decay function [17]:

$$ y(t) = h_{\max} \exp\left[\frac{-(t - t_R)^2}{2\sigma^2}\right] \ast \frac{1}{\tau} \exp\{-t/\tau\} \tag{5} $$

where ∗ denotes convolution, y(t) is the convolved function that describes the combination of the Gaussian peak of Eq. (4) and the exponential decay function, and τ is an exponential modifier that describes the differences between the peak shape and the ideal symmetrical peak.
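The two peak models can be sketched as follows. The Gaussian follows Eq. (4) directly. For the EMG, the sketch uses the standard closed form of the convolution of a Gaussian with an exponential decay, written with the complementary error function and scaled by the peak area rather than by h_max; this parameterization, the function names, and the example values are assumptions made for illustration and are not taken from the original Mathematica routines.

```python
import math


def gaussian(t, h_max, t_r, sigma):
    """Gaussian peak of Eq. (4)."""
    return h_max * math.exp(-((t - t_r) ** 2) / (2.0 * sigma ** 2))


def emg(t, area, t_g, sigma, tau):
    """Closed-form EMG: a Gaussian (center t_g, SD sigma) convolved with
    exp(-t/tau)/tau. Scaled by the total peak area (an assumed choice).
    For points very far ahead of the peak the exponential term can
    overflow, so the time window should stay reasonably close to t_g."""
    expo = sigma ** 2 / (2.0 * tau ** 2) - (t - t_g) / tau
    arg = (sigma / tau - (t - t_g) / sigma) / math.sqrt(2.0)
    return area / (2.0 * tau) * math.exp(expo) * math.erfc(arg)


# Hypothetical time axis: 2 to 4 min sampled at 40 Hz (2400 points per min).
times = [2.0 + i / 2400.0 for i in range(4801)]
gauss_profile = [gaussian(t, 0.02, 3.0, 0.035) for t in times]
emg_profile = [emg(t, 1.0, 3.0, 0.02, 0.08) for t in times]
```

For this EMG form the first moment is t_g + τ and the second central moment is σ² + τ², which gives a convenient check on any numerical integration of the simulated profile.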

2 Experimental

2.1 Instrumentation

The experimental noise of the detector of an Acquity UPLC (Waters, Milford, MA, USA) liquid chromatograph was acquired. This instrument includes a quaternary solvent delivery system, an auto-sampler with a nominal 5 μL sample loop, a column oven, and a data station running the Empower data software (also from Waters). The chromatogram was produced by recording the signal obtained when eluting an isocratic mobile phase of 50% ACN with water (Fisher Scientific, Fair Lawn, NJ, USA) at a flow rate of 0.8 mL/min for over 650 min. No sample was injected and the chromatogram was recorded at a wavelength of 254 nm at 40 Hz. The measurements were recorded at the laboratory temperature, 24 °C. The noise was determined as the range between the upper and lower points in ten randomly selected 15-min segments of this chromatogram. It was found to be 0.0003 Au (see [5] for more information regarding the recording of this experimental noise).

A mixture of toluene, ethylbenzene, propylbenzene, and butylbenzene (Fisher Scientific) was prepared in 90% ACN and injected onto a Phenomenex Luna C18 HPLC column (Phenomenex, Torrance, CA, USA, 150 × 4.6 mm, 5 μm particle size, 100 Å pore diameter) that was eluted under isocratic conditions with an aqueous solution of 90% ACN. These separations were performed at 1 mL/min, at the laboratory temperature, on an Agilent 1100 HPLC (Agilent Technologies, Palo Alto, CA, USA) equipped with a mobile-phase degasser, an auto-sampler, a binary pump, and a column compartment.

2.2 Simulation

Simulated peak profiles were generated using a theoretical model of peaks, either a Gaussian or an EMG model. The parameters for these models were selected to investigate the effects of the S/N, the peak width, and the degree of peak tailing. All simulated peaks had their center, t_R, at 3 min. For the Gaussian model, the peaks were simulated with h_max = 0.02, 0.01, 0.005, and 0.0025 Au, with values for σ of 0.07, 0.035, and 0.0175 min. For the EMG peaks, σ was fixed at 0.02 min and τ was 0.02, 0.04, 0.08, 0.16, and 0.32 min, which provided peaks with asymmetry values at 5% peak height (As5%) of 1.05, 1.07, 1.11, 1.19, and 1.35. The values of h were adjusted so that the S/N was kept constant (see Section 3.2). The width multiplier was a function of the peak width at 5% height.

Sections of a random 20 min of the experimentally recorded noise were added to the peak profiles. A small amount of baseline drift was observed in the recorded signal and a basic baseline correction routine was required. The baseline was adjusted by drawing a straight line between the first and last data points within this data set. This line was then subtracted from the signal intensity data. Ten random segments of experimental baseline were superimposed onto the model peaks for each peak shape/width and multiplier/weighting combination, and the results were averaged.
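The superposition of noise segments onto a model peak can be sketched as below. The experimental noise record is not reproduced here, so the sketch substitutes synthetic white noise as a stand-in; the function names, the noise amplitude, and the seeds are all hypothetical.

```python
import random


def random_segment(noise_record, length, rng):
    """Cut one contiguous segment of `length` points at a random offset."""
    if length > len(noise_record):
        raise ValueError("noise record is shorter than the requested segment")
    start = rng.randrange(len(noise_record) - length + 1)
    return noise_record[start:start + length]


def noisy_replicates(profile, noise_record, n_replicates=10, seed=0):
    """Superimpose `n_replicates` random noise segments onto a neat profile."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_replicates):
        segment = random_segment(noise_record, len(profile), rng)
        replicates.append([p + e for p, e in zip(profile, segment)])
    return replicates


# Stand-in for the recorded detector noise (assumption: white noise, SD 5e-5 Au).
_noise_rng = random.Random(1)
noise_record = [_noise_rng.gauss(0.0, 5e-5) for _ in range(20000)]

# Hypothetical neat profile: a flat 100-point trace, used only to show the call.
neat_profile = [0.0] * 100
replicates = noisy_replicates(neat_profile, noise_record)
```

Real detector noise is correlated and drifts slowly, which white noise does not reproduce, so a recorded blank chromatogram should replace `noise_record` in any serious test.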

2.3 Data analysis

Calculation of the peak boundaries, peak area, and peak central moments was completed with algorithms built in-house, using Wolfram Mathematica 7 (Wolfram Research, Champaign, IL, USA). Baseline drift leads to errors in the second and higher moments [18]. The experiments were designed and carried out in order to obtain flat baselines; the baseline corrections were completed prior to the moment calculations, by drawing a straight line between the first and the last data points of the selected data set and subtracting it from the signal.
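The two steps of this section, the straight-line baseline correction and the moment sums defined by Eqs. (6) and (7), can be sketched as follows. This is a minimal Python illustration under stated assumptions (uniform sampling, rectangular summation, hypothetical function names and example values) and not a transcription of the in-house Mathematica code.

```python
import math


def baseline_correct(times, signal):
    """Subtract the straight line joining the first and last data points."""
    t0, t1 = times[0], times[-1]
    y0, y1 = signal[0], signal[-1]
    slope = (y1 - y0) / (t1 - t0)
    return [y - (y0 + slope * (t - t0)) for t, y in zip(times, signal)]


def peak_moments(times, heights, t_start, t_end, max_order=3):
    """Zeroth moment (area), first moment, and central moments up to max_order.

    Uses the rectangular summation method and assumes uniform sampling.
    """
    points = [(t, h) for t, h in zip(times, heights) if t_start <= t <= t_end]
    if len(points) < 2:
        raise ValueError("integration window contains fewer than two points")
    dt = times[1] - times[0]
    total = sum(h for _, h in points)
    area = total * dt                                  # zeroth moment
    mu1 = sum(h * t for t, h in points) / total        # Eq. (6)
    central = {
        m: sum(h * (t - mu1) ** m for t, h in points) / total   # Eq. (7)
        for m in range(2, max_order + 1)
    }
    return area, mu1, central


# Hypothetical example: Gaussian peak (sigma = 0.035 min) on a drifting baseline.
times = [2.0 + i / 2400.0 for i in range(4801)]
raw = [0.02 * math.exp(-((t - 3.0) ** 2) / (2.0 * 0.035 ** 2))
       + 1e-4 + 2e-4 * (t - 2.0) for t in times]
heights = baseline_correct(times, raw)
area, mu1, central = peak_moments(times, heights, 2.5, 3.5)
```

For this noise-free example μ1 should be close to 3 min and μ2 close to σ² = 0.001225 min², which makes it a useful sanity check before noise is added.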



The first absolute peak moment (retention time) was calculated according to Eq. (6) [6, 19]:

$$ \mu_1 = \frac{\sum_{i=t_{\mathrm{start}}}^{t_{\mathrm{end}}} h_i t_i}{\sum_{i=t_{\mathrm{start}}}^{t_{\mathrm{end}}} h_i} \tag{6} $$

In this equation, h_i is the height of the point from the baseline. Throughout this work, the areas were calculated via the summation method based on the assumption that a rectangular profile is eluted between points i and i + 1 [20]. A trapezoidal shape might be more appropriate but this distinction was deemed to be unimportant for the results presented here, given the high frequency of the data acquisition. The second and higher central moments were calculated by the following Eq. (7) [6]:

$$ \mu_m = \frac{\sum_{i=t_{\mathrm{start}}}^{t_{\mathrm{end}}} h_i (t_i - \mu_1)^m}{\sum_{i=t_{\mathrm{start}}}^{t_{\mathrm{end}}} h_i} \tag{7} $$

where m is the moment order.

3 Results and discussion

3.1 Simulation of boundary parameters

The ideal boundary parameters were obtained by measuring the second central moment, μ2, of the simulated peaks. These peaks were generated by adding sequences of experimental baseline noise to theoretical peak profiles. The boundary locations were determined using the method described in the previous sections. The second moment, μ2, was then calculated and compared to the true second moment. These true moments were derived from the theoretical peak and obtained without addition of noise. The moments were then calculated while systematically changing the peak width multiplier and the weighting factor. Two peak models were tested in this work, the Gaussian and EMG models.

3.2 Influence of peak width

Gaussian peaks were simulated according to Eq. (4). Gaussian peaks are symmetrical and do not tail, thus the only parameter that was investigated was the peak width multiplier, n. Three peak shapes were simulated with peak width parameters, σ, of 0.07, 0.035, and 0.0175 min. The calculations were also made for S/N of approximately 200, 100, 50, and 15.

Figure 1. (A) Accuracy and (B) precision of the second central moment of simulated Gaussian peaks with an S/N of 200, where σ was set to 0.07, 0.035, and 0.0175 min.


The accuracy of the simulations was measured as the percentage difference between the central moments of the peak with added noise and those of the same peak without noise, i.e. the "neat" peak profile. The precision of the calculations was determined as the RSD of ten replicate simulations made with profiles including ten different noise sequences. This was completed for the zeroth through the second peak moments. The third central peak moment is related to the peak asymmetry. Since the Gaussian profiles are symmetrical, μ3 was not considered for the analysis of this model. For the zeroth and the first peak moments, the relative differences between the moments of the noisy and the neat profiles were very small, 0.003% or less for all the simulations that were completed here. Because the accuracy is so high, these parameters will not be discussed further.

Plots of the percentage differences in the calculations of μ2 and its RSD versus the peak width multiplier for the three peak shapes at different S/N are illustrated in Figs. 1–4. Figure 1A represents the accuracy of the second central moments as the relative difference between the moments for the simulated and the neat peak profiles that have an S/N of 200. The noise was calculated as the midpoint of the mean maximum and minimum of the data points in a flat baseline segment. The results show that the μ2 of these peaks is in close agreement with the true value, with a relative difference of ca. 1% for the three peaks. The narrowest peak gave the closest agreement, with a relative difference of 0.37% to the true value that remains constant over a range of width multipliers from 3.5 to 4.5. The accuracy decreased with increasing peak width. The relative difference was 0.86% for σ = 0.035 with a multiplier range from 3 to 4 and reached 1.18% for σ = 0.07 with a multiplier range from 3.5 to 4. When the multiplier was increased above 4, the accuracy decreased.

Figure 1B shows the RSD of the ten replicate analyses in the calculation of μ2. The RSD increased with increasing multiplier, illustrating the effect of the noise on these calculations. Over the range of n between 3.5 and 4, the RSD more than doubled in magnitude. Precise calculations of moments must therefore be made with as small a window multiplier as possible. For a multiplier of 3, the RSD was smallest for the narrowest peak at 1.23%; the other peaks had similar RSDs of up to ca. 2.1%.

The results obtained for peaks simulated with an S/N of approximately 100 are illustrated in Fig. 2A. Unlike Fig. 1A, there is no correlation between the peak width and the accuracy of μ2. In this instance, the widest peak gave the best agreement with the true moment with an error of 0.52% at a multiplier of 3, followed by the narrowest peak with an error of 0.89% at a multiplier of 3.5, and finally the middle peak with an error of 2.77% at a multiplier of 3. Figure 2B is a plot of the RSD of the results versus the multiplier. In contrast with Fig. 1B, the narrowest peak had the largest RSD at a multiplier of 3 (4.17%). The other two peaks again gave similar results with RSDs of ca. 3.06%.

Figures 3 and 4 show the same results for peaks having an S/N of 50 and 15, respectively. The shapes of the graphs are similar to those described previously. The minimum difference between the noisy and the neat data sets was found for a peak width multiplier of 3, giving a relative difference of 0.19–4.19% for an S/N of 50 (Fig. 3A) and 2.0–5.1% for an S/N of 15 (Fig. 4A). Also, the RSD increased with decreasing S/N: 5.33–8.78% for 50 (Fig. 3B) and 16.84–18.03% for 15 (Fig. 4B).

From this set of simulations, we conclude that the S/N is most important to achieve accurate and precise calculations of the peak moments. Figures 1–4 show the same general result: it is impossible to obtain highly accurate values without sacrificing precision. This is an important trade-off that must be closely considered. It is vital that high-accuracy moment values are obtained, otherwise there is no point in performing scientific experiments; however, precision must also be maintained to provide confidence in the results. It was found that the peak width does not have a significant impact on the moment results. For symmetrical peaks, n should be as small as possible to maintain a high precision and thus a high degree of confidence in the results. We found that n should be equal to 3, to ensure that the whole peak is accounted for and to minimize the influence of the baseline noise.
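The two figures of merit used throughout this section can be sketched as below. The sketch assumes that the accuracy is reported as an absolute percentage difference of the replicate mean from the true value and that the RSD uses the sample standard deviation; neither detail is stated explicitly in the text, and the function names and example numbers are hypothetical.

```python
import statistics


def percent_difference(measured, true_value):
    """Absolute relative difference, in percent, from the neat-peak value."""
    return 100.0 * abs(measured - true_value) / abs(true_value)


def relative_sd(values):
    """RSD in percent, using the sample SD (n - 1 in the denominator)."""
    return 100.0 * statistics.stdev(values) / abs(statistics.mean(values))


def accuracy_and_precision(replicate_values, true_value):
    """Accuracy of the replicate mean and RSD of the replicates."""
    mean_value = statistics.mean(replicate_values)
    return percent_difference(mean_value, true_value), relative_sd(replicate_values)


# Hypothetical mu2 values from three noisy replicates, true mu2 = 1.00.
accuracy, rsd = accuracy_and_precision([0.98, 1.00, 1.02], 1.00)
```

Repeating this calculation for each value of the multiplier n gives one point per n on plots of the kind shown in Figs. 1–4.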

3.3 Influence of the peak tailing

Unlike Gaussian peak profiles, EMG profiles are asymmetric. To take the peak asymmetry into account, the EMG model contains one more parameter that describes the degree of peak tailing. Since few experimental peaks are actually symmetrical, the EMG model provides a good illustration of real peaks [14–16]. The optimization of the boundary positions is much more difficult with asymmetric peaks. There are now two parameters that must be taken into account, the peak width multiplier and the weighting function. The role of the weighting function is to shift the boundaries so that a shorter segment of flat baseline is included before a tailing peak and a longer section of the tailing edge is accounted for, and the converse for leading peaks.

Figure 5. Simulated results of (A) the relative accuracy to the true μ2, (B) the relative precision in μ2 of the noisy peak in replicate analysis, and (C) an addition of the plots (A) and (B), as functions of both the width multiplier and the weighting function. Here, we used a rainbow color gradient to illustrate the magnitude of the difference between noisy and noiseless moments, where blue coloring indicates a relative error approaching 0% and red coloring a relative error approaching (A, B) 20% and (C) 40% (the white areas represent values greater than these cut-offs). This peak was constructed so that the S/N was ca. 200 and the peak tailing variable was equal to 0.08.

To determine if there is a connection between the peak tailing, the width multiplier, and the weighting factor, peaks with an S/N of ca. 240 were simulated over a wide range of values of τ, and thus of As5%. This S/N was selected so as to provide accurate and precise values of μ2, based on the results obtained for the Gaussian profiles. The accuracy and precision were measured in the same way as in the previous section and are illustrated by way of contour plots. In Fig. 5A, the purple color represents good agreement between the moments of the noisy and the neat profiles and the red color poor agreement. In Fig. 5B, the purple color represents a high precision in the replicate analysis and the red color a poor precision. The optimum values were determined with the aid of a third contour plot, the superimposition of the relative accuracy and relative precision plots. When multiple areas were found sharing the minimum values, they were refined by first finding the minimum relative difference to μ2 of the theoretical peak, i.e. the highest accuracy, and then the minimum RSD of replicate analyses.

Unlike in the previous section, the peak width multiplier was not calculated as a function of the peak width at half height. Instead, the multiplier is applied to the peak width at 5% of the peak height (w5%). It was found that the peak width at half height, w1/2, could not provide adequate parameters when applied to real data (see next section). The influence of the peak asymmetry on the peak width is much stronger at 5% of the peak height than at half height. The relative accuracy of μ2 calculated for a peak with an S/N of ca. 200 and As5% = 1.11 is represented in Fig. 5. Figure 5C is the combination of the relative accuracy and the precision plots, in Fig. 5A and B, respectively.
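The selection rule described above, the minimum of the summed accuracy and precision surfaces with ties refined first by accuracy and then by RSD, can be sketched as a simple grid search. The dictionary layout, the function name, and the numbers in the example are hypothetical; in this work the choice was made by inspecting contour plots, with some subjective judgement.

```python
def best_boundary_parameters(grid):
    """Pick the (n, w) pair with the smallest combined error.

    `grid` maps (n, w) -> (accuracy_percent, rsd_percent).
    The combined score is the unweighted sum of the two values; ties are
    broken by the accuracy first and by the RSD second.
    """
    def sort_key(item):
        (_n, _w), (accuracy, rsd) = item
        return (accuracy + rsd, accuracy, rsd)

    (n, w), _ = min(grid.items(), key=sort_key)
    return n, w


# Hypothetical grid of simulation results (percent values).
example_grid = {
    (2.0, 0.30): (3.0, 4.0),
    (2.5, 0.35): (2.0, 5.0),
    (3.0, 0.40): (1.0, 6.0),
    (3.5, 0.50): (5.0, 5.0),
}
best_n, best_w = best_boundary_parameters(example_grid)
```

In this example three entries share the minimum combined score of 7%, and the tie-breaking on accuracy selects n = 3.0 and w = 0.40.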
The combination plots were produced with the assumption that the accuracy and precision have equal importance when identifying the optimal position. Thus, no weighting was performed and the data represented are merely an addition of the values used to create the (A) and (B) figures. The procedure to select the optimum values for the peak width multiplier and the weighting factor was somewhat subjective. Because there is no clear method to find the exact best values, the guidelines followed were to find the minimum areas in the combination plot (Fig. 5C) and then to determine if this location was appropriate, based first on the accuracy plot and then on the precision plot. The optimum values for the plots illustrated in Fig. 5 occurred at a peak width multiplier of 2.54 and a weighting factor of 0.35. At this location in Fig. 5A, the relative accuracy of μ2 was better than 2.5% and the RSD was less than 5.5%.

For these simulations, peaks were designed to have As5% values of 1.05, 1.07, 1.11, 1.19, and 1.35 and a σ value of 0.02 min. A total of 15 contour plots were used to generate the data for Fig. 6. These profiles are not displayed in this manuscript but can be found in the Supporting Information of this work. The optimal values for each As5% were calculated using the same methodology. They are illustrated in Fig. 6. As the tailing increases, the distance between the peak boundaries must increase to accommodate the longer return to the baseline. A linear relationship was observed between the peak width multiplier and the peak asymmetry (see Fig. 6A), with n = 1.63 As5% + 0.51. Figure 6B shows the relationship between the degree of peak tailing and the optimal weighting factor. At the first point, the tailing was minimal and required a weighting factor of ca. 0.4. As the tailing increased, the boundaries had to be adjusted so that the centroid was closer to the beginning marker. A linear relationship, w = −0.56 As5% + 0.96, is observed between As5% and w. The influence of peak fronting on the weighting factor was not measured here; we assumed that the same relationship will exist, only the value of w would increase toward 1 rather than decrease.

Figure 6. Plots illustrating the relationship between the degree of peak tailing and the optimal (A) width multiplier and (B) weighting factor. Dashed lines indicate the linear least squares fit of the data points.

As the order of the central peak moment increases, the information contained recedes further from the peak center, as illustrated in Fig. 6.9 of [6]. As the heights of the peak front and tail decay toward the baseline, the relative importance of the signal contribution to the moment decreases while that of the baseline noise increases, thus magnifying the error made (see Fig. 6.10 in [6]). Since the peak broadens and integration must be made over a wider range of signal, the error made in the calculation of the higher order moments increases, in particular for the third and higher central peak moments [5]. The approach presented here for finding the peak boundaries is useful for these calculations because it helps to choose an optimum integration interval that is wide enough to limit the calculation error due to the loss of integration area but also sufficiently narrow to limit the contribution of the baseline noise. However, extreme caution is needed when performing the calculations of the third or higher peak moments. At this time, we suggest that it might be preferable to determine the peak boundaries manually to find the points where the contribution of the baseline noise becomes unacceptably high.

3.4 Application to real chromatographic peaks

Since a clear relationship was observed between n and w on the one hand and As5% on the other, the resulting linear equations can be applied to real peaks. This is illustrated here by a study of the elution profiles and of the separation of four alkylbenzenes: toluene, ethylbenzene, propylbenzene, and butylbenzene.

After importing the chromatographic data, the algorithm extracted approximate boundary locations for each peak, ensuring that the peak width was overestimated. For each peak, the data subset was then extracted, the baseline corrected, and the width at 5% of the peak height (w5%) determined directly from the data. Following this, the linear equations discussed in the previous section were solved for n and w and the peak integration boundaries were calculated according to Eqs. (2) and (3). To solve these equations, values of As5% and t_G were determined by fitting an EMG profile to the extracted data set.

Figure 7. Extracted peaks from a separation of alkylbenzenes: (A) toluene, (B) ethylbenzene, (C) propylbenzene, (D) butylbenzene. Automatically calculated integration boundaries are indicated by vertical lines.

The peaks and integration boundaries are illustrated in Fig. 7. As we do not know the real values of μ2 for these peaks, we cannot report on the accuracy of these limits of integration. Visually, however, the boundaries illustrated in Fig. 7 are acceptable. The values of the moments were not calculated for these peaks.
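The automatic boundary selection can be sketched as follows. The two linear relations are those of Fig. 6; everything else is an assumption made for illustration: the function names, the measurement of w5% and As5% by linear interpolation of threshold crossings (this work obtained As5% and t_G from an EMG fit instead), the definition of As5% as the ratio of the tail to the front half-widths at 5% height, and the use of a generic reference time `t_ref` in Eqs. (2) and (3).

```python
import math


def _interpolate(t0, h0, t1, h1, level):
    """Time at which the straight line through two points reaches `level`."""
    return t0 + (level - h0) * (t1 - t0) / (h1 - h0)


def width_and_asymmetry(times, heights, fraction=0.05):
    """Apex time, width, and asymmetry at `fraction` of the apex height.

    Intended for smooth (fitted or low-noise), baseline-corrected profiles.
    """
    apex = max(range(len(heights)), key=lambda k: heights[k])
    level = fraction * heights[apex]
    i = apex
    while i > 0 and heights[i - 1] >= level:
        i -= 1
    j = apex
    while j < len(heights) - 1 and heights[j + 1] >= level:
        j += 1
    if i == 0 or j == len(heights) - 1:
        raise ValueError("profile does not return below the chosen level")
    t_front = _interpolate(times[i - 1], heights[i - 1], times[i], heights[i], level)
    t_tail = _interpolate(times[j], heights[j], times[j + 1], heights[j + 1], level)
    t_apex = times[apex]
    return t_apex, t_tail - t_front, (t_tail - t_apex) / (t_apex - t_front)


def boundary_parameters(as_5pct):
    """Linear fits of Fig. 6, obtained for As5% between 1.05 and 1.35."""
    n = 1.63 * as_5pct + 0.51     # Fig. 6A
    w = -0.56 * as_5pct + 0.96    # Fig. 6B
    return n, w


def automatic_bounds(t_ref, w_5pct, as_5pct):
    """Integration boundaries from Eqs. (2) and (3), with w_peak = n * w5%."""
    n, w = boundary_parameters(as_5pct)
    w_peak = n * w_5pct
    return t_ref - w * w_peak, t_ref + (1.0 - w) * w_peak


# Hypothetical example: a noise-free Gaussian peak sampled at 40 Hz.
times = [2.0 + k / 2400.0 for k in range(4801)]
heights = [0.02 * math.exp(-((t - 3.0) ** 2) / (2.0 * 0.035 ** 2)) for t in times]
t_apex, w5, as5 = width_and_asymmetry(times, heights)
t_start, t_end = automatic_bounds(t_apex, w5, as5)
```

On noisy data the threshold crossings near 5% of the peak height are easily disturbed, which is one reason to take As5% from a fitted model, as was done for the alkylbenzene peaks.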

4 Conclusion

The series of calculations made on Gaussian and EMG peaks showed a relationship between the peak width, the degree of peak tailing, and the integration boundaries required for the integration of the moments of HPLC peaks. These calculations also showed that accurate measurements of these moments are possible if the S/N exceeds 200; the larger the S/N, the smaller the error made on these moments. From the analysis of the Gaussian profiles, it was concluded that the width of the integration area should be approximately three times the peak width at half height, with no significant effect due to the peak width in the range examined. The use of the EMG function provided a relationship between the degree of peak tailing, the width multiplier, and the weighting function. Although no universal set of parameters could be found that would provide optimum integration boundaries for all chromatographic peaks, a linear relationship was found between these parameters. This relationship was tested with real peak profiles, and it was shown that computers may automatically select the integration parameters and determine the best integration boundaries for accurate measurements of the second peak moment. The peaks we used here were simulated to represent real peak profiles. For researchers to find appropriate integration boundaries, we suggest that a peak model which accounts for asymmetry be fitted to their data and that Fig. 6 then be used to find values for the peak width multiplier and weighting function.

Finally, this work concludes that the measurement of accurate and precise values of the first few moments of chromatographic peaks still poses difficult problems that are not yet definitively solved. The commercial literature regarding chromatographic instruments and the software that their data stations implement provides little detailed information explaining how this problem is approached, and particularly which algorithms are used. This work shows that the accuracy and the reproducibility of the measured peak area are often better than 1%. Those of the retention factors tend to be in the 1% range under favorable circumstances. Those of the second central moment, hence of the column efficiency, cannot be better than a few percent. These figures assume that the S/N exceeds 200. For lesser values of this ratio, the errors increase. These figures should be included in all attempts to assess the quality of the data obtained in quantitative analyses by chromatography.

We showed that algorithms can be written to achieve quality calculations for peak moments. When chemists use custom approaches for the analysis of their data, they can trust the results because they know exactly what has been done with these data, which removes any ambiguity in the results. Approaches similar to the one reported here should be considered for the collection of all information when every aspect of the experimental design is closely controlled.

This work was supported in part by Grant DE-FG05-88-ER13869 of the US Department of Energy and by the cooperative agreement between the University of Tennessee and Oak Ridge National Laboratory.

The authors have declared no conflict of interest.

5 Nomenclature

μm        m-th central peak moment
σ         peak standard deviation
τ         asymmetry factor
As5%      peak asymmetry measured at 5% peak height
f         time period in between detector acquisitions
hmax      maximum peak height
n         peak width multiplier
t         chromatogram time
tR        peak retention time
tstart    boundary start time
tend      boundary end time
w         weighting function
w1/2      peak width at half height
wpeak     boundary peak width

6 References

[1] Miyabe, K., J. Sep. Sci. 2009, 32, 757–770.
[2] Chesler, S. N., Cram, S. P., Anal. Chem. 1971, 43, 1922–1933.
[3] Brown, P. R., Grushka, E., Advances in Chromatography, Vol. 40, Marcel Dekker, New York, NY, USA 2000.
[4] Jeansonne, M. S., Foley, J. P., J. Chromatogr. 1992, 594, 1–8.
[5] Gao, H., Stevenson, P. G., Gritti, F., Guiochon, G., J. Chromatogr. A 2012, 1222, 81–89.
[6] Guiochon, G., Felinger, A., Shirazi, D. G., Katti, A. M., Fundamentals of Preparative and Nonlinear Chromatography, Academic Press, San Diego, CA, USA 2006.
[7] Miyabe, K., Guiochon, G., J. Sep. Sci. 2003, 26, 155–173.
[8] Oberholtzer, J. E., Rogers, L. B., Anal. Chem. 1969, 41, 1234–1240.
[9] LCGC, Global HPLC market report, LCGC, Iselin, NJ 2010.
[10] Gritti, F., Guiochon, G., J. Chromatogr. A 2012, 1252, 56–66.
[11] Felinger, A., Guiochon, G., J. Chromatogr. A 2001, 913, 221–231.
[12] Gritti, F., Guiochon, G., J. Chromatogr. A 2011, 1218, 4452–4461.
[13] Stevenson, P. G., Gritti, F., Guiochon, G., J. Chromatogr. A 2011, 1218, 8255–8263.
[14] Howerton, S. B., Lee, C., McGuffin, V. L., Anal. Chim. Acta 2003, 478, 99–110.
[15] Jeansonne, M. S., Foley, J. P., J. Chromatogr. Sci. 1991, 29, 258.
[16] Foley, J. P., Dorsey, J. G., J. Chromatogr. Sci. 1984, 22, 40–46.
[17] Pap, T. L., Pápai, Z., J. Chromatogr. A 2001, 930, 53–60.
[18] Anderson, D. J., Walters, R. R., J. Chromatogr. Sci. 1984, 22, 353–359.
[19] Grushka, E., Myers, M. N., Giddings, J. C., Anal. Chem. 1970, 42, 21–26.
[20] Jeansonne, M. S., Foley, J. P., J. Chromatogr. 1989, 461, 149–163.
