Accepted Manuscript

Title: Estimation of the Limit of Detection Using Information Theory Measures
Author: Jordi Fonollosa, Alexander Vergara, Ramon Huerta, Santiago Marco
PII: S0003-2670(13)01338-X
DOI: http://dx.doi.org/doi:10.1016/j.aca.2013.10.030
Reference: ACA 232904
To appear in: Analytica Chimica Acta
Received date: 9-8-2013
Revised date: 8-10-2013
Accepted date: 11-10-2013

Please cite this article as: J. Fonollosa, A. Vergara, R. Huerta, S. Marco, Estimation of the Limit of Detection Using Information Theory Measures, Analytica Chimica Acta (2013), http://dx.doi.org/10.1016/j.aca.2013.10.030

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Estimation of the Limit of Detection Using Information Theory Measures

Jordi Fonollosaᵃ,¹, Alexander Vergaraᵇ, Ramon Huertaᵃ, Santiago Marcoᶜ,ᵈ

ᵃ BioCircuits Institute (BCI), University of California San Diego, La Jolla, CA 92093, USA
ᵇ Biomolecular Measurement Division, Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD 20899-8362, USA
ᶜ Signal and Information Processing for Sensing Systems, Institute for Bioengineering of Catalonia (IBEC), Baldiri Reixac, 4-8, 08028 Barcelona, Spain
ᵈ Departament d’Electrònica, Universitat de Barcelona, Martí i Franqués 1, 08028 Barcelona, Spain

¹ Author to whom all correspondence should be addressed: Dr. Jordi Fonollosa. Tel.: +1 858 534-6758; Fax: +1 858 534-7664; e-mail: [email protected]

Abstract
Definitions of the Limit of Detection (LOD) based on the probability of false positive and/or false negative errors have been proposed over the past years. Although such definitions are straightforward and valid for any kind of analytical system, the proposed methodologies to estimate the LOD are usually simplified to signals with Gaussian noise. Additionally, there is a general misconception that two systems with the same LOD provide the same amount of information on the source regardless of the prior probability of presenting a blank/analyte sample. Based upon an analogy between an analytical system and a binary communication channel, in this paper we show that the amount of information that can be extracted from the analytical system depends on the probability of presenting the two different possible states. We propose a new definition of LOD utilizing Information Theory tools that deals with noise of any kind and allows the introduction of prior knowledge easily. Unlike most traditional LOD estimation approaches, the new definition is based on the amount of information that the chemical instrumentation system provides on the chemical information source. Our findings indicate that the benchmark of analytical systems based on the ability to provide information about the presence/absence of the analyte (our proposed approach) is a more general and proper framework, while converging to the usual values when dealing with Gaussian noise.

Keywords
Limit of Detection; Information Theory; Mutual Information; Heteroscedasticity; False positive/negative errors; Gas Discrimination and Quantification

1. Introduction

The Limit of Detection (LOD) of an analytical method (or measurement instrument) is a fundamental figure of merit. In most settings, it specifies the smallest concentration at which the analyte can be detected or distinguished from a blank measurement within a stated confidence level, thereby constituting a limiting factor of a chemical detection system. The value of the LOD provided in the specifications of any system is highly significant, since it may have legal implications and play a relevant role in the coordination of market regulations by standardization agencies. For example, in 2006 the US Environmental Protection Agency (EPA) revised the maximum admissible contamination level of arsenic in drinking water from 50 nmol/mol to 10 nmol/mol. Based on the LOD of the different methodologies to measure the concentration of arsenic, the EPA invalidated some of the previously accepted techniques to measure arsenic content in water [1, 2]. Therefore, a clear and accurate definition of LOD, along with a consistent experimental protocol for its estimation, is imperative both for the determination of the LOD provided in the specifications of any chemical system and for the mutual understanding among technicians, system developers, and policy makers.

Over the last decades, different definitions of the LOD have been proposed, each leading to substantially different estimated values of the LOD [3]. Early definitions of LOD considered only the probability of false positive errors (i.e., the probability of falsely claiming the presence of the analyte in a sample, or Type I errors) and not the probability of false negative errors (i.e., the probability of falsely claiming the absence of the target compound, or Type II errors), which could lead to a Type II error probability of 50% [4]. Similar approaches based on the probability of errors were utilized to evaluate the acceptability of analytical methods [5]. Subsequent definitions of LOD were adapted to take into account both Type I and Type II errors [6], recommending that the LOD be set at the concentration level that makes the probabilities of Type I and Type II errors 5% or less. However, although all these widely accepted definitions are generally valid for any kind of analytical system, the methodologies utilized to estimate the LOD are usually simplified to signals with Gaussian noise.

A chemical measurement system can be considered in most cases as a black box with input/output signals. The input signal is the concentration of the target analyte, whereas the output signal takes the form of an instrument-dependent quantity derived from the raw instrument output (e.g., the peak-area integral at a certain retention time in a GC-FID configuration). Note that the LOD is always given in terms of the analyte concentration. Therefore, the concentration (input space) uncertainty must be estimated from the variance at the instrument output. To do so and build the corresponding calibration model, the systems are frequently assumed to have a linear input-output relationship and a constant output variance (homoscedasticity), because linear models allow an easy transformation of the output noise variance to the input. In some instrumental techniques, however, the input-output relationship can be non-linear (e.g. due to competition effects) and the noise distribution can depart from the assumed Gaussian distribution. Interestingly, even though some authors have considered more complex sources of noise, such as heteroscedastic stochastic processes, to solve problems in realistic scenarios, they still assume Gaussian noise in their calculations [7, 8].

The assumption of Gaussian noise in the estimation of the LOD can be too restrictive for chemical detection systems because the measured value (i.e., the analyte concentration) is necessarily positive. Therefore, the distribution of noise inevitably becomes asymmetric and non-Gaussian for very small concentration levels, which correspond to the concentration range explored to estimate the LOD.

In an attempt to deal with measurements in which the noise is neither normally distributed nor of homogeneous variance, Fraga et al. introduced a methodology to estimate the LOD of chemical systems based on the repetition of measurements with and without the chemical of interest in a test sample [9]. Because the presence or absence of the chemical is known by the practitioner, the authors were able to estimate the Type I and Type II error probabilities from the predictions of the chemical system. Then, to estimate the LOD, the concentration of the chemical of interest was increased gradually and the error probabilities were evaluated for each concentration level. Finally, the LOD was set at the concentration that made the error probabilities lower than a defined threshold (10% for both types of error). The authors ultimately used their methodology to optimize the operation of the system by building a Receiver Operating Characteristic (ROC) curve as the relevant parameter was changed.

All methodologies based on the probability of Type I and Type II errors implicitly assume that the two states representing analyte/no-analyte exposure are presented with the same probability, i.e. the same a priori probability for both classes. However, in most applications one of the possible states is expected to be found more often than the other, thereby biasing the system towards one of the classes. In this paper we show that the amount of information that can be extracted about the sample from the analytical system depends on the prior probability of presence or absence of the analyte. We propose a new definition of LOD based on the amount of information that the chemical system can extract from the presented stimulus. This new approach clearly shows that the amount of information provided by the analytical method strongly depends on the prior probabilities. In sharp contrast to previous definitions, our methodology, which is based on information-theoretic tools [10], is sensitive to the probability of presenting the analyte and deals with noise of any kind.

The remainder of the paper is organized as follows. In Section 2 we review previous definitions of LOD. We then explore the amount of information that can be extracted by a chemical system (Section 3), followed by the proposed methodology to estimate the LOD (Section 4), two examples of LOD estimation (Section 5), and the conclusions of this work (Section 6).

2. Limits of Classic Definitions of LOD

The LOD is a fundamental figure of merit, the definition of which has changed over the years. An early definition of LOD adopted by the IUPAC in 1975 [4] stated that “the limit of detection, expressed as a concentration $C_L$ (or amount, $q_L$), is derived from the smallest measure, $X_L$, that can be detected with reasonable certainty for a given analytical procedure”, with the actual LOD given by the following equation:

$$X_L = \bar{X}_b + k\,\sigma_b \qquad (1)$$

where $\bar{X}_b$ represents the mean of the blank measures, $\sigma_b$ is their standard deviation, and k is a parameter set according to a defined confidence level [11]. Note that the LOD is expressed in units of concentration and that the distribution of the samples is estimated at the input space (concentration). However, since the systems are usually simplified to a linear input-output relationship, the standard deviation of the blank measurements is often estimated at the output space, and the LOD is converted into analyte concentration by utilizing a previously obtained calibration model:

$$\mathrm{LOD} = \frac{k\,\sigma_b^{y}}{b} \qquad (2)$$

where b is the slope of the linear calibration and $\sigma_b^{y}$ is the standard deviation calculated at the output space (sensor output).

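As an illustration of Eq. (2), the following minimal Python sketch converts the dispersion of blank readings at the sensor output into a concentration LOD. The function name `lod_classic` and the blank readings are hypothetical and are not taken from the paper; the sketch assumes a linear calibration with a known slope.

```python
import statistics


def lod_classic(blank_outputs, slope, k=3.0):
    """Classic LOD of Eq. (2): k * sigma_b^y / b.

    blank_outputs: repeated sensor readings of blank samples (output space).
    slope: slope b of the linear calibration (output units per concentration unit).
    k: multiplier set according to the desired confidence level.
    """
    if slope == 0:
        raise ValueError("calibration slope must be non-zero")
    # Sample standard deviation of the blank readings at the output space.
    sigma_y = statistics.stdev(blank_outputs)
    return k * sigma_y / abs(slope)


# Hypothetical blank readings, for illustration only.
blank_readings = [2.1, 1.9, 2.0, 2.2, 1.8]
print(lod_classic(blank_readings, slope=0.5, k=3.0))
```

The sample standard deviation is used here because the true variance is rarely known; whether to use it or a population estimate is a choice left to the practitioner.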
Based on Kaiser’s work, k usually adopts a value of 3, so the confidence level is set to 99.86%, assuming a one-sided normal distribution [12]. In 1995, the IUPAC adopted a new definition that considers the probabilities of both Type I and Type II errors [6]. In this approach, the methodology to calculate the value $X_L$ includes the standard deviation of the net concentration when the analyte is not present ($\sigma_b$) and when the analyte is present at the level of the LOD ($\sigma_{LOD}$). Assuming that the noise follows a normal distribution, $X_L$ can then be expressed as:

$$X_L = \bar{X}_b + z_{1-\alpha}\,\sigma_b + z_{1-\beta}\,\sigma_{LOD} \qquad (3)$$

where α and β define the thresholds for the Type I and Type II error probabilities, and $z_{1-\alpha}$ and $z_{1-\beta}$ are, respectively, the upper percentage points of the two noise distributions that limit the accepted probabilities. If the noise in the system is considered homoscedastic (i.e., with a constant variance) and Type I and Type II errors are limited to 5% (as recommended by IUPAC), Eq. (3) can be rewritten as [13]:

$$X_L = \bar{X}_b + 2\,z_{0.95}\,\sigma_b = \bar{X}_b + 3.3\,\sigma_b \qquad (4)$$

where the factor 3.3 is the common value to estimate the LOD when the noise is considered homoscedastic and its variance is known. If the variance of the noise is unknown, which is the common scenario, the z-values must be replaced by the equivalent t-values of Student's t-distribution [14]. Figure 1 shows a visual comparison between the two classic LOD definitions outlined above². Despite the efforts made by the IUPAC to standardize the more rigorous definition adopted in 1995, many investigations still present LOD estimates based on the definition that only considers false positive errors. The limitations of the frequent simplifications of methodologies to estimate the LOD based on the probabilities of the errors have been analyzed previously [7, 15]. They include the assumptions that errors in the blank signal are normally distributed, systematic errors are negligible, errors occur only in the y-direction, the y-intercept is not significantly different from the measured value, and a good calibration function is obtained. Additionally, they do not take into account the a priori probabilities of measuring blank samples or the chemical of interest.

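The numerical factors quoted in this section can be checked with the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `lod_factor` is hypothetical, and the last quantity assumes that the analyte signal is centred exactly on the decision threshold, which is the situation behind the 50% Type II error probability of definitions that only control false positives.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# One-sided confidence level associated with k = 3.
confidence_k3 = std_normal.cdf(3.0)


def lod_factor(alpha, beta):
    """Multiplier of sigma_b in Eq. (3) when sigma_LOD = sigma_b (homoscedastic noise)."""
    return std_normal.inv_cdf(1.0 - alpha) + std_normal.inv_cdf(1.0 - beta)


# alpha = beta = 5% gives 2 * z_0.95, the factor rounded to 3.3 in Eq. (4).
factor_5pct = lod_factor(0.05, 0.05)

# Assumption: analyte mean sits exactly at the decision threshold, so half of
# the symmetric noise distribution falls below it (false negatives).
beta_at_threshold = std_normal.cdf(0.0)

print(round(confidence_k3, 5), round(factor_5pct, 3), beta_at_threshold)
```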
3. Information Theory Applied to Chemical Sensing

Over six decades ago, Shannon developed the discipline of Information Theory (IT), a mathematical model of communication that quantifies the efficiency of data transmission over noisy channels by measuring the information content at the source and at the receiver [10]. Information Theory is today a sound and complete framework for parameter estimation and for learning machines in general [16]. In measurement science in general, and in chemical sensing in particular, Information Theory techniques represent basic tools for algorithm analysis, parameter evaluation and optimization, molecular visualization, feature selection, and inference, as attested in numerous works [17-27].

² The reader is referred to very complete reviews [13-17] for a more detailed discussion of these definitions.

The efficiency of a communication system is evaluated by comparing the a posteriori probability (i.e., the probability of decoding the original message after its reception) and the a priori probability (i.e., the probability of guessing the original message without any additional information). Hence, it is possible to establish an analogy between a communication system for data transmission and an analytical system. The analyte presence/absence can be seen as the source of information, the analytical system is equivalent to the transmission channel, and the output of the analytical system (sensor reading) represents the received message. By virtue of this analogy, the efficiency of an analytical system can be evaluated by estimating the a posteriori probability (the probability assigned to each state of the source knowing the output of the analytical system) and the a priori probability (the probability assigned to each state with no other information available). The source of information can be a continuous quantity if the goal is the estimation of the analyte concentration, or a discrete variable if the purpose of the system is the determination of presence/absence of the analyte.

This study focuses on determining whether the target compound is present in a test sample. Hence, the source of information is a random binary variable (analyte present/absent). The output of the analytical system, after the definition of a proper threshold, is also a random binary variable since it predicts the presence or absence of the target compound. Ideally, both variables should present the same state but, as in data transmission over a noisy channel, there may be Type I (false positive) and Type II (false negative) errors. Figure 2 shows a schematic representation of the analytical system with the error probabilities. This analogy corresponds to a binary communication channel, where the emitter transmits a bit representing the presence/absence of the analyte, and the receiver acquires a bit. The probability of flipping the bit during transmission, which is equivalent to a wrong prediction by the analytical system, determines the efficiency of the analytical system.

Information Theory is based on probability theory and statistics. The two most important measures of information in Shannon’s mathematical theory of communication are entropy (S), the information contained in a random variable, and Mutual Information (MI), the amount of information that a second random variable Y yields about the random variable of interest X.

First, the entropy is the amount of self-contained information in a process and describes the level of uncertainty (or disorder) of a system. The entropy of a Discrete Memoryless Source (DMS) depends on the probability of presenting the N possible different states (or symbols) $x_i$:

$$S = -\sum_{i=1}^{N} p(x_i)\,\log_2 p(x_i) \qquad (5)$$

where S is expressed in bits and represents the minimum number of bits needed by a receiver to reconstruct the original message, and $p(x_i)$ is the probability of finding the state $x_i$. For a DMS with two different states (presence or absence of an analyte, see Fig. 2) the entropy can be simply expressed as a function of $p(x_1)$, the probability of finding the state $x_1$; thus

$$S = -p(x_1)\,\log_2 p(x_1) - [1 - p(x_1)]\,\log_2 [1 - p(x_1)] \qquad (6)$$

where $p(x_2) = 1 - p(x_1)$. For the extreme case in which $p(x_1) = 1$, the system always presents the state $x_1$; hence there is no uncertainty and S = 0. Figure 3 illustrates the entropy of a two-state DMS, which is maximized when both states are equally probable, i.e. $p(x_1) = p(x_2) = 0.5$, providing a maximum entropy of 1 bit.

Second, the Mutual Information is a measure of the information about one random variable contained in another random variable. Hence, the MI quantifies the amount of information on the state of the variable X contained in the known state of another random variable Y:

$$MI = \sum_{i,j} p(i,j)\,\log_2 \frac{p(i,j)}{p_x(i)\,p_y(j)} \qquad (7)$$

where $p_x(i)$ and $p_y(j)$ are the marginal probability distribution functions of variables X and Y and $p(i,j)$ is the joint probability distribution function. In our analogy, X is the absence/presence of the analyte, while Y is the thresholded binary output of the analytical instrument. Our approach is based on the evaluation of the MI given by the observation of Y regarding X. Accordingly, if two random variables are statistically independent, the known state of the first variable does not bring any information on the unknown state of the second variable and MI = 0. Conversely, if both variables are coincident, the known state of the first variable makes the state of the second variable perfectly ascertainable, and the MI equals S.

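The two measures defined in Eqs. (5) and (7) can be sketched in a few lines of Python. The function names and the example joint distributions below are illustrative choices, not taken from the paper; terms with zero probability are skipped, following the usual convention that they contribute nothing to the sums.

```python
from math import log2


def entropy(probs):
    """Eq. (5): S = -sum_i p(x_i) log2 p(x_i), in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)


def mutual_information(joint):
    """Eq. (7) for a joint table where joint[i][j] = p(X = i, Y = j)."""
    px = [sum(row) for row in joint]        # marginal of X
    py = [sum(col) for col in zip(*joint)]  # marginal of Y
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pij in enumerate(row):
            if pij > 0:
                mi += pij * log2(pij / (px[i] * py[j]))
    return mi


# Two limiting cases discussed in the text (hypothetical joint tables).
independent = [[0.25, 0.25], [0.25, 0.25]]  # Y says nothing about X
coincident = [[0.5, 0.0], [0.0, 0.5]]       # Y always equals X

s_equiprobable = entropy([0.5, 0.5])
mi_independent = mutual_information(independent)
mi_coincident = mutual_information(coincident)
```

In the coincident case the MI equals the source entropy, and in the independent case it vanishes, matching the two limits described above.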
Because Eq. 7 leads to the restriction 0 ≤ MI ≤ S, the maximum value of MI of an analytical system working to discriminate the presence/absence of a compound is 1 bit ($S_{max}$ = 1 bit, see Fig. 3), which corresponds to a system where both states (presence/absence of analyte) are equally probable with no Type I or Type II errors. In general, however, the MI depends on the a priori probability of the states and on the probabilities of Type I and Type II errors. Figure 4 shows the MI between the source (presence/absence of analyte) and the analytical prediction of presence/absence of analyte for different probabilities of presenting a blank sample in the source and different Type I and Type II error probabilities³. When the presence and absence of analyte are equally probable in the source ($p(x_1) = p_{blank} = 0.5$), the maximum value of MI is 1 bit and the obtained map is symmetric. However, when the probability of presenting a blank sample increases, the maximum value of MI decreases and the Type I error becomes more significant because the system is biased towards presenting blank samples, which are the input needed to obtain a false positive reading from the system.

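The dependence of the MI on the prior and on the error probabilities can be explored numerically. The following sketch builds the joint distribution of a binary channel from three parameters and evaluates Eq. (7). The function name `binary_channel_mi` is hypothetical, and the convention that alpha is the probability of predicting analyte for a blank sample (Type I) and beta the probability of predicting blank for an analyte sample (Type II) is an assumption consistent with the text.

```python
from math import log2


def binary_channel_mi(p_blank, alpha, beta):
    """MI (bits) between the source X (blank/analyte) and the binary output Y.

    alpha: P(Y = analyte | X = blank), Type I error probability.
    beta:  P(Y = blank | X = analyte), Type II error probability.
    """
    p_analyte = 1.0 - p_blank
    joint = [
        [p_blank * (1.0 - alpha), p_blank * alpha],      # X = blank
        [p_analyte * beta, p_analyte * (1.0 - beta)],    # X = analyte
    ]
    px = [p_blank, p_analyte]
    py = [joint[0][0] + joint[1][0], joint[0][1] + joint[1][1]]
    return sum(
        pij * log2(pij / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, pij in enumerate(row)
        if pij > 0
    )


# Error-free system: the MI equals the source entropy, which drops as p_blank grows.
mi_balanced = binary_channel_mi(0.5, 0.0, 0.0)
mi_mostly_blank = binary_channel_mi(0.9, 0.0, 0.0)

# With p_blank = 0.9, a 10% Type I error costs more information than a 10% Type II error.
mi_type1 = binary_channel_mi(0.9, 0.1, 0.0)
mi_type2 = binary_channel_mi(0.9, 0.0, 0.1)
```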
4. Definition of LOD based on Mutual Information
Our methodology to estimate the LOD of analytical systems is based on the analogy with a binary channel, where the source of information is the random variable representing the absence/presence of an analyte (represented by 0/1) and the random variable representing the prediction made by the analytical system corresponds to the received message. The maximum MI between these two variables is 1 bit, which corresponds to an ideal analytical system with zero errors (Type I or Type II, i.e. α = β = 0, see Fig. 2) and a prior probability of analyte presence of 50%. Two different contributions impact the amount of information, MI, that can be extracted from the system given knowledge of the output Y. On the one hand, the prior probability of analyte presence limits the entropy of the system and thus the MI. On the other hand, the efficiency of the analytical system itself, which is given by the Type I and Type II error probabilities, degrades the predictions made by the system. Therefore, in order to estimate the LOD of an analytical system, it is necessary (i) to set the desired thresholds for the Type I and Type II error probabilities, and (ii) to estimate the prior probability of presenting blank/analyte samples.

³ We assigned the state ‘blank sample presented’ to $x_1$ and the state ‘analyte presented’ to $x_2$. Therefore, from now on, the probability of presenting a blank sample to an analytical system is $p(x_1) = p_{blank}$.

The thresholds for the Type I and Type II error probabilities can be set to any arbitrary value according to the needs and restrictions of the user. The typical IUPAC definition for the LOD suggests a probability of 5% for both Type I and Type II errors. In this work we assume that there is some a priori knowledge that allows the estimation of the probability of presenting the analyte or a blank sample. However, if such information is not available, the results from the IUPAC definition are reproduced by setting the probability of presenting the analyte to 50%. Therefore, before estimating the LOD, our methodology requires the parameters α, β, and $p_{blank}$ to be defined in order to determine the MI threshold, $MI_{th}$. Once $MI_{th}$ is set, the LOD estimate is given by the minimum analyte concentration that makes the MI between the source of information and the analytical system higher than $MI_{th}$. Figure 5 shows $MI_{th}$ for different values of the Type I and Type II errors and the probability of presenting blank samples. The figure also provides a guide to determine the $MI_{th}$ defined by the relevant parameters (α, β, and $p_{blank}$). For the convenience of the reader, the $MI_{th}$ values from Figure 5 for the most common scenarios are provided in Table 1 (Equation A.1 in the appendix gives $MI_{th}$ for arbitrary values of α, β, and $p_{blank}$).

In summary, as the concentration rises, the amount of information that the system provides about the input increases, and so does the MI between the binary input variable (gas present/gas absent) and the binary output variable (prediction of the sensor). An MI threshold is defined from the accepted tolerances in Type I and Type II errors and the probability of presenting blank samples (see Figure 5). Finally, the LOD is the lowest concentration level that makes the MI higher than the threshold.

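The procedure summarized above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: it assumes that the threshold is the MI of a binary channel evaluated at the tolerated error probabilities and the prior, written here as H(Y) - H(Y|X), and that empirical error rates are already available for each tested concentration. The names `h2`, `channel_mi`, `estimate_lod`, and the error rates in `measured` are hypothetical.

```python
from math import log2


def h2(p):
    """Binary entropy in bits; zero for a certain outcome."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


def channel_mi(p_blank, alpha, beta):
    """MI of the binary channel computed as H(Y) - H(Y|X), in bits."""
    p_analyte = 1.0 - p_blank
    p_positive = p_blank * alpha + p_analyte * (1.0 - beta)  # P(Y = analyte)
    return h2(p_positive) - p_blank * h2(alpha) - p_analyte * h2(beta)


def estimate_lod(error_rates, p_blank, alpha_max=0.05, beta_max=0.05):
    """Lowest concentration whose MI exceeds the threshold, or None.

    error_rates: dict mapping concentration -> (alpha_hat, beta_hat),
    the error probabilities estimated at that concentration.
    """
    mi_th = channel_mi(p_blank, alpha_max, beta_max)  # assumed form of the threshold
    for conc in sorted(error_rates):
        alpha_hat, beta_hat = error_rates[conc]
        if channel_mi(p_blank, alpha_hat, beta_hat) > mi_th:
            return conc
    return None


# Hypothetical error rates per concentration level, for illustration only.
measured = {1.0: (0.05, 0.40), 2.0: (0.05, 0.20), 3.0: (0.05, 0.08), 4.0: (0.05, 0.02)}
lod = estimate_lod(measured, p_blank=0.5)
```

With these made-up numbers the first concentration exceeding the threshold is the highest one tested; with real data the scan assumes that the MI grows with concentration, as described in the text.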
It is important to note that $MI_{th}$ depends on three parameters that need to be defined before estimating the LOD: the Type I and Type II error probabilities and the probability of presenting blank samples. The first two parameters depend exclusively on the accepted tolerances in the errors made by the system and can take any value agreed on by the practitioner, community, or standardization agencies. However, the estimation of the probability of presenting blank samples may not be straightforward. Although in most common scenarios such information is available and can be obtained either from previous experiments or from the literature, sometimes the practitioner faces difficulties due to a lack of information. If no information on $p_{blank}$ is available, one can simply assume $p_{blank}$ = 50% to reproduce the LOD values proposed by the IUPAC. A more accurate solution is to estimate $p_{blank}$ from the same set of measurements used to calculate the LOD. If the samples used to estimate the LOD are obtained from the same environment from which the system acquires samples in normal operation, one can expect that the $p_{blank}$ obtained from this subset of samples will be a good estimate of the actual $p_{blank}$. In this scenario, the practitioner should first collect all the samples to estimate $p_{blank}$ and then determine the $MI_{th}$ to estimate the LOD.

5. Examples of LOD estimation

In this section we validate our methodology with two different analytical systems. First, we estimate the LOD of a system utilizing synthetic data to show that the noise distribution of the measurement does affect the LOD estimate. Second, we estimate the LOD of an analytical system to detect benzene, a compound of special interest due to its carcinogenic properties and presence in industrial environments.

5.1 LOD estimation under different noise probability density functions (pdf)

In order to compare our definition of LOD with the methodologies that assume homoscedastic Gaussian noise and neglect the probability of presenting the analyte, we simulated a system with different noise distributions. Specifically, we built a linear model of a system with sensitivity S = 0.5 and offset = 2:

$$Y = S \cdot X + \mathrm{offset} \qquad (8)$$

We generated synthetic noise with different probability density functions. The resulting noise was added to the system output (Y). In particular, we studied four different distributions of noise⁴: a) Gaussian noise with mean 0 and standard deviation σ = 1, b) a uniform distribution in the interval (-1.73, +1.73), c) a discrete binary noise distribution where the values ±1 are equally probable, and d) Gaussian noise with mean 0 and standard deviation increasing linearly with the input signal according to σ = 1 + 0.1 X. The latter distribution reproduces a system whose variability increases with the input concentration, which is a common attribute of analytical systems [28]. Note that the four abovementioned distributions have been designed to show the same standard deviation at zero concentration. Therefore, the measured standard deviation of the blank samples (at the output space Y) is $s_n$ = 0.5 for all the noise distributions, and the estimated LOD coincides for the methodologies that only consider the dispersion of the blank samples. In particular, assuming k = 3.3, the estimated LOD is 3.3 in all the considered examples.

The methodology based on the MI estimation, however, is sensitive to the different noise distributions and to the probability of presenting blank samples. Figure 6 illustrates that MI increases differently for different noise distributions, which is why the MI-based methodology provides different estimates for the LOD. Figure 7 shows the LOD estimates for the different noise distributions and probabilities of blank samples ($p_{blank}$). From Figure 7 we can conclude that i) the LOD changes for different noise distributions if we want to obtain the same amount of information from the analytical system and ii) the LOD needs to be more restrictive (higher analyte concentration) when the probability of presenting blank samples increases. Although the effect of $p_{blank}$ only becomes significant when blank samples are expected to be measured most of the time, this is a scenario faced by many applications. Additionally, the variation of the LOD caused by the different values of $p_{blank}$ or Type I and Type II error probabilities is comparable to the parameter divergence of other methodologies and definitions. For example, a change in the parameter k from 3 to 3.3 represents a variation of the LOD estimate of the same order of magnitude as the variations shown by the different noise distributions and $p_{blank}$. Therefore, the methodology to estimate the LOD based on the MI can provide sound and consistent LOD estimates, can be adapted to any arbitrary value of Type I and Type II errors, and, in contrast to any methodology presented before, makes it possible to introduce a priori knowledge on the probability of presenting blank/analyte samples.

⁴ The noise distributions are referred to the input space (in units of concentration).

5.2 LOD of a chemosensory system
5.2.1 Data Collection
To illustrate the methodology proposed to estimate the LOD, we studied an analytical system to detect benzene, which has been identified as carcinogenic to humans and can be present in the atmosphere from natural sources such as oil seeps and wildfires, or can originate in industrial environments, gasoline filling stations, and automobile combustion engines [29]. In order to minimize the risks to individuals exposed to benzene, the Occupational Safety and Health Administration (OSHA) defined permissible exposure limits (PEL) according to the time-weighted average (TWA) and short-term exposure limits (STEL) in different industrial scenarios [30]. Therefore, the measurement of benzene levels is important for worker safety, but such measurements need to be performed reliably with instrumentation that has a clearly defined LOD.

For illustration purposes, we consider a detection system based on a metal oxide (MOX) gas sensor, the conductivity of which changes when the sensing layer is exposed to reducing/oxidizing volatiles. MOX sensors are a common choice due to their cost-effective design, native cross-reactivity (i.e., the wide range of volatiles that the sensors can detect), sensitivity, and ease of operation [31-33]. In particular, we utilized a TGS2610 MOX gas sensor (Figaro [34]) placed in a 60 ml Teflon/stainless steel air-tight gas chamber into which benzene could be injected at different concentrations. To ensure a sensor response as reproducible as possible, the sensor was pre-heated for several days before starting the set of measurements while flowing 200 ml/min of clean air through the test chamber. Briefly, the vapor delivery system was composed of two fluidic branches that met at the injection point before bringing the resulting gas mixture to the test chamber. On the one hand, the solvent branch included a high-pressure cylinder containing the carrier gas (medical-grade dry air supplied by Airgas [35]) connected in series to a mass flow controller (supplied by Bronkhorst High-Tech B.V. [36]) with a maximum flow rate of 200 ml/min. On the other hand, the solute branch was based on a high-pressure calibrated cylinder of benzene at 500 nmol/mol in air (provided and certified by Airgas), the flow of which was regulated by a mass flow controller (Bronkhorst High-Tech B.V.) with a maximum flow rate of 100 ml/min. A computer platform equipped with a National Instruments acquisition board (PCI-6014) and LabView software (ver. 6) was adapted to command the full set of experiments, which required control over several user-defined parameters. First, the sensor's operating temperature was controlled by the voltage applied to the built-in heating element of the sensor, which was kept constant at 5 V during the whole set of experiments. Second, the concentration of the gas sample (benzene) was controlled by the flow of the two fluidic branches in such a way that the total flow was kept constant at 200 ml/min. Finally, the resistance of the sensor was acquired at a sampling frequency of 100 Hz and stored in a computer for further processing.
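The dilution arithmetic implied by the two-branch delivery system can be sketched as follows. This is a minimal illustration, not the authors' control software: it assumes ideal mixing, so that the delivered concentration equals the cylinder concentration (500 nmol/mol) scaled by the fraction of the 200 ml/min total flow drawn from the benzene branch. The function name `branch_flows` and its parameters are hypothetical.

```python
def branch_flows(target_nmol_mol, cylinder_nmol_mol=500.0, total_flow_ml_min=200.0):
    """Return (carrier_flow, benzene_flow) in ml/min for a target concentration.

    Assumption: ideal mixing, so the delivered concentration is the cylinder
    concentration scaled by the fraction of the total flow taken from it.
    """
    if not 0.0 <= target_nmol_mol <= cylinder_nmol_mol:
        raise ValueError("target must lie between 0 and the cylinder concentration")
    benzene_flow = total_flow_ml_min * target_nmol_mol / cylinder_nmol_mol
    carrier_flow = total_flow_ml_min - benzene_flow
    return carrier_flow, benzene_flow


# Benzene concentration levels (nmol/mol) used in the experiments.
LEVELS = [12.5, 18.75, 25.0, 31.25, 37.5, 43.75, 50.0, 56.25]

for level in LEVELS:
    carrier, benzene = branch_flows(level)
    print(f"{level:6.2f} nmol/mol -> carrier {carrier:6.1f} ml/min, benzene {benzene:5.1f} ml/min")
```

Under this assumption, the lowest level (12.5 nmol/mol) requires 5 ml/min from the benzene branch and the highest (56.25 nmol/mol) requires 22.5 ml/min, both within the 100 ml/min range of the solute-branch controller.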
The sensor was exposed to eight different levels of benzene concentration (12.5, 18.75, 25, 31.25, 37.5, 43.75, 50, and 56.25 nmol/mol), each repeated 13 times in random order for a total of 104 measurements. The experimental procedure for each measurement consisted of three steps. First, clean air was circulated through the test chamber for 5 minutes to capture the signal baseline of the sensor. Second, the sensor was exposed for 5 minutes to the benzene mixture at a concentration level randomly selected from the list of concentrations. Finally, clean air was re-circulated for 5 minutes to purge the gas sample cell.
To consider only the steady-state portion of the sensor signal, we selected the samples of the time series just before the air conditions were changed. In particular, for each measurement, we selected the 4000 samples between 250 s and 290 s to capture the system response to a blank stimulus and the 4000 samples between 550 s and 590 s to capture the response to the benzene presentation. Therefore, in total, we have 52000 samples (4000 samples × 13 repetitions) for each of the concentration levels and 416000 samples acquired from blank responses (we extracted the blank response from each of the 104 measurements).
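The window selection and the resulting sample counts can be checked with a short sketch. It assumes each measurement is stored as a flat list of resistance readings sampled at 100 Hz, starting at t = 0 s; the helper `window` and the placeholder recording are hypothetical and carry no real data.

```python
FS_HZ = 100  # sampling frequency stated in the text


def window(signal, start_s, stop_s, fs_hz=FS_HZ):
    """Return the samples recorded between start_s (inclusive) and stop_s (exclusive)."""
    return signal[int(start_s * fs_hz):int(stop_s * fs_hz)]


# Placeholder for one 15-minute measurement (3 steps of 5 minutes each).
recording = [0.0] * (15 * 60 * FS_HZ)

blank_part = window(recording, 250, 290)    # steady state under clean air
analyte_part = window(recording, 550, 590)  # steady state under benzene

n_levels, n_repeats = 8, 13
samples_per_level = len(analyte_part) * n_repeats
blank_samples = len(blank_part) * n_levels * n_repeats

print(len(blank_part), len(analyte_part), samples_per_level, blank_samples)
```

Both windows end 10 s before the gas conditions change (at 300 s and 600 s), which keeps the selection inside the steady-state portion of each step.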
Usually, the baseline of the sensor is removed to improve the performance of the system. The baseline is estimated from the value of the transitory response just before the gas exposure [37]. Therefore, we subtracted the mean of the sensor response during the 40 seconds before the beginning of the gas exposure (see footnote 5).
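A minimal sketch of this baseline correction is given below. It assumes that the gas exposure starts 300 s into each recording (after the 5 minutes of clean air) and that the signal is a flat list sampled at 100 Hz; the function name `subtract_baseline` and the toy recording are illustrative assumptions.

```python
def subtract_baseline(signal, exposure_start_s=300.0, baseline_s=40.0, fs_hz=100):
    """Subtract the mean of the last `baseline_s` seconds before the exposure starts."""
    stop = int(exposure_start_s * fs_hz)
    start = stop - int(baseline_s * fs_hz)
    if start < 0 or stop > len(signal):
        raise ValueError("baseline window falls outside the recording")
    reference = signal[start:stop]
    baseline = sum(reference) / len(reference)
    return [value - baseline for value in signal]


# Toy recording: flat at 10.0 under clean air, 12.0 during the exposure.
toy = [10.0] * 30000 + [12.0] * 30000
corrected = subtract_baseline(toy)
print(corrected[0], corrected[-1])
```

Because the baseline is re-estimated for every measurement, slow drifts between measurements are removed while the step caused by the analyte is preserved.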
5.2.2 LOD Estimation
In order to estimate the LOD, it is necessary to define the accepted limits for the Type I and Type II errors and to estimate the probability of presenting blank samples. For the following example, we set the limits for the Type I and Type II errors to 5%, as suggested by the IUPAC. However, to better illustrate our methodology, we estimated the LOD for two cases: i) assuming pblank = 50%, which corresponds to the scenario with no a priori knowledge, and ii) assuming pblank = 90%. Therefore, the MI thresholds for the LOD estimation are 0.7136 and 0.2978, respectively (see Table 1). Finally, it is necessary to compute the MI between the information source and the sensor prediction for different benzene concentrations. The LOD is defined as the minimum benzene concentration level that makes the MI larger than the corresponding threshold value.
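The MI between the information source and the sensor prediction can be estimated from paired labels with a plug-in estimator, sketched below. The text does not specify how the sensor output is turned into a binary prediction, so the sketch takes the predictions as given; the function name `empirical_mi` and the toy label lists are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def empirical_mi(source, predicted):
    """Plug-in estimate (in bits) of the MI between two equally long label sequences."""
    n = len(source)
    joint = Counter(zip(source, predicted))
    p_source = Counter(source)
    p_predicted = Counter(predicted)
    mi = 0.0
    for (s, y), count in joint.items():
        # p(s, y) * log2( p(s, y) / (p(s) * p(y)) ), written with raw counts
        mi += (count / n) * log2(count * n / (p_source[s] * p_predicted[y]))
    return mi


# Toy balanced dataset (0 = blank, 1 = analyte) with 5% errors of each type.
source = [0] * 100 + [1] * 100
predicted = [0] * 95 + [1] * 5 + [1] * 95 + [0] * 5
print(round(empirical_mi(source, predicted), 4))
```

For this toy dataset the estimate matches the 0.7136 threshold quoted for α = β = 5% and pblank = 50%, which is the value the MI curve has to exceed at the LOD.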
To estimate the LOD of a system given pblank, it is necessary to create a dataset with the same ratio of blank measurements to analyte exposures as pblank. Therefore, to estimate the LOD for pblank = 50%, we built a dataset of sensor predictions with a balanced number of blank/analyte exposures, whereas the dataset to estimate the LOD for pblank = 90% was proportioned accordingly. The MI between the source of information (analyte presented/absent) and the sensor prediction increases for higher levels of benzene concentration (see Figure 8). The MI threshold to define the LOD is set to 0.7136 (pblank = 50%) and 0.2978 (pblank = 90%).

According to the definition proposed in this work, the LOD is given by the smallest input concentration that makes the MI higher than the corresponding threshold. Therefore, we fitted a 4th-degree polynomial function to the data in Fig. 8 for each pblank to determine the concentration level at which the MI reaches the corresponding threshold. The obtained LOD estimations are 22.4 nmol/mol and 25.1 nmol/mol for pblank = 50% and pblank = 90%, respectively. We also estimated the LOD with the definition proposed by the IUPAC. The standard deviation of the blank measures (at the output space) is 7.8 Ω/Ω. In order to convert the LOD to the concentration space, we utilized the Clifford-Tuma model, which relates the resistance of a MOX sensor exposed to pure air to its resistance when exposed to different gas concentrations [38, 39]. The estimated LOD value based on the standard deviation (k=3.3) of the blank samples would be 21.1 nmol/mol, which is an optimistic value (especially for low probabilities of presenting the analyte) compared to the LOD estimations based on Information Theory. From Figure 8 we can conclude that the LOD needs to be shifted towards higher concentration levels for different prior probabilities if the thresholds for the Type I and Type II error probabilities, α and β, are set to be constant.

Footnote 5: The collection of the total set of experiments took about 32 hours, and the measurements were acquired during day and night. Although uncontrolled variables such as room temperature and pressure can be considered constant during the length of a single experiment (15 minutes), they affect the baseline of the sensor during the whole duration of the experiment set.
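The interpolation step in Section 5.2.2, fitting a 4th-degree polynomial to the MI values and locating the concentration at which it reaches the threshold, can be sketched as follows. This is an assumption-laden illustration rather than the authors' code: the fit uses plain normal equations, the crossing is found by bisection (which presumes a single crossing inside the measured range), and the MI curve below is a purely synthetic stand-in, not the data of Figure 8.

```python
from math import sqrt


def fit_polynomial(xs, ys, degree):
    """Least-squares fit; returns coefficients c[0] + c[1]*x + ... + c[degree]*x**degree."""
    n = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    for col in range(n):  # Gaussian elimination with partial pivoting
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    coeffs = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate the polynomial with Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def lod_from_curve(concentrations, mi_values, mi_threshold, degree=4, tol=1e-9):
    """Concentration at which the fitted MI curve crosses mi_threshold."""
    scale = max(concentrations)  # rescale to (0, 1] for better conditioning
    us = [c / scale for c in concentrations]
    coeffs = fit_polynomial(us, mi_values, degree)
    lo, hi = min(us), max(us)
    if evaluate(coeffs, lo) > mi_threshold or evaluate(coeffs, hi) < mi_threshold:
        raise ValueError("threshold is not bracketed by the measured range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if evaluate(coeffs, mid) < mi_threshold:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) * scale


LEVELS = [12.5, 18.75, 25.0, 31.25, 37.5, 43.75, 50.0, 56.25]
# Synthetic, monotonically increasing MI curve used ONLY to exercise the code.
synthetic_mi = [1.0 - (1.0 - c / 56.25) ** 2 for c in LEVELS]
lod = lod_from_curve(LEVELS, synthetic_mi, 0.7136)
print(f"LOD of the synthetic curve: {lod:.2f} nmol/mol")
```

The printed value has no experimental meaning; with real MI estimates in place of `synthetic_mi`, the same call would return the concentration at which the fitted curve reaches the chosen threshold.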
6. Conclusions
In this work, we showed that the amount of information on the absence/presence of a target analyte that can be extracted using a particular instrumental technique depends not only on the prior probabilities of presenting the analyte or a blank sample, but also on the noise distribution. In other words, in order to have the same information on the chemical source, it is necessary to set the limit of detection at different concentration levels if the noise distribution is different. This is an interesting result because analytical systems are usually compared based on the misconception that if the error probabilities (false positives and/or false negatives) of two systems are the same, the systems are equivalent regardless of the noise distribution present in the system. Based on the amount of information, we proposed a methodology to estimate the LOD that deals with noise of any kind and can provide more accurate comparisons between systems.
Using synthetic and experimental data with different distributions of noise, we showed that our methodology better captures the information contained in the analytical signals and is sensitive to the noise distribution and the prior probabilities, whereas classic methodologies always provide the same LOD estimate. We showed that a limit of detection estimated from the actual information that can be extracted from an analytical system is more informative than simply estimating the error probabilities, thereby providing better comparisons between systems. Therefore, we believe that our methodology can be of special interest to obtain accurate benchmarks of analytical systems, and it can be used to estimate the LOD of a wide variety of analytical systems. Further investigations on the definition and methodologies to estimate the LOD may include maximum likelihood ratio tests [40], which have already been used to evaluate the discrimination ability of analytical systems and can also deal with unbalanced numbers of samples [41-43]. The Neyman-Pearson lemma [44] shows that, given the Type I error rate, likelihood ratio testing provides the test with the lowest Type II error rate (maximum power) for simple-vs-simple hypotheses. Therefore, when performing a hypothesis test between H0: µ=µ0 versus H1: µ=µ1, the likelihood ratio test is the most powerful test. However, when designing a test to estimate the LOD of an analytical system, a new sample will have a certain probability of belonging simultaneously to class blank (H0) and class analyte (H1). Therefore, one does not have to reject H0 in favor of H1, since the goal is the optimization of a combined hypothesis of the form ηH0 + (1-η)H1. Additionally, one needs to build a model for class blank and class analyte from experimental data, and the parameters of the models will unavoidably have some uncertainty that may limit the power of the likelihood ratio test. Hence, adapting maximum likelihood ratio tests to estimate the LOD of analytical systems remains as further work.
Appendix:
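Mutual Information for arbitrary values of α (Type I error probability), β (Type II error probability), and p1 (probability of presenting a blank sample):

\[
\begin{aligned}
MI ={}& p_1(1-\alpha)\,\log_2\frac{p_1(1-\alpha)}{p_1\left[p_1(1-\alpha)+(1-p_1)\beta\right]}
 + (1-p_1)\beta\,\log_2\frac{(1-p_1)\beta}{(1-p_1)\left[p_1(1-\alpha)+(1-p_1)\beta\right]} \\
&+ p_1\alpha\,\log_2\frac{p_1\alpha}{p_1\left[p_1\alpha+(1-p_1)(1-\beta)\right]}
 + (1-p_1)(1-\beta)\,\log_2\frac{(1-p_1)(1-\beta)}{(1-p_1)\left[p_1\alpha+(1-p_1)(1-\beta)\right]}
\end{aligned}
\]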
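The appendix expression can be evaluated numerically as a check against Table 1. The sketch below writes the MI as a sum over the four joint outcomes of a binary channel, with α the probability that a blank is reported as analyte and β the probability that an analyte is reported as blank; the function name `mutual_information` and the dictionary layout are illustrative assumptions.

```python
from math import log2


def mutual_information(alpha, beta, p_blank):
    """MI (bits) between source (blank/analyte) and prediction for given error rates."""
    p_analyte = 1.0 - p_blank
    # Joint probabilities p(source, prediction).
    joint = {
        ("blank", "blank"): p_blank * (1.0 - alpha),
        ("blank", "analyte"): p_blank * alpha,
        ("analyte", "blank"): p_analyte * beta,
        ("analyte", "analyte"): p_analyte * (1.0 - beta),
    }
    p_source = {"blank": p_blank, "analyte": p_analyte}
    p_prediction = {
        "blank": joint[("blank", "blank")] + joint[("analyte", "blank")],
        "analyte": joint[("blank", "analyte")] + joint[("analyte", "analyte")],
    }
    mi = 0.0
    for (s, y), p in joint.items():
        if p > 0.0:  # terms with zero probability contribute nothing
            mi += p * log2(p / (p_source[s] * p_prediction[y]))
    return mi


for p_blank in (0.5, 0.9):
    print(p_blank, round(mutual_information(0.05, 0.05, p_blank), 4))
```

With α = β = 5%, this reproduces the thresholds 0.7136 (pblank = 0.5) and 0.2978 (pblank = 0.9) used in Section 5.2.2.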
Acknowledgements:
This work has been supported by the Jet Propulsion Laboratory under contract number 2013-1479652 and partially funded by the Spanish Ministerio de Economía y Competitividad under the project TEC2011-26143. Alex Vergara was financially supported by the NIST/NIH Research Associateship program administered by the National Research Council and partially financed by NATO under the Science for Peace & Security Program under grant no. SPS-984511. Santiago Marco is a member of the consolidated research group SGR2009-0753 of the Generalitat de Catalunya. The authors also thank Joanna Zytkowicz for proofreading and revising the manuscript. The suppliers and methodological tools identified herein are only specified for the experimental procedures presented in this manuscript. Their mention in no way implies recommendation or endorsement by the National Institute of Standards and Technology.
Vitae:
Jordi Fonollosa (Ph.D., 2009 – University of Barcelona) is a Postdoctoral Researcher at the BioCircuits Institute, UC San Diego. His research is focused on gas sensor array robustness and optimization, support vector machines, and Information Theory applied to chemical sensing. Other strong interests include biologically inspired algorithms, signal recovery systems, and infrared sensing technologies.
Alexander Vergara (Ph.D., 2006 – Universitat Rovira i Virgili) is an NRC research associate jointly working at the National Institute of Standards and Technology (NIST) and the National Institutes of Health (NIH) and a Visiting Research Scholar at the BioCircuits Institute, UC San Diego, where he was a postdoctoral researcher until summer 2012. His work mainly focuses on the use of dynamic methods and information-theoretic formalisms for the optimization of micro gas-sensory systems and on the building of autonomous vehicles that can localize odor sources through a process resembling biological olfactory processing. His areas of interest also include information theory, signal processing, pattern recognition, feature extraction, chemical sensor arrays, and machine olfaction.
Ramón Huerta (Ph.D., 1994 – Universidad Autónoma de Madrid) is a research scientist at the BioCircuits Institute, UC San Diego. Prior to his current appointment, he was associate professor at the Universidad Autónoma de Madrid (Spain). His areas of expertise include dynamic systems, artificial intelligence, and neuroscience. His work deals with the development of algorithms for the discrimination and quantification of complex multidimensional time series, model building to understand information processing in the brain, and chemical sensing and machine olfaction applications based on bio-inspired technology. Dr. Huerta's research has resulted in a publication record of over 90 articles in peer-reviewed journals at the intersection of computer science, physics, and biology.
Santiago Marco (Ph.D., 1993 – University of Barcelona) is Associate Professor at the University of Barcelona and head of the Signal and Information Processing for Sensor Systems Lab at the Institute for Bioengineering of Catalonia, Barcelona, Spain. His research concerns the development of signal/data processing algorithmic solutions for smart chemical sensing based on sensor arrays or microspectrometers, typically integrated using microsystem technologies. Dr. Marco's research has produced over 100 articles in peer-reviewed archival journals. More at http://www.ibecbarcelona.eu/artificial_olfaction
References:
[1] K.A. James, J.R. Meliker, J.A. Marshall, J.E. Hokanson, G.O. Zerbe, T.E. Byers, Journal of Exposure Science and Environmental Epidemiology (2013).
[2] Analytical Feasibility Support Document for the Second Six-Year Review of Existing National Primary Drinking Water Regulations, US Environmental Protection Agency, 2009.
[3] L.A. Currie, Analytica Chimica Acta, 391 (1999).
[4] Pure and Applied Chemistry, 45 (1976).
[5] E.F. McFarren, R.J. Lishka, J.H. Parker, Analytical Chemistry, 42 (1970) 358.
[6] L.A. Currie, Analytica Chimica Acta, 391 (1999).
[7] E. Desimoni, B. Brunetti, Analytica Chimica Acta, 655 (2009).
[8] E. Voigtman, K.T. Abraham, Spectrochimica Acta Part B-Atomic Spectroscopy, 66 (2011).
[9] C.G. Fraga, A.M. Melville, B.W. Wright, Analyst, 132 (2007) 230.
[10] C.E. Shannon, Bell System Technical Journal, 27 (1948) 379.
[11] H.M.N.H. Irving, H. Freiser, T.S. West, IUPAC, Compendium of Analytical Nomenclature - The orange book. Definitive rules, Oxford: Pergamon, 1978.
[12] H. Kaiser, Analytical Chemistry, 42 (1970) 24A.
[13] R. Boque, A. Maroto, J. Riu, F.X. Rius, Grasas Y Aceites, 53 (2002) 128.
[14] M.C. Ortiz, L.A. Sarabia, M.S. Sanchez, Analytica Chimica Acta, 674 (2010).
[15] E. Desimoni, B. Brunetti, R. Cattaneo, Annali Di Chimica, 94 (2004).
[16] J. Principe, Information Theoretic Learning, Springer, 2010.
[17] F. Dupuis, A. Dijkstra, Analytical Chemistry, 47 (1975) 379.
[18] K. Eckschlager, Information theory in analytical chemistry, John Wiley & Sons, 1994.
[19] V. David, A. Medvedovici, Journal of Chemical Information and Computer Sciences, 40 (2000) 976.
[20] T.K. Alkasab, J. White, J.S. Kauer, Chemical Senses, 27 (2002) 261.
[21] T. Pearce, A. Sanchez-Montañes, Handbook of Artificial Olfaction Machines, Wiley-VCH, Weinheim, 2002.
[22] P.P. Vazquez, M. Feixas, M. Sbert, A. Llobet, Computers & Graphics-UK, 30 (2006) 98.
[23] A. Vergara, M.K. Muezzinoglu, N. Rulkov, R. Huerta, Sensors and Actuators B-Chemical, 148 (2010) 298.
[24] M. Trincavelli, A. Loutfi, IEEE International Conference on Robotics and Automation (ICRA), Anchorage, AK, 2010, p. 2852.
[25] J. Fonollosa, A. Gutierrez-Galvez, S. Marco, PLoS ONE, 7 (2012).
[26] A. Vergara, E. Llobet, Talanta, 88 (2012) 95.
[27] J. Fonollosa, L. Fernández, R. Huerta, A. Gutiérrez-Gálvez, S. Marco, Sensors and Actuators B: Chemical, 187 (2013) 331.
[28] D.T. O'Neill, E.A. Rochette, P.J. Ramsey, Analytical Chemistry, 74 (2002) 5907.
[29] National Toxicology Program, Report on carcinogens: carcinogen profiles, U.S. Dept. of Health and Human Services, Public Health Service, National Toxicology Program, 12 (2011) iii.
[30] Toxic and Hazardous Substances: Benzene. Occupational Safety and Health Administration.
[31] Y.K. Min, H.L. Tuller, S. Palzer, J. Wollenstein, H. Bottner, Sensors and Actuators B-Chemical, 93 (2003) 435.
[32] N. Barsan, D. Koziej, U. Weimar, Sensors and Actuators B-Chemical, 121 (2007) 18.
[33] G.F. Fine, L.M. Cavanagh, A. Afonja, R. Binions, Sensors, 10 (2010) 5469.
[34] Figaro USA, Inc.
[35] Airgas, Inc.
[36] Bronkhorst High-Tech B.V.
[37] J.W. Gardner, P.N. Bartlett, Oxford University Press, New York, 1999.
[38] P.K. Clifford, D.T. Tuma, Sensors and Actuators, 3 (1983) 233.
[39] P.K. Clifford, D.T. Tuma, Sensors and Actuators, 3 (1983) 255.
[40] A.P. Dempster, Statistics and Computing, 7 (1997) 247.
[41] A. Vexler, A. Liu, E. Eliseeva, E.F. Schisterman, Biometrics, 64 (2008) 895.
[42] R. Thiebaut, H. Jacqmin-Gadda, Computer Methods and Programs in Biomedicine, 74 (2004) 255.
[43] H.S. Lynn, Statistics in Medicine, 20 (2001) 33.
[44] J. Neyman, E.S. Pearson, On the problem of the most efficient tests of statistical hypotheses, Springer, 1992.
Tables:
Table 1: Mutual Information between the source of information and the prediction of the system for different values of the probability of presenting blank samples and of the Type I and Type II error probabilities. This table sets the MI threshold that defines the LOD for different configurations of the system. The case MI (α = 0%; β = 0%) corresponds to the entropy of the system.

                          pblank=0.5   pblank=0.67   pblank=0.75   pblank=0.9   pblank=0.98
MI (α = 0%; β = 0%)       1            0.9149        0.8113        0.4690       0.1414
MI (α = 5%; β = 5%)       0.7136       0.6450        0.5622        0.2978       0.0720
MI (α = 5%; β = 10%)      0.6205       0.5688        0.4984        0.2663       0.0646
MI (α = 10%; β = 5%)      0.6205       0.5497        0.4727        0.2402       0.0553
MI (α = 10%; β = 10%)     0.5310       0.477         0.4123        0.2111       0.0488
Figure captions
Figure 1: Representation of the classic definitions of LOD for a linear homoscedastic system. The first definition has no control over the probability of Type I errors (left). The second LOD definition considers the probability of Type I and Type II errors (right).

Figure 2: An analytical system shows an analogy to a communication channel. The presence (or absence) of an analyte represents the message in the source (information to be transmitted). The analytical system is equivalent to a noisy channel through which the message is transmitted. The readings of the sensory system determine whether the analyte is present (or not) at the source. The probabilities of Type I errors (false positive) and Type II errors (false negative) are given by α and β, respectively.

Figure 3: Entropy of a two-state discrete memoryless source as a function of the probability of finding the state x1. The entropy takes its maximum value S = 1 bit when both states (analyte present/analyte not present) are equally probable (p(x1) = p(x2) = 0.5).

Figure 4: Mutual Information (in bits) between the source (analyte presented/absent) and the prediction of the analytical system for different values of the probability of presenting blank samples. Type I error becomes more significant when the probability of presenting blank samples increases.
Figure 5: The entropy of an analytical system aimed at the discrimination between absence/presence of an analyte is limited to 1 bit. The MI between the source of information (analyte presented or not) and the prediction of the system depends on the probability of presenting a blank sample (pblank) and the defined thresholds for Type I and Type II errors (α and β, respectively). The LOD can be defined by the amount of information that can be extracted from the analytical system. Once the parameters α, β, and pblank are set, this figure provides a guide to determine MIth. Then, the LOD is the lowest concentration level that makes the MI between the source and the output higher than MIth.
Figure 6: MI across the concentration of X for a system with homoscedastic σ = 1 Gaussian noise (blue), uniform distribution (green), discrete binary noise (red), and Gaussian noise with standard deviation increasing linearly with the input signal according to σ = 1 + 0.1 X (black). The probability of presenting a blank sample (pblank) is 50%, so the threshold that defines the LOD is MIth = 0.7136 (see Table 1). The LODs obtained with our methodology for the four systems are 3.30, 3.12, 2.0, and 4.98, respectively. Our methodology to estimate the LOD is sensitive to the noise distribution, whereas the IUPAC methodology provides the same estimation for all the systems (3.3, assuming k=3.3) since the standard deviation of the blank samples is the same for all the simulated systems.

Figure 7: LOD for an analytical system with different noise distributions. The LOD is estimated in such a way that the amount of information provided by the system remains constant. In contrast to classical definitions, which would estimate LOD = 3.3 for all the cases, the methodology based on the MI is sensitive to the noise distribution and the probability of presenting blank samples. Gaussian noise (dark blue), uniform distribution (light blue), discrete binary noise (yellow), and Gaussian noise with increasing standard deviation (dark red).
Figure 8: Mutual Information between the source of information (presence/absence of analyte) and the sensor prediction. We estimated the LOD for two different a priori probabilities, pblank = 50% (top) and pblank = 90% (bottom). The red line shows the MI threshold, MIth, which determines the concentration for the LOD. MI increases for higher values of benzene concentration before it reaches the corresponding maximum value set by the entropy.
Highlights
We propose a definition of Limit of Detection (LOD) based on Information Theory.
Analytical systems are compared based on their ability to provide information.
The methodology to estimate the LOD deals with noise distributions of any kind.
Our methodology converges to the same LOD values as traditional methods.
We show different examples to estimate the LOD with our methodology.