Applied Optics, Vol. 24, No. 7, 1 April 1985
Method for determining particle size distributions by nonlinear inversion of backscattered radiation
Avishai Ben-David and Benjamin M. Herman
Results of a nonlinear iterative method for determination of the size distribution of spherical aerosols with known refractive indices from backscattered radiation are presented, along with a discussion of the nature of the nonlinear iterative method. The nonlinear method uses an initial guess which is built into the kernel function. In successive iterations, a correction vector is computed and applied in the next iteration; the process terminates when the correction vector is the unity vector. This method, when used for the backscattering kernel, which has large information content, can tolerate errors in the measurements up to 10%.
The authors are with University of Arizona, Institute of Atmospheric Physics, Tucson, Arizona 85721. Received 13 April 1984. 0003-6935/85/071037-06$02.00/0. © 1985 Optical Society of America.

I. Introduction

In this paper, inversions for the determination of particle size distributions $n(r)$ from measurements $g(\lambda)$ of the backscattered radiation at several wavelengths $\lambda$ are presented and discussed. Because this work is intended for use in laboratory experiments, the particles are assumed to be spherical with known refractive indices (polyester particles, which mainly scatter and absorb only slightly). This assumption enables us to use Mie theory as the kernel of the problem. The information content in the backscattered measurements is much greater than that generally obtained from other optical measurements, such as extinction, as shown by Capps et al.1 This allows constrained inversions to be successful even for measurement errors up to 10%, which is not possible for inversions performed with measurements of extinction. Since it is reasonable to expect that most particle size distributions will be relatively smooth, that is, they will not contain rapid and erratic fluctuations in number density as a function of radius, a second derivative smoothing matrix is employed along with an iterative scheme that uses a first-guess distribution which is repeatedly corrected with each iteration.

II. Method

A detailed discussion of the method of inversion is given by Herman et al.,2 and only a brief summary will be given here. The governing equation for the backscattered radiation (neglecting attenuation of the medium) is

$$I_\lambda(180^\circ) = F_{0\lambda} C_\lambda \beta_\lambda, \tag{1}$$

where $I_\lambda(180^\circ)$ is the measured backscattered radiance at wavelength $\lambda$, $F_{0\lambda}$ is the transmitted pulse irradiance, $C_\lambda$ is the instrument constant, and $\beta_\lambda$ is the unit volume differential backscattering cross section. For our purpose (laboratory measurements), we can neglect multiple scattering because we deal with a very small field of view and the optical depth has been chosen to keep multiple scatter negligible. These factors result in a very small probability for multiple scattering, and the backscattered signal consists mostly of single scattering events. $\beta_\lambda$ can be written as

$$\beta_\lambda = \int_{r_1}^{r_2} \sigma_b(r,m,\lambda)\,\frac{dn}{dr}\,dr, \tag{2}$$

where $\sigma_b(r,m,\lambda)$ is the backscatter cross section (calculated from Mie theory) for a particle of radius $r$, index of refraction $m$, and wavelength $\lambda$; and $r_1$ and $r_2$ are the radii limits of the unknown size distribution $dn/dr$. Equation (2) cannot be integrated analytically and will be written as

$$\beta_\lambda = \sum_p \sigma_b(\bar{r}_p,m,\lambda)\,\frac{\Delta n_p}{\Delta r_p}\,\Delta r_p, \tag{3}$$

where $\bar{r}_p$ is the midpoint radius over the interval $\Delta r_p$. In matrix notation, the problem can be addressed as

$$g = Af + \epsilon, \tag{4}$$
where $g$ is the measurement vector whose components are the backscattered radiation at $I$ discrete wavelengths, and $A$ is the $I \times p$ matrix representation of the kernel function ($\sigma_b \Delta r_p$), which may contain weighting factors depending on the quadrature formula used for converting from an integral to a finite sum. $\epsilon$ is a column vector consisting of the measurement errors and all other error sources, and $f$ is the unknown column vector $\Delta n_p/\Delta r_p$ at $p$ discrete points. Since a direct inversion of Eq. (4) to solve for $f$ will result in instability due to the normally nearly singular nature of the matrix $A$, a constrained solution is employed. With the constraint, the solution to Eq. (4) becomes, as shown by Twomey,3,4

$$f = (A^T A + \gamma H)^{-1} A^T g, \tag{5}$$

where $H$ is the constraint matrix, $\gamma$ is the non-negative Lagrange multiplier which determines the strength of the constraint, and the superscript $T$ denotes the matrix transpose. For this problem, the $H$ matrix is taken to be that for second derivative smoothing of the solution.

As a result of the erratic fine structure of the kernel $\sigma_b(r,m,\lambda)$ (Fig. 1), a large number of intervals $p$ must be used for a good approximation for $\beta_\lambda$ from Eq. (3), or equivalently, for an accurate representation of $f$ from Eq. (5). This approach will, however, force the dimensions of the $A^T A$ matrix to be extremely large (several hundreds), making it impractical to invert with a computer. An alternative approach is using a first guess for the size distribution function $dn^{(1)}/dr$ (superscript denotes the iteration number) to integrate over the fine intervals required. The radius range $(r_1, r_2)$ is divided into $q$ large intervals, each containing $p$ small intervals. As a result, Eq. (3) can be written as

$$\beta_\lambda = \sum_q f_q \sum_p \sigma_b(\bar{r}_p,m,\lambda)\,\frac{dn^{(1)}}{dr}\,\Delta r_p. \tag{6}$$

This operation will make the dimension of the $A$ matrix $I \times q$ ($I$ is the number of wavelengths, and $q$ is the number of the large intervals), which will be a much smaller number than before, by an order of magnitude of a few tens. The unknown in Eq. (6) is now the correction vector $f_q$, which serves to correct the first guess over each of the $q$ large intervals. The size distribution function is now $f_q\,dn^{(1)}/dr$ over each of the $q$ intervals.

Fig. 1. Backscatter cross section for λ = 0.26 μm as a function of particle radius, assuming a spherical particle with a real refractive index of 1.475 and an imaginary part of 0.001.

After the initial solution for $f^{(1)}$ is obtained, a second iteration is performed, where the improved guessed solution is now $dn^{(2)}/dr = f^{(1)}\,dn^{(1)}/dr$, and a new correction factor is solved for. This second guessed solution is better because it reflects additional information in the kernels. Solution points in the correction vector which are different from unity suggest that more particles should be added to or subtracted from the former size distribution function. Solution points in the correction vector which have the numerical value 1 suggest that either the number density of the size distribution function for these radii is correct, or there is no information in the kernels for these radii that can be used to change the size distribution function. To avoid a potential problem of inaccurate values in the size distribution function that cannot be changed due to lack of information in the kernels, it is advised to use several initial guesses. The process is repeated until the $f^{(n)}$ vector is a unity vector, yielding a final solution:

$$\frac{dn}{dr} = f^{(n)} \cdot f^{(n-1)} \cdot f^{(n-2)} \cdots f^{(1)}\,\frac{dn^{(1)}}{dr}. \tag{7}$$

Our experience with hundreds of inversions shows that the method always leads toward a convergence of the correction vector. The solution form represents a nonlinear method of inversion, a fact which makes it a very powerful scheme, especially when the problem requires a large dynamic range (e.g., in the case of inverting for a size distribution $n(r)$ which ranges over a few orders of magnitude) as shown by Twomey.5
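The iteration described by Eqs. (5)-(7) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the Gaussian bumps standing in for the Mie backscatter cross section, the grid sizes, the value of γ, the stopping tolerance, and all function names are hypothetical, and the instrument factors $F_{0\lambda} C_\lambda$ are set to 1.

```python
import numpy as np


def second_difference_constraint(q):
    """Return H = K^T K, with K the (q - 2) x q second-difference operator."""
    K = np.zeros((q - 2, q))
    for i in range(q - 2):
        K[i, i:i + 3] = (1.0, -2.0, 1.0)
    return K.T @ K


def coarse_kernel(sigma_b, dn_dr, dr, q):
    """Inner sum of Eq. (6): add up sigma_b * dn/dr * dr over the small
    intervals inside each of the q large intervals. Result is I x q."""
    n_wave, n_fine = sigma_b.shape
    weighted = sigma_b * dn_dr * dr
    return weighted.reshape(n_wave, q, n_fine // q).sum(axis=2)


def iterative_inversion(sigma_b, g, first_guess, dr, q, gamma,
                        tol=1e-3, max_iter=50):
    """Repeat the constrained solve of Eq. (5) for the correction vector f
    and fold f into the guess, as in Eq. (7), until f is close to unity."""
    H = second_difference_constraint(q)
    dn_dr = np.array(first_guess, dtype=float)
    p = dn_dr.size // q                      # small intervals per large one
    for iteration in range(1, max_iter + 1):
        A = coarse_kernel(sigma_b, dn_dr, dr, q)
        f = np.linalg.solve(A.T @ A + gamma * H, A.T @ g)
        dn_dr = dn_dr * np.repeat(f, p)      # apply correction per interval
        if np.max(np.abs(f - 1.0)) < tol:
            break
    return dn_dr, iteration


# Toy setup (all values hypothetical): 12 "wavelengths", 120 small intervals
# grouped into 6 large ones, and Gaussian bumps standing in for sigma_b.
edges = np.linspace(0.1, 6.0, 121)
r = 0.5 * (edges[:-1] + edges[1:])           # midpoint radii
dr = np.diff(edges)
centers = np.linspace(0.3, 5.8, 12)
sigma_b = np.exp(-0.5 * ((r[None, :] - centers[:, None]) / 0.5) ** 2)

truth = np.exp(-r / 2.0)                     # "correct" dn/dr
g = (sigma_b * truth * dr).sum(axis=1)       # error-free measurements
solution, n_iter = iterative_inversion(sigma_b, g, np.ones_like(r), dr,
                                       q=6, gamma=1e-4)
```

In this sketch, starting from the correct distribution gives a unity correction vector on the first pass, because the second-difference constraint does not penalize a constant vector. From any other first guess the fit to $g$ cannot get worse from one iteration to the next, since $f = 1$ is always an admissible correction.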
APPLIED OPTICS / Vol. 24, No. 7 / 1 April 1985
III. Discussion
The solution in Eq. (7) should not be overly sensitive to the choice of the first guess, which governs only the number of iterations required. This fact can be shown experimentally and can be explained (even if not completely) by the second law of large numbers (Ref. 6, p. 367), in which the integration over $\sigma_b$ is the equivalent of using many observation events; sensitivity to the first guess implies limited information content in the measurements.

The selection of $\gamma$ is a critical feature of this technique.7 The magnitude of $\gamma$ is used to increase the eigenvalues of $A^T A$ by increasing the diagonal of $A^T A$, since $H$ is nearly diagonal. Increasing the eigenvalues implies a greater degree of independence between the various rows and columns of the $A^T A$ matrix, which stabilizes the solution since the solution error is magnified by the inverse of the smallest eigenvalue.8 The constrained inversion is thus a trade-off between stabilizing the solution and diluting the information content of the kernel. Therefore, in each iteration the smallest $\gamma$ resulting in all components of the $f$ vector being positive with a maximum allowable oscillation was chosen for the next iteration. In this way, no oscillations to which the measurements lack sensitivity will be developed in the solution. In the final iteration, all the components of the $f$ vector are nearly unity, and a set of measurements $g(\lambda)$ is calculated from the derived solution to be compared with the original measurements. If the inversion is successful, the difference between the two will be of the order of magnitude of the errors [$\epsilon(\lambda)$] in the measurements.

Another problem is the determination of the size of the $A$ matrix, $I \times q$. Since the number of the measurements $I$ is given (generally determined by the experimental arrangement, resulting in a given but limited information content), the remaining question is how many unknowns to solve for. One limitation on the number of components of the $f$ vector is computer time, since $A^T A$ ($q \times q$) will be inverted many times. However, this is not the only limitation.
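The rule for choosing $\gamma$ described above (the smallest value for which all components of $f$ are positive and the oscillation stays within an allowed bound) can be sketched as a scan over candidate values. This is a hedged illustration only: the candidate list, the use of the largest second difference of $f$ as the oscillation measure, the identity "kernel", and the function names are assumptions, since the text does not specify them.

```python
import numpy as np


def second_difference_constraint(q):
    """Return H = K^T K, with K the (q - 2) x q second-difference operator."""
    K = np.zeros((q - 2, q))
    for i in range(q - 2):
        K[i, i:i + 3] = (1.0, -2.0, 1.0)
    return K.T @ K


def smallest_acceptable_gamma(A, g, H, candidates, max_oscillation=np.inf):
    """Scan the candidates in increasing order and return the first gamma
    (with its solution f) for which every component of f is positive and
    the largest second difference of f stays within max_oscillation.
    Requires at least three unknowns."""
    for gamma in sorted(candidates):
        f = np.linalg.solve(A.T @ A + gamma * H, A.T @ g)
        oscillation = np.max(np.abs(np.diff(f, n=2)))
        if np.all(f > 0.0) and oscillation <= max_oscillation:
            return gamma, f
    return None, None


# Toy data (hypothetical): identity "kernel" and a measurement vector whose
# unconstrained solution has one negative component.
A = np.eye(5)
g = np.array([1.0, 2.0, -0.5, 2.0, 1.0])
H = second_difference_constraint(5)
candidates = [0.0, 1e-3, 1e-1, 10.0, 1e3]
gamma, f = smallest_acceptable_gamma(A, g, H, candidates)
```

With $\gamma = 0$ the toy solution simply reproduces $g$, including its negative component, so the scan has to move to a larger candidate before the positivity condition is met.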
Fig. 2. Extinction cross section for λ = 0.26 μm as a function of particle radius, assuming a spherical particle with refractive index as in Fig. 1.

In simulated measurements from some kernel $\sigma_b$, the only source for error is error in the measurements (neglecting round-off error in the computer). In this case, the covariance of the solution vector is (can be derived from Ref. 6, Chap. 14)

$$\Gamma_f = S^{-1} A^T \sigma^2 A (S^T)^{-1}, \qquad S = (A^T A + \gamma H),$$

where $\Gamma_f$ is the covariance matrix in which the $m$th element down the diagonal is the variance of the $m$th element of the $f$ vector, and $\sigma^2$ is the variance in the measurements. The variance in $f$ is inversely proportional to the number of unknowns ($q$). This means that it is preferable to use as big a matrix $A$ as possible.
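The covariance expression above translates directly into code. The following sketch assumes uncorrelated measurement errors with a common variance $\sigma^2$ (as the scalar $\sigma^2$ in the formula suggests); the function name and the toy matrices are hypothetical.

```python
import numpy as np


def solution_covariance(A, H, gamma, sigma2):
    """Covariance of f for measurement errors of variance sigma2:
    Gamma_f = sigma2 * S^-1 A^T A (S^T)^-1 with S = A^T A + gamma * H."""
    S = A.T @ A + gamma * H
    S_inv = np.linalg.inv(S)
    return sigma2 * S_inv @ A.T @ A @ S_inv.T


# Toy check (hypothetical values): A = 2I, H = I, measurement variance 0.04.
A = 2.0 * np.eye(3)
H = np.eye(3)
cov_unconstrained = solution_covariance(A, H, gamma=0.0, sigma2=0.04)
cov_constrained = solution_covariance(A, H, gamma=4.0, sigma2=0.04)
std_constrained = np.sqrt(np.diag(cov_constrained))
```

The diagonal of the returned matrix holds the variances of the components of $f$. Note that this quantity only describes how measurement noise propagates into the solution; it says nothing about any bias introduced by the constraint itself.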
However, in a real problem, where the kernel may not accurately describe all the physics of the problem, this is no longer true. For example, the refractive index may be assumed constant with $\lambda$ but, in fact, may vary, resulting in a different shape to the kernel.9,10 Additionally, the particles may be assumed to be spheres, when in fact they are irregularly shaped, resulting generally in a decrease of the backscattered radiation. As a result, there is an error in the kernel itself, thus reducing the information content, permitting a smaller number of solution points.

Another way to see this problem is from the mathematical point of view. As the matrix $A^T A$ increases in dimension (i.e., large number of solution points), the additional eigenvalues of the matrix will become smaller, which is a result of the fact that the sum of the off-diagonal elements increases, meaning the matrix becomes more and more nearly singular. The off-diagonal elements of $A^T A$ are, in fact, the correlation between the solution vector for different particle radii (e.g., in temperature inversion, it is correlation between the temperature at different layers in the atmosphere). The matrix $A^T A$ is no better than a singular matrix if its rows are parallel to within the error in the measurements (within $\pm\epsilon$). The correlation aspect in the $A^T A$ matrix can relate to the Lagrange multiplier in the following way. Since the $H$ matrix is nearly diagonal, the Lagrange multiplier can be taken as increasing the diagonal elements of $A^T A$, which represent the autocorrelation of the terms, and reducing the cross correlation between the terms (the off-diagonal elements).

In general, we try to solve for more solution points (unknowns) than the limited number of measurements that we have. This means that we must somehow add equations for the additional required solution points. This is accomplished by the constraint matrix $H$ combined with the relative weight $\gamma$. Since the additional equations are based on physical knowledge of the problem or a priori information (smoothing and so on), a gross mistake in these assumptions will lead to a very bad set of equations, which will be as bad as error in the original measurements $g$. Results of the relations between the $H$ matrix, the value of $\gamma$, and the size distribution solution, regarding the correlation aspect, will be presented in a separate paper.11
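The idea that the constraint supplies the missing equations can be made concrete. When $H = K^T K$ for a second-difference operator $K$, the constrained solution of Eq. (5) is the ordinary least-squares solution of the measurement equations stacked on top of the weighted smoothness equations $\sqrt{\gamma}\,K f = 0$. The sketch below uses a small hypothetical system (three measurements, five unknowns); the matrices and the value of $\gamma$ are assumptions chosen only for illustration.

```python
import numpy as np

A = np.array([[1.0, 1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0, 1.0]])     # 3 measurements, 5 unknowns
g = np.array([2.0, 3.0, 2.0])
gamma = 0.5

# Second-difference operator K; the constraint matrix is H = K^T K.
K = np.zeros((3, 5))
for i in range(3):
    K[i, i:i + 3] = (1.0, -2.0, 1.0)
H = K.T @ K

# Constrained solution, Eq. (5).
f_constrained = np.linalg.solve(A.T @ A + gamma * H, A.T @ g)

# Same problem posed as least squares on the augmented set of equations:
# three measurement rows plus three smoothness rows.
stacked_matrix = np.vstack([A, np.sqrt(gamma) * K])
stacked_rhs = np.concatenate([g, np.zeros(3)])
f_stacked = np.linalg.lstsq(stacked_matrix, stacked_rhs, rcond=None)[0]
```

Here the measurements alone leave two degrees of freedom undetermined; the three smoothness rows close the system, and both routes return the same vector.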
The kernel for this study was chosen to be a backscattering kernel, which results in a relatively large amount of information. The advantage of this kernel (Fig. 1) compared with the extinction kernel (Fig. 2) can be seen in the oscillatory nature of the kernel. As a result of these oscillations, the measurements $g$ are sensitive to oscillations in the solution vector $f$. In the case of the extinction kernel, which is a relatively smooth function, the measurements $g$ are not sensitive to oscillations in $f$. Assume that $f$ can be written as a superposition of a smooth function $f_1(x)$ and an oscillatory function $f_2(x)$, i.e., $f(x) = f_1(x) + f_2(x)$. Then if $f_2(x)$ is very oscillatory and the kernel $k(x)$ is a smooth function, the resulting vector $g$ will not be affected by the oscillatory part of $f$:

$$g = \int k(x) f(x)\,dx = \int k(x) f_1(x)\,dx + \int k(x) f_2(x)\,dx \approx \int k(x) f_1(x)\,dx + 0.$$
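A quick numerical check illustrates this argument. The functions below are hypothetical stand-ins chosen for the example (an exponential as the smooth kernel, a sinusoid for the oscillatory part), not the Mie kernels of Figs. 1 and 2.

```python
import math

OMEGA = 100.0 * math.pi                      # 50 full oscillations on [0, 1]


def midpoint_integral(func, a, b, n=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def smooth_kernel(x):
    return math.exp(-x)


def oscillatory_kernel(x):
    return 1.0 + math.sin(OMEGA * x)


def f1(x):                                   # smooth part of the solution
    return 1.0 + x


def f2(x):                                   # oscillatory part of the solution
    return math.sin(OMEGA * x)


smooth_f1 = midpoint_integral(lambda x: smooth_kernel(x) * f1(x), 0.0, 1.0)
smooth_f2 = midpoint_integral(lambda x: smooth_kernel(x) * f2(x), 0.0, 1.0)
osc_f2 = midpoint_integral(lambda x: oscillatory_kernel(x) * f2(x), 0.0, 1.0)
```

With the smooth kernel, the contribution of $f_2$ is a small fraction of the contribution of $f_1$, whereas the oscillatory kernel responds to $f_2$ at a level comparable to the smooth term.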
Fig. 3. Particle size distribution dn/dr of the correct distribution, which is lognormal with mean radius of 0.8 μm and standard deviation of 1.4. Also shown are the initial guess, which is a Junge-type power law $r^{-\nu}$, $\nu = 3.0$, and the inversion solution. All curves are as a function of particle radius for 1% random error in the measurements. [Curves: correct solution, initial guess, final solution; horizontal axis: Radius (μm).]

Fig. 4. Same as Fig. 3 but with 5% random error in the measurements.
The high information content in the backscattering kernel can be seen mathematically in the large eigenvalues of $A^T A$, which is another way to say that the rows of the matrix are more orthogonal to each other. It is important to note that a large number of integration points [the small intervals in Eq. (6)] are required to make use of this oscillatory kernel. In this study, 600 small intervals were used for radii limits between 0.1 and 6.0 μm.

IV. Results
Simulated measurements were calculated for twelve wavelengths (between 0.26 and 4.91 μm) and for various size distributions (radii limits between 0.1 and 6.0 μm). These measurements were then perturbed with various random errors and served as the measurement vector $g$. Experiments with the dimension of the $A^T A$ matrix showed that the optimal dimension is ~50 × 50 (50 unknowns). For more than 50 × 50, the results are slightly better but not significantly so when considering the additional computation time. As the dimension of the matrix increases, a bigger $\gamma$ was needed to compensate for the small eigenvalues of the matrix. It is important to know that there is no recipe for choosing the optimum value for $\gamma$, and in each case a study of the problem should be made. The same is true for the radii limits of the inversion, for which there is only an estimation from the spectrum of the wavelengths used. The information content for the smaller size particles is limited by the smallest wavelength, since for particles in the Rayleigh scattering region the variation of the backscattering radiation with $\lambda$ is the same regardless of their sizes. For very large particles (large values of $2\pi r/\lambda$), the cross section for backscattering will not change appreciably with wavelength, resulting in insensitivity for resolving the size distribution for these sizes.
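The radius limits discussed above can be related to the size parameter $2\pi r/\lambda$. The short sketch below evaluates it for the extreme radius and wavelength combinations quoted in the text, and illustrates the Rayleigh-regime argument using the standard $r^6/\lambda^4$ scaling of the scattering cross section. The function names are hypothetical, the constant prefactor is dropped, and a refractive index independent of wavelength is assumed.

```python
import math


def size_parameter(radius_um, wavelength_um):
    """Dimensionless size parameter 2*pi*r/lambda."""
    return 2.0 * math.pi * radius_um / wavelength_um


def rayleigh_backscatter_shape(radius_um, wavelength_um):
    """Rayleigh-regime scaling r^6 / lambda^4, constant prefactor dropped
    (assumes a refractive index that does not vary with wavelength)."""
    return radius_um ** 6 / wavelength_um ** 4


# Extremes for radii 0.1-6.0 um and wavelengths 0.26-4.91 um.
x_min = size_parameter(0.1, 4.91)
x_max = size_parameter(6.0, 0.26)

# Ratio of the signal at the shortest and longest wavelengths for two
# hypothetical particle radii well inside the Rayleigh regime.
ratio_small = (rayleigh_backscatter_shape(0.01, 0.26)
               / rayleigh_backscatter_shape(0.01, 4.91))
ratio_smaller = (rayleigh_backscatter_shape(0.005, 0.26)
                 / rayleigh_backscatter_shape(0.005, 4.91))
```

Because the radius cancels in the ratio, the two Rayleigh-regime particles produce the same spectral shape and differ only by an overall factor, which is why the wavelength dependence carries no size information for them.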
Fig. 5. Same as Fig. 3 but with 10% random error in the measurements.
Fig. 6. Three curves showing particle size distribution dn/dr of the correct distribution, which is a Junge-type power law $r^{-\nu}$, $\nu = 3.0$; the initial guess, which is a Junge-type law with $\nu = 4.0$; and the inversion solution, all as a function of particle radius for 1% random error in the measurements.

Fig. 7. Same as Fig. 6 but with 5% random error in the measurements.

Fig. 8. Same as Fig. 6 but the initial guess is a Junge-type power law $r^{-\nu}$, $\nu = 5.0$.

Fig. 9. Same as Fig. 6 but the initial guess is a Junge-type power law $r^{-\nu}$, $\nu = 5.0$, with 5% random error in the measurements.

Fig. 10. Same as Fig. 6 but the initial guess is a Junge-type power law $r^{-\nu}$, $\nu = 5.0$, with 10% random error in the measurements.

Fig. 11. Five possible solutions for particle size distributions constructed from an initial guess of a Junge-type power law $r^{-\nu}$, $\nu = 3.0$. All solutions satisfy a measurement vector $g$ constructed for a Junge-type power law $r^{-\nu}$, $\nu = 5.0$, that was perturbed five times with different sets of random errors up to 5%.

Fig. 12. Same as Fig. 11 but with random errors in the measurements up to 10%.

Figure 3 shows results for the case where 1% random errors were added to the correct values of the measurements $g$, which were simulated from a lognormal size distribution with mean radius of 0.8 μm and standard deviation of 1.4. The initial guess for this case (and others to follow, except where noted) was taken to be a Junge-type power law, $dn/d(\log r) = c\,r^{-\nu}$ with $\nu = 3.0$. Figures 4 and 5 show inversions for 5% and 10% random errors in the data, for the same size distributions as in Fig. 3. Figures 6 and 7 show inversions for Junge-type distributions of $\nu = 3.0$, where the initial guess is a Junge of $\nu = 4.0$. In Figs. 8-10, the initial guess was a Junge type with $\nu = 5.0$, which is different from the initial guess used in Figs. 6 and 7. The correct solution in these figures (8-10) is the same as used in Figs. 6 and 7, a Junge with $\nu = 3.0$. When other initial guesses were used, a similar result was obtained, a fact which should increase confidence in the result. Figures 11 and 12 show five of the possible solutions for random errors 0-5% and 0-10% in the measurements. In these cases, the correct measurement vector $g$, constructed for a Junge distribution with $\nu = 5.0$, was perturbed five times with different sets of random errors and then each was inverted. In these figures, the results for each of the measurements are presented.
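The ensemble procedure just described (perturb the correct measurement vector several times, invert each copy, and look at the spread) can be sketched as follows. The error model is an assumption: "random errors up to 5%" is interpreted here as a multiplicative error drawn uniformly from ±5%. The measurement values, the seed, and the function names are hypothetical, and the inversion step itself is left out.

```python
import random


def perturb(measurements, max_fraction, rng):
    """Multiply each measurement by (1 + e), with e drawn uniformly from
    [-max_fraction, max_fraction] (assumed error model)."""
    return [m * (1.0 + rng.uniform(-max_fraction, max_fraction))
            for m in measurements]


def envelope(solutions):
    """Pointwise minimum and maximum over a list of equally long solutions."""
    lower = [min(values) for values in zip(*solutions)]
    upper = [max(values) for values in zip(*solutions)]
    return lower, upper


rng = random.Random(0)                       # fixed seed for repeatability
g_correct = [4.0, 2.5, 1.2, 0.6]             # hypothetical error-free measurements
ensemble = [perturb(g_correct, 0.05, rng) for _ in range(5)]

# Each member of the ensemble would be inverted separately; here the
# perturbed vectors themselves stand in for the five inversion results.
lower, upper = envelope(ensemble)
```

Applied to actual inversion results, `envelope` would give the kind of band of possible solutions that the five-member figures are meant to convey.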
These figures are included to emphasize the fact that the solution for any inversion is not unique and to present an estimated envelope of possible solutions. A second method to determine the confidence in the solution is discussed in Ben-David et al.11

V. Conclusion

The figures presented here indicate that very good results can be obtained by using the nonlinear iterative method for the inversion of backscattered measurements to determine various particle size distributions. The inversion method can tolerate error in the measurements up to 10% due to the large information content in the backscattering kernel, which applies to spherical particles with known refractive indices. When this method is applied to extinction measurements, the results are much worse because of less information content. However, it should be remembered that, when applying a second-order smoothing constraint for functions like $r^{-\nu}$ or lognormal, the solution is not an objective solution any more, since the second-order derivative of these functions is not zero and we will lose some features in the solution. Since frequently inversion results are only a tool for the calculation of other functions (e.g., calculation of size distribution to determine a phase function for radiation transfer calculations, or the determination of temperature profiles for a general circulation model), it is advisable to check the sensitivity and the details required for the particular problem.

This work was supported by U.S. Army grant DAAK11-82-K-0012. The final manuscript was edited by Margaret Sanderson Rae.

References

1. C. D. Capps, R. L. Henning, and G. M. Hess, "Analytic Inversion of Remote-Sensing Data," Appl. Opt. 21, 3581 (1982).
2. B. M. Herman, S. R. Browning, and J. A. Reagan, "Determination of Aerosol Size Distribution from Lidar Measurements," J. Atmos. Sci. 28, 763 (1971).
3. S. Twomey, "On the Numerical Solution of Fredholm Integral Equation of the First Kind by the Inversion of the Linear System Produced by Quadrature," J. Assoc. Comput. Mach. 10, 97 (1963).
4. S. Twomey, "The Application of Numerical Filtering for the Solution of Integral Equations Encountered in Indirect Sensing Measurements," J. Franklin Inst. 279, 95 (1965).
5. S. Twomey, Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements (Elsevier, New York, 1977).
6. B. R. Frieden, Probability, Statistical Optics and Data Testing (Springer, New York, 1983).
7. M. D. King, "Sensitivity of Constrained Linear Inversions to the Selection of the Lagrange Multiplier," J. Atmos. Sci. 39, 1356 (1982).
8. S. Twomey, "Information Content in Remote Sensing," Appl. Opt. 13, 942 (1974).
9. E. Thomalla and H. Quenzel, "Information Content of Aerosol Optical Properties with Respect to Their Size Distribution," Appl. Opt. 21, 3170 (1982).
10. H. Grassl, "Determination of Aerosol Size Distributions from Spectral Attenuation Measurements," Appl. Opt. 10, 2534 (1971).
11. A. Ben-David, L. W. Thomason, and B. M. Herman, "Correlation and Variance in a Constrained Linear Inversion," submitted to Appl. Opt.