
J. Appl. Cryst. (1981). 14, 101-108

Convolution Square Root of Band-Limited Symmetrical Functions and its Application to Small-Angle Scattering Data

BY OTTO GLATTER

Institut für Physikalische Chemie der Universität Graz, Heinrichstrasse 28, A-8010 Graz, Austria and Institut für Röntgenfeinstrukturforschung der Österreichischen Akademie der Wissenschaften und des Forschungszentrums Graz, Steyrergasse 17, A-8010 Graz, Austria

(Received 12 September 1980; accepted 28 October 1980)

Abstract

A method for the deconvolution of the convolution square of a symmetrical function with a limited range of definition is presented. The solution function is approximated by a number of equidistant step functions. This allows the analytical computation of the integrals of overlap in one-dimensional (lamellar) symmetry, in two-dimensional (cylindrical) symmetry and in three-dimensional (spherical) symmetry. A special iterative linearized weighted-least-squares technique solves the non-linear convolution square-root problem without any a priori information on the solution. As an application, the electron or scattering-length density ρ(r) is computed from the distance distribution function p(r) of small-angle scattering, together with the propagation of the statistical error from the input. The influence of imperfect realization of the symmetry conditions is discussed. Numerical instabilities that appear under certain conditions can easily be removed by a stabilization procedure.

I. Introduction

Small-angle scattering experiments measure the angular dependence of the scattering intensity. The scattering amplitudes, which would allow the computation of the electron or scattering-length density by Fourier transformation, cannot be obtained. However, Fourier transformation of the scattering intensity gives the correlation function or the distance distribution function of the particle. This function is the so-called convolution square of the electron density distribution (Hosemann & Bagchi, 1962; Bracewell, 1965).

The electron density distribution can be determined in principle by two different methods from the scattering intensity under the additional assumption of lamellar, cylindrical or spherical symmetry. The conventional procedure starts with the determination of the scattering amplitudes from the scattering intensities by a simple square-root operation. The determination of the right sign, known as the phase problem, is the main difficulty in this step. The critical regions near the zeros are influenced greatly by deviations from ideal symmetry, by polydispersity of the sample and by any experimental errors. As a result of these effects, more or less pronounced minima appear instead of real zeros. There is no unique technique for the estimation of the best symmetrical approximation. The phase problem can sometimes be solved by contrast variation (Mateu, Tardieu, Luzzati, Aggerbeck & Scanu, 1972; Müller, Laggner, Kratky, Kostner, Holasek & Glatter, 1974). The final step of Fourier transformation of the amplitude curve usually is not difficult (Glatter, 1977a).

Another possible procedure is Fourier transformation of the scattering intensity and the subsequent computation of the electron density from the distance distribution function by a convolution square-root technique. This method does not suffer from the phase problem. It has already been shown for the one-dimensional problem (lamellar symmetry) by Hosemann & Bagchi (1952) and by Engel (1973) that the convolution square-root operation has a unique solution except for a factor (±1) if the function has a finite range of definition and if the function is symmetrical.

The first attempt for the one-dimensional case was made by Hosemann & Bagchi (1962). They solved the problem by discretization of the integral equation. The resulting non-linear equation system has a triangular matrix and can be solved stepwise. An improvement of this technique using an iterative procedure was given by Bradaczek & Luger (1978). A completely different method has been developed by Pape (1974). The method solves the phase problem implicitly by a system of linear equations originating from a sine-series development of the correlation function. All these methods give results of sufficient accuracy only for exact input data; statistical errors are not taken into account. A weighted-least-squares technique for the one-dimensional case recently developed by Pape & Kreutz (1978) approximates the electron density distribution by a few Gaussian functions.
The method is very sensitive to the first data set of the iteration, i.e. it requires good a priori information on the signs and positions of the Gaussians. The method presented in this paper needs no such a priori information at all and can be applied to any of the three types of symmetry. The electron density is approximated in its range of definition by a linear combination of a finite number of functions that have to be linearly independent in this range. Equidistant step functions (B-splines of zero order) are then used in order to permit the analytical integration of the overlap integrals.

© 1981 International Union of Crystallography

II. Theory and method

(1) Basic equations

The electron density distribution ρ(r) is approximated by the series

different from zero in five subregions within their range of definition. These regions can be illustrated easily for symmetry of type 2 (see Fig. 1). This figure shows two circular rings of different size (i ≠ k), where R_h is the inner radius and R_i is the outer radius of the ith step function, and R_j and R_k are the inner and outer radii of the kth step function. The five regions are: (I) entering neighbouring step, R_h − R_k ...

i > k; k = 1, j = 0

II            $V_{i,1}(r) = R_i - R_h$
III and IV    non-existent

i > k; k = 1, j = 0

II            $V_{i,1}(r) = r\{r[(R_h^2 - a_{h1}^2)^{1/2} - (R_i^2 - a_{i1}^2)^{1/2}] + R_i^2 \arccos(a_{i1}/R_i) - R_h^2 \arccos(a_{h1}/R_h) + R_1^2[\arcsin(a_{1h}/R_1) - \arcsin(a_{1i}/R_1)]\}$
III and IV    non-existent

i = 1, k = 1

i ≥ k

Region        Overlap integrals

I             $V_{i,k}(r) = r^2\{-(2\pi/3)(R_h^3 - R_k^3) + (\pi r/2)(R_h^2 + R_k^2) - r^3\pi/12 + [\pi/(4r)](R_h^2 - R_k^2)^2\}$
II            $V_{i,k}(r) = r^2\{(2\pi/3)(R_i^3 - R_j^3) - (\pi r/2)(R_i^2 + R_j^2) + r^3\pi/12 - [\pi/(4r)](R_j^4 + R_i^4 + 2R_h^2R_k^2 - 2R_h^2R_j^2 - 2R_i^2R_k^2)\}$
III           $V_{i,k}(r) = (r\pi/2)(R_i^2R_k^2 + R_h^2R_j^2 - R_i^2R_j^2 - R_k^2R_h^2)$
IV            $V_{i,k}(r) = r^2\{-(2\pi/3)(R_h^3 + R_j^3) + (\pi r/2)(R_h^2 + R_j^2) + [\pi/(4r)](R_h^4 + R_j^4 + 2R_i^2R_k^2 - 2R_i^2R_j^2 - 2R_h^2R_k^2) - r^3\pi/12\}$
V             $V_{i,k}(r) = r^2\{(2\pi/3)(R_i^3 + R_k^3) - (\pi r/2)(R_i^2 + R_k^2) - [\pi/(4r)](R_i^2 - R_k^2)^2 + r^3\pi/12\}$

with the exception of the following cases:

i = k, h = j, k ≠ 1

I             non-existent
II            $V_{i,i}(r) = r^2\{(4\pi/3)(R_i^3 - R_h^3) - \pi r(R_i^2 + R_h^2) + r^3\pi/6\}$

i > k, k = 1, j = 0

II            $V_{i,1}(r) = r^2\{-[\pi/(4r)][(R_i^4 - R_h^4) - 2R_k^2(R_i^2 - R_h^2)] + (2\pi/3)(R_i^3 - R_h^3) - (\pi r/2)(R_i^2 - R_h^2)\}$

III and IV    non-existent

i = 1, k = 1

... M > N. This system can be solved with a least-squares technique, i.e. in matrix notation equation (9b) must be multiplied on the left by $A^T$, the transpose of $A$, and by the inverse of the product $A^T A$ (Brandt, 1970). We then obtain

$$\Delta c = (A^T A)^{-1} A^T \Delta p. \tag{11}$$
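The correction step (11) and its repetition can be tried out on a small discrete analogue of the one-dimensional (type 1) problem. The sketch below is an illustration under stated assumptions, not the DECON program: the profile lives on a grid of unit cells, the convolution square is a discrete convolution rather than the analytical overlap integrals, the data are unweighted, and all function names (`step_basis`, `convolution_square`, `jacobian`, `convolution_square_root`) are hypothetical. The start vector is an argument, and nothing here guarantees convergence for an arbitrary start.

```python
import numpy as np

def step_basis(n_steps):
    """Row i is the i-th symmetric unit step on a grid of 2*n_steps cells."""
    basis = np.zeros((n_steps, 2 * n_steps))
    for i in range(n_steps):
        basis[i, n_steps - 1 - i] = 1.0   # cell on the negative side
        basis[i, n_steps + i] = 1.0       # mirror cell on the positive side
    return basis

def convolution_square(c, basis):
    """Discrete convolution square of the symmetric profile, lags 0 .. 2N-1."""
    profile = c @ basis
    return np.convolve(profile, profile)[2 * len(c) - 1:]

def jacobian(c, basis):
    """A[m, i] = d p_m / d c_i; the factor 2 comes from the product rule."""
    profile = c @ basis
    start = 2 * len(c) - 1
    return np.array([2.0 * np.convolve(b, profile)[start:] for b in basis]).T

def convolution_square_root(p, c_start, max_iter=50, tol=1e-12):
    """Repeat the linearized least-squares correction until it is negligible."""
    c = np.asarray(c_start, dtype=float).copy()
    basis = step_basis(len(c))
    for _ in range(max_iter):
        dp = p - convolution_square(c, basis)     # current deviations
        A = jacobian(c, basis)                    # linearization at current c
        dc = np.linalg.solve(A.T @ A, A.T @ dp)   # dc = (A^T A)^-1 A^T dp
        c += dc
        if np.max(np.abs(dc)) < tol:
            break
    return c
```

With N coefficients this toy model yields M = 2N data points, so the linear system for the corrections is overdetermined. Solving the normal equations with `np.linalg.solve` avoids forming the inverse explicitly but is algebraically the same step.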

This allows us to compute the result after the first iteration:

$$c_i^{(1)} = c_i^{(0)} + \Delta c_i. \tag{12}$$

This result will not be perfect, because of the linearization in (9). We must continue our iterative procedure, replacing $c_i^{(0)}$ by $c_i^{(1)}$, and then computing new deviations Δp(r). The calculation is repeated until the deviations become negligible. There is no explicit proof that this method will converge under all conditions, but the method has been found to converge for all tests performed so far (for details see Numerical results).

If we want to take into account the statistical accuracy of our data we must use a weighted-least-squares technique. This is possible if we know the covariance matrix $G_p^{-1}$ of the $p(r_i)$ values. This matrix will not be a diagonal one, as the p(r) values result from a Fourier transformation which leads to a correlation of the errors. The covariance matrix $G_p^{-1}$ can be computed easily by the indirect transformation method (Glatter, 1977a, b). If we do not know the entire covariance matrix but use approximate values for the standard deviations $\sigma(r_j)$, we can define an approximate covariance matrix by

$$G_p^{-1} = \begin{pmatrix} \sigma^2(r_1) & & & 0 \\ & \sigma^2(r_2) & & \\ & & \ddots & \\ 0 & & & \sigma^2(r_M) \end{pmatrix}. \tag{13}$$

The weighted-least-squares condition is fulfilled with

$$A^T G_p A\,\Delta c = B\,\Delta c = A^T G_p\,\Delta p. \tag{14}$$
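As a small numerical illustration of (14), the sketch below builds the weight matrix from approximate standard deviations, as in the diagonal case (13), and solves for the correction. The function name `weighted_correction` and the sample numbers are assumptions made for this example only.

```python
import numpy as np

def weighted_correction(A, dp, sigma):
    """Solve (A^T G_p A) dc = A^T G_p dp with G_p = diag(1 / sigma^2)."""
    A = np.asarray(A, dtype=float)
    dp = np.asarray(dp, dtype=float)
    G_p = np.diag(1.0 / np.asarray(sigma, dtype=float) ** 2)  # inverse of (13)
    B = A.T @ G_p @ A
    return np.linalg.solve(B, A.T @ G_p @ dp)

# Two measurements of one unknown: the more accurate value gets more weight.
A = [[1.0], [1.0]]
dp = [1.0, 3.0]
equal = weighted_correction(A, dp, sigma=[1.0, 1.0])      # plain average
weighted = weighted_correction(A, dp, sigma=[1.0, 0.5])   # pulled towards 3.0
```

In the two-point example the second value has half the standard deviation and therefore four times the weight, so the result moves from the plain average towards that value.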

Equations (14) are called the normal equations. They can be solved by inversion of B:

$$\Delta c = B^{-1} A^T G_p\,\Delta p. \tag{15}$$

(3) Error propagation

Any error of the input data will influence the result ρ(r) or the solution vector c. This propagation of errors can be estimated only if the iterative procedure converges, i.e. if we already know the solution, because the transmitting matrix A depends on the solution coefficients according to (10). We assume below that the procedure has converged after k iterations. From (12) the solution is given by

$$c_i^{(k)} = c_i^{(k-1)} + \Delta c_i.$$

The coefficients $c_i^{(k-1)}$ can be treated as exact quantities in the estimation of the propagated error. The error is connected with the correction terms $\Delta c_i$ as given by (15); i.e. the error of the solution is given only by the error of the $\Delta c_i$. The error propagation of linear transformations as in (15) can be calculated with the standard error propagation formulae (Brandt, 1970; Hamilton, 1964). The covariance matrix of the Δc is given by

$$G_{\Delta c}^{-1} = [(A^T G_p A)^{-1} A^T G_p]\; G_p^{-1}\; [(A^T G_p A)^{-1} A^T G_p]^T. \tag{16}$$

This expression can be reduced to the form

$$G_{\Delta c}^{-1} = (A^T G_p A)^{-1} = B^{-1} \tag{17}$$

taking into account the symmetry properties of the matrices. A rough quantitative approximation for the standard deviation of the solution coefficients is given by

$$\sigma_{c_i} \approx [(A^T G_p A)^{-1}_{ii}]^{1/2} = [(B^{-1})_{ii}]^{1/2} \tag{18}$$

neglecting the covariances.

III. Numerical results

The Fortran IV computer program DECON* has been written in order to test the procedure. The following problems have been investigated by a comprehensive series of tests:
(a) the correctness of all equations derived in this paper, especially the equations for the $V_{i,k}(r)$ in Tables 1-3;
(b) the convergence for arbitrary starting coefficients $c_i^{(0)}$ and for arbitrary solution vectors c, and the accuracy of the solutions;
(c) the applicability of the method if the symmetry conditions are not fulfilled properly;
(d) the stability of the method in dependence on statistical or systematic errors.

As a simulation routine is part of the computer program, one can compute the p(r) function of an arbitrary profile ($c_i$), add statistical noise and use the data points as input data for the convolution square-root procedure. A program compound with the evaluation procedure ITP (Glatter, 1977b, 1980a,b) allows the evaluation of small-angle scattering data starting from smeared unsmoothed experimental data with the electron density distribution ρ(r) as the final result.

* For further details of the program please contact the author.

Only a few test runs were necessary in order to verify the correctness of the equations, but several hundred test runs were necessary to check the points (b)-(d). This large number of tests was necessary because it was not possible to find formal proofs or general answers. Only a few significant examples will be discussed in the following part of this paper.

(1) Convergence, quality of the solution

The calculation converged under all the different test conditions; i.e. convergence did not depend on a priori information on the solution, which many other methods require. This independence of the convergence and of the results led to a choice of constant starting values $c_i^{(0)}$ normalized according to equations (6). Test data without any error and a reasonable choice of the step width ΔR assure excellent approximations, i.e. graphical presentations of the input curve $p(r_i)$ and of the approximation curve $\tilde{p}(r_i)$ do not show any deviations. The results under such idealized conditions would, therefore, be impressive but are not presented here because of the negligible relevance to practical problems. In practice, we are interested in the evaluation of imperfect data. The quality of the approximation of such data can be controlled by the mean deviation

$$MD = \left\{ \frac{1}{M} \sum_{i=1}^{M} \left[ \frac{p(r_i) - \tilde{p}(r_i)}{\sigma(r_i)} \right]^2 \right\}^{1/2}. \qquad (2)$$
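The mean deviation MD used to judge the quality of an approximation is simple to evaluate. The following minimal sketch assumes that the data, the approximation and the standard deviations are given as arrays of equal length; the function name `mean_deviation` is hypothetical.

```python
import numpy as np

def mean_deviation(p_data, p_fit, sigma):
    """Root mean square of the residuals in units of their standard deviations."""
    z = (np.asarray(p_data, dtype=float) - np.asarray(p_fit, dtype=float)) \
        / np.asarray(sigma, dtype=float)
    return float(np.sqrt(np.mean(z ** 2)))
```

By construction the value is zero for a perfect fit and equals one when every residual is exactly as large as its standard deviation.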

The deviations are mainly due to the statistical errors but they also depend on the choice of the functions $\varphi_i(r)$, on the number of coefficients N (equation 1) and on deviations from the assumed symmetry and geometry. The width of the steps ΔR is the most influential property of the functions $\varphi_i(r)$; their shape is of minor importance, given the limited resolution of the usual experiments. It is possible, for example, to find a step function and a corresponding smooth function (Glatter, 1977a, Fig. 9a), the scattering amplitudes of which deviate within the range of the first four subsidiary maxima only in the fourth digit. The width ΔR should correspond to the resolution of the p(r) function, i.e. it should be equal to the distance of the knots if the indirect transformation method is used. The number of steps is then fixed by the maximum radius which can be determined from the p(r) function.

(2) Stability

The numerical tests show that the one-dimensional problem (type 1) can be solved without stability problems. The solution shows no artificial oscillations and the propagated error band has the right magnitude. Instabilities start to arise, however, with problems of type 2 and can be important for problems of type 3. They become stronger as N increases. Such instabilities lead to artificial oscillations and to a corresponding increase of the propagated error. This effect influences particularly the innermost steps and it can be negligible at the outermost steps. These properties are the result of the special structure of the matrix B. The elements of the matrix B depend on the overlap integrals $V_{i,k}(r)$. These integrals are all of the same order of magnitude, independent of i and k, for symmetry type 1, but their magnitude increases with increasing index i and k for type 2 and in particular for type 3, leading to an ill-conditioned matrix B. This instability can be eliminated with the stabilization technique already developed for the indirect transformation technique (Glatter, 1977a, b). In all our equations we replace the matrix B by a stabilized matrix B' obtained from B by addition of a stabilizing matrix K, where

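$$B' = B + \lambda K \tag{19}$$

with

$$K = \begin{pmatrix} 1 & -1 & & & 0 \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ 0 & & & -1 & 1 \end{pmatrix}.$$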
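The effect of adding a stabilizing matrix of this tridiagonal form can be illustrated with a short sketch. It assumes that K has 1 at both ends of the diagonal, 2 elsewhere on the diagonal and −1 on the two neighbouring diagonals, and it uses an arbitrary stabilization parameter rather than one found by the point-of-inflexion method. The names `stabilizing_matrix` and `stabilized_solve` are hypothetical, and the error estimate applies (18) with B replaced by the stabilized matrix.

```python
import numpy as np

def stabilizing_matrix(n):
    """Tridiagonal K whose quadratic form is the sum of squared neighbour differences."""
    K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    K[0, 0] = K[-1, -1] = 1.0
    return K

def stabilized_solve(B, rhs, lam):
    """Solve (B + lam*K) x = rhs and estimate standard deviations from the inverse."""
    B = np.asarray(B, dtype=float)
    rhs = np.asarray(rhs, dtype=float)
    B_stab = B + lam * stabilizing_matrix(len(rhs))
    x = np.linalg.solve(B_stab, rhs)
    sigma_x = np.sqrt(np.diag(np.linalg.inv(B_stab)))   # as in (18) with B' for B
    return x, sigma_x

# An oscillating right-hand side is damped once the stabilizer is switched on.
rough, _ = stabilized_solve(np.eye(3), [1.0, -1.0, 1.0], lam=0.0)
smooth, _ = stabilized_solve(np.eye(3), [1.0, -1.0, 1.0], lam=1.0)
```

Because a constant vector lies in the null space of K, the extra term penalizes only differences between neighbouring coefficients and leaves a constant level untouched.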

In (19) the Lagrange multiplier λ is a stabilization parameter; i.e. the larger λ, the higher the stabilizing effect of K. The optimum value of λ can be determined by the point-of-inflexion method (Glatter, 1977b).

The first series of results is a simulation with a two-step electron density profile, ρ = −0.5 for 0 ≤ r < 30 Å and ρ = 1.0 for 30 < r ≤ 50 Å. Fig. 2 shows the p(r) functions of the three types of symmetry with a constant statistical noise with a magnitude of about five percent of the maximum value of the actual p(r) function, together with the best approximation by DECON and with the exact theoretical functions. The solutions were calculated with five steps with a step width ΔR = 10 Å. The theoretical ρ(r) profile and the corresponding solutions with and without stabilization are shown in Fig. 3. Type 1 shows no instability and the stabilized solution coincides with the unstabilized. The influence of stabilization can be seen clearly for type 2 and stabilization is essential for type 3. The effect of stabilization is also important for the propagated error. The standard deviations for the different solutions of type 3 (Fig. 3c) are illustrated in Fig. 4. The degree of instability depends also on the number of steps N. Using the data of Fig. 2(c), with ten steps we get the solutions shown in Fig. 5. Comparing Fig. 3(c) and Fig. 5, we see that the stabilized solutions are very similar but the unstabilized solution for N = 10 shows much stronger oscillations.

The next example shows the whole evaluation procedure starting with a simulated scattering curve of a sphere (radius R = 75 Å), distorted by the effects of the slit length, slit width and Cu Kβ effect. The data points have a statistical error of 5% and are given at equidistant intervals in the range 0.012 ≤ ...

... program ITP (Glatter, 1977b) with the following parameters: 20 splines, distance of the knots DRB = 10 Å. The resulting p(r) function was transferred together with the propagated error band to the program DECON. The computation of the ρ(r) function was performed with ten steps and a step width ΔR of 10 Å, in