
ADVANCES IN IMAGING AND ELECTRON PHYSICS, VOL. 130

Statistical Experimental Design for Quantitative Atomic Resolution Transmission Electron Microscopy

S. VAN AERT,1 A. J. DEN DEKKER,2 A. VAN DEN BOS,3 AND D. VAN DYCK1

1 Department of Physics, University of Antwerp, Groenenborgerlaan 171, 2020 Antwerp, Belgium
2 Delft Center for Systems and Control, Delft University of Technology, Mekelweg 2, 2628 CJ Delft, The Netherlands
3 Faculty of Applied Sciences, Delft University of Technology, Lorentzweg 1, 2628 CJ Delft, The Netherlands

I. Introduction
   A. Qualitative Atomic Resolution Transmission Electron Microscopy
   B. Quantitative Atomic Resolution Transmission Electron Microscopy
   C. Statistical Experimental Design
II. Basic Principles of Statistical Experimental Design
   A. Introduction
   B. Parametric Statistical Models of Observations
   C. Attainable Precision
      1. The Cramér-Rao Lower Bound
      2. Precision Based Optimality Criteria
   D. Maximum Likelihood Estimation
   E. Conclusions
III. Statistical Experimental Design of Atomic Resolution Transmission Electron Microscopy Using Simplified Models
   A. Introduction
   B. Parametric Statistical Models of Observations
      1. One-Dimensional Observations
      2. Two-Dimensional Observations
      3. Three-Dimensional Observations
   C. Approximations of the Cramér-Rao Lower Bound
      1. One-Dimensional Observations
      2. Two-Dimensional Observations
      3. Three-Dimensional Observations
   D. Discussions and Examples
      1. Two-Dimensional Observations
      2. Three-Dimensional Observations
   E. Conclusions
IV. Optimal Statistical Experimental Design of Conventional Transmission Electron Microscopy
   A. Introduction
   B. Parametric Statistical Model of Observations
      1. The Exit Wave
      2. The Image Wave
      3. The Image Intensity Distribution
      4. The Image Recording
      5. The Incorporation of a Monochromator
   C. Statistical Experimental Design
      1. Microscope Settings
      2. Numerical Results
      3. Interpretation of the Results
   D. Conclusions
V. Optimal Statistical Experimental Design of Scanning Transmission Electron Microscopy
   A. Introduction
   B. Parametric Statistical Model of Observations
      1. The Exit Wave
      2. The Image Intensity Distribution
      3. The Image Recording
   C. Statistical Experimental Design
      1. Microscope Parameters
      2. Numerical Results
      3. Interpretation of the Results
   D. Conclusions
VI. Discussion and Conclusions
Appendix A
Appendix B
Appendix C
References

Copyright 2004, Elsevier Inc. All rights reserved. ISSN 1076-5670/04

I. Introduction

In materials science, the last decades have been characterized by an evolution from macro- to micro- and, more recently, to nanotechnology. In nanotechnology, nanomaterials play an important role. Examples of nanomaterials are nanoparticles, nanotubes, and layered magnetic and superconducting materials (Nalwa, 2002; van Tendeloo et al., 2000). The interesting properties of these materials are related to their structure. Therefore, one of the central issues in materials science is to understand the relation between the properties of a given material on the one hand and its structure on the other hand. A complete understanding of this relation, combined with recent progress in building nanomaterials atom by atom, will enable materials science to evolve into materials design, that is, from describing and understanding towards predicting materials with interesting properties (Browning et al., 2001; Olson, 1997, 2000; Reed and Tour, 2000; Wada, 1996).

In order to understand the properties-structure relation, experimental and theoretical studies are needed (Muller and Mills, 1999; Spence, 1999; Springborg, 2000). Essentially, theoretical studies allow one to calculate the properties of materials with known structure, whereas experimental studies allow one to characterize materials in terms of structure. In practice, however, the combination of both approaches is not yet feasible. One of the reasons is that present experimental characterization methods generally cannot determine atom positions locally with sub-ångström precision (Olson, 2000). A precision of the order of 0.01 to 0.1 Å is needed (Muller, 1998, 1999; Kisielowski, Principe, Freitag, and Hubert, 2001).

Various experimental characterization methods exist. However, scanning probe techniques, such as scanning tunnelling and atomic force microscopy, restrict investigations to surface or near-surface regions (Wiesendanger, 1994). Hence, they cannot provide subsurface information. Classical X-ray and neutron diffraction techniques, on the other hand, only provide averaged, instead of local, structure information (Zanchet and Ugarte, 2000). Therefore, they may only be applied successfully to periodic materials, such as crystals, whereas nanomaterials are usually aperiodic. Only atomic resolution transmission electron microscopy (TEM) techniques seem to be appropriate to provide local information down to the atomic scale, since electrons interact sufficiently strongly with materials (Fujita and Sumida, 1994; Spence, 1999). Another advantage of electrons is that they are charged and can therefore give information about the ionization state of atoms. They can also be deflected by lenses, yielding information both in real and Fourier space. Furthermore, compared to X-rays or neutrons, electrons even provide more structure information for a given amount of radiation damage (Henderson, 1995).
Figure 1 presents a compact scheme of the collection of electron microscopical observations by means of atomic resolution TEM. The observations are two-dimensional projected images of three-dimensional objects. Obviously, only the positions of projected atoms or atom columns may be obtained from a single image. Quantitative atomic resolution TEM allows materials scientists to measure structure parameters, including the positions of projected atoms or atom columns, from the obtained observations. The observations fluctuate about their expectations. The physical model describing these expectations, the expectation model, contains the structure parameters to be measured. Quantitative atomic resolution TEM makes use of such a model combined with statistical parameter estimation techniques in order to measure, or more specifically, to estimate, the positions of projected atoms or atom columns. Subsequently, the positions of the atoms in three-dimensional space may be derived by combining the measurements from a set of projected images. Therefore, quantitative atomic resolution TEM is probably the most appropriate technique for very precise measurement of atom positions.

Figure 1. Scheme of an atomic resolution TEM experiment. The observations are two-dimensional projected images of three-dimensional objects. The structure parameters of these objects are unknown. Quantitative atomic resolution TEM allows one to estimate these parameters from the observations. The precision of the estimates depends on the microscope settings. The optimal microscope settings result in the highest attainable precision.

The precision of the projected atom position or atom column position estimates is limited by the presence of the fluctuations in the observations. It depends on the microscope settings, such as the defocus and the aperture. In the literature, a particular choice of such settings is referred to as experimental design (Fedorov, 1972). The purpose of this article is to optimize the experimental design in terms of the attainable precision, under relevant physical constraints. These constraints are either the radiation sensitivity of the object or the specimen drift. Therefore, either the incident electron dose per square ångström (that is, the number of electrons per square ångström that interact with the object during the experiment) or the recording time has to be kept subcritical. Of crucial importance in the optimization procedure is that the attainable precision can be adequately quantified (van den Bos, 1982; van den Bos and den Dekker, 2001). It is used as optimality criterion for quantitative evaluation of the effect of microscope settings on the precision. This evaluation procedure, which is called statistical experimental design, allows electron microscopists to derive the optimal statistical experimental design, that is, the experimental design resulting in the highest attainable precision. Strictly speaking, optimal statistical experimental design refers to the optimization of tunable settings, such as the defocus, and not to fixed settings, such as the spherical aberration constant, which, in the absence of a so-called spherical aberration corrector, is a fixed property of the microscope. According to van den Bos (2002), the optimization of fixed settings by means of new instrumental developments could be called optimal instrumental design. However, since the optimization procedure is the same, this distinction in terminology will not be made in the remainder of this article.

The application of statistical experimental design to quantitative atomic resolution TEM is, in the authors’ opinion, novel. In these considerations, subjective qualities of the electron microscope as an imaging instrument are no longer important. In a sense, it does not matter whether the produced images are good-looking or not. The electron microscope is considered to be a measuring instrument (van Aert, den Dekker, van den Bos, and van Dyck, 2002; van den Bos, 2002a). This means that the structure parameters, the projected atom positions or atom column positions in particular, are quantitatively estimated from the electron microscopical observations, instead of visually determined. For many years, it has been standard practice to interpret images visually or to compare images visually with computer simulations in order to determine the structure of an object. This will be called qualitative atomic resolution TEM. The optimality criteria used to evaluate the accompanying microscope designs are based on classical resolution criteria, such as Rayleigh’s. However, these criteria are not suitable for quantitative atomic resolution TEM. Instead, the attainable statistical precision is the criterion of importance.

A. Qualitative Atomic Resolution Transmission Electron Microscopy

Until recently, qualitative atomic resolution TEM was hampered by insufficient resolution of the electron microscope. A resolution of 1 Å would be required to visualize the individual projected atom columns of materials with columnar structures, such as perfect crystals or crystals containing defects in the structure, viewed along a main zone axis. Over the years, different methods have been developed to obtain 1 Å resolution. Examples of such methods are:

• High-voltage electron microscopy
• Correction of the spherical aberration in the electron microscope
• High-angle annular dark-field scanning transmission electron microscopy
• Focal-series reconstruction
• Off-axis holography
• Correction of the chromatic aberration in the electron microscope


These methods improve the interpretability of the experimental images in terms of the structure. The first three do not require image processing techniques, whereas the last three do.

In high-voltage electron microscopy, the accelerating voltage of the electron microscope is increased up to 1 MV and beyond (Phillipp et al., 1994). It is used in conventional transmission electron microscopy (CTEM). In this mode, the object is illuminated with a parallel incident electron beam. If the object is thin, the directly interpretable resolution for CTEM is given by the so-called point resolution (O’Keefe, 1992; Spence, 1988). In high-voltage electron microscopy, it is about equal to 1 Å. For comparison, in intermediate voltage electron microscopy, operating at an accelerating voltage of about 300 kV, it is 2 Å. However, the disadvantage of high-voltage electron microscopy is the increase of the displacement damage to the object, that is, the displacement of the atoms in the object from their initial positions (Spence, 1999; Williams and Carter, 1996).

Spherical aberration in the electron microscope is a lens defect that, like other aberrations, causes a point object to be imaged as a disk of finite size. Using multipole lenses, Rose (1990) developed a corrector that cancels out spherical aberration. Correction of the spherical aberration is applied to both CTEM (Haider et al., 1998) and scanning transmission electron microscopy (STEM) (Batson, Dellby, and Krivanek, 2002). In the STEM mode, an electron probe is formed, which scans in a raster over the object. At present, one of the main difficulties of the spherical aberration corrector is the complicated procedure for the alignment of the large number of electrostatic and magnetic optical elements (Spence, 1999).

In high-angle annular dark-field scanning transmission electron microscopy (HAADF STEM), one of the STEM variants, mainly inelastically scattered electrons are detected. The elastically scattered electrons are eliminated from detection. Here, the directly interpretable resolution is enhanced (Nellist and Pennycook, 2000), although at the expense of a significant loss of imaging electrons.

The last three possibilities, focal-series reconstruction, off-axis holography, and correction of the chromatic aberration in the electron microscope, are used in CTEM mode. In CTEM, one has, apart from the point resolution, another resolution measure, the so-called information limit. The information limit represents the smallest detail that can be resolved by using image processing techniques. It is inversely proportional to the highest spatial frequency that is still transferred with appreciable intensity from the exit plane of the object to the image plane (de Jong and van Dyck, 1993; O’Keefe, 1992). In intermediate voltage electron microscopy, the information limit is usually smaller than the point resolution.


Focal-series reconstruction and off-axis holography push the directly interpretable resolution down to the information limit. This is done by retrieving the exit wave, that is, the complex electron wave function at the exit plane of the object. Ideally, the exit wave is free from any imaging artifacts, which means that the visual interpretability of the reconstruction is enhanced considerably for thin objects when compared to the original experimental images. Today, the information limit of CTEM is slightly below 1 Å for electron microscopes equipped with a field emission gun as electron source (Spence, 1999; Kisielowski, Hetherington, Wang, Kilaas, O’Keefe, and Thust, 2001; O’Keefe et al., 2001). The focal-series reconstruction method reconstructs the exit wave from a series of images collected at different defocus values (Coene et al., 1996; Kirkland, 1984; Saxton, 1978; Schiske, 1973; Thust, Coene, Op de Beeck, and van Dyck, 1996; Thust, Overwijk, Coene, and Lentzen, 1996; van Dyck and Coene, 1987; van Dyck, Op de Beeck, and Coene, 1993). Off-axis holography (Lichte, 1991) is based on the original idea of Gabor (1948), where the exit wave is retrieved from the interference between the object wave and a reference wave.

The dominant factor governing the information limit is generally the chromatic aberration. It results from a spread in defocus values, arising from fluctuations in accelerating voltage, lens current, and thermal energy of the electron. By use of a chromatic aberration corrector (Reimer, 1984; Weißbäcker and Rose, 2001, 2002) or a monochromator (Mook and Kruit, 1999), the effect of the chromatic aberration is reduced and hence the information limit improves. The chromatic aberration corrector is at the conceptual stage, while the monochromator is already used in practice. However, with a monochromator the improvement of the information limit comes at the expense of a loss of incident electron dose.
The qualitative methods presented in this section nowadays result in a resolution of 1 Å. Other methods to obtain this resolution exist as well, but they will not be treated in this article.

B. Quantitative Atomic Resolution Transmission Electron Microscopy

One ångström resolution is adequate for atomic resolution imaging, but insufficient for the materials science of the future, which will require precision rather than resolution (Cahn, 2001). One is inclined to think that a precision of 0.01 Å requires a resolution of 0.01 Å, which is far beyond the present possibilities. However, resolution and precision are quite different things. On the one hand, resolution expresses the ability to visualize adjacent atom columns separately in an image. On the other hand, precision corresponds to the variance, or the square root of the variance, the standard deviation, with which structure parameters can be estimated. In this study, the most important parameters are the projected atom column positions since nanomaterials are usually crystals containing defects in their columnar structure. In order to attain 0.01 to 0.1 Å precision, quantitative atomic resolution TEM is needed. Its goal is to estimate structure parameters of an object as precisely as possible from the observations.

Estimation of the structure parameters requires an expectation model of the observations. In quantitative atomic resolution TEM, the expectation model represents the expected number of electron counts detected, for example, with a charge-coupled device (CCD) camera. It describes, for instance, the expected number of electrons per pixel in the two-dimensional projected image of Figure 1. The expectation model is given by a function that describes the electron-object interaction, the transfer in the microscope, and the image detection. Nowadays, these processes are sufficiently well understood to make the derivation of an expectation model possible, and several commercial software packages for atomic resolution TEM image simulations are available (Kilaas and Gronsky, 1983; Stadelmann, 1987). The parameters of the expectation model are structure parameters as well as microscope settings, characterizing the object under study and the microscope, respectively.

In the derivation of this model, the object is described by the assembly of electrostatic potentials of the constituting atoms. Since the electrostatic potential is known for each atom type, the structure parameters reduce to atom numbers, atom positions, object thickness, orientation of the object with respect to the incident electron beam, and the Debye-Waller factor, which accounts for vibrations of the atoms at a given temperature (Wang, 2001). Then, the exit wave, resulting from the electron-object interaction, can be derived.
An all-embracing solution for this exit wave has not yet been found. Different routes to achieve this goal are currently being investigated. Proposed solutions are given by, for example, the weak phase object approximation (Buseck, Cowley and Eyring, 1988), the multislice method (Cowley and Moodie, 1957), and the Bloch wave theory (Hirsch et al., 1965; Howie, 1970; Kambe, Lehmpfuhl, and Fujimoto, 1974). A remarkable solution is given by the channelling theory (Geuens and van Dyck, 2002; Howie, 1966; Op de Beeck and van Dyck, 1996; Pennycook and Jesson, 1991; Sinkler and Marks, 1999; van Dyck and Chen, 1999a; van Dyck et al., 1989). It requires advanced knowledge of quantum mechanics. The channelling theory proposes a solution for the exit wave that is simple, albeit approximate, and in closed analytical form, so that the projected structure of the object may be obtained from it relatively easily. The theory is applicable if the object is oriented along a main zone axis. In this orientation, the atoms are superimposed along a column, hence the name atom column. It can then be shown that the electrons are trapped in the positive potential of these columns. Each atom column, in a sense, acts as a channel for the electrons. If the distance between adjacent columns is not too small, a one-to-one correspondence between the exit wave and the object structure is established. From the channelling theory an analytical expression for the exit wave can be derived, which is parametric in the projected atom column positions, the atom numbers of the atoms along a column, the distance between successive atoms along a column, and the Debye-Waller factor (van Dyck and Chen, 1999a). As already mentioned, one may expect to obtain projected information only. Ambiguity about the types and distance of atoms along a column can only be removed by combining information from different zone axis orientations (van Dyck and Chen, 1999b).

Furthermore, the transfer in the microscope and the image detection, which are also described by the expectation model, are characterized by a collection of microscope settings, such as the defocus value, the spherical aberration constant of the objective lens, the accelerating voltage, and the pixel size of the camera.

The structure parameters or microscope settings of the expectation model are either known beforehand with sufficient accuracy and precision, or they are not, in which case they have to be estimated from the experiment by means of statistical parameter estimation techniques. This is done by adapting the expectation model to the experimentally obtained observations with respect to the unknown parameters using a criterion of goodness of fit, such as the least squares sum or the likelihood function (Saxton, 1997). The parameter values for which this criterion is optimal are the estimates.
In a sense, in quantitative atomic resolution TEM, one is looking for the optimal value of a criterion in a parameter space whose dimension is equal to the number of parameters to be estimated. This search for the global optimum of the criterion of goodness of fit is carried out by an iterative numerical optimization method (Möbus et al., 1997). An overview of such methods can be found in Murray (1972) and van den Bos (1982). Generally, the dimension of the parameter space is high. Consequently, it is quite possible that the optimization procedure ends up at a local optimum instead of at the global optimum of the criterion of goodness of fit, so that the wrong structure is suggested. To solve this dimensionality problem, that is, to find a pathway to the global optimum in the parameter space, a good starting structure is required (van Dyck et al., 2003).

Finding such a starting structure is not trivial, since due to two scrambling processes, details in the images do not necessarily correspond to features in the atomic structure. The first scrambling process is the dynamic scattering of the electrons on their way through the object. The second scrambling process is the transfer in the electron microscope. Imaging lenses are not perfect, but have aberrations, such as spherical and chromatic aberration. As a consequence, the structure information of the object may be strongly delocalized. Additionally, the images are always disturbed by noise, that is, fluctuations in the observations, which further complicates direct interpretation.

However, it has been shown that good starting structures can be found by using the qualitative methods described before. For example, focal-series reconstruction methods in a sense invert, or equivalently, undo, the effect of lens aberrations. Consequently, the exit wave thus obtained is much more closely related to the object structure, providing a directly interpretable resolution close to the information limit, which just surpasses the limit beyond which individual atom columns can be discriminated (Kisielowski, Hetherington, Wang, Kilaas, O’Keefe and Thust, 2001; Thust and Jia, 2000; Zandbergen and van Dyck, 2000). Focal-series reconstruction methods thus yield an approximate structure that may be used as a starting point in a final numerical optimization procedure by adapting the expectation model to the original observations. The starting structure obtained may still be insufficiently close to the global optimum of the criterion of goodness of fit to guarantee convergence. In order to find a better starting structure, one also has to undo the first scrambling process mentioned, that is, the dynamic scattering of the electrons on their way through the object. Undoing the dynamic scattering is possible by means of the channelling theory. Adapting the analytical expression for the exit wave to the reconstructed exit wave with respect to the structure parameters provides the experimenter with an approximate structure that can then be used as an improved starting point for a final numerical optimization procedure by adapting the expectation model to the original images.
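The fitting procedure described in this section can be illustrated with a minimal Python sketch. It is not the procedure used for real images: the expectation model is assumed to be a single one-dimensional Gaussian peak with known width and total counts, the observations are assumed to be independent Poisson counts, and all names (`expectation`, `poisson_sample`, `log_likelihood`, `golden_section_max`) and numerical values are hypothetical. A coarse grid over the position plays the role of the starting structure, and a golden-section search stands in for the iterative numerical optimization.

```python
import math
import random

def expectation(x, position, width, counts, pixel):
    """Expected counts in the pixel centred at x (assumed Gaussian peak model)."""
    norm = counts * pixel / (math.sqrt(2.0 * math.pi) * width)
    return norm * math.exp(-0.5 * ((x - position) / width) ** 2)

def poisson_sample(mean, rng):
    """Draw one Poisson count with Knuth's method (fine for modest means)."""
    limit = math.exp(-mean)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k

def log_likelihood(position, pixels, observed, width, counts, pixel):
    """Poisson log-likelihood, dropping terms that do not depend on position."""
    total = 0.0
    for x, w in zip(pixels, observed):
        lam = expectation(x, position, width, counts, pixel)
        total += w * math.log(lam) - lam
    return total

def golden_section_max(f, lo, hi, tol=1e-6):
    """Locate the maximum of a unimodal function f on [lo, hi]."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if f(c) > f(d):
            b = d
        else:
            a = c
    return 0.5 * (a + b)

# Hypothetical experiment: positions in angstrom, 100 pixels of 0.1 angstrom.
rng = random.Random(0)
pixel = 0.1
pixels = [-5.0 + pixel * (i + 0.5) for i in range(100)]
true_position, width, counts = 0.3, 1.0, 1000.0
observed = [poisson_sample(expectation(x, true_position, width, counts, pixel), rng)
            for x in pixels]

def objective(p):
    return log_likelihood(p, pixels, observed, width, counts, pixel)

# Coarse grid = crude starting value; golden-section search = refinement.
grid = [-2.0 + 0.25 * i for i in range(17)]
start = max(grid, key=objective)
estimate = golden_section_max(objective, start - 0.25, start + 0.25)
print("starting value:", start, "estimate:", estimate)
```

With one parameter, a coarse grid is an affordable way to find a starting value. In the high-dimensional parameter spaces of real structures it is not, which is why the starting structures obtained from exit wave reconstruction and the channelling theory matter.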

C. Statistical Experimental Design

As mentioned before, with the resolution becoming sufficient to discriminate individual atom columns, a structure is characterized completely by the atom column positions, the atom numbers, the distance between successive atoms along a column, the object thickness, and orientation. Then, quantitative structure determination by means of quantitative atomic resolution TEM is a statistical parameter estimation problem, the image pixel values being the observations from which the parameters of interest have to be estimated. The precision with which these parameters can be estimated is only limited by the presence of noise. In this article, it will be shown that estimation of the unknown parameters may result in higher precision if it is accompanied by statistical experimental design.


The procedure to derive the optimal statistical experimental design is as follows. Due to the inevitable presence of noise, the observations will always fluctuate randomly and are therefore modelled as stochastic variables. By definition, a stochastic variable is characterized by its probability density function, while a set of stochastic variables has a joint probability density function. The joint probability density function defines the expectations, that is, the mean value of each observation, as well as the fluctuations of the observations about these expectations. The expectations are described by the expectation model, which is parametric in the quantities to be estimated. Given the joint probability density function, the concept of Fisher information allows one to determine the attainable precision, that is, the lowest possible variance with which a parameter can be estimated without bias from a set of observations assumed to obey a certain distribution (Frieden, 1998; van den Bos, 1982; van den Bos and den Dekker, 2001). Thus, it is possible to derive an expression for the lower bound on the variance with which the atom column positions can be estimated from a quantitative atomic resolution TEM experiment. This lower bound, which is called the Cramér-Rao Lower Bound (CRLB), is independent of the estimation method used, and therefore represents the intrinsic limit on precision. Moreover, it is a function of the microscope settings, of which at least some are adjustable. The optimal statistical experimental design of an atomic resolution TEM experiment is then given by the microscope settings that correspond to the lowest CRLB (van Aert, den Dekker, van den Bos and van Dyck, 2002b; van Dyck et al., 2002).
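As an illustration of how the CRLB follows from the Fisher information, the sketch below assumes independent Poisson-distributed pixel counts. For such counts the Fisher information of a single parameter is the sum over pixels of the squared derivative of the expected count with respect to that parameter, divided by the expected count. The one-dimensional Gaussian expectation model, the function names, and the numbers are assumptions made for this example, not the physics-based models used later in the article.

```python
import math

def expectation(x, position, width, counts, pixel):
    """Expected counts in the pixel centred at x (assumed Gaussian peak model)."""
    norm = counts * pixel / (math.sqrt(2.0 * math.pi) * width)
    return norm * math.exp(-0.5 * ((x - position) / width) ** 2)

def crlb_position(position, width, counts, pixel, half_range):
    """CRLB on the variance of an unbiased position estimate (Poisson counts assumed)."""
    n_pixels = int(round(2.0 * half_range / pixel))
    fisher = 0.0
    for i in range(n_pixels):
        x = -half_range + pixel * (i + 0.5)
        lam = expectation(x, position, width, counts, pixel)
        # d(lam)/d(position) = lam * (x - position) / width**2, so the term
        # (d lam)^2 / lam simplifies to lam * ((x - position) / width**2)**2.
        fisher += lam * ((x - position) / width ** 2) ** 2
    return 1.0 / fisher

# Hypothetical values: peak width 0.5 angstrom, 1000 detected counts.
width, counts = 0.5, 1000.0
bound = crlb_position(position=0.0, width=width, counts=counts,
                      pixel=0.05, half_range=5.0)
print("CRLB on the variance (angstrom^2):", bound)
print("bound on the standard deviation (angstrom):", math.sqrt(bound))
# For a finely sampled, isolated Gaussian peak the bound approaches width**2 / counts.
print("width**2 / counts:", width ** 2 / counts)
```

With these assumed values the bound on the standard deviation is about 0.016 Å, far smaller than the 0.5 Å peak width, which illustrates why precision and resolution are different quantities. Evaluating such a function for different settings and keeping the setting with the lowest bound is the essence of the design procedure.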
The optimal design is found by minimizing the CRLB with respect to the microscope settings, under the existing physical constraints, which are the radiation sensitivity of the object or the specimen drift. Note that the optimal statistical experimental design may be different for different objects under investigation.

In this article, the use of statistical experimental design for quantitative atomic resolution TEM is demonstrated. To begin with, it is applied to CTEM, STEM, and electron tomography experiments, all described in a simplified way (van Aert, den Dekker, van Dyck and van den Bos, 2002a). The attainable precision with which position and distance parameters of one or two components can be estimated has been investigated. For CTEM and STEM, the components are two-dimensional and the observations are counting events in a two-dimensional pixel array, whereas for electron tomography, the components are three-dimensional and the observations are counting events in a set of two-dimensional pixel arrays, which is obtained by rotating these components about a rotation axis. The expectation models of the observations are assumed to be Gaussian peaks, although they are of a higher complexity in practice. Under this assumption, the CRLB on the variance with which position and distance parameters of one or two components can be estimated, which is usually calculated numerically, may be given in closed analytical form. Although a simplified model has been used for the derivation of these expressions, they are very useful as rules of thumb to give insight into statistical experimental design for quantitative atomic resolution TEM. The rules of thumb clearly show the dependence of the attainable precision on the width of the point spread function, the width of the components, and the number of detected counts. For electron tomography, the attainable precision also depends on the orientation of the components with respect to the rotation axis. Generally, the precision improves by increasing the number of detected counts or by narrowing the point spread function. However, below a certain width of the point spread function, the precision is limited by the intrinsic width of the components. Then, further narrowing of the point spread function is useless. This result is meaningful in practice. For example, in STEM experiments, further narrowing of the probe, which represents the point spread function, is not so beneficial in terms of precision since the width of the probe is currently almost equal to the width of an atom (Krivanek, Dellby, and Nellist, 2002). Moreover, if, as in STEM, a narrower probe is accompanied by a decrease in the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints.

The optimal statistical experimental designs of CTEM and STEM experiments, assuming more complicated, physics-based expectation models instead of Gaussian peaks, are derived as well. These results are derived from the numerical minimization of the CRLB with respect to the microscope settings. The results thus obtained are intuitively interpreted using the rules of thumb for the CRLB, which are derived from Gaussian-peaked expectation models.
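The qualitative behaviour of these rules of thumb can be sketched in a few lines of Python. The expression used below is an assumption for illustration, not the article's closed-form result: a Gaussian component convolved with a Gaussian point spread function gives a Gaussian peak whose squared width is the sum of the two squared widths, and for a finely sampled, isolated peak with Poisson counts the bound on the position variance is that squared width divided by the number of counts. The function name `position_std_bound` and all numbers are hypothetical.

```python
import math

def position_std_bound(psf_width, component_width, counts):
    """Assumed rule of thumb: sqrt((psf_width^2 + component_width^2) / counts)."""
    peak_variance = psf_width ** 2 + component_width ** 2
    return math.sqrt(peak_variance / counts)

component_width = 0.4  # angstrom, hypothetical intrinsic width of a component

# Narrowing the point spread function at fixed counts: the gain levels off
# once the point spread function is narrower than the component itself.
for psf_width in (0.8, 0.4, 0.2, 0.1, 0.05):
    bound = position_std_bound(psf_width, component_width, counts=1000.0)
    print("psf width", psf_width, "-> bound on std", round(bound, 4))

# Hypothetical trade-off: a narrow probe with fewer detected electrons
# versus a wider probe with more detected electrons.
narrow = position_std_bound(0.2, component_width, counts=500.0)
wide = position_std_bound(0.4, component_width, counts=1500.0)
print("narrow probe, 500 counts:", round(narrow, 4))
print("wide probe, 1500 counts:", round(wide, 4))
```

In this made-up comparison the wider probe gives the lower bound because the gain in counts outweighs the loss in sharpness. Which effect dominates in a real experiment depends on the object and on the dose or drift constraint.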
First, for CTEM operating at an intermediate accelerating voltage of about 300 kV, it is shown that a spherical or a chromatic aberration corrector may improve the attainable precision. However, the gain, which depends on the object under study, usually turns out to be disappointing. Furthermore, a monochromator usually does not improve the attainable precision if the experiment is limited by specimen drift (den Dekker et al., 2001), whereas it may slightly improve the precision if the experiment is limited by the radiation sensitivity of the object. For CTEM operating at a low accelerating voltage of about 50 kV, the attainable precision improves substantially by using both a spherical aberration corrector and either a chromatic aberration corrector or a monochromator. Next, for STEM, it is shown that the optimal probe is not the narrowest possible and that its optimal width strongly depends on the object under study. Moreover, an annular detector usually results in a higher attainable precision than an axial one. Furthermore, as for CTEM, the precision that is gained using a spherical aberration corrector depends on the object under study, but this gain is generally only marginal (den Dekker, van Aert, van Dyck, and van den Bos, 2000; van Aert and van Dyck, 2001; van Aert, den Dekker, van Dyck, and van den Bos, 2000, 2002b). Also, it is shown that for both CTEM and STEM, the reduced brightness of the electron source is preferably as high as possible and the specimen holder as stable as possible, especially if the experiment is limited by specimen drift.

The outline of the article is as follows. Section II introduces the basic principles of statistical experimental design. The attainable precision is proposed as a quantitative performance measure. It allows one to evaluate, to optimize, and to compare different experimental settings. In Section III, this process is illustrated for the estimation of position and distance parameters from CTEM, STEM, and electron tomography experiments, which are all described by simplified expectation models. In Section IV, the optimal statistical experimental design of CTEM experiments is derived from more complicated, physics based expectation models. Special attention is paid to the spherical aberration corrector, the chromatic aberration corrector, and the monochromator. In Section V, the optimal statistical experimental design of STEM experiments is discussed. In particular, the optimal probe and detector configuration are determined. In Section VI, conclusions are drawn.

II. Basic Principles of Statistical Experimental Design

A. Introduction

In this section, the basic principles of statistical experimental design will be introduced. These principles may be applied to set up experiments in many branches of science, from elementary particle physics to astronomy. In these experiments, the measurement of any unknown parameter, such as the position of a star, the concentration of chemical elements, or the decay constant in a radioactive decay process, always takes place in the presence of fluctuations in the observations. As a result of these fluctuations, the precision with which the parameters can be measured is limited. The purpose of statistical experimental design is to derive the experimental design, that is, a particular choice of experimental settings, resulting in the highest precision. This so-called optimal statistical experimental design can be derived by applying the apparatus of mathematical statistics straightforwardly. Hence, statistical experimental design is a powerful method, which can replace conventional methods that were, or still are, used to optimize the experimental design. These conventional methods are based on the intuition of the experimenter. However, intuition might be very misleading, especially in combination with the increasing complexity of today’s experiments. Instead, statistical experimental design is needed. In the remainder of this article, it will be used to optimize the experimental design of quantitative atomic resolution TEM experiments in terms of the precision with which the atom positions can be estimated.

To begin with, a simple definition of an experiment must be given. In principle, it can be defined as the way of collecting and analyzing a set of observations for a given purpose. From a statistician’s point of view, this purpose is to measure unknown parameters as precisely as possible. This allows the experimenter to draw reliable conclusions from his or her experiment. The vital importance of precise measurement as a path to understanding was already recognized in 1883 by William Thomson, Lord Kelvin, the famous Scottish physicist. One of his much-quoted utterances in a lecture to civil engineers in London is the following one (Cahn, 2001):

I often say that when you can measure what you are speaking about, and express it in numbers, you know something about it; but when you cannot measure it, when you cannot express it in numbers, your knowledge is of a meagre and unsatisfactory kind; it may be the beginning of knowledge, but you have scarcely, in your own thoughts, advanced to the state of science.
William Thomson, Lord Kelvin

In order to be able to measure unknown parameters as precisely as possible, the analysis of the observations is based on the use of parameter estimation techniques. In other words, an estimator, which estimates the parameters from the observations, is chosen. Since an estimator is a function of the observations, the precision of the chosen estimator depends on the way the observations are collected. Often, these observations may be collected under a large variety of experimental designs. Given the purpose of an experiment, the optimal experimental design is given by the experimental settings resulting in the highest precision of the unknown parameters. The definition of an experiment may be illustrated by means of an example. A quantitative atomic resolution TEM experiment may be regarded as a set of observations, that is, electron counting results made, for example, with a CCD camera, from which the structure of the object under study, the atom positions in particular, has to be estimated as precisely as possible. In TEM, these observations may be collected by choosing, for example, defocus and aperture, and by choosing between different imaging modes, such as conventional TEM (CTEM) and scanning TEM (STEM). This offers electron microscopists the possibility to choose the electron microscope settings in accordance with the optimal experimental design so as to estimate unknown parameters as precisely as possible.

The optimization of the experimental design consists of different steps. First, a parametric statistical model of the observations has to be chosen. Since the observations fluctuate randomly about their expectations, due to the inevitable presence of noise, they are modelled as stochastic variables. By definition, a stochastic variable is characterized by its probability density function, while a set of stochastic variables has a joint probability density function. The joint probability density function of the observations defines the expectations of the observations as well as the fluctuations of the observations about these expectations. The expectations are described by the expectation model, that is, a physical model containing the parameters to be estimated. For example, in a radioactive decay process, the expectation model is a multi-exponential function, where the parameters are the decay constants. In quantitative atomic resolution TEM, the expectation model represents the expected number of electron counts. It is given by a function that describes the electron-object interaction, the transfer in the microscope, and the image detection. The parameters of the expectation model are, for example, the projected atom or atom column positions, the object thickness, and the atom numbers. Usually, parameters of this kind have a clear physical meaning. Hence, the specification of the parametric statistical model of the observations needs a solid physical base.

Second, the optimality criterion that will be used to optimize the experimental design has to be specified. The choice of this criterion depends on the purpose of the experiment, which is to estimate the unknown parameters of the expectation model as precisely as possible.
Hence, the optimality criterion to be preferred is the precision of the parameter estimates. Therefore, the precision has to be adequately quantified. This can be done using statistical parameter estimation theory. From the parametric statistical model of the observations, the attainable statistical precision can be determined, that is, the lower bound on the variance with which the parameters can be estimated without bias from the observations (van den Bos, 1982; van den Bos and den Dekker, 2001). The meaning of this so-called CRLB is as follows. Generally, one may use different estimators in order to estimate parameters. An estimator is a function of the observations that is used to compute the parameters. Thus, an estimator is, like the observations, a stochastic variable. It is said to be unbiased if its expectation is equal to the true value of the parameter. Stated differently, an unbiased estimator has no systematic error. Moreover, different estimators will have different precisions. The precision of an estimator is represented by its variance or by its standard deviation, which is the square root of the variance. It can be shown that the variance of unbiased estimators will never be lower than the CRLB. There exists a class of estimators, including the maximum likelihood estimator, that achieves the CRLB asymptotically, that is, for an increasing number of observations. The existence of the maximum likelihood estimator justifies the choice of the CRLB as optimality criterion. The CRLB is a function of the experimental settings. Thus, the lower bound on the variance of each individual, unknown parameter of the expectation model could be computed and minimized as a function of the experimental settings. However, simultaneous minimization of the set of lower bounds corresponding to the entire set of unknown parameters is usually impossible. Therefore, statistical parameter estimation theory provides different optimality criteria, which are functions of the set of lower bounds. These are scalar measures and the experimenter has to choose one of them or has to produce a criterion him or herself, reflecting his or her specific purpose. For an electron microscopist, a specific purpose might be to measure the atom column positions as precisely as possible, irrespective of the precision of the object thickness or of the atom numbers. Thus, a possible optimality criterion is the sum of the lower bounds on the variance of the position coordinates. Generally, the choice of the optimality criterion requires detailed knowledge from experts in the scientific field.

Finally, the optimality criterion chosen has to be optimized with respect to the experimental settings. This produces the optimal statistical experimental design. Usually, this is a nonlinear optimization problem for which the optimal value of the criterion has to be found numerically. This optimization is subject to the relevant physical constraints. For atomic resolution TEM, these constraints are the radiation sensitivity of the object under study or the specimen drift.
Therefore, the incident electron dose per square ångström or the recording time has to be kept within the constraints. This concludes the introduction to the basic principles of statistical experimental design. For an extended introduction to statistical experimental design and the different steps encountered in the optimization, the reader is referred to Fedorov (1972) and Pázman (1986). The section is organized as follows. In Section II.B, parametric statistical models of observations will be discussed. In Section II.C, it will be shown how an adequate expression for the attainable statistical precision of the parameter estimates, that is, the CRLB, can be derived from such a parametric statistical model. The presented optimality criteria are functions of the attainable precisions. In Section II.D, the maximum likelihood estimator of the parameters will be derived from the parametric statistical model of observations. This estimator attains the CRLB asymptotically and, hence, justifies the choice of the optimality criteria. Section II.E consists of conclusions.


B. Parametric Statistical Models of Observations

In this section, parametric statistical models of observations will be introduced. Specifically, they will be used to model electron microscopical observations. Any experimenter will readily admit that his or her observations ‘contain errors’. With a view to statistical experimental design, these errors must be specified. Generally, due to the inevitable presence of noise, sets of observations made under the same conditions nevertheless differ from experiment to experiment. The usual way to describe this behaviour is to model the observations as stochastic variables. The reason is that there is no viable alternative and that it has been found to work (van den Bos, 1999; van den Bos and den Dekker, 2001). By definition, a stochastic variable is characterized by its probability density function, while a set of stochastic variables has a joint probability density function. Consider a set of stochastic observations $w_m$, $m = 1, \ldots, M$, made at the measurement points $x_1, \ldots, x_M$. These measurement points are assumed to be exactly known. In CTEM, the observations are, for example, electron counting results made at the pixels of a CCD camera, where $M$ represents the total number of pixels. Then, the $M \times 1$ vector $w$ defined as

$$ w = (w_1 \ldots w_M)^T \tag{1} $$

is the column vector of these observations. It represents a point in the Euclidean $M$ space having $w_1, \ldots, w_M$ as coordinates. This will be called the space of observations (van den Bos and den Dekker, 2001). The expectations of the observations, that is, the mean values of the observations, are defined by their probability density function. The vector of expectations

$$ E[w] = (E[w_1] \ldots E[w_M])^T \tag{2} $$

is also a point in the space of observations and the observations are distributed about this point. The symbol $E[\cdot]$ denotes the expectation operator. In this article, the expectations of the observations are described by the expectation model, that is, a physical model, which contains the unknown parameters to be estimated, such as the position coordinates of the projected atoms or atom columns. The unknown parameters are represented by the $T \times 1$ parameter vector $\theta = (\theta_1 \ldots \theta_T)^T$. Thus, it is supposed that the expectation of the $m$th observation is described by

$$ E[w_m] = f_m(\theta) = f(x_m; \theta), \tag{3} $$

where $f_m(\theta)$ represents the expectation model, which is evaluated at the measurement point $x_m$ and which depends on the parameter vector $\theta$. Apart from the unknown parameters $\theta$, the expectation model contains known parameters and experimental settings as well. For example, in quantitative atomic resolution TEM, the expectation model is sometimes described as

$$ f_{kl}(\theta) = \frac{N}{I_{norm}} \left| \psi(r_{kl}; \theta) * t(r_{kl}; \varepsilon, C_s) \right|^2, \tag{4} $$

where $N$ represents the total number of detected electrons in an image, the function $\psi(r_{kl}; \theta)$ describes an object consisting of $n_c$ atom columns, with $r_{kl} = (x_k \; y_l)^T$ the position of the pixel $(k, l)$ and with the parameter vector $\theta = (\beta_{x_1} \ldots \beta_{x_{n_c}} \; \beta_{y_1} \ldots \beta_{y_{n_c}})^T$ containing the positions of the atom columns, $t(r; \varepsilon, C_s)$ represents the point spread function of the electron microscope depending on microscope settings such as the spherical aberration constant $C_s$ and the defocus $\varepsilon$, and $I_{norm}$ represents a normalization factor so that the integral of the function $|\psi(r_{kl}; \theta) * t(r_{kl}; \varepsilon, C_s)|^2 / I_{norm}$ is equal to one. Models like Eq. (4) will be derived and explained in detail in the remainder of this article.

Electron microscopical observations are electron counting results detected, for example, with a CCD camera. Under the assumption that the quantum efficiency of this detector is large enough to detect single electrons, these observations are binomially distributed. This means that the probability that the observation $w_m$ is equal to $\omega_m$ is given by (Papoulis, 1965)

$$ \binom{N}{\omega_m} p_m^{\omega_m} (1 - p_m)^{N - \omega_m} \tag{5} $$

with $N$ the total number of detected electrons, $p_m$ the probability that a single electron hits the pixel at the position $x_m$, and

$$ \binom{N}{\omega_m} = \frac{N!}{\omega_m! \, (N - \omega_m)!}. \tag{6} $$

For large $N$ and $p_m \ll 1$, which is a useful approximation for electron microscopical observations, the binomial distribution tends to a Poisson distribution (Bevington, 1969). Therefore, the probability that the observation $w_m$ is equal to $\omega_m$ is given by (Papoulis, 1965)

$$ \frac{\lambda_m^{\omega_m}}{\omega_m!} \exp(-\lambda_m), \tag{7} $$

where the parameter $\lambda_m = N p_m$ is equal to the expectation of the observation $w_m$, which, in its turn, is described by the expectation model, given by Eq. (3):

$$ E[w_m] = \lambda_m = f_m(\theta). \tag{8} $$

The assumption that the observations are Poisson distributed is usually made in electron microscopy (see, for example, Herrmann, 1997). A property of the Poisson distribution is that the variance of the observation $w_m$ is equal to $\lambda_m$:

$$ \mathrm{var}(w_m) = \lambda_m. \tag{9} $$
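The passage from the binomial probabilities of Eq. (5) to the Poisson probabilities of Eq. (7), and the property of Eq. (9), can be checked numerically for a single pixel. The following is a minimal sketch; the values of `N` and `p_m` are assumed purely for illustration and do not come from an actual experiment.

```python
import math

def binomial_pmf(omega, n, p):
    """Eq. (5): probability of omega counts when n electrons each hit the pixel with probability p."""
    return math.comb(n, omega) * p**omega * (1.0 - p) ** (n - omega)

def poisson_pmf(omega, lam):
    """Eq. (7): Poisson probability of omega counts with expectation lam."""
    return lam**omega * math.exp(-lam) / math.factorial(omega)

# Assumed illustrative values: many electrons, small hit probability per pixel.
N = 100_000
p_m = 5e-5
lam_m = N * p_m  # expectation of the pixel count, lambda_m = N * p_m

# Largest pointwise difference between the two distributions over 0..30 counts.
max_abs_diff = max(
    abs(binomial_pmf(k, N, p_m) - poisson_pmf(k, lam_m)) for k in range(31)
)

# Mean and variance of the Poisson distribution, summed over a range that
# carries essentially all of the probability mass for lam_m = 5.
mean = sum(k * poisson_pmf(k, lam_m) for k in range(61))
var = sum((k - mean) ** 2 * poisson_pmf(k, lam_m) for k in range(61))

print(max_abs_diff, mean, var)
```

For these assumed values the two distributions are nearly indistinguishable, and the computed mean and variance both coincide with `lam_m`, as stated by Eqs. (8) and (9).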

Moreover, electron microscopical observations may be assumed to be statistically independent. Therefore, the probability $P(\omega; \theta)$ that a set of observations $w = (w_1 \ldots w_M)^T$ is equal to $\omega = (\omega_1 \ldots \omega_M)^T$ is equal to the product of all probabilities described by Eq. (7):

$$ P(\omega; \theta) = \prod_{m=1}^{M} \frac{\lambda_m^{\omega_m}}{\omega_m!} \exp(-\lambda_m). \tag{10} $$

This function is called the joint probability density function of the observations. It represents the parametric statistical model of the observations. The parameters $\theta$ to be estimated enter $P(\omega; \theta)$ via $\lambda_m$. In Section II.C, the parameterized joint probability density function will be used to derive the CRLB, that is, an expression for the attainable precision with which the unknown parameters can be estimated unbiasedly from the observations. The presented optimality criteria, which may be used for the optimization of the experimental design, are functions of the attainable precisions. In Section II.D, from the joint probability density function, the maximum likelihood estimator of the parameters is derived. This estimator actually achieves the CRLB asymptotically, that is, for the number of observations going to infinity.

C. Attainable Precision

In this section, it will first be shown how the joint probability density function can be used to determine the attainable precision, that is, the CRLB, which is a lower bound on the variance of any unbiased estimator. The CRLB is independent of any particular method of estimation. Next, optimality criteria, which are functions of the CRLB, are given. The CRLB depends on the experimental settings, that is, on the design. Hence, functions of the CRLB, such as the optimality criteria, also depend on the experimental settings. This means that they vary with the experimental settings, of which at least some are adjustable. The experimenter has to choose one of these criteria, depending on his or her purpose, and optimize it to find the corresponding optimal design.


1. The Cramér-Rao Lower Bound

In this section, the parameterized probability density function of the observations, which is derived in Section II.B, will be used to define the Fisher information matrix and to compute the CRLB on the variance of unbiased estimators of the parameters of the expectation model. The CRLB will also be extended to include unbiased estimators of vectors of functions of these parameters. For the details of the CRLB, the reader is referred to Frieden (1998), van den Bos (1982), and van den Bos and den Dekker (2001). First, the Fisher information matrix $F$ with respect to the elements of the $T \times 1$ parameter vector $\theta = (\theta_1 \ldots \theta_T)^T$ is introduced. It is defined as the $T \times T$ matrix

$$ F = -E\left[ \frac{\partial^2 \ln P(\omega; \theta)}{\partial\theta \, \partial\theta^T} \right], \tag{11} $$

where $P(\omega; \theta)$ is the joint probability density function of the observations $w = (w_1 \ldots w_M)^T$. The expression between square brackets represents the Hessian matrix of $\ln P$, for which the $(r, s)$th element is defined by $\partial^2 \ln P(\omega; \theta) / \partial\theta_r \partial\theta_s$. For electron microscopical observations, where $P(\omega; \theta)$ is given by Eq. (10), it follows from Eqs. (8), (10), and (11) that the $(r, s)$th element of $F$ is equal to:

$$ F_{rs} = \sum_{m=1}^{M} \frac{1}{\lambda_m} \frac{\partial\lambda_m}{\partial\theta_r} \frac{\partial\lambda_m}{\partial\theta_s}. \tag{12} $$
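The step leading to Eq. (12) may be made explicit as follows. This is a short sketch of the standard argument, assuming only that the expectations $\lambda_m$ are twice differentiable with respect to the parameters. Taking the logarithm of Eq. (10) gives

$$ \ln P(\omega; \theta) = \sum_{m=1}^{M} \left( \omega_m \ln \lambda_m - \lambda_m - \ln \omega_m! \right), $$

so that

$$ \frac{\partial^2 \ln P(\omega; \theta)}{\partial\theta_r \, \partial\theta_s} = \sum_{m=1}^{M} \left[ -\frac{\omega_m}{\lambda_m^2} \frac{\partial\lambda_m}{\partial\theta_r} \frac{\partial\lambda_m}{\partial\theta_s} + \left( \frac{\omega_m}{\lambda_m} - 1 \right) \frac{\partial^2 \lambda_m}{\partial\theta_r \, \partial\theta_s} \right]. $$

Taking the expectation with the observations substituted for $\omega$ and using $E[w_m] = \lambda_m$ from Eq. (8), the second term between square brackets vanishes and the first term becomes $-(1/\lambda_m)(\partial\lambda_m/\partial\theta_r)(\partial\lambda_m/\partial\theta_s)$. The minus sign in the definition of Eq. (11) then yields Eq. (12).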

Next, it can be shown that the covariance matrix $\mathrm{cov}(\hat{\theta})$ of any unbiased estimator $\hat{\theta}$ of $\theta$ satisfies:

$$ \mathrm{cov}(\hat{\theta}) \geq F^{-1}. \tag{13} $$

This inequality expresses that the difference of the matrices $\mathrm{cov}(\hat{\theta})$ and $F^{-1}$ is positive semidefinite. Since the diagonal elements of $\mathrm{cov}(\hat{\theta})$ represent the variances of $\hat{\theta}_1, \ldots, \hat{\theta}_T$ and since the diagonal elements of a positive semidefinite matrix are nonnegative, these variances are larger than or equal to the corresponding diagonal elements of $F^{-1}$:

$$ \mathrm{var}(\hat{\theta}_r) \geq [F^{-1}]_{rr}, \tag{14} $$

where $r = 1, \ldots, T$ and $[F^{-1}]_{rr}$ is the $(r, r)$th element of the inverse of the Fisher information matrix. In this sense, $F^{-1}$ represents a lower bound to the variances of all unbiased $\hat{\theta}$. The matrix $F^{-1}$ is called the CRLB on the variance of $\hat{\theta}$.
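As an illustration of Eqs. (12) and (14), the Fisher information matrix and the CRLB can be evaluated numerically once an expectation model is specified. The sketch below assumes a hypothetical one-dimensional expectation model consisting of a single Gaussian peak with unknown position `beta` and width `sigma`; the function names, the pixel grid, and all numerical values are illustrative assumptions and are not taken from the models derived later in this article.

```python
import numpy as np

def gaussian_peak(x, beta, sigma, n_counts):
    """Assumed expectation model: expected counts per pixel for one Gaussian peak."""
    dx = x[1] - x[0]  # pixel size
    z = (x - beta) / sigma
    return n_counts * dx * np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma)

def gaussian_peak_jacobian(x, beta, sigma, n_counts):
    """Analytic derivatives of the expectations with respect to (beta, sigma)."""
    lam = gaussian_peak(x, beta, sigma, n_counts)
    z = (x - beta) / sigma
    d_beta = lam * z / sigma
    d_sigma = lam * (z**2 - 1.0) / sigma
    return np.column_stack((d_beta, d_sigma))

def fisher_information(lam, jac):
    """Eq. (12): F_rs = sum_m (1 / lam_m) * dlam_m/dtheta_r * dlam_m/dtheta_s."""
    return jac.T @ (jac / lam[:, None])

# Assumed nominal parameter values and pixel grid.
x = np.linspace(-10.0, 10.0, 201)
beta, sigma, n_counts = 0.3, 1.0, 1000.0

lam = gaussian_peak(x, beta, sigma, n_counts)
jac = gaussian_peak_jacobian(x, beta, sigma, n_counts)
F = fisher_information(lam, jac)
crlb = np.linalg.inv(F)  # lower bound on the covariance matrix, Eq. (13)

# Lower bound on the standard deviation of the position estimate, Eq. (14).
position_std_bound = np.sqrt(crlb[0, 0])
print(crlb, position_std_bound)
```

For this assumed model with fine sampling, the bound on the variance of the position is very close to `sigma**2 / n_counts`, so the attainable precision improves with more detected counts and with a narrower peak.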


Finally, the CRLB can be extended to include unbiased estimators of vectors of functions of the parameters instead of the parameters proper. Let $g(\theta) = (g_1(\theta) \ldots g_C(\theta))^T$ be such a vector and let $\hat{g}$ be an unbiased estimator of $g(\theta)$. Then, it can be shown that

$$ \mathrm{cov}(\hat{g}) \geq \frac{\partial g}{\partial\theta^T} F^{-1} \frac{\partial g^T}{\partial\theta}, \tag{15} $$

where $\partial g / \partial\theta^T$ is the $C \times T$ Jacobian matrix defined by its $(r, s)$th element $\partial g_r / \partial\theta_s$ (van den Bos, 1982). The right-hand member of this inequality is the CRLB on the variance of $\hat{g}$.

It should be noticed that the CRLB may only be computed if the probability density function of the observations is known. At first sight, this seems to be a problem since the true parameters of the probability density function are unknown. Nevertheless, even if the CRLB is a function of the unknown parameters, it remains an extremely useful tool. For nominal values of the unknown parameters it enables one to quantify variances that might be achieved, to detect possibly strong covariances between parameter estimates and, as will be shown in this article, to optimize the experimental design (van den Bos, 1982). Moreover, the estimates obtained using an estimator that achieves the CRLB may be substituted for the true parameters in the expression for the CRLB so as to get a level of confidence to be attached to these estimates (den Dekker and van Aert, 2002).

In this section, it has been shown how the elements of the Fisher information matrix may be calculated explicitly from the joint probability density function, which is described in Section II.B. From the Fisher information matrix, the CRLB on the variance of the parameters of the expectation model and on the variance of functions of these parameters may be computed from the right-hand members of Eqs. (13) and (15), respectively. The diagonal elements of the CRLB give a lower bound on the variance of any unbiased estimator of the parameters. Since the joint probability density function is a function of the experimental settings, the CRLB is a function of these settings as well. Therefore, the CRLB may be used to evaluate and to optimize the experimental design in terms of the precision. However, simultaneous minimization of the diagonal elements of the CRLB, that is, the right-hand members of Eq. (14), is usually impossible.
Therefore, statistical parameter estimation theory provides different optimality criteria, which are functions of the elements of the CRLB. These are scalar measures. The experimenter may choose one of these provided criteria or may produce a criterion him or herself, reflecting his or her purpose. A selection of criteria provided in the literature is given in the following section.
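The extension of Eq. (15) to functions of the parameters can be illustrated for the distance between two position parameters, $g(\theta) = \theta_2 - \theta_1$. The sketch below is only an illustration: the Fisher information matrix is an arbitrary, assumed positive definite matrix rather than one derived from a microscope model, and the function name `crlb_of_functions` is hypothetical.

```python
import numpy as np

def crlb_of_functions(jacobian_g, fisher):
    """Right-hand member of Eq. (15): (dg/dtheta^T) F^{-1} (dg^T/dtheta)."""
    # Solving F X = (dg/dtheta^T)^T avoids forming the inverse explicitly.
    return jacobian_g @ np.linalg.solve(fisher, jacobian_g.T)

# Assumed Fisher information matrix for two position parameters (arbitrary values).
fisher = np.array([[4.0, 1.0],
                   [1.0, 3.0]])

# g(theta) = theta_2 - theta_1, so the 1 x 2 Jacobian dg/dtheta^T is (-1, 1).
jacobian_g = np.array([[-1.0, 1.0]])

distance_bound = crlb_of_functions(jacobian_g, fisher)
print(distance_bound)  # 1 x 1 matrix holding the lower bound on the variance of the distance
```

For this assumed matrix, the bound on the variance of the distance is 9/11, whereas the sum of the two diagonal elements of $F^{-1}$ is 7/11. The difference comes from the off-diagonal elements of $F^{-1}$, which shows why covariances between parameter estimates cannot be ignored when functions of the parameters are of interest.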


2. Precision Based Optimality Criteria

In this section, optimality criteria that may be used for the evaluation and optimization of the experimental design are discussed. These criteria are functions of the CRLB and depend, like the CRLB, on the experimental settings. Several criteria are found in the literature (Fedorov, 1972; Pázman, 1986). A selection of them is discussed here. A distinction between global and partial, or, equivalently, truncated, optimality criteria is made. Global criteria are used when all parameters, represented by the elements of the parameter vector $\theta$, are important. Partial or truncated criteria are used when only some parameters or some functions of the parameters are important. For atomic resolution TEM, partial criteria are needed if the electron microscopist is only interested in, for example, the positions of atom columns, the positions of light or heavy atom columns, the distance between particular atom columns, or the positions of the atoms of a certain atom type, whereas he or she is not so interested in the object thickness or the atom numbers. Examples of both types of criteria are given below.

a. Global Optimality Criteria

• A-optimality criterion. The A-optimality criterion is defined by the sum of the diagonal elements of the CRLB, that is, the trace of the CRLB:

$$ \mathrm{tr}\, F^{-1}. \tag{16} $$

This criterion may be interpreted under the assumption that there exists an estimator with covariance matrix equal to the CRLB. Then, minimizing the A-optimality criterion corresponds to minimizing the sum of the variances of the estimates $\hat{\theta}_1, \ldots, \hat{\theta}_T$ of the parameters $\theta_1, \ldots, \theta_T$, without taking the correlation between these estimates into account. A geometric interpretation of this criterion may be given by considering the ellipsoid of concentration, which is a measure of the concentration of the distribution of the estimates about the true parameters. It is defined by the ellipsoid enclosing the true parameters $\theta$ such that a uniform distribution over the area bounded by the ellipsoid will have the same expectation and covariance matrix as the distribution of the estimates (Cramér, 1999; Mood, Graybill, and Boes, 1974). In Figure 2, the square root of the A-optimality criterion, $(\mathrm{tr}\, F^{-1})^{1/2}$, is shown on the ellipsoid of concentration for the special case of two unknown parameters. This figure is based on Fedorov (1972).

• D-optimality criterion. The D-optimality criterion is defined by the determinant of the CRLB:

$$ \det F^{-1}. \tag{17} $$

A statistical interpretation of the D-optimality criterion may be given for the hypothetical estimator discussed before. Then, minimizing the D-optimality criterion corresponds to minimizing the volume of the ellipsoid of concentration, which is shown in Figure 2 for the special case of two parameters. The drawback of minimizing the D-optimality criterion is that in some cases the volume of the ellipsoid of concentration is small because it is ‘narrow but long’. This means that there is a linear combination of the parameters which is estimated with a very large variance under the corresponding optimal design.

• Minimax criterion in space of parameters. The minimax criterion in the space of parameters is defined by the maximum value of the diagonal elements of the CRLB:

$$ \max_r \, [F^{-1}]_{rr}. \tag{18} $$

Minimizing this criterion corresponds to minimizing the largest variance of the estimate of the corresponding parameter. For example, in Figure 2, the square root of the criterion, given by Eq. (18), corresponds to $[F^{-1}]_{11}^{1/2}$.

Figure 2. Ellipsoid of concentration for two parameters. The geometric interpretation of the square root of the A-optimality criterion, minimax criterion in space of parameters, and E-optimality criterion is represented by $(\mathrm{tr}\, F^{-1})^{1/2}$, $[F^{-1}]_{11}^{1/2}$, and $1/\lambda_{min}^{1/2}$, respectively. Minimizing the D-optimality criterion corresponds to minimizing the volume (the area in this example) of the ellipsoid of concentration. This figure is based on Fedorov (1972).


• E-optimality criterion. The E-optimality criterion is defined by the inverse of the minimum eigenvalue $\lambda_{min}$ of the Fisher information matrix:

$$ \frac{1}{\lambda_{min}}. \tag{19} $$

In Figure 2, the square root of the E-optimality criterion is shown on the ellipsoid of concentration for the special case of two parameters.

• Linear optimality criteria. Linear optimality criteria are defined by criteria functions of the form

$$ \mathrm{tr}\, W F^{-1}, \tag{20} $$

where $W$ is a positive definite $T \times T$ matrix. The A-optimality criterion corresponds to the particular case where $W$ is equal to the identity matrix. If $W$ is a diagonal matrix, Eq. (20) is equal to:

$$ \sum_{r=1}^{T} W_{rr} \, [F^{-1}]_{rr}, \tag{21} $$

that is, a weighted sum of the variances.

b. Partial or Truncated Optimality Criteria

In principle, partial or truncated optimality criteria are analogous to global optimality criteria, but instead of the full CRLB, that is, $F^{-1}$, only a submatrix $F_S^{-1}$ of $F^{-1}$ is used. If only $\theta_1, \ldots, \theta_S$ of the entire collection of $T$ unknown parameters $\theta_1, \ldots, \theta_T$ are important, the submatrix to be used is defined as:

$$ F_S^{-1} = \begin{pmatrix} [F^{-1}]_{11} & [F^{-1}]_{12} & \ldots & [F^{-1}]_{1S} \\ [F^{-1}]_{21} & [F^{-1}]_{22} & \ldots & [F^{-1}]_{2S} \\ \vdots & \vdots & \ddots & \vdots \\ [F^{-1}]_{S1} & [F^{-1}]_{S2} & \ldots & [F^{-1}]_{SS} \end{pmatrix} \tag{22} $$

Then, for example, the partial D-optimality criterion is defined by the determinant of $F_S^{-1}$. Moreover, if only some functions of the parameters are important, the inverse of the right-hand member of inequality (15) has to be used.

The optimality criteria, which are presented in this section, are functions of the elements of the CRLB. Minimization of these criteria as a function of the experimental settings, under the relevant physical constraints, produces the optimal statistical experimental design. However, different optimality criteria will generally produce different optimal designs. The experimenter has to choose one of them or has to produce a criterion him or herself depending on his or her purpose.

D. Maximum Likelihood Estimation

In this section, it is discussed how the maximum likelihood estimator of the parameters may be derived from the parameterized probability density function, which is discussed in Section II.B. This estimator is very important since it achieves the CRLB asymptotically, that is, for the number of observations going to infinity. Thus, it is asymptotically most precise and is therefore often used in practice. The maximum likelihood estimator is clearly discussed in van den Bos and den Dekker (2001). A summary is given here. The maximum likelihood method for estimation of the parameters consists of three steps:

1. The available observations $w = (w_1 \ldots w_M)^T$ are substituted for the corresponding independent variables $\omega = (\omega_1 \ldots \omega_M)^T$ in the probability density function, for example, in Eq. (10). Since the observations are numbers, the resulting expression depends only on the elements of the parameter vector $\theta = (\theta_1 \ldots \theta_T)^T$.
2. The elements of $\theta = (\theta_1 \ldots \theta_T)^T$, which are the hypothetical true parameters, are considered to be variables. To express this, they are replaced by $\tau = (\tau_1 \ldots \tau_T)^T$. The logarithm of the resulting function, $\ln P(w; \tau)$, is called the log-likelihood function of the parameters $\tau$ for the observations $w$, which is denoted as $q(w; \tau)$.
3. The maximum likelihood estimates $\hat{\theta}_{ML}$ of the parameters $\theta$ are defined by the values of the elements of $\tau$ that maximize $q(w; \tau)$, or

$$ \hat{\theta}_{ML} = \arg\max_{\tau} \, q(w; \tau). \tag{23} $$
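The three steps above can be sketched for Poisson distributed observations, for which the log-likelihood function following from Eq. (10) is, up to a term that does not depend on the parameters, the sum over the pixels of $w_m \ln \lambda_m - \lambda_m$. The example assumes a hypothetical one-dimensional Gaussian peak with a single unknown position, simulated observations, and a plain grid search as the maximization method; none of these choices is prescribed by the article.

```python
import numpy as np

def expectation(x, tau, sigma=1.0, n_counts=1000.0):
    """Assumed expectation model: one Gaussian peak at position tau."""
    dx = x[1] - x[0]  # pixel size
    z = (x - tau) / sigma
    return n_counts * dx * np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma)

def log_likelihood(w, x, tau):
    """Poisson log-likelihood q(w; tau), omitting the constant term -sum(ln w_m!)."""
    lam = expectation(x, tau)
    return np.sum(w * np.log(lam) - lam)

rng = np.random.default_rng(0)          # fixed seed so that the sketch is reproducible
x = np.linspace(-10.0, 10.0, 201)       # assumed pixel grid
theta_true = 0.3                        # assumed true position

# Step 1: the observations are numbers (here simulated Poisson counts).
w = rng.poisson(expectation(x, theta_true))

# Step 2: the parameter is treated as a variable tau and q(w; tau) is evaluated.
tau_grid = np.linspace(-2.0, 2.0, 4001)
q = np.array([log_likelihood(w, x, tau) for tau in tau_grid])

# Step 3: the maximum likelihood estimate maximizes q(w; tau), Eq. (23).
theta_ml = tau_grid[np.argmax(q)]
print(theta_ml)
```

A grid search is used here only because it is transparent. For expectation models with several parameters, a numerical optimization routine would normally replace it.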

The most important properties of the maximum likelihood estimator are the following ones:

• Consistency. Generally, an estimator is said to be consistent if the probability that an estimate deviates more than a specified amount from the true value of the parameter can be made arbitrarily small by increasing the number of observations used.
• Asymptotic normality. If the number of observations increases, the probability density function of a maximum likelihood estimator tends to a normal distribution.
• Asymptotic efficiency. The asymptotic covariance matrix of a maximum likelihood estimator is equal to the CRLB. In this sense, the maximum likelihood estimator is most precise.
• Invariance property. The maximum likelihood estimates $\hat{g}_{ML}$ of a vector of functions of the parameters $\theta$, that is, $g(\theta) = (g_1(\theta) \ldots g_C(\theta))^T$, are equal to $g(\hat{\theta}_{ML}) = (g_1(\hat{\theta}_{ML}) \ldots g_C(\hat{\theta}_{ML}))^T$ (Mood, Graybill, and Boes, 1974).

In the remainder of this article, it will be checked if the maximum likelihood estimator attains the CRLB for atomic resolution TEM experiments. If so, the use of the optimality criteria given in Section II.C.2, which are functions of the elements of the CRLB, is justified.

E. Conclusions

In this section, it has been shown how to evaluate and to optimize the experimental design in terms of the precision with which unknown parameters can be estimated. The optimization consists of different steps, which may be summarized as follows:

1. The parametric statistical model of the observations is derived. This model defines the expectations of the observations as well as the fluctuations of the observations about these expectations. The specification of this model requires a solid physical base.
2. The CRLB, which is a theoretical lower bound on the variance of the parameter estimates, is computed from the parametric statistical model of the observations. This lower bound represents the highest attainable precision. Since the parametric statistical model of the observations is a function of the experimental settings, the CRLB is a function of these settings as well.
3. An optimality criterion is chosen, reflecting the purpose of the experimenter. This criterion is a function of the elements of the CRLB, which, like the CRLB, depends on the experimental settings. Generally, different optimality criteria will produce different optimal experimental designs.
4. The criterion chosen is optimized with respect to the experimental settings. The settings corresponding to the optimum are suggested as the optimal statistical experimental design. This optimization procedure is subject to the physical constraints.

In the remainder of this article, this procedure will be applied to set up quantitative atomic resolution TEM experiments.
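The four steps can be walked through on a toy problem. The sketch below is only an illustration under loudly stated assumptions: the observations are independent Poisson counts, the expectation model consists of two one-dimensional Gaussian peaks, and a single hypothetical experimental setting `blur` both widens the peaks and increases the number of detected electrons (the constants `INTRINSIC_WIDTH` and `DOSE_SCALE` and the quadratic dose law are invented for this example and do not describe a real microscope). The Fisher information for Poisson counts, the sum over pixels of the products of the derivatives of the expectations divided by the expectations, is evaluated with finite differences.

```python
import math

INTRINSIC_WIDTH = 1.0   # hypothetical width of the object feature (arbitrary units)
DOSE_SCALE = 1000.0     # hypothetical electrons per peak at unit blur

def expectations(b1, b2, blur, pixels, dx):
    """Step 1 (toy model): expected Poisson counts for two Gaussian peaks at b1, b2.
    Assumed trade-off: a larger blur widens the peaks but admits more electrons."""
    rho = math.hypot(INTRINSIC_WIDTH, blur)
    norm = DOSE_SCALE * blur ** 2 * dx / (math.sqrt(2.0 * math.pi) * rho)
    return [norm * (math.exp(-((x - b1) ** 2) / (2.0 * rho ** 2))
                    + math.exp(-((x - b2) ** 2) / (2.0 * rho ** 2))) for x in pixels]

def derivative(which, b1, b2, blur, pixels, dx, eps=1e-4):
    """Central finite difference of the expectations w.r.t. b1 (which=0) or b2."""
    s1, s2 = (eps, 0.0) if which == 0 else (0.0, eps)
    plus = expectations(b1 + s1, b2 + s2, blur, pixels, dx)
    minus = expectations(b1 - s1, b2 - s2, blur, pixels, dx)
    return [(p - m) / (2.0 * eps) for p, m in zip(plus, minus)]

def crlb_distance(distance, blur, pixels, dx):
    """Step 2: lower bound on the variance of the distance b2 - b1."""
    b1, b2 = -distance / 2.0, distance / 2.0
    lam = expectations(b1, b2, blur, pixels, dx)
    d1 = derivative(0, b1, b2, blur, pixels, dx)
    d2 = derivative(1, b1, b2, blur, pixels, dx)
    f11 = sum(a * a / l for a, l in zip(d1, lam))
    f22 = sum(b * b / l for b, l in zip(d2, lam))
    f12 = sum(a * b / l for a, b, l in zip(d1, d2, lam))
    # (-1, 1) F^{-1} (-1, 1)^T for the 2 x 2 Fisher information matrix F
    return (f11 + f22 + 2.0 * f12) / (f11 * f22 - f12 * f12)

def optimal_blur(distance, candidates, pixels, dx):
    """Steps 3 and 4: take the bound on the distance as criterion and minimize it
    over the candidate settings (the list encodes the physical constraints)."""
    return min(candidates, key=lambda s: crlb_distance(distance, s, pixels, dx))

dx = 0.1
pixels = [-8.0 + dx * k for k in range(161)]
candidates = [0.2 + 0.05 * i for i in range(57)]   # allowed range 0.2 ... 3.0
best = optimal_blur(0.5, candidates, pixels, dx)
```

In this toy model the optimum lies in the interior of the allowed range, close to the intrinsic width, because a narrower blur loses electrons while a wider blur loses sharpness. A different criterion, or a different dose law, would generally give a different optimum.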


III. Statistical Experimental Design of Atomic Resolution Transmission Electron Microscopy Using Simplified Models

A. Introduction

In this section, the attainable precision with which position and distance parameters of one or two components can be estimated is computed for atomic resolution TEM experiments described by simplified models. In other words, an expression for the CRLB on the variance of position and distance estimates, which has been introduced in Section II, is derived for one-, two-, and three-dimensional components. Such expressions may be used to evaluate and optimize the experimental designs. For one- and two-dimensional components, the observations consist of counting events in a one- and two-dimensional pixel array, respectively. For three-dimensional components, they consist of counting events in a set of two-dimensional pixel arrays, which is obtained by rotating these components about a rotation axis. In principle, these examples may be considered as simulations of a wide variety of experiments. However, in the remainder of this article, the two-dimensional example will be regarded as a simplified simulation of a high-resolution CTEM or STEM experiment, whereas the three-dimensional example will be regarded as a simplified simulation of an electron tomography experiment.

Usually, the performance of such experiments is discussed in terms of two-point resolution, expressing the possibility of perceiving separately components of a two-point image. One of the earliest and most famous criteria for two-point resolution is that of Rayleigh (1902). Criteria such as Rayleigh's are suitable to set up qualitative atomic resolution TEM experiments. However, as already mentioned in Section I, a different optimality criterion is needed in the framework of quantitative atomic resolution TEM, where one has prior knowledge about the observations in the form of a parametric statistical model, describing the expectations of the observations as well as the fluctuations of the observations about these expectations.
Then, an obvious alternative to two-point resolution is the attainable precision with which position or distance parameters can be measured. In this section, the model describing the expectations of the observations, the expectation model, is assumed to consist of Gaussian peaks with unknown position. Under this assumption, it will be shown that the CRLB, which is usually calculated numerically, may be approximated by a simple rule of thumb in closed analytical form. Although the expectation models of images obtained in practice are usually of a higher complexity than a Gaussian peak, the rules of thumb are suitable to give insight into statistical experimental design for quantitative atomic resolution TEM. This will be


shown in the remainder of this article, where more complicated, physics based expectation models will be considered and where, consequently, the CRLB has to be calculated numerically. In the absence of rules of thumb for the attainable precision, it would be difficult, if not impossible, to understand these numerical results. In the authors' opinion, whenever possible, every numerical analysis should be preceded by a simplified analysis. This will provide a check of the numerical results. In Section III.B, parametric statistical models of the observations are described. In Section III.C, the approximations of the CRLB, that is, the rules of thumb for the CRLB, are derived from these models. Section III.D consists of discussions and examples. In Section III.E, conclusions are drawn. Part of the results of this section has earlier been published in (van Aert, den Dekker, van Dyck, and van den Bos, 2002a).

B. Parametric Statistical Models of Observations

In this section, the pertinent parametric statistical models of the observations are described. In the remainder of this section, these models will be used for the derivation of expressions for the CRLB with which the position of one component or the distance between two components can be measured. The purpose is to find rules of thumb for the CRLB, that is, expressions that are easy to calculate and to interpret. In order to accomplish this, it will be assumed that the expectation models underlying the observations consist of Gaussian peaks with unknown position and known amplitude and width. In Sections III.B.1, 2, and 3, the expectation model is described for one-, two-, and three-dimensional observations, respectively.

1. One-Dimensional Observations

For one-dimensional observations, the normalized image intensity distribution is assumed to be given by:

$$ f(x; \beta) = \frac{1}{n_c} \sum_{n=1}^{n_c} F(x - \beta_{xn}), \tag{24} $$

where $n_c$ is the total number of components, $\beta_{xn}$ is the position of the $n$th component, and

$$ F(x) = \frac{1}{\sqrt{2\pi}\,\rho} \exp\left(-\frac{x^2}{2\rho^2}\right) \tag{25} $$

with $\rho$ the width of the Gaussian peak, to which both the width of the component and the two-point resolution of the imaging instrument


contribute. The $n_c$-dimensional parameter vector $\beta$ is equal to $(\beta_1 \ldots \beta_{n_c})^T = (\beta_{x1} \ldots \beta_{xn_c})^T$. Suppose that the observations $w_k$, $k = 1, \ldots, K$ are made at equidistant pixels of size $\Delta x$ at the measurement points $x_k$. If $\Delta x$ is small compared to the width $\rho$ of the Gaussian peak, the probability $p_k(\beta)$ that an electron hits the pixel at the position $x_k$ is approximately given by:

$$ p_k(\beta) = p(x_k; \beta) = \int_{x_k - \Delta x/2}^{x_k + \Delta x/2} f(x; \beta)\,dx \approx f(x_k; \beta)\,\Delta x. \tag{26} $$

This means that the number of electrons expected to be found at this pixel is given by:

$$ \lambda_k = n_c N_p\, p_k(\beta), \tag{27} $$

where $N_p$ is the total number of electrons in each Gaussian peak. Therefore, Eq. (27) describes the expectation model, which contains the parameters $\beta$.

2. Two-Dimensional Observations

For two-dimensional observations, two distinct expectation models are assumed, corresponding to the so-called dark-field and bright-field imaging mode in TEM. In dark-field imaging, the noninteracting electrons are eliminated from detection, whereas in bright-field imaging, these electrons contribute to the background intensity in the image. The expectation models for dark-field and bright-field imaging are approximated by a model consisting of Gaussian peaks without and with background, respectively, although they are of a higher complexity in practice.

a. Dark-Field Imaging. For dark-field imaging, the normalized image intensity distribution of the two-dimensional object is assumed to be given by:

$$ g_{DF}(x, y; \beta) = \frac{1}{n_c} \sum_{n=1}^{n_c} G(x - \beta_{xn},\, y - \beta_{yn}), \tag{28} $$

where $\beta_{xn}$ and $\beta_{yn}$ are the x- and y-coordinate of the position of the $n$th component, respectively, and

$$ G(x, y) = \frac{1}{2\pi\rho^2} \exp\left(-\frac{x^2 + y^2}{2\rho^2}\right), \tag{29} $$

with $\rho$ the width of the Gaussian peak. The $2n_c$-dimensional parameter vector $\beta$ is equal to $(\beta_1 \ldots \beta_{2n_c})^T = (\beta_{x1} \ldots \beta_{xn_c}\ \beta_{y1} \ldots \beta_{yn_c})^T$. For a two-dimensional object, the components are, for example, atoms or atom


columns in projection. In fact, Eq. (28) results from a two-dimensional convolution between an object function and the point spread function of the electron microscope. The intensity distribution of the identical components of the object as well as the point spread function $t(x, y)$ are assumed to be Gaussian with corresponding widths $\rho_C$ and $\rho_{EM}$, respectively. In this case

$$ \rho^2 = \rho_C^2 + \rho_{EM}^2. \tag{30} $$
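The dark-field model of Eqs. (28)-(30) is simple enough to evaluate directly. The following sketch, with illustrative widths and positions that are not taken from the article, builds the peak width from Eq. (30) and checks numerically that the intensity distribution of Eq. (28) is indeed normalized. All function names are chosen for this example.

```python
import math

def peak_width(rho_c, rho_em):
    """Eq. (30): the widths of a Gaussian component and a Gaussian point
    spread function add in quadrature."""
    return math.sqrt(rho_c ** 2 + rho_em ** 2)

def gaussian_2d(x, y, rho):
    """Eq. (29): normalized, rotationally symmetric Gaussian peak."""
    return math.exp(-(x * x + y * y) / (2.0 * rho * rho)) / (2.0 * math.pi * rho * rho)

def g_dark_field(x, y, positions, rho):
    """Eq. (28): average of identical peaks centred at the component positions."""
    return sum(gaussian_2d(x - bx, y - by, rho) for bx, by in positions) / len(positions)

rho = peak_width(0.3, 0.4)               # illustrative widths, arbitrary units
positions = [(-0.6, 0.0), (0.6, 0.0)]    # illustrative component positions
step = 0.05
grid = [-4.0 + step * i for i in range(161)]
# Riemann sum over a field of view that contains both peaks almost entirely
total = sum(g_dark_field(x, y, positions, rho) for x in grid for y in grid) * step * step
```

Here `rho` evaluates to 0.5 and `total` is very close to 1, as expected for a normalized distribution sampled on pixels that are small compared to the peak width.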

The observations $w_{kl}$, $k = 1, \ldots, K$, $l = 1, \ldots, L$ are made at equidistant pixels of area $\Delta x \times \Delta y$ at the measurement points $(x_k\ y_l)^T$. The field of view (FOV), that is, the total area of detection, is equal to $K\Delta x \times L\Delta y$. If $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak, the probability $p_{kl}(\beta)$ that an electron hits the pixel at the position $(x_k\ y_l)^T$ is approximately given by:

$$ p_{kl}(\beta) = p(x_k, y_l; \beta) = \int_{x_k - \Delta x/2}^{x_k + \Delta x/2} \int_{y_l - \Delta y/2}^{y_l + \Delta y/2} g_{DF}(x, y; \beta)\,dx\,dy \approx g_{DF}(x_k, y_l; \beta)\,\Delta x\,\Delta y. \tag{31} $$

For a given total number of electrons $N_p$ in each Gaussian peak, the number of electrons expected to be found at this pixel is given by:

$$ \lambda_{kl} = n_c N_p\, p_{kl}(\beta). \tag{32} $$

This equation describes the expectation model containing the parameters $\beta$.

b. Bright-Field Imaging. For bright-field imaging, the normalized image intensity distribution of the two-dimensional object is assumed to be given by:

$$ g_{BF}(x, y; \beta) = \frac{1 - n_c \Omega\, g_{DF}(x, y; \beta)}{\mathrm{FOV} - n_c \Omega}, \tag{33} $$

where $\Omega$ is a constant, representing the strength of the interaction of the electrons with one component, $g_{DF}(x, y; \beta)$ is described by Eq. (28), and FOV is the field of view. The term '1' represents a constant background, corresponding to the noninteracting electrons, and the denominator $\mathrm{FOV} - n_c \Omega$ is a normalization constant. In what follows, the term '$n_c \Omega\, g_{DF}(x, y; \beta)$' is assumed to be much smaller than the term '1', which means that the number of interacting electrons is small compared to the number of noninteracting electrons. In analogy with dark-field imaging, the probability $p_{kl}(\beta)$ that an electron hits the pixel at the position $(x_k\ y_l)^T$ is approximately given by:


$$ p_{kl}(\beta) \approx g_{BF}(x_k, y_l; \beta)\,\Delta x\,\Delta y. \tag{34} $$

For a given total number of electrons $N$, the number of electrons expected to be found at the pixel at the position $(x_k\ y_l)^T$ is given by:

$$ \lambda_{kl} = N p_{kl}(\beta). \tag{35} $$

This result defines the expectation model for bright-field imaging containing the parameters $\beta$.

3. Three-Dimensional Observations

The three-dimensional observations made at the three-dimensional object consist of a single-axis tilt series of two-dimensional projections recorded by an electron tomography experiment. These projections are obtained by recording two-dimensional images while tilting the object about a fixed axis. Other data collection geometries in electron tomography exist as well, such as conical and random-conical tilting (Frank, 1992). However, only single-axis tilting is considered here. It is assumed that the three-dimensional density distribution of the object is given by:

$$ d(x, y, z; \beta) = \frac{1}{n_c} \sum_{n=1}^{n_c} D(x - \beta_{xn},\, y - \beta_{yn},\, z - \beta_{zn}), \tag{36} $$

where $\beta_{xn}$, $\beta_{yn}$, and $\beta_{zn}$ are the x-, y-, and z-coordinate, respectively, of the position of the $n$th component with respect to a reference coordinate system and

$$ D(x, y, z) = \frac{1}{(2\pi)^{3/2} \rho_C^3} \exp\left(-\frac{x^2 + y^2 + z^2}{2\rho_C^2}\right), \tag{37} $$

with $\rho_C$ the width of the identical components. The $3n_c$-dimensional parameter vector $\beta$ is equal to $(\beta_1 \ldots \beta_{3n_c})^T = (\beta_{x1} \ldots \beta_{xn_c}\ \beta_{y1} \ldots \beta_{yn_c}\ \beta_{z1} \ldots \beta_{zn_c})^T$. The components are, for example, atoms. Figure 3 shows the surface of the three-dimensional density distribution and the positions of two components. It will be assumed that the y-axis is the rotation axis and the z-axis is the axis parallel to the illuminating electron beam. In the derivation of a rule of thumb for the attainable precision, it will be assumed that the tilt angles $\theta^j$, $j = 1, \ldots, J$ are equidistantly located on the interval $(-\pi/2, \pi/2)$. Although such a full angular range is rather unrealistic, it will be shown in Section III.D that the derived rules of thumb still provide insight for a limited angular range. At each tilt angle $\theta^j$, the position coordinates of the components $\beta^j = (\beta^j_{x1} \ldots \beta^j_{xn_c}\ \beta^j_{y1} \ldots \beta^j_{yn_c}\ \beta^j_{z1} \ldots \beta^j_{zn_c})^T$ with respect to the reference coordinate system are given by:


Figure 3. Surface of the three-dimensional density distribution of an object consisting of two components. The position coordinates of the two components are represented by the elements of the parameter vector $\beta$. It has been assumed that the y-axis is the rotation axis and the z-axis is the axis parallel to the illuminating electron beam. Furthermore, $d$ is the distance between the two components, $d'$ is the distance between the components projected onto the (x, z)-plane, and $\phi$ is the angle between the rotation axis and the axis connecting both components. It should be mentioned that this is not the tilt angle.

$$ \begin{aligned} \beta^j_{xn} &= \beta_{xn} \cos\theta^j + \beta_{zn} \sin\theta^j, \\ \beta^j_{yn} &= \beta_{yn}, \\ \beta^j_{zn} &= \beta_{zn} \cos\theta^j - \beta_{xn} \sin\theta^j, \end{aligned} \tag{38} $$

for $n = 1, \ldots, n_c$. The normalized image intensity distribution of a two-dimensional projection is equal to:

$$ h^j(x, y; \beta) = \int d(x, y, z; \beta^j)\,dz * t(x, y) = g_{DF}(x, y; \tilde{\beta}^j), \tag{39} $$

that is, the convolution of the projected density distribution and the point spread function of the electron microscope. The parameters $\tilde{\beta}^j = (\beta^j_{x1} \ldots \beta^j_{xn_c}\ \beta^j_{y1} \ldots \beta^j_{yn_c})^T$ are the position coordinates of the components in this projection and the function $g_{DF}$ is given by Eq. (28). It follows from Eq. (39) that each projected image is assumed to be a two-dimensional


dark-field imaging experiment. However, for future research, it would be interesting to consider a bright-field imaging experiment as well, since this imaging mode is often used in practice. The observations $w^j_{kl}$, $k = 1, \ldots, K$, $l = 1, \ldots, L$, $j = 1, \ldots, J$ are made at equidistant pixels of area $\Delta x \times \Delta y$ at the measurement points $(x_k\ y_l)^T$ at the tilt angles $\theta^j$. The FOV of each projection is equal to $K\Delta x \times L\Delta y$. If $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the projected Gaussian peak, which is defined by Eq. (29), the probability $p^j_{kl}(\beta)$ that an electron hits the pixel at the position $(x_k\ y_l)^T$ at the tilt angle $\theta^j$ is approximately given by:

$$ p^j_{kl}(\beta) = p^j(x_k, y_l; \beta) = \int_{x_k - \Delta x/2}^{x_k + \Delta x/2} \int_{y_l - \Delta y/2}^{y_l + \Delta y/2} h^j(x, y; \beta)\,dx\,dy \approx h^j(x_k, y_l; \beta)\,\Delta x\,\Delta y. \tag{40} $$

It will be assumed that the total number of electrons $n_c N_p$ is equally distributed over the two-dimensional projections. In this case, the number of electrons at each projection is equal to $n_c N_p / J$, where $N_p / J$ represents the number of electrons in each projected Gaussian peak. Then, the number of electrons expected to be found at the pixel at the position $(x_k\ y_l)^T$ at the tilt angle $\theta^j$ is given by:

$$ \lambda^j_{kl} = \frac{n_c N_p}{J}\, p^j_{kl}(\beta). \tag{41} $$

This result describes the expectation model containing the parameters $\beta$.

In Sections III.B.1, 2, and 3, expectation models have been given for one-, two-, and three-dimensional observations, respectively. These models describe the expected numbers of detected electron counts, that is, the expectations. Notice that for each expectation model, the components have been assumed to be identical. In future research, this may be extended to nonidentical components, representing, for example, objects consisting of different types of atoms. Moreover, it will be supposed that the observations, which fluctuate about the expectations, are statistically independent and have a Poisson distribution. Therefore, the joint probability density function of the observations $P(\omega; \beta)$, representing the parametric statistical model of the observations, is given by Eq. (10), where the total number of observations $M$ is equal to $K$, $K \times L$, and $K \times L \times J$ for one-, two-, and three-dimensional observations, respectively. In Section III.C, the CRLB on the variance with which position and distance parameters can be estimated will be derived from the obtained parametric statistical models of the observations. Notice that for three-dimensional objects, the estimation of the position and distance parameters has to be interpreted as follows. The parameter estimates are obtained by adapting


the assembly of projected models, given by Eq. (41), to the experimental projected images with respect to the unknown parameters. This procedure is considered rather than adapting the three-dimensional model, such as that given by Eq. (36), to a three-dimensional reconstruction, which may be obtained by combining the projected images using the so-called weighted back-projection method (Frank, 1992). The reason why this alternative procedure is not considered is that the joint probability density function of the three-dimensional reconstruction is unknown. If the joint probability density function is unknown, the CRLB cannot be computed.

C. Approximations of the Cramér-Rao Lower Bound

In this section, rules of thumb will be derived for the highest attainable precision with which the position coordinates of an isolated component and the distance between two components can be measured. In other words, the exact expressions for the CRLB, following from Section II.C.1, will be approximated. This will be done for one-, two-, and three-dimensional objects, for which the parametric statistical models of the observations are described in Section III.B. Throughout this section, the words 'isolated component' and 'two components' should not be interpreted in their strict sense. Expressed in a simplified way, it means that neighboring components may be present as long as these components do not overlap with the one or two components considered. Expressed in a correct way, it means that the elements of the Fisher information matrix associated with a position coordinate of the one or two components considered and a position coordinate of a neighboring component are equal to zero. Hence, the Fisher information matrix and its inverse, the CRLB, are block diagonal. In the derivation of the approximations of the CRLB, only their submatrices need to be considered.
An interpretation of a block diagonal CRLB may easily be given for a (hypothetical) estimator with covariance matrix equal to the CRLB. Then, the zero-elements of the CRLB associated with two different position coordinates indicate that the estimates of these position coordinates are uncorrelated.

1. One-Dimensional Observations

For a one-dimensional object, the approximations of the CRLB on the variance $\sigma^2_{\beta_x}$ of the position $\beta_x$ of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components may be directly obtained from the results presented in (Bettens, van Dyck, den Dekker, Sijbers, and van den Bos, 1999). In that paper, the same expectation model as the one


described in Section III.B.1 was used, but the observations were assumed to be multinomially distributed instead of Poisson distributed. However, it can be shown that the expressions for the elements of the Fisher information matrix are equal under both assumptions and given by Eq. (12). Therefore, the approximations of the CRLB are equal as well. For an isolated component, the CRLB on the variance $\sigma^2_{\beta_x}$ of the position $\beta_x$ is approximated by:

$$ \sigma^2_{\beta_x} \approx \frac{\rho^2}{N_p}, \tag{42} $$

where $\rho$ is the width of the Gaussian peak, which is defined by Eq. (25), and $N_p$ is the total number of electrons in this peak. The conditions for the validity of this approximation are that the pixel size $\Delta x$ is small compared to the width of the Gaussian peak and that the component is located for the most part within the FOV. The approximation is not valid if, for instance, only one half of an object is detected. For two components, under the same conditions, the CRLB on the variance $\sigma^2_d$ of the distance $d$ between these components is approximated by:

$$ \sigma^2_d \approx \frac{4\rho^4}{N_p d^2} \quad \text{if } d \le \sqrt{2}\,\rho \tag{43} $$

and

$$ \sigma^2_d \approx \frac{2\rho^2}{N_p} \quad \text{if } d \ge \sqrt{2}\,\rho. \tag{44} $$

If $d$ is equal to $\sqrt{2}\,\rho$, both approximations are equal to one another. From the comparison of Eqs. (42) and (44), it follows that, if the distance between two components is larger than $\sqrt{2}\,\rho$, $\sigma^2_d$ is twice as large as $\sigma^2_{\beta_x}$. This expresses the fact that for distances larger than $\sqrt{2}\,\rho$, a (hypothetical) estimator attaining the CRLB will provide uncorrelated estimates of the coordinates of neighboring components. From Eqs. (42)–(44), it follows that the precision with which the position or the distance can be measured is a function of the total number of electrons $N_p$ in each component and the width $\rho$ of the Gaussian peaks. The precision may be improved, that is, $\sigma^2_{\beta_x}$ or $\sigma^2_d$ may be decreased, by increasing the number of electrons. Also, the precision will improve if the peaks are narrower. Moreover, if the distance becomes smaller than $\sqrt{2}\,\rho$, the lower bound on the standard deviation $\sigma_d$ of the distance increases inversely proportionally to the distance. In (Bettens, van Dyck, den Dekker, Sijbers, and van den Bos, 1999), it has been shown that the approximations given in this section are useful rules of thumb.
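The rules of thumb of Eqs. (42)-(44) are easy to check numerically. The sketch below, with illustrative values and function names chosen for this example, implements the three approximations and compares Eq. (42) with the inverse of the Fisher information of a single pixelated Gaussian peak, using the standard expression for independent Poisson counts (the sum over pixels of the squared derivative of the expectation divided by the expectation).

```python
import math

def crlb_position(rho, n_p):
    """Rule of thumb of Eq. (42)."""
    return rho ** 2 / n_p

def crlb_distance(rho, n_p, d):
    """Rules of thumb of Eqs. (43) and (44), switching at d = sqrt(2) * rho."""
    if d <= math.sqrt(2.0) * rho:
        return 4.0 * rho ** 4 / (n_p * d ** 2)
    return 2.0 * rho ** 2 / n_p

def numerical_crlb_position(rho, n_p, dx, half_width):
    """Inverse Fisher information for one peak centred at 0, with the
    expectation model lambda_k = N_p F(x_k) dx of Eqs. (26)-(27)."""
    n_pix = int(round(2.0 * half_width / dx)) + 1
    fisher = 0.0
    for k in range(n_pix):
        x = -half_width + k * dx
        lam = n_p * dx * math.exp(-x * x / (2.0 * rho * rho)) / (math.sqrt(2.0 * math.pi) * rho)
        dlam = lam * x / (rho * rho)   # derivative of lambda_k w.r.t. the peak position
        fisher += dlam * dlam / lam
    return 1.0 / fisher

rho, n_p = 1.0, 1000                       # illustrative values, arbitrary units
approx = crlb_position(rho, n_p)
exact = numerical_crlb_position(rho, n_p, dx=0.1, half_width=6.0)
at_switch = crlb_distance(rho, n_p, math.sqrt(2.0) * rho)
```

With small pixels and a peak well inside the field of view, `exact` and `approx` agree closely, and `at_switch` equals `2 * approx`, which illustrates that Eqs. (43) and (44) join continuously at the switching distance.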


2. Two-Dimensional Observations

For a two-dimensional object, the approximations of the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components will be derived for both dark-field and bright-field imaging, for which the expectation models are described in Section III.B.2. The derivations of these lower bounds are similar to those of the one-dimensional object.

First, an isolated component is considered, for which its position coordinates are represented by the elements of the parameter vector $\beta = (\beta_x\ \beta_y)^T$. It will be assumed that this component is located for the most part within the FOV, which means that detection of only one half of an object is not considered. Moreover, the pixel sizes $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, which is defined by Eq. (29). Under these assumptions, the (1, 1)th element of the Fisher information matrix $F$ associated with the position coordinates $\beta$ is approximately equal to its (2, 2)th element, that is, $F_{11} \approx F_{22}$. The reason for this is that the component has rotational symmetry. Furthermore, it follows from Eq. (12) that the Fisher information matrix $F$ is symmetric. Therefore, $F$ simplifies into:

$$ F \approx \begin{pmatrix} F_{11} & F_{12} \\ F_{12} & F_{11} \end{pmatrix}. \tag{45} $$

From Eq. (14), it follows that the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ is given by the corresponding diagonal element of $F^{-1}$:

$$ \sigma^2_{\beta_x} = \sigma^2_{\beta_y} = \left[F^{-1}\right]_{11}. \tag{46} $$

The right-hand member of this equation will be calculated explicitly for dark-field as well as for bright-field imaging, resulting in Eqs. (65) and (68), respectively.

Second, two components are considered, for which the position coordinates are represented by the elements of the parameter vector $\beta = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2})^T$. It will be assumed that the components are located for the most part within the FOV and that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak. Under these assumptions, it can be shown for the elements of the Fisher information matrix $F$ associated with the position coordinates $\beta$ that $F_{11} \approx F_{22}$, $F_{33} \approx F_{44}$, $F_{24} \approx F_{13}$, and $F_{23} \approx F_{14}$. The reason for this is that the components are assumed to be identical and hence interchangeable. Furthermore, $F$ is symmetric. Therefore, $F$ simplifies into:


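$$ F \approx \begin{pmatrix} F_{11} & F_{12} & F_{13} & F_{14} \\ F_{12} & F_{11} & F_{14} & F_{13} \\ F_{13} & F_{14} & F_{33} & F_{34} \\ F_{14} & F_{13} & F_{34} & F_{33} \end{pmatrix}. \tag{47} $$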

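The purpose is to find an expression for the CRLB on the variance $\sigma^2_d$ of the distance between two components. For a two-dimensional object, the distance is defined as:

$$ d = \sqrt{(\beta_{x1} - \beta_{x2})^2 + (\beta_{y1} - \beta_{y2})^2}. \tag{48} $$

Since $d$ is a function of the elements of the parameter vector $\beta$, an expression for $\sigma^2_d$ follows directly from the right-hand member of inequality (15):

$$ \sigma^2_d = \frac{\partial d}{\partial \beta^T}\, F^{-1}\, \frac{\partial d^T}{\partial \beta}, \tag{49} $$

where the Jacobian matrix $\partial d / \partial \beta^T$ is equal to

$$ \frac{\partial d}{\partial \beta^T} = \frac{1}{d} \begin{pmatrix} \beta_{x1} - \beta_{x2} & \beta_{x2} - \beta_{x1} & \beta_{y1} - \beta_{y2} & \beta_{y2} - \beta_{y1} \end{pmatrix}. \tag{50} $$

Equation (49) may be written as:

$$ \sigma^2_d \approx \frac{2}{d^2} \begin{pmatrix} \beta_{x1} - \beta_{x2} & \beta_{y1} - \beta_{y2} \end{pmatrix} \begin{pmatrix} F_{11} - F_{12} & F_{13} - F_{14} \\ F_{13} - F_{14} & F_{33} - F_{34} \end{pmatrix}^{-1} \begin{pmatrix} \beta_{x1} - \beta_{x2} \\ \beta_{y1} - \beta_{y2} \end{pmatrix}. \tag{51} $$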
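The equivalence of Eqs. (49) and (51) for a Fisher information matrix with the structure of Eq. (47) can be verified numerically. The sketch below is an illustration only: the matrix entries and the positions are arbitrary numbers chosen so that the matrix is symmetric and positive definite, and the small linear solver is written out so that the example needs no external library.

```python
import math

def solve(matrix, rhs):
    """Solve matrix @ z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (aug[r][n] - sum(aug[r][c] * z[c] for c in range(r + 1, n))) / aug[r][r]
    return z

def variance_eq49(fisher, bx1, bx2, by1, by2):
    """Eq. (49): Jacobian of Eq. (50) applied to the full 4 x 4 inverse."""
    d = math.hypot(bx1 - bx2, by1 - by2)
    jac = [(bx1 - bx2) / d, (bx2 - bx1) / d, (by1 - by2) / d, (by2 - by1) / d]
    return sum(j * v for j, v in zip(jac, solve(fisher, jac)))

def variance_eq51(fisher, bx1, bx2, by1, by2):
    """Eq. (51): only the 2 x 2 matrix of differences has to be inverted."""
    a, b = bx1 - bx2, by1 - by2
    m11 = fisher[0][0] - fisher[0][1]
    m12 = fisher[0][2] - fisher[0][3]
    m22 = fisher[2][2] - fisher[2][3]
    det = m11 * m22 - m12 * m12
    quad = (a * a * m22 - 2.0 * a * b * m12 + b * b * m11) / det
    return 2.0 * quad / (a * a + b * b)

# Arbitrary test matrix with the symmetry structure of Eq. (47).
fisher = [[5.0, 1.0, 0.5, 0.2],
          [1.0, 5.0, 0.2, 0.5],
          [0.5, 0.2, 4.0, 0.8],
          [0.2, 0.5, 0.8, 4.0]]
v49 = variance_eq49(fisher, 0.0, 1.2, 0.0, 0.5)
v51 = variance_eq51(fisher, 0.0, 1.2, 0.0, 0.5)
```

Both routes give the same value up to rounding, which is the content of Eq. (51): the symmetry between the two identical components reduces the 4 × 4 inversion to a 2 × 2 one.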

The derivation of Eq. (51) is given below. It should be noted that this derivation may be skipped during a first reading without losing the thread of this article.

Derivation of Equation (51). The derivation of Eq. (51) is based on the fact that the $T \times T$ Fisher information matrix $F$ may easily be transformed into a block diagonal matrix $F_D$ if $F$ is invariant under a transformation of the parameters $\beta$ to $M\beta$, where the $T \times T$ matrix $M$ represents a symmetry operation. This supposition will first be proven. The condition that $F$ is invariant under a symmetry operation $M$ is mathematically written as:

$$ F = M^T F M, \tag{52} $$

where the matrix $M$ has the property

$$ M^n = I \tag{53} $$


with $n$ an integer and $I$ the identity matrix. Next, suppose that the eigenvectors and eigenvalues of $M$ are represented by the columns $\upsilon_i$, $i = 1, \ldots, T$ of the $T \times T$ matrix $V$ and the elements $\lambda_i$, $i = 1, \ldots, T$ of the $T \times T$ diagonal matrix $\Lambda$, respectively, or equivalently, in symbols:

$$ M V = V \Lambda. \tag{54} $$

Then, it follows from Eq. (54) that

$$ M^n V = V \Lambda^n. \tag{55} $$

Furthermore, it follows from Eq. (53) that

$$ M^n V = V. \tag{56} $$

Combining Eqs. (55) and (56) results in:

$$ \Lambda^n = I. \tag{57} $$

This means that the eigenvalues of $M$ are equal to $\exp(i 2\pi r / n)$, with $r = 0, 1, \ldots, n - 1$. Since the dimension $T$ of the Fisher information matrix $F$ is usually larger than $n$, these eigenvalues are degenerate. From Eqs. (52) and (54), it follows that:

$$ V^T F V = \Lambda^T V^T F V \Lambda. \tag{58} $$

The notation $F_D$ will be used to indicate the matrix $V^T F V$. It will now be shown that $F_D$ is block diagonal. The $(i, j)$th element of $F_D$, represented by $\upsilon_i^T F \upsilon_j$, is calculated by subsequent use of Eqs. (52) and (54) as follows:

$$ (F_D)_{ij} = \upsilon_i^T F \upsilon_j = \upsilon_i^T M^T F M \upsilon_j = \lambda_i^* \lambda_j\, \upsilon_i^T F \upsilon_j, \tag{59} $$

where the symbol * denotes the complex conjugate. Thus, $\upsilon_i^T F \upsilon_j$ is equal to $\lambda_i^* \lambda_j\, \upsilon_i^T F \upsilon_j$. This relation is trivial if $\upsilon_i$ and $\upsilon_j$ have the same eigenvalue, since then $\lambda_i^* \lambda_j = 1$. If, on the other hand, $\upsilon_i$ and $\upsilon_j$ have different eigenvalues, $\upsilon_i^T F \upsilon_j$ has to be equal to 0, since $\lambda_i^* \lambda_j \neq 1$. Therefore, the matrix $F_D = V^T F V$ is block diagonal, which proves the supposition.

The supposition, which is discussed above, will now be used to derive Eq. (51). Since the two components are assumed to be identical, the Fisher information matrix $F$, given by Eq. (47), is invariant under interchanging the components. Thus, the matrix $M$, which represents this symmetry operation, is given by:

$$ M = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \tag{60} $$


The matrix of eigenvectors $V$ of $M$ and the matrix of eigenvalues $\Lambda$ of $M$ are given by:

$$ V = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 0 & 1 & 0 \\ 1 & 0 & -1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & 0 & -1 \end{pmatrix} \tag{61} $$

and

$$ \Lambda = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{62} $$

The matrix $F_D = V^T F V$ is block diagonal, as predicted by the preceding supposition, and is equal to

$$ F_D = \begin{pmatrix} F_{11} + F_{12} & F_{13} + F_{14} & 0 & 0 \\ F_{13} + F_{14} & F_{33} + F_{34} & 0 & 0 \\ 0 & 0 & F_{11} - F_{12} & F_{13} - F_{14} \\ 0 & 0 & F_{13} - F_{14} & F_{33} - F_{34} \end{pmatrix}. \tag{63} $$

Since $F_D$ is defined as $V^T F V$, it follows that:

$$ F^{-1} = V F_D^{-1} V^T, \tag{64} $$

where $F_D$ is given by Eq. (63). Equation (64) allows one to easily invert the $4 \times 4$ Fisher information matrix $F$ associated with the position coordinates $\beta$ since $F_D$ is block diagonal. The inverse of $F_D$ is block diagonal as well, with submatrices equal to the inverse of the $2 \times 2$ submatrices of $F_D$. Next, the result of Eq. (64) is substituted into Eq. (49), resulting in Eq. (51).

Next, the right-hand member of Eq. (51) will be calculated explicitly for distances that are either small or large compared to the width $\rho$ of the Gaussian peak, and for dark-field as well as for bright-field imaging.

Dark-Field Imaging. The CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components are given for dark-field imaging. The results are obtained from the explicit calculations of the expressions given by the right-hand members of Eqs. (46) and (51), which are given in Appendix A. For an isolated component, the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ is approximated by:


$$ \sigma^2_{\beta_x} = \sigma^2_{\beta_y} \approx \frac{\rho^2}{N_p}, \tag{65} $$

where $\rho$ is the width of the Gaussian peak, which is defined by Eq. (29), and $N_p$ is the total number of electrons in this peak. For two components, the CRLB on the variance $\sigma^2_d$ of the distance $d$ between these components is approximated by:

$$ \sigma^2_d \approx \frac{4\rho^4}{N_p d^2} \quad \text{if } d \le \sqrt{2}\,\rho \tag{66} $$

and

$$ \sigma^2_d \approx \frac{2\rho^2}{N_p} \quad \text{if } d \ge \sqrt{2}\,\rho. \tag{67} $$

If $d$ is equal to $\sqrt{2}\,\rho$, both approximations are equal to one another. Notice that Eqs. (65), (66), and (67) are equal to their one-dimensional analogues, which are given by Eqs. (42), (43), and (44). Moreover, from the comparison of Eqs. (65) and (67), it follows that, if the distance between the two components is larger than $\sqrt{2}\,\rho$, $\sigma^2_d$ is twice as large as $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$. This expresses the fact that for distances larger than $\sqrt{2}\,\rho$, an estimator attaining the CRLB will provide uncorrelated estimates of the coordinates of neighboring components. The approximations of $\sigma^2_{\beta_x}$, $\sigma^2_{\beta_y}$, and $\sigma^2_d$ are valid if the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak and if the components lie for the most part within the FOV. The approximation is not valid if, for instance, only one half of an object is detected.

Bright-Field Imaging. The CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the variance $\sigma^2_d$ of the distance $d$ between two components are given for bright-field imaging. The results are obtained from the explicit calculations of Eqs. (46) and (51), which are given in Appendix B. For an isolated component, the CRLB on the variance $\sigma^2_{\beta_x}$ or $\sigma^2_{\beta_y}$ is approximated by:

$$ \sigma^2_{\beta_x} = \sigma^2_{\beta_y} \approx \frac{8\pi\rho^4\,\mathrm{FOV}}{N \Omega^2}, \tag{68} $$

where $\rho$ is the width of the Gaussian peak, which is defined by Eq. (29), FOV is the field of view, $N$ is the total number of detected electrons, and $\Omega$ represents the strength of the interaction of the incident electrons with one component. Notice that $N/\mathrm{FOV}$ denotes the total number of detected electrons per unit area. For two components, the CRLB on the variance $\sigma^2_d$ of the distance $d$ between these components is approximated by:


s2d

64pr6 FOV  3NO2 d2

rffiffiffiffiffiffi 4 r if d  3

ð69Þ

s2d

16pr4 FOV  NO2

rffiffiffiffiffiffi 4 r if d 3

ð70Þ

and
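As a brief consistency check on the two regimes (a worked step added here, not taken from Appendix B), equating the right-hand members of Eqs. (69) and (70) gives the distance at which they meet, and dividing the right-hand member of Eq. (70) by that of Eq. (68) gives the factor relating the two bounds:

$$\frac{64\pi\rho^6\,\mathrm{FOV}}{3N\Omega^2 d^2} = \frac{16\pi\rho^4\,\mathrm{FOV}}{N\Omega^2} \;\Longrightarrow\; d^2 = \tfrac{4}{3}\rho^2 \;\Longrightarrow\; d = \sqrt{4/3}\,\rho, \qquad \frac{16\pi\rho^4\,\mathrm{FOV}/(N\Omega^2)}{8\pi\rho^4\,\mathrm{FOV}/(N\Omega^2)} = 2.$$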

If $d$ is equal to $\sqrt{4/3}\,\rho$, both approximations are equal to one another. From the comparison of Eqs. (68) and (70), it follows that, if the distance between the two components is larger than $\sqrt{4/3}\,\rho$, $\sigma_d^2$ is twice as large as $\sigma_{\beta_x}^2$ or $\sigma_{\beta_y}^2$. This expresses the fact that for distances larger than $\sqrt{4/3}\,\rho$, an estimator attaining the CRLB will provide uncorrelated estimates of the coordinates of neighboring components. The approximations of $\sigma_{\beta_x}^2$, $\sigma_{\beta_y}^2$, and $\sigma_d^2$ are valid if the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak and if the components lie for the most part within the FOV. The approximation is not valid if, for instance, only one half of an object is detected.

The rules of thumb for dark-field and bright-field imaging, which are described by Eqs. (65)–(70), are scalar measures that may be used to obtain insight into statistical experimental design. The precision with which the position coordinates and the distance can be measured is a function of the number of electrons and the width of the peaks. The attainable precision may be quantified and improved by increasing the number of detected electrons per unit area or by narrowing the peaks. In practice, it follows from Eq. (30) that the peaks may be narrowed by narrowing the point spread function, that is, by improving the two-point resolution of the electron microscope. However, it is important to notice that below a certain width of the point spread function, the precision is limited by the intrinsic width of the components, for instance, by the width of the electrostatic potential of the atoms (den Dekker, Sijbers, and van Dyck, 1999). Then, further narrowing of the point spread function is useless. This result is meaningful in practice.
For example, in STEM experiments, further narrowing of the probe, which represents the point spread function, is not so beneficial in terms of precision since the width of the probe is currently almost equal to the width of an atom (Krivanek, Dellby, and Nellist, 2002). Moreover, as in STEM, if a narrower point spread function is accompanied by a decrease of the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints. Also, from the rules of thumb, it follows that the precision may be orders of magnitude better than the two-point resolution of the imaging instrument if the number of detected electrons per unit area is large. Furthermore, if the distance becomes smaller than $\sqrt{2}\,\rho$ or $\sqrt{4/3}\,\rho$ for dark-field and bright-field imaging, respectively, the lower bound on the standard deviation $\sigma_d$ of the distance $d$ increases in inverse proportion to the distance. In Section III.D.1, which consists of discussions and examples, it will be shown that the lower bounds on the standard deviation $\sigma_{\beta_x}$ or $\sigma_{\beta_y}$ of the position coordinates $\beta_x$ or $\beta_y$, respectively, of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ are well approximated by the square roots of the right-hand members of Eqs. (65)–(70).

3. Three-Dimensional Observations

For a three-dimensional object, the derivation of rules of thumb for the highest attainable precision, that is, the CRLB, with which the position coordinates of an isolated component or the distance between two components can be estimated is similar to its two-dimensional analogue. First, an isolated component is considered, for which its position coordinates are represented by the elements of the parameter vector $\beta = (\beta_x\ \beta_y\ \beta_z)^T$. The symmetric Fisher information matrix $F$ associated with the position coordinates $\beta$ is given by:

$$F = \begin{pmatrix} F_{11} & F_{12} & F_{13} \\ F_{12} & F_{22} & F_{23} \\ F_{13} & F_{23} & F_{33} \end{pmatrix}. \tag{71}$$

From Eq. (14), it follows that the CRLB on the variance $\sigma_{\beta_x}^2$, $\sigma_{\beta_y}^2$ or $\sigma_{\beta_z}^2$ of the position coordinates $\beta_x$, $\beta_y$ or $\beta_z$, respectively, is given by its corresponding diagonal element of $F^{-1}$:

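$$\sigma_{\beta_x}^2 = \left[F^{-1}\right]_{11}, \qquad \sigma_{\beta_y}^2 = \left[F^{-1}\right]_{22}, \qquad \sigma_{\beta_z}^2 = \left[F^{-1}\right]_{33}. \tag{72}$$

The right-hand members of these equations are calculated explicitly in Appendix C, resulting in:

$$\sigma_{\beta_x}^2 = \sigma_{\beta_z}^2 \approx \frac{2\rho^2}{N_p} \tag{73}$$

and

$$\sigma_{\beta_y}^2 \approx \frac{\rho^2}{N_p} \tag{74}$$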

where $\rho$ is the width of the projected Gaussian peak, which is defined by Eq. (29), and $N_p$ is the total number of detected electrons in the component. The conditions for the validity of the approximations are that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to $\rho$, that the difference $\Delta\theta$ between successive tilt angles is small compared to the full angular tilt range, and that the component is located for the most part within the region of observation. Furthermore, the tilt angles $\theta_j$ are assumed to be equidistantly located on the interval $(-\pi/2, \pi/2)$. From the comparison of Eqs. (73) and (74) with Eqs. (42) and (65), it follows that the lower bound on the variance with which the $y$-coordinate or the $x$- and $z$-coordinates of the position can be estimated is equal to or twice as large as their one- and two-dimensional analogues, respectively. Recall that the $y$-coordinate is the coordinate along the rotation axis and that the $x$- and $z$-coordinates are the coordinates perpendicular to the rotation axis.

Second, two components, for which their position coordinates are represented by the elements of the parameter vector $\beta = (\beta_{x_1}\ \beta_{x_2}\ \beta_{y_1}\ \beta_{y_2}\ \beta_{z_1}\ \beta_{z_2})^T$, are considered. It will be assumed that the three-dimensional components are located for the most part within the region of observation and that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the projected Gaussian peak. Furthermore, the Fisher information matrix $F$ associated with the position coordinates $\beta$ is a symmetric matrix. Under the assumptions given above and the symmetry property of the Fisher information matrix, it may be shown that

$$F \approx \begin{pmatrix} F_{11} & F_{12} & F_{13} & F_{14} & F_{15} & F_{16} \\ F_{12} & F_{11} & F_{14} & F_{13} & F_{16} & F_{15} \\ F_{13} & F_{14} & F_{33} & F_{34} & F_{35} & F_{36} \\ F_{14} & F_{13} & F_{34} & F_{33} & F_{36} & F_{35} \\ F_{15} & F_{16} & F_{35} & F_{36} & F_{55} & F_{56} \\ F_{16} & F_{15} & F_{36} & F_{35} & F_{56} & F_{55} \end{pmatrix}. \tag{75}$$

The reason for this is that the components are assumed to be identical and hence interchangeable. The purpose is to find an expression for the CRLB on the variance $\sigma_d^2$ of the distance between two components.
For a three-dimensional object, the distance is defined as:

$$d = \sqrt{(\beta_{x_1} - \beta_{x_2})^2 + (\beta_{y_1} - \beta_{y_2})^2 + (\beta_{z_1} - \beta_{z_2})^2}. \tag{76}$$

Since $d$ is a function of the elements of the parameter vector $\beta$, an expression for $\sigma_d^2$ follows directly from the right-hand member of inequality (15):

$$\sigma_d^2 = \frac{\partial d}{\partial \beta^T}\, F^{-1}\, \frac{\partial d^T}{\partial \beta}, \tag{77}$$

where the Jacobian matrix $\partial d/\partial \beta^T$ is equal to

$$\frac{\partial d}{\partial \beta^T} = \frac{1}{d} \begin{pmatrix} \beta_{x_1} - \beta_{x_2} & \beta_{x_2} - \beta_{x_1} & \beta_{y_1} - \beta_{y_2} & \beta_{y_2} - \beta_{y_1} & \beta_{z_1} - \beta_{z_2} & \beta_{z_2} - \beta_{z_1} \end{pmatrix}. \tag{78}$$

Following the same lines of thought as in the derivation of Eq. (51), it can be shown that $\sigma_d^2$ may be approximated by:

$$\sigma_d^2 \approx \frac{2}{d^2} \begin{pmatrix} \beta_{x_1} - \beta_{x_2} & \beta_{y_1} - \beta_{y_2} & \beta_{z_1} - \beta_{z_2} \end{pmatrix} \begin{pmatrix} F_{11} - F_{12} & F_{13} - F_{14} & F_{15} - F_{16} \\ F_{13} - F_{14} & F_{33} - F_{34} & F_{35} - F_{36} \\ F_{15} - F_{16} & F_{35} - F_{36} & F_{55} - F_{56} \end{pmatrix}^{-1} \begin{pmatrix} \beta_{x_1} - \beta_{x_2} \\ \beta_{y_1} - \beta_{y_2} \\ \beta_{z_1} - \beta_{z_2} \end{pmatrix}. \tag{79}$$
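The step from Eqs. (77) and (78) to Eq. (79) can be checked numerically. The sketch below is an illustration under stated assumptions, not the derivation of Appendix C: it builds an arbitrary positive definite matrix with the symmetry pattern of Eq. (75) from two hypothetical 3 × 3 blocks `P` (same-component terms) and `Q` (cross-component terms), and compares the full six-parameter expression with the reduced three-parameter one. All function and variable names are made up for this example.

```python
import numpy as np


def build_fisher(P, Q):
    """Assemble a 6x6 matrix with the pattern of Eq. (75).

    Parameter order is (x1, x2, y1, y2, z1, z2). P couples coordinates of the
    same component, Q couples coordinates of different components.
    """
    same = np.eye(2)
    swap = np.array([[0.0, 1.0], [1.0, 0.0]])
    return np.kron(P, same) + np.kron(Q, swap)


def var_distance_full(F, b1, b2):
    """Eq. (77) with the Jacobian of Eq. (78)."""
    u = b1 - b2
    d = np.linalg.norm(u)
    jac = np.array([u[0], -u[0], u[1], -u[1], u[2], -u[2]]) / d
    return jac @ np.linalg.solve(F, jac)


def var_distance_reduced(F, b1, b2):
    """Eq. (79): only differences such as F11 - F12 and F13 - F14 enter."""
    u = b1 - b2
    d = np.linalg.norm(u)
    M = F[0::2, 0::2] - F[0::2, 1::2]
    return 2.0 / d**2 * (u @ np.linalg.solve(M, u))


# Arbitrary test matrices (not derived from any imaging model).
rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3))
B = rng.normal(size=(3, 3))
P = A @ A.T + 3.0 * np.eye(3)   # symmetric, positive definite
Q = 0.1 * (B + B.T)             # symmetric, small compared to P
F = build_fisher(P, Q)
b1 = np.array([1.0, 2.0, -0.5])
b2 = np.array([-0.3, 0.4, 1.1])
print(var_distance_full(F, b1, b2), var_distance_reduced(F, b1, b2))
```

For a matrix that has the pattern of Eq. (75) exactly, the two expressions agree to rounding error; the approximation sign in Eq. (79) then reflects only the approximation already made in Eq. (75).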

The expression given by the right-hand member of Eq. (79) has been calculated explicitly in Appendix C for the special cases where the distance between the two components is small or large compared to the width $\rho$ of the projected Gaussian peak. This results in the following rules of thumb:

$$\sigma_d^2 \approx \frac{4\rho^4\, V(\phi)}{N_p\, d^2} \quad \text{if } d \le \sqrt{2V(\phi)/W(\phi)}\,\rho \tag{80}$$

and

$$\sigma_d^2 \approx \frac{2\rho^2\, W(\phi)}{N_p} \quad \text{if } d \ge \sqrt{2V(\phi)/W(\phi)}\,\rho \tag{81}$$

where

$$V(\phi) = 4\,\frac{3\cos^4\phi - 3\cos^2\phi - 2}{\cos^4\phi - 6\cos^2\phi - 3}, \tag{82}$$

$$W(\phi) = 1 + \sin^2\phi, \tag{83}$$

and $\phi$ is the angle between the rotation axis and the axis that connects the two components. This angle has been visualized in Figure 3. It should be mentioned that this is not the tilt angle. For different tilt angles $\theta_j$ in a tilt series, $\phi$ is constant. The conditions for the validity of the approximations are that the components are located for the most part within the region of observation, that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to $\rho$, and that the difference $\Delta\theta$ between successive tilt angles is small compared to the full angular tilt range. Furthermore, the tilt angles $\theta_j$ are assumed to be equidistantly located on the interval $(-\pi/2, \pi/2)$. If $d$ is equal to $\sqrt{2V(\phi)/W(\phi)}\,\rho$, for a given angle $\phi$, both approximations are equal to one another. From Eqs. (80) and (81), it follows that the precision with which


the distance can be estimated is a function of the total number of electrons, the width of the peaks, the distance between the components, and the angle $\phi$. If $\phi$ is equal to $\pi/2$, the approximated $\sigma_d^2$ is about 2 times as large as if $\phi$ is equal to 0. In terms of the standard deviation this corresponds to a factor $\sqrt{2}$. Moreover, if $\phi$ is equal to 0, the approximations given by Eqs. (80)–(81) are equal to their one- and two-dimensional analogues given by Eqs. (43)–(44) and (66)–(67), respectively. This is intuitively clear since the components are then on the rotation axis and therefore the distance between the components in a two-dimensional projection is at each tilt angle equal to the real distance. In Section III.D it will be shown that the lower bounds on the standard deviation $\sigma_{\beta_x}$, $\sigma_{\beta_y}$ or $\sigma_{\beta_z}$ of the position coordinates $\beta_x$, $\beta_y$ or $\beta_z$, respectively, of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components are well approximated by the square roots of the right-hand members of Eqs. (73)–(74) and (80)–(81), respectively.

D. Discussions and Examples

In this section, the exactly calculated lower bounds on the standard deviation of the position coordinates of an isolated component and on the standard deviation of the distance will be compared with their approximations. This will be done for two- and three-dimensional objects. For one-dimensional objects, a discussion may be found in Bettens, van Dyck, den Dekker, Sijbers, and van den Bos (1999).

1. Two-Dimensional Observations

The approximations of the lower bound on the standard deviation, which are derived in Section III.C.2 for two-dimensional objects, will be investigated by means of examples, for dark-field as well as for bright-field imaging.

a. Dark-Field Imaging.
The approximations of the lower bound on the standard deviation $\sigma_{\beta_x}$ and $\sigma_{\beta_y}$ of the position coordinates $\beta_x$ and $\beta_y$ of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components, which are described by the square roots of the right-hand members of Eqs. (65)–(67), are discussed for dark-field imaging experiments. These approximations will be compared with their exactly calculated lower bounds, which are found by numerical computation of the CRLB with respect to the position coordinates. The expressions for the CRLB are given in Section II.C.1. The CRLB is computed by substitution of the parametric statistical model of dark-field imaging observations, which is derived in Section III.B, into the obtained expressions. Unless otherwise


stated, the total number of electrons in a Gaussian peak, the width of this peak, the pixel sizes, and the field of view are given by the numbers of Table 1. Moreover, it is assumed that the center of mass of the components coincides with the center of the field of view. Figure 4 shows the exactly calculated lower bound on the standard deviation of the position coordinates together with its approximation as a function of the width of the Gaussian peak. Furthermore, Figure 5 shows the exactly calculated lower bound on the standard deviation of the distance and its approximations as a function of the distance between two components. From these figures, it is observed that the square roots of the right-hand members of Eqs. (65)–(67) are accurate approximations of $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, and $\sigma_d$. One of the assumptions that is made in the derivation of Eqs. (65)–(67) is that the pixel sizes $\Delta x$ and $\Delta y$ are small compared to the width of the

TABLE 1
Total Number of Electrons in a Gaussian Peak ($N_p$), the Width ($\rho$) of this Peak, the Pixel Sizes ($\Delta x$ and $\Delta y$), and the Field of View (FOV)

$N_p$     $\rho$   $\Delta x$   $\Delta y$   FOV
15,000    20       1.2          1.2          200 × 200

Figure 4. The exactly calculated lower bound on the standard deviation of the position coordinates and its approximation, given by the square root of the right-hand member of Eq. (65), as a function of the width of the Gaussian peak.


Gaussian peak. Therefore, Figure 6 shows the exactly calculated lower bound on the standard deviation $\sigma_d$ of the distance as a function of the pixel size $\Delta x$, which has been assumed to be equal to $\Delta y$. The distance between the two components is equal to 10. From this figure, it is seen that below a certain pixel size, $\sigma_d$ decreases only slightly with decreasing pixel size, with

Figure 5. The exactly calculated lower bound on the standard deviation of the distance and its approximations, given by the square roots of the right-hand members of Eqs. (66) and (67), as a function of the distance between two components.

Figure 6. The exactly calculated lower bound on the standard deviation of the distance as a function of the pixel size $\Delta x$, with $\Delta y = \Delta x$. The distance is equal to 10.


all other quantities kept constant. Hence, the precision that is gained by decreasing the pixel size is only marginal. This was also observed for one-dimensional observations by Bettens et al. (1999). This has to do with the fact that the pixel signal-to-noise ratio (SNR) decreases with decreasing pixel size.

Finally, it is examined whether there exists an estimator attaining the CRLB on the variance of the position coordinates and on the variance of the distance, and whether this estimator may be considered unbiased. If so, this would justify the choice of the CRLB as precision based optimality criterion. Generally, one may use different estimators in order to estimate the position coordinates or the distance, such as the least squares estimator or the maximum likelihood estimator, which has been introduced in Section II.D. Different estimators have different properties. One of the asymptotic properties of the maximum likelihood estimator is that it is normally distributed about the true parameters with a covariance matrix approaching the CRLB (van den Bos, 1982). This property would justify the use of the CRLB as optimality criterion, but it is an asymptotic one. This means that it applies to an infinite number of observations. However, the number of observations used in the examples given above is finite and even relatively small. Whether asymptotic properties still apply to such experiments can often only be assessed by estimating from artificial, simulated observations (van den Bos, 1999). Therefore, 600 different dark-field experiments made at an isolated component are simulated; the observations are modelled using the parametric statistical model described in Section III.B. Next, the position coordinates $\beta_x$ and $\beta_y$ of the component are estimated from each simulation experiment using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively.
The lower bound on the variance is computed by substituting the true values of the parameters into the expression for the CRLB. The results are presented in Table 2. From the comparison of these results, it cannot be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB. The maximum likelihood estimates of $\beta_x$ are presented in the histogram of Figure 7. The solid curve represents a normal distribution with mean and variance given in Table 2. This curve makes plausible that the estimates are normally distributed. This property is also tested quantitatively by means of the so-called Lilliefors test (Conover, 1980), which does not reject the hypothesis that the estimates are normally distributed. From the results obtained from the simulation experiments, it is concluded that the maximum likelihood estimates cannot be distinguished from unbiased, efficient estimates. These results justify the choice of the CRLB as optimality criterion.
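A simulation experiment of this kind can be sketched as follows. This is a simplified stand-in, not the authors' procedure: it assumes a background-free Gaussian peak with known width and known number of electrons, Poisson distributed pixel counts, and the Table 1 values as defaults. Under those assumptions, and with small pixels and a peak well inside the field of view, maximizing the Poisson log-likelihood over the position reduces approximately to the count-weighted centroid, which is used here in place of a numerical optimizer. The function name and defaults are hypothetical.

```python
import numpy as np


def simulate_centroids(n_experiments, n_p=15_000, rho=20.0, pixel=1.2,
                       fov=200.0, seed=0):
    """Simulate dark-field images of one peak at (0, 0) and estimate its position."""
    rng = np.random.default_rng(seed)
    centers = np.arange(-fov / 2 + pixel / 2, fov / 2, pixel)  # pixel centres
    x, y = np.meshgrid(centers, centers, indexing="ij")
    # Expected number of electrons per pixel (assumed Gaussian peak model).
    lam = (n_p * pixel**2 / (2 * np.pi * rho**2)
           * np.exp(-(x**2 + y**2) / (2 * rho**2)))
    estimates = np.empty((n_experiments, 2))
    for i in range(n_experiments):
        counts = rng.poisson(lam)  # one simulated image
        total = counts.sum()
        # Count-weighted centroid: approximate ML estimate under the assumptions above.
        estimates[i] = (np.sum(counts * x) / total, np.sum(counts * y) / total)
    return estimates


if __name__ == "__main__":
    est = simulate_centroids(600)
    print("estimated means:    ", est.mean(axis=0))
    print("estimated variances:", est.var(axis=0, ddof=1))
    print("rule of thumb rho^2/N_p:", 20.0**2 / 15_000)
```

With these settings the sample variances should scatter around $\rho^2/N_p$ and the sample means around zero, in the same way as the entries of Table 2.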

TABLE 2
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 600 Maximum Likelihood Estimates of the Position Coordinates, Respectively

Parameter    True position coordinate    Estimated mean    Standard deviation of mean
$\beta_x$    0                           0.9 × 10⁻³        6.6 × 10⁻³
$\beta_y$    0                           12.5 × 10⁻³       6.4 × 10⁻³

Parameter                Lower bound on variance    Estimated variance    Standard deviation of variance
$\sigma_{\beta_x}^2$     26.7 × 10⁻³                25.8 × 10⁻³           1.5 × 10⁻³
$\sigma_{\beta_y}^2$     26.7 × 10⁻³                24.4 × 10⁻³           1.4 × 10⁻³

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.
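The entries of the last column are consistent with the usual large-sample expressions for normally distributed estimates. This is an interpretation added here, since the text does not state how they were obtained. With $n = 600$ estimates and sample variance $s^2$:

$$\widehat{\operatorname{std}}(\bar{\beta}) = \sqrt{\frac{s^2}{n}}, \qquad \widehat{\operatorname{std}}(s^2) \approx s^2 \sqrt{\frac{2}{n-1}}.$$

For $\beta_x$, this gives $\sqrt{25.8 \times 10^{-3}/600} \approx 6.6 \times 10^{-3}$ and $25.8 \times 10^{-3}\sqrt{2/599} \approx 1.5 \times 10^{-3}$, in agreement with Table 2.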

Figure 7. Histogram of 200 maximum likelihood estimates of the x-coordinate of the position of a component. The normal distribution superimposed on this histogram makes plausible that the estimates are normally distributed.
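The "exactly calculated" lower bound referred to in this subsection can be reproduced in outline by summing the Fisher information over all pixels. The sketch below is an assumption-laden illustration, not the model of Section III.B itself: it assumes independent Poisson counts with expectations given by a single background-free Gaussian peak, for which the Fisher information is $F_{rs} = \sum_k \lambda_k^{-1}\,(\partial\lambda_k/\partial\beta_r)(\partial\lambda_k/\partial\beta_s)$. The function name and defaults (taken from Table 1) are hypothetical.

```python
import numpy as np


def crlb_position(n_p=15_000, rho=20.0, pixel=1.2, fov=200.0, beta=(0.0, 0.0)):
    """Numerical CRLB (2x2 matrix) for the position of one Gaussian peak."""
    centers = np.arange(-fov / 2 + pixel / 2, fov / 2, pixel)  # pixel centres
    x, y = np.meshgrid(centers, centers, indexing="ij")
    dx, dy = x - beta[0], y - beta[1]
    # Expected counts per pixel (assumed model).
    lam = (n_p * pixel**2 / (2 * np.pi * rho**2)
           * np.exp(-(dx**2 + dy**2) / (2 * rho**2)))
    # d(lam)/d(beta_r) = lam * w_r, so F_rs = sum_k lam_k * w_r,k * w_s,k.
    w = np.stack([dx, dy]) / rho**2
    fisher = np.einsum("ij,rij,sij->rs", lam, w, w)
    return np.linalg.inv(fisher)


crlb = crlb_position()
print("numerical CRLB on the variances:", np.diag(crlb))
print("rule of thumb rho^2 / N_p:      ", 20.0**2 / 15_000)
```

For the Table 1 values the numerical result and the rule of thumb $\rho^2/N_p \approx 26.7 \times 10^{-3}$ agree closely, because the pixels are small compared to $\rho$ and the peak lies almost entirely within the field of view.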

b. Bright-Field Imaging. The approximations of the lower bound on the standard deviation $\sigma_{\beta_x}$ and $\sigma_{\beta_y}$ of the position coordinates $\beta_x$ and $\beta_y$ of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components, which are described by the square roots of the


right-hand members of Eqs. (68)–(70), are discussed for bright-field imaging experiments. These approximations will be compared with their exactly calculated lower bounds, which are found by numerical computation of the CRLB with respect to the position coordinates. The expressions for the CRLB are given in Section II.C.1. The CRLB is computed by substitution of the parametric statistical model of bright-field imaging observations, which is derived in Section III.B, into the obtained expressions. Unless otherwise stated, the total number of electrons, the width of the Gaussian peak, the constant representing the strength of the interaction, the pixel sizes, and the field of view are given by the numbers of Table 3. Moreover, it is assumed that the center of mass of the components coincides with the center of the field of view. Figure 8 shows the exactly calculated lower bound on the standard deviation of the position coordinates together with its approximation as a function of the constant $\Omega$, which represents the strength of the interaction. Furthermore, Figure 9 shows the exactly calculated lower bound on the

TABLE 3
The Total Number of Electrons ($N$), the Width ($\rho$) of the Gaussian Peak, the Constant ($\Omega$) Representing the Strength of the Interaction, the Pixel Sizes ($\Delta x$ and $\Delta y$), and the Field of View (FOV)

$N$           $\rho$   $\Omega$   $\Delta x$   $\Delta y$   FOV
18,000,000    20       100        1.2          1.2          200 × 200

Figure 8. The exactly calculated lower bound on the standard deviation of the position coordinates and its approximation, given by the square root of the right-hand member of Eq. (68), as a function of the constant $\Omega$ representing the interaction strength.


Figure 9. The exactly calculated lower bound on the standard deviation of the distance and its approximations, given by the square roots of the right-hand members of Eqs. (69) and (70), as a function of the distance between two components.

standard deviation of the distance and its approximations as a function of the distance between two components. From these figures, it is observed that the square roots of the right-hand members of Eqs. (68)–(70) are accurate approximations of $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, and $\sigma_d$. As for dark-field imaging, it is examined by means of simulation experiments whether the maximum likelihood estimator attains the CRLB on the variance of the distance between two components and whether it is unbiased. The observations made at these components are modelled using the parametric statistical model for bright-field imaging described in Section III.B. From 600 different simulation experiments, the distance is estimated using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the distance and the lower bound on the variance, respectively. The lower bound on the variance is computed by substituting the true values of the parameters into the expression for the CRLB. The results are presented in Table 4. From the comparison of these results, it cannot be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB.

2. Three-Dimensional Observations

The approximations of the lower bounds on the standard deviation $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, and $\sigma_{\beta_z}$ of the position coordinates $\beta_x$, $\beta_y$, and $\beta_z$ of an isolated component and on the standard deviation $\sigma_d$ of the distance $d$ between two components, which are described by the square roots of the right-hand members of

TABLE 4
Comparison of True Distance and Lower Bound on the Variance with Estimated Mean and Variance of 600 Maximum Likelihood Estimates of the Distance between Two Components, Respectively

Parameter    True distance    Estimated mean    Standard deviation of mean
$d$          60               60.0              0.5

Parameter       Lower bound on variance    Estimated variance    Standard deviation of variance
$\sigma_d^2$    1.27                       1.29                  0.07

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.

TABLE 5
The Total Number of Projected Images ($J$), the Number of Electrons in each Projected Gaussian Peak ($N_p/J$), the Width ($\rho$) of this Peak, the Pixel Sizes ($\Delta x$ and $\Delta y$), the Field of View (FOV) of each Projected Image, and the Angle ($\phi$) between the Rotation Axis and the Axis Connecting Two Components

$J$    $N_p$     $\rho$   $\Delta x$   $\Delta y$   FOV          $\phi$
20     15,000    20       1.2          1.2          200 × 200    $\pi/2$

Eqs. (73), (74), (80), and (81) in Section III.C.3, are investigated by means of examples. These approximations will be compared with their exactly calculated lower bounds, which are found by numerical computation of the CRLB with respect to the position coordinates. The expressions for the CRLB are given in Section II.C.1. The CRLB is computed by substitution of the parametric statistical model of the three-dimensional observations, which is given in Section III.B, into the obtained expressions. Unless otherwise stated, the total number of projected images, the number of electrons in each projected Gaussian peak, the width of this peak, the pixel sizes, the field of view of each projected image, and, in case of two components, the angle between the rotation axis and the axis connecting these components are given by the numbers of Table 5. Moreover, it is assumed that the center of mass of the components coincides with the center of the field of view. Figure 10 shows the exactly calculated lower bound on the standard deviation of the position coordinates and its approximations as a function of the width of the projected Gaussian peak, which is described by Eq. (29). Furthermore, Figure 11 shows the exactly calculated lower bound on the standard deviation of the distance and its approximations as a function of the distance between two components. The axis combining both


Figure 10. The exactly calculated lower bound on the standard deviation of the position coordinates and its approximations, given by the square roots of the right-hand members of Eqs. (73) and (74), as a function of the width of the projected Gaussian peak.

Figure 11. The exactly calculated lower bound on the standard deviation of the distance and its approximations, given by the square roots of the right-hand members of Eqs. (80) and (81), as a function of the distance between two components.

components is assumed to be perpendicular to the rotation axis. Moreover, in Figures 12 and 13, the exactly calculated lower bound on the standard deviation of the distance and its approximations are shown as a function of the angle $\phi$, for the distance between the components being small and large compared to the width of the projected Gaussian peak, respectively. From Figures 10 to 13, it is observed that the square roots of the right-hand


Figure 12. The exactly calculated lower bound on the standard deviation of the distance and its approximation, given by the square root of the right-hand member of Eq. (80), as a function of the angle $\phi$ between the rotation axis and the axis connecting the two components of the object. The distance is equal to 2.

Figure 13. The exactly calculated lower bound on the standard deviation of the distance and its approximation, given by the square root of the right-hand member of Eq. (81), as a function of the angle $\phi$ between the rotation axis and the axis connecting the two components of the object. The width of the projected Gaussian peaks is equal to 10 and the distance is equal to 50.

members of Eqs. (73), (74), (80), and (81) are accurate approximations of $\sigma_{\beta_x}$, $\sigma_{\beta_y}$, $\sigma_{\beta_z}$, and $\sigma_d$. Next, some remarks are due. It should be mentioned that in the derivation of the approximations of the CRLB, the difference $\Delta\theta$ between successive tilt angles has been assumed to be small compared to the full angular tilt range $(-\pi/2, \pi/2)$, or in other words, the total number of projections has been assumed to be large, which is rather unrealistic. However, in the comparisons presented in Figures 10 to 13, the exactly calculated lower bounds on the standard deviation follow from the assumption that there are only 20 available projections. This shows that the approximations are useful, even for a limited number of projections. Additionally, Figure 14 shows the exactly calculated lower bound on the standard deviation of the distance $\sigma_d$ as a function of the total number of projections, with all other parameters kept constant. It is seen that there is a fast convergence of $\sigma_d$ to a constant with increasing number of projections. This means that the precision does not improve beyond a certain number of projections. The reason for this is that the number of electrons per projection decreases with increasing number of projections since the total number of electrons has been kept constant. Therefore, the pixel SNR decreases with increasing number of projections. Furthermore, in the derivation of the approximations, a full angular tilt range, that is, the interval $(-\pi/2, \pi/2)$, has been assumed, which is also unrealistic. Therefore, Figure 15 shows the exactly calculated $\sigma_d$, following from a limited angular tilt range, that is, the interval $(-\pi/3, \pi/3)$, and the approximations as a function of the distance between the two components. Although the approximations start to deviate from the exactly calculated $\sigma_d$, they are still useful as rule of thumb since they describe the behaviour of $\sigma_d$ well.
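The three-dimensional rules of thumb, Eqs. (73)–(74) and (80)–(83), can be evaluated with a short Python sketch. It is an illustration only: the function names are hypothetical, and the signs in $V(\phi)$ are read such that $V(0) = 1$, which is what the stated agreement with the two-dimensional rules at $\phi = 0$ requires.

```python
import math


def v_factor(phi):
    """V(phi) of Eq. (82), with signs chosen so that V(0) = 1 (assumption)."""
    c2 = math.cos(phi) ** 2
    return 4.0 * (3 * c2**2 - 3 * c2 - 2) / (c2**2 - 6 * c2 - 3)


def w_factor(phi):
    """W(phi) of Eq. (83)."""
    return 1.0 + math.sin(phi) ** 2


def var_position_3d(rho, n_p):
    """Eqs. (73)-(74): y lies along the rotation axis, x and z perpendicular to it."""
    return {"x": 2 * rho**2 / n_p, "y": rho**2 / n_p, "z": 2 * rho**2 / n_p}


def var_distance_3d(d, rho, n_p, phi):
    """Eqs. (80)-(81): approximate CRLB on the variance of the distance."""
    v, w = v_factor(phi), w_factor(phi)
    if d <= math.sqrt(2 * v / w) * rho:
        return 4 * rho**4 * v / (n_p * d**2)  # Eq. (80)
    return 2 * rho**2 * w / n_p               # Eq. (81)


if __name__ == "__main__":
    rho, n_p = 10.0, 15_000  # width used for Table 6; N_p as in Table 5
    print(var_position_3d(rho, n_p))
    for phi in (0.0, math.pi / 4, math.pi / 2):
        print(phi, var_distance_3d(50.0, rho, n_p, phi))
```

With $\rho = 10$ and $N_p = 15{,}000$, the position bounds evaluate to about $13.3 \times 10^{-3}$ for $x$ and $z$ and $6.7 \times 10^{-3}$ for $y$, the values listed as lower bounds in Table 6.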

Figure 14. The exactly calculated lower bound on the standard deviation of the distance as a function of the number of projections J, with the number of electrons in each projected Gaussian peak Np/J. The width of this peak is equal to 10 and the distance between the two components is equal to 40.


Figure 15. The exactly calculated lower bound on the standard deviation of the distance, assuming a limited angular tilt range, that is, the interval $(-\pi/3, \pi/3)$, and its approximations, given by the square roots of the right-hand members of Eqs. (80) and (81), as a function of the distance between the two components.

Finally, it is examined by means of simulation experiments whether the maximum likelihood estimator attains the CRLB on the variance of the position coordinates of an isolated component and whether it is unbiased. The significance of this has been made clear earlier in Section III.D.1. The three-dimensional observations made at the component are modelled using the parametric statistical model described in Section III.B. The width of the projected Gaussian peaks is equal to 10. From 600 different simulation experiments, the position coordinates are estimated using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively. The lower bound on the variance is computed by substituting the true values of the parameters into the expression for the CRLB. The results are presented in Table 6. From the comparison of these results, it cannot be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB. Additionally, a remark on maximum likelihood estimation has to be made. Maximum likelihood estimates are given by the values that maximize the log-likelihood function, as shown in Section II.D. However, in order to avoid ending up at a local maximum, instead of at the global maximum of the log-likelihood function, it is important to have good starting values for the position coordinates of the components, as already mentioned in Section I. For that purpose, a three-dimensional reconstruction could be

TABLE 6
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 600 Maximum Likelihood Estimates of the Position Coordinates, Respectively

Parameter    True position coordinate    Estimated mean    Standard deviation of mean
$\beta_x$    0                           0.6 × 10⁻³        4.7 × 10⁻³
$\beta_y$    0                           3.7 × 10⁻³        3.3 × 10⁻³
$\beta_z$    0                           5.3 × 10⁻³        4.8 × 10⁻³

Parameter                Lower bound on variance    Estimated variance    Standard deviation of variance
$\sigma_{\beta_x}^2$     13.3 × 10⁻³                13.0 × 10⁻³           0.8 × 10⁻³
$\sigma_{\beta_y}^2$     6.7 × 10⁻³                 6.6 × 10⁻³            0.4 × 10⁻³
$\sigma_{\beta_z}^2$     13.3 × 10⁻³                13.8 × 10⁻³           0.8 × 10⁻³

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.

useful. It may be obtained by combining the projected images using the so-called weighted back-projection method (Frank, 1992).

E. Conclusions

The attainable precision with which position and distance parameters of one or two components can be estimated is computed for simulations of high-resolution CTEM, STEM, and electron tomography experiments, all described by simplified models. Usually, the performance of such atomic resolution TEM experiments is discussed in terms of two-point resolution, expressing the possibility of perceiving separately components of a two-point image. Although such resolution based criteria are suitable to set up qualitative atomic resolution TEM experiments, a precision based optimality criterion is needed in the framework of quantitative atomic resolution TEM. Then, an obvious alternative to two-point resolution is the attainable precision with which position or distance parameters can be measured. In the simulation experiments, the observations were assumed to be electron counting results made at Gaussian peaks with unknown position. Under this assumption, the CRLB, which is usually calculated numerically, is given by a simple rule of thumb in closed analytical form. Although the expectation models of images obtained in practice are usually of a higher complexity, the rules of thumb are suitable to give insight into statistical experimental design for quantitative atomic resolution TEM. The


rules of thumb show how the attainable precision depends on the width of the point spread function, the width of the components, the number of detected electrons, and on the distance between the components. Particularly for electron tomography experiments, it is a function of the orientation of the components with respect to the rotation axis as well. Generally, the precision improves by increasing the number of detected counts or by narrowing the point spread function. However, below a certain width of the point spread function, the precision is limited by the intrinsic width of the components. Then, further narrowing of the point spread function is useless. Moreover, if a narrower point spread function results in a decrease of the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints. In the following sections, the optimal statistical experimental designs of CTEM and STEM experiments, assuming more realistic expectation models than Gaussian peaks, will be derived by computing the CRLB numerically. It will be shown that these numerical results may be interpreted by means of the obtained rules of thumb of this section.

IV. Optimal Statistical Experimental Design of Conventional Transmission Electron Microscopy

A. Introduction

Optimal statistical experimental designs of CTEM experiments will be described. As mentioned in Section I, the future of such experiments is quantitative structure determination. Unknown structure parameters, atom column positions in particular, are quantitatively estimated from the observations. Quantitative structure determination should be done as precisely as possible. A precision of the atom column positions of the order of 0.01 to 0.1 Å is needed (Kisielowski, Principe, Freitag and Hubert, 2001; Muller, 1998, 1999). Precise measurements will allow materials scientists to draw reliable conclusions from the experiment. Such measurements may be used for comparison with or as an input for theoretical first-principles calculations in order to get a deeper understanding of the properties-structure relation. Hence, the experimental design of CTEM experiments should be evaluated and optimized in terms of precision. As shown in Section II, the obvious optimality criterion is the attainable precision, that is, the CRLB, with which the atom column positions can be estimated. The attainable precision should replace widely used performance criteria of an electron microscope, which express the possibility to perceive separately two atom columns in an image. Although these criteria are suitable to set up qualitative CTEM experiments, the attainable precision is needed as a criterion in the framework of quantitative CTEM experiments. In Section III, the attainable precision has been derived in closed analytical form for atomic resolution transmission electron microscopy experiments using simplified models. In this section, the attainable precision will be derived for more complicated, physics-based CTEM models and the obtained expression will be used to evaluate and optimize the experimental design.

To begin with, it will be described how CTEM observations are collected. A scheme is shown in Figure 16. The object under study is illuminated by a parallel incident electron beam. As a result of the electron-object interaction, the so-called exit wave, which is a complex electron wave function at the exit plane of the object, is formed. A one-to-one correspondence between the exit wave and the projected object structure is established if the object is oriented along a main zone axis and if the distance between adjacent atom columns is not too small. Next, a magnified image of the exit wave is formed by a set of lenses of which the objective lens is the most important one. The formation of this image may be described in two steps. First, the so-called image wave, which is a complex electron wave function at the image plane, is formed. Since the objective lens is not perfect, the image wave is influenced by lens aberrations such as spherical aberration, defocus, and chromatic aberration. Second, the image intensity distribution, given by the modulus square of the image wave, is recorded. As a recording device, a CCD camera may be chosen. Therefore, CTEM observations may be considered to be electron counting results collected at the pixels of a CCD camera.

Figure 16. Scheme of a CTEM experiment.

Widely used performance criteria of CTEM experiments are the point resolution and the information limit of the electron microscope. The point resolution $\rho_s$ represents the smallest detail that may be interpreted directly from the image provided that the object is thin and that the defocus is adjusted to the so-called Scherzer defocus (Scherzer, 1949). The point resolution depends only on the spherical aberration constant $C_s$ and the electron wavelength $\lambda$, according to the formula $\rho_s = 0.66\,(C_s \lambda^3)^{1/4}$ (Spence, 1988). The information limit $\rho_i$ represents the smallest detail that is present in the image and that may be resolved by image processing techniques such as off-axis holography (Lichte, 1991) and the focal-series reconstruction method (Coene, Thust, Op de Beeck, and van Dyck, 1996; Kirkland, 1984; Saxton, 1978; Schiske, 1973; Thust, Coene, Op de Beeck, and van Dyck, 1996; van Dyck and Coene, 1987; van Dyck, Op de Beeck and Coene, 1993). Both techniques retrieve the exit wave, which ideally is free from any lens aberration. The information limit is inversely proportional to the highest spatial frequency that is still transferred with enough intensity from the exit plane of the object to the image plane (de Jong and van Dyck, 1993; O’Keefe, 1992). Usually, the information limit is smaller than the point resolution in intermediate voltage electron microscopy. The information limit is mainly determined by spatial incoherence and temporal incoherence. Spatial incoherence is due to beam convergence, which is caused by the fact that the illuminating beam is not parallel but may be considered as a cone of incoherent plane waves.
Temporal incoherence is due to chromatic aberration, which results from a spread in defocus values, arising from fluctuations in accelerating voltage, lens current, and thermal energy of the electron, where the thermal energy fluctuation is often the dominating term. Chromatic aberration will mostly be the dominant factor governing the information limit (de Jong and van Dyck, 1993). The information limit due to chromatic aberration is defined as $\rho_i = (\pi \lambda \Delta / 2)^{1/2}$, with $\Delta$ the defocus spread, expressed in terms of the standard deviation (Spence, 1988).

Over the years, different methods have been developed to improve the point resolution or the information limit. Existing methods to improve the point resolution are, for example, high-voltage electron microscopy and correction of the spherical aberration. High-voltage electron microscopy is based on the principle that an increase of the accelerating voltage is accompanied by a decrease of the electron wavelength and a corresponding improvement of the point resolution (Phillipp, Höschen, Osaki, Möbus, and Rühle, 1994). Spherical aberration is a lens defect that, like other aberrations, causes a point object to be imaged as a disk of finite size. By using a combination of magnetic quadrupole and octopole lenses, spherical aberration may be cancelled out (Rose, 1990; Scherzer, 1949). This improves the point resolution. One of the advantages of the spherical aberration corrector is that structure-imaging artifacts due to contrast delocalization may to a great extent be avoided (Haider, Uhlemann, Schwan, Rose, Kabius, and Urban, 1998). Existing methods to improve the information limit are based on correction of chromatic aberration by use of either a chromatic aberration corrector (Reimer, 1984; Weißbäcker and Rose, 2001, 2002) or a monochromator (Mook and Kruit, 1999). The chromatic aberration corrector is still at the conceptual stage. The monochromator is already used in practice and eliminates all electrons having energies outside a prespecified energy range. The methods presented nowadays result in a resolution of about 1 Å, which is sufficient to visualize the individual atom columns of materials with columnar structures, viewed along a main zone axis.

In fact, the methods developed to improve the point resolution or the information limit are advantageous for qualitative high-resolution CTEM. However, the future of CTEM experiments is quantitative, instead of qualitative, structure determination. The structure parameters, the atom column positions in particular, are quantitatively estimated from the electron microscopical observations, instead of visually determined. Hence, the obvious optimality criterion to be used to evaluate the experimental design of CTEM experiments is the attainable precision, that is, the CRLB with which these structure parameters can be estimated, and not so much the point resolution or the information limit.
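To make the two classical performance criteria concrete, the formulas for the point resolution, 0.66 (Cs lambda^3)^(1/4), and for the information limit due to chromatic aberration, (pi lambda Delta / 2)^(1/2), can be evaluated directly. The sketch below is only an illustration: the function names are hypothetical and the numerical values (wavelength, spherical aberration constant, defocus spread) are assumed example inputs, not settings taken from this article.

```python
import math


def point_resolution(cs, wavelength):
    """Point resolution 0.66 * (Cs * lambda^3)^(1/4).
    cs and wavelength must share one length unit (here Angstrom)."""
    return 0.66 * (cs * wavelength ** 3) ** 0.25


def information_limit(defocus_spread, wavelength):
    """Information limit (pi * lambda * Delta / 2)^(1/2), with Delta the
    defocus spread expressed as a standard deviation (same length unit)."""
    return math.sqrt(math.pi * wavelength * defocus_spread / 2.0)


if __name__ == "__main__":
    # Assumed example inputs, all in Angstrom.
    wavelength = 0.0197      # roughly a 300 keV electron
    cs = 1.0e7               # 1 mm spherical aberration constant
    defocus_spread = 50.0    # 5 nm defocus spread
    print("point resolution :", point_resolution(cs, wavelength))
    print("information limit:", information_limit(defocus_spread, wavelength))
```

The weak fourth-root dependence on Cs and the square-root dependence on the defocus spread explain why substantial hardware changes are needed for modest gains in either criterion.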
In this section, optimal statistical experimental designs of CTEM experiments will be computed in terms of the experimental settings producing the highest attainable precision. They will be obtained using the principles of statistical experimental design as explained in Section II.

The section is organized as follows. In Section IV.B, a parametric statistical model of the observations will be derived. This model describes the expectations of the observations as well as the fluctuations of the observations about these expectations. Next, in Section IV.C, it will be shown how the CRLB on the variance of the atom column position estimates may be deduced from this model. Afterward, an adequate optimality criterion, which is a function of the elements of the CRLB, will be given. This criterion is then used to evaluate and optimize the experimental design. Special attention is paid to the dependence of the optimality criterion on the use of a spherical aberration corrector, a chromatic aberration corrector, and a monochromator. In Section IV.D, conclusions are drawn.

Part of the results of this section concerning the use of a monochromator has earlier been published in den Dekker, van Aert, van Dyck, van den Bos, and Geuens (2000) and den Dekker, van Aert, van Dyck, van den Bos, and Geuens (2001).

B. Parametric Statistical Model of Observations

In order to derive the optimal statistical experimental design, a parametric statistical model of the CTEM observations is needed. This model, which contains microscope settings such as defocus, spherical aberration constant, chromatic aberration constant, and defocus spread, as well as structure parameters such as the atom column positions and the object thickness, will be derived in this section. In this derivation, two basic approximations will be made. The first approximation is the use of the simplified channelling theory to describe the dynamical scattering of the electrons on their way through the object (Geuens and van Dyck, 2002; van Dyck and Op de Beeck, 1996). Secondly, partial spatial and temporal coherence will be incorporated by representing the microscope’s transfer function as a product of the corresponding coherent transfer function and two envelope functions (Fejes, 1977; Frank, 1973). The image calculation is then treated as a simple Fourier optics scheme. This approach is nowadays called the quasi-coherent approximation (Coene and van Dyck, 1988). Admittedly, the approximations made are of a limited validity. However, they are very useful for a compact analytical model-based derivation of the optimal statistical experimental design of quantitative CTEM experiments as well as for explaining the basic principles governing the obtained results. The principal results obtained are independent of the approximations made. Moreover, it should be noticed that the image magnification will be ignored, without loss of generality.

1. The Exit Wave

The first important step in the derivation of the parametric statistical model of the observations is to obtain an expression for the exit wave $\psi(\mathbf{r}, z)$. This is a complex wave function in the plane at the exit face of the object, resulting from the interaction of the electron beam with the object. Use will be made of the simplified channelling theory. At this stage, structure parameters will enter the model.

High-resolution CTEM images often show a one-to-one correspondence with the projected object structure if the incident electron beam propagates along a main zone axis. This happens for instance in ordered alloys with columnar structures provided that the point resolution of the microscope is sufficient and the distance between adjacent columns is not too small (van Tendeloo and Amelinckx, 1978; van Tendeloo and Amelinckx, 1982). From this, it has been suggested that for materials oriented along a main zone axis and with sufficient separation between the columns, the exit wave mainly depends on the projected structure, that is, on the type of atom columns. The physical reason behind this is that the atoms are superimposed along an atom column in this orientation. Then, it can be shown that the electrons are trapped in the positive electrostatic potential of the atoms. Because of this, each atom column acts as a guide or a channel within which the electron scatters dynamically without leaving the column (van Dyck, 2002). This channelling effect is schematically represented in Figure 17.

Figure 17. Schematic representation of electron channelling.

In the simplified channelling theory, applicable if the incident electron beam propagates along a main zone axis, an expression for the exit wave is given by (van Dyck and Op de Beeck, 1996):

$$\psi(\mathbf{r}, z) = 1 + \sum_{n=1}^{n_c} c_n \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right], \tag{84}$$

where $\mathbf{r} = (x \; y)^T$ is a two-dimensional vector in the plane at the exit face of the object, perpendicular to the incident beam direction, $z$ is the object thickness, $E_0$ is the incident electron energy, and $\lambda$ is the electron wavelength. The incident electron energy and the electron wavelength are related (Kirkland, 1998):

$$\lambda = \frac{hc}{\sqrt{E_0 \left( 2 m_0 c^2 + E_0 \right)}} \tag{85}$$

with $h$ Planck’s constant, $m_0$ the electron rest mass, and $c$ the velocity of light, so that $hc = 12.398$ keV Å and $m_0 c^2 = 511$ keV. It should be mentioned that the accelerating voltage is equal to $E_0/e$, where $e = 1.6 \times 10^{-19}$ C is the electron charge. The summation in Eq. (84) is over $n_c$ atom columns. The function $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ is the lowest energy bound state of the $n$th atom column located at position $\boldsymbol{\beta}_n = (\beta_{xn} \; \beta_{yn})^T$ and $E_{1s,n}$ is its energy. The lowest energy bound state is a real-valued, centrally peaked, radially symmetric function, which is a two-dimensional analogue of the 1s-state of an atom. Following van Dyck and Op de Beeck (1996), it has been assumed that the dynamical motion of the electron in a column may be expressed primarily in terms of this tightly bound 1s-state. The other states are not neglected, but for thin objects they will not build up and are incorporated in the term ‘1’ in Eq. (84), which describes the unscattered incident electron wave. The author is well aware of the fact that for heavy atom columns, where higher order states start to play a more prominent role (Kambe, Lehmpfuhl, and Fujimoto, 1974), Eq. (84) becomes a less accurate description of the exit wave (van Dyck and Op de Beeck, 1996). The excitation coefficients $c_n$ may be found from (van Dyck and Op de Beeck, 1996):

$$c_n = \int \phi^*_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) \, \psi(\mathbf{r}, 0) \, d\mathbf{r}, \tag{86}$$

where the symbol * denotes the complex conjugate. For plane wave incidence, i.e., $\psi(\mathbf{r}, 0) = 1$, one thus has:

$$c_n = \int \phi^*_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) \, d\mathbf{r}. \tag{87}$$

Following Geuens, Chen, den Dekker, and van Dyck (1999) and Geuens and van Dyck (2002), the 1s-state function may be approximated by a single, quadratically normalized, parameterized Gaussian function

$$\phi_{1s,n}(\mathbf{r}) = \frac{1}{a_n \sqrt{2\pi}} \exp\left( -\frac{r^2}{4 a_n^2} \right), \tag{88}$$

where $r$ is the Euclidean norm of the two-dimensional vector $\mathbf{r}$, that is, $r = |\mathbf{r}|$, and $a_n$ represents the column dependent width. This width is directly related to the energy of the 1s-state. Then, it follows from Eqs. (87) and (88) that

$$c_n = 2\sqrt{2\pi} \, a_n. \tag{89}$$

The two-dimensional Fourier transform $F_{1s,n}(\mathbf{g})$ of Eq. (88), which will be needed in the remainder of this section, is given by:

$$F_{1s,n}(\mathbf{g}) = 2\sqrt{2\pi} \, a_n \exp\left( -4\pi^2 a_n^2 g^2 \right) \tag{90}$$

with $g$ being the Euclidean norm of the two-dimensional spatial frequency vector $\mathbf{g}$ in reciprocal space, that is, $g = |\mathbf{g}|$. Throughout this article, the two-dimensional Fourier transform $H(\mathbf{g})$ of an arbitrary function $h(\mathbf{r})$ is defined as

$$H(\mathbf{g}) = \mathcal{F}_{\mathbf{r} \to \mathbf{g}} \, h(\mathbf{r}) = \int h(\mathbf{r}) \exp\left( i 2\pi \mathbf{g} \cdot \mathbf{r} \right) d\mathbf{r}, \tag{91}$$

where the symbol ‘·’ denotes the scalar product. Consequently, the inverse Fourier transform is defined as:

$$h(\mathbf{r}) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}} \, H(\mathbf{g}) = \int H(\mathbf{g}) \exp\left( -i 2\pi \mathbf{g} \cdot \mathbf{r} \right) d\mathbf{g}. \tag{92}$$

2. The Image Wave

In the second step of the derivation of the parametric statistical model of the observations, an expression for the image wave $\psi_i(\mathbf{r}, z)$ is obtained. This is a complex electron wave function at the image plane. At this stage, most microscope settings will enter the model. The image wave is written as the convolution product of the exit wave with the point spread function $t(\mathbf{r})$ of the electron microscope (van Dyck, 2002):

$$\psi_i(\mathbf{r}, z) = \psi(\mathbf{r}, z) * t(\mathbf{r}). \tag{93}$$

The two-dimensional Fourier transform of $t(\mathbf{r})$ represents the microscope’s transfer function $T(\mathbf{g})$. Following van Dyck (2002), $T(\mathbf{g})$ is radially symmetric and described as:

$$T(\mathbf{g}) = T(g) = A(g) \, D_s(g) \, D_t(g) \exp\left( -i \chi(g) \right), \tag{94}$$

where $A(g)$ is a circular aperture function, given by:

$$A(g) = \begin{cases} 1 & \text{if } g \le g_{ap} \\ 0 & \text{if } g > g_{ap} \end{cases} \tag{95}$$

with $g_{ap}$ the objective aperture radius. Notice that the objective aperture semi-angle $\alpha_o$ is equal to $g_{ap} \lambda$. In what follows, it will be assumed that there is no objective aperture so that $A(g)$ is constant and equal to 1. The phase shift $\chi(g)$, resulting from the objective lens aberrations, is radially symmetric and given by:

$$\chi(g) = \pi \varepsilon \lambda g^2 + \frac{1}{2} \pi C_s \lambda^3 g^4 \tag{96}$$

with $\varepsilon$ being the defocus. Notice that higher order aberration effects such as 2-fold astigmatism, 3-fold astigmatism, and axial coma, have been neglected. They could be included in the phase shift as well (Thust, Overwijk, Coene, and Lentzen, 1996). In the quasi-coherent approximation, the effects of partial spatial and temporal coherence are incorporated by the damping envelope functions $D_s(g)$ and $D_t(g)$, respectively. For a Gaussian incoherent effective electron source, the function $D_s(g)$ is described as (Frank, 1973; Spence, 1988):

$$D_s(g) = \exp\left( -\frac{\alpha_c^2 \left( \pi C_s \lambda^2 g^3 + \pi \varepsilon g \right)^2}{\ln 2} \right), \tag{97}$$

where $\alpha_c$ is the semi-angle of beam convergence. For a Gaussian spread of defocus, the function $D_t(g)$ is described as (Fejes, 1977):

$$D_t(g) = \exp\left( -\frac{\pi^2 \lambda^2 \Delta^2 g^4}{2} \right), \tag{98}$$

where $\Delta$ is the defocus spread due to chromatic aberration, which is given by (O’Keefe, 1992; Spence, 1988):

$$\Delta = C_c \sqrt{ 4 \left( \frac{\Delta I}{I_0} \right)^2 + \left( \frac{\Delta V}{V_0} \right)^2 + \left( \frac{\Delta E}{E_0} \right)^2 }. \tag{99}$$

Notice that the defocus spread $\Delta$, which is here defined as the standard deviation, corresponds to a half width at 1/e height equal to $\sqrt{2}\Delta$. In Eq. (99), $C_c$ is the chromatic aberration coefficient, $\Delta V$ and $\Delta I$ are the standard deviations of the statistically independent fluctuations of the accelerating voltage $V_0$ and objective lens current $I_0$, respectively, while $\Delta E$ is the intrinsic energy spread, that is, the standard deviation of the statistically independent fluctuations of the incident electron energy $E_0$ of the electrons in the electron source, defined as:

$$\Delta E = \left( \int_{-\infty}^{\infty} (E - E_0)^2 \, p(E) \, dE \right)^{1/2}, \tag{100}$$

where $p(E)$ is the energy probability density function. It is usually assumed that $p(E)$ is well approximated by a Gaussian function:

$$p(E) = \frac{1}{\Delta E \sqrt{2\pi}} \exp\left( -\frac{(E - E_0)^2}{2 (\Delta E)^2} \right) \tag{101}$$

with expectation value $E_0$ and standard deviation $\Delta E$. Straightforward calculations show that the relationship between the standard deviation $\Delta E$ and the full width at half maximum height of the energy distribution described by Eq. (101) is given by:

$$\mathrm{FWHM} = \sqrt{8 \ln 2} \, \Delta E \approx 2.35 \, \Delta E. \tag{102}$$

In the following, it is assumed that $\Delta V / V_0$ and $\Delta I / I_0$ are small in comparison to $\Delta E / E_0$, so that they may be neglected and Eq. (99) reduces to:

$$\Delta = C_c \left( \frac{\Delta E}{E_0} \right). \tag{103}$$

Notice that the quasi-coherent approximation used is only of a limited validity and is certainly not the state of the art for treating partial coherence. According to the work of Frank (1973), this approximation is only valid for a small effective source and a central ‘unscattered’ beam much stronger than any other (Spence, 1988). A more correct analytical treatment may be achieved via autocorrelations in Fourier space, incorporating the microscope properties in the form of a transmission-cross-coefficient (Born and Wolf, 1999; Frank, 1973; Ishizuka, 1980). However, such a treatment would severely and unnecessarily complicate the derivation of the optimal statistical experimental design and the explanation of the basic principles governing the obtained results. Moreover, it should be mentioned that the analysis via transmission-cross-coefficients is also not perfect, since it does not take the influence of beam convergence and defocus spread on the scattering of the electrons with the object into account (van Dyck, 2002).

3. The Image Intensity Distribution

Next, an expression for the image intensity distribution $I(\mathbf{r})$ will be derived. This is given by the modulus square of the image wave. Hence, it follows from Eqs. (84) and (93) that

$$I(\mathbf{r}) = \left| \psi_i(\mathbf{r}, z) \right|^2 = \left| 1 + \sum_{n=1}^{n_c} c_n \left\{ \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] \right\} * t(\mathbf{r}) \right|^2, \tag{104}$$

where it is taken into account that $1 * t(\mathbf{r})$ is equal to 1. Furthermore,

$$\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) * t(\mathbf{r}) \tag{105}$$

represents the 1s-state function convolved with the microscope’s point spread function, which is equal to

$$2\pi \int_0^{\infty} F_{1s,n}(g) \, T(g) \, J_0\left( 2\pi g \left| \mathbf{r} - \boldsymbol{\beta}_n \right| \right) g \, dg \tag{106}$$

since $\phi_{1s,n}(\mathbf{r})$ and $t(\mathbf{r})$ are both radially symmetric functions. In Eq. (106), $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind.
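The ingredients of the transfer function entering Eq. (106) can be evaluated with a few lines of code. The following is a minimal sketch under the assumptions of this section (no objective aperture, so A(g) = 1, and the quasi-coherent approximation); it implements Eq. (85) for the wavelength, Eq. (96) for the phase shift, and Eqs. (97) and (98) for the envelopes. The function names and the example values in the usage part are illustrative assumptions, and all lengths are taken in Å.

```python
import cmath
import math

HC_KEV_ANGSTROM = 12.398   # h*c in keV*Angstrom
M0C2_KEV = 511.0           # electron rest energy in keV


def electron_wavelength(e0_kev):
    """Eq. (85): relativistic electron wavelength in Angstrom."""
    return HC_KEV_ANGSTROM / math.sqrt(e0_kev * (2.0 * M0C2_KEV + e0_kev))


def phase_shift(g, defocus, cs, lam):
    """Eq. (96): chi(g) = pi*eps*lam*g^2 + 0.5*pi*Cs*lam^3*g^4."""
    return math.pi * defocus * lam * g ** 2 + 0.5 * math.pi * cs * lam ** 3 * g ** 4


def spatial_envelope(g, defocus, cs, lam, alpha_c):
    """Eq. (97): damping due to beam convergence (semi-angle alpha_c in rad)."""
    q = math.pi * cs * lam ** 2 * g ** 3 + math.pi * defocus * g
    return math.exp(-(alpha_c ** 2) * q ** 2 / math.log(2.0))


def temporal_envelope(g, lam, defocus_spread):
    """Eq. (98): damping due to the defocus spread (standard deviation)."""
    return math.exp(-0.5 * (math.pi * lam * defocus_spread) ** 2 * g ** 4)


def transfer_function(g, defocus, cs, lam, alpha_c, defocus_spread):
    """Eq. (94) with A(g) = 1 (no objective aperture)."""
    return (spatial_envelope(g, defocus, cs, lam, alpha_c)
            * temporal_envelope(g, lam, defocus_spread)
            * cmath.exp(-1j * phase_shift(g, defocus, cs, lam)))


if __name__ == "__main__":
    lam = electron_wavelength(300.0)  # assumed 300 keV incident energy
    for g in (0.0, 0.25, 0.5, 0.75, 1.0):  # spatial frequency in 1/Angstrom
        t = transfer_function(g, -400.0, 1.0e7, lam, 1.0e-4, 50.0)
        print(g, abs(t), cmath.phase(t))
```

At the spatial frequency equal to the reciprocal of the information limit, the temporal envelope of Eq. (98) has dropped to exp(-2), which ties the envelope to the definition of the information limit given in the introduction of this section.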

Furthermore, notice that it can be seen from Eq. (104) that for identical atom columns, the contrast varies periodically with thickness, where the periodicity is given by (van Dyck and Chen, 1999a):

$$D_{1s} = \left| \frac{2 E_0 \lambda}{E_{1s,n}} \right| \tag{107}$$

which is called the extinction distance. This periodic oscillation is due to dynamical effects, which have been included in the model via the channelling approximation. Generally, the extinction distance will be different for different types of atom columns.

4. The Image Recording

Next, the expectation model, describing the expected number of electrons recorded by the detector, will be derived. As a recording device, a CCD camera is chosen, consisting of $K \times L$ equidistant pixels of area $\Delta x \times \Delta y$, where $\Delta x$ and $\Delta y$ are the sampling distances in the x- and y-direction, respectively. Pixel $(k, l)$ corresponds to position $(x_k \; y_l)^T \equiv (x_1 + (k - 1)\Delta x \;\; y_1 + (l - 1)\Delta y)^T$ of the recorded image, with $k = 1, \ldots, K$ and $l = 1, \ldots, L$, and $(x_1 \; y_1)^T$ represents the position of the pixel in the bottom left corner of the field of view (FOV). The FOV is centered about $(0 \; 0)^T$. It is chosen sufficiently large so as to guarantee that the tails of the microscope’s point spread function $t(\mathbf{r})$ are collected. Furthermore, it is assumed that the quantum efficiency of the CCD camera is sufficiently high to detect single electrons. The probability $p_{kl}$ that an electron hits a pixel $(k, l)$ is then approximately given by

$$p_{kl} = \frac{I(\mathbf{r}_{kl})}{I_{norm}} \, \Delta x \Delta y \tag{108}$$

with $I(\mathbf{r})$ given by Eq. (104), $\mathbf{r}_{kl} = (x_k \; y_l)^T$, and $I_{norm}$ a normalization factor given by:

$$I_{norm} = \int I(\mathbf{r}) \, d\mathbf{r}, \tag{109}$$

where the integral extends over the whole FOV. This means that for a given total number of detected electrons $N$, the number of electrons expected to be found at pixel $(k, l)$ is equal to:

$$\lambda_{kl} = N p_{kl}. \tag{110}$$

This result defines the expectations of the observations $w_{kl}$ recorded by the detector and is hence called the expectation model. The total number of detected electrons $N$ is equal to the total number of incident electrons, that is, the number of electrons that interact with the object, since it has been assumed that there is no objective aperture. In the presence of an objective aperture, part of the electrons would be lost. The total number of incident electrons depends on the reduced brightness ($B_r$) of the electron source, the incident electron energy ($E_0$), the recording time ($t$), the field of view (FOV), the semi-angle of beam convergence ($\alpha_c$), and the electron charge ($e = 1.6 \times 10^{-19}$ C), according to the formula (Spence, 1988):

$$N = \frac{B_r E_0 \, t \, \mathrm{FOV} \, \pi \alpha_c^2}{e^2}. \tag{111}$$
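The recording step of Eqs. (108) to (111) amounts to normalizing the sampled intensity and scaling it by the electron dose. The sketch below illustrates this under stated assumptions: the intensity `toy_intensity` is a hypothetical placeholder and not the physical model of Eq. (104), the integral of Eq. (109) is approximated by a sum over pixels, pixel indices start at 0 instead of 1, SI units are used, and all numerical values are assumed examples.

```python
import math

E_CHARGE = 1.6e-19  # electron charge in C, value quoted in the text


def incident_electrons(br, v0, t, fov_area, alpha_c):
    """Eq. (111) with E0 = e * V0.

    br       : reduced brightness in A m^-2 sr^-1 V^-1
    v0       : accelerating voltage in V
    t        : recording time in s
    fov_area : area of the field of view in m^2
    alpha_c  : semi-angle of beam convergence in rad
    """
    e0 = E_CHARGE * v0  # incident electron energy in J
    return br * e0 * t * fov_area * math.pi * alpha_c ** 2 / E_CHARGE ** 2


def toy_intensity(x, y):
    """Hypothetical placeholder intensity: flat background plus one bump."""
    return 1.0 + 0.5 * math.exp(-(x * x + y * y) / (2.0 * 0.3 ** 2))


def expected_counts(intensity, n_total, x1, y1, dx, dy, n_x, n_y):
    """Eqs. (108)-(110): expected counts lambda_kl = N * p_kl on a pixel grid."""
    samples = [[intensity(x1 + k * dx, y1 + l * dy) for l in range(n_y)]
               for k in range(n_x)]
    # Pixel-sum approximation of the normalization integral of Eq. (109).
    i_norm = sum(sum(row) for row in samples) * dx * dy
    return [[n_total * value * dx * dy / i_norm for value in row]
            for row in samples]


if __name__ == "__main__":
    # Assumed example values only.
    n = incident_electrons(br=2.0e7, v0=3.0e5, t=1.0,
                           fov_area=(5.0e-9) ** 2, alpha_c=1.0e-4)
    counts = expected_counts(toy_intensity, n, -1.0, -1.0, 0.1, 0.1, 21, 21)
    print("N =", n)
    print("expected counts at the central pixel =", counts[10][10])
```

By construction the expected counts add up to N, which reflects the assumption that, without an objective aperture, every incident electron is detected somewhere in the field of view.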

The reduced brightness of the electron source is defined as the brightness of the electron source per accelerating voltage, whereas the brightness of the electron source describes the current density per unit solid angle of this source (Williams and Carter, 1996). In the absence of electron-electron interactions, the reduced brightness is a conserved quantity. This means that it is the same at every point on the optical axis (van Veen, Hagen, Barth, and Kruit, 2001). In what follows, the importance of this quantity for the performance of CTEM experiments will be studied.

5. The Incorporation of a Monochromator

In this section, special attention is paid to the incorporation of a monochromator into the expectation model (den Dekker, van Aert, van Dyck, van den Bos, and Geuens, 2001). Suppose that a monochromator is incorporated in the imaging system below the electron source, removing all electrons, except those whose energy lies within a prespecified energy range $[E_0 - \delta E/2, \, E_0 + \delta E/2]$. The monochromator reduces the standard deviation of the energy spread from $\Delta E$, which is defined by Eq. (100), to $\Delta E_m$, which is described by:

$$\Delta E_m = \left( \int_{E_0 - \delta E/2}^{E_0 + \delta E/2} (E - E_0)^2 \, p'(E) \, dE \right)^{1/2} \tag{112}$$

with $p'(E)$ being the energy distribution of the electrons transmitted by the monochromator, which is given by:

$$p'(E) = \begin{cases} \dfrac{p(E)}{\int_{E_0 - \delta E/2}^{E_0 + \delta E/2} p(E) \, dE} & \text{if } E_0 - \dfrac{\delta E}{2} \le E \le E_0 + \dfrac{\delta E}{2} \\[2ex] 0 & \text{otherwise} \end{cases} \tag{113}$$

with $p(E)$ defined as in Eq. (101). Straightforward calculations, using Eqs. (101), (112), and (113), then show that the standard deviation defining the energy spread of the electrons transmitted by the monochromator may be described as:

$$\Delta E_m = \Delta E \sqrt{ 1 - \frac{\delta E \; p\left( E_0 + \frac{\delta E}{2} \right)}{\mathrm{Erf}\left( \frac{\delta E}{2\sqrt{2} \, \Delta E} \right)} } \tag{114}$$

with Erf(·) being the error function. As an unfavorable side effect of the incorporation of a monochromator, the total number of incident electrons that interact with the object reduces if the recording time $t$ is kept constant. Only a fraction of the total number of electrons given by Eq. (111) will be recorded. It may be shown that the total number of detected electrons by use of a monochromator is given by:

$$N = \frac{B_r E_0 \, t \, \mathrm{FOV} \, \pi \alpha_c^2}{e^2} \int_{E_0 - \delta E/2}^{E_0 + \delta E/2} p(E) \, dE = \frac{B_r E_0 \, t \, \mathrm{FOV} \, \pi \alpha_c^2}{e^2} \, \mathrm{Erf}\left( \frac{\delta E}{2\sqrt{2} \, \Delta E} \right). \tag{115}$$

Hence, the expectation model by incorporating a monochromator is still given by Eq. (110), but with a reduced total number of electrons $N$ as in Eq. (115) instead of as in Eq. (111) and a reduced energy spread of the electrons as in Eq. (114) instead of as in Eq. (100).

For CTEM, the observations are electron counting results, which are supposed to be independent and Poisson distributed. Therefore, the joint probability density function of the observations $P(\boldsymbol{\omega}; \boldsymbol{\beta})$, representing the parametric statistical model of the observations, is given by Eq. (10), where the total number of observations is equal to $K \times L$ and the expectation model is given by Eq. (110). The parameter vector $\boldsymbol{\beta} = (\beta_{x1} \ldots \beta_{x n_c} \; \beta_{y1} \ldots \beta_{y n_c})^T$ consists of the x- and y-coordinates of the atom column positions to be estimated. In the following section, the experimental design resulting in the highest attainable precision with which the elements of the vector $\boldsymbol{\beta}$ can be estimated will be derived from the joint probability density function of the observations.

C. Statistical Experimental Design

In this section, the optimal statistical experimental design of high-resolution CTEM experiments will be derived in the sense of the microscope settings resulting in the highest attainable precision with which the position coordinates of the atom columns can be estimated. Therefore, the CRLB with respect to the position coordinates will be computed from the parametric statistical model of the observations discussed in the previous section. In Section II, this CRLB was discussed. Then, a scalar measure of this CRLB, that is, a function of the elements of the CRLB, will be chosen as optimality criterion, which will then be evaluated and optimized as a function of the microscope settings.

An overview of the microscope settings will be given in Section IV.C.1. Some of them are tunable, while others are fixed properties of the electron microscope. Next, in Section IV.C.2, the results of the numerical evaluation of the dependence of the chosen optimality criterion on the microscope settings will be discussed. This will be done for both isolated and neighboring atom columns. The section is concluded by simulation experiments to find out if the maximum likelihood estimator attains the CRLB and, moreover, if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion. Finally, in Section IV.C.3, an interpretation of the numerical optimization results will be given. The object thickness, the energy of the atom columns, and the microscope settings are supposed to be known. However, the following analysis may relatively easily be extended to include the case in which these or even more parameters are unknown and hence have to be estimated simultaneously.

1. Microscope Settings

An overview of the microscope settings, which enter the parametric statistical model of the CTEM observations, is given in this section. For simplicity, some of these settings will be kept constant in the evaluation and optimization of the experimental design. The settings describing the illuminating electron beam are the electron wavelength $\lambda$, the semi-angle of beam convergence $\alpha_c$, the standard deviation $\Delta E$ of the intrinsic energy spread of the electrons in the electron source, the reduced brightness $B_r$ of the electron source, and the width $\delta E$ of the energy selection slit (in the presence of a monochromator).
The electron wavelength and the reduced brightness of the electron source are fixed properties of a given electron microscope. The effect of these settings on the precision with which atom column positions can be estimated will be studied. The semi-angle of beam convergence may be varied experimentally, but it will be held fixed and sufficiently small in the present analysis in order to guarantee that the quasi-coherent approximation made in the derivation of the expectation model is reasonable. Moreover, typical values will be chosen for the standard deviation of the intrinsic energy spread of the electrons, in agreement with electron sources used today. The width of the energy selection slit will be variable, thus resulting in a variable energy spread $\Delta E_m$ of the electrons.
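The trade-off introduced by a variable energy selection slit follows directly from Eqs. (103), (114), and (115): a narrower slit reduces the energy spread, and hence the defocus spread, but also the fraction of electrons that is transmitted. The sketch below evaluates these expressions for a Gaussian energy distribution; the function names and the numerical values are illustrative assumptions, and energies only need to share a common unit.

```python
import math


def gaussian_pdf(e, e0, de):
    """Eq. (101): Gaussian energy distribution with mean e0 and std de."""
    return math.exp(-0.5 * ((e - e0) / de) ** 2) / (de * math.sqrt(2.0 * math.pi))


def transmitted_fraction(slit_width, de):
    """Erf factor of Eq. (115): fraction of electrons passing the slit."""
    return math.erf(slit_width / (2.0 * math.sqrt(2.0) * de))


def monochromated_spread(slit_width, de):
    """Eq. (114): standard deviation of the energy spread behind the slit."""
    edge = gaussian_pdf(0.5 * slit_width, 0.0, de)  # p(E0 + slit_width / 2)
    ratio = slit_width * edge / transmitted_fraction(slit_width, de)
    return de * math.sqrt(max(0.0, 1.0 - ratio))  # guard against round-off


def defocus_spread(cc, energy_spread, e0):
    """Eq. (103): Delta = Cc * (energy spread / E0)."""
    return cc * energy_spread / e0


if __name__ == "__main__":
    # Assumed example values: energies in eV, Cc in Angstrom.
    de, e0, cc = 0.8, 300.0e3, 1.3e7
    for slit in (0.2, 0.5, 1.0, 2.0, 5.0):
        dem = monochromated_spread(slit, de)
        print(slit, dem, defocus_spread(cc, dem, e0), transmitted_fraction(slit, de))
```

For a very wide slit the energy spread returns to the intrinsic value and all electrons are transmitted, whereas for a very narrow slit the transmitted distribution becomes nearly uniform, so that the spread approaches the slit width divided by the square root of 12 while the transmitted fraction tends to zero.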

The microscope settings specifying the objective lens are the defocus $\varepsilon$, the spherical aberration constant $C_s$, and the chromatic aberration constant $C_c$. The defocus will be variable. For most electron microscopes, the spherical and chromatic aberration constants are fixed properties of the microscope. However, by incorporating a spherical or chromatic aberration corrector, these settings are (or will become) tunable. Therefore, it is interesting to study the effect of these settings on the precision.

The microscope settings describing the image recording are the pixel sizes $\Delta x$ and $\Delta y$, the number of pixels $K$ and $L$ in the x- and y-direction, respectively, and the recording time $t$. The pixel sizes $\Delta x$ and $\Delta y$ will be kept constant. In agreement with the results presented in Section III, it may be shown that the precision will generally improve with smaller pixel sizes, with all other settings kept constant. However, below a certain pixel size, no more improvement is gained. This has to do with the fact that the pixel signal-to-noise ratio (SNR) decreases with a decreasing pixel size. Therefore, the pixel sizes are chosen in the region where no more improvement may be gained. This is similar to what is described in Bettens, van Dyck, den Dekker, Sijbers, and van den Bos (1999), den Dekker, Sijbers and van Dyck (1999), and van Aert, den Dekker, van Dyck, and van den Bos (2002a). The number of pixels $K$ and $L$, defining the FOV for given pixel sizes $\Delta x$ and $\Delta y$, will be chosen fixed, but large enough so as to guarantee that the tails of the microscope’s point spread function are collected in the FOV.

2. Numerical Results

In this section, the results of the numerical evaluation of the dependence of the attainable precision, that is, the CRLB, on the microscope settings will be studied. This section is divided into four parts.
First, general comments, which should be kept in mind during the reading of this section, will be given, including an overview of the original, non-optimized microscope settings and of the structure parameters. Second, optimal experimental designs for isolated atom columns will be computed. The corresponding highest attainable precisions will be compared to the attainable precisions at the original microscope settings. Third, the influence of neighboring atom columns on these optimal designs will be discussed. Finally, simulation experiments will be carried out to find out if the maximum likelihood estimator attains the CRLB and if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion. a. General Comments. In this section, general comments will be given, which should be kept in mind during further reading. They are related to the


comparison of the original and optimal microscope settings and to the structure parameters of the objects under study. i. Original and Optimal Microscope Settings. The values of the original, non-optimized microscope settings are given in Table 7, unless otherwise mentioned. These values are typical for today's electron microscopes. In what follows, they will be compared to the optimal values, which result in the highest attainable precision. In principle, the optimal values should be found by optimizing the attainable precision for all microscope settings simultaneously. This corresponds to an iterative, numerical optimization procedure in the space of microscope settings. Every point in this space represents one set of values for the microscope settings, and the dimension of the space is equal to the number of microscope settings. However, it has been found that, apart from the optimal defocus, the optimal value of each of these microscope settings is independent of the other settings. Consequently, the optimization of most microscope settings may be performed one at a time, instead of simultaneously. This kind of optimization is also justified from a practical point of view. Suppose, for example, that an experimenter has an electron microscope with a spherical aberration corrector but without a chromatic aberration corrector. This microscope will allow him or her to tune the spherical aberration constant, whereas the chromatic aberration constant is fixed. In this case, one is only interested in knowing the optimal spherical aberration constant for a given chromatic aberration constant, instead of knowing the combined optimal spherical and chromatic aberration constants.
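The one-at-a-time argument can be illustrated with a minimal Python sketch. It assumes a purely hypothetical, separable precision surface `toy_sigma` (a product of one factor per setting); this function, its coefficients, and the grids are illustrative assumptions and are not the chapter's expectation model or CRLB. For such a surface, a sweep over one setting at a time reaches the same grid point as a joint search over all settings.

```python
from itertools import product


def toy_sigma(cs_mm: float, cc_mm: float) -> float:
    """Hypothetical stand-in for the lower bound on the standard deviation.

    It is a product of one positive factor per setting, so the minimizing
    value of each setting does not depend on the other one.
    """
    return (1.0 + cs_mm ** 2) * (1.0 + 0.3 * cc_mm ** 2)


# Illustrative grids (assumed ranges): Cs from -0.5 to 0.5 mm, Cc from 0 to 1.3 mm.
CS_GRID = [round(-0.5 + 0.1 * i, 1) for i in range(11)]
CC_GRID = [round(0.1 * i, 1) for i in range(14)]


def joint_optimum():
    """Search all (Cs, Cc) combinations simultaneously."""
    return min(product(CS_GRID, CC_GRID), key=lambda p: toy_sigma(*p))


def one_at_a_time(cc_start: float = 1.3):
    """Optimize Cs with Cc held fixed, then Cc with Cs held at its optimum."""
    cs_best = min(CS_GRID, key=lambda cs: toy_sigma(cs, cc_start))
    cc_best = min(CC_GRID, key=lambda cc: toy_sigma(cs_best, cc))
    return cs_best, cc_best


if __name__ == "__main__":
    print("joint search   :", joint_optimum())
    print("one at a time  :", one_at_a_time())
```

The joint search evaluates 11 × 14 grid points, whereas the sequential sweep evaluates 11 + 14. The saving only applies when the optimum of each setting really is independent of the others, which the text reports for all settings except the defocus.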

TABLE 7
Original Microscope Settings

Microscope setting        Value
αc (rad)                  10^-4
ΔE (eV)                   0.75
Br (A m^-2 sr^-1 V^-1)    2 × 10^7
Cs (mm)                   0.5
Cc (mm)                   1.3
Δx (Å)                    0.2
Δy (Å)                    0.2
K                         100
L                         100
t (s)                     1
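For readers who want to reuse these values, the following is a minimal Python sketch that stores the settings of Table 7 in a small container and derives the field of view from the pixel sizes and pixel counts. The class name, field names, and the helper `field_of_view_angstrom` are illustrative assumptions; only the numerical values come from Table 7.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroscopeSettings:
    """Original (non-optimized) settings transcribed from Table 7."""

    alpha_c_rad: float = 1e-4           # semi-angle of beam convergence
    delta_e_ev: float = 0.75            # intrinsic energy spread
    reduced_brightness: float = 2e7     # A m^-2 sr^-1 V^-1
    cs_mm: float = 0.5                  # spherical aberration constant
    cc_mm: float = 1.3                  # chromatic aberration constant
    dx_angstrom: float = 0.2            # pixel size in x
    dy_angstrom: float = 0.2            # pixel size in y
    n_pixels_x: int = 100               # K
    n_pixels_y: int = 100               # L
    recording_time_s: float = 1.0       # t

    def field_of_view_angstrom(self):
        """FOV along x and y, taken as pixel count times pixel size."""
        return (self.n_pixels_x * self.dx_angstrom,
                self.n_pixels_y * self.dy_angstrom)


if __name__ == "__main__":
    original = MicroscopeSettings()
    print(original.field_of_view_angstrom())  # about 20 Å in each direction
```

A frozen dataclass is used so that the original settings cannot be modified by accident; a corrected or optimized configuration can be derived with `dataclasses.replace`.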


In the following, the attainable precision will be computed as a function of the following microscope settings:

• Defocus
• Spherical aberration constant
• Chromatic aberration constant
• Energy spread of a monochromator
• Reduced brightness of the electron source

The evaluation of the precision as a function of the defocus will be done for a range of spherical aberration constants, for a given incident electron energy and corresponding electron wavelength. In this way, it will be possible to express the optimal defocus in terms of the spherical aberration constant and electron wavelength. The evaluation as a function of the other microscope settings will be performed separately. Moreover, microscopes operating at an incident electron energy of both 300 keV and 50 keV will be considered. Unless otherwise stated, the values of the microscope settings other than those to be optimized are given in Table 7, and the defocus is adjusted to its optimal value, which will be shown to be given, to a good approximation, by Eqs. (118)-(119). The results of the evaluation of the attainable precision as a function of the individual microscope settings will be presented in figures. In these figures, the point corresponding to the original microscope settings will be marked with a symbol. Use of the same symbol in different figures indicates that the corresponding microscope settings are identical. This makes comparison between different figures easier. The following three symbols with corresponding microscope settings will be used:

• Ed = 300 keV, optimal defocus, other settings are given in Table 7.
• Em = 50 keV, Cc = 0 mm, optimal defocus, other settings are given in Table 7.
• Ej = 50 keV, Cs = 0 mm, optimal defocus, other settings are given in Table 7.
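The two incident electron energies and the defocus rule referred to above can be turned into numbers with a short Python sketch. It assumes the standard relativistic expression for the electron wavelength and the Scherzer-type magnitude (4/3 |Cs| λ)^(1/2) for the defocus; the sign of the defocus depends on the sign convention and is deliberately left out. The function names are illustrative assumptions.

```python
import math

# CODATA physical constants (SI units).
PLANCK = 6.62607015e-34          # J s
ELECTRON_MASS = 9.1093837015e-31 # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
LIGHT_SPEED = 2.99792458e8       # m / s


def electron_wavelength_angstrom(energy_ev: float) -> float:
    """Relativistic electron wavelength for a given incident energy in eV."""
    kinetic_j = ELEMENTARY_CHARGE * energy_ev
    rest_j = ELECTRON_MASS * LIGHT_SPEED ** 2
    momentum = math.sqrt(2.0 * ELECTRON_MASS * kinetic_j
                         * (1.0 + kinetic_j / (2.0 * rest_j)))
    return PLANCK / momentum * 1e10  # metres to angstroms


def scherzer_defocus_magnitude_angstrom(cs_mm: float,
                                        wavelength_angstrom: float) -> float:
    """Magnitude sqrt(4/3 |Cs| lambda); the sign convention is not assumed."""
    cs_angstrom = abs(cs_mm) * 1e7  # 1 mm = 1e7 Å
    return math.sqrt(4.0 / 3.0 * cs_angstrom * wavelength_angstrom)


if __name__ == "__main__":
    for energy_ev in (300e3, 50e3):
        lam = electron_wavelength_angstrom(energy_ev)
        eps = scherzer_defocus_magnitude_angstrom(0.5, lam)
        print(f"{energy_ev / 1e3:.0f} keV: lambda = {lam:.4f} Å, "
              f"|defocus| = {eps:.0f} Å for Cs = 0.5 mm")
```

With these assumptions the wavelengths come out close to 0.02 Å at 300 keV and 0.05 Å at 50 keV.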

ii. Structure Parameters. The evaluation and optimization of the attainable precision as a function of the microscope settings will be done for both silicon [100] and gold [100] atom columns for which the width of the 1s-state and its energy are given in Tables 8 and 9 for a microscope operating at 300 keV and 50 keV, respectively. The other structure parameters of the object under study, such as the atom column positions and the object thickness, will be given in the following parts.

TABLE 8
Width of the 1s-State and Its Energy (Debye-Waller Factor = 0.6 Å^2 and E0 = 300 keV) of a Silicon [100] and a Gold [100] Atom Column

Structure parameter    Si [100]    Au [100]
an (Å)                 0.34        0.13
E1s,n (eV)             20.2        210.8

TABLE 9
Width of the 1s-State and Its Energy (Debye-Waller Factor = 0.6 Å^2 and E0 = 50 keV) of a Silicon [100] and a Gold [100] Atom Column

Structure parameter    Si [100]    Au [100]
an (Å)                 0.45        0.16
E1s,n (eV)             12.4        148.3

TABLE 10
Structure Parameters of an Isolated Atom Column

Structure parameter    Value
βx (Å)                 0
βy (Å)                 0
z (Å)                  |E0 λ / E1s,n|

b. Isolated Atom Columns

i. Structure Parameters. For isolated atom columns, the structure parameters other than the width of the 1s-state and its energy, that is, the atom column positions and the object thickness, are given in Table 10, unless otherwise stated. The object thickness is equal to half the extinction distance, which is given by Eq. (107). At this thickness and at thicknesses equal to odd multiples of half the extinction distance, the electrons are strongly localized at the atom column positions (Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban, 2002).

ii. Optimality Criterion. The optimal statistical experimental design will be described by the microscope settings resulting in the highest attainable precision with which the position coordinates $\beta = (\beta_x\ \beta_y)^T$ can be measured. This attainable precision (in terms of the variance) is represented


by the diagonal elements $\sigma^2_{\beta_x}$ and $\sigma^2_{\beta_y}$ of the CRLB. An expression for these elements will be derived in the following paragraphs. For an isolated atom column, the CRLB is equal to the inverse of the 2 × 2 Fisher information matrix F associated with the position coordinates. The (r, s)th element of F is defined by Eq. (12):

$$F_{rs} = \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \frac{\partial \lambda_{kl}}{\partial \beta_r} \frac{\partial \lambda_{kl}}{\partial \beta_s} \qquad (116)$$

with $\lambda_{kl}$ the expected number of electrons at the pixel (k, l). An expression for the elements $F_{rs}$ is found by substituting the expectation model given by Eq. (110), as derived in Section IV.B, and its derivatives with respect to the position coordinates into Eq. (116). Explicit numbers for these elements are obtained by substituting the values of a given set of microscope settings and structure parameters of the object into the obtained expression for $F_{rs}$. For the radially symmetric expectation model used, the diagonal elements of the Fisher information matrix are equal to one another. Moreover, since the Fisher information matrix is symmetric, the diagonal elements of its inverse, that is, of the CRLB, are also equal to one another:

$$\sigma^2_{\beta_x} = \sigma^2_{\beta_y} = \left[F^{-1}\right]_{11} \qquad (117)$$

with $[F^{-1}]_{11}$ the (1, 1)th element of the CRLB, that is, of $F^{-1}$. In what follows, the precision will be represented by the lower bound on the standard deviation $\sigma_{\beta_x}$ and $\sigma_{\beta_y}$, that is, the square root of the right-hand member of Eq. (117). It will be used as optimality criterion for the evaluation and optimization of the experimental design. Therefore, this chosen optimality criterion will be calculated for various types of atom columns as a function of the defocus, the spherical aberration constant, the chromatic aberration constant, and the energy spread of a monochromator. In this evaluation and optimization procedure, the relevant physical constraints are taken into consideration. The constraint is either the radiation sensitivity of the object under study or the specimen drift. Therefore, either the incident electron dose per square Å or the recording time has to be kept within the constraints.

iii. Optimal Defocus Value. First, the dependence of the precision on the defocus is studied, as well as the dependence of the optimal defocus on the spherical aberration constant and the electron wavelength. The precision is represented by the square root of the right-hand member of Eq. (117). In Figure 18, it is plotted for a silicon [100] atom column as a function of the defocus ε and the spherical aberration constant Cs for a given electron wavelength λ. Notice that the evaluation is done for positive as well as for negative Cs-values. Negative Cs-values may be obtained by use of a spherical


Figure 18. The lower bound on the standard deviation of the position coordinates of an isolated silicon atom column as a function of the spherical aberration constant and the defocus. The solid white curve is described by Eqs. (118) and (119) and the dotted white curve describes the numerically found optimal defocus values as a function of the considered spherical aberration constants.

aberration corrector (Kabius, Haider, Uhlemann, Schwan, Urban, and Rose, 2002; Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban, 2002). The solid white curve shown in Figure 18 is described by the relation

$$\varepsilon = \sqrt{-\tfrac{4}{3}\, C_s \lambda} \quad \text{if } C_s < 0, \qquad (118)$$

$$\varepsilon = -\sqrt{\tfrac{4}{3}\, C_s \lambda} \quad \text{if } C_s \geq 0, \qquad (119)$$

where Eq. (119) is the well-known Scherzer defocus (Scherzer, 1949), which is generally believed to be optimal in terms of point resolution and contrast (Spence, 1988). The dotted white curve shown in Figure 18 describes the numerically found optimal defocus values as a function of the considered spherical aberration constants. From the comparison of the solid and dotted


white curve in Figure 18, it follows that the Scherzer defocus (for positive Cs) and Eq. (118) (for negative Cs) are close to the optimal defocus values in terms of precision, except for values of Cs that are significantly higher than the original setting of 0.5 mm. Moreover, for a given spherical aberration constant, operating at the corresponding optimal defocus instead of at the defocus described by Eqs. (118) or (119) is hardly beneficial. Therefore, the optimal defocus value, in terms of spherical aberration constant and electron wavelength, is approximately given by Eqs. (118) and (119). This result is in agreement with the results presented in den Dekker, Sijbers, and van Dyck (1999), where the attainable precision with which the position of a single atom can be estimated is evaluated as a function of microscope settings for high-resolution CTEM. Furthermore, this finding does not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint. In Figure 18, the recording time as well as the number of incident electrons per square Å are fixed. The optimal defocus value does not change if, for example, longer recording times or more incident electrons per square Å would be allowed. The reason for this is that the precision is inversely proportional to the square root of the total number of detected electrons N, which, in turn, is directly proportional to the recording time. This follows from Eqs. (110), (111), (115), (116), and (117). Therefore, for other values of the recording time or the number of incident electrons per square Å, only the actual values for the standard deviation ascribed to Figure 18 would be different, whereas the optimal defocus value would be the same. From now on, the defocus will be adjusted to the value given by Eq. (118) for negative Cs-values and to the Scherzer defocus, given by Eq. (119), for positive Cs-values, since these are useful approximations of the optimal defocus value.

iv. Optimal Spherical Aberration Constant. Subsequently, the dependence of the precision on the spherical aberration constant is studied. Usually, the spherical aberration constant is a fixed property of the electron microscope. However, by incorporating a spherical aberration corrector, it is tunable and may range from the value of the original uncorrected microscope over zero and even to negative values (Kabius, Haider, Uhlemann, Schwan, Urban, and Rose, 2002; Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban, 2002). Thus far, the advantages of a spherical aberration corrector were usually discussed in the literature in terms of qualitative structure determination, that is, in terms of the possibility to perceive two atom columns separately in an image. The optimality criterion used was the point resolution ρs of the electron microscope, which is equal to $0.66\,(C_s \lambda^3)^{1/4}$. By use of a spherical aberration corrector, the point resolution improves and, consequently, structure-imaging artifacts due to


contrast delocalization are reduced (Haider, Uhlemann, Schwan, Rose, Kabius, and Urban, 1998). In the present analysis, however, the possible benefit of a spherical aberration corrector is discussed in terms of the attainable statistical precision with which position coordinates of an atom column can be determined. This is the criterion of importance in the framework of quantitative structure determination, which will gain importance in the future. This criterion takes the object and the total number of detected electrons into account. First, the precision is evaluated and optimized as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 300 keV, corresponding to an accelerating voltage of 300 kV and an electron wavelength of 0.02 Å. In Figure 19, it is plotted for a silicon [100] as well as for a gold [100] atom column as a function of the spherical aberration constant Cs. The optimal spherical aberration constant in terms of precision is the one that corresponds to the minimum of the curve shown in Figure 19. From Figure 19, it follows that the optimal spherical aberration constant is equal to 0 mm in this example. For light atom columns such as silicon [100], the precision in terms of the standard deviation that is gained by reducing the spherical aberration constant from the original setting of 0.5 mm to the optimal setting of 0 mm is a factor of 1.3. For heavy atom columns such as gold [100], the precision that is gained by reducing the spherical aberration constant from 0.5 mm to 0 mm is a factor of 1.9. Therefore, correction of spherical aberration is more useful in terms of precision for heavy than for light atom columns. Notice, however,

Figure 19. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the spherical aberration constant. The incident electron energy is equal to 300 keV.


that for silicon, it follows from Figure 18 that a gain in precision comparable to the mentioned factor of 1.3 may be obtained without a spherical aberration corrector, by using a slightly different defocus value than Scherzer's. The same conclusion may be drawn for gold. Next, the previous evaluation has been repeated, but this time for a thinner object. The object thickness is assumed to be equal to 50 Å. The results are shown in Figure 20. From this figure, it is concluded that for thin objects, the optimal spherical aberration constant is different from 0 mm. The reason for this is that for the thin object considered, a spherical aberration constant equal to 0 mm and a defocus adjusted to Scherzer's lead to images with very low contrast, which results in extremely high standard deviations of the position coordinates. This is also found in den Dekker, Sijbers, and van Dyck (1999), where the attainable precision with which the position of a single atom can be estimated is evaluated as a function of microscope settings for high-resolution CTEM. In that paper, intuitive interpretations of the results may be found. For a gold [100] atom column, the optimal spherical aberration constant is close to but different from 0 mm, whereas for a silicon [100] atom column, it is negative and equal to −0.35 mm. Therefore, from the comparison of Figures 19 and 20, it is concluded that the optimal spherical aberration constant clearly depends on the object under study. This finding is in contrast to what is found in Lentzen, Jahnen, Jia, Thust, Tillmann, and Urban (2002), where expressions are derived for the optimal spherical aberration constant in terms of phase

Figure 20. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the spherical aberration constant. The incident electron energy is equal to 300 keV. The object thickness is equal to 50 Å.


contrast and delocalization. The obtained expressions do not depend on structure parameters of the object under study. Subsequently, the precision is evaluated and optimized as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 50 keV, instead of 300 keV, corresponding to an accelerating voltage of 50 kV and an electron wavelength of 0.05 Å. Usually, decreasing the incident electron energy, or equivalently, increasing the electron wavelength, is not beneficial in terms of precision if the relevant physical constraint of the experiment is determined by the specimen drift. Some of the reasons for the deterioration of the precision with decreasing incident electron energy are the accompanying decrease of the number of detected electrons, which follows directly from Eq. (111), and the deterioration of the point resolution $\rho_s = 0.66\,(C_s \lambda^3)^{1/4}$. However, for some materials one should use incident electron energies lower than 300 keV in order to avoid displacement damage, that is, displacement of atoms from their initial positions. The amount of displacement damage decreases with decreasing incident electron energy (Williams and Carter, 1996). Examples of materials which are sensitive to displacement damage are metals and amorphous materials. Although silicon and gold are possibly insensitive to displacement damage, the evaluation of the attainable precision is again performed for these columns so as to make comparison with the 300 keV results possible. The results for 50 keV are shown in Figure 21. In this evaluation, the chromatic aberration constant is equal to 0 mm. From

Figure 21. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the spherical aberration constant. The incident electron energy is equal to 50 keV. The chromatic aberration constant is Cc = 0 mm.


Figure 21, it follows that the optimal spherical aberration constant is equal to 0 mm, just as for a microscope operating at an incident electron energy of 300 keV and an object thickness equal to half the extinction distance. Moreover, it is concluded that, both for light and for heavy atom columns, correction of the spherical aberration is useful in terms of precision, although the gain is higher for heavy than for light atom columns. For example, for a light atom column, such as silicon [100], the precision that is gained by reducing the spherical aberration constant from 0.5 mm to 0 mm is a factor of 2.5, whereas for a heavy atom column such as gold [100], this is a factor of 13.9. The latter is a substantial reduction of the standard deviation. From the comparison of the numerical values of the lower bound on the standard deviation of the position coordinates corresponding to 50 keV and 300 keV, it follows that, as predicted above, the precision is higher for 300 keV than for 50 keV if the recording time is fixed. Therefore, reducing the incident electron energy is only beneficial in terms of precision if the object under study is sensitive to displacement damage. In the discussion of the optimal spherical aberration constant, some remarks are due. It should be mentioned that the results of the optimal spherical aberration constant do not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint of the CTEM experiment. In Figures 19 to 21, the recording time as well as the number of incident electrons per square Å are fixed. Furthermore, it should be mentioned that the possible benefit of a spherical aberration corrector, which allows one to reduce the spherical aberration constant, is underestimated in the present analysis for the following reason.
The semi-angle of beam convergence has been kept constant and sufficiently small in order to guarantee that the quasi-coherent approximation, made in the derivation of the expectation model, is reasonable (Spence, 1988). The chosen angle does therefore not correspond to the optimal value in terms of attainable precision. In the quasi-coherent approximation, the effects of partial spatial and temporal coherence are incorporated by coherent damping envelope functions. For large semi-angles of beam convergence, the quasi-coherent approximation is no longer valid. A better approximation would be to include partial spatial and temporal coherence in the expectation model in the form of transmission cross-coefficients (Frank, 1973; Born and Wolf, 1999; Ishizuka, 1980). This model would allow one to evaluate and optimize the attainable precision as a function of the semi-angle of beam convergence. Although such an analysis is not made in this work, it is intuitively clear that the optimal semi-angle of beam convergence would increase with decreasing spherical aberration constant and that the relative gain in precision would increase accordingly. This intuitive reasoning is based on the facts mentioned in Kabius, Haider,


Uhlemann, Schwan, Urban, and Rose (2002) and on the expectation model given by Eq. (110), although it is of limited validity. From Eqs. (111) and (115), it follows that the total number of detected electrons increases with increasing semi-angle of beam convergence, which has a favorable effect on the attainable precision with which position coordinates can be estimated. As a side effect, however, it follows from Eq. (97) that with increasing semi-angle of beam convergence, high spatial frequencies are more severely attenuated due to partial spatial coherence, which has an unfavorable effect on the attainable precision. The optimal semi-angle of beam convergence is the one for which both effects are balanced so as to produce the highest attainable precision. The relative importance of the attenuation of high spatial frequencies becomes less for lower values of the spherical aberration constant, as follows from Eq. (97). Therefore, the optimal semi-angle of beam convergence will shift to higher values with decreasing spherical aberration constant. Due to the accompanying increase of the total number of detected electrons, the relative gain in precision will increase accordingly. Nevertheless, a decisive answer to the questions of which semi-angle of beam convergence is optimal and what precision may be gained can only be provided by means of further research.

v. Optimal Chromatic Aberration Constant. Next, the dependence of the precision on the chromatic aberration constant is studied. Usually, the chromatic aberration constant is a fixed property of the electron microscope. However, by incorporating a chromatic aberration corrector, which is at a conceptual stage (Weißbäcker and Rose, 2001, 2002), it will become tunable and may even become negative. The advantages of a chromatic aberration corrector for use in CTEM experiments are usually discussed in the literature in terms of the information limit of the electron microscope.
The information limit ρi is equal to $(\pi \lambda \Delta / 2)^{1/2}$, with Δ the defocus spread, which is proportional to the chromatic aberration constant (Spence, 1988). By use of a chromatic aberration corrector, the information limit improves. In combination with image processing techniques such as off-axis holography or the focal-series reconstruction method, visual interpretability of the reconstructed exit wave is enhanced, which is a benefit for qualitative structure determination. In the present analysis, the performance of a chromatic aberration corrector is studied for quantitative structure determination aiming at the highest precision with which position coordinates of an atom column can be estimated. First, the precision is evaluated and optimized as a function of the chromatic aberration constant for a microscope operating at an incident electron energy of 300 keV. In Figure 22, it is plotted for a silicon [100] as well as for a gold [100] atom column as a function of the chromatic


Figure 22. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the chromatic aberration constant. The incident electron energy is equal to 300 keV.

aberration constant. From this figure, it follows that the optimal chromatic aberration constant is equal to 0 mm. The precision in terms of the standard deviation that is gained by reducing the chromatic aberration constant from the original setting of 1.3 mm to 0 mm is a factor of 1.1 and 1.4 for silicon [100] and gold [100], respectively. Hence, both for light and for heavy atom columns, correction of the chromatic aberration is not so useful in terms of precision under the given conditions. Second, the precision is evaluated and optimized as a function of the chromatic aberration constant for a microscope operating at an incident electron energy of 50 keV, instead of 300 keV. Figure 23 shows the results of the evaluation for a silicon [100] as well as for a gold [100] atom column. The spherical aberration constant is equal to 0 mm. From this figure, it follows that the optimal chromatic aberration constant is again equal to 0 mm. Compared to the results obtained for a microscope operating at an incident electron energy of 300 keV, correction of the chromatic aberration is more useful in terms of precision both for light and for heavy atom columns. The precision that is gained by reducing the chromatic aberration constant from 1.3 mm to 0 mm is a factor of 3.5 and 22.0 for a light atom column such as silicon [100] and for a heavy atom column such as gold [100], respectively. These are substantial reductions of the standard deviation. However, as mentioned earlier, decreasing the incident electron energy is only recommended for materials which are sensitive to displacement damage.
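The two classical resolution measures quoted in this section, the point resolution 0.66 (Cs λ³)^(1/4) and the information limit (π λ Δ / 2)^(1/2), can be evaluated with the following Python sketch for the original settings of Table 7. The defocus spread is taken as Δ = Cc ΔE / E0, which is a simplifying assumption made here for illustration only (it ignores lens-current and high-voltage instabilities and relativistic corrections); the text only states that Δ is proportional to Cc. All function names are illustrative.

```python
import math

PLANCK = 6.62607015e-34              # J s
ELECTRON_MASS = 9.1093837015e-31     # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
LIGHT_SPEED = 2.99792458e8           # m / s
MM_TO_ANGSTROM = 1e7


def wavelength_angstrom(energy_ev: float) -> float:
    """Relativistic electron wavelength in Å for an incident energy in eV."""
    kinetic_j = ELEMENTARY_CHARGE * energy_ev
    rest_j = ELECTRON_MASS * LIGHT_SPEED ** 2
    momentum = math.sqrt(2.0 * ELECTRON_MASS * kinetic_j
                         * (1.0 + kinetic_j / (2.0 * rest_j)))
    return PLANCK / momentum * 1e10


def point_resolution_angstrom(cs_mm: float, lam: float) -> float:
    """Point resolution 0.66 (Cs lambda^3)^(1/4), as quoted in the text."""
    return 0.66 * (cs_mm * MM_TO_ANGSTROM * lam ** 3) ** 0.25


def defocus_spread_angstrom(cc_mm: float, delta_e_ev: float,
                            energy_ev: float) -> float:
    """ASSUMPTION: Delta = Cc * dE / E0 (energy-spread term only)."""
    return cc_mm * MM_TO_ANGSTROM * delta_e_ev / energy_ev


def information_limit_angstrom(lam: float, spread: float) -> float:
    """Information limit (pi lambda Delta / 2)^(1/2), as quoted in the text."""
    return math.sqrt(math.pi * lam * spread / 2.0)


if __name__ == "__main__":
    for energy_ev in (300e3, 50e3):
        lam = wavelength_angstrom(energy_ev)
        spread = defocus_spread_angstrom(1.3, 0.75, energy_ev)
        print(f"{energy_ev / 1e3:.0f} keV: "
              f"point resolution {point_resolution_angstrom(0.5, lam):.2f} Å, "
              f"information limit {information_limit_angstrom(lam, spread):.2f} Å")
```

Under this assumed defocus spread, the information limit degrades much more strongly than the point resolution when the energy is lowered from 300 keV to 50 keV. This is only a qualitative companion to the results above: the conclusions in the text rest on the CRLB, which also accounts for the object and the number of detected electrons, and not on these resolution measures.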


Figure 23. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the chromatic aberration constant. The incident electron energy is equal to 50 keV. A spherical aberration corrector is used with Cs = 0 mm.

The results of the optimal chromatic aberration constant do not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint of the CTEM experiment. In Figures 22 and 23, the recording time as well as the number of incident electrons per square Å are fixed.

vi. Optimal Energy Spread of a Monochromator. Furthermore, the precision of the position coordinate estimates is evaluated and optimized as a function of the energy spread of a monochromator. This evaluation and optimization will be done for a fixed recording time as well as for a fixed number of incident electrons per square Å. In the former case, the physical constraint is determined by the specimen drift, whereas in the latter, it is determined by the radiation sensitivity of the object. The reason for considering both constraints is that the number of incident electrons per second decreases by use of a monochromator. The use of a monochromator in CTEM experiments is assumed to be advantageous for qualitative structure determination. The reason for this supposition is that the information limit ρi, which is equal to $(\pi \lambda \Delta / 2)^{1/2}$, improves by use of a monochromator because of the decrease of the defocus spread Δ. This means that, in combination with off-axis holography or the focal-series reconstruction method, visual interpretability of the reconstructed exit wave is enhanced. In these discussions, the object under study and the total number of detected electrons are not taken into account. However, a reduction of


the incident number of electrons per second leads to a decrease in SNR if the recording time is kept within the constraints. This effect has to be taken into account when the performance of a monochromator for quantitative structure determination is evaluated. This might be done by using a modified definition of the information limit that includes the SNR (de Jong and van Dyck, 1993; van Dyck and de Jong, 1992). In the present analysis, however, this is done by choosing the attainable precision, instead of the information limit, as optimality criterion. This criterion takes both the object under study and the total number of detected electrons into account. First, it is assumed that the specimen drift determines the relevant physical constraint. Hence, the recording time is kept constant in the evaluation of the precision as a function of the standard deviation ΔEm of the energy spread of the monochromator, given by Eq. (114). Consequently, the total number of detected electrons decreases with decreasing energy spread. This follows directly from Eq. (115). Figures 24 and 25 show the results of the evaluation for a silicon [100] and for a gold [100] atom column for a microscope operating at an incident electron energy of 300 keV and 50 keV, respectively. At 50 keV, the spherical aberration constant is set to 0 mm. The optimal value of the energy spread in terms of precision is the one that corresponds to the minimum of the curve. From Figure 24, where the incident electron energy is equal to 300 keV, it follows that, for light atom columns such as silicon [100], no precision is gained by decreasing the energy spread by means of a monochromator. On the other hand, for heavy

Figure 24. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 300 keV. In this evaluation, the recording time is kept constant.


Figure 25. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 50 keV. A spherical aberration corrector is used with Cs = 0 mm. In this evaluation, the recording time is kept constant.

atom columns such as gold [100], a monochromator may slightly improve the precision. The precision that is gained by reducing the intrinsic energy spread of 0.75 eV to the optimal energy spread of 0.45 eV by use of a monochromator is a factor of 1.1 in this example. From Figure 25, it follows that, for a microscope operating at an incident electron energy of 50 keV, both for light and heavy atom columns, the precision improves by use of a monochromator. For a silicon [100] column and a gold [100] column, the precision that is gained by reducing the intrinsic energy spread of 0.75 eV to the optimal energy spread of 0.13 eV and 0.02 eV using a monochromator is a factor of 1.7 and 5.5, respectively. Second, it is assumed that the radiation sensitivity of the object determines the relevant physical constraint. Hence, the number of incident electrons per square Å is kept constant in the evaluation of the precision as a function of the standard deviation of the energy spread. In practice, it follows from Eq. (115) that this may be realized by compensating the loss of incident electrons due to the use of the monochromator with an increased recording time. Figures 26 and 27 show the results of the evaluation for a silicon [100] and for a gold [100] atom column for a microscope operating at an incident electron energy of 300 keV and 50 keV, respectively. The recording time corresponding to an intrinsic energy spread of 0.75 eV is equal to 1 s. At 50 keV, the spherical aberration constant is equal to 0 mm. From these figures, it follows that under the given conditions, the precision


Figure 26. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 300 keV. In this evaluation, the number of incident electrons per square Å is kept constant.

Figure 27. The lower bound on the standard deviation of the position coordinates of an isolated silicon and gold atom column as a function of the standard deviation of the energy spread of a monochromator. The incident electron energy is equal to 50 keV. A spherical aberration corrector is used with Cs = 0 mm. In this evaluation, the number of incident electrons per square Å is kept constant.

improves by use of a monochromator. The precision that is gained is larger for heavy atom columns such as gold [100] and for smaller incident electron energies. This may be illustrated by the following numerical values. The precision that is gained by reducing the intrinsic energy spread of 0.75 eV to


0.03 eV using a monochromator for a microscope operating at an incident electron energy of 300 keV is a factor of 1.1 and 1.4 for a silicon [100] and gold [100] column, respectively. For a microscope operating at an incident electron energy of 50 keV, these factors are substantial and equal to 3.5 and 21.1, respectively.

vii. Optimal Reduced Brightness of the Electron Source. Next, the effect of the reduced brightness Br of the electron source on the precision with which the position coordinates of an atom column can be measured is studied. Using Eqs. (110), (116), and (117), it follows that the precision, represented by the lower bound on the standard deviation of the position coordinates, is inversely proportional to the square root of the total number of detected electrons N. Furthermore, it follows from Eqs. (111) and (115) that in the absence as well as in the presence of a monochromator, N is directly proportional to the reduced brightness of the electron source Br. Therefore, new developments in producing electron sources with higher reduced brightness (de Jonge, Lamy, Schoots, and Oosterkamp, 2002; van Veen, Hagen, Barth, and Kruit, 2001) are advantageous in terms of precision. For example, if the reduced brightness is increased by a factor of 10, the lower bound on the standard deviation of the position coordinates decreases by a factor of $\sqrt{10}$. Hence, on the one hand, if the experiment is limited by specimen drift, the optimal reduced brightness is preferably as high as possible, that is, as high as physical limitations to the production of electron sources with higher reduced brightness allow. The dominant limitation is determined by the statistical Coulomb interactions (Kruit and Jansen, 1997; van Veen, Hagen, Barth, and Kruit, 2001).
On the other hand, if the experiment is limited by the radiation sensitivity of the object, the reduced brightness has to be kept subcritical, or an increase of the reduced brightness Br has to be compensated by a decrease of the recording time t, so as to keep the number of incident electrons per square Å within the constraints.

Finally, a remark about the recording time needs to be made. If the experiment is limited by specimen drift, the recording time is kept within the constraints in this study. The amount of specimen drift is determined by mechanical instabilities of the specimen holder. Hence, new developments providing more stable specimen holders would allow microscopists to increase the recording time. This has a favorable effect on the precision since, as mentioned above, the lower bound on the standard deviation of the position coordinates is inversely proportional to the square root of the total number of detected electrons N, which in turn is directly proportional to the recording time.
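The scaling argument of this subsection can be made concrete with a few lines of Python. The sketch below assumes only what is stated above: the lower bound on the standard deviation is inversely proportional to the square root of N, and N is directly proportional to both the reduced brightness and the recording time. The function name and its arguments are hypothetical and serve illustration only.

```python
import math

def std_bound_factor(brightness_factor, recording_time_factor=1.0):
    """Factor by which the lower bound on the standard deviation changes.

    Assumes N is proportional to (reduced brightness) x (recording time)
    and the bound is proportional to 1 / sqrt(N), as stated in the text.
    """
    electron_factor = brightness_factor * recording_time_factor
    return 1.0 / math.sqrt(electron_factor)

# Drift-limited experiment: recording time fixed, brightness 10 times higher.
drift_limited = std_bound_factor(10.0)

# Dose-limited experiment: recording time reduced 10 times so that the
# number of incident electrons per square angstrom stays constant.
dose_limited = std_bound_factor(10.0, recording_time_factor=0.1)

print(f"drift-limited: bound multiplied by {drift_limited:.3f}")
print(f"dose-limited:  bound multiplied by {dose_limited:.3f}")
```

In the drift-limited case the bound shrinks by a factor of $\sqrt{10}$, whereas in the dose-limited case a brighter source only shortens the experiment and leaves the bound unchanged.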


viii. Summary. Tables 11 and 12 give a summary of the attainable precisions with which the position coordinates of an isolated atom column can be estimated for a microscope operating at an incident electron energy of 300 keV and 50 keV, respectively. The attainable precision is represented for the values of the original microscope settings as described in Table 7 and for the optimal values of one or two of these settings with all other values kept fixed. The defocus is adjusted to the value given by Eq. (118) for negative Cs-values and to the Scherzer defocus, given by Eq. (119), for positive Cs-values. These values are close to optimal. This is done for both a silicon [100] and gold [100] atom column for which the structure parameters are given in Tables 8, 9, and 10. The recording time is held constant. From these tables, the following conclusions are drawn:

TABLE 11
The Attainable Precision for an Isolated Silicon [100] and Gold [100] Atom Column for Different Values of Microscope Settings and for an Incident Electron Energy of 300 keV

Microscope settings                                     Si [100]    Au [100]
original settings                                       0.0014 Å    0.0054 Å
optimal spherical aberration constant (Cs = 0 mm)       0.0011 Å    0.0028 Å
optimal chromatic aberration constant (Cc = 0 mm)       0.0013 Å    0.0040 Å
optimal energy spread of the monochromator (see text)   0.0014 Å    0.0050 Å
10× higher reduced brightness                           0.0004 Å    0.0017 Å
optimal spherical and chromatic aberration constant     0.0009 Å    0.0011 Å
optimal spherical aberration constant and
  optimal energy spread of the monochromator            0.0011 Å    0.0022 Å

TABLE 12
The Attainable Precision for an Isolated Silicon [100] and Gold [100] Atom Column for Different Values of Microscope Settings and for an Incident Electron Energy of 50 keV

Microscope settings                                     Si [100]    Au [100]
original settings                                       0.0142 Å    0.1274 Å
optimal spherical aberration constant (Cs = 0 mm)       0.0079 Å    0.0555 Å
optimal chromatic aberration constant (Cc = 0 mm)       0.0057 Å    0.0350 Å
optimal energy spread of the monochromator (see text)   0.0098 Å    0.1133 Å
10× higher reduced brightness                           0.0045 Å    0.0403 Å
optimal spherical and chromatic aberration constant     0.0023 Å    0.0025 Å
optimal spherical aberration constant and
  optimal energy spread of the monochromator            0.0046 Å    0.0100 Å


• The attainable precision is better at 300 keV than at 50 keV. Hence, reducing the incident electron energy is only recommended if the experiment is limited by displacement damage instead of specimen drift.
• Mathematically speaking, at 300 keV, the attainable precision improves with a spherical or chromatic aberration corrector. However, since the accompanying gain in precision is only marginal, one may wonder whether such correctors are needed in order to obtain a prespecified precision of the atom column positions.
• At 50 keV, the attainable precision improves with a spherical or chromatic aberration corrector. A chromatic aberration corrector is preferable.
• The attainable precision improves more with a chromatic aberration corrector than with a monochromator.
• The attainable precision improves substantially if the reduced brightness is 10 times higher.
• The attainable precision improves substantially with both a spherical and chromatic aberration corrector, especially for heavy atom columns and low incident electron energies.
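The conclusions listed above can be checked against the tabulated numbers. The following sketch copies the lower bounds from Tables 11 and 12 and computes, for each alternative setting, the ratio of the original bound to the improved bound. The shortened setting labels and the function name are hypothetical; only the numerical values come from the tables.

```python
# Lower bounds on the standard deviation (angstrom), copied from Tables 11 and 12.
SETTINGS = [
    "original settings",
    "optimal Cs",
    "optimal Cc",
    "optimal monochromator energy spread",
    "10x higher reduced brightness",
    "optimal Cs and Cc",
    "optimal Cs and monochromator energy spread",
]
BOUNDS = {
    ("Si [100]", 300): [0.0014, 0.0011, 0.0013, 0.0014, 0.0004, 0.0009, 0.0011],
    ("Au [100]", 300): [0.0054, 0.0028, 0.0040, 0.0050, 0.0017, 0.0011, 0.0022],
    ("Si [100]", 50): [0.0142, 0.0079, 0.0057, 0.0098, 0.0045, 0.0023, 0.0046],
    ("Au [100]", 50): [0.1274, 0.0555, 0.0350, 0.1133, 0.0403, 0.0025, 0.0100],
}

def gain_factors(column, energy_kev):
    """Ratio of the original bound to the bound for each alternative setting."""
    values = BOUNDS[(column, energy_kev)]
    original = values[0]
    return {name: original / value for name, value in zip(SETTINGS[1:], values[1:])}

for column, energy_kev in BOUNDS:
    gains = gain_factors(column, energy_kev)
    summary = ", ".join(f"{name}: {gain:.1f}" for name, gain in gains.items())
    print(f"{column} at {energy_kev} keV -> {summary}")
```

For a gold [100] column at 50 keV, for instance, the combination of both correctors lowers the bound from 0.1274 Å to 0.0025 Å, a gain of roughly 51, whereas for a silicon [100] column at 300 keV the corresponding gain stays below 2.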

Furthermore, as mentioned earlier, the attainable precision may be improved if the mechanical stability of the specimen holder is improved, since it would provide longer recording times and hence more detected electrons.

c. Neighboring Atom Columns. The optimal microscope settings described in the previous part of Section IV.C.2 are derived for single isolated atom columns. One should keep in mind that the attainable precision with which the position of a single isolated column can be estimated is a valid criterion for the optimization of the experimental design as long as neighboring columns are clearly separated in the image. Under this condition, the attainable precision with which the position of an atom column is estimated is independent of the presence of neighboring columns. This condition was not always met in the previous part. For example, images of silicon [100] atom columns of a crystal, taken with a microscope which operates at an incident electron energy of 50 keV and which is not corrected for spherical and chromatic aberration, show strong overlap. Then, the attainable precision with which the position of an atom column can be estimated is affected unfavorably by the presence of neighboring columns. To find out whether the optimal microscope settings change in the presence of neighboring atom columns, the attainable precision with which atom column position coordinates of silicon [100] and gold [100] crystals can be estimated will be computed.


i. Structure Parameters. The two-dimensional projected structure of the objects under study, namely silicon [100] and gold [100] crystals, is modelled as a lattice consisting of 7 × 7 projected atom columns at the positions

$$\beta_n = (\beta_{x_n} \;\; \beta_{y_n})^T = (n_x d \;\; n_y d)^T, \qquad (120)$$

with indices $n = (n_x, n_y)$, $n_x = -3, \ldots, 3$, $n_y = -3, \ldots, 3$, and $d$ the distance between an atom column and its nearest neighbor. The values of the distance $d$ for both a silicon [100] and a gold [100] crystal (International Centre for Diffraction Data, 2001) and for the object thickness are given in Table 13. It should be mentioned that the chosen object thickness is equal to 50 Å instead of half the extinction distance such as for isolated atom columns in the previous section.

ii. Optimality Criterion. The optimal statistical experimental design will be described by the microscope settings resulting in the highest attainable precision with which the position coordinate $\beta_{x_n}$ of the central atom column of the lattice consisting of 7 × 7 atom columns can be estimated. This column corresponds to the index $n = (0, 0)$. The attainable precision (in terms of the variance) is represented by the diagonal element $\sigma^2_{\beta_{x_n}}$ of the CRLB. An expression for this element may be derived as follows. First, the Fisher information matrix associated with the total set of 98 position coordinates $\beta_{x_n}$ and $\beta_{y_n}$ is computed. This is a 98 × 98 matrix. The expression for the elements $F_{rs}$ of the Fisher information matrix is given by Eq. (116). Explicit numbers for these elements are obtained by substituting values of a given set of microscope settings and structure parameters of the object into the obtained expression for $F_{rs}$. Next, the CRLB is computed. It is given by the inverse of the Fisher information matrix. Finally, the diagonal element $\sigma^2_{\beta_{x_n}}$ of the CRLB, corresponding to the position coordinate $\beta_{x_n}$ of the central atom column of the lattice, represents the attainable precision.
In what follows, the precision will be represented by the lower bound on the standard deviation $\sigma_{\beta_{x_n}}$, that is, the square root of $\sigma^2_{\beta_{x_n}}$.
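The three steps just described (Fisher information matrix, inversion, diagonal element of the central column) may be sketched numerically as follows. This is only an illustration under loudly stated assumptions: the expectation model is a toy model in which every column subtracts a Gaussian dip from a constant background, not the channelling-theory model of Section IV.B, and the background, contrast, width, pixel size and margin are hypothetical values. The Fisher information is computed with the standard expression for independent Poisson counts, which is assumed here to play the role of Eq. (116).

```python
import numpy as np

def lattice_positions(n_side=7, d=1.92):
    """Column positions (n_x d, n_y d) of an n_side x n_side lattice, in angstrom."""
    half = n_side // 2
    idx = np.arange(-half, half + 1)
    nx, ny = np.meshgrid(idx, idx, indexing="ij")
    return np.column_stack([nx.ravel() * d, ny.ravel() * d])

def toy_model(positions, pixels, background=100.0, contrast=0.3, width=0.4):
    """Expected counts and their derivatives for a toy model (assumption).

    Each column subtracts a Gaussian dip from a constant background.
    """
    dx = pixels[:, None, 0] - positions[None, :, 0]
    dy = pixels[:, None, 1] - positions[None, :, 1]
    dip = np.exp(-(dx**2 + dy**2) / (2.0 * width**2))
    expected = background * (1.0 - contrast * dip.sum(axis=1))
    scale = -background * contrast / width**2
    # Columns 0..N-1: derivatives w.r.t. x-coordinates; N..2N-1: y-coordinates.
    jacobian = np.hstack([scale * dip * dx, scale * dip * dy])
    return expected, jacobian

def central_column_bounds(n_side=7, d=1.92, pixel=0.2, margin=1.5):
    """Fisher matrix and lower bounds on the std of the central column position."""
    positions = lattice_positions(n_side, d)
    m = int(np.ceil(((n_side // 2) * d + margin) / pixel))
    axis = pixel * np.arange(-m, m + 1)
    px, py = np.meshgrid(axis, axis, indexing="ij")
    pixels = np.column_stack([px.ravel(), py.ravel()])
    expected, jacobian = toy_model(positions, pixels)
    # Independent Poisson counts:
    # F_rs = sum_k (1 / lambda_k) (d lambda_k / d theta_r) (d lambda_k / d theta_s)
    fisher = jacobian.T @ (jacobian / expected[:, None])
    crlb = np.linalg.inv(fisher)
    n_columns = positions.shape[0]
    centre = n_columns // 2  # index of n = (0, 0) for odd n_side
    std_x = np.sqrt(crlb[centre, centre])
    std_y = np.sqrt(crlb[n_columns + centre, n_columns + centre])
    return fisher, std_x, std_y

fisher, std_x, std_y = central_column_bounds()
print("Fisher matrix shape:", fisher.shape)
print("lower bound on std (x, y) of central column:", std_x, std_y)
```

With the default 7 × 7 lattice the Fisher information matrix is 98 × 98, and the two bounds of the central column coincide because the toy lattice and the pixel grid are unchanged when x and y are interchanged.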

TABLE 13
Structure Parameters of Neighboring Atom Columns

Structure parameter    Si [100]    Au [100]
d (Å)                  1.92        2.04
z (Å)                  50          50


This lower bound on the standard deviation will be used as optimality criterion for the evaluation and optimization of the experimental design. Alternatively, one could choose the lower bound on the standard deviation $\sigma_{\beta_{y_n}}$ of the position coordinate $\beta_{y_n}$ of the central atom column since $\sigma_{\beta_{x_n}}$ and $\sigma_{\beta_{y_n}}$ are equal to one another. The reason for this is that, for the chosen structure of the objects under study, rotation of the expectation model over an angle of 90 degrees carries the expectation model into itself. Moreover, the central atom column is preferred rather than one of the other 48 atom columns since this column is most affected by the presence of neighboring columns. As mentioned in Section II.C.2, the chosen criterion may be regarded as a partial or truncated optimality criterion.

iii. Optimal Microscope Settings. First, in Figures 28 and 29, the precision is evaluated as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 300 keV for a silicon [100] and gold [100] crystal, respectively. The solid curve corresponds to a microscope without correction for chromatic aberration, that is, a microscope without chromatic aberration corrector and monochromator. The dashed curve corresponds to a microscope with chromatic aberration corrector, that is, a microscope for which the chromatic aberration constant is equal to 0 mm. The dotted curve corresponds to a microscope with monochromator, for which the standard deviation of the energy spread is chosen equal to 0.086 eV corresponding to a typical full width at half maximum height of 200 meV as follows from Eq. (102) (Batson, 1999). In this

Figure 28. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick silicon [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 300 keV equipped with or without chromatic aberration corrector or monochromator.


Figure 29. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick gold [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 300 keV equipped with or without chromatic aberration corrector or monochromator.

evaluation, it is assumed that the specimen drift is the relevant physical constraint. Hence, the recording time is kept constant. It should be noticed that the precision is not represented for a spherical aberration constant equal to 0 mm in Figures 28 and 29. The reason for this is that for the thin crystals considered and for the defocus adjusted to the Scherzer defocus, the contrast in the image is very low, which results in extremely high standard deviations of the position coordinates. From Figures 28 and 29, the following conclusions are drawn for neighboring atom columns and an incident electron energy of 300 keV:

• The optimal spherical aberration constant is close to, but different from, 0 mm. This finding is due to the small object thickness.
• The attainable precision improves by use of a chromatic aberration corrector. Particularly for light atom columns such as silicon [100], the gain in precision is only marginal.
• The attainable precision deteriorates by use of a monochromator with an energy spread of 0.086 eV for both types of crystals.
• Strictly speaking, the highest attainable precision, corresponding to the optimal experimental design, is obtained for a microscope with chromatic aberration constant equal to 0 mm and with spherical aberration constant close to, but different from, 0 mm (for the thin objects considered). However, the precision that is gained is only marginal. Hence, the question may be raised whether this gain is required to obtain a desired precision.


It should be noticed that the possible benefit of a spherical aberration corrector is underestimated in the present analysis for the same reason as has been mentioned in the discussion of the evaluation of the spherical aberration constant for isolated atom columns.

Second, in Figures 30 and 31, the precision is evaluated as a function of the spherical aberration constant for a microscope operating at an incident electron energy of 50 keV, instead of 300 keV, for a silicon [100] and gold [100] crystal, respectively. Again, the solid curve corresponds to a microscope without correction for chromatic aberration. The dashed curve corresponds to a microscope with chromatic aberration corrector, that is, a microscope for which the chromatic aberration constant is equal to 0 mm. The dotted curve corresponds to a microscope with monochromator, for which the standard deviation of the energy spread is equal to 0.086 eV. The recording time is kept constant in the evaluation. Also here, the precision is not represented for a spherical aberration constant equal to 0 mm since for the thin crystals considered and for the defocus adjusted to the Scherzer defocus, the corresponding standard deviations of the position coordinates are very high. From Figures 30 and 31, the following conclusions are drawn for neighboring atom columns and an incident electron energy of 50 keV:

• The optimal spherical aberration constant is different from 0 mm. This finding is due to the small object thickness.

Figure 30. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick silicon [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 50 keV equipped with or without chromatic aberration corrector or monochromator.


Figure 31. The lower bound on the standard deviation of the position coordinates of the central atom column of the 50 Å thick gold [100] crystal under study as a function of the spherical aberration constant for a microscope operating at 50 keV equipped with or without chromatic aberration corrector or monochromator.

• The attainable precision improves by use of either a chromatic aberration corrector or a monochromator, although a chromatic aberration corrector is preferred.
• The attainable precision improves more with a chromatic than with a spherical aberration corrector. The gain is more significant for heavy atom columns such as gold [100] than for light atom columns such as silicon [100].
• The highest attainable precision, corresponding to the optimal experimental design, is obtained for a microscope with chromatic aberration constant equal to 0 mm and with spherical aberration constant close to, but different from, 0 mm (for the thin objects considered). The gain in precision is substantial.

From the comparison of the conclusions obtained from Figures 28 to 31 for neighboring atom columns with those obtained for isolated atom columns as summarized in the previous section, it follows that the main conclusions regarding the optimal microscope settings remain valid. Moreover, as for isolated atom columns, increasing the reduced brightness of the electron source and improving the mechanical stability of the specimen holder are advantageous in terms of precision if the experiment is limited by specimen drift. This is evident since these conclusions, which are given earlier, are based only on the total number of detected electrons and not on the structure of the object under study.


d. Attainability of the Cramér-Rao Lower Bound. Finally, the discussion of the optimization of a CTEM experiment should be complemented with an investigation of whether there exists an estimator attaining the CRLB on the variance of the position coordinates and whether this estimator is unbiased. If so, this would justify the choice of the CRLB as optimality criterion used in this section. Generally, one may use different estimators in order to measure the position coordinates of the atom columns from CTEM experiments, such as the least squares estimator or the maximum likelihood estimator, which has been introduced in Section II.D. Different estimators have different properties. One of the asymptotic properties of the maximum likelihood estimator is that it is normally distributed about the true parameters with covariance matrix approaching the CRLB (van den Bos, 1982). This property would justify the use of the CRLB as optimality criterion, but it is an asymptotic one. This means that it applies to an infinite number of observations. However, the number of observations used in CTEM experiments is finite and may even be relatively small. Whether asymptotic properties still apply to such experiments can often only be assessed by estimating from artificial, simulated observations (van den Bos, 1999). Therefore, 200 different CTEM experiments made on an isolated silicon [100] atom column are simulated; the observations are modelled using the parametric statistical model described in Section IV.B. The spherical aberration constant is set equal to 1 mm. Next, the position coordinates $\beta_x$ and $\beta_y$ of the atom column are estimated from each simulation experiment using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively.
The lower bound on the variance is computed by substituting the true values of the parameters into the expression given by the right-hand member of Eq. (117). The results are presented in Table 14. From the comparison of these results, it follows that it may not be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB. Furthermore, the maximum likelihood estimates of $\beta_x$ are presented in the histogram of Figure 32. The solid curve represents a normal distribution with mean and variance given in Table 14. This curve makes plausible that the estimates are normally distributed. This property is also tested quantitatively by means of the so-called Lilliefors test (Conover, 1980), which does not reject the hypothesis that the estimates are normally distributed. From the results obtained from the simulation experiments, it is concluded that the maximum likelihood estimates cannot be distinguished from unbiased, efficient estimates. These results justify the choice of the CRLB as optimality criterion.
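The structure of such a simulation study may be sketched as follows. The sketch is not the computation reported in Table 14: it assumes a one-dimensional toy expectation model (a constant background from which a Gaussian dip is subtracted) with hypothetical values for the background, contrast and width, it finds the maximum likelihood estimate by an exhaustive search over a fine grid of candidate positions, and it omits the Lilliefors test. Only the logic is the same: simulate 200 Poisson-distributed experiments, estimate the position from each, and compare the mean and variance of the estimates with the true value and the lower bound.

```python
import numpy as np

def expected_counts(x, beta, background=1000.0, contrast=0.3, width=0.5):
    """Toy expectation model (assumption): background minus a Gaussian dip."""
    return background * (1.0 - contrast * np.exp(-(x - beta) ** 2 / (2.0 * width ** 2)))

def crlb_variance(x, beta, background=1000.0, contrast=0.3, width=0.5):
    """Lower bound on the variance of beta for independent Poisson counts."""
    lam = expected_counts(x, beta, background, contrast, width)
    dlam = -(background - lam) * (x - beta) / width ** 2  # d lam / d beta
    return 1.0 / np.sum(dlam ** 2 / lam)

def simulate(n_experiments=200, seed=0):
    """Return the ML estimates of beta and the corresponding lower bound."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-3.0, 3.0, 31)        # pixel positions (angstrom)
    beta_true = 0.0
    grid = np.linspace(-0.5, 0.5, 2001)   # candidate positions for the search
    lam_grid = expected_counts(x[None, :], grid[:, None])  # shape (2001, 31)
    counts = rng.poisson(expected_counts(x, beta_true), size=(n_experiments, x.size))
    # Poisson log-likelihood up to a constant: sum_k (w_k ln lam_k - lam_k)
    loglik = counts @ np.log(lam_grid).T - lam_grid.sum(axis=1)
    estimates = grid[np.argmax(loglik, axis=1)]
    return estimates, crlb_variance(x, beta_true)

estimates, bound = simulate()
print("mean of estimates      :", estimates.mean())
print("std of mean            :", np.sqrt(estimates.var(ddof=1) / estimates.size))
print("variance of estimates  :", estimates.var(ddof=1))
print("lower bound on variance:", bound)
```

If the estimated mean lies within a few standard deviations of the true value and the estimated variance is statistically indistinguishable from the lower bound, the estimator behaves as unbiased and efficient for the chosen number of observations.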


TABLE 14
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 200 Maximum Likelihood Estimates of the Position Coordinates, Respectively

             True position coordinate (Å)    Estimated mean (Å)       Standard deviation of mean (Å)
$\beta_x$    0                               $6.3 \times 10^{-5}$     $10.9 \times 10^{-5}$
$\beta_y$    0                               $3.6 \times 10^{-5}$     $11.1 \times 10^{-5}$

             Lower bound on variance (Å²)    Estimated variance (Å²)  Standard deviation of variance (Å²)
$\beta_x$    $2.6 \times 10^{-6}$            $2.4 \times 10^{-6}$     $0.2 \times 10^{-6}$
$\beta_y$    $2.6 \times 10^{-6}$            $2.5 \times 10^{-6}$     $0.2 \times 10^{-6}$

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.

Figure 32. Histogram of 200 maximum likelihood estimates of the x-coordinate of the position of an atom column. The normal distribution superimposed on this histogram, with mean and variance given in Table 14, makes plausible that the estimates are normally distributed.


3. Interpretation of the Results

To provide more insight, an intuitive interpretation will be given to some numerical results obtained in Section IV.C.2. This will be done by means of a result obtained in Section III, where a rule of thumb was obtained for the attainable precision with which the position of one component can be measured from a bright-field imaging experiment such as CTEM. The rule of thumb, which is given by Eq. (68), was derived for an expectation model of the observations consisting of a constant background from which a Gaussian peak was subtracted. From it, one observes that the attainable precision is a function of the width of the Gaussian peak and the total number of detected electrons. Generally, the precision will improve by narrowing the Gaussian peak and by increasing the total number of detected electrons. Empirically, it has been found that the obtained rule of thumb is generalizable to more complicated CTEM expectation models than Gaussian peaks. Two different approaches may be followed. One approach is to consider the highest spatial frequency that is transferred from the exit plane to the image plane instead of the inverse of the width of the Gaussian peak. Another approach is to consider the width associated with the peak which remains if the background is subtracted from the CTEM expectation model instead of the width of the Gaussian peak. The generalized rule of thumb is then that the precision will improve by decreasing the width of this peak or by increasing the highest spatial frequency that is transferred from the exit plane to the image plane and by increasing the total number of detected electrons.
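The qualitative content of the rule of thumb can be illustrated numerically without reproducing Eq. (68) itself. The sketch below assumes a one-dimensional toy model, a constant background from which a Gaussian peak of fixed contrast is subtracted, with independent Poisson counts per pixel; the function name and all parameter values are hypothetical. It evaluates the lower bound on the standard deviation of the peak position for different peak widths and different numbers of expected counts.

```python
import math

def position_std_bound(width, counts_per_pixel, contrast=0.3,
                       pixel=0.05, half_range=4.0):
    """Lower bound on the std of the peak position for a 1D toy model.

    Expected counts: counts_per_pixel * (1 - contrast * exp(-x^2 / (2 width^2))),
    with the peak located at x = 0 and independent Poisson statistics.
    """
    n = int(round(half_range / pixel))
    fisher = 0.0
    for k in range(-n, n + 1):
        x = k * pixel
        g = math.exp(-x * x / (2.0 * width * width))
        lam = counts_per_pixel * (1.0 - contrast * g)
        # Derivative of the expected counts with respect to the peak position.
        dlam = -counts_per_pixel * contrast * g * x / (width * width)
        fisher += dlam * dlam / lam
    return 1.0 / math.sqrt(fisher)

for width in (0.3, 0.6):
    for counts in (1000.0, 4000.0):
        bound = position_std_bound(width, counts)
        print(f"width = {width}, counts per pixel = {counts:.0f}: bound = {bound:.4f}")
```

In this toy setting the bound decreases when the peak is narrowed and is halved when the number of expected counts is quadrupled, in line with the two trends stated above.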
From the example illustrated in Figure 28, it may be concluded that the lower bound on the standard deviation of the position coordinates of a silicon [100] atom column of a crystal is a factor of 2.4 lower by using a chromatic aberration constant of 0 mm and a spherical aberration constant of 0.2 mm instead of an energy spread of the monochromator of 0.086 eV and a spherical aberration constant of 1.0 mm. This result may intuitively be interpreted by comparing the corresponding expectation models. These models describe the expected number of electrons detected at the pixels of a CCD camera. They have been derived in Section IV.B. Figure 33 shows intersections of the two-dimensional, radially symmetric column model and a plane through its radial axis. It is clearly observed that the peak, which remains if the background is removed, is narrower if the spherical aberration constant is equal to 0.2 mm instead of 1.0 mm. This narrowing is directly related to the improvement of the point resolution $\rho_s = 0.66 (C_s \lambda^3)^{1/4}$ with decreasing spherical aberration constant. Moreover, the number of detected electrons is much larger in the absence of a monochromator since it is assumed in this example that the recording time is


Figure 33. Intersection of the two-dimensional, radially symmetric expectation model of the observations made on an isolated silicon [100] atom column and a plane through its radial axis. The solid curve corresponds to a microscope with chromatic and spherical aberration constant equal to 0 mm and 0.2 mm, respectively. The dashed curve corresponds to a microscope with standard deviation of the energy spread of the monochromator and spherical aberration constant equal to 0.086 eV and 1.0 mm, respectively.

fixed. These considerations give an intuitive interpretation to the conclusion drawn from Figure 28. Moreover, it follows from Figures 24 to 31 that the precision that is possibly gained by use of a monochromator is higher for heavy atom columns such as gold [100] than for light atom columns such as silicon [100] and for microscopes operating at lower incident electron energies, for example, 50 keV instead of 300 keV. These results may be explained on a more or less intuitive basis as follows. Figures 34 and 35 show the damping envelope function Dt(g) due to partial temporal coherence, described by Eq. (98), associated with an electron source having an intrinsic energy spread equal to 0.75 eV, together with the Fourier transformed 1s-state functions F1s,n(g), described by Eq. (90), for a gold [100] and a silicon [100] atom column for a microscope operating at an incident electron energy equal to 300 keV and 50 keV, respectively. It follows from Eqs. (104)–(106) that F1s,n(g) may be regarded as the object transfer function associated with the atom column. It acts as a low pass filter and severely attenuates the amplitude of the microscope's transfer function T(g) at high spatial frequencies. The microscope's transfer function is described by Eq. (94) and it includes the damping envelope function Dt(g). The bandwidth of the low pass filter F1s,n(g) associated with the atom column depends on the


Figure 34. The damping envelope function Dt(g) due to partial temporal coherence, described by Eq. (98), together with the object transfer function F1s,n(g), described by Eq. (90), for a gold [100] and a silicon [100] atom column for a microscope operating at an incident electron energy of 300 keV.

Figure 35. The damping envelope function Dt(g) due to partial temporal coherence, described by Eq. (98), together with the object transfer function F1s,n(g), described by Eq. (90), for a gold [100] and a silicon [100] atom column for a microscope operating at an incident electron energy of 50 keV.

weight of this column. Heavy atom columns have more sharply peaked 1s-state functions, and therefore wider object transfer functions, than light atom columns (see also Tables 8 and 9). Consequently, a silicon [100] atom column will have a narrower bandwidth than a gold [100] atom column, as can be seen in Figures 34 and 35. A reduction of the energy spread, which may be obtained by incorporating a monochromator, decreases the


information limit since it increases the bandwidth of the damping envelope function Dt(g). However, pushing the inverse of the information limit of the microscope beyond the bandwidth of the object transfer function is useless. From Figures 34 and 35, it follows that there is more object spatial frequency information to be gained at an incident electron energy of 50 keV instead of 300 keV and for a gold [100] atom column than for a silicon [100] atom column. This effect, in combination with the loss of electrons by use of a monochromator if the experiment is limited by the specimen drift or the non-loss if the experiment is limited by the radiation damage, makes the results obtained from Figures 24 to 31 understandable. The same reasoning may be applied to understand that the optimal chromatic aberration constant is equal to 0 mm and that a chromatic aberration corrector improves the precision more for heavy than for light atom columns and for lower incident electron energies as follows from Figures 22, 23, and 28 to 31. The chromatic aberration corrector increases the bandwidth of the damping envelope function Dt(g) like the monochromator does, but this is not accompanied by a reduction of the total number of detected electrons.

The examples given above illustrate that the rule of thumb derived in Section III, for the attainable precision with which the position of one component can be measured from a bright-field imaging experiment such as CTEM, may be used to give an intuitive interpretation to the numerical results obtained in Section IV.C.2. This provides a check of these numerical results. However, this rule of thumb cannot replace the exact expressions for the attainable precision which have been used in Section IV.C.2.

D. Conclusions

It has been shown that when it comes to the evaluation and optimization of quantitative CTEM experiments aiming at the highest precision, criteria such as point resolution and information limit may give rise to deceptive guidelines, since they do not take the object and total number of detected electrons into account. As an alternative to these criteria, the obvious optimality criterion is the attainable statistical precision, that is, the CRLB, with which position coordinates of atom columns can be estimated. This criterion depends on the microscope settings, the object, and the total number of detected electrons, rather than on the microscope settings alone. An expression for the attainable statistical precision has been derived from a parametric statistical model of the observations. The expectations of the observations have been described by means of the channelling theory and the quasi-coherent approximation, whereas the fluctuations of the


observations have been described by means of Poisson statistics. The obtained expression has been used to evaluate and optimize the design of quantitative CTEM experiments. This analysis has been made for microscopes operating at an intermediate incident electron energy of 300 keV and for those operating at a low incident electron energy of 50 keV. The relevant physical constraints have been taken into consideration. These constraints are the radiation sensitivity of the object or the specimen drift. Therefore, the incident electron dose per square Å or the recording time has been kept within the constraints. From the analysis, the following general guidelines have been derived:

• The optimal defocus value is approximately given by Eq. (118) for negative Cs-values and by the Scherzer defocus, given by Eq. (119), for positive Cs-values.
• A spherical and chromatic aberration corrector may improve the attainable precision. The precision that is gained depends on the object under study. Correction makes more sense for low than for intermediate incident electron energies and for objects consisting of heavy instead of light atom columns. It should be mentioned that the optimal spherical aberration constant is different from 0 mm for thin objects.
• The attainable precision improves more with a chromatic aberration corrector than with a monochromator.
• The highest attainable precision, corresponding to the optimal experimental design, is obtained for a microscope with both a spherical and chromatic aberration corrector.
• Increasing the reduced brightness of the electron source may improve the attainable precision substantially if the experiment is limited by specimen drift.
• Improving the mechanical stability of specimen holders, which would provide longer recording times, improves the attainable precision, especially if the experiment is limited by specimen drift.

Additionally, the following guidelines have been derived for microscopes operating at intermediate incident electron energies:

• A monochromator usually does not improve the attainable precision if the experiment is limited by specimen drift, except for heavy atom columns, whereas it slightly improves the precision if the experiment is limited by the radiation sensitivity of the object.
• The precision that is possibly gained using a spherical aberration corrector, a chromatic aberration corrector, a monochromator or any combination might be disillusioning in the sense that this gain is only marginal and might not be needed to obtain a required precision.


Furthermore, the following guidelines have been derived for microscopes operating at low incident electron energies:

• A monochromator improves the attainable precision.
• The attainable precision improves more with either a chromatic aberration corrector or a monochromator than with a spherical aberration corrector.
• The precision that is gained using a spherical aberration corrector, a chromatic aberration corrector, a monochromator or any combination might be substantial.

V. Optimal Statistical Experimental Design of Scanning Transmission Electron Microscopy

A. Introduction

In this section, optimal statistical experimental designs of STEM experiments will be described. They will be computed in a similar way as those of CTEM experiments in Section IV. Hence, the STEM designs will be evaluated and optimized in terms of the attainable precision, that is, the CRLB, with which atom column positions of the object under study can be measured. The choice of this optimality criterion reflects the purpose of future atomic resolution TEM experiments. As mentioned in Section I, this purpose is quantitative structure determination, which means that the structure parameters of the object under study, the atom column positions in particular, are quantitatively estimated from the observations. Ultimately, this should be done as precisely as possible.

First, it will be described how STEM observations are collected. A scheme is shown in Figure 36. An electron probe is formed by demagnifying a small electron source with a set of condenser and objective lenses. The resulting probe scans in a raster over the object. At each probe position, a part of the object under study is illuminated. As a result of the electron-object interaction, the so-called exit wave, which is a complex electron wave function at the exit plane of the object, is formed. This wave propagates to a detector, which is placed in the back focal plane beyond the object. In this plane, a so-called convergent-beam electron diffraction pattern is formed. The part of this pattern that reaches the detector is integrated and displayed as a function of the probe position. In STEM, one distinguishes different imaging modes that are related to the shape or size of the detector such as axial bright-field coherent STEM and annular dark-field incoherent STEM.


Figure 36. Scheme of a STEM experiment. Usually, an annular (black colored) or axial (grey colored) detector is chosen. The angle $\alpha_D$ represents the inner collection semi-angle of an annular detector or the outer collection semi-angle of an axial detector.

In the former mode, an axial detector with a small outer collection semi-angle $\alpha_D$ is used, whereas in the latter mode, an annular detector with a large inner collection semi-angle $\alpha_D$ is used. The angle $\alpha_D$ is shown in Figure 36. It corresponds to a detector radius $g_{det}$ equal to $\alpha_D/\lambda$, where $\lambda$ is the electron wavelength. For more details on STEM, the reader is referred to Batson, Dellby, and Krivanek (2002), Cowley (1997), Crewe (1997), Nellist and Pennycook (2000), and Pennycook and Yan (2001).

For many years, it has been standard practice to evaluate the performance of STEM experiments qualitatively, that is, in terms of direct visual interpretability. The performance criteria used are two-point resolution and contrast. For example, when axial bright-field coherent STEM is compared to annular dark-field incoherent STEM, the latter imaging mode is preferred. The basic ideas underlying this preference are the improvement of two-point resolution for incoherent imaging compared to coherent imaging (Pennycook, 1997) and the higher contrast in dark-field images than in bright-field images (Cowley, 1997). In annular dark-field incoherent STEM, visual interpretation of the images is optimal if the Scherzer conditions (Scherzer, 1949) for incoherent imaging are adopted (Pennycook and Jesson, 1991). As demonstrated by Nellist and Pennycook (1998), the resolution may be further improved if the main lobe of the probe is narrowed. However, visual interpretability is then reduced as a result of a considerable rise of the sidelobes of the probe.


Two important aspects are absent in these widely used performance criteria. First, the electron-object interaction is not taken into account. Second, the dose efficiency, which is defined as the ratio of the number of detected electrons to the number of incident electrons, is left out of consideration. Improvement of resolution and contrast is often obtained at the expense of dose efficiency, which leads to a decrease in the SNR. For example, the incoherence in annular dark-field incoherent STEM is attained by using an annular dark-field detector with a geometry much larger than the objective aperture, that is, an annular detector with an inner collection semi-angle much larger than the objective aperture semi-angle (Nellist and Pennycook, 2000). Its corresponding improvement of two-point resolution, by adopting the Scherzer conditions for incoherent imaging, is thus obtained at the expense of dose efficiency. Another example is the following. It is well known that in bright-field images, decreasing the outer collection semi-angle of an axial detector leads to higher contrast, but also to a deterioration of the SNR, which deteriorates the quality of an image. To compensate for such a decrease in SNR, longer recording times are necessary, which in turn increase the disturbing influence of specimen drift. The observation that the quality of an image is determined by both the resolution and the SNR has led to several modified criteria (Sato, 1997; Sato and Orloff, 1992).

The ultimate goal of STEM is not qualitative structure determination, but quantitative structure determination instead. Ultimately, structure parameters of the object under study, such as the atom column positions, have to be measured as precisely as possible. However, this precision will always be limited by the presence of noise.
Given the parametric statistical model of the observations, an expression may be obtained for the highest attainable precision with which the atom column positions can be measured. This expression, which is called the CRLB, is a function of structure parameters, microscope settings, and dose efficiency. Therefore, it may be used as an alternative performance measure in the evaluation and optimization of the design of a STEM experiment for a given object. The optimal statistical experimental design corresponds to the microscope settings resulting in the highest attainable precision. It will be obtained by using the principles of statistical experimental design explained in Section II.

The section is organized as follows. In Section V.B, a parametric statistical model of the observations will be derived. This model describes the expectations of the observations as well as the fluctuations of the observations about these expectations. Next, in Section V.C, an expression for the CRLB on the variance of the atom column position estimates is obtained from this model. Also, an adequate optimality criterion, which is a function of the elements of the CRLB, is given. This criterion is then used to evaluate and optimize the experimental design. Special attention will be paid to the optimal reduced brightness of the electron source, the optimal defocus value, the optimal spherical aberration constant, the optimal detector radius, and the optimal source width. Furthermore, it will be investigated if an annular detector is preferable to an axial one. In Section V.D, conclusions are drawn. Some of the results of this section have been published earlier in den Dekker, van Aert, van Dyck, and van den Bos (2000), van Aert and van Dyck (2001), van Aert, den Dekker, van Dyck, and van den Bos (2000a, 2002b), and van Aert, van Dyck, den Dekker, and van den Bos (2000).

B. Parametric Statistical Model of Observations

A parametric statistical model of the observations is needed in order to obtain an expression for the CRLB, which will be used for the optimization of the experimental design. In this section, such a model will be derived. It describes the expectations of the observations as well as the fluctuations of the observations about these expectations. This model contains microscope settings such as defocus, spherical aberration constant, and detector angle, as well as structure parameters such as atom column positions and the object thickness.

In the derivation of this model, three basic approximations will be made. First, use will be made of the simplified channelling theory to describe the dynamical, elastic scattering of the electrons on their way through the object (Broeckx, Op de Beeck and van Dyck, 1995; Geuens and van Dyck, 2002; Pennycook and Jesson, 1992; van Aert, den Dekker, van Dyck and van den Bos, 2002b; van Dyck and Op de Beeck, 1996). Second, temporal incoherence due to chromatic aberration, which results from a spread in defocus values, will not be taken into account. This approximation is justified by the fact that researchers suspect that STEM imaging is robust to chromatic aberration (Batson, Dellby and Krivanek, 2002; Krivanek, Dellby and Nellist, 2002; Nellist and Pennycook, 1998, 2000). Third, thermal diffuse, inelastic scattering will not be taken into account. Thermal diffuse scattered electrons are predominantly collected in the detector at high angles (Treacy, 1982). Therefore, increasing the inner collection semi-angle $\alpha_D$ (see Figure 36) of an annular detector has the effect of increasing thermal diffuse, inelastic scattering relative to elastic scattering (Wang, 2001). The main advantage of this is the strong dependence of the detected signal on the atomic number Z, hence the name, Z-contrast imaging.
The disadvantage, however, is the accompanying decrease of dose efficiency, which leads to a decrease in SNR. In Section V.C, it will be shown that, as a result of this decrease in SNR, the optimal inner collection angle in terms of precision is small compared to the angles where thermal diffuse scattering is important. This justifies the fact that thermal diffuse scattering will not be taken into account. Although the approximations made are of a limited validity, they are useful for a compact analytical model-based optimization of the design of quantitative STEM experiments as well as for explaining the basic principles governing the obtained results. The principal results are independent of the approximations made.

1. The Exit Wave

The first step toward the parametric statistical model of the observations is to obtain an expression for the exit wave $\psi(\mathbf{r}, z)$. It is a complex electron wave function in the plane at the exit face of the object, resulting from the interaction of the electron probe with the object. As for CTEM, use will be made of the simplified channelling theory. At this stage, both structure parameters and microscope settings, describing the object and probe, respectively, will enter the model. According to the simplified channelling theory, applicable if the probe propagates along a major zone axis, an expression may be derived for the exit wave of an object consisting of $n_c$ atom columns (Broeckx, Op de Beeck, and van Dyck, 1995; Geuens and van Dyck, 2002; Pennycook and Jesson, 1992; van Aert, den Dekker, van Dyck, and van den Bos, 2002b; van Dyck and Op de Beeck, 1996). This derivation is equivalent to that of the exit wave for CTEM, given by Eq. (84), except that the parallel incident electron beam used in CTEM is now replaced by the electron probe. The expression for the exit wave for STEM is given by:

$$\psi(\mathbf{r}, z) = p(\mathbf{r} - \mathbf{r}_{kl}) + \sum_{n=1}^{n_c} c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n)\, \phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right], \tag{121}$$

where $\mathbf{r} = (x\ y)^T$ is a two-dimensional vector in the plane at the exit face of the object, perpendicular to the propagation direction of the electron probe, $z$ is the object thickness, $E_0$ is the incident electron energy, and $\lambda$ is the electron wavelength. The incident electron energy and the electron wavelength are related according to Eq. (85). Furthermore, the function $p(\mathbf{r} - \mathbf{r}_{kl})$ describes the probe located at the position $\mathbf{r}_{kl} = (x_k\ y_l)^T$. The function $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ is the lowest energy bound state of the $n$th atom column located at position $\boldsymbol{\beta}_n = (\beta_{xn}\ \beta_{yn})^T$ and $E_{1s,n}$ is its energy. The energy of this state is a parameter related to the projected 'weight' of the atom column, which is a function of the atomic numbers of the atoms along a column, the distance between successive atoms, and the Debye-Waller factor (van Dyck and Chen, 1999a). The lowest energy bound state $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ is a real-valued, centrally peaked, radially symmetric function, which is a two-dimensional analogue of the 1s-state of an atom. In Eq. (121), it is assumed that the dynamical motion of an electron in a column may be primarily expressed in terms of this tightly bound 1s-state. As in Section IV.B.1, where an expression for the exit wave is described for CTEM, it will be assumed that the 1s-state function may be approximated by a single, quadratically normalized, parameterized Gaussian function given by Eq. (88) (Geuens and van Dyck, 2002). The excitation coefficients $c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n)$ of Eq. (121) are found from:

$$c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n) = \int \phi^{*}_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)\, p(\mathbf{r} - \mathbf{r}_{kl})\, d\mathbf{r}, \tag{122}$$

where the symbol $*$ denotes the complex conjugate. Since the 1s-state is a real-valued function and since the probe function is assumed to have radial symmetry so that $p(\mathbf{r}) = p(-\mathbf{r})$, Eq. (122) may be written as a convolution product:

$$c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n) = p(\mathbf{r}_{kl} - \boldsymbol{\beta}_n) \ast \phi_{1s,n}(\mathbf{r}_{kl} - \boldsymbol{\beta}_n). \tag{123}$$

The convolution theorem (Papoulis, 1968) allows one to rewrite this equation as:

$$c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}_{kl} - \boldsymbol{\beta}_n}\, P(\mathbf{g}) F_{1s,n}(\mathbf{g}) = \int P(\mathbf{g}) F_{1s,n}(\mathbf{g}) \exp\left( -i 2\pi \mathbf{g} \cdot (\mathbf{r}_{kl} - \boldsymbol{\beta}_n) \right) d\mathbf{g}, \tag{124}$$

where $P(\mathbf{g})$ is the two-dimensional Fourier transform of the probe function $p(\mathbf{r})$, $F_{1s,n}(\mathbf{g})$ is the Fourier transform of the 1s-state $\phi_{1s,n}(\mathbf{r})$ given by Eq. (90), $\mathbf{g}$ is a two-dimensional spatial frequency vector, and the symbol '$\cdot$' denotes the scalar product. The Fourier transform and the inverse Fourier transform are defined by Eqs. (91) and (92), respectively. For radially symmetric 1s-state and probe functions, Eq. (124) may be written as:

$$c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n) = c_n(|\mathbf{r}_{kl} - \boldsymbol{\beta}_n|) = 2\pi \int_0^{\infty} P(g) F_{1s,n}(g) J_0(2\pi g |\mathbf{r}_{kl} - \boldsymbol{\beta}_n|)\, g\, dg. \tag{125}$$

This is an elementary result of the theory of Bessel functions, where $J_0(\cdot)$ is the zeroth-order Bessel function of the first kind and

$$|\mathbf{r}_{kl} - \boldsymbol{\beta}_n| = \sqrt{(x_k - \beta_{xn})^2 + (y_l - \beta_{yn})^2} \tag{126}$$

represents the distance from the probe to the $n$th atom column.
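As an illustration of Eq. (125), the radial integral can be evaluated by simple numerical quadrature. The sketch below is not the authors' implementation. It assumes the transfer function of Eqs. (128)–(130) and, since Eqs. (88) and (90) are not reproduced here, a hypothetical normalized Gaussian 1s-state $\phi_{1s}(r) = \exp(-r^2/4a^2)/(a\sqrt{2\pi})$ with transform $F_{1s}(g) = 2a\sqrt{2\pi}\exp(-4\pi^2a^2g^2)$. The function names, the width `a`, and the quadrature settings are illustrative choices.

```python
import cmath
import math

def bessel_j0(x, m=64):
    """J0(x) via (1/pi) * integral_0^pi cos(x sin t) dt, midpoint rule."""
    h = math.pi / m
    return sum(math.cos(x * math.sin((j + 0.5) * h)) for j in range(m)) / m

def transfer(g, g_ap, defocus, cs, wavelength):
    """P(g) = A(g) exp(-i chi(g)) as in Eqs. (128)-(130); lengths in angstrom."""
    if g > g_ap:
        return 0j
    chi = (math.pi * defocus * wavelength * g**2
           + 0.5 * math.pi * cs * wavelength**3 * g**4)
    return cmath.exp(-1j * chi)

def f1s(g, a):
    """ASSUMED Fourier transform of a normalized Gaussian 1s-state of width a."""
    return 2.0 * a * math.sqrt(2.0 * math.pi) * math.exp(-(2.0 * math.pi * a * g) ** 2)

def phi1s(r, a):
    """ASSUMED real-space Gaussian 1s-state, integral of phi^2 equal to 1."""
    return math.exp(-r**2 / (4.0 * a**2)) / (a * math.sqrt(2.0 * math.pi))

def excitation(d, a, g_ap, defocus, cs, wavelength, n=2000):
    """c_n(d) = 2 pi * integral_0^g_ap P(g) F1s(g) J0(2 pi g d) g dg (midpoint rule)."""
    h = g_ap / n
    total = 0j
    for j in range(n):
        g = (j + 0.5) * h
        total += (transfer(g, g_ap, defocus, cs, wavelength) * f1s(g, a)
                  * bessel_j0(2.0 * math.pi * g * d) * g)
    return 2.0 * math.pi * h * total

# Sanity check: without aberrations and with a very wide aperture the probe is
# close to a delta function, so c_n(d) should approach phi_1s(d).
a = 0.3  # angstrom, hypothetical column width
for d in (0.0, 0.5):
    c = excitation(d, a, g_ap=3.0, defocus=0.0, cs=0.0, wavelength=0.02)
    print(d, c.real, phi1s(d, a))
```

The upper integration limit is the aperture radius rather than infinity because $A(g)$ vanishes beyond $g_{ap}$. For realistic settings the result is complex, since the aberration phase enters through $P(g)$.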


The illuminating STEM probe $p(\mathbf{r})$ is the inverse Fourier transform of the coherent transfer function of the objective lens $P(\mathbf{g})$:

$$p(\mathbf{r}) = \mathcal{F}^{-1}_{\mathbf{g} \to \mathbf{r}}\, P(\mathbf{g}). \tag{127}$$

The transfer function $P(\mathbf{g})$ is radially symmetric and given by:

$$P(\mathbf{g}) = P(g) = A(g) \exp(-i\chi(g)), \tag{128}$$

where $g = |\mathbf{g}|$ is the Euclidean norm of the two-dimensional spatial frequency vector. The circular aperture function $A(g)$ is defined in the same way as in Eq. (95):

$$A(g) = \begin{cases} 1 & \text{if } g \le g_{ap} \\ 0 & \text{if } g > g_{ap} \end{cases} \tag{129}$$

with $g_{ap}$ the objective aperture radius. Notice that the objective aperture semi-angle $\alpha_0$ is equal to $g_{ap}\lambda$. The phase shift $\chi(g)$, resulting from the objective lens aberrations, is radially symmetric and is defined in the same way as in Eq. (96) (van Dyck, 2002):

$$\chi(g) = \pi \varepsilon \lambda g^2 + \frac{1}{2} \pi C_s \lambda^3 g^4 \tag{130}$$

with $\varepsilon$ the defocus, $\lambda$ the electron wavelength, and $C_s$ the spherical aberration constant. Other aberration effects such as 2-fold astigmatism, 3-fold astigmatism, and axial coma could also be included in this phase shift (Thust, Overwijk, Coene, and Lentzen, 1996). From the comparison of Eq. (128) with Eq. (94), where the microscope's transfer function for CTEM is described, it follows that, apart from the damping envelope functions describing partial spatial and temporal coherence in CTEM, both equations are equal to one another. In the present work, temporal incoherence will not be taken into account since STEM imaging is suspected to be robust to chromatic aberration (Batson, Dellby and Krivanek, 2002; Krivanek, Dellby and Nellist, 2002; Nellist and Pennycook, 1998, 2000). Furthermore, spatial incoherence, resulting from a finite source image, will be incorporated in the model in the next section.

2. The Image Intensity Distribution

From the expression for the exit wave, which has been obtained in the previous section, the image intensity distribution may be computed. The exit wave, as given by Eq. (121), describes the interaction of the electron probe, which is located at a given position, and the object. The steps needed in proceeding from the exit wave to the image intensity distribution are the following ones. First, the propagation from this exit wave to the detector, which is placed in the back focal plane beyond the object, is described as the Fourier transform of the exit wave. Next, the intensity pattern in the detector plane is given by the modulus square of the thus obtained wave. This is the so-called convergent-beam electron diffraction pattern. Then, the part of this pattern that reaches the detector is integrated and displayed as a function of the probe position. In this way, an expression for the image intensity distribution may be obtained. At this stage, the microscope parameters describing the detector will enter the model. From the procedure described above, it follows that the total detected intensity in the Fourier detector plane of a STEM is given by (Cowley, 1976):

$$I_{ps}(\mathbf{r}_{kl}) = \int |\Psi(\mathbf{g}, z)|^2 D(\mathbf{g})\, d\mathbf{g}, \tag{131}$$

where $\Psi(\mathbf{g}, z)$ is the two-dimensional Fourier transform of the exit wave $\psi(\mathbf{r}, z)$ and $|\Psi(\mathbf{g}, z)|^2$ describes the convergent-beam electron diffraction pattern. Furthermore, $D(\mathbf{g})$ is the detector function, which is equal to one in the detected field and equal to zero elsewhere. An expression for the two-dimensional Fourier transform of the exit wave may be obtained from combining Eqs. (91) and (121):

$$\Psi(\mathbf{g}, z) = P(\mathbf{g}) \exp(2\pi i\, \mathbf{g} \cdot \mathbf{r}_{kl}) + \sum_{n=1}^{n_c} c_n(\mathbf{r}_{kl} - \boldsymbol{\beta}_n)\, F_{1s,n}(\mathbf{g}) \exp(2\pi i\, \mathbf{g} \cdot \boldsymbol{\beta}_n) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right]. \tag{132}$$

Notice that it can be seen from Eqs. (131) and (132) that for identical atom columns, the contrast varies periodically with thickness. This periodicity is the same as for CTEM, given by Eq. (107):

$$D_{1s} = \left| \frac{2 E_0 \lambda}{E_{1s,n}} \right|. \tag{133}$$

It is called the extinction distance. This periodic oscillation is due to dynamical effects, which have been included in the model via the channelling approximation. Generally, the extinction distance will be different for different types of atom columns.

Thus far, it has been tacitly assumed that the source image may be modelled as a point. Therefore, the subscript 'ps' in Eq. (131) refers to point source.
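Under this point-source assumption, the three steps leading to Eq. (131) (Fourier transform of the exit wave, modulus square, integration over the detector) can be sketched numerically. The example below is a one-dimensional toy and not the model of this section: it replaces the channelling exit wave of Eq. (121) by a hypothetical thin Gaussian phase object, uses a small discrete Fourier transform, and all names and numerical values are illustrative assumptions.

```python
import cmath
import math

def freq_indices(n):
    """Frequency indices m = -n/2 .. n/2-1 of an n-point grid (n even)."""
    return range(-n // 2, n // 2)

def forward(wave):
    """Unitary DFT, real space (index j) -> reciprocal space (index m)."""
    n = len(wave)
    return [sum(w * cmath.exp(2j * math.pi * m * j / n) for j, w in enumerate(wave))
            / math.sqrt(n) for m in freq_indices(n)]

def inverse(spectrum):
    """Unitary inverse DFT, reciprocal space (index m) -> real space (index j)."""
    n = len(spectrum)
    return [sum(s * cmath.exp(-2j * math.pi * m * j / n)
                for m, s in zip(freq_indices(n), spectrum))
            / math.sqrt(n) for j in range(n)]

def probe(n, dx, g_ap, defocus, cs, wavelength, x0):
    """Unit-intensity 1-D probe centred at x0, built from A(g) exp(-i chi(g))."""
    spectrum = []
    for m in freq_indices(n):
        g = m / (n * dx)
        if abs(g) <= g_ap:
            chi = (math.pi * defocus * wavelength * g**2
                   + 0.5 * math.pi * cs * wavelength**3 * g**4)
            spectrum.append(cmath.exp(-1j * chi + 2j * math.pi * g * x0))
        else:
            spectrum.append(0j)
    norm = math.sqrt(sum(abs(s) ** 2 for s in spectrum))
    return inverse([s / norm for s in spectrum])

def detected_fraction(exit_wave, dx, g_in, g_out):
    """Sum |Psi(g)|^2 over detector frequencies with g_in <= |g| < g_out."""
    n = len(exit_wave)
    pattern = [abs(amp) ** 2 for amp in forward(exit_wave)]
    return sum(p for m, p in zip(freq_indices(n), pattern)
               if g_in <= abs(m / (n * dx)) < g_out)

n, dx, x0 = 64, 0.2, 6.4  # grid size, sampling (angstrom), probe position
p = probe(n, dx, g_ap=0.56, defocus=-300.0, cs=5.0e6, wavelength=0.02, x0=x0)
# Toy object (NOT the channelling model): thin Gaussian phase object under the probe.
phase = [1.0 * math.exp(-((j * dx - x0) ** 2) / (2.0 * 0.3**2)) for j in range(n)]
exit_wave = [pj * cmath.exp(1j * ph) for pj, ph in zip(p, phase)]

# Detector boundary at 0.6 1/angstrom, just outside the aperture radius.
vacuum_axial = detected_fraction(p, dx, 0.0, 0.6)
vacuum_annular = detected_fraction(p, dx, 0.6, math.inf)
object_axial = detected_fraction(exit_wave, dx, 0.0, 0.6)
object_annular = detected_fraction(exit_wave, dx, 0.6, math.inf)
print(vacuum_axial, vacuum_annular, object_axial, object_annular)
```

Without an object, all intensity stays inside the aperture disc, so the annular detector records nothing. With the toy object, part of the intensity is scattered outside the disc, and the two complementary detectors together still collect the full probe intensity because a pure phase object does not absorb.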
Elaborating on the ideas given in Mory, Tence, and Colliex (1985), it follows that the finite size of the source image may be taken into account by a two-dimensional convolution of the intensity distribution $I_{ps}(\mathbf{r}_{kl})$ with the intensity distribution of the source image $S(\mathbf{r})$:

$$I(\mathbf{r}_{kl}) = I_{ps}(\mathbf{r}_{kl}) \ast S(\mathbf{r}_{kl}). \tag{134}$$

The effect of the source image is thus an additional blurring. A realistic form for the intensity distribution of the source image is Gaussian (Mory, Tence, and Colliex, 1985). The function $S(\mathbf{r})$ is thus a two-dimensional normalized Gaussian distribution given by:

$$S(\mathbf{r}) = S(r) = \frac{1}{2\pi\sigma^2} \exp\left( -\frac{r^2}{2\sigma^2} \right), \tag{135}$$

with $\sigma$ the standard deviation, representing the width corresponding to the radius containing 39% of the total intensity of $S(\mathbf{r})$. Up to now, no assumptions have been made about the shape or size of the detector. From now on, however, the detector is assumed to be radially symmetric. Mathematically, this means that $D(\mathbf{g}) = D(g)$. Insight into the expression given by the right-hand member of Eq. (134) is obtained if it is split up into three terms:

$$I(\mathbf{r}_{kl}) = I_0 + I_1(\mathbf{r}_{kl}) + I_2(\mathbf{r}_{kl}). \tag{136}$$

The zeroth order term $I_0$ corresponds to a non-interacting probe, the first order term $I_1(\mathbf{r}_{kl})$ to the interference between the probe and the 1s-state, and the second order term $I_2(\mathbf{r}_{kl})$ to the interference of different 1s-states. The zeroth order term $I_0$ is given by:

$$I_0 = \left[ \int \left| P(\mathbf{g}) \exp(2\pi i\, \mathbf{g} \cdot \mathbf{r}_{kl}) \right|^2 D(\mathbf{g})\, d\mathbf{g} \right] \ast S(\mathbf{r}_{kl}). \tag{137}$$

It describes a constant background intensity, resulting from the non-interacting electrons collected by the detector. This equation may be rewritten by substitution of Eq. (128) and using the fact that $D(\mathbf{g})$ is radially symmetric. This results in:

$$I_0 = 2\pi \int A^2(g) D(g)\, g\, dg. \tag{138}$$

Due to the definition of the aperture function, given by Eq. (129), the following equality may be used:

$$A^2(g) = A(g). \tag{139}$$

Therefore, Eq. (138) becomes:

$$I_0 = 2\pi \int A(g) D(g)\, g\, dg. \tag{140}$$
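For a detector that accepts a ring of spatial frequencies, Eq. (140) reduces to the area of overlap between that ring and the aperture disc. The short sketch below evaluates this overlap; the function name, the ring parametrization (`g_in`, `g_out`) and the numerical values are illustrative assumptions rather than part of the original derivation.

```python
import math

def overlap_integral(g_ap, g_in, g_out):
    """2*pi * integral of A(g) D(g) g dg for a detector with g_in <= g <= g_out."""
    lower = min(g_in, g_ap)
    upper = min(g_out, g_ap)
    return math.pi * max(upper**2 - lower**2, 0.0)

g_ap = 0.56                                   # 1/angstrom, illustrative value
full = overlap_integral(g_ap, 0.0, math.inf)  # uniform detector, D(g) = 1
axial = overlap_integral(g_ap, 0.0, 0.5 * g_ap)
annular = overlap_integral(g_ap, 2.0 * g_ap, math.inf)
print(axial / full, annular / full)
```

An axial detector with an outer radius of half the aperture radius collects a quarter of the non-interacting intensity, whereas an annular detector whose inner radius is at or beyond the aperture radius collects none of it.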


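The first order term $I_1(\mathbf{r}_{kl})$ corresponds to the interference of the incident probe $p(\mathbf{r} - \mathbf{r}_{kl})$ and the 1s-state $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$:

$$I_1(\mathbf{r}_{kl}) = 2\,\mathrm{Re}\left[ \sum_{n=1}^{n_c} c_n(|\mathbf{r}_{kl} - \boldsymbol{\beta}_n|) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] 2\pi \int P^{*}(g) F_{1s,n}(g) J_0(2\pi g |\mathbf{r}_{kl} - \boldsymbol{\beta}_n|) D(g)\, g\, dg \right] \ast S(\mathbf{r}_{kl}). \tag{141}$$

This is a linear term in the sense that contributions of different atom columns are added. The second order term $I_2(\mathbf{r}_{kl})$ describes the interference of different 1s-states $\phi_{1s,n}(\mathbf{r} - \boldsymbol{\beta}_n)$ and $\phi_{1s,m}(\mathbf{r} - \boldsymbol{\beta}_m)$:

$$I_2(\mathbf{r}_{kl}) = \left[ \sum_{n=1}^{n_c} \sum_{m=1}^{n_c} c_n(|\mathbf{r}_{kl} - \boldsymbol{\beta}_n|)\, c^{*}_m(|\mathbf{r}_{kl} - \boldsymbol{\beta}_m|) \left[ \exp\left( -i\pi \frac{E_{1s,n}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] \left[ \exp\left( +i\pi \frac{E_{1s,m}}{E_0} \frac{1}{\lambda} z \right) - 1 \right] 2\pi \int F_{1s,n}(g) F_{1s,m}(g) J_0(2\pi g \delta_{n,m}) D(g)\, g\, dg \right] \ast S(\mathbf{r}_{kl}), \tag{142}$$

where

$$\delta_{n,m} = |\boldsymbol{\beta}_n - \boldsymbol{\beta}_m| = \sqrt{(\beta_{xn} - \beta_{xm})^2 + (\beta_{yn} - \beta_{ym})^2} \tag{143}$$

is the distance between the atom columns at positions $\boldsymbol{\beta}_n$ and $\boldsymbol{\beta}_m$. It is only the last term $I_2(\mathbf{r}_{kl})$ of Eq. (136) that remains for annular dark-field STEM using an annular detector with an inner collection radius $g_{det} = \alpha_D/\lambda$ greater than or equal to the objective aperture radius $g_{ap} = \alpha_0/\lambda$. The terms $I_0$ and $I_1(\mathbf{r}_{kl})$ of Eq. (136) given by Eqs. (140) and (141), respectively, are equal to zero since:

$$P(g) D(g) = 0, \tag{144}$$

or, equivalently,

$$A(g) D(g) = 0. \tag{145}$$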

Therefore, Eq. (142) describes the image intensity distribution for annular dark-field STEM. It can be shown that this result agrees with the result as derived in (Pennycook, Rafferty, and Nellist, 2000).

3. The Image Recording

Finally, the expectation model, describing the expected number of electrons recorded by the detector, will be derived. In a STEM, the illuminating electron probe scans in a raster over the object. The image is thus recorded as a function of the probe position $\mathbf{r}_{kl} = (x_k\ y_l)^T$. Without loss of generality, the image magnification is ignored. Therefore, the probe position $\mathbf{r}_{kl} = (x_k\ y_l)^T$ directly corresponds to an image pixel at the same location. The recording device is characterized as consisting of $K \times L$ equidistant pixels of area $\Delta x \times \Delta y$, where $\Delta x$ and $\Delta y$ are the probe sampling distances in the $x$ and $y$ directions, respectively. Pixel $(k, l)$ corresponds to position $(x_k\ y_l)^T \equiv (x_1 + (k-1)\Delta x\ \ y_1 + (l-1)\Delta y)^T$ with $k = 1, \ldots, K$ and $l = 1, \ldots, L$, and $(x_1\ y_1)^T$ represents the position of the pixel in the bottom left corner of the field of view (FOV). The FOV is chosen centered about $(0\ 0)^T$. Assuming a recording time $\tau$ for one pixel and a probe current $I$, the number of electrons per probe position is given by:

$$\frac{I\tau}{e} \tag{146}$$

with $e = 1.6 \times 10^{-19}$ C the electron charge. The recording time $\tau$ for one pixel is the ratio of the recording time $t$ for the whole FOV to the total number of pixels $KL$:

$$\tau = \frac{t}{KL}. \tag{147}$$

The total number of incident electrons $N_i$ is equal to:

$$N_i = KL \frac{I\tau}{e}. \tag{148}$$
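The bookkeeping of Eqs. (146)–(148) can be illustrated with a few lines of code. The sketch below assumes that centering the FOV about the origin means that the pixel centres are placed symmetrically about zero, which the text does not state explicitly. The probe current and frame time in the example are hypothetical round numbers, not values from this section.

```python
E_CHARGE = 1.6e-19  # electron charge in coulomb, as quoted in the text

def scan_positions(k_pixels, l_pixels, dx, dy):
    """Probe positions x_k, y_l on an equidistant grid centred about (0, 0)."""
    x1 = -0.5 * (k_pixels - 1) * dx  # ASSUMPTION: pixel centres symmetric about 0
    y1 = -0.5 * (l_pixels - 1) * dy
    xs = [x1 + k * dx for k in range(k_pixels)]
    ys = [y1 + l * dy for l in range(l_pixels)]
    return xs, ys

def incident_electrons(current, frame_time, k_pixels, l_pixels):
    """Return (tau, electrons per probe position, N_i) following Eqs. (146)-(148)."""
    tau = frame_time / (k_pixels * l_pixels)        # Eq. (147)
    per_position = current * tau / E_CHARGE         # Eq. (146)
    return tau, per_position, k_pixels * l_pixels * per_position  # Eq. (148)

xs, ys = scan_positions(100, 100, 0.2, 0.2)         # pixel size in angstrom
tau, per_position, n_incident = incident_electrons(16e-12, 1.0, 100, 100)
print(xs[0], xs[-1], tau, per_position, n_incident)
```

With these hypothetical inputs (16 pA, 1 s per frame, 100 × 100 pixels), the dwell time is $10^{-4}$ s and each probe position receives about $10^4$ electrons.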

The probe current $I$ is given by (Barth and Kruit, 1996):

$$I = \frac{B_r E_0 \pi^2 d_{I50}^2 \alpha_0^2}{4e} \tag{149}$$

with $B_r$ the reduced brightness of the electron source, $E_0$ the incident electron energy, $d_{I50}$ the diameter of the source image containing 50% of the current and $\alpha_0$ the objective aperture semi-angle. From Eq. (135), it follows that

$$d_{I50} = 2\sqrt{-2 \ln 0.5}\, \sigma. \tag{150}$$

As a consequence of the detector shape and size in STEM, only the electrons within a selected part of the convergent-beam electron diffraction pattern are used to produce the image. Mathematically, this is expressed in Eq. (131). The selected part is determined by the detector function $D(\mathbf{g})$. Suppose that $f_{kl}$ represents the fraction of electrons collected by the detector. Then, the expected number of electrons $\lambda_{kl}$ at the pixel $(k, l)$ equals (Reimer, 1993):

$$\lambda_{kl} = f_{kl} \frac{I\tau}{e}. \tag{151}$$

The fraction $f_{kl}$, which is smaller than 1, may be expressed as:

$$f_{kl} = \frac{I(\mathbf{r}_{kl})}{I_{D=1}} \tag{152}$$

with $I(\mathbf{r}_{kl})$ given by Eq. (134) and $I_{D=1}$ the constant intensity obtained if the detector function $D(\mathbf{g})$ is uniform. From straightforward calculations, using Eqs. (136)–(142), it follows that:

$$I_{D=1} = 2\pi \int A(g)\, g\, dg. \tag{153}$$

The total number of detected electrons $N$ to form the image is now equal to:

$$N = \sum_{k=1}^{K} \sum_{l=1}^{L} f_{kl} \frac{I\tau}{e}. \tag{154}$$
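Equations (149)–(151) can be checked numerically. The sketch below first verifies that Eq. (150) follows from the Gaussian source image of Eq. (135), for which the fraction of intensity inside a radius $r$ is $1 - \exp(-r^2/2\sigma^2)$ (about 39% at $r = \sigma$ and 50% at $r = d_{I50}/2$). It then evaluates the probe current with $E_0/e$ expressed as the accelerating voltage in volts. The source width of 0.5 Å, the aperture semi-angle, the detected fraction and the dwell time are hypothetical example values.

```python
import math

E_CHARGE = 1.6e-19  # electron charge in coulomb, as quoted in the text

def encircled_fraction(radius, sigma):
    """Fraction of the Gaussian source image S(r) of Eq. (135) inside a radius."""
    return 1.0 - math.exp(-radius**2 / (2.0 * sigma**2))

def d_i50(sigma):
    """Eq. (150): diameter of the source image containing 50% of the current."""
    return 2.0 * math.sqrt(-2.0 * math.log(0.5)) * sigma

def probe_current(b_r, voltage, sigma, alpha_0):
    """Eq. (149) in SI units, with E0/e written as the accelerating voltage."""
    return b_r * voltage * math.pi**2 * d_i50(sigma)**2 * alpha_0**2 / 4.0

def expected_counts(fraction, current, tau):
    """Eq. (151): expected number of detected electrons at one pixel."""
    return fraction * current * tau / E_CHARGE

sigma = 0.5e-10          # m, HYPOTHETICAL source width (0.5 angstrom)
alpha_0 = 0.56 * 0.02    # rad, g_ap * lambda for illustrative values
current = probe_current(2.0e7, 3.0e5, sigma, alpha_0)
print(encircled_fraction(sigma, sigma), encircled_fraction(0.5 * d_i50(sigma), sigma))
print(current, expected_counts(0.1, current, 1.0e-6))
```

For these inputs the probe current comes out at a few tens of picoamperes. The sketch is only meant to show how the quantities combine.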

Then, the dose efficiency DE, which is defined as the ratio of the number of detected electrons to the number of incident electrons, becomes:

$$\mathrm{DE} = \frac{N}{N_i} = \frac{\sum_{k=1}^{K} \sum_{l=1}^{L} f_{kl}}{KL}. \tag{155}$$

This follows directly from Eqs. (148) and (154). For STEM, the observations are electron counting results, which are supposed to be Poisson distributed and statistically independent. Therefore, the joint probability density function of the observations $P(\boldsymbol{\omega}; \boldsymbol{\beta})$, representing the parametric statistical model of the observations, is given by Eq. (10), where the total number of observations is equal to $K \times L$ and the expectation model is given by Eq. (151). The parameter vector $\boldsymbol{\beta} = (\beta_{x1} \ldots \beta_{xn_c}\ \beta_{y1} \ldots \beta_{yn_c})^T$ consists of the x- and y-coordinates of the atom column positions to be estimated. In the following section, the experimental design resulting in the highest attainable precision with which the elements of the vector $\boldsymbol{\beta}$ can be estimated will be derived from the joint probability density function of the observations.

C. Statistical Experimental Design

In this section, optimal statistical experimental designs of STEM experiments will be derived in the sense of the microscope settings resulting in the highest attainable precision with which the position coordinates of the atom columns can be estimated. The STEM observations are described by the parametric statistical model derived in Section V.B. This model will be used to obtain an expression for the attainable precision, which is represented by the CRLB associated with the position coordinates. In Section II, it has been explained how an expression for the CRLB may be derived. Next, a scalar measure of this CRLB, that is, a function of the matrix elements of the CRLB, will be chosen as optimality criterion. This criterion will then be evaluated and optimized as a function of the microscope settings. An overview of these microscope settings will be given in Section V.C.1. Some of them are tunable, while others are fixed properties of the electron microscope. Next, in Section V.C.2, the results of the numerical evaluation and optimization of the microscope settings will be presented for both isolated and neighboring atom columns. The section is concluded by simulation experiments to find out if the maximum likelihood estimator attains the CRLB and if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion. Finally, in Section V.C.3, an interpretation of the numerical optimization results will be given. The object thickness, the energy of the atom columns, and the microscope settings are supposed to be known. However, the following analysis may relatively easily be extended to include the case in which these or even more parameters are unknown and hence have to be estimated simultaneously.

1. Microscope Parameters

An overview of the microscope settings, which enter the parametric statistical model of the STEM observations, is given in this section. For simplicity, some of these settings will be kept constant in the evaluation and optimization of the experimental design.
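Before the individual settings are discussed, the link between a Poisson expectation model and the CRLB can be made concrete with a toy problem. For independent Poisson counts with expectations $\lambda_k(\beta)$, the Fisher information for a single parameter is $\sum_k (\partial\lambda_k/\partial\beta)^2/\lambda_k$, and the CRLB is its inverse. The sketch below applies this to a hypothetical one-dimensional Gaussian peak that stands in for the STEM expectation model of Eq. (151). The peak shape, its width and the number of counts are illustrative assumptions, and the derivative is taken by finite differences.

```python
import math

def gaussian_counts(beta, n_total, width, xs, dx):
    """Expected counts per pixel for a Gaussian peak at position beta (toy model)."""
    norm = n_total * dx / (width * math.sqrt(2.0 * math.pi))
    return [norm * math.exp(-(x - beta) ** 2 / (2.0 * width**2)) for x in xs]

def crlb_position(beta, n_total, width, xs, dx, step=1e-4):
    """CRLB on var(beta) for Poisson data: 1 / sum((d lambda / d beta)^2 / lambda)."""
    plus = gaussian_counts(beta + step, n_total, width, xs, dx)
    minus = gaussian_counts(beta - step, n_total, width, xs, dx)
    centre = gaussian_counts(beta, n_total, width, xs, dx)
    fisher = sum(((p - m) / (2.0 * step)) ** 2 / c
                 for p, m, c in zip(plus, minus, centre) if c > 0.0)
    return 1.0 / fisher

dx = 0.2                                        # pixel size in angstrom
xs = [-10.0 + dx * i for i in range(101)]       # pixel centres
width, n_total = 0.5, 1.0e4                     # HYPOTHETICAL peak width and counts
bound = crlb_position(0.0, n_total, width, xs, dx)
print(bound, width**2 / n_total, math.sqrt(bound))
```

For a finely sampled Gaussian peak without background, the bound approaches the familiar value width²/N, so the attainable standard deviation scales with the peak width and with the inverse square root of the number of detected electrons.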
The settings describing the electron probe are the defocus $\varepsilon$, the spherical aberration constant $C_s$, the objective aperture radius $g_{ap}$, the electron wavelength $\lambda$, the width of the source image $\sigma$, and the reduced brightness $B_r$ of the electron source. The electron wavelength and the reduced brightness of the electron source are fixed properties of a given electron microscope. In the evaluation of the experimental design, the electron wavelength will be kept constant. Furthermore, the effect of the reduced brightness on the precision with which atom column positions can be estimated will be studied. For most electron microscopes, the spherical aberration constant is a fixed property of the microscope as well. However, by incorporating a spherical aberration corrector, it becomes tunable. Therefore, it is interesting to study the effect of the spherical aberration constant on the precision.

The microscope settings specifying the detector configuration are related to the detector function $D(\mathbf{g})$. In principle, the detector may have any shape or size. However, in this article, the shape of the detector is confined to the more common ones, which are annular and axial detectors. The inner or outer collection radius $g_{det}$ or semi-angle $\alpha_D$, which are related as $g_{det} = \alpha_D/\lambda$, is tunable.

The microscope settings describing the image recording are the probe sampling distances or, equivalently, the pixel sizes $\Delta x$ and $\Delta y$, the number of pixels $K$ and $L$ in the x- and y-direction, respectively, and the recording time $t$. The pixel sizes $\Delta x$ and $\Delta y$ are kept constant. In agreement with the results presented in Section III, it can be shown that the precision will generally improve with smaller pixel sizes for a constant total number of incident electrons $N_i$, as defined by Eq. (148). However, below a certain pixel size, no more improvement is gained. This has to do with the fact that the pixel SNR decreases with a decreasing pixel size. Therefore, the pixel sizes are chosen in the region where no more improvement may be gained. This is similar to what is described in (Bettens, van Dyck, den Dekker, Sijbers, and van den Bos, 1999; den Dekker, Sijbers and van Dyck, 1999; van Aert, den Dekker, van Dyck, and van den Bos, 2002a). The number of pixels $K$ and $L$, defining the FOV for given pixel sizes $\Delta x$ and $\Delta y$, is chosen fixed, but large enough so as to guarantee that the tails of the electron probe are collected in the FOV.

2. Numerical Results

In this section, the experimental designs will be numerically evaluated and optimized in terms of the attainable precision with which atom column positions can be estimated. This section will be divided into four parts. First, general comments, which should be kept in mind during the reading of this section, will be given, including an overview of the original, non-optimized microscope settings and of the structure parameters. Second, the optimal experimental designs for isolated atom columns will be computed.
Third, the influence of neighboring atom columns on the optimal experimental design will be discussed. Finally, simulation experiments will be carried out to find out if the maximum likelihood estimator attains the CRLB and if it is unbiased. If so, this justifies the choice of the CRLB as optimality criterion. a. General Comments. In this section, general comments will be given, which should be kept in mind during further reading. An overview of the original microscope settings and the structure parameters of the objects under study will be given. i. Original and Optimal Microscope Settings. In what follows, the values for the original, non-optimized microscope settings are given in Table 15, unless otherwise mentioned. These are typical values used in today’s

TABLE 15
Original Microscope Settings

Microscope setting        Value
E_0 (keV)                 300
λ (Å)                     0.02
B_r (A m⁻² sr⁻¹ V⁻¹)      2 × 10⁷
C_s (mm)                  0.5
Δx (Å)                    0.2
Δy (Å)                    0.2
K                         100
L                         100
t (s)                     8 × 10⁻⁸

STEM experiments. Furthermore, in the conventional approach, which is based on direct visual interpretability, the Scherzer conditions for incoherent imaging are usually applied (Pennycook and Jesson, 1991; Scherzer, 1949). Under these conditions, the objective aperture radius and defocus are given by

\[ g_{ap} = \frac{1}{\lambda}\left(\frac{4\lambda}{C_s}\right)^{1/4}, \qquad \varepsilon = (C_s \lambda)^{1/2} \tag{156} \]

respectively. For the microscope settings as given in Table 15, this corresponds to g_ap = 0.56 Å⁻¹ and ε = 320 Å. Moreover, the outer collection radius of an axial detector or the inner collection radius of an annular detector is usually taken much smaller or larger than the objective aperture radius, respectively (Nellist and Pennycook, 2000). To the authors' knowledge, explicit expressions for these radii do not exist. One of the guidelines that has been found in the literature is, for example, that the inner collection radius g_det of an annular detector should be at least three times the objective aperture radius (Hartel, Rose, and Dinges, 1996). Therefore, if g_ap is equal to 0.56 Å⁻¹, this corresponds to a value of g_det being larger than 1.68 Å⁻¹. Other researchers propose a value of two times the objective aperture radius, which is representative for a typical Crewe detector (Pennycook, Jesson, Chisholm, Browning, McGibbon, and McGibbon, 1995). For g_ap being equal to 0.56 Å⁻¹, this corresponds to g_det being equal to 1.12 Å⁻¹. It should be noticed that for such large values of the detector radius, thermal diffuse, inelastic scattering may be more important than elastic scattering. Consequently, the expectation model proposed in Section V.B, which only takes elastic scattering into account, is no longer valid. For example, the oscillation of the detected intensity as a function of thickness with a periodicity as given by Eq. (133) is no longer observed using annular


detectors with a large inner detector radius. In Pennycook and Yan (2001), this oscillation as a function of thickness has been studied for a rhodium atom column, where the distance between successive atoms is equal to 2.7 Å. This has been done for a small, medium, and large detector radius corresponding to a value of 0.75 Å⁻¹, 1.50 Å⁻¹, and 2.25 Å⁻¹, respectively. From this study, it followed that the periodic oscillation as described by the model given in Section V.B applies for the small detector radius, whereas this oscillation is almost completely suppressed for the large detector radius. Therefore, in the present study, the evaluation of the inner detector radius of annular detectors will be restricted to small values. It will be shown that this constraint does not cause problems for the computation of the optimal detector radius in terms of attainable precision. The optimal value of the detector radius will be shown to be much smaller than the values usually taken.

In the remainder of this section, the values for the microscope settings which are usually preferred in STEM experiments, as described above, will be compared to their optimal values in terms of attainable precision. These optimal values are found by optimizing the attainable precision for all microscope settings simultaneously. This corresponds to an iterative, numerical optimization procedure in the space of microscope settings, of which the dimension is equal to the number of microscope settings. It has been found that some of these microscope settings are strongly correlated. This implies that the settings cannot be optimized one at a time. For example, it will be shown that the optimal detector radius strongly depends on the aperture radius. Furthermore, the optimal defocus value strongly depends on the spherical aberration constant. In what follows, the results following from this simultaneous optimization procedure will be described setting by setting.
The relation of each setting to other microscope settings will be mentioned if necessary. In what follows, the attainable precision will be computed as a function of the following microscope settings:

- Objective aperture radius
- Radius of annular and axial detectors
- Defocus
- Spherical aberration constant
- Reduced brightness of the electron source
- Width of the source image
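The remark that correlated settings cannot be optimized one at a time can be illustrated with a small numerical sketch. The function `toy_sigma` below is a purely hypothetical precision surface in which the detector radius is coupled to the aperture radius; it is not the CRLB of the expectation model of Section V.B, and the grid and starting value are assumptions chosen for illustration only.

```python
import numpy as np

def toy_sigma(g_ap, g_det):
    """Hypothetical precision surface with coupled settings (NOT the real CRLB)."""
    return 1.0 + 10.0 * (g_det - g_ap) ** 2 + (g_ap - 0.3) ** 2

grid = np.linspace(0.1, 0.8, 71)  # candidate radii, step 0.01

# Simultaneous optimization: search both settings at once.
G_ap, G_det = np.meshgrid(grid, grid, indexing="ij")
surface = toy_sigma(G_ap, G_det)
i, j = np.unravel_index(np.argmin(surface), surface.shape)
joint = (grid[i], grid[j], surface[i, j])

# One-at-a-time: tune g_ap with g_det frozen at 0.6, then tune g_det once.
g_det0 = 0.6
g_ap1 = grid[np.argmin(toy_sigma(grid, g_det0))]
g_det1 = grid[np.argmin(toy_sigma(g_ap1, grid))]
sequential = (g_ap1, g_det1, toy_sigma(g_ap1, g_det1))

print("joint      :", joint)
print("sequential :", sequential)
```

In this toy case a single one-at-a-time pass stays close to the poorly chosen starting detector radius, whereas the joint search reaches the lower value of the criterion.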

For isolated atom columns, the width of the source image, which is determined by Eq. (135), will be kept constant in the following sense. The diameter d_I50 of the source image containing 50% of the current will be assumed to be determined by the objective aperture angle α_o, following the relation (Barth and Kruit, 1996)

\[ d_{I50} = \frac{0.54\,\lambda}{\alpha_o}. \tag{157} \]

The right-hand member of this equation is equal to the diameter of the diffraction-error disc containing 50% of the total intensity. Consequently, the contribution of the source image to the total probe size is rather small. Then, meeting Eq. (157), it follows from Eq. (149) that the probe current is constant and equal to I_B = 10⁻¹⁸ B_r (Barth and Kruit, 1996). This implies that the total number of incident electrons per square Å is constant as a function of the microscope settings for a fixed recording time. The reason why the diameter of the source image will be assumed to be determined by the diffraction-error disc, instead of assuming it to be tunable, is the following one. For isolated atom columns, the optimal diameter d_I50 would be infinite, corresponding to an infinite probe current, as follows from Eq. (149). However, an infinite source image is not realistic since neighboring atom columns will then strongly overlap. Therefore, the dependence of the precision on the 'tunable' source diameter will be studied for neighboring atom columns only.

ii. Structure Parameters. The evaluation and optimization of the attainable precision as a function of the microscope settings will be done for different types of atom columns. The atom columns which will be considered are given in Table 16 as well as the corresponding width of the 1s-state a_n, its energy E_1s,n, the interatomic distance d, that is, the distance between successive

TABLE 16
Width of the 1s-State, Its Energy (Debye-Waller Factor = 0.6 Å² and E_0 = 300 keV), Interatomic Distance, and Atomic Number for Different Atom Columns

Structure parameter   Si [100]   Si [110]   Sr [100]   Sn [100]   Cu [100]   Au [100]
a_n (Å)               0.34       0.27       0.22       0.20       0.18       0.13
E_1s,n (eV)           20.2       37.4       57.3       69.8       78.3       210.8
d (Å)                 5.43       3.84       6.08       6.49       3.62       4.08
Z                     14         14         38         50         29         79


atoms along a column, and the atomic number Z of these atoms. The other structure parameters of the object under study, such as the atom column positions and the object thickness, will be given in the following parts.

b. Isolated Atom Columns

i. Structure Parameters. For isolated atom columns, the atom column positions and the object thickness are given in Table 17. The object thickness is equal to half the extinction distance, which is given by Eq. (133). From the proposed model in Section V.B, it follows that at this thickness and at thicknesses equal to odd multiples of half the extinction distance, the electrons are strongly localized at the atom column positions.

ii. Optimality Criterion. The optimal statistical experimental design will be described by the microscope settings resulting in the highest attainable precision with which the position coordinates β = (β_x β_y)^T of the atom column can be estimated. This attainable precision (in terms of the variance) is represented by the diagonal elements σ²_βx and σ²_βy of the CRLB. These elements are theoretical lower bounds on the variance with which the position coordinates can be estimated without bias. An expression for them will be derived in the following paragraph. This derivation is completely analogous to the one presented in Section IV.C.2 for CTEM and may therefore be skipped by the reader who is already familiar with it. For an isolated atom column, the CRLB is equal to the inverse of the 2 × 2 Fisher information matrix F associated with the position coordinates. The (r, s)th element of F is defined by Eq. (12):

\[ F_{rs} = \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \frac{\partial \lambda_{kl}}{\partial \beta_r} \frac{\partial \lambda_{kl}}{\partial \beta_s} \tag{158} \]

with λ_kl the expected number of electrons at the pixel (k, l). An expression for the elements F_rs is found by substitution of the expectation model given by Eq. (151) as derived in Section V.B and its derivatives with respect to the position coordinates into Eq. (158). Explicit numbers for these elements are obtained by substituting values of a given set of microscope settings and structure parameters of the object into the obtained expression for F_rs.

TABLE 17
Structure Parameters of an Isolated Atom Column

Structure parameter   Value
β_x (Å)               0
β_y (Å)               0
z (Å)                 |E_0 λ / E_1s,n|
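The computation described by Eq. (158), followed by inversion of F, can be sketched numerically. The block below is a minimal illustration only: `toy_model` is a hypothetical expectation model (a Gaussian peak on a flat background), not the model of Eq. (151), and the dose, peak width, and background level are assumed values. The derivatives are approximated by central differences so that any expectation model could be plugged in.

```python
import numpy as np

def toy_model(beta, dose=1e4, width=0.5, pixel=0.2, K=100, L=100):
    """Hypothetical expected counts per pixel: Gaussian peak on a flat background."""
    x = (np.arange(K) - (K - 1) / 2) * pixel
    y = (np.arange(L) - (L - 1) / 2) * pixel
    X, Y = np.meshgrid(x, y, indexing="ij")
    peak = np.exp(-((X - beta[0]) ** 2 + (Y - beta[1]) ** 2) / (2 * width ** 2))
    return dose * pixel ** 2 * (0.05 + peak)

def fisher_matrix(model, beta, step=1e-4):
    """Eq. (158) for Poisson counts, with central-difference derivatives."""
    beta = np.asarray(beta, dtype=float)
    lam = model(beta)
    grads = []
    for r in range(beta.size):
        shift = np.zeros(beta.size)
        shift[r] = step
        grads.append((model(beta + shift) - model(beta - shift)) / (2 * step))
    return np.array([[np.sum(gr * gs / lam) for gs in grads] for gr in grads])

def crlb_sigma(model, beta=(0.0, 0.0)):
    """Square roots of the diagonal elements of the inverse of F."""
    return np.sqrt(np.diag(np.linalg.inv(fisher_matrix(model, beta))))

sigma_1 = crlb_sigma(lambda b: toy_model(b, dose=1e4))
sigma_4 = crlb_sigma(lambda b: toy_model(b, dose=4e4))
print(sigma_1, sigma_4)
```

For this radially symmetric toy model the two lower bounds coincide, and a fourfold increase of the assumed dose halves them, since F scales linearly with the expected counts.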


For the radially symmetrical expectation model used, the diagonal elements of the Fisher information matrix are equal to one another. Moreover, since the Fisher information matrix is symmetric, the diagonal elements of its inverse, that is, of the CRLB, are also equal to one another:

\[ \sigma^2_{\beta_x} = \sigma^2_{\beta_y} = \left[F^{-1}\right]_{11} \tag{159} \]

with [F⁻¹]_11 the (1, 1)th element of the CRLB, that is, of F⁻¹. In what follows, the precision will be represented by the lower bound on the standard deviation σ_βx and σ_βy, that is, the square root of the right-hand member of Eq. (159). It will be used as optimality criterion for the evaluation and optimization of the experimental design. Therefore, this chosen optimality criterion will be calculated for various types of atom columns as a function of the objective aperture radius, the radius of annular and axial detectors, the defocus, the spherical aberration constant, and the reduced brightness of the electron source. In this evaluation and optimization procedure, the relevant physical constraints are taken into consideration. The relevant constraint is either the radiation sensitivity of the object under study or the specimen drift. Therefore, either the incident electron dose per square Å or the recording time has to be kept within the constraints.

iii. Optimal Objective Aperture Radius. First, the dependence of the precision on the objective aperture radius g_ap is studied. Recall that the objective aperture radius is directly related to the objective aperture semi-angle α_o according to the formula α_o = g_ap λ. The precision, which is represented by the square root of the right-hand member of Eq. (159), has been evaluated as a function of the objective aperture radius for annular as well as for axial detectors and for different atom column types. From this evaluation, it is found that the optimal aperture radius is mainly determined by the atom column type under study and that it is the same for annular and axial detectors. The optimal aperture radius turns out to be proportional to the width of the function Φ_1s,n(g), that is, the Fourier transform of the 1s-state φ_1s,n(r) as given by Eq. (90). Figure 37 compares the optimal aperture radius with the width of Φ_1s,n(g).
The width of Φ_1s,n(g) is equal to 1/(4πa_n), where a_n is the width of the 1s-state φ_1s,n(r) as given by Eq. (88). The optimal aperture radii are plotted as a function of (d²/Z + 0.276B)^(-1/2), since this term is more or less proportional to the width of the function Φ_1s,n(g) as shown in (van Dyck and Chen, 1999a). For a given atom column, d represents the interatomic distance, Z the atomic number, and B the Debye-Waller factor. From Figure 37, it is clear that the influence of the object on the optimal objective aperture radius is substantial. In contrast to what one might


Figure 37. Comparison of the optimal aperture radius for C_s being equal to 0.5 mm with the width of the Fourier transformed 1s-state Φ_1s,n(g) for different atom column types. The width of Φ_1s,n(g) is proportional to (d²/Z + 0.276B)^(-1/2).

expect, the resulting probe in the optimal design is not as narrow as possible. Its main lobe is even broader than the 1s-state φ_1s,n(r). This is shown in Figure 38, where both the 1s-state and the amplitude of the optimal probe are shown for a silicon and a gold [100] atom column. Furthermore, for heavy atom columns such as gold [100], an increase of the spherical aberration constant results in a decrease of the optimal aperture radius and vice versa. For this column, the optimal aperture radius is equal to 0.75 Å⁻¹ for C_s being equal to 0 mm, whereas it is equal to 0.50 Å⁻¹ for C_s being equal to 0.5 mm. For lighter atom columns such as silicon [100], the optimal aperture radius is independent of the spherical aberration constant. For a silicon [100] atom column, the optimal aperture radius is equal to 0.28 Å⁻¹ for both C_s being equal to 0 mm and 0.5 mm. It should be noticed that the foregoing analysis was done for object thicknesses equal to half the extinction distance as follows from Eq. (133) and Table 17. However, the conclusions remain the same for thicknesses different from half the extinction distance. Also, it should be mentioned that these conclusions are not subject to the relevant physical constraint, which is either the radiation sensitivity of the object under study or the specimen drift. From the discussion given above, it follows that there is a fundamental difference between the optimal aperture radius in terms of the attainable precision and the aperture radius as given by Eq. (156), which is assumed to


Figure 38. The dashed curve of the left- and right-hand figure represents the 1s-state φ_1s,n(r) for a silicon [100] and a gold [100] atom column, respectively. The solid curves represent the amplitude of their associated optimal electron probes, that is, |p(r)|, for C_s being equal to 0.5 mm.

be optimal in terms of direct visual interpretability. The former depends more on the width of the 1s-state of the column under study than on the spherical aberration constant. The latter depends on the spherical aberration constant, but is independent of any structure parameter.

iv. Optimal Detector Configuration. Next, the optimal detector configuration in terms of precision is described. In Figure 39, the precision with which the position coordinates of an isolated silicon [100] atom column can be estimated is evaluated as a function of the detector-to-aperture radius, that is, g_det/g_ap. For annular detectors, g_det represents the inner collection detector radius, whereas for axial detectors, it represents the outer collection detector radius (see Figure 36). The objective aperture radius and the defocus are set to their optimal values of 0.28 Å⁻¹ and 80 Å, respectively. From this figure, the following conclusions may be derived:

- For an annular detector, the optimal detector radius equals the optimal aperture radius.
- For an axial detector, the optimal detector radius is slightly smaller than the optimal aperture radius.
- An annular detector results in higher precisions than an axial detector when operating at the optimal conditions.


Figure 39. The lower bound on the standard deviation of the position coordinates of an isolated silicon [100] atom column as a function of the detector-to-aperture radius for an annular and an axial detector. The objective aperture radius and the defocus are set to their optimal values of 0.28 Å⁻¹ and 80 Å, respectively.

It should be mentioned that in Figure 39, the size of the optimal detector radius of the annular detector is of the same order as the size of the aperture radius. For such detector radii, thermal diffuse scattering is unimportant. As mentioned earlier in this section, thermal diffuse scattering is not included in the expectation model given in Section V.B. The reader may wonder if the precision would be higher by using a large detector radius so that thermal diffuse scattering is dominant. This is not to be expected, since the precision in terms of the lower bound on the standard deviation is inversely proportional to the square root of the total number of detected electrons, that is, the signal-to-noise ratio, which in its turn is inversely proportional to the detector radius. It is unlikely that the decrease of the total number of detected electrons by using a large detector radius may be compensated by the fact that only thermal diffuse scattered electrons are detected. Furthermore, it should be mentioned that the conclusions obtained from Figure 39 are not subject to the relevant physical constraint, which is either the radiation sensitivity of the object under study or the specimen drift. In Figure 39, the recording time as well as the number of incident electrons per square Å are fixed. The optimal detector settings do not change if, for example, longer recording times or more incident electrons per square Å would be allowed. For different values of the recording time or the number of incident electrons per square Å, only the actual values for the standard deviation ascribed to Figure 39 would be different, whereas the optimal detector settings would be the same.


As mentioned earlier, the detector radius is usually taken much smaller or larger than the objective aperture radius for an axial or annular detector, respectively, thus aiming at optimal direct visual interpretability. However, this is typically not found if the attainable precision is used as optimality criterion. Then, the optimal detector radius is almost equal to the aperture radius. This has to do with the fact that the signal-to-noise ratio decreases with decreasing or increasing radius of axial or annular detectors, respectively. The finding that the optimal detector radius of an annular detector equals the optimal aperture radius is in agreement with the result found in (Rose, 1975). In that paper, the annular detector was optimized in terms of signal-to-noise ratio. Thus far, however, this guideline is usually not followed in practice since one seems to prefer direct visual interpretability above precision, even if this visual interpretability is accompanied with a low signal-to-noise ratio.

v. Optimal Defocus Value. Subsequently, the dependence of the precision on the defocus is studied, as well as the dependence of the optimal defocus on the spherical aberration constant, the electron wavelength, and the optimal objective aperture radius. In Figure 40, the precision is evaluated for a silicon [100] atom column as a function of the defocus ε and the spherical aberration constant C_s for a given electron wavelength λ and for the objective aperture radius g_ap and detector radius g_det adjusted to their optimal values, both corresponding to 0.28 Å⁻¹. This evaluation is done for an annular and axial detector. The solid white curves shown in Figure 40 are described by the relation

\[ \varepsilon = \frac{1}{2}\, C_s \lambda^2 g_{ap}^2 . \tag{160} \]

The dotted white curves describe the numerically found optimal defocus values as a function of the considered spherical aberration constants. From the comparison of the solid and dotted white curves, it follows that the defocus value as described by Eq. (160) is close to the optimal defocus value in terms of precision. Moreover, for a given spherical aberration constant, the precision that is gained by operating at the corresponding optimal defocus instead of at the defocus given by Eq. (160) is hardly significant. Therefore, the optimal defocus value, as a function of the spherical aberration constant, the electron wavelength, and the optimal objective aperture radius, is approximately given by the empirical relation as described by Eq. (160). At this defocus value, the transfer function is flattened in the sense that it is nearly equal to one over the whole angular range of the objective aperture. The optimal transfer function for a silicon [100] atom column and for a spherical aberration constant equal to 0.5 mm


Figure 40. The lower bound on the standard deviation of the position coordinates of an isolated silicon [100] atom column as a function of the spherical aberration constant and the defocus. The left- and right-hand figure represent the results for an annular and axial detector, respectively. The objective aperture radius and detector radius are adjusted to their optimal values, both corresponding to 0.28 Å⁻¹. The solid white curves are described by Eq. (160) and the dotted white curves describe the numerically found optimal defocus values as a function of the considered spherical aberration constants.

is presented in Figure 41, where the arrow represents the optimal objective aperture radius. Equation (160) is derived from Eq. (130) by setting the phase shift χ(g) exactly to zero for g = g_ap, with g_ap the optimal objective aperture radius. These findings do not depend on whether the radiation sensitivity of the object under study or the specimen drift determines the relevant physical constraint. From now on, the defocus will be adjusted to the value given by Eq. (160). From the comparison of the optimal defocus in terms of direct visual interpretability as given by Eq. (156) with the optimal defocus in terms of precision as given by Eq. (160), it follows that their relation to the objective aperture radius is equal for both optimality criteria. Nevertheless, the explicit numbers for the defocus are different since the optimal aperture radius is different for both optimality criteria.
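The statement that Eqs. (156) and (160) share the same relation between defocus and aperture radius can be checked with a few lines of arithmetic. The sketch below uses the values of Table 15 (λ = 0.02 Å, C_s = 0.5 mm) and the optimal silicon [100] aperture radius of 0.28 Å⁻¹ quoted in the text. The function names are introduced here for illustration only, and the defocus values are treated as magnitudes, following the equations as printed.

```python
import math

WAVELENGTH = 0.02   # electron wavelength in Å (Table 15)
CS = 5.0e6          # spherical aberration constant: 0.5 mm expressed in Å

def scherzer_aperture(cs, lam):
    """Aperture radius of Eq. (156), in 1/Å."""
    return (4.0 * lam / cs) ** 0.25 / lam

def scherzer_defocus(cs, lam):
    """Defocus of Eq. (156), in Å."""
    return math.sqrt(cs * lam)

def precision_defocus(cs, lam, g_ap):
    """Defocus of Eq. (160) for a given aperture radius g_ap, in Å."""
    return 0.5 * cs * lam ** 2 * g_ap ** 2

g_scherzer = scherzer_aperture(CS, WAVELENGTH)
eps_scherzer = scherzer_defocus(CS, WAVELENGTH)
eps_160_at_scherzer = precision_defocus(CS, WAVELENGTH, g_scherzer)
eps_160_at_optimum = precision_defocus(CS, WAVELENGTH, 0.28)

print(f"Scherzer aperture radius : {g_scherzer:.2f} 1/Å")
print(f"Scherzer defocus         : {eps_scherzer:.0f} Å")
print(f"Eq. (160) at that radius : {eps_160_at_scherzer:.0f} Å")
print(f"Eq. (160) at 0.28 1/Å    : {eps_160_at_optimum:.1f} Å")
```

Evaluating Eq. (160) at the Scherzer aperture radius reproduces the Scherzer defocus, while the smaller optimal aperture radius gives a defocus close to the 80 Å used in Figures 39 and 41.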


Figure 41. Transfer function for a spherical aberration constant of 0.5 mm, an electron wavelength of 0.02 Å, and a defocus value of 80 Å. The arrow represents an objective aperture radius of 0.28 Å⁻¹.

vi. Optimal Spherical Aberration Constant. Today, the use of the spherical aberration corrector is regarded as the most promising way to improve the visual interpretability of STEM images. The aim of this corrector is to obtain sub-ångström resolution (Batson, Dellby, and Krivanek, 2002). In the present section, the potential merit of spherical aberration correctors is studied for quantitative instead of qualitative STEM applications. Therefore, the precision is evaluated as a function of the spherical aberration constant. In Figures 42 and 43, the ratio of the precision for a given spherical aberration constant to the precision for a spherical aberration constant of 0 mm is shown as a function of the spherical aberration constant for an isolated silicon and gold [100] atom column, respectively. This evaluation is done for an annular as well as an axial detector. For each considered spherical aberration constant, the objective aperture radius is set to its optimal value. Furthermore, the detector radius is taken equal to this optimal objective aperture radius. For silicon, which is a light atom column, it follows from Figure 42 that the precision that is gained by reducing the spherical aberration constant from 0.5 mm to 0 mm is a factor of 1.0009 and 1.0011 for an annular and axial detector, respectively. These gains are negligible. For gold, which is a heavy atom column, it follows from Figure 43 that the precision that is gained by reducing the spherical aberration constant from 0.5 mm to 0 mm is a factor of 1.21 and 1.39 for an annular and axial detector, respectively. From the numerical values given above, it follows that correction of the spherical aberration is more useful in terms of precision for heavy than for


Figure 42. Ratio of lower bounds on the standard deviation of the position coordinates, defined as σ_βx/σ_βx(C_s = 0 mm), of an isolated silicon [100] atom column as a function of the spherical aberration constant for an annular as well as for an axial detector.

Figure 43. Ratio of lower bounds on the standard deviation of the position coordinates, defined as σ_βx/σ_βx(C_s = 0 mm), of an isolated gold [100] atom column as a function of the spherical aberration constant for an annular as well as for an axial detector.

light atom columns. These results may be explained by the fact that the optimal aperture setting is strongly dependent on the atom column type as shown earlier. The optimal aperture radius for a gold [100] atom column is much larger than for a silicon [100] atom column. Because spherical


aberration is observable only for non-paraxial rays, correction is only necessary for objective lenses working with larger apertures. Notice, however, that from Figures 42 and 43, it follows that the accompanied gain in precision is only marginal, or even negligible for light atom columns. The finding that, mathematically, the optimal spherical aberration constant in terms of precision is equal to 0 mm agrees with the optimal value in terms of direct visual interpretability.

vii. Optimal Reduced Brightness of the Electron Source. Next, the effect of the reduced brightness B_r of the electron source on the precision with which the position coordinates of an atom column can be measured is studied. Using Eqs. (151), (158), and (159), it follows that the precision, represented by the lower bound on the standard deviation of the position coordinates, is inversely proportional to the square root of the product of the probe current I and the recording time t for one pixel. Furthermore, it follows from Eq. (149) that I is directly proportional to the reduced brightness of the electron source B_r. Therefore, just as for CTEM (see Section IV.C.2), new developments in producing electron sources with higher reduced brightness (de Jonge, Lamy, Schoots, and Oosterkamp, 2002; van Veen, Hagen, Barth, and Kruit, 2001) are advantageous in terms of precision. For example, if the reduced brightness is increased by a factor of 10, the lower bound on the standard deviation of the position coordinates decreases by a factor of √10. Hence, on the one hand, if the experiment is limited by specimen drift, the optimal reduced brightness is preferably as high as possible, that is, as high as physical limitations to the production of electron sources with higher reduced brightness allow. The dominant limitation is determined by the statistical Coulomb interactions (Kruit and Jansen, 1997; van Veen, Hagen, Barth, and Kruit, 2001).
On the other hand, if the experiment is limited by the radiation sensitivity of the object, the reduced brightness has to be kept subcritical or an increase of the reduced brightness B_r has to be compensated by a decrease of the recording time t, so as to keep the number of incident electrons per square Å within the constraints.

Finally, a remark about the recording time needs to be made. If the experiment is limited by specimen drift, the recording time is kept within the constraints in this study. The amount of specimen drift is determined by mechanical instabilities of the specimen holder. Hence, just as for CTEM (see Section IV.C.2), new developments providing more stable specimen holders would allow microscopists to increase the recording time. This has a favorable effect on the precision since, as mentioned above, the lower bound on the standard deviation of the position coordinates is inversely proportional to the square root of the recording time t for one pixel, which in its turn is directly proportional to the recording time for the whole FOV.
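The scaling with probe current and recording time used in this part can be written out in one line. The following is a sketch under the assumption, implicit in the argument above, that the expected counts factor into the product I t and a shape factor f_kl that does not depend on I or t; the symbol f_kl is introduced here for illustration only.

```latex
\lambda_{kl} \propto I\,t\,f_{kl}(\beta)
\;\Longrightarrow\;
F_{rs} \propto I\,t \sum_{k=1}^{K}\sum_{l=1}^{L}
\frac{1}{f_{kl}}\,
\frac{\partial f_{kl}}{\partial \beta_r}\,
\frac{\partial f_{kl}}{\partial \beta_s}
\;\Longrightarrow\;
\sigma_{\beta_x} = \sqrt{\left[F^{-1}\right]_{11}}
\propto \frac{1}{\sqrt{I\,t}}
\propto \frac{1}{\sqrt{B_r\,t}} .
```

The last step uses the proportionality between I and B_r that follows from Eq. (149). A tenfold increase of B_r at fixed t therefore lowers the bound on the standard deviation by a factor of √10, as stated above.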


It could be mentioned that for STEM, specimen drift appears in a different manner than for CTEM. For CTEM, drift blurs the image, whereas for STEM, it distorts the image (Pennycook and Yan, 2001).

viii. Comparison with Conventional Approach. Tables 18 and 19 compare the optimal microscope settings in terms of precision with the conventional settings that are optimal in terms of direct visual interpretability. This is done for an isolated strontium [100] atom column using an annular and axial detector, respectively. The objective aperture radius and defocus corresponding to optimal visual interpretability are given by Eq. (156). Furthermore, in order to meet the conditions for direct visual interpretability more or less, the detector radius is taken two times larger or smaller than the objective aperture radius for an annular or axial detector, respectively. The spherical aberration constant is set to 0.5 mm. The other microscope settings and structure parameters are given in Tables 15 to 17. From the bottom rows of Tables 18 and 19, it follows that the attainable precision σ_βx is improved by a factor of 8.5 and 1.8 at the optimal microscope settings in terms of precision instead of at those in terms of visual interpretability for an annular and axial detector, respectively. Furthermore, the gain in precision by using an annular detector instead of an axial detector, under optimal conditions in terms of precision, is a factor of 2.5.

TABLE 18
Comparison between the Optimal Microscope Settings in Terms of Precision and in Terms of Direct Visual Interpretability for a Strontium [100] Atom Column Using an Annular Detector and C_s = 0.5 mm

                        Optimality criterion
Microscope setting      Precision     Visual interpretability
ε (Å)                   80            316
g_ap (Å⁻¹)              0.40          0.56
g_det (Å⁻¹)             0.40          1.12
σ_βx (Å)                0.04          0.34

TABLE 19
Comparison between the Optimal Microscope Settings in Terms of Precision and in Terms of Direct Visual Interpretability for a Strontium [100] Atom Column Using an Axial Detector and C_s = 0.5 mm

                        Optimality criterion
Microscope setting      Precision     Visual interpretability
ε (Å)                   80            316
g_ap (Å⁻¹)              0.40          0.56
g_det (Å⁻¹)             0.35          0.28
σ_βx (Å)                0.10          0.18

ix. Summary. The optimal STEM microscope settings in terms of the attainable precision for isolated atom columns are summarized here:

- The optimal aperture radius is mainly determined by the atom column type. It is proportional to the width of the Fourier transform of the 1s-state of the column under study. This means that the optimal aperture radius is larger for heavy than for light columns.
- The optimal inner radius of an annular detector equals the optimal aperture radius.
- The optimal outer radius of an axial detector is slightly smaller than the optimal aperture radius.
- An annular detector results in a higher precision than an axial detector and is therefore preferred.
- The optimal defocus value is approximately given by Eq. (160). It is determined by the optimal aperture radius, the spherical aberration constant, and the electron wavelength.
- Strictly speaking, the optimal spherical aberration constant is equal to 0 mm. The precision that is gained by reducing spherical aberration depends on the column type. This gain is usually only marginal.
- The reduced brightness of the electron source is preferably as high as possible if the experiment is limited by the specimen drift.
- Improvements of the mechanical stability of the specimen holder, providing longer recording times, are beneficial in terms of precision, especially if the experiment is limited by the specimen drift.

c. Neighboring Atom Columns. In the previous section, the optimal experimental STEM design was described for isolated atom columns. The optimality criterion was the attainable precision with which the position of an isolated atom column can be estimated. This choice is justified as long as neighboring atom columns are clearly separated in the image. Then, the attainable precision with which the position of an atom column is estimated is independent of the presence of neighboring columns. However, images of the silicon [100] atom columns of a crystal, taken at the optimal settings for isolated atom columns found in the previous section, may show strong overlap. The attainable precision is then affected unfavorably by the presence of neighboring columns. In this section, it will be investigated if the optimal microscope settings change in the presence of neighboring atom columns. This will be done for silicon [100] and gold [100] crystals.


i. Structure Parameters. The two-dimensional projected structure of the objects under study, which are silicon [100] and gold [100] crystals, is modelled as a lattice consisting of 5 × 5 projected atom columns at the positions

$$\beta_{n} = (\beta_{xn} \;\; \beta_{yn})^{T} = (n_x \delta \;\; n_y \delta)^{T}, \tag{161}$$

with indices $n = (n_x, n_y)$; $n_x = -2, \ldots, 2$; $n_y = -2, \ldots, 2$, and $\delta$ the distance between an atom column and its nearest neighbor. The values of the distance $\delta$ for both a silicon [100] and a gold [100] crystal (International Centre for Diffraction Data, 2001) and for the object thickness are given in Table 20. The object thickness is equal to half the extinction distance, which is given by Eq. (133).

TABLE 20
Structure Parameters of Neighboring Atom Columns

Structure parameter     Column type
                        Si [100]                   Au [100]
δ (Å)                   1.92                       2.04
z (Å)                   E_0 λ / |E_{1s,n}|         E_0 λ / |E_{1s,n}|

ii. Optimality Criterion. The optimal statistical experimental design is given by the microscope settings resulting in the highest attainable precision with which the position coordinate $\beta_{xn}$ of the central atom column of the lattice consisting of 5 × 5 atom columns can be estimated. This column corresponds to the index $n = (0, 0)$. The attainable precision (in terms of the variance) is represented by the diagonal element $\sigma^2_{\beta_{xn}}$ of the CRLB. An expression for this element may be derived as follows. First, the Fisher information matrix associated with the total set of 50 position coordinates $\beta_{xn}$ and $\beta_{yn}$ is computed. This is a 50 × 50 matrix. The expression for the elements $F_{rs}$ of the Fisher information matrix is given by Eq. (158). Explicit numbers for these elements are obtained by substituting values of a given set of microscope settings and structure parameters of the object into the obtained expression for $F_{rs}$. Next, the CRLB is computed by inverting the Fisher information matrix. Finally, the diagonal element $\sigma^2_{\beta_{xn}}$ of the CRLB, corresponding to the position coordinate $\beta_{xn}$ of the central atom column of the lattice, represents the attainable precision. In what follows, the precision will be represented by the lower bound on the standard deviation $\sigma_{\beta_{xn}}$, that is, the square root of $\sigma^2_{\beta_{xn}}$. It will be used as optimality criterion for the evaluation and optimization of the experimental design.
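The procedure described for the optimality criterion (build the 50 × 50 Fisher information matrix, invert it, and read off the diagonal element belonging to the central column) can be sketched numerically. The sketch below is illustrative only: it does not use the STEM expectation model behind Eq. (158), but assumes Gaussian peaks of width `rho` with Poisson-distributed counts, in the spirit of the simplified models of Section III. The function name `crlb_central_column`, its parameters, and the numerical values are hypothetical.

```python
import numpy as np

def crlb_central_column(d, rho, n_counts, pixel=0.1, margin=4.0):
    """Lower bound on the std. dev. of the x-position of the central column
    of a 5 x 5 lattice. ASSUMPTION: Gaussian peaks, Poisson counts."""
    idx = np.arange(-2, 3)
    bx, by = np.meshgrid(idx * d, idx * d, indexing="ij")
    bx, by = bx.ravel(), by.ravel()              # 25 column positions

    half = 2 * d + margin * rho                  # field of view covers the lattice
    x = np.arange(-half, half + pixel / 2, pixel)
    X, Y = np.meshgrid(x, x, indexing="ij")

    dx = X[None] - bx[:, None, None]             # shape (25, K, K)
    dy = Y[None] - by[:, None, None]
    peaks = (n_counts * pixel**2 / (2 * np.pi * rho**2)
             * np.exp(-(dx**2 + dy**2) / (2 * rho**2)))
    lam = peaks.sum(axis=0)                      # expected counts per pixel

    # Derivatives of lam with respect to the 50 position coordinates.
    J = np.concatenate([peaks * dx / rho**2, peaks * dy / rho**2]).reshape(50, -1)

    # Poisson Fisher information: F_rs = sum_pixels (1/lam) dlam/dr dlam/ds.
    F = (J / lam.ravel()) @ J.T
    crlb = np.linalg.inv(F)

    center = 12                                  # n = (0, 0) in the raveled 5 x 5 order
    return float(np.sqrt(crlb[center, center]))

sigma_separated = crlb_central_column(d=3.0, rho=0.3, n_counts=1000.0)
sigma_overlapping = crlb_central_column(d=0.6, rho=0.3, n_counts=1000.0)
print(sigma_separated, sigma_overlapping)
```

Under these assumptions, well-separated columns give a bound close to `rho / sqrt(n_counts)`, while strongly overlapping columns give a larger bound, which is the qualitative behavior discussed in this section.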


Alternatively, one could choose the lower bound on the standard deviation $\sigma_{\beta_{yn}}$ of the position coordinate $\beta_{yn}$ of the central atom column since $\sigma_{\beta_{xn}}$ and $\sigma_{\beta_{yn}}$ are equal to one another. The reason for this is that, for the structure of the objects under study, rotation of the expectation model over an angle of 90 degrees carries the expectation model into itself. Moreover, the central atom column is preferred rather than one of the other 24 atom columns since this column is mostly affected by the presence of neighboring columns. As mentioned in Section II.C.2, the chosen criterion may be regarded as a partial or truncated optimality criterion.

iii. Optimal Microscope Settings. First, in accordance with the optimization of the design for isolated atom columns, it will be assumed that the diameter of the source image is given by Eq. (157). Meeting this assumption, the precision has been evaluated as a function of the microscope settings for both a gold [100] and a silicon [100] crystal. From this evaluation, it is investigated if the optimal settings for isolated atom columns as summarized earlier are still optimal in the presence of neighboring columns. For a gold [100] crystal, the optimal settings are reasonably well described by those for isolated atom columns. The reason for this is that neighboring atom columns do not show strong overlap in the images taken at these settings. One of the minor changes in the optimal design that has been observed is the following one:

• The optimal objective aperture radius increases by an order of about 10%. For example, for an annular detector, it increases from 0.50 Å⁻¹ to 0.55 Å⁻¹ at a spherical aberration constant of 0.5 mm and from 0.75 Å⁻¹ to 0.85 Å⁻¹ at a spherical aberration constant of 0 mm.

For a silicon [100] crystal, changes in the optimal settings compared to those for isolated atom columns are more pronounced than for gold. The reason for this is that neighboring columns show strong overlap in the images taken at the settings which are optimal for isolated columns. The most important changes are the following ones:

• The optimal objective aperture radius increases substantially. For example, for an annular detector it increases from 0.28 Å⁻¹ to 0.50 Å⁻¹. For an axial detector, it increases from 0.28 Å⁻¹ to 0.65 Å⁻¹ at a spherical aberration constant of 0.5 mm and even from 0.28 Å⁻¹ to 0.95 Å⁻¹ at a spherical aberration constant of 0 mm.
• The optimal outer detector radius of an axial detector is considerably smaller than the optimal aperture radius. It is found to be equal to 0.25 Å⁻¹ for both a spherical aberration constant of 0 mm and 0.5 mm.
• For a low value of the spherical aberration constant, an axial detector may result in a higher attainable precision than an annular detector.

The latter two findings are illustrated in Figure 44, where the precision is evaluated as a function of the detector-to-aperture radius for both an annular and an axial detector and a spherical aberration constant of 0 mm. The objective aperture radius is set to its optimal value, corresponding to 0.50 Å⁻¹ and 0.95 Å⁻¹ for an annular and axial detector, respectively. Furthermore, it is worth mentioning that at the optimal settings for a silicon crystal, neighboring columns are clearly separated in the image. Next, it is found that the precision that may be gained by correcting spherical aberration is larger for neighboring atom columns than for isolated atom columns.

Figure 44. The lower bound on the standard deviation of the position coordinates of the central atom column of the silicon [100] crystal under study as a function of the detector-to-aperture radius for an annular and an axial detector. The spherical aberration constant is set to 0 mm, the objective aperture radius is set to its optimal value corresponding to 0.50 Å⁻¹ and 0.95 Å⁻¹ for an annular and axial detector, respectively.

Second, the diameter of the source image has been taken variable. In practice, this is possible by adjusting the settings of the condenser lenses, allowing the demagnification of the source to be continuously varied. It is well known that an increase of this diameter is accompanied by two side effects: a broadening of the source image and an increase of the probe current (Nellist and Pennycook, 2000). The former has an unfavorable effect on the precision while the latter has a favorable effect. Moreover, a decrease of the diameter of the source image is accompanied by the opposite side

effects. The potential merit of a variable diameter of the source image is studied for experiments that are limited by specimen drift only and not for experiments that are limited by the radiation sensitivity of the object. The reason for this is that the latter constraint, which implies that the total number of incident electrons per square Å has to be kept constant, would lead to unrealistic microscope settings and recording times. This follows intuitively from Eqs. (148) and (149). The total number of incident electrons may be kept constant by keeping the product $d_{I50}^{2}\,t$ constant. Hence, a decrease of the diameter $d_{I50}$ of the source image may be compensated by an increase of the recording time $t$ for one pixel. Narrowing the source image has a favorable effect on the precision. Although this leads to a decrease of the probe current as well, it has no effect on the total number of incident electrons if the recording time may be increased without limits. Hence, the 'optimal' diameter of the source image would be infinitely small and the accompanying recording time would be infinitely large. Such settings are unrealistic in practice.

For the silicon and gold crystals under study, it has been found that the optimal diameter of the source image is of the same order of magnitude as the diameter of the diffraction-error disc as given by Eq. (157). This is illustrated for gold in Figure 45, where the precision is evaluated as a function of the 'source image'-to-'diffraction-error disc' diameter for an annular as well as for an axial detector. The diameter of the source image and the diffraction-error disc are determined by Eqs. (150) and (157), respectively. The objective aperture and detector radius are set to 0.55 Å⁻¹, being optimal for a spherical aberration constant of 0.5 mm. As follows from Figure 45, for this example, the optimal diameter of the source image is slightly smaller than the diameter of the diffraction-error disc. For other examples, it has been observed that the optimal diameter may equally well be larger, instead of smaller, than the diameter of the diffraction-error disc. For all examples that have been studied, the order of magnitude of this optimal diameter is approximately equal to the diameter of the diffraction-error disc. Furthermore, it has to be mentioned that a variable diameter of the source image has hardly any effect on the optimal settings other than this diameter.

Figure 45. The lower bound on the standard deviation of the position coordinates of the central atom column of the gold [100] crystal under study as a function of the 'source image'-to-'diffraction-error disc' diameter for an annular and an axial detector. The objective aperture and detector radius are set to 0.55 Å⁻¹, being optimal for a spherical aberration constant of 0.5 mm.

d. Attainability of the Cramér-Rao Lower Bound. Finally, it is investigated if there exists an estimator attaining the CRLB on the variance of the position coordinates and if this estimator may be considered unbiased. If so, this would justify the choice of the CRLB as optimality criterion used in this section. The procedure that is used to investigate the existence of an estimator attaining the CRLB is the same as the one used in Sections III.D and IV.C.2. Recall that one of the asymptotic properties of the maximum likelihood estimator is its normal distribution about the true parameters with a covariance matrix approaching the CRLB (van den Bos, 1982). This property would justify the use of the CRLB as optimality criterion, but it is an asymptotic one. Whether this asymptotic property still applies to STEM experiments, where the number of observations is finite, will be assessed by carrying out simulation experiments.
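A simulation experiment of this kind can be sketched as follows. The sketch is a simplified stand-in, not the STEM model of Section V.B: it assumes a single Gaussian peak of width `rho` without background, fully contained in the field of view, with Poisson-distributed pixel counts. Under those assumptions, maximizing the Poisson log-likelihood with respect to the position reduces (to a good approximation) to the count-weighted centroid, and the lower bound on the variance is `rho**2 / n_counts`. The standard deviation of the estimated variance uses the normal-theory formula, which is also an assumption. All names and numerical values are hypothetical.

```python
import numpy as np

def simulate_ml_position_estimates(n_experiments=200, n_counts=1000.0,
                                   rho=0.5, pixel=0.1, half_width=3.0, seed=0):
    """Simulate Poisson images of one Gaussian peak at (0, 0) and return the
    ML position estimates (count-weighted centroids under the stated assumptions)."""
    rng = np.random.default_rng(seed)
    x = np.arange(-half_width, half_width + pixel / 2, pixel)
    X, Y = np.meshgrid(x, x, indexing="ij")
    # Expected counts per pixel for a column at the true position (0, 0).
    lam = (n_counts * pixel**2 / (2 * np.pi * rho**2)
           * np.exp(-(X**2 + Y**2) / (2 * rho**2)))
    estimates = np.empty((n_experiments, 2))
    for i in range(n_experiments):
        counts = rng.poisson(lam)
        total = counts.sum()
        estimates[i] = [(counts * X).sum() / total, (counts * Y).sum() / total]
    return estimates

def summarize(estimates, crlb_var):
    """Sample mean and variance of the estimates with their standard deviations."""
    m = len(estimates)
    var = estimates.var(axis=0, ddof=1)
    return {
        "mean": estimates.mean(axis=0),
        "std_of_mean": np.sqrt(var / m),
        "var": var,
        "std_of_var": var * np.sqrt(2.0 / (m - 1)),  # normal-theory approximation
        "crlb_var": crlb_var,
    }

est = simulate_ml_position_estimates()
stats = summarize(est, crlb_var=0.5**2 / 1000.0)
print(stats)
```

The estimator may be regarded as indistinguishable from an unbiased, efficient one if the sample means lie within a few standard deviations of the true position and the sample variances lie within a few standard deviations of the lower bound. As a point of reference for Table 21, a variance of 0.0019 Å² and 200 estimates give, with these formulas, a standard deviation of the mean of about 0.003 Å and a standard deviation of the variance of about 0.0002 Å².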
To this end, 200 different STEM experiments made on an isolated strontium [100] atom column are simulated; the observations are modelled using the parametric statistical model described in Section V.B. The objective aperture radius and defocus are set to the optimal values of 0.4 Å⁻¹ and 160 Å, respectively. An annular detector is used with detector radius equal to the optimal aperture radius. Next, the position coordinates $\beta_x$ and $\beta_y$ of the atom column are estimated from each simulation experiment using the maximum likelihood estimator. The mean and variance of these estimates are computed and compared to the true value of the position coordinate and the lower bound on the variance, respectively. The lower bound on the variance is computed by substituting the true values of the parameters into the expression given by the right-hand member of Eq. (159). The results are presented in Table 21. From the comparison of these results, it follows that it may not be concluded that the maximum likelihood estimator is biased or that it does not attain the CRLB. Furthermore, the hypothesis that the estimates are normally distributed has been tested quantitatively by means of the so-called Lilliefors test (Conover, 1980), which does not reject this hypothesis. From the results obtained from the simulation experiments, it is concluded that the maximum likelihood estimates cannot be distinguished from unbiased, efficient estimates. These results justify the choice of the CRLB as optimality criterion.

TABLE 21
Comparison of True Position Coordinates and Lower Bounds on the Variance with Estimated Means and Variances of 200 Maximum Likelihood Estimates of the Position Coordinates, Respectively

            True position coordinate (Å)    Estimated mean (Å)         Standard deviation of mean (Å)
β_x         0                               0.002                      0.003
β_y         0                               0.001                      0.003

            Lower bound on variance (Å²)    Estimated variance (Å²)    Standard deviation of variance (Å²)
σ²_βx       0.0019                          0.0019                     0.002
σ²_βy       0.0019                          0.0022                     0.0002

The numbers of the last column represent the estimated standard deviation of the variable of the previous column.

3. Interpretation of the Results

In Section V.C.2, optimal STEM designs were obtained by numerically computing and evaluating the attainable precision as a function of the microscope settings. Numerical analysis is the only correct way to obtain the optimal design. However, in the present section, it will be shown that an intuitive interpretation may sometimes be given by means of the results of Section III, where rules of thumb were obtained for the attainable precision with which the position of one component or the distance between two components can be estimated from dark-field and bright-field imaging experiments. These rules of thumb are given by Eqs. (65)-(70). For dark-field imaging, they were derived for an expectation model of the observations consisting of Gaussian peaks located at each component. For bright-field imaging, they were derived for a model consisting of a constant background from which Gaussian peaks located at each component were subtracted. The rules of thumb show that the precision is a function of the width of the Gaussian peak and the total number of detected electrons. Generally, the precision improves by narrowing the Gaussian peak and by increasing the


total number of detected electrons. Furthermore, for neighboring components, the precision of the distance deteriorates if the distance is smaller than a typical value, which is proportional to the width of the peak. In other words, the precision deteriorates if neighboring components strongly overlap in the image. It may be shown that this does not only apply to the precision of the distance, but to the precision of the position as well. Empirically, it has been found that the obtained rules of thumb are generalizable to more complicated STEM expectation models than Gaussian peaks. Instead of the width of the Gaussian peak, one may consider the width of the corresponding non-Gaussian peak of the STEM expectation model. Then, the generalized rules of thumb express that the precision will improve by decreasing this width or by increasing the total number of detected electrons and that it will deteriorate if neighboring components strongly overlap in the image. The applicability of these rules of thumb to give an intuitive interpretation to the numerical results found in previous sections will now be demonstrated by means of two examples.

The first example explains why the optimal probe is not as narrow as possible for dark-field STEM, using an annular detector with inner radius larger than or equal to the objective aperture radius. It is generally known that the size of a diffraction-limited probe decreases if the objective aperture radius increases. Consequently, the width of the non-Gaussian peak of the expectation model decreases. This follows from the expression for the dark-field image intensity distribution, given by Eq. (142). Therefore, an increase of the objective aperture radius has a favorable effect on the attainable precision. At this point, it should be realized, however, that below a certain probe size, the width of the peak is limited by the intrinsic width of the column-dependent 1s-state.
Today, the width of the probe is almost equal to the width of an atom, as follows from Batson, Dellby, and Krivanek (2002) and Krivanek, Dellby, and Nellist (2002). On the other hand, it has been found that the optimal design of an annular detector corresponds to an inner radius of the detector being equal to the optimal objective aperture radius. Thus, an increase of the objective aperture radius means an increase of the detector radius. The accompanying loss of the number of detected electrons has an unfavorable effect on the attainable precision. As a consequence, the optimal design balances the width of the probe and the number of detected electrons. This is illustrated in Figure 46, where intersections of the two-dimensional, radially symmetric expectation model of an isolated gold [100] column and a plane through its radial axis are shown. The solid curve corresponds to an objective aperture radius of 0.85 Å⁻¹, being optimal for a gold crystal. The dashed curve corresponds to a larger objective aperture radius of 1.5 Å⁻¹, resulting in a narrower probe. Furthermore, for both curves, the spherical aberration constant is set at 0 mm and an annular detector is used with inner radius equal to the objective aperture radius. Figure 46 clearly illustrates that the width of the peak of the expectation model is smaller if the probe is narrower, but also that the number of detected electrons is lower. This makes plausible that the optimal probe is not as narrow as possible.

Figure 46. Intersection of the two-dimensional, radially symmetric expectation model of the observations made on an isolated gold [100] atom column and a plane through its radial axis. An annular detector is chosen with inner radius equal to the objective aperture radius. The solid curve corresponds to the optimal settings; the dashed curve corresponds to non-optimal settings.

The second example explains why, for a silicon [100] crystal, the optimal objective aperture radius increases substantially and why the optimal outer radius of an axial detector is much smaller than the objective aperture radius as compared to the optimal settings for an isolated silicon column, as mentioned earlier. Figure 47 shows intersections of the two-dimensional, radially symmetric, bright-field expectation model of an isolated silicon [100] column and a plane through its radial axis. The solid curve corresponds to an objective aperture radius of 0.65 Å⁻¹ and an outer detector radius of 0.25 Å⁻¹, being optimal for a silicon [100] crystal. The dashed curve corresponds to an objective aperture radius of 0.28 Å⁻¹ and an outer detector radius of 0.22 Å⁻¹, being optimal for an isolated silicon [100] column. The dotted curve corresponds to an objective aperture radius of 0.65 Å⁻¹ and an outer detector radius of 0.55 Å⁻¹. Furthermore, for all curves, the spherical aberration constant is set at 0.5 mm. From the comparison of the dashed curve with the solid and dotted curve, it follows that the width of the peak decreases by increasing the objective aperture radius, that is, by narrowing the probe. In the presence of neighboring silicon [100] columns, for which the distance between a column and its nearest neighbor is equal to 1.92 Å, this increase of the objective aperture radius prevents columns from strongly overlapping in the image. This has a favorable effect on the precision. Furthermore, from the comparison of the solid and dotted curve, it follows that by decreasing the detector radius, the contrast improves although at the expense of the number of detected electrons. This makes plausible that, for a silicon crystal, the optimal objective aperture radius increases and that the optimal outer radius of an axial detector is much smaller than the objective aperture radius as compared to the optimal settings for an isolated silicon column.

Figure 47. Intersection of the two-dimensional, radially symmetric expectation model of the observations made on an isolated silicon [100] atom column and a plane through its radial axis. An axial detector is chosen. The solid curve corresponds to the optimal aperture and detector settings for a silicon crystal; the dashed curve corresponds to the optimal settings for a single, isolated silicon column; the dotted curve corresponds to non-optimal settings.

D. Conclusions

Conventionally, the design of a STEM experiment is optimized in terms of direct visual interpretability. However, quantitative STEM experiments aim at the highest precision with which the positions of the atom columns may be estimated. Since this is a different purpose, the design has been


reconsidered. The obvious optimality criterion is the attainable precision with which atom column positions can be estimated. An expression for the attainable statistical precision has been derived from a parametric statistical model of the observations. The expectations of the observations have been described by means of the channelling theory, whereas the fluctuations of the observations have been described by means of Poisson statistics. The obtained expression has been used to evaluate and optimize the design of quantitative STEM experiments. From this analysis, the following conclusions have been obtained:

• The optimal objective aperture radius is mainly determined by the object under study. For isolated atom columns, it has been found that it is proportional to the width of the Fourier transform of the 1s-state of the column under study. This means that the optimal aperture radius is larger for heavy than for light columns. Consequently, the probe is narrower for heavy than for light atom columns. However, in the presence of neighboring columns, it has been found that the optimal aperture radius may increase if the optimal aperture radius for isolated columns leads to strong overlap of neighboring columns in the image.
• The optimal inner radius of an annular detector equals the optimal aperture radius.
• For isolated atom columns, the optimal outer radius of an axial detector is slightly smaller than the optimal aperture radius. However, if this detector leads to very low contrast of the image in the presence of neighboring atom columns, the optimal detector radius decreases.
• Usually, an annular detector results in higher precisions than an axial detector and is therefore preferred, although there are exceptions.
• The optimal defocus value is approximately given by Eq. (160). It is determined by the optimal aperture radius, the spherical aberration constant, and the electron wavelength.
• Strictly speaking, the optimal spherical aberration constant is found to be equal to 0 mm. The precision that is gained by reducing spherical aberration depends on the object under study. Usually, this is only marginal.
• The reduced brightness of the electron source is preferably as high as possible if the experiment is limited by specimen drift.
• Improvements of the mechanical stability of the specimen holder, providing longer recording times, are beneficial in terms of precision, especially if the experiment is limited by specimen drift.
• The optimal width of the source image is of the same order of magnitude as the size of the diffraction-error disc if the experiment is limited by specimen drift.


VI. Discussion and Conclusions

A method has been proposed for optimizing the design of quantitative atomic resolution transmission electron microscopy experiments. The obvious optimality criterion has been shown to be the attainable precision with which structure parameters, the atom or atom column positions in particular, can be estimated. This precision can be adequately quantified in the form of the so-called CRLB, which is a lower bound on the variance of the parameter estimates. Minimization of the CRLB as a function of the microscope settings, under the existing physical constraints, results in the optimal statistical experimental design. The constraints are either the radiation sensitivity of the object or the specimen drift. Therefore, either the incident electron dose per square ångström (that is, the amount of electrons per square ångström that interact with the object during the experiment) or the recording time is constrained in the optimization.

The attainable precision with which position and distance parameters of one or two components can be estimated has been investigated. This has been done for two- and three-dimensional components. For two-dimensional components, the observations consist of counting events in a two-dimensional pixel array, whereas for three-dimensional components, they consist of counting events in a set of two-dimensional pixel arrays, which is obtained by rotating these components about a rotation axis. These examples may be regarded as simulations of high-resolution conventional or scanning transmission electron microscopy and electron tomography experiments, respectively. The model describing the expectations of the observations made on these components, the expectation model, has been assumed to consist of Gaussian peaks with unknown position. Under this assumption, the CRLB, which is usually calculated numerically, is given by a simple rule of thumb in closed analytical form. Although the expectation models of images obtained in practice are usually of higher complexity than Gaussian peaks, the rules of thumb are suitable to give insight into statistical experimental design for quantitative atomic resolution transmission electron microscopy. For two- and three-dimensional, one and two component expectation models, the expressions show how the attainable precision depends on the width of the point spread function, the width of the components, and the number of detected counts. Furthermore, for two- and three-dimensional, two component expectation models, the attainable precision also depends on the distance between the components. Particularly for three-dimensional, two component expectation models, it is a function of the orientation of the components with respect to the rotation axis as well. Generally, the precision may be improved by increasing the number of


detected counts or by narrowing the point spread function. However, below a certain width of the point spread function, the precision is limited by the intrinsic width of the components. Then, further narrowing of the point spread function is useless. Moreover, if a narrower point spread function is accompanied by a decrease of the number of detected electrons, both effects have to be weighed against each other under the existing physical constraints.

The following results have been derived from the minimization of the numerically calculated CRLB with respect to the microscope settings, assuming expectation models with a solid physical base, instead of Gaussian peaks. Using this procedure, the optimal statistical experimental designs of conventional and scanning transmission electron microscopy experiments have been derived. The obtained results may intuitively be interpreted using the rules of thumb for the CRLB, which have been obtained from Gaussian peaked expectation models.

For conventional transmission electron microscopy, it has been shown that a spherical and chromatic aberration corrector may improve the attainable precision. Correction makes most sense for low accelerating voltages and for objects consisting of heavy atom columns. However, it should be mentioned that the optimal spherical aberration constant is different from 0 mm for thin objects. Furthermore, increasing the reduced brightness of the electron source or improving the mechanical stability of specimen holders may improve the attainable precision considerably. Moreover, particularly for electron microscopes operating at intermediate accelerating voltages of about 300 kV, it has been found that a monochromator usually deteriorates the attainable precision if the experiment is limited by specimen drift, whereas it slightly improves the precision if the experiment is limited by the radiation sensitivity of the object. For electron microscopes operating at low accelerating voltages of about 50 kV, it has been shown that correction of the chromatic aberration by either a chromatic aberration corrector or a monochromator may improve the attainable precision significantly, although a chromatic aberration corrector would be preferred.

For scanning transmission electron microscopy, it has been shown that the optimal probe is not the narrowest possible. The optimal objective aperture radius, which determines the size of a diffraction-limited probe, has been found to be mainly determined by the object under study. More specifically, for isolated atom columns, it is proportional to the potential depth of the atom column, that is, the difference between the maximum and minimum potential energy. However, in the presence of neighboring columns, it has been found that the optimal aperture radius may increase so as to avoid strong overlap of neighboring columns in the image. In the evaluation of the experimental design, annular and axial detector types have


been compared. Usually, an annular detector may result in higher precisions than an axial one, but there are exceptions. The optimal inner radius of an annular detector has been found to be equal to the optimal objective aperture radius. The optimal outer radius of an axial detector is usually slightly smaller than the optimal aperture radius. However, if this detector leads to very low contrast of the image, the optimal detector radius decreases. Moreover, a spherical aberration corrector improves the precision. However, the accompanying gain, which depends on the object under study, may be disappointing. As for conventional transmission electron microscopy, the reduced brightness of the electron source is preferably as high as possible and the specimen holder as stable as possible, especially if the experiment is limited by specimen drift.

In this article, statistical experimental design has been used to discover the theoretical limits to quantitative atomic resolution transmission electron microscopy. This limit has been shown to be determined by the highest attainable precision with which atom or atom column positions can be estimated. Statistical experimental design allows one to find the microscope settings resulting in the highest attainable precision. Most important is that it provides the electron microscopist with insight into which precision may be obtained at which microscope settings. Thus, it shows the possible benefit of the optimal settings compared to the usual settings. Then, the electron microscopist may decide if it is advantageous to modify these usual settings.

Acknowledgments

The research of Dr. A. J. den Dekker has been made possible by a fellowship of the Royal Netherlands Academy of Arts and Sciences. S. van Aert is a Postdoctoral Fellow of the F.W.O. (Fund for Scientific Research, Flanders, Belgium).

Appendix A

In this appendix, the approximations, described by Eqs. (65), (66), and (67) of Section III, for the highest attainable precision with which position coordinates of one isolated component or the distance between two components of a two-dimensional object can be estimated from a dark-field imaging experiment are derived. First, Eqs. (65) and (67), which describe the CRLB on the variance of the position coordinate of one isolated component and on the variance of the


distance between two non-overlapping components, respectively, will be proven. If the pixel sizes $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, which is described by Eq. (29), it can be shown that the non-diagonal elements of the Fisher information matrix $F$ associated with the position coordinates, which is approximated by the right-hand members of Eqs. (45) and (47) for one and two components, respectively, are approximately equal to zero. The reason for this is that the components do not overlap and that the image intensity distribution of each component has rotational symmetry. Moreover, the diagonal elements are nonzero and equal to one another. For example, the first diagonal element of $F$ associated with the position coordinate $\beta_{x1}$ is calculated explicitly as follows. From Eq. (12), it follows that:

$$F_{11} = \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \frac{\partial \lambda_{kl}}{\partial \beta_{x1}} \frac{\partial \lambda_{kl}}{\partial \beta_{x1}}, \tag{162}$$

where $\lambda_{kl}$ is given by Eq. (32). In this expression, Eqs. (28), (29), and (31) are substituted. Moreover, since the components do not overlap, the denominator $\lambda_{kl}$ of Eq. (162) is approximated by $N_p G(x_k - \beta_{x1}, y_l - \beta_{y1}) \Delta x \Delta y$. In the thus obtained expression, the sums are approximated by integrals since $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak. This results in:

$$F_{11} \approx \frac{N_p}{\rho^2}. \tag{163}$$
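The step from the pixel sum of Eq. (162) to the closed form of Eq. (163) can be checked numerically. The sketch below is only an illustration under stated assumptions: the peak $G$ is taken to be a normalized two-dimensional Gaussian of width $\rho$, the expected counts are taken as $\lambda_{kl} = N_p G \Delta x \Delta y$ for a single component at the origin, and the function names, pixel size, and field size are hypothetical choices that do not come from the article.

```python
import math


def gaussian_peak(x, y, rho):
    """Normalized two-dimensional Gaussian of width rho (assumed form of G)."""
    return math.exp(-(x * x + y * y) / (2.0 * rho * rho)) / (2.0 * math.pi * rho * rho)


def f11_dark_field(n_p, rho, pixel, half_width):
    """Pixel sum of Eq. (162) for one isolated component centred at the origin."""
    n = int(round(2.0 * half_width / pixel))
    area = pixel * pixel
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * pixel          # pixel centre x_k
        for l in range(n):
            y = -half_width + (l + 0.5) * pixel      # pixel centre y_l
            g = gaussian_peak(x, y, rho)
            lam = n_p * g * area                     # expected counts lambda_kl
            dlam = n_p * (x / rho ** 2) * g * area   # d lambda_kl / d beta_x1
            total += dlam * dlam / lam
    return total


n_p, rho = 1000.0, 1.0
numeric = f11_dark_field(n_p, rho, pixel=0.1, half_width=6.0)
closed_form = n_p / rho ** 2                         # right-hand side of Eq. (163)
print(numeric, closed_form)
```

With pixels that are small compared to $\rho$ and a field that extends several widths beyond the peak, the two printed numbers should agree closely; coarser pixels make the sum-to-integral step less accurate.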

An analogous reasoning results in the same approximation for the other diagonal element $F_{33}$ associated with the y-coordinates of the position of two components. Substitution of these approximations into Eqs. (46) and (51) results in Eqs. (65) and (67), respectively.

Second, Eq. (66), which describes the CRLB on the variance of the distance between two overlapping components, will be proven. For that purpose, the differences of the elements of the Fisher information matrix $F$ associated with the position coordinates $\boldsymbol{\beta} = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2})^T$, that is, $F_{11} - F_{12}$, $F_{33} - F_{34}$, and $F_{13} - F_{14}$, are calculated explicitly. From Eq. (12), it follows that

$$F_{11} - F_{12} \approx \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \left( \frac{\partial \lambda_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \right)^2, \tag{164}$$

where it has been taken into account that $F_{11} \approx F_{22}$ and $F_{12} = F_{21}$. In the following calculations, the differences between the coordinates of the two components and the sums of these coordinates are needed. In order to


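simplify the notation, the parameters $\mathbf{a}$ are introduced. The elements of the parameter vector $\mathbf{a} = (a_1\ a_2\ a_3\ a_4)^T$ are defined as:

$$a_1 = \beta_{x1} - \beta_{x2}, \quad a_2 = \beta_{y1} - \beta_{y2}, \quad a_3 = \beta_{x1} + \beta_{x2}, \quad a_4 = \beta_{y1} + \beta_{y2}. \tag{165}$$

Using Eqs. (28), (31), (32), and (165), the derivatives $\partial \lambda_{kl} / \partial \beta_{x1}$ and $\partial \lambda_{kl} / \partial \beta_{x2}$ are written as:

$$\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} \approx -N_p \frac{\partial G\left(x_k - (a_1 + a_3)/2,\ y_l - (a_2 + a_4)/2\right)}{\partial x_k} \Delta x \Delta y, \tag{166}$$

$$\frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \approx -N_p \frac{\partial G\left(x_k - (a_3 - a_1)/2,\ y_l - (a_4 - a_2)/2\right)}{\partial x_k} \Delta x \Delta y. \tag{167}$$

Since the distance $d$ is assumed to be small compared to the width $\rho$ of the Gaussian peak, Eqs. (166) and (167) are Taylor expanded about $a_1 = 0$ and $a_2 = 0$ as follows:

$$\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} \approx -N_p \left[ \frac{\partial G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k} - \frac{a_1}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k^2} - \frac{a_2}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l \partial x_k} \right] \Delta x \Delta y, \tag{168}$$

$$\frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \approx -N_p \left[ \frac{\partial G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k} + \frac{a_1}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k^2} + \frac{a_2}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l \partial x_k} \right] \Delta x \Delta y. \tag{169}$$

Next, Eqs. (164), (168), and (169) are combined and, since the two components are assumed to overlap nearly completely, the denominator $\lambda_{kl}$ of Eq. (164) is approximated by $2 N_p G(x_k - a_3/2, y_l - a_4/2) \Delta x \Delta y$. This results in:

$$F_{11} - F_{12} \approx \frac{N_p \Delta x \Delta y}{4} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{\left[ a_1 \dfrac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k^2} + a_2 \dfrac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l \partial x_k} \right]^2}{G(x_k - a_3/2, y_l - a_4/2)}. \tag{170}$$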


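The next step is to substitute Eq. (29) into Eq. (170). Then, the sums may be approximated by integrals if it is assumed that $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak. Straightforward calculations result in:

$$F_{11} - F_{12} \approx \frac{N_p}{4 \rho^4} \left( 2 a_1^2 + a_2^2 \right). \tag{171}$$

An analogous reasoning yields:

$$F_{33} - F_{34} \approx \frac{N_p}{4 \rho^4} \left( 2 a_2^2 + a_1^2 \right). \tag{172}$$

Next, it follows from Eq. (12) that

$$F_{13} - F_{14} \approx \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \left( \frac{\partial \lambda_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \right) \left( \frac{\partial \lambda_{kl}}{\partial \beta_{y1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{y2}} \right), \tag{173}$$

where it has been taken into account that $F_{13} \approx F_{24}$ and $F_{14} \approx F_{23}$. A derivation similar to that resulting in Eqs. (168) and (169) gives:

$$\frac{\partial \lambda_{kl}}{\partial \beta_{y1}} \approx -N_p \left[ \frac{\partial G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l} - \frac{a_1}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k \partial y_l} - \frac{a_2}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l^2} \right] \Delta x \Delta y, \tag{174}$$

$$\frac{\partial \lambda_{kl}}{\partial \beta_{y2}} \approx -N_p \left[ \frac{\partial G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l} + \frac{a_1}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k \partial y_l} + \frac{a_2}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l^2} \right] \Delta x \Delta y. \tag{175}$$

Next, Eqs. (29), (168), (169), and (173)–(175) are combined and the denominator $\lambda_{kl}$ of Eq. (173) is approximated by $2 N_p G(x_k - a_3/2, y_l - a_4/2) \Delta x \Delta y$ since the two components are assumed to overlap nearly completely. Moreover, if $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, the sums may be approximated by integrals. This results in:

$$F_{13} - F_{14} \approx \frac{N_p a_1 a_2}{4 \rho^4}. \tag{176}$$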
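The small-overlap approximations of Eqs. (171), (172), and (176) can be compared with a direct evaluation of the Fisher information sums. The following sketch makes several assumptions that are not taken from the article: the expected counts are modelled as $\lambda_{kl} = N_p (G_1 + G_2) \Delta x \Delta y$ with a normalized Gaussian $G$, the components sit symmetrically about the origin (so $a_3 = a_4 = 0$), and the function names and numerical settings are hypothetical.

```python
import math


def gaussian_peak(x, y, rho):
    """Normalized two-dimensional Gaussian of width rho (assumed form of G)."""
    return math.exp(-(x * x + y * y) / (2.0 * rho * rho)) / (2.0 * math.pi * rho * rho)


def dark_field_differences(n_p, rho, a1, a2, pixel, half_width):
    """Return (F11 - F12, F33 - F34, F13 - F14) from direct pixel sums."""
    bx = (0.5 * a1, -0.5 * a1)   # beta_x1, beta_x2 for a3 = 0
    by = (0.5 * a2, -0.5 * a2)   # beta_y1, beta_y2 for a4 = 0
    n = int(round(2.0 * half_width / pixel))
    area = pixel * pixel
    f11 = f12 = f33 = f34 = f13 = f14 = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * pixel
        for l in range(n):
            y = -half_width + (l + 0.5) * pixel
            g = [gaussian_peak(x - bx[i], y - by[i], rho) for i in range(2)]
            lam = n_p * (g[0] + g[1]) * area
            # Derivatives of lambda_kl with respect to beta_xi and beta_yi.
            dx = [n_p * (x - bx[i]) / rho ** 2 * g[i] * area for i in range(2)]
            dy = [n_p * (y - by[i]) / rho ** 2 * g[i] * area for i in range(2)]
            f11 += dx[0] * dx[0] / lam
            f12 += dx[0] * dx[1] / lam
            f33 += dy[0] * dy[0] / lam
            f34 += dy[0] * dy[1] / lam
            f13 += dx[0] * dy[0] / lam
            f14 += dx[0] * dy[1] / lam
    return f11 - f12, f33 - f34, f13 - f14


n_p, rho, a1, a2 = 1000.0, 1.0, 0.06, 0.03
numeric = dark_field_differences(n_p, rho, a1, a2, pixel=0.1, half_width=6.0)
approx = (
    n_p / (4.0 * rho ** 4) * (2.0 * a1 ** 2 + a2 ** 2),   # Eq. (171)
    n_p / (4.0 * rho ** 4) * (2.0 * a2 ** 2 + a1 ** 2),   # Eq. (172)
    n_p * a1 * a2 / (4.0 * rho ** 4),                     # Eq. (176)
)
print(numeric, approx)
```

The agreement is expected to degrade as the distance between the components grows relative to $\rho$, since the Taylor expansion about $a_1 = a_2 = 0$ is then truncated too early.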


Finally, substitution of Eqs. (171), (172), and (176) into Eq. (51) results in Eq. (66).

Appendix B

In this appendix, the approximations described by Eqs. (68), (69), and (70) of Section III are derived. They give the highest attainable precision with which the position coordinates of one isolated component, or the distance between two components, of a two-dimensional object can be estimated from a bright-field imaging experiment. First, Eqs. (68) and (70), which describe the CRLB on the variance of the position coordinate of one isolated component and on the variance of the distance between two non-overlapping components, respectively, will be proven. If the pixel sizes $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, which is described by Eq. (29), it can be shown that the non-diagonal elements of the Fisher information matrix $F$ associated with the position coordinates, which is approximated by the right-hand members of Eqs. (45) and (47) for one and two components, respectively, are approximately equal to zero. The reason for this is that the components do not overlap and that the image intensity distribution of each component has rotational symmetry. Moreover, the diagonal elements are nonzero and equal to one another. For example, the first diagonal element of $F$ is calculated explicitly as follows. From Eq. (12), it follows that:

$$F_{11} = \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \frac{\partial \lambda_{kl}}{\partial \beta_{x1}} \frac{\partial \lambda_{kl}}{\partial \beta_{x1}}, \tag{177}$$

where $\lambda_{kl}$ is given by Eq. (35). In this expression, Eqs. (33) and (34) are substituted. Moreover, it is assumed that the number of interacting electrons is much smaller than the number of noninteracting electrons. Hence, the term '$n_c \Omega g_{DF}(x, y; \boldsymbol{\beta})$' of Eq. (33) is assumed to be much smaller than the term '1'. Therefore, the denominator $\lambda_{kl}$ of Eq. (177) may be approximated by $N \Delta x \Delta y / (\mathrm{FOV} - n_c \Omega)$. This results in:

$$F_{11} \approx \frac{N (n_c \Omega)^2}{\mathrm{FOV} - n_c \Omega} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{\partial g_{DF}(x_k, y_l; \boldsymbol{\beta})}{\partial \beta_{x1}} \frac{\partial g_{DF}(x_k, y_l; \boldsymbol{\beta})}{\partial \beta_{x1}} \Delta x \Delta y. \tag{178}$$

In the thus obtained expression, Eqs. (28) and (29) are substituted. Moreover, the factor $N (n_c \Omega)^2 / (\mathrm{FOV} - n_c \Omega)$ is approximated by $N (n_c \Omega)^2 / \mathrm{FOV}$ since the number of interacting electrons is much smaller than the number of


noninteracting electrons. If $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, the sums may be approximated by integrals, resulting in:

$$F_{11} \approx \frac{N \Omega^2}{8 \pi \rho^4 \, \mathrm{FOV}}. \tag{179}$$
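A numerical check of Eq. (179) may help to see how the two bright-field approximations enter. The sketch below rests on assumptions that are not stated in this form in the appendix: a single component ($n_c = 1$) at the centre of a square field of view, expected counts modelled as $\lambda_{kl} = N (1 - \Omega G) \Delta x \Delta y / (\mathrm{FOV} - \Omega)$ with a normalized Gaussian $G$, and hypothetical function names and parameter values.

```python
import math


def gaussian_peak(x, y, rho):
    """Normalized two-dimensional Gaussian of width rho (assumed form of G)."""
    return math.exp(-(x * x + y * y) / (2.0 * rho * rho)) / (2.0 * math.pi * rho * rho)


def f11_bright_field(n_total, omega, rho, pixel, half_width):
    """Pixel sum of Eq. (177) for one component at the centre of a square field."""
    fov = (2.0 * half_width) ** 2                 # area of the field of view
    scale = n_total * pixel * pixel / (fov - omega)
    n = int(round(2.0 * half_width / pixel))
    total = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * pixel
        for l in range(n):
            y = -half_width + (l + 0.5) * pixel
            g = gaussian_peak(x, y, rho)
            lam = scale * (1.0 - omega * g)               # expected counts lambda_kl
            dlam = -scale * omega * (x / rho ** 2) * g    # d lambda_kl / d beta_x1
            total += dlam * dlam / lam
    return total, fov


n_total, omega, rho = 1.0e6, 0.05, 1.0
numeric, fov = f11_bright_field(n_total, omega, rho, pixel=0.1, half_width=6.0)
closed_form = n_total * omega ** 2 / (8.0 * math.pi * rho ** 4 * fov)   # Eq. (179)
print(numeric, closed_form)
```

The small residual difference between the two numbers reflects the neglected terms $\Omega G$ and $n_c \Omega / \mathrm{FOV}$, which is why the approximation is tied to the assumption that few electrons interact.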

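An analogous reasoning results in the same approximation for the other diagonal element $F_{33}$ associated with the y-coordinates of the position of two components. Substitution of these approximations into Eqs. (46) and (51) results in Eqs. (68) and (70), respectively.

Second, Eq. (69), which describes the CRLB on the variance of the distance between two overlapping components, will be proven. For that purpose, the differences of the elements of the Fisher information matrix $F$ associated with the position coordinates $\boldsymbol{\beta} = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2})^T$, that is, $F_{11} - F_{12}$, $F_{33} - F_{34}$, and $F_{13} - F_{14}$, are calculated explicitly. From Eq. (12), it follows that

$$F_{11} - F_{12} \approx \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \left( \frac{\partial \lambda_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \right)^2, \tag{180}$$

where it has been taken into account that $F_{11} \approx F_{22}$ and $F_{12} = F_{21}$. Using Eqs. (28), (33)–(35), and (165), the derivatives $\partial \lambda_{kl} / \partial \beta_{x1}$ and $\partial \lambda_{kl} / \partial \beta_{x2}$ are written as:

$$\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} \approx \frac{N \Omega \Delta x \Delta y}{\mathrm{FOV} - n_c \Omega} \frac{\partial G\left(x_k - (a_1 + a_3)/2,\ y_l - (a_2 + a_4)/2\right)}{\partial x_k}, \tag{181}$$

$$\frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \approx \frac{N \Omega \Delta x \Delta y}{\mathrm{FOV} - n_c \Omega} \frac{\partial G\left(x_k - (a_3 - a_1)/2,\ y_l - (a_4 - a_2)/2\right)}{\partial x_k}. \tag{182}$$

Since the distance $d$ is assumed to be small compared to the width $\rho$ of the Gaussian peak, Eqs. (181) and (182) are Taylor expanded about $a_1 = 0$ and $a_2 = 0$ as follows:

$$\frac{\partial \lambda_{kl}}{\partial \beta_{x1}} \approx \frac{N \Omega \Delta x \Delta y}{\mathrm{FOV} - n_c \Omega} \left[ \frac{\partial G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k} - \frac{a_1}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k^2} - \frac{a_2}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l \partial x_k} \right], \tag{183}$$

$$\frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \approx \frac{N \Omega \Delta x \Delta y}{\mathrm{FOV} - n_c \Omega} \left[ \frac{\partial G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k} + \frac{a_1}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k^2} + \frac{a_2}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l \partial x_k} \right]. \tag{184}$$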

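Eqs. (180), (183), and (184) are combined and the denominator $\lambda_{kl}$ of Eq. (180) is approximated by $N \Delta x \Delta y / (\mathrm{FOV} - n_c \Omega)$ since the term '$n_c \Omega g_{DF}(x, y; \boldsymbol{\beta})$' of Eq. (33) is assumed to be much smaller than the term '1'. This results in:

$$F_{11} - F_{12} \approx \frac{N \Omega^2 \Delta x \Delta y}{2 (\mathrm{FOV} - n_c \Omega)} \sum_{k=1}^{K} \sum_{l=1}^{L} \left[ a_1 \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k^2} + a_2 \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l \partial x_k} \right]^2. \tag{185}$$

The next step is to substitute Eq. (29) into Eq. (185). Moreover, $N \Omega^2 / (\mathrm{FOV} - n_c \Omega)$ is approximated by $N \Omega^2 / \mathrm{FOV}$ since the number of interacting electrons is much smaller than the number of noninteracting electrons. Then, if it is assumed that $\Delta x$ and $\Delta y$ are small compared to the width $\rho$ of the Gaussian peak, the sums may be approximated by integrals. Straightforward calculations result in:

$$F_{11} - F_{12} \approx \frac{N \Omega^2}{32 \pi \rho^6 \, \mathrm{FOV}} \left( 3 a_1^2 + a_2^2 \right). \tag{186}$$

An analogous reasoning yields:

$$F_{33} - F_{34} \approx \frac{N \Omega^2}{32 \pi \rho^6 \, \mathrm{FOV}} \left( 3 a_2^2 + a_1^2 \right). \tag{187}$$

Next, it follows from Eq. (12) that

$$F_{13} - F_{14} \approx \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda_{kl}} \left( \frac{\partial \lambda_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{x2}} \right) \left( \frac{\partial \lambda_{kl}}{\partial \beta_{y1}} - \frac{\partial \lambda_{kl}}{\partial \beta_{y2}} \right), \tag{188}$$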

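where it has been taken into account that $F_{13} \approx F_{24}$ and $F_{14} \approx F_{23}$. A derivation similar to that resulting in Eqs. (183) and (184) gives:

$$\frac{\partial \lambda_{kl}}{\partial \beta_{y1}} \approx \frac{N \Omega \Delta x \Delta y}{\mathrm{FOV} - n_c \Omega} \left[ \frac{\partial G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l} - \frac{a_1}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k \partial y_l} - \frac{a_2}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l^2} \right], \tag{189}$$

$$\frac{\partial \lambda_{kl}}{\partial \beta_{y2}} \approx \frac{N \Omega \Delta x \Delta y}{\mathrm{FOV} - n_c \Omega} \left[ \frac{\partial G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l} + \frac{a_1}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial x_k \partial y_l} + \frac{a_2}{2} \frac{\partial^2 G(x_k - a_3/2, y_l - a_4/2)}{\partial y_l^2} \right]. \tag{190}$$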

Next, Eqs. (29), (183), (184), and (188)–(190) are combined and the denominator $\lambda_{kl}$ of Eq. (188) is approximated by $N \Delta x \Delta y / (\mathrm{FOV} - n_c \Omega)$ since the term '$n_c \Omega g_{DF}(x, y; \boldsymbol{\beta})$' of Eq. (33) is assumed to be much smaller than the term '1', which means that the number of interacting electrons is much smaller than the number of noninteracting electrons. For the same reason, $N \Omega^2 / (\mathrm{FOV} - n_c \Omega)$ is approximated by $N \Omega^2 / \mathrm{FOV}$. Moreover, if $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, the sums may be approximated by integrals. This results in:

$$F_{13} - F_{14} \approx \frac{N \Omega^2 a_1 a_2}{16 \pi \rho^6 \, \mathrm{FOV}}. \tag{191}$$
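The bright-field differences $F_{11} - F_{12}$, $F_{33} - F_{34}$, and $F_{13} - F_{14}$ can also be evaluated directly from pixel sums and compared with the small-overlap closed forms. The sketch below is a hedged illustration: it assumes two components placed symmetrically about the centre of a square field of view, expected counts modelled as $\lambda_{kl} = N [1 - \Omega (G_1 + G_2)] \Delta x \Delta y / (\mathrm{FOV} - 2 \Omega)$ with a normalized Gaussian $G$, and hypothetical names and values. The cross term is compared against $N \Omega^2 a_1 a_2 / (16 \pi \rho^6 \mathrm{FOV})$, the power of $\rho$ that is dimensionally consistent with Eqs. (186) and (187); a width different from 1 is used so that the power of $\rho$ matters.

```python
import math


def gaussian_peak(x, y, rho):
    """Normalized two-dimensional Gaussian of width rho (assumed form of G)."""
    return math.exp(-(x * x + y * y) / (2.0 * rho * rho)) / (2.0 * math.pi * rho * rho)


def bright_field_differences(n_total, omega, rho, a1, a2, pixel, half_width):
    """Return (F11 - F12, F33 - F34, F13 - F14, FOV) from direct pixel sums."""
    fov = (2.0 * half_width) ** 2
    scale = n_total * pixel * pixel / (fov - 2.0 * omega)   # n_c = 2 components
    bx = (0.5 * a1, -0.5 * a1)
    by = (0.5 * a2, -0.5 * a2)
    n = int(round(2.0 * half_width / pixel))
    f11 = f12 = f33 = f34 = f13 = f14 = 0.0
    for k in range(n):
        x = -half_width + (k + 0.5) * pixel
        for l in range(n):
            y = -half_width + (l + 0.5) * pixel
            g = [gaussian_peak(x - bx[i], y - by[i], rho) for i in range(2)]
            lam = scale * (1.0 - omega * (g[0] + g[1]))
            dx = [-scale * omega * (x - bx[i]) / rho ** 2 * g[i] for i in range(2)]
            dy = [-scale * omega * (y - by[i]) / rho ** 2 * g[i] for i in range(2)]
            f11 += dx[0] * dx[0] / lam
            f12 += dx[0] * dx[1] / lam
            f33 += dy[0] * dy[0] / lam
            f34 += dy[0] * dy[1] / lam
            f13 += dx[0] * dy[0] / lam
            f14 += dx[0] * dy[1] / lam
    return f11 - f12, f33 - f34, f13 - f14, fov


n_total, omega, rho, a1, a2 = 1.0e6, 0.05, 2.0, 0.12, 0.06
d11, d33, d13, fov = bright_field_differences(n_total, omega, rho, a1, a2, 0.2, 12.0)
c = n_total * omega ** 2 / (32.0 * math.pi * rho ** 6 * fov)
print(d11, c * (3.0 * a1 ** 2 + a2 ** 2))   # compare with Eq. (186)
print(d33, c * (3.0 * a2 ** 2 + a1 ** 2))   # compare with Eq. (187)
print(d13, 2.0 * c * a1 * a2)               # cross term, 1 / rho**6 scaling
```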

Finally, substitution of Eqs. (186), (187), and (191) into Eq. (51) results in Eq. (69).

Appendix C

In this appendix, the approximations described by Eqs. (73), (74), (80), and (81) of Section III are derived. They give the highest attainable precision with which the position coordinates of one isolated component, or the distance between two components, of a three-dimensional object can be estimated from a dark-field imaging tomography experiment. First, Eqs. (73)–(74) and (81), which describe the CRLB on the variance of the position coordinates of one isolated component and on the variance of the distance between two non-overlapping three-dimensional components,


respectively, will be proven. If the pixel sizes $\Delta x$ and $\Delta y$ are assumed to be small compared to the width $\rho$ of the Gaussian peak, which is described by Eq. (29), it can be shown that the non-diagonal elements of the Fisher information matrix $F$ associated with the position coordinates, which are given by Eqs. (71) and (75) for one and two components, respectively, are approximately equal to zero. The reason for this is that for most projections the components do not overlap and that the image intensity distribution of each projected component has rotational symmetry. Moreover, the diagonal elements are nonzero. For example, the first diagonal element of $F$ associated with the position coordinate $\beta_{x1}$ is calculated explicitly as follows. From Eq. (12), it follows that:

$$F_{11} = \sum_{j=1}^{J} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda^j_{kl}} \frac{\partial \lambda^j_{kl}}{\partial \beta_{x1}} \frac{\partial \lambda^j_{kl}}{\partial \beta_{x1}}, \tag{192}$$

where $\lambda^j_{kl}$ is given by Eq. (41). According to the chain rule for differentiation and from Eq. (38), it follows that Eq. (192) may be rewritten as:

$$F_{11} = \sum_{j=1}^{J} \cos^2 \theta^j \left[ \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda^j_{kl}} \frac{\partial \lambda^j_{kl}}{\partial \beta^j_{x1}} \frac{\partial \lambda^j_{kl}}{\partial \beta^j_{x1}} \right]. \tag{193}$$

In fact, the sum between square brackets has already been calculated in Appendix A, where a comparable sum and its approximation are given by Eqs. (162) and (163), respectively. This result is incorporated in Eq. (193) as follows. On the one hand, it follows from Eqs. (39)–(41) that $\lambda^j_{kl}$ is equal to:

$$\lambda^j_{kl} = \frac{n_c N_p}{J} g_{DF}\left(x_k, y_l; \boldsymbol{\beta}^j\right) \Delta x \Delta y, \tag{194}$$

where $\boldsymbol{\beta}^j = (\beta^j_{x1} \ldots \beta^j_{x n_c}\ \beta^j_{y1} \ldots \beta^j_{y n_c})^T$ is the $2 n_c$-dimensional parameter vector of projected position coordinates. On the other hand, it follows from Eqs. (31) and (32) that $\lambda_{kl}$, which was used for the calculation of Eq. (162), was equal to:

$$\lambda_{kl} = n_c N_p g_{DF}(x_k, y_l; \boldsymbol{\beta}) \Delta x \Delta y, \tag{195}$$

where $\boldsymbol{\beta} = (\beta_{x1} \ldots \beta_{x n_c}\ \beta_{y1} \ldots \beta_{y n_c})^T$ was the $2 n_c$-dimensional parameter vector of the two-dimensional components. From the comparison of Eq. (195) with Eq. (194), it follows that $\lambda_{kl}$ is equal to $J$ times $\lambda^j_{kl}$ if $\boldsymbol{\beta}$ of Eq. (195) is replaced by $\boldsymbol{\beta}^j$. Therefore, Eq. (162) is equal to $J$ times the sum between square brackets of Eq. (193) if $\boldsymbol{\beta}$ of Eq. (162) is replaced by $\boldsymbol{\beta}^j$. The approximation of Eq. (162), which was given by Eq. (163), was only valid if the two-dimensional components did not overlap. Hence, for the three-dimensional object this result is only valid for those projections $j$ where the


projected components do not overlap. This condition is fulfilled for most projections, but for some projections, the projected components may overlap, even if the three-dimensional components do not overlap. However, since it will be assumed that the total number of projections $J$ is large, the contribution of projections in Eq. (193) for which the condition is not fulfilled will be neglected. Therefore, the sum between square brackets of Eq. (193) is replaced by $1/J$ times the approximation given by Eq. (163). Thus, Eq. (193) is approximately equal to:

$$F_{11} \approx \sum_{j=1}^{J} \frac{1}{J} \cos^2 \theta^j \, \frac{N_p}{\rho^2}. \tag{196}$$

Next, it is assumed that the tilt angles $\theta^j$, $j = 1, \ldots, J$, are equidistantly located on the interval $(-\pi/2, \pi/2)$. Therefore, the difference $\Delta \theta$ between successive tilt angles is equal to:

$$\Delta \theta = \frac{\pi}{J}. \tag{197}$$

Thus, $1/J$ of Eq. (196) can be replaced by $\Delta \theta / \pi$. Furthermore, the difference $\Delta \theta$ between successive tilt angles is assumed to be small compared to the full angular tilt range $(-\pi/2, \pi/2)$, or in other words, the total number of projections $J$ is assumed to be large. Therefore, the sum $\sum_j \Delta \theta$ may be approximated by the integral $\int d\theta$. Straightforward calculations yield:

$$F_{11} \approx \frac{N_p}{2 \rho^2}. \tag{198}$$
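The replacement of the tilt sum in Eq. (196) by an integral can be illustrated with a few lines of code. This is a minimal sketch under an assumption that the article does not spell out: the $J$ equidistant tilt angles are placed at the midpoints of $J$ equal sub-intervals of $(-\pi/2, \pi/2)$. The function names are hypothetical.

```python
import math


def tilt_angles(num_projections):
    """Equidistant tilt angles on (-pi/2, pi/2), taken at sub-interval midpoints."""
    step = math.pi / num_projections                  # Delta theta of Eq. (197)
    return [-0.5 * math.pi + (j + 0.5) * step for j in range(num_projections)]


def f11_tomography(n_p, rho, num_projections):
    """Discrete tilt sum of Eq. (196)."""
    mean_cos2 = sum(math.cos(t) ** 2 for t in tilt_angles(num_projections)) / num_projections
    return mean_cos2 * n_p / rho ** 2


n_p, rho = 1000.0, 1.0
print(f11_tomography(n_p, rho, 180), n_p / (2.0 * rho ** 2))   # sum versus Eq. (198)
```

The factor 1/2 relative to the two-dimensional result of Eq. (163) is the average of $\cos^2 \theta$ over the tilt range: projections taken near $\pm \pi/2$ carry little information about the x-coordinate.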

By an analogous reasoning, it can be shown that the diagonal element of $F$ associated with the position coordinate $\beta_{y1}$, that is, $F_{22}$ of Eq. (71) or $F_{33}$ of Eq. (75), is approximately equal to:

$$\frac{N_p}{\rho^2}. \tag{199}$$

Moreover, the diagonal element of $F$ associated with the position coordinate $\beta_{z1}$, that is, $F_{33}$ of Eq. (71) or $F_{55}$ of Eq. (75), is approximated by:

$$\frac{N_p}{2 \rho^2}. \tag{200}$$

Next, Eqs. (198)–(200) are substituted into Eqs. (72) and (79). Then, Eq. (72) produces Eqs. (73) and (74). Finally, the following notion is taken into account. The distance $d'$ between the components projected onto the $(x, z)$-plane is equal to:

$$d' = \sqrt{(\beta_{x1} - \beta_{x2})^2 + (\beta_{z1} - \beta_{z2})^2} = d \sin \phi, \tag{201}$$

where $\phi$ is the angle between the rotation axis and the axis that connects the two components. The distance $d'$ and the angle $\phi$ are shown in Figure 3. Taking account of Eq. (201), Eq. (79) results in Eq. (81).

Second, Eq. (80), which describes the CRLB on the variance of the distance between two overlapping three-dimensional components, will be proven. For that purpose, the differences of the elements of the Fisher information matrix $F$ associated with the position coordinates $\boldsymbol{\beta} = (\beta_{x1}\ \beta_{x2}\ \beta_{y1}\ \beta_{y2}\ \beta_{z1}\ \beta_{z2})^T$, that is, $F_{11} - F_{12}$, $F_{33} - F_{34}$, $F_{55} - F_{56}$, $F_{13} - F_{14}$, $F_{15} - F_{16}$, and $F_{35} - F_{36}$, are calculated explicitly. From Eq. (12), it follows that

$$F_{11} - F_{12} \approx \frac{1}{2} \sum_{j=1}^{J} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda^j_{kl}} \left( \frac{\partial \lambda^j_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda^j_{kl}}{\partial \beta_{x2}} \right)^2, \tag{202}$$

where it has been taken into account that $F_{11} \approx F_{22}$ and $F_{12} = F_{21}$. According to the chain rule for differentiation and from Eq. (38), it follows that Eq. (202) may be rewritten as:

$$F_{11} - F_{12} \approx \sum_{j=1}^{J} \cos^2 \theta^j \left[ \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda^j_{kl}} \left( \frac{\partial \lambda^j_{kl}}{\partial \beta^j_{x1}} - \frac{\partial \lambda^j_{kl}}{\partial \beta^j_{x2}} \right)^2 \right]. \tag{203}$$

Following the same reasoning as in the derivation of Eq. (196) from Eq. (193), it can be shown that the sum between square brackets of Eq. (203) may be directly derived from the result of Eq. (164) of Appendix A, which is given by Eq. (171). This results in:

$$F_{11} - F_{12} \approx \sum_{j=1}^{J} \frac{1}{J} \cos^2 \theta^j \, \frac{N_p}{4 \rho^4} \left[ 2 \left( \beta^j_{x1} - \beta^j_{x2} \right)^2 + \left( \beta^j_{y1} - \beta^j_{y2} \right)^2 \right]. \tag{204}$$

In order to simplify the notation, the parameters $\mathbf{a}$ are introduced. The elements of the parameter vector $\mathbf{a} = (a_1\ a_2\ a_3\ a_4\ a_5\ a_6)^T$ are defined as:

$$a_1 = \beta_{x1} - \beta_{x2}, \quad a_2 = \beta_{y1} - \beta_{y2}, \quad a_3 = \beta_{z1} - \beta_{z2}, \quad a_4 = \beta_{x1} + \beta_{x2}, \quad a_5 = \beta_{y1} + \beta_{y2}, \quad a_6 = \beta_{z1} + \beta_{z2}. \tag{205}$$

Next, Eqs. (38) and (205) are substituted into Eq. (204), resulting in:


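$$F_{11} - F_{12} \approx \sum_{j=1}^{J} \frac{1}{J} \cos^2 \theta^j \, \frac{N_p}{4 \rho^4} \left[ 2 \left( a_1 \cos \theta^j + a_3 \sin \theta^j \right)^2 + a_2^2 \right]. \tag{206}$$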

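In this expression, the sum is approximated by an integral since the difference $\Delta \theta$ between successive tilt angles is assumed to be small compared to the full angular tilt range $(-\pi/2, \pi/2)$. This results in:

$$F_{11} - F_{12} \approx \frac{N_p}{8 \rho^4} \left( \frac{3}{2} a_1^2 + a_2^2 + \frac{1}{2} a_3^2 \right). \tag{207}$$

Similar reasonings result in:

$$F_{33} - F_{34} \approx \frac{N_p}{4 \rho^4} \left( \frac{1}{2} a_1^2 + 2 a_2^2 + \frac{1}{2} a_3^2 \right) \tag{208}$$

and

$$F_{55} - F_{56} \approx \frac{N_p}{8 \rho^4} \left( \frac{1}{2} a_1^2 + a_2^2 + \frac{3}{2} a_3^2 \right). \tag{209}$$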
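The tilt averages behind Eqs. (207)–(209) can be reproduced numerically from sums of the form of Eq. (206). The sketch below is illustrative only. The summand for $F_{11} - F_{12}$ follows Eq. (206); the summands used for $F_{33} - F_{34}$ and $F_{55} - F_{56}$ are constructed here by analogy (no $\cos^2$ weight for the y-coordinate, a $\sin^2$ weight for the z-coordinate) and are an assumption, as are the midpoint placement of the tilt angles and all function names.

```python
import math


def tilt_angles(num_projections):
    """Equidistant tilt angles on (-pi/2, pi/2), taken at sub-interval midpoints."""
    step = math.pi / num_projections
    return [-0.5 * math.pi + (j + 0.5) * step for j in range(num_projections)]


def tilt_sums(n_p, rho, a1, a2, a3, num_projections):
    """Discrete tilt sums for F11 - F12, F33 - F34 and F55 - F56."""
    c = n_p / (4.0 * rho ** 4) / num_projections
    d11 = d33 = d55 = 0.0
    for theta in tilt_angles(num_projections):
        ax = a1 * math.cos(theta) + a3 * math.sin(theta)   # projected x-difference
        d11 += math.cos(theta) ** 2 * (2.0 * ax ** 2 + a2 ** 2)   # as in Eq. (206)
        d33 += 2.0 * a2 ** 2 + ax ** 2                            # assumed analogue
        d55 += math.sin(theta) ** 2 * (2.0 * ax ** 2 + a2 ** 2)   # assumed analogue
    return c * d11, c * d33, c * d55


def closed_forms(n_p, rho, a1, a2, a3):
    """Right-hand sides of Eqs. (207), (208) and (209)."""
    return (
        n_p / (8.0 * rho ** 4) * (1.5 * a1 ** 2 + a2 ** 2 + 0.5 * a3 ** 2),
        n_p / (4.0 * rho ** 4) * (0.5 * a1 ** 2 + 2.0 * a2 ** 2 + 0.5 * a3 ** 2),
        n_p / (8.0 * rho ** 4) * (0.5 * a1 ** 2 + a2 ** 2 + 1.5 * a3 ** 2),
    )


args = (1000.0, 1.0, 0.05, 0.02, 0.03)
print(tilt_sums(*args, num_projections=90))
print(closed_forms(*args))
```

For equidistant angles the discrete sums already match the integrals closely for moderate $J$, so the "large $J$" condition matters mainly for neglecting the projections in which the components overlap.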

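Next, it follows from Eq. (12) that

$$F_{13} - F_{14} \approx \frac{1}{2} \sum_{j=1}^{J} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda^j_{kl}} \left( \frac{\partial \lambda^j_{kl}}{\partial \beta_{x1}} - \frac{\partial \lambda^j_{kl}}{\partial \beta_{x2}} \right) \left( \frac{\partial \lambda^j_{kl}}{\partial \beta_{y1}} - \frac{\partial \lambda^j_{kl}}{\partial \beta_{y2}} \right), \tag{210}$$

where it has been taken into account that $F_{13} \approx F_{24}$ and $F_{14} \approx F_{23}$. According to the chain rule for differentiation and from Eq. (38), it follows that Eq. (210) may be rewritten as:

$$F_{13} - F_{14} \approx \sum_{j=1}^{J} \cos \theta^j \left[ \frac{1}{2} \sum_{k=1}^{K} \sum_{l=1}^{L} \frac{1}{\lambda^j_{kl}} \left( \frac{\partial \lambda^j_{kl}}{\partial \beta^j_{x1}} - \frac{\partial \lambda^j_{kl}}{\partial \beta^j_{x2}} \right) \left( \frac{\partial \lambda^j_{kl}}{\partial \beta^j_{y1}} - \frac{\partial \lambda^j_{kl}}{\partial \beta^j_{y2}} \right) \right]. \tag{211}$$

Following the same reasoning as in the derivation of Eq. (196) from Eq. (193), it can be shown that the sum between square brackets of Eq. (211) may be directly derived from the result of Eq. (173) of Appendix A, which is given by Eq. (176). This results in:

$$F_{13} - F_{14} \approx \sum_{j=1}^{J} \frac{1}{J} \cos \theta^j \left[ \frac{N_p \left( \beta^j_{x1} - \beta^j_{x2} \right) \left( \beta^j_{y1} - \beta^j_{y2} \right)}{4 \rho^4} \right]. \tag{212}$$

Next, Eqs. (38) and (205) are substituted into Eq. (212), resulting in:

$$F_{13} - F_{14} \approx \sum_{j=1}^{J} \frac{1}{J} \cos \theta^j \left[ \frac{N_p \left( a_1 \cos \theta^j + a_3 \sin \theta^j \right) a_2}{4 \rho^4} \right]. \tag{213}$$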


In this expression, the sum is approximated by an integral since the difference $\Delta \theta$ between successive tilt angles is assumed to be small compared to the full angular tilt range $(-\pi/2, \pi/2)$. This results in:

$$F_{13} - F_{14} \approx \frac{N_p a_1 a_2}{8 \rho^4}. \tag{214}$$

Similar reasonings yield:

$$F_{15} - F_{16} \approx \frac{N_p a_1 a_3}{8 \rho^4} \tag{215}$$

and

$$F_{35} - F_{36} \approx \frac{N_p a_2 a_3}{8 \rho^4}. \tag{216}$$

Next, Eqs. (207)–(209) and (214)–(216) are substituted in Eq. (79). Finally, it is noticed that the distance $a_2$ between the components projected onto the $(y, z)$-plane is equal to:

$$a_2 = d \cos \phi. \tag{217}$$

Taking account of Eqs. (201) and (217) results in Eq. (80).

References Barth, J. E., and Kruit, P. (1996). Addition of different contributions to the charged particle probe size. Optik 101, 101–109. Batson, P. E. (1999). Advanced spatially resolved EELS in the STEM. Ultramicroscopy 78, 33–42. Batson, P. E., Dellby, N., and Krivanek, O. L. (2002). Sub-a˚ ngstrom resolution using aberration corrected electron optics. Nature 418, 617–620. Bettens, E., van Dyck, D., den Dekker, A. J., Sijbers, J., and van den Bos, A. (1999). Modelbased two-object resolution from observations having counting statistics. Ultramicroscopy 77, 37–48. Bevington, P. R. (1969). Data Reduction and Error Analysis for the Physical Sciences. New York: McGraw-Hill. Born, M., and Wolf, E. (1999). Principles of Optics—Electromagnetic Theory of Propagation, Interference and Diffraction of Light. 7th (expanded) ed. Cambridge: Cambridge University Press. Broeckx, J., Op de Beeck, M., and van Dyck, D. (1995). A useful approximation of the exit wave function in coherent STEM. Ultramicroscopy 60, 71–80. Browning, N. D., Arslan, I., Moeck, P., and Topuria, T. (2001). Atomic resolution scanning transmission electron microscopy. Physica Status Solidi B 227, 229–245. Buseck, P. R., Cowley, J. M., and Eyring, L. (1988). High-Resolution Transmission Electron Microscopy and Associated Techniques. Oxford: Oxford University Press.

158

VAN AERT ET AL.

Cahn, R. W. (2001). The Coming of Materials Science. New York: Pergamon, chapter 5, pp. 187–210. Coene, W., and van Dyck, D. (1988). New aspects in nonlinear image processing for high resolution electron microscopy. Scanning Microscopy, Supplement 2, 117–129. Coene, W. M. J., Thust, A., Op de Beeck, M., and van Dyck, D. (1996). Maximum-likelihood method for focus-variation image reconstruction in high resolution transmission electron microscopy. Ultramicroscopy 64, 109–135. Conover, W. J. (1980). Practical Nonparametric Statistics, 2nd ed. New York: Wiley. Cowley, J. M. (1976). Scanning transmission electron microscopy of thin specimens. Ultramicroscopy 2, 3–16. Cowley, J. M. (1997). Scanning transmission electron microscopy, in Handbook of Microscopy—Applications in Materials Science, Solid-State Physics and Chemistry, Methods II, edited by S. Amelinckx, D. van Dyck, J. van Landuyt, and G. van Tendeloo. Weinheim: VCH, pp. 563–594. Cowley, J. M., and Moodie, A. F. (1957). The scattering of electrons by atoms and crystals. I. A new theoretical approach. Acta Crystallographica 10, 609–619. Crame´ r, H. (1999). Mathematical Methods of Statistics, 19th ed. Princeton: Princeton University Press. Crewe, A. V. (1997). The scanning transmission electron microscope, in Handbook of Charged Particle Optics, edited by J. Orloff. Boca Raton: CRC Press, pp. 401–427. de Jong, A. F., and van Dyck, D. (1993). Ultimate resolution and information in electron microscopy II. The information limit of transmission electron microscopes. Ultramicroscopy 49, 66–80. de Jonge, N., Lamy, Y., Schoots, K., and Oosterkamp, T. H. (2002). High brightness electron beam from a multi-walled carbon nanotube. Nature 420, 393–395. den Dekker, A. J., Sijbers, J., and van Dyck, D. (1999). How to optimize the design of a quantitative HREM experiment so as to attain the highest precision. Journal of Microscopy 194, 95–104. den Dekker, A. J., and van Aert, S. (2002). 
Quantitative high resolution electron microscopy and Fisher information, in Proceedings of the 15th International Congress on Electron Microscopy, Interdisciplinary and Technical Forum Abstracts 2002 in Durban, South Africa, edited by R. Cross. Vol. 3, Onderstepoort: Microscopy Society of Southern Africa, pp. 185–186. den Dekker, A. J., van Aert, S., van Dyck, D., and van den Bos, A. (2000). A quantitative evaluation of different STEM imaging modes, in Proceedings of the 12th European Congress on Electron Microscopy, Instrumentation and Methodology 2000 in Brno, Czech Republic, edited by P. Toma´ nek and R. Kolar˘´ık. Vol. 3, Brno: The Czechoslovak Society for Electron Microscopy, pp. 131–132. den Dekker, A. J., van Aert, S., van Dyck, D., van den Bos, A., and Geuens, P. (2000). Does a monochromator improve the precision in quantitative HRTEM? in Jaarboek Nederlandse Vereniging voor Microscopie 2000, including the proceedings of the Joint Meeting of the BVM and the NVvM 2000 in Papendal, Arnhem, The Netherlands, edited by H. K. Koerten. Rijnsburg: Press Point, pp. 138–140. den Dekker, A. J., van Aert, S., van Dyck, D., van den Bos, A., and Geuens, P. (2001). Does a monochromator improve the precision in quantitative HRTEM? Ultramicroscopy 89, 275–290. Fedorov, V. V. (1972). Theory of Optimal Experiments. New York: Academic Press. Fejes, P. L. (1977). Approximations for the calculation of high-resolution electron-microscope images of thin films. Acta Crystallographica A 33, 109–113. Frank, J. (1973). The envelope of electron microscopic transfer functions for partially coherent illumination. Optik 38, 519–536.

QUANTITATIVE ATOMIC RESOLUTION TEM

159

Frank, J. (1992). Electron Tomography—Three-Dimensional Imaging with the Transmission Electron Microscope. New York: Plenum Press. Frieden, B. R. (1998). Physics from Fisher Information—A Unification. Cambridge: Cambridge University Press. Fujita, H., and Sumida, N. (1994). Usefulness of electron microscopy, in Physics of New Materials, edited by F. E. Fujita. Berlin: Springer-Verlag, pp. 226–263. Gabor, D. (1948). A new microscopic principle. Nature 161, 777–778. Geuens, P., and van Dyck, D. (2002). The S-state model: A work horse for HRTEM. Ultramicroscopy 93, 179–198. Geuens, P., Chen, J. H., den Dekker, A. J., and van Dyck, D. (1999). An analytic expression in closed form for the electron exit wave. Acta Crystallographica A Supplement 55, Abstract P11.OE.002. Haider, M., Uhlemann, S., Schwan, E., Rose, H., Kabius, B., and Urban, K. (1998). Electron microscopy image enhanced. Nature 392, 768–769. Hartel, P., Rose, H., and Dinges, C. (1996). Conditions and reasons for incoherent imaging in STEM. Ultramicroscopy 63, 93–114. Henderson, R. (1995). The potential and limitations of neutrons, electrons and X-rays for atomic resolution microscopy of unstained biological molecules. Q. Rev. Biophys. 28, 171–193. Herrmann, K.-H. (1997). Image recording in microscopy, in Handbook of Microscopy— Applications in Materials Science, Solid-State Physics and Chemistry, Methods II, edited by S. Amelinckx, D. van Dyck, J. van Landuyt, and G. van Tendeloo. Weinheim: VCH, pp. 885–921. Hirsch, P. B., Howie, A., Nicholson, R. B., Pashley, D. W., and Whelan, M. J. (1965). Electron Microscopy of Thin Crystals. London: Butterworths. Howie, A. (1966). Diffraction channelling of fast electrons and positrons in crystals. Philosophical Magazine 14, 223–237. Howie, A. (1970). The theory of high energy electron diffraction, in Modern Diffraction and Imaging Techniques in Material Science, edited by S. Amelinckx, R. Gevers, G. Remaut, and J. van Landuyt. 
Amsterdam: North-Holland Publishing Company, pp. 295–339. International Centre for Diffraction Data. (2001). Release 2001 for the Powder Diffraction File. Pennsylvania: International Centre for Diffraction Data. (This is a software package.). Ishizuka, K. (1980). Contrast transfer of crystal images in TEM. Ultramicroscopy 5, 55–65. Kabius, B., Haider, M., Uhlemann, S., Schwan, E., Urban, K., and Rose, H. (2002). First application of a spherical-aberration corrected transmission electron microscope in materials science. J. Elect. Micro. Supplement 51, 51–58. Kambe, K., Lehmpfuhl, G., and Fujimoto, F. (1974). Interpretation of electron channeling by the dynamical theory of electron diffraction. Zeitschrift fu¨ r Naturforschung 29a, 1034–1044. Kilaas, R., and Gronsky, R. (1983). Real space image simulation in high resolution electron microscopy. Ultramicroscopy 11, 289–298. Kirkland, E. J. (1984). Improved high resolution image processing of bright field electron micrographs I. Theory. Ultramicroscopy 15, 151–172. Kirkland, E. J. (1998). Advanced Computing in Electron Microscopy. New York: Plenum Press. Kisielowski, C., Hetherington, C. J. D., Wang, Y. C., Kilaas, R., O’Keefe, M. A., and Thust, A. (2001). Imaging columns of the light elements carbon, nitrogen and oxygen with sub ˚ ngstrom resolution. Ultramicroscopy 89, 243–263. A Kisielowski, C., Principe, E., Freitag, B., and Hubert, D. (2001). Benefits of microscopy with super resolution. Physica B 308–310, 1090–1096. Krivanek, O. L., Dellby, N., and Nellist, P. D. (2002). Aberration correction in the STEM, in Proceedings of the 15th International Congress on Electron Microscopy, Interdisciplinary and

160

VAN AERT ET AL.

Technical Forum Abstracts 2002 in Durban, South Africa, edited by R. Cross. Vol. 3, Onderstepoort: Microscopy Society of Southern Africa, pp. 29–30. Kruit, P., and Jansen, G. H. (1997). Space charge and statistical Coulomb effects, in Handbook of Charged Particle Optics, edited by J. Orloff. Boca Raton: CRC Press, pp. 275–318. Lentzen, M., Jahnen, B., Jia, C. L., Thust, A., Tillmann, K., and Urban, K. (2002). Highresolution imaging with an aberration-corrected transmission electron microscope. Ultramicroscopy 92, 233–242. Lichte, H. (1991). Electron image plane off-axis holography of atomic structures, in Advances in Optical and Electron Microscopy, edited by T. Mulvey and C. J. R. Sheppard. Vol. 12, London: Academic Press, pp. 25–91. Mo¨ bus, G. R., Schweinfest, T., Gemming, T., Wagner, T., and Ru¨ hle, M. (1997). Iterative structure retrieval techniques in HREM: a comparative study and a modular program package. J. Microscopy 190, 109–130. Mood, A. M., Graybill, F. A., and Boes, D. C. (1974). Introduction to the Theory of Statistics, 3rd ed. Singapore: McGraw-Hill. Mook, H. W., and Kruit, P. (1999). Optics and design of the fringe field monochromator for a Schottky field emission gun. Nuclear Instruments and Methods in Physics Research A 427, 109–120. Mory, C., Tence, M., and Colliex, C. (1985). Theoretical study of the characteristics of the probe for a STEM with a field emission gun. J. Micro. Spect. Electro. 10, 381–387. Muller, D. A. (1998). Core level shifts and grain boundary cohesion, in Microscopy and Microanalysis, Proceedings Microscopy and Microanalysis 1998 in Atlanta, Georgia, edited by G. W. Bailey, K. B. Alexander, W. G. Jerome, M. G. Bond, and J. J. McCarthy. Vol. 4, Supplement 2, New York: Springer, pp. 766–767. Muller, D. A. (1999). Why changes in bond lengths and cohesion lead to core-level shifts in metals, and consequences for the spatial difference method. Ultramicroscopy 78, 163–174. Muller, D. A., and Mills, M. J. (1999). 
Electron microscopy: probing the atomic structure and chemistry of grain boundaries, interfaces and defects. Mat. Sci. Engin. A 260, 12–28. Murray, W. (1972). Numerical Methods for Unconstrained Optimization. London: Academic Press. Nalwa, H. S. (2002). Nanostructured Materials and Nanotechnology: Concise Edition. San Diego: Academic Press. Nellist, P. D., and Pennycook, S. J. (1998). Subangstrom resolution by underfocused incoherent transmission electron microscopy. Phys. Rev. Lett. 81, 4156–4159. Nellist, P. D., and Pennycook, S. J. (2000). The principles and interpretation of annular darkfield Z-contrast imaging, in Advances in Imaging and Electron Physics, edited by P. W. Hawkes. Vol. 113, San Diego: Academic Press, pp. 147–199. O’Keefe, M. A. (1992). ‘Resolution’ in high-resolution electron microscopy. Ultramicroscopy 47, 282–297. O’Keefe, M. A., Hetherington, C. J. D., Wang, Y. C., Nelson, E. C., Turner, J. H., Kisielowski, C., Malm, J.-O., Mueller, R., Ringnalda, J., Pan, M., and Thust, A. (2001). Sub˚ ngstrom high-resolution transmission electron microscopy at 300 keV. Ultramicroscopy A 89, 215–241. Olson, G. B. (1997). Computational design of hierarchically structured materials. Science 277, 1237–1242. Olson, G. B. (2000). Designing a new material world. Science 288, 993–998. Op de Beeck, M., and van Dyck, D. (1996). Direct structure reconstruction in HRTEM. Ultramicroscopy 64, 153–165. Papoulis, A. (1965). Probability, Random Variables, and Stochastic Processes. New York: McGraw-Hill.

QUANTITATIVE ATOMIC RESOLUTION TEM

161

Papoulis, A. (1968). Systems and Transforms with Applications in Optics. New York: McGraw-Hill.
Pázman, A. (1986). Foundations of Optimum Experimental Design. Dordrecht: D. Reidel Publishing Company.
Pennycook, S. J. (1997). Scanning transmission electron microscopy: Z contrast, in Handbook of Microscopy—Applications in Materials Science, Solid-State Physics and Chemistry, Methods II, edited by S. Amelinckx, D. van Dyck, J. van Landuyt, and G. van Tendeloo. Weinheim: VCH, pp. 595–620.
Pennycook, S. J., Rafferty, B., and Nellist, P. D. (2000). Z-contrast imaging in an aberration-corrected scanning transmission electron microscope. Microscopy and Microanalysis 6, 343–352.
Pennycook, S. J., and Jesson, D. E. (1991). High-resolution Z-contrast imaging of crystals. Ultramicroscopy 37, 14–38.
Pennycook, S. J., and Jesson, D. E. (1992). Atomic resolution Z-contrast imaging of interfaces. Acta Metallurgica et Materialia, Supplement 40, 149–159.
Pennycook, S. J., Jesson, D. E., Chisholm, M. F., Browning, N. D., McGibbon, A. J., and McGibbon, M. M. (1995). Z-contrast imaging in the scanning transmission electron microscope. J. Micro. Soc. Am. 1, 231–251.
Pennycook, S. J., and Yan, Y. (2001). Z-contrast imaging in the scanning transmission electron microscope, in Progress in Transmission Electron Microscopy 1—Concepts and Techniques, edited by X.-F. Zhang and Z. Zhang. Berlin: Springer-Verlag, pp. 81–111.
Phillipp, F., Höschen, R., Osaki, M., Möbus, G., and Rühle, M. (1994). New high-voltage atomic resolution microscope approaching 1 Å point resolution installed in Stuttgart. Ultramicroscopy 56, 1–10.
Rayleigh, Lord (1902). Wave theory of light, in Scientific Papers by John William Strutt, Baron Rayleigh. Vol. 3, Cambridge: Cambridge University Press, pp. 47–189.
Reed, M. A., and Tour, J. M. (2000). Computing with molecules. Sci. Am. 282, 68–75.
Reimer, L. (1984). Particle optics of electrons, in Transmission Electron Microscopy—Physics of Image Formation and Microanalysis. Berlin: Springer-Verlag, pp. 19–49.
Reimer, L. (1993). Elements of a transmission electron microscope, in Transmission Electron Microscopy—Physics of Image Formation and Microanalysis. Berlin: Springer-Verlag, pp. 86–135.
Rose, H. (1975). Zur Theorie der Bildentstehung im Elektronen-Mikroskop I. Optik 42, 217–244.
Rose, H. (1990). Outline of a spherically corrected semiaplanatic medium-voltage transmission electron microscope. Optik 85, 19–24.
Sato, M. (1997). Resolution, in Handbook of Charged Particle Optics, edited by J. Orloff. Boca Raton: CRC Press, pp. 319–361.
Sato, M., and Orloff, J. (1992). A new concept of theoretical resolution of an optical system, comparison with experiment and optimum condition for a point source. Ultramicroscopy 41, 181–192.
Saxton, W. O. (1978). Computer Techniques for Image Processing in Electron Microscopy. New York: Academic Press, chapter 9, pp. 236–248.
Saxton, W. O. (1997). Quantitative comparison of images and transforms. J. Microscopy 190, 52–60.
Scherzer, O. (1949). The theoretical resolution limit of the electron microscope. Journal of Applied Physics 20, 20–28.
Schiske, P. (1973). Image processing using additional statistical information about the object, in Image Processing and Computer-aided Design in Electron Optics, edited by P. W. Hawkes. London: Academic Press, pp. 82–90.

Sinkler, W., and Marks, L. D. (1999). A simple channelling model for HREM contrast transfer under dynamical conditions. J. Microscopy 194, 112–123.
Spence, J. C. H. (1988). Experimental High-Resolution Electron Microscopy, 2nd ed. New York: Oxford University Press.
Spence, J. C. H. (1999). The future of atomic resolution electron microscopy for materials science. Mat. Sci. Engin. R 26, 1–49.
Springborg, M. (2000). Methods of Electronic-Structure Calculations: From Molecules to Solids. Chichester: Wiley.
Stadelmann, P. A. (1987). EMS—A software package for electron diffraction analysis and HREM image simulation in materials science. Ultramicroscopy 21, 131–146.
Thust, A., and Jia, C. L. (2000). Advances in atomic structure determination using the focal-series reconstruction technique, in Proceedings of the 12th European Congress on Electron Microscopy, Instrumentation and Methodology 2000 in Brno, Czech Republic, edited by P. Tománek and R. Kolařík. Vol. 3, Brno: The Czechoslovak Society for Electron Microscopy, pp. 107–110.
Thust, A., Overwijk, M. H. F., Coene, W. M. J., and Lentzen, M. (1996). Numerical correction of lens aberrations in phase-retrieval HRTEM. Ultramicroscopy 64, 249–264.
Thust, A., Coene, W. M. J., Op de Beeck, M., and van Dyck, D. (1996). Focal-series reconstruction in HRTEM: simulation studies on non-periodic objects. Ultramicroscopy 64, 211–230.
Treacy, M. M. J. (1982). Optimising atomic number contrast in annular dark field images of thin films in the scanning transmission electron microscope. J. Micro. Spect. Electron. 7, 511–523.
van Aert, S., den Dekker, A. J., van den Bos, A., and van Dyck, D. (2002a). High-resolution electron microscopy: from imaging toward measuring. IEEE Trans. Instrument. Measure. 51, 611–615.
van Aert, S., den Dekker, A. J., van den Bos, A., and van Dyck, D. (2002b). The benefits of statistical experimental design for quantitative electron microscopy, in Proceedings of the 15th International Congress on Electron Microscopy, Interdisciplinary and Technical Forum Abstracts 2002 in Durban, South Africa, edited by R. Cross. Vol. 3, Onderstepoort: Microscopy Society of Southern Africa, pp. 189–190.
van Aert, S., den Dekker, A. J., van Dyck, D., and van den Bos, A. (2000). Design aspects for an optimum DF STEM probe, in Proceedings of the 12th European Congress on Electron Microscopy, Instrumentation and Methodology 2000 in Brno, Czech Republic, edited by P. Tománek and R. Kolařík. Vol. 3, Brno: The Czechoslovak Society for Electron Microscopy, pp. 129–130.
van Aert, S., den Dekker, A. J., van Dyck, D., and van den Bos, A. (2002a). High-resolution electron microscopy and electron tomography: Resolution versus precision. J. Struct. Biol. 138, 21–33.
van Aert, S., den Dekker, A. J., van Dyck, D., and van den Bos, A. (2002b). Optimal experimental design of STEM measurement of atom column positions. Ultramicroscopy 90, 273–289.
van Aert, S., and van Dyck, D. (2001). Do smaller probes in a scanning transmission electron microscope result in more precise measurement of the distances between atom columns? Philosophical Magazine B 81, 1833–1846.
van Aert, S., van Dyck, D., den Dekker, A. J., and van den Bos, A. (2000). Quantitative ADF STEM: guidelines towards an improved experimental design, in Jaarboek Nederlandse Vereniging voor Microscopie 2000, including the proceedings of the Joint Meeting of the BVM and the NVuM 2000 in Papendal, Arnhem, The Netherlands, edited by H. K. Koerten. Rijnsburg: Press Point, pp. 126–127.

van den Bos, A. (1982). Parameter estimation, in Handbook of Measurement Science, edited by P. H. Sydenham. Vol. 1, Chichester: Wiley, pp. 331–377.
van den Bos, A. (1999). Measurement errors, in Encyclopedia of Electrical and Electronics Engineering, edited by J. G. Webster. Vol. 12, New York: Wiley, pp. 448–459.
van den Bos, A. (2002). Afscheidsrede—Naar Waarde Schatten; Valedictory Address. Technische Universiteit Delft.
van den Bos, A., and den Dekker, A. J. (2001). Resolution reconsidered—Conventional approaches and an alternative, in Advances in Imaging and Electron Physics, edited by P. W. Hawkes. Vol. 117, San Diego: Academic Press, pp. 241–360.
van Dyck, D. (2002). High-resolution electron microscopy, in Advances in Imaging and Electron Physics, edited by P. W. Hawkes. Vol. 123, San Diego: Academic Press, pp. 105–171.
van Dyck, D., and de Jong, A. F. (1992). Ultimate resolution and information in electron microscopy: general principles. Ultramicroscopy 47, 266–281.
van Dyck, D., Danckaert, J., Coene, W., Selderslaghs, E., Broddin, D., van Landuyt, J., and Amelinckx, S. (1989). The atom column approximation in dynamical electron diffraction calculations, in Computer Simulation of Electron Microscope Diffraction and Images, edited by W. Krakow and M. O’Keefe. Warrendale: The Minerals, Metals & Materials Society, pp. 107–134.
van Dyck, D., and Chen, J. H. (1999a). A simple theory for dynamical electron diffraction in crystals. Solid State Communications 109, 501–505.
van Dyck, D., and Chen, J. H. (1999b). Towards an exit wave in closed analytical form. Acta Crystallographica A 55, 212–215.
van Dyck, D., and Op de Beeck, M. (1996). A simple intuitive theory for electron diffraction. Ultramicroscopy 64, 99–107.
van Dyck, D., Op de Beeck, M., and Coene, W. (1993). A new approach to object wavefunction reconstruction in electron microscopy. Optik 93, 103–107.
van Dyck, D., van Aert, S., den Dekker, A. J., and van den Bos, A. (2002). How to select the items for the shopping list of future high resolution electron microscopists? in Microscopy and Microanalysis, Proceedings Microscopy and Microanalysis 2002 in Québec City, Canada, edited by E. Voelkl, D. Piston, R. Gauvin, A. J. Lockley, G. W. Bailey, and S. McKernan. Vol. 8, Supplement 2, Cambridge: Cambridge University Press, pp. 94–95.
van Dyck, D., van Aert, S., den Dekker, A. J., and van den Bos, A. (2003). Is atomic resolution transmission electron microscopy able to resolve and refine amorphous structures? Ultramicroscopy 98, 27–42.
van Dyck, D., and Coene, W. (1987). A new procedure for wave function restoration in high resolution electron microscopy. Optik 77, 125–128.
van Tendeloo, G., Pauwels, B., Geuens, P., and Lebedev, O. (2000). TEM of nanostructured materials, in Proceedings of the 12th European Congress on Electron Microscopy, Physical Sciences 2000 in Brno, Czech Republic, edited by J. Gemperlová and I. Vávra. Vol. 2, Brno: The Czechoslovak Society for Electron Microscopy, pp. 1–6.
van Tendeloo, G., and Amelinckx, S. (1978). A high resolution study of ordering in Au4Mn. Physica Status Solidi A 49, 337–346.
van Tendeloo, G., and Amelinckx, S. (1982). High resolution electron microscopic and electron diffraction study of the Au-Mg System II. The Y-phase and some observations on the Au77Mg23 phase. Physica Status Solidi A 69, 103–120.
van Veen, A. H. V., Hagen, C. W., Barth, J. E., and Kruit, P. (2001). Reduced brightness of the ZrO/W Schottky electron emitter. J. Vac. Sci. Technol. B 19, 2038–2044.
Wada, Y. (1996). Atom electronics: a proposal of nano-scale devices based on atom/molecule switching. Microelect. Engin. 30, 375–382.

Wang, Z. L. (2001). Inelastic scattering in electron microscopy—Effects, spectrometry and imaging, in Progress in Transmission Electron Microscopy 1—Concepts and Techniques, edited by X.-F. Zhang and Z. Zhang. Berlin: Springer-Verlag, pp. 113–159.
Weißbäcker, C., and Rose, H. (2001). Electrostatic correction of the chromatic and of the spherical aberration of charged-particle lenses (Part I). J. Elect. Microscopy 50, 383–390.
Weißbäcker, C., and Rose, H. (2002). Electrostatic correction of the chromatic and of the spherical aberration of charged-particle lenses (Part II). J. Elect. Microscopy 51, 45–51.
Wiesendanger, R. (1994). Scanning Probe Microscopy and Spectroscopy—Methods and Applications. Cambridge: Cambridge University Press.
Williams, D. B., and Carter, C. B. (1996). Transmission Electron Microscopy—A Textbook for Materials Science. New York: Plenum Press.
Zanchet, D., Hall, B. D., and Ugarte, D. (2000). X-ray characterization of nanoparticles, in Characterization of Nanophase Materials, edited by Z. L. Wang. Weinheim: Wiley-VCH, pp. 13–36.
Zandbergen, H. W., and van Dyck, D. (2000). Exit wave reconstructions using through focus series of HREM images. Microscopy Research and Technique 49, 301–323.
