In Proc. SPIE 4013-51, UV, Optical, and IR Space Telescopes and Instruments VI, Breckinridge & Jacobsen, eds., Munich, Germany, March 2000.
A Predictor Approach to Closed-Loop Phase-Diversity Wavefront Sensing

Mats G. Löfdahl, Göran B. Scharmer
Royal Swedish Academy of Sciences, Stockholm Observatory, SE-133 36 Saltsjöbaden, Sweden
ABSTRACT

We present a novel and fast method for utilizing wavefront information in closed-loop phase-diverse image data. We form a two-dimensional, object-independent error function using the images at different focus positions together with OTFs of the diffraction-limited system. Each coefficient in an expansion of the wavefront is estimated quickly and independently by calculating the inner product of a corresponding predictor function and the error function. This operation is easy to parallelize. The main computational burden lies in pre-processing, when the predictors are formed. This makes the method fast and therefore attractive for closed-loop operation. Calculating the predictors involves error function derivatives with respect to the wavefront parameters, statistics of the parameters, noise levels, and other known characteristics of the optical system. The predictors are optimized so that the RMS error in the wavefront parameters is minimized, rather than the consistency of the estimated quantities with the image data. We present simulation results that are relevant to the phasing of segmented mirrors in a space telescope, such as the NGST.

Keywords: Wavefront sensing, Segmented mirrors, Phase diversity, NGST
1. INTRODUCTION

Fast and accurate methods for the phasing and figure control of segmented telescopes, such as the NGST, and phased-array telescopes are needed. Phase diversity (PD) methods are considered for these tasks because they can estimate the relative pistons as well as higher-order aberrations from data collected at or near the focal plane, which means that very little extra optics is required for PD wavefront sensing (WFS). In PD, information about the phase and the object is extracted from at least two images of the same object, with a known difference in the phase between the two imaging channels. Most commonly, focus is used as the phase difference between the two channels, because it can easily be realized by adding a short distance to the optical path, e.g., with a beam splitter or (if the aberrations do not change quickly) even by refocusing the camera between consecutive exposures.
PD methods and algorithms have been described by several authors since the concept was first proposed in 1979. Laboratory experiments have demonstrated PD to be a useful tool for estimating and correcting the phase errors of segmented mirrors, deformable mirrors, and liquid-crystal spatial light modulators. However, PD WFS usually requires considerable amounts of computing time, and earlier efforts at making fast PD algorithms have had weaknesses. PD WFS with look-up table interpolation using a neural net has been quite successful but suffers from rapidly increasing computing time as the number of aberrations increases, and from the need to phase the system in order to train it with empirical data. The concurrent computation algorithm was applied to a quite restricted wavefront of only focus, astigmatism, and coma on a circular aperture, but still had 15% phase errors when an extended object was used. This was probably due to cross talk between
aberrations, since Zernike polynomials are not orthogonal with respect to Gonsalves-type algorithms, and we expect the problem would be worse when more aberrations are included. Speed-up of a Gonsalves-type PD algorithm by skipping the numerical guard band for Fourier wrap-around effects with extended objects has been attempted, but it also suffered badly from difficulties when higher-order aberrations were present.

Email address: [email protected].
A novel PD method was recently presented by Scharmer, who demonstrated it with simulations for a circular aperture and atmospheric turbulence. A similar method, developed for a Shack-Hartmann wavefront sensor, is presently in use for the SVST adaptive optics system. The idea is to find the optimum linear combination of samples of an error signal that directly gives the coefficient of a wavefront mode. We call the weights of this linear combination a predictor.

This kind of method has the potential to overcome some of the limitations of the previous methods for fast PD WFS. It allows fast real-time correction, because all that is needed to get an estimate of an aberration parameter is to Fourier transform the images, form an error function, multiply the error function with the predictors, and sum up the pixels. The heavy computational work, calculating the predictors that minimize the errors in the wavefront in a least squares sense, is done in preprocessing, which can be done in advance on a separate computer. Noise and cross talk with higher-order aberrations can be accounted for when the predictors are calculated. Fourier wrap-around effects are avoided by using an error function that has the form of an OTF/PSF error, rather than an error signal in an extended image. The closed-loop computing time scales linearly with the number of controlled modes, and the computation parallelizes in a natural way.

This paper is organized as follows: the PD algorithm is given in Section 2 and we demonstrate its use in Section 3. We end with a discussion in Section 4.
2. ALGORITHM

2.1. Error function

The error metric minimized in PD algorithms related to Gonsalves' original PD method can be written as the pixel sum of an error function, which can be derived by assuming Gaussian additive noise and optimizing the consistency between the data and the estimated wavefront and object. A drawback of Gonsalves' error function is that the error signal is convolved with the object, which makes it impossible to compare error levels from one object to another.

Here, we find the optimum linear combination of the pixel values of an error function that we are free to choose; the optimization will take the choice of error function into account. Object-independent error functions can be defined, as has been shown by researchers doing neural-net PD. The ratio between the transforms of the images has been used, but can become unstable when the denominator is close to zero. For applications with the GRNN, two different error functions have been used: one called the power metric, which is only sensitive to symmetrical aberrations, and the other called the sharpness metric, which senses antisymmetric components in the phase. They reduce the problem of small denominators by dividing with the sum of the power spectra of the two data frames. One drawback of the power metric is that it is not zero for a perfectly phased system.

We write the Fourier transform of a data frame from diversity channel k as

D_k = F T S_k + N_k,  (1)

where F is the object, T is the detector MTF, S_k is the OTF, and N_k represents noise. We define

E = \left( \frac{D_2 \hat{S}_1 - D_1 \hat{S}_2}{|D_2 \hat{S}_1 + D_1 \hat{S}_2|} - \frac{S_2^0 \hat{S}_1 - S_1^0 \hat{S}_2}{|S_2^0 \hat{S}_1 + S_1^0 \hat{S}_2|} \right) \sqrt{|S_1^0|^2 + |S_2^0|^2},  (2)

where subscripts 1 and 2 denote the two diversity imaging channels, a superscript 0 denotes an aberration-free OTF, and a hat denotes a current estimate in an iterative solution; ν denotes the noise contribution to E, derived in Appendix A. In closed loop, the second term in the parenthesis is always zero because the initial estimate of \hat{S}_k is S_k^0.

E is object independent. In fact, it has the form of an error in an OTF, so it will be zero for zero aberrations. This means that when inverse Fourier transformed, it has the form of an error in a PSF (rather than of an image, as in the Gonsalves case). Since the phase information is concentrated within the first few diffraction rings, the PSF error frame can in principle be cropped without any apparent loss of information. This is equivalent to compressing E. The compression factor should be chosen so that the important features of the PSFs (in both diversity channels) fit within the field of view (FOV). For a point-like object this merely allows the object to move around in the FOV, but for extended objects it means that more image data can be used without increasing the computing time significantly.
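To make the definition concrete, the following is a minimal NumPy sketch of Eqs. (1)-(2) for noise-free data. The function and variable names are ours, and the small `eps` guard against zero denominators is an implementation detail not specified in the paper.

```python
import numpy as np

def otf(pupil, phase):
    """OTF of a system with the given pupil transmission and phase map,
    computed via PSF = |FFT(generalized pupil)|^2 and OTF = FFT(PSF),
    normalized to unit DC."""
    p = pupil * np.exp(1j * phase)
    psf = np.abs(np.fft.fft2(p)) ** 2
    s = np.fft.fft2(psf)
    return s / s[0, 0]

def error_function(d1, d2, s1_hat, s2_hat, s1_0, s2_0, eps=1e-12):
    """Object-independent error function of Eq. (2).
    d1, d2:          Fourier transforms of the two diversity frames
    s1_hat, s2_hat:  current OTF estimates
    s1_0, s2_0:      aberration-free OTFs
    eps guards the moduli in the denominators against division by zero."""
    data_term = (d2 * s1_hat - d1 * s2_hat) / (np.abs(d2 * s1_hat + d1 * s2_hat) + eps)
    ref_term = (s2_0 * s1_hat - s1_0 * s2_hat) / (np.abs(s2_0 * s1_hat + s1_0 * s2_hat) + eps)
    return (data_term - ref_term) * np.sqrt(np.abs(s1_0) ** 2 + np.abs(s2_0) ** 2)
```

A quick sanity check of the object independence: for a noise-free point source (F = T = 1, so D_k = S_k) with zero aberrations and \hat{S}_k = S_k^0, both terms of Eq. (2) vanish identically.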
2.2. Linear predictors

As usual in PD methods, we parameterize the wavefront, φ_k, by expanding it in a set of linearly independent wavefront modes, ψ_m,

\phi_k = \sum_m \alpha_m \psi_m + \delta_k,  (3)

where δ_k is the known diversity phase. We estimate the aberration coefficients, α_m, by forming an inner product of E and predictor functions, H_m,

\hat{\alpha}_m = \sum_{u=1}^{U} H_{mu}^{*} E_u,  (4)

for all aberrations that we wish to control in the closed loop. We also use windowing in the Fourier domain; U is the number of pixels within the diffraction limit. Note that \hat{\alpha}_m is always a real number, because H_m and E have hermitian symmetry.
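The closed-loop estimation step of Eq. (4) amounts to one inner product per controlled mode. A sketch of this step (array layout and names are our assumptions):

```python
import numpy as np

def estimate_coefficients(predictors, error_fn, window):
    """Closed-loop estimation step of Eq. (4) (sketch).
    predictors: (M, U) complex array, row m holding H_m sampled at the
                U Fourier pixels inside the diffraction limit
    error_fn:   error function E over the full Fourier grid
    window:     boolean mask selecting those U pixels"""
    e_u = error_fn[window]             # the U useful samples of E
    # One inner product per controlled mode -- trivially parallel over m.
    alpha_hat = predictors.conj() @ e_u
    # With hermitian H_m and E the sums are real; .real drops the
    # numerical imaginary residue.
    return alpha_hat.real
```

The per-correction cost is O(MU), and each of the M rows can be processed independently, which is the parallelism noted in the introduction.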
Following Ref. 14, correcting for misprints and explicitly treating H_m and E as complex quantities, we find the linear predictor, H_m, for α_m by minimizing the expected least squares error

L = \left\langle \Bigl| \alpha_m - \sum_{u=1}^{U} H_{mu}^{*} E_u \Bigr|^2 \right\rangle.  (5)
The expectancy values, ⟨·⟩, incorporate noise realizations as well as all the expected aberrations. We expand this expression, using the fact that H_m is not a random function,

L = \langle \alpha_m^2 \rangle + \sum_{u=1}^{U} \sum_{v=1}^{U} H_{mu}^{*} H_{mv} \langle E_u E_v^{*} \rangle - \sum_{u=1}^{U} H_{mu}^{*} \langle \alpha_m E_u \rangle - \sum_{v=1}^{U} H_{mv} \langle \alpha_m E_v^{*} \rangle.  (6)

At the optimum, the partial derivatives of L with respect to both the real and the imaginary parts of H_{mu} are zero simultaneously for all m and u. This results in an equation for H_{mu},

\sum_{u=1}^{U} \langle E_u^{*} E_v \rangle H_{mu} - \langle \alpha_m E_v \rangle = 0,  (7)

which can be written on matrix form,

R h_m - s_m = 0.  (8)
Note that the same R matrix gives the optimum H_m for all m. R can be constructed for arbitrary noise and aberration statistics using Monte Carlo methods. However, small aberrations and simple noise models allow the derivation of an analytical expression. Using an assumption of small aberrations that allows us to write

E \approx \sum_m \alpha_m \frac{\partial E}{\partial \alpha_m} + \nu,  (9)

the array elements can be written as

R_{uv} = \sum_m \langle \alpha_m^2 \rangle \frac{\partial E_u^{*}}{\partial \alpha_m} \frac{\partial E_v}{\partial \alpha_m} + \langle \nu_u^{*} \nu_v \rangle,  (10)

h_{mu} = H_{mu},  (11)

s_{mv} = \langle \alpha_m^2 \rangle \frac{\partial E_v}{\partial \alpha_m}.  (12)

Note that the sum over m extends to infinity (in practice to some large number M), including aberrations not controlled! This means that H_m for a certain aberration m is adapted to, and minimizes the cross talk with, all other aberrations, including higher-order aberrations that are not estimated. A derivation of the noise term in Eq. (10) for the case of additive Gaussian noise can be found in Appendix A.

R is a complex U×U-element, rank-M matrix; h_m and s_m are complex U-element arrays. When calculating the H_m predictors in Eq. (4), we solve Eq. (8) by using singular value decomposition (SVD) and back substitution. The SVD can be written as

R = U W V^T,  (13)

where U and V are orthogonal matrices and W is a diagonal matrix. The diagonal elements w_i of W are called the singular values (SVs) and are ordered with the largest SV first. Note that, although the SVD of R can be time consuming due to its size, R only needs to be decomposed once in order to form all the h_m.
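The fact that one decomposition serves all modes can be sketched as follows: decompose R once, then back-substitute every s_m in a single matrix product. The truncation interface (`n_sv`, `rcond`) is our assumption, anticipating the SV-limited predictors used in Section 3.

```python
import numpy as np

def make_predictors(R, S, n_sv=None, rcond=1e-6):
    """Solve R h_m = s_m for all modes with one SVD of R (Eq. 13),
    optionally truncating to the n_sv largest singular values.
    R: (U, U) complex matrix of Eq. (10)
    S: (U, M) array whose column m is s_m of Eq. (12)
    Returns a (U, M) array whose column m is the predictor h_m."""
    Umat, w, Vh = np.linalg.svd(R)           # w is sorted, largest first
    keep = w >= rcond * w[0]                 # drop numerically tiny SVs
    if n_sv is not None:
        keep &= np.arange(len(w)) < n_sv     # keep only the n_sv largest
    winv = np.zeros_like(w)
    winv[keep] = 1.0 / w[keep]
    # Back substitution h = V diag(1/w) U^H s, for all columns at once.
    return Vh.conj().T @ (winv[:, None] * (Umat.conj().T @ S))
```

Solving for a new mode then costs only one matrix-vector product against the stored factors, rather than a fresh decomposition.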
3. ALGORITHM USAGE

3.1. Simulation setup

Here we simulate a segmented mirror configuration with six identical hexagonal segments and a partly obscured center segment. We define modes as Zernike polynomials local to each segment, with coordinates that rotate with segment position and with the radial variable equal to 1 at the segment corners. They are numbered so that modes 1-6 are piston modes, 7-12 are tilts with the gradient pointing in the radial direction, 13-18 are azimuthal tilts, etc. We show these modes for a single segment in Fig. 1. Note that we keep the center segment fixed and aberration free for the experiments presented here.
The parameters of the simulated setup are the following: an F/83 beam (with respect to the longest baseline), 128×128-pixel frames, λ = 600 nm, and an image scale of 0.2″/pixel, giving a diffraction-limited resolution of 0.5″. We use a linear compression factor of 3 (see Section 2.1). Preliminary tests suggest that moderate amounts of noise cause only small changes in R, so Eq. (10) was used with no noise contribution.
We assume statistics where there is about half as much RMS for segment focus as for either of the tilt terms, one third for astigmatism, and one tenth for coma. The expected piston amplitude was set to three times the tilt RMS.

Figure 1. Sample segment modes, ψ_m, with m given above left of each tile: 1. Piston; 7. Rad. tilt; 13. Az. tilt; 19. Focus; 25. 45° astig.; 31. 90° astig.; 37. Az. coma; 43. Rad. coma.
Figure 2. Singular value ratio as a function of diversity. The optimum amount of focus diversity for the current setup is 2.7 waves peak-to-valley.

Figure 3. The M = 48 first SVs of R, normalized to the first SV. Plus signs (+) are SVs for when the original ψ_m modes are used, crosses (×) for the eigenmodes.
3.2. Making predictors

Forming the SVD of R as in Eq. (13) allows an objective way of optimizing various aspects of the optical setup. Here we show how it can be done for the amount of diversity defocus. In general, a matrix inversion problem is better determined if the SVs are large. So if changes in a parameter change the SVs, we can choose the value that maximizes the ratio w_M/w_1, where M is the number of modeled aberrations. This is reasonable, because the ratio is a measure of how well the signal is represented in single-precision arithmetic: if the ratio is less than about 10^-6, then single precision is not enough.

We show w_48/w_1 for the current setup as a function of focus diversity in Fig. 2. Anything between 2 and 3 waves is probably a good choice, but the optimum diversity in this sense is close to 2.7 waves peak-to-valley. We use this diversity in the following simulations, and we show the 48 first SVs for 2.7 waves as plus signs (+) in Fig. 3.
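The scan over diversity amounts reduces to computing w_M/w_1 for each candidate R. A sketch of that selection, where `build_R` is a hypothetical user-supplied function returning the R matrix of Eq. (10) for a given diversity:

```python
import numpy as np

def sv_ratio(R, M):
    """w_M / w_1 from the singular values of R (sorted largest first)."""
    w = np.linalg.svd(R, compute_uv=False)
    return w[M - 1] / w[0]

def best_diversity(build_R, diversities, M):
    """Scan candidate focus-diversity amounts and keep the one that
    maximizes w_M/w_1 (Section 3.2). build_R(d) is a hypothetical
    model returning R for diversity d."""
    ratios = [sv_ratio(build_R(d), M) for d in diversities]
    return diversities[int(np.argmax(ratios))], ratios
```

Since each candidate needs a full SVD of a U×U matrix, this scan belongs in the preprocessing stage, not in the closed loop.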
We can calculate the partial derivative of the error function with respect to each mode. These derivatives are the error function responses to small changes in the wavefront of the type specified by each mode, i.e., they show in which pixels there is signal from small changes in each mode. Error function responses for the modes in Fig. 1 are shown in Fig. 4. Note the similarities between piston, radial tilt, focus, and radial astigmatism on one hand, and azimuthal tilt, 45° astigmatism, and azimuthal coma on the other.

We show the corresponding predictors in Fig. 5. Note the similarities and differences with Fig. 4. The similarities at low spatial frequencies between the responses for different modes force the predictors to put more emphasis on the high frequencies.
3.3. Eigenmodes and stepwise convergence

By solving Eqs. (8) and (13) for one SV at a time (nulling the others), we can find the eigenmodes of a particular setup for this PD algorithm. The solution for each SV is a single predictor H_m, which is expanded in terms of the response functions. The resulting expansion coefficients are applied to the ψ_m modes and the result is called an eigenmode. In Fig. 6, we show the eigenmodes in order of significance. Note that the first six modes are combinations of almost clean piston modes, indicating good piston-only control. In modes 7-24, tilt and focus modes are intermixed together with some piston. Modes 25-36 also include astigmatism, while coma is more or less restricted to the last 12 modes.
Figure 4. Response for single segment aberrations, ∂E/∂α_m, for the modes in Fig. 1 (1. Piston; 7. Rad. tilt; 13. Az. tilt; 19. Focus; 25. 45° astig.; 31. 90° astig.; 37. Az. coma; 43. Rad. coma). Top row: real parts. Bottom row: imaginary parts. Note that each derivative shown exists in six different orientations. Compare Fig. 1.
Figure 5. Predictors, H_m, corresponding to the ψ_m modes in Fig. 1 and the partial derivatives in Fig. 4.

Figure 6. Eigenmodes in order of significance, m = 1-48.

Figure 7. Conversion matrix M between original modes and eigenmodes.
Figure 8. Predictor-response product matrix for (a) 6 SVs, (b) 18 SVs, (c) 24 SVs, (d) 36 SVs, and (e) 48 SVs.

Figure 9. Predictors for a single segment: (a) 6 SVs (piston); (b) 24 SVs (piston, rad. tilt, az. tilt, focus); (c) 36 SVs (piston, rad. tilt, az. tilt, focus, 45° astig., 90° astig.). Compare the responses in Fig. 4.
The eigenmodes can be studied in another way in Fig. 7, as a conversion matrix M between original modes and eigenmodes. Note that M is almost a block lower triangular matrix, with blocks defined at rows/columns 6 (piston), 24 (piston-tilt-focus), and 36 (piston-tilt-focus-astigmatism), but not quite at row/column 18 (piston-tilt). It is easier for the algorithm to converge if we start by estimating only the most significant modes, and then add more modes as the low-order modes converge. To do this we must define natural subsets of the modes. The blockiness of M gives the answer, see Fig. 7. The predictor-response product matrices in Fig. 8 show where there is cross talk with higher modes. The piston modes 1-6 are relatively free from cross talk. The 18-SV predictors would not work well, which is due to the focus modes 17 and 18 that destroy the 18×18 block in M.
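The predictor-response product matrix of Fig. 8 can be computed directly from the predictors and the response functions; a sketch (array layout is our assumption):

```python
import numpy as np

def crosstalk_matrix(predictors, responses):
    """Predictor-response product matrix (cf. Fig. 8; sketch).
    Element (m, n) is the estimate that the mode-m predictor returns for
    a unit amount of mode n, so an identity matrix means no cross talk.
    predictors, responses: (U, M) arrays; column m holds H_m and
    dE/dalpha_m respectively."""
    return (predictors.conj().T @ responses).real
```

Deviations of this matrix from the identity in the off-diagonal blocks are exactly the cross talk that motivates the stepwise mode subsets described above.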
Figure 10. (a) Responses and (b) predictors for the eigenmodes (see Fig. 6). Because the eigenmodes are symmetric or antisymmetric, only one of the symmetric real part and the antisymmetric imaginary part is non-zero and therefore shown here.

We show predictors based on piston-only modes, on piston-tilt-focus modes, and on piston-tilt-focus-astigmatism modes in Fig. 9. Note that the 6-SV piston predictor is almost identical to the piston response in Fig. 4. It does not take the higher-order aberrations into account as much as the predictors calculated with more SVs.
We can check the results by repeating the same process, starting with the eigenmodes. The SVs are shown in Fig. 3 as crosses (×). The responses and predictors in Fig. 10 are much more similar to each other than the responses and predictors in Figs. 4 and 5, which shows that the eigenmodes are chosen so that they are approximately orthogonal with respect to the predictors.
3.4. Simulated closed loop

We initialize simulated closed loop experiments by making random wavefronts, where the coefficients α_m, m = 1, ..., 48, are drawn from different uniform distributions depending on the type of segment aberration. Pistons of up to 1 wave were used. The aberration statistics used for making the predictors are not followed strictly for the simulated wavefronts; the expectancy values for the aberrations are important primarily when M is comparable to U. We use the predictors to estimate corrections that are subtracted from the α_m.
Figure 11. Results from closed loop simulation: (a) RMS, (b) Strehl ratio. Averages of 300 experiments. Solid line: total wavefront. Short dashes: pistons. Longer dashes: tilts. Dotted: higher modes. Only pistons were controlled during the first 8 corrections.

For each wavefront, image data frames are made by convolving an object with OTFs based on the wavefronts in 256×256-pixel arrays, which are then cropped to 128×128 pixels. We use no-noise predictors but make simulated image data with Gaussian additive noise at a level of 10^-4 of the peak power, enough to make the diversity channel quite noisy. We show statistics in Fig. 11 for 300 closed loop simulations using a point source, although we have similar results with extended sources. The first 8 corrections were piston-only, calculated using piston-only predictors (corresponding to using 6 SVs) and using only the piston estimates. This is similar to using only the 6×6 upper left-hand corner of M. For the following corrections, the full set of predictors was used. The wavefront RMS values are calculated after performing a mod 2π operation on the segment pistons and then removing global piston and tilt, since these sources of RMS do not affect image quality. The 2π ambiguity is a general problem for piston WFS and is dealt with in another paper. Note the rapid convergence. Correcting only the first six modes, the pistons converge to 40 nm RMS in 3 corrections and reach their final level of 30 nm after 6 corrections. When the full set of predictors is used, all modes converge in another 5 corrections. The final wavefront is 20 nm RMS, with pistons contributing 10 nm RMS.
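The structure of such a closed-loop experiment can be illustrated with a deliberately small toy model, replacing the optical simulation by its linearization, Eq. (9): random complex placeholder responses stand in for ∂E/∂α_m, and a pseudo-inverse stands in for the noise-free predictors. All names and sizes here are ours.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pix, n_modes = 40, 6                         # Fourier samples U, modes M

# Toy complex response functions dE/dalpha_m (columns); in the paper these
# come from the model of Section 2, here they are random placeholders.
A = rng.normal(size=(n_pix, n_modes)) + 1j * rng.normal(size=(n_pix, n_modes))

# Stand-in predictors: column m of H plays the role of H_m; for unit mode
# statistics and zero noise this is simply a pseudo-inverse of the responses.
H = np.linalg.pinv(A).conj().T                 # (n_pix, n_modes)

alpha = rng.normal(size=n_modes)               # "true" aberration coefficients
for _ in range(3):
    E = A @ alpha                              # linearized error fn, Eq. (9)
    alpha_hat = (H.conj().T @ E).real          # estimates, Eq. (4)
    alpha = alpha - alpha_hat                  # apply the correction

residual = np.max(np.abs(alpha))               # tiny in this noise-free toy
```

In this linear, noise-free toy the loop converges essentially in one correction; the many-correction convergence of Fig. 11 comes from the non-linearity of the real error function and from the added noise.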
4. DISCUSSION

We have shown that the predictor method can sense piston in the presence of uncorrected higher aberrations, and that the higher aberrations can also be sensed and corrected. We are still exploring various combinations of limiting the number of SVs and the number of controlled modes. Experiments not presented here suggest that the 6-SV predictors extend the capture range compared to just controlling 6 modes with the 48-SV predictors, probably due to non-linear effects. However, it is evident from the piston improvement when the other modes are included that there is considerable cross talk with higher modes.

We have also shown that predictors formed under the assumption of zero noise work with data that include Gaussian additive noise. We expect that the method would also work for moderate amounts of noise with other statistics. In general, the expression for the noise term shows that when the noise contribution is calculated, R (and therefore the predictors) is no longer independent of the object. However, an approximate power spectrum should be enough to customize the matrix to the expected observations.

One important point is that the predictor method does not optimize the consistency between the data and the estimated quantities, as Gonsalves-type PD methods do. Instead it minimizes the errors in the quantity we are interested in, the wavefront.

It should be noted that nothing in this PD algorithm limits it to monochromatic light. The derivatives have to take the bandwidth into account, but that is all. The method is not even restricted to PD. It could in a natural way be applied to diversity in the pupil geometry, or to any setup where derivatives of an object-independent error function can be defined.

We have shown how the SVD can be used to find a reasonable optimum for the amount of focus diversity. Although it remains to be shown that this optimum diversity really gives optimum performance in closed loop, we believe the possibility is worth mentioning. The ability to predict the objectively best diversity amount is certainly useful, maybe even for users of other PD algorithms. Another possibility is to minimize L, which is the variance of the expected residual error in the aberration coefficients α_m.
This method combines some of the good qualities of neural-net PD, such as speed and the possibility of empirical "training", with some that are typical of Gonsalves-based algorithms, such as a well-defined optimum determined by modeling. Although the preprocessing to make predictors is heavy, the closed-loop calculations are cheap and scale linearly with the number of sensed parameters. The separate nature of the estimation of the different aberration coefficients makes this algorithm easy to parallelize.

We believe empirical derivatives from a not completely phased telescope could be used with this method. Note that while network training establishes associations between data and absolute phase errors, which requires a well phased system to start with, the derivatives are used in this method only to show the direction to the best phase. This may be an advantage in situations when the optical system is not well modeled, e.g., during an initial phasing operation of a space telescope. However, it would mean that the preprocessing would have to be done in space and that it may have to be redone when the phasing has improved. It should be noted that for a well modeled system, all the heavy preprocessing can be done pre-launch.
The error function in Eq. (2) would probably be a good choice for an error metric used with the GRNN. The symmetric (real) part of E is similar to the power metric in that it senses symmetric components of the phase, while the antisymmetric (imaginary) part of E, like the sharpness metric, is sensitive to anti-symmetric aberrations.
ACKNOWLEDGMENTS

M.G.L. would like to thank Rick Williams and John Seldin for discussions and information on NGST wavefront control. This research was supported in part by Independent Research and Development funds at Lockheed Martin Space Systems, Advanced Technology Center, Palo Alto, California.
REFERENCES

1. D. Redding, S. Basinger, A. E. Lowman, A. Kissil, P. Bely, R. Burg, R. Lyon, G. Mosier, M. Femiano, M. Wilson, G. Schunk, L. Craig, D. Jacobsen, and J. Rakoczy, "Wavefront sensing and control for a next generation space telescope," in Space Telescopes and Instruments V, P. Y. Bely and J. B. Breckinridge, eds., vol. 3356 of Proc. SPIE, pp. 758-772, 1998.
2. V. Zarifis, R. M. Bell Jr., L. R. Benson, P. J. Cuneo, A. L. Duncan, B. J. Herman, B. Holmes, R. D. Sigler, R. E. Stone, D. M. Stubbs, R. L. Kendrick, R. G. Paxman, J. H. Seldin, and M. G. Löfdahl, "The multi aperture imaging array," in Working on the Fringe: An International Conference on Optical and IR Interferometry from Ground and Space, S. Unwin and R. Stachnik, eds., vol. 194 of ASP Conf. Ser., p. 76, 1999.
3. M. G. Löfdahl and G. B. Scharmer, "Wavefront sensing and image restoration from focused and defocused solar images," Astronomy & Astrophysics Supplement Series 107, pp. 243-264, 1994.
4. R. G. Paxman, J. H. Seldin, M. G. Löfdahl, G. B. Scharmer, and C. U. Keller, "Evaluation of phase-diversity techniques for solar-image restoration," Astrophysical Journal 466, pp. 1087-1099, 1996.
5. R. L. Kendrick, D. S. Acton, and A. L. Duncan, "Phase-diversity wave-front sensor for imaging systems," Applied Optics 33(27), pp. 6533-6546, 1994.
6. R. G. Paxman, T. J. Schulz, and J. R. Fienup, "Joint estimation of object and aberrations by using phase diversity," Journal of the Optical Society of America A 9(7), pp. 1072-1085, 1992.
7. R. G. Paxman and J. R. Fienup, "Optical misalignment sensing and image reconstruction using phase diversity," Journal of the Optical Society of America A 5, pp. 914-923, 1988.
8. R. A. Gonsalves and R. Chidlaw, "Wavefront sensing by phase retrieval," in Applications of Digital Image Processing III, A. G. Tescher, ed., vol. 207 of Proc. SPIE, pp. 32-39, 1979.
9. R. L. Kendrick, R. Bell, A. L. Duncan, G. D. Love, and D. S. Acton, "Closed loop wavefront correction using phase diversity," in Space Telescopes and Instruments V, Bely and Breckinridge, eds., vol. 3356 of Proc. SPIE, pp. 844-853, 1998.
10. M. G. Löfdahl, G. B. Scharmer, and W. Wei, "Calibration of a deformable mirror and Strehl ratio measurements by use of phase diversity," Applied Optics 39(1), pp. 94-103, 2000.
11. R. A. Carreras, G. Tarr, S. Restaino, G. Loos, and M. Damodaran, "Concurrent computation of Zernike coefficients used in a phase diversity algorithm for optical aberration correction," in Image and Signal Processing for Remote Sensing, J. Desachy, ed., vol. 2315 of Proc. SPIE, pp. 363-370, 1994.
12. M. G. Löfdahl, "Orthogonalization of basis functions for diagonalized wavefront sensing," in High Resolution Solar Physics: Theory, Observations and Techniques, T. Rimmele, R. R. Radick, and K. S. Balasubramaniam, eds., Proc. 19th Sacramento Peak Summer Workshop, ASP Conf. Series vol. 183, p. 320, 1999.
13. M. G. Löfdahl, A. L. Duncan, and G. B. Scharmer, "Fast phase diversity wavefront sensing for mirror control," in Adaptive Optical System Technologies, D. Bonnaccini and R. K. Tyson, eds., vol. 3353 of Proc. SPIE, pp. 952-963, 1998.
14. G. B. Scharmer, "Object-independent fast phase-diversity," in High Resolution Solar Physics: Theory, Observations and Techniques, T. Rimmele, R. R. Radick, and K. S. Balasubramaniam, eds., Proc. 19th Sacramento Peak Summer Workshop, ASP Conf. Series vol. 183, p. 330, 1999.
15. G. B. Scharmer and H. Blomberg, "Optimized Shack-Hartmann wavefront sensing for adaptive optics and post processing," in High Resolution Solar Physics: Theory, Observations and Techniques, T. Rimmele, R. R. Radick, and K. S. Balasubramaniam, eds., Proc. 19th Sacramento Peak Summer Workshop, ASP Conf. Series vol. 183, p. 239, 1999.
16. G. B. Scharmer, P. M. Dettori, M. G. Löfdahl, M. Shand, and W. Wei, "A workstation based adaptive optics system," in Adaptive Optical Systems Technologies, P. L. Wizinowich, ed., vol. 4007 of Proc. SPIE, 2000.
17. N. A. Miller and A. V. Ling, "Imaging with phase diversity: Simulations with a neural network," in Photoelectronic Detection and Imaging: Technology and Applications '93, L. Zhou, ed., vol. 1982 of Proc. SPIE, pp. 410-417, 1993.
18. C. L. Lawson and R. J. Hanson, Solving Least Squares Problems, Series in Automatic Computation, Prentice-Hall, Englewood Cliffs, New Jersey, 1974.
19. W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C, The Art of Scientific Computing, Cambridge University Press, Cambridge, 2 ed., 1992.
20. M. G. Löfdahl and H. Eriksson, "Resolving piston ambiguities when phasing a segmented mirror," in UV, Optical, and IR Space Telescopes and Instruments VI, J. B. Breckinridge and P. Jacobsen, eds., vol. 4013 of Proc. SPIE, 2000.
APPENDIX A. NOISE EFFECTS

Substituting Eq. (1) for the data frames in Eq. (2), and using only the closed-loop part of the error function, we can write

E = \frac{(FTS_2 + N_2)\hat{S}_1 - (FTS_1 + N_1)\hat{S}_2}{\bigl|(FTS_2 + N_2)\hat{S}_1 + (FTS_1 + N_1)\hat{S}_2\bigr|} \sqrt{|S_1^0|^2 + |S_2^0|^2}.  (14)

Assuming the noise is negligible with respect to the sum in the denominator, but not with respect to the difference in the numerator, we get

E \approx \frac{S_2 \hat{S}_1 - S_1 \hat{S}_2}{\bigl|S_2 \hat{S}_1 + S_1 \hat{S}_2\bigr|} \sqrt{|S_1^0|^2 + |S_2^0|^2} + \nu,  (15)
where, assuming also the aberrations are negligible with respect to the sum in the denominator,

\nu = \frac{N_2 \hat{S}_1 - N_1 \hat{S}_2}{2 F T S_1^0 S_2^0} \sqrt{|S_1^0|^2 + |S_2^0|^2}.  (16)
The assumption of small aberrations also lets us expand E in its derivatives using Eq. (9), so Eq. (7) can be rewritten as

\sum_{u=1}^{U} \left\langle \Bigl( \sum_n \alpha_n \frac{\partial E_u}{\partial \alpha_n} + \nu_u \Bigr)^{*} \Bigl( \sum_n \alpha_n \frac{\partial E_v}{\partial \alpha_n} + \nu_v \Bigr) \right\rangle H_{mu} - \left\langle \alpha_m \Bigl( \sum_n \alpha_n \frac{\partial E_v}{\partial \alpha_n} + \nu_v \Bigr) \right\rangle = 0.  (17)

Assuming also that the noise is statistically independent from the aberrations, and that the aberrations have zero mean and are independent from each other, this becomes

0 = \sum_{u=1}^{U} \left( \sum_n \langle \alpha_n^2 \rangle \frac{\partial E_u^{*}}{\partial \alpha_n} \frac{\partial E_v}{\partial \alpha_n} + \langle \nu_u^{*} \nu_v \rangle \right) H_{mu} - \langle \alpha_m^2 \rangle \frac{\partial E_v}{\partial \alpha_m}.  (18)
We expand the noise term, ⟨ν_u^* ν_v⟩, in terms of the noise in the data frames. Assuming that the data frames have noise that is statistically independent from frame to frame, and that all frames have identical statistics, we have from Eq. (16) that

\langle \nu_u^{*} \nu_v \rangle = \frac{\langle N_u^{*} N_v \rangle}{F_u^{*} F_v T_u T_v} \, \frac{\bigl( \hat{S}_{1u}^{*} \hat{S}_{1v} + \hat{S}_{2u}^{*} \hat{S}_{2v} \bigr) \sqrt{|S_{1u}^0|^2 + |S_{2u}^0|^2} \sqrt{|S_{1v}^0|^2 + |S_{2v}^0|^2}}{4 S_{1u}^{0*} S_{1v}^{0} S_{2u}^{0*} S_{2v}^{0}}.  (19)

For additive Gaussian noise, this becomes particularly simple, because the noise in different pixels is independent and has equal power, and the noise is also independent from the object. The first factor can then be written as

\langle N_u^{*} N_v \rangle = \sigma_N^2 \, \delta_{uv},  (20)

where δ_{uv} is the Kronecker delta. This means that the other factors, including the fraction involving the OTFs, only have to be calculated for u = v. We get, in the Gaussian case,

\langle \nu_u^{*} \nu_v \rangle = \delta_{uv} \, \frac{\sigma_N^2}{4 |F_u|^2 T_u^2} \bigl( |\hat{S}_{1u}|^2 + |\hat{S}_{2u}|^2 \bigr) \frac{|S_{1u}^0|^2 + |S_{2u}^0|^2}{|S_{1u}^0|^2 \, |S_{2u}^0|^2}.  (21)
Gaussian noise contributes only to the diagonal of R. This should make the matrix equation less singular, much like the damping parameter added to the diagonal in the Levenberg-Marquardt method.
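Because the Gaussian-noise term of Eq. (21) is diagonal, adding it to a precomputed noise-free R is a cheap element-wise operation. A sketch (function name and 1-D array layout are ours):

```python
import numpy as np

def add_gaussian_noise_term(R, sigma2_N, F, T, s1_hat, s2_hat, s1_0, s2_0):
    """Add the Gaussian-noise contribution of Eq. (21) to the diagonal
    of R (sketch). All arrays are 1-D, sampled at the same U Fourier
    pixels used to build R; sigma2_N is the noise power per pixel."""
    nu2 = (sigma2_N / (4.0 * np.abs(F) ** 2 * np.abs(T) ** 2)
           * (np.abs(s1_hat) ** 2 + np.abs(s2_hat) ** 2)
           * (np.abs(s1_0) ** 2 + np.abs(s2_0) ** 2)
           / (np.abs(s1_0) ** 2 * np.abs(s2_0) ** 2))
    return R + np.diag(nu2)
```

Note the object dependence through |F_u|^2, consistent with the remark in Section 4 that the noise contribution makes R depend (weakly) on the object power spectrum.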