J. Opt. Soc. Am. A / Vol. 16, No. 5 / May 1999
Thelen et al.
Maximum a posteriori estimation of fixed aberrations, dynamic aberrations, and the object from phase-diverse speckle data

Brian J. Thelen, Richard G. Paxman, David A. Carrara, and John H. Seldin

ERIM-International, Box 134008, Ann Arbor, Michigan 48113-4008

Received November 3, 1998; accepted January 19, 1999

In phase-diverse speckle imaging one collects a time series of phase-diversity image sets that are used to jointly estimate the object and each of the phase-aberration functions. Current approaches model the total phase aberration in some deterministic parametric fashion. For many imaging schemes, however, additional information can be exploited. Specifically, the total aberration function consists of the fixed aberrations combined with dynamic (time-varying), turbulence-induced aberrations, about whose stochastic behavior we often have some knowledge. One important example is that in which the wave-front phase error corresponds to Kolmogorov turbulence. In this context using the extra statistical information available may be a powerful aid in the joint aberration/object estimation. In addition, such a framework provides an attractive method for calibrating fixed aberrations in an imaging system. The discipline of Bayesian statistical inference provides a natural framework for using the stochastic information regarding the wave fronts. Here one imposes an a priori probability distribution on the turbulence-induced wave fronts. We present the general Bayesian approach for the joint-estimation problem of fixed aberrations, dynamic aberrations, and the object from phase-diverse speckle data that leads to a maximum a posteriori estimator. We also present results based on simulated data, which show that the Bayesian approach provides an increase in accuracy and robustness for this joint estimation.

© 1999 Optical Society of America [S0740-3232(99)02005-0]

OCIS codes: 100.3020.
1. INTRODUCTION

As its name suggests, phase-diverse speckle (PDS)1,2 is a blend of phase-diversity and speckle-imaging concepts. The method of phase diversity,3,4 first proposed by Gonsalves, requires the collection of two or more images. One of these images is the conventional focal-plane image that has been degraded by unknown aberrations. Additional images of the same object are formed by perturbing these unknown aberrations in some known fashion, thus creating phase diversity, and reimaging. Such a perturbation can be accomplished with simple optical hardware. For example, a beam splitter and a second detector array, translated along the optical axis, further degrade the imagery with a known amount of defocus. The goal is to identify a combination of object and aberrations that is consistent with the collected images, given the known phase-diversity functions. Note that phase diversity provides system-identification information by estimating the unknown aberrations.

Speckle imaging is a relatively mature technique for obtaining fine-resolution images in the presence of dynamically changing aberrations such as turbulence. This technique requires the collection of many short-exposure images of a static object. PDS requires the simultaneous collection of one conventional image and at least one diversity image for each of the multiple dynamic-aberration realizations, as depicted in Fig. 1. Fortunately the primary strengths of the two constituent methods, namely, the system identification provided by phase diversity and the added information content in a sequence of short-exposure images, persist. PDS has been successfully demonstrated with real data in a variety of applications. Near-diffraction-limited resolution has been achieved for the ground-based imaging of solar granulation,2,5 binary stars,6 and artificial satellites.7,8

Fig. 1. Optical layout for phase-diverse speckle imaging.

Current PDS estimators are based on a deterministic parametric model for the total phase-aberration function. For a typical imaging scenario, however, one has more information than this. Specifically, the total phase-aberration function consists of fixed aberrations combined with dynamic (time-varying), turbulence-induced aberrations, about whose stochastic behavior we often have some knowledge. One important example is the case in which the dynamic aberrations are induced from Kolmogorov turbulence. In this context this extra information may be a powerful aid in the joint aberration/object estimation. In addition, such a framework would provide an attractive method for calibrating fixed aberrations in an imaging system. The natural framework for using the stochastic information regarding the wave fronts is that of Bayesian statistical inference, by which one imposes an a priori probability distribution on the turbulence-induced aberrations.

In this paper we present the general Bayesian approach for this joint-estimation problem of fixed aberrations, dynamic aberrations, and the object from PDS data that leads to a maximum a posteriori (MAP) estimator. We then present results based on simulated data, which show that the Bayesian approach provides an increase in accuracy and robustness for the joint estimation of object and aberrations. In Section 2 we give a precise statement of the problem, and in Section 3 we develop the general theory along with the resulting image-reconstruction algorithm based on a MAP estimation of the fixed and dynamic aberrations. In Section 4 we present some simulation results showing the improvement in performance of the MAP approach over the traditional maximum-likelihood-estimation approach.

2. STATEMENT OF PROBLEM

We consider the PDS imaging problem as described in Refs. 2 and 9, with the additional feature in which we decompose the unknown phase aberrations into fixed aberrations (FA’s) that are common to all the short-exposure images and dynamic aberrations (DA’s) that vary from realization to realization. We assume that we have $J$ short-exposure image sets, corresponding to $J$ DA realizations and $K$ diversity channels. Typically $K = 2$, as depicted in Fig. 1, but other values are of interest. For example, when $K = 1$ the data correspond to a multiframe blind deconvolution data set.10 The incoherent PDS image-formation process is approximated by the following set of discrete convolutions:

$$ g_{jk}(x) = \sum_{x'} f(x')\, s_{jk}(x - x'), \qquad j = 1, \ldots, J, \quad k = 1, \ldots, K, \tag{1} $$

where $f$ is the object, $s_{jk}$ is the point-spread function (PSF) for the $j$th DA realization and for the $k$th diversity channel, $g_{jk}$ denotes the corresponding noiseless image, and $x$ is a two-dimensional coordinate taking values in the integer grid $\{0, \ldots, N-1\} \times \{0, \ldots, N-1\}$. To take advantage of fast Fourier transforms to implement Eq. (1), we treat the object, the PSF’s, and the images as periodic arrays with a period cell size of $N \times N$. The detected image data are denoted by $d_{jk}(x)$, and for the remainder of this paper we assume that these data, given the object $f$ and PSF’s, can be modeled as independent Poisson random variables with a mean given by $g_{jk}(x)$ for each $j$, $k$, and $x$. The theory and approach can readily be adapted to handle other noise models such as additive Gaussian4 or mixed Poisson and additive Gaussian.11 The PSF $s_{jk}$ is defined by

$$ s_{jk}(x) = \frac{|h_{jk}(x)|^2}{\sum_{x'} |h_{jk}(x')|^2}, \tag{2} $$

where the coherent impulse response is given by

$$ h_{jk}(x) = \frac{1}{N^2} \sum_{u} H_{jk}(u) \exp(i 2\pi \langle u, x \rangle / N) \tag{3} $$

and the coherent transfer function is given by

$$ H_{jk}(u) = |H_k(u)| \exp\{ i [\phi_j(u) + \theta_k(u)] \}. \tag{4} $$

Here $u$ takes values in $\{0, \ldots, N-1\} \times \{0, \ldots, N-1\}$ and corresponds to spatial frequency, $\langle u, x \rangle$ denotes the inner product, $|H_k(u)|$ is the known pupil-transmission function (appropriately scaled) for the $k$th channel, $\phi_j(u)$ is the unknown phase-aberration function that is common to all $K$ diversity channels, and $\theta_k(u)$ is the known phase-diversity function associated with the $k$th channel. A convenient functional form for the phase-diversity function is that of a quadratic, corresponding to defocus, since this quadratic is easily obtained by translating the detector arrays along the optical axis. The unknown phase-aberration function, $\phi_j(u)$, contains fixed and dynamic aberrations, i.e.,

$$ \phi_j(u) = \phi^{(f)}(u) + \phi_j^{(d)}(u), \tag{5} $$

where $\phi^{(f)}$ corresponds to the FA’s and $\phi_j^{(d)}$ corresponds to the DA’s. Typical sources for the FA’s include mirror misfigure, misalignments among optical components, and errors in the wave front–sensor offsets in an adaptive-optics system. Typical sources for the DA’s are turbulence, jitter, or the dynamic portion of the residual phase errors after adaptive-optics correction.

It is desirable to represent each of these phase functions as a linear combination of appropriate basis functions. Let $\{a_{l'}\}_{l'=1}^{L'}$ and $\{b_l\}_{l=1}^{L}$ denote two (potentially different) sets of basis functions for the FA’s and for the DA’s, respectively. In the cases of segmented-aperture or phased-array telescopes, for example, the FA basis set $\{a_{l'}\}_{l'=1}^{L'}$ could be selected to correspond to piston and tilt misalignments. Typical choices for the DA basis set $\{b_l\}_{l=1}^{L}$ are Karhunen–Loève functions, Zernike polynomials, or delta functions (corresponding to a point-by-point representation). With these basis sets, the total phase-aberration function for the $j$th short-exposure image is given by

$$ \phi_j(u) = \sum_{l'=1}^{L'} \alpha_{l'}\, a_{l'}(u) + \sum_{l=1}^{L} \beta_{jl}\, b_l(u), \tag{6} $$

where the first summation corresponds to the FA’s and the second summation corresponds to the DA’s. We model the FA parameters $\{\alpha_{l'}\}_{l'=1}^{L'}$ as deterministic and unknown, whereas we model the DA parameter vector $\{\beta_{jl}\}_{l=1}^{L}$ as multivariate Gaussian with a mean of 0 and a known covariance matrix $\Sigma$. We also assume that the basis functions $\{b_l\}_{l=1}^{L}$ corresponding to the DA’s are linearly independent and that all the basis functions $\{a_{l'}\}_{l'=1}^{L'}$ can be expressed as a linear combination of the functions $\{b_l\}_{l=1}^{L}$. This is not a severe restriction, and in Appendix A we present an alternative approach when this assumption is not satisfied. Finally, we assume that the DA’s are statistically independent from realization to realization.

Our problem can now be stated. Given the set of $JK$ detected images $\{d_{jk}\}$ and the pupil-transmission functions $\{|H_k|\}$, we want to estimate the object $f$, the FA parameters $\{\alpha_{l'}\}$, and the DA parameters $\{\beta_{jl}\}$.
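The image-formation model of Eqs. (1)–(4) maps directly onto FFT operations. The following is a minimal NumPy sketch, not the authors' implementation: the function names, the 64 × 64 grid, the clear circular pupil of radius 16 samples, the random object, and the choice of zero aberration are all illustrative assumptions. The 0.9-wave quadratic diversity is borrowed from the simulation settings reported later in the paper.

```python
import numpy as np

def psf_from_phase(pupil_amp, phase, diversity):
    """Incoherent PSF of Eqs. (2)-(4); phases in radians, N x N arrays in FFT order."""
    H = pupil_amp * np.exp(1j * (phase + diversity))  # Eq. (4): coherent transfer function
    h = np.fft.ifft2(H)                               # Eq. (3): ifft2 includes the 1/N^2 factor
    s = np.abs(h) ** 2
    return s / s.sum()                                # Eq. (2): normalize to unit volume

def noiseless_image(obj, psf):
    """Eq. (1): periodic convolution of object and PSF, evaluated with FFTs."""
    return np.real(np.fft.ifft2(np.fft.fft2(obj) * np.fft.fft2(psf)))

# --- Toy setup: every value below is an illustrative assumption ---
N = 64
freq = np.fft.fftfreq(N) * N                     # integer frequencies in FFT order
ux, uy = np.meshgrid(freq, freq, indexing="ij")
radius = np.hypot(ux, uy)
R = 16.0                                         # aperture radius in frequency samples
pupil_amp = (radius <= R).astype(float)          # clear circular aperture |H_k(u)|
defocus = 2 * np.pi * 0.9 * (radius / R) ** 2 * pupil_amp  # 0.9 wave peak to valley

rng = np.random.default_rng(0)
phase = np.zeros((N, N))                         # no aberration, to keep checks simple
obj = rng.uniform(0.0, 100.0, size=(N, N))       # hypothetical nonnegative object

psf_conv = psf_from_phase(pupil_amp, phase, np.zeros((N, N)))  # k = 1: conventional channel
psf_div = psf_from_phase(pupil_amp, phase, defocus)            # k = 2: diversity channel
g_conv = noiseless_image(obj, psf_conv)
g_div = noiseless_image(obj, psf_div)

# Poisson data with mean g_jk(x); the clip guards against tiny negative round-off
d_conv = rng.poisson(np.clip(g_conv, 0.0, None))
```

Because each PSF sums to one, the noiseless image carries the same total flux as the object, and the defocused PSF has a lower peak than the conventional one.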
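3. MAP ESTIMATION

In this section we present a Bayesian approach for estimating the object, the FA’s, and the DA’s. Notationally, we will let $\boldsymbol{\alpha}$ denote the $L'$-dimensional vector

$$ \boldsymbol{\alpha} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_{L'} \end{pmatrix}, \tag{7} $$

and $\boldsymbol{\beta}_j$ will denote the $L$-dimensional vector

$$ \boldsymbol{\beta}_j = \begin{pmatrix} \beta_{j1} \\ \beta_{j2} \\ \vdots \\ \beta_{jL} \end{pmatrix}, \qquad j = 1, \ldots, J. \tag{8} $$

For a fixed $j$ the joint probability density function (pdf) for the data $\mathbf{d}_j = \{d_{jk}(x); k, x\}$ and for the aberration parameters $\boldsymbol{\beta}_j$ depends on the deterministic parameters $f$ and $\boldsymbol{\alpha}$. We denote this pdf by $p(\mathbf{d}_j, \boldsymbol{\beta}_j; f, \boldsymbol{\alpha})$. The full pdf of all the data $\mathbf{d}_1, \ldots, \mathbf{d}_J$ and aberration parameters $\boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_J$, which depends on the object $f$ and $\boldsymbol{\alpha}$, can be written as

$$ p(\mathbf{d}_1, \ldots, \mathbf{d}_J, \boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_J; f, \boldsymbol{\alpha}) = p(\mathbf{d}_1, \ldots, \mathbf{d}_J \mid \boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_J; f, \boldsymbol{\alpha})\, p(\boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_J; f, \boldsymbol{\alpha}) \tag{9} $$

$$ = \left[ \prod_{j=1}^{J} p(\mathbf{d}_j \mid \boldsymbol{\beta}_j; f, \boldsymbol{\alpha}) \right] \left[ \prod_{j=1}^{J} p(\boldsymbol{\beta}_j) \right] \tag{10} $$

$$ = \prod_{j=1}^{J} p(\mathbf{d}_j \mid \boldsymbol{\beta}_j; f, \boldsymbol{\alpha})\, p(\boldsymbol{\beta}_j), \tag{11} $$

where $p(\mathbf{d}_j \mid \boldsymbol{\beta}_j; f, \boldsymbol{\alpha})$ denotes the conditional probability of the data $\mathbf{d}_j$, given a fixed value of $\boldsymbol{\beta}_j$ (and object $f$ and FA parameter vector $\boldsymbol{\alpha}$), and $p(\boldsymbol{\beta}_j)$ denotes the pdf of the DA random vector $\boldsymbol{\beta}_j$. Here Eq. (9) is a simple consequence of the definition of conditional pdf, and Eq. (10) follows from the independence of the data and the DA’s over realizations along with the assumption that the DA random vector $\boldsymbol{\beta}_j$ does not depend on the object $f$ or the FA parameter vector $\boldsymbol{\alpha}$. Note that the above equations do not depend on the noise model or the particular pdf for the $\boldsymbol{\beta}_j$’s. For the special case of a Poisson noise model and the previously discussed multivariate Gaussian model for the DA vector $\boldsymbol{\beta}_j$, we have

$$ p(\mathbf{d}_j \mid \boldsymbol{\beta}_j; f, \boldsymbol{\alpha}) = \prod_{k=1}^{K} \prod_{x} \frac{\exp[-g_{jk}(x)]\, [g_{jk}(x)]^{d_{jk}(x)}}{d_{jk}(x)!}, \tag{12} $$

$$ p(\boldsymbol{\beta}_j) = \frac{1}{\sqrt{(2\pi)^{L} |\Sigma|}} \exp\left( -\tfrac{1}{2} \boldsymbol{\beta}_j' \Sigma^{-1} \boldsymbol{\beta}_j \right), \tag{13} $$

where $\Sigma$ is the covariance matrix of the aberration parameter vectors $\{\boldsymbol{\beta}_j\}$, $|\Sigma|$ denotes the determinant of $\Sigma$, and the prime applied to a vector indicates a transpose. Note that the noiseless image $g_{jk}(x)$ depends implicitly on the object $f$ and the parameter vectors $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}_j$. The expression for the full pdf can be reinterpreted as a likelihood function12 when the data are known and the object and aberrations are unknown. Substituting Eqs. (12) and (13) into Eq. (11), taking the logarithm, and discarding terms that are not a function of the parameters, we find that the log-likelihood function is given by

$$ \mathcal{L} \equiv \mathcal{L}(f, \boldsymbol{\beta}_1, \ldots, \boldsymbol{\beta}_J, \boldsymbol{\alpha}) \equiv \sum_{j=1}^{J} \sum_{k=1}^{K} \sum_{x} \{ d_{jk}(x) \log[g_{jk}(x)] - g_{jk}(x) \} - \frac{1}{2} \sum_{j=1}^{J} \boldsymbol{\beta}_j' \Sigma^{-1} \boldsymbol{\beta}_j. \tag{14} $$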
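The objective in Eq. (14) is the sum of a Poisson data term and a Gaussian penalty on the DA coefficients. The sketch below shows one way to evaluate it in NumPy. It is only an illustration under stated assumptions: the function names, the array layout (data of shape `(J, K, N, N)`, coefficients of shape `(J, L)`), the small `eps` guard inside the logarithm, and the toy inputs are not taken from the paper.

```python
import numpy as np

def poisson_loglik(d, g, eps=1e-12):
    """First term of Eq. (14): sum of d*log(g) - g (the constant -log d! is dropped).
    eps is a hypothetical guard against log(0)."""
    return float(np.sum(d * np.log(g + eps) - g))

def gaussian_log_prior(beta, Sigma):
    """Second term of Eq. (14): -1/2 sum_j beta_j' Sigma^{-1} beta_j; beta has shape (J, L)."""
    solved = np.linalg.solve(Sigma, beta.T)        # Sigma^{-1} beta_j for every j, shape (L, J)
    return float(-0.5 * np.sum(beta.T * solved))

def map_objective(d, g, beta, Sigma):
    """Eq. (14): Poisson log-likelihood plus Gaussian log-prior on the DA coefficients."""
    return poisson_loglik(d, g) + gaussian_log_prior(beta, Sigma)

# --- Toy inputs: all values are illustrative assumptions ---
rng = np.random.default_rng(1)
g = rng.uniform(5.0, 50.0, size=(2, 2, 8, 8))      # hypothetical noiseless images g_jk(x)
d = rng.poisson(g).astype(float)                   # Poisson data with mean g
beta = rng.standard_normal((2, 3))                 # J = 2 realizations, L = 3 coefficients
Sigma = np.eye(3)
value = map_objective(d, g, beta, Sigma)
```

For fixed data the Poisson term is largest when $g_{jk}(x) = d_{jk}(x)$, while the prior term is zero at $\boldsymbol{\beta}_j = 0$ and negative elsewhere, which is the source of the regularizing pull toward statistically plausible aberrations.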
$\mathcal{L}$ represents the full log-likelihood of both the deterministic parameters ($f$ and $\boldsymbol{\alpha}$) and the random parameters $\{\boldsymbol{\beta}_j\}$. For an alternative interpretation one can assume a uniform (improper) prior pdf on $f$ and $\boldsymbol{\alpha}$. In this case $\mathcal{L}$ can be viewed as the log of the a posteriori pdf of $f$, $\boldsymbol{\alpha}$, and $\{\boldsymbol{\beta}_j\}$, given the data $\{d_{jk}(x)\}$. It is natural to seek parameter estimates that maximize $\mathcal{L}$, i.e., (generalized) MAP estimates. For the rest of this paper we will concentrate on the MAP estimation of the parameters $f$, $\boldsymbol{\alpha}$, and $\{\boldsymbol{\beta}_j\}$ based on maximizing $\mathcal{L}$.

The objective function can be more efficiently parameterized by describing all the aberrations in terms of one unified aberration basis set $\{c_l\}_{l=1}^{L_1}$ that spans both $\{b_l\}_{l=1}^{L}$ and $\{a_{l'}\}_{l'=1}^{L'}$; i.e., each of these aberration functions can be written as a linear combination of $\{c_l\}$. For the remainder of this section we assume that the DA basis $\{b_l\}$ spans the FA basis set $\{a_{l'}\}$, thus allowing us to choose $L_1 = L$ and $\{c_l\} = \{b_l\}$. An important special case in which this assumption is satisfied is when $\{b_l\}$ corresponds to the point-by-point representation. An alternative strategy for selecting the spanning-basis set and doing MAP estimation is presented in Appendix A for the general case in which the DA basis $\{b_l\}$ does not necessarily span the FA basis set $\{a_{l'}\}$. In our present scheme each FA basis function $a_{l'}$ can be expressed as a linear combination of the basis functions $\{c_l\}$,

$$ a_{l'} = \sum_{l=1}^{L} \rho_{ll'}\, c_l, \tag{15} $$

where we have introduced the coefficients $\{\rho_{ll'}\}_{l=1}^{L}$. Accordingly, we can express the total aberration function $\phi_j(u)$ as a linear combination of the $\{c_l\}$ basis functions,

$$ \phi_j(u) = \sum_{l=1}^{L} c_{jl}\, c_l(u), \tag{16} $$

where the set $\{c_{jl}\}$ now represents coefficients encompassing both the DA’s and the FA’s. In this new representation the DA’s (which change over realization $j$) are given by

$$ \begin{aligned} \phi_j^{(d)} &= \phi_j - \phi^{(f)} \\ &= \sum_{l=1}^{L} c_{jl}\, c_l - \sum_{l'=1}^{L'} \alpha_{l'}\, a_{l'} \\ &= \sum_{l=1}^{L} c_{jl}\, c_l - \sum_{l'=1}^{L'} \alpha_{l'} \left( \sum_{l=1}^{L} \rho_{ll'}\, c_l \right) \\ &= \sum_{l=1}^{L} \left[ c_{jl} - \left( \sum_{l'=1}^{L'} \rho_{ll'}\, \alpha_{l'} \right) \right] c_l \\ &= \sum_{l=1}^{L} [ c_{jl} - (\rho \boldsymbol{\alpha})_l ]\, c_l = \sum_{l=1}^{L} \beta_{jl}\, c_l, \end{aligned} \tag{17} $$

where $\rho$ is a matrix with elements $\rho_{ll'}$. Notationally, let $\mathbf{c}_j$ denote the $L$-dimensional vector corresponding to $\{c_{jl}\}$. Under this new description the stochastic vectors $\{\mathbf{c}_j\}_{j=1}^{J}$ are independent and identically distributed as multivariate Gaussian with a nonzero mean vector determined by the fixed aberrations, i.e., a mean vector equal to $\rho \boldsymbol{\alpha}$ and a covariance matrix specified by $\Sigma$. The log-likelihood function is now given by

$$ \mathcal{L} = \sum_{j=1}^{J} \sum_{k=1}^{K} \sum_{x} \{ d_{jk}(x) \log[g_{jk}(x)] - g_{jk}(x) \} - \frac{1}{2} \sum_{j=1}^{J} (\mathbf{c}_j - \rho \boldsymbol{\alpha})' \Sigma^{-1} (\mathbf{c}_j - \rho \boldsymbol{\alpha}), \tag{18} $$

where $g_{jk}$ depends explicitly only on $\mathbf{c}_j$ and $f$ (not on $\boldsymbol{\alpha}$). For a fixed set $\{\mathbf{c}_j\}$ there is a closed-form expression for the vector $\boldsymbol{\alpha}$ that maximizes $\mathcal{L}$. In Appendix B we provide a straightforward application of generalized least-squares-estimation theory13 showing that the vector $\boldsymbol{\alpha}$ that maximizes $\mathcal{L}$ is given by

$$ \boldsymbol{\alpha}_*(\bar{\mathbf{c}}) \equiv (\rho' \Sigma^{-1} \rho)^{-1} \rho' \Sigma^{-1} \bar{\mathbf{c}}, \tag{19} $$

where $\bar{\mathbf{c}}$ is the vector corresponding to the mean of all the $\mathbf{c}_j$’s,

$$ \bar{\mathbf{c}} = \frac{1}{J} \sum_{j=1}^{J} \mathbf{c}_j. \tag{20} $$

This equation for $\boldsymbol{\alpha}_*$ can be substituted into Eq. (18) to derive a new objective function or modified log-likelihood function,

$$ \mathcal{L}_* = \sum_{j=1}^{J} \sum_{k=1}^{K} \sum_{x} \{ d_{jk}(x) \log[g_{jk}(x)] - g_{jk}(x) \} - \frac{1}{2} \sum_{j=1}^{J} [\mathbf{c}_j - \rho \boldsymbol{\alpha}_*(\bar{\mathbf{c}})]' \Sigma^{-1} [\mathbf{c}_j - \rho \boldsymbol{\alpha}_*(\bar{\mathbf{c}})] \tag{21} $$

$$ = \sum_{j=1}^{J} \sum_{k=1}^{K} \sum_{x} \{ d_{jk}(x) \log[g_{jk}(x)] - g_{jk}(x) \} - \frac{1}{2} \left[ \sum_{j=1}^{J} \mathbf{c}_j' \Sigma^{-1} \mathbf{c}_j - J \bar{\mathbf{c}}' \Sigma^{-1} \rho (\rho' \Sigma^{-1} \rho)^{-1} \rho' \Sigma^{-1} \bar{\mathbf{c}} \right] \tag{22} $$

$$ = \sum_{j=1}^{J} \sum_{k=1}^{K} \sum_{x} \{ d_{jk}(x) \log[g_{jk}(x)] - g_{jk}(x) \} - \frac{1}{2} \left( \sum_{j=1}^{J} \mathbf{c}_j' \Sigma^{-1} \mathbf{c}_j - J \bar{\mathbf{c}}' B_{\Sigma\rho} \bar{\mathbf{c}} \right) \tag{23} $$

$$ \equiv \mathcal{L}_1 + \mathcal{L}_2, \tag{24} $$

where we have defined the positive-definite matrix

$$ B_{\Sigma\rho} \equiv \Sigma^{-1} \rho (\rho' \Sigma^{-1} \rho)^{-1} \rho' \Sigma^{-1}. \tag{25} $$

Equation (22) can be derived by means of straightforward algebra or, alternatively, through invoking generalized least-squares theory.13 See Appendix B for more details. The objective function $\mathcal{L}_*$ is seen to consist of two major terms: the first corresponding to the usual log-likelihood function2 and the second corresponding to the Bayesian prior on the DA’s. MAP estimates, denoted with a caret, can be found by identifying the $f$ and the parameter vectors $\mathbf{c}_1, \ldots, \mathbf{c}_J$ that maximize $\mathcal{L}_*$ in Eq. (24). The MAP estimates of the aberration vectors, $\hat{\mathbf{c}}_1, \ldots, \hat{\mathbf{c}}_J$, can be substituted into Eq. (19) to find the MAP estimate of the fixed-aberration vector $\boldsymbol{\alpha}$, resulting in

$$ \hat{\boldsymbol{\alpha}} \equiv \boldsymbol{\alpha}_*(\hat{\bar{\mathbf{c}}}) = (\rho' \Sigma^{-1} \rho)^{-1} \rho' \Sigma^{-1} \hat{\bar{\mathbf{c}}}, \tag{26} $$

where $\hat{\bar{\mathbf{c}}}$ is the vector corresponding to the mean of all the $\hat{\mathbf{c}}_j$’s.

A variety of optimization algorithms rely on expressions for the gradient of the objective function with respect to the object and aberration parameters. Expressions for the partial derivatives of the first term $\mathcal{L}_1$ have appeared earlier in Ref. 1. The partial derivative of the second term $\mathcal{L}_2$ with respect to the aberration parameter $c_{jl}$ is found to be

$$ \frac{\partial \mathcal{L}_2}{\partial c_{jl}} = -(\Sigma^{-1} \mathbf{c}_j)_l + (B_{\Sigma\rho} \bar{\mathbf{c}})_l. \tag{27} $$

In the important special case in which there are only dynamic aberrations, the entries in the matrix $B_{\Sigma\rho}$ are all zeros, and the resultant objective function is given by

$$ \mathcal{L}_m = \sum_{j=1}^{J} \sum_{k=1}^{K} \sum_{x} \{ d_{jk}(x) \log[g_{jk}(x)] - g_{jk}(x) \} - \frac{1}{2} \left( \sum_{j=1}^{J} \mathbf{c}_j' \Sigma^{-1} \mathbf{c}_j \right) \equiv \mathcal{L}_1 + \mathcal{L}_2, \tag{28} $$

and the resultant gradient equations simplify to

$$ \frac{\partial \mathcal{L}_2}{\partial c_{jl}} = -(\Sigma^{-1} \mathbf{c}_j)_l. \tag{29} $$
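The prior term $\mathcal{L}_2$ and its gradient lend themselves to a quick numerical check. The NumPy sketch below evaluates the matrix of Eq. (25), the second term of Eq. (23), and the gradient of Eq. (27) on random toy inputs. The function names, the array layout (`c` of shape `(J, L)`), and the randomly generated $\Sigma$ and $\rho$ are assumptions made purely for illustration.

```python
import numpy as np

def projector_B(Sigma, rho):
    """Eq. (25): B = Sigma^{-1} rho (rho' Sigma^{-1} rho)^{-1} rho' Sigma^{-1}."""
    Si_rho = np.linalg.solve(Sigma, rho)               # Sigma^{-1} rho, shape (L, L')
    return Si_rho @ np.linalg.solve(rho.T @ Si_rho, Si_rho.T)

def prior_term(c, Sigma, B):
    """Second term L2 of Eq. (23); c has shape (J, L), one row per realization."""
    J = c.shape[0]
    c_bar = c.mean(axis=0)                             # Eq. (20)
    quad = np.sum(c * np.linalg.solve(Sigma, c.T).T)   # sum_j c_j' Sigma^{-1} c_j
    return -0.5 * (quad - J * c_bar @ B @ c_bar)

def prior_gradient(c, Sigma, B):
    """Eq. (27): dL2/dc_jl = -(Sigma^{-1} c_j)_l + (B c_bar)_l, returned with shape (J, L)."""
    c_bar = c.mean(axis=0)
    return -np.linalg.solve(Sigma, c.T).T + B @ c_bar

# --- Toy inputs: sizes and values are illustrative assumptions ---
rng = np.random.default_rng(2)
J, L, Lp = 4, 5, 2                                     # Lp plays the role of L'
A = rng.standard_normal((L, L))
Sigma = A @ A.T + L * np.eye(L)                        # symmetric positive-definite covariance
rho = rng.standard_normal((L, Lp))                     # hypothetical expansion coefficients
c = rng.standard_normal((J, L))

B = projector_B(Sigma, rho)
L2 = prior_term(c, Sigma, B)
grad = prior_gradient(c, Sigma, B)
```

Since $\mathcal{L}_2$ is quadratic in the coefficients, a central finite difference of `prior_term` should agree with `prior_gradient` up to round-off, and the value of `prior_term` should match the residual form in Eq. (21).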
To summarize what we have shown so far, we have written the log-likelihood function as a function of only the object $f$ and the aberration vectors $\mathbf{c}_1, \ldots, \mathbf{c}_J$. After this function is optimized, the FA parameter vector $\boldsymbol{\alpha}$ is estimated from the estimated aberration vectors $\hat{\mathbf{c}}_1, \ldots, \hat{\mathbf{c}}_J$ as in Eq. (26). The actual estimation algorithms are based on maximizing the appropriate objective function by use of an appropriate nonlinear optimization algorithm such as the limited-memory quasi-Newton optimization scheme.14 For MAP estimation the appropriate objective function is given by Eq. (24) for both FA’s and DA’s and is given by Eq. (28) for DA’s alone. For maximum-likelihood (ML) estimation the appropriate objective function being optimized is given by the first term, $\mathcal{L}_1$, in both expressions.
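The final step, recovering the FA parameters from the estimated aberration vectors, is a small generalized least-squares solve. The sketch below illustrates Eq. (26) in NumPy under stated assumptions: the function name is hypothetical, the diagonal $\Sigma$ and the orthonormal-column $\rho$ are arbitrary toy choices, and synthetic vectors $\rho\boldsymbol{\alpha} + \boldsymbol{\beta}_j$ stand in for the output of an actual MAP optimization.

```python
import numpy as np

def fixed_aberration_estimate(c_hat, Sigma, rho):
    """Eq. (26): generalized least-squares estimate of alpha from the mean of the c_j estimates."""
    c_bar = c_hat.mean(axis=0)                       # Eq. (20) applied to the estimates
    Si_rho = np.linalg.solve(Sigma, rho)             # Sigma^{-1} rho
    return np.linalg.solve(rho.T @ Si_rho, Si_rho.T @ c_bar)

# --- Toy inputs: all values are illustrative assumptions ---
rng = np.random.default_rng(3)
L, Lp, J = 6, 2, 400                                 # Lp plays the role of L'
rho, _ = np.linalg.qr(rng.standard_normal((L, Lp)))  # (L, Lp) matrix with orthonormal columns
variances = np.linspace(0.01, 0.09, L)
Sigma = np.diag(variances)                           # hypothetical DA covariance
alpha_true = np.array([0.5, -0.3])

# Stand-in for the optimizer output: c_j = rho @ alpha with no dynamic part
c_exact = np.tile(rho @ alpha_true, (J, 1))
alpha_exact = fixed_aberration_estimate(c_exact, Sigma, rho)

# Stand-in with zero-mean Gaussian dynamic coefficients beta_j added
beta = rng.standard_normal((J, L)) * np.sqrt(variances)
alpha_noisy = fixed_aberration_estimate(c_exact + beta, Sigma, rho)
```

With no dynamic component the estimate reproduces `alpha_true` exactly; with the dynamic component added, averaging over the $J$ realizations reduces the spread of the estimate roughly as $1/\sqrt{J}$.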
Table 1. Simulation Details for Model of Image Formation with Dynamic Aberrations Only

| Model Component | Description |
| --- | --- |
| Aperture | 32 × 32 pixels |
| Aberrations | Dynamic aberrations only; Kolmogorov turbulence, $D/r_0 = 5$ |
| Phase-diversity function | 0.9 wave quadratic (peak to valley) |
| Image plane | 64 × 64 pixels |
| Total photons per PD pair | $2 \times 10^7$ |
| Number of PD pairs | $J = 20$ |
| Noise model | Poisson |
4. SIMULATION RESULTS

In this section we present simulation results demonstrating the improvement realized by using the MAP estimation approach over using an approach that assumes that the DA’s are deterministic. We consider both the case with dynamic aberrations only and the case with fixed and dynamic aberrations.

A. Dynamic Aberrations Only

Consider a case in which there are no fixed aberrations and the dynamic aberrations are induced by Kolmogorov turbulence. In our simulations we assumed an aberration strength of $D/r_0 = 5$, where $D$ denotes the diameter of the aperture and $r_0$ is the correlation diameter or seeing parameter.15 Additional simulation details can be found in Table 1.

The discrete object used in these simulations was a jet on a runway, shown in Fig. 2 along with the corresponding diffraction-limited image. The bottom two rows in Fig. 2 show two different phase-diversity image pairs in which the images include the effects of the aberrations and photon noise. Each pair of images consists of one conventional image and one diversity image; an intentional extra 0.9 wave of defocus has been applied to the diversity image. As is apparent from these images, the aberrations are fairly significant in their effect on the conventional image quality.

Fig. 2. (a) Discrete jet object used in simulations; (b) diffraction-limited image; (c) conventional image for first aberration realization $j = 1$, $k = 1$; (d) diversity image for $j = 1$, $k = 2$; (e) conventional image for second aberration realization $j = 2$, $k = 1$; (f) diversity image for $j = 2$, $k = 2$.

On the basis of the 20 pairs of phase-diversity images, we estimated the object and each of the 20 aberration functions with both the MAP PDS algorithm and the ML PDS algorithm. For both algorithms the initial object estimate was obtained by shifting and adding the 20 conventional images; the shift was determined by cross correlation with a reference image (one of the blurred images). The initial phase-aberration-function estimates were linear phase functions corresponding to the individual shifts. To quantify performance as a function of iteration, we ran 1200 iterations of the limited-memory quasi-Newton optimization algorithm on both MAP and ML objective functions.

A figure of merit for assessing the accuracy of object estimates is the normalized root-mean-squared error (NRMSE) of the object estimates, defined as

$$ \mathrm{NRMSE} = \left\{ \frac{\sum_{x} [\hat{f}(x - x_o) - f(x)]^2}{\sum_{x} f(x)^2} \right\}^{1/2}, \tag{30} $$
where $x_o$ is the subpixel shift (realized by a linear phase in the Fourier domain) that minimizes the metric in Eq. (30). In Fig. 3 we present plots of the NRMSE for object estimates as a function of iteration for both the MAP and ML algorithms. As can be seen from the plots, the NRMSE of the ML object estimate decreases for 40–60 iterations and then increases fairly rapidly. This increase is due to the ML algorithm’s starting to overfit the data and fitting the noise. On the other hand, the Bayes prior has a desirable regularizing influence as evidenced by the NRMSE plot for the MAP estimate. For the MAP estimate the NRMSE decreases a little more slowly, achieves a similar value, and then increases much more slowly relative to the ML iterations. Regularization by terminating iterations is an often-used, though ad hoc, strategy. Apparently the MAP algorithm is more tolerant in the selection of the best terminating iteration number than is the ML algorithm. MAP and ML object estimates are given in Fig. 4, for which both object estimates were selected on the basis of the (optimal) iteration at which the NRMSE was minimized. The quality of the estimates appears comparable.

Fig. 3. Normalized RMSE as a function of iteration for MAP and ML object estimates from PDS data.

Fig. 4. PDS object estimates: (a) MAP object estimate (from iteration 90), (b) ML object estimate (from iteration 60).

A figure of merit for assessing the accuracy of aberration estimates is the RMSE of all the phase-aberration-function estimates,

$$ \mathrm{RMSE} = \left\{ \frac{\sum_{j=1}^{20} \sum_{u} [\hat{\phi}_j(u) - \phi_j(u) - m_j]^2}{20 M} \right\}^{1/2}, \tag{31} $$

where $M$ is the number of points $u$ in the discrete aperture, $\hat{\phi}_j$ is the estimate (MAP or ML) of the $j$th phase-aberration function, and $m_j$ is the mean of $\hat{\phi}_j - \phi_j$ over the discrete aperture. In Fig. 5 we present plots of the RMSE for the 20 aberration estimates as a function of iteration for both the MAP and the ML estimation algorithms. As can be seen from the plots, the RMSE of the ML aberration estimates decreases for 50–70 iterations and then increases moderately. On the other hand, the RMSE for the MAP aberration estimates decreases as a function of iteration and is well below the minimum value achieved by the ML algorithm from approximately 100 iterations on.

Fig. 5. RMSE of MAP and ML PDS phase-aberration function estimates. The minimum value for MAP is 0.089 waves, whereas the minimum value for ML is 0.152 waves.

Actual aberration estimates are depicted in Fig. 6. In this figure we show the true phase-aberration function for $j = 1$, the corresponding phase-aberration estimate for both the MAP and ML estimation algorithms, and images of the difference between each estimate and the true aberration. On comparing the actual estimates and the absolute error images, we see that the MAP aberration estimates are preferable. Specifically, the error image corresponding to the MAP aberration estimate has a significant reduction in magnitude and has less local correlation than the ML error image. The analogous results for the other realizations, $j = 2, \ldots, 20$, are comparable. When the goal is to estimate dynamic aberrations on the basis of PDS data, MAP estimation clearly outperforms the ML approach.

Fig. 6. PDS aberration estimates: (a) True phase-aberration function for realization $j = 1$, (b) MAP aberration estimate (iteration 1200), (c) ML aberration estimate (iteration 70), (d) image of absolute error for the MAP aberration estimate, (e) image of absolute error for the ML aberration estimate, displayed on same scale as (d).

Since the aberrations manifest themselves in the imaging process by means of the PSF’s they generate, it is of interest to assess how well the MAP and ML approaches estimate PSF’s. In Fig. 7 we present plots of the RMSE for the 20 PSF estimates as a function of iteration for both the MAP and the ML estimation approaches. As can be seen from the plots, the RMSE of the ML PSF estimates decreases for 50–80 iterations and then increases slowly for the remainder of the iterations. On the other hand, the RMSE of the MAP PSF estimates decreases monotonically to a value that is again significantly below the minimum RMSE value achieved by the ML scheme. Thus for PSF estimation, as for aberration estimation, the MAP estimation algorithm outperforms the ML approach. However, the differences between the estimation approaches as quantified in the metric of PSF RMSE are slightly less dramatic than the differences observed in the metric of the aberration RMSE, which was shown in Fig. 5.

Fig. 7. RMSE of MAP and ML PDS PSF estimates for Kolmogorov aberrations.

These simulation results, which quantify performance of the object, the phase-aberration function, and PSF estimates, suggest that prior statistical knowledge regarding dynamic aberrations provides a strong regularization influence on the estimates for the phase-aberration function and for the PSF and only a mild regularizing influence on the object estimates from PDS data. We believe that additional strategies for regularizing object estimates16 can be brought to bear gainfully on this problem.

B. Fixed and Dynamic Aberrations

In Section 4.A we presented simulation results for the case involving only dynamic aberrations. Here we present simulation results for the more general case in which the aberrations are a mixture of both fixed and dynamic aberrations. The simulation details for these experiments, with the exception of the addition of the fixed aberrations, are the same as in Section 4.A (see Table 1). Our model for the fixed aberrations corresponds to a segmented aperture with nine separate segments, each having a piston misalignment, as depicted in Fig. 8. Although the fixed aberrations are somewhat contrived, they serve to illustrate a procedure that can be applied to any parameterized fixed aberrations. Since a global piston has no effect on the incoherent PSF, we arbitrarily set the piston of the first segment to 0, with piston values for segments 2–9 being identified relative to the first segment. The eight piston misalignments were derived from a random-number generator and are given in Table 2.

Fig. 8. Phase-aberration function for segmented aperture with fixed piston errors only.

Table 2. Fixed Piston Misalignment Values for Each of the Segments That Were Used in the Simulations^a

| Segment | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Misalignment (wave) | 0 | -0.124 | 0.167 | 0.137 | -0.042 | 0.057 | -0.167 | 0.046 | 0.089 |
| Estimated values | 0 | -0.157 | 0.146 | 0.144 | -0.074 | 0.029 | -0.183 | 0.046 | 0.110 |
| Error | 0 | 0.033 | 0.021 | -0.008 | 0.033 | 0.029 | 0.016 | 0.000 | -0.021 |

^a The bottom two rows are the MAP estimates of the misalignments and the corresponding estimation errors.

We simulated aberrated and noisy image data in a manner similar to the procedure outlined in Section 4.A and ran the MAP PDS estimation algorithm for 700 iterations using these data as inputs. As before, we started with a shift-and-add image for the initial object estimate and linear phase functions for the initial aberration estimates. In this case the MAP estimates for the object and the DA’s are similar to those shown in Section 4.A and are not reproduced here. Instead we focus on the MAP estimation results of fixed aberrations. Figure 9 shows plots of the estimated piston values as a function of iteration for the fourth and the seventh segments. In Fig. 10 we have plotted the error of the MAP PDS fixed-aberration estimates as a function of iteration for segments 2–9. As can be seen from the plots, the algorithm performed well in estimating the piston values for these two segments. Apparently, a large number of iterations is required for the algorithm to find the fixed-aberration parameters. We believe that using alternative optimization strategies, such as that of Levenberg–Marquardt,17 will dramatically reduce the number of iterations required.

Fig. 9. Plots of MAP PDS piston estimates as a function of iteration. The true piston value is denoted with a dashed line; the estimates are solid curves. (a) Segment 4, (b) segment 7.

The bottom two rows of Table 2 show the MAP estimates and the corresponding estimation error for the misalignment of each segment. As can be seen from these plots, the MAP approach for estimating the fixed aberration performed well in that each of the piston estimates converged to a value that was within ±0.033 waves of the true value. When a best-fit global piston is subtracted, the total rms phase error is less than 0.021 waves. The accuracy of these fixed-aberration estimates is very encouraging. These results demonstrate that the MAP estimation algorithm applied to PDS data provides a valuable approach for estimating fixed aberrations, given known models for the FA’s (in this case, piston errors), and a prior statistical model for the DA’s.
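The two accuracy figures quoted for the piston estimates can be approximately reproduced from the rounded entries of Table 2. The short NumPy sketch below does so under one assumption that the paper does not state: all nine segments are weighted equally when the best-fit global piston (the mean error) is removed and the rms is formed. The mean removal mirrors the role of $m_j$ in Eq. (31).

```python
import numpy as np

# Rows of Table 2, in waves (values as printed, rounded to three decimals)
true_piston = np.array([0.0, -0.124, 0.167, 0.137, -0.042, 0.057, -0.167, 0.046, 0.089])
map_estimate = np.array([0.0, -0.157, 0.146, 0.144, -0.074, 0.029, -0.183, 0.046, 0.110])

error = true_piston - map_estimate          # sign convention of the "Error" row
max_abs_error = np.max(np.abs(error))       # worst-case segment error

# Subtract the best-fit global piston (mean error), then take the rms.
# Equal weighting of the segments is an assumption, not taken from the paper.
residual = error - error.mean()
rms_piston_removed = np.sqrt(np.mean(residual ** 2))

print(f"max |error| = {max_abs_error:.3f} waves")
print(f"rms after global-piston removal = {rms_piston_removed:.3f} waves")
```

Under these assumptions the worst-case error is 0.033 waves and the piston-removed rms comes out near 0.018 waves, which is consistent with the stated bound of 0.021 waves; small differences from the paper's own figure could arise from rounding in the table or from a different weighting.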
Fig. 10. Plots of error for each of the MAP PDS piston estimates as a function of iteration. Note that these plots all converge to values within a fairly compact neighborhood of zero.

5. SUMMARY

We have presented a general Bayesian framework for the joint estimation of fixed aberrations, dynamic aberrations, and the object from phase-diverse speckle data. The framework and resulting MAP estimation algorithm allow us to exploit our prior knowledge about the statistics of the dynamic aberrations. We have also presented simulation results showing that the MAP algorithm significantly outperforms the ML algorithm in phase-aberration and PSF estimation. Although the MAP algorithm has a mild regularization effect on object estimation (when iterations are terminated), we believe that additional strategies for regularizing object estimates should be employed. Finally, we presented simulation results demonstrating the potential of the MAP PDS estimation algorithm for estimating fixed aberrations in the presence of dynamic aberrations when the object was unknown and extended.

$$ \phi_j(u) = \sum_{l=1}^{L} c_{jl}^{(o)}\, c_l(u) + \sum_{l=1}^{L_1 - L} c_l^{(1)}\, c_{L+l}(u). \tag{A3} $$

Let $\mathbf{c}_j^{(o)}$ denote the $L$-dimensional vector

$$ \mathbf{c}_j^{(o)} = \begin{pmatrix} c_{j1} \\ c_{j2} \\ \vdots \\ c_{jL} \end{pmatrix}, \tag{A4} $$

let $\mathbf{c}^{(1)}$ denote the $(L_1 - L)$-dimensional vector

$$ \mathbf{c}^{(1)} = \begin{pmatrix} c_{L+1} \\ c_{L+2} \\ \vdots \\ c_{L_1} \end{pmatrix}, $$

and let $\rho^{(o)}$ and $\rho^{(1)}$ denote matrices with elements $\{\rho_{ll'}^{(o)}\}_{l=1}^{L}$ and $\{\rho_{ll'}^{(1)}\}_{l=1}^{L_1 - L}$, respectively. With the same multivariate Gaussian model on the DA’s as was assumed earlier in the paper, the vectors $\mathbf{c}_j^{(o)}$ can be modeled as independent multivariate Gaussian with a mean vector given by

$$ \boldsymbol{\mu}_{c^{(o)}} = \rho^{(o)} \boldsymbol{\alpha}. $$
APPENDIX A: EXTENSION TO THE GENERAL CHOICE OF BASIS FUNCTIONS In the main body of the paper we gave the theory for Bayes MAP estimation, assuming that each FA basis function a l 8 can be represented by a linear combination of L . In this appendix we present DA basis functions $ b l % l51 a more general approach that is valid for arbitrary choice of FA and DA basis functions. We start by choosing a set of aberration basis functions L1 c , with L 1 > L, for which $ l % l51 L
1 a. $ c l % l51 spans both the FA basis functions $ a l 8 % and DA basis functions $ b l % ,
L L 5 $ b l % l51 , b. $ c l % l51
( c ~ u !c u
l
l 1~ u !
50
for
1 < l < L, L 1 1 < l1 < L1 .
(A1)
Now letting P o denote the orthogonal projection operator onto the linear subspace spanned by the functions L and letting P 1 denote the orthogonal projection op$ c l % l51 L1 erator onto the linear subspace spanned by $ c l % l5L11 , we have that the FA basis function a l 8 can be written as the sum
L 1 2L
5
(
l51
(o)
o r ~ll 8! c l
(1) L
1
(
l51
a. r in B S r [Eq. (25)] is replaced by r ( o ) ; b. The first term L1 of the modified log-likelihood function depends on the object f and on aberration param(o) eter sets $ c jl % and $ c l( 1 ) % ; (o) c. When one has found the MAP estimates for $ c jl % (1) and $ c l % , the final estimate of parameter vector a for the fixed-aberration parameters is derived from the estimates cˆ j( o ) and cˆ ( 1 ) by the formula aˆ 5 @~ r ~ o ! ! 8 S 21 r ~ o ! # 21 ~ r ~ o ! ! 8 S 21¯cˆ ~ o ! (A7)
APPENDIX B: GENERALIZED LEASTSQUARES-ESTIMATE DERIVATION ¯) In this appendix we rigorously justify the estimator a (c * as given in Eq. (19). We start with the goal of finding the a that minimizes the quantity J
22L2 5
( ~c
j
2 r a ! 8 S 21 ~ c j 2 r a ! ,
(B1)
j51
a l8 5 P oa l8 1 P 1a l8 L
(A6)
Again we can solve for a and derive a modified loglikelihood function that is very similar to that given in Eq. (24) with the only differences being that
1 @~ r ~ 1 ! ! 8 r ~ 1 ! # 21 ~ r ~ 1 ! ! 8 cˆ ~ 1 ! .
L
L 1 are orthogonal relative to $ c l % l51 , c. $ c l % l5L11
(A5)
1 r ~ll 8! c L1l
,
(A2)
L 12L where $ r ll 8 % l51 and $ r ll 8 % l51 denote the appropriate coefficients. In this context, performing a derivation similar to the one carried out in Eqs. (17), one can show that the total aberration phase function f j can be represented as
and we claim that the a minimizing this quantity is given by Eq. (19), which we reproduce here as a ~¯c ! 5 ~ r 8 S 21 r ! 21 r 8 S 21¯c . *
(B2)
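As a numerical illustration of this claim (it does not replace the derivation), the following sketch evaluates the penalty of Eq. (B1) at the estimate of Eq. (B2) and at perturbed parameter vectors. It uses only the Python standard library. The dimensions, the random test matrices, and all function names are assumptions chosen for illustration; they are not taken from the simulations in the paper.

```python
import random


def transpose(A):
    return [list(col) for col in zip(*A)]


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def penalty(alpha, c_frames, rho, sigma):
    """Sum over frames of (c_j - rho a)' Sigma^{-1} (c_j - rho a), as in Eq. (B1)."""
    total = 0.0
    for c in c_frames:
        r = [ci - pi for ci, pi in zip(c, matvec(rho, alpha))]
        total += sum(ri * wi for ri, wi in zip(r, solve(sigma, r)))
    return total


def gls_estimate(c_frames, rho, sigma):
    """(rho' Sigma^{-1} rho)^{-1} rho' Sigma^{-1} c_bar, as in Eq. (B2)."""
    J = len(c_frames)
    c_bar = [sum(col) / J for col in zip(*c_frames)]
    # Rows of sinv_rho_t are Sigma^{-1} applied to each column of rho;
    # Sigma is symmetric, so this matrix equals rho' Sigma^{-1}.
    sinv_rho_t = [solve(sigma, col) for col in transpose(rho)]
    normal = matmul(sinv_rho_t, rho)        # rho' Sigma^{-1} rho
    rhs = matvec(sinv_rho_t, c_bar)         # rho' Sigma^{-1} c_bar
    return solve(normal, rhs)


# Hypothetical test problem: L coefficients per frame, L_prime FA parameters, J frames.
random.seed(0)
L, L_prime, J = 4, 2, 5
rho = [[random.gauss(0, 1) for _ in range(L_prime)] for _ in range(L)]
A = [[random.gauss(0, 1) for _ in range(L)] for _ in range(L)]
sigma = matmul(A, transpose(A))
for i in range(L):
    sigma[i][i] += 1.0                      # keep Sigma positive definite
c_frames = [[random.gauss(0, 1) for _ in range(L)] for _ in range(J)]

alpha_star = gls_estimate(c_frames, rho, sigma)
best = penalty(alpha_star, c_frames, rho, sigma)
for _ in range(5):
    trial = [a + random.uniform(-0.5, 0.5) for a in alpha_star]
    print(penalty(trial, c_frames, rho, sigma) >= best)
```

Note that the estimate depends on the frames only through their sample mean, which is the property that the vector-notation argument in this appendix establishes.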
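To rigorously verify this claim we recast the above problem in vector notation. Let $\mathbf{c}$ denote the $JL$-dimensional vector given by

$$\mathbf{c} = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_J \end{pmatrix}, \tag{B3}$$

let $\boldsymbol{\rho}$ denote the $JL \times L'$-dimensional matrix given by

$$\boldsymbol{\rho} = \begin{pmatrix} \rho \\ \rho \\ \vdots \\ \rho \end{pmatrix}, \tag{B4}$$

and let $\boldsymbol{\Sigma}$ denote the $JL \times JL$ matrix given by

$$\boldsymbol{\Sigma} = \begin{bmatrix} \Sigma & 0 & 0 & \cdots & 0 \\ 0 & \Sigma & 0 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \Sigma \end{bmatrix}. \tag{B5}$$

With this notation the penalty term can be written in vector form as

$$-2\mathcal{L}_2 = (\mathbf{c} - \boldsymbol{\rho}\alpha)'\,\boldsymbol{\Sigma}^{-1}\,(\mathbf{c} - \boldsymbol{\rho}\alpha). \tag{B6}$$

Now a vector $\alpha$ maximizes $\mathcal{L}_2$ if and only if $\alpha$ minimizes the quadratic form in Eq. (B6). In Sec. 9.12 of Ref. 13, the minimizing value for the vector $\alpha$ is given by

$$\alpha_* = (\boldsymbol{\rho}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\rho})^{-1}\boldsymbol{\rho}'\boldsymbol{\Sigma}^{-1}\mathbf{c}. \tag{B7}$$

But

$$\boldsymbol{\rho}'\boldsymbol{\Sigma}^{-1} = \left[\rho'\Sigma^{-1} \;\cdots\; \rho'\Sigma^{-1}\right], \tag{B8}$$

so

$$\boldsymbol{\rho}'\boldsymbol{\Sigma}^{-1}\mathbf{c} = \sum_{j=1}^{J} \rho'\Sigma^{-1}c_j = J\rho'\Sigma^{-1}\bar{c}, \tag{B9}$$

and

$$(\boldsymbol{\rho}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\rho})^{-1} = (J\rho'\Sigma^{-1}\rho)^{-1} = \frac{1}{J}(\rho'\Sigma^{-1}\rho)^{-1}. \tag{B10}$$

Substituting Eqs. (B9) and (B10) into Eq. (B7), we have proven that the expression for $\alpha_*$ in Eq. (B2) is the minimizing value. Also by generalized least-squares theory (Ref. 13), the resultant quadratic form after substituting in this expression for $\alpha$ is given by

$$\begin{aligned}
\left[\mathbf{c} - \boldsymbol{\rho}\,\alpha_*(\bar{c})\right]'\boldsymbol{\Sigma}^{-1}\left[\mathbf{c} - \boldsymbol{\rho}\,\alpha_*(\bar{c})\right]
&= \mathbf{c}'\boldsymbol{\Sigma}^{-1}\mathbf{c} - \mathbf{c}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\rho}\,(\boldsymbol{\rho}'\boldsymbol{\Sigma}^{-1}\boldsymbol{\rho})^{-1}\boldsymbol{\rho}'\boldsymbol{\Sigma}^{-1}\mathbf{c} \\
&= \sum_{j=1}^{J} c_j'\Sigma^{-1}c_j - J\,\bar{c}'\Sigma^{-1}\rho\,(\rho'\Sigma^{-1}\rho)^{-1}\rho'\Sigma^{-1}\bar{c},
\end{aligned} \tag{B11}$$

where in the last line we have invoked Eqs. (B9) and (B10). This last expression is what appears in Eq. (22), and so we have justified the claim.

ACKNOWLEDGMENT
This research was supported in part by the Air Force Office of Scientific Research under grant F49620-95-1-034. The authors can be reached as follows: phone, 734-994-1200; fax, 734-994-5704; e-mail, thelen@erim-int.com.

REFERENCES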
1. R. G. Paxman, T. J. Schulz, and J. R. Fienup, “Phase-diverse speckle interferometry,” in Signal Recovery and Synthesis IV, Vol. 11 of 1992 Tech. Dig. Ser.-Opt. Soc. Am. (Optical Society of America, Washington, D.C., 1992), pp. 5–7.
2. J. H. Seldin and R. G. Paxman, “Phase-diverse speckle reconstruction of solar data,” in Image Reconstruction and Restoration, T. J. Schulz and D. L. Snyder, eds., Proc. SPIE 2302, 268–280 (1994).
3. R. A. Gonsalves and R. Chidlaw, “Wavefront sensing by phase retrieval,” in Applications of Digital Image Processing III, A. G. Tescher, ed., Proc. SPIE 207, 32–39 (1979).
4. R. G. Paxman, T. J. Schulz, and J. R. Fienup, “Joint estimation of object and aberrations by using phase diversity,” J. Opt. Soc. Am. A 9, 1072–1085 (1992).
5. R. G. Paxman, J. H. Seldin, M. G. Löfdahl, G. B. Scharmer, and C. U. Keller, “Evaluation of phase-diversity techniques for solar-image restoration,” Astrophys. J. 467, 1087–1099 (1996).
6. J. H. Seldin, R. G. Paxman, and B. L. Ellerbroek, “Postdetection correction of compensated imagery using phase-diverse speckle,” in Adaptive Optics, Vol. 23 of 1995 Tech. Dig. Ser.-Opt. Soc. Am., M. Cullum, ed. (Optical Society of America, Washington, D.C., 1995), pp. 471–476.
7. J. H. Seldin, M. F. Reiley, R. G. Paxman, B. E. Stribling, B. L. Ellerbroek, and D. C. Johnston, “Space-object identification using phase-diverse speckle,” in Image Reconstruction and Restoration II, T. Schulz, ed., Proc. SPIE 3170, 2–15 (1997).
8. J. H. Seldin, R. G. Paxman, B. L. Ellerbroek, and D. C. Johnston, “Phase-diverse speckle restorations of artificial satellites imaged with adaptive-optics compensation,” in Adaptive Optics, Vol. 13 of 1996 Tech. Dig. Ser.-Opt. Soc. Am. (Optical Society of America, Washington, D.C., 1996), addendum, pp. 341–343.
9. R. G. Paxman and J. H. Seldin, “Fine-resolution imaging of solar features using phase-diverse speckle imaging,” in Real Time and Post-Facto Solar Image Correction, R. R. Raddick, ed., National Solar Observatory/Sacramento Peak Summer Workshop 13 (Sunspot, N.M., 1992), pp. 112–118.
10. T. J. Schulz, “Multi-frame blind deconvolution of astronomical images,” J. Opt. Soc. Am. A 10, 1064–1073 (1993).
11. D. L. Snyder, C. W. Helstrom, A. D. Lanterman, M. Faisal, and R. L. White, “Compensation for readout noise in CCD images,” J. Opt. Soc. Am. A 12, 272–283 (1995).
12. H. L. Van Trees, Detection, Estimation, and Modulation Theory: Part I (Wiley, New York, 1968).
13. L. L. Scharf, Statistical Signal Processing: Detection, Estimation, and Time Series Analysis (Addison-Wesley, Reading, Mass., 1991).
14. D. C. Liu and J. Nocedal, “On the limited memory BFGS method for large scale optimization,” Math. Program. 45, 503–528 (1989).
15. D. L. Fried, “Optical resolution through a randomly inhomogeneous medium for very long and very short exposures,” J. Opt. Soc. Am. 56, 1372–1379 (1966).
16. R. G. Paxman, J. H. Seldin, and P. P. Sanchez, “Applied phase diversity,” Tech. Rep. 213390-3-F [Environmental Research Institute of Michigan (ERIM), Ann Arbor, Mich., 1992].
17. J. E. Dennis and R. B. Schnabel, Numerical Methods for Unconstrained Optimization and Nonlinear Equations (Prentice-Hall, Englewood Cliffs, N.J., 1983).