IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 38, NO. 1, JANUARY 2000
On the Condition Number of Gaussian Sample-Covariance Matrices

A. B. Kostinski and A. C. Koivunen

Abstract—We examine the reasons behind the fact that the Gaussian autocorrelation-function model, widely used in remote sensing, yields a particularly ill-conditioned sample-covariance matrix in the case of many strongly correlated samples. We explore the question numerically and relate the magnitude of the matrix condition number to the nonnegativity requirement satisfied by all correlation functions. We show that the condition number exhibits explosive growth near the boundary of the allowed parameter space. Simple numerical recipes are suggested in order to avoid this instability.
Manuscript received September 28, 1998; revised March 9, 1999. This work was supported by NSF Grant ATM95-12685 and by the USRA-NASA Goddard Sabbatical Fellowship Program (for ABK). The authors are with the Department of Physics, Michigan Technological University, Houghton, MI 49931 USA (e-mail: [email protected]). Publisher Item Identifier S 0196-2892(00)00005-X.

The Gaussian autocorrelation function (ACF) is used widely to model echo statistics in modern remote-sensing applications such as
radar and lidar meteorology, in which the Gaussian Doppler spectrum assumption is commonly employed (e.g., see [6]). Another example is scattering by rough surfaces [1, p. 20]. In these applications, a uniformly sampled ACF yields a sample-covariance matrix whose dimension is equal to the number of lags used from the corresponding time series. Modern signal-processing techniques, such as data whitening, often involve diagonalization of covariance matrices, e.g., see [2, pp. 94–96], [4, pp. 220–221], or [7]. Thus, the accuracy of sample-covariance matrix inversion is an important consideration as the matrix dimension (number of time lags used) and/or the degree of correlation increases. In the course of our analysis of correlated weather-radar echoes [3], we observed that the Gaussian covariance matrix is particularly ill-conditioned. The purpose of this note, therefore, is to explain why the Gaussian ACF yields an unstable and poorly behaved covariance matrix C. One must be aware of this "Gaussian Anomaly" when using Gaussian ACF models for large-dimensional matrices and/or highly correlated samples. We trace the origin of this instability to the nonnegativity requirement, which must be imposed on all ACF's, and suggest a simple remedy to avoid such instabilities. Let us use the conventional condition number (ratio of maximal to minimal eigenvalue, denoted κ) as the measure of ill-conditioning of C (e.g., [8, pp. 397, 460]). One expects poor inversion quality when the condition number approaches the inverse of computer roundoff error. We now proceed to explore numerically the behavior of κ as a function of the ACF parameters. The important finding is that as we approach the Gaussian functional form for the ACF, κ exhibits explosive growth. For example, there is a drastic difference between the exponential and Gaussian cases.
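A minimal numerical sketch of this measure of ill-conditioning (assuming, purely for illustration, an exponentially correlated series with lag-one correlation r = 0.9, values not taken from the text):

```python
import numpy as np

# Illustrative 8x8 sample-covariance matrix for an exponential ACF,
# rho(m) = r**|m| (M = 8 and r = 0.9 are assumed, illustrative values).
M, r = 8, 0.9
lags = np.abs(np.subtract.outer(np.arange(M), np.arange(M)))
C = r ** lags

# Condition number as the ratio of extreme eigenvalues of the
# symmetric positive-definite covariance matrix.
eigenvalues = np.linalg.eigvalsh(C)          # sorted in ascending order
kappa = eigenvalues[-1] / eigenvalues[0]

# Inversion is trustworthy only while kappa stays well below the
# inverse of the machine roundoff error.
roundoff_limit = 1.0 / np.finfo(float).eps
print(kappa, kappa < roundoff_limit)
```

For a symmetric positive-definite matrix this eigenvalue ratio coincides with the usual 2-norm condition number returned by `np.linalg.cond`.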
Fig. 1. Covariance-matrix condition number κ versus Φ for ρ(τ) = (1 − Φ)e^{−γτ} + Φe^{−(γτ)^2}. The correlation coefficient r = 0.98 is constant throughout. Note the high sensitivity of κ for Φ = 1.

To see this, consider a stationary time series with an autocorrelation function ρ(τ), which is discretely sampled at uniformly spaced time intervals Δt, yielding a set of values ρ(mΔt). These in turn form a row of a sample-covariance matrix (lags ranging from 0 to (M − 1)Δt in steps of Δt) [2]

C = [ ρ(0)           ρ(−Δt)         ⋯   ρ((1 − M)Δt)
      ρ(Δt)          ρ(0)           ⋯   ρ((2 − M)Δt)
      ⋮              ⋮              ⋱   ⋮
      ρ((M − 1)Δt)   ρ((M − 2)Δt)   ⋯   ρ(0)          ].   (1)
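This construction can be sketched numerically; the ACF below is a hypothetical Doppler-shifted exponential (all parameter values are illustrative assumptions, not taken from the text), chosen complex so that the Hermitian Toeplitz structure of C is visible:

```python
import numpy as np

# Build the sample-covariance matrix C[j, k] = rho((j - k) * dt).
# Hypothetical Doppler-shifted exponential ACF:
#   rho(tau) = exp(-gamma * |tau|) * exp(1j * omega * tau)
# M, dt, gamma, and omega are assumed, illustrative values.
M, dt, gamma, omega = 6, 1.0, 0.1, 0.5

def rho(tau):
    return np.exp(-gamma * np.abs(tau)) * np.exp(1j * omega * tau)

j, k = np.meshgrid(np.arange(M), np.arange(M), indexing="ij")
C = rho((j - k) * dt)

# Hermitian: C equals its conjugate transpose.
print(np.allclose(C, C.conj().T))
# Toeplitz: every diagonal is constant.
print(all(np.allclose(np.diag(C, d), C[0, d] if d >= 0 else C[-d, 0])
          for d in range(-(M - 1), M)))
```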
This matrix is Hermitian and Toeplitz. Next, in order to explore the Gaussian limit, consider an ACF which is a sum [1, pp. 119–120] of exponential and Gaussian forms (γ and Φ are unitless)

ρ(τ) = (1 − Φ)e^{−γτ} + Φe^{−(γτ)^2}   (2)

whose discrete version, with time measured in units of Δt, is

ρ(mΔt) = (1 − Φ)r^m + Φr^{m^2}   (3)
where r = e^{−γΔt} is the correlation coefficient between two consecutive samples. Regardless of the ACF functional form, as r approaches unity, the covariance matrix becomes singular, and κ blows up. However, this approach to singularity occurs in a rather different manner in the exponential and Gaussian cases. To see this, we solve the corresponding eigenvalue problem numerically and examine the condition number κ (ratio of maximal to minimal eigenvalue) as a function of Φ, r, and M (the matrix dimension). The results are shown in Fig. 1 for a strongly correlated case of r = 0.98. Already, in the case of only eight measurements (8 × 8 sample-covariance matrix), the condition number in the Gaussian case (Φ = 1) is more than three orders of magnitude greater than the exponential one.

Fig. 2. Covariance-matrix condition number κ versus α for various matrix dimensions M. The correlation coefficient r = 0.98 is constant throughout. Note the high sensitivity of κ as α approaches 2.

Why such remarkable sensitivity? After all, both models appear equally widely used in remote-sensing applications. The answer turns out to be rooted in the requirement of nonnegativity, which must be satisfied by all ACF's and by the corresponding sample-covariance matrices. That is, there is a close connection between the "nearness" to the boundary of what is allowed by nonnegativity and the magnitude of the condition number κ.

Fig. 3. Covariance-matrix condition number κ versus α for ρ(τ) = e^{−(γτ)^α}. The covariance matrices were of size 64 × 64 with varying values of the correlation coefficient r. Again, note the sensitivity of κ to α as α approaches 2.

Consider the following parametric family of ACF's
ρ(τ) = e^{−(γτ)^α}.   (4)

A sufficient and necessary condition for ρ(τ) to be a valid ACF (to have a nonnegative Fourier transform) is that 0 < α ≤ 2 (see [9, p. 137]). The discretely sampled version of the above equation is

ρ(mΔt) = r^{m^α}   (5)

where again r = e^{−(γΔt)^α} is the correlation coefficient between two consecutive samples. The matrix elements across the row are then as follows: r^{1^α} = r, r^{2^α}, r^{3^α}, etc. For example, four echoes spaced uniformly in units of Δt yield the following matrix
C = [ 1         r         r^{2^α}   r^{3^α}
      r         1         r         r^{2^α}
      r^{2^α}   r         1         r
      r^{3^α}   r^{2^α}   r         1       ]   (6)

with r = e^{−(γΔt)^α}, and α = 1 and α = 2 corresponding to the exponential and Gaussian correlation models, respectively. The correlation coefficient between two consecutive echoes is typically between 0.8 and 0.98 in radar meteorology. This is because, at the rate of 1000 echoes per second, rain drops do not move an appreciable fraction of the radar wavelength (3–10 cm typically) during one millisecond. To illustrate the numerical-analysis issues, we pick a correlation coefficient of e^{−0.02} ≈ 0.98. The four-echo Gaussian and exponential covariance matrices (denoted as CG and CE, respectively) are then given as follows:

CG = [ 1       0.98    0.923   0.835
       0.98    1       0.98    0.923
       0.923   0.98    1       0.98
       0.835   0.923   0.98    1     ]   (7)

and

CE = [ 1       0.98    0.961   0.942
       0.98    1       0.98    0.961
       0.961   0.98    1       0.98
       0.942   0.961   0.98    1     ].   (8)

Fig. 4. (a) Minimum eigenvalues of a 256 × 256 covariance matrix with ρ(τ) = e^{−(γτ)^α}, computed for various values of r as α varies from 0.01 to 2.0. (b) Minimum spectral-density values of a 256-point autocorrelation sequence of the same form, for the same values of r and the same variation of α.
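The four-echo matrices (7) and (8) can be reproduced and compared directly; a minimal sketch using the correlation coefficient r = e^{−0.02} from the text:

```python
import numpy as np

# Four-echo covariance matrices, entries r**(|j-k|**alpha), with the
# text's correlation coefficient r = exp(-0.02) ~ 0.98.
m = np.abs(np.subtract.outer(np.arange(4), np.arange(4))).astype(float)
r = np.exp(-0.02)
C_gauss = r ** m**2   # alpha = 2: Gaussian model (CG)
C_exp = r ** m        # alpha = 1: exponential model (CE)

# 2-norm condition numbers (ratio of extreme eigenvalues for these
# symmetric positive-definite matrices).
kappa_gauss = np.linalg.cond(C_gauss)
kappa_exp = np.linalg.cond(C_exp)
print(np.round(C_gauss, 3))
print(kappa_gauss, kappa_exp)
```

Even at M = 4, the Gaussian matrix is already worse conditioned than the exponential one by more than two orders of magnitude, despite being the more "diagonally dominant" of the two.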
Note that, based on a traditional numerical-analysis point of view, one would guess that it is the exponential case that is ill-conditioned, because the matrix is less "diagonally dominant." In fact, just the reverse is the case.

Let us next explore the "Gaussian anomaly" systematically by computing the eigenvalues and the condition number κ as a function of α as it approaches 2. The results are shown in Fig. 2 for various dimensions and r = 0.98. Again, the Gaussian anomaly is clear. We see that reducing α from 2 to 1.9 reduces the condition number by several orders of magnitude (note that α = 2.0 is not quite resolved in Fig. 2). The largest eigenvalue remains of order M (the matrix dimension), but the smallest one decreases drastically to zero as α approaches two. Similar behavior is seen in Fig. 3, in which we study the Gaussian anomaly for various values of the correlation parameter r.

In retrospect, such behavior of the smallest eigenvalue could have been anticipated because, for stationary random processes and circulant matrices, the eigenvalues of the sample-covariance matrix turn out to be proportional to the values of the discretely and uniformly sampled power-spectral density (Fourier transform of the ACF by the Wiener–Khinchin theorem [5, p. 261, Eq. (4.11)–(4.43)]). Gaussian wings, on the other hand, decay faster than the inverse square. This is illustrated in Fig. 4, which compares the α-dependence of the smallest eigenvalue of the covariance matrix with the minimal value of the power-spectral density for a 256 × 256 matrix (tests with other dimensions led to similar results). Note, however, that the rapid rise of the condition number near the boundary of the allowed (by nonnegativity) region is not at all obvious from this argument.

One simple remedy for avoiding the instability is to decrease α somewhat, as indicated by Figs. 2 and 3. For example, Fung suggests the range 1.2 to 1.8 for applications in scattering by rough surfaces [1, pp. 119–121]. One could also use the Gaussian/exponential combination (2) with Φ < 1.

In summary, the widely used Gaussian ACF model happens to be right at the edge of the parameter space allowed by the nonnegativity requirement. This results in explosive growth of the condition number, and extreme sensitivity to measurement noise, as the correlation parameter r increases and the inverse of the condition number becomes comparable to the computer roundoff error. Therefore, we recommend picking α < 2 to avoid the singularity at α = 2 in those problems that are sensitive to the behavior of the ACF tails.
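The recommended remedy can be sketched as follows (the matrix size M = 32 and correlation coefficient r = 0.95 are illustrative assumptions, not values from the text): backing off from the Gaussian endpoint α = 2 tames the condition number by orders of magnitude.

```python
import numpy as np

def kappa(alpha, M=32, r=0.95):
    """Condition number of the M x M covariance matrix with entries
    r**(|j-k|**alpha); M and r are illustrative, assumed values."""
    m = np.abs(np.subtract.outer(np.arange(M), np.arange(M))).astype(float)
    return np.linalg.cond(r ** m**alpha)

# Condition number versus alpha: explosive growth at the Gaussian
# endpoint alpha = 2, much milder behavior slightly below it.
for alpha in (2.0, 1.9, 1.5):
    print(alpha, kappa(alpha))
```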
ACKNOWLEDGMENT

The authors would like to thank Dr. I. Pinelis for several helpful comments.

REFERENCES

[1] A. K. Fung, Microwave Scattering and Emission Models and Their Applications. Boston, MA: Artech House, 1994.
[2] S. M. Kay, Fundamentals of Statistical Signal Processing: Estimation Theory. Englewood Cliffs, NJ: Prentice-Hall, 1993.
[3] A. Koivunen and A. Kostinski, "Feasibility of data whitening to improve performance of weather radar," J. Appl. Meteorol., vol. 38, no. 6, pp. 741–749, 1999.
[4] R. N. McDonough and A. D. Whalen, Detection of Signals in Noise. New York: Academic, 1995.
[5] M. B. Priestley, Spectral Analysis and Time Series. London, U.K.: Academic, 1996.
[6] H. Sauvageot, Radar Meteorology. Boston, MA: Artech House, 1992.
[7] T. Schulz and A. B. Kostinski, "Variance bounds on the estimation of reflectivity and polarization parameters in radar meteorology," IEEE Trans. Geosci. Remote Sensing, vol. 35, pp. 248–255, Mar. 1997.
[8] G. Strang, Introduction to Applied Mathematics. New York: Wellesley-Cambridge, 1986.
[9] A. M. Yaglom, Correlation Theory of Stationary and Related Random Functions I. New York: Springer, 1987.