Robust Frequency Estimation Using Elemental Sets

Gordon K. Smyth, Department of Mathematics, University of Queensland, Australia 4072
Douglas M. Hawkins, Department of Applied Statistics, University of Minnesota, St Paul, MN 55108

Abstract

The extraction of sinusoidal signals from time-series data has attracted enormous attention in the statistics and signal processing literatures. A closely related problem is that of fitting exponentially damped oscillations and pure exponential signals to equally-spaced data arising from, for example, chemical reactions, radioactive decay or economic series. Sums of exponentials, sinusoids and exponentially damped sinusoids are unified in that they satisfy constant coefficient homogeneous differential equations. Functions of this form are highly nonlinear. Ad hoc and inefficient methods are often used to fit them, and problems of data quality have received little attention. In this paper we use elemental set methods to construct finite algorithm estimators which approximately minimize the least squares, least trimmed sum of squares or least median of squares criteria. An elemental set is a subset of the data containing the minimum number of points such that the unknown parameters in the model can be identified. The constructed estimators may be used to initialize iterative estimation schemes such as maximum likelihood, M- or MM-estimation.

1 Introduction

The method of elemental sets involves performing many fits to a given data set, each fit made to a subsample just large enough to identify the parameters. The method is useful for optimizing criteria which are not smooth, have many local minima or are otherwise not amenable to global optimization by refinement in the parameter space. In particular, elemental sets form the basis of most algorithms for computing high breakdown estimators in regression. See Rousseeuw (1984), Hawkins, Bradu and Kass (1984), Rousseeuw and Leroy (1986) and Hawkins (1993) for discussions of the linear regression case. The extension of high breakdown estimators to nonlinear regression has been considered recently by Stromberg and Ruppert (1992) and by Stromberg (1993, 1995). One difficulty is that the computation of exact fits to elemental sets is far from trivial in nonlinear regression. This paper considers a special class of nonlinear regression models for which the elemental set estimates can be computed in closed form.

Another advantage of the elemental sets method is that it is finite. It can therefore be used to generate preliminary estimators with which to initialize iterative estimation schemes such as maximum likelihood or M-estimation. In particular, an estimator which combines high breakdown with high efficiency under normal errors can be achieved by using a high breakdown elemental estimator to initialize the MM-estimation scheme of Yohai (1987) and Stromberg (1993).

The regression functions μ(t) we consider are those which solve constant coefficient differential equations of the form

    Σ_{k=1}^{p+1} γ_k D^{k−1} μ = 0        (1)

where D is the differential operator. Solutions to (1) include complex exponentials, damped and undamped sinusoids and real exponentials, depending on the roots of the polynomial with the γ_k as coefficients (Brockwell and Davis, 1991; Osborne and Smyth, 1995). Let the roots be λ_j, j = 1, …, p. If the λ_j are distinct, then μ(t) is a sum of exponentials,

    μ(t) = Σ_{j=1}^{p} α_j exp(λ_j t)        (2)

where in general the α_j and λ_j may be complex. If the α_j and λ_j are not all real, then they must occur in complex conjugate pairs in order that μ(t) be a real signal. If the roots are purely imaginary, then μ(t) is a sinusoidal signal

    μ(t) = Σ_{j=1}^{p/2} α_j sin(ω_j t + φ_j)        (3)

for real α_j and φ_j, and ω_j ∈ [0, π). We suppose that observations y_i = μ(t_i) + ε_i are made at equi-spaced times t_i, i = 1, …, n, where the ε_i are independent with mean zero and variance σ². Write μ_i = μ(t_i). The discrete μ_i satisfy a difference equation of the form

    c_1 μ_i + … + c_{p+1} μ_{i+p} = 0        (4)

for i = 1, …, n − p, where the c_j are the coefficients of the polynomial p_c(z) = c_1 + c_2 z + ⋯ + c_{p+1} z^p with roots exp(λ_j Δt), and where Δt is the spacing unit of the t_i. This allows the c_j to be estimated in closed form given an elemental set of 2p equi-spaced observations.
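The difference-equation property (4) is easy to check numerically. The following sketch (illustrative, not from the paper; the damping and frequency values are arbitrary) samples a damped sinusoid with p = 2 and verifies that the c_j obtained from the polynomial with roots exp(λ_j Δt) annihilate the samples:

```python
import numpy as np

# Equally spaced samples of mu(t) = exp(-0.05 t) cos(0.7 t),
# i.e. p = 2 with lambda = -0.05 +/- 0.7i.
dt = 1.0
t = np.arange(20) * dt
mu = np.exp(-0.05 * t) * np.cos(0.7 * t)

lam = np.array([-0.05 + 0.7j, -0.05 - 0.7j])   # roots of the differential polynomial
roots = np.exp(lam * dt)                        # roots of p_c(z)
# np.poly returns highest-degree-first coefficients; reverse to get
# p_c(z) = c_1 + c_2 z + c_3 z^2 with c_3 = 1.
c = np.poly(roots)[::-1].real

# c_1 mu_i + c_2 mu_{i+1} + c_3 mu_{i+2} should vanish for every i.
resid = c[0] * mu[:-2] + c[1] * mu[1:-1] + c[2] * mu[2:]
print(np.max(np.abs(resid)))                    # numerically zero
```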

The elemental set computation is in fact an application of the classical interpolation method of Prony (1795). See Osborne and Smyth (1991, 1995), Kahn et al (1992), Mackisack et al (1994) and Smyth (1995) for modern least squares modifications of Prony's method.

The extraction of sinusoidal signals from time-series data and the least squares fitting of exponential signals are highly nonlinear problems which have attracted enormous attention in the statistics and engineering literatures. Exponential fitting algorithms typically require excellent starting values and frequently have difficulty converging because of collinearity between the exponential decay functions. Sinusoidal signals, on the other hand, are asymptotically uncorrelated, but give rise to highly non-convex sums of squares having local minima O(n⁻¹) apart in the frequencies (Rice and Rosenblatt, 1988). This has typically prevented the use of fully efficient global least squares estimators in practice. For neither type of problem have issues of data quality received much attention.

The method of elemental sets gives rise to two classes of estimators: those based on minimizing a suitable criterion over the elemental sets, and those based on averaging the fitted values or parameter estimates obtained from the elemental sets. In this paper we consider criterion-based estimators which achieve the least sum of squared residuals (LS), least trimmed sum of squared residuals (LTS) and least median squared residual (LMS). LTS and LMS are high breakdown criteria; LTS has positive efficiency relative to least squares given normal errors. We suggest minimizing the above criteria over O(n) elemental sets to obtain finite preliminary estimators. This approach is related to the random search for the least squares frequency estimate suggested by Rice and Rosenblatt (1988), with the difference that each estimate considered has some support in the data, so that the candidates are relatively dense in the neighborhood of the true values.
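Schematically, a criterion-based elemental estimator loops over subsamples, fits each in closed form, and scores the fit on the full series. The sketch below assumes a generic fit/predict interface (the names `best_elemental`, `fit` and `predict` are our own, not the paper's) and illustrates it with a deliberately trivial location fit:

```python
import numpy as np

def lms(res2):
    """Least median of squared residuals criterion."""
    return np.median(res2)

def lts(res2, trim=0.5):
    """Least trimmed sum of squared residuals: sum the smallest fraction."""
    r = np.sort(res2)
    return r[: int(len(r) * trim)].sum()

def best_elemental(y, fit, predict, m, criterion):
    """Minimize `criterion` over all consecutive elemental sets of size m.
    `fit` maps an elemental subsample to parameters; `predict` maps
    parameters to fitted values for the whole series.  Both are
    placeholders for a closed-form fit such as the Prony fit of Section 2."""
    best, best_val = None, np.inf
    for i in range(len(y) - m + 1):
        theta = fit(y[i : i + m])
        if theta is None:               # subsample fails to identify parameters
            continue
        res2 = (y - predict(theta, len(y))) ** 2
        val = criterion(res2)
        if val < best_val:
            best, best_val = theta, val
    return best

# Toy usage: elemental sets of size one estimating a location parameter.
y = np.array([1.0] * 7 + [10.0] * 3)    # 30% gross outliers
est = best_elemental(y, lambda s: s[0], lambda th, n: np.full(n, th), 1, lms)
print(est)   # 1.0 -- the LMS choice resists the outliers
```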

2 Elemental Set Estimates

An elemental set is any set of 2p observations which identifies unique values for the parameters. Non-consecutive sequences of observations, however, can generally be interpolated by more than one possible sinusoidal signal, the alternative signals being related as harmonics. We therefore restrict attention to elemental sets which are sequences of consecutive observations. (A larger number of elemental sets is possible when fitting real exponential signals to positive data, although this idea is not used in this paper.)

Consider an elemental set y_{i+j−1}, j = 1, …, 2p. The elemental estimate of c is the null vector of the p × (p+1) matrix

    Y_i = [ y_i        …  y_{i+p}
             ⋮              ⋮
            y_{i+p−1}  …  y_{i+2p−1} ]
Without loss of generality we can put c_{p+1} = 1. Also let d = (c_1, …, c_p)ᵀ. Writing B_i for the first p columns of Y_i and b_i for its last column, the elemental estimate of d is d̂_i = −B_i⁻¹ b_i. Fitted values μ̂ can be obtained directly from the estimated ĉ_i = (d̂_iᵀ, 1)ᵀ using formula (5) of Osborne and Smyth (1991). Alternatively, the elemental estimates of λ_1, …, λ_p can be obtained by rooting the polynomial with ĉ_i as coefficients.

There are at most n − 2p + 1 elemental sets of consecutive observations. If desired, a greater number of elemental-type estimates could be obtained in the following way. Let y_i = (y_i, …, y_{i+p})ᵀ be a (p+1)-vector. A generalized elemental estimate of the Prony vector c can be obtained as the null vector of the p × (p+1) matrix

    Y_{i_1, …, i_p} = (y_{i_1}, …, y_{i_p})ᵀ

where i_1, …, i_p is any p-subset of {1, …, n − 2p + 1}. These estimates, however, are not true elemental estimates, as they use more than the minimum number of observations, and this approach is not taken further in this paper.

3 Frequency Estimation

The signal μ(t) is a sum of cosines if and only if the roots of the polynomial p_c(z) lie on the unit circle. In this case p is even, the roots occur in conjugate pairs, and c is symmetric, c_j = c_{p+2−j} for j = 1, …, p/2 (Smyth, 1995). The roots of the polynomial are the exponentiated frequencies exp(√−1 ω_j Δt) and exp(−√−1 ω_j Δt), j = 1, …, p/2. If the signal is known a priori to be sinusoidal then it is appropriate to constrain c to be symmetric. With this constraint, an elemental set consists of 3p/2 data values.

Following Smyth (1995), let Q be the (p+1) × (p/2 + 1) matrix
Table 1: Simulation results for one cosine signal and no outliers. The true frequency is 0.7.

Method      Mean    St. Dev.  √MSE    Efficiency
LS          0.6997  0.0053    0.0053  0.0085
Global LS   0.7000  0.0005    0.0005  1.0149
LTS         0.6994  0.0054    0.0054  0.0080
LMS         0.6996  0.0054    0.0054  0.0082
Prin. Axis  0.6987  0.0446    0.0447  0.0001

    Q = [ I_{p/2}  0
          0        1
          J_{p/2}  0 ]

where I_p denotes the p × p identity matrix and J_p the p × p anti-diagonal matrix

    J_p = [ 0  ⋯  1
            ⋮  ⋰  ⋮
            1  ⋯  0 ]
The symmetry constraint can be represented as c = Qe where e is an unrestricted real vector of dimension p=2+ 1. An elemental set element set is now any set of 3p=2 consecutive observations which can be interpolated by a sinusoidal signal. The elemental estimate of c is the symmetric null vector of the (p + 1)  p=2 matrix Y = i

?y ; . . . ; y i

+ 2?1

i

p=



The convention that c +1 = 1 here implies that e1 = 1. Let f = (e2 ; . . . ; e 2+1 ). If we write b for the rst column of Y Q and B for its remaining columns, then the elemental estimate of f is ^f = ?B ?1 b . A simulation study was conducted with p = 2 and n = 100. Independent Gaussian observations were generated with means  = cos(0:7t + 0:1) t = 1; . . . ; n, and standard deviation  = 0:1. With p = 2, the elemental sets contain three observations, y , y +1 and y +2 say, and the Prony coecient vector is c = (1; c2; 1) for some c2 . It can be shown that the three points are interpolated by a cosine if and only if j(y + y +2 )=y +1 j  2. If this is so, then the elemental estimate of c2 is c^2 = ?(y + y +2 )=y +1 and the elemental estimate of the frequency is !^ = cos?1 (^c2 =2). p

i

p=

T i

i

i

i

i

i

i

i

i

i

i

T

i

i

i

;i

i

i

i

i

;i

Table 2: Simulation results for one cosine signal with 30% outliers. The true frequency is 0.7.

Method      Mean    Median  St. Dev.  √MSE
LS          0.8768  0.7162  0.4457    0.4795
Global LS   0.8756  0.7141  0.4458    0.4792
LTS         0.7062  0.7007  0.0746    0.0749
LMS         0.7098  0.7006  0.0955    0.0960
Prin. Axis  0.0160  0.0000  0.3000    0.7214

Table 1 gives results over 500 simulated data sets for the LS, LTS and LMS criteria minimized over the elemental sets. The LTS criterion trimmed 50% of the residuals. The elemental LS estimate was also input as a starting value to an iterative Levenberg least squares algorithm to obtain an approximation to the global least squares estimator (Global LS). Also computed is an estimator based on the principal axis of the Prony vectors, treating them as axial observations on the unit sphere; see Fisher et al (1987, Section 3.2.4) for a discussion of principal axes.

The LS, LTS and LMS criteria have similar square-root mean square errors (√MSE) on this problem. The Global LS estimator has by far the lowest √MSE, and achieves efficiency relative to the Cramér-Rao lower bound of 100%. This demonstrates that the elemental LS is an adequate starting value for maximum likelihood on this problem. The estimator based on averaging the Prony vectors themselves performs far less well than the criterion-based estimators.

Table 2 gives the results of a similar simulation experiment, but with 30% outliers. Outliers were generated to have standard deviations 100 times that of the good observations and were associated with a random subset of the t_i. The LTS and LMS criteria are now markedly superior to least squares, although none of the methods achieves good efficiency relative to the Cramér-Rao lower bound. The square root of the Cramér-Rao lower bound based on the 70% good observations is in this case 0.00058. The principal axis method fails completely on this problem.
References

Brockwell, P. J., and Davis, R. A. (1991). Time Series: Theory and Methods (2nd edition). Springer, New York.
Fisher, N. I., Lewis, T., and Embleton, B. J. J. (1987). Statistical Analysis of Spherical Data. Cambridge University Press.
Hawkins, D. M. (1993). The accuracy of elemental set approximations for regression. J. Amer. Statist. Assoc., 88, 580-589.
Hawkins, D. M., Bradu, D., and Kass, G. V. (1984). Location of several outliers in multiple regression data using elemental sets. Technometrics, 26, 197-208.
Kahn, M., Mackisack, M. S., Osborne, M. R., and Smyth, G. K. (1992). On the consistency of Prony's method and related algorithms. J. Comput. Graph. Statist., 1, 329-349.
Mackisack, M. S., Osborne, M. R., and Smyth, G. K. (1994). A modified Prony algorithm for estimating sinusoidal frequencies. J. Statist. Comput. Simul., 49, 111-124.
Osborne, M. R., and Smyth, G. K. (1991). A modified Prony algorithm for fitting functions defined by difference equations. SIAM J. Sci. Stat. Comput., 12, 362-382.
Osborne, M. R., and Smyth, G. K. (1995). A modified Prony algorithm for exponential function fitting. SIAM J. Sci. Statist. Comput., 16, 119-138.
Prony, R. (1795). Essai expérimental et analytique. J. de l'École Polytechnique, 2, 24-76.
Rice, J. A., and Rosenblatt, M. (1988). On frequency estimation. Biometrika, 75, 477-484.
Rousseeuw, P. J. (1984). Least median of squares regression. J. Amer. Statist. Assoc., 79, 871-880.
Rousseeuw, P. J., and Leroy, A. M. (1986). Robust Regression and Outlier Detection. Wiley, New York.
Smyth, G. K. (1995). On constraining eigenanalysis methods of frequency estimation to prevent unwanted damping. Research Report, Centre for Statistics, University of Queensland.
Stromberg, A. J. (1993). Computation of high breakdown nonlinear regression parameters. J. Amer. Statist. Assoc., 88, 237-244.
Stromberg, A. J. (1995). Consistency of the least median of squares estimator in nonlinear least squares. Commun. Statist. Theory Meth., 24(8), 1971-1984.
Stromberg, A. J., and Ruppert, D. (1992). Breakdown in nonlinear regression. J. Amer. Statist. Assoc., 87, 991-997.
Yohai, V. J. (1987). High breakdown point and high efficiency robust estimates for regression. Ann. Statist., 15, 642-656.