On Least Absolute Deviation Estimators For One Dimensional Chirp Model

Ananya Lahiri$^1$ & Debasis Kundu$^{1,2}$ & Amit Mitra$^1$

Abstract: It is well known that least absolute deviation (LAD) estimators are more robust than least squares estimators (LSEs), particularly in the presence of heavy-tailed errors. We consider the LAD estimators of the unknown parameters of a one dimensional chirp signal model under an independent and identically distributed error structure. The proposed estimators are strongly consistent, and it is observed that they are asymptotically normally distributed. We perform some simulation studies to verify the asymptotic theory for small sample sizes, and the performance is quite satisfactory.

Key Words and Phrases: Chirp signals; least absolute deviation estimators; strong consistency; asymptotic distribution.

$^1$ Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, Pin 208016, India.

$^2$ Corresponding author. E-mail: [email protected], Phone: 91-512-2597141, Fax: 91-512-2597500.

1 Introduction

Let us consider the following chirp signal model:

$$y(n) = A^0 \cos(\alpha^0 n + \beta^0 n^2) + B^0 \sin(\alpha^0 n + \beta^0 n^2) + X(n), \qquad n = 1, \ldots, N. \tag{1}$$

Here $y(n)$ is the real valued signal observed at $n = 1, \ldots, N$; $A^0$, $B^0$ are amplitudes, and $\alpha^0$ and $\beta^0$ are the frequency and the frequency rate, respectively. The additive error $\{X(n)\}$ is a sequence of independent and identically distributed (i.i.d.) random variables with mean zero and finite second moment. The explicit assumptions on the $X(n)$'s will be provided later.

In the signal processing literature, chirp signal models are used to detect an object with respect to a fixed receiver. Such models are typically one dimensional, as in (1), where the index is usually time. In this model the frequency varies with time in a non-linear fashion, namely as a quadratic function, and it is this property that has been exploited for measuring the distance of an object from a fixed receiver. Such models are used in various areas of science and engineering, for example in sonar, radar and communications systems; oceanography and geology are other areas where this model has been used quite extensively. Extensive work has been done on model (1) and its variations by several authors; see for example Abatzoglou (1986), Kumaresan and Verma (1987), Djuric and Kay (1990), Gini, Montanari and Verrazani (2000), Saha and Kay (2002), Nandi and Kundu (2004), Kundu and Nandi (2008) and the references cited therein.

Nandi and Kundu (2004) first established the consistency and asymptotic normality of the least squares estimators (LSEs) of the one dimensional (1D) chirp signal model for i.i.d. errors. Kundu and Nandi (2008) extended these results to the case where the $X(n)$'s are obtained from a linear stationary process. But there has been no discussion of any method such as least absolute deviation (LAD) estimation, which is well known to be more robust than least squares, particularly in the presence of outliers.

Unfortunately, the model does not satisfy assumption B5 of Oberhofer (1982), and therefore the strong consistency of the LAD estimators in this case is not immediate. It may be mentioned that even the ordinary sinusoidal model does not satisfy assumption B5 of Oberhofer (1982); in that case Kim et al. (2000) provided the consistency and asymptotic normality results for the LAD estimators.

The main aim of this paper is to provide the consistency and asymptotic normality properties of the LAD estimators of the unknown parameters of model (1). It is known that the LSE of $\alpha^0$ has the convergence rate $O_p(N^{-3/2})$, whereas the LSE of $\beta^0$ has the convergence rate $O_p(N^{-5/2})$; see Nandi and Kundu (2004). Here $z = O_p(N^{-\delta})$ means that $z N^{\delta}$ is bounded in probability. In this paper it is observed that the LAD estimators of $\alpha^0$ and $\beta^0$ have the same rates of convergence as the corresponding LSEs. Moreover, the asymptotic efficiency of the LAD estimators relative to the LSEs is $4 f(0)^2 \sigma^2$, where $f(\cdot)$ is the probability density function (PDF) of the error random variable $X(n)$. Therefore, the LAD estimators are more efficient than the LSEs for heavy-tailed error distributions. We perform some extensive simulation experiments to study the effectiveness of the LAD estimators for finite samples, and it is observed that their performance is quite satisfactory.

The rest of the paper is organized as follows. In Section 2 we provide the model assumptions and methodology. In Section 3 the strong consistency and asymptotic normality of the LAD estimators are established. Numerical results are presented in Section 4, and finally we conclude the paper in Section 5.
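To make the efficiency factor $4 f(0)^2 \sigma^2$ concrete, here is a standard calculation, included purely for illustration. For Laplace errors with PDF $f(x) = \frac{1}{2b} e^{-|x|/b}$, we have $\sigma^2 = 2b^2$ and $f(0) = \frac{1}{2b}$, so

$$4 f(0)^2 \sigma^2 = 4 \cdot \frac{1}{4b^2} \cdot 2b^2 = 2,$$

that is, the LAD estimators are twice as efficient as the LSEs. For Gaussian errors, $f(0) = \frac{1}{\sigma \sqrt{2\pi}}$, so $4 f(0)^2 \sigma^2 = \frac{2}{\pi} \approx 0.64$, and the LSEs are more efficient.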

2 Model Assumptions and Preliminary Results

2.1 Model Assumptions

We make the following assumptions on the error random variables.


Assumption 1: $\{X(n)\}$ is a sequence of i.i.d. absolutely continuous random variables with mean zero, variance $\sigma^2$, and PDF $f(\cdot)$. It is further assumed that $f(\cdot)$ is symmetric, differentiable in $(0, \epsilon)$ and $(-\epsilon, 0)$ for some $\epsilon > 0$, and $f(0) > 0$.

We use the following notation: $F(\cdot)$ is the cumulative distribution function corresponding to $f(\cdot)$; the parameter vector is $\theta = (A, B, \alpha, \beta)$, the true parameter vector is $\theta^0 = (A^0, B^0, \alpha^0, \beta^0)$, and the parameter space is $\Theta = [-M, M] \times [-M, M] \times [0, \pi] \times [0, \pi]$.

Assumption 2: $\theta^0$ is an interior point of $\Theta$.
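To make the setup concrete, the following minimal Python sketch simulates data from model (1) under Assumption 1 with i.i.d. Laplace errors; all parameter values are hypothetical choices for illustration and are not taken from the paper's simulation study.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical true parameter values (illustrative only)
A0, B0 = 2.0, 1.5          # amplitudes, inside [-M, M]
alpha0, beta0 = 1.0, 0.1   # frequency and frequency rate, inside [0, pi]
N = 200                    # sample size

n = np.arange(1, N + 1)
phase = alpha0 * n + beta0 * n**2

# i.i.d. errors with mean zero, symmetric density and f(0) > 0
X = rng.laplace(loc=0.0, scale=1.0, size=N)

# Observed signal, model (1)
y = A0 * np.cos(phase) + B0 * np.sin(phase) + X
```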

2.2 Least Absolute Deviation Estimation Procedure

In this section we propose the LAD estimation procedure for the unknown parameters of model (1). The LAD estimators are obtained by minimizing $Q(\theta)$ with respect to $\theta$, where

$$Q(\theta) = \sum_{n=1}^{N} \big| y(n) - A \cos(\alpha n + \beta n^2) - B \sin(\alpha n + \beta n^2) \big|. \tag{2}$$

We note that

$$Q(A, B, \alpha, \beta) \ge Q\big(\widehat{A}(\alpha, \beta), \widehat{B}(\alpha, \beta), \alpha, \beta\big) \ge Q\big(\widehat{A}(\widehat{\alpha}, \widehat{\beta}), \widehat{B}(\widehat{\alpha}, \widehat{\beta}), \widehat{\alpha}, \widehat{\beta}\big),$$

where $\widehat{A}(\alpha, \beta)$, $\widehat{B}(\alpha, \beta)$ are the minimizers of $Q(A, B, \alpha, \beta)$ for known $\alpha, \beta$, and

$$(\widehat{\alpha}, \widehat{\beta}) = \arg\min_{\alpha, \beta} Q\big(\widehat{A}(\alpha, \beta), \widehat{B}(\alpha, \beta), \alpha, \beta\big).$$

So the LAD estimator of $\theta^0$ is $\widehat{\theta} = \big(\widehat{A}(\widehat{\alpha}, \widehat{\beta}), \widehat{B}(\widehat{\alpha}, \widehat{\beta}), \widehat{\alpha}, \widehat{\beta}\big) = (\widehat{A}, \widehat{B}, \widehat{\alpha}, \widehat{\beta})$.
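One numerically convenient way to carry out this minimization (a sketch, not the authors' algorithm) is to scan a grid over $(\alpha, \beta)$ with a quick inner fit for $(A, B)$, and then refine all four parameters with a derivative-free optimizer. The snippet below assumes the simulated arrays y and n from the earlier sketch; note that the objective is highly multimodal in $(\alpha, \beta)$, so in practice the grid must be fine, roughly $O(N)$ points in $\alpha$ and $O(N^2)$ in $\beta$.

```python
import numpy as np
from scipy.optimize import minimize

def lad_objective(theta, y, n):
    """Q(theta) of eq. (2): sum of absolute deviations."""
    A, B, alpha, beta = theta
    phase = alpha * n + beta * n**2
    return np.sum(np.abs(y - A * np.cos(phase) - B * np.sin(phase)))

def lad_fit(y, n, n_grid=60):
    """Coarse grid over (alpha, beta) + Nelder-Mead refinement (illustrative)."""
    best_val, best_theta = np.inf, None
    for alpha in np.linspace(0.0, np.pi, n_grid):
        for beta in np.linspace(0.0, np.pi, n_grid):
            # Inner fit for (A, B): a least squares start, used here as a
            # cheap surrogate for an exact L1 regression on the two regressors.
            Z = np.column_stack([np.cos(alpha * n + beta * n**2),
                                 np.sin(alpha * n + beta * n**2)])
            AB, *_ = np.linalg.lstsq(Z, y, rcond=None)
            theta = np.array([AB[0], AB[1], alpha, beta])
            val = lad_objective(theta, y, n)
            if val < best_val:
                best_val, best_theta = val, theta
    res = minimize(lad_objective, best_theta, args=(y, n),
                   method="Nelder-Mead")
    return res.x  # (A_hat, B_hat, alpha_hat, beta_hat)
```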


3 Asymptotic Properties of Least Absolute Deviation Estimators

3.1 Strong Consistency

Now we provide the consistency result for the proposed estimators.

Theorem 1. If Assumptions 1-2 are satisfied, then $(\widehat{A}, \widehat{B}, \widehat{\alpha}, \widehat{\beta})$ is a strongly consistent estimator of $(A^0, B^0, \alpha^0, \beta^0)$.

We need the following results to prove Theorem 1.

Lemma 1. If $(\theta_1, \theta_2) \in (0, \pi) \times (0, \pi)$ and $t = 0, 1, 2$, then except for a countable number of points the following hold:

(i)
$$\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \cos(\theta_1 n + \theta_2 n^2) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \sin(\theta_1 n + \theta_2 n^2) = 0. \tag{3}$$

(ii)
$$\lim_{N \to \infty} \frac{1}{N^{t+1}} \sum_{n=1}^{N} n^t \cos^2(\theta_1 n + \theta_2 n^2) = \frac{1}{2(t+1)}, \tag{4}$$

$$\lim_{N \to \infty} \frac{1}{N^{t+1}} \sum_{n=1}^{N} n^t \sin^2(\theta_1 n + \theta_2 n^2) = \frac{1}{2(t+1)}. \tag{5}$$

(iii)
$$\lim_{N \to \infty} \frac{1}{N^{t+1}} \sum_{n=1}^{N} n^t \sin(\theta_1 n + \theta_2 n^2) \cos(\theta_1 n + \theta_2 n^2) = 0. \tag{6}$$

Proof: Using the result of Vinogradov (1954), Lemma 1 can be easily established.
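These limits are also easy to check numerically; the small sketch below (with arbitrary illustrative values of $\theta_1, \theta_2$) evaluates the $t = 0$ cases of (3)-(6):

```python
import numpy as np

theta1, theta2 = 0.7, 0.3   # arbitrary values in (0, pi)
N = 10_000
n = np.arange(1, N + 1)
phase = theta1 * n + theta2 * n**2

print(np.mean(np.cos(phase)))                   # eq. (3): close to 0
print(np.mean(np.cos(phase)**2))                # eq. (4), t = 0: close to 1/2
print(np.mean(np.sin(phase)**2))                # eq. (5), t = 0: close to 1/2
print(np.mean(np.sin(phase) * np.cos(phase)))   # eq. (6), t = 0: close to 0
```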

Lemma 2. If $D(\theta) = Q(\theta) - Q(\theta^0)$, then

$$\frac{1}{N} D(\theta) - \lim_{N \to \infty} E\Big[\frac{1}{N} D(\theta)\Big] \to 0 \quad \text{a.s., uniformly } \forall\, \theta \in \Theta.$$

Proof: Let us denote $W_n(\theta) = |h_n(\theta) + X(n)| - |X(n)|$, so that $\frac{1}{N} D(\theta) = \frac{1}{N} \sum_{n=1}^{N} W_n(\theta)$. We note that $W_n(\theta) = |h_n(\theta) + X(n)| - |X(n)| \le |h_n(\theta)| \le 4M$, as the parameter space is compact. Also, the $W_n(\theta)$'s are independent and non-identically distributed random variables with $E[W_n(\theta)] < \infty$ and $V[W_n(\theta)] < \infty$; it may easily be seen, as in Oberhofer (1982), that these bounds do not depend on $n$.

Since $\Theta$ is a compact set, there exist $\Theta_1, \ldots, \Theta_K$ such that $\Theta = \cup_{i=1}^{K} \Theta_i$ and, on each $\Theta_i$,

$$\sup_{\theta \in \Theta_i} W_n(\theta) - \inf_{\theta \in \Theta_i} W_n(\theta) < \frac{\epsilon}{4^n} \quad \text{a.s.}$$

Now for $\theta \in \Theta_i$,

$$\frac{1}{N} D(\theta) - \lim_{N \to \infty} E\Big[\frac{1}{N} D(\theta)\Big] = \Bigg[\frac{1}{N} \sum_{n=1}^{N} W_n(\theta) - \frac{1}{N} \sum_{n=1}^{N} E \sup_{\theta \in \Theta_i} W_n(\theta)\Bigg] + \Bigg[\frac{1}{N} \sum_{n=1}^{N} E \sup_{\theta \in \Theta_i} W_n(\theta) - \lim_{N \to \infty} E\Big[\frac{1}{N} D(\theta)\Big]\Bigg] = A(\theta) + B(\theta),$$

where

$$A(\theta) = \frac{1}{N} \sum_{n=1}^{N} W_n(\theta) - \frac{1}{N} \sum_{n=1}^{N} E \sup_{\theta \in \Theta_i} W_n(\theta) \le \sup_{\theta \in \Theta_i} \Bigg[\frac{1}{N} \sum_{n=1}^{N} W_n(\theta)\Bigg] - \frac{1}{N} \sum_{n=1}^{N} E \sup_{\theta \in \Theta_i} W_n(\theta) \le \frac{1}{N} \sum_{n=1}^{N} \sup_{\theta \in \Theta_i} W_n(\theta) - \frac{1}{N} \sum_{n=1}^{N} E \sup_{\theta \in \Theta_i} W_n(\theta).$$

Note that the $\sup_{\theta \in \Theta_i} W_n(\theta)$'s are independent and non-identically distributed random variables with finite mean and variance, and the variance is bounded by a quantity not depending on $n$. Applying Kolmogorov's strong law of large numbers, we may choose $N_{0i}$ large enough so that for $N \ge N_{0i}$, $A(\theta) < \frac{\epsilon}{3}$ a.s., uniformly $\forall\, \theta \in \Theta_i$. Now

$$B(\theta) = \frac{1}{N} \sum_{n=1}^{N} E \sup_{\theta \in \Theta_i} W_n(\theta) - \lim_{N \to \infty} E\Big[\frac{1}{N} D(\theta)\Big] = \frac{1}{N} \sum_{n=1}^{N} E \sup_{\theta \in \Theta_i} W_n(\theta) - E \lim_{N \to \infty} \Big[\frac{1}{N} D(\theta)\Big] \ \text{(using DCT)} \ = C(\theta) + \bar{D}(\theta),$$

where

$$C(\theta) = \frac{1}{N} \sum_{n=1}^{N} E \sup_{\theta \in \Theta_i} W_n(\theta) - E \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \sup_{\theta \in \Theta_i} W_n(\theta),$$

and DCT stands for the dominated convergence theorem. Taking $U_N(\theta) = \frac{1}{N} \sum_{n=1}^{N} \sup_{\theta \in \Theta_i} W_n(\theta)$, which is bounded, we apply the DCT to pass the limit inside the expectation, and we get $N_{*i}$ such that $C(\theta) < \frac{\epsilon}{3}$ for $N \ge N_{*i}$. Further, note that

$$\bar{D}(\theta) = E \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \sup_{\theta \in \Theta_i} W_n(\theta) - E \lim_{N \to \infty} \Big[\frac{1}{N} D(\theta)\Big] = E \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \sup_{\theta \in \Theta_i} W_n(\theta) - E \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} W_n(\theta) \le E \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \sup_{\theta \in \Theta_i} W_n(\theta) - E \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \inf_{\theta \in \Theta_i} W_n(\theta) \le \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \frac{\epsilon}{4^n} = 0.$$

Combining, we get

$$\frac{1}{N} D(\theta) - \lim_{N \to \infty} E\Big[\frac{1}{N} D(\theta)\Big] \to 0 \quad \text{a.s., uniformly } \forall\, \theta \in \Theta.$$

Lemma 3. The global minimum of $\lim_{N \to \infty} E\big[\frac{1}{N} D(\theta)\big]$ is attained at $\theta^0$.

Proof: At $\theta^0$ the value of $\lim_{N \to \infty} E\big[\frac{1}{N} D(\theta)\big]$ is zero, so if we can show that $\lim_{N \to \infty} E\big[\frac{1}{N} D(\theta)\big] > 0$ for $\theta \ne \theta^0$, we are through. To achieve this we verify the assumptions B7, B8, B9 of Lemma 4 of Oberhofer (1982), which for convenience we reproduce as A1, A2, A3 respectively, below.

A1: For every closed set $\Theta^0$ not containing $\theta^0$, there exist numbers $c > 0$, $d > 0$, $N_0 > 0$ such that for all $\theta \in \Theta^0$ and all $N \ge N_0$,

$$\big|\{n : n \le N, \ |h_n(\theta)| \ge c\}\big| / N \ge d > 0.$$

A2: For every $c > 0$ there exists a real number $d > 0$ such that for all $n$,

$$\min[F_n(c) - 1/2, \ 1/2 - F_n(-c)] \ge d > 0.$$

A3: There exist $e > 0$ and $N_0$ such that for all $N \ge N_0$,

$$Q = \inf_{\Theta^0} \frac{1}{N} \sum_{n=1}^{N} |h_n(\theta)| \min[F_n(c) - 1/2, \ 1/2 - F_n(-c)] \ge e > 0.$$

Lemma 4 of Oberhofer (1982) states that A3 is fulfilled if A1 and A2 hold. Note that Lemma 2 of Oberhofer (1982) gives $E\big[\frac{1}{N} D(\theta)\big] \ge Q$, so it is enough to show $\lim_{N \to \infty} E\big[\frac{1}{N} D(\theta)\big] \ge \lim_{N \to \infty} Q > 0$. Now the condition $\lim_{N \to \infty} Q > 0$ is the same as A3, and by Lemma 4 of Oberhofer (1982), instead of A3 we may verify A1 and A2. Since $f(0) > 0$, A2 is automatically satisfied. It remains to show that A1 is satisfied in our case. If there exists $c > 0$ such that $\inf_{\Theta^0} \frac{1}{N} \sum_{n=1}^{N} |h_n(\theta)| \ge c > 0$ for all $N \ge N_0$, then A1 will be satisfied. Let us consider

$$\Theta^0 = S_c = \{\theta : |\theta - \theta^0| \ge 3c > 0\} \subseteq S_c^A \cup S_c^B \cup S_c^{(\alpha,\beta)}, \tag{7}$$

where

$$S_c^A = \{\theta : |A - A^0| \ge c > 0\} \subseteq \{\theta : |A - A^0| \ge c, (\alpha, \beta) = (\alpha^0, \beta^0)\} \cup \{\theta : |A - A^0| \ge c, (\alpha, \beta) \ne (\alpha^0, \beta^0)\},$$

$$S_c^B = \{\theta : |B - B^0| \ge c > 0\} \subseteq \{\theta : |B - B^0| \ge c, (\alpha, \beta) = (\alpha^0, \beta^0)\} \cup \{\theta : |B - B^0| \ge c, (\alpha, \beta) \ne (\alpha^0, \beta^0)\},$$

$$S_c^{(\alpha,\beta)} = \{\theta : |(\alpha, \beta) - (\alpha^0, \beta^0)| \ge c > 0\}.$$

Now consider the set $\{\theta : |A - A^0| \ge c, (\alpha, \beta) = (\alpha^0, \beta^0)\}$.

Case 1: If $B - B^0 = 0$, then $h_n(\theta) = (A - A^0) \cos(\alpha^0 n + \beta^0 n^2)$ and, using $|\cos x| \ge \cos^2 x$ together with Lemma 1,

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} |h_n(\theta)| = |A - A^0| \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} |\cos(\alpha^0 n + \beta^0 n^2)| \ge |A - A^0| \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \cos^2(\alpha^0 n + \beta^0 n^2) = \frac{1}{2} |A - A^0| \ge \frac{c}{2} > 0.$$

Case 2: If $B - B^0 \ne 0$, then

$$h_n(\theta) = (A - A^0) \cos(\alpha^0 n + \beta^0 n^2) + (B - B^0) \sin(\alpha^0 n + \beta^0 n^2) = r \cos(\omega) \cos(\alpha^0 n + \beta^0 n^2) + r \sin(\omega) \sin(\alpha^0 n + \beta^0 n^2) = r \cos(\alpha^0 n + \beta^0 n^2 - \omega)$$

for some $r > 0$ and $\omega$. So

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} |h_n(\theta)| = r \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} |\cos(\alpha^0 n + \beta^0 n^2 - \omega)| \ge r \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \cos^2(\alpha^0 n + \beta^0 n^2 - \omega) = \frac{r}{2} > 0.$$

On the set $\{\theta : |A - A^0| \ge c, (\alpha, \beta) \ne (\alpha^0, \beta^0)\}$,

$$h_n(\theta) = A \cos(\alpha n + \beta n^2) + B \sin(\alpha n + \beta n^2) - A^0 \cos(\alpha^0 n + \beta^0 n^2) - B^0 \sin(\alpha^0 n + \beta^0 n^2) = r \cos(\alpha n + \beta n^2 - \omega) - r^0 \cos(\alpha^0 n + \beta^0 n^2 - \omega^0)$$

for some $r, r^0 > 0$ and $\omega, \omega^0$. We recall that $|h_n(\theta)| \le 4M$; then $\big(\frac{h_n(\theta)}{4M}\big)^2 \le \big|\frac{h_n(\theta)}{4M}\big| \le 1$. Denote $R = \frac{r}{4M} > 0$ and $R^0 = \frac{r^0}{4M} > 0$. Then, by Lemma 1,

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} |h_n(\theta)| = 4M \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Big|\frac{h_n(\theta)}{4M}\Big| \ge 4M \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \Big(\frac{h_n(\theta)}{4M}\Big)^2 = 4M \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \big[R \cos(\alpha n + \beta n^2 - \omega) - R^0 \cos(\alpha^0 n + \beta^0 n^2 - \omega^0)\big]^2 = 4M \, \frac{R^2 + R^{0\,2}}{2} > 0.$$

Similarly, on the other sets $\liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} |h_n(\theta)| > 0$. So we get $\lim_{N \to \infty} E\big[\frac{1}{N} D(\theta)\big] > 0$ for $\theta \ne \theta^0$.

Proof of Theorem 1: To prove the strong consistency of the LAD estimators, first observe that the minimizer of $Q(\theta)$ is the same as the minimizer of $D(\theta) = Q(\theta) - Q(\theta^0)$; so we develop our result based on the minimizer of $D(\theta)$ instead of $Q(\theta)$. Note that

$$Q(\theta) = \sum_{n=1}^{N} \big| y(n) - A \cos(\alpha n + \beta n^2) - B \sin(\alpha n + \beta n^2) \big| = \sum_{n=1}^{N} |h_n(\theta) + X(n)|, \tag{8}$$

where

$$h_n(\theta) = A^0 \cos(\alpha^0 n + \beta^0 n^2) + B^0 \sin(\alpha^0 n + \beta^0 n^2) - A \cos(\alpha n + \beta n^2) - B \sin(\alpha n + \beta n^2),$$

and note that $|h_n(\theta)| \le 4M$ for $\theta \in \Theta$ and $Q(\theta^0) = \sum_{n=1}^{N} |X(n)|$. In Lemma 2 we have shown that

$$\frac{1}{N} D(\theta) - \lim_{N \to \infty} E\Big[\frac{1}{N} D(\theta)\Big] \to 0 \quad \text{a.s., uniformly } \forall\, \theta \in \Theta,$$

and in Lemma 3 we have shown that $\theta^0$ is the global minimizer of $\lim_{N \to \infty} E\big[\frac{1}{N} D(\theta)\big]$. Therefore, by Lemma 2 of Jennrich (1969) or by Lemma 2.2 of White (1980), we can conclude that the minimizer of $D(\theta)$ is a strongly consistent estimator of $\theta^0$.

3.2 Asymptotic Normality

We now show that the estimators have the following asymptotic normality property. Let us take

$$D = \text{diag}\Big\{\frac{1}{\sqrt{N}}, \ \frac{1}{\sqrt{N}}, \ \frac{1}{N\sqrt{N}}, \ \frac{1}{N^2\sqrt{N}}\Big\}.$$

Theorem 2. If Assumptions 1-2 are satisfied, then

$$(\widehat{\theta} - \theta^0) D^{-1} \xrightarrow{d} N_4\Big(0, \ \frac{1}{f(0)^2} \Sigma\Big), \tag{9}$$

where "$\xrightarrow{d}$" means convergence in distribution and

$$\Sigma = \frac{1}{A^{0\,2} + B^{0\,2}} \begin{pmatrix} \frac{1}{2}\big(A^{0\,2} + 9B^{0\,2}\big) & -4A^0 B^0 & -18B^0 & 15B^0 \\ -4A^0 B^0 & \frac{1}{2}\big(9A^{0\,2} + B^{0\,2}\big) & 18A^0 & -15A^0 \\ -18B^0 & 18A^0 & 96 & -90 \\ 15B^0 & -15A^0 & -90 & 90 \end{pmatrix}. \tag{10}$$
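As an illustration of how Theorem 2 can be used, the sketch below evaluates the implied large-sample standard deviations of the four LAD estimators for hypothetical values of $A^0$, $B^0$, $N$ and $f(0)$ (for standard Laplace errors, $f(0) = 1/2$); the four scaling rates come from the diagonal entries of $D$.

```python
import numpy as np

def lad_asymptotic_sd(A0, B0, N, f0):
    """Large-sample SDs of (A, B, alpha, beta) estimators per Theorem 2."""
    s = A0**2 + B0**2
    Sigma = np.array([
        [0.5 * (A0**2 + 9 * B0**2), -4 * A0 * B0,              -18 * B0,  15 * B0],
        [-4 * A0 * B0,               0.5 * (9 * A0**2 + B0**2),  18 * A0, -15 * A0],
        [-18 * B0,                   18 * A0,                    96.0,    -90.0  ],
        [ 15 * B0,                  -15 * A0,                   -90.0,     90.0  ],
    ]) / s
    # Rates from D: N^{-1/2}, N^{-1/2}, N^{-3/2}, N^{-5/2}
    rates = np.array([N**-0.5, N**-0.5, N**-1.5, N**-2.5])
    return rates * np.sqrt(np.diag(Sigma)) / f0

# Hypothetical values, matching the earlier simulation sketch
print(lad_asymptotic_sd(A0=2.0, B0=1.5, N=200, f0=0.5))
```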

Proof: We recall that $Q(\theta)$ is not a differentiable function; to find the asymptotic distribution of $\widehat{\theta}$, we want to approximate $Q(\theta)$ by a function $\widetilde{Q}(\theta)$ with a "nice" property (differentiability). For that purpose we need to approximate $|x|$ near zero by a "nice" function $\rho_N(x)$ such that $\lim_{N \to \infty} \rho_N(x) = |x|$. Let us consider the interval near zero $\big(-\frac{1}{\gamma_N}, \frac{1}{\gamma_N}\big)$, where $\gamma_N$ is an increasing function of $N$ satisfying $\lim_{N \to \infty} \frac{1}{\gamma_N} = 0$.

We approximate $|x|$ by a polynomial, separately on $\big(-\frac{1}{\gamma_N}, 0\big)$ and $\big(0, \frac{1}{\gamma_N}\big)$. On each of these intervals the degree of the polynomial has to be at least 3 to make the approximating function twice continuously differentiable. Indeed, suppose the degree were less than 3, say 2, with $P(x) = Ax^2 + Bx + C$. Then $P''\big(\frac{1}{\gamma_N}\big) = 2A$ should match the second derivative of $|x|$ at the boundary point $\frac{1}{\gamma_N}$, which is zero; so either $A = 0$, which reduces the polynomial to degree 1, or $P''(x)$ has a jump discontinuity at $\frac{1}{\gamma_N}$.

So let the approximating polynomial be $P(x) = Ax^3 + Bx^2 + Cx + D$ on $\big(0, \frac{1}{\gamma_N}\big)$. As $|x|$ is symmetric about zero, the approximating polynomial on $\big(-\frac{1}{\gamma_N}, 0\big)$ will be $P(x) = -Ax^3 + Bx^2 - Cx + D$. To find the coefficients of the polynomial we match the function value and its derivatives at the joining points. $P\big(\frac{1}{\gamma_N}\big) = \big|\frac{1}{\gamma_N}\big|$ gives

$$\frac{A}{\gamma_N^3} + \frac{B}{\gamma_N^2} + \frac{C}{\gamma_N} + D = \frac{1}{\gamma_N}; \tag{11}$$

$P'\big(\frac{1}{\gamma_N}\big) = 1$ gives

$$\frac{3A}{\gamma_N^2} + \frac{2B}{\gamma_N} + C = 1; \tag{12}$$

$P''\big(\frac{1}{\gamma_N}\big) = 0$ gives

$$\frac{6A}{\gamma_N} + 2B = 0; \tag{13}$$

and agreement of $P'(0)$ from both parts of the polynomial gives

$$C = 0. \tag{14}$$

Solving the previous four equations we get the suitable cubic spline

$$\rho_N(x) = \Big(-\frac{1}{3}\gamma_N^2 x^3 + \gamma_N x^2 + \frac{1}{3\gamma_N}\Big) I_{\{0 < x < \frac{1}{\gamma_N}\}} + x\, I_{\{x \ge \frac{1}{\gamma_N}\}}, \qquad \rho_N(-x) = \rho_N(x),$$

which is symmetric and twice continuously differentiable; here $\gamma_N$ is an increasing function of $N$ satisfying the extra conditions $N = o(\gamma_N^2)$, $\gamma_N = o(N^3)$ and $\sum_{N=1}^{\infty} \frac{1}{\gamma_N^2} < \infty$, which we will be needing later.
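The construction can be sanity-checked numerically; the sketch below (illustrative only) verifies that $\rho_N$ agrees with $|x|$ in value, first and second derivative at the knot $x = 1/\gamma_N$, and that $\sup_x |\rho_N(x) - |x||$ equals $\frac{1}{3\gamma_N}$:

```python
import numpy as np

def rho(x, g):
    """Cubic spline approximation of |x|, knots at +/- 1/g (g = gamma_N)."""
    x = np.asarray(x, dtype=float)
    inner = -g**2 * np.abs(x)**3 / 3.0 + g * x**2 + 1.0 / (3.0 * g)
    return np.where(np.abs(x) < 1.0 / g, inner, np.abs(x))

g, h = 10.0, 1e-5
k = 1.0 / g   # knot
print(rho(k, g) - k)                                           # value match: 0
print((rho(k + h, g) - rho(k - h, g)) / (2 * h) - 1.0)         # slope match: ~0
print((rho(k + h, g) - 2 * rho(k, g) + rho(k - h, g)) / h**2)  # curvature: ~0
xs = np.linspace(-1.0, 1.0, 2001)
print(np.max(np.abs(rho(xs, g) - np.abs(xs))), 1.0 / (3 * g))  # both 1/(3g)
```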

Having obtained the nice function $\rho_N(x)$, we now define

$$\widetilde{Q}(\theta) = \sum_{n=1}^{N} \rho_N\big(h_n(\theta) + X(n)\big) \tag{15}$$

and note that $\widetilde{Q}(\theta^0) = \sum_{n=1}^{N} \rho_N(X(n))$. We now prove the following two results (Lemma 4 and Lemma 5), which when combined give the required asymptotic normality result.

Lemma 4. If Assumptions 1-2 are satisfied, then $(\widehat{\theta} - \widetilde{\theta}) D^{-1} \xrightarrow{P} 0$, where $\widetilde{\theta}$ is the minimizer of $\widetilde{Q}(\theta)$ and "$\xrightarrow{P}$" means convergence in probability.

Lemma 5. If Assumptions 1-2 are satisfied, then $\widetilde{\theta}$ has the following asymptotic distribution:

$$(\widetilde{\theta} - \theta^0) D^{-1} \xrightarrow{d} N_4\Big(0, \ \frac{1}{f(0)^2} \Sigma\Big).$$

To prove Lemma 4 and Lemma 5 we need some more lemmas.

Lemma 6. $\sup_{\theta \in \Theta} |\widetilde{Q}(\theta) - Q(\theta)| = o_P(1)$ and $\sup_{\theta \in \Theta} \frac{1}{N} |\widetilde{Q}(\theta) - Q(\theta)| \to 0$ a.s., where $o_P(1)$ means convergence to zero in probability.

Proof: To bound $\widetilde{Q}(\theta) - Q(\theta)$, we write the function $\rho_N(x) - |x|$ explicitly:

$$\rho_N(x) - |x| = \Big(-\frac{1}{3}\gamma_N^2 |x|^3 + \gamma_N x^2 - |x| + \frac{1}{3\gamma_N}\Big) I_{\{0 < |x| < \frac{1}{\gamma_N}\}},$$

so that $0 \le \rho_N(x) - |x| \le \frac{C_1}{\gamma_N} I_{\{0 \le |x| < \frac{1}{\gamma_N}\}}$ for some constant $C_1 > 0$. Hence, by Markov's inequality and the boundedness of $f$ near zero,

$$P\big(|\widetilde{Q}(\theta) - Q(\theta)| > \epsilon\big) \le \frac{1}{\epsilon} E|\widetilde{Q}(\theta) - Q(\theta)| \le \frac{C_1}{\epsilon \gamma_N} \sum_{n=1}^{N} P\Big(|h_n(\theta) + X(n)| < \frac{1}{\gamma_N}\Big) \le \frac{C_2 N}{\epsilon \gamma_N^2} \to 0,$$

since $N = o(\gamma_N^2)$, which gives the first claim; and

$$\sum_{N=1}^{\infty} P\Big(\frac{1}{N} |\widetilde{Q}(\theta) - Q(\theta)| > \epsilon\Big) \le C_2 \sum_{N=1}^{\infty} \frac{1}{\gamma_N^2} < \infty,$$

which, by the Borel-Cantelli lemma, gives the second claim. The bounds do not depend on $\theta$, so both statements hold uniformly over $\Theta$.

Let us denote by $\widetilde{Q}'(\theta)$ the $4 \times 1$ first derivative vector and by $\widetilde{Q}''(\theta)$ the $4 \times 4$ second derivative matrix of $\widetilde{Q}(\theta)$. To get explicit expressions for $\widetilde{Q}'(\theta)$ and $\widetilde{Q}''(\theta)$, we write explicitly the functions $\rho_N(x)$, $\rho'_N(x)$ and $\rho''_N(x)$:

$$\rho_N(x) = \Big(-\frac{1}{3}\gamma_N^2 x^3 + \gamma_N x^2 + \frac{1}{3\gamma_N}\Big) I_{\{0 \le x \le \frac{1}{\gamma_N}\}} + \Big(\frac{1}{3}\gamma_N^2 x^3 + \gamma_N x^2 + \frac{1}{3\gamma_N}\Big) I_{\{-\frac{1}{\gamma_N} \le x < 0\}} - x\, I_{\{x < -\frac{1}{\gamma_N}\}} + x\, I_{\{x > \frac{1}{\gamma_N}\}},$$

$$\rho'_N(x) = \big(-\gamma_N^2 x^2 + 2\gamma_N x\big) I_{\{0 \le x \le \frac{1}{\gamma_N}\}} + \big(\gamma_N^2 x^2 + 2\gamma_N x\big) I_{\{-\frac{1}{\gamma_N} \le x < 0\}} - I_{\{x < -\frac{1}{\gamma_N}\}} + I_{\{x > \frac{1}{\gamma_N}\}},$$

$$\rho''_N(x) = \big(-2\gamma_N^2 x + 2\gamma_N\big) I_{\{0 \le x \le \frac{1}{\gamma_N}\}} + \big(2\gamma_N^2 x + 2\gamma_N\big) I_{\{-\frac{1}{\gamma_N} \le x < 0\}}.$$
