Communications in Statistics—Theory and Methods, 40: 467–491, 2011
Copyright © Taylor & Francis Group, LLC
ISSN: 0361-0926 print / 1532-415X online
DOI: 10.1080/03610920903427792
The Strong Consistency of M Estimator in a Linear Model for Negatively Dependent Random Samples

QUNYING WU AND YUANYING JIANG
College of Science, Guilin University of Technology, Guilin, P.R. China

The strong consistency of M estimators of the regression parameters in linear models with negatively dependent random errors is established under mild conditions. This is an essential improvement on the relevant results in the literature with respect to both the moment conditions and the dependence of the errors. In particular, Theorems 1 and 2 of Wu (2006) are not only extended to the case of negatively dependent random errors but also essentially improved with respect to the moment conditions.

Keywords: Linear model; M estimator; Moment condition; Negatively dependent random error; Strong consistency.

Mathematics Subject Classification: 62F12.

Received November 6, 2007; Accepted October 19, 2009.
Address correspondence to Qunying Wu, College of Science, Guilin University of Technology, Guilin 541004, P.R. China; E-mail: [email protected]
1. Introduction and Lemmas

Consider the linear model
$$Y_i = x_i'\beta + e_i, \quad i = 1, \dots, n, \; n \ge 1, \tag{1.1}$$
where $x_1, x_2, \dots, x_n$ are $p \times 1$ known design vectors, $e_1, e_2, \dots, e_n$ are random errors, and $\beta$ is a $p \times 1$ unknown parameter vector. Suppose that $\rho$ is a suitably chosen function on $\mathbb{R}^1$. The M estimator $\hat\beta_n$ of $\beta$ is defined as any solution of
$$\sum_{i=1}^n \rho(Y_i - x_i'\hat\beta_n) = \min_{\beta\in\mathbb{R}^p}\sum_{i=1}^n \rho(Y_i - x_i'\beta). \tag{1.2}$$
M estimators are widely used and important. The most commonly used M estimators include those of Examples 1.1–1.4 below.

Example 1.1 (Maximum Likelihood Estimators, MLE). Assume that the $e_i$ have common density function $f$. It is well known that the maximum likelihood
estimator (MLE) $\hat\beta_n^*$ of $\beta$ for the model (1.1) is defined by
$$\prod_{i=1}^n f(Y_i - x_i'\hat\beta_n^*) = \max_{\beta\in\mathbb{R}^p}\prod_{i=1}^n f(Y_i - x_i'\beta).$$
Let $\rho = -\ln f$; then $\sum_{i=1}^n \rho(Y_i - x_i'\hat\beta_n^*) = \min_{\beta\in\mathbb{R}^p}\sum_{i=1}^n \rho(Y_i - x_i'\beta)$, i.e., $\hat\beta_n^*$ is an M estimator for the model (1.1). Thus the MLE is a particular case of M estimators.

Example 1.2 (Huber's Estimators). Huber's estimators correspond to the objective function $\rho(x) = \frac{x^2}{2}I_{(|x|\le c)} + \left(c|x| - \frac{c^2}{2}\right)I_{(|x|>c)}$, $c > 0$.

Example 1.3 ($L_q$ Regression Estimators). The $L_q$ regression estimators correspond to $\rho(x) = |x|^q$, $1 \le q \le 2$. In particular, for $q = 1$ and $q = 2$ the minimizers of (1.2) are called the least absolute deviations (LAD) estimator and the least squares (LS) estimator, respectively.

Example 1.4 (Regression Quantile Estimators). The regression quantile estimator of order $\theta$ corresponds to the objective function $\rho(x) = \theta x^+ + (1-\theta)(-x)^+$, $0 < \theta < 1$, where $x^+ = \max(x, 0)$.

After Huber (1973) studied M estimators, many statisticians became interested in this topic, and a series of useful results was established. References to recent work on M estimators can be found in Chen and Zhao (1995, 1996), He and Shao (1996), Yang (2002), Collins and Szatmari (2004), Georgios (2005), Wu (2005, 2006), Djalil and Didier (2007), and Seija et al. (2007). However, the majority of the previous work assumes that the errors $e_i$ are independent. The asymptotic theory of M estimators in linear models with dependent errors is practically important yet theoretically challenging; Huber (1973) himself commented that the assumption of independence is a serious restriction. In this article, we relax the independence assumption of the classical M estimation theory so that a much more general class of dependent errors is covered. Generally speaking, it is not easy to obtain strong consistency for dependent errors. We establish the strong consistency of the M estimator of $\beta$ in the linear model (1.1) with negatively dependent random errors $(e_i)$. We now introduce the concepts of three types of dependent random variables.

Definition 1.1. Random variables $X$ and $Y$ are said to be negatively dependent (ND) if
$$P(X \le x, Y \le y) \le P(X \le x)P(Y \le y) \tag{1.3}$$
for all $x, y \in \mathbb{R}$. A collection of random variables is said to be pairwise negatively dependent (PND) if every pair of random variables in the collection satisfies (1.3). It is important to note that (1.3) implies
$$P(X > x, Y > y) \le P(X > x)P(Y > y) \tag{1.4}$$
for all $x, y \in \mathbb{R}$. Moreover, (1.4) implies (1.3), and hence (1.3) and (1.4) are equivalent. However, Ebrahimi and Ghosh (1981) showed that (1.3) and (1.4) are not equivalent for collections of three or more random variables. They considered random variables $X_1, X_2, X_3$ where $(X_1, X_2, X_3)$ takes the values $(0,1,1)$, $(1,0,1)$, $(1,1,0)$, and $(0,0,0)$, each with probability $\frac14$. The random variables $X_1, X_2, X_3$ are pairwise independent, and hence satisfy both (1.3) and (1.4) for all pairs. Moreover,
$$P(X_1 > x_1, X_2 > x_2, X_3 > x_3) \le P(X_1 > x_1)P(X_2 > x_2)P(X_3 > x_3) \tag{1.5}$$
for all $x_1, x_2, x_3$, but
$$P(X_1 \le 0, X_2 \le 0, X_3 \le 0) = \frac14 \ne \frac18 = P(X_1 \le 0)P(X_2 \le 0)P(X_3 \le 0).$$
Placing probability $\frac14$ on each of the other vertices $(1,0,0)$, $(0,1,0)$, $(0,0,1)$, $(1,1,1)$ provides the converse example: pairwise independent random variables that fail (1.5) at $x_1 = x_2 = x_3 = 0$ but for which the desired inequality $P(X_1 \le x_1, X_2 \le x_2, X_3 \le x_3) \le \prod_{i=1}^3 P(X_i \le x_i)$ holds for all $x_1, x_2, x_3$. Accordingly, the following definition is needed to define sequences of negatively dependent random variables.

Definition 1.2. The random variables $X_1, \dots, X_n$ are said to be negatively dependent (ND) if for all real $x_1, \dots, x_n$,
$$P\left(\bigcap_{j=1}^n \{X_j \le x_j\}\right) \le \prod_{j=1}^n P(X_j \le x_j)$$
and
$$P\left(\bigcap_{j=1}^n \{X_j > x_j\}\right) \le \prod_{j=1}^n P(X_j > x_j).$$
An infinite random sequence $\{X_n, n \ge 1\}$ is said to be ND if every finite subfamily is ND.

Definition 1.3. Random variables $X_1, X_2, \dots, X_n$, $n \ge 2$, are said to be negatively associated (NA) if for every pair of disjoint subsets $A_1$ and $A_2$ of $\{1, 2, \dots, n\}$,
$$\mathrm{cov}\left(f_1(X_i, i \in A_1),\, f_2(X_j, j \in A_2)\right) \le 0,$$
where $f_1$ and $f_2$ are increasing in every variable (or decreasing in every variable), provided this covariance exists. A sequence of random variables $\{X_n, n \ge 1\}$ is said to be NA if every finite subfamily is NA.

The definition of PND was given by Lehmann (1966), the definition of NA was introduced by Joag-Dev and Proschan (1983), and the concept of ND was given by Bozorgnia et al. (1993). These concepts of dependent random variables are very useful in reliability theory and its applications.
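The gap between the pairwise inequality (1.3) and the joint inequalities of Definition 1.2 is easy to check mechanically. The following sketch (ours, purely illustrative; Python is our choice) enumerates the four-point distribution of Ebrahimi and Ghosh (1981) above and confirms that every pair is independent while the three-variable lower-tail product bound fails at $(0, 0, 0)$.

```python
from itertools import product

# Four equally likely support points of (X1, X2, X3) from the
# Ebrahimi-Ghosh (1981) example above.
prob = {(0, 1, 1): 0.25, (1, 0, 1): 0.25, (1, 1, 0): 0.25, (0, 0, 0): 0.25}

def p_le(pairs):
    """P(X_i <= x_i for every (i, x_i) in pairs)."""
    return sum(p for pt, p in prob.items() if all(pt[i] <= x for i, x in pairs))

# Every pair is independent: the joint lower-tail CDF factorizes everywhere.
for i, j in [(0, 1), (0, 2), (1, 2)]:
    for xi, xj in product([0, 1], repeat=2):
        assert abs(p_le([(i, xi), (j, xj)]) - p_le([(i, xi)]) * p_le([(j, xj)])) < 1e-12

# But the three-variable lower-tail inequality of Definition 1.2 fails at (0,0,0):
joint = p_le([(0, 0), (1, 0), (2, 0)])                    # 1/4
prod3 = p_le([(0, 0)]) * p_le([(1, 0)]) * p_le([(2, 0)])  # 1/8
print(joint, prod3)                                       # 0.25 0.125
```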
It is easy to see from the definitions that NA implies ND. The following example shows that ND does not imply NA.

Example 1.5. Let $X_i$ be a binary random variable such that $P(X_i = 0) = P(X_i = 1) = 0.5$ for $i = 1, 2, 3$. Let $(X_1, X_2, X_3)$ take the values $(0,0,1)$, $(0,1,0)$, $(1,0,0)$, and $(1,1,1)$, each with probability $1/4$. It can be verified that all the ND conditions hold. However,
$$P(X_1 + X_3 \le 1, X_2 \le 0) = \frac{4}{8} > \frac{3}{8} = P(X_1 + X_3 \le 1)P(X_2 \le 0).$$
Thus, $X_1, X_2, X_3$ are not NA. This example shows that ND is strictly weaker than NA.

In the articles listed earlier, a number of well-known multivariate distributions are shown to possess the ND property, such as: (a) the multinomial; (b) convolutions of unlike multinomials; (c) the multivariate hypergeometric; (d) the Dirichlet; (e) the Dirichlet compound multinomial; (f) multinormals with certain covariance matrices. Because of their wide applications in multivariate statistical analysis and reliability theory, ND random variables have recently attracted increasing attention, and a series of useful results has been established (cf. Bozorgnia et al., 1993; Ebrahimi and Ghosh, 1981; Klesov et al., 2005; Kuczmaszewska, 2006; Taylor et al., 2002). Hence, it is highly desirable and of considerable significance to extend the limit theorems and applications of independent or NA random variables to ND random variables.

The following five lemmas are needed for the main results in the next section. Detailed proofs of Lemmas 1.1 and 1.2 can be found in Bozorgnia et al. (1993), and detailed proofs of Lemmas 1.3–1.5 can be found in the Appendix.

Lemma 1.1 (Bozorgnia et al., 1993). Let $X_1, \dots, X_n$ be ND random variables and let $\{f_n, n \ge 1\}$ be a sequence of Borel functions, all of which are monotone increasing (or all monotone decreasing). Then $\{f_n(X_n), n \ge 1\}$ is a sequence of ND random variables.

Lemma 1.2 (Bozorgnia et al., 1993). Let $X_1, \dots, X_n$ be nonnegative ND random variables. Then
$$E\prod_{j=1}^n X_j \le \prod_{j=1}^n EX_j.$$
In particular, let $X_1, \dots, X_n$ be ND and let $t_1, \dots, t_n$ all be nonnegative (or all nonpositive) real numbers. Then
$$E\exp\left(\sum_{j=1}^n t_j X_j\right) \le \prod_{j=1}^n E\exp(t_j X_j).$$
Lemma 1.3. Let $\{X_n, n \ge 1\}$ be an ND sequence with $EX_i = 0$, $|X_i| \le b_i$ a.s., $i = 1, 2, \dots$, $B_n = \sum_{i=1}^n EX_i^2$, $t > 0$, and $t\cdot\max_{1\le i\le n} b_i \le 1$. Then, for all $u > 0$,
$$P\left(\left|\sum_{i=1}^n X_i\right| > u\right) \le 2\exp\left(-tu + t^2 B_n\right).$$
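As a quick numerical sanity check of Lemma 1.3 (ours, not part of the original article), note that independent mean-zero bounded variables satisfy both inequalities of Definition 1.2 with equality and are therefore ND. The sketch below, with the sample size, threshold $u$, and $t = \min(1, u/2B_n)$ being our choices, compares the empirical tail probability with the bound $2\exp(-tu + t^2 B_n)$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 200_000
# X_i ~ Uniform(-1, 1): EX_i = 0, |X_i| <= b_i = 1, EX_i^2 = 1/3.
B_n = n / 3.0
u = 10.0
t = min(1.0, u / (2 * B_n))   # t * max_i b_i <= 1, as the lemma requires

S = rng.uniform(-1.0, 1.0, size=(reps, n)).sum(axis=1)
empirical = np.mean(np.abs(S) > u)
bound = 2 * np.exp(-t * u + t**2 * B_n)
print(f"empirical {empirical:.2e} <= bound {bound:.2e}")  # ~1.4e-02 <= ~4.5e-01
```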
Lemma 1.4. Let $\{X_n, n \ge 1\}$ be a sequence of ND identically distributed random variables, and suppose that for some $0 < \alpha \le 1$,
$$E|X_1|^{1/\alpha} < \infty, \tag{1.6}$$
$$EX_1 = 0, \tag{1.7}$$
$$|a_{nk}| \le cn^{-\alpha} \text{ for } n \ge 1,\ k \le n, \text{ and some constant } 0 < c < \infty, \text{ and } a_{nk} = 0 \text{ for } k > n, \tag{1.8}$$
and that for $p = \min(1/\alpha, 2)$ there exists a constant $\delta > 0$ such that
$$\sum_{k=1}^n |a_{nk}|^p \le cn^{-\delta} \quad \text{for all } n. \tag{1.9}$$
Then
$$T_n \hat{=} \sum_{k=1}^n a_{nk}X_k \to 0 \quad \text{a.s.} \tag{1.10}$$

Remark 1.1. Lemma 1.4 is itself an important almost sure convergence result for ND random variables.

Lemma 1.5. Let $\{X_n, n \ge 1\}$ be a sequence of ND random variables for which there exist a random variable $X$ and a constant $c > 0$ such that $P(|X_k| > t) \le cP(|X| > t)$ for all $t > 0$ and $k \ge 1$, $E|X|^{1/\alpha} < \infty$ for some $\alpha > 0$, and $EX_k = 0$ for $\alpha < 1$. If (1.8) and (1.9) are satisfied, then (1.10) holds.

The article is structured as follows. Section 2 presents our main results on the strong consistency of the estimator of $\beta$ in the linear model (1.1) with negatively dependent random errors $(e_i)$. Detailed proofs of the theorems, whose methodology is of independent interest, are given in Sec. 3. Simulation studies illustrating the results are given in Sec. 4. The proofs of Lemmas 1.3–1.5 are given in the Appendix (Sec. 5).
2. Main Results

For convenience, throughout this article we assume that $\rho$ is a nonmonotonic convex function on $\mathbb{R}^1$, which ensures the existence of an M estimator for the linear model (1.1) (see Chen and Zhao, 1996). $\psi_-$ and $\psi_+$ denote the left and right derivatives of $\rho$, respectively. Choose an increasing function $\psi$ such that $\psi_-(u) \le \psi(u) \le \psi_+(u)$ for all $u \in \mathbb{R}^1$. The function $\psi(\cdot)$ plays an important role in the study of the strong consistency of $\hat\beta_n$. Let $a$ denote a $p \times 1$ column vector and $a'$ its transpose. Write
$$\|a\|^2 \hat{=} \sum_{i=1}^p a_i^2 = a'a, \qquad |a| \hat{=} \max_{1\le i\le p}|a_i|.$$
Let $S_n \hat{=} x_1x_1' + \cdots + x_nx_n' = \sum_{i=1}^n x_ix_i'$, suppose that $S_n^{-1}$ exists (hence $S_n$ is a positive definite matrix), and let $d_n \hat{=} \max_{1\le i\le n} x_i'S_n^{-1}x_i$.
In most earlier articles dealing with the strong consistency of an M estimator or a LAD estimator of $\beta$, it is assumed that $S_n/n \to Q$, where $Q$ is a positive definite matrix. Wu (1988) obtained results of a more general nature for the strong consistency of a LAD estimate $\hat\beta_n$ of $\beta$, but still under some cumbersome and unnecessary conditions. Chen et al. (1992) gave the following result. Suppose that $e_1, e_2, \dots$ are independent, $\mathrm{med}(e_i) = 0$ for each $i$, and there exist positive constants $c$ and $\Delta$ such that
$$P(-h < e_i < 0) \le ch \le P(0 < e_i < h) \quad \text{for any } h \in (0, \Delta) \text{ and } i = 1, 2, \dots$$
Then the following assertions are true:
$$d_n = o(1/\ln n) \;\Rightarrow\; \hat\beta_n \to \beta \text{ a.s. as } n \to \infty;$$
$$d_n = O(1/n) \;\Rightarrow\; \hat\beta_n \text{ tends to } \beta \text{ exponentially, in the sense that for any given } \varepsilon > 0 \text{ there exists } c_\varepsilon > 0 \text{ with } P(\|\hat\beta_n - \beta\| \ge \varepsilon) = O(e^{-c_\varepsilon n}).$$

For the strong consistency of an M estimator $\hat\beta_n$ defined by minimizing (1.2), Chen and Zhao (1995) obtained the following results.

(i) Suppose that $e_1, e_2, \dots$ are independent and there exist positive constants $l_1$ and $l_2$ such that
$$E\rho(e_i + u) - E\rho(e_i) \ge l_1u^2 \quad \text{for } |u| < l_2 \text{ and } i = 1, 2, \dots, \tag{2.1}$$
and that one of the following two conditions is satisfied: (a) there exists a constant $M < \infty$ such that $|e_i| \le M$ for each $i$; (b) $\rho$ satisfies the Lipschitz condition or, equivalently, $\psi_+$ is bounded. Then $\hat\beta_n$ is a strongly consistent estimator of $\beta$ provided $d_n = o(1/\ln n)$.

(ii) Suppose that $e_1, e_2, \dots$ are independent, (2.1) holds,
$$E|\psi_+(e_i \pm m)| \le h_m < \infty, \quad i, m = 1, 2, \dots, \quad\text{and}\quad d_n = O(n^{-\gamma}) \tag{2.2}$$
for some positive constants $h_m$, $m = 1, 2, \dots$, and $\gamma$. Then $\hat\beta_n$ is a strongly consistent estimator of $\beta$.

However, for the strong consistency of $\hat\beta_n$, the moment conditions in (ii) are too strong. Yang (2002) improved the result in (ii): in the case where $e_1, e_2, \dots$ are i.i.d., he established the strong consistency of $\hat\beta_n$ under (2.1), (2.2), and the moment condition $E|\psi_+(e \pm \Delta)|^{2/\gamma} < \infty$, where $\Delta > 0$ and $0 < \gamma \le 1$.
Further, Wu (2006) extended the theorems of Yang (2002) to the case of NA random errors and obtained the following results.
Theorem A. In the model (1.1), assume that $e, e_1, e_2, \dots$ are NA random errors with a common distribution, $\rho$ is a convex function, (2.2) holds, and the following conditions are satisfied:

1. There exist constants $c_2 > 0$ and $l_2 > 0$ such that
$$E\rho(e + u) - E\rho(e) \ge c_2u^2 \quad \text{for } |u| < l_2. \tag{2.3}$$

2. There exist constants $\Delta > 0$ and $0 < \gamma \le 1$ such that
$$E|\psi_+(e \pm \Delta)|^{2/\gamma} \le h < \infty. \tag{2.4}$$
Then $\hat\beta_n$ is a strongly consistent estimator of $\beta$.

Theorem B. In the model (1.1), assume that $e_1, e_2, \dots$ are NA random errors, $\rho$ is a convex function, (2.1) and (2.2) hold, and there exist constants $\Delta > 0$, $0 < \gamma \le 1$, and $t > 2/\gamma$ such that $\sup_i E|\psi_+(e_i \pm \Delta)|^t \le h < \infty$. Then $\hat\beta_n$ is a strongly consistent estimator of $\beta$.

In this article, we study general ND random errors. As a result, we not only extend Theorems A and B above to ND random errors but also essentially improve the moment conditions.

Theorem 2.1. In the model (1.1), assume that $e, e_1, e_2, \dots$ are identically distributed ND random errors and that there exist positive constants $c_1, c_2, \Delta$, and $\gamma \in (0, 1]$ such that (2.5) (or (2.5$'$)), (2.6), (2.7), and (2.8) below are satisfied:
$$\psi(u \pm t) - \psi(u) \text{ is monotonic in } u \in \mathbb{R}^1 \text{ and } |\psi(u + t) - \psi(u)| \le c_1 \text{ for } t \in (0, \Delta),\ u \in \mathbb{R}^1, \tag{2.5}$$
or
$$|\psi(u)| \le c_1 \quad \text{for } u \in \mathbb{R}^1; \tag{2.5$'$}$$
$$E\psi(e) = 0 \text{ and } |E\psi(e + u)| \ge c_2|u| \quad \text{for } |u| < \Delta; \tag{2.6}$$
$$d_n \le c_1n^{-\gamma}; \tag{2.7}$$
$$E|\psi(e)|^{1/\gamma} < \infty \text{ when } 0 < \gamma < 1, \text{ and } E|\psi(e)|^{\delta} < \infty \text{ for some } \delta > 1 \text{ when } \gamma = 1. \tag{2.8}$$
Then $\hat\beta_n$ is a strongly consistent estimator of $\beta$.

Theorem 2.1 can easily be extended to the case where $e_1, e_2, \dots$ are ND random errors not sharing the same distribution. We have the following.
Theorem 2.2. In the model (1.1), assume that $e_1, e_2, \dots$ are ND random errors and that there exist positive constants $c, c_1, c_2, \Delta$, $\gamma \in (0, 1]$, and a random variable $e$ satisfying $P(|e_n| > t) \le cP(|e| > t)$ for all $t > 0$, $n \ge 1$, such that (2.5) (or (2.5$'$)), (2.7), and (2.8) hold, and
$$E\psi(e_k) = 0 \text{ and } |E\psi(e_k + u)| \ge c_2|u| \quad \text{for } |u| < \Delta,\ k \ge 1.$$
Then $\hat\beta_n$ is a strongly consistent estimator of $\beta$.

Remark 2.1. The moment condition (2.8) greatly improves condition (2.4). In fact, our results essentially improve the moment conditions of the relevant results in the literature, such as Chen and Zhao (1995), Yang (2002), and Wu (2006). If $0 < \gamma < 1$, we only need $E|\psi(e)|^{1/\gamma} < \infty$, and if $\gamma = 1$, we only need $E|\psi(e)|^{\delta} < \infty$ for some $\delta > 1$.

Remark 2.2. Condition (2.6) is weaker than (2.3). In fact, if (2.3) holds, then $0$ is the minimum point of the function $g(u) \hat{=} E(\rho(e + u) - \rho(e))$, and by Lemma 1.4 of Chen and Zhao (1996), $E\psi(e) = 0$. By the properties of a convex function, it is easy to see that
$$\rho(e + u) - \rho(e) \le u\psi(e + u) \quad \text{for } u \in \mathbb{R}^1.$$
Thus $E\rho(e + u) - E\rho(e) \le uE\psi(e + u)$, which together with (2.3) implies $|E\psi(e + u)| \ge c_2|u|$.

Remark 2.3. By the statements on pp. 64 and 65 of Chen and Zhao (1996), $d_n \ge p/n$. Thus the restriction $0 < \gamma \le 1$ in (2.7) is essential.

Remark 2.4. Since the common M estimators such as the LS and LAD estimators satisfy (2.5) (or (2.5$'$)), assumption (2.5) (or (2.5$'$)) is not an essential restriction in applications.

As Remarks 2.1–2.4 indicate, Theorems 2.1 and 2.2 not only relax the random errors to a very general class of dependent errors but also essentially improve the conditions on $e_i$ and $\psi$. In particular, Theorems 2.1 and 2.2 not only extend Theorems 1 and 2 of Wu (2006) to the case of ND random errors but also essentially improve the moment condition. The conditions of our theorems are very weak and are satisfied in most cases of practical interest. Some examples follow.

Example 2.1. In the model (1.1), assume that $e_1, e_2, \dots$ are ND with common distribution function $F$ having zero median and
$$|1 - 2F(-u)| \ge c|u| \quad \text{for } |u| < \Delta, \tag{2.9}$$
where $c > 0$ and $\Delta > 0$ are constants. For the LAD estimator, i.e., $\rho(x) = |x|$, choose $a \in (-1, 1)$ and
$$\psi(u) = \begin{cases} \mathrm{sgn}(u), & u \ne 0, \\ a, & u = 0, \end{cases}$$
such that $E\psi(e_1) = 0$. Conditions (2.5) and (2.8) hold because $|\psi(u)| \le 1$, and it is easy to see that (2.9) implies (2.6). Thus, if (2.7) holds for the design vectors $x_i$, then the LAD estimator $\hat\beta_n$ is a strongly consistent estimator of $\beta$.

Example 2.2. For the LS estimator, i.e., $\rho(x) = x^2$, or equivalently $\psi(x) = 2x$, (2.5) and $|E\psi(e + u)| \ge c_2|u|$ hold naturally. Choose design vectors $x_i$ such that (2.7) is satisfied for some $0 < \gamma < 1$. Let $\tau > 1$ and let $e$ have density function
$$f(x) = \begin{cases} 0, & |x| < 2, \\[4pt] \dfrac{2^{1/\gamma - 1}}{\gamma}\,|x|^{-1/\gamma - 1}\log_2^{-\tau}|x|\left(1 + \dfrac{\gamma\tau}{\ln 2}\log_2^{-1}|x|\right), & |x| \ge 2. \end{cases}$$
Then $Ee = 0$, the distribution function is
$$F(x) = \begin{cases} \dfrac{2^{1/\gamma - 1}}{|x|^{1/\gamma}\log_2^{\tau}|x|}, & x < -2, \\[6pt] \dfrac12, & |x| \le 2, \\[6pt] 1 - \dfrac{2^{1/\gamma - 1}}{x^{1/\gamma}\log_2^{\tau}x}, & x > 2, \end{cases}$$
for $x \ge 0$,
$$P(|e| > x) = \begin{cases} 1, & x < 2, \\[6pt] \dfrac{2^{1/\gamma}}{x^{1/\gamma}\log_2^{\tau}x}, & x \ge 2, \end{cases}$$
and
$$E|e|^{1/\gamma} = \int_0^{2^{1/\gamma}} P(|e|^{1/\gamma} > x)\,dx + \int_{2^{1/\gamma}}^{\infty} P(|e|^{1/\gamma} > x)\,dx = 2^{1/\gamma} + \int_{2^{1/\gamma}}^{\infty}\frac{2^{1/\gamma}}{x\log_2^{\tau}(x^{\gamma})}\,dx < \infty,$$
since $\tau > 1$. Hence, by Theorem 2.1, the LS estimator $\hat\beta_n$ is a strongly consistent estimator of $\beta$.

Remark 2.5. In Example 2.2, for any $q > 1/\gamma$, i.e., $\frac{1}{q\gamma} < 1$,
$$E|e|^q = 2^q + \int_{2^q}^{\infty}\frac{2^{1/\gamma}}{x^{1/(q\gamma)}\log_2^{\tau}(x^{1/q})}\,dx = \infty,$$
i.e., (2.4) does not hold. This shows that the moment condition (2.4) is too strong in a sense.
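To make the convergence of the integral in Example 2.2 explicit, a worked step (ours, added for convenience): substituting $u = \log_2(x^{\gamma})$, so that $dx/x = (\ln 2/\gamma)\,du$ and $u = 1$ at $x = 2^{1/\gamma}$, gives
$$\int_{2^{1/\gamma}}^{\infty}\frac{2^{1/\gamma}}{x\log_2^{\tau}(x^{\gamma})}\,dx = \frac{2^{1/\gamma}\ln 2}{\gamma}\int_1^{\infty}u^{-\tau}\,du = \frac{2^{1/\gamma}\ln 2}{\gamma(\tau - 1)} < \infty,$$
which converges precisely because $\tau > 1$.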
3. Proofs of the Theorems
Proof of Theorem 2.1. Let $\hat\beta_n$ be the minimizer of (1.2) and $\beta_0$ the true parameter. Let $x_{ni} = S_n^{-1/2}x_i$ and $\beta_{n0} = S_n^{1/2}\beta_0$, $1 \le i \le n$. The model (1.1) can be rewritten as
$$Y_i = x_{ni}'\beta_{n0} + e_i, \quad 1 \le i \le n, \tag{3.1}$$
and we have
$$\sum_{i=1}^n x_{ni}x_{ni}' = I_p, \qquad \sum_{i=1}^n \|x_{ni}\|^2 = p, \qquad d_n = \max_{1\le i\le n}\|x_{ni}\|^2, \tag{3.2}$$
where $I_p$ is the $p \times p$ identity matrix.

Let $\hat\beta_{n0}$ be an M estimator of $\beta_{n0}$ in the model (3.1); then $\hat\beta_{n0} = S_n^{1/2}\hat\beta_n$. Without loss of generality, we can suppose that the true parameter is $\beta_0 = 0$ in model (1.1), i.e., $\beta_{n0} = 0$ in model (3.1), so that
$$\sum_{i=1}^n \rho(e_i - x_{ni}'\hat\beta_{n0}) = \min_{\beta\in\mathbb{R}^p}\sum_{i=1}^n \rho(e_i - x_{ni}'\beta). \tag{3.3}$$

Denote the unit sphere $U = \{\beta \in \mathbb{R}^p : \|\beta\| = 1\}$. Let $\varepsilon > 0$ be any given constant; without loss of generality, it can be assumed that $2\varepsilon\sqrt{c_1} < \Delta$. Define
$$D_n(\beta) = \sum_{i=1}^n\left(\rho(e_i - x_{ni}'\beta) - \rho(e_i)\right), \quad \beta \in \mathbb{R}^p.$$
Then $D_n(\cdot)$ is a convex function and $D_n(0) = 0$. Let $w_{ni}(r) = -\varepsilon n^{\gamma/2}x_{ni}'r$, $r \in U$. By the definition of $\psi$, we have
$$D_n(\varepsilon n^{\gamma/2}r) = \sum_{i=1}^n\left(\rho(e_i - \varepsilon n^{\gamma/2}x_{ni}'r) - \rho(e_i)\right) = \sum_{i=1}^n\int_0^{w_{ni}(r)}\left(\psi(e_i + t) - \psi(e_i)\right)dt + \sum_{i=1}^n w_{ni}(r)\psi(e_i) \,\hat{=}\, I_{1n}(r) + I_{2n}(r).$$
Hence,
$$\inf_{r\in U} D_n(\varepsilon n^{\gamma/2}r) \ge \inf_{r\in U} I_{1n}(r) + \inf_{r\in U} I_{2n}(r) \ge \inf_{r\in U} I_{1n}(r) - \sup_{r\in U}|I_{2n}(r)|. \tag{3.4}$$

We divide $U$ into $N$ parts $U_1, U_2, \dots, U_N$ such that the diameter of each part is less than $n^{-2}$ and $N \le (2n^2 + 1)^p$. Let $T_j$ be the smallest closed convex set covering $U_j$. For a fixed $T_j$ there are three cases: (i) $w_{ni}(r) \ge 0$ for all $r \in T_j$; then there exists $r_{ij} \in T_j$ such that $w_{ni}(r_{ij}) = \inf_{r\in T_j} w_{ni}(r)$; (ii) $w_{ni}(r) \le 0$ for all $r \in T_j$; then there exists $r_{ij} \in T_j$ such that $w_{ni}(r_{ij}) = \sup_{r\in T_j} w_{ni}(r)$; (iii) $w_{ni}(r_1) > 0$ for some $r_1 \in T_j$ and $w_{ni}(r_2) < 0$ for some $r_2 \in T_j$; then there exists $r_{ij} \in T_j$ such that $w_{ni}(r_{ij}) = 0$.
Let
$$R_i(t) = \psi(e_i + t) - \psi(e_i), \qquad \bar R_i(t) = R_i(t) - ER_i(t).$$
By the monotonicity of $\psi$, both $R_i(t)$ and $ER_i(t)$ are increasing in $t$. Combining the selection of the $r_{ij}$ with $U \subset \bigcup_{j=1}^N T_j$, we get
$$\begin{aligned}
\inf_{r\in U} I_{1n}(r) &= \inf_{r\in U}\sum_{i=1}^n\int_0^{w_{ni}(r)}R_i(t)\,dt \ge \min_{1\le j\le N}\inf_{r\in T_j}\sum_{i=1}^n\int_0^{w_{ni}(r)}R_i(t)\,dt \ge \min_{1\le j\le N}\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}R_i(t)\,dt \\
&= \min_{1\le j\le N}\left(\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}\bar R_i(t)\,dt + \sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt\right) \\
&\ge \min_{1\le j\le N}\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt\left(1 - \max_{1\le j\le N}\frac{\left|\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}\bar R_i(t)\,dt\right|}{\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt}\right).
\end{aligned} \tag{3.5}$$

Let $r \in U_j$ and $r_{ij} \in T_j$. By (3.2) and the definitions of $U_j$ and $T_j$, for sufficiently large $n$,
$$\begin{aligned}
\left|\sum_{i=1}^n(x_{ni}'r_{ij})^2 - \sum_{i=1}^n(x_{ni}'r)^2\right| &= \left|\sum_{i=1}^n(r_{ij} - r)'x_{ni}x_{ni}'(r_{ij} + r)\right| \le \sum_{i=1}^n\|r_{ij} - r\|\,\|x_{ni}\|^2\left(\|r_{ij} - r\| + 2\|r\|\right) \\
&\le \sum_{i=1}^n n^{-2}\left(n^{-2} + 2\right)\|x_{ni}\|^2 \le 3n^{-2}p < 1/2,
\end{aligned}$$
which, combined with (3.2), yields
$$\sum_{i=1}^n(x_{ni}'r_{ij})^2 > \sum_{i=1}^n(x_{ni}'r)^2 - 1/2 = r'\sum_{i=1}^n x_{ni}x_{ni}'\,r - 1/2 = \|r\|^2 - 1/2 = 1/2, \quad 1 \le j \le N. \tag{3.6}$$
By (2.7), (3.2), and the choice of $\varepsilon$, for sufficiently large $n$ and for $i = 1, 2, \dots, n$, $j = 1, 2, \dots, N$,
$$|w_{ni}(r_{ij})| = \varepsilon n^{\gamma/2}\left|x_{ni}'(r_{ij} - r + r)\right| \le \varepsilon n^{\gamma/2}d_n^{1/2}\left(n^{-2} + 1\right) \le 2\varepsilon\sqrt{c_1} < \Delta, \tag{3.7}$$
and by (2.6) and (3.6),
$$\min_{1\le j\le N}\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt \ge \min_{1\le j\le N}\sum_{i=1}^n\int_0^{|w_{ni}(r_{ij})|}c_2t\,dt \ge \frac{c_2}{2}\min_{1\le j\le N}\sum_{i=1}^n w_{ni}^2(r_{ij}) = \frac{c_2\varepsilon^2n^{\gamma}}{2}\min_{1\le j\le N}\sum_{i=1}^n(x_{ni}'r_{ij})^2 \ge c_2\varepsilon^2n^{\gamma}/4. \tag{3.8}$$

For fixed $j = 1, 2, \dots, N$, let $Y_{ni} = \int_0^{w_{ni}(r_{ij})}\bar R_i(t)\,dt$.
(i) If (2.5) holds, then $\psi(e_i + t) - \psi(e_i)$ is monotonic in $e_i$, and thus the $Y_{ni}$, $n \ge 1$, $i \le n$, are monotonic in $e_i$. By Lemma 1.1, $\{Y_{ni}, n \ge 1, i \le n\}$ is also an array of ND random variables, with $EY_{ni} = 0$. By (2.5) and (3.7),
$$|Y_{ni}| \le 2c_1|w_{ni}(r_{ij})| < 2c_1\Delta \,\hat{=}\, c_3.$$
And by (2.5) and (3.7) again,
$$\begin{aligned}
B_n &\hat{=} \sum_{i=1}^n EY_{ni}^2 = \sum_{i=1}^n E\left(\int_0^{w_{ni}(r_{ij})}R_i(t)\,dt - E\int_0^{w_{ni}(r_{ij})}R_i(t)\,dt\right)^2 \le \sum_{i=1}^n E\left(\int_0^{w_{ni}(r_{ij})}R_i(t)\,dt\right)^2 \\
&\le c_1\sum_{i=1}^n|w_{ni}(r_{ij})|\,E\int_0^{w_{ni}(r_{ij})}R_i(t)\,dt \le \frac{c_3}{2}\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt.
\end{aligned} \tag{3.9}$$
Let $u = \frac12\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt > 0$ and $t = 1/(2c_3)$; thus $tc_3 \le 1$. Applying Lemma 1.3 to $\{Y_{ni}, n \ge 1, i \le n\}$ and combining (3.8), (3.9), and $N \le (2n^2 + 1)^p$, we have
$$\begin{aligned}
P\left(\max_{1\le j\le N}\frac{\left|\sum_{i=1}^n Y_{ni}\right|}{\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt} \ge \frac12\right) &\le \sum_{j=1}^N P\left(\left|\sum_{i=1}^n Y_{ni}\right| \ge \frac12\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt\right) \\
&\le \sum_{j=1}^N 2\exp\left(-\frac{1}{4c_3}\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt + \frac{1}{4c_3^2}B_n\right) \\
&\le \sum_{j=1}^N 2\exp\left(-\frac{1}{4c_3}\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt + \frac{1}{8c_3}\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt\right) \\
&\le \sum_{j=1}^N 2\exp\left(-\frac{c_2\varepsilon^2n^{\gamma}}{32c_3}\right) \le 2(2n^2 + 1)^p\exp\left(-\frac{c_2\varepsilon^2n^{\gamma}}{32c_3}\right).
\end{aligned}$$
Thus,
$$\sum_{n=1}^{\infty}P\left(\max_{1\le j\le N}\frac{\left|\sum_{i=1}^n Y_{ni}\right|}{\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt} \ge \frac12\right) < \infty. \tag{3.10}$$
(ii) If (2.5$'$) holds, i.e., $|\psi(u)| \le c_1$: since
$$Y_{ni} = \int_0^{w_{ni}(r_{ij})}\left(\psi(e_i + t) - E\psi(e_i + t)\right)dt - \int_0^{w_{ni}(r_{ij})}\left(\psi(e_i) - E\psi(e_i)\right)dt \,\hat{=}\, Y_{ni}^{(1)} - Y_{ni}^{(2)},$$
obviously the $Y_{ni}^{(j)}$, $n \ge 1$, $i \le n$, $j = 1, 2$, are monotonic in $e_i$. By Lemma 1.1, $\{Y_{ni}^{(j)}, n \ge 1, i \le n\}$, $j = 1, 2$, are also arrays of ND random variables, with $EY_{ni}^{(j)} = 0$, $j = 1, 2$. By (2.5$'$) and (3.7),
$$|Y_{ni}^{(j)}| \le 2c_1|w_{ni}(r_{ij})| < 2c_1\Delta \,\hat{=}\, c_3, \quad j = 1, 2.$$
And by (2.5$'$) and (3.7) again,
$$\begin{aligned}
B_n^{(j)} &\hat{=} \sum_{i=1}^n E\bigl(Y_{ni}^{(j)}\bigr)^2 \le \sum_{i=1}^n 4c_1^2w_{ni}^2(r_{ij}) = 4c_1^2\varepsilon^2n^{\gamma}\sum_{i=1}^n(x_{ni}'r_{ij})^2 \le 4c_1^2\varepsilon^2n^{\gamma}\sum_{i=1}^n\|x_{ni}\|^2\|r_{ij}\|^2 \\
&\le 4c_1^2\varepsilon^2n^{\gamma}\sum_{i=1}^n\|x_{ni}\|^2\left(\|r_{ij} - r\| + \|r\|\right)^2 \le 4c_1^2\varepsilon^2n^{\gamma}\sum_{i=1}^n\|x_{ni}\|^2\left(n^{-2} + 1\right)^2 \le 16pc_1^2\varepsilon^2n^{\gamma}.
\end{aligned} \tag{3.11}$$
Choose $0 < \varepsilon_1 \le 1$ such that
$$A \,\hat{=}\, \frac{c_2\varepsilon_1}{2\times 16^2\,pc_1^2} \le \frac{1}{c_3},$$
and let $u = \frac{1}{16}c_2\varepsilon^2n^{\gamma} > 0$, $t = A > 0$. Then $tc_3 \le 1$. Applying Lemma 1.3 to $\{Y_{ni}^{(j)}, n \ge 1, i \le n\}$, $j = 1, 2$, and combining (3.8) and (3.11), we have
$$\begin{aligned}
P\left(\max_{1\le j\le N}\frac{\left|\sum_{i=1}^n Y_{ni}\right|}{\sum_{i=1}^n\int_0^{w_{ni}(r_{ij})}ER_i(t)\,dt} \ge \frac12\right) &\le \sum_{j=1}^N P\left(\left|\sum_{i=1}^n Y_{ni}\right| \ge \frac{c_2\varepsilon^2n^{\gamma}}{8}\right) \\
&\le \sum_{j=1}^N\left(P\left(\left|\sum_{i=1}^n Y_{ni}^{(1)}\right| \ge \frac{c_2\varepsilon^2n^{\gamma}}{16}\right) + P\left(\left|\sum_{i=1}^n Y_{ni}^{(2)}\right| \ge \frac{c_2\varepsilon^2n^{\gamma}}{16}\right)\right) \\
&\le \sum_{j=1}^N 4\exp\left(-tu + t^2\max_{j=1,2}B_n^{(j)}\right) \le \sum_{j=1}^N 4\exp\left(-\frac{c_2^2\varepsilon_1\varepsilon^2n^{\gamma}}{4\times 16^3\,pc_1^2}\right) \\
&\le 4(2n^2 + 1)^p\exp\left(-\frac{c_2^2\varepsilon_1\varepsilon^2n^{\gamma}}{4\times 16^3\,pc_1^2}\right).
\end{aligned}$$
w r
n
Ri tdt
i=1 0 ni ij 1 max w r a.s. < n ni ij 1≤j≤N 2 ERi tdt i=1 0 Substitute the above inequality and (3.8) in (3.5). For sufficiently large n, inf I1n r ≥ r∈U
1 2 c n a.s. 8 2
(3.12)
Denote ⎛ ⎞ ⎞ r1 xni1 ⎧x ⎨ nij i ≤ n ⎜ r2 ⎟ ⎜xni2 ⎟ /2 ⎜ ⎟ ⎜ ⎟ xni = ⎜ ⎟ r = ⎜ ⎟ ani = n ⎩ ⎝ ⎠ ⎝ ⎠ 0 i>n xnip rp ⎛
for fixed j = 1 2 p
By (3.2) and (2.7), for fixed j = 1 2 p, xnij ≤ xni ≤ dn1/2 ≤ 2 i=1 xnij = 1, thus
n
ani ≤
√ − c1 n for n ≥ 1 i ≤ n
√ −/2 c1 n , and
M Estimator in a Linear Model (i) When ≤ 1/2 i.e.,
1
≥ 2, therefore p = min 1 2 = 2, we have
n
n
ani p =
i=1
Downloaded By: [University of Electronic Science and Technology of China] At: 16:13 11 January 2011
481
a2ni = n−
i=1
n
2 xnij = n−
i=1
(ii) When 21 < < 1, we have 1 < 1/ < 2. Thus, p = min 1 2 = 1 , and 1 > 0, by the Hölder inequality, n
ani p =
i=1
n
n
ani 1/ = n−1/2
i=1
1 1 2
−
xnij 1/
i=1
−1/2
≤n
n
21 xnij
1 ·2
n
2−1 2
1 1
= n− 2 −1
i=1
(iii) When = 1, without loss of generality, we can assume that 1 < < 2 in (2.8). Thus, p = min 2 = , by the Hölder inequality, n
ani p =
i=1
n
ani = n− 2
i=1 − 2
≤n
n
n
xnij
i=1
2
2
xnij
n
2− 2
= n−−1
i=1
Because $\psi(e_i)$ is increasing in $e_i$, by Lemma 1.1, $\{\psi(e_i), i \ge 1\}$ is also a sequence of ND random variables. By (2.8), applying Lemma 1.4 to $\{\psi(e_i), i \ge 1\}$ with the weights $\{a_{ni}, n \ge 1, i \le n\}$, for fixed $j = 1, 2, \dots, p$ we obtain
$$\sum_{i=1}^n a_{ni}\psi(e_i) = n^{-\gamma/2}\sum_{i=1}^n x_{nij}\psi(e_i) \to 0 \quad \text{a.s.}$$
Thus, by the Cauchy–Schwarz inequality,
$$\begin{aligned}
n^{-\gamma}\sup_{r\in U}|I_{2n}(r)| &= n^{-\gamma}\sup_{r\in U}\left|\sum_{i=1}^n w_{ni}(r)\psi(e_i)\right| = \varepsilon\sup_{r\in U}\left|\sum_{j=1}^p\left(n^{-\gamma/2}\sum_{i=1}^n x_{nij}\psi(e_i)\right)r_j\right| \\
&\le \varepsilon\sqrt{\sum_{j=1}^p\left(n^{-\gamma/2}\sum_{i=1}^n x_{nij}\psi(e_i)\right)^2}\,\sup_{r\in U}\sqrt{\sum_{j=1}^p r_j^2} = \varepsilon\sqrt{\sum_{j=1}^p\left(n^{-\gamma/2}\sum_{i=1}^n x_{nij}\psi(e_i)\right)^2} \to 0 \quad \text{a.s.}
\end{aligned}$$
This implies that, for sufficiently large $n$,
$$\sup_{r\in U}|I_{2n}(r)| \le \frac{c_2\varepsilon^2}{16}n^{\gamma} \quad \text{a.s.}$$
Substituting this inequality and (3.12) into (3.4), for sufficiently large $n$ we get
$$\inf_{r\in U}D_n(\varepsilon n^{\gamma/2}r) \ge c_2\varepsilon^2n^{\gamma}/16 > 0. \tag{3.13}$$
The closed surface $S = \{\beta \in \mathbb{R}^p : \|\beta\| = \varepsilon n^{\gamma/2}\}$ divides $\mathbb{R}^p$ into two parts,
$$A = \{\beta \in \mathbb{R}^p : \|\beta\| < \varepsilon n^{\gamma/2}\} \quad\text{and}\quad B = \{\beta \in \mathbb{R}^p : \|\beta\| \ge \varepsilon n^{\gamma/2}\}.$$
Because $D_n(\cdot)$ is a convex function, $D_n(0) = 0$, $0$ is an interior point of $A$, and $\inf_{\beta\in S}D_n(\beta) = \inf_{r\in U}D_n(\varepsilon n^{\gamma/2}r) > 0 = D_n(0)$ from (3.13), we have $D_n(\beta) > 0$ for all $\beta \in B$. On the other hand, by the definition (3.3) of $\hat\beta_{n0}$, $D_n(\hat\beta_{n0}) \le 0$. Thus $\hat\beta_{n0} \in A$, i.e., $\|\hat\beta_{n0}\| < \varepsilon n^{\gamma/2}$ a.s. for any given $\varepsilon > 0$, which implies that
$$n^{-\gamma/2}\|\hat\beta_{n0}\| \to 0 \quad \text{a.s.}, \; n \to \infty. \tag{3.14}$$
Let $\lambda_n$ be the smallest eigenvalue of the positive definite matrix $S_n$. Then $\lambda_n^{1/2}$ is the smallest eigenvalue of $S_n^{1/2}$, and $\lambda_n^{-1}$ is the largest eigenvalue of $S_n^{-1}$. By (3.10) of Chen and Zhao (1996) and (2.7), for fixed $n_0$,
$$\lambda_{n_0}\lambda_n^{-1} \le \mathrm{tr}\bigl(S_{n_0}S_n^{-1}\bigr) = \sum_{i=1}^{n_0}\mathrm{tr}\bigl(x_ix_i'S_n^{-1}\bigr) = \sum_{i=1}^{n_0}x_i'S_n^{-1}x_i \le n_0d_n \le n_0c_1n^{-\gamma}.$$
Therefore, letting $c_4 = \sqrt{n_0c_1/\lambda_{n_0}} > 0$, we have $1 \le c_4\lambda_n^{1/2}n^{-\gamma/2}$. Combining $\|S_n^{1/2}\hat\beta_n\| \ge \lambda_n^{1/2}\|\hat\beta_n\|$ and (3.14), we get
$$\|\hat\beta_n\| \le c_4n^{-\gamma/2}\lambda_n^{1/2}\|\hat\beta_n\| \le c_4n^{-\gamma/2}\|S_n^{1/2}\hat\beta_n\| = c_4n^{-\gamma/2}\|\hat\beta_{n0}\| \to 0, \quad n \to \infty,$$
i.e., $\hat\beta_n \to \beta_0$ a.s. as $n \to \infty$. Theorem 2.1 is proved.

Proof of Theorem 2.2. Combining the methods used to prove Theorem 2.1 above with Lemma 1.5, we can prove Theorem 2.2.
4. A Simulation Study

In this section we report the details of a simulation study in which three special cases of the M estimator, namely the LAD estimator, the LS estimator, and Huber's estimator, are examined.

4.1. One Dimension

Our objective is to study the M estimator $\hat\beta_n$ of $\beta$, defined as a solution of the minimization problem (1.2). Assume that $\rho$ is a convex function. If $\rho$ is differentiable with derivative $\psi$, then $\hat\beta_n$ can be obtained by solving the estimating equation
$$\sum_{i=1}^n x_i\psi(Y_i - x_i'\beta) = 0. \tag{4.1}$$
The function $\psi$ in (4.1) is called the score function of the M estimator $\hat\beta_n$. If $\rho$ is not differentiable but has a right and a left derivative, $\psi_+$ and $\psi_-$ respectively, at each point, then one can choose the score function $\psi$ such that $\psi_- \le \psi \le \psi_+$; this inequality is a consequence of the convexity of $\rho$.

In the model (1.1), let $p = 1$, $\beta = 1$, and let the $x_i$ be generated from a uniform distribution on the interval $[1, 11]$. The ND random errors $e_i$ are generated as follows. First, we divide the interval $[-0.5, 0.5]$ into 10,000 equal parts and generate one random number from each subinterval. Taking these 10,000 random numbers as the population, we sample $n$ times without replacement; this yields the ND random errors $e_i$ needed in the simulation. The random errors $e_1, e_2, \dots, e_n$ are ND but not independent. Substituting the numerical values of the $x_i$'s and $e_i$'s into $Y_i = x_i\beta + e_i$, we can calculate the numerical values of $Y_i$.

For the least absolute deviations (LAD) estimator of Example 1.3, $\psi(u) = \mathrm{sgn}(u)$; for the least squares (LS) estimator, $\psi(u) = 2u$; and for Huber's estimator of Example 1.2, $\psi(u) = uI_{(|u|\le c)} + c\left(1 - 2I_{(u\le 0)}\right)I_{(|u|>c)}$. Tables 1–3 report the numerical values of $\hat\beta_n$ obtained by solving the estimating equation (4.1); a sketch of the whole procedure follows.
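The listing below is a minimal sketch of this procedure (ours, not the authors' original code; the seed, the bracket $[-10, 10]$, and the tolerance are our choices). Negative dependence of the errors is induced by sampling without replacement from the stratified population on $[-0.5, 0.5]$, as described in the text, and (4.1) is solved by bisection, which applies because its left-hand side is nonincreasing in $\beta$ whenever $\psi$ is nondecreasing and the $x_i$ are positive.

```python
import numpy as np

rng = np.random.default_rng(2011)

def nd_errors(n, pop_size=10_000):
    # One uniform draw from each of pop_size equal subintervals of [-0.5, 0.5],
    # then n draws without replacement from that finite population: the
    # resulting errors are ND but not independent, as described in the text.
    edges = np.linspace(-0.5, 0.5, pop_size + 1)
    population = rng.uniform(edges[:-1], edges[1:])
    return rng.choice(population, size=n, replace=False)

def psi_ls(u):
    return 2.0 * u                  # LS score

def psi_lad(u):
    return np.sign(u)               # LAD score: sgn(u)

def psi_huber(u, c=0.5):
    return np.clip(u, -c, c)        # Huber score: u on |u| <= c, +-c outside

def m_estimate(x, y, psi, lo=-10.0, hi=10.0, tol=1e-10):
    # Solve sum_i x_i * psi(y_i - x_i * b) = 0 in b by bisection; g(b) is
    # nonincreasing since psi is nondecreasing and all x_i > 0 here.
    g = lambda b: np.sum(x * psi(y - x * b))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

n, beta = 1000, 1.0
x = rng.uniform(1.0, 11.0, size=n)
y = beta * x + nd_errors(n)
for name, psi in [("LS", psi_ls), ("LAD", psi_lad), ("Huber", psi_huber)]:
    print(name, m_estimate(x, y, psi))
```

For all three score functions the output approaches $\beta = 1$ as $n$ grows, matching the pattern in Tables 1–3.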
Table 1. LS estimator, $p = 1$, $\beta = 1$.

n      $\hat\beta_n$   $|\hat\beta_n-\beta|$      n      $\hat\beta_n$   $|\hat\beta_n-\beta|$
100    0.9758981       0.0241019                  1300   1.0067966       0.0067966
400    1.0113502       0.0113502                  1600   1.0013071       0.0013071
700    1.0152359       0.0152359                  2000   1.0018691       0.0018691
1000   1.0104891       0.0104891                  2500   0.9979107       0.0020893
Table 2. LAD estimator, $p = 1$, $\beta = 1$.

n      $\hat\beta_n$   $|\hat\beta_n-\beta|$      n      $\hat\beta_n$   $|\hat\beta_n-\beta|$
100    1.0022593       0.0022593                  1300   1.0166226       0.0166226
400    1.0319936       0.0319936                  1600   1.0041280       0.0041280
700    1.0328176       0.0328176                  2000   1.0055556       0.0055556
1000   1.0174342       0.0174342                  2500   0.9966310       0.0033690
Table 3. Huber's estimator, $p = 1$, $\beta = 1$, $c = 0.5$.

n      $\hat\beta_n$   $|\hat\beta_n-\beta|$      n      $\hat\beta_n$   $|\hat\beta_n-\beta|$
100    0.9756360       0.0243640                  1300   1.0068387       0.0068387
400    1.0114411       0.0114411                  1600   1.0013094       0.0013094
700    1.0154146       0.0154146                  2000   1.0018740       0.0018740
1000   1.0105741       0.0105741                  2500   0.9979068       0.0020932
4.2. Multi Dimensions

Two Dimensions. In the model (1.1), let $p = 2$, $\beta_1 = 1$, $\beta_2 = 2$, let $x_{1i}$ and $x_{2i}$ be generated from uniform distributions on the intervals $[0, 10]$ and $[1, 11]$, respectively, and let the random errors $e_i$ be generated as above. Then the random errors $e_1, e_2, \dots, e_n$ are ND but not independent. Substituting the numerical values of the $x_i$'s and $e_i$'s into $Y_i = \beta_1x_{1i} + \beta_2x_{2i} + e_i$, we can calculate the numerical values of $Y_i$. Tables 4–6 report the numerical values of $\hat\beta_n$ obtained by solving the estimating equation (4.1).

Three Dimensions. In the model (1.1), let $p = 3$, $\beta_1 = 1$, $\beta_2 = 2$, $\beta_3 = 3$, let $x_{1i}$, $x_{2i}$, $x_{3i}$ be generated from uniform distributions on the intervals $[0, 10]$, $[1, 11]$, and $[-1, 9]$, respectively, and let the random errors $e_i$ be generated as above. Then the random errors $e_1, e_2, \dots, e_n$ are ND but not independent. Substituting the numerical values of the $x_i$'s and $e_i$'s into $Y_i = \beta_1x_{1i} + \beta_2x_{2i} + \beta_3x_{3i} + e_i$, we can calculate the numerical values of $Y_i$. Tables 7–9 report the numerical values of $\hat\beta_n$ obtained by solving the estimating equation (4.1).
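In the multi-dimensional cases one can equivalently minimize the objective (1.2) directly rather than solve the score equations. The following minimal sketch (ours; for brevity it uses plain uniform errors where the ND generator of Sec. 4.1 would be substituted, and it relies on scipy's default BFGS settings) fits the two-dimensional Huber model:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(7)

def rho_huber(u, c=0.5):
    # Huber objective function from Example 1.2.
    return np.where(np.abs(u) <= c, u**2 / 2.0, c * np.abs(u) - c**2 / 2.0)

n, beta = 1000, np.array([1.0, 2.0])
X = np.column_stack([rng.uniform(0.0, 10.0, n), rng.uniform(1.0, 11.0, n)])
e = rng.uniform(-0.5, 0.5, n)    # placeholder for the ND errors of Sec. 4.1
y = X @ beta + e

obj = lambda b: np.sum(rho_huber(y - X @ b))
print(minimize(obj, x0=np.zeros(2), method="BFGS").x)   # approximately [1, 2]
```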
Table 4. LS estimator, $p = 2$, $\beta_1 = 1$, $\beta_2 = 2$.

n      $\hat\beta_{1n}$   $|\hat\beta_{1n}-\beta_1|$   $\hat\beta_{2n}$   $|\hat\beta_{2n}-\beta_2|$
100    0.6631466          0.3368534                    2.0481879          0.0481879
400    0.9740178          0.0259822                    2.0092448          0.0092448
700    0.9702049          0.0297951                    2.0091754          0.0091754
1000   0.9778169          0.0221831                    2.0088165          0.0088165
1300   0.9898313          0.0101687                    2.0052610          0.0052610
1600   0.9941027          0.0058973                    2.0037938          0.0037938
2000   0.9922873          0.0077127                    2.0045914          0.0045914
2500   0.9958029          0.0041971                    2.0030734          0.0030734
Table 5. LAD estimator, $p = 2$, $\beta_1 = 1$, $\beta_2 = 2$.

n      $\hat\beta_{1n}$   $|\hat\beta_{1n}-\beta_1|$   $\hat\beta_{2n}$   $|\hat\beta_{2n}-\beta_2|$
100    0.8817620          0.1182380                    2.0289440          0.0289440
400    0.9332905          0.0667095                    2.0130674          0.0130674
700    0.9654469          0.0345531                    2.0093095          0.0093095
1000   1.0217184          0.0217184                    1.9969634          0.0030366
1300   0.9986156          0.0013844                    2.0055463          0.0055463
1600   0.9909514          0.0090486                    2.0073337          0.0073337
2000   0.9968314          0.0031686                    2.0037529          0.0037529
2500   1.0030996          0.0030996                    2.0006724          0.0006724
Table 6. Huber's estimator, $p = 2$, $\beta_1 = 1$, $\beta_2 = 2$, $c = 0.5$.

n      $\hat\beta_{1n}$   $|\hat\beta_{1n}-\beta_1|$   $\hat\beta_{2n}$   $|\hat\beta_{2n}-\beta_2|$
100    0.6315787          0.3684213                    2.0508064          0.0508064
400    0.9740178          0.0259822                    2.0092448          0.0092448
700    0.9702049          0.0297951                    2.0091754          0.0091754
1000   0.9777888          0.0222112                    2.0088235          0.0088235
1300   0.9898323          0.0101677                    2.0052613          0.0052613
1600   0.9941018          0.0058982                    2.0037960          0.0037960
2000   0.9922824          0.0077176                    2.0045950          0.0045950
2500   0.9957975          0.0042025                    2.0030775          0.0030775
Table 7. LS estimator, $p = 3$, $\beta_1 = 1$, $\beta_2 = 2$, $\beta_3 = 3$.

n      $\hat\beta_{1n}$   $|\hat\beta_{1n}-\beta_1|$   $\hat\beta_{2n}$   $|\hat\beta_{2n}-\beta_2|$   $\hat\beta_{3n}$   $|\hat\beta_{3n}-\beta_3|$
100    1.0581025          0.0581025                    1.8507100          0.1492900                    2.8025220          0.1974780
400    1.0097717          0.0097717                    1.9913679          0.0086321                    2.9821231          0.0178769
700    0.9954848          0.0045152                    1.9965355          0.0034645                    2.9873600          0.0126400
1000   0.9981353          0.0018647                    1.9986572          0.0013428                    2.9898408          0.0101592
1300   1.0246705          0.0246705                    1.9878414          0.0121586                    2.9825804          0.0174196
1600   1.0018969          0.0018969                    1.9998967          0.0001033                    2.9961029          0.0038971
2000   0.9815122          0.0184878                    2.0099789          0.0099789                    3.0053875          0.0053875
2500   0.9905781          0.0094219                    2.0056859          0.0056859                    3.0026124          0.0026124
Table 8. LAD estimator, $p = 3$, $\beta_1 = 1$, $\beta_2 = 2$, $\beta_3 = 3$.

n      $\hat\beta_{1n}$   $|\hat\beta_{1n}-\beta_1|$   $\hat\beta_{2n}$   $|\hat\beta_{2n}-\beta_2|$   $\hat\beta_{3n}$   $|\hat\beta_{3n}-\beta_3|$
100    1.1837366          0.1837366                    2.0024469          0.0024469                    3.0027115          0.0027115
400    1.2055280          0.2055280                    1.9040506          0.0959494                    2.8909142          0.1090858
700    1.0186715          0.0186715                    1.9623534          0.0376466                    2.9487677          0.0512323
1000   1.0496348          0.0496348                    1.9622506          0.0377494                    2.9502894          0.0497106
1300   0.9474624          0.0525376                    2.0025729          0.0025729                    2.9917845          0.0082155
1600   0.9577325          0.0422675                    2.0900117          0.0900117                    3.0793070          0.0793070
2000   1.0632508          0.0632508                    1.9428363          0.0571637                    2.9242667          0.0757333
2500   0.9532149          0.0467851                    1.9997209          0.0002791                    2.9979547          0.0020453
Table 9. Huber's estimator, $p = 3$, $\beta_1 = 1$, $\beta_2 = 2$, $\beta_3 = 3$, $c = 0.5$.

n      $\hat\beta_{1n}$   $|\hat\beta_{1n}-\beta_1|$   $\hat\beta_{2n}$   $|\hat\beta_{2n}-\beta_2|$   $\hat\beta_{3n}$   $|\hat\beta_{3n}-\beta_3|$
100    1.0621401          0.0621401                    1.9414686          0.0585314                    2.8955363          0.1044637
400    1.0456920          0.0456920                    1.9543527          0.0456473                    2.9466737          0.0533263
700    0.9176414          0.0823586                    2.0143632          0.0143632                    3.0100642          0.0100642
1000   0.9878056          0.0121944                    2.0043336          0.0043336                    3.0182277          0.0182277
1300   0.9837392          0.0162608                    2.0068731          0.0068731                    3.0232909          0.0232909
1600   0.9894232          0.0105768                    2.0770537          0.0770537                    3.0938445          0.0938445
2000   0.9989699          0.0010301                    1.9976158          0.0023842                    3.0115540          0.0115540
2500   0.9958240          0.0041760                    1.9992246          0.0007754                    3.0132246          0.0132246
Based on the results of the simulations above, in all three special cases of the M estimator the simulated values of $\hat\beta_n$ converge to $\beta$ as the sample size increases. The results support our theorems: the M estimator $\hat\beta_n$ is a strongly consistent estimator of $\beta$ in the linear model with negatively dependent random samples.
5. Appendix
Proof of Lemma 1.3. Since $|tX_i| \le 1$ a.s.,
$$e^{tX_i} = \sum_{k=0}^{\infty}\frac{(tX_i)^k}{k!} \le 1 + tX_i + \sum_{k=2}^{\infty}\frac{|tX_i|^k}{k!} \le 1 + tX_i + t^2X_i^2 \quad \text{a.s.}$$
Therefore, from $EX_i = 0$,
$$Ee^{tX_i} \le 1 + tEX_i + t^2EX_i^2 = 1 + t^2EX_i^2 \le e^{t^2EX_i^2}.$$
Thus, by the Markov inequality and Lemma 1.2, for any $u > 0$ and $t > 0$ we get
$$P\left(\sum_{i=1}^n X_i > u\right) = P\left(e^{t\sum_{i=1}^n X_i} > e^{tu}\right) \le e^{-tu}Ee^{t\sum_{i=1}^n X_i} \le e^{-tu}\prod_{i=1}^n Ee^{tX_i} \le e^{-tu}\prod_{i=1}^n e^{t^2EX_i^2} = \exp\left(-tu + t^2B_n\right).$$
Now, replacing $X_i$ by $-X_i$ in the above inequality, we have
$$P\left(\sum_{i=1}^n(-X_i) > u\right) = P\left(\sum_{i=1}^n X_i < -u\right) \le \exp\left(-tu + t^2B_n\right).$$
Therefore, we conclude
$$P\left(\left|\sum_{i=1}^n X_i\right| > u\right) = P\left(\sum_{i=1}^n X_i > u\right) + P\left(\sum_{i=1}^n X_i < -u\right) \le 2\exp\left(-tu + t^2B_n\right).$$
Lemma 1.3 is proved.

Proof of Lemma 1.4. Without loss of generality, assume $a_{nk} > 0$ for all $n \ge 1$, $k \le n$. Let $\varepsilon > 0$ be given and let $N = [2/\delta] + 1$, where $[x]$ denotes the largest integer not exceeding $x$. For $k \le n$, let $A_{nk} = a_{nk}^{-1}n^{-\delta/2}$, $B_k = k^{\alpha}$,
$$X_{nk}^{(1)} = X_kI_{(X_k \le A_{nk})} + A_{nk}I_{(X_k > A_{nk})}, \qquad X_{nk}^{(2)} = X_kI_{(X_k > B_k)},$$
$$X_{nk}^{(3)} = X_k - X_{nk}^{(1)} - X_{nk}^{(2)} = (X_k - A_{nk})I_{(A_{nk} < X_k \le B_k)} - A_{nk}I_{(X_k > B_k)},$$
and
$$T_n^{(i)} = \sum_{k=1}^n a_{nk}X_{nk}^{(i)}, \quad i = 1, 2, 3.$$
Then
$$T_n = \sum_{i=1}^3 T_n^{(i)}. \tag{5.1}$$
Let $Y_{nk} = n^{\delta/2}a_{nk}X_{nk}^{(1)}$. Obviously $X_{nk}^{(1)}$ is increasing in $X_k$, so by Lemma 1.1, $\{Y_{nk}, n \ge 1, k \le n\}$ is also an array of ND random variables. Moreover, $Y_{nk} \le n^{\delta/2}a_{nk}A_{nk} = 1$ and $EY_{nk} = n^{\delta/2}a_{nk}EX_{nk}^{(1)} \le n^{\delta/2}a_{nk}EX_k = 0$ from (1.7). It is easy to verify that
$$e^y \le 1 + y + |y|^p \quad \text{for } y \le 1,\ 1 \le p \le 2,$$
so $\exp(Y_{nk}) \le 1 + Y_{nk} + |Y_{nk}|^p$, which implies
$$E\exp(Y_{nk}) \le 1 + EY_{nk} + E|Y_{nk}|^p \le 1 + E|Y_{nk}|^p \le \exp\left(E|Y_{nk}|^p\right). \tag{5.2}$$
By Lemma 1.2, (1.9), (5.2), and the facts that $|X_{nk}^{(1)}| \le |X_k|$, $p \le 1/\alpha$, and $E|X_1|^p < \infty$, we conclude that
$$E\exp\left(n^{\delta/2}T_n^{(1)}\right) = E\exp\left(\sum_{k=1}^n Y_{nk}\right) \le \prod_{k=1}^n E\exp(Y_{nk}) \le \exp\left(n^{p\delta/2}\sum_{k=1}^n a_{nk}^pE|X_{nk}^{(1)}|^p\right) \le \exp\left(cE|X_1|^pn^{\delta(p-2)/2}\right) \le c_1,$$
where $c_1$ is a positive constant. Combining this with the Markov inequality, we obtain
$$\sum_{n=1}^{\infty}P\left(T_n^{(1)} \ge \varepsilon\right) \le \sum_{n=1}^{\infty}\exp\left(-\varepsilon n^{\delta/2}\right)E\exp\left(n^{\delta/2}T_n^{(1)}\right) \le c_1\sum_{n=1}^{\infty}\exp\left(-\varepsilon n^{\delta/2}\right) < \infty.$$
By the Borel–Cantelli lemma, $P(T_n^{(1)} \ge \varepsilon \text{ i.o.}) = 0$.
Thus,
$$P\left(\limsup_{n\to\infty}T_n^{(1)} \ge \varepsilon\right) \le P\left(T_n^{(1)} \ge \varepsilon \text{ i.o.}\right) = 0.$$
Since $\varepsilon > 0$ is arbitrary, we have $P\left(\limsup_{n\to\infty}T_n^{(1)} > 0\right) = 0$, i.e.,
$$\limsup_{n\to\infty}T_n^{(1)} \le 0 \quad \text{a.s.} \tag{5.3}$$
Since the $X_k$ are identically distributed, (1.6) gives
$$\sum_{k=1}^{\infty}P(X_k > B_k) \le \sum_{k=1}^{\infty}P\left(|X_1| > k^{\alpha}\right) < \infty.$$
Therefore (since a.s. only finitely many of the events $\{X_k > B_k\}$ occur),
$$\sum_{k=1}^{\infty}X_k^2I_{(X_k > B_k)} < \infty \quad \text{a.s.}$$
Thus, by the Schwarz inequality and (1.9) (noting that $a_{nk}^2 \le a_{nk}^p$ for all large $n$, since $a_{nk} \le cn^{-\alpha}$ and $p \le 2$),
$$0 \le T_n^{(2)} = \sum_{k=1}^n a_{nk}X_{nk}^{(2)} \le \left(\sum_{k=1}^n a_{nk}^2\right)^{1/2}\left(\sum_{k=1}^n X_k^2I_{(X_k > B_k)}\right)^{1/2} \le \left(\sum_{k=1}^n a_{nk}^p\right)^{1/2}\left(\sum_{k=1}^{\infty}X_k^2I_{(X_k > B_k)}\right)^{1/2} \le \sqrt{c}\,n^{-\delta/2}\left(\sum_{k=1}^{\infty}X_k^2I_{(X_k > B_k)}\right)^{1/2} \to 0 \quad \text{a.s.} \tag{5.4}$$
Now we prove $\limsup_{n\to\infty}T_n^{(3)} \le 0$ a.s. By (1.8), $a_{nk}(X_k - A_{nk})I_{(A_{nk} < X_k \le B_k)} \le a_{nk}B_kI_{(X_k > A_{nk})}$, and estimating this bound with the definitions of $A_{nk}$, $B_k$, and $N$ yields
$$\limsup_{n\to\infty}T_n^{(3)} \le 0 \quad \text{a.s.}$$
Combining (5.1), (5.3), (5.4), and $\limsup_{n\to\infty}T_n^{(3)} \le 0$, we get
$$\limsup_{n\to\infty}T_n \le 0 \quad \text{a.s.} \tag{5.5}$$
Replacing $X_i$ by $-X_i$ in (5.5), we obtain
$$\liminf_{n\to\infty}T_n \ge 0 \quad \text{a.s.} \tag{5.6}$$
Now (1.10) follows from (5.5) and (5.6). Lemma 1.4 is proved.

Proof of Lemma 1.5. Lemma 1.5 can be proved by methods similar to those used for Lemma 1.4.
Acknowledgments

We are very grateful to the referees and the Editors for their valuable comments and helpful suggestions, which improved the clarity and readability of the article. Supported by the National Natural Science Foundation of China (11061012), the Support Program of the New Century Guangxi China Ten-Hundred-Thousand Talents Project (2005214), and the Guangxi China Science Foundation (0991081 and 2010GXNSFA013120).
References

Bozorgnia, A., Patterson, R. F., Taylor, R. L. (1993). Limit theorems for ND r.v.'s. Technical Report, University of Georgia.
Chen, X. R., Zhao, L. C. (1995). Strong consistency of M-estimates of multiple regression coefficients. Syst. Sci. Mathemat. Sci. 8:82–87.
Chen, X. R., Zhao, L. C. (1996). M-Methods in Linear Models. Shanghai: Shanghai Scientific and Technical Publishers.
Chen, X. R., Bai, Z. D., Zhao, L. C., Wu, Y. H. (1992). Consistency of "minimum L1-norm" estimates in linear regression models. In: Chen, X. R., Fang, K. T., Yang, C. C., eds. The Development of Statistics: Recent Contributions from China. Pitman Research Notes in Mathematics, Series 258. Harlow: Longman Scientific and Technical.
Collins, J. R., Szatmari, V. D. (2004). Maximal asymptotic biases of M estimators of location with preliminary scale estimates. Commun. Statist. Theor. Meth. 33(8):1877–1866.
Djalil, C., Didier, C. (2007). On the strong consistency of asymptotic M estimators. J. Statist. Plan. Infer. 137:2774–2783.
Ebrahimi, N., Ghosh, M. (1981). Multivariate negative dependence. Commun. Statist. Theor. Meth. A 10:307–337.
Georgios, P. (2005). Application of M estimators to cross-section effect models. Commun. Statist. Simul. Computat. 34(3):601–616.
He, X. M., Shao, Q. M. (1996). A general Bahadur representation of M estimators and its application to linear regression with nonstochastic designs. Ann. Statist. 24(6):2608–2630.
Huber, P. J. (1973). Robust regression: asymptotics, conjectures and Monte Carlo. Ann. Statist. 1(5):799–821.
Joag-Dev, K., Proschan, F. (1983). Negative association of random variables with applications. Ann. Statist. 11(1):286–295.
Klesov, O., Rosalsky, A., Volodin, A. I. (2005). On the almost sure growth rate of sums of lower negatively dependent nonnegative random variables. Statist. Probab. Lett. 71:193–202.
Kuczmaszewska, A. (2006). On some conditions for complete convergence for arrays of rowwise negatively dependent random variables. Stochastic Anal. Appl. 24(6):1083–1095.
Lehmann, E. L. (1966). Some concepts of dependence. Ann. Math. Statist. 37:1137–1153.
Seija, S., Sara, T., Hannu, O. (2007). Symmetrized M estimators of multivariate scatter. J. Multivariate Anal. 98:1611–1629.
Taylor, R. L., Patterson, R. F., Bozorgnia, A. (2002). A strong law of large numbers for arrays of rowwise negatively dependent random variables. Stochastic Anal. Appl. 20:643–656.
Wu, Q. Y. (2005). Strong consistency of M estimator in linear model for ˜-mixing samples. Acta Mathematica Scientia 25A(1):41–46.
Wu, Q. Y. (2006). Strong consistency of M estimator in linear model for negatively associated samples. J. Syst. Sci. Complex. 19(4):592–600.
Wu, Y. H. (1988). Strong consistency and exponential rate of the "minimum L1-norm" estimates in linear regression models. Computat. Statist. Data Anal. 6:285–295.
Yang, S. C. (2002). Strong consistency of M estimator in linear model. Acta Mathematica Sinica 45(1):21–28.