A saddlepoint approximation for the residual sum of squares in heteroscedastic linear regression

Alexander Ivanov and Silvelyn Zwanzig

Abstract. It is shown that in local heteroscedastic linear regression models the distribution of the RSS can be approximated by the chi-squared distribution. Approximation bounds are derived with the help of the saddlepoint approximation approach.
1 Introduction
In this paper we use the saddlepoint approximation to show that in the linear heteroscedastic regression model the distribution of the residual sum of squares is approximately chi-squared. In Ivanov and Zwanzig (2001), [3], the saddlepoint approximation for the residual sum of squares is derived in the case of independent non-normal errors with heteroscedastic variances. Further, it is shown in [3] that the leading term of the saddlepoint approximation is the chi-squared distribution in the case of normal i.i.d. errors. In general the derivation of saddlepoint approximations involves at least three main problems. First, the complete cumulant generating function of the relevant transformation of the sum of random errors has to be known. Second, the saddlepoint equation system must be solved explicitly, which requires the solution of a nonlinear equation system. Third, an integration of the saddlepoint expansion of the transformed sum is necessary. In the case of normally distributed errors with heteroscedastic variances the cumulant generating function can be calculated, but the nonlinear equation system cannot be solved explicitly, and the required integration step must be done numerically. For normal i.i.d. errors, both the solution of the saddlepoint equation system and the integration can be carried out explicitly. The main idea of this paper is to approximate the heteroscedastic saddlepoint approximation by a homoscedastic one (Theorem 2) and to calculate the leading term for this approximation (Theorem 3).
2 Local heteroscedastic model
We consider the heteroscedastic linear regression model
$$y_{i,n} = x_{i,n}^T \beta + \varepsilon_{i,n}, \qquad (2.0.1)$$
where the errors are given in the following triangular scheme:

N: $\varepsilon_{i,n} = \sigma_{i,n}\,\varepsilon_i$, with $\varepsilon_i \sim N(0,1)$ i.i.d. \qquad (2.0.2)

Further we suppose a triangular scheme of known weights $w_i = w_{i,n}$ with $0 < w_{\min} < w_{i,n} < w_{\max}$. We have
$$w_{i,n}\,\sigma^2_{i,n} = 1 + \Delta^2_{i,n} \qquad\text{and}\qquad \bar\sigma^2_n = \frac{1}{n}\sum_{i=1}^n w_{i,n}\,\sigma^2_{i,n}. \qquad (2.0.3)$$
All deviations are assumed to be bounded:

A: For all $n$, $\displaystyle\max_{1\le i\le n} \Delta^2_{i,n} \le \Delta^2 < \infty$. \qquad (2.0.4)
The homoscedastic case is given when the weights are related to the variances such that

HOM: $w_{i,n}\,\sigma^2_{i,n} = 1 + \Delta^2_n$, with $\Delta^2_n \le \Delta^2 < \infty$. \qquad (2.0.5)

The derived approximation bounds are reasonable for asymptotically homoscedastic models in the sense that for
$$D_n = \frac{1}{n}\sum_{i=1}^n \left|\Delta^2_n - \Delta^2_{i,n}\right| \qquad\text{it holds}\qquad \lim_{n\to\infty} \log n \; D_n = 0.$$
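For instance, deviations of order $n^{-1/2}$ satisfy this condition (with $\Delta^2_n$ taken, e.g., as the median of the $\Delta^2_{i,n}$):
$$\Delta^2_{i,n} = \frac{d_i}{\sqrt{n}}, \quad 0 \le d_i \le d < \infty, \qquad\Longrightarrow\qquad D_n \le \frac{d}{\sqrt{n}} \quad\text{and}\quad \log n \; D_n \le \frac{d\,\log n}{\sqrt{n}} \longrightarrow 0.$$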
The following standard regularity conditions R1 and R2 are imposed:

R1: There exists a constant $\lambda_x > 0$ such that
$$\lim_{n\to\infty} \lambda_{\min}\!\left(\frac{1}{n}\sum_{i=1}^n x_{i,n}\, x_{i,n}^T\right) > \lambda_x.$$

R2: There exists a constant $c_x < \infty$ such that
$$\lim_{n\to\infty} \max_{i\le n} \|x_{i,n}\| < c_x.$$
Note that under R1 and R2 there exist a positive constant $\lambda_0 = \lambda_x w_{\min}$ and a bounded constant $c_0 = c_x w_{\max}$ such that for the weighted Fisher matrix
$$I_{(n)} = \frac{1}{n}\sum_{i=1}^n w_{i,n}\, x_{i,n}\, x_{i,n}^T \qquad\text{it holds}\qquad \lambda_{\min}(I_{(n)}) > \lambda_0 \quad\text{and}\quad \operatorname{tr}(I_{(n)}) < c_0. \qquad (2.0.6)$$
In Ivanov and Zwanzig (2001), [3], the following general saddlepoint approximation result is given for the weighted residual sum of squares
$$\mathrm{RSS} = \min_{\beta} \sum_{i=1}^n w_{i,n}\left(y_{i,n} - x_{i,n}^T\beta\right)^2. \qquad (2.0.7)$$
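In matrix notation, with $X = (x_{1,n},\ldots,x_{n,n})^T$ and $W = \operatorname{diag}(w_{1,n},\ldots,w_{n,n})$, the minimum in (2.0.7) is attained at the weighted least squares estimator $\hat\beta = (X^T W X)^{-1} X^T W y$, so that
$$\mathrm{RSS} = (y - X\hat\beta)^T W (y - X\hat\beta) = y^T\left(W - W X (X^T W X)^{-1} X^T W\right) y.$$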
Theorem 1. Under the assumptions N, R1, R2 it holds
$$P\!\left(\frac{1}{n}\mathrm{RSS} - \bar\sigma^2_n > u\right) = \int_{\sqrt{n}\,u}^{\infty}\!\int_{\mathbb{R}^p} f_{het}(y)\,\bigl(1 + R_n(y)\bigr)\,dy_2\,dy_1 \qquad (2.0.8)$$
with $y = \binom{y_1}{y_2} \in \mathbb{R}^{p+1}$, $y_1 \in \mathbb{R}$, $y_2 \in \mathbb{R}^p$, and with $f_{het}(y)$ given in (2.0.9), where $\hat\theta(y) \in \mathbb{R}^{p+1}$ is the exact solution of (2.0.13), and for $Y = \bigl\{y : \|\hat\theta(y)\| \le c\bigr\}$
$$\sup_{y\in Y} |R_n(y)| = O(n^{-1}).$$
Under N the formulas for the saddlepoint approximation in Theorem 1 are the following:
$$f_{het}(y) = \frac{\exp\!\left(\kappa_{het}(\hat\theta) - \sqrt{n}\,\hat\theta^T F^{-1}(y)\right)}{(2\pi)^{\frac{p+1}{2}}\left(\det C_{het}(\hat\theta)\right)^{\frac{1}{2}}}, \qquad (2.0.9)$$
with
$$\kappa_{het}(\theta) = -n\bar\sigma^2_n\theta_1 - \frac{1}{2}\sum_{i=1}^n \ln\!\left(1 - 2 w_i\sigma^2_i\theta_1\right) + n\,\theta_2^T I_{(n)}^{-1} B(\theta_1)\, I_{(n)}^{-1}\theta_2, \qquad (2.0.10)$$
where $\theta = \binom{\theta_1}{\theta_2} \in \mathbb{R}^{p+1}$, $\theta_1 \in \mathbb{R}$, $\theta_2 \in \mathbb{R}^p$ and
$$B(\theta_1) = \frac{1}{n}\sum_{i=1}^n \frac{s_i(\theta_1)}{2}\, w_{i,n}\, x_{i,n}\, x_{i,n}^T, \qquad s_i(\theta_1) = \frac{w_{i,n}\,\sigma^2_{i,n}}{1 - 2\, w_{i,n}\,\sigma^2_{i,n}\,\theta_1}. \qquad (2.0.11)$$
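Differentiating (2.0.11) with respect to $\theta_1$ gives
$$s_i'(\theta_1) = \frac{2\,(w_{i,n}\sigma^2_{i,n})^2}{(1 - 2\, w_{i,n}\sigma^2_{i,n}\theta_1)^2} = 2\, s_i(\theta_1)^2, \qquad s_i''(\theta_1) = 8\, s_i(\theta_1)^3,$$
which in particular explains the form of $s'(\theta_1)$ below and, under HOM, of $B'(\theta_1)$ and $B''(\theta_1)$.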
Further
$$F^{-1}(y) = \begin{pmatrix} y_1 + n^{-\frac{1}{2}}\, y_2^T I_{(n)}\, y_2 \\ y_2 \end{pmatrix} \qquad\text{and}\qquad s(\theta_1) = \frac{1}{n}\sum_{i=1}^n s_i(\theta_1),$$
and
$$C_{het}(\theta) = \begin{pmatrix} s'(\theta_1) + \theta_2^T I_{(n)}^{-1} B''(\theta_1)\, I_{(n)}^{-1}\theta_2 & 2\,\theta_2^T I_{(n)}^{-1} B'(\theta_1)\, I_{(n)}^{-1} \\ 2\, I_{(n)}^{-1} B'(\theta_1)\, I_{(n)}^{-1}\theta_2 & 2\, I_{(n)}^{-1} B(\theta_1)\, I_{(n)}^{-1} \end{pmatrix}. \qquad (2.0.12)$$
The prime denotes the componentwise derivative with respect to $\theta_1$, thus $s'(\theta_1) = \frac{\partial}{\partial\theta_1} s(\theta_1) = \frac{1}{n}\sum_{i=1}^n 2\, s_i^2(\theta_1)$. The saddlepoint $\hat\theta = \hat\theta(y) \in \mathbb{R}^{p+1}$ is defined as the exact solution of the equation system $m_{het}(\hat\theta) = F^{-1}(y)$ with
$$m_{het}(\theta) = \sqrt{n}\begin{pmatrix} -\bar\sigma^2_n + s(\theta_1) + \theta_2^T I_{(n)}^{-1} B'(\theta_1)\, I_{(n)}^{-1}\theta_2 \\ 2\, I_{(n)}^{-1} B(\theta_1)\, I_{(n)}^{-1}\theta_2 \end{pmatrix}. \qquad (2.0.13)$$
The main problem is that in the general heteroscedastic model the saddlepoint $\hat\theta$ cannot be derived explicitly.

Let us now consider the homoscedastic case. Under HOM all formulas can be simplified, because
$$B(\theta_1) = \frac{1}{2}\, s(\theta_1)\, I_{(n)}, \qquad B'(\theta_1) = s(\theta_1)^2\, I_{(n)}, \qquad B''(\theta_1) = 4\, s(\theta_1)^3\, I_{(n)}.$$
Under HOM we have $f_{het}(y) = f_{hom}(y)$ with
$$f_{hom}(y) = \frac{\exp\!\left(\kappa_{hom}(\tilde\theta) - \sqrt{n}\,\tilde\theta^T F^{-1}(y)\right)}{(2\pi)^{\frac{p+1}{2}}\left(\det C_{hom}(\tilde\theta)\right)^{\frac{1}{2}}} \qquad (2.0.14)$$
and
$$\kappa_{hom}(\theta) = -n\bar\sigma^2_n\theta_1 - \frac{n}{2}\ln\!\left(1 - 2\bar\sigma^2_n\theta_1\right) + \frac{n}{2}\, s(\theta_1)\,\theta_2^T I_{(n)}^{-1}\theta_2 \qquad (2.0.15)$$
and
$$C_{hom}(\theta) = \begin{pmatrix} 2\, s^2(\theta_1) + 4\, s^3(\theta_1)\,\theta_2^T I_{(n)}^{-1}\theta_2 & 2\, s^2(\theta_1)\,\theta_2^T I_{(n)}^{-1} \\ 2\, s^2(\theta_1)\, I_{(n)}^{-1}\theta_2 & s(\theta_1)\, I_{(n)}^{-1} \end{pmatrix}. \qquad (2.0.16)$$
Under HOM we have $m_{het}(\theta) = m_{hom}(\theta)$ with
$$m_{hom}(\theta) = \sqrt{n}\begin{pmatrix} -\bar\sigma^2_n + s(\theta_1) + s(\theta_1)^2\,\theta_2^T I_{(n)}^{-1}\theta_2 \\ s(\theta_1)\, I_{(n)}^{-1}\theta_2 \end{pmatrix}. \qquad (2.0.17)$$
For $\tilde s = \bar\sigma^2_n + \frac{1}{\sqrt{n}}\, y_1 > 0$ we get the explicit solution of $m_{hom}(\tilde\theta) = F^{-1}(y)$ by
$$\tilde\theta_2 = \frac{1}{\sqrt{n}\,\tilde s}\, I_{(n)}\, y_2, \qquad \tilde\theta_1 = \frac{1}{2}\left(\frac{1}{\bar\sigma^2_n} - \frac{1}{\tilde s}\right). \qquad (2.0.18)$$
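To verify (2.0.18), note that under HOM $s(\theta_1) = \bar\sigma^2_n/(1 - 2\bar\sigma^2_n\theta_1)$, so $\tilde\theta_1$ in (2.0.18) is exactly the value with $s(\tilde\theta_1) = \tilde s$; inserting $\tilde\theta$ into (2.0.17) indeed gives
$$m_{hom}(\tilde\theta) = \sqrt{n}\begin{pmatrix} -\bar\sigma^2_n + \tilde s + \tilde s^2\,\frac{1}{n\,\tilde s^2}\, y_2^T I_{(n)}\, I_{(n)}^{-1} I_{(n)}\, y_2 \\ \tilde s\,\frac{1}{\sqrt{n}\,\tilde s}\, I_{(n)}^{-1} I_{(n)}\, y_2 \end{pmatrix} = \begin{pmatrix} y_1 + n^{-\frac{1}{2}}\, y_2^T I_{(n)}\, y_2 \\ y_2 \end{pmatrix} = F^{-1}(y).$$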
3 Main results
Theorem 2. Suppose the assumptions A, R1, R2 are fulfilled. Then for $\|y\|^2 \le \mathrm{const}\cdot\log n$ there exists a constant $c$ such that
$$1 - c\,\log n\; D_n \;\le\; \frac{f_{het}(y)}{f_{hom}(y)} \;\le\; 1 + c\,\log n\; D_n. \qquad (3.0.19)$$
Note that the bound $D_n$ is minimized for $\Delta^2_n = \operatorname{median}(\Delta^2_{i,n})$.

Theorem 3. Suppose the assumptions N, A, R1, R2 are fulfilled. Then
$$(1-\delta_n)\int_{u_{n,lower}}^{u_{n,upper}} f_{\chi^2_{n-p}}(w)\,dw \;\le\; P\!\left(\frac{\mathrm{RSS}}{\bar\sigma^2_n} < u_n\right) \;\le\; (1+\delta_n)\int_{u_{n,lower}}^{u_{n,upper}} f_{\chi^2_{n-p}}(w)\,dw,$$
where
$$u_{n,upper} = n\left(1 + \frac{u_n - \bar\sigma^2_n}{\bar\sigma^2_n}\right), \qquad u_{n,lower} = n - \frac{c}{\bar\sigma^2_n}\, 2\sqrt{n\log n}, \qquad \delta_n = c\,\log n\; D_n + O(n^{-1}),$$
and $f_{\chi^2_{n-p}}(w)$ is the density of the $\chi^2$ distribution with $n-p$ degrees of freedom.
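The approximation in Theorem 3 can be illustrated by simulation. The following sketch (an added illustration with arbitrary choices of design, weights and $O(n^{-1/2})$ deviations, not part of the derivation above) compares the empirical distribution of $\mathrm{RSS}/\bar\sigma^2_n$ in a local heteroscedastic normal model with the $\chi^2_{n-p}$ distribution:

# Numerical illustration of Theorem 3 (illustrative parameter choices):
# simulate the weighted RSS in a local heteroscedastic normal model and
# compare the distribution of RSS / sigma_bar^2 with chi^2_{n-p}.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p, reps = 200, 3, 5000

X = rng.normal(size=(n, p))                          # design, rows x_{i,n}^T
w = rng.uniform(0.5, 2.0, size=n)                    # known weights w_{i,n}
delta2 = rng.uniform(0.0, 1.0, size=n) / np.sqrt(n)  # local deviations Delta^2_{i,n} = O(n^{-1/2})
sigma2 = (1.0 + delta2) / w                          # variances with w_i * sigma_i^2 = 1 + Delta^2_{i,n}
sigma2_bar = np.mean(w * sigma2)                     # \bar sigma_n^2

W = np.diag(w)
H = X @ np.linalg.solve(X.T @ W @ X, X.T @ W)        # weighted projection matrix X (X'WX)^{-1} X'W

rss = np.empty(reps)
for r in range(reps):
    eps = rng.normal(scale=np.sqrt(sigma2))          # heteroscedastic normal errors (beta = 0 w.l.o.g.)
    resid = eps - H @ eps                            # weighted least squares residuals
    rss[r] = np.sum(w * resid ** 2)                  # weighted RSS, cf. (2.0.7)

ks = stats.kstest(rss / sigma2_bar, "chi2", args=(n - p,))
print(f"Kolmogorov-Smirnov distance to chi^2 with {n - p} df: {ks.statistic:.4f}")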
4 Proofs
First let us show auxiliary results for estimating the difference between the heteroscedastic and the homoscedastic formulas.

Lemma 4. Under A, R1, R2, for all $\theta$ with $\|\theta\| < \frac{1}{2(1+\Delta^2)}$ there exist constants $c_k$, $C_k$ such that
$$\left|\sum_{i=1}^n \left(s_i^k(\theta_1) - s^k(\theta_1)\right)\right| \le c_k\, n D_n \qquad\text{and}\qquad \lambda_{\max}(B_k) \le C_k\, n D_n, \qquad k = 1,2,3, \qquad (4.0.20)$$
with $B_1 = n\left(B(\theta_1) - \tfrac{1}{2}\, s(\theta_1)\, I_{(n)}\right)$, $B_2 = n\left(B'(\theta_1) - s(\theta_1)^2\, I_{(n)}\right)$, $B_3 = n\left(B''(\theta_1) - 4\, s^3(\theta_1)\, I_{(n)}\right)$.
Proof. For simplicity let us give the proof for $k = 1$ only. By the mean value theorem
$$\left|\frac{x}{1-2x\theta_1} - \frac{y}{1-2y\theta_1}\right| \;\le\; |x-y| \max_{\{h:\,|x-h|\le|x-y|\}} \left|\frac{2h(1-h\theta_1)}{(1-2h\theta_1)^2}\right|. \qquad (4.0.21)$$
Under $|\theta_1| < \frac{1}{2(1+\Delta^2)}$ the derivatives in (4.0.21) are continuous and the maximum is taken over a bounded interval. For $|x - y| = \left|w_i\sigma^2_{i,n} - \bar\sigma^2_n\right| = \left|\Delta^2_{i,n} - \Delta^2_n\right|$ we get the first formula in (4.0.20). Hence we have $C_1 = w_{\max}\, c_x\, c_1$, because of
$$\lambda_{\max}(B_1) = \max_{t,\,\|t\|=1} t^T B_1 t \;\le\; \frac{1}{2}\sum_{i=1}^n \left|s_i(\theta_1) - s(\theta_1)\right| \max_{t,\,\|t\|=1} w_{i,n}\,(x_{i,n}^T t)^2 \;\le\; w_{\max}\, c_x\, c_1\, n D_n.$$
Lemma 5. Under A, R1, R2, for all $\theta$ with $\|\theta\| < \frac{1}{2(1+\Delta^2)}$ there exists a positive constant $\lambda_{hom}$ such that
$$\lambda_{\min}(C_{hom}(\theta)) > \lambda_{hom}.$$

Proof. Under A and $\|\theta\| < \frac{1}{2(1+\Delta^2)}$ it holds $s_i(\theta_1) > \frac{1}{1-2(1+\Delta^2)\theta_1} > 0$, hence $s(\theta_1) > 0$. Because of
$$|A| = |A_{22}|\left(A_{11} - A_{12} A_{22}^{-1} A_{21}\right) \qquad\text{for}\qquad A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$$
we have
$$|C_{hom}(\theta)| = 2\, s(\theta_1)^{p+2}\left|I_{(n)}^{-1}\right| > 0. \qquad (4.0.22)$$
Further, under (2.0.6),
$$\operatorname{tr}(C_{hom}(\theta)) = 4\, s(\theta_1)^3\,\theta_2^T I_{(n)}^{-1}\theta_2 + 2\, s(\theta_1)^2 + s(\theta_1)\operatorname{tr}(I_{(n)}^{-1}) < \mathrm{const}.$$
Applying $\lambda_{\min}(A) = |A|\,\prod_{i\ne\min}\lambda_i(A)^{-1} > |A|\operatorname{tr}(A)^{-p}$ we get for $s(\theta_1) > 0$ that
$$\lambda_{\min}(C_{hom}(\theta)) > 2\, s(\theta_1)^{p+2}\left|I_{(n)}^{-1}\right|\operatorname{tr}(I_{(n)}^{-1})^{-p} > (1+\Delta^2)^{-(p+1)}\, p^{-p}\,\lambda_0^{p+1} = \lambda_{hom} > 0.$$
Lemma 6. Under A, R1, R2, for all $\theta$ with $\|\theta\| < \frac{1}{2(1+\Delta^2)}$ there exists a constant $c$ such that
$$\|m_{het}(\theta) - m_{hom}(\theta)\| \le c\,\sqrt{n}\, D_n\, \|\theta\|.$$

Proof. From (2.0.13), (2.0.17) we get
$$\sqrt{n}\,\|m_{het}(\theta) - m_{hom}(\theta)\| < m_1 + m_2 + m_3 + m_4,$$
with $m_1 = \sum_{i=1}^n \left|w_i\sigma^2_{i,n} - \bar\sigma^2_n\right| = n D_n$ and $m_2 = \left|\sum_{i=1}^n \left(s_i(\theta_1) - s(\theta_1)\right)\right|$ and
$$m_3 = \left|\theta_2^T I_{(n)}^{-1} B_2\, I_{(n)}^{-1}\theta_2\right| \le \left\|I_{(n)}^{-1}\theta_2\right\|^2 |\lambda_{\max}(B_2)| \qquad\text{and}\qquad m_4 = \left\|I_{(n)}^{-1} B_1\, I_{(n)}^{-1}\theta_2\right\| \le \left\|I_{(n)}^{-1}\right\|^2 \|\theta_2\|\, |\lambda_{\max}(B_1)|.$$
Thus the statement follows from Lemma 4.

Lemma 7. For $\|\theta\| < \frac{1}{2(1+\Delta^2)}$, $\|t\| < \frac{1}{2(1+\Delta^2)}$ there exists a constant $c$ such that
$$\|m_{hom}(\theta) - m_{hom}(t)\| > c\,\sqrt{n}\,\|\theta - t\|.$$

Proof. The mean value theorem implies $m_{hom}(\theta) - m_{hom}(t) = m'_{hom}(s)(\theta - t)$, where $m'_{hom}(s)$ is the $(p+1)\times(p+1)$ matrix of the first derivatives of the $(p+1)$-dimensional function $m_{hom}(t)$. Because of $\|Ax\| \ge \lambda_{\min}(A)\|x\|$ and $m'_{hom}(s) = \sqrt{n}\, C_{hom}(s)$, the statement follows from Lemma 5.
Lemma 8. Under A, R1, R2, for $\|\tilde\theta\| < \frac{1}{2(1+\Delta^2)}$, $\|\hat\theta\| < \frac{1}{2(1+\Delta^2)}$ there exists a constant $c$ such that
$$\|\hat\theta - \tilde\theta\| \le c\, D_n\, \|\hat\theta\|.$$
Proof. Recall that $\hat\theta$ is the solution of $m_{het}(\hat\theta) = F^{-1}(y)$ and $\tilde\theta$ is the solution of $m_{hom}(\tilde\theta) = F^{-1}(y)$. Thus
$$m_{het}(\hat\theta) - m_{hom}(\hat\theta) = m_{hom}(\tilde\theta) - m_{hom}(\hat\theta). \qquad (4.0.23)$$
From Lemma 6 it follows that $\|m_{het}(\hat\theta) - m_{hom}(\hat\theta)\| \le c_1\sqrt{n}\, D_n\, \|\hat\theta\|$. Thus from (4.0.23) and Lemma 7
$$c_2\sqrt{n}\,\|\hat\theta - \tilde\theta\| < \|m_{hom}(\hat\theta) - m_{hom}(\tilde\theta)\| \le c_1\sqrt{n}\, D_n\, \|\hat\theta\|,$$
which gives the statement.
Lemma 9. Under A, R1, R2, for $\|\tilde\theta\| < \frac{1}{2(1+\Delta^2)}$, $\|\hat\theta\| < \frac{1}{2(1+\Delta^2)}$ there exists a constant $c$ such that $(1 + c\, D_n)^{-1}$