Journal of Statistical Planning and Inference 142 (2012) 1421–1435
Improving the estimators of the parameters of a probit regression model: A ridge regression approach

B.M. Golam Kibria a,*, A.K.Md.E. Saleh b

a Department of Mathematics and Statistics, Florida International University, Modesto A. Maidique Campus, Miami, FL 33199, USA
b Distinguished Research Professor, School of Mathematics and Statistics, Carleton University, Ottawa, Canada K1S 5B6
Article info

Article history: Received 3 September 2010; received in revised form 24 December 2011; accepted 27 December 2011; available online 4 January 2012.

Keywords: Dominance; Preliminary test; Probit regression model; Relative efficiency; Ridge regression; Risk function; Shrinkage estimation

Abstract

This paper considers the estimation of the regression parameters of a general probit regression model. We propose five ridge regression (RR) estimators for estimating the parameters $(\beta)$ of the probit regression model when the weighted design matrix is ill-conditioned and it is suspected that the parameter $\beta$ may belong to a linear subspace defined by $H\beta = h$. Asymptotic properties of the estimators are studied with respect to quadratic biases, MSE matrices and quadratic risks. The regions of optimality of the proposed estimators are determined based on the quadratic risks. Some relative efficiency tables and risk graphs are provided to illustrate the numerical comparison of the estimators. We conclude that when $q \ge 3$, one should use PRRRE; otherwise one should use PTRRE with some optimum size $\alpha$. We also discuss the performance of the proposed estimators compared to the alternative ridge regression method due to Liu (1993). © 2012 Elsevier B.V. All rights reserved.
1. Introduction

The general probit regression model is used to model dichotomous (binary) outcome variables. It is commonly used in microeconomics, health economics (Bishai, 1996) and medical science (see Bhattacharyya, 1997) when the purpose of the study is to model a latent variable $Y^*$ that may be defined through the regression relationship

$$y_i^* = x_i'\beta + u_i, \quad i = 1, 2, \ldots, n, \tag{1.1}$$

where $x_i'$ is the $i$th row of the $n \times (p+1)$ data matrix $X$, $\beta$ is the $(p+1)$-vector of regression coefficients and $u = (u_1, u_2, \ldots, u_n)'$ is the error vector with iid components having a continuous CDF $F(x)$ defined on $\mathbb{R}^1$ with finite Fisher information

$$I(f) = \int_{-\infty}^{\infty} \left\{ \frac{f'(u)}{f(u)} \right\}^2 f(u)\, du, \tag{1.2}$$
where $f$ is the pdf of $u$ and $f'(u)$ is the derivative of $f(u)$. However, the latent variable $Y^*$ is unobservable. So instead of $y_i^*$, we observe the following dummy variable:

$$y_i = \begin{cases} 1 & \text{if } y_i^* > 0, \\ 0 & \text{otherwise,} \end{cases} \tag{1.3}$$

which is distributed as a Bernoulli variable with parameter

$$p_i = F(x_i'\beta), \quad i = 1, 2, \ldots, n. \tag{1.4}$$

* Corresponding author. E-mail addresses: kibriag@fiu.edu (B.M.G. Kibria), [email protected] (A.K.Md.E. Saleh).
0378-3758/$ - see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2011.12.023
The likelihood function based on a sample of size $n$ is given by

$$L(\beta) = \prod_{i=1}^{n} p_i^{y_i} (1 - p_i)^{1 - y_i}. \tag{1.5}$$

Hence,

$$\ln L(\beta) = \sum_{i=1}^{n} y_i \log F(x_i'\beta) + \sum_{i=1}^{n} (1 - y_i) \log\bigl(1 - F(x_i'\beta)\bigr). \tag{1.6}$$

The parameters $\beta$ are estimated by solving the ML equations

$$\sum_{i=1}^{n} x_i \, \frac{y_i - F(x_i'\beta)}{F(x_i'\beta)\,[1 - F(x_i'\beta)]} \, f(x_i'\beta) = 0. \tag{1.7}$$
Since the system of equations in (1.7) is non-linear in $\beta$, one uses the following iteratively weighted least squares (IWLS) algorithm (Schaefer, 1986):

$$\tilde\beta_{ML} = [X'\hat W_n X]^{-1} (X'\hat W_n Z), \tag{1.8}$$

where $X'\hat W_n X$ is the weighted design matrix with

$$\hat W_n = \mathrm{Diag}\left( \frac{[f(x_i'\hat\beta)]^2}{F(x_i'\hat\beta)\,[1 - F(x_i'\hat\beta)]} \right), \quad i = 1, 2, \ldots, n,$$

and $Z = (Z_1, Z_2, \ldots, Z_n)'$, where $Z_i$ is defined as

$$Z_i = \log(\hat p_i) + \frac{y_i - \hat p_i}{\hat p_i (1 - \hat p_i)} \tag{1.9}$$

and $\hat p_i = F(x_i'\hat\beta)$. Let $X_n^* = \hat W_n^{1/2} X$ and $Z^* = \hat W_n^{1/2} Z$; then

$$\tilde\beta_{ML} = [X_n^{*\prime} X_n^*]^{-1} (X_n^{*\prime} Z^*). \tag{1.10}$$
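The iteration behind (1.8) can be sketched in a few lines. The following is a minimal illustration only, assuming standard normal errors (so $F$ and $f$ are the standard normal CDF and pdf) and using the textbook probit working response $z_i = x_i'\beta + (y_i - p_i)/f(x_i'\beta)$; that working response is an assumption of this sketch and is not taken from (1.9). The function names `probit_iwls` and `probit_score` are hypothetical.

```python
import math
import numpy as np


def norm_pdf(t):
    """Standard normal density f(t), elementwise."""
    return np.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def norm_cdf(t):
    """Standard normal CDF F(t), elementwise via math.erf."""
    return np.array([0.5 * (1.0 + math.erf(v / math.sqrt(2.0))) for v in t])


def probit_score(X, y, beta):
    """Left-hand side of the ML equations (1.7) evaluated at beta."""
    eta = X @ beta
    p = np.clip(norm_cdf(eta), 1e-12, 1.0 - 1e-12)
    return X.T @ ((y - p) * norm_pdf(eta) / (p * (1.0 - p)))


def probit_iwls(X, y, max_iter=100, tol=1e-10):
    """Weighted least squares iterations of the form (1.8).

    Returns the estimate and the weighted design matrix X' W X at the estimate.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(max_iter):
        eta = X @ beta
        p = np.clip(norm_cdf(eta), 1e-12, 1.0 - 1e-12)
        f = np.maximum(norm_pdf(eta), 1e-12)
        w = f ** 2 / (p * (1.0 - p))       # diagonal of W as defined in the text
        z = eta + (y - p) / f              # assumed working response (see lead-in)
        XtW = X.T * w                      # X' W without forming diag(W)
        beta_new = np.linalg.solve(XtW @ X, XtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    eta = X @ beta
    p = np.clip(norm_cdf(eta), 1e-12, 1.0 - 1e-12)
    w = norm_pdf(eta) ** 2 / (p * (1.0 - p))
    return beta, (X.T * w) @ X
```

At convergence the weighted least squares update leaves $\beta$ unchanged, which is equivalent to the score in (1.7) being zero, so `probit_score` offers a simple check of the result.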
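Now, we assume the following two conditions:

$$\text{(a)} \quad \lim_{n\to\infty} n^{-1} (X_n^{*\prime} X_n^*) = C, \qquad \text{(b)} \quad \max_{1 \le i \le n} x_{ni}^{*\prime} (X_n^{*\prime} X_n^*)^{-1} x_{ni}^{*} \to 0 \quad \text{as } n \to \infty, \tag{1.11}$$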
where $C = X'WX$ is a finite and positive definite matrix and $x_{ni}^{*}$ is the $i$th row of the matrix $X_n^*$. Then the asymptotic distribution of $\tilde\beta_{ML}$ is given by

$$\sqrt{n}\, H(\tilde\beta_{ML} - \beta) \sim N_q\bigl(0, [H C^{-1} H']\bigr) \tag{1.12}$$

and

$$\lim_{n\to\infty} P(L_n \le x \mid H_0) = H_q(x; 0), \tag{1.13}$$

where $H_q(x; 0)$ is the cdf of a central chi-square distribution with $q$ degrees of freedom and $L_n$ is the test statistic defined in (1.15). Since we suspect that the null hypothesis $H_0: H\beta = h$ may hold, where $H$ is a $q \times (p+1)$ known matrix and $h$ is a $q \times 1$ vector of known constants, the corresponding restricted estimator of $\beta$ is

$$\hat\beta_{ML} = \tilde\beta_{ML} - C_n^{-1} H' [H C_n^{-1} H']^{-1} (H\tilde\beta_{ML} - h), \tag{1.14}$$

where $C_n = X'\hat W_n X$. For the test of $H_0$, we propose the Wald-type test statistic defined by

$$L_n = n (H\tilde\beta_{ML} - h)' \left[ H \left( \tfrac{1}{n} C_n \right)^{-1} H' \right]^{-1} (H\tilde\beta_{ML} - h). \tag{1.15}$$

Under $H_0: H\beta = h$, $L_n \xrightarrow{D} \chi^2_q$ as $n \to \infty$.

It is well known that the MLE of $\beta$ is affected by the interrelationship among the covariates, which is reflected in the weighted design matrix $X'\hat W_n X$. Following Hoerl and Kennard (1970), we propose the probit ridge regression estimator (PRRE) of $\beta$ as

$$\tilde\beta_{ML}(k) = R_n(k)\,\tilde\beta_{ML}, \qquad R_n(k) = \left[ I + k \left( \tfrac{1}{n} C_n \right)^{-1} \right]^{-1}, \tag{1.16}$$
where $\tilde\beta_{ML}$ is the unrestricted maximum likelihood estimator of $\beta$. Note that

$$R_n(k) \to R(k) = [I + k C^{-1}]^{-1} \quad \text{as } n \to \infty. \tag{1.17}$$
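The restricted estimator (1.14), the Wald-type statistic (1.15) and the ridge factor $R_n(k)$ in (1.16) are all simple matrix expressions. The sketch below shows one way to evaluate them with NumPy; the function names and argument layout are hypothetical, and `C_n` is assumed to be the weighted design matrix $X'\hat W_n X$ from a fitted model.

```python
import numpy as np


def restricted_estimator(beta_u, C_n, H, h):
    """(1.14): beta_u - C_n^{-1} H' [H C_n^{-1} H']^{-1} (H beta_u - h)."""
    Cinv_Ht = np.linalg.solve(C_n, H.T)
    return beta_u - Cinv_Ht @ np.linalg.solve(H @ Cinv_Ht, H @ beta_u - h)


def wald_statistic(beta_u, C_n, H, h, n):
    """(1.15): n (H b - h)' [H (C_n / n)^{-1} H']^{-1} (H b - h)."""
    d = H @ beta_u - h
    M = H @ np.linalg.solve(C_n / n, H.T)
    return float(n * (d @ np.linalg.solve(M, d)))


def ridge_matrix(C_n, n, k):
    """(1.16): R_n(k) = [I + k (C_n / n)^{-1}]^{-1}."""
    dim = C_n.shape[0]
    return np.linalg.inv(np.eye(dim) + k * np.linalg.inv(C_n / n))
```

Two quick sanity checks follow from the formulas: the restricted estimator satisfies $H\hat\beta_{ML} = h$ exactly, and $R_n(0) = I$, so $k = 0$ recovers the unrestricted MLE.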
The restricted linear model with multicollinearity is common in practice. Ridge estimators under the Gaussian regression model have been considered by various researchers, notably Hoerl and Kennard (1970), Sarker (1992), Saleh and Kibria (1993), Gruber (1998), Malthouse (1999), Inoue (2001), Kibria and Saleh (2003) and very recently Hassanzadeh Bashtian et al. (2011a). However, the literature on ridge estimators under the probit regression model is limited. For details on probit regression and probit ridge regression we refer to Shariff et al. (2004) and Gana (1995), respectively, and the references therein. In this paper we study various types of probit ridge regression estimators and their asymptotic properties.

The organization of the paper is as follows. The proposed estimators and their asymptotic properties are provided in Section 2. The risk analyses are discussed in Section 3. The relative efficiencies of the proposed estimators compared to the unrestricted Liu (1993) estimator are discussed in Section 4. Finally, some concluding remarks are provided in Section 5.

2. Proposed estimators and their properties

2.1. Quasi-empirical Bayes ridge regression estimators

Following Liu (1993) and Saleh (2006), we propose the following quasi-empirical Bayes ridge regression estimators of $\beta$.

Unrestricted ridge regression estimator (URRE):

$$\tilde\beta_{ML}(k) = R_n(k)\,\tilde\beta_{ML}, \tag{2.1}$$
where $\tilde\beta_{ML}$ is the unrestricted maximum likelihood estimator of $\beta$ defined in Eq. (1.8).

Restricted ridge regression estimator (RRRE):

$$\hat\beta_{ML}(k) = R_n(k)\,\hat\beta_{ML}, \tag{2.2}$$

where $\hat\beta_{ML}$ is the restricted maximum likelihood estimator of $\beta$ defined in Eq. (1.14).

Preliminary test ridge regression estimator (PTRRE):

$$\hat\beta^{PT}_{ML}(k) = R_n(k)\,\hat\beta^{PT}_{ML}, \tag{2.3}$$

where $\hat\beta^{PT}_{ML} = \tilde\beta_{ML} - (\tilde\beta_{ML} - \hat\beta_{ML})\, I(L_n < \chi^2_{q,\alpha})$ is the preliminary test estimator of $\beta$, $\chi^2_{q,\alpha}$ is the $\alpha$-level upper critical value of the null distribution of $L_n$ and $I(A)$ is the indicator function of the set $A$. Preliminary test estimation under the Gaussian assumption was pioneered by Bancroft (1944), followed by Bancroft (1964), Han and Bancroft (1968), Judge and Bock (1978), Giles (1991), Kibria and Saleh (2006) and recently Arashi (2009), among others.

The preliminary test estimator (PTE) has its own problem, as it depends on the level of significance $\alpha$ $(0 < \alpha < 1)$. Thus, choosing an optimal value of $\alpha$ remains a problem for the use of the PTE. To overcome this problem, we propose the following Stein-type improved estimators, which are free of the level of significance $\alpha$.

Shrinkage ridge regression estimator (SRRE):

$$\hat\beta^{S}_{ML}(k) = R_n(k)\,\hat\beta^{S}_{ML}, \tag{2.4}$$

where $\hat\beta^{S}_{ML} = \tilde\beta_{ML} - (q-2)(\tilde\beta_{ML} - \hat\beta_{ML})\, L_n^{-1}$ is the shrinkage estimator of $\beta$.

Positive rule ridge regression estimator (PRRRE):

$$\hat\beta^{S+}_{ML}(k) = R_n(k)\,\hat\beta^{S+}_{ML}, \tag{2.5}$$

where $\hat\beta^{S+}_{ML} = \hat\beta_{ML}\, I(L_n < (q-2)) + \hat\beta^{S}_{ML}\, I(L_n \ge (q-2))$ is the positive rule estimator of $\beta$. The Stein-rule estimation technique has been considered for different purposes and several models by various researchers: James and Stein (1961), Ohtani (1993), Gruber (1998), Kibria and Saleh (2004), Saleh (2006), Arashi and Tabatabaey (2008, 2009, 2010a,b) and Hassanzadeh Bashtian et al. (2011b), to mention a few. We now discuss the asymptotic properties of the estimators in the following subsection.
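To make the definitions (2.1)–(2.5) concrete, the following sketch builds all five estimators from the unrestricted and restricted estimates, the test statistic $L_n$ and a ridge matrix. It is illustrative only: the function name `five_ridge_estimators` and its argument names are hypothetical, and $L_n$ is assumed to be strictly positive.

```python
import numpy as np


def five_ridge_estimators(beta_u, beta_r, L_n, q, crit, R):
    """Return URRE, RRRE, PTRRE, SRRE and PRRRE as defined in (2.1)-(2.5).

    beta_u, beta_r : unrestricted and restricted ML estimates
    L_n            : Wald-type statistic (assumed > 0)
    crit           : upper alpha-level critical value of chi-square with q d.f.
    R              : ridge matrix R_n(k)
    """
    diff = beta_u - beta_r
    pt = beta_u - diff * float(L_n < crit)          # preliminary test estimator
    stein = beta_u - (q - 2) * diff / L_n           # shrinkage (Stein-type) estimator
    stein_plus = beta_r if L_n < q - 2 else stein   # positive rule estimator
    base = {"URRE": beta_u, "RRRE": beta_r, "PTRRE": pt,
            "SRRE": stein, "PRRRE": stein_plus}
    return {name: R @ b for name, b in base.items()}
```

With a small $L_n$ the preliminary test and positive rule estimators both fall back on the restricted estimate, while a large $L_n$ moves all three data-driven estimators toward the unrestricted one.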
2.2. Asymptotic distributions of the estimators and their properties

As discussed in the previous section, under $H_0$ the test statistic $L_n$ asymptotically follows the central chi-square distribution with $q$ degrees of freedom (D.F.). Now we consider the properties of the test statistic $L_n$ under the fixed alternatives $K_\xi: H\beta = h + \xi$, $\xi \in \mathbb{R}^q$. Clearly, $L_n = n\,\xi' [H C^{-1} H']^{-1} \xi \xrightarrow{P} \infty$ as $n \to \infty$, so that $\lim_{n\to\infty} P_\xi(L_n \ge x) = 1$, i.e., $L_n$ is a consistent test. As a consequence, under the fixed alternatives, the asymptotic distributions of $\sqrt{n}(\hat\beta^{PT}_{ML} - \beta)$, $\sqrt{n}(\hat\beta^{S}_{ML} - \beta)$ and $\sqrt{n}(\hat\beta^{S+}_{ML} - \beta)$ are equivalent to the asymptotic distribution of $\sqrt{n}(\tilde\beta_{ML} - \beta)$, while the asymptotic distribution of $\sqrt{n}(\hat\beta_{ML} - \beta)$ is degenerate as $n \to \infty$. These results are proved below.
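(a) PTRRE: Consider the quadratic form

$$\begin{aligned} n(\hat\beta^{PT}_{ML} - \tilde\beta_{ML})' H' \left[ H \left( \tfrac{1}{n} C_n \right)^{-1} H' \right]^{-1} H (\hat\beta^{PT}_{ML} - \tilde\beta_{ML}) &= n(H\tilde\beta_{ML} - h)' \left[ H \left( \tfrac{1}{n} C_n \right)^{-1} H' \right]^{-1} (H\tilde\beta_{ML} - h)\, I(L_n < \chi^2_{q,\alpha}) \\ &= L_n\, I(L_n < \chi^2_{q,\alpha}) \le \chi^2_{q,\alpha}\, I(L_n < \chi^2_{q,\alpha}) \xrightarrow{P} 0 \quad \text{as } n \to \infty \end{aligned} \tag{2.6}$$

by the consistency of the test statistic $L_n$. Hence,

$$\sqrt{n}(\hat\beta^{PT}_{ML} - \beta) \overset{D}{\approx} \sqrt{n}(\tilde\beta_{ML} - \beta). \tag{2.7}$$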
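(b) SRRE: Similarly, consider the quadratic form

$$n(\hat\beta^{S}_{ML} - \tilde\beta_{ML})' H' \left[ H \left( \tfrac{1}{n} C_n \right)^{-1} H' \right]^{-1} H (\hat\beta^{S}_{ML} - \tilde\beta_{ML}) = (q-2)^2 L_n^{-1} \xrightarrow{P} 0 \quad \text{as } n \to \infty \text{ under fixed alternatives.} \tag{2.8}$$

Hence,

$$\sqrt{n}(\hat\beta^{S}_{ML} - \beta) \overset{D}{\approx} \sqrt{n}(\tilde\beta_{ML} - \beta). \tag{2.9}$$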
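(c) PRRRE: Finally, we consider the following quadratic form:

$$n(\hat\beta^{S+}_{ML} - \tilde\beta_{ML})' H' \left[ H \left( \tfrac{1}{n} C_n \right)^{-1} H' \right]^{-1} H (\hat\beta^{S+}_{ML} - \tilde\beta_{ML}) = (q-2)^2 L_n^{-1}\, I(L_n \ge q-2) + L_n\, I(L_n < q-2) \xrightarrow{P} 0 \quad \text{as } n \to \infty \text{ under fixed alternatives.} \tag{2.10}$$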
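Hence, $\sqrt{n}(\hat\beta^{S+}_{ML} - \beta) \overset{D}{\approx} \sqrt{n}(\tilde\beta_{ML} - \beta)$. Further, $n(\tilde\beta_{ML} - \hat\beta_{ML})' H' [H(\tfrac{1}{n} C_n)^{-1} H']^{-1} H (\tilde\beta_{ML} - \hat\beta_{ML}) = L_n \xrightarrow{P} \infty$ as $n \to \infty$. Hence, the distribution of $\sqrt{n}(\hat\beta_{ML} - \beta)$ is degenerate as $n \to \infty$.

To obtain proper discrimination between the asymptotic distributions of the estimators, we therefore consider the local alternatives $\{K_{(n)}\}$,

$$K_{(n)}: H\beta = h + n^{-1/2}\gamma, \qquad \gamma = (\gamma_1, \gamma_2, \ldots, \gamma_q)' \ne 0.$$

Then under $K_{(n)}$,

$$L_n = n(H\tilde\beta_{ML} - h)' \left[ H \left( \tfrac{1}{n} C_n \right)^{-1} H' \right]^{-1} (H\tilde\beta_{ML} - h) \xrightarrow{D} \chi^2_q(\Delta^2) \quad \text{as } n \to \infty,$$

where $\Delta^2 = \gamma' (H C^{-1} H')^{-1} \gamma$ is the non-centrality parameter. Finally, under $\{K_{(n)}\}$, as $n \to \infty$, we have

$$\begin{pmatrix} \sqrt{n}(\tilde\beta_{ML} - \beta) \\ \sqrt{n}(\hat\beta_{ML} - \beta) \\ \sqrt{n}(\tilde\beta_{ML} - \hat\beta_{ML}) \end{pmatrix} \sim N_{3p} \left\{ \begin{pmatrix} 0 \\ -\delta \\ \delta \end{pmatrix}, \begin{pmatrix} C^{-1} & C^{-1} - A & A \\ C^{-1} - A & C^{-1} - A & 0 \\ A & 0 & A \end{pmatrix} \right\}, \tag{2.11}$$

where $A = C^{-1} H' (H C^{-1} H')^{-1} H C^{-1}$ and $\delta = C^{-1} H' (H C^{-1} H')^{-1} \gamma$. Thus, we obtain the following conclusions from the results in (2.11), under $\{K_{(n)}\}$ as $n \to \infty$:

$$\begin{pmatrix} \sqrt{n}(\tilde\beta_{ML}(k) - \beta) \\ \sqrt{n}(\hat\beta_{ML}(k) - \beta) \\ \sqrt{n}(\tilde\beta_{ML}(k) - \hat\beta_{ML}(k)) \end{pmatrix} \sim N_{3p} \left\{ \begin{pmatrix} b_1(k) \\ b_2(k) \\ b_{12}(k) \end{pmatrix}, \Sigma^* \right\} \tag{2.12}$$

and

$$\Sigma^* = \begin{pmatrix} R(k) C^{-1} R(k) & R(k)(C^{-1} - A) R(k) & R(k) A R(k) \\ R(k)(C^{-1} - A) R(k) & R(k)(C^{-1} - A) R(k) & 0 \\ R(k) A R(k) & 0 & R(k) A R(k) \end{pmatrix}. \tag{2.13}$$

In (2.12), $b_1(k) = -k C^{-1}(k)\beta$, $b_2(k) = -R(k)\delta - k C^{-1}(k)\beta$ and $b_{12}(k) = b_1(k) - b_2(k) = R(k)\delta$, with $C(k) = [C + k I_p]$, are the asymptotic distributional biases (ADB) of the estimators.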
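Now, under $\{K_{(n)}\}$, we have

$$\begin{aligned} \lim_{n\to\infty} P\{\sqrt{n}(\hat\beta^{PT}_{ML} - \beta) \le x \mid K_{(n)}\} &= \lim_{n\to\infty} P\{\sqrt{n}(\hat\beta^{PT}_{ML} - \beta) \le x;\ L_n < \chi^2_{q,\alpha} \mid K_{(n)}\} + \lim_{n\to\infty} P\{\sqrt{n}(\hat\beta^{PT}_{ML} - \beta) \le x;\ L_n \ge \chi^2_{q,\alpha} \mid K_{(n)}\} \\ &= H_q(\chi^2_{q,\alpha}; \Delta^2)\, \Phi_p(x + \delta, 0, C^{-1} - A) + \int_{E(\gamma)} \Phi_p\bigl(x - C^{-1} H' (H C^{-1} H')^{-1} Z, 0, A\bigr)\, d\Phi_p\{Z, 0, (H C^{-1} H')\}, \end{aligned} \tag{2.14}$$

where $E(\gamma) = \{Z : (Z + \gamma)' (H C^{-1} H')^{-1} (Z + \gamma) \ge \chi^2_{q,\alpha}\}$ and $\Phi_p(Z, \theta, \Sigma)$ is the cdf of a $p$-variate normal distribution with mean $\theta$ and covariance matrix $\Sigma$. Here $H_q(x; \Delta^2)$ is the cdf of a non-central chi-square distribution with $q$ D.F. and non-centrality parameter $\Delta^2 = \gamma' (H C^{-1} H')^{-1} \gamma$. As a consequence, the asymptotic distributional bias of $\hat\beta^{PT}_{ML}(k)$ is given by

$$b_3(k) = -R(k)\delta\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - k C^{-1}(k)\beta. \tag{2.15}$$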
Furthermore,

$$\sqrt{n}(\hat\beta^{S}_{ML} - \beta) \xrightarrow{D} U - (q-2)\, \frac{C^{-1} H' (H C^{-1} H')^{-1} (HU + \gamma)}{(HU + \gamma)' (H C^{-1} H')^{-1} (HU + \gamma)}, \tag{2.16}$$

where $U \sim N_p(0, C^{-1})$. Then the asymptotic distributional bias of the Stein-type ridge regression estimator $\hat\beta^{S}_{ML}(k)$ is given by

$$b_4(k) = -(q-2) R(k)\delta\, E[\chi^{-2}_{q+2}(\Delta^2)] - k C^{-1}(k)\beta, \tag{2.17}$$

where

$$E[\chi^{-2}_{q+2}(\Delta^2)] = e^{-\Delta^2/2} \sum_{r \ge 0} \frac{1}{r!} \left( \frac{\Delta^2}{2} \right)^r \frac{1}{q + 2r} = E_r\bigl[(q + 2r)^{-1}\bigr]$$

and $E_r$ stands for the expectation with respect to a Poisson random variable $R$ with mean $\Delta^2/2$. Similarly, the asymptotic distributional bias of the positive rule ridge regression estimator $\hat\beta^{S+}_{ML}(k)$ is given by

$$b_5(k) = -R(k)\delta\, F(\Delta^2) - k C^{-1}(k)\beta, \tag{2.18}$$

where

$$F(\Delta^2) = \bigl\{ (q-2) E[\chi^{-2}_{q+2}(\Delta^2)] + H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - (q-2) E[\chi^{-2}_{q+2}(\Delta^2)\, I(\chi^2_{q+2}(\Delta^2) < q-2)] \bigr\}$$

and $E[\chi^{-2}_{q+2}(\Delta^2)\, I(\chi^2_{q+2}(\Delta^2) < q-2)]$ is the truncated expectation of the reciprocal of a non-central $\chi^2$-distribution with $(q+2)$ degrees of freedom and non-centrality parameter $\Delta^2/2$.
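The Poisson-mixture representation of $E[\chi^{-2}_{q+2}(\Delta^2)]$ lends itself to direct numerical evaluation by truncating the series. The sketch below is a minimal illustration in plain Python; the function names are hypothetical, the truncation length `terms` is an arbitrary choice suited to moderate $\Delta^2$, and $q > 2$ is assumed. The two fourth-order functions use the analogous series $E_r[(q+2r)^{-1}(q+2r-2)^{-1}]$ and $E_r[(q+2r)^{-1}(q+2r+2)^{-1}]$.

```python
import math


def poisson_expectation(g, delta2, terms=400):
    """E_r[g(R)] for R ~ Poisson(delta2 / 2), via a truncated series."""
    lam = delta2 / 2.0
    weight = math.exp(-lam)          # P(R = 0)
    total = 0.0
    for r in range(terms):
        total += weight * g(r)
        weight *= lam / (r + 1)      # P(R = r + 1) from P(R = r)
    return total


def inv_chi2_q2(q, delta2):
    """E[chi^{-2}_{q+2}(delta2)] = E_r[1 / (q + 2r)]."""
    return poisson_expectation(lambda r: 1.0 / (q + 2 * r), delta2)


def inv_chi4_q2(q, delta2):
    """E[chi^{-4}_{q+2}(delta2)] = E_r[1 / ((q + 2r)(q + 2r - 2))], q > 2."""
    return poisson_expectation(
        lambda r: 1.0 / ((q + 2 * r) * (q + 2 * r - 2)), delta2)


def inv_chi4_q4(q, delta2):
    """E[chi^{-4}_{q+4}(delta2)] = E_r[1 / ((q + 2r)(q + 2r + 2))]."""
    return poisson_expectation(
        lambda r: 1.0 / ((q + 2 * r) * (q + 2 * r + 2)), delta2)
```

A useful check on such an implementation is the identity $E[\chi^{-2}_{q+2}(\Delta^2)] = (q-2)E[\chi^{-4}_{q+2}(\Delta^2)] + \Delta^2 E[\chi^{-4}_{q+4}(\Delta^2)]$, which follows from the Poisson series term by term.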
2.3. MSE and risks

Following Saleh (2006), in this subsection we present the expressions for the MSE matrices and risks of all five estimators in the form of theorems.

Theorem 2.1. Based on (2.13)–(2.18), the MSE matrices of the URRE, RRRE, PTRRE, SRRE and PRRRE are given by (2.19):

$$\begin{aligned}
M_1(\tilde\beta_{ML}(k)) ={}& R(k) C^{-1} R(k) + k^2 C^{-1}(k)\beta\beta' C^{-1}(k), \\
M_2(\hat\beta_{ML}(k)) ={}& R(k)[C^{-1} - A] R(k) + R(k)\delta\delta' R(k) + k[R(k)\delta\beta' C^{-1}(k) + C^{-1}(k)\beta\delta' R(k)] + k^2 C^{-1}(k)\beta\beta' C^{-1}(k), \\
M_3(\hat\beta^{PT}_{ML}(k)) ={}& R(k)[C^{-1} - A\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)] R(k) + R(k)\delta\delta' R(k)\,[2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)] \\
& + k[R(k)\delta\beta' C^{-1}(k) + C^{-1}(k)\beta\delta' R(k)]\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + k^2 C^{-1}(k)\beta\beta' C^{-1}(k), \\
M_4(\hat\beta^{S}_{ML}(k)) ={}& R(k)\bigl[C^{-1} - (q-2) A \{(q-2) E[\chi^{-4}_{q+2}(\Delta^2)] + 2\Delta^2 E[\chi^{-4}_{q+4}(\Delta^2)]\}\bigr] R(k) + (q^2 - 4) R(k)\delta\delta' R(k)\, E[\chi^{-4}_{q+4}(\Delta^2)] \\
& + (q-2) k[R(k)\delta\beta' C^{-1}(k) + C^{-1}(k)\beta\delta' R(k)]\, E[\chi^{-2}_{q+2}(\Delta^2)] + k^2 C^{-1}(k)\beta\beta' C^{-1}(k), \\
M_5(\hat\beta^{S+}_{ML}(k)) ={}& M_4(\hat\beta^{S}_{ML}(k)) - R(k) A R(k)\, E[(1 - (q-2)\chi^{-2}_{q+2}(\Delta^2))^2\, I(\chi^2_{q+2}(\Delta^2) < q-2)] \\
& + R(k)\delta\delta' R(k) \bigl\{ 2 E[(1 - (q-2)\chi^{-2}_{q+2}(\Delta^2))\, I(\chi^2_{q+2}(\Delta^2) < q-2)] - E[(1 - (q-2)\chi^{-2}_{q+2}(\Delta^2))^2\, I(\chi^2_{q+2}(\Delta^2) < q-2)] \bigr\} \\
& + k[R(k)\delta\beta' C^{-1}(k) + C^{-1}(k)\beta\delta' R(k)]\, E[(1 - (q-2)\chi^{-2}_{q+2}(\Delta^2))\, I(\chi^2_{q+2}(\Delta^2) < q-2)],
\end{aligned} \tag{2.19}$$

where

$$\begin{aligned}
E(\chi^{-2}_{q+2}(\Delta^2)) &= E_r\{(q + 2r)^{-1}\}, \\
E(\chi^{-4}_{q+2}(\Delta^2)) &= E_r\{(q + 2r)^{-1} (q + 2r - 2)^{-1}\}, \\
E(\chi^{-2}_{q+4}(\Delta^2)) &= E_r\{(q + 2r + 2)^{-1}\}, \\
E(\chi^{-4}_{q+4}(\Delta^2)) &= E_r\{(q + 2r)^{-1} (q + 2r + 2)^{-1}\}.
\end{aligned}$$
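Theorem 2.2. Using (2.19), the risk expressions of the URRE, RRRE, PTRRE, SRRE and PRRRE are given by (2.20):

$$\begin{aligned}
R_1(\tilde\beta_{ML}(k)) ={}& \mathrm{tr}[R(k) C^{-1} R(k)] + k^2 \beta' C^{-2}(k)\beta, \\
R_2(\hat\beta_{ML}(k)) ={}& \mathrm{tr}[R(k) C^{-1} R(k)] - \mathrm{tr}[R(k) A R(k)] + \delta' R(k)^2 \delta + 2k\delta' R(k) C^{-1}(k)\beta + k^2 \beta' C^{-2}(k)\beta, \\
R_3(\hat\beta^{PT}_{ML}(k)) ={}& \mathrm{tr}[R(k) C^{-1} R(k)] - \mathrm{tr}[R(k) A R(k)]\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + \delta' R(k)^2 \delta\, [2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)] \\
& + 2k\delta' R(k) C^{-1}(k)\beta\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + k^2 \beta' C^{-2}(k)\beta, \\
R_4(\hat\beta^{S}_{ML}(k)) ={}& \mathrm{tr}(R(k)' C^{-1} R(k)) - (q-2)\,\mathrm{tr}(R(k)' A R(k))\, X(\Delta^2) + (q-2)\delta' R(k)' R(k)\delta\, Y(\Delta^2) \\
& + 2(q-2) k\delta' R(k)' C^{-1}(k)\beta\, E(\chi^{-2}_{q+2}(\Delta^2)) + k^2 \beta' C^{-2}(k)\beta \\
={}& \mathrm{tr}[R(k) C^{-1} R(k)] - (q-2)\,\mathrm{tr}[R(k) A R(k)] \left\{ (q-2) E[\chi^{-4}_{q+2}(\Delta^2)] + \left[ 1 - \frac{(q+2)\,\delta' R^2(k)\delta}{2\Delta^2\, \mathrm{tr}(R(k) A R(k))} \right] 2\Delta^2 E[\chi^{-4}_{q+4}(\Delta^2)] \right\} \\
& + 2k(q-2)\delta' R(k) C^{-1}(k)\beta\, E(\chi^{-2}_{q+2}(\Delta^2)) + k^2 \beta' C^{-2}(k)\beta, \\
R_5(\hat\beta^{S+}_{ML}(k)) ={}& R_4(\hat\beta^{S}_{ML}(k)) - \mathrm{tr}[R(k) A R(k)]\, E[(1 - (q-2)\chi^{-2}_{q+2}(\Delta^2))^2\, I(\chi^2_{q+2}(\Delta^2) < q-2)] \\
& + \delta' R^2(k)\delta \bigl\{ 2 E[(1 - (q-2)\chi^{-2}_{q+2}(\Delta^2))\, I(\chi^2_{q+2}(\Delta^2) < q-2)] - E[(1 - (q-2)\chi^{-2}_{q+2}(\Delta^2))^2\, I(\chi^2_{q+2}(\Delta^2) < q-2)] \bigr\} \\
& + 2k\delta' R(k) C^{-1}(k)\beta\, E[(1 - (q-2)\chi^{-2}_{q+2}(\Delta^2))\, I(\chi^2_{q+2}(\Delta^2) < q-2)],
\end{aligned} \tag{2.20}$$

where

$$\begin{aligned}
X(\Delta^2) &= 2 E(\chi^{-2}_{q+2}(\Delta^2)) - (q-2) E(\chi^{-4}_{q+2}(\Delta^2)), \\
Y(\Delta^2) &= 2 E(\chi^{-2}_{q+2}(\Delta^2)) - 2 E(\chi^{-2}_{q+4}(\Delta^2)) + (q-2) E(\chi^{-4}_{q+4}(\Delta^2)).
\end{aligned}$$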
The performance of the proposed estimators under the probit regression model is examined in the following section.

3. Comparison based on risk function

In this section we compare the performance of the proposed estimators in the light of the quadratic risk function. Since $C$ is a positive definite matrix, there exists an orthogonal matrix $\Gamma$ such that

$$\Gamma' C \Gamma = \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p),$$

where $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p > 0$ are the eigenvalues of $C$. It is easy to see that the eigenvalues of $R(k) = (I_p + k C^{-1})^{-1}$ and $C(k) = (C + k I_p)$ are $(\lambda_1/(\lambda_1 + k), \lambda_2/(\lambda_2 + k), \ldots, \lambda_p/(\lambda_p + k))$ and $(\lambda_1 + k, \lambda_2 + k, \ldots, \lambda_p + k)$, respectively. Then we obtain the following identities:

$$\mathrm{tr}(R(k)' C^{-1} R(k)) = \sum_{i=1}^{p} \frac{\lambda_i}{(\lambda_i + k)^2}; \tag{3.1}$$

$$\beta' C^{-2}(k)\beta = \sum_{i=1}^{p} \frac{\theta_i^2}{(\lambda_i + k)^2}, \qquad \theta = \Gamma'\beta, \tag{3.2}$$

$$\mathrm{tr}(R(k)' A R(k)) = \sum_{i=1}^{p} \frac{h^*_{ii} \lambda_i^2}{(\lambda_i + k)^2}, \tag{3.3}$$

where $h^*_{ii} \ge 0$ is the $i$th diagonal element of the matrix $H^* = \Gamma' A \Gamma$,

$$\mathrm{tr}(\delta' R(k)' R(k)\delta) = \sum_{i=1}^{p} \frac{\lambda_i^2 \delta_i^{*2}}{(\lambda_i + k)^2}, \tag{3.4}$$

where $\delta_i^*$ is the $i$th element of the vector $\delta^* = \delta'\Gamma$. Similarly,

$$\mathrm{tr}(\delta' R(k)' C^{-1}(k)\beta) = \sum_{i=1}^{p} \frac{\theta_i \lambda_i \delta_i^*}{(\lambda_i + k)^2}. \tag{3.5}$$
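The identities (3.1)–(3.5) are easy to confirm numerically. The sketch below draws an arbitrary positive definite $C$, restriction matrix $H$, $\beta$ and $\gamma$ (all synthetic, chosen only for illustration), evaluates both sides of each identity and returns them in pairs. The function name `ridge_trace_identities` is hypothetical.

```python
import numpy as np


def ridge_trace_identities(p=5, q=2, k=0.3, seed=1):
    """Return {label: (matrix form, eigenvalue form)} for identities (3.1)-(3.5)."""
    rng = np.random.default_rng(seed)
    B = rng.normal(size=(p, p))
    C = B @ B.T + p * np.eye(p)                 # synthetic positive definite C
    H = rng.normal(size=(q, p))
    beta = rng.normal(size=p)
    gamma = rng.normal(size=q)

    Cinv = np.linalg.inv(C)
    G = Cinv @ H.T @ np.linalg.inv(H @ Cinv @ H.T)
    A = G @ H @ Cinv                            # A = C^{-1}H'(HC^{-1}H')^{-1}HC^{-1}
    delta = G @ gamma                           # delta = C^{-1}H'(HC^{-1}H')^{-1}gamma
    R = np.linalg.inv(np.eye(p) + k * Cinv)     # R(k)
    Ck_inv = np.linalg.inv(C + k * np.eye(p))   # C^{-1}(k)

    lam, Gamma = np.linalg.eigh(C)              # C = Gamma diag(lam) Gamma'
    theta = Gamma.T @ beta
    delta_star = Gamma.T @ delta
    h_star = np.diag(Gamma.T @ A @ Gamma)
    den = (lam + k) ** 2

    return {
        "3.1": (np.trace(R.T @ Cinv @ R), np.sum(lam / den)),
        "3.2": (beta @ Ck_inv @ Ck_inv @ beta, np.sum(theta ** 2 / den)),
        "3.3": (np.trace(R.T @ A @ R), np.sum(h_star * lam ** 2 / den)),
        "3.4": (delta @ R.T @ R @ delta, np.sum(lam ** 2 * delta_star ** 2 / den)),
        "3.5": (delta @ R.T @ Ck_inv @ beta, np.sum(theta * lam * delta_star / den)),
    }
```

Each pair should agree to floating-point accuracy, since $R(k)$, $C(k)$ and $C$ share the eigenvectors in $\Gamma$.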
3.1. Comparison of PTRRE with PT, URRE, RRRE

3.1.1. Comparison between PTRRE and PT

Case 1: Under the null hypothesis, the risk function of PTRRE can be expressed as follows:

$$R(\hat\beta^{PT}_{ML}(k)) = \sum_{i=1}^{p} \frac{\lambda_i [1 - \lambda_i h^*_{ii} H_{q+2}(\chi^2_{q,\alpha}; 0)]}{(\lambda_i + k)^2} + k^2 \sum_{i=1}^{p} \frac{\theta_i^2}{(\lambda_i + k)^2}. \tag{3.6}$$

The first term is a continuous, monotonically decreasing function of $k$, and its derivative w.r.t. $k$ approaches $-\infty$ as $k \to 0^+$ and $\lambda_p \to 0$. The second term is a continuous, monotonically increasing function of $k$, and its derivative w.r.t. $k$ approaches 0 as $k \to 0^+$. Differentiating $R(\hat\beta^{PT}_{ML}(k))$ w.r.t. $k$, we find that a sufficient condition for $\partial R(\hat\beta^{PT}_{ML}(k))/\partial k$ to be negative is that $0 < k < k_1(\alpha)$, where

$$k_1(\alpha) = \frac{1}{\max_{1 \le i \le p} \bigl[ \theta_i^2 \{1 - \lambda_i h^*_{ii} H_{q+2}(\chi^2_{q,\alpha}; 0)\}^{-1} \bigr]}.$$

There always exists a $k > 0$ in the region $0 < k < k_1(\alpha)$ (see Hoerl and Kennard, 1970) such that $\hat\beta^{PT}_{ML}(k)$ has smaller risk than that of $\tilde\beta_{ML}$ under $H_0$.
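The role of the bound $k_1(\alpha)$ can be illustrated with a small numerical sketch of the null risk (3.6), written in terms of the eigenvalues $\lambda_i$, the canonical coefficients $\theta_i = (\Gamma'\beta)_i$ and the diagonal elements $h^*_{ii}$ of $\Gamma' A \Gamma$. The function names and the argument `H_null`, standing for the constant $H_{q+2}(\chi^2_{q,\alpha}; 0)$, are hypothetical, and the inputs used to exercise the functions are made up for illustration.

```python
def ptrre_null_risk(k, lam, theta, h_star, H_null):
    """Risk (3.6) of the PTRRE under the null hypothesis."""
    return sum(
        (l * (1.0 - l * h * H_null) + k * k * t * t) / (l + k) ** 2
        for l, t, h in zip(lam, theta, h_star)
    )


def k1_bound(lam, theta, h_star, H_null):
    """Upper end of the interval (0, k1) on which the null risk is decreasing in k."""
    return 1.0 / max(
        t * t / (1.0 - l * h * H_null)
        for l, t, h in zip(lam, theta, h_star)
    )
```

For any $0 < k < k_1(\alpha)$ every summand of (3.6) has a negative derivative in $k$, so the risk evaluated on an increasing grid inside that interval should decrease monotonically from its value at $k = 0$.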
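Remark 1. If $\alpha = 0$, then we find that the risk of $\hat\beta_{ML}(k)$ is smaller than that of $\hat\beta_{ML}$, and for $\alpha = 1$, the risk of $\tilde\beta_{ML}(k)$ is smaller than that of $\tilde\beta_{ML}$.

Case 2: When the null hypothesis does not hold, we consider the following risk difference:

$$\begin{aligned} R(\hat\beta^{PT}_{ML}) - R(\hat\beta^{PT}_{ML}(k)) ={}& \mathrm{tr}[C^{-1} - R(k) C^{-1} R(k)'] - \mathrm{tr}[A - R(k) A R(k)']\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) \\ & + (\delta'\delta - \delta' R(k)' R(k)\delta) \{2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)\} \\ & - 2k H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)\, \delta' R(k)' C^{-1}(k)\beta - k^2 \beta' C^{-2}(k)\beta. \end{aligned} \tag{3.7}$$

The r.h.s. is $\ge 0$ according as

$$\delta' [I_p - R(k)' R(k)]\delta \ge \frac{f_1(k, \alpha, \Delta^2)}{[2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)]}, \tag{3.8}$$

where

$$f_1(k, \alpha, \Delta^2) = \mathrm{tr}[A - R(k) A R(k)']\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - \mathrm{tr}[C^{-1} - R(k) C^{-1} R(k)'] + 2k H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)\, \delta' R(k)' C^{-1}(k)\beta + k^2 \beta' C^{-2}(k)\beta.$$

Thus it can be shown by the Courant–Fisher theorem (Rao, 1973, pp. 48–53) that $\hat\beta^{PT}_{ML}(k)$ is superior (in the sense of smaller risk) to $\hat\beta^{PT}_{ML}$ whenever $\Delta_1^2(\alpha, k) \le \Delta^2 < \infty$, where

$$\Delta_1^2(\alpha, k) = \frac{f_1(k, \alpha, \Delta^2)}{\mathrm{Ch}_{\max}[(I_p - R(k)' R(k)) C^{-1}]\, \{2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)\}} \tag{3.9}$$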
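and $\mathrm{Ch}_{\max}(M)$ is the largest characteristic root of the matrix $M$.

Now we consider $R(\hat\beta^{PT}_{ML}(k))$ as a function of the eigenvalues and $k$, which is given as follows:

$$R(\hat\beta^{PT}_{ML}(k)) = \sum_{i=1}^{p} \frac{1}{(\lambda_i + k)^2} \bigl[ \lambda_i \{1 - \lambda_i h^*_{ii} H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)\} + k^2 \theta_i^2 + 2k\theta_i \lambda_i \delta_i^* H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + \lambda_i^2 \delta_i^{*2} \{2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)\} \bigr]. \tag{3.10}$$

Differentiating (3.10) w.r.t. $k$ we obtain

$$\frac{\partial R(\hat\beta^{PT}_{ML}(k))}{\partial k} = 2 \sum_{i=1}^{p} \frac{1}{(\lambda_i + k)^3} \bigl[ k\{\lambda_i \theta_i^2 - \theta_i \lambda_i \delta_i^* H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)\} - \{\lambda_i (1 - \lambda_i h^*_{ii} H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)) + \lambda_i^2 \delta_i^{*2} \{2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)\} - \theta_i \lambda_i^2 \delta_i^* H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)\} \bigr]. \tag{3.11}$$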
Hence, we find a sufficient condition on $k$ for the derivative $\partial R(\hat\beta^{PT}_{ML}(k))/\partial k$ to be negative. Let us define

$$k_2(\alpha, \Delta^2) = \frac{f_2(\alpha, \Delta^2)}{g_1(\alpha, \Delta^2)}, \tag{3.12}$$

where

$$f_2(\alpha, \Delta^2) = \min_{1 \le i \le p} \bigl\{ \lambda_i (1 - \lambda_i h^*_{ii} H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)) + \lambda_i^2 \delta_i^{*2} \{2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)\} - \theta_i \lambda_i^2 \delta_i^* H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) \bigr\}$$

and

$$g_1(\alpha, \Delta^2) = \max_{1 \le i \le p} \bigl\{ \lambda_i \theta_i^2 - \theta_i \lambda_i \delta_i^* H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) \bigr\}.$$

Suppose $k > 0$; then using (3.12) and following Kaciranlar et al. (1999), we have the following statements.

1. If $g_1(\alpha, \Delta) > 0$, it follows that for each positive $k$ with $k < k_2(\alpha, \Delta)$, $\hat\beta^{PT}_{ML}(k)$ has smaller risk than that of $\hat\beta^{PT}_{ML}$.
2. If $g_1(\alpha, \Delta) < 0$, it follows that for each positive $k$ with $k > k_2(\alpha, \Delta)$, $\hat\beta^{PT}_{ML}(k)$ has smaller risk than that of $\hat\beta^{PT}_{ML}$.
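3.1.2. Comparison of PTRRE with RRRE and URRE

We first compare PTRRE and RRRE. Under the alternative hypothesis, the risk difference of $\hat\beta^{PT}_{ML}(k)$ and $\hat\beta_{ML}(k)$ is

$$\begin{aligned} R(\hat\beta^{PT}_{ML}(k)) - R(\hat\beta_{ML}(k)) ={}& \mathrm{tr}[R(k) A R(k)']\,[1 - H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)] - (\delta' R(k)' R(k)\delta)\,[1 - 2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)] \\ & - 2k\delta' R(k)' C^{-1}(k)\beta\,[1 - H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)]. \end{aligned} \tag{3.13}$$

The expression is $\ge 0$ whenever

$$\delta' R(k)' R(k)\delta \ge \frac{\mathrm{tr}[R(k) A R(k)']\,[1 - H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)] - 2k\delta' R(k)' C^{-1}(k)\beta\,[1 - H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)]}{[1 - 2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)]}. \tag{3.14}$$

Using the Courant–Fisher theorem, we obtain (3.13) $\ge 0$ according as

$$\Delta_2^2(\alpha, k) \ge \frac{\{\mathrm{tr}[R(k) A R(k)'] - 2k\delta' R(k)' C^{-1}(k)\beta\}\,[1 - H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)]}{\mathrm{Ch}_{\max}(R(k)' R(k) C^{-1})\,[1 - 2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)]}. \tag{3.15}$$

Thus $\hat\beta_{ML}(k)$ dominates $\hat\beta^{PT}_{ML}(k)$ whenever $\Delta^2 \in (0, \Delta_2^2(\alpha, k))$. Taking $\alpha = 1$, we find that $\hat\beta_{ML}(k)$ dominates $\tilde\beta_{ML}(k)$ whenever $\Delta^2 \in (0, \Delta_3^2(\alpha, k))$, where

$$\Delta_3^2(\alpha, k) \ge \frac{\mathrm{tr}[R(k) A R(k)'] - 2k\delta' R(k)' C^{-1}(k)\beta}{\mathrm{Ch}_{\max}(R(k)' R(k) C^{-1})}.$$

Re-writing the expression (3.13) in terms of the eigenvalues and $k$, we obtain

$$R(\hat\beta^{PT}_{ML}(k)) - R(\hat\beta_{ML}(k)) = \sum_{i=1}^{p} \frac{1}{(\lambda_i + k)^2} \bigl[ \lambda_i^2 h^*_{ii} [1 - H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)] - \lambda_i^2 \delta_i^{*2} [1 - 2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)] - 2k\theta_i \lambda_i \delta_i^* [1 - H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)] \bigr]. \tag{3.16}$$

The r.h.s. is negative when $k > k_3(\Delta^2, \alpha)$, where

$$k_3(\Delta^2, \alpha) = \frac{\max_{1 \le i \le p} \bigl\{ \lambda_i^2 h^*_{ii} [1 - H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)] - \lambda_i^2 \delta_i^{*2} [1 - 2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) + H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)] \bigr\}}{\min_{1 \le i \le p} \bigl\{ 2\theta_i \lambda_i \delta_i^* [1 - H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)] \bigr\}}. \tag{3.17}$$

Therefore, the PTRRE will dominate RRRE when $k_3(\Delta^2, \alpha) < k$; otherwise, RRRE will dominate PTRRE. For $\alpha = 1$, URRE will dominate RRRE when $k_4(\Delta^2, \alpha) < k$, where

$$k_4(\Delta^2, \alpha) = \frac{\max_{1 \le i \le p} \{\lambda_i^2 h^*_{ii} - \lambda_i^2 \delta_i^{*2}\}}{\min_{1 \le i \le p} \{2\theta_i \lambda_i \delta_i^*\}}. \tag{3.18}$$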
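Now we compare PTRRE and URRE. The risk difference between URRE and PTRRE is

$$\begin{aligned} R(\tilde\beta_{ML}(k)) - R(\hat\beta^{PT}_{ML}(k)) ={}& \mathrm{tr}[R(k) A R(k)']\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - (\delta' R(k)' R(k)\delta)\,[2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)] \\ & - 2k\delta' R(k)' C^{-1}(k)\beta\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2). \end{aligned} \tag{3.19}$$

The expression is $\ge 0$ whenever

$$\delta' R(k)' R(k)\delta \ge \frac{\{\mathrm{tr}[R(k) A R(k)'] - 2k\delta' R(k)' C^{-1}(k)\beta\}\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)}{[2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)]}. \tag{3.20}$$

Using again the Courant–Fisher theorem, we obtain (3.19) $\ge 0$ according as

$$\Delta_4^2(\alpha, k) \ge \frac{\{\mathrm{tr}[R(k) A R(k)'] - 2k\delta' R(k)' C^{-1}(k)\beta\}\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)}{\mathrm{Ch}_{\max}(R(k)' R(k) C^{-1})\,[2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)]}. \tag{3.21}$$

Thus $\tilde\beta_{ML}(k)$ dominates $\hat\beta^{PT}_{ML}(k)$ whenever $\Delta^2 \in (0, \Delta_4^2(k, \alpha))$. Re-writing the expression (3.19) in terms of the eigenvalues and $k$, we obtain

$$R(\tilde\beta_{ML}(k)) - R(\hat\beta^{PT}_{ML}(k)) = \sum_{i=1}^{p} \frac{1}{(\lambda_i + k)^2} \bigl[ \lambda_i^2 h^*_{ii} H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - \lambda_i^2 \delta_i^{*2} [2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)] - 2k\theta_i \lambda_i \delta_i^* H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) \bigr]. \tag{3.22}$$

The r.h.s. is positive and the PTRRE will dominate URRE when $k_5(\alpha, \Delta^2) < k$, where

$$k_5(\alpha, \Delta^2) = \frac{\max_{1 \le i \le p} \bigl\{ \lambda_i^2 h^*_{ii} H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - \lambda_i^2 \delta_i^{*2} [2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)] \bigr\}}{\min_{1 \le i \le p} \bigl\{ 2\theta_i \lambda_i \delta_i^* H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) \bigr\}} \tag{3.23}$$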
and URRE will dominate PTRRE whenever $k \le k_5(\alpha, \Delta^2)$.

3.2. Comparison of SRRE and PRRRE with PR, SE, URRE, RRRE, PTRRE

Following Section 3.1, we can make theoretical comparisons of the estimators PRRRE and SRRE with URRE, RRRE and PTRRE, and find the values of the ridge parameter $k$ and the non-centrality parameter $\Delta^2$ for which one estimator dominates the others in the sense of smaller risk. Following a reviewer's suggestion and for the brevity of the paper, we have omitted all of those comparisons. However, they are available from the authors (Kibria and Saleh, 2011).

Since we have omitted some theoretical comparisons among the estimators, we illustrate the performance of the estimators with some graphs based on the orthonormal regression. For this we assumed that $X'X = I_p$, $H'H = I_p$, $HH' = I_q$, $h = 0$. We have plotted the risk of the improved estimators versus $\Delta^2$ for different values of $\alpha$ and $k$ in Figs. 1 and 2, respectively.

[Fig. 1. Risk behavior of the estimators for p = 8, q = 5, k = 0.20 and different α. Four panels (alpha = 0.025, 0.05, 0.10, 0.20), each plotting Risk against Delta for URRE, RRRE, PTRRE, SRRE and PRRRE.]

[Fig. 2. Risk behavior of the estimators for p = 8, q = 5, α = 0.05 and different k. Four panels (k = 0.2, 0.4, 0.6, 0.8), each plotting Risk against Delta for URRE, RRRE, PTRRE, SRRE and PRRRE.]

We observed that both RRRE and PTRRE dominate URRE, SRRE and PRRRE near the null hypothesis. However, the performance of RRRE becomes the worst when $\Delta^2$ moves away from the subspace restriction. The PRRRE and SRRE achieve a common upper bound, i.e., the risk of URRE, when $\Delta^2 \to \infty$. We also note that both SRRE and PRRRE performed better for large $q$.

To illustrate the performance of the estimators numerically, using the above orthonormal regression with $p = 10$ and $q = 4$, we have provided the relative efficiencies of the estimators (RRRE, PTRRE, SRRE, PRRRE) compared with URRE in Table 1. From Table 1 we notice that all proposed estimators performed better (in the sense of efficiency) for smaller values of $k$. The PTRRE and RRRE perform better near the null hypothesis compared to PRRRE and SRRE. However, the PRRRE and SRRE perform better compared to PTRRE and RRRE when $\Delta^2$ departs from the origin. We also observed that the PRRRE is more efficient than URRE and SRRE for any $\Delta^2$. Since the performance of PTRRE depends on the size of the test ($\alpha$), we provide the optimal significance level of PTRRE in the following subsection.

3.3. Optimal significance level of PTRRE

In this subsection, we consider the relative efficiency (RE) of PTRRE compared to URRE.
Accordingly, we provide a maximum and minimum (Max and Min) rule for the optimum choice of the level of significance of the PTRRE for testing the null hypothesis $H_0: H\beta = h$. For a fixed value of $k$ $(k > 0)$, this RE is a function of $\alpha$ and $\Delta^2$. Let us denote this by

$$E(\alpha, \Delta, k) = \frac{R(\tilde\beta_{ML}(k))}{R(\hat\beta^{PT}_{ML}(k))} = \left[ 1 - \frac{f_3(\alpha, k, \Delta^2)}{\mathrm{tr}[R(k) C^{-1} R(k)'] + k^2 \beta' C^{-2}(k)\beta} \right]^{-1}, \tag{3.24}$$
Table 1
Relative efficiency of the estimators compared with URRE for p = 10, q = 4.

k    Estimator  Δ² = 0.000  2.778  5.556  8.333  11.111  13.889  16.667  19.444  22.222  25.000
0.0  ERE             1.667  1.139  0.865  0.698   0.584   0.503   0.441   0.393   0.354   0.323
     EPTE            1.517  1.032  0.878  0.840   0.853   0.884   0.919   0.948   0.969   0.982
     ESE             1.250  1.121  1.072  1.048   1.032   1.020   1.010   1.004   1.001   0.999
     EPE             1.338  1.165  1.090  1.056   1.038   1.027   1.019   1.012   1.008   1.005
0.1  ERRRE           1.612  1.114  0.851  0.688   0.578   0.498   0.438   0.390   0.352   0.321
     EPTRRE          1.478  1.019  0.872  0.837   0.851   0.883   0.918   0.947   0.968   0.982
     ESRRE           1.234  1.113  1.067  1.044   1.029   1.017   1.009   1.003   1.000   0.999
     EPRRE           1.322  1.156  1.084  1.052   1.035   1.025   1.017   1.012   1.007   1.004
0.2  ERRRE           1.559  1.089  0.837  0.680   0.572   0.494   0.435   0.388   0.350   0.319
     EPTRRE          1.440  1.006  0.866  0.834   0.849   0.882   0.917   0.947   0.968   0.982
     ESRRE           1.218  1.104  1.061  1.040   1.026   1.015   1.007   1.002   1.000   0.999
     EPRRE           1.305  1.147  1.079  1.048   1.032   1.023   1.016   1.011   1.007   1.004
0.3  ERRRE           1.508  1.066  0.824  0.672   0.567   0.490   0.432   0.386   0.349   0.318
     EPTRRE          1.403  0.993  0.860  0.831   0.847   0.881   0.917   0.947   0.968   0.982
     ESRRE           1.203  1.096  1.056  1.036   1.023   1.013   1.006   1.001   0.999   0.999
     EPRRE           1.289  1.138  1.073  1.044   1.029   1.021   1.014   1.010   1.006   1.004
0.4  ERRRE           1.460  1.043  0.812  0.664   0.562   0.487   0.430   0.385   0.348   0.318
     EPTRRE          1.367  0.981  0.855  0.828   0.846   0.881   0.917   0.947   0.968   0.982
     ESRRE           1.187  1.087  1.050  1.032   1.020   1.011   1.004   1.000   0.999   0.998
     EPRRE           1.272  1.129  1.067  1.039   1.026   1.018   1.013   1.009   1.006   1.003
0.6  ERRRE           1.370  1.002  0.790  0.652   0.555   0.483   0.428   0.384   0.348   0.318
     EPTRRE          1.299  0.959  0.846  0.824   0.845   0.880   0.917   0.947   0.969   0.982
     ESRRE           1.156  1.071  1.039  1.024   1.014   1.007   1.001   0.999   0.998   0.998
     EPRRE           1.238  1.111  1.056  1.031   1.020   1.014   1.010   1.007   1.004   1.003
0.8  ERRRE           1.291  0.966  0.771  0.642   0.550   0.481   0.427   0.384   0.349   0.320
     EPTRRE          1.238  0.939  0.838  0.822   0.844   0.881   0.918   0.948   0.969   0.983
     ESRRE           1.127  1.055  1.028  1.016   1.009   1.003   0.999   0.997   0.997   0.997
     EPRRE           1.205  1.094  1.044  1.023   1.014   1.010   1.007   1.005   1.003   1.002
where

$$f_3(\alpha, k, \Delta^2) = \mathrm{tr}[R(k) A R(k)']\, H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - \delta' R(k)' R(k)\delta\, \{2 H_{q+2}(\chi^2_{q,\alpha}; \Delta^2) - H_{q+4}(\chi^2_{q,\alpha}; \Delta^2)\} - 2k H_{q+2}(\chi^2_{q,\alpha}; \Delta^2)\, \delta' R(k)' C^{-1}(k)\beta.$$

For given $k$, the function $E(\alpha, \Delta^2, k)$ is a function of $\alpha$ and $\Delta^2$. For $\alpha \ne 0$ this function has its maximum under the null hypothesis, with the following value:

$$E_{\mathrm{Max}}(\alpha, 0, k) = \left[ 1 - \frac{\mathrm{tr}[R(k) A R(k)']\, H_{q+2}(\chi^2_{q,\alpha}; 0)}{\mathrm{tr}[R(k) C^{-1} R(k)'] + k^2 \beta' C^{-2}(k)\beta} \right]^{-1}. \tag{3.25}$$

For given $k$, $E_{\mathrm{Max}}(\alpha, 0, k)$ is a decreasing function of $\alpha$, while the minimum efficiency $E_{\mathrm{Min}}$ is an increasing function of $\alpha$. For $\alpha \ne 0$, as $\Delta^2$ varies, the graphs of $E(0, \Delta, k)$ and $E(1, \Delta, k)$ intersect in the range $0 < \Delta^2 < \Delta_2^2(\alpha, k)$, which is given in (3.15). Therefore, in order to choose an estimator with optimum relative efficiency, we adopt the following rule for fixed values of $k$. If $0 < \Delta^2 < \Delta_2^2(\alpha, k)$, we choose $\hat\beta_{ML}(k)$, since $E(0, \Delta, k)$ is the largest in this interval. However, $\Delta^2$ is unknown and there is no way of choosing a uniformly best estimator. Therefore, we use the following criterion for selecting the significance level of the preliminary test.

Suppose the experimenter does not know the size $\alpha$ and wants an estimator whose relative efficiency is not less than $E_{\mathrm{Min}}$. Then among the set of estimators with $\alpha \in A$, where $A = \{\alpha : E(\alpha, \Delta, k) \ge E_{\mathrm{Min}} \text{ for all } \Delta\}$, the estimator is chosen to maximize $E(\alpha, \Delta, k)$ over all $\alpha \in A$ and all $\Delta^2$. Thus we solve for $\alpha$ from the following equation:

$$\max_{0 \le \alpha \le 1} \min_{\Delta^2} E(\alpha, \Delta, k) = E_{\mathrm{Min}}. \tag{3.26}$$
The solution $\alpha^*$ of (3.26) gives the optimum choice of $\alpha$, and $\Delta^2 = \Delta^2_{\min}$ is the value of $\Delta^2$ for which (3.26) is satisfied. The maximum and minimum guaranteed efficiencies for $p = 6$, $q = 4$ and different values of $k$ are presented in Table 2. We also provide the efficiencies of RRRE, SRRE and PRRRE at the corresponding $\Delta^2_{\mathrm{Min}}$. From Table 2, we observe that the maximum guaranteed efficiency is a decreasing function of $\alpha$, the size of the test, whereas the minimum guaranteed efficiency is an increasing function of $\alpha$. For fixed $\alpha$, $p$ and $q$, both the maximum and minimum guaranteed efficiencies are decreasing functions of $k$. It is also noted that the efficiencies of the SRRE and PRRRE lie between the minimum and maximum guaranteed efficiencies, depending on the size of the test. However, the performance of RRRE is always the worst compared to the rest.
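The max–min rule can be explored numerically in the simplest setting. The sketch below is restricted to the orthonormal case with $k = 0$ (so that $C = I_p$, $\mathrm{tr}\,A = q$ and $\delta'\delta = \Delta^2$) and to even $q$, for which the central chi-square cdf has a closed form; under those assumptions the relative efficiencies reduce to $p/(p - q + \Delta^2)$ for the restricted estimator and $p/\{p - qH_{q+2} + \Delta^2(2H_{q+2} - H_{q+4})\}$ for the preliminary test estimator. All function names are hypothetical, and the search over $\Delta^2$ is a plain grid search rather than the authors' procedure.

```python
import math


def chi2_cdf_even(x, df):
    """Central chi-square cdf for even df: 1 - exp(-x/2) * sum_{j < df/2} (x/2)^j / j!."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return 1.0 - math.exp(-half) * total


def ncx2_cdf_even(x, df, delta2, terms=80):
    """H_df(x; delta2) as a Poisson(delta2 / 2) mixture of central chi-squares."""
    lam = delta2 / 2.0
    weight, total = math.exp(-lam), 0.0
    for r in range(terms):
        total += weight * chi2_cdf_even(x, df + 2 * r)
        weight *= lam / (r + 1)
    return total


def upper_critical_value(df, alpha):
    """Solve P(chi2_df > c) = alpha by bisection (even df only)."""
    lo, hi = 0.0, 200.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if 1.0 - chi2_cdf_even(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def eff_rrre(p, q, delta2):
    """Efficiency of the restricted estimator relative to the unrestricted one (k = 0)."""
    return p / (p - q + delta2)


def eff_ptrre(p, q, delta2, crit):
    """Efficiency of the preliminary test estimator relative to the unrestricted one (k = 0)."""
    h2 = ncx2_cdf_even(crit, q + 2, delta2)
    h4 = ncx2_cdf_even(crit, q + 4, delta2)
    return p / (p - q * h2 + delta2 * (2.0 * h2 - h4))


def max_min_efficiency(p, q, alpha, grid):
    """Return (E_Max, E_Min, delta2 at the minimum) for a given test size alpha."""
    crit = upper_critical_value(q, alpha)
    e_min, d_min = min((eff_ptrre(p, q, d, crit), d) for d in grid)
    return eff_ptrre(p, q, 0.0, crit), e_min, d_min
```

Under these assumptions the formulas appear to reproduce the $k = 0$ entries at $\Delta^2 = 0$ of Table 1 for $p = 10$, $q = 4$ (1.667 for the restricted and 1.517 for the preliminary test estimator, the latter with $\alpha = 0.05$) and the value $E_{\mathrm{Max}} = 2.3150$ reported in Table 2 for $p = 6$, $q = 4$, $\alpha = 5\%$. Repeating `max_min_efficiency` over a range of $\alpha$ values and picking the one whose minimum meets a target $E_{\mathrm{Min}}$ mimics the rule in (3.26).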
Table 2
Maximum and minimum guaranteed efficiency of PTRRE for $p = 6$, $q = 4$ and selected $\alpha$.

                              Size of the test, $\alpha$
k     Efficiency           5%       10%      15%      20%      25%      30%      50%

0.0   EMax               2.3150   1.9875   1.7750   1.6226   1.5069   1.4158   1.1877
      EMin               0.7590   0.8222   0.8604   0.8876   0.9083   0.9247   0.9667
      ERRRE              0.5591   0.6138   0.6538   0.6804   0.7093   0.7299   0.8127
      ESRRE              1.0783   1.0899   1.0981   1.1035   1.1094   1.1136   1.1303
      EPRRE              1.0910   1.1058   1.1171   1.1249   1.1334   1.1397   1.1654
      $\Delta^2_{Min}$   8.7321   7.7751   7.1770   6.8182   6.4593   6.2201   5.3828

0.1   EMax               2.3099   1.9843   1.7727   1.6209   1.5056   1.4148   1.1873
      EMin               0.7210   0.7914   0.8349   0.8663   0.8905   0.9098   0.9597
      ERRRE              0.5679   0.6243   0.6657   0.6933   0.7232   0.7446   0.8173
      ESRRE              1.0753   1.0867   1.0949   1.1004   1.1063   1.1105   1.1250
      EPRRE              1.0889   1.1042   1.1160   1.1241   1.1331   1.1397   1.1626
      $\Delta^2_{Min}$   8.3732   7.4163   6.8182   6.4593   6.1005   5.8612   5.1435

0.2   EMax               2.2950   1.9746   1.7659   1.6159   1.5018   1.4119   1.1862
      EMin               0.6868   0.7628   0.8109   0.8459   0.8733   0.8953   0.9529
      ERRRE              0.5712   0.6281   0.6697   0.6975   0.7173   0.7492   0.8092
      ESRRE              1.0706   1.0815   1.0895   1.0948   1.0986   1.1047   1.1165
      EPRRE              1.0847   1.1000   1.1119   1.1201   1.1261   1.1360   1.1550
      $\Delta^2_{Min}$   8.1340   7.1770   6.5789   6.2201   5.9809   5.6220   5.0239

0.3   EMax               2.2709   1.9589   1.7549   1.6078   1.4957   1.4071   1.1844
      EMin               0.6563   0.7367   0.7885   0.8268   0.8569   0.8814   0.9462
      ERRRE              0.5754   0.6325   0.6656   0.7023   0.7223   0.7434   0.8147
      ESRRE              1.0654   1.0759   1.0820   1.0888   1.0925   1.0964   1.1100
      EPRRE              1.0802   1.0955   1.1049   1.1157   1.1218   1.1283   1.1510
      $\Delta^2_{Min}$   7.8947   6.9378   6.4593   5.9809   5.7416   5.5024   4.7847

0.4   EMax               2.2385   1.9378   1.7399   1.5967   1.4873   1.4006   1.1820
      EMin               0.6293   0.7131   0.7680   0.8090   0.8416   0.8683   0.9397
      ERRRE              0.5803   0.6300   0.6710   0.6983   0.7178   0.7384   0.8079
      ESRRE              1.0599   1.0686   1.0758   1.0806   1.0841   1.0879   1.1007
      EPRRE              1.0754   1.0885   1.1001   1.1081   1.1140   1.1203   1.1424
      $\Delta^2_{Min}$   7.6555   6.8182   6.2201   5.8612   5.6220   5.3828   4.6651

0.6   EMax               2.1546   1.8823   1.7004   1.5673   1.4649   1.3832   1.1752
      EMin               0.5846   0.6729   0.7324   0.7778   0.8144   0.8448   0.9278
      ERRRE              0.5794   0.6348   0.6666   0.6927   0.7112   0.7307   0.7964
      ESRRE              1.0464   1.0549   1.0599   1.0642   1.0672   1.0705   1.0818
      EPRRE              1.0621   1.0761   1.0848   1.0923   1.0978   1.1037   1.1243
      $\Delta^2_{Min}$   7.4163   6.4593   5.9809   5.6220   5.3828   5.1435   4.4258

0.8   EMax               2.0546   1.8148   1.6516   1.5307   1.4367   1.3612   1.1666
      EMin               0.5507   0.6414   0.7039   0.7524   0.7920   0.8252   0.9176
      ERRRE              0.5816   0.6276   0.6652   0.6900   0.7076   0.7261   0.7770
      ESRRE              1.0331   1.0391   1.0443   1.0479   1.0505   1.0532   1.0612
      EPRRE              1.0489   1.0598   1.0696   1.0764   1.0814   1.0869   1.1024
      $\Delta^2_{Min}$   7.1770   6.3397   5.7416   5.3828   5.1435   4.9043   4.3062
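As a rough illustration of how the criterion behind (3.26) can be applied in practice, the sketch below restricts attention to the seven test sizes tabulated in Table 2 for $k = 0.0$. Given a required minimum efficiency, it keeps the sizes whose guaranteed minimum meets the requirement and returns the one with the largest guaranteed maximum. The helper name `choose_alpha` and the requirement 0.85 are assumptions made for this example only. The exact $\alpha^*$ would require evaluating $E(\alpha, \Delta^2, k)$ itself rather than searching a tabulated grid.

```python
# Rows of Table 2 for k = 0.0, copied as (alpha, E_Max, E_Min).
TABLE2_K0 = [
    (0.05, 2.3150, 0.7590),
    (0.10, 1.9875, 0.8222),
    (0.15, 1.7750, 0.8604),
    (0.20, 1.6226, 0.8876),
    (0.25, 1.5069, 0.9083),
    (0.30, 1.4158, 0.9247),
    (0.50, 1.1877, 0.9667),
]


def choose_alpha(rows, e_min_required):
    """Pick a test size from a tabulated grid (hypothetical helper).

    Keeps the rows whose guaranteed minimum efficiency E_Min is at least
    e_min_required and returns the one with the largest E_Max.
    Returns None when no tabulated size qualifies.
    """
    admissible = [row for row in rows if row[2] >= e_min_required]
    if not admissible:
        return None
    return max(admissible, key=lambda row: row[1])


if __name__ == "__main__":
    # Illustrative requirement: relative efficiency of at least 0.85.
    print(choose_alpha(TABLE2_K0, 0.85))
```

With the requirement 0.85, the sizes from 15% upward are admissible and the 15% row is returned. Because E_Max decreases and E_Min increases with $\alpha$, this is simply the smallest admissible size.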
4. Relative efficiency of the estimators compared to URLE

In this section, we introduce the Liu (1993) estimator for the multicollinearity problem under the probit model. A drawback of the ridge regression estimator is that it is a complicated function of $k$. To overcome this problem, Liu (1993) proposed the following unrestricted Liu estimator (URLE):

$$\tilde{\beta}_{ML}(d) = F_d \tilde{\beta}_{ML}, \qquad F_d = (C + I_p)^{-1}(C + d I_p), \qquad (4.27)$$

where $0 < d < 1$ is a parameter. The advantage of URLE over URRE is that URLE is a linear function of $d$, so it is easier to choose $d$ than to choose $k$ in the URRE. It is known that URLE combines the benefits of the estimators of Hoerl and Kennard (1970) and Stein (1956). For more on the Liu estimator, we refer to Liu (2003), Akdeniz and Kaciranlar (1995), and recently Alheety and Kibria (2009) and the references therein. Following Section 2, the quadratic risk of URLE is obtained as

$$R(\tilde{\beta}_{ML}(d); \beta) = \mathrm{tr}(F_d C^{-1} F_d') + (1 - d)^2 \beta' (C + I_p)^{-2} \beta. \qquad (4.28)$$
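The risk in (4.28) can be evaluated numerically with a minimal sketch. It assumes that $C$ is symmetric positive definite with eigenvalues $\lambda_i$ and writes $\theta = \Gamma'\beta$ for the coordinates of $\beta$ in the eigenvector basis of $C$. This canonical form is an assumption made here for illustration and is not a step taken in the paper. Under it, $F_d$ is diagonal with entries $(\lambda_i + d)/(\lambda_i + 1)$, so that (4.28) reduces to $\sum_i (\lambda_i + d)^2 / [\lambda_i (\lambda_i + 1)^2] + (1 - d)^2 \sum_i \theta_i^2 / (\lambda_i + 1)^2$. The function name `urle_risk` and the numerical inputs are hypothetical.

```python
def urle_risk(eigenvalues, theta, d):
    """Quadratic risk (4.28) of the Liu estimator in canonical coordinates.

    eigenvalues: positive eigenvalues of C (C assumed symmetric positive definite)
    theta:       coordinates of beta in the eigenvector basis of C
    d:           Liu parameter (0 < d < 1 in the paper; d = 1 gives F_d = I)
    """
    # Variance part: tr(F_d C^{-1} F_d') written eigenvalue by eigenvalue.
    variance = sum((lam + d) ** 2 / (lam * (lam + 1.0) ** 2)
                   for lam in eigenvalues)
    # Squared-bias part: (1 - d)^2 * beta' (C + I)^{-2} beta.
    bias_sq = (1.0 - d) ** 2 * sum(t ** 2 / (lam + 1.0) ** 2
                                   for lam, t in zip(eigenvalues, theta))
    return variance + bias_sq


if __name__ == "__main__":
    # Illustrative ill-conditioned spectrum (one eigenvalue close to zero).
    lams = [4.0, 1.0, 0.01]
    theta = [1.0, 1.0, 1.0]
    for d in (0.1, 0.5, 0.9, 1.0):
        print(d, urle_risk(lams, theta, d))
```

Setting $d = 1$ gives $F_d = I_p$, and the risk collapses to $\mathrm{tr}(C^{-1})$, which offers a quick sanity check on the implementation.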
To compare the performance of URRE and URLE, we have plotted their risks in Fig. 3. From Fig. 3, it is clear that neither URRE nor URLE dominates the other uniformly. However, when $d < 0.5$, the URLE dominates URRE, and when $d \ge 0.5$, the URRE dominates URLE. Since the URLE is an alternative to the ridge regression estimator, it is of interest to see the performance of RRRE, PTRRE, SRRE and PRRRE compared to URLE. For brevity, we omit all theoretical comparisons between URLE and RRRE, PTRRE, SRRE and PRRRE. Instead, we illustrate the performance of the estimators relative to URLE numerically. For different $k$, the efficiencies of RRRE, PTRRE, SRRE and PRRRE relative to URLE are presented for $d = 0.5$ and $d = 0.9$ in Tables 3 and 4, respectively. Comparing Tables 3 and 4, we observe that for the larger value of $d$ the proposed RRRE, PTRRE, SRRE and PRRRE perform better relative to URLE. However, for RRRE and PTRRE this conclusion depends on the size of the test and on the departure from the origin. For a given $d$, the efficiency of SRRE and PRRRE increases as the
Fig. 3. Risk behavior of URRE and URLIU estimators for $p = 5$ and different $d$. (Plot: AQRisk on the vertical axis, 0 to 6; $d$ on the horizontal axis, 0.0 to 1.0; legend: URLSE, URRE, URLIU.)
Table 3
Relative efficiency of the estimators compared with URLE for $p = 10$, $q = 6$, $d = 0.5$, $\alpha = 0.05$.

                                                      $\Delta^2$
k     Efficiency   0.000   2.778   5.556   8.333   11.111  13.889  16.667  19.444  22.222  25.000

0.0   ERE          0.948   0.648   0.492   0.397   0.332   0.286   0.251   0.224   0.202   0.183
      EPT          0.863   0.587   0.499   0.478   0.485   0.503   0.522   0.539   0.551   0.559
      ESE          0.711   0.638   0.610   0.595   0.585   0.577   0.572   0.569   0.568   0.568
      EPR          0.761   0.663   0.620   0.600   0.590   0.583   0.578   0.574   0.572   0.570

0.1   ERRRE        1.108   0.766   0.585   0.473   0.397   0.342   0.301   0.268   0.242   0.221
      EPTRRE       1.016   0.700   0.599   0.575   0.585   0.607   0.631   0.651   0.666   0.675
      ESRRE        0.849   0.765   0.733   0.717   0.705   0.696   0.691   0.688   0.687   0.687
      EPRRE        0.909   0.795   0.745   0.723   0.711   0.703   0.698   0.694   0.691   0.689

0.2   ERRRE        1.272   0.888   0.683   0.554   0.467   0.403   0.354   0.316   0.286   0.260
      EPTRRE       1.175   0.820   0.706   0.680   0.692   0.719   0.748   0.773   0.790   0.801
      ESRRE        0.994   0.901   0.865   0.847   0.834   0.825   0.819   0.816   0.815   0.815
      EPRRE        1.065   0.936   0.880   0.854   0.841   0.833   0.827   0.823   0.820   0.818

0.3   ERRRE        1.437   1.015   0.785   0.640   0.540   0.467   0.412   0.368   0.332   0.303
      EPTRRE       1.336   0.946   0.819   0.792   0.807   0.839   0.874   0.902   0.922   0.936
      ESRRE        1.146   1.044   1.005   0.985   0.971   0.961   0.955   0.952   0.951   0.951
      EPRRE        1.228   1.085   1.022   0.994   0.980   0.971   0.965   0.960   0.957   0.955

0.4   ERRRE        1.602   1.145   0.891   0.729   0.617   0.535   0.472   0.422   0.382   0.349
      EPTRRE       1.500   1.077   0.938   0.909   0.928   0.966   1.006   1.039   1.062   1.078
      ESRRE        1.302   1.193   1.152   1.130   1.116   1.105   1.099   1.096   1.095   1.095
      EPRRE        1.395   1.239   1.171   1.140   1.125   1.116   1.110   1.105   1.102   1.100

0.6   ERRRE        1.926   1.408   1.110   0.916   0.780   0.679   0.601   0.539   0.489   0.447
      EPTRRE       1.826   1.348   1.189   1.159   1.187   1.237   1.289   1.331   1.361   1.381
      ESRRE        1.625   1.505   1.460   1.437   1.421   1.410   1.404   1.402   1.402   1.403
      EPRRE        1.740   1.562   1.484   1.449   1.433   1.424   1.418   1.413   1.410   1.408

0.8   ERRRE        2.236   1.672   1.336   1.112   0.952   0.833   0.740   0.666   0.605   0.554
      EPTRRE       2.144   1.626   1.452   1.423   1.462   1.525   1.589   1.641   1.678   1.702
      ESRRE        1.952   1.827   1.780   1.758   1.743   1.732   1.727   1.725   1.726   1.728
      EPRRE        2.087   1.894   1.809   1.772   1.757   1.748   1.743   1.739   1.736   1.735
Table 4
Relative efficiency of the estimators compared with URLE for $p = 10$, $q = 6$, $d = 0.9$, $\alpha = 0.05$.

                                                      $\Delta^2$
k     Efficiency   0.000   2.778   5.556   8.333   11.111  13.889  16.667  19.444  22.222  25.000

0.0   ERE          1.505   1.028   0.781   0.630   0.528   0.454   0.398   0.355   0.320   0.291
      EPT          1.370   0.931   0.792   0.759   0.770   0.798   0.829   0.855   0.874   0.887
      ESE          1.128   1.012   0.968   0.944   0.928   0.916   0.908   0.904   0.902   0.902
      EPR          1.208   1.052   0.984   0.953   0.936   0.925   0.917   0.912   0.908   0.905

0.1   ERRRE        1.759   1.215   0.928   0.751   0.631   0.543   0.477   0.426   0.384   0.350
      EPTRRE       1.613   1.112   0.951   0.913   0.928   0.963   1.002   1.034   1.057   1.072
      ESRRE        1.347   1.214   1.164   1.137   1.119   1.105   1.096   1.092   1.090   1.090
      EPRRE        1.443   1.262   1.183   1.147   1.128   1.116   1.108   1.101   1.097   1.094

0.2   ERRRE        2.019   1.410   1.084   0.880   0.741   0.639   0.563   0.502   0.454   0.413
      EPTRRE       1.864   1.302   1.121   1.080   1.099   1.142   1.188   1.226   1.254   1.272
      ESRRE        1.578   1.430   1.374   1.344   1.324   1.309   1.299   1.295   1.293   1.293
      EPRRE        1.690   1.486   1.397   1.356   1.335   1.322   1.313   1.306   1.301   1.298

0.3   ERRRE        2.280   1.611   1.246   1.016   0.857   0.741   0.653   0.584   0.528   0.481
      EPTRRE       2.121   1.502   1.301   1.256   1.281   1.332   1.387   1.432   1.464   1.485
      ESRRE        1.818   1.657   1.596   1.564   1.542   1.526   1.516   1.511   1.510   1.510
      EPRRE        1.948   1.721   1.622   1.578   1.555   1.541   1.531   1.524   1.519   1.516

0.4   ERRRE        2.542   1.817   1.414   1.157   0.979   0.849   0.749   0.670   0.606   0.554
      EPTRRE       2.380   1.709   1.489   1.443   1.473   1.534   1.597   1.649   1.686   1.710
      ESRRE        2.067   1.894   1.828   1.794   1.771   1.754   1.744   1.739   1.738   1.739
      EPRRE        2.214   1.967   1.858   1.810   1.786   1.771   1.761   1.754   1.749   1.745

0.6   ERRRE        3.057   2.235   1.762   1.454   1.238   1.077   0.954   0.856   0.776   0.710
      EPTRRE       2.898   2.140   1.887   1.839   1.884   1.964   2.045   2.113   2.160   2.191
      ESRRE        2.579   2.388   2.317   2.281   2.256   2.239   2.229   2.225   2.225   2.226
      EPRRE        2.761   2.479   2.355   2.300   2.275   2.260   2.251   2.243   2.238   2.235

0.8   ERRRE        3.550   2.655   2.120   1.765   1.512   1.322   1.174   1.057   0.960   0.880
      EPTRRE       3.403   2.581   2.304   2.259   2.320   2.421   2.522   2.605   2.664   2.701
      ESRRE        3.098   2.899   2.826   2.790   2.766   2.749   2.741   2.739   2.740   2.743
      EPRRE        3.313   3.007   2.871   2.813   2.788   2.775   2.767   2.761   2.756   2.753
value of $k$ increases. We also conclude that the performance of PRRRE, SRRE and PTRRE can be improved by increasing the values of $k$ and $d$.

5. Summary and conclusions

This paper considered five ridge regression (RR) estimators, namely the unrestricted ridge regression estimator (URRE), the restricted ridge regression estimator (RRRE), the preliminary test ridge regression estimator (PTRRE), the shrinkage ridge regression estimator (SRRE) and the positive rule ridge regression estimator (PRRRE), for estimating the parameters $\beta$ of the probit regression model when it is suspected that $\beta$ may belong to a linear subspace defined by $H\beta = h$. The performances of the estimators are compared on the basis of their quadratic bias and risk functions under both the null and the alternative hypotheses, which specify certain restrictions on the regression parameters. Under the restriction $H_0$, the RRRE performs best compared to the other estimators; however, it performs worst when $\Delta^2$ moves away from the origin. Note that the risk of URRE is constant, while the risk of RRRE is unbounded as $\Delta^2$ goes to infinity. Also, under $H_0$, the risk of PTRRE is smaller than the risks of SRRE and PRRRE for some values of $\alpha$. Thus, neither PTRRE nor PRRRE nor SRRE dominates the others uniformly. Note that the application of PRRRE and SRRE is constrained by the requirement $q \ge 3$, while PTRRE needs no such constraint. However, the choice of the level of significance of the test has a dramatic impact on the nature of the risk function of the PTRRE. Thus, when $q \ge 3$, one would use PRRRE; otherwise one uses PTRRE with some optimum size $\alpha$. Finally, we have discussed the performance of the proposed RRRE, PTRRE, SRRE and PRRRE compared to URLE (Liu, 1993), an alternative way of solving the multicollinearity problem. We found that the proposed PTRRE, SRRE and PRRRE performed better for large values of $k$ and $d$.

It is expected that the findings of the paper will be useful for practitioners in various fields.
Acknowledgments

The authors are thankful to the referees for their valuable comments and suggestions, which greatly improved the quality and presentation of the paper. This paper was written while the first author was on sabbatical leave (2010–2011). He is grateful to Florida International University for awarding him the sabbatical leave, which gave him excellent research facilities. This research was supported by the second author's Discovery Grant from NSERC.
References

Akdeniz, F., Kaciranlar, S., 1995. On the almost unbiased generalized Liu estimator and unbiased estimation of the bias and MSE. Communications in Statistics – Theory and Methods 24 (7), 1789–1797.
Alheety, M.A., Kibria, B.M.G., 2009. On the Liu and Almost unbiased estimators in presence of multicollinearity with heteroscedastic or correlated error. Surveys in Mathematics and its Applications 4, 155–168.
Arashi, M., 2009. Preliminary test estimation of the mean vector under balanced loss function. Journal of Statistical Research 43 (2), 55–65.
Arashi, M., Tabatabaey, S.M.M., 2008. Stein-type improvement under stochastic constraints: use of multivariate Student-t model in regression. Statistics & Probability Letters 78 (14), 2142–2153.
Arashi, M., Tabatabaey, S.M.M., 2009. Improved variance estimation under sub-space restriction. Journal of Multivariate Analysis 100 (8), 1752–1760.
Arashi, M., Tabatabaey, S.M.M., 2010a. A note on Stein-type estimators in elliptically contoured models. Journal of Statistical Planning and Inference 140, 1206–1213.
Arashi, M., Tabatabaey, S.M.M., 2010b. Estimation of the location parameter under LINEX loss function: multivariate case. Metrika 72 (1), 51–57.
Bancroft, T.A., 1944. On biases in estimation due to the use of preliminary tests of significance. Annals of Mathematical Statistics 15, 190–204.
Bancroft, T.A., 1964. Analysis and inference for incompletely specified models involving the use of preliminary test(s) of significance. Biometrics 20, 427–442.
Bhattacharyya, S.K., 1997. Monitoring patients with diabetes mellitus: an application of the probit model using managed care claims data. The American Journal of Managed Care 3 (9), 1345–1350.
Bishai, D.M., 1996. Quality time: how parents’ schooling affects child health through its interaction with childcare time in Bangladesh. Health Economics 5, 383–407.
Gana, R., 1995. Ridge regression estimation of the linear probability model. Journal of Applied Statistics 22 (4), 537–539.
Giles, A.J., 1991. Pretesting for linear restrictions in a regression model with spherically symmetric distributions. Journal of Econometrics 50, 377–398.
Gruber, M.H.J., 1998. Improving Efficiency by Shrinkage: The James-Stein and Ridge Regression Estimators. Springer Verlag, New York.
Han, C.-P., Bancroft, T.A., 1968. On pooling means when variance is unknown. Journal of the American Statistical Association 63, 1333–1342.
Hassanzadeh Bashtian, M., Arashi, M., Tabatabaey, S.M.M., 2011a. Using improved estimation strategies to combat multicollinearity. Journal of Statistical Computation and Simulation 81 (12), 1773–1797.
Hassanzadeh Bashtian, M., Arashi, M., Tabatabaey, S.M.M., 2011b. Ridge estimation under the stochastic restriction. Communications in Statistics – Theory and Methods 40, 3711–3747.
Hoerl, A.E., Kennard, R.W., 1970. Ridge regression: biased estimation for non-orthogonal problems. Technometrics 12, 55–67.
Inoue, T., 2001. Improving the "HKB" ordinary type ridge estimators. Journal of the Japan Statistical Society 31 (1), 67–83.
James, W., Stein, C., 1961. Estimation with quadratic loss. In: Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, Berkeley, CA, pp. 361–379.
Judge, G.G., Bock, M.E., 1978. The Statistical Implications of Pre-test and Stein-rule Estimators in Econometrics. North-Holland Publishing Company, Amsterdam.
Kaciranlar, S., Sakallioglu, S., Akdeniz, F., Styan, G.P.H., Werner, H.J., 1999. A new biased estimator in linear regression and a detailed analysis of the widely-analyzed dataset on Portland cement. Sankhya B 61, 443–459.
Kibria, B.M.G., Saleh, A.K.Md.E., 2003. Preliminary test ridge regression estimators with Student’s t errors and conflicting test-statistics. Metrika 59 (2), 105–124.
Kibria, B.M.G., Saleh, A.K.Md.E., 2004. Performance of positive rule estimator in the ill-conditioned Gaussian regression model. Calcutta Statistical Association Bulletin 55, 211–241.
Kibria, B.M.G., Saleh, A.K.Md.E., 2006. Optimum critical value for pre-test estimators. Communications in Statistics – Simulation and Computation 35 (2), 309–319.
Kibria, B.M.G., Saleh, A.K.Md.E., 2011. Performance of some Improved Estimators for Estimating the Parameters of ill-conditioned Probit Regression Models. Unpublished manuscript. Department of Mathematics and Statistics, Florida International University, FL, Miami, USA.
Liu, K.J., 1993. A new class of biased estimate in linear regression. Communications in Statistics – Theory and Methods 22 (2), 393–402.
Liu, K.J., 2003. Using Liu type estimator to combat collinearity. Communications in Statistics – Theory and Methods 32 (5), 1009–1020.
Malthouse, E.C., 1999. Shrinkage estimation and direct marketing scoring model. Journal of Interactive Marketing 13 (4), 10–23.
Ohtani, K., 1993. A comparison of the Stein-rule and positive part Stein-rule estimators in a misspecified linear regression model. Econometric Theory 9, 668–679.
Rao, C.R., 1973. Linear Statistical Inference and its Applications. Wiley, London.
Saleh, A.K.Md.E., 2006. Theory of Preliminary Test and Stein-type Estimation with Applications. John Wiley, New York.
Saleh, A.K.Md.E., Kibria, B.M.G., 1993. Performances of some new preliminary test ridge regression estimators and their properties. Communications in Statistics – Theory and Methods 22, 2747–2764.
Sarker, N., 1992. A new estimator combining the ridge regression and the restricted least squares method of estimation. Communications in Statistics A 21 (7), 1987–2000.
Schaefer, R., 1986. Alternative estimators in logistic regression when the data are collinear. Journal of Statistical Computation and Simulation 25, 75–91.
Shariff, A.A., Zaharim, A., Sopian, K., 2004. The comparison logit and probit regression analyzes in estimating the strength of gear teeth. European Journal of Scientific Research 27 (4), 548–553.
Stein, C., 1956. Inadmissibility of the usual estimator of the mean of a multivariate normal distribution. In: Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, vol. 1, pp. 197–206.