Hindawi Publishing Corporation, Journal of Applied Mathematics, Volume 2014, Article ID 360249, 16 pages, http://dx.doi.org/10.1155/2014/360249

Research Article

New Inference Procedures for Semiparametric Varying-Coefficient Partially Linear Cox Models

Yunbei Ma (1) and Xuan Luo (2)

(1) School of Statistics and Research Center of Statistics, Southwestern University of Finance and Economics, Chengdu, Sichuan 611130, China
(2) Information Engineering University, Zhengzhou, Henan 450001, China

Correspondence should be addressed to Yunbei Ma; [email protected]

Received 14 October 2013; Accepted 19 March 2014; Published 25 May 2014

Academic Editor: Jinyun Yuan

Copyright © 2014 Y. Ma and X. Luo. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

In biomedical research, one major objective is to identify risk factors and study their risk impacts, as this identification can help clinicians both make proper decisions and increase the efficiency of treatments and resource allocation. A two-step penalization-based procedure is proposed to select linear regression coefficients for the linear components and to identify significant nonparametric varying-coefficient functions for semiparametric varying-coefficient partially linear Cox models. It is shown that the resulting penalized estimators of the linear regression coefficients are asymptotically normal and have oracle properties, and that the resulting estimators of the varying-coefficient functions attain the optimal convergence rate. A simulation study and an empirical example are presented for illustration.
1. Introduction

To balance model flexibility and specificity, a variety of semiparametric models have been proposed in survival analysis. For example, Huang [1] studied partially linear Cox models, and Cai et al. [2] and Fan et al. [3] studied the Cox proportional hazards model with varying coefficients. Tian et al. [4] proposed an estimation procedure for the Cox model with time-varying coefficients. Perhaps the most appealing of these is the semiparametric varying-coefficient model; its merits include easy interpretation, a flexible structure, accommodation of potential interactions between covariates, and dimension reduction of the nonparametric components. Considerable effort has recently been devoted to this direction. Cai et al. [2] considered the Cox proportional hazards model with a semiparametric varying-coefficient structure. Yin et al. [5] proposed the semiparametric additive hazards model with varying coefficients.

In biomedical research and clinical trials, one major objective is to identify risk factors and study their risk impacts, as this identification can help clinicians make proper decisions and increase the efficiency of treatment and resource allocation. In spirit, this is essentially a variable selection problem. When the outcomes are censored, selecting significant risk factors becomes more complicated and challenging than in the case where all outcomes are completely observed. Nevertheless, important progress has been made in the recent literature. For instance, Fan and Li [6] extended the nonconcave penalized likelihood approach proposed by Fan and Li [7] to the Cox proportional hazards model and the Cox proportional hazards frailty model. Johnson et al. [8, 9] proposed procedures for selecting variables in semiparametric linear regression models for censored data. However, these papers did not consider variable selection with functional coefficients; and while Wang et al. [10] extended penalized likelihood to the varying-coefficient setting and studied the asymptotic distributions of the estimators, they focused only on the SCAD penalty for completely observed data. Du et al. [11] proposed a penalized variable selection procedure for Cox models with a semiparametric relative risk.
In this paper, we study variable selection for both the regression parameters and the varying coefficients in the semiparametric varying-coefficient additive hazards model

  λ(t | Z, V, W) = λ₀(t) + β^T(W(t)) Z(t) + α^T V(t) + g(W(t)),   (1)

where V(t) is a vector of covariates with a linear effect on the hazard function λ(t), W(t) is the main exposure variable of interest whose effect on the hazard might be nonlinear, and Z(·) = (Z₁(·), ..., Z_p(·))^T is a vector of covariates that may interact with the exposure covariate W(·). Here λ₀(·) is the baseline hazard function, g(·) is an unspecified smooth function, and α is a q-vector of constant parameters. We set g(0) = 0 to ensure that the model is identifiable.

This model covers commonly used basic semiparametric model settings. For instance, if β(·) ≡ 0 and g(·) ≡ 0, model (1) reduces to Lin and Ying's additive hazards model [12], and further to Aalen's additive hazards model when the baseline hazard function λ₀(t) also equals zero [13, 14]. Furthermore, when the exposure covariate W(·) is time t, model (1) reduces to the partly parametric additive hazards model if λ₀(t) ≡ g(t) ≡ 0 [15, 16] and to the time-varying coefficient additive hazards model [17].

To select the significant elements of α and the significant coefficient functions in β(·), we require an estimation procedure for the regression parameters α and β(·). This poses a challenge, however, because it is impossible to obtain root-n consistent penalized estimators of the nonzero components of α by penalizing β(·) and α simultaneously. To achieve this goal, we propose the following two-step estimation procedure for selecting the relevant variables. In Step 1, we estimate α by a profile penalized least squares criterion after locally approximating the nonparametric functions β(·) and g(·). In Step 2, given the penalized estimator of α from Step 1, we develop a penalized least squares criterion for β(·) based on a basis expansion. We will demonstrate that the proposed estimation procedures have oracle properties and that all penalized estimators of the nonzero components achieve their optimal convergence rates.

This paper is organized as follows. Section 2 introduces the SCAD-based variable selection procedure for the parametric components and establishes the asymptotic normality of the resulting penalized estimators. Section 3 discusses a variable selection procedure for the coefficient functions, which are approximated by the B-spline approach. Simulation studies and an application of the proposed methods to a real data example are given in Sections 4 and 5, respectively. Technical lemmas and proofs of the main results are given in the appendices.
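Both steps of the two-step procedure rely on a nonconcave penalty; in what follows the SCAD penalty of Fan and Li [7] plays the central role (the simulations also use the hard thresholding and L₁ penalties). As a reference point, the following short Python sketch, which is ours and not part of the original paper, evaluates the SCAD penalty and its first derivative; the tuning constant a = 3.7 is the conventional default recommended in [7] and is only an assumed choice here.

```python
import numpy as np

def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty p_lambda(|theta|) of Fan and Li (2001), evaluated elementwise."""
    t = np.abs(np.asarray(theta, dtype=float))
    p = np.empty_like(t)
    small = t <= lam
    mid = (t > lam) & (t <= a * lam)
    large = t > a * lam
    p[small] = lam * t[small]
    p[mid] = -(t[mid] ** 2 - 2 * a * lam * t[mid] + lam ** 2) / (2 * (a - 1))
    p[large] = (a + 1) * lam ** 2 / 2
    return p

def scad_derivative(theta, lam, a=3.7):
    """First derivative p'_lambda(|theta|); it vanishes beyond a*lambda, which drives the oracle behavior."""
    t = np.abs(np.asarray(theta, dtype=float))
    return lam * ((t <= lam) + np.maximum(a * lam - t, 0.0) / ((a - 1) * lam) * (t > lam))
```

Only the derivative is needed by the penalized least squares criteria below, because the penalties enter through local quadratic approximations around the current iterate.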
2. Penalized Least Squares Based Variable Selection for Parametric Components

Let T_i denote the potential failure time, C_i the potential censoring time, and X_i = min(T_i, C_i) the observed time for the ith individual, i = 1, ..., n. Let Δ_i be an indicator that equals 1 if X_i is a failure time and 0 otherwise. Let F_{t,i} represent the failure, censoring, and covariate information up to time t for the ith individual. The observed data structure is {X_i, Δ_i, Z_i(t), V_i(t), W_i(t), i = 1, ..., n}. Assume that T and C are conditionally independent given the covariates and that the observation period is [0, τ], where τ is a constant denoting the time of the end of the study.

Let N_i(t) = I(X_i ≤ t, Δ_i = 1) denote the counting process corresponding to T_i and let Y_i(t) = I(X_i ≥ t) be the at-risk indicator. Let the filtration {F_t : t ∈ [0, τ]} be the history up to time t; that is, F_t = σ{X_i ≤ u, N_i(u), Y_i(u), Z_i(u), V_i(u), W_i(u), Δ_i, 0 ≤ u ≤ t, i = 1, ..., n}. Write M_i(t) = N_i(t) − ∫₀^t Y_i(u) λ_i(u) du. Then M_i(t) is a martingale with respect to F_t. For ease of presentation, we drop the dependence of the covariates on time; the methods and proofs in this paper are applicable to external time-dependent covariates [18].

Here we use the technique of local linear fitting [19]. Suppose that β(·) and g(·) are smooth enough to allow the following Taylor expansions: for each w₀ ∈ W, the support of W, and for w in a neighborhood of w₀,

  β(w) ≈ β(w₀) + β′(w₀)(w − w₀),   g(w) ≈ g(w₀) + g′(w₀)(w − w₀).   (2)

Then we can approximate the hazard function given in (1) by

  λ(t | Z_i, V_i, W_i) ≈ λ₀*(t, w₀) + ξ^T(w₀) Z_i*(w₀) + α^T V_i,   (3)

where ξ(w₀) = (β^T(w₀), (β′(w₀))^T, g′(w₀))^T, Z_i*(w₀) = (Z_i^T, Z_i^T(W_i − w₀), (W_i − w₀))^T, and λ₀*(t, w₀) = λ₀(t) + g(w₀). Let H be the (2p + 1)-dimensional diagonal matrix whose first p diagonal elements equal 1 and whose remaining elements equal h.

If α were given, then, using counting-process notation and arguing as in [5], we could obtain an estimator of ξ(w₀) at each w₀ ∈ W by solving the estimating equation whose objective function is

  U_n(ξ, w₀) = Σ_{i=1}^n ∫₀^τ K_h(W_i − w₀) H^{-1} {Z_i*(w₀) − Z̄(t, w₀)} [dN_i(t) − Y_i(t){ξ^T(w₀) Z_i*(w₀) + α^T V_i} dt],   (4)

where

  Z̄(t, w₀) = Σ_{i=1}^n K_h(W_i − w₀) Y_i(t) Z_i*(w₀) / Σ_{i=1}^n K_h(W_i − w₀) Y_i(t),   (5)

and K_h(·) = K(·/h)/h, with K(·) a kernel function and h a bandwidth. To avoid the technicality of tail problems, only data up to a finite time point τ are used, as is common practice.
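To make the local step concrete, the following NumPy sketch solves the estimating equation (4) for ξ(w₀) with α held fixed. It is a schematic illustration written by us rather than the authors' code: it assumes time-fixed covariates, an Epanechnikov kernel, and a crude Riemann-sum approximation of the time integrals on a regular grid, and all function and variable names are ours.

```python
import numpy as np

def epanechnikov(u):
    return 0.75 * np.maximum(1.0 - u ** 2, 0.0)

def local_linear_xi(w0, X, delta, Z, V, W, alpha, h, n_grid=100):
    """
    Solve the kernel-weighted estimating equation (4) for
    xi(w0) = (beta(w0), beta'(w0), g'(w0)) with alpha treated as known.
    Schematic only: time-fixed covariates, Riemann approximation of the integrals.
    """
    n, p = Z.shape
    tau = X.max()
    grid = np.linspace(0.0, tau, n_grid)
    dt = grid[1] - grid[0]

    Kw = epanechnikov((W - w0) / h) / h                                # K_h(W_i - w0)
    Zstar = np.hstack([Z, Z * (W - w0)[:, None], (W - w0)[:, None]])   # Z_i*(w0)
    Hinv = np.diag(np.r_[np.ones(p), np.full(p + 1, 1.0 / h)])         # H^{-1}

    A = np.zeros((2 * p + 1, 2 * p + 1))
    b = np.zeros(2 * p + 1)
    for t in grid:
        wts = Kw * (X >= t)                                            # K_h * Y_i(t)
        if wts.sum() <= 0:
            continue
        Zbar = (wts[:, None] * Zstar).sum(axis=0) / wts.sum()          # \bar{Z}(t, w0)
        D = Zstar - Zbar
        A += Hinv @ (D.T * wts) @ D * dt                               # kernel-weighted design part
        b -= Hinv @ (D.T * wts) @ (V @ alpha) * dt                     # -Y_i(t) V_i^T alpha dt part
    # jump part of dN_i(t): one contribution per observed failure, evaluated at t = X_i
    for i in np.flatnonzero((delta == 1) & (Kw > 0)):
        wts = Kw * (X >= X[i])
        Zbar = (wts[:, None] * Zstar).sum(axis=0) / wts.sum()
        b += Kw[i] * Hinv @ (Zstar[i] - Zbar)
    return np.linalg.solve(A, b)
```

Because the bracketed term in (4) is linear in ξ, the estimating equation reduces to the linear system assembled above; its closed form is exactly the expression given next in (6).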
Denote the solution of U_n(ξ, w₀) = 0 by ξ̂(w₀, α), which can be expressed as

  ξ̂(w₀, α) = [ (1/n) Σ_{i=1}^n ∫₀^τ K_h(W_i − w₀) H^{-1} {Z_i*(w₀) − Z̄(t, w₀)}^{⊗2} Y_i(t) dt ]^{-1}
              × [ (1/n) Σ_{i=1}^n ∫₀^τ K_h(W_i − w₀) H^{-1} {Z_i*(w₀) − Z̄(t, w₀)} {dN_i(t) − Y_i(t) V_i^T α dt} ]
            ≡ P_{n1}^{-1}(w₀) {ψ_{n1}(w₀) − ψ_{n2}(w₀) α}.   (6)

Here and below, a^{⊗j} = 1, a, a a^T for any vector a and for j = 0, 1, 2, respectively. As a consequence, we can estimate β(·) and g(·), presuming that α is known, by

  β̂(W_i, α) = (I_p, 0_{p×p}, 0_p) P_{n1}^{-1}(W_i) {ψ_{n1}(W_i) − ψ_{n2}(W_i) α} ≡ a_{n1}(W_i) − a_{n2}(W_i) α,
  ĝ(W_i, α) = ∫₀^{W_i} (0_{1×p}, 0_{1×p}, 1) P_{n1}^{-1}(w₀) {ψ_{n1}(w₀) − ψ_{n2}(w₀) α} dw₀ ≡ a_{n3}(W_i) − a_{n4}(W_i) α.   (7)

2.1. Variable Selection. Notice that if α is given, then from (7) we can estimate Λ₀(t, W_i) = ∫₀^t λ₀*(u, W_i) du by

  Λ̂₀(t, W_i, α) = Σ_{j=1}^n ∫₀^t [dN_j(u) − Y_j(u){Z_j^T a_{n1}(W_j) + a_{n3}(W_j)} du] / Σ_{j=1}^n Y_j(u)
                  − Σ_{j=1}^n ∫₀^t Y_j(u) [V_j^T − Z_j^T a_{n2}(W_j) − a_{n4}(W_j)] α du / Σ_{j=1}^n Y_j(u)
                ≡ Σ_{j=1}^n ∫₀^t [dN_j(u) − Y_j(u){Ĝ_j + G_j*(W_j) α} du] / Σ_{j=1}^n Y_j(u).   (8)

Given the observed data and based on (7) and (8), β(·), g(·), and the cumulative function Λ₀(·) can be approximately expressed as functions of α. These expressions inspire us to estimate α by minimizing the following objective function with respect to α:

  L_n(α) = (1/2) Σ_{i=1}^n ( ∫₀^τ [ dN_i(t) − Y_i(t) dΛ̂₀(t, W_i, α) − Y_i(t){Z_i^T β̂(W_i, α) + ĝ(W_i, α) + V_i^T α} dt ] )².   (9)

Accordingly, our penalized objective function is defined as

  L_n^P(α) = L_n(α) + n Σ_{j=1}^q p_λ(|α_j|).   (10)

Thus, minimizing L_n^P(α) with respect to α results in a penalized least squares estimator of α. Let a_n = max{p′_{λ_n}(|α_{0j}|) : α_{0j} ≠ 0} and b_n = max{p″_{λ_n}(|α_{0j}|) : α_{0j} ≠ 0}. Now we establish the oracle properties of the resulting estimators. Let α₀ = (α₁₀^T, α₂₀^T)^T be the true value of α, where the first subvector α₁₀, of length s, contains all nonzero elements of α and α₂₀ = 0_{q−s}.

We introduce some additional notation. Let Z̃ = (Z^T, 0_p^T, 0)^T. For j = 0, 1, let μ_j = ∫ x^j K(x) dx, ν_j = ∫ x^j K²(x) dx, and ρ_j(u, z, v, w₀) = P(X ≥ u | Z = z, V = v, W = w₀) λ^j(u | z, v, w₀). For j = 0, 1 and k = 0, 1, denote

  e_{j,k}(t, w₀) = f(w₀) E{ ρ_k(t, Z, V, w₀) Z̃ (Z̃^{⊗j})^T | W = w₀ },

  e_{2,0}(t, w₀) = f(w₀) E{ ρ₀(t, Z, V, w₀) ( Z^{⊗2}    0_{p×p}     0_p
                                              0_{p×p}   μ₂ Z^{⊗2}   μ₂ Z
                                              0_p^T     μ₂ Z^T      μ₂  ) | W = w₀ },   (11)

where f(·) is the density of W. Denote t₁(w) = (I_p, 0_{p×p}, 0_p) P₁^{-1}(w) P₂(w) and t₂(w) = (0_{1×p}, 0_{1×p}, 1) ∫₀^w P₁^{-1}(w₀) P₂(w₀) dw₀, where

  P₁(w₀) = ∫₀^τ { e_{2,0}(t, w₀) − e_{1,0}^{⊗2}(t, w₀) / e_{0,0}(t, w₀) } dt,
  P₂(w₀) = ∫₀^τ { e_{1,1}(t, w₀) − (e_{1,0}(t, w₀) / e_{0,0}(t, w₀)) e_{0,1}(t, w₀) } dt.   (12)
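Because β̂(·, α), ĝ(·, α), and Λ̂₀(·, ·, α) in (7) and (8) are affine in α, the profile criterion (9) is a quadratic function of α, and the penalized problem (10) can be minimized with the local quadratic approximation of Fan and Li [7]. The sketch below is ours, not the authors' implementation: it assumes that profile residual intercepts c_i and slopes d_i (so that the ith integral in (9) equals c_i − d_i^T α) have already been assembled from (7) and (8), and it reuses the SCAD derivative from the earlier sketch.

```python
import numpy as np

def penalized_profile_ls(c, D, lam, penalty_derivative, n_iter=50, eps=1e-6, tol=1e-8):
    """
    Minimize 0.5 * sum_i (c_i - d_i^T alpha)^2 + n * sum_j p_lambda(|alpha_j|)
    by iteratively reweighted ridge steps (local quadratic approximation).
    c: (n,) profile intercepts; D: (n, q) profile slopes; both assumed precomputed from (7)-(8).
    """
    n, q = D.shape
    alpha = np.linalg.lstsq(D, c, rcond=None)[0]                        # unpenalized start
    for _ in range(n_iter):
        w = penalty_derivative(alpha, lam) / (np.abs(alpha) + eps)      # p'_lambda(|a_j|) / |a_j|
        alpha_new = np.linalg.solve(D.T @ D + n * np.diag(w), D.T @ c)  # ridge step of the LQA
        if np.max(np.abs(alpha_new - alpha)) < tol:
            alpha = alpha_new
            break
        alpha = alpha_new
    alpha[np.abs(alpha) < 1e-4] = 0.0                                   # set tiny components to exact zeros
    return alpha
```

Thresholding the final iterate to exact zeros mimics the sparsity of the penalized minimizer; the cutoff 1e-4 is an arbitrary illustration choice, not a recommendation from the paper.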
For j = 0, 1, 2 and k = 0, 1, let r_{j,k}(t) = E[ ρ_k(t, Z, V, W) {V − t₁^T(W) Z − t₂^T(W)}^{⊗j} ], and write G̃_i(W_i) = V_i^T − Z_i^T t₁(W_i) − t₂(W_i), i = 1, ..., n. Let ‖a‖ = (a^T a)^{1/2} for any vector a.

Theorem 1. If Assumptions (A.i)-(A.vi) are satisfied and b_n → 0 as n → ∞, then there exist local minimizers α̂ of L_n^P(α) such that ‖α̂ − α₀‖ = O_p(n^{-1/2} + a_n).

Remark 2. Theorem 1 indicates that, by choosing a proper λ_n that leads to a_n = O(n^{-1/2}), there exists a root-n consistent penalized least squares estimator of α.

Theorem 3. Assume that

  lim inf_{n→∞} lim inf_{θ→0+} p′_{λ_n}(θ) / λ_n > 0.   (13)

Under Assumptions (A.i)-(A.vi), if λ_n → 0 and √n λ_n → ∞ as n → ∞, then the root-n consistent local minimizer α̂ = (α̂₁^T, α̂₂^T)^T in Theorem 1, with probability tending to 1, satisfies:

(1) α̂₂ = 0; and
(2) asymptotic normality:

  √n (I₁(V) + Σ_λ) {α̂₁ − α₁₀ − (I₁(V) + Σ_λ)^{-1} b} →_D N(0, Σ₁(V)),   (14)

where I₁(V) and Σ₁(V) are the first s × s submatrices of I(V) and Σ(V), respectively, and

  b = (p′_{λ_n}(|α₁₀|) sgn(α₁₀), ..., p′_{λ_n}(|α_{s0}|) sgn(α_{s0}))^T,
  Σ_λ = diag(p″_{λ_n}(|α₁₀|), ..., p″_{λ_n}(|α_{s0}|)).   (15)

3. Variable Selection for the Varying Coefficients

When some variables of Z are not relevant in the additive hazards regression model, the corresponding functional coefficients are zero. In this section, using basis function expansion techniques, we try to estimate those zero functional coefficients as identically zero via nonconcave penalty functions. For a simplified expression, we denote g(·) as β₀(·) and set Z_{i0} ≡ 1. Hence, we can rewrite the additive hazards regression model (1) as

  λ(t | Z_i*, V_i, W_i) = λ₀(t) + β*^T(W_i) Z_i*(t) + α^T V_i(t),   (16)

where Z_i* = (1, Z_i^T)^T and β*(W_i) = (β₀(W_i), β^T(W_i))^T. Assume α* is a given root-n consistent estimator of α. From the arguments in Section 2, such an α* is available.

3.1. Penalized Function. We approximate each element of β*(w) by a basis expansion; that is, β_j(w) ≈ Σ_{k=1}^{L_j} γ_{jk} B_{jk}(w) for j = 0, 1, ..., p. This approximation indicates that the hazard function given in (16) can be approximated as

  λ(t | Z_i*, V_i, W_i) ≈ λ₀(t) + Σ_{j=0}^p Z_{ij} { Σ_{k=1}^{L_j} γ_{jk} B_{jk}(W_i) } + α^T V_i.   (17)

Let γ_j = (γ_{j1}, ..., γ_{jL_j})^T and γ = (γ₀^T, γ₁^T, ..., γ_p^T)^T. Write

  B(W_i) = ( B_{01}(W_i) ⋯ B_{0L_0}(W_i)   0 ⋯ 0                      ⋯   0 ⋯ 0
             0 ⋯ 0                        B_{11}(W_i) ⋯ B_{1L_1}(W_i)  ⋯   0 ⋯ 0
             ⋮                            ⋮                            ⋱   ⋮
             0 ⋯ 0                        0 ⋯ 0                        ⋯   B_{p1}(W_i) ⋯ B_{pL_p}(W_i) ),   (18)

and U_i(W_i) = Z_i*^T B(W_i). Then, using arguments similar to those in Section 2, we estimate the baseline cumulative hazard function Λ₀(t) by

  Λ̂₀(t) = Σ_{i=1}^n ∫₀^t [dN_i(u) − Y_i(u){U_i(W_i) γ + α*^T V_i} du] / Σ_{i=1}^n Y_i(u),   (19)

and estimate γ by minimizing the following least squares function with respect to γ:

  L̃_n(γ) = (1/2) Σ_{i=1}^n ( ∫₀^τ [ dN_i(t) − Y_i(t) dΛ̂₀(t) − Y_i(t){U_i(W_i) γ + α*^T V_i} dt ] )².   (20)
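For concreteness, the following sketch, which is ours and assumes SciPy ≥ 1.8 (for scipy.interpolate.BSpline.design_matrix) and a common number L of basis functions for every coefficient (the paper allows different L_j), builds the rows U_i(W_i) = Z_i*^T B(W_i) of the block-diagonal basis matrix in (18) and evaluates the group norms ‖γ_j‖_j = (γ_j^T R_j γ_j)^{1/2} used in the penalty introduced just below.

```python
import numpy as np
from scipy.interpolate import BSpline

def bspline_design(w, n_knots=6, degree=3, w_range=(0.0, 3.0)):
    """Evaluate a cubic B-spline basis with equally spaced knots (n_knots including the boundaries)."""
    lo, hi = w_range
    interior = np.linspace(lo, hi, n_knots)[1:-1]
    knots = np.r_[[lo] * (degree + 1), interior, [hi] * (degree + 1)]
    # BSpline.design_matrix requires SciPy >= 1.8; shape (len(w), len(knots) - degree - 1)
    return BSpline.design_matrix(w, knots, degree).toarray(), knots

def build_U(Zstar, W, n_knots=6, degree=3):
    """Rows U_i(W_i) = Z_i*^T B(W_i) for the block-diagonal basis matrix B(W_i) of (18)."""
    n, p1 = Zstar.shape                              # p1 = p + 1 (intercept column Z_{i0} = 1 included)
    basis, knots = bspline_design(W, n_knots, degree)
    L = basis.shape[1]                               # same number of basis functions for every coefficient
    U = np.zeros((n, p1 * L))
    for j in range(p1):
        U[:, j * L:(j + 1) * L] = Zstar[:, [j]] * basis
    return U, knots, L

def group_norms(gamma, knots, degree, L, n_groups, grid_size=400):
    """||gamma_j||_j = sqrt(gamma_j^T R_j gamma_j), with R_j = int B_j(w) B_j(w)^T dw by the trapezoid rule."""
    grid = np.linspace(knots[degree], knots[-degree - 1], grid_size)
    Bg = BSpline.design_matrix(grid, knots, degree).toarray()
    R = np.trapz(Bg[:, :, None] * Bg[:, None, :], grid, axis=0)   # (L, L) Gram matrix of the basis
    return np.array([np.sqrt(gamma[j * L:(j + 1) * L] @ R @ gamma[j * L:(j + 1) * L])
                     for j in range(n_groups)])
```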
Suppose that β_j = 0, j = d + 1, ..., p, as mentioned before. We try to correctly identify these zero functional coefficients via nonconcave penalty functions, that is, via minimizing the following penalized least squares function with respect to γ:

  pL_n(γ) = L̃_n(γ) + n Σ_{j=0}^p p_λ(‖γ_j‖_j),   (21)

where ‖γ_j‖_j² = γ_j^T R_j γ_j and R_j = (r_{kl}^{(j)})_{L_j×L_j} with r_{kl}^{(j)} = ∫ B_{jk}(w) B_{jl}(w) dw. Let γ̂ be the minimizer of the penalized least squares function (21) and define β̂*(w) = B(w) γ̂.

Remark 4. Various basis systems, including the Fourier basis, polynomial basis, and B-spline basis, can be used in the basis expansion. We focus on the B-spline basis and examine the asymptotic properties of β̂*(w).

3.2. Asymptotic Properties. Define Ū(t) = Σ_{i=1}^n Y_i(t) U_i(W_i) / Σ_{i=1}^n Y_i(t) and let γ̃ be the minimizer of L̃_n(γ) in (20). Hence, we can obtain γ̃ by solving

  U_n(γ) = Σ_{i=1}^n ∫₀^τ {U_i(W_i) − Ū(t)}^T [dN_i(t) − Y_i(t){U_i^T(W_i) γ + α*^T V_i}] dt = 0.   (22)

Here U_n(γ) is the derivative of L̃_n(γ) with respect to γ, that is, the usual score function of γ without the penalty.

For any square integrable functions η(w) and φ(w) on W, let ‖η(w)‖_{L₂} denote the L₂ norm of η(w) on W, and define ‖η(w) − φ(w)‖_{L∞} = sup_{w∈W} |η(w) − φ(w)| as the L∞ distance between η(w) and φ(w). Let G_j denote the collection of all functions of the form Σ_{k=1}^{L_j} γ_{jk} B_{jk}(w), j = 0, 1, ..., p, for the B-spline basis. Let L_max = max_{0≤j≤p} L_j and ρ_n = max_{0≤j≤p} inf_{G∈G_j} ‖β_j − G‖_{L∞}. Denote β̃*(w) = B(w) γ̃. Let c_n* = max{p′_{λ_n}(‖β_j‖_{L₂}) : ‖β_j‖_{L₂} ≠ 0} and d_n* = max{p″_{λ_n}(‖β_j‖_{L₂}) : ‖β_j‖_{L₂} ≠ 0}.

Theorem 5. Suppose that (13) and Assumptions (A.i) and (A.vii)-(A.ix) hold. If λ_n → 0 and λ_n / max{ρ_n, c_n*, (L_max/n)^{1/2}} → ∞ as n → ∞, then one has: (a) β̂_j = 0, j = d + 1, ..., p, with probability approaching 1; and (b) ‖β̂_j − β_j‖_{L₂} = O_p(max{ρ_n, c_n*, (L_max/n)^{1/2}}), j = 0, ..., d.

Remark 6. Let G_j be a space of splines with a degree no less than 1 and with L_j equally spaced interior knots, where L_j ≍ n^{1/5}, j = 0, 1, ..., p. Note that ρ_n = O(L_max^{-2}) [20, Theorem 6.21]. Hence, if c_n* = O(ρ_n), Theorem 5 implies that ‖β̂_j − β_j‖_{L₂} = O_p(n^{-2/5}), j = 0, ..., d.

Remark 7. Suppose that L_j ≍ n^{1/5}, j = 0, 1, ..., p. Because c_n* = max{p′_{λ_n}(‖β_j‖) : ‖β_j‖ ≠ 0}, for the hard thresholding and SCAD penalty functions the convergence rate of the penalized least squares estimator is n^{-2/5}, which is the optimal rate for nonparametric regression [21], provided λ_n → 0. However, for the L₁ penalty, c_n* = λ_n; hence λ_n = O_p(ρ_n) is required to attain the optimal rate. On the other hand, the oracle property in part (b) of Theorem 5 requires λ_n/ρ_n → ∞, which contradicts λ_n = O_p(ρ_n). As a consequence, for the L₁ penalty, the oracle property does not hold.

Next, we demonstrate the asymptotic normality of β̂^{(1)}. First, we analyze β̂^{(1)} = (β̂₀, ..., β̂_d)^T, that is, the penalized estimator of β^{(1)} = (β₀, ..., β_d)^T. Let Z_i^{(1)}, U_i^{(1)}, and Ū^{(1)} denote the selected columns of Z_i, U_i, and Ū, respectively, corresponding to β^{(1)}. Similarly, let B^{(1)}(W_i) denote the selected diagonal blocks of B(W_i), i = 1, ..., n. Define γ^{(1)} = (γ₀^T, γ₁^T, ..., γ_d^T)^T and γ̂^{(1)} = (γ̂₀^T, γ̂₁^T, ..., γ̂_d^T)^T. Then, by part (a) of Theorem 5 and (21), γ̂^{(1)} is a local solution of the restricted penalized criterion pL_n*(γ^{(1)}), whose derivative is

  (1/n) Σ_{i=1}^n ∫₀^τ {U_i^{(1)}(W_i) − Ū^{(1)}(t)}^T [dN_i(t) − Y_i(t){U_i^{(1)}(W_i) γ^{(1)} + α*^T V_i} dt] + Σ_{j=0}^d ∂p_{λ_n}(‖γ_j‖_j)/∂γ^{(1)}.   (23)

Recall that α* is root-n consistent. It follows from (23) that

  0 = (1/n) Σ_{i=1}^n ∫₀^τ {U_i^{(1)}(W_i) − Ū^{(1)}(t)}^T dM_i(t) (1 + O_p(n^{-1/2})) − Σ_{j=0}^d ∂p_{λ_n}(‖γ̂_j‖_j)/∂γ^{(1)}
      − (1/n) Σ_{i=1}^n ∫₀^τ {U_i^{(1)}(W_i) − Ū^{(1)}(t)}^T Y_i(t) {U_i^{(1)}(W_i) γ̂^{(1)} − Z_i^{(1)T} β^{(1)}(W_i)} dt
    = (1/n) Σ_{i=1}^n ∫₀^τ {U_i^{(1)}(W_i) − Ū^{(1)}(t)}^T dM_i(t) (1 + O_p(n^{-1/2})) − Σ_{j=0}^d ∂p_{λ_n}(‖γ̂_j‖_j)/∂γ^{(1)}
      − (1/n) Σ_{i=1}^n ∫₀^τ {U_i^{(1)}(W_i) − Ū^{(1)}(t)}^T Z_i^{(1)T} Y_i(t) {β̂^{(1)}(W_i) − β^{(1)}(W_i)} dt.   (24)

Thus we can obtain the following theorem.
Theorem 8. Suppose that Assumptions (A.i) and (A.vii)-(A.ix) hold, lim_{n→∞} ρ_n = 0, λ_n → 0, and λ_n / max{ρ_n, c_n*, (L_max/n)^{1/2}} → ∞ as n → ∞. For any w ∈ W, let γ̄^{(1)} and β̄^{(1)}(w) be the conditional means of γ̂^{(1)} and β̂^{(1)}(w) given {V_i, Z_i, W_i, i = 1, 2, ..., n}. We then have

  √(n/L_max) { B^{(1)}(w) (Γ₁ + Σ_λ*)^{-1} Ω* (Γ₁ + Σ_λ*)^{-1} (B^{(1)}(w))^T }^{-1/2} { β̂^{(1)}(w) − β̄^{(1)}(w) } →_L N(0, I_{d+1}),   (25)

where Σ_λ* = diag( (L_max/2) (p′_λ(‖β_j‖_{L₂}) / ‖β_j‖_{L₂}) R_j )_{0≤j≤d}, Γ₁ = lim_{n→∞} (L_max/n) Σ_{i=1}^n ∫₀^τ {U_i^{(1)}(W_i) − Ū^{(1)}(t)}^{⊗2} Y_i(t) dt, and Ω* = lim_{n→∞} (L_max/n) Σ_{i=1}^n ∫₀^τ {U_i^{(1)}(W_i) − Ū^{(1)}(t)}^{⊗2} Y_i(t) λ_i(t) dt.

4. Simulation Studies

We carried out two sets of simulation studies to examine the finite-sample properties of the proposed methods. W was generated from a uniform distribution over [0, 3] and λ_n was selected via cross-validation [7]. We set a = 2 + √3 and report sampling properties based on 200 replications. We consider two scenarios.

In scenario (I) we study the performance of the regression coefficient estimates. Without loss of generality, we set p = 1. We generated failure times from the partially linear additive hazards model (1) with β(w) = 1.2 + sin(2w), α = (1.5, 0, 1, 0, 0)^T, g(w) = 0.3w², and λ₀(t) = 0.5. The covariate Z was generated from a uniform distribution over [0, 1]; V₁ was a Bernoulli random variable taking the value 0 or 1 with probability 0.5; V₂ and V₃ were generated from uniform distributions over [0, 1] and [0, 2], respectively; and (V₄, V₅)^T was generated from a normal distribution with mean zero, variance 0.25, and covariance 0.25. We generated the censoring time C from a uniform distribution over [c₀/2, 3c₀/2]. We took c₀ = 0.86 to yield an approximate censoring rate of CR = 20% and c₀ = 0.35 to yield an approximate censoring rate of CR = 30%. We used sample size n = 200. We selected the optimal bandwidth ĥ_opt by the method of [22] and found that its value is approximately 0.203.

We present the average number of zero coefficients in Table 1, in which the column labeled "correct" presents the average restricted only to the true zero coefficients, while the column labeled "incorrect" depicts the average number of coefficients erroneously set to 0. We also report the standard deviations (SD), the average standard errors (SE), and the coverage probabilities (CP) of the 95% confidence intervals for the nonzero parameters in Table 2.

Table 1: Results of the simulation study for scenario (I). Here "correct" is the average true positive rate and "incorrect" is the average true negative rate.

              n = 200, CR = 20%            n = 200, CR = 30%
              SCAD    HARD    L1           SCAD    HARD    L1
  Correct     2.73    2.76    2.67         2.85    2.85    2.85
  Incorrect   0.12    0.11    0.21         0.13    0.13    0.15

Table 2: Simulation results for scenario (I): bias, standard error (SE), standard deviation (SD), and coverage probability (CP).

  n = 200, CR = 20%
              alpha_1                      alpha_3
              SCAD    HARD    L1           SCAD    HARD    L1
  bias        0.0738  0.0729  0.0527       0.0552  0.0475  0.0324
  SE          0.0767  0.0767  0.0757       0.0567  0.0566  0.0558
  SD          0.0635  0.0645  0.0631       0.0495  0.0501  0.0558
  CP          0.96    0.91    0.91         0.90    0.89    0.88

  n = 200, CR = 30%
              alpha_1                      alpha_3
              SCAD    HARD    L1           SCAD    HARD    L1
  bias        0.0726  0.0726  0.0402       0.0480  0.0466  0.0280
  SE          0.0824  0.0824  0.0812       0.0612  0.0612  0.0602
  SD          0.0701  0.0721  0.0726       0.0618  0.0602  0.0613
  CP          0.93    0.89    0.92         0.94    0.90    0.91

As shown in Tables 1 and 2, SCAD, HARD, and L₁ perform well and select about the same correct number of significant variables. The coverage probability (CP) based on SCAD and HARD, however, is better than that based on L₁.

In scenario (II), we focused on the functional coefficients β(·). We set q = 1, g(w) = 0.3w², and α = 1. We generated failure times from the partially linear additive hazards model (1) with β₁(w) = 1.2 + sin(2w), β₃(w) = 0.5w², and β₂(w) = β₄(w) = β₅(w) ≡ 0. The covariates V and Z₁ were each generated from a uniform distribution over [0, 1]; (Z₂, Z₃)^T was generated from a normal distribution with mean (0.5, 0.5)^T, variance (0.25, 0.25)^T, and covariance 0; Z₄ was a Bernoulli random variable taking the value 0 or 1 with probability 0.5; and Z₅ was uniform over [0, 2]. In simulation (II), we took c₀ = 1.24 to yield an approximate censoring rate of CR = 10% and c₀ = 0.45 to yield an approximate censoring rate of CR = 30%. We used sample size n = 200.

The average numbers of zero functional coefficients based on SCAD and HARD are reported in Table 3, and the fitted curves of the nonzero functional coefficients are presented in Figures 1 and 2 based on SCAD and HARD, respectively. Notice that the penalized estimates of the functional coefficients are based on the results of Step 1. Because the oracle property does not hold for the L₁ penalty according to Remark 7, we did not take the L₁ penalty into account. We can see from Table 3 and Figures 1 and 2 that the penalized spline estimation of the functional coefficients that we propose performs well. Furthermore, for the varying-coefficient part, the "correct" numbers of significant variables selected based on the SCAD and HARD penalties are close. Comparing the results between the different censoring rates also shows that our method is extremely robust against the censoring rate.

Table 3: Results from the simulation study for scenario (II). The legend is the same as in Table 1.

              n = 200, CR = 10%            n = 200, CR = 30%
              SCAD    HARD                 SCAD    HARD
  Correct     2.61    2.79                 2.625   2.745
  Incorrect   0.165   0.195                0.48    0.48
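As a concrete illustration of scenario (I), the following sketch (ours, not the authors' code) generates one data set from model (1) under the settings above. Because the covariates are time-fixed and λ₀(t) = 0.5 is constant, the conditional hazard is constant in t, so the failure time is exponential with that rate; the constant c0 governing the censoring distribution is the value quoted above for the 20% censoring rate, and the total hazard is assumed positive, which holds for these settings.

```python
import numpy as np

def simulate_scenario_I(n=200, c0=0.86, seed=1):
    """Generate {X_i, Delta_i, Z_i, V_i, W_i} from model (1) under the scenario (I) settings."""
    rng = np.random.default_rng(seed)
    W = rng.uniform(0.0, 3.0, n)
    Z = rng.uniform(0.0, 1.0, n)
    V = np.column_stack([
        rng.binomial(1, 0.5, n).astype(float),                                  # V1 ~ Bernoulli(0.5)
        rng.uniform(0.0, 1.0, n),                                               # V2 ~ U[0, 1]
        rng.uniform(0.0, 2.0, n),                                               # V3 ~ U[0, 2]
        rng.multivariate_normal([0.0, 0.0], [[0.25, 0.25], [0.25, 0.25]], n),   # (V4, V5)
    ])
    alpha = np.array([1.5, 0.0, 1.0, 0.0, 0.0])
    beta_w = 1.2 + np.sin(2.0 * W)
    g_w = 0.3 * W ** 2
    hazard = 0.5 + beta_w * Z + V @ alpha + g_w      # constant in t, assumed positive here
    T = rng.exponential(1.0 / hazard)                # failure times
    C = rng.uniform(c0 / 2.0, 3.0 * c0 / 2.0, n)     # censoring times
    X = np.minimum(T, C)
    delta = (T <= C).astype(int)
    return X, delta, Z, V, W
```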
5. Real Data Example

In this section, we apply the proposed method to a SUPPORT (Study to Understand Prognoses Preferences Outcomes and Risks of Treatment) dataset [23]. This was a multicenter study designed to examine outcomes and clinical decision making for seriously ill hospitalized patients. One thousand patients suffering from one of eight different diseases, including acute respiratory failure (ARF), chronic obstructive pulmonary disease (COPD), congestive heart failure (CHF), cirrhosis, coma, colon cancer, lung cancer, and multiple organ system failure (MOSF), were followed up to 5.56 years. 672 observations were collected, including age, sex, race, coma score (scoma), number of comorbidities (num.co), charges, average of the therapeutic intervention scoring system (avtiss), white blood cell count (wbc), heart rate (hrt), respiratory rate (resp), temperature (temp), serum bilirubin level (bili), creatinine (crea), serum sodium (sod), and imputed ADL calibrated to surrogate (adlsc). For more details about this study, refer to [23].

We suppose that the binary and multinary risk factors have constant coefficients, which yields a 12-dimensional parameter after transforming the multinary variables into binary ones. Conversely, the other risk factors are supposed to have varying coefficients, that is, a 13-dimensional functional coefficient. Here the main exposure variable W corresponds to age. The censoring rate of the 672 patients is 32.7%. The model we considered is

  λ(t | V_i, Z_i, age_i) = λ₀(t) + Σ_{k=1}^{13} β_k(age_i) Z_{ik} + Σ_{j=1}^{12} α_j V_{ij} + g(age_i),   (26)

where V_{ij}, j = 1, 2, ..., 12, denote the binary variables of the ith patient and Z_{ik}, k = 1, 2, ..., 13, denote the other risk factors of the ith patient. Here, the baseline hazard function λ₀(t) is the failure risk of patients who are white females suffering from MOSF with all other variables equal to zero.

By fitting model (26) with the SCAD penalty, the parameter of "other race" is identified as zero, and the functional coefficients of num.co, charges, wbc, hrt, temp, crea, sod, adlsc, and g(·) are identified as zero functions. The identified significant risk factors and the results for their parametric or functional coefficients are shown in Figure 3 and Table 4. The identified zero parameter of "other race" shows that, besides Asian, black, and Hispanic, "other race" has no significantly different impact on risk in contrast with white people. Table 4 demonstrates that Asians have the lowest failure probability, as do people who are suffering from MOSF. Conversely, coma is the most dangerous among these eight different diseases. Figure 3 shows the fitted coefficients and their 95% pointwise confidence intervals for the five risk factors that were identified as significant.

Table 4: Applied to SUPPORT data. Estimation results of significant binary variables using the SCAD penalty.

  Risk factor      Estimator   Variance (x 10^-3)
  Sex              0.0006      0.0178
  Disease group
    ARF            0.0007      0.0247
    CHF            0.0011      0.0309
    Cirrhosis      0.0018      0.0450
    Colon cancer   0.0034      0.0580
    Lung cancer    0.0042      0.0585
    Coma           0.0074      0.1410
    COPD           0.0017      0.0344
  Race
    Asian          -0.0002     0.0414
    Black          0.0006      0.0232
    Hispanic       0.0008      0.0421

Appendices

In the appendices, we list the assumptions and outline the proofs of the main results. The following assumptions are imposed.

A. Assumptions

(A.i) The density f(w) of W is continuous, has compact support W, and satisfies inf_{w∈W} f(w) > 0.

(A.ii) β(·) and g(·) have continuous bounded second derivatives on W.

(A.iii) The density function K(·) is bounded and symmetric and has a compact bounded support. ∫₀^τ λ₀(t) dt < ∞.
(A.iv) h → 0, nh → ∞, and nh⁴ → 0 as n → ∞.

(A.v) The conditional probability ρ₀(t, z, v, w) is equicontinuous in the arguments (t, w) on the product space [0, τ] × W.

(A.vi) e_{0,0}(t, w) and r_{0,0}(t) are bounded away from zero on the product space [0, τ] × W, P₁(·) and P₂(·) are continuous on W, P₁(w₀) is nonsingular for all w₀ ∈ W, and

  I(V) = ∫₀^τ { r_{2,0}(t) − r_{1,0}^{⊗2}(t) / r_{0,0}(t) } dt,
  Σ(V) = ∫₀^τ [ r_{2,1}(t) + { r_{1,0}(t)/r_{0,0}(t) }^{⊗2} r_{0,1}(t) − 2 { r_{1,0}(t)/r_{0,0}(t) } r_{1,1}(t) ] dt   (A.1)

are positive definite.

(A.vii) The eigenvalues of the matrix E{ ∫₀^τ Z(t) Z^T(t) Y(t) dt } are uniformly bounded away from 0.
Figure 1: Results of the simulation study for scenario (I): the estimated curves of the nonzero functional coefficients β₁(w), β₃(w), and g(w). The solid, dashed, and dotted curves represent the true function, the estimated functions, and the 95% pointwise confidence intervals, respectively. Left panels: HARD penalty; right panels: SCAD penalty.
(A.viii) lim sup_n (max_j L_j / min_j L_j) < ∞.

(A.ix) lim_{n→∞} n^{-1/2} L_max log(L_max) = 0.

B. Technical Lemmas

Write δ_n = (nh)^{-1/2} + h². Before we prove the theorems, we present several lemmas used in the proofs of the main results. Lemma B.1 will be used in the proofs of Theorems 1 and 3. Lemma B.5 establishes the convergence rate for the spline approximation of the nonparametric functions; its proof can be finished in a way similar to that in [24].

Write

  C_{n1}(t, w₀) = n^{-1} Σ_{i=1}^n Y_i(t) φ₁(t, (W_i − w₀)/h, Z_i, V_i, w₀) K_h(W_i − w₀),
  C_{n2}(t, w₀) = n^{-1} Σ_{i=1}^n Y_i(t) φ₂(t, Z_i, V_i, W_i) K_h(W_i − w₀)   (B.1)

for functions φ₁(·, ·, ·, ·, ·) and φ₂(·, ·, ·, ·).
Figure 2: Results of the simulation study for scenario (II). The legend is the same as in Figure 1.
Lemma B.1. Assume that φ₁(t, u, z, v, w) is equicontinuous in its arguments w and t, that φ₂(t, z, v, w) is equicontinuous in its argument t, and that E(φ₁(t, u, Z, V, W) | W = w₀) is equicontinuous in the argument w₀. Under Assumptions (A.i)-(A.v), one has, for each w₀ ∈ W,

  C_{nk}(t, w₀) = C_k(t, w₀) + O_p(δ_n),
  sup_{0≤t≤τ} sup_{w₀∈W} |C_{nk}(t, w₀) − C_k(t, w₀)| →_P 0,   (B.2)

where C_k(t, w₀) = f(w₀) ∫ E{ Y(t) φ_k(t, u, Z, V, w₀) | W = w₀ } K(u) du for k = 1, 2. Its proof can be finished using arguments similar to those in Lemma 1 of [25].

Lemma B.2. Let U_n*(α₀) = −(∂L_n(α)/∂α)|_{α=α₀}. Assume that (13) and Assumptions (A.i)-(A.vi) hold. We then have

  U_n*(α₀) = Σ_{i=1}^n ∫₀^τ [G̃_i(W_i) − Ḡ(t)]^T dM_i(t) + o_p(√n) ≡ √n Ũ_n(α₀) + o_p(√n),   (B.3)

where Ḡ(t) = Σ_{i=1}^n Y_i(t) G̃_i(W_i) / Σ_{i=1}^n Y_i(t).
[Figure 3 appears here: five panels, (a) scoma, (b) avtisst, (c) meanbp, (d) resp, and (e) bili, each plotting the estimated coefficient curve β̂(age) against age (approximately 20 to 80 years).]
Figure 3: Results of the real data example: the estimated curves of the nonzero functional coefficients. The solid and dashed curves represent the estimated curve and the 95% pointwise confidence intervals (×10⁴).
Proof. Because đđâ (đź0 ) = â(đđż đ (đź)/đđź)|đź=đź0 , we have
đ
T
đ=1 0
đđâ (đź0 ) đ
đ
= â ⍠{đđâ (đđ ) â đ (đĄ)}
đ
â
= â ⍠{đđâ (đđ ) â đ (đĄ)}
T
đ=1 0
Ă [đđđ (đĄ) â đđ (đĄ) {đşđđ (đĄ) + đđâ (đđ ) đź} đđĄ]
Ă [đđđ (đĄ) â đđ (đĄ) Ě (đ , đź ) + đĚ (đ , đź ) + đźT đ } đđĄ] Ă {đđT đ˝ đ 0 đ 0 0 đ (B.4)
â
đ
with đ (đĄ) = âđđ=1 đđ (đĄ)đđâ (đđ )/ âđđ=1 đđ (đĄ). By a similar Ě ,đź ) discussion in [5], we can prove that the biases of đ˝(đ¤ 0 0 2 Ě 0 , đź0 ) are the order of đđ (â ) at each point đ¤0 â W, and đ(đ¤ respectively. Furthermore, recall that đđâ (đđ ) = đđT â đđT đđ2 (đđ ) â â1 (đđ )đđ2 (đđ ) and đđ4 (đđ ) with đđ2 (đđ ) = (đźđ , 0đĂđ , 0đ )đđ1
đ
T
Ěđ (đđ ) â đ (đĄ) + đđ (đđ )} đđ (đĄ) â â ⍠{đ đ=1 0
{ â1 Ă { (đđT (đĄ) , 0Tđ+1 ) đđ1 (đđ ) {
đ
â1 (đ¤0 )đđ2 (đ¤0 )đđ¤0 , where đđ4 (đđ ) = âŤ0 đ (0đĂđ , 0đĂđ , 1đ )đđ1
1 đ đ Ă â ⍠đžâ (đđ â đđ ) Hâ1 đ đ=1 0 Ă {đđâ (đĄ, đđ ) â đ (đĄ, đđ )} đđđ (đĄ)
đđ1 (đ¤0 )
+ (0T2đ , ââ1 )
â2 1 đ đ = â ⍠đžâ (đđ â đ¤0 ) {đđâ (đ¤0 ) â đ (đĄ, đ¤0 )} đđ (đĄ) đđĄ đ đ=1 0 đ
= ⍠{ÎŚđ20 (đĄ, đ¤0 ) â 0
đđ
â1 à ⍠đđ1 (đ¤)
ÎŚâ2 đ10 (đĄ, đ¤0 ) } đđĄ, ÎŚđ00 (đĄ, đ¤0 )
0
1 đ đ Ă [ â ⍠đžâ (đđ â đ¤) Hâ1 đ 0 [ đ=1
đđ2 (đ¤0 ) =
1 đ đ â ⍠đž (đ â đ¤0 ) {đđâ (đ¤0 ) â đ (đĄ, đ¤0 )} đđT đđ (đĄ) đđĄ đ đ=1 0 â đ
Ă {đđâ (đ¤) â đ (đĄ, đ¤)} } Ă đđđ (đĄ) ] đđ¤} đđĄ + đđ (âđ) . ] } (B.8)
đ
ÎŚ (đĄ, đ¤0 ) = ⍠{ÎŚđ11 (đĄ, đ¤0 ) â đ10 ÎŚ (đĄ, đ¤0 )} đđĄ, ÎŚđ00 (đĄ, đ¤0 ) đ01 0
(B.5)
with ÎŚđđđ (đĄ, đ¤0 ) = (1/đ) âđđ=1 đžâ (đđ â đ¤0 ){đđâ (đ¤0 )}âđ (đđâđ )T đđ (đĄ) for đ = 0, 1, 2 and đ = 0, 1. It follows from Lemma B.1 that
A further simplification indicates that đđâ (đź0 ) can be expressed as
đ
đđ1 (đ¤0 ) = đ1 (đ¤0 ) + đđ (đđ ) , đđ2 (đ¤0 ) = đ2 (đ¤0 ) + đđ (đđ )
đ
T
Ěđ (đđ ) â đ (đĄ)} đđđ (đĄ) â ⍠{đ (B.6)
đ=1 0
đ
đ
T
Ěđ (đđ ) â đ (đĄ) + đđ (đđ )} đđ (đĄ) â â ⍠{đ đ=1 0
Ă [{đđT , 0Tđ , 1} đ1â1 (đđ ) đđ (đđ )] đđĄ
for each đ¤0 â W. This implies that
(B.9)
+ đđ (âđ) đđ2 (đ¤) = đĄ1 (đ¤) + đđ (đđ ) , đđ4 (đ¤) = đĄ2 (đ¤) + đđ (đđ )
đ
(B.7)
uniformly hold on W. Hence it follows from Assumption (A.iv), the arguments similar to the proof of Theorem 3 in [5] and the martingale central limit that
đđâ (đź0 ) đ
đ
T
Ěđ (đđ ) â đ (đĄ) + đđ (đđ )} đđđ (đĄ) = â ⍠{đ đ=1 0
đ
T
Ěđ (đđ ) â đ (đĄ)} đđđ (đĄ) + đđ (âđ) . = â ⍠{đ đ=1 0
Thus (B.3) holds.

Lemma B.3. Assume that (13) and Assumptions (A.i)-(A.vi) hold. If λ_n → 0 and √n λ_n → ∞ as n → ∞, then, with probability tending to 1, for any given α₁ satisfying ‖α₁ − α₁₀‖ = O_p(n^{-1/2}) and any given constant C,

  L_n^P{(α₁^T, 0_{q−s}^T)^T} = min_{‖α₂‖ ≤ C n^{-1/2}} L_n^P{(α₁^T, α₂^T)^T}.   (B.10)
Proof. It suffices to prove that, for any given đź1 satisfying âđź1 â đź10 â = đđ (đâ1/2 ), any given constant đś > 0, and each đźđ , đ = đ + 1, . . . , đ, đđżâđ (đź) < 0 if â đśđâ1/2 < đźđ < 0, đđźđ đđżâđ (đź) > 0 if 0 < đźđ < đśđâ1/2 đđźđ
(B.12)
between đ1 and đ2 , and consequently âđđ=1 âŤ0 {Uđ (đđ ) â U(đĄ)}â2 đđ (đĄ)đđĄ, are invertible. Proof. According to Lemmas A.1 and A.2 of [24], except in an event whose probability tends to zero, â âđ,đ đžđđ đľđđ (đ¤)â2đż â
Î đ (đź0 ) T â2 1 đ đ Ě â ⍠[{đđ (đđ ) â đ (đĄ)} ] đđ (đĄ) đ 0 (đĄ) đđĄ + đđ (đđ ) đ đ=1 0
In the rest of this paper, we denote đâ1 âđđ=1 âŤ0 {Uđ (đđ ) â U(đĄ)}â2 đđ (đĄ)đđĄ by Îđ . Let 1 đ đ đžĚ = Îâ1 â ⍠{Uđ (đđ ) â U (đĄ)} đ đ đ=1 0
+đđ (đâ1/2 )} đđĄ,
T â2
0
â
⥠Π(đź0 ) + đđ (đđ ) (B.13) is positive-definite with probability tending to 1 from Assumption (A.vi) and Lemma B.1. Hence for đź â đź0 = Ěđ = đđ (1), we have đđ (đâ1/2 ), by the fact of đ
Lemma B.5. Suppose Assumptions (A.ii), (A.viii), and (A.ix) hold. We then have đđ â 0, âđđđ â â as đ â â, 2 â Ě â â đ˝âĚ â = đ (đż /đ), âđ˝Ě â đ˝â â = đ (đ ), and âđ˝ đ
= âđÎ â (đź0 ) (đź â đź0 ) + đđ (1) , (B.14) and then
đ
đż2
max
Ě â â đ˝â â = đ (đż /đ + đ2 ). âđ˝ đ max đż2 đ
Ěđ {1 + đđ (1)} + âđÎ â (đź0 ) (đź â đź0 ) {1 + đđ (1)} = âđ
đ
Proof. By Assumptions (A.ii) and (A.viii), đđ = đ(đżâ2 max ) [20, Theorem 6.27]. Hence, it follows from Assumption (A.iv) that đđ â 0 and âđđđ â â as đ â â. By the triangle Ě â â đ˝â â ⤠âđ˝ Ě â â đ˝âĚ â + âđ˝Ě â â đ˝â â . Note inequality, âđ˝ đż2 đż2 đż2 that đžĚ â đžĚ
đđżâđ (đź) đđźđ đđż đ (đź) óľ¨ óľ¨ + đđđó¸ đ (óľ¨óľ¨óľ¨óľ¨đźđ óľ¨óľ¨óľ¨óľ¨) sgn (đźđ ) đđźđ
= {đđ (
â â and đ˝Ě (đ¤) = B(đ¤)đž.Ě Then đžĚ and đ˝Ě (đ¤) are the mean of Ě â (đ¤) conditioning on {đ , đ , đ , đ = 1, 2, . . . , đ}, đžĚ and đ˝ đ đ đ respectively.
đż2 2
1 đđż đ (đź) âđ đđź
(B.17)
Ă {đ 0 (đĄ) + đđ (đĄ) đ˝âT (đđ ) đđâ
Ěđ (đđ ) â đ (đĄ)} ] đ (đĄ) đ 0 (đĄ) đđĄ) + đđ (đđ ) = đ¸ (⍠[{đ
=
2
đ
|đž|2 /đż max . Note that đžT [âđđ=1 âŤ0 {Uđ (đđ ) â U(đĄ)}â2 đđ (đĄ)đđĄ]đž â â âđ,đ đžđđ đľđđ (đ¤)â2đż by Assumption (A.ii). Hence, Lemma B.4 2 holds. đ
By similar algebra with the proof of Lemma B.2, we know that
đ
(B.16) đ
1 T + (đź â đź0 ) âđÎ (đź0 ) (đź â đź0 ) {1 + đđ (1)} . 2
=
â2 đż max đ đ â ⍠{Uđ (đđ ) â U (đĄ)} đđ (đĄ) đđĄ đ đ=1 0
(B.11)
hold with probability tending to 1. Notice that đđâ (đź0 ) = â(đđż đ (đź)/đđź)|đź=đź0 . Then for each đź in a neighborhood of đź0 from Lemma B.2 we have 1 {đż (đź) â đż đ (đź0 )} âđ đ Ěđ (đź0 ) (đź â đź0 ) {1 + đđ (1)} = âđ
Lemma B.4. Suppose Assumptions (A.i) and (A.vii)â(A.ix) hold, then there are positive constants đ1 and đ2 such that, except in an event whose probability tends to zero, all the eigenvalues of
(B.15)
đâ1/2 óľ¨óľ¨ óľ¨óľ¨ ó¸ óľ¨ óľ¨ ) + đâ1 đ đđ đ (óľ¨óľ¨đźđ óľ¨óľ¨) sgn (đźđ )} đđ đ . đđ
As a result, it follows from (13) and đ đ â 0, âđđ đ â â as đ â â that sgn(đđżâđ (đź)/đđźđ ) = sgn(đźđ ) for each đź â đź0 = đđ (đâ1/2 ), which indicates that (B.11) holds with probability tending to 1.
1 đ đ = Îâ1 â ⍠{Uđ (đđ ) â U (đĄ)} {đđđ (đĄ) + đđ (đâ1/2 )} . đ đ đ=1 0 (B.18) đ
đ Thus, according to Îâ1 đ (1/đ) âđ=1 âŤ0 {Uđ (đđ ) â U(đĄ)}đđđ (đĄ) = đđ (đż2max /đ), which can be similarly proved by Lemma A.4 in 2 Ě â â đ˝âĚ â â |Ě [24], we have âđ˝ đž â đž|Ě 2 /đż = đ (đż /đ). On đż2
max
đ
max
the other hand, by the properties of B-spline basis functions, â we can prove that âđ˝Ě â đ˝â âđż 2 = đđ (đđ ) by the similar argument of Lemma A.7 in [24]. Thus, Lemma B.5 holds.
Lemma B.6. Suppose (13) and Assumptions (A.i) and (A.vii)â (A.iv) hold. If đđ â 0 and đ đ /đđ â â, đđâ â 0 as đ â â, Ě â â đ˝Ě â â = đ (đâ + (đ đ )1/2 + (đż /đ)1/2 ). we then have âđ˝ đ
đż2
đ
đ đ
max
Proof. Using the properties of B-spline basis functions (Section A.2 of [24]), we have
We first examine the second term on the right-hand side of (B.23). A direct calculation with the Taylor expansion and using đđ đ (0) = 0 yields that đ
óľŠ óľŠ óľŠ óľŠ â {đđ đ (óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ ) â đđ đ (óľŠóľŠóľŠđžĚ đ óľŠóľŠóľŠđ )}
đ=0
đ
óľŠ2 óľŠóľŠ Ě óľŠóľŠĚ óľŠóľŠ2 óľŠóľŠđ˝đ â đ˝đĚ óľŠóľŠóľŠ = óľŠóľŠóľŠóľŠđžĚđ â đžĚ đ óľŠóľŠóľŠóľŠ2đ â đżâ1 đ óľŠ óľŠđžđ â đžĚ đ óľŠóľŠ , óľŠđż 2 óľŠ
(B.19)
óľŠ óľŠ óľŠ óľŠ = â [{đđ đ (óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ ) â đđ đ (óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠđż 2 )} đ=0
óľŠ óľŠ óľŠ óľŠ â {đđ đ (óľŠóľŠóľŠđžĚ đ óľŠóľŠóľŠđ ) â đđ đ (óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠđż 2 )}]
which we sum over đ to obtain
đ
óľŠ óľŠ óľŠ óľŠ óľŠ óľŠ âĽ â [{đđó¸ đ (óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠđż 2 ) (óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ â óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠđż 2 ) đ=0
đ
óľŠóľŠ Ěâ óľŠ2 óľŠóľŠ2 óľŠóľŠĚ óľŠóľŠđ˝ â đ˝âĚ óľŠóľŠóľŠ = â óľŠóľŠóľŠóľŠđžĚđ â đžĚ đ óľŠóľŠóľŠóľŠ2đ â đżâ1 max óľŠ óľŠđž â đžĚ óľŠóľŠ . óľŠ óľŠđż 2
óľŠ óľŠ óľŠ óľŠ óľŠ óľŠ âđđó¸ đ (óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠđż 2 ) (óľŠóľŠóľŠđžĚ đ óľŠóľŠóľŠđ â óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠđż 2 )}
(B.20)
đ=0
1 óľŠ óľŠ + đđó¸ ó¸ đ (óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠđż 2 ) 2
Let đżđ = đđâ + (đ đ đđ )1/2 + (đż max /đ)1/2 . It suffices to show that, for any given đ > 0, there exists a large enough đś such that
óľŠ óľŠ óľŠ óľŠ 2 óľŠ óľŠ óľŠ óľŠ 2 Ă {(óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ â óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠđż 2 ) â (óľŠóľŠóľŠđžĚ đ óľŠóľŠóľŠđ â óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠđż 2 ) } Ă (1 + đđ (1)) ]
Ě âĽ 1 â đ, Pr { inf đđż đ (đžĚ + đż1/2 max đżđ u) > đđż đ (đž)} âuâ=đś
(B.21)
đ
óľŠ óľŠ â â đđ đ (óľŠóľŠóľŠđžĚ đ óľŠóľŠóľŠđ ) ⼠â đđâ đđ (đżđ ) âuâ đ=đ+1
which implies đđż đ (â
) can reach a local minimum in the ball {đžĚ + đż1/2 max đżđ u : âuâ ⤠đś} with a probability of at least 1 â đ. Thus, there exists a local minimizer satisfying âĚđž â Ě = đžĚ + đż1/2 đžâĚ = đđ (đż1/2 max đżđ ). Let đž max đżđ u; then by the Taylor expansion and the definition of đžĚ we have
đđż đ (Ěđž) â đđż đ (đž)Ě =
đ T (Ěđž â đžĚ) Îđ (Ěđž â đžĚ) {1 + đđ (1)} 2 đ T + (Ěđž â đž)Ě Îđ (Ěđž â đž)Ě {1 + đđ (1)} 2 đ
óľŠ óľŠ óľŠ óľŠ + đ â {đđ đ (óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ ) â đđ đ (óľŠóľŠóľŠđžĚ đ óľŠóľŠóľŠđ )} . đ=0
(B.22)
It follows from Lemmas B.4 and B.5 that đđż đ (Ěđž) â đđż đ (đž)Ě 2
1/2 đż = đ(đżđ + ( max ) ) âuâ2 đđ (1) {1 + đđ (1)} đ đ
óľŠ óľŠ óľŠ óľŠ + đ â {đđ đ (óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ ) â đđ đ (óľŠóľŠóľŠđžĚ đ óľŠóľŠóľŠđ )} . đ=0
(B.23)
â đđâ đđ (đżđ2 + đżđ đđ ) âuâ2 â đđó¸ đ (0) đđ (đđ ) . (B.24) Here, đđó¸ đ (0) = limđâ0 đđó¸ đ (đ). Thus, combining that with (B.23) yields 1 Ě {đđż đ (Ěđž) â đđż đ (đž)} đ 2
⼠(đżđ + (
đż max 1/2 ) ) âuâ2 đđ (1) đ
(B.25)
â đđâ đđ (đżđ ) âuâ â đđâ đđ (đżđ2 + đżđ đđ ) âuâ2 â đđó¸ đ (0) đđ (đđ ) . Notice that đżđ = đđâ + (đ đ đđ )1/2 + (đż max /đ)1/2 and đđâ â 0 as đ â â. Then by choosing a sufficiently large đś, the first term dominates the second and the third terms on the right-hand side of (B.25). On the other hand, according to (13), we have (đżđ + (đż max /đ)1/2 )2 /đđó¸ đ (0)đđ = đđ (1 + (đż max + đđđâ )/đ đ đđ ). Hence, by choosing a sufficiently large đś, (B.21) holds. This completes the proof of Lemma B.6.
C. Proof of Theorem 1 Let đžđ = đâ1/2 + đđ . It is sufficient to prove that for any given đ > 0, there exists a large enough constant đś such that đ { inf đżâđ (đź0 + đžđ u) > đżâđ (đź0 )} ⼠1 â đ, âuâ=đś
(C.1)
which means đżâđ (â
) can reach a local minimum in the ball {đź0 + đžđ u : âuâ ⤠đś} with a probability of at least 1 â đ. Thus, there exists a local minimizer satisfying âĚ đź â đź0 â = đđ (đžđ ). Now we are going to prove (C.1). Note that đżâđ (đź0 + đžđ u) â đżâđ (đź0 ) ⼠đż đ (đź0 + đžđ u) â đż đ (đź0 ) đ
Then by some regular calculations we can obtain Ě â đź0 đź â2 1 đ đ = â { â ⍠(đđâ (đđ ) â đ) đđ (đĄ) đđĄ đ đ=1 0
(C.2)
+ diag ({đđó¸ ó¸ đ
óľ¨ óľ¨ óľ¨ óľ¨ + đ â [đđ đ (óľ¨óľ¨óľ¨óľ¨đź0đ + đžđ đ˘đ óľ¨óľ¨óľ¨óľ¨) â đđ đ (óľ¨óľ¨óľ¨óľ¨đź0đ óľ¨óľ¨óľ¨óľ¨)] . đ=1
đđâ (đź0 )
Because = â(đđż đ (đź)/đđź)|đź=đź0 , it similarly follows from (A.iv) that 1 {đż (đź + đžđ u) â đż đ (đź0 )} âđ đ 0 Ěđ (đź0 ) đžđ u {1 + đđ (1)} âĽđ
(C.3)
1 + uT âđÎ âđ (đź0 ) u {1 + đđ (1)} đžđ2 . 2 Hence, we can obtain that
â
{
â2 1 đ đ â â ⍠{đ (đđ ) â đ} đđ (đĄ) đđĄ đ đ=1 0 đ
ÎĽâ2 (đĄ) = ⍠{ÎĽđ20 (đĄ) â đ10 } đđĄ + đđ (1) , ÎĽđ00 (đĄ) 0
â2 1 đ đ â â ⍠{đ (đđ ) â đ} đđ (đĄ) đđĄ = I (đ) + đđ (1) . đ đ=1 0 đ
óľ¨ óľ¨ (óľ¨óľ¨óľ¨óľ¨đź0đ óľ¨óľ¨óľ¨óľ¨) sgn (đź0đ ) đ˘đ
óľ¨ óľ¨ +đžđ2 đđó¸ ó¸ đ (óľ¨óľ¨óľ¨óľ¨đź0đ óľ¨óľ¨óľ¨óľ¨) sgn (đź0đ ) đ˘đ2 (1 + đđ (1))} . (C.4) Ěđ (đź0 ) = đđ (1), Î â (đź0 ) = đđ (1). Hence, by Notice that đ đ Ěđ (đź0 )đžđ u{1+đđ (1)} choosing a large enough constant đś, ââđđ T â 2 + (1/2)u đÎ đ (đź0 )u{1 + đđ (1)}đžđ > 0 uniformly in âđ˘â = đś. Using the discussion similar to that in [7], the last two terms on the right-hand side of (C.4) are bounded by đđ (đžđ đđ đś + đžđ2 đđ đś2 ). It then follows from limđ â â đđ = 0 that when đś is sufficiently large, đżâđ (đź0 + đžđ u) â đżâđ (đź0 ) > 0 uniformly holds in âđ˘â = đś. Hence, (C.1) holds.
D. Proof of Theorem 3 It follows from Lemma B.3 that part (1) holds. Now we are going to prove part (2). From part (1) we see that the local Ě = (Ě đźT1 , 0Tđâđ )T and minimizer of đżâđ (đź) has the form of đź Ě âđź0 = đđ (đâ1/2 ). It follows from Taylorâs expansion satisfies đź that đđżâđ (đź) óľ¨óľ¨óľ¨óľ¨ óľ¨óľ¨ đđź óľ¨óľ¨óľ¨đźĚ =
đđâ
T
đź) + đ diag (b (Ě
(D.3)
Ěđ (đđ )}âđ đđ (đĄ | đđ , đđ , đđ )đđ (đĄ) for where ÎĽđđđ (đĄ) = (1/đ) âđđ=1 {đ đ = 0, 1, 2, đ = 0, 1. Lemma B.1 and Assumptions (A.v) and (A.vi) yield
1 + uT đÎ âđ (đź0 ) u {1 + đđ (1)} đžđ2 2 +
(D.2)
1 Ě đ (đź ) + diag (bT , 0Tđâđ ) + đđ (đâ1/2 )} . âđ đ 0
đ
Ěđ (đź0 ) đžđ u {1 + đđ (1)} ⼠ââđđ
đ â {đžđ đđó¸ đ đ=1
â1
It follows from the proofs of Lemmas B.2 and B.3 that
đżâđ (đź0 + đžđ u) â đżâđ (đź0 )
đ
T óľ¨ óľ¨ (óľ¨óľ¨óľ¨đź01 óľ¨óľ¨óľ¨)}1,...,đ , 0Tđâđ ) + đđ (1) }
, 0Tđâđ )
T óľ¨ óľ¨ đź â đź0 ) . + đ {diag [{đđó¸ ó¸ đ (óľ¨óľ¨óľ¨đź01 óľ¨óľ¨óľ¨)}1,...,đ , 0Tđâđ ] + đđ (1)} (Ě
(D.1)
(D.4)
Ěđ (đź0 ). Using the martingale central limits Next we consider đ Ěđ (đź0 ) is asymptotically normally theorem, we can prove that đ Ěđ (đź0 )]â2 . distributed with a mean of zero and covariance đ¸[đ Thus we obtain that Ěđ (đź0 )] đ¸[đ
â2 đ
= lim ⍠[ÎĽđ21 (đĄ) + { đââ 0
â2
ÎĽđ10 (đĄ) â2 } ÎĽđ01 (đĄ) ÎĽđ00 (đĄ)
(D.5)
ÎĽđ10 (đĄ) ÎĽ (đĄ) ] đđĄ = ÎŁ (đ) . ÎĽđ00 (đĄ) đ11
Hence, by using Slutskyâs Theorem, it follows from (D.2)â (D.5) that â1 âđ {Ě đź1 â đź10 â (I1 (đ) + Î) b} đˇ
â1
â1
(D.6)
ół¨â đ (0, (I1 (đ) + Î) ÎŁ1 (đ) (I1 (đ) + Î) ) .
E. Proof of Theorem 5 To prove part (a) of Theorem 5, we use proof by contradiction. Suppose that for an đ sufficiently large there exists a constant đ, such that with a probability of at least đ there exists a đ0 > đ such that đ˝Ěđ0 ≠ 0. Then âĚđžđ0 âđ = âđ˝Ěđ0 âđż > 0. Let đžĚâ be 0
2
a vector constructed by replacing đžĚđ0 with 0 in đžĚ. Then by the definition of đžĚ, đđż đ (Ěđž) â đđż đ (Ěđžâ )
T (1) 1 đ đ (1) â ⍠{Uđ (đđ ) â U (đĄ)} đđđ (đĄ) (1 + đđ (đâ1/2 )) đ đ=1 0
= {đż đ (Ěđž) â đż đ (Ěđž)} â {đż đ (Ěđžâ ) â đż đ (Ěđž)} óľŠ óľŠ + đđđ đ (óľŠóľŠóľŠóľŠđžĚđ0 óľŠóľŠóľŠóľŠđ )
(E.1)
0
T
It then follows from Lemmas B.4âB.6 that the first term on the right-hand side of (E.1) is an order of đđ (đ (max{(đ đ đđ )1/2 , đđâ , (đż max /đ)1/2 })2 ). Notice that đđ đ (âĚđžđ0 âđ ) 0
= đđó¸ đ (0)âĚđžđ0 âđ (1 + đđ (1)) = đđó¸ đ (0) đđ (max{(đ đ đđ )1/2 , đđâ , 0
(đż max /đ)1/2 }). Hence,
đđż đ (Ěđž) â đđż đ (Ěđžâ ) = đđđ ((max {(đ đ đđ )
đż , đđâ , ( max )
1/2
đ
1/2
+ đđđó¸ đ (0) đđ (max {(đ đ đđ ) 1/2
{(đ đ đđ ) { = {đđ (Max {
, đđâ , (
2
}) )
đż max 1/2 ) }) đ
1/2
, đđâ , (đż Max /đ) đđ
}
Ă đđđ (max {(đ đ đđ )
đż , đđâ , ( max ) đ
1/2
}) đ đ .
F. Proof of Theorem 8
đ
(1)
By using similar quadratic approximation in [10], we have óľŠ óľŠ đđ (óľŠóľŠóľŠđžđ óľŠóľŠóľŠđ ) óľŠ óľŠ đđó¸ (óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠđż 2 ) 1 óľŠóľŠ óľŠóľŠ â đđ (óľŠóľŠđ˝đ óľŠóľŠđż 2 ) + óľŠ óľŠ óľŠ óľŠ óľŠ óľŠ , 2 óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠ (óľŠóľŠóľŠđžđ óľŠóľŠóľŠ â óľŠóľŠóľŠđ˝đ óľŠóľŠóľŠ ) đż2 đ đż2 óľŠ óľŠ óľŠ óľŠ đ đđđ (óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ ) đđđ (óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ ) â đ¸ { }] â[ đđž(1) đđž(1) đ=0 (F.3) óľŠóľŠ ó¸ óľŠ óľŠ { 1 đđ (óľŠóľŠđ˝đ óľŠóľŠđż 2 ) } = diag { Rđ } 2 óľŠóľŠóľŠóľŠđ˝đ óľŠóľŠóľŠóľŠđż 2 }0â¤đâ¤đ {
(đđ ) â đ˝(1) (đđ )} đđĄ
(1)
(Ě đž
(1)
â1/2
đż
(F.4)
â đž ) ół¨â đ (0, 1) đ
holds for any vector đđ with dimension âđ=0 đż đ and components not all 0. Then for any (đ + 1)-dimension vector đđ whose components are not all 0 and for any given đ¤ in W, choosing đđâ = (B(1) (đ¤))T đđ yields â1 â1 âT (1) âđđżâ1 (đ¤) (Î1 + Îâ ) Ίâ (Î1 + Îâ ) đ {đđ B â1/2
(F.5)
đż Ě (1) (đ¤) â đ˝(1) (đ¤)} ół¨â Ă đđâT {đ˝ đ (0, 1) ,
}
đ
Ă
đđT
T
which implies (25) holds.
(1) 1 + â ⍠(U(1) (đĄ)) đđ(1)đ đđ (đĄ) đ (đđ ) â U đ đ=1 0
â
{đ˝
â2 (1) 1 đ đ + â ⍠{U(1) (đĄ)} đđ (đĄ) {Ěđž(1) â đž(1) } đđĄ. đ (đđ ) â U đ đ=1 0 (F.2)
Ă (B(1) (đ¤)) đđâ }
Note that đđž(1)
}]
T â â1 â â â1 âđđżâ1 đ {đđ (Î1 + Î ) Ί (Î1 + Î ) đđ }
Because đ đ â 0 and đ đ / max{đđ , đđâ , (đż đ /đ)1/2 } â â as đ â â, (13) implies that đđż đ (Ěđž) â đđż đ (Ěđžâ ) > 0 with probability tending to 1. This contradicts the fact that đđż đ (Ěđž)â đđż đ (Ěđžâ ) ⤠0. Thus, part (a) holds. Lemmas B.5 and B.6 yield the proof of part (b).
đ=0
đđž(1)
â đž(1) â đž(1) ) (1 + đđ (1)) . = đżâ1 đ Î (Ě
(E.2)
óľŠ óľŠ đđđ (óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ )
óľŠ óľŠ đđđ (óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ )
Finally, by using the arguments similar to the proof of Theorem 4.1 of [26] and the proof of [24], (F.2) and the martingale central theory, we know that 1/2
đ
đđž(1)
â đ¸{
Ă (Ěđž(1) â đž(1) ) (1 + đđ (1))
)
} +đđó¸ đ (0) đâ1 đ } }
đ¸ {â
óľŠ óľŠ đđđ (óľŠóľŠóľŠđžĚđ óľŠóľŠóľŠđ )
đ=0
óľŠ óľŠ Ă (1 + đđ (1)) + đđđ đ (óľŠóľŠóľŠóľŠđžĚđ0 óľŠóľŠóľŠóľŠđ ) . 0
1/2
đ
= â[
= đ {(Ěđž â đžĚ) Îđ (Ěđž â đžĚ) â (Ěđžâ â đžĚ) Îđ (Ěđžâ â đžĚ)} T
is the conditional mean of (24) given all observed covariates {(đđ , đđ , đđ )}1â¤đâ¤đ and đđ (đĄ), đ = 1, . . . , đ. It follows from (24) that the expression given in (F.1) equals zero. Hence, together with (24), we have
T
(F.1)
Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments

Ma's research was supported by the National Natural Science Foundation of China (NSFC) Grant no. 11301424 and the Fundamental Research Funds for the Central Universities Grant no. JBK120405.
References

[1] J. Huang, "Efficient estimation of the partly linear additive Cox model," The Annals of Statistics, vol. 27, no. 5, pp. 1536-1563, 1999.
[2] J. Cai, J. Fan, J. Jiang, and H. Zhou, "Partially linear hazard regression for multivariate survival data," Journal of the American Statistical Association, vol. 102, no. 478, pp. 538-551, 2007.
[3] J. Fan, H. Lin, and Y. Zhou, "Local partial-likelihood estimation for lifetime data," The Annals of Statistics, vol. 34, no. 1, pp. 290-325, 2006.
[4] L. Tian, D. Zucker, and L. J. Wei, "On the Cox model with time-varying regression coefficients," Journal of the American Statistical Association, vol. 100, no. 469, pp. 172-183, 2005.
[5] G. S. Yin, H. Li, and D. L. Zeng, "Partially linear additive hazards regression with varying coefficients," Journal of the American Statistical Association, vol. 103, no. 483, pp. 1200-1213, 2008.
[6] J. Fan and R. Li, "Variable selection for Cox's proportional hazards model and frailty model," The Annals of Statistics, vol. 30, no. 1, pp. 74-99, 2002.
[7] J. Fan and R. Li, "Variable selection via nonconcave penalized likelihood and its oracle properties," Journal of the American Statistical Association, vol. 96, no. 456, pp. 1348-1360, 2001.
[8] B. A. Johnson, D. Y. Lin, and D. Zeng, "Penalized estimating functions and variable selection in semiparametric regression models," Journal of the American Statistical Association, vol. 103, no. 482, pp. 672-680, 2008.
[9] B. A. Johnson, "Variable selection in semiparametric linear regression with censored data," Journal of the Royal Statistical Society B: Statistical Methodology, vol. 70, no. 2, pp. 351-370, 2008.
[10] L. Wang, H. Li, and J. Z. Huang, "Variable selection in nonparametric varying-coefficient models for analysis of repeated measurements," Journal of the American Statistical Association, vol. 103, no. 484, pp. 1556-1569, 2008.
[11] P. Du, S. Ma, and H. Liang, "Penalized variable selection procedure for Cox models with semiparametric relative risk," The Annals of Statistics, vol. 38, no. 4, pp. 2092-2117, 2010.
[12] D. Y. Lin and Z. Ying, "Semiparametric analysis of the additive risk model," Biometrika, vol. 81, no. 1, pp. 61-71, 1994.
[13] F. W. Huffer and I. W. McKeague, "Weighted least squares estimation for Aalen's additive risk model," Journal of the American Statistical Association, vol. 86, no. 413, pp. 114-129, 1991.
[14] O. O. Aalen, "A linear regression model for the analysis of life times," Statistics in Medicine, vol. 8, no. 8, pp. 907-925, 1989.
[15] T. H. Scheike, "The additive nonparametric and semiparametric Aalen model as the rate function for a counting process," Lifetime Data Analysis, vol. 8, no. 3, pp. 247-262, 2002.
[16] I. W. McKeague and P. D. Sasieni, "A partly parametric additive risk model," Biometrika, vol. 81, no. 3, pp. 501-514, 1994.
[17] H. Li, G. Yin, and Y. Zhou, "Local likelihood with time-varying additive hazards model," The Canadian Journal of Statistics, vol. 35, no. 2, pp. 321-337, 2007.
[18] J. D. Kalbfleisch and R. L. Prentice, The Statistical Analysis of Failure Time Data, John Wiley & Sons, Hoboken, NJ, USA, 2nd edition, 2002.
[19] J. Fan and I. Gijbels, Local Polynomial Modelling and Its Applications: Monographs on Statistics and Applied Probability, Chapman & Hall, London, UK, 1996.
[20] L. L. Schumaker, Spline Functions: Basic Theory, Cambridge University Press, Cambridge, UK, 3rd edition, 2007.
[21] C. J. Stone, "Optimal global rates of convergence for nonparametric regression," The Annals of Statistics, vol. 10, no. 4, pp. 1040-1053, 1982.
[22] J. Fan and T. Huang, "Profile likelihood inferences on semiparametric varying-coefficient partially linear models," Bernoulli, vol. 11, no. 6, pp. 1031-1057, 2005.
[23] W. A. Knaus, F. E. Harrell Jr., J. Lynn et al., "The SUPPORT prognostic model. Objective estimates of survival for seriously ill hospitalized adults," Annals of Internal Medicine, vol. 122, no. 3, pp. 191-203, 1995.
[24] J. Z. Huang, C. O. Wu, and L. Zhou, "Polynomial spline estimation and inference for varying coefficient models with longitudinal data," Statistica Sinica, vol. 14, no. 3, pp. 763-788, 2004.
[25] J. Cai, J. Fan, H. Zhou, and Y. Zhou, "Hazard models with varying coefficients for multivariate failure time data," The Annals of Statistics, vol. 35, no. 1, pp. 324-354, 2007.
[26] J. Z. Huang, "Local asymptotics for polynomial spline regression," The Annals of Statistics, vol. 31, no. 5, pp. 1600-1635, 2003.