The Canadian Journal of Statistics Vol. 42, No. 2, 2014, Pages 285–307 La revue canadienne de statistique
Reweighting estimators for the additive hazards model with missing covariates

Meiling HAO¹, Xinyuan SONG² and Liuquan SUN¹*

¹ Institute of Applied Mathematics, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, P.R. China
² Department of Statistics, The Chinese University of Hong Kong, Hong Kong, P.R. China

Key words and phrases: Additive hazards model; missing at random; missing covariate; survival data; weighted estimating equation.

MSC 2010: Primary 62N01; secondary 62N02

Abstract: Missing covariate data are common in biomedical studies. In this paper, we propose a reweighting method to estimate the regression parameters in the additive hazards regression model when some of the covariates are missing at random. The resulting estimators are shown to be consistent and asymptotically normal. In addition, a lack-of-fit test is presented to assess the adequacy of the model with missing covariates. Simulation studies show that the proposed reweighting estimators perform well, and are more efficient than the weighted estimators for the missing covariate effects when the selection probability is small. An application to the mouse leukaemia data is provided. The Canadian Journal of Statistics 42: 285–307; 2014 © 2014 Statistical Society of Canada

Résumé: Data with missing covariate values are common in biomedical studies. In this article, the authors propose a reweighting method to estimate the coefficients of the additive hazards regression model when some covariates are missing at random. They show that the resulting estimators are consistent and asymptotically normal. In addition, they present a goodness-of-fit test for the model with missing covariates. Simulations show that the proposed reweighting estimators perform well and are more efficient than the weighted estimators for the covariate with missing values when the selection probability is small. Data on leukaemia in mice are used to illustrate the method. La revue canadienne de statistique 42: 285–307; 2014 © 2014 Société statistique du Canada
* Author to whom correspondence may be addressed.

1. INTRODUCTION

For survival data, the additive hazards (AH) model is an important alternative to the proportional hazards model (Cox, 1972) for studying the association between risk factors and failure time (Cox & Oakes, 1984; Lin & Ying, 1994). The AH model specifies that the hazard function $\lambda(t \mid Z)$ for the failure time associated with a $p$-vector of possibly time-dependent covariates $Z(\cdot)$ takes the form

$$\lambda(t \mid Z) = \lambda_0(t) + \beta_0^T Z(t), \tag{1}$$

where $\lambda_0(t)$ is an unspecified baseline hazard function and $\beta_0$ is a $p$-vector of unknown regression parameters. The AH model provides a characterization of the covariate effects different from the proportional hazards model, and has some remarkable features that are not shared by the latter. In particular, model (1) pertains to the risk difference, which is especially relevant and informative in
epidemiological and clinical studies. Like the Cox model, the AH model also has sound biological and empirical bases (Breslow & Day, 1987). When the covariates are fully observed, Lin & Ying (1994) proposed a pseudoscore approach for estimating β0 without specifying the form of λ0(t) in model (1).

In many applications, however, some components of the covariate vector Z(·) are not observed for some individuals, either by accident or by study design. Examples can be found in Little & Rubin (2002) and Tsiatis (2006). A naive way of handling such a situation is the complete-case analysis, which discards all the subjects with missing covariates. It is well known that the complete-case analysis may not only be inefficient but may also yield biased estimators when the missing data mechanism depends on the observed data, including outcome variables and observed covariates (Lipsitz, Ibrahim, & Zhao, 1999; Little & Rubin, 2002; Qi, Wang, & Prentice, 2005). To reduce bias and increase efficiency, it is necessary to develop methods that incorporate the partially incomplete data into the analysis.

When the missing data mechanism depends on the observed data but not on the missing data, it is termed missing at random (MAR) (Little & Rubin, 2002). A number of statistical methods have been proposed for the proportional hazards model with missing covariates under the MAR assumption (e.g., Paik & Tsai, 1997; Chen & Little, 1999; Wang & Chen, 2001; Qi, Wang, & Prentice, 2005; Xu et al., 2009; Luo, Tsai, & Xu, 2009). By contrast, research on estimation methods for the AH model with missing covariates is limited. For example, Kulich & Lin (2000) proposed an estimating method for the AH model under the case-cohort design, where covariates are measured only on the cases and a subcohort randomly selected from the entire cohort. Recently, Lin (2011) suggested both simple and augmented weighted estimators for the regression parameters of the AH model under the MAR assumption.
However, the simple weighted estimators are unstable when the selection or nonmissingness probability is small. In particular, our simulation studies indicate that the variance estimators of the simple weighted estimators underestimate the true variances, and the empirical coverage probabilities of nominal 95% confidence intervals are below the nominal level when the selection probability is small.

In this article, we propose a reweighting procedure for the AH model with missing covariates under the MAR assumption. We provide both the simple reweighting and the augmented reweighting estimators, where the selection probability and the conditional distribution of missing covariates given the observed data are modelled parametrically. The resulting estimators have closed forms and are easy to implement. Both the simple reweighting and the augmented reweighting estimators are consistent and asymptotically normal. The augmented reweighting estimators are more efficient than the simple reweighting estimators. Also, the proposed estimators for the missing covariate effects are more efficient than the simple and augmented weighted estimators of Lin (2011), especially when the selection probability is small.

The remainder of the article is organized as follows. Section 2 presents the simple reweighting estimators and their asymptotic properties. Section 3 develops the augmented reweighting estimators and their asymptotic properties. In Section 4, we propose a test to assess the adequacy of the AH model with missing covariates. Section 5 reports simulation results that show the proposed estimators perform well. In Section 6, our methods are applied to the mouse leukaemia data. Some concluding remarks are made in Section 7 and technical proofs are relegated to the Appendix.

2. SIMPLE REWEIGHTING ESTIMATORS

Let T be the failure time, and C be the censoring time that is assumed to be conditionally independent of T given Z(·).
Write $X = \min(T, C)$ and $\delta = I(T \le C)$, where $I(\cdot)$ is the indicator function. Suppose that $Z(\cdot)$ is predictable and can be partitioned as $Z(\cdot) = (Z_m^T(\cdot), Z_o^T(\cdot))^T$, where $Z_o(\cdot)$ denotes the covariates that are always observed and $Z_m(\cdot)$ denotes the covariates that are possibly missing. Let the selection indicator $R = 1$ if $Z_m(\cdot)$ is observed, and $R = 0$ if $Z_m(\cdot)$ is missing. The missing data mechanism is determined by the distribution of $R$ given $(X, \delta, Z(\cdot))$, which is Bernoulli with probability $\Pr\{R = 1 \mid X, \delta, Z(\cdot)\}$. Under the MAR assumption, we have

$$\Pr\{R = 1 \mid X, \delta, Z(\cdot)\} = \Pr\{R = 1 \mid X, \delta, Z_o(\cdot)\} \equiv \pi(X, \delta, Z_o(\cdot)).$$

For a random sample of $n$ subjects, let $\{X_i, \delta_i, Z_i(\cdot), R_i\}$ $(i = 1, 2, \ldots, n)$ be independent and identically distributed copies of $\{X, \delta, Z(\cdot), R\}$, where $Z_i(\cdot) = (Z_{mi}^T(\cdot), Z_{oi}^T(\cdot))^T$. The observed data consist of $\{X_i, \delta_i, Z_{oi}(\cdot), R_i Z_{mi}(\cdot), R_i\}$ $(i = 1, 2, \ldots, n)$. Define $N_i(t) = I(X_i \le t, \delta_i = 1)$ and $Y_i(t) = I(X_i \ge t)$. Let $\pi_i(X_i, \delta_i) = \pi(X_i, \delta_i, Z_{oi}(\cdot))$. Then the simple weighted estimating function of Lin (2011) is

$$U_{SW}(\beta) = \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)} \int_0^\tau \{Z_i(t) - \bar Z_{SW}(t)\}\{dN_i(t) - \beta^T Z_i(t) Y_i(t)\,dt\},$$

where $\tau$ is the maximum follow-up time, $\bar Z_{SW}(t) = S_{SW}^{(1)}(t)/S_{SW}^{(0)}(t)$, and

$$S_{SW}^{(k)}(t) = n^{-1} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)} Y_i(t) Z_i(t)^{\otimes k} \quad \text{for } k = 0, 1$$
with $a^{\otimes 0} = 1$, $a^{\otimes 1} = a$, $a^{\otimes 2} = aa^T$ for any vector $a$. The above estimating function is the inverse probability weighted version of the pseudoscore function of Lin & Ying (1994) restricted to the subjects with $R_i = 1$. The simple weighted estimator, denoted by $\hat\beta_{SW}$, is defined as the solution to the estimating equation $U_{SW}(\beta) = 0$.

Note that the weight function $1/\pi_i(X_i, \delta_i)$ used in $U_{SW}(\beta)$ can become arbitrarily large if $\pi_i(X_i, \delta_i)$ is small. This may cause the simple weighted estimator to be unstable. To reduce the overweighting problem in the inverse probability weighted method, inspired by the reweighting techniques of Xu et al. (2009) and Luo, Tsai, & Xu (2009), we suggest a reweighting estimation approach for the AH model under the MAR assumption. Here we use $\pi_i(t, 1)$ as the imposed simpler selection probability for the risk set at time $t$, where $\pi_i(t, 1)$ has the same functional form as $\pi_i(X_i, \delta_i)$. Thus, the weight for each complete observation in the same risk set becomes $\pi_i(t, 1)/\pi_i(X_i, \delta_i)$ $(i = 1, \ldots, n)$, which can handle some cases with small $\pi_i(X_i, \delta_i)$.

Let $\Lambda_0(t) = \int_0^t \lambda_0(u)\,du$ denote the baseline cumulative hazard function, and define

$$M_i(t) = N_i(t) - \int_0^t Y_i(u)\{d\Lambda_0(u) + \beta_0^T Z_i(u)\,du\},$$

which is a martingale process (Lin & Ying, 1994). We first assume that the selection probability $\pi(X, \delta)$ is known. Following the idea of the generalized estimating equation method (Liang & Zeger, 1986) and using the reweighted functions $\pi_i(t, 1)/\pi_i(X_i, \delta_i)$, we specify the two following estimating equations for $\Lambda_0(t)$ and $\beta_0$:

$$\sum_{i=1}^n \frac{R_i \pi_i(t, 1)}{\pi_i(X_i, \delta_i)} \left[dN_i(t) - Y_i(t)\{d\Lambda_0(t) + \beta^T Z_i(t)\,dt\}\right] = 0, \quad 0 \le t \le \tau \tag{2}$$

and

$$\sum_{i=1}^n \int_0^\tau \frac{R_i \pi_i(t, 1)}{\pi_i(X_i, \delta_i)} Z_i(t) \left[dN_i(t) - Y_i(t)\{d\Lambda_0(t) + \beta^T Z_i(t)\,dt\}\right] = 0. \tag{3}$$
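The overweighting issue described above can be illustrated numerically. The sketch below is only an illustration: it borrows the logistic selection model of simulation setting (a) in Section 5, and the subject values (`x_i`, `delta_i`, `zo_i`) and the helper names `expit` and `pi_sel` are hypothetical choices made for this example.

```python
import math


def expit(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def pi_sel(x: float, delta: int, zo: float) -> float:
    """Assumed selection probability with logit 2x - delta - 3*zo (setting (a))."""
    return expit(2.0 * x - delta - 3.0 * zo)


# Hypothetical complete case: an event at X = 0.1 with observed covariate Zo = 0.9.
x_i, delta_i, zo_i = 0.1, 1, 0.9
p_i = pi_sel(x_i, delta_i, zo_i)

# Weight used by the simple weighted (inverse probability) estimator.
ipw_weight = 1.0 / p_i

# Reweighting ratio pi_i(t, 1) / pi_i(X_i, delta_i) at a few risk-set times t <= X_i.
times = [0.0, 0.05, 0.1]
rw_weights = [pi_sel(t, 1, zo_i) / p_i for t in times]

print(f"inverse probability weight: {ipw_weight:.2f}")
print("reweighting ratios:", [round(w, 3) for w in rw_weights])
```

For this subject the inverse probability weight is large because the selection probability is small, whereas the reweighting ratios stay at or below one; other subjects and other selection models may behave differently.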
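The resulting simple reweighting estimators for $\beta_0$ and $\Lambda_0(t)$ have the following closed forms:

$$\hat\beta_{SR} = \left[\sum_{i=1}^n \int_0^\tau \frac{R_i \pi_i(t, 1)}{\pi_i(X_i, \delta_i)} \{Z_i(t) - \bar Z_{SR}(t)\}^{\otimes 2} Y_i(t)\,dt\right]^{-1} \sum_{i=1}^n \int_0^\tau \frac{R_i \pi_i(t, 1)}{\pi_i(X_i, \delta_i)} \{Z_i(t) - \bar Z_{SR}(t)\}\,dN_i(t),$$

and

$$\hat\Lambda_0(t) = \frac{1}{n} \sum_{i=1}^n \int_0^t \frac{R_i \pi_i(u, 1)}{\pi_i(X_i, \delta_i) S_{SR}^{(0)}(u)} \left\{dN_i(u) - Y_i(u) \hat\beta_{SR}^T Z_i(u)\,du\right\},$$

where $\bar Z_{SR}(t) = S_{SR}^{(1)}(t)/S_{SR}^{(0)}(t)$, and

$$S_{SR}^{(k)}(t) = \frac{1}{n} \sum_{i=1}^n \frac{R_i \pi_i(t, 1)}{\pi_i(X_i, \delta_i)} Y_i(t) Z_i(t)^{\otimes k} \quad (k = 0, 1, 2).$$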
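For readers who want the intermediate step, the closed forms follow from equations (2) and (3) by profiling out the baseline. The worked equations below are a sketch of that argument; the shorthand $w_i(t)$ is introduced here only for compactness and is not notation from the paper.

```latex
% Shorthand (introduced for this sketch): w_i(t) = R_i \pi_i(t,1) / \pi_i(X_i,\delta_i)
\begin{align*}
% Step 1: solve (2) for the baseline increment at a fixed beta
d\hat\Lambda_0(t;\beta)
  &= \frac{\sum_{i=1}^n w_i(t)\{dN_i(t) - Y_i(t)\beta^T Z_i(t)\,dt\}}
          {\sum_{j=1}^n w_j(t) Y_j(t)}
   = \frac{1}{n}\sum_{i=1}^n \frac{w_i(t)}{S_{SR}^{(0)}(t)}
     \{dN_i(t) - Y_i(t)\beta^T Z_i(t)\,dt\}, \\
% Step 2: substitute into (3), using \sum_i w_i(t) Y_i(t) Z_i(t) = n S_{SR}^{(1)}(t)
0 &= \sum_{i=1}^n \int_0^\tau w_i(t)\{Z_i(t) - \bar Z_{SR}(t)\}
     \{dN_i(t) - Y_i(t)\beta^T Z_i(t)\,dt\}, \\
% Step 3: centring identity, since \sum_i w_i(t) Y_i(t)\{Z_i(t) - \bar Z_{SR}(t)\} = 0
\sum_{i=1}^n w_i(t) Y_i(t)\{Z_i(t) - \bar Z_{SR}(t)\} Z_i(t)^T
  &= \sum_{i=1}^n w_i(t) Y_i(t)\{Z_i(t) - \bar Z_{SR}(t)\}^{\otimes 2}.
\end{align*}
```

Step 2 is linear in $\beta$, and with the identity in Step 3 its solution is the stated $\hat\beta_{SR}$; evaluating Step 1 at $\hat\beta_{SR}$ and integrating over $[0, t]$ gives $\hat\Lambda_0(t)$.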
To state the asymptotic properties, let $s^{(k)}(t) = E\{S_{SR}^{(k)}(t)\}$ for $k = 0, 1, 2$, and $\bar z(t) = s^{(1)}(t)/s^{(0)}(t)$. Define

$$C_i(t) = \int_0^t \frac{\pi_i(u, 1)}{s^{(0)}(u)}\,dM_i(u), \qquad B_i = \int_0^\tau \pi_i(t, 1)\{Z_i(t) - \bar z(t)\}\,dM_i(t),$$

and

$$A = E\left[\int_0^\tau \pi_i(t, 1)\{Z_i(t) - \bar z(t)\}^{\otimes 2} Y_i(t)\,dt\right].$$
The asymptotic properties of $\hat\beta_{SR}$ and $\hat\Lambda_0(t)$ are summarized in the following theorem with the proof given in the Appendix.

Theorem 1. Under the regularity conditions (C1)–(C3) in the Appendix, we have

(i) $\hat\beta_{SR}$ is consistent and $n^{1/2}\{\hat\beta_{SR} - \beta_0\}$ is asymptotically normal with mean zero and covariance matrix $A^{-1} \Sigma_{SR} A^{-1}$, where $\Sigma_{SR} = E\{\pi_i(X_i, \delta_i)^{-1} B_i^{\otimes 2}\}$.

(ii) $\hat\Lambda_0(t)$ converges in probability to $\Lambda_0(t)$ uniformly in $t \in [0, \tau]$, and $n^{1/2}\{\hat\Lambda_0(t) - \Lambda_0(t)\}$ converges weakly in $[0, \tau]$ to a zero-mean Gaussian process with covariance function at $(s, t)$ equal to

$$\Gamma_{SR}(s, t) = H(s, t) + h(s)^T A^{-1} \Sigma_{SR} A^{-1} h(t) - h(s)^T A^{-1} g(t) - h(t)^T A^{-1} g(s),$$

where $h(t) = \int_0^t \bar z(u)\,du$, $g(t) = E\{\pi_i(X_i, \delta_i)^{-1} C_i(t) B_i\}$, and

$$H(s, t) = E\left\{\pi_i(X_i, \delta_i)^{-1} C_i(s) C_i(t)\right\}.$$
Let

$$\hat M_i(t) = N_i(t) - \int_0^t Y_i(u)\{d\hat\Lambda_0(u) + \hat\beta_{SR}^T Z_i(u)\,du\}, \qquad \hat B_i = \int_0^\tau \pi_i(t, 1)\{Z_i(t) - \bar Z_{SR}(t)\}\,d\hat M_i(t),$$

and

$$\hat A = \frac{1}{n} \sum_{i=1}^n \int_0^\tau \frac{R_i \pi_i(t, 1)}{\pi_i(X_i, \delta_i)} \{Z_i(t) - \bar Z_{SR}(t)\}^{\otimes 2} Y_i(t)\,dt.$$

Then $A$ and $\Sigma_{SR}$ can be consistently estimated by $\hat A$ and $\hat\Sigma_{SR}$, respectively, where $\hat\Sigma_{SR} = n^{-1} \sum_{i=1}^n R_i \pi_i(X_i, \delta_i)^{-2} \hat B_i^{\otimes 2}$. Similarly, $\Gamma_{SR}(s, t)$ can be consistently estimated by $\hat\Gamma_{SR}(s, t)$, where

$$\hat\Gamma_{SR}(s, t) = \hat H(s, t) + \hat h(s)^T \hat A^{-1} \hat\Sigma_{SR} \hat A^{-1} \hat h(t) - \hat h(s)^T \hat A^{-1} \hat g(t) - \hat h(t)^T \hat A^{-1} \hat g(s),$$

$$\hat H(s, t) = \frac{1}{n} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)^2} \int_0^s \frac{\pi_i(u, 1)}{S_{SR}^{(0)}(u)}\,d\hat M_i(u) \int_0^t \frac{\pi_i(u, 1)}{S_{SR}^{(0)}(u)}\,d\hat M_i(u),$$

$$\hat g(t) = \frac{1}{n} \sum_{i=1}^n \frac{R_i \hat B_i}{\pi_i(X_i, \delta_i)^2} \int_0^t \frac{\pi_i(u, 1)}{S_{SR}^{(0)}(u)}\,d\hat M_i(u),$$

and $\hat h(t) = \int_0^t \bar Z_{SR}(u)\,du$.

In many applications, however, the selection probability $\pi(X, \delta)$ is unknown. Here we assume that $\pi(X, \delta)$ can be parametrically modelled as $\pi(W; \alpha_0)$, where $W = (X, \delta, Z_o(\cdot))$, and $\alpha_0$ is the true parameter value. For example, since $R$ is binary, a logistic model is often used for $\pi(W; \alpha_0)$, but other parametric models can also be easily accommodated. A consistent estimator $\hat\alpha$ can be obtained by maximizing the likelihood based on the observed data. Let $\pi_i(X_i, \delta_i; \alpha) = \pi(W_i; \alpha)$ for any $\alpha$, where $W_i = (X_i, \delta_i, Z_{oi}(\cdot))$. Replacing $\pi_i(X_i, \delta_i)$ and $\pi_i(t, 1)$ by $\pi_i(X_i, \delta_i; \hat\alpha)$ and $\pi_i(t, 1; \hat\alpha)$ in (2) and (3), respectively, we obtain the simple reweighting estimators for $\beta_0$ and $\Lambda_0(t)$ with estimated weights. The resulting simple reweighting estimators are denoted by $\hat\beta_{SR}(\hat\alpha)$ and $\hat\Lambda_0(t; \hat\alpha)$, and their asymptotic properties are given in the following theorem.

Theorem 2. Under the regularity conditions (C1)–(C4) in the Appendix, we have

(i) $\hat\beta_{SR}(\hat\alpha)$ is consistent and $n^{1/2}\{\hat\beta_{SR}(\hat\alpha) - \beta_0\}$ is asymptotically normal with mean zero and covariance matrix $A^{-1}[\Sigma_{SR} - D \Sigma_\alpha D^T]A^{-1}$, where $\Sigma_\alpha$ is the asymptotic variance of $n^{1/2}\{\hat\alpha - \alpha_0\}$, and $D = E\{\pi_i(X_i, \delta_i; \alpha_0)^{-1} B_i\, \partial\pi_i(X_i, \delta_i; \alpha_0)/\partial\alpha^T\}$.

(ii) $\hat\Lambda_0(t; \hat\alpha)$ converges in probability to $\Lambda_0(t)$ uniformly in $t \in [0, \tau]$, and $n^{1/2}\{\hat\Lambda_0(t; \hat\alpha) - \Lambda_0(t)\}$ converges weakly in $[0, \tau]$ to a zero-mean Gaussian process with covariance function at $(s, t)$ equal to

$$\Gamma_{SR}(s, t) - \{Q(s) - D^T A^{-1} h(s)\}^T \Sigma_\alpha \{Q(t) - D^T A^{-1} h(t)\},$$

where $Q(t) = E\{\pi_i(X_i, \delta_i; \alpha_0)^{-1} C_i(t)\, \partial\pi_i(X_i, \delta_i; \alpha_0)/\partial\alpha\}$.

Because $D \Sigma_\alpha D^T$ is positive definite, $\hat\beta_{SR}(\hat\alpha)$ has a smaller asymptotic variance than $\hat\beta_{SR}$. Thus, we can obtain an asymptotically more efficient estimator of $\beta_0$ by using the estimated selection probabilities. This unusual phenomenon was pointed out for various practical situations, and Henmi & Eguchi (2004) provided the most complete explanation. Note that $\hat\Lambda_0(t)$ may not always be monotonic in $t$, in which case simple modifications such as those discussed in Lin & Ying (1994) can be made to ensure monotonicity while preserving asymptotic properties.
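The closed form of the simple reweighting estimator can be sketched in code. The function below is a minimal numerical sketch, not the authors' implementation (which was written in MATLAB): it assumes time-independent covariates, approximates the time integral by a midpoint rule on a grid, and takes the selection probabilities as inputs, either known or evaluated at an estimate of α. The names `simple_reweighting_beta`, `pi_obs`, `pi_at` and `n_grid` are hypothetical.

```python
import numpy as np


def simple_reweighting_beta(X, delta, Z, R, pi_obs, pi_at, n_grid=400):
    """Grid approximation of the simple reweighting estimator of beta.

    X, delta : observed times and event indicators, shape (n,)
    Z        : time-independent covariates, shape (n, p); rows with R == 0 are ignored
    R        : selection indicators, shape (n,)
    pi_obs   : pi_i(X_i, delta_i) for each subject, shape (n,)
    pi_at    : callable t -> array of pi_i(t, 1), shape (n,)
    """
    X = np.asarray(X, dtype=float)
    delta = np.asarray(delta)
    R = np.asarray(R)
    pi_obs = np.asarray(pi_obs, dtype=float)
    # Incomplete rows carry weight zero; zero them so NaN placeholders are harmless.
    Z = np.where(R[:, None] == 1, np.asarray(Z, dtype=float), 0.0)
    p = Z.shape[1]
    base = R / pi_obs  # R_i / pi_i(X_i, delta_i)

    def weights_and_mean(t):
        """Return w_i(t) Y_i(t) and the weighted risk-set mean of Z at time t."""
        w = base * pi_at(t) * (X >= t)
        total = w.sum()
        zbar = (w @ Z) / total if total > 0 else np.zeros(p)
        return w, zbar

    # Matrix term: midpoint rule for the integral over [0, max X].
    edges = np.linspace(0.0, X.max(), n_grid + 1)
    dt = edges[1] - edges[0]
    A = np.zeros((p, p))
    for t in 0.5 * (edges[:-1] + edges[1:]):
        w, zbar = weights_and_mean(t)
        Zc = Z - zbar
        A += dt * (Zc * w[:, None]).T @ Zc

    # Vector term: sum over observed events among complete cases.
    b = np.zeros(p)
    for i in np.flatnonzero((delta == 1) & (R == 1)):
        w, zbar = weights_and_mean(X[i])
        b += w[i] * (Z[i] - zbar)

    return np.linalg.solve(A, b)
```

With all `R` equal to one and all probabilities equal to one, the sketch reduces to a grid version of the Lin & Ying (1994) estimator, which gives a convenient sanity check. A finer grid reduces the discretization error at the cost of more computation.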
Define

$$\hat M_i(t; \hat\alpha) = N_i(t) - \int_0^t Y_i(u)\{d\hat\Lambda_0(u; \hat\alpha) + \hat\beta_{SR}(\hat\alpha)^T Z_i(u)\,du\},$$

and

$$\hat B_i(\hat\alpha) = \int_0^\tau \pi_i(t, 1; \hat\alpha)\{Z_i(t) - \bar Z_{SR}(t; \hat\alpha)\}\,d\hat M_i(t; \hat\alpha),$$

where $\bar Z_{SR}(t; \hat\alpha) = S_{SR}^{(1)}(t; \hat\alpha)/S_{SR}^{(0)}(t; \hat\alpha)$, and

$$S_{SR}^{(k)}(t; \hat\alpha) = \frac{1}{n} \sum_{i=1}^n \frac{R_i \pi_i(t, 1; \hat\alpha)}{\pi_i(X_i, \delta_i; \hat\alpha)} Y_i(t) Z_i(t)^{\otimes k} \quad (k = 0, 1, 2).$$

Then the asymptotic covariance matrix of $\hat\beta_{SR}(\hat\alpha)$ can be estimated by

$$n^{-1} \hat A(\hat\alpha)^{-1} \left[\hat\Sigma_{SR}(\hat\alpha) - \hat D(\hat\alpha) \hat\Sigma_\alpha \hat D(\hat\alpha)^T\right] \hat A(\hat\alpha)^{-1},$$

where

$$\hat A(\hat\alpha) = \frac{1}{n} \sum_{i=1}^n \int_0^\tau \frac{R_i \pi_i(t, 1; \hat\alpha)}{\pi_i(X_i, \delta_i; \hat\alpha)} \{Z_i(t) - \bar Z_{SR}(t; \hat\alpha)\}^{\otimes 2} Y_i(t)\,dt,$$

$$\hat\Sigma_{SR}(\hat\alpha) = \frac{1}{n} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)^2} \hat B_i(\hat\alpha)^{\otimes 2},$$

$$\hat D(\hat\alpha) = \frac{1}{n} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)^2} \hat B_i(\hat\alpha)\, \partial\pi_i(X_i, \delta_i; \hat\alpha)/\partial\alpha^T,$$

and $\hat\Sigma_\alpha$ is a consistent estimator of $\Sigma_\alpha$. Similarly, the estimation of the asymptotic covariance function for $\hat\Lambda_0(t; \hat\alpha)$ can be carried out as above.

3. AUGMENTED REWEIGHTING ESTIMATORS

The validity of the simple reweighting estimator $\hat\beta_{SR}(\hat\alpha)$ depends on the correct specification of the parametric model $\pi(W; \alpha_0)$. If it is misspecified, $\hat\beta_{SR}(\hat\alpha)$ may be biased. To improve the robustness as well as the efficiency of $\hat\beta_{SR}(\hat\alpha)$, we construct a class of augmented reweighting estimators based on the double robust technique developed by Robins, Rotnitzky, & Zhao (1994), which has the so-called double-robustness property, that is, the estimator is consistent if either the selection probability or the conditional distribution of the missing covariates given the observed data is correctly specified (Wang & Chen, 2001; Xu et al., 2009). In addition, the augmented reweighting estimator directly incorporates the incomplete observations through the augmented term and is thus more efficient than the simple reweighting estimator. Define $\bar Z_{AR}(t) = S_{AR}^{(1)}(t)/S_{AR}^{(0)}(t)$, where

$$S_{AR}^{(k)}(t) = S_{SR}^{(k)}(t) + \frac{1}{n} \sum_{i=1}^n \left(1 - \frac{R_i}{\pi_i(X_i, \delta_i)}\right) \pi_i(t, 1) Y_i(t) E\{Z_i(t)^{\otimes k} \mid W_i\} \quad (k = 0, 1).$$
First, we assume that the selection probability $\pi(X, \delta)$ and the conditional expectations $E\{Z_m(\cdot)^{\otimes k} \mid W\}$ $(k = 1, 2)$ are known. The augmented reweighting estimator for $\beta_0$ is the solution to the estimating equation $U_{AR}(\beta) = 0$, where

$$\begin{aligned}
U_{AR}(\beta) ={}& \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)} \int_0^\tau \pi_i(t, 1)\{Z_i(t) - \bar Z_{AR}(t)\}\{dN_i(t) - Y_i(t)\beta^T Z_i(t)\,dt\} \\
&+ \sum_{i=1}^n \left(1 - \frac{R_i}{\pi_i(X_i, \delta_i)}\right) \int_0^\tau \pi_i(t, 1) \Big(E\{Z_i(t)[dN_i(t) - Y_i(t)\beta^T Z_i(t)\,dt] \mid W_i\} \\
&\qquad - \bar Z_{AR}(t) E\{[dN_i(t) - Y_i(t)\beta^T Z_i(t)\,dt] \mid W_i\}\Big).
\end{aligned} \tag{4}$$
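The resulting augmented reweighting estimator for $\beta_0$ has the following closed form:

$$\begin{aligned}
\hat\beta_{AR} ={}& \Bigg[\sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)} \int_0^\tau \pi_i(t, 1)\{Z_i(t) - \bar Z_{AR}(t)\}^{\otimes 2} Y_i(t)\,dt \\
&\quad + \sum_{i=1}^n \left(1 - \frac{R_i}{\pi_i(X_i, \delta_i)}\right) \int_0^\tau \pi_i(t, 1) \big[E\{Z_i(t)^{\otimes 2} \mid W_i\} - E\{Z_i(t) \mid W_i\} \bar Z_{AR}^T(t) \\
&\qquad - \bar Z_{AR}(t) E\{Z_i(t) \mid W_i\}^T + \bar Z_{AR}(t)^{\otimes 2}\big] Y_i(t)\,dt\Bigg]^{-1} \\
&\times \sum_{i=1}^n \Bigg[\frac{R_i}{\pi_i(X_i, \delta_i)} \int_0^\tau \pi_i(t, 1)\{Z_i(t) - \bar Z_{AR}(t)\}\,dN_i(t) \\
&\quad + \left(1 - \frac{R_i}{\pi_i(X_i, \delta_i)}\right) \int_0^\tau \pi_i(t, 1)[E\{Z_i(t) \mid W_i\} - \bar Z_{AR}(t)]\,dN_i(t)\Bigg].
\end{aligned}$$

The corresponding estimator of $\Lambda_0(t)$ is given by $\hat\Lambda_a(t)$, where

$$\hat\Lambda_a(t) = \frac{1}{n} \sum_{i=1}^n \int_0^t \frac{\pi_i(u, 1)}{S_{AR}^{(0)}(u)}\,dN_i(u) - \hat\beta_{AR}^T \int_0^t \bar Z_{AR}(u)\,du.$$

For ease of theoretical development, the asymptotic results of $\hat\beta_{AR}$ and $\hat\Lambda_a(t)$ are stated in the following theorem.

Theorem 3.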
Under the regularity conditions (C1)–(C3) in the Appendix, we have

(i) $\hat\beta_{AR}$ is consistent and $n^{1/2}\{\hat\beta_{AR} - \beta_0\}$ is asymptotically normal with mean zero and covariance matrix $A^{-1} \Sigma_{AR} A^{-1}$, where

$$\Sigma_{AR} = \Sigma_{SR} - E\left\{\frac{1 - \pi_i(X_i, \delta_i)}{\pi_i(X_i, \delta_i)} E[B_i \mid W_i]^{\otimes 2}\right\}.$$

(ii) $\hat\Lambda_a(t)$ converges in probability to $\Lambda_0(t)$ uniformly in $t \in [0, \tau]$, and $n^{1/2}\{\hat\Lambda_a(t) - \Lambda_0(t)\}$ converges weakly in $[0, \tau]$ to a zero-mean Gaussian process with covariance function at $(s, t)$ equal to $\Gamma_{AR}(s, t) = E\{\nu_i(s)\nu_i(t)\}$, where

$$\begin{aligned}
\nu_i(t) ={}& \int_0^t \left\{\frac{R_i \pi_i(u, 1)}{\pi_i(X_i, \delta_i) s^{(0)}(u)}\,dM_i(u) + \left(1 - \frac{R_i}{\pi_i(X_i, \delta_i)}\right) \frac{\pi_i(u, 1)}{s^{(0)}(u)} E[dM_i(u) \mid W_i]\right\} \\
&- h(t)^T A^{-1} \left\{\frac{R_i}{\pi_i(X_i, \delta_i)} B_i - \frac{R_i - \pi_i(X_i, \delta_i)}{\pi_i(X_i, \delta_i)} E[B_i \mid W_i]\right\}.
\end{aligned}$$

In general, the selection probability and the conditional expectations are unknown. In this case, we assume that $\pi(X, \delta)$ and $E\{Z_m(\cdot)^{\otimes k} \mid W\}$ can be parametrically modelled as $\pi(W; \alpha_0)$ and $\mu_k(W; \gamma_k)$ $(k = 1, 2)$, respectively, where $\alpha_0$ and $\gamma_0 = (\gamma_1, \gamma_2)$ are the true parameter values. Let $\hat\alpha$ and $\hat\gamma$ be the estimators of $\alpha_0$ and $\gamma_0$, respectively. When $\pi_i(X_i, \delta_i)$, $\pi_i(t, 1)$, and $E\{Z_{mi}(\cdot)^{\otimes k} \mid W_i\}$ involved in $\hat\beta_{AR}$ and $\hat\Lambda_a(t)$ are substituted by $\pi_i(X_i, \delta_i; \hat\alpha)$, $\pi_i(t, 1; \hat\alpha)$, and $\mu_k(W_i; \hat\gamma_k)$, respectively, we obtain the augmented reweighting estimators for $\beta_0$ and $\Lambda_0(t)$ with estimated weights. The resulting estimators are denoted by $\hat\beta_{AR}(\hat\alpha, \hat\gamma)$ and $\hat\Lambda_a(t; \hat\alpha, \hat\gamma)$. The double-robustness property of the augmented reweighting estimators is summarized in the following theorem.

Theorem 4. The augmented reweighting estimators $\hat\beta_{AR}(\hat\alpha, \hat\gamma)$ and $\hat\Lambda_a(t; \hat\alpha, \hat\gamma)$ are consistent if either the selection probability or the conditional distribution of the missing covariates is correctly specified.

In the following, let $\hat\alpha$ and $\hat\gamma$ be the consistent estimators of $\alpha_0$ and $\gamma_0$, respectively. The next theorem shows that the estimation of nuisance parameters does not affect the asymptotic properties of the augmented reweighting estimators.

Theorem 5.
Under Conditions (C1)–(C5) in the Appendix, we have

(i) $\hat\beta_{AR}(\hat\alpha, \hat\gamma)$ is consistent and $n^{1/2}\{\hat\beta_{AR}(\hat\alpha, \hat\gamma) - \beta_0\}$ is asymptotically normal with mean zero and covariance matrix $A^{-1} \Sigma_{AR} A^{-1}$, which is no greater than that of $\hat\beta_{SR}(\hat\alpha)$.

(ii) $\hat\Lambda_a(t; \hat\alpha, \hat\gamma)$ converges in probability to $\Lambda_0(t)$ uniformly in $t \in [0, \tau]$, and $n^{1/2}\{\hat\Lambda_a(t; \hat\alpha, \hat\gamma) - \Lambda_0(t)\}$ converges weakly in $[0, \tau]$ to a zero-mean Gaussian process with covariance function at $(s, t)$ equal to $\Gamma_{AR}(s, t) = E\{\nu_i(s)\nu_i(t)\}$.

Theorems 3 and 5 indicate that the augmented reweighting estimators are more efficient than the simple reweighting estimator $\hat\beta_{SR}$, and are at least as efficient as $\hat\beta_{SR}(\hat\alpha)$. The asymptotic variance of the augmented reweighting estimators can be estimated similarly to those of the simple reweighting estimators. For illustration, we specify the variance estimator of $\hat\beta_{AR}(\hat\alpha, \hat\gamma)$. Define

$$\hat M_i(t; \hat\alpha, \hat\gamma) = N_i(t) - \int_0^t Y_i(u)\{d\hat\Lambda_a(u; \hat\alpha, \hat\gamma) + \hat\beta_{AR}(\hat\alpha, \hat\gamma)^T Z_i(u)\,du\},$$

and

$$\hat B_i(\hat\alpha, \hat\gamma) = \int_0^\tau \pi_i(t, 1; \hat\alpha)\{Z_i(t) - \bar Z_{AR}(t; \hat\alpha, \hat\gamma)\}\,d\hat M_i(t; \hat\alpha, \hat\gamma),$$

where $\bar Z_{AR}(t; \hat\alpha, \hat\gamma) = S_{AR}^{(1)}(t; \hat\alpha, \hat\gamma)/S_{AR}^{(0)}(t; \hat\alpha, \hat\gamma)$,

$$S_{AR}^{(k)}(t; \hat\alpha, \hat\gamma) = S_{SR}^{(k)}(t; \hat\alpha) + \frac{1}{n} \sum_{i=1}^n \left(1 - \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)}\right) \pi_i(t, 1; \hat\alpha) Y_i(t) \hat E\{Z_i(t)^{\otimes k} \mid W_i\},$$

$$\hat\Lambda_a(t; \hat\alpha, \hat\gamma) = \frac{1}{n} \sum_{i=1}^n \int_0^t \frac{\pi_i(u, 1; \hat\alpha)}{S_{AR}^{(0)}(u; \hat\alpha, \hat\gamma)}\,dN_i(u) - \hat\beta_{AR}(\hat\alpha, \hat\gamma)^T \int_0^t \bar Z_{AR}(u; \hat\alpha, \hat\gamma)\,du,$$

and $\hat E\{\cdot \mid W_i\}$ denotes the estimated $E\{\cdot \mid W_i\}$. Note that $\Sigma_{AR}$ can be rewritten as

$$E\{B_i^{\otimes 2}\} + E\left[\frac{1 - \pi_i(X_i, \delta_i; \alpha_0)}{\pi_i(X_i, \delta_i; \alpha_0)} \operatorname{var}\{B_i \mid W_i\}\right].$$

Then the asymptotic variance matrix of $\hat\beta_{AR}(\hat\alpha, \hat\gamma)$ can be estimated by

$$n^{-1} \hat A(\hat\alpha, \hat\gamma)^{-1} \hat\Sigma_{AR}(\hat\alpha, \hat\gamma) \hat A(\hat\alpha, \hat\gamma)^{-1},$$
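where

$$\begin{aligned}
\hat A(\hat\alpha, \hat\gamma) ={}& \frac{1}{n} \Bigg[\sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)} \int_0^\tau \pi_i(t, 1; \hat\alpha)\{Z_i(t) - \bar Z_{AR}(t; \hat\alpha, \hat\gamma)\}^{\otimes 2} Y_i(t)\,dt \\
&+ \sum_{i=1}^n \left(1 - \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)}\right) \int_0^\tau \pi_i(t, 1; \hat\alpha) \big[\hat E\{Z_i(t)^{\otimes 2} \mid W_i\} - \hat E\{Z_i(t) \mid W_i\} \bar Z_{AR}^T(t; \hat\alpha, \hat\gamma) \\
&\qquad - \bar Z_{AR}(t; \hat\alpha, \hat\gamma) \hat E\{Z_i(t) \mid W_i\}^T + \bar Z_{AR}(t; \hat\alpha, \hat\gamma)^{\otimes 2}\big] Y_i(t)\,dt\Bigg],
\end{aligned}$$

and

$$\hat\Sigma_{AR}(\hat\alpha, \hat\gamma) = \frac{1}{n} \sum_{i=1}^n \left\{\frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)} \hat B_i(\hat\alpha, \hat\gamma)^{\otimes 2} + \frac{1 - \pi_i(X_i, \delta_i; \hat\alpha)}{\pi_i(X_i, \delta_i; \hat\alpha)} \big[\hat B_i(\hat\alpha, \hat\gamma) - \hat E\{\hat B_i(\hat\alpha, \hat\gamma) \mid W_i\}\big]^{\otimes 2}\right\}.$$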
4. MODEL CHECKING

In this section, we propose a formal lack-of-fit test to assess the adequacy of the AH model (1) with missing covariates. Here we only present the model checking procedure under the simple reweighting estimator $\hat\beta_{SR}(\hat\alpha)$. The model checking procedure under the augmented reweighting estimators can be derived similarly. Following Lin, Wei, & Ying (1993), we consider the following cumulative sums of residuals:

$$F(t, z) = n^{-1/2} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)} \int_0^t I(Z_i(u) \le z)\,d\hat M_i(u; \hat\alpha),$$

where the event $I(Z_i(u) \le z)$ means that each component of $Z_i(u)$ is no larger than the corresponding component of $z$. Define the null hypothesis $H_0$ as the correct specification of model (1). We show in the Appendix that the null distribution of $F(t, z)$ can be approximated by the zero-mean Gaussian process

$$\tilde F(t, z) = n^{-1/2} \sum_{i=1}^n \hat\Psi_i(t, z), \tag{5}$$

where

$$\begin{aligned}
\hat\Psi_i(t, z) ={}& \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)} \int_0^t \left\{I(Z_i(u) \le z) - \pi_i(u, 1; \hat\alpha) \frac{\hat f_1(u, z; \hat\alpha)}{S_{SR}^{(0)}(u; \hat\alpha)}\right\} d\hat M_i(u; \hat\alpha) \\
&- \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)} \hat f_2(t, z; \hat\alpha)^T \hat A^{-1}(\hat\alpha) \hat B_i(\hat\alpha) \\
&+ \left\{\int_0^t \hat f_1(u, z; \hat\alpha)\,d\hat Q(u; \hat\alpha)^T + \hat f_2(t, z; \hat\alpha)^T \hat A(\hat\alpha)^{-1} \hat D(\hat\alpha) - \hat f_3(t, z; \hat\alpha)^T\right\} \hat\Sigma_\alpha \hat U_{\alpha i},
\end{aligned}$$

$$\hat f_1(u, z; \hat\alpha) = n^{-1} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)} I(Z_i(u) \le z) Y_i(u),$$

$$\hat f_2(t, z; \hat\alpha) = n^{-1} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)} \int_0^t I(Z_i(u) \le z) Y_i(u)\{Z_i(u) - \bar Z_{SR}(u; \hat\alpha)\}\,du,$$

and

$$\hat f_3(t, z; \hat\alpha) = n^{-1} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)^2} \frac{\partial\pi_i(X_i, \delta_i; \hat\alpha)}{\partial\alpha} \int_0^t I(Z_i(u) \le z)\,d\hat M_i(u; \hat\alpha).$$
It is apparently not possible to evaluate the above distribution analytically because the limiting process of $F(t, z)$ does not have independent increments. To overcome this difficulty, we appeal to the resampling approach (e.g., Lin, Wei, & Ying, 1993). Let $(G_1, \ldots, G_n)$ be independent standard normal random variables independent of the data. It can be shown following Lin, Wei, & Ying (1993) that the distribution of the process $F(t, z)$ can be approximated by that of the zero-mean Gaussian process

$$\hat F(t, z) = n^{-1/2} \sum_{i=1}^n \hat\Psi_i(t, z) G_i. \tag{6}$$

Thus, we can first obtain a large number of realizations of $\hat F(t, z)$ by repeatedly generating the standard normal random sample $(G_1, \ldots, G_n)$ while fixing the data at their observed values, and then use the empirical distribution of these realizations to approximate the distribution of $F(t, z)$. To assess the overall fit of model (1), one can plot a few realizations of $\hat F(t, z)$ along with the observed $F(t, z)$ and see if they can be regarded as arising from the same population. More formally, we can apply the supremum test statistic $\sup_{0 \le t \le \tau, z} |F(t, z)|$, whose p-value can be obtained by comparing the observed value of $\sup_{0 \le t \le \tau, z} |F(t, z)|$ to a large number of realizations from $\sup_{0 \le t \le \tau, z} |\hat F(t, z)|$.

5. SIMULATION STUDIES

We conducted simulation studies to examine and compare the finite-sample performance of the reweighting estimators and the simple weighted estimators along with that of the full-cohort and complete-case analyses. In these studies, the AH model was taken to be

$$\lambda(t \mid Z_m, Z_o) = 0.5 + \beta_m Z_m + \beta_o Z_o$$

with $\beta_m = 1$ and $\beta_o = 0.5$, where the missing covariate $Z_m$ follows a Bernoulli distribution with success probability 0.5, and the observed covariate $Z_o$ follows a uniform distribution on $(0, 1)$. The censoring time $C$ was taken as $\min(U, 1)$, where $U$ follows a uniform distribution on $(0, 2.5)$, which yields a 43.2% censoring rate. We considered three settings for the selection probability:

(a) $\pi(X, \delta) = \exp(2X - \delta - 3Z_o)/(1 + \exp(2X - \delta - 3Z_o))$ with the missing rate of 68.8%;
(b) $\pi(X, \delta) = \exp(3X - 3Z_o)/(1 + \exp(3X - 3Z_o))$ with the missing rate of 50.7%;
(c) $\pi(X, \delta) = \exp(X + Z_o)/(1 + \exp(X + Z_o))$ with the missing rate of 27.9%.

In each setting, the selection probability $\pi(W; \alpha_0)$ and the conditional distribution $P(Z_m \mid W; \gamma_0)$ were fitted by the logistic models with $X$, $Z_o$, $\delta$ and their pairwise interactions as predictors. The maximum likelihood estimators of $\alpha_0$ and $\gamma_0$ are denoted by $\hat\alpha$ and $\hat\gamma$, respectively.
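The data-generating mechanism described above can be sketched as follows. This is a minimal sketch for setting (a), not the authors' MATLAB code; it relies on the fact that the hazard 0.5 + βm Zm + βo Zo is constant in t, so the failure time is exponential with that rate. The function name `simulate_setting_a` and the returned dictionary keys are hypothetical.

```python
import numpy as np


def simulate_setting_a(n, rng, beta_m=1.0, beta_o=0.5):
    """Draw one dataset from the AH model with selection probability (a)."""
    z_m = rng.binomial(1, 0.5, size=n).astype(float)  # possibly missing covariate
    z_o = rng.uniform(0.0, 1.0, size=n)               # always observed covariate

    # Constant hazard in t, so T is exponential with rate equal to the hazard.
    hazard = 0.5 + beta_m * z_m + beta_o * z_o
    t_fail = rng.exponential(scale=1.0 / hazard)

    # Censoring time C = min(U, 1) with U uniform on (0, 2.5).
    c = np.minimum(rng.uniform(0.0, 2.5, size=n), 1.0)
    x = np.minimum(t_fail, c)
    delta = (t_fail <= c).astype(int)

    # Selection probability (a): logit = 2X - delta - 3*Zo (missing at random).
    pi = 1.0 / (1.0 + np.exp(-(2.0 * x - delta - 3.0 * z_o)))
    r = rng.binomial(1, pi)

    # Zm is only available when R = 1.
    z_m_obs = np.where(r == 1, z_m, np.nan)
    return {"x": x, "delta": delta, "z_o": z_o, "z_m_obs": z_m_obs, "r": r, "pi": pi}


data = simulate_setting_a(20000, np.random.default_rng(2014))
print("censoring rate:", 1.0 - data["delta"].mean())
print("missing rate:", 1.0 - data["r"].mean())
```

With a large sample, the printed rates can be compared with the 43.2% censoring rate and 68.8% missing rate reported for this setting. Settings (b) and (c) differ only in the linear predictor of the selection probability.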
For each generated dataset, βm and βo were estimated in various ways including the full-cohort analysis (denoted by FULL) which is based on data with all the covariates (both Zm and Zo) being fully known/observed, the complete-case analysis (denoted by CC), the simple weighted estimator with estimated selection probability (denoted by SW(α̂)), the augmented weighted estimator with estimated selection probability and estimated conditional expectations (denoted by ASW(α̂, γ̂)), the reweighting estimator with true selection probability (denoted by SR), the reweighting estimator with estimated selection probability (denoted by SR(α̂)) and the augmented reweighting estimator with estimated selection probability and estimated conditional expectations (denoted by AR(α̂, γ̂)). The results presented below are based on 1,000 replications with sample sizes n = 500 and 1,000. The software MATLAB is used to program the algorithm, and the computation times to calculate the augmented reweighting estimators based on 1,000 replications are approximately 6 and 16 min with sample sizes n = 500 and 1,000, respectively.

Table 1: Simulation results for the selection probability π(X, δ) = exp(2X − δ − 3Zo)/(1 + exp(2X − δ − 3Zo)) with the missing rate of 68.8%.

                               βm                              βo
n      Method          Bias     SE    ESE    CP      Bias     SE    ESE    CP
500    FULL           −0.11  15.35  15.10  93.9     −0.09  23.93  24.03  95.9
       CC            −44.57  16.01  16.30  25.3    −65.89  25.58  25.16  26.8
       SW(α̂)          −1.04  39.36  33.75  90.2    −10.10  57.04  43.58  82.4
       ASW(α̂, γ̂)       2.70  39.83  34.79  88.0     −0.23  31.11  29.53  92.7
       SR             −0.40  27.20  27.01  94.6     −3.11  51.27  49.55  94.6
       SR(α̂)           0.36  26.98  26.23  94.0     −2.51  33.74  31.40  92.4
       AR(α̂, γ̂)        1.23  27.13  26.43  94.1     −2.04  32.83  31.18  93.9
1,000  FULL            1.00  11.02  10.69  94.6      0.48  17.54  16.88  93.3
       CC            −44.09  11.68  11.51   5.6    −65.08  18.81  17.76   6.5
       SW(α̂)           1.56  28.14  25.29  91.1     −4.76  40.15  34.45  87.9
       ASW(α̂, γ̂)       2.43  27.91  25.14  91.0     −0.32  21.34  20.53  94.1
       SR              1.00  19.97  19.25  94.3      0.32  37.10  35.02  92.4
       SR(α̂)           1.15  19.56  18.66  93.7     −0.57  23.36  21.89  93.1
       AR(α̂, γ̂)        1.54  19.64  18.72  93.4      0.03  22.84  21.69  93.4

All of the quantities are presented in %.

Tables 1–3 present the simulation results on the estimates of βm and βo under the three different settings, respectively. In these tables, Bias is the sample mean of the estimate minus the true value, SE is the sampling standard error of the estimate, ESE is the sample mean of the estimated standard error and CP is the empirical coverage probability of the 95% confidence interval based on the normal approximation. It can be seen from Tables 1–3 that complete-case analysis may produce seriously biased estimates in all situations, with coverage probabilities that are too small, whereas the other estimators perform well. In particular, the reweighting estimators are practically unbiased, there is a good agreement between the estimated and the empirical standard errors, and the coverage probabilities of the 95% confidence intervals are reasonable. The estimator SR is less efficient than the estimators SR(α̂) and AR(α̂, γ̂), especially for the estimation of the observed covariate effect βo. In addition, the estimators SR(α̂) and AR(α̂, γ̂) are asymptotically equivalent, and provide comparable estimates for βm and βo. Compared with the weighted estimator, the proposed reweighting method can improve the efficiency of estimating the missing covariate effect βm, especially when the missing rate is large. Furthermore, when the selection probability is small, the sampling standard errors of the weighted estimators are larger than the sample mean of the estimated standard errors, which means that the weighted estimators are unstable, and their variances are underestimated. This leads to the empirical coverage probabilities being below the nominal level.

Thus, the reweighting estimators are good competitors with the weighted estimators when the selection probability is small. We also considered other setups and the results were similar to those given above.
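The four summary measures defined above can be computed from replicated estimates as in the following sketch. It assumes that SE is the sample standard deviation over replications (with divisor R − 1, a choice made here) and that the 95% interval is the estimate plus or minus 1.96 estimated standard errors; the function name `summarize` and the toy inputs are hypothetical.

```python
import numpy as np


def summarize(estimates, std_errors, truth, z=1.959964):
    """Return Bias, SE, ESE and CP (all in %) over simulation replications."""
    estimates = np.asarray(estimates, dtype=float)
    std_errors = np.asarray(std_errors, dtype=float)

    bias = estimates.mean() - truth   # sample mean of the estimate minus the true value
    se = estimates.std(ddof=1)        # sampling standard error across replications
    ese = std_errors.mean()           # sample mean of the estimated standard errors

    # Empirical coverage of the normal-approximation 95% confidence interval.
    lower = estimates - z * std_errors
    upper = estimates + z * std_errors
    cp = np.mean((lower <= truth) & (truth <= upper))

    return {"Bias": 100 * bias, "SE": 100 * se, "ESE": 100 * ese, "CP": 100 * cp}


# Toy input: four replications of an estimator of a parameter whose true value is 1.
summary = summarize([0.9, 1.1, 1.0, 1.3], [0.1, 0.1, 0.1, 0.1], truth=1.0)
print(summary)
```

An ESE noticeably below the SE, as seen for the weighted estimators in Table 1, indicates that the variance estimator is too small, which in turn pulls CP below 95%.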
Table 2: Simulation results for the selection probability π(X, δ) = exp(3X − 3Zo)/(1 + exp(3X − 3Zo)) with the missing rate of 50.7%.

                               βm                              βo
n      Method          Bias     SE    ESE    CP      Bias     SE    ESE    CP
500    FULL           −0.04  15.28  15.10  94.6     −0.24  23.85  24.00  96.4
       CC            −27.76  15.58  15.58  55.7    −60.61  24.20  24.47  30.3
       SW(α̂)          −0.06  27.44  24.80  92.1     −1.27  35.98  33.28  91.5
       ASW(α̂, γ̂)      −0.00  26.98  24.98  91.9     −0.17  25.17  25.52  95.8
       SR             −0.70  21.77  20.95  93.5     −2.03  36.62  36.53  94.7
       SR(α̂)          −0.52  21.15  20.52  93.7     −1.11  27.83  27.91  94.8
       AR(α̂, γ̂)       −0.30  21.18  20.58  93.9     −0.93  27.78  27.30  94.9
1,000  FULL            0.95  10.91  10.69  94.8      0.47  17.57  16.88  93.2
       CC            −26.77  10.89  11.05  32.6    −60.21  18.38  17.18   8.5
       SW(α̂)           0.39  18.59  17.74  93.5     −1.93  25.99  24.07  93.2
       ASW(α̂, γ̂)       0.66  18.57  17.73  93.7     −0.00  18.84  17.89  93.6
       SR              0.62  15.24  14.88  95.3     −1.08  27.31  25.76  94.5
       SR(α̂)           0.74  14.82  14.59  95.2     −0.23  20.69  19.22  94.0
       AR(α̂, γ̂)        0.86  14.87  14.61  95.0     −0.03  20.65  19.20  94.1

All of the quantities are presented in %.
Table 3: Simulation results for the selection probability π(X, δ) = exp(X + Zo)/(1 + exp(X + Zo)) with the missing rate of 27.9%.

                          βm                              βo
n       Method            Bias    SE     ESE    CP        Bias    SE     ESE    CP
500     FULL              −0.03   15.32  15.11  94.3      −0.77   23.68  24.03  95.9
        CC                −7.37   17.15  16.68  91.1      3.72    25.54  26.26  96.0
        SW(α̂)             −0.65   18.11  17.42  94.0      −0.55   24.27  24.69  95.4
        ASW(α̂, γ̂)         −0.63   18.11  17.43  94.2      −0.53   23.75  24.55  95.4
        SR                −0.91   18.27  17.65  94.2      0.07    27.33  27.92  95.8
        SR(α̂)             −0.63   18.20  17.52  93.7      −0.31   24.39  24.71  95.0
        AR(α̂, γ̂)          −0.59   18.21  17.53  94.0      −0.35   24.71  24.66  95.1
1,000   FULL              0.79    10.80  10.69  94.9      0.63    17.29  16.88  93.8
        CC                −5.89   12.06  11.81  90.9      3.99    18.47  18.41  95.0
        SW(α̂)             0.74    12.64  12.32  95.1      0.56    17.73  17.32  94.4
        ASW(α̂, γ̂)         0.76    12.63  12.32  94.8      0.55    17.54  17.24  94.3
        SR                0.70    12.85  12.53  95.0      0.38    19.77  19.65  95.2
        SR(α̂)             0.77    12.71  12.41  95.0      0.53    17.92  17.39  94.2
        AR(α̂, γ̂)          0.80    12.71  12.42  95.1      0.53    17.83  17.36  94.0

All of the quantities are presented in %.
Table 4: Simulation results for the cases of a missing continuous covariate and a missing time-dependent covariate when the baseline hazard is not constant in time with n = 150.

                  FULL    CC      SW(α̂)   ASW(α̂, γ̂)   SR      SR(α̂)   AR(α̂, γ̂)
Bias    βm1       −0.19   −3.75   3.45    4.26        1.27    1.85    2.81
        βm2       −0.19   −5.24   −0.75   0.25        0.64    0.88    1.77
        βo        1.86    −8.82   1.93    3.08        3.47    1.87    2.55
SE      βm1       47.36   58.02   71.76   74.19       64.22   64.83   66.28
        βm2       37.75   49.32   63.30   64.53       55.47   57.14   57.86
        βo        23.58   32.57   25.41   24.92       37.42   25.87   25.66
ESE     βm1       45.05   54.84   64.47   74.72       60.75   60.33   67.51
        βm2       38.37   47.71   58.80   59.62       55.17   53.39   55.18
        βo        22.70   30.87   24.19   24.41       35.41   24.81   25.25
CP      βm1       93.6    93.6    92.1    95.4        93.1    93.0    95.6
        βm2       96.2    94.8    92.6    93.9        95.7    93.1    95.9
        βo        94.6    91.2    94.9    95.7        93.7    94.6    95.9

All of the quantities are presented in %.
Next, we conducted simulation studies for the cases of a missing continuous covariate and a missing time-dependent covariate when the baseline hazard was not constant in time. The sample size was taken as n = 150. In the study, the AH model was taken to be λ(t|Zm, Zo) = 0.5t + βm1 Zm1(t) + βm2 Zm2 + βo Zo, with βm1 = βm2 = βo = 0.5, where the observed covariate Zo follows a Bernoulli distribution with success probability 0.5, the missing covariate Zm1(t) = U(t + 1) with U generated from a uniform distribution on (0, 1) and the missing covariate Zm2 follows a uniform distribution on (0, 1). The selection probability was taken as π(X, δ) = exp(X − Zo)/(1 + exp(X − Zo)). The conditional distribution P(Zm1(t)|W; γ01) was fitted by the uniform distribution on (0, γ01 t) at time t, and a consistent estimator of γ01 was obtained by maximizing the likelihood based on the observed data. The selection probability π(W; α0), the conditional distribution P(Zm2|W; γ02), and the censoring time C were taken as in Table 1, which yields the missing rate of 50.4% and the censoring rate of 58.2%. The results are summarized in Table 4. It can be seen that the proposed methods still perform reasonably well for the situation considered here, and are adequate for practical use.

Furthermore, we conducted simulation studies when both covariates are discrete with n = 150, which aligns with the case in the real data example. The AH model was taken to be λ(t|Zm, Zo) = 0.5 × 10^{-3} + βm Zm + βo Zo with βm = 1.0 × 10^{-3} and βo = 0.5 × 10^{-3}, which mimic the quantities of the real data example, where both the missing covariate Zm and the observed covariate Zo follow Bernoulli distributions with success probability 0.5. The selection probability was given by π(X, δ) = exp(2δ + 0.005X − 2.5Zo)/(1 + exp(2δ + 0.005X − 2.5Zo)).
The censoring time C was taken as min(C1, 700), where C1 follows an exponential distribution with a mean of (0.5 × 10^{-3} + 1.0 × 10^{-3} Zo + 0.5 × 10^{-3} Zm)^{-1}, which yields the missing rate of 61.3% and the censoring rate of 32.4%. The other setups were the same as in Table 1. The results are presented in Table 5. Simulation results show that the proposed methods perform satisfactorily in this case, and are therefore applicable to the motivating mouse data.
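In both designs above the conditional hazard is linear in time, λ(t) = a t + b, so failure times can be drawn by inverting the cumulative hazard a t²/2 + b t at a unit-exponential variate. The sketch below illustrates this one possible way of generating the failure times; the function names are hypothetical, the paper does not say how its MATLAB code generated the data, and censoring and the missingness indicator are deliberately left out.

```python
import math
import random


def failure_time(a, b, e):
    """Solve a*t**2/2 + b*t = e for t >= 0.

    This inverts the cumulative hazard of lambda(t) = a*t + b at the
    unit-exponential draw e (inverse cumulative hazard sampling).
    """
    if a == 0.0:
        return e / b  # constant hazard: exponential failure time
    return (-b + math.sqrt(b * b + 2.0 * a * e)) / a


def draw_time_dependent(rng, bm1=0.5, bm2=0.5, bo=0.5):
    """One draw from lambda(t) = 0.5*t + bm1*U*(t + 1) + bm2*Zm2 + bo*Zo."""
    u = rng.random()                          # U ~ Uniform(0, 1), Zm1(t) = U*(t + 1)
    zm2 = rng.random()                        # Zm2 ~ Uniform(0, 1)
    zo = 1.0 if rng.random() < 0.5 else 0.0   # Zo ~ Bernoulli(0.5)
    a = 0.5 + bm1 * u                         # coefficient of t in the hazard
    b = bm1 * u + bm2 * zm2 + bo * zo         # time-constant part of the hazard
    return failure_time(a, b, rng.expovariate(1.0)), (u, zm2, zo)


def draw_discrete(rng, lam0=0.5e-3, bm=1.0e-3, bo=0.5e-3):
    """One draw from lambda(t) = lam0 + bm*Zm + bo*Zo with binary covariates."""
    zm = 1.0 if rng.random() < 0.5 else 0.0
    zo = 1.0 if rng.random() < 0.5 else 0.0
    b = lam0 + bm * zm + bo * zo
    return failure_time(0.0, b, rng.expovariate(1.0)), (zm, zo)
```

For example, `draw_discrete(random.Random(2014))` returns a failure time together with the covariate pair (Zm, Zo); the observed time X and the indicator δ would then follow from an independently drawn censoring time.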
Table 5: Simulation results for the case when the covariates are discrete with n = 150.

                  βm                                          βo
Method            Bias×10^5  SE×10^5  ESE×10^5  CP (%)        Bias×10^5  SE×10^5  ESE×10^5  CP (%)
FULL              3.44       33.15    32.95     95.9          1.17       30.40    31.66     96.2
CC                7.71       37.19    37.02     95.8          11.54      38.87    39.78     96.9
SW(α̂)             4.91       42.52    39.77     95.9          2.87       32.54    32.51     94.1
ASW(α̂, γ̂)         5.05       42.75    40.73     94.7          2.81       31.71    32.42     96.0
SR                4.39       38.26    37.07     94.4          4.60       40.17    39.73     96.0
SR(α̂)             4.27       37.99    36.59     94.7          2.19       32.22    32.43     95.8
AR(α̂, γ̂)          4.40       38.15    36.91     94.2          2.28       31.88    32.46     96.1
We also conducted simulation studies to examine the performance of the proposed methods with different censoring rates. Simulation results are presented in Table S1, which shows that the proposed estimation procedures perform well for different censoring rates. To investigate how the effect size and the censoring mechanism affect estimation, we conducted simulation studies for the cases when the covariate effects were taken to be large or small, and the censoring mechanism depended on covariates. The results are reported in Tables S2 and S3, which indicate that the proposed methods still perform well in these cases.

Finally, we conducted some simulation studies to evaluate the size and power of the Wald test for the null hypothesis H0: βo = 0. The AH model was set as λ(t|Zm, Zo) = 0.5 + βm Zm + 0.1k Zo with βm = 0.5 or 1, where k is chosen to be 0, 1, 3, 5, 7 and 9. All other setups were the same as those in Table 1 with n = 500 and 1,000. Table 6 reports the empirical sizes and powers of the Wald test at the significance level of 0.05. The results suggest that the Wald test has approximately the right size and reasonable power. As expected, the empirical size becomes close to the nominal size when the sample size increases from 500 to 1,000, and the power increases as the sample size and the value of k increase.

Table 6: The empirical sizes and powers of the Wald test for testing βo = 0.

βm     n       Method         k=0    k=1    k=3    k=5    k=7    k=9
0.5    500     SR             6.3    5.5    9.1    15.4   25.5   35.7
               SR(α̂)          7.1    10.9   19.3   35.5   55.4   69.7
               AR(α̂, γ̂)       5.5    8.4    17.8   34.9   56.1   71.0
       1,000   SR             5.5    5.9    14.1   27.7   46.0   62.3
               SR(α̂)          4.7    7.4    31.0   59.4   84.5   94.4
               AR(α̂, γ̂)       4.4    7.3    30.3   61.2   85.8   95.4
1      500     SR             6.9    7.0    11.8   20.2   32.7   42.6
               SR(α̂)          6.8    12.1   28.8   48.8   69.3   82.6
               AR(α̂, γ̂)       5.6    9.6    26.2   47.4   71.1   84.9
       1,000   SR             5.7    7.2    16.5   33.6   52.3   69.6
               SR(α̂)          5.0    11.9   40.6   75.2   93.2   98.7
               AR(α̂, γ̂)       4.6    10.6   40.6   76.5   94.4   98.9

All of the quantities are presented in %.

6. APPLICATION

For illustration purposes, we applied the proposed methods to a mouse leukaemia dataset given in Appendix A of Kalbfleisch & Prentice (2002), which has been analysed extensively in the literature (Wang et al., 1997; Chen & Little, 1999; Wang & Chen, 2001; Qi, Wang, & Prentice, 2005). The study was conducted in Robert Nowinski's laboratories at the Fred Hutchinson Cancer Research Center to examine the effects of genetic and viral factors on the development of spontaneous leukaemia in mice. There were 204 mice that were followed over 2 years for mortality due to thymic leukaemia, nonthymic leukaemia or other natural causes. Two covariates of interest were the Gpd-1 phenotype and the level of endogenous murine leukaemia virus. The mice with lifetimes less than 400 days did not have the Gpd-1 phenotype recorded, so missingness of covariates clearly depends on the survival time, and the MAR assumption seems to be reasonable in this application (Wang & Chen, 2001; Qi, Wang, & Prentice, 2005). Here, we used the proposed methods to analyse the data and focussed on the effects of the Gpd-1 phenotype and the virus level on mortality due to thymic leukaemia.

Following Wang et al. (1997), for computational simplicity, we included in our analysis only the 153 mice with T ≥ 400, among which 37 mice died of thymic leukaemia. The Gpd-1 phenotype was available on 100 mice. For the analysis, let Zm be a binary variable taking the value 0 for b/b and 1 for b/a, and Zo be a binary variable which is 0 if the virus level is less than 10^4 PFU/ml and 1 otherwise. As discussed in Lin (2011), the selection probability and the conditional distribution were modelled by the logistic models with the survival time, failure indicator, virus level and their pairwise interactions as predictors. The best model was chosen by the Akaike information criterion (AIC), where AIC = 2k − 2 ln(L), in which k is the number of parameters in the model, and L is the maximized value of the likelihood function for the estimated model. Then the chosen models include the survival time, failure indicator and their interaction. Specifically, we obtain

$$\pi(X, \delta) = \frac{\exp(-8.23 + 0.01X + 9.6\delta - 0.02X\delta)}{1 + \exp(-8.23 + 0.01X + 9.6\delta - 0.02X\delta)}.$$
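As a small illustration of the two ingredients just described, the sketch below evaluates the fitted logistic selection probability with the coefficients exactly as printed, and computes AIC = 2k − 2 ln(L) for a candidate model. The function names are hypothetical. Because the printed coefficients are rounded to two decimals, the resulting probabilities should be read as indicative only.

```python
import math


def selection_prob(x, delta, coef=(-8.23, 0.01, 9.6, -0.02)):
    """Fitted probability that Gpd-1 is observed, given survival time x and
    failure indicator delta, using the coefficients as printed in the text."""
    b0, bx, bd, bxd = coef
    eta = b0 + bx * x + bd * delta + bxd * x * delta  # linear predictor
    return 1.0 / (1.0 + math.exp(-eta))               # logistic link


def aic(log_likelihood, n_params):
    """Akaike information criterion, AIC = 2k - 2 ln(L)."""
    return 2 * n_params - 2.0 * log_likelihood


def best_by_aic(candidates):
    """Pick the (name, log_likelihood, n_params) triple with the smallest AIC."""
    return min(candidates, key=lambda c: aic(c[1], c[2]))
```

For instance, `best_by_aic([("time + delta", -101.0, 3), ("time * delta", -98.0, 4)])` returns the second triple, since its AIC (204) is smaller than that of the first (208); the log-likelihood values in this call are made up purely for illustration.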
Table 7 gives the estimates of regression coefficients in the AH model by the complete-case analysis, the weighted method and the reweighting method. All methods show a negative association of the Gpd-1 phenotype and a positive association of the virus level with thymic leukaemia mortality. The augmented weighted estimator ASW(α̂, γ̂) and the augmented reweighting estimator AR(α̂, γ̂) give stronger evidence for the effect of the virus level. That is, the values of the estimators ASW(α̂, γ̂) and AR(α̂, γ̂) are bigger than the values of the other estimators. As expected, the augmented reweighting estimator AR(α̂, γ̂) has smaller standard errors for coefficient estimates of both the Gpd-1 phenotype and the virus level than the estimators from the complete-case analysis and the weighted estimating method. These findings are consistent with those obtained by Lin (2011). Although the estimate of the Gpd-1 phenotype from the complete-case analysis is not very different from those of the proposed methods, our estimators show that the virus level is statistically significant (the P-value is 0.016 for SR(α̂) and 0.001 for AR(α̂, γ̂)), but the complete-case analysis reveals that the virus level is only marginally significant (the P-value is 0.051). Of course, all methods suggest that the covariate effects appear of little clinical significance. This is because the covariate effects have different meanings under different models. For example, the magnitudes of the covariate effects under the Cox model are big (Wang et al., 1997; Chen & Little, 1999; Wang & Chen, 2001; Qi, Wang, & Prentice, 2005).

Table 7: Analysis of the mouse leukaemia data with various methods.

                  Gpd-1                               Virus
Method            Est×10^4  SE×10^4  P-value          Est×10^4  SE×10^4  P-value
CC                −5.30     2.25     0.019            3.07      1.57     0.051
SW(α̂)             −5.02     2.10     0.017            4.01      1.82     0.028
ASW(α̂, γ̂)         −5.38     2.01     0.007            4.70      1.61     0.012
SR(α̂)             −4.23     1.94     0.029            4.11      1.70     0.016
AR(α̂, γ̂)          −4.42     1.79     0.018            4.75      1.48     0.001

Note: Est is the estimate of the parameter, and SE is the standard error estimate.

Finally, we applied the model checking procedure introduced in Section 4 to assess the adequacy of the AH model (1) for the data. We only calculated the statistic F(t, z) under SR(α̂), and obtained sup_{t,z} |F(t, z)| = 18.4, with a P-value of 0.34, based on 1,000 realizations of the corresponding statistic sup_{t,z} |F̂(t, z)|. The result suggests that there is no evidence against the AH model.
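The P-values in Table 7 can be related to the estimates and standard errors through a two-sided Wald test based on the normal approximation. The sketch below assumes that this is how the P-values were obtained, which the text does not state explicitly; the function name is hypothetical. It reproduces, for example, the complete-case row to the printed precision, but since the printed estimates and standard errors are rounded (or may have been computed differently), not every entry is guaranteed to match.

```python
import math


def wald_p_value(estimate, std_error):
    """Two-sided P-value for H0: coefficient = 0 under the normal approximation."""
    z = abs(estimate / std_error)          # Wald statistic
    return math.erfc(z / math.sqrt(2.0))   # equals 2 * (1 - Phi(z))


# Complete-case row of Table 7 (estimates and SEs share the factor 10^4,
# which cancels in the ratio).
p_gpd1_cc = wald_p_value(-5.30, 2.25)
p_virus_cc = wald_p_value(3.07, 1.57)
```

Here `p_virus_cc` is close to the printed value 0.051, which is the basis for describing the virus effect as only marginally significant under the complete-case analysis.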
7. CONCLUDING REMARKS

In this article, we have proposed the reweighting estimators for the AH model with missing covariates under the MAR assumption. The resulting estimators have closed forms and are easy to implement, and the asymptotic properties of the proposed estimators were also derived. The simulation results showed that the proposed estimators perform well, and are more efficient than the simple weighted estimators for the missing covariate effects when the selection probability is small. An application to the mouse leukaemia data was provided to illustrate our method.

Note that, given the complexity of the closed forms for the augmented reweighting estimators, which include integrals that in practice probably turn into discrete sums, the method does not appear so simple to program, and would not be a trivial modification of existing packages in the standard software. The algorithm may be computationally intensive and time-consuming for more than four covariates, because the formulae involve the calculation of high-dimensional inverse matrices (at least 4 × 4 inverse matrices).

The weight used in the weighted method is 1/πi(Xi, δi), which can become arbitrarily large if the selection probability is small. In contrast, the weight used in our proposed reweighting method is πi(t, 1)/πi(Xi, δi), which can handle some cases with small πi(Xi, δi). Of course, πi(t, 1) does not need to have the same functional form as πi(Xi, δi), and various weight functions may be imposed to construct the estimating equations. The proposed estimation method can be extended in a straightforward manner to the general reweighting framework by replacing πi(t, 1) with any weight function πi*(t), which does not depend on the outcome variables.

The proposed method using the estimated πi(Xi, δi) improves efficiency relative to that using the true πi(Xi, δi).
This is because the analysis using an estimated πi(Xi, δi) uses information from the subjects with incomplete covariates by estimating πi(Xi, δi), and hence incorporates more information in the estimation procedure. When the weights are estimated using a richly parameterized model, the efficiency of the simple reweighting estimators can be quite close to that of the augmented reweighting estimators. However, the augmented reweighting estimators have the double-robustness property, which allows more freedom in modelling the selection probability. Here we used the parametric models for the selection probability and the conditional expectations. As for future research, it would be worthwhile to consider nonparametric methods to estimate the selection probability and the conditional expectations (e.g., Qi, Wang, & Prentice, 2005).

Since the full-data pseudoscore estimator is not fully efficient (Lin & Ying, 1994), our proposed reweighting estimators would not achieve the semiparametric efficiency bound. Note that the sieve maximum likelihood estimators (Zeng, Yin, & Ibrahim, 2005) achieve the semiparametric efficiency bound in the full-cohort analysis. However, this method cannot be extended in a straightforward manner to deal with the AH model with missing covariates under the MAR assumption. Development of a more efficient estimation method merits future research.

In general, there are three types of assumptions on missingness: MCAR (missing completely at random), MAR and MNAR (missing not at random). The MAR assumption is common for statistical analysis with missing data and is reasonable in many practical situations (Little & Rubin, 2002), and MCAR is a special case of MAR. For MNAR, because missingness depends on missing data, we run into nonidentifiability problems. If the parameters are all identifiable, the proposed estimation procedure is still valid, provided that the selection probability is known or can be estimated by parametric methods. However, developing an estimation procedure for the AH model with missing covariates under the MNAR assumption is challenging and requires further research efforts.

ACKNOWLEDGEMENTS

The authors thank the Editor-in-Chief, Professor David A. Stephens, the Associate Editor, and two referees for their insightful comments and suggestions that greatly improved the article. Xinyuan Song's research was fully supported by the Research Grants Council of the Hong Kong Special Administrative Region. Liuquan Sun's research was fully supported by the National Natural Science Foundation of China Grants, and Key Laboratory of RCSDS, CAS.
APPENDIX

In order to study the asymptotic properties of the proposed estimators, we need the following regularity conditions:

(C1) $\Lambda_0(\tau) < \infty$, $\Pr\{X \ge \tau\} > 0$, and $E\{\sup_{t\in[0,\tau]} \pi_i(t,1) Y_i(t) \|Z_i(t)\|^2\} < \infty$.

(C2) $A$ is positive definite, and the sample paths of $Z_j(\cdot)$ $(j = 1, \ldots, p)$ are of bounded variation on $[0, \tau]$.

(C3) The selection probability $\pi(X, \delta)$ is bounded away from zero, and $\pi(t, 1)$ is of bounded variation and left continuous with respect to $t$ on $[0, \tau]$.

(C4) $\pi(W; \alpha)$ is twice continuously differentiable in $\alpha$, and there exists a compact neighborhood $\mathcal{A}$ of $\alpha_0$ such that $E[\sup_{\alpha\in\mathcal{A}} \{\|\pi'(W; \alpha)\|^2 + \|\pi''(W; \alpha)\|\}] < \infty$, and $E[\sup_{t\in[0,\tau],\,\alpha\in\mathcal{A}} \{\|\pi'(W(t); \alpha)\|^2 + \|\pi''(W(t); \alpha)\|\}] < \infty$, where $W(t) = (t, 1, Z_o(\cdot))$, $\pi'(W; \alpha) = \partial\pi(W; \alpha)/\partial\alpha$ and $\pi''(W; \alpha) = \partial^2\pi(W; \alpha)/\partial\alpha\,\partial\alpha^T$.

(C5) $E_\gamma\{Z_m(\cdot) \mid W\}$ is continuously differentiable in $\gamma$, and there exists a compact neighborhood $\mathcal{B}$ of $\gamma_0$ such that
$$E\Big[\sup_{t\in[0,\tau],\,\gamma\in\mathcal{B}} Y(t)\Big\{E_\gamma\{\|Z_m(t)\|^2 \mid W\} + \big\|\partial E_\gamma\{\|Z_m(t)\|^2 \mid W\}/\partial\gamma\big\|\Big\}\Big] < \infty.$$

Here we present the proof of Theorem 1, outline the proofs of Theorems 2–5 and put other details in the Supplementary Material.

Proof of Theorem 1(i).
Note that
$$n^{1/2}\{\hat\beta_{SR} - \beta_0\} = \hat A^{-1} n^{-1/2} U_{SR}(\beta_0), \tag{A.1}$$
where
$$U_{SR}(\beta) = \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)} \int_0^\tau \pi_i(t, 1)\{Z_i(t) - \bar Z_{SR}(t)\}\{dN_i(t) - Y_i(t)\beta^T Z_i(t)\,dt\}.$$
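To make the role of the factor π_i(t, 1)/π_i(X_i, δ_i) in U_SR concrete, the following sketch contrasts it with the inverse probability weight 1/π_i(X_i, δ_i) for one subject. It borrows the selection model of Table 2, in which the failure indicator does not enter, so π_i(t, 1) is simply the same logistic function evaluated at time t. The function names and the chosen subject (X = 0.1, Zo = 1, R = 1) are illustrative assumptions.

```python
import math


def selection_prob(x, z_o):
    """Table 2 selection model: exp(3x - 3 z_o) / (1 + exp(3x - 3 z_o))."""
    return 1.0 / (1.0 + math.exp(-(3.0 * x - 3.0 * z_o)))


def ipw_weight(r, x, z_o):
    """Inverse probability weight R_i / pi_i(X_i, delta_i)."""
    return r / selection_prob(x, z_o)


def reweighting_weight(r, t, x, z_o):
    """Reweighting weight R_i * pi_i(t, 1) / pi_i(X_i, delta_i) at time t."""
    return r * selection_prob(t, z_o) / selection_prob(x, z_o)


# A fully observed subject with a small selection probability.
w_ipw = ipw_weight(1, 0.1, 1)                # large: about 16
w_rw = reweighting_weight(1, 0.05, 0.1, 1)   # below 1 at t = 0.05 <= X
```

In this example the inverse probability weight is large because the selection probability is about 0.06, whereas the reweighting weight stays below one for t ≤ X because the numerator shrinks along with the denominator.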
A straightforward calculation yields
$$n^{-1/2} U_{SR}(\beta_0) = n^{-1/2} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)} \int_0^\tau \pi_i(t, 1)\{Z_i(t) - \bar Z_{SR}(t)\}\,dM_i(t). \tag{A.2}$$
The functional central limit theorem (Pollard, 1990) implies that $\sup_{t\in[0,\tau]} \|\bar Z_{SR}(t) - \bar z(t)\| = O_p(n^{-1/2})$. Since $M_i(t)$ is a martingale process and is the difference of two monotone functions, it follows from (A.2) and Lemma A.1 of Qi, Wang, & Prentice (2005) that
$$n^{-1/2} U_{SR}(\beta_0) = n^{-1/2} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)} B_i + o_p(1). \tag{A.3}$$
By utilizing the multivariate central limit theorem, $n^{-1/2} U_{SR}(\beta_0)$ is asymptotically normal with mean zero and covariance matrix $\Sigma_{SR}$. Using the uniform strong law of large numbers (Pollard, 1990), we can obtain that $\hat A \to A$ almost surely. Thus, it follows from (A.1) and (A.3) that $\hat\beta_{SR}$ is consistent and $n^{1/2}\{\hat\beta_{SR} - \beta_0\}$ is asymptotically normal with mean zero and covariance matrix $A^{-1}\Sigma_{SR}A^{-1}$. ∎

Proof of Theorem 1(ii).

First write
$$\hat\Lambda_0(t) - \Lambda_0(t) = n^{-1} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)} \int_0^t \frac{\pi_i(u, 1)\,dM_i(u)}{S_{SR}^{(0)}(u)} - \{\hat\beta_{SR} - \beta_0\}^T \int_0^t \bar Z_{SR}(u)\,du.$$
Note that $\sup_{t\in[0,\tau]} |S_{SR}^{(0)}(t) - s^{(0)}(t)| = O_p(n^{-1/2})$. Following similar arguments as in the proof of (i), together with (A.1) and (A.3), we obtain that uniformly on $[0, \tau]$,
$$\hat\Lambda_0(t) - \Lambda_0(t) = n^{-1} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)} \{C_i(t) - h(t)^T A^{-1} B_i\} + o_p(n^{-1/2}). \tag{A.4}$$
In view of the consistency of $\hat\beta_{SR}$, it follows from the uniform strong law of large numbers and the multivariate central limit theorem that $\sup_{t\in[0,\tau]} |\hat\Lambda_0(t) - \Lambda_0(t)| \to 0$ in probability, and $n^{1/2}\{\hat\Lambda_0(t) - \Lambda_0(t)\}$ converges in finite dimensional distributions to a zero-mean Gaussian process. Note that for each $i$, $Z_i(t) = \max\{Z_i(t), 0\} - \max\{-Z_i(t), 0\}$. Then $\int_0^t s^{(0)}(u)^{-1} \pi_i(u, 1) Y_i(u) \beta_0^T Z_i(u)\,du$ can be written as the sum of two monotone processes on $[0, \tau]$. Hence $R_i C_i(t)/\pi_i(X_i, \delta_i)$ can also be written as sums of monotone processes. Also it can be shown that $E\{R_i C_i(\tau)/\pi_i(X_i, \delta_i)\}^2 < \infty$. It then follows from Example 2.11.16 of van der Vaart & Wellner (1996, p. 215) that $n^{-1/2} \sum_{i=1}^n R_i C_i(t)/\pi_i(X_i, \delta_i)$ is tight. The second term on the right-hand side of (A.4) is tight because $\{\hat\beta_{SR} - \beta_0\}$ converges in distribution and $h(t)$ is a deterministic function. Thus, $n^{1/2}\{\hat\Lambda_0(t) - \Lambda_0(t)\}$ is tight and converges weakly to a zero-mean Gaussian process with covariance function $\Sigma_{SR}(s, t)$ at $(s, t)$. ∎

Proof of Theorem 2(i).
Similarly to (A.1), we have
$$n^{1/2}\{\hat\beta_{SR}(\hat\alpha) - \beta_0\} = \hat A(\hat\alpha)^{-1} n^{-1/2} U_{SR}(\beta_0; \hat\alpha), \tag{A.5}$$
where
$$U_{SR}(\beta_0; \hat\alpha) = \sum_{i=1}^n \int_0^\tau \frac{R_i \pi_i(t, 1; \hat\alpha)}{\pi_i(X_i, \delta_i; \hat\alpha)} \{Z_i(t) - \bar Z_{SR}(t; \hat\alpha)\}\{dN_i(t) - Y_i(t)\beta_0^T Z_i(t)\,dt\}.$$
Differentiation of $U_{SR}(\beta_0; \alpha)$ with respect to $\alpha$ yields $\partial U_{SR}(\beta_0; \alpha_0)/\partial\alpha^T = -D + o_p(1)$. In a similar manner, we obtain that $\hat A(\hat\alpha) = A + o_p(1)$. It then follows from (A.5) and the Taylor expansion that
$$n^{1/2}\{\hat\beta_{SR}(\hat\alpha) - \beta_0\} = A^{-1} n^{-1/2} U_{SR}(\beta_0; \alpha_0) - A^{-1} D\, n^{1/2}\{\hat\alpha - \alpha_0\} + o_p(1). \tag{A.6}$$
Since $\hat\alpha$ is the maximum likelihood estimator of $\alpha_0$, we have
$$n^{1/2}\{\hat\alpha - \alpha_0\} = \Sigma_\alpha\, n^{-1/2} \sum_{i=1}^n U_{\alpha i}(\alpha_0) + o_p(1), \tag{A.7}$$
where $U_{\alpha i}$ is the score function derived from the parametric model $\pi(W_i; \alpha)$. Thus, it follows from (A.3), (A.6) and (A.7) that $\hat\beta_{SR}(\hat\alpha)$ is consistent, and $n^{1/2}\{\hat\beta_{SR}(\hat\alpha) - \beta_0\}$ and $n^{1/2}\{\hat\alpha - \alpha_0\}$ are asymptotically multivariate normal. Note that $\hat\alpha$ is asymptotically efficient. Then it follows from (1.3) of Pierce (1982) that $n^{1/2}\{\hat\beta_{SR}(\hat\alpha) - \beta_0\}$ is asymptotically normal with mean zero and covariance matrix $A^{-1}[\Sigma_{SR} - D\Sigma_\alpha D^T]A^{-1}$. ∎

Proof of Theorem 2(ii).
In view of (A.4), (A.6) and (A.7), we have
$$n^{1/2}\{\hat\Lambda_0(t; \hat\alpha) - \Lambda_0(t)\} = n^{-1/2} \sum_{i=1}^n \Big[\frac{R_i}{\pi_i(X_i, \delta_i)} C_i(t) - h(t)^T A^{-1} \frac{R_i}{\pi_i(X_i, \delta_i)} B_i - \{Q(t)^T - h(t)^T A^{-1} D\}\Sigma_\alpha U_{\alpha i}(\alpha_0)\Big] + o_p(1). \tag{A.8}$$
Thus, it follows from the uniform strong law of large numbers and the multivariate central limit theorem that $\sup_{t\in[0,\tau]} |\hat\Lambda_0(t; \hat\alpha) - \Lambda_0(t)| \to 0$ in probability, and $n^{1/2}\{\hat\Lambda_0(t; \hat\alpha) - \Lambda_0(t)\}$ converges in finite dimensional distributions to a zero-mean Gaussian process. As in the proof of Theorem 1(ii), the first two terms on the right-hand side of (A.8) are tight. The third term is tight because $n^{1/2}\{\hat\alpha - \alpha_0\}$ converges in distribution and $\{Q(t)^T - h(t)^T A^{-1} D\}$ is a deterministic vector function. Therefore, $n^{1/2}\{\hat\Lambda_0(t; \hat\alpha) - \Lambda_0(t)\}$ is tight and converges weakly to a zero-mean Gaussian process with covariance function $\Sigma_{SR}(s, t) - \{Q(s) - D^T A^{-1} h(s)\}^T \Sigma_\alpha \{Q(t) - D^T A^{-1} h(t)\}$ at $(s, t)$. ∎

Proof of Theorem 3(i).
As in the proof of Theorem 1(i), we can show that
$$n^{1/2}\{\hat\beta_{AR} - \beta_0\} = A^{-1} n^{-1/2} U_{AR}(\beta_0) + o_p(1), \tag{A.9}$$
and
$$n^{-1/2} U_{AR}(\beta_0) = n^{-1/2} \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i)} \int_0^\tau \pi_i(t, 1)\{Z_i(t) - \bar z(t)\}\,dM_i(t) - n^{-1/2} \sum_{i=1}^n \frac{R_i - \pi_i(X_i, \delta_i)}{\pi_i(X_i, \delta_i)} \int_0^\tau \pi_i(t, 1) E[\{Z_i(t) - \bar z(t)\}\,dM_i(t) \mid W_i] + o_p(1). \tag{A.10}$$
Thus, by (A.9) and (A.10) and the multivariate central limit theorem, the consistency and the asymptotic normality of $\hat\beta_{AR}$ can be established. ∎

Proof of Theorem 3(ii).
It can be shown that
$$\hat\Lambda_a(t) - \Lambda_0(t) = n^{-1} \sum_{i=1}^n \nu_i(t) + o_p(1), \tag{A.11}$$
where
$$\nu_i(t) = \int_0^t \Big[\frac{R_i \pi_i(u, 1)}{\pi_i(X_i, \delta_i) s^{(0)}(u)}\,dM_i(u) + \Big(1 - \frac{R_i}{\pi_i(X_i, \delta_i)}\Big) \frac{\pi_i(u, 1)}{s^{(0)}(u)} E[dM_i(u) \mid W_i]\Big] - h(t)^T A^{-1} \Big\{\frac{R_i}{\pi_i(X_i, \delta_i)} B_i - \frac{R_i - \pi_i(X_i, \delta_i)}{\pi_i(X_i, \delta_i)} E[B_i \mid W_i]\Big\}.$$
As in the proof of Theorem 1(ii), $n^{1/2}\{\hat\Lambda_a(t) - \Lambda_0(t)\}$ converges weakly to a zero-mean Gaussian process with covariance function at $(s, t)$ equal to $\Sigma_{AR}(s, t) = E\{\nu_i(s)\nu_i(t)\}$. ∎

Proof of Theorem 4.

Define $M_i(t; \beta, \Lambda) = N_i(t) - \int_0^t Y_i(u)\{d\Lambda(u) + \beta^T Z_i(u)\,du\}$,
$$U_1(t, \beta, \Lambda; \alpha, \gamma) = \sum_{i=1}^n \Big[\frac{R_i \pi_i(t, 1; \alpha)}{\pi_i(X_i, \delta_i; \alpha)}\,dM_i(t; \beta, \Lambda) + \Big(1 - \frac{R_i}{\pi_i(X_i, \delta_i; \alpha)}\Big) \pi_i(t, 1; \alpha) E_\gamma[dM_i(t; \beta, \Lambda) \mid W_i]\Big], \quad 0 \le t \le \tau,$$
and
$$U_2(\beta, \Lambda; \alpha, \gamma) = \sum_{i=1}^n \Big[\int_0^\tau \frac{R_i \pi_i(t, 1; \alpha)}{\pi_i(X_i, \delta_i; \alpha)} Z_i(t)\,dM_i(t; \beta, \Lambda) + \Big(1 - \frac{R_i}{\pi_i(X_i, \delta_i; \alpha)}\Big) \int_0^\tau \pi_i(t, 1; \alpha) E_\gamma[Z_i(t)\,dM_i(t; \beta, \Lambda) \mid W_i]\Big].$$
Note that $\hat\Lambda_a(t; \hat\alpha, \hat\gamma)$ and $\hat\beta_{AR}(\hat\alpha, \hat\gamma)$ are the solutions to $U_1(t, \beta, \Lambda; \hat\alpha, \hat\gamma) = 0$ and $U_2(\beta, \Lambda; \hat\alpha, \hat\gamma) = 0$. Let $\alpha^*$ and $\gamma^*$ denote the limits of $\hat\alpha$ and $\hat\gamma$, respectively. In order to prove the double-robustness property of $\hat\Lambda_a(t; \hat\alpha, \hat\gamma)$ and $\hat\beta_{AR}(\hat\alpha, \hat\gamma)$, it suffices to show that $EU_1(t, \beta_0, \Lambda_0; \alpha^*, \gamma^*) = 0$ and $EU_2(\beta_0, \Lambda_0; \alpha^*, \gamma^*) = 0$ if either $\alpha^* = \alpha_0$ or $\gamma^* = \gamma_0$. If $\alpha^* = \alpha_0$, then it follows from the MAR assumption that for $0 \le t \le \tau$,
$$EU_1(t, \beta_0, \Lambda_0; \alpha_0, \gamma^*) = E \sum_{i=1}^n E\Big[\frac{R_i \pi_i(t, 1; \alpha_0)}{\pi_i(X_i, \delta_i; \alpha_0)}\,dM_i(t) + \Big(1 - \frac{R_i}{\pi_i(X_i, \delta_i; \alpha_0)}\Big) \pi_i(t, 1; \alpha_0) E_{\gamma^*}[dM_i(t) \mid W_i] \,\Big|\, W_i, Z_{mi}(\cdot)\Big] = E \sum_{i=1}^n \pi_i(t, 1; \alpha_0)\,dM_i(t) = 0,$$
where $M_i(t) = M_i(t; \beta_0, \Lambda_0)$ is a martingale process (Lin & Ying, 1994). In a similar manner, we have that $EU_2(\beta_0, \Lambda_0; \alpha_0, \gamma^*) = 0$. If $\gamma^* = \gamma_0$, then we have that for $0 \le t \le \tau$,
$$U_1(t, \beta_0, \Lambda_0; \alpha^*, \gamma_0) = \sum_{i=1}^n \Big[\pi_i(t, 1; \alpha^*)\,dM_i(t) + \Big(1 - \frac{R_i}{\pi_i(X_i, \delta_i; \alpha^*)}\Big) \pi_i(t, 1; \alpha^*) \big\{E_{\gamma_0}[dM_i(t) \mid W_i] - dM_i(t)\big\}\Big]. \tag{A.12}$$
The expectation of the first term on the right-hand side of (A.12) is zero. It can be checked that the expectation of the second term equals
$$E\Big[E\Big\{\Big(1 - \frac{R_i}{\pi_i(X_i, \delta_i; \alpha^*)}\Big) \pi_i(t, 1; \alpha^*) \big(E_{\gamma_0}[dM_i(t) \mid W_i] - dM_i(t)\big) \,\Big|\, W_i, Z_{mi}(\cdot)\Big\}\Big] = 0.$$
Hence, $EU_1(t, \beta_0, \Lambda_0; \alpha^*, \gamma_0) = 0$. Likewise, we obtain that $EU_2(\beta_0, \Lambda_0; \alpha^*, \gamma_0) = 0$. Thus, $\hat\Lambda_a(t; \hat\alpha, \hat\gamma)$ and $\hat\beta_{AR}(\hat\alpha, \hat\gamma)$ are consistent if either the selection probability or the conditional distribution of the missing covariates is correctly specified. ∎

Proof of Theorem 5(i).
As in the proof of Theorem 2(i), we have
$$n^{1/2}\{\hat\beta_{AR}(\hat\alpha, \hat\gamma) - \beta_0\} = A^{-1} n^{-1/2} U_{AR}(\beta_0; \hat\alpha, \hat\gamma) + o_p(1), \tag{A.13}$$
where
$$U_{AR}(\beta_0; \hat\alpha, \hat\gamma) = \sum_{i=1}^n \int_0^\tau \frac{R_i \pi_i(t, 1; \hat\alpha)}{\pi_i(X_i, \delta_i; \hat\alpha)} \{Z_i(t) - \bar Z_{AR}(t; \hat\alpha, \hat\gamma)\}\{dN_i(t) - Y_i(t)\beta_0^T Z_i(t)\,dt\} + \sum_{i=1}^n \Big(1 - \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)}\Big) \int_0^\tau \pi_i(t, 1; \hat\alpha) \Big[E_{\hat\gamma}\{Z_i(t)[dN_i(t) - Y_i(t)\beta_0^T Z_i(t)\,dt] \mid W_i\} - \bar Z_{AR}(t; \hat\alpha, \hat\gamma) E_{\hat\gamma}\{[dN_i(t) - Y_i(t)\beta_0^T Z_i(t)\,dt] \mid W_i\}\Big].$$
It follows from Lemma A.1 of Qi, Wang, & Prentice (2005) that
$$n^{-1/2} U_{AR}(\beta_0; \hat\alpha, \hat\gamma) = n^{-1/2} U_{AR}^*(\beta_0; \hat\alpha, \hat\gamma) + o_p(1), \tag{A.14}$$
where
$$U_{AR}^*(\beta_0; \hat\alpha, \hat\gamma) = \sum_{i=1}^n \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)} \int_0^\tau \pi_i(t, 1; \hat\alpha)\{Z_i(t) - \bar z(t)\}\,dM_i(t) + \sum_{i=1}^n \Big(1 - \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)}\Big) \int_0^\tau \pi_i(t, 1; \hat\alpha) E_{\hat\gamma}[\{Z_i(t) - \bar z(t)\}\,dM_i(t) \mid W_i].$$
It can be checked that
$$n^{-1/2} U_{AR}^*(\beta_0; \hat\alpha, \hat\gamma) = n^{-1/2} U_{AR}^*(\beta_0; \alpha_0, \gamma_0) + o_p(1). \tag{A.15}$$
Thus, it follows from (A.13)–(A.15) that $\hat\beta_{AR}(\hat\alpha, \hat\gamma)$ is consistent and $n^{1/2}\{\hat\beta_{AR}(\hat\alpha, \hat\gamma) - \beta_0\}$ is asymptotically normal with mean zero and covariance matrix $A^{-1}\Sigma_{AR}A^{-1}$. Now, we construct a class of estimating equations for $\beta$ with the asymptotic expression:
$$n^{-1/2} \sum_{i=1}^n \Big[\frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)} B_i + \Big(1 - \frac{R_i}{\pi_i(X_i, \delta_i; \hat\alpha)}\Big) e(W_i)\Big],$$
where $e(W_i)$ is any function. By following the argument used in Example 25.43 of van der Vaart (1998), we can show that the asymptotic variance of the class of estimators is minimized by taking $e(W_i) = E\{B_i \mid W_i\}$. Note that $\hat\beta_{SR}(\hat\alpha)$ is also in this class with $e(W_i) = 0$. Thus, the asymptotic variance of $\hat\beta_{AR}(\hat\alpha, \hat\gamma)$ is no greater than that of $\hat\beta_{SR}(\hat\alpha)$. ∎

Proof of Theorem 5(ii).
It can be checked that
$$\hat\Lambda_a(t; \hat\alpha, \hat\gamma) - \Lambda_0(t) = n^{-1} \sum_{i=1}^n \nu_i(t) + o_p(1).$$
The assertion is immediate from the proof of Theorem 3(ii). ∎

Proof of (5) in Section 4.

As in the proof of Theorem 2(ii), and using the Taylor expansion and (A.8), we obtain
$$F(t, z) = n^{-1/2} \sum_{i=1}^n \Big[\frac{R_i}{\pi_i(X_i, \delta_i; \alpha_0)} \int_0^t \Big\{I(Z_i(u) \le z) - \frac{f_1(u, z)}{s^{(0)}(u)}\Big\} \pi_i(u, 1; \alpha_0)\,dM_i(u) - \frac{R_i}{\pi_i(X_i, \delta_i; \alpha_0)} f_2(t, z) A^{-1} B_i + \Big\{\int_0^t f_1(u, z)\,dQ(u)^T + f_2(t, z) A^{-1} D - f_3(t, z)^T\Big\} \Sigma_\alpha U_{\alpha i}\Big] + o_p(1),$$
which is a sum of i.i.d. zero-mean terms for fixed $t$ and $z$, where $f_1(u, z)$, $f_2(t, z)$ and $f_3(t, z)$ are the limits of $\hat f_1(u, z; \hat\alpha)$, $\hat f_2(t, z; \hat\alpha)$ and $\hat f_3(t, z; \hat\alpha)$, respectively. By the multivariate central limit theorem, $F(t, z)$ converges in finite-dimensional distributions to a zero-mean Gaussian process. As in the proof of Theorem 2(ii), $F(t, z)$ is tight. Hence $F(t, z)$ converges weakly to a zero-mean Gaussian process which can be approximated by the zero-mean Gaussian process $\tilde F(t, z)$ given by (5). ∎

BIBLIOGRAPHY

Breslow, N. E. & Day, N. E. (1987). Statistical Models in Cancer Research 2: The Design and Analysis of Cohort Studies, International Agency for Research on Cancer, Lyon.
Chen, H. Y. & Little, R. J. A. (1999). Proportional hazards regression with missing covariates. Journal of the American Statistical Association, 94, 896–908.
Cox, D. R. (1972). Regression models and life-tables (with discussion). Journal of the Royal Statistical Society Series B, 34, 187–220.
Cox, D. R. & Oakes, D. (1984). Analysis of Survival Data, Chapman and Hall, London.
Henmi, M. & Eguchi, S. (2004). A paradox concerning nuisance parameters and projected estimating functions. Biometrika, 91, 929–941.
Kalbfleisch, J. D. & Prentice, R. L. (2002). The Statistical Analysis of Failure Time Data, 2nd ed., John Wiley & Sons, New York.
Kulich, M. & Lin, D. Y. (2000). Additive hazards regression for case-cohort studies. Biometrika, 87, 73–87.
Liang, K. Y. & Zeger, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika, 73, 13–22.
Lin, D. Y., Wei, L. J., & Ying, Z. (1993). Checking the Cox model with cumulative sums of martingale-based residuals. Biometrika, 85, 605–619.
Lin, D. Y. & Ying, Z. (1994). Semiparametric analysis of the additive risk model. Biometrika, 81, 61–71.
Lin, W. (2011). Missing Covariates and High-Dimensional Variable Selection in Additive Hazards Regression, Ph.D. dissertation, University of Southern California.
Lipsitz, S. R., Ibrahim, J. G., & Zhao, L. P. (1999). A weighted estimating equation for missing covariate data with properties similar to maximum likelihood. Journal of the American Statistical Association, 94, 1147–1160.
Little, R. J. A. & Rubin, D. B. (2002). Statistical Analysis with Missing Data, 2nd ed., John Wiley & Sons, New York.
Luo, X. D., Tsai, W. Y., & Xu, Q. (2009). Pseudo-partial likelihood estimators for the Cox regression model with missing covariates. Biometrika, 96, 617–633.
Paik, M. C. & Tsai, W. Y. (1997). On using the Cox proportional hazards model with missing covariates. Biometrika, 84, 579–593.
Pierce, D. A. (1982). The asymptotic effect of substituting estimators for parameters in certain types of statistics. The Annals of Statistics, 10, 475–478.
Pollard, D. (1990). Empirical Processes: Theory and Applications, Institute of Mathematical Statistics, Hayward, California.
Qi, L., Wang, C. Y., & Prentice, R. L. (2005). Weighted estimators for proportional hazards regression with missing covariates. Journal of the American Statistical Association, 100, 1250–1263.
Robins, J. M., Rotnitzky, A., & Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89, 846–866.
Tsiatis, A. A. (2006). Semiparametric Theory and Missing Data, Springer, New York.
van der Vaart, A. W. (1998). Asymptotic Statistics, Cambridge University Press, New York.
van der Vaart, A. W. & Wellner, J. A. (1996). Weak Convergence and Empirical Processes, Springer-Verlag, New York.
Wang, C. Y. & Chen, H. Y. (2001). Augmented inverse probability weighted estimator for Cox missing covariate regression. Biometrics, 57, 414–419.
Wang, C. Y., Hsu, L., Feng, Z. D., & Prentice, R. L. (1997). Regression calibration in failure time regression. Biometrics, 53, 131–145.
Xu, Q., Paik, M. C., Luo, X. D., & Tsai, W. Y. (2009). Reweighting estimators for Cox regression with missing covariates. Journal of the American Statistical Association, 104, 1155–1167.
Zeng, D. L., Yin, G., & Ibrahim, J. G. (2005). Inference for a class of transformed hazards models. Journal of the American Statistical Association, 100, 1000–1008.

Received 9 July 2009
Accepted 8 July 2010