DOI: 10.1111/j.1541-0420.2008.01056.x
Biometrics 65, 143–151 March 2009
Nonparametric Estimation in a Markov “Illness–Death” Process from Interval Censored Observations with Missing Intermediate Transition Status
Halina Frydman1,∗ and Michael Szarek2,3
1 Stern School of Business, New York University, 44 West 4th Street, New York, New York 10012, U.S.A.
2 Division of Biostatistics, NYU School of Medicine, 650 First Avenue, New York, New York 10016, U.S.A.
3 ImClone Systems Incorporated, 33 ImClone Drive, Branchburg, New Jersey 08876, U.S.A.
∗ email:
[email protected] Summary. In many clinical trials patients are intermittently assessed for the transition to an intermediate state, such as occurrence of a disease-related nonfatal event, and death. Estimation of the distribution of nonfatal event free survival time, that is, the time to the first occurrence of the nonfatal event or death, is the primary focus of the data analysis. The difficulty with this estimation is that the intermittent assessment of patients results in two forms of incompleteness: the times of occurrence of nonfatal events are interval censored and, when a nonfatal event does not occur by the time of the last assessment, a patient’s nonfatal event status is not known from the time of the last assessment until the end of follow-up for death. We consider both forms of incompleteness within the framework of an “illness–death” model. We develop nonparametric maximum likelihood (ML) estimation in an “illness–death” model from interval-censored observations with missing status of intermediate transition. We show that the ML estimators are self-consistent and propose an algorithm for obtaining them. This work thus provides new methodology for the analysis of incomplete data that arise from clinical trials. We apply this methodology to the data from a recently reported cancer clinical trial (Bonner et al., 2006, New England Journal of Medicine 354, 567–578) and compare our estimation results with those obtained using a Food and Drug Administration recommended convention. Key words: Clinical trials; Illness–death process; Interval censoring; Missing status of intermediate transition; Nonparametric maximum likelihood; Progression-free survival.
© 2008, The International Biometric Society

1. Introduction
In a number of recent cancer clinical trials, interest centers on the nonparametric estimation of the distribution of time to first event in a composite of two events: death and a disease-related nonfatal event. The patients are periodically assessed for the occurrence of the nonfatal event until the end of follow-up for death. This estimation problem is not straightforward because (i) patients are assessed intermittently, which results in interval-censored times of occurrence of the nonfatal event, especially if the event is asymptomatic (e.g., tumor progression) and (ii) if the last assessment prior to death or right censoring for death is negative with respect to experiencing the nonfatal event, the nonfatal event status of a patient is not known between the time of that last assessment and the end of follow-up. The first form of incompleteness is generally addressed by assuming that the time to an observed nonfatal event is exact and corresponds to the date of the first assessment documenting its occurrence. The second form of incompleteness is often treated in a manner that depends on the vital status of a patient at the end of follow-up for death. If the patient is known to be alive, the time to first event is right censored at the time of the last negative assessment. If the patient is known to be dead, the death is assumed to have occurred without the patient experiencing a nonfatal event. These strategies are recommended by the Food and Drug Administration (FDA, 2007) and European Medicines Agency (EMEA, 2006) in the context of cancer drug development. However, the recommendation for the first form of incompleteness would, on average, assign event times that are too long among those with a documented nonfatal event. The second recommendation ignores the possibility that a patient may have experienced a nonfatal event between the time of the last negative assessment and death or the end of follow-up. In this article, we develop a nonparametric maximum likelihood (ML) estimation procedure that uses all available information on patients, including their vital status, through the end of follow-up, and that properly accounts for interval censoring of times of occurrence of the nonfatal events. We do so in the framework of the three-state continuous time Markov chain, the so-called “illness–death” model. In this model, depicted in Figure 1, state 1 is an initial state that in the context of clinical trials represents an initial state of the disease,
Figure 1. Illness–death model.
intermediate state 2 represents a disease-related complication (nonfatal event), and state 3 represents the occurrence of a terminal event such as death. We obtain the maximum likelihood estimators (MLEs) of the two subdistribution functions corresponding to 1 → 2 and to 1 → 3 transitions and of the 2 → 3 transition intensity from the incomplete data described in Section 2. This estimation is carried out under the assumption that the observation scheme is noninformative as discussed in Gruger, Kay, and Schumacher (1991). We then show that the MLEs are also self-consistent estimators and, based on this property, propose an easily applicable algorithm for obtaining the MLEs of the parameters. The idea of self-consistent estimation was introduced by Efron (1967) and is a version of the expectation–maximization algorithm. This idea was extended to the estimation of a distribution function from interval-censored data by Turnbull (1976), and then further extended by Frydman (1992, 1995) to the estimation in multistate models and by Hudgens, Satten, and Longini (2001) to the competing risks setting in the presence of interval-censored times of transitions. We show that the self-consistency idea is also applicable to the present data setting. In this article, we substantially extend the ML methodology from Frydman (1992, 1995) to incorporate the observations with unknown intermediate event status. The only other treatment of the estimation problem considered here has been by Joly et al. (2002), who used a penalized ML estimation to obtain smooth estimates of transition intensities. Our approach is different and provides the MLEs of these parameters. Through a simulation study, we compare the performance of our method with that of a variant which takes into account interval censoring but treats observations with an unknown status as recommended by the FDA (hereafter referred to as the corrected FDA analysis).
The simulation results indicate that our method corrects the bias in the estimates of the distribution of time to first event in the composite and of the cumulative intensity of death after the occurrence of a nonfatal event, which results when the estimation is carried out using the corrected FDA method. We illustrate our method with the data reported in Bonner et al. (2006). In this trial, one of the objectives was to study progression-free survival (PFS), defined as the time from
randomization until objective tumor progression or death. Patients in the trial had locoregionally advanced squamous-cell carcinoma of the head and neck and were randomized to treatment with radiotherapy plus cetuximab or radiotherapy alone. We apply our method and the corrected FDA method to the PFS data from the cetuximab-treated group of patients, of whom about 53% had missing intermediate event (tumor progression) status. The two analyses result in slightly different estimates of the distribution of time to the first event in the composite, but substantially different estimates of the cumulative intensity of death both before and after the occurrence of the intermediate event. Our estimates of the cumulative intensities of death seem to be more in line with clinical expectations for this disease setting. The article is organized as follows. Section 2 describes the data and Section 3 derives the likelihood function. ML estimation is discussed in Section 4. Section 5 presents self-consistent estimation, the algorithm for obtaining the MLEs, and exhibits the MLEs. A simulation study is presented in Section 6, and application to the oncology trial is discussed in Section 7. Supplementary Materials are listed in Section 8.

2. Description of Data
We consider estimation in a continuous time Markov chain $X(t)$, $t \ge 0$, depicted in Figure 1, from the following incomplete observations. Assuming that all observations are in state 1 at time 0, let $S$ be the exit time from state 1 and $T$ be the entry time to state 3. There are $M$ observations with interval-censored times of the 1 → 2 transition, of which $\tilde N$ also make a transition to state 3 and the remaining ones are right censored in state 2:
$$\{\delta_m = 1,\ L_m \le S \le R_m,\ T = t_m,\ m \le \tilde N\} \cup \{\delta_m = 1,\ L_m \le S \le R_m,\ T > t_m,\ \tilde N < m \le M\}, \tag{1}$$
where $L_m < R_m$, and $\delta_m = 1\,(0)$ indicates that the 1 → 2 transition occurred (did not occur). An $m$th observation makes a 1 → 2 transition in an interval $[L_m, R_m]$ and enters state 3 at time $t_m$ if $1 \le m \le \tilde N$, but is right censored at $t_m$ if $m > \tilde N$. There are $\tilde K$ observations that make a direct 1 → 3 transition at times:
$$\{T = e_k,\ 1 \le k \le \tilde K\}, \tag{2}$$
and $J$ observations that are right censored in state 1 at times:
$$\{s_j,\ 1 \le j \le J\}. \tag{3}$$
The estimation of the transition intensities from the data described so far has been discussed in Frydman (1995). But here we also consider observations with a missing intermediate transition status, namely there are $U$ observations of the form $\{\delta_{M+u} = ?,\ X(L_{M+u}-) = 1,\ T = t_{M+u},\ 1 \le u \le U\}$. For the $u$th such observation we know that it was last seen in state 1 at time $L_{M+u}-$ and subsequently made a transition to state 3 at time $t_{M+u}$, but we do not know if this transition occurred from state 1 or state 2. We can represent a $u$th observation, $1 \le u \le U$, from this set as
$$\{\delta_{M+u} = 1,\ L_{M+u} \le S \le t_{M+u}-,\ T = t_{M+u}\} \cup \{\delta_{M+u} = 0,\ T = t_{M+u}\}. \tag{4}$$
There are also $C$ observations of the form $\{\delta_{W+c} = ?,\ X(L_{W+c}-) = 1,\ T > t_{W+c},\ 1 \le c \le C\}$, where $W = M + U$. For the $c$th such observation we know that it was last seen in state 1 at time $L_{W+c}-$ and subsequently was right censored at $t_{W+c}$, but we do not know whether it was right censored in state 1 or in state 2. We can represent a $c$th observation, $1 \le c \le C$, from this set as
$$\{\delta_{W+c} = 1,\ L_{W+c} \le S \le t_{W+c},\ T > t_{W+c}\} \cup \{\delta_{W+c} = 0,\ T > t_{W+c}\}. \tag{5}$$
Consider the set of failure times $\{t_m, 1 \le m \le \tilde N\} \cup \{t_{M+u}, 1 \le u \le U\}$. Then each time in the first set is an observed time of 2 → 3 transition and each time in the second set is a potential time of this transition. Suppose that $T_1^* = \{t_n^*, 1 \le n \le N_1\}$ is a set of $N_1$ distinct times from the set $\{t_m, 1 \le m \le \tilde N\}$ and $T_2^* = \{t_n^*, N_1 + 1 \le n \le N\}$ is a set of distinct times from $\{t_{M+u}, 1 \le u \le U\}$ such that $T_1^* \cap T_2^* = \emptyset$. Let $T^* = T_1^* \cup T_2^* = \{t_n^*, 1 \le n \le N\}$ be the set of (observed and potential) distinct times of 2 → 3 transitions. Let $d_n$ be the multiplicity of $t_n^*$.

Consider $\{e_k, 1 \le k \le \tilde K\} \cup \{t_{M+u}, 1 \le u \le U\}$. Then each time in the first set is an observed time of 1 → 3 transition and each time in the second set is a potential time of 1 → 3 transition. Let $E_1^* = \{e_k^*, 1 \le k \le K_1\}$ be a set of $K_1$ distinct times from the set $\{e_k, 1 \le k \le \tilde K\}$ and $E_2^* = \{e_{K_1+1}^*, \ldots, e_K^*\}$ a set of distinct times from $\{t_{M+u}, 1 \le u \le U\}$ such that $E_1^* \cap E_2^* = \emptyset$. Let $E^* = E_1^* \cup E_2^* = \{e_k^*, 1 \le k \le K\}$ be the set of distinct (observed and potential) times of 1 → 3 transition. Let $c_k$ be the multiplicity of $e_k^*$. Finally, let $N^* = M + U + C + \tilde K + J$ be the total number of observations, and $M' = M + U + C$.

3. The Likelihood Function
Let $S$ be the exit time from state 1, $F(s) = P(S \le s)$, and consider the subdistribution functions $F_{12}(s) = P(S \le s, \delta = 1)$ and $F_{13}(s) = P(S \le s, \delta = 0)$. Clearly $F(s) = F_{12}(s) + F_{13}(s)$. We will express the likelihood function of the observations in terms of $F_{12}(s)$, $F_{13}(s)$ and in terms of a discrete 2 → 3 transition intensity function $\Lambda_{23}(\{t\}) = \Pr(T = t \mid T \ge t, \delta = 1)$. For $t_n^* \in T^*$ we set $\lambda_n := \Lambda_{23}(\{t_n^*\})$. We note that the cumulative transition intensity is $\Lambda_{23}(t) = \sum_{n} \Lambda_{23}(\{t_n^*\}) I(t_n^* \le t)$, where $I(\cdot)$ is an indicator function. Following Frydman (1995), the likelihood of an $m$th observation in (1), when $m \le \tilde N$, is
$$\Pr(T = t_m, S \in [L_m, R_m], \delta_m = 1) = \int_{L_m}^{R_m} \Pr(T = t_m \mid S = s, \delta = 1)\, dF_{12}(s) = \lambda_m \prod_{t_n^* \in (R_m, t_m)} (1 - \lambda_n) \int_{L_m}^{R_m} \prod_{t_n^* \in (s, R_m]} (1 - \lambda_n)\, dF_{12}(s),$$
where $\lambda_m$ here denotes $\Lambda_{23}(\{t_m\})$ and the product $\prod_{t_n^* \in G} (1 - \lambda_n)$ is over all $\lambda_n$'s for which $t_n^* \in G$. To simplify notation we will from now on write $\prod G := \prod_{t_n^* \in G} (1 - \lambda_n)$ for any set $G$ and assume $\prod G = 1$ if $G \cap T^* = \emptyset$. We also let
$$I(A_m) := \int_{L_m}^{R_m} \prod (s, R_m]\, dF_{12}(s), \quad 1 \le m \le M', \tag{6}$$
where $A_m := [L_m, R_m]$, and, for $M < m \le M + U$, $R_{M+u} = t_{M+u}-$, and for $W < m \le M'$, $R_{W+c} = t_{W+c}$. In this notation, the likelihood of an $m$th observation in (1), when $m \le \tilde N$, is $\lambda_m \left(\prod (R_m, t_m)\right) I(A_m)$. It now follows that the likelihood of the observations in equation (1) takes the form $\prod_{n=1}^{N_1} \lambda_n^{d_n} \prod_{m=1}^{M} \left(\prod G_m\right) I(A_m)$, where $G_m = (R_m, t_m)$ if $m \le \tilde N$, and $G_m = (R_m, t_m]$ if $m > \tilde N$. We obtain the likelihood function, $L_1$, of the observations with a known 1 → 2 transition status by also including the contributions of observations in (2) and (3). This gives
$$L_1 = \prod_{n=1}^{N_1} \lambda_n^{d_n} \times \prod_{m=1}^{M} \Big(\prod G_m\Big) I(A_m) \times \prod_{k=1}^{K_1} \left[F_{13}(e_k^*) - F_{13}(e_k^*-)\right]^{c_k} \times \prod_{j=1}^{J} \{1 - F_{12}(s_j-) - F_{13}(s_j-)\}. \tag{7}$$
The likelihood function, $L_2$, of observations in equations (4) and (5) with an unknown 1 → 2 transition status, using notation (6), is
$$L_2 = \prod_{u=1}^{U} \left[F_{13}(t_{M+u}) - F_{13}(t_{M+u}-) + \lambda_{M+u} I(A_{M+u})\right] \times \prod_{c=1}^{C} \left[1 - F_{12}(t_{W+c}) - F_{13}(t_{W+c}) + I(A_{W+c})\right]. \tag{8}$$
For $1 \le m \le M'$, we have
$$I(A_m) = \sum_{k=1}^{k(m)} a(k, m)\left[F_{12}(\tau_k^m-) - F_{12}(\tau_{k-1}^m-)\right] + F_{12}(R_m) - F_{12}(\tau_{k(m)}^m-) = F_{12}(R_m) - \sum_{k=0}^{k(m)} F_{12}(\tau_k^m-)\left[a(k+1, m) - a(k, m)\right], \tag{9}$$
where, for $1 \le k \le k(m)$, $a(k, m) \equiv \prod [\tau_k^m, R_m]$, $a(0, m) = 0$, $a(k(m)+1, m) = 1$, and $\{\tau_k^m \in T^*, 0 < k \le k(m)\}$ is a partition of $(L_m, R_m]$ by the times of 2 → 3 transitions, and we set $\tau_0^m \equiv L_m$. We note that upon substitution of equation (9) into equation (8), the second factor of $L_2$ becomes
$$\prod_{c=1}^{C} \left\{1 - F_{13}(t_{W+c}) - \sum_{k=0}^{k(W+c)} F_{12}(\tau_k^{W+c}-)\left[a(k+1, W+c) - a(k, W+c)\right]\right\}. \tag{10}$$
Finally we note that for $0 \le k \le k(m)$, $a(k+1, m) - a(k, m) > 0$, and set $L = L_1 L_2$. It then follows from equations (7)–(10) that to maximize the likelihood function $L$ with respect to $F_{12}$ we should make the values $\{F_{12}(\tau_k^m-), 0 \le k \le k(m), 1 \le m \le M'\}$ and $\{F_{12}(s_j-), 1 \le j \le J\}$ as small as possible and the values $\{F_{12}(R_m), 1 \le m \le W\}$ as large as possible, subject to the constraint that $F_{12}$ is a subdistribution function. Accordingly we form two sets of times
$$\bar L = \{L_m, 1 \le m \le M'\} \cup \{T^* \cap A\} \cup \{S_J \cap A\} \cup \{s_{\max} : s_{\max} > R_{\max} \vee e_{\max}^*\},$$
$$\bar R = \{R_m, 1 \le m \le W\} \cup \{\infty\},$$
where $A \equiv \cup_{m=1}^{M'} A_m$, $s_{\max} = \max(s_j, 1 \le j \le J)$, $R_{\max} = \max(R_m, 1 \le m \le W)$, and $e_{\max}^* = \max(e_k^*, 1 \le k \le K)$. (This construction builds on the one in Frydman [1995].) From those two sets of times we construct a set of disjoint closed intervals whose left and right end points lie in the sets $\bar L$ and $\bar R$, respectively, and which do not contain any other members of $\bar L$ and $\bar R$. Let $\{Q_i := [l_i, r_i], 1 \le i \le I\}$ be those intervals, where $l_1 \le r_1 < l_2 \le r_2 < \cdots < l_I \le r_I \le \infty$, and $Q = \cup Q_i$. The $\{\infty\}$ is included in $\bar R$ to ensure that this construction is well defined when $s_{\max} > R_{\max}$. We note that $F_{12}$, which maximizes $L$, can increase only on $Q$. The following theorem follows from the form of $L$ and the above discussion.

Theorem 1. (a) $F_{13}$ increases at every time in $E_1^*$ and may increase on a subset of times in $E_2^*$. If $s_{\max} > R_{\max} \vee e_{\max}^*$ then $F_{13}$ is undefined on $s \ge s_{\max}$. (b) $F_{12}$ can increase only on $Q$. For fixed values of $F_{12}(r_i)$ and $F_{12}(l_i-)$, $(1 \le i \le I)$, the likelihood is independent of the behavior of $F_{12}$ on $[l_i, r_i]$. If $s_{\max} > R_{\max} \vee e_{\max}^*$ then $F_{12}$ is undefined on $s \ge s_{\max}$. (c) $\Lambda_{23}(\{t_n^*\}) > 0$ for $t_n^* \in T_1^*$, and $\Lambda_{23}(\{t_n^*\}) \ge 0$ for $t_n^* \in T_2^*$.

4. The Maximum Likelihood Estimation
In this section we reformulate the likelihood function, $L_1 L_2$, by expressing it in terms of $\{\lambda_n, 1 \le n \le N\}$, $\{z_i, 1 \le i \le I\}$ and $\{z_i, I < i \le I'\}$, where the first set consists of discrete 2 → 3 transition intensities, the second of the probability masses on the intervals $Q_i$ that form the support of $F_{12}$, and the third of the probability masses on the time points $\{e_k^*, 1 \le k \le K\}$ comprising the support of $F_{13}$. This reformulation makes it possible to estimate $F_{12}$, $F_{13}$ and $\Lambda_{23}(t)$. More precisely, for $1 \le i \le I - 1$, let $z_i = F_{12}(r_i) - F_{12}(l_i-)$ and if $r_I < \infty$, let $z_I = F_{12}(r_I) - F_{12}(l_I-)$, whereas if $r_I = \infty$, let $z_I = 1 - F(s_{\max}-)$. For $1 \le i \le I - 1$, $z_i$ is the probability of the 1 → 2 transition occurring in $Q_i$ and $z_I$ has the same interpretation if $r_I < \infty$. But if $r_I = \infty$ then $z_I$ is the probability of leaving state 1 in $[s_{\max}, \infty]$. Now, for $I < i \le I + K \equiv I'$, let $Q_i = \{e_{i-I}^*\}$, $z_i = F_{13}(e_{i-I}^*) - F_{13}(e_{i-I}^*-)$ and let $z = \{z_i, 1 \le i \le I'\}$. Clearly $\sum_{i=1}^{I'} z_i = 1$. Now, for $1 \le i \le I'$ and $1 \le j \le J + C$, let $\alpha_{ij} = I(Q_i \subset [s_j, \infty])$, where $\{s_{J+c}, 1 \le c \le C\} = \{t_{W+c}, 1 \le c \le C\}$. Also, for $1 \le i \le I$ and $1 \le m \le M'$, let $\beta_{im} = I(Q_i \subset A_m)$. With this notation $\log L_1$ is given by
$$\log L_1 = \sum_{n=1}^{N_1} d_n \log \lambda_n + \sum_{m=1}^{M} \log \prod G_m + \sum_{m=1}^{M} \log \sum_{i=1}^{I} \prod (r_i, R_m]\, \beta_{im} z_i + \sum_{i=I+1}^{I+K_1} c_{i-I} \log z_i + \sum_{j=1}^{J} \log \sum_{i=1}^{I'} \alpha_{ij} z_i, \tag{11}$$
and $\log L_2$ by
$$\log L_2 = \sum_{u=1}^{U} \log \left\{\sum_{i=I+1}^{I'} I(t_{M+u} = e_{i-I}^*)\, z_i + \lambda_{M+u} \sum_{i=1}^{I} \prod (r_i, t_{M+u}-]\, \beta_{i,M+u} z_i\right\} + \sum_{c=1}^{C} \log \left\{\sum_{i=1}^{I'} \alpha_{i,J+c} z_i + \sum_{i=1}^{I} \prod (r_i, t_{W+c}]\, \beta_{i,W+c} z_i\right\} =: \sum_{u=1}^{U} \log L_u^U + \sum_{c=1}^{C} \log L_c^C. \tag{12}$$
We now want to maximize $\log L$ with respect to $(z, \lambda)$ subject to $\sum_{i=1}^{I'} z_i = 1$, $(z_i \ge 0, 1 \le i \le I')$, and $(0 \le \lambda_n \le 1, 1 \le n \le N)$. The Lagrangian for this problem is
$$H =: H(z, \lambda) = \log L(z, \lambda) - a\Big(\sum_{i=1}^{I'} z_i - 1\Big) + \sum_{i=1}^{I'} b_i z_i + \sum_{n=1}^{N} h_n (1 - \lambda_n) + \sum_{n=1}^{N} f_n \lambda_n,$$
where, for $1 \le i \le I'$ and $1 \le n \le N$,
$$b_i z_i = 0, \quad h_n (1 - \lambda_n) = 0, \quad f_n \lambda_n = 0, \quad a, b_i, h_n, f_n \ge 0, \quad \sum_{i=1}^{I'} z_i = 1. \tag{13}$$
The MLE of $(z, \lambda)$ is a solution to the following system of equations
$$\frac{\partial H}{\partial z_i} = 0 \quad (1 \le i \le I'), \qquad \frac{\partial H}{\partial \lambda_n} = 0 \quad (1 \le n \le N). \tag{14}$$
In the next section we present the explicit form of equations in equation (14) and the algorithm for obtaining the MLE of $(z, \lambda)$.

5. The Self-Consistent Estimation
We now introduce a self-consistent estimator of $(z, \lambda)$ and show, in Theorem 2 below, its connection to the MLE of $(z, \lambda)$. We define a self-consistent estimator of $(z, \lambda)$ as a solution to the following system of equations
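$$E_{(z,\lambda)}[N_{12}(Q_i) \mid D]/N^* = z_i, \quad 1 \le i \le I, \tag{15}$$
$$E_{(z,\lambda)}[N_{13}(Q_i) \mid D]/N^* = z_i, \quad I < i \le I', \tag{16}$$
$$E_{(z,\lambda)}[N_{23}(t_n^*) \mid D] \,/\, E_{(z,\lambda)}[Y(t_n^*-) \mid D] = \lambda_n, \quad 1 \le n \le N, \tag{17}$$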
where, for $1 \le i \le I$, $N_{12}(Q_i)$ is the number of observations that make a 1 → 2 transition in $Q_i$, and, for $I < i \le I'$, $N_{13}(Q_i)$ is the number of observations that make a 1 → 3 transition at $Q_i$. Also, for $1 \le n \le N$, $Y(t_n^*-)$ is the number of observations that are in state 2 at time $t_n^*-$, and $N_{23}(t_n^*)$ is the number of observations that make a 2 → 3 transition at time $t_n^*$. All expectations are computed under the true value of $(z, \lambda)$ and the conditioning is on $D$, the information contained in the available data. To state an explicit form of these equations we introduce the following quantities
$$\mu_{mi} = \frac{\beta_{im} \prod (r_i, R_m]\, z_i}{\sum_{p=1}^{I} \beta_{pm} \prod (r_p, R_m]\, z_p}, \quad (1 \le i \le I),\ (1 \le m \le M), \tag{18}$$
$$\bar\mu_{ji} = \frac{\alpha_{ij} z_i}{\sum_{p=1}^{I'} \alpha_{pj} z_p}, \quad (1 \le i \le I'),\ (1 \le j \le J), \tag{19}$$
$$\eta_{ui} = \begin{cases} \lambda_{M+u} \prod (r_i, t_{M+u}) \times \beta_{i,M+u} z_i / L_u^U, & (1 \le i \le I),\ (1 \le u \le U), \\ I(t_{M+u} = e_{i-I}^*)\, z_i / L_u^U, & (I < i \le I'),\ (1 \le u \le U), \end{cases} \tag{20}$$
$$\gamma_{ci} = \begin{cases} \alpha_{i,J+c} z_i / L_c^C + \prod (r_i, t_{W+c}] \times \beta_{i,W+c} z_i / L_c^C, & (1 \le i \le I),\ (1 \le c \le C), \\ \alpha_{i,J+c} z_i / L_c^C, & (I < i \le I'),\ (1 \le c \le C). \end{cases} \tag{21}$$
We show in Web Appendix A that each expression in equations (18)–(21) is a conditional probability of an observation making either a 1 → 2 transition in $Q_i$, $1 \le i \le I$, or a 1 → 3 transition at $Q_i = e_{i-I}^*$, $I < i \le I'$. Each conditional probability is computed under the true value $(z, \lambda)$ and the conditioning is on the information about a given observation contained, respectively, in equations (1), (3), (4), and (5). We also consider the following quantities
$$\rho_{mn} = I(t_m \ge t_n^*) \sum_{i=1}^{I} \mu_{mi}\, I(Q_i \subset [L_m, t_n^*)), \quad 1 \le m \le M,$$
$$\pi_{un} = I(t_{M+u} \ge t_n^*) \sum_{i=1}^{I} \eta_{ui}\, I(Q_i \subset [L_{M+u}, t_n^*)), \quad 1 \le u \le U,$$
$$\sigma_{cn} = I(t_{W+c} \ge t_n^*) \sum_{i=1}^{I} \gamma_{ci}\, I(Q_i \subset [L_{W+c}, t_n^*)), \quad 1 \le c \le C, \tag{22}$$
which, as shown in Web Appendix A, are, respectively, the conditional probabilities that an $m$th observation in equation (1), a $u$th observation in equation (4), and a $c$th observation in equation (5) is at risk of the 2 → 3 transition at $t_n^*-$.

Theorem 2. (a) The self-consistent equations in (15)–(17) take the following explicit form: for $1 \le i \le I$
$$z_i = \frac{\sum_{m=1}^{M} \mu_{mi} + \sum_{j=1}^{J} \bar\mu_{ji} + \sum_{u=1}^{U} \eta_{ui} + \sum_{c=1}^{C} \gamma_{ci}}{N^*}, \tag{23}$$
for $I < i \le I'$
$$z_i = \frac{c_{i-I} + \sum_{j=1}^{J} \bar\mu_{ji} + \sum_{u=1}^{U} \eta_{ui} + \sum_{c=1}^{C} \gamma_{ci}}{N^*}, \tag{24}$$
and for $1 \le n \le N$
$$\lambda_n = \frac{I(n \le N_1)\, d_n + \sum_{u=1}^{U} I(t_{M+u} = t_n^*) \sum_{i=1}^{I} \eta_{ui}}{\sum_{m=1}^{M} \rho_{mn} + \sum_{u=1}^{U} \pi_{un} + \sum_{c=1}^{C} \sigma_{cn}}. \tag{25}$$
(b) The ML estimator of $(z, \lambda)$ is a self-consistent estimator of $(z, \lambda)$.

Proof. The proof of part (a) is given in Web Appendix B and of part (b) in Web Appendix C.

Equations (23)–(25) suggest an algorithm for obtaining the MLE of $(z, \lambda)$. We evaluate the right-hand sides of these equations at the initial value $(z^0, \lambda^0)$. This gives a new value $(z^1, \lambda^1)$, with which we repeat the previous step. We stop when the convergence criterion is satisfied. The MLE is always a solution to the self-consistent equations, but the self-consistent equations may have other solutions that do not correspond to the MLE, that is, they do not satisfy conditions in (13). Thus, one should start the algorithm with different initial parameter values and, in case of multiple solutions, verify which of the solutions satisfy (13). These would be the candidates for the ML solution. Naturally the ML solution would be the one that maximizes the likelihood function. In our applications we did not encounter multiple solutions.

Let $(\hat z, \hat\lambda)$ be the MLE of $(z, \lambda)$. Assuming that $r_I < \infty$, the MLE of $F_{12}$ is given by $\hat F_{12}(s) = 0$ if $s < l_1$, $\hat F_{12}(s) = \sum_{p=1}^{i} \hat z_p$ if $r_i \le s < l_{i+1}$, $1 \le i \le I - 1$, and $\hat F_{12}(s) = \sum_{p=1}^{I} \hat z_p$ if $s \ge r_I$, and is otherwise undefined. The MLE of $F_{13}$ is $\hat F_{13}(s) = \sum_{p=I+1}^{I'} \hat z_p\, I(Q_p \le s)$, and the MLE of $F(s)$ is $\hat F(s) = \hat F_{12}(s) + \hat F_{13}(s)$. The MLE of the cumulative intensity function, $\Lambda_{23}(t)$, is $\sum_{n=1}^{N} \hat\lambda_n I(t_n^* \le t)$ on $t \le t_{\max}$, and is not defined on $t > t_{\max}$. One can derive $\hat\Lambda_{12}$ and $\hat\Lambda_{13}$, the MLEs of $\Lambda_{12}$ and $\Lambda_{13}$, using the following relations: $\Lambda_{12}(s) = \int_{(0,s]} F_{12}(du)\{1 - F(u-)\}^{-1}$, $\Lambda_{13}(s) = \int_{(0,s]} F_{13}(du)\{1 - F(u-)\}^{-1}$. We then have $\hat\Lambda_{12}(s) = \sum_{r_i \le s} \hat z_i \{1 - \hat F(l_i-)\}^{-1}$ and $\hat\Lambda_{13}(s) = \sum_{e_k^* \le s} \hat z_{I+k} \{1 - \hat F(e_k^*-)\}^{-1}$. If $r_I = \infty$ (which implies that $s_{\max} \ge R_{\max}$) then, for $s < s_{\max}$, $\hat F_{12}(s)$ and $\hat F_{13}(s)$ are given as before, but for $s \ge s_{\max}$, there are two cases. If $s \ge s_{\max} > R_{\max} \vee e_{\max}^*$ then, by Theorem 1, both $\hat F_{12}(s)$ and $\hat F_{13}(s)$ are undefined on $Q_I = [s_{\max}, \infty]$. But if $e_{\max}^* \ge s_{\max}$, then for $s \ge s_{\max}$, $\hat F_{12}(s) = \sum_{p=1}^{I-1} \hat z_p$, so that $\hat F_{12}$ does not increase on $Q_I$, but $\hat F_{13}(s)$ does.

6. Simulation Study
The purpose of the simulation study is twofold: (i) to assess the sensitivity of $\hat F_{12}$, $\hat F_{13}$, $\hat F$, and $\hat\Lambda_{23}$, the MLEs derived in this article, to different patterns of missing nonfatal event status and (ii) to compare them to the corrected FDA estimators $\hat F_{12,C}$, $\hat F_{13,C}$, $\hat F_C$, and $\hat\Lambda_{23,C}$, obtained by right censoring the nonfatal event follow-up time at the last negative assessment for the observations with missing nonfatal event status. The
newly developed and corrected FDA estimators are compared to the true values of $F(s)$ at $s$ = 365, 730, 1095, and 1460 days, which are 0.306, 0.518, 0.665, and 0.768, respectively, and to the true values of $\Lambda_{23}$ at $t$ = 365, 730, 1095, and 1460 days, which are 0.584, 1.168, 1.752, and 2.336, respectively. We simulated 100 data sets, each consisting of 250 observations, from the illness–death model with constant intensities 0.0008, 0.0002, and 0.0016 of 1 → 2, 1 → 3, and 2 → 3 transitions, respectively, under four different scenarios where each observation had a potential maximum follow-up time of approximately 4 years. Under these parameters, this number of simulations is sufficient to estimate $F(s)$ with 5% accuracy. In scenarios 1 and 2, the probability of unknown nonfatal event status was 0.5% for the first postbaseline assessment, doubling at every assessment thereafter. In scenarios 3 and 4, the probability of unknown nonfatal event status was 3% for the first postbaseline assessment, increasing twofold at every assessment thereafter. In scenarios 1 and 3, the nonfatal event was assessed every 6 months, ±N(0, 100) days. In scenarios 2 and 4, the nonfatal event was assessed every 2 months for the first 6 months, every 6 months through 2 years, and at years 3 and 4, ±N(0, 100) days. Nonfatal event status was unknown for a mean of 31%, 41%, 36%, and 48% of the observations in scenarios 1–4, respectively. Vital status was known for all observations at the end of the follow-up period.

Table 1. Simulation study results for F(s), F12(s), F13(s), and Λ23(t). Bias for each simulation is calculated by subtracting the true value of F(s) or Λ23(t) from its estimated value. Values in the table are means; s and t are in days.

Proposed MLE
Scenario   s/t    F̂(s) (Bias)       F̂12(s)   F̂13(s)   Λ̂23(t) (Bias)
1           365   0.305 (−0.001)    0.215    0.090    0.533 (−0.051)
1           730   0.515 (−0.003)    0.352    0.163    1.122 (−0.046)
1          1095   0.660 (−0.005)    0.423    0.237    1.705 (−0.047)
1          1460   0.779 (0.011)     0.500    0.280    2.294 (−0.042)
2           365   0.303 (−0.003)    0.215    0.088    0.532 (−0.052)
2           730   0.515 (−0.003)    0.350    0.165    1.120 (−0.048)
2          1095   0.661 (−0.004)    0.421    0.240    1.708 (−0.044)
2          1460   0.780 (0.012)     0.501    0.280    2.298 (−0.038)
3           365   0.301 (−0.005)    0.214    0.087    0.532 (−0.052)
3           730   0.510 (−0.008)    0.345    0.165    1.121 (−0.047)
3          1095   0.657 (−0.008)    0.416    0.241    1.704 (−0.048)
3          1460   0.766 (−0.002)    0.490    0.276    2.303 (−0.033)
4           365   0.300 (−0.006)    0.213    0.087    0.533 (−0.051)
4           730   0.509 (−0.009)    0.347    0.162    1.120 (−0.048)
4          1095   0.658 (−0.007)    0.418    0.240    1.707 (−0.045)
4          1460   0.768 (0.000)     0.491    0.277    2.309 (−0.027)

Corrected FDA
Scenario   s/t    F̂C(s) (Bias)      F̂12,C(s) F̂13,C(s) Λ̂23,C(t) (Bias)
1           365   0.305 (−0.001)    0.211    0.094    0.240 (−0.344)
1           730   0.515 (−0.003)    0.342    0.173    0.788 (−0.380)
1          1095   0.669 (0.004)     0.396    0.273    1.368 (−0.384)
1          1460   0.856 (0.088)     0.500    0.355    1.943 (−0.393)
2           365   0.302 (−0.004)    0.211    0.091    0.383 (−0.201)
2           730   0.515 (−0.003)    0.343    0.172    0.933 (−0.235)
2          1095   0.686 (0.021)     0.400    0.286    1.519 (−0.233)
2          1460   0.896 (0.128)     0.441    0.455    2.101 (−0.235)
3           365   0.288 (−0.018)    0.198    0.090    0.246 (−0.338)
3           730   0.508 (−0.010)    0.335    0.173    0.790 (−0.378)
3          1095   0.678 (0.013)     0.395    0.283    1.369 (−0.383)
3          1460   0.843 (0.075)     0.485    0.358    1.940 (−0.396)
4           365   0.278 (−0.028)    0.187    0.091    0.396 (−0.188)
4           730   0.507 (−0.011)    0.334    0.173    0.934 (−0.234)
4          1095   0.698 (0.033)     0.414    0.284    1.503 (−0.249)
4          1460   0.870 (0.102)     0.440    0.430    2.086 (−0.250)

Table 1 contains the average values of $\hat F(s)$, $\hat F_C(s)$ and the corresponding subdistribution functions over 100 simulated data sets at times $s$ = 365, 730, 1095, and 1460 days. The mean bias of $\hat F(s)$ is generally negligible at each time point for all of the scenarios and small relative to the estimated standard errors (SEs). In contrast, the mean bias of $\hat F_C(s)$ tends to be positive at later time points. This reflects the fact that right censoring of the follow-up time at the last negative assessment for the nonfatal event for the observations with missing intermediate event status tends to take these observations out of the risk set too early. Simple bootstrap SEs (Sun, 2001) of
$\hat F(s)$ and $\hat F_C(s)$ were calculated for several of the simulated data sets under scenarios with lower (scenario 1) and higher (scenario 4) amounts of missing data. The bootstrap SEs of $\hat F(s)$ and $\hat F_C(s)$ were approximately equal to 0.03 for scenario 1 and 0.04 for scenario 4 across the four time points. The estimated cumulative intensity of the 2 → 3 transition $\hat\Lambda_{23}(t)$ tended to have small negative mean biases at each time point under the four scenarios, while $\hat\Lambda_{23,C}(t)$ tended to have greater negative mean biases, reflecting underestimation of the true values. The bootstrap SEs tended to be ≈ 0.30 for both estimators across each of the investigated time points and did not depend on the amount of missing data. The mean biases of $\hat\Lambda_{23}(t)$ are therefore small relative to the estimated SEs. The data for this simulation study were generated under a noninformative observation scheme. Scenarios 2 and 4 have assessment patterns that are typical of a clinical trial, where assessment occurs more often early in follow-up. Additional scenarios with various transition intensities and amounts of missing data under a noninformative observation scheme yielded similar results: our proposed method tends to yield less biased estimates of $F(s)$ and $\Lambda_{23}(t)$ than the corresponding corrected FDA estimators, with similar estimated SEs. The results show that application of our method to this type of data corrects the bias that results from the commonly used convention of right censoring the nonfatal event follow-up time at the last negative assessment for observations with missing intermediate event status.

7. Application
We illustrate our methods with the data from cetuximab-treated patients in a recently completed clinical trial (Bonner et al., 2006). Patients with locoregionally advanced squamous-cell carcinoma of the head and neck were randomly assigned to radiotherapy plus cetuximab (n = 211) or radiotherapy alone (n = 213). Assessments for progressive disease (PD) occurred
at weeks 4 and 8, every 4 months thereafter for 2 years, and semiannually during years 3–5. An endpoint of interest was PFS, defined as time to PD or death. In the original analysis PD times were treated as exact event times. Among the 211 cetuximab-treated patients, 98 (46.4%) were observed to have an intermediate PD event; the majority with observed PD (77.6%) died prior to the end of the study. There were 111 patients (52.6%) without observed PD who did not have a disease assessment near the time of death or right censoring for death. (The remaining two patients were known not to experience PD and were alive at the end of the study.) In the original analysis of the data, the PFS time for these 111 patients was determined at the time of death (assumed to have occurred directly from state 1; n = 21) or was right censored at the last negative tumor assessment. This convention is consistent with current FDA and EMEA guidance. We use the data from Bonner et al. (2006), but allow for interval-censored intermediate event (PD) times, to compare the results of the estimation of F and other parameters of an illness–death model using our analysis and the corrected FDA analysis. The two analyses differ in their treatment of the 111 patients with an unknown intermediate event (PD) transition status. The way the corrected FDA analysis treats these 111 patients was described above. In our analysis we treat the 111 patients without observed PD, who did not have a disease assessment near the time of death or right censoring for death, as having missing status of the intermediate transition and apply our developed method to obtain the estimates of “illness–death” model parameters.
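The patient counts quoted above can be mapped onto the notation of Section 2 as a quick consistency check. The sketch below is only one reading of the text, not something the article states in this form: the function name `tally_cetuximab_arm` is hypothetical, $\tilde N$ is inferred from the rounded 77.6%, and $\tilde K$ is inferred as whatever remains after the other groups are counted.

```python
def tally_cetuximab_arm():
    """Map the patient counts quoted in the text onto the Section 2 notation."""
    n_total = 211
    M = 98                 # observed PD: observations of form (1)
    n_unknown = 111        # missing PD status: forms (4) and (5) together
    U = 21                 # unknown status, died: form (4)
    J = 2                  # known PD-free and alive at study end: form (3)
    C = n_unknown - U      # unknown status, right censored: form (5)
    N_tilde = round(0.776 * M)            # assumption: inferred from the rounded 77.6%
    K_tilde = n_total - (M + U + C + J)   # assumption: direct 1 -> 3 with known status
    return {"M": M, "N_tilde": N_tilde, "U": U, "C": C,
            "K_tilde": K_tilde, "J": J,
            "N_star": M + U + C + K_tilde + J}


counts = tally_cetuximab_arm()
print(counts)
print(round(100 * counts["M"] / counts["N_star"], 1),                 # share with observed PD
      round(100 * (counts["U"] + counts["C"]) / counts["N_star"], 1))  # share with unknown status
```

Under this reading the groups add up to the 211 randomized patients, and the two printed shares reproduce the 46.4% and 52.6% quoted in the text.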
The estimated subdistribution functions of the time to PD (denoted by $\hat{F}_{12}(s)$ and $\hat{F}_{12,C}(s)$) and of the time to death (denoted by $\hat{F}_{13}(s)$ and $\hat{F}_{13,C}(s)$) are presented in Figure 2, while the estimates $(\hat{F}, \hat{F}_C)$ of the distribution function $F$ from our analysis and the corrected FDA analysis are displayed in Figure 3. In each case the subscript $C$ denotes the corrected FDA estimate. In Figure 2, $\hat{F}_{12}(s)$ is above $\hat{F}_{12,C}(s)$, indicating that the time to PD under the proposed analysis is slightly shorter than that estimated by censoring observations at the last negative assessment. Furthermore, fewer patients make a direct 1 → 3 transition under the proposed analysis, which makes sense from a clinical perspective, as discussed below. In Figure 3, $\hat{F}(s)$ is slightly below $\hat{F}_C(s)$, indicating that the time to first event is slightly longer after accounting for the missing data. The small difference between $\hat{F}(s)$ and $\hat{F}_C(s)$ is due in part to the cancellation of two effects: $\hat{F}_{12}(s)$ being above $\hat{F}_{12,C}(s)$ and $\hat{F}_{13}(s)$ being below $\hat{F}_{13,C}(s)$. In addition, for those patients with unknown intermediate event status, the time interval in which the status was unknown tended to be short relative to the total follow-up time, which also contributes to the small difference between $\hat{F}(s)$ and $\hat{F}_C(s)$. These results also generally mirror those observed in the simulation study: $\hat{F}(s) \cong \hat{F}_C(s)$ at early time points, with $\hat{F}(s) < \hat{F}_C(s)$ at later time points.

Figure 4 displays the estimates of the 1 → 3 and 2 → 3 cumulative intensities, $\Lambda_{13}(t)$ and $\Lambda_{23}(t)$, from both analyses. $\hat{\Lambda}_{23}(t)$ is considerably greater than $\hat{\Lambda}_{23,C}(t)$ at all values of $t$, and these estimators are substantially above $\hat{\Lambda}_{13}(t)$ and $\hat{\Lambda}_{13,C}(t)$. Although, in absolute terms, the difference between $\hat{\Lambda}_{23}(t)$ and $\hat{\Lambda}_{23,C}(t)$ is greater than the difference between $\hat{\Lambda}_{13}(t)$ and $\hat{\Lambda}_{13,C}(t)$, the relative difference is similar. Specifically, at any given value of $t$, the ratio of $\hat{\Lambda}_{23}(t)$ to $\hat{\Lambda}_{23,C}(t)$ is approximately 2, while the ratio of $\hat{\Lambda}_{13}(t)$ to $\hat{\Lambda}_{13,C}(t)$ is approximately 0.5. Furthermore, the bootstrap SEs (Efron, 1981), based on 100 bootstrap replications, of the estimated 2 → 3 cumulative intensities at 1, 2, and 4 years are 0.68, 0.70, and 0.73, respectively, for our analysis and 0.75, 0.76, and 0.78, respectively, for the corrected FDA analysis. Given that the difference between $\hat{\Lambda}_{23}(t)$ and $\hat{\Lambda}_{23,C}(t)$ is 1.48 at 1 year, 1.71 at 2 years, and 1.79 at 4 years, the two estimators are approximately 2.1–2.3 SEs away from each other. These results are consistent with the simulation study results for the 2 → 3 cumulative intensity, which also showed that $\hat{\Lambda}_{23}(t) > \hat{\Lambda}_{23,C}(t)$ across values of $t$.

Figure 2. Estimates of $F_{12}(s)$ (filled symbols) and $F_{13}(s)$ (open symbols) accounting for 111 patients with unknown intermediate transition status by analysis methods I (right censored at the last negative tumor assessment; triangles) and II (proposed MLE method; circles) for the Radiotherapy + Cetuximab group. Bootstrap standard errors based on 100 bootstrap replications calculated at s = 12, 24, and 48 months are in the range 0.03–0.04 for estimated $F_{12}(s)$ and in the range 0.01–0.02 for estimated $F_{13}(s)$.

We see from the above discussion that accounting for missing information results in a somewhat higher risk of PD compared to the result from the corrected FDA analysis. The substantial difference between the two analyses, however, occurs in the estimation of the cumulative intensities of death before and after the occurrence of a PD event. The estimated difference between the cumulative death intensity after and before the occurrence of a PD event is substantially larger under our analysis than under the corrected FDA analysis. This seems to be consistent with clinical expectations. One would expect
a very substantial difference in those intensities because in the advanced disease setting, with few exceptions (e.g., accidental death or sudden fatal adverse reaction to treatment), deaths should occur only after clinically detectable progression of disease (intermediate event).

Figure 3. Estimates of $F(s)$ accounting for 111 patients with unknown intermediate transition status by analysis methods I (right censored at the last negative tumor assessment; open triangles) and II (proposed MLE method; filled circles) for the Radiotherapy + Cetuximab group. Bootstrap standard errors based on 100 bootstrap replications calculated at s = 12, 24, and 48 months are in the range 0.03–0.04 for estimated $F(s)$.

These results are also directly relevant to the recent FDA guidance on clinical trial endpoints for cancer drug development. The conventions for event and censoring dates in the guidance document are motivated by the accelerated approval regulations with surrogate endpoints for overall survival, such as PFS. For PFS to be a valid surrogate endpoint, patients who experience nonfatal PD should be at substantially elevated risk of death relative to patients who do not experience PD, so that a treatment that modifies the risk of PD should also ultimately provide definitive clinical benefit by reducing the risk of death. Not accounting for the missing information on intermediate event status can result in more apparent deaths without an intermediate transition. This is demonstrated in the application of the proposed methods to the Bonner data where accounting for the missing information increases the estimated intensity of death after nonfatal event and decreases it in the absence of a nonfatal event, compared to not accounting for missing information. This suggests that PFS is a stronger indicator of overall survival than is seen from the analysis that does not account for missing information. These results provide motivation for the use of the
proposed methods when the focus of the analysis is on a surrogate endpoint such as PFS.

Figure 4. Estimates of $\Lambda_{23}(t)$ (filled symbols) and $\Lambda_{13}(t)$ (open symbols) accounting for 111 patients with unknown intermediate transition status by analysis methods I (right censored at the last negative tumor assessment; triangles) and II (proposed MLE method; circles) for the Radiotherapy + Cetuximab group. Bootstrap standard errors based on 100 bootstrap replications calculated at t = 12, 24, and 48 months are in the range 0.68–0.78 for estimated $\Lambda_{23}(t)$ and in the range 0.01–0.02 for estimated $\Lambda_{13}(t)$.

8. Supplementary Materials
Web Appendices A, B, and C referenced in Section 5 are available under the Paper Information link at the Biometrics website http://www.biometrics.tibs.org.
Acknowledgements
HF thanks Niels Keiding for stimulating discussions on interval censoring. We are grateful to Judith Goldberg for incisive comments and feedback during the work on this paper. We thank ImClone Systems for making the oncology trial data available to us.
References
Bonner, J. A., Harari, P. M., Giralt, J., Azarnia, N., Shin, D. M., Cohen, R. B., Jones, C. U., Sur, R., Raben, D., Jassem, J., Ove, R., Kies, M. S., Baselga, J., Youssoufian, H., Amellal, N., Rowinsky, E. K., and Ang, K. K. (2006). Radiotherapy plus cetuximab for squamous-cell carcinoma of the head and neck. New England Journal of Medicine 354, 567–578.
Efron, B. (1967). The two sample problem with censored data. In Proceedings of the 5th Berkeley Symposium (Volume 4), 831–853. Berkeley: University of California Press.
Efron, B. (1981). Censored data and the bootstrap. Journal of the American Statistical Association 76, 312–319.
European Medicines Agency (EMEA). (2006). Committee for medicinal products for human use: Guideline on the evaluation of anticancer medicinal products in man. Accessed October 2007 from http://www.emea.europa.eu/pdfs/human/ewp/26757506en.pdf
Food and Drug Administration (FDA). (2007). Guidance for industry: Clinical trial endpoints for the approval of cancer drugs and biologics. Accessed October 2007 from http://www.fda.gov/cder/guidance/7478fnl.pdf
Frydman, H. (1992). A nonparametric estimation procedure for a periodically observed three state Markov process, with application to AIDS. Journal of the Royal Statistical Society, Series B 54, 853–866.
Frydman, H. (1995). Nonparametric estimation of a Markov “illness–death” process from interval-censored observations, with application to diabetes survival data. Biometrika 82, 773–789.
Gruger, J., Kay, R., and Schumacher, M. (1991). The validity of inferences based on incomplete observations in disease state models. Biometrics 47, 595–605.
Hudgens, M. G., Satten, G. A., and Longini, I. M. (2001). Nonparametric maximum likelihood estimation for competing risks survival data subject to interval censoring and truncation. Biometrics 57, 74–80.
Joly, P., Commenges, D., Helmer, C., and Letenneur, L. (2002). A penalized likelihood approach for an illness-death model with interval-censored data: Application to age-specific incidence of dementia. Biostatistics 3, 433–443.
Sun, J. (2001). Variance estimation of a survival function for interval-censored survival data. Statistics in Medicine 20, 1249–1257.
Turnbull, B. W. (1976). The empirical distribution function with arbitrary grouped censored and truncated data. Journal of the Royal Statistical Society, Series B 38, 290–295.
Received October 2007. Revised March 2008. Accepted March 2008.