The Accelerated Gap Times Model


Rob Strawderman, BSCB Department, Cornell University, [email protected], http://www.bscb.cornell.edu/

Joint work with Edsel Peña, USC Stat


Outline

• Model & Intensity Formulation
• Semiparametric Estimation
• Asymptotics & Efficiency
• Simulation Results
• Bladder Tumor Data
• Extensions

AGT Model – Motivation

Consider single subject with covariates $Z$ experiencing repeated events at times $0 < S_1 < S_2 < \cdots$. Define $j$th gap time as $T_j = S_j - S_{j-1}$, where $j \ge 1$ & $S_0 = 0$. Assume

Given $Z$, $T_j = V_j e^{-\theta_0' Z}$, where $V_1, V_2, \ldots$ iid with continuous CDF $F_0$ independent of $Z$.

That is: subject-level observation is realization of a renewal process, where covariates $Z$ directly “accelerate” or “decelerate” baseline process gap times $V_j$.
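The gap-time relation $T_j = V_j e^{-\theta_0' Z}$ can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the function name `simulate_agt_subject`, the fixed follow-up time `tau`, and the exponential baseline are hypothetical choices, not part of the original slides.

```python
import numpy as np

def simulate_agt_subject(theta, z, tau, draw_baseline_gap, rng):
    """Simulate one subject's gaps under T_j = V_j * exp(-theta'z) on [0, tau].

    Returns (complete_gaps, censored_gap): the gaps fully observed before tau
    and the right-censored final gap tau - S_N.
    """
    accel = np.exp(-np.dot(theta, z))    # factor applied to every baseline gap
    gaps, s = [], 0.0                    # s tracks the latest event time S_j
    while True:
        t = draw_baseline_gap(rng) * accel
        if s + t > tau:                  # next event would fall after follow-up
            return np.array(gaps), tau - s
        gaps.append(t)
        s += t

# Assumed setting: scalar covariate, Exp(1) baseline gaps, follow-up of 3.5.
rng = np.random.default_rng(0)
gaps, cens = simulate_agt_subject(
    theta=np.array([0.5]), z=np.array([1.0]), tau=3.5,
    draw_baseline_gap=lambda r: r.exponential(1.0), rng=rng)
```

By construction the complete gaps plus the censored final gap add up to the follow-up time, which is the bookkeeping the later "gap time" at-risk process relies on.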


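Let $S_n = \sum_{j=1}^{n} T_j$ & define $N^{\upsilon}(t) = \max\{n : S_n \le t\}$. Then:

• For any $t, h > 0$ & with $\mathcal{F}_t = \sigma(N^{\upsilon}(u), u \le t;\, Z)$,

\[
P\{N^{\upsilon}(t+h) - N^{\upsilon}(t) = 1 \mid \mathcal{F}_t\}
= \frac{F_0\big(e^{\theta_0' Z}(t + h - S_{N^{\upsilon}(t)})\big) - F_0\big(e^{\theta_0' Z}(t - S_{N^{\upsilon}(t)})\big)}{1 - F_0\big(e^{\theta_0' Z}(t - S_{N^{\upsilon}(t)})\big)}.
\]

• Dividing LHS, RHS by $h$ and letting $h \to 0$ yields intensity for $N^{\upsilon}(\cdot)$ (uncensored case):

\[
\lambda(t \mid Z) = \lambda_0\big(e^{\theta_0' Z} R(t)\big)\, e^{\theta_0' Z},
\]

where $R(t) = t - S_{N^{\upsilon}(t-)}$ and $\lambda_0(t) = -\frac{d}{dt}\log(1 - F_0(t))$.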

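AGT Model – Assumptions

Assume $n$ subjects, where $i$th observed over $[0, \tau_i]$ and:

• Data: $(\{(N_i^{\dagger}(u), Y_i^{\dagger}(u)),\ u \ge 0\},\ Z_i)$, $i = 1 \ldots n$, with
  – $N_i^{\dagger}(u) = \max\{n : S_{in} \le u \wedge \tau_i\}$ (i.e., $N_i^{\dagger}(u) = N_i^{\upsilon}(u \wedge \tau_i)$)
  – $Y_i^{\dagger}(u) = I\{\tau_i \ge u\}$
• Subjects iid, $\tau_i \overset{\text{ind}}{\sim} G(\cdot \mid Z_i)$, $\tau_i \perp \{N_i^{\upsilon}(u), u \ge 0\} \mid Z_i$.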

AGT Model – Specification

Let $\mathcal{G}_t$ denote history of all subjects to time $t$; then, the compensator of $N_i^{\dagger}(t)$ wrt $\mathcal{G}_t$ is

\[
A_i^{\dagger}(t) = \int_0^t \lambda_0\big(e^{\theta_0' Z_i} R_i(u)\big)\, e^{\theta_0' Z_i}\, Y_i^{\dagger}(u)\, du,
\]

where $R_i(u) = u - S_{i N_i^{\dagger}(u-)}$ denotes backward recurrence time and $\lambda_0(\cdot)$ is “baseline” intensity.

AGT Model: For $i = 1 \ldots n$, $M_i^{\dagger}(\cdot) = N_i^{\dagger}(\cdot) - A_i^{\dagger}(\cdot)$ is a martingale wrt $\{\mathcal{G}_t, t \ge 0\}$.

• New, interesting competitor to modulated renewal process of Cox (1972), which assumes compensator

\[
A_i^{MRP}(t) = \int_0^t \lambda_0(R_i(u))\, e^{\theta_0' Z_i}\, Y_i^{\dagger}(u)\, du
\]

(cf. Oakes & Cui, 1994).
• Useful physical interpretation: covariates directly “shrink” or “expand” gap times. See Lin, Wei, and Ying (LWY; 1998) for marginal model formulation in terms of event times $S_1 < S_2 < \cdots$ (“accelerated mean model”).
• ML estimation for AGT model easy for parameterized $\lambda_0(\cdot)$, asymptotics follow from martingale theory. Semiparametric estimation more challenging . . .
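To make the contrast between the two intensities concrete, the sketch below evaluates the AGT intensity $\lambda_0(e^{\theta' Z} r)\, e^{\theta' Z}$ and the modulated renewal intensity $\lambda_0(r)\, e^{\theta' Z}$ at a backward recurrence time $r$ for an at-risk subject. The function names and the Weibull-type baseline hazard $k u^{c-1}$ are hypothetical choices made only for illustration.

```python
import numpy as np

def agt_intensity(r, z, theta, baseline_hazard):
    """AGT intensity: lambda_0(exp(theta'z) * r) * exp(theta'z)."""
    a = np.exp(np.dot(theta, z))
    return baseline_hazard(a * r) * a

def mrp_intensity(r, z, theta, baseline_hazard):
    """Modulated renewal intensity: lambda_0(r) * exp(theta'z)."""
    return baseline_hazard(r) * np.exp(np.dot(theta, z))

# Assumed baseline hazard lambda_0(u) = k * u**(c - 1) (Weibull-type).
k, c = 1.0, 2.0

def weibull_hazard(u):
    return k * u ** (c - 1.0)

theta, z, r = np.array([0.5]), np.array([1.0]), 0.8
agt = agt_intensity(r, z, theta, weibull_hazard)
mrp = mrp_intensity(r, z, theta, weibull_hazard)
# For this baseline, the AGT intensity equals the modulated renewal
# intensity with the coefficient rescaled to c * theta.
mrp_rescaled = mrp_intensity(r, z, c * theta, weibull_hazard)
```

With a Weibull-type baseline the two models differ only by a rescaling of the regression coefficient; for other baseline hazards the covariate also changes the argument of $\lambda_0$ in the AGT model, and the models are distinct.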


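Semiparametric Estimation

To motivate the estimating function (EF), start by deriving semiparametric efficient score (SES) for $\theta$ ($p \times 1$). Using Jacod’s point process likelihood, easy to obtain score for $\theta$ assuming $\Lambda_0$ known:

\[
S_Q(\theta) = \sum_{i=1}^{n} \int_0^{\infty} Z_i\, Q\big(R_i(u)\, e^{\theta' Z_i}\big)\, M_i^{\dagger}(du),
\]

where $Q(u) = \dfrac{\lambda_0'(u)}{\lambda_0(u)}\, u + 1$ and

\[
M_i^{\dagger}(\cdot) = N_i^{\dagger}(\cdot) - \int_0^{\cdot} \lambda_0\big(e^{\theta' Z_i} R_i(u)\big)\, e^{\theta' Z_i}\, Y_i^{\dagger}(u)\, du.
\]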

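SES for $\theta$ obtained by subtracting from $S_Q$ its orthogonal projection onto score space for $\Lambda_0$. Omitting details:

\[
S_{\mathrm{eff}}(\theta) = \sum_{i=1}^{n} \int_0^{\infty} Q\big(R_i(u)\, e^{\theta' Z_i}\big) \Big( Z_i - \mathcal{E}\big(R_i(u)\, e^{\theta' Z_i}\big) \Big)\, M_i^{\dagger}(du),
\]

where

• $\mathcal{E}(\cdot) = [\mathcal{E}_1(\cdot), \ldots, \mathcal{E}_p(\cdot)]^T$,
• $\mathcal{E}_k(u) = \dfrac{E[Z_{1k}\, Y_1(u|\theta)]}{E[Y_1(u|\theta)]}$, $k = 1 \ldots p$,
• $Y_1(t|\theta) = \sum_{j=1}^{N_1^{\dagger}(\tau_1-)} I\big\{T_{1j}\, e^{\theta' Z_1} \ge t\big\} + I\big\{e^{\theta' Z_1}\big[\tau_1 - S_{1 N_1^{\dagger}(\tau_1-)}\big] \ge t\big\}$.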

Using “calendar ↔ gap” time stochastic integral identities of Peña, Strawderman, & Hollander (2001 JASA), can write

\[
S_{\mathrm{eff}}(\theta) \equiv \sum_{i=1}^{n} \int_0^{\infty} Q(u)\, \big(Z_i - \mathcal{E}(u)\big)\, M_i(du|\theta),
\]

where

• $M_i(t|\theta) = N_i(t|\theta) - \int_0^t Y_i(v|\theta)\, \Lambda_0(dv)$
• $N_i(t|\theta) = \int_0^{\infty} I\{R_i(u)\, e^{\theta' Z_i} \le t\}\, N_i^{\dagger}(du)$

Important: “Gap time” processes $M_i(t|\theta)$, $i = 1 \ldots n$, not martingales, so asymptotics for score process $S_{\mathrm{eff}}(\theta, t)$ cannot be handled via direct application of MCLT.

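$S_{\mathrm{eff}}(\theta)$ involves unknown quantities:

• $\mathcal{E}(u) = \dfrac{E[Z_1 Y_1(u|\theta)]}{E[Y_1(u|\theta)]}$,
• $Q(u) = \dfrac{\lambda_0'(u)}{\lambda_0(u)}\, u + 1$.

To generate useful class of EFs for $\theta$, substitute

• $\bar{\mathcal{E}}(u|\theta) = \dfrac{\sum_{j=1}^{n} Z_j Y_j(u|\theta)}{\sum_{j=1}^{n} Y_j(u|\theta)}$ for $\mathcal{E}(u)$;
• computable, bounded weight $W(\cdot|\theta)$ for $Q(\cdot)$.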

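We obtain:

\[
\hat{S}_W(\theta) = \sum_{i=1}^{n} \int_0^{\infty} W(u|\theta)\, \big(Z_i - \bar{\mathcal{E}}(u|\theta)\big)\, N_i(du|\theta)
\equiv \sum_{i=1}^{n} \sum_{j=1}^{N_i^{\dagger}(\tau_i-)} W\big(T_{ij}\, e^{\theta' Z_i} \,\big|\, \theta\big) \Big[ Z_i - \bar{\mathcal{E}}\big(T_{ij}\, e^{\theta' Z_i} \,\big|\, \theta\big) \Big].
\]

Easy to see:

• $\hat{S}_W(\theta)$ is rank-based EF for $\theta$
• $\hat{S}_W(\theta)$ reduces to weighted linear rank EF of Tsiatis (1990) if at most 1 event per subject.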
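The double-sum form of the rank-based EF is straightforward to evaluate once each subject's gaps are stacked into flat arrays. The sketch below is one possible implementation under assumptions: the function name `rank_ef`, the flattened data layout (all complete gaps plus each subject's censored final gap), and the toy data are hypothetical, not taken from the slides.

```python
import numpy as np

def rank_ef(theta, x, d, z, n_subjects, weight="logrank"):
    """Evaluate the rank-based EF on flattened gap-time data.

    x : (m,) gap lengths: every complete gap T_ij plus each subject's
        censored final gap tau_i - S_{i,N_i}
    d : (m,) 1.0 for a complete gap (event), 0.0 for a censored gap
    z : (m, p) covariate vector of the subject owning each gap
    """
    e = x * np.exp(z @ theta)                            # rescaled gaps
    at_risk = (e[None, :] >= e[:, None]).astype(float)   # [k, l] = I{e_l >= e_k}
    n_risk = at_risk.sum(axis=1)                         # sum_j Y_j(e_k | theta)
    zbar = (at_risk @ z) / n_risk[:, None]               # E-bar(e_k | theta)
    if weight == "logrank":
        w = np.ones_like(e)                              # W = 1
    elif weight == "gehan":
        w = n_risk / n_subjects                          # W = n^{-1} sum_j Y_j
    else:
        raise ValueError("weight must be 'logrank' or 'gehan'")
    # Only complete gaps (d = 1) contribute jumps of N_i(. | theta).
    return ((d * w)[:, None] * (z - zbar)).sum(axis=0)

# Toy data (hypothetical): subject A (z=1) has gap 1.0 and censored gap 3.0;
# subject B (z=0) has gaps 2.0, 4.0 and censored gap 0.5.
x = np.array([1.0, 3.0, 2.0, 4.0, 0.5])
d = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
z = np.array([[1.0], [1.0], [0.0], [0.0], [0.0]])
theta = np.zeros(1)
s_logrank = rank_ef(theta, x, d, z, n_subjects=2)
s_gehan = rank_ef(theta, x, d, z, n_subjects=2, weight="gehan")
```

The pairwise at-risk matrix costs $O(m^2)$ memory for $m$ gaps, which keeps the sketch short; sorting the rescaled gaps would be the usual way to scale this up.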


Estimating $\theta$

• Typically, $\hat{S}_W(\theta)$ non-monotone step function. E.g., not guaranteed monotone for $W(u|\theta) \equiv 1$ (logrank).
• Common: choose $\hat{\theta}$ as a minimizer of $\|\hat{S}_W(\theta)\|$. Can use deterministic (e.g., simplex method) or stochastic (e.g., simulated annealing, genetic algorithm) optimization.
• $\hat{S}_W(\theta)$ guaranteed monotone with Gehan-type weight

\[
W(u|\theta) = n^{-1} \sum_{j=1}^{n} Y_j(u|\theta);
\]

see Fygenson & Ritov (1994). Solution set convex, fast numerical methods via LP methods (cf. LWY 1998).
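As a rough illustration of choosing $\hat{\theta}$ as a minimizer of $\|\hat{S}_W(\theta)\|$, the sketch below simulates exponential gap times with a binary covariate and scans a grid of scalar $\theta$ values using the Gehan-type weight. The grid search is a stand-in chosen for brevity, not the simplex or LP methods mentioned above; the function names, the fixed follow-up time, the sample size, and the grid are all assumptions.

```python
import numpy as np

def gehan_ef(theta, x, d, z, n_subjects):
    """Gehan-weighted rank EF on flattened gaps (complete and censored)."""
    e = x * np.exp(z @ theta)                            # rescaled gaps
    at_risk = (e[None, :] >= e[:, None]).astype(float)   # [k, l] = I{e_l >= e_k}
    n_risk = at_risk.sum(axis=1)
    zbar = (at_risk @ z) / n_risk[:, None]
    w = n_risk / n_subjects                              # Gehan-type weight
    return ((d * w)[:, None] * (z - zbar)).sum(axis=0)

def simulate(n, theta0, tau, rng):
    """Gaps T = V * exp(-theta0 * Z), V ~ Exp(1), Z ~ Bernoulli(0.5)."""
    xs, ds, zs = [], [], []
    for _ in range(n):
        zi = float(rng.integers(0, 2))
        s = 0.0
        while True:
            t = rng.exponential(1.0) * np.exp(-theta0 * zi)
            if s + t > tau:                              # censored final gap
                xs.append(tau - s); ds.append(0.0); zs.append(zi)
                break
            xs.append(t); ds.append(1.0); zs.append(zi)
            s += t
    return np.array(xs), np.array(ds), np.array(zs)[:, None]

rng = np.random.default_rng(1)
n = 100
x, d, z = simulate(n, theta0=0.5, tau=3.5, rng=rng)

# Scan a grid and keep the value with the smallest EF norm.
grid = np.linspace(-1.0, 2.0, 301)
norms = np.array([np.linalg.norm(gehan_ef(np.array([t]), x, d, z, n))
                  for t in grid])
theta_hat = grid[norms.argmin()]
```

Because the EF is a step function, the minimizer is generally an interval rather than a point; the grid simply returns the first grid value attaining the smallest norm.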


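Estimating $\Lambda_0(\cdot)$

For all suitable deterministic $\eta(\cdot)$, “score” for $\Lambda_0$ is:

\[
\sum_{i=1}^{n} \int_0^{\infty} \eta\big(R_i(u)\, e^{\theta' Z_i}\big)\, M_i^{\dagger}(du).
\]

Mean zero; similarly to before, can show:

\[
\sum_{i=1}^{n} \int_0^{\infty} \eta\big(R_i(u)\, e^{\theta' Z_i}\big)\, M_i^{\dagger}(du) \equiv \int_0^{\infty} \eta(v) \left[ \sum_{i=1}^{n} M_i(dv|\theta) \right],
\]

where $M_i(v|\theta) = N_i(v|\theta) - \int_0^v Y_i(s|\theta)\, \Lambda_0(ds)$.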

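Estimator for $\Lambda_0$ obtained by setting RHS = 0, solving for $\hat{\Lambda}_0$. For arbitrary $\eta(\cdot)$, solution satisfies

\[
\sum_{i=1}^{n} \Big[ N_i(dv|\theta) - Y_i(v|\theta)\, \hat{\Lambda}_0(dv) \Big] = 0.
\]

Hence: for $\hat{\theta} \overset{P}{\to} \theta_0$, estimate $\Lambda_0$ via Breslow-type estimator

\[
\hat{\Lambda}_0(t|\hat{\theta}) = \int_0^t \frac{\sum_{i=1}^{n} N_i(dv|\hat{\theta})}{\sum_{i=1}^{n} Y_i(v|\hat{\theta})}.
\]

Asymptotics for $\hat{\Lambda}_0(t|\hat{\theta}) - \Lambda_0(t)$ more complicated than usual Breslow estimator; e.g., cannot employ MCLT, $\hat{\Lambda}_0(t|\theta)$ not differentiable in $\theta$, etc . . .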
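The Breslow-type estimator is a step function that jumps at each rescaled complete gap by one over the number of rescaled gaps still at risk. The sketch below shows one way to compute it; the function name `breslow_gap`, the flattened data layout, and the toy data are assumptions made for illustration.

```python
import numpy as np

def breslow_gap(t, theta, x, d, z):
    """Breslow-type estimate of Lambda_0(t) on the rescaled gap-time scale.

    x : (m,) gap lengths (complete gaps and censored final gaps)
    d : (m,) 1.0 for a complete gap, 0.0 for a censored gap
    z : (m, p) covariates of the subject owning each gap
    """
    e = x * np.exp(z @ theta)                         # rescaled gaps
    n_risk = (e[None, :] >= e[:, None]).sum(axis=1)   # sum_i Y_i(e_k | theta)
    jumps = d / n_risk                                # jump size at each complete gap
    return float(jumps[e <= t].sum())

# Toy data (hypothetical): subject A (z=1) has gap 1.0 and censored gap 3.0;
# subject B (z=0) has gaps 2.0, 4.0 and censored gap 0.5.
x = np.array([1.0, 3.0, 2.0, 4.0, 0.5])
d = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
z = np.array([[1.0], [1.0], [0.0], [0.0], [0.0]])
theta = np.zeros(1)
values = [breslow_gap(t, theta, x, d, z) for t in (0.9, 2.5, 4.0)]
```

For the toy data at $\theta = 0$ the at-risk counts at the three complete gaps are 4, 3 and 1, so the estimate steps through 0, 1/4 + 1/3, and 1/4 + 1/3 + 1.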


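$\hat{\Lambda}_0(\cdot|\theta)$ & non-monotonicity of logrank EF

With $N_i = N_i^{\dagger}(\tau_i-)$, may write logrank EF as

\[
\hat{S}_{LR}(\theta) \equiv \sum_{i=1}^{n} Z_i \left[ N_i - \sum_{j=1}^{N_i+1} \hat{\Lambda}_0\big(X_{ij}\, e^{\theta' Z_i} \,\big|\, \theta\big) \right]
\]

for $X_{ij} = T_{ij}$, $j \le N_i$ & $X_{i,N_i+1} = \tau_i - S_{i,N_i}$.

• Monotonicity of $\hat{S}_{LR}(\theta)$ fails because $Z_i\, \hat{\Lambda}_0\big(X_{ij}\, e^{\theta' Z_i} \,\big|\, \theta\big)$ isn’t monotone in $\theta_k$, $k = 1 \ldots p$.
• Why: given $t$, $\hat{\Lambda}_0(t|\theta)$ isn’t monotone in $\theta_k$, $k = 1 \ldots p$.
• However: given $\theta$, $\hat{\Lambda}_0(t|\theta)$ is monotone in $t$ . . .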

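Suggests solving $\tilde{S}_{LR}(\theta|\theta^*) = 0$, where $\theta^* \overset{P}{\to} \theta_0$ and

\[
\tilde{S}_{LR}(\theta|\theta^*) \equiv \sum_{i=1}^{n} Z_i \left[ N_i - \sum_{j=1}^{N_i+1} \hat{\Lambda}_0\big(X_{ij}\, e^{\theta' Z_i} \,\big|\, \theta^*\big) \right].
\]

Monotonicity concerns vanish. But, solution $\tilde{\theta}_{LR}$ distinct from $\hat{\theta}_{LR}$:

\[
n^{1/2}(\tilde{\theta}_{LR} - \hat{\theta}_{LR}) = O_p(1) \quad \text{unless} \quad |\theta^* - \hat{\theta}_{LR}| = o_p(n^{-1/2}).
\]

Still: useful observation, because suggests

• $\tilde{\theta}_{LR}$ good starting value for solving $\hat{S}_{LR}(\theta) = 0$
• applied iteratively, produces consistent sequence of solutions asymptotically equivalent to $\hat{\theta}_{LR}$.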

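Asymptotics

Under suitable conditions and for given weight $W$,

• $\sqrt{n}(\hat{\theta} - \theta_0) \overset{L}{\to} N(0, \Sigma)$, where $\Sigma = D^{-1} \Sigma_0 D^{-1}$, $\Sigma_0$ is score variance, & $D$ depends on $\lambda_0(\cdot)$, $\lambda_0'(\cdot)$.
• $\sqrt{n}(\hat{\Lambda}_0(\cdot|\hat{\theta}) - \Lambda_0(\cdot)) \Rightarrow G$, where $G$ a Gaussian process with covariance function

\[
\mathrm{Var}\left( \int_0^{t \wedge s} \frac{M_1(du|\theta_0)}{E[Y_1(u|\theta_0)]} \right) + A^T(t)\, \Sigma\, A(s),
\]

where $A(\cdot)$ depends on $\lambda_0(\cdot)$, $\lambda_0'(\cdot)$.

Proof: martingale-type calculations, empirical processes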

Efficiency Considerations for $\theta$

• Asymptotically optimal choice: $W = Q$ for $Q(u) = \dfrac{\lambda_0'(u)}{\lambda_0(u)}\, u + 1$.
• For constant $c > 0$, solving ODE $Q(u) = c$ for $\lambda_0(\cdot)$ yields $\lambda_0(u) = k u^{c-1}$ for some $k > 0$. Implies $W(u|\theta) \equiv 1$ (i.e., logrank-type EF) optimal when $\lambda_0(\cdot)$ in Weibull family.
• Results parallel AFT model. If $T = V e^{-\theta_0' Z}$ for $V \sim$ Weibull, then $\log T = \log V - \theta_0' Z$ for $\log V \sim$ Extreme Value. Well-known that Tsiatis’s logrank EF optimal in this case (e.g., Ritov, 1990).
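The ODE step can be written out in a few lines, using only the definition $Q(u) = \lambda_0'(u)\, u / \lambda_0(u) + 1$; the Weibull parametrization with shape $c$ and scale $b$ used at the end is an assumed convention:

\[
\begin{aligned}
Q(u) = c
&\iff \frac{\lambda_0'(u)}{\lambda_0(u)} = \frac{c - 1}{u}
\iff \frac{d}{du} \log \lambda_0(u) = \frac{c - 1}{u} \\
&\implies \log \lambda_0(u) = (c - 1) \log u + \log k
\implies \lambda_0(u) = k\, u^{c-1}, \quad k > 0.
\end{aligned}
\]

A Weibull distribution with shape $c$ and scale $b$ has hazard $(c/b)(u/b)^{c-1}$, which is of this form with $k = c\, b^{-c}$. Since a constant weight only rescales the EF, $W = Q \equiv c$ gives the same root as $W \equiv 1$.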


Variance Estimation

• Variances depend on $A(\cdot)$, $D$ (i.e., on $\lambda_0(\cdot)$, $\lambda_0'(\cdot)$).
• In similar rank-based estimation problems, LWY (1998) propose simulation-oriented approach for empirical variance estimation. Huang (2002) proposes “inverse numerical differentiation” for estimating variance of $\hat{\theta}$.
• Devised new approach combining small-scale simulation and regression to approximate $A(s)$, $D$; variance for $\hat{\theta}$ and $\hat{\Lambda}_0(\cdot)$ then directly estimated from asymptotic formulas.
• Less computational effort than LWY; less bias, comparable variance to Huang estimator for $\mathrm{Var}(\hat{\theta})$.

Algorithm for $\mathrm{Var}(\hat{\theta})$

For large $n$ and $\theta$ near $\theta_0$, asymptotic linearity of score implies

\[
\hat{S}_W(\theta) \doteq \hat{S}_W(\theta_0) + D\,(\theta - \theta_0) + \text{error}.
\]

Hence, for $\theta^*$ near $\hat{\theta}$, above implies

\[
\hat{S}_W(\theta^*) - \hat{S}_W(\hat{\theta}) \doteq D\,(\theta^* - \hat{\theta}) + \text{other error},
\]

i.e., $D$ can be viewed as slope of regression of multivariate “response” $\hat{S}_W(\theta^*) - \hat{S}_W(\hat{\theta})$ on “covariate” $\theta^* - \hat{\theta}$. Can interpret $A(\cdot)$ similarly . . .

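1. Simulate several $\theta^*$ in “reasonable” neighborhood of $\hat{\theta}$ (e.g., 50 simulated values, normally distributed about $\hat{\theta}$)
2. Approximate $D$ as indicated; call result $\hat{D}$
3. Estimate $\Sigma_0$ via

\[
\hat{\Sigma}_0 = \frac{1}{n} \sum_{i=1}^{n} \int_0^{\infty} \left\{ \frac{\sum_{j=1}^{n} Z_j^{\otimes 2}\, Y_j(u|\hat{\theta})}{\sum_{j=1}^{n} Y_j(u|\hat{\theta})} - \big[ \bar{\mathcal{E}}(u|\hat{\theta}) \big]^{\otimes 2} \right\} N_i(du|\hat{\theta})
\]

4. Compute $\hat{\Sigma} = \hat{D}^{-1}\, \hat{\Sigma}_0\, \hat{D}^{-1}$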
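Steps 1, 2 and 4 can be sketched as follows. This is a minimal illustration, not the authors' implementation: `score` is a hypothetical callable returning the EF at a given $\theta$ (assumed to be scaled consistently with $\hat{\Sigma}_0$), the perturbation spread and the no-intercept least-squares fit are assumptions, and $\hat{\Sigma}_0$ from step 3 is taken as given.

```python
import numpy as np

def regression_slope(score, theta_hat, n_draws=50, spread=0.1, rng=None):
    """Approximate D as the slope of score differences on theta differences."""
    rng = np.random.default_rng() if rng is None else rng
    p = theta_hat.size
    # Step 1: perturbations theta* - theta_hat, normally distributed about 0.
    delta = rng.normal(scale=spread, size=(n_draws, p))
    base = score(theta_hat)
    # "Response": score(theta*) - score(theta_hat), one row per draw.
    resp = np.array([score(theta_hat + dl) - base for dl in delta])
    # Step 2: least squares without intercept, resp ~ delta @ D.T.
    coef, *_ = np.linalg.lstsq(delta, resp, rcond=None)
    return coef.T

def sandwich(d_hat, sigma0_hat):
    """Step 4: D^{-1} Sigma_0 D^{-1}, following the formula as stated."""
    d_inv = np.linalg.inv(d_hat)
    return d_inv @ sigma0_hat @ d_inv

# Check on an exactly linear (hypothetical) score, where the slope is known.
D_true = np.array([[-2.0, 0.5], [0.3, -1.0]])

def linear_score(th):
    return D_true @ th + np.array([1.0, -1.0])

d_hat = regression_slope(linear_score, np.array([0.2, -0.1]),
                         rng=np.random.default_rng(0))
sigma_hat = sandwich(d_hat, np.eye(2))
```

With a real rank-based EF the score is a step function, so the spread of the perturbations has to be wide enough for the differences to register jumps yet narrow enough for the linear approximation to hold; the value used here is arbitrary.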


Simulation Results

• Gap times $T \sim V e^{-0.5 Z}$, $\tau \sim$ Uniform(0, 3.5)
• Exponential: $V \sim$ Exp(1), $Z \sim$ Bernoulli(0.5)
• Gamma: $V \sim$ Gamma(0.75, 0.75), $Z \sim$ Normal(0, 1)
• Logrank weight, minimized $\|\hat{S}_{LR}(\theta)\|$ via simplex method with $\hat{\theta}_G$ as starting value (computed via LP methods).

                                 Exponential            Gamma
                                 n=50      n=100        n=50      n=100
Bias($\hat{\theta}$)             -0.004     0.002        0.005    -0.002
Emp SD($\hat{\theta}$)            0.211     0.144        0.095     0.065
Bias($\widehat{SD}$)             -0.006    -0.003        0.000     0.000
Bias(Huang $\widehat{SD}$)       -0.020    -0.009       -0.003    -0.002
SD of ($\widehat{SD}$)            0.025     0.012        0.018     0.009
SD of (Huang $\widehat{SD}$)      0.036     0.014        0.018     0.009
Mean # events                     2.3 (Exponential)      3.7 (Gamma)

                                 Exponential            Gamma
                                 n=50      n=100        n=50      n=100
Bias($\hat{\Lambda}_0(1)$)       -0.001    -0.006       -0.009    -0.004
Bias($\hat{\Lambda}_0(2)$)       -0.008    -0.001        0.007     0.006
Emp SE($\hat{\Lambda}_0(1)$)      0.172     0.117        0.185     0.125
  Bias($\widehat{SD}$)           -0.005    -0.000       -0.005     0.001
Emp SE($\hat{\Lambda}_0(2)$)      0.389     0.269        0.525     0.374
  Bias($\widehat{SD}$)           -0.016    -0.008       -0.001    -0.009
$E[Y_1(1|0.5)]$                   0.81 (Exponential)     0.51 (Gamma)
$E[Y_1(2|0.5)]$                   0.19 (Exponential)     0.07 (Gamma)

VA Bladder Tumor Data (Byar, 1980)

• Analyzed in several places, including LWY (1998), who use it to illustrate MAM model: $E[N^{\upsilon}(t)|Z] = \mu_0(t\, e^{\theta_0' Z})$.
• Treatment: Placebo (1), Thiotepa (0)
• Covariates: Initial # (1-8), largest tumor size (1-8 cm)
• 48 patients with 87 recurrences in placebo group; 38 patients with 45 recurrences in thiotepa group. Number of recurrences per patient ranges from 0 to 9.

EF     Cov            AGT Model $\hat{\theta}$     MAM Model $\hat{\theta}$
                      Est        SE                Est        SE
LR     Treatment       0.442     0.286              0.542     0.312
       Initial Num     0.258†    0.068              0.204†    0.066
       Initial Size   -0.010     0.097             -0.038     0.084
Ghn    Treatment       0.433     0.257              0.657     0.314
       Initial Num     0.207†    0.064              0.218†    0.086
       Initial Size   -0.008     0.090             -0.022     0.101

† Highly significant at 5% level

• AGT model suggests recurrence times are expanded on treatment (0 = thiotepa), reduced as number of initial tumors increases; MAM model $E[N^{\upsilon}(t)|Z] = \mu_0(t\, e^{\theta_0' Z})$ suggests mean # of recurrences exhibits same pattern.
• Model correspondence? i.e., when does

\[
E[N^{\upsilon}(t)|Z] = \mu_0(t\, e^{\theta_0' Z}) = \int_0^t \lambda_0\big(e^{\theta_0' Z} R(u)\big)\, e^{\theta_0' Z}\, du?
\]

Evidently: if $\lambda_0(u) = \lambda_0$, in which case $\mu_0(\cdot) = \Lambda_0(\cdot)$. That is: AGT, MAM models coincide if baseline gap times $V_{ij}$’s IID exponential, covariates $Z$ time-independent; otherwise, models distinct.
• Results here, in LWY (1998) do not rule out possibility of exponential gap times for bladder tumor data.
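The constant-hazard case of the model correspondence can be checked directly. With $\lambda_0(u) \equiv \lambda_0$, the integrand no longer depends on the backward recurrence time $R(u)$:

\[
\int_0^t \lambda_0\big(e^{\theta_0' Z} R(u)\big)\, e^{\theta_0' Z}\, du
= \int_0^t \lambda_0\, e^{\theta_0' Z}\, du
= \lambda_0\, t\, e^{\theta_0' Z}
= \Lambda_0\big(t\, e^{\theta_0' Z}\big),
\qquad \Lambda_0(s) = \lambda_0 s.
\]

This agrees with the mean of a homogeneous Poisson process with rate $\lambda_0 e^{\theta_0' Z}$, which is what a renewal process with exponential gap times is, so $\mu_0(\cdot) = \Lambda_0(\cdot)$ in this case.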

[Figure: Estimated Cumulative Baseline Intensity. Cumulative hazard estimate (Cum Haz) with 95% CI and 95% EP CB; vertical axis $\Lambda_0(t)$, 0 to 3; horizontal axis Time (in months), 0 to 60.]

• MAM model more robust to model assumptions; e.g., valid for correlated and even non-iid gap times. Can summarize broad trends in total event counts, but no more; also, requires strong independent censoring conditions.

• AGT model can be extended to case where baseline gap times are correlated, have same marginal distributions (i.e., frailty-type settings); seems harder to move beyond because of “baseline” distributional assumptions.

• With AGT model, can additionally estimate recurrence time distribution, median recurrence times, etc . . . ; main expense is stronger modeling assumptions.


[Figure: Estimated Recurrence Time Distributions by Treatment (At Mean Number = 2.5, Size = 2). Curves of $P(T > t \mid Z = z) = e^{-\Lambda_0(e^{\theta' z} t)}$ for Thiotepa and Placebo; vertical axis $P(T > t \mid Z = z)$, 0 to 1; horizontal axis Recurrence time $T$ (in months), 0 to 60. Median(Placebo) = 9.45; Median(Thiotepa) = 14.65.]
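Under the AGT model the recurrence time distribution is $P(T > t \mid Z = z) = \exp\{-\Lambda_0(e^{\theta' z} t)\}$, so the median solves $\Lambda_0(e^{\theta' z} t) = \log 2$. The sketch below shows that calculation; the function names and the exponential-type cumulative hazard used in the check are hypothetical, and in practice an estimated (step-function) $\hat{\Lambda}_0$ and its generalized inverse would be plugged in.

```python
import numpy as np

def survival(t, theta, z, cum_hazard):
    """P(T > t | Z = z) = exp(-Lambda_0(exp(theta'z) * t))."""
    return np.exp(-cum_hazard(np.exp(np.dot(theta, z)) * t))

def median_gap(theta, z, inv_cum_hazard):
    """Median gap time: Lambda_0^{-1}(log 2) * exp(-theta'z)."""
    return inv_cum_hazard(np.log(2.0)) * np.exp(-np.dot(theta, z))

# Assumed baseline for the check: Lambda_0(s) = rate * s.
rate = 0.05

def cum_hazard(s):
    return rate * s

def inv_cum_hazard(v):
    return v / rate

theta, z = np.array([0.4]), np.array([1.0])
med = median_gap(theta, z, inv_cum_hazard)
surv_at_med = survival(med, theta, z, cum_hazard)
```

Increasing $\theta' z$ shrinks the median by the factor $e^{-\theta' z}$, which is the "acceleration" reading of the covariate effect.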


Some Extensions

• Kalbfleisch & Prentice (2002, §9.4.3) suggest tumor recurrence intensity may depend on past recurrence history. Extension of AGT model to time-dependent covariates would (i) enable more in-depth investigation (e.g., effect of recurrence pattern and timing); (ii) improve ability to predict subject-specific recurrence trajectories; and (iii) further relax censoring assumptions.
• Frailty-type intensity models for correlated gap times; what is appropriate EF? Estimator for $\Lambda_0$? Also: is EF for $\theta$ useful under weaker marginal model assumptions?
• Dependent censoring/stopping (e.g., observation ends at $S^* \wedge \tau$, where $S^*$ correlated with $N^{\upsilon}(\cdot)$).