Conventional regression models. Auto-generated units. Consequences of auto-
generation. Arguments pro and con. Sampling bias in logistic models.
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias in logistic models Peter McCullagh Department of Statistics University of Chicago
Health Studies, Jan 23, 2008
http://www.rss.org.uk/pdf/McCullaghFeb2008.pdf Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Outline 1
2
3
4
Conventional regression models Gaussian models Binary regression model Properties of conventional models Problems with conventional models Auto-generated units Point process model Consequences of auto-generation Sampling bias Treatment and randomization Notation Estimating functions Interference Arguments pro and con Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Gaussian models Binary regression model Properties of conventional models Problems with conventional models
Conventional regression model Fixed set U (usually infinite): u1 , u2 , . . . are subjects, plots... Covariate x(u1 ), x(u2 ), . . . (non-random, vector-valued) Response Y (u1 ), Y (u2 ), . . . (random, real-valued) Regression model: Sample = finite ordered subset u1 , . . . , un (distinct in U) For each sample with configuration x = (x(u1 ), . . . , x(un )) Distribution px (y) on Rn depends on x Example: px (y ∈ A; θ) = Nn (X β, σ02 In + σ12 K [x])(A) A ⊂ Rn , K [x] = {K (xi , xj )} block-factor models, spatial models, spline models,... Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Gaussian models Binary regression model Properties of conventional models Problems with conventional models
Binary regression model (GLMM) Units: u1 , u2 , . . . subjects, patients, plots (labelled) Covariate x(u1 ), x(u2 ), . . . (non-random, X -valued) Process η on X (Gaussian, for example) Responses Y (u1 ), . . . conditionally independent given η logit pr(Y (u) = 1 | η) = α + βx(u) + η(x(u)) Joint distribution for sample having configuration x px (y) = Eη
n Y e(α+βxi +η(xi ))yi 1 + eα+βxi +η(xi ) i=1
parameters α, β, K ,
K (x, x 0 ) = cov(η(x), η(x 0 )). Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Gaussian models Binary regression model Properties of conventional models Problems with conventional models
Binary regression model: computation GLMM computational problem: n Y e(α+βxi +η(xi ))yi px (y) = φ(η; K ) dη 1 + eα+βxi +η(xi ) Rn
Z
i=1
Options: Taylor approx: Laird and Ware; Schall; Breslow and Clayton, McC and Nelder, Drum and McC,... Laplace approximation: Wolfinger 1993; Shun and McC 1994 Numerical approximation: Egret E.M. algorithm: McCulloch 1994 for probit models Monte Carlo: Z&L,... Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Gaussian models Binary regression model Properties of conventional models Problems with conventional models
But, . . ., wait a minute...
But . . . px (y) is not usually the relevant distribution! Why not? Sampling: fixed x versus random x stratum distribution versus conditional distribution
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Gaussian models Binary regression model Properties of conventional models Problems with conventional models
Mathematical properties of regression model (i) Model is a set of processes with U as index set covariate structure: x : U → X (fixed and given) (ii) sample = finite ordered subset of distinct units sample configuration x = (x(u1 ), . . . , x(un )) (iii) Model: sample 7→ response distribution px (y) (iv) Two samples having same x also have same response distribution. (exchangeability, no unmeasured confounders,...) (v) Distribution of Y (u) depends only on x(u), not on x(u 0 ) (no interference, Kolmogorov consistency) What if ... u1 , . . . , un were generated at random? Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Gaussian models Binary regression model Properties of conventional models Problems with conventional models
Problems in the application of conventional models Clinical trials / market research / traffic studies / crime... (i) Sample units generated by a random process sequential recruitment, purchase events, traffic studies... (ii) Population also generated by a random process in time animal populations, purchase events, crime events,... (iii) Sample as a fixed subset: what does this mean? — either mathematically or practically (iv) Random sample: No alternative, but the distribution may be different from that for a fixed sample (v) Sequential sample: Is necessarily a random sample What about x-values? Conditional distribution versus stratum distribution Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Point process model
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Point process model
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Point process model
Point process model for auto-generated units y=2
×..... ×.....
y=1 y=0
×.....
X
•
.. .. .. .. . ....... ... .
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. . .. ... . ... .... .. .. .... .. .. .... .. .. ...... ... ... .... .. .. .... .. .. ......................... ...... ... .... .. . .
×××
•••• •
×.....
.. .. . ... .. . .. .. ... .. .. .. .. .. .. .. ... ... .. .. .. .. .. .. .. .. .. ... ... .. .. .. .. .. .. .. .. .. ............. ...... ..
×
••
.. .. ×..... ×× .. .. . .
. .. ..× ×× .. .. ... . .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ....... ... .
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .............. ... ... ..
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. . .. .. .. .. .. .. .. ... ... .. .. .. ............. ...... ..
×
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. . ....... ... .
.. .. .. .. .. .. .. .. .. .. .. .. .. .. .. .. ... .. ... . ... .. . ....... .... .
×
• •• •• • •
×.....
× × ×
.. ...... .. .... .. .... .. .. .... .. .. .... .. .. .... .. .. .... .. . .... .. .. .... .. .. .. .. .... .. .. .. .. .... .. .. .. .. .... .. .. .. .. .... .. .. .. .. .... .. .. .. .. .... .. .. .. .. .... .. .. .. .. .... .. .. .. .. .... .. .. .. ... .. .. ... ... .. ...... ..... .. .. .. .. .. .. .. .... .... .. .. .. .. .. .. .. .... .... .. .. .. .. .. .. .. .... .... .. .. .. .. .. .. .. .... .... .. .. .. .. .. .. .. .... .... .. .. .. .. .. .. .. .... .... .................... ....... ................... .......... ............ ... ...... ... ... ...... ..... ..... . . . . . . . .. ..
×
×× × ×
.. .. ×× .. .. ..
× ×..... ×
••• • ••• •• •
.. .. .. .. .. .. .. .. .. .. .. .. .. .. ....... ... .
•
.. .. .. .. .. ... ... .. .. .. ... ... ... ... .. .. .. ... ... ... ... .. .. ... ... .. .. ... ... .. .. .. .. .. ... ... ... ... . . .. ... ... ... ... .. .. .. .. . . .. ......................... ... ......... . . ..
×× ••••
×..... .. .. .. .. .. .. .. .. .. .. .. .. .. .. ....... ... .
•
A point process on C × X for C = {0, 1, 2}, and the superposition process on X . Intensity λr (x) for class r: r = 0, 1, 2. x-values auto-generated by the superposition process with intensity λ. (x). To each auto-generated unit there corresponds an x-value and a y-value.
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Point process model
Binary point process model Intensity process λ0 (x) for class 0, λ1 (x) for class 1 Log ratio: η(x) = log λ1 (x) − log λ0 (x) Events form a PP with intensity λ on {0, 1} × X . Conventional GLMM calculation (Bayesian and frequentist): λ1 (x) eη(x) = λ. (x) 1 + eη(x) λ1 (x) eη(x) pr(Y = 1 | x) = E =E λ. (x) 1 + eη(x)
pr(Y = 1 | x, λ) =
Calculation is correct in a sense, but irrelevant. . . . . . there might not be an event at x! Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Point process model
Correct calculation for auto-generated units
pr(event of type r in dx | λ) = λr (x) dx + o(dx) pr(event of type r in dx) = E(λr (x)) dx + o(dx) pr(event in SPP in dx | λ) = λ. (x) dx + o(dx) pr(event in SPP in dx) = E(λ. (x)) dx + o(dx)
pr(Y (x) = r | SPP event at x) =
λr (x) Eλr (x) 6= E Eλ. (x) λ. (x)
Sampling bias: Distn for fixed x versus distn for autogenerated x. Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Point process model
Two ways of thinking First way: waiting for Godot! Fix x ∈ X and wait for an event to occur at x (x) pr(Y = 1 | λ, x) = λλ1. (x) (x) = E(Yi | i : Xi = x) pr(Y = 1; x) = E λλ1. (x) Conventional, mathematically correct, but seldom relevant Second way: come what may! SPP event occurs at x, a random point in X joint density at (y , x) proportional to E(λy (x)) = my (x) x has marginal density proportional to E(λ. (x)) = m. (x) λ1 (x) Eλ1 (x) pr(Y = 1 | x) = 6= E Eλ. (x) λ. (x) Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias Treatment and randomization Notation Estimating functions Interference
Log Gaussian illustration of sampling bias η0 (x) ∼ GP(0, K ),
λ0 (x) = exp(η0 (x))
η1 (x) ∼ GP(α + βx, K ), η(x) = η1 (x) − η0 (x) ∼ GP(α + βx, 2K ),
λ1 (x) = exp(η1 (x)) K (x, x) = σ 2
One-dimensional sampling distributions: eα+βx(u)+η(x) ρ(x(u)) = pr(Y (u) = 1) = E 1 + eα+βx(u)+η(x) ∗ logit(ρ(x)) ' α + β ∗ x (|β ∗ | < |β|) 2
eα+βx+σ /2 Eλ1 (x) = σ2 /2 Eλ. (x) e + eα+βx+σ2 /2 logit pr(Y = 1 | x ∈ SPP) = α + βx
π(x) = pr(Y = 1 | x ∈ SPP) =
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias Treatment and randomization Notation Estimating functions Interference
Randomization: Simple treatment effect Event x –> pair (t(x), y (x)), treatment status, outcome (t, y ) (0, 0) (0, 1) (1, 0) (1, 1)
H0 Ha λy (x)γt λy (x)γt,y λ0 (x)γ0 λ0 (x)γ00 λ1 (x)γ0 λ1 (x)γ01 λ0 (x)γ1 λ0 (x)γ10 λ1 (x)γ1 λ1 (x)γ11
Expected intensity my (x)γt,y m0 (x)γ00 m1 (x)γ01 m0 (x)γ10 m1 (x)γ11
Conditional on λ: odds(Y (x) = 1 | t, λ) = λ1 (x)γt1 /(λ0 (x)γt0 ) odds ratio = τ
= γ11 γ00 /(γ01 γ10 )
Treatment effect constant in x and independent of λ Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias Treatment and randomization Notation Estimating functions Interference
GLMM analysis Sample (X , Y , t)1 , (X , Y , t)2 , . . . observed sequentially GLMM analysis for a single event at x λ1 (x)γt1 λ1 (x)γt1 + λ0 (x)γt0 λ1 (x)γt1 pr(Y (x) = 1 | t) = E λ1 (x)γt1 + λ0 (x)γt0 ∗ odds ratio = τ (|τ ∗ | < |τ |)
pr(Y (x) = 1 | t, λ) =
Conclusion: treatment effect is attenuated |τ ∗ | < |τ |
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias Treatment and randomization Notation Estimating functions Interference
Point process analysis Sample (X , Y , t)1 , (X , Y , t)2 , . . . observed sequentially Correct analysis for a single event at x λ1 (x)γt1 λ1 (x)γt1 + λ0 (x)γt0 E(λ1 (x))γt1 pr(Y (x) = 1 | t, x ∈ SPP) = E(λ1 (x))γt1 + E(λ0 (x))γt0 odds(Y (x) = 1 | t, x ∈ SPP) = m1 (x)γt1 /(m0 (x)γt0 ) pr(Y (x) = 1 | t, λ) =
odds ratio = γ11 γ00 /(γ01 γ10 ) = τ Conclusion: No attenuation of treatment effect.
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias Treatment and randomization Notation Estimating functions Interference
Notation: meaning of E(Yi | Xi = x) Exchangeable sequence (Y1 , X1 ), (Y2 , X2 ), . . . Probabilistic interpretation/definition: i fixed, Xi random π(x) = pr(Yi = 1 | Xi = x) = p1 (1, x)/(p1 (0, x) + p1 (1, x)) Biostatistical interpretation: ρ(x) = E(Yi | i : i ∈ Ux ) Ux = {i : Xi = x}: x fixed, i ∈ Ux random eα+βx E(λ1 (x)) = E(λ. (x)) 1 + eα+βx ∗ ∗ eα +β x λ1 (x) ρ(x) = E ' λ. (x) 1 + eα∗ +β ∗ x
π(x) =
Conditional mean π(x) versus stratum mean ρ(x) Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias Treatment and randomization Notation Estimating functions Interference
Consequences of ambiguous notation Sample (Y1 , X1 ), (Y2 , X2 ), . . . observed sequentially E(λ1 (x)) eα+βx = = E(Yi | Xi = x) E(λ. (x)) 1 + eα+βx ∗ ∗ eα +β x λ1 (x) ρ(x) = E ' = E(Yi | i : Xi = x) λ. (x) 1 + eα∗ +β ∗ x
π(x) =
(ρ(x) computed by logistic-normal integral) Conventional PA estimating function Yi − ρ(xi ) is such that E(Y1 − ρ(x) | X1 = x) = π(x) − ρ(x) 6= 0 and same for Y2 , . . .. Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias Treatment and randomization Notation Estimating functions Interference
Estimating functions done correctly Mean intensity for class r : mr (x) = E(λr (x)) π(x) = m1 (x)/m. (x); ρ(x) = E(λ1 (x)/λ. (x)) For random selected units i ∈ Ux , E(Yi ) = ρ(x) For autogenerated x, E(Y |x ∈ SPP) = π(x) 6= ρ(x) X T (x, y) = h(x)(Y (x) − π(x)) x∈SPP
has zero mean for auto-generated configurations x. Note: E(T | x) 6= 0; average is also over configurations
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias Treatment and randomization Notation Estimating functions Interference
Explanation of unbiasedness z = {(x1 , y1 ), . . .} configuration observed in (0, t). x = {x1 , . . . , } marginal configuration (SPP) z is a random measure with mean t mr (x) ν(dx) at (r , x) x is marginal random measure with mean t m. (x) ν(dx) at x πr (x) = mr (x)/m. (x) Hence t −1 E z(r , dx)−πr (x)x(dx) = mr (x) ν(dx)−πr (x)m. (x) ν(dx) = 0 for all x implies Z
h(x) z(r , dx) − πr (x)x(dx)
T = C×X
has zero expectation. But E(T | x) 6= 0. Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias Treatment and randomization Notation Estimating functions Interference
variance calculation: binary case (y, x) generated by point process; X T (x, y) = h(x)(Y (x) − π(x)) x∈SPP
E(T | x) 6= 0 Z h2 (x)π(x)(1 − π(x)) m. (x) dx var(T ) = ZX + h(x)h(x 0 ) V (x, x 0 ) m.. (x, x 0 ) dx dx 0 2 ZX + h(x)h(x 0 )∆2 (x, x 0 )m.. (x, x 0 ) dx dx 0
E(T (x, y)) = 0;
X2
V : spatial or within-cluster correlation; ∆: interference Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Sampling bias Treatment and randomization Notation Estimating functions Interference
What is interference? Physical interference: distribution of Y (u) depends on x(u 0 ) Sampling interference for autogenerated units mr (x) = E(λr (x)); mrs (x, x 0 ) = E(λr (x)λs (x 0 )) Univariate distributions: πr (x) = mr (x)/m. (x) Bivariate: πrs (x, x 0 ) = mrs (x, x 0 )/m.. (x, x 0 ) πrs (x, x 0 ) = pr(Y (x) = r , Y (x 0 ) = s | x, x 0 ∈ SPP) Hence πr . (x, x 0 ) = pr(Y (x) = r | x, x 0 ∈ SPP) ∆r (x, x 0 ) = πr . (x, x 0 ) − πr (x) No second-order sampling interference if ∆r (x, x 0 ) = 0 Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Autogeneration of units in observational studies Q: Subject was observed to engage in behaviour X . What form Y did the behaviour take? Application Marketing Ecology Ecology Traffic study Traffic study Law enforcement Epidemiology Epidemiology
X car purchase sex play highway use highway speeding burglary birth defect cancer death
Y brand activity class relatives or non-relatives speed colour of car/driver firearm used? type of defect cancer type
Units/events auto-generated by the process
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Auto-generation as a model for self-selection Economics: Event: single; in labour force; seeks job training Attributes (Y ): (age, job training (Y/N), income) Epidemiology: Event: birth defect Attributes: (age of M, type of defect, state) Clinical trial: Event: seeks medical help; diagnosed C.C.; informed consent; Attributes: (age, sex, treatment status, survival) What is the population of statistical units? Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Mathematical considerations
Restriction: if pk () is the distribution for k classes, what is the distribution for k − 1 classes? Does restricted model have same form? Answer: Weighted sampling Closure under weighted or case-control sampling Closure under aggregation of homogeneous classes
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Mathematical considerations
Restriction: if pk () is the distribution for k classes, what is the distribution for k − 1 classes? Does restricted model have same form? Answer: Weighted sampling Closure under weighted or case-control sampling Closure under aggregation of homogeneous classes
Peter McCullagh
Auto-generated units
Conventional regression models Auto-generated units Consequences of auto-generation Arguments pro and con
Mathematical considerations
Restriction: if pk () is the distribution for k classes, what is the distribution for k − 1 classes? Does restricted model have same form? Answer: Weighted sampling Closure under weighted or case-control sampling Closure under aggregation of homogeneous classes
Peter McCullagh
Auto-generated units