Matrix form of chi-square and contrasts for simultaneous tests of joint odds ratio. Leif E. Peterson, Ph.D. Dept. of Medicine; Dept. of Molec. & Human Genetics.
Matrix form of chi-square and contrasts for simultaneous tests of joint odds ratio Leif E. Peterson, Ph.D. Dept. of Medicine; Dept. of Molec. & Human Genetics Baylor College of Medicine Faculty Center, Office 10.57, ext. 8-5386
Introduction • Additive and multiplicative models are widely used in epidemiology to obtain maximum likelihood (ML) estimates of relative risks (RR), odds-ratios (OR) and hazard ratios (HR). • Numerous computer programs have been developed that generate ML estimates of regression coefficients and variance-covariance matrices. • Despite this wide diversity of programs, there is a shortage of stand alone algorithms that allow the statistician to generate multifactor rel1
ative risks and confidence limits, and conduct tests of hypotheses using results of previous regression analyses. Notation and Theory According to asymptotic likelihood theory, the score vectors of a regression model are ∂logL(b) U (b) = , (1) ∂b where logL is the log of the likelihood function for the particular model under consideration, that is, the GSK, Cox proportional hazards, logit, Poisson or other model. The Fisher information matrix I(b) of the score vectors is I(b) = E[U (b)U T (b)] 2 ∂ logL(b) , = − E ∂bj∂bk 2
(2)
the generalized-inverse of which is the variance-covariance matrix, Vb, written as ⎞ ⎛ 2 σ (b0) σ(b0, b1) · · · σ(b0, bj ) ⎜ σ(b0, b1) σ 2(b1) · · · σ(b1, bj )⎟ ⎟, (3) Vb = ⎜ . . . . ⎠ ⎝ . . . . . σ 2(b0, bj ) σ(b1, bj ) · · · σ 2(bj ) which is a p × p real symmetric matrix whose principal diagonal consists of the parameter variances. The off-diagonals are the covariances between each of the coefficients. Most computer programs print the diagonal and the upper or lower triangular of the variance-covariance matrix, which makes no difference since the matrix is symmetric. When NewtonRaphson iteration is used for convergence, Vb is the generalized-inverse of the Fisher information matrix. For iteratively reweighted least squares (IRLS) regression, Vb is the generalized-inverse of the weighted dispersion
3
matrix shown as Vb = I(b) = (XTVX)−1 .
(4)
A best-asymptotic-normal (BAN) estimate of the parameter vector b is obtained by adding the solution vector ∆b = (XTVX)−1XTVY ,
(5)
to the previous parameter vector at each iteration b(i+1) = bi + ∆b(i+1) ,
(6)
which results in b at convergence
⎛ ⎞ b0 ⎜ b1 ⎟ ⎜ ⎟ ⎟ b=⎜ ⎜ b2 ⎟ . ⎝ .. ⎠ bp 4
(7)
In (5) X is the design matrix, W is a diagonal matrix of inverse variance weights and Y is the column vector of residuals. Using b and Vb from, for example, a four-variable model including terms for an intercept b0 and three risk factors (b1 ,b2 , and b3), the point estimate of RR for the three factors considered jointly is RR = exp(b1u1 + b2u2 + b3u3) ,
(8)
where b1, b2, and b3 are the regression coefficients and u1, u2 and u3 are the units applied to the coefficients. The RR in (8) is based on reference cell coding where each p-level risk factor is coded with p − 1 independent variables. The Taylor series-based (1 − α) cofidence interval (CI) for the interval estimate can be expressed by √ (1 − α)CI RR = exp (ln(RR) ± Zα × V ) . (9) Abramowitz and Stegun [1] give a useful method for determining percentage points of the standard deviate, Zα, in (9) for a specified a level in the
5
form
c0 + c1 t + c2 t2 Zα = t − 1 + d1 + d2 t2 + d3 t3
1 , t = ln 2 α
where
(10)
(11)
and c0 = 2.515517 d1 = 1.432788 c1 = 0.802853 d2 = 0.189269 c2 = 0.010328 d3 = 0.001308. The variance, V , of the logarithm of the RR (9) takes on the form V =
p i
σ 2(bi)(u∗i − ui)2 +
p p j
σij (bi, bj )(u∗i − ui)(u∗j − uj ) , (12)
i=j
where σ 2(bi) is the variance of bi, σij (bi, bj ) is the covariance between bi and bj (3) and (u∗i − ui) and (u∗j − uj ) are the change in units applied 6
to the coefficients. A test of hypothesis of no significant effect due to the joint factors of the parameters in (8) is (13) H o : b1 = b2 = b3 = 0 , which can be rewritten as Ho : Cb = 0 , (14) where C is the contrast matrix forming the desired linear combination of the regression coefficients for the three-parameter test. Thus, for the three joint factors in (8), the C matrix is ⎞ ⎛ 0 1 0 0 (15) C = ⎝0 0 1 0⎠ . 0 0 0 1 Grizzle, Starmer, and Koch [2] introduced the test statistic of the linear hypothesis Ho : Cb = 0, given by (16) χ2 = SS(Cb = 0) = bTCT[CVbCT]−1Cb , which is χ2 distributed with degrees of freedom equal to the number of independent rows in C. Matrix multiplication in (16) is first carried out 7
by obtaining the outer products bT CT and Cb. The variance-covariance matrix Vb is premultiplied by C to give CVb, which is postmultiplied by the transpose of C (CT ) to give CVbCT . The inverse of CVbCT , that is, [CVbCT ]−1, is obtained by using singular value decomposition, which avoids such problems as nullity (zero elements), singularity and collinearity, which may be inherent in ill-conditioned or “sparse” matrices. Lastly, [CVbCT ]−1 is premultiplied by bT CT then postmultiplied by Cb to arrive at bT CT [CVbCT ]−1Cb. If the test statistic exceeds some tabled value of χ2(d.f.,1−α), then we conclude that there is a statistically significant joint effect in the presence of the three factors in our example. Estimating Relative Risk for Multiple Risk Factors Let us take, as an example, the results of a unconditional (unmatched) logistic regression analysis of oral contraceptive use (OC), smoking and
8
myocardial infarction (MI) found in [3]. The coefficients are given as ⎞ ⎛ ⎞ ⎛ Intercept −9.2760 b0 ⎜b1⎟ ⎜ 1.2756 ⎟ OC Use ⎟ ⎜ ⎟ ⎜ ⎜b2⎟ ⎜ 0.1517 ⎟ Age ⎟ ⎜ ⎟ ⎜ ⎟ ⎟ ⎜ Smoke(1 − 24) . (17) b=⎜ ⎜b3⎟ = ⎜ 1.1977 ⎟ ⎜b4⎟ ⎜ 2.0828 ⎟ Smoke ≥ 25 ⎟ ⎜ ⎟ ⎜ ⎝b5⎠ ⎝−1.1709⎠ OCuse ∗ Smoke(1 − 24) OCuse ∗ Smoke ≥ 25 0.3392 b6 where the variable names to the right of the coefficients are the intercept, OC use (coded 0-no, 1-yes), age (continuous variable), smoking 1-24 cigarettes per day (coded 0-no, 1-yes), smoking ≥ 25 cigarettes per day (coded 0-no, 1-yes), interaction term for OC use and smoking 1-24 cigarettes per day and an interaction term for OC use and smoking ≥ 25 cigarettes per day. The variance-covariance matrix Vb for our example is
9
⎛
0.4045 ⎜−0.0676 ⎜ ⎜−0.0087 ⎜ C=⎜ ⎜−0.0387 ⎜−0.0470 ⎜ ⎝ 0.0140 0.0072
. 0.3345 0.0008 0.0324 0.0332 −0.3294 −0.3287
⎞
. . 0.0002 0.0002 0.0004 0.0004 0.0006
. . . . . . . . ⎟ ⎟ . . . . ⎟ ⎟ 0.0476 . . . ⎟ ⎟. 0.0321 0.0489 . . ⎟ ⎟ −0.0471 −0.0310 0.7212 . ⎠ −0.0313 −0.0472 0.3322 0.4424 (18) The RR for a woman who is an OC user that smokes ≥ 25 cigarettes per day would be determined by substituting in the values of the coefficients and applying a unit of 1.0 to each in (8), we get RR = exp(1.2756 × (1 + 2.0828) × (1 + 0.3392) × 1) = exp(3.6976) = 40.35. (19) 10
If we now substitute into (12) the variances and covariances for the three variables in (18), we obtain V as V = 0.3345(1 − 0)2 + 0.0489(1 − 0)2 + 0.4424(1 − 0)2 + 2(0.0332) + 2(−0.3287) + 2(−0.0472) = 0.1404. (20)
The 95% CI for the RR is then calculated from (9) as √ 95% CI RR = exp(ln(RR) ± Zα √V ) = exp(3.6976 ± 1.96 0.1404) = (19.4, 84.1). (21) Now that the RR and Taylor series-based CI have been estimated, we can conduct a simultaneous test of hypothesis for no significant effect due to OC use, smoking ≥ 25 cigarettes per day with respect to the RR: H o : b1 = b4 = b6 = 0 . 11
(22)
For the three factors, the C matrix is ⎞ ⎛ 0 1 00 0 0 0 (23) C = ⎝0 0 0 0 1 0 0⎠ . 0 0 00 0 0 1 By substituting the appropriate values into (16) and performing the matrix manipulation to obtain χ2 we get ⎞ ⎛ −9.2760 ⎜ 1.2756 ⎟ ⎟ ⎞⎜ ⎛ ⎟ 0 1 0 00 0 0 ⎜ ⎜ 0.1517 ⎟ ⎟, 1.1977 (24) Cb = ⎝0 0 0 0 1 0 0⎠ ⎜ ⎟ ⎜ ⎟ 2.0828 0 0 0 00 0 1 ⎜ ⎟ ⎜ ⎝−1.1709⎠ 0.3392 ⎞ ⎛ 1.2756 (25) = ⎝2.0828⎠ , 0.3392 12
⎛ 0 ⎜1 ⎜ ⎜0 ⎜ T T b C = −9.2760 1.2756 0.1517 1.1977 2.0828 −1.1709 0.3392 ⎜ ⎜0 ⎜0 ⎜ ⎝0 0
= 1.2756 2.0828 0.3392 ,
0 0 0 0 1 0 0
⎞ 0 0⎟ ⎟ 0⎟ ⎟ 0⎟ ⎟ , (26) 0⎟ ⎟ 0⎠ 1
⎞ 0.4045 . . . . . . ⎜−0.0676 0.3345 . . . . . ⎟ ⎟ ⎜ ⎛ ⎞⎜ ⎟ . . . . ⎟ 0 1 0 0 0 0 0 ⎜−0.0087 0.0008 0.0002 ⎜ ⎟ . . . ⎟, CVb = ⎝0 0 0 0 1 0 0⎠ ⎜−0.0387 0.0324 0.0002 0.0476 ⎜ ⎟ 0.0489 . . ⎟ 0 0 0 0 0 0 1 ⎜−0.0470 0.0332 0.0004 0.0321 ⎜ ⎟ ⎝ 0.0140 −0.3294 0.0004 −0.0471 −0.0310 0.7212 . ⎠ 0.0072 −0.3287 0.0006 −0.0313 −0.0472 0.3322 0.4424
(27)
⎛
⎛
(28)
⎞
−0.0676 0.3345 0.0008 0.0324 0.0332 −0.3294 −0.3287 = ⎝−0.0470 0.0332 0.0004 0.0321 0.0489 −0.0310 −0.0472⎠ , 0.0072 −0.3287 0.0006 −0.0313 −0.0472 0.3322 0.4424 (29) 13
⎛ 0 ⎜ ⎛ ⎞ ⎜1 −0.0676 0.3345 0.0008 0.0324 0.0332 −0.3294 −0.3287 ⎜ 0 ⎜ CVb CT = ⎝−0.0470 0.0332 0.0004 0.0321 0.0489 −0.0310 −0.0472⎠ ⎜ ⎜0 0.0072 −0.3287 0.0006 −0.0313 −0.0472 0.3322 0.4424 ⎜ ⎜0 ⎝0 0
⎞
⎛
0.3345 0.0332 −0.3287 = ⎝ 0.0332 0.0489 −0.0472⎠ , −0.3287 −0.0472 0.4424 ⎞ ⎛ 11.0866 0.4724 8.2877 [CVbCT]−1 = ⎝ 0.4724 22.8178 2.7855⎠ , 8.2877 2.7855 8.7153 ⎛
bTCT[CVb CT]−1
0 0 0 0 1 0 0
(31)
(32) ⎞
11.0866 0.4724 8.2877 = 1.2756 2.0828 0.3392 ⎝ 0.4724 22.8178 2.7855⎠ , 8.2877 2.7855 8.7153
= 17.9373 49.0723 19.3296 , 14
(34)
⎞
0 0⎟ ⎟ 0⎟ ⎟ 0⎟ ⎟ 0⎟ ⎟ 0⎠ 1
⎛ ⎞ 1.2756 bTCT[CVb CT]−1 Cb = 17.9373 49.0723 19.3296 ⎝2.0828⎠ = 131.6453 . 0.3392
(35)
The resulting χ2 test statistic of 131.65 is significantly greater than the tabled χ2(3,0.95) value of 7.82, indicating that there is a significant joint effect. Thus we reject the hypothesis (22) that there is no joint effect due to OC use, smoking ≥ 25 cigarettes per day and the interaction between OC use and smoking ≥ 25 cigarettes per day. References
[1] Abramowitz, M. and Stegun, A. (1965) Handbook of Mathematical Functions. New York: Dover. [2] Grizzle, J.E., Starmer, C.F., Koch, G.G. (1969) Analysis of categorical data for linear models. Biometrics. 25:489-504. 15
[3] Schlesselman, J.J. (1982) Case-Control Studies: Design, Conduct, Analysis. Oxford: Oxford University.
16