Understanding Heterogeneity in Generalized Mixed and Frailty Models Luc DUCHATEAU and Paul JANSSEN
Variance components are useful parameters to quantify the different sources of randomness in hierarchical models. Interpreting the variance components in generalized mixed and frailty models is not straightforward because the variance is not directly related to quantities with a biological meaning. We therefore investigate how the estimated values of the variance components affect the variability of specific quantities of interest such as the prevalence or the median time to event. We discuss two examples from veterinary science with clustering between animals and show for these examples how variance components can be interpreted in the case of a binary and time-to-event response variable. KEY WORDS: Logistic regression; Random effects; Survival data; Variance components.
assess the variability remaining in the data after covariate adjustment as the covariates might only explain a small part of the variability in the data (Section 2). In mixed models the interpretation of the estimates of the variance components is easy as the random effect operates on the same scale as the values of the response variable. This is, however, not true in generalized mixed models and frailty models as these models link the random effect with the response variable via a nonlinear function. For these models there is a need to interpret the size of the estimates of the variance components through their effects on quantities that have a biological and understandable meaning for the study at hand. In Section 2 we demonstrate this idea for a logistic mixed regression model with success proportion as the quantity of interest. In Section 3 we consider frailty models and demonstrate how the variance component influences the median time to event (Duchateau et al. 2002) and the success proportion at a particular time point based on two real datasets. 2. HETEROGENEITY IN GENERALIZED MIXED MODELS
1.
INTRODUCTION
The modern approach to model datasets with different sources of randomness is based on hierarchical modeling (Brown and Prescott 1999; McCulloch and Searle 2001). For instance, in randomized block designs the block effects could be considered as a random effect rather than taking the classical approach in which the block effects are modeled as fixed effects parameters. Mixed effects models use the available information in the data more efficiently in the case of an unbalanced design (Duchateau and Janssen 1999). The inferential procedures give estimates of the variance components. As will be seen from the examples, this information goes far beyond the idea of providing estimates for nuisance parameters (Carroll 2003). For instance, in animal breeding the heritability coefficient is defined in terms of the variance components and therefore the variance components are actually the parameters of interest. Even if the variance components are nuisance parameters, it is important to evaluate the remaining variability in the data after adjustment by covariates. In large epidemiological studies with many covariates, it is likely that one or more of these covariates influence the outcome significantly due to high power to detect even small and maybe practically meaningless effects. In such cases, it is important to Luc Duchateau is Professor, Department of Physiology, Biochemistry, and Biometrics, Faculty of Veterinary Medicine, Ghent University, Salisburylaan 133, 9820 Merelbeke, Belgium (E-mail:
[email protected]). Paul Janssen is Professor, Limburgs Universitair Centrum, Center for Statistics, Universitaire Campus, 3590 Diepenbeek, Belgium. Duchateau’s research was supported by the Fund for Scientific Research, grant number 31520804. Janssen’s research was supported by the Interuniversity Attraction Poles research network P5/24 of the Belgian State (Federal Office for Scientific, Technical and Cultural Affairs).
© 2005 American Statistical Association DOI: 10.1198/000313005X43236
Interpretation of heterogeneity within the context of a generalized mixed model is demonstrated in the following case study with a binary outcome. Within each of 60 randomly selected farms, 30 pigs are individually screened for the presence of the bacterial species Salmonella. This observational study was set up to investigate which factors at the farm level explain differences in prevalence over the different farms. The variance component attached to the farms can be considered a nuisance parameter, but we will demonstrate that it is worthwhile to study the variance component in the model with the covariates included as it gives us an idea of the remaining variability in the data. The dataset can be downloaded from http://vetstat.ugent.be/research/heterogeneity. We observe the number of pigs with Salmonella infection xi with ni pigs tested in farm i. The distribution of xi is binomial with parameters ni and pi , that is, xi ∼ B (ni ; pi ). First a model is fitted without covariates and with farm as a random effect log
pi 1 − pi
= µ + fi
(1)
with pi the proportion of infected pigs given the random farm effect fi; µ is the overall mean. The distributional assumption is 2 fi ∼ N 0, σf and the fi ’s are independent. Note that the response measured on an observational unit—in our case study the number of animals infected—is binomially distributed whereas the random effect representing the cluster is assumed to come from a normal density. Therefore, it is difficult to compare the variance related to the cluster and the residual variance. The American Statistician, May 2005, Vol. 59, No. 2
143
Figure 1. The density function for the prevalence over different farms in the logistic mixed regression model. Observed (bars) and expected (circles) proportions are shown in (a), the density functions of the prevalence for the three different types of floor are shown in (b). Finally, the density functions for different values of the variance are shown with µ equal to .645 (c) and 2.19 (d).
The logistic mixed regression model was fitted using Gaussian quadrature to integrate out the random effects from the likelihood (NLMIXED procedure in SAS Version 9.1). Fitting this model gives µ ˆ = .646 (se = .37) and σ ˆf2 = 7.74 (se = 1.38). Thus, for a farm with random effect equal to 0, the conditional prevalence is estimated as 65.6% (= exp(ˆ µ)/{1 + exp(ˆ µ)}). Adding covariates to the model we see that only the factor “type of floor” explains the Salmonella prevalence in a significant way (p value is .0052). We therefore propose the following extended model log
pi 1 − pi
= µ + β1 xi1 + β2 xi2 + fi
(2)
with xi1 = 1 for fully slatted floor and 0 otherwise; xi2 = 1 for more than 50% but not fully slatted floor and 0 otherwise; finally xi1 = xi2 = 0 corresponds to equal to or less than 50% slatted floor. Obviously, this covariate operates at the farm level with the “fully slatted floor” category occuring in 80% of the farms. Fitting this model gives µ ˆ = 6.03 (se = 2.08), βˆ1 = −5.88 (se ˆ = 2.11), β2 = −3.70 (se = 2.80), and the variance estimate decreases to σ ˆf2 = 6.29 (se = 1.50). The conditional prevalence for a farm with random effect equal to 0 and fully slatted floor is estimated as 53.8% (=exp(ˆ µ + βˆ1 )/{1 + exp(ˆ µ + βˆ1 )}), for more than 50% but not fully slatted floor as 91.1%, and for less than 50% slatted floor as 99.8%. At this stage, it is important to interpret the variance components of the two models in order to assess whether the inclusion of the factor “type of floor” leads to a substantial 144
General
decrease of the variability of the prevalence over the different farms. Because (1) implies pi = g(fi ) = exp (µ + fi ) / {1 + exp (µ + fi )}, we easily obtain the density function for pi fpi (p)
2 1 1 p 1 −µ . (3) =√ exp − 2 log 2σf 1−p p(1 − p) 2πσf Therefore the between-farm variability σf2 determines the shape of the prevalence density. The variance of the random effects that operate on the logit scale can be interpreted in terms of the spread of the prevalences over the farms by using the density function (3). In order to compare the variability of the observed prevalences with the variability predicted from the logistic regression mixed model without covariates, the observed frequencies for 10 classes (0–10%, . . . , 90–100 %) were derived and expected frequencies were obtained for these classes using the density function (3) with the estimates for µ and σf2 inserted. The observed and expected probabilities for these 10 classes are similar (Figure 1(a)). The introduction of the covariate “type of floor” decreases the variance at the level of the random effect from 7.74 to 6.29. The density function of the Salmonella prevalence for each floor type is also given by (3) with µ replaced by µ, µ + β1 , or µ + β2 depending on the floor type. For the two categories without fully slatted floors, this leads to density functions that are concentrated around a high prevalence value with small probability to find farms in these two categories with low prevalence. The reduction of the variance, however, has little or no impact on the density function of the prevalence for the farms with fully slatted floor:
Figure 2. The density function for the median time to insemination and success proportion at a particular time point over different farms in the frailty model. Observed (bars) and expected (circles) frequencies of median event times are shown in (a), the density functions of the median for heifers and multiparous cows in (b). Finally, the density functions of the proportion of animals with first insemination at different time points (see ˆ ) = .00109) legend: time points 30, 55, 100, and 150 days considered) are shown for heifers (hˆ 0 = .0014) in (c) and for multiparous cows (hˆ 0 exp( β in (d).
prevalences in this type of farms still vary widely between 0 and 1 (Figure 1(b)). In order to study the effect of different values of the variance component at different values for µ, the density function of p for a set of combinations of µ and σf2 is shown in Figures 1(c)– (d). Prevalences are more concentrated around a particular point when the variance of the random effect falls between .1 and .5 in the case that µ equals .646 (observed in study) (Figure 1(c)). With higher values for µ such as 2.2, this is true even for much higher values for the variance (Figure 1(d)). Thus, the variance cannot be assessed by itself but needs to be evaluated in conjunction with the overall prevalence.
3. HETEROGENEITY IN FRAILTY MODELS Also for survival data hierarchical models are needed for studies with random variation at different levels. As a case study, we investigate the time to first insemination in dairy cows. Cows are clustered within farms leading to random farm effects. In this study, data from 9,029 cows from 164 farms were collected. We consider parity as a covariate and group cows according to whether it is their first calving (uniparous cow or heifer) or not (multiparous cow). Risk time starts 30 days after calving as no insemination takes place before that day. For cow j in farm i we observe the minimum of the censoring time cij and the time to first insemination tij . Conditional on the random farm effect, we assume a Weibull distribution for the first insemination times. The conditional distribution of the event times (conditional on ui ) is then given by tij ∼ αui h0 tα−1 exp (−h0 ui tα ).
This is equivalent to the proportional hazards frailty model without covariates hi (t) = ui h0 (t)
(4)
with hi (t) the hazard of first insemination at time t conditional on the frailty (random farm effect) ui , h0 (t) the baseline hazard of first insemination at time t. We assume a one-parameter gamma density for ui with mean 1 and variance σu2 1
ui ∼ fui (u) =
u σu2
−1
exp −u/σu2 2
σu2 (1/σu ) Γ (1/σu2 )
.
The frailties can be integrated out analytically from the likelihood (Klein and Moeschberger 2003) and this likelihood can then be maximized for the remaining parameters leadˆ 0 = .0014 (se = .00013), α ing to h ˆ = 1.55 (se = .015), 2 and σ ˆu = .13 (se = .0175) (R function can be downloaded from http://vetstat.ugent.be/research/heterogeneity). For a random farm effect equal to 1, the median time to first insemination 1/α ˆ0 from 30 days after calving is thus estimated as log 2/h = 55 days. This model can be further extended with the parity covariate hij (t) = ui h0 (t) exp (βxij ) ,
(5)
where xij is 0 for heifers and 1 for multiparous cows. ˆ 0 = .0014 (se = .00013), Fitting the extended model gives h 2 α ˆ = 1.56 (se = .015), σ ˆu = .13 (se = .0176), and βˆ = −.245 (se = .0232). Thus, for a random farm effect equal to 1 the median time to first insemination corresponds to 53 days for heifers and to 62 days for multiparous cows. The American Statistician, May 2005, Vol. 59, No. 2
145
A direct interpretation of the estimated variance component, describing the between-farm variability, is not easy because it operates on the hazard of first insemination in a multiplicative way while the quantity of interest is the median time to first insemination. Since, with mi the median time to first insemination for farm 1/α i, (4) implies mi = g(fi ) = (log 2/(ui h0 )) , we easily obtain the density function for mi fmi (m) = α
log 2 σu2 h0
1/σu2
1 Γ(1/σu2 ) 1+ α2 σu log 2 1 . (6) exp − 2 α × m σ u m h0
In order to assess whether the frailty model describes the variability between the farms adequately, the median time to first insemination is obtained for each farm individually based on the Kaplan–Meier curve and a histogram with 16 classes is constructed based on this set of median times. These observed frequencies are then compared with the expected frequencies ˆ 0 and derived from the density function (6) with the estimates h 2 σ ˆu inserted. The observed frequencies correspond well with the expected frequencies from the frailty model (Figure 2(a)). The density functions for the median time to first insemination for the two groups in the extended model (5) do not only differ with respect to the mode but also with respect to the shape. Although both density functions are based on the same variance component estimate σˆu2 , the density function is flatter for the heifers with higher median time to first insemination than for the multiparous cows (Figure 2(b)). Thus, the effect of the variance of the random effect on the spread of the median event time also depends on the overall hazard rate. Alternatively, the percentage of animals with first insemination before a particular time point t, pt , defined from the survival model as pt = 1 − exp(−tα h0 u), can be used as a summary statistic to interpret the variance σu2 . The density function of the percentage of animals with first insemination over the different farms at time t is given by fpt,i (pt ) =
1 tα h
2) (2/σu
0 σu
1
Γ(1/σu2 )
(1 − pt ) tα h0 σu2
×
−1
− log(1 − pt ) tα h0
1 2 σu
−1
. (7)
Obviously, the density function of the percentage of animals with first insemination over the different farms is the flattest for that time point where the mode of the density function is close to 50%, corresponding to the percentage considered at 55 days (Figure 2(c)–(d)). The density function of the percentage of first insemination at either an earlier (e.g., 30 days) or later time point (e.g., 150 days) is more concentrated as at those time points most farms have, respectively, low and high proportions of first insemination.
146
General
4. CONCLUSIONS Even if a variance component seems to be a nuisance parameter, it is important to study the impact of the variance of the random effects on the quantities of interest (prevalence, median time to event, . . . ). Reporting the value of the estimated variance of the random effects in generalized mixed models and frailty models as such is not helpful as the actual value needs to be interpreted in terms of parameters that are easily understood by the investigator. Furthermore, even small absolute values for the random effects variance might lead to large variance in terms of the parameter of interest. For instance, Yamaguchi and Ohashi (1999) reported an estimated variance of a random effect in a frailty model of .01 as negligible, although it results in a 90% interval of median event times from 2 to 5 years. The same value for the variance might correspond to large variation in the parameter of interest in one model but not in another model, as the effect of the magnitude of the variance depends on the parameterization and the mean value of the parameter of interest. There is a need to develop diagnostic tools for survival models with random effects. For instance, the best linear unbiased predictions of the random effects should be used with caution in exploring heterogeneity or performing diagnostic tests as they are shrinkage estimators (shrunk towards the mean); therefore their variance will typically underestimate the variance (Morris 2002). The reparameterization given in this article for the logistic regression mixed model and the frailty model allows us to depict the density function of the parameter of interest and compare it with the observed frequencies to assess whether the model assumptions seem reasonable. The proposed approach can be extended to hierarchical models with other distributional assumptions when the effect of the variance components on the response of interest needs to be assessed. [Received October 2004. Revised January 2005.]
REFERENCES Brown, H., and Prescott, R. (1999), Applied Mixed Models in Medicine, New York: Wiley. Carroll, R. J. (2003), “Variances are not Always Nuisance Parameters,” Biometrics, 59, 211–220. Duchateau, L., and Janssen, P. (1999), “An Example-Based Tour in Linear Mixed Models,” in Linear Mixed Models in Practice, eds. G. Verbeke and G. Molenberghs, New York: Springer Verlag. Duchateau, L., Janssen, P., Lindsey, P., Legrand, C., Nguti, R., and Sylvester, R. (2002), “The Shared Frailty Model and the Power for Heterogeneity Tests in Multicenter Trials,” Computational Statistics and Data Analysis, 40, 603– 620. Klein, J. P., and Moeschberger, M. L. (2003), Survival Analysis. Techniques for Censored and Truncated Data, New York: Springer Verlag. McCulloch, C. E., and Searle, S. R. (2001), Generalized, Linear, and Mixed Models, New York: Wiley. Morris, J. S. (2002), “The BLUPs are not ‘Best’ When it Comes to Bootstrapping,” Statistics and Probability Letters, 56, 425–430. Yamaguchi, T., and Ohashi, Y. (1999), “Investigating Centre Effects in a MultiCentre Clinical Trial of Superficial Bladder Cancer,” Statistics in Medicine, 18, 1961–1971.