Journal of Data Science 7(2009), 13-25
Mixed-effect Models for Truncated Longitudinal Outcomes with Nonignorable Missing Data
Sujuan Gao1 and Rodolphe Thiébaut2
1 Indiana University and 2 Université de Bordeaux

Abstract: Mixed effects models are often used for estimating fixed effects and variance components in continuous longitudinal outcomes. An EM based estimation approach for mixed effects models when the outcomes are truncated was proposed by Hughes (1999). We consider the situation when the longitudinal outcomes are also subject to non-ignorable missingness in addition to truncation. A shared random effect parameter model is presented where the missing data mechanism depends on the random effects used to model the longitudinal outcomes. Data from the Indianapolis-Ibadan dementia project are used to illustrate the proposed approach.

Key words: Longitudinal data, mixed-effects model, nonignorable missing, truncation.
1. Introduction

This paper is concerned with estimating the effects of putative risk factors on cognitive decline in the elderly, which is the focus of many longitudinal studies, whether epidemiological studies or clinical trials. Many cognitive assessment instruments currently used in dementia studies have an upper ceiling, both because of the limited time available for testing and because the instruments also function as screening or diagnostic tools for dementia, with much greater emphasis on sensitivity at the lower end of the score range. A well-known example of such an instrument is the Mini-Mental State Examination (MMSE), which has a ceiling of 30. In contrast, scores on more extensive and lengthy neuropsychological tests have been shown to be normally distributed in large cohorts of subjects. When a shorter version of a lengthy test is given, many subjects who would have scored in the tails of the longer version instead score at either endpoint. Inference on cognitive function or decline is concerned with the “true” cognitive ability or change that would have been measured, rather than the observed score on the shorter version with a ceiling effect. The ceiling or truncation effect in neuropsychological tests was also
observed in van Belle and Arnold (2000), although their focus was on measuring reliability.

Analysis of cognitive decline data is further complicated by the fact that selective groups of subjects may be missing some measurements, either by study design or by happenstance. For example, in large cohort studies of dementia, subjects previously diagnosed with cognitive impairment no dementia (CIND), an intermediate status between normal and dementia, were usually not re-screened with the cognitive test and may have proceeded directly to clinical evaluation. Other missing data due to death or nursing home placement may also raise the possibility of non-ignorable missing data in this situation.

When measurements with ceilings are used in regression analysis, it has been shown that the ordinary least squares (OLS) estimator ignoring the ceiling is biased and inconsistent (Goldberger, 1981). There have been efforts to correct for the OLS bias in the regression model setting (Tsui, Jewell, and Wu, 1988). Hasselblad, Stead and Galke (1980) considered a univariate regression model with multiple truncation points using the EM algorithm. In the case of a single truncation point with normally distributed data, the model is sometimes referred to as the Tobit model in the econometric literature (Amemiya, 1984), after an earlier econometric application (Tobin, 1958). Little and Rubin (2002) called this type of truncation “non-ignorable missing data with known mechanism” since the truncation point is known in this situation.

When such outcomes are measured repeatedly and factors associated with change in outcome over time are of interest, special methods are also needed to ensure unbiased inferences. Hughes (1999) proposed an EM based maximum likelihood approach for longitudinal data with truncated outcomes. Publications adopting the Hughes model are mostly concerned with the modeling of viral load in HIV and related lab data with various special mixed-effect models (Wu, 2002).
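The bias of OLS under a ceiling can be seen in a small simulation. The sketch below is only an illustration under assumed values: the intercept, slope, residual standard deviation, sample size, and the binary covariate are hypothetical choices, not estimates from any study cited here. A normally distributed latent score is generated, capped at a ceiling of 40, and the OLS slope on the capped score is compared with the OLS slope on the latent score.

```python
import random


def ols_slope(x, y):
    """Ordinary least squares slope of y on a single covariate x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta0=36.0, beta1=2.0, sigma=3.0, ceiling=40.0, seed=1):
    """Return (slope on latent scores, slope on ceiling-capped scores).

    All parameter values are hypothetical and chosen for illustration only.
    """
    rng = random.Random(seed)
    # Binary covariate, e.g. an indicator of group membership.
    x = [rng.choice([0.0, 1.0]) for _ in range(n)]
    # Latent ("true") score: normal around a group-specific mean.
    latent = [beta0 + beta1 * xi + rng.gauss(0.0, sigma) for xi in x]
    # Observed score: the latent score capped at the ceiling.
    observed = [min(v, ceiling) for v in latent]
    return ols_slope(x, latent), ols_slope(x, observed)


slope_latent, slope_observed = simulate()
print(f"OLS slope on latent scores:   {slope_latent:.3f}")
print(f"OLS slope on observed scores: {slope_observed:.3f}")
```

Because the group with the higher latent mean loses more of its distribution to the ceiling, the slope estimated from the capped scores is pulled toward zero, and increasing the sample size does not remove the gap.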
Lyles (2000) considered the mixed-effect model in Hughes (1999) with the additional problem that the outcomes are also subject to informative drop-out. The authors adopted the approach of Schluchter (1992), jointly modeling the outcome variable and a log-normal survival model for the time each subject remained in the study. Thiebaut, Jacqmin-Gadda, Babiker and Commenges (2005) and Pantazis, Gouloumi, Walker, and Babiker (2005) also considered the joint modeling of bivariate longitudinal data with a log-normal survival model for time to dropout. In this paper, we propose to use a binary survival model proposed by Wu and Carroll (1988) and previously adopted by Pulkstenis, Ten Have, and Landis (1998), and Ten Have, Kunselman, Pulkstenis, and Landis (1998) to model the incidence of dropping out in a shared random effect model approach. We present results from a simulation study and illustrate the proposed method using data from a community-based dementia study.
2. A Longitudinal Dementia Study

The Indianapolis Study of Health and Aging is one of two longitudinal cohorts in the Indianapolis-Ibadan Dementia Project, which aims to identify risk factors for dementia, Alzheimer’s disease and cognitive decline. The study population consists of 2212 African Americans aged 65 and older living in Indianapolis, USA, at study baseline. Study participants were evaluated at baseline and again at 2, 5, 8 and 11 years after baseline, with a two-phase design at each evaluation wave. At the first phase (screening phase), study subjects were interviewed at their homes with a questionnaire designed to evaluate their cognitive function. In addition, demographic information, family history of illness, medical history of the subject, alcohol and tobacco consumption, and blood samples were collected during the screening interview. At the second phase (clinical phase), subjects selected by stratified sampling based on screening results received full physical and neurological examinations to determine disease status. Subjects who received a full clinical evaluation were classified as demented, CIND, or normal. Demented subjects were then followed using a separate protocol. The CIND subjects were allowed to skip the screening phase and proceeded directly to clinical evaluation at the next follow-up evaluation.

At each screening phase, study subjects were interviewed using the Community Screening Instrument for Dementia (CSID), a questionnaire designed for dementia screening in populations with diverse cultural and educational backgrounds. The CSID consists of two parts: an interview with the study participant and an interview with an informant. The interview with the study participant assesses cognitive functioning, medical history, social involvement, and other putative risk factors.
The interview with the informant assesses the study participant’s cognitive functioning, activities of daily living (ADL), and functioning at work and in social relationships. The cognitive test of the CSID includes a number of test items measuring multiple cognitive domains, including language, memory, orientation, judgment, comprehension and constructional praxis. Several neuropsychological tests, including the animal fluency test and the East Boston story, were also included.

In this paper, we consider a total cognitive score created by summing the correct answers from 40 questions, in which lower scores indicate more cognitive impairment. These 40 questions were administered at baseline and at each of the four follow-up waves. Therefore, there is an interest in investigating the patterns of cognitive decline and the factors associated with it. One particular factor of interest is the influence of education on cognitive decline. Many cross-sectional studies have reported that low education is a risk factor for poorer cognitive function. However, longitudinal studies have been inconsistent, with some finding
no effect, suggesting that the effect seen in cross-sectional studies was due to biases in cognitive assessment favoring highly educated individuals.

The investigation of education on cognitive decline is complicated in the Indianapolis data by two facts. The first is the truncation of the CSID scores at 40. In Table 1, we show a significant difference in mean CSID cognitive scores at baseline between two groups defined by education level, namely, those who had 6 or fewer years of education (low education group) and those who had 7 or more years of education (high education group). The cut-off of 6 years of education was chosen for this cohort in a previous report (Hall, Gao, Unverzagt, and Hendrie, 2000). The percentages of subjects who scored at 40 also differed significantly between the two groups.

Table 1: Comparisons of baseline characteristics between the two groups defined by education levels

Characteristics              Low Education    High Education    p-value
CSID score, mean (SD)        35.0 (3.7)       37.1 (3.0)
Scored at 40 (%)             5.6              22.8