APPLICATION OF CONDITIONAL LOGISTIC REGRESSION METHODS TO EVENT HISTORY AND MULTIPLE CHOICE DATA

Dan Steinberg
Department of Economics, San Diego State University, San Diego, CA 92182

1. Introduction

Social science survey data sets often contain panel data; panel data sets follow individuals over time. The large number of publicly available panel data sets includes lengthy panels in the Michigan Panel Study of Income Dynamics (PSID), four separate components of the National Longitudinal Survey (NLS), and the Survey of Income and Program Participation (SIPP). Surveys of studies based on some of these data sets appear in Ferber and Hirsch (1982) and Moffitt (1992).

The proliferation of panel data sets has led to new interest in statistical techniques for combined cross-section and time series data. The current literature suggests that there are no generally accepted techniques, and that there is a dearth of conveniently available software tools for conducting the analyses. In this paper, I show that there are some very simple techniques for estimating panel data models for a binary dependent variable, using SAS. The methods, which cast the likelihood in the form of a matched-sample case-control logistic regression, are equivalent to estimating a fixed effects model, and thus not only allow consistent parameter estimation but also open the way for some useful model diagnostics. The methods take full account of the correlation between observations and are explained and illustrated with examples from market research surveys.
time periods t, and the error component u_i is an individual-specific, non-time-varying factor. If the u_i are identically 0 for all i, standard cross-sectional estimators are available. Assuming an Extreme Value Type I error e_it is sufficient to make logistic regression the efficient estimator. However, if the u_i error component is non-zero, two problems arise. First, the observations are no longer independent, and within-person correlations need to be taken into account. Second, since there is good reason to suspect that the person-specific u_i may be correlated with X_it, standard estimators will be inconsistent as well as inefficient.

The problem, then, is how to address the person-specific error component. Unfortunately, one obvious choice, estimating a person-specific intercept in a fixed effects model, will not work for the nonlinear model. The introduction of the nuisance parameters will yield inconsistent estimates of the coefficient vector β. This is in contrast to linear regression models, where the nuisance parameters disappear when the data are centered using person-specific means.

3. A Solution: Conditional likelihood
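As a rough illustration of the conditional likelihood idea, the following is a minimal Python sketch (not the SAS procedure used in this paper). It rests on several assumptions that are not taken from the text: a logit model P(y_it = 1) = F(β x_it + u_i) with F the logistic CDF, a single scalar regressor, exactly two time periods, and simulated data in which x_it is deliberately correlated with u_i. All function names are hypothetical. With two periods, conditioning on y_i1 + y_i2 = 1 gives P(y_i2 = 1 | y_i1 + y_i2 = 1) = F(β (x_i2 - x_i1)), so u_i drops out and β can be estimated by a no-intercept logit on the within-person differences.

```python
import math
import random


def sigmoid(z):
    """Logistic CDF, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def simulate_panel(n, beta, rng):
    """Simulate n persons over 2 periods; x_it is correlated with u_i (assumption)."""
    panel = []
    for _ in range(n):
        u = rng.gauss(0.0, 1.0)                            # person effect u_i
        xs = [u + rng.gauss(0.0, 1.0) for _ in range(2)]   # regressor depends on u_i
        ys = [1 if rng.random() < sigmoid(beta * x + u) else 0 for x in xs]
        panel.append((xs, ys))
    return panel


def conditional_logit_t2(panel, iters=25):
    """Conditional ML for T = 2: only persons with y_i1 + y_i2 = 1 contribute."""
    pairs = [(xs[1] - xs[0], ys[1]) for xs, ys in panel if ys[0] + ys[1] == 1]
    b = 0.0
    for _ in range(iters):  # Newton-Raphson on a one-parameter logit likelihood
        grad = sum(d * (w - sigmoid(b * d)) for d, w in pairs)
        info = sum(d * d * sigmoid(b * d) * (1.0 - sigmoid(b * d)) for d, _ in pairs)
        b += grad / info
    return b, len(pairs)


def pooled_logit_slope(panel, iters=25):
    """Ordinary logit with intercept and slope that ignores u_i altogether."""
    obs = [(x, y) for xs, ys in panel for x, y in zip(xs, ys)]
    a = b = 0.0
    for _ in range(iters):  # Newton-Raphson with a hand-inverted 2x2 information matrix
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in obs:
            p = sigmoid(a + b * x)
            v = p * (1.0 - p)
            g0 += y - p
            g1 += x * (y - p)
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return b


if __name__ == "__main__":
    rng = random.Random(12345)
    panel = simulate_panel(4000, beta=1.0, rng=rng)
    b_cond, n_used = conditional_logit_t2(panel)
    print("conditional estimate:", round(b_cond, 3), "from", n_used, "persons")
    print("pooled estimate:     ", round(pooled_logit_slope(panel), 3))
```

Under these simulated conditions the conditional estimate should land near the true β, while the pooled slope absorbs part of the correlation between x_it and u_i. Persons whose outcome never changes carry no information about β in the conditional likelihood and are dropped, which is the price paid for eliminating u_i.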
Consider data available on a set of N individuals i = 1, ..., N, each observed for t_i time periods, with 1