Detecting Cyclical Patterns in Time Series: Individual and Grouped Data

Timo Bechger and Godfried van den Wittenboer

1. Introduction

Cyclical patterns are patterns of variation within a time series that repeat themselves at regular periods during the process. In this paper, we discuss the detection of cyclical patterns in longitudinal data. The aim is to show that cyclical patterns can be detected with relatively simple means that are available in conventional software packages for data analysis: SPSS/PC+, LISREL (Jöreskog and Sörbom, 1993), and Mx (Neale, 1994). As an example we discuss the detection of a cyclical pattern in negative mood scores during the menstrual cycle. The data come from an ongoing research project at the Faculty of Psychology of the University of Amsterdam (Kolk, 1994). They consist of daily measurements of negative mood collected from 31 female psychology students (aged 20-30) for one menstrual period (varying between 22 and 35 days). Four women had an incomplete mood series; their data could not be analysed. The negative mood scores were measured with the Moos Menstrual Distress Questionnaire (Moos, 1968).

In this paper the data are used for illustrative purposes only. Sanders et al. (1983a) report a study with similar data. Since mood scores show day-to-day variation from positive to more negative and vice versa, they constitute an appropriate set of data to search for periodic patterns at an individual level. Furthermore, Livesey et al. (1989) found two periodic patterns of daily mood scores in two menstrual cycles of several women. This suggests that at least one specific repeating pattern might be expected for some women in a monthly cycle.

Besides the analysis of the individual time series, the data from the individual subjects are combined to find group patterns. Here we encounter the difficulty that different individuals have different time scales for a cycle: the length of the monthly cycle varies across women. The medical solution is to divide each menstrual cycle into five or six distinct phases, whose beginning and end can be determined precisely by means of hormone levels, and to equate these phases across the different subjects. Lack of resources usually excludes this possibility, so that various schemes for 28 days are used as rough guidelines instead (e.g. Shaw et al., 1978; Sanders et al., 1983, Figure 1; Kolk, 1994). Since these schemes are based on assumptions about the different phases, equating to 28 days may change the structure of the process. In this paper we propose a solution that leaves the structure of the process intact. The analysis itself is based on the model of latent curve analysis (Meredith and Tisak, 1990), in which we impose sine and cosine components that constitute the harmonics to describe the individual processes.

2. The Analysis of Individual Time Series

2.1. Fourier series

Figure 1 describes what, according to Livesey et al. (1989), are two typical distributions of daily mood scores.

Figure 1. Two typical cycles. Panels (a) and (b) each plot negative mood (Neg. Mood) over two consecutive cycles (cycle 1, cycle 2).

Figure 1.a shows both mid-follicular and pre-menstrual symptomatology, while Figure 1.b shows a subject with pre-menstrual symptomatology only. It is far from easy to describe these curves by a polynomial model, because such a model does not allow the mood score at the end of a menstrual period to be equal to the mood score at the beginning of the next period. An alternative model, proposed by Livesey et al. (1989), takes the assumed periodic nature of the data into account and is more apt to describe the patterns drawn in Figure 1. Livesey et al. (1989) suggest that a so-called Fourier series with five terms would describe the mood scores of their subjects well:¹

$$x_i(t) = a_0 + b_1 \sin \omega t + a_1 \cos \omega t + b_2 \sin 2\omega t + a_2 \cos 2\omega t + u_i(t), \tag{1}$$

where $\omega = 2\pi/N$ and $N$ is the number of days in the period. The term $u_i(t)$ denotes the time-dependent, non-cyclic variation in mood score. Since we are dealing with time series the $u$'s may be autocorrelated, but Livesey et al. claim that the Fourier series effectively removes autocorrelation from the residual terms. In fact, the usual assumption about the $u$'s in time series is that they are uncorrelated and that they follow a Gaussian process.

¹ The formula is named after the Fourier theorem, which states that any periodic function can be described by a weighted sum of cosine and sine waves.

Equation 1 expresses the observed series as a weighted sum of sine and cosine waves, $(b_j \sin j\omega t + a_j \cos j\omega t)$. Each of these sums is called a harmonic. Each harmonic describes a symmetric, oscillating process with a frequency of $j$ cycles. In Equation 1 there are only two harmonics. This means that if we observe only one menstrual period, we may describe three types of processes:

i. If the parameters $a_j$ and $b_j$ are all zero, except for the intercept $a_0$, there is no cyclicity present in the mood of the subject.

ii. Only the parameters for the first sum of a sine and a cosine wave are different from zero. The subject shows pre-menstrual symptomatology only (see Figure 1.b).

iii. All parameters are non-zero. The subject shows both mid-follicular and pre-menstrual symptomatology (see Figure 1.a).

If we wish to describe a process with $k$ cycles we add harmonics of the form $(b_k \sin k\omega t + a_k \cos k\omega t)$ to the model, up to the $k$-th harmonic. The maximum value of $k$ is $N/2$, since the smallest cycle takes two periods of time (e.g. days) to complete. The SPECTRA routine in SPSS/PC+ estimates the $a_j$ and $b_j$ parameters with the following formulas:

$$a_j = \frac{2}{N} \sum_{t=0}^{N-1} x_i(t) \cos\left(\frac{2\pi j t}{N}\right), \tag{2}$$

$$b_j = \frac{2}{N} \sum_{t=0}^{N-1} x_i(t) \sin\left(\frac{2\pi j t}{N}\right). \tag{3}$$

The contribution of each harmonic to the description of the data can be estimated by decomposing the mean square value $MS = \frac{1}{N} \sum_{t=0}^{N-1} x_i^2(t)$ of the observed series into weights corresponding to each harmonic:

$$C_j = (a_j^2 + b_j^2)/2, \qquad j = 1 \text{ to } k. \tag{4}$$
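Equations 2 to 4 can be checked numerically on a small example. The following is a minimal sketch in plain Python, not the SPECTRA routine itself: the function names are illustrative, and the series is a hypothetical noise-free instance of Equation 1 whose coefficients (0.6, 0.4, 0.8, 0.2 around a mean of 2) were picked for this illustration so that the weights come out as 0.26 and 0.34.

```python
import math

def fourier_coefficients(x, k):
    """Return (a0, a, b): the series mean and the cosine/sine coefficients
    a[j], b[j] for harmonics j = 1..k, following Equations 2 and 3."""
    n = len(x)
    if not 1 <= k < n / 2:
        raise ValueError("this sketch handles harmonics 1 <= k < N/2 only")
    a0 = sum(x) / n
    a, b = {}, {}
    for j in range(1, k + 1):
        a[j] = 2.0 / n * sum(x[t] * math.cos(2 * math.pi * j * t / n) for t in range(n))
        b[j] = 2.0 / n * sum(x[t] * math.sin(2 * math.pi * j * t / n) for t in range(n))
    return a0, a, b

def harmonic_weights(a, b):
    """Weights C_j = (a_j^2 + b_j^2) / 2 from Equation 4."""
    return {j: (a[j] ** 2 + b[j] ** 2) / 2.0 for j in a}

# Hypothetical noise-free series of N = 28 days built from Equation 1.
N = 28
omega = 2 * math.pi / N
series = [2.0
          + 0.6 * math.sin(omega * t) + 0.4 * math.cos(omega * t)
          + 0.8 * math.sin(2 * omega * t) + 0.2 * math.cos(2 * omega * t)
          for t in range(N)]

a0, a, b = fourier_coefficients(series, k=4)
weights = harmonic_weights(a, b)

# Mean square of the series and each harmonic's share of it, in percent.
mean_square = sum(v * v for v in series) / N
percent_ms = {j: 100.0 * c / mean_square for j, c in weights.items()}
```

For this series the mean square splits into the squared mean (4) plus the two non-zero weights, so `percent_ms` gives the relative importance of each harmonic directly; harmonics 3 and 4 receive weights that are zero up to rounding error.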

The weight $C_0$ is equal to the square of the mean of the series. The weights $C_j$ can be shown to be uncorrelated. As a consequence, each weight gives the contribution of the harmonic with frequency $j$ to the description of $x_i(t)$, and $C_j$ provides an indication of the variance explained by this harmonic. To assess the relative importance of each harmonic the weights may be expressed as a percentage of the total mean square.

Jenkins and Watts (1968) mention two ways to assess the statistical significance of each harmonic that are easy to obtain with SPSS. First, we may judge the statistic

$$\hat{\chi}^2_j = \frac{2 N C_j}{\hat{\sigma}^2_x}, \qquad j = 1 \text{ to } k, \tag{5}$$

where

$$\hat{\sigma}^2_x = \sum_{t=0}^{N-1} \frac{(x_i(t) - \bar{x})^2}{N-1}.$$

This statistic follows a Chi-square distribution with two degrees of freedom if the process is Gaussian, so that no cyclicity is present.

The second option is to inspect the periodogram. The periodogram of the time series $x_i(t)$ is a plot of the natural logarithm of $N C_j$ against the frequencies. The periodogram is an estimate of the spectrum of $x_i(t)$. This spectrum gives the true contribution of the different sine and cosine waves to the description of the process. In general, the spectrum of a non-cyclic process is a continuous, smooth function of the frequency. When a cyclic component is present with frequency $j$, the spectrum will show a peak at $j$. Therefore a peak in the periodogram may indicate the presence of a cyclic component. In particular, if a cyclic component is present, this will give a periodogram value $N C_j$ that is significantly different from the neighbouring values. To investigate this, one could use a $(100-\alpha)\%$ confidence interval for $\ln(N C_j)$. Because we consider $\ln(N C_j)$ rather than $N C_j$, the $(100-\alpha)\%$ confidence interval has a constant width about $\ln(N C_j)$:

$$LCL = \ln(N C_j) + \ln\left(\frac{2}{\chi^2_{2,\,1-\alpha/2}}\right), \qquad UCL = \ln(N C_j) + \ln\left(\frac{2}{\chi^2_{2,\,\alpha/2}}\right). \tag{6}$$

For a Fourier series with two harmonics the results would be as in Table 1.

Frequency   Cj      LCL(95%)   UCL(95%)   %MS      Chi-square
0           4       3.413      8.388      86.95    373.34
1           0.26    0.680      5.654       5.65     24.26
2           0.34    0.948      5.922       7.40     31.73
3           0.00                           0.00      0.00
4           0.00                           0.00      0.00

Table 1. Theoretical results for a Fourier process with two harmonics.

These figures show that when the model in Equation 1 holds, the periodogram values drop abruptly after the second harmonic, as do the Chi-square values. For further illustration we present the results for subject 1 in Table 2.

Frequency   Cj      LCL(95%)   UCL(95%)   %MS      Chi-square
0           1.718    2.568     7.540      95.42    1168.05
1           0.017   -2.060     2.915       0.93      11.416
2           0.004   -3.472     1.503       0.23       2.781
3           0.019   -1.907     3.068       1.08      13.306
4           0.002   -4.097     0.878       0.12       1.488
Total MS = 1.8013, mean = 1.311, variance = 0.082, cycle length = 28 days

Table 2. Results for Case 1. The data are in Appendix 1.

The Chi-square values were significant for the first and the third harmonic, indicating the presence of cyclicity. It was found that the following model describes the data well:

$$x_i(t) = 1.311 - 0.012 \sin \omega t + 0.183 \cos \omega t - 0.150 \sin 3\omega t + 0.129 \cos 3\omega t.$$

The confidence intervals appropriately reflect the uncertainty in this case due to the small number of observations. Figure 2 presents an overlay plot of the predicted series with the observed series.
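The test of Equation 5 and the limits of Equation 6 are easy to reproduce by hand, because a Chi-square variable with two degrees of freedom has the closed-form upper tail exp(-x/2). The sketch below relies on that fact; the function names are illustrative, and the inputs are the rounded values printed in Tables 1 and 2, so small rounding differences from the printed results are to be expected.

```python
import math

def chi2_2df_upper_tail(x):
    """P(X > x) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0)

def chi2_2df_quantile(p):
    """Value q with P(X <= q) = p for a chi-square variable with 2 df."""
    return -2.0 * math.log(1.0 - p)

def harmonic_test(c_j, n, variance):
    """Statistic of Equation 5 and its p-value under the no-cyclicity hypothesis."""
    stat = 2.0 * n * c_j / variance
    return stat, chi2_2df_upper_tail(stat)

def log_periodogram_interval(c_j, n, alpha=0.05):
    """Lower and upper limits of Equation 6 for ln(N * C_j)."""
    centre = math.log(n * c_j)
    lcl = centre + math.log(2.0 / chi2_2df_quantile(1.0 - alpha / 2.0))
    ucl = centre + math.log(2.0 / chi2_2df_quantile(alpha / 2.0))
    return lcl, ucl

# Chi-square values as printed in Table 2 (subject 1, harmonics 1 to 4).
table2_chi2 = {1: 11.416, 2: 2.781, 3: 13.306, 4: 1.488}
p_values = {j: chi2_2df_upper_tail(v) for j, v in table2_chi2.items()}
significant = [j for j, p in p_values.items() if p < 0.05]

# Equation 5 recomputed from the rounded Table 2 entries for harmonic 1.
stat1, p1 = harmonic_test(0.017, 28, 0.082)

# Equation 6 for the first row of Table 1 (C_0 = 4, N = 28).
lcl0, ucl0 = log_periodogram_interval(4.0, 28)
```

With these inputs `significant` contains harmonics 1 and 3, in line with the text, and `lcl0` agrees with the printed 3.413. The exact upper limit is about 8.39 rather than the printed 8.388; the difference appears to come from the rounded constant 3.669 used in the setup of Appendix A.2, whereas the exact constant is about 3.676.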

Figure 2. Series plots of the observed series (mood) and the predicted values (X) against time. Produced with the PLOT command.

Appendix 2 contains the SPSS/PC+ setup used to analyze this example. Note that, instead of using Formulas (2) and (3), the dependent variable may be regressed on $\sin(j\omega t)$ and $\cos(j\omega t)$ by OLS or GLS regression.

3. The Analysis of Combined Data

3.1. Introduction

When the data are combined, the considerable variation in cycle length between subjects becomes a problem, because time in days does not have the same interpretation for different subjects and standard programs require individual data of equal length. As Treloar et al. (1967) state:

There is no substantial justification for the widely held belief that women normally vary in menstrual interval about a value of 28 days common to all. Each woman has her own central trend and variation, both of which change with age. Assemblies of menstrual interval for many persons and covering a wide span of chronological ages should however, be expected to average within a few days of 28 (...) Variation is the rule, exceptions to it in individual cases being of small duration. (Treloar et al., 1967, p. 123).

Three options are considered for the analysis of the combined data: combining Fourier series, Multivariate Analysis of Variance (MANOVA) and Latent Curve Analysis (LCA; Meredith and Tisak, 1989). Both MANOVA and LCA require time series of equal length, while in practice menstrual cycles vary in the number of days. In addition, LCA requires a large number of subjects. Although MANOVA is a special case of LCA we will treat them separately, because we wish to discuss MANOVA as it is routinely done with SPSS/PC+.

3.2. Combining Individual Fourier Series

Data from different subjects can be combined by simply adding the weights for each frequency. If a cyclic component is present in the data of many subjects it is said to be dominantly present. Livesey et al. found that the first and second harmonic are dominantly present (N = 133), indicating that cycles such as in Figure 1 are typical indeed. In our sample of 31 cycles we should be able to replicate their findings. Table 3 presents our findings.

Frequency   Mean (a_j^2 + b_j^2)*   Between-subject st. dev.
1           0.133                   0.037
2           0.086                   0.150
3           0.098                   0.114
4           0.074                   0.136
5           0.052                   0.125

Table 3. Combining Individual Fourier Series.

* Livesey et al. (1989) used $C_j/2$ rather than $C_j$. To match the results we did the same.

Notwithstanding impressive individual differences, these findings indicate that a cyclic component with one cycle per menstrual cycle is dominantly present in our sample. Furthermore, higher frequencies are not dominantly present in the group data because of individual differences. This does not imply that cycles of higher frequencies are not present in individual series, but they do not dominate in the grouped data.

3.4. MANOVA

To correct for the variation among and within women in cycle length, Rossi and Rossi (1977) used the following procedure to standardize the length of the cycles to 28 days:

1. First a midpoint is chosen. Two days before and two days after the midpoint define the ovulatory phase.
2. The first four days (1 to 4) are the menstrual phase.
3. The last 4 days constitute the premenstrual phase.
4. The follicular phase is 7 days long and we start counting at the fifth day.
5. The luteal phase takes 8 days and we start counting downwards from the premenstrual phase.
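The five phase lengths in this scheme add up to 4 + 7 + 5 + 8 + 4 = 28 days, so once a cycle has been standardized the phases can be read off by position. The following sketch illustrates only that last step; it does not cover how days are inserted or deleted. The function names are illustrative, and the example series is Subject 1's 28 scores as read from Appendix A.1, under the assumption that they are listed from the first day of the cycle.

```python
# Phase lengths (in days) for a cycle already standardized to 28 days,
# listed in temporal order.
PHASES = [("menstrual", 4), ("follicular", 7), ("ovulatory", 5),
          ("luteal", 8), ("premenstrual", 4)]

def phase_bounds(phases=PHASES):
    """Map each phase to its (start, stop) day indices, 0-based, stop exclusive."""
    bounds = {}
    start = 0
    for name, length in phases:
        bounds[name] = (start, start + length)
        start += length
    return bounds

def phase_means(series, phases=PHASES):
    """Mean score per phase for a series whose length matches the scheme."""
    expected = sum(length for _, length in phases)
    if len(series) != expected:
        raise ValueError("series must have %d observations" % expected)
    return {name: sum(series[lo:hi]) / (hi - lo)
            for name, (lo, hi) in phase_bounds(phases).items()}

# Subject 1 (cycle length 28 days), as read from Appendix A.1.
subject1 = [1.17, 2, 1.17, 1.17, 1.17, 1.17, 1.17,
            1.5, 1.83, 1, 1.17, 1.17, 1.17, 1,
            1.17, 1.17, 1.17, 1.33, 1.17, 1.33, 1.33,
            1, 1.17, 1.17, 1.5, 1.5, 1.67, 2.17]

means = phase_means(subject1)
```

Under these assumptions the ovulatory phase covers days 12 to 16 (indices 11 to 15), and the five phase means can serve as a shortened series of equal length for every subject.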

In general, days are inserted or deleted at the end of the follicular phase and at the beginning of the luteal phase. The data format has to be changed in order to conduct the analysis with MANOVA. First of all, days should be inserted or deleted to standardize the cycle length. Second, respondents instead of time points are now the cases, and the data from each respondent should start on a new line.

With the Rossi and Rossi method a lot of information is lost. We propose to equate the time series by estimating each process with the full Fourier function, so that the original mood scores are reproduced exactly. Then each process is shrunk or stretched to the mean value of 28 days by changing the frequency of the process. This transformation, in which no information is added or lost, leaves the structure of the process intact, and the transformed series can safely be analyzed instead of the original series.

When the series are equated, an analysis of variance can be performed with the SPSS/PC+ procedure MANOVA. For respondent $i$ on day $t$ the model is:

$$x_i(t) = m + w_t + u_i(t), \tag{7}$$

where $m$ denotes the grand mean and $w_t$ represents the deviation of the mean on each day of the series from the grand mean. MANOVA circumvents the problem of autocorrelated observations by transforming the variables x(1) to x(28) to orthogonal polynomials. The problem with MANOVA is that it does not provide a test for cyclicity. The MANOVA model is a special case of LCA, which we discuss in the next section and which provides a way to detect cyclicity.

3.5. Latent Curve Analysis

LCA was introduced by Meredith and Tisak (1989). Let $i$ denote an individual subject, $i = 1$ to $N_s$, and let $x_i(t)$ denote the observed series, $t = 1$ to $N$. Meredith and Tisak assume that the observed series are decomposable into a sum of $r$ (generally unknown) so-called "basis functions" and error:

$$x_i(t) = \sum_{k=1}^{r} w_{ik}\, g_k(t) + e_i(t), \tag{8}$$

where $g_k(t)$ denotes a basis function, implying that every $x_i(t)$ can be expressed, apart from measurement error, as a linear combination of the basis functions $g_k(t)$, $k = 1$ to $r$. The $w$'s represent the degree to which the $i$-th individual utilises the single basis function $g_k(t)$. The random error $e_i(t)$ is an error of measurement as a function of $t$, and is assumed to be independent of $w$. Again all series must be equally long. Now let $x_i(t_j) = x_{ij}$, $g_{jk} = g_k(t_j)$ and $e_{ij} = e_i(t_j)$, and define the vectors

$$x_i' = (x_{i1}, x_{i2}, \ldots, x_{iN}), \qquad w_i' = (w_{i1}, w_{i2}, \ldots, w_{ir}) \qquad \text{and} \qquad e_i' = (e_{i1}, e_{i2}, \ldots, e_{iN}).$$

Let $\Gamma$ be an $N \times r$ matrix whose $jk$-th element $\gamma_{jk}$ is given by $g_{jk}$. Then (8) may be written as

$$x = \Gamma w + e. \tag{9}$$

If $E(w) = \alpha$; $E(ww') = \Phi$; $E(ee') = \Psi$; $E(e) = 0$; and $w$ and $e$ are independent, then

$$E(x) = \Gamma \alpha, \tag{10}$$

$$E(xx') = \Gamma \Phi \Gamma' + \Psi. \tag{11}$$

This model has the form of a restricted factor model. It differs from the linear factor analysis model in that Equation 9 also generates a structure for the mean vector, i.e. Equation 10. Unless the elements in $\Gamma$ are fixed parameters, the model is not identified when $r > 1$. Therefore we must assume that there is only one basis curve, or that $g_1(t)$ is a constant term and $g_2(t)$ is the single unknown basis curve:

$$\Gamma = \begin{bmatrix} 1 & g_{12} \\ 1 & g_{22} \\ \vdots & \vdots \\ 1 & g_{N2} \end{bmatrix}, \tag{12}$$

where $g_{12}$ must be fixed to 1.00 for identification. $\Psi$ may be specified as a diagonal matrix with equal diagonal elements, but since the errors are a time series themselves we may allow errors to be correlated across time points.

For illustration, we divided the observed series in 5 phases as in Rossi and Rossi (1977). Using LISREL 8 we found $\Gamma = (1^*, 0.85, 0.82, 0.84, 0.82)'$, suggesting a small peak in the menstrual phase and a flat trend in subsequent phases. When $\Gamma$ is specified as in Equation 12, the program did not converge.
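To see how Equations 10 and 11 turn a basis curve into expected means and second moments, the following sketch evaluates both for the single basis curve reported above. It is plain Python with nested lists, not LISREL or Mx input; the function names are illustrative, and the values chosen for alpha, Phi and Psi are hypothetical, picked only to show the arithmetic.

```python
def implied_mean(gamma, alpha):
    """E(x) = Gamma * alpha (Equation 10); gamma is N x r, alpha has length r."""
    return [sum(g * a for g, a in zip(row, alpha)) for row in gamma]

def implied_moment(gamma, phi, psi):
    """E(xx') = Gamma * Phi * Gamma' + Psi (Equation 11)."""
    n, r = len(gamma), len(phi)
    # First Gamma * Phi (N x r), then multiply by Gamma' and add Psi.
    gp = [[sum(gamma[i][k] * phi[k][l] for k in range(r)) for l in range(r)]
          for i in range(n)]
    return [[sum(gp[i][l] * gamma[j][l] for l in range(r)) + psi[i][j]
             for j in range(n)]
            for i in range(n)]

# Single basis curve over the 5 phases, as reported in the text.
loadings = [1.0, 0.85, 0.82, 0.84, 0.82]
gamma = [[g] for g in loadings]          # N x 1 matrix

# Hypothetical parameter values, for illustration only.
alpha = [2.5]                            # mean of the single weight w
phi = [[0.9]]                            # E(ww')
psi = [[0.2 if i == j else 0.0 for j in range(5)] for i in range(5)]

mean_vector = implied_mean(gamma, alpha)
moment_matrix = implied_moment(gamma, phi, psi)
```

Because the mean vector is just the basis curve scaled by alpha, a basis curve that is flat after the first phase implies phase means that are flat after the first phase, which is how the reported $\Gamma$ is read in the text.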

Phase     1       2       3       4       5
1         1.36
2         0.29    0.82
3         0.30    0.71    0.85
4         0.21    0.53    0.57    0.64
5         0.17    0.57    0.71    0.53    0.69
Means     2.80    2.19    2.08    2.21    2.10

Table 4. Covariance matrix and means for 5 phases. N=15.

To detect a cyclical trend the basis curves are assumed to describe a smooth periodic trend, which can be approximated as:

$$x_i(t) = w_{i1} + w_{i2}\,[\sin(k\omega t) + \cos(k\omega t)] + e_i(t), \qquad t = 0, \ldots, N-1, \tag{13}$$

where $k \le N/2$ and $\omega = 2\pi/N$. $N$ is the number of phases in the standardized cycle, $w_1$ is a level parameter and $w_2$ gives the combined contribution of the sine and cosine waves with unknown frequency $k$. Note that $k$ in (13) is not assumed to be an integer. The maximal value of $k$ is $N/2$, since the smallest cycle takes two time points to complete. Equation 13 may be written as:

$$x_i(t) = w_1 + w_2\, f(t, w_2) + e_i(t), \tag{14}$$

where $f(t, w_2)$ is the partial derivative of $x_i(t)$ with respect to $w_2$, which is zero at a peak or a trough. Hence, we expect a cyclical mean trend to show up as a pattern of zero and non-zero elements in the relevant column of $\Gamma$. Given a good overall fit, subtractive chi-square tests can be used to test various nested hypotheses. For example:

H1: No growth, $\Gamma = 1_N$ versus $\Gamma$ as in Equation 12.
H2: Certain elements in $\Gamma$ are zero.

In addition, the estimated scores on the second factor may be used to detect subjects with severe mood changes.

A structured latent curve model (Browne, 1993) that fits the same framework is:

$$x_i(t) = w_0 + \sum_{j=1}^{r} [\,w_j \sin(j\omega t) + w_j \cos(j\omega t)\,], \qquad t = 1 \text{ to } N, \quad r = 1 \text{ to } N/2. \tag{15}$$

In this model, $\Gamma$ is not identified unless we choose a value for $k$ so that the elements of $\Gamma$ are fixed elements and $\Gamma$ equals

$$\Gamma = \begin{bmatrix} 1 & \sin(\omega)+\cos(\omega) & \cdots & \sin(\omega k)+\cos(\omega k) \\ 1 & \sin(2\omega)+\cos(2\omega) & \cdots & \sin(2\omega k)+\cos(2\omega k) \\ \vdots & \vdots & & \vdots \\ 1 & \sin(N\omega)+\cos(N\omega) & \cdots & \sin(N\omega k)+\cos(N\omega k) \end{bmatrix}. \tag{16}$$

The model with $k = 2$ gives $\alpha = (2.16, 0.02, -0.05)'$, suggesting the absence of cyclicity. Further results are summarized in Table 5.²

Phases   Model   df   Chi-square   RMSEA         alpha                                 Psi
5        LCA     9    13.06        (0.0; 0.38)   2.56 (0.42)                           0.91 (0.42)
5        SLCA    9    11.93        (0.0; 0.34)   2.16 (.21), 0.03 (.03), 0.05 (.05)    0.63 (0.25), 0.00 (0.03), -0.03 (0.03)

Table 5. Results of latent curve (LCA) and structured latent curve analysis (SLCA).

² Note that the confidence interval around the root-mean-square error of approximation (RMSEA) appropriately reflects the uncertainty due to the small sample size (see MacCallum, Browne and Sugawara, 1996).

LCAs can be performed simultaneously in multiple groups, so that hypotheses about group differences can be tested. With suitable data, we may, for example, compare the menstrual cycles of clinical versus non-clinical samples (e.g. Sanders et al., 1983), compare athletes to non-athletes (Visser, 1992), compare women under different physical conditions (e.g. lighting conditions, see Cutler, 1980), or conduct cross-cultural comparisons (e.g. Hasin et al., 1988).

4. Concluding Remarks

The example dealt with in this paper illustrates the problems that may hinder the analysis of periodic phenomena in social science applications: small data sets and strong variability both within and between subjects. Nevertheless, it is possible to detect the presence of cyclical patterns with software that is familiar to social scientists.

If cyclicity is detected this does not imply that we have found an explanation for it. Following the analysis of mood scores, one would naturally be interested in the usefulness of explanatory variables. Any variable whose values are controlled or at least observed by the experimenter, such as the phase of the moon (e.g. Cutler, 1980), morning temperature, the day of the week (Rossi and Rossi, 1977), intellectual performance (Sommer, 1972), or menstrual cycles of roommates (Jarett, 1984), may figure as an explanatory variable. To relate multiple periodic time series, the best option is probably cross-spectral analysis (Kruse and Gottman, 1982). However, with data for less than three periods, this type of analysis should not be considered for reasons of statistical power. A simpler model for a single time series, with a lagged second time series as a covariate or regressor, was discussed by Hibbs (1974).

Although the techniques described in this paper may contribute to research on the menstrual cycle, their application depends on the availability of theory. At present, a simple theory seems far away. Citing Kolk (1994: English summary):

(..) Cyclicity in physical symptoms and mood cannot simply be explained by a biomedical model. The same is true for a psychosocial model. Inter-individual and intra-individual differences in the reporting of symptoms are explained with the use of a symptom perception model. In this information processing model symptoms are assumed to be the product of the awareness and interpretation of physical sensations. This means that except for extreme conditions there is an ever changing contribution of physiological, cognitive, emotional and social factors to the perception and reporting of symptoms.

References

Browne, M.W. (1993). Structured latent curve models. In C.M. Cuadras and C.R. Rao (Eds.), Multivariate Analysis: Future Directions 2, pp. 171-198. Amsterdam: North Holland.

Cokkinades, V.E., Macera, C.A., & Pate, R.R. (1990). Menstrual dysfunction among habitual runners. Women and Health, 16, p. 59.

Cutler, W.B. (1980). Lunar and menstrual phase locking. American Journal of Obstetrics and Gynecology, 137, p. 834.

Hasin, M., Dennerstein, L., & Gotts, G. (1988). Menstrual cycle related complaints: A cross cultural study. Journal of Psychosomatic Obstetrics and Gynecology, 9, p. 35.

Jarett, L.R. (1984). Psychological and biological influence on menstruation: Synchronicity, cycle length and regularity. Psychoneuroendocrinology, 9, p. 21.

Jenkins, G.M., & Watts, D.G. (1968). Spectral analysis and its applications. Holden-Day.

Jöreskog, K.G., & Sörbom, D. (1993). LISREL 8 User's Reference Guide. SSI.

Kolk, A. (1994). De menstruele cyclus [The menstrual cycle]. Nederlands Tijdschrift voor de Psychologie, 49, 279-285.

Livesey, J.H., Wells, E., Metcalf, M.G., Hudson, S., & Bates, R.H.T. (1989). Assessment of the significance and severity of premenstrual tension-I. A new model. Journal of Psychosomatic Research, 33, 3, 269-279.

MacCallum, R.C., Browne, M.W., & Sugawara, H.M. (1996). Power analysis and determination of sample size for covariance structure modeling. Psychological Methods, 1, pp. 130-149.

Meredith, W., & Tisak, J. (1990). Latent curve analysis. Psychometrika, 55, pp. 123-149.

Metcalf, M.G., Livesey, J.H., & Wells, E. (1988). Assessment of the significance and severity of premenstrual tension-II. Comparison of methods. Journal of Psychosomatic Research, 33, 3, 280-287.

Neale, M.C. (1994). Mx. Statistical Modelling with Mx. Box 3 MCV, Richmond, VA 23298: Department of Human Genetics.

Rossi, A.S., & Rossi, P.E. (1977). Body time and social time: Mood patterns by menstrual cycle phase and day of the week. Social Science Research, 6, 273-308.

Sanders, D., Warner, P., Bäckström, T., & Bancroft, J. (1983). Mood, sexuality, hormones and the menstrual cycle. I. Changes in mood and physical state: Description of subjects and method. Psychosomatic Medicine, 45, pp. 487-501.

Sommer, B. (1972). Menstrual phase changes and intellectual performance. Psychosomatic Medicine, 34, p. 263.

Treloar, A.E., Boynton, R.E., Behn, B.G., & Brown, B.W. (1967). Variation of the human menstrual cycle through reproductive life. International Journal of Fertility, 12, p. 77.

Visser, M.S. (1992). De menstruele cyclus: Individuele verschillen in lengte en variabiliteit [The menstrual cycle: Individual differences in length and variability]. Unpublished doctoral thesis, nr. 4606. Dept. of Psychology, University of Amsterdam.

APPENDIX: DATA and SETUPS

A.1. Data for Subject 1

Subject 1: Mood score

1.17 2    1.17 1.17 1.17 1.17 1.17
1.5  1.83 1    1.17 1.17 1.17 1
1.17 1.17 1.17 1.33 1.17 1.33 1.33
1    1.17 1.17 1.5  1.5  1.67 2.17

A.2. SPSS/PC+ Setup for Fourier Analysis used in this Paper

SPSS/PC+ is able to produce most of the results presented in this paragraph by means of the following command lines:

DESCRIPTIVES /VARIABLES mood.
SPECTRA /VARIABLES mood /PLOT NONE /SAVE P (c) SIN (a) COS (b).
COMPUTE CJ=C_1/N.
COMPUTE LN=LN(C_1).
COMPUTE LCL=LN-1.3056.
COMPUTE UCL=LN+3.669.

N must be filled in by the researcher. In the following command lines one should fill in the standard deviation (sd) of the observed series to produce the chi-square value.

COMPUTE CHI=(2*C_1)/(sd**2).
LIST /VARIABLES A_1 B_1 C_1 LCL LN UCL CHI.

The last command provides a listing of the results.

A.4. Fit Latent Curve models with LISREL and Mx

In LISREL (7.0 and up), Equations 10 and 11 are specified as a confirmatory factor model with structured means: Γ = LAMBDA_Y, Φ = PSI, Ψ = THETA_EPSILON, α = ALPHA, and TAU_Y = 0. In the latent structure model, LAMBDA_Y is full and fixed and Equation 16 must be entered by hand. To fit Equation 16 with Mx, the model can be stated as the matrix equation

[ I | sin(w @ T)+cos(w @ T) | ... | sin((k*w) @ T)+cos((k*w) @ T) ],

where I is an N-dimensional unit vector, T is an N-vector, T = (0, 1, 2, 3, 4, ..., N)', that denotes time, and k and w are defined as 1 x 1 matrices. Hence, after the title and the data definition lines we add a calculation group to calculate Equation 16. W = ω and k are given their value as a starting value.

A structural model for doubly hierarchical structures

Joop J. Hox

1. Structural models for multilevel data

Structural equation models (SEM) for multilevel data have been proposed, among others, by Goldstein and McDonald (Goldstein & McDonald, 1988; McDonald & Goldstein, 1989), Muthén and Satorra (Muthén, 1989; Muthén & Satorra, 1989) and Longford & Muthén (1992). Overviews of this development are given by Kaplan (1996) and McArdle & Hamagami (1996); for an introduction see Muthén (1994), McDonald (1994) and Hox (1996). The model has become popular after Muthén (1989, 1994) suggested a simplification that makes it possible to use the multigroup option of existing SEM software to analyze multilevel data.

In the case of bilevel data, we assume sampling at two levels, with both between group (group level) and within group (individual level) covariation. Thus, in the population we distinguish between the between group covariance matrix $\Sigma_B$ and the within group covariance matrix $\Sigma_W$. An unbiased estimator of the population within groups covariance matrix is given by the sample pooled within groups covariance matrix $S_{PW}$, given by:

$$S_{PW} = \frac{\sum_{g=1}^{G} \sum_{i=1}^{n_g} (y_{ig} - \bar{y}_{.g})(y_{ig} - \bar{y}_{.g})'}{N - G}$$

This equation uses a rather obvious notation, with $G$ indicating the number of groups. Thus, the pooled within groups covariance matrix equals the covariance matrix of the group deviation scores, adjusted by dividing by $N - G$ instead of $N$. The between groups covariance matrix is given by:

$$S_B = \frac{\sum_{g=1}^{G} n_g\, (\bar{y}_{..} - \bar{y}_{.g})(\bar{y}_{..} - \bar{y}_{.g})'}{G}$$

The between groups covariance matrix equals the covariance matrix of the disaggregated group means, adjusted by dividing by $G$ instead of $N$. Since the pooled within groups covariance matrix $S_{PW}$ is an unbiased estimate of the population within groups covariance matrix $\Sigma_W$, we can estimate the population within group structure by constructing and testing a model for $S_{PW}$. The between groups covariance matrix $S_B$, however, is not a simple estimator of the population between groups covariance matrix $\Sigma_B$. Instead, $S_B$ is an estimator of the sum of two matrices:
