A coefficient of association between categorical variables with partial or tentative ordering of categories

Volkert Siersma, Svend Kreiner
University of Copenhagen, Denmark

Goodman and Kruskal’s γ coefficient measuring monotone association and its partial variants are useful for the analysis of multi-way contingency tables containing ordinal variables. When the categories of a variable are only partly ordered, the variable has to be treated as a nominal variable, so that the information in the ordering of the categories, and with it statistical power, is lost. We suggest a Pγ measure which is the maximum of the ordinary γ coefficients obtained by permuting the categories of nominal or partially ordered variables, while leaving the partial ordering intact. This measure has higher power than nominal tests for association. Furthermore, the resulting optimal monotone ordering gives insight into the nature of this association which is not obtained by tests for nominal variables. The properties of the Pγ coefficient are investigated in a simulation study, and its use is illustrated on two data sets.

Keywords: ordinal categorical variables; Goodman and Kruskal’s gamma; monotone association
In the analysis of conditional independence in multi-way contingency tables the categorical variables involved are viewed as either purely nominal or ordinal. In the latter case the ordering of the categories gives additional information, and dedicated measures of association have been devised to give stronger and more meaningful analysis. However, sometimes only some of the categories are ordered. An example is a “not relevant” category accompanying a Likert-scale questionnaire item. Sometimes, setting the extra-ordinal category to missing, or placing the category in the ordinal scale anyway on the basis of subject matter considerations, can render the variable ordinal. In general, however, variables with categories that are only partially ordered have to be treated as nominal variables, and information in the ordering of the categories, and statistical power, is lost. Moreover, evidence of an association established by tests for nominal variables, e.g. Pearson chi-squared tests, does not reveal the nature of the association. This has to be deduced by examination of stratified tables and parameters of log-linear models, which in multivariate analysis can be most confusing.

Often inherent monotonicity in the relationship is of interest, specifically when a tentative order is present but not exactly determined. We propose a coefficient of association that assesses the best monotone relationship that can be obtained by permuting the categories of a variable legitimately relative to a partial ordering, if such is present. This Pγ coefficient identifies an ordering of the categories that can be interpreted as an optimal ranking relative to the variable it is related to, and gives a measure of the strength of the association between the variables.
Used as a test statistic, the Pγ may have greater power than conventional tests in log-linear models for nominal variables when the inherent monotone relationship between the variables is the same in all strata defined by the other variables in the log-linear model. Beyond this, the Pγ coefficient may define efficient inference on the conditional independence between sets of ordinal variables.

In the next section we present the Pγ coefficient of association. Subsequently a simulation study is conducted to characterise the power properties of this measure compared to alternatives. In the final section, to illustrate the use of the Pγ, the coefficient is applied to two data examples. In data concerning the left-right affiliation of a sample of the Danish electorate, a ranking of the Danish political parties on this scale is sought. In a second data set on a weight control programme, the relationship between a weight goal and the later attained weight is investigated.
The Pγ coefficient of association and optimal monotone ordering

The coefficient of association we propose is based on Goodman and Kruskal’s γ coefficient (Goodman and Kruskal 1954) for monotone association. For two ordinal variables X and Y this coefficient is defined as the difference between the probability that two independent draws from the joint distribution of X and Y are concordant and the probability that they are discordant. This difference is scaled by the probability of not having ties, and as such the γ coefficient is a version of Kendall’s τ (Kendall 1938) adapted to contingency tables.

In multi-way contingency tables, a partial γ measure of monotone association between two ordinal variables is defined as a weighted summary γ measure across subtables spanned by the categories of other variables in the table (Agresti 1984; Davis 1967). This partial γ is interpreted as a measure of conditional association when controlling for these other variables. The partial γ is especially useful for analysis by log-linear models with ordinal variables (Agresti 1984). It is known that tests based on chi-squared measures of association have notoriously low power, whereas tests based on the partial γ measure have much higher power (Kreiner 1987).

The use of the (partial) γ between two variables demands that both are ordinal. Variables with categories that are only partially ordered have to be treated as nominal variables, and information in the ordering of the categories, and statistical power, is lost. An ordering X(r) of a categorical variable X is an ordinal random variable with a specific permutation r of the categories of X. If X has a partial order, we consider only valid orderings of X, i.e. orderings based on permutations that do not violate the partial order. With this definition all orderings of a nominal variable are valid orderings, but an ordinal variable has only one valid ordering.
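For reference, the verbal definitions of γ and of the partial γ given in this section can be written out as follows. The symbols P_c, P_d, C_k and D_k are notation introduced here for illustration only, and the pooled form of the partial coefficient is assumed to be the one meant by the reference to Davis (1967).

$$
\gamma_{XY} = \frac{P_c - P_d}{P_c + P_d},
\qquad
\hat{\gamma}_{XY|Z} = \frac{\sum_k (C_k - D_k)}{\sum_k (C_k + D_k)}
$$

Here P_c and P_d are the probabilities that two independent draws are concordant and discordant, so that P_c + P_d is the probability of not having ties, and C_k and D_k are the numbers of concordant and discordant pairs of observations in stratum k of the conditioning variable. Under this assumed form the partial coefficient is a weighted average of the stratum-wise γ estimates with weights proportional to C_k + D_k.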
Notably, a variable that is the product of two ordinal variables gives rise to a categorical variable with partial order. We define the Pγ coefficient of association between a partially ordered or nominal X and an ordinal Y as the maximum γ coefficient for an association between a valid ordering of X and Y.

$$ {}^{P}\gamma_{XY} = \max_{r} \gamma_{X(r)Y} \qquad (1) $$

Subsequently, the optimal monotone ordering of X with respect to Y is defined as the valid ordering of X for which this maximum is obtained.

$$ r_{opt} = \arg\max_{r} \gamma_{X(r)Y} \qquad (2) $$

Similarly, the partial Pγ between a partially ordered or nominal X and an ordinal Y conditional on a nominal Z is defined as the maximum partial γ coefficient for an association between a valid ordering of X and Y conditional on Z.

$$ {}^{P}\gamma_{XY|Z} = \max_{r} \gamma_{X(r)Y|Z} \qquad (3) $$

Specifically, the partial Pγ coefficient assumes the same ordering in each of the strata. The (partial) optimal monotone ordering of X with respect to Y, conditional on Z, is the ordering corresponding to the partial PγXY|Z in similar fashion as in (2).
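The definitions in (1) to (3) can be illustrated with a brute-force search over valid orderings. The following is a minimal sketch, not the authors' implementation: the function names, the encoding of the partial order as `precedes` pairs, the pooled form of the partial γ and the toy counts are all assumptions made for illustration.

```python
from itertools import permutations


def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in a two-way table.

    Rows are the categories of X, columns those of Y, both in the given order.
    """
    conc = disc = 0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            for lower in table[i + 1:]:
                conc += n_ij * sum(lower[j + 1:])  # higher on X and higher on Y
                disc += n_ij * sum(lower[:j])      # higher on X but lower on Y
    return conc, disc


def partial_gamma(strata):
    """Pooled partial gamma over a list of two-way tables, one per stratum of Z."""
    pairs = [concordant_discordant(t) for t in strata]
    conc = sum(c for c, _ in pairs)
    disc = sum(d for _, d in pairs)
    return (conc - disc) / (conc + disc) if conc + disc else 0.0


def valid_orderings(n_categories, precedes=()):
    """Yield every ordering of range(n_categories) that respects the partial order.

    precedes holds pairs (a, b) meaning category a must be placed before b.
    """
    for order in permutations(range(n_categories)):
        position = {cat: k for k, cat in enumerate(order)}
        if all(position[a] < position[b] for a, b in precedes):
            yield order


def p_gamma(strata, precedes=()):
    """Return (P-gamma, r_opt): the largest partial gamma over valid orderings of X.

    The same ordering of the rows is applied in every stratum.
    """
    best_value, best_order = None, None
    for order in valid_orderings(len(strata[0]), precedes):
        value = partial_gamma([[t[r] for r in order] for t in strata])
        if best_value is None or value > best_value:
            best_value, best_order = value, order
    return best_value, best_order


# Toy 3x3 table with made-up counts; rows are X categories 0, 1, 2.
table = [[10, 2, 1],
         [1, 2, 10],
         [2, 10, 2]]
value, r_opt = p_gamma([table])                       # X treated as nominal
value_po, r_po = p_gamma([table], precedes=[(1, 2)])  # category 1 must precede 2
```

With X treated as nominal the search places category 2 between categories 0 and 1; once category 1 is required to precede category 2, only three orderings remain valid and the attainable maximum is lower. The sketch enumerates all permutations and filters them, which is only practical for a small number of categories. For example, a five-point scale with one extra-ordinal category (the chain 0 < 1 < 2 < 3 < 4 plus a free category 5) has six valid orderings out of 720 permutations.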
The asymptotic variance of the (partial) Pγ coefficient is intractable because of the maximisation operator involved. Instead, the significance of the partial Pγ coefficient and its corresponding partial measure is assessed by Monte Carlo methods. Specifically, a permutation test can be devised where the observed value of the partial Pγ coefficient is compared to a simulated distribution under the null hypothesis that X and Y are independent. To simulate the null distribution, tables are simulated from the marginal distributions of X and Y and the Pγ is calculated for each table (cf. Kreiner 1987). Monte Carlo inference is standard in the analysis of multi-way contingency tables, also for nominal tests of association, as tests based on the asymptotic distributions are of very low power (Kreiner 1987).

However, a permutation test for Pγ may become computationally cumbersome, as calculation of Pγ involves an exhaustive walkthrough of all valid orderings of X. If the number of categories of X is nX, these can be ordered in nX! ways. However, for each permutation X(r) a permutation X(r’) with the categories in the opposite order exists, and γX(r)Y = -γX(r’)Y. From this it follows that PγXY is never negative, and we have to consider only nX!/2 permutations to identify ropt and PγXY. If a variable X has a partial order, a lower number p 0. Inserting the resulting slope parameter s0 and the marginal distributions for X and Y gives a joint distribution with the wanted value for the γ coefficient.

In the first part of the simulation study we compare the power of a test based on the partial Pγ coefficient with a test based on the conventional partial γ coefficient and the two LR tests. We simulate tables for two ordinal variables X and Y, both with five categories. All tables have uniform marginals and a monotone relationship with a prespecified true value γ0 of 0, 0.10, 0.15, or 0.25 for the γ coefficient.
Tables with 250, 500 and 1000 entries are simulated from the joint distribution, and distributed uniformly over the categories of a (uniformly distributed) stratification variable Z with 2, 10 and 20 categories respectively. The estimate of the rejection rate for each combination of simulation factors is based on 1000 data sets, assessed by the four tests at a 5% significance level using permutation inference. The full ANOVA design, i.e. all combinations of factors, is simulated. The average rejection rate (ARR) for each of the factor levels is listed alongside the levels in Table 1. This shows for the most part the expected relationships between the rejection rate and sample size, effect size, and stratification.

Table 1 Simulation factors and ARR on a 5% significance level for the first part of the simulation study for the detection of the relation between two ordinal variables X and Y with uniform marginal distribution, within a stratum Z.

Factor             Level        ARR
Statistical test   Partial γ    0.617
                   Partial Pγ   0.535
                   LR2          0.459
                   LR3          0.272
Sample size        250          0.327
                   500          0.474
                   1000         0.611
Effect size γ0     0            0.049
                   0.1          0.359
                   0.15         0.604
                   0.25         0.87
Strata of Z        2            0.509
                   10           0.464
                   20           0.439
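The permutation inference used for the tests in Table 1 is described only in words. A minimal sketch for the bivariate case, assuming raw paired observations coded as integers, might look as follows. The function names, the default number of simulations and the add-one form of the p-value are illustrative assumptions rather than the authors' procedure. Shuffling Y keeps both margins fixed, which is one way of simulating tables from the marginal distributions under independence.

```python
import random
from itertools import permutations


def gamma(table):
    """Goodman and Kruskal's gamma for a two-way table of counts."""
    conc = disc = 0
    for i, row in enumerate(table):
        for j, n_ij in enumerate(row):
            for lower in table[i + 1:]:
                conc += n_ij * sum(lower[j + 1:])
                disc += n_ij * sum(lower[:j])
    return (conc - disc) / (conc + disc) if conc + disc else 0.0


def p_gamma_nominal(table):
    """Maximum gamma over all row orderings, i.e. X treated as nominal."""
    n_rows = len(table)
    return max(gamma([table[r] for r in order])
               for order in permutations(range(n_rows)))


def cross_tab(x, y, n_x, n_y):
    """Build an n_x by n_y table of counts from paired integer codes."""
    table = [[0] * n_y for _ in range(n_x)]
    for xi, yi in zip(x, y):
        table[xi][yi] += 1
    return table


def mc_p_value(x, y, n_x, n_y, statistic=p_gamma_nominal, n_sim=200, seed=0):
    """Monte Carlo p-value: share of shuffled data sets at least as extreme."""
    rng = random.Random(seed)
    observed = statistic(cross_tab(x, y, n_x, n_y))
    y_shuffled = list(y)
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(y_shuffled)  # removes any X-Y association, margins unchanged
        if statistic(cross_tab(x, y_shuffled, n_x, n_y)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_sim + 1)  # add-one correction (an assumption)


# Made-up data: Y is perfectly determined by X, but the X codes are scrambled.
x = [0] * 30 + [1] * 30 + [2] * 30
y = [2] * 30 + [0] * 30 + [1] * 30
p_value = mc_p_value(x, y, 3, 3)
```

Because the maximisation is repeated for every shuffled data set, the cost grows with both the number of simulations and the number of valid orderings, which is the computational burden noted in the text. A partial test would presumably shuffle Y within each stratum of Z and pool over strata; that extension is left out of the sketch.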
Figure 1 The ARR on a 5% significance level for each of the statistical tests in the first part of the simulation study for data where the true γ = γ0 is as indicated on the horizontal axis.
The factor indicating the statistical test interacts with the effect size γ0. This interaction is shown as the development in ARR in Figure 1. The interaction captures the development of a rejection rate that is 5% overall for a table of independent X and Y, but increases with effect size, differently for each statistical test. When γ0 > 0 the test based on the conventional γ coefficient has considerably higher power than the other tests. This is as expected because the data were generated with a monotone relationship between the variables. The test based on the partial Pγ is not as strong as the conventional partial γ test, but stronger than both the chi-squared tests in all cases.

With the same simulation results we examine how well the partial Pγ coefficient captures the ordering of the categories. As is seen in Figure 2 (left), the correlation between ropt and the true ordering, averaged over the 1000 simulations, is high when the sample size and γ0 are relatively large. However, the number of correctly specified orderings, as shown in Figure 2 (right), is low. This shows that identification of the correct ordering is difficult. Obviously, it will be highly dependent on the number of categories of the variable that are permuted. Thus in general we can say that ropt may be close to, but is unlikely to be equal to, the true ordering.

In Figure 3 the distribution of the partial Pγ coefficient is shown for two configurations of simulation factors taken from the simulation study. By construction the Pγ coefficient is never negative. While in the case that γ0 = 0 this distribution cannot be well characterised because of its proximity to the origin, the distribution in the case when γ0 = 0.15 resembles a normal distribution. This corresponds to the observed relation between the conventional γ and the Pγ coefficient.
Figure 2 The average correlation of ropt with the true ordering (left) and the number of correct orderings found with the Pγ coefficient (right) for the three sample sizes in the first part of the simulation study for data where the true γ = γ0 is as indicated on the horizontal axis.
In Figure 4 it is seen that for the estimated coefficients Pγ ≥ |γ| by definition, and that the two are closer to each other when the estimated values are higher, as seen in the right-hand graph. This leads to an interpretation of a high Pγ coefficient as the strength of the real underlying monotone relationship in the same way as measured by the conventional γ coefficient.

In the second part of the simulation study we compare the strength of the tests relative to differing marginal distributions of the variables involved. The values of the γ coefficient, and thus the Pγ coefficient, vary with changes in the marginal distributions, and properties may change when these are not as balanced as with a uniform distribution. Furthermore, non-uniform marginals make parts of the table sparse, which may cause general inefficiency. We again simulate tables for two ordinal variables X and Y with five categories, where now both can have one of the five types of marginal distribution shown in Figure 5. The simulation factors that are varied in the first part of the simulation study are now held fixed. A further stratification variable Z has ten categories, but with a varying marginal distribution which is either uniform (U) or unbalanced (S). Sample size is 250 and the effect size γ0 = 0.15. The simulation factors are shown in Table 2. The estimate of the rejection rate for each combination of simulation factors is based on 1000 data sets, assessed by the four tests at a 5% significance level using Monte Carlo inference. As the marginal distributions may be different for X and Y, it is important to note that it is the categories of Y that are permuted to calculate the partial Pγ coefficient.
Figure 3 The distribution of the partial Pγ coefficient of correlation between X and Y both with five categories, stratified by a ten category Z when the true γ = γ0 with which the data was generated was 0 (left) and 0.15 (right)
Figure 4 The relation between the ordinary γ coefficient and the Pγ coefficient between X and Y both with five categories, stratified by a ten category Z when the true γ with which the data was generated was 0 (left) and 0.15 (right)
Figure 5 Definitions of the five marginal distributions for the five-category variables X and Y that are used in the second part of the simulation study.
Table 2 Simulation factors and ARR on a 5% significance level for the second part of the simulation study for the detection of the relation between two ordinal variables X and Y with various types of marginal distribution cf. Figure 5, within a stratum Z.

Factor                    Level        ARR
Statistical test          Partial γ    0.533
                          Partial Pγ   0.342
                          LR2          0.229
                          LR3          0.097
Marginal X distribution   U            0.320
                          SR           0.304
                          SL           0.291
                          MM           0.287
                          ME           0.297
Marginal Y distribution   U            0.317
                          SR           0.306
                          SL           0.294
                          MM           0.275
                          ME           0.308
Marginal Z distribution   U            0.307
                          SR           0.294
The ARR for each of the factor levels is shown in Table 2. The differences in power between the tests persist at the same order of magnitude over the differing marginal distributions. Specifically, the test based on the partial Pγ coefficient is more powerful than both chi-squared tests. The marginal distributions do affect the rejection rate, but their effect is completely overwhelmed by that of the statistical test. The power is less influenced by the marginal distribution of X, i.e. the variable whose categories are not permuted, than by the marginals of Y. The power is highest when the marginal distributions are uniform, but we have to conclude that the configuration of the marginal distributions has little effect on the power of the statistical tests.
We observe (results not shown) that when the mass of the marginal distribution of Y is in the extremes, the power of the test is relatively high compared with the other four types (ARR=0.377), but the determination of ropt is poor, 5.1% on average. When the mass of the marginal distribution of Y is in the middle, the determination of ropt is better than with the other four types, 21.4% on average, but with relatively low power (ARR=0.301).

In the third part of the simulation study we briefly examine the power of the statistical tests when the association between X and Y differs from monotonicity in one of the five categories. A probability table for a bivariate association of this type is constructed by first constructing the association vector β (4) for the monotone relationship with γ0 = 0.15 between two ordinal variables with five categories. Here specifically β(3) = 1. To construct a relationship where the third category is extra-ordinal, a table is generated with an association vector where β(3) is altered. For a sample size of 250 and ten strata for the variable Z and all variables with uniform marginals, the rejection rates are estimated on 1000 simulated data sets.

The results for three choices of β(3) are shown in Table 3. For β(3) = 1 the association is monotone, and the rejection rate is the same as in the first two parts of the simulation study. In the other two cases the association is only partially monotone. Here the test based on the conventional partial γ coefficient loses power because it tries to fit the extra-ordinal category into the ordering as given. The partial Pγ is clearly superior as it can both use the information in the partial ordering, and is flexible enough to allow for the extra-ordinal category. Moreover, the partial Pγ coefficient is stronger than the chi-squared tests.
Table 3 Estimates of the rejection rate on a 5% significance level for various tests for the relationship of two ordinal variables X and Y diverging from a monotone relationship, within a stratum Z.

β(3)         1.5     1       0.5
Partial γ    0.609   0.607   0.575
Partial Pγ   0.702   0.388   0.850
LR2          0.538   0.274   0.731
LR3          0.140   0.087   0.232
The conclusion of this simulation study is that a test based on the Pγ coefficient is stronger than the chi-squared tests LR3 and LR2 when an underlying monotone relationship can be assumed. The ordering of this underlying monotone relationship does not have to be known. Rather, the determination of the Pγ coefficient gives a good indication of this ordering through ropt. Only when the ordering is known does the conventional γ coefficient give stronger inference.
Applying the Pγ measure of association

We illustrate the use of the Pγ coefficient in two examples. In the first example we determine the order of Danish political parties on a left-right scale. In the second example we try to characterise the predictive effect of a weight goal on the later attained weight.
The political attitudes of Danish voters

The data for the first example originated in the European Values Studies (Halman 2001), a large-scale, cross-national, longitudinal survey research programme initiated in the late 70s and still running. The example here is only concerned with information from Denmark in 1981, 1990 and 1999 on preferred political party and political attitudes measured on a left-right discrete 10-point VAS scale. Any statistical test will of course disclose a highly significant association between party and political attitudes. The question of significance therefore is not an issue in this example. It is also common knowledge that the parties are ordered on a left-right scale with some parties clearly belonging to either left or right of the political spectrum, but they are nevertheless only partially ordered. A statistical analysis therefore would have to regard party as a nominal variable. The purpose of this example is simply to examine the degree to which an analysis by the Pγ coefficient is able to capture the ordinal structure between the two variables in a way that is consistent with common knowledge and at the same time provides the missing bits of information on how the parties are ordered.

A total of 10 political parties are considered. The 10×10×3 table showing the association between party and political attitudes in 1981, 1990 and 1999 is too large to show in this format. The partial Pγ coefficient is equal to 0.629, indicative of a very strong association between the two variables. The order of parties providing this association is shown in Figure 6. The result fits common knowledge about Danish parties very nicely. Party B is usually assumed to define the midpoint of the Danish political spectrum with parties Ø, SF and A to the left in that precise order, and DF, Z, V and C to the right. Q and CD should be close to B. Moreover, the high value of the Pγ coefficient reflects confidence that the ordering is a true one-dimensional ordering of political parties.

The position of the new party, DF, at the far right of the spectrum is somewhat surprising. The position reflects the political attitudes of the persons preferring DF to other parties in 1999, the first surveyed year in which DF contended in general elections. Since then, the party has with some success attempted to move towards the midpoint of the spectrum, attracting an increasing number of voters from the other parties.
Figure 6 The order of Danish political parties with respect to Danish voters’ left-right associations, resulting in a partial Pγ coefficient equal to 0.629
[Figure 6. Parties in the order listed in the figure: The Red-Green Alliance (Ø), The Socialist People’s party (SF), The Social Democratic Party (A), The Social Liberal Party (B), The Christian People’s party (Q), The Centre Democrats (CD), The Progress Party (Z), The Liberal Party (V), The Conservative Party (C), The Danish People’s Party (DF). Axis: left-right categories, from far left to far right.]
The effect of goal-setting in a weight reduction programme for diabetes patients

In the Diabetes Care in General Practice study (Olivarius et al 2001) 1428 newly diagnosed diabetic patients aged 40 or over were followed from 1989 in a randomised trial among more than 600 Danish general practitioners. The intervention provided optimum conditions for follow-up, doctor-patient communication and treatment, among other ways by training the doctors, producing clinical guidelines and setting individual treatment goals. Specifically, general practitioners were prompted to hold three-monthly sessions with their patients to monitor the disease and review treatment. The intervention was designed to assess structural care for diabetes patients.

Part of the intervention was the implementation of a weight control programme. Body weight is an important risk factor in the development of diabetic complications, and the aim of the weight control programme was to motivate the patient to control and, if possible, reduce body weight. At each session, the general practitioners were asked to formulate weight goals for the next session, together with their patients. These goals are agreements either to aim for a reduction of a certain amount, or to keep current weight. Additionally there was the possibility not to set any weight goal.

Table 4 The relationship between weight goal and attained weight control three months thereafter, where the weight goal is a partially ordinal variable with no goal set as an extra-ordinal category, analysed with Pγ.

BMI (kg/m2) at next session
P
Bivariate
Given weight at current session
Given various possible predictors*
323
252
73
45
A
B
B
B
B
C
B
B
A
Keep current weight
991
1386
603
146
35
B
A
C
A
A
F
D
F
C
Decrease