Journal of Classification 23:301-313 (2006) DOI: 10.1007/s00357-006-0017-z
On Similarity Indices and Correction for Chance Agreement
Ahmed N. Albatineh Nova Southeastern University, USA
Magdalena Niewiadomska-Bugaj Western Michigan University, USA
Daniel Mihalko Western Michigan University, USA
Abstract: Similarity indices can be used to compare partitions (clusterings) of a data set. Many such indices have been introduced in the literature over the years. We show that among the 28 indices we were able to track, there are only 22 distinct ones. Even though their values differ for the same pair of clusterings compared, after correcting for the agreement attributable to chance alone their values become similar, and some of them even become equivalent. Consequently, the problem of choosing an index for comparing different clusterings becomes less important.

Keywords: Similarity indices; Equivalence of similarity indices; Correction for chance agreement; Comparison of clusterings; Cohen's kappa.
The authors would like to thank Willem J. Heiser and an anonymous referee for their helpful comments and valuable suggestions.

Authors' Addresses: Ahmed N. Albatineh, Division of Mathematics, Science, and Technology, Nova Southeastern University, Fort Lauderdale, FL 33314 USA, e-mail: Albatine@nova.edu; Magdalena Niewiadomska-Bugaj and Daniel Mihalko, Department of Statistics, Western Michigan University, Kalamazoo, MI 49008 USA, e-mails: [email protected], daniel.mihalko@wmich.edu
Table 1. Similarity Table for Two Clustering Methods

                                            Method 1
  Number of pairs           in the same cluster   in different clusters   total
  Method 2
    in the same cluster              a                      b              a+b
    in different clusters            c                      d              c+d
  total                             a+c                    b+d              M
1. Introduction

Clustering techniques are designed to uncover groups existing in data, usually with very limited information available. For example, not only does the membership of the data points have to be determined, but often also the number of groups. The many available procedures are based on various optimality criteria, and since different criteria can be used, it is important to be able to compare results obtained by different approaches. Similarly, one may be interested in assessing the degree of similarity (or verifying the equivalence) of two clustering algorithms, for example one being a simpler and/or more efficient version of the other, an important issue in current research, where large data sets are so common.

Similarity indices have been used for the assessment of clustering structure recovery with respect to cluster size, dimensionality, and number of clusters (e.g., Milligan, Soon, and Sokol (1983); Milligan and Cooper (1986)). Saxena and Navaneerham (1991, 1993) used similarity indices to compare graphical (Chernoff-type faces) with non-graphical methods for clustering multivariate observations.

A standard approach in such comparisons starts with obtaining an I × J summary table (sometimes also called a matching matrix) M = {m_ij}, i = 1, . . . , I, j = 1, . . . , J. Table M displays the numbers m_ij of data points placed in cluster i according to one clustering and in cluster j according to the other (I and J are the numbers of clusters for the first and second clustering method, respectively). Here $m_{i+} = \sum_{j=1}^{J} m_{ij}$ and $m_{+j} = \sum_{i=1}^{I} m_{ij}$ are the row and column totals, respectively (and the cluster sizes in the respective clusterings), and $m = \sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}$ is the total number of points being clustered.

Since the clusters are not predefined, the similarity of different clustering procedures (or algorithms) is usually based on the number of pairs of data points that are (or are not) placed into the same cluster according to each procedure. Consequently, a 2 × 2 similarity table, Table 1, is formed, where a is the number of pairs that were placed in the same cluster according to both clustering methods, b (c) is the number of pairs that were placed in the same cluster according to method 1 (2) but not according to method 2 (1), and finally d is the number of pairs that were not in the same cluster according to either of the methods. We have

$$a + b + c + d = M = \binom{m}{2} = \frac{1}{2}\, m(m-1),$$
where

$$
\begin{aligned}
a &= \sum_{i=1}^{I}\sum_{j=1}^{J}\binom{m_{ij}}{2}
   = \frac{1}{2}\sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2 - \frac{m}{2},\\
b &= \sum_{i=1}^{I}\binom{m_{i+}}{2} - a
   = \frac{1}{2}\sum_{i=1}^{I} m_{i+}^2 - \frac{1}{2}\sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2
   = \frac{1}{2}P - a,\\
c &= \sum_{j=1}^{J}\binom{m_{+j}}{2} - a
   = \frac{1}{2}\sum_{j=1}^{J} m_{+j}^2 - \frac{1}{2}\sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2
   = \frac{1}{2}Q - a,\\
d &= \binom{m}{2} - (a + b + c) = M - \frac{1}{2}(P + Q) + a,
\end{aligned}
\tag{1}
$$

with $P = \sum_{i=1}^{I} m_{i+}^2 - m$ and $Q = \sum_{j=1}^{J} m_{+j}^2 - m$ (see Jain and Dubes (1988)).¹

To simplify further considerations we introduce two types of trivial clustering: the trivial m-cluster, in which each of the m data points forms its own cluster of size 1, and the trivial one-cluster, in which all data points are placed together in one cluster. It is easy to notice that a = 0 if and only if m_ij ≤ 1 for all i = 1, . . . , I and j = 1, . . . , J, while b = 0 (c = 0) if and only if there is only one cell with a positive count in each row (column) of table M. Finally, d = 0 if and only if all positive counts are either in the same row or in the same column of table M (either there exists j₀ such that m_i+ = m_{ij₀} for all i = 1, . . . , I, or there exists i₀ such that m_+j = m_{i₀j} for all j = 1, . . . , J), so at least one of the clusterings being compared is a one-cluster.
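To make these definitions concrete, here is a minimal sketch in Python with NumPy (the function names are ours, not from the paper). It builds the matching matrix M from two label vectors, counts a, b, c, d by brute force over all C(m, 2) pairs, and checks that the closed forms (1) give the same values.

```python
import numpy as np
from itertools import combinations

def matching_matrix(labels1, labels2):
    """Cross-tabulate two clusterings: m_ij counts the data points placed
    in cluster i by the first clustering and in cluster j by the second."""
    _, inv1 = np.unique(labels1, return_inverse=True)
    _, inv2 = np.unique(labels2, return_inverse=True)
    M = np.zeros((inv1.max() + 1, inv2.max() + 1), dtype=np.int64)
    np.add.at(M, (inv1, inv2), 1)
    return M

def pair_counts_brute(labels1, labels2):
    """a, b, c, d of Table 1 by direct enumeration of all C(m, 2) pairs."""
    a = b = c = d = 0
    for p, q in combinations(range(len(labels1)), 2):
        same1 = labels1[p] == labels1[q]      # together under method 1
        same2 = labels2[p] == labels2[q]      # together under method 2
        if same1 and same2:
            a += 1
        elif same1:
            b += 1
        elif same2:
            c += 1
        else:
            d += 1
    return a, b, c, d

def pair_counts(M):
    """a, b, c, d computed from the matching matrix alone, via (1)."""
    M = np.asarray(M, dtype=np.int64)
    m = M.sum()                               # number of points clustered
    S = (M ** 2).sum()                        # sum of squared cell counts
    P = (M.sum(axis=1) ** 2).sum() - m        # from row totals m_i+
    Q = (M.sum(axis=0) ** 2).sum() - m        # from column totals m_+j
    Mtot = m * (m - 1) // 2                   # M = C(m, 2), total pairs
    a = (S - m) // 2
    return a, P // 2 - a, Q // 2 - a, Mtot - (P + Q) // 2 + a

u = [1, 1, 1, 2, 2, 2]                        # clustering 1 of m = 6 points
v = [1, 1, 2, 2, 3, 3]                        # clustering 2
assert pair_counts_brute(u, v) == tuple(pair_counts(matching_matrix(u, v)))
print(pair_counts(matching_matrix(u, v)))     # (2, 4, 1, 8); sums to C(6,2)=15
```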
2. Overview of Similarity Indices

Various similarity indices (also referred to in the literature as similarity measures) based on the similarity table (Table 1) have been proposed over the years. We were able to track 22 different formulas (see Table 3), some of which were introduced more than once, each time under a different name (indices R, H, CZ, FM, GL, and GK in Table 2).

¹ As suggested by the anonymous reviewer, we point out that M = G₁ᵀG₂, where G₁ and G₂ are indicator matrices for the respective clusterings (an indicator matrix with m rows and I or J columns indicates by 0s and 1s to which cluster each data point belongs). The matrices GₖGₖᵀ (k = 1, 2) show which data points belong to the same cluster. Also, one can notice that 1ᵀG₁G₁ᵀ1 = 2(a + b) + m, 1ᵀG₂G₂ᵀ1 = 2(a + c) + m, and that (G₁G₁ᵀ) ⊗ (G₂G₂ᵀ) provides the shared cluster membership. Consequently, 1ᵀ(G₁G₁ᵀ) ⊗ (G₂G₂ᵀ)1 = tr[(G₁G₁ᵀ)(G₂G₂ᵀ)] = tr(G₂ᵀG₁G₁ᵀG₂) = tr(MMᵀ), and 2a + m = ‖M‖².
Table 2. Similarity Indices - References and Symbols

No.  Introduced by                                                Symbol
1    Sokal and Michener (1958), Rand (1971)                       R
2    Hamann (1961), Hubert (1977)                                 H
3    Czekanowski (1932), Dice (1945), Gower and Legendre (1986)   CZ
4    Kulczynski (1927)                                            K
5    McConnaughey (1964)                                          MC
6    Peirce (1884)                                                PE
7    Fowlkes and Mallows (1983), Ochiai (1957)                    FM
8    Wallace (1) (1983)                                           W1
9    Wallace (2) (1983)                                           W2
10   Gamma                                                        Γ
11   Sokal and Sneath (1963)                                      SS1
12   Baulieu (1989)                                               B1
13   Russel and Rao (1940)                                        RR
14   Fager and McGowan (1963)                                     FMG
15   Pearson                                                      P
16   Baulieu (1989)                                               B2
17   Jaccard (1908)                                               J
18   Sokal and Sneath (1963)                                      SS2
19   Sokal and Sneath (1963), Ochiai (1957)                       SS3
20   Gower and Legendre (1986), Sokal and Sneath (1963)           GL
21   Rogers and Tanimoto (1960)                                   RT
22   Goodman and Kruskal (1954), Yule (1927)                      GK
All similarity indices listed are functions of the counts a, b, c, d defined by (1). Their values usually lie between 0 and 1, except for seven indices (H, MC, PE, Γ, P, GK, B2) which were designed analogously to various association indices for a 2 × 2 contingency table and as such have range [−1, 1] ([−1/4, 1/4] for B2), allowing, at least theoretically, for negative values.

If both clusterings are identical, then the number of rows in M is the same as the number of columns (I = J), with positive counts only on the diagonal (m_ij > 0 if and only if i = j, after a suitable reordering of clusters), which implies b = c = 0. If additionally both a and d are positive (which means that not all m_ii counts equal 1), the similarity indices in Table 2 other than RR, FMG, P, and B2 all equal 1. Index RR equals 1 if and only if b = c = d = 0, which is possible only in the case of both clusterings being one-clusters. FMG tends to 1 when m → ∞ and both clusterings are identical. For P to equal 1 one would need b = c = 0 and a = d = 1; but then the relationships (1) would give M = a + d = 2, which is impossible, since M = m(m − 1)/2 takes only the values 1, 3, 6, . . . . For B2 to equal 0.25 one needs b = c = 0 and a = d; this condition is hard to satisfy, since it requires not only that both clusterings be identical but also that a = d.

Eleven indices (CZ, K, MC, FM, RR, W1, W2, J, SS2, SS3, GK) attain their lower bound if a = 0, so when m_ij ≤ 1 for all i and j. Seven other indices (R, H, PE, Γ, SS1, GL, RT) require additionally that d = 0, which, as explained at the end of the previous section, means that at least one of the clusterings is a one-cluster.
Table 3. Similarity Indices

No.  Symbol  Formula                                                 Range          L  H
1    R       (a + d)/(a + b + c + d)                                 [0, 1]         +  +
2    H       [(a + d) − (b + c)]/(a + b + c + d)                     [−1, 1]        +  +
3    CZ      2a/(2a + b + c)                                         [0, 1]         +  +
4    K       (1/2)[a/(a + b) + a/(a + c)]                            [0, 1]         +  +
5    MC      (a² − bc)/[(a + b)(a + c)]                              [−1, 1]        +  +
6    PE      (ad − bc)/[(a + c)(b + d)]                              [−1, 1]        +  +
7    FM      a/√[(a + b)(a + c)]                                     [0, 1]         +  +
8    W1      a/(a + b)                                               [0, 1]         +  +
9    W2      a/(a + c)                                               [0, 1]         +  +
10   Γ       (ad − bc)/√[(a + b)(a + c)(c + d)(b + d)]               [−1, 1]        +  +
11   SS1     (1/4)[a/(a + b) + a/(a + c) + d/(d + b) + d/(d + c)]    [0, 1]         +  +
12   B1      [M² − M(b + c) + (b − c)²]/M²                           [0, 1]         +  +
13   RR      a/(a + b + c + d)                                       [0, 1]         +  −
14   FMG     a/√[(a + b)(a + c)] − 1/[2√(a + b)]                     [−1/2, 1)      +  −
15   P       (ad − bc)/[(a + b)(a + c)(c + d)(b + d)]                [−1, 1]        +  −
16   B2      (ad − bc)/M²                                            [−1/4, 1/4]    +  −
17   J       a/(a + b + c)                                           [0, 1]         −  +
18   SS2     a/[a + 2(b + c)]                                        [0, 1]         −  +
19   SS3     ad/√[(a + b)(a + c)(d + b)(d + c)]                      [0, 1]         −  +
20   GL      (a + d)/[a + (b + c)/2 + d]                             [0, 1]         −  +
21   RT      (a + d)/[a + 2(b + c) + d]                              [0, 1]         −  +
22   GK      (ad − bc)/(ad + bc)                                     [−1, 1]        −  +

Here $M = \binom{m}{2} = a + b + c + d$; the columns L and H mark membership in the families L and H introduced in Section 4.
With both a = d = 0, one of the clusterings has to be an m-cluster, while the other has to be a one-cluster. Index P can never equal its lower bound −1, for the same reason it cannot equal 1, as explained above. The lower bound of FMG is −1/2 and is attained when a = 0 and b = 1, a case requiring m = I + 1, with an i₀ such that m_{i₀+} = 2 and m_{i+} = 1 for i ≠ i₀.
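As a computational companion to Table 3, the sketch below (our naming; it assumes all denominators are positive, i.e., none of the degenerate cases just discussed occur) implements a selection of the listed formulas.

```python
import math

def similarity_indices(a, b, c, d):
    """A selection of the Table 3 indices; a sketch, not exhaustive."""
    Mtot = a + b + c + d                      # total number of pairs, C(m, 2)
    return {
        "R":  (a + d) / Mtot,
        "H":  ((a + d) - (b + c)) / Mtot,
        "CZ": 2 * a / (2 * a + b + c),
        "K":  (a / (a + b) + a / (a + c)) / 2,
        "MC": (a * a - b * c) / ((a + b) * (a + c)),
        "FM": a / math.sqrt((a + b) * (a + c)),
        "J":  a / (a + b + c),
        "RR": a / Mtot,
        "RT": (a + d) / (a + 2 * (b + c) + d),
        "GK": (a * d - b * c) / (a * d + b * c),
    }

# a, b, c, d for the example clusterings in the earlier sketch
print(similarity_indices(2, 4, 1, 8))
```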
3. Correction for Chance Agreement

With so many similarity indices available, the choice of the index and the subsequent interpretation of its value is not obvious.
Table 4. Mean, 2.5th percentile (L), and 97.5th percentile (U) for six selected similarity indices (FM, R, H, RR, CZ, W) and I = J = 2, 3, 6.

I=J  Statistic   FM       R        H        RR       CZ       W
2    Mean        0.678    0.499   −0.001    0.462    0.645    0.926
     L           0.511    0.499   −0.002    0.261    0.511    0.524
     U           0.705    0.500    0.000    0.497    0.665    0.996
3    Mean        0.494    0.417   −0.167    0.248    0.453    0.748
     L           0.389    0.335   −0.331    0.151    0.384    0.455
     U           0.573    0.515    0.030    0.328    0.497    0.988
6    Mean        0.265    0.547    0.094    0.071    0.236    0.430
     L           0.208    0.374   −0.252    0.044    0.202    0.265
     U           0.335    0.656    0.312    0.113    0.265    0.683
As an example, in Table 4 we include the mean values and the limits of 95% confidence intervals for six selected similarity indices (FM, R, H, RR, CZ, W) obtained for 1000 matching matrices M. For each such matrix a new data set of size 500 was generated from a bivariate normal distribution and two clusterings were compared. In each pair, one clustering was a random partition of the 500 data points into clusters of equal sizes, while the other was obtained using the average linkage method (this choice was arbitrary) in SAS statistical software. The number of clusters requested was the same for both clusterings and equal to 2, 3, and then 6. When two clusters were requested (first block of rows in Table 4), the means vary from about 0 for H to 0.926 for W, with several intermediate values obtained for the other indices, for example 0.499 for R and 0.645 for CZ. The clusterings were obtained at random and independently, so the differences must be caused by the agreement due to chance affecting each index in a different way, specific to its formula.

To eliminate the effect of agreement due to chance, Morey and Agresti (1984) and Hubert and Arabie (1985) suggested a correction for the Rand (R) similarity index. Any similarity index SI after such a correction has the form

$$CSI = \frac{SI - E(SI)}{1 - E(SI)},\tag{2}$$
where the expectation E(SI) is conditional upon fixed sets of marginal counts in the matrix M. Consequently, the corrected value of the index should be close to 0 if the agreement is due to chance only, and will equal 1 when the uncorrected index equals 1. For many indices from Table 2, the latter will happen if I = J and the rows and columns of table M can be rearranged in such a way that the table has positive counts only on the diagonal. Correction by elimination of chance effects is similar to the proposal of Guttman (1941) in his measure of nominal association (later denoted by λ by Goodman and Kruskal (1954)), and to the κ measure of interjudge agreement proposed by Cohen (1960).
Table 5. Mean, 2.5th percentile (L), and 97.5th percentile (U) for six corrected similarity indices (CFM, CR, CH, CRR, CCZ, CW) and I = J = 2, 3, 6 (the approximate expectation formula (4) was used).

I=J  Statistic   CFM      CR       CH       CRR      CCZ      CW
2    Mean        0.000    0.000    0.000    0.000    0.000    0.000
     L           0.000    0.000    0.000    0.000    0.000    0.000
     U           0.002    0.002    0.002    0.001    0.002    0.003
3    Mean        0.001    0.001    0.001    0.000    0.001    0.001
     L           0.000    0.000    0.000    0.000    0.000    0.000
     U           0.007    0.006    0.006    0.002    0.006    0.004
6    Mean        0.005    0.004    0.004    0.001    0.004    0.003
     L           0.001    0.001    0.001    0.000    0.001    0.001
     U           0.011    0.010    0.010    0.002    0.010    0.007
Two different conditional expectation formulas for the index R were proposed. One was based on the exact generalized hypergeometric distribution of the counts in the table M (Hubert and Arabie, 1985),

$$E\!\left(\sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2\right)
= \frac{\sum_{i=1}^{I}\sum_{j=1}^{J} m_{i+}^2 m_{+j}^2}{m(m-1)}
+ \frac{m^2 - \left(\sum_{i=1}^{I} m_{i+}^2 + \sum_{j=1}^{J} m_{+j}^2\right)}{m-1},\tag{3}$$

and the other was its asymptotic form based on the multinomial distribution (Morey and Agresti, 1984),

$$E\!\left(\sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2\right)
\approx \frac{1}{m^2}\sum_{i=1}^{I} m_{i+}^2 \sum_{j=1}^{J} m_{+j}^2
= \frac{1}{m^2}\sum_{i=1}^{I}\sum_{j=1}^{J} m_{i+}^2 m_{+j}^2.\tag{4}$$
The effect of the correction can be seen in Table 5, which contains mean values and limits of 95% confidence intervals for the six similarity indices from Table 4 after they were corrected for chance agreement (CFM, CR, CH, CRR, CCZ, CW), with the expectation approximated by (4). They were obtained from the simulation discussed above; the clusterings being compared were independent, so there was no actual similarity. It can be seen that the means in Table 5 are all either equal to zero or very close to zero. Additionally, the results for the indices CR, CH, and CCZ are equal. As will be explained in Section 4, some of the indices become equivalent after correction (2) is applied.
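In code, both expectation formulas and the resulting corrected Rand index look as follows (a sketch with our function names; the Rand index is written as α + β ΣΣ m_ij², with α and β as derived in Section 4, Table 7):

```python
import numpy as np

def E_sumsq_exact(M):
    """Exact conditional expectation (3) of sum_ij m_ij^2, with the
    margins of M held fixed (Hubert and Arabie 1985)."""
    M = np.asarray(M, dtype=float)
    m = M.sum()
    r = (M.sum(axis=1) ** 2).sum()            # sum_i m_i+^2
    s = (M.sum(axis=0) ** 2).sum()            # sum_j m_+j^2
    return r * s / (m * (m - 1)) + (m ** 2 - (r + s)) / (m - 1)

def E_sumsq_approx(M):
    """Multinomial approximation (4) (Morey and Agresti 1984)."""
    M = np.asarray(M, dtype=float)
    m = M.sum()
    return (M.sum(axis=1) ** 2).sum() * (M.sum(axis=0) ** 2).sum() / m ** 2

def corrected_rand(M, E_sumsq=E_sumsq_exact):
    """Correction (2) applied to the Rand index, written as
    R = alpha + beta * sum_ij m_ij^2 (see Table 7)."""
    M = np.asarray(M, dtype=float)
    m = M.sum()
    P = (M.sum(axis=1) ** 2).sum() - m
    Q = (M.sum(axis=0) ** 2).sum() - m
    Mtot = m * (m - 1) / 2                    # C(m, 2)
    alpha = 1 - (P + Q + 2 * m) / (2 * Mtot)
    beta = 1 / Mtot
    R = alpha + beta * (M ** 2).sum()
    ER = alpha + beta * E_sumsq(M)            # E(R), by linearity
    return (R - ER) / (1 - ER)
```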
Table 6. Sample size (n) effect on the difference between the Rand similarity index corrected using the exact expectation formula (CRE) and the approximate expectation formula (CRA). EE and AE denote the exact and approximate expected values of R obtained from (3) and (4).

n     EE      AE      EE − AE   R       CRE     CRA     CRA − CRE
6     0.547   0.400   0.147     0.600   0.118   0.333   0.216
30    0.505   0.483   0.022     0.655   0.304   0.333   0.029
60    0.502   0.492   0.010     0.661   0.319   0.333   0.013
600   0.500   0.499   0.001     0.666   0.332   0.333   0.001
The differences between the expectations (3) and (4), pointed out in Hubert and Arabie (1985), can be apparent only when the data size is small; otherwise they are slight. To illustrate this fact we provide a simple example. Hubert and Arabie (1985) considered in their paper the following summary table M (row and column totals shown in the margins):

    2  1  0 | 3
    0  2  1 | 3
    --------+--
    2  3  1 | 6

corresponding to a sample of size 6 and two clusterings, one with two and the other with three clusters. Below we add three more cases that preserve the structure of the clusterings above, with the counts multiplied by 5, 10, and 100, respectively:

    10  5  0 | 15      20 10  0 | 30      200 100   0 | 300
     0 10  5 | 15       0 20 10 | 30        0 200 100 | 300
    ----------+---     ----------+---     -------------+----
    10 15  5 | 30      20 30 10 | 60      200 300 100 | 600
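Reusing the functions from the sketch above, the CRE and CRA columns of Table 6 can be reproduced for all four tables (values agree up to rounding):

```python
HA = [[2, 1, 0],
      [0, 2, 1]]                              # the Hubert-Arabie table
for k in (1, 5, 10, 100):
    M = [[k * x for x in row] for row in HA]  # counts multiplied by k
    cre = corrected_rand(M, E_sumsq_exact)    # exact correction
    cra = corrected_rand(M, E_sumsq_approx)   # approximate correction
    print(6 * k, round(cre, 3), round(cra, 3))
# n = 6: 0.118 vs 0.333; n = 30: 0.304 vs 0.333; n = 600: 0.332 vs 0.333
```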
Table 6 contains the uncorrected and corrected values of R for all four summary tables, illustrating the effect of the sample size on the difference between the two corrections. It can be seen that while in the case of a sample of size 6 the difference (see column CRA − CRE in Table 6) is quite large (0.216), it is only 0.029 when the sample size is 30, and only 0.001 for the sample of size 600.

4. L Family of Similarity Indices

To correct a similarity index according to (2), one needs its (exact or approximate) conditional expectation. That appears rather simple for the 16 out of the 22 similarity indices, marked with a "+" in the second to last column of Table 3, that are linear functions of $\sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2$. We introduce a family
L of indices of the form

$$SI = \alpha + \beta \sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2,\tag{5}$$
where α and β, specific for each index, depend on the marginal totals but not on the individual counts in table M. Proposition 1 below specifies a condition under which indices within the L family become equivalent when corrected for chance agreement.

Proposition 1. Two indices in the L family become identical after correction (2) if they have the same ratio

$$\frac{1-\alpha}{\beta}.\tag{6}$$

Proof: $E(SI) = E\left(\alpha + \beta \sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2\right) = \alpha + \beta\, E\left(\sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2\right)$, and consequently the corrected similarity index (CSI) becomes

$$
\begin{aligned}
CSI &= \frac{SI - E(SI)}{1 - E(SI)}
  = \frac{\alpha + \beta \sum_{i,j} m_{ij}^2 - \alpha - \beta\, E\!\left(\sum_{i,j} m_{ij}^2\right)}
         {1 - \alpha - \beta\, E\!\left(\sum_{i,j} m_{ij}^2\right)}\\
 &= \frac{\beta\left[\sum_{i,j} m_{ij}^2 - E\!\left(\sum_{i,j} m_{ij}^2\right)\right]}
         {1 - \alpha - \beta\, E\!\left(\sum_{i,j} m_{ij}^2\right)}
  = \frac{\sum_{i,j} m_{ij}^2 - E\!\left(\sum_{i,j} m_{ij}^2\right)}
         {\frac{1-\alpha}{\beta} - E\!\left(\sum_{i,j} m_{ij}^2\right)}.
\end{aligned}
\tag{7}
$$

Therefore the value of a similarity index corrected for chance agreement depends on the particular index only through (1 − α)/β, where α and β characterize the index within the L family (see Table 7).

Corollary 4.2. (i) The Rand (R), Hubert (H), and Czekanowski (CZ) similarity indices are equivalent after correction for agreement due to chance.

Proof: Using the formulas for α and β from Table 7, we obtain the ratio (6) equal to

$$\frac{1}{2}\left(\sum_{i=1}^{I} m_{i+}^2 + \sum_{j=1}^{J} m_{+j}^2\right)$$

for all three indices.
Table 7. Coefficients α and β for indices in the L family.

Index  α                                                                        β
R      1 − (P + Q + 2m)/(2M)                                                    1/M
H      1 − (P + Q + 2m)/M                                                       2/M
CZ     −2m/(P + Q)                                                              2/(P + Q)
K      −m(P + Q)/(2PQ)                                                          (P + Q)/(2PQ)
MC     [m² − (P + m)(Q + m)]/(PQ)                                               (P + Q)/(PQ)
PE     −(2mM + PQ)/[Q(2M − Q)]                                                  2M/[Q(2M − Q)]
FM     −m/√(PQ)                                                                 1/√(PQ)
W1     −m/P                                                                     1/P
W2     −m/Q                                                                     1/Q
Γ      −(2mM + PQ)/√[PQ(2M − P)(2M − Q)]                                        2M/√[PQ(2M − P)(2M − Q)]
SS1    −m(P + Q)/(4PQ) + [4M − (P + Q)][2M − m − (P + Q)]/[4(2M − P)(2M − Q)]   (P + Q)/(4PQ) + [4M − (P + Q)]/[4(2M − P)(2M − Q)]
B1     1 − (P + Q + 2m)/(2M) + (P − Q)²/(4M²)                                   1/M
RR     −m/(2M)                                                                  1/(2M)
FMG    −m/√(PQ) − 1/√(2P)                                                       1/√(PQ)
P      −4(2mM + PQ)/[PQ(2M − P)(2M − Q)]                                        8M/[PQ(2M − P)(2M − Q)]
B2     −(2mM + PQ)/(4M²)                                                        1/(2M)

Here $M = \binom{m}{2}$, $P = \sum_i m_{i+}^2 - m$, and $Q = \sum_j m_{+j}^2 - m$.
(ii) The Kulczynski (K) and McConnaughey (MC) similarity indices are equivalent after correction for agreement due to chance.

Proof: Using the formulas for α and β from Table 7, we obtain the ratio (6) equal to

$$m + \frac{2}{\dfrac{1}{\sum_{i=1}^{I} m_{i+}^2 - m} + \dfrac{1}{\sum_{j=1}^{J} m_{+j}^2 - m}}$$

for both indices.
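The corollary is easy to verify numerically. Because a, b, c, d in (1) are affine functions of S = ΣΣ m_ij², evaluating any L-family index at the counts implied by E(S) yields exactly E(SI) = α + β E(S), so correction (2) can be applied without deriving α and β for each index separately. A sketch (our naming, reusing E_sumsq_exact from the Section 3 sketch):

```python
import numpy as np

def counts_at(M, S):
    """a, b, c, d of (1), viewed as affine functions of S = sum_ij m_ij^2."""
    M = np.asarray(M, dtype=float)
    m = M.sum()
    P = (M.sum(axis=1) ** 2).sum() - m
    Q = (M.sum(axis=0) ** 2).sum() - m
    Mtot = m * (m - 1) / 2
    a = (S - m) / 2
    return a, P / 2 - a, Q / 2 - a, Mtot - (P + Q) / 2 + a

def corrected(index, M, E_sumsq):
    """Correction (2) for any index in the L family."""
    M = np.asarray(M, dtype=float)
    si = index(*counts_at(M, (M ** 2).sum()))
    e = index(*counts_at(M, E_sumsq(M)))      # equals alpha + beta * E(S)
    return (si - e) / (1 - e)

R  = lambda a, b, c, d: (a + d) / (a + b + c + d)
H  = lambda a, b, c, d: ((a + d) - (b + c)) / (a + b + c + d)
CZ = lambda a, b, c, d: 2 * a / (2 * a + b + c)
K  = lambda a, b, c, d: (a / (a + b) + a / (a + c)) / 2
MC = lambda a, b, c, d: (a * a - b * c) / ((a + b) * (a + c))

M = [[2, 1, 0], [0, 2, 1]]                    # the Hubert-Arabie table
print([round(corrected(f, M, E_sumsq_exact), 6) for f in (R, H, CZ)])
print([round(corrected(f, M, E_sumsq_exact), 6) for f in (K, MC)])
# -> [0.117647, 0.117647, 0.117647] and [0.125, 0.125]
```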
Let us now introduce a family H, identified in the last column of Table 3, consisting of similarity indices that equal 1 if and only if both clusterings being compared are identical, which means b = c = 0 in the corresponding matrix M. Consequently, if two clusterings have the same number of clusters (I = J), and if their corresponding cluster sizes are equal (i.e., there exists a permutation i₁, . . . , i_I such that m_{i_j+} = m_{+j} for j = 1, . . . , J = I), then for the fixed sets of marginal totals the maximum value of any index in H corresponds to the diagonal matrix M and is equal to 1. Consequently, we have

$$1 = \max\left(\alpha + \beta \sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2\right) = \alpha + \beta \max\left(\sum_{i=1}^{I}\sum_{j=1}^{J} m_{ij}^2\right),\tag{8}$$
and the following proposition holds.

Proposition 2. Let two clustering procedures be compared in the case of the same number of clusters (I = J) and equal cluster sizes (i.e., there exists a permutation i₁, . . . , i_I such that m_{i_j+} = m_{+j} for j = 1, . . . , I). Then all similarity indices from L ∩ H coincide after correction (2) is applied.

Proof: Under the assumptions of Proposition 2, condition (8) holds, and consequently the corrected similarity index (CSI) becomes

$$
\begin{aligned}
CSI &= \frac{SI - E(SI)}{\max(SI) - E(SI)}
  = \frac{\alpha + \beta \sum_{i,j} m_{ij}^2 - \alpha - \beta\, E\!\left(\sum_{i,j} m_{ij}^2\right)}
         {\max\left(\alpha + \beta \sum_{i,j} m_{ij}^2\right) - \alpha - \beta\, E\!\left(\sum_{i,j} m_{ij}^2\right)}\\
 &= \frac{\beta\left[\sum_{i,j} m_{ij}^2 - E\!\left(\sum_{i,j} m_{ij}^2\right)\right]}
         {\alpha + \beta \max\left(\sum_{i,j} m_{ij}^2\right) - \alpha - \beta\, E\!\left(\sum_{i,j} m_{ij}^2\right)}
  = \frac{\sum_{i,j} m_{ij}^2 - E\!\left(\sum_{i,j} m_{ij}^2\right)}
         {\max\left(\sum_{i,j} m_{ij}^2\right) - E\!\left(\sum_{i,j} m_{ij}^2\right)},
\end{aligned}
\tag{9}
$$

which clearly does not depend on the index itself.

In Proposition 2 it is assumed that I = J and that the cluster sizes are equal. This assumption, although atypical, can be important when two clustering procedures (or two algorithms) are compared. Besides comparing results obtained for a predetermined number of clusters, it might also be important to compare results under the additional requirement of equal (predetermined) cluster sizes. Then Proposition 2 shows that all corrected similarity indices in the family L ∩ H will have the same value.
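A quick numerical illustration of Proposition 2 (a sketch reusing corrected, E_sumsq_exact, and the index lambdas from the previous sketches): for a matrix with I = J = 3 and all row and column totals equal, the corrected versions of several L ∩ H indices come out identical.

```python
import math

FM = lambda a, b, c, d: a / math.sqrt((a + b) * (a + c))
GA = lambda a, b, c, d: (a * d - b * c) / math.sqrt(
    (a + b) * (a + c) * (c + d) * (b + d))

M = [[3, 1, 0],
     [0, 3, 1],
     [1, 0, 3]]                               # all margins equal to 4
for f in (R, H, CZ, K, MC, FM, GA):           # members of L and of H
    print(round(corrected(f, M, E_sumsq_exact), 6))   # 0.3125 every time
```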
5. Final Remarks

We have shed some light on the properties and relationships of similarity indices from the perspective of their usefulness in comparing clusterings. Many similarity indices were originally designed to assess the similarity of species based on the presence/absence of certain features. When comparing clusterings, the counts in Table 1 are the numbers of possible pairs of data points classified in the same cluster according to both clusterings (a), by only one of the clusterings (b or c), or by neither of them (d). Therefore the definitions (1) in our paper are specific to the problem of comparing clusterings.

As explained in the introduction, similarity indices can be used to evaluate a single clustering procedure (the actual grouping is compared with the partition obtained from the procedure) and also to compare two clustering methods (or two algorithms of the same method). Furthermore, the behavior of a similarity index can also be used as an indicator of the proper number of clusters in a data set. Interesting results can be found in Albatineh (2004) and will be the subject of a separate paper.

References

ALBATINEH, A. N. (2004), On Similarity Measures for Cluster Analysis, PhD Dissertation, Kalamazoo, MI: Western Michigan University.

BAULIEU, F. B. (1989), "A Classification of Presence/Absence Based Dissimilarity Coefficients", Journal of Classification, 6, 233-246.

COHEN, J. (1960), "A Coefficient of Agreement for Nominal Scales", Educational and Psychological Measurement, 20, 37-46.

CZEKANOWSKI, J. (1932), ""Coefficient of Racial Likeness" und "Durchschnittliche Differenz"", Anthropologischer Anzeiger, 14, 227-249.

DICE, L. R. (1945), "Measures of the Amount of Ecological Association Between Species", Ecology, 26, 297-302.

EVERITT, B. S., LANDAU, S., and LEESE, M. (2001), Cluster Analysis, New York: Oxford University Press.

FAGER, E. W. and MCGOWAN, J. A. (1963), "Zooplankton Species Groups in the North Pacific", Science, 140, 453-460.

FOWLKES, E. B. and MALLOWS, C. L. (1983), "A Method for Comparing Two Hierarchical Clusterings", Journal of the American Statistical Association, 78, 553-569.

GOODMAN, L. A. and KRUSKAL, W. H. (1954), "Measures of Association for Cross Classifications", Journal of the American Statistical Association, 49, 732-764.

GOWER, J. C. and LEGENDRE, P. (1986), "Metric and Euclidean Properties of Dissimilarity Coefficients", Journal of Classification, 3, 5-48.

GUTTMAN, L. (1941), "An Outline of the Statistical Theory of Prediction", in P. Horst (Ed.), The Prediction of Personal Adjustment, New York: Social Science Research Council.

HAMANN, U. (1961), "Merkmalsbestand und Verwandtschaftsbeziehungen der Farinosae", Willdenowia, 2, 639-768.

HUBERT, L. J. (1977), "Nominal Scale Response Agreement as a Generalized Correlation", British Journal of Mathematical and Statistical Psychology, 30, 98-103.
HUBERT, L. J. and ARABIE, P. (1985), "Comparing Partitions", Journal of Classification, 2, 193-218.

JACCARD, P. (1908), "Nouvelles Recherches sur la Distribution Florale", Bulletin de la Société Vaudoise des Sciences Naturelles, 44, 223-270.

JAIN, A. K. and DUBES, R. C. (1988), Algorithms for Clustering Data, New Jersey: Prentice Hall.

KULCZYNSKI, S. (1927), "Die Pflanzenassociationen der Pieninen", Bulletin International de l'Académie Polonaise des Sciences et des Lettres, Classe des Sciences Mathématiques et Naturelles, Série B, Supplément II, 2, 57-203.

MCCONNAUGHEY, B. H. (1964), "The Determination and Analysis of Plankton Communities", Marine Research, Special No., Indonesia, 1-40.

MILLIGAN, G., SOON, S., and SOKOL, L. (1983), "The Effect of Cluster Size, Dimensionality, and the Number of Clusters on Recovery of True Cluster Structure", IEEE Transactions on Pattern Analysis and Machine Intelligence, PAMI-5, 40-47.

MILLIGAN, G. W. and COOPER, M. C. (1986), "Comparability of External Criteria for Hierarchical Clustering Analysis", Multivariate Behavioral Research, 21, 441-458.

MOREY, L. and AGRESTI, A. (1984), "The Measurement of Classification Agreement: An Adjustment to the Rand Statistic for Chance Agreement", Educational and Psychological Measurement, 44, 33-37.

OCHIAI, A. (1957), "Zoogeographic Studies on the Soleoid Fishes Found in Japan and Its Neighboring Regions", Bulletin of the Japanese Society for Fish Science, 22, 526-530.

PEIRCE, C. S. (1884), "The Numerical Measure of the Success of Predictions", Science, 4, 453-454.

RAND, W. (1971), "Objective Criteria for the Evaluation of Clustering Methods", Journal of the American Statistical Association, 66, 846-850.

ROGERS, D. J. and TANIMOTO, T. T. (1960), "A Computer Program for Classifying Plants", Science, 132, 1115-1118.

RUSSELL, P. F. and RAO, T. R. (1940), "On Habitat and Association of Species of Anopheline Larvae in South-Eastern Madras", Journal of the Malaria Institute of India, 3, 153-178.

SAXENA, P. C. and NAVANEERHAM, K. (1991), "The Effect of Cluster Size, Dimensionality, and Number of Clusters on Recovery of True Cluster Structure Through Chernoff-Type Faces", The Statistician, 40, 415-425.

SAXENA, P. C. and NAVANEERHAM, K. (1993), "Comparison of Chernoff-Type Face and Non-Graphical Methods for Clustering Multivariate Observations", Computational Statistics and Data Analysis, 15, 63-79.

SOKAL, R. R. and MICHENER, C. D. (1958), "A Statistical Method for Evaluating Systematic Relationships", University of Kansas Science Bulletin, 38, 1409-1438.

SOKAL, R. R. and SNEATH, P. H. (1963), Principles of Numerical Taxonomy, San Francisco, CA: Freeman.

SNEATH, P. H. and SOKAL, R. R. (1973), Numerical Taxonomy, San Francisco, CA: Freeman.

WALLACE, D. L. (1983), "A Method for Comparing Two Hierarchical Clusterings: Comment", Journal of the American Statistical Association, 78, 569-576.

YULE, G. U. (1927), "On Reading a Scale", Journal of the Royal Statistical Society, 90, 570-579.