Inequalities on Rank Correlation with Missing Data
BY
TAKIS PAPAIOANNOU and SOTIRIS LOUKAS
Reprinted from
THE JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B (METHODOLOGICAL) Volume 46, No. 1, 1984 (pp. 68-71)
PRINTED FOR PRIVATE CIRCULATION
1984
J. R. Statist. Soc. B (1984), 46, No. 1, pp. 68-71
Inequalities on Rank Correlation with Missing Data
By TAKIS PAPAIOANNOU† and SOTIRIS LOUKAS
University of Ioannina, Greece
[Received March 1982. Revised February 1983]
SUMMARY
New inequalities are obtained for Spearman’s footrule, its associated sum of the absolute values of rank differences, the number of discordant pairs associated with Kendall’s tau and the coefficient of concordance of several rankings when one or more individuals are missing or added to the data.
Keywords: SPEARMAN’S FOOTRULE; KENDALL’S TAU; COEFFICIENT OF CONCORDANCE; MISSING DATA
1. INTRODUCTION
Widely used non-parametric measures of association such as Spearman’s rank correlation coefficient $r_s$ and Kendall’s difference sign correlation $\tau$ are associated with the sum of squared rank differences $SS$ and the number $I$ of discordant pairs. An equally simple but largely neglected competitor is Spearman’s footrule $R$, associated with the sum of absolute values of rank differences $D$. Interest in $D$ has markedly increased due to the recent results of Diaconis and Graham (1977), who study, amongst others, its statistical properties, and of Ury and Kleinecke (1979), who tabulate its distribution for small $n$. In this article inequalities are obtained for $D$ and $I$. The inequalities are applicable when no ties are present and when one or more individuals or ranks are missing or deleted from a set of observations. They are also applicable when new observations are added to the data. Inequalities that lead to improved bounds for $R$, $r_s$, $\tau$ and the coefficient of concordance $w$ of $q$ rankings, when data additions or deletions occur, are also derived. The coefficient of concordance is used when more than two observers rank several individuals or objects and it is desired to investigate the communality of their judgements (cf. Kendall, 1970, p. 94).
2. MAIN RESULTS
Let $\{r(X_i)\}$, $\{r(Y_i)\}$, $i = 1, 2, \ldots, n$, be the rankings of $n$ individuals according to two criteria $X$ and $Y$ respectively. Let
$$SS_n = \sum_{i=1}^{n} \{r(X_i) - r(Y_i)\}^2,$$
$$D_n = \sum_{i=1}^{n} |r(X_i) - r(Y_i)|,$$
$I_n$ = the minimum number of pairwise adjacent transpositions required to bring $\{r(X_1), \ldots, r(X_n)\}$ into the order $\{r(Y_1), \ldots, r(Y_n)\}$, and
† Present address: Dept of Mathematics, The University, Ioannina, Greece. © 1984 Royal Statistical Society
0035-9246/84/46068 $2.00
$T_n$ = the minimum number of transpositions required to bring $\{r(X_1), \ldots, r(X_n)\}$ into the order $\{r(Y_1), \ldots, r(Y_n)\}$.
Let also $r_s(n)$, $R(n)$ and $t(n)$ be the Spearman rank correlation coefficient, the footrule of Spearman and the estimator of Kendall’s $\tau$ for the $n$ individuals respectively. Spearman’s coefficient $r_s(n)$ is a function of $SS_n$, $R(n)$ is a function of $D_n$ and Kendall’s $t(n)$ is a function of $I_n$ since, as shown in Kendall (1970, p. 8), $I_n$ is also equal to the number of discordant pairs in $\{r(Y_i)\}$. The quantities $SS_n$, $D_n$, $I_n$ and $T_n$ are metrics on the set of permutations of $n$ letters (Diaconis and Graham, 1977).
Consider a subset of $m$ ($m < n$) individuals and rank them according to the relative rankings of the original ones for both $X$ and $Y$. Let $SS_m$, $D_m$, $I_m$, $T_m$, $r_s(m)$, $R(m)$ and $t(m)$ be the previous quantities for the reduced number of individuals. Papaioannou and Speevak (1977) have shown that $SS_m \le SS_n$. We shall prove that $D_m \le D_n$ and $I_m \le I_n$ by considering first the elimination of a single individual and then extending the result to the general case. Examples will show that the same relationship does not always hold for $T_n$.
Theorem 1. $D_m \le D_n$.
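The four quantities and the effect of deleting one individual can be checked numerically. The following Python sketch is an illustration only and is not part of the original paper: all function names are hypothetical, and `min_transpositions` assumes the standard identity that the minimum number of transpositions equals $n$ minus the number of cycles of the permutation carrying the $X$-ranks to the $Y$-ranks. It enumerates every untied ranking of small $n$, deletes each individual in turn, re-ranks, and compares the reduced quantities with the original ones.

```python
from itertools import permutations

def sum_sq(rx, ry):
    """SS: sum of squared rank differences."""
    return sum((a - b) ** 2 for a, b in zip(rx, ry))

def sum_abs(rx, ry):
    """D: sum of absolute rank differences."""
    return sum(abs(a - b) for a, b in zip(rx, ry))

def discordant(rx, ry):
    """I: number of discordant pairs (no ties assumed)."""
    n = len(rx)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if (rx[i] - rx[j]) * (ry[i] - ry[j]) < 0
    )

def min_transpositions(rx, ry):
    """T, computed as n minus the number of cycles (assumed identity)."""
    mapping = dict(zip(rx, ry))
    seen, cycles = set(), 0
    for start in mapping:
        if start not in seen:
            cycles += 1
            cur = start
            while cur not in seen:
                seen.add(cur)
                cur = mapping[cur]
    return len(rx) - cycles

def rerank(values):
    """Replace distinct values by their relative ranks 1..m."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def drop(rx, ry, idx):
    """Delete the individual at position idx and re-rank both criteria."""
    keep = [i for i in range(len(rx)) if i != idx]
    return rerank([rx[i] for i in keep]), rerank([ry[i] for i in keep])

def check(n):
    """Assert SS, D and I never increase; collect cases where T increases."""
    base = list(range(1, n + 1))
    t_increases = []
    for perm in permutations(base):
        ry = list(perm)
        for idx in range(n):
            sx, sy = drop(base, ry, idx)
            assert sum_sq(sx, sy) <= sum_sq(base, ry)
            assert sum_abs(sx, sy) <= sum_abs(base, ry)
            assert discordant(sx, sy) <= discordant(base, ry)
            if min_transpositions(sx, sy) > min_transpositions(base, ry):
                t_increases.append((ry, idx))
    return t_increases
```

For instance, with $X$-ranks (1, 2, 3, 4) and $Y$-ranks (4, 2, 3, 1), `min_transpositions` gives 1; deleting the first individual leaves $Y$-ranks (2, 3, 1) against (1, 2, 3), for which it gives 2. This is one case in which the reduced $T$ exceeds the original, while $SS$, $D$ and $I$ do not increase in any case enumerated by `check(4)` or `check(5)`.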
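Proof. Take $r(X_i) = i$ and $r(Y_i) = p_i$, $i = 1, 2, \ldots, n$, and suppose that the $l$th individual is deleted. Then
$$D_{n-1} = \sum_{\substack{i=1 \\ p_i < p_l}}^{l-1} |i - p_i| + \sum_{\substack{i=1 \\ p_i > p_l}}^{l-1} |i - (p_i - 1)| + \sum_{\substack{i=l+1 \\ p_i < p_l}}^{n} |(i-1) - p_i| + \sum_{\substack{i=l+1 \\ p_i > p_l}}^{n} |(i-1) - (p_i - 1)|$$
$$= \sum_{\substack{i=1 \\ i \ne l}}^{n} |i - p_i| + \sum_{\substack{i=1 \\ p_i > p_l}}^{l-1} \{|i + 1 - p_i| - |i - p_i|\} + \sum_{\substack{i=l+1 \\ p_i < p_l}}^{n} \{|i - 1 - p_i| - |i - p_i|\}. \tag{1}$$
Let us put
$$\Sigma_1 = \sum_{\substack{i=1 \\ p_i > p_l}}^{l-1} \{|i + 1 - p_i| - |i - p_i|\} \quad \text{and} \quad \Sigma_2 = \sum_{\substack{i=l+1 \\ p_i < p_l}}^{n} \{|i - 1 - p_i| - |i - p_i|\}.$$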
Let $k$ equal the number of $p_i$’s such that $p_i < p_l$ for $i = 1, 2, \ldots, l-1$. Then $l - 1 - k$ of the $p_i$’s are greater than $p_l$ for $i = 1, 2, \ldots, l-1$. Also, since $p_l - 1$ is the number of $p_i$’s that are less than $p_l$ for $i = 1, 2, \ldots, n$, $p_l - 1 - k$ of the $p_i$’s are less than $p_l$ for $i = l+1, l+2, \ldots, n$. Suppose that $p_l \ge l$. Then each of the terms constituting $\Sigma_1$ is equal to $-1$ and each of the terms constituting $\Sigma_2$ is either $1$ or $-1$ according as $p_i \ge i$ or $p_i < i$. Therefore
$$\Sigma_1 = (-1)(l - 1 - k), \qquad \Sigma_2 \le (p_l - 1 - k),$$
and (1) becomes
$D_{n-1}$