bers depending on n. We are interested in the double-indexed permutation ...... and complicated conditional arguments, which will be presented in the proofs. Ž .
The Annals of Statistics 1997, Vol. 25, No. 5, 2210 ] 2227
ERROR BOUND IN A CENTRAL LIMIT THEOREM OF DOUBLE-INDEXED PERMUTATION STATISTICS BY LINCHENG ZHAO1 , ZHIDONG BAI 2 , CHERN-CHING CHAO 2 AND WEN-QI LIANG2 University of Science and Technology of China, National Sun Yat-Sen University, Academia Sinica and Academia Sinica An error bound in the normal approximation to the distribution of the double-indexed permutation statistics is derived. The derivation is based on Stein’s method and on an extension of a combinatorial method of Bolthausen. The result can be applied to obtain the convergence rate of order ny1 r2 for some rank-related statistics, such as Kendall’s tau, Spearman’s rho and the Mann]Whitney]Wilcoxon statistic. Its applications to graph-related nonparametric statistics of multivariate observations are also mentioned.
1. Introduction. Let z Ž i, j, k, l ., i, j, k, l g N s 1, . . . , n4 , be real numbers depending on n. We are interested in the double-indexed permutation statistics ŽDIPS. of the general form Ý i, j z Ž i, j, p Ž i ., p Ž j .., where p is uniformly distributed on the set Pn of all permutations of N. The DIPS of the restricted form Ý i, j a i j bp Ž i.p Ž j. was first investigated by Daniels Ž1944. in the study of a generalized correlation coefficient with Kendall’s tau and Spearman’s rho being special cases. Daniels gave a set of sufficient conditions for their asymptotic normality as n ª `. Further investigations along this direction have been done by Bloemena Ž1964., Jogdeo Ž1968., Abe Ž1969., Shapiro and Hubert Ž1979., Barbour and Eagleson Ž1986. and Pham, Mocks and ¨ Sroka Ž1989.. In these contexts, the so-called scores a i j and bi j are either symmetric Ž a i j s a ji , bi j s bji . or skew-symmetric Ž a i j s ya ji , bi j s ybji .. The uses of DIPS have diversely been suggested by Friedman and Rafsky Ž1979, 1983. and Schilling Ž1986. in multivariate nonparametric tests, by Hubert and Schultz Ž1976. in clustering studies, by Mantel and Valand Ž1970. in biometry, and by Cliff and Ord Ž1981. in geography. The purpose of this paper is to derive a bound for the error in the normal approximation to the distribution of the DIPS of the general form
Received October 1995; revised December 1996. 1 Research partially supported by National Natural Science Foundation of China, Ph.D. Program Foundation of National Education Committee of China and Special Foundation of Academia Sinica. 2 Research partially supported by National Science Council of the Republic of China. AMS 1991 subject classifications. Primary 60F05, 62E20; secondary 62H20. Key words and phrases. Asymptotic normality, correlation coefficient, graph theory, multivariate association, permutation statistics, Stein’s method.
2210
2211
CLT OF PERMUTATION STATISTICS
Ý i, j z Ž i, j, p Ž i ., p Ž j ... This bound can be used to yield the convergence rate O Ž ny1 r2 . for some well-known statistics, such as Kendall’s tau, Spearman’s rho and the Mann]Whitney]Wilcoxon statistic. In Section 2 the DIPS, Ý i, j z Ž i, j, p Ž i ., p Ž j .., is converted to the form of Ý i aŽ i, p Ž i .. q ny1 ÝXi, j bŽ i, j, p Ž i ., p Ž j ... ŽThroughout this paper, ÝXi, j denotes Ý i, j, i / j .. A Berry]Esseen type of inequality for the latter is stated as Theorem 1. The result for DIPS, straightforwardly implied by Theorem 1, is stated as Theorem 2. In Section 3 the applications of Theorem 2 to Daniels’ generalized correlation coefficient, the number of edges in the random intersection of two graphs and the Mann]Whitney]Wilcoxon statistic are demonstrated. The essential theoretic part of this paper, that is, the proof of Theorem 1, is presented in Section 4. Our derivations are based on Stein’s method Ž1972. and an extension of the combinatorial method of Bolthausen Ž1984.. Bolthausen successfully employed his combinatorial method combined with Stein’s method to obtain a result on the convergence rate for the singleindexed permutation statistics of the form Ý i aŽ i, p Ž i ... Our Theorem 1 reduces to Bolthausen’s result when all bŽ i, j, k, l . s 0. These two methods were also used by Schneller Ž1989. to establish the Edgeworth expansion for general linear rank statistics. There has been little success in establishing the Berry]Esseen bound of order ny1 r2 for general classes of statistics which are asymptotically normally distributed. For the importance of and the historic developments in the study of departures from normality, the reader is referred to an earlier survey paper by Bickel Ž1974.. The possibility of applying Stein’s method in such investigations is also pointed out therein. 2. Main results. For each 4-tuple real array Ž x Ž i, j, k, l .. and each real matrix Ž y Ž i, k .., i, j, k, l g N, the following notation is used: x Ž i, j, k, ? . s ny1
Ý x Ž i, j, k, l .,
x Ž i, j, ? , ? . s ny2
l
x Ž i, ? , ? , ? . s ny3
Ý
x Ž i, j, k, l .,
x Ž?, ? , ? , ? . s ny4
j, k, l
y Ž i, ? . s ny1
Ý x Ž i, j, k, l .,
k, l
Ý
x Ž i, j, k, l .;
i, j, k, l
Ý y Ž i, k .,
y Ž?, ? . s ny2
k
Ý y Ž i, k .,
i, k
and others defined similarly. Let A s Ž aŽ i, k .., i, k g N, be a given real matrix such that
Ž 2.1. Ž 2.2.
a Ž i , ? . s a Ž ?, k . s 0,
Ý a Ž i , k . s n y 1. 2
i, k
Let B s Ž bŽ i, j, k, l .., i, j, k, l g N, be a given 4-tuple real array such that
Ž 2.3.
b Ž i , j, k , ? . s b Ž i , j, ? , l . s b Ž i , ? , k , l . s b Ž ?, j, k , l . s 0.
2212
ZHAO, BAI, CHAO AND LIANG
Consider the random variable Ws
Ý a Ž i , p Ž i . . q ny1 ÝX b Ž i , j, p Ž i . , p Ž j . . , i, j
i
where p is uniformly distributed on Pn . We have the following result. THEOREM 1. There is an absolute constant K ) 0 such that, for n G 2,
½
sup P Ž W F x . y F Ž x . F K ny1 x
Ý aŽ i , k .
3
q ny3
i, k
Ý
b Ž i , j, k , l .
i , j, k , l
where F is the standard normal distribution function. Now, consider the asymptotic normality of the DIPS Ds
Ý z Ž i , j, p Ž i . , p Ž j . . . i, j
For given Ž z Ž i, j, k, l .., i, j, k, l g N, let
z U Ž i , j, k , l . s z Ž i , j, k , l . y z Ž i , j, k , ? . q z Ž i , j, ? , l . qz Ž i , ? , k , l . q z Ž ?, j, k , l . q z Ž i , j, ? , ? . q z Ž i , ? , k , ? . q z Ž i , ? , ? , l . qz Ž ?, j, k , ? . q z Ž ?, j, ? , l . q z Ž ?, ? , k , l . y z Ž i , ? , ? , ? . q z Ž ?, j, ? , ? . q z Ž ?, ? , k , ? . q z Ž ?, ? , ? , l . q z Ž ?, ? , ? , ? . .
Then
z U Ž i , j, k , ? . s z U Ž i , k , ? , l . s z U Ž i , ? , k , l . s z U Ž ?, j, k , l . s 0 and Ds
Ý z U Ž i , j, p Ž i . , p Ž j . . q n Ý z Ž i , ? , p Ž i . , ?. i, j
i
qn Ý z Ž ?, j, ? , p Ž j . . y n 2z Ž ?, ? , ? , ? . j
s
Ý z Ž i , j, p Ž i . , p Ž j . . q Ý a Ž i , p Ž i . . X
U
i, j
s
i
Ý z Ž i , j, p Ž i . , p Ž j . . q Ý aU Ž i , p Ž i . . q naŽ ?, ?. , X
U
i, j
i
where a Ž i , k . s z U Ž i , i , k , k . q n z Ž i , ? , k , ? . q n z Ž ?, i , ? , k . y n z Ž ?, ? , ? , ? . and aU Ž i , k . s a Ž i , k . y a Ž i , ? . y a Ž ?, k . q a Ž ?, ? . . Note that aU Ž i, ? . s aU Ž?, k . s 0. Defining and assuming that
Ž 2.4.
s2s
Ý aU 2 Ž i , k . r Ž n y 1. ) 0,
i, k
3
5
,
2213
CLT OF PERMUTATION STATISTICS
we have D y na Ž ?, ? .
s
s
1
n
Ý s aU Ž i , p Ž i . . q ny1 ÝX s z U Ž i , j, p Ž i . , p Ž j . . . i, j
i
Thus, we can apply Theorem 1 to obtain the following result. THEOREM 2. There is an absolute constant K ) 0 such that, for n G 2, sup P x
F
ž
D y na Ž ?, ? .
K
s3
½
s ny1
Ý
/
F x y FŽ x. aU Ž i , k .
3
i, k
q
Ý
i , j, k , l
z U Ž i , j, k , l .
3
5
,
provided condition Ž2.4. is satisfied. 3. Applications. In this section the applications of Theorem 2 are demonstrated by three examples. In addition to those well-known testing statistics stated below, Theorem 2 reveals the potential for creating new nonparametric testing statistics, especially for multivariate observations, due to its generality. EXAMPLE 1 ŽMann]Whitney]Wilcoxon statistic.. Let x 1 , . . . , x n 1 and y 1 , . . . , yn 2 , n1 q n 2 s n, be independent univariate random samples from unknown continuous distributions FX and FY , respectively. The Mann] Whitney]Wilcoxon statistic for testing the hypothesis H0 : FX s FY is defined to be the total number of pairs Ž x i , yj . for which x i - yj . Let p Ž i ., i s 1, . . . , n1 , denote the rank of x i and p Ž n1 q j ., j s 1, . . . , n 2 , denote that of yj in the combined sample. Then the Mann]Whitney]Wilcoxon statistic can be expressed as Ý i, j z Ž i, j, p Ž i ., p Ž j .., where
½
1, if 1 F i F n1 , n1 q 1 F j F n and 1 F k - l F n, 0, otherwise; and p is uniformly distributed on Pn under H0 . Applying Theorem 2 with straightforward calculations, we obtain
z Ž i , j, k , l . s
sup P x
ž
Ý i , j z Ž i , j, p Ž i . , p Ž j . . y 12 n1 n 2
Ž
1 12
n1 n 2 Ž n q 1 . .
1r2
/
y1 F x y F Ž x . F K Ž ny1 1 q n2 .
1r2
.
The Mann]Whitney]Wilcoxon statistic is one of the members of U-statistics of degree two. The Berry]Esseen bounds and the Edgeworth expansions for U-statistics have been extensively studied; see Bickel, Gotze ¨ and van Zwet Ž1986. and the references therein. For a systematic presentation of the theory of U-statistics, the reader is referred to Koroljuk and Borovskich Ž1994.. EXAMPLE 2 ŽDaniels’ generalized correlation coefficient.. Let Ž dŽ i, j .. and Ž eŽ i, j .., i, j g N, be two real matrices. Daniels Ž1944. considers a generalized correlation coefficient Ý i, j dŽ i, j . eŽp Ž i ., p Ž j .., where the scores dŽ i, j . and eŽ i, j . are skew-symmetric and p is uniformly distributed on Pn . Applying
2214
ZHAO, BAI, CHAO AND LIANG
Theorem 2, we have sup P x
F
Ž 3.1.
ž
Ýi , j d Ž i , j . e Ž p Ž i . , p Ž j . . K
s3
½
s n2 Ý d Ž i , ?. e Ž k , ?.
/
F x y FŽ x.
3
i, k
Ý Ž d Ž i , j . y d Ž i , ?. y d Ž ?, j . .
q
i , j, k , l
= Ž e Ž k , l . y e Ž k , ? . y e Ž ?, l . .
3
5
,
where s 2 s 4 n 2 Ž n y 1.y1 Ý i, k d 2 Ž i, ? . e 2 Ž k, ? .. Consider ordered pairs of univariate observations Ž x i , yi ., i g N. Kendall’s tau Žletting dŽ i, j . s signŽ x i y x j . and eŽ i, j . s signŽ yi y yj .. and Spearman’s rho Ž dŽ i, j . s rankŽ x i . y rankŽ x j . and eŽ i, j . s rankŽ yi . y rankŽ yj .. are two statistics for testing the hypothesis H0 : no correlation between X and Y. Applying Ž3.1., we conclude that the null distribution of both Žstandardized. statistics converges to F Ž x . with the rate O Ž ny1 r2 .. EXAMPLE 3 ŽNumber of edges in the random intersection of two graphs.. Friedman and Rafsky Ž1983. extend the notion of association measures for univariate observations, such as Kendall’s tau, to multivariate observations. The lack of ordering in multivariate observations is conquered by constructing interpoint-distance based graphs, such as the k minimal spanning tree and the k nearest-neighbor graph. Then, various measures for association or others can be defined in terms of the number of edges in the intersection of two graphs. The reader is referred to Friedman and Rafsky Ž1979, 1983. for details. Now, let G 1Ž N, E1 . and G 2 Ž N, E2 . be two graphs consisting of the same set of nodes, N s 1, . . . , n4 , and sets of edges E1 and E2 , respectively. The number of edges in the random intersection of G1 and G 2 is defined as G s Ý i, j IŽ i, j.g E14 IŽp Ž i., p Ž j.. g E 2 4. Here, ID denotes the indicator of the set D. Let d i denote the degree of node i in G 1 , that is, the number of edges in E1 that are incident to i. Let r 1 denote the total number of edges in E1. Then r 1 s 12 Ý i d i . Similarly, define the degree dXi of node i in G 2 and the total number r 2 of edges in E2 . Applying Theorem 2, we obtain sup P x
F
ž
G y 4 ny2 Ž 1 q ny1 . r 1 r 2 K
s
3
½
s ny4
/
F x y FŽ x.
Ý Ž d i y 2 ny1r 1 .Ž dXk y 2 ny1r 2 .
3
i, k
q
Ý Ž IŽ i , j.g E 4 y ny1 Ž d i q d j . q 2 ny2r 1 .
i , j, k , l
1
= Ž IŽ k , l .g E 2 4 y ny1 Ž dXk q dXl . q 2 ny2r 2 .
3
5
,
2215
CLT OF PERMUTATION STATISTICS
where s 2 s 4 ny3 Ý i, k Ž d i y 2 ny1r 1 . 2 Ž dXk y 2 ny1r 2 . 2 . Note that if the degrees d i and dXi of each node grow linearly with n, then the convergence rate reaches O Ž ny1 r2 .. 4. Proof of Theorem 1. In order to create sufficient independence needed in the proof of Theorem 1, we first extend Bolthausen’s combinatorial method as follows. Define a random vector Ž I1 , J1 , I2 , J2 , K 1 , L1 , K 2 , L2 . in N 8 in the following way: first, let Ž I1 , J1 ., Ž I2 , J2 . and Ž K 1 , L1 . be independent and identically distributed with P Ž I1 s i , J1
¡0, 1 ¢n Ž n y 1 . ,
s j . s~
if i s j, if i / j.
Given these I1 / J1 , I2 / J2 and K 1 / L1 , Ž K 2 , L 2 . and its conditional distribution are defined according to the following rules: 1. If I1 s I2 and J1 s J2 , then K 2 s K 1 and L2 s L1. 2. If I1 s I2 and J1 / J2 , then K 2 s K 1 and L2 is uniformly distributed on N y K 1 , L1 4 . 3. If I1 / I2 and J1 s J2 , then L2 s L1 and K 2 is uniformly distributed on N y K 1 , L1 4 . 4. If I1 / I2 and J1 / J2 , then K 2 / K 1 , L2 / L1 , K 2 / L 2 , and furthermore, Ža. if I1 s J2 and I2 s J1 , then K 2 s L1 and L 2 s K 1; Žb. if I1 s J2 and I2 / J1 , then K 2 s L1 and L2 is uniformly distributed on N y K 1 , L1 4 ; Žc. if I1 / J2 and I2 s J1 , then L2 s K 1 and K 2 is uniformly distributed on N y K 1 , L1 4 ; Žd. if I1 / J2 and I2 / J1 , then Ž K 2 , L2 . is uniformly distributed on the set of all ordered pairs of distinct elements in N y K 1 , L1 4 . Next, let p 1 be a random permutation which is uniformly distributed on Pn and independent of Ž I1 , J1 , I2 , J2 , K 1 , L1 , K 2 , L2 .. Define y1 y1 y1 I3 s py1 1 Ž K 1 . , J 3 s p 1 Ž L1 . , I4 s p 1 Ž K 2 . , J4 s p 1 Ž L 2 . ,
K 3 s p 1 Ž I1 . , L3 s p 1 Ž J1 . , K 4 s p 1 Ž I2 . , L4 s p 1 Ž J2 . , and denote I s Ž I1 , I2 , I3 , I4 ., and similarly for J, K and L. Thus, I1 s I2 m I3 s I4 , J1 s J2 m J3 s J4 , I1 s J2 m I4 s J3 and I2 s J1 m I3 s J4 . Let Ms
½Ži, j. g N
8
: i m / jm , m s 1, . . . , 4, and satisfy the equivalence relations i 1 s i 2 m i 3 s i 4 , j1 s j2 m j3 s j4 ,
5
i 1 s j2 m i 4 s j3 and i 2 s j1 m i 3 s j4 . For each Ž i, j . g M, we fix once and for all permutations t 1Ž i, j . and t 2 Ž i, j . of N with the properties described in Table 1.
2216
ZHAO, BAI, CHAO AND LIANG TABLE 1 Definition of the permutations t1Ž i, j . and t 2 Ž i, j . i1
j1
i2
j2
t 1Ž i, j .
i4
j4
i3
j3
t 2Ž i, j .
i2
j2
i3
j3
i4
j4
N y i 1 , . . . , i 4 , j1 , . . . , j 4 4
g i 1 , . . . , i 4 , j1 , . . . , j 4 4
g i 1 , i 2 , j1 , j 2 4
Remain fixed
Finally, define p 2 s p 1 ( t 1Ž I, J . and p 3 s p ( t 2 Ž I, J .. Summarizing the above definitions, we have Table 2 which shows how p 1 , p 2 and p 3 map I and J. LEMMA 1. Ži. The terms p 1 , p 2 , and p 3 are independent of Ž I, J . and have the same law. Žii. The term p 1 is independent of Ž I1 , J1 , K 1 , L1 , I2 , J2 , K 2 , L2 . and p 2 is independent of Ž I1 , J1 , K 1 , L1 .. PROOF. Ži. For given p 0 g Pn and each Ž i, j . g M, by the definition of p 1 and the independence of p 1 and Ž Im , Jm , K m , L m , m s 1, 2., P Ž I s i , J s j, p 1 s p 0 .
s P Ž Im s i m , Jm s jm , K m s p 0 Ž i mq2 . , L m s p 0 Ž jmq2 . , m s 1, 2 . =P Ž p 1 s p 0 . s P Ž Ž I1 , I2 , K 1 , K 2 . s i , Ž J1 , J2 , L1 , L2 . s j . P Ž p 1 s p 0 . . Summation over all p 0 g Pn gives that Ž I, J . and ŽŽ I1 , I2 , K 1 , K 2 ., Ž J1 , J2 , L1 , L2 .. have the same law. Hence, p 1 is independent of Ž I, J .. It then implies that
ž
P Ž I s i , J s j, p 2 s p 0 . s P Ž I s i , J s j . P p 1 s p 0 ( ty1 1 Ži, j. s
1 n!
/
P Ž I s i, J s j..
The assertion for p 2 follows. The assertion for p 3 can be proved similarly. Žii. Similarly to Ži., by the independence of p 2 and Ž I, J ., the assertion for p 2 follows. I TABLE 2 Values of I and J under p 1 , p 2 , p 3 I1
J1
I2
J2
I3
J3
I4
J4
Ny I1 , . . . , I4 , J 1 , . . . , J 4 4
p1
K3
L3
K4
L4
K1
L1
K2
L2
gNy K 1 , . . . , K 4 , L 1 , . . . , L 4 4
p3
K2
L2
K1
L1
p3
K1
L1
g K 1 , K 2 , L 1 , L 2 4
g K 1 , . . . , K 4 , L 1 , . . . , L 4 4
Same as p 1
Same as p 2
Same as p 1
2217
CLT OF PERMUTATION STATISTICS
PROOF OF THEOREM 1. We use c to denote a positive constant which depends only on the formula where it appears. It may stand for different values even in consecutive inequalities. Let T s Ý i aŽ i, p Ž i .. and S s ÝXi, j bŽ i, j, p Ž i ., p Ž j ... Using Ž2.1., Ž2.2. and Ž2.3., we easily obtain
Ž 4.1. Ž 4.2.
ET s 0,
ES s
1 n Ž n y 1.
ES 2 F cny2
ET 2 s 1 and
Ý bŽ i , i , k, k . ,
i, k
Ý
b 2 Ž i , j, k , l . .
i , j, k , l
Let a A s Ý i, k < aŽ i, k .< 3 and bB s Ý i, j, k, l < bŽ i, j, k, l .< 3. Then, by Ž2.2. and Jensen’s inequality,
Ž 4.3.
a A G cn1r2 ,
n G 2,
and
Ž 4.4.
ny3 y1r2
Ý
i , j, k , l
b 2 Ž i , j, k , l . F c Ž ny3bB q ny1 r2 . .
For arbitrary but fixed n 0 G 8 and « 0 ) 0, the statement of the theorem is true if 2 F n F n 0 or a A q ny2bB ) « 0 n. Therefore, we assume that n ) n 0 and a A q ny2bB F « 0 n, where n 0 and « 0 will be specified later on but n 0 ) 8. For g ) 0, let Mn Ž g . s Ž A, B . : A and B satisfy Ž 2.1. , Ž 2.2. , Ž 2.3. and a A q ny2bB F g 4 . For large n, we may assume that g G 1 due to Ž4.3.. For z, x g R, l ) 0, define h z , l Ž x . s Ž Ž 1 q Ž z y x . rl . n 1 . k 0 and
h z , 0 Ž x . s IŽy` , z x Ž x . .
Let
d Ž l , g , n . s sup Eh z , l Ž W . y F Ž h z , l . : z g R, Ž A, B . g Mn Ž g . 4 and d Žg , n. s d Ž0, g , n.. Here, F Ž h z, l . is the standard normal expectation of h z, l . Thus,
Ž 4.5.
d Ž g , n . F d Ž l , g , n . q l Ž 2p .
y1 r2
and what we aim to prove is
Ž 4.6.
sup n d Ž g , n . rg : g G 1, n ) n 0 4 - `.
From now on, we write h instead of h z, l for convenience. To use Stein’s method, we let f Ž x . s exp Ž x 2r2 .
Hy` Ž h Ž t . y F Ž h . . exp Žyt r2. dt , x
2
which satisfies the differential equation f X Ž x . y xf Ž x . s hŽ x . y F Ž h.. Also, as stated in Bolthausen Ž1984., page 381,
Ž 4.7.
f Ž x . F 1, xf Ž x . F 1, f X Ž x . F 2 for all x g R
2218
ZHAO, BAI, CHAO AND LIANG
and
ž
Ž 4.8.
f X Ž x q y . y f X Ž x . F < y < 1 q 2 < x < q ly1
H0 I 1
w z , zq l x
/
Ž x q sy . ds .
To prove the theorem, we fix Ž A, B . g MnŽg . and estimate
Ž 4.9.
Eh Ž W . y F Ž h . s Ef X Ž W . y EWf Ž W . . By the same truncation used in Bolthausen Ž1984., pages 381 and 382, we may assume a Ž i , k . F 1 for all i , k g N. Ž 4.10. Denote the set of these pairs Ž A, B . g MnŽg . by Mn0 Žg .. To prove Theorem 1, we need only estimate < Ef X ŽW . y ETf ŽW .< and < Eny1 Sf ŽW .