Accurate Approximation of Correlation Coefficients by Short Edgeworth-Chebyshev Expansion and Its Statistical Applications

Gerd Christoph, Vladimir V. Ulyanov, and Yasunori Fujikoshi

Abstract In Christoph, Prokhorov and Ulyanov (Theory Probab Appl 40(2):250-260, 1996) we studied properties of high-dimensional Gaussian random vectors. Yuri Vasil'evich Prokhorov initiated these investigations, and in the present paper we continue them. Computable error bounds of order $O(n^{-3})$ or $O(n^{-2})$ are obtained for the approximations of sample correlation coefficients and of the angle between high-dimensional Gaussian vectors by the standard normal law. We give some numerical results as well. Moreover, different types of Bartlett corrections are suggested.

Keywords High-dimensional Gaussian random vectors • Sample correlation coefficient • Short Edgeworth-Chebyshev expansions • Computable error bound • Bartlett correction • Fisher transform

Mathematics Subject Classification (2010): Primary 62H10; Secondary 62E20

G. Christoph (✉)
Department of Mathematics, University of Magdeburg, Postfach 4120, D-39016 Magdeburg, Germany
e-mail: [email protected]

V.V. Ulyanov
Faculty of Computational Mathematics and Cybernetics, Moscow State University, Vorobyevy Gori, 119899 Moscow, Russia
e-mail: [email protected]

Y. Fujikoshi
Emeritus Professor, Graduate School of Science, Hiroshima University, Higashi-Hiroshima 739-8526, Japan
e-mail: fujikoshi [email protected]

A.N. Shiryaev et al. (eds.), Prokhorov and Contemporary Probability Theory, Springer Proceedings in Mathematics & Statistics 33, DOI 10.1007/978-3-642-33549-5_13, © Springer-Verlag Berlin Heidelberg 2013


1 Introduction

In the present paper we continue to study properties of high-dimensional Gaussian random vectors, and we obtain new results for basic statistics connected with high-dimensional vectors. In Christoph, Prokhorov and Ulyanov [2] two-sided bounds were constructed for the probability density function $p(u, a)$ of the random variable $|Y - a|^2$, where $Y$ is a Gaussian random element with zero mean in a Hilbert space $H$. The constructed bounds are sharp in the sense that, for all sufficiently large $u$, the ratio of the upper bound to the lower one equals 8 and does not depend on any parameters of the distribution of $|Y - a|^2$. The results hold for the finite-dimensional space $H = \mathbb{R}^d$ as well, provided that its dimension $d \ge 3$.

In Kawaguchi, Ulyanov and Fujikoshi [8] geometric representations of $N$ observations on $n$ variables were studied. It is useful to describe the asymptotic behavior of the following statistics:

• the length of an $n$-dimensional observation vector,
• the distance between two independent observation vectors, and
• the angle between these vectors.

In Hall, Marron and Neeman [6] the asymptotic distributions of these statistics were pointed out in a high-dimensional framework in which the dimension $n$ tends to infinity while the sample size $N$ is fixed. In Kawaguchi, Ulyanov and Fujikoshi [8] we obtained computable error bounds for the approximations of the length and the distance. The aim of the present paper is to get computable error bounds for the angle. Moreover, in order to construct the bounds we study approximations for the sample correlation coefficients.

Assume that $X_1, \ldots, X_N$ is a sample from a normal distribution $N(0, I_n)$ with zero mean and identity covariance matrix $I_n$. Hall, Marron and Neeman [6] showed that

$$\theta = \mathrm{ang}(X_i, X_j) = \frac{\pi}{2} + O_p\big(n^{-1/2}\big), \qquad i, j = 1, \ldots, N, \; i \ne j, \tag{1}$$

where $O_p$ denotes the stochastic order. Since

$$\cos\theta = \frac{\|X_i\|^2 + \|X_j\|^2 - \|X_i - X_j\|^2}{2\,\|X_i\|\,\|X_j\|} = R_{ij},$$

where $R_{ij}$ is the sample correlation coefficient for the vectors $X_i$ and $X_j$, computable error bounds for $\theta$ will follow from computable error bounds for $R_{ij}$. Below we omit the indices $i$ and $j$ and write simply $R = R_{ij}$. There are many results on asymptotic properties of $R$, see e.g. Johnson, Kotz and Balakrishnan [7], Chap. 32. Some of the most precise approximations of the distributions of $R$ and of Fisher's normalizing and variance-stabilizing z-transform

$$Z(R) = \frac{1}{2}\,\ln\frac{1 + R}{1 - R} \tag{2}$$

by short Edgeworth-Chebyshev expansions were suggested by Konishi [9]. The remainder terms there have the order $O(n^{-3/2})$, and the accuracy of the proposed approximations was examined by comparing the short Edgeworth-Chebyshev expansions with the exact values due to David [4]. However, our paper is the first one containing computable error bounds for these approximations.

The structure of the paper is the following. In Sect. 2 we consider the sample correlation coefficient and the angle between the involved vectors. In Sect. 3 some asymptotics for the constant factor with the Gamma functions in the density function of the correlation coefficient are given. Computable error bounds of order $O(n^{-3})$ or $O(n^{-2})$ are constructed in Sect. 4, where the distributions of $R$ or of the angle between the vectors are approximated by short asymptotic expansions using one of the representations for the probability density of $R$. In Sect. 5 some Bartlett-type corrections are considered. A new transform of $R$, similar to the Fisher transform, is constructed; it can be approximated by the normal distribution up to the order $O(n^{-2})$. In Sect. 6 we give an error bound, also of order $O(n^{-2})$, as a corollary of general results for scale-mixed distributions, see Fujikoshi, Ulyanov and Shimizu [5], Chap. 13, and of the fact that $\sqrt{n-2}\,R/\sqrt{1-R^2}$ has Student's t-distribution with $n-2$ degrees of freedom. The last Sect. 7 contains the proofs.
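Before turning to Sect. 2, the following minimal simulation sketch (ours, not part of the paper; all function names are ad hoc) illustrates the objects introduced above: for independent $X_i, X_j \sim N(0, I_n)$ the angle concentrates around $\pi/2$ at the rate $n^{-1/2}$ as in (1), $\cos\theta$ coincides with the sample correlation coefficient $R_{ij}$, and (2) is Fisher's z-transform.

```python
# Simulation sketch (ours) for (1) and (2): the angle between independent
# N(0, I_n) vectors concentrates at pi/2, and cos(theta) equals R_ij.
import numpy as np

rng = np.random.default_rng(0)

def angle_and_correlation(n, n_pairs=20000):
    X = rng.standard_normal((n_pairs, n))
    Y = rng.standard_normal((n_pairs, n))
    # R_ij = <X, Y> / (||X|| ||Y||), which is the cosine identity from the text
    R = np.sum(X * Y, axis=1) / (np.linalg.norm(X, axis=1) * np.linalg.norm(Y, axis=1))
    return R, np.arccos(R)

for n in (10, 100, 1000):
    R, theta = angle_and_correlation(n)
    Z = 0.5 * np.log((1 + R) / (1 - R))      # Fisher's z-transform (2)
    print(f"n={n:5d}  sd(theta - pi/2)={np.std(theta - np.pi / 2):.4f}"
          f"  n^(-1/2)={n ** -0.5:.4f}  sd(sqrt(n)*Z)={np.std(np.sqrt(n) * Z):.4f}")
```

The printed standard deviations of $\theta - \pi/2$ track $n^{-1/2}$, while $\sqrt{n}\,Z(R)$ stays close to the standard normal scale.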

2 Sample Correlation Coefficient and Angle Between Vectors

Let $X = (X_1, \ldots, X_n)^T$ and $Y = (Y_1, \ldots, Y_n)^T$ be two vectors from an $n$-dimensional normal distribution $N(0, I_n)$ with zero mean and identity covariance matrix $I_n$, and let

$$R = R(X, Y) = \frac{\sum_{k=1}^n X_k Y_k}{\sqrt{\sum_{k=1}^n X_k^2\;\sum_{k=1}^n Y_k^2}}$$

be the sample correlation coefficient. The so-called null density function $p_R(r; n)$ of $R$ is given in Johnson, Kotz and Balakrishnan [7], Chap. 32, formula (32.7):

$$p_R(r; n) = \frac{\Gamma((n-1)/2)}{\sqrt{\pi}\;\Gamma((n-2)/2)}\left(1 - r^2\right)^{(n-4)/2} I_{(-1,\,1)}(r), \qquad n \ge 5, \tag{3}$$

where $I_A(x)$ denotes the indicator function of a set $A$. For $n = 2$ the statistic $R$ is two-point distributed with $P(R = -1) = P(R = 1) = 1/2$, and for $n = 3$ it is U-shaped with density $p_R(r; 3) = \pi^{-1}\,(1 - r^2)^{-1/2}\,I_{(-1,1)}(r)$. The sample correlation coefficient $R$ is uniform for $n = 4$: $p_R(r; 4) = \tfrac{1}{2}\,I_{(-1,1)}(r)$. Moreover, for $n \ge 5$ the density function $p_R(r; n)$ is unimodal.

Consider now the standardized correlation coefficient $R^* = \sqrt{n - c}\;R$ with some correcting real constant $c < n$, having density

$$p_{R^*}(r; n, c) = \frac{\Gamma((n-1)/2)}{\sqrt{\pi\,(n-c)}\;\Gamma((n-2)/2)}\left(1 - \frac{r^2}{n-c}\right)^{(n-4)/2} I_{\{|r| < \sqrt{n-c}\}}(r).$$

Recall that $\sqrt{n-2}\,R/\sqrt{1 - R^2} = T_{n-2}$ has Student's t-distribution with $n - 2$ degrees of freedom. Put, for $-\infty < x < \infty$,

$$\Phi_{1,4}(n, x) = \Phi(x) - \frac{x^3 + x}{4\,n}\,\varphi(x) \qquad\text{and}\qquad g(x) = \frac{x\,\sqrt{n-2}}{\sqrt{1 - x^2}}.$$

Since the function $g(x)$ is increasing, we have for any constant $c$ with $c < n$

$$P\left(\sqrt{n-c}\;R \le x\right) = P\left(g(R) \le g\big(x/\sqrt{n-c}\big)\right) = P\left(T_{n-2} \le g\big(x/\sqrt{n-c}\big)\right).$$

Therefore, by (32) we get

$$\sup_x \left|\,P\left(\sqrt{n-c}\;R \le x\right) - \Phi_{1,4}\!\left(n-2,\; g\big(x/\sqrt{n-c}\big)\right)\right| \;\le\; \frac{6\,(n+2)}{(n-2)^3}. \tag{33}$$
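The two ingredients of (33) can be checked numerically. The sketch below is ours; the explicit form of $\Phi_{1,4}$ is the reconstruction given above and should be treated as an assumption.

```python
# Sketch (ours): (i) the exact Student-t representation of P(sqrt(n-c) R <= x),
# (ii) the accuracy of Phi_{1,4}(m, y) = Phi(y) - (y^3 + y) phi(y) / (4 m).
import numpy as np
from scipy.stats import norm, t

rng = np.random.default_rng(1)
n, c = 30, 2.5
m = n - 2

def g(y):                                    # g(y) = y sqrt(n-2) / sqrt(1 - y^2)
    return y * np.sqrt(m) / np.sqrt(1.0 - y ** 2)

def phi14(df, y):                            # reconstructed short expansion of the t CDF
    return norm.cdf(y) - (y ** 3 + y) * norm.pdf(y) / (4.0 * df)

# Monte Carlo sample of R for N(0, I_n) vectors
X = rng.standard_normal((200000, n))
Y = rng.standard_normal((200000, n))
R = np.sum(X * Y, axis=1) / (np.linalg.norm(X, axis=1) * np.linalg.norm(Y, axis=1))

for x in (0.5, 1.0, 2.0):
    mc = np.mean(np.sqrt(n - c) * R <= x)            # empirical P(sqrt(n-c) R <= x)
    exact = t.cdf(g(x / np.sqrt(n - c)), df=m)       # exact value via Student's t
    approx = phi14(m, g(x / np.sqrt(n - c)))
    print(f"x={x}: MC={mc:.4f}  t-CDF={exact:.4f}  Phi_14={approx:.4f}")
```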

Using (33), we can obtain results similar to Theorems 1 and 2. However, the upper bounds for the errors of approximation, say $M_n$, will be worse than the right-hand sides of the inequalities (19) and (20). In fact, according to (33) we shall have for all $n > 2$ that $(n-2)^2\,M_n > 6$. Compare this with the values of $B_n(\varepsilon)$ in Table 1. It is not surprising that (33) implies the worse result, because in Theorems 1 and 2 we used essentially the representation (4) and, in particular, the properties of the Gamma function, whereas Theorem 13.2.3 in Fujikoshi, Ulyanov and Shimizu [5] is obtained for general distributions of scale mixtures.

7 Proofs

Proof of Lemma 1. The error term estimates for the asymptotic expansion of the logarithm of the Gamma function in Abramowitz and Stegun [1], formula (6.1.42), imply

$$\frac{1}{12\,x} - \frac{1}{360\,x^3} \;\le\; \ln\Gamma(x) - \left(x - \frac{1}{2}\right)\ln x + x - \frac{1}{2}\,\ln(2\pi) \;\le\; \frac{1}{12\,x}, \qquad x > 0. \tag{34}$$
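A quick numerical confirmation of (34) (our own check, not part of the original proof):

```python
# Check (34) on a grid: 1/(12x) - 1/(360 x^3) <= lnGamma(x) - ((x-1/2) ln x - x + ln(2 pi)/2) <= 1/(12x).
import numpy as np
from scipy.special import gammaln

x = np.linspace(0.5, 50.0, 2000)
rem = gammaln(x) - ((x - 0.5) * np.log(x) - x + 0.5 * np.log(2 * np.pi))
assert np.all(1 / (12 * x) - 1 / (360 * x ** 3) <= rem) and np.all(rem <= 1 / (12 * x))
print("bound (34) holds on the tested grid")
```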

Consider now for $x \ge 1$ the function

$$h(x) := \ln\frac{\Gamma(x)}{\Gamma(x - 1/2)} - \left(x - \frac{1}{2}\right)\ln x + (x - 1)\,\ln\!\left(x - \frac{1}{2}\right) + \frac{1}{2}. \tag{35}$$

Taking into account (34) and the analogous inequalities for the argument $x - 1/2$ we find

$$a(x) := \frac{1}{12\,x} - \frac{1}{360\,x^3} - \frac{1}{12\,(x - 1/2)} \;\le\; h(x) \;\le\; \frac{1}{12\,x} - \frac{1}{12\,(x-1/2)} + \frac{1}{360\,(x-1/2)^3} =: b(x). \tag{36}$$

Using

$$\frac{1}{x} - \frac{1}{x - 1/2} = -\frac{1}{2\,x\,(x - 1/2)} = -\frac{1}{2\,(x-1/2)^2} + \frac{1}{4\,x\,(x-1/2)^2}$$

we obtain for $x \ge 1$

$$a(x) = -\frac{1}{24\,(x-1/2)^2} + \frac{1}{48\,x\,(x-1/2)^2} - \frac{1}{360\,x^3} \tag{37}$$

and

$$b(x) = -\frac{1}{24\,(x-1/2)^2} + \frac{17}{720\,(x-1/2)^3} - \frac{1}{96\,x\,(x-1/2)^3}. \tag{38}$$

Recall some well-known inequalities, where $k$ is an integer:

$$0 \;\le\; -\ln(1 - z) - z - \cdots - \frac{z^k}{k} \;\le\; \frac{z^{k+1}}{k+1} + \frac{z^{k+2}}{(k+2)\,(1-z)}, \qquad 0 \le z < 1,\; k \ge 1, \tag{39}$$

$$0 \;\le\; \ln(1 + z) - z + \frac{z^2}{2} - \frac{z^3}{3} + \frac{z^4}{4} \;\le\; \frac{z^5}{5}, \qquad 0 \le z < 1, \tag{40}$$

and, for integer $k \ge 0$,

$$0 \;\le\; \operatorname{sgn}^{k+1}(z)\left(e^z - 1 - z - \cdots - \frac{z^k}{k!}\right) \;\le\; \begin{cases} z^{k+1}\,e^{z}/(k+1)!, & z \ge 0,\\[2pt] (-z)^{k+1}/(k+1)!, & z < 0. \end{cases} \tag{41}$$

For $y > 1$ we define the function

$$g(y) := h(y + 1/2) - \ln\frac{\Gamma(y + 1/2)}{\Gamma(y)} + \frac{1}{2}\,\ln y = -\,y\,\ln\!\left(1 + \frac{1}{2y}\right) + \frac{1}{2}.$$

The inequalities (40) for $z = 1/(2y)$ lead to upper and lower bounds for $g(y)$:

$$-\frac{1}{160\,y^4} \;\le\; g(y) - \frac{1}{8\,y} + \frac{1}{24\,y^2} - \frac{1}{64\,y^3} \;\le\; 0, \qquad y > 1. \tag{42}$$

Next we are going to estimate the function

$$R(y) := h(y + 1/2) - g(y) = \ln\frac{\Gamma(y + 1/2)}{\Gamma(y)} - \frac{1}{2}\,\ln y.$$

Suppose $m := n - 2 \ge 5$. Using (36)-(38) and (42) with $y = x - 1/2 = m/2$ and

$$\frac{1}{6\,(m+1)\,m^2} - \frac{1}{45\,(m+1)^3} - \frac{1}{8\,m^3} = \frac{7u^3 + 90u^2 + 300u + 80}{360\,(m+1)^3\,m^3} > 0, \qquad u = m - 5,$$

to obtain the lower bound, we find

$$-1 < -\frac{1}{4\,m} \;\le\; R(m/2) \;\le\; -\frac{1}{4\,m} + \frac{23}{360\,m^3} < 0. \tag{43}$$

Since $A_n = e^{R((n-2)/2)} = e^{R(m/2)}$ with $-1 < R(m/2) < 0$, we find $A_n < 1$ and define $r_1(m) := e^{R(m/2)} - 1 - R(m/2) - R^2(m/2)/2$ and $r_2(m) := R(m/2) + 1/(4m)$. Making use of (41) with $k = 2$ for $-1 < z < 0$ and of (43) we find

$$-\frac{1}{384\,m^3} \le r_1(m) \le 0, \qquad 0 \le r_2(m) \le \frac{23}{360\,m^3}, \qquad -\frac{23}{1440\,m^4} \le \frac{R^2(m/2)}{2} - \frac{1}{32\,m^2} \le 0,$$

which lead to (9) for $m = n - 2 \ge 5$.
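The bound (43) for $R(m/2)$, and hence the approximation of $A_n$ behind (9), can be verified numerically. The snippet below is our own check of the reconstructed inequality.

```python
# Check (ours): -1/(4m) <= R(m/2) <= -1/(4m) + 23/(360 m^3) for m = n - 2 >= 5,
# where R(m/2) = lnGamma((m+1)/2) - lnGamma(m/2) - (1/2) ln(m/2).
import numpy as np
from scipy.special import gammaln

m = np.arange(5, 501, dtype=float)
Rm = gammaln((m + 1) / 2) - gammaln(m / 2) - 0.5 * np.log(m / 2)
assert np.all(-1 / (4 * m) <= Rm)
assert np.all(Rm <= -1 / (4 * m) + 23 / (360 * m ** 3))
print("bounds for R(m/2) hold for 5 <= m <= 500")
```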

Let now $c = 5/2$ and put $N = n - 2.5$. Note that $A_n^* = \sqrt{1 + 1/(2N)}\;A_n$; then by (9)

$$\left|\,A_n^* - \left(1 + \frac{1}{2N}\right)^{1/2} + \frac{1}{4N}\left(1 + \frac{1}{2N}\right)^{-1/2} - \frac{1}{32\,N^2}\left(1 + \frac{1}{2N}\right)^{-3/2}\right| \;\le\; \frac{23}{360\,N^3}.$$

The binomial series of $(1 + x)^\alpha$ for $0 \le x \le 1$ and $\alpha \in \{-3/2,\,-1/2,\,1/2\}$, see Abramowitz and Stegun [1], formula (3.6.9), imply

$$-\frac{3}{4N} \le \left(1 + \frac{1}{2N}\right)^{-3/2} - 1 \le 0, \qquad 0 \le \left(1 + \frac{1}{2N}\right)^{-1/2} - 1 + \frac{1}{4N} \le \frac{3}{32\,N^2}$$

and

$$0 \le \left(1 + \frac{1}{2N}\right)^{1/2} - 1 - \frac{1}{4N} + \frac{1}{32\,N^2} \le \frac{1}{128\,N^3}.$$

Hence (10) holds for $n \ge 7$. □

Proof of Theorem 1. Let $F_n(x)$ be the distribution function of the standardized correlation coefficient $R^*$ having density (4) with $c = 2.5$, see (6). Put

$$\Phi_n(x) := \Phi(x) + \varphi(x)\left(\frac{x^3}{4\,(n - 2.5)} + \frac{-3x^7 + 13x^5 + 2x^3 + 6x}{96\,(n - 2.5)^2}\right).$$

Our aim is to estimate $F_n(x) - \Phi_n(x)$ with an error of order $C/(n - 2.5)^3$. Note that $F_n(0) - \Phi_n(0) = 0$, therefore we suppose $x \ne 0$. Moreover, we may consider only the case $x > 0$, since $p_{R^*}(r; n, 2.5)$, $q_{R^*}(r; n, 2.5)$ and $\varphi(r)$ are symmetric functions and hence $|F_n(x) - \Phi_n(x)| = |F_n(-x) - \Phi_n(-x)|$. Using (13), define for $x > 0$, with $N = n - 2.5$,

$$H_n(x) = 1 - \overline H_n(x), \qquad \overline H_n(x) := \left(1 + \frac{1}{16\,N^2}\right)\int_x^{\sqrt N} q_{R^*}(r; n, 2.5)\,dr.$$
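As a sanity check of $\Phi_n$ (whose second-order polynomial is our reconstruction of the garbled display above), one can compare it with the exact distribution function of $\sqrt{n-2.5}\,R$ obtained from the Student-t representation of Sect. 6; the scaled errors should stay bounded, reflecting the $O((n-2.5)^{-3})$ accuracy.

```python
# Sketch (ours): exact CDF of sqrt(n-2.5) R via the t-representation versus the
# reconstructed short expansion Phi_n.
import numpy as np
from scipy.stats import norm, t

def exact_cdf(x, n):
    N = n - 2.5
    y = x / np.sqrt(N)
    return t.cdf(y * np.sqrt(n - 2) / np.sqrt(1 - y ** 2), df=n - 2)

def Phi_n(x, n):
    N = n - 2.5
    poly = x ** 3 / (4 * N) + (-3 * x ** 7 + 13 * x ** 5 + 2 * x ** 3 + 6 * x) / (96 * N ** 2)
    return norm.cdf(x) + norm.pdf(x) * poly

for n in (20, 50, 100):
    xs = np.linspace(0.0, 3.0, 301)
    err = np.max(np.abs(exact_cdf(xs, n) - Phi_n(xs, n)))
    print(f"n={n:3d}  max|F_n - Phi_n|={err:.2e}   (n-2.5)^3 * err = {err * (n - 2.5) ** 3:.3f}")
```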

Then we have

$$|F_n(x) - \Phi_n(x)| \;\le\; |F_n(x) - H_n(x)| + |H_n(x) - \Phi_n(x)|. \tag{44}$$

For $0 \le x \le \sqrt N$, with (15), (12) and (10) we find $A_n^* \ge 1$ and

$$\left|F_n(x) - H_n(x)\right| \le \left|\int_x^{\sqrt N}\left(p_{R^*}(r; n, 2.5) - \Big(1 + \frac{1}{16\,N^2}\Big)\,q_{R^*}(r; n, 2.5)\right)dr\,\right|$$
$$\le \frac{319}{2880\,N^3}\;\frac{1}{\sqrt{2\pi}}\int_0^{\sqrt N}\left(1 - \frac{r^2}{N}\right)^{N/2 - 3/4} dr = \frac{319}{5760\,A_n^*\,N^3} \le \frac{319}{5760\,N^3}. \tag{45}$$

Now we have to estimate $|H_n(x) - \Phi_n(x)|$. Define $\overline Q_2(x) := Q_2(x) + 6\,(1 - \Phi(x))$,

$$\varphi_n(x) := \frac{d}{dx}\,\Phi_n(x) = \varphi(x)\left(1 - \frac{x^4 - 3x^2}{4\,N} + \frac{3x^8 - 34x^6 + 63x^4 + 6}{96\,N^2}\right)$$

and $\overline\varphi_n(x) := \varphi_n(x) - \varphi(x)/(16\,N^2)$. Then we obtain for $x > 0$

$$|H_n(x) - \Phi_n(x)| \;\le\; K_1 + \frac{1}{16\,N^2}\,K_2 \;\le\; \left(1 + \frac{1}{16\,N^2}\right)K_1 + \frac{0.462541}{64\,N^3} + \frac{14.766155}{1536\,N^4}, \tag{46}$$

where

$$K_1 := \left|\int_x^\infty \big(q_{R^*}(r; n) - \overline\varphi_n(r)\big)\,dr\right|, \qquad K_2 := \left|\int_x^\infty \big(q_{R^*}(r; n) - \varphi(r)\big)\,dr\right| \le K_1 + K_3,$$

$$K_3 := \left|\int_x^\infty \varphi(r)\left(\frac{r^4 - 3r^2}{4\,N} - \frac{3r^8 - 34r^6 + 63r^4}{96\,N^2}\right)dr\right| \le \frac{x^3\,\varphi(x)}{4\,N} + \frac{|\overline Q_2(x)|}{96\,N^2},$$

$$\sup_{x>0}\,x^3\,\varphi(x) = \frac{(3/e)^{3/2}}{\sqrt{2\pi}} \approx 0.462541 \qquad\text{and}\qquad \sup_{x>0}\,|\overline Q_2(x)| = \overline Q_2(3) \approx 14.766155.$$
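The numerical constant 0.462541 can be reproduced directly (our check):

```python
# sup_{x>0} x^3 phi(x) is attained at x = sqrt(3) and equals (3/e)^{3/2} / sqrt(2 pi).
import numpy as np
x = np.linspace(0.01, 10, 100001)
val = x ** 3 * np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
print(np.max(val), (3 / np.e) ** 1.5 / np.sqrt(2 * np.pi))   # both ~ 0.462541
```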

Now we have to estimate $K_1 \le J_1 + J_2 + J_3$, where, with $\varepsilon \in (0, 1)$,

$$J_1 := I_{[0,\,\varepsilon\sqrt N)}(x)\int_x^{\varepsilon\sqrt N}\left|\frac{1}{\sqrt{2\pi}}\left(1 - \frac{r^2}{N}\right)^{N/2 - 3/4} - \overline\varphi_n(r)\right| dr,$$

$$J_2 := \int_{\varepsilon\sqrt N}^{\sqrt N}\frac{1}{\sqrt{2\pi}}\left(1 - \frac{r^2}{N}\right)^{N/2 - 3/4} dr, \qquad J_3 := \left|\int_{\varepsilon\sqrt N}^{\infty}\overline\varphi_n(r)\,dr\right| = \left|1 - \widetilde\Phi_n(\varepsilon\sqrt N)\right|$$

and $\widetilde\Phi_n(x) = \Phi_n(x) + (1 - \Phi(x))/(16\,N^2)$. Substituting $u^2 = r^2/N$ we find

$$J_2 = \sqrt{\frac{N}{2\pi}}\int_\varepsilon^1\left(1 - u^2\right)^{N/2 - 3/4} du \;\le\; \frac{\sqrt N\,\left(1 - \varepsilon^2\right)^{N/2 + 1/4}}{\sqrt{2\pi}\;\varepsilon\,(N + 1/2)}.$$

Using the second inequality of (16) to estimate $1 - \Phi(\varepsilon\sqrt N)$ we find

$$J_3 \;\le\; \frac{2\,\varphi(\varepsilon\sqrt N)\,\big(1 + 1/(16\,N^2)\big)}{\varepsilon\sqrt N\,\Big(1 + \sqrt{1 + 8/(\pi\,\varepsilon^2 N)}\Big)} + \frac{(\varepsilon\sqrt N)^3\,\varphi(\varepsilon\sqrt N)}{4\,N} + \frac{|\overline Q_2(\varepsilon\sqrt N)|}{96\,N^2}.$$

Let now $0 < x \le \varepsilon\sqrt N$. To estimate $J_1$ we suppose $0 < r \le \varepsilon\sqrt N$ and define

$$a_1(r) := \frac{N}{2}\left[\ln\!\left(1 - \frac{r^2}{N}\right) + \frac{r^2}{N} + \frac{r^4}{2N^2} + \frac{r^6}{3N^3}\right] - \frac{3}{4}\left[\ln\!\left(1 - \frac{r^2}{N}\right) + \frac{r^2}{N} + \frac{r^4}{2N^2}\right],$$

$$a_2(r) = -\,N^{-1}\left(\frac{r^4}{4} - \frac{3\,r^2}{4}\right) \qquad\text{and}\qquad a_3(r) = -\,N^{-2}\left(\frac{r^6}{6} - \frac{3\,r^4}{8}\right).$$

Then we have $\overline\varphi_n(r) = \varphi(r)\big(1 + a_2(r) + a_2^2(r)/2 + a_3(r)\big)$ and

$$\left(1 - \frac{r^2}{N}\right)^{N/2 - 3/4} = e^{-r^2/2 + a_1(r) + a_2(r) + a_3(r)}$$
$$= e^{-r^2/2}\Big[e^{a_2(r) + a_3(r)}\big(e^{a_1(r)} - 1\big) + e^{a_2(r)}\big(e^{a_3(r)} - 1 - a_3(r)\big) + e^{a_2(r)}\big(1 + a_3(r)\big)\Big].$$

Using (41), $a_k^+ := \max(0,\,a_k)$ and $a_k^- := \max(0,\,-a_k)$, $k = 1, 2, 3$, we find

$$J_1 \le \int_0^{\varepsilon\sqrt N}\varphi(r)\,\Big|\,e^{a_1(r)+a_2(r)+a_3(r)} - \big(1 + a_2(r) + a_2^2(r)/2 + a_3(r)\big)\Big|\,dr \;\le\; \sum_{k=1}^4 J_{1,k},$$

where

$$J_{1,1} := \int_0^{\varepsilon\sqrt N}\varphi(r)\,e^{a_2(r)+a_3(r)}\left|e^{a_1(r)} - 1\right| dr \;\le\; \int_0^{\varepsilon\sqrt N}\varphi(r)\,\big|a_1(r)\big|\,e^{a_2(r)+a_3(r)+a_1^+(r)}\,dr,$$

$$J_{1,2} := \int_0^{\varepsilon\sqrt N}\varphi(r)\,e^{a_2(r)}\left|e^{a_3(r)} - 1 - a_3(r)\right| dr \;\le\; \int_0^{\varepsilon\sqrt N}\varphi(r)\,\frac{a_3^2(r)}{2}\,e^{a_2(r)+a_3^+(r)}\,dr,$$

$$J_{1,3} := \int_0^{\varepsilon\sqrt N}\varphi(r)\left|e^{a_2(r)} - 1 - a_2(r) - \frac{a_2^2(r)}{2}\right| dr \;\le\; \int_0^{\varepsilon\sqrt N}\varphi(r)\,\frac{|a_2^3(r)|}{6}\,e^{a_2^+(r)}\,dr$$

and

$$J_{1,4} := \int_0^{\varepsilon\sqrt N}\varphi(r)\left|\big(e^{a_2(r)} - 1\big)\,a_3(r)\right| dr \;\le\; \int_0^{\varepsilon\sqrt N}\varphi(r)\,\big|a_2(r)\,a_3(r)\big|\,e^{a_2^+(r)}\,dr.$$

Let $N \ge 4.5$ and $0 < r \le \varepsilon\sqrt N$. It follows from (39) with $z = r^2/N$ that

$$-\,\underline a_1(r) := -\,\frac{r^8}{8\,N^3} - \frac{r^{10}}{10\,N^4\,(1 - \varepsilon^2)} \;\le\; a_1(r) \;\le\; \frac{r^6}{4\,N^3} + \frac{3\,r^8}{16\,N^4\,(1 - \varepsilon^2)} \;=:\; \overline a_1(r),$$

$$\int_0^s \varphi(r)\,\overline a_1(r)\,dr \le \int_0^s \varphi(r)\,\underline a_1(r)\,dr \quad\text{for } s \ge 1.7 \qquad\text{and}\qquad a_1^+(r) \le \frac{r^6}{4\,N^3} \le \frac{\varepsilon^6}{4}.$$

For $r > 0$ the functions $a_2(r)$ and $a_3(r)$ both take their only maximum at $r = \sqrt{3/2}$, with

$$a_2(r) = \frac{3\,r^2 - r^4}{4\,N}\;\begin{cases}\le 0, & r \ge \sqrt3,\\ > 0, & r < \sqrt3,\end{cases}\qquad a_2^+(r) \le \begin{cases}0, & r \ge \sqrt3,\\ 9/(16\,N), & r < \sqrt3,\end{cases}$$

and

$$a_3(r) = \frac{9\,r^4 - 4\,r^6}{24\,N^2}\;\begin{cases}\le 0, & r \ge 3/2,\\ > 0, & r < 3/2,\end{cases}\qquad a_3^+(r) \le \begin{cases}0, & r \ge 3/2,\\ 9/(32\,N^2), & r < 3/2.\end{cases}$$

Then we find, with $e^{-a_k^-(r)} \le 1$, (17) and the moments $E(Y^4) = 3$, $E(Y^6) = 15$, $E(Y^8) = 105$, $E(Y^{10}) = 945$ and $E(Y^{12}) = 10395$ for a standard normally distributed $Y$,

$$J_{1,1} \le e^{9/(16N) + 9/(32N^2) + \varepsilon^6/4}\int_0^{\varepsilon\sqrt N}\varphi(r)\left(\frac{r^8}{8\,N^3} + \frac{r^{10}}{10\,N^4(1-\varepsilon^2)}\right)dr$$
$$\le e^{9/(16N) + 9/(32N^2) + \varepsilon^6/4}\left(\frac{105 - 2\,U_8(\varepsilon\sqrt N)}{16\,N^3} + \frac{945 - 2\,U_{10}(\varepsilon\sqrt N)}{20\,N^4\,(1 - \varepsilon^2)}\right),$$

$$J_{1,2} \le \frac{e^{9/(16\cdot4.5) + 9/(32\cdot4.5^2)}}{1152\,N^4}\int_0^{\varepsilon\sqrt N}\varphi(r)\left(16\,r^{12} - 72\,r^{10} + 81\,r^8\right)dr$$
$$\le \frac{1.14899}{2304\,N^4}\left(106785 - 32\,U_{12}(\varepsilon\sqrt N) + 144\,U_{10}(\varepsilon\sqrt N) - 162\,U_8(\varepsilon\sqrt N)\right),$$

with $a_2(r) \ge 0$ only for $0 \le r \le \sqrt3$ and $\left(r^4 - 3\,r^2\right)^3 = r^{12} - 9\,r^{10} + 27\,r^8 - 27\,r^6$,

$$J_{1,3} \le \frac{1}{384\,N^3}\left(\int_0^{\varepsilon\sqrt N}\varphi(r)\left(r^4 - 3\,r^2\right)^3 dr + \left(1 + e^{9/(16\cdot4.5)}\right)\int_0^{\sqrt3}\varphi(r)\left(3\,r^2 - r^4\right)^3 dr\right)$$
$$\le \frac{1}{384\,N^3}\left(2160 - U_{12}(\varepsilon\sqrt N) + 9\,U_{10}(\varepsilon\sqrt N) - 27\,U_8(\varepsilon\sqrt N) + 27\,U_6(\varepsilon\sqrt N) + 2.937248\right)$$

and with $a_2(r)\,a_3(r) = (96\,N^3)^{-1}\left(4\,r^{10} - 21\,r^8 + 27\,r^6\right) \le 0$ only for $3/2 \le r \le \sqrt3$,

$$J_{1,4} \le \int_0^{\varepsilon\sqrt N}\varphi(r)\,a_2(r)\,a_3(r)\,dr + \left(e^{9/(16\cdot4.5)} - 1\right)\int_0^{3/2}\varphi(r)\,a_2(r)\,a_3(r)\,dr + \left(1 + e^{9/(16\cdot4.5)}\right)\int_{3/2}^{\sqrt3}\varphi(r)\,\big(-a_2(r)\,a_3(r)\big)\,dr$$
$$\le \frac{1}{96\,N^3}\left(990 - 4\,U_{10}(\varepsilon\sqrt N) + 21\,U_8(\varepsilon\sqrt N) - 27\,U_6(\varepsilon\sqrt N) + 0.574299\right).$$

Hence $J_1$, and with it $K_1$, is estimated. Taking the estimates (44)-(46) together, we obtain (18). □

Proof of Theorem 2. The first bound (19) follows immediately from Theorem 1 and $\sup_{x>0}|Q_2(x)| \le 14.758064$. To prove (20) we use (19) and a Taylor expansion. As in the proof of Theorem 1 we may suppose $x > 0$. Here we have

$$\overline F_n(x) = P\big(\sqrt{n-2}\;R \le x\big) = P\Big(\sqrt{n-2.5}\;R \le x\sqrt{1 - 1/(2n-4)}\Big) = F_n(y) \qquad\text{with } y = x\sqrt{1 - 1/(2n-4)}.$$

The bound (19) leads to

$$\sup_{y>0}\left|F_n(y) - \Phi(y) - \frac{y^3\,\varphi(y)}{4\,(n - 2.5)}\right| \;\le\; \frac{B_n(\varepsilon)}{(n - 2.5)^2}. \tag{47}$$

Put $M = 2\,(n - 2) = 2n - 4$. Consider now the Taylor expansions

$$\Phi(y) = \Phi(x) - \varphi(x)\,(x - y) + \varphi'(z)\,(y - x)^2/2 \qquad\text{with } 0 < y < z < x,$$

$$\varphi(x)\,(x - y) = x\,\varphi(x)/(2M) + R_1(n) \qquad\text{and}\qquad x - y = x\left(1 - \sqrt{1 - 1/M}\right),$$

where

$$R_1(n) := \varphi(x)\left(x - y - \frac{x}{2M}\right) = x\,\varphi(x)\left(1 - \sqrt{1 - 1/M} - \frac{1}{2M}\right)$$

and

$$R_2(n) := |\varphi'(z)|\,\frac{(y - x)^2}{2} = z^3\,\varphi(z)\left(\frac{x}{z}\right)^2\frac{\left(1 - \sqrt{1 - 1/M}\right)^2}{2}.$$

Formula (3.6.9) in Abramowitz and Stegun [1] implies

$$0 \le 1 - \sqrt{1 - 1/M} - \frac{1}{2M} \le \frac{1}{8\,M^2\,(1 - 1/M)} = \frac{1}{8\,M\,(M - 1)}$$

and

$$0 \le 1 - \sqrt{1 - 1/M} \le \frac{1}{2\,M\,(1 - 1/M)} = \frac{1}{2\,(M - 1)}.$$

Hence, with $(x/z)^2 \le (x/y)^2 = (1 - 1/M)^{-1}$ we obtain

$$R_1(n) \le \frac{e^{-1/2}}{\sqrt{2\pi}\;8\,M^2\,(1 - 1/M)} \qquad\text{and}\qquad R_2(n) \le \frac{(3/e)^{3/2}}{\sqrt{2\pi}\;8\,M^2\,(1 - 1/M)^2}. \tag{48}$$

Using $y = x\sqrt{1 - 1/M}$, $n - 2.5 = (n - 2)(1 - 1/M)$ and the Taylor expansion $\varphi(y) = \varphi(x) + \varphi'(z)\,(y - x)$ with $0 < y < z < x$, we find

$$\frac{y^3\,\varphi(x)}{4\,(n - 5/2)} = \frac{x^3\,\varphi(x)}{4\,(n - 2)}\,\sqrt{1 - 1/M} = \frac{x^3\,\varphi(x)}{4\,(n - 2)} - R_3(n)$$

with

$$R_3(n) := \frac{x^3\,\varphi(x)\left(1 - \sqrt{1 - 1/M}\right)}{2M} \le \frac{(3/e)^{3/2}}{\sqrt{2\pi}\;4\,M^2\,(1 - 1/M)}. \tag{49}$$

It remains to estimate

$$R_4(n) := \frac{y^3\,|\varphi'(z)|\,(x - y)}{4\,(n - 5/2)} = \frac{y^3\,z\,\varphi(z)\,x\left(1 - \sqrt{1 - 1/M}\right)}{2\,(M - 1)} \le \frac{z^5\,\varphi(z)\left(1 - \sqrt{1 - 1/M}\right)}{\sqrt{1 - 1/M}\;2\,(M - 1)} \le \frac{(5/e)^{5/2}}{\sqrt{2\pi}\;4\,M^2\,(1 - 1/M)^{5/2}}. \tag{50}$$

Taking (47)-(50) together we obtain (20). □

Proof of Theorem 3. Define $N = n - 2.5$ and $h(x) = \sqrt N\,\sin\!\big(x/\sqrt N\big)$. Starting from (7), we have to prove (21). Considering (7) and that $R$ is symmetric and $\sin(x)$ is an odd function, we may restrict ourselves to the case $x > 0$. In order to get smaller constants we use both Taylor expansions

$$\Phi(h(x)) = \Phi(x) + \varphi(x)\,(h(x) - x) + \varphi'(x)\,\frac{(h(x) - x)^2}{2} + \varphi''(z)\,\frac{(h(x) - x)^3}{6}$$

or

$$\Phi(h(x)) = \Phi(x) + \varphi(x)\,(h(x) - x) + \varphi'(z)\,\frac{(h(x) - x)^2}{2}, \qquad 0 < h(x) < z < x. \tag{51}$$

Using $\big|\sqrt N\sin(x/\sqrt N) - x + x^3/(6N)\big| \le x^5/(120\,N^2)$, we find $\varphi(x)\,(h(x) - x) = -\,x^3\,\varphi(x)/(6N) + S_1(n)$, where

$$|S_1(n)| \le \varphi(x)\left|\sqrt N\,\sin(x/\sqrt N) - x + \frac{x^3}{6N}\right| \le \frac{x^5\,\varphi(x)}{120\,N^2} \le \frac{(5/e)^{2.5}}{120\,\sqrt{2\pi}\;N^2} = \frac{0.015256}{N^2}.$$
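The elementary sine bound used above is easy to confirm numerically (our check):

```python
# Check | sqrt(N) sin(x / sqrt(N)) - x + x^3/(6N) | <= x^5 / (120 N^2) on a grid.
import numpy as np
for N in (4.5, 10.0, 100.0):
    x = np.linspace(0.0, np.pi * np.sqrt(N) / 2, 10001)
    lhs = np.abs(np.sqrt(N) * np.sin(x / np.sqrt(N)) - x + x ** 3 / (6 * N))
    rhs = x ** 5 / (120 * N ** 2)
    assert np.all(lhs <= rhs + 1e-12)          # small tolerance for round-off
print("sine expansion bound verified")
```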

With $\big|\sqrt N\sin(x/\sqrt N) - x\big| \le x^3/(6N)$ and $0 < x/z \le x/\big(\sqrt N\sin(x/\sqrt N)\big) \le \pi/2$ for $0 < x \le \pi/2$, and having in mind $S_2(n) := \min\{S_{2a}(n),\, S_{2b}(n) + S_{2c}(n)\}$, we estimate

$$S_{2a}(n) := \frac{|\varphi'(z)|}{2}\left(\sqrt N\sin(x/\sqrt N) - x\right)^2 \le \frac{z\,\varphi(z)\,x^6}{72\,N^2} \le \frac{(7/e)^{3.5}\,(\pi/2)^6}{72\,\sqrt{2\pi}\;N^2} = \frac{2.280916}{N^2}$$

or alternatively

$$S_{2b}(n) := \frac{|\varphi'(x)|}{2}\left(\sqrt N\sin(x/\sqrt N) - x\right)^2 \le \frac{x^7\,\varphi(x)}{72\,N^2} \le \frac{(7/e)^{3.5}}{72\,\sqrt{2\pi}\;N^2} = \frac{0.151842}{N^2},$$

and, since $|z^{11} - z^9|\,\varphi(z)$ takes its maximum for $z = 3/\sqrt2 + \sqrt6/2$,

$$S_{2c}(n) := \frac{|\varphi''(z)|}{6}\left|\sqrt N\sin(x/\sqrt N) - x\right|^3 \le \frac{(\pi/2)^9\,|z^{11} - z^9|\,\varphi(z)}{1296\,N^3} \le \frac{\Big(\big(3/\sqrt2 + \sqrt6/2\big)^{11} - \big(3/\sqrt2 + \sqrt6/2\big)^9\Big)\,(\pi/2)^9}{1296\,\sqrt{2\pi}\,\exp\!\big\{\big(3/\sqrt2 + \sqrt6/2\big)^2/2\big\}\;N^3} = \frac{35.597236}{N^3}.$$

Note that $S_{2b}(n) + S_{2c}(n) < S_{2a}(n)$ for $n \ge 20$. Finally we define $m(x) := x^3\,\varphi(x)$; then we have

$$m(h(x)) = m(x) + m'(z)\,(h(x) - x) \qquad\text{for } 0 < h(x) < z < x.$$

Since $m'(x) = (3x^2 - x^4)\,\varphi(x)$ and the function $(z^7 - 3z^5)\,\varphi(z)$ takes its maximum at $z_{\max} = \sqrt{5 + \sqrt{10}}$, we obtain

$$S_3(n) := \frac{|3z^2 - z^4|\,\varphi(z)\,\big|\sqrt N\sin(x/\sqrt N) - x\big|}{4\,N} \le \frac{|3z^2 - z^4|\,\varphi(z)\;x^3}{24\,N^2} \le \frac{\big(z_{\max}^7 - 3\,z_{\max}^5\big)\,\varphi(z_{\max})\,(\pi/2)^3}{24\,N^2} = \frac{1.069085}{N^2},$$

and (21) is proved. Replacing $(\pi/2)^k$ by $(\pi/6)^k$ in the estimates of $S_{2a}$, $S_{2c}$ and $S_3$, we find $D_n$. □

Proof of Theorem 4. Since the transformation $T$ is assumed to be increasing, we get $P(S \le x) = P(T(S) \le T(x))$. Therefore, in order that (26) holds, it is enough to find a function $T$ such that $\Phi(T(x)) = \Phi(x) + p_n(x)\,\varphi(x) + O(n^{-\alpha - 1/2})$. Hence, by the smoothness properties of $\Phi(x)$ we may take $T$ given by (27). □

Proof of Theorem 5. Put $N = n - 2.5$ and $h(x) = \sqrt N\,F^{-1}\!\big(x/\sqrt N\big)$, where

$$F^{-1}(y) = \frac{2}{\sqrt3}\;\frac{e^{\sqrt3\,y} - 1}{e^{\sqrt3\,y} + 1} \qquad\text{for } |y| \le \frac{\ln\!\big(7 + 4\sqrt3\big)}{\sqrt3}$$

is the inverse function to $F(y)$ given in (29). Then we find by Theorem 2, as $n \to \infty$,

$$P\big(\sqrt N\,F(R) \le x\big) = P\big(\sqrt N\,R \le h(x)\big) = \Phi\big(h(x)\big) + \frac{h^3(x)\,\varphi(h(x))}{4\,N} + O(n^{-2}).$$

Using (51) and $F^{-1}(y) = y - y^3/4 + O(y^5)$ as $y \to 0$ we find in our case, as $n \to \infty$,

$$\Phi(h(x)) = \Phi(x) - \frac{x^3\,\varphi(x)}{4\,N} + O(n^{-2}) \qquad\text{and}\qquad h^3(x)\,\varphi(h(x)) = x^3\,\varphi(x) + O(n^{-1}),$$

which lead to (30). With $F(y) = y + y^3/4 + O(|y|^7)$ as $y \to 0$ and similar calculations we find (31). □
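For completeness, here is a small sketch of the transform behind Theorem 5: $F^{-1}$ is copied from the proof above, while $F$ itself is our reconstruction as its analytic inverse (an assumption, since (29) is not reproduced here).

```python
# Sketch (ours): F^{-1} as in the proof; F reconstructed as its inverse,
# F(r) = (1/sqrt(3)) ln((2 + sqrt(3) r) / (2 - sqrt(3) r)) -- treat F as an assumption.
import numpy as np

S3 = np.sqrt(3.0)

def F_inv(y):
    return (2.0 / S3) * (np.exp(S3 * y) - 1.0) / (np.exp(S3 * y) + 1.0)

def F(r):                                    # reconstructed inverse, valid for |r| < 2/sqrt(3)
    return np.log((2.0 + S3 * r) / (2.0 - S3 * r)) / S3

y = np.linspace(-np.log(7 + 4 * S3) / S3, np.log(7 + 4 * S3) / S3, 101)
assert np.allclose(F(F_inv(y)), y)           # F and F^{-1} are mutual inverses
# small-argument expansions quoted in the proof: F(y) ~ y + y^3/4,  F^{-1}(y) ~ y - y^3/4
r = 0.1
print(F(r) - (r + r ** 3 / 4), F_inv(r) - (r - r ** 3 / 4))   # both are O(r^5)
```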

References

1. Abramowitz, M., Stegun, I.A. (eds.): Handbook of Mathematical Functions. Verlag Harry Deutsch, Frankfurt/Main (1984)
2. Christoph, G., Prokhorov, Yu.V., Ulyanov, V.V.: On the distribution of quadratic forms in Gaussian random variables. Theory Probab. Appl. 40(2), 250-260 (1996)
3. Christoph, G., Ulyanov, V.V.: On accurate approximation of standardized chi-squared distributions by Edgeworth-Chebyshev expansions. Inform. Appl. 5(1), 25-31 (2011)
4. David, F.N.: Tables of the Correlation Coefficient. Cambridge University Press, Cambridge (1938)
5. Fujikoshi, Y., Ulyanov, V.V., Shimizu, R.: Multivariate Statistics: High-Dimensional and Large-Sample Approximations. Wiley, Hoboken (2010)
6. Hall, P., Marron, J.S., Neeman, A.: Geometric representation of high dimension, low sample size data. J. R. Stat. Soc. Ser. B 67, 427-444 (2005)
7. Johnson, N.L., Kotz, S., Balakrishnan, N.: Continuous Univariate Distributions, vol. 2, 2nd edn. Wiley, New York (1995)
8. Kawaguchi, Y., Ulyanov, V., Fujikoshi, Y.: Approximation of statistics describing geometric properties of samples with large dimensions, with error estimation. Inform. Appl. 4(1), 22-27 (2010)
9. Konishi, S.: Asymptotic expansions for the distributions of functions of a correlation matrix. J. Multivar. Anal. 9(2), 259-266 (1979)
10. Niki, N., Konishi, S.: Effects of transformations in higher order asymptotic expansions. Ann. Inst. Stat. Math. 38(Part A), 371-383 (1986)
11. Ulyanov, V., Christoph, G., Fujikoshi, Y.: On approximations of transformed chi-squared distributions in statistical applications. Sib. Math. J. 47(6), 1154-1166 (2006). Translated from Sibirskii Matematicheskii Zhurnal 47(6), 1401-1413