
Journal of Statistical Computation and Simulation, 1997, Vol. 56, pp. 353-372
Reprints available directly from the publisher
Photocopying permitted by license only

© 1997 OPA (Overseas Publishers Association) Amsterdam B.V.
Published in The Netherlands under license by Gordon and Breach Science Publishers
Printed in India

TESTS FOR EQUALITY OF DISPERSION IN BIVARIATE SAMPLES - REVIEW AND EMPIRICAL COMPARISON

H.-P. PIEPHO
Institute for Crop Science, University of Kassel, Steinstrasse 19, 37213 Witzenhausen, Germany

(Received 6 February 1996; In final form 30 October 1996)

This paper investigates the robustness and power of several tests for equality of dispersion in bivariate samples. Parametric tests are shown by Monte Carlo simulation to be very sensitive to departures from the normality assumption. Of the alternative procedures, the extended Brown-Forsythe test and Spearman's rank correlation of sums and differences are recommended as robust tests.

Keywords: Homogeneity of variance; Monte Carlo simulation; power; robustness

1. INTRODUCTION

Morgan (1939) and Pitman (1939) introduced a test for equality of variances in bivariate normal samples. It has subsequently been noted by many authors (e.g., Bell et al., 1982; Sandvik and Olsson, 1982; Cohen, 1986; McCulloch, 1987; Ramsey and Ramsey, 1987) that this test is sensitive to departures from the assumption of bivariate normality. Several alternative procedures have been suggested, and some of them have been compared to the parametric test of Morgan and Pitman via Monte Carlo simulation. Most of these simulations included only a small number of the available tests. The purpose of the present paper is to give a brief review of tests for homogeneity of dispersion in bivariate samples and to compare the performance of these tests by Monte Carlo simulation.


2. TESTS FOR HOMOGENEITY OF DISPERSION IN PAIRED DATA

Consider a random bivariate sample (Xᵢ, Yᵢ), i = 1, ..., n. It may be of interest to test H₀: σ²_X = σ²_Y, where σ²_X and σ²_Y are the variances of Xᵢ and Yᵢ, respectively. Xᵢ and Yᵢ could be, e.g.:

(i) Yields of two crop varieties tested in n randomly chosen locations: The variance of a variety is often interpreted as a measure of 'stability': the smaller the variance, the more stable the variety. H₀ implies equal 'stability' of both varieties (Piepho, 1994). [It is noted that a comparison of variances is usually only a first step in a more thorough analysis of yield trial data, which may involve scrutiny of environmental as well as genotypic covariates, if available. For example, a scatter plot of yields against site temperatures may provide some insight into the causes of variance heterogeneity. For more details see Cullis et al. (1996) and Denis et al. (1996).]

(ii) Analytical results from two different laboratories, which have been asked to analyze the same sample (n batches) of some material: Under H₀ both laboratories are equally reliable (Jorgensen, 1985).

McCulloch (1987) and Rothstein et al. (1981) list some other examples from biology and psychology in which H₀ is of interest.

It is usually assumed that (Xᵢ, Yᵢ) are bivariate normal. Often, however, this is not a tenable assumption. We will therefore relax the distributional assumption. Let G_X and G_Y be the (continuous) distribution functions of X and Y. It will be assumed that under H₀, G_X(x) = G_Y(x + δ) for some δ, i.e., the distributions of X and Y have identical shape and differ only in location. This implies homogeneity of variance in case the distribution has finite first and second moments.

In the following, several tests of H₀ will be briefly described. Most of these tests are only applicable for comparison of two groups, while some are p-sample tests (p ≥ 2). Abbreviations in brackets will be used to refer to the tests later in the paper. Some of the simulation results reported by the authors of the tests are briefly mentioned.

2.1. Tests Based on the Normality Assumption

Morgan (1939) and Pitman (1939) proposed a parametric test (PEARS) that is based on Pearson's product-moment correlation coefficient (r_SD) for sums (Sᵢ = Xᵢ + Yᵢ) and differences (Dᵢ = Xᵢ − Yᵢ) of paired data [also see Maloney and Rastogi (1970)]. The test requires that (Xᵢ, Yᵢ) be bivariate normal.

Ekbohm (1981) proposed an F-test for correlated variances with approximate degrees of freedom (d.f.) to account for the correlation between X and Y (EKBOHM). Under H₀, F = S²_X/S²_Y is approximately distributed as F with (ν, ν) d.f., where ν = (n − 1 − 2r²)/(1 − r²), r is the sample correlation coefficient between X and Y, and S²_X (S²_Y) is the sample estimate of σ²_X (σ²_Y). Ekbohm (1981) noted that the test is probably very sensitive to non-normality, though he did not investigate this further.

Harris (1985) suggested four Wald test statistics, two of which (W and W_L) are based on the normality assumption. Under H₀ both statistics are asymptotically distributed as chi-squared (with one d.f. in the bivariate case). Computation of the test statistics involves estimates of the variance-covariance matrix of the sample variances (W) or their log-transforms (W_L). The variance-covariance matrix is estimated based on large-sample normal theory. For computational details see Harris (1985). It is noted here that Han (1968) suggested four p-sample tests, which could be applied here. Han (1968) notes that for p = 2 his tests are equivalent. In fact, the statistics of Han's likelihood ratio test and the modified Bartlett test coincide with W_L except for a scaling factor.

Modarres (1993) proposed a likelihood ratio test (MOD) for p ≥ 2. In the two-sample case, the statistic n{log[(S²_X + S²_Y)²/4 − (S_XY)²] − log[S²_X S²_Y − (S_XY)²]}, where S_XY is the sample covariance of X and Y, is asymptotically chi-squared with one d.f.
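To make the mechanics of these parametric tests concrete, the following sketch implements PEARS, EKBOHM and MOD from the formulas above. It is an illustrative reading of the descriptions, not the original authors' code; Python with numpy/scipy is assumed, and the function names are my own. PEARS exploits the identity Cov(S, D) = σ²_X − σ²_Y, so zero correlation between sums and differences is equivalent to H₀; the standard t-test for zero correlation is used here, which is equivalent to the Morgan-Pitman test under bivariate normality.

```python
import numpy as np
from scipy import stats

def pears_test(x, y):
    # PEARS: Pearson correlation of sums and differences;
    # Cov(S, D) = Var(X) - Var(Y), so r_SD = 0 corresponds to H0
    n = len(x)
    r, _ = stats.pearsonr(x + y, x - y)
    t = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
    return t, 2 * stats.t.sf(abs(t), df=n - 2)

def ekbohm_test(x, y):
    # EKBOHM: F = S2_X / S2_Y with approximate d.f.
    # nu = (n - 1 - 2r^2) / (1 - r^2); two-sided p by doubling the tail
    n = len(x)
    r, _ = stats.pearsonr(x, y)
    f = np.var(x, ddof=1) / np.var(y, ddof=1)
    nu = (n - 1 - 2 * r**2) / (1 - r**2)
    p = 2 * min(stats.f.sf(f, nu, nu), stats.f.cdf(f, nu, nu))
    return f, p

def mod_test(x, y):
    # MOD: Modarres (1993) likelihood ratio statistic, chi2(1) under H0.
    # Sample moments with the usual n-1 divisor are used here; the
    # original may use ML (divisor n) estimates.
    n = len(x)
    s2x, s2y = np.var(x, ddof=1), np.var(y, ddof=1)
    sxy = np.cov(x, y)[0, 1]
    stat = n * (np.log((s2x + s2y)**2 / 4 - sxy**2)
                - np.log(s2x * s2y - sxy**2))
    return stat, stats.chi2.sf(stat, df=1)
```

All three functions expect numpy arrays of equal length and return a (statistic, p-value) pair.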

2.2. Robust Tests

Replacing Pearson's correlation coefficient by Spearman's rank correlation of the sums (Sᵢ) and differences (Dᵢ) (SPEAR) has been shown to improve the robustness of this procedure for non-normal data (McCulloch, 1987; Ramsey and Ramsey, 1987). It should be noted, however, that SPEAR is not distribution-free, as erroneously suggested by Salsburg (1975).

Rothstein et al. (1981) proposed three different jackknife procedures, denoted as JRATIO, JPEARS and JFISHER in McCulloch (1987), which are based on, respectively, log(F), r_SD and Fisher's z-transform of r_SD. The jackknife statistic for log(F) is computed as Q = L̄/V, where L̄ = Σᵢ L₋ᵢ/n, L₋ᵢ = n log(F) − (n − 1) log(F₋ᵢ), and V² = Σᵢ (L₋ᵢ − L̄)²/[n(n − 1)]. The notation "−i" denotes results from a reduced sample obtained by deleting the i-th observation. Q is approximately distributed as t with n − 1 d.f. The jackknife statistics for r_SD and for Fisher's z-transform of r_SD are computed similarly. These procedures were subsequently compared to PEARS in a simulation study by Bell et al. (1982). In a more extensive simulation study, McCulloch (1987) showed JRATIO, JPEARS, JFISHER and PEARS to be inferior to SPEAR in terms of Type I error control.

Harris (1985) suggested two "asymptotically robust test statistics", here denoted W_A and W_LA, which correspond to the normal-theory tests W and W_L, respectively. They are asymptotically chi-squared (with one d.f. in the bivariate case). For W_A and W_LA the variance-covariance matrix of the sample variances and their log-transforms is approximated using sample fourth moments, while for W and W_L the variance-covariance matrix is estimated based on large-sample normal theory. In her simulations, Cohen (1986) did not consider W_A because "it is clear that W_A is inferior to W_LA. It does not have the robust properties of W_LA and it is slower in convergence to chi-squared."

Wilcox (1990) applied the method of Tiku and Balakrishnan (1986) (TB) to test the correlation between sums and differences. It is based on 10% trimming of the original sample. Details of the rather complex test statistic are given in Wilcox (1990) and will not be repeated here for brevity.

Levy's (1976) test (LEVY) is an extension of Box's (1953) approximate test for variance homogeneity. The n sample pairs are partitioned into c groups of size m (n = mc). Based on his simulations, Levy recommended m = 5. In each of the c groups compute the sample variances of X and Y (denoted as S²_Xj and S²_Yj, j = 1, ..., c) and perform a paired t-test on Z_Xj = log(S²_Xj) and Z_Yj = log(S²_Yj). Significance of this test indicates dispersion differences. Wilcox (1990) suggested performing Levy's test with S²_Xj and S²_Yj instead of their log-transforms (a method henceforth referred to as T). He employed m = 2. To specify the group size, the two tests will be denoted as LEVYm and Tm. In comparing SPEAR, T, TB and SANDVIK (see Section 2.3), Wilcox (1990) concluded that as regards Type I error control, only method T can be recommended for general use, though it had the lowest power of all the tests investigated. It is noted, however, that Wilcox (1990) investigated only the case of non-identical distributions for X and Y.
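A minimal sketch of SPEAR, JRATIO and LEVY following the descriptions above (continuing the earlier snippet; again the function names and conventions are my own, not the original authors'):

```python
def spear_test(x, y):
    # SPEAR: Spearman rank correlation between sums and differences
    return stats.spearmanr(x + y, x - y)

def jratio_test(x, y):
    # JRATIO: jackknife of log(F); Q = Lbar/V is referred to t(n-1)
    n = len(x)
    log_f = np.log(np.var(x, ddof=1) / np.var(y, ddof=1))
    log_f_i = np.array([np.log(np.var(np.delete(x, i), ddof=1) /
                               np.var(np.delete(y, i), ddof=1))
                        for i in range(n)])
    pseudo = n * log_f - (n - 1) * log_f_i          # pseudo-values L_{-i}
    l_bar = pseudo.mean()
    v = np.sqrt(np.sum((pseudo - l_bar)**2) / (n * (n - 1)))
    q = l_bar / v
    return q, 2 * stats.t.sf(abs(q), df=n - 1)

def levy_test(x, y, m=5):
    # LEVYm: paired t-test on log within-group variances,
    # using c = n // m disjoint groups of m consecutive pairs
    c = len(x) // m
    z_x = [np.log(np.var(x[j*m:(j+1)*m], ddof=1)) for j in range(c)]
    z_y = [np.log(np.var(y[j*m:(j+1)*m], ddof=1)) for j in range(c)]
    return stats.ttest_rel(z_x, z_y)
```

Method T corresponds to calling levy_test with the raw within-group variances (drop the logs) and m = 2.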


Wilcox (1989) suggested two extensions of the robust test by Brown and Forsythe (1974), which will be referred to as EBF (for extended Brown-Forsythe). In the two-sample case the methods are:

EBF1: a paired t-test on Z_i1 = |Xᵢ − M_X| and Z_i2 = |Yᵢ − M_Y|, where M_X and M_Y are the medians of X and Y;
EBF2: a paired t-test on Z_i1 = |Xᵢ − X̄₁₀| and Z_i2 = |Yᵢ − Ȳ₁₀|, where X̄₁₀ and Ȳ₁₀ are the 10% trimmed means of X and Y.
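The following sketch implements EBF1 and EBF2 as just described, assuming scipy's trim_mean for the trimmed center (an illustrative reading of the description, not Wilcox's implementation). The rank-transform variants discussed next reuse the same Z values.

```python
from scipy.stats import trim_mean

def ebf_test(x, y, center="median"):
    # EBF1 (center="median") / EBF2 (center="trimmed"): paired t-test on
    # absolute deviations from a robust center of each variable
    if center == "median":
        cx, cy = np.median(x), np.median(y)
    else:
        # 10% cut from each tail; the paper's exact trimming
        # convention may differ
        cx, cy = trim_mean(x, 0.10), trim_mean(y, 0.10)
    z1, z2 = np.abs(x - cx), np.abs(y - cy)
    return stats.ttest_rel(z1, z2)
```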

Further, Wilcox proposed to use rank transforms of Z_i1 and Z_i2 (i.e., assign ranks 1 to 2n) and apply an ANOVA for dependent groups (a paired t-test in the two-sample case). This is an extension of a method proposed by Iman (1974), and will be referred to as EI (EI1 and EI2). In simulations with p ≥ 3 it was found that EBF2 can be too liberal under normality. Often, however, EBF and EI were rather conservative.

Cohen (1986) suggested two test statistics, which are identical in the bivariate case. The test is based on bootstrapping the null distribution of Fisher's z-transform of r_SD. To simulate H₀, X and Y are scaled to unit variance before bootstrapping. Cohen (1986) used B = 500 bootstrap samples. Simulations by Cohen (1986) indicate that Harris' tests are very non-robust, while LEVY controls the Type I error very well in non-normal samples. LEVY had low power compared to Cohen's and Harris' tests. Wilcox (1989) found that Cohen's procedures can be rather liberal: for p = 3 and n = 40, the Type I error exceeded 10% at nominal α = 5% with a contaminated normal distribution.

Wilcox (1991) investigated two percentile bootstrap methods of Rasmussen (1989) as applied to r_SD and showed them to be too liberal under some non-normal distributions. Other forms of the bootstrap were also considered, but they did not improve on the methods of Rasmussen (1989). I will not consider these bootstrap methods in the simulations, because of the computational burden and because none of the bootstrap procedures proved to be entirely satisfactory in previous investigations.

Wilcox (1995) suggested a bootstrap method (BOOT), which yielded satisfactory Type I error rates with non-identical distributions for X and Y in most of the investigated cases. The author cautions, however: "If distributions differ in shape, and one of the marginal distributions is sufficiently non-normal, control over the probability of a type I error might be unsatisfactory (...)". To perform the test, obtain a bootstrap sample by sampling (with replacement) n pairs from (S₁, D₁), ..., (Sₙ, Dₙ), where Sᵢ = Xᵢ + Yᵢ and Dᵢ = Xᵢ − Yᵢ, and compute the regression of Dᵢ on Sᵢ. Let b* be the least squares estimate of the slope. Repeat this process C = 599 times, yielding b*₁, ..., b*_C, and denote the ordered sample by b*₍₁₎, ..., b*₍C₎. The approximate 95% confidence interval for the regression slope is given by the 2.5% and 97.5% quantiles of the ordered b* values; H₀ is rejected if this interval does not contain zero, since the population slope is proportional to σ²_X − σ²_Y.
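A sketch of the BOOT procedure as described above (C = 599 resamples; the seed handling and the exact quantile convention are my choices, not necessarily Wilcox's):

```python
def boot_test(x, y, n_boot=599, seed=0):
    # BOOT: percentile bootstrap CI for the slope of D on S;
    # reject H0 if the approximate 95% interval excludes zero
    rng = np.random.default_rng(seed)
    s, d = x + y, x - y
    n = len(x)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)          # resample pairs (S_i, D_i)
        sb, db = s[idx], d[idx]
        # least squares slope of D on S
        slopes[b] = np.cov(sb, db)[0, 1] / np.var(sb, ddof=1)
    lo, hi = np.quantile(slopes, [0.025, 0.975])  # approximate 95% CI
    return (lo, hi), not (lo <= 0.0 <= hi)
```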
