A Comparison of Power of Normality Tests: Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors, Anderson-Darling and Jarque-Bera Tests

Md. Moniruzzaman Moni, Muhammad Shuaib
Institute of Statistical Research and Training, University of Dhaka

Outline

1. Introduction
   ◦ Introduction
   ◦ Objectives of the study
2. Methodology
   ◦ Empirical Distribution Function (EDF) Tests
   ◦ Regression and Correlation Tests
   ◦ Moments Tests
   ◦ Simulation Procedures
3. Results and Analysis
   ◦ Comparison of Power against the Symmetric Non-normal Distributions
   ◦ Comparison of Power against the Asymmetric Non-normal Distributions
   ◦ Computation of Sample Sizes for Different Powers against Different Normality Tests
4. Conclusion

Introduction

→ The normal distribution is the most widely used distribution in statistical analysis.
→ Importance: normality is an underlying assumption of many statistical procedures, such as:
   ◦ t-test
   ◦ linear regression analysis
   ◦ discriminant analysis
   ◦ F-test for homogeneity of variances
   ◦ Analysis of Variance (ANOVA)
→ Consequences of violation: interpretation and inference may not be reliable or valid.

→ Three common ways to check normality:
   1. Graphical methods (Q-Q plot, histograms)
   2. Numerical methods (skewness and kurtosis coefficients)
   3. Formal normality tests (Shapiro-Wilk, Kolmogorov-Smirnov, etc.)
→ To support the graphical methods, numerical methods or formal normality tests should be performed before drawing any conclusion about the normality of the data.
→ Different tests of normality often produce different results, i.e. some tests reject while others fail to reject the null hypothesis of normality.

Objectives of the study

→ To understand the characteristics of different methods of normality testing.
→ To compare the Shapiro-Wilk, Kolmogorov-Smirnov, Anderson-Darling, Lilliefors and Jarque-Bera tests in terms of power via Monte Carlo simulation.
→ To provide guidelines to practitioners on the choice of normality test.
→ To find the sample sizes these tests need to reach 80% and 90% power against some specific distributions.

Methodology

Empirical Distribution Function (EDF) Tests

→ EDF tests are based on a measure of discrepancy between the empirical and hypothesized distributions.
→ The most important and widely known EDF tests are the Kolmogorov-Smirnov, Lilliefors, Anderson-Darling and Cramér-von Mises tests.

◦ Kolmogorov-Smirnov Test:

$$T = \sup_x \left| F^*(x) - F_n(x) \right|$$

◦ Lilliefors Test:

$$D = \max_x \left| F^*(x) - S_n(x) \right|$$

Although the LF statistic has the same form as the KS statistic, its table of critical values is different, which can lead to a different conclusion about the normality of the data.

◦ Anderson-Darling Test:

$$W_n^2 = n \int_{-\infty}^{\infty} \left[ F_n(x) - F^*(x) \right]^2 \psi(F^*(x)) \, dF^*(x)$$

where $\psi$ is a weight function.
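The EDF statistics above can be computed directly from the order statistics. The sketch below is only an illustration under stated assumptions: the function name `edf_statistics` is hypothetical, the hypothesized distribution is a normal whose mean and standard deviation are estimated from the sample (the Lilliefors setting), and the Anderson-Darling value uses the usual weight $\psi(u) = [u(1-u)]^{-1}$ in its standard computational form, commonly written $A^2$. It returns the statistics only; critical values and p-values are not computed.

```python
import math
import random
from statistics import NormalDist, fmean, stdev


def edf_statistics(sample):
    """Return (D, A2) measured against a normal fitted to the sample itself."""
    x = sorted(sample)
    n = len(x)
    fitted = NormalDist(fmean(x), stdev(x))  # mean and sd estimated from the data
    # F*(x_(i)) at each order statistic, clamped away from 0 and 1 so log() is safe
    u = [min(max(fitted.cdf(v), 1e-12), 1.0 - 1e-12) for v in x]

    # The EDF jumps from i/n to (i+1)/n at x_(i) (0-based i), so the largest
    # gap |F* - F_n| occurs either just before or exactly at one of the jumps.
    d_plus = max((i + 1) / n - u[i] for i in range(n))
    d_minus = max(u[i] - i / n for i in range(n))
    d = max(d_plus, d_minus)

    # Computational form of the weighted integral with psi(u) = 1 / (u (1 - u))
    s = sum((2 * i + 1) * (math.log(u[i]) + math.log(1.0 - u[n - 1 - i]))
            for i in range(n))
    a2 = -n - s / n
    return d, a2


rng = random.Random(0)
normal_sample = [rng.gauss(10.0, 2.0) for _ in range(300)]
skewed_sample = [rng.expovariate(1.0) for _ in range(300)]
d_norm, a2_norm = edf_statistics(normal_sample)
d_skew, a2_skew = edf_statistics(skewed_sample)
print(f"normal sample:      D = {d_norm:.3f}, A2 = {a2_norm:.3f}")
print(f"exponential sample: D = {d_skew:.3f}, A2 = {a2_skew:.3f}")
```

Both statistics grow as the sample departs from the fitted normal curve. The Anderson-Darling weight puts more emphasis on the tails than the unweighted supremum distance does.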

Regression and Correlation Tests

Regression and correlation tests are based on the ratio of two weighted least-squares estimates of scale obtained from order statistics.

◦ Shapiro-Wilk Test:

$$W = \frac{\left( \sum_{i=1}^{n} a_i y_i \right)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where $y_i$ is the $i$th order statistic, $\bar{y}$ is the sample mean, and

$$(a_1, \dots, a_n) = \frac{m^T V^{-1}}{\left( m^T V^{-1} V^{-1} m \right)^{1/2}}$$

Here $m = (m_1, \dots, m_n)^T$ are the expected values of the order statistics of independent and identically distributed random variables sampled from the standard normal distribution, and $V$ is the covariance matrix of those order statistics.

Moments Tests

Moment tests are derived from the recognition that departure from normality may be detected from the sample moments, namely the skewness and kurtosis.

◦ Jarque-Bera Test:

$$JB = \frac{n}{6} \left( \left( \sqrt{b_1} \right)^2 + \frac{(b_2 - 3)^2}{4} \right)$$

where $\sqrt{b_1}$ and $b_2$ are the sample skewness and kurtosis respectively, and $n$ is the sample size.
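The Jarque-Bera statistic is simple enough to compute by hand from the central moments. The following is a minimal sketch, not the study's own code: the function name `jarque_bera` is hypothetical, the moments use divisor n, and the p-value assumes the large-sample chi-square reference distribution with 2 degrees of freedom, whose upper tail probability is exp(-x/2). That approximation can be rough for small n.

```python
import math
import random
from statistics import fmean


def jarque_bera(sample):
    """Return (JB, asymptotic p-value) for a sequence of numbers."""
    n = len(sample)
    mean = fmean(sample)
    # Central moments with divisor n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    skewness = m3 / m2 ** 1.5   # sqrt(b1), equal to 0 for a normal distribution
    kurtosis = m4 / m2 ** 2     # b2, equal to 3 for a normal distribution
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Assumption: chi-square(2) reference, so P(X > jb) = exp(-jb / 2)
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


rng = random.Random(42)
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]
jb, p = jarque_bera(skewed_sample)
print(f"JB = {jb:.1f}, p = {p:.3g}")
```

Because the statistic depends on the data only through skewness and kurtosis, it cannot detect departures that leave both close to their normal values of 0 and 3.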

Simulation Procedures

→ A Monte Carlo procedure was used to evaluate the power of the SW, KS, AD, LF and JB test statistics in testing whether a random sample of n independent observations comes from a population with a normal N(µ, σ²) distribution.
→ The null and alternative hypotheses are:
   H0: The distribution is normal
   H1: The distribution is not normal
→ Two levels of significance, α = 5% and 10%, and sample sizes n = 10, 20, 30, 50, 75, 100, 200, 300, 400, 500 and 1000 were considered.

→ The alternative distributions considered were four symmetric distributions, U(0,1), Beta(2,2), t(7) and Laplace(0,1), and four asymmetric distributions, χ²(4), Gamma(4,5), Beta(3,2) and Exponential(1).
→ The power of each test was obtained by comparing the p-values of the normality tests with the significance level: the simulated power is the proportion of generated samples for which the p-value falls below α.
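The simulation procedure can be sketched in a few lines. This is an illustration under assumptions, not the implementation used in the study: the helper names `jb_p_value` and `simulated_power` are hypothetical, the number of replications is arbitrary, and only the Jarque-Bera test with an asymptotic chi-square(2) p-value is used as a stand-in because it needs nothing beyond the Python standard library. The other tests would be plugged in the same way through their p-values.

```python
import math
import random
from statistics import fmean


def jb_p_value(sample):
    """Asymptotic chi-square(2) p-value of the Jarque-Bera statistic."""
    n = len(sample)
    mean = fmean(sample)
    m2, m3, m4 = (sum((v - mean) ** k for v in sample) / n for k in (2, 3, 4))
    jb = n / 6.0 * (m3 ** 2 / m2 ** 3 + (m4 / m2 ** 2 - 3.0) ** 2 / 4.0)
    return math.exp(-jb / 2.0)


def simulated_power(draw, n, alpha=0.05, reps=1000, seed=1):
    """Share of `reps` samples of size n, drawn by draw(rng), rejected at level alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        if jb_p_value(sample) < alpha:
            rejections += 1
    return rejections / reps


# Three of the alternative distributions listed above
alternatives = {
    "U(0,1)": lambda rng: rng.random(),
    "Chi-square(4)": lambda rng: rng.gammavariate(2.0, 2.0),  # shape 2, scale 2
    "Exponential(1)": lambda rng: rng.expovariate(1.0),
}
for name, draw in alternatives.items():
    for n in (20, 50, 100):
        print(f"{name:15s} n={n:4d} power={simulated_power(draw, n):.3f}")
```

Passing a normal generator such as `lambda rng: rng.gauss(0.0, 1.0)` gives the empirical rejection rate under H0, which is a useful check on the test's size. Scanning `simulated_power` over increasing n until it first reaches 0.80 or 0.90 is one plausible way to approximate a required sample size.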

Results and Analysis

Comparison of Power against the Symmetric Non-normal Distributions

[Figure 1: Comparison of Power for Different Normality Tests (SW, KS, LF, AD, JB) against Uniform(0,1) Distribution at α = 0.05 and α = 0.10. Two panels; x-axis: Sample size, n; y-axis: Simulated Power.]

[Figure 2: Comparison of Power for Different Normality Tests (SW, KS, LF, AD, JB) against Beta(2,2) Distribution at α = 0.05 and α = 0.10. Two panels; x-axis: Sample size, n; y-axis: Simulated Power.]

Comparison of Power against the Asymmetric Non-normal Distributions

[Figure 3: Comparison of Power for Different Normality Tests (SW, KS, LF, AD, JB) against Chi-square(4) Distribution at α = 0.05 and α = 0.10. Two panels; x-axis: Sample size, n; y-axis: Simulated Power.]

[Figure 4: Comparison of Power for Different Normality Tests (SW, KS, LF, AD, JB) against Gamma(4,5) Distribution at α = 0.05 and α = 0.10. Two panels; x-axis: Sample size, n; y-axis: Simulated Power.]

Computation of Sample Sizes for Different Powers against Different Normality Tests

Table 1: Sample Sizes of Uniform(0,1) Distribution for Different Powers against Different Normality Tests

                     α = 5%                         α = 10%
Power     SW    KS    LF    AD    JB      SW    KS    LF    AD    JB
80%       54   427   140    72   114      44   330   108    57    88
90%       64   490   171    87   125      52   390   136    70    98

Table 2: Sample Sizes of Chi-square(4) Distribution for Different Powers against Different Normality Tests

                     α = 5%                         α = 10%
Power     SW    KS    LF    AD    JB      SW    KS    LF    AD    JB
80%       34   184    63    42    54      28   143    49    34    47
90%       45   213    82    53    68      37   170    67    45    56

Conclusion

→ In general, among the five tests considered, the Shapiro-Wilk test is the most powerful, followed by the Anderson-Darling, Jarque-Bera and Lilliefors tests respectively, whereas the Kolmogorov-Smirnov test is the least powerful.
→ However, all of these tests have low power for small sample sizes.
→ Practitioners should not depend solely on graphical techniques such as the Q-Q plot or histogram to draw conclusions about the distribution of the data. Rather, the graphical techniques should be combined with a formal normality test.
