Some simpler statistical tests for rejecting outliers in quantitative data*

Large data sets (N>100):

For small data sets:

Rule of Huge Error: If you have a single outlier, x, then you can discard it with 98% confidence if any of the following conditions are met, where x̄ is the mean and s is the standard deviation of the data:

  N between 5 and 8:    |x − x̄| > 6s
  N between 8 and 14:   |x − x̄| > 5s
  N of 15 or more:      |x − x̄| > 4s

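Dixon’s Q-test: If you have a single outlier, and your data has a normal distribution, then you can discard the outlier if Q > Qcrit. Order the data values in increasing or decreasing order, such that the outlier is the final data point (xN). Q is the gap between the outlier and a nearby value divided by a range of the data, and the ratio used depends on N:

  3 ≤ N ≤ 7:     Q = |xN − x(N−1)| / |xN − x1|
  8 ≤ N ≤ 10:    Q = |xN − x(N−1)| / |xN − x2|
  11 ≤ N ≤ 13:   Q = |xN − x(N−2)| / |xN − x2|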

*Compiled from:
- Personal webpage of Prof. James K. Hardy, Dept. of Chemistry, University of Akron, “Statistical Treatment of Data” at http://ull.chemistry.uakron.edu/analytical/Statistics/. This has good notes for basic statistics, refers to specific tests for the rejection of data, and discusses large and small sample sets.
- “Dixon's Q-test: Detection of a single outlier”, on the University of Athens Department of Chemistry website at http://www.chem.uoa.gr/Applets/AppletQtest/Appl_Qtest2.html. This includes an applet for doing Q-test calculations and a brief discussion on rejecting data from small data sets. Note: although much of the department’s website is in Greek, this page is in English.
- “Statistical Treatment of Analytical Data: Outliers (Chapter 6)” by Z.B. Alfassi, Z. Boger and Y. Ronen. CRC Press: 2005. This chapter is available for reading through Google Books if your library doesn’t have a copy.

Qcrit Values for Dixon's Q-test for Outliers

                  Risk of false rejection (%)
Data points N     0.5      1        5        10
     3           0.994    0.988    0.941    0.886
     4           0.926    0.889    0.765    0.679
     5           0.821    0.780    0.642    0.557
     6           0.740    0.698    0.560    0.482
     7           0.680    0.637    0.507    0.434
     8           0.725    0.683    0.554    0.479
     9           0.677    0.635    0.512    0.441
    10           0.639    0.597    0.477    0.409
    11           0.713    0.679    0.576    0.517
    12           0.675    0.642    0.546    0.490
    13           0.649    0.615    0.521    0.467
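The Q-test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the standard Dixon ratios (r10 for N = 3 to 7, r11 for N = 8 to 10, r21 for N = 11 to 13), the 5% risk column of the Qcrit table, and hypothetical names (`Q_CRIT_5`, `dixon_q`) and sample data that do not come from the handout.

```python
# 5% risk-of-false-rejection column of the Qcrit table, keyed by N.
Q_CRIT_5 = {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560, 7: 0.507,
            8: 0.554, 9: 0.512, 10: 0.477,
            11: 0.576, 12: 0.546, 13: 0.521}


def dixon_q(values, suspect_is_largest=True):
    """Return (Q, Qcrit, reject) for the suspected outlier.

    Assumes the standard Dixon ratios; the ratio changes at N = 8 and N = 11.
    """
    # Sort so that the suspected outlier is the final data point (xN).
    x = sorted(values, reverse=not suspect_is_largest)
    n = len(x)
    if n not in Q_CRIT_5:
        raise ValueError("the Qcrit table covers N = 3 to 13 only")

    if n <= 7:        # r10: gap to nearest neighbour over the full range
        gap, spread = x[-1] - x[-2], x[-1] - x[0]
    elif n <= 10:     # r11: range excludes the opposite extreme
        gap, spread = x[-1] - x[-2], x[-1] - x[1]
    else:             # r21: gap to the second-nearest neighbour
        gap, spread = x[-1] - x[-3], x[-1] - x[1]

    if spread == 0:
        raise ValueError("the range in the denominator is zero")

    q = gap / spread  # same sign top and bottom, so Q is non-negative
    return q, Q_CRIT_5[n], q > Q_CRIT_5[n]


# Hypothetical measurements with one suspiciously high value.
q, q_crit, reject = dixon_q([10.1, 10.2, 10.3, 10.2, 12.0])
print(f"Q = {q:.3f}, Qcrit = {q_crit}, reject = {reject}")
```

For a suspiciously low value, pass `suspect_is_largest=False` so that the data are ordered in decreasing order instead.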

Grubbs’ T-test: This test can be used to evaluate multiple possible outliers. Start with the point furthest from the mean, xi, calculate T = |xi − x̄| / s, where x̄ is the mean and s is the standard deviation, and discard the point if T > Tcrit. If you discard the outlier, and suspect others, then recalculate x̄ and s in order to evaluate the next furthest point.

Tcrit Values for Grubbs' T-test for Outliers

                  Risk of false rejection (%)
Data points N     0.1      0.5      1        5        10
     3           1.155    1.155    1.155    1.153    1.148
     4           1.496    1.496    1.492    1.463    1.425
     5           1.780    1.764    1.749    1.672    1.602
     6           2.011    1.973    1.944    1.822    1.729
     7           2.201    2.139    2.097    1.938    1.828
     8           2.358    2.274    2.221    2.032    1.909
     9           2.492    2.387    2.323    2.110    1.977
    10           2.606    2.482    2.410    2.176    2.036
    15           2.997    2.806    2.705    2.409    2.247
    20           3.230    3.001    2.884    2.557    2.385
    25           3.389    3.135    3.009    2.663    2.486
    50           3.789    3.483    3.336    2.956    2.768
   100           4.084    3.754    3.600    3.207    3.017
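The repeated Grubbs procedure (test the furthest point, discard it, recalculate, test again) can be sketched as follows. This is a minimal illustration, assuming T = |xi − x̄| / s with s as the sample standard deviation (N − 1 in the denominator), and using only the N = 3 to 10 rows of the 5% risk column of the Tcrit table. The names `T_CRIT_5` and `grubbs_reject` and the sample data are hypothetical.

```python
import statistics

# 5% risk-of-false-rejection column of the Tcrit table, N = 3 to 10 only.
T_CRIT_5 = {3: 1.153, 4: 1.463, 5: 1.672, 6: 1.822,
            7: 1.938, 8: 2.032, 9: 2.110, 10: 2.176}


def grubbs_reject(values, t_crit=T_CRIT_5):
    """Repeatedly discard the furthest point while T > Tcrit.

    Returns (kept, rejected). Stops when the furthest point passes the
    test or when N falls outside the tabulated critical values.
    """
    kept = list(values)
    rejected = []
    while len(kept) in t_crit:
        mean = statistics.mean(kept)
        s = statistics.stdev(kept)      # sample standard deviation
        if s == 0:
            break                       # all values identical, nothing to test
        # The candidate is the point furthest from the current mean.
        suspect = max(kept, key=lambda v: abs(v - mean))
        t = abs(suspect - mean) / s
        if t <= t_crit[len(kept)]:
            break                       # furthest point is not an outlier
        kept.remove(suspect)            # discard, then recalculate mean and s
        rejected.append(suspect)
    return kept, rejected


# Hypothetical measurements with one suspiciously high value.
kept, rejected = grubbs_reject([10.1, 10.2, 10.3, 10.2, 12.0])
print("kept:", kept, "rejected:", rejected)
```

Recomputing the mean and standard deviation inside the loop matters: a discarded outlier inflates s, so the next candidate has to be judged against statistics that no longer include it.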