Anderson-Darling normality test

The Anderson-Darling normality test is a statistical goodness-of-fit test used to check whether a sample of data is consistent with a specified theoretical distribution (here, the normal distribution), and thus to confirm or reject a hypothesis made earlier (L. Jäntschi, S.D. Bolboacă 2018, p. 1). The test was created in 1954 by Theodore Wilbur Anderson and Donald Allan Darling as a modification of the Cramér-von Mises (CVM) and Kolmogorov-Smirnov (K-S) tests (N. Mohd Razali, Y. Bee Wah 2011, p. 24).

Advantages and disadvantages

The most important difference between the Kolmogorov-Smirnov test and the Anderson-Darling test is that the critical values of the Kolmogorov-Smirnov test do not depend on the distribution being tested (this holds only if all parameters of that distribution are known). In the Anderson-Darling test, by contrast, the critical values are determined by the specific distribution being tested. The advantage is that the Anderson-Darling test is more sensitive to how the tested values are actually distributed, whereas the Kolmogorov-Smirnov test is more rigid. The disadvantage of the Anderson-Darling test is that the critical values must be calculated separately for each distribution that is tested (Anderson-Darling Test 2012).

Anderson-Darling Formula

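The original formula of the Anderson-Darling test, in which \(F_n\) is the empirical distribution function of the sample and \(F\) is the hypothesized distribution function, took the form given below (H. Shin, Y. Jung, C. Jeong, J.H. Heo 2012, p. 107):\[ A^2_n = n \int_{-\infty}^\infty \frac{[F_n(x) - F(x)]^2}{F(x)\; [1-F(x)]} \, dF(x) \] In practice, however, a simplified version of this formula is usually used, where \(F_{X_i}\) denotes \(F\) evaluated at the \(i\)-th smallest observation (H. Shin, Y. Jung, C. Jeong, J.H. Heo 2012, p. 107):\[A^2_n = -n-\frac{1}{n}\sum\limits_{i=1}^n (2i-1)\left[\log F_{X_{i}}+\log\left(1-F_{X_{n+1-i}}\right)\right]\]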
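The simplified sum can be evaluated directly once the sample is sorted. The following is a minimal Python sketch, assuming the hypothesized distribution is fully specified and supplied as a function; the names `anderson_darling_statistic` and `uniform_cdf` and the five-point sample are illustrative only and do not come from the cited sources.

```python
import math


def anderson_darling_statistic(sample, cdf):
    """Evaluate the simplified A^2_n sum for a fully specified continuous CDF.

    `cdf` is a hypothetical callable returning F(x); it must return values
    strictly between 0 and 1 for every observation, otherwise log() fails.
    """
    x = sorted(sample)                 # order statistics X_1 <= ... <= X_n
    n = len(x)
    f = [cdf(v) for v in x]            # F evaluated at each ordered value
    total = 0.0
    for i in range(1, n + 1):
        # Pair the i-th smallest value with the i-th largest value.
        total += (2 * i - 1) * (math.log(f[i - 1]) + math.log(1.0 - f[n - i]))
    return -n - total / n


def uniform_cdf(v):
    """CDF of the uniform distribution on [0, 1] (illustrative choice)."""
    return min(max(v, 0.0), 1.0)


# Illustrative sample, evenly spread over (0, 1).
sample = [0.1, 0.3, 0.5, 0.7, 0.9]
a2 = anderson_darling_statistic(sample, uniform_cdf)
print(a2)
```

An evenly spread sample such as this one yields a small value of the statistic; samples that pile up near one end of the interval yield larger values, because the logarithms penalize observations that the hypothesized distribution places far in its tails.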

Test for normality

Power comparisons carried out by statisticians indicate that the Anderson-Darling test performs very well among tests based on the empirical distribution function, and it has been described as "the most powerful EDF (Empirical Distribution Function) test" (N. Mohd Razali, Y. Bee Wah 2011, p. 24).

To carry out the Anderson-Darling normality test, follow the steps given below (R.B. D'Agostino 1986, p. 372):

Sort the data in ascending order:\[ X_1\leq \ldots \leq X_n\]
Calculate the standardized values \(Y_i\) for \( i=1,\ldots,n\), where \(\overline{X}\) is the sample mean and \(S\) is the sample standard deviation:\[Y_i= \frac{X_i-\overline{X}}{S}\]
Determine the value of \(P_i\):\[ P_i = \Phi(Y_i) = \int_{-\infty}^{Y_i} \frac{e^{-t^2/2}}{\sqrt{2 \pi}} \, dt \]
Calculate the \( A^2 \) statistic:\[A^2 = -n-\frac{1}{n}\sum\limits_{i=1}^n (2i-1)\left[\log P_i + \log\left(1- P_{n+1-i}\right)\right]\]
Calculate \( A^* \), the modified statistic:\[ A^* = A^2 \left(1 + \frac{0.75}{n} + \frac{2.25}{n^2}\right) \]
Reject the hypothesis of normality if \(A^*\) exceeds the critical value for the chosen level of significance; critical values are tabulated for the levels 0.10, 0.05, 0.025, 0.01 and 0.005.

The Anderson-Darling normality test can be used only if \( n \geq 8 \).
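The procedure above can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not a reference implementation: the function name `anderson_darling_normality`, the two example samples and, in particular, the critical values in `CRITICAL_VALUES` are assumptions (the values are those commonly quoted for the modified statistic when the mean and variance are estimated from the data) and should be checked against the cited tables before use.

```python
import math
from statistics import NormalDist, mean, stdev

# ASSUMPTION: commonly quoted critical values for the modified statistic A*
# (mean and variance estimated from the sample). Verify against the tables
# in the cited source before relying on them.
CRITICAL_VALUES = {0.10: 0.631, 0.05: 0.752, 0.025: 0.873, 0.01: 1.035, 0.005: 1.159}


def anderson_darling_normality(data, alpha=0.05):
    """Return (A^2, A*, reject) following the steps listed above."""
    n = len(data)
    if n < 8:
        raise ValueError("the procedure requires n >= 8")
    if alpha not in CRITICAL_VALUES:
        raise ValueError("no critical value stored for this significance level")

    x = sorted(data)                          # step 1: order the data
    x_bar, s = mean(x), stdev(x)              # sample mean and (n - 1) standard deviation
    y = [(v - x_bar) / s for v in x]          # step 2: standardized values Y_i
    p = [NormalDist().cdf(v) for v in y]      # step 3: P_i = Phi(Y_i)

    # Step 4: A^2 = -n - (1/n) * sum (2i - 1) [log P_i + log(1 - P_{n+1-i})]
    total = sum(
        (2 * i - 1) * (math.log(p[i - 1]) + math.log(1.0 - p[n - i]))
        for i in range(1, n + 1)
    )
    a2 = -n - total / n

    # Step 5: small-sample modification.
    a_star = a2 * (1 + 0.75 / n + 2.25 / n ** 2)

    # Step 6: compare with the (assumed) critical value.
    return a2, a_star, a_star > CRITICAL_VALUES[alpha]


# Illustrative data: normal quantiles (close to normal by construction)
# and a doubling sequence (strongly right-skewed).
normal_like = [NormalDist().inv_cdf((i - 0.5) / 20) for i in range(1, 21)]
skewed = [2.0 ** k for k in range(12)]

print(anderson_darling_normality(normal_like))
print(anderson_darling_normality(skewed))
```

With these inputs the quantile-based sample should not be rejected at the 5% level, while the doubling sequence should be. Extreme outliers can push \(P_i\) so close to 0 or 1 that the logarithm fails numerically; the sketch does not guard against that case.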

See also Statistical power

References

Anderson-Darling Test (2012) Anderson-Darling Test, "e-Handbook of Statistical Methods"
D'Agostino R.B. (1986) Goodness-of-Fit Techniques, Marcel Dekker Inc., New York, pp. 372-373
Jäntschi L., Bolboacă S.D. (2018) Computation of Probability Associated with Anderson–Darling Statistic, "Mathematics", no. 6(6), pp. 1-16
Mohd Razali N., Bee Wah Y. (2011) Power Comparisons of Shapiro-Wilk, Kolmogorov-Smirnov, Lilliefors and Anderson-Darling Tests, "Journal of Statistical Modeling and Analytics", no. 2(1), pp. 21-33
Shin H., Jung Y., Jeong C., Heo J.H. (2012) Assessment of modified Anderson–Darling test statistics for the generalized extreme value and generalized logistic distributions, "Stochastic Environmental Research and Risk Assessment", no. 26(1), pp. 105-114

Author: Patrycja Czerwiec