One-way nonparametric ANOVA with trigonometric scores

Kravchuk, O.Y.
School of Land and Food Sciences, University of Queensland

Inspired by the simplicity of the Kruskal-Wallis k-sample procedure, we introduce a new rank test of the $\chi^2$ type that allows one to work with data that violate the normality assumption, being unimodal and symmetric but heavier tailed than the normal. This type of non-normality is common in biometrical applications and also describes the distribution of log-transformed Cauchy data. The distribution of the test statistic corresponds to the distribution of the first component of the well-known Cramer-von Mises test statistic. The test is asymptotically most efficient for the hyperbolic secant distribution, which is compared to the normal and logistic distributions in the diagram below.

$$f_{HSD}(y) = \frac{1}{\pi}\operatorname{sech}(y), \qquad f_{L}(y) = \frac{1}{4}\operatorname{sech}^2\left(\frac{y}{2}\right), \qquad f_{N}(y) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{y^2}{2}\right)$$

Fig. 1: Standardised normal, hyperbolic secant and logistic densities

The test is a one-way rank-based ANOVA, where we assume that within the k treatments the populations are continuous, belong to the same location family and may differ in the location parameter only. There are $N$ experimental units, where the $j$th treatment accumulates $n_j$ units. The test statistic is built on k "bridges" corresponding to k linear contrasts of type T1=(A1<A2,A3). The asymptotic distribution of the test statistic is $\chi^2$ with $k-1$ degrees of freedom. Computationally, the exact distribution is easy to construct on the basis of $k-1$ orthogonal contrasts (for example, for $k = 3$, $T_{1,\,2\cup 3}$ and $T_{2,3}$).

[Formulas: the test statistic $Q$ is defined through the squared bridge coefficients $S_j^2$, $j = 1, \dots, k$; each $S_j$ is a normalised sum over $i = 1, \dots, N$ of sine weights applied to the partial sums of the contrast scores $c_{D_j}$; the contrast scores $c_i$ take one of two values, depending on $n_j$ and $N$.]

$$Q \xrightarrow{d} \chi^2_{k-1}$$

For small samples ($\max(n_j) < 6$), the chi-square approximation is more conservative than the exact null distribution.
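The construction of an exact null distribution by enumeration can be sketched in a few lines. This is only an illustration under stated assumptions, not the poster's statistic. It uses a generic k-sample linear rank statistic with cosine scores $-\cos(\pi i/(N+1))$, chosen here because $-\cos(\pi u)$ is the optimal location score function for the hyperbolic secant density. Its normalisation differs from that of $Q$: its values are bounded by $N-1$, so they will not reproduce the tabulated quantiles of $Q$. The function names and the sample data are hypothetical.

```python
import math
from itertools import permutations


def cosine_scores(n_total):
    """Assumed scores a(i) = -cos(pi * i / (N + 1)) for ranks i = 1..N."""
    return [-math.cos(math.pi * i / (n_total + 1)) for i in range(1, n_total + 1)]


def score_statistic(labels, n_groups):
    """Generic k-sample rank-score statistic (not the poster's exact Q).

    labels[r] is the group index (0..k-1) of the observation with rank r + 1.
    Returns (N - 1) * (between-group sum of squares of the scores)
    divided by the total sum of squares of the centred scores.
    """
    n_total = len(labels)
    scores = cosine_scores(n_total)
    mean = sum(scores) / n_total
    total_ss = sum((a - mean) ** 2 for a in scores)
    sums = [0.0] * n_groups
    sizes = [0] * n_groups
    for a, j in zip(scores, labels):
        sums[j] += a
        sizes[j] += 1
    between_ss = sum((s - n * mean) ** 2 / n for s, n in zip(sums, sizes))
    return (n_total - 1) * between_ss / total_ss


def labels_by_rank(groups):
    """Pool the samples (no ties assumed) and list group labels in rank order."""
    pooled = sorted((x, j) for j, g in enumerate(groups) for x in g)
    return [j for _, j in pooled]


def exact_null(sizes):
    """Sorted statistic values over all distinct arrangements of group labels."""
    base = [j for j, n in enumerate(sizes) for _ in range(n)]
    arrangements = set(permutations(base))
    return sorted(score_statistic(arr, len(sizes)) for arr in arrangements)


if __name__ == "__main__":
    null = exact_null([3, 2, 2])  # 7! / (3! 2! 2!) = 210 arrangements
    groups = [[0.3, -0.2, 0.9], [-1.4, -0.8], [2.1, 1.7]]  # made-up data
    q = score_statistic(labels_by_rank(groups), 3)
    p_exact = sum(v >= q - 1e-12 for v in null) / len(null)
    p_chi2 = math.exp(-q / 2)  # chi-square survival function for 2 d.f.
    print(len(null), round(q, 3), round(p_exact, 3), round(p_chi2, 3))
```

Enumerating label arrangements rather than data permutations keeps the cost at $N!/(n_1! \cdots n_k!)$ evaluations, which is practical only for small samples of the kind discussed above.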
The diagram and table below provide the exact distribution for n1=3, n2=n3=2.

[Diagram: exact null distribution of Q for n1=3, n2=n3=2]

Q=q      P(Q<=q)
3.172    0.810
3.511    0.848
3.525    0.867
4.039    0.886
4.311    0.905
4.545    0.924
4.799    0.943
5.238    0.962
5.647    0.981
7.359    1.000

We illustrate the method by an artificial example of three normal populations that differ in location only. The populations are, correspondingly, N(0,1), N(-1,1), N(2,1). Random samples of size 8 are taken from these populations.

[Figure: plot of the three samples, labelled N(0,1), N(-1,1), N(2,1)]

One-way ANOVA: F = 18.07, p = 0.000
Kruskal-Wallis: KW = 14.11, p = 0.001
Trigonometric ANOVA: Q = 14.32, p = 0.001

Illustrating the procedure…

When there is a certain linear trend among the treatments, the corresponding bridge tends to have a U-shape. We measure the strength of such a tendency by the first coefficient of the Fourier sine-decomposition of the bridge.

[Figure: the three bridges, with S1=1.17, S2=2.58, S3=-3.75]

The larger the sample sizes, the smoother the bridge. The actual shape of the bridge depends on the difference in location as well as on the distributions of the underlying populations. If the difference is large, the shape is strictly triangular regardless of the underlying distribution, and the median k-sample test works well. If the difference in location is small, then for symmetric, unimodal distributions the shape of the bridges is determined by the tails of the distributions.

Distribution         Efficiency
Normal               0.905
Logistic             0.986
Hyperbolic secant    1.000

The difference in scale among several Cauchy distributions may be analysed by means of the current test. To illustrate such an application, we perform the following ANOVA on the log-transformed Cauchy populations: Cauchy(0,1), Cauchy(0,5) and Cauchy(0,2). The log-transformation of the absolute values of the data makes them more normal-like. However, the analysis of the residuals of one-way ANOVA shows a departure from normality. The test allows us to perform the formal analysis and detect the difference in scale. The Kruskal-Wallis test gives a similar conclusion.
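The link between the Cauchy scale problem and the hyperbolic secant location family can be made explicit with a short derivation. This is a sketch; the symbols $X$, $s$ and $Y$ are introduced here for illustration and are not in the original. For $X \sim \text{Cauchy}(0, s)$, the density of $|X|$ is

$$f_{|X|}(x) = \frac{2s}{\pi\,(s^2 + x^2)}, \qquad x > 0,$$

and the change of variable $Y = \log|X|$, with $x = e^{y}$ and $dx = e^{y}\,dy$, gives

$$f_Y(y) = \frac{2 s\, e^{y}}{\pi\,(s^2 + e^{2y})} = \frac{1}{\pi}\cdot\frac{2}{e^{\,y - \log s} + e^{-(y - \log s)}} = \frac{1}{\pi}\operatorname{sech}(y - \log s).$$

Under this reading, a difference in the Cauchy scale $s$ becomes a difference in the location $\log s$ of a hyperbolic secant density, which is the setting in which the test is stated to be asymptotically most efficient.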
The trigonometric ANOVA on log-transformed Cauchy…

Random samples of size 8 were taken from the parent populations.

[Figure: plot of the three samples, labelled Log(C(0,1)), Log(C(0,5)), Log(C(0,2))]

[Figure: normal probability plot of the one-way ANOVA residuals (RESI1). Average: -0.0000000, StDev: 1.34377, N: 24. Anderson-Darling normality test: A-Squared = 0.631, P-Value = 0.088]

One-way ANOVA: F = 5.78, p = 0.01
Kruskal-Wallis: KW = 11.26, p = 0.004
Trigonometric ANOVA: Q = 11.33, p = 0.003

The multiple comparisons and contrasts are to be further developed for this test. The two-way test with trigonometric scores is to be investigated. The performance of the test is to be compared to that of the k-sample Cramer-von Mises test.

Olena Kravchuk, LAFS, UQ
[email protected]
(07) 33652171