27. Nonparametric tests
The Practice of Statistics in the Life Sciences, Second Edition
© 2012 W.H. Freeman and Company

Objectives (PSLS Chapter 27)
Nonparametric tests
* Rank tests versus Normal tests
* Ranks
* Comparing two samples: the Wilcoxon rank sum test

Rank tests vs. Normal tests
For strongly skewed data, we might prefer the median to the mean for describing the center of the data. Hypotheses for rank tests rely on the median and data ranks.

Ranks
To rank sample observations, first arrange them in order from smallest to largest. The rank of each observation is its position in this ordered list, starting with rank 1 for the smallest observation. For this lecture only, we assume there are no ties.

Weeds among the corn
A researcher planted corn at the same rate in 8 small plots of ground, then weeded the corn rows by hand to allow no weeds in 4 randomly selected plots and exactly three lamb’s-quarter weed plants per meter of row in the other 4 plots. Here are the yields of corn (bushels per acre) in each plot.

Back-to-back stemplots show non-Normality, a likely outlier, and small sample size.

First we sort all 8 observations together, from smallest to largest. Then we assign a rank to each observation. The circled numbers are plots with no weeds.

The idea of rank tests is to look just at the position in this list. Working with ranks allows us to dispense with the numerical values of the data and with specific conditions such as Normality.

Wilcoxon rank sum test for two samples
We have two independent random samples of sizes n1 and n2. Rank all N = n1 + n2 observations. The sum W of the ranks for the first sample is the Wilcoxon rank sum statistic.

If the two populations have the same continuous distribution, then W has mean and standard deviation

\[ \mu_W = \frac{n_1 (N + 1)}{2}, \qquad \sigma_W = \sqrt{\frac{n_1 n_2 (N + 1)}{12}} \]

What hypotheses does Wilcoxon test?
The Wilcoxon rank sum test will test the hypothesis
H0: population median 1 = population median 2
when both populations have distributions of the same shape.
That is, we test
H0: the two population distributions are the same
Ha: one has values that are systematically larger

These hypotheses are considered “nonparametric” because they do not include a parameter.

If the presence of weeds reduces corn yields, we expect the ranks of the yields from plots without weeds to be larger as a group than the ranks from plots with weeds.
H0: There is no difference in the population distributions of yields.
Ha: Yields are systematically higher in weed-free plots.

The conditions for the Wilcoxon test are met:
* Data come from a randomized comparative experiment.
* Yield of corn in bushels per acre has a continuous distribution.

The test statistic is the sum of the ranks W for the weed-free plots.
N = 8: n1 (no weeds) = 4, and n2 (three weeds per meter) = 4.

The sum of ranks for the weed-free plots has mean and standard deviation:

\[ \mu_W = \frac{n_1 (N + 1)}{2} = \frac{4(9)}{2} = 18 \]

\[ \sigma_W = \sqrt{\frac{n_1 n_2 (N + 1)}{12}} = \sqrt{\frac{(4)(4)(9)}{12}} = 3.464 \]

Software gives the P-value as P(W ≥ 23) = 0.1. There is not enough evidence (at α = 5%) to say that yields are systematically higher in weed-free plots. Keep in mind, though, that the samples are small and the test may not have enough power.
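
The numbers in the weed example can be checked with a short Python sketch. It assumes there are no ties, so the ranks are simply 1 through N, and that under H0 every choice of n1 ranks for the first sample is equally likely. The function names (`rank_sum_moments`, `exact_upper_p_value`) are illustrative and do not come from any statistical package. The observed value W = 23 is read from the P-value statement above.

```python
from itertools import combinations
from math import sqrt


def rank_sum_moments(n1, n2):
    """Mean and standard deviation of W under H0 (no ties assumed)."""
    N = n1 + n2
    mean = n1 * (N + 1) / 2
    sd = sqrt(n1 * n2 * (N + 1) / 12)
    return mean, sd


def exact_upper_p_value(n1, n2, w_obs):
    """Exact P(W >= w_obs) under H0, by listing every possible set of
    n1 ranks that the first sample could receive out of 1..N."""
    N = n1 + n2
    rank_sets = list(combinations(range(1, N + 1), n1))
    at_least = sum(1 for ranks in rank_sets if sum(ranks) >= w_obs)
    return at_least / len(rank_sets)


# Weed example: n1 = 4 weed-free plots, n2 = 4 weedy plots, observed W = 23.
mean_w, sd_w = rank_sum_moments(4, 4)
p_value = exact_upper_p_value(4, 4, 23)
print(mean_w, round(sd_w, 3), p_value)  # 18.0 3.464 0.1
```

Of the 70 equally likely ways to assign 4 of the 8 ranks to the weed-free plots, 7 give a rank sum of 23 or more, which reproduces the one-sided P-value of 0.1. Full enumeration is practical only for small samples like these. For larger samples, software typically uses a Normal approximation based on the mean and standard deviation of W.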