27. Nonparametric tests
The Practice of Statistics in the Life Sciences
Second Edition
© 2012 W.H. Freeman and Company
Objectives (PSLS Chapter 27)
Nonparametric tests

Rank tests versus Normal tests

Ranks

Comparing two samples: the Wilcoxon rank sum test
Rank tests vs. Normal tests
For strongly skewed data, we might prefer the median to the mean for
describing the center of the data.
Hypotheses for rank tests rely on the median and data ranks.
Ranks
To rank sample observations, first arrange them in order from smallest
to largest. The rank of each observation is its position in this ordered
list, starting with rank 1 for the smallest observation.
For this lecture only, we assume that there are no ties.
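The ranking procedure can be sketched in a few lines of Python. This is only an illustration under the no-ties assumption: the function name `rank_observations` and the example values are made up for the sketch and are not the corn data.

```python
def rank_observations(values):
    """Return the rank of each value (1 = smallest), assuming no ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch assumes there are no tied values")
    ordered = sorted(values)
    # Position in the ordered list, counting from 1, is the rank.
    position = {v: i + 1 for i, v in enumerate(ordered)}
    return [position[v] for v in values]


# Hypothetical values, for illustration only (not from the lecture).
example = [12.5, 3.1, 8.4, 20.0]
example_ranks = rank_observations(example)
print(example_ranks)
```

The ranks are returned in the original order of the observations, so each rank can still be matched to the group its observation came from.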
Weeds among the corn
A researcher planted corn at the same rate in 8 small plots of ground,
then weeded the corn rows by hand to allow no weeds in 4 randomly
selected plots and exactly three lamb’s-quarter weed plants per meter of row in
the other 4 plots. Here are the yields of corn (bushels per acre) in each plot.
Back-to-back stemplots show
non-Normality, a likely outlier,
and small sample size.
First we sort all 8 observations together, from smallest to largest.
Then we assign a rank to each observation.
The circled numbers are plots with no weeds.
The idea of rank tests is to look just at the position in this list.
Working with ranks allows us to dispense with the numerical values of the data
and specific conditions such as Normality.
Wilcoxon rank sum test for two samples
We have two independent random samples of sizes n1 and n2.
Rank all N = n1 + n2 observations. The sum W of the ranks for the first
sample is the Wilcoxon rank sum statistic.
If the two populations have the same continuous distribution, then W has
mean and standard deviation
\[ \mu_W = \frac{n_1 (N + 1)}{2} \]
\[ \sigma_W = \sqrt{\frac{n_1 n_2 (N + 1)}{12}} \]
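As a rough sketch, the statistic W and its mean n1(N + 1)/2 and standard deviation sqrt(n1 n2 (N + 1)/12) under the null hypothesis could be computed as follows. The function names and the two small samples are hypothetical, chosen only to illustrate the arithmetic, and the code assumes no ties.

```python
import math


def rank_sum(sample1, sample2):
    """Wilcoxon rank sum W: sum of the pooled ranks of sample1 (no ties assumed)."""
    pooled = sorted(sample1 + sample2)
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch assumes there are no tied values")
    # index() gives the 0-based position in the ordered list; add 1 for the rank.
    return sum(pooled.index(x) + 1 for x in sample1)


def rank_sum_mean_sd(n1, n2):
    """Mean and standard deviation of W when both populations are identical."""
    N = n1 + n2
    mean = n1 * (N + 1) / 2
    sd = math.sqrt(n1 * n2 * (N + 1) / 12)
    return mean, sd


# Hypothetical samples, for illustration only (not from the lecture).
first = [5.2, 9.1, 7.7]
second = [4.0, 6.3]
W = rank_sum(first, second)
mu_W, sigma_W = rank_sum_mean_sd(len(first), len(second))
print(W, mu_W, round(sigma_W, 3))
```

Comparing W with its mean shows informally whether the first sample tends to hold the larger ranks.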
What hypotheses does Wilcoxon test?
The Wilcoxon rank sum test will test the hypothesis
H0: population median1 = population median2
when both populations have distributions of the same shape.
That is, we test
H0: the two population distributions are the same
Ha: one has values that are systematically larger
These hypotheses are considered “nonparametric” because they do
not include a parameter.
If the presence of weeds reduces corn yields, we expect the ranks of
the yields from plots without weeds to be larger as a group than the
ranks from plots with weeds.
H0: There is no difference in the population distributions of yields.
Ha: Yields are systematically higher in weed-free plots.
The conditions for the Wilcoxon test are met:
• Data come from a randomized comparative experiment.
• Yield of corn in bushels per acre has a continuous distribution.
The test statistic is the sum of the ranks, W, for the weed-free plots.
N = 8: n1 (no weeds) = 4, and n2 (three weeds per meter) = 4.
The sum of ranks for the weed-free plots has mean and standard deviation:
\[ \mu_W = \frac{n_1 (N + 1)}{2} = \frac{4(9)}{2} = 18 \]
\[ \sigma_W = \sqrt{\frac{n_1 n_2 (N + 1)}{12}} = \sqrt{\frac{(4)(4)(9)}{12}} = 3.464 \]
Software gives the P-value as P(W ≥ 23) = 0.1. There is not enough evidence (at
α = 5%) to say that yields are systematically higher in weed-free plots. Keep in
mind, though, that the samples are small and the test may not have enough
power.
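One way to arrive at a P-value like this is exact enumeration: if the two distributions are the same and there are no ties, every choice of 4 ranks out of 8 for the weed-free plots is equally likely. The sketch below counts how many of those choices give a rank sum of at least 23. The function name is hypothetical, and this is not necessarily how the software used in the lecture computes its result.

```python
from itertools import combinations


def exact_upper_p_value(w_observed, n1, n2):
    """P(W >= w_observed) under H0, by listing every possible set of ranks."""
    N = n1 + n2
    # Each combination is one equally likely set of ranks for the first sample.
    sums = [sum(c) for c in combinations(range(1, N + 1), n1)]
    extreme = sum(1 for s in sums if s >= w_observed)
    return extreme / len(sums)


# Values from the corn example: W = 23, n1 = 4 weed-free plots, n2 = 4 weedy plots.
p_value = exact_upper_p_value(23, 4, 4)
print(p_value)
```

With only 8 plots there are just 70 possible rank assignments, so full enumeration is cheap. For larger samples, software typically switches to faster methods or an approximation.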