Chapter 14
Nonparametric Statistics
Introduction: Distribution-Free Tests
Distribution-free tests – statistical tests
that don’t rely on assumptions about the
probability distribution of the sampled
population
Nonparametrics – branch of inferential
statistics devoted to distribution-free tests
Rank statistics (Rank tests) –
nonparametric statistics based on the ranks
of measurements
Single Population Inferences
The Sign test is used to make inferences
about the central tendency of a single
population
Test is based on the median η
Test involves hypothesizing a value for the
population median, then testing whether the
numbers of sample values above and below
the hypothesized median differ significantly
from an even split
Single Population Inferences
Sign Test for a Population Median η
One-Tailed Test
H0: η = η0
Ha: η > η0 [or Ha: η < η0]
Test statistic: S = number of sample measurements greater than η0
[or S = number of measurements less than η0]
Observed significance level: p-value = P(x ≥ S)
Two-Tailed Test
H0: η = η0
Ha: η ≠ η0
Test statistic: S = larger of S1 and S2, where S1 is the number of
measurements less than η0 and S2 is the number of measurements
greater than η0
Observed significance level: p-value = 2P(x ≥ S)
Where x has a binomial distribution with parameters n and p = .5
Rejection region: Reject H0 if p-value ≤ α (here α = .05)
Conditions required for sign test – sample must be randomly selected from a
continuous probability distribution
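The binomial p-value above can be computed directly. The following is a minimal sketch, not part of the original slides: the function name `sign_test_p_value`, the sample values, and the hypothesized median of 2.5 are all invented for illustration, and dropping measurements equal to η0 is an assumed convention.

```python
from math import comb

def sign_test_p_value(sample, eta0, alternative="greater"):
    """Exact sign-test p-value for H0: median = eta0.

    alternative is "greater" (Ha: median > eta0), "less", or "two-sided".
    Measurements equal to eta0 are dropped (assumed convention).
    """
    above = sum(1 for x in sample if x > eta0)
    below = sum(1 for x in sample if x < eta0)
    n = above + below
    if alternative == "greater":
        s = above
    elif alternative == "less":
        s = below
    elif alternative == "two-sided":
        s = max(above, below)
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    # P(x >= s) where x is binomial with parameters n and p = .5
    tail = sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n
    p_value = 2 * tail if alternative == "two-sided" else tail
    return s, min(p_value, 1.0)

# Hypothetical measurements, invented for illustration only.
sample = [1.2, 3.4, 2.8, 4.1, 3.9, 5.0, 2.2, 4.7]
s, p_value = sign_test_p_value(sample, eta0=2.5, alternative="greater")
print(s, p_value)  # S = 6 and p-value = 37/256
```

With these invented numbers the p-value exceeds .05, so H0 would not be rejected.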
Single Population Inferences
Large-Sample Sign Test for a Population Median η
One-Tailed Test
H0: η = η0
Ha: η > η0 [or Ha: η < η0]
Observed significance level: p-value = P(x ≥ S)
Rejection region: z > zα
Two-Tailed Test
H0: η = η0
Ha: η ≠ η0
Observed significance level: p-value = 2P(x ≥ S)
Rejection region: z > zα/2
Test statistic (both tests):
$$z = \frac{(S - .5) - .5n}{.5\sqrt{n}}$$
Where x has a binomial distribution with parameters n and p = .5
Conditions required for sign test – sample must be randomly selected from a
continuous probability distribution
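The large-sample statistic is simple arithmetic once S and n are known. In this sketch, which is not from the original slides, S = 25 and n = 36 are hypothetical values and the function name `sign_test_z` is an assumption. The critical value comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist

def sign_test_z(s, n):
    """Large-sample sign test statistic z = ((S - .5) - .5n) / (.5 * sqrt(n))."""
    return ((s - 0.5) - 0.5 * n) / (0.5 * sqrt(n))

# Hypothetical values, invented for illustration only.
z = sign_test_z(s=25, n=36)

# One-tailed critical value z_alpha for alpha = .05
alpha = 0.05
z_alpha = NormalDist().inv_cdf(1 - alpha)

reject_h0 = z > z_alpha
print(round(z, 4), round(z_alpha, 4), reject_h0)
```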
Comparing Two Populations: Independent
Samples
The Wilcoxon Rank Sum Test is used when
two independent random samples are being
used to compare two populations, and the t-test is not appropriate
It tests the hypothesis that the probability
distributions associated with the two
populations are equivalent
Comparing Two Populations: Independent
Samples
Rank Data from both samples from smallest to largest
If populations are the same, ranks should be randomly
mixed between the samples
Percentage Cost of Living Change, as Predicted
by Government and University Economists
Government Economist (1)      University Economist (2)
Prediction   Rank             Prediction   Rank
   3.1         4                 4.4         6
   4.8         7                 5.8         9
   2.3         2                 3.9         5
   5.6         8                 8.7        11
   0.0         1                 6.3        10
   2.9         3                10.5        12
                                10.8        13
Test statistic is based on the rank sums – the totals of the
ranks for each of the samples. T1 is the sum for sample 1,
T2 is the sum for sample 2
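The ranking and rank sums for the economist predictions can be reproduced with a short sketch. The helper name `average_ranks` is an assumption. It gives tied values the average of the ranks they span, although this data set has no ties.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

government = [3.1, 4.8, 2.3, 5.6, 0.0, 2.9]
university = [4.4, 5.8, 3.9, 8.7, 6.3, 10.5, 10.8]

# Rank the pooled data, then split the ranks back by sample
ranks = average_ranks(government + university)
t1 = sum(ranks[:len(government)])
t2 = sum(ranks[len(government):])
print(t1, t2)  # rank sums T1 and T2
```

As a check, T1 + T2 must equal n(n + 1)/2 for the pooled sample size n, here 13 × 14 / 2 = 91.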
Comparing Two Populations: Independent
Samples
Wilcoxon Rank Sum Test: Independent Samples
One-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted to the right of D2
[or Ha: D1 is shifted to the left of D2]
Test statistic: T1, if n1 < n2; T2, if n2 < n1
(Either rank sum can be used if n1 = n2)
Rejection region:
T1: T1 ≥ TU [or T1 ≤ TL]
T2: T2 ≤ TL [or T2 ≥ TU]
Two-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted either to the left or to the right of D2
Test statistic: T1, if n1 < n2; T2, if n2 < n1
(Either rank sum can be used if n1 = n2)
We will denote this rank sum as T
Rejection region: T ≤ TL or T ≥ TU
Where TL and TU are obtained from table
Required Conditions:
Random, independent samples
Probability distributions from which the samples are drawn are continuous
Comparing Two Populations: Independent
Samples
Wilcoxon Rank Sum Test for Large Samples (n1 ≥ 10 and n2 ≥ 10)
One-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted to the right of D2
[or Ha: D1 is shifted to the left of D2]
Rejection region: z > zα (or z < −zα)
Two-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted either to the left or to the right of D2
Rejection region: |z| > zα/2
Test statistic (both tests):
$$z = \frac{T_1 - \dfrac{n_1(n_1 + n_2 + 1)}{2}}{\sqrt{\dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}}}$$
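The arithmetic of the large-sample statistic can be sketched as follows. For lack of other data, the example reuses the economist rank sum T1 = 25 with n1 = 6 and n2 = 7. Those sample sizes are below the n ≥ 10 guideline, so the result only illustrates the calculation and is not a valid large-sample test. The function name `rank_sum_z` is an assumption.

```python
from math import sqrt

def rank_sum_z(t1, n1, n2):
    """Large-sample z statistic for the Wilcoxon rank sum T1."""
    mean_t1 = n1 * (n1 + n2 + 1) / 2            # expected value of T1 under H0
    sd_t1 = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # standard deviation of T1 under H0
    return (t1 - mean_t1) / sd_t1

# T1 = 25 is the government-economist rank sum (4 + 7 + 2 + 8 + 1 + 3).
# n1 = 6 and n2 = 7 are too small for the approximation; illustration only.
z = rank_sum_z(t1=25, n1=6, n2=7)
print(round(z, 4))
```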
Comparing Two Populations: Paired
Differences Experiment
Wilcoxon Signed Rank Test: An alternative test to the
paired difference of means procedure
Analysis is based on the ranks of the absolute values of the paired differences
Softness Ratings of Paper
         Product    Difference   Absolute Value   Rank of
Judge    A    B       (A-B)      of Difference    Absolute Value
  1      6    4         2              2               5
  2      8    5         3              3               7.5
  3      4    5        -1              1               2
  4      9    8         1              1               2
  5      4    1         3              3               7.5
  6      7    9        -2              2               5
  7      6    2         4              4               9
  8      5    3         2              2               5
  9      6    7        -1              1               2
 10      8    2         6              6              10
T+ = Sum of positive ranks = 46
T- = Sum of negative ranks = 9
Any differences of 0 are eliminated, and n is reduced
accordingly
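The signed rank sums for the softness ratings can be reproduced with the sketch below. The helper name `average_ranks` is an assumption. It assigns tied absolute differences the average of the ranks they span, which is how the 7.5, 5 and 2 entries in the table arise.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

product_a = [6, 8, 4, 9, 4, 7, 6, 5, 6, 8]
product_b = [4, 5, 5, 8, 1, 9, 2, 3, 7, 2]

# Differences of 0 are eliminated, which reduces n accordingly
diffs = [a - b for a, b in zip(product_a, product_b) if a != b]
ranks = average_ranks([abs(d) for d in diffs])

t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
print(t_plus, t_minus)  # T+ and T-
```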
Comparing Two Populations: Paired
Differences Experiment
Wilcoxon Signed Rank Test for a Paired Difference Experiment
Let D1 and D2 represent the probability distributions for
populations 1 and 2, respectively
One-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted to the right of D2
[or Ha: D1 is shifted to the left of D2]
Test statistic: T-, the rank sum of the negative differences
(or T+, the rank sum of the positive differences)
Rejection region: T- ≤ T0 [or T+ ≤ T0]
Two-Tailed Test
H0: D1 and D2 are identical
Ha: D1 is shifted either to the left or to the right of D2
Test statistic: T, the smaller of T+ or T-
Rejection region: T ≤ T0
Where T0 is from table
Required Conditions:
Sample of differences is randomly selected
Probability distribution from which sample is drawn is continuous
Comparing Three or More Populations:
Completely Randomized Design
Kruskal-Wallis H-Test
An alternative to the completely randomized ANOVA
Based on comparison of rank sums
Number of Available Beds
Hospital 1        Hospital 2        Hospital 3
Beds   Rank       Beds   Rank       Beds   Rank
  6      5         34     25         13      9.5
 38     27         28     19         35     26
  3      2         42     30         19     15
 17     13         13      9.5        4      3
 11      8         40     29         29     20
 30     21         31     22          0      1
 15     11          9      7          7      6
 16     12         32     23         33     24
 25     17         39     28         18     14
  5      4         27     18         24     16
R1 = 120          R2 = 210.5        R3 = 134.5
Comparing Three or More Populations:
Completely Randomized Design
Kruskal-Wallis H-Test for Comparing k Probability Distributions
H0: The k probability distributions are identical
Ha: At least two of the k probability distributions differ in location
Test statistic:
$$H = \frac{12}{n(n+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(n+1)$$
Where
nj = Number of measurements in sample j
Rj = Rank sum for sample j, where the rank of each measurement is computed according to its
relative magnitude in the totality of data for the k samples
n = Total sample size = n1 + n2 + … + nk
Rejection region: H > χ²α with (k-1) degrees of freedom
Required Conditions:
•The k samples are random and independent
•5 or more measurements per sample
•Probability distributions from which the samples are drawn are continuous
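As a worked check of the H statistic, the sketch below plugs in the hospital rank sums (R1 = 120, R2 = 210.5, R3 = 134.5, with 10 measurements each). The function name `kruskal_wallis_h` is an assumption. The p-value line relies on the fact that a chi-square variable with exactly 2 degrees of freedom has upper-tail probability exp(−x/2), so that shortcut applies only when k = 3.

```python
from math import exp

def kruskal_wallis_h(rank_sums, sample_sizes):
    """H = 12 / (n(n+1)) * sum(Rj^2 / nj) - 3(n+1)."""
    n = sum(sample_sizes)
    weighted = sum(r ** 2 / nj for r, nj in zip(rank_sums, sample_sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)

# Rank sums and sample sizes from the hospital beds data
rank_sums = [120, 210.5, 134.5]
sample_sizes = [10, 10, 10]

h = kruskal_wallis_h(rank_sums, sample_sizes)

# Upper-tail chi-square probability; this closed form is valid for 2 df only
p_value = exp(-h / 2)
print(round(h, 3), round(p_value, 4))
```

Under these assumptions H is about 6.097, which is larger than the tabled chi-square value of 5.99 for α = .05 and 2 degrees of freedom.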
Comparing Three or More Populations:
Randomized Block Design
The Friedman Fr Test
A nonparametric method for the randomized block
design
Based on comparison of rank sums
Reaction Time for Three Drugs
Subject   Drug A   Rank   Drug B   Rank   Drug C   Rank
   1       1.21     1      1.48     2      1.56     3
   2       1.63     1      1.85     2      2.01     3
   3       1.42     1      2.06     3      1.70     2
   4       2.43     2      1.98     1      2.64     3
   5       1.16     1      1.27     2      1.48     3
   6       1.94     1      2.44     2      2.81     3
          R1 = 7          R2 = 12         R3 = 17
Comparing Three or More Populations:
Randomized Block Design
The Friedman Fr-test
H0: The probability distributions for the p treatments are identical
Ha: At least two of the p probability distributions differ in location
Test statistic:
$$F_r = \frac{12}{bp(p+1)} \sum_{j=1}^{p} R_j^2 - 3b(p+1)$$
Where
b = Number of blocks
p = Number of treatments
Rj = Rank sum of the jth treatment, where the rank of each measurement is computed relative to its
position within its own block
Rejection region: Fr > χ²α with (p-1) degrees of freedom
Required Conditions:
•Random assignment of treatments to units within blocks
•Measurements can be ranked within blocks
•Probability distributions from which the samples within each block
are drawn are continuous
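The Friedman statistic for the reaction-time data can be sketched as below, ranking within each subject (block) and then summing ranks per drug. The helper names `block_ranks` and `friedman_fr` are assumptions, and `block_ranks` assumes no ties within a block, which holds for this data.

```python
def block_ranks(row):
    """Rank the measurements within one block (assumes no ties in the block)."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0] * len(row)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def friedman_fr(blocks):
    """Fr = 12 / (b p (p+1)) * sum(Rj^2) - 3 b (p+1)."""
    b = len(blocks)
    p = len(blocks[0])
    ranked = [block_ranks(row) for row in blocks]
    rank_sums = [sum(column) for column in zip(*ranked)]
    fr = 12 / (b * p * (p + 1)) * sum(r ** 2 for r in rank_sums) - 3 * b * (p + 1)
    return rank_sums, fr

# One row per subject: reaction times for Drug A, Drug B, Drug C
reaction_times = [
    (1.21, 1.48, 1.56),
    (1.63, 1.85, 2.01),
    (1.42, 2.06, 1.70),
    (2.43, 1.98, 2.64),
    (1.16, 1.27, 1.48),
    (1.94, 2.44, 2.81),
]

rank_sums, fr = friedman_fr(reaction_times)
print(rank_sums, round(fr, 2))
```

With b = 6 and p = 3 this gives Fr of about 8.33 on 2 degrees of freedom, above the tabled chi-square value of 5.99 for α = .05.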
Rank Correlation
Spearman’s rank correlation coefficient rs
provides a measure of correlation between ranks
Brake Rankings of New Car Models: Less than Perfect Agreement
             Magazine      Difference between
Car Model    1      2      Rank 1 and Rank 2 (d)    d²
    1        4      5              -1                1
    2        1      2              -1                1
    3        9     10              -1                1
    4        5      6              -1                1
    5        2      1               1                1
    6       10      9               1                1
    7        7      7               0                0
    8        3      3               0                0
    9        6      4               2                4
   10        8      8               0                0
                                              Σd² = 10
Rank Correlation
One-Tailed Test
H0: ρ = 0
Ha: ρ < 0 [or Ha: ρ > 0]
Rejection region: rs < −rs,α (or rs > rs,α when Ha: ρ > 0)
Two-Tailed Test
H0: ρ = 0
Ha: ρ ≠ 0
Rejection region: |rs| > rs,α/2
Test statistic (both tests):
$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$
Where
di = ui – vi (difference in the ranks of the ith observations for samples 1 and 2)
Conditions Required:
Sample of experimental units is randomly selected
Probability distributions of two variables are continuous
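Applying the rs formula to the brake rankings of the ten car models gives a quick worked check. This is a minimal sketch and the function name `spearman_rs` is an assumption. It takes two lists that are already ranks and uses the shortcut formula, which assumes no tied ranks (true for this data).

```python
def spearman_rs(ranks_u, ranks_v):
    """rs = 1 - 6 * sum(d_i^2) / (n (n^2 - 1)), for untied ranks."""
    n = len(ranks_u)
    sum_d2 = sum((u - v) ** 2 for u, v in zip(ranks_u, ranks_v))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

# Brake rankings of car models 1 through 10 by each magazine
magazine_1 = [4, 1, 9, 5, 2, 10, 7, 3, 6, 8]
magazine_2 = [5, 2, 10, 6, 1, 9, 7, 3, 4, 8]

sum_d2, rs = spearman_rs(magazine_1, magazine_2)
print(sum_d2, round(rs, 3))
```

The sum of squared differences is 10, matching the table, and rs works out to 1 − 60/990, or about .94, which reflects strong but less than perfect agreement.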