Download Non-Parametric Statistics: When Normal Isn`t Good Enough

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Foundations of statistics wikipedia , lookup

Psychometrics wikipedia , lookup

Misuse of statistics wikipedia , lookup

Resampling (statistics) wikipedia , lookup

Student's t-test wikipedia , lookup

Transcript
Non-Parametric Statistics: When Normal Isn’t Good Enough"
Professor Ron Fricker"
Naval Postgraduate School"
Monterey, California"
1/28/13
1
A Bit About Me"
•  Academic credentials"
–  Ph.D. and M.A. in Statistics, Yale University"
–  M.S. in Ops Research, George Washington University"
–  B.S. in Ops Analysis, U.S. Naval Academy"
•  (Some) research interests"
– 
– 
– 
– 
Biosurveillance – how to quickly detect a bioterrorist attack"
Survey design and analysis"
Quality control and statistical process control"
Military manpower and personnel issues: recruiting,
retention, effects of deployment"
•  Contact information"
–  Phone: 831-656-3048 (DSN: 756-3048)"
–  E-mail: [email protected]"
–  Web site: http://faculty.nps.edu/rdfricke"
1/28/13
2
Outline"
•  Discuss advantages and disadvantages of
nonparametric tests"
•  Describe some nonparametric tests"
–  One sample data"
–  Paired data"
–  Multiple groups"
•  Illustrate application with various real-world
datasets"
•  Show how to implement them in Excel, JMP,
and R"
1/28/13
3
Parametric vs. Nonparametric Hypothesis Testing"
•  Parametric hypothesis testing: "
–  Statistic distribution are specified (often normal)"
–  Often follows from Central Limit Theorem, but
sometimes CLT assumptions don’t fit/apply"
•  Perhaps sample size too small or underlying
population distribution too skewed"
•  For ANOVA, CLT doesn’t apply – data is
assumed to follow normal distribution"
•  Nonparametric hypothesis testing:"
–  Does not assume a particular probability
distribution"
•  Often called “distribution free” "
–  Generally based on ordering or order statistics"
1/28/13
4
Challenges in Parametric Hypothesis Testing"
•  Sometimes experiments give responses they
defy exact quantification"
–  Want to rank the “utility” of four weapons systems"
•  Gives an ordering, but can be impossible to say
things like “System A is twice as useful as B”"
•  Sometimes distributional assumptions of
parametric tests violated"
–  Want to compare two LVS maintenance programs,
but data clearly non-normal"
•  If the data do not fit the assumptions of the
(parametric) tests, what to do? "
"
1/28/13
5
Advantages of Nonparametric Tests"
•  Nonparametric tests make less stringent
demands on the data"
–  E.g., they require fewer assumptions"
•  Usually require independent observations (or
independence of paired differences)"
•  Sometimes assumes continuity of the measure"
•  Can be more appropriate:"
–  When measures are not precise"
–  For ordinal data where scale is not obvious"
–  When only ordering of data is available"
1/28/13
6
Disadvantages of Nonparametric Tests"
•  They may “throw away” information"
–  E.g., Sign test only uses the signs (+ or -) of the
data, not the numeric values"
–  If the other information is available and there is an
appropriate parametric test, that test will be more
powerful"
•  The trade-off: "
–  Parametric tests are more powerful if the
assumptions are met!
–  Nonparametric tests provide a more general result
if they are powerful enough to reject!
1/28/13
7
(Some) Nonparametric Tests"
Parametric Test!
“Equivalent” Nonparametric Test!
One-sample t-test"
Sign test, signed rank test"
Two-sample t-test"
Sign test, Wilcoxon rank sum test/
Mann-Whitney U test"
Paired t-test"
Sign test, signed rank test"
One-way ANOVA"
Kruskall-Wallis test"
"
•  Also:"
– 
– 
– 
– 
– 
1/28/13
Friedman test"
Bootstrap-based methods"
Spearman’s rank correlation coefficient"
Kolmogorov-Smirnov test for comparing two distributions"
McNemar’s test"
8
Sign Test Hypotheses and Assumptions"
•  Basic nonparametric test procedure to assess
the plausibility of the null"
•  Assumption: Observations x1,…,xn are
independent and identically distributed from
some unknown distribution F
•  Hypotheses:"
H 0 : F(x) = p0
H a : F(x) ≠ p0
•  Common to test p0 = 0.5; i.e., test the median "
–  For symmetric distributions, equivalent to testing
the mean"
–  Can also test quartiles or any other percentile"
1/28/13
9
Sign Test: Methodology "
•  First, delete all observations equal to x
(and decrement n accordingly)"
•  Calculate S ( x) = # xi ≤ x and note that if the
null hypothesis is true" S( X ) ~ Bin(n, p0 )
•  Test the appropriate alternative hypothesis:"
p-value"
" Alternative Hypothesis"
1/28/13
H a: p > p0
Pr(S( X ) ≥ S(x)) = 1-BINOMDIST(S(x) − 1,n, p0 ,1)
H a: p < p0
Pr(S( X ) ≤ S(x)) = BINOMDIST(S(x),n, p0 ,1)
H a: p ≠ p0
⎧ 2 × Pr(S( X ) ≥ S(x)), if S(x) > np
⎪
0
p-value = ⎨
⎪⎩ 2 × Pr(S( X ) ≤ S(x)), if S(x) < np0
10
Large Sample Approximation"
•  For large n, the binomial can be approximated
by the normal: "
⎛
p0 (1− p0 ) ⎞
⎛
S(x)
F(x)(1− F(x)) ⎞
~ N ⎜ F(x),
= N ⎜ p0 ,
⎟
⎟⎠
n
n
n
⎝
⎠
⎝
•  Requirement: np0>5 and n(1-p0)>5"
•  Then, for a two sided test,
p-value ≈ 2 × Φ( - z )
with"
S(x) − np0
"
z=
np0 (1− p0 )
1/28/13
11
5000
•  Data set has 9,505 observations"
4000
3000
"
•  Mean of all 9,505 obs = 63.25 days
and the median is 35 days"
2000
•  Think of this set as the “population”"
o  Clearly not normally distributed"
0
1000
Number
6000
Example #2: MK-48 LVS Down Time "
0
1/28/13
200
400
600
Days Down
800
1000
12
Example #2, continued"
0.000
0.005
0.010
0.015
0.020
0.025
Histogram of Means of Random Samples of Size n=30
Density
•  What if only had 30 observations "
•  Okay to assume mean is normally distributed?"
•  Nope; in this case,
n=30 not sufficient for CLT à "
20
40
60
80
100
120
140
means
1/28/13
13
Example #2, continued"
•  Then let’s use the sign
test to test whether the
median down time is
greater than 10 days"
•  Since np0=30x0.5=15>5,
can do large sample calc:"
z=
=
S(x) − np0
np0 (1− p0 )
=
21− 15
50(0.5)(1− 0.5)
6
= 2.19
2.74
So, p-value ≈ Φ( - 2.19) = 0.014
1/28/13
14
Sign Test in Other Software: JMP"
p0 and 1-p0"
estimated
p-value"
1/28/13
15
Sign Test in Other Software: R"
•  Just use the binom.test() function"
•  From previous example:"
1/28/13
16
Sign Test for Paired Observations"
•  Consider n pairs of observations (xi, yi)"
–  x1 ,..., xn is a random sample from F, and"
–  y1 ,..., yn is a random sample from G, where
G( y) = F( y − θ ) for some unknown value θ
"
•  Want to test the hypothesis that the
distribution of the Xs and Ys is the same
except perhaps for the location:"
H 0 : θ = 0 vs. H a : θ ≠ 0
•  Can easily adapt the sign test to this problem"
–  Idea: Define di = xi - yi. Then under the null
hypothesis, the probability that di is positive is 0.5
1/28/13
17
Location Shift Assumption"
G( y)
F( y)
θ
G( y) = F( y − θ )
1/28/13
y
18
Example #3: INSURV Material Inspections vs. Ship Trials"
•  For 2008-2011 data, test whether there is a
location difference in percent SAT for ship
trials versus MIs"
•  Solution:"
1/28/13
19
Example #4: AAAV Maintenance Rates at I and III MEF"
•  Test whether the AAAV maintenance rates
are different between the two MEFs"
•  Perhaps could use a parametric approach to
test the mean maintenance rates:"
Normal Probability Plot for III MEF
1
0.0025
0.0020
0.0010
0.0015
0.0005
0.0010
0
0.0015
Sample Quantiles
0.0025
0.0020
Sample Quantiles
0.0030
0.0035
0.0045
0.0040
0.0035
Sample Quantiles
0.0030
0.0025
0.0020
-1
Theoretical Quantiles
1/28/13
Normal Probability Plot for I MEF - III MEF
0.0040
Normal Probability Plot for I MEF
-1
0
Theoretical Quantiles
1
-1
0
1
Theoretical Quantiles
20
Example #4: AAAV Maintenance Rates at I and III MEF"
•  Paired t-test (parametric test) result"
–  Pick your preferred software output"
1/28/13
21
Example #4: AAAV Maintenance Rates at I and III MEF"
•  Sign test confirms I MEF rates higher –
without having to assume normality"
1/28/13
22
Signed Rank Test
Hypotheses and Assumptions"
•  Most often used to make inferences about
median of unknown distribution F "
•  Assumptions:"
–  Observations are independent and identically
distributed from F"
–  If distribution is symmetric, median−1= mean and both can be denoted as" µ = F ( 0.5)
•  Hypotheses: "H 0 : µ = µ0 H a : µ ≠ µ0
•  Signed rank more sensitive than Sign test"
–  It uses the magnitudes of the deviations from the
median as well as the signs in its calculations"
1/28/13
23
Signed Rank Test: Methodology (1)"
•  First, calculate deviations from hypothesized
mean value: d
" i ( µ0 ) = xi − µ0
•  Compute the ranks of the absolute values of
the deviations: r"i ( µ0 ) = rank di
( )
–  Delete any observations equal to !µ0
–  If two observations are tied, give them both the
average rank"
•  Calculate S+( µ0 ), the sum of the ranks of the
positive deviations"
1/28/13
24
Interpretation of the Signed Rank Test Statistic"
1/28/13
25
Signed Rank Test: Methodology (3)"
•  If S+( µ0 ) is far from n(n+1)/4, then reject null"
–  For small sample sizes, tables/software necessary
to determine “far”"
–  If sample size is large, use normal approximation
to judge “far”"
•  The test statistic"
Z + ( µ0 ) =
S+ ( µ0 ) − n(n + 1) / 4
n(n + 1)(2n + 1) / 24
"is approximately standard normally distributed"
1/28/13
26
Example #2, continued"
•  Calcs give S+( µ0 )=407.5
–  Table: reject at significance
level "α < 0.01
–  Large sample calcs:"
Z + ( µ0 ) =
=
S+ ( µ0 ) − n(n + 1) / 4
n(n + 1)(2n + 1) / 24
407.5 − 30 × 31/ 4
30 × 31× 61/ 24
= 3.6
So, p-value ≈ Φ( - 3.6) = 0.0002
–  Conclude median > 10"
1/28/13
27
Signed Rank Variants"
•  Can also use signed rank test for paired data"
–  Do test on differences of the pairs"
1/28/13
28
Signed Rank Test in Other Software: JMP"
1/28/13
29
Signed Rank Test in Other Software: R"
•  Use the wilcox.test() function for two
sample tests"
1/28/13
30
Rank Sum Test
Hypotheses and Assumptions"
•  One- or two-sided test for the hypotheses of
the means of two independent samples"
–  x1 ,..., xn is a random sample from F, and"
–  y1 ,..., ym is a random sample from G, where
" y) = "F( y − θ ) for some unknown value "θ
G(
•  Want to test the hypothesis that the
distribution of the Xs and Ys is the same
except perhaps for the location:"
H 0 : θ = 0 vs. H a : θ ≠ 0
1/28/13
31
Idea of the Test"
•  If two samples come from same distribution,
the data ought to be “mixed in” together"
–  If not, likely that one sample of data will be
systematically higher or lower than other sample"
•  Use ranks to quantify where data falls"
–  I.e., smallest observation gets rank 1, next
smallest gets rank 2, etc. "
•  Sum of the ranks quantifies whether whole
sample is systematically different"
1/28/13
32
Calculating the Test Statistic"
•  In JMP, assuming no ties, the test statistic is "
"
"where"
Z = ⎡⎣ S x − E(S x ) ⎤⎦
Var(S x )
E(S x ) = m(m + n + 1) / 2,
"
"and"
Var(S x ) = mn(m + n + 1) / 12
"
•  There are other ways to calculate"
–  And there are details for handling ties"
–  We’ll just let JMP or R calculate for us…"
1/28/13
33
Example #5"
•  Comparison of age distribution of US vs. nonUS Iraq war casualties"
–  4,123 casualties from 3-21-03 to 10-10-07"
–  3,820 US (n) and 301 non-US (m) "
Age Distribution of non-US Casualties
40
Age (years)
0
20
20
30
40
Frequency
1000
500
0
Frequency
60
50
1500
60
Age Distribution of US Casualties
20
30
40
Age (years)
1/28/13
50
60
20
30
40
50
60
non-US
US
Age (years)
34
Calculating in JMP"
• 
Analyze > Fit Y by X"
–  Be sure X is nominal and Y is
continuous"
• 
• 
Under red triangle >
Nonparametric > Wilcoxon Test"
Check the calcs:"
E(S x ) = m(m + n + 1) / 2
= 295(4,112) / 2 =
= 606,520
Var(S x ) = mn(m + n + 1) / 12
= 295 × 3,816 × 4,112 / 12
= 385,746,720
Z = ⎡⎣ S x − E(S x ) ⎤⎦
Var(S x )
= ⎡⎣778,733.5 − 606,520 ⎤⎦
385,746,720
= 8.768314
• 
Conclusion: Location different"
1/28/13
35
Calculating in R"
•  Use the wilcox.test() function with option
paired=FALSE!
1/28/13
36
Kruskal-Wallis Test
Hypotheses and Assumptions"
•  Generalization of the rank sum test for k > 2 samples"
–  nT observations divided into k groups"
–  Nonparametric alternative to one-way ANOVA"
•  Null hypothesis is that all k samples are from
the same population"
•  Assumption: error terms are identically
distributed"
–  Need not be a normal distribution"
1/28/13
37
Kruskal-Wallis Test: Methodology"
•  Idea: combine k samples into one large
sample, order them, and rank observations
from 1 to nT
•  Denote rank of observation xij as rij, "r1 ,..., rk
1 ≤ i ≤ k ,1 ≤ j ≤ ni
•  Further denote the average ranks as where
ri =
1/28/13
ri1 ++ rin
i
ni
38
Calculate the Test Statistic"
"
The test statistic is "
k
"
12
2
H
=
n
r
− 3(nT + 1)
!
i i
nT (nT + 1) i=1
"
Calculate p-value as Pr( X > H ) where X has a
chi-square distribution with k-1 df"
"
∑
1/28/13
39
Example #6: Helo Back Pain Survey"
•  Survey of 683 US Navy helo pilots and copilots about back pain in the cockpit"
–  Question 29: “How often do you have back and/ or neck pain during or immediately after a
flight?” (1=Rarely, 5=Very Often)!
–  Compare by question 11: “In what type of
helicopter do you have the most flight
hours?” (H-46, H-53, TH-57, H60)"
–  553 respondents answered both questions:"
1/28/13
40
Kruskall-Wallis Test in JMP"
•  Analyze > Fit Y by X > red triangle >
Nonparametric > Wilcoxon Test"
–  Same as Wilcoxon rank sum, just need more than two levels in X "
p-value – cannot
reject null
1/28/13
41
Parametric vs. Nonparametric"
One-way ANOVA
Kruskal-Wallis Test
1/28/13
42
Calculating in R"
•  Use the kruskal.test() function"
•  No statistical difference by type of helo "
1/28/13
43
Good References"
•  My OA3102 course notes: http://
faculty.nps.edu/rdfricke/OA3102.htm"
•  Hayter, A.J., Probability and Statistics for
Engineers and Scientists, Duxbury Press,
2006."
•  Wackerly, D., W. Mendenhall, and R.L.
Scheafer, Mathematical Statistics with
Applications, Duxbury Press, 2007."
•  Conover, W.J., Practical Nonparametric
Statistics, Wiley, 1999."
1/28/13
44
What We Covered "
•  Discussed advantages and disadvantages of
nonparametric tests"
•  Described some nonparametric tests"
–  One sample data"
–  Paired data"
–  Multiple groups"
•  Illustrated application with various real-world
datasets"
•  Showed how to implement them in Excel,
JMP, and R"
1/28/13
45