Non-parametric Statistics
An Introduction for Experimentalists
Sebastian Strasser
University of Munich
July 13, 2011
Overview
- The role of non-parametric statistics in experimental economics
- Tests
  - Mann-Whitney U Test
  - Kolmogorov-Smirnov Test
  - Wilcoxon Test
  - Binomial Test
  - χ2 Test
  - Kruskal-Wallis Test
The role of non-parametric statistics in experimental economics
- small sample sizes (often between n = 6 and n = 30; cf. independent observations)
- no assumption about the underlying distribution of the data-generating process
- possibility to analyze ordinal and categorical data
- close relation to methods used in medicine, biology, chemistry
In practice, both non-parametric and parametric analyses (OLS, IV, MLE, GMM, t-test, etc.) are conducted at the same time.
Mann-Whitney U Test
Definition
Test whether two statistically independent groups have been drawn
from the same population with respect to the mean.
H0 No difference in means
H1a Difference in means: X ≠ Y (two-sided)
H1b Difference in means: X < Y or X > Y (one-sided)
Mann-Whitney U Test
Example
Ultimatum Game with students from economics (ECON) and management science (MGMT). Variable of Interest: Offered amount.

Offered amounts (ECON)   Offered amounts (MGMT)
2                        3
4                        2.5
1                        5
0.5                      5
0.5

Step 1: Bring all observations in ascending order and assign ascending ranks:

offer   0.5    0.5    1      2      2.5    3      4      5      5
group   ECON   ECON   ECON   ECON   MGMT   MGMT   ECON   MGMT   MGMT
rank    1.5    1.5    3      4      5      6      7      8.5    8.5
Step 2: Sum the ranks of the smaller group to obtain W.
- In our example: W(N) = 28 (Wmax = 30)
- p = 0.063 (two-sided) (table J from Siegel/Castellan)
- p = 0.048 (two-sided) (from STATA)
Approximation of W(N) for n → ∞ through normal distribution
STATA: ranksum offer, by(study)
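The two steps can be retraced in a few lines of Python. This is only a sketch: the helper `average_ranks` and all variable names are introduced here for illustration, and the p-value uses a normal approximation with a tie-adjusted variance, which is an assumption about how the software value on the slide was obtained.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Offers from the slides
econ = [2, 4, 1, 0.5, 0.5]
mgmt = [3, 2.5, 5, 5]

# Step 1: rank the pooled observations
ranks = average_ranks(econ + mgmt)

# Step 2: rank sum of the smaller group (MGMT)
w = sum(ranks[len(econ):])

# Normal approximation; the variance is taken from the actual ranks,
# so tied observations are accounted for
n1, n2 = len(mgmt), len(econ)
n = n1 + n2
mean_w = n1 * (n + 1) / 2
rank_var = sum((r - (n + 1) / 2) ** 2 for r in ranks) / (n - 1)
z = (w - mean_w) / math.sqrt(n1 * n2 * rank_var / n)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(w, round(p_two_sided, 3))
```

With only nine observations the normal approximation is rough, which is one reason the tabulated exact value and the approximate value differ.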
Kolmogorov-Smirnov Test
Definition
Test whether two statistically independent groups have been drawn
from the same population with respect to the distribution (mean,
skewness, kurtosis).
H0 Same distribution
H1 Difference in distributions (two-sided)
Kolmogorov-Smirnov Test
Example
Ultimatum Game with students from economics (ECON) and
management science (MGMT). Variable of Interest: Offered amount.

Offered amounts (ECON)   Offered amounts (MGMT)
2                        3
4                        2.5
1                        5
0.5                      5
0.5

Step 1: Determine the cumulative frequencies of the observations:

offer   0.5    1      2      2.5    3      4      5
ECON    40 %   60 %   80 %   80 %   80 %   100 %  100 %
MGMT    0 %    0 %    0 %    25 %   50 %   50 %   100 %

offer             0.5    1      2      2.5    3      4      5
ECON              40 %   60 %   80 %   80 %   80 %   100 %  100 %
MGMT              0 %    0 %    0 %    25 %   50 %   50 %   100 %
Sn(X) − Sm(X)     40 %   60 %   80 %   55 %   30 %   50 %   0 %

Step 2: Look for the biggest absolute difference between the cumulative frequencies by calculating the following values:
- $D_{m,n} = \max |S_n(X) - S_m(X)|$, where m and n are the numbers of observations in the two samples
- $S_m(X) = \frac{K}{m}$, where K is the number of observations smaller than or equal to X
The test statistic is then given by $m \cdot n \cdot D_{m,n} = 5 \cdot 4 \cdot 0.8 = 16$
- p = 0.10 (two-sided) (table LII from Siegel/Castellan)
- p = 0.116 (two-sided) (from STATA)
Approximation for n → ∞ through χ2 distribution
STATA: ksmirnov offer, by(study)
Possibility to test against theoretical distribution
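As a rough check of the numbers above, the following Python sketch recomputes the cumulative frequencies, the largest difference and the test statistic. The function `ecdf` and the variable names are illustrative. Note that the p-value here uses the asymptotic Kolmogorov series rather than the χ2 approximation named on the slide; that it lands near the reported software value is an observation, not a statement about how the software works.

```python
import math
from fractions import Fraction

def ecdf(sample, x):
    """Share of observations in `sample` that are smaller than or equal to x."""
    return Fraction(sum(1 for v in sample if v <= x), len(sample))

# Offers from the slides
econ = [2, 4, 1, 0.5, 0.5]
mgmt = [3, 2.5, 5, 5]
n, m = len(econ), len(mgmt)

# Step 1 and 2: compare the cumulative frequencies at every observed offer
points = sorted(set(econ + mgmt))
d = max(abs(ecdf(econ, x) - ecdf(mgmt, x)) for x in points)
statistic = m * n * d

# Asymptotic two-sided p-value (Kolmogorov series); only an approximation
# for samples this small
lam = float(d) * math.sqrt(m * n / (m + n))
p_asym = 2 * sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam)
                 for k in range(1, 101))

print(d, statistic, round(p_asym, 3))
```

Exact fractions are used for the cumulative frequencies so that the maximum difference is not affected by floating-point rounding.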
Wilcoxon Signed-Ranks Test
Definition
Test whether there are differences between two statistically dependent observations (X1 and X2).
H0 No differences between the observations (X1 = X2).
H1a Difference between the observations: X1 ≠ X2 (two-sided)
H1b Difference between the observations: X1 < X2 or X1 > X2 (one-sided)
Wilcoxon Signed-Ranks Test
Example
Repeated Ultimatum Game. Variable of Interest: Offered amount in rounds 1 and 2.

subject   1      2      3      4      5      6      7      8      9
round 1   0.5    0.5    1      2      2.5    3      4      5      5
round 2   1.5    1.5    1      1.5    1      1      1      2      2.5

Step 1: Determine the difference between the paired observations and assign ranks according to the absolute difference (taking into account the sign of the difference):

subject      1      2      3      4      5      6      7      8      9
round 1      0.5    0.5    1      2      2.5    3      4      5      5
round 2      1.5    1.5    1      1.5    1      1      1      2      2.5
difference   1      1      0      -0.5   -1.5   -2     -3     -3     -2.5
rank         +2.5   +2.5   drop   -1     -4     -5     -7.5   -7.5   -6
- T+ = sum of all ranks with positive sign (T+ = 5)
- T− = sum of all ranks with negative sign (T− = 31)
- p = 0.078 (two-sided with N = 8 (!)) (table H from S/C)
- p = 0.0745 (two-sided) (from STATA)
Approximation for n → ∞ through normal distribution
STATA: signrank offer1 = offer2
Sign test as an alternative (neglecting the size of the deviations)
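The rank sums can be reproduced with a short Python sketch. The helper `average_ranks` and the variable names are illustrative. The second half is a normal approximation that keeps the zero difference in the ranking with sign 0; that this is how the software value on the slide arises is an assumption, although the result is close to it.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Offers from the slides
round1 = [0.5, 0.5, 1, 2, 2.5, 3, 4, 5, 5]
round2 = [1.5, 1.5, 1, 1.5, 1, 1, 1, 2, 2.5]
diffs = [b - a for a, b in zip(round1, round2)]

# Slide procedure: drop zero differences, rank the absolute differences
nonzero = [d for d in diffs if d != 0]
ranks = average_ranks([abs(d) for d in nonzero])
t_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
t_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)

# Normal approximation (assumed variant): zeros stay in the ranking
# but contribute a signed rank of 0
all_ranks = average_ranks([abs(d) for d in diffs])
signed = [r * ((d > 0) - (d < 0)) for r, d in zip(all_ranks, diffs)]
z = sum(signed) / math.sqrt(sum(s * s for s in signed))
p_normal = math.erfc(abs(z) / math.sqrt(2))

print(t_plus, t_minus, round(p_normal, 4))
```

Dropping subject 3 is what reduces the effective sample size to N = 8 for the table lookup.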
Binomial Test
Definition
Test whether the sampling distribution of a dichotomous random variable differs from that of a population with p = p0.
Two possible events (X = 1 or X = 0): heads or tails, budget surplus or deficit, etc.
- Probability of X = 1: p
- Probability of X = 0: q = 1 − p
H0 p = p0
H1 p ≠ p0
Binomial Test
Example

coin toss   1       2       3       4       5       6       7       8       9       10
result      heads   tails   heads   heads   heads   heads   tails   heads   heads   heads
X           0       1       0       0       0       0       1       0       0       0

- Is the coin a fair coin, i.e. p = q = 0.5?
- $Y = \sum X = 2$

$$P[Y = k] = \binom{N}{k} p^k q^{N-k} \quad \text{where} \quad \binom{N}{k} = \frac{N!}{k!\,(N - k)!}$$
Binomial Test
Example
The probability that Y = 2 is given by
$$P[Y = 2] = \binom{10}{2} p^2 q^8 = \frac{10!}{2!\,8!} \cdot 0.5^2 \cdot 0.5^8 \approx 0.044$$
We are interested in the cumulative probability that Y ≤ r or Y ≥ s:
$$P[Y \le k] = \sum_{i=0}^{k} \binom{N}{i} p^i q^{N-i}$$
$$P[Y \le 2] = \sum_{i=0}^{2} \binom{N}{i} p^i q^{N-i} = 0.055$$
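The two probabilities for the coin example can be checked directly. A minimal Python sketch, with function names chosen here for illustration:

```python
from math import comb

def binom_pmf(k, n, p):
    """P[Y = k] for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P[Y <= k], the sum of the point probabilities from 0 to k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# Coin example from the slides: 2 tails in 10 tosses, fair coin under H0
p_exactly_2 = binom_pmf(2, 10, 0.5)   # 45 / 1024
p_at_most_2 = binom_cdf(2, 10, 0.5)   # 56 / 1024

print(round(p_exactly_2, 4), round(p_at_most_2, 4))
```

The cumulative value is the one-sided probability of observing two or fewer tails under a fair coin.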
Binomial Test
Another Example
Systematic deviations of actual expenses from budget for R&D:
- 12 out of 32 negative
- p = 0.2153, hence not able to reject H0
- STATA: bitest deviation==0.5
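The reported p-value is consistent with a two-sided exact binomial test of p0 = 0.5. The sketch below sums the probabilities of all outcomes that are no more likely than the observed one, which is one common definition of the two-sided p-value; the function name is illustrative and the choice of definition is an assumption.

```python
from math import comb

def binom_test_two_sided(k, n, p0=0.5):
    """Two-sided exact binomial p-value: total probability of all outcomes
    that are at most as likely as the observed count k under H0."""
    pmf = [comb(n, i) * p0**i * (1 - p0)**(n - i) for i in range(n + 1)]
    threshold = pmf[k] * (1 + 1e-9)  # small tolerance for rounding error
    return min(1.0, sum(x for x in pmf if x <= threshold))

# R&D example from the slides: 12 of 32 deviations are negative
p_value = binom_test_two_sided(12, 32)
print(round(p_value, 4))
```

For p0 = 0.5 the distribution is symmetric, so this equals twice the one-sided tail probability.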
χ2 Test
Definition
Test whether there are differences in distributions in two (or more)
categories A and B.
H0 No differences between the categories.
H1 Differences between the categories.
Possibility to test between two observed distributions or to compare
observed sample to a theoretical distribution.
Minimum number of observations per cell: ≈ 5
χ2 Test
Easiest application: 2 x 2 tables, but possibility for n x k extension:

A   B
C   D

Test statistic (general):
$$\chi^2 = \sum_{j=1}^{k} \sum_{i=1}^{n} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where $O_{ij}$ is the observed frequency, $E_{ij}$ the expected frequency under independence, and n (k) the number of rows (columns)
Degrees of freedom: df = (n − 1)(k − 1)
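The general statistic can be sketched in Python as follows. The function name is illustrative, and the expected frequencies are computed as row total times column total divided by the grand total, which is what independence implies. The counts in the usage lines are the 2 x 2 ultimatum-game counts used in these slides.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom
    for an n x k table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # expected count under independence of rows and columns
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

stat, df = chi_square([[8, 13], [14, 12]])
print(round(stat, 4), df)
```

The same function accepts larger tables, for example three rows and four columns, without changes.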
χ2 Test
Example
Application for 2 x 2 tables:

                        Econ   Mgmt
Offers under 5          8      13
Offers of 5 and more    14     12

Approximation of test statistic for 2 x 2 tables (with continuity correction):
$$\chi^2 = \frac{N\left(|AD - BC| - \frac{N}{2}\right)^2}{(A + B)(C + D)(A + C)(B + D)}$$
χ2 = 1.16 (p = 0.282, df = 1)
χ2 = 0.61 (with 2 x 2 approximation)
STATA: tab study offerdummy, chi
For smaller cells (< 5), use Fisher's exact test (hypergeometric test statistic)
STATA: tab study offerdummy, exact
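Both values of the test statistic can be recomputed from the four cell counts. This Python sketch uses illustrative function names; the `continuity` switch and the guard that keeps the corrected difference from going below zero are additions made here, not part of the slides.

```python
import math

def chi_square_2x2(a, b, c, d, continuity=False):
    """Chi-square statistic for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if continuity:
        # subtract N/2 as in the 2 x 2 approximation; never go below zero
        diff = max(diff - n / 2, 0.0)
    return n * diff**2 / ((a + b) * (c + d) * (a + c) * (b + d))

def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with df = 1,
    using the fact that it is the square of a standard normal variable."""
    return math.erfc(math.sqrt(chi2 / 2))

plain = chi_square_2x2(8, 13, 14, 12)
corrected = chi_square_2x2(8, 13, 14, 12, continuity=True)
p_plain = p_value_df1(plain)

print(round(plain, 2), round(corrected, 2), round(p_plain, 3))
```

The correction lowers the statistic, so the corrected test is the more conservative of the two.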
Kruskal-Wallis Test (or H-Test)
Definition
Test whether k statistically independent samples have been drawn from the same population with respect to the mean
H0 k samples are from the same population
H1 k samples are from different populations
Closely related to Mann-Whitney, but applicable to k > 2 groups
Test statistic:
$$H = \frac{12}{n(n + 1)} \sum_{h} \frac{RS_h^2}{n_h} - 3(n + 1)$$
where $RS_h$ is the sum of ranks in group h, $n_h$ the number of observations in group h, and n the total number of observations
STATA: kwallis offer, by(age)
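A minimal Python sketch of the H statistic follows. The function names are illustrative, and no tie correction is applied, matching the formula as written. Since the slides give no data for more than two groups, the usage line reuses the two ultimatum-game groups purely to show the mechanics; the function itself accepts any number of groups.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """H statistic (without tie correction) for a list of samples."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])  # RS_h
        total += rank_sum**2 / len(g)
        start += len(g)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

h = kruskal_wallis_h([[2, 4, 1, 0.5, 0.5], [3, 2.5, 5, 5]])
print(round(h, 2))
```

Under H0 the statistic is approximately χ2 distributed with k − 1 degrees of freedom for sufficiently large groups.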
Overview of tests used