Resampling Methods
From Wikipedia: “Parametric statistics is a branch of statistics that assumes (that)
data come from a type of probability distribution and makes inferences about the
parameters of the distribution.
Most well-known elementary statistical methods (e.g. the ones from our class) are
parametric.”
But there are alternative methods that don’t require any assumptions about the
shape of the population’s probability distribution. Resampling methods are an
example.
There are three kinds of resampling methods:
Permutation methods – used most commonly with correlations, where the probability
of the observed data is estimated by comparing the observed pairings to a large
number of random pairings of the data.
Monte Carlo methods – estimate the population probability distribution through
simulation.
Bootstrap methods – the population distribution of an observed statistic is estimated
by repeatedly resampling the data with replacement and calculating the statistic.
Example of a permutation method: Suppose you measured the IQs of 25 pairs of
twins and found a correlation of r = 0.36. The scatter plot of your data is shown
below. Is the observed correlation significantly greater than zero? (use α = .05)
[Figure: scatter plot of IQ Twin 2 (y-axis) against IQ Twin 1 (x-axis); correlation r = 0.36]
The (parametric) test used in our class would have found an rcrit value of 0.330.
We would reject H0 and conclude that a correlation of 0.36 is (barely) significantly
greater than zero.
The distribution under the null hypothesis can be estimated by repeatedly shuffling (or
‘permuting’) the relationship between the X and Y values and calculating the correlation:
Y is the observed pairing; Y' (1) to Y' (3) are three random shuffles of Y.

  X      Y    Y' (1)   Y' (2)   Y' (3)
 97     89      59       79       64
 89     87      91       85       72
 81     91      45       88      105
 85     87      72       84       73
 70     59      43       77       91
105     88      84       84       97
 81     97      74      105       84
107     94      77       72       92
 84     45     105       43       77
 93     77      64       77       74
 58     73      87       73       77
 69     84      73       97       59
 99     74      79       89       85
 70     79      89       92      105
 89    105      84       45       84
 75     92      77       64       43
 78     84      87       91       74
 60     64      84       94       84
 61     77      94       87       45
 95     84      92       74       87
 79     72      97       74       94
 68     74      88       84       89
 69     85      85       87       87
 93    105      74       59       79
 79     43     105      105       88

r =    .36    -.12     -.26      .20
…
This generates a distribution of correlations that should be centered around zero.
r =  0.05, -0.01, -0.00, -0.22,  0.25,  0.05, -0.32,
    -0.47, -0.34, -0.18,  0.11, -0.12, -0.19, -0.01,
    -0.01,  0.31, -0.20, -0.25,  0.15, -0.37, -0.11,
    -0.24, -0.38, -0.36, -0.26, -0.30, -0.09, -0.24,
     0.07,  0.05,  0.13, -0.05, -0.16,  0.02, -0.17
We can then use this distribution to calculate the probability of obtaining a
correlation larger than our observed sample correlation if the null hypothesis is true.
After 100000 reps, Pr(r > 0.36) = 0.0378
[Figure: histogram of the permuted correlations; x-axis: Permuted correlation (r), from -0.6 to 0.6]
Only 3.78% of the correlations generated by permutation exceed the observed
correlation of 0.36, so we’d reject the null hypothesis using α = .05.
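The shuffling procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to produce the slide: the function names, the seed, and the number of repetitions are assumptions, and the x and y lists are the 25 X and Y values transcribed from the listing above. The sketch counts shuffles with r at least as large as the observed value, which for data like these is practically the same as the strict Pr(r > 0.36) used on the slide.

```python
import random

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(x, y, reps=10000, seed=0):
    """One-tailed permutation test: proportion of shuffles with r >= observed r.

    reps and seed are illustrative defaults (assumptions), not values from the slides.
    """
    rng = random.Random(seed)
    observed = pearson_r(x, y)
    shuffled = list(y)  # copy so the caller's data is not reordered
    hits = 0
    for _ in range(reps):
        rng.shuffle(shuffled)  # break the X-Y pairing at random
        if pearson_r(x, shuffled) >= observed:
            hits += 1
    return observed, hits / reps

# X and Y values transcribed from the listing above (25 twin pairs).
x = [97, 89, 81, 85, 70, 105, 81, 107, 84, 93, 58, 69, 99,
     70, 89, 75, 78, 60, 61, 95, 79, 68, 69, 93, 79]
y = [89, 87, 91, 87, 59, 88, 97, 94, 45, 77, 73, 84, 74,
     79, 105, 92, 84, 64, 77, 84, 72, 74, 85, 105, 43]

observed_r, p_value = permutation_p_value(x, y, reps=2000)
print(f"observed r = {observed_r:.2f}, permutation p = {p_value:.4f}")
```

With more repetitions the estimated p-value becomes more stable; the slide uses 100000, which is slow in pure Python but quick with a vectorized library.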
Example of a Monte Carlo simulation: Liar’s dice
This is a game where n players each roll 6-sided dice (40 dice in total in the example
below) and keep the outcome hidden under their own separate cups. The goal is to guess
how many dice equal the mode. After a player makes a guess, the next player must decide
if the guess is too high, or otherwise guess a higher number. If it is decided that the
guess is too high, the cups are lifted and the number of dice equal to the mode is
computed. If the guess was indeed too high, the challenger wins and the player that made
the guess must drink (lemonade).
Suppose there are eight players, each with 5 dice. The player to your right just
guessed that 14 of the dice equal the mode. What is the probability that the number of
dice equal to the mode of the 40 dice is that high or higher?
Here’s an example of 40 throws. The mode is 5, and 10 of these throws equal the mode.
mode #
3 13
Example of 20 simulations. Each row is a throw of 40 dice. The last
column is the number of throws that equal the mode.

rep #   mode    #
  1      5     12
  2      1      8
  3      1      8
  4      2      9
  5      2     11
  6      3      9
  7      2     10
  8      3     12
  9      3      8
 10      3      9
 11      2     10
 12      4     12
 13      6     13
 14      2      9
 15      2     11
 16      1     11
 17      6     10
 18      2     12
 19      3     10
 20      2      9
A computer simulation of one million rolls generated this histogram. Shown in red are
the simulations in which the number of dice equal to the mode is 14 or higher.
Only 2.31% of the simulations found a count of 14 or higher. Because this probability
is so small, you should challenge the guess: ask all players to lift their cups and
count the dice.
[Figure: histogram of the simulated rolls; x-axis: Mode of 40 dice; y-axis: Percent of rolls]
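The simulation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the seed, and the number of repetitions are hypothetical, and the dice are assumed to be fair and independent. Each repetition rolls 40 dice, finds how many of them show the most common face, and checks whether that count reaches the guess of 14.

```python
import random
from collections import Counter

def modal_count(n_dice, rng):
    """Roll n_dice fair 6-sided dice and return how many show the most common face."""
    rolls = [rng.randint(1, 6) for _ in range(n_dice)]
    return max(Counter(rolls).values())

def prob_modal_count_at_least(threshold, n_dice=40, reps=100000, seed=0):
    """Monte Carlo estimate of Pr(number of dice equal to the mode >= threshold).

    reps and seed are illustrative defaults (assumptions), not values from the slides.
    """
    rng = random.Random(seed)
    hits = sum(modal_count(n_dice, rng) >= threshold for _ in range(reps))
    return hits / reps

# Estimate the chance that a guess of 14 is not too high (40 dice in play).
estimate = prob_modal_count_at_least(14, reps=20000)
print(f"Estimated Pr(count >= 14) = {estimate:.4f}")
```

Fewer repetitions than the slide's one million are used here to keep the run short, so the estimate will vary a little from run to run if the seed is changed.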
Third method of resampling: bootstrapping to conduct a hypothesis test on medians.
Suppose you measured the amount of time it takes for a subject to perform a simple
mental rotation. Previous research shows that it should take a median of 2 seconds to
conduct this task. Your subject conducts 500 trials and generates the distribution of
response times below, which has a median of 2.15 seconds. Is this number significantly
greater than 2? (use α = .05)
[Figure: histogram of the 500 response times; x-axis: Response Time (sec), from 0 to 25; median = 2.15 (sec)]
The trick to bootstrapping is to generate an estimate of the sampling distribution of
your observed statistic by repeatedly sampling the data with replacement and
recalculating the statistic.
[Figure: histograms of 12 bootstrap resamples of the response times (x-axis from 0 to 25 sec),
with medians 2.22, 2.23, 2.22, 2.20, 2.10, 2.21, 2.08, 2.19, 1.98, 2.21, 2.07 and 2.22]
For our example, we can count the proportion of times that the median falls below 2.
After 1000000 reps, Pr(median < 2.00) = 0.0620
[Figure: histogram of the bootstrapped medians; x-axis: Bootstrapped median, from 1.6 to 2.6]
Since more than 5% of our bootstrapped medians fall below 2, we (just barely)
cannot conclude that our observed median is significantly greater than 2.
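The bootstrap procedure above can be sketched as follows. The 500 response times from the slide are not available, so this sketch generates hypothetical skewed data from a lognormal distribution whose median is near 2.15; that data, the function names, the seed, and the number of repetitions are all assumptions made for illustration, so the printed proportion will not match the slide's 0.0620.

```python
import math
import random
import statistics

def bootstrap_medians(data, reps=2000, seed=0):
    """Resample data with replacement reps times and return each resample's median."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.median(rng.choices(data, k=n)) for _ in range(reps)]

def proportion_below(values, threshold):
    """Proportion of values strictly below threshold."""
    return sum(v < threshold for v in values) / len(values)

# HYPOTHETICAL data: 500 skewed response times with a population median of 2.15 sec.
# These stand in for the subject's real trials, which the slides do not list.
data_rng = random.Random(1)
response_times = [data_rng.lognormvariate(math.log(2.15), 0.6) for _ in range(500)]

medians = bootstrap_medians(response_times)
p_below = proportion_below(medians, 2.0)
print(f"sample median = {statistics.median(response_times):.2f}")
print(f"Pr(bootstrapped median < 2.00) = {p_below:.4f}")
```

As in the slide, the decision rule is to compare the printed proportion with α = .05: if more than 5% of the bootstrapped medians fall below 2, the observed median is not significantly greater than 2.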