Quantifying uncertainty using the bootstrap
Reading
Efron, B. and R. Tibshirani (1993), An Introduction to the Bootstrap, Chapman & Hall, New York, 436 p. Chapters 1, 2, 6.
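Approaches to uncertainty estimation
• Use statistical theory, e.g. the standard error of the mean: $SE_{\bar{x}} = \sqrt{\sigma^2 / N}$
• Bootstrapping: with $\hat{\theta} = s(\mathbf{x})$, $se_{boot}$ is the standard deviation of the bootstrap replications $s(\mathbf{x}^*)$
Confidence intervals: $\hat{\theta} \pm z_{\alpha/2} \cdot se_{boot}$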
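The two formulas on this slide can be sketched in a few lines of Python. This is a minimal illustration, not code from the slides: the function names are hypothetical, and $z_{\alpha/2}$ is assumed to be the upper $\alpha/2$ quantile of the standard normal distribution.

```python
import math
from statistics import NormalDist


def standard_error_of_mean(sigma, n):
    """Theory-based standard error of the sample mean: sqrt(sigma^2 / n)."""
    return math.sqrt(sigma ** 2 / n)


def normal_interval(theta_hat, se, alpha=0.05):
    """Interval theta_hat +/- z_{alpha/2} * se.

    Assumption: z_{alpha/2} is the upper alpha/2 quantile of the standard
    normal, so alpha = 0.05 gives z of about 1.96. The se argument can be a
    theory-based or a bootstrap standard error.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return theta_hat - z * se, theta_hat + z * se


if __name__ == "__main__":
    se = standard_error_of_mean(sigma=2.0, n=4)  # made-up inputs
    print(se, normal_interval(10.0, se))
```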
Bootstrapping
• Motivated by the absence of equations for
other accuracy measures (bias, prediction
error, confidence intervals) for statistics of
interest (correlation, regressions, ACF)
• Definition: “The bootstrap is a data-based
simulation method for statistical inference.”
• Principle: resample with replacement from
data.
After Efron and Tibshirani, An Introduction to the Bootstrap, 1993
Schematic of Bootstrap Process
from Efron and Tibshirani, An Introduction to the Bootstrap, 1993
Bootstrapping
REAL WORLD: Unknown probability distribution $F$ → observed random sample $\mathbf{x} = \{x_1, x_2, \ldots, x_n\}$ → statistic of interest $\hat{\theta} = s(\mathbf{x})$
BOOTSTRAP WORLD: Empirical distribution $F^*$ → (sampling with replacement) bootstrap sample $\mathbf{x}^* = \{x^*_1, x^*_2, \ldots, x^*_n\}$ → bootstrap replication $\hat{\theta}^* = s(\mathbf{x}^*)$
After Efron and Tibshirani, An Introduction to the Bootstrap, 1993
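One pass through the bootstrap world can be sketched as follows. The observations are made up, the statistic (the median) is only an example of $s(\cdot)$, and the helper name `bootstrap_replication` is hypothetical.

```python
import random
from statistics import median


def bootstrap_replication(x, s, rng):
    """Draw one bootstrap sample x* of size n from x, with replacement,
    and return it together with the replication s(x*)."""
    x_star = rng.choices(x, k=len(x))  # sampling with replacement
    return x_star, s(x_star)


rng = random.Random(0)         # fixed seed, only for reproducibility
x = [3.1, 4.7, 2.2, 8.9, 5.4]  # made-up observed sample
x_star, theta_star = bootstrap_replication(x, median, rng)
print(x_star, theta_star)
```

Because the draw is with replacement, some observations typically appear more than once in `x_star` and others not at all.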
Bootstrap Algorithm for Standard Error
from Efron and Tibshirani, An Introduction to the Bootstrap, 1993
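The algorithm figure did not survive in this transcript. As commonly stated, the procedure is: draw B independent bootstrap samples, evaluate the statistic on each, and take the sample standard deviation of the B replications. The sketch below follows that reading; the function name, the default B, and the data are assumptions for illustration.

```python
import math
import random
from statistics import mean, stdev


def bootstrap_se(x, s, B=200, seed=None):
    """Bootstrap estimate of the standard error of the statistic s.

    1. Draw B bootstrap samples of size n from x, with replacement.
    2. Evaluate s on each sample to get B replications.
    3. Return the sample standard deviation (B - 1 denominator) of the
       replications, along with the replications themselves.
    """
    rng = random.Random(seed)
    n = len(x)
    reps = [s(rng.choices(x, k=n)) for _ in range(B)]
    return stdev(reps), reps


# Made-up data; for the mean, the result can be compared with s / sqrt(n).
x = [12.0, 7.5, 9.1, 30.2, 4.4, 15.8, 8.3, 11.0, 6.7, 21.5]
se_boot, reps = bootstrap_se(x, mean, B=500, seed=1)
se_theory = stdev(x) / math.sqrt(len(x))
print(f"bootstrap SE = {se_boot:.2f}, theory SE = {se_theory:.2f}")
```

For the mean the two numbers should be close. The bootstrap version is useful because the same function works unchanged for statistics with no simple standard error formula.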
Hillsborough River at Zephyr Hills, September flows
[Figure: histogram of September flows; x-axis from 0 to 35000, y-axis: frequency]
Mean = 8621 mgal
S = 8194 mgal
N = 31
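As a worked example with the reported summary statistics, and assuming the sample standard deviation S stands in for $\sigma$, the theory-based standard error of the mean is:

$$SE_{\bar{x}} = \frac{S}{\sqrt{N}} = \frac{8194}{\sqrt{31}} \approx 1472 \text{ mgal}$$

One standard error around the mean then spans about 7149 to 10093 mgal, and two standard errors span about 5678 to 11564 mgal.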
Uncertainty on estimates of the mean
[Figure: frequency histogram; x-axis: millions of gallons, 0 to 35000; y-axis: frequency]
Shown on the figure: one and two standard errors, $SE_{\bar{x}} = \sqrt{\sigma^2 / N}$, and the 95% CI and interquartile range from 500 bootstrap samples
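A 95% interval and an interquartile range can be read directly off 500 bootstrap replications of the mean, as sketched below. The original flow record is not in this transcript, so the data are synthetic right-skewed values and not the Hillsborough River series. Using percentiles of the replications for the 95% CI is also an assumption, since the slide does not say how its interval was computed.

```python
import random
from statistics import mean, quantiles

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Synthetic, right-skewed stand-in for 31 September flows (NOT the real record).
flows = [rng.lognormvariate(8.7, 0.9) for _ in range(31)]

# 500 bootstrap replications of the mean.
B = 500
reps = [mean(rng.choices(flows, k=len(flows))) for _ in range(B)]

# quantiles(..., n=40) returns 39 cut points at 2.5%, 5%, ..., 97.5%.
cuts40 = quantiles(reps, n=40)
ci_lo, ci_hi = cuts40[0], cuts40[-1]

# quantiles(..., n=4) returns the three quartiles.
q25, _, q75 = quantiles(reps, n=4)

print(f"sample mean          = {mean(flows):.0f}")
print(f"95% percentile CI    = ({ci_lo:.0f}, {ci_hi:.0f})")
print(f"interquartile range  = ({q25:.0f}, {q75:.0f})")
```

With skewed data the percentile interval is usually not symmetric about the sample mean, unlike the interval built from one or two standard errors.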