Adv Physiol Educ 33: 286–292, 2009;
doi:10.1152/advan.00062.2009.
Staying Current
Explorations in statistics: the bootstrap
Douglas Curran-Everett
Division of Biostatistics and Bioinformatics, National Jewish Health, and Department of Biostatistics and Informatics
and Department of Physiology and Biophysics, School of Medicine, University of Colorado Denver, Denver, Colorado
Submitted 20 July 2009; accepted in final form 4 September 2009
Keywords: Central Limit Theorem; R; sample mean; software; standard error
This fourth paper in Explorations in Statistics (see Refs. 3–5) explores the bootstrap, a recent development in statistics (8, 11–16) that evolved from the jackknife (25, 28, 36). Despite its brief history, the bootstrap is discussed in textbooks of statistics (26) and used in manuscripts published by the American Physiological Society (2, 10, 17–24, 35, 37).
The bootstrap¹ gives us an empirical approach to estimate the theoretical variability among possible values of a sample statistic.² In our previous explorations (3–5) we calculated the standard error of the mean, our estimate of the theoretical variability among sample means, as SE{ȳ} = s/√n, where s was the sample standard deviation and n was the number of observations in the sample. This computation was grounded in theory (7). In our last exploration (4) we derived a confidence interval for the population mean. This too was grounded in theory. The beauty of the bootstrap is that it can transcend theory: we can use the bootstrap to make an inference about some experimental result when the theory is uncertain or even unknown (14, 26). We can also use the bootstrap to assess how well the theory holds: that is, whether an inference we make from a hypothesis test or confidence interval is justified.
R: Basic Operations

To install boot, open R and then click Packages | Install package(s) . . . ⁵ Select a CRAN mirror⁶ close to your location and then click OK. Select boot and then click OK. When you have installed boot, you will see

package 'boot' successfully unpacked and MD5 sums checked

in the R Console.

To run R commands. If you use a Mac, highlight the commands you want to submit and then press ⌘+Enter (command key + enter). If you use a PC, highlight the commands you want to submit, right-click, and then click Run line or selection. Or, highlight the commands you want to submit and then press Ctrl+R.
The Simulation: Data for the Bootstrap
In our early explorations (3–5) we drew random samples of 9 observations from a standard normal distribution with mean μ = 0 and standard deviation σ = 1. These were the observations, the data, for samples 1, 2, and 1000:
> # Sample        Observations
  1        0.422   1.103   1.006   1.034   0.285  −0.647   1.235   0.912   1.825
  2        0.154  −0.654  −0.147   1.715   0.720   0.804   0.256   1.155   0.646
  :
  1000     0.560  −1.138   0.485  −0.864  −0.277   2.198   0.050   0.500   0.587
In the first article (3) of this series, I summarized R (29) and outlined its installation.³ For this exploration, there are two additional steps: download Advances_Statistics_Code_Boot.R⁴ to your Advances folder and install the extra package boot (see R: Basic Operations).

¹ The APPENDIX reviews the origin of the name bootstrap.
² For example, a sample mean, a sample standard deviation, or a sample correlation.
³ I developed the scripts for the early explorations (3–5) using R-2.6.2. I developed the script for this exploration using R-2.8.2 (deployed 22 Dec 2008), but it will run in R-2.6.2.
Address for reprint requests and other correspondence: D. Curran-Everett,
Div. of Biostatistics and Bioinformatics, M222, National Jewish Health, 1400
Jackson St., Denver, CO 80206 (e-mail: [email protected]).
Each time we drew a random sample we calculated some
sample statistics. These were the statistics for samples 1, 2, and
1000:
> # Sample    Mean     SD      SE       t      LCI     UCI
  1          0.797   0.702   0.234   3.407   0.362   1.232
  2          0.517   0.707   0.236   2.193   0.079   0.955
  :
  1000       0.233   0.975   0.325   0.718  −0.371   0.838
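The row for sample 1 can be reproduced by hand. A minimal sketch in Python (the article's own script is in R; here the critical value t0.05,8 = 1.8595 is taken from a t table rather than computed):

```python
import math

# Observations from sample 1 (see the listing above)
y = [0.422, 1.103, 1.006, 1.034, 0.285, -0.647, 1.235, 0.912, 1.825]
n = len(y)

mean = sum(y) / n
sd = math.sqrt(sum((v - mean) ** 2 for v in y) / (n - 1))  # sample SD
se = sd / math.sqrt(n)                                     # SE of the mean
t = mean / se                                              # tests mu = 0
t_crit = 1.8595  # t(alpha/2 = 0.05, 8 df), from a t table
lci, uci = mean - t_crit * se, mean + t_crit * se          # 90% CI
```

Rounded to 3 decimal places, these values match the first row of the table above.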
In contrast to our early explorations that used 1000 samples
from our standard normal population, our exploration of the
bootstrap uses just the 9 observations from sample 1:
0.422   1.103   1.006   1.034   0.285   −0.647   1.235   0.912   1.825
In our previous exploration (4) we used these observations to calculate a 90% confidence interval for the population mean:

[0.362, 1.232] ≐ [0.36, 1.23].

Because this interval excluded 0, we declared, with 90% confidence, that 0 was not a plausible value of the population mean. This inference was consistent with the result of
⁴ This file is available through the Supplemental Material for this article at the Advances in Physiology Education website.
⁵ The notation click A | B means click A, then click B.
⁶ CRAN stands for the Comprehensive R Archive Network. A mirror is a duplicate server.
1043-4046/09 $8.00 Copyright © 2009 The American Physiological Society
Curran-Everett D. Explorations in statistics: the bootstrap. Adv
Physiol Educ 33: 286–292, 2009; doi:10.1152/advan.00062.2009.—
Learning about statistics is a lot like learning about science: the
learning is more meaningful if you can actively explore. This fourth
installment of Explorations in Statistics explores the bootstrap. The
bootstrap gives us an empirical approach to estimate the theoretical
variability among possible values of a sample statistic such as the
sample mean. The appeal of the bootstrap is that we can use it to make
an inference about some experimental result when the statistical
theory is uncertain or even unknown. We can also use the bootstrap to
assess how well the statistical theory holds: that is, whether an
inference we make from a hypothesis test or confidence interval is
justified.
Table 1. Illustration of bootstrap samples

                    Bootstrap Sample j
    y           1         2        …         b
  0.422      −0.647    0.285      …       1.034
  1.103       0.285    1.235      …       0.285
  1.006       0.912    1.825      …       1.103
  1.034       0.912    1.034      …       0.912
  0.285       1.825    1.006      …       1.103
 −0.647       1.825    1.103      …       0.422
  1.235       0.285    0.912      …      −0.647
  0.912       1.006    1.103      …       0.912
  1.825      −0.647    0.422      …       1.825

ȳ = 0.797   ȳ*1 = 0.640   ȳ*2 = 0.992   …   ȳ*b = 0.772
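The column means in Table 1 can be checked directly. A sketch in Python, with the columns transcribed from the table as reconstructed above (boot1, boot2, and bootb are hypothetical names for the three bootstrap samples):

```python
# Columns transcribed from Table 1
y = [0.422, 1.103, 1.006, 1.034, 0.285, -0.647, 1.235, 0.912, 1.825]
boot1 = [-0.647, 0.285, 0.912, 0.912, 1.825, 1.825, 0.285, 1.006, -0.647]
boot2 = [0.285, 1.235, 1.825, 1.034, 1.006, 1.103, 0.912, 1.103, 0.422]
bootb = [1.034, 0.285, 1.103, 0.912, 1.103, 0.422, -0.647, 0.912, 1.825]

for s in (boot1, boot2, bootb):
    # a bootstrap sample can only contain the original observations
    assert all(v in y for v in s)

means = [round(sum(s) / len(s), 3) for s in (y, boot1, boot2, bootb)]
print(means)  # -> [0.797, 0.64, 0.992, 0.772]
```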
the hypothesis test in our second exploration (5), in which we realized that when we draw a single random sample from some population, when we do a single experiment, we can have enough unusual observations that it appears the observations came from a different population. This is what happened with our first sample.

With this brief review of the observations from our first sample, we are ready to explore the bootstrap.
We use the notation ȳ*j to represent the mean of bootstrap sample j (11, 16).

Suppose we generate 10,000 bootstrap replications of the sample mean (Fig. 1). If we treat these 10,000 sample means as observations, we can calculate their average and standard deviation:

Ave{ȳ*} = 0.798 and SD{ȳ*} = 0.219.
The standard deviation SD{ȳ*} describes the variability among the b bootstrap means and estimates the standard deviation of the theoretical distribution of the sample mean (16). The commands in lines 85–86 of Advances_Statistics_Code_Boot.R return these values. Your values will differ slightly.
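The resampling behind these two numbers is a short loop. A sketch in Python (the article's script does this in R with the boot package; here plain resampling with replacement suffices):

```python
import random
import statistics

random.seed(1)  # for reproducibility; your values will differ slightly

y = [0.422, 1.103, 1.006, 1.034, 0.285, -0.647, 1.235, 0.912, 1.825]
B = 10_000

# draw B bootstrap samples of size 9, with replacement, and keep each mean
boot_means = [statistics.fmean(random.choices(y, k=len(y))) for _ in range(B)]

ave = statistics.fmean(boot_means)  # close to the observed mean, 0.797
sd = statistics.stdev(boot_means)   # estimates the SE of the mean, ~0.22
```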
We can do more than just calculate Ave{ȳ*} and SD{ȳ*}: we can assess whether the distribution of these 10,000 bootstrap means is consistent with a normal distribution. How? By using a normal quantile plot (Fig. 2). It turns out that these 10,000 bootstrap means are not consistent with a normal distribution (Fig. 3). On the other hand, 10,000 bootstrap replications of the sample mean using the observations from sample 1000 are consistent with a normal distribution (Fig. 4).

And last, we can use bootstrap replications of the sample mean to estimate different kinds of confidence intervals for the population mean: for example, normal-theory, percentile, and bias-corrected-and-accelerated confidence intervals (16, 26).

Normal-theory confidence interval. In our last exploration (4) we calculated a 100(1 − α)% confidence interval for the population mean as
The Bootstrap
In our previous exploration (4) we used the standard error of the mean, our estimate of the theoretical variability among possible values of the sample mean, to calculate a confidence interval for the mean of the underlying population. Rather than use theory (7) to develop the notion of the standard error of the mean, we used a simulation: we drew 1000 random samples from our population and, for each sample, we calculated the mean (3). When we treated those 1000 sample means as observations, we calculated their standard deviation:

SD{ȳ} = 0.326.

The standard deviation of sample means is the standard error of the sample mean SE{ȳ}. The bootstrap estimates the standard error of a statistic using not a whole bunch of random samples from some theoretical population but actual sample observations.

Suppose we want to bootstrap the mean using the observations from the first sample: 0.422, 1.103, . . . , 1.825. How do we do this? We draw at random, with replacement,⁷ a sample of size 9 from these 9 actual observations (Table 1). We then repeat this process until we have drawn a total of b bootstrap samples.⁸ For each bootstrap sample, we calculate its mean.
⁷ After an observation is drawn from the pool of 9 values, it is returned to the pool. The consequence: the observation can appear more than once in a bootstrap sample.
⁸ The number of bootstrap replications can vary from 1,000 to 10,000 (11, 16, 26).
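The simulation that produced SD{ȳ} = 0.326 is equally short. A sketch in Python, assuming the same design: 1000 random samples of 9 observations from a standard normal population, keeping each sample mean:

```python
import random
import statistics

random.seed(3)  # for reproducibility; your value will differ slightly

# 1000 random samples of 9 observations from N(0, 1); keep each sample mean
means = [statistics.fmean(random.gauss(0, 1) for _ in range(9))
         for _ in range(1000)]

sd_of_means = statistics.stdev(means)  # close to sigma / sqrt(n) = 1/3
```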
Fig. 1. Bootstrap distribution of 10,000 sample means, each with 9 observations. The bootstrap replications were drawn at random from the observations in sample 1 (see Table 1). The average of these sample means, Ave{ȳ*}, is 0.798, nearly identical to the observed sample mean ȳ = 0.797. The standard deviation of these sample means, SD{ȳ*}, is 0.219, which estimates the standard deviation of the theoretical distribution of the sample mean, σ/√n = 0.333 (3). The commands in lines 65–78 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 1: first line to Figure 1: last line.
y, Actual observations from sample 1. Because the bootstrap procedure samples with replacement, an original observation can appear more than once in a bootstrap sample. The bootstrap sample means ȳ*1, ȳ*2, . . . , ȳ*b cluster around 0.797, the observed sample mean ȳ. The commands in lines 40–49 of Advances_Statistics_Code_Boot.R create three bootstrap samples from the observations in sample 1. To generate three bootstrap samples as in this table, highlight and submit the lines of code from Table 1: first line to Table 1: last line.
[ȳ − a, ȳ + a],

where the allowance a was

a = zα/2 · SD{ȳ}.

Recall that zα/2 is the 100[1 − (α/2)]th percentile from the standard normal distribution and SD{ȳ} is the standard deviation of the sample means, σ/√n.

A normal-theory confidence interval based on bootstrap replications of the sample mean is similar. We just replace ȳ with Ave{ȳ*}, a with a*, and SD{ȳ} with SD{ȳ*}:

[Ave{ȳ*} − a*, Ave{ȳ*} + a*].

The bootstrap allowance a* is

a* = zα/2 · SD{ȳ*}.

Suppose we want to calculate a 90% bootstrap confidence interval for the population mean. In this situation, α = 0.10 and zα/2 = 1.645. Therefore, the allowance a* is

a* = zα/2 · SD{ȳ*} = 1.645 · 0.219 = 0.360,

and the resulting 90% confidence interval is

[0.438, 1.158] ≐ [0.44, 1.16].

This bootstrap confidence interval resembles the confidence interval of [0.36, 1.23] that we calculated before (4). The commands in lines 196–199 of Advances_Statistics_Code_Boot.R return these values. Your values will differ slightly.

Percentile confidence interval. In the bootstrap distribution of 10,000 sample means (see Fig. 1), 90% of the means are covered by the interval

[0.426, 1.143] ≐ [0.43, 1.14].

The commands in lines 196–199 of Advances_Statistics_Code_Boot.R return these values.
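Both bootstrap intervals can be computed directly from the replications. A self-contained sketch in Python (plain resampling stands in for the boot package; exact values vary from run to run):

```python
import random
import statistics

random.seed(1)  # your values will differ slightly
y = [0.422, 1.103, 1.006, 1.034, 0.285, -0.647, 1.235, 0.912, 1.825]
B = 10_000
boot_means = sorted(statistics.fmean(random.choices(y, k=len(y)))
                    for _ in range(B))

# normal-theory bootstrap interval: Ave{y*} +/- z(alpha/2) * SD{y*}
ave = statistics.fmean(boot_means)
sd = statistics.stdev(boot_means)
a_star = 1.645 * sd
normal_ci = (ave - a_star, ave + a_star)

# percentile interval: the middle 90% of the sorted bootstrap means
percentile_ci = (boot_means[int(0.05 * B)], boot_means[int(0.95 * B) - 1])
```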
Fig. 4. Normal quantile plot of 10,000 sample means, each with 9 observations. The bootstrap replications were drawn at random from the observations
in sample 1000. These bootstrap sample means are consistent with a normal
distribution. The commands in lines 183–189 of Advances_Statistics_Code_
Boot.R create this data graphic. To generate this data graphic, highlight and
submit the lines of code from Figure 4: first line to Figure 4: last line.
Fig. 2. Normal quantile plots (right) of 100 observations drawn from a normal (top) or nonnormal (bottom) distribution. If observations are consistent with a normal distribution, then they will follow a rough straight line. If observations are inconsistent with a normal distribution, then they will deviate systematically from a straight line. The commands in lines 109–156 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 2: first line to Figure 2: last line.
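The coordinates of a normal quantile plot are simple to construct: sort the observations and pair them with the quantiles a standard normal distribution would predict for a sample of that size. A sketch in Python, using the plotting positions (i − 0.5)/n (one common convention among several):

```python
import statistics

def normal_quantile_coords(data):
    """Return (theoretical quantile, observed value) pairs for a quantile plot."""
    n = len(data)
    z = statistics.NormalDist()
    # plotting positions (i - 0.5)/n, a common convention
    return [(z.inv_cdf((i + 0.5) / n), v) for i, v in enumerate(sorted(data))]

coords = normal_quantile_coords([0.422, 1.103, 1.006, 1.034, 0.285,
                                 -0.647, 1.235, 0.912, 1.825])
```

Plotting the pairs and judging how closely they follow a straight line is the visual check the figure describes.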
Fig. 3. Normal quantile plot of 10,000 sample means, each with 9 observations.
The bootstrap replications were drawn at random from the observations in sample
1 (see Table 1). The bootstrap sample means are inconsistent with a normal
distribution. The commands in lines 165–171 of Advances_Statistics_Code_
Boot.R create this data graphic. To generate this data graphic, highlight and submit
the lines of code from Figure 3: first line to Figure 3: last line.
Bias-corrected-and-accelerated confidence interval. If the number of observations in the actual sample is too small, if the average Ave{ȳ*} of the bootstrap sample means differs from the sample mean ȳ, or if the bootstrap distribution of the sample mean is skewed, then normal-theory and percentile confidence intervals are likely to be inaccurate (16, 26). A bias-corrected-and-accelerated confidence interval adjusts percentiles of the bootstrap distribution to account for bias and skewness (16, 26). In this exploration, the 90% bias-corrected-and-accelerated confidence interval is
[0.376, 1.113] ≐ [0.38, 1.11].

What happens, however, if we doubt that the theoretical distribution of the sample mean is consistent with a normal distribution? Our suspicion would be natural: this distribution is, after all, a theoretical one. One approach is to transform the sample observations. Common transformations include the logarithm and the inverse. Box and Cox (1) described a family of power transformations in which an observed variable y is transformed into the variable w using the parameter λ:

w = (y^λ − 1)/λ   for λ ≠ 0, and
w = ln y          for λ = 0.

The logarithmic, λ = 0, and inverse, λ = −1, transformations belong to this family.
Draper and Smith (9) summarized the steps needed to estimate λ and its approximate 100(1 − α)% confidence interval. These steps include, for each trial value of λ, the calculation of the maximum likelihood ℓmax:

ℓmax = −(n/2) ln (SSresidual/n) + (λ − 1) Σⁿᵢ₌₁ ln yᵢ ,    (1)

where SSresidual is the residual sum of squares in the fit of a general linear model to the observations. The optimum estimate of λ maximizes ℓmax (Fig. 5).
Without question, transformation can be useful (1, 9). But
really, who wants to identify the optimum transformation by
solving Eq. 1? The bootstrap provides another way.
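For readers who do want to solve Eq. 1, a grid search is enough when the general linear model is a mean-only model, so that SSresidual is the sum of squared deviations of the transformed values from their mean. A sketch in Python with a small hypothetical positive-valued sample (the data and grid are illustrative assumptions, not the article's):

```python
import math

def box_cox(y, lam):
    """Box-Cox power transformation of a positive-valued sample."""
    return [math.log(v) if lam == 0 else (v ** lam - 1) / lam for v in y]

def l_max(y, lam):
    """Eq. 1 for a mean-only model: SSresidual = sum of squared deviations."""
    n = len(y)
    w = box_cox(y, lam)
    wbar = sum(w) / n
    ss = sum((v - wbar) ** 2 for v in w)
    return (-0.5 * n * math.log(ss / n)
            + (lam - 1) * sum(math.log(v) for v in y))

y = [0.9, 1.4, 2.3, 3.1, 4.8, 7.2, 11.5, 19.0]  # hypothetical skewed data
grid = [round(l * 0.1, 1) for l in range(-20, 21)]
best = max(grid, key=lambda lam: l_max(y, lam))  # lambda maximizing Eq. 1
```

Because this sample grows roughly geometrically, the grid search lands near λ = 0, which points to a log transformation.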
The textbook I use in my statistics course provides the data: measurements of C-reactive protein (Table 2). Problem 7.26 (Ref. 26, p. 442–443) asks students to study the distribution of these 40 observations and to calculate a 95% confidence interval for the true C-reactive protein mean. The real value of problem 7.26 is that it then asks students if a confidence interval is even appropriate for these data.
Cursory examinations of the histogram and normal quantile
plot reveal that the C-reactive protein values are skewed and
inconsistent with a normal distribution (Fig. 6, top). Still, if the
theoretical distribution of the sample mean is roughly normal–if
The Bootstrap in Data Transformation

In our previous exploration (4) we delved into confidence intervals by drawing random samples from a normal distribution. That we chose a normal distribution for our population was no accident. For normal-theory confidence intervals to be meaningful, one thing must be approximately true: if the random variable Y represents the physiological thing we care about, then the theoretical distribution of the sample mean Ȳ with n observations must be distributed normally with mean μ and standard deviation σ/√n.⁹ In our simulations (3–5) this assumption was satisfied exactly (7). In a real experiment we never know if this assumption is satisfied, but we know it will be satisfied at least roughly, regardless of the population distribution from which the sample observations came, as long as the sample size n is big enough (27).
⁹ In our second exploration (5) we used the test statistic t to investigate hypothesis tests, test statistics, and P values. A t statistic shares this assumption.
Fig. 5. Maximum likelihood approach to data transformation. For each value of λ, the maximum likelihood ℓmax is calculated according to Eq. 1. For the C-reactive protein values (see Table 2), the maximum likelihood estimate of λ that maximizes ℓmax is −0.1, and an approximate 95% confidence interval for λ is [−0.4, +0.1]. Because this interval includes 0, a log transformation of the C-reactive protein values is reasonable.
The commands in lines 196–199 of Advances_Statistics_Code_Boot.R return these values. This confidence interval is shifted slightly to the left of the percentile interval of [0.43, 1.14]. In many situations, a bias-corrected-and-accelerated confidence interval will provide a more accurate estimate of our uncertainty about the true value of a population parameter such as the mean (16, 26).
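The bias-corrected-and-accelerated recipe itself is brief: shift the percentiles of the bootstrap distribution by a bias correction z0 (where the observed mean falls in the bootstrap distribution) and an acceleration a (estimated by the jackknife). A sketch in Python after the recipe in Efron and Tibshirani (16); in R, the boot package's boot.ci computes this for you:

```python
import random
import statistics

def bca_ci(y, B=10_000, alpha=0.10, seed=1):
    """Sketch of a BCa interval for the mean, after Efron and Tibshirani (16)."""
    random.seed(seed)
    nd = statistics.NormalDist()
    theta_hat = statistics.fmean(y)
    boots = sorted(statistics.fmean(random.choices(y, k=len(y)))
                   for _ in range(B))

    # bias correction: where the observed mean falls in the bootstrap distribution
    z0 = nd.inv_cdf(sum(b < theta_hat for b in boots) / B)

    # acceleration, estimated from jackknife (leave-one-out) means
    jack = [statistics.fmean(y[:i] + y[i + 1:]) for i in range(len(y))]
    jbar = statistics.fmean(jack)
    a = (sum((jbar - j) ** 3 for j in jack)
         / (6 * sum((jbar - j) ** 2 for j in jack) ** 1.5))

    def adjusted(z):
        # percentile of the bootstrap distribution after the BCa adjustment
        return nd.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    lo = boots[int(adjusted(nd.inv_cdf(alpha / 2)) * B)]
    hi = boots[min(int(adjusted(nd.inv_cdf(1 - alpha / 2)) * B), B - 1)]
    return lo, hi

lo, hi = bca_ci([0.422, 1.103, 1.006, 1.034, 0.285, -0.647, 1.235, 0.912, 1.825])
```

With the sample 1 observations, the endpoints land near the [0.38, 1.11] interval reported above, though the exact values depend on the replications.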
Limitations. As useful as the bootstrap is, it cannot always salvage the statistical analysis of a small sample. Why not? If the sample is too small, then it may be atypical of the underlying population. When this happens, the bootstrap distribution will not mirror the theoretical distribution of the sample statistic. The trouble is, it may not be obvious how small too small is. A statistician (see Ref. 6, guideline 1) and an estimate of power can help.

In our second and third explorations (4, 5) we concluded that the observations from sample 1 were consistent with having come from a population that had a mean other than 0. Only because we had defined the underlying population (see Ref. 3) could we have known that we had erred. When we do a single experiment, we can have enough unusual observations that it just appears the observations came from a different population (5). A small sample size exacerbates the potential for this phenomenon. This is what happened when we bootstrapped the sample mean using observations from the first sample.
We know the theoretical distribution of the sample mean is
exactly normal (3), but the distribution of those bootstrap means
is clearly not normal (see Fig. 3). This happens because the
observations from the first sample are atypical of the underlying
population. The message? All bets are off if you bootstrap a
sample statistic using observations from a sample that is too small.
Table 2. C-reactive protein data

Observations, n = 40

   y       w         y       w         y       w         y       w
  0      0        73.20   4.307      0      0        15.74   2.818
  3.90   1.589     0      0          0      0         0      0
  5.64   1.893    46.70   3.865     4.81   1.760      0      0
  8.22   2.221     0      0         9.57   2.358      0      0
  0      0         0      0         5.36   1.850      0      0
  5.62   1.890    26.41   3.311      0      0         9.37   2.339
  3.92   1.593    22.82   3.171     5.66   1.896     20.78   3.081
  6.81   2.055     0      0          0      0         7.10   2.092
  0      0        30.61   3.453    59.76   4.107      7.89   2.185
  0      0         3.49   1.502    12.38   2.594      5.53   1.876
the Central Limit Theorem holds, then the 95% confidence interval [4.74, 15.33] mg/l will be meaningful. But we have an intractable problem: we have no way of knowing if the sample size of 40 is big enough for the theoretical distribution of the sample mean to be roughly normal.
Problem 7.27 (Ref. 26, p. 443) tells students a log transformation¹⁰ decreases skewness and asks them to transform the actual C-reactive protein observations by adding 1 to each observation and taking the logarithm of that number:

w = ln (y + 1).

Problem 7.27 then asks students to study the distribution of the 40 transformed observations and to calculate a 95% confidence interval for the true transformed C-reactive protein mean.

A histogram and normal quantile plot show that the transformed values are less skewed but still inconsistent with a normal distribution (Fig. 6, bottom). As my students tell me, "Better, but not great." The 95% confidence interval [1.05, 1.94] reverts to

[e^1.05 − 1, e^1.94 − 1] ≐ [1.86, 5.96] mg/l.
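Reverting the transformed interval is direct exponentiation. A quick check in Python:

```python
import math

lci, uci = 1.05, 1.94  # 95% CI for the mean of w = ln(y + 1)
back = (math.exp(lci) - 1, math.exp(uci) - 1)
print(round(back[0], 2), round(back[1], 2))  # -> 1.86 5.96
```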
At this point, we have another problem: we have assumed the log transformation ln (y + 1) is useful, that it gives us a meaningful confidence interval, but what evidence do we have that it really is? You guessed it: the bootstrap.

A bootstrap distribution estimates the theoretical distribution of some sample statistic. In this problem, the bootstrap sample means from the actual observations are inconsistent with a normal distribution (Fig. 7, top). This means the sample size of 40 is not big enough for the theoretical distribution of the sample mean to be roughly normal; as a result, the confidence interval [4.74, 15.33] mg/l is misleading. On the other hand, the bootstrap sample means from the transformed C-reactive protein observations are consistent with a normal distribution (Fig. 7, bottom): the confidence interval [1.05, 1.94] ≐ [1.86, 5.96] mg/l is a useful tool for inference.
¹⁰ The Box and Cox method identifies a log transformation of the C-reactive protein values as the optimal transformation (see Fig. 5).
Fig. 6. Distributions (left) and normal quantile plots (right) of the actual and
transformed C-reactive protein observations. The transformation changed the
distribution of the observations, but the transformed values remain inconsistent
with a normal distribution. The commands in lines 232–252 of Advances_
Statistics_Code_Boot.R create this data graphic. To generate this data graphic,
highlight and submit the lines of code from Figure 6: first line to Figure 6: last line.
Fig. 7. Bootstrap distributions (left) and normal quantile plots (right) of 10,000 sample means, each with 40 observations. The bootstrap replications were drawn at random from the actual or transformed C-reactive protein observations in Table 2. The bootstrap sample means from the actual observations are inconsistent with a normal distribution. In contrast, the bootstrap sample means from the transformed observations are consistent with a normal distribution. The commands in lines 259–279 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 7: first line to Figure 7: last line.
y, Actual C-reactive protein observations (in mg/l) (26). The average, standard deviation, and standard error of these observations are Ave{y} = 10.03, SD{y} = 16.56, and SE{ȳ} = 2.62. The transformed observations, w, result from ln (y + 1). The average, standard deviation, and standard error of the transformed observations are Ave{w} = 1.495, SD{w} = 1.391, and SE{w̄} = 0.220. For each set of observations, if we want to calculate a 95% confidence interval for the true mean, then α = 0.05, the degrees of freedom v = 39, and tα/2,v = 2.023 (see Ref. 4).
I was four or five miles from the earth at least, when it
broke; I fell to the ground with such amazing violence, that
I found myself stunned, and in a hole nine fathoms deep
at least, made by the weight of my body falling from so great
a height: I recovered, but knew not how to get out again;
however, I dug slopes or steps with my [finger] nails (the
Baron’s nails were then of forty years’ growth), and easily
accomplished it.
Refs. 31 (1786) and 32 (2001)
It was from this 9-fathom-deep hole that the Baron is rumored to have
extricated himself by his bootstraps.
Although the notion of pulling yourself up by your bootstraps is
entirely consistent with Baron Munchausen’s flair for the dramatic, I
failed to find any evidence that the Baron availed himself of this
life-saving technique.
ACKNOWLEDGMENTS
I thank John Ludbrook (Department of Surgery, The University of Melbourne, Victoria, Australia) and Matthew Strand (National Jewish Health,
Denver, CO) for giving helpful comments and suggestions, Sarah Kareem
(Department of English, University of California, Los Angeles, CA) for
humoring my questions and for graciously providing a lot of information about
Baron Munchausen, and Bernhard Wiebel (Munchausen Library, Zurich,
Switzerland) for searching more than 80 English editions of the Baron’s
adventures for mention of bootstraps.
APPENDIX
In 1993, Efron and Tibshirani (16) wrote that bootstrap, a term
coined originally by Efron in 1979 (11), was inspired by the colloquialism pull yourself up by your bootstraps, a nonsensical notion
generally attributed to an escapade of the fictitious Baron Munchausen
(30). When I learned this, I embarked on a search for the escapade.
With the generous assistance of Dr. Sarah Kareem (Department of
English, University of California, Los Angeles, CA) and Bernhard
Wiebel (Munchausen Library, Zurich, Switzerland), this is what I
discovered.
The complete title of the Baron’s adventures (30), written by
Rudolf Erich Raspe, was Baron Munchausen’s Narrative of His
Marvellous Travels and Campaigns in Russia. Humbly Dedicated
and Recommended to Country Gentlemen; and, If They Please, to
be Repeated as Their Own, After a Hunt at Horse Races, in
Watering-Places, and Other Such Polite Assemblies; Round the
Bottle and Fire-Side.
In chapter VI, the Baron throws a silver hatchet at two bears in hopes of rescuing a bee.¹¹ The hatchet misses both bears and ends
up on the moon. To retrieve the hatchet, the Baron grows and
promptly climbs a Turkey-bean after it had attached itself to the
moon. He finds his hatchet but then discovers the bean has dried
up, so he braids a rope of straw to use for the descent. The Baron
is partway down the straw rope when the next catastrophe hits. The
details of the Baron’s escape differ according to the account you
happen to read:
I was still a couple of miles in the clouds when it broke, and
with such violence I fell to the ground that I found myself
stunned, and in a hole nine fathoms under grass, when I
recovered, hardly knowing how to get out again. There was no
other way than to go home for a spade and to dig me out by
slopes, which I fortunately accomplished, before I had been so
much as missed by the steward.
Refs. 30 (1785), 33 (1948), and 34 (1952)
¹¹ It is a long story.
REFERENCES

1. Box GEP, Cox DR. An analysis of transformations. J R Stat Soc Ser B 26: 211–243, 1964.
2. Carra J, Candau R, Keslacy S, Giolbas F, Borrani F, Millet GP, Varray A, Ramonatxo M. Addition of inspiratory resistance increases the amplitude of the slow component of O2 uptake kinetics. J Appl Physiol 94: 2448–2455, 2003.
3. Curran-Everett D. Explorations in statistics: standard deviations and standard errors. Adv Physiol Educ 32: 203–208, 2008.
4. Curran-Everett D. Explorations in statistics: confidence intervals. Adv Physiol Educ 33: 87–90, 2009.
5. Curran-Everett D. Explorations in statistics: hypothesis tests and P values. Adv Physiol Educ 33: 81–86, 2009.
6. Curran-Everett D, Benos DJ. Guidelines for reporting statistics in journals published by the American Physiological Society. Physiol Genomics 18: 249–251, 2004.
7. Curran-Everett D, Taylor S, Kafadar K. Fundamental concepts in statistics: elucidation and illustration. J Appl Physiol 85: 775–786, 1998.
8. DiCiccio TJ, Efron B. Bootstrap confidence intervals. Stat Sci 11: 189–228, 1996.
9. Draper NR, Smith H. Applied Regression Analysis (2nd ed.). New York: Wiley, 1981, p. 225–227.
10. Durheim MT, Slentz CA, Bateman LA, Mabe SK, Kraus WE. Relationships between exercise-induced reductions in thigh intermuscular adipose tissue, changes in lipoprotein particle size, and visceral adiposity. Am J Physiol Endocrinol Metab 295: E407–E412, 2008.
11. Efron B. Bootstrap methods: another look at the jackknife. Ann Stat 7: 1–26, 1979.
12. Efron B. Better bootstrap confidence intervals. J Am Stat Assoc 82: 171–185, 1987.
13. Efron B. Discussion: theoretical comparison of bootstrap confidence intervals. Ann Stat 16: 969–972, 1988.
14. Efron B. The bootstrap and modern statistics. J Am Stat Assoc 95: 1293–1296, 2000.
15. Efron B. Second thoughts on the bootstrap. Stat Sci 18: 135–140, 2003.
16. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. New York: Chapman & Hall, 1993.
17. Esbaugh AJ, Perry SF, Gilmour KM. Hypoxia-inducible carbonic anhydrase IX expression is insufficient to alleviate intracellular metabolic acidosis in the muscle of zebrafish, Danio rerio. Am J Physiol Regul Integr Comp Physiol 296: R150–R160, 2009.
18. Fanini A, Assad JA. Direction selectivity of neurons in the macaque lateral intraparietal area. J Neurophysiol 101: 289–305, 2009.
19. Gillis TE, Marshall CR, Tibbits GF. Functional and evolutionary relationships of troponin C. Physiol Genomics 32: 16–27, 2007.
Summary

As this exploration has demonstrated, the bootstrap gives us an approach we can use to assess whether an inference we make from a normal-theory hypothesis test or confidence interval is justified: if the bootstrap distribution of a statistic such as the sample mean is roughly normal, then the inference is justified. But the bootstrap gives us more than that. We can use a bootstrap confidence interval to make an inference about some experimental result when the statistical theory is uncertain or even unknown (14, 26). Although we have explored the bootstrap using the sample mean, we can use the bootstrap to make an inference about other sample statistics such as the standard deviation or correlation coefficient.
In the next installment of this series, we will explore power,
a concept we mentioned in our exploration of hypothesis tests
(5). Power is the probability that we reject the null hypothesis
given that the null hypothesis is false. The notion of power is
integral to hypothesis testing, confidence intervals, and grant
applications.
29. R Development Core Team. R: a Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing, 2008; http://www.R-project.org.
30. Raspe RE. Baron Munchausen's Narrative of his Marvellous Travels and Campaigns in Russia. Oxford: Smith, 1785, p. 45.
31. Raspe RE. Gulliver Revived; or The Singular Travels, Campaigns, Voyages, and Adventures of Baron Munikhouson, commonly called Munchausen. London: Kearsley, 1786, p. 45.
32. Raspe RE. The Surprising Adventures of Baron Munchausen. Rockville, MD: Wildside, 2001, p. 58–59.
33. Raspe RE. Singular Travels, Campaigns and Adventures of Baron Munchausen, edited by Carswell JP, others. London: Cresset, 1948, p. 22.
34. Raspe RE. The Singular Adventures of Baron Munchausen, edited by Carswell JP, others. New York: Heritage, 1952, p. 18.
35. Rozengurt N, Wu SV, Chen MC, Huang C, Sternini C, Rozengurt E. Colocalization of the α-subunit of gustducin with PYY and GLP-1 in L cells of human colon. Am J Physiol Gastrointest Liver Physiol 291: G792–G802, 2006.
36. Tukey J. Bias and confidence in not-quite large samples (abstract). Ann Math Statistics 29: 614, 1958.
37. Yoshimura H, Jones KA, Perkins WJ, Kai T, Warner DO. Calcium sensitization produced by G protein activation in airway smooth muscle. Am J Physiol Lung Cell Mol Physiol 281: L631–L638, 2001.
20. Grenier AL, Abu-ihweij K, Zhang G, Ruppert SM, Boohaker R, Slepkov ER, Pridemore K, Ren JJ, Fliegel L, Khaled AR. Apoptosis-induced alkalinization by the Na+/H+ exchanger isoform 1 is mediated through phosphorylation of amino acids Ser726 and Ser729. Am J Physiol Cell Physiol 295: C883–C896, 2008.
21. Kelley GA, Kelley KS, Tran ZV. Exercise and bone mineral density in men: a meta-analysis. J Appl Physiol 88: 1730–1736, 2000.
22. Kristensen M, Hansen T. Statistical analyses of repeated measures in physiological research: a tutorial. Adv Physiol Educ 28: 2–14, 2004.
23. Lemley KV, Boothroyd DB, Blouch KL, Nelson RG, Jones LI, Olshen RA, Myers BD. Modeling GFR trajectories in diabetic nephropathy. Am J Physiol Renal Physiol 289: F863–F870, 2005.
24. Lott MEJ, Hogeman C, Herr M, Bhagat M, Sinoway LI. Sex differences in limb vasoconstriction responses to increases in transmural pressures. Am J Physiol Heart Circ Physiol 296: H186–H194, 2009.
25. Miller RG. The jackknife: a review. Biometrika 61: 1–15, 1974.
26. Moore DS, McCabe GP, Craig BA. Introduction to the Practice of Statistics (6th ed.). New York: Freeman, 2009.
27. Moses LE. Think and Explain with Statistics. Reading, MA: Addison-Wesley, 1986.
28. Quenouille M. Approximate tests of correlation in time-series. J R Stat Soc Ser B 11: 68–84, 1949.