Curran-Everett D. Explorations in statistics: the bootstrap. Adv Physiol Educ 33: 286–292, 2009; doi:10.1152/advan.00062.2009.

Staying Current

Explorations in statistics: the bootstrap

Douglas Curran-Everett

Division of Biostatistics and Bioinformatics, National Jewish Health, and Department of Biostatistics and Informatics and Department of Physiology and Biophysics, School of Medicine, University of Colorado Denver, Denver, Colorado

Submitted 20 July 2009; accepted in final form 4 September 2009

Learning about statistics is a lot like learning about science: the learning is more meaningful if you can actively explore. This fourth installment of Explorations in Statistics explores the bootstrap. The bootstrap gives us an empirical approach to estimate the theoretical variability among possible values of a sample statistic such as the sample mean. The appeal of the bootstrap is that we can use it to make an inference about some experimental result when the statistical theory is uncertain or even unknown. We can also use the bootstrap to assess how well the statistical theory holds: that is, whether an inference we make from a hypothesis test or confidence interval is justified.

Central Limit Theorem; R; sample mean; software; standard error

THIS FOURTH PAPER in Explorations in Statistics (see Refs. 3–5) explores the bootstrap, a recent development in statistics (8, 11–16) that evolved from the jackknife (25, 28, 36). Despite its brief history, the bootstrap is discussed in textbooks of statistics (26) and used in manuscripts published by the American Physiological Society (2, 10, 17–24, 35, 37). The bootstrap1 gives us an empirical approach to estimate the theoretical variability among possible values of a sample statistic.2 In our previous explorations (3–5) we calculated the standard error of the mean, our estimate of the theoretical variability among sample means, as SE{ȳ} = s/√n, where s was the sample standard deviation and n was the number of observations in the sample. This computation was grounded in theory (7). In our last exploration (4) we derived a confidence interval for the population mean. This too was grounded in theory. The beauty of the bootstrap is that it can transcend theory: we can use the bootstrap to make an inference about some experimental result when the theory is uncertain or even unknown (14, 26). We can also use the bootstrap to assess how well the theory holds: that is, whether an inference we make from a hypothesis test or confidence interval is justified.

R: Basic Operations

In the first article (3) of this series, I summarized R (29) and outlined its installation.3 For this exploration, there are two additional steps: download Advances_Statistics_Code_Boot.R4 to your Advances folder and install the extra package boot. To install boot, open R and then click Packages | Install package(s) . . .5 Select a CRAN mirror6 close to your location and then click OK. Select boot and then click OK. When you have installed boot, you will see

package 'boot' successfully unpacked and MD5 sums checked

in the R Console. To run R commands: if you use a Mac, highlight the commands you want to submit and then press command+enter. If you use a PC, highlight the commands you want to submit, right-click, and then click Run line or selection. Or, highlight the commands you want to submit and then press Ctrl+R.

1 The APPENDIX reviews the origin of the name bootstrap.
2 For example, a sample mean, a sample standard deviation, or a sample correlation.
3 I developed the scripts for the early explorations (3–5) using R-2.6.2. I developed the script for this exploration using R-2.8.2 (deployed 22 Dec 2008), but it will run in R-2.6.2.
4 This file is available through the Supplemental Material for this article at the Advances in Physiology Education website.
5 The notation click A | B means click A, then click B.
6 CRAN stands for the Comprehensive R Archive Network. A mirror is a duplicate server.

Address for reprint requests and other correspondence: D. Curran-Everett, Div. of Biostatistics and Bioinformatics, M222, National Jewish Health, 1400 Jackson St., Denver, CO 80206 (e-mail: [email protected]).

1043-4046/09 $8.00 Copyright © 2009 The American Physiological Society

Downloaded from http://advan.physiology.org/ by 10.220.33.3 on May 5, 2017

The Simulation: Data for the Bootstrap

In our early explorations (3–5) we drew random samples of 9 observations from a standard normal distribution with mean μ = 0 and standard deviation σ = 1. These were the observations–the data–for samples 1, 2, and 1000:

> # Sample   Observations
  [1]     0.422   1.103   1.006   1.034   0.285  −0.647   1.235   0.912   1.825
  [2]     0.154  −0.654  −0.147   1.715   0.720   0.804   0.256   1.155   0.646
   :
  [1000]  0.560  −1.138   0.485  −0.864  −0.277   2.198   0.050   0.500   0.587

Each time we drew a random sample we calculated some sample statistics. These were the statistics for samples 1, 2, and 1000:

> # Sample    Mean     SD      SE      t       LCI      UCI
  [1,]        0.797   0.702   0.234   3.407   0.362    1.232
  [2,]        0.517   0.707   0.236   2.193   0.079    0.955
   :
  [1000,]     0.233   0.975   0.325   0.718  −0.371    0.838

In contrast to our early explorations that used 1000 samples from our standard normal population, our exploration of the bootstrap uses just the 9 observations from sample 1:

0.422   1.103   1.006   1.034   0.285   −0.647   1.235   0.912   1.825

In our previous exploration (4) we used these observations to calculate a 90% confidence interval for the population mean:

[0.362, 1.232] ≈ [0.36, 1.23].

Because this interval excluded 0, we declared, with 90% confidence, that 0 was not a plausible value of the population mean. This inference was consistent with the result of
the hypothesis test in our second exploration (5), in which we realized that when we draw a single random sample from some population–when we do a single experiment–we can have enough unusual observations that it appears the observations came from a different population. This is what happened with our first sample. With this brief review of the observations from our first sample, we are ready to explore the bootstrap.

Table 1. Illustration of bootstrap samples

                 Bootstrap Sample j
    y           1           2        …          b
  0.422      −0.647      0.285      …        1.034
  1.103       0.285      1.235      …        0.285
  1.006       0.912      1.825      …        1.103
  1.034       0.912      1.034      …        0.912
  0.285       1.825      1.006      …        1.103
 −0.647       1.825      1.103      …        0.422
  1.235       0.285      0.912      …       −0.647
  0.912       1.006      1.103      …        0.912
  1.825      −0.647      0.422      …        1.825

ȳ = 0.797   ȳ*1 = 0.640   ȳ*2 = 0.992   …   ȳ*b = 0.772

We use the notation ȳ*j to represent the mean of bootstrap sample j (11, 16). Suppose we generate 10,000 bootstrap replications of the sample mean (Fig. 1). If we treat these 10,000 sample means as observations, we can calculate their average and standard deviation:

Ave{ȳ*} = 0.798 and SD{ȳ*} = 0.219.

The standard deviation SD{ȳ*} describes the variability among the b bootstrap means and estimates the standard deviation of the theoretical distribution of the sample mean (16). The commands in lines 85–86 of Advances_Statistics_Code_Boot.R return these values. Your values will differ slightly. We can do more than just calculate Ave{ȳ*} and SD{ȳ*}: we can assess whether the distribution of these 10,000 bootstrap means is consistent with a normal distribution. How? By using a normal quantile plot (Fig. 2). It turns out that these 10,000 bootstrap means are not consistent with a normal distribution (Fig. 3).
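The resampling scheme just described–draw 9 observations with replacement, average them, and repeat–can be sketched from scratch in base R. This is a minimal illustration rather than the article's supplemental script (the seed, and therefore the exact values, are my own); your summary values will differ slightly from the 0.798 and 0.219 reported above.

```r
# Observations from sample 1 (given in the text)
y1 <- c(0.422, 1.103, 1.006, 1.034, 0.285, -0.647, 1.235, 0.912, 1.825)

set.seed(1)        # for reproducibility; any seed will do
b <- 10000         # number of bootstrap replications

# For each replication, resample n = 9 values WITH replacement and average them
boot.means <- replicate(b, mean(sample(y1, size = length(y1), replace = TRUE)))

mean(boot.means)   # Ave{y*}: close to the observed sample mean, 0.797
sd(boot.means)     # SD{y*}: estimates the SE of the sample mean

# Assess normality of the bootstrap distribution with a normal quantile plot
qqnorm(boot.means); qqline(boot.means)
```

Because each replication resamples the same 9 observations, sample() does all the work; no theoretical formula for SE{ȳ} is needed.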
On the other hand, 10,000 bootstrap replications of the sample mean using the observations from sample 1000 are consistent with a normal distribution (Fig. 4). And last, we can use bootstrap replications of the sample mean to estimate different kinds of confidence intervals for the population mean: for example, normal-theory, percentile, and bias-corrected-and-accelerated confidence intervals (16, 26).

The Bootstrap

In our previous exploration (4) we used the standard error of the mean, our estimate of the theoretical variability among possible values of the sample mean, to calculate a confidence interval for the mean of the underlying population. Rather than use theory (7) to develop the notion of the standard error of the mean, we used a simulation: we drew 1000 random samples from our population and–for each sample–we calculated the mean (3). When we treated those 1000 sample means as observations, we calculated their standard deviation:

SD{ȳ} = 0.326.

The standard deviation of sample means is the standard error of the sample mean SE{ȳ}. The bootstrap estimates the standard error of a statistic using not a whole bunch of random samples from some theoretical population but actual sample observations.

Suppose we want to bootstrap the mean using the observations from the first sample: 0.422, 1.103, . . . , 1.825. How do we do this? We draw at random–with replacement7–a sample of size 9 from these 9 actual observations (Table 1). We then repeat this process until we have drawn a total of b bootstrap samples.8 For each bootstrap sample, we calculate its mean.

7 After an observation is drawn from the pool of 9 values, it is returned to the pool. The consequence: the observation can appear more than once in a bootstrap sample.
8 The number of bootstrap replications can vary from 1,000 to 10,000 (11, 16, 26).

Normal-theory confidence interval. In our last exploration (4) we calculated a 100(1 − α)% confidence interval for the population mean as

[ȳ − a, ȳ + a],

where the allowance a was

a = z_α/2 · SD{ȳ}.

Recall that z_α/2 is the 100[1 − (α/2)]th percentile from the standard normal distribution and SD{ȳ} is the standard deviation of the sample means, σ/√n. A normal-theory confidence interval based on bootstrap replications of the sample mean is similar. We just replace ȳ with Ave{ȳ*}, a with a*, and SD{ȳ} with SD{ȳ*}:

[Ave{ȳ*} − a*, Ave{ȳ*} + a*].

The bootstrap allowance a* is

a* = z_α/2 · SD{ȳ*}.

Suppose we want to calculate a 90% bootstrap confidence interval for the population mean. In this situation, α = 0.10 and z_α/2 = 1.645. Therefore, the allowance a* is

a* = z_α/2 · SD{ȳ*} = 1.645 · 0.219 = 0.360,

and the resulting 90% confidence interval is

[0.438, 1.158] ≈ [0.44, 1.16].

This bootstrap confidence interval resembles the confidence interval of [0.36, 1.23] that we calculated before (4). The commands in lines 196–199 of Advances_Statistics_Code_Boot.R return these values. Your values will differ slightly.

Percentile confidence interval. In the bootstrap distribution of 10,000 sample means (see Fig. 1), 90% of the means are covered by the interval

[0.426, 1.143] ≈ [0.43, 1.14].

The commands in lines 196–199 of Advances_Statistics_Code_Boot.R return these values.

Fig. 1. Bootstrap distribution of 10,000 sample means, each with 9 observations. The bootstrap replications were drawn at random from the observations in sample 1 (see Table 1). The average of these sample means, Ave{ȳ*}, is 0.798, nearly identical to the observed sample mean ȳ = 0.797. The standard deviation of these sample means, SD{ȳ*}, is 0.219, which estimates the standard deviation of the theoretical distribution of the sample mean, σ/√n = 0.333 (3). The commands in lines 65–78 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 1: first line to Figure 1: last line.

Table 1 note: y, Actual observations from sample 1. Because the bootstrap procedure samples with replacement, an original observation can appear more than once in a bootstrap sample. The bootstrap sample means ȳ*1, ȳ*2, . . . , ȳ*b cluster around 0.797, the observed sample mean ȳ. The commands in lines 40–49 of Advances_Statistics_Code_Boot.R create three bootstrap samples from the observations in sample 1. To generate three bootstrap samples as in this table, highlight and submit the lines of code from Table 1: first line to Table 1: last line.

Fig. 2. Normal quantile plots (right) of 100 observations drawn from a normal (top) or nonnormal (bottom) distribution. If observations are consistent with a normal distribution, then they will follow a rough straight line. If observations are inconsistent with a normal distribution, then they will deviate systematically from a straight line. The commands in lines 109–156 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 2: first line to Figure 2: last line.

Fig. 4. Normal quantile plot of 10,000 sample means, each with 9 observations. The bootstrap replications were drawn at random from the observations in sample 1000. These bootstrap sample means are consistent with a normal distribution. The commands in lines 183–189 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 4: first line to Figure 4: last line.

Fig. 3. Normal quantile plot of 10,000 sample means, each with 9 observations.
The bootstrap replications were drawn at random from the observations in sample 1 (see Table 1). The bootstrap sample means are inconsistent with a normal distribution. The commands in lines 165–171 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 3: first line to Figure 3: last line.

Bias-corrected-and-accelerated confidence interval. If the number of observations in the actual sample is too small, if the average Ave{ȳ*} of the bootstrap sample means differs from the sample mean ȳ, or if the bootstrap distribution of the sample mean is skewed, then normal-theory and percentile confidence intervals are likely to be inaccurate (16, 26). A bias-corrected-and-accelerated confidence interval adjusts percentiles of the bootstrap distribution to account for bias and skewness (16, 26). In this exploration, the 90% bias-corrected-and-accelerated confidence interval is

[0.376, 1.113] ≈ [0.38, 1.11].

What happens, however, if we doubt that the theoretical distribution of the sample mean is consistent with a normal distribution? Our suspicion would be natural: this distribution is, after all, a theoretical one. One approach is to transform the sample observations. Common transformations include the logarithm and the inverse. Box and Cox (1) described a family of power transformations in which an observed variable y is transformed into the variable w using the parameter λ:

w = (y^λ − 1)/λ   for λ ≠ 0, and
w = ln y          for λ = 0.

The logarithmic, λ = 0, and inverse, λ = −1, transformations belong to this family. Draper and Smith (9) summarized the steps needed to estimate λ and its approximate 100(1 − α)% confidence interval.
These steps include, for each trial value of λ, the calculation of the maximum likelihood ℓmax:

ℓmax = −(n/2) ln (SSresidual/n) + (λ − 1) Σ ln yi ,   (1)

where the sum runs over i = 1, . . . , n and SSresidual is the residual sum of squares in the fit of a general linear model to the observations. The optimum estimate of λ maximizes ℓmax (Fig. 5). Without question, transformation can be useful (1, 9). But really, who wants to identify the optimum transformation by solving Eq. 1? The bootstrap provides another way.

The textbook I use in my statistics course provides the data: measurements of C-reactive protein (Table 2). Problem 7.26 (Ref. 26, p. 442–443) asks students to study the distribution of these 40 observations and to calculate a 95% confidence interval for the true C-reactive protein mean. The real value of problem 7.26 is that it then asks students if a confidence interval is even appropriate for these data. Cursory examinations of the histogram and normal quantile plot reveal that the C-reactive protein values are skewed and inconsistent with a normal distribution (Fig. 6, top). Still, if the theoretical distribution of the sample mean is roughly normal–if

The Bootstrap in Data Transformation

In our previous exploration (4) we delved into confidence intervals by drawing random samples from a normal distribution. That we chose a normal distribution for our population was no accident. For normal-theory confidence intervals to be meaningful, one thing must be approximately true: if the random variable Y represents the physiological thing we care about, then the theoretical distribution of the sample mean Ȳ with n observations must be distributed normally with mean μ and standard deviation σ/√n.9 In our simulations (3–5) this assumption was satisfied exactly (7).
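The λ search that Eq. 1 describes is easy to automate. The sketch below is illustrative, not the article's script: it profiles ℓmax over a grid of trial λ values for the simplest general linear model–a single overall mean, so SSresidual is just the sum of squared deviations–and it uses simulated lognormal data rather than the C-reactive protein values (which include zeros, so the ln (y + 1) shift used later in the text would be needed before a Box–Cox fit).

```r
# Profile log-likelihood for the Box-Cox parameter lambda (Eq. 1),
# for an intercept-only general linear model.
lmax <- function(lambda, y) {
  w  <- if (abs(lambda) < 1e-8) log(y) else (y^lambda - 1) / lambda
  n  <- length(y)
  ss <- sum((w - mean(w))^2)       # SS_residual for an intercept-only fit
  -(n / 2) * log(ss / n) + (lambda - 1) * sum(log(y))
}

set.seed(2)
y <- exp(rnorm(500))               # lognormal data: log transform should be near-optimal

grid <- seq(-2, 2, by = 0.1)       # trial values of lambda
ll   <- sapply(grid, lmax, y = y)
grid[which.max(ll)]                # optimum lambda: close to 0 for these data
```

For lognormal data the maximizing λ lands near 0, which is the same reasoning Fig. 5 applies to the C-reactive protein values.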
In a real experiment we never know if this assumption is satisfied, but we know it will be satisfied at least roughly–regardless of the population distribution from which the sample observations came–as long as the sample size n is big enough (27).

9 In our second exploration (5) we used the test statistic t to investigate hypothesis tests, test statistics, and P values. A t statistic shares this assumption.

The commands in lines 196–199 of Advances_Statistics_Code_Boot.R return these values. This bias-corrected-and-accelerated interval of [0.38, 1.11] is shifted slightly to the left of the percentile interval of [0.43, 1.14]. In many situations, a bias-corrected-and-accelerated confidence interval will provide a more accurate estimate of our uncertainty about the true value of a population parameter such as the mean (16, 26).

Fig. 5. Maximum likelihood approach to data transformation. For each value of λ, the maximum likelihood ℓmax is calculated according to Eq. 1. For the C-reactive protein values (see Table 2), the maximum likelihood estimate of λ that maximizes ℓmax is −0.1, and an approximate 95% confidence interval for λ is [−0.4, +0.1]. Because this interval includes 0, a log transformation of the C-reactive protein values is reasonable.

Limitations. As useful as the bootstrap is, it cannot always salvage the statistical analysis of a small sample. Why not? If the sample is too small, then it may be atypical of the underlying population. When this happens, the bootstrap distribution will not mirror the theoretical distribution of the sample statistic. The trouble is, it may not be obvious how small too small is. A statistician (see Ref. 6, guideline 1) and an estimate of power can help.

In our second and third explorations (4, 5) we concluded that the observations from sample 1 were consistent with having come from a population that had a mean other than 0.
Only because we had defined the underlying population (see Ref. 3) could we have known that we had erred. When we do a single experiment, we can have enough unusual observations so that it just appears the observations came from a different population (5). A small sample size exacerbates the potential for this phenomenon. This is what happened when we bootstrapped the sample mean using observations from the first sample. We know the theoretical distribution of the sample mean is exactly normal (3), but the distribution of those bootstrap means is clearly not normal (see Fig. 3). This happens because the observations from the first sample are atypical of the underlying population. The message? All bets are off if you bootstrap a sample statistic using observations from a sample that is too small.

Table 2. C-reactive protein data

Observations, n = 40

   y       w         y       w         y       w         y       w
   0      0        73.20   4.307      0      0        15.74   2.818
   3.90   1.589     0      0          0      0         0      0
   5.64   1.893    46.70   3.865      4.81   1.760     0      0
   8.22   2.221     0      0          9.57   2.358     0      0
   0      0         0      0          5.36   1.850     0      0
   5.62   1.890    26.41   3.311      0      0         9.37   2.339
   3.92   1.593    22.82   3.171      5.66   1.896    20.78   3.081
   6.81   2.055     0      0          0      0         7.10   2.092
   0      0        30.61   3.453     59.76   4.107     7.89   2.185
   0      0         3.49   1.502     12.38   2.594     5.53   1.876

the Central Limit Theorem holds–then the 95% confidence interval [4.74, 15.33] mg/l will be meaningful. But we have an intractable problem: we have no way of knowing if the sample size of 40 is big enough for the theoretical distribution of the sample mean to be roughly normal.

Problem 7.27 (Ref. 26, p. 443) tells students a log transformation10 decreases skewness and asks them to transform the actual C-reactive protein observations by adding 1 to each observation and taking the logarithm of that number:

w = ln (y + 1).

Problem 7.27 then asks students to study the distribution of the 40 transformed observations and to calculate a 95% confidence interval for the true transformed C-reactive protein mean.
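The transform-and-revert arithmetic of problem 7.27 can be sketched in R with the Table 2 observations. This mirrors the t-based calculation in the text (t_α/2,v = 2.023 with v = 39 degrees of freedom); it is a sketch of the textbook computation, not the supplemental script.

```r
# Actual C-reactive protein observations (mg/l) from Table 2
y <- c(0, 3.90, 5.64, 8.22, 0, 5.62, 3.92, 6.81, 0, 0,
       73.20, 0, 46.70, 0, 0, 26.41, 22.82, 0, 30.61, 3.49,
       0, 0, 4.81, 9.57, 5.36, 0, 5.66, 0, 59.76, 12.38,
       15.74, 0, 0, 0, 0, 9.37, 20.78, 7.10, 7.89, 5.53)

# Transform: w = ln(y + 1) decreases the skewness
w <- log(y + 1)

# 95% t-based confidence interval for the mean on the transformed scale
n  <- length(y)
ci <- mean(w) + c(-1, 1) * qt(0.975, df = n - 1) * sd(w) / sqrt(n)
round(ci, 2)             # [1.05, 1.94]

# Revert to mg/l: [e^1.05 - 1, e^1.94 - 1]
round(exp(ci) - 1, 2)    # [1.86, 5.96]
```

Back-transforming with exp(ci) − 1 simply undoes w = ln (y + 1), which is how [1.05, 1.94] becomes [1.86, 5.96] mg/l.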
A histogram and normal quantile plot show that the transformed values are less skewed but still inconsistent with a normal distribution (Fig. 6, bottom). As my students tell me, "Better, but not great." The 95% confidence interval [1.05, 1.94] reverts to

[e^1.05 − 1, e^1.94 − 1] ≈ [1.86, 5.96] mg/l.

At this point, we have another problem: we have assumed the log transformation ln (y + 1) is useful–that it gives us a meaningful confidence interval–but what evidence do we have that it really is? You guessed it: the bootstrap. A bootstrap distribution estimates the theoretical distribution of some sample statistic. In this problem, the bootstrap sample means from the actual observations are inconsistent with a normal distribution (Fig. 7, top). This means the sample size of 40 is not big enough for the theoretical distribution of the sample mean to be roughly normal; as a result, the confidence interval [4.74, 15.33] mg/l is misleading. On the other hand, the bootstrap sample means from the transformed C-reactive protein observations are consistent with a normal distribution (Fig. 7, bottom): the confidence interval [1.05, 1.94], or [1.86, 5.96] mg/l, is a useful tool for inference.

10 The Box and Cox method identifies a log transformation of the C-reactive protein values as the optimal transformation (see Fig. 5).

Fig. 6. Distributions (left) and normal quantile plots (right) of the actual and transformed C-reactive protein observations. The transformation changed the distribution of the observations, but the transformed values remain inconsistent with a normal distribution. The commands in lines 232–252 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 6: first line to Figure 6: last line.

Fig. 7. Bootstrap distributions (left) and normal quantile plots (right) of 10,000 sample means, each with 40 observations.
The bootstrap replications were drawn at random from the actual or transformed C-reactive protein observations in Table 2. The bootstrap sample means from the actual observations are inconsistent with a normal distribution. In contrast, the bootstrap sample means from the transformed observations are consistent with a normal distribution. The commands in lines 259–279 of Advances_Statistics_Code_Boot.R create this data graphic. To generate this data graphic, highlight and submit the lines of code from Figure 7: first line to Figure 7: last line.

Table 2 note: y, Actual C-reactive protein observations (in mg/l) (26). The average, standard deviation, and standard error of these observations are Ave{y} = 10.03, SD{y} = 16.56, and SE{y} = 2.62. The transformed observations, w, result from ln (y + 1). The average, standard deviation, and standard error of the transformed observations are Ave{w} = 1.495, SD{w} = 1.391, and SE{w} = 0.220. For each set of observations, if we want to calculate a 95% confidence interval for the true mean, then α = 0.05, the degrees of freedom v = 39, and t_α/2,v = 2.023 (see Ref. 4).

I was four or five miles from the earth at least, when it broke; I fell to the ground with such amazing violence, that I found myself stunned, and in a hole nine fathoms deep at least, made by the weight of my body falling from so great a height: I recovered, but knew not how to get out again; however, I dug slopes or steps with my [finger] nails (the Baron's nails were then of forty years' growth), and easily accomplished it. Refs. 31 (1786) and 32 (2001)

It was from this 9-fathom-deep hole that the Baron is rumored to have extricated himself by his bootstraps.
Although the notion of pulling yourself up by your bootstraps is entirely consistent with Baron Munchausen's flair for the dramatic, I failed to find any evidence that the Baron availed himself of this life-saving technique.

ACKNOWLEDGMENTS

I thank John Ludbrook (Department of Surgery, The University of Melbourne, Victoria, Australia) and Matthew Strand (National Jewish Health, Denver, CO) for giving helpful comments and suggestions, Sarah Kareem (Department of English, University of California, Los Angeles, CA) for humoring my questions and for graciously providing a lot of information about Baron Munchausen, and Bernhard Wiebel (Munchausen Library, Zurich, Switzerland) for searching more than 80 English editions of the Baron's adventures for mention of bootstraps.

APPENDIX

In 1993, Efron and Tibshirani (16) wrote that bootstrap, a term coined originally by Efron in 1979 (11), was inspired by the colloquialism pull yourself up by your bootstraps, a nonsensical notion generally attributed to an escapade of the fictitious Baron Munchausen (30). When I learned this, I embarked on a search for the escapade. With the generous assistance of Dr. Sarah Kareem (Department of English, University of California, Los Angeles, CA) and Bernhard Wiebel (Munchausen Library, Zurich, Switzerland), this is what I discovered.

The complete title of the Baron's adventures (30), written by Rudolf Erich Raspe, was Baron Munchausen's Narrative of His Marvellous Travels and Campaigns in Russia. Humbly Dedicated and Recommended to Country Gentlemen; and, If They Please, to be Repeated as Their Own, After a Hunt at Horse Races, in Watering-Places, and Other Such Polite Assemblies; Round the Bottle and Fire-Side. In chapter VI, the Baron throws a silver hatchet at two bears in hopes of rescuing a bee.11 The hatchet misses both bears and ends up on the moon. To retrieve the hatchet, the Baron grows and promptly climbs a Turkey-bean after it had attached itself to the moon.
He finds his hatchet but then discovers the bean has dried up, so he braids a rope of straw to use for the descent. The Baron is partway down the straw rope when the next catastrophe hits. The details of the Baron's escape differ according to the account you happen to read:

I was still a couple of miles in the clouds when it broke, and with such violence I fell to the ground that I found myself stunned, and in a hole nine fathoms under grass, when I recovered, hardly knowing how to get out again. There was no other way than to go home for a spade and to dig me out by slopes, which I fortunately accomplished, before I had been so much as missed by the steward. Refs. 30 (1785), 33 (1948), and 34 (1952)

11 It is a long story.

REFERENCES

1. Box GEP, Cox DR. An analysis of transformations. J R Stat Soc Ser B 26: 211–243, 1964.
2. Carra J, Candau R, Keslacy S, Giolbas F, Borrani F, Millet GP, Varray A, Ramonatxo M. Addition of inspiratory resistance increases the amplitude of the slow component of O2 uptake kinetics. J Appl Physiol 94: 2448–2455, 2003.
3. Curran-Everett D. Explorations in statistics: standard deviations and standard errors. Adv Physiol Educ 32: 203–208, 2008.
4. Curran-Everett D. Explorations in statistics: confidence intervals. Adv Physiol Educ 33: 87–90, 2009.
5. Curran-Everett D. Explorations in statistics: hypothesis tests and P values. Adv Physiol Educ 33: 81–86, 2009.
6. Curran-Everett D, Benos DJ. Guidelines for reporting statistics in journals published by the American Physiological Society. Physiol Genomics 18: 249–251, 2004.
7. Curran-Everett D, Taylor S, Kafadar K. Fundamental concepts in statistics: elucidation and illustration. J Appl Physiol 85: 775–786, 1998.
8. DiCiccio TJ, Efron B. Bootstrap confidence intervals. Stat Sci 11: 189–228, 1996.
9. Draper NR, Smith H. Applied Regression Analysis (2nd ed.). New York: Wiley, 1981, p. 225–227.
10. Durheim MT, Slentz CA, Bateman LA, Mabe SK, Kraus WE.
Relationships between exercise-induced reductions in thigh intermuscular adipose tissue, changes in lipoprotein particle size, and visceral adiposity. Am J Physiol Endocrinol Metab 295: E407–E412, 2008.
11. Efron B. Bootstrap methods: another look at the jackknife. Ann Stat 7: 1–26, 1979.
12. Efron B. Better bootstrap confidence intervals. J Am Stat Assoc 82: 171–185, 1987.
13. Efron B. Discussion: theoretical comparison of bootstrap confidence intervals. Ann Stat 16: 969–972, 1988.
14. Efron B. The bootstrap and modern statistics. J Am Stat Assoc 95: 1293–1296, 2000.
15. Efron B. Second thoughts on the bootstrap. Stat Sci 18: 135–140, 2003.
16. Efron B, Tibshirani RJ. An Introduction to the Bootstrap. New York: Chapman & Hall, 1993.
17. Esbaugh AJ, Perry SF, Gilmour KM. Hypoxia-inducible carbonic anhydrase IX expression is insufficient to alleviate intracellular metabolic acidosis in the muscle of zebrafish, Danio rerio. Am J Physiol Regul Integr Comp Physiol 296: R150–R160, 2009.
18. Fanini A, Assad JA. Direction selectivity of neurons in the macaque lateral intraparietal area. J Neurophysiol 101: 289–305, 2009.
19. Gillis TE, Marshall CR, Tibbits GF. Functional and evolutionary relationships of troponin C. Physiol Genomics 32: 16–27, 2007.

Summary

As this exploration has demonstrated, the bootstrap gives us an approach we can use to assess whether an inference we make from a normal-theory hypothesis test or confidence interval is justified: if the bootstrap distribution of a statistic such as the sample mean is roughly normal, then the inference is justified. But the bootstrap gives us more than that. We can use a bootstrap confidence interval to make an inference about some experimental result when the statistical theory is uncertain or even unknown (14, 26).
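In practice we rarely code these intervals by hand: the boot package installed at the outset wraps the whole workflow. The sketch below is a hedged illustration, not the supplemental script itself; boot() expects the statistic written as a function of the data and a vector of resampled indices, and your numbers will differ slightly from run to run.

```r
library(boot)    # the extra package installed at the start of this exploration

# Observations from sample 1 (given in the text)
y1 <- c(0.422, 1.103, 1.006, 1.034, 0.285, -0.647, 1.235, 0.912, 1.825)

# boot() requires a statistic of the form f(data, indices)
mean.fun <- function(d, i) mean(d[i])

set.seed(3)
b.out <- boot(data = y1, statistic = mean.fun, R = 10000)

# Normal-theory, percentile, and bias-corrected-and-accelerated 90% intervals
boot.ci(b.out, conf = 0.90, type = c("norm", "perc", "bca"))
```

A single boot.ci() call reports all three kinds of interval discussed in this exploration, so their widths and shifts can be compared directly.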
Although we have explored the bootstrap using the sample mean, we can use the bootstrap to make an inference about other sample statistics such as the standard deviation or correlation coefficient. In the next installment of this series, we will explore power, a concept we mentioned in our exploration of hypothesis tests (5). Power is the probability that we reject the null hypothesis given that the null hypothesis is false. The notion of power is integral to hypothesis testing, confidence intervals, and grant applications.

20. Grenier AL, Abu-ihweij K, Zhang G, Ruppert SM, Boohaker R, Slepkov ER, Pridemore K, Ren JJ, Fliegel L, Khaled AR. Apoptosis-induced alkalinization by the Na+/H+ exchanger isoform 1 is mediated through phosphorylation of amino acids Ser726 and Ser729. Am J Physiol Cell Physiol 295: C883–C896, 2008.
21. Kelley GA, Kelley KS, Tran ZV. Exercise and bone mineral density in men: a meta-analysis. J Appl Physiol 88: 1730–1736, 2000.
22. Kristensen M, Hansen T. Statistical analyses of repeated measures in physiological research: a tutorial. Adv Physiol Educ 28: 2–14, 2004.
23. Lemley KV, Boothroyd DB, Blouch KL, Nelson RG, Jones LI, Olshen RA, Myers BD. Modeling GFR trajectories in diabetic nephropathy. Am J Physiol Renal Physiol 289: F863–F870, 2005.
24. Lott MEJ, Hogeman C, Herr M, Bhagat M, Sinoway LI. Sex differences in limb vasoconstriction responses to increases in transmural pressures. Am J Physiol Heart Circ Physiol 296: H186–H194, 2009.
25. Miller RG. The jackknife–a review. Biometrika 61: 1–15, 1974.
26. Moore DS, McCabe GP, Craig BA. Introduction to the Practice of Statistics (6th ed.). New York: Freeman, 2009.
27. Moses LE. Think and Explain with Statistics. Reading, MA: Addison-Wesley, 1986.
28. Quenouille M. Approximate tests of correlation in time-series. J R Stat Soc Ser B 11: 68–84, 1949.
29. R Development Core Team. R: a Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing, 2008; http://www.R-project.org.
30. Raspe RE. Baron Munchausen's Narrative of his Marvellous Travels and Campaigns in Russia. Oxford: Smith, 1785, p. 45.
31. Raspe RE. Gulliver Revived; or The Singular Travels, Campaigns, Voyages, and Adventures of Baron Munikhouson, commonly called Munchausen. London: Kearsley, 1786, p. 45.
32. Raspe RE. The Surprising Adventures of Baron Munchausen. Rockville, MD: Wildside, 2001, p. 58–59.
33. Raspe RE. Singular Travels, Campaigns and Adventures of Baron Munchausen, edited by Carswell JP, others. London: Cresset, 1948, p. 22.
34. Raspe RE. The Singular Adventures of Baron Munchausen, edited by Carswell JP, others. New York: Heritage, 1952, p. 18.
35. Rozengurt N, Wu SV, Chen MC, Huang C, Sternini C, Rozengurt E. Colocalization of the α-subunit of gustducin with PYY and GLP-1 in L cells of human colon. Am J Physiol Gastrointest Liver Physiol 291: G792–G802, 2006.
36. Tukey J. Bias and confidence in not-quite large samples (abstract). Ann Math Statistics 29: 614, 1958.
37. Yoshimura H, Jones KA, Perkins WJ, Kai T, Warner DO. Calcium sensitization produced by G protein activation in airway smooth muscle. Am J Physiol Lung Cell Mol Physiol 281: L631–L638, 2001.