Minitab Macros for Resampling Methods

By Adam Butler
CEH Monks Wood
September 2001

SUMMARY

This report describes a library of macros for implementing a variety of statistical methods in Minitab using computationally intensive methods of inference (randomization, bootstrapping and Monte Carlo simulation).

CONTENTS

INTRODUCTION
1 Resampling methods in statistics
  What are they ?
  When should I use them ?
  Randomization, bootstrapping and Monte Carlo simulation
  A note on the use of p-values
2 Resampling in Minitab
  Minitab
  Some useful Minitab commands
  The resampling macros
  Other sources of information
  Arguments to the macros
  Subcommands in the macros
  Computing power and number of resamples
  Speed
3 How to use this guide
  Information about the macros
  Worked examples
4 Literature review

REFERENCE MANUAL
1 Significance tests
  Overview
  ONESAMPLERAN
  TWOSAMPLERAN
  TWOTRAN
  TWOTPOOLBOOT
  TWOTUNPOOLBOOT
  CORRELATIONRAN
2 Confidence intervals
  Overview
  An introduction to bootstrap confidence intervals
  MEANCIBOOT
  MEDIANCIBOOT
  STDEVCIBOOT
  ANYCIBOOT
3 Analysis of variance
  Overview
  ONEWAYRAN
  TWOWAYRAN
  TWOWAYREPRAN
  LEVENERAN
4 Regression
  Overview
  Should we resample residuals or observations ?
  REGRESSSIMRAN
  REGRESSOBSRAN
  REGRESSRESRAN
  REGRESSBOOT
5 Time series
  Overview
  ACFRAN
  TRENDRAN
6 Spatial statistics
  Overview
  Which procedure should I use ?
  Using the macros for spatial statistics
  SPATAUTORAN
  MANTELRAN
  MEAD4RAN
  MEAD8RAN
  Creating and interpreting EDF plots
  DISTEDFMC
  NEARESTMC
  LOCREGULARMC

TABLE OF ALTERNATIVE DATASETS
REFERENCES
ACKNOWLEDGEMENTS AND CONTACT DETAILS
APPENDIX : Reference card for the macros

INTRODUCTION

1 RESAMPLING METHODS IN STATISTICS

What are they ?
Resampling methods are a class of statistical techniques for drawing inferences based on the variability present within a dataset. Resampling methods (sometimes known as computationally intensive methods) include:

- Bootstrapping
- Randomization tests (also known as permutation tests)
- Monte Carlo tests and related procedures

In general, resampling methods are difficult to justify in theory, but relatively easy to apply in practice. The common concept underlying all resampling methods is that we can assess variability by drawing a large number of samples, each having the same size as the original dataset, from the observed data (this is the process of resampling); we then compare the properties of the observed data to the properties of the resampled datasets.

When should I use them ?

Resampling methods are useful for obtaining assessments of variability - this means that they are principally used to calculate confidence intervals and p-values. Resampling methods can be used with many different statistical methods - including comparison of two samples, ANOVA, regression, spatial statistics, time series and multivariate analysis - and can potentially be applied to any area of application; Manly (1997) discusses how resampling methods have been applied in a number of different areas of biology. Resampling methods have become increasingly popular in recent years, partly because of increasing computer power.

Resampling methods are usually used instead of - or alongside - standard techniques for drawing inferences from data. Standard techniques usually rely upon statistical theory (especially asymptotic arguments) and assumptions about the distribution of the data (for example, that the data are normally distributed). Resampling methods do not make these assumptions, and so should be more reliable in those situations in which the standard assumptions are false.
If the assumptions underlying standard theory are valid, resampling and standard techniques should give very similar results. In fact, resampling methods often give similar results to standard theory even if the assumptions underlying standard theory are not valid. Resampling methods also rely upon their own (fairly complicated) assumptions. It is felt that these assumptions will often be valid, or approximately valid, but it is worth noting that there are situations in which the application of resampling methods may go badly wrong. Resampling methods place much emphasis on the observed dataset, and so may be very susceptible to any errors or problems with the data that have been collected. It is therefore important to check data carefully, and to use graphical techniques to look for outlying points.

Possibly the most interesting feature of resampling methods is their generality - they may be used to tackle a wide variety of practical statistical problems, including problems for which standard theory does not yet exist, in a fairly straightforward way.

Randomization, bootstrapping and Monte Carlo simulation

The macros in this library sometimes use randomization tests, and sometimes use bootstrapping. The differences between the two techniques are rather subtle. The key differences are that:

- In practice, if we are in a situation in which either method can be used, then the methods work in an almost identical fashion. Usually the only difference is that randomization tests involve resampling without replacement (i.e. we simply re-order the original data), whereas bootstrapping involves resampling with replacement (i.e. a value from the original data may occur more than once in a resampled dataset).
- Bootstrap methods are substantially more general than randomization methods, and may often be used in situations in which randomization methods are not available.
- The assumptions which justify the use of the two techniques are different.
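The with-replacement versus without-replacement distinction is the practical heart of the matter, and can be sketched in a few lines of code. (Python is used here purely for illustration; the macros themselves rely on Minitab's sample command, and the data values below are invented.)

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

data = [43, 67, 64, 51, 53, 26, 36, 48]

# Randomization: resample WITHOUT replacement, i.e. a re-ordering
# (permutation) of the original data. Every value appears exactly once.
perm = random.sample(data, len(data))

# Bootstrap: resample WITH replacement. A value may appear more than
# once in the resampled dataset, and some values may not appear at all.
boot = [random.choice(data) for _ in data]

# perm always contains exactly the original values; boot has the same
# size as the original data but is generally a different multiset.
```

A permutation preserves the multiset of observed values, which is why randomization tests condition on the observed data; a bootstrap resample treats the observed data as an estimate of the whole population, which is why it is the more general technique.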
A few of the macros for spatial statistics use Monte Carlo methods; these are a more general class of technique than either randomization or bootstrap methods (in fact, both randomization and bootstrap methods can be viewed as special cases of Monte Carlo methods). Whilst bootstrap and randomization methods involve simulating only from the observed data, Monte Carlo methods involve taking simulations using a statistical model. All of the Monte Carlo methods in this study involve simulating datapoints at random from within a fixed rectangular region, in order to examine the hypothesis of Complete Spatial Randomness (CSR).

A note on the use of p-values

The bulk of the macros in the library deal with significance tests. In general, these involve testing a null hypothesis against one or more possible alternative hypotheses. Performing a significance test involves calculating the location of the observed test-statistic value t within the probability distribution of the test-statistic. Assume that the true distribution is T. This probability distribution can often be approximated either using statistical theory, or, as is the case in the macros, by resampling.

One-sided randomization p-values

Assume for the time being that we are only interested in the alternative hypothesis which implies a large value of the test-statistic. Then the true one-sided p-value of interest is

    p = Pr(T >= t)

Standard procedure

If the test-statistic is known by statistical theory to follow a particular distribution, Ta, then the standard one-sided p-value of interest is given by

    ps = Pr(Ta >= t)

For a continuous distribution, this value is obtained by integrating the probability density from t to infinity, whilst for a discrete distribution the probability mass function is summed from t to infinity (inclusive).

Randomization

If the resampling distribution is given by Tr, then the resampling one-sided p-value of interest is given by

    pr = Pr(Tr >= t)
This value is therefore the proportion of all test-statistics (the set of resampled test-statistics, plus the observed test-statistic, since under the null hypothesis this is also a realisation from T) greater than or equal to t.

Two-sided randomization p-values

Now assume that either alternative hypothesis may be of interest. One-sided randomization p-values for the opposite alternative hypothesis, corresponding to small values of the test-statistic, are analogous to those defined above. Calculating the two-sided p-value, i.e. the probability of the test-statistic being extreme in either direction, is more complicated, because there are now two possible approaches:

1. Let the two-sided p-value be the probability of being as far from the mean of the distribution as the observed test-statistic, in either direction.
2. Let the two-sided p-value be double the smaller of the one-sided p-values.

Most standard theory either uses distributions for which only one-sided p-values are relevant (e.g. F and chi-squared distributions), or uses distributions (e.g. normal or t distributions) which are symmetric. For a symmetric distribution, either method of computing the two-sided p-value will give the same answer, because both of the one-sided p-values will be the same. When we obtain distributions by resampling, however, there is no reason to assume that they will be symmetric. For very non-symmetric distributions, the first approach to computing two-sided p-values may give substantially misleading results. The disadvantage of using the second approach in the resampling context is that we use the data from only one tail of the distribution, so that we need a greater number of resamples to give the same accuracy in the calculation of p-values. Since resampling methods are most useful for those situations in which standard approximations are not valid - i.e.
for situations in which test-statistics have highly skewed distributions - we use the second method of computing resampling two-sided p-values (where these are required) throughout the library.

2 RESAMPLING IN MINITAB

MINITAB

MINITAB is a general purpose package for data manipulation and statistical analysis. This guide outlines a library of MINITAB macros which have been written to perform a variety of commonly-used statistical procedures using randomization and bootstrapping methods, rather than the more traditional (and less computationally-intensive) methods which involve approximations and distributional theory. In order to run the macros, you must work in the session window. To open the session window, click on Window on the menu bar, move down the list, and select Session. Then click on the Editor menu, move down the list and select Enable Commands. The various macros may be invoked by typing their name at the MTB> prompt; all other MINITAB commands can also be invoked from this prompt.

Some useful MINITAB commands

The most useful MINITAB commands in the context of randomization and bootstrapping are:

statistics
Can be used to display or store a wide variety of descriptive statistics for a given column. The subcommands specify the various descriptive statistics to be used.
Example: statistics c1; mean c2. This takes the mean of column c1 and stores it in the first element of column c2.

sample
Can be used to draw a sample, with or without replacement, from a column.
Example: sample 10 c1 c2. This takes a sample of size 10 from column c1, without replacement, and stores it in c2. Randomization tests are based upon sampling without replacement.
Example: sample 7 c1 c2; replace. This takes a sample of size 7 from column c1, with replacement, and stores it in c2. Bootstrapping is based upon sampling with replacement.

random
Can be used to simulate random datasets from standard probability distributions.
Example: random 50 c1; normal 0 1.
This simulates 50 values from a standard normal [i.e. a Normal(0,1)] distribution, and stores the simulated values in c1.
Example: random 10 c2; poisson 6. This simulates 10 values from a Poisson distribution with parameter 6, and stores the values in c2.

The resampling macros

The resampling macros are designed, as far as possible, to mimic standard MINITAB functions for the statistical methods in question. In some cases, there are both randomization and bootstrap versions of standard MINITAB commands; the justification for using randomization and bootstrap techniques is substantially different, but they will often (though not always) give similar answers. Much of the output from the macros will be identical to output from the standard MINITAB functions, since it is not dependent upon the randomization or bootstrapping process (for example, correlation coefficients, regression parameter estimates, ANOVA tables and sample statistics will not be affected by using randomization or bootstrap techniques in place of standard techniques). However, assessments of the significance or variability of an estimate (such as p-values and confidence intervals) will be altered by the use of randomization and bootstrap techniques.

It is important to realise that MINITAB functions for most standard techniques will yield the same answers again and again, regardless of how many times they are run; in contrast, p-values and confidence intervals produced using the resampling macros will be different every time the macro is run. This is an inherent feature of randomization and bootstrap techniques; so long as the number of randomizations or bootstrap samples is large (how large depends upon the particular statistical method being used), these differences should not be particularly important. The Minitab macros are designed for release 13, but most will probably function with earlier releases.
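This run-to-run variability is easy to demonstrate outside Minitab. The sketch below (Python, purely for illustration; the data and function name are invented for the example) runs the same one-sample sign-randomization test twice with different random states, giving two slightly different estimates of the same p-value:

```python
import random

# Invented example data; we test the hypothesised mean 4.0.
data = [5.1, 3.8, 6.2, 4.9, 5.5, 4.4, 6.0, 5.3]
modified = [x - 4.0 for x in data]   # deduct the hypothesised mean
t_obs = sum(modified)                # observed test-statistic

def randomization_pvalue(nran, seed):
    """One-sided p-value from nran random re-allocations of signs."""
    rng = random.Random(seed)
    count = 1  # the observed statistic is itself a realisation under H0
    for _ in range(nran):
        t = sum(v if rng.random() < 0.5 else -v for v in modified)
        if t >= t_obs:
            count += 1
    return count / (nran + 1)

# Two runs differ slightly, but with 999 randomizations both estimates
# of the p-value are close to the same small value.
p1 = randomization_pvalue(999, seed=1)
p2 = randomization_pvalue(999, seed=2)
```

The two estimates disagree only at a scale governed by the number of resamples, which is why the differences cease to matter once that number is reasonably large.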
The macros generally have a similar calling statement to the corresponding standard Minitab command, if this exists. Additionally, macro names end in a suffix, depending upon the type of resampling methodology used:

ran : for randomization procedures
boot : for bootstrap procedures
mc : for Monte Carlo tests and related procedures

Although these three classes of methods are all fairly similar to implement in practice, the theoretical justification for using them is different, so we distinguish clearly between the different forms of resampling.

Other sources of information

Along with the macros library, we provide:

- Individual descriptions of each macro (taken from the sections of this guide)
- Sample datasets, as Minitab files
- Sample datasets, as .DAT files
- .TXT files, containing output from running the macros over the sample datasets
- Minitab files, containing the final worksheet obtained after running the macros over the sample datasets (although these are missing for some macros)

Arguments to the macros

Most of the macros require one or more columns of numeric data as input. For some macros, the order in which the columns are entered is crucially important. In regression, the response must be the first column, followed by one or more predictors. In two-way analysis of variance, the group must be entered before the block. For some of the spatial macros, the input is in the form of matrices. Consult the Minitab documentation or help menu for information on entering and reading matrices. Missing values are allowed for all macros, except those which take matrices as inputs. Missing values are dealt with in an obvious way; e.g. observations with one or more missing values are usually ignored.

Subcommands in the Minitab macros

Subcommands are used for the following purposes in the macros:

Specifying values

These subcommands allow the user to change basic quantities involved in the operation of the macro.
For subcommands of this type, the argument for the subcommand is simply the value of the quantity in question (a constant). Specific uses are:

- To specify the number of resamples to be used. These are specified using the subcommands NRAN for randomization tests, NBOOT for bootstrap procedures and NSIM for Monte Carlo procedures. The required value (a positive integer constant) is entered.
- To specify significance levels for confidence intervals, using the subcommand SIGLEV. The significance level (expressed as a percentage) is entered, e.g. 95.
- To specify the number of test-statistics to be considered. NLAG in the macro ACFRAN specifies the maximum lag for which serial correlation coefficients should be computed, whilst NSTATS in the macro NEARESTMC specifies the largest value of k for which kth nearest neighbour distances should be computed.
- To specify the graphical resolution. NPOINTS in DISTEDFMC specifies the number of points to be used for evaluating CDFs and EDFs, and so controls the resolution of the resulting graph.

Modifying procedures

These subcommands allow the user to modify the technical details of the procedure used within the macro. In these cases, a constant should be entered; a key to the values to be used is given below.

USEMEAN in LEVENERAN specifies whether the median or mean should be used to create the modified dataset.
1 = use median [default - this will occur if the subcommand is not used]
0 = use mean
any other numeric value except 0 or 1 = use mean

RESIDUALS in REGRESSRESRAN and REGRESSBOOT specifies the kind of residuals which should be used in the randomization procedure.
1 = raw residuals
2 = modified residuals [default]
3 = deletion residuals
any other numeric value except 1, 2 or 3 will lead to an error message

TYPE in MISSING specifies how missing values should be treated.
1 = delete the missing value only in the column concerned.
2 = delete the entire row if it contains any missing values.
any other numeric value except 1 or 2 will lead to an error message.

Storing output

All of the remaining subcommands in the macros are concerned with allowing the user to store lengthy output to file. All such subcommands operate in the same way: if the user does not wish to store the output, then the subcommand need not be used; if the user wishes to store the output, then the appropriate subcommand should be used, and the argument to the subcommand should be the column, columns, constant or matrix (as required) in which the output is to be stored.

*** IMPORTANT NOTES ***
Take care not to over-write the original data when storing lengthy output.
If the number of resamples is large, storing resampled test-statistics may generate worksheets which take up a large amount of memory.

Computing power and number of resamples

In order to gain a clear indication of the resampling distribution of the quantity of interest, it is necessary to use a reasonably large number of resamples. We make use of the following defaults:

Number of randomizations : 999
Number of bootstrap resamples : 999 for significance tests; 2000 for calculating confidence intervals
Number of simulations : 999 (except DISTEDFMC, where we use 99)

Different authors make different recommendations about the number of bootstrap resamples required. With general purpose macros such as those in this library, there is a trade-off between running time (which usually increases roughly as a linear function of the number of resamples) and accuracy (which will increase with the number of resamples). The defaults seem to us to provide a reasonable compromise. If a high degree of accuracy is required in the estimation of p-values or confidence intervals then the number of resamples must be made very large. We would suggest that the macros in this library are probably not appropriate for this, since some of the procedures used are relatively inefficient (i.e. will take a long time to run).
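The trade-off between the number of resamples and accuracy can be quantified: the estimated p-value is essentially a binomial proportion, so its Monte Carlo standard error is roughly sqrt(p(1-p)/N) for N resamples. A small sketch (Python, for illustration only; the function name is ours, not part of the macro library):

```python
from math import sqrt

def pvalue_se(p, nresamples):
    """Approximate Monte Carlo standard error of a resampling p-value,
    treating the number of 'extreme' resampled statistics as Binomial."""
    return sqrt(p * (1 - p) / nresamples)

# With the default 999 resamples, a true p-value near 0.05 is estimated
# with a standard error of about 0.007; adequate for most purposes, but
# far too coarse if p-values are needed to three decimal places.
se_default = pvalue_se(0.05, 999)     # about 0.0069
se_large = pvalue_se(0.05, 100000)    # about 0.0007
```

Halving the standard error requires roughly four times as many resamples, which is why very accurate p-values demand resample counts far beyond the defaults used here.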
Speed

The speed at which the macros run will depend upon:

- The size of the dataset
- The number of randomizations / bootstrap resamples / simulations used
- The capabilities of the computer

so it is not possible to state clearly how long the macros will take to run. The macros can broadly be divided into three categories:

FAST : These macros should run within a few minutes, or possibly much less, with the default number of randomizations or bootstrap samples.
Significance tests : all macros
Confidence intervals : all macros
Analysis of variance : ONEWAYRAN, LEVENERAN
Regression : REGRESSSIMRAN
Time series : all macros
Spatial statistics : MEAD4RAN, MEAD8RAN

MODERATE : These macros will take at least a few minutes to run with the default number of randomizations or bootstrap samples.
Analysis of variance : TWOWAYRAN, TWOWAYREPRAN
Regression : REGRESSOBSRAN, REGRESSRESRAN, REGRESSBOOT
Spatial statistics : MANTELRAN

SLOW : These macros may take a long time to run - up to a few hours.
Spatial statistics : SPATAUTORAN, DISTEDFMC, NEARESTMC, LOCREGULARMC

The advice would be that the 'FAST' macros can be used like any other Minitab command, with only a short wait for the output, whereas the 'MODERATE' and 'SLOW' macros should be left to run in the background whilst the user works on another task.

3 HOW TO USE THIS GUIDE

Information about the macros

For each of the macros in the resampling methods library, the following information is provided:

An outline of the purpose of the macro.

RUNNING THE MACRO

Macro calling statement : Gives the full calling statement for the macro. Note that macros are invoked using a % sign, and that the appropriate path to the resampling methods library must be placed in front of the macro name. All possible subcommands are listed. For the main command and for each subcommand, the required form of data is listed.
For example: c1-c3 means that three columns are required; c1 means that one column is required; k1 k2 means that two constants are required; m1 means that one matrix is required; c1-cN means that an unspecified number of columns (N) are required. For subcommands, default values are also given, in brackets, if they are available. If the user wishes to use the default values, then the subcommands need not be included in the calling statement. For subcommands which involve storing output to a column of the worksheet, default values are not relevant; if the user does not wish to store the output, then the appropriate subcommand should simply not be used.

Input : A detailed description of what kind of data should be used in the compulsory arguments to the macro, and in what order the data should be entered. Mention is made of whether missing values are allowed, and how they will be dealt with.

Subcommands : A description of the purpose and operation of each subcommand, and of the type of input required.

Output : A straightforward description of the output produced by the macro.

Speed of macro : Some indication of the speed at which the macro runs.

TECHNICAL DETAILS

Notes : Any general comments on the operation of the macro, including possible bugs.

Hypotheses : For hypothesis tests, the null and (if relevant) alternative hypotheses are stated explicitly.

Test-statistic : For hypothesis tests, the test-statistic is defined and justified.

Resampling procedure : The procedure for randomizing, bootstrapping or simulating is stated and briefly justified. If the algorithm is complicated (e.g. for multiple regression), a reference is given.

ALTERNATIVE PROCEDURES

Other macros : Outlines any alternative macros in the library that may be used to perform the same statistical analyses using different methods (for example, different resampling algorithms).
Standard procedure : Gives an outline of the calling procedure and general purpose of the standard MINITAB function which corresponds most closely to a non-resampling version of the macro. In many cases, the macros in the resampling methods library are direct computationally-intensive analogues of in-built MINITAB functions for standard statistical procedures. In the case of analysis of variance, the standard procedures are incorporated within the macros, so that the macros are simply extended forms of the standard functions. In the case of more sophisticated techniques in time series and spatial statistics, standard procedures in Minitab are generally not available. In some of these cases, non-randomization methods are actually included as part of the macro output; a description of these methods is given.

REFERENCES

References are provided; if the user is in any doubt as to the appropriateness of the methods used in the macros for their data, then these should be consulted. Many references are to the book by Manly (1997), which provides a good general introduction to randomization and bootstrap procedures.

Worked examples

Datasets

Example datasets are provided alongside the library. A brief description of the dataset pertaining to each macro (or set of macros) is given, together with some or all of the data. Most of these datasets are taken from Manly (1997), and a detailed description of the analysis of these datasets can often be found in his book. The example datasets are all taken from biological studies; note that sample sizes are generally small. Example MINITAB input and output for the analysis of each example dataset is shown. For each macro, we provide:

DATA

Name of dataset : The datasets are named. The names correspond to the filenames of the corresponding Minitab worksheets.

Description : We give a brief description of the data, how it was collected, and for what purpose.
Source : We give both the source from which we took the data, and, if different, the original source of the data.

Data : We list the full dataset. Note that the listing is often not in a useful form for pasting into Minitab, so it is more sensible to use the .DAT or Minitab files to input the data to Minitab.

Worksheet : We describe the columns, constants and matrices in the Minitab worksheet.

ANALYSIS

Aims of analysis : We briefly describe the aims of the analysis described in the "output". These aims are generally fairly limited, and a full statistical analysis of the data would usually have more substantial objectives.

Minitab Output : Minitab input and output are listed in full.

Note : The worked examples are for demonstrative purposes only. Details of the procedures (e.g. the values chosen for subcommands) have been chosen to give the best demonstration of the capabilities of the macros, rather than for sound statistical reasons.

Modified worksheet : A description of any additional columns, constants or matrices created in the worksheet by running the macro.

Discussion : A brief discussion of the results.

4 LITERATURE REVIEW

General

The range and approach of this macro library largely mirrors that of Manly (1997). His book provides a clear, non-technical introduction to resampling methods, with an emphasis upon biological and ecological applications. We cover a substantial proportion of the material in chapters 1 to 11 of Manly (1997), and the arrangement of our material largely mirrors that of his book:

Section 1 : Significance tests > Based upon Chapter 6 of Manly (1997)
Section 2 : Confidence intervals > Based upon Chapter 3 of Manly (1997)
Section 3 : Analysis of variance > Based upon Chapter 7 of Manly (1997)
Section 4 : Regression > Based upon Chapter 8 of Manly (1997)
Section 5 : Time series > Based upon Chapter 11 of Manly (1997)

Chapters 1 and 5 of Manly (1997) provide a general introduction to resampling methods in biology.
The material in Section 6 of the macro library, Spatial Statistics, is partly based upon Chapters 9 and 10 of Manly (1997), but also includes material from Chapter 4 (Monte Carlo tests), and material taken from other sources (see below).

Confidence intervals

The emphasis of Manly (1997) is upon randomization and hypothesis testing. His introduction to bootstrap confidence intervals is relatively brief, and without worked examples, so it is probably better to consult Efron and Tibshirani (1993), who provide a clear, fairly non-technical introduction to bootstrap confidence intervals. Chapter 12 discusses the bootstrap-t method, Chapter 13 the Efron percentile and Hall percentile methods, and Chapter 14 the BC and BCa percentile methods.

Resampling methodology

The statistical literature concerning computer-intensive inference is extremely large, and there are a large number of technical issues involved. Efron and Tibshirani (1993) discuss general principles and issues, whilst a good (but highly mathematical) introduction to the statistical theory of resampling is provided by Davison and Hinkley (1997), who also provide extensive references.

Regression

Draper and Smith (1998) give a wide-ranging overview of regression methods, and include a chapter on the application of resampling methods to regression. The algorithms used for multiple regression with randomization or bootstrapping of residuals in the macros are those proposed by Ter Braak (1992).

Spatial statistics

Diggle (1983) provides an introduction to the search for spatial pattern, and discusses how EDF plots can be used for this purpose. Brown and Rothery (1978) look at the same topic, and propose test-statistics that are sensitive to different kinds of regularity. Cliff and Ord (1973) discuss methods for estimating spatial autocorrelation.

Minitab

For further details about Minitab, see Minitab Inc. (1999).
REFERENCE MANUAL

1 BASIC STATISTICS: SIGNIFICANCE TESTS

OVERVIEW

One sample tests
ONESAMPLERAN tests whether a population mean is equal to a hypothesised value.

Two sample tests
TWOSAMPLERAN tests for the equality of two population means using randomization.
TWOTRAN also tests for the equality of two population means using randomization.
TWOTUNPOOLBOOT tests for the equality of two population means using bootstrapping.
TWOTPOOLBOOT also tests for the equality of two population means using bootstrapping.

Correlation
CORRELATIONRAN tests whether a correlation coefficient between two variables is significant.

Comments
Possibly the most widely used test in statistics is the two-sample t-test, in which we test the equality of two means. We include four different computer-intensive macros for this procedure.

ONESAMPLERAN

This macro is designed to test whether or not the mean of a single column of data is equal to a hypothesised value specified by the user.

RUNNING THE MACRO

Calling statement
onesampleran c1 k1 ;
nran k1 (999) ;
sums c1.

Input
C1 A single column, containing only numerical values. Missing values are allowed.
K1 A single constant, containing the hypothesised mean value.

Subcommands
nran : Number of randomizations used.
sums : Specify a column in which to store the sample sums for the randomized samples.

Output
Basic statistics: Sample size, sample mean, sum of sample values, and sample standard deviation. Hypothesised mean value.
Resampling details: Number of randomizations, one- and two-sided randomization p-values. The two-sided randomization p-value is double the smaller of the one-sided randomization p-values.

Speed of macro : FAST

TECHNICAL DETAILS

Null hypothesis : The population mean is equal to the hypothesised mean value.

Test-statistic : We create a modified dataset by deducting the hypothesised mean from each data value. The appropriate test-statistic is the sum of these modified values.
Randomization procedure : We randomize the allocation of signs to the absolute values within the modified dataset, since under the null hypothesis there should be an equal probability that any data point will have been allocated a negative or positive value once the hypothesised mean is deducted from it.

ALTERNATIVE PROCEDURES

Standard procedures
onet c1 ;
test k1.
Performs a one-sample t-test for the mean of the data in c1 being equal to the hypothesised mean value k1, in the situation in which the sample variance is unknown.

onet c1 ;
sigma k1 ;
test k2.
Performs a one-sample normal test for the mean of the data in c1 being equal to the hypothesised mean value k2, in the situation in which the standard deviation is known to be equal to k1.

REFERENCES

MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 6).

WORKED EXAMPLE FOR ONESAMPLERAN

Name of dataset
DARWIN

Description
The data refer to the heights of 15 self-fertilised offspring from the plant Zea mays. The data were originally collected by Charles Darwin, were analysed by R.A. Fisher in the 1930s (see Fisher, 1935), and are analysed by Manly (1997) using a one-sample randomization test.

Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source
FISHER, R.A. (1935) The design of experiments, Oliver & Boyd, Edinburgh.

Data
Number of observations = 15
Number of variables = 1
43 67 64 64 51 53 53 26 36 48 34 48 6 28 48

Worksheet
C1 Data

Aims of analysis
To test whether the population mean is equal to a hypothesised value of 56.

Minitab output : standard procedure

MTB > Retrieve "N:\resampling\Examples\Darwin.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Darwin.MTW
# Worksheet was saved on 27/07/01 14:03:05
Results for: Darwin.MTW
MTB > onet c1 ;
SUBC> test 56.
One-Sample T: Self
Test of mu = 56 vs mu not = 56
Variable    N    Mean   StDev   SE Mean
Self       15   44.60   16.41      4.24
Variable   95.0% CI            T       P
Self       (35.51, 53.69)   -2.69   0.018

Minitab output : randomization procedure
MTB > % N:\resampling\library\onesampleran c1 56 ;
SUBC> nran 499 ;
SUBC> sums c3.
Executing from file: N:\resampling\library\onesampleran.MAC
One-sample randomization test
Data Display (WRITE)
Number of observations            15
Observed mean value               44.60
Hypothesised mean value           56.00
Observed sum of values            669.0
Observed standard deviation       16.41
Number of randomization samples   499
P-value for one-sided test with alternative: true mean < hypothesised mean   0.0020
P-value for one-sided test with alternative: true mean > hypothesised mean   1.0000
P-value for two-sided test                                                   0.0040

Modified worksheet
C3 A column containing 499 sums of values, one for each randomized dataset

Discussion
The standard (two-sided) p-value is 0.018. Manly obtains a randomization p-value of 0.016, by enumeration of the full randomization distribution. Our two-sided p-value of 0.004 is substantially smaller than either of these values, but this may just be a consequence of the relatively small number of randomizations used. The conclusion is the same in all cases - there is strong evidence that the population mean is not equal to the hypothesised mean. Looking at the one-sided p-values (and the sample mean) we see that the evidence favours the alternative hypothesis that the population mean is lower than the hypothesised mean.

TWOSAMPLERAN

This macro is designed to test, using randomization, whether or not the means for two independent samples are equal.

RUNNING THE MACRO

Calling statement
twosampleran c1 c2 ;
nran k1 (999) ;
differences c1 ;
tstatistics c1.

Input
C1 Data for first group
C2 Data for second group
C1 and C2 must both be columns containing only numerical data, but they need not be of the same length. Missing values are allowed.
Subcommands
nran Number of randomizations used.
differences Specify a column in which to store differences between simulated group means.
tstatistics Specify a column in which to store t-statistics for differences between simulated group means.

Output
Basic statistics: sample sizes, sample means and sample standard deviations for the two groups.
Observed difference in means, with associated t-statistic.
Resampling: number of randomizations, one and two-sided randomization p-values.
The two-sided randomization p-value is double the smaller of the one-sided randomization p-values.

Speed of macro : FAST

ALTERNATIVE PROCEDURES

Other macros
This macro uses randomization, but two bootstrapping versions of the test are available (depending upon whether variances are pooled) :
TWOTPOOLBOOT Bootstrap test with pooling of variances
TWOTUNPOOLBOOT Bootstrap test without pooling of variances
This macro is suitable when data for the two groups are contained in separate columns. If the data are contained in a single column, with a second column denoting group number, then TWOTRAN should be used.

Standard procedures
twosample c1 c2.
This performs a two-sample t-test for the mean of the data in c1 being equal to the mean of the data in c2. Variances are not pooled, so this is appropriate if the variances for the two groups cannot be assumed to be equal.

twosample c1 c2 ;
pooled.
This performs a two-sample t-test for the mean of the data in c1 being equal to the mean of the data in c2. Variances are pooled, so this is only appropriate if the variances for the two groups can be assumed to be equal.

TECHNICAL DETAILS

Null hypothesis : We test the null hypothesis that the mean for the first group is equal to the mean for the second group.

Randomization procedure : We fix the data value for each individual, and fix the sizes of the groups. We then randomize the allocation of individuals to groups, since under the null hypothesis this allocation will be random.
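This group-label randomization can be sketched in Python (our illustration, not the macro; as in the macro, the difference between the two sample group means is used as the test-statistic, and p-values follow the (count + 1)/(nran + 1) convention):

```python
import random

def twosample_ran(group1, group2, nran=999, seed=1):
    # Pool the data values, fix the group sizes, and repeatedly shuffle
    # which values are allocated to which group.
    rng = random.Random(seed)
    n1 = len(group1)
    pooled = list(group1) + list(group2)

    def mean(xs):
        return sum(xs) / len(xs)

    observed = mean(group1) - mean(group2)
    count_low = count_high = 0
    for _ in range(nran):
        rng.shuffle(pooled)
        diff = mean(pooled[:n1]) - mean(pooled[n1:])
        count_low += diff <= observed
        count_high += diff >= observed
    p_lower = (count_low + 1) / (nran + 1)   # H1: mean(group 1) < mean(group 2)
    p_upper = (count_high + 1) / (nran + 1)  # H1: mean(group 1) > mean(group 2)
    return p_lower, p_upper, min(1.0, 2 * min(p_lower, p_upper))
```

Applied to the MANDIBLES data used in the worked examples, this reproduces the pattern of one clearly significant one-sided p-value.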
Test-statistic : We use the difference between the two sample group means as the test-statistic.

REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 6).

WORKED EXAMPLE FOR TWOSAMPLERAN

Name of dataset
LIZARDS

Description
The data consist of the quantity of dry biomass of Coleoptera in the stomachs of two size morphs of the Eastern Horned Lizard, Phrynosoma douglassi brevirostre. The data were collected by Powell and Russell, and are analysed by Manly (1997) using a two-sample randomization test. Data are available for 24 lizards in the first size morph (adult males and yearling females) and 21 lizards in the second size morph (adult females).

Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original sources
POWELL, G.L. & RUSSELL, A.P. (1984), The diet of the eastern short-horned lizard (Phrynosoma douglassi brevirostre) in Alberta and its relationship to sexual size dimorphism, Canadian Journal of Zoology, 62, pp. 428-440.
POWELL, G.L. & RUSSELL, A.P. (1985), Growth and sexual size dimorphisms in Alberta populations of the eastern short-horned lizard, Phrynosoma douglassi brevirostre, Canadian Journal of Zoology, 63, pp. 139-154.

Data
Number of observations = 45
Number of variables = 2
For each size morph group, data are given.
Group 1 (Adult males and yearling females)
256 209 0 0 0 44 49 117 6 0 0 75 34 13 90 0 32 0 205 332 0 31 0 0
Group 2 (Adult females)
0 89 0 0 179 19 142 100 0 163 286 0 432 3 843 0 158 443 311 232 179

Worksheet
C1 Data for group 1
C2 Data for group 2

Aims of analysis
To investigate whether stomach biomass is different for lizards in size morph 1 and lizards in size morph 2.

Minitab output : Standard procedure, without pooling
MTB > Retrieve "N:\resampling\Examples\Lizards.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Lizards.MTW
# Worksheet was saved on 03/07/01 16:32:34
Results for: Lizards.MTW
MTB > twosample c1 c2
Two-Sample T-Test and CI: Group1, Group2
Two-sample T for Group1 vs Group2
          N   Mean   StDev   SE Mean
Group1   24   62.2    94.1        19
Group2   21    170     209        46
Difference = mu Group1 - mu Group2
Estimate for difference: -108.2
95% CI for difference: (-209.6, -6.9)
T-Test of difference = 0 (vs not =): T-Value = -2.19  P-Value = 0.037  DF = 27

Minitab output : Standard procedure, with pooling
MTB > twosample c1 c2 ;
SUBC> pooled.
Two-Sample T-Test and CI: Group1, Group2
Two-sample T for Group1 vs Group2
          N   Mean   StDev   SE Mean
Group1   24   62.2    94.1        19
Group2   21    170     209        46
Difference = mu Group1 - mu Group2
Estimate for difference: -108.2
95% CI for difference: (-203.4, -13.0)
T-Test of difference = 0 (vs not =): T-Value = -2.29  P-Value = 0.027  DF = 43
Both use Pooled StDev = 158

Minitab output : Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Lizards.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Lizards.MTW
# Worksheet was saved on 07/03/01 04:32:34 PM
Results for: Lizards.MTW
MTB > % N:\resampling\library\twosampleran c1 c2 ;
SUBC> nran 999 ;
SUBC> differences c4 ;
SUBC> tstatistics c5.
Executing from file: N:\resampling\library\twosampleran.MAC
Two-sample randomization test
Data Display (WRITE)
Number of observations in group 1   24
Number of observations in group 2   21
Data mean for group 1               62.21
Data mean for group 2               170.4
Standard deviation for group 1      94.11
Standard deviation for group 2      208.6
Observed difference in means        -108.2
Observed t-statistic                -2.19
Number of randomization samples     999
P-value for one-sided test with alternative: mean(group 1) > mean(group 2)   0.9880
P-value for one-sided test with alternative: mean(group 1) < mean(group 2)   0.0130
P-value for two-sided test                                                   0.0260

Modified worksheet
C4 A column containing 999 differences between sample means, one for each randomized dataset
C5 A column containing 999 t-statistics for differences, one for each randomized dataset

Discussion
Standard (two-sided) p-values are 0.037 (if we do not pool variances) or 0.027 (if we pool variances), whilst our randomization p-value is 0.026. All of these values are similar, and provide reasonable evidence for a difference in stomach biomass between the two size morphs. Looking at the data (and the one-sided p-values) it is clear that stomach biomass is higher for lizards in size morph 2.

TWOTRAN

This macro is designed to test, using randomization, whether or not the means for two independent samples are equal.

RUNNING THE MACRO

Calling statement
twotran c1 c2 ;
nran k1 (999) ;
differences c1 ;
tstatistics c1.

Input
C1 Data for both groups
C2 Group indicator
C1 and C2 must both be columns containing only numerical data, and they must be of the same length. The column c2 should contain group markers; these should be any two distinct numerical values (for example, 1 and 2).

Subcommands
nran Number of randomizations used.
differences Specify a column in which to store differences between simulated group means.
tstatistics Specify a column in which to store t-statistics for differences between simulated group means.
Output
Basic summary statistics (numbers of observations, group means and standard deviations) are given, along with the observed t-statistic and difference in sample means. Randomization p-values are given for both one-sided hypotheses, and for the two-sided hypothesis.

Speed of macro : FAST

ALTERNATIVE PROCEDURES

Other macros
This macro uses randomization, but two bootstrapping versions of the test are available (depending upon whether variances are pooled) :
TWOTPOOLBOOT Bootstrap test with pooling of variances
TWOTUNPOOLBOOT Bootstrap test without pooling of variances
This macro is suitable when data for the two groups are contained in the same column, with a separate column denoting which group each observation corresponds to. If data for the two groups are contained in separate columns, TWOSAMPLERAN should be used.

Standard procedures
twot c1 c2.
This performs a two-sample t-test that the mean of the data for the first group is equal to the mean of the data for the second group. The data are provided in c1, group labels are provided in c2. Variances are not pooled, so this is appropriate if the variances for the two groups cannot be assumed to be equal.

twot c1 c2 ;
pooled.
This performs a two-sample t-test that the mean of the data for the first group is equal to the mean of the data for the second group. The data are provided in c1, group labels are provided in c2. Variances are pooled, so this is only appropriate if the variances for the two groups can be assumed to be equal.

TECHNICAL DETAILS

Null hypothesis : We test the null hypothesis that the mean for the first group is equal to the mean for the second group.

Randomization procedure : We fix the data value for each individual, and fix the sizes of the groups. We then randomize the allocation of individuals to groups, since under the null hypothesis this allocation will be random.

Test-statistic : We use the difference between the two sample group means as the test-statistic.
REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 6).

WORKED EXAMPLE FOR TWOTRAN

Name of dataset
MANDIBLES

Description
The data are mandible lengths (mm) for 10 male and 10 female golden jackals.

Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source
HIGHAM, C.F.W., KIJNGAM, A. & MANLY, B.F.J. (1980), An analysis of prehistoric canid remains from Thailand, Journal of Archaeological Science, 7, pp. 149-165.

The data
Male (group 1)
120 107 110 116 114 111 113 117 114 112
Female (group 2)
110 111 107 108 110 105 107 106 111 111

The worksheet
C1 Mandible lengths (both sexes)
C2 Group indicator (1 = male, 2 = female)

Aims of analysis
To investigate whether mandible lengths are different for males and females.

Standard procedure (without pooling)
MTB > Retrieve "N:\resampling\Examples\Mandibles.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Mandibles.MTW
# Worksheet was saved on 28/08/01 11:00:13
Results for: Mandibles.MTW
MTB > twot c1 c2
Two-Sample T-Test and CI: Data, Group
Two-sample T for Data
Group    N     Mean   StDev   SE Mean
1       10   113.40    3.72       1.2
2       10   108.60    2.27      0.72
Difference = mu (1) - mu (2)
Estimate for difference: 4.80
95% CI for difference: (1.85, 7.75)
T-Test of difference = 0 (vs not =): T-Value = 3.48  P-Value = 0.004  DF = 14

Standard procedure (with pooling)
MTB > twot c1 c2 ;
SUBC> pooled.
Two-Sample T-Test and CI: Data, Group
Two-sample T for Data
Group    N     Mean   StDev   SE Mean
1       10   113.40    3.72       1.2
2       10   108.60    2.27      0.72
Difference = mu (1) - mu (2)
Estimate for difference: 4.80
95% CI for difference: (1.91, 7.69)
T-Test of difference = 0 (vs not =): T-Value = 3.48  P-Value = 0.003  DF = 18
Both use Pooled StDev = 3.08

Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Mandibles.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Mandibles.MTW
# Worksheet was saved on 05/07/01 15:04:34
Results for: Mandibles.MTW
MTB > % N:\resampling\library\twotran c1 c2 ;
SUBC> nran 999 ;
SUBC> differences c4 ;
SUBC> tstatistics c6.
Executing from file: N:\resampling\library\twotran.MAC
Two-sample randomization test
Data Display (WRITE)
Number of observations in group 1   10
Number of observations in group 2   10
Data mean for group 1               113.4
Data mean for group 2               108.6
Standard deviation for group 1      3.718
Standard deviation for group 2      2.271
Observed difference in means        4.800
Observed t-statistic                3.48
Number of randomization samples     999
P-value for one-sided test with alternative: mean(group 1) > mean(group 2)   0.0020
P-value for one-sided test with alternative: mean(group 1) < mean(group 2)   1.0000
P-value for two-sided test                                                   0.0040

Modified worksheet
C4 A column containing 999 differences between sample means, one for each randomized dataset
C6 A column containing 999 t-statistics for differences, one for each randomized dataset

Discussion
All methods agree that there is clear evidence of a difference in mandible lengths between the sexes. Two-sided p-values are 0.004 for standard methods (without pooling) and for randomization, and 0.003 for standard methods (with pooling). Looking at the data, we see that males (group 1) have longer mandibles.

TWOTPOOLBOOT

This macro is designed to test, using bootstrapping, whether or not the means for two independent samples are equal. We assume that the groups have equal variances.

RUNNING THE MACRO

Calling statement
twotpoolboot c1 c2 ;
nboot k1 (999) ;
differences c1 ;
tstatistics c1.

Input
C1 Data for both groups
C2 Group indicator
C1 and C2 must both be columns containing only numerical data, and they must be of the same length. The column c2 should contain group markers; these should be two distinct numerical values (for example, 1 and 2). Missing values are allowed.
Subcommands
nboot Number of bootstrap resamples used.
differences Specify a column in which to store differences between simulated group means.
tstatistics Specify a column in which to store t-statistics for differences between simulated group means.

Output
Number of observations for each group
Means and standard deviations for each group
Pooled standard deviation
Observed difference in means, with associated t-statistic
Number of bootstrap resamples
One and two-sided bootstrap p-values
The two-sided bootstrap p-value is equal to double the smaller of the one-sided p-values.

Speed of macro : FAST

ALTERNATIVE PROCEDURES

Other macros
This macro uses bootstrapping, but two randomization versions of the test are available :
TWOSAMPLERAN Randomization test, samples in different columns
TWOTRAN Randomization test, samples in the same column
This macro is suitable when variances can be assumed to be equal; if this is not the case, use TWOTUNPOOLBOOT instead.

Standard procedures
twot c1 c2 ;
pooled.
This performs a two-sample t-test that the mean of the data for the first group is equal to the mean of the data for the second group. The data are provided in c1, group labels are provided in c2. Variances are pooled, so this is only appropriate if the variances for the two groups can be assumed to be equal.

TECHNICAL DETAILS

Null hypothesis : The mean for the first group is equal to the mean for the second group.

Test-statistic : The t-statistic (with pooled standard deviation):
t = (mean for group 1 - mean for group 2) / (pooled standard deviation * sqrt([1/sample size for group 1] + [1/sample size for group 2])).

Resampling procedure : Assume that sample sizes are n1 (for group 1) and n2 (for group 2). Then
1. We create a modified dataset, by deducting the sample group mean from each data value. This ensures that both groups have the same mean.
Since we assume that group variances are also equal, we can therefore assume that the allocation of individuals to groups within this modified dataset is random under the null hypothesis.
2. For each bootstrap sample, we select n1 values from the modified dataset (with replacement) and allocate these to group 1. Similarly, we select n2 values and allocate these to group 2.

REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 6).

WORKED EXAMPLE FOR TWOTPOOLBOOT

Name of dataset
MANDIBLES

Description
The data are mandible lengths (mm) for 10 male and 10 female golden jackals.

Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source
HIGHAM, C.F.W., KIJNGAM, A. & MANLY, B.F.J. (1980), An analysis of prehistoric canid remains from Thailand, Journal of Archaeological Science, 7, pp. 149-165.

The data
Male (group 1)
120 107 110 116 114 111 113 117 114 112
Female (group 2)
110 111 107 108 110 105 107 106 111 111

The worksheet
C1 Mandible lengths (both sexes)
C2 Group indicator (1 = male, 2 = female)

Aims of analysis
To investigate whether mandible lengths are different for males and females.

Bootstrap procedure
MTB > Retrieve "N:\resampling\Examples\Mandibles.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Mandibles.MTW
# Worksheet was saved on 28/08/01 11:00:13
Results for: Mandibles.MTW
MTB > % N:\resampling\library\twotpoolboot c1 c2 ;
SUBC> nboot 999 ;
SUBC> differences c4 ;
SUBC> tstatistics c6.
Executing from file: N:\resampling\library\twotpoolboot.MAC
Two-sample bootstrap t-test (with pooling of standard deviations)
Data Display (WRITE)
Number of observations in group 1   10
Number of observations in group 2   10
Data mean for group 1               113.4
Data mean for group 2               108.6
Standard deviation for group 1      3.718
Standard deviation for group 2      2.271
Pooled standard deviation           3.080
Observed difference in means        4.800
Observed t-statistic                3.484
Number of bootstrap samples         999
P-value for one-sided test with alternative: mean(group 1) > mean(group 2)   0.0020
P-value for one-sided test with alternative: mean(group 1) < mean(group 2)   0.9990
P-value for two-sided test                                                   0.0040

Modified worksheet
C4 A column containing 999 differences between sample means, one for each bootstrap resample
C6 A column containing 999 t-statistics for differences, one for each bootstrap resample

Discussion
The results are very similar to those using TWOTRAN. Again, there is very clear evidence of a difference in means (p-value = 0.004).

TWOTUNPOOLBOOT

This macro is designed to test, using bootstrapping, whether or not the means for two independent groups are equal. We do not assume that the groups have equal variances.

RUNNING THE MACRO

Calling statement
twotunpoolboot c1 c2 ;
nboot k1 (999) ;
differences c1 ;
tstatistics c1.

Input
C1 Data for both groups
C2 Group indicator
C1 and C2 must both be columns containing only numerical data, and they must be of the same length. Missing values are allowed. The column c2 should contain group markers; these should be two distinct numerical values (for example, 1 and 2).

Subcommands
nboot Number of bootstrap resamples used.
differences Specify a column in which to store differences between simulated group means.
tstatistics Specify a column in which to store t-statistics for differences between simulated group means.
Speed of macro : FAST

ALTERNATIVE PROCEDURES

Other macros
This macro uses bootstrapping, but two randomization versions of the test are available :
TWOSAMPLERAN Randomization test, samples in different columns
TWOTRAN Randomization test, samples in the same column
This macro is suitable when variances cannot be assumed to be equal; if it is reasonable to assume equal variances, use TWOTPOOLBOOT instead.

Standard procedures
twot c1 c2.
This performs a two-sample t-test that the mean of the data for the first group is equal to the mean of the data for the second group. The data are provided in c1, group labels are provided in c2. Variances are not pooled, so this is appropriate if the variances for the two groups cannot be assumed to be equal.

TECHNICAL DETAILS
We test the null hypothesis that the means for the two groups are equal, using the usual t-statistic (with unpooled standard deviations) as our test-statistic. In order to resample under the null hypothesis, we first deduct group means from each data point, to ensure that both groups have the same mean. We then resample separately within each group, and compare the two groups using a t-statistic with unpooled variances. We must resample separately within groups, because under the null hypothesis data from the two groups will not be identically distributed (since group variances are - we assume - unequal).

Null hypothesis : The mean for the first group is equal to the mean for the second group.

Test-statistic : The t-statistic (with separate group standard deviations):
t = (mean for group 1 - mean for group 2) / sqrt([variance for group 1/sample size for group 1] + [variance for group 2/sample size for group 2]).

Resampling procedure : Assume that sample sizes are n1 (for group 1) and n2 (for group 2). Then
[1] We create a modified dataset, by deducting the sample group mean from each data value. This ensures that both groups have the same mean.
[2] For each bootstrap sample, we select (with replacement) n1 values from the modified data for group 1, and allocate these to group 1. We also select (with replacement) n2 values from the modified data for group 2, and allocate these to group 2.
It is necessary to use this form of restricted bootstrapping because we cannot assume that group variances are equal, and so we cannot pool data from groups 1 and 2.

REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 6).

WORKED EXAMPLE FOR TWOTUNPOOLBOOT

Name of dataset
MANDIBLES

Description
The data are mandible lengths (mm) for 10 male and 10 female golden jackals.

Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source
HIGHAM, C.F.W., KIJNGAM, A. & MANLY, B.F.J. (1980), An analysis of prehistoric canid remains from Thailand, Journal of Archaeological Science, 7, pp. 149-165.

The data
Male (group 1)
120 107 110 116 114 111 113 117 114 112
Female (group 2)
110 111 107 108 110 105 107 106 111 111

The worksheet
C1 Mandible lengths (both sexes)
C2 Group indicator (1 = male, 2 = female)

Aims of analysis
To investigate whether mandible lengths are different for males and females.

Bootstrap procedure
MTB > Retrieve "N:\resampling\Examples\Mandibles.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Mandibles.MTW
# Worksheet was saved on 28/08/01 11:00:13
Results for: Mandibles.MTW
MTB > % N:\resampling\library\twotunpoolboot c1 c2 ;
SUBC> nboot 999 ;
SUBC> differences c4 ;
SUBC> tstatistics c6.
Executing from file: N:\resampling\library\twotunpoolboot.MAC
Two-sample bootstrap t-test (standard deviations not pooled)
Data Display (WRITE)
Number of observations in group 1   10
Number of observations in group 2   10
Data mean for group 1               113.4
Data mean for group 2               108.6
Standard deviation for group 1      3.718
Standard deviation for group 2      2.271
Observed difference in means        4.800
Observed t-statistic                3.484
Number of bootstrap samples         999
P-value for one-sided test with alternative: mean(group 1) > mean(group 2)   0.0050
P-value for one-sided test with alternative: mean(group 1) < mean(group 2)   0.9990
P-value for two-sided test                                                   0.0100

Modified worksheet
C4 A column containing 999 differences between sample means, one for each bootstrap resample
C6 A column containing 999 t-statistics for differences, one for each bootstrap resample

Discussion
The results are very similar to those using TWOTRAN. Again, there is clear evidence of a difference in means, but the p-value is somewhat larger in this case (p-value = 0.010).

CORRELATIONRAN

This macro is designed to test the significance of the correlation between two variables.

RUNNING THE MACRO

Calling statement
correlationran c1 c2 ;
nran k1 (999) ;
corrs c1.

Input
C1 First variable
C2 Second variable
C1 and C2 must be columns, of the same length, containing only numerical values.

Subcommands
nran Number of randomizations used.
corrs Specify a column in which to store correlation coefficients for randomization samples.

Output
Number of observations, and means for each variable
Observed correlation coefficient
Number of randomizations
Randomization p-values

Speed of macro : FAST
Missing values : Allowed.

ALTERNATIVE PROCEDURES

Standard procedures
correlation c1 c2.
This finds the correlation between the data in c1 and the data in c2, and gives the p-value for this correlation.

TECHNICAL DETAILS

Null hypothesis : The two variables are uncorrelated, i.e. the population correlation coefficient is zero.

Test-statistic : The Pearson correlation coefficient.
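The randomization used by CORRELATIONRAN - shuffling one variable against the other and recomputing the Pearson correlation - can be sketched in Python (our illustration, not the macro itself; it assumes neither variable is constant, and uses the (count + 1)/(nran + 1) p-value convention):

```python
import math
import random

def correlation_ran(x, y, nran=999, seed=1):
    # Shuffle the second variable against the first, recomputing the
    # Pearson correlation for each randomized pairing.
    rng = random.Random(seed)

    def pearson(a, b):
        n = len(a)
        ma, mb = sum(a) / n, sum(b) / n
        cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
        sa = math.sqrt(sum((u - ma) ** 2 for u in a))
        sb = math.sqrt(sum((v - mb) ** 2 for v in b))
        return cov / (sa * sb)

    observed = pearson(x, y)
    ys = list(y)
    count_neg = count_pos = 0
    for _ in range(nran):
        rng.shuffle(ys)
        r = pearson(x, ys)
        count_neg += r <= observed
        count_pos += r >= observed
    p_neg = (count_neg + 1) / (nran + 1)  # H1: negative correlation
    p_pos = (count_pos + 1) / (nran + 1)  # H1: positive correlation
    return observed, p_neg, p_pos, min(1.0, 2 * min(p_neg, p_pos))
```

Applied to the HEXOKINASE data of the worked example, the observed correlation is strongly positive and the one-sided p-value for a positive correlation is correspondingly small.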
Randomization : We randomize the allocation of the values of the second variable to the values of the first variable, since under the null hypothesis the pairing of the two variables will be independent.

Note : This macro operates in exactly the same way as the simple linear regression macro, REGRESSSIMRAN. The output is substantially different, reflecting the different emphasis of correlation as opposed to regression.

REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 8).

WORKED EXAMPLE FOR CORRELATIONRAN

Name of dataset
HEXOKINASE

Description
The data are taken from part of a study by McKechnie, concerning electrophoretic frequencies of the butterfly Euphydryas editha. For each of 18 units (corresponding either to colonies, or to sets of colonies), the reciprocal of altitude (originally measured in units of 10^3 feet) is recorded, together with the percentage frequency of hexokinase 1.00 mobility genes from electrophoresis of samples of Euphydryas editha. We choose to label these variables "invalt" and "hk" respectively.

Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source
MCKECHNIE, S.W., EHRLICH, P.R. & WHITE, R.R. (1975), Population genetics of Euphydryas butterflies. I. Genetic variation and the neutrality hypothesis, Genetics, 81, pp. 571-594.

Data
Number of observations = 18
Number of variables = 2
HK     98.00 36.00 72.00 67.00 82.00 72.00 65.00 1.00 40.00 39.00 9.00 19.00 42.00 37.00 16.00 4.00 1.00 4.00
INVALT  2.00  1.25  1.75  1.82  2.63  1.08  2.08 1.59  0.67  0.57 0.50  0.24  0.40  0.50  0.15 0.13 0.11 0.10

[Scatterplot of hk (vertical axis, 0 to 100) against invalt (horizontal axis, 0 to 2)]

Minitab worksheet
C1 HK measurements
C2 INVALT measurements

Aims of analysis
To investigate whether HK and INVALT measurements are correlated.

Standard procedure
MTB > Retrieve "N:\resampling\Examples\Hexokinase.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Hexokinase.MTW
# Worksheet was saved on 06/07/01 14:15:38
Results for: Hexokinase.MTW
MTB > correlation c1 c2.
Correlations: hk, invalt
Pearson correlation of hk and invalt = 0.770
P-Value = 0.000

Resampling procedure
MTB > % N:\resampling\library\correlationran c1 c2 ;
SUBC> nran 499 ;
SUBC> corrs c4.
Executing from file: N:\resampling\library\correlationran.MAC
Data Display (WRITE)
Number of observations     18
Mean of first variable     39.11
Mean of second variable    0.98
Correlation coefficient    0.770
Number of randomizations   499
One-sided randomization p-value, H1: -ve correlation   1.0000
One-sided randomization p-value, H1: +ve correlation   0.0020
Two-sided randomization p-value                        0.0040

Modified worksheet
C4 A column containing 499 correlation coefficients, one for each randomized dataset

Discussion
There is clearly a strong positive correlation between the variables. The standard p-value is 0.000, whilst the randomization p-value is 0.004, the smallest possible value for 499 randomizations.

2 BASIC STATISTICS: CONFIDENCE INTERVALS

Overview

Specific procedures
MEANCIBOOT computes bootstrap confidence intervals for a population mean.
MEDIANCIBOOT computes bootstrap confidence intervals for a population median.
STDEVCIBOOT computes bootstrap confidence intervals for a population standard deviation.

General procedures
ANYCIBOOT provides a template for creating a macro to calculate confidence intervals using any test-statistic (so long as it is a function of univariate data).

An introduction to bootstrap confidence intervals
A large number of bootstrap techniques for constructing confidence intervals have been suggested, and the merits of the different approaches are discussed at length in the statistical literature. We concentrate on those techniques discussed by Manly (1997).
However, in our opinion the clearest introduction to bootstrap confidence intervals is that of Efron and Tibshirani (1993).

Standard confidence intervals
A standard 95% confidence interval for a parameter estimate is given by
CI = Parameter estimate +/- 1.96 * Standard error based on observed sample,
where 1.96 is found from tables of the normal distribution.

Estimate +/- 1.96 * bootstrap standard deviation
The simplest type of 95% bootstrap confidence interval involves estimating the standard error by the standard deviation of the bootstrap parameter estimates (henceforth known simply as the "bootstrap standard deviation"), so that
CI = Parameter estimate +/- 1.96 * Bootstrap standard deviation.
For intervals other than 95%, a value other than 1.96 is required, and can be obtained from normal tables.

Bootstrap-t method
Standard confidence intervals involve making assumptions about the distribution of parameters - the 1.96 in the above equations arises because we assume that parameters (or a standardized version of them) are normally distributed. Using bootstrapping, we can avoid such assumptions. Instead, we can find the distribution of the t-statistic for the parameter - a standardized version of the parameter estimate - from the bootstrap samples. The confidence interval is then
CI = Parameter estimate +/- Bootstrap t-statistic quantile * Standard error based on observed sample.
The bootstrap t-statistic for the dth resample is defined by
tboot(d) = (Parameter estimate for dth resample - Parameter estimate for observed sample) / Standard error for dth resample.

Efron percentile method
Assume that a number of bootstrap resamples, nboot, are used. Then, in order to create a 100(1-alpha)% confidence interval we sort the parameter estimates obtained from the nboot resamples into ascending order. We then take the [nboot * alpha/2]th value and [nboot * (1-alpha/2)]th value in this sorted list as our lower and upper confidence limits. If [nboot * alpha/2] is not an integer, we round it down; correspondingly, we round [nboot * (1-alpha/2)] up; this rounding procedure is conservative. For example, if there are 1000 bootstrap samples, then we calculate the parameter estimate for each sample, and sort these estimates. The 95% confidence interval is formed by taking the 0.025 * 1000 = 25th and 975th estimates from this list as our lower and upper limits.

Hall percentile method
A modified version of the Efron percentile method. If the Efron confidence limits are EfronLow and EfronHigh, then the Hall limits are
HallLow = (2 * Parameter estimate) - EfronHigh
HallHigh = (2 * Parameter estimate) - EfronLow.
Hall confidence intervals will have the same length as Efron confidence intervals.

BC percentile method
An extension of the Efron percentile method, in which possible bias in the parameter estimate is corrected for. The correction alters the rank values of the lower and upper endpoints used in the percentile method.

BCa percentile method
An extension of the BC percentile method, in which the possibilities of both bias and non-constant standard error are corrected for. The corrections alter the rank values of the lower and upper endpoints used in the percentile method.

Relationship between different bootstrap methods

MEANCIBOOT

This macro is designed to calculate bootstrap confidence intervals for a population mean.

RUNNING THE MACRO

Calling statement
meanciboot c1 ;
siglev k1 (95) ;
nboot k1 (2000) ;
means c1 ;
quantiles c1-c3 ;
tvalues c1.

Input
Input to the macro must be a single column, containing only numerical values.

Subcommands
siglev The significance level of the confidence interval, expressed as a percentage. The default is 95 (corresponding to a 95% confidence interval); other standard choices are 90, 98 or 99.
nboot The number of bootstrap samples used. The default is 2000. It is not recommended to use fewer than 1000 for the construction of confidence intervals.
means Specify a column in which to store bootstrap sample means.
quantiles Specify three columns in which to store ranks corresponding to the lower and upper confidence interval limits, for the standard percentile method (column 1), the BC method (column 2) and the BCa method (column 3).
tvalues Specify a column in which to store bootstrap sample t-statistics.

Output
Basic information (number of data points, significance level, number of bootstrap samples)
Sample mean, with associated standard error
Sample standard deviation
Bootstrap standard deviation about the estimated mean
Overall bootstrap mean
Estimated bias correction (for BC and BCa methods)
Estimated acceleration (for BCa method)
Standard confidence interval
Bootstrap confidence intervals using : Estimate -/+ 1.96*bootstrap standard deviation, Bootstrap-t method, Efron percentile method, Hall percentile method, BC method, BCa method.

Speed of macro : FAST
Missing data : Allowed

ALTERNATIVE PROCEDURES

Standard procedures
tinterval c1
This produces a confidence interval about a mean value, in the situation in which the variance is unknown.
zinterval k1 c1
This produces a confidence interval about a mean value, where the variance is known to be k1.

REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 3).
EFRON, B. & TIBSHIRANI, R.J. (1993) An introduction to the Bootstrap, Chapman and Hall, London (Chapters 12-14).

WORKED EXAMPLE FOR MEANCIBOOT

Name of dataset EXPONENTIAL
Description The data are 20 realisations from an Exponential distribution with rate parameter 1.
Source MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.
Data
Number of observations = 20
Number of variables = 1
3.56 0.69 0.10 1.84 3.93 1.25 0.18 1.13 0.27 0.50 0.67 0.01 0.61 0.82 1.70 0.39 0.11 1.20 1.21 0.72
Worksheet
C1 Data
Aims of analysis To create confidence intervals for the population mean.
Standard procedure
MTB > Retrieve "N:\resampling\Examples\Exponential.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Exponential.MTW
# Worksheet was saved on 23/08/01 12:16:52
Results for: Exponential.MTW
MTB > OneT c1.

One-Sample T: C1
Variable    N    Mean   StDev  SE Mean        95.0% CI
C1         20   1.044   1.060    0.237  (0.549, 1.540)

Resampling procedure
MTB > % N:\resampling\library\meanciboot c1 ;
SUBC> siglev 95 ;
SUBC> nboot 1000 ;
SUBC> means c3 ;
SUBC> quantiles c5-c7 ;
SUBC> tvalues c9.
Executing from file: N:\resampling\library\meanciboot.MAC

Data Display
STANDARD CONFIDENCE INTERVALS
Data Display (WRITE)
Number of data values                            20
Mean of data values                          1.0445
Standard deviation of data values            1.0597
Standard error of the mean                  0.23695
Significance level for confidence intervals      95
Estimated confidence interval, lower bound (Standard t method)  0.54855
Estimated confidence interval, upper bound (Standard t method)   1.5404

BOOTSTRAP CONFIDENCE INTERVALS
Data Display (WRITE)
Number of bootstrap samples                    1000
Overall mean for bootstrap samples            1.042
Standard deviation of bootstrap means        0.2407
Estimated bias-correction (for BC, BCa)      0.0652
Estimated acceleration (for BCa)             0.0612

Confidence limits
Data Display (WRITE)
Estimate -/+ 1.96*boot sd    0.5727   1.516
Bootstrap-t method           0.6405   1.932
Efron percentile method      0.6255   1.553
Hall percentile method       0.5355   1.463
BC percentile method         0.6400   1.580
BCa percentile method        0.6650   1.718

Modified worksheet
C3 A column containing 1000 sample means, one for each bootstrap resample
C5 Upper and lower rank positions for percentile confidence limits using the Efron method
C6 Upper and lower rank positions for percentile confidence limits using the BC method
C7 Upper and lower rank positions for percentile confidence limits using the BCa method
C9 A column containing 1000 t-statistics for sample means, one for each bootstrap resample
Columns c5 - c7 each contain 2 values.
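For readers who want to check the arithmetic outside Minitab, the resampling loop that MEANCIBOOT performs can be sketched in a few lines. This is an illustrative Python sketch, not the macro's own code; the function name, seed and rounding convention are ours. It computes the Efron percentile and bootstrap-t intervals for the EXPONENTIAL data listed above.

```python
import random
import statistics

# The 20 EXPONENTIAL observations from the worked example above.
data = [3.56, 0.69, 0.10, 1.84, 3.93, 1.25, 0.18, 1.13, 0.27, 0.50,
        0.67, 0.01, 0.61, 0.82, 1.70, 0.39, 0.11, 1.20, 1.21, 0.72]

def bootstrap_mean_cis(data, nboot=1000, alpha=0.05, seed=1):
    """Efron percentile and bootstrap-t intervals for the mean (a sketch)."""
    rng = random.Random(seed)
    n = len(data)
    est = statistics.mean(data)
    se = statistics.stdev(data) / n ** 0.5      # observed standard error
    boot_means, boot_t = [], []
    for _ in range(nboot):
        sample = [rng.choice(data) for _ in range(n)]   # resample with replacement
        m = statistics.mean(sample)
        s = statistics.stdev(sample) / n ** 0.5
        boot_means.append(m)
        boot_t.append((m - est) / s)            # t-statistic for this resample
    boot_means.sort()
    boot_t.sort()
    lo = int(nboot * alpha / 2)                 # round down: conservative
    hi = nboot - lo - 1
    efron = (boot_means[lo], boot_means[hi])
    # Bootstrap-t: estimate minus (bootstrap t quantile * observed SE).
    boot_t_ci = (est - boot_t[hi] * se, est - boot_t[lo] * se)
    return est, efron, boot_t_ci
```

The exact limits depend on the random seed, but with 1000 resamples both intervals should land close to the ones shown in the transcript above.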
Discussion
There is a fair amount of variation between the different confidence intervals. All 7 intervals include the true population mean of one. The Efron percentile method produces the shortest interval in this case, the bootstrap-t interval the longest. The bootstrap-t and BCa intervals generally imply larger values for the mean, whilst the Hall, standard t and estimate -/+ 1.96 * bootstrap standard deviation intervals generally imply smaller values for the mean. These are not general properties of the different methods, however. Manly (1997) performs a simulation study to investigate the coverage of the different bootstrap intervals for a sample of 20 observations from an exponential distribution with rate parameter one. He finds that the bootstrap-t method (with 95.2% coverage) has the closest coverage to the nominal 95% level, closely followed by the BCa method (with 92.4% coverage).

MEDIANCIBOOT

This macro is designed to calculate bootstrap confidence intervals for a population median.

RUNNING THE MACRO

Calling statement
medianciboot c1 ;
siglev k1 (95) ;
nboot k1 (2000) ;
medians c1 ;
quantiles c1-c3 ;
tvalues c1.

Input
Input to the macro must be a single column, containing only numerical values. Discrete or continuous data are allowed. Missing data is allowed.

Subcommands
siglev The significance level of the confidence interval, expressed as a percentage. The default is 95 (corresponding to a 95% confidence level); other standard choices are 90, 98 or 99.
nboot The number of bootstrap samples used. The default is 2000. It is not recommended to use fewer than 1000 for the construction of confidence intervals.
medians Specify a column in which to store bootstrap sample medians.
quantiles Specify three columns in which to store ranks corresponding to the lower and upper confidence interval limits, for the standard percentile method (column 1), the BC method (column 2) and the BCa method (column 3).
tvalues Specify a column in which to store bootstrap sample t-statistics.

Output
Basic information (number of data points, significance level, number of bootstrap samples)
Sample median, with standard error (assuming a normal distribution)
Bootstrap standard deviation about the estimated median
Estimated bias correction (for BC and BCa methods)
Estimated acceleration (for BCa method)
Standard nonparametric confidence interval for the median
Bootstrap confidence intervals using : Estimate -/+ 1.96*bootstrap standard deviation, Bootstrap-t method, Efron percentile method, Hall percentile method, BC method, BCa method.

Speed of macro : FAST

ALTERNATIVE METHODS

Standard methods
sinterval 95 c1.
This produces three different 95% nonparametric confidence intervals for the median. The first and third intervals are based upon exact ranks, and have exact achieved confidence levels. These confidence levels will not, in general, be equal to 95% : the first interval is the interval with the closest confidence level to 95% which is below 95%, the third interval that with the closest confidence level to 95% which is above 95%. Hence, the 3rd procedure is conservative, the 1st anti-conservative. The 2nd interval is an approximate confidence interval based upon interpolation. In the output to our macro, we include the conservative nonparametric confidence interval. The construction is discussed in "technical details".

TECHNICAL DETAILS : nonparametric confidence interval for the median
The nonparametric confidence interval for the median is formed by finding the rank order of the lower and upper limits using -
Lower limit : (n + 1)/2 - (0.9789 * sqrt(n)), rounded down to the nearest integer
Upper limit : (n + 1)/2 + (0.9789 * sqrt(n)), rounded up to the nearest integer,
where n is sample size.
[Notes:
1. for n > 283, we use 0.9800 instead of 0.9789
2. for n = 17, the formula provides rank values of 4 and 14, but we use 5 and 13.
3.
for n = 67, the formula provides rank values of 25 and 43, but we use 26 and 42].
The data points corresponding to these rank orders then form the confidence interval. Details of these procedures can be found at :
http://www.umanitoba.ca/centres/mchpe/concept/dict/Statistics/ci_median/
http://www.maths.unb.ca/~knight/utility/MedInt95.htm.

REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 3).
EFRON, B. & TIBSHIRANI, R.J. (1993) An introduction to the Bootstrap, Chapman and Hall, London (Chapters 12-14).

WORKED EXAMPLE FOR MEDIANCIBOOT

Data EXPONENTIAL (see MEANCIBOOT)
Aims of analysis To create confidence intervals for the population median.

Standard procedure : Sign confidence interval
MTB > SInterval 95.0 c1.

Sign CI: C1
Sign confidence interval for median
                        Achieved
       N   Median     Confidence   Confidence interval   Position
C1    20    0.705         0.8847   (0.500, 1.200)               7
                          0.9500   (0.416, 1.208)             NLI
                          0.9586   (0.390, 1.210)               6

Resampling procedure
MTB > Retrieve "N:\resampling\Examples\Exponential.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Exponential.MTW
# Worksheet was saved on 23/08/01 12:16:52
Results for: Exponential.MTW
MTB > % N:\resampling\library\medianciboot c1 ;
SUBC> siglev 95 ;
SUBC> nboot 1000 ;
SUBC> medians c3 ;
SUBC> quantiles c5-c7 ;
SUBC> tvalues c9.
Executing from file: N:\resampling\library\medianciboot.MAC

General information
Data Display (WRITE)
Number of data values                            20
Median of data values                       0.70500
Standard error of the median                0.29697
Significance level for confidence intervals      95
Number of bootstrap samples                    1000
Bootstrap standard deviation                 0.2019
Estimated bias-correction (for BC, BCa)     -0.1156
Estimated acceleration (for BCa)            -0.0000

Confidence limits
Data Display (WRITE)
Standard non-parametric method    0.3900    1.210
Estimate -/+ 1.96*boot sd         0.3093    1.101
Bootstrap-t method                0.1550    1.082
Efron percentile method           0.4450    1.205
Hall percentile method            0.2050   0.9650
BC percentile method              0.3850    1.200
BCa percentile method             0.3850    1.200

Modified worksheet
C3 A column containing 1000 sample medians, one for each bootstrap resample
C5 Upper and lower rank positions for percentile confidence limits using the Efron method
C6 Upper and lower rank positions for percentile confidence limits using the BC method
C7 Upper and lower rank positions for percentile confidence limits using the BCa method
C9 A column containing 1000 t-statistics for sample medians, one for each bootstrap resample
Columns c5 - c7 each contain 2 values.

STDEVCIBOOT

This macro is designed to calculate bootstrap confidence intervals for a population standard deviation.

RUNNING THE MACRO

Calling statement
stdevciboot c1 ;
siglev k1 (95) ;
nboot k1 (2000) ;
stdevs c1 ;
quantiles c1-c3.

Input
Input to the macro must be a single column, containing only numerical values. Discrete or continuous data are allowed. Missing data is allowed.

Subcommands
siglev The significance level of the confidence interval, expressed as a percentage. The default is 95 (corresponding to a 95% confidence level); other standard choices are 90, 98 or 99.
nboot The number of bootstrap samples used. The default is 2000. It is not recommended to use fewer than 1000 for the construction of confidence intervals.
stdevs Specify a column in which to store bootstrap sample standard deviations.
quantiles Specify three columns in which to store ranks corresponding to the lower and upper confidence interval limits, for the standard percentile method (column 1), the BC method (column 2) and the BCa method (column 3). Six ranks are given in total; the first two correspond to the lower and upper confidence limits for the standard percentile interval, the next two to the bias-corrected (BC) percentile interval, and the final two to the accelerated bias-corrected (BCa) percentile interval.

Output
Basic information (number of data points, significance level, number of bootstrap samples)
Sample standard deviation
Bootstrap standard deviation about the estimated standard deviation
Estimated bias correction (for BC and BCa methods)
Estimated acceleration (for BCa method)
Confidence interval using chi-squared approximation
Bootstrap confidence intervals using : Estimate -/+ bootstrap standard deviation, Efron percentile method, Hall percentile method, BC method, BCa method.

Speed of macro : FAST

ALTERNATIVE PROCEDURES

Standard procedure : No built-in Minitab function, but the macro incorporates a confidence interval obtained using the following approximation based upon the chi-squared distribution. The standard 100(1 - alpha)% confidence interval for a standard deviation has limits of
sqrt{(n - 1) * sample variance / appropriate quantiles of the chi-squared (n - 1) distribution},
where n is sample size. For a 95% confidence interval, the 2.5% and 97.5% quantiles are used. This interval is based on the fact that, if data is normally distributed, the quantity
Sample variance * (n - 1) / Population variance
has a chi-squared distribution with n - 1 degrees of freedom.

REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 3).
EFRON, B.
& TIBSHIRANI, R.J. (1993) An introduction to the Bootstrap, Chapman and Hall, London (Chapters 12-14).

WORKED EXAMPLE FOR STDEVCIBOOT

Data EXPONENTIAL (see MEANCIBOOT)
Aims of analysis To create confidence intervals for the population standard deviation.

Resampling procedure
MTB > Retrieve "N:\resampling\Examples\Exponential.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Exponential.MTW
# Worksheet was saved on 23/08/01 12:16:52
Results for: Exponential.MTW
MTB > % N:\resampling\library\stdevciboot c1 ;
SUBC> siglev 95 ;
SUBC> nboot 2000 ;
SUBC> stdevs c3 ;
SUBC> quantiles c5-c7.
Executing from file: N:\resampling\library\stdevciboot.MAC

Data Display
BOOTSTRAP CONFIDENCE INTERVALS FOR A POPULATION STANDARD DEVIATION

General information
Data Display (WRITE)
Number of data values                                 20
Observed standard deviation                       1.0597
Significance level for confidence intervals           95
Number of bootstrap samples                         2000
Bootstrap standard deviation about the estimate   0.2474
Estimated bias-correction (for BC, BCa)           0.1231
Estimated acceleration (for BCa)                  0.1009

Confidence intervals
Data Display (WRITE)
Standard chi-squared based interval   0.7829   1.504
Estimate -/+ 1.96*bootstrap SE        0.5749   1.545
Efron percentile method               0.4842   1.436
Hall percentile method                0.6836   1.635
BC percentile method                  0.5137   1.474
BCa percentile method                 0.5742   1.549

[Histogram: distribution of the standard deviations from the bootstrap resamples (x-axis: sample standard deviation, y-axis: frequency)]

Modified worksheet
C3 A column containing 2000 sample standard deviations, one for each bootstrap resample
C5 Upper and lower rank positions for percentile confidence limits using the Efron method
C6 Upper and lower rank positions for percentile confidence limits using the BC method
C7 Upper and lower rank positions for percentile confidence limits using the BCa method
Columns c5 - c7 each contain 2 values.

Discussion
The different methods produce substantially different results.
The confidence interval based upon standard methods is substantially shorter than any of the bootstrap confidence intervals. Manly (1997) performs a simulation study to investigate the coverage of the different methods for a sample of 20 from an exponential distribution with parameter 1. The coverages are extremely poor for all of the methods. Against a nominal coverage level of 95%, the bootstrap methods achieve coverages of between 65.9% (Efron method) and 72.7% (Hall method), whilst the standard method has a coverage of 72.7%. Some improvement to the standard and Hall methods can be obtained by taking logarithms, but the coverage remains poor. We see that the bootstrap distribution of the standard deviation (see figure above) is very lumpy, and this may explain the poor performance of the methods.

ADDITIONAL SAMPLE DATASET FOR STDEVCIBOOT

Name of dataset SPATIAL
Description For each of 26 neurologically impaired children, the results of two tests of spatial perception, A and B, are recorded.
Source EFRON, B. & TIBSHIRANI, R.J. (1993) An introduction to the Bootstrap, Chapman and Hall, London.
Data
Number of observations = 26
Number of variables = 2
A scores (top) and B scores (bottom) are shown, one column per child.
48 36 20 29 42 42 20 42 22 41 45 14  6  0 33 28 34  4 32 24 47 41 24 26 30 41
42 33 16 39 38 36 15 33 20 43 34 22  7 15 34 29 41 13 38 25 27 41 28 14 28 40
Aims of analysis Efron and Tibshirani (1993) produce confidence intervals for test A scores.

ANYCIBOOT

This macro is designed to provide a template for the creation of a bootstrap confidence interval using any test-statistic.

ADAPTING THE MACRO

In order to change the test-statistic, modify the three lines of code denoted by hashed boxes (and clearly marked). A number of possible alternatives are provided, but almost any test-statistic for univariate data may be used. If the test-statistic is complex, any additional variables included within the code for its computation must be declared.
For complex test-statistics, it may be better to call another local macro to compute the test-statistic at each stage.

Note on a potential bug : For some test-statistics, it may be impossible to calculate the acceleration for particular datasets. If there is any risk that, for a given test-statistic, the test-statistic will take the same value for each subset of the data formed by excluding one datapoint at a time, then the details and calculations concerning the calculation of BCa intervals should be excluded from the code (contact the authors for further details).

Example : If the test-statistic is the median and the observed dataset is [1,3,3,3,6], then the medians of each of the restricted datasets [3,3,3,6], [1,3,3,6], [1,3,3,6], [1,3,3,6], [1,3,3,3] (formed by missing out the values 1, 3, 3, 3 and 6 respectively) are all 3, and so the acceleration cannot be calculated. For many test-statistics, such as the mean and standard deviation, it is highly unlikely that this phenomenon will arise (in fact, the only situation we can envisage is if all of the data are equal, in which case resampling methods are clearly inappropriate anyway), but it may be a risk for test-statistics based upon quantiles or ranks (e.g. the interquartile range).

RUNNING THE MACRO

Calling statement
anyciboot c1 ;
siglev k1 (95) ;
nboot k1 (2000) ;
teststats c1 ;
quantiles c1-c3.

Input
Input to the macro must be a single column, containing only numerical values. Discrete or continuous data are allowed. Missing data is not allowed.

Subcommands
siglev The significance level of the confidence interval, expressed as a percentage. The default is 95 (corresponding to a 95% confidence level); other standard choices are 90, 98 or 99.
nboot The number of bootstrap samples used. The default is 2000. It is not recommended to use fewer than 1000 for the construction of confidence intervals.
teststats Specify a column in which to store bootstrap sample test-statistics.
quantiles Specify three columns in which to store ranks corresponding to the lower and upper confidence interval limits, for the standard percentile method (column 1), the BC method (column 2) and the BCa method (column 3). Six ranks are given in total; the first two correspond to the lower and upper confidence limits for the standard percentile interval, the next two to the bias-corrected (BC) percentile interval, and the final two to the accelerated bias-corrected (BCa) percentile interval.

Output
Basic information (number of data points, significance level, number of bootstrap samples)
Observed test-statistic
Bootstrap standard deviation about the test-statistic
Estimated bias correction (for BC and BCa methods)
Estimated acceleration (for BCa method)
Bootstrap confidence intervals using : Estimate -/+ bootstrap standard deviation, Efron percentile method, Hall percentile method, BC method, BCa method.

Speed of macro : FAST

ALTERNATIVE PROCEDURES

Other macros
Specific macros exist to compute confidence intervals for means, medians and standard deviations. See
MEANCIBOOT bootstrap confidence interval for a mean
MEDIANCIBOOT bootstrap confidence interval for a median
STDEVCIBOOT bootstrap confidence interval for a standard deviation.

TECHNICAL DETAILS
The choice of test-statistic is important, and will have a critical effect upon the results obtained. A statistician should be consulted, to determine whether or not the required assumptions underlying resampling procedures hold for the given test-statistic.

REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 3).
EFRON, B. & TIBSHIRANI, R.J. (1993) An introduction to the Bootstrap, Chapman and Hall, London (Chapters 12-14).
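The "potential bug" described under ADAPTING THE MACRO can be checked numerically before a new test-statistic is used. The sketch below is in Python rather than Minitab code (illustrative only; the function name is ours). It computes the BCa acceleration from the leave-one-out (jackknife) estimates, and reports when the acceleration is undefined because every leave-one-out value is identical:

```python
import statistics

def jackknife_acceleration(data, stat):
    """BCa acceleration from leave-one-out estimates; None if undefined."""
    # Leave-one-out ("jackknife") values of the test-statistic.
    jack = [stat(data[:i] + data[i + 1:]) for i in range(len(data))]
    mean_jack = sum(jack) / len(jack)
    num = sum((mean_jack - j) ** 3 for j in jack)
    den = 6 * sum((mean_jack - j) ** 2 for j in jack) ** 1.5
    if den == 0:             # every leave-one-out value identical:
        return None          # acceleration cannot be calculated
    return num / den

# The degenerate example from the text: every leave-one-out median is 3.
print(jackknife_acceleration([1, 3, 3, 3, 6], statistics.median))  # prints None
```

Running the check with the mean on the same data returns a finite acceleration, which is why the problem is described as a risk mainly for quantile- and rank-based statistics.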
WORKED EXAMPLE FOR ANYCIBOOT

Data EXPONENTIAL (see MEANCIBOOT)
Aims of analysis To create confidence intervals for the population mean.

Resampling procedure
MTB > Retrieve "N:\resampling\Examples\Exponential.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Exponential.MTW
# Worksheet was saved on 23/08/01 12:16:52
Results for: Exponential.MTW
MTB > % N:\resampling\library\anyciboot c1 ;
SUBC> siglev 95 ;
SUBC> nboot 1000 ;
SUBC> teststats c3 ;
SUBC> quantiles c5-c7.
Executing from file: N:\resampling\library\anyciboot.MAC

BOOTSTRAP CONFIDENCE INTERVALS ABOUT A POPULATION TEST-STATISTIC

[Histogram: distribution of the test-statistic from the bootstrap resamples (x-axis: sample test statistic, y-axis: frequency)]

General information
Data Display (WRITE)
Number of data values                                    20
Observed test-statistic                               1.044
Significance level for confidence intervals              95
Number of bootstrap samples                            1000
Bootstrap standard deviation about the test-statistic  0.2271
Estimated bias-correction (for BC, BCa)              0.0326
Estimated acceleration (for BCa)                     0.0612

Confidence intervals
Data Display (WRITE)
Estimate -/+ 1.96*bootstrap SE   0.5994   1.490
Efron percentile method          0.6480   1.544
Hall percentile method           0.5455   1.441
BC percentile method             0.6525   1.553
BCa percentile method            0.6960   1.635

Modified worksheet
C3 A column containing 1000 sample test-statistics, one for each bootstrap resample
C5 Upper and lower rank positions for percentile confidence limits using the Efron method
C6 Upper and lower rank positions for percentile confidence limits using the BC method
C7 Upper and lower rank positions for percentile confidence limits using the BCa method
Columns c5 - c7 each contain 2 values.

Discussion
The results are very similar to those obtained using MEANCIBOOT.
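The BC rank adjustment, which the overview at the start of this chapter describes only briefly ("the correction alters the rank values of the lower and upper endpoints"), can be written out explicitly. The sketch below is in Python, not the macro's own Minitab code, and uses the usual bias-correction formula z0 = Phi^-1(proportion of bootstrap estimates below the observed estimate); it assumes a non-degenerate bootstrap distribution, with some estimates on each side of the observed value. When z0 = 0 the BC ranks reduce to the ordinary Efron percentile ranks.

```python
from statistics import NormalDist

def bc_ranks(boot_stats, estimate, alpha=0.05):
    """Rank positions (0-based) of the BC percentile limits in the
    sorted list of bootstrap estimates -- a sketch of the adjustment."""
    nd = NormalDist()
    nboot = len(boot_stats)
    # Bias correction: where the observed estimate sits in the
    # bootstrap distribution (0.5 means no bias, so z0 = 0).
    prop_below = sum(1 for b in boot_stats if b < estimate) / nboot
    z0 = nd.inv_cdf(prop_below)          # undefined if prop_below is 0 or 1
    # Shift the percentile points by twice the bias correction.
    a1 = nd.cdf(2 * z0 + nd.inv_cdf(alpha / 2))
    a2 = nd.cdf(2 * z0 + nd.inv_cdf(1 - alpha / 2))
    return int(nboot * a1), int(nboot * a2)
```

With 1000 resamples and an unbiased bootstrap distribution this gives ranks close to 25 and 975, the plain percentile ranks; a positive z0 pushes both limits towards higher ranks, a negative z0 towards lower ranks.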
3 ANALYSIS OF VARIANCE

Overview

One-way analysis of variance
ONEWAYRAN tests for a factor effect in a one-way analysis of variance

Two-way analysis of variance
TWOWAYRAN tests for a group effect in a two-way analysis of variance, without replication
TWOWAYREPRAN tests for a group effect in a two-way analysis of variance, with replication

Testing for constant variance
LEVENERAN tests for constant variance using a randomization version of Levene's test

ONEWAYRAN

This macro is designed to perform a one-way analysis of variance. Randomization is used to assess the significance of the factor effect.

Calling statement
onewayran c1 c2 ;
nran k1 (999) ;
fvalues c1.

Input
C1 Data. A column containing only numeric values.
C2 Group. A column containing only numeric values. The number of distinct numeric values used should be equal to the number of groups, with each value denoting a particular group.

Subcommands
nran Number of randomizations
fvalues Specify a column in which to store simulated F-ratios for the group effect.

Output
Identical to the output from the standard Minitab command "oneway", with the addition of a randomization p-value for the group effect.

Speed of macro : FAST

ALTERNATIVE PROCEDURES

Standard procedures
oneway C1 C2.
This performs a one-way analysis of variance. The response variable is provided in c1, the factor levels corresponding to each data point are provided in c2.

TECHNICAL DETAILS
Null hypothesis : Means for all groups are equal, so that μ1 = μ2 = … = μg, where there are g groups and μi is the mean data value for the ith group.
Test-statistic : The standard F-ratio produced in a one-way analysis of variance.
Randomization procedure : We fix the data value for each individual, and fix group sizes. We then randomize the allocation of data to groups. This is valid, since under the null hypothesis the allocation of group labels to individuals is random.

REFERENCES
MANLY, B.F.J.
(1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 7).

WORKED EXAMPLE FOR ONEWAYRAN

Name of dataset MONTHS
Description Again, the data are taken from the study by Powell and Russell, and concern the stomach contents of the eastern horned lizard Phrynosoma douglassi brevirostre. The data record, for each of four months, the amount of dry biomass of ants for the 24 adult and yearling females mentioned above. Manly (1997) uses these data to perform a one-way analysis of variance.
Our source MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.
Original source POWELL, G.L. & RUSSELL, A.P. (1984), The diet of the eastern short-horned lizard (Phrynosoma douglassi brevirostre) in Alberta and its relationship to sexual size dimorphism, Canadian Journal of Zoology, 62, pp. 428-440.
POWELL, G.L. & RUSSELL, A.P. (1985), Growth and sexual size dimorphisms in Alberta populations of the eastern short-horned lizard, Phrynosoma douglassi brevirostre, Canadian Journal of Zoology, 63, pp. 139-154.
Data
Number of observations = 24
Number of variables = 2
Data (top) and month (bottom) are given, one column per observation.
13 242 105 600 82  8 59 20 40 52 1889 18  2 245 515 488 88 233 44 21  0  5  6 50
 1   1   1   3  3  2  2  2  3  3    3  4  2   2   3   3  3   3  4  4  4  4  4  3
Worksheet
C1 Data
C2 Month
Aims of analysis To investigate whether month has an impact upon stomach biomass.

Standard procedure
Retrieving worksheet from file: N:\resampling\Examples\Months.MTW
# Worksheet was saved on 07/06/01 09:11:46 AM
Results for: Months.MTW
MTB > Oneway c1 c2.

Randomization procedure
MTB > % N:\resampling\library\onewayran c1 c2 ;
SUBC> nran 999 ;
SUBC> fvalues c4.
Executing from file: N:\resampling\library\onewayran.MAC

STANDARD ONE-WAY ANOVA ANALYSIS
One-way ANOVA: biomass versus month

Analysis of Variance for biomass
Source   DF        SS      MS     F      P
month     3    726695  242232  1.64  0.211
Error    20   2947024  147351
Total    23   3673719

Level    N   Mean  StDev
1        3  120.0  115.2
2        5   66.8  102.1
3       10  403.7  565.4
4        6   15.7   16.1
Pooled StDev = 383.9
[Character plot of individual 95% CIs for the level means, based on pooled StDev, omitted]

RANDOMIZATION P-VALUES
Data Display (WRITE)
Number of groups                  4
Number of randomizations        999
Randomization p-value        0.1940

Modified worksheet
C4 A column containing 999 F-ratios, one for each randomized dataset

Discussion
There is no real evidence for a month effect, with p-values of 0.211 (standard methods) and 0.194 (randomization).

ADDITIONAL SAMPLE DATASET FOR ONEWAYRAN

Name of dataset COLONY
Our source MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London, pp. 162-166.
Original source CAIN, A.J. & SHEPPARD, P.M. (1950), Selection in the polymorphic land snail Cepaea nemoralis, Heredity, 4, 275-294.
Data
Number of observations = 17
Number of variables = 2
Data (top) and group (bottom) are given, one column per point.
25.0 26.9 8.1 13.5 3.8 9.1 30.9 17.1 37.4 26.9 76.2 40.9 58.1 18.4 64.2 42.6 45.1
 1.0  1.0 2.0  2.0 2.0 3.0  3.0  3.0  3.0  3.0  4.0  4.0  4.0  4.0  5.0  5.0  6.0

TWOWAYRAN

This macro is designed to perform a two-way analysis of variance, in a situation in which there is no replication.

Important note
The macro is only capable of assessing the impact of one of the factors in a two-way ANOVA. The user must choose which factor is of interest - this factor will be called the group. The remaining factor is treated as a nuisance factor, and we will call this the block.
If the user is interested in the (individual) effect of both factors, then the macro must be run twice, with the "group" and "block" factors swapped the second time the macro is run.

RUNNING THE MACRO

Calling statement
twowayran c1 c2 c3 ;
nran k1 (999) ;
listdata c1-c3 ;
ssquares c1 ;
fvalues c1.

Input
C1 Data. A column containing only numeric values.
C2 Group. A column containing only numeric values. The number of distinct numeric values used should be equal to the number of groups, with each value denoting a particular group.
C3 Block. A column containing only numeric values. The number of distinct numeric values used should be equal to the number of blocks, with each value denoting a particular block.

Subcommands
nran Number of randomizations
listdata Specify three columns in which to store the sorted data. Within the macro, the data are sorted and the group markers are changed to be consecutive integers, and this may make output difficult to interpret. In order to see which group is which, output the sorted data to "listdata", and compare this against the original dataset.
ssquares Specify a column in which to store sums of squares for the group effect.
fvalues Specify a column in which to store simulated F-ratios for the group effect.

Output
Identical to the output from the standard Minitab command "twoway", with the addition of randomization p-values for the group effect obtained using two different test-statistics (mean square and F-ratio).

Speed of macro : MODERATE

ALTERNATIVE PROCEDURES

Other macros
This macro is suitable only if there is no replication (i.e. if there is only one observation for each factor-by-factor combination); if replication is present, then you should use TWOWAYREPRAN.

Standard procedures
twoway C1 C2 C3.
This performs a two-way analysis of variance. The response variable is provided in c1, the factor levels corresponding to each data point are provided in c2 and c3.
TECHNICAL DETAILS
Null hypothesis : The means of the response variable are constant across groups.
Test-statistic : The F-ratio for the group effect in a two-way ANOVA.
Randomization procedure : We use a restricted randomization; we fix the allocation of data to blocks (and fix the number of data points in each group within each block). We randomize the allocation of data to groups within each block.
Notes : The two-way ANOVA does not include an interaction term, since there is insufficient data with a single replicate to allow this.

REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 7).

WORKED EXAMPLE FOR TWOWAYRAN

Name of dataset ORTHOPTERA
Description We return again to the data of Powell & Russell, except that data is now also provided for adult females. Data are now classified according to two factors, month and size morph, and Manly (1997) analyses this as a two-way analysis of variance.
Source MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.
Data
Number of observations = 8
Number of variables = 3
Data   190   0   52   50   10  110    8  1212
Size     1   1    1    1    2    2    2     2
Month    1   2    3    4    1    2    3     4
Worksheet
C1 Data
C2 Size morph
C3 Month
Aim of analysis To investigate whether month and size morph have an impact upon stomach biomass.

Randomization procedure
MTB > % N:\resampling\library\twowayran c1 c2 c3 ;
SUBC> nran 999 ;
SUBC> listdata c5-c7 ;
SUBC> ssquares c10 ;
SUBC> fvalues c12.
Executing from file: N:\resampling\library\twowayran.MAC

STANDARD TWO-WAY ANOVA ANALYSIS
Two-way ANOVA: Data versus Size, Month

Analysis of Variance for Data
Source   DF       SS      MS     F      P
Size      1   137288  137288  0.73  0.455
Month     3   491244  163748  0.88  0.542
Error     3   561052  187017
Total     7  1189584

Size   Mean
1        73
2       335

Month  Mean
1       100
2        55
3        30
4       631
[Character plots of individual 95% CIs for the Size and Month means omitted]

RANDOMIZATION P-VALUE (FOR GROUP EFFECT)
Method used: restricted randomization (randomization within blocks)
Data Display (WRITE)
Number of randomizations                         999
P-value for group effects (using F-ratio)     0.7160
P-value for group effects (using mean square) 0.7160

Modified worksheet
C5 A column containing sorted data (sorted by group and block)
C6 A column containing re-numbered group markers
C7 A column containing re-numbered block markers
C10 A column containing 999 sums of squares for the group effect, one for each randomized dataset
C12 A column containing 999 F-ratios for the group effect, one for each randomized dataset

Discussion
There is no evidence whatsoever of a group (size morph) effect, with p-values of 0.455 (by standard methods) and 0.716 (by randomization).

TWOWAYREPRAN

This macro is designed to perform a two-way analysis of variance, in situations in which replication is present.

Important note
The macro is only capable of assessing the impact of one of the factors in a two-way ANOVA. The user must choose which factor is of interest - this factor will be called the group. The remaining factor is treated as a nuisance factor, and we will call this the block.
If the user is interested in the (individual) effect of both factors, then the macro must be run twice, with the "group" and "block" factors swapped the second time the macro is run. The two-way ANOVA includes an interaction term, but the macro cannot determine the significance (p-value) of this interaction.

Calling statement
twowayrepran c1 c2 c3 ;
nran k1 (999) ;
listdata c1-c4 ;
ssquares c1 ;
fvalues c1.

Input
C1 Data. A column containing only numeric values.
C2 Group. A column containing only numeric values. The number of distinct numeric values used should be equal to the number of groups, with each value denoting a particular group.
C3 Block. A column containing only numeric values. The number of distinct numeric values used should be equal to the number of blocks, with each value denoting a particular block.

Subcommands
nran     Number of randomizations.
listdata Specify four columns in which to store the sorted data. Within the macro the data are sorted and the group markers are changed to consecutive integers, and this may make output difficult to interpret. In order to see which group is which, output the sorted data to "listdata" and compare this against the original dataset. The fourth column contains a marker for group * block combinations, so that any individuals which are in the same group and block have the same value in the fourth column.
ssquares Specify a column in which to store sums of squares for group effect.
fvalues  Specify a column in which to store simulated F-ratios for group effect.

Output
Identical to the output from the standard Minitab command "twoway", with the addition of randomization p-values for group effect obtained using two different test-statistics (mean square and F-ratio).

Speed of macro : MODERATE

Other macros
This macro is suitable only if there is replication (i.e. if there is more than one observation for each factor-by-factor combination); if replication is not present, then you should use TWOWAYRAN.
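The restricted randomization used by TWOWAYRAN and TWOWAYREPRAN – shuffling group labels only within each block, leaving the block structure fixed – is easy to prototype outside Minitab. The following Python sketch is an illustrative re-implementation under our own function names, not the macro itself; it assumes a balanced design with a single observation per group-by-block cell (the TWOWAYRAN situation). On the ORTHOPTERA worked example it reproduces the observed F-ratio for the size effect.

```python
import random
from statistics import mean

def f_group(data, group, block):
    # F-ratio for the group effect in a balanced two-way ANOVA without an
    # interaction term (one observation per group-by-block cell).
    groups, blocks = sorted(set(group)), sorted(set(block))
    g, b = len(groups), len(blocks)
    grand = mean(data)
    ss_total = sum((y - grand) ** 2 for y in data)
    ss_group = b * sum((mean(y for y, gv in zip(data, group) if gv == lv) - grand) ** 2
                       for lv in groups)
    ss_block = g * sum((mean(y for y, bv in zip(data, block) if bv == lv) - grand) ** 2
                       for lv in blocks)
    ss_error = ss_total - ss_group - ss_block
    return (ss_group / (g - 1)) / (ss_error / ((g - 1) * (b - 1)))

def twoway_randomization_p(data, group, block, nran=999, seed=1):
    # Restricted randomization: shuffle group labels within each block only.
    rng = random.Random(seed)
    f_obs = f_group(data, group, block)
    count = 0
    for _ in range(nran):
        g_star = list(group)
        for lv in set(block):
            idx = [i for i, bv in enumerate(block) if bv == lv]
            labels = [g_star[i] for i in idx]
            rng.shuffle(labels)
            for i, lab in zip(idx, labels):
                g_star[i] = lab
        if f_group(data, g_star, block) >= f_obs:
            count += 1
    return (count + 1) / (nran + 1)

# ORTHOPTERA data from the worked example
data  = [190, 0, 52, 50, 10, 110, 8, 1212]
size  = [1, 1, 1, 1, 2, 2, 2, 2]
month = [1, 2, 3, 4, 1, 2, 3, 4]
print(round(f_group(data, size, month), 2))  # observed F for the size effect (0.73 in the worked example)
print(twoway_randomization_p(data, size, month))
```

Note that with two groups and four blocks there are only 2^4 = 16 distinct restricted randomizations, so for very small designs the randomization p-value is quite coarse.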
Standard procedures
twoway c1 c2 c3.
This performs a two-way analysis of variance. The response variable is provided in c1; the factor levels corresponding to each data point are provided in c2 and c3.

Null hypothesis : The means of the response variable are constant across groups.

Test-statistic : The F-ratio for the group effect in a two-way ANOVA.

Randomization procedure : We use a restricted randomization; we fix the allocation of data to blocks (and fix the number of data points in each group within each block), and randomize the allocation of data to groups within each block.

References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 7).

WORKED EXAMPLE FOR TWOWAYREPRAN

Name of dataset TWOWAY

Description
We return again to the data of Powell & Russell, except that data is now also provided for adult females. Data are now classified according to two factors, month and size morph, and Manly (1997) analyses this as a two-way analysis of variance. In this case, data is available for each individual, so there is also replication.

Source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Data
Number of observations = 24
Number of variables = 3
For each observation the data value, group (size morph) and month are shown.

  Data  Group  Month
    13      1      1
   242      1      1
   105      1      1
    21      2      1
     7      2      1
   182      2      1
     8      1      2
    59      1      2
    20      1      2
    24      2      2
   312      2      2
    68      2      2
   515      1      3
   488      1      3
    88      1      3
   460      2      3
  1223      2      3
   990      2      3
    18      1      4
    44      1      4
    21      1      4
   140      2      4
    40      2      4
    27      2      4

Worksheet
C1 Data
C2 Size morph
C3 Month

Aim of analysis
To investigate whether month and size morph have an impact upon stomach biomass.

Standard procedure
MTB > Retrieve "N:\resampling\Examples\Twoway.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Twoway.MTW
# Worksheet was saved on 20/07/01 11:49:31

Results for: Twoway.MTW

MTB > Twoway c1 c2 c3;
SUBC> Means c2 c3.
Randomization procedure
MTB > % N:\resampling\library\twowayrepran c1 c2 c3 ;
SUBC> nran 999 ;
SUBC> listdata c5-c8 ;
SUBC> ssquares c10 ;
SUBC> fvalues c12.

Executing from file: N:\resampling\library\twowayrepran.MAC

STANDARD TWO-WAY ANOVA ANALYSIS

Two-way ANOVA: Value versus Group, Month

Analysis of Variance for Value
Source       DF       SS      MS      F      P
Group         1   146172  146172   4.47  0.051
Month         3  1379495  459832  14.06  0.000
Interaction   3   294009   98003   3.00  0.062
Error        16   523222   32701
Total        23  2342899

[Individual 95% CI plots omitted; the means were Group 1: 135, Group 2: 291, and Month 1: 95, Month 2: 82, Month 3: 627, Month 4: 48.]

RANDOMIZATION P-VALUE (FOR GROUP EFFECT)

Method used: restricted randomization (randomization within blocks)

Data Display (WRITE)
Number of randomizations                                   999
P-value for group effects (test-statistic = F-ratio)       0.0940
P-value for group effects (test-statistic = Mean square)   0.0810

Modified worksheet
C5  A column containing sorted data (sorted by group and block)
C6  A column containing re-numbered group markers
C7  A column containing re-numbered block markers
C8  A column containing markers for new group by block combinations
C10 A column containing 999 sums of squares for group effect, one for each randomized dataset
C12 A column containing 999 F-ratios for group effect, one for each randomized dataset

Discussion
There is a suggestion from both standard and randomization methods that group (size morph) has an effect. However, p-values are 0.051 by standard methods and 0.094 by randomization, so the evidence for a group effect is not strong.

LEVENERAN

This macro is designed to test whether the variances of data in different groups are equal, using a randomization version of Levene's test.
Calling statement
leveneran c1 c2 ;
nran k1 (4999) ;
fvalues c1 ;
modified c1 ;
usemean k1 (0).

Input
C1 is a numeric column containing the observed data for all groups.
C2 is a numeric column, of the same length as c1, containing group labels.
Missing data is allowed. If data is missing in either the 'data' or 'group' column for any particular individual, then that individual is excluded from the analysis.

Subcommands
nran     Number of randomizations.
fvalues  Specify a column in which to store simulated F-values from the ANOVA procedure.
modified Specify a column in which to store the modified data, i.e. the absolute differences between the data and the relevant group medians (or means).
usemean  An option to use group means rather than medians in the construction of the modified data on which the ANOVA is performed. If usemean = 1, group means are used. For any other value, and by default, group medians are used.

Output
Individual group means, medians and sample standard deviations
Standard ANOVA output
Randomization p-values for the F-ratio in the ANOVA

Speed of macro : FAST

Standard procedure
% vartest c1 c2.
This tests for equal variances amongst different groups. The data is given in c1, whilst group labels are given in c2. The output reports the findings of a number of tests for equal variance, including Levene's test.

References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapters 6 and 7).

Null hypothesis : The variance is constant across all groups (i.e. σ1² = σ2² = … = σg²).

Test-statistic : We calculate the absolute differences between the data and the relevant individual group medians, and then perform a one-way ANOVA upon these differences. The F-ratio obtained from this ANOVA is our test-statistic. By subtracting the group medians from the data, we remove the effects of differences in means between groups, so that we can attribute any remaining variation in the absolute differences to differences in variability between groups.
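To make this construction concrete, here is a hedged Python sketch (our own function names, not the LEVENERAN macro itself): it forms the modified data |y - group median| (or group mean, mirroring the usemean option), computes the one-way ANOVA F-ratio on that modified data, and obtains a p-value by randomly re-allocating group labels to data values, as the macro does.

```python
import random
from statistics import mean, median

def modified_f(data, group, use_mean=False):
    # One-way ANOVA F-ratio computed on the modified data
    # |data - group median| (or group mean if use_mean is True).
    levels = sorted(set(group))
    centre = mean if use_mean else median
    centres = {lv: centre([y for y, g in zip(data, group) if g == lv]) for lv in levels}
    z = [abs(y - centres[g]) for y, g in zip(data, group)]
    grand = mean(z)
    zb = {lv: [v for v, g in zip(z, group) if g == lv] for lv in levels}
    ss_between = sum(len(zb[lv]) * (mean(zb[lv]) - grand) ** 2 for lv in levels)
    ss_within = sum(sum((v - mean(zb[lv])) ** 2 for v in zb[lv]) for lv in levels)
    df1, df2 = len(levels) - 1, len(z) - len(levels)
    return (ss_between / df1) / (ss_within / df2)

def levene_randomization_p(data, group, nran=1999, seed=7):
    # Randomize the allocation of group labels to data values.
    rng = random.Random(seed)
    f_obs = modified_f(data, group)
    g_star = list(group)
    count = 0
    for _ in range(nran):
        rng.shuffle(g_star)
        if modified_f(data, g_star) >= f_obs:
            count += 1
    return (count + 1) / (nran + 1)
```

Note that the modified data are recomputed for every randomized allocation (the group medians change once the labels are shuffled); shuffling the labels preserves the group sizes, exactly as in the macro.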
Randomization : We randomize the allocation of group labels to data values because, as for the standard one-way ANOVA, the null hypothesis implies that the allocation of group labels should be independent of the data values.

WORKED EXAMPLE FOR LEVENERAN

Name of dataset FERNBIRDS

Description
The data, from a study by Harris on the selection of nest sites by the fernbird Bowdleria puncta, compares the perimeters of vegetation clumps within a region. 24 of the clumps within the data were selected by fernbirds as nest sites; the remaining 25 clumps were selected at random from the same study region. Manly (1997) applies Levene's test to these data.

Our source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London, pp. 228-231.

Original source
HARRIS, W.F. (1986) The breeding ecology of the South Island Fernbird in Otago Wetlands, PhD Thesis, University of Otago, Dunedin, New Zealand.

Data
Number of observations = 49
Number of variables = 2

Perimeters of the 24 clumps selected as nest sites (group 1):
8.90 4.34 2.30 5.16 2.92 3.30 3.17 4.81 2.40 3.74 4.86 2.88
4.90 4.65 4.02 4.54 3.22 3.08 4.43 3.48 4.50 2.96 5.25 3.07

Perimeters of the 25 randomly chosen clumps (group 2):
3.17 3.23 2.44 1.56 2.28 3.16 2.78 3.07 3.84 3.33 2.80 2.92 4.40
3.86 3.48 2.36 3.08 5.07 2.02 1.81 2.05 1.74 2.85 3.64 2.40

Aims of analysis
To investigate whether variability in perimeter lengths is the same for the nest sites as for the randomly chosen sites.

Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Fernbirds.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Fernbirds.MTW
# Worksheet was saved on 27/07/01 17:15:50

Results for: Fernbirds.MTW

MTB > % N:\resampling\library\leveneran c1 c2 ;
SUBC> nran 1999 ;
SUBC> fvalues c4 ;
SUBC> modified c6 ;
SUBC> usemean 0.

Executing from file: N:\resampling\library\leveneran.MAC

LEVENE TEST

Data Display
> Number of groups 2

> Statistics for each group

Data Display
Row  n obs  grp mean  grp median  grp stdev
  1     24   4.03667        3.88    1.36963
  2     25   2.93360        2.92    0.84426

> ANOVA for *modified* data
Modified data is formed by subtracting from each value the median value for the group of which it is a part

One-way ANOVA: modified data versus group

Analysis of Variance for modified
Source  DF      SS     MS     F      P
group    1   1.447  1.447  2.55  0.117
Error   47  26.615  0.566
Total   48  28.062

Level   N    Mean   StDev
1      24  0.9933  0.9338
2      25  0.6496  0.5229
Pooled StDev = 0.7525
[Individual 95% CI plots omitted.]

> Randomization p-value

Data Display (WRITE)
Number of randomizations used to compute p-value   1999
Randomization p-value                              0.1145

Modified worksheet
C4 A column containing 1999 F-ratios for group effect on the modified data, one for each randomized dataset
C6 A column containing 49 values, the absolute values of (data - group median)

Discussion
The p-value of 0.115 corresponds closely to that obtained using an F-distribution approximation (p-value = 0.12, see Manly, 1997). Using either method, there is no real evidence for a difference in variances between groups.

4 REGRESSION

Overview

Simple linear regression
REGRESSSIMRAN performs a randomization test for the slope in the regression of a response upon a single predictor.

Multiple regression
REGRESSOBSRAN tests for the significance of parameters in a multiple regression, with inference based upon randomization of observations.
REGRESSRESRAN tests for the significance of parameters in a multiple regression, with inference based upon randomization of residuals.
REGRESSBOOT tests for the significance of parameters in a multiple regression, and computes confidence intervals about parameters, using inference based upon the bootstrapping of residuals.

Should we resample residuals or observations ?

In the previous three sections, it has been relatively clear which quantity should be bootstrapped or randomized – frequently, we simply bootstrap/randomize the data itself. In multiple regression, however, it becomes less clear which quantity should be randomized or bootstrapped. There are two basic alternatives:
- Resample from the data itself – this is known as resampling cases.
- Fit the regression model, and resample from the residuals for the fitted model.

Randomization methods
In the case of randomization, either method is reasonably straightforward. We can resample cases by randomizing the allocation of response variable values to individuals, whilst keeping the values of all of the predictors fixed. We resample residuals by fitting a regression model containing all of the predictors to the observed data, bias-correcting the residuals (so that they have zero mean), standardizing the residuals so that they have constant variance (there is an option to use the raw residuals, or to use deletion residuals), and then randomizing the allocation of residuals to individuals. The randomized residuals are then added to the fitted values for each individual, to create simulated values. Under the null hypothesis that the fitted regression model is true, the allocation of these simulated values to observations should be random (because the distribution of residuals should then be the same as the error distribution associated with the model).
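The residual-randomization idea can be sketched in a few lines of Python. This is an illustration under our own function names, not one of the Minitab macros; for brevity it uses a single predictor and raw residuals, and it compares absolute t-statistics (a two-sided test), whereas the macros handle multiple predictors and form two-sided p-values by doubling the smaller one-sided value. Run on the HEXOKINASE data used later in this section, it reproduces the reported slope (29.153) and standard error (6.035).

```python
import random
from statistics import mean

def fit_slope(x, y):
    # Least-squares fit of y = a + b*x: returns the slope b, its standard
    # error, and the raw residuals (which already have zero mean).
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    a = ybar - b * xbar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (len(x) - 2)
    return b, (s2 / sxx) ** 0.5, resid

def residual_randomization_p(x, y, nran=999, seed=3):
    # Randomization of residuals: shuffle the raw residuals, regress the
    # shuffled residuals directly upon the predictor (the simplification
    # of the Ter Braak procedure described later in this chapter), and
    # compare |t| with the observed |t| (two-sided).
    rng = random.Random(seed)
    b, se, resid = fit_slope(x, y)
    t_obs = abs(b / se)
    r_star = list(resid)
    count = 0
    for _ in range(nran):
        rng.shuffle(r_star)
        b_s, se_s, _ = fit_slope(x, r_star)
        if abs(b_s / se_s) >= t_obs:
            count += 1
    return (count + 1) / (nran + 1)

# HEXOKINASE data (REGRESSSIMRAN worked example)
invalt = [2.00, 1.25, 1.75, 1.82, 2.63, 1.08, 2.08, 1.59, 0.67,
          0.57, 0.50, 0.24, 0.40, 0.50, 0.15, 0.13, 0.11, 0.10]
hk = [98, 36, 72, 67, 82, 72, 65, 1, 40, 39, 9, 19, 42, 37, 16, 4, 1, 4]

b, se, _ = fit_slope(invalt, hk)
print(round(b, 3), round(se, 3))  # estimated slope and its standard error
print(residual_randomization_p(invalt, hk))
```

Resampling cases would instead shuffle the y values themselves while holding the x values fixed; with a single predictor the two approaches test the same hypothesis, and the distinction only matters once there are several predictors.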
The advantage of randomizing residuals is that this allows us to assess the significance of the effect of any given predictor, conditional upon the effects of the remaining predictors – this is not possible when we randomize cases, because in that case we simultaneously assess the effects of all of the predictors. The first problem with randomizing residuals is that the method is more model-dependent than randomizing cases, since we must assume that the distribution of residuals mirrors the error distribution within the model. Any inadequacy in the fitted model is likely to lead to problems with the method. The second problem is that the method may lead to the production of implausible datasets (see Manly, 1997). In many circumstances, the two methods can be expected to give similar results.

Bootstrapping methods
In the case of bootstrapping, we have only produced a macro for bootstrapping residuals, although it is possible to bootstrap cases instead. Bootstrapping residuals has the advantage that the same simulated datasets can be used to create confidence intervals for individual parameters; this is not the case if we bootstrap cases.

REGRESSSIMRAN

The macro is designed to assess, using randomization, the significance of the slope parameter in a simple linear regression of a response variable upon a single predictor.

Calling statement
regresssimran c1 c2 ;
nran k1 (999) ;
fits c1 ;
residuals c1 ;
correlations c1 ;
coefficients c1-c2 ;
tstatistics c1.

Input
C1 Response variable : a column containing only numeric values.
C2 Predictor variable : a column containing only numeric values.
C1 and C2 must have the same length.
Missing values : Allowed. If the value for any variable is missing for an observation, then that observation is excluded from the analysis.

Subcommands
nran      Number of randomizations used.
fits      Specify a column in which to store fitted values.
residuals Specify a column in which to store raw residuals.
correlations Specify a column in which to store simulated correlation coefficients.
coefficients Specify two columns in which to store simulated parameter estimates (intercept in 1st column, slope in 2nd column).
tstatistics  Specify a column in which to store simulated t-statistics for the slope.

Output
Means for response variable and predictor
Estimated regression slope and intercept
Standard error for estimated regression slope
Correlation coefficient between response variable and predictor
T-statistic for estimated regression slope
One and two-sided randomization p-values for slope

Technical details
For the regression model

  Y = α + βx + ε

we test the null hypothesis H0: the slope is equal to zero (β = 0) using the t-statistic corresponding to the slope parameter,

  t = b / SE[b]

where b is the estimate of β. We randomize the allocation of response variable values to predictor values, since under the null hypothesis the response variable is independent of the predictor.

References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 8).

Standard procedures
regress c1 1 c2;
constant.
This regresses the response variable c1 upon the predictor c2. The "1" indicates that there is only one predictor variable; it is usual to fit a constant as well as a slope in regression, unless there is some reason to believe that the regression must pass through the origin.

WORKED EXAMPLE FOR REGRESSSIMRAN

Name of dataset HEXOKINASE

Description
The data is taken from part of a study by McKechnie, concerning electrophoretic frequencies of the butterfly Euphydryas editha. For each of 18 units (corresponding either to colonies, or to sets of colonies), the reciprocal of altitude (originally measured in feet * 10^3) is recorded, together with the percentage frequency of hexokinase 1.00 mobility genes from electrophoresis of samples of Euphydryas editha. We label these variables "invalt" and "hk" respectively.

Our source
MANLY, F.J.
(1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source
McKECHNIE, S.W., EHRLICH, P.R. & WHITE, R.R. (1975), Population genetics of Euphydryas butterflies. I. Genetic variation and the neutrality hypothesis, Genetics, 81, pp. 571-594.

Data
Number of observations = 18
Number of variables = 2

    HK  INVALT
 98.00    2.00
 36.00    1.25
 72.00    1.75
 67.00    1.82
 82.00    2.63
 72.00    1.08
 65.00    2.08
  1.00    1.59
 40.00    0.67
 39.00    0.57
  9.00    0.50
 19.00    0.24
 42.00    0.40
 37.00    0.50
 16.00    0.15
  4.00    0.13
  1.00    0.11
  4.00    0.10

Minitab worksheet
C1 HK measurements
C2 INVALT measurements

Aims of analysis
To investigate, using a linear regression model, whether INVALT has an effect upon the value of HK.

Standard procedure
MTB > Regress c1 1 c2;
SUBC> Constant;
SUBC> Brief 2.

Regression Analysis: hk versus invalt

The regression equation is
hk = 10.7 + 29.2 invalt

Predictor    Coef  SE Coef     T      P
Constant   10.654    7.585  1.40  0.179
invalt     29.153    6.035  4.83  0.000

S = 20.27   R-Sq = 59.3%   R-Sq(adj) = 56.8%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       1   9585.3  9585.3  23.33  0.000
Residual Error  16   6572.5   410.8
Total           17  16157.8

Unusual Observations
Obs  invalt    hk    Fit  SE Fit  Residual  St Resid
  8    1.59  1.00  57.01    6.05    -56.01    -2.90R

R denotes an observation with a large standardized residual

Resampling procedure
MTB > Retrieve "N:\resampling\Examples\Hexokinase.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Hexokinase.MTW
# Worksheet was saved on 06/07/01 14:15:38

Results for: Hexokinase.MTW

MTB > % N:\resampling\library\regresssimran c1 c2 ;
SUBC> nran 999 ;
SUBC> fits c4 ;
SUBC> residuals c5 ;
SUBC> correlations c7 ;
SUBC> coefficients c9 c10 ;
SUBC> tstatistics c12.
Executing from file: N:\resampling\library\regresssimran.MAC

Data Display (WRITE)
Number of observations                           18
Mean of response variable                        39.11
Mean of predictor                                0.98
Correlation coefficient                          0.770
Estimated intercept                              10.654
Estimated slope                                  29.153
Standard error on estimated slope                6.035
T-statistic for significance of slope            4.83
One sided randomization p-value, H1: -ve slope   1.0000
One sided randomization p-value, H1: +ve slope   0.0010
Two sided randomization p-value                  0.0020

Modified worksheet
C4  A column containing 18 fitted values for the regression on the observed data
C5  A column containing 18 raw residuals for the regression on the observed data
C7  A column containing 999 correlation coefficients, one for each randomized dataset
C9  A column containing 999 intercept parameter estimates, one for each randomized dataset
C10 A column containing 999 slope parameter estimates, one for each randomized dataset
C12 A column containing 999 slope parameter t-statistics, one for each randomized dataset

Discussion
There is very strong evidence (standard p-value = 0.000, whilst the two-sided randomization p-value = 0.002 is the smallest possible value with 999 randomizations) that INVALT does have an effect upon HK. We see that INVALT actually has a positive effect upon HK frequency (implying that altitude has a negative effect upon HK frequency).

REGRESSOBSRAN

To fit a multiple regression model. The significance of the parameter for each predictor is computed, along with the overall significance of the regression. P-values are obtained by randomization of observations.

Other macros
REGRESSSIMRAN should be used if there is a single predictor.
REGRESSRESRAN is the same, except that randomization is of residuals, not observations.
REGRESSBOOT performs bootstrap multiple regression, with bootstrapping of residuals.

Calling statement
regressobsran c1 c2-cN ;
nran k1 ;
tvalues m1.

Input
C1      Response variable. A column containing numeric values.
C2 - CN Predictor variables.
Columns containing numeric values. All N columns must have the same length.
Missing values : Allowed. If the value for any variable is missing for an observation, then that observation is excluded from the analysis.

Subcommands
nran    Number of randomizations.
tvalues Specify a matrix within which to store simulated t-values for the coefficient of each predictor.

Output
For the coefficient associated with each predictor, we present
Estimated coefficient, together with standard error
T-statistic, plus p-value using normal theory
Randomization p-values
Randomization p-values are based upon T-statistics, and two-sided values are found by doubling the smaller one-sided value. In addition, we present an overall F-ratio for the regression, with p-values from both normal theory and randomization.

Technical details
We randomize the allocation of the response variable to individuals.

References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 8).

Standard procedures
For example,
regress c1 8 c2-c9;
constant.
This regresses the response variable c1 upon the predictors c2-c9. The "8" indicates the number of predictors. An intercept term is also included in the regression.

WORKED EXAMPLE FOR REGRESSOBSRAN

Name of dataset OREGON

Our source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source
McKECHNIE, S.W., EHRLICH, P.R. & WHITE, R.R. (1975), Population genetics of Euphydryas butterflies. I. Genetic variation and the neutrality hypothesis, Genetics, 81, pp. 571-594.
Data
Number of observations = 18
Number of variables = 7

Colony  Altitude  Invalt  Precip  Tmax  Tmin  Hk
     1      0.50    2.00      58    97    16  98
     2      0.80    1.25      20    92    32  36
     3      0.57    1.75      28    98    26  72
     4      0.55    1.82      28    98    26  67
     5      0.38    2.63      15    99    28  82
     6      0.93    1.08      21    99    28  72
     7      0.48    2.08      24   101    27  65
     8      0.63    1.59      10   101    27   1
     9      1.50    0.67      19    99    23  40
    10      1.75    0.57      22   101    27  39
    11      2.00    0.50      58   100    18   9
    12      4.20    0.24      36    95    13  19
    13      2.50    0.40      34   102    16  42
    14      2.00    0.50      21   105    20  37
    15      6.50    0.15      40    83     0  16
    16      7.85    0.13      42    84     5   4
    17      8.95    0.11      57    79    -7   1
    18     10.50    0.10      50    81   -12   4

Aims of analysis
To investigate whether altitude (INVALT) and climatic variables (Precip, Tmax, Tmin) have an impact upon electrophoretic frequency (HK), and to investigate the nature of any possible effects.

Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Oregon.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Oregon.MTW
# Worksheet was saved on 14/08/01 11:36:50

Results for: Oregon.MTW

MTB > % N:\resampling\library\regressobsran c7 c3-c6 ;
SUBC> nran 999 ;
SUBC> tvalues m1.
Executing from file: N:\resampling\library\regressobsran.MAC

INDIVIDUAL REGRESSION COEFFICIENTS

* KEY *
> row = predictor
> coef = estimated coefficient
> SE coef = standard error about coefficient
> T = t-statistic for coefficient
> normal p = p-value under assumption of normality
> 1s- ran p = one-sided randomization p-value, H1: +ve coefficient
> 1s+ ran p = one-sided randomization p-value, H1: -ve coefficient
> 2s ran p = two-sided randomization p-value

* ESTIMATES *

Data Display
Row     coef  SE coef        T  normal p  1s- ran p  1s+ ran p  2s ran p
  1  26.1237  8.64504  3.02182  0.009818      0.005      0.995     0.010
  2   0.4720  0.49554  0.95247  0.358235      0.181      0.819     0.362
  3   0.8668  1.17253  0.73923  0.472904      0.230      0.770     0.460
  4   0.2503  1.01945  0.24555  0.809863      0.390      0.610     0.780

OVERALL SIGNIFICANCE OF THE REGRESSION

Data Display (WRITE)
Overall F-ratio for regression   5.96
P-value using normality          0.0060
P-value using randomization      0.0100

Modified worksheet
M1 A 999 * 4 matrix, containing t-statistics for the parameter estimates for each of the four predictor effects (column 1 for predictor 1, etc.).

Discussion
There is strong evidence that the overall regression is significant (p-value = 0.006 by normal theory, 0.010 by randomization), and that INVALT has a significant (positive) impact upon HK (p-value = 0.010 by normal theory or randomization). There is no evidence that the remaining three variables have any significant impact upon the response. The p-values obtained by randomization and normal theory are very similar in all cases.

REGRESSRESRAN

To fit a multiple regression model. The significance of the parameter for each predictor is computed, along with the overall significance of the regression. P-values are obtained by randomization of residuals.

Other macros
REGRESSSIMRAN should be used if there is a single predictor.
REGRESSOBSRAN is the same, except that randomization is of observations, not residuals.
REGRESSBOOT performs bootstrap multiple regression, with bootstrapping of residuals.
Calling statement
regressresran c1 c2-cN ;
nran k1 (999) ;
residuals k1 (2) ;
tstatistics m1.

Input
C1      Response variable. A column containing numeric values.
C2 - CN Predictor variables. Columns containing numeric values.
All N columns must have the same length.
Missing values : Allowed. If the value for any variable is missing for an observation, then that observation is excluded from the analysis.

Subcommands
nran        Number of randomizations.
residuals   Type of residual : 1 = raw residuals, 2 = modified residuals (default), 3 = deletion residuals.
tstatistics Specify a matrix within which to store simulated t-values for the coefficient of each predictor.

Output
For the coefficient associated with each predictor, we present
Estimated coefficients, together with standard errors
Corresponding F-ratios, with p-values using normal theory and randomization
T-statistics, with p-values using randomization
The randomization p-values computed using F-ratios are naturally one-sided. In the case of T-statistics, two-sided p-values are calculated by doubling the smaller one-sided p-value. P-values obtained from the two methods should be very similar, since the F-ratios can be found by squaring the T-statistics. In addition, we present an overall F-ratio for the regression, with p-values from both normal theory and randomization.

References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 8).
TER BRAAK, C.J.F. (1992), Permutation versus bootstrap significance tests in multiple regression and ANOVA, in Bootstrapping and Related Techniques (ed. K.H. Jockel), Springer-Verlag, Berlin, pp. 79-86.

Technical details
We randomize residuals, according to the procedure based on equation 3.3 of Ter Braak (1992).
In fact, we use a simplification of the algorithm (described in Manly, 1997) in which we regress directly upon the residuals, rather than upon the fitted values:

Algorithm
Assume that we have data on a response variable, y, and p predictors x1,…,xp. Assume that the parameter for predictor xi is βi.
Stage 1 : Regress y upon x1,…,xp. Hence obtain parameter estimates b1,…,bp and standard errors SE[b1],…,SE[bp]. Further obtain t-statistics t1,…,tp, where ti = bi / SE[bi]. Also obtain fitted values, yFIT, and residuals r = y - yFIT.
Stage 2 : For j = 1,…,d (where d is the number of randomizations), randomize the ordering of the residuals, so obtaining randomized residuals rj*. Then regress the randomized residuals, rj*, upon the predictors x1,…,xp. Hence obtain parameter estimates and standard errors, and so t-values t1j*,…,tpj*.
Stage 3 : For each predictor, i, compare the observed t-statistic ti to the statistics based upon randomization, ti1*,…,tid*.
The algorithm is straightforward to implement, but more complicated to justify (see Ter Braak, 1992).

Standard procedures
For example,
regress c1 4 c2-c5;
constant.
This regresses the response variable c1 upon the predictors c2-c5. The "4" indicates the number of predictors. An intercept term is also included in the regression.

WORKED EXAMPLE FOR REGRESSRESRAN

Name of dataset ARTIFICIAL

Description
We use the artificial data created in Manly (1997) (and based on similar data generated by Kennedy and Cade). The data are artificial; their construction is discussed at length by Manly (1997). The purpose of the data is to demonstrate that computationally intensive methods can, in certain circumstances, produce results which are very substantially different from the results of standard methods.

Source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.
Data
Number of observations = 20
Number of variables = 4

      y     x1    x2     x3
  99.00  38.09  2.09  18.59
  33.00   1.28  0.87   1.86
   5.94  25.62  2.83   4.15
   1.97   1.50  0.42   1.78
 103.45   7.54  2.80   3.15
   2.65   2.33  1.93   2.40
   8.33   2.51  0.82   1.20
   2.72   0.82  0.31   0.79
   3.83  10.13  1.20   1.89
   0.70   2.80  1.17   1.51
   2.82   4.66  1.79   2.43
   0.94   1.55  1.53   2.28
   4.19   4.15  2.35   3.51
   0.76   0.76  0.19   0.60
   2.86  19.97  0.72   2.12
   0.95   0.46  2.74   3.51
   4.18   4.23  2.29   2.67
   1.39   0.70  2.93   3.16
   3.47   8.83  2.02   2.37
   0.69   1.98  1.94   2.93

Aims of analysis
To investigate whether predictors x1 and x2 have an impact upon response y, and to investigate the nature of any possible effects.

Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Artificial.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Artificial.MTW
# Worksheet was saved on 08/08/01 11:16:26

Results for: Artificial.MTW

MTB > % N:\resampling\library\regressresran c1 c2-c3;
SUBC> nran 999 ;
SUBC> residuals 1 ;
SUBC> tstatistics m1.

Executing from file: N:\resampling\library\regressresran.MAC

Multiple regression, with randomization of raw residuals

F-ratios for individual coefficients

* KEY *
> row = predictor
> coef = estimated coefficient
> SE coef = standard error about coefficient
> F = F-ratio for coefficient
> normal p = p-value under assumption of normality
> ran p = randomization p-value (based on F-ratio)

* ESTIMATES *

Data Display
Row     coef  SE coef        F  normal p  ran p
  1  2.66985  0.70490  14.3454  0.001470  0.041
  2  9.62936  5.63366   2.9216  0.105594  0.101

T-values for individual coefficients

* KEY *
> row = predictor
> T = T-value for estimated coefficient
> 1s- ran p = one-sided randomization p-value based on T, H1: +ve coef.
> 1s+ ran p = one-sided randomization p-value based on T, H1: -ve coef.
> 2s ran p = two-sided randomization p-value based on T

* ESTIMATES *

Data Display
Row        T  1s- ran p  1s+ ran p  2s ran p
  1  3.78753      0.041      0.960     0.082
  2  1.70925      0.049      0.952     0.098

Overall regression

Data Display (WRITE)
Overall F-ratio for regression   9.42
P-value using normality          0.0018
P-value using randomization      0.0410

Modified worksheet
M1 A 999 * 2 matrix, containing t-statistics for the parameter estimates for each of the two predictor effects (column 1 for predictor 1, etc.).

Discussion

[2-sided] P-value                     1st predictor  2nd predictor  Overall regression
Using normality                               0.001          0.106               0.002
Using randomization and t-statistics          0.082          0.098                  NA
Using randomization and F-ratios              0.041          0.101               0.041

We see that p-values for the 1st predictor and overall regression differ substantially between the methods, and that the conclusions drawn would also differ. The results also differ if we randomize cases instead of residuals. This highlights that standard and randomization methods do not always give the same answers.

Plot
[3-D scatterplot of the artificial dataset omitted: y (response, axis 0-100) plotted against predictors x1 (axis 0-30) and x2 (axis 0-3).]

REGRESSBOOT

To fit a multiple regression model. The significance of the parameter for each predictor is computed, along with the overall significance of the regression. P-values are obtained by bootstrapping of residuals.

Calling statement
regressboot c1 c2-cN ;
nboot k1 (2000) ;
residuals k1 (2) ;
siglev k1 (95) ;
resco m1 m2 ;
fittedco m3 ;
bcadetails c1-c4.

Input
C1      Response variable. A column containing numeric values.
C2 - CN Predictor variables. Columns containing numeric values.
All N columns must have the same length.
Missing values : Allowed. If the value for any variable is missing for an observation, then that observation is excluded from the analysis.
Subcommands
nboot       Number of bootstrap resamples.
residuals   Type of residual : 1 = Raw residuals ; 2 = Modified residuals (default) ; 3 = Deletion residuals.
siglev      Significance level for confidence intervals, in %.
resco       Store coefficients for regressions upon resampled residuals in m1, together with standard errors in m2.
fittedco    Store coefficients for regressions upon simulated data (fitted values + resampled residuals) in m3.
bcadetails  Store : Column 1 - estimated bias for each parameter ; Column 2 - estimated acceleration for each parameter ; Column 3 - rank value for lower BCa confidence limit ; Column 4 - rank value for upper BCa confidence limit.

Output
For the coefficient associated with each predictor, we present
  Estimated coefficient, together with standard error
  F-ratio, plus p-value using normal theory
  Randomization p-values based on the F-ratio
  T-statistics, with randomization p-values
In addition, we present an overall F-ratio for the regression, with p-values from both normal theory and randomization.

Other macros
REGRESSSIMRAN should be used if there is a single predictor.
REGRESSOBSRAN performs multiple regression, with significance determined by randomization of observations.
REGRESSRESRAN performs multiple regression, with significance determined by randomization of residuals.

Standard procedures
For example, regress c1 5 c2-c6 constant. This regresses the response variable c1 upon the predictors c2-c6. The "5" indicates the number of predictors. An intercept term is also included in the regression.

Technical details : hypothesis tests
We bootstrap residuals, according to the procedure based on equation 3.3 of Ter Braak (1992). In fact, we use a simplification of the algorithm (described in Manly, 1997) in which we regress directly upon the residuals, rather than upon the fitted values :

Algorithm
Assume that we have data on a response variable, y, and p predictors x1,…,xp. Assume that the regression parameter for predictor xi is βi.
Stage 1 : Regress y upon x1,…,xp. Hence obtain parameter estimates b1,…,bp and standard errors SE[b1],…,SE[bp]. Further obtain t-statistics t1,…,tp, where ti = bi / SE[bi]. Also obtain fitted values, yFIT, and residuals r = y - yFIT.

Stage 2 : For j = 1,…,d (where d is the number of bootstrap resamples), take a bootstrap sample, rj*, from the residuals. Then regress the bootstrapped residuals, rj*, upon the predictors x1,…,xp. Hence obtain parameter estimates and standard errors, and so t-values t1j*,…,tpj*.

Stage 3 : For each predictor, i, compare the observed t-statistic ti to the statistics based upon bootstrapping, ti1*,…,tid*, and so obtain p-values.

The algorithm is straightforward to implement, but more complicated to justify (see Ter Braak, 1992).

Technical details : confidence intervals
We use the same bootstrap samples as for the hypothesis tests, but perform a different regression. For j = 1,…,d, we regress simulated data, yj* = yFIT + rj*, upon predictors x1,…,xp. We use the subsequent distributions for each of the parameter estimates as the source of our confidence intervals, which are then computed as before.

References
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 8).
TER BRAAK, C.J.F. (1992) Permutation versus bootstrap significance tests in multiple regression and ANOVA, in Bootstrapping and Related Techniques (ed. K.H. Jockel), Springer-Verlag, Berlin, pp. 79-86.
DRAPER, N.R. & SMITH, H. (1998) Applied regression analysis (3rd edition), John Wiley & Sons, New York (Chapter 26).

WORKED EXAMPLE FOR REGRESSBOOT

Name of dataset
SWAVESEY

Description
A subset of data from a study carried out in Swavesey fens to investigate bird species diversity in relation to field boundary characteristics. The response variable, meanno, is the mean number of species recorded in a series of visits to each site.
The predictors are th, average tree height tn, number of trees hh, average hedge height hl, hedge length cw, average hedge crown width bw, average hedge base width dd, average ditch depth dw, average ditch width woodyno, number of woody species herbno, number of herb species Source Our own unpublished data (Centre for Ecology and Hydrology). Data Number of observations = 44 Number of variables = 11 meanno 4.25 6.50 4.50 4.25 1.75 2.75 1.75 8.50 1.50 3.00 4.25 1.00 0.75 1.75 0.75 2.75 1.50 3.50 5.75 1.75 1.75 6.25 2.00 4.75 5.25 1.50 9.00 3.50 th 10.0 10.0 9.0 12.0 0.0 0.0 0.0 0.0 9.0 0.0 10.0 0.0 0.0 8.0 0.0 0.0 0.0 0.0 8.0 8.0 0.0 7.0 8.0 0.0 0.0 0.0 12.0 0.0 tn 2 10 9 2 0 0 0 0 2 0 6 0 0 5 0 0 0 0 10 4 0 4 1 0 0 0 20 0 hh 3.50000 6.00000 3.00000 6.00000 0.00000 3.00000 0.00000 7.00000 1.40000 3.00000 5.00000 0.00000 0.00000 3.50000 1.20000 2.50000 2.50000 4.50000 4.00000 3.00000 4.50000 4.00000 1.80000 3.00000 4.50000 1.80000 5.00000 4.50000 hl 180 190 150 160 0 20 0 180 196 2 196 0 0 140 200 194 196 198 180 176 190 180 200 80 180 190 80 200 cw 3.0000 5.5000 2.0000 5.0000 0.0000 3.0000 0.0000 12.0000 1.0000 3.0000 3.0000 0.0000 0.0000 2.5000 1.0000 1.7000 1.7000 3.5000 3.5000 2.5000 3.5000 3.0000 1.2000 3.0000 3.0000 1.0000 4.0000 4.0000 bw 2.5000 2.0000 1.0000 4.0000 0.0000 3.0000 0.0000 11.0000 1.0000 3.0000 1.5000 0.0000 0.0000 2.5000 0.8000 1.7000 1.7000 2.5000 2.0000 2.0000 3.0000 2.5000 1.0000 3.0000 1.5000 0.8000 1.5000 2.0000 dd 1.00000 2.20000 0.30000 2.50000 3.00000 1.50000 1.50000 1.50000 1.50000 1.00000 1.50000 1.20000 1.00000 0.70000 1.00000 1.50000 2.00000 2.00000 1.20000 1.20000 1.80000 0.20000 0.10000 0.50000 0.20000 0.20000 0.50000 1.50000 dw 2.5000 5.0000 3.0000 6.0000 10.0000 8.0000 8.0000 6.0000 4.0000 5.0000 4.0000 5.0000 4.5000 2.5000 2.0000 2.5000 5.0000 5.0000 4.0000 3.0000 4.0000 1.5000 1.0000 2.0000 1.5000 2.0000 5.0000 4.5000 woodyno 7 5 5 7 0 4 0 5 5 1 5 3 3 5 4 4 3 4 6 6 2 7 8 5 3 1 3 3 herbno 14 19 7 25 13 13 16 22 25 19 28 15 25 8 
15 14 23 20 17 22 24 13 28 15 8 4 2 8 85 4.25 2.00 2.00 0.50 2.25 0.25 2.00 1.75 3.75 3.75 2.75 3.50 0.50 1.00 5.50 10.00 0.0 0.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 5.5 9.0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 1 3 5.50000 1.10000 1.00000 0.00000 1.30000 1.40000 1.30000 1.30000 4.50000 4.00000 5.00000 5.00000 1.20000 1.20000 4.50000 4.50000 190 198 200 0 200 180 192 200 198 184 200 200 198 196 200 194 5.0000 1.0000 1.0000 0.0000 1.2000 1.0000 1.0000 1.0000 3.5000 3.0000 4.0000 4.0000 1.0000 1.0000 3.0000 3.5000 3.0000 1.0000 1.3000 0.0000 1.0000 1.5000 1.2000 1.5000 4.5000 3.0000 2.5000 2.0000 0.9000 1.0000 1.5000 2.5000 1.50000 1.50000 2.00000 2.50000 1.20000 2.00000 0.20000 2.00000 2.00000 1.70000 2.00000 2.00000 0.50000 0.60000 1.00000 0.20000 4.5000 4.5000 5.0000 6.0000 2.5000 4.5000 1.0000 4.0000 4.0000 4.0000 4.0000 4.0000 1.0000 1.2000 2.5000 1.5000 3 5 7 1 5 4 3 3 4 3 4 4 5 4 5 6 9 22 19 18 12 13 15 20 14 12 6 11 10 9 11 7 Aims of analysis To investigate the effect of predictors describing hedge and ditch characteristics (th,tn,hh,hl,cw,bw,dd,dw,woodyno,herbno) upon the number of bird species recorded, and to investigate the nature of any possible effects. Randomization procedure Welcome to Minitab, press F1 for help. Retrieving worksheet from file: N:\resampling\datamin\Swavesey.mtw # Worksheet was saved on 29/08/01 14:10:49 Results for: Swavesey.mtw MTB > % N:\resampling\library\regressboot c1 c2-c11 ; SUBC> nboot 1000 ; SUBC> residuals 1 ; SUBC> siglev 95 ; SUBC> resco m1 m2 ; SUBC> fittedco m3 ; SUBC> bcadetails c13-c16. 
Executing from file: N:\resampling\library\regressboot.MAC

Bootstrap significance tests and confidence intervals for multiple regression, with bootstrapping from modified residuals

Overall regression
Data Display (WRITE)
Overall F-ratio for regression 10.32
P-value using normality 0.0000
P-value using randomization 0.0010

Parameter estimates and F-ratios for individual coefficients
* KEY *
coef = estimated coefficient for this parameter
SE coef = standard error about estimated coefficient
F = F-ratio for this parameter
normal p = P-value corresponding to this F-ratio, using normality
ran p = Randomization p-value corresponding to this F-ratio

Data Display
Row      coef   SE coef        F  normal p     ran p
  1  0.060102  0.102011  0.34713  0.559754  0.569431
  2  0.093362  0.092542  1.01779  0.320385  0.315684
  3  0.358451  0.318912  1.26334  0.269131  0.280719
  4 -0.002020  0.005029  0.16144  0.690427  0.680320
  5  0.518894  0.444374  1.36352  0.251297  0.262737
  6 -0.107250  0.351562  0.09307  0.762231  0.778222
  7 -0.924962  0.583145  2.51591  0.122240  0.113886
  8  0.220561  0.267790  0.67837  0.416061  0.393606
  9  0.105074  0.188804  0.30972  0.581608  0.587413
 10 -0.049098  0.038267  1.64619  0.208416  0.200799

T-values for individual coefficients
* KEY *
t = t-statistic for this parameter
1s- ran p = One-sided randomization p-value for this t-statistic (H1: negative slope)
1s+ ran p = One-sided randomization p-value for this t-statistic (H1: positive slope)
2s ran p = Two-sided randomization p-value for this t-statistic

Data Display
Row         t  1s+ ran p  1s- ran p  2s ran p
  1   0.58918   0.268731   0.732268  0.537463
  2   1.00885   0.159840   0.841159  0.319680
  3   1.12398   0.148851   0.852148  0.297702
  4  -0.40179   0.667333   0.333666  0.667333
  5   1.16770   0.121878   0.879121  0.243756
  6  -0.30507   0.621379   0.379620  0.759241
  7  -1.58616   0.946054   0.054945  0.109890
  8   0.82363   0.201798   0.799201  0.403596
  9   0.55652   0.295704   0.705295  0.591409
 10  -1.28304   0.895105   0.105894  0.211788

Confidence intervals for individual coefficients
* KEY *
norm l = Lower confidence limit, using normal theory
norm u = Upper confidence limit, using normal theory
perc l = Lower confidence limit, using Efron percentile method
perc u = Upper confidence limit, using Efron percentile method
bca l = Lower confidence limit, using BCa percentile method
bca u = Upper confidence limit, using BCa percentile method

Data Display
Row    norm l   norm u    perc l   perc u     bca l    bca u
  1  -0.13983  0.26004  -0.14573  0.27974  -0.11468  0.31275
  2  -0.08802  0.27474  -0.08100  0.26928  -0.13135  0.23386
  3  -0.26660  0.98351  -0.27908  1.01148  -0.39776  0.91001
  4  -0.01188  0.00784  -0.01299  0.00774  -0.01466  0.00730
  5  -0.35206  1.38985  -0.39402  1.31184  -0.33113  1.49021
  6  -0.79630  0.58180  -0.76739  0.67198  -0.83608  0.60864
  7  -2.06790  0.21798  -2.08857  0.24681  -1.92206  0.39595
  8  -0.30430  0.74542  -0.31512  0.72322  -0.38128  0.66396
  9  -0.26498  0.47512  -0.26142  0.45915  -0.28554  0.45586
 10  -0.12410  0.02590  -0.12518  0.02523  -0.12730  0.02412

Modified worksheet
C13  A column of 10 values, containing estimated bias values for each parameter
C14  A column of 10 values, containing estimated acceleration values for each parameter
C15  A column of 10 values, containing rank values for the lower limits of BCa confidence intervals
C16  A column of 10 values, containing rank values for the upper limits of BCa confidence intervals
M1   A 1000*10 matrix. Each column contains the 1000 estimates of one parameter obtained when the bootstrapped residuals are regressed upon the predictors. These estimates are used, together with their standard errors, for the creation of p-values.
M2   A 1000*10 matrix. Contains standard errors about the estimates in M1.
M3   A 1000*10 matrix. Each column contains the 1000 estimates of one parameter obtained when the simulated data (original fitted values plus bootstrapped residuals) are regressed upon the predictors. These estimates are used for the creation of confidence intervals.
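The Efron percentile limits in output like this are simply quantiles of the bootstrap distribution of each coefficient (the contents of M3). A small Python sketch of the idea, ours rather than the macro's, using a single-predictor regression for brevity:

```python
import numpy as np

def boot_slopes(x, y, nboot=1000, seed=0):
    """Bootstrap the residuals of a straight-line fit and refit to
    simulated data y* = fitted values + resampled residuals."""
    rng = np.random.default_rng(seed)
    b1, b0 = np.polyfit(x, y, 1)              # slope, intercept
    fitted = b0 + b1 * x
    resid = y - fitted
    slopes = np.empty(nboot)
    for j in range(nboot):
        y_star = fitted + rng.choice(resid, size=len(resid), replace=True)
        slopes[j] = np.polyfit(x, y_star, 1)[0]
    return slopes

def percentile_ci(boot_estimates, siglev=95):
    """Efron percentile interval: quantiles of the bootstrap estimates."""
    alpha = (100 - siglev) / 200.0            # 0.025 for a 95% interval
    return tuple(np.quantile(boot_estimates, [alpha, 1 - alpha]))
```

The BCa limits reported alongside adjust these rank positions for bias and acceleration (the quantities stored in C13 and C14) rather than reading off the central quantiles directly.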
Discussion
The overall regression is very clearly significant (p-value = 0.000 using normal theory, 0.001 using randomization), but none of the individual predictors has a significant effect (two-sided p-values are greater than 0.1 in all cases, and using all methods). This apparent paradox results from the fact that many of the predictors are very highly correlated, so that there is no need to include all ten within the regression. The 2-sided p-values obtained using normal theory, using randomization with F-ratios, and using randomization with t-statistics are all very similar in all cases, and the confidence intervals also appear to be reasonably similar for the different methods. The Efron and standard intervals appear to be very similar, though the differences between each of these intervals and the BCa intervals are somewhat larger. The intervals all have similar lengths (average length of 0.938 for the standard interval, 0.948 for the Efron interval and 0.964 for the BCa interval).

5 TIME SERIES

Overview
ACFRAN tests for autocorrelation in a univariate time series
TRENDRAN tests for trend in a univariate time series

Comments
Time series analysis is a large, and often fairly complicated, branch of statistics. It is characterised by the fact that observations at a timepoint are usually dependent upon observations at previous timepoints. We provide two quick, straightforward macros which test the null hypothesis that the observed data are random against alternative hypotheses of short-term dependence (autocorrelation) and long-term dependence (trend).

ACFRAN

To test for the presence of serial correlation in a regular time series, using serial correlation coefficients and the Von Neumann ratio. Significance is determined using both normal approximations and randomization.

Calling statement
acfran c1 ;
  nran k1 ;
  nlag k1 (10).

Input
c1  A column of numeric data.
Missing values : Allowed. Observations with missing values are simply ignored.
Subcommands
nlag - the maximum number of lags which should be considered. Must be less than the number of observations.

Output
For each lag from 1 to nlag, we present
  Observed autocorrelation at that lag
  Standardised autocorrelation and approximate p-value using a normal approximation
  Randomization p-values for the autocorrelation coefficient
In addition, we present
  Observed Von Neumann ratio for the data
  Standardised VN ratio and approximate p-value using a normal approximation
  Randomization p-values for the VN ratio

Speed of macro : FAST

Notes
If the number of observations is less than 10, then the default lag automatically changes from 10 to one less than the number of available observations.

Reference : MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Null hypothesis : We test the null hypothesis that the series is random.

Alternative hypotheses :
For the kth serial correlation coefficient : There is autocorrelation at the kth lag (short-term dependence) within the series.
For the Von Neumann ratio : The series is a simple random walk.

Test-statistics : We use the kth sample serial correlations (for k = 1,…,nlag) as test-statistics. If the kth sample serial correlation is significantly different from zero then this provides evidence of autocorrelation at the kth lag.

In addition, we use the Von Neumann ratio as an overall test for the presence of autocorrelation. This ratio tests the null hypothesis that the series is random against the alternative hypothesis that it is a simple random walk. The Von Neumann ratio is of the form

  v = n * SUM[i=1 to n-1] (x_i - x_{i+1})^2 / ( (n-1) * SUM[i=1 to n] (x_i - xbar)^2 ),

where x_i is the ith data value and xbar is the sample mean.

Randomization procedure : We randomize the order of the points in the observed series, since under the null hypothesis this ordering will be random.
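The randomization scheme just described is easy to mimic in any language. A minimal Python sketch for a single lag (illustrative only, not the macro; the function names are ours):

```python
import numpy as np

def serial_corr(x, k):
    """Sample serial correlation of a series with itself at lag k."""
    x = np.asarray(x, dtype=float)
    xb = x.mean()
    num = np.sum((x[:-k] - xb) * (x[k:] - xb))
    den = np.sum((x - xb) ** 2)
    return num / den

def ran_p_value(x, k=1, nran=999, seed=0):
    """Two-sided randomization p-value: shuffle the series and see
    where the observed lag-k coefficient falls."""
    rng = np.random.default_rng(seed)
    r_obs = serial_corr(x, k)
    count = sum(abs(serial_corr(rng.permutation(x), k)) >= abs(r_obs)
                for _ in range(nran))
    return (count + 1) / (nran + 1)
```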
Standard procedure
The function % acf c1 computes serial correlation coefficients, and produces p-values using normal approximations. If rk is the kth sample serial correlation, then under the null hypothesis the standardized version zk = [rk + 1/(n-1)] / sqrt{1/n} has an approximate standard normal distribution for sufficiently large n. We also include p-values based upon these normal approximations within our macro output.

Within the macro output, we also provide a p-value for the Von Neumann ratio based upon a normal approximation. Under the null hypothesis, for sufficiently large n, the Von Neumann ratio has a normal distribution with mean 2 and variance 4(n-2)/(n^2-1).

WORKED EXAMPLE FOR ACFRAN

Name of dataset
PROLOCULI

Description
The data are mean diameters of megalospheric proloculi of the Cretaceous bolivinid foraminifer Afrobolivina afra from 92 levels in a borehole in Gbekebo, Ondo State, Nigeria. The rank of the depth is recorded, and provides a measure of the age of the sample (1 = oldest, corresponding to the late Cretaceous; 92 = youngest, corresponding to the early Palaeocene). Diameters are recorded, but interest really lies in the 91 differences between diameters from adjacent depths.

Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source
REYMENT, R.A. (1982) Phenotypic evolution in a Cretaceous foraminifer, Evolution, 36, pp. 1182-1199.

Data
Number of observations = 92   Number of variables = 3
For each observation, sample number (which corresponds to rank depth) (left), diameter (middle) and difference from previous diameter (right) are given.
91 SampleDiam Diff SampleDiam Diff 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 156 146 136 152 147 190 169 170 179 176 184 162 155 154 151 150 187 220 205 194 221 185 171 177 194 176 170 178 177 168 176 209 172 195 169 173 156 161 161 147 158 162 233 184 205 203 * -10 -10 16 -5 43 -21 1 9 -3 8 -22 -7 -1 -3 -1 37 33 -15 -11 27 -36 -14 6 17 -18 -6 8 -1 -9 8 33 -37 23 -26 4 -17 5 0 -14 11 4 71 -49 21 -2 201 261 262 271 202 235 214 212 210 241 211 247 238 235 227 236 230 241 232 230 238 234 230 254 256 210 230 231 225 227 226 237 250 226 229 240 205 221 208 207 215 233 210 213 198 213 -2 60 1 9 -69 33 -21 -2 -2 31 -30 36 -9 -3 -8 9 -6 11 -9 -2 8 -4 -4 24 2 -46 20 1 -6 2 -1 11 13 -24 3 11 -35 16 -13 -1 8 18 -23 3 -15 15 92 Plot Difference 50 0 -50 Index 10 20 30 40 50 60 70 80 90 Worksheet C1 Rank depth C2 Diameter C3 Distance from previous diameter Aims of analysis To investigate whether or not stage-to-stage differences in diameters suffer from autocorrelation. Standard procedure Welcome to Minitab, press F1 for help. MTB > Retrieve "N:\resampling\Examples\Proloculi.MTW". Retrieving worksheet from file: N:\resampling\Examples\Proloculi.MTW # Worksheet was saved on 07/08/01 11:51:17 Results for: Proloculi.MTW MTB > ACF 8 c3 c4. Randomization procedure MTB > % N:\resampling\library\acfran c3 ; SUBC> nran 999 ; SUBC> nlag 8 ; SUBC> autocor m1 ; SUBC> vonneu c5. 
Executing from file: N:\resampling\library\acfran.MAC

Randomization tests for autocorrelation in a univariate time series

SERIAL CORRELATION COEFFICIENTS for lags of 1 to k
* KEY *
> row = lag, j
> corr = observed autocorrelation at the j^th lag
> z-value = standardized autocorrelation
> normal p = p-value using normal approximation
> 1s - ran p = one-sided randomization p-value, H1: negative correlation
> 1s + ran p = one-sided randomization p-value, H1: positive correlation
> 2s ran p = two-sided randomization p-value

Data Display
Row       corr   z-value  normal p  1s - ran p  1s + ran p  2s ran p
  1  -0.421503  -3.91489  0.000090       0.001       1.000     0.002
  2   0.100958   1.06907  0.285038       0.871       0.130     0.260
  3  -0.102290  -0.86979  0.384417       0.182       0.819     0.364
  4   0.060894   0.68689  0.492152       0.760       0.241     0.482
  5  -0.135631  -1.18784  0.234895       0.123       0.878     0.246
  6   0.004072   0.14484  0.884836       0.564       0.437     0.874
  7   0.106034   1.11750  0.263782       0.883       0.118     0.236
  8  -0.155653  -1.37885  0.167942       0.068       0.933     0.136

* NOTE * The interpretation of serial correlation p-values :
Significance levels should be reduced appropriately to account for the effects of multiple testing. The simplest procedure is to divide the significance level by the number of lags being considered (this is conservative). For example, if a 5% significance level is required and lags up to 10 are of interest, then individual p-values smaller than 0.05/10 = 0.005 are taken to be significant.

VON NEUMANN RATIO
Data Display (WRITE)
Observed von-neumann ratio 2.835
Standardized von-neumann ratio 4.029
P-value using normal approximation 0.0001
One-sided randomization p-values 1.0000 0.0010
Two-sided randomization p-value 0.0020

* NOTE * The use of the Von Neumann ratio :
This ratio provides a test of randomness within a time series.
It tests the null hypothesis that the observed series is random against the alternative hypothesis that it is a simple random walk (in which the value at a particular point in time is partly determined by the value at the previous time point).

Modified worksheet
C5  Column containing 999 Von Neumann ratios, one for each randomized dataset
M1  A 999*8 matrix. The kth column contains 999 serial correlation coefficients at lag k, one for each randomized dataset.

Discussion
Randomization and standard methods present a very similar picture. There is strong evidence of autocorrelation at lag 1 (p-value = 0.000 by standard methods, 0.002 by randomization), but no evidence of autocorrelation at any other lag. The Von Neumann ratio also provides clear evidence against this being a random series (p-value = 0.000 by standard methods, 0.002 by randomization).

TRENDRAN

To test for the presence of trend in a regular or irregular time series, using a variety of non-parametric test-statistics. The test-statistics are
  Number of runs above and below the median
  Number of positive differences
  Number of runs up and down.
Significance is determined using both normal approximations and randomization.

Calling statement
trendran c1 ;
  nran k1 ;
  statistics c1-c3.

Input
c1  A column of numeric data.
Missing values : Allowed. Observations with missing values are simply ignored.

Subcommands
statistics  Specify three columns in which to store simulated test-statistics.
  1st column : number of runs above and below the median
  2nd column : number of positive differences
  3rd column : number of runs up and down

Outputs
For each of the three test-statistics, we present the
  Observed value of the test-statistic
  Expected value, standard error and p-value for the test-statistic using a (large-sample) normal approximation
  Randomization p-values

Null hypothesis : The observed time series is a random series.
Alternative hypothesis : There is trend (long-term dependence) within the series.
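To make the randomization logic concrete, here is a small Python sketch (ours, not the macro) of the first test-statistic and its randomization p-value. In this sketch, values equal to the median count as above it, as in the macro, and few runs are treated as evidence of trend:

```python
import numpy as np

def runs_about_median(x):
    """Number of runs of values above / below the median
    (values equal to the median are counted as above)."""
    above = np.asarray(x) >= np.median(x)
    return 1 + int(np.sum(above[1:] != above[:-1]))

def trend_ran_p(x, nran=999, seed=0):
    """One-sided randomization p-value for H1: trend
    (a trending series produces FEW long runs)."""
    rng = np.random.default_rng(seed)
    m_obs = runs_about_median(x)
    count = sum(runs_about_median(rng.permutation(x)) <= m_obs
                for _ in range(nran))
    return (count + 1) / (nran + 1)
```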
Test-statistics
The first test-statistic which we use is the number of runs above and below the median (note : values equal to the median are counted as above the median), M. The second test-statistic is the number of positive differences, P. The third test-statistic is the number of runs up and down, U.

Randomization : We randomize the order of the data, since under the null hypothesis this ordering will be random.

Speed of macro : FAST

References
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Standard procedures
As well as randomization output, the macro produces p-values using normal approximations (Manly, 1997).

For long series, a normal approximation to the distribution of M is reasonable under the null hypothesis, with
  mean = 2r(n-r)/n + 1,
  variance = 2r(n-r){2r(n-r)-n} / {n^2(n-1)},
where n is the length of the series and r is the observed number of values below the median.

For long series, a normal approximation to the distribution of P is reasonable under the null hypothesis, with
  mean = m/2,
  variance = m/12,
where m is the number of differences after zeros have been removed.

For long series, a normal approximation to the distribution of U is reasonable under the null hypothesis, with
  mean = (2m+1)/3,
  variance = (16m-13)/90,
where m is the number of differences.

There is no in-built Minitab command for this kind of procedure, but the closest command is % trend c1, which tests for evidence of trend in c1 using parametric models.

WORKED EXAMPLE FOR TRENDRAN

Name of dataset
EXTINCTION

Description
The data are estimated extinction rates for marine genera from the late Permian period until the present, listed in chronological order. There are 48 geological stages… The data form an irregular time series; the times themselves are not presented here, because there is some doubt as to their accuracy.

Source
MANLY, B.F.J.
(1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source
RAUP, D.M. (1987) Mass extinctions: a discussion, Palaeontology, 30, pp. 1-13.

Data
Number of observations = 48   Number of variables = 1

Extinction rate
22 23 61  7 14 26 30  7 14 60 21 10 45 29 23 40
28 46  7 22 16 19 18 15 11 18  7  9 11 26 13  6
 8  5 11  4 13  3 48 11  9  6  6  7  7  2 13 16

Plot
A time series plot of extinction rate against stage number (index).

Worksheet
C1  Data

Aim of analysis
To investigate whether there is a trend in extinction rates over time.

Randomization procedure
Welcome to Minitab, press F1 for help.
MTB > Retrieve "N:\resampling\Examples\Extinction.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Extinction.MTW
# Worksheet was saved on 07/08/01 10:57:19
Results for: Extinction.MTW
MTB > % N:\resampling\library\trendran c1;
SUBC> nran 999 ;
SUBC> statistics c3-c5.
Executing from file: N:\resampling\library\trendran.MAC

Some tests for detecting trend in a single time series

Test 1 : Runs above and below the median test
Data Display (WRITE)
Observed number of runs 16
Expected number of runs 25.00
Standard deviation for number of runs 3.427
Two-sided p-value using normal approximation 0.0131
One-sided randomization p-value, H1: trend 0.9970
One-sided randomization p-value, H1: rapid oscillation 0.0060
Two-sided randomization p-value 0.0120

Test 2 : Sign test
Data Display (WRITE)
Observed number of positive differences 23
Total observed number of non-zero differences 47
Expected number of positive differences 23.50
Standard deviation for number of positive differences 1.979
Two-sided p-value using normal approximation 1.0000
One-sided randomization p-value, H1: decreasing trend 0.6350
One-sided randomization p-value, H1: increasing trend 0.5570
Two-sided randomization p-value 1.0000

Test 3 : Runs up and down test
Data Display (WRITE)
Observed number of runs 28
Expected number of runs 31.67
Standard deviation for number of runs 2.866
Two-sided
p-value using normal approximation 0.2691
One-sided randomization p-value, H1: trend 0.8950
One-sided randomization p-value, H1: rapid oscillation 0.1690
Two-sided randomization p-value 0.3380

Modified worksheet
C3  Column containing 999 M statistics, one for each randomized dataset
C4  Column containing 999 P statistics, one for each randomized dataset
C5  Column containing 999 U statistics, one for each randomized dataset

Discussion

Method                            Randomization p-values (2-sided)   P-values by normal
                                  This example      Manly            approximation
Runs above and below the median   0.012             0.008            0.013
Positive differences              1.000             1.000            1.000
Runs up and down                  0.338             0.360            0.260

Our randomization p-values agree closely with those of Manly (1997). They are somewhat different from the p-values obtained using the normal approximation, but the differences are not too great. Overall, only the first of the tests shows any evidence of trend. In the case of runs above and below the median, the evidence for trend is reasonably strong. Manly (1997) suggests that there is clear trend within the data, and that the 2nd and 3rd tests have failed to pick this up because they concentrate too much on small-scale behaviour.

6 SPATIAL STATISTICS

Overview
TYPES OF MACRO (by type of resampling)

Randomization tests
SPATAUTORAN tests for spatial autocorrelation
MEAD4RAN performs Mead's randomization test upon a 4*4 grid of quadrat counts
MEAD8RAN performs Mead's randomization test upon an 8*8 grid of quadrat counts
MANTELRAN performs a Mantel test

Monte Carlo procedures
DISTEDFMC produces EDF plots, using data on the location of objects within a fixed area
NEARESTMC tests for random location of objects within a fixed area, using nearest neighbour distances
LOCREGULARMC tests for random location of objects within a fixed area, using indices of local regularity

Which procedure should I use ?
Searching for pattern in the location of points
A key area of spatial statistics is the search for pattern in the location of points or objects within a fixed region. If points are distributed at random within the region, so that no pattern is present, then we say that there is Complete Spatial Randomness (CSR). In searching for spatial pattern, the usual starting point is to test the null hypothesis of CSR; only once we have rejected CSR (if indeed we do reject CSR) can we begin to look at the nature of the pattern.

NEARESTMC and LOCREGULARMC both test the null hypothesis of CSR, using statistics based upon kth nearest neighbour distances. In NEARESTMC, we use the kth nearest neighbour distances themselves, and test CSR against alternative hypotheses of regularity and clustering. In LOCREGULARMC, we use statistics which are particularly sensitive to regularity (and, specifically, which are sensitive to local regularity, i.e. large-scale clustering occurring together with small-scale regularity). DISTEDFMC also provides two tests for alternative hypotheses of clustering and regularity against the null hypothesis of CSR, this time using the distances between all objects (rather than just kth nearest neighbour distances). The primary purpose of DISTEDFMC, however, is to provide a powerful graphical means of identifying if (and where) deviations from CSR occur.

If data are in the form of a grid of quadrat counts (i.e. the number of objects/points present within each of a number of adjacent regions is recorded) then MEAD4RAN and MEAD8RAN can also be used to test the null hypothesis of CSR against alternative hypotheses of clustering and regularity. It is quite possible to use these macros upon data which take the form of locations of points within a region (as already discussed), by superimposing a grid over the region, and counting the number of objects within each cell of the grid.
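For intuition, the Monte Carlo logic shared by these CSR tests can be sketched as follows. This toy Python version (ours, not one of the macros) uses the mean nearest-neighbour distance as its statistic, assumes the study region is the unit square, and ignores edge corrections:

```python
import numpy as np

def mean_nn_dist(pts):
    """Mean distance from each point to its nearest neighbour."""
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                 # exclude self-distances
    return d.min(axis=1).mean()

def csr_p_values(pts, nsim=999, seed=0):
    """One-sided Monte Carlo p-values (regularity, clustering):
    simulate CSR patterns in the unit square and see where the
    observed statistic falls."""
    rng = np.random.default_rng(seed)
    n = len(pts)
    t_obs = mean_nn_dist(pts)
    sims = np.array([mean_nn_dist(rng.random((n, 2))) for _ in range(nsim)])
    p_reg = (np.sum(sims >= t_obs) + 1) / (nsim + 1)  # large t => regularity
    p_clu = (np.sum(sims <= t_obs) + 1) / (nsim + 1)  # small t => clustering
    return p_reg, p_clu
```

The macros use more refined statistics (kth nearest neighbour distances, local regularity indices, EDFs of all inter-object distances), but the simulate-and-rank step is the same.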
It is important, however, to note that MEAD4RAN and MEAD8RAN can only be used to test for CSR at a particular scale, and are likely to entirely miss deviations from CSR at smaller or larger spatial scales.

Other techniques
The remaining two macros have entirely different purposes.

Much data takes the form of a variable recorded at different spatial locations (either at different points, or within different regions). Standard statistical analyses would ignore the spatial structure of such data, and would assume that the values of the variable at different points are independent. Often, however, values of a variable at a point will tend to be similar to values at nearby points – this kind of dependence is known as spatial autocorrelation. In SPATAUTORAN, we test for spatial autocorrelation against a null hypothesis of independence.

Finally, much spatial data is in the form of distance matrices – for a network of n points, a variety of distances between each pair of points may be computed, and these distances may be formed into matrices. For example, Manly (1997) discusses a situation in which the global distribution of earwigs is of interest. In this case, the points consist of 8 continental-level regions. The first distance matrix contains measures of similarity in the types of earwig species found in each pair of regions, whilst the second distance matrix contains a measure of the geographical distance between the regions. Interest lies in seeing whether the degree of similarity in species is related to geographical distance. This kind of problem can be addressed using MANTELRAN.

Using the macros for spatial statistics
Although the techniques used within these macros are well-established, they are much less commonly used and taught than the previous techniques we have considered. Before using the macros, the user is strongly advised to study the suggested references.
Most of these macros are also very computer-intensive, so they are less suitable for use on a routine basis, or for teaching.

SPATAUTORAN  ! Intensive !

To test for the presence of spatial autocorrelation for data recorded at points within a region. Two alternative coefficients of spatial autocorrelation, the Moran and Geary coefficients, are computed, and significance levels are determined by randomization.

RUNNING THE MACROS

Calling statement
spatautoran c1 m1 k1 ;
  nran k1 ;
  autocorrelations c1-c2.

Input
c1 is a column containing data for each of the N points.
m1 is a weighting or connectivity matrix, representing the geographical arrangement of the points. It must be an N-by-N matrix. It should contain zero entries upon the diagonal (if it does not, diagonal entries will be set equal to zero). It need not be symmetric.
k1 is the number of points, N.

Subcommands
autocorrelations - specify columns in which to store simulated Moran coefficients (1st column) and Geary coefficients (2nd column).

Output
For each of the two kinds of spatial autocorrelation coefficient, we present :
  Observed coefficient
  Expected coefficient and standard error, under assumptions N and R
  P-values using normal approximations, under assumptions N and R
  Two-sided p-values using randomization

TECHNICAL DETAILS

Null hypothesis : Independence (no spatial autocorrelation).
Alternative hypotheses : Positive or negative spatial autocorrelation. Data exhibit spatial autocorrelation if data values at a point or region are influenced by data values at other nearby points or regions.

Test-statistics
We consider two measures of autocorrelation, the Moran and Geary autocorrelation coefficients. If xi is the data value for the ith point and wij is the i,jth element of the weighting matrix, then the Moran coefficient is defined to be

  I = n * SUM[i,j] wij (xi - xbar)(xj - xbar) / ( SUM[i,j] wij * SUM[i] (xi - xbar)^2 )

and the Geary coefficient is defined to be

  c = (n-1) * SUM[i,j] wij (xi - xj)^2 / ( 2 * SUM[i,j] wij * SUM[i] (xi - xbar)^2 ).
The denominator for both coefficients is :
[Sum of all elements of the weighting matrix] * variance of the xi values
The numerator for the Moran coefficient is :
Sum over all i,j of : [ijth element of weighting matrix * (xi – mean of x values) * (xj – mean of x values)]
The numerator for the Geary coefficient is :
(½) * Sum over all i,j of : [ijth element of weighting matrix * (xi – xj)²]
[Note : for the Moran coefficient, we estimate the variance by dividing by n, but for the Geary coefficient we divide by n – 1.]

Randomization

We randomize the allocation of data values to points, since under the null hypothesis this allocation is random. We do not modify the weighting matrix in any way, since this describes the "map" of the study area.

Weighting matrix

In order to quantify the effect of spatial autocorrelation, we need a mathematical representation of the geographical layout of the study region: this representation is the weighting matrix, W. The ijth element of W, wij, provides a measure either of the geographical proximity of regions i and j, or else some other measure of the degree to which the regions influence each other. The simplest weighting matrix is a connectivity matrix, for which

wij = 1 if regions i and j are adjacent, and 0 otherwise,

and this is the matrix used in our worked example. Cliff and Ord (1973) discuss more sophisticated weighting matrices; for example, wij may be inversely proportional to the distance between the centrepoints of the regions, and/or proportional to the length of the common boundary between the regions. The form of the matrix should be relevant to the context of the applied problem, and must be determined prior to the analysis stage.

ALTERNATIVE PROCEDURES :

Standard procedures

No built-in Minitab command exists.
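The coefficients and the randomization step described above can also be sketched in a few lines of code. The following Python fragment is an illustration only, not the SPATAUTORAN macro itself (the function names are ours); it computes both coefficients as defined above, and obtains a one-sided randomization p-value for positive autocorrelation by re-allocating the data values to the points while keeping the weighting matrix fixed:

```python
import random

def moran_geary(x, w):
    # Moran's I and Geary's c for data values x at n points, given an
    # n-by-n weighting matrix w with zero diagonal (need not be symmetric).
    n = len(x)
    xbar = sum(x) / n
    dev = [xi - xbar for xi in x]
    w_sum = sum(sum(row) for row in w)          # sum of all weighting elements
    ss = sum(d * d for d in dev)                # sum of squared deviations
    num_i = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    num_c = sum(w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n))
    return n * num_i / (w_sum * ss), (n - 1) * num_c / (2 * w_sum * ss)

def moran_randomization_p(x, w, nran=999, seed=1):
    # One-sided p-value for positive autocorrelation: randomly re-allocate
    # the data values to the points; the weighting matrix stays fixed.
    rng = random.Random(seed)
    i_obs, _ = moran_geary(x, w)
    extreme = sum(moran_geary(rng.sample(x, len(x)), w)[0] >= i_obs
                  for _ in range(nran))
    return (extreme + 1) / (nran + 1)

# Toy example: four values on a line of four points, adjacent points connected.
x = [1.0, 2.0, 3.0, 4.0]
w = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
i_coef, c_coef = moran_geary(x, w)   # I = 1/3 and c = 0.3 for this arrangement
```

For the WALES example below, the same logic would be applied with the rank changes as x and the 13-by-13 connectivity matrix as w.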
Normal approximations to the distributions of the Moran and Geary coefficients can be derived using asymptotic theory, under two possible assumptions :

N The data are independent realisations from a normal distribution
R The data are independent realisations from an unknown distribution.

The normal approximations are complicated, and we do not state them here. We present p-values obtained using both normal approximations; they do not necessarily give similar answers.

REFERENCE

CLIFF, A.D. & ORD, J.K. (1973) Spatial autocorrelation, Pion, London

WORKED EXAMPLE FOR SPATAUTORAN

Name of dataset WALES

Description

The data describe the percentage change in population in each of 13 Welsh counties (coded A to M) over the period 1951-1961. Ranks for percentage change are then constructed, and it is these which are of interest.

Our source

CLIFF, A.D. & ORD, J.K. (1973) Spatial autocorrelation, Pion, London.

Original source

GENERAL REGISTER OFFICE (1961) England and Wales: Preliminary Census Report, 1961, HMSO, London.

The data

Number of observations = 13
Number of variables = 3

Code  Change  Rank
A      2.05     5
B     -1.70     8
C     -2.40     9
D      0.50     7
E     -2.50    10
F      1.80     6
G      3.20     3
H      2.10     4
I     -5.90    12
J      4.40     1
K     -3.80    11
L      3.40     2
M     -7.80    13

Worksheet

C1 County code
C2 % population change
C3 Rank change
M1 Geographical connectivity matrix

Plot : Network chart. This is a graphical representation of the connectivity matrix used in the example. If counties i and j are directly joined in the chart, then the ijth element of the weighting matrix is 1. Otherwise it is zero.

Aims of analysis

To investigate whether there is spatial autocorrelation in the rank population changes.

Randomization procedure

Welcome to Minitab, press F1 for help.
Retrieving worksheet from file: N:\resampling\Examples\Wales.MTW
# Worksheet was saved on 28/08/01 16:49:48
Results for: Wales.MTW

MTB > % N:\resampling\library\spatautoran c3 m1 13 ;
SUBC> nran 999 ;
SUBC> autocorrel c5 c6.
Executing from file: N:\resampling\library\spatautoran.MAC

Moran coefficient of spatial autocorrelation

Data Display (WRITE)
Observed Moran coefficient 0.1259
Expected Moran coefficient -0.08333
Standard error for Moran coefficient, under N 0.1743
Standard error for Moran coefficient, under R 0.1938
Standard two-sided p-value, under N 0.2302
Standard two-sided p-value, under R 0.2803
One-sided randomization p-values 0.1190 0.8860
Two-sided randomization p-value 0.2380

Geary coefficient of spatial autocorrelation

Data Display (WRITE)
Observed Geary coefficient 0.6656
Expected Geary coefficient 1.000
Standard error for Geary coefficient, under N 0.2354
Standard error for Geary coefficient, under R 0.1257
Standard two-sided p-value, under N 0.1554
Standard two-sided p-value, under R 0.0078
One-sided randomization p-values 0.9430 0.0600
Two-sided randomization p-value 0.1200

* NOTE * For further details, see CLIFF, A.D. & ORD, J.K. (1973) Spatial autocorrelation, Pion, London.

Modified worksheet

C5 A column containing 999 Moran coefficients, one for each randomized dataset
C6 A column containing 999 Geary coefficients, one for each randomized dataset

Discussion

There is no real evidence for spatial autocorrelation, except that the standard two-sided p-value under assumption R for the Geary coefficient (0.0078) appears significant; the corresponding randomization p-value (0.1200) does not, however, support this conclusion.

MANTELRAN ! Intensive !

To perform Mantel's test for association between the elements of two matrices, A and B.

Important note

A and B are usually distance matrices, so that, for example, the ijth element of A represents some kind of distance between the ith and jth objects, whilst the ijth element of B represents a different kind of distance between the same two objects. A and B are square, symmetric matrices. The elements along the diagonals of A and B should all be zero.
The kinds of "distances" which may be of interest include :
Geographical distance
Separation in time
Environmental differences between sites
Differences between quadrat counts at sites
Genetic differences

RUNNING THE MACRO

Calling statement

mantelran m1 m2 k1 ;
nran k1 (999) ;
correlations c1.

Input

m1 and m2 should be square, symmetric, k-by-k matrices of the same dimension. Both must have zero entries along their leading diagonal. Matrices m1 and m2 represent matrices A and B in the Discussion below.
k1 is the number of observations (equal to the length of the side of each matrix), k.

Subcommands

correlations - specify a column within which to store simulated correlation coefficients.

Output

Observed correlation coefficient
P-values determined by randomisation

TECHNICAL DETAILS

Null hypothesis : The elements of matrices A and B are independent.
Alternative hypothesis : Linear association between the elements of A and B.
Test-statistic : The Pearson correlation coefficient between the entries of matrices A and B. In fact we need only use the lower triangular elements of A and B, since the matrices are symmetric.
Randomization procedure : We fix the elements of A (say). To construct B, we randomly permute the order of the individuals 1,…,n, and locate values of B according to the new position of the individuals with which they are associated. This procedure is valid under one of two assumptions :
1. If the n individuals are a random sample from a larger population, then we must assume that the A distances and B distances are independent within the population.
2. If the n individuals form the population of interest, then we must assume that the mechanism generating A distances is independent of the mechanism generating B distances.

ALTERNATIVE PROCEDURES :

Other macros : None.
Standard procedures : None.

REFERENCES

MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 9).
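The test-statistic and randomization procedure just described can be sketched as follows. This Python fragment is an illustration only, not the MANTELRAN macro itself (the function names are ours): it correlates the lower-triangular elements of A and B, then repeatedly permutes the individuals underlying B while keeping A fixed.

```python
import random

def pearson(u, v):
    # Pearson correlation coefficient between two equal-length lists.
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

def lower_triangle(m):
    # Strictly-lower-triangular elements; sufficient because A, B are symmetric.
    return [m[i][j] for i in range(len(m)) for j in range(i)]

def mantel_test(a, b, nran=999, seed=1):
    # Fix A; randomly permute the order of the individuals underlying B,
    # relocating B's entries accordingly.  Returns the observed correlation
    # and a one-sided p-value for positive association.
    rng = random.Random(seed)
    n = len(a)
    r_obs = pearson(lower_triangle(a), lower_triangle(b))
    count = 0
    for _ in range(nran):
        perm = rng.sample(range(n), n)
        b_perm = [[b[perm[i]][perm[j]] for j in range(n)] for i in range(n)]
        if pearson(lower_triangle(a), lower_triangle(b_perm)) >= r_obs:
            count += 1
    return r_obs, (count + 1) / (nran + 1)

# Toy example: two distance matrices for four points on a line at 0, 1, 2, 4,
# one using |pi - pj| and one using (pi - pj)^2, so positive association holds.
a = [[abs(p - q) for q in (0, 1, 2, 4)] for p in (0, 1, 2, 4)]
b = [[(p - q) ** 2 for q in (0, 1, 2, 4)] for p in (0, 1, 2, 4)]
```

For the EARWIGS example below, a would be the species-similarity matrix and b one of the two distance matrices (with the sign of the expected association reversed).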
WORKED EXAMPLE FOR MANTELRAN

Name of dataset EARWIGS

Description and worksheet

The data describe the global distribution of earwigs. Observations are taken upon 8 continental-level regions (1: Europe and Asia, 2: Africa, 3: Madagascar, 4: Orient, 5: Australia, 6: New Zealand, 7: South America, 8: North America). We have three matrices:

M1 A correlation matrix. Element ij quantifies the similarity of earwig species in regions i and j.
M2 A matrix representing current distances between regions. Element ij quantifies the "stepwise" distance between regions i and j (i.e. it is 1 if they are adjacent, 2 if they are separated (overland) by one other region, and so on).
M3 An alternative distance matrix, based upon the hypothesised arrangement of the continents in Gondwanaland.

Interest lies in seeing whether the similarity in species between regions is more closely related to their current geographical proximity or to their geographical proximity in Gondwanaland. If the latter relationship is substantially stronger, this provides evidence that evolution of earwig species occurred in Gondwanaland, which in turn provides supporting evidence for the Continental Drift Hypothesis (which hypothesises the existence of Gondwanaland).

Our source

MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Original source

POPHAM, E.J. & MANLY, B.F.J. (1969), Geographical distribution of the Dermaptera and the continental drift hypothesis, Nature, 222, pp. 981-982.
Data

Correlation matrix (8*8)

 0.00  0.30  0.14  0.23  0.30 -0.04  0.02 -0.09
 0.30  0.00  0.50  0.50  0.40  0.04  0.09 -0.06
 0.14  0.50  0.00  0.54  0.50  0.11  0.14  0.05
 0.23  0.50  0.54  0.00  0.61  0.03 -0.16 -0.16
 0.30  0.40  0.50  0.61  0.00  0.15  0.11  0.03
-0.04  0.04  0.11  0.03  0.15  0.00  0.14 -0.06
 0.02  0.09  0.14 -0.16  0.11  0.14  0.00  0.36
-0.09 -0.06  0.05 -0.16  0.03 -0.06  0.36  0.00

Distance matrix (8*8)

0 1 2 1 1 0 1 2 2 1 0 3 1 2 3 0 2 3 4 1 2 3 4 1 0 3 4 5 2 1 2 3 4 3 4 1 2 3 2 3 3 2 1 4 3 2 5 4 3 2 3 2 1 4 3

Alternative distance matrix (8*8)

0 1 2 1 2 1 0 1 1 1 2 1 0 1 1 1 1 1 0 1 2 1 1 1 0 3 2 2 2 1 2 1 2 2 2 1 2 3 2 3 0 5 4 5 0 1 4 1 0 3 2 2 2 1 0 3 4 2 1 2 2 2 3 0 1 1 2 3 2 3 4 1 0

Worksheet

M1 Correlation matrix
M2 Distance matrix (based on current arrangement of continents)
M3 Alternative distance matrix (based on Continental Drift Hypothesis)

Aims of analysis

To investigate whether the similarity of earwig species between continental-scale regions is correlated with the "stepwise" geographical distance between those regions.

Randomization procedure : current distribution of continents

MTB > Retrieve "N:\spatial\Earwigs.MTW".
Retrieving worksheet from file: N:\spatial\Earwigs.MTW
# Worksheet was saved on 08/03/01 05:09:05 PM
Results for: Earwigs.MTW

MTB > % N:\spatial\mantelran m1 m2 8 ;
SUBC> nran 499 ;
SUBC> correlations c1.

Executing from file: N:\spatial\mantelran.MAC

Mantel Test, with significance determined by randomization

Data Display (WRITE)
Number of units 8
Observed correlation -0.2170
Number of randomizations 499
One-sided p-value, H1: positive correlation 0.8160
One-sided p-value, H1: negative correlation 0.1900
Two-sided p-value 0.3800

Looking at the data

MTB > Retrieve "N:\resampling\Examples\Earwigs.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Earwigs.MTW
# Worksheet was saved on 13/08/01 11:34:05
Results for: Earwigs.MTW

MTB > print m1

Data Display

Matrix M1

 0.00  0.30  0.14  0.23  0.30 -0.04  0.02 -0.09
 0.30  0.00  0.50  0.50  0.40  0.04  0.09 -0.06
 0.14  0.50  0.00  0.54  0.50  0.11  0.14  0.05
 0.23  0.50  0.54  0.00  0.61  0.03 -0.16 -0.16
 0.30  0.40  0.50  0.61  0.00  0.15  0.11  0.03
-0.04  0.04  0.11  0.03  0.15  0.00  0.14 -0.06
 0.02  0.09  0.14 -0.16  0.11  0.14  0.00  0.36
-0.09 -0.06  0.05 -0.16  0.03 -0.06  0.36  0.00

Modified worksheet

C1 A column containing 499 correlation coefficients, one for each simulated dataset.

Discussion

The appropriate one-sided randomization p-value is 0.190, very similar to the 0.183 obtained by Manly (1997) using 4999 randomizations. This provides no significant evidence of linear association between the similarity in earwig species and the current distances between continental-level regions.

MEAD4RAN

To perform Mead's randomization test upon a 4*4 grid of spatial count data. Mead's randomization test is designed to test the null hypothesis of CSR (Complete Spatial Randomness).

Calling statement

mead4ran m1 ;
nran k1 (999) ;
qstatistics c1.

Input

A 4*4 matrix of quadrat counts, which may not contain any missing values. The ordering of the counts in the matrix should be the same as the spatial ordering in the experiment or study, and the results obtained will be dependent upon this ordering.

Subcommands

qstatistics Specify a column in which to store simulated Q-statistics.

Output

Observed Q-statistic
Associated one-sided and two-sided randomization p-values

References

MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 10).
Standard procedure : None

Null hypotheses :

Assume that the quadrats are labelled as follows :

1 3 9 11
2 4 10 12
5 7 13 15
6 8 14 16

The null hypothesis is that the division of the quadrats into blocks of 4 [(1) = 1,2,3,4 ; (2) = 5,6,7,8 ; (3) = 9,10,11,12 ; (4) = 13,14,15,16] is random. If the data exhibit Complete Spatial Randomness (CSR), then this division will be random, so the null hypothesis can also be viewed as CSR.

Alternative hypotheses : Clustering or regularity at an appropriate scale, resulting in a non-random division of the quadrats into blocks.

Test-statistic :

Assume that the data are as follows, so that Ti represents the quadrat count in the ith quadrat.

T1 T2 T5 T6
T3 T4 T7 T8
T9 T10 T13 T14
T11 T12 T15 T16

We use the test-statistic Q = BSS / TSS, where TSS is the total sum of squares for the 16 counts in the 4*4 grid, and BSS is the between-blocks sum of squares for the 4 counts in the 2*2 grid formed by aggregating counts as follows :

AC1 = T1 + T2 + T3 + T4
AC2 = T5 + T6 + T7 + T8
AC3 = T9 + T10 + T11 + T12
AC4 = T13 + T14 + T15 + T16

Definition

BSS = (AC1² + AC2² + AC3² + AC4²) / 4 − 16 T̄²

TSS = Σ(i = 1,…,16) Ti² − 16 T̄²

where T̄ is the mean of the 16 counts. Q lies between 0 and 1. In general, unusually large values of Q imply clustering, whilst unusually small values of Q imply some form of regularity. However, it should be noted that the test is only capable of detecting regularity or clustering at a particular spatial scale (the scale reflected by blocks of size 4). Mead's randomization test can either be applied to data which naturally arise as a 4*4 grid of counts, or (more commonly) by placing a 4*4 grid over a region in which locations of points are recorded, and counting the number of points within each section of the grid.

Randomization procedure : We randomize the allocation of counts to cells within the grid, since under the null hypothesis of complete spatial randomness this allocation should occur at random.
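The Q-statistic and the randomization step can be sketched as follows. This Python fragment is an illustration only, not the MEAD4RAN macro itself (the function names are ours); applied to the SAPLING1 counts from the worked example below, it reproduces the observed Q-statistic of 0.3886.

```python
import random

def q_statistic(grid):
    # Q = BSS / TSS for a 4*4 grid of counts; the "blocks of 4" are the
    # four 2*2 corner blocks of the grid, as in the labelling above.
    t = [grid[r][c] for r in range(4) for c in range(4)]
    tbar = sum(t) / 16
    blocks = [sum(grid[r][c] for r in rows for c in cols)
              for rows in ((0, 1), (2, 3)) for cols in ((0, 1), (2, 3))]
    bss = sum(b * b for b in blocks) / 4 - 16 * tbar ** 2
    tss = sum(x * x for x in t) - 16 * tbar ** 2
    return bss / tss

def mead_randomization(grid, nran=999, seed=1):
    # Randomly re-allocate the 16 counts to cells; returns the observed Q and
    # a one-sided p-value for clustering (unusually large Q).
    rng = random.Random(seed)
    q_obs = q_statistic(grid)
    t = [grid[r][c] for r in range(4) for c in range(4)]
    count = 0
    for _ in range(nran):
        s = rng.sample(t, 16)                     # random permutation of counts
        if q_statistic([s[4 * r:4 * r + 4] for r in range(4)]) >= q_obs:
            count += 1
    return q_obs, (count + 1) / (nran + 1)

# SAPLING1 quadrat counts (71 saplings in a 4*4 grid of 2.5m squares);
# Q is unchanged if the grid is transposed.
sapling1 = [[6, 4, 3, 2], [2, 6, 4, 3], [5, 4, 5, 4], [4, 6, 6, 7]]
```

Running mead_randomization(sapling1) gives the observed Q-statistic of roughly 0.3886 together with a one-sided clustering p-value close to the 0.105 reported in the worked example.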
WORKED EXAMPLE FOR MEAD4RAN

Name of dataset SAPLING1

Description

The raw data describe the position of 71 Swedish pine saplings in a 10 x 10m square. In this dataset, we divide the region into 16 squares (each 2.5m x 2.5m), and count the number of saplings within each square.

Source

MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Data

Number of observations = 16
Number of variables = 1

Counts within the 4*4 grid are shown.

6 4 3 2
2 6 4 3
5 4 5 4
4 6 6 7

Worksheet

M1 Matrix of counts

Aim of analysis

To test whether the distribution of pine saplings is random.

Randomization procedure

MTB > Retrieve "N:\resampling\Examples\Sapling1.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Sapling1.MTW
# Worksheet was saved on 09/08/01 09:13:12
Results for: Sapling1.MTW

MTB > print m1

Data Display

Matrix M1

6 4 3 2
2 6 4 3
5 4 5 4
4 6 6 7

MTB > % N:\resampling\library\mead4ran m1 ;
SUBC> nran 999 ;
SUBC> qstatistics c1.

Executing from file: N:\resampling\library\mead4ran.MAC

Mead's randomization test for a 4*4 grid

Data Display (WRITE)
Observed Q-statistic 0.3886
One-sided randomization p-value, H1: regularity 0.9010
One-sided randomization p-value, H1: clustering 0.1050
Two-sided randomization p-value 0.2100

* NOTE * For further details, see MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Modified worksheet

C1 A column containing 999 Q-statistics, one for each simulated dataset

Discussion

There is slight evidence of clustering (we obtain a one-sided p-value of 0.105; Manly (1997) obtains 0.111), but this cannot be regarded as statistically significant. Mead's test therefore provides no real evidence against randomness at this scale (this qualification is important - the test is scale-dependent).

MEAD8RAN

To perform Mead's randomization test upon an 8*8 grid of spatial count data.
Mead's randomization test is designed to test the null hypothesis of CSR (Complete Spatial Randomness).

Calling statement

mead8ran m1 ;
nran k1 (999) ;
qstatistics c1-c5.

Input

An 8*8 matrix of quadrat counts, which may not contain any missing values. The ordering of the counts in the matrix should be the same as the spatial ordering in the experiment or study, and the results obtained will be dependent upon this ordering.

Subcommands

qstatistics Specify five columns in which to store simulated Q-statistics for each of the four quarters of the study area (top left in the 1st column, top right in the 2nd column, bottom left in the 3rd column, bottom right in the 4th column), and the simulated mean Q-statistics (in the 5th column).

Output

Observed Q-statistics for each quarter
Observed mean Q-statistic
One-sided and two-sided randomization p-values for the mean Q-statistic

Null hypothesis : Complete spatial randomness.
Alternative hypothesis : Clustering or regularity at a particular scale.
Test-statistic : We compute Q-statistics (as in Mead's randomization test for a 4*4 grid) for each of the four 4*4 blocks of cells in each corner of the 8*8 grid. We use the average of these as the test-statistic. Once again, Mead's randomization test is only capable of picking up deviations from randomness at particular scales, but in this case the scale is finer.
Randomization procedure : We use restricted randomization, randomizing counts separately within each of the four 4*4 grids.

References

MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 10).

Standard procedure : None

WORKED EXAMPLE FOR MEAD8RAN

Name of dataset SAPLING2

Description

The raw data describe the position of 71 Swedish pine saplings in a 10 x 10m square. In this dataset, we divide the region into 64 squares (each 1.25m x 1.25m), and count the number of saplings within each square.

Source

MANLY, B.F.J.
(1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Data

Number of observations = 64
Number of variables = 1

Counts within the 8*8 grid are shown.

1 0 0 1 1 1 1 0
3 2 0 1 2 1 1 2
2 1 2 0 2 0 1 0
0 1 3 1 1 1 3 2
1 1 1 1 1 2 2 1
0 1 1 1 0 2 1 2
0 1 0 2 2 1 1 2
0 1 1 0 0 1 1 3

Worksheet

M1 Matrix of counts

Randomization procedure

MTB > Retrieve "N:\resampling\Examples\Sapling2.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Sapling2.MTW
# Worksheet was saved on 28/08/01 16:45:24
Results for: Sapling2.MTW

MTB > print m1

Data Display

Matrix M1

1 0 0 1 1 1 1 0
3 2 0 1 2 1 1 2
2 1 2 0 2 0 1 0
0 1 3 1 1 1 3 2
1 1 1 1 1 2 2 1
0 1 1 1 0 2 1 2
0 1 0 2 2 1 1 2
0 1 1 0 0 1 1 3

MTB > Save "N:\resampling\Examples\Sapling2.MTW";
SUBC> Replace.
Saving file as: N:\resampling\Examples\Sapling2.MTW
* NOTE * Existing file replaced.

MTB > % N:\resampling\library\mead8ran m1 ;
SUBC> nran 999 ;
SUBC> qstatistics c3-c7.

Executing from file: N:\resampling\library\mead8ran.MAC

Mead's randomization test for an 8*8 grid

Data Display (WRITE)
Observed Q-statistic for top left quarter 0.1746
Observed Q-statistic for top right quarter 0.06587
Observed Q-statistic for bottom left quarter 0.1000
Observed Q-statistic for bottom right quarter 0.1282

Data Display (WRITE)
Observed mean Q-statistic 0.1172
Number of randomizations 999
One-sided randomization p-value, H1: regularity 0.0940
One-sided randomization p-value, H1: clustering 0.9110
Two-sided randomization p-value 0.1880

* NOTE * For further details, see MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Modified worksheet

C3 A column containing 999 Q-statistics for top left quarter, one for each simulated dataset.
C4 A column containing 999 Q-statistics for top right quarter, one for each simulated dataset.
C5 A column containing 999 Q-statistics for bottom left quarter, one for each simulated dataset.
C6 A column containing 999 Q-statistics for bottom right quarter, one for each simulated dataset.
C7 A column containing 999 average Q-statistics, one for each simulated dataset.

Discussion

There is slight evidence of regularity (we obtain a one-sided p-value of 0.094; Manly (1997) obtains 0.093), but this cannot be regarded as statistically significant. Mead's test again provides no real evidence against randomness at this scale (this qualification is important - the test is scale-dependent), but it is interesting to note that whilst clustering is the most plausible hypothesis at the 4*4 scale, regularity is the most plausible hypothesis at the 8*8 scale.

Creating and interpreting EDF plots

Introduction

EDF plots are a convenient graphical procedure for investigating the null hypothesis of CSR (Complete Spatial Randomness). EDF plots are very similar to the probability plots produced (for example, by Minitab) to test distributional assumptions. The differences are that :
1) EDF plots test the observed data against the null hypothesis of Complete Spatial Randomness, rather than against a hypothesised distribution (e.g. normality).
2) EDF plots are constructed using Monte Carlo simulation, rather than theoretical results.

Underlying theory

Assume that data are available for n points within a fixed rectangular region A of known area. We consider the distribution of all inter-event distances. Under the null hypothesis of CSR, we can assume that the distance between any two events in A is a realisation from a random variable T. Assume for the time being that the distribution of T is known (for very simple regions, such as a square or circle, we can derive the exact distribution of T), and assume T has cumulative distribution function F(t). We compare this theoretical distribution, F(t), derived under the null hypothesis of CSR, with the empirical distribution function (EDF) of the data, E(t).
If the null hypothesis is true, we would expect E(t) to be "close" to F(t); the EDF plot involves plotting E(t) against F(t), and comparing this to a plot of F(t) against itself.

Implementation

Four issues arise in practice :

Question 1 : At what values of t should we evaluate E(t) and F(t) ?
Answer : If we assume that the region is rectangular, with sides a and b, then t is constrained to lie in the range [0, √(a² + b²)]. We then evaluate the CDF and EDF at a fixed number (the default is 100, but there is an option for the user to change this) of equally spaced points within this range. We therefore work not with the EDF itself, but with an approximation to it. If the number of points is reasonably large, the error from working with an approximation is very small.

Question 2 : How do we compute the empirical distribution function ?
Answer : At each evaluation distance t, the EDF is defined to be E(t) = (Number of observed inter-event distances less than or equal to t) / (Total number of inter-event distances).

Question 3 : How do we find F(t) if the region is not very simple ?
Answer : We create d simulated datasets of size n under the null hypothesis of CSR. If the theoretical bounds of the study region are [XMIN,XMAX] on the x-axis and [YMIN,YMAX] on the y-axis, then to create one simulated dataset we simulate n x-values from a uniform distribution on [XMIN,XMAX] and n y-values from a uniform distribution on [YMIN,YMAX], and pair these up to give the locations of n simulated datapoints. We then compute the EDF for each of the d simulated datasets. Let these EDFs be H1(t),…,Hd(t). An estimate of F(t) at distance t is the mean of H1(t),…,Hd(t).

Question 4 : How do we create an indication of variability under the null hypothesis ?
Answer : In order to compare the EDF against the CDF generated under CSR, we need some indication of the variability in the CDF under CSR. To do this, we can create lower and upper simulation limits
L(t) = Min{i=1,…,d} [Hi(t)]
U(t) = Max{i=1,…,d} [Hi(t)]
and plot these against F(t).
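The four steps above (evaluation grid, EDF, simulated CDF estimate, simulation envelope) can be sketched together as follows. This Python fragment is an illustration only, not the DISTEDFMC macro itself (the function names are ours):

```python
import random
from math import hypot

def interevent_distances(pts):
    # All n(n-1)/2 distances between pairs of points.
    return [hypot(px - qx, py - qy)
            for k, (px, py) in enumerate(pts) for (qx, qy) in pts[k + 1:]]

def edf(distances, tvec):
    # Empirical distribution function of the distances, evaluated at tvec.
    n = len(distances)
    return [sum(d <= t for d in distances) / n for t in tvec]

def csr_edf_envelope(pts, bounds, nsim=99, npoints=100, seed=1):
    # Simulate nsim CSR datasets of the same size within the stated bounds;
    # return (tvec, observed EDF, estimated CDF, lower envelope, upper envelope).
    rng = random.Random(seed)
    xmin, xmax, ymin, ymax = bounds
    tmax = hypot(xmax - xmin, ymax - ymin)  # largest possible inter-event distance
    tvec = [tmax * (i + 1) / npoints for i in range(npoints)]
    sims = []
    for _ in range(nsim):
        sim = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in pts]
        sims.append(edf(interevent_distances(sim), tvec))
    cdf = [sum(h[i] for h in sims) / nsim for i in range(npoints)]     # mean of the Hi(t)
    lower = [min(h[i] for h in sims) for i in range(npoints)]          # L(t)
    upper = [max(h[i] for h in sims) for i in range(npoints)]          # U(t)
    return tvec, edf(interevent_distances(pts), tvec), cdf, lower, upper
```

Plotting the observed EDF and the envelope against either the estimated CDF or against distance then gives the two graphical representations discussed below.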
If the EDF, E(t), deviates from the simulation "envelope" bounded by L(t) and U(t), then this provides evidence against CSR (although exact significance levels are difficult to assign).

Graphical representations

There are two possible graphical representations of the EDF and associated simulation envelope :
Plot the EDF and simulation envelope against the theoretical distribution (the CDF)
Plot the EDF, CDF and simulation envelope against distance, t.
Both plots are presented, and should be interpreted in much the same way.

Interpreting the graphs

Look at both graphs, and ask yourself the following questions :
Does the EDF line generally seem to lie far from the CDF line ?
Does the EDF line go outside the simulation envelope at any point ?
Does the EDF line show any systematic deviation from the CDF line (e.g. always lying above the CDF line) ?
If any of the answers is "yes", this may suggest that the assumption of CSR is false. If the EDF line always lies close to the CDF, never travels outside the envelope and shows no systematic deviation from the CDF, then there is no evidence against CSR.

Test-statistics

We use two "ad hoc" test-statistics :
1. Maximum pointwise squared difference between the EDF and the estimated CDF
2. Average squared difference (across all t values) between the EDF and the estimated CDF.
Both test the null hypothesis of complete spatial randomness in the location of data points within the region, against alternative hypotheses of clustering or regularity. We only consider the one-sided p-value, since deviations from CSR will always be associated with a large squared difference. The second test-statistic is likely to have low power in many circumstances.

DISTEDFMC ! Intensive !

To construct EDF plots based on distances between all points within a region.

Calling statement

distedfmc c1 c2 k1 ;
nsim k1 (999) ;
npoints k1 (100) ;
distances c1 ;
edfs c1-c5.
Input

c1 and c2 should contain paired x and y co-ordinates for each point in the plane.
k1 should be the number of points at which observations are available.

Interactive input

The user is prompted to enter the minimum and maximum possible values of the x and y values. The macro checks whether these lie outside the range of the observed data; if they lie within the observed range, an error similar to the following will arise :
*** ERROR *** Stated theoretical minimum for x is greater than the observed minimum value of x.

Subcommands

npoints - Number of distance values which should be used to construct the EDF plot. Increasing the number of values will increase the resolution of the plot.
distances - Specify a column in which to store all distances between points.
edfs - Specify five columns in which to store :
1. An equally-spaced vector of distance values (determined by npoints)
2. An estimate of the expected cumulative distribution function
3. The empirical distribution function
4. The lower bound of the simulation envelope
5. The upper bound of the simulation envelope

Output

An EDF plot, with simulation envelope
A global assessment of randomness, with randomization p-value

Missing values

Are allowed. However, take note !
Important note : The specified number of data points (the third argument to the command) should not include any data points for which either the x or y value is missing. If this is not taken into account, the following error arises :
*** ERROR *** The number of points in the data is not equal to the specified number of points.

TECHNICAL DETAILS

See above.

STANDARD PROCEDURE

No standard MINITAB procedure is available.

REFERENCE

DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London.

WORKED EXAMPLE FOR DISTEDFMC

Name of dataset JAPANESE

Description

The data record the location of 65 Japanese black pine seedlings within a fixed square of side 5.7m.
Data have been scaled so that x and y co-ordinates must lie between 0 and 1.

Our source

DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London (p. 1).

Original source

NUMATA, M. (1961), Forest vegetation in the vicinity of Choshi. Coastal flora and vegetation at Choshi, Chiba Prefecture IV., Bull. Choshi Marine Lab. Chiba Uni., 3, pp. 28-48 [in Japanese].

Data

Number of observations = 65
Number of variables = 2

For each point, the x-value (top) and y-value (bottom) are given.

0.09 0.59 0.86 0.42 0.02 0.08 0.31 0.94 0.59 0.94 0.17 0.39 0.36 0.09 0.02 0.13 0.22 0.41 0.59 0.53 0.58 0.67 0.78 0.95 0.79 0.97 0.29 0.65 0.89 0.48 0.03 0.08 0.32 0.34 0.66 0.98 0.21 0.52 0.36 0.02 0.16 0.08 0.13 0.44 0.63 0.52 0.68 0.68 0.79 0.79 0.93 0.96 0.38 0.67 0.98 0.62 0.07 0.12 0.42 0.37 0.76 0.97 0.29 0.58 0.39 0.03 0.13 0.02 0.21 0.42 0.63 0.49 0.68 0.66 0.86 0.84 0.83 0.96 0.39 0.73 0.02 0.73 0.52 0.12 0.52 0.47 0.73 0.12 0.32 0.69 0.43 0.18 0.13 0.18 0.23 0.42 0.66 0.52 0.67 0.73 0.84 0.83 0.93 0.96 0.48 0.79 0.11 0.89 0.64 0.17 0.91 0.52 0.89 0.11 0.35 0.77 0.62 0.03 0.03 0.31 0.23 0.43 0.58 0.52 0.67 0.74 0.94 0.86 0.93 0.97

[Figure : Scatterplot of the seedling locations, x against y, both between 0 and 1.]

Worksheet

C1 X-co-ordinates
C2 Y-co-ordinates

Aim of analysis

To investigate whether the spatial distribution of Japanese pine seedlings is random.

Randomization procedure

MTB > Retrieve "N:\resampling\Examples\Japanese.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Japanese.MTW
# Worksheet was saved on 16/08/01 10:21:11
Results for: Japanese.MTW

MTB > % N:\resampling\library\distedfmc c1 c2 65 ;
SUBC> nsim 99 ;
SUBC> npoints 50 ;
SUBC> distances c4 ;
SUBC> edfs c6-c10.

Executing from file: N:\resampling\library\distedfmc.MAC

What are the minimum and maximum possible x co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 1
What are the minimum and maximum possible y co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 1

Exact Monte Carlo test of Complete Spatial Randomness (CSR), based on inter-event distances

* NOTE * In many circumstances, the test based on average deviation will be very weak to detect departures from CSR

Data Display (WRITE)
Observed test-statistic, average deviation 0.000344
Randomization p-value 0.3800
Observed test-statistic, maximum deviation 0.00145
Randomization p-value 0.4100

EDF plots of inter-event distances, with simulation envelopes

layout ;
* NOTE * Beginning LAYOUT mode. Type ENDLAYOUT to end mode.
plot expected*tvec hemp*tvec ltval*tvec utval*tvec;
* NOTE * Ending LAYOUT mode.

* NOTE * For further details, see DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London.

Modified worksheet

C6 Column containing 50 evaluation distances
C7 Column containing 50 empirical distribution function values, one for each distance
C8 Column containing 50 estimates of the cumulative distribution function, one for each distance
C9 Column containing 50 lower simulation envelope bounds, one for each distance
C10 Column containing 50 upper simulation envelope bounds, one for each distance

[Figure : EDF plot of inter-event distances, cumulative probability against distance. Observed EDF is solid line. Simulation envelope is formed by dotted lines.]

[Figure : EDF plot of inter-event distances, empirical against theoretical cumulative probability. Observed EDF is solid line. Simulation envelope is formed by dotted lines.]

Discussion

Both kinds of EDF plot give the same impression; the EDF lies close to the CDF throughout the range, and well within the simulation envelope. This suggests that we should not reject the null hypothesis of complete spatial randomness. This is confirmed by the formal tests, with non-significant p-values for tests based upon average deviation (p-value = 0.38) and maximum deviation (p-value = 0.41).
ADDITIONAL SAMPLE DATA FOR DISTEDFMC

Name of dataset REDWOOD

Suitable for use with DISTEDFMC, NEARESTMC, LOCREGULARMC

Description

The data record the location of 62 redwood seedlings within a fixed square of side 23m. Data have been scaled so that x and y co-ordinates must lie between 0 and 1.

Our source

DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London (p. 2).

Original sources

RIPLEY, B.D. (1977), Modelling spatial patterns (with Discussion), JRSS series B, 39, pp. 172-212.
STRAUSS, D.J. (1975), A model for clustering, Biometrika, 62, pp. 467-475.

Data

Number of observations = 62
Number of variables = 2

For each point, the x-value (top) and y-value (bottom) are given.

0.364 0.898 0.864 0.966 0.864 0.686 0.500 0.483 0.339 0.483 0.186 0.082 0.082 0.180 0.541 0.902 0.328 0.598 0.672 0.836 0.820 0.402 0.203 0.186 0.483 0.898 0.780 0.898 0.746 0.644 0.525 0.220 0.381 0.525 0.541 0.082 0.098 0.123 0.754 0.902 0.279 0.574 0.795 0.836 0.483 0.203 0.102 0.186 0.441 0.839 0.780 0.898 0.678 0.610 0.585 0.770 0.426 0.574 0.557 0.098 0.082 0.164 0.779 0.246 0.344 0.574 0.220 0.407 0.508 0.220 0.119 0.500 0.898 0.763 1.000 0.703 0.627 0.836 0.852 0.754 0.459 0.574 0.098 0.164 0.148 0.836 0.279 0.344 0.559 0.263 0.263 0.441 0.237 0.136 0.483 0.898 0.949 0.966 0.729 0.639 0.852 0.697 0.754 0.475 0.574 0.148 0.189 0.525 0.959 0.262 0.644 0.525 0.288 0.441 0.186 0.203 0.119 0.361 0.656 0.852 0.820 0.377 0.500 0.623

Worksheet

C1 X-co-ordinates
C2 Y-co-ordinates

NEARESTMC ! Intensive !

To compute kth nearest neighbour distances for a set of points within a fixed rectangular region, and to use a Monte Carlo test to determine the significance of each of these distances.

Calling statement

nearestmc c1 c2 k1 ;
nsim k1 ;
nstats k1 ;
distances m1 ;
nearest m1.

Input

c1 and c2 should contain paired x and y co-ordinates for each point in the plane.
k1 should be the number of points at which observations are available.
Interactive input
The user will be asked to specify the minimum and maximum theoretical ranges of the x and y co-ordinates; these quantities (XMIN, XMAX, YMIN and YMAX) are important, as they determine the region from which points may be drawn in the Monte Carlo simulation.

Subcommands
nstats - the macro considers kth nearest neighbour distances for k = 1,…,m. nstats specifies m, the maximum order of nearest neighbour distance to be considered.
distances - specify a matrix within which to store the distance matrix for the observed data.
nearest - specify a matrix within which to store the kth nearest neighbour distances (k = 1,…,m).

Output
gobs     Observed kth nearest neighbour distance
gsmean   Mean of simulated kth nearest neighbour distances
gsmin    Minimum of simulated kth nearest neighbour distances
gsmax    Maximum of simulated kth nearest neighbour distances
p1low    One-sided p-value for kth nearest neighbour distance being unusually small
p1high   One-sided p-value for kth nearest neighbour distance being unusually large
p2sided  Two-sided p-value for kth nearest neighbour distance

Null hypothesis : Complete spatial randomness in the location of data points within the region.

Test-statistic : We use kth nearest neighbour distances (for k = 1,…,nstats) as our test-statistics.

Simulation procedure : Assume that the sample size is n. For each Monte Carlo simulation, we simulate n realisations from a continuous uniform distribution on the interval [XMIN, XMAX], and use these as x co-ordinates. We also simulate n realisations from a continuous uniform distribution on the interval [YMIN, YMAX], and use these as y co-ordinates. We pair the x and y co-ordinates randomly, and use the resulting points as our simulated dataset.

References
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London (Chapter 4).

Standard procedure
No standard MINITAB procedure is available.
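The mechanics of this test can be sketched outside Minitab. The Python code below is our own illustration, not the macro: we assume the observed statistic for each k is the mean kth nearest neighbour distance over points, and we use p-values of the form (m + 1)/(nsim + 1), where m counts simulated values at least as extreme as the observed one, so that with 999 simulations the smallest attainable p-value is 0.001 (matching the macros). All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

def mean_kth_nn_distances(x, y, m):
    """Mean k-th nearest neighbour distance for k = 1..m (illustrative helper)."""
    pts = np.column_stack([x, y])
    d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2))
    np.fill_diagonal(d, np.inf)          # a point is not its own neighbour
    d.sort(axis=1)                       # row i: neighbour distances, ascending
    return d[:, :m].mean(axis=0)         # one value per k, averaged over points

def nearest_mc(x, y, m, xlim, ylim, nsim=999):
    """Monte Carlo test of CSR within the rectangle xlim x ylim."""
    n = len(x)
    gobs = mean_kth_nn_distances(x, y, m)
    n_ge = np.ones(m)                    # observed value counts as one simulation
    n_le = np.ones(m)
    for _ in range(nsim):
        xs = rng.uniform(xlim[0], xlim[1], n)
        ys = rng.uniform(ylim[0], ylim[1], n)
        g = mean_kth_nn_distances(xs, ys, m)
        n_ge += g >= gobs
        n_le += g <= gobs
    p1high = n_ge / (nsim + 1)           # distances unusually large
    p1low = n_le / (nsim + 1)            # distances unusually small
    return gobs, p1low, p1high, 2 * np.minimum(p1low, p1high)
```

As a quick sanity check, for four points at the corners of the unit square the 1st and 2nd nearest neighbour distances are both 1, and the 3rd is sqrt(2).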
WORKED EXAMPLE FOR NEARESTMC

Name of dataset
FIELD

Description
The data are artificial; they show the positions of 24 points within a 2m x 2m square.

Source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, Chapman and Hall, London.

Data
Number of observations = 24
Number of variables = 2
For each point, the x-value (top) and y-value (bottom) are given.

0.1 0.1 0.3 0.4 0.7 0.9 1.1 1.2 1.1 1.2 1.2 1.0
1.1 1.3 1.0 0.9 1.5 0.9 1.1 0.7 0.3 0.1 0.1 0.3
0.3 0.6 0.8 0.8 0.6 1.9 1.3 1.4 1.5 1.6 1.7 1.6
1.7 1.9 1.9 1.4 0.8 0.7 0.6 0.4 0.8 0.8 0.5 0.9

[Plot of the 24 point locations: y against x, over the square [0,2] x [0,2].]

Worksheet
C1 X-co-ordinates
C2 Y-co-ordinates

Aim of analysis
To investigate whether the location of points within the field is random.

Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Field.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Field.MTW
# Worksheet was saved on 16/08/01 10:11:34
Results for: Field.MTW
MTB > % N:\resampling\library\nearestmc c1 c2 24 ;
SUBC> nsim 999 ;
SUBC> nstats 12 ;
SUBC> distances m1 ;
SUBC> nearest m2.
Executing from file: N:\resampling\library\nearestmc.MAC
What are the minimum and maximum possible x co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 2
What are the minimum and maximum possible y co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 2

Monte-Carlo test for nearest-neighbour distances

Data Display

Row   gobs  gsmean  gsmin  gsmax  p1low  p1high  p2sided
  1  0.204   0.222  0.138  0.304  0.268   0.733    0.536
  2  0.306   0.345  0.245  0.456  0.128   0.873    0.256
  3  0.361   0.444  0.306  0.566  0.025   0.976    0.050
  4  0.421   0.527  0.375  0.676  0.011   0.990    0.022
  5  0.488   0.605  0.433  0.758  0.013   0.988    0.026
  6  0.537   0.678  0.497  0.857  0.011   0.990    0.022
  7  0.599   0.746  0.556  0.927  0.011   0.990    0.022
  8  0.641   0.813  0.601  1.008  0.009   0.992    0.018
  9  0.684   0.876  0.638  1.088  0.009   0.992    0.018
 10  0.721   0.936  0.686  1.156  0.007   0.994    0.014
 11  0.758   0.995  0.720  1.301  0.007   0.994    0.014
 12  0.786   1.054  0.741  1.359  0.003   0.998    0.006

Modified worksheet
M1 A 24*24 matrix of distances between sample points
M2 A 999*12 matrix of kth nearest neighbour distances (k = 1,…,12). The kth column contains 999 kth nearest neighbour distances, one for each simulated dataset

Discussion
The 1st and 2nd nearest neighbour distances are not significantly different from what we would expect if the points were distributed at random. All higher-order distances are significant at the 5% level, with the degree of significance increasing as the order (k) of the distance increases. However, if we allow for multiple testing, then we should really reduce the significance level from 5% to 5/12 = 0.42% (using the Bonferroni inequality, a conservative procedure). In this case, none of the distances is found to be significant, suggesting that there is no strong evidence against randomness. The findings differ from those of Manly (1997), probably because we extracted the data from his graph by eye, and this would have introduced error.

LOCREGULARMC

! Intensive !

To perform Monte Carlo tests for randomness, against the alternative hypotheses of regularity or local regularity, using three statistics based upon nearest neighbour distances.

RUNNING THE MACRO

Calling statement
locregularmc c1 c2 k1 ; nsim k1 (999) ; distances m1 ; statistics c1-c3.
Input
c1 and c2 should contain paired x and y co-ordinates for each point in the plane. k1 should be the number of points at which observations are available.

Subcommands
nsim        Number of Monte Carlo simulations
distances   Specify a matrix within which to store the distance matrix
statistics  Specify three columns, within which to store simulated statistics for D (column 1), S (column 2) and G (column 3).

Output
Observed D, S and G statistics, with associated 1-sided randomization p-values.
Indices of regularity, based upon S and G.
Note that only the one-sided randomization p-values which test against the alternative hypothesis of regularity are given.

ALTERNATIVE PROCEDURES

Standard procedures
No standard MINITAB procedure is available.

TECHNICAL DETAILS

Null hypothesis : Complete spatial randomness in the location of data points within the region.

Alternative hypothesis : Regularity (including local regularity).

Test-statistic : We use three different test-statistics:
D, the mean squared nearest neighbour distance.
S, the coefficient of variation of the squared nearest neighbour distances.
G, the ratio of the geometric mean of the squared NN distances to their arithmetic mean.

These are defined to be

D = (1/n) * sum(v_i),   S = sqrt( sum((v_i - D)^2) / (n-1) ) / D,   G = ( prod(v_i) )^(1/n) / D,

where the sums and products run over i = 1,…,n and the v_i are the squared nearest neighbour distances.

The test-statistics have the following properties:
D > 0. Large values of D tend to indicate regularity.
S > 0. Small values of S tend to indicate regularity. For complete regularity, S = 0.
G lies between 0 and 1. Large values of G tend to indicate regularity.

The test-statistic D is often used to test for spatial randomness, against alternative hypotheses of both clustering and regularity, and S and G can be used for the same purpose.
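The three statistics are simple functions of the squared nearest neighbour distances. As a rough illustration (our own Python sketch, with our own function name, assuming Euclidean distances):

```python
import numpy as np

def regularity_statistics(x, y):
    """D, S and G computed from squared nearest neighbour distances."""
    pts = np.column_stack([x, y])
    d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2))
    np.fill_diagonal(d, np.inf)              # exclude self-distances
    v = d.min(axis=1) ** 2                   # squared nearest neighbour distances
    n = len(v)
    D = v.mean()                             # mean squared NN distance
    S = np.sqrt(((v - D) ** 2).sum() / (n - 1)) / D   # coefficient of variation
    G = np.exp(np.log(v).mean()) / D         # geometric mean / arithmetic mean
    return D, S, G

# A 3x3 unit grid is perfectly regular: every NN distance is 1
gx, gy = np.meshgrid(np.arange(3.0), np.arange(3.0))
D, S, G = regularity_statistics(gx.ravel(), gy.ravel())
print(D, S, G)                               # D = 1.0, S = 0.0, G = 1.0
IG, IS = np.sqrt(1.0 - min(G, 1.0)), np.sqrt(S)   # Brown & Rothery's indices
```

For the perfectly regular grid, S and G attain their extreme values (0 and 1), so both regularity indices are zero.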
In this macro, we restrict attention to the one-sided alternative hypothesis of regularity, since there are problems in interpreting deviations of S and G from randomness in the opposite direction.

All three statistics should be sensitive to the detection of global regularity (i.e. the usual kind of large-scale regularity), but Brown & Rothery (1978) suggest that S and G should be more effective test-statistics than D for detecting local regularity (i.e. regularity at the small scale, although the data are clustered at the large scale).

Simulation procedure : Assume that the sample size is n. For each Monte Carlo simulation, we simulate n realisations from a continuous uniform distribution on the interval [XMIN, XMAX], and use these as x co-ordinates. We also simulate n realisations from a continuous uniform distribution on the interval [YMIN, YMAX], and use these as y co-ordinates. We pair the x and y co-ordinates randomly, and use the resulting points as our simulated dataset.

Indices of regularity
Brown and Rothery (1978) suggest that suitable indices of regularity / local regularity might be:
IG = sqrt(1 - G)
IS = sqrt(S).

REFERENCE
BROWN, D. & ROTHERY, P. (1978), Randomness and local regularity of points in a plane, Biometrika, 65, pp. 115-122.

WORKED EXAMPLE FOR LOCREGULARMC

Name of dataset
CELLS

Description
The data record the location of the centres of 42 biological cells within a fixed square of known size. Data have been scaled so that x and y co-ordinates lie between 0 and 1.

Source
DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London (p. 1).

Original sources
RIPLEY, B.D. (1977), Modelling spatial patterns (with Discussion), JRSS series B, 39, pp. 172-212.
CRICK, F.H.C. & LAWRENCE, P.A. (1975), Compartments and polyclones in insect development, Science, 189, pp. 340-347.

Data
Number of observations = 42
Number of variables = 2
For each point, the x-value (top) and y-value (bottom) are given.
0.350 0.062 0.938 0.462 0.462 0.737 0.800 0.337 0.350 0.637
0.325 0.025 0.362 0.400 0.750 0.900 0.237 0.387 0.750 0.962
0.050 0.287 0.350 0.737 0.487 0.212 0.150 0.525 0.625 0.825
0.650 0.725 0.237 0.600 0.687 0.087 0.337 0.500 0.650 0.950
0.125 0.362 0.512 0.787 0.775 0.450 0.562 0.862 0.237 0.337
0.987 0.775 0.087 0.900 0.862 0.025 0.287 0.575 0.637 0.150
0.462 0.512 0.850 0.187 0.262 0.525 0.637 0.575 0.600 0.175
0.175 0.400 0.462 0.062 0.900 0.812 0.212 0.475 0.650 0.912
0.162 0.425 0.750 0.775

[Plot of the 42 cell centres: y against x, over the unit square.]

Worksheet
C1 X-co-ordinates
C2 Y-co-ordinates

Aims of analysis
To investigate whether the distribution of cells within the study region is random.

Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Cells.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Cells.MTW
# Worksheet was saved on 24/08/01 15:00:13
Results for: Cells.MTW
MTB > % N:\resampling\library\locregularmc c1 c2 42 ;
SUBC> nsim 999 ;
SUBC> distances m1 ;
SUBC> statistics c4-c6.
Executing from file: N:\resampling\library\locregularmc.MAC
What are the minimum and maximum possible x co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 1
What are the minimum and maximum possible y co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 1

Monte-Carlo tests for local regularity of points in a fixed rectangular plane

Data Display
Observed D statistic 0.01694
Randomization p-value 0.0010

Data Display
Observed S statistic 0.06669
Randomization p-value 0.0010

Data Display
Observed G statistic 0.9626
Randomization p-value 0.001000

Indices of local regularity

Data Display (WRITE)
Index based on G   0.1934
Index based on S   0.2582

* NOTE * For further details, see BROWN, D. & ROTHERY, P. (1978), Randomness and local regularity of points in a plane, Biometrika, 65, pp. 115-122.
Modified worksheet
M1 A 42*42 matrix of distances between sample points
C4 A column of 999 D statistics, one for each simulated dataset
C5 A column of 999 S statistics, one for each simulated dataset
C6 A column of 999 G statistics, one for each simulated dataset

Discussion
All three statistics have picked up the obvious (global) regularity in the dataset, with p-values of 0.001 (the minimum possible p-value for 999 randomizations) in all cases.

7 OTHER MACROS

In the course of developing the macro library, we also created two further routines, unrelated to resampling methods.

DIFFMATRIX

To extract a matrix of differences from a column of data.

RUNNING THE MACRO

Calling statement
diffmatrix c1 m1 k1

Input
C1 should be a column of numeric data. Missing values are not allowed.
M1 should be an empty matrix in which the differences are to be stored.
K1 should be the number of observations (equal to the length of C1).

Output
A matrix of differences. If xi is the ith element of the input, then the ijth element of the output matrix is equal to (xi - xj).

WORKED EXAMPLE FOR DIFFMATRIX

Data
WALES (see SPATAUTORAN)

Aims of analysis
To compute the matrix of differences between population change ranks for each county.

Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Wales.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Wales.MTW
# Worksheet was saved on 15/08/01 12:31:16
Results for: Wales.MTW
MTB > % N:\resampling\library\diffmatrix c3 m5 13.
Executing from file: N:\resampling\library\diffmatrix.MAC
MTB > print m5

Data Display
Matrix M5
0 3 4 -3 0 1 -4 -1 0 -2 1 2 -5 -2 -1 2 -1 -2 0 -3 5 2 1 3 0 1 -2 -3 -1 -4
-2 -5 -6 -4 -7 -1 -4 -5 -3 -6 7 4 3 5 2 -4 -7 -8 -6 -9 6 3 2 4 1 -3 -6 -7 -5 -8
8 5 4 6 3 -1 2 1 -7 4 -6 3 -8 2 5 4 -4 7 -3 6 -5 3 6 5 -3 8 -2 7 -4 1
4 3 -5 6 -4 5 -6 4 7 6 -2 9 -1 8 -3 0 -3 -2 6 -5 5 -4 7 3 0 1 9 -2 8 -1
10 2 -1 0 8 -3 7 -2 9 -6 -9 -8 0 -11 -1 -10 1 5 2 3 11 0 10 1 12 -5 -8 -7 1 -10
0 -9 2 4 1 2 10 -1 9 0 11 -7 -10 -9 -1 -12 -2 -11 0

MISSING

To remove missing values from a number of columns of data.

RUNNING THE MACRO

Calling statement
missing c1-cN ; type k1 (1).

Input
c1-cN Any number of columns containing numeric data.
Missing values : Allowed.

Subcommand
type - determines how missing values are treated. If type = 1 (default), then missing values are removed separately from each column. If type = 2, then if any row contains missing values in one or more columns, the entire row is removed. If type = 2, then the columns c1-cN must all be of the same length.

Output
The columns c1-cN are changed on the worksheet, to omit missing values.

WORKED EXAMPLE FOR MISSING

Name of dataset
PUFFIN

Description
The data concern puffin beak measurements, and derive from a study of St. Kilda puffin populations in 1991-93. Beak length and depth are recorded, along with sex (1 = male, 2 = female).

Source
Our own unpublished data (Centre for Ecology and Hydrology).

Data
Number of observations = 41
Number of variables = 3
For each observation, sex (top), length (middle) and depth (bottom) are given.
1.0 2.0 2.0 1.0 1.0 1.0 2.0 1.0 2.0 1.0 1.0 1.0 2.0
28.2 28.5 * 27.1 27.8 28.8 28.0 29.0 27.5 28.0 29.7 26.6 28.0
35.5 32.6 30.6 34.3 37.6 39.0 33.8 34.9 29.1 36.9 35.9 34.4 34.0
1.0 2.0 1.0 1.0 2.0 2.0 2.0 2.0 1.0 1.0 2.0 2.0 2.0
28.2 29.0 27.7 28.0 27.6 29.3 27.5 27.2 29.1 29.5 30.4 29.1 27.8
34.6 34.7 34.0 37.0 33.2 31.4 32.5 32.6 34.5 34.5 32.8 34.0 33.8
2.0 1.0 1.0 2.0 1.0 2.0 1.0 2.0 1.0 1.0 2.0 1.0 2.0
27.6 30.0 27.6 26.6 28.8 28.0 30.5 29.2 28.8 29.9 28.6 29.2 27.0
32.0 36.1 36.5 31.9 34.3 28.6 34.1 29.8 33.5 35.8 32.3 34.8 32.9
2.0 1.0
28.5 29.0
30.7 34.0

Minitab output and discussion

First of all, we attempt to remove missing values from the dataset separately for each variable. To do this, we use the default for the "type" subcommand.

Welcome to Minitab, press F1 for help.
MTB > Retrieve "N:\resampling\Examples\Puffin.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Puffin.MTW
# Worksheet was saved on 04/07/01 11:43:12
Results for: Puffin.MTW
MTB > % N:\resampling\library\missing c1-c3
Executing from file: N:\resampling\library\missing.MAC
* NOTE * Some variables contain missing values, which have been excluded.

The single missing value has been excluded from the dataset.

Now we reload the original data, and attempt to remove completely any individuals which have missing values. To do this, we set the "type" subcommand to 2.

MTB > Retrieve "N:\resampling\Examples\Puffin.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Puffin.MTW
# Worksheet was saved on 04/07/01 11:43:12
Results for: Puffin.MTW
MTB > % N:\resampling\library\missing c1-c3 ;
SUBC> type 2.
Executing from file: N:\resampling\library\missing.MAC
* NOTE * Some units have one or more items of missing data, and have been excluded.

The individual with the missing value has been excluded from the data.
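These two utility routines are simple enough to mimic directly. The following Python sketch (with our own function names) reproduces their behaviour: diffmatrix builds the matrix of differences xi - xj, and drop_missing implements the two "type" options of MISSING, dropping missing values per column or deleting whole rows.

```python
import numpy as np

def diffmatrix(x):
    """Matrix of differences: element (i, j) equals x[i] - x[j]."""
    x = np.asarray(x, dtype=float)
    return x[:, None] - x[None, :]           # broadcast pairwise differences

def drop_missing(columns, type_=1):
    """type_ = 1: drop NaNs separately per column; type_ = 2: drop whole rows."""
    cols = [np.asarray(c, dtype=float) for c in columns]
    if type_ == 1:
        return [c[~np.isnan(c)] for c in cols]   # columns may end up different lengths
    data = np.column_stack(cols)                 # requires equal-length columns
    keep = ~np.isnan(data).any(axis=1)
    return [data[keep, i] for i in range(data.shape[1])]

ranks = [1, 4, 2]
m = diffmatrix(ranks)                        # m[1, 0] = 4 - 1 = 3; diagonal is 0

sex = [1.0, 2.0, 2.0]
length = [28.2, np.nan, 27.1]
per_col = drop_missing([sex, length])            # lengths 3 and 2
by_row = drop_missing([sex, length], type_=2)    # both length 2
```

Note that the resulting difference matrix is antisymmetric (its transpose equals its negative), which is why the printed M5 above contains each difference twice with opposite signs.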
ADDITIONAL SAMPLE DATA

[Table: each sample dataset (Artificial, cells, colony, darwin, Earwigs, Exponential, Extinction, Fernbirds, field, hexokinase, japanese, lizards, …, Orthoptera, Proloculi, puffin, redwood, sapling, Sapling2, Snail, spatial, swavesey, twoway, Wales) is marked against the macros ONESAMPLERAN, TWOSAMPLERAN, TWOTRAN, TWOTUNPOOLBOOT, TWOTPOOLBOOT, CORRELATIONRAN, MEANCIBOOT, MEDIANCIBOOT, STDEVCIBOOT, ANYCIBOOT, ONEWAYRAN, TWOWAYRAN, TWOWAYREPRAN, LEVENERAN, REGRESSSIMRAN, REGRESSOBSRAN, REGRESSRESRAN, REGRESSBOOT, ACFRAN, TRENDRAN, SPATAUTORAN, MANTELRAN, MEAD4RAN, MEAD8RAN, DISTEDFMC, NEARESTMC and LOCREGULARMC.]

If the user requires additional sample datasets for any particular macro, the above table suggests some possibilities (although many of the datasets could be modified to be suitable for the application of other macros).

KEY : Worked example for this dataset / Suitable alternative dataset for this macro

ACCESSING THE MACROS

Our intention is that the macro library can be accessed in three ways:

From the Minitab website : We have submitted our macros, together with documentation and sample data, to the Minitab macro library at http://www.minitab.com/support/macros/index.asp If accepted as suitable, they will appear on this website shortly.

From the CEH website : The macros will also be placed on the CEH website at http://www.ceh.ac.uk/ For details of the exact location, contact the authors.

On disk : If you wish to obtain a copy of the macros, please send a blank disk to either Peter Rothery (CEH Monks Wood) or Adam Butler (Lancaster University).
CONTACT DETAILS

Further information
If you encounter any problems, or for further information, in the first instance please e-mail [email protected] [Adam Butler]

Full contact details

Peter Rothery
CEH Monks Wood
Abbots Ripton
Huntingdon
Cambridgeshire
PE28 2LS
Phone: (01487) 772448
E-mail: [email protected]

Adam Butler
Department of Mathematics and Statistics
Lancaster University
Bailrigg
Lancaster
LA1 4YF
E-mail: [email protected]

ACKNOWLEDGEMENTS

Peter Rothery, for supervision of the project.
David Roy, for computer support.
Phil Croxton, for advice upon the layout of the report.
Hannah Butler, for contributing the macro MANTELRAN.
Thanks also to all authors whose data we have quoted.

REFERENCES

BROWN, D. & ROTHERY, P. (1978), Randomness and local regularity of points in a plane, Biometrika, 65, pp. 115-122.
CAIN, A.J. & SHEPPARD, P.M. (1950), Selection in the polymorphic land snail Cepaea nemoralis, Heredity, 4, pp. 275-294.
CLIFF, A.D. & ORD, J.K. (1973) Spatial autocorrelation, Pion, London.
CRICK, F.H.C. & LAWRENCE, P.A. (1975), Compartments and polyclones in insect development, Science, 189, pp. 340-347.
DAVISON, A.C. & HINKLEY, D.V. (1997) Bootstrap methods and their application, CUP, Cambridge.
DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London.
DRAPER, N.R. & SMITH, H. (1998) Applied regression analysis (3rd edition), John Wiley & Sons, New York (Chapter 26).
EFRON, B. & TIBSHIRANI, R.J. (1993) An introduction to the Bootstrap, Chapman and Hall, London.
FISHER, R.A. (1935) The design of experiments, Oliver & Boyd, Edinburgh.
GENERAL REGISTER OFFICE (1961) England and Wales: Preliminary Census Report, 1961, HMSO, London.
HARRIS, W.F. (1986) The breeding ecology of the South Island Fernbird in Otago Wetlands, PhD Thesis, University of Otago, Dunedin, New Zealand.
HIGHAM, C.F.W., KIJNGAM, A. & MANLY, B.F.J. (1980), An analysis of prehistoric canid remains from Thailand,
Journal of Archaeological Science, 7, pp. 149-165.
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, 2nd edn, Chapman & Hall, London.
McKECHNIE, S.W., EHRLICH, P.R. & WHITE, R.R. (1975), Population genetics of Euphydryas butterflies. I. Genetic variation and the neutrality hypothesis, Genetics, 81, pp. 571-594.
MINITAB INC. (1999) MINITAB User's Guide, Release 13 for Windows, Minitab Inc., 3081 Enterprise Drive, State College, Pennsylvania 16801-3008.
NUMATA, M. (1961), Forest vegetation in the vicinity of Choshi. Coastal flora and vegetation at Choshi, Chiba Prefecture IV., Bull. Choshi Marine Lab. Chiba Uni., 3, pp. 28-48 [in Japanese].
POPHAM, E.J. & MANLY, B.F.J. (1969), Geographical distribution of the Dermaptera and the continental drift hypothesis, Nature, 222, pp. 981-982.
POWELL, G.L. & RUSSELL, A.P. (1984), The diet of the eastern short-horned lizard (Phrynosoma douglassi brevirostre) in Alberta and its relationship to sexual size dimorphism, Canadian Journal of Zoology, 62, pp. 428-440.
POWELL, G.L. & RUSSELL, A.P. (1985), Growth and sexual size dimorphisms in Alberta populations of the eastern short-horned lizard, Phrynosoma douglassi brevirostre, Canadian Journal of Zoology, 63, pp. 139-154.
RAUP, D.M. (1987), Mass extinctions: a Discussion, Palaeontology, 30, pp. 1-13.
REYMENT, R.A. (1982), Phenotypic evolution in a Cretaceous foraminifer, Evolution, 36, pp. 1182-1199.
RIPLEY, B.D. (1977), Modelling spatial patterns (with Discussion), JRSS series B, 39, pp. 172-212.
STRAUSS, D.J. (1975), A model for clustering, Biometrika, 62, pp. 467-475.
TER BRAAK, C.J.F. (1992), Permutation versus bootstrap significance tests in multiple regression and ANOVA, in Bootstrapping and Related Techniques (ed. K.H. Jockel), Springer-Verlag, Berlin, pp. 79-86.