					Minitab Macros for
Resampling Methods
By Adam Butler
CEH Monks Wood
September 2001
SUMMARY
This report describes a library of macros for implementing a variety of statistical methods in
Minitab using computationally-intensive methods of inference (randomization, bootstrapping and
Monte Carlo simulation).
CONTENTS
INTRODUCTION
1 Resampling methods in statistics
   What are they ?
   When should I use them ?
   Randomization, bootstrapping and Monte Carlo simulation
   A note on the use of p-values
2 Resampling in Minitab
   Minitab
   Some useful Minitab commands
   The resampling macros
   Other sources of information
   Arguments to the macros
   Subcommands in the macros
   Computing power and number of resamples
   Speed
3 How to use this guide
   Information about the macros
   Worked examples
4 Literature review
REFERENCE MANUAL
1 Significance tests
   Overview
   ONESAMPLERAN
   TWOSAMPLERAN
   TWOTRAN
   TWOTPOOLBOOT
   TWOTUNPOOLBOOT
   CORRELATIONRAN
2 Confidence intervals
   Overview
   An introduction to bootstrap confidence intervals
   MEANCIBOOT
   MEDIANCIBOOT
   STDEVCIBOOT
   ANYCIBOOT
3 Analysis of variance
   Overview
   ONEWAYRAN
   TWOWAYRAN
   TWOWAYREPRAN
   LEVENERAN
4 Regression
   Overview
   Should we resample residuals or observations ?
   REGRESSSIMRAN
   REGRESSOBSRAN
   REGRESSRESRAN
   REGRESSBOOT
5 Time series
   Overview
   ACFRAN
   TRENDRAN
6 Spatial statistics
   Overview
   Which procedure should I use ?
   Using the macros for spatial statistics
   SPATAUTORAN
   MANTELRAN
   MEAD4RAN
   MEAD8RAN
   Creating and interpreting EDF plots
   DISTEDFMC
   NEARESTMC
   LOCREGULARMC
TABLE OF ALTERNATIVE DATASETS
REFERENCES
ACKNOWLEDGEMENTS AND CONTACT DETAILS
APPENDIX : Reference card for the macros
INTRODUCTION
1 RESAMPLING METHODS IN STATISTICS
What are they ?
Resampling methods are a class of statistical techniques for drawing inferences based on the
variability present within a dataset.
Resampling methods (sometimes known as computationally intensive methods) include :
• Bootstrapping
• Randomization tests (also known as permutation tests)
• Monte Carlo tests and related procedures
In general, resampling methods are difficult to justify in theory, but relatively easy to apply in
practice.
The common concept underlying all resampling methods is that we can assess the variability by
drawing a large number of samples, each having the same size as the original dataset, from the
observed data (this is the process of resampling); we then compare the properties of the observed data
to the properties of the resampled datasets.
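The idea described above can be sketched in a few lines of Python (chosen here only for illustration; the library itself is written as Minitab macros). The sketch draws many resampled datasets, each the same size as the original, and looks at how much the mean varies between them. The function name, the seed and the choice of sampling with replacement are assumptions of this sketch; the data are the DARWIN example values listed later in this guide.

```python
import random
import statistics

def resampled_means(data, n_resamples=999, seed=0):
    """Draw n_resamples datasets of the same size as `data` from the
    observed values (with replacement in this sketch) and return the
    mean of each resampled dataset."""
    rng = random.Random(seed)
    n = len(data)
    return [statistics.mean(rng.choices(data, k=n)) for _ in range(n_resamples)]

# DARWIN example data from the worked example in this guide.
data = [43, 67, 64, 64, 51, 53, 53, 26, 36, 48, 34, 48, 6, 28, 48]

means = resampled_means(data)
spread = statistics.stdev(means)  # variability of the mean across resamples

print(f"observed mean             = {statistics.mean(data):.2f}")
print(f"spread of resampled means = {spread:.2f}")
```

The spread of the resampled means is what is then compared against the observed data when forming a confidence interval or a p-value.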
When should I use them ?
Resampling methods are useful for obtaining assessments of variability - this means that they are
principally used to calculate confidence intervals and p-values.
Resampling methods can be used with many different statistical methods - including comparison of
two samples, ANOVA, regression, spatial statistics, time series and multivariate analysis - and can
potentially be applied to any area of application; Manly (1997) discusses how resampling methods
have been applied in a number of different areas of biology.
Resampling methods have become increasingly popular in recent years, partly because of increasing
computer power.
Resampling methods are usually used instead of - or alongside - standard techniques for drawing
inferences from data. Standard techniques usually rely upon statistical theory (especially asymptotic
arguments) and assumptions about the distribution of the data (for example, that the data are normally
distributed). Resampling methods do not make these assumptions, and so should be more reliable in
those situations in which the standard assumptions are false.
If the assumptions underlying standard theory are valid, resampling and standard techniques should
give very similar results. In fact, resampling methods often give similar results to standard theory
even if the assumptions underlying standard theory are not valid.
Resampling methods also rely upon their own (fairly complicated) assumptions. It is felt that these
assumptions will often be valid, or approximately valid, but it is worth noting that there are situations
in which the application of resampling methods may go badly wrong.
Resampling methods place much emphasis on the observed dataset, and so may be very susceptible to
any errors or problems with the data that has been collected. It is therefore important to check data
carefully, and to use graphical techniques to look for outlying points.
Possibly the most interesting feature of resampling methods is their generality - they may be used to
tackle a wide variety of practical statistical problems, including problems for which standard theory
does not yet exist, in a fairly straightforward way.
Randomization, bootstrapping and Monte Carlo simulation
The macros in this library sometimes use randomization tests, and sometimes use bootstrapping. The
differences between the two techniques are rather subtle. The key differences are that :
• In practice, if we are in a situation in which either method can be used, then the methods work in an
almost identical fashion. Usually the only difference is that randomization tests involve resampling
without replacement (i.e. we simply re-order the original data), whereas bootstrapping involves
resampling with replacement (i.e. a value from the original data may occur more than once in a
resampled dataset).
• Bootstrap methods are substantially more general than randomization methods, and may often be used
in situations in which randomization methods are not available.
• The assumptions which justify the use of the two techniques are different.
A few of the macros for spatial statistics use Monte Carlo methods; these are a more general class of
technique than either randomization or bootstrap methods (in fact, both randomization and bootstrap
methods can be viewed as special cases of Monte Carlo methods). Whilst bootstrap and randomization
methods involve simulating only from the observed data, Monte Carlo methods involve taking
simulations using a statistical model. All of the Monte Carlo methods in this study involve simulating
datapoints at random from within a fixed rectangular region, in order to examine the hypothesis of
Complete Spatial Randomness (CSR).
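As a rough illustration of simulating under Complete Spatial Randomness, the Python sketch below scatters points uniformly over a fixed rectangle and records a summary statistic for each simulated pattern. The rectangle size, the number of points, the number of simulations and the choice of mean nearest-neighbour distance as the statistic are all hypothetical values picked for this sketch, not the settings used by the macros.

```python
import math
import random

def simulate_csr(n_points, width, height, rng):
    """Scatter n_points independently and uniformly over a
    width x height rectangle (the CSR model)."""
    return [(rng.uniform(0.0, width), rng.uniform(0.0, height))
            for _ in range(n_points)]

def mean_nearest_neighbour_distance(points):
    """Average distance from each point to its closest other point."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)

rng = random.Random(42)
# 99 simulated patterns of 30 points in a 10 x 5 rectangle (hypothetical values).
simulated_stats = [
    mean_nearest_neighbour_distance(simulate_csr(30, 10.0, 5.0, rng))
    for _ in range(99)
]
print(f"simulated statistic ranges from {min(simulated_stats):.3f} "
      f"to {max(simulated_stats):.3f}")
```

The statistic computed from the observed point pattern would then be located within this simulated distribution, in the same way as for a randomization test.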
A note on the use of p-values
The bulk of the macros in the library deal with significance tests. In general, these involve testing a null
hypothesis against one or more possible alternative hypotheses. Performing a significance test involves
calculating the location of the observed test-statistic value t within the probability distribution of the test-statistic. Assume that the true distribution is T. This probability distribution can often be approximated
either using statistical theory, or, as is the case in the macros, by resampling.
One-sided randomization p-values
Assume for the time being that we are only interested in the alternative hypothesis which implies a large
value of the test-statistic. Then the true one-sided p-value of interest is
$$p = P(T \geq t)$$
Standard procedure
If the test-statistic is known by statistical theory to follow a particular distribution, $T_a$, then the standard
one-sided p-value of interest is given by
$$p_s = P(T_a \geq t) \approx p.$$
For a continuous distribution, this value is obtained by integrating the probability density from t to
infinity, whilst for a discrete distribution the probability mass function is summed from t to infinity
(inclusive).
Randomization
If the resampling distribution is given by $T_r$, then the resampling one-sided p-value of interest is given by
$$p_r = P(T_r \geq t) \approx p.$$
This value is therefore the proportion of all test-statistics (the set of resampled test-statistics, plus the
observed test-statistic, since under the null hypothesis this is also a realisation from T) greater than or
equal to t.
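The counting rule just described can be written as a short Python sketch. The function name and the toy numbers are hypothetical; the only thing taken from the text is that the observed test-statistic is counted alongside the resampled ones, so that with N resamples the p-value is (1 + number of resampled values at least as large as t) / (N + 1).

```python
def one_sided_p_value(observed, resampled):
    """Proportion of all test-statistics (the resampled values plus the
    observed value itself) that are greater than or equal to `observed`."""
    count = sum(1 for value in resampled if value >= observed)
    return (count + 1) / (len(resampled) + 1)

# Toy illustration: one of four resampled values is >= 3.5,
# so the p-value is (1 + 1) / (4 + 1).
print(one_sided_p_value(3.5, [1, 2, 3, 4]))
```

One consequence of including the observed value is that the smallest p-value obtainable from 999 resamples is 1/1000, and from 499 resamples is 1/500.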
Two-sided randomization p-values
Now assume that either alternative hypothesis may be of interest.
One-sided randomization p-values for the opposite alternative hypothesis, corresponding to small values
of the test-statistic, are analogous to those defined above. Calculating the two-sided p-value, i.e. the
probability of the test-statistic being extreme in either direction, is more complicated, because there are
two possible approaches :
1. Let the two-sided p-value be the probability of being as far from the mean of the
distribution as the observed test-statistic, in either direction
2. Let the two-sided p-value be double the smaller of the one-sided p-values
Most standard theory either uses distributions for which only one-sided p-values are relevant (e.g. F and
chi-squared distributions), or uses distributions (e.g. normal or t distributions) which are symmetric. For
a symmetric distribution, either method of computing the two-sided p-value will give the same answer,
because both of the one-sided p-values will be the same.
When we obtain distributions by resampling, however, there is no reason to assume that they will be
symmetric. For very non-symmetric distributions, the first approach to computing two-sided p-values
may give substantially misleading results. The disadvantage of using the second approach in the
resampling context is that we use the data from only one tail of the distribution, so that we need a greater
number of resamples to give the same accuracy in the calculation of p-values. Since resampling methods
are most useful for those situations in which standard approximations are not valid - i.e. for situations in
which test-statistics have highly skewed distributions - we use the second method of computing
resampling two-sided p-values (where these are required) throughout the library.
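A minimal Python sketch of the second approach (doubling the smaller one-sided p-value) is given below. The function name is hypothetical, and capping the result at 1 is an assumption of this sketch that the guide does not state.

```python
def randomization_p_values(observed, resampled):
    """Return (lower-tail, upper-tail, two-sided) randomization p-values.
    The observed statistic is counted together with the resampled ones."""
    n_total = len(resampled) + 1
    p_lower = (1 + sum(1 for v in resampled if v <= observed)) / n_total
    p_upper = (1 + sum(1 for v in resampled if v >= observed)) / n_total
    # Two-sided value: double the smaller one-sided value.
    # The cap at 1 is an assumption of this sketch.
    p_two_sided = min(1.0, 2 * min(p_lower, p_upper))
    return p_lower, p_upper, p_two_sided

# Toy case: the observed value lies below all 499 resampled values.
p_lower, p_upper, p_two_sided = randomization_p_values(-5.0, [0.0] * 499)
print(p_lower, p_upper, p_two_sided)
```

Because only the smaller tail contributes, the two-sided value is driven by the resamples falling in one tail, which is why more resamples are needed for a given accuracy.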
2 RESAMPLING IN MINITAB
MINITAB
MINITAB is a general purpose package for data manipulation and statistical analysis.
This guide outlines a library of MINITAB macros which have been written to perform a variety of
commonly-used statistical procedures using randomization and bootstrapping methods, rather than the
more traditional (and less computationally-intensive) methods which involve approximations and
distributional theory.
In order to run the macros, you must work in the session window.
To open the session window, click on Window on the menu bar. Move down the list, and select
Session. Click on the Editor menu, move down the list and select Enable Commands.
The various macros may be invoked by typing their name at the MTB> prompt; all other MINITAB
commands can also be invoked from this prompt.
Some useful MINITAB commands
The most useful MINITAB commands in the context of randomization and bootstrapping are :
statistics
Can be used to display or store a wide variety of descriptive statistics for a given column. The subcommands specify the various descriptive statistics to be used.
Example:
statistics c1;
mean c2.
This takes the mean of column c1 and stores it in the first element of column c2.
sample
Can be used to draw a sample, with or without replacement, from a column.
Example:
sample 10 c1 c2
This takes a sample of size 10 from column c1, without replacement, and stores it in c2.
Randomization tests are based upon sampling without replacement.
Example:
sample 7 c1 c2 ;
replace.
This takes a sample of size 7 from column c1, with replacement, and stores it in c2.
Bootstrapping is based upon sampling with replacement.
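For readers more familiar with a general-purpose language, the two forms of the sample command can be mirrored roughly as follows in Python. This is only an analogy under assumed values: the list c1 is hypothetical data, and the standard-library calls stand in for the Minitab command rather than reproducing it.

```python
import random

rng = random.Random(1)
c1 = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]  # hypothetical column of 10 values

# Without replacement: taking all 10 values simply re-orders the column,
# which is the basis of a randomization test.
permuted = rng.sample(c1, k=len(c1))

# With replacement: a value may appear more often than in the original,
# or not at all, which is the basis of bootstrapping.
bootstrap = rng.choices(c1, k=len(c1))

print("permuted :", permuted)
print("bootstrap:", bootstrap)
```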
random
Can be used to simulate random datasets from standard probability distributions.
Example:
random 50 C1;
normal 0 1.
This simulates 50 values from a standard normal [i.e. a Normal(0,1)] distribution, and stores the simulated
values in c1.
Example:
random 10 c2;
poisson 6.
This simulates 10 values from a Poisson distribution with parameter 6, and stores the values in c2.
The resampling macros
The resampling macros are designed, as far as possible, to mimic standard MINITAB functions for the
statistical methods in question. In some cases, there are both randomization and bootstrap versions of
standard MINITAB commands; the justification for using randomization and bootstrap techniques is
substantially different, but they will often (though not always) give similar answers. Much of the output
from the macros will be identical to output from the standard MINITAB functions, since it is not
dependent upon the randomization or bootstrapping process (for example correlation coefficients,
regression parameter estimates, ANOVA tables and sample statistics will not be affected by using
randomization or bootstrap techniques in place of standard techniques). However, assessments of the
significance or variability of an estimate (such as p-values and confidence intervals) will be altered by the
use of randomization and bootstrap techniques. It is important to realise that MINITAB functions for
most standard techniques will yield the same answers again and again, regardless of how many times they
are run; in contrast, p-values and confidence intervals produced using the resampling macros will be
different every time the macro is run. This is an inherent feature of randomization and bootstrap
techniques; so long as the number of randomizations or bootstrap samples is large (how large depends
upon the particular statistical method being used), these differences should not be particularly important.
The Minitab macros are designed for release 13, but most will probably function with earlier releases.
The macros generally have a similar calling statement to the corresponding standard Minitab command, if
this exists. Additionally, macro names end in a suffix, depending upon the type of resampling
methodology used:
• ran : for randomization procedures
• boot : for bootstrap procedures
• mc : for Monte Carlo tests and related procedures.
Although these three classes of methods are all fairly similar to implement in practice, the theoretical
justification for using them is different, so we distinguish clearly between the different forms of
resampling.
Other sources of information
Along with the macros library, we provide :
• Individual descriptions of each macro (taken from the sections of this guide)
• Sample datasets, as Minitab files
• Sample datasets, as .DAT files
• .TXT files, containing output from running the macros over the sample datasets
• Minitab files, containing the final worksheet obtained after running the macros over the sample
datasets (although these are missing for some macros)
Arguments to the macros
Most of the macros require one or more columns of numeric data as input.
For some macros, the order in which the columns are entered is crucially important. In regression, the
response must be the first column, followed by one or more predictors. In two-way analysis of
variance, the group must be entered before the block.
For some of the spatial macros, the input is in the form of matrices. Consult the Minitab
documentation or help menu for information on entering and reading matrices.
Missing values are allowed for all macros, except those which take matrices as inputs. Missing values
are handled in the natural way: observations with one or more missing values are usually
ignored.
Subcommands in the Minitab macros
Subcommands are used for the following purposes in the macros :
Specifying values
These subcommands allow the user to change basic quantities involved in the operation of the macro.
For subcommands of this type, the argument for the subcommand is simply the value of the quantity in
question (a constant). Specific uses are :
• To specify the number of resamples to be used. These are specified using the subcommands NRAN
for randomization tests, NBOOT for bootstrap procedures and NSIM for Monte Carlo procedures. The
required value (a positive integer constant) is entered.
• To specify significance levels for confidence intervals, using the subcommand SIGLEV. The
significance level (expressed as a percentage) is entered, e.g. 95.
• To specify the number of test-statistics to be considered. NLAG in the macro ACFRAN specifies the
maximum lag for which serial correlation coefficients should be computed, whilst NSTATS in the
macro NEARESTMC specifies the largest value of k for which kth nearest neighbour distances should
be computed.
• To specify the graphical resolution. NPOINTS in DISTEDFMC specifies the number of points to be
used for evaluating CDFs and EDFs, and so controls the resolution of the resulting graph.
Modifying procedures
These subcommands allow the user to modify the technical details of the procedure used within the
macro. In these cases, a constant should be entered; a key to the values to be used is given below.
USEMEAN in LEVENERAN specifies whether the median or mean should be used to create the
modified dataset.
1 = use median [default - this will occur if the subcommand is not used]
0 = use mean
any other numeric value except 0 or 1 = use mean
RESIDUALS in REGRESSRESRAN and REGRESSBOOT specifies the kind of residuals which
should be used in the randomization procedure.
1 = raw residuals
2 = modified residuals [default]
3 = deletion residuals
any other numeric value except 1, 2 or 3 will lead to an error message
TYPE in MISSING specifies how missing values should be treated.
1 = delete the missing value only in the column concerned.
2 = delete the entire row if it contains any missing values.
any other numeric value except 1 or 2 will lead to an error message
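The difference between the two treatments of missing values can be illustrated with a small Python sketch. The function name, the use of None to mark a missing value and the example columns are all hypothetical; the sketch only mirrors the two options described above and is not the macro's own code.

```python
def drop_missing(columns, mode):
    """Remove missing values (None) from a list of equal-length columns.

    mode 1: delete each missing value only in the column concerned.
    mode 2: delete the entire row if it contains any missing value.
    Any other mode raises an error, mirroring the error message above.
    """
    if mode == 1:
        return [[v for v in col if v is not None] for col in columns]
    if mode == 2:
        rows = [row for row in zip(*columns) if None not in row]
        if not rows:
            return [[] for _ in columns]
        return [list(col) for col in zip(*rows)]
    raise ValueError("mode must be 1 or 2")

columns = [[1, None, 3], [4, 5, None]]  # hypothetical data
print(drop_missing(columns, 1))
print(drop_missing(columns, 2))
```

Option 1 keeps as much data as possible but can leave columns of unequal length, whereas option 2 keeps observations aligned across columns.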
Storing output
All of the remaining subcommands in the macros are concerned with allowing the user to store lengthy
output to file. All such subcommands operate in the same way: if the user does not wish to store the
output, then the subcommand need not be used; if the user wishes to store the output, then the appropriate
subcommand should be used, and the argument to the subcommand should be the column, columns,
constant or matrix (as required) in which the output is to be stored.
*** IMPORTANT NOTES ***
• Take care not to over-write the original data when storing lengthy output.
• If the number of resamples is large, storing resampled test-statistics may generate worksheets which
take up a large amount of memory.
Computing power and number of resamples
In order to gain a clear indication of the resampling distribution of the quantity of interest, it is necessary
to use a reasonably large number of resamples. We make use of the following defaults :
Number of randomizations : 999
Number of bootstrap resamples : 999 for significance tests, 2000 for calculating confidence intervals
Number of simulations : 999 (except DISTEDFMC, where we use 99)
Different authors make different recommendations about the number of bootstrap resamples required.
With general purpose macros such as those in this library, there is a trade off between running time
(which usually increases roughly as a linear function of number of resamples) and accuracy (which will
increase with the number of resamples). The defaults seem to us to provide a reasonable compromise.
If a high degree of accuracy is required in the estimation of p-values or confidence intervals then the
number of resamples must be made very large. We would suggest that the macros in this library are
probably not appropriate for this, since some of the procedures used are relatively inefficient (i.e. will
take a long time to run).
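To give a feel for the trade-off between accuracy and the number of resamples, the Python sketch below uses the usual binomial approximation for the sampling error of an estimated proportion, sqrt(p(1 - p)/N). Treating a resampling p-value as a simple binomial proportion is an assumption of this sketch, and the function name and the value p = 0.05 are hypothetical.

```python
import math

def p_value_standard_error(p, n_resamples):
    """Approximate standard error of a p-value estimated from
    n_resamples independent resamples (binomial approximation)."""
    return math.sqrt(p * (1 - p) / n_resamples)

# How the error of an estimated p-value near 0.05 shrinks with more resamples.
for n in (99, 999, 9999):
    print(n, round(p_value_standard_error(0.05, n), 4))
```

Because the error falls only with the square root of the number of resamples, halving it requires roughly four times as many resamples, and hence roughly four times the running time.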
Speed
The speed at which the macros run will depend upon
• The size of the dataset
• The number of randomizations / bootstrap resamples / simulations used
• The capabilities of the computer
so it is not possible to state clearly how long the macros will take to run. The macros can broadly be
divided into three categories :
FAST : These macros should run within a few minutes, or possibly much less, with
the default number of randomizations or bootstrap samples.
Significance tests : all macros
Confidence intervals : all macros
Analysis of variance : ONEWAYRAN, LEVENERAN
Regression : REGRESSSIMRAN
Time series : all macros
Spatial statistics : MEAD4RAN, MEAD8RAN
MODERATE : These macros will take at least a few minutes to run with the default
number of randomizations or bootstrap samples.
Analysis of variance : TWOWAYRAN, TWOWAYREPRAN
Regression : REGRESSOBSRAN, REGRESSRESRAN, REGRESSBOOT
Spatial statistics : MANTELRAN
SLOW : These macros may take a long time to run - up to a few hours.
Spatial statistics : SPATAUTORAN, DISTEDFMC, NEARESTMC, LOCREGULARMC.
The advice would be that the 'FAST' macros can be used like any other Minitab command, with only a
short wait for the output, whereas the 'MODERATE' and 'SLOW' macros should be left to run in the
background whilst the user works on another task.
3 HOW TO USE THIS GUIDE
Information about the macros
For each of the macros in the resampling methods library, the following information is provided:
An outline of the purpose of the macro.
RUNNING THE MACRO
Macro calling statement : Gives the full calling statement for the macro.
• Note that macros are invoked using a % sign, and that the appropriate path to the resampling methods
library must be placed in front of the macro name.
• All possible subcommands are listed.
• For the main command and for each subcommand, the required form of data is listed. For example :
c1-c3 means that three columns are required,
c1 means that one column is required,
k1 k2 means that 2 constants are required,
m1 means that 1 matrix is required,
c1-cN means that an unspecified number of columns (N) are required.
• For subcommands, default values are also given, in brackets, if they are available. If the user wishes to
use the default values, then the subcommands need not be included in the calling statement. For
subcommands which involve storing output to a column of the worksheet, default values are not
relevant; if the user does not wish to store the output, then the appropriate subcommand should simply
not be used.
Input : A detailed description of what kind of data should be used in the compulsory arguments to the
macro, and in what order the data should be entered. Mention is made of whether missing values are
allowed, and how they will be dealt with.
Subcommands : A description of the purpose and operation of each subcommand, and of the type of input
required.
Output : A straightforward description of the output produced by the macro.
Speed of macro : Some indication of the speed at which the macro runs.
TECHNICAL DETAILS
Notes : Any general comments on the operation of the macro, including possible bugs.
Hypotheses : For hypothesis tests, the null and (if relevant) alternative hypotheses are stated explicitly.
Test-statistic : For hypothesis tests, the test-statistic is defined and justified.
Resampling procedure : The procedure for randomizing, bootstrapping or simulating is stated and briefly
justified. If the algorithm is complicated (e.g. for multiple regression), a reference is given.
ALTERNATIVE PROCEDURES
Other macros : Outlines any alternative macros in the library that may be used to perform the same
statistical analyses using different methods (for example, different resampling algorithms).
Standard procedure : Gives an outline of the calling procedure and general purpose of the standard
MINITAB function which corresponds most closely to a non-resampling version of the macro.
• In many cases, the macros in the resampling methods library are direct computationally-intensive
analogues of in-built MINITAB functions for standard statistical procedures.
• In the case of analysis of variance, the standard procedures are incorporated within the macros, so that
the macros are simply extended forms of the standard functions.
• In the case of more sophisticated techniques in time series and spatial statistics, standard procedures in
Minitab are generally not available. In some of these cases, non-randomization methods are actually
included as part of the macro output; a description of these methods is given.
REFERENCES
References are provided; if the user is in any doubt as to the appropriateness of the methods used in the
macros for their data, then these should be consulted. Many references are to the book by Manly (1997),
which provides a good general introduction to randomization and bootstrap procedures.
Worked examples
Datasets
Example datasets are provided alongside the library. A brief description of the dataset pertaining to each
macro (or set of macros) is given, together with some or all of the data. Most of these datasets are taken
from Manly (1997), and a detailed description of the analysis of these datasets can often be found in his
book. The example datasets are all taken from biological studies; note that sample sizes are generally
small.
Example MINITAB input and output for the analysis of each example dataset is shown. For each macro,
we provide :
DATA
Name of dataset : The datasets are named. The names correspond to the filenames of the corresponding
Minitab worksheets.
Description : We give a brief description of the data, how it was collected, and for what purpose.
Source : We give both the source from which we took the data, and, if different, the original source of the
data.
Data : We list the full dataset. Note that the listing is often not in a useful form for pasting into Minitab,
so it is more sensible to use the .DAT or .Minitab files to input the data to Minitab.
Worksheet : We describe the columns, constants and matrices in the Minitab worksheet.
ANALYSIS
Aims of analysis : We briefly describe the aims of the analysis described in the "output". These aims are
generally fairly limited, and a full statistical analysis of the data would usually have more substantial
objectives.
Minitab Output : Minitab input and output are listed in full. Note : The worked examples are for
demonstrative purposes only. Details of the procedures (e.g. the values chosen for subcommands) have
been chosen to give the best demonstration of the capabilities of the macros, rather than for sound
statistical reasons.
Modified worksheet : A description of any additional columns, constants or matrices created in the
worksheet by running the macro.
Discussion : A brief discussion of the results.
4 LITERATURE REVIEW
General
The range and approach of this macro library largely mirrors that of Manly (1997). His book provides a
clear, non-technical introduction to resampling methods, with an emphasis upon biological and ecological
applications. We cover a substantial proportion of the material in chapters 1 to 11 of Manly (1997), and
the arrangement of our material largely mirrors that of Manly (1997):
Section 1 : Significance tests > Based upon Chapter 6 of Manly (1997)
Section 2 : Confidence intervals > Based upon Chapter 3 of Manly (1997)
Section 3 : Analysis of variance > Based upon Chapter 7 of Manly (1997)
Section 4 : Regression > Based upon Chapter 8 of Manly (1997)
Section 5 : Time series > Based upon Chapter 11 of Manly (1997)
Chapters 1 and 5 of Manly (1997) provide a general introduction to resampling methods in biology. The
material in Section 6 of the macro library, Spatial Statistics, is partly based upon Chapters 9 and 10 of
Manly (1997), but also includes material from Chapter 4 (Monte Carlo tests), and material taken from
other sources (see below).
Confidence intervals
The emphasis of Manly (1997) is upon randomization and hypothesis testing. His introduction to
bootstrap confidence intervals is relatively brief, and without worked examples, so it is probably better to
consult Efron and Tibshirani (1993), who provide a clear, fairly non-technical introduction to bootstrap
confidence intervals. Their Chapter 12 discusses the bootstrap-t method, Chapter 13 the Efron percentile and
Hall percentile methods, and Chapter 14 the BC and BCa percentile methods.
Resampling methodology
The statistical literature concerning computer-intensive inference is extremely large, and there are a large
number of technical issues involved. Efron and Tibshirani (1993) discuss general principles and issues,
whilst a good (but highly mathematical) introduction to the statistical theory of resampling is provided by
Davison and Hinkley (1997), who also provide extensive references.
Regression
Draper and Smith (1998) give a wide-ranging overview of regression methods, and include a chapter on
the application of resampling methods to regression. The algorithms used for multiple regression with
randomization or bootstrapping of residuals in the macros are those proposed by Ter Braak (1992).
Spatial statistics
Diggle (1983) provides an introduction to the search for spatial pattern, and discusses how EDF plots can
be used for this purpose. Brown and Rothery (1978) look at the same topic, and propose test-statistics that
are sensitive to different kinds of regularity. Cliff and Ord (1973) discuss methods for estimating spatial
autocorrelation.
Minitab
For further details about Minitab, see Minitab Inc. (1999).
REFERENCE MANUAL
1 BASIC STATISTICS: SIGNIFICANCE TESTS
OVERVIEW
One sample tests
ONESAMPLERAN tests whether a population mean is equal to a hypothesised value.
Two sample tests
TWOSAMPLERAN tests for the equality of two population means using randomization.
TWOTRAN also tests for the equality of two population means using randomization.
TWOTUNPOOLBOOT tests for the equality of two population means using bootstrapping.
TWOTPOOLBOOT also tests for the equality of two population means using bootstrapping.
Correlation
CORRELATIONRAN tests whether a correlation coefficient between two variables is significant.
Comments
Possibly the most widely used test in statistics is the two-sample t-test, in which we test the equality of two
means. We include four different computer-intensive macros for this procedure.
ONESAMPLERAN
This macro is designed to test whether or not the mean of a single column of data is equal to a
hypothesised value specified by the user.
RUNNING THE MACRO
Calling statement
onesampleran c1 k1 ;
nran k1 (999) ;
sums c1.
Input
C1
A single column, containing only numerical values. Missing values are allowed.
K1
A single constant, containing the hypothesised mean value.
Subcommands
nran
Number of randomizations used.
sums
Specify a column in which to store the sample sums for the randomized datasets.
Output
• Basic statistics: Sample size, sample mean, sum of sample values, and sample standard deviation.
• Hypothesised mean value.
• Resampling details: Number of randomizations, one- and two-sided randomization p-values.
The two-sided randomization p-value is double the smaller of the one-sided randomization p-values.
Speed of macro : FAST
TECHNICAL DETAILS
Null hypothesis : The population mean is equal to the hypothesised mean value.
Test-statistic : We create a modified dataset by deducting the hypothesised mean from each data value.
The appropriate test-statistic is the sum of these modified values.
Randomization procedure : We randomize the allocation of signs to the absolute values within the
modified dataset, since under the null hypothesis there should be an equal probability that any data point
will have been allocated a negative or positive value once the hypothesised mean is deducted from it.
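The sign-randomization procedure described above can be sketched in Python as follows. This is an illustrative re-implementation under stated assumptions, not the macro's own code: the function name, the seed, the way random signs are generated and the cap of the two-sided value at 1 are all choices made for this sketch. The data are the DARWIN example values from the worked example, tested against a hypothesised mean of 56.

```python
import random

def one_sample_randomization_test(data, hypothesised_mean, n_ran=999, seed=0):
    """Sign-randomization test for a single mean.

    Test-statistic: the sum of (value - hypothesised mean).
    Each randomization re-allocates a random sign to every absolute deviation.
    Returns (observed sum, lower-tail p, upper-tail p, two-sided p).
    """
    rng = random.Random(seed)
    deviations = [x - hypothesised_mean for x in data]
    observed = sum(deviations)
    magnitudes = [abs(d) for d in deviations]

    sums = []
    for _ in range(n_ran):
        sums.append(sum(m if rng.random() < 0.5 else -m for m in magnitudes))

    n_total = n_ran + 1  # resampled sums plus the observed sum
    p_lower = (1 + sum(1 for s in sums if s <= observed)) / n_total
    p_upper = (1 + sum(1 for s in sums if s >= observed)) / n_total
    p_two_sided = min(1.0, 2 * min(p_lower, p_upper))  # cap is an assumption
    return observed, p_lower, p_upper, p_two_sided

data = [43, 67, 64, 64, 51, 53, 53, 26, 36, 48, 34, 48, 6, 28, 48]
observed, p_lower, p_upper, p_two_sided = one_sample_randomization_test(data, 56)
print(f"observed sum of deviations = {observed}")
print(f"p (true mean < 56) = {p_lower:.4f}")
print(f"p (true mean > 56) = {p_upper:.4f}")
print(f"p (two-sided)      = {p_two_sided:.4f}")
```

The p-values depend on the random signs drawn, so they will differ from run to run and from the Minitab output unless the same random stream is used.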
ALTERNATIVE PROCEDURES
Standard procedures
onet C1;
test k1.
Performs a one-sample t-test for the mean of the data in c1 being equal to the hypothesised mean value k1,
in the situation in which the population variance is unknown.
onet C1;
sigma k1;
test k2.
Performs a one-sample normal test for the mean of the data in c1 being equal to the hypothesised mean
value k2, in the situation in which the standard deviation is known to be equal to k1.
REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 6).
WORKED EXAMPLE FOR ONESAMPLERAN
Name of dataset
DARWIN
Description
The data refer to the heights of 15 self-fertilised offspring from the plant Zea mays. The data were
originally collected by Charles Darwin, were analysed by RA Fisher in the 1930s (see Fisher, 1935), and
are analysed by Manly (1997) using a one-sample randomization test.
Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
FISHER, R.A. (1935) The design of experiments, Oliver & Boyd, Edinburgh.
Data
Number of observations = 15
Number of variables = 1
43 67 64 64 51 53 53 26 36 48 34 48
6 28 48
Worksheet
C1
Data
Aims of analysis
To test whether the population mean is equal to a hypothesised value of 56.
Minitab output : standard procedure
MTB > Retrieve "N:\resampling\Examples\Darwin.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Darwin.MTW
# Worksheet was saved on 27/07/01 14:03:05
Results for: Darwin.MTW
MTB > onet c1 ;
SUBC> test 56.
One-Sample T: Self
Test of mu = 56 vs mu not = 56
Variable    N    Mean   StDev   SE Mean
Self       15   44.60   16.41      4.24
Variable         95.0% CI           T       P
Self        ( 35.51, 53.69)     -2.69   0.018
Minitab output : randomization procedure
MTB > % N:\resampling\library\onesampleran c1 56 ;
SUBC> nran 499 ;
SUBC> sums c3.
Executing from file: N:\resampling\library\onesampleran.MAC
One-sample randomization test
Data Display (WRITE)
Number of observations              15
Observed mean value                 44.60
Hypothesised mean value             56.00
Observed sum of values              669.0
Observed standard deviation         16.41
Number of randomization samples     499
P-value for one-sided test with alternative: true mean < hypothesised mean   0.0020
P-value for one-sided test with alternative: true mean > hypothesised mean   1.0000
P-value for two-sided test                                                   0.0040
Modified worksheet
C3
A column containing 499 sums of values, one for each randomized dataset
Discussion
The standard (two-sided) p-value is 0.018. Manly obtains a randomization p-value of 0.016, by
enumeration of the full randomization distribution. Our two-sided p-value of 0.004 is substantially
smaller than either of these values, but this may just be a consequence of the relatively small number of
randomizations used.
The conclusion is the same in all cases - there is strong evidence that the population mean is not equal to
the hypothesised mean. Looking at the one-sided p-values (and the sample means) we see that we can
accept the alternative hypothesis that the population mean is lower than the hypothesised mean.
TWOSAMPLERAN
This macro is designed to test, using randomization, whether or not the means for two independent
samples are equal.
RUNNING THE MACRO
Calling statement
twosampleran c1 c2 ;
nran k1 (999) ;
differences c1 ;
tstatistics c1.
Input
C1
Data for first group
C2
Data for second group
C1 and C2 must both be columns containing only numerical data, but they need not be of the same length.
Missing values are allowed.
Subcommands
nran
Number of randomizations used.
differences
Specify a column in which to store differences between simulated group means.
tstatistics
Specify a column in which to store t-statistics for differences between simulated group means.
Output
• Basic statistics: number of observations, mean and standard deviation for each group.
• Observed difference in means, with associated t-statistic.
• Resampling: Number of randomizations, one and two-sided randomization p-values.
The two-sided randomization p-value is double the smaller of the one-sided randomization p-values.
Speed of macro : FAST
ALTERNATIVE PROCEDURES
Other macros
This macro uses randomization, but two bootstrapping versions of the test are available (depending upon
whether variances are pooled) :
TWOTPOOLBOOT Bootstrap test with pooling of variances
TWOTUNPOOLBOOT Bootstrap test without pooling of variances
This macro is suitable for when data for the two groups are contained in separate columns. If data is
contained in a single column, with a second column denoting group number, then TWOTRAN should be
used.
Standard procedures
twosample c1 c2.
This performs a two-sample t-test for the mean of the data in c1 being equal to the mean of the data in c2.
Variances are not pooled, so this is appropriate if the variances for the two groups cannot be assumed to
be equal.
twosample C1 C2;
pooled.
This performs a two-sample t-test for the mean of the data in c1 being equal to the mean of the data in c2.
Variances are pooled, so this is only appropriate if the variances for the two groups can be assumed to be
equal.
TECHNICAL DETAILS
Null hypothesis : We test the null hypothesis that the mean for the first group is equal to the mean for the
second group.
Randomization procedure : We fix the data value for each individual, and fix the size of the groups. We
then randomize the allocation of individuals to groups, since under the null hypothesis this allocation will
be random.
Test-statistic : We use the difference between the two sample group means as the test-statistic.
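The randomization procedure above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the macro's own code: the function name and seed argument are hypothetical, the observed difference is counted as one member of the randomization distribution, and missing values are not handled.

```python
import random


def two_sample_randomization(group1, group2, nran=999, seed=None):
    """Randomization test for a difference in means (illustrative sketch)."""
    rng = random.Random(seed)
    n1, n2 = len(group1), len(group2)
    pooled = list(group1) + list(group2)
    # Test-statistic: difference between the two sample group means.
    observed = sum(group1) / n1 - sum(group2) / n2

    n_le = 1  # assumption: observed difference counts as one randomization
    n_ge = 1
    for _ in range(nran):
        # Group sizes stay fixed; only the allocation to groups is shuffled.
        rng.shuffle(pooled)
        diff = sum(pooled[:n1]) / n1 - sum(pooled[n1:]) / n2
        if diff <= observed:
            n_le += 1
        if diff >= observed:
            n_ge += 1

    p_lower = n_le / (nran + 1)  # alternative: mean(group 1) < mean(group 2)
    p_upper = n_ge / (nran + 1)  # alternative: mean(group 1) > mean(group 2)
    p_two_sided = min(1.0, 2 * min(p_lower, p_upper))
    return observed, p_lower, p_upper, p_two_sided
```

The same shuffle applies when the data are held in one column with a group indicator, as in TWOTRAN: shuffling the pooled values against fixed group sizes is equivalent to shuffling the group labels.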
REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 6).
WORKED EXAMPLE FOR TWOSAMPLERAN
Name of dataset
LIZARDS
Description
The data consists of the quantity of dry biomass of Coleoptera in the stomachs of two size morphs of the
Eastern Horned Lizard, Phrynosoma douglassi brevirostre. The data were collected by Powell and
Russell, and are analysed by Manly (1997) using a two sample randomization test. Data is available for 24
lizards in the first size morph (adult males and yearling females) and 21 lizards in the second size morph
(adult females).
Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
POWELL, G.L. & RUSSELL, A.P. (1984), The diet of the eastern short-horned lizard (Phrynosoma douglassi
brevirostre) in Alberta and its relationship to sexual size dimorphism, Canadian Journal of Zoology, 62,
pp. 428-440.
POWELL, G.L. & RUSSELL, A.P. (1985), Growth and sexual size dimorphisms in Alberta populations of the
eastern short-horned lizard, Phrynosoma douglassi brevirostre, Canadian Journal of Zoology, 63, pp.
139-154.
Data
Number of observations = 45
Number of variables = 2
For each size morph group, data are given.
Group 1 (Adult males and yearling females)
256 209 0 0 0 44 49 117 6 0 0 75 34 13
90 0 32 0 205 332 0 31 0 0
Group 2 (Adult females)
0 89 0 0 179 19 142 100 0 163 286
0 432 3 843 0 158 443 311 232 179
Worksheet
C1
Data for group 1
C2
Data for group 2
Aims of analysis
To investigate whether stomach biomass is different for lizards in size morph 1 and lizards in size morph
2.
Minitab output : Standard procedure, without pooling
MTB > Retrieve "N:\resampling\Examples\Lizards.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Lizards.MTW
# Worksheet was saved on 03/07/01 16:32:34
Results for: Lizards.MTW
MTB > twosample c1 c2
Two-Sample T-Test and CI: Group1, Group2
Two-sample T for Group1 vs Group2
         N    Mean   StDev   SE Mean
Group1  24    62.2    94.1        19
Group2  21     170     209        46
Difference = mu Group1 - mu Group2
Estimate for difference: -108.2
95% CI for difference: (-209.6, -6.9)
T-Test of difference = 0 (vs not =): T-Value = -2.19 P-Value = 0.037 DF = 27
Minitab output : Standard procedure, with pooling
MTB > twosample c1 c2 ;
SUBC> pooled.
Two-Sample T-Test and CI: Group1, Group2
Two-sample T for Group1 vs Group2
         N    Mean   StDev   SE Mean
Group1  24    62.2    94.1        19
Group2  21     170     209        46
Difference = mu Group1 - mu Group2
Estimate for difference: -108.2
95% CI for difference: (-203.4, -13.0)
T-Test of difference = 0 (vs not =): T-Value = -2.29 P-Value = 0.027 DF = 43
Both use Pooled StDev = 158
Minitab output : randomization procedure
MTB > Retrieve "N:\resampling\Examples\Lizards.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Lizards.MTW
# Worksheet was saved on 07/03/01 04:32:34 PM
Results for: Lizards.MTW
MTB > % N:\resampling\library\twosampleran c1 c2 ;
SUBC> nran 999 ;
SUBC> differences c4 ;
SUBC> tstatistics c5.
Executing from file: N:\resampling\library\twosampleran.MAC
Two-sample randomization test
Data Display (WRITE)
Number of observations in group 1       24
Number of observations in group 2       21
Data mean for group 1                   62.21
Data mean for group 2                   170.4
Standard deviation for group 1          94.11
Standard deviation for group 2          208.6
Observed difference in means            -108.2
Observed t-statistic                    -2.19
Number of randomization samples         999
P-value for one-sided test with alternative: mean(group 1)>mean(group2)   0.9880
P-value for one-sided test with alternative: mean(group 1)<mean(group2)   0.0130
P-value for two-sided test                                                0.0260
Modified worksheet
C4
A column containing 999 differences between sample means, one for each randomized dataset
C5
A column containing 999 t-statistics for differences, one for each randomized dataset
Discussion
Standard (two-sided) p-values are 0.037 (if we do not pool variances) or 0.027 (if we pool variances),
whilst our randomization p-value is 0.026. All of these values are similar, and provide reasonable
evidence for a difference in stomach biomass between the two size morphs. Looking at the data (and the
one-sided p-values) it is clear that stomach biomass is higher for lizards in size morph 2.
TWOTRAN
This macro is designed to test, using randomization, whether or not the means for two independent
samples are equal.
RUNNING THE MACRO
Calling statement
twotran c1 c2 ;
nran k1 (999) ;
differences c1 ;
tstatistics c1.
Input
C1
Data for both groups
C2
Group indicator
C1 and C2 must both be columns containing only numerical data, and they must be of the same length.
The column c2 should contain group markers; these should be any two distinct numerical values (for
example, 1 and 2).
Subcommands
nran
Number of randomizations used.
differences
Specify a column in which to store differences between simulated group means.
tstatistics
Specify a column in which to store t-statistics for differences between simulated group means.
Output
Basic summary statistics (numbers of observations, group means & standard deviations) are given,
along with the observed t-statistic and difference in sample means. Randomization p-values are
given for both one-sided hypotheses, and for the two-sided hypothesis.
Speed of macro : FAST
ALTERNATIVE PROCEDURES
Other macros
This macro uses randomization, but two bootstrapping versions of the test are available (depending upon
whether variances are pooled) :
TWOTPOOLBOOT Bootstrap test with pooling of variances
TWOTUNPOOLBOOT Bootstrap test without pooling of variances
This macro is suitable when data for the two groups are contained in the same column, with
a separate column denoting which group each observation corresponds to. If data for the two groups are
contained in separate columns, TWOSAMPLERAN should be used.
Standard procedures
twot [C1][C2].
This performs a two-sample t-test that the mean of the data for the first group is equal to the mean of the
data for the second group. The data is provided in c1, group labels are provided in c2. Variances are not
pooled, so this is appropriate if the variances for the two groups cannot be assumed to be equal.
twot [C1][C2];
pooled.
This performs a two-sample t-test that the mean of the data for the first group is equal to the mean of the
data for the second group. The data is provided in c1, group labels are provided in c2. Variances are
pooled, so this is only appropriate if the variances for the two groups can be assumed to be equal.
TECHNICAL DETAILS
Null hypothesis : We test the null hypothesis that the mean for the first group is equal to the mean for the
second group.
Randomization procedure : We fix the data value for each individual, and fix the size of the groups. We
then randomize the allocation of individuals to groups, since under the null hypothesis this allocation will
be random.
Test-statistic : We use the difference between the two sample group means as the test-statistic.
REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 6).
WORKED EXAMPLE FOR TWOTRAN
Name of dataset
MANDIBLES
Description
The data are mandible lengths (mm) for 10 male and 10 female golden jackals.
Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
HIGHAM, C.F.W., KIJNGAM, A. & MANLY, B.F.J. (1980), An analysis of prehistoric canid remains from
Thailand. Journal of Archaeological Science, 7, pp. 149-165.
The data
Male (group 1)
120 107 110 116 114 111 113 117 114 112
Female (group 2)
110 111 107 108 110 105 107 106 111 111
The worksheet
C1
Mandible lengths for both sexes
C2
Group indicator (1 = male, 2 = female)
Aims of analysis
To investigate whether mandible lengths are different for males and females.
Standard procedure (without pooling)
MTB > Retrieve "N:\resampling\Examples\Mandibles.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Mandibles.MTW
# Worksheet was saved on 28/08/01 11:00:13
Results for: Mandibles.MTW
MTB > twot c1 c2
Two-Sample T-Test and CI: Data, Group
Two-sample T for Data
Group    N     Mean   StDev   SE Mean
1       10   113.40    3.72       1.2
2       10   108.60    2.27      0.72
Difference = mu (1) - mu (2)
Estimate for difference: 4.80
95% CI for difference: (1.85, 7.75)
T-Test of difference = 0 (vs not =): T-Value = 3.48 P-Value = 0.004 DF = 14
Standard procedure (with pooling)
MTB > twot c1 c2 ;
SUBC> pooled.
Two-Sample T-Test and CI: Data, Group
Two-sample T for Data
Group    N     Mean   StDev   SE Mean
1       10   113.40    3.72       1.2
2       10   108.60    2.27      0.72
Difference = mu (1) - mu (2)
Estimate for difference: 4.80
95% CI for difference: (1.91, 7.69)
T-Test of difference = 0 (vs not =): T-Value = 3.48 P-Value = 0.003 DF = 18
Both use Pooled StDev = 3.08
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Mandibles.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Mandibles.MTW
# Worksheet was saved on 05/07/01 15:04:34
Results for: Mandibles.MTW
MTB > % N:\resampling\library\twotran c1 c2 ;
SUBC> nran 999 ;
SUBC> differences c4 ;
SUBC> tstatistics c6.
Executing from file: N:\resampling\library\twotran.MAC
Two-sample randomization test
Data Display (WRITE)
Number of observations in group 1       10
Number of observations in group 2       10
Data mean for group 1                   113.4
Data mean for group 2                   108.6
Standard deviation for group 1          3.718
Standard deviation for group 2          2.271
Observed difference in means            4.800
Observed t-statistic                    3.48
Number of randomization samples         999
P-value for one-sided test with alternative: mean(group 1) > mean(group2)   0.0020
P-value for one-sided test with alternative: mean(group 2) < mean(group1)   1.0000
P-value for two-sided test                                                  0.0040
Modified worksheet
C4
A column containing 999 differences between sample means, one for each randomized dataset
C6
A column containing 999 t-statistics for differences, one for each randomized dataset
Discussion
All methods agree that there is clear evidence of a difference in mandible lengths between sexes.
Two-sided p-values are 0.004 for standard methods (without pooling) and for randomization, and 0.003
for standard methods (with pooling). Looking at the data, we see that males (group 1) have longer
mandibles.
TWOTPOOLBOOT
This macro is designed to test, using bootstrapping, whether or not the means for two independent
samples are equal. We assume that the groups have equal variances.
RUNNING THE MACRO
Calling statement
twotpoolboot c1 c2 ;
nboot k1 (999) ;
differences c1 ;
tstatistics c1.
Input
C1
Data for both groups
C2
Group indicator
C1 and C2 must both be columns containing only numerical data, and they must be of the same length.
The column c2 should contain group markers; these should be two distinct numerical values (for example,
1 and 2). Missing values are allowed.
Subcommands
nboot
Number of bootstrap resamples used.
differences
Specify a column in which to store differences between simulated group means.
tstatistics
Specify a column in which to store t-statistics for differences between simulated group means.
Output
• Number of observations for each group
• Means and standard deviations for each group
• Pooled standard deviation
• Observed difference in means, with associated t-statistic
• Number of bootstrap resamples
• One and two-sided bootstrap p-values
The two-sided bootstrap p-value is equal to double the smaller of the one-sided p-values.
Speed of macro : FAST
ALTERNATIVE PROCEDURES
Other macros
This macro uses bootstrapping, but two randomization versions of the test are available :
TWOSAMPLERAN Randomization test, samples in different columns
TWOTRAN Randomization test, samples in the same column
This macro is suitable when variances can be assumed to be equal; if this is not the case, use
TWOTUNPOOLBOOT instead.
Standard procedures
twot [C1][C2];
pooled.
This performs a two-sample t-test that the mean of the data for the first group is equal to the mean of the
data for the second group. The data is provided in c1, group labels are provided in c2. Variances are
pooled, so this is only appropriate if the variances for the two groups can be assumed to be equal.
TECHNICAL DETAILS
Null hypothesis : The mean for the first group is equal to the mean for the second group.
Test-statistic : The t statistic (with pooled standard deviation):
t = {mean for group 1 - mean for group 2} / (pooled standard deviation *
sqrt{[1/sample size for group 1] + [1/sample size for group 2]}).
Resampling procedure : Assume that sample sizes are n1 (for group 1) and n2 (for group 2). Then
1. We create a modified dataset, by deducting the sample group mean from each data value. This ensures
that both groups have the same mean. Since we assume that group variances are also equal, we can
therefore assume that the allocation of individuals to groups within this modified dataset is random
under the null hypothesis.
2. For each bootstrap sample, we select n1 values from the modified dataset (with replacement) and
allocate these to group 1. Similarly, we select n2 values and allocate these to group 2.
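The two resampling steps above can be sketched in Python as follows. This is a minimal illustration, not the macro's implementation: the function names and seed argument are hypothetical, the observed t-statistic is assumed to count as one member of the bootstrap distribution, and the (very unlikely) resamples with zero pooled variance are simply redrawn.

```python
import math
import random
from statistics import mean, variance


def pooled_t(g1, g2):
    """Two-sample t-statistic with pooled standard deviation."""
    n1, n2 = len(g1), len(g2)
    pooled_var = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    return (mean(g1) - mean(g2)) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))


def pooled_bootstrap_t_test(g1, g2, nboot=999, seed=None):
    rng = random.Random(seed)
    n1, n2 = len(g1), len(g2)
    m1, m2 = mean(g1), mean(g2)
    # Step 1: deduct each group's mean, then pool the modified values.
    centred = [x - m1 for x in g1] + [x - m2 for x in g2]
    observed = pooled_t(g1, g2)

    n_le = n_ge = 1  # assumption: observed statistic counts once
    done = 0
    while done < nboot:
        # Step 2: draw both groups, with replacement, from the pooled values.
        b1 = rng.choices(centred, k=n1)
        b2 = rng.choices(centred, k=n2)
        try:
            t = pooled_t(b1, b2)
        except ZeroDivisionError:
            continue  # degenerate resample (no variation); draw again
        done += 1
        if t <= observed:
            n_le += 1
        if t >= observed:
            n_ge += 1

    p_lower = n_le / (nboot + 1)  # alternative: mean(group 1) < mean(group 2)
    p_upper = n_ge / (nboot + 1)  # alternative: mean(group 1) > mean(group 2)
    return observed, p_lower, p_upper, min(1.0, 2 * min(p_lower, p_upper))
```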
REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 6).
WORKED EXAMPLE FOR TWOTPOOLBOOT
Name of dataset
MANDIBLES
Description
The data are mandible lengths (mm) for 10 male and 10 female golden jackals.
Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
HIGHAM, C.F.W., KIJNGAM, A. & MANLY, B.F.J. (1980), An analysis of prehistoric canid remains from
Thailand. Journal of Archaeological Science, 7, pp. 149-165.
The data
Male (group 1)
120 107 110 116 114 111 113 117 114 112
Female (group 2)
110 111 107 108 110 105 107 106 111 111
The worksheet
C1
Mandible lengths for both sexes
C2
Group indicator (1 = male, 2 = female)
Aims of analysis
To investigate whether mandible lengths are different for males and females.
Bootstrap procedure
MTB > Retrieve "N:\resampling\Examples\Mandibles.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Mandibles.MTW
# Worksheet was saved on 28/08/01 11:00:13
Results for: Mandibles.MTW
MTB > % N:\resampling\library\twotpoolboot c1 c2 ;
SUBC> nboot 999 ;
SUBC> differences c4 ;
SUBC> tstatistics c6.
Executing from file: N:\resampling\library\twotpoolboot.MAC
Two-sample bootstrap t-test (with pooling of standard deviations)
Data Display (WRITE)
Number of observations in group 1       10
Number of observations in group 2       10
Data mean for group 1                   113.4
Data mean for group 2                   108.6
Standard deviation for group 1          3.718
Standard deviation for group 2          2.271
Pooled standard deviation               3.080
Observed difference in means            4.800
Observed t-statistic                    3.484
Number of bootstrap samples             999
P-value for one-sided test with alternative: mean(group 1) > mean(group2)   0.0020
P-value for one-sided test with alternative: mean(group 2) < mean(group1)   0.9990
P-value for two-sided test                                                  0.0040
Modified worksheet
C4
A column containing 999 differences between sample means, one for each bootstrap resample
C6
A column containing 999 t-statistics for differences, one for each bootstrap resample
Discussion
The results are very similar to those using TWOTRAN. Again, there is very clear evidence of a difference
in means (p-value = 0.004).
TWOTUNPOOLBOOT
This macro is designed to test, using bootstrapping, whether or not the means for two independent groups
are equal. We do not assume that the groups have equal variances.
RUNNING THE MACRO
Calling statement
twotunpoolboot c1 c2 ;
nboot k1 (999);
differences c1 ;
tstatistics c1.
Input
C1
Data for both groups
C2
Group indicator
C1 and C2 must both be columns containing only numerical data, and they must be of the same length.
Missing values are allowed. The column c2 should contain group markers; these should be two distinct
numerical values (for example, 1 and 2).
Subcommands
nboot
Number of bootstrap resamples used.
differences
Specify a column in which to store differences between simulated group means.
tstatistics
Specify a column in which to store t-statistics for differences between simulated group means.
Speed of macro : FAST
ALTERNATIVE PROCEDURES
Other macros
This macro uses bootstrapping, but two randomization versions of the test are available :
TWOSAMPLERAN Randomization test, samples in different columns
TWOTRAN Randomization test, samples in the same column
This macro is suitable when variances cannot be assumed to be equal; if it is reasonable to assume equal
variances, use TWOTPOOLBOOT instead.
Standard procedures
twot [C1][C2].
This performs a two-sample t-test that the mean of the data for the first group is equal to the mean of the
data for the second group. The data is provided in c1, group labels are provided in c2. Variances are not
pooled, so this is appropriate if the variances for the two groups cannot be assumed to be equal.
TECHNICAL DETAILS
We test the null hypothesis that the means for the two groups are equal, using the usual t-statistic (with
unpooled standard deviation) as our test-statistic. In order to resample under the null hypothesis, we first
deduct group means from each data point, to ensure that both groups have the same mean. We then
resample (with replacement) separately within each group, and compare the two groups using a t-statistic
with unpooled variances. We must resample separately within groups, because under the null hypothesis
data from the two groups will not be identically distributed (since group variances are - we assume - unequal).
Null hypothesis : The mean for the first group is equal to the mean for the second group.
Test-statistic : The t statistic (with separate group standard deviations):
t = {mean for group 1 - mean for group 2} / sqrt{[variance for group 1/sample size for group 1]
+ [variance for group 2/sample size for group 2]},
where the variance for each group is the square of that group's standard deviation.
Resampling procedure : Assume that sample sizes are n1 (for group 1) and n2 (for group 2). Then
[1] We create a modified dataset, by deducting the sample group mean from each data value. This ensures
that both groups have the same mean.
[2] For each bootstrap sample, we select (with replacement) n1 values from the modified data for group 1,
and allocate these to group 1. We also select (with replacement) n2 values from the modified data for
group 2, and allocate these to group 2. It is necessary to use this form of restricted bootstrapping because
we cannot assume that group variances are equal, and so we cannot pool data from groups 1 and 2.
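The restricted (within-group) bootstrap above can be sketched in Python as follows. As before this is an illustration under assumptions, not the macro's code: the function names and seed argument are hypothetical, the observed t-statistic is counted as one member of the bootstrap distribution, and degenerate resamples with no variation in either group are redrawn.

```python
import math
import random
from statistics import mean, variance


def unpooled_t(g1, g2):
    """Two-sample t-statistic with separate group variances."""
    se = math.sqrt(variance(g1) / len(g1) + variance(g2) / len(g2))
    return (mean(g1) - mean(g2)) / se


def unpooled_bootstrap_t_test(g1, g2, nboot=999, seed=None):
    rng = random.Random(seed)
    m1, m2 = mean(g1), mean(g2)
    # Step [1]: deduct each group's own mean; the groups are kept separate.
    c1 = [x - m1 for x in g1]
    c2 = [x - m2 for x in g2]
    observed = unpooled_t(g1, g2)

    n_le = n_ge = 1  # assumption: observed statistic counts once
    done = 0
    while done < nboot:
        # Step [2]: resample each group from its own modified values only.
        b1 = rng.choices(c1, k=len(c1))
        b2 = rng.choices(c2, k=len(c2))
        try:
            t = unpooled_t(b1, b2)
        except ZeroDivisionError:
            continue  # no variation in either resample; draw again
        done += 1
        if t <= observed:
            n_le += 1
        if t >= observed:
            n_ge += 1

    p_lower = n_le / (nboot + 1)  # alternative: mean(group 1) < mean(group 2)
    p_upper = n_ge / (nboot + 1)  # alternative: mean(group 1) > mean(group 2)
    return observed, p_lower, p_upper, min(1.0, 2 * min(p_lower, p_upper))
```

The only difference from the pooled version is where each resample is drawn from: here group 1 never receives a value that originated in group 2, which preserves any difference in spread between the groups.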
REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 6).
WORKED EXAMPLE FOR TWOTUNPOOLBOOT
Name of dataset
MANDIBLES
Description
The data are mandible lengths (mm) for 10 male and 10 female golden jackals.
Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
HIGHAM, C.F.W., KIJNGAM, A. & MANLY, B.F.J. (1980), An analysis of prehistoric canid remains from
Thailand. Journal of Archaeological Science, 7, pp. 149-165.
The data
Male (group 1)
120 107 110 116 114 111 113 117 114 112
Female (group 2)
110 111 107 108 110 105 107 106 111 111
The worksheet
C1
Mandible lengths for both sexes
C2
Group indicator (1 = male, 2 = female)
Aims of analysis
To investigate whether mandible lengths are different for males and females.
Bootstrap procedure
MTB > Retrieve "N:\resampling\Examples\Mandibles.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Mandibles.MTW
# Worksheet was saved on 28/08/01 11:00:13
Results for: Mandibles.MTW
MTB > % N:\resampling\library\twotunpoolboot c1 c2 ;
SUBC> nboot 999 ;
SUBC> differences c4 ;
SUBC> tstatistics c6.
Executing from file: N:\resampling\library\twotunpoolboot.MAC
Two-sample bootstrap t-test (standard deviations not pooled)
Data Display (WRITE)
Number of observations in group 1       10
Number of observations in group 2       10
Data mean for group 1                   113.4
Data mean for group 2                   108.6
Standard deviation for group 1          3.718
Standard deviation for group 2          2.271
Observed difference in means            4.800
Observed t-statistic                    3.484
Number of bootstrap samples             999
P-value for one-sided test with alternative: mean(group 1) > mean(group2)   0.0050
P-value for one-sided test with alternative: mean(group 2) < mean(group1)   0.9990
P-value for two-sided test                                                  0.0100
Modified worksheet
C4
A column containing 999 differences between sample means, one for each bootstrap resample
C6
A column containing 999 t-statistics for differences, one for each bootstrap resample
Discussion
The results are very similar to those using TWOTRAN. Again, there is clear evidence of a difference in
means, but the p-value is somewhat larger in this case (p-value = 0.010).
CORRELATIONRAN
The macro is designed to test the significance of the correlation between two variables.
RUNNING THE MACRO
Calling statement
correlationran c1 c2 ;
nran k1 (999) ;
corrs c1.
Input
C1
First variable
C2
Second variable
C1 and c2 must be columns, of the same length, containing only numerical values.
Subcommands
nran
Number of randomizations used.
corrs
Specify a column in which to store correlation coefficients for randomization samples.
Output
• Number of observations, and means for each variable
• Observed correlation coefficient
• Number of randomizations
• Randomization p-values
Speed of macro : FAST
Missing values : Allowed.
ALTERNATIVE PROCEDURES
Standard procedures
Correlation C2 C1.
This finds the correlation between the data in c1 and the data in c2, and gives the p-value for this
correlation.
TECHNICAL DETAILS
Null hypothesis : The two variables are uncorrelated, i.e. ρ = 0.
Test-statistic : The Pearson correlation coefficient.
Randomization : We randomize the allocation of the values of the second variable to the values of the
first variable, since under the null hypothesis the pairing of the two variables will be random.
Note : This macro operates in exactly the same way as the simple linear regression macro,
REGRESSSIMRAN. The output is substantially different, reflecting the different emphasis of correlation
as opposed to regression.
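The randomization of pairings can be sketched in Python as follows. This is an illustrative sketch under assumptions, not the macro itself: the function names and seed argument are hypothetical, the observed coefficient is counted as one member of the randomization distribution, and missing values are not handled.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_randomization(x, y, nran=999, seed=None):
    rng = random.Random(seed)
    observed = pearson(x, y)
    shuffled = list(y)

    n_le = n_ge = 1  # assumption: observed coefficient counts once
    for _ in range(nran):
        # Re-pair the second variable with the first at random.
        rng.shuffle(shuffled)
        r = pearson(x, shuffled)
        if r <= observed:
            n_le += 1
        if r >= observed:
            n_ge += 1

    p_negative = n_le / (nran + 1)  # H1: negative correlation
    p_positive = n_ge / (nran + 1)  # H1: positive correlation
    p_two_sided = min(1.0, 2 * min(p_negative, p_positive))
    return observed, p_negative, p_positive, p_two_sided
```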
REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 8).
WORKED EXAMPLE FOR CORRELATIONRAN
Name of dataset
HEXOKINASE
Description
The data is taken from part of a study by McKechnie, concerning electrophoretic frequencies of the
butterfly Euphydryas editha. For each of 18 units (corresponding either to colonies, or to sets of colonies),
the reciprocal of altitude (originally measured in feet * 10^3) is recorded, together with the percentage
frequency of hexokinase 1.00 mobility genes from electrophoresis of samples of Euphydryas editha. We
choose to label these variables "invalt" and "hk" respectively.
Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
McKECHNIE, S.W., EHRLICH, P.R. & WHITE, R.R. (1975), Population genetics of Euphydryas butterflies.
I. Genetic variation and the neutrality hypothesis, Genetics, 81, pp. 571-594.
Data
Number of observations = 18
Number of variables = 2
For each observation, HK and INVALT are given; each INVALT row lines up with the HK row above it.
HK      98.00 36.00 72.00 67.00 82.00 72.00 65.00 1.00 40.00 39.00 9.00
INVALT  2.00 1.25 1.75 1.82 2.63 1.08 2.08 1.59 0.67 0.57 0.50
HK      19.00 42.00 37.00 16.00 4.00 1.00 4.00
INVALT  0.24 0.40 0.50 0.15 0.13 0.11 0.10
Plot
[Plot of hk (vertical axis) against invalt (horizontal axis)]
Minitab worksheet
C1
HK measurements
C2
INVALT measurements
Aims of analysis
To investigate whether HK and INVALT measurements are correlated.
Standard procedure
Welcome to Minitab, press F1 for help.
MTB > Retrieve "N:\resampling\Examples\Hexokinase.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Hexokinase.MTW
# Worksheet was saved on 06/07/01 14:15:38
Results for: Hexokinase.MTW
MTB > Correlation c1 c2.
Correlations: hk, invalt
Pearson correlation of hk and invalt = 0.770
P-Value = 0.000
Resampling procedure
MTB > % N:\resampling\library\correlationran c1 c2 ;
SUBC> nran 499 ;
SUBC> corrs c4.
Executing from file: N:\resampling\library\correlationran.MAC
Data Display (WRITE)
Number of observations              18
Mean of first variable              39.11
Mean of second variable             0.98
Correlation coefficient             0.770
Number of randomizations            499
One sided randomization p-value, H1: -ve correlation   1.0000
One sided randomization p-value, H1: +ve correlation   0.0020
Two sided randomization p-value                        0.0040
Modified worksheet
C4
A column containing 499 correlation coefficients, one for each randomized dataset
Discussion
There is clearly a strong positive correlation between the variables. The standard p-value is 0.000, whilst
the two-sided randomization p-value is 0.004, the smallest possible two-sided value for 499 randomizations
(double the smallest one-sided value of 0.002).
2
BASIC STATISTICS: CONFIDENCE INTERVALS
Overview
Specific procedures
MEANCIBOOT computes bootstrap confidence intervals for a population mean
MEDIANCIBOOT computes bootstrap confidence intervals for a population median
STDEVCIBOOT computes bootstrap confidence intervals for a population standard deviation
General procedures
ANYCIBOOT provides a template for creating a macro to calculate confidence intervals using any test-statistic (so long as it is a function of univariate data).
An introduction to bootstrap confidence intervals
A large number of bootstrap techniques for constructing confidence intervals have been suggested, and the
merits of the different approaches are discussed at length in the statistical literature. We concentrate on
those techniques discussed by Manly (1997). However, in our opinion the clearest introduction to
bootstrap confidence intervals is that of Efron and Tibshirani (1993).
Standard confidence intervals
A standard 95% confidence interval for a parameter estimate is given by
CI = Parameter estimate ± 1.96 * Standard error based on observed sample,
where 1.96 is found from tables of the normal distribution.
Estimate -/+ 1.96 * bootstrap standard deviation
The simplest type of 95% bootstrap confidence interval involves estimating the standard error to be the
standard deviation of the bootstrap parameter estimates (henceforth known simply as the "bootstrap
standard deviation"), so that
CI = Parameter estimate ± 1.96 * Bootstrap standard deviation.
For intervals other than 95%, a value other than 1.96 is required, and can be obtained from normal tables.
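A minimal Python sketch of this simplest interval is given below. It is an illustration only: the function name, the statistic argument (defaulting to the mean) and the seed argument are assumptions made for the example and do not correspond to arguments of the macros.

```python
import random
from statistics import mean, stdev


def bootstrap_sd_interval(data, statistic=mean, nboot=999, z=1.96, seed=None):
    """Estimate -/+ z * bootstrap standard deviation (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(data)
    estimate = statistic(data)
    # Recompute the statistic on nboot resamples drawn with replacement.
    boot_estimates = [statistic(rng.choices(data, k=n)) for _ in range(nboot)]
    # The "bootstrap standard deviation" stands in for the standard error.
    boot_sd = stdev(boot_estimates)
    return estimate - z * boot_sd, estimate + z * boot_sd
```

For an interval other than 95%, pass the corresponding normal quantile as z (for example 1.645 for a 90% interval).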
Bootstrap-t method
Standard confidence intervals involve making assumptions about the distribution of parameters - the 1.96
in the above equations arises because we assume that parameters (or a standardized version of them) are
normally distributed.
Using bootstrapping, we can avoid such assumptions.
Instead, we can find the distribution of the t-statistic for the parameter - a standardized version of the
parameter estimate – from the bootstrap samples.
The confidence interval is then
CI = Parameter estimate ± Bootstrap t-statistic * Standard error based on observed sample.
The bootstrap t-statistic for the dth resample is defined by
tboot(d) = (Parameter estimate for dth resample - Parameter estimate for observed sample) / (Standard error for dth resample)
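As an illustration, a bootstrap-t interval for a mean might be computed as follows. This is a sketch in Python under stated assumptions, not the macro's code: the function name `bootstrap_t_interval` is hypothetical, the standard error of a mean is taken as the sample standard deviation divided by sqrt(n), and the limits follow the usual convention of Efron and Tibshirani (1993), in which the upper quantile of the bootstrap t-statistics gives the lower limit and the lower quantile gives the upper limit.

```python
import math
import random
import statistics

def bootstrap_t_interval(data, nboot=1000, level=95, seed=1):
    """Bootstrap-t confidence interval for a population mean (illustrative sketch)."""
    n = len(data)
    est = statistics.mean(data)
    se = statistics.stdev(data) / math.sqrt(n)   # standard error from the observed sample
    rng = random.Random(seed)
    tboot = []
    for _ in range(nboot):
        resample = [rng.choice(data) for _ in range(n)]
        se_d = statistics.stdev(resample) / math.sqrt(n)
        if se_d > 0:                              # skip degenerate resamples (all values equal)
            tboot.append((statistics.mean(resample) - est) / se_d)
    tboot.sort()
    tail = (1 - level / 100) / 2
    lo_rank = max(math.floor(round(len(tboot) * tail, 9)), 1)   # rounded down
    hi_rank = math.ceil(round(len(tboot) * (1 - tail), 9))      # rounded up
    t_lo, t_hi = tboot[lo_rank - 1], tboot[hi_rank - 1]
    # The upper t-quantile yields the lower limit, and vice versa.
    return est - t_hi * se, est - t_lo * se

data = [3.56, 0.69, 0.10, 1.84, 3.93, 1.25, 0.18, 1.13, 0.27, 0.50,
        0.67, 0.01, 0.61, 0.82, 1.70, 0.39, 0.11, 1.20, 1.21, 0.72]
print(bootstrap_t_interval(data))
```

Because the quantiles of the bootstrap t-statistics need not be symmetric about zero, the resulting interval is generally not symmetric about the estimate.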
Efron percentile method
Assume that a number of bootstrap resamples, nboot, are used. Then, in order to create a 100(1-alpha)%
confidence interval we sort the parameter estimates obtained from the nboot resamples into ascending
order. We then take the [nboot * alpha/2]th value and [nboot * (1-alpha/2)]th value in this sorted list
as our lower and upper confidence limits. If [nboot * alpha/2] is not an integer, we round it down;
correspondingly, we round [nboot * (1-alpha/2)] up; this rounding procedure is conservative.
For example, if there are 1000 bootstrap samples, then we calculate the test-statistics for each sample, and
sort these test-statistics. The 95% confidence interval is formed by taking the 0.025 * 1000 = 25th and
975th test-statistics from this list as our lower and upper limits.
Hall percentile method
A modified version of the Efron percentile method.
If the Efron confidence limits are EfronLow and EfronHigh, then the Hall limits are
HallLow = (2 * Parameter estimate) - EfronHigh
HallHigh = (2 * Parameter estimate) - EfronLow.
Hall confidence intervals will have the same length as Efron confidence intervals.
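The rank selection and the Hall reflection can be sketched as follows. This is an illustrative Python fragment, not the macro's code; the function names are hypothetical, and the tail proportion for a 95% interval is taken to be 0.025, as in the 1000-sample example above.

```python
import math

def efron_ranks(nboot, level=95):
    """Ranks (1-based) of the lower and upper Efron percentile limits."""
    tail = (1 - level / 100) / 2
    lower = max(math.floor(round(nboot * tail, 9)), 1)   # rounded down (conservative)
    upper = math.ceil(round(nboot * (1 - tail), 9))      # rounded up (conservative)
    return lower, upper

def efron_limits(boot_estimates, level=95):
    """Efron percentile limits read off the sorted bootstrap estimates."""
    ordered = sorted(boot_estimates)
    lower, upper = efron_ranks(len(ordered), level)
    return ordered[lower - 1], ordered[upper - 1]

def hall_limits(estimate, efron_low, efron_high):
    """Hall limits: the Efron limits reflected about the observed estimate."""
    return 2 * estimate - efron_high, 2 * estimate - efron_low

print(efron_ranks(1000))                     # (25, 975)
print(hall_limits(1.0445, 0.6255, 1.553))    # approximately (0.536, 1.4635)
```

The second call uses the estimate and Efron limits reported in the MEANCIBOOT worked example; the result agrees with the Hall limits printed there (0.5355 and 1.463) up to the rounding of the displayed values.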
BC percentile method
An extension of the Efron percentile method, in which possible bias in the parameter estimate is corrected
for. The correction alters the rank values of the lower and upper endpoints used in the percentile method.
BCa percentile method
An extension of the BC percentile method, in which the possibilities of both bias and non-constant
standard error are corrected for. The corrections alter the rank values of the lower and upper endpoints
used in the percentile method.
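The way in which the corrections alter the ranks can be sketched as follows. The formulas are the standard ones given by Efron and Tibshirani (1993); that the macros use exactly these formulas is an assumption, since their internal code is not reproduced in this guide. The function names are hypothetical. The adjusted tail proportions returned by `adjusted_tails` replace alpha/2 and 1 - alpha/2 when choosing ranks in the sorted list of bootstrap estimates.

```python
from statistics import NormalDist

_norm = NormalDist()

def bias_correction(boot_estimates, estimate):
    """z0: normal quantile of the proportion of bootstrap estimates below the observed estimate.

    Raises an error if that proportion is exactly 0 or 1.
    """
    below = sum(1 for b in boot_estimates if b < estimate)
    return _norm.inv_cdf(below / len(boot_estimates))

def adjusted_tails(z0, accel=0.0, level=95):
    """Adjusted lower and upper tail proportions.

    accel = 0 gives the BC method; a non-zero acceleration gives the BCa method.
    """
    z = _norm.inv_cdf((1 - level / 100) / 2)    # about -1.96 when level = 95

    def adjust(zq):
        return _norm.cdf(z0 + (z0 + zq) / (1 - accel * (z0 + zq)))

    return adjust(z), adjust(-z)

# Bias-correction and acceleration as reported in the MEANCIBOOT worked example
print(adjusted_tails(0.0652))           # BC method
print(adjusted_tails(0.0652, 0.0612))   # BCa method
```

With no bias and no acceleration (z0 = 0, accel = 0) the tail proportions reduce to 0.025 and 0.975, so that the BC and BCa methods coincide with the Efron percentile method.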
Relationship between different bootstrap methods
MEANCIBOOT
This macro is designed to calculate bootstrap confidence intervals for a population mean.
RUNNING THE MACRO
Calling statement
meanciboot c1 ;
siglev k1 (95) ;
nboot k1 (2000);
means c1 ;
quantiles c1-c3 ;
tvalues c1.
Input
Input to the macro must be a single column, containing only numerical values.
Subcommands
siglev    The level of the confidence interval, expressed as a percentage.
          The default is 95 (corresponding to a 95% confidence interval); other standard choices are
          90, 98 or 99.
nboot     The number of bootstrap samples used. The default is 2000. It is not recommended to use
          fewer than 1000 for the construction of confidence intervals.
means     Specify a column in which to store bootstrap sample means.
quantiles Specify three columns in which to store ranks corresponding to the lower and upper
          confidence interval limits, for the standard percentile method (column 1), the BC method
          (column 2) and the BCa method (column 3).
tvalues   Specify a column in which to store bootstrap sample t-statistics.
Output
• Basic information (number of data points, significance level, number of bootstrap samples)
• Sample mean, with associated standard error
• Sample standard deviation
• Bootstrap standard deviation about the estimated mean
• Overall bootstrap mean
• Estimated bias correction (for BC and BCa methods)
• Estimated acceleration (for BCa method)
• Standard bootstrap confidence interval
• Bootstrap confidence intervals using : Estimate -/+ 1.96*bootstrap standard deviation, Bootstrap-t
method, Efron percentile method, Hall percentile method, BC method, BCa method.
Speed of macro : FAST
Missing data : Allowed
ALTERNATIVE PROCEDURES
Standard procedures
tinterval c1
This produces a confidence interval about a mean value, in the situation in which variance is unknown.
zinterval k1 c1
This produces a confidence interval about a mean value, where the variance is known to be k1.
REFERENCES
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 3).
EFRON, B. & TIBSHIRANI, J. (1993) An introduction to the Bootstrap, Chapman and Hall, London
(Chapters 12-14).
WORKED EXAMPLE FOR MEANCIBOOT
Name of dataset
EXPONENTIAL
Description
The data are 20 realisations from an Exponential distribution with rate parameter 1.
Source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Data
Number of observations = 20
Number of variables = 1
3.56 0.69 0.10 1.84 3.93 1.25 0.18 1.13 0.27 0.50 0.67 0.01 0.61
0.82 1.70 0.39 0.11 1.20 1.21 0.72
Worksheet
C1  Data
Aims of analysis
To create confidence intervals for the population mean.
Standard procedure
MTB > Retrieve "N:\resampling\Examples\Exponential.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Exponential.MTW
# Worksheet was saved on 23/08/01 12:16:52
Results for: Exponential.MTW
MTB > OneT c1.
One-Sample T: C1
Variable   N    Mean   StDev   SE Mean   95.0% CI
C1        20   1.044   1.060     0.237   (0.549, 1.540)
Resampling procedure
MTB > % N:\resampling\library\meanciboot c1 ;
SUBC> siglev 95 ;
SUBC> nboot 1000 ;
SUBC> means c3 ;
SUBC> quantiles c5-c7 ;
SUBC> tvalues c9.
Executing from file: N:\resampling\library\meanciboot.MAC
Data Display
STANDARD CONFIDENCE INTERVALS
Data Display (WRITE)
Number of data values 20
Mean of data values 1.0445
Standard deviation of data values 1.0597
Standard error of the mean 0.23695
Significance level for confidence intervals 95
Estimated confidence interval, lower bound (Standard t method) 0.54855
Estimated confidence interval, upper bound (Standard t method) 1.5404
BOOTSTRAP CONFIDENCE INTERVALS
Data Display (WRITE)
Number of bootstrap samples 1000
Overall mean for bootstrap samples 1.042
Standard deviation of bootstrap means 0.2407
Estimated bias-correction (for BC, BCa) 0.0652
Estimated acceleration (for BCa) 0.0612
Confidence limits
Data Display (WRITE)
Estimate -/+ 1.96*boot sd   0.5727   1.516
Bootstrap-t method          0.6405   1.932
Efron percentile method     0.6255   1.553
Hall percentile method      0.5355   1.463
BC percentile method        0.6400   1.580
BCa percentile method       0.6650   1.718
Modified worksheet
C3  A column containing 1000 sample means, one for each bootstrap resample
C5  Upper and lower rank positions for percentile confidence limits using the Efron method
C6  Upper and lower rank positions for percentile confidence limits using the BC method
C7  Upper and lower rank positions for percentile confidence limits using the BCa method
C9  A column containing 1000 t-statistics for sample means, one for each bootstrap resample
Columns c5 - c7 each contain 2 values.
Discussion
There is a fair amount of variation between the different confidence intervals. All 7 intervals include the
true population mean of one. The Efron percentile method produces the shortest interval in this case, the
bootstrap t-interval the longest. The bootstrap-t and BCa intervals generally imply larger values for the
mean, whilst the Hall, standard t and estimate -/+ 1.96 * bootstrap standard deviation intervals generally
imply smaller values for the mean. These are not general properties of the different methods, however.
Manly (1997) performs a simulation study to investigate the coverage of the different bootstrap intervals
for a sample of 20 observations from an exponential distribution with rate parameter one. He finds that
the bootstrap-t method (with 95.2% coverage) has the closest coverage to the nominal 95% level, closely
followed by the BCa method (with 92.4% coverage).
MEDIANCIBOOT
This macro is designed to calculate bootstrap confidence intervals for a population median.
RUNNING THE MACRO
Calling statement
medianciboot c1 ;
siglev k1 (95) ;
nboot k1 (2000);
medians c1 ;
quantiles c1-c3 ;
tvalues c1.
Input
Input to the macro must be a single column, containing only numerical values. Discrete or continuous data
are allowed. Missing data is allowed.
Subcommands
siglev    The level of the confidence interval, expressed as a percentage.
          The default is 95 (corresponding to a 95% confidence interval); other standard choices are
          90, 98 or 99.
nboot     The number of bootstrap samples used. The default is 2000. It is not recommended to use
          fewer than 1000 for the construction of confidence intervals.
medians   Specify a column in which to store bootstrap sample medians.
quantiles Specify three columns in which to store ranks corresponding to the lower and upper
          confidence interval limits, for the standard percentile method (column 1), the BC method
          (column 2) and the BCa method (column 3).
tvalues   Specify a column in which to store bootstrap sample t-statistics.
Output
• Basic information (number of data points, significance level, number of bootstrap samples)
• Sample median, with standard error (assuming a normal distribution)
• Bootstrap standard deviation about the estimated median
• Estimated bias correction (for BC and BCa methods)
• Estimated acceleration (for BCa method)
• Standard nonparametric confidence interval for the median
• Bootstrap confidence intervals using : Estimate -/+ 1.96*bootstrap standard deviation, Bootstrap-t
method, Efron percentile method, Hall percentile method, BC method, BCa method.
Speed of macro : fast
ALTERNATIVE METHODS : Standard methods
sinterval 95 c1.
produces three different 95% nonparametric confidence intervals for the median.
The first and third intervals are based upon exact ranks, and have exact achieved confidence levels.
These confidence levels will not, in general, be equal to 95% : the first interval is the interval with the
closest confidence level to 95% which is below 95%, the third interval that with the closest confidence
level to 95% which is above 95%. Hence, the 3rd procedure is conservative, the 1st anti-conservative.
The 2nd interval is an approximate confidence interval based upon interpolation.
In the output to our macro, we include the conservative nonparametric confidence interval.
The construction is discussed in "technical details".
TECHNICAL DETAILS : nonparametric confidence interval for the median
The nonparametric confidence interval for the median is formed by finding the rank order of the lower and
upper limits using
Lower limit : (n + 1)/2 – (0.9789 * sqrt(n)), rounded down to the nearest integer
Upper limit : (n + 1)/2 + (0.9789 * sqrt(n)), rounded up to the nearest integer,
where n is sample size.
[Notes: 1. for n > 283, we use 0.9800 instead of 0.9789
2. for n = 17, the formula provides rank values of 4 and 14, but we use 5 and 13.
3. for n = 67, the formula provides rank values of 25 and 43, but we use 26 and 42].
The data points corresponding to these rank orders then form the confidence interval.
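The rank calculation, including the special cases listed in the notes, can be sketched as follows. This is an illustrative Python fragment, not the macro's code, and the function names are hypothetical. The upper rank is taken as (n + 1)/2 + 0.9789 * sqrt(n), which is the form that reproduces the ranks quoted in the notes for n = 17 and n = 67.

```python
import math

def median_ci_ranks(n):
    """Ranks (1-based) of the lower and upper limits of the 95% nonparametric interval."""
    if n == 17:                  # special cases listed in the notes
        return 5, 13
    if n == 67:
        return 26, 42
    coeff = 0.9800 if n > 283 else 0.9789
    half_width = coeff * math.sqrt(n)
    lower = math.floor((n + 1) / 2 - half_width)   # rounded down
    upper = math.ceil((n + 1) / 2 + half_width)    # rounded up
    if lower < 1 or upper > n:
        raise ValueError("sample too small for a 95% interval")
    return lower, upper

def median_ci(data):
    """Data values at the lower and upper ranks of the sorted sample."""
    ordered = sorted(data)
    lower, upper = median_ci_ranks(len(ordered))
    return ordered[lower - 1], ordered[upper - 1]

# EXPONENTIAL dataset (20 observations) from the worked examples
data = [3.56, 0.69, 0.10, 1.84, 3.93, 1.25, 0.18, 1.13, 0.27, 0.50,
        0.67, 0.01, 0.61, 0.82, 1.70, 0.39, 0.11, 1.20, 1.21, 0.72]
print(median_ci_ranks(20))   # (6, 15)
print(median_ci(data))       # (0.39, 1.21)
```

For the EXPONENTIAL dataset this gives the interval (0.39, 1.21), which is the conservative interval (position 6) reported by the Minitab command sinterval in the worked example.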
Details of these procedures can be found at :
http://www.umanitoba.ca/centres/mchpe/concept/dict/Statistics/ci_median/
http://www.maths.unb.ca/~knight/utility/MedInt95.htm.
REFERENCES
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 3).
EFRON, B. & TIBSHIRANI, J. (1993) An introduction to the Bootstrap, Chapman and Hall, London
(Chapters 12-14).
WORKED EXAMPLE FOR MEDIANCIBOOT
Data
EXPONENTIAL (see MEANCIBOOT)
Aims of analysis
To create confidence intervals for the population median.
Standard procedure : Sign confidence interval
MTB > SInterval 95.0 c1.
Sign CI: C1
Sign confidence interval for median
                     Achieved
      N  Median  Confidence  Confidence interval  Position
C1   20   0.705      0.8847  ( 0.500, 1.200)      7
                     0.9500  ( 0.416, 1.208)      NLI
                     0.9586  ( 0.390, 1.210)      6
Resampling procedure
MTB > Retrieve "N:\resampling\Examples\Exponential.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Exponential.MTW
# Worksheet was saved on 23/08/01 12:16:52
Results for: Exponential.MTW
MTB > % N:\resampling\library\medianciboot c1 ;
SUBC> siglev 95 ;
SUBC> nboot 1000 ;
SUBC> medians c3 ;
SUBC> quantiles c5-c7 ;
SUBC> tvalues c9.
Executing from file: N:\resampling\library\medianciboot.MAC
General information
Data Display (WRITE)
Number of data values 20
Median of data values 0.70500
Standard error of the median 0.29697
Significance level for confidence intervals 95
Number of bootstrap samples 1000
Bootstrap standard deviation 0.2019
Estimated bias-correction (for BC, BCa) -0.1156
Estimated acceleration (for BCa) -0.0000
Confidence limits
Data Display (WRITE)
Standard non-parametric method   0.3900   1.210
Estimate -/+ 1.96*boot sd        0.3093   1.101
Bootstrap-t method               0.1550   1.082
Efron percentile method          0.4450   1.205
Hall percentile method           0.2050   0.9650
BC percentile method             0.3850   1.200
BCa percentile method            0.3850   1.200
Modified worksheet
C3  A column containing 1000 sample medians, one for each bootstrap resample
C5  Upper and lower rank positions for percentile confidence limits using the Efron method
C6  Upper and lower rank positions for percentile confidence limits using the BC method
C7  Upper and lower rank positions for percentile confidence limits using the BCa method
C9  A column containing 1000 t-statistics for sample medians, one for each bootstrap resample
Columns c5 - c7 each contain 2 values.
STDEVCIBOOT
This macro is designed to calculate bootstrap confidence intervals about a population standard deviation.
RUNNING THE MACRO
Calling statement
stdevciboot c1 ;
siglev k1 (95) ;
nboot k1 (2000);
stdevs c1 ;
quantiles c1-c3.
Input
Input to the macro must be a single column, containing only numerical values. Discrete or continuous data
are allowed. Missing data is allowed.
Subcommands
siglev    The level of the confidence interval, expressed as a percentage.
          The default is 95 (corresponding to a 95% confidence interval); other standard choices are
          90, 98 or 99.
nboot     The number of bootstrap samples used. The default is 2000. It is not recommended to use
          fewer than 1000 for the construction of confidence intervals.
stdevs    Specify a column in which to store bootstrap sample standard deviations.
quantiles Specify three columns in which to store ranks corresponding to the lower and upper
          confidence interval limits, for the standard percentile method (column 1), the BC method
          (column 2) and the BCa method (column 3).
          Six ranks are given; the first two ranks correspond to the ranks for the lower and upper
          confidence limits for the standard percentile confidence interval.
          The next two ranks correspond to the ranks for the bias-corrected (BC) percentile intervals,
          the final two ranks correspond to ranks for the accelerated bias-corrected (BCa) percentile
          intervals.
Output
• Basic information (number of data points, significance level, number of bootstrap samples)
• Sample standard deviation
• Bootstrap standard deviation about the estimated standard deviation
• Estimated bias correction (for BC and BCa methods)
• Estimated acceleration (for BCa method)
• Confidence interval using chi-squared approximation
• Bootstrap confidence intervals using : Estimate -/+ 1.96*bootstrap standard deviation, Efron percentile
method, Hall percentile method, BC method, BCa method.
Speed of macro : Fast.
ALTERNATIVE PROCEDURES
Standard procedure : No built-in Minitab function, but the macro incorporates a confidence interval
obtained using the following approximation based upon the chi-squared distribution :
The standard 100(1 – alpha)% confidence interval for a standard deviation has limits of
sqrt{(n – 1) * sample variance / appropriate quantiles of the chi-squared n-1 distribution},
where n is sample size. For a 95% confidence interval, the 2.5% and 97.5% quantiles are used.
This interval is based on the fact that, if data is normally distributed, the quantity
Sample variance * (n – 1) / Population variance
has a chi-squared distribution with n – 1 degrees of freedom.
References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 3).
EFRON, B. & TIBSHIRANI, J. (1993) An introduction to the Bootstrap, Chapman and Hall, London
(Chapters 12-14).
WORKED EXAMPLE FOR STDEVCIBOOT
Data
EXPONENTIAL (see MEANCIBOOT)
Aims of analysis
To create confidence intervals for the population standard deviation.
Resampling procedure
MTB > Retrieve "N:\resampling\Examples\Exponential.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Exponential.MTW
# Worksheet was saved on 23/08/01 12:16:52
Results for: Exponential.MTW
MTB > % N:\resampling\library\stdevciboot c1 ;
SUBC> siglev 95 ;
SUBC> nboot 2000 ;
SUBC> stdevs c3 ;
SUBC> quantiles c5-c7.
Executing from file: N:\resampling\library\stdevciboot.MAC
Data Display
BOOTSTRAP CONFIDENCE INTERVALS FOR A POPULATION VARIANCE
Histogram simsort
General information
Data Display (WRITE)
Number of data values 20
Observed variance 1.0597
Significance level for confidence intervals 95
Number of bootstrap samples 2000
Bootstrap standard deviation about the variance 0.2474
Estimated bias-correction (for BC, BCa) 0.1231
Estimated acceleration (for BCa) 0.1009
Confidence intervals
Data Display (WRITE)
Standard chi-squared based interval   0.7829   1.504
Estimate -/+ 1.96*bootstrap SE        0.5749   1.545
Efron percentile method               0.4842   1.436
Hall percentile method                0.6836   1.635
BC percentile method                  0.5137   1.474
BCa percentile method                 0.5742   1.549
[Figure: histogram "Distribution of variances from bootstrap resamples"; horizontal axis: Sample variance (0.2 to 1.7); vertical axis: Frequency (0 to 80)]
Modified worksheet
C3  A column containing 2000 sample standard deviations, one for each bootstrap resample
C5  Upper and lower rank positions for percentile confidence limits using the Efron method
C6  Upper and lower rank positions for percentile confidence limits using the BC method
C7  Upper and lower rank positions for percentile confidence limits using the BCa method
Columns c5 - c7 each contain 2 values.
Discussion
The different methods produce substantially different results. The confidence interval based upon
standard methods is substantially shorter than any of the bootstrap confidence intervals. Manly (1997)
performs a simulation study to investigate the coverage of the different methods for a sample of 20 from
an exponential distribution with parameter 1. The coverages are extremely poor for all of the methods.
Against a nominal coverage level of 95%, the bootstrap methods achieve coverages of between 65.9%
(Efron method) and 72.7% (Hall method), whilst the standard method has a coverage of 72.7%. Some
improvement to the standard and Hall methods can be obtained by taking logarithms, but the coverage
remains poor. We see that the bootstrap distribution of the standard deviation (see figure above) is very
lumpy, and this may explain the poor performance of the methods.
ADDITIONAL SAMPLE DATASET FOR STDEVCIBOOT
Name of dataset
SPATIAL
Description
For each of 26 neurologically impaired children, the results of two tests of spatial perception, A and B,
are recorded.
Source
EFRON, B. & TIBSHIRANI, J. (1993) An introduction to the Bootstrap, Chapman and Hall, London.
Data
Number of observations = 26
Number of variables = 2
For each child, A scores (top) and B scores (bottom) are shown.
48 36 20 29 42 42 20 42 22 41 45 14 6 0 33 28 34 4 32
42 33 16 39 38 36 15 33 20 43 34 22 7 15 34 29 41 13 38
24 47 41 24 26 30 41
25 27 41 28 14 28 40
Aims of analysis
Efron and Tibshirani (1993) produce confidence intervals for test A scores.
ANYCIBOOT
This macro is designed to provide a template for the creation of a bootstrap confidence interval using any
test-statistic.
ADAPTING THE MACRO
In order to change the test-statistic, modify the three lines of code denoted by hashed boxes (and clearly
marked). A number of possible alternatives are provided, but almost any test-statistic for univariate data
may be used. If the test-statistic is complex, any additional variables included within the code for its
computation must be declared. For complex test-statistics, it may be better to call another local macro to
compute the test-statistic at each stage.
Note on a potential bug : For some test-statistics, it may be impossible to calculate the acceleration for
particular datasets.
If there is any risk that, for a given test-statistic, the test-statistic will take the same value for each subset
of the data formed by excluding one datapoint at a time, then the details and calculations concerning the
calculation of BCa intervals should be excluded from the code (contact the authors for further details).
Example : If the test-statistic is the median and the observed dataset is [1,3,3,3,6], then the medians for
each of the restricted datasets [3,3,3,6],[1,3,3,6],[1,3,3,6],[1,3,3,6],[1,3,3,3] (formed by missing out the
values 1,3,3,3 and 6 respectively) are all 3, and so acceleration cannot be calculated.
For many test-statistics, such as the mean and standard deviation, it is highly unlikely that this
phenomenon will arise (in fact, the only situation we can envisage is if all of the data are equal, in which
case resampling methods are clearly inappropriate anyway), but it may be a risk for other test-statistics
based upon quantiles or ranks (e.g. interquartile range).
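The problem can be seen in a short sketch of the leave-one-out (jackknife) calculation. The formula used here is the standard acceleration estimate of Efron and Tibshirani (1993); that the macro uses exactly this formula is an assumption, although it does reproduce the acceleration of 0.0612 reported for the mean of the EXPONENTIAL dataset in the worked examples. The code is in Python rather than Minitab, and the function name is hypothetical.

```python
import statistics

def jackknife_acceleration(data, stat):
    """Acceleration from leave-one-out estimates; None when it cannot be calculated."""
    n = len(data)
    # Test-statistic for each subset formed by excluding one data point at a time
    leave_one_out = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    numerator = sum((centre - v) ** 3 for v in leave_one_out)
    denominator = 6 * sum((centre - v) ** 2 for v in leave_one_out) ** 1.5
    if denominator == 0:         # all leave-one-out values are equal
        return None
    return numerator / denominator

exponential = [3.56, 0.69, 0.10, 1.84, 3.93, 1.25, 0.18, 1.13, 0.27, 0.50,
               0.67, 0.01, 0.61, 0.82, 1.70, 0.39, 0.11, 1.20, 1.21, 0.72]

print(jackknife_acceleration(exponential, statistics.mean))        # about 0.061
print(jackknife_acceleration([1, 3, 3, 3, 6], statistics.median))  # None
```

A template macro adapted to a quantile-based test-statistic could guard against this case in the same way, by checking whether the leave-one-out values are all equal before attempting the BCa calculation.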
RUNNING THE MACRO
Calling statement
anyciboot c1 ;
siglev k1 (95) ;
nboot k1 (2000);
teststats c1 ;
quantiles c1-c3.
Input
Input to the macro must be a single column, containing only numerical values. Discrete or continuous data
are allowed. Missing data is not allowed.
Subcommands
siglev    The level of the confidence interval, expressed as a percentage.
          The default is 95 (corresponding to a 95% confidence interval); other standard choices are
          90, 98 or 99.
nboot     The number of bootstrap samples used. The default is 2000. It is not recommended to use
          fewer than 1000 for the construction of confidence intervals.
teststats Specify a column in which to store bootstrap sample test-statistics.
quantiles Specify three columns in which to store ranks corresponding to the lower and upper
          confidence interval limits, for the standard percentile method (column 1), the BC method
          (column 2) and the BCa method (column 3).
          Six ranks are given; the first two ranks correspond to the ranks for the lower and upper
          confidence limits for the standard percentile confidence interval.
          The next two ranks correspond to the ranks for the bias-corrected (BC) percentile intervals,
          the final two ranks correspond to ranks for the accelerated bias-corrected (BCa) percentile
          intervals.
Output
• Basic information (number of data points, significance level, number of bootstrap samples)
• Observed test-statistic
• Bootstrap standard deviation about the test-statistic
• Estimated bias correction (for BC and BCa methods)
• Estimated acceleration (for BCa method)
• Bootstrap confidence intervals using : Estimate -/+ 1.96*bootstrap standard deviation, Efron percentile
method, Hall percentile method, BC method, BCa method.
Speed of macro : FAST
ALTERNATIVE PROCEDURES : Other macros
Specific macros exist to compute confidence intervals for means, medians and standard deviations. See
MEANCIBOOT    bootstrap confidence interval for a mean
MEDIANCIBOOT  bootstrap confidence interval for a median
STDEVCIBOOT   bootstrap confidence interval for a standard deviation.
TECHNICAL DETAILS
The choice of test-statistic is important, and will have a critical effect upon the results obtained.
A statistician should be consulted, to determine whether or not the required assumptions underlying
resampling procedures hold for the given test-statistic.
References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 3).
EFRON, B. & TIBSHIRANI, J. (1993) An introduction to the Bootstrap, Chapman and Hall, London
(Chapters 12-14).
WORKED EXAMPLE FOR ANYCIBOOT
Data
EXPONENTIAL (see MEANCIBOOT)
Aims of analysis
To create confidence intervals for the population mean.
Resampling procedure
MTB > Retrieve "N:\resampling\Examples\Exponential.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Exponential.MTW
# Worksheet was saved on 23/08/01 12:16:52
Results for: Exponential.MTW
MTB > % N:\resampling\library\anyciboot c1 ;
SUBC> siglev 95 ;
SUBC> nboot 1000 ;
SUBC> teststats c3 ;
SUBC> quantiles c5-c7.
Executing from file: N:\resampling\library\anyciboot.MAC
BOOTSTRAP CONFIDENCE INTERVALS ABOUT A POPULATION TEST-STATISTIC
Histogram simsort
[Figure: histogram "Distribution of test statistic from bootstrap resamples"; horizontal axis: Sample test statistic (0.5 to 2.0); vertical axis: Frequency (0 to 30)]
General information
Data Display (WRITE)
Number of data values 20
Observed test-statistic 1.044
Significance level for confidence intervals 95
Number of bootstrap samples 1000
Bootstrap standard deviation about the test-statistic 0.2271
Estimated bias-correction (for BC, BCa) 0.0326
Estimated acceleration (for BCa) 0.0612
Confidence intervals
Data Display (WRITE)
Estimate -/+ 1.96*bootstrap SE   0.5994   1.490
Efron percentile method          0.6480   1.544
Hall percentile method           0.5455   1.441
BC percentile method             0.6525   1.553
BCa percentile method            0.6960   1.635
Modified worksheet
C3  A column containing 1000 sample test-statistics, one for each bootstrap resample
C5  Upper and lower rank positions for percentile confidence limits using the Efron method
C6  Upper and lower rank positions for percentile confidence limits using the BC method
C7  Upper and lower rank positions for percentile confidence limits using the BCa method
Columns c5 - c7 each contain 2 values.
Discussion
The results are very similar to those obtained using MEANCIBOOT.
3
ANALYSIS OF VARIANCE
Overview
One-way analysis of variance
ONEWAYRAN tests for a factor effect in a one-way analysis of variance
Two-way analysis of variance
TWOWAYRAN tests for a group effect in a two-way analysis of variance, without replication
TWOWAYREPRAN tests for a group effect in a two-way analysis of variance, with replication
Testing for constant variance
LEVENERAN tests for constant variance using a randomization version of Levene's test
ONEWAYRAN
This macro is designed to perform a one-way analysis of variance.
Randomization is used to assess the significance of the factor effect.
Calling statement
onewayran c1 c2 ;
nran k1 (999) ;
fvalues c1.
Input
C1  Data. A column containing only numeric values.
C2  Group. A column containing only numeric values. The number of distinct numeric values used
    should be equal to the number of groups, with each value denoting a particular group.
Subcommands
nran     Number of randomizations
fvalues  Specify a column in which to store simulated F-ratios for group effect.
Output
Identical to the output from the standard Minitab command "oneway", with the addition of a
randomization p-value for group effect.
Speed of macro : FAST
ALTERNATIVE PROCEDURES : Standard procedures
oneway C1 C2.
This performs a one-way analysis of variance. The response variable is provided in c1, the factor levels
corresponding to each data point are provided in c2.
TECHNICAL DETAILS
Null hypothesis : Means for all groups are equal, so that μ1 = μ2 = … = μg, where there are g groups and
μi is the mean data value for the ith group.
Test-statistic :
The standard F-ratio produced in a one-way analysis of variance.
Randomization procedure : We fix the data value for each individual, and fix group sizes. We then
randomize the allocation of data to groups. This is valid, since under the null hypothesis the allocation of
group labels to individuals is random.
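The procedure can be sketched as follows, in Python rather than Minitab. This is an illustration under stated assumptions, not the macro's code: the function names are hypothetical, the seed is arbitrary, and the p-value counts the observed allocation as one of the randomizations, i.e. (number of randomized F-ratios at least as large as the observed one + 1) / (nran + 1). The data are the 24 observations of the MONTHS dataset used in the worked example for this macro.

```python
import random

def f_ratio(values, groups):
    """Standard one-way ANOVA F-ratio for the group effect."""
    labels = sorted(set(groups))
    n, g = len(values), len(labels)
    grand = sum(values) / n
    ss_between = ss_within = 0.0
    for label in labels:
        members = [v for v, k in zip(values, groups) if k == label]
        mean = sum(members) / len(members)
        ss_between += len(members) * (mean - grand) ** 2
        ss_within += sum((v - mean) ** 2 for v in members)
    return (ss_between / (g - 1)) / (ss_within / (n - g))

def oneway_randomization(values, groups, nran=999, seed=1):
    """Observed F-ratio and randomization p-value; group sizes stay fixed."""
    rng = random.Random(seed)
    observed = f_ratio(values, groups)
    shuffled = list(values)
    count = 1                      # the observed allocation itself
    for _ in range(nran):
        rng.shuffle(shuffled)      # reallocate the data values to the group labels
        if f_ratio(shuffled, groups) >= observed:
            count += 1
    return observed, count / (nran + 1)

# MONTHS dataset: dry biomass of ants, and month, for 24 lizards
biomass = [13, 242, 105, 600, 82, 8, 59, 20, 40, 52, 1889, 18,
           2, 245, 515, 488, 88, 233, 44, 21, 0, 5, 6, 50]
month = [1, 1, 1, 3, 3, 2, 2, 2, 3, 3, 3, 4,
         2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 3]

f_obs, p_value = oneway_randomization(biomass, month)
print(round(f_obs, 2), p_value)
```

The observed F-ratio agrees with the value of 1.64 in the standard ANOVA table of the worked example. The randomization p-value depends on the random number stream, and so will differ slightly from run to run and from the value reported by the macro.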
REFERENCES
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 7).
WORKED EXAMPLE FOR ONEWAYRAN
Name of dataset
MONTHS
Description
Again, the data is taken from the study by Powell and Russell, and concerns the stomach contents of the
Eastern horned lizard Phrynosoma douglassi brevirostre. The data record, for each of four months, the
amount of dry biomass of ants for the 24 adult and yearling females mentioned above. Manly (1997) uses
these data to perform a one-way analysis of variance.
Our source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
POWELL, G.L. & RUSSELL, A.P. (1984), The diet of the eastern short-horned lizard (Phrynosoma douglassi
brevirostre) in Alberta and its relationship to sexual size dimorphism, Canadian Journal of Zoology, 62,
pp. 428-440.
POWELL, G.L. & RUSSELL, A.P. (1985), Growth and sexual size dimorphisms in Alberta populations of the
eastern short-horned lizard, Phrynosoma douglassi brevirostre, Canadian Journal of Zoology, 63, pp.
139-154.
Data
Number of observations = 24
Number of variables = 2
For each observation, data (top) and month (bottom) are given.
  13  242  105  600   82    8   59   20   40   52 1889   18
   1    1    1    3    3    2    2    2    3    3    3    4
   2  245  515  488   88  233   44   21    0    5    6   50
   2    2    3    3    3    3    4    4    4    4    4    3
Worksheet
C1  Data
C2  Month
Aims of analysis
To investigate whether month has an impact upon stomach biomass
Standard procedure
Retrieving worksheet from file: N:\resampling\Examples\Months.MTW
# Worksheet was saved on 07/06/01 09:11:46 AM
Results for: Months.MTW
MTB > Oneway c1 c2.
Randomization procedure
MTB > % N:\resampling\library\onewayran c1 c2 ;
SUBC> nran 999 ;
SUBC> fvalues c4.
Executing from file: N:\resampling\library\onewayran.MAC
STANDARD ONE-WAY ANOVA ANALYSIS
One-way ANOVA: biomass versus month
Analysis of Variance for biomass
Source   DF        SS        MS      F      P
month     3    726695    242232   1.64  0.211
Error    20   2947024    147351
Total    23   3673719

                             Individual 95% CIs For Mean
                             Based on Pooled StDev
Level    N    Mean   StDev   --+---------+---------+---------+---
1        3   120.0   115.2   (--------------*--------------)
2        5    66.8   102.1   (-----------*-----------)
3       10   403.7   565.4   (-------*--------)
4        6    15.7    16.1   (----------*---------)
                             --+---------+---------+---------+---
Pooled StDev = 383.9         -300        0       300       600
RANDOMIZATION P-VALUES
Data Display (WRITE)
Number of groups 4
Number of randomizations 999
Randomization p-value 0.1940
Modified worksheet
C4  A column containing 999 F-ratios, one for each randomized dataset
Discussion
There is no real evidence for a month effect, with p-values of 0.211 (standard methods) and 0.194
(randomization).
ADDITIONAL SAMPLE DATASET FOR ONEWAYRAN
Name of dataset
COLONY
Our source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London, pp. 162-166.
Original source
CAIN, A.J. & SHEPPARD, P.M. (1950), Selection in the polymorphic land snail Cepaea nemoralis,
Heredity, 4, 275-294.
Data
Number of observations = 17
Number of variables = 2
For each point, the data (top) and group (bottom) are given
25.0 26.9 8.1 13.5 3.8 9.1 30.9 17.1 37.4 26.9 76.2 40.9 58.1
1.0 1.0 2.0 2.0 2.0 3.0 3.0 3.0 3.0 3.0 4.0 4.0 4.0
18.4 64.2 42.6 45.1
4.0 5.0 5.0 6.0
TWOWAYRAN
This macro is designed to perform a two-way analysis of variance, in a situation in which there is
no replication.
Important note
• The macro is only capable of assessing the impact of one of the factors in a two-way ANOVA.
The user must choose which factor is of interest – this factor will be called the group. The remaining
factor is treated as a nuisance factor, and we will call this the block.
• If the user is interested in the (individual) effect of both factors, then the macro must be run twice,
with the “group” and “block” factors swapped the second time the macro is run.
RUNNING THE MACRO
Calling statement
twowayran c1 c2 c3 ;
nran k1 (999) ;
listdata c1-c3 ;
ssquares c1 ;
fvalues c1.
Input
C1  Data. A column containing only numeric values.
C2  Group. A column containing only numeric values. The number of distinct numeric values used
    should be equal to the number of groups, with each value denoting a particular group.
C3  Block. A column containing only numeric values. The number of distinct numeric values used
    should be equal to the number of blocks, with each value denoting a particular block.
Subcommands
nran      Number of randomizations
listdata  Specify three columns in which to store the sorted data. Within the macro, data is sorted,
          and group markers are changed to be consecutive integers, and this may make output
          difficult to interpret. In order to see which group is which, output the sorted data to
          "listdata", and compare this against the original dataset.
ssquares  Specify a column in which to store sums of squares for group effect.
fvalues   Specify a column in which to store simulated F-ratios for group effect.
Output
Identical to the output from the standard Minitab command "twoway", with the addition of
randomization p-values for the group effect obtained using two different test-statistics (mean square and F-ratio).
Speed of macro : MODERATE
ALTERNATIVE PROCEDURES
Other macros
This macro is suitable only if there is no replication (i.e. if there is only one observation for each factor-by-factor combination); if replication is present, then you should use TWOWAYREPRAN.
Standard procedures
twoway C1 C2 C3.
This performs a two-way analysis of variance. The response variable is provided in c1, and the factor levels
corresponding to each data point are provided in c2 and c3.
TECHNICAL DETAILS
Null hypothesis : The means of the response variable are constant across groups.
Test-statistic : The F-ratio for group effect in a two-way ANOVA.
Randomization procedure : We use a restricted randomization; we fix the allocation of data to blocks
(and fix the number of data points in each group within each block). We randomize the allocation of data
to groups within each block.
Notes : The two-way ANOVA does not include an interaction term, since there is insufficient data with a
single replicate to allow this.
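The restricted randomization described above can be sketched outside Minitab as follows. This is a minimal illustration in Python, not the code of the macro itself: the function names, the seed, and the convention of counting the observed dataset as one of the randomizations are assumptions made for the example. The data are those of the ORTHOPTERA worked example for this macro.

```python
import random

def group_f_ratio(data, group, block):
    """F-ratio for the group effect in a two-way ANOVA with one observation per cell."""
    n = len(data)
    grand = sum(data) / n

    def ss_between(labels):
        # Sum of squares between the levels of one factor.
        total = 0.0
        for level in set(labels):
            vals = [x for x, lab in zip(data, labels) if lab == level]
            total += len(vals) * (sum(vals) / len(vals) - grand) ** 2
        return total

    n_group, n_block = len(set(group)), len(set(block))
    ss_group = ss_between(group)
    ss_error = sum((x - grand) ** 2 for x in data) - ss_group - ss_between(block)
    return (ss_group / (n_group - 1)) / (ss_error / ((n_group - 1) * (n_block - 1)))

def shuffle_within_blocks(data, block, rng):
    """Randomize the allocation of data to groups separately within each block."""
    shuffled = list(data)
    for level in set(block):
        idx = [i for i, b in enumerate(block) if b == level]
        vals = [data[i] for i in idx]
        rng.shuffle(vals)
        for i, v in zip(idx, vals):
            shuffled[i] = v
    return shuffled

def randomization_p_value(data, group, block, nran=999, seed=1):
    rng = random.Random(seed)
    observed = group_f_ratio(data, group, block)
    count = 1  # assumption: the observed dataset counts as one randomization
    for _ in range(nran):
        simulated = shuffle_within_blocks(data, block, rng)
        # Small tolerance so that exact ties are counted despite rounding.
        if group_f_ratio(simulated, group, block) >= observed * (1 - 1e-9):
            count += 1
    return observed, count / (nran + 1)

# ORTHOPTERA data: size morph is the group, month is the block.
data = [190, 0, 52, 50, 10, 110, 8, 1212]
size = [1, 1, 1, 1, 2, 2, 2, 2]
month = [1, 2, 3, 4, 1, 2, 3, 4]
f_obs, p = randomization_p_value(data, size, month)
print(round(f_obs, 2), p)
```

Because the block and total sums of squares are unchanged by shuffling within blocks, the F-ratio and the group sum of squares order the randomized datasets identically, which is why the two test-statistics give the same p-value when there is no replication.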
References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 7).
WORKED EXAMPLE FOR TWOWAYRAN
Name of dataset
ORTHOPTERA
Description
We return again to the data of Powell & Russell, except that data is now also provided for adult females.
Data are now classified according to two factors, month and size morph, and Manly (1997) analyses this
as a two-way analysis of variance.
Source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Data
Number of observations = 8
Number of variables = 3
Data   Size   Month
 190      1       1
   0      1       2
  52      1       3
  50      1       4
  10      2       1
 110      2       2
   8      2       3
1212      2       4
Worksheet
C1   Data
C2   Size morph
C3   Month
Aim of analysis
To investigate whether month and size morph have an impact upon stomach biomass.
Randomization procedure
MTB > % N:\resampling\library\twowayran c1 c2 c3 ;
SUBC> nran 999 ;
SUBC> listdata c5-c7 ;
SUBC> ssquares c10 ;
SUBC> fvalues c12.
Executing from file: N:\resampling\library\twowayran.MAC
STANDARD TWO-WAY ANOVA ANALYSIS
Two-way ANOVA: Data versus Size, Month
Analysis of Variance for Data
Source   DF        SS       MS      F      P
Size      1    137288   137288   0.73  0.455
Month     3    491244   163748   0.88  0.542
Error     3    561052   187017
Total     7   1189584

Size   Mean
1        73
2       335
[Individual 95% CI plot for each Size mean; axis from -400 to 800]

Month  Mean
1       100
2        55
3        30
4       631
[Individual 95% CI plot for each Month mean; axis from -700 to 1400]
RANDOMIZATION P-VALUE (FOR GROUP EFFECT)
Method used: restricted randomization (randomization within blocks)
Data Display (WRITE)
Number of randomizations                          999
P-value for group effects (using F-ratio)         0.7160
P-value for group effects (using mean square)     0.7160
Modified worksheet
C5    A column containing sorted data (sorted by group and block)
C6    A column containing re-numbered group markers
C7    A column containing re-numbered block markers
C10   A column containing 999 sums of squares for group effect, one for each randomized dataset
C12   A column containing 999 F-ratios for group effect, one for each randomized dataset
Discussion
There is no evidence whatsoever of a group (size morph) effect, with p-values of 0.455 (by standard
methods) and 0.716 (by randomization).
TWOWAYREPRAN
This macro is designed to perform a two-way analysis of variance, in situations in which replication is
present.
Important note
• The macro is only capable of assessing the impact of one of the factors in a two-way ANOVA.
The user must choose which factor is of interest – this factor will be called the group. The remaining
factor is treated as a nuisance factor, and we will call this the block.
• If the user is interested in the (individual) effect of both factors, then the macro must be run twice,
with the “group” and “block” factors swapped the second time the macro is run. The two-way
ANOVA includes an interaction term, but the macro cannot determine the significance (p-value) of
this interaction.
Calling statement
twowayrepran c1 c2 c3 ;
nran k1 (999) ;
listdata c1-c4 ;
ssquares c1 ;
fvalues c1.
Input
C1
Data. A column containing only numeric values.
C2
Group. A column containing only numeric values. The number of distinct numeric values used
should be equal to the number of groups, with each value denoting a particular group.
C3
Block. A column containing only numeric values. The number of distinct numeric values used
should be equal to the number of blocks, with each value denoting a particular block.
Subcommands
nran
Number of randomizations
listdata
Specify four columns in which to store the sorted data. Within the macro, data is sorted,
and group markers are changed to be consecutive integers, and this may make output
difficult to interpret. In order to see which group is which, output the sorted data to
"listdata", and compare this against the original dataset. The fourth column contains a
marker for group * block combinations, so that any individuals which are in the same
group and block have the same value in the fourth column.
ssquares
Specify a column in which to store sums of squares for group effect.
fvalues
Specify a column in which to store simulated F-ratios for group effect.
Output
Identical to the output from the standard Minitab command "twoway", with the addition of
randomization p-values for the group effect obtained using two different test-statistics (mean square and F-ratio).
Speed of macro : MODERATE
Other macros
This macro is suitable only if there is replication (i.e. if there is more than one observation for each
factor-by-factor combination); if replication is not present, then you should use TWOWAYRAN.
Standard procedures
twoway C1 C2 C3.
This performs a two-way analysis of variance. The response variable is provided in c1, and the factor levels
corresponding to each data point are provided in c2 and c3.
Null hypothesis : The means of the response variable are constant across groups.
Test-statistic : The F-ratio for group effect in a two-way ANOVA.
Randomization procedure : We use a restricted randomization; we fix the allocation of data to blocks
(and fix the number of data points in each group within each block). We randomize the allocation of data
to groups within each block.
References : MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 7).
WORKED EXAMPLE FOR TWOWAYREPRAN
Name of dataset
TWOWAY
Description
We return again to the data of Powell & Russell, except that data is now also provided for adult females.
Data are now classified according to two factors, month and size morph, and Manly (1997) analyses this
as a two-way analysis of variance. In this case, data is available for each individual, so there is also
replication.
Source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Data
Number of observations = 24
Number of variables = 3
For each observation, data (top), group (middle) and month (bottom) are shown.
  13  242  105   21    7    8   24  312   59   20  515  488
   1    1    1    2    2    1    2    2    1    1    1    1
   1    1    1    1    1    2    2    2    2    2    3    3

  88   68  460 1223  990  140   18   40   44   21  182   27
   1    2    2    2    2    2    1    2    1    1    2    2
   3    2    3    3    3    4    4    4    4    4    1    4
Worksheet
C1   Data
C2   Size morph
C3   Month
Aim of analysis
To investigate whether month and size morph have an impact upon stomach biomass.
Standard procedure
MTB > Retrieve "N:\resampling\Examples\Twoway.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Twoway.MTW
# Worksheet was saved on 20/07/01 11:49:31
Results for: Twoway.MTW
MTB > Twoway c1 c2 c3;
SUBC> Means c2 c3.
Randomization procedure
MTB > % N:\resampling\library\twowayrepran c1 c2 c3 ;
SUBC> nran 999 ;
SUBC> listdata c5-c8 ;
SUBC> ssquares c10 ;
SUBC> fvalues c12.
Executing from file: N:\resampling\library\twowayrepran.MAC
STANDARD TWO-WAY ANOVA ANALYSIS
Two-way ANOVA: Value versus Group, Month
Analysis of Variance for Value
Source        DF        SS       MS      F      P
Group          1    146172   146172   4.47  0.051
Month          3   1379495   459832  14.06  0.000
Interaction    3    294009    98003   3.00  0.062
Error         16    523222    32701
Total         23   2342899

Group  Mean
1       135
2       291
[Individual 95% CI plot for each Group mean; axis from 100 to 400]

Month  Mean
1        95
2        82
3       627
4        48
[Individual 95% CI plot for each Month mean; axis from 0 to 750]
RANDOMIZATION P-VALUE (FOR GROUP EFFECT)
Method used: restricted randomization (randomization within blocks)
Data Display (WRITE)
Number of randomizations                                    999
P-value for group effects (test-statistic = F-ratio)        0.0940
P-value for group effects (test-statistic = Mean square)    0.0810
Modified worksheet
C5    A column containing sorted data (sorted by group and block)
C6    A column containing re-numbered group markers
C7    A column containing re-numbered block markers
C8    A column containing markers for new group by block combinations
C10   A column containing 999 sums of squares for group effect, one for each randomized dataset
C12   A column containing 999 F-ratios for group effect, one for each randomized dataset
Discussion
There is a suggestion from both standard and randomization methods that group (size morph) has an
effect. However, p-values are 0.051 by standard methods and 0.094 by randomization, so the evidence for
a group effect is not strong.
LEVENERAN
This macro is designed to test whether the variances of data in different groups are equal, using a
randomization version of Levene's Test.
Calling statement
leveneran c1 c2 ;
nran k1 (4999) ;
fvalues c1 ;
modified c1 ;
usemean k1 (0).
Input
c1 is a numeric column containing the observed data for all groups.
c2 is a numeric column, of the same length as c1, containing group labels.
Missing data is allowed. If data is missing in either the 'data' or 'group' column for any particular
individual, then that individual is excluded from the analysis.
Subcommands
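nran - the number of randomizations.
fvalues - specify a column in which to store simulated F-values from the ANOVA procedure.
modified - specify a column in which to store the modified data on which the ANOVA is performed
(the absolute differences between the data and the relevant group medians or means).
usemean - an option to use group means rather than medians in the construction of the modified
data on which the ANOVA is performed. If usemean = 1, group means are used. For any other
value, and by default, group medians are used.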
Output
• Individual group means, medians and sample standard deviations
• Standard ANOVA output
• Randomization p-values for the F-ratio in the ANOVA
Speed of macro : FAST
Standard procedure
% vartest c1 c2.
This tests for equal variances amongst different groups. The data is given in c1, whilst group labels are
given in c2. The output reports the findings of a number of tests for equal variance, including Levene's
Test.
References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapters 6 and 7).
Null hypothesis : The variance is constant across all groups (i.e. $\sigma_1^2 = \sigma_2^2 = \dots = \sigma_g^2$).
Test-statistic : We calculate absolute differences between the data and the relevant individual group
medians, and then perform a one-way ANOVA upon these differences. The F-ratio obtained from this
ANOVA is our test-statistic. By subtracting the group medians from the data, we remove the effects of
differences in location between groups, so that we can attribute remaining variation in the absolute
differences (if this exists) to differences in variability between groups.
Randomization : We randomize the allocation of group labels to data values, because, as for the standard
one-way ANOVA, the null hypothesis implies that the allocation of group labels should be independent of
the data values.
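The test-statistic and randomization described above can be sketched as follows. This is an illustrative Python sketch rather than the macro's own code: the function names are hypothetical, and recomputing the group medians after each re-allocation of labels, like counting the observed allocation as one randomization, is an assumption about details that the text does not spell out. The data are those of the FERNBIRDS worked example.

```python
import random
import statistics

def one_way_f(values, labels):
    """F-ratio from a one-way ANOVA of values classified by labels."""
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    grand = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def levene_f(data, labels, use_mean=False):
    """Levene statistic: one-way ANOVA F-ratio on absolute deviations from group centres."""
    centre = statistics.mean if use_mean else statistics.median
    groups = {}
    for x, lab in zip(data, labels):
        groups.setdefault(lab, []).append(x)
    centres = {lab: centre(g) for lab, g in groups.items()}
    modified = [abs(x - centres[lab]) for x, lab in zip(data, labels)]
    return one_way_f(modified, labels)

def levene_randomization(data, labels, nran=999, seed=1, use_mean=False):
    rng = random.Random(seed)
    observed = levene_f(data, labels, use_mean)
    count = 1  # assumption: the observed allocation counts as one randomization
    shuffled = list(labels)
    for _ in range(nran):
        rng.shuffle(shuffled)  # re-allocate group labels to data values
        if levene_f(data, shuffled, use_mean) >= observed:
            count += 1
    return observed, count / (nran + 1)

# FERNBIRDS data: 24 nest sites (group 1) and 25 randomly chosen sites (group 2).
nest = [8.90, 4.34, 2.30, 5.16, 2.92, 3.30, 3.17, 4.81, 2.40, 3.74, 4.86, 2.88, 4.90,
        4.65, 4.02, 4.54, 3.22, 3.08, 4.43, 3.48, 4.50, 2.96, 5.25, 3.07]
random_sites = [3.17, 3.23, 2.44, 1.56, 2.28, 3.16, 2.78, 3.07, 3.84, 3.33, 2.80, 2.92, 4.40,
                3.86, 3.48, 2.36, 3.08, 5.07, 2.02, 1.81, 2.05, 1.74, 2.85, 3.64, 2.40]
perimeters = nest + random_sites
groups = [1] * len(nest) + [2] * len(random_sites)
f_obs, p = levene_randomization(perimeters, groups)
print(round(f_obs, 2), p)
```

Setting use_mean=True corresponds to the usemean = 1 option, in which group means replace group medians when forming the absolute differences.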
WORKED EXAMPLE FOR LEVENERAN
Name of dataset
FERNBIRDS
Description
The data, from a study by Harris on the selection of nest sites by the fernbird Bowdleria punctata, are the
perimeters of vegetation clumps within a region. 24 of the clumps within the data were selected by
fernbirds as nest sites; the remaining 25 clumps were selected at random from the same study region.
Manly (1997) applies Levene's test to these data.
Our source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London, pp. 228-231.
Original source
HARRIS, W.F. (1986) The breeding ecology of the South Island Fernbird in Otago Wetlands, PhD Thesis,
University of Otago, Dunedin, New Zealand.
Data
Number of observations = 49
Number of variables = 2
For each point, the data (top) and group (bottom) are given.
8.90 4.34 2.30 5.16 2.92 3.30 3.17 4.81 2.40 3.74 4.86 2.88 4.90
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00
4.65 4.02 4.54 3.22 3.08 4.43 3.48 4.50 2.96 5.25 3.07 3.17 3.23
1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 1.00 2.00 2.00
2.44 1.56 2.28 3.16 2.78 3.07 3.84 3.33 2.80 2.92 4.40 3.86 3.48
2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00
2.36 3.08 5.07 2.02 1.81 2.05 1.74 2.85 3.64 2.40
2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00 2.00
Aims of analysis
To investigate whether variability in perimeter lengths is the same for the nest sites as for the randomly
chosen sites.
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Fernbirds.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Fernbirds.MTW
# Worksheet was saved on 27/07/01 17:15:50
Results for: Fernbirds.MTW
MTB > % N:\resampling\library\leveneran c1 c2 ;
SUBC> nran 1999 ;
SUBC> fvalues c4 ;
SUBC> modified c6 ;
SUBC> usemean 0.
Executing from file: N:\resampling\library\leveneran.MAC
LEVENE TEST
Data Display
> Number of groups       2

> Statistics for each group
Data Display
Row  n obs  grp mean  grp median  grp stdev
  1     24   4.03667        3.88    1.36963
  2     25   2.93360        2.92    0.84426

> ANOVA for *modified* data
Modified data is formed by subtracting from each value
the median value for the group of which it is a part
One-way ANOVA: modified data versus group
Analysis of Variance for modified
Source  DF      SS     MS     F      P
group    1   1.447  1.447  2.55  0.117
Error   47  26.615  0.566
Total   48  28.062

Level   N    Mean   StDev
1      24  0.9933  0.9338
2      25  0.6496  0.5229
Pooled StDev = 0.7525
[Individual 95% CIs for mean, based on pooled StDev; axis from 0.60 to 1.20]
> Randomization p-value
Data Display (WRITE)
Number of randomizations used to compute p-value    1999
Randomization p-value                               0.1145
Modified worksheet
C4   A column containing 1999 F-ratios for group effect on the modified data, one for each randomized dataset
C6   A column containing 49 values, the absolute values of (data - group median)
Discussion
The p-value of 0.115 corresponds closely to that obtained using an F-distribution approximation (p-value
= 0.12, see Manly, 1997). Using either method, there is no real evidence for a difference in variances
between groups.
4 REGRESSION
Overview
Simple linear regression
REGRESSSIMRAN performs a randomization test for the slope in the regression of a response upon a
single predictor.
Multiple regression
REGRESSOBSRAN tests for the significance of parameters in a multiple regression, with inference
based upon randomization of observations.
REGRESSRESRAN tests for the significance of parameters in a multiple regression, with inference based
upon randomization of residuals.
REGRESSBOOT tests for the significance of parameters in a multiple regression, and computes
confidence intervals about parameters, using inference based upon the bootstrapping of residuals.
Should we resample residuals or observations ?
In the previous three sections, it has been relatively clear which quantity should be bootstrapped or
randomized – frequently, we simply bootstrap/randomize the data itself. In multiple regression, however,
it becomes less clear which quantity should be randomized or bootstrapped. There are two basic
alternatives:
• Resample from the data itself – this is known as resampling cases.
• Fit the regression model, and resample from the residuals for the fitted model.
Randomization methods
In the case of randomization, either method is reasonably straightforward. We can resample cases by
randomizing the allocation of response variable values to individuals, whilst keeping the values of all of
the predictors fixed. We resample residuals by fitting a regression model containing all of the predictors
to the observed data, bias-correcting the residuals (so that they have zero mean), standardizing the
residuals so that they have constant variance (there is an option to use the raw residuals, or to use deletion
residuals), and then randomizing the allocation of residuals to individuals. The randomized residuals are
then added to the fitted values for the individual, to create simulated values. Under the null hypothesis
that the fitted regression model is true, the allocation of these simulated values to observations should be
random (because the distribution of residuals should be the same as the error distribution associated with
the model).
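The two ways of constructing a randomized dataset can be contrasted in a short sketch. The Python below is illustrative only and is not taken from the macros: it uses a single predictor and raw residuals for brevity (the standardization step is omitted), and the function names and the small dataset are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

def simulate_from_cases(y, rng):
    """Randomize the allocation of response values to individuals; predictors stay fixed."""
    shuffled = list(y)
    rng.shuffle(shuffled)
    return shuffled

def simulate_from_residuals(x, y, rng):
    """Fitted values plus randomly re-allocated, bias-corrected residuals."""
    intercept, slope = fit_line(x, y)
    fitted = [intercept + slope * a for a in x]
    residuals = [b - f for b, f in zip(y, fitted)]
    mean_res = sum(residuals) / len(residuals)
    centred = [r - mean_res for r in residuals]  # bias-correct to zero mean
    rng.shuffle(centred)                         # randomize allocation to individuals
    return [f + r for f, r in zip(fitted, centred)]

# Hypothetical data, used only to show the two constructions side by side.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
rng = random.Random(0)
print(simulate_from_cases(y, rng))
print(simulate_from_residuals(x, y, rng))
```

Randomizing cases destroys any relationship between response and predictor, whereas randomizing residuals keeps the fitted trend and only re-allocates the scatter about it.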
The advantage of randomizing residuals is that this allows us to assess the significance of the effect of
any given predictor, conditional upon the effect of the remaining predictors – this is not possible when we
randomize cases, because in that case we simultaneously assess the effect of all of the predictors. The first
problem with randomizing residuals is that the method is more model-dependent than randomizing cases,
since we must assume that the distribution of residuals mirrors the error distribution within the model.
Any inadequacy in the fitted model is likely to lead to problems with the method. The second problem is
that the method may lead to the production of implausible datasets (see Manly, 1997).
In many circumstances, the two methods can be expected to give similar results.
Bootstrapping methods
In the case of bootstrapping, we have only produced a macro for bootstrapping residuals, although it is
possible to bootstrap cases instead. Bootstrapping residuals has the advantage that the same simulated
datasets can be used to create confidence intervals for individual parameters, but this is not the case if we
bootstrap cases.
REGRESSSIMRAN
The macro is designed to assess, using randomization, the significance of the slope parameter $\beta$ in a simple
linear regression of a response variable upon a single predictor.
Calling statement
regresssimran c1 c2 ;
nran k1 (999);
fits c1;
residuals c1;
correlations c1;
coefficients c1-c2;
tstatistics c1.
Input
C1
Response variable : a column containing only numeric values.
C2
Predictor variable : a column containing only numeric values.
C1 and C2 must have the same length.
Missing values : Allowed. If the value for any variable is missing for an observation, then that
observation is excluded from the analysis.
Subcommands
nran           Number of randomizations used.
fits           Specify a column in which to store fitted values.
residuals      Specify a column in which to store raw residuals.
correlations   Specify a column in which to store simulated correlation coefficients.
coefficients   Specify two columns in which to store simulated parameter estimates (intercept in 1st
               column, slope in 2nd column).
tstatistics    Specify a column in which to store simulated t-statistics for slope.
Output
• Means for response variable and predictor
• Estimated regression slope and intercept
• Standard error for estimated regression slope
• Correlation coefficient between response variable and predictor
• T-statistic for estimated regression slope
• One and two-sided randomization p-values for slope
Technical details
For the regression model $Y = \alpha + \beta x + \varepsilon$, we test the null hypothesis
H0: Slope is equal to zero ($\beta = 0$)
using the t-statistic corresponding to the slope parameter, $t = \hat{\beta} / \mathrm{SE}[\hat{\beta}]$.
We randomize the allocation of response variable values to predictor values, since under the null
hypothesis the response variable is independent of the predictor.
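The test can be sketched as follows, using the HEXOKINASE data from the worked example for this macro. This Python sketch is not the macro's code: the function names and the seed are illustrative, and two details are assumptions, namely that the observed allocation is counted as one randomization and that the two-sided p-value is obtained by doubling the smaller one-sided value.

```python
import math
import random

def slope_statistics(x, y):
    """Return (intercept, slope, t-statistic for the slope) from a simple linear regression."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope / se_slope

def slope_randomization_test(x, y, nran=999, seed=1):
    rng = random.Random(seed)
    t_obs = slope_statistics(x, y)[2]
    n_upper = n_lower = 1  # assumption: observed allocation counted as one randomization
    shuffled = list(y)
    for _ in range(nran):
        rng.shuffle(shuffled)  # re-allocate response values to predictor values
        t_sim = slope_statistics(x, shuffled)[2]
        n_upper += t_sim >= t_obs
        n_lower += t_sim <= t_obs
    p_upper = n_upper / (nran + 1)  # H1: positive slope
    p_lower = n_lower / (nran + 1)  # H1: negative slope
    p_two = min(1.0, 2 * min(p_upper, p_lower))  # assumption: double the smaller value
    return t_obs, p_upper, p_lower, p_two

# HEXOKINASE data: hk is the response, invalt the predictor.
hk = [98, 36, 72, 67, 82, 72, 65, 1, 40, 39, 9, 19, 42, 37, 16, 4, 1, 4]
invalt = [2.00, 1.25, 1.75, 1.82, 2.63, 1.08, 2.08, 1.59, 0.67, 0.57, 0.50,
          0.24, 0.40, 0.50, 0.15, 0.13, 0.11, 0.10]
intercept, slope, _ = slope_statistics(invalt, hk)
t_obs, p_upper, p_lower, p_two = slope_randomization_test(invalt, hk)
print(round(intercept, 3), round(slope, 3), round(t_obs, 2), p_upper, p_lower, p_two)
```

With nran randomizations the smallest attainable one-sided p-value under this convention is 1/(nran + 1), so the smallest two-sided value is 2/(nran + 1).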
References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 8).
Standard procedures
regress c1 1 c2;
constant.
This regresses the response variable c1 upon the predictor c2. The "1" indicates that there is only one
predictor variable; it is usual to fit a constant as well as a slope in regression, unless there is some reason
to believe that the regression must pass through the origin.
WORKED EXAMPLE FOR REGRESSSIMRAN
Name of dataset
HEXOKINASE
Description
The data is taken from part of a study by McKechnie, concerning electrophoretic frequencies of the
butterfly Euphydryas editha. For each of 18 units (corresponding either to colonies, or to sets of colonies),
the reciprocal of altitude (originally measured in feet × 10³) is recorded, together with the percentage
frequency of hexokinase 1.00 mobility genes from electrophoresis of samples of Euphydryas editha. We
label these variables "invalt" and "hk" respectively.
Our source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
MCKECHNIE, S.W., EHRLICH, P.R. & WHITE, R.R. (1975), Population genetics of Euphydryas butterflies.
I. Genetic variation and the neutrality hypothesis, Genetics, 81, pp. 571-594.
Data
Number of observations = 18
Number of variables = 2
For each observation, HK (top) and INVALT (bottom) are given.
98.00 36.00 72.00 67.00 82.00 72.00 65.00 1.00 40.00 39.00 9.00
2.00 1.25 1.75 1.82 2.63 1.08 2.08 1.59 0.67 0.57 0.50
19.00 42.00 37.00 16.00 4.00 1.00 4.00
0.24 0.40 0.50 0.15 0.13 0.11 0.10
Minitab worksheet
C1   HK measurements
C2   INVALT measurements
Aims of analysis
To investigate, using a linear regression model, whether INVALT has an effect upon the value of HK.
Standard procedure
MTB > Regress c1 1 c2;
SUBC> Constant;
SUBC> Brief 2.
Regression Analysis: hk versus invalt

The regression equation is
hk = 10.7 + 29.2 invalt

Predictor     Coef  SE Coef     T      P
Constant    10.654    7.585  1.40  0.179
invalt      29.153    6.035  4.83  0.000

S = 20.27   R-Sq = 59.3%   R-Sq(adj) = 56.8%

Analysis of Variance
Source          DF       SS      MS      F      P
Regression       1   9585.3  9585.3  23.33  0.000
Residual Error  16   6572.5   410.8
Total           17  16157.8

Unusual Observations
Obs  invalt    hk    Fit  SE Fit  Residual  St Resid
  8    1.59  1.00  57.01    6.05    -56.01    -2.90R

R denotes an observation with a large standardized residual
Resampling procedure
MTB > Retrieve "N:\resampling\Examples\Hexokinase.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Hexokinase.MTW
# Worksheet was saved on 06/07/01 14:15:38
Results for: Hexokinase.MTW
MTB > % N:\resampling\library\regresssimran c1 c2 ;
SUBC> nran 999 ;
SUBC> fits c4 ;
SUBC> residuals c5 ;
SUBC> correlations c7 ;
SUBC> coefficients c9 c10 ;
SUBC> tstatistics c12.
Executing from file: N:\resampling\library\regresssimran.MAC
Data Display (WRITE)
Number of observations                     18
Mean of response variable                  39.11
Mean of predictor                          0.98
Correlation coefficient                    0.770
Estimated intercept                        10.654
Estimated slope                            29.153
Standard error on estimated slope          6.035
T-statistic for significance of slope      4.83
One sided randomization p-value, H1: -ve slope     1.0000
One sided randomization p-value, H1: +ve slope     0.0010
Two sided randomization p-value                    0.0020
Modified worksheet
C4    A column containing 18 fitted values for the regression on the observed data
C5    A column containing 18 raw residuals for the regression on the observed data
C7    A column containing 999 correlation coefficients, one for each randomized dataset
C9    A column containing 999 intercept parameter estimates, one for each randomized dataset
C10   A column containing 999 slope parameter estimates, one for each randomized dataset
C12   A column containing 999 slope parameter t-statistics, one for each randomized dataset
Discussion
There is very strong evidence (standard p-value = 0.000, whilst two-sided randomization p-value = 0.002,
the smallest possible value with 999 randomizations) that INVALT does have an effect upon HK. We see
that INVALT actually has a positive effect upon HK frequency (implying that altitude has a negative
effect upon HK frequency).
REGRESSOBSRAN
This macro fits a multiple regression model. The significance of the parameter for each predictor is computed,
along with the overall significance of the regression. P-values are obtained by randomization of
observations.
Other macros
REGRESSSIMRAN should be used if there is a single predictor.
REGRESSRESRAN is the same, except that randomization is of residuals, not observations.
REGRESSBOOT performs bootstrap multiple regression, with bootstrapping of residuals.
Calling statement
regressobsran c1 c2-cN ;
nran k1 ;
tvalues m1.
Input
C1        Response variable. A column containing numeric values.
C2 - CN   Predictor variables. Columns containing numeric values.
All N columns must have the same length.
Missing values : Allowed. If the value for any variable is missing for an observation, then that
observation is excluded from the analysis.
Subcommands
nran
Number of randomizations.
tvalues
Specify a matrix within which to store simulated t-values for the coefficient of each
predictor.
Output
For the coefficient associated with each predictor, we present
• Estimated coefficient, together with standard error
• T-statistic, plus p-value using normal theory
• Randomization p-values
Randomization p-values are based upon T-statistics, and two-sided values are found by doubling the
smaller one-sided value.
In addition, we present an overall F-ratio for the regression, with p-values from both normal theory and
randomization.
Technical details
We randomize the allocation of the response variable to individuals, whilst keeping the values of all of the
predictors fixed.
References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 8).
Standard procedures
For example,
regress c1 8 c2-c9;
constant.
This regresses the response variable c1 upon the predictors c2-c9. The "8" indicates the number of
predictors. An intercept term is also included in the regression.
WORKED EXAMPLE FOR REGRESSOBSRAN
Name of dataset
OREGON
Our source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
MCKECHNIE, S.W., EHRLICH, P.R. & WHITE, R.R. (1975), Population genetics of Euphydryas butterflies.
I. Genetic variation and the neutrality hypothesis, Genetics, 81, pp. 571-594.
Data
Number of observations = 18
Number of variables = 7
Colony  Altitude  Invalt  Precip  Tmax  Tmin   Hk
     1      0.50    2.00      58    97    16   98
     2      0.80    1.25      20    92    32   36
     3      0.57    1.75      28    98    26   72
     4      0.55    1.82      28    98    26   67
     5      0.38    2.63      15    99    28   82
     6      0.93    1.08      21    99    28   72
     7      0.48    2.08      24   101    27   65
     8      0.63    1.59      10   101    27    1
     9      1.50    0.67      19    99    23   40
    10      1.75    0.57      22   101    27   39
    11      2.00    0.50      58   100    18    9
    12      4.20    0.24      36    95    13   19
    13      2.50    0.40      34   102    16   42
    14      2.00    0.50      21   105    20   37
    15      6.50    0.15      40    83     0   16
    16      7.85    0.13      42    84     5    4
    17      8.95    0.11      57    79    -7    1
    18     10.50    0.10      50    81   -12    4
Aims of analysis
To investigate whether altitude (INVALT) and climatic variables (Precip, Tmax, Tmin) have an impact
upon electrophoretic frequency (HK), and to investigate the nature of any possible effects.
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Oregon.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Oregon.MTW
# Worksheet was saved on 14/08/01 11:36:50
Results for: Oregon.MTW
MTB > % N:\resampling\library\regressobsran c7 c3-c6 ;
SUBC> nran 999 ;
SUBC> tvalues m1.
Executing from file: N:\resampling\library\regressobsran.MAC
INDIVIDUAL REGRESSION COEFFICIENTS
* KEY *
> row = predictor
> coef = estimated coefficient
> SE coef = standard error about coefficient
> T = t-statistic for coefficient
> normal p = p-value under assumption of normality
> 1s- ran p = one-sided randomization p-value, H1: +ve coefficient
> 1s+ ran p = one-sided randomization p-value, H1: -ve coefficient
> 2s ran p = two-sided randomization p-value
* ESTIMATES *
Data Display
Row     coef  SE coef        T  normal p  1s- ran p  1s+ ran p  2s ran p
  1  26.1237  8.64504  3.02182  0.009818      0.005      0.995     0.010
  2   0.4720  0.49554  0.95247  0.358235      0.181      0.819     0.362
  3   0.8668  1.17253  0.73923  0.472904      0.230      0.770     0.460
  4   0.2503  1.01945  0.24555  0.809863      0.390      0.610     0.780
OVERALL SIGNIFICANCE OF THE REGRESSION
Data Display (WRITE)
Overall F-ratio for regression    5.96
P-value using normality           0.0060
P-value using randomization       0.0100
Modified worksheet
M1
A 999 * 4 matrix, containing t-statistics for the parameter estimates for each of the
four predictor effects (column 1 for predictor 1, etc.).
Discussion
There is strong evidence that the overall regression is significant (p-value = 0.006 by normal theory,
0.010 by randomization), and that INVALT has a significant (positive) impact upon HK (p-value = 0.010
by normal theory or randomization). There is no evidence that the remaining three variables have any
significant impact upon the response. The p-values obtained by randomization and normal theory are very
similar in all cases.
REGRESSRESRAN
This macro fits a multiple regression model. The significance of the parameter for each predictor is computed,
along with the overall significance of the regression. P-values are obtained by randomization of residuals.
Other macros
REGRESSSIMRAN should be used if there is a single predictor.
REGRESSOBSRAN is the same, except that randomization is of observations, not residuals.
REGRESSBOOT performs bootstrap multiple regression, with bootstrapping of residuals.
Calling statement
regressresran c1 c2-cN ;
nran k1 (999) ;
residuals k1 (2) ;
tstatistics m1.
Input
C1        Response variable. A column containing numeric values.
C2 - CN   Predictor variables. Columns containing numeric values.
All N columns must have the same length.
Missing values : Allowed. If the value for any variable is missing for an observation, then that
observation is excluded from the analysis.
Subcommands
nran          Number of randomizations
residuals     Type of residual :
              1 = Raw residuals
              2 = Modified residuals (default)
              3 = Deletion residuals
tstatistics   Specify a matrix within which to store simulated t-values for the coefficient of each
              predictor.
Output
For the coefficient associated with each predictor, we present
• Estimated coefficients, together with standard errors
• Corresponding F-ratios, with p-values using normal theory and randomization
• T-statistics, with p-values using randomization
The randomization p-values computed using F-ratios are naturally one-sided. In the case of T statistics,
two-sided p-values are calculated by doubling the smaller one-sided p-value. P-values obtained from the
two methods should be very similar, since the F ratios can be found by squaring the T statistics.
In addition, we present an overall F-ratio for the regression, with p-values from both normal theory and
randomization.
References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 8).
TER BRAAK, C.J.F. (1992), Permutation versus bootstrap significance tests in multiple regression and
ANOVA, in Bootstrapping and Related Techniques (ed. K.H. Jockel), Springer-Verlag, Berlin, pp.79-86.
Technical details
We randomize residuals, according to the procedure based on equation 3.3 of Ter Braak (1992). In fact,
we use a simplification of the algorithm (described in Manly, 1997) in which we regress the randomized
residuals directly upon the predictors, rather than first adding them to the fitted values :
Algorithm
Assume that we have data on a response variable, y, and p predictors x1,…,xp.
Assume that the parameter for predictor xi is $\beta_i$.
Stage 1 : Regress y upon x1,…,xp.
Hence obtain parameter estimates b1,…,bp and standard errors SE[b1],…,SE[bp].
Further obtain t-statistics t1,…,tp, where ti = bi / SE[bi].
Also obtain fitted values, yFIT, and residuals r = y - yFIT.
Stage 2 : For j = 1,…,d (where d is the number of randomizations), randomize the ordering of
the residuals, so obtaining randomized residuals rj*.
Then regress the randomized residuals, rj*, upon the predictors x1,…,xp.
Hence obtain parameter estimates and standard errors, and so t-values t1j*,…,tpj*.
Stage 3 : For each predictor, i, compare the observed t-statistic ti to the statistics based upon
randomization ti1*,…,tid*.
The algorithm is straightforward to implement, but more complicated to justify (see Ter Braak, 1992).
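The three stages above can be sketched in Python. This is an illustration only, not the macro's own code: the function names (solve, t_statistics, residual_randomization) are hypothetical, raw residuals are used, an intercept is assumed, and the observed statistic is assumed to be counted as part of the reference distribution. Two-sided p-values are obtained by doubling the smaller one-sided value, as described under Output. The data are the y, x1 and x2 columns of the ARTIFICIAL dataset as read from its data listing.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]      # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))  # partial pivoting
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]

def t_statistics(y, xs):
    """Regress y on the predictor columns xs (intercept included).
    Returns (t-statistics for the predictors, raw residuals)."""
    n, k = len(y), len(xs) + 1
    rows = [[1.0] + [x[i] for x in xs] for i in range(n)]
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    beta = solve(xtx, xty)
    res = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(rows, y)]
    s2 = sum(e * e for e in res) / (n - k)                # residual mean square
    # diagonal of (X'X)^-1, one unit-vector solve per coefficient
    diag = [solve(xtx, [float(i == a) for i in range(k)])[a] for a in range(k)]
    t = [beta[a] / (s2 * diag[a]) ** 0.5 for a in range(1, k)]
    return t, res

def residual_randomization(y, xs, nran=999, seed=1):
    """Two-sided randomization p-values for each predictor's t-statistic."""
    rng = random.Random(seed)
    t_obs, res = t_statistics(y, xs)                      # Stage 1
    upper = [1] * len(xs)   # counts of t* >= t (observed value included)
    lower = [1] * len(xs)   # counts of t* <= t (observed value included)
    for _ in range(nran):                                 # Stage 2
        r_star = res[:]
        rng.shuffle(r_star)
        t_star, _ = t_statistics(r_star, xs)
        for i, (ts, to) in enumerate(zip(t_star, t_obs)):
            upper[i] += ts >= to
            lower[i] += ts <= to
    # Stage 3: double the smaller one-sided p-value, capped at 1
    p_two = [min(1.0, 2 * min(u, l) / (nran + 1)) for u, l in zip(upper, lower)]
    return t_obs, p_two

# y, x1, x2 as read from the ARTIFICIAL data listing
y = [99.00, 5.94, 103.45, 8.33, 3.83, 2.82, 4.19, 2.86, 4.18, 3.47,
     38.09, 25.62, 7.54, 2.51, 10.13, 4.66, 4.15, 19.97, 4.23, 8.83]
x1 = [33.00, 1.97, 2.65, 2.72, 0.70, 0.94, 0.76, 0.95, 1.39, 0.69,
      1.28, 1.50, 2.33, 0.82, 2.80, 1.55, 0.76, 0.46, 0.70, 1.98]
x2 = [2.09, 0.87, 2.83, 0.42, 2.80, 1.93, 0.82, 0.31, 1.20, 1.17,
      1.79, 1.53, 2.35, 0.19, 0.72, 2.74, 2.29, 2.93, 2.02, 1.94]

t_obs, p_two = residual_randomization(y, [x1, x2], nran=999)
print("t-statistics:", t_obs)
print("two-sided randomization p-values:", p_two)
```

Because the shuffling is random, the p-values will vary slightly with the seed and need not agree exactly with any particular Minitab run.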
Standard procedures
For example,
regress c1 4 c2-c5;
constant.
This regresses the response variable c1 upon the predictors c2-c5. The "4" indicates the number of
predictors. An intercept term is also included in the regression.
WORKED EXAMPLE FOR REGRESSRESRAN
Name of dataset
ARTIFICIAL
Description
We use the artificial data created by Manly (1997), based on similar data generated by Kennedy and
Cade.
The data are artificial; their construction is discussed at length by Manly (1997). The purpose of the data
is to demonstrate that computationally intensive methods can, in certain circumstances, produce results
which are very substantially different from the results of standard methods.
Source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Data
Number of observations = 20
Number of variables = 4
y        x1      x2     x3
99.00    33.00   2.09   18.59
5.94     1.97    0.87   1.86
103.45   2.65    2.83   4.15
8.33     2.72    0.42   1.78
3.83     0.70    2.80   3.15
2.82     0.94    1.93   2.40
4.19     0.76    0.82   1.20
2.86     0.95    0.31   0.79
4.18     1.39    1.20   1.89
3.47     0.69    1.17   1.51
38.09    1.28    1.79   2.43
25.62    1.50    1.53   2.28
7.54     2.33    2.35   3.51
2.51     0.82    0.19   0.60
10.13    2.80    0.72   2.12
4.66     1.55    2.74   3.51
4.15     0.76    2.29   2.67
19.97    0.46    2.93   3.16
4.23     0.70    2.02   2.37
8.83     1.98    1.94   2.93
Aims of analysis
To investigate whether predictors x1 and x2 have an impact upon response y, and to investigate the nature
of any possible effects.
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Artificial.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Artificial.MTW
# Worksheet was saved on 08/08/01 11:16:26
Results for: Artificial.MTW
MTB > % N:\resampling\library\regressresran c1 c2-c3;
SUBC> nran 999 ;
SUBC> residuals 1 ;
SUBC> tstatistics m1.
Executing from file: N:\resampling\library\regressresran.MAC
Multiple regression, with randomization of raw residuals
F-ratios for individual coefficients
* KEY *
> row = predictor
> coef = estimated coefficient
> SE coef = standard error about coefficient
> F = F-ratio for coefficient
> normal p = p-value under assumption of normality
> ran p = randomization p-value (based on F-ratio)
* ESTIMATES *
Data Display
Row   coef      SE coef   F         normal p   ran p
1     2.66985   0.70490   14.3454   0.001470   0.041
2     9.62936   5.63366   2.9216    0.105594   0.101
T-values for individual coefficients
* KEY *
> row = predictor
> T = T-value for estimated coefficient
> 1s- ran p = one-sided randomization p-value based on T, H1: +ve coef.
> 1s+ ran p = one-sided randomization p-value based on T, H1: -ve coef.
> 2s ran p = two-sided randomization p-value based on T
* ESTIMATES *
Data Display
Row   T         1s+ ran p   1s- ran p   2s ran p
1     3.78753   0.041       0.960       0.082
2     1.70925   0.049       0.952       0.098
Overall regression
Data Display (WRITE)
Overall F-ratio for regression   9.42
P-value using normality          0.0018
P-value using randomization      0.0410
Modified worksheet
M1   A 999 * 2 matrix, containing t-statistics for the parameter estimates for each of the
     two predictor effects (column 1 for predictor 1, etc.).
Discussion
[2-sided] P-value                       1st predictor   2nd predictor   Overall regression
Using normality                         0.001           0.106           0.002
Using randomization and t-statistics    0.082           0.098           NA
Using randomization and F-ratios        0.041           0.101           0.041
We see that p-values for the 1st predictor and overall regression differ substantially between the methods,
and that the conclusions drawn would also differ. The results also differ if we randomize cases instead of
residuals. This highlights that standard and randomization methods do not always give the same answers.
Plot
[Figure: a plot of the artificial dataset (y = response ; x1, x2 = predictors).]
REGRESSBOOT
To fit a multiple regression model. The significance of the parameter for each predictor is computed,
along with the overall significance of the regression. P-values are obtained by bootstrapping of residuals.
Calling statement
regressboot c1 c2-cN ;
nboot k1 (2000) ;
residuals k1 (2) ;
siglev k1 (95) ;
resco m1 m2 ;
fittedco m3 ;
bcadetails c1-c4.
Input
C1        Response variable. A column containing numeric values.
C2 - CN   Predictor variables. Columns containing numeric values.
All N columns must have the same length.
Missing values : Allowed. If the value for any variable is missing for an observation, then that
observation is excluded from the analysis.
Subcommands
nboot        Number of bootstrap resamples
residuals    Type of residual :
             1 = Raw residuals
             2 = Modified residuals (default)
             3 = Deletion residuals
siglev       Significance level for confidence intervals, in %.
resco        Store coefficients for regressions upon resampled residuals in m1,
             together with standard errors in m2
fittedco     Store coefficients for regressions upon simulated data
             (fitted values + resampled residuals) in m3.
bcadetails   Store : Column 1 - estimated bias for each parameter
             Column 2 - estimated acceleration for each parameter
             Column 3 - rank value for lower BCa confidence limit
             Column 4 - rank value for upper BCa confidence limit
Output
For the coefficient associated with each predictor, we present
• Estimated coefficient, together with standard error
• F-ratio, plus p-value using normal theory
• Randomization p-values based on F-ratio
• T-statistics, with randomisation p-values
In addition, we present an overall F-ratio for the regression, with p-values from both normal theory and
randomisation.
Other macros
REGRESSSIMRAN should be used if there is a single predictor.
REGRESSOBSRAN performs multiple regression, with significance determined by randomization of
observations.
REGRESSRESRAN performs multiple regression, with significance determined by randomization of
residuals.
Standard procedures
For example,
regress c1 5 c2-c6;
constant.
This regresses the response variable c1 upon the predictors c2-c6. The "5" indicates the number of
predictors. An intercept term is also included in the regression.
Technical details : hypothesis tests
We bootstrap residuals, according to the procedure based on equation 3.3 of Ter Braak (1992). In fact, we
use a simplification of the algorithm (described in Manly, 1997) in which we regress directly upon the
residuals, rather than upon the fitted values :
Algorithm
Assume that we have data on a response variable, y, and p predictors x1,…,xp.
Assume that the parameter for predictor xi is βi.
Stage 1 : Regress y upon x1,…,xp.
Hence obtain parameter estimates b1,…,bp and standard errors SE[b1],…,SE[bp].
Further obtain t-statistics t1,…,tp, where ti = bi / SE[bi].
Also obtain fitted values, yFIT, and residuals r = y - yFIT.
Stage 2 : For j = 1,…,d (where d is the number of bootstrap resamples), take a bootstrap
sample, rj*, from the residuals.
Then regress the bootstrapped residuals, rj*, upon the predictors x1,…,xp.
Hence obtain parameter estimates and standard errors, and so t-values t1j*,…, tpj*.
Stage 3 : For each predictor, i, compare the observed t-statistic ti to the statistics based upon
bootstrapping, ti1*,…,tid*, and so obtain p-values.
The algorithm is straightforward to implement, but more complicated to justify (see Ter Braak, 1992).
Technical details : confidence intervals
We use the same bootstrap samples as for the hypothesis tests, but perform a different regression.
For j = 1,…,d, we regress simulated data, yj* = yFIT + rj*, upon predictors x1,…,xp.
We use the subsequent distributions for each of the parameter estimates as the source of our confidence
intervals, which are then computed as before.
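The confidence interval step can be sketched in Python for the simplest case of a single predictor. This is an illustration under stated assumptions, not the macro's implementation: it uses raw residuals, the Efron percentile method only (no BCa adjustment), and hypothetical names (fit_line, bootstrap_slope_ci). The x and y values are made up purely so that the code has something to run on.

```python
import random

def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b

def bootstrap_slope_ci(x, y, nboot=2000, level=95, seed=1):
    """Percentile interval for the slope, bootstrapping raw residuals."""
    rng = random.Random(seed)
    a, b = fit_line(x, y)
    fit = [a + b * xi for xi in x]
    res = [yi - fi for yi, fi in zip(y, fit)]
    slopes = []
    for _ in range(nboot):
        r_star = [rng.choice(res) for _ in res]            # resample with replacement
        y_star = [fi + ri for fi, ri in zip(fit, r_star)]  # yj* = yFIT + rj*
        slopes.append(fit_line(x, y_star)[1])
    slopes.sort()
    alpha = (100 - level) / 200
    k_lo = max(round(alpha * (nboot + 1)) - 1, 0)
    k_hi = min(round((1 - alpha) * (nboot + 1)) - 1, nboot - 1)
    return b, slopes[k_lo], slopes[k_hi]

# Hypothetical illustrative data (not taken from the report)
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]

slope, lower, upper = bootstrap_slope_ci(x, y)
print("slope:", slope, "95% percentile interval:", lower, upper)
```

For several predictors, the same loop applies with the simple regression replaced by a multiple regression, giving one bootstrap distribution per coefficient.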
References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 8).
TER BRAAK, C.J.F. (1992), Permutation versus bootstrap significance tests in multiple regression and
ANOVA, in Bootstrapping and Related Techniques (ed. K.H. Jockel), Springer-Verlag, Berlin, pp.79-86.
DRAPER, N.R. & SMITH, H. (1998) Applied regression analysis (3rd edition), John Wiley & Sons., New
York (Chapter 26).
WORKED EXAMPLE FOR REGRESSBOOT
Name of dataset
SWAVESEY
Description
A subset of data from a study carried out in Swavesey fens to investigate bird species diversity in relation
to field boundary characteristics. The response variable, meanno, is the mean number of species recorded
in a series of visits to each site.
The predictors are
th, average tree height
tn, number of trees
hh, average hedge height
hl, hedge length
cw, average hedge crown width
bw, average hedge base width
dd, average ditch depth
dw, average ditch width
woodyno, number of woody species
herbno, number of herb species
Source
Our own unpublished data (Centre for Ecology and Hydrology).
Data
Number of observations = 44
Number of variables = 11
meanno   th     tn   hh    hl    cw     bw     dd    dw     woodyno   herbno
4.25     10.0   2    3.5   180   3.0    2.5    1.0   2.5    7         14
6.50     10.0   10   6.0   190   5.5    2.0    2.2   5.0    5         19
4.50     9.0    9    3.0   150   2.0    1.0    0.3   3.0    5         7
4.25     12.0   2    6.0   160   5.0    4.0    2.5   6.0    7         25
1.75     0.0    0    0.0   0     0.0    0.0    3.0   10.0   0         13
2.75     0.0    0    3.0   20    3.0    3.0    1.5   8.0    4         13
1.75     0.0    0    0.0   0     0.0    0.0    1.5   8.0    0         16
8.50     0.0    0    7.0   180   12.0   11.0   1.5   6.0    5         22
1.50     9.0    2    1.4   196   1.0    1.0    1.5   4.0    5         25
3.00     0.0    0    3.0   2     3.0    3.0    1.0   5.0    1         19
4.25     10.0   6    5.0   196   3.0    1.5    1.5   4.0    5         28
1.00     0.0    0    0.0   0     0.0    0.0    1.2   5.0    3         15
0.75     0.0    0    0.0   0     0.0    0.0    1.0   4.5    3         25
1.75     8.0    5    3.5   140   2.5    2.5    0.7   2.5    5         8
0.75     0.0    0    1.2   200   1.0    0.8    1.0   2.0    4         15
2.75     0.0    0    2.5   194   1.7    1.7    1.5   2.5    4         14
1.50     0.0    0    2.5   196   1.7    1.7    2.0   5.0    3         23
3.50     0.0    0    4.5   198   3.5    2.5    2.0   5.0    4         20
5.75     8.0    10   4.0   180   3.5    2.0    1.2   4.0    6         17
1.75     8.0    4    3.0   176   2.5    2.0    1.2   3.0    6         22
1.75     0.0    0    4.5   190   3.5    3.0    1.8   4.0    2         24
6.25     7.0    4    4.0   180   3.0    2.5    0.2   1.5    7         13
2.00     8.0    1    1.8   200   1.2    1.0    0.1   1.0    8         28
4.75     0.0    0    3.0   80    3.0    3.0    0.5   2.0    5         15
5.25     0.0    0    4.5   180   3.0    1.5    0.2   1.5    3         8
1.50     0.0    0    1.8   190   1.0    0.8    0.2   2.0    1         4
9.00     12.0   20   5.0   80    4.0    1.5    0.5   5.0    3         2
3.50     0.0    0    4.5   200   4.0    2.0    1.5   4.5    3         8
4.25     0.0    0    5.5   190   5.0    3.0    1.5   4.5    3         9
2.00     0.0    0    1.1   198   1.0    1.0    1.5   4.5    5         22
2.00     3.0    3    1.0   200   1.0    1.3    2.0   5.0    7         19
0.50     0.0    0    0.0   0     0.0    0.0    2.5   6.0    1         18
2.25     0.0    0    1.3   200   1.2    1.0    1.2   2.5    5         12
0.25     0.0    0    1.4   180   1.0    1.5    2.0   4.5    4         13
2.00     0.0    0    1.3   192   1.0    1.2    0.2   1.0    3         15
1.75     0.0    0    1.3   200   1.0    1.5    2.0   4.0    3         20
3.75     0.0    0    4.5   198   3.5    4.5    2.0   4.0    4         14
3.75     0.0    0    4.0   184   3.0    3.0    1.7   4.0    3         12
2.75     0.0    0    5.0   200   4.0    2.5    2.0   4.0    4         6
3.50     0.0    0    5.0   200   4.0    2.0    2.0   4.0    4         11
0.50     0.0    0    1.2   198   1.0    0.9    0.5   1.0    5         10
1.00     0.0    0    1.2   196   1.0    1.0    0.6   1.2    4         9
5.50     5.5    1    4.5   200   3.0    1.5    1.0   2.5    5         11
10.00    9.0    3    4.5   194   3.5    2.5    0.2   1.5    6         7
Aims of analysis
To investigate the effect of predictors describing hedge and ditch characteristics
(th,tn,hh,hl,cw,bw,dd,dw,woodyno,herbno) upon the number of bird species recorded, and to investigate
the nature of any possible effects.
Randomization procedure
Welcome to Minitab, press F1 for help.
Retrieving worksheet from file: N:\resampling\datamin\Swavesey.mtw
# Worksheet was saved on 29/08/01 14:10:49
Results for: Swavesey.mtw
MTB > % N:\resampling\library\regressboot c1 c2-c11 ;
SUBC> nboot 1000 ;
SUBC> residuals 1 ;
SUBC> siglev 95 ;
SUBC> resco m1 m2 ;
SUBC> fittedco m3 ;
SUBC> bcadetails c13-c16.
Executing from file: N:\resampling\library\regressboot.MAC
Bootstrap significance tests and confidence intervals
for multiple regression, with bootstrapping from
modified residuals
Overall regression
Data Display (WRITE)
Overall F-ratio for regression   10.32
P-value using normality          0.0000
P-value using randomization      0.0010
Parameter estimates and F-ratios for individual coefficients
* KEY *
coef = estimated coefficient for this parameter
secoef = standard error about estimated coefficient
F = F-ratio for this parameter
normal p = P-value corresponding to this F-ratio, using normality
ran p = Randomization p-value corresponding to this F-ratio
Data Display
Row   coef        SE coef    F         normal p   ran p
1     0.060102    0.102011   0.34713   0.559754   0.569431
2     0.093362    0.092542   1.01779   0.320385   0.315684
3     0.358451    0.318912   1.26334   0.269131   0.280719
4     -0.002020   0.005029   0.16144   0.690427   0.680320
5     0.518894    0.444374   1.36352   0.251297   0.262737
6     -0.107250   0.351562   0.09307   0.762231   0.778222
7     -0.924962   0.583145   2.51591   0.122240   0.113886
8     0.220561    0.267790   0.67837   0.416061   0.393606
9     0.105074    0.188804   0.30972   0.581608   0.587413
10    -0.049098   0.038267   1.64619   0.208416   0.200799
T-values for individual coefficients
* KEY *
t = t-statistic for this parameter
1s- ran p = One-sided randomization p-value for this t-statistic
(H1: negative slope)
1s+ ran p = One-sided randomization p-value for this t-statistic
(H1: positive slope)
2s ran p = Two-sided randomization p-value for this t-statistic
Data Display
Row   t          1s+ ran p   1s- ran p   2s ran p
1     0.58918    0.268731    0.732268    0.537463
2     1.00885    0.159840    0.841159    0.319680
3     1.12398    0.148851    0.852148    0.297702
4     -0.40179   0.667333    0.333666    0.667333
5     1.16770    0.121878    0.879121    0.243756
6     -0.30507   0.621379    0.379620    0.759241
7     -1.58616   0.946054    0.054945    0.109890
8     0.82363    0.201798    0.799201    0.403596
9     0.55652    0.295704    0.705295    0.591409
10    -1.28304   0.895105    0.105894    0.211788
Confidence intervals for individual coefficients
* KEY *
norm l = Lower confidence limit, using normal theory
norm u = Upper confidence limit, using normal theory
perc l = Lower confidence limit, using Efron percentile method
perc u = Upper confidence limit, using Efron percentile method
bca l = Lower confidence limit, using BCa percentile method
bca u = Upper confidence limit, using BCa percentile method
Data Display
Row   norm l     norm u    perc l     perc u    bca l      bca u
1     -0.13983   0.26004   -0.14573   0.27974   -0.11468   0.31275
2     -0.08802   0.27474   -0.08100   0.26928   -0.13135   0.23386
3     -0.26660   0.98351   -0.27908   1.01148   -0.39776   0.91001
4     -0.01188   0.00784   -0.01299   0.00774   -0.01466   0.00730
5     -0.35206   1.38985   -0.39402   1.31184   -0.33113   1.49021
6     -0.79630   0.58180   -0.76739   0.67198   -0.83608   0.60864
7     -2.06790   0.21798   -2.08857   0.24681   -1.92206   0.39595
8     -0.30430   0.74542   -0.31512   0.72322   -0.38128   0.66396
9     -0.26498   0.47512   -0.26142   0.45915   -0.28554   0.45586
10    -0.12410   0.02590   -0.12518   0.02523   -0.12730   0.02412
Modified worksheet
C13 A column of 10 values, containing estimated bias values for each parameter
C14 A column of 10 values, containing estimated acceleration values for each parameter
C15 A column of 10 values, containing rank values for the lower limits of BCa confidence intervals
C16 A column of 10 values, containing rank values for the upper limits of BCa confidence intervals
M1
A 1000*10 matrix. Each column contains the 1000 estimates of one parameter, obtained by
regressing the bootstrapped residuals upon the predictors. These estimates are used,
together with their standard errors, for the creation of p-values.
M2
A 1000*10 matrix. Contains standard errors for the estimates in M1.
M3
A 1000*10 matrix. Each column contains the 1000 estimates of one parameter, obtained by
regressing the simulated data (original fitted values plus bootstrapped residuals)
upon the predictors. These estimates are used for the creation of confidence intervals.
Discussion
The overall regression is very clearly significant (p-value = 0.000 using normal theory, 0.001 using
randomization), but none of the individual predictors has a significant effect (two-sided p-values are
greater than 0.1 in all cases, and using all methods). This apparent paradox results from the fact that many
of the predictors are very highly correlated, so that there is no need to include all ten within the regression.
The 2-sided p-values obtained using normal theory, using randomization with F-ratios, and using
randomization with t-statistics are all very similar in all cases, and the confidence intervals also appear to
be reasonably similar for different methods. The Efron and standard intervals appear to be very similar,
though the differences between each of these intervals and the BCa intervals are somewhat larger. The
intervals all have similar lengths (average length of 0.938 for the standard interval, 0.948 for the Efron
interval and 0.964 for the BCa interval).
5
TIME SERIES
Overview
ACFRAN tests for autocorrelation in a univariate time series
TRENDRAN tests for trend in a univariate time series
Comments
Time series is a large, and often fairly complicated, branch of statistics. It is characterised by the fact that
observations at a timepoint are usually dependent upon observations at previous timepoints. We provide
two quick, straightforward macros which test the null hypothesis that the observed data are random
against alternative hypotheses of short-term dependence (autocorrelation) and long-term dependence
(trend).
ACFRAN
To test for the presence of serial correlation in a regular time series using serial correlation coefficients
and the Von Neumann ratio. Significance is determined using both normal approximations and randomization.
Calling statement
acfran c1 ;
nran k1 ;
nlag k1 (10).
Input
c1   A column of numeric data.
Missing values : Allowed. Observations with missing values are simply ignored.
Subcommands
nlag - the maximum number of lags which should be considered. Must be less than the number of
observations.
Output
For each lag from 1 to nlag, we present
• Observed autocorrelation at that lag
• Standardised autocorrelation and approximate p-value using a normal approximation
• Randomisation p-values for the autocorrelation coefficient
In addition, we present
• Observed Von Neumann ratio for the data
• Standardised VN ratio and approximate p-value using a normal approximation
• Randomisation p-values for the VN ratio
Speed of macro : FAST
Notes
If the number of observations is less than 10, then the default lag automatically changes from 10 to one
less than the number of available observations.
Reference : MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Null hypothesis : We test the null hypothesis that the series is random.
Alternative hypothesis :
• For the kth serial correlation coefficient : There is autocorrelation at lag k (short-term dependence)
within the series.
• For the Von Neumann ratio : The series is a simple random walk.
Test-statistics : We use the kth sample serial correlations (for k = 1,…,nlag) as test-statistics. If the kth
sample serial correlation value is significantly different from zero then this provides evidence of autocorrelation at the kth lag.
In addition, we use the Von Neumann ratio as an overall test for the presence of autocorrelation. This ratio
tests the null hypothesis that the series is random against the alternative hypothesis that it is a simple
random walk. The Von Neumann ratio is of the form
$$ v = \frac{\sum_{i=2}^{n} (x_i - x_{i-1})^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} , $$
where x_i is the ith data value and \bar{x} is the mean of the data values.
Randomization procedure : We randomize the order of the points in the observed series, since under the
null hypothesis this order will be random.
Standard procedure
The function
% acf c1
computes serial correlation coefficients, and produces p-values using normal approximations: If rk is the
kth sample serial correlation, then under the null hypothesis the standardized version
zk = [rk + 1/(n -1)] / sqrt{1/n}
has an approximate standard normal distribution for sufficiently large n. We also include p-values based
upon these normal approximations within our macro output.
Within the macro output, we provide a p-value for the Von Neumann ratio based upon a normal
approximation. Under the null hypothesis, for sufficiently large n, the Von Neumann ratio has a normal
distribution with mean 2 and variance 4(n-2)/(n^2-1).
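The statistics and normal approximations described above can be sketched in Python. This is an illustration only, with several assumptions: the function names are hypothetical, the serial correlation is computed with the common definition (lagged cross-products about the mean divided by the total sum of squares), which may differ in detail from the macro, and the observed statistic is counted as part of the randomization distribution. The short series is made up for the purpose of running the code. As a check on the formulas, with n = 91 and r1 = -0.421503 the standardized value is approximately -3.915, in line with the z-value shown in the worked example.

```python
import math
import random

def serial_correlation(x, k):
    """Lag-k serial correlation (assumed definition, see lead-in)."""
    n = len(x)
    m = sum(x) / n
    num = sum((x[i] - m) * (x[i + k] - m) for i in range(n - k))
    return num / sum((v - m) ** 2 for v in x)

def von_neumann(x):
    """Sum of squared successive differences over sum of squares about the mean."""
    m = sum(x) / len(x)
    num = sum((x[i] - x[i - 1]) ** 2 for i in range(1, len(x)))
    return num / sum((v - m) ** 2 for v in x)

def z_serial(r, n):
    """Standardized serial correlation: [r + 1/(n-1)] / sqrt(1/n)."""
    return (r + 1 / (n - 1)) / math.sqrt(1 / n)

def z_von_neumann(v, n):
    """Standardized Von Neumann ratio: mean 2, variance 4(n-2)/(n^2-1)."""
    return (v - 2) / math.sqrt(4 * (n - 2) / (n * n - 1))

def normal_two_sided(z):
    """Two-sided p-value from a standard normal z."""
    return math.erfc(abs(z) / math.sqrt(2))

def randomization_p(x, stat, nran=999, seed=1):
    """Return (lower, upper, two-sided) randomization p-values for stat(x)."""
    rng = random.Random(seed)
    obs = stat(x)
    lower = upper = 1                  # observed value counted once
    y = list(x)
    for _ in range(nran):
        rng.shuffle(y)                 # random re-ordering of the series
        s = stat(y)
        lower += s <= obs
        upper += s >= obs
    two = min(1.0, 2 * min(lower, upper) / (nran + 1))
    return lower / (nran + 1), upper / (nran + 1), two

# Hypothetical illustrative series (not taken from the report)
series = [5, 3, 6, 2, 7, 1, 8, 2, 9, 3, 7, 4]
n = len(series)
r1 = serial_correlation(series, 1)
v = von_neumann(series)
print("r1:", r1, "normal p:", normal_two_sided(z_serial(r1, n)))
print("r1 randomization p:", randomization_p(series, lambda s: serial_correlation(s, 1)))
print("VN ratio:", v, "normal p:", normal_two_sided(z_von_neumann(v, n)))
print("VN randomization p:", randomization_p(series, von_neumann))
```

A series of 12 points is far too short for the normal approximations to be trusted; it is used here only to keep the example small.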
WORKED EXAMPLE FOR ACFRAN
Name of dataset
PROLOCULI
Description
The data are mean diameters of megalospheric proloculi of the Cretaceous bolivinid foraminifer
Afrobolivina afra from 92 levels in a borehole in Gbekebo, Ondo State, Nigeria. The rank of the depth is
recorded, and provides a measure of the age of the sample (1 = oldest, corresponding to late Cretaceous,
92 = youngest, corresponding to early Palaeocene). Diameters are recorded, but interest really lies in the
91 differences between diameters from adjacent depths.
Our source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
REYMENT, R.A. (1982), Phenotypic evolution in a Cretaceous foraminifer, Evolution, 36, pp. 1182-1199.
Data
Number of observations = 92
Number of variables = 3
For each observation, sample number (which corresponds to rank depth) (left), diameter (middle) and
difference from previous diameter (right) are given.
Sample   Diam   Diff      Sample   Diam   Diff
1        156    *         47       201    -2
2        146    -10       48       261    60
3        136    -10       49       262    1
4        152    16        50       271    9
5        147    -5        51       202    -69
6        190    43        52       235    33
7        169    -21       53       214    -21
8        170    1         54       212    -2
9        179    9         55       210    -2
10       176    -3        56       241    31
11       184    8         57       211    -30
12       162    -22       58       247    36
13       155    -7        59       238    -9
14       154    -1        60       235    -3
15       151    -3        61       227    -8
16       150    -1        62       236    9
17       187    37        63       230    -6
18       220    33        64       241    11
19       205    -15       65       232    -9
20       194    -11       66       230    -2
21       221    27        67       238    8
22       185    -36       68       234    -4
23       171    -14       69       230    -4
24       177    6         70       254    24
25       194    17        71       256    2
26       176    -18       72       210    -46
27       170    -6        73       230    20
28       178    8         74       231    1
29       177    -1        75       225    -6
30       168    -9        76       227    2
31       176    8         77       226    -1
32       209    33        78       237    11
33       172    -37       79       250    13
34       195    23        80       226    -24
35       169    -26       81       229    3
36       173    4         82       240    11
37       156    -17       83       205    -35
38       161    5         84       221    16
39       161    0         85       208    -13
40       147    -14       86       207    -1
41       158    11        87       215    8
42       162    4         88       233    18
43       233    71        89       210    -23
44       184    -49       90       213    3
45       205    21        91       198    -15
46       203    -2        92       213    15
Plot
[Figure: plot of Difference against Index.]
Worksheet
C1   Rank depth
C2   Diameter
C3   Difference from previous diameter
Aims of analysis
To investigate whether or not stage-to-stage differences in diameters suffer from autocorrelation.
Standard procedure
Welcome to Minitab, press F1 for help.
MTB > Retrieve "N:\resampling\Examples\Proloculi.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Proloculi.MTW
# Worksheet was saved on 07/08/01 11:51:17
Results for: Proloculi.MTW
MTB > ACF 8 c3 c4.
Randomization procedure
MTB > % N:\resampling\library\acfran c3 ;
SUBC> nran 999 ;
SUBC> nlag 8 ;
SUBC> autocor m1 ;
SUBC> vonneu c5.
Executing from file: N:\resampling\library\acfran.MAC
Randomization tests for autocorrelation in a univariate time series
SERIAL CORRELATION COEFFICIENTS
for lags of 1 to k
* KEY *
> row = lag, j
> corr = observed autocorrelation at the j^th lag
> z-value = standardized autocorrelation
> normal p = p-value using normal approximation
> 1s - ran p = one-sided randomization p-value, H1: negative correlation
> 1s + ran p = one-sided randomization p-value, H1: positive correlation
> 2s ran p = two-sided randomization p-value
Data Display
Row   corr        z-value    normal p   1s - ran p   1s + ran p   2s ran p
1     -0.421503   -3.91489   0.000090   0.001        1.000        0.002
2     0.100958    1.06907    0.285038   0.871        0.130        0.260
3     -0.102290   -0.86979   0.384417   0.182        0.819        0.364
4     0.060894    0.68689    0.492152   0.760        0.241        0.482
5     -0.135631   -1.18784   0.234895   0.123        0.878        0.246
6     0.004072    0.14484    0.884836   0.564        0.437        0.874
7     0.106034    1.11750    0.263782   0.883        0.118        0.236
8     -0.155653   -1.37885   0.167942   0.068        0.933        0.136
* NOTE * The interpretion of serial correlation p-values:
Significance levels should be reduced appropriately to
account for the effects of multiple testing. The simplest
procedure is to divide the p-value by the number of lags
being considered (this is conservative). For example, if
a 95% significance level is required and lags up to 10
are of interest, then individual p-values smaller than
0.05/10 = 0.005 are taken to be significant.
VON-NEUMANN RATIO
Data Display (WRITE)
Observed von-neumann ratio           2.835
Standardized von-neumann ratio       4.029
P-value using normal approximation   0.0001
One-sided randomization p-values     1.0000   0.0010
Two-sided randomization p-value      0.0020
* NOTE * The use of the Von-Neumann ratio :
This ratio provides a test of randomness within a time
series. It tests the null hypothesis that the observed
series is random against the alternative hypothesis that
it is a simple random walk (in which the value
at a particular point in time is partly determined by
the value at the previous time point).
Modified worksheet
C5   Column containing 999 Von Neumann ratios, one for each randomized dataset
M1   A 999*8 matrix. The kth column contains 999 serial correlation coefficients at lag k,
     one for each randomized dataset.
Discussion
Randomization and standard methods present a very similar picture. There is strong evidence of
autocorrelation at lag 1 (p-value = 0.000 by standard methods, 0.002 by randomization), but no evidence
of autocorrelation at any other lag. The Von-Neumann ratio also provides clear evidence against this
being a random series (p-value = 0.000 by standard methods, 0.002 by randomization).
TRENDRAN
To test for the presence of trend in a regular or irregular time series, using a variety of non-parametric
test-statistics. The test-statistics are
• Number of runs above and below the median
• Number of positive differences
• Number of runs up and down.
Significance is determined using both normal approximations and randomisation.
Calling statement
trendran c1 ;
nran k1 ;
statistics c1-c3.
Input
c1   A column of numeric data.
Missing values : Allowed. Observations with missing values are simply ignored.
Subcommands
statistics
Specify three columns in which to store simulated test-statistics.
1st column : number of runs above and below the median
2nd column : number of positive differences
3rd column : number of runs up and down
Outputs
For each of the three test-statistics, we present the
• Observed value of the test-statistic
• Expected value, standard error and p-value for the test-statistic using a (large-sample) normal
approximation
• Randomization p-values
Null hypothesis : The observed time series is a random series.
Alternative hypothesis : There is trend (long-term dependence) within the series.
Test-statistic
The first test-statistic which we use is the number of runs above and below the median (note: values equal
to the median are assumed to be above the median), M.
The second test-statistic is the number of positive differences, P.
The third test-statistic is the number of runs up and down, U.
Randomization : We randomize the order of the data, since under the null hypothesis this ordering will
be random.
Speed of macro : FAST
References
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Standard procedures
As well as randomization output, the macro produces p-values using normal approximations (Manly,
1997).
For long series, a normal approximation to the distribution of M is reasonable under the null hypothesis,
with
mean = 2r(n-r)/n + 1,
variance = 2r(n-r){2r(n-r)-n}/{n^2(n-1)},
where n is the length of the series and r is the number of observations below the median.
For long series, a normal approximation to the distribution of P is reasonable under the null hypothesis,
with
mean = m/2,
variance = m/12,
where m is the number of differences after zeros have been removed.
For long series, a normal approximation to the distribution of U is reasonable under the null hypothesis,
with
mean = (2m+1)/3,
Variance = (16m-13)/90,
where m is the number of differences.
There is no in-built Minitab command for this kind of procedure, but the closest command is
% trend c1,
which tests for evidence of trend in c1 using parametric models.
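The three test-statistics and their normal approximations can be sketched in Python. This is an illustration under stated assumptions rather than the macro's code: the function names are hypothetical, zero differences are dropped before computing both P and U, and the mean of M is taken as 2r(n-r)/n + 1 with r the number of observations below the median. With n = 48 and r = 24 this gives a mean of 25.00 and a standard deviation of about 3.427, which agrees with the output of the worked example. A continuity correction of 0.5 is also assumed, since it reproduces (approximately) the normal-theory p-values 0.0131, 1.0000 and 0.2691 printed in that output.

```python
import math
from statistics import median

def runs(flags):
    """Number of runs in a sequence of True/False values."""
    return 1 + sum(a != b for a, b in zip(flags, flags[1:]))

def trend_statistics(x):
    """Return (M, P, U, r, m) for the series x."""
    med = median(x)
    m_stat = runs([v >= med for v in x])       # values equal to the median count as above
    diffs = [b - a for a, b in zip(x, x[1:]) if b != a]   # zero differences dropped
    p_stat = sum(d > 0 for d in diffs)         # number of positive differences
    u_stat = runs([d > 0 for d in diffs])      # runs up and down
    r = sum(v < med for v in x)                # observations below the median
    return m_stat, p_stat, u_stat, r, len(diffs)

def moments_m(n, r):
    q = 2 * r * (n - r)
    return q / n + 1, math.sqrt(q * (q - n) / (n * n * (n - 1)))

def moments_p(m):
    return m / 2, math.sqrt(m / 12)

def moments_u(m):
    return (2 * m + 1) / 3, math.sqrt((16 * m - 13) / 90)

def normal_p(obs, mean, sd):
    """Two-sided normal p-value with an (assumed) continuity correction of 0.5."""
    z = max(abs(obs - mean) - 0.5, 0.0) / sd
    return math.erfc(z / math.sqrt(2))

# Observed values from the worked example: M = 16, P = 23, U = 28,
# with n = 48 observations, m = 47 differences and (assumed) r = 24.
for name, obs, (mean, sd) in [("M", 16, moments_m(48, 24)),
                              ("P", 23, moments_p(47)),
                              ("U", 28, moments_u(47))]:
    print(name, "mean", round(mean, 2), "sd", round(sd, 3),
          "normal p", round(normal_p(obs, mean, sd), 4))
```

The function trend_statistics can be applied to any series to obtain the observed M, P and U; randomization p-values then follow by recomputing the statistics on randomly re-ordered copies of the series.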
WORKED EXAMPLE FOR TRENDRAN
Name of dataset
EXTINCTION
Description
The data are estimated extinction rates for marine genera from the late Permian period until the present,
listed in chronological order. There are 48 geological stages…
We use data on extinction rates for marine genera from the late Permian period until the present.
The data are from an irregular time series; times are not presented here, because there is some doubt as to
their accuracy.
Source
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
RAUP, D.M. (1987), Mass extinctions: a Discussion, Palaeontology, 30, pp. 1-13.
Data
Number of observations = 48
Number of variables = 1
Extinction rate
22 23 61
7 14 26
30
7 14
60
21
10
45 29 23 40 28 46
7 22 16 19 18 15
11 18
7
9 11 26
13
6
8
5
11
4
13
3
48
11
9
6
6
7
7
2
13
16
Plot
[Figure: plot of Extinction rate against Index.]
Worksheet
C1   Data
Aim of analysis
To investigate whether there is a trend in extinction rates over time.
Randomization procedure
Welcome to Minitab, press F1 for help.
MTB > Retrieve "N:\resampling\Examples\Extinction.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Extinction.MTW
# Worksheet was saved on 07/08/01 10:57:19
Results for: Extinction.MTW
MTB > % N:\resampling\library\trendran c1;
SUBC> nran 999 ;
SUBC> statistics c3-c5.
Executing from file: N:\resampling\library\trendran.MAC
Some tests for detecting trend in a single time series
Test 1 : Runs above and below the median test
Data Display (WRITE)
Observed number of runs                                  16
Expected number of runs                                  25.00
Standard deviation for number of runs                    3.427
Two-sided p-value using normal approximation             0.0131
One-sided randomization p-value, H1: trend               0.9970
One-sided randomization p-value, H1: rapid oscillation   0.0060
Two-sided randomization p-value                          0.0120
Test 2 : Sign test
Data Display (WRITE)
Observed number of positive differences                  23
Total observed number of non-zero distances              47
Expected number of positive differences                  23.50
Standard deviation for number of positive differences    1.979
Two-sided p-value using normal approximation             1.0000
One-sided randomization p-value, H1: decreasing trend    0.6350
One-sided randomization p-value, H1: increasing trend    0.5570
Two-sided randomization p-value                          1.0000
Test 3 : Runs up and down test
Data Display (WRITE)
Observed number of runs                                  28
Expected number of runs                                  31.67
Standard deviation for number of runs                    2.866
Two-sided p-value using normal approximation             0.2691
One-sided randomization p-value, H1: trend               0.8950
One-sided randomization p-value, H1: rapid oscillation   0.1690
Two-sided randomization p-value                          0.3380
Modified worksheet
C3   Column containing 999 M statistics, one for each randomized dataset
C4   Column containing 999 P statistics, one for each randomized dataset
C5   Column containing 999 U statistics, one for each randomized dataset
Discussion
                                   Randomization p-values (2-sided)   P-values by normal
Method                             This example     Manly             approximation
Runs above and below the median    0.012            0.008             0.013
Positive differences               1.000            1.000             1.000
Runs up and down                   0.338            0.360             0.260
Our randomization p-values agree closely with those of Manly (1997). These are somewhat different from
the p-values obtained using the normal approximation, but the differences are not too great. Overall, only
the first of the tests shows any evidence of trend. In the case of runs above and below the median, the
evidence for trend is reasonably strong. Manly (1997) suggests that there is clear trend within the data,
and that the 2nd and 3rd tests have failed to pick this up because they concentrate too much on small-scale
behaviour.
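The normal-approximation p-values printed in the TRENDRAN output can be checked by hand. The sketch below is in Python rather than Minitab, and it assumes a continuity correction of 0.5; that correction is not stated in this guide, but it appears to be consistent with the printed values. The function name is hypothetical.

```python
import math

def two_sided_normal_p(observed, expected, sd, correction=0.5):
    """Two-sided p-value from a normal approximation.

    ASSUMPTION: the continuity correction of 0.5 is not stated in the
    guide; it is used here because it matches the printed p-values.
    """
    distance = max(abs(observed - expected) - correction, 0.0)
    z = distance / sd
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)) for z >= 0
    return math.erfc(z / math.sqrt(2.0))

# Observed value, expected value and standard deviation copied from the output.
p_median_runs = two_sided_normal_p(16, 25.00, 3.427)   # Test 1
p_sign = two_sided_normal_p(23, 23.50, 1.979)          # Test 2
p_up_down = two_sided_normal_p(28, 31.67, 2.866)       # Test 3
print(round(p_median_runs, 4), round(p_sign, 4), round(p_up_down, 4))
```

Small differences in the last decimal place are to be expected, because the expected values and standard deviations in the output are themselves rounded.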
6
SPATIAL STATISTICS
Overview
TYPES OF MACRO (by type of resampling)
Randomization tests
SPATAUTORAN tests for spatial autocorrelation
MEAD4RAN performs Mead's randomization test upon a 4*4 grid of quadrat counts
MEAD8RAN performs Mead's randomization test upon an 8*8 grid of quadrat counts
MANTELRAN performs a Mantel test
Monte Carlo procedures
DISTEDFMC plots EDF plots, using data on the location of objects within a fixed area
NEARESTMC tests for random location of objects within a fixed area, using nearest neighbour distances
LOCREGULARMC tests for random location of objects within a fixed area, using indices of local
regularity
Which procedure should I use ?
Searching for pattern in the location of points
A key area of spatial statistics is the search for pattern in the location of points or objects within a fixed
region. If points are distributed at random within the region, so that no pattern is present, then we
say that there is Complete Spatial Randomness (CSR). In searching for spatial pattern, the usual starting
point is to test the null hypothesis of CSR; only once we have rejected CSR (if indeed we do reject CSR)
can we begin to look at the nature of the pattern.
NEARESTMC and LOCREGULARMC both test the null hypothesis of CSR, using statistics based upon
kth nearest neighbour distances. In NEARESTMC, we use the kth nearest neighbour distances themselves,
and test CSR against alternative hypotheses of regularity and clustering. In LOCREGULARMC, we use
statistics which are particularly sensitive to regularity (and, specifically, which are sensitive to local
regularity i.e. large-scale clustering occurring together with small-scale regularity).
DISTEDFMC also provides two tests for alternative hypotheses of clustering and regularity against the
null hypothesis of CSR, this time using the distances between all objects (rather than just kth nearest
neighbour distances). The primary purpose of DISTEDFMC, however, is to provide a powerful graphical
means of identifying if (and where) deviations from CSR occur.
If data are in the form of a grid of quadrat counts (i.e. the number of objects/points present within each of a
number of adjacent regions is recorded) then MEAD4RAN and MEAD8RAN can also be used to test the
null hypothesis of CSR against alternative hypotheses of clustering and regularity. It is quite possible to
use these macros upon data which is of the form of locations of points within a region (as already
discussed), by superimposing a grid over the region, and counting the number of objects within each cell
of the grid. It is important, however, to note that MEAD4RAN and MEAD8RAN can only be used to test
for CSR at a particular scale, and are likely to entirely miss deviations from CSR at smaller or larger
spatial scales.
Other techniques
The remaining two macros have entirely different purposes.
Much data is of the form of a variable, recorded at different spatial locations (either at different points, or
within different regions). Standard statistical analyses would ignore the spatial structure of this data, and
would assume that the values of the variable at different points are independent. Often, however, values
of a variable at a point will tend to be similar to values at nearby points – this kind of dependence is
known as spatial autocorrelation. In SPATAUTORAN, we test for spatial autocorrelation against a null
hypothesis of independence.
Finally, much spatial data is in the form of distance matrices – for a network of n points, a variety of
distances between each pair of points may be computed, and these distances may be formed into matrices.
For example, Manly (1997) discusses a situation in which the global distribution of earwigs is of interest.
In this case, the points consist of 8 continental-level regions. The first distance matrix contains measures
of similarity in the types of earwig species found in each pair of regions, whilst the second distance
matrix contains a measure of the geographical distance between the regions. Interest lies in seeing
whether the degree of similarity in species is related to geographical distance. This kind of problem can
be addressed using MANTELRAN.
Using the macros for spatial statistics
Although the techniques used within these macros are well-established, they are much less commonly
used and taught than the previous techniques we have considered. Before using the macros, the user is
strongly advised to study the suggested references. Most of these macros are also very computer-intensive,
so they are less suitable for routine use or for teaching.
SPATAUTORAN
! Intensive !
To test for the presence of spatial autocorrelation for data recorded at points within a region. Two
alternative coefficients of spatial autocorrelation, the Moran and Geary coefficients, are computed, and
significance levels are determined by randomisation.
RUNNING THE MACRO
Calling statement
spatautoran c1 m1 k1 ;
nran k1 ;
autocorrelations c1-c2.
Input
• c1 is a column containing data for each of the N points.
• m1 is a weighting or connectivity matrix, representing the geographical arrangement of the points.
It must be an N-by-N matrix. It should contain zero entries upon the diagonal (if it does not,
diagonal entries will be set equal to zero). It need not be symmetric.
• k1 is the number of points, N.
Subcommands
autocorrelations - specify columns in which to store simulated Moran coefficients (1st column)
and Geary coefficients (2nd column).
Output
For each of the two kinds of spatial autocorrelation coefficient, we present:
• Observed coefficient
• Expected coefficient and standard error, under assumptions N and R
• P-values using normal approximations, under assumptions N and R
• Two-sided p-values using randomization
TECHNICAL DETAILS
Null hypothesis : Independence (no spatial autocorrelation).
Alternative hypotheses : Positive or negative spatial autocorrelation. Data exhibit spatial autocorrelation
if data values at a point or region are influenced by data values at other nearby points or regions.
Test-statistic
We consider two measures of autocorrelation, the Moran and Geary autocorrelation coefficients.
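If x_i is the data value for the ith point and w_ij is the (i,j)th element of the weighting matrix, then the Moran
coefficient is defined to be
$$ I = \frac{n \sum_i \sum_j w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\left( \sum_i \sum_j w_{ij} \right) \left( \sum_i (x_i - \bar{x})^2 \right)} , $$
and the Geary coefficient is defined to be
$$ c = \frac{(n - 1) \sum_i \sum_j w_{ij} (x_i - x_j)^2}{2 \left( \sum_i \sum_j w_{ij} \right) \left( \sum_i (x_i - \bar{x})^2 \right)} , $$
where n is the number of points and \bar{x} is the mean of the data values.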
The denominator for both coefficients is :
[Sum of all elements of the weighting matrix] * variance of the xi values
The numerator for the Moran coefficient is :
Sum for all i,j of : [ijth element of weighting matrix * (xi – mean of x values)*(xj – mean of x values)]
The numerator for the Geary coefficient is :
(½) * Sum for all i,j of : [ijth element of weighting matrix * (xi – xj)²]
[Note : for the Moran coefficient, we estimate the variance by dividing by n, but for the Geary coefficient
we divide by n – 1.]
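The two definitions above can be sketched in a few lines of code. The example below is in Python rather than Minitab and is not the macro's own implementation; the function name and the four-point "line" layout are hypothetical, chosen only so that the arithmetic can be followed by hand.

```python
def moran_geary(x, w):
    """Return (Moran I, Geary c) for data x and weighting matrix w.

    Diagonal entries of w are ignored, mirroring the macro's rule that
    the diagonal is treated as zero.
    """
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(w[i][j] for i, j in pairs)          # sum of all weights
    ss = sum(d * d for d in dev)                 # sum of squared deviations
    cross = sum(w[i][j] * dev[i] * dev[j] for i, j in pairs)
    sqdiff = sum(w[i][j] * (x[i] - x[j]) ** 2 for i, j in pairs)
    moran = n * cross / (s0 * ss)
    geary = (n - 1) * sqdiff / (2.0 * s0 * ss)
    return moran, geary

# Hypothetical example: four points in a line, each adjacent to its neighbours.
w_line = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
moran_i, geary_c = moran_geary([1.0, 2.0, 3.0, 4.0], w_line)
print(moran_i, geary_c)
```

For this smoothly increasing series the Moran coefficient is positive and the Geary coefficient is below 1, both of which point towards positive spatial autocorrelation.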
Randomization
We randomize the allocation of data values to points, since under the null hypothesis this is random.
We do not modify the weighting matrix in any way, since this describes the "map" of the study area.
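The randomization step can be sketched as follows (Python, not the macro itself). Two conventions are assumptions made here: the observed value is counted as one member of the randomization distribution, and the two-sided p-value is taken as twice the smaller one-sided p-value (capped at 1). Both appear to be consistent with the printed output in the worked examples, but neither is stated in this guide. The data and function names are hypothetical.

```python
import random

def moran(x, w):
    """Moran coefficient for data x and weighting matrix w (diagonal ignored)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    s0 = sum(w[i][j] for i, j in pairs)
    cross = sum(w[i][j] * dev[i] * dev[j] for i, j in pairs)
    return n * cross / (s0 * sum(d * d for d in dev))

def randomization_p_values(x, w, nran=999, seed=1):
    """Shuffle data values over points; the weighting matrix is left untouched."""
    rng = random.Random(seed)
    observed = moran(x, w)
    values = list(x)
    ge = le = 1                      # ASSUMPTION: observed value counts once
    for _ in range(nran):
        rng.shuffle(values)
        stat = moran(values, w)
        ge += stat >= observed
        le += stat <= observed
    p_upper = ge / (nran + 1)        # H1: positive autocorrelation
    p_lower = le / (nran + 1)        # H1: negative autocorrelation
    p_two = min(1.0, 2.0 * min(p_upper, p_lower))
    return observed, p_upper, p_lower, p_two

# Hypothetical data: six points in a line, neighbours adjacent.
n_pts = 6
w_line = [[1 if abs(i - j) == 1 else 0 for j in range(n_pts)] for i in range(n_pts)]
obs, p_up, p_low, p_two = randomization_p_values([1, 2, 3, 4, 5, 6], w_line)
print(obs, p_up, p_low, p_two)
```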
Weighting matrix
In order to quantify the effect of spatial autocorrelation, we need a mathematical representation for the
geographical layout of the study region: this representation is the weighting matrix, W.
The ijth element of W, wij, provides a measure either of the geographical proximity of regions i and j, or
else provides some other measure of the degree to which the regions influence each other.
The simplest weighting matrix is a connectivity matrix, so that
$$ w_{ij} = \begin{cases} 1 & \text{if regions } i \text{ and } j \text{ are adjacent,} \\ 0 & \text{otherwise,} \end{cases} $$
and this is the matrix used in our worked example.
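As an illustration, a connectivity matrix of this kind can be built from a list of adjacent pairs. The sketch below is in Python; the region labels and adjacencies are hypothetical and are not the Welsh counties of the worked example, whose adjacencies are given only in the network chart.

```python
def connectivity_matrix(labels, adjacent_pairs):
    """Build a symmetric 0/1 connectivity matrix with a zero diagonal."""
    index = {label: k for k, label in enumerate(labels)}
    n = len(labels)
    w = [[0] * n for _ in range(n)]
    for a, b in adjacent_pairs:
        i, j = index[a], index[b]
        if i != j:                   # a region is never adjacent to itself
            w[i][j] = 1
            w[j][i] = 1
    return w

# Hypothetical layout: P borders Q, Q borders R, R borders S.
w_example = connectivity_matrix(["P", "Q", "R", "S"],
                                [("P", "Q"), ("Q", "R"), ("R", "S")])
for row in w_example:
    print(row)
```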
Cliff and Ord (1973) discuss more sophisticated weighting matrices; for example, wij may be inversely
proportional to the distance between the centrepoints of the regions, and/or proportional to the length of
the common boundary between the regions. The form of the matrix should be relevant to the context of
the applied problem, and must be determined prior to the analysis stage.
ALTERNATIVE PROCEDURES : Standard procedures
No built-in Minitab command exists. Normal approximations to the distributions of the Moran and Geary
coefficients can be derived using asymptotic theory, under two possible assumptions :
N : The data are independent realisations from a normal distribution.
R : The data are independent realisations from an unknown distribution.
The normal approximations are complicated, and we do not state them here. We present p-values obtained
using both normal approximations; they do not necessarily give similar answers.
REFERENCE
CLIFF, A.D. & ORD, J.K. (1973) Spatial autocorrelation, Pion, London
WORKED EXAMPLE FOR SPATAUTORAN
Name of dataset
WALES
Description
The data describe the percentage change in population in each of 13 Welsh counties (coded A to M) over
the period 1951-1961. Ranks for percentage change are then constructed, and it is these which are of
interest.
Our source
CLIFF, A.D. & ORD, J.K. (1973) Spatial autocorrelation, Pion, London.
Original source
GENERAL REGISTER OFFICE (1961) England and Wales: Preliminary Census Report, 1961, HMSO,
London.
The data
Number of observations = 13
Number of variables = 3
Code   Change   Rank
A        2.05      5
B       -1.70      8
C       -2.40      9
D        0.50      7
E       -2.50     10
F        1.80      6
G        3.20      3
H        2.10      4
I       -5.90     12
J        4.40      1
K       -3.80     11
L        3.40      2
M       -7.80     13
Worksheet
C1   County code
C2   % population change
C3   Rank change
M1   Geographical connectivity matrix
Plot : Network chart. This is a graphical representation of the connectivity matrix used in the example.
If counties i and j are directly joined in the chart, then the ijth element of the weighting matrix is 1.
Otherwise it is zero.
Aims of analysis
To investigate whether there is spatial autocorrelation in the rank population changes.
Randomization procedure
Welcome to Minitab, press F1 for help.
Retrieving worksheet from file: N:\resampling\Examples\Wales.MTW
# Worksheet was saved on 28/08/01 16:49:48
Results for: Wales.MTW
MTB > % N:\resampling\library\spatautoran c3 m1 13 ;
SUBC> nran 999 ;
SUBC> autocorrel c5 c6.
Executing from file: N:\resampling\library\spatautoran.MAC
Moran coefficient of spatial autocorrelation
Data Display (WRITE)
Observed Moran coefficient
0.1259
Expected Moran coefficient -0.08333
Standard error for Moran coefficient, under N
0.1743
Standard error for Moran coefficient, under R
0.1938
Standard two-sided p-value, under N
0.2302
Standard two-sided p-value, under R
0.2803
One-sided randomization p-values
0.1190
0.8860
Two-sided randomization p-value
0.2380
Geary coefficient of spatial autocorrelation
Data Display (WRITE)
Observed Geary coefficient
0.6656
Expected Geary coefficient
1.000
Standard error for Geary coefficient, under N
0.2354
Standard error for Geary coefficient, under R
0.1257
Standard two-sided p-value, under N
0.1554
Standard two-sided p-value, under R
0.0078
One-sided randomization p-values
0.9430
0.0600
Two-sided randomization p-value
0.1200
* NOTE * For further details, see
CLIFF, A.D. & ORD, J.K. (1973) Spatial
autocorrelation, Pion, London.
Modified worksheet
C5   A column containing 999 Moran coefficients, one for each randomized dataset
C6   A column containing 999 Geary coefficients, one for each randomized dataset
Discussion
There is no real evidence for spatial autocorrelation, except that the standard two-sided p-value under
assumption R for the Geary coefficient (0.0078) is small.
MANTELRAN
! Intensive !
To perform Mantel's test for association between the elements of two matrices, A and B.
Important note
A and B are usually distance matrices, so that, for example, the ijth element of A represents some kind of
distance between the ith and jth objects, whilst the ijth element of B represents a different kind of distance
between the same two objects. A and B are square, symmetric matrices. The elements along the diagonals
of A and B should all be zero. The kinds of "distances" which may be of interest include :
• Geographical distance
• Separation in time
• Environmental differences between sites
• Differences between quadrat counts at sites
• Genetic differences
RUNNING THE MACRO
Calling statement
mantelran m1 m2 k1 ;
nran k1 (999) ;
correlations c1.
Input
• m1 and m2 should be square, symmetric, k-by-k matrices of the same dimension. Both must have zero
entries along their leading diagonal. Matrices m1 and m2 represent matrices A and B in the
Discussion below.
• k1 is the number of observations (equal to the length of the side of each matrix), k.
Subcommands
correlations - specify a column within which to store simulated correlation coefficients.
Output
• Observed correlation coefficient
• P-values determined by randomisation
TECHNICAL DETAILS
Null hypothesis : The elements of matrices A and B are independent.
Alternative hypothesis : Linear association between the elements of A and B.
Test-statistic : The Pearson correlation coefficient between the entries of matrices A and B. Actually, we
only need to use the lower triangular elements of A and B, since both matrices are symmetric.
Randomization procedure : We fix the elements of A (say). To construct B, we randomly permute the
order of the individuals 1,…,n, and locate values of B according to the new position of the individuals
with which they are associated. This procedure is valid under one of two assumptions :
1. If the n individuals are a random sample from a larger population, then we must assume that
the A distances and B distances are independent within the population.
2. If the n individuals form the population of interest, then we must assume that the mechanism
generating A distances is independent of the mechanism generating B distances.
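The key point in the randomization procedure is that rows and columns of B are reordered together, so that each distance stays attached to the pair of individuals it describes. A minimal sketch in Python (the function name and the 3*3 matrix are hypothetical):

```python
def permute_matrix(b, order):
    """Reorder rows and columns of b by the same permutation of individuals."""
    return [[b[i][j] for j in order] for i in order]

# Hypothetical 3*3 distance matrix.
b = [
    [0, 1, 2],
    [1, 0, 3],
    [2, 3, 0],
]
# Individual 2 moves to position 0, individual 0 to position 1, and so on.
permuted = permute_matrix(b, [2, 0, 1])
for row in permuted:
    print(row)
```

The permuted matrix is still symmetric with a zero diagonal, and it contains the same set of distances as B; only their positions relative to the elements of A have changed.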
ALTERNATIVE PROCEDURES :
Other macros : None.
Standard procedures : None.
REFERENCES
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 9).
WORKED EXAMPLE FOR MANTELRAN
Name of dataset
EARWIGS
Description and worksheet
The data describe the global distribution of earwigs. Observations are taken upon 8 continental-level
regions (1:Europe and Asia, 2:Africa, 3:Madagascar, 4: Orient, 5: Australia, 6: New Zealand, 7: South
America, 8: North America). We have three matrices:
M1 : A correlation matrix. Element ij quantifies the similarity of earwig species in regions i and j.
M2 : A matrix representing current distances between regions. Element ij quantifies the "stepwise"
distance between regions i and j (i.e. it is 1 if they are adjacent, 2 if they are separated (overland)
by one other region and so on).
M3 : An alternative distance matrix, based upon the hypothesised arrangement of the continents in
Gondwanaland.
Interest lies in seeing whether the similarity in species between regions is more closely related to their
current geographical proximity or to their geographical proximity in Gondwanaland. If the latter
relationship is substantially stronger, this provides evidence that evolution of earwig species occurred in
Gondwanaland, which in turn provides supporting evidence for the Continental Drift Hypothesis (which
hypothesises the existence of Gondwanaland).
Our source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Original source
POPHAM, E.J. & MANLY, B.F.J. (1969), Geographical distribution of the Dermaptera and
the continental drift hypothesis, Nature, 222, pp. 981-982.
Data
Correlation matrix (8*8)
 0.00   0.30   0.14   0.23   0.30  -0.04   0.02  -0.09
 0.30   0.00   0.50   0.50   0.40   0.04   0.09  -0.06
 0.14   0.50   0.00   0.54   0.50   0.11   0.14   0.05
 0.23   0.50   0.54   0.00   0.61   0.03  -0.16  -0.16
 0.30   0.40   0.50   0.61   0.00   0.15   0.11   0.03
-0.04   0.04   0.11   0.03   0.15   0.00   0.14  -0.06
 0.02   0.09   0.14  -0.16   0.11   0.14   0.00   0.36
-0.09  -0.06   0.05  -0.16   0.03  -0.06   0.36   0.00
Distance matrix (8*8)
0  1  2  1  2  3  2  1
1  0  1  2  3  4  3  2
2  1  0  3  4  5  4  3
1  2  3  0  1  2  3  2
2  3  4  1  0  1  4  3
3  4  5  2  1  0  5  4
2  3  4  3  4  5  0  1
1  2  3  2  3  4  1  0
Alternative distance matrix (8*8)
0  1  2  1  2  3  2  1
1  0  1  1  1  2  1  2
2  1  0  1  1  2  2  3
1  1  1  0  1  2  2  2
2  1  1  1  0  1  2  3
3  2  2  2  1  0  3  4
2  1  2  2  2  3  0  1
1  2  3  2  3  4  1  0
Worksheet
M1   Correlation matrix
M2   Distance matrix (based on current arrangement of continents)
M3   Alternative distance matrix (based on Continental Drift Hypothesis)
Aims of analysis
To investigate whether the similarity of earwig species between continental-scale regions is correlated
with the "stepwise" geographical distance between those regions.
Randomization procedure : current distribution of continents
MTB > Retrieve "N:\spatial\Earwigs.MTW".
Retrieving worksheet from file: N:\spatial\Earwigs.MTW
# Worksheet was saved on 08/03/01 05:09:05 PM
Results for: Earwigs.MTW
MTB > % N:\spatial\mantelran m1 m2 8 ;
SUBC> nran 499 ;
SUBC> correlations c1.
Executing from file: N:\spatial\mantelran.MAC
Mantel Test, with significance determined by randomization
Data Display (WRITE)
Number of units 8
Observed correlation -0.2170
Number of randomizations 499
One-sided p-value, H1: positive correlation 0.8160
One-sided p-value, H1: negative correlation 0.1900
Two-sided p-value 0.3800
Looking at the data
MTB > Retrieve "N:\resampling\Examples\Earwigs.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Earwigs.MTW
# Worksheet was saved on 13/08/01 11:34:05
Results for: Earwigs.MTW
MTB > print m1
Data Display
Matrix M1
 0.00   0.40   0.04   0.50  -0.06   0.30   0.50   0.09
 0.40   0.00   0.15   0.50   0.03   0.30   0.61   0.11
 0.04   0.15   0.00   0.11  -0.06  -0.04   0.03   0.14
 0.50   0.50   0.11   0.00   0.05   0.14   0.54   0.14
-0.06   0.03  -0.06   0.05   0.00  -0.09  -0.16   0.36
 0.30   0.30  -0.04   0.14  -0.09   0.00   0.23   0.02
 0.50   0.61   0.03   0.54  -0.16   0.23   0.00  -0.16
 0.09   0.11   0.14   0.14   0.36   0.02  -0.16   0.00
Modified worksheet
C1   A column containing 499 correlation coefficients, one for each simulated dataset.
Discussion
The appropriate one-sided randomization p-value is 0.190, very similar to the 0.183 obtained by Manly
(1997) using 4999 randomizations. This provides no significant evidence of linear association between
the similarity in earwig species and the current distances between continental-level regions.
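The observed Mantel correlation can be reproduced outside Minitab. The sketch below is in Python and is not the MANTELRAN code; the matrices are transcribed from the Data listing for this example (correlation matrix and current distance matrix, regions in the order 1 to 8), and the function names are hypothetical. Counting the observed value as one member of the randomization distribution is an assumption.

```python
import math
import random

SIMILARITY = [
    [0.00, 0.30, 0.14, 0.23, 0.30, -0.04, 0.02, -0.09],
    [0.30, 0.00, 0.50, 0.50, 0.40, 0.04, 0.09, -0.06],
    [0.14, 0.50, 0.00, 0.54, 0.50, 0.11, 0.14, 0.05],
    [0.23, 0.50, 0.54, 0.00, 0.61, 0.03, -0.16, -0.16],
    [0.30, 0.40, 0.50, 0.61, 0.00, 0.15, 0.11, 0.03],
    [-0.04, 0.04, 0.11, 0.03, 0.15, 0.00, 0.14, -0.06],
    [0.02, 0.09, 0.14, -0.16, 0.11, 0.14, 0.00, 0.36],
    [-0.09, -0.06, 0.05, -0.16, 0.03, -0.06, 0.36, 0.00],
]
DISTANCE = [
    [0, 1, 2, 1, 2, 3, 2, 1],
    [1, 0, 1, 2, 3, 4, 3, 2],
    [2, 1, 0, 3, 4, 5, 4, 3],
    [1, 2, 3, 0, 1, 2, 3, 2],
    [2, 3, 4, 1, 0, 1, 4, 3],
    [3, 4, 5, 2, 1, 0, 5, 4],
    [2, 3, 4, 3, 4, 5, 0, 1],
    [1, 2, 3, 2, 3, 4, 1, 0],
]

def lower_triangle(m):
    """Elements below the diagonal, row by row."""
    return [m[i][j] for i in range(len(m)) for j in range(i)]

def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def mantel(a, b, nran=999, seed=1):
    """Return (observed r, p for H1: negative, p for H1: positive)."""
    rng = random.Random(seed)
    a_low = lower_triangle(a)
    observed = pearson(a_low, lower_triangle(b))
    order = list(range(len(b)))
    le = ge = 1                      # ASSUMPTION: observed value counts once
    for _ in range(nran):
        rng.shuffle(order)           # permute individuals, rows and columns together
        permuted = [[b[i][j] for j in order] for i in order]
        r = pearson(a_low, lower_triangle(permuted))
        le += r <= observed
        ge += r >= observed
    return observed, le / (nran + 1), ge / (nran + 1)

r_obs, p_negative, p_positive = mantel(SIMILARITY, DISTANCE)
print(round(r_obs, 4), p_negative, p_positive)
```

The observed correlation should agree with the value printed by the macro; the randomization p-values depend on the random seed and will differ slightly from run to run.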
MEAD4RAN
To perform Mead’s randomization test upon a 4*4 grid of spatial count data. Mead’s randomization test is
designed to test the null hypothesis of CSR (Complete Spatial Randomness).
Calling statement
mead4ran m1 ;
nran k1 (999) ;
qstatistics c1.
Input
A 4*4 matrix of quadrat counts, which may not contain any missing values.
The ordering of the counts in the matrix should be the same as the spatial ordering in the experiment
or study, and the results obtained will be dependent upon this ordering.
Subcommands
qstatistics
Specify a column in which to store simulated Q-statistics.
Output
• Observed Q-statistic
• Associated one-sided and two-sided randomization p-values
References
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 10).
Standard procedure : None
Null hypotheses : Assume that the quadrats are labelled as follows :
 1   2   5   6
 3   4   7   8
 9  10  13  14
11  12  15  16
The null hypothesis is that the division of the quadrats into blocks of 4 [(1) = 1,2,3,4 ; (2) = 5,6,7,8 ; (3) =
9,10,11,12 ; (4) = 13,14,15,16] is random. If the data exhibit Complete Spatial Randomness (CSR), then
this division will be random, so the null hypothesis can also be viewed as CSR.
Alternative hypotheses : Clustering or regularity at an appropriate scale, resulting in a non-random
division of the quadrats into blocks.
Test-statistic : Assume that the data are as follows, so that Ti represents the quadrat count in the ith
quadrat.
T1   T2   T5   T6
T3   T4   T7   T8
T9   T10  T13  T14
T11  T12  T15  T16
We use the test-statistic Q = BSS / TSS, where TSS is the variance of the 16 counts in the 4*4 grid, and
BSS is the variance for the 4 counts in the 2*2 grid formed by aggregating counts as follows :
AC1 = T1 + T2 + T3 + T4          AC2 = T5 + T6 + T7 + T8
AC3 = T9 + T10 + T11 + T12       AC4 = T13 + T14 + T15 + T16
Definition
$$ BSS = \frac{(T_1+T_2+T_3+T_4)^2 + (T_5+T_6+T_7+T_8)^2 + (T_9+T_{10}+T_{11}+T_{12})^2 + (T_{13}+T_{14}+T_{15}+T_{16})^2}{4} - 16\bar{T}^2 , $$
$$ TSS = \sum_{i=1}^{16} T_i^2 - 16\bar{T}^2 , $$
where \bar{T} is the mean of the 16 counts.
Q lies between 0 and 1. In general, unusually large values of Q imply clustering, whilst unusually small
values of Q imply some form of regularity. However, it should be noted that the test is only capable of
detecting regularity or clustering at a particular spatial scale (the scale reflected by blocks of size 4).
Mead's randomization test can either be applied to data which naturally arise as a 4*4 grid of counts, or
(more commonly) by placing a 4*4 grid over a region in which locations of points are recorded, and
counting the number of points within each section of the grid.
Randomization procedure : We randomize the allocation of counts to cells within the grid, since under
the null hypothesis of complete spatial randomness this allocation should occur at random.
WORKED EXAMPLE FOR MEAD4RAN
Name of dataset
SAPLING1
Description
The raw data describes the position of 71 Swedish pine saplings in a 10 x 10m square. In this dataset, we
divide the region into 16 squares (each 2.5m x 2.5m), and count the number of saplings within each square.
Source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Data
Number of observations = 16
Number of variables = 1
Counts within the 4*4 grid are shown.
6  2  5  4
4  6  4  6
3  4  5  6
2  3  4  7
Worksheet
M1   Matrix of counts
Aim of analysis
To test whether the distribution of pine saplings is random.
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Sapling1.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Sapling1.MTW
# Worksheet was saved on 09/08/01 09:13:12
Results for: Sapling1.MTW
MTB > print m1
Data Display
Matrix M1
6  2  5  4
4  6  4  6
3  4  5  6
2  3  4  7
MTB > % N:\resampling\library\mead4ran m1 ;
SUBC> nran 999 ;
SUBC> qstatistics c1.
Executing from file: N:\resampling\library\mead4ran.MAC
Mead's randomization test for a 4*4 grid
Data Display (WRITE)
Observed Q-statistic 0.3886
One-sided randomization p-value, H1: regularity 0.9010
One-sided randomization p-value, H1: clustering 0.1050
Two-sided randomization p-value 0.2100
* NOTE * For further details, see
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo
methods in biology, Chapman and Hall, London.
Modified worksheet
C1   A column containing 999 Q-statistics, one for each simulated dataset
Discussion
There is slight evidence of clustering (we obtain a one-sided p-value of 0.105; Manly (1997) obtains
0.111), but this cannot be regarded as statistically significant. Mead's test therefore provides no real
evidence against randomness at this scale (this qualification is important - the test is scale-dependent).
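The Q-statistic for this example can be checked by hand or with a short script. The sketch below is in Python and is not the MEAD4RAN code; the function names are hypothetical, and counting the observed value as one member of the randomization distribution is an assumption. The counts are those of the SAPLING1 grid.

```python
import random

SAPLING1 = [
    [6, 2, 5, 4],
    [4, 6, 4, 6],
    [3, 4, 5, 6],
    [2, 3, 4, 7],
]

def mead_q(grid):
    """Mead's Q = BSS / TSS for a 4*4 grid of counts.

    Assumes the counts are not all equal (otherwise TSS is zero).
    """
    counts = [c for row in grid for c in row]
    correction = sum(counts) ** 2 / 16.0         # equals 16 * mean^2
    tss = sum(c * c for c in counts) - correction
    # Totals for the four 2*2 blocks (top left, top right, bottom left, bottom right).
    blocks = [grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]
              for r in (0, 2) for c in (0, 2)]
    bss = sum(t * t for t in blocks) / 4.0 - correction
    return bss / tss

def mead4_randomization(grid, nran=999, seed=1):
    """Return (observed Q, p for H1: clustering, p for H1: regularity)."""
    rng = random.Random(seed)
    observed = mead_q(grid)
    counts = [c for row in grid for c in row]
    ge = le = 1                      # ASSUMPTION: observed value counts once
    for _ in range(nran):
        rng.shuffle(counts)          # reallocate counts to cells at random
        q = mead_q([counts[4 * r:4 * r + 4] for r in range(4)])
        ge += q >= observed
        le += q <= observed
    return observed, ge / (nran + 1), le / (nran + 1)

q_obs, p_clustering, p_regularity = mead4_randomization(SAPLING1)
print(round(q_obs, 4), p_clustering, p_regularity)
```

Here the block totals are 18, 19, 12 and 22, so BSS = 1313/4 - 71²/16 = 13.1875 and TSS = 349 - 71²/16 = 33.9375, giving the printed value of Q.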
MEAD8RAN
To perform Mead’s randomization test upon an 8*8 grid of spatial count data. Mead’s randomization test
is designed to test the null hypothesis of CSR (Complete Spatial Randomness).
Calling statement
mead8ran m1 ;
nran k1 (999) ;
qstatistics c1-c5.
Input
An 8*8 matrix of quadrat counts, which may not contain any missing values.
The ordering of the counts in the matrix should be the same as the spatial ordering in the experiment
or study, and the results obtained will be dependent upon this ordering.
Subcommands
qstatistics
Specify five columns in which to store simulated Q-statistics for each of the four
quarters of the study area (top left in the 1st column, top right in the 2nd column,
bottom left in the 3rd column, bottom right in the 4th column), and the simulated
mean Q-statistics (in the 5th column).
Output
• Observed Q-statistics for each quarter
• Observed mean Q-statistic
• One-sided and two-sided randomization p-values for the mean Q-statistic
Null hypothesis : Complete spatial randomness.
Alternative hypothesis : Clustering or regularity at a particular scale.
Test-statistic : We compute Q statistics (as in Mead's randomization test for a 4*4 grid) for each of the
four 4*4 blocks of cells in each corner of the 8*8 grid. We use the average of these as the test-statistic.
Once again, Mead's randomization test is only capable of picking up deviations from randomness at
particular scales, but in this case the scale is finer.
Randomization procedure : We use restricted randomization, randomizing counts separately within
each of the four 4*4 grids.
References
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 10).
Standard procedure : None
WORKED EXAMPLE FOR MEAD8RAN
Name of dataset
SAPLING2
Description
The raw data describes the position of 71 Swedish pine saplings in a 10 x 10m square. In this dataset, we
divide the region into 64 squares (each 1.25m x 1.25m), and count the number of saplings within each
square.
Source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Data
Number of observations = 64
Number of variables = 1
Counts within the 8*8 grid are shown.
1  0  0  1  1  1  1  0
3  2  0  1  2  1  1  2
2  1  2  0  2  0  1  0
0  1  3  1  1  1  3  2
1  1  1  1  1  2  2  1
0  1  1  1  0  2  1  2
0  1  0  2  2  1  1  2
0  1  1  0  0  1  1  3
Worksheet
M1   Matrix of counts
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Sapling2.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Sapling2.MTW
# Worksheet was saved on 28/08/01 16:45:24
Results for: Sapling2.MTW
MTB > print m1
Data Display
Matrix M1
1  0  0  1  1  1  1  0
3  2  0  1  2  1  1  2
2  1  2  0  2  0  1  0
0  1  3  1  1  1  3  2
1  1  1  1  1  2  2  1
0  1  1  1  0  2  1  2
0  1  0  2  2  1  1  2
0  1  1  0  0  1  1  3
MTB > Save "N:\resampling\Examples\Sapling2.MTW";
SUBC> Replace.
Saving file as: N:\resampling\Examples\Sapling2.MTW
* NOTE * Existing file replaced.
MTB > % N:\resampling\library\mead8ran m1 ;
SUBC> nran 999 ;
SUBC> qstatistics c3-c7.
Executing from file: N:\resampling\library\mead8ran.MAC
Mead's randomization test for an 8*8 grid
Data Display (WRITE)
Observed Q-statistic for top left quarter
0.1746
Observed Q-statistic for top right quarter 0.06587
Observed Q-statistic for bottom left quarter
0.1000
Observed Q-statistic for bottom right quarter
0.1282
Data Display (WRITE)
Observed mean Q-statistic 0.1172
Number of randomizations 999
One-sided randomization p-value, H1: regularity 0.0940
One-sided randomization p-value, H1: clustering 0.9110
Two-sided randomization p-value 0.1880
* NOTE * For further details, see
MANLY, F.J. (1997) Randomization, bootstrap and Monte Carlo
methods in biology, Chapman and Hall, London.
Modified worksheet
C3   A column containing 999 Q-statistics for top left quarter, one for each simulated dataset.
C4   A column containing 999 Q-statistics for top right quarter, one for each simulated dataset.
C5   A column containing 999 Q-statistics for bottom left quarter, one for each simulated dataset.
C6   A column containing 999 Q-statistics for bottom right quarter, one for each simulated dataset.
C7   A column containing 999 average Q statistics, one for each simulated dataset.
Discussion
There is slight evidence of regularity (we obtain a one-sided p-value of 0.094; Manly (1997) obtains
0.093), but this cannot be regarded as statistically significant. Mead's test again provides no real evidence
against randomness at this scale (this qualification is important - the test is scale-dependent), but it is
interesting to note that whilst clustering is the most plausible hypothesis at the 4*4 scale, regularity is the
most plausible hypothesis at the 8*8 scale.
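The quarter-by-quarter Q-statistics and the restricted randomization can be sketched as follows. This is Python, not the MEAD8RAN code; the function names are hypothetical, the counts are those of the SAPLING2 grid, and counting the observed value as one member of the randomization distribution is an assumption.

```python
import random

SAPLING2 = [
    [1, 0, 0, 1, 1, 1, 1, 0],
    [3, 2, 0, 1, 2, 1, 1, 2],
    [2, 1, 2, 0, 2, 0, 1, 0],
    [0, 1, 3, 1, 1, 1, 3, 2],
    [1, 1, 1, 1, 1, 2, 2, 1],
    [0, 1, 1, 1, 0, 2, 1, 2],
    [0, 1, 0, 2, 2, 1, 1, 2],
    [0, 1, 1, 0, 0, 1, 1, 3],
]

def mead_q(grid):
    """Mead's Q = BSS / TSS for a 4*4 grid of counts (counts not all equal)."""
    counts = [c for row in grid for c in row]
    correction = sum(counts) ** 2 / 16.0
    tss = sum(c * c for c in counts) - correction
    blocks = [grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]
              for r in (0, 2) for c in (0, 2)]
    return (sum(t * t for t in blocks) / 4.0 - correction) / tss

def quarters(grid):
    """The four 4*4 quarters: top left, top right, bottom left, bottom right."""
    return [[row[c:c + 4] for row in grid[r:r + 4]] for r in (0, 4) for c in (0, 4)]

def mead8_randomization(grid, nran=999, seed=1):
    """Return (quarter Qs, mean Q, p for H1: regularity, p for H1: clustering)."""
    rng = random.Random(seed)
    observed = [mead_q(q) for q in quarters(grid)]
    observed_mean = sum(observed) / 4
    pools = [[c for row in q for c in row] for q in quarters(grid)]
    le = ge = 1                      # ASSUMPTION: observed value counts once
    for _ in range(nran):
        total = 0
        for pool in pools:
            rng.shuffle(pool)        # restricted: shuffle within each quarter only
            total += mead_q([pool[4 * r:4 * r + 4] for r in range(4)])
        mean_q = total / 4
        le += mean_q <= observed_mean
        ge += mean_q >= observed_mean
    return observed, observed_mean, le / (nran + 1), ge / (nran + 1)

q_quarters, q_mean, p_regularity, p_clustering = mead8_randomization(SAPLING2)
print([round(q, 4) for q in q_quarters], round(q_mean, 4), p_regularity, p_clustering)
```

The four quarter statistics and their mean should agree with the printed output; the p-values depend on the random seed.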
Creating and interpreting EDF plots
Introduction
EDF plots are a convenient graphical procedure for investigating the null hypothesis of CSR (Complete
Spatial Randomness). EDF plots are very similar to the Probability Plots produced (for example by
Minitab) to test distributional assumptions. The differences are that :
1) EDF plots test the observed data against the null hypothesis of Complete Spatial Randomness, rather
than against a hypothesised distribution (e.g. normality).
2) EDF plots are constructed using Monte Carlo simulation, rather than theoretical results.
Underlying theory
• Assume that data is available for n points within a fixed rectangular region A of known area.
• We consider the distribution of all inter-event distances.
• Under the null hypothesis of CSR, we can assume that the distance between any two events in A is a
realisation from a random variable T.
• Assume for the time being that T is known (for very simple regions, such as a square or circle, we can
derive the exact distribution of T), and assume T has cumulative distribution function F(t).
• We compare this theoretical distribution, F(t) derived under the null hypothesis of CSR with the
empirical distribution function (EDF) of the data, E(t).
• If the null hypothesis is true, we would then expect E(t) to be "close" to F(t); the basis of the EDF plot
involves plotting E(t) against F(t), and comparing this to a plot of F(t) against itself.
Implementation
Four issues arise in practice :
Question 1 : At what values of t should we evaluate E(t) and F(t) ?
Answer : If we assume that the region is rectangular, with sides a and b, then t is constrained to lie in the
range [0, (a^2 + b^2)^(1/2)]. We then evaluate the CDF and EDF at a fixed number (the default is 100, but there is
an option for the user to change this) of equally spaced points within this range. We therefore work not
with the EDF, but with an approximation to it. If the number of points is reasonably large, the error from
working with an approximation is very small.
Question 2 : How do we compute the empirical distribution function ?
Answer : At each evaluation distance t, the EDF is defined to be
E(t) = (Number of observed inter-event distances less than or equal to t) / n.
Question 3 : How do we find F(t) if the region is not very simple ?
Answer : We create d simulated datasets of size n under the null hypothesis of CSR.
If the theoretical bounds of the study region are [XMIN,XMAX] on the x-axis and [YMIN,YMAX] on
the y-axis, then to create one simulated dataset, we simulate n x-values from a uniform distribution on
[XMIN,XMAX] and n y-values from a uniform distribution on [YMIN,YMAX], and pair these up to give
the location of n simulated datapoints. We then compute the EDF for each of the d simulated datasets.
Let these EDFs be H1(t),…,Hd(t). An estimate of F(t) at distance t is the mean of H1(t),…,Hd(t).
Question 4 : How do we create an indication of variability under the null hypothesis ?
Answer : In order to compare the EDF against the CDF generated under CSR we need some indication of the
variability in the CDF under CSR. To do this, we can create lower and upper simulation limits
L(t) = Min{i=1,…d} [Hi(t)]
U(t) = Max{i=1,…d} [Hi(t)]
and plot these against F(t). If the EDF, E(t), deviates from the simulation "envelope" bounded by L(t) and
U(t), then this provides evidence against CSR (although exact significance levels are difficult to assign).
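The four steps above can be sketched in Python (this is not the DISTEDFMC code, and all names are hypothetical). One assumption should be noted: the EDF here is normalised by the number of inter-event distances, n(n-1)/2, so that it reaches 1 at the largest possible distance. The number of simulations is kept small for speed.

```python
import math
import random

def pairwise_distances(points):
    """All inter-event distances between (x, y) points."""
    return [math.hypot(p[0] - q[0], p[1] - q[1])
            for i, p in enumerate(points) for q in points[i + 1:]]

def edf(distances, grid):
    """Proportion of distances <= t, for each t in grid.

    ASSUMPTION: normalised by the number of distances, n(n-1)/2.
    """
    m = len(distances)
    return [sum(1 for d in distances if d <= t) / m for t in grid]

def csr_envelope(n, xmin, xmax, ymin, ymax, npoints=100, nsim=49, seed=1):
    """Estimate F(t) and the simulation envelope under CSR."""
    rng = random.Random(seed)
    diagonal = math.hypot(xmax - xmin, ymax - ymin)   # largest possible distance
    grid = [diagonal * k / npoints for k in range(1, npoints + 1)]
    sims = []
    for _ in range(nsim):
        pts = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]
        sims.append(edf(pairwise_distances(pts), grid))
    f_hat = [sum(col) / nsim for col in zip(*sims)]   # mean of H1(t),...,Hd(t)
    lower = [min(col) for col in zip(*sims)]          # L(t)
    upper = [max(col) for col in zip(*sims)]          # U(t)
    return grid, f_hat, lower, upper

# Hypothetical data: 20 points in a 10 x 10 square.
data_rng = random.Random(42)
observed_points = [(data_rng.uniform(0, 10), data_rng.uniform(0, 10)) for _ in range(20)]
grid, f_hat, lower, upper = csr_envelope(20, 0, 10, 0, 10)
e_obs = edf(pairwise_distances(observed_points), grid)
outside = [t for t, e, lo, hi in zip(grid, e_obs, lower, upper) if e < lo or e > hi]
print(len(outside), "evaluation distances fall outside the envelope")
```

Plotting e_obs, lower and upper against f_hat (or all four against grid) gives the two graphical representations described in this section.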
Graphical representations
There are two possible graphical representations of the EDF and associated simulation envelope :
• Plot the EDF and simulation envelope against the theoretical distribution (the CDF)
• Plot the EDF, CDF and simulation envelope against distance, t.
Both plots are presented, and should be interpreted in much the same way.
Interpreting the graphs
Look at both graphs, and ask yourself the following questions :
• Does the EDF line generally seem to lie far from the CDF line ?
• Does the EDF line go outside the simulation envelope at any point ?
• Does the EDF line show any systematic deviation from the CDF line (e.g. always lying above the CDF
line) ?
If any of the answers is "yes", this may suggest that the assumption of CSR is false.
If the EDF line always lies close to the CDF, never travels outside the envelope and shows no systematic
deviation from the CDF then there is no evidence against CSR.
Test-statistics
We use two "ad hoc" test-statistics,
1. Maximum pointwise squared difference between the EDF and the estimated CDF
2. Average squared difference (across all t values) between the EDF and the estimated CDF.
Both test the null hypothesis of complete spatial randomness in the location of data points within the
region, against the alternative hypotheses of clustering or regularity.
We only consider the one-sided p-value, since deviations from CSR will always be associated with a large
squared difference. The second test-statistic is likely to have low power in many circumstances.
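The two test-statistics and their one-sided Monte Carlo p-values could be computed along the following lines. This is a sketch under stated assumptions rather than the macro's own code: the function names are hypothetical, the estimated CDF is taken to be the mean of the simulated EDFs, each simulated EDF is compared with that same mean, and the p-value counts the observed data once (so that the smallest possible p-value with 99 simulations is 0.01).

```python
def squared_deviation_stats(edf_values, cdf_values):
    """Return (maximum, average) squared difference across all t values."""
    sq = [(e - f) ** 2 for e, f in zip(edf_values, cdf_values)]
    return max(sq), sum(sq) / len(sq)


def monte_carlo_p_values(observed_edf, simulated_edfs):
    """One-sided p-values: large squared differences count against CSR."""
    nsim = len(simulated_edfs)
    # Estimated CDF: pointwise mean of the simulated EDFs (an assumption).
    cdf_hat = [sum(col) / nsim for col in zip(*simulated_edfs)]
    obs_max, obs_avg = squared_deviation_stats(observed_edf, cdf_hat)
    sim_stats = [squared_deviation_stats(h, cdf_hat) for h in simulated_edfs]
    p_max = (1 + sum(s[0] >= obs_max for s in sim_stats)) / (nsim + 1)
    p_avg = (1 + sum(s[1] >= obs_avg for s in sim_stats)) / (nsim + 1)
    return {"max": (obs_max, p_max), "average": (obs_avg, p_avg)}


# Tiny hypothetical example: three simulated EDFs evaluated at three distances.
simulated = [[0.1, 0.5, 0.9], [0.2, 0.6, 1.0], [0.3, 0.4, 0.8]]
result = monte_carlo_p_values([0.9, 0.9, 0.9], simulated)
```

In this toy example the observed EDF lies far from every simulated EDF at the first distance, so both p-values take the smallest value available with three simulations, 1/4.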
DISTEDFMC
! Intensive !
To construct EDF plots based on distances between all points within a region.
Calling statement
distedfmc c1 c2 k1 ;
nsim k1 (999) ;
npoints k1 (100) ;
distances c1 ;
edfs c1-c5.
Input
c1 and c2 should contain paired x and y co-ordinates for each point in the plane.
k1 should be the number of points at which observations are available.
Interactive input
The user is prompted to enter the minimum and maximum possible values of the x and y values.
The macro checks whether these lie outside the range of the observed data; if they lie within the observed
range, an error similar to the following will arise :
*** ERROR *** Stated theoretical minimum for x is greater than
the observed minimum value of x.
Subcommands
npoints - Number of distance values which should be used to construct the EDF plot.
Increasing the number of values will increase the resolution of the plot.
distances - Specify a column in which to store all distances between points.
edfs - Specify five columns in which to store :
1. An equally-spaced vector of distance values (determined by npoints)
2. An estimate of the expected cumulative distribution function
3. The empirical distribution function
4. The lower bound of the simulation envelope
5. The upper bound of the simulation envelope
Output
• An EDF plot, with simulation envelope
• A global assessment of randomness, with randomization p-value
Missing values
Are allowed. However, take note !
Important note : The specified number of data points (the third argument to the command) should not
include any data points for which either the x or y value is missing. If this is not taken account of, the
following error arises :
*** ERROR *** The number of points in the data is not equal to the
specified number of points.
TECHNICAL DETAILS
See above.
STANDARD PROCEDURE
No standard MINITAB procedure is available.
REFERENCE
DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London.
WORKED EXAMPLE FOR DISTEDFMC
Name of dataset
JAPANESE
Description
The data record the location of 65 Japanese black pine seedlings within a fixed square of side 5.7m. Data
have been scaled so that x and y co-ordinates must lie between 0 and 1.
Our source
DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London (p. 1).
Original source
NUMATA, M. (1961), Forest vegetation in the vicinity of Choshi. Coastal flora and vegetation at Choshi,
Chiba Prefecture IV.,
Bull. Choshi Marine Lab. Chiba Uni., 3, pp. 28-48 [in Japanese].
Data
Number of observations = 65
Number of variables = 2
For each point, the x-value (top) and y-value (bottom) are given.
0.09 0.59 0.86 0.42 0.02 0.08 0.31 0.94 0.59 0.94 0.17 0.39 0.36
0.09 0.02 0.13 0.22 0.41 0.59 0.53 0.58 0.67 0.78 0.95 0.79 0.97
0.29 0.65 0.89 0.48 0.03 0.08 0.32 0.34 0.66 0.98 0.21 0.52 0.36
0.02 0.16 0.08 0.13 0.44 0.63 0.52 0.68 0.68 0.79 0.79 0.93 0.96
0.38 0.67 0.98 0.62 0.07 0.12 0.42 0.37 0.76 0.97 0.29 0.58 0.39
0.03 0.13 0.02 0.21 0.42 0.63 0.49 0.68 0.66 0.86 0.84 0.83 0.96
0.39 0.73 0.02 0.73 0.52 0.12 0.52 0.47 0.73 0.12 0.32 0.69 0.43
0.18 0.13 0.18 0.23 0.42 0.66 0.52 0.67 0.73 0.84 0.83 0.93 0.96
0.48 0.79 0.11 0.89 0.64 0.17 0.91 0.52 0.89 0.11 0.35 0.77 0.62
0.03 0.03 0.31 0.23 0.43 0.58 0.52 0.67 0.74 0.94 0.86 0.93 0.97
Plot
[Figure : scatterplot of y against x; both axes run from 0 to 1.]
Worksheet
C1 X-co-ordinates
C2 Y-co-ordinates
Aim of analysis
To investigate whether the spatial distribution of Japanese pine seedlings is random.
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Japanese.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Japanese.MTW
# Worksheet was saved on 16/08/01 10:21:11
Results for: Japanese.MTW
MTB > % N:\resampling\library\distedfmc c1 c2 65 ;
SUBC> nsim 99 ;
SUBC> npoints 50 ;
SUBC> distances c4 ;
SUBC> edfs c6-c10.
Executing from file: N:\resampling\library\distedfmc.MAC
What are the minimum and maximum possible x co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 1
What are the minimum and maximum possible y co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 1
Exact Monte Carlo test of Complete Spatial Randomness (CSR),
based on inter-event distances
* NOTE * In many circumstances, the test based on average
deviation will be very weak to detect departures from CSR
Data Display (WRITE)
Observed test-statistic, average deviation 0.000344
Randomization p-value 0.3800
Observed test-statistic, maximum deviation 0.00145
Randomization p-value 0.4100
EDF plots of inter-event distances, with simulation envelopes
layout ;
* NOTE * Beginning LAYOUT mode.
Type ENDLAYOUT to end mode.
* NOTE * Ending LAYOUT mode.
plot expected*tvec hemp*tvec ltval*tvec utval*tvec;
* NOTE * For further details, see
DIGGLE, P.J. (1983) Statistical analysis of
spatial point patterns, Academic Press, London.
Modified worksheet
C6 Column containing 50 evaluation distances
C7 Column containing 50 empirical distribution function values, one for each distance
C8 Column containing 50 estimates of the cumulative distribution function, one for each distance
C9 Column containing 50 lower simulation envelope bounds, one for each distance
C10 Column containing 50 upper simulation envelope bounds, one for each distance
[Figure : EDF plot of inter-event distances. Cumulative probability (0 to 1) is plotted against distance (0 to 1.5). Observed EDF is solid line. Simulation envelope is formed by dotted lines.]
[Figure : EDF plot of inter-event distances. The empirical distribution (0 to 1) is plotted against the theoretical distribution (0 to 1). Observed EDF is solid line. Simulation envelope is formed by dotted lines.]
Discussion
Both kinds of EDF plot give the same impression; the EDF lies close to the CDF throughout the range,
and well within the simulation envelope. This suggests that we should not reject the null hypothesis of
complete spatial randomness. This is confirmed by the formal tests, with non-significant p-values for tests
based upon average deviation (p-value = 0.38) and maximum deviation (p-value = 0.41).
ADDITIONAL SAMPLE DATA FOR DISTEDFMC
Name of dataset
REDWOOD
Suitable for use with
DISTEDFMC
NEARESTMC
LOCREGULARMC
Description
The data record the location of 62 redwood seedlings within a fixed square of side 23m. Data have been
scaled so that x and y co-ordinates must lie between 0 and 1.
Our source
DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London (p. 2).
Original sources
RIPLEY, B.D. (1977), Modelling spatial patterns (with Discussion), JRSS series B, 39, pp. 172-212.
STRAUSS, D.J. (1975), A model for clustering, Biometrika, 62, pp. 467-475.
Data
Number of observations = 62
Number of variables = 2
For each point, the x-value (top) and y-value (bottom) are given.
0.364 0.898 0.864 0.966 0.864 0.686 0.500 0.483 0.339 0.483 0.186
0.082 0.082 0.180 0.541 0.902 0.328 0.598 0.672 0.836 0.820 0.402
0.203 0.186 0.483 0.898 0.780 0.898 0.746 0.644 0.525 0.220 0.381
0.525 0.541 0.082 0.098 0.123 0.754 0.902 0.279 0.574 0.795 0.836
0.483 0.203 0.102 0.186 0.441 0.839 0.780 0.898 0.678 0.610 0.585
0.770 0.426 0.574 0.557 0.098 0.082 0.164 0.779 0.246 0.344 0.574
0.220 0.407 0.508 0.220 0.119 0.500 0.898 0.763 1.000 0.703 0.627
0.836 0.852 0.754 0.459 0.574 0.098 0.164 0.148 0.836 0.279 0.344
0.559 0.263 0.263 0.441 0.237 0.136 0.483 0.898 0.949 0.966 0.729
0.639 0.852 0.697 0.754 0.475 0.574 0.148 0.189 0.525 0.959 0.262
0.644 0.525 0.288 0.441 0.186 0.203 0.119
0.361 0.656 0.852 0.820 0.377 0.500 0.623
Worksheet
C1 X-co-ordinates
C2 Y-co-ordinates
NEARESTMC
! Intensive !
To compute kth nearest neighbour distances for a set of points within a fixed rectangular region, and to use
a Monte Carlo test to determine the significance of each of these distances.
Calling statement
nearestmc c1 c2 k1 ;
nsim k1 ;
nstats k1 ;
distances m1 ;
nearest m1.
Input
c1 and c2 should contain paired x and y co-ordinates for each point in the plane.
k1 should be the number of points at which observations are available.
Interactive input
The user will be asked to specify the theoretical minimum and maximum values of the x and y
co-ordinates; these quantities (XMIN, XMAX, YMIN and YMAX) are important, as they determine the
region from which points may be drawn in the Monte Carlo simulation.
Subcommands
nstats - the macro considers kth nearest neighbour distances for k = 1,…,m. nstats specifies m, the
highest order of nearest neighbour distance to be considered.
distances - specify a matrix within which to store the distance matrix for the observed data.
nearest - specify a matrix within which to store the kth nearest neighbour distances (k = 1,…,m).
Output
gobs - Observed kth nearest neighbour distance
gsmean - Mean of simulated kth nearest neighbour distances
gsmin - Minimum of simulated kth nearest neighbour distances
gsmax - Maximum of simulated kth nearest neighbour distances
p1low - One-sided p-value for kth nearest neighbour distance being unusually small
p1high - One-sided p-value for kth nearest neighbour distance being unusually large
p2sided - Two-sided p-value for kth nearest neighbour distance
Null hypothesis : Complete spatial randomness in the location of data points within the region.
Test-statistic : We use kth nearest neighbour distances (for k = 1,…,nstats) as our test-statistics.
Simulation procedure : Assume that the sample size is n. For each Monte Carlo simulation, we simulate
n realisations from a continuous uniform distribution on the interval [XMIN, XMAX], and use these as x
co-ordinates. We also simulate n realisations from a continuous uniform distribution on the interval [YMIN,
YMAX], and use these as y co-ordinates. We pair the x and y co-ordinates randomly, and use the resulting
points as our simulated dataset.
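The simulation procedure above can be sketched in Python as follows. This is an illustrative sketch, not the macro's own code. It assumes that the statistic for order k is the mean, over all points, of the distance to the kth nearest neighbour, that each one-sided p-value counts the observed data once, and that the two-sided p-value is twice the smaller one-sided p-value (capped at 1); the function names and the grid of example points are hypothetical.

```python
import math
import random


def kth_nn_means(points, m):
    """Mean distance to the kth nearest neighbour, for k = 1,...,m."""
    totals = [0.0] * m
    for i, p in enumerate(points):
        d = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for k in range(m):
            totals[k] += d[k]
    return [t / len(points) for t in totals]


def nearest_mc(points, bounds, m, nsim, rng):
    """Monte Carlo test of CSR for each order k of nearest neighbour distance."""
    xmin, xmax, ymin, ymax = bounds
    n = len(points)
    gobs = kth_nn_means(points, m)
    sims = []
    for _ in range(nsim):
        sim = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]
        sims.append(kth_nn_means(sim, m))
    rows = []
    for k in range(m):
        col = [s[k] for s in sims]
        p1low = (1 + sum(g <= gobs[k] for g in col)) / (nsim + 1)
        p1high = (1 + sum(g >= gobs[k] for g in col)) / (nsim + 1)
        rows.append({"gobs": gobs[k], "gsmean": sum(col) / nsim,
                     "gsmin": min(col), "gsmax": max(col),
                     "p1low": p1low, "p1high": p1high,
                     "p2sided": min(1.0, 2 * min(p1low, p1high))})
    return rows


# Hypothetical example: a regular 3 x 3 grid on the unit square.
grid = [(0.5 * i, 0.5 * j) for i in range(3) for j in range(3)]
rows = nearest_mc(grid, (0.0, 1.0, 0.0, 1.0), m=2, nsim=49, rng=random.Random(1))
```

Each entry of rows corresponds to one line of the macro's output table, with the same column names.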
References
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London (Chapter 4).
Standard procedure
No standard MINITAB procedure is available.
WORKED EXAMPLE FOR NEARESTMC
Name of dataset
FIELD
Description
The data are artificial; they show the positions of 24 points within a 2m x 2m square.
Source
MANLY, B.F.J. (1997) Randomization, bootstrap and Monte Carlo methods in biology,
Chapman and Hall, London.
Data
Number of observations = 24
Number of variables = 2
For each point, the x-value (top) and y-value (bottom) are given.
0.1 0.1 0.3 0.4 0.7 0.9 1.1 1.2 1.1 1.2 1.2 1.0 1.1 1.3 1.0
0.9 1.5 0.9 1.1 0.7 0.3 0.1 0.1 0.3 0.3 0.6 0.8 0.8 0.6 1.9
1.3 1.4 1.5 1.6 1.7 1.6 1.7 1.9 1.9
1.4 0.8 0.7 0.6 0.4 0.8 0.8 0.5 0.9
Plot
[Figure : scatterplot of y against x; both axes run from 0 to 2.]
Worksheet
C1 X-co-ordinates
C2 Y-co-ordinates
Aim of analysis
To investigate whether the location of points within the field is random.
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Field.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Field.MTW
# Worksheet was saved on 16/08/01 10:11:34
Results for: Field.MTW
MTB > % N:\resampling\library\nearestmc c1 c2 24 ;
SUBC> nsim 999 ;
SUBC> nstats 12 ;
SUBC> distances m1 ;
SUBC> nearest m2.
Executing from file: N:\resampling\library\nearestmc.MAC
What are the minimum and maximum possible x co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 2
What are the minimum and maximum possible y co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 2
Monte-Carlo test for nearest-neighbour distances
Data Display
Row   gobs  gsmean   gsmin   gsmax   p1low  p1high  p2sided
  1  0.204   0.222   0.138   0.304   0.268   0.733    0.536
  2  0.306   0.345   0.245   0.456   0.128   0.873    0.256
  3  0.361   0.444   0.306   0.566   0.025   0.976    0.050
  4  0.421   0.527   0.375   0.676   0.011   0.990    0.022
  5  0.488   0.605   0.433   0.758   0.013   0.988    0.026
  6  0.537   0.678   0.497   0.857   0.011   0.990    0.022
  7  0.599   0.746   0.556   0.927   0.011   0.990    0.022
  8  0.641   0.813   0.601   1.008   0.009   0.992    0.018
  9  0.684   0.876   0.638   1.088   0.009   0.992    0.018
 10  0.721   0.936   0.686   1.156   0.007   0.994    0.014
 11  0.758   0.995   0.720   1.301   0.007   0.994    0.014
 12  0.786   1.054   0.741   1.359   0.003   0.998    0.006
Modified worksheet
M1 A 24*24 matrix of distances between sample points
M2 A 999*12 matrix of kth nearest neighbour distances (k = 1,…,12).
The kth column contains 999 kth nearest neighbour distances, one for each simulated dataset
Discussion
The 1st and 2nd nearest neighbour distances are not significantly different from that which we would
expect if the points were distributed at random. All higher distances are significant at the 5% level, with
the degree of significance increasing as the order (k) of the distance increases. However, if we allow for
multiple testing, then we should really reduce the significance level from 5% to 5/12 = 0.42% (using the
Bonferroni inequality, a conservative procedure). In this case, none of the distances is found to be
significant, suggesting that there is no strong evidence against randomness. The findings are different to
those of Manly (1997), probably because we extracted the data from his graph by eye, and this would
have led to error.
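The Bonferroni adjustment described in this discussion can be checked with a few lines of Python, using the two-sided p-values from the output above. The variable names are illustrative only.

```python
# Two-sided p-values for k = 1,...,12, copied from the NEARESTMC output.
p2sided = [0.536, 0.256, 0.050, 0.022, 0.026, 0.022,
           0.022, 0.018, 0.018, 0.014, 0.014, 0.006]

alpha = 0.05
threshold = alpha / len(p2sided)   # Bonferroni level: 0.05 / 12, about 0.42%

# Orders k that are significant before and after the adjustment.
unadjusted = [k for k, p in enumerate(p2sided, start=1) if p <= alpha]
adjusted = [k for k, p in enumerate(p2sided, start=1) if p <= threshold]

print(round(100 * threshold, 2))   # 0.42 (per cent)
print(unadjusted)                  # [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
print(adjusted)                    # []
```

The smallest p-value, 0.006, is still larger than the adjusted level of about 0.0042, which is why none of the distances remains significant after allowing for multiple testing.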
LOCREGULARMC
! Intensive !
To perform Monte Carlo tests for randomness, against the alternative hypotheses of regularity or local
regularity, using three statistics based upon nearest neighbour distances.
RUNNING THE MACRO
Calling statement
locregularmc c1 c2 k1 ;
nsim k1 (999) ;
distances m1 ;
statistics c1-c3.
Input
c1 and c2 should contain paired x and y co-ordinates for each point in the plane.
k1 should be the number of points at which observations are available.
Subcommands
nsim - Number of Monte Carlo simulations
distances - Specify a matrix within which to store the distance matrix
statistics - Specify three columns, within which to store simulated statistics for D (column 1),
S (column 2) and G (column 3).
Output
• Observed D, S and G statistics, with associated 1-sided randomization p-values
• Indices of regularity, based upon S and G.
Note that only the one-sided randomization p-values which test against the alternative hypothesis of
regularity are given.
ALTERNATIVE PROCEDURES : Standard procedures
No standard MINITAB procedure is available.
TECHNICAL DETAILS
Null hypothesis : Complete spatial randomness in the location of data points within the region.
Alternative hypothesis : Regularity (including local regularity).
Test-statistic : We use three different test-statistics. The three statistics considered here are :
D, the mean squared nearest neighbour distance.
S, the squared coefficient of variation of the squared nearest neighbour distances.
G, the ratio of the geometric mean of the squared NN distances to their arithmetic mean.
These are defined to be -
$$D = \frac{1}{n}\sum_{i=1}^{n} v_i , \qquad S = \frac{1}{(n-1)\,D^{2}}\sum_{i=1}^{n} (v_i - D)^{2} , \qquad G = \frac{1}{D}\Bigl(\prod_{i=1}^{n} v_i\Bigr)^{1/n} ,$$
where the v_i are squared nearest neighbour distances. The division by D^2 in S is what makes it a squared
coefficient of variation; the values in the worked example (S = 0.06669 with D = 0.01694) are only
possible with this scaling.
The test-statistics have the following properties :
• D > 0. Large values of D tend to indicate regularity.
• S ≥ 0. Small values of S tend to indicate regularity. For complete regularity, S = 0.
• G lies between 0 and 1. Large values of G tend to indicate regularity.
The test-statistic D is often used to test for spatial randomness, against alternative hypotheses of both
clustering and regularity, and S and G can be used for the same purpose. In this macro, we restrict
attention to the one-sided alternative hypothesis of regularity, since there are problems in interpreting
deviations of S and G from randomness in the opposite direction.
All three statistics should be sensitive to the detection of global regularity (i.e. the usual kind of
large-scale regularity), but Brown & Rothery (1978) suggest that S and G should be more effective
test-statistics than D for detecting local regularity (i.e. regularity at the small scale, although the data are
clustered at the large scale).
Simulation procedure : Assume that the sample size is n. For each Monte Carlo simulation, we simulate
n realisations from a continuous uniform distribution on the interval [XMIN, XMAX], and use these as x
co-ordinates. We also simulate n realisations from a continuous uniform distribution on the interval [YMIN,
YMAX], and use these as y co-ordinates. We pair the x and y co-ordinates randomly, and use the resulting
points as our simulated dataset.
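A sketch of how D, S and G and their one-sided p-values against regularity might be computed is given below. It is not the macro's own code: the function names are hypothetical, S is taken to be the squared coefficient of variation (the sample variance of the squared nearest neighbour distances divided by D squared), and each p-value counts the observed data once, so that the smallest possible p-value with 999 simulations is 0.001. The sketch also assumes that no two points coincide, since the geometric mean is computed through logarithms.

```python
import math
import random


def squared_nn_distances(points):
    """Squared distance from each point to its nearest neighbour."""
    return [min((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                for j, q in enumerate(points) if j != i)
            for i, p in enumerate(points)]


def dsg(points):
    """Return the statistics (D, S, G) for a set of points."""
    v = squared_nn_distances(points)
    n = len(v)
    d = sum(v) / n
    s = sum((vi - d) ** 2 for vi in v) / ((n - 1) * d * d)   # assumed D^2 scaling
    g = math.exp(sum(math.log(vi) for vi in v) / n) / d      # geometric / arithmetic
    return d, s, g


def regularity_p_values(points, bounds, nsim, rng):
    """One-sided Monte Carlo p-values against the alternative of regularity."""
    xmin, xmax, ymin, ymax = bounds
    n = len(points)
    d_obs, s_obs, g_obs = dsg(points)
    count_d = count_s = count_g = 1          # the observed data count once
    for _ in range(nsim):
        sim = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)]
        d, s, g = dsg(sim)
        count_d += d >= d_obs                # large D suggests regularity
        count_s += s <= s_obs                # small S suggests regularity
        count_g += g >= g_obs                # large G suggests regularity
    return {"D": (d_obs, count_d / (nsim + 1)),
            "S": (s_obs, count_s / (nsim + 1)),
            "G": (g_obs, count_g / (nsim + 1))}


# Hypothetical example: a perfectly regular 3 x 3 grid on the unit square.
grid = [(0.5 * i, 0.5 * j) for i in range(3) for j in range(3)]
result = regularity_p_values(grid, (0.0, 1.0, 0.0, 1.0), 99, random.Random(7))
```

For the perfectly regular grid, S is 0 and G is 1, so the p-values based on S and G take the smallest value available with 99 simulations.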
Indices of regularity
Brown and Rothery (1978) suggest that suitable indices of regularity / local regularity might be :
IG = sqrt(1 - G)
IS = sqrt(S).
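As a check on these definitions, the two indices can be recomputed from the observed S and G statistics reported in the worked example for the CELLS dataset. The variable names below are illustrative only.

```python
import math

# Observed statistics from the worked example (CELLS dataset).
g_obs = 0.9626
s_obs = 0.06669

index_g = math.sqrt(1 - g_obs)   # IG = sqrt(1 - G)
index_s = math.sqrt(s_obs)       # IS = sqrt(S)

print(round(index_g, 4))   # 0.1934
print(round(index_s, 4))   # 0.2582
```

Both values agree with the "Indices of local regularity" printed by the macro.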
REFERENCE
BROWN, D. & ROTHERY, P. (1978), Randomness and local regularity of points in a plane,
Biometrika, 65, pp. 115-122.
WORKED EXAMPLE FOR LOCREGULARMC
Name of dataset
CELLS
Description
The data record the location of the centres of 42 biological cells within a fixed square of known size. Data
have been scaled so that x and y co-ordinates must lie between 0 and 1.
Source
DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London (p. 1).
Original sources
RIPLEY, B.D. (1977), Modelling spatial patterns (with Discussion), JRSS series B, 39, pp. 172-212.
CRICK, F.H.C. & LAWRENCE, P.A. (1975), Compartments and polyclones in insect development, Science,
189, pp. 340-347.
Data
Number of observations = 42
Number of variables = 2
For each point, the x-value (top) and y-value (bottom) are given.
0.350 0.062 0.938 0.462 0.462 0.737 0.800 0.337 0.350 0.637 0.325
0.025 0.362 0.400 0.750 0.900 0.237 0.387 0.750 0.962 0.050 0.287
0.350 0.737 0.487 0.212 0.150 0.525 0.625 0.825 0.650 0.725 0.237
0.600 0.687 0.087 0.337 0.500 0.650 0.950 0.125 0.362 0.512 0.787
0.775 0.450 0.562 0.862 0.237 0.337 0.987 0.775 0.087 0.900 0.862
0.025 0.287 0.575 0.637 0.150 0.462 0.512 0.850 0.187 0.262 0.525
0.637 0.575 0.600 0.175 0.175 0.400 0.462 0.062 0.900
0.812 0.212 0.475 0.650 0.912 0.162 0.425 0.750 0.775
Plot
[Figure : scatterplot of y against x; both axes run from 0 to 1.]
Worksheet
C1 X-co-ordinates
C2 Y-co-ordinates
Aims of analysis
To investigate whether the distribution of cells within the study region is random.
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Cells.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Cells.MTW
# Worksheet was saved on 24/08/01 15:00:13
Results for: Cells.MTW
MTB > % N:\resampling\library\locregularmc c1 c2 42 ;
SUBC> nsim 999 ;
SUBC> distances m1 ;
SUBC> statistics c4-c6.
Executing from file: N:\resampling\library\locregularmc.MAC
What are the minimum and maximum possible x co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 1
What are the minimum and maximum possible y co-ordinates ?
* NOTE * Please enter 2 values (min and max), then press return.
DATA> 0 1
Monte-Carlo tests for local regularity
of points in a fixed rectangular plane
Data Display
Observed D statistic 0.01694
Randomization p-value 0.0010
Data Display
Observed S statistic 0.06669
Randomization p-value 0.0010
Data Display
Observed G statistic 0.9626
Randomization p-value 0.001000
Indices of local regularity
Data Display (WRITE)
Index based on G 0.1934
Index based on S 0.2582
* NOTE * For further details, see
BROWN, D. & ROTHERY, P. (1978), Randomness
and local regularity of points in a plane,
Biometrika, 65, pp. 115-122.
Modified worksheet
M1 A 42*42 matrix of distances between sample points
C4 A column of 999 D statistics, one for each simulated dataset
C5 A column of 999 S statistics, one for each simulated dataset
C6 A column of 999 G statistics, one for each simulated dataset
Discussion
All three statistics have picked up on the obvious (global) regularity in the dataset, with p-values of 0.001
(the minimum possible p-value for 999 randomizations) in all cases.
7
OTHER MACROS
In the course of developing the macro library, we also created two further routines, unrelated to
resampling methods.
DIFFMATRIX
To extract a matrix of differences from a column of data.
RUNNING THE MACRO
Calling statement
diffmatrix c1 m1 k1
Input
• C1 should be a column of numeric data. Missing values are not allowed.
• M1 should be an empty matrix in which the differences are to be stored.
• K1 should be the number of observations (equal to the length of C1).
Output
A matrix of differences.
If xi is the ith element of the input, then the ijth element of the output matrix is equal to (xi - xj).
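The definition of the output matrix can be illustrated with a short Python sketch. The function name and the three input values are hypothetical, chosen only for illustration; this is not the macro's own code.

```python
def diff_matrix(x):
    """Matrix whose (i, j)th element is x[i] - x[j]."""
    return [[xi - xj for xj in x] for xi in x]


# Hypothetical input column of three values.
m = diff_matrix([9, 6, 5])
for row in m:
    print(row)
# [0, 3, 4]
# [-3, 0, 1]
# [-4, -1, 0]
```

The result is always antisymmetric, with zeros on the diagonal, since (xi - xj) = -(xj - xi).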
WORKED EXAMPLE FOR DIFFMATRIX
Data
WALES (see SPATAUTORAN)
Aims of analysis
To compute the matrix of differences between population change ranks for each county.
Randomization procedure
MTB > Retrieve "N:\resampling\Examples\Wales.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Wales.MTW
# Worksheet was saved on 15/08/01 12:31:16
Results for: Wales.MTW
MTB > % N:\resampling\library\diffmatrix c3 m5 13.
Executing from file: N:\resampling\library\diffmatrix.MAC
MTB > print m5
Data Display
Matrix M5
   0    3    4    2    5    1   -2   -1    7   -4    6   -3    8
  -3    0    1   -1    2   -2   -5   -4    4   -7    3   -6    5
  -4   -1    0   -2    1   -3   -6   -5    3   -8    2   -7    4
  -2    1    2    0    3   -1   -4   -3    5   -6    4   -5    6
  -5   -2   -1   -3    0   -4   -7   -6    2   -9    1   -8    3
  -1    2    3    1    4    0   -3   -2    6   -5    5   -4    7
   2    5    6    4    7    3    0    1    9   -2    8   -1   10
   1    4    5    3    6    2   -1    0    8   -3    7   -2    9
  -7   -4   -3   -5   -2   -6   -9   -8    0  -11   -1  -10    1
   4    7    8    6    9    5    2    3   11    0   10    1   12
  -6   -3   -2   -4   -1   -5   -8   -7    1  -10    0   -9    2
   3    6    7    5    8    4    1    2   10   -1    9    0   11
  -8   -5   -4   -6   -3   -7  -10   -9   -1  -12   -2  -11    0
MISSING
To remove missing values from a number of columns of data.
RUNNING THE MACRO
Calling statement
missing c1-cN ;
type k1 (1).
Input
c1-cN
Any number of columns containing numeric data.
Missing values : Allowed.
Subcommand
type - determines how missing values are treated. If type = 1 (default), then missing values are removed
separately from each column. If type = 2, then if any row contains missing values in one or more
columns, the entire row is removed. If type = 2, then the columns c1-cN must all be of the same length.
Output
The columns c1-cN are changed on the worksheet, to omit missing values.
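The difference between the two values of "type" can be illustrated with a short Python sketch. This is not the macro's own code: the function name is hypothetical, missing values are represented by None, and (unlike the macro, which changes the worksheet columns) the sketch returns new lists. The example uses the first three puffin observations, one of which has a missing length.

```python
def remove_missing(columns, type_=1):
    """Remove missing values (None) column by column (1) or row by row (2)."""
    if type_ == 1:
        # Each column is treated separately, so lengths may differ afterwards.
        return [[v for v in col if v is not None] for col in columns]
    if type_ == 2:
        if len({len(col) for col in columns}) > 1:
            raise ValueError("type 2 requires columns of equal length")
        # Keep only the rows in which every column has a value.
        kept = [row for row in zip(*columns) if all(v is not None for v in row)]
        return [[row[j] for row in kept] for j in range(len(columns))]
    raise ValueError("type must be 1 or 2")


sex = [1.0, 2.0, 2.0]
length = [28.2, 28.5, None]      # the third length is missing
depth = [35.5, 32.6, 30.6]

print(remove_missing([sex, length, depth], type_=1))
# [[1.0, 2.0, 2.0], [28.2, 28.5], [35.5, 32.6, 30.6]]
print(remove_missing([sex, length, depth], type_=2))
# [[1.0, 2.0], [28.2, 28.5], [35.5, 32.6]]
```

With type 1 the columns no longer line up row by row, so type 2 is the safer choice whenever the columns hold measurements on the same individuals.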
WORKED EXAMPLE FOR MISSING
Name of dataset
PUFFIN
Description
The data concern puffin beak measurements, and derive from a study of St. Kilda puffin populations in
1991-93. Beak length and depth are recorded, along with sex (1 = male, 2 = female).
Source
Our own unpublished data (Centre for Ecology and Hydrology).
Data
Number of observations = 41
Number of variables = 3
For each observation, sex (top), length (middle) and depth (bottom) are given.
1.0 2.0 2.0 1.0 1.0 1.0 2.0 1.0 2.0 1.0 1.0 1.0 2.0
28.2 28.5 * 27.1 27.8 28.8 28.0 29.0 27.5 28.0 29.7 26.6 28.0
35.5 32.6 30.6 34.3 37.6 39.0 33.8 34.9 29.1 36.9 35.9 34.4 34.0
1.0 2.0 1.0 1.0 2.0 2.0 2.0 2.0 1.0 1.0 2.0 2.0 2.0
28.2 29.0 27.7 28.0 27.6 29.3 27.5 27.2 29.1 29.5 30.4 29.1 27.8
34.6 34.7 34.0 37.0 33.2 31.4 32.5 32.6 34.5 34.5 32.8 34.0 33.8
2.0 1.0 1.0 2.0 1.0 2.0 1.0 2.0 1.0 1.0 2.0 1.0 2.0
27.6 30.0 27.6 26.6 28.8 28.0 30.5 29.2 28.8 29.9 28.6 29.2 27.0
32.0 36.1 36.5 31.9 34.3 28.6 34.1 29.8 33.5 35.8 32.3 34.8 32.9
2.0 1.0
28.5 29.0
30.7 34.0
Minitab output and discussion
First of all, we attempt to remove missing values from the dataset separately for each variable. To do this,
we use the default for the "type" subcommand.
Welcome to Minitab, press F1 for help.
MTB > Retrieve "N:\resampling\Examples\Puffin.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Puffin.MTW
# Worksheet was saved on 04/07/01 11:43:12
Results for: Puffin.MTW
MTB > % N:\resampling\library\missing c1-c3
Executing from file: N:\resampling\library\missing.MAC
* NOTE * Some variables contain missing values, which have
been excluded.
The single missing value has been excluded from the dataset.
Now we reload the original data, and attempt to remove completely any individuals which have missing
values. To do this, we have to use 2 in the "type" subcommand.
MTB > Retrieve "N:\resampling\Examples\Puffin.MTW".
Retrieving worksheet from file: N:\resampling\Examples\Puffin.MTW
# Worksheet was saved on 04/07/01 11:43:12
Results for: Puffin.MTW
MTB > % N:\resampling\library\missing c1-c3 ;
SUBC> type 2.
Executing from file: N:\resampling\library\missing.MAC
* NOTE * Some units have one or more items of missing data,
and have been excluded.
The individual which has a missing value has been excluded from the data.
ADDITIONAL SAMPLE DATA
[Table : macros (rows) against datasets (columns). Each cell marks either the worked example for a macro or a suitable alternative dataset; the cell markings are not reproduced here.]
Datasets : Artificial, cells, colony, darwin, Earwigs, Exponential, Extinction, Fernbirds, field, hexokinase, japanese, lizards, Mandibles, months, Oregon, Orthoptera, Proloculi, puffin, redwood, sapling, Sapling2, Snail, spatial, swavesey, twoway, Wales.
Macros : ONEWAYRAN, TWOWAYRAN, TWOTRAN, TWOTUNPOOLBOOT, TWOTPOOLBOOT, CORRELATIONRAN, MEANCIBOOT, MEDIANCIBOOT, STDEVCIBOOT, ANYCIBOOT, ONEWAYRAN, TWOWAYRAN, TWOWAYREPRAN, LEVENERAN, REGRESSSIMRAN, REGRESSOBSRAN, REGRESSRESRAN, REGRESSBOOT, ACFRAN, TRENDRAN, SPATAUTORAN, MANTELRAN, MEAD4RAN, MEAD8RAN, DISTEDFMC, NEARESTMC, LOCREGULARMC.
If the user requires additional sample datasets for any particular macro, the above table suggests some
possibilities (although many of the datasets could be modified to be suitable for the application of other
macros).
KEY :
Worked example for this dataset
Suitable alternative dataset for this macro
ACCESSING THE MACROS
Our intention is that the macro library will be accessible in three ways :
• From the Minitab website : We have submitted our macros, together with documentation and
sample data, to the Minitab macro library at
http://www.minitab.com/support/macros/index.asp
If accepted as suitable, they will appear on this website shortly.
• From the CEH website : The macros will also be placed on the CEH website at
http://www.ceh.ac.uk/
For details of the exact location, contact the authors.
• On disk : If you wish to obtain a copy of the macros, please send a blank disk to either Peter Rothery
(CEH Monks Wood) or Adam Butler (Lancaster University).
CONTACT DETAILS
Further information
If you encounter any problems, or for further information, in the first instance please e-mail
[email protected] [Adam Butler]
Full contact details
Peter Rothery
CEH Monks Wood
Abbots Ripton
Huntingdon
Cambridgeshire PE28 2LS
Phone: (01487) 772448
E-mail: [email protected]
Adam Butler
Department of Mathematics and Statistics
Lancaster University
Bailrigg
Lancaster LA1 4YF
E-mail: [email protected]
ACKNOWLEDGEMENTS
Peter Rothery, for supervision of the project.
David Roy, for computer support.
Phil Croxton, for advice upon the layout of the report.
Hannah Butler, for contributing the macro MANTELRAN.
Thanks also to all authors whose data we have quoted.
REFERENCES
BROWN, D. & ROTHERY, P. (1978), Randomness and local regularity of points in a plane, Biometrika, 65,
pp. 115-122.
CAIN, A.J. & SHEPPARD, P.M. (1950), Selection in the polymorphic land snail Cepaea nemoralis,
Heredity, 4, 275-294.
CLIFF, A.D. & ORD, J.K. (1973) Spatial autocorrelation, Pion, London.
CRICK, F.H.C. & LAWRENCE, P.A. (1975), Compartments and polyclones in insect development, Science,
189, pp. 340-347.
DAVISON, A.C. & HINKLEY, D.V. (1997) Bootstrap methods and their application, CUP, Cambridge.
DIGGLE, P.J. (1983) Statistical analysis of spatial point patterns, Academic Press, London.
DRAPER, N.R. & SMITH, H. (1998) Applied regression analysis (3rd edition), John Wiley & Sons., New
York (Chapter 26).
EFRON, B. & TIBSHIRANI, R.J. (1993) An introduction to the Bootstrap, Chapman and Hall, London.
FISHER, R.A. (1935) The design of experiments, Oliver & Boyd, Edinburgh.
GENERAL REGISTER OFFICE (1961) England and Wales: Preliminary Census Report, 1961, HMSO,
London.
HARRIS, W.F. (1986) The breeding ecology of the South Island Fernbird in Otago Wetlands, PhD
Thesis, University of Otago, Dunedin, New Zealand.
HIGHAM, C.F.W., KIJNGAM, A. & MANLY, B.F.J. (1980), An analysis of prehistoric canid remains from
Thailand. Journal of Archaeological Science, 7, pp. 149-165.
MANLY, B. F. J. (1997) Randomization, bootstrap and Monte Carlo methods in biology, 2nd edn,
Chapman & Hall, London.
MCKECHNIE, S.W., EHRLICH, P.R. & WHITE, R.R. (1975), Population genetics of Euphydryas
butterflies. I. Genetic variation and the neutrality hypothesis, Genetics, 81, pp. 571-594.
MINITAB INC. (1999) MINITAB User's Guide, Release 13 for Windows, Minitab Inc., 3081 Enterprise
Drive, State College, Pennsylvania 16801-3008.
NUMATA, M. (1961), Forest vegetation in the vicinity of Choshi. Coastal flora and vegetation at Choshi,
Chiba Prefecture IV., Bull. Choshi Marine Lab. Chiba Uni., 3, pp. 28-48 [in Japanese].
POPHAM, E.J. & MANLY, B.F.J. (1969), Geographical distribution of the Dermaptera and the continental
drift hypothesis, Nature, 222, pp. 981-982.
POWELL, G.L. & RUSSELL, A.P. (1984), The diet of the eastern short-horned lizard (Phrynosoma
douglassi brevirostre) in Alberta and its relationship to sexual size dimorphism, Canadian Journal
of Zoology, 62, pp. 428-440.
POWELL, G.L. & RUSSELL, A.P. (1985), Growth and sexual size dimorphisms in Alberta populations of
the eastern short-horned lizard, Phrynosoma douglassi brevirostre, Canadian Journal of Zoology,
63, pp. 139-154.
RAUP, D.M. (1987), Mass extinctions: a Discussion, Palaeontology, 30, pp. 1-13.
REYMENT, R.A. (1982), Phenotypic evolution in a Cretaceous foraminifer, Evolution, 36, pp. 1182-1199.
RIPLEY, B.D. (1977), Modelling spatial patterns (with Discussion), JRSS series B, 39, pp. 172-212.
STRAUSS, D.J. (1975), A model for clustering, Biometrika, 62, pp. 467-475.
TER BRAAK, C.J.F. (1992), Permutation versus bootstrap significance tests in multiple regression and
ANOVA, in Bootstrapping and Related Techniques (ed. K.H. Jockel), Springer-Verlag, Berlin,
pp.79-86.