SPATIAL ANALYSIS IMPROVES SPECIES DISTRIBUTION MODELLING DURING
RANGE EXPANSION
by Paulo De Marco Jr., José Alexandre Felizola Diniz-Filho and Luis Mauricio Bini
SUPPLEMENTARY METHODS
Creating simulated species distribution data
Simulated species range data are a useful way to compare
species distribution modeling techniques, mainly because they allow control over some
important species-range properties that affect modeling efficiency (Hirzel et al. 2001;
Meynard & Quinn 2007). Because actual species distributions are unknown, the
performance of modeling methods is difficult to assess. The use of simulated data
allows us to circumvent this difficulty (Austin et al. 2006). We applied this approach to
explore the effects of the colonization-extinction mechanisms that could generate
non-equilibrium distributions and to evaluate the usefulness of Spatial Eigenvector Mapping
(see below) for improving the prediction of species distributions.
Non-equilibrium species distributions could arise under different ecological and
evolutionary scenarios. Firstly, they could arise from failures to colonize suitable areas
due to recent environmental changes (e.g. habitat destruction, abrupt climatic shifts or
presence of physical barriers) or low dispersal capacity. Non-equilibrium between species’
distributions and the environment may also occur during the initial stages of a species’ invasion.
We call this scenario “colonization-lag” non-equilibrium (CNE). In this case, although
the distribution is actually determined by the environment (e.g. climatic variables), generating
strong range cohesion, a mismatch between the actual and potential distributions is
expected due to a historical time lag. Secondly, complex colonization-extinction dynamics
within the species’ potential range, generated by local processes such as biotic
interactions or metapopulation dynamics, will appear as random noise in geographical
space. This is what we call the demographic non-equilibrium (DNE) scenario, which is
expected to disrupt range cohesion.
All simulations were based on the premise that species distribution is determined
by a “suitability” measure, defined as a linear combination of six standardized
environmental variables (see below) (figure S1). We used a two-step process to produce the
species distribution data. In the first step, the simplest way to produce a range distribution
consists of choosing a vector of means for the environmental variables and assuming that
suitability for the species follows a multivariate normal distribution around this vector. This
is similar to the “Gaussian Threshold” used by Meynard & Quinn (2007) to simulate distributions of
artificial species. We assumed that suitability is not affected by the correlations among
the environmental variables, but only by the elements of the main diagonal of the
covariance matrix, which express the variance of each environmental variable. These
variances are directly related to the environmental tolerance of the species. Species
distributions were restricted to an envelope delimited by two standard deviation units from
the mean of each environmental variable. Thus, suitability was a continuous,
Gaussian-distributed variable that represented how close the environmental vector in a given
pixel is to the mean values.
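The first step can be illustrated with a minimal sketch. It assumes an unnormalized Gaussian kernel with a diagonal covariance matrix, so that suitability equals 1 at the vector of means and falls towards 0 away from it; the exact functional form used in the simulations is not given in the text, and the names `suitability`, `optimum`, `tolerance` and `envelope_sd` are hypothetical.

```python
import math


def suitability(env, optimum, tolerance, envelope_sd=2.0):
    """Gaussian suitability of one cell (illustrative sketch only).

    env       -- standardized environmental values of the cell
    optimum   -- vector of means chosen for the simulated species
    tolerance -- standard deviation of each variable (diagonal of the
                 covariance matrix; correlations are ignored)
    """
    squared_distance = 0.0
    for value, mean, sd in zip(env, optimum, tolerance):
        z = (value - mean) / sd
        if abs(z) > envelope_sd:
            # Outside the two-standard-deviation envelope: unsuitable.
            return 0.0
        squared_distance += z * z
    # Assumed form: unnormalized Gaussian kernel, equal to 1 at the optimum.
    return math.exp(-0.5 * squared_distance)
```

Applying this function to the six standardized variables of every grid cell would give a suitability surface of the kind shown in figure S1.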
This index of suitability was estimated for all 2545 cells covering the Cerrado
Biome and was based on four climatic variables (temperature: annual mean and seasonality;
precipitation: annual mean and seasonality) derived from the WorldClim database
(http://www.worldclim.org/), and two topographic variables (altitude and slope) derived
from the Hydro-1K global digital elevation model
(http://edcdaac.usgs.gov/gtopo30/hydro/). The Cerrado is a savanna-like vegetation that
originally covered 2 million km² in the centre of Brazil (Bridgewater et al. 2004), and this
region was used here for computational convenience only.
In the second step, we used cellular automata (CA) modeling to produce a more
realistic species’ range expansion process that allows for the dynamics of local colonization
and extinction. CA allow one to run simulations of the dispersal process in a spatially explicit
context, where the spatial variability of suitability, coupled with stochastic colonization and
extinction processes, could generate range distributions at non-equilibrium with the
environment. An initial seed population was introduced in a randomly chosen grid cell
among those with higher suitability, and the dynamic process of colonization and extinction
was initiated based on two simple rules (colonization-lag non-equilibrium, CNE): (a) a species
automatically colonizes a cell if any neighboring cell was successfully colonized at time t-1;
(b) extinction probability is linearly and negatively related to suitability. A second
scenario (demographic non-equilibrium, DNE) was created by adding a suitability-independent
persistence probability that increased linearly with the proportion of neighboring
cells successfully colonized at time t-1. Thus, in CNE the non-equilibrium appears only due
to the species’ range expansion process and is fully determined by environmental drivers at a
given time step. CNE is expected to decrease with increasing range expansion and to
disappear when a species occupies all the suitable cells. On the other hand, DNE is
continuously caused by stochastic colonization and extinction processes, independently of
the range expansion process. Differences between CNE and DNE range development under
the same suitability data used in this study, for 100 time cycles, are shown in two AVI files
also available as supplementary material.
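The two rule sets can be sketched as a single update step of a cellular automaton. This is an illustration under several assumptions that the text does not specify: a rectangular grid, an eight-cell neighborhood, suitability scaled to [0, 1], an extinction probability of 1 - suitability, and, for DNE, a "rescue" draw with probability equal to the fraction of occupied neighbors. The function names are hypothetical.

```python
import random

# Assumption: eight-cell (Moore) neighborhood.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]


def occupied_neighbour_fraction(occ, r, c):
    """Proportion of existing neighbor cells occupied at time t-1."""
    rows, cols = len(occ), len(occ[0])
    total = hits = 0
    for dr, dc in NEIGHBOURS:
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            total += 1
            hits += occ[rr][cc]
    return hits / total if total else 0.0


def step(occ, suit, rng, scenario="CNE"):
    """One time cycle; occ holds 0/1 occupancy, suit holds suitability in [0, 1]."""
    rows, cols = len(occ), len(occ[0])
    new = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            frac = occupied_neighbour_fraction(occ, r, c)
            # Rule (a): a cell is colonized when any neighbor was occupied.
            if not (occ[r][c] or frac > 0):
                continue
            # Rule (b): extinction probability declines linearly with
            # suitability (assumed here to be 1 - suitability).
            survives = rng.random() >= 1.0 - suit[r][c]
            if not survives and scenario == "DNE":
                # Assumed DNE rule: suitability-independent persistence,
                # increasing linearly with the occupied-neighbor fraction.
                survives = rng.random() < frac
            new[r][c] = int(survives)
    return new
```

Starting from a single seed cell and calling `step` repeatedly with `random.Random(seed)` gives a sequence of ranges that can be compared between the two scenarios.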
Figure S1. Spatial variability in suitability across the Cerrado Biome used to constrain
species distributions during the process of range expansion. Suitability of each cell (2545 in
total) was given by the combination of six environmental variables.
Modeling method
The geographic distributions of the simulated species (under the different
scenarios and at each time cycle) were randomly sampled to obtain 100 occurrence points.
These points were then used as the input data for the species distribution modeling using
Maxent (see below).
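The sampling step can be sketched as follows. Sampling without replacement, and taking every occupied cell when the range holds fewer than 100 cells, are assumptions made for this illustration; the function name is hypothetical.

```python
import random


def sample_occurrences(occupied_cells, n_points=100, seed=None):
    """Draw occurrence points at random from the occupied cells."""
    rng = random.Random(seed)
    cells = list(occupied_cells)
    # Assumption: small ranges contribute all of their cells.
    k = min(n_points, len(cells))
    return rng.sample(cells, k)
```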
Our modeling approach was based on the Maximum Entropy principle
(Phillips et al. 2006) as implemented in the program Maxent version 2.3. Maxent is a
machine-learning technique that estimates the probability distribution that is closest to
uniform (i.e., which has the maximum entropy) under the constraint that the expected value
of each environmental variable matches its empirical average over the occurrence
data. Our primary interest here is in the general process of modeling range expansion and
the use of Spatial Eigenvector Mapping; therefore, the choice of this method was based
solely on its ease of use.
The process of model fitting in Maxent, as in other SDM applications, involves the
estimation of parameters and some optimization criteria. Recommended default values
were used for the convergence threshold (10⁻⁵) and the regularization parameter (1) (Phillips &
Dudik 2008), except for the number of iterations, which was set to 1000.
Spatial variables derived from Spatial Eigenvector Mapping (see Diniz-Filho &
Bini 2005; Griffith & Peres-Neto 2006; Dormann et al. 2007, for recent reviews) were
calculated in the software package SAM (Spatial Analysis in Macroecology; Rangel et al.
2006) and added as predictors in Maxent models. In this method, the eigenvectors of a
double-centered truncated geographic distance matrix can be used in SDM as new
orthogonal predictors that capture, at different scales, the geometry of the studied area. We
defined truncation distances based on the intercept of Moran’s I correlograms for the
Maxent residuals of the models estimated only with environmental predictors. We selected
the first 5 eigenvectors to include in Maxent modeling, after testing the successive addition
of eigenvectors and checking for model stability in the final steps of the simulations.
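The extraction of eigenvectors from a double-centered truncated distance matrix can be sketched with NumPy. This follows one common recipe (principal coordinates of neighbour matrices) and is not the SAM implementation; replacing distances beyond the truncation distance by four times that distance is the conventional choice in that recipe and is an assumption here, as is the function name.

```python
import numpy as np


def spatial_eigenvectors(coords, truncation, n_vectors=5):
    """Leading eigenvectors of a double-centered truncated distance matrix."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Truncate: distances beyond the threshold get one large constant
    # (4 x threshold is an assumed, conventional value).
    dist[dist > truncation] = 4.0 * truncation
    a = -0.5 * dist ** 2
    n = len(coords)
    centering = np.eye(n) - np.ones((n, n)) / n
    b = centering @ a @ centering  # double centering
    eigvals, eigvecs = np.linalg.eigh(b)
    # eigh returns ascending eigenvalues; keep the largest n_vectors.
    order = np.argsort(eigvals)[::-1][:n_vectors]
    return eigvals[order], eigvecs[:, order]
```

The returned columns are mutually orthogonal and could be appended to the environmental layers as extra predictors, one value per grid cell.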
Model evaluation
Evaluation of species distribution models depends on the choice of a
minimum threshold to convert predicted probabilities of occurrence into species’ presence or
absence. A ROC (receiver operating characteristic) curve is produced by plotting
sensitivity against the complement of specificity for different threshold values (Liu et al.
2005; Manel et al. 2001). The ROC procedure allows one to find an optimum threshold by
identifying the value that maximizes the sum of specificity and sensitivity (Manel et al.
2001), and is regarded as one of the best methods for threshold determination in SDM (Liu
et al. 2005). This procedure relies on the existence of test data that include actual
presences and absences to estimate a confusion matrix. Maxent can nevertheless address
this requirement by producing a set of random test points (Phillips et al. 2006). This approach
has the advantage of using a large training dataset, and it includes a proportion of false
absences in the process of model evaluation. We used a set of 10000 random test points,
which is recommended for the asymptotic properties of AUC (Phillips & Dudik 2008).
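The threshold selection described above can be sketched as a search over candidate thresholds. This illustration assumes that a cell is predicted present when its score is greater than or equal to the threshold, and that every observed score is a candidate; the function name is hypothetical.

```python
def best_threshold(scores, labels):
    """Return (threshold, sensitivity + specificity) maximizing the sum.

    scores -- predicted probabilities of occurrence
    labels -- 1 for presence, 0 for absence (or random test point)
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = (None, -1.0)
    for t in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        total = true_pos / positives + true_neg / negatives
        if total > best[1]:
            best = (t, total)
    return best
```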
The presence/absence prediction of Maxent was compared with the “real”
distribution for each simulation using the Kappa statistic. Kappa is considered one
of the most useful measures for model evaluation and has the advantage of taking both
omission and commission errors into account (Liu et al. 2005; Pearson et al. 2006). As a
measure of the relative importance of spatial filters in improving model fit, we used the
difference in Kappa values (ΔKappa) between the models with both environmental
predictors and spatial filters and the models with only environmental predictors.
ΔKappa values were compared between CNE and DNE models, controlling for species
range size, using an Analysis of Covariance (ANCOVA) (figure S2). This analytical design was
chosen due to the known dependence of Kappa on species prevalence (Allouche et al.
2006).
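The Kappa comparison can be sketched with the standard Cohen's Kappa formula applied to binary presence/absence vectors. The function names are hypothetical, and returning 0 when chance agreement equals 1 is a convention chosen for this sketch.

```python
def cohen_kappa(observed, predicted):
    """Cohen's Kappa for two binary (0/1) presence/absence vectors."""
    n = len(observed)
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # commission errors
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # omission errors
    p_observed = (tp + tn) / n
    p_expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    if p_expected == 1.0:
        return 0.0  # Kappa is undefined here; 0 is an assumed convention.
    return (p_observed - p_expected) / (1.0 - p_expected)


def delta_kappa(observed, pred_env_spatial, pred_env_only):
    """Gain in Kappa from adding spatial filters to the environmental model."""
    return cohen_kappa(observed, pred_env_spatial) - cohen_kappa(observed, pred_env_only)
```

A positive value of `delta_kappa` indicates that the model with spatial filters agrees better with the simulated distribution than the model with environmental predictors only.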
Figure S2. Relationships between ΔKappa (gain in Kappa values of models that included
spatial filters in relation to models with only environmental variables) and the geographical
range size, for CNE (filled squares) and DNE (open squares) models.
REFERENCES
Allouche, O., Tsoar, A., and Kadmon, R. 2006 Assessing the accuracy of species
distribution models: prevalence, kappa and the true skill statistic (TSS). J. Appl.
Ecol. 43, 1223-1232.
Austin, M. P., Belbin, L., Meyers, J. A., Doherty, M. D., and Luoto, M. 2006 Evaluation of
statistical models used for predicting plant species distributions: Role of artificial
data and theory. Ecological Modelling 199, 197-216.
Bridgewater, S., Ratter, J. A., and Ribeiro, J. F. 2004 Biogeographic patterns, beta diversity
and dominance in the cerrado biome of Brazil. Biodiversity and Conservation 13,
2295-2318.
Hirzel, A. H., Helfer, V., and Metral, F. 2001 Assessing habitat-suitability models with a
virtual species. Ecological Modelling 145, 111-121.
Liu, C. R., Berry, P. M., Dawson, T. P., and Pearson, R. G. 2005 Selecting thresholds of
occurrence in the prediction of species distributions. Ecography 28, 385-393.
Manel, S., Williams, H. C., and Ormerod, S. J. 2001 Evaluating presence-absence models
in ecology: the need to account for prevalence. J. Appl. Ecol. 38, 921-931.
Meynard, C. N. and Quinn, J. F. 2007 Predicting species distributions: a critical comparison
of the most common statistical models using artificial species. J. Biogeogr.
Pearson, R. G., Thuiller, W., Araujo, M. B., Martinez-Meyer, E., Brotons, L., McClean, C.,
Miles, L., Segurado, P., Dawson, T. P., and Lees, D. C. 2006 Model based
uncertainty in species range prediction. J. Biogeogr. 33, 1704-1708.
Phillips, S. J., Anderson, R. P., and Schapire, R. E. 2006 Maximum entropy modeling of
species geographic distributions. Ecological Modelling 190, 231-259.
Phillips, S. J. and Dudik, M. 2008 Modeling of species distributions with Maxent: new
extensions and a comprehensive evaluation. Ecography 31, 161-175.