Download Supplementary Information (doc 45K)

Supplementary On-line material

Multivariate linear regression approach

We modeled the multivariate response of the bacterial community to a matrix of environmental variables and spatial covariates using the regression approach and variance partitioning technique proposed by ter Braak (1986), Legendre and Legendre (1998), Borcard and Legendre (2002) and Borcard and colleagues (1992; 2004). This technique quantifies the amount of variation attributable exclusively to each set of environmental or spatial correlates and offers observational evidence on the relative importance of the different processes that determine community structure (e.g. Cottenie, 2005; but see Smith and Lundholm, 2010 for a critical review). Before variance partitioning, we de-trended the linear effects of latitude using canonical correspondence analysis (CCA; Legendre and Legendre, 1998). Borcard and colleagues (1992) originally proposed the use of polynomials of several degrees for describing the main spatial trends. A correspondence analysis basis is better suited to a raw species contingency table since it preserves the chi-square distance, while the residuals from CCA can be analyzed by redundancy analysis (RDA), which is based on principal component analysis (Legendre and Legendre, 1998).

Recently, more objective techniques based on extracting the principal coordinates of neighbor matrices (i.e., matrices of geographical distances) have been demonstrated to be more appropriate (Borcard et al., 2004; Dray et al., 2006). Eigenvectors are extracted according to a hierarchy that accounts for spatial patterns at progressively finer scales. Model selection procedures (e.g. a multivariate extension of the AIC criterion) allow one to select the best linear combination of eigenvectors, maximizing the correlation with the data while minimizing the number of vectors (Dray et al., 2006). The earlier and widely used approach to eigenvector extraction is known as Principal Coordinate Analysis of Neighbor Matrices, or PCNM (Borcard et al., 2004).
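As an illustration of the eigenvector extraction described above, the following Python sketch follows the usual PCNM recipe (Borcard and Legendre, 2002): truncate the distance matrix at the longest edge of the minimum spanning tree, replace larger distances by four times that threshold, and run a principal coordinate analysis on the result. It is not the code used for this study, whose analyses were run in R. NumPy, the function names `mst_longest_edge` and `pcnm`, the tolerance, and the toy transect are all assumptions made for the example.

```python
import numpy as np


def mst_longest_edge(dist):
    """Longest edge of the minimum spanning tree (Prim's algorithm)."""
    n = dist.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].astype(float).copy()  # cheapest link from the tree to each site
    longest = 0.0
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        longest = max(longest, float(candidates[j]))
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return longest


def pcnm(dist, tol=1e-8):
    """Return PCNM eigenvectors (scaled) and their positive eigenvalues."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    threshold = mst_longest_edge(dist)
    # Keep neighbor distances; set all larger distances to 4 x threshold.
    trunc = np.where(dist > threshold, 4.0 * threshold, dist)
    np.fill_diagonal(trunc, 0.0)
    # Principal coordinate analysis: double-center -0.5 * D^2.
    centering = np.eye(n) - np.ones((n, n)) / n
    gower = centering @ (-0.5 * trunc ** 2) @ centering
    vals, vecs = np.linalg.eigh(gower)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    keep = vals > tol * vals[0]  # eigenvectors with positive eigenvalues only
    return vecs[:, keep] * np.sqrt(vals[keep]), vals[keep]


# Toy example: ten sites on a regular transect (hypothetical data).
sites = np.arange(10.0)
toy_dist = np.abs(sites[:, None] - sites[None, :])
vectors, eigenvalues = pcnm(toy_dist)
print(vectors.shape)
```

The first columns of `vectors` describe broad-scale patterns and later columns progressively finer ones. Selecting a subset of columns (for example by a multivariate AIC) is a separate step that this sketch does not cover.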
Dray and colleagues (2006) have generalized this method by showing that PCNM is a special case of the more general class of Moran’s Eigenvector Mapping (MEM), which also consists of extracting the eigenvectors of a sample distance matrix. However, in the case of MEM, the distance matrix is obtained by multiplying a connectivity matrix and a weighting matrix. In our case, connectivity matrices such as Gabriel and Delaunay graphs (see Dray et al., 2006 for analytical details) are not trivial to estimate, since the global scale of our study implies that the geometry of our sampling coordinates is spherical. This system of coordinates therefore cannot be approximated by a flat Euclidean surface. However, the extraction of eigenvectors such as PCNMs is not problematic if the sample distance matrix is based on the great circle distance (the shortest path between two points on a sphere). We thus used this approach. After deriving PCNMs, we followed Dray et al. (2006) and selected the set of eigenvectors that best accounts for autocorrelation. The linear combination of PCNMs was then used as a predictor of the species table.

We had two types of spatial variation. The first is represented by the “Continent” effect. This categorical factor includes large, global patterns such as latitudinal gradients in species distribution and possible biogeographical effects that are reflected by the relative position of the continents on the globe. One such effect, for example, could be a biogeographical affinity between South America and Antarctica, which has been documented for many taxa belonging to different phyla and kingdoms (Chown and Convey, 2006). This type of spatial correlation depends on historical biogeography and should be separated from the second type, the patterns of spatial autocorrelation described by PCNMs, which are presumed to measure the effect of dispersal processes and unmeasured environmental variables (but see Smith and Lundholm, 2010 for a critical review).
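To make the great circle distance mentioned above concrete, here is a minimal Python sketch that builds a distance matrix from latitude and longitude using the haversine formula. The haversine formulation, the mean Earth radius of 6371 km, the function names, and the example coordinates are assumptions for illustration; the text does not say which formula or software was used for this step.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)


def great_circle_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Shortest distance over a sphere between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # min() guards against rounding pushing h slightly above 1.
    return 2 * radius * math.asin(min(1.0, math.sqrt(h)))


def distance_matrix(coords):
    """Symmetric matrix of great circle distances for (lat, lon) pairs."""
    n = len(coords)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = great_circle_km(*coords[i], *coords[j])
            dist[i][j] = dist[j][i] = d
    return dist


# Hypothetical sampling sites as (latitude, longitude) in degrees.
example_sites = [(-62.0, -58.0), (-54.8, -68.3), (78.2, 15.6)]
example_dist = distance_matrix(example_sites)
```

A matrix built this way can be passed directly to a PCNM routine, because that procedure needs only pairwise distances and not planar coordinates.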
Thus, PCNM eigenvectors allow one to account for patterns at multiple spatial scales, while a variance partitioning technique based on three matrices (Legendre and Legendre, 1998) allows one to quantify the unique contributions of Climate, Continent and spatial patterns. These are, respectively, Climate|Continent, PCNM (the Climate effect after accounting for Continent and spatial effects), Continent|Climate, PCNM (the Continent effect after accounting for Climate and spatial effects) and PCNM|Climate, Continent (the spatial effect after accounting for Climate and Continent effects). The calculations used for quantifying the variance components were based on the function varpart of the R package vegan (Oksanen et al., 2009), which performs variance partitioning of a multivariate system with multiple tables. Results were used for generating Table 1 of the main paper. Figure 3 of the main paper was based on Continent|Climate, PCNMs.

Null Model Analysis

We performed 5000 randomizations of the original data matrix in order to create null expectations for the C-score, an index that increases with increasing species segregation (i.e. when species tend to avoid each other). This approach reduces the frequency of Type I errors while retaining satisfactory statistical power (Gotelli, 2000; Gotelli and Entsminger, 2001). Random matrices had the same number of species and samples as the original matrix. We then performed pairwise comparisons between the observed and expected C-scores.

Neutral Model

We estimated the neutral diversity (θ) and immigration (I) parameters with the sampling formula for multiple samples by Etienne (2007) and a maximum likelihood approach. Local simulated communities can then be predicted by the neutral model from the metacommunity that corresponds to the estimates of θ and I. Using indices that quantify ecological distance (e.g.
Jaccard and Bray-Curtis indices), actual and simulated communities can be compared in terms of their level of dissimilarity, which thus offers a dynamical null hypothesis. In order to estimate θ and I for our samples and simulate the neutral local communities, we used the PARI/GP codes given in Etienne (2007). Local communities are assumed to be partially isolated from the metacommunity. These communities are subjected to immigration, in accordance with Hubbell’s neutral model (2001). The rate of immigration, originally called “m” by Hubbell (2001), can be expressed in terms of the number of individuals that are immigrants to the local community. The immigration term is

$$m = \frac{I}{I + J - 1}$$

where I is the fundamental dispersal limitation parameter and J is the total community size. Local diversity increases with immigration and with the gamma diversity of the metacommunity, which is measured by the diversity parameter θ. We observed very low values of θ (between 2 and 15) and I (around 1% of total community abundance), which suggests that local communities are almost completely isolated. In the neutral theory, metacommunity diversity depends on a speciation parameter, which is implicitly embodied in θ (Hubbell, 2001; Etienne, 2007).

When sample size is small, the estimate of I is problematic and multiple likelihood maxima may appear. However, the formula for multiple samples is robust to this problem (Etienne, 2007). Accordingly, our likelihood surfaces revealed one peak only. We did not use the likelihood of the model for direct comparison with other models of species abundance distribution such as the log-normal (e.g. Volkov et al., 2003). Instead, we used the estimates of θ and I to create a null expectation in terms of community dissimilarities (McGill, 2003; Gotelli and McGill, 2006; Dornelas et al., 2006; Etienne, 2007) under the assumption that local community structure can be approximated by demographic stochasticity and limited dispersal only.
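The relation between the dispersal parameter I and Hubbell’s immigration probability m, m = I / (I + J - 1), can be checked numerically with a short Python sketch. The function names and the example values (I = 10 in a community of J = 1000 individuals, roughly the 1% ratio reported above) are illustrative assumptions, not estimates from this study.

```python
def immigration_probability(I, J):
    """Hubbell's m from the dispersal parameter I and community size J."""
    if I < 0 or J < 1 or I + J - 1 <= 0:
        raise ValueError("need I >= 0, J >= 1 and I + J - 1 > 0")
    return I / (I + J - 1)


def dispersal_parameter(m, J):
    """Inverse relation: I from m and J, valid for 0 <= m < 1."""
    if not 0 <= m < 1:
        raise ValueError("m must lie in [0, 1)")
    return m * (J - 1) / (1 - m)


# Illustrative values only: I is about 1% of the community size.
m_example = immigration_probability(10, 1000)
I_back = dispersal_parameter(m_example, 1000)
print(m_example, I_back)
```

Because m stays close to I / J whenever I is much smaller than J, an I of about 1% of community abundance corresponds to an immigration probability of about 0.01 per replacement event.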
We compared observed and expected levels of dissimilarity by a bootstrapped t-test. We tried several dissimilarity indices (e.g. Bray-Curtis, Gower, Jaccard) proposed by Anderson and colleagues (2006). Since all tested indices led to the same conclusion, in the main paper we reported results based on the Jaccard dissimilarity, which is

$$d_{\mathrm{Jaccard}} = \frac{b + c}{a + b + c}$$

where a is the number of species shared by two communities (B and C), b is the number of species in B that do not occur in C, and c is the number of species in C that do not occur in B.

References

Anderson MJ, Ellingsen KE, McArdle BH. (2006). Multivariate dispersion as a measure of beta diversity. Ecol Lett 9: 683–693.
Borcard D, Legendre P, Drapeau P. (1992). Partialling out the spatial component of ecological variation. Ecology 73: 1045–1055.
Borcard D, Legendre P. (2002). All-scale spatial analysis of ecological data by means of principal coordinates of neighbour matrices. Ecol Model 153: 51–68.
Borcard D, Legendre P, Avois-Jacquet C, Tuomisto H. (2004). Dissecting the spatial structure of ecological data at multiple scales. Ecology 85: 1826–1832.
Chown SL, Convey P. (2006). Antarctica as a global indicator. In: Bergstrom DM, Convey P, Huiskes HL. (eds). Biogeography: Trends in Antarctic terrestrial and limnetic ecosystems. Springer: Dordrecht, pp. 55–70.
Cottenie K. (2005). Integrating environmental and spatial processes in ecological community dynamics. Ecol Lett 8: 1175–1182.
Dornelas M, Connolly SR, Hughes TP. (2006). Coral reef diversity refutes the neutral theory of biodiversity. Nature 440: 80–82.
Dray S, Legendre P, Peres-Neto PR. (2006). Spatial modelling: a comprehensive framework for principal coordinate analysis of neighbour matrices (PCNM). Ecol Model 196: 483–493.
Etienne RS. (2007). A neutral sampling formula for multiple samples and an exact test of neutrality. Ecol Lett 10: 608–618.
Gotelli NJ. (2000). Null model analysis of species co-occurrence patterns. Ecology 81: 2606–2621.
Gotelli NJ, Entsminger GL. (2001). Swap and fill algorithms in null model analysis: rethinking the Knight's Tour. Oecologia 129: 281–291.
Gotelli NJ, McGill B. (2006). Null versus neutral models: what's the difference? Ecography 29: 793–800.
Hubbell SP. (2001). The Unified Neutral Theory of Biodiversity and Biogeography. Princeton Univ. Press: Princeton.
Legendre P, Legendre L. (1998). Numerical Ecology. Elsevier: Amsterdam.
McGill BJ. (2003). Strong and weak tests of macroecological theory. Oikos 102: 679–685.
Oksanen J, Kindt R, Legendre P, O’Hara RB, Simpson GL, Solymos P, Stevens MH, Wagner H. (2009). vegan: Community Ecology Package. R package version 1.15-4. http://CRAN.R-project.org/package=vegan
Smith TW, Lundholm JT. (2010). Variation partitioning as a tool to distinguish between niche and neutral processes. Ecography 33: 648–655.
ter Braak CJF. (1986). Canonical Correspondence Analysis: a new eigenvector technique for multivariate direct gradient analysis. Ecology 67: 1167–1179.
Volkov I, Banavar JR, Hubbell SP, Maritan A. (2003). Neutral theory and relative species abundance in ecology. Nature 424: 1035–1037.
