QUANTITATIVE PALAEOECOLOGY
Lecture 4. Quantitative Environmental Reconstructions
BIO-351

CONTENTS
Introduction
Indicator-species approach
Assemblage approach
Mutual Climatic Range method
Probability density functions
Proxy data
General theory
Assumptions of transfer functions
Linear-based methods
Inverse linear regression
Inverse multiple linear regression
Principal components analysis regression
Segmented inverse regression
Partial least squares
Requirements – biological and statistical

CONTENTS (2)
Non-linear (unimodal) based methods
Maximum likelihood regression and calibration
Weighted averaging regression and calibration
Error estimation
Training set assessment
Reconstruction evaluation
Reconstruction validation
Examples of weighted-averaging reconstructions
Weighted-averaging – assessment
Correspondence analysis regression
Weighted-averaging partial least-squares (WA-PLS)
Pollen-climate response surfaces
Analogue-based approaches
Consensus reconstructions and smoothers
Use of artificial simulated data sets
No analogue problem
Multiple analogue problem
Multi-proxy approaches
Synthesis

INTRODUCTION
'TRANSFER FUNCTION' or 'BIOTIC INDEX'
CALIBRATION or BIOINDICATION
INDICATOR-SPECIES APPROACH = SINGLE SPECIES BIOASSAY
ASSEMBLAGE APPROACH = MULTI-SPECIES BIOASSAY
Birks H. J. B. (1995) Quantitative palaeoenvironmental reconstructions. In Statistical modelling of Quaternary science data (eds D. Maddy & J. S. Brew), Quaternary Research Association, pp. 161–254.
ter Braak C. J. F. (1995) Chemometrics and Intelligent Laboratory Systems 28, 165–180.

GRADIENT ANALYSIS AND BIOINDICATION
Relation of species to environmental variables or gradients
Gradient analysis: Environment gradient → Community
Bioindication: Community → Environment
In bioindication, use species optima or indicator values to obtain an estimate of environmental conditions or gradient values.
Calibration, bioindication, reconstruction.
INDICATOR-SPECIES APPROACH
The thermal-limit curves for Ilex aquifolium, Hedera helix, and Viscum album in relation to the mean temperatures of the warmest and coldest months. Samples 1, 2, and 3 represent samples with pollen of Ilex, Hedera, and Viscum; Hedera and Viscum; and Ilex and Hedera, respectively. From Iversen 1944.

ASSEMBLAGE APPROACH
Compare fossil assemblage with modern assemblages from known environments. Identify the modern assemblages that are most similar to the fossil assemblage and infer the past environment to be similar to the modern environment of those most similar modern assemblages.
If done qualitatively: standard approach in Quaternary pollen analysis, etc., since the 1950s.
If done quantitatively: modern analogue technique or analogue matching.

MUTUAL CLIMATIC RANGE METHOD
Grichuk et al.    USSR   1950s–1960s
Atkinson et al.   UK     1986, 1987    Coleoptera
TMAX   - mean temperature of warmest month
TMIN   - mean temperature of coldest month
TRANGE - TMAX–TMIN
Quote median values of mutual overlap and 'limits given by the extremes of overlap'.
Thermal envelopes for hypothetical species A, B, and C.
Schematic representation of the Mutual Climatic Range method of quantitative temperature reconstructions (courtesy of Adrian Walking).

ASSUMPTIONS
1. Species distribution is in equilibrium with climate.
2. Distribution data and climatic data are of the same age.
3. Species distributions are well known, with no problems of species introductions, taxonomy, or nomenclature.
4. All the suitable climate space is available for species to occur. ? Arctic Ocean, ? truncation of climate space.
5. Climate values used in MCR are the actual values where the beetle species lives in all its known localities. Climate stations tend to be at low altitudes; cold-tolerant beetles tend to be at high altitudes. ? Bias towards warm temperatures. Problems of altitude, lapse rates.
495 climate stations across the Palaearctic region from Greenland to Japan.
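The overlap step of the Mutual Climatic Range method can be sketched in a few lines. This is a minimal illustration only: the rectangular envelopes for species A, B, and C, the 1 °C grid, and every function name are assumptions made for the example, not values or code from Atkinson et al. Real envelopes are irregular and come from observed distributions.

```python
import numpy as np

# Hypothetical climate-space grid in 1 degree C steps (assumption for illustration).
TMAX = np.arange(0, 31)      # mean temperature of warmest month
TRANGE = np.arange(0, 51)    # TMAX - TMIN

def box_envelope(tmax_lo, tmax_hi, tr_lo, tr_hi):
    """Boolean mask of grid cells inside a rectangular, purely illustrative envelope."""
    t, r = np.meshgrid(TMAX, TRANGE, indexing="ij")
    return (t >= tmax_lo) & (t <= tmax_hi) & (r >= tr_lo) & (r <= tr_hi)

def mutual_range(envelopes):
    """Cells of climate space shared by every species envelope."""
    return np.logical_and.reduce(list(envelopes))

def summarise(overlap, axis_values, axis):
    """Return (lower limit, median, upper limit) of the overlap along one axis."""
    occupied = overlap.any(axis=1 - axis)   # collapse the other climate axis
    values = axis_values[occupied]
    if values.size == 0:
        return None                          # no mutual climatic range exists
    return float(values.min()), float(np.median(values)), float(values.max())

# Hypothetical species A, B, C found together in one fossil assemblage.
envelopes = {
    "A": box_envelope(8, 20, 5, 30),
    "B": box_envelope(12, 25, 10, 40),
    "C": box_envelope(10, 18, 0, 25),
}
overlap = mutual_range(envelopes.values())
tmax_summary = summarise(overlap, TMAX, axis=0)
trange_summary = summarise(overlap, TRANGE, axis=1)
print("TMAX (min, median, max):", tmax_summary)
print("TRANGE (min, median, max):", trange_summary)
```

The median here is taken over the occupied grid values along each axis, which is one simple reading of 'median values of mutual overlap'; other definitions are possible.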
Climate reconstructions from (a) British Isles, (b) western Norway, (c) southern Sweden and (d) central Poland. TMAX refers to the mean temperature of the warmest month (July). The chronology is expressed in radiocarbon years BP × 1000 (ka). Each vertical bar represents the mutual climatic range (MCR) of a single dated fauna. The bold lines show the most probable value or best estimate of the palaeotemperature, derived from the median values of the MCR estimates and adjusted in light of the ecological preferences of the recorded insect assemblages. Coope & Lemdahl 1995

PROBABILITY DENSITY FUNCTIONS
Kühl et al. (2002) Quaternary Research 58, 381–392
Kühl (2003) Dissertationes Botanicae 375, 149 pp.
Kühl & Litt (2003) Vegetation History and Archaeobotany 12, 205–214
The basic idea is to quantify the present-day distribution of plants that occur as Quaternary fossils (pollen and/or macrofossils) in terms of July and January temperature and probability density functions (pdf).
Assuming statistical independence, a joint pdf can be calculated for a fossil assemblage as the product of the pdfs of the individual taxa.
Each taxon is weighted by the extent of its climatic response range, so 'narrow' indicators receive 'high' weight.
The maximum of the pdf is the most likely past climate and its confidence interval is the range of uncertainty.
Can be used with pollen (+/-) and/or macrofossils (+/-), i.e. presence/absence data.

Distribution of Ilex aquifolium in combination with January temperature.
Estimated probability density function of Ilex aquifolium, as an example for which the parametric normal distribution (solid line) fits the non-parametric distribution well (e.g. kernel function (dashed line), histogram).
Estimated one- and two-dimensional pdfs of four selected species. The histograms (nonparametric pdf) and normal distributions (parametric pdf) on the left represent the one-dimensional pdfs.
Crosses in the right-hand plots display the temperature values provided by the 0.5º x 0.5º gridded climatology (New et al., 1999). Black crosses indicate presence, grey crosses absence of the specific taxon. A small red circle marks the mean of the corresponding normal distribution and the ellipses represent 90% of the integral of the normal distribution centred on that mean. Most sample points lie within this range. The interval, however, may not necessarily include 90% of the data points. Carex secalina, as an example of an azonally distributed species, is an exception. A normal distribution does not appear to be an appropriate estimating function for this species, and therefore no normal distribution is indicated.

Climate dependencies of Carpinus (betulus) (C), Ilex (aquifolium) (I), Hedera (helix) (H), and Tilia (T) and their combination. The pdf resulting from the product of the four individual pdfs (dotted) is similar to the ellipse calculated on the basis of the 216 points with common occurrences for the four taxa (dashed). No artificial narrowing of the uncertainty range is evident.

Climate dependencies of Acer (A), Corylus (avellana) (C), Fraxinus (excelsior) (F), and Ulmus (U), and their combination. The pdf (dotted) resulting from the product of the four individual pdfs has a mean very similar to the mean of the pdf (dashed) calculated from the 1667 points with common occurrences, but its variances are much smaller.

Reconstruction for the fossil assemblage of Gröbern. The thin ellipses indicate the pdfs of the individual taxa included in the reconstruction, and the thick ellipse the 90% uncertainty range of the reconstruction result.

Simplified pollen diagram from Gröbern (Litt 1994), reconstructed January and July temperature, and δ18O (after Boettger et al. 2000).
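The product-of-pdfs idea can be sketched for the parametric (normal) case. The product of independent bivariate normal pdfs is proportional to another normal pdf whose precision matrix is the sum of the individual precisions, so taxa with narrow climatic ranges automatically carry more weight. The taxon means and covariances below are invented for illustration, and the function name is hypothetical; this is not the code of Kühl et al.

```python
import numpy as np

def joint_gaussian(means, covariances):
    """Mean and covariance of the normalised product of independent normal pdfs.

    Each mean is (January T, July T); each covariance is a 2 x 2 matrix.
    """
    precisions = [np.linalg.inv(c) for c in covariances]
    joint_precision = sum(precisions)              # precisions add under a product
    joint_cov = np.linalg.inv(joint_precision)
    weighted = sum(p @ m for p, m in zip(precisions, means))
    return joint_cov @ weighted, joint_cov

# Hypothetical taxa (values are assumptions, not fitted to any distribution data).
means = [np.array([-2.0, 16.0]), np.array([2.0, 20.0]), np.array([0.0, 18.0])]
covariances = [np.eye(2) * 4.0, np.eye(2) * 4.0, np.eye(2) * 1.0]  # third taxon is 'narrow'

mean_joint, cov_joint = joint_gaussian(means, covariances)
print("most probable (January, July):", mean_joint)
print("joint covariance:\n", cov_joint)
```

Because the product assumes independence, the joint variance always shrinks as taxa are added; the Acer, Corylus, Fraxinus, and Ulmus caption above shows a case where that shrinkage is larger than the common-occurrence data support.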
Reconstructed most probable mean January (blue) and July (red) temperature and 90% uncertainty range (dotted lines). Kühl & Litt (2003)
Comparison of the reconstructed mean January temperature using the pdf-method (green) and the analogue technique (blue). Bispingen uncertainty range – 90%; La Grande Pile – 70%.

ENVIRONMENTAL PROXY DATA
• Biological data from palaeoecological studies
• Pollen, molluscs, foraminifera, macrofossil plant remains, diatoms, chrysophytes, coleoptera, chironomids, rhizopods, moss remains, ostracods
• Quantitative counts (usually %)
• Ordinal estimates (e.g. 1-5 scale)
• Presence-absence data (1/0)
at different stratigraphical intervals and hence times

GENERAL THEORY
Y - biological responses ("proxy data")
X - set of environmental variables that are assumed to be causally related to Y (e.g. sea-surface temperatures)
B - set of other environmental variables that together with X completely determine Y (e.g. trace nutrients)
If Y is totally explicable as responses to variables represented by X and B, we have a deterministic model (no allowance for random factors, historical influences):
Y = XB
If B = 0 or is constant, we can model Y in terms of X and Re, a set of ecological response functions:
Y = X (Re)
In palaeoecology we need to know Re. We cannot derive Re deductively from ecological studies. We cannot build an explanatory model from our currently poor ecological knowledge. Instead we have to use direct empirical models based on observed patterns of Y in modern surface-samples in relation to X, to derive U, our empirical calibration functions.
Y = XU

In practice, this is a two-step process:

Regression, in which we estimate $\hat{U}_m$, modern calibration functions or regression coefficients, from the training set:
$Y_m = \hat{U}_m(X_m)$ or $X_m = \hat{U}_m^{-1}(Y_m)$ (inverse regression)
where $Y_m$ = modern surface-sample data and $X_m$ = associated environmental data.

Calibration, in which we reconstruct $\hat{X}_f$, past environment, from fossil core data (the fossil set $Y_f$):
$\hat{X}_f = \hat{U}_m^{-1}(Y_f)$   TRANSFER FUNCTION

                 BIOLOGICAL DATA                        ENVIRONMENTAL DATA
                 (e.g. diatoms, pollen, chironomids)    (e.g. mean July temperature)
Modern data      Ym: n samples x 1,...,m taxa           Xm: n samples x 1 variable
"training set"
Fossil data      Yf: t samples x 1,...,m taxa           Xo: t samples x 1 variable (unknown, to be reconstructed)

Outline of the transfer function approach to quantitative palaeoenvironmental reconstruction

GENERAL THEORY OF RECONSTRUCTION
Step 1. Regression to estimate modern optima for each species
$Y_m = \hat{U}_m(X_m)$
where
Ym = modern diatom abundance
Xm = modern chemical data (e.g. pH)
Ûm = estimated modern pH optimum for diatom species
Step 2. Calibration to reconstruct past chemistry
$X_f = \hat{U}_m^{*}(Y_f)$
where
Yf = fossil diatom abundance
Xf = reconstructed past chemistry (e.g. pH)
$\hat{U}_m^{*} = \hat{U}_m^{-1}$ = inverse of modern species optima from Step 1

REGRESSION ('CLASSICAL REGRESSION')
Y = f(X) + ERROR
Estimate f( ) from training set by regression. The estimated f( ) is then 'inverted' to find unknown x0 from fossil y0.
$\hat{f}(\hat{X}_0) = Y_0$

INVERSE REGRESSION = CALIBRATION
X = g(Y) + ERROR
$\hat{X}_0 = \hat{g}(Y_0)$
'Plug in' estimate given Y0 and g.

PROXY-DATA PROPERTIES
• Contain many taxa
• Contain many zero values
• Commonly expressed as percentages - "closed" compositional data
• Quantitative data are highly variable and invariably show a skewed distribution
• Non-quantitative data are either presence/absence or ordinal ranks
• Taxa generally have a non-linear relationship with their environment, and the relationship is often a unimodal function of the environmental variables

SPECIES RESPONSES
Species nearly always have non-linear unimodal responses along gradients.
[Figure: species response curves along a gradient; axis label 'trees (m)'. J. Oksanen 2002]

ASSUMPTIONS IN QUANTITATIVE PALAEOENVIRONMENTAL RECONSTRUCTIONS
1. Taxa in training set (Ym) are systematically related to the physical environment (Xm) in which they live.
2. Environmental variable (Xf, e.g. summer temperature) to be reconstructed is, or is linearly related to, an ecologically important variable in the system.
3. Taxa in the training set (Ym) are the same as in the fossil data (Yf) and their ecological responses (Ûm) have not changed significantly over the timespan represented by the fossil assemblage.
4. Mathematical methods used in regression and calibration adequately model the biological responses (Um) to the environmental variable (Xm).
5. Environmental variables other than, say, summer temperature have negligible influence, or their joint distribution with summer temperature in the fossil set is the same as in the training set.
6. In model evaluation by cross-validation, the test data are independent of the training data. The 'secret assumption' until Telford & Birks (2005).

LINEAR-BASED METHODS
INVERSE REGRESSION
$X_m = \hat{U}_m Y_m$
July temperature = b0 + b1y1 + b2y2 + ... + bzyz
where y1, y2, ... are taxon abundances (e.g. Pinus, Betula) and b1, b2, ... are the species parameters.
'Classical' regression [Y = UX]: 'response' (e.g. biology); 'predictor' (e.g. environmental variables)
Inverse regression is most efficient if the relation between each taxon and the environment is LINEAR with a normal error distribution. Basically a linear model.

Light micrograph of the Quaternary fossil S. herbacea leaf showing epidermal cells and stomata (x40). The cuticle was macerated in sodium hypochlorite (8% w/v) for 2 min and mounted in glycerol jelly with safranin.

CLASSICAL REGRESSION – e.g. GLM
Stomatal density = a + b (CO2) + ε
Y response variable: stomatal density; X predictor variable: CO2
INVERSE REGRESSION
CO2 = a1 + b1 (stomatal density) + ε
Here CO2 is treated as the response variable and stomatal density as the predictor variable.
CO2 past = a1 + b1 (stomatal density of fossil leaves)
$X_m = \hat{U}_m Y_m$
$\hat{X}_f = \hat{U}_m Y_f$

Regression plot of the total training set (n=29) for stomatal density of Salix herbacea leaves and the atmospheric CO2 concentration in which they grew. The regression details are as follows:

Term            Regression coefficient   Standard error   t        p
Constant (b0)   294.99                   33.248           8.87     <0.001
CO2 (b1)        -0.647                   0.155            -5.61    <0.001
R2 = 0.538; R2adj = 0.521

Both terms have regression coefficients significantly different from zero, and the variance ratio (F[1,27] = 31.44) exceeds the critical value of F at the 0.01 significance level (7.68), indicating that stomatal density has a strong statistical relationship with CO2 concentration. Beerling et al. 1995

Kråkenes Lake
Late-glacial CO2 reconstructions at Kråkenes, western Norway (38 m a.s.l.)

INVERSE MULTIPLE REGRESSION APPROACH
Multiple regression of temperature (Xm) on abundance of taxa in core tops (Ym) (inverse regression).
$X_m = \hat{U}_m Y_m = b_0 + b_1 y_1 + b_2 y_2 + b_3 y_3 + \dots + b_m y_m$
$\hat{X}_f = \hat{U}_m Y_f = b_0 + b_1 y_1 + b_2 y_2 + b_3 y_3 + \dots + b_m y_m$
i.e. $\hat{X}_f = b_0 + \sum_{k=1}^{m} b_k y_{ik}$
Approach most efficient if:
1. relation between each taxon and environment is linear with normal error distribution
2. environmental variable has normal distribution
Usually not usable because:
1. taxon abundances show multicollinearity
2. very many taxa
3. many zero values, hence regression coefficients unstable
4. basically linear model
Consider non-linear model and introduce extra terms:
$X_m = b_0 + b_1 c_1 + b_2 c_1^2 + b_3 c_2 + b_4 c_2^2 + b_5 c_3 + b_6 c_3^2 + \dots$
Can end up with more terms than samples. Cannot be solved.
Hence "ad hoc" approach of Imbrie & Kipp (1971), and related approaches of Webb et al.

Location of 61 core top samples (Imbrie & Kipp 1971)
61 core-top samples x 27 taxa → principal components analysis → 61 samples x 4 assemblages (79%)

PRINCIPAL COMPONENTS REGRESSION (PCR)
Abundance of the tropical assemblage versus winter surface temperature for 61 core top samples. Data from Tables 4 and 13. Curve fitted by eye.
Abundance of the subtropical assemblage versus winter surface temperature for 61 core top samples. Data from Tables 4 and 13. Curve fitted by eye.
Abundance of the subpolar assemblage versus winter surface temperature for 61 core top samples. Data from Tables 4 and 13. Curve fitted by eye.
Abundance of the polar assemblage versus winter surface temperature for 61 core top samples. Data from Tables 4 and 13. Curve fitted by eye.

Inverse regression was then done using the 4 varimax assemblages rather than the 27 original taxa.
Linear:
$X_m = b_0 + b_1 A + b_2 B + b_3 C + b_4 D$
where A, B, C and D are varimax assemblages.
Non-linear:
$X_m = b_0 + b_1 A + b_2 B + b_3 C + b_4 D + b_5 AB + b_6 AC + b_7 AD + b_8 BC + b_9 BD + b_{10} CD + b_{11} A^2 + b_{12} B^2 + b_{13} C^2 + b_{14} D^2$
CALIBRATION STAGE using the fossil assemblages described as the 4 varimax assemblages:
$X_f = b_0 + b_1 A_f + b_2 B_f + b_3 C_f + b_4 D_f$

General abundance trends for four of the varimax assemblages related to winter surface temperatures.
Winter surface temperatures "measured" by Defant (1961) versus those estimated from the fauna in 61 core top samples by means of the transfer function. Imbrie & Kipp 1971
Average surface salinities "measured" by Defant (1961) versus those estimated from the fauna in 61 core top samples.
Summer surface temperatures "measured" by Defant (1961) versus those estimated from the fauna in 61 core top samples.
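The linear form of principal components regression can be sketched as follows. Note the substitution: Imbrie & Kipp used varimax-rotated assemblages, whereas this sketch uses plain, unrotated principal components from an SVD. The data are synthetic and all names (`pcr_fit`, `pcr_predict`, `Y_modern`, `x_modern`) are hypothetical.

```python
import numpy as np

def pcr_fit(Y, x, k):
    """Inverse regression of x on the first k principal components of Y."""
    mean = Y.mean(axis=0)
    _, _, vt = np.linalg.svd(Y - mean, full_matrices=False)
    loadings = vt[:k].T                            # taxa x k component loadings
    scores = (Y - mean) @ loadings                 # samples x k component scores
    design = np.column_stack([np.ones(len(x)), scores])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)   # b0, b1, ..., bk
    return mean, loadings, coef

def pcr_predict(model, Y_new):
    """Calibration step: project new assemblages and apply the fitted coefficients."""
    mean, loadings, coef = model
    return coef[0] + ((Y_new - mean) @ loadings) @ coef[1:]

# Synthetic stand-in for a 61-sample x 27-taxon training set (not real core-top data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(61, 2))
Y_modern = latent @ rng.normal(size=(2, 27))
x_modern = 15.0 + 3.0 * latent[:, 0] - 1.5 * latent[:, 1]

model = pcr_fit(Y_modern, x_modern, k=2)
x_hat = pcr_predict(model, Y_modern)
print("apparent RMSE:", np.sqrt(np.mean((x_hat - x_modern) ** 2)))
```

The number of components `k` is a free choice here, which is exactly the 'why 4 assemblages?' objection raised against the approach; cross-validation is one way to choose it.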
Imbrie & Kipp 1971
Palaeoclimatic estimates for 110 samples of Caribbean core V12-133, based on palaeoecological equations (Table 12) derived from 61 core tops. Tw = winter surface temperature; Ts = summer surface temperature; ‰ = average surface salinity.

APPROACH AD HOC BECAUSE
1. Why 4 assemblages? Why not 3, 5, 6? No cross-validation.
2. Assemblages inevitably unstable, because of the many transformation, standardization, and scaling options in PCA.
3. Assumes linear relationships between taxa and their environment.
4. No sound theoretical basis.

SEGMENTED LINEAR INVERSE REGRESSION
Scatter diagrams of: (A) the percent birch (Betula); and (B) the percent oak (Quercus) pollen versus latitude.
The thirteen regions for which regression equations were obtained. Bartlein & Webb 1985

Regression equations for mean July temperature from the thirteen calibration regions in eastern North America (standard errors in parentheses)

Region A: 54-71ºN; 90-110ºW
Pollen sum: Alnus + Betula + Cyperaceae + Forb sum + Gramineae + Picea + Pinus
$\text{July T (ºC)} = 12.39\,(1.61) + 0.50\,\text{Pinus}^{0.5}\,(0.14) + 0.26\,\text{Forb sum}\,(0.05) + 0.15\,\text{Picea}^{0.5}\,(0.10) - 0.89\,\text{Cyperaceae}^{0.5}\,(0.13) - 0.37\,\text{Gramineae}\,(0.08) - 0.03\,\text{Alnus}\,(0.01)$
R2 = 0.80; adj. R2 = 0.78; Se = 0.96ºC; n = 114; F = 69.86; Pr = 0.0000

Region B: 53-71ºN; 50-80ºW
Pollen sum: Abies + Alnus + Betula + Herb sum + Picea + Pinus
$\text{July T (ºC)} = 8.17\,(2.27) + 0.54\,\text{Picea}^{0.5}\,(0.19) + 0.17\,\text{Betula}^{0.5}\,(0.14) - 0.04\,\text{Herb sum}\,(0.01) - 0.01\,\text{Alnus}\,(0.01)$
R2 = 0.70; adj. R2 = 0.70; Se = 1.52ºC; n = 165; F = 95.48; Pr = 0.0000

"We selected the appropriate equation for each sample by identifying the calibration region that; (1) contains modern pollen data that are analogous to the fossil sample; and (2) has an equation that does not produce an unwarranted extrapolation when applied to the fossil sample."

Regression equations used to reconstruct mean July temperature at 6000 yr BP. Bartlein & Webb 1985
Isotherms for estimated mean July temperatures (ºC) at 6000 yr BP.
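As a worked illustration of how one regional equation is applied to a single sample, the sketch below evaluates the Region B equation quoted above. It assumes that the '.5' suffix on a taxon name denotes a square-root (power 0.5) transformation of the pollen percentage; the sample percentages are invented, and the dictionary layout and function name are hypothetical.

```python
# Region B coefficients as quoted above; (taxon, coefficient, power) triples.
# Assumption: 'Picea.5' and 'Betula.5' in the source mean percentage ** 0.5.
REGION_B = {
    "intercept": 8.17,
    "terms": [
        ("Picea", 0.54, 0.5),
        ("Betula", 0.17, 0.5),
        ("Herb sum", -0.04, 1.0),
        ("Alnus", -0.01, 1.0),
    ],
}

def july_temperature(equation, pollen_percent):
    """Evaluate a regional inverse-regression equation for one pollen sample."""
    estimate = equation["intercept"]
    for taxon, coefficient, power in equation["terms"]:
        estimate += coefficient * pollen_percent[taxon] ** power
    return estimate

# Hypothetical fossil sample, percentages of the Region B pollen sum.
sample = {"Picea": 25.0, "Betula": 16.0, "Herb sum": 10.0, "Alnus": 20.0}
estimate = july_temperature(REGION_B, sample)
print(f"estimated mean July temperature: {estimate:.2f} degrees C")
```

In the segmented approach the harder step is choosing which region's equation to apply; that choice (analogous modern pollen data, no unwarranted extrapolation) is not modelled here.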
Difference map for mean July temperatures (ºC) between 6000 yr BP and today. Positive values indicate temperatures that were higher at 6000 yr BP than today.

Elk Lake, Minnesota
Reconstructions produced by the regression approach
Regression equation applications to the Elk Lake pollen data.

Variable                   Calibration region    Age range (varve range)   R2
Mean January temperature   45-55ºN, 85-105ºW     320-10084                 0.876
                           45-55ºN, 95-105ºW     10134-11638               0.842
Mean July temperature      40-50ºN, 85-95ºW      320-6562                  0.799
                           40-50ºN, 85-105ºW     6746-10084                0.786
                           45-50ºN, 95-105ºW     10134-11638               0.701
Annual precipitation       40-55ºN, 85-105ºW     320-3692                  0.578
                           40-50ºN, 85-105ºW     3794-7662                 0.940
                           45-55ºN, 85-105ºW     7862-11638                0.578

APPROACHES TO MULTIVARIATE CALIBRATION
Chemometrics – predicting chemical concentrations (responses) from near infra-red spectra (predictors)

PARTIAL LEAST SQUARES REGRESSION – PLS
Form of PC regression developed in chemometrics
PCR - components are selected to capture maximum variance within the predictor variables, irrespective of their predictive value for the environmental response variable
PLS - components are selected to maximise the covariance with the response variables
PLS usually requires fewer components and gives a lower prediction error than PCR.
Both are 'biased' inverse regression methods that guard against multi-collinearity among predictors by selecting a limited number of uncorrelated orthogonal components. (Biased because some data are discarded.)

CONTINUUM REGRESSION
α = 0   = normal least squares regression
α = 0.5 = PLS
α = 1.0 = PCR
PLS is thus a compromise, and performs so well by combining desirable properties of inverse regression (high correlation) and PCR (stable predictors of high variance) into one technique.
PLS will always give a better fit (r2) than PCR with the same number of components.

BASIC REQUIREMENTS IN QUANTITATIVE PALAEOENVIRONMENTAL RECONSTRUCTIONS
1. Need biological system with abundant fossils that is responsive and sensitive to environmental variables of interest.
2. Need a large, high-quality training set of modern samples. Should be representative of the likely range of variables, be of consistent taxonomy and nomenclature, be of highest possible taxonomic detail, be of comparable quality (methodology, count size, etc.), and be from the same sedimentary environment.
3. Need fossil set of comparable taxonomy, nomenclature, quality, and sedimentary environment.
4. Need good independent chronological control for fossil set.
5. Need robust statistical methods for regression and calibration that can adequately model taxa and their environment with the lowest possible error of prediction and the lowest bias possible.
6. Need statistical estimation of standard errors of prediction for each reconstructed value.
7. Need statistical and ecological evaluation and validation of the reconstructions.

A straight line displays the linear relation between the abundance value (y) of a species and an environmental variable (x), fitted to artificial data (•). (a = intercept; b = slope or regression coefficient.)
A unimodal relation between the abundance value (y) of a species and an environmental variable (x). (u = optimum or mode; t = tolerance; c = maximum.)

[Figure labels: Gradient length estimation; Indirect gradient analysis; Direct gradient analysis]
Outline of ordination techniques presented in this paper. DCA (detrended correspondence analysis) was applied for the determination of the length of gradient (LG). LG is important for choosing between ordination based on a linear or on a unimodal response model. Correspondence analysis (CA) is not considered any further because in the microcosm experiment discussed here LG was ≤ 1.5 SD units. LG < 3 SD units are considered to be typical in experimental ecotoxicology. In cases where LG < 3, ordination based on linear response models is considered to be the most appropriate. PCA (principal component analysis) visualizes variation in species data in relation to best fitting theoretical variables.
Environmental variables explaining this visualized variation are deduced afterwards, hence indirectly. RDA (redundancy analysis) visualizes variation in species data directly in relation to quantified environmental variables. Before analysis, covariables may be introduced in RDA to compensate for systematic differences in experimental units. After RDA, a permutation test can be used to examine the significance of effects.

LINEAR OR UNIMODAL METHODS
Estimate the gradient length for the environmental variable(s) of interest.
Detrended canonical correspondence analysis with x as the only external or environmental predictor.
Detrend by segments, non-linear rescaling, ? rare taxa downweighted.
Estimate of gradient length in relation to x in standard deviation (SD) units of compositional turnover.
Length may be different for different environmental variables and the same biological data:
pH           2.62 SD
alkalinity   2.76 SD
colour       1.52 SD
If gradient length < 2 SD, taxa are generally behaving monotonically along the gradient and linear-based methods are appropriate.
If gradient length > 2 SD, several taxa have their optima located within the gradient and unimodal-based methods are appropriate.

[Figure: Imbrie & Kipp (1971) core-top data; symbols: • species, • sample 'core tops']

CANONICAL CORRESPONDENCE ANALYSIS
1. Forward selection of environmental variables
   Winter SST   0.73   p = 0.01   82.0%
   Salinity     0.13   p = 0.01   17.0%
   Summer SST   0.02   p = 0.17    1.0%
2. Three environmental variables together explain 46.1% of the observed variation in the 61 core tops.
3. First axis (λ1 = 0.75) is significantly different (p = 0.01) from random expectation, indicating that the taxa are significantly related to the environmental variables.
NON-LINEAR (UNIMODAL) METHODS

MAXIMUM LIKELIHOOD PREDICTION OF GRADIENT VALUES
• Bioindication, Calibration, Transfer function, Reconstruction
• Gaussian response model - regression
  + We know observed abundances y
  + We know gradient values x
  = Estimate or model the species response curves for all species
• Bioindication - calibration
  + We know observed abundances y
  + We know the modelled species response curves for all species
  = Estimate the gradient value of x
• The most likely value of the gradient is the one that maximises the likelihood function given observed and expected abundances of species
• Can be generalised for any response function
[Figure: species - pH response curve]

GAUSSIAN RESPONSE MODEL
$\mu = h \exp\left[-\frac{(x-u)^2}{2t^2}\right]$
Can be reparametrised as a generalised linear model:
• Gradient as a 2nd degree polynomial
• Logarithmic link function
$\log(\mu) = b_0 + b_1 x + b_2 x^2$
$u = -\frac{b_1}{2 b_2}$
$t = \sqrt{-\frac{1}{2 b_2}}$
$h = \exp\left(b_0 - \frac{b_1^2}{4 b_2}\right)$
J. Oksanen 2002

[Figure: GLIM-fitted response curves against summer sea-surface temperature (ºC) for Globigerina pachyderma (left coiling), Globigerina pachyderma (right coiling), Orbulina universa, and Globigerina rubescens; one panel annotated 'Optimum = 2º']

GAUSSIAN LOGIT REGRESSION
Imbrie and Kipp 1971, 61 core tops, 27 taxa

                                            Summer SST   Winter SST   Salinity
Significant Gaussian logit model            19           21           21
Significant increasing linear logit model   6            3            4
Significant decreasing linear logit model   1            1            0
No relationship                             1            2            2

MAXIMUM LIKELIHOOD APPROACH
• Likelihood is the probability of a given observed value with a certain expected value
• Maximum likelihood estimation: expected or reconstructed values that give the best likelihood for the observed fossil assemblages
  - ML estimates are close to observed values, and the proximity is measured with the likelihood function
  - commonly we use the negative logarithm of the likelihood, since combined probabilities may be very small
J. Oksanen 2002

INFERRING PAST TEMPERATURE FROM MULTIVARIATE SPECIES COMPOSITION
[Figure: observed assemblage, modern responses, and inferred value on a temperature axis (9-14 ºC). Modified from J. Oksanen 2002]

ROOT MEAN SQUARED ERROR FOR WINTER SST, SUMMER SST, & SALINITY USING DIFFERENT PROCEDURES

Procedure                                                    Winter SST    Summer SST    Salinity
Imbrie & Kipp 1971 linear                                    2.57          2.55          0.573
Imbrie & Kipp 1971 non-linear                                1.54          2.15          0.571
Maximum likelihood regression and ML calibration             3.21          2.09          0.711
Weighted averaging regression and WA calibration             1.97          2.02          0.570
WA regression and WA calibration with tolerance
  downweighting                                              1.92          2.03          0.560
ML regression, WA calibration                                1.56 (1.56)   1.94 (1.94)   0.557 (0.656)
ML regression, WA calibration with tolerance downweighting   1.25 (1.25)   1.80 (1.80)   0.534 (0.615)
WACALIB 3.5 (debugged version!): Maximum likelihood
  regression and ML calibration                              1.20          1.63          0.54
WACALIB 2.1 – 3.3
(Values in brackets are RMSE when only taxa with significant fits are used.)

• Only three parameters:
  - u: location of the optimum on gradient x
  - h: modal height at the optimum
  - t: tolerance or width of response
• Parameters can be estimated with non-linear regression or generalised linear models
• WEIGHTED AVERAGES CAN APPROXIMATE U

WEIGHTED AVERAGING
The basic idea is very simple. In a lake with a certain pH range, diatoms with their pH optima close to the lake’s pH will tend to be the most abundant species present.
A simple estimate of the species' pH optimum is thus an average of all the pH values for lakes in which that species occurs, weighted by the species' relative abundance. (WA regression)
Conversely, an estimate of a lake's pH is the weighted average of the pH optima of all the species present. (WA calibration)

Weighted averaging regression
Optimum: $\hat{U}_k = \frac{\sum_{i=1}^{n} Y_{ik} X_i}{\sum_{i=1}^{n} Y_{ik}}$
Tolerance: $\hat{t}_k = \left[\frac{\sum_{i=1}^{n} Y_{ik} (X_i - \hat{U}_k)^2}{\sum_{i=1}^{n} Y_{ik}}\right]^{1/2}$
where
Uk is the WA optimum of taxon k
tk is the WA standard deviation or tolerance of taxon k
Yik is the percentage of taxon k in sample i
Xi is the environmental variable of interest in sample i
and there are i = 1, ..., n samples and k = 1, ..., m taxa

Weighted averaging calibration or reconstruction
WA: $\hat{X}_i = \frac{\sum_{k=1}^{m} Y_{ik} \hat{U}_k}{\sum_{k=1}^{m} Y_{ik}}$
WAtol: $\hat{X}_i = \frac{\sum_{k=1}^{m} Y_{ik} \hat{U}_k / \hat{t}_k^2}{\sum_{k=1}^{m} Y_{ik} / \hat{t}_k^2}$

Weighted averaging - the simple site average
In the simple average all sites where the species is present have equal weight when calculating the optimum. However, the species is likely to be most abundant at sites near the optimum. Therefore samples with high abundance of the species should be given more weight. In weighted averaging this is achieved by weighting the environmental variable by a measure of species abundance.

RECONSTRUCTING AN ENVIRONMENTAL VARIABLE FROM A FOSSIL ASSEMBLAGE
Weighted averaging calibration
A lake will tend to be dominated by taxa with chemical optima close to the lake's chemistry.
An estimate of this chemistry is given by averaging the optima of all taxa present in the lake. If species abundance data are available, these can be used as weights:
WA calibration: $\hat{x}_i = \frac{\sum_{k=1}^{m} y_{ik} \hat{u}_k}{\sum_{k=1}^{m} y_{ik}}$
where
x̂i = estimate of environmental variable for fossil sample i
yik = abundance of species k in fossil sample i
ûk = optimum of species k

ESTIMATION OF SAMPLE-SPECIFIC ERRORS
BASIC IDEA OF COMPUTER RE-SAMPLING PROCEDURES
TRAINING SET - 178 modern diatom samples and lake-water pH
Jack-knifing
Do reconstruction 177 times.
Leave out sample 1 and reconstruct pH; add sample 1 but leave out sample 2 and reconstruct pH. Repeat for all 177 reconstructions using a training set of size 177, leaving out one sample every time. Can derive the jackknife estimate of pH and its variance, and hence its standard error.

Bootstrap
Draw at random a training set of 178 samples using sampling with replacement, so that the same sample can, in theory, be selected more than once. Any samples not selected form an independent test set. Reconstruct pH for both the modern test-set samples and the fossil samples. Repeat for 1000 bootstrap cycles.
Mean square error of prediction =
1. error due to variability in estimating species parameters in training set (i.e. s.e. of bootstrap estimates)
+
2. error due to variation in species abundances at a given pH (i.e. actual prediction error: differences between observed pH and the mean bootstrap estimate of pH for modern samples when in the independent test set).
Birks et al. 1990

Use of data and the bootstrap distribution to infer a sampling distribution. The bootstrap procedure estimates the sampling distribution of a statistic in two steps. The unknown distribution of population values is estimated from the sample data, then the estimated population is repeatedly sampled to estimate the sampling distribution of the statistic.

The bootstrap algorithm for estimating the standard error of a statistic $\hat{\theta} = s(x)$; each bootstrap sample is an independent random sample of size n from $\hat{F}$. The number of bootstrap replications B for estimating a standard error is usually between 25 and 200. As $B \to \infty$, $\widehat{se}_B$ approaches the plug-in estimate of $se_{\hat{F}}(\hat{\theta})$.

ERROR ESTIMATION BY BOOTSTRAPPING
WACALIB 3.1+ AND C2
61 sample training set, draw 61 samples at random with replacement to give a bootstrap training set of size 61. Any samples not selected form a test set.
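The bootstrap resampling scheme described above can be sketched as follows. This is a minimal illustration on synthetic Gaussian-response data, not the WACALIB or C2 implementation: it uses plain weighted averaging without tolerance downweighting and without any deshrinking step, and the function names and the data are assumptions.

```python
import numpy as np

def wa_optima(Y, x):
    """WA regression: abundance-weighted mean of x for each taxon (NaN if taxon absent)."""
    with np.errstate(invalid="ignore", divide="ignore"):
        return (Y * x[:, None]).sum(axis=0) / Y.sum(axis=0)

def wa_predict(Y, optima):
    """WA calibration: abundance-weighted mean of the taxon optima for each sample."""
    usable = ~np.isnan(optima)
    weights = Y[:, usable]
    return (weights @ optima[usable]) / weights.sum(axis=1)

def bootstrap_rmsep(Y, x, n_boot=200, seed=0):
    """Return (s1, s2, RMSEP) from out-of-bag predictions over n_boot cycles."""
    rng = np.random.default_rng(seed)
    n = len(x)
    preds = np.full((n_boot, n), np.nan)
    for b in range(n_boot):
        train = rng.integers(0, n, size=n)            # sample with replacement
        test = np.setdiff1d(np.arange(n), train)      # samples never drawn this cycle
        preds[b, test] = wa_predict(Y[test], wa_optima(Y[train], x[train]))
    mean_boot = np.nanmean(preds, axis=0)             # mean estimate when in test set
    s1 = np.sqrt(np.nanmean(np.nanvar(preds, axis=0)))   # spread of bootstrap estimates
    s2 = np.sqrt(np.nanmean((x - mean_boot) ** 2))       # observed minus mean estimate
    return s1, s2, np.sqrt(s1 ** 2 + s2 ** 2)

# Synthetic training set: 40 lakes, 12 taxa with Gaussian responses to pH.
x = np.linspace(4.5, 7.0, 40)
true_optima = np.linspace(4.0, 7.5, 12)
Y = np.exp(-((x[:, None] - true_optima[None, :]) ** 2) / (2 * 0.8 ** 2))

s1, s2, rmsep = bootstrap_rmsep(Y, x)
print(f"s1 = {s1:.3f}, s2 = {s2:.3f}, RMSEP = {rmsep:.3f}")
```

Here s1 and s2 are pooled over the training set; a sample-specific error for a fossil sample would instead combine that sample's own bootstrap variance with the training-set s2.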
Mean square error of prediction = $s_1^2 + s_2^2$

$s_1^2 = \frac{\sum (\hat{x}_{i,boot} - \bar{x}_{i,boot})^2}{n_{boot}}$: error due to variability in estimates of optima and/or tolerances in training set (s.e. of bootstrap estimates)
$s_2^2 = \frac{\sum (x_i - \bar{x}_{i,boot})^2}{n_{boot}}$: error due to variation in abundances at a given temperature (actual prediction error: differences between observed $x_i$ and mean bootstrap estimate)
($\bar{x}_{i,boot}$ is the mean of $\hat{x}_{i,boot}$ for all cycles when sample i is in the test set.)

For a fossil sample:
$MSEP = \frac{\sum_{boot} (\hat{x}_{i,boot} - \bar{x}_{i,boot})^2}{n_{boot}} + s_2^2$
$RMSE = (s_1^2 + s_2^2)^{1/2}$

ROOT MEAN SQUARE ERRORS OF PREDICTION ESTIMATED BY BOOTSTRAPPING

                                          WA              WAtol
Summer sea-surface temperature ºC
  Training set   RMSE total               2.31            2.37
                 RMSE S1                  0.63            0.70
                 RMSE S2                  2.22            2.27
  Fossil samples                          2.225–2.251     2.283–2.296
Winter sea-surface temperature ºC
  Training set   RMSE total               2.23            2.19
                 RMSE S1                  0.62            0.7
                 RMSE S2                  2.14            2.07
  Fossil samples                          2.156–2.201     2.106–2.249
Salinity ‰
  Training set   RMSE total               0.61            0.60
                 RMSE S1                  0.11            0.13
                 RMSE S2                  0.60            0.59
  Fossil samples                          0.603–0.607     0.599–0.606

TRAINING SET ASSESSMENT
ROOT MEAN SQUARED ERROR (RMSE) of $(x_i - \hat{x}_i)$:
$RMSE = \sqrt{\frac{\sum (\hat{x}_i - x_i)^2}{n}}$
CORRELATION BETWEEN $x_i$ and $\hat{x}_i$: r
COEFFICIENT OF DETERMINATION: r2
r or r2 measures the strength of the relationship between observed and inferred values and allows comparison between transfer functions for different variables.
RMSE2 = error2 + bias2
Bias = Mean $(x_i - \hat{x}_i)$: SYSTEMATIC PREDICTION ERROR (mean of prediction errors)
Error = SE $(x_i - \hat{x}_i)$: RANDOM PREDICTION ERROR ABOUT BIAS
Also Maximum bias – divide the sampling interval of xi into equal intervals (usually 10), calculate mean bias for each interval; the largest absolute value of mean bias for an interval is used as a measure of maximum bias.
Note that in RMSE the divisor is n, not (n - 1) as in standard deviation. This is because we are using the known gradient values only.
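The training-set statistics just defined can be computed directly. The sketch below is one possible reading of those definitions: residuals are taken as observed minus inferred, the error term uses divisor n so that RMSE2 = bias2 + error2 holds exactly, and the function name and example numbers are invented.

```python
import numpy as np

def assess(x_obs, x_pred, n_intervals=10):
    """RMSE, mean bias, error about bias, maximum bias, and r for a transfer function."""
    resid = x_obs - x_pred                        # prediction errors (observed - inferred)
    rmse = np.sqrt(np.mean(resid ** 2))           # divisor n, not n - 1
    mean_bias = resid.mean()                      # systematic prediction error
    error = resid.std()                           # random error about bias (divisor n)
    # Maximum bias: largest absolute mean residual over equal-width intervals of x_obs.
    edges = np.linspace(x_obs.min(), x_obs.max(), n_intervals + 1)
    bins = np.clip(np.digitize(x_obs, edges) - 1, 0, n_intervals - 1)
    interval_means = [resid[bins == j].mean() for j in range(n_intervals) if np.any(bins == j)]
    max_bias = max(abs(m) for m in interval_means)
    r = np.corrcoef(x_obs, x_pred)[0, 1]
    return {"rmse": rmse, "mean_bias": mean_bias, "error": error,
            "max_bias": max_bias, "r": r, "r2": r ** 2}

# Invented example: inferred values overestimate low observations and underestimate high ones.
x_obs = np.linspace(4.5, 7.0, 30)
x_pred = 5.75 + 0.8 * (x_obs - 5.75)
stats = assess(x_obs, x_pred)
print({name: round(float(value), 3) for name, value in stats.items()})
```

In this example the mean bias is zero while the maximum bias is not, which is why both are reported: shrinkage towards the centre of the gradient cancels out in the overall mean but shows up at the ends.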
BIAS AND ERROR
Good: prediction root mean squared error (RMSEP).
Correlation unreliable: depends on the range of observations.
• Root mean squared error: $RMSE = \sqrt{\sum_{i=1}^{N} (\hat{x}_i - x_i)^2 / N}$
• Bias $b$: systematic difference.
• Error $\varepsilon$: random error about bias.
• $RMSE^2 = b^2 + \varepsilon^2$
Must be cross-validated or will be badly biased.
J. Oksanen 2002

ACCURACY OF PREDICTION
• Root mean squared error: $RMSE = \sqrt{\sum_i (\hat{x}_i - x_i)^2 / n}$
• Two components, bias and error: $RMSE^2 = bias^2 + error^2$
• Correlation coefficient is dependent on the range of observations
  - Large range: large part of variance explained
• Cross-validation must be used in assessing the prediction accuracy
  1: Split sample - divide your data into training and test data sets
  2: Jack-knife - for every site i repeat: remove site i from the data set, estimate species response curves, do the calibration for site i

CROSS-VALIDATION
Leave-one-out ('jack-knife'), each in turn, or divide data into training and test data sets.
Leave-one-out changes the data too little, and hence exaggerates the goodness of prediction.
K-fold cross-validation leaves out a certain proportion (e.g. 1/10) and evaluates the model for each of the data sets left out.
Badly biased unless one does cross-validation.
J. Oksanen 2002

CROSS-VALIDATION STATISTICS
PREDICTED VALUES: RMSEP, r (jack), r2 (jack), mean bias, maximum bias
cf. APPARENT VALUES or ESTIMATED VALUES: RMSE, r, r2, mean bias, maximum bias

TRAINING SET ASSESSMENT AND SELECTION
Lowest RMSEP, highest r or r2 (jack), lowest mean bias, lowest maximum bias. Often a compromise between RMSEP and bias.
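A minimal sketch of leave-one-out and k-fold cross-validation follows, assuming a plain WA model without deshrinking as the transfer function. The function names and the noisy toy data are hypothetical. Setting k equal to the number of samples gives leave-one-out; smaller k leaves out a proportion of the data at a time.

```python
import math
import random

def fit_wa(Y, x):
    """WA regression: one optimum per taxon (assumes every taxon occurs in Y)."""
    return [sum(r[k] * xi for r, xi in zip(Y, x)) / sum(r[k] for r in Y)
            for k in range(len(Y[0]))]

def predict_wa(row, optima):
    """WA calibration: abundance-weighted mean of the taxon optima."""
    return sum(y * u for y, u in zip(row, optima)) / sum(row)

def rmse(observed, predicted):
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted))
                     / len(observed))

def k_fold_predictions(Y, x, k):
    """Predict every sample from a model fitted without that sample's fold."""
    n = len(x)
    predicted = [0.0] * n
    for fold in range(k):
        train = [i for i in range(n) if i % k != fold]      # interleaved folds
        optima = fit_wa([Y[i] for i in train], [x[i] for i in train])
        for i in range(fold, n, k):                         # samples left out
            predicted[i] = predict_wa(Y[i], optima)
    return predicted

# Hypothetical toy data: 50 samples, 15 taxa, Gaussian responses with noise.
rng = random.Random(0)
x = [4.5 + 3.0 * i / 49 for i in range(50)]
true_optima = [4.0 + 4.0 * k / 14 for k in range(15)]
Y = [[math.exp(-0.5 * (xi - u) ** 2) * (0.5 + rng.random()) for u in true_optima]
     for xi in x]

all_optima = fit_wa(Y, x)
apparent = rmse(x, [predict_wa(row, all_optima) for row in Y])   # apparent RMSE
loo_pred = k_fold_predictions(Y, x, len(x))                      # leave-one-out
loo = rmse(x, loo_pred)
tenfold = rmse(x, k_fold_predictions(Y, x, 10))                  # 10-fold
print(f"apparent {apparent:.3f}  leave-one-out {loo:.3f}  10-fold {tenfold:.3f}")
```

Interleaved folds are a design choice for ordered toy data; with real data the samples would normally be shuffled before being assigned to folds.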
PARTITIONING RMSEP
$RMSEP^2 = ERROR^2 + BIAS^2$
$s_1^2$: error due to estimating optima and tolerances
$s_2^2$: error due to variations in abundance of taxa at a given environmental value

SWAP (= Surface Waters Acidification Project) Diatom – pH Training Set
England    5
Norway    51
Scotland  60
Sweden    30
Wales     32
Total    178 surface sediments
267 taxa in 2 or more samples with 1% or more in a sample.
pH (arithmetic mean): 4.33 – 7.25, mean = 5.59, median = 5.51
Screened to 167 samples: pH 4.33 – 7.25, mean = 5.56, median = 5.27, 262 taxa
RMSE = 0.297, r = 0.933
RMSEP (bootstrapping) = 0.32
RMSEP (split-sampling) = 0.31

ROOT MEAN SQUARED ERRORS OF PREDICTION FOR THE TRAINING SET

                              WA                     WAtol
RMSE s1                       0.072                  0.305
RMSE s2                       0.312                  0.371
Total RMSE of prediction      0.320                  0.480
Cross-validation RMSE         0.308 (0.269-0.338)    0.376 (0.287-0.541)

The Round Loch of Glenhead, Galloway: WA pH reconstructions with bootstrap standard errors of prediction.

STATISTICAL AND ECOLOGICAL EVALUATION OF RECONSTRUCTIONS
INITIAL ASSUMPTIONS
1. Taxa related to physical environment.
2. Modern and fossil taxa have same ecological responses.
3. Mathematical methods adequately model the biological responses.
4. Reconstructions have low errors.
5. Training set is representative of the range of variation in the fossil set.

RECONSTRUCTION EVALUATION
1. RMSEP for individual fossil samples. Monte Carlo simulation using leave-one-out initially to estimate standard errors of taxon coefficients and then to derive sample-specific standard errors, or bootstrapping.
2. Goodness-of-fit statistics. CCA of calibration set, fit fossil sample passively on axis (environmental variable of interest), examine squared residual distance to axis, see if any fossil samples are poorly fitted.
3. Analogue statistics. Good and close analogues. Extreme 5% and 2.5% of modern DCs.
4.
Percentages of total fossil assemblage that consist of taxa not represented at all in the calibration data set, and percentages of total assemblage that consist of taxa poorly represented in the training set (e.g. < 10% occurrences) and that have coefficients poorly estimated in the training set (high variance of beta values in cross-validation).
< 5% not present: reliable
< 10% not present: okay
< 25% not present: possibly okay
> 25% not present: not reliable

ASSESSMENT OF ANALOGUES
ANALOG, MAT
Chord distance or chi-squared distances.
Select first, fifth, and tenth percentiles of all pairs of DC values ($n$ samples give $\tfrac{1}{2} n(n-1)$ DC values).
[Figure: distribution of DC values with cut-offs labelled VERY GOOD or CLOSE ANALOGUE, GOOD, FAIR, and POOR FIT]
RANDOMISATION TESTS
ANALOG

Chironomids and climate
Ordination of (a) chironomid taxa and environmental variables in Labrador, Canada, and (b) lakes.
Relationship between actual and chironomid-inferred summer surface-water temperatures for Labrador lakes.
Walker et al. 1991
Walker et al. 1996
Percentage abundance of common midge taxa in sediments of Splan Pond, New Brunswick, Canada. For comparison, names of climatic events for correlative European time intervals are included.
Summer surface-water palaeotemperature reconstruction for Splan Pond. For comparison, names of climatic events for correlative European time intervals are included. The apparent root-mean-square error of the temperature estimates is 1.32°C (10).
Walker et al. 1997

VALIDATION
Diatoms and pH
Reconstructions of the pH history of Lysevatten based on historical data and inference from the subfossil diatoms in the sediment. Historical data are pH measurements (thin solid line) and indirect data from fish reports and data from other similar lakes (thin broken line). The insert, showing pH variations from April 1961 to March 1962, is based on real measurements. Diatom-inferred values (thick solid line) were obtained by weighted averaging.

Diatoms and total P validation
Plot of observed vs.
inferred annual mean TP concentrations (log µg l-1) based on simple WA classical regression of 44 lakes.
Comparison of the measured seasonal range in TP concentrations for Mondsee (mean is shown by a line with open circles; minimum and maximum are shown as single lines) with the bootstrap RMSE of prediction for each individual reconstructed TP value (Est_se_p) using the diatom model (Mean boot is shown by a line with filled circles and the lower and upper errors are shown as single lines). All model values are back-transformed to µg l-1.
Measured annual mean TP concentrations (line with open circles) compared with the diatom-inferred TP values calculated as 3-year running means (single line), for the period 1975-93. All model values are back-transformed to µg l-1.

Measured and diatom-inferred total P: Baldeggersee frozen core
Baldeggersee
Lotter 1998
Diatom succession in Baldeggersee freeze-core BA93-C between 1885 and 1993. Only major taxa shown.
Measured total phosphorus (TP) during spring circulation compared to diatom-inferred TP values and median grain-size distribution in the Baldeggersee annual layers (see Lotter et al. 1997c). The large filled circles show the measured spring circulation TP values for the uppermost 15 m, whereas the horizontal lines represent the annual TP range in the uppermost 15 m of the water column. The dots on the right side of the graph represent samples with close (filled dots; 2nd percentile) and good modern analogues (open dots; 5th percentile).

Diatoms and climate
Diatom-inferred mean July air temperature (black dots) from sediments of the three study lakes Alanen Laanijärvi, Lake 850, and Lake Njulla including sample-specific error estimates (vertical error bars) and 210Pb dating errors (horizontal error bars) compared with measured July T (grey dots) in Kiruna (for Alanen Laanijärvi) and in Abisko (for Lakes 850 and Njulla) during the past century.
Measured July T are corrected for elevation (0.57°C per 100 m; Laaksonen, 1976) and smoothed (grey line) with a running mean (n = 13). The stippled lines separate periods with apparent 'good' and 'poor' correspondence between diatom-inferred and measured July T in Lakes 850 and Njulla.
Bigler and Hall 2003

Chironomids and climate
Comparison between meteorological data and chironomid-inferred temperatures at each of the 4 study sites. The blue line represents the 5- or 2-year running means of the meteorological data at Abisko and Kiruna respectively, corrected using a lapse rate of 0.57°C per 100 m. The red line represents the 5-year (for lakes Njulla, 850, and Vuoskkujarvi) or 2-year (for Alanen Laanijärvi) running means of the meteorological data corresponding to the date obtained at each level. The black line is the chironomid-inferred temperatures with the estimated errors as vertical bars (mean ± SSE). The horizontal error bars represent an estimated error in dating. The open stars indicate sediment intervals where the instrumental values fall outside the range of chironomid-inferred temperature (mean ± SSE). The Pearson correlation coefficient r and associated p-values are presented and indicate statistically significant correlations between measured and chironomid-inferred mean July air temperature at all study sites. The arrows indicate the climate normals (mean 1960-1999).
Larocque & Hall 2003

WEIGHTED AVERAGING – AN ASSESSMENT
1. Ecologically plausible: based on unimodal species response model.
2. Mathematically simple but has a rigorous mathematical theory. Properties fairly well known now.
3. Empirically powerful:
   a. does not assume linear responses
   b. not hindered by too many species, in fact helped by many species!
   c. relatively insensitive to outliers
4. Tests with simulated and real data: at its best with noisy, species-rich compositional percentage data with many zero values over long environmental gradients (> 3 standard deviations).
5.
Because of its computational simplicity, can derive error estimates for predicted inferred values.
6. Does well in 'non-analogue' situations as it is not based on the assemblage as a whole but on INDIVIDUAL species optima and/or tolerances.
7. Ignores absences.
8. Weaknesses.

Species packing model: Gaussian logit curves of the probability (p) that a species occurs at a site, against environmental variable x. The curves shown have equispaced optima (spacing = 1), equal tolerances (t = 1) and equal maximum probabilities of occurrence (pmax = 0.5). xo is the value of x at a particular site.

Diatoms and pH
1. Sensitive to distribution of environmental variable in training set.
2. Considers each environmental variable separately.
3. Disregards residual correlations in species data.
Can extend WA to WA-partial least squares to include residual correlations in species data in an attempt to improve our estimates of species optima.

WEIGHTED AVERAGES
• WA estimate of species optimum (u) is good if:
  1. Sites are uniformly distributed over species range
  2. Sites are close to each other
• WA estimates of gradient values (x) are good if:
  1. Species optima are dispersed uniformly around x
  2. All species have equal tolerances
  3. All species have equal modal abundances
  4. Optima are close together

$\tilde{u}_i = \sum_j y_{ij} x_j / \sum_j y_{ij}$
$\tilde{x}_j = \sum_i y_{ij} u_i / \sum_i y_{ij}$
y = abundance, u = optimum, x = gradient value; i = species, j = site, ~ = WA estimate

These conditions are only true for infinite species packing conditions!

WEIGHTED AVERAGING CONDITIONS JOINTLY
1. Both species and sites must have uniform and dense distribution over the gradient.
2. To estimate values at gradient ends, some species optima must be outside the gradient endpoints. Result is bias and truncation.
3. To estimate extreme species optima, some sites must be outside the most extreme species optima. Result is bias and truncation.
4. Conditions 2 and 3 can be satisfied simultaneously only with infinite gradients.
5.
WA equations define the two-way reciprocal averaging algorithm of CA: $x \to \tilde{u}, \; \tilde{u} \to \tilde{x}, \; \tilde{x} \to \tilde{u}, \ldots$
6. Ranges and variances of weighted averages are smaller than the range of values that they are based on. Need to 'deshrink' to restore the original range and variance.

BIAS AND TRUNCATION IN WEIGHTED AVERAGES
Weighted averages are good estimates of Gaussian optima, unless the response is truncated.
Bias towards the gradient centre: shrinking.
[Figure panels: WA and GLR estimates plotted against pH]
J. Oksanen 2002

APPROACHES TO MULTIVARIATE CALIBRATION
Chemometrics: predicting chemical concentrations from near infra-red spectra.
Responses
Predictors

CORRESPONDENCE ANALYSIS REGRESSION
Roux 1979
Reduced Imbrie & Kipp (1971) modern foraminifera data to 3 CA axes. Then used these in inverse regression.

RMSE (apparent)     Summer temp    Winter temp
PC regression       2.55°C         2.57°C
CA regression       1.72°C         1.37°C
WA-PLS              1.53°C         1.17°C

WEIGHTED AVERAGING PARTIAL LEAST SQUARES (WA-PLS)
Extend simple WA to WA-PLS to include residual correlations in species data in an attempt to improve our estimates of species optima.

Partial least squares (PLS)
Form of PCA regression of x on y. PLS components are selected to show maximum covariance with x, whereas in PCA regression components of y are calculated irrespective of their predictive value for x.

Weighted averaging PLS
WA = WA-PLS if only the first WA-PLS component is used.
WA-PLS uses further components, namely as many as are useful in terms of predictive power. Uses residual structure in species data to improve our estimates of species parameters (optima) in the final WA predictor.
Optima of species that are abundant in sites with large residuals are likely to be updated most in WA-PLS.

WEIGHTED AVERAGING (WA)
1. Take the environmental variable (xi) as the site scores.
2. Calculate species scores (optima) (uk) by weighted averaging of site scores: WA regression.
3. Calculate new site scores (x*i) by weighted averaging of species scores: WA calibration.
4.
Regress the environmental variable (xi) on the preliminary new site scores (x*i) and take the fitted values as the estimate of xi: deshrinking regression.
[Regression of x*i on xi: CLASSICAL, good for 'ends'; regression of xi on x*i: INVERSE, lower RMSE]

$\hat{x}_i = b_0 + b_1 x^*_i = b_0 + b_1 \frac{\sum_{k=1}^{m} y_{ik} u_k}{\sum_{k=1}^{m} y_{ik}} = \frac{\sum_{k=1}^{m} y_{ik} \hat{u}_k}{\sum_{k=1}^{m} y_{ik}}$, where $\hat{u}_k = b_0 + b_1 u_k$

The weighted averaging (WA) method thus consists of three parts: WA regression, WA calibration and a deshrinking regression. The parts are motivated as follows.

A species with a particular optimum will be most abundant in sites with x-values close to its optimum. This motivates Part 1 (WA regression): estimate species optima ($u^*_k$) by weighted averaging of the x-values of the sites, i.e.
$u^*_k = \sum_i y_{ik} x_i / y_{+k}$

Species present and abundant in a particular site will tend to have optima close to its x-value. This motivates Part 2 (WA calibration): estimate the x-value of the sites by weighted averaging of the species optima, i.e.
$x^*_i = \sum_k y_{ik} u^*_k / y_{i+}$

Because averages are taken twice, the range of the estimated x-values ($x^*_i$) is shrunken. The amount of shrinking can be estimated from the training set by regressing either ($x^*_i$) on ($x_i$) or ($x_i$) on ($x^*_i$), as proposed by ter Braak (1988) and ter Braak & Van Dam (1989), respectively. Birks et al. (1990a) discuss the virtues of these two deshrinking methods. For establishing the link with PLS we need the latter, 'inverse' deshrinking regression. This method also has the attractive property of giving minimum root mean squared error in the training set. This motivates Part 3 (deshrinking regression): regress the environmental variable ($x_i$) on the preliminary estimates ($x^*_i$) and take the fitted values as the estimates of ($x_i$).

The final prediction formula for inferring the value of the environmental variable from a fossil species assemblage is thus
$\hat{x}_0 = a_0 + a_1 x^*_0 = a_0 + a_1 \sum_k y_{0k} u^*_k / y_{0+} = \sum_k y_{0k} \hat{u}_k / y_{0+}$
where $a_0$ and $a_1$ are the coefficients of the deshrinking regression and $\hat{u}_k = a_0 + a_1 u^*_k$.
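The three parts (WA regression, WA calibration, inverse deshrinking) can be sketched as below. The function names and toy data are hypothetical, and the deshrinking step uses an ordinary unweighted least-squares regression of x on x*, which is an assumption of this sketch rather than a statement about any particular program.

```python
import math

def wa_fit(Y, x):
    """Return updated optima u_hat and deshrinking coefficients (a0, a1)."""
    n, m = len(Y), len(Y[0])
    # Part 1, WA regression: u*_k = sum_i y_ik x_i / y_+k
    u_star = [sum(Y[i][k] * x[i] for i in range(n)) / sum(Y[i][k] for i in range(n))
              for k in range(m)]
    # Part 2, WA calibration: x*_i = sum_k y_ik u*_k / y_i+
    x_star = [sum(y * u for y, u in zip(row, u_star)) / sum(row) for row in Y]
    # Part 3, inverse deshrinking: least-squares regression of x on x*
    mean_x = sum(x) / n
    mean_s = sum(x_star) / n
    a1 = (sum((s - mean_s) * (xi - mean_x) for s, xi in zip(x_star, x))
          / sum((s - mean_s) ** 2 for s in x_star))
    a0 = mean_x - a1 * mean_s
    # Updated optima, so that prediction is again a plain weighted average.
    u_hat = [a0 + a1 * u for u in u_star]
    return u_hat, a0, a1

def wa_predict(row, u_hat):
    """x_hat_0 = sum_k y_0k u_hat_k / y_0+"""
    return sum(y * u for y, u in zip(row, u_hat)) / sum(row)

# Hypothetical toy data: 40 samples, 12 taxa with Gaussian responses.
x = [4.5 + 3.0 * i / 39 for i in range(40)]
true_optima = [4.0 + 4.0 * k / 11 for k in range(12)]
Y = [[100.0 * math.exp(-0.5 * (xi - u) ** 2) for u in true_optima] for xi in x]

u_hat, a0, a1 = wa_fit(Y, x)
fitted = [wa_predict(row, u_hat) for row in Y]
print(f"a0 = {a0:.3f}, a1 = {a1:.3f}")
```

A slope a1 greater than 1 shows the deshrinking at work: the initial WA estimates span a narrower range than the measured values, and the regression stretches them back out.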
The final prediction formula is thus again a weighted average, but one with updated species optima.

The problem of weighted averaging: shrinking of the range of environmental reconstructions.
Solution: deshrinking regression.
Derive regression coefficients: initial $x_i = a + b x_i$
Apply regression to reconstructed values to 'deshrink': final $x_i$ = (initial $x_i$ - a)/b
where $x_i$ = the measured environmental variable; initial $x_i$ = the initial WA estimate of the environmental variable; final $x_i$ = the final, deshrunk environmental variable; and a and b are regression coefficients.
(This form, which regresses the initial estimates on the measured values and then inverts the fit, is the one called classical deshrinking in the full definition that follows; inverse deshrinking regresses $x_i$ on the initial estimates directly.)

FULL DEFINITION OF TWO-WAY WEIGHTED AVERAGING
1. Estimate species optima ($\hat{u}_k$) by weighted averaging of the environmental variable (x) at the sites
$\hat{u}_k = \sum_{i=1}^{n} y_{ik} x_i / y_{+k}$
where $y_{+k}$ has + to replace the summation over the subscript, in this case i = 1, ..., n sites.
2. Estimate the x values of the sites by weighted averaging of the species optima
initial $\hat{x}_i = \sum_{k=1}^{m} y_{ik} \hat{u}_k / y_{i+}$
3. Because averages are taken twice, the range of the estimated initial x-values is shrunken. Need to deshrink using either
(a) Inverse linear regression
$x_i = b_0 + b_1 (\text{initial } \hat{x}_i) + \varepsilon_i$
final $\hat{x}_i = b_0 + b_1 (\text{initial } \hat{x}_i)$
This minimises RMSE in the training set.
or
(b) Classical linear regression
initial $\hat{x}_i = b_0 + b_1 x_i + \varepsilon_i$
final $\hat{x}_i = (\text{initial } \hat{x}_i - b_0) / b_1$
This deshrinks more than inverse regression and takes inferred values further away from the mean.

For inverse regression and two-way WA
$\hat{x}_0 = b_0 + b_1 (\text{initial } \hat{x}_0) = b_0 + b_1 \sum_{k=1}^{m} y_{0k} \hat{u}_k / y_{0+} = \sum_{k=1}^{m} y_{0k} u^*_k / y_{0+}$, where $u^*_k = b_0 + b_1 \hat{u}_k$

For classical regression and two-way WA
$\hat{x}_0 = \left( \sum_{k=1}^{m} y_{0k} \hat{u}_k / y_{0+} - b_0 \right) / b_1 = \sum_{k=1}^{m} y_{0k} u^*_k / y_{0+}$, where $u^*_k = (\hat{u}_k - b_0) / b_1$

Can also estimate for each species its WA tolerance or standard deviation (niche breadth) as
$\hat{t}_k = \left[ \sum_{i=1}^{n} y_{ik} (x_i - \hat{u}_k)^2 / y_{+k} \right]^{1/2}$
and use these in a tolerance-weighted estimate of x
$\hat{x}_i = \frac{\sum_{k=1}^{m} y_{ik} \hat{u}_k / \hat{t}_k^2}{\sum_{k=1}^{m} y_{ik} / \hat{t}_k^2}$

WEIGHTED AVERAGING PARTIAL LEAST SQUARES – WA-PLS
1. Centre the environmental variable by subtracting the weighted mean.
2. Take the centred environmental variable (xi) as initial site scores (cf. WA/CA).
3.
Calculate new species scores by WA of site scores.
4. Calculate new site scores by WA of species scores.
5. For axis 1, go to 6. For axes 2 and more, make site scores uncorrelated with previous axes.
6. Standardise new site scores and (cf. WA/CA) use as new component.
7. Regress environmental variable on the components obtained so far using a weighted regression (inverse) and take fitted values as current estimate of the environmental variable. Go to step 2 and use the residuals of the regression as new site scores (hence the name 'partial') (cf. WA/CA).
Optima of species that are abundant in sites with large residuals are likely to be most updated.

DEFINITION OF WA-PLS
Step 0: Centre the environmental variable by subtracting the weighted mean, i.e.
$x_i := x_i - \sum_i y_{i+} x_i / y_{++}$
This simplifies the formulae.
Step 1: Take the centred environmental variable ($x_i$) as initial site scores ($r_i$).
Do steps 2 to 7 for each component:
Step 2: Calculate new species scores ($u^*_k$) by weighted averaging of the site scores, i.e.
$u^*_k = \sum_i y_{ik} r_i / y_{+k}$
Step 3: Calculate new site scores ($r_i$) by weighted averaging of the species scores, i.e.
new $r_i = \sum_k y_{ik} u^*_k / y_{i+}$
Step 4: For the first axis go to step 5. For second and higher components, make the new site scores ($r_i$) uncorrelated with the previous components by orthogonalization.
Step 5: Standardise the new site scores ($r_i$).
Step 6: Take the standardised scores as the new component.
Step 7: Regress the environmental variable ($x_i$) on the components obtained so far using weights ($y_{i+}/y_{++}$) in the regression and take the fitted values as current estimates ($\hat{x}_i$). Go to step 2 with the residuals of the regression as the new site scores ($r_i$).
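Steps 0 to 7 can be sketched as follows. This is an illustrative reading of the steps, not a reference implementation: the function names and toy data are hypothetical, and tracking a species coefficient vector alongside each component (so that the final predictor is a weighted average of per-species coefficients) is a bookkeeping choice of this sketch. It relies on the fact that the weighted components are orthonormal, so each regression coefficient in step 7 can be computed separately.

```python
import math
import random

def wapls_fit(Y, x, n_comp):
    """Return species coefficients beta_k after n_comp WA-PLS components."""
    n, m = len(Y), len(Y[0])
    row = [sum(r) for r in Y]                                    # y_i+
    col = [sum(Y[i][k] for i in range(n)) for k in range(m)]     # y_+k
    total = sum(row)                                             # y_++
    w = [ri / total for ri in row]                               # regression weights
    xbar = sum(wi * xi for wi, xi in zip(w, x))
    xc = [xi - xbar for xi in x]                                 # Step 0
    resid = xc[:]                                                # Step 1
    comps, coefs = [], []
    beta = [xbar] * m
    for _ in range(n_comp):
        # Step 2: species scores by WA of the current site scores
        u = [sum(Y[i][k] * resid[i] for i in range(n)) / col[k] for k in range(m)]
        # Step 3: site scores by WA of the species scores
        t = [sum(Y[i][k] * u[k] for k in range(m)) / row[i] for i in range(n)]
        # Step 4: make scores uncorrelated with earlier components
        for tj, cj in zip(comps, coefs):
            proj = sum(wi * a * b for wi, a, b in zip(w, t, tj))
            t = [a - proj * b for a, b in zip(t, tj)]
            u = [a - proj * c for a, c in zip(u, cj)]
        # Step 5: standardise (weighted mean 0, weighted variance 1)
        mean = sum(wi * a for wi, a in zip(w, t))
        sd = math.sqrt(sum(wi * (a - mean) ** 2 for wi, a in zip(w, t)))
        t = [(a - mean) / sd for a in t]
        u = [(a - mean) / sd for a in u]
        # Steps 6 and 7: weighted regression of x on the new component
        b_j = sum(wi * xi * a for wi, xi, a in zip(w, xc, t))
        beta = [bk + b_j * uk for bk, uk in zip(beta, u)]
        resid = [ri - b_j * a for ri, a in zip(resid, t)]        # new site scores
        comps.append(t)
        coefs.append(u)
    return beta

def wapls_predict(row, beta):
    """x_hat = sum_k y_k beta_k / sum_k y_k"""
    return sum(y * b for y, b in zip(row, beta)) / sum(row)

def weighted_mse(Y, x, beta):
    total = sum(sum(r) for r in Y)
    return sum(sum(r) / total * (xi - wapls_predict(r, beta)) ** 2
               for r, xi in zip(Y, x))

# Hypothetical toy data: 40 samples, 15 taxa, Gaussian responses with noise.
rng = random.Random(0)
x = [4.5 + 3.0 * i / 39 for i in range(40)]
true_optima = [4.0 + 4.0 * k / 14 for k in range(15)]
Y = [[100.0 * math.exp(-0.5 * (xi - u) ** 2) * (0.5 + rng.random())
      for u in true_optima] for xi in x]

errors = [weighted_mse(Y, x, wapls_fit(Y, x, s)) for s in (1, 2, 3)]
print([round(math.sqrt(e), 4) for e in errors])
```

The apparent (weighted) error can only go down as components are added, which is why the number of components has to be chosen by cross-validation rather than from these apparent values.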
Method for calculating inferred temperatures

Using WA inverse deshrinking models, inferred summer surface water temperatures (°C) for shallow lakes may be calculated as:
$\hat{x}_i = a + b \, \frac{\sum_{k=1}^{m} y_{ik} \hat{u}_k}{\sum_{k=1}^{m} y_{ik}}$ (without tolerance down-weighting)
or
$\hat{x}_i = a + b \, \frac{\sum_{k=1}^{m} y_{ik} \hat{u}_k / \hat{t}_k^2}{\sum_{k=1}^{m} y_{ik} / \hat{t}_k^2}$ (with tolerance down-weighting)

With WA classical deshrinking models, the inferred summer surface water temperatures (°C) are calculated as:
$\hat{x}_i = \left( \frac{\sum_{k=1}^{m} y_{ik} \hat{u}_k}{\sum_{k=1}^{m} y_{ik}} - a \right) / b$ (without tolerance down-weighting)
or
$\hat{x}_i = \left( \frac{\sum_{k=1}^{m} y_{ik} \hat{u}_k / \hat{t}_k^2}{\sum_{k=1}^{m} y_{ik} / \hat{t}_k^2} - a \right) / b$ (with tolerance down-weighting)

Using the WA-PLS models, inferred summer surface water temperatures (°C) may be calculated as:
$\hat{x}_i = \frac{\sum_{k=1}^{m} y_{ik} \hat{\beta}_k}{\sum_{k=1}^{m} y_{ik}}$

where $\hat{x}_i$ is the inferred temperature for sample i, a and b are the intercept and slope for the deshrinking equations, $y_{ik}$ is the abundance (depending on the model, either expressed as a percent of the total identifiable Chironomidae, or as the square-root of this value) of taxon k in sample i, $\hat{u}_k$ is the temperature optimum (°C) of species k, $\hat{\beta}_k$ is the Beta of species k, and $\hat{t}_k$ is the tolerance (°C) of species k (Fritz et al. 1991; ter Braak 1987; Birks, pers. comm.).

LEAVE-ONE-OUT AND TEST SET CROSS-VALIDATION
Performance of WA-PLS in relation to the number of components (s): apparent error (RMSE) and prediction error (RMSEP) in simulated data (R = 1 from simulation series III). The estimated optimum number of components is 3 because three components give the lowest RMSEP in the training set. The last column is not available for real data.

     Training set                        Test set
s    Apparent RMSE   Leave-one-out RMSEP   RMSEP
1    6.14            6.22                  6.61
2    3.37            4.24                  4.40*
3    2.87            4.16*                 4.57
4    2.22            4.65                  4.94
5    2.01            4.65                  5.11
6    1.82            4.50                  5.62
ter Braak & Juggins, 1993

The performance of WA-PLS applied to the three diatom data sets in relation to the number of components (s), in terms of apparent RMSE and leave-one-out RMSEP (* = selected model).
Dataset          s = 1     2        3        4        5        6      Reduction in prediction error (%)
SWAP     RMSE    0.276    0.232    0.194    0.173    0.153    0.134
         RMSEP   0.310*   0.302    0.315    0.327    0.344    0.369    0
Bergen   RMSE    0.353    0.256    0.213    0.192    0.174    0.164
         RMSEP   0.394    0.318*   0.330    0.335    0.359    0.374    19
Thames   RMSE    0.341    0.238    0.196    0.166    0.153    0.140
         RMSEP   0.354    0.279    0.239*   0.224    0.219    0.219    32

Bergen data set: predicted pH and bias as a function of observed pH for components 1 and 2 in WA-PLS. Solid lines represent Cleveland's LOESS scatterplot smooth (1979). 19% gain.
Thames data set: predicted salinity and bias as a function of observed salinity for components 1 and 3 in WA-PLS. Solid lines represent Cleveland's LOESS scatterplot smooth (1979). Salinity is in g l-1 and transformed as log10 (salinity – 0.08). 32% gain.

NW Europe Total P: 152 lakes
The relationship between (a) diatom-inferred TP and (b) residuals (inferred TP – observed TP) and observed TP for the one- and two-component WA-PLS models. Solid lines show LOWESS scatter plot smoothers.
Summary diatom diagram and reconstructed annual mean TP concentrations using one- and two-component WA-PLS models for Lake Søbygård, showing standard errors of prediction for the two-component model. 210Pb dates (AD) are shown on the right hand side.
Bennion et al.
1996

Imbrie & Kipp (1971) data: WA and WA-PLS

ALPE - DIATOM - pH TRAINING SET
Italian and Austrian Alps (Aldo Marchetto & Roland Schmidt)             31
Spanish Pyrenees (Jordi Catalan & Joan Garcia)                          28
ALPE sites (Nigel Cameron & Viv Jones)                                  30
Norway (Frode Berge & John Birks)                                       10
SWAP Norway & UK (Frode Berge, Roger Flower, Viv Jones)                 20
Total                                                                  119
One 'rogue' sample detected, leaving 118 samples
527 diatom taxa
pH 4.48 - 8.04, median 6.10, mean 6.15
Gradient length 5.19 standard deviations

ALPE TRAINING SET - 118 SAMPLES
Square root transformation

WA-PLS components    1        2        3        4        5
RMSE                 0.299    0.178    0.131    0.100    0.075
r2                   0.85     0.96     0.97     0.98     0.99
RMSEP                0.359    0.337    0.331    0.331    0.339
r2 (jack)            0.78     0.81     0.81     0.81     0.80

Select WA-PLS model with 3 components as the simplest model (fewest parameters) that gives the lowest RMSEP.

NORWEGIAN CHIRONOMID – CLIMATE TRAINING SET
Leave-one-out cross-validation
109 samples, RMSEP = 0.89°C, Bias = 0.61°C
[Figure panels: predicted air temperature against observed with 1:1 line; predicted – observed air temperature]
[Figure panels: inferred mean July air temperature; oxygen isotope ratios]

NORWEGIAN POLLEN AND CLIMATE
Precipitation 300 - 3537 mm
Mean July 7.7 - 16.4°C
Mean January -17.8 - 1.1°C
Root mean squared errors of prediction (RMSEP) based on leave-one-out jack-knifing cross-validation for annual precipitation, mean July temperature, and mean January temperature using five different statistical models.

                                        Pptn (mm)   July (°C)   January (°C)
Weighted averaging (WA) (classical)     486.5       1.33        2.86
Weighted averaging (WA) (inverse)       427.2       1.07        2.61
Partial least squares (PLS)             420.1       0.94        2.82
WA-PLS                                  417.5       1.03        2.57
Modern analogue technique (MAT)         385.3       0.91        2.42

Vuoskojaurasj, Abisko, Sweden
Vuoskojaurasj consensus reconstructions
Tibetanus, Abisko Valley: inferred from pollen
Hammarlund et al. 2002
Björnfjelltjörn, N.
Norway
Björnfjelltjörn consensus reconstructions

LINEAR AND UNIMODAL-BASED NUMERICAL METHODS

                                   Response model
Problem                            Linear                                        Unimodal
Regression                         Multiple linear regression                    Weighted averaging (WA) of sample scores
Calibration                        Linear calibration 'inverse regression'       WA of taxa scores and simple two-way WA
                                   Principal components regression               Correspondence analysis regression
                                   Partial least squares (PLS-1)                 WA-PLS (WAPLS-1)
                                   Multivariate calibration (PLS-2)              WAPLS-2
Ordination                         Principal components analysis (PCA)           Correspondence analysis (CA)
Constrained ordination             Redundancy analysis (RDA)                     Canonical correspondence analysis (CCA)
(= reduced rank regression)
Partial ordination                 Partial PCA                                   Partial CA
Partial constrained ordination     Partial RDA                                   Partial CCA

RESPONSE SURFACES
Pollen percentages in modern samples plotted in 'climate space' (cf. Iversen's thermal limit species +/– plotted in climate space).
Contoured, trend-surface analysis (R2): Bartlein et al. 1986
Contoured only: Webb et al. 1987
Reconstruction purposes: grid, analogue matching
Simulation purposes

PROBLEMS
1. Need large high-quality modern data for large geographical areas.
2. No error estimation for reconstruction purposes.
3. Reconstruction procedure 'ad hoc': grid size, etc.

Response surfaces for individual pollen types. Each point is labelled by the abundance of the type. Many points are hidden: only the observation with the highest abundance was plotted at each position. For (a) to (e), '+' denotes 0%, '0' denotes 0-10%, '1' denotes 10-20%, '2' denotes 20-30%, etc. For (f) to (h), '+' denotes 0%, '0' denotes 0-1%, '1' denotes 1-2%, '2' denotes 2-3%, etc. 'H' denotes greater than 10%.

(A) Percentage of spruce (Picea) pollen at individual sites plotted in climate space along axes for mean July temperature and annual precipitation. (B) Grid laid over the climate data to which the pollen percentages are fitted by local-area regression. The box with the plus sign is the window used for local-area regression.
(C) Spruce pollen percentages fitted onto the grid. (D) Contours representing the response surface and pollen percentages shown in part C.

Scatter diagram showing the smoothed distribution of percentages of spruce (Picea) and beech (Fagus) pollen from sediment with modern pollen data in eastern North America when the pollen percentages are plotted at coordinates for modern January and July mean temperature (P.J. Bartlein, unpublished). The arrow indicates the direction and approximate magnitude of temperature change at Montreal since 6000 yr BP.

Elk Lake, Minnesota
Reconstruction produced by the response surface approach

Simulation purposes
Fossil and simulated isopoll map sequences for Betula. Isopolls are drawn at 5, 10, 25, 50 and 75% using an automatic contouring program.
Fossil and simulated isopoll map sequences for Quercus (deciduous). Maps are drawn at 3000-year intervals between 12000 yr BP and the present. The upper map sequence presents the observed fossil and contemporary pollen values. The lower map sequence presents the pollen values simulated by means of the pollen-climate response surface from the climate conditions obtained by applying to the measured contemporary climate the palaeoclimate anomalies that Kutzbach & Guetter (1986) simulated using the NCAR CCM, for 12000 to 3000 yr BP. The map for the present is simulated from the measured contemporary climate. Isopolls are drawn at 2, 5, 10, 25, and 50% using an automatic contouring program.
Huntley 1992

RESPONSE SURFACES - 'Ad hoc'
1. Choice of how much or how little smoothing.
2. Choice of scale of grid for reconstructions.
3. No statistical measure of 'goodness-of-fit'.
4. No reliable error estimation for predicted values.

ANALOGUE-BASED APPROACH
Do an analogue-matching between fossil sample i and available modern samples with associated environmental data. Find the modern sample(s) most similar to i, and infer the past environment for sample i to be the modern environment for those modern samples.
Repeat for all fossil samples.

PROBLEMS
1. Assessment of 'most similar'?
2. 1, 2, 9, 10 most similar?
3. No-analogues for past assemblages.
4. Choice of similarity measure.
5. Require huge set of modern samples of comparable site type, pollen morphological quality, etc., as fossil samples. Must cover vast geographical area.
6. Human impact.

Elk Lake, Minnesota
Reconstructions produced using the analogue approach

MODIFIED MODERN ANALOGUE APPROACHES
Joel Guiot
1. Taxon weighting. Palaeobioclimatic operators (PBO) computed from either a time-series of a fossil sequence or from a PCA of fossil pollen data from a large spatial array of sites. Weights are selected to 'emphasise the climate signal within the fossil data' and to 'highlight those taxa that show the most coherent behaviour in the vegetational dynamics', 'to minimise the human action which has significantly disturbed the pollen spectra', 'to reduce noise'.
2. Environmental estimates are weighted means of estimates based on 20, 40 or 50 or so most similar assemblages.
3. Standard deviations of these estimates give an approximate standard error.

Reconstruction of variations in annual total precipitation and mean temperature expressed as deviations from the modern values (1080 mm and 9.5°C for La Grande Pile; 800 mm and 11°C for Les Echets). The error bars are computed by simulation. The vertical axis is obtained by linear interpolation from the dates indicated in Fig. 2.
Guiot et al. 1989

Cor is the correlation between estimated and actual data. +ME is the mean upper standard deviation associated with the estimates; -ME is the lower standard deviation. These statistics are calculated on the fossil data and on the modern data. In this case, R must be replaced by C.

MODERN ANALOGUE TECHNIQUES FOR ENVIRONMENTAL RECONSTRUCTION = K-NEAREST NEIGHBOURS (K-NN)
MAT, ANALOG, C2
1. Modern data and environmental variable(s) of interest.
Do analog matches and environmental prediction for all samples but with cross-validation jack-knifing. Find the number of analogues that gives the lowest RMSEP for the environmental variable, based on the mean or weighted mean of estimates of the environmental variable. Can calculate bias statistics as well.
2. Reconstruct using fossil data and the 'optimal' number of analogues (lowest RMSEP, lowest bias).
Advise chord distance or chi-squared distance as dissimilarity measure. Optimises signal to noise ratio.

CONSENSUS RECONSTRUCTIONS
Elk Lake climate reconstruction summary. The three series plotted with red, green and blue lines show the reconstructions produced by the individual approaches, the series plotted with the thin black line show the envelope of the prediction intervals, and the series plotted with a thick purple line represents the stacked and smoothed reconstruction of each variable (constructed by simple averaging of the individual reconstructions for each level, followed by smoothing [Velleman, 1980]). The modern observed values (1978-1984) for Itasca Park are also shown.

PLOTTING OF RECONSTRUCTED VALUES
1. Plot the reconstructed values against depth or age; indicate the observed modern value if known.
2. Plot deviations from the observed modern value or the inferred modern value against depth or age.
3. Plot centred values (subtract the mean of the reconstructed values) against depth or age to give relative deviations.
4. Plot standardised values (subtract the mean of the reconstructed values and divide by the standard deviation of the reconstructed values) against depth or age to give standardised deviations.
Add LOESS smoother to help highlight major trends.
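The modern analogue technique outlined in steps 1 and 2 above (choose the number of analogues by leave-one-out RMSEP, then reconstruct) can be sketched as follows. The function names and toy data are hypothetical. The dissimilarity used here is the squared chord distance on proportions, which is an assumption about which variant of 'chord distance' is intended, and the estimate is the unweighted mean of the k closest analogues.

```python
import math

def squared_chord_distance(p, q):
    """Sum of squared differences of square-rooted proportions."""
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))

def proportions(row):
    total = sum(row)
    return [v / total for v in row]

def mat_predict(sample, modern, env, k):
    """Mean environment of the k modern samples most similar to `sample`."""
    ranked = sorted(zip((squared_chord_distance(sample, mod) for mod in modern), env))
    nearest = ranked[:k]
    return sum(e for _, e in nearest) / len(nearest)

def loo_rmsep(modern, env, k):
    """Leave-one-out RMSEP: each modern sample is predicted from all the others."""
    sq = 0.0
    for i in range(len(modern)):
        rest = modern[:i] + modern[i + 1:]
        rest_env = env[:i] + env[i + 1:]
        sq += (env[i] - mat_predict(modern[i], rest, rest_env, k)) ** 2
    return math.sqrt(sq / len(modern))

def choose_k(modern, env, candidates):
    """Number of analogues giving the lowest leave-one-out RMSEP."""
    return min(candidates, key=lambda k: loo_rmsep(modern, env, k))

# Hypothetical toy data: 30 modern samples along a July temperature gradient.
env = [8.0 + 8.0 * i / 29 for i in range(30)]
true_optima = [6.0 + 12.0 * k / 9 for k in range(10)]
modern = [proportions([math.exp(-0.5 * ((t - u) / 2.0) ** 2) for u in true_optima])
          for t in env]

k_best = choose_k(modern, env, range(1, 11))
fossil = proportions([math.exp(-0.5 * ((11.3 - u) / 2.0) ** 2) for u in true_optima])
estimate = mat_predict(fossil, modern, env, k_best)
print(f"k = {k_best}, reconstructed July temperature = {estimate:.2f}")
```

The distance to the closest analogue is also worth keeping for each fossil sample, because it is the quantity that is compared with the percentiles of the modern dissimilarities when assessing analogue quality.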
LOESS smoother

THE SECRET ASSUMPTION OF TRANSFER FUNCTIONS
Telford & Birks (2005) Quaternary Science Reviews 24: 2173-2179
Estimating the predictive power and performance of a training set as RMSEP, maximum bias, r2, etc., by cross-validation ASSUMES that the test set (one or many samples) is INDEPENDENT of the training set (The Secret or Totally Ignored Assumption).
Cross-validation in the presence of spatial autocorrelation seriously violates this assumption.
See Richard Telford's lecture after this lecture.

USE OF ARTIFICIAL, SIMULATED DATA-SETS
SIMULATED DATA-SETS
Generate many training sets (different numbers of samples and taxa, different gradient lengths, varying extent of noise, absences, etc.) and evaluation test sets, all under different species response models.

NO-ANALOG PROBLEM
1. Probably widespread.
2. Does it matter?
3. Analog-based techniques for reconstruction: YES!
   Modern analog technique
   Response surfaces
4. WA and related inverse regression methods
   What we need are 'good' (i.e. reliable) estimates of $\hat{u}_k$. Apply them to the same taxa but in no-analog conditions in the past. Assume that the realised niche parameter $\hat{u}_k$ is close to the potential or theoretical niche parameter $u^*_k$.
   WA and WA-PLS are, in reality, additive indicator species approaches rather than strict multivariate analog-based methods.
5. Simulated data
   ter Braak, 1995. Chemometrics & Intelligent Laboratory Systems 28, 165–180

L-shaped climate configuration of samples (circles) in the training set (Table 3), with x the climate variable to be calibrated and z another climate variable. Also indicated are the regions of the samples in evaluation set A and set C.

Inverse versus classical methods; method-dependent bias in the leave-one-out error estimate. Comparison of the prediction error of inverse (WA-PLS and k-NN) and classical (MLM) approaches in the training set (t) and the three evaluation sets (B, A and C).
Set B is a five-fold replication of t, set A is a subset of t and set C is an extrapolation set. The data are from simulation series 3 of Ref [53], in which species composition is governed by two climate variables (x and z) with an intermediate amount of unimodality (R_x = R_z = 1).
[Figure: training set, set A and set C (the no-analogue test set) plotted on the x and z climate axes, each running from 0 to 100]

Set                     t          B     A     C
Inverse approach
  WA-PLS                2.9 [7-9]  3.0   2.8   5.9
  k-NN                  4.4 [3-5]  2.5   2.9   13.5   a
Classical approach
  MLM w.r.t. x          4.3        4.4   3.5   10.6
  MLM w.r.t. x and z    2.8        3.0   2.9   4.6
ter Braak 1995
Numbers are geometric means of root mean squared errors of prediction of x in four replications. The coefficient of variation of each mean is ca. 10%. The coefficient of variation of the ratio of 2 means within a column is ca. 15%. The range of x is [0, 100]. The number in square brackets is the range of the optimal number of components in WA-PLS and of the optimal number of nearest neighbours in k-NN in the four replicates. k-NN uses Eq. (3) & (5).
a: Significant difference (P<0.01) between leave-one-out validation and validation by the independent evaluation set B.

6. General conclusions from simulated data experiments
   WA, WA-PLS, maximum likelihood and MAT all perform poorly, and no one method performs consistently better than the others.
   For strong extrapolation, WA performed best. WA-PLS appears to deteriorate more quickly than WA with increasing extrapolation.
   Hutson (1977): under no-analog conditions, WA outperformed inverse regression and PCR.
   It is therefore important to assess the analog status of fossil samples as well as to find the ‘best’ training set in terms of RMSEP, bias, etc.
   Dynamic training set concept: find analogues (say 10–20) for each fossil sample, devise a dynamic training set from them, and use linear PLS methods. This avoids edge effects, truncated responses, etc.

MULTIPLE ANALOG PROBLEM
A fossil assemblage is similar to a number of modern samples that differ widely in their modern environment. This happens in pollen studies with training sets covering Europe, N America and parts of Asia.
Only major taxa are included, e.g. Pinus pollen may dominate northern, Mediterranean and southern assemblages.
Constrained analog matching (Guiot): e.g. constrain pollen analog choices on the basis of inferred biome, fossil beetles, or inferred lake-level changes.
Constrained response surfaces (analog matching) (Huntley): e.g. constrain the area of search on the basis of inferred biome or plant macrofossils.
Reconstructed range of July temperatures (°C) at La Grande Pile (Vosges, France) from three methods: a) using beetles alone, b) using pollen alone, c) using pollen constrained by beetles. Guiot et al. 1993

MULTI-PROXY APPROACHES
[Figure: Swiss surface pollen samples, lake sediments. Selected trees and shrubs]
[Figure: Swiss surface lake sediments. Selected herbs and pteridophytes]
Root mean squared errors of prediction (RMSEP) based on leave-one-out (jack-knife) cross-validation for mean summer temperature (June, July, August), mean winter temperature (December, January, February) and mean annual precipitation using a WA-PLS model.

                             RMSEP      R2     No. of comps.
Mean summer temperature      1.252 °C   0.90   3
Mean winter temperature      1.025 °C   0.88   3
Mean annual precipitation    194.1 mm   0.57   2
Modern Swiss pollen - climate

Gerzensee, Bernese Oberland, Switzerland
[Figure: Gerzensee reconstruction with zone labels PB-O, YD-PB Tr, YD, AL-YD Tr, G-O. Lotter et al. 2000]
[Figure: Gerzensee O16/O18, pollen and chydorids. Lotter et al. 2000; Birks & Ammann 2000]

GRADIENT LENGTH (compositional turnover along environmental gradient, SD units) against NOISE OF TRAINING SET DATA †
Gradient length classes: SHORT (<2 SD), linear-based methods; MEDIUM (2-4 SD) and LONG (>4 SD), unimodal-based methods.

NOISE VERY LOW
  Short: Least squares linear regression and calibration (inverse regression) (GLM)
  Medium: Gaussian logit or multinomial regression and calibration (GLM)
  Long: ? Generalised additive models (GAM)
NOISE LOW
  Short: Partial least squares (PLS) regression and calibration
  Medium: Weighted averaging PLS (WA-PLS) regression and calibration
  Long: ? WA-PLS
NOISE MEDIUM
  Short: Partial least squares (PLS) or robust linear regression and calibration
  Medium: Weighted averaging (WA) regression and calibration
  Long: ? WA or WA-PLS
NOISE HIGH
  Short: PCA regression
  Medium: CA-regression
  Long: ? DCA-regression ?
Annotations on the gradient-length classes: IDEAL; PROBLEMS AT GRADIENT ENDS; TRY TO AVOID, CAUSES MULTIPLE ANALOG PROBLEM?

† HOW TO ESTIMATE NOISE IN REAL DATA?
1. Zero values
2. High sample heterogeneity (root mean squared deviation for samples)
3. High taxon tolerances (root mean squared deviation)
4. Rare taxa
5. % variance in Y explained by X, constrained λ1 relative to unconstrained λ2.
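The first four noise indicators in the list above can be computed directly from a training set. The sketch below is illustrative only: the function name, the dictionary keys and the rare-taxon threshold (`rare_max_occurrences`) are hypothetical, taxon tolerances are taken as abundance-weighted root mean squared deviations around weighted-averaging optima, and sample heterogeneity is assumed to be the analogous deviation of taxon optima around each sample's weighted-average estimate. Item 5 needs a constrained ordination and is not covered.

```python
import numpy as np

def noise_indicators(Y, x, rare_max_occurrences=2):
    """Simple noise diagnostics for a training set.

    Y : samples x taxa abundance matrix (every sample and taxon must be non-empty)
    x : environmental value for each sample
    """
    Y = np.asarray(Y, dtype=float)
    x = np.asarray(x, dtype=float)
    present = Y > 0

    # 1. Proportion of zero values in the data matrix
    zero_fraction = 1.0 - present.mean()

    # Weighted-averaging optimum of each taxon
    taxon_total = Y.sum(axis=0)
    optima = (Y * x[:, None]).sum(axis=0) / taxon_total

    # 3. Taxon tolerances: weighted root mean squared deviation around the optimum
    tolerances = np.sqrt((Y * (x[:, None] - optima[None, :]) ** 2).sum(axis=0) / taxon_total)

    # 2. Sample heterogeneity: weighted RMS deviation of taxon optima around
    #    the sample's own weighted-average estimate (assumed definition)
    sample_total = Y.sum(axis=1)
    wa_estimate = (Y * optima[None, :]).sum(axis=1) / sample_total
    heterogeneity = np.sqrt(
        (Y * (optima[None, :] - wa_estimate[:, None]) ** 2).sum(axis=1) / sample_total
    )

    # 4. Rare taxa: taxa occurring in no more than the chosen number of samples
    occurrences = present.sum(axis=0)
    rare = occurrences <= rare_max_occurrences

    return {
        "zero_fraction": float(zero_fraction),
        "tolerances": tolerances,
        "heterogeneity": heterogeneity,
        "n_rare_taxa": int(rare.sum()),
    }

# Tiny hand-checkable example: three samples, two taxa
example = noise_indicators([[1, 0], [1, 1], [0, 1]], [0, 10, 20], rare_max_occurrences=1)
```

In the small example, each taxon occurs in two samples 10 units apart, so both tolerances are 5; only the middle sample contains taxa with different optima, so it is the only sample with non-zero heterogeneity.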