Lecture 7: All of statistics... revisited
Likelihood Methods in Forest Ecology
October 9th – 20th, 2006

Standard statistics revisited (after Bolker)

Standard statistics revisited: Simple variance structures

General linear models
• Predictions are a linear function of a set of parameters.
• Includes:
 - Linear models
 - ANOVA
 - ANCOVA
• Assumptions:
 - Normally distributed, independent errors
 - Constant variance
• Not to be confused with generalized linear models!
• Note the distinction between factors and covariates.

Linear regression
Y = a + bX + ε,  ε ~ N(0, σ²)
Standard R code:
>lm.reg<-lm(Y~X)
>summary(lm.reg)
>anova(lm.reg)
Likelihood R code:
>lmfun<-function(a, b, sigma) {
   Y.pred<-a+b*X
   -sum(dnorm(Y, mean=Y.pred, sd=sigma, log=TRUE))
 }

Analysis of variance (ANOVA)
Y_ij = μ + α_i + ε_ij,  ε_ij ~ N(0, σ²)
Standard R code:
>lm.aov<-lm(Y~f1)
>summary(lm.aov)
>anova(lm.aov)   # will give you an ANOVA table
Likelihood R code (two groups):
>aovfun<-function(a11, a12, sigma) {
   Y.pred<-c(a11,a12)[f1]
   -sum(dnorm(Y, mean=Y.pred, sd=sigma, log=TRUE))
 }

Analysis of variance (ANOVA): Hilborn & Mangel (p. 177)
Individual and cage effects on fly wing length
[Table 7.5: left and right wing length for each female fly, grouped by cage]
Compare:
• the likelihood of the mean-only model
• the likelihood of the cage model
• the likelihood of the individual-fly model

Analysis of covariance (ANCOVA)
Y_i = a_j + b_j X_i + ε_i for group j,  ε_i ~ N(0, σ²)
Standard R code:
>lm.anc<-lm(Y~f*X)
>summary(lm.anc)
>str(summary(lm.anc))
Likelihood R code:
>ancfun<-function(a11, a12, slope1, slope2, sigma) {
   Y.pred<-c(a11,a12)[f] + c(slope1, slope2)[f]*X
   -sum(dnorm(Y, mean=Y.pred, sd=sigma, log=TRUE))
 }

Standard statistics revisited: Non-linearity
Non-linear least squares
Y_i = a X_i^b + ε_i,  ε_i ~ N(0, σ²)
Uses numerical methods similar to those used in likelihood estimation.
Standard R code:
>nls.fit<-nls(y~a*x^b, start=list(a=1, b=1))
>summary(nls.fit)
>str(summary(nls.fit))
Likelihood R code:
>nlsfun<-function(a, b, sigma) {
   Y.pred<-a*x^b
   -sum(dnorm(y, mean=Y.pred, sd=sigma, log=TRUE))
 }

Standard statistics revisited: Generalized linear models
• Assumptions:
 - Non-normally distributed errors (but still independent, and only certain kinds of non-normality).
 - Non-linear relationships are allowed, but only if they have a linearizing transformation (the link function).
• Examples of linearizing transformations:
 - y = e^x linearizes as x = log(y)
 - y = e^x / (1 + e^x) linearizes as x = log(y / (1 - y))
• The permitted error distributions belong to the exponential family, and each is typically paired with a specific link function:
 - Poisson: log link
 - Binomial: logit link
 - Gamma: inverse link
• Fit by iteratively reweighted least squares: the variance associated with each point is re-estimated at each update of the parameter estimates.
• Not to be confused with general linear models!
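The likelihood R code above defines negative log-likelihood functions, but the slides never show the minimization step. Below is a minimal sketch of that step for the linear-regression case using optim(); the simulated data, the single-vector parameterization, and the log-sigma trick are illustrative choices, not part of the lecture.

# Minimal sketch: minimize the negative log-likelihood of the simple linear
# regression with optim(). Data are simulated so the example is self-contained.
set.seed(1)
X <- runif(50, 0, 10)
Y <- 2 + 0.5*X + rnorm(50, sd=1)

# optim() wants a single parameter vector; sigma is optimized on the log
# scale so that it cannot wander into negative values during the search.
nll.reg <- function(par) {
  a <- par[1]; b <- par[2]; sigma <- exp(par[3])
  Y.pred <- a + b*X
  -sum(dnorm(Y, mean=Y.pred, sd=sigma, log=TRUE))
}

fit <- optim(par=c(a=0, b=0, logsigma=0), fn=nll.reg)
fit$par[1:2]      # ML estimates of a and b
coef(lm(Y~X))     # for comparison: lm() gives essentially the same a and b

The same pattern, write the negative log-likelihood and hand it to a numerical optimizer, applies to every likelihood function in this lecture; a package such as bbmle (mle2) wraps this step and adds standard errors and profile confidence intervals.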
GLM: Poisson regression
E[Y] = e^(a + bX)  (log link)
Standard R code:
>glm.pois<-glm(Y~X, family=poisson)
>summary(glm.pois)
Likelihood R code:
>poisregfun<-function(a, b) {
   Y.pred<-exp(a+b*X)
   -sum(dpois(Y, lambda=Y.pred, log=TRUE))
 }

GLM: Logistic regression
p = e^(a + bX) / (1 + e^(a + bX));  link function: logit(p) = log(p / (1 - p)) = a + bX
Standard R code (Y successes out of N trials):
>glm2<-glm(cbind(Y, N-Y)~X, family=binomial)
>summary(glm2)
Likelihood R code:
>logregfun<-function(a, b) {
   p.pred<-exp(a+b*X)/(1+exp(a+b*X))
   -sum(dbinom(Y, size=N, prob=p.pred, log=TRUE))
 }

Standard statistics revisited: Generalized least squares
Generalized (non)linear least-squares models: the variance changes with a covariate or among groups.
y_i = c + ε_i,  ε_i ~ N(0, σ_i²)
Standard R code (gls() is in the nlme package):
>library(nlme)
>gls.fit<-gls(y~1, weights=varIdent(form=~1|f))
>summary(gls.fit)
Likelihood R code (a separate standard deviation for each of two groups):
>vardifffun<-function(a, sd1, sd2) {
   sdval<-c(sd1,sd2)[f]
   -sum(dnorm(y, mean=a, sd=sdval, log=TRUE))
 }

Standard statistics revisited: Complex variance structures

Complex error structures
• The errors are not independent.
• The likelihood functions become more complex.
• Includes:
 - Time series analysis
 - Spatial correlation
 - Repeated measures analysis

The likelihood is built from the vector of data, the vector of means (the predictions), and the variance-covariance matrix:
• Independent errors with constant variance: each observation contributes exp(-(x_i - μ_i)² / (2σ²)).
• Increasing variance: each observation contributes exp(-(x_i - μ_i)² / (2σ_i²)).
• General case: the full multivariate normal density with variance-covariance matrix Σ.

Complex error structures
• The variance-covariance matrix is symmetric, so it has at most n(n+1)/2 distinct parameters (n variances plus n(n-1)/2 covariances).
• The matrix must also be positive definite (a logical requirement, since variances must be positive); equivalently, all of its eigenvalues must be positive.
• In practice, we select the elements of the matrix that define the error structure and ensure that the matrix remains positive definite, e.g. by constraining a correlation parameter to |ρ| < 1.
• In the example below, the correlation drops off with the number of steps between sites.

Complex error structures: an example, spatially correlated errors
R code (dmvnorm() is in the mvtnorm package):
>library(mvtnorm)
>rho<-0.5
>m<-matrix(nrow=5, ncol=5)
>m<-rho^(abs(row(m)-col(m)))      # correlation decays with the number of steps between sites
#OR, correlation only between adjacent sites:
>m<-diag(5)
>m[abs(row(m)-col(m))==1]<-rho
>mvlik<-function(a, b, rho) {
   mu<-a+b*x
   n<-length(x)
   m<-diag(n)                     # identity matrix with n rows and n columns
   m[abs(row(m)-col(m))==1]<-rho  # correlation rho between adjacent sites
   -dmvnorm(y, mean=mu, sigma=m, log=TRUE)
 }

Mixed models & generalized linear mixed models (GLMMs)
• Samples within a group (block, site) are equally correlated with each other.
• Fixed effects: the effects of covariates.
• Random effects: block, site, etc.
• GLMMs are generalized linear models with random effects.

Complex variance structures (after Bolker)
• So how do you incorporate all of the potential sources of variance?
 - Block effects
 - Individual effects (repeated measures involve both individual and temporal correlation)
 - Measurement vs. process error
 - ...

Analyses of experimental data
[Figure (Osenberg et al. 2002): whether density dependence (DD) is detected or goes undetected depends on the ambient density relative to a threshold and on the natural variation (variance)]

Why the variation in experimental conclusions?
• Inference derived from p-values
• No effect sizes (strength of the process)
• No per-capita effects
• Differences in duration between studies
• Differences in spatial extent between studies
Approach: analyze the data from all of the studies using a single equation.
Results:
• No difference in per-capita effects
• The differences are due to initial density
So... beware of experiments!

Evidence of a subterranean trophic cascade (HSS)
Strong et al. 1999 Ecology
• Lupine; the response is survival (1/0), with 15 reps of each treatment.
• Root-feeding caterpillars at densities x = 0, 8, 16, 32.
• Nematodes present or absent.
• Model components: base mortality, a caterpillar effect, a nematode effect.
Hierarchical structure of the analysis:
• i = 1 (nematodes present), 2 (nematodes absent); j = caterpillar treatment.
• The number of seedlings dead in treatment ij is a binomial random variable.
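The slides describe the hierarchical binomial model for the lupine data but do not show a likelihood function for it. Here is a minimal sketch of one plausible parameterization, a logistic model in which the caterpillar effect is allowed to differ between nematode treatments; the simulated data frame dat, its column names, the parameter names, and the values used in the simulation are all assumptions for illustration, not the analysis of Strong et al.

# Sketch of a binomial negative log-likelihood for the lupine survival data.
# Simulated stand-in for the real data (hypothetical values):
#   cat = caterpillar density, nem = 1 (nematodes present) or 2 (absent),
#   n   = seedlings per replicate, dead = number that died.
set.seed(3)
dat <- expand.grid(rep=1:15, cat=c(0, 8, 16, 32), nem=1:2)
dat$n    <- 1
dat$dead <- rbinom(nrow(dat), size=dat$n,
                   prob=plogis(-2 + c(0.02, 0.08)[dat$nem]*dat$cat))

# Mortality probability on the logit scale: base mortality b0 plus a
# caterpillar effect (b1 with nematodes present, b2 with nematodes absent).
nll.lupine <- function(b0, b1, b2) {
  beta   <- c(b1, b2)[dat$nem]
  p.dead <- plogis(b0 + beta*dat$cat)
  -sum(dbinom(dat$dead, size=dat$n, prob=p.dead, log=TRUE))
}

library(bbmle)
fit.full <- mle2(nll.lupine, start=list(b0=qlogis(0.3), b1=0, b2=0), method="BFGS")
fit.null <- mle2(nll.lupine, start=list(b0=qlogis(0.3), b1=0, b2=0),
                 fixed=list(b1=0, b2=0), method="BFGS")
AICtab(fit.full, fit.null)   # compare competing models, as in the model-selection step below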
Model selection: results
• 47% died in the absence of nematodes.
• 11% died in the presence of nematodes.

Model selection
• Four candidate models.
• The hypothesis of interest: β2 is positive (a negative effect of nematodes) and differs from β1, which is either 0 or less than β2.

Traditional approach: logistic regression
• Beware of canned packages! You need to work out the hierarchical error structure yourself when testing complex hypotheses.

Take-home points
• Non-linear effects and non-normal response variables will often cause problems with canned packages.
• Focus on model construction, parameter estimation, and model evaluation.
• Represent the variability in your data using the appropriate probability function.

Predator-induced hatching plasticity (Vonesh & Bolker 2004)

Trait-mediated predator effects
• Density effects: consumptive effects resulting from predators killing prey (affecting density).
• Trait effects: non-consumptive effects resulting from changes in prey behavior or morphology in response to predation risk (e.g., changes in growth rates).
(Luttbeg & Kerby 2005; Preisser et al. 2005)

Predator-mediated plasticity in anurans
• Prey respond to predators by changing their behavior, morphology, and life history.
• The timing of habitat shifts, metamorphosis, and hatching involves a change of habitat and, often, a change in the suite of predators.
• The timing of the transition between two life stages should therefore evolve in response to variation in growth and mortality among life-history stages.

Postponement of hatching in response to predators may:
• allow hatchlings to reach a greater body size before encountering predators, thus increasing their survival;
• but... predation risk may differ between life stages (i.e., terrestrial vs. aquatic predators), so it may instead be best to hatch early.
• How are these trade-offs resolved?

The study system

Predator effects on the terrestrial stage
• Both frogs and flies cause embryos to hatch approximately 30% earlier.
• Early hatchlings have lower weights and are at earlier developmental stages.
• Frogs can reduce the density of tadpoles entering the pond by 60%; flies have a much smaller effect.
• So both size and density change over time!

Experiment I: Quantifying the functional response
• Vary larval density in the presence and absence of the dragonfly (the aquatic predator), keeping size fixed.
• Scientific model: the functional response, i.e., mortality as a function of density. Its ingredients are the number of prey eaten in t days, the number of predators, the attack rate, and the handling time.
• Assume that the number actually attacked follows a binomial distribution, with p = the probability of an individual being killed over the course of the experiment.
• Obtain the estimates of the attack rate and handling time that maximize the likelihood.

Experiment II: Effect of larval size on predation risk
• Expose five larval age/size classes to the aquatic predator, keeping density fixed.
• Dragonflies were replaced daily to keep predator densities constant.
• Scientific model, part II: size-specific mortality, a phenomenological model for the size-specific predation probability; the function peaks at intermediate prey sizes.
• Assume that the number killed follows a binomial distribution with this probability.

Combining size- and density-dependent mortality to predict attack rate
• Two tricks:
 - Convert the functional response into the number of prey eaten per predator per day; this becomes the risk of predation at density N.
 - Evaluate that risk at the density used in the size experiment.
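The slides state the functional-response model verbally; the sketch below shows what the corresponding binomial likelihood might look like, using a simple Holling type II form for the per-capita risk over the experiment. The published analysis accounts for prey depletion during the trial, so treat this as a simplified illustration; the predator number, duration, densities, and parameter values are all assumed here, not taken from the study.

# Sketch: binomial likelihood for the density-dependence experiment.
# p.kill = probability an individual is killed over the experiment, from a
# simplified Holling type II risk (attack rate a, handling time h), ignoring
# prey depletion. P = predators per tank, Tdays = duration (assumed values).
set.seed(2)
P      <- 1
Tdays  <- 14
N0     <- rep(c(10, 20, 40, 80), each=4)          # initial larval densities (assumed)
killed <- rbinom(length(N0), size=N0,
                 prob=1 - exp(-0.5*P*Tdays/(1 + 0.5*0.2*N0)))   # simulated "data"

nll.funcresp <- function(log.a, log.h) {
  a <- exp(log.a); h <- exp(log.h)                # log scale keeps a and h positive
  p.kill <- 1 - exp(-a*P*Tdays/(1 + a*h*N0))      # per-capita risk at density N0
  -sum(dbinom(killed, size=N0, prob=p.kill, log=TRUE))
}

library(bbmle)
fit <- mle2(nll.funcresp, start=list(log.a=log(0.1), log.h=log(0.1)))
summary(fit)      # confint(fit) would give profile confidence intervals

The size-specific risk from Experiment II can be handled the same way: replace the constant attack rate with a function of prey size and maximize the corresponding binomial likelihood.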
Monte Carlo methods
• The population of interest is simulated.
• Repeated samples are drawn from the pseudo-population.
• The statistic (parameter) is computed in each pseudo-sample.
• The sampling distribution of the statistic is examined.
• Where do the true parameters fall within this distribution?

Basic procedure
1. Calculate predicted values from known parameter values (these may also be calculated from data).
2. Add random error to the predicted values to create "observed" values.
3. Estimate the parameter values given the observed and predicted values.
4. Go back to step 2 and loop through 100-1000 times.
5. Examine the frequency distribution of the estimated parameters of interest.

Describing the distribution of the predicted variable
• Vonesh & Bolker obtain parameter estimates for their model, with confidence intervals and a variance-covariance matrix.
• Draw repeatedly from these distributions.
• Simulate larval growth and survival from the estimated parameters and the error around the estimates (the variance-covariance matrix).
• Generate the expected distribution of the variable of interest (the number killed).
• This can be done with just one set of data analyzed in a traditional framework.

Measurement & observation error (after Schnute)
• Two extreme cases: no process uncertainty (all of the error is measurement error) versus X measured perfectly (all of the error is process error).

Why should we care? (after Bolker)
• Measurement error only affects the current measurement.
• Process error propagates through time.
• This is a big deal in dynamic models.

A famous example
• Pascual & Kareiva (1996) fit Lotka-Volterra models to Gause's data and re-examined the traditional conclusion.

Observation & process error
• Process uncertainty: random events cause the response variable to change in ways that are not predicted by the model; these may be errors in the process itself or in its drivers. Process error propagates through time.
• Observer uncertainty: error in sampling due to measurement, i.e., error in the predicted variable. Observation error does not propagate through time.

Fitting under observation error
• We only need the initial conditions.
• Simulate the entire trajectory from the initial conditions and contrast it with the actual data.
• Minimize the difference between the observed and predicted data.
• This often involves non-linear minimization.

Fitting under process error
• We need the complete series of observations.
• Take the observation at each time step, N(t), and predict just the next step, N(t+1).
• For estimation, fit a regression between N(t+1) and N(t).
• Minimize the difference between the observed and predicted N(t+1).
• This is one-step-ahead fitting, a (linear) regression approach.

Estimation
• To estimate both observation and process error we need independent estimates of either:
 - the magnitude of the errors, OR
 - their relative sizes.
• Otherwise we have to choose between the two.
• Fitting under the assumption of observation error provides unbiased and more precise parameter estimates even when the data contain only process error; however, it produces downward-biased estimates of the variance.
• If the two kinds of error are uncorrelated, the two pure fits give the extreme values of the possible parameter estimates.

Statistical flip-flopping
• We often use MLEs to estimate parameters, although information theory does not have this requirement.
• An AIC value does not say anything about our confidence (or error) in a parameter estimate.
• Therefore, we resort to frequentist statistics to generate 95% confidence intervals.
• An alternative is to generate a true likelihood profile and use the chi-square cutoff, but ultimately this also produces a p-value.
• The only fully consistent statistical logic is Bayesian.

Philosophy vs. pragmatism
• It is useful to have a broader, more encompassing philosophy, but...
• greater generality often implies greater complexity, often computational and mathematical.
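As a closing illustration of the Monte Carlo procedure described above (draw parameter sets from the fitted estimates and their variance-covariance matrix, then push each draw through the model), here is a minimal sketch that re-uses the functional-response fit sketched earlier; the objects fit, P, Tdays and the parameter names log.a and log.h are assumed to come from that sketch, and the new density N.new is arbitrary.

# Sketch: propagate parameter uncertainty by Monte Carlo.
library(MASS)    # for mvrnorm()

nsim  <- 1000
draws <- mvrnorm(nsim, mu=coef(fit), Sigma=vcov(fit))   # draws from the estimated
                                                        # sampling distribution of
                                                        # the parameters
# Expected number killed at a new density, for each parameter draw
N.new <- 30
pred  <- apply(draws, 1, function(par) {
  a <- exp(par["log.a"]); h <- exp(par["log.h"])
  N.new*(1 - exp(-a*P*Tdays/(1 + a*h*N.new)))
})

quantile(pred, c(0.025, 0.975))   # approximate 95% interval for the prediction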