Introduction to Biostatistical Analysis Using R
Statistics course for PhD students in Veterinary Sciences

Session 3
Lecture: Analysis of Variance (ANOVA)
Practical: ANOVA

Lecturer: Lorenzo Marini, PhD
Department of Environmental Agronomy and Crop Production, University of Padova,
Viale dell'Università 16, 35020 Legnaro, Padova.
E-mail: [email protected]
Tel.: +39 0498272807
http://www.biodiversity-lorenzomarini.eu/

Statistical modelling: more than one parameter
Nature of the response variable:
- NORMAL (continuous) -> General Linear Models
- POISSON, BINOMIAL, ... -> Generalized Linear Models (GLM)
Nature of the explanatory variables:
- Categorical -> ANOVA (Session 3)
- Continuous -> Regression (Session 4)
- Categorical + continuous -> ANCOVA

ANOVA: aov()
ANOVA tests mean differences between groups defined by categorical variables.
- One-way ANOVA: ONE factor with 2 or more levels (e.g. diet: 4 levels)
- Multi-way ANOVA: 2 or more factors, each with 2 or more levels (e.g. drug: 4 doses x sex: ♀ ♂)

ASSUMPTIONS
- Independence of cases: this is a requirement of the design.
- Normality: the distribution in each cell is normal [hist(), qq.plot(), shapiro.test()].
- Homogeneity of variances: the variance of the data in the groups should be the same [fligner.test()].

One-way ANOVA step by step
1. Test normality (also after model fitting)
2. Test homogeneity of variances
3. Run the ANOVA and fit the maximal model
4. Reject/accept the H0 that all the means are equal
Then, 2 approaches:
5A. Multiple comparisons to test differences between the levels of the factor
5B. Model simplification working with contrasts (merge similar factor levels)
6.
Minimum Adequate Model (MAM)

One-way ANOVA
Body weight: 4 diets (k = 4). One-way ANOVA is used to test for differences among two or more independent groups.
y: body weight (NORMAL, CONTINUOUS)
x: diet (CATEGORICAL, four levels: x1, x2, x3, x4)
H0: µ1 = µ2 = µ3 = µ4
H1: not all the means are equal

ANOVA model (with treatment contrasts):
yi = a + b·x2 + c·x3 + d·x4
a = µ1, b = µ2 − µ1, c = µ3 − µ1, d = µ4 − µ1

One-way ANOVA: the data

         Observations                      ni    µi
Diet 1   6.08  5.70  6.50  5.86  6.17      5     6.06
Diet 2   6.87  6.77  7.40  6.63  6.98      5     6.93
Diet 3   10.26 10.21 10.02 9.65            4     10.03
Diet 4   8.79  8.42  8.31  8.57  9.03      5     8.62

Number of groups: k = 4
Number of observations: N = 19
Grand mean = 7.80

Sums of squares (SS, deviance) and degrees of freedom (df):
SS Total = Σ(yi − grand mean)²                       df Total: N − 1
SS Factor = Σ ni(group meani − grand mean)²          df Factor: k − 1
SS Error (within groups) = Σ(yi − group meani)²      df Error: N − k

One-way ANOVA: SS explanation
SS Factor measures the variability among groups (group means around the grand mean); SS Error measures the variability within groups (observations around their group means).
SS Total = SS Factor + SS Error
[Figure: two panels comparing the case with no within-group variability (SS Total = SS Factor) with the general case (SS Total = SS Factor + SS Error); the group means µ1-µ4 and the grand mean are marked. The pseudo-replication would work here!!!]

Each SS can be divided by its respective df to get a variance, the mean squared deviation:
MS = SS / df  ->  MS Factor and MS Error

One-way ANOVA: F test (variance ratio)
F = MS Factor / MS Error
If the k means are not equal, then the Factor MS in the population will be greater than the population's Error MS. How to define the correct F test can be a difficult task with complex designs (BE EXTREMELY CAREFUL!!!!).
If F calculated > F critical, then we can reject H0. All we can conclude is that not all the k population means are equal, i.e. at least one mean differs!!!
Two follow-ups: A POSTERIORI MULTIPLE COMPARISONS, or MODEL SIMPLIFICATION WORKING WITH CONTRASTS.

One-way ANOVA: contrasts
Contrasts are the essence of hypothesis testing and model simplification in analysis of variance and analysis of covariance.
They are used to compare means or groups of means with other means or groups of means. We use contrasts to carry out t tests AFTER having found a significant effect with the F test.
- We can use contrasts in model simplification (merging similar factor levels)
- Often we can avoid post-hoc multiple comparisons
- We need to specify contrasts a priori

One-way ANOVA: multiple comparisons
If F calculated > F critical, then we can reject H0. All we can conclude is that not all the k population means are equal: at least one mean differs!!!
A POSTERIORI MULTIPLE COMPARISONS (lots of methods!!!)
Multiple comparison procedures are then used to determine which means differ from which. Comparing k means involves k(k − 1)/2 pairwise comparisons. E.g. Tukey-Kramer, Duncan, Scheffé, LSD.

One-way ANOVA: multiple comparisons
Ordinarily, there is a problem with conducting several comparisons in a row: with each additional test, it becomes more likely that you will obtain one statistically significant result just by chance. Think of it as a slot machine. If you pull the slot machine arm 4 times, you are 4 times as likely to hit the jackpot in a completely random process. If you do 4 tests of statistical significance, you are 4 times as likely to obtain one p < 0.05 result when there is no real difference between your means. What every correction method does is adjust the obtained significance level (p-value) to make it harder for you to obtain p < 0.05. This is like pulling the slot machine handle 4 times and having the slot machine say: "I know you just tried 4 times, so I'm making the odds of winning harder."
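The one-way workflow above (assumption checks, F test, then a posteriori comparisons) can be sketched in R using the four-diet body-weight data from the table; fligner.test(), aov(), shapiro.test() and TukeyHSD() are all base-R functions, and the object names are only illustrative.

```r
# Body weight under 4 diets (values from the table above)
weight <- c(6.08, 5.70, 6.50, 5.86, 6.17,    # Diet 1 (n = 5)
            6.87, 6.77, 7.40, 6.63, 6.98,    # Diet 2 (n = 5)
            10.26, 10.21, 10.02, 9.65,       # Diet 3 (n = 4)
            8.79, 8.42, 8.31, 8.57, 9.03)    # Diet 4 (n = 5)
diet <- factor(rep(paste0("diet", 1:4), times = c(5, 5, 4, 5)))

fligner.test(weight ~ diet)        # homogeneity of variances
fit <- aov(weight ~ diet)          # fit the one-way ANOVA
shapiro.test(residuals(fit))       # normality, checked after fitting
summary(fit)                       # F test of H0: all means equal
TukeyHSD(fit)                      # a posteriori pairwise comparisons
```

If the F test rejects H0, TukeyHSD() reports which pairs of diet means differ, with p-values already adjusted for the k(k − 1)/2 = 6 pairwise comparisons.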
One-way ANOVA: nonparametric
If the assumptions are seriously violated, one can opt for a nonparametric ANOVA. However, one-way ANOVA is quite robust even under non-normality and non-homogeneity of variances.
Kruskal-Wallis test: kruskal.test() (if k = 2, it corresponds to the Mann-Whitney test). It is an ANOVA by ranks; if there are tied ranks, a correction term must be applied.

Multi-way ANOVA
Factorial ANOVA is used when the experimenter wants to study the effects of two or more treatment variables.
ASSUMPTIONS
- Independence of cases: this is a requirement of the design
- Normality: the distributions in each of the groups are normal
- Homogeneity of variances: the variance of the data in the groups should be the same
+ Equal replication (BALANCED AND ORTHOGONAL DESIGN)
[Table: a balanced low vs. high temperature x three doses design with 10 observations per cell; a cell with only 8 observations breaks the balance.]
If you use traditional general linear models, even a single missing value can strongly affect the results.

Fixed vs. random factors
If we consider more than one factor, we have to distinguish two kinds of effects:
Fixed effects: factors are specifically chosen and under control; they are informative (e.g. sex, treatments, wet vs. dry, doses, sprayed or not sprayed).
Random effects: factors are chosen randomly within a large population; they are normally not informative (e.g. fields within a site, blocks within a field, split-plots within a plot, family, parent, brood, individuals within repeated measures).
Random effects occur in two contrasting kinds of circumstances:
1. Observational studies with hierarchical structure
2. Designed experiments with different spatial or temporal scales (dependence)

Fixed vs. random factors
Why is it so important to identify fixed vs. random effects? They affect the way to construct the F-test in a multifactorial ANOVA.
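The Kruskal-Wallis alternative mentioned above uses the same formula interface as aov(); a minimal sketch with the four-diet body-weight data from the one-way example (R's kruskal.test() applies the tie correction automatically):

```r
# Nonparametric one-way ANOVA by ranks (Kruskal-Wallis)
weight <- c(6.08, 5.70, 6.50, 5.86, 6.17,    # Diet 1
            6.87, 6.77, 7.40, 6.63, 6.98,    # Diet 2
            10.26, 10.21, 10.02, 9.65,       # Diet 3
            8.79, 8.42, 8.31, 8.57, 9.03)    # Diet 4
diet <- factor(rep(paste0("diet", 1:4), times = c(5, 5, 4, 5)))

kw <- kruskal.test(weight ~ diet)  # chi-squared approximation, k - 1 df
kw
```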
Their wrong identification leads to wrong conclusions. You can find how to construct your F-test with different combinations of random and fixed effects and with different hierarchical structures (choose a well-known sampling design!!!). In the one-way ANOVA only the way we formulate our hypothesis changes, not the test.
If we have both fixed and random effects, then we are working with MIXED MODELS:
yi = µ + αi (fixed) + ri (random) + εi

Factorial ANOVA: two or more factors
Factorial design: two or more factors are crossed. Each combination of the factors is equally replicated, and each level of each factor occurs in combination with every level of the other factors.
[Table: orthogonal sampling of 3 diets x 4 supplements, with 10 replicates in each of the 12 cells.]

Factorial ANOVA: why?
Why use a factorial ANOVA? Why not just use multiple one-way ANOVAs? With n factors, you would need to run n one-way ANOVAs, which would inflate your α-level (although this could be corrected with a Bonferroni correction). The best reason is that a factorial ANOVA can detect interactions, something that multiple one-way ANOVAs cannot do.

Factorial ANOVA: interactions
E.g. we are testing two factors, Gender (male and female) and Age (young, medium, and old), and their effect on performance. If males' performance differed as a function of age, i.e.
males performed better or worse with age, but female performance was the same across ages, we would say that Age and Gender interact, or that we have an Age x Gender interaction.
Interaction: when the effects of one independent variable differ according to the levels of another independent variable.
[Figure: performance vs. age for females and males; it is necessary that the slopes differ from one another.]

Factorial ANOVA: main effects
Main effects: the effect of a factor is independent of any other factor. This is what we were looking at with one-way ANOVAs: if we have a significant main effect of our factor, then we can say that the mean of at least one of the groups/levels of that factor is different from at least one of the other groups/levels.
[Figure: performance vs. sex and performance vs. age (young, medium, old); it is necessary that the intercepts differ.]

Factorial ANOVA: two-crossed-factor design
Two crossed factors: every level of each factor occurs in combination with every level of the other factor.
Model 1: two fixed effects
Model 2: two random effects (uncommon situation)
Model 3: one random and one fixed effect
We can test main effects and interaction:
1. The main effect of each factor is the effect of that factor independent of (pooling over) the other factors
2.
The interaction between factors is a measure of how the effects of one factor depend on the levels of one or more additional factors (synergistic or antagonistic effects of the factors): Factor 1 x Factor 2. We can only measure interaction effects in factorial (crossed) designs.

Factorial ANOVA: two-crossed fixed-factor design
Two crossed fixed effects:
Response variable: weight gain in six weeks
Factor A: DIET (3 levels: barley, oats, wheat)
Factor B: SUPPLEMENT (4 levels: S1, S2, S3, S4)
DIET x SUPPLEMENT = 3 x 4 = 12 combinations
We have 48 horses to test our two factors: 4 replicates for each combination (barley+S1, barley+S2, ..., wheat+S4). The 48 horses must be independent units to be replicates.

Mean weight gain per cell:

      Barley   Oats    Wheat
S1    26.34    23.29   19.63
S2    23.29    20.49   17.40
S3    22.46    19.66   17.01
S4    25.57    21.86   19.66

F tests for main effects and interaction (all fixed effects):
F_D = MS DIET / MS error
F_S = MS SUPPLEMENT / MS error
F_DxS = MS DIETxSUPPLEMENT / MS error

Two-crossed-factor design: examples of possible ANOVA results
[Figure: five panels of mean y vs. treatment (3 levels) for clone A and clone B, each with the significance of the Clone, Treat and C x T effects: all n.s. (worst case); only Clone < 0.05; only Treat < 0.05; Clone and Treat < 0.05, interaction n.s.; Clone, Treat and C x T all < 0.05.]

Mixed models
Mixed models mix what?
- Fixed effects
- Random effects
- E.g. maybe the residuals are not independent: common with repeated measurements of the same individual and with hierarchical sampling

Mixed models
1) Hierarchical structure
- Example: if collecting data from different medical centers, center might be thought of as random.
- Example: if surveying animals, they can be clustered into cohorts; cohort is random.
2) Longitudinal studies
- Example: repeated measurements are taken over time for each subject; subject is random.
In all these cases, it is not generally reasonable to assume that observations within the same group are independent.

Mixed models: split-plot
We can consider random factors to account for the variability related to the environment in which we carry out the experiment. Mixed models can deal with spatial or temporal dependence. E.g. the SPLIT-PLOT design is one of the most useful designs: the different treatments are applied to plots of different size organized in a hierarchical structure.

Mixed model: split-plot
E.g. 8 cages of rats; Treatment (control, dose1, dose2) applied to cages; Organ (liver, heart or kidney) sampled within each rat.
Response variable: metabolite concentration
Fixed effects: Treatment (control, dose1, dose2); Organ (liver, heart or kidney)
Random effects: rats within cage; organs within rat

Mixed model: aov()
Model formulation and simplification:
y ~ fixed effects + error terms
y ~ a*b*c + Error(a/b/c)
In the Error() term you specify your sampling hierarchy, from the uninformative (cages) to the informative (treatment, organ) levels:
Growth ~ treatment*organ + Error(cage/treatment/organ)

Mixed models: tradition vs. REML
Mixed models using traditional ANOVA require a perfectly orthogonal and balanced design (THEY WORK WELL WITH THE PROPER SAMPLING): avoid multi-way ANOVA in non-orthogonal sampling designs. If something has gone wrong with the sampling, in R you can run mixed models with missing data and unbalanced (non-orthogonal) designs using REML estimation in the lme4 package.

Mixed model: REML
- REML: Residual Maximum Likelihood (vs. Maximum Likelihood)
- Handles unbalanced, non-orthogonal designs with multiple sources of error
- Packages: nlme (quite old); new alternative: lme4

Mixed model: REML
When should I use REML? For balanced data, both ANOVA and REML will give the same answer.
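The split-plot formulation above can be sketched with aov() and an Error() term. The data here are simulated purely for illustration (3 treatments applied to whole cages, 2 cages per treatment, 2 rats per cage, 3 organs per rat), so the variable names and sample sizes are assumptions, not the lecture's data set.

```r
set.seed(1)
d <- expand.grid(organ = c("liver", "heart", "kidney"),
                 rat   = paste0("r", 1:2),       # rats nested within cages
                 cage  = paste0("c", 1:6))
d$treatment <- factor(rep(c("control", "dose1", "dose2"),
                          each = nrow(d) / 3))   # one treatment per pair of cages
d$conc <- rnorm(nrow(d), mean = 10)              # simulated metabolite concentration

# Error(cage/rat): cage is the largest (uninformative) stratum, rats are
# nested within cages; organ varies within rat, so the organ effect and the
# treatment:organ interaction are tested in the lowest (Within) stratum
fit <- aov(conc ~ treatment * organ + Error(cage / rat), data = d)
summary(fit)
```

With unbalanced data the same model could instead be fitted by REML, e.g. lme4::lmer(conc ~ treatment * organ + (1 | cage/rat), data = d).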
However, the ANOVA algorithm is much more efficient and so should be used whenever possible.
[Decision tree for the most efficient analysis, based on two questions: Are all factor combinations present and the sample sizes equal? Can you identify blocks (or a hierarchy of blocks) of similar experimental units? The outcomes are: fixed-effects ANOVA (or regression), mixed-effects ANOVA, regression, and REML.]

Generalized Linear Models (GLM)
We can use GLMs when the variance is not constant, and/or when the errors are not normally distributed. A GLM has three important properties:
1. The error structure
2. The linear predictor
3. The link function
[Figure: variance-mean relationships for the Normal (constant variance), count data/Poisson (variance = mean), proportion data/binomial (∩-shaped) and Gamma (variance increasing with the square of the mean) error structures.]

Generalized Linear Models (GLM): error structure
In a GLM we can specify error structures different from the normal:
- Normal (gaussian)
- Poisson errors (poisson)
- Binomial errors (binomial)
- Gamma errors (Gamma)
glm(formula, family = ...(link = ...), data, ...)

Generalized Linear Models (GLM): linear predictor
The linear predictor (η) is the value output by the linear part of a GLM. The model relates each observed y to a predicted value, and the predicted value is obtained by a TRANSFORMATION of the value emerging from the linear predictor:
ηi = Σj βj xij   (j = 1, ..., p)
where the βj are the parameters estimated for the p explanatory variables and the xij are the values measured for the p explanatory variables. The fit of the model is a comparison between the linear predictor and the TRANSFORMED measured y. The transformation is specified in the LINK FUNCTION.

Generalized Linear Models (GLM): link functions
The link function g relates the mean value of y to its linear predictor: η = g(µ). The value of η is obtained by transforming the value of y by the link function, and the predicted value of y is obtained by applying the inverse link function to η. Typically the output of a GLM is on the scale of η, so you need to back-transform η to get the predicted values. If you know the distribution, start with the canonical link
function.

Generalized Linear Models (GLM): glm() model fit
Deviance is the measure of goodness-of-fit of a GLM:
Deviance = −2(log-likelihood of the current model − log-likelihood of the saturated model)
We aim at reducing the residual deviance.

Error      Deviance                                                    Variance function
normal     Σ(y − mean)²                                                1
poisson    2 Σ[y ln(y/mean) − (y − mean)]                              mean
binomial   2 Σ[y ln(y/mean) + (n − y) ln((n − y)/(n − mean))]          mean(n − mean)/n
Gamma      2 Σ[(y − mean)/mean − ln(y/mean)]                           mean²

where n is the size of the sample.

Proportion data and binomial errors
1. The data are strictly bounded (0-1)
2. The variance is non-constant (∩-shaped relation with the mean)
3. Errors are non-normal
Link function: logit
ln(p/q) = a + bx,  with q = 1 − p
p = e^(a+bx) / (1 + e^(a+bx))
[Figure: ∩-shaped variance-mean relation of proportion data, and the logistic curve of p against x.]

Proportion data and binomial errors: checking the fit
After fitting a binomial-logit GLM it must be that residual deviance ≈ residual df.
- YES: the model is adequate.
- NO: fit a quasibinomial and check again; if the model is still not adequate, change distribution.

2 Examples
1. The first example concerns sex ratios in insects (the proportion of all individuals that are males). In the species in question, it has been observed that the sex ratio is highly variable, and an experiment was set up to see whether population density was involved in determining the fraction of males.
2. The data consist of numbers dead and initial batch size for five doses of pesticide application, and we wish to know what dose kills 50% of the individuals (or 90% or 95%, as required).

Count data and Poisson errors
1. The data are non-negative integers
2. The variance is non-constant (variance = mean)
3. Errors are non-normal
The model is fitted with a log link (to ensure that the fitted values are bounded below) and Poisson errors (to account for the non-normality).
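The pesticide dose example above can be sketched as a binomial-logit GLM, following the workflow just described (fit, compare residual deviance with residual df, refit as quasibinomial if needed). The doses and mortality counts below are invented for illustration only.

```r
# Numbers dead out of an initial batch of 20, at five pesticide doses (invented data)
dose  <- c(1, 2, 4, 8, 16)
dead  <- c(2, 6, 12, 18, 20)
batch <- rep(20, 5)

# Response = two-column matrix of (successes, failures)
fit <- glm(cbind(dead, batch - dead) ~ log(dose),
           family = binomial(link = "logit"))
summary(fit)

# Adequacy check: residual deviance should be close to the residual df;
# if clearly larger, refit with family = quasibinomial and check again
c(deviance = deviance(fit), df = df.residual(fit))

# LD50: the dose at which the linear predictor is 0 (i.e. p = 0.5)
ld50 <- exp(-coef(fit)[1] / coef(fit)[2])
```

The same back-transformation with other quantiles of the logistic gives LD90 or LD95 as required.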
[Figure: variance increasing with the mean for count data, and the log-link curve of the counts against x.]

Count data and Poisson errors: checking the fit
After fitting a Poisson-log GLM it must be that residual deviance ≈ residual df.
- YES: the model is adequate.
- NO: fit a quasipoisson and check again; if the model is still not adequate, change distribution.

Example
1. In this example the response is a count of the number of plant species on plots that have different biomass (a continuous variable) and different soil pH (high, mid and low).
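The species-count example just described can be sketched as a Poisson-log GLM. The data below are simulated for illustration (90 plots, biomass as a continuous variable, three pH levels), so the effect sizes are assumptions, not the lecture's data set.

```r
set.seed(42)
biomass <- runif(90, min = 0, max = 10)             # continuous explanatory variable
pH      <- factor(rep(c("low", "mid", "high"), each = 30),
                  levels = c("low", "mid", "high"))
# Simulate counts whose mean declines with biomass and differs among pH levels
lambda  <- exp(2 - 0.10 * biomass + 0.30 * (pH == "high"))
species <- rpois(90, lambda)

# log link keeps the fitted values positive; Poisson errors handle the non-normality
fit <- glm(species ~ biomass * pH, family = poisson(link = "log"))
summary(fit)

# Adequacy check: residual deviance close to residual df, else try quasipoisson
c(deviance = deviance(fit), df = df.residual(fit))
```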