Experimental design and statistical analyses of data
Lesson 1: General linear models and design of experiments

Examples of General Linear Models (GLM)

Simple linear regression
Ex: Depth at which a white disc is no longer visible in a lake
$y = \beta_0 + \beta_1 x + \varepsilon$
• y = depth at disappearance (dependent variable)
• x = nitrogen concentration of water (independent variable)
• $\beta_0$ = intercept, $\beta_1$ = slope
• The residual $\varepsilon$ expresses the deviation between the model and the actual observation
[Figure: depth (m) plotted against N/volume water]

Polynomial regression
Ex: y = depth at disappearance, x = nitrogen concentration of water
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
[Figure: depth (m) plotted against N/volume water]

Multiple regression
Ex: y = depth at disappearance, x1 = concentration of N, x2 = concentration of P
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$
[Figure: depth plotted against concentration of N and concentration of P]

Analysis of variance (ANOVA)
Ex: y = depth at disappearance, x1 = blue disc, x2 = green disc
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
• White disc: x1 = 0, x2 = 0
• Blue disc: x1 = 1, x2 = 0
• Green disc: x1 = 0, x2 = 1
[Figure: depth for each disc color (white, blue, green)]

Analysis of covariance (ANCOVA)
Ex: y = depth at disappearance, x1 = blue disc, x2 = green disc, x3 = concentration of N
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_3 + \beta_5 x_2 x_3 + \varepsilon$
[Figure: depth plotted against concentration of N]

Nested analysis of variance
Ex: y = depth at disappearance
$y = \mu + \alpha_i + \beta_{(i)j} + \varepsilon$
• $\mu$ = overall mean
• $\alpha_i$ = effect of the ith lake
• $\beta_{(i)j}$ = effect of the jth measurement in the ith lake

What is not a general linear model?
• $y = \beta_0 (1 + \beta_1 x)$
• $y = \beta_0 + \cos(\beta_1 + \beta_2 x)$

Other topics covered by this course:
• Multivariate analysis of variance (MANOVA)
• Repeated measurements
• Logistic regression

Experimental designs: examples

Randomized design
• Effects of p treatments (e.g. drugs) are compared
• Total number of experimental units (persons) is n
• Treatment i is administered to $n_i$ units
• Allocation of treatments among units is random

Example of randomized design
• 4 drugs (called A, B, C, and D) are tested (i.e. p = 4)
• 12 persons are available (i.e. n = 12)
• Each treatment is given to 3 persons (i.e. $n_i = 3$ for i = 1, 2, ..., p), so the design is balanced
• Persons are allocated randomly among treatments

                 Drugs
          A       B       C       D       Total
          y_1A    y_1B    y_1C    y_1D
          y_2A    y_2B    y_2C    y_2D
          y_3A    y_3B    y_3C    y_3D
Mean      ȳ_A     ȳ_B     ȳ_C     ȳ_D     ȳ

where $\bar{y}_A = \sum_j y_{jA} / n_A$ (likewise for B, C and D) and $\bar{y} = \sum y_{ij} / n$.
Note: each observation comes from a different person.

The predicted value for each treatment is its mean. With dummy variables x1, x2 and x3 for drugs B, C and D:
$\hat{y}_A = \bar{y}_A = \beta_0$
$\hat{y}_B = \bar{y}_B = \beta_0 + \beta_1$ (x1 = 1)
$\hat{y}_C = \bar{y}_C = \beta_0 + \beta_2$ (x2 = 1)
$\hat{y}_D = \bar{y}_D = \beta_0 + \beta_3$ (x3 = 1)

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$

so that
$\beta_0 = \bar{y}_A$
$\beta_1 = \bar{y}_B - \bar{y}_A$
$\beta_2 = \bar{y}_C - \bar{y}_A$
$\beta_3 = \bar{y}_D - \bar{y}_A$

Source                                        Degrees of freedom
Estimate of $\beta_0$                         1
Treatments ($\beta_1, \beta_2, \beta_3$)      p - 1 = 3
Residuals                                     n - p = 8
Total                                         n = 12

Randomized block design
• All treatments are allocated to the same experimental units
• Treatments are allocated at random

Blocks (b = 3), treatments (p = 4); each row is one block:
B A D C
C B A D
B D A C

                 Treatments
Persons   A       B       C       D       Average
1         y_1A    y_1B    y_1C    y_1D    ȳ_1
2         y_2A    y_2B    y_2C    y_2D    ȳ_2
3         y_3A    y_3B    y_3C    y_3D    ȳ_3
Average   ȳ_A     ȳ_B     ȳ_C     ȳ_D     ȳ

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon$
• $\beta_1 x_1 + \beta_2 x_2$: blocks (b - 1 terms)
• $\beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$: treatments (p - 1 terms)

Randomized block design
Source                     Degrees of freedom
Estimate of $\beta_0$      1
Blocks (persons)           b - 1 = 2
Treatments (drugs)         p - 1 = 3
Residuals                  n - [(b - 1) + (p - 1) + 1] = 6
Total                      n = 12

Double block design (Latin square)

               Sequence
Person    1     2     3     4
1         B     A     C     D
2         D     C     A     B
3         A     D     B     C
4         C     B     D     A

Rows (a = 4), columns (b = 4)

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \beta_7 x_7 + \beta_8 x_8 + \beta_9 x_9 + \varepsilon$
• $\beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$: sequence (a - 1 terms)
• $\beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6$: persons (b - 1 terms)
• $\beta_7 x_7 + \beta_8 x_8 + \beta_9 x_9$: drugs (p - 1 terms)

Latin-square design
Source                     Degrees of freedom
Estimate of $\beta_0$      1
Rows (sequences)           a - 1 = 3
Blocks (persons)           b - 1 = 3
Treatments (drugs)         p - 1 = 3
Residuals                  n - [3(p - 1) + 1] = 6
Total                      n = p^2 = 16

Factorial designs
• Are used when the combined effects of two or more factors are investigated concurrently.
• As an example, assume that factor A is a drug and factor B is the way the drug is administered
• Factor A occurs in three different levels (called drug A1, A2 and A3)
• Factor B occurs in four different levels (called B1, B2, B3 and B4)

Factorial designs

                 Factor B
Factor A  B1      B2      B3      B4      Average
A1        y_11    y_12    y_13    y_14    ȳ_1.
A2        y_21    y_22    y_23    y_24    ȳ_2.
A3        y_31    y_32    y_33    y_34    ȳ_3.
Average   ȳ_.1    ȳ_.2    ȳ_.3    ȳ_.4    ȳ

No interaction between A and B:
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \varepsilon$
• $\beta_1 x_1 + \beta_2 x_2$: effect of A
• $\beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$: effect of B

Factorial experiment with no interaction
• Survival time at 15 °C and 50% RH: 17 days
• Survival time at 25 °C and 50% RH: 8 days
• Survival time at 15 °C and 80% RH: 19 days
• What is the expected survival time at 25 °C and 80% RH?
• An increase in temperature from 15 °C to 25 °C at 50% RH decreases survival time by 9 days
• An increase in RH from 50% to 80% at 15 °C increases survival time by 2 days
• An increase in temperature from 15 °C to 25 °C together with an increase in RH from 50% to 80% is expected to change survival time by -9 + 2 = -7 days, giving an expected survival time of 17 - 7 = 10 days

[Figure: survival time (days) against temperature (°C), one line for 50% RH and one for 80% RH; the two lines are parallel]

Factorial experiment with no interaction
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
[Figure: survival time (days) against temperature (°C), with $\beta_0$, $\beta_1$ and $\beta_2$ marked]

Factorial experiment with interaction
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$
[Figure: survival time (days) against temperature (°C), with $\beta_0$, $\beta_1$, $\beta_2$ and $\beta_3$ marked]

Factorial designs
For the same 3 x 4 layout of factor A and factor B, with interactions between A and B:
$y_{ij} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_1 x_3 + \beta_7 x_1 x_4 + \beta_8 x_1 x_5 + \beta_9 x_2 x_3 + \beta_{10} x_2 x_4 + \beta_{11} x_2 x_5 + \varepsilon$
• $\beta_1 x_1 + \beta_2 x_2$: effect of A
• $\beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5$: effect of B
• $\beta_6 x_1 x_3 + \dots + \beta_{11} x_2 x_5$: interactions between A and B

Two-way factorial design with interaction, but without replication
Source                           Degrees of freedom
Estimate of $\beta_0$            1
Factor A (drug)                  a - 1 = 2
Factor B (administration)        b - 1 = 3
Interactions between A and B     (a - 1)(b - 1) = 6
Residuals                        n - ab = 0
Total                            n = ab = 12

Two-way factorial design without replication
Source                           Degrees of freedom
Estimate of $\beta_0$            1
Factor A (drug)                  a - 1 = 2
Factor B (administration)        b - 1 = 3
Residuals                        n - a - b + 1 = 6
Total                            n = ab = 12

Without replication it is necessary to assume no interaction between factors!

Two-way factorial design with replications
Source                           Degrees of freedom
Estimate of $\beta_0$            1
Factor A (drug)                  a - 1
Factor B (administration)        b - 1
Interactions between A and B     (a - 1)(b - 1)
Residuals                        ab(r - 1)
Total                            n = rab

Two-way factorial design with interaction (r = 2)
Source                           Degrees of freedom
Estimate of $\beta_0$            1
Factor A (drug)                  a - 1 = 2
Factor B (administration)        b - 1 = 3
Interactions between A and B     (a - 1)(b - 1) = 6
Residuals                        ab(r - 1) = 12
Total                            n = rab = 24

Three-way factorial design
[Figure: three-way layout of factor A, factor B and factor C]
$y_{ijk} = \beta_0$ + main effects + two-way interactions + three-way interactions $+ \varepsilon$
• Main effects (10 terms): $\beta_1 x_1 + \beta_2 x_2$ (factor A), $\beta_3 x_3 + \dots + \beta_7 x_7$ (factor B), $\beta_8 x_8 + \beta_9 x_9 + \beta_{10} x_{10}$ (factor C)
• Two-way interactions (31 terms): $\beta_{11} x_1 x_3 + \beta_{12} x_1 x_4 + \dots + \beta_{18} x_1 x_{10} + \beta_{19} x_2 x_3 + \beta_{20} x_2 x_4 + \dots$ (through $\beta_{41}$)
• Three-way interactions (30 terms): $\beta_{42} x_1 x_3 x_8 + \beta_{43} x_1 x_3 x_9 + \beta_{44} x_1 x_3 x_{10} + \beta_{45} x_1 x_4 x_8 + \dots + \beta_{69} x_2 x_7 x_8 + \beta_{70} x_2 x_7 x_9 + \beta_{71} x_2 x_7 x_{10}$

Three-way factorial design
Source                              Degrees of freedom
Estimate of $\beta_0$               1
Factor A                            a - 1 = 2
Factor B                            b - 1 = 5
Factor C                            c - 1 = 3
Interactions between A and B        (a - 1)(b - 1) = 10
Interactions between A and C        (a - 1)(c - 1) = 6
Interactions between B and C        (b - 1)(c - 1) = 15
Interactions between A, B and C     (a - 1)(b - 1)(c - 1) = 30
Residuals                           abc(r - 1) = 0
Total                               n = rabc = 72

Why should more than two levels of a factor be used in a factorial design?
Two levels of a factor
[Figure: survival time (days) against temperature (°C), two temperature levels]

Three levels of a factor, qualitative
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
[Figure: survival time (days) against temperature (°C) at low, medium and high levels, with $\beta_0$, $\beta_1$ and $\beta_2$ marked]

Three levels of a factor, quantitative
$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$
[Figure: survival time (days) against temperature (°C), fitted curve through three levels]

Why should a factorial design not use many levels of each factor?
• Because each additional level of each factor increases the number of experimental units to be used
• For example, a five-factor experiment with four levels per factor yields $4^5 = 1024$ different combinations
• If not all combinations are applied in an experiment, the design is partially factorial
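
The count of treatment combinations and the degrees-of-freedom bookkeeping for a two-way factorial design can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the helper names `n_combinations` and `two_way_df` are hypothetical and introduced only for illustration.

```python
from math import prod


def n_combinations(levels):
    """Number of treatment combinations in a full factorial design.

    `levels` holds the number of levels of each factor.
    """
    return prod(levels)


def two_way_df(a, b, r):
    """Degrees of freedom for a two-way factorial design.

    a, b: number of levels of factors A and B; r: replicates per cell.
    """
    n = r * a * b
    df = {
        "intercept": 1,
        "A": a - 1,
        "B": b - 1,
        "AxB": (a - 1) * (b - 1),
        "residual": a * b * (r - 1),
    }
    # The parts must add up to the total number of observations.
    assert sum(df.values()) == n
    return df


# Five factors with four levels each.
print(n_combinations([4] * 5))  # 1024

# Three drugs, four ways of administration, two replicates.
print(two_way_df(3, 4, 2))

# Without replication (r = 1) no residual degrees of freedom remain.
print(two_way_df(3, 4, 1)["residual"])  # 0
```

With r = 1 the residual entry is zero, which is why an unreplicated two-way design can only be analysed by assuming no interaction.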
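
The survival-time example for a factorial experiment with and without interaction can be written out numerically. This is a sketch under stated assumptions: the dummy coding (x1 = 1 at 25 °C, x2 = 1 at 80% RH, with 15 °C and 50% RH as reference) is assumed here, and the interaction value of -4 days is purely hypothetical, chosen only to show how the interaction term shifts the prediction.

```python
# Observed survival times (days) from the example.
y_15_50 = 17  # 15 °C, 50% RH (reference cell)
y_25_50 = 8   # 25 °C, 50% RH
y_15_80 = 19  # 15 °C, 80% RH

# Assumed coding: x1 = 1 at 25 °C, x2 = 1 at 80% RH.
beta0 = y_15_50            # survival in the reference cell
beta1 = y_25_50 - y_15_50  # effect of higher temperature: -9
beta2 = y_15_80 - y_15_50  # effect of higher humidity: +2


def predict(x1, x2, beta3=0):
    """Survival time from y = b0 + b1*x1 + b2*x2 + b3*x1*x2."""
    return beta0 + beta1 * x1 + beta2 * x2 + beta3 * x1 * x2


# No interaction: the two effects simply add up.
print(predict(1, 1))  # 10

# Hypothetical interaction of -4 days (not a value from the slides).
print(predict(1, 1, beta3=-4))  # 6
```

The interaction term only changes the prediction for the cell where both dummy variables equal 1; the three observed cells are reproduced either way.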
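
The dummy-variable formulation of the randomized design (four drugs, three persons each) can be verified by fitting the model by least squares. The sketch below uses plain Python and solves the normal equations directly; the response values are invented for illustration, and `solve` is a hypothetical helper, not a library function.

```python
def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


# Hypothetical responses, three persons per drug (invented data).
data = {
    "A": [10.0, 12.0, 11.0],
    "B": [14.0, 15.0, 16.0],
    "C": [9.0, 8.0, 10.0],
    "D": [13.0, 13.0, 16.0],
}

# Drug A is the reference level; x1, x2, x3 flag drugs B, C, D.
dummies = ["B", "C", "D"]
X, y = [], []
for drug, values in data.items():
    for value in values:
        X.append([1.0] + [1.0 if drug == d else 0.0 for d in dummies])
        y.append(value)

# Normal equations: (X'X) beta = X'y.
p = len(X[0])
XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
beta = solve(XtX, Xty)


def mean(values):
    return sum(values) / len(values)


print("beta:", [round(b, 6) for b in beta])
print("mean of A:", mean(data["A"]))
print("mean of B - mean of A:", mean(data["B"]) - mean(data["A"]))
print("residual df:", len(y) - p)  # n - p
```

The fitted intercept matches the mean of the reference drug and each remaining coefficient matches a difference between a treatment mean and the reference mean, as in the formulas for $\beta_0$ to $\beta_3$.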