Experimental Design
Dr. Anne Molloy, Trinity College Dublin

Ethical Approach to Animal Experimentation
• Replace
• Reduce
• Refine

Reduce
• Good Experimental Design
• Appropriate Statistical Analysis

Good Experimental Design is Essential in Experiments using Animals
• The systems under study are complex with many interacting factors
• High variability can obscure important differences between treatment groups:
  – Biological variability (15-30% CV in animal responses)
  – Experimental imprecision (up to 10% CV)
• Confounding variables between treatment groups can affect your ability to interpret effects.
  – Is a difference due to the treatment or a secondary effect of the treatment? (e.g. weight loss, lack of appetite)

Do a Pilot Study and Generate Preliminary Data for Power Calculations
Observational study: not an experiment; an experience (RA Fisher 1890-1962)
• Observational; generates data to give you the average magnitude and variability of the measurements of interest
• Gives background information on the general feasibility of the project (essentially validates the hypothesis)
• Allows you to get used to the system you will be working with and get information that might improve the design of the main study

Dealing with Subject Variation
• Choose genetically uniform animals where possible
• Avoid clinical and sub-clinical disease
• Standardize the diet and environment; house under optimal conditions
• Uniform weight and age (else choose a randomized block design)
• Replicate a sufficient number of times.
  – Increases the confidence of a genuine result
  – Allows outliers to be detected

Some issues to think about before you set out to test your hypothesis
• What is the best treatment structure to answer the question?
  – Scientifically
  – Economically
• What type of data are being collected?
  – Categorical, numerical (discrete or continuous), ranks, scores or ratios. This will determine the statistical analysis to be used
• How many replicates will be needed per group?
  – Too many: wasteful; diminishing additional information
  – Too few: important effects can be rejected as nonsignificant

Choosing the Correct Design
• How many treatments (independent variables)?
  – e.g. Dose Response over Time
• How many outcome measurements (dependent variables)?
  – Aim for the maximum amount of informative data from each experiment
  – (but power for one)
• Are there important confounding factors that should be considered?
  – Gender, age
  – Dose Response over Time x Gender x Age
• Complex experiments with more treatment groups generally allow reduction in the number of animals per group.
• Continuous numerical type data generally require smaller sample sizes than categorical data

Types of Study Design
• Completely randomized study (basic type)
  – Random, not haphazard, sampling
• Randomized block design: e.g. stratify by weight or age (removes systematic sources of variation)
• Factorial design: e.g. examine two or more independent variables in one study
• Crossover, sequential, repeated measures, split plot, Latin square designs
• Can greatly reduce the number of animals required
  – ANOVA type analysis is essential

Example: You want to examine the effect of two well known drugs on tissue biomarkers
• Separate experiments:
  – Experiment 1: Control, Drug 1
  – Experiment 2: Control, Drug 2
• Combined experiment: Control, Drug 1, Drug 2
• Reduces animals by the number of controls in one experiment

Identify the Experimental Unit
• Defines the independent unit of replication
• Cage; animal; tissue
[Figure: Control Diet (Saline, Drug) and Experimental Diet (Saline, Drug)]
• Sometimes pseudoreplication is unavoidable, so be aware of effect limitations

Power and Sample Size Calculation in the Design of Experiments
What is the likelihood that the statistical analysis of your data will detect a significant effect given that the experimental treatment truly has an effect?
POWER
How many replicates will be needed to allow a reliable statistical judgement to be made?
SAMPLE SIZE

The Information You Need
• What is the variability of the parameter being measured?
• What effect size do you want to see?
• What significance level do you want to use (commonly a minimum of p=0.05)?
• What power do you want to have (commonly 0.80)?
This information is used to calculate the sample size

Variability of the Parameter
• An estimate of central location: “About how much?” (e.g. the mean value)
• An estimate of variation: “How spread out?” (e.g. the standard deviation)

An experiment: Testing the difference between two means
• In an experiment we often want to test the difference between two means where the means are sample estimates based on small numbers.
• It is easier to detect a difference if:
  – The means are far apart
  – There is a low level of variability between measurements
  – There is good confidence in the estimate of the mean

[Figure: four histograms of Plasma Cysteine (µmol/L), Frequency axis, for increasing numbers of results]
• 20 results: Mean 235, SD 22.9, SEM = 22.9/√20 = 5.12
• 50 results: Mean 236, SD 22.3, SEM = 22.3/√50 = 3.15
• 500 results: Mean 235, SD 21.4, SEM = 21.4/√500 = 0.96
• 2,500 results: Mean 236, SD 23.8, SEM = 23.8/√2500 = 0.48
Means and SDs are about the same!
SEM = SD/√N
Coefficient of Variation (CV) = (SD/Mean)%, e.g. 23.8/236 = 10.1%

Fitting a 'Normal' or 'Gaussian' Distribution
[Figure: histogram of Plasma Cysteine (µmol/L), Frequency axis, 2,500 results]
• Mean = 236, SD = 23.8; 2SD = 48 (approx), 3SD = 71 (approx)
• About 95% of results are between 236 ± 48, i.e. 188 and 284
• About 99.7% of results are between 236 ± 71, i.e. 165 and 307
• We can make the same predictions for a sample mean using SEM instead of SD

Having confidence in the estimate of the mean value
This is a sample.
We don’t know the ‘true’ mean of the population
[Figure: histogram of the 20 results, Frequency axis]
• 20 results: Mean 235, SEM = 5.12, 2 SEMs = 10.24
• We can be 95% confident that the true mean of the population will fall between 224.8 and 245.2
• The sample mean is our best guess of the true population mean (µ), but with a small sample there is much uncertainty and we need a wide margin of error

The effect of increasing numbers
[Figure: histograms of the 20 results and the 500 results, Frequency axis]
• 20 results: Mean 235, SD = 22.9, √20 = 4.47, SEM = 5.12
• 500 results: Mean 235, SD = 21.4, √500 = 22.36, SEM = 0.96
• Number of samples is increased 25 times
• Standard error is decreased by about √25 = 5 times
• 95% CI of the mean is about 5 times narrower

Sample size considerations: Viewpoint 1
Fix the sample size at six replicates per group and CV at 10%. The significance depends on the effect size.
• 2 groups: control and treated
• 6 replicates per group; CV of the assay 10%
• Cut-off for a significant result: P=0.05 (mean of treated outside the 95% CI of the controls)

Effect you want in treated group    Student’s t-test
50% difference                      P<0.0001
25%                                 P=0.0009
15%                                 P=0.015
12%                                 P=0.048
10%                                 P=0.09

Sample size considerations: Viewpoint 2
Fix the effect size at 25% difference and CV at 10%. The significance depends on the number of replicates.
• 25% difference expected; CV of the assay 10%
• Cut-off for a significant result: P=0.05

Number of replicates per group      Student’s t-test
6                                   P=0.0009
5                                   P=0.0029
4                                   P=0.009
3                                   P=0.03
2                                   P=0.12

Sample size considerations: Viewpoint 3
Fix the effect size at 25% and number of replicates at 6. The significance depends on the variability of the data (CV).
• 25% difference expected; 6 replicates per group
• Cut-off for a significant result: P=0.05

CV of the assay                     Student’s t-test
10%                                 P=0.0009
15%                                 P=0.009
20%                                 P=0.037
25%                                 P=0.08
30%                                 P=0.14

Summary: The underlying issues in demonstrating a significant effect
• The size of the effect you are interested in seeing
  – Big (e.g. 50% difference): will be seen with very few data points
  – Small: major considerations
• The precision of the measurement
  – Low CVs: few replicates needed
  – High CVs: multiple replicates

How do we interpret a nonsignificant result?
A. There is no difference between the groups
B. There is a difference but we didn’t see it (because of low numbers, SD too wide, etc.)
The decision to reject or not reject the Null Hypothesis can lead to two types of error.

Interpreting a Statistical Test
• Evidence from the experiment: results Not Significant. Decision: do not reject H(o); declare that the treatment has no effect
  – Reality: the Null is true (the treatment has no effect): Correct Decision
  – Reality: the Null is false (the treatment has an effect): β (Type II) Error
• Evidence from the experiment: results Significant. Decision: reject H(o); declare that the treatment has an effect
  – Reality: the Null is true (the treatment has no effect): α (Type I) Error (p value)
  – Reality: the Null is false (the treatment has an effect): Correct Decision

β-Errors and the overlap between two sample distributions
[Figure: two sample distributions, A and B, along a continuous data range, each marked with its sample mean and the 95% CI of the mean. Labels: "Miss an effect: β-error" and "See an effect: POWER"]

Some Power Calculators
• http://www.dssresearch.com/toolkit/spcalc/power.asp
• http://statpages.org/
  – Leads to Java applets for power and sample size calculations
• http://www.stat.uiowa.edu/%7Erlenth/Power/index.html
  – Direct into Java applet site

General Formula
r = 16 (CV/d)²
• r = number of replicates
• CV = coefficient of variation (SD/mean), as a percent
• d = difference required, as a percent
Valid for a Type I error of 5% and a power of 80% (i.e. a Type II error of 20%).

Some General Comments on Statistics
• All statistical tests make assumptions
  – They assume independent data points: ignore this at your peril!
  – They assume that the data are a good representation of the wider experimental series under study
  – Some assumptions are very specific to the test being carried out

Final Thoughts
• Ideally, to minimise the sample number, use equal numbers of control and treated animals.
• Ethically, if an experiment is particularly stressful, lower numbers may be desired in the treated group. This requires use of more animals overall to gain equivalent power, but can be justified.
• Remember: statistical tests assume that the experiment has been done on a random sample from the complete population of similar items and that each result is an independent event. This is often not the case in laboratory research.
• Statistical logic is only part of the data interpretation. Scientific judgement and common sense are essential.

Dealing with Experimental Variation
• Randomization: essential!
  – Ensures that the remaining inescapable differences are spread among all the treatment groups
  – Minimises potential bias
  – Provides a reliable estimate of the true variability
  – “Control treatment” must be one of the randomized arms of the experiment

Power Considerations
• You know the variability of the parameter being measured
• What effect size do you want to see?
• You need a minimum significance level of p=0.05
• What power do you want to have (commonly use 0.80)?
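
A worked sketch of the arithmetic
The inputs listed under Power Considerations (variability, effect size, significance level, power) can be plugged into the General Formula r = 16 (CV/d)² given in the slides. The Python below is a minimal sketch and not part of the original presentation. It assumes the rule of thumb applies as stated (two groups, 5% significance, 80% power), and the function names are hypothetical, chosen only for illustration. It also reproduces the SEM and approximate 95% CI arithmetic (mean ± 2 SEM) used for the plasma cysteine example.

```python
import math


def replicates_per_group(cv_percent: float, diff_percent: float) -> int:
    """Rule-of-thumb replicates: r = 16 * (CV/d)^2, rounded up.

    Assumes a 5% significance level and 80% power, as in the slides.
    Both arguments are percentages (e.g. 10 for 10%).
    """
    if cv_percent <= 0 or diff_percent <= 0:
        raise ValueError("CV and difference must be positive percentages")
    return math.ceil(16 * (cv_percent / diff_percent) ** 2)


def sem(sd: float, n: int) -> float:
    """Standard error of the mean: SD / sqrt(N)."""
    return sd / math.sqrt(n)


def approx_ci95(mean: float, sd: float, n: int) -> tuple[float, float]:
    """Approximate 95% CI of the mean, using mean +/- 2 SEM as in the slides."""
    half_width = 2 * sem(sd, n)
    return mean - half_width, mean + half_width


if __name__ == "__main__":
    # Replicates needed to detect a 25% difference at several assay CVs.
    for cv in (10, 20, 30):
        print(f"CV {cv}%, difference 25%: r = {replicates_per_group(cv, 25)}")

    # Plasma cysteine example from the slides: 20 results, mean 235, SD 22.9.
    low, high = approx_ci95(235, 22.9, 20)
    print(f"SEM = {sem(22.9, 20):.2f}, 95% CI approx {low:.1f} to {high:.1f}")
```

Rounding up with math.ceil is a conservative choice made here, since a fractional animal cannot be used. A dedicated power calculator, such as those listed in the slides, is still the better tool when the assumptions of the rule of thumb do not hold.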