www.nr.no

Microarray course
Statistics: Introduction to hypothesis testing
Ingunn Fride Tvete, Marit Holden
The Norwegian Computing Center

©Please do not duplicate or use the slides without the express permission of The Norwegian Computing Center, Ingunn Fride Tvete ([email protected])

Plan
► Hypothesis testing
  ▪ Null and alternative hypotheses
  ▪ Significance level
  ▪ P-value
  ▪ Type I and type II errors
  ▪ Power
  ▪ Test statistic
► Multiple hypothesis testing
  ▪ Family-wise error rate (FWER)
  ▪ False discovery rate (FDR)
► Practical (interpretation)

Hypothesis testing
► Results of experiments are often summarized by measures such as
  ▪ average
  ▪ standard deviation
  ▪ diagrams
► But sometimes: choose between two competing hypotheses

Hypothesis testing, cont.
► Typical: have data and information
  ▪ Uncertainty attached to these
  ▪ Must draw a conclusion
  ▪ Examples
    ◦ Is the new medicine better than the old one?
    ◦ Are these genes differentially expressed in tumor and normal cells?
► Hypothesis testing
  ▪ Method to draw conclusions from uncertain data
  ▪ Can say something about the uncertainty in the conclusion

Sir Ronald A. Fisher
► British statistician and geneticist who pioneered the application of statistical procedures to the design of scientific experiments
► "To call in the statistician after the experiment is done may be no more than asking him to perform a postmortem examination: he may be able to say what the experiment died of." (Indian Statistical Congress, Sankhya, ca 1938)
http://www.britannica.com/eb/article-9034397/Sir-Ronald-Aylmer-Fisher

Famous example: The lady tasting tea
► The Design of Experiments (1935), Sir Ronald A. Fisher
  ▪ A tea party in Cambridge, the 1920s
  ▪ A lady claims that she can taste whether the milk is poured into the cup before or after the tea
  ▪ All professors agree: impossible
  ▪ Fisher: this is statistically interesting!
  ▪ Organised a test
► Here: a modified version
http://www.maa.org/reviews/ladytea.html

The lady tasting tea, cont.
► Test with 8 trials, 2 cups in each trial
  ▪ In each trial: guess which cup had the milk poured in first
► Binomial experiment
  ▪ Independent trials
  ▪ Two possible outcomes: she guesses the right cup (success) or the wrong cup (failure)
  ▪ Constant probability of success in each trial
► X = number of right guesses in 8 trials, each with probability of success p
  ▪ X is Binomial(8, p) distributed

The lady tasting tea, cont.
► The null (conservative) hypothesis H0
  ▪ The one we initially believe in
  ▪ She has no special ability to taste the difference (p = 0.5)
► The alternative hypothesis H1
  ▪ The new claim we wish to test
  ▪ She has a special ability to taste the difference (p > 0.5)

Hypothesis testing, cont.
► Need: a rule that says something about what it takes to be convinced
  ▪ Accept 6 right guesses as good enough, or do we need 7 or 8?
► Compute a specific probability
  ▪ The probability of significance, or the P-value
    ◦ The probability of obtaining the observed value or something more extreme, given that H0 is true
    ◦ NB! The P-value is NOT the probability that H0 is true

Two types of error
               H0 true         H1 true
  Accept H0    OK              Type II error
  Reject H0    Type I error    OK
► Type I error most serious
  ▪ Wrongly reject the null hypothesis
  ▪ Example
    ◦ H0: person is ill
    ◦ H1: person is healthy
    ◦ To say a person is healthy when she is ill is far more serious than to say she is ill when she is healthy
► Demand
  ▪ P(type I error) ≤ α (level of significance, e.g. 5 %)

Hypothesis testing: when to reject
► Decide on the level of significance of the test
  ▪ Choose a level of significance α
  ▪ This guarantees P(type I error) ≤ α
  ▪ Example
    ◦ A level of significance of 0.05 gives a 5 % probability of rejecting a true H0
► Reject H0 if the P-value is less than α
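The P-value definition and the rejection rule can be made concrete for the tea experiment. The following is a minimal Python sketch, assuming that under H0 the lady guesses at random (success probability p = 0.5, an assumption made for this illustration) and that "more extreme" means more right guesses (a one-sided test). The helper name binomial_tail is made up for this example.

```python
from math import comb


def binomial_tail(k, n, p=0.5):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


n_trials = 8   # 8 trials, as in the modified tea experiment
alpha = 0.05   # chosen level of significance

# P-value for each candidate outcome: 6, 7 or 8 right guesses
for k in (6, 7, 8):
    p_value = binomial_tail(k, n_trials)
    decision = "reject H0" if p_value < alpha else "keep H0"
    print(f"{k} right guesses: P-value = {p_value:.4f} -> {decision}")
```

Under these assumptions, 6 right guesses give a P-value of about 0.14, so at the 5 % level only 7 or 8 right guesses would lead to rejecting H0.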
Hypothesis testing, cont.
► The null hypothesis, the alternative hypothesis and the level of significance must all be decided upon before we know the results of the experiment

The lady tasting tea, cont.
► Choose 5 % level of significance
► Conduct the experiment
  ▪ Say: she identified 6 cups correctly
  ▪ Is this evidence enough?
► P-value: the probability of obtaining the observed value or something more extreme, given that H0 is true
  ▪ Small p-value: reject the null hypothesis
  ▪ Large p-value: keep the null hypothesis

The lady tasting tea, cont.
► We obtained a p-value of 0.1445
► The rejection rule says
  ▪ Reject H0 if the p-value is less than the level of significance α
  ▪ Since α = 0.05 we do NOT reject H0

Area of rejection
[Figure: area of rejection]

The lady tasting tea, cont.
► At the tea party in Cambridge:
  ▪ The lady got every trial correct!
► Comment:
  ▪ Why does it taste different?
    ◦ Pouring hot tea into cold milk makes the milk curdle, but pouring cold milk into hot tea does not*
*http://binomial.csuhayward.edu/applets/appletNullHyp.html

Type II error
               H0 true         H1 true
  Accept H0    OK              Type II error
  Reject H0    Type I error    OK

Example, type II error
[Figure]

Power of the test
[Figure: power function (axes: probability, power)]

Expand the number of trials to 16
[Figures]

One-sided or two-sided test?
[Figure]

Compare power curves
► Parallel to microarray analysis: do replications to increase power!

The one sample t-test
► So far: one-sample test for binomially distributed data
► Now: one-sample test for normally distributed data

The one sample t-test, cont.
► Log-ratios are close to normally distributed
► Need several measures of log-ratios
  ▪ From several individuals
  ▪ Same cell line, i.e.
    ◦ same individual (“repetitions within individual”)
► Example: is a gene differentially expressed or not?
  ▪ log-ratio: log((tumour tissue)/(normal tissue))
  ▪ If the log-ratio is different from 0, the gene is differentially expressed in tumour tissue compared to normal tissue

The one sample t-test
► Well known theorem: if $X_1, \dots, X_n$ are independent and $N(\mu, \sigma^2)$ distributed, then $(\bar{X} - \mu)/(s/\sqrt{n})$ is t-distributed with n-1 degrees of freedom
► Log-ratios: $X_1, \dots, X_n$
► Test: H0: μ = 0 against H1: μ ≠ 0
► Under H0, the test statistic $T = \bar{X}/(s/\sqrt{n})$ is t-distributed with n-1 degrees of freedom

Two-sample problems
► Two types of problems:
  ▪ Two treatments, same subject
    ◦ E.g. measure cholesterol level before and after diet
  ▪ Same treatment, two subjects
    ◦ E.g. measure cholesterol level in men and women
► How we do the computations depends upon which type of problem we have

Two-sample problems: paired data
► Test statistic: $T = \bar{D}/(s_D/\sqrt{n})$, where $D_i = X_{1i} - X_{2i}$ are the within-pair differences
► Under H0, T is t-distributed with n-1 degrees of freedom (n = n1 = n2)

Two-sample problems: different samples
► Test statistic: $T = (\bar{X}_1 - \bar{X}_2)/(s_f \sqrt{1/n_1 + 1/n_2})$
► Under H0, T is t-distributed with n1+n2-2 degrees of freedom
► s_f is a common std.dev. for both groups: $s_f^2 = ((n_1 - 1) s_1^2 + (n_2 - 1) s_2^2)/(n_1 + n_2 - 2)$
► s1 and s2 are the empirical std.dev. of X1 and X2, respectively

Multiple hypothesis testing
► Often a large number of hypotheses are tested simultaneously
► Testing multiple hypotheses simultaneously, using single hypothesis testing procedures, results in a greatly increased false positive (significance) rate

Example: 10 000 genes
► Q: is gene g, g = 1, …, 10 000, differentially expressed?
► Gives 10 000 null hypotheses:
  ▪ H0,1: gene 1 not differentially expressed, and so on for each gene g
► Assume: none are differentially expressed, i.e. H0,g is true for all g
► Significance level α = 0.01
  ▪ Expect 10 000 × 0.01 = 100 genes to have a p-value below 0.01 by chance
► In the long run, incorrectly conclude that 100 genes are differentially expressed, when in fact none of them are!

Several solutions
► Adjust the p-values
► Simplest and most conservative: Bonferroni correction
  ▪ Assume significance level α for the entire set of m comparisons
  ▪ Adjusted p-value: min(1, m × p-value); H0 rejected if the adjusted p-value is less than α
  ▪ Problem: low power!
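The 10 000 genes example can be checked with a small simulation. The following is a minimal Python sketch, assuming that all null hypotheses are true and that each p-value is then uniformly distributed on [0, 1] (which holds for a continuous test statistic). The seed, the variable names and the helper count_rejections are made up for this illustration.

```python
import random


def count_rejections(pvalues, threshold):
    """Count how many p-values fall below the rejection threshold."""
    return sum(p < threshold for p in pvalues)


n_genes = 10_000
alpha = 0.01
rng = random.Random(1)  # arbitrary seed, only for reproducibility

# Assumption: no gene is differentially expressed, so every p-value
# is drawn from the uniform distribution on [0, 1].
pvalues = [rng.random() for _ in range(n_genes)]

expected_false_positives = n_genes * alpha              # 10 000 x 0.01
naive = count_rejections(pvalues, alpha)                # each test at level alpha
bonferroni = count_rejections(pvalues, alpha / n_genes)  # Bonferroni threshold

print(f"Expected false positives without correction: {expected_false_positives:.0f}")
print(f"Simulated false positives without correction: {naive}")
print(f"Simulated false positives with Bonferroni:    {bonferroni}")
```

Comparing each p-value with α divided by the number of tests is the same as comparing the Bonferroni-adjusted p-value with α. The simulation shows only the false positive side; the low power of the correction would show up if some genes were truly differentially expressed.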
Several solutions, cont.
► Can obtain significance level α by controlling
  ▪ The family-wise error rate (FWER) or
  ▪ The false discovery rate (FDR)
► Possible outcomes from the hypothesis tests, cross-classified by
  ▪ Truth: No. true H0, No. false H0, Total
  ▪ Decision: No. accepted, No. rejected, Total

Family-wise error rate (FWER)
► The probability of at least one type I error
► Control FWER at a level α
  ▪ Procedures that modify the adjusted p-values separately
    ◦ Single step procedures
  ▪ More powerful procedures adjust sequentially, from the smallest to the largest, or vice versa
    ◦ Step-up and step-down methods
► The Bonferroni correction controls the FWER

False discovery rate (FDR)
► The expected proportion of type I errors among the rejected hypotheses
► Various procedures also here
  ▪ E.g. the Benjamini and Hochberg procedure (+ versions)
  ▪ E.g. permutation tests
► Caution with FDR
  ▪ Avoid cheating by adding known differentially expressed genes
    ◦ This reduces FDR
► Interpretation of FDR
  ▪ FDR applies to a set of genes in a global sense, not to individual genes

Summary multiple testing procedure
► Decide whether you want to control the family-wise error rate (FWER) or the false discovery rate (FDR)
  ▪ Are you most afraid of getting genes on your significant list that should not have been there?
    ◦ Choose FWER
  ▪ Are you most afraid of missing out on interesting genes?
    ◦ Choose FDR

Example, output from Limma
[Figure: Limma output table, annotated "P-value if testing just this one" and "Adjusted P-value (FDR)"]

Example, output from Limma (GSEA)
[Figure: Limma output table, annotated "Pathway here, but could be a gene", "P-value if testing just this one" and "Adjusted P-value (FDR)"]
► If you want to find a list of differentially expressed genes: typically use FDR
► If you want to examine the genes further (e.g. pathway analysis), you can use the p-value → focus on the most interesting genes, but do not say they are statistically significant
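An FDR-adjusted p-value can be illustrated with the Benjamini and Hochberg procedure. The following is a minimal Python sketch of the standard step-up adjustment, not Limma's own code; the function name bh_adjust and the example p-values are made up for this illustration.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p-value
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Step-up: walk from the largest p-value (rank m) down to the smallest
    # (rank 1), keeping the adjusted values monotone and capped at 1.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values for four genes
for p, q in zip(raw, bh_adjust(raw)):
    print(f"raw p-value = {p:.3f}, adjusted p-value (FDR) = {q:.3f}")
```

Selecting all genes with an adjusted p-value below, say, 0.05 gives a list where the expected proportion of false discoveries is at most 5 %. This is a statement about the list as a whole, not about any single gene.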
Limma: Linear Models for Microarray Data, http://bioconductor.org/
http://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf

Get help? Where? How?
► A statistical supervision and consulting service
  ▪ Participants
    ◦ UiO-Department of Informatics
    ◦ UiO-Department of Mathematics
    ◦ Norwegian Computing Center
    ◦ UiO-Section of Medical Statistics
► Write an email to [email protected] and explain briefly which issue you would like to discuss with us
► For more information about the service
  ▪ http://statgenconsult.nr.no/