Microarray course
Statistics
Ingunn Fride Tvete
Marit Holden
The Norwegian Computing Center
www.nr.no

Plan
► Introduction to hypothesis testing
▪ Null and alternative hypotheses
▪ Significance level
▪ P-value
▪ Type I and type II errors
▪ Power
▪ Test statistic
► Multiple hypothesis testing
▪ Family-wise error rate (FWER)
▪ False discovery rate (FDR)
► Practical (interpretation)
©Please do not duplicate or use the slides without the express permission
of The Norwegian Computing Center, Ingunn Fride Tvete ([email protected])

Hypothesis testing
► Often summarize results of experiments by measures such as
▪ average
▪ standard deviation
▪ diagrams
► But sometimes: choose between two competing hypotheses

Hypothesis testing, cont.
► Typical: have data and information
▪ Uncertainty attached to these
▪ Must draw a conclusion
▪ Examples
◦ Is the new medicine better than the old one?
◦ Are these genes differentially expressed in tumor and normal cells?
► Hypothesis testing
▪ Method to draw conclusions from uncertain data
▪ Can say something about the uncertainty in the conclusion

Famous example: Sir Ronald A. Fisher
► British statistician and geneticist who pioneered the application of statistical procedures to the design of scientific experiments
► "To call in the statistician after the experiment is done may be no more than asking him to perform a postmortem examination: he may be able to say what the experiment died of."
Indian Statistical Congress, Sankhya, ca 1938
http://www.britannica.com/eb/article-9034397/Sir-Ronald-Aylmer-Fisher

Famous example: The lady tasting tea
► The Design of Experiments (1935), Sir Ronald A. Fisher
▪ A tea party in Cambridge in the 1920s
▪ A lady claims that she can taste whether the milk was poured into the cup before or after the tea
▪ All professors agree: impossible
▪ Fisher: this is statistically interesting!
▪ Organised a test
► Here: a modified version
http://www.maa.org/reviews/ladytea.html

The lady tasting tea, cont.
► Test with 8 trials, 2 cups in each trial
▪ In each trial: guess which cup had the milk poured in first
► Binomial experiment
▪ Independent trials
▪ Two possible outcomes: she guesses the right cup (success) or the wrong cup (failure)
▪ Constant probability of success in each trial
► X = number of right guesses in 8 trials, each with probability of success p
▪ X is Binomial(8, p) distributed

The lady tasting tea, cont.
► The null (conservative) hypothesis
▪ The one we initially believe in
► The alternative hypothesis
▪ The new claim we wish to test
► $H_0$: She has no special ability to taste the difference ($p = 0.5$)
► $H_1$: She has a special ability to taste the difference ($p > 0.5$)
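
As a small illustration of the binomial model for the tea experiment, the sketch below tabulates the distribution of X, the number of right guesses in 8 trials, when the lady is purely guessing. The function name `binom_pmf` and the value p = 0.5 for pure guessing between two cups are assumptions made for this example.

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n_trials = 8
p_guess = 0.5  # assumption: pure guessing between two cups

# Probability of each possible number of right guesses under pure guessing
for k in range(n_trials + 1):
    print(k, round(binom_pmf(k, n_trials, p_guess), 4))
```

The probabilities sum to 1, and the large counts (7 or 8 right) are rare under pure guessing, which is what makes them informative.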

The lady tasting tea, cont.
► Need: a rule that says what it takes to be convinced
▪ Accept 6 right guesses as good enough, or do we need 7 or 8?
► Compute a specific probability
▪ The probability of significance, or the P-value
◦ The probability of obtaining the observed value or something more extreme, given that $H_0$ is true
◦ NB! The P-value is NOT the probability that $H_0$ is true

Hypothesis testing, cont.
► Two types of error
               $H_0$ true      $H_1$ true
Accept $H_0$   OK              Type II error
Reject $H_0$   Type I error    OK
► Type I error most serious
▪ Wrongly reject the null hypothesis
▪ Example
◦ $H_0$: person is ill
◦ $H_1$: person is healthy
◦ To say a person is healthy when she is ill is far more serious than to say she is ill when she is healthy

Hypothesis testing: when to reject
► Decide on the level of significance of the test
▪ Choose a level of significance α
▪ This guarantees P(type I error) ≤ α
▪ Example
◦ A level of significance of 0.05 gives at most a 5 % probability of rejecting a true $H_0$
► Reject $H_0$ if the P-value is less than α

Hypothesis testing, cont.
► Demand
▪ P(type I error) ≤ α (level of significance, e.g. 5 %)
► NB!
▪ Null hypothesis, $H_0$
▪ Alternative hypothesis, $H_1$
▪ Level of significance, α
→ Must be decided upon before we know the results of the experiment

The lady tasting tea, cont.
► Choose 5 % level of significance
► Conduct the experiment
▪ Say: she identified 6 cups correctly
▪ Is this evidence enough?
► P-value
▪ Small p-value: reject the null hypothesis
▪ Large p-value: keep the null hypothesis

The lady tasting tea, cont.
► We obtained a p-value of $P(X \ge 6 \mid H_0) = 37/256 \approx 0.1445$
► The rejection rule says
▪ Reject $H_0$ if the p-value is less than the level of significance α
▪ Since α = 0.05 we do NOT reject $H_0$
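
The p-value for 6 right guesses out of 8 can be checked directly from the binomial model. This is a minimal sketch assuming a one-sided test with p = 0.5 under the null hypothesis; the helper name `p_value_upper` is made up for the example.

```python
from math import comb

def p_value_upper(x_obs: int, n: int, p0: float = 0.5) -> float:
    """P(X >= x_obs) for X ~ Binomial(n, p0): the observed value or something more extreme."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k) for k in range(x_obs, n + 1))

alpha = 0.05                   # level of significance, chosen before the experiment
p_value = p_value_upper(6, 8)  # she identified 6 of 8 correctly

print(p_value)                 # (28 + 8 + 1) / 256 = 37/256
print("reject H0" if p_value < alpha else "do not reject H0")
```

With 6 right guesses the tail probability 37/256 is well above 0.05, so the rule keeps the null hypothesis.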

The lady tasting tea, cont.
► At the tea party in Cambridge:
▪ The lady got every trial correct!
► Comment:
▪ Why does it taste different?
◦ Pouring hot tea into cold milk makes the milk curdle, but pouring cold milk into hot tea does not*
*http://binomial.csuhayward.edu/applets/appletNullHyp.html

Area of rejection
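
One way to make the area of rejection concrete for the 8-trial tea experiment is to look for the smallest number of right guesses whose upper tail probability under the null hypothesis falls below the level of significance. The sketch below assumes a one-sided test with p = 0.5 under the null hypothesis and α = 0.05; the function names are illustrative only.

```python
from math import comb

def upper_tail(c: int, n: int, p: float = 0.5) -> float:
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))

def critical_value(n: int, alpha: float = 0.05, p0: float = 0.5) -> int:
    """Smallest c such that P(X >= c | p0) < alpha; X >= c is the area of rejection."""
    # c = n + 1 gives an empty tail (probability 0), so a value is always found
    return next(c for c in range(n + 2) if upper_tail(c, n, p0) < alpha)

c = critical_value(8)
print(c)                  # 7: reject H0 for 7 or 8 right guesses
print(upper_tail(c, 8))   # 9/256, the actual probability of a type I error
print(upper_tail(8, 8))   # 1/256, the p-value when every trial is correct
```

Because X is discrete, the actual type I error probability (9/256) is below the nominal 5 %, not equal to it.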

Type II error

Example, type II error
               $H_0$ true      $H_1$ true
Accept $H_0$   OK              Type II error
Reject $H_0$   Type I error    OK

Power of the test

Power function
[Figure: power function; labels: Probability, Power]

Expand the number of trials to 16

Expand the number of trials, cont.

Expand the number of trials, cont.

One-sided or two-sided test?

Compare power curves
Parallel to microarray analysis: do replications to increase power!
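
The effect of expanding the number of trials can be sketched numerically. The code below computes the power function of the one-sided binomial test, i.e. the probability of rejecting the null hypothesis as a function of the true success probability p, for 8 and for 16 trials. It assumes p = 0.5 under the null hypothesis and α = 0.05; the function names and the grid of p values are choices made for this example.

```python
from math import comb

def upper_tail(c: int, n: int, p: float) -> float:
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))

def critical_value(n: int, alpha: float = 0.05, p0: float = 0.5) -> int:
    """Smallest c such that P(X >= c | p0) < alpha."""
    return next(c for c in range(n + 2) if upper_tail(c, n, p0) < alpha)

def power(p: float, n: int, alpha: float = 0.05) -> float:
    """Probability of rejecting H0: p = 0.5 when the true success probability is p."""
    return upper_tail(critical_value(n, alpha), n, p)

# Power = 1 - P(type II error); compare 8 and 16 trials
for p in (0.5, 0.6, 0.7, 0.8, 0.9):
    print(p, round(power(p, 8), 3), round(power(p, 16), 3))
```

For a lady who truly guesses right 80 % of the time, the 8-trial test rejects only about half the time, and the 16-trial test does noticeably better.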

The one sample t-test
► So far: one-sample test for binomially distributed data
► Now: one-sample test for normally distributed data
► Example:
▪ Is a gene differentially expressed or not?
◦ log-ratio: log((tumour tissue)/(normal tissue))
◦ If the log-ratio is different from 0, the gene is differentially expressed in tumour tissue compared to normal tissue

The one sample t-test, cont.
► Log-ratios are close to normally distributed
► Need several measures of log-ratios
▪ From several individuals
▪ Same cell line
◦ I.e. same individual, “repetitions within individual”
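
The one sample t-test
► Well known theorem: if $X_1, \dots, X_n$ are independent and $N(\mu, \sigma^2)$ distributed, then $(\bar{X} - \mu) / (S / \sqrt{n})$ is t-distributed with $n-1$ degrees of freedom, where $\bar{X}$ is the sample mean and $S$ the empirical standard deviation
► Log-ratios: $X_1, \dots, X_n$, assumed $N(\mu, \sigma^2)$ distributed
► Test: $H_0: \mu = 0$ against $H_1: \mu \neq 0$
► Under $H_0$, the test statistic $T = \bar{X} / (S / \sqrt{n})$ is t-distributed with $n-1$ degrees of freedom

Two-sample problems
► Two types of problems:
▪ Two treatments, same subject
◦ E.g. measure cholesterol level before and after diet
▪ Same treatment, two subjects
◦ E.g. measure cholesterol level in men and women
► How we do the computations depends upon which type of problem we have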
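
For the one-sample t-test of log-ratios, a worked example may help. The sketch below computes the statistic T = mean / (std.dev. / √n) for a handful of log-ratios and a two-sided p-value from the t-distribution with n-1 degrees of freedom. The log-ratio values are hypothetical, and the p-value is obtained by simple numerical integration of the t density so that only the standard library is needed; statistical software would normally be used for this step.

```python
from math import gamma, pi, sqrt
from statistics import mean, stdev

def t_pdf(t: float, df: int) -> float:
    """Density of the t-distribution with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_sided_p(t_obs: float, df: int, steps: int = 2000) -> float:
    """P(|T| >= |t_obs|), using Simpson's rule on [0, |t_obs|] (steps must be even)."""
    b = abs(t_obs)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1 - 2 * total * h / 3)

# Hypothetical log-ratios for one gene, one value per individual
log_ratios = [0.8, 1.1, 0.3, 0.9, 1.4, 0.6]
n = len(log_ratios)

t_stat = mean(log_ratios) / (stdev(log_ratios) / sqrt(n))  # test statistic under H0: mu = 0
p_value = two_sided_p(t_stat, n - 1)

print(round(t_stat, 2), p_value)
```

Here `stdev` is the empirical standard deviation with divisor n-1, which is what the t-statistic requires.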
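
Two-sample problems: Paired data
Test statistic
$T = \bar{D} / (S_D / \sqrt{n})$, where $\bar{D}$ and $S_D$ are the mean and empirical std.dev. of the differences $D_i = X_{1i} - X_{2i}$,
is t-distributed under $H_0$ with $n-1$ degrees of freedom ($n = n_1 = n_2$)

Two-sample problems: different samples
Test statistic
$T = (\bar{X}_1 - \bar{X}_2) / (s_f \sqrt{1/n_1 + 1/n_2})$, with $s_f^2 = ((n_1 - 1) s_1^2 + (n_2 - 1) s_2^2) / (n_1 + n_2 - 2)$,
is t-distributed under $H_0$ with $n_1 + n_2 - 2$ degrees of freedom
$s_f$ is a common std.dev. for both groups
$s_1$ and $s_2$ are the empirical std.dev. of $X_1$ and $X_2$, respectively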
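
The two kinds of two-sample problem lead to different computations, which the following sketch tries to make explicit. It assumes the standard paired t-statistic (a one-sample test on the differences) and the standard pooled-variance statistic with a common standard deviation for both groups; the cholesterol values and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(x1, x2):
    """Paired data: t-statistic on the differences, with n - 1 degrees of freedom."""
    d = [a - b for a, b in zip(x1, x2)]
    n = len(d)
    return mean(d) / (stdev(d) / sqrt(n)), n - 1

def pooled_t(x1, x2):
    """Different samples: pooled-variance t-statistic, with n1 + n2 - 2 degrees of freedom."""
    n1, n2 = len(x1), len(x2)
    s1, s2 = stdev(x1), stdev(x2)
    # sf: common std.dev. for both groups
    sf = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(x1) - mean(x2)) / (sf * sqrt(1 / n1 + 1 / n2)), n1 + n2 - 2

# Hypothetical cholesterol levels
before = [6.1, 5.8, 7.0, 6.4, 5.9]   # same subjects before diet
after = [5.7, 5.6, 6.5, 6.1, 5.8]    # same subjects after diet
men = [6.1, 5.8, 7.0, 6.4, 5.9]      # two separate groups
women = [5.5, 6.0, 5.2, 5.9]

print(paired_t(before, after))
print(pooled_t(men, women))
```

Treating paired data as two different samples would ignore that each difference comes from the same subject, which typically costs power.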

Multiple hypothesis testing
► Often
▪ Large number of hypotheses tested simultaneously
► Testing multiple hypotheses simultaneously, using single hypothesis testing procedures, results in a greatly increased false positive (significance) rate

Example: 10 000 genes
► Q: is gene g, g = 1, …, 10 000, differentially expressed?
► Gives 10 000 null hypotheses:
▪ $H_{01}$: gene 1 not differentially expressed, and likewise $H_{0g}$ for each gene g
► Assume: none are differentially expressed, i.e. $H_{0g}$ true for all g
► Significance level
▪ α = 0.01
► Expect 10 000 × 0.01 = 100 genes to have a p-value below 0.01 by chance
► In the long run, incorrectly conclude that 100 genes are differentially expressed, when in fact none of them are!

Several solutions
► Adjust the p-values
► Simplest and most conservative: Bonferroni correction
▪ Assume significance level α for the entire set of m comparisons
▪ Adjusted p-value $\tilde{p}_g = \min(m \, p_g, 1)$; $H_{0g}$ rejected if $\tilde{p}_g < \alpha$
▪ Problem: low power!
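
The arithmetic of the 10 000 gene example and the Bonferroni correction can be sketched as follows. The adjustment used here is the usual Bonferroni rule (multiply each p-value by the number of comparisons and cap at 1); the raw p-values and the family-wise level of 0.05 are hypothetical choices for the example.

```python
m = 10_000     # number of genes, as in the example
alpha = 0.01   # per-gene significance level, as in the example

# If no gene is differentially expressed, each test still rejects with probability alpha
expected_false_positives = m * alpha
print(expected_false_positives)

def bonferroni_adjust(p_values, n_tests):
    """Bonferroni-adjusted p-values: multiply by the number of tests and cap at 1."""
    return [min(1.0, n_tests * p) for p in p_values]

raw = [0.000001, 0.00004, 0.003, 0.2]   # hypothetical raw p-values for four of the genes
family_alpha = 0.05                     # assumed level for the entire set of comparisons

adjusted = bonferroni_adjust(raw, m)
rejected = [p < family_alpha for p in adjusted]
print(adjusted)
print(rejected)
```

Only the gene with raw p-value 0.000001 survives the correction, which illustrates why the method has low power when m is large.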

Several solutions, cont.
► Can obtain significance level by controlling
▪ The family-wise error rate (FWER) or
▪ The false discovery rate (FDR)
► Possible outcomes from m hypothesis tests:
[Table: counts of null hypotheses, cross-classified as No. true / No. false / Total against No. accepted / No. rejected / Total]

Family-wise error rate (FWER)
► The probability of at least one type I error
► Control FWER at a level α
▪ Procedures that modify the adjusted p-values separately
◦ Single step procedures
▪ More powerful procedures adjust sequentially, from the smallest to the largest, or vice versa
◦ Step-up and step-down methods
► The Bonferroni correction controls the FWER
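
To see how quickly the family-wise error rate grows, the sketch below evaluates the probability of at least one type I error when every null hypothesis is true. It assumes the m tests are independent, which is a simplification (gene expression tests are usually correlated), and compares testing each hypothesis at level α with testing each at the Bonferroni level α/m.

```python
def fwer_independent(m: int, alpha: float) -> float:
    """P(at least one type I error) in m independent tests when all nulls are true."""
    return 1 - (1 - alpha) ** m

alpha = 0.01
for m in (1, 10, 100, 10_000):
    uncorrected = fwer_independent(m, alpha)       # each test at level alpha
    bonferroni = fwer_independent(m, alpha / m)    # each test at level alpha / m
    print(m, round(uncorrected, 4), round(bonferroni, 4))
```

Without correction the FWER approaches 1 as m grows, while the Bonferroni level keeps it at or below α.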

False discovery rate (FDR)
► The expected proportion of type I errors among the rejected hypotheses
► Various procedures also here
▪ E.g. the Benjamini and Hochberg procedure (+ versions)
▪ E.g. permutation tests
► Caution with FDR
▪ Avoid cheating by adding known differentially expressed genes
◦ This reduces FDR
► Interpretation of FDR
▪ FDR applies to a set of genes in a global sense, not to individual genes

Summary multiple testing procedure
► Decide whether you want to control the family-wise error rate (FWER) or the false discovery rate (FDR)
▪ Are you most afraid of getting genes on your significant list that should not have been there?
◦ Choose FWER
▪ Are you most afraid of missing out on interesting genes?
◦ Choose FDR
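
As an illustration of FDR control, here is a minimal sketch of Benjamini and Hochberg adjusted p-values in their common step-up form: sort the p-values, scale the p-value of rank r by m/r, and enforce monotonicity from the largest rank downwards. The function name and the input p-values are hypothetical, and the sketch is not taken from the slides.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices from smallest to largest p
    adjusted = [0.0] * m
    running_min = 1.0
    # Step-up: walk from the largest p-value down, keeping the running minimum
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.20]   # hypothetical raw p-values for four genes
adjusted = benjamini_hochberg(raw)
print(adjusted)

# Genes whose adjusted p-value is below the chosen FDR level form the significant list
fdr_level = 0.05
print([p < fdr_level for p in adjusted])
```

The adjusted values describe the list as a whole: declaring all genes with adjusted p-value below 0.05 significant aims at an expected 5 % of false discoveries in that list, not a 5 % error probability for each gene.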

Example, output from Limma
[Output table, with annotation:]
▪ P-value if testing just this one

Example, output from Limma (GSEA)
[Output table, with annotations:]
▪ Pathway here, but could be a gene
▪ Adjusted P-value (FDR)
▪ P-value if testing just this one

If you want to find a list of differentially expressed genes: typically FDR
If you want to examine the genes further (e.g. pathway analysis) can use p-value
→ focus on the most interesting genes, but do not say they are statistically significant
Limma: Linear Models for Microarray Data, http://bioconductor.org/
http://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf

Get help? Where? How?
► A statistical supervision and consulting service
▪ Participants
◦ UiO-Department of Informatics
◦ UiO-Department of Mathematics
◦ Norwegian Computing Center
◦ UiO-Section of Medical Statistics
► Write an email to [email protected] and briefly explain which issue you would like to discuss with us
► For more information about the service
▪ http://statgenconsult.nr.no/