False discovery rate: setting the probability of false claim of detection
Lucio Baggio
INFN and University of Trento, Italy
Where does this FDR come from?
• I met Prof. James T. Linnemann (MSU) at PHYSTAT2003 and explained to him the problem we had in assessing frequentist probabilities for the IGEC results…
• “We made many different (almost independent) background and coincidence counts, using different values for the target signal amplitude (thresholds).”
• “At last, one of the 90% confidence intervals did not include the null hypothesis… but when one accounts for the many trials, it is possible to compute that with 30% probability at least one of the tests falsely rejected the null hypothesis. So, no true discovery, after all.”
• “Perhaps next time we should use 99.99% confidence intervals, in order to keep the probability of a false claim low after tens of trials… But I’m afraid that signals can no longer emerge under such a stringent requirement.”
• He pointed out that maybe what I was looking for were false discovery rate methods…
Thanks, Jim!!
Why FDR?
When should I care about multiple-test procedures?
• All-sky surveys: many source directions and polarizations are tried
• Template banks
• Wide-open-eyes searches: many analysis pipelines are tried altogether, with different amplitude thresholds, signal durations, and so on
• Periodic updates of results: every new science run is a chance for a “discovery”. “Maybe the next one is the good one”.
• Many graphical representations or aggregations of the data: “If I change the binning, maybe the signal shows up better…”
Preliminary (1): hypothesis testing

                        | Null retained        | Reject null =          |
                        | (can't reject)       | accept alternative     | Total
------------------------+----------------------+------------------------+------
Null (H0) true:         | U                    | B (false discoveries,  | m0
background (noise)      |                      | false positives;       |
                        |                      | Type I error α = εb)   |
Alternative true:       | T (inefficiency;     | S (detected signals,   | m1
signal                  | Type II error        | true positives)        |
                        | β = 1 − εs)          |                        |
Total                   | m − R                | R = S + B (reported    | m
                        |                      | signal candidates)     |
Preliminary (2): p-level
Assume you have a model for the noise that affects the measure x, and derive a test statistic t(x) from x. Let F(t) be the cumulative distribution of t when x is sampled from noise only (off-signal). The p-level associated with t(x) is the probability, under noise alone, of a value of t at least as large:
p = P(t > t(x)) = 1 − F(t(x))
• Example (χ² test): p is the “one-tail” χ² probability associated with n counts (assuming d degrees of freedom).
• If the noise agrees with the model, the distribution of p is always uniform: its cumulative is P(p) = p, i.e. dP/dp = 1.
Usually the alternative hypothesis is not known. However, for our purposes it is sufficient to assume that the signal can be distinguished from the noise, i.e. dP/dp ≠ 1; typically, the measured values of p are biased toward 0.
[Figure: pdf of the p-level — flat at 1 for the background, peaked near p = 0 for the signal.]
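To make this concrete, here is a minimal simulation sketch (not from the talk; the χ² statistic with d = 4 degrees of freedom and the noncentral-χ² stand-in for the signal are illustrative assumptions):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
d = 4            # assumed degrees of freedom, for illustration
n = 100_000

# Noise only: t ~ chi2(d); the one-tail p-level is the survival function P(t' > t).
t_null = rng.chisquare(d, n)
p_null = stats.chi2.sf(t_null, d)

# Hypothetical signal: a noncentral chi2 shifts t upward, so p piles up near 0.
t_sig = rng.noncentral_chisquare(d, 10.0, n)
p_sig = stats.chi2.sf(t_sig, d)

# Histogram of p is flat under the null (dP/dp = 1) and biased toward 0 for signal.
print(np.histogram(p_null, bins=10, range=(0, 1))[0])
print(np.histogram(p_sig, bins=10, range=(0, 1))[0])
```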
Usual multiple testing procedures
For each single hypothesis test, the condition {p < α ⇒ reject null} leads to false positives with probability α.
In case of multiple tests (they need not share the same test statistic, nor the same tested null hypothesis), let p = {p1, p2, … pm} be the set of p-levels; m is the trial factor. We select “discoveries” using a threshold T(p): {pj < T(p) ⇒ reject null}.
• Uncorrected testing: T(p) = α
– The probability that at least one rejection is wrong is P(B > 0) = 1 − (1 − α)^m ≈ mα (for mα ≪ 1), hence a false discovery is guaranteed for m large enough.
• Fixed total Type I errors (Bonferroni): T(p) = α/m
– Controls the familywise error rate in the most stringent manner: P(B > 0) ≤ α
– This makes mistakes rare…
– … but in the end the efficiency becomes negligible (Type II errors take over)!!
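A quick numerical check of the two formulas above (α = 0.05 and m = 1000 are arbitrary choices for illustration):

```python
# Uncorrected testing: a false claim becomes certain as m grows.
# Bonferroni: the familywise error probability stays near alpha.
alpha, m = 0.05, 1000

p_uncorrected = 1 - (1 - alpha) ** m       # ~ 1 for m = 1000
p_bonferroni = 1 - (1 - alpha / m) ** m    # <= alpha (~ 0.0488 here)

print(f"P(B>0), uncorrected: {p_uncorrected:.4f}")
print(f"P(B>0), Bonferroni:  {p_bonferroni:.4f}")
```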
Controlling false discovery fraction
We desire to control (i.e. bound) the fraction of false discoveries over the total number of claims: B/R = B/(B+S) ≤ q. The threshold T(p) is then chosen accordingly.
Let us take the simple case in which signals are easily separable (e.g. high SNR). Since p is uniform under the null, the false discoveries below the threshold number B ≈ m0 T(p) ≤ m T(p), while R is just the cumulative count of p-levels below T(p). Hence
FDR = B/R ≈ m T(p)/R ≤ q  ⟹  T(p) ≤ q R/m
[Figure: top panel, pdf of the p-levels, with the signal peak S near p = 0 over the flat background B; bottom panel, cumulative counts R(p) versus p, with T(p) set where m·T(p) = q·R.]
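This bound can be checked by simulation. The sketch below is mine, not from the talk: it assumes uniform null p-levels and a Beta(0.05, 1) population as a stand-in for easily separable signals, then picks the largest threshold satisfying m·T(p) ≤ q·R:

```python
import numpy as np

rng = np.random.default_rng(1)
m0, m1, q = 9000, 1000, 0.1
m = m0 + m1

# Null p-levels are uniform; signal p-levels are biased toward 0 (assumed model).
p = np.concatenate([rng.uniform(size=m0), rng.beta(0.05, 1.0, size=m1)])
is_null = np.arange(m) < m0

# R(T) is the cumulative count of p-levels below T; keep the largest sorted
# p-value with m * p <= q * R, i.e. T(p) <= q R / m as derived above.
p_sorted = np.sort(p)
R = np.arange(1, m + 1)
ok = p_sorted <= q * R / m
T = p_sorted[ok][-1] if ok.any() else 0.0

claimed = p <= T
B = np.sum(claimed & is_null)    # false discoveries among the claims
print(f"T(p) = {T:.2e}, R = {claimed.sum()}, empirical B/R = {B / claimed.sum():.3f}")
```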
Benjamini & Hochberg FDR control procedure
Among the procedures that accomplish this task, one simple recipe was proposed by Benjamini & Hochberg (JRSS-B (1995) 57:289-300):
• compute the p-values {p1, p2, … pm} for the set of tests, and sort them in increasing order;
• choose your desired FDR q (don’t ask too much!);
• determine the threshold T(p) = pk by finding the largest index k such that pk ≤ (q/m) · k/c(m);
• take c(m) = 1 if the p-values are independent or positively correlated; otherwise c(m) = Σj (1/j), summed over j = 1…m.
[Figure: sorted p-values pj versus j/m, with the straight line of slope q/c(m) through the origin; the points below the line (“reject H0”) lie to the left of the threshold T(p).]
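A direct sketch of this recipe in code (the function name bh_threshold and the independent switch are mine, not from the paper):

```python
import numpy as np

def bh_threshold(p_values, q, independent=True):
    """Return T(p) = pk, the largest sorted p-value with pk <= (q/m) * k / c(m);
    returns 0.0 if no test is rejected."""
    p_sorted = np.sort(np.asarray(p_values))
    m = len(p_sorted)
    # c(m) = 1 for independent / positively correlated p-values,
    # c(m) = sum_{j=1..m} 1/j otherwise.
    c = 1.0 if independent else np.sum(1.0 / np.arange(1, m + 1))
    k = np.arange(1, m + 1)
    below = p_sorted <= (q / m) * k / c
    return p_sorted[below][-1] if below.any() else 0.0

# Usage: reject every null hypothesis whose p-level is at or below T(p).
rng = np.random.default_rng(2)
p = np.concatenate([rng.uniform(size=950), rng.uniform(size=50) * 1e-4])
T = bh_threshold(p, q=0.05)
print(f"T(p) = {T:.2e}; discoveries: {np.sum(p <= T)}")
```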
Summary
In the case of multiple tests one wants to control the false claim probability, but it is advisable to relax the strict requirement of NO false claim at all, which could otherwise end up burying the signals as well.
Controlling the FDR seems to be a wise suggestion.
This talk was based mainly on
Miller et al., AJ 122:3492-3505, Dec 2001, http://arxiv.org/abs/astro-ph/0107034
and Jim Linnemann’s talk, http://user.pa.msu.edu/linnemann/public/FDR/
However, there is a fairly wide literature about this, if one looks for references in biology, imaging, HEP, and recently, at last, also astrophysics!