False Discovery or
Missed Discovery
Bernie Devlin
Dept of Psychiatry,
University of Pittsburgh, SOM
Collaborators: K. Roeder, L. Wasserman, S.A. Bacanu, C.R. Genovese
Introduction
Topic: Multiple tests, and tradeoff of true / false discoveries
Properties of single & multiple tests
FDR: When some hypotheses are true
How to increase the power of FDR: weighted FDR
Conclusions
False or Missed Discoveries
Statistical decision: choose a rejection threshold to control the Type I error rate (False Discovery) while achieving desirable power for relevant alternatives (too little power: Missed Discovery).
Now Commonly
Many, many tests
Power
Power ?
How to Choose Threshold α?
• Uncorrected Testing
• Ignores multiple testing
• Reject if P < α
• WRONG!
• Control Experiment-wise Type I Error (FWER)
• Bonferroni correction
• Reject if P < α/m
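To make the two rules above concrete, here is a minimal sketch. The function names and the example p-values are made up for illustration; only the two rejection rules (P < α and P < α/m) come from the slide.

```python
def reject_uncorrected(p_values, alpha=0.05):
    """Reject H0 whenever P < alpha, ignoring the number of tests."""
    return [p < alpha for p in p_values]


def reject_bonferroni(p_values, alpha=0.05):
    """Reject H0 whenever P < alpha / m, which bounds the FWER by alpha."""
    m = len(p_values)
    return [p < alpha / m for p in p_values]


# Hypothetical p-values from m = 6 tests.
p_values = [0.0004, 0.003, 0.012, 0.04, 0.2, 0.6]
n_uncorrected = sum(reject_uncorrected(p_values))
n_bonferroni = sum(reject_bonferroni(p_values))
print(n_uncorrected, n_bonferroni)
```

With many tests the Bonferroni cutoff α/m becomes very small, which is the loss of power that motivates the rest of the talk.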
The Multiple Testing Problem
• Reject if p-value Pi ≤ Threshold
• D = # discoveries
• FD = # false discoveries

             Ho Retained   Ho Rejected   Total
Ho True      TN            FD            mo
H1 True      FN            TD            m1
Total        N             D             m
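The cells of this table can only be counted when the truth is known, as in a simulation. The sketch below assumes a hypothetical list `is_null` marking which null hypotheses are true; the function name and example values are illustrative.

```python
def discovery_table(p_values, is_null, threshold):
    """Count TN, FD, FN, TD for the rule 'reject if P_i <= threshold'.

    is_null[i] is True when H0 is true for test i (known only in simulations).
    """
    counts = {"TN": 0, "FD": 0, "FN": 0, "TD": 0}
    for p, null in zip(p_values, is_null):
        rejected = p <= threshold
        if null:
            counts["FD" if rejected else "TN"] += 1
        else:
            counts["TD" if rejected else "FN"] += 1
    counts["D"] = counts["FD"] + counts["TD"]  # total discoveries
    counts["N"] = counts["TN"] + counts["FN"]  # total retained
    return counts


# Hypothetical example: 5 tests, 2 of them true alternatives.
example = discovery_table(
    p_values=[0.001, 0.02, 0.30, 0.004, 0.75],
    is_null=[False, True, True, False, True],
    threshold=0.05,
)
print(example)
```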
More Power To You
• FWER is principled but ... Right goal?
• Aim for power and principle.
• False Discovery Control: Bound ratio FD/D
Proposed by Benjamini and Hochberg (1995):
False Discovery Rate
FDR = E(FD / D) ≤ α
[Figure: Power versus rejection threshold (Reject P < Threshold); thresholds marked: α/m, t, tw, α]
The Benjamini-Hochberg Procedure (1995)
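• Order the p-values: P(1) ≤ P(2) ≤ … ≤ P(m)
• The BH threshold is the largest P(i) such that P(i) ≤ iα/m; reject every hypothesis with a p-value at or below it
The FDR is controlled at level α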
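As a rough sketch of the Benjamini-Hochberg step-up rule: sort the p-values, find the largest ordered p-value P(i) with P(i) ≤ iα/m, and reject everything at or below it. The function name and the example p-values are illustrative, not from the talk.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (threshold, rejected) for the BH step-up procedure.

    threshold is the largest ordered p-value P(i) with P(i) <= i * alpha / m,
    or None when no ordered p-value satisfies the condition.
    """
    m = len(p_values)
    threshold = None
    for i, p in enumerate(sorted(p_values), start=1):
        if p <= i * alpha / m:
            threshold = p  # keep the largest qualifying p-value
    rejected = [threshold is not None and p <= threshold for p in p_values]
    return threshold, rejected


# Hypothetical p-values from m = 8 tests.
p_values = [0.001, 0.013, 0.014, 0.019, 0.5, 0.7, 0.8, 0.9]
threshold, rejected = benjamini_hochberg(p_values)
print(threshold, sum(rejected))
```

In this example the second ordered p-value (0.013) misses its own cutoff 2α/m, yet it is still rejected because a larger p-value further down the list meets its cutoff. That step-up behavior is where BH gains power over Bonferroni, which would reject only the smallest p-value here.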
B-H, in action
FDR Control for Dependent Tests
• Contiguous regions of the brain.
  Dependent FDR: Benjamini and Yekutieli
• Association with gene x gene interaction.
  Devlin et al. (2003) Gen Epi 25: 36-47
For large m, FDR control holds provided the empirical distribution of p-values is a consistent estimator of the distribution of p-values.
What Kind of Dependent Tests?
Search for liability loci in a large number of genes,
allowing for gene-gene interaction.
GLM for phenotype with pairwise interactions:
$$g(\mu) = \sum_{r=1}^{k} \beta_r X_{ir} + \sum_{1 \le r < s \le k} \beta_{rs} X_{ir} X_{is} + \dots$$
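A small sketch of how the main-effect and pairwise-interaction terms of this model can be built for one subject. It stops at pairwise terms (the "…" in the formula is not expanded), and the function names and coefficient values are hypothetical.

```python
from itertools import combinations


def design_row(x):
    """Expand one subject's k predictors into main effects plus all
    pairwise products X_r * X_s with r < s."""
    main = list(x)
    pairs = [x[r] * x[s] for r, s in combinations(range(len(x)), 2)]
    return main + pairs


def linear_predictor(x, beta_main, beta_pair):
    """Evaluate sum_r beta_r X_r + sum_{r<s} beta_rs X_r X_s for one subject.

    beta_pair maps (r, s) with r < s to beta_rs; missing pairs count as 0,
    which corresponds to dropping those terms from the full model.
    """
    eta = sum(b * xr for b, xr in zip(beta_main, x))
    for r, s in combinations(range(len(x)), 2):
        eta += beta_pair.get((r, s), 0.0) * x[r] * x[s]
    return eta


# Hypothetical subject with k = 3 predictors (e.g. genotype codes).
x = [1, 0, 2]
print(design_row(x))
print(linear_predictor(x, beta_main=[0.5, 0.2, 0.1], beta_pair={(0, 2): 0.3}))
```

With k predictors there are k(k-1)/2 pairwise terms, so the number of tests grows quickly and the tests share predictors, which is one source of the dependence discussed here.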
A way to view model
[Diagram: main effects X1, X2, …, Xk and interactions X1*X2, X2*X3, … as predictors of Y]
Full Model: All terms
Reduced Model: Drop “terms”
Target set Approach
[Diagram: nodes 1, 2, 3, 4 with pairwise terms 12, 13, 14, 23, 24, 34; sets A and B with term A*B]
Prior information
• FDR does not incorporate scientific priors
into the analysis.
• Consider formal procedure for using prior
information:
• Genes targeted due to linkage
• Genes in biological pathways
• Try Weights
[Figure: Power versus rejection threshold (Reject P < Threshold); thresholds marked: α/m, t, tw, α]
p-Value Weighting
• Control overall FDR, but favor some
hypotheses with weights.
• Think betting:
  • Up-weight candidates (wi > 1)
  • Down-weight others (wk < 1)
  • Budget: average weight equals 1
  • Placing bets: Pi/wi
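The betting idea can be sketched as an ordinary BH step-up rule applied to the weighted p-values Pi/wi. This is an illustration under the assumption that the weights are positive and average to 1; the function name, weights, and p-values are hypothetical.

```python
def weighted_bh(p_values, weights, alpha=0.05):
    """Apply the BH step-up rule to the weighted p-values P_i / w_i.

    Weights must be positive and average to 1 (the "budget").
    Returns a list of booleans marking rejected hypotheses.
    """
    m = len(p_values)
    if len(weights) != m or any(w <= 0 for w in weights):
        raise ValueError("need one positive weight per p-value")
    if abs(sum(weights) / m - 1.0) > 1e-9:
        raise ValueError("weights must average to 1")
    q = [p / w for p, w in zip(p_values, weights)]
    cutoff = None
    for i, qi in enumerate(sorted(q), start=1):
        if qi <= i * alpha / m:
            cutoff = qi  # largest weighted p-value meeting its BH cutoff
    return [cutoff is not None and qi <= cutoff for qi in q]


# Hypothetical example: tests 0 and 4 are up-weighted candidates.
p_values = [0.02, 0.001, 0.3, 0.6, 0.04, 0.9, 0.5, 0.7]
weights = [2.5, 0.5, 0.5, 0.5, 2.5, 0.5, 0.5, 0.5]
print(weighted_bh(p_values, [1.0] * 8))  # no weighting
print(weighted_bh(p_values, weights))    # candidates favored
```

In this toy example the unweighted rule rejects one hypothesis and the weighted rule rejects three, because the two candidates with moderate p-values were bet on. A bad bet works the other way: a down-weighted true signal needs a smaller p-value to be found.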
Choosing good weights
• Mean (w1,…,wm) = 1
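• True model: “a” = fraction alternatives
• Betting model: “ε” = fraction up-weighted
• Binary weights:
$$\text{Bet} = \begin{cases} w \propto 1 & \text{for fraction } (1 - \varepsilon) \\ w \propto B & \text{for fraction } \varepsilon \end{cases}$$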
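For binary weights, requiring the mean weight to equal 1 fixes the constant of proportionality: with a fraction ε of tests up-weighted by a factor B, the two weight values are 1/((1 − ε) + εB) and B/((1 − ε) + εB). A minimal sketch follows; the function name and the example flags are hypothetical.

```python
def binary_weights(up_weighted, B):
    """Return weights proportional to B for up-weighted tests and to 1 for
    the rest, rescaled so the average weight equals 1."""
    m = len(up_weighted)
    eps = sum(up_weighted) / m             # fraction up-weighted
    scale = 1.0 / ((1.0 - eps) + eps * B)  # makes the mean weight 1
    return [scale * (B if flag else 1.0) for flag in up_weighted]


# Hypothetical example: 2 of 8 tests are up-weighted with B = 5.
flags = [True, False, False, False, True, False, False, False]
weights = binary_weights(flags, B=5)
print(weights, sum(weights) / len(weights))
```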
Power For a Given Test
Genome-wide Association
• A decade ago the idea of genome-wide
association was envisioned (Risch and Merikangas).
• ASHG: Risch suggested focusing testing on SNPs
under linkage peaks.
• Technology favors pre-selected SNPs, fixed
platform?
Linkage Trace for Body Mass Index
Continuous Weights
• Linkage trace Y for weights.
• Exponential Weights
  • e^{BY}
• Cumulative Weights
  • P(Z < Y − B)
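A sketch of both weighting schemes applied to a linkage trace Y. Two assumptions are made here that the slide does not state: Z is taken to be a standard normal variable, so P(Z < Y − B) = Φ(Y − B), and the raw weights are rescaled so that their average is 1. The function names and the example trace are hypothetical.

```python
import math
from statistics import NormalDist


def exponential_weights(trace, B):
    """Weights proportional to exp(B * Y), rescaled to mean 1."""
    raw = [math.exp(B * y) for y in trace]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]


def cumulative_weights(trace, B):
    """Weights proportional to P(Z < Y - B) for standard normal Z
    (an assumption), rescaled to mean 1."""
    phi = NormalDist().cdf
    raw = [phi(y - B) for y in trace]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]


# Hypothetical linkage trace with one peak in the middle.
trace = [0.0, 0.5, 1.0, 4.0, 1.0, 0.5, 0.0]
print(exponential_weights(trace, B=1.0))
print(cumulative_weights(trace, B=1.0))
```

On a trace like this one, the exponential scheme puts a much larger share of the budget on the peak, while the cumulative scheme is bounded above by the normal CDF and spreads weight more evenly.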
Exponential weights exaggerate extreme values in linkage trace.
[Figure: linkage trace with cumulative and exponential weights; x-axis 0 to 100, y-axis -2 to 14]
Cumulative weights give broader peaks.
[Figure: linkage trace with cumulative and exponential weights; x-axis 0 to 100, y-axis -2 to 14]
Bigger B creates concentrated, dramatic up-weighting.
[Figure: linkage trace with cumulative weights for small B and large B; x-axis 0 to 100, y-axis -2 to 18]
Simulation Experiment
• Generate linkage traces
• Generate association tests
• 500K SNPs
• 10 causal SNPs
• Power = # Discoveries/10
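A scaled-down sketch of such an experiment. Every distributional choice below is an assumption made for illustration (normal test statistics, a normal linkage trace, cumulative weights, one-sided p-values, and a default of 20,000 SNPs rather than 500K); the talk does not specify how its traces or tests were generated.

```python
import random
from statistics import NormalDist

NORMAL = NormalDist()


def bh_reject(q, alpha):
    """BH step-up on (possibly weighted) p-values q; returns rejection flags."""
    m = len(q)
    cutoff = None
    for i, qi in enumerate(sorted(q), start=1):
        if qi <= i * alpha / m:
            cutoff = qi
    return [cutoff is not None and qi <= cutoff for qi in q]


def simulate_power(m=20000, n_causal=10, effect=5.0, linkage_signal=3.0,
                   B=1.0, alpha=0.05, seed=0):
    """Return (power without weights, power with cumulative weights).

    Assumed model: association statistics are N(effect, 1) at causal SNPs
    and N(0, 1) elsewhere; the linkage trace is N(linkage_signal, 1) at
    causal SNPs and N(0, 1) elsewhere.
    """
    rng = random.Random(seed)
    causal = set(rng.sample(range(m), n_causal))
    z = [rng.gauss(effect if i in causal else 0.0, 1.0) for i in range(m)]
    trace = [rng.gauss(linkage_signal if i in causal else 0.0, 1.0)
             for i in range(m)]
    p = [NORMAL.cdf(-zi) for zi in z]          # one-sided p-values
    raw = [NORMAL.cdf(y - B) for y in trace]   # cumulative weights
    mean_raw = sum(raw) / m
    w = [r / mean_raw for r in raw]            # budget: mean weight 1

    def power(flags):
        # Power = number of causal SNPs discovered / number of causal SNPs.
        return sum(flags[i] for i in causal) / n_causal

    plain = bh_reject(p, alpha)
    weighted = bh_reject([pi / wi for pi, wi in zip(p, w)], alpha)
    return power(plain), power(weighted)


print(simulate_power())
```

Setting `linkage_signal=0.0` makes the trace uninformative, which gives a rough way to check how much power is lost when the weights are pure noise.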
Impact of Weights on Power
• Improvements for either choice: finding 1-2 more signals out of 10.
• Uninformative linkage trace, losses are small.
[Figure: power (0.0 to 1.0) versus x-axis values 4.0 to 6.0; curves: Bonferroni, No Weight, Weight, Noise]
Brain Imaging
Increase power?
Standard FDR versus FDR using
weights based on regions
Standard B-H (red + blue)
Results: False Region Control Threshold
Pacifico et al. CMU Tech Report #771
Conclusions
In tradeoff of true / false discoveries
FDR has good properties when some tests are true
To increase FDR power, “focus” tests
Prior knowledge: expected G x G
P-value weighting: ‘prior’ knowledge