Biostatistics Case Studies 2006
Session 5:
Reporting Subgroup Results
Peter D. Christenson
Biostatistician
http://gcrc.LAbiomed.org/Biostat
Subgroup Issues
• Measuring subgroup effect
• Subgroups separately
• Interaction
• Selection of subgroups
• A priori
• Post-hoc
• Based on data
• Significance/strength of Conclusions
• Transparency of analysis
• Formal statistical comparisons; p-values, CIs.
Case Study
Editorial: pp. 1667-69
Case Study: Abstract
Main Subgroup Result
Separate Subgroup Comparisons
% with events, by subgroup and treatment:
Symptomatic (N = 12153): Aspirin only 7.9, Combination 6.9; Δ = 1.0, p = 0.05; RR = 0.88 (0.77 to 1.0)
Asymptomatic (N = 3284): Aspirin only 5.5, Combination 6.6; Δ = -1.0, p = 0.20; RR = 1.2 (0.91 to 1.59)
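As a rough illustration of how a risk ratio and its confidence interval are obtained for each subgroup separately, the sketch below uses a log-scale Wald interval. The event counts are hypothetical: they are back-calculated from the percentages on the slide under the assumption that each subgroup was split equally between the two arms, so the results only approximate the slide's figures.

```python
import math

def risk_ratio_ci(events_a, n_a, events_b, n_b, z=1.96):
    """Risk ratio of arm A vs. arm B with a log-scale Wald interval."""
    rr = (events_a / n_a) / (events_b / n_b)
    # Standard error of log(RR) for two independent binomial proportions
    se_log = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Hypothetical counts (assumption: equal allocation within each subgroup).
# Arm A = combination, arm B = aspirin only.
symptomatic = risk_ratio_ci(419, 6076, 480, 6076)   # about 6.9% vs. 7.9%
asymptomatic = risk_ratio_ci(108, 1642, 90, 1642)   # about 6.6% vs. 5.5%

for name, (rr, lo, hi) in [("Symptomatic", symptomatic),
                           ("Asymptomatic", asymptomatic)]:
    print(f"{name}: RR = {rr:.2f} ({lo:.2f} to {hi:.2f})")
```

The much wider interval for the asymptomatic group comes from its smaller N, not from a smaller observed effect.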
Separate Subgroup Conclusions
• Symptomatic group: Combination better
• Large N.
• Is magnitude of effect relevant? See CIs.
• Asymptomatic group: Inconclusive (0.91 ≤ RR ≤ 1.59)
• Same magnitude, but apparently in the opposite direction from the symptomatic group.
• Much smaller N; less power.
• Have not demonstrated subgroup difference.
• Use interaction to do so.
• Need to, based on CIs?
Subgroup Interaction
Symptomatic (N = 12153): Δ = 1.0, p = 0.05; RR = 0.88 (0.77 to 1.0)
vs.
Asymptomatic (N = 3284): Δ = -1.0, p = 0.20; RR = 1.2 (0.91 to 1.59)
Interaction = ΔΔ = 2.0%, with 95% CI ~ 0.65% to 3.35%
Why is interaction relevant? See next slide.
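The interaction ΔΔ is the difference between the two subgroup treatment differences, and its variance is the sum of their variances. A minimal sketch follows, assuming equal allocation within each subgroup and simple Wald standard errors; both are assumptions, not taken from the paper. Because it works from the unrounded percentages (5.5 vs. 6.6 gives Δ = -1.1), it yields ΔΔ of about 2.1% and an interval that differs from the approximate one on the slide.

```python
import math

def diff_and_variance(p_aspirin, p_combination, n_per_arm):
    """Risk difference (aspirin only minus combination) and its Wald variance."""
    diff = p_aspirin - p_combination
    var = (p_aspirin * (1 - p_aspirin) / n_per_arm
           + p_combination * (1 - p_combination) / n_per_arm)
    return diff, var

# Assumption: each subgroup is split equally between the two arms.
d_sym, v_sym = diff_and_variance(0.079, 0.069, 12153 / 2)
d_asym, v_asym = diff_and_variance(0.055, 0.066, 3284 / 2)

# Interaction = difference of differences; subgroups are independent,
# so the variances add.
interaction = d_sym - d_asym
se = math.sqrt(v_sym + v_asym)
lower = interaction - 1.96 * se
upper = interaction + 1.96 * se

print(f"Interaction = {100 * interaction:.1f}% "
      f"(95% CI {100 * lower:.2f}% to {100 * upper:.2f}%)")
```

An interval that excludes 0 corresponds to a significant interaction at the 5% level; the width of the interval shows how precisely the subgroup difference is estimated.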
Subgroup Conclusions with Interaction
• Symptomatic group: Combination better
• Large N.
• Is magnitude of effect relevant? See CIs.
• Asymptomatic group: Inconclusive (0.91 ≤ RR ≤ 1.59)
• Same magnitude, but apparently in the opposite direction from the symptomatic group.
• Much smaller N; less power.
• Difference between subgroups:
• Significant according to interaction.
• The opposite-direction “non-effect” in the asymptomatic group is nevertheless incorporated into the interaction.
Change Data to Give Non-Significant Interaction
Suppose:
% with events, by subgroup and treatment:
Symptomatic (N = 12153): Aspirin only 7.9, Combination 6.9; Δ = 1.0, p = 0.05; RR = 0.88 (0.77 to 1.0)
Asymptomatic (N = 3284): Aspirin only 6.4, Combination 6.6; Δ = -0.2, p = 0.80; RR = 1.03 (0.40 to 1.4)
→ P for interaction ~ 0.50.
Change conclusions?
Changed Data Subgroup Conclusions
• Symptomatic group: Combination better
• Large N.
• Is magnitude of effect relevant? See CIs.
• Asymptomatic group: Inconclusive (0.40 ≤ RR ≤ 1.40)
• Apparently negligible, but not proven.
• Much smaller N; less power.
• Difference between subgroups:
• Not demonstrated.
• Use CI for ΔΔ to quantify magnitude of difference.
Change Data to Give Non-Significant Interaction
Suppose:
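% with events, by subgroup and treatment (new changes marked with →):
Symptomatic (N = 12153): Aspirin only 7.9, Combination 6.9; Δ = 1.0, p = 0.05; RR = 0.88 (0.77 to 1.0)
Asymptomatic (N = 3284 → 10000): Aspirin only 6.4, Combination 6.6; Δ = -0.2, p = 0.80; RR = 1.03 (0.40 to 1.3 → 0.96 to 1.1)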
→ P for interaction will be small.
Twice-Changed Data Subgroup Conclusions
• Symptomatic group: Combination better
• Large N.
• Is magnitude of effect relevant? See CIs.
• Asymptomatic group: Negligible (0.96 ≤ RR ≤ 1.1)
• Negligible, proven.
• Larger N → narrower CI; power not relevant.
• Difference between subgroups:
• Significantly demonstrated with interaction.
• Use CI for ΔΔ to quantify magnitude of difference.
Many Subgroup Analyses
12 Subgroups + Overall
Formal Multiple Comparison Adjustment
• Number of comparisons: k.
• Individual comparison false positive error rate = α.
• Experiment-wise error rate = α*.
• Bonferroni adjustment:
• Assume k comparisons are independent.
• True negative rate = specificity = 1 – α.
• Set α* = 1 - (1 - α)^k → solve for α = 1 - (1 - α*)^(1/k) ≈ α*/k.
• So, typically p < 0.05/(# tests) = 0.05/13 = 0.004 here.
• Conservative if comparisons are correlated; can improve
if correlation is known.
• No adjustment: Prob[≥1 false pos] = 1 - 0.95^k = 0.49 if k = 13. See next slide.
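The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch with illustrative function names; the exact solution under independence is often called the Šidák correction, and α*/k is its usual Bonferroni approximation.

```python
def familywise_error(alpha, k):
    """Probability of at least one false positive among k independent tests."""
    return 1 - (1 - alpha) ** k

def per_test_alpha_exact(alpha_star, k):
    """Per-test alpha that gives experiment-wise rate alpha_star (independent tests)."""
    return 1 - (1 - alpha_star) ** (1 / k)

def per_test_alpha_bonferroni(alpha_star, k):
    """Bonferroni approximation to the per-test alpha."""
    return alpha_star / k

k = 13  # 12 subgroups + overall
unadjusted = familywise_error(0.05, k)
exact = per_test_alpha_exact(0.05, k)
bonferroni = per_test_alpha_bonferroni(0.05, k)

print(f"No adjustment: P(>=1 false positive) = {unadjusted:.2f}")
print(f"Exact per-test alpha      = {exact:.4f}")
print(f"Bonferroni per-test alpha = {bonferroni:.4f}")
```

The exact and Bonferroni thresholds agree to about three decimal places here, which is why the simpler α*/k is typically used.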
Likelihood of False Positive Conclusions
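One way to see where the unadjusted probability of at least one false positive comes from is to simulate it. The sketch below assumes k = 13 independent tests with no true effects, so each p-value is uniform on (0, 1); the number of trials and the seed are arbitrary choices.

```python
import random

def simulate_any_false_positive(k=13, alpha=0.05, n_trials=20000, seed=1):
    """Estimate P(at least one p < alpha) when all k null hypotheses are true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Under the null, each p-value is a uniform random number.
        if any(rng.random() < alpha for _ in range(k)):
            hits += 1
    return hits / n_trials

estimate = simulate_any_false_positive()
print(f"Simulated P(>=1 false positive) = {estimate:.3f}")
print(f"Formula 1 - 0.95^13             = {1 - 0.95 ** 13:.3f}")
```

The simulated proportion should land close to the formula value of about 0.49; correlated comparisons, as subgroup analyses usually are, would give a somewhat lower figure.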
Subgroup Multiple Comparison Comments
• Many other specialized methods.
• Pre-specified comparisons count just the same as post-hoc ones, if the
post-hoc comparisons were not chosen based on results.
• Why limit “experiment-wise” count to subgroup
comparisons?
• No formal comparisons in this paper (but what if a
large difference had been observed?): Tables 1-3 list 22+20+26
potential covariates.
• P-values: Table 4 – 12 efficacy and safety
comparisons.
• Figure 2: 12 Subgroups. At least one explicit test.
Subgroup Multiple Comparison Conclusions
• Subgroups usually do need to be examined.
• To claim more than observations, adjust
in a well-defined way.
• Typically, report as observational and:
• Explain decisions and choices of subgroups.
• Formal adjustment typically not necessary.
• Avoid p-values. Emphasize CI range.
• Separate planned from data mining results.
• Number of comparisons should be explicit.
Recommendations for Reporting on Subgroups:
• See the Editorial. Use it to justify the following approach to the journal.
• Do not make multiple comparison adjustment.
• Be transparent about all analyses.
• State where conclusions are based on interactions.
• Report the number of comparisons planned prior to
looking at the data that were (1) included and (2) not included in
the paper.
• Report which results were a consequence of looking at data;
no p-values.
• Report if alternate definitions for a subgroup were examined.
• Give confidence intervals for effects that are compatible with
the data, not p-values, for subgroups.
Recommendations: Example of a Start at Them
Cohan (2005) Crit Care 23;10:2359-66.