Primer on Statistics for Interventional Cardiologists
Giuseppe Sangiorgi, MD
Pierfrancesco Agostoni, MD
Giuseppe Biondi-Zoccai, MD
Why waste time on statistics?
BMJ 2003
What you will learn - hopefully!
• Introduction
• Basics
• Descriptive statistics
• Probability distributions
• Inferential statistics
• Finding differences in mean between two groups
• Finding differences in mean between more than 2 groups
• Linear regression and correlation for bivariate analysis
• Analysis of categorical data (contingency tables)
• Analysis of time-to-event data (survival analysis)
• Advanced statistics at a glance
• Conclusions and take home messages
What you will NOT learn
• Multivariable analysis
• Advanced linear regression methods
• Logistic regression
• Cox proportional hazards analysis
• Generalized linear models
• Bayesian methods
• Propensity analysis
• Resampling methods
• Meta-analysis
• Most popular statistical packages (beyond SPSS)
What to choose?
Simple and
easy-going
or …
fast
but tough?
Science or fiction?
There are three kinds of lies: lies, damned lies, and statistics
B. Disraeli
Knowledge is the process of piling up facts; wisdom lies in their simplification
M. Fischer
What is statistics?
DEFINITIONS
• A whole subject or discipline
• A collection of methods
• Collections of data
• Specially calculated figures
A collection of methods
Statistics is great
Find out stuff
– Finding stuff out is fun
• Feel like you have done something
• It’s small, but it’s something
Understand stuff
– When are we being deceived?
– Support, or illumination?
Ultimate goal: appraisal of causation
Methods of inquiry
Statistical inquiry may be…
Descriptive
(to summarize or describe an observation)
or
Inferential
(to use the observations to make estimates or predictions)
Questions?
What you will learn
• Basics
– concepts of population and sample
– collecting data
– study design and protocol
– randomization
– intention-to-treat vs per-protocol analysis
– types of variables and measurement scales
Population and sample: at the heart of
descriptive and inferential statistics
Again: statistical inquiry may be…
Descriptive
(to describe a sample/population)
or
Inferential
(to measure the likelihood that estimates generated from the
sample may truly represent the underlying population)
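The two kinds of inquiry can be sketched side by side. This is only a minimal illustration: the late-loss values are invented, the function names are hypothetical, and the confidence interval uses a normal approximation (a t-based interval would be more appropriate for a sample this small).

```python
import math
import statistics

# Hypothetical late-loss measurements (mm) from a small sample; illustrative only.
late_loss_mm = [0.10, 0.25, 0.31, 0.18, 0.42, 0.27, 0.35, 0.22, 0.15, 0.30]


def describe(values):
    """Descriptive step: summarize the observed sample."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 denominator)
    }


def mean_confidence_interval(values, level=0.95):
    """Inferential step: normal-approximation CI for the population mean."""
    summary = describe(values)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    half_width = z * summary["sd"] / math.sqrt(summary["n"])
    return summary["mean"] - half_width, summary["mean"] + half_width


summary = describe(late_loss_mm)
ci_low, ci_high = mean_confidence_interval(late_loss_mm)
print(f"mean = {summary['mean']:.3f} mm, SD = {summary['sd']:.3f} mm")
print(f"approximate 95% CI for the mean: {ci_low:.3f} to {ci_high:.3f} mm")
```

The first function only describes the ten values at hand; the second uses them to make a statement about the population they were drawn from.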
Descriptive statistics
[Figure: AVERAGE]
Descriptive statistics: example
Meredith et al, Am J Cardiol 2007
Inferential statistics
If I become a scaffolder, how likely am I to eat well every day?
P values
Confidence intervals
Inferential statistics
Mauri et al, New Engl J Med 2007
Focus on p values
Mauri et al, New Engl J Med 2007
Focus on confidence intervals
Mauri et al, New Engl J Med 2007
Samples and populations
This is a sample
And this is its universal population
Example: only 300 patients! (Kastrati et al, JAMA 2005)
This is another sample
And this might be its universal population
But what if THIS is its universal population?
Any inference thus depends on our confidence in its likelihood
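A small simulation can make the point about samples of "only 300 patients" concrete. It is a sketch under stated assumptions: the 10% event rate of the underlying population and the number of repeated samples are invented for illustration, and the names are hypothetical.

```python
import random
import statistics

rng = random.Random(2005)  # fixed seed so the sketch is reproducible

TRUE_EVENT_RATE = 0.10  # assumed event rate in the population (hypothetical)
SAMPLE_SIZE = 300       # "only 300 patients"
N_SAMPLES = 1000        # number of hypothetical repeated studies


def sample_event_rate(n, true_rate):
    """Observed event rate in one random sample of n patients."""
    events = sum(1 for _ in range(n) if rng.random() < true_rate)
    return events / n


estimates = [sample_event_rate(SAMPLE_SIZE, TRUE_EVENT_RATE) for _ in range(N_SAMPLES)]

print(f"lowest estimate:   {min(estimates):.3f}")
print(f"highest estimate:  {max(estimates):.3f}")
print(f"mean of estimates: {statistics.mean(estimates):.3f}")
```

Each simulated study draws from the same population, yet the estimates differ from one sample to the next; that spread is what p values and confidence intervals try to quantify.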
Data collection
• Data collection is pivotal and should be planned
well before actually performing it
• Any variable or item code should be collected in
a clear and unequivocal way
• A missing code is still a code (eg 999)
• Data types can be dozens:
– String
– Categorical
– Ordinal
– Date
– Time
– Interval
Data collection
• Coherence and safety checks should always be implemented
• Multiple data entry should be used to minimize human error
• Thorough monitoring and querying are also critical
• Currently, the best approach to data collection is the web-based case report form (CRF)
• Despite this, the risk of information bias is always present and should be kept to a minimum
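The checks listed above can be sketched in a few lines. Everything here is an assumption made for illustration: the records, field names, plausible ranges, and helper functions are hypothetical, with 999 used as the missing code.

```python
MISSING_CODE = 999  # "a missing code is still a code"

# Two independent entries of the same hypothetical CRF records (patient id -> fields).
entry_1 = {
    1: {"age": 64, "stent_length_mm": 18},
    2: {"age": 71, "stent_length_mm": MISSING_CODE},
    3: {"age": 58, "stent_length_mm": 24},
}
entry_2 = {
    1: {"age": 64, "stent_length_mm": 18},
    2: {"age": 17, "stent_length_mm": MISSING_CODE},  # keying error: 17 instead of 71
    3: {"age": 58, "stent_length_mm": 24},
}

# Plausible ranges used for coherence checks (assumed limits, for illustration).
PLAUSIBLE_RANGE = {"age": (18, 110), "stent_length_mm": (8, 48)}


def double_entry_discrepancies(first, second):
    """List (patient, field, value_1, value_2) where the two entries disagree."""
    return [
        (pid, field, first[pid][field], second[pid][field])
        for pid in sorted(first)
        for field in first[pid]
        if first[pid][field] != second[pid][field]
    ]


def range_violations(records):
    """List (patient, field, value) outside the plausible range, skipping missing codes."""
    violations = []
    for pid in sorted(records):
        for field, value in records[pid].items():
            low, high = PLAUSIBLE_RANGE[field]
            if value != MISSING_CODE and not (low <= value <= high):
                violations.append((pid, field, value))
    return violations


def count_missing(records):
    """Count fields explicitly coded as missing."""
    return sum(value == MISSING_CODE for fields in records.values() for value in fields.values())


print("double-entry discrepancies:", double_entry_discrepancies(entry_1, entry_2))
print("range violations in entry 2:", range_violations(entry_2))
print("missing fields in entry 1:", count_missing(entry_1))
```

An explicit missing code lets the range check tell "not recorded" apart from an implausible value, and comparing two independent entries flags keying errors for a query.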
Designs for various research goals
• CASE STUDY/REPORT/SERIES
• SURVEY
• CROSS SECTIONAL
• MATCHED PAIRS (CASE-CONTROL)
• HISTORICAL CONTROLS (BEFORE-AFTER)
• CONCURRENT CONTROLS
• LONGITUDINAL (COHORT)
• CROSS-OVER
• RANDOMIZED CLINICAL TRIAL
• META-ANALYSIS
Phases of clinical research
ANIMAL STUDIES
CHEMICAL PHARMACOLOGY AND TOXICOLOGY
PHASE I
PHASE II
PHASE III
PHASE IV
REGULATORY APPROVAL
PILOT/FEASIBILITY STUDY
PIVOTAL STUDY
POST-MARKETING STUDY
REGISTRATION (CE MARK)
MARKETING
Endeavor research program
• ENDEAVOR I: Phase I FIM, 60 month results
• ENDEAVOR II: Double-blind Randomized Trial, 48 month results
• ENDEAVOR II CA Registry: Continued Access Safety, 48 month results
• ENDEAVOR III: Confirmatory Trial vs. Cypher, 36 month results
• ENDEAVOR IV: Confirmatory Trial vs. Taxus, 24 month results
• ENDEAVOR Japan: Single Arm Trial, 12 month results
• E-Five Registry: Real-World Performance and Safety Evaluation, 12 month results
• PROTECT: Endeavor vs. Cypher Safety Study, 8,800 patient RCT
Reviews
Preclinical studies
Joner et al, JACC 2008
Case report(s)
McFadden et al, Lancet 2004
Cross-sectional study
Case-control study
Before-after study
Cohort study (registry)
Lee et al, EuroInterv 2007
Cross-over study
Randomized trial
Fajadet et al, Circulation 2006
Another RCT – the SORT OUT II
Would you trust this trial?
Galloe et al, JAMA 2008
Another RCT – the ENDEAVOR IV
Patients enrolled and randomized: N = 1548
                       Endeavor (n = 773)   Taxus (n = 775)
Clinical F/U (12 mo)   754/773 (97.5%)      751/775 (96.9%)
Clinical F/U (24 mo)   742/773 (96.0%)      739/775 (95.4%)
Meta-analysis
Kastrati et al, NEJM 2007
Randomization
• Technique enabling the correct application of
statistical tests according to frequentist theory (R.
Fisher)
• Randomization means random allocation of the
patient (or any other study unit) to one of the
possible treatments
• In the long run, randomization minimizes the chances of
imbalances in patient or procedural features, but this
applies only to large samples (several hundred patients)
and to a few key clinical features
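The sample-size caveat can be illustrated with a simulation. This is a sketch only: the 25% prevalence of the baseline feature, the trial sizes, and the function names are assumptions chosen for illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed for reproducibility
PREVALENCE = 0.25       # assumed prevalence of a baseline feature, e.g. diabetes (hypothetical)


def imbalance(n_patients):
    """Absolute between-arm difference in feature prevalence after simple randomization."""
    counts = {"A": [0, 0], "B": [0, 0]}  # arm -> [patients, patients with the feature]
    for _ in range(n_patients):
        arm = rng.choice("AB")
        counts[arm][0] += 1
        counts[arm][1] += int(rng.random() < PREVALENCE)
    rates = [with_feature / max(total, 1) for total, with_feature in counts.values()]
    return abs(rates[0] - rates[1])


def mean_imbalance(n_patients, n_trials=500):
    """Average imbalance over many simulated trials of the same size."""
    return statistics.mean(imbalance(n_patients) for _ in range(n_trials))


small_trial = mean_imbalance(40)
large_trial = mean_imbalance(800)
print(f"mean imbalance with 40 patients:  {small_trial:.3f}")
print(f"mean imbalance with 800 patients: {large_trial:.3f}")
```

In small simulated trials the two arms often differ noticeably in the baseline feature by chance alone; the difference shrinks as the sample grows.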
Randomization types
• Simple
• In blocks
• Stratified
• Clustered
Pt number   Rx   Pt number   Rx
1           A    12          B
2           B    13          B
3           B    14          A
4           B    15          A
5           B    16          B
6           A    17          B
7           A    18          A
8           B    19          B
9           B    20          B
10          A    21          B
11          A    22          B
Randomization types
• Simple
• In blocks
• Stratified
• Clustered
Pt number   Rx   Pt number   Rx
1           A    1           A
2           B    2           B
3           B    3           B
4           B    4           A
5           B    5           A
6           A    6           A
7           A    7           B
8           B    8           B
9           B    9           A
10          A    10          B
11          A    11          A
Randomization types
• Simple
• In blocks
• Stratified
• Clustered
Pt number   Rx   Pt number   Rx
1           A    12          B
2           B    13          B
3           B    14          A
4           A    15          A
5           B    16          B
6           A    17          B
7           A    18          A
8           B    19          B
9           B    20          A
10          A    21          B
11          A    22          B
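Allocation lists like those above could be generated along the following lines. This is a hedged sketch, not the method used for the slides: the block size of 4, the stratum names, and the function names are assumptions, and clustered randomization is not covered.

```python
import random


def simple_randomization(n_patients, rng):
    """Each patient is allocated independently; arm sizes may end up unequal."""
    return [rng.choice("AB") for _ in range(n_patients)]


def block_randomization(n_patients, rng, block_size=4):
    """Shuffle balanced blocks so arm sizes are equal after every complete block."""
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for two equally sized arms")
    allocations = []
    while len(allocations) < n_patients:
        block = ["A", "B"] * (block_size // 2)
        rng.shuffle(block)
        allocations.extend(block)
    return allocations[:n_patients]


def stratified_randomization(stratum_sizes, rng, block_size=4):
    """Keep a separate block-randomized list for each stratum."""
    return {
        stratum: block_randomization(size, rng, block_size)
        for stratum, size in stratum_sizes.items()
    }


rng = random.Random(42)  # fixed seed for a reproducible illustration
simple = simple_randomization(22, rng)
blocked = block_randomization(22, rng)
stratified = stratified_randomization({"diabetic": 11, "non-diabetic": 11}, rng)

print("simple:    ", "".join(simple), "-> A:", simple.count("A"), "B:", simple.count("B"))
print("in blocks: ", "".join(blocked), "-> A:", blocked.count("A"), "B:", blocked.count("B"))
for stratum, allocations in stratified.items():
    print(f"stratified ({stratum}):", "".join(allocations))
```

Simple randomization can leave the arms unequal in a list of 22 patients, whereas blocking guarantees balance after every complete block, and stratification applies that guarantee within each stratum.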
Wrong or pseudo-randomizations
EXAMPLES – TO AVOID!
1. Alternate days of admission
2. According to birthday
3. Coin tossing
4. Card deck selection
5. Patient initials
Intention-to-treat analysis
• Intention-to-treat (ITT) analysis is an
analysis based on the initial treatment
intent, irrespective of the treatment
eventually administered
• ITT analysis is intended to avoid various types of
bias that can arise in intervention research,
especially procedural, compliance and survivor
bias
• However, ITT dilutes the power to achieve
statistically and clinically significant differences,
especially as drop-in and drop-out rates rise
Per-protocol analysis
• In contrast to the ITT analysis, the per-protocol
(PP) analysis includes only those patients who
complete the entire clinical trial or other particular
procedure(s), or have complete data
• In PP analysis each patient is
categorized according to the actual
treatment received, and not according
to the originally intended treatment
assignment
• PP analysis is largely prone to bias,
and is useful almost only in
equivalence or non-inferiority studies
ITT vs PP
100 pts enrolled
RANDOMIZATION
• 50 pts to group A (more toxic)
• 50 pts to group B (conventional Rx, less toxic)
ACTUAL THERAPY
• Group A: 45 pts treated with A, 5 shifted to B because of poor global health (all 5 died)
• Group B: 50 pts treated with B (none died)
RESULTS
• ITT: 10% (5/50) mortality in group A vs 0% (0/50) in group B, p=0.021 in favor of B
• PP: 0% (0/45) mortality in group A vs 9.1% (5/55) in group B, p=0.038 in favor of A
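The two analyses can be reproduced from the counts on the slide. The slide does not name the test behind its p values; they appear consistent with a Pearson chi-square test without continuity correction, so that choice is an assumption here, as is the helper function name.

```python
import math


def chi_square_2x2(events_a, n_a, events_b, n_b):
    """Pearson chi-square test (1 df, no continuity correction) for two proportions."""
    a, b = events_a, n_a - events_a  # group A: deaths, survivors
    c, d = events_b, n_b - events_b  # group B: deaths, survivors
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail of chi-square with 1 df
    return chi2, p_value


# Intention-to-treat: patients analysed in the arm they were randomized to.
itt_chi2, itt_p = chi_square_2x2(events_a=5, n_a=50, events_b=0, n_b=50)

# Per-protocol as defined on the slide: patients analysed by treatment received.
pp_chi2, pp_p = chi_square_2x2(events_a=0, n_a=45, events_b=5, n_b=55)

print(f"ITT: 5/50 vs 0/50, chi-square = {itt_chi2:.2f}, p = {itt_p:.3f}")
print(f"PP:  0/45 vs 5/55, chi-square = {pp_chi2:.2f}, p = {pp_p:.3f}")
```

The same five deaths move from one arm to the other depending on the analysis, which is why the two approaches point in opposite directions.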
Types of variables
Variables
• CATEGORY
– nominal: Death: yes/no; TLR: yes/no; Radial/brachial/femoral
– ordinal (ordered categories, ranks): TIMI flow
• QUANTITY
– discrete (counting): Stent diameter; Stent length
– continuous (measuring): BMI; Blood pressure; QCA data (MLD, late loss)
Paired vs unpaired data
Variables
PAIRED OR REPEATED MEASURES, eg
• blood pressure measured twice in the same patients at different times
• MLD measured at different times in the same segment
UNPAIRED OR INDEPENDENT MEASURES, eg
• blood pressure measured in several different groups of patients only once
• MLD measured at the same time in different vessels
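Why the distinction matters can be sketched with a t statistic computed both ways on the same numbers. The blood pressure values are invented, the function names are hypothetical, and the Welch form of the unpaired statistic is one common choice rather than anything stated on the slides.

```python
import math
import statistics

# Hypothetical systolic blood pressure (mmHg) in the same 6 patients, before and after.
before = [150, 142, 160, 155, 148, 165]
after = [144, 139, 152, 150, 145, 158]


def paired_t(x, y):
    """t statistic for paired data: analyse the within-patient differences."""
    diffs = [xi - yi for xi, yi in zip(x, y)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error


def unpaired_t(x, y):
    """Welch t statistic for independent groups: compare the two group means."""
    standard_error = math.sqrt(
        statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    )
    return (statistics.mean(x) - statistics.mean(y)) / standard_error


t_paired = paired_t(before, after)
t_unpaired = unpaired_t(before, after)
print(f"paired t statistic:   {t_paired:.2f}")
print(f"unpaired t statistic: {t_unpaired:.2f}")
```

Treating repeated measures as if they came from independent groups ignores the within-patient pairing; in this made-up example that produces a much smaller t statistic for the same mean difference.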
Measurement scales
• What is measurement: the assignment of
numbers to objects or events in a
systematic fashion
• Thus, four levels of measurement scales
are commonly distinguished:
– nominal
– ordinal
– interval
– ratio
Thank you for your attention
For any correspondence:
[email protected]
For further slides on these topics feel
free to visit the metcardio.org website:
http://www.metcardio.org/slides.html