Biostatistics: A Review
Tony Gerlach, Pharm.D, BCPS
Statistics
•  Methods for collecting, classifying,
summarizing & analyzing data
•  Descriptive
–  Frequency, Histogram, Measures of central
tendency, Measures of spread, Scatter plot
•  Inferential
–  Conclusion or generalization made about a
population from the study of a sample of
that population
Variables
•  Discrete (Nonparametric)
–  Nominal: classified into groups with no
particular order or severity (Yes/NO)
•  E.g., Sex, mortality, Disease State
–  Ordinal: Ranked in specific order but no
consistent level of magnitude between
groups
•  E.g., NYHA class, Trauma scores, Likert scales
Variables
•  Continuous (Parametric)
–  Interval: Data are ranked in specific order
with constant change in magnitude with
zero point arbitrary
•  E.g., Fahrenheit Temperature
–  Ratio: Like interval but with an absolute
zero
•  E.g., Heart Rate, Age, Blood pressure
Normal (Gaussian) Distribution
•  Bell Shape
•  Mu (μ): Mean
•  Sigma (σ): Standard
Deviation (SD)
–  68% of population
are within +/- 1 SD
–  95% of population
are within +/- 2 SD
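The 68% and 95% figures can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `coverage_within` is an illustrative choice, not something from the handout.

```python
import math

def coverage_within(k: float) -> float:
    """Fraction of a normally distributed population within +/- k SD of the mean."""
    # For a normal distribution, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 1.96):
    print(f"within +/- {k} SD: {coverage_within(k):.4f}")
```

The "+/- 2 SD" rule is a rounding of the more exact 1.96 SD that corresponds to 95% coverage.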
Descriptive Statistics
•  Mean: Average
–  Only for Continuous data
–  Sensitive to outliers
•  Median: Point where half of observations fall
below and above
–  Used with ordinal & continuous data
–  Insensitive to outliers
•  Mode: Most common value
–  For all data types
•  Mode=Median=Mean for Normal Distribution
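As an illustration of how an outlier affects each measure, here is a small sketch with made-up heart-rate values (the data are hypothetical and chosen only for the example).

```python
import statistics

# Hypothetical heart rates (beats/min); the last value is an outlier.
without_outlier = [60, 72, 72, 75, 80, 84]
with_outlier = without_outlier + [180]

for label, data in (("without outlier", without_outlier),
                    ("with outlier", with_outlier)):
    print(label,
          "mean:", round(statistics.mean(data), 1),
          "median:", statistics.median(data),
          "mode:", statistics.mode(data))
```

In this example the mean shifts noticeably when the outlier is added, while the median and mode barely move.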
Standard Deviation
•  Calculated to reflect the spread of sample
values around the mean
•  Appropriate for normal or nearly normal
data
•  Therefore used only with continuous data
Standard Error of the Mean (SEM)
•  SEM is estimated from Standard
deviation
•  SEM = SD/ √ n
•  As with any normal distribution, 95% of
sample means lie within +/- 2 SEM of the
mean
•  Use to calculate Confidence intervals
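A minimal sketch of the SEM formula and the "+/- 2 SEM" approximation to a 95% confidence interval follows. The function name `sem_and_ci` and the sample values are illustrative assumptions; for small samples a t-based multiplier would normally replace the factor of 2.

```python
import math
import statistics

def sem_and_ci(values, multiplier=2.0):
    """Return (mean, SEM, (low, high)) using SEM = SD / sqrt(n)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)      # sample standard deviation
    sem = sd / math.sqrt(n)
    return mean, sem, (mean - multiplier * sem, mean + multiplier * sem)

# Hypothetical measurements
mean, sem, (low, high) = sem_and_ci([1, 2, 3, 4, 5])
print(f"mean={mean}, SEM={sem:.3f}, approx 95% CI=({low:.2f}, {high:.2f})")
```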
                                          Range   SD    SEM
Interval/Ratio data                       Yes     Yes   Yes
Ordinal data                              Yes     No    No
Descriptive of sample variability         Yes     Yes   No
Assists in Statistical Inference          No      Yes   Yes
Used to calculate Confidence Intervals    No      No    Yes
Hypothesis Testing
•  Null hypothesis (H0): No difference between
groups (X=Y)
•  Used to determine whether any observed
difference between groups is due to chance
alone
•  If H0 is rejected = statistical difference
between groups
•  If H0 is accepted (more precisely, not rejected)
= no statistical difference shown (any
difference may be due to chance)
Hypothesis Testing
•  Choose correct statistical test based on:
–  Type of data (nominal, ordinal, continuous)
–  Study design (parallel, crossover)
–  Presence of confounding variables
•  Depending on the value of the test statistic,
H0 is accepted or rejected
Decision Errors
             H0 True                H0 False
Accept H0    No Error (A)           Type II Error (B)
             True Positive (TP)     False Negative (FN)
Reject H0    Type I Error (C)       No Error (D)
             False Positive (FP)    True Negative (TN)
•  Type I Error: Concluding H0 is false
when really true (Wrongly concluding
statistical difference between groups)
•  Type II Error: Concluding H0 is true
when it is really false (Wrongly
concluding group equal)
Decision Errors
•  P-value: probability of observing a
difference at least as large as the one found
if H0 is true (chance alone)
•  Probability of making Type I error is α
•  When α level is set a priori, H0 is rejected
when p < α
•  Probability of Type II error is β
•  Holding other variables constant α and
β are inversely related
Sensitivity & Specificity
•  Sensitivity: positivity of test in disease; is the
test sensitive enough to detect disease presence
= 100 x A/(A+B) or 100 x TP/(TP+FN)
•  Specificity: negativity of test in absence of
disease; is the test specific enough to confirm
absence of disease
= 100 x D/(C+D) or 100 x TN/(TN+FP)
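The two definitions can be written as short functions of the four cell counts. This is a sketch under the standard definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP)); the counts in the usage lines are hypothetical.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Percent of diseased subjects the test correctly flags as positive."""
    return 100 * tp / (tp + fn)

def specificity(tn: int, fp: int) -> float:
    """Percent of disease-free subjects the test correctly flags as negative."""
    return 100 * tn / (tn + fp)

# Hypothetical counts: 100 diseased and 100 disease-free subjects
print(sensitivity(tp=90, fn=10))   # 90.0
print(specificity(tn=80, fp=20))   # 80.0
```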
Power
•  Is the probability of correctly rejecting a false
H0 (ability to detect a true difference) = 1 - β
•  Complement of β
•  By convention α = 0.05 and β = 0.20, giving
Power = 80%
•  As α increases, β decreases
•  Increased by
–  Increasing α
–  Increasing n
–  Larger magnitude of difference being studied (Δ)
–  One-tailed rather than two-tailed test
•  Decreased by
–  Poor study design
–  Incorrect statistical test
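To see how α, n, Δ, and the number of tails move power, here is a rough sketch based on a normal (z) approximation for comparing two group means with a common, known SD. The function `approx_power` and all numeric inputs are illustrative assumptions; a real sample-size calculation would usually rely on a t-based method or dedicated software.

```python
from statistics import NormalDist

def approx_power(delta, sd, n_per_group, alpha=0.05, two_sided=True):
    """Approximate power (1 - beta) of a z-test comparing two means."""
    z = NormalDist()
    se = sd * (2 / n_per_group) ** 0.5          # SE of the difference in means
    tail = alpha / 2 if two_sided else alpha
    z_crit = z.inv_cdf(1 - tail)                # critical value for chosen alpha
    # Ignores the negligible chance of rejecting in the wrong direction.
    return z.cdf(abs(delta) / se - z_crit)

base = approx_power(delta=0.5, sd=1.0, n_per_group=64)
print(f"baseline power:  {base:.2f}")
print(f"larger n:        {approx_power(0.5, 1.0, 128):.2f}")
print(f"larger alpha:    {approx_power(0.5, 1.0, 64, alpha=0.10):.2f}")
print(f"larger delta:    {approx_power(0.8, 1.0, 64):.2f}")
print(f"one-tailed test: {approx_power(0.5, 1.0, 64, two_sided=False):.2f}")
```

With these made-up inputs the baseline lands near the conventional 80% power, and each of the four changes listed above raises it.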
Statistical Significance
•  Size of p-value not related to importance of
results
–  Smaller p-values mean less likely chance explains
difference
•  Statistical significance does not mean
clinical significance
•  Lack of statistical difference does not mean
results are not important
–  Lancet 2000;356:2139-43 Dopamine in ARF
Statistical Test
Type of variable    Statistical test
Nominal             Chi squared
                    Fisher's Exact Test
Ordinal             Wilcoxon Rank Sum
                    Mann-Whitney U
Continuous          Student's t-test
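As one worked instance for nominal data, the sketch below computes a Pearson chi-squared statistic for a 2x2 table without continuity correction, using only the standard library. The function name and the counts are hypothetical. The p-value line relies on the fact that, for 1 degree of freedom, the upper tail of the chi-squared distribution equals erfc(sqrt(chi2/2)).

```python
import math

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))   # valid for 1 degree of freedom only
    return chi2, p_value

# Hypothetical counts: rows = treatment groups, columns = outcome yes/no
chi2, p = chi_squared_2x2(10, 20, 20, 10)
print(f"chi-squared = {chi2:.2f}, p = {p:.4f}")
```

With small expected cell counts, Fisher's exact test from the table above is generally the safer choice.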
Confidence Intervals (CI)
•  P-value tells if there is a difference between
groups, but not its magnitude
•  CI gives an idea of the magnitude of the difference
along with the point estimate
•  All values in CI are statistically possible
•  A 95% CI for a difference that includes zero is
interpreted as p>0.05
•  What is the difference between 90% and 95% CI?
•  Change in MAP for a drug is –12 mmHg
(95% CI –22 to –10)
Means with 95% confidence the drug reduces MAP
by between 10 and 22 mmHg, with a best point
estimate of 12 mmHg
Correlation
•  Used to estimate strength of relationship between 2
variables
•  r is the correlation coefficient, range -1 to +1
•  -1 is perfect negative correlation or inverse
relationship
•  +1 is perfect positive correlation or direct relationship
•  Correlation is a quantitative way to measure the
strength of a relationship
•  Simply recognizes a relation but does not imply
causation (chicken or egg)
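A minimal sketch of the Pearson correlation coefficient, computed from its definition, is given here. The function name `pearson_r` and the paired values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(f"r = {pearson_r(x, y):.3f}")
```

The result is symmetric in the two variables, which is one reason it says nothing about which variable drives the other.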
Regression
•  Regression is PREDICTIVE, correlation is not!
•  Math method to describe relationship with
goal to develop equation for prediction of one
variable from one or more variables
•  Often use linear regression where
•  Y = MX + B
X is independent variable
Y is dependent variable
R² is used for regression
•  Intensive Care Med 2004;30:1537-43
•  BIS XP and RASS R² = 0.36, p=0.011
•  Means 36% of the variability in RASS score
is explained by BIS
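The line Y = MX + B and its R² can be obtained by ordinary least squares. The sketch below assumes a single predictor and uses hypothetical data; `fit_line` is an illustrative name. R² is computed as the share of the variability in Y that the fitted line accounts for.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = m*x + b; returns (m, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = mean_y - m * mean_x
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
    return m, b, 1 - ss_resid / ss_total

# Hypothetical data: x is the independent variable, y the dependent variable
m, b, r2 = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"y = {m:.2f}x + {b:.2f}, R^2 = {r2:.2f}")
```

For a single predictor, R² equals the square of the correlation coefficient r.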
Types of Studies
•  Case Reports
•  Case Studies
•  Case – Control Studies
•  Cohort
–  Retrospective
–  Prospective
•  Randomized Clinical Trials
Case Control Studies
•  Advantages
–  Can examine causal influences on relatively
uncommon conditions
–  Allows for smaller n than cohort
–  Allows investigation of many causes
–  Can be done inexpensively and fast
•  Disadvantages
–  Selection of control can be difficult
–  Confounding and bias are concerns
Cohort Studies
•  Can be prospective or retrospective
•  Advantages
–  Allows study of more than one disease and /or
exposures
–  Less bias than case control
•  Disadvantages
–  Cost more
–  Long time to conduct
–  Larger n
–  Bias can be introduced via outcome information
Randomized Controlled Trials
•  Advantages
–  Best design to determine causality
–  Minimizes bias via randomization
–  May be parallel or crossover design
•  Disadvantages
–  Cost
–  Time
Relative Risks (RR) and Odds Ratios (OR)
•  Estimate the magnitude of association
between exposure and disease
•  Interpreted based on their difference
from unity (1.0)
•  Given as a point estimate with a range
(confidence interval)
                 Disease
Exposure         Yes    No
Yes              A      B
No               C      D
•  RR can not be directly calculated for
most case control studies
•  Use RR for Cohort Studies
RR = [A/(A+B)] / [C/(C+D)]
•  Use OR for Case-Control Studies
OR = (A/C)/(B/D) = (A x D)/(B x C)
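Both ratios follow directly from the four cells of the exposure-by-disease table (A = exposed with disease, B = exposed without, C = unexposed with disease, D = unexposed without). The sketch below uses hypothetical counts.

```python
def relative_risk(a, b, c, d):
    """RR = risk in exposed / risk in unexposed = [A/(A+B)] / [C/(C+D)]."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """OR = (A/C) / (B/D), equivalently (A*D) / (B*C)."""
    return (a * d) / (b * c)

# Hypothetical cohort: 100 exposed and 100 unexposed subjects
a, b, c, d = 20, 80, 10, 90
print(f"RR = {relative_risk(a, b, c, d):.2f}")   # 2.00
print(f"OR = {odds_ratio(a, b, c, d):.2f}")      # 2.25
```

When the disease is rare in both groups, the two values come out close to each other.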
N Engl J Med 2004;351:1089-96.
ABX Use                                            Person-Years   Deaths   RR (95% CI)
Current use of erythromycin                        5305           10       2.01 (1.08-3.75)
Current use of amoxicillin                         6846           8        1.18 (0.59-2.36)
Former use of erythromycin                         111779         100      0.89 (0.72-1.09)
None                                               1126013        1358     1.0
Current use of erythromycin and CYP3A inhibitor    194            3        5.35 (1.72-16.64)
Which is the Best Agent?
•  All drugs have been shown to decrease
incidence of DVT in ICU patients
•  Drug A has an absolute reduction of 6.1%
•  Drug B has a relative risk reduction of 20%
•  Drug C reduced DVT from 30.8% (placebo) to
24.7%
•  Drug D shows you need to treat 16 patients
to prevent one DVT
Application
•  Absolute Reduction = % placebo - %
study drug e.g. 30.8%-24.7% = 6.1%
•  Relative Reduction
= (placebo – study)/Placebo
e.g.,
(30.8%-24.7%)/30.8% = 19.8%, about 20%
•  Number Needed to Treat (NNT)
= 1/(placebo-study), with rates as proportions
e.g., 1/(0.308-0.247) = 16.4, about 16
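The three formulas can be applied to the event rates in the example (30.8% on placebo, 24.7% on study drug). This is a short sketch; the function name `risk_summary` is an illustrative choice, and rates are passed as proportions so that the NNT comes out in patients.

```python
def risk_summary(placebo_rate, drug_rate):
    """Return (absolute reduction, relative reduction, NNT) from event rates given as proportions."""
    arr = placebo_rate - drug_rate
    rrr = arr / placebo_rate
    nnt = 1 / arr
    return arr, rrr, nnt

arr, rrr, nnt = risk_summary(0.308, 0.247)
print(f"Absolute reduction: {arr:.1%}")
print(f"Relative reduction: {rrr:.1%}")
print(f"NNT: {nnt:.1f}")
```

With these rates, a 6.1% absolute reduction, a roughly 20% relative reduction, and an NNT of about 16 all describe the same result.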
Survival Analysis
•  Studies time from entry into study until an event
such as death
–  Kaplan-Meier Curve
–  Cox Regression Model, AKA proportional
hazards regression analysis
•  Hazard Ratio (HR)
–  Use log rank test to compare survival curves
–  Interpreted similar to OR or RR
–  From Clin Infect Dis 2004;39:797-802
HR, 0.27; 95% CI, 0.09-0.78; p=0.011
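To make the Kaplan-Meier curve concrete, here is a minimal sketch of the product-limit estimate. The function name and the follow-up data are hypothetical; an event flag of 1 means death was observed at that time, and 0 means the subject was censored (still alive at last contact).

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each time with at least one death."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving   # deaths and censored subjects both leave the risk set
    return curve

# Hypothetical follow-up times (months) and event flags
for t, s in kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0]):
    print(f"t={t}: S(t)={s:.2f}")
```

Censored subjects contribute to the number at risk up to their last follow-up time but never count as deaths, which is what separates this estimate from a simple proportion surviving.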
Meta Analysis
•  Sum greater than parts
•  Advantages
–  Combines results of many studies
–  Greater statistical power
•  Disadvantages
–  Studies may not have same inclusion, hypothesis,
outcomes
–  Publication bias
•  May not find studies that have not been published
“Likeness to truth is not the
same thing as truth”
Socrates
Resources
•  Ann Emerg Med 1990;86-9.
•  Ann Emerg Med 1990;309-15.
•  Ann Emerg Med 1990;591-7.
•  Ann Emerg Med 1990;820-5.
•  Ann Emerg Med 1990;1054-9.
•  Ann Emerg Med 1990;1462-8.
•  Graphpad.com
•  Bmj.com/collections/statbk.index.shtml