
Selecting Right Statistics
Hyungjin Myra Kim, Sc.D.
The University of Michigan
Choosing an Analytic Method (1)
• First, analytic plan should be considered while
planning the study.
• What do you plan to study (or measure)?
Primary outcome measure determines the type
of dependent variable
– Continuous (ex: hours of sleep)
– Dichotomous (ex: binge drinking or not)
– Ordinal (ex: depression diagnosis)
– Categorical (ex: choice of treatment)
– Time to event (ex: time to relapse)
Choosing an Analytic Method (2)
• Sometimes, there is no dependent variable
– Factor analysis
– Cluster analysis
– Higher-way contingency table analyses
– Agreement (kappa)
– Correlation analysis (correlation coefficient)
– Accuracy (sensitivity, specificity)
(We will not discuss the above today.)
Choosing an Analytic Method (3)
Study design
• Do you have a primary comparison?
• Determines the nature of the primary predictor
variable (independent variable)
(ex) 2-group or 3-group comparison?
(ex) Evaluating the relationship between happiness
and the ratio of leisure to work hours
• How often do you plan to measure?
• Cross-sectional, longitudinal, cross-over
• Determines the number of dependent variables
(ex) pre/post has two measurements per person
The choice of analysis will also depend on
• Univariate vs. bivariate analysis
• Bivariate vs. multivariate analysis
– Potential confounder?
– Adjust for covariates?
• Data skewed or sample size small?
– Transformation
– Parametric vs. non-parametric
Study Designs
Dependent Variable (Outcome): Continuous or Binary (yes/no)
• Pre/Post
– Continuous: Effect of nightly exercise on hrs of sleep before/after in insomniacs
– Binary (yes/no): Patient satisfaction before vs. after color change in hospital ward
• Matched pairs
– Continuous: Mastectomy vs. Lumpectomy on QOL in patients matched by age & family history
– Binary (yes/no): Mastectomy vs. Lumpectomy on survival in patients matched by age & family history
• 1-group
– Continuous: Cholesterol in diabetic patients: Is it higher than general public?
– Binary (yes/no): Depression in substance abusers
• 2-group
– Continuous: Writing skill between teaching methods A vs. B
– Binary (yes/no): Comparison of drugs A vs. B on relapse to heavy drinking
• 2-group, pre/post
– Continuous: Weight before/after in exercise vs. no exercise group
– Binary (yes/no): Satisfaction before & after between 2 skin products
• 3-group
– Continuous: Comparing effectiveness of three drugs on cholesterol
– Binary (yes/no): Pain reduction in three different pain relief medications
• Continuous Predictor
– Continuous: Does pack-year of smoking predict cognitive deficit?
– Binary (yes/no): Is average nightly sleep predictive of hair loss?
What Type of Analysis?
• Descriptive
– Numerical – tables of means, counts,
proportion
– Graphical - histograms, box plots, scatter
plots, etc.
• Inferential
– Estimation – Point estimates/Confidence
Intervals
– Hypothesis Tests
Analytic Methods
Dependent Variable (Outcome) Type
Study Design          | Continuous                                    | Binary (yes/no)
                      | (multiple regression for                      | (logistic regression for
                      | multivariate analysis)                        | multivariate analysis)
--------------------------------------------------------------------------------------------------------------
Pre/Post              | Paired t-test                                 | McNemar’s test
Matched pairs         | Paired t-test                                 | McNemar’s test
1-group               | One-group t-test                              | One proportion test
2-group*              | Two-group t-test                              | Two proportion test or Chi-square Test
2-group, pre/post*    | Analysis of Covariance or multiple regression | Repeated measures logistic regression
3-group*              | Analysis of Variance                          | Chi-square test
Continuous Predictor  | Simple regression                             | Logistic regression
* Bivariate relationships
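The table above works as a lookup from study design and outcome type to a bivariate method. A minimal Python sketch of that lookup follows. The names `ANALYTIC_METHODS` and `suggest_method` are hypothetical, introduced only for illustration; the entries simply restate the table.

```python
# Lookup restating the slide's table. Names are illustrative, not from the slides.
ANALYTIC_METHODS = {
    ("pre/post", "continuous"): "Paired t-test",
    ("pre/post", "binary"): "McNemar's test",
    ("matched pairs", "continuous"): "Paired t-test",
    ("matched pairs", "binary"): "McNemar's test",
    ("1-group", "continuous"): "One-group t-test",
    ("1-group", "binary"): "One proportion test",
    ("2-group", "continuous"): "Two-group t-test",
    ("2-group", "binary"): "Two proportion test or chi-square test",
    ("2-group, pre/post", "continuous"): "Analysis of covariance or multiple regression",
    ("2-group, pre/post", "binary"): "Repeated measures logistic regression",
    ("3-group", "continuous"): "Analysis of variance",
    ("3-group", "binary"): "Chi-square test",
    ("continuous predictor", "continuous"): "Simple regression",
    ("continuous predictor", "binary"): "Logistic regression",
}


def suggest_method(design: str, outcome: str) -> str:
    """Return the table entry for a study design and outcome type."""
    key = (design.strip().lower(), outcome.strip().lower())
    if key not in ANALYTIC_METHODS:
        raise KeyError(f"No table entry for design={design!r}, outcome={outcome!r}")
    return ANALYTIC_METHODS[key]


print(suggest_method("Pre/Post", "Binary"))
print(suggest_method("3-group", "Continuous"))
```

The lookup covers only the bivariate cases in the table; adjusting for covariates points to multiple or logistic regression, as the column headers note.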
Binary Dependent Variable
Descriptive Statistics: Proportion
To estimate a proportion or prevalence, subjects must
be a representative sample from the population.
Assuming the subjects are representative and
independent, the rate is estimated as:
p = n/N
where n is the number of subjects with the attribute
and N is the total number of subjects tested (or
studied).
Binary Dependent Variable (2)
When Only One Group is of Interest:
• Test
Proportion compared to a null value
→ one proportion test
Ex) Are substance abusers more likely to be
depressed than general public?
• Confidence Interval (95% CI: proportion ± 1.96*SE)
Ex) Prevalence of depression in substance abusers
Ex) Sensitivity and specificity of a new short
depression instrument compared with the
physician’s gold standard depression diagnosis
• More on interpretation of 95% CI tomorrow.
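As a rough illustration of the one-group case, the sketch below computes p = n/N, the interval proportion ± 1.96*SE, and a one proportion z-test. The slides do not give the SE formula; the sketch assumes the usual large-sample form SE = sqrt(p(1-p)/N). The counts and the 10% null value are hypothetical, and the function names are made up for this example.

```python
import math


def proportion_ci(n_with_attribute: int, n_total: int, z: float = 1.96):
    """Return (p, lower, upper) for p +/- z*SE, with SE = sqrt(p(1-p)/N) (assumed form)."""
    p = n_with_attribute / n_total
    se = math.sqrt(p * (1 - p) / n_total)
    return p, p - z * se, p + z * se


def one_proportion_z_test(n_with_attribute: int, n_total: int, p_null: float):
    """Return (z, two-sided p-value) for H0: true proportion equals p_null."""
    p = n_with_attribute / n_total
    se_null = math.sqrt(p_null * (1 - p_null) / n_total)  # SE computed under H0
    z = (p - p_null) / se_null
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail area
    return z, p_value


# Hypothetical data: 30 of 200 substance abusers are depressed,
# compared with an assumed general-population prevalence of 10%.
p, lower, upper = proportion_ci(30, 200)
z, p_value = one_proportion_z_test(30, 200, p_null=0.10)
print(f"p = {p:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
print(f"z = {z:.2f}, two-sided p-value = {p_value:.4f}")
```

Both calculations rely on the normal approximation, so they suit reasonably large samples; the p-value shown is two-sided.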
Binary Dependent Variable (3)
When Comparing Two Independent Groups:
Ex) Comparing drugs A vs. B on relapse to heavy
drinking
• Essentially a 2 by 2 table
• Comparative Test
- Chi-square test
- Two proportion test
• Comparative Statistics (summary effect size)
- Absolute Difference in Proportions
- Odds Ratios (OR)
- Relative Risks (RR)
For both OR and RR, 1 means no difference
• Can calculate 95% CI for any of the above (OR, etc.)
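The three comparative statistics can be computed directly from the four cells of the 2 by 2 table. The sketch below is one way to do it in Python, with a 95% CI for the OR taken on the log-odds scale (a common large-sample choice that the slides do not specify). The counts are hypothetical and `two_by_two_effects` is an illustrative name.

```python
import math


def two_by_two_effects(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Effect sizes for a 2x2 table.

    a, b = events and non-events in group 1
    c, d = events and non-events in group 2
    """
    p1 = a / (a + b)
    p2 = c / (c + d)
    risk_difference = p1 - p2
    relative_risk = p1 / p2
    odds_ratio = (a * d) / (b * c)
    # Wald interval on the log-odds scale (assumed method, not from the slides).
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    or_ci = (
        math.exp(math.log(odds_ratio) - z * se_log_or),
        math.exp(math.log(odds_ratio) + z * se_log_or),
    )
    return risk_difference, relative_risk, odds_ratio, or_ci


# Hypothetical counts: relapse in 30 of 100 patients on drug A and 45 of 100 on drug B.
rd, rr, or_, or_ci = two_by_two_effects(30, 70, 45, 55)
print(f"Difference in proportions = {rd:.2f}")
print(f"RR = {rr:.2f}, OR = {or_:.2f}, 95% CI for OR = ({or_ci[0]:.2f}, {or_ci[1]:.2f})")
```

An OR interval that excludes 1 is consistent with a difference between the groups at the 5% level.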
Binary Dependent Variable (4)
When Comparing Two Independent Groups:
If sample size is small
rule of thumb = expected cell count < 5
• Comparative Test: Fisher’s Exact Test
      |    A     B |  Total
--------------------------------------------
yes   |    3     6 |      9
no    |    9     2 |     11
--------------------------------------------
Total |   12     8 |     20
Pearson chi-square test p-value = 0.028
Fisher's exact test p-value = 0.065
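The two p-values for this table can be checked with a short standard-library sketch. It assumes the Pearson test is run without a continuity correction and that Fisher's test is two-sided, summing the probabilities of all tables no more likely than the observed one. The function names are illustrative.

```python
import math


def min_expected_count(a, b, c, d):
    """Smallest expected cell count (row total * column total / N)."""
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    return min(r * col / n for r in rows for col in cols)


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and its p-value."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for a 2x2 table with fixed margins."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):  # hypergeometric probability that the top-left cell equals k
        return math.comb(col1, k) * math.comb(n - col1, row1 - k) / math.comb(n, row1)

    p_observed = prob(a)
    k_min, k_max = max(0, row1 + col1 - n), min(row1, col1)
    return sum(
        prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_observed * (1 + 1e-9)
    )


# Cells from the slide: yes row = (A: 3, B: 6), no row = (A: 9, B: 2).
table = (3, 6, 9, 2)
smallest = min_expected_count(*table)
chi2, p_chi2 = pearson_chi_square_2x2(*table)
p_fisher = fisher_exact_2x2(*table)
print(f"Smallest expected count = {smallest:.1f}")
print(f"Pearson chi-square p-value = {p_chi2:.3f}")
print(f"Fisher's exact test p-value = {p_fisher:.3f}")
```

The smallest expected count is below 5, so the rule of thumb favors Fisher's exact test, and the two tests fall on opposite sides of 0.05 here.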
Continuous Dependent Variable
Descriptive Statistics
• Mean and Standard Deviation if data are symmetric
• Median and Inter-quartile range if data are skewed
Mean can be affected by one very large or one very small
value, and therefore is sensitive to outlying values
Median is robust to an outlying value because it is simply
the value at the center when data are ranked in order.
• If mean and median are very different, data are skewed.
• Always graphically explore the distribution (e.g., using
histogram, box plot) and choose the appropriate
descriptive statistics
• More on mean vs. median tomorrow.
Continuous Dependent Variable (2)
When Only One Group is of Interest:
• Test (One mean compared to a null value)
One sample t-test
Ex) Is cholesterol higher in diabetic patients compared
with the general public?
• Confidence Interval (95% CI = mean ± 1.96*SE)
Ex) Sample mean cholesterol = 124
Sample SD = 10, N = 200
95% CI for mean cholesterol = 124 ±1.96*10/sqrt(200)
= (122.6, 125.4)
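The interval arithmetic above can be reproduced in a few lines of Python. The function name `mean_ci` is illustrative; the inputs are the slide's values.

```python
import math


def mean_ci(mean: float, sd: float, n: int, z: float = 1.96):
    """Return (lower, upper) for mean +/- z * SD / sqrt(N)."""
    se = sd / math.sqrt(n)  # standard error of the mean
    return mean - z * se, mean + z * se


lower, upper = mean_ci(mean=124, sd=10, n=200)
print(f"95% CI for mean cholesterol = ({lower:.1f}, {upper:.1f})")
```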
Continuous Dependent Variable (3)
When Only One Group is of Interest:
When sample size is small (N<25) or cannot assume that the
dependent variable is interval and normally distributed
Use a Non-parametric Test
• One Sample Median Test
- Sign test
- Signed-rank test
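As an illustration of the simpler of the two tests, the sketch below implements an exact two-sided sign test: under the null hypothesis each observation falls above or below the null median with probability 1/2. The sleep data and the null median of 7 hours are hypothetical, and `sign_test` is an illustrative name.

```python
import math


def sign_test(values, null_median):
    """Exact two-sided sign test of H0: median equals null_median."""
    above = sum(v > null_median for v in values)
    below = sum(v < null_median for v in values)
    n = above + below  # observations equal to the null median are dropped
    k = min(above, below)
    # Under H0 the number of "above" signs is Binomial(n, 0.5).
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return above, below, min(1.0, 2 * tail)


# Hypothetical hours of sleep for 10 subjects, tested against a null median of 7 hours.
hours = [5.5, 6.0, 6.5, 6.8, 7.2, 5.9, 6.1, 6.4, 7.5, 6.6]
above, below, p_value = sign_test(hours, null_median=7.0)
print(f"{above} above, {below} below, two-sided p-value = {p_value:.3f}")
```

The sign test uses only the direction of each difference; the signed-rank test also uses the ranks of their magnitudes and is usually more powerful when the distribution is symmetric.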
Continuous Dependent Variable (4)
When Comparing Two Independent Groups:
• Test
Two independent group t-test
Ex) Writing skill comparison between teaching methods A vs. B
• Comparative Statistics: difference in means
Ex) Difference in mean writing skill scores between
those who were taught with method A vs. method B
• Confidence Interval for Difference in Means
95% CI = difference ± 1.96*SE (of difference)
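A minimal sketch of the difference in means and its interval follows. It assumes the SE of the difference is computed without pooling the variances, which the slides do not specify, and it keeps the slide's 1.96 multiplier. The scores are hypothetical and the function name is illustrative.

```python
import math
import statistics


def difference_in_means_ci(group_a, group_b, z: float = 1.96):
    """Return (difference, SE of difference, (lower, upper))."""
    diff = statistics.mean(group_a) - statistics.mean(group_b)
    # Unpooled SE: sqrt(var_a / n_a + var_b / n_b), using sample variances.
    se = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return diff, se, (diff - z * se, diff + z * se)


# Hypothetical writing-skill scores under teaching methods A and B.
method_a = [72, 85, 78, 90, 66, 81, 77, 88]
method_b = [70, 65, 74, 80, 62, 71, 69, 75]
diff, se, ci = difference_in_means_ci(method_a, method_b)
print(f"Difference in means = {diff:.3f}, SE = {se:.3f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```

With groups this small, a t critical value would give a somewhat wider interval than 1.96; the multiplier is kept here only to match the slide's formula.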
Continuous Dependent Variable (5)
When Comparing Two Independent Groups:
If sample size is small (N < 25) or cannot assume that the
dependent variable is interval and normally distributed
Use a Non-parametric Test (Test of Median)
• Wilcoxon rank-sum test (tests equality of medians, assuming the two distributions have the same shape)
Graphical Methods to Compare Groups: Box Plots
[Figure: box plots of Resting Heart Rate for the No Exercise, Mild Exercise, and Strenuous Exercise groups]
Using Subjects as Their Own Controls:
Cross-Over Designs
Same subject undergoes 2 or more treatments
• Advantage
– Maximizes power: fewest subjects needed
• Limitations of reusing the same subject
– May not be possible
– Carryover effect of treatment: need washout
– Length of experiment
– Order effect: order should be randomized and balanced
– Period effect
Cross-Over Designs (2)
Examples
• Pre-post study (poor design, why?)
Ex) Weight before an exercise program and weight
after a month of exercise program
• Traditional X-over Study
Ex) Alternating exposure to guided imagery procedure
between stressful situation and a natural relaxing
situation on different days in random order and
assessing the effect on craving
– Stressful Image → washout period → Relaxing Image
– Relaxing Image → washout period → Stressful Image
Ex) Drug A then cross over to B
Cross-Over Designs (3)
Analytic Method
• Pre-post study
– Analyze change-score or gain-score and treat it as
a one sample problem
Ex) change in weight within a person before and
after the exercise program
• Traditional X-over Study
– Analysis must first assess carryover effect, order
effect and period effect.
– If any effect is present, the analysis must account for it.
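The change-score approach for a pre-post study can be sketched as follows: compute each person's change, then treat the changes as a single sample. The weights are hypothetical and `change_score_summary` is an illustrative name. The resulting statistic is the paired t statistic, to be compared against a t distribution with N-1 degrees of freedom (not computed here, since the standard library has no t distribution).

```python
import math
import statistics


def change_score_summary(before, after):
    """Return (changes, mean change, one-sample t statistic for H0: mean change = 0)."""
    changes = [post - pre for pre, post in zip(before, after)]
    mean_change = statistics.mean(changes)
    se = statistics.stdev(changes) / math.sqrt(len(changes))
    return changes, mean_change, mean_change / se


# Hypothetical weights (kg) for 6 people before and after a month of exercise.
before = [82.0, 91.5, 77.3, 88.0, 95.2, 70.4]
after = [80.5, 89.0, 77.8, 85.5, 93.0, 69.9]
changes, mean_change, t_stat = change_score_summary(before, after)
print(f"Mean change = {mean_change:.2f} kg, t = {t_stat:.2f} on {len(changes) - 1} df")
```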
Multiple Comparison: Doing Many Tests
• α-level (significance level) – the probability
of claiming that there is a difference when
there is no true difference
– Small α is good.
– We usually set α-level at 0.05.
– This means we allow 5% for making the
kind of error where we declare a
significant difference (reject the null
hypothesis) when the result happened by
chance (Type 1 error).
Multiple Comparison (2)
• When making 2 or more comparisons, α (5%) should be reduced to
adjust for the number of comparisons.
• Suppose we are performing two independent
statistical tests, then:
– P(rejecting the 1st null hypothesis when it is true) is 0.05
– P(rejecting the 2nd null hypothesis when it is true) is 0.05
• What is the probability of rejecting at least one?
– P(not rejecting the 1st when it is true) is 0.95
– P(not rejecting the 2nd when it is true) is 0.95
– Therefore, P(not rejecting either)
= 0.95 x 0.95 = 0.9025
– That is, P(rejecting at least one) = 1 - 0.9025 = 0.0975
Multiple Comparison (3)
Number of            Probability of rejecting at least one
independent tests    null hypothesis, when all are true
1                    0.05
2                    0.0975
3                    0.143
5                    0.226
10                   0.401
If you perform enough significance tests, you are almost sure to
find significant results by chance alone even when
no true difference exists.
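The probabilities in the table follow from 1 - (1 - α)^k for k independent tests, each run at level α. A short sketch, with an illustrative function name:

```python
def familywise_error_rate(k: int, alpha: float = 0.05) -> float:
    """Probability of at least one false rejection among k independent tests."""
    return 1 - (1 - alpha) ** k


for k in (1, 2, 3, 5, 10):
    print(k, round(familywise_error_rate(k), 4))
```

Rounded to three decimals, the results match the table; for 20 independent tests the same formula gives about 0.64.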
Multiple Comparisons: What to do? (4)
For independent tests, one easy way of adjusting the
level of significance is to use:
0.05/k
where k is the number of tests to be performed.
Therefore, instead of 0.05,
– When there are 5 tests, use 0.01
– When there are 10 tests, use 0.005
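This adjustment is commonly known as the Bonferroni correction. The sketch below computes the adjusted level and, assuming independent tests, checks that the chance of at least one false rejection stays just under 5%. The function names are illustrative.

```python
def bonferroni_alpha(k: int, alpha: float = 0.05) -> float:
    """Per-test significance level when k tests are performed."""
    return alpha / k


def familywise_error_rate(k: int, per_test_alpha: float) -> float:
    """Probability of at least one false rejection among k independent tests."""
    return 1 - (1 - per_test_alpha) ** k


for k in (5, 10):
    adjusted = bonferroni_alpha(k)
    overall = familywise_error_rate(k, adjusted)
    print(f"{k} tests: use {adjusted:.3f} per test, overall error rate = {overall:.4f}")
```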
Multiple Comparison (5)
• When testing a pre-specified relationship, use a
significance level of 5%.
• When screening for interesting relationships,
use significance level of 1% so as not to identify
too many false relationships.
Confounding
Example 1: Sex bias in graduate admissions?
(UC, Berkeley, 1973)
Overall:
44% of males admitted
35% of females admitted
Admissions are made by department.
Confounding (2)
                    Male                      Female
Major      Number of    Percent      Number of    Percent
           Applicants   Admitted     Applicants   Admitted
A             825         62%           108         82%
B             560         63%            25         68%
C             325         37%           593         34%
D             417         33%           375         35%
E             191         28%           393         24%
F             373          6%           341          7%
Total        2691         45%          1835         30%
Weighted Average:         39%                       43%
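The slide does not state which weights produce the weighted average. The sketch below assumes each major is weighted by its total number of applicants (men plus women), which reproduces the 39% and 43% figures. The crude totals instead weight each major by that sex's own applicants. Function and variable names are illustrative.

```python
# (major, male applicants, male % admitted, female applicants, female % admitted)
ADMISSIONS = [
    ("A", 825, 62, 108, 82),
    ("B", 560, 63, 25, 68),
    ("C", 325, 37, 593, 34),
    ("D", 417, 33, 375, 35),
    ("E", 191, 28, 393, 24),
    ("F", 373, 6, 341, 7),
]


def crude_rate(applicant_idx: int, rate_idx: int) -> float:
    """Overall % admitted: each major weighted by that sex's own applicant count."""
    total = sum(row[applicant_idx] for row in ADMISSIONS)
    return sum(row[applicant_idx] * row[rate_idx] for row in ADMISSIONS) / total


def weighted_average_rate(rate_idx: int) -> float:
    """% admitted with each major weighted by all its applicants (assumed weights)."""
    total = sum(row[1] + row[3] for row in ADMISSIONS)
    return sum((row[1] + row[3]) * row[rate_idx] for row in ADMISSIONS) / total


male_crude, female_crude = crude_rate(1, 2), crude_rate(3, 4)
male_weighted, female_weighted = weighted_average_rate(2), weighted_average_rate(4)
print(f"Crude:    male {male_crude:.0f}%, female {female_crude:.0f}%")
print(f"Weighted: male {male_weighted:.0f}%, female {female_weighted:.0f}%")
```

Using the same weights for both sexes removes the effect of women applying more often to majors with low admission rates (C through F), and the direction of the comparison reverses.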
Confounding (3)
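Example 2: Is psychiatric hospitalization rate different
in substance users versus non-users?
            Hospitalization
            Yes    No    Rate
User         20   373    5.1%
Non-user      6   316    1.9%
Substance use looks to be associated with higher
psychiatric hospitalization rate.
Separated by Bipolar Status
            No Bipolar             Bipolar I/II
            Yes    No    Rate      Yes    No    Rate
User          3   176    1.7%       17   197    7.9%
Non-User      4   293    1.4%        2    23    8.0%
Within each bipolar stratum, users and non-users have similar
hospitalization rates; users are far more likely to have bipolar
disorder (214 of 393 vs. 25 of 322), which produces the crude difference.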
Confounding (4)
Example 3: Smoking versus MI
            Smoker   Non-Smoker
MI            51        54
No MI         43        67
% with MI     54%       44.6%
OR = 1.47
                   Male                    Female
            Smoker   Non-Smoker     Smoker   Non-Smoker
MI            37        25            14        29
No MI         24        20            19        47
% with MI     61%       56%           42%       38%
            OR = 1.23               OR = 1.19
Smokers have a higher MI rate, but the magnitude of the
relative likelihood of MI (measured as odds ratio (OR)) is
larger in the combined data.
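The three odds ratios can be recomputed from the cell counts as a check. In the sketch below the tuple layout and the function name `odds_ratio` are illustrative choices; the counts are those on the slide.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Odds of the outcome in the exposed divided by the odds in the unexposed."""
    return (exposed_cases * unexposed_noncases) / (exposed_noncases * unexposed_cases)


# Counts from the slide: (smoker MI, smoker no MI, non-smoker MI, non-smoker no MI)
male = (37, 24, 25, 20)
female = (14, 19, 29, 47)
combined = tuple(m + f for m, f in zip(male, female))  # collapses over sex

print(f"Combined OR = {odds_ratio(*combined):.2f}")
print(f"Male OR     = {odds_ratio(*male):.2f}")
print(f"Female OR   = {odds_ratio(*female):.2f}")
```

Both stratum-specific ORs are closer to 1 than the combined OR, which is the pattern expected when sex is related to both smoking and MI.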
Confounding (5)
Example 4:
1) Regression of Happiness on Smoker Group
             Coef      SE     p-value
Intercept   65.05    1.48     0.000
Smoke        4.80    2.03     0.020
2) Regression of Happiness on Age
             Coef      SE     p-value
Intercept    7.48    2.45     0.003
Age          1.85    0.07     0.000
3) Regression of Happiness on Age and Smoke
             Coef      SE     p-value
Intercept    2.65    2.07     0.203
Age          2.08    0.07     0.000
Smoke       -5.25    0.70     0.000
Confounding (6)
[Figure, left panel: Happiness (axis 20 to 100) by smoking status (Not, Smoke). Without considering age, smokers appear to have higher mean happiness.]
[Figure, right panel: Relationship between Happiness and Age by Smoking Status. Happiness (axis 20 to 100) against Age (axis 20 to 45), plotted separately for Smoke == Not and Smoke == Smoke.]
• Increasing age is associated with greater happiness.
• Smokers tend to be older, making it look like smoking is associated
with greater happiness when not adjusting for age.
• But smokers tend to be less happy than non-smokers of the same age.
Developing a Statistical Analysis Plan
• Comparing two groups
– Continuous: t-test
– Proportion: chi-square test
• Comparing multiple groups (continuous): ANOVA
– Adjusted for other factors: ANCOVA, or regression
• Dichotomous outcome: Logistic regression
• Count outcome: Poisson regression
• Survival time outcome: Cox regression
• Watch for correlated data (repeated measures, clusters –
e.g., teeth in the mouth)
To Keep in Mind
• Typically, multiple appropriate methods are available
to analyze the same data that could yield legitimate
answers.
• Try to use at least two different available methods to
confirm your results.
• Always look at the raw data and display data
graphically, so learn to choose the right graphical
displays (ex: cross tabs, scatter plots, box plots)
• It helps to make sample tables summarizing results
before you start the analysis.