Objective: be able to understand statistical concepts in a medical paper.

Understanding:
p-value and hypothesis testing
mean and standard deviation (SD)
median and interquartile range [IQR]
standard error and confidence interval

P-value and hypothesis testing

"Will you marry me, Mary?"
"Yes, John, but only if you can prove to me that love is our destiny."

John (1st try): Sure I can. Mary, many wonderful things have happened since I met you, so love must be our destiny.

John (2nd try): Mary, it is hard to believe that our love happened by chance alone. If it had happened by chance alone, the probability of us coming from opposite sides of the universe to the Vanderbilt CQS Summer Institute and falling in love in a stat class would be 0.000000001. It COULD NOT have happened by chance alone! Surely love must be our destiny!

What is scientific evidence?

When you want to prove that a new drug works (or that love is their destiny), which approach do you want to take?

A. Give evidence to support the alternative hypothesis (Ha): the drug works (love is their destiny).
B. Give evidence against the null hypothesis (Ho): the drug does not work (love is NOT their destiny).

Which approach do you think is more convincing (or makes it easier to collect evidence)?

Disproving hypotheses in evidence-based medicine

In general, it is much easier to find evidence against a hypothesis than to prove that it is correct. In fact, one view of science is that it is a process of disproving hypotheses. Statistical methods formalize this idea by looking for evidence against a null hypothesis (Ho): that there is no difference between groups or no association between variables (or that love is NOT their destiny). Data are then collected and assessed for their consistency with the null hypothesis (Kirkwood and Sterne, page 72).

The p-value is used as evidence against the null hypothesis. It is defined as "the probability of observing the observed difference, or a greater difference, when the null hypothesis is true".
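The definition of the p-value can be made concrete with a small exact permutation test. This is only an illustrative sketch, not a method taken from the slides: the function name `permutation_p_value` and the two data sets are hypothetical. Under the null hypothesis the group labels are interchangeable, so the p-value is the fraction of all possible relabelings whose difference in means is at least as large as the one observed.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        # |mean of the "A" group - mean of the remaining values|
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    n_splits = 0
    n_extreme = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        n_splits += 1
        split_sum = sum(pooled[i] for i in idx)
        # Small tolerance guards against floating-point rounding.
        if abs_mean_diff(split_sum) >= observed - 1e-12:
            n_extreme += 1
    return n_extreme / n_splits


# Hypothetical outcomes, invented for illustration only.
drug = [6, 7, 8, 9, 10]
placebo = [1, 2, 3, 4, 5]
p = permutation_p_value(drug, placebo)
print(f"p = {p:.4f}")
```

With these made-up values only 2 of the 252 possible relabelings are as extreme as the observed split, so the p-value is 2/252, or about 0.008.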
(So the smaller the p-value, the stronger the evidence against the null hypothesis.)

Null hypothesis (Ho): The drug does not work.
Alternative hypothesis (Ha): It in fact does work!

But be careful: p > 0.05 does not mean that the two drugs are the same.

[Figure: two comparisons of A. Low dose (60-80 ng/ml) vs. B. High dose (2.5 mg/kg/day)]

Meaning of the p-value (first comparison): If the truth is that there is NO difference between the two treatments, the probability of observing this or a larger difference is 6 out of 1000. It is hard to believe that this happened just by chance, so there must be a difference.

Meaning of the p-value (second comparison): If the truth is that there is NO difference between the two treatments, the probability of observing this or a larger difference is 6 out of 100. There is not enough evidence that there is a difference.

Question: How can I make a p-value smaller?
Enroll as many patients as you can.

What affects the p-value when comparing a new drug vs. placebo?

The effect of the new drug. Example: a larger reduction in weight (10 lbs) with the new drug gives a smaller p-value.
Variation in the data: larger variation can result in a larger p-value.
Sources of variation:
Between-subject variation
Measurement error
And what else?
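How sample size, effect size and variation each move the p-value can be sketched with a simple normal (z) approximation for comparing two means. This is an assumption-laden illustration rather than the analysis behind the slides: it assumes equal SD and equal group sizes, and the function name `approx_two_sample_p` and all input numbers are hypothetical.

```python
import math


def approx_two_sample_p(mean_diff, sd, n_per_group):
    """Two-sided p-value from a normal approximation.

    Assumes two groups of equal size with a common standard deviation.
    """
    # Standard error of the difference between two group means.
    se = sd * math.sqrt(2.0 / n_per_group)
    z = abs(mean_diff) / se
    # erfc(z / sqrt(2)) equals 2 * (1 - Phi(z)), the two-sided tail area.
    return math.erfc(z / math.sqrt(2.0))


# Hypothetical scenarios: (difference in lbs, SD in lbs, patients per group).
scenarios = {
    "baseline": (5, 15, 30),
    "more patients": (5, 15, 120),
    "larger effect": (10, 15, 30),
    "more variation": (5, 30, 30),
}
for label, (diff, sd, n) in scenarios.items():
    print(f"{label:15s} p = {approx_two_sample_p(diff, sd, n):.4f}")
```

Relative to the baseline, enrolling more patients or seeing a larger effect lowers the p-value, while more variation raises it. A real analysis would typically use a t-test rather than this z approximation.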
Clinical difference vs. statistical difference

[Figure: probability of survival, Steroid + Mizoribine vs. steroid alone. Hazard ratio (95% CI): 0.63 (0.57 - 1.11)]

[Figure: 43% vs. 36%]

[Figure: Steroid + Mizoribine vs. steroid alone, 56% vs. 35%]

Suggestion: A clinically meaningful reduction in flares was observed, though it did not reach statistical significance because of the small sample size.

What can we do in order not to miss a clinically important difference?

Sample size computation (Steroid + Mizoribine vs. steroid alone, 56% vs. 35%): To detect the observed difference with a 2-sided 5% significance level and 80% power, at least 87 patients per arm are required.

Descriptive statistics

Measure of central tendency: mean (median)
Measure of variation: standard deviation (IQR)

Example 1: Ely, Shintani, Truman et al., JAMA 2003;289:2983-91

[Table: mean (SD) values from Ely, Shintani, Truman et al., JAMA 2003;289:2983-91]

[Figure: Group A vs. Group B (N=30 each), mean outcome shown four ways: error bars as SD, error bars as SE, error bars as 95% CI, and box-whisker plots with and without data points (min, Q1: 25%, Q2: 50%, Q3: 75%, max, whiskers). Panels are labelled P=0.01 or P=0.1, each asking "Significant?"]

Standard deviation (SD)

SD is, roughly, the average distance from each data point to the sample mean.
Example: Mean = 25.6, SD = 8.1

SD: how to use it

When data are normally distributed:
About 68% of patients' APACHE II scores lie within 1 SD of the mean: 25.6 ± 8.1 = (17.5, 33.7).
About 95% of patients' APACHE II scores lie within 2 SD of the mean: 25.6 ± 2 x 8.1 = (9.4, 41.8).

Let's try it! (Ely, Shintani, Truman et al., JAMA 2003;289:2983-91)

4.8 ± 2 x 12.8 = (-20.8, 30.4)

Did 95% of patients receive a lorazepam dose between -20.8 mg and 30.4 mg?
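The mean ± 2 x SD arithmetic above can be checked in a few lines. This is a minimal sketch using the means and SDs quoted in the slides. The helper name `two_sd_range` is hypothetical, and the rule that a negative lower bound signals a problem is an assumption that holds only for quantities, such as a dose, that cannot be negative.

```python
def two_sd_range(mean, sd):
    """Return (mean - 2*SD, mean + 2*SD).

    Covers roughly 95% of the data only if the data are
    approximately normally distributed.
    """
    return (mean - 2 * sd, mean + 2 * sd)


apache = two_sd_range(25.6, 8.1)     # APACHE II score
lorazepam = two_sd_range(4.8, 12.8)  # lorazepam dose in mg

for label, (low, high) in [("APACHE II", apache), ("Lorazepam", lorazepam)]:
    note = ""
    if low < 0:
        # A score or dose cannot be negative, so the normal
        # assumption is doubtful and the data are probably skewed.
        note = "  <- impossible lower bound: data likely skewed"
    print(f"{label}: ({low:.1f}, {high:.1f}){note}")
```

When the range includes impossible values, as it does for the lorazepam dose, the mean and SD describe the data poorly and the median with IQR is usually the better summary.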
Mean = 4.8, SD = 12.8
Median [IQR] = 1 [0, 4.25]

25th percentile: 0 mg
50th percentile (median): 1 mg
75th percentile: 4.25 mg

Patient's characteristics

25th percentile: 2 mg
50th percentile (median): 10 mg
75th percentile: 41 mg

SD is used to describe data (typically in Table 1).
SE is used for statistical inference: mean ± 2 x SE gives an approximate 95% confidence interval.

Relationship between p-value and 95% confidence interval (CI)

95% CI including the null value: P > 0.05, no difference detected
95% CI excluding the null value: P <= 0.05, a difference detected

Method                                      What is compared                                       Null value
Student t-test, ANOVA, linear regression    Absolute difference in values                          0
Logistic regression                         Relative difference in proportions, through a ratio    1
Cox regression                              Relative difference in risks, through a ratio          1

95% CI of the difference between two groups

The p-value is a function of both sample size and treatment effect.

Case   P-value    Mean difference   Sample size
A      P > 0.05   0 days            Large
B      P > 0.05   3 days            Small
C      P < 0.05   3 days            Large
D      P < 0.05   3 days            Small (but bigger than B)

Null value: e.g., difference = 0 days

95% CI = (Mean - 2 x SD / √(sample size), Mean + 2 x SD / √(sample size))

The effects of creatinine and diastolic BP are both not significant (P > 0.05). However, the 95% CI suggests that creatinine may still be associated with calcification: the CI is wide and OR = 3.96 is meaningfully large, so the p-value may fall below 0.05 if more patients are enrolled.
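The link between sample size, CI width and the null value can be sketched by applying the approximate formula above, Mean ± 2 x SD / √(sample size), to the same 3-day difference at two sample sizes. This is an illustration only: the SD of 10 days and both sample sizes are assumed values, the function names are hypothetical, and a real two-group comparison would use the standard error of the difference rather than this simplified one-sample form.

```python
import math


def approx_ci95(mean, sd, n):
    """Approximate 95% CI using the slide's formula: mean +/- 2*SD/sqrt(n)."""
    half_width = 2 * sd / math.sqrt(n)
    return (mean - half_width, mean + half_width)


def includes_null(ci, null_value=0.0):
    """True if the null value lies inside the interval (no difference detected)."""
    return ci[0] <= null_value <= ci[1]


# Assumed inputs: 3-day mean difference, SD of 10 days.
small = approx_ci95(3, 10, 16)    # small study
large = approx_ci95(3, 10, 100)   # large study

for label, ci in [("n = 16", small), ("n = 100", large)]:
    verdict = "includes 0 (P > 0.05)" if includes_null(ci) else "excludes 0 (P <= 0.05)"
    print(f"{label}: 95% CI = ({ci[0]:.1f}, {ci[1]:.1f}) days, {verdict}")
```

Under these assumptions the same 3-day difference is not significant in the small study but is in the large one, which mirrors cases B and C in the table above.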