PH1690: Foundations of Biostatistics
L8(2): Two Sample Hypothesis Testing

Objective
• Understand how to conduct appropriate t-tests for two independent samples

Two Independent Samples
• Numerator of the t statistic: difference of the sample means
• Denominator: depends on whether the variances are considered equal.
– Uses the pooled variance estimate if the assumption of equal variances cannot be rejected.
– Uses an approximate method if the variances are not assumed equal.

To be pooled, or not to be pooled? Equal S.D. or not equal?
• We could base our decision on an informal rule (it usually works and is much less complicated).
– If neither sample standard deviation is more than twice the other (i.e., $0.5 < s_1/s_2 < 2$), then the assumption of equal standard deviations should be OK.
• We could perform a graphical analysis and look at the box plots for the samples to informally assess the equal standard deviations assumption.
• There are formal tests to assess the evidence against equal population standard deviations:
– Variance Ratio Test
– Levene’s Test

Variance Ratio Test
• $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_a: \sigma_1^2 \neq \sigma_2^2$
• Test statistic: $F = s_1^2 / s_2^2 \sim F_{n_1-1,\, n_2-1}$ (df1 = $n_1 - 1$, df2 = $n_2 - 1$)
• p-value:

A note on the effect of unequal variances on Student’s t-test with equal variances
• When the sample sizes are approximately equal (what we call a balanced design), unequal variances have little effect on p-values and CIs.

T Test with Equal Variances
• $H_0: \mu_1 = \mu_2$
• $H_1: \mu_1 < \mu_2$ or $H_1: \mu_1 > \mu_2$ or $H_1: \mu_1 \neq \mu_2$
• Test statistic: t
• Numerator: difference in sample means
• Denominator: square root of the pooled variance estimate, times a quantity involving the sample sizes of both groups
• df: $n_1 + n_2 - 2$
• Critical value: depends on $H_1$ (one-sided or two-sided)

T Test with Equal Variances
• t test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}, \qquad s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
• Rejection region:
– One-sided: $t > t_{n_1+n_2-2,\,1-\alpha}$ (for $H_1: \mu_1 > \mu_2$) or $t < -t_{n_1+n_2-2,\,1-\alpha}$ (for $H_1: \mu_1 < \mu_2$)
– Two-sided: $t > t_{n_1+n_2-2,\,1-\alpha/2}$ or $t < -t_{n_1+n_2-2,\,1-\alpha/2}$

Vitamin C Example
• To test whether Vitamin C reduces the occurrence of colds compared to a placebo.
• 20 subjects are randomly assigned to the Vitamin C or placebo group.

Vitamin C Data
• The Excel data sheet:

Subject   Vitamin C   Placebo
1         4           7
2         0           8
3         3           4
4         4           6
5         4           6
6         3           4
7         4           6
8         3           4
9         2           6
10        6           6

Group       Mean   SD
Vitamin C   3.3    1.57
Placebo     5.7    1.34

• Stata examples are provided. We will walk through the steps in class.

Stata Checking Equal Variance
Notice that the standard deviations pass the $0.5 < s_1/s_2 < 2$ rule. Formally, using the variance ratio test, ratio = 1.37, two-sided p-value = 0.6446.
Decision: Fail to reject the null hypothesis that the variances are equal. Conclude that we can use the t test with equal variances.

Stata T Test Output

. ttest VitaminC=Placebo, unpaired

Two-sample t test with equal variances
------------------------------------------------------------------------------
Variable |     Obs        Mean   Std. Err.   Std. Dev.    [95% Conf. Interval]
---------+--------------------------------------------------------------------
VitaminC |      10         3.3    .4955356    1.567021    2.179021    4.420979
 Placebo |      10         5.7    .4229526    1.337494    4.743215    6.656785
---------+--------------------------------------------------------------------
combined |      20         4.5    .4198997    1.877849     3.62114     5.37886
---------+--------------------------------------------------------------------
    diff |                -2.4     .651494               -3.768738   -1.031262
------------------------------------------------------------------------------
    diff = mean(VitaminC) - mean(Placebo)                         t =  -3.6838
Ho: diff = 0                                     degrees of freedom =       18

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0008         Pr(|T| > |t|) = 0.0017          Pr(T > t) = 0.9992

Another Scenario
• What will happen if we reject the equal variance hypothesis? In other words, the intermediate null hypothesis $\sigma_1^2 = \sigma_2^2$ has been rejected.

T Test with Unequal Variances
• Hypotheses: same as the t test with equal variances
• Test statistic
• Numerator: same as the t test with equal variances
• Denominator: different
• df: different

Satterthwaite’s Method
• t test statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
• The approximate degrees of freedom, $d'$:
$$d' = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

T Test for Unequal Variances
• Rejection region:
– One-sided: $t > t_{d',\,1-\alpha}$ (for $H_1: \mu_1 > \mu_2$) or $t < -t_{d',\,1-\alpha}$ (for $H_1: \mu_1 < \mu_2$)
– Two-sided: $t > t_{d',\,1-\alpha/2}$ or $t < -t_{d',\,1-\alpha/2}$

Another Example Related to Vitamin C
• The data set was collected to answer the research question: is Vitamin C intake different between smokers and non-smokers?
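• 20 smokers and 20 non-smokers were included in this study.

Equal Variance Rejection

. sdtest Smoker=NonSmoker

Variance ratio test
------------------------------------------------------------------------------
Variable |     Obs        Mean   Std. Err.   Std. Dev.    [95% Conf. Interval]
---------+--------------------------------------------------------------------
  Smoker |      20        54.4    5.803991    25.95624    42.25211    66.54789
NonSmo~r |      20       115.9    10.44129    46.69487    94.04613    137.7539
---------+--------------------------------------------------------------------
combined |      40       85.15    7.681609    48.58276    69.61248    100.6875
------------------------------------------------------------------------------
    ratio = sd(Smoker) / sd(NonSmoker)                            f =   0.3090
Ho: ratio = 1                                    degrees of freedom =   19, 19

    Ha: ratio < 1               Ha: ratio != 1                 Ha: ratio > 1
  Pr(F < f) = 0.0070         2*Pr(F < f) = 0.0139           Pr(F > f) = 0.9930

T Test with Unequal Variance

. ttest Smoker=NonSmoker, unpaired unequal

Two-sample t test with unequal variances
------------------------------------------------------------------------------
Variable |     Obs        Mean   Std. Err.   Std. Dev.    [95% Conf. Interval]
---------+--------------------------------------------------------------------
  Smoker |      20        54.4    5.803991    25.95624    42.25211    66.54789
NonSmo~r |      20       115.9    10.44129    46.69487    94.04613    137.7539
---------+--------------------------------------------------------------------
combined |      40       85.15    7.681609    48.58276    69.61248    100.6875
---------+--------------------------------------------------------------------
    diff |               -61.5      11.946               -85.90668   -37.09332
------------------------------------------------------------------------------
    diff = mean(Smoker) - mean(NonSmoker)                         t =  -5.1482
Ho: diff = 0                     Satterthwaite's degrees of freedom =  29.7183

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

CI for Paired Data
• Apply the confidence interval methods illustrated for the one-sample t test in Chapter 6 to $d$ to find a confidence interval for $\Delta$, as shown in the following equation:
$$\bar{d} - t_{n-1,\,1-\alpha/2}\, \frac{s_d}{\sqrt{n}} \;\le\; \Delta \;\le\; \bar{d} + t_{n-1,\,1-\alpha/2}\, \frac{s_d}{\sqrt{n}}$$

CI for Independent Data
• If equal variances,
$$\bar{x}_1 - \bar{x}_2 - t_{n_1+n_2-2,\,1-\alpha/2}\; s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{x}_1 - \bar{x}_2 + t_{n_1+n_2-2,\,1-\alpha/2}\; s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
• If unequal variances,
$$\bar{x}_1 - \bar{x}_2 - t_{d',\,1-\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \;\le\; \mu_1 - \mu_2 \;\le\; \bar{x}_1 - \bar{x}_2 + t_{d',\,1-\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Summary
• You have learned:
– How to conduct a variance ratio test
– How to conduct a two-sample independent t-test
• Assuming equal variances
• Assuming unequal variances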
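
Worked check of the test statistics
As a rough cross-check of the formulas in this lecture, the following is a minimal Python sketch that recomputes the variance ratio, the pooled t statistic, and Satterthwaite's t statistic and degrees of freedom by hand. It is not part of the course's Stata workflow. The function names (variance_ratio, pooled_t, welch_t) are illustrative assumptions, the Vitamin C data are as transcribed from the data slide, and the smoker summary statistics are copied from the Stata output. P-values are left out because they need F and t distribution functions that are not in the Python standard library.

```python
import math
from statistics import mean, stdev

# Vitamin C example data, as transcribed from the data slide
vitamin_c = [4, 0, 3, 4, 4, 3, 4, 3, 2, 6]
placebo = [7, 8, 4, 6, 6, 4, 6, 4, 6, 6]


def variance_ratio(s1, s2):
    """F statistic s1^2 / s2^2 computed from two sample standard deviations."""
    return (s1 / s2) ** 2


def pooled_t(m1, s1, n1, m2, s2, n2):
    """Equal-variance t statistic and its degrees of freedom, n1 + n2 - 2."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


def welch_t(m1, s1, n1, m2, s2, n2):
    """Unequal-variance t statistic with Satterthwaite's approximate df."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    t = (m1 - m2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


# Vitamin C vs. placebo: summary statistics computed from the raw data
m1, s1 = mean(vitamin_c), stdev(vitamin_c)
m2, s2 = mean(placebo), stdev(placebo)
f_vitc = variance_ratio(s1, s2)
t_pooled, df_pooled = pooled_t(m1, s1, len(vitamin_c), m2, s2, len(placebo))

# Smokers vs. non-smokers: summary statistics copied from the Stata output
f_smoke = variance_ratio(25.95624, 46.69487)
t_welch, df_welch = welch_t(54.4, 25.95624, 20, 115.9, 46.69487, 20)

print(f"Vitamin C: F = {f_vitc:.2f}, t = {t_pooled:.4f}, df = {df_pooled}")
print(f"Smokers:   F = {f_smoke:.4f}, t = {t_welch:.4f}, df = {df_welch:.4f}")
```

The printed values can be compared against the ratio, t, and degrees-of-freedom entries in the Stata output for each example.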