Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
History of statistics wikipedia , lookup
Psychometrics wikipedia , lookup
Bootstrapping (statistics) wikipedia , lookup
Confidence interval wikipedia , lookup
Taylor's law wikipedia , lookup
Statistical inference wikipedia , lookup
German tank problem wikipedia , lookup
Misuse of statistics wikipedia , lookup
1 Comparison of Two Populations In this section we will study how to use inferential statistics for comparing two populations. Again we will consider being interested in a mean of a population and in the proportion of a population. 1.1 Inference for the difference between two population means µ1 − µ2 Consider a study for comparing the effect of two drugs on the blood pressure. If µ1 and µ2 are the mean changes in the blood pressure for drug one and two, respectively, then here one might want to compare the two means based on sample data from the two populations including people being treated with one or the other drug. This would call for a method using the information in the two samples and estimating and testing properties for the two means. The sample data will give us: sample size mean s.d. Sample 1 n1 x̄1 s1 Sample 2 n2 x̄2 s2 When comparing two population means we need to distinguish the following two situations describing the type of samples we will base the comparison on. 1. independent samples – example: height of males and females, where the samples of males and females were independently obtained. The two samples under investigation don’t have any interaction, they are independent. 2. paired samples – example: blood pressure measured in the morning and evening in the same group of people The two samples are connected. For every measurement in one sample you have a corresponding measurement in the second sample (like in the example, the measurements might be even taken for the same individual). 1.1.1 t-Procedure for Two Paired Samples In order to control extraneous factors in some studies you can use paired samples. In this case for every individual in the sample from population 1 you find a matching individual from population 2. And the decision is made based on the resulting sample data. In this case we always get n = n1 = n2 . Example: • Compare the resting pulse and pulse after exercise. To control for all other influences, you take both measurements for every individual resulting in two samples (before and after exercise). 1 We are interested in the difference in the population means µd = µ1 − µ2 . For statistical inference the differences of the paired observations sample 1 value − sample 2 value are used. Which then will create one sample of size n of measurements of pairwise differences. x̄d and sd denote the mean and the standard deviation,respectively, for those differences. For the distribution of x̄d 1. µx̄d = µ1 − µ2 , x̄d is an unbiased estimator for µ1 − µ2 √ 2. σx̄d = σ/ n, where σ is the population standard deviations of the pairwise differences. 3. If n is large than x̄d is normally distributed. So we get for the t-score t= x̄d − (µ1 − µ2 ) √ sd / n is t-distributed with df = n − 1, if n is large or the pairwise differences come from a normal distribution. These facts lead to the following Paired t-Test for Comparing Two Population Means 1. Hypotheses: Test type Upper tail H0 : µd ≤ d0 ⇔ µ1 − µ2 ≤ d0 versus Ha : µd > d0 ⇔ µ1 − µ2 > d0 Lower tail H0 : µd ≥ d0 ⇔ µ1 − µ2 ≥ d0 versus Ha : µd < d0 ⇔ µ1 − µ2 < d0 Two tail H0 : µd = d0 ⇔ µ1 − µ2 = d0 versus Ha : µd 6= d0 ⇔ µ1 − µ2 6= d0 Assumption: Random sample of differences, and n is large or the population distribution of differences is approximately normal. Test statistic: t0 = x̄d − d0 sd √ n with n − 1 df. P-value: Test type Upper tail Lower tail Two tail P-value P (t > t0 ) P (t < t0 ) 2 · P (t > abs(t0 )) 2 Rejection Region t0 > t α t0 < −tα abs(t0 ) > tα/2 2. Decision: If the P-value≤ α reject H0 If the P-value> α do not reject H0 3. Context: Beside testing for difference or trend of the means it is also beneficial to have a confidence interval: Paired t-Confidence Interval for µ1 − µ2 Assumption: n is large or the population distribution of differences is approximately normal. The (1 − α) Confidence Interval for µd : sd x̄d ± tn−1 (1−α/2) √ n and tn−1 (1−α/2) is the critical value of the t-distribution with n − 1 degrees of freedom (Table 4). Example: The effect of exercise on the amount of lactic acid in the blood was examined. Blood lactate levels were measured in eight males before and after playing three games of racquetball. Player Before 1 13 2 20 3 17 4 13 5 13 6 16 7 15 8 16 After 18 37 40 35 30 20 33 19 Difference -5 -17 -23 -22 -17 -4 -18 -3 This data results in x̄d = −13.63, sd = 8.28, n = 8 Lets test if the decrease in mean lactate levels is significant at a significance level of 0.05. That is 1. H0 : µb − µa ≥ 0 vs. Ha : µb − µa < 0 . where µb (µa ) is the mean lactate level before (after) three games of racquetball. α = 0.05. 2. Assumption: The sample is a random sample and it is appropriate to assume that the difference in lactate level is normal distributed. 3. Test statistic: with d0 = 0 t= x̄d − d0 sd √ n = −13.63 with 7 df. 3 8.28 √ 8 = −4.65597 4. P-value: Since we perform a lower tail test the P-value=P (t > abs(t0 )). Use table VI from the text book. Focus on row with df = 7 find the 2 numbers that enclose abs(t0 ) = 4.656. Here:3.499< 4.6 < 4.785. Then the P-value falls between the two upper tail probabilities corresponding to these values. Here: 0.001 <P-value< 0.005. 5. Decision: Since P-value< 0.005 < α = 0.05, we reject H0 and accept Ha . 6. Result: The lactate level after three games of racquet ball is significantly lower than before at a significance level of 0.05. Lets give an estimate (95% Confidence interval) for the reduction in the mean lactate level through three games of racquetball in males. sd 8.28 √ = −13.63 ± 6.938 x̄d ± tn−1 (1−α/2) √ = −13.63 ± 2.365 n 8 7 or (−20.568; −6.696). tn−1 (1−α/2) = t0.975 = 2.365. Based on the sample data, we can be 95 % confident that the mean decrease in lactate level is between 6.692 and 20.568 after three racquetball games. 1.1.2 Independent Random Samples In this section we will see how to perform statistical tests for the difference of the means µ1 − µ2 and how to calculate a confidence interval for the difference based on two independent samples. The point estimate for µ1 − µ2 that comes first to mind is the difference of the sample means x¯1 − x¯2 . In order to do inferential statistics using this difference we have to investigate the distribution of this statistic. Sample Distribution of x̄1 − x̄2 from two independent samples. • For the mean: µx̄1 −x̄2 = µx̄1 − µx̄2 = µ1 − µ2 , so that x̄1 − x̄2 is an unbiased estimate for µ1 − µ2 . • For the variance: σx̄21 −x̄2 = σx̄21 + • For the standard deviation: σx̄22 s σx̄1 −x̄2 = σ12 σ22 = + n1 n2 σ12 σ22 + n1 n2 • If n1 and n2 are both large or both populations are normal distributed, then is the sampling distribution of x̄1 − x̄2 (approximately) normal. The t-statistic t= x̄1 − x̄2 − (µ1 − µ2 ) r s21 n1 4 + s22 n2 Mathematical results tell us that t is t-distributed with df = min(n1 − 1, n2 − 1) min(n1 − 1, n2 − 1) is the smaller of the two numbers n1 − 1 and n2 − 1. This is all the information we need to put together a Two–sample t-Test for Comparing Two Population Means 1. Hypotheses Test type Upper tail H0 : µ1 − µ2 ≤ d0 versus Ha : µ1 − µ2 > d0 Lower tail H0 : µ1 − µ2 ≥ d0 versus Ha : µ1 − µ2 < d0 Two tail H0 : µ1 − µ2 = d0 versus Ha : µ1 − µ2 6= d0 2. Assumption: random samples, n1 and n2 are large or both populations are approximately normal distributed. 3. Test statistic: t0 = x̄1 − x̄2 − d0 r s21 n1 + s22 n2 with df = min(n1 − 1, n2 − 1). 4. P-value: Test type P-value Upper tail P (t > t0 ) Lower tail P (t < t0 ) Two tail 2 · P (t > abs(t0 )) Rejection region t0 > t α t0 < −tα abs(t0 ) > tα/2 5. Decision if P-value≤ α, reject H0 if P-value> α, do not reject H0 . 6. Context Example : A company wants to show, that a vitamin supplement decreases the recover time from a common cold. They selected randomly 70 adults with a cold. 35 of those were randomly selected to receive the vitamin supplement. The data on the recover time for both samples is shown below. population 1 no vitamin sample size 35 sample mean 6.9 sample standard deviation 2.9 2 vitamin 35 5.8 1.2 Now test the claim of the company: H0 : µ1 − µ2 ≤ 0 versus Ha : µ1 − µ2 > 0 at a significance level of α=0.05. 5 Assumption: The recovery time is approximately normally distributed. Test statistic: with d0 = 0 t0 = x̄1 − x̄2 − d0 r s21 n1 + s22 n2 6.9 − 5.8 1.1 =q 2 = = 2.07 2.9 1.22 0.53 + 35 35 and df = 35 − 1 Rejection Region: Since this is an upper tail test the rejection region is t0 > tα for a t-distribution with 35 df and α = 0.05. From Table VI we get tα = 1.697 (for df = 30). Decision: Since the value of the test statistic falls into the rejection region, we reject H0 and accept Ha , that vitamins decrease the mean recovery time for common colds at a significance level of 0.05. P-value approach: We found t0 = 2.07 for 30 df (largest number in the table that falls below 35), we find that t0 falls between t0.025 = 2.042 and t0.02 = 2.0147, so the p-value falls between 0.02 and 0.025. So p − value ≤ 0.025 < 0.05 so the p-value is less than α = 0.05. The test is significant at significance level 0.05, we can reject H0 . Beside testing for difference or trend of the means it is also beneficial to have a confidence interval: Two–sample t-Confidence Interval for Comparing Two Population Means Assumption: n1 and n2 are large or both populations are approximately normal distributed. The (1 − α) Confidence Interval for µ1 − µ2 : s (x̄1 − x̄2 ) ± tdf (α/2) s21 s2 + 2 n1 n2 with df = min(n1 − 1, n2 − 1) and tdf (α/2) is the critical value of the t-distribution with the given number of degrees of freedom (Table D). Continue Example: Calculate a 95% Confidence Interval for the difference in recovery time µ1 − µ2 . The degrees of freedom are 45: s s22 s21 x̄1 − x̄2 ± tdf + (α/2) n1 n2 6 6.9 − 5.8 ± 2.042 · 0.53 1.1 ± 1.082 or (0.018 ; 2.182). The 95% confidence interval lies entirely above 0. So that 0 is with a confidence of 95% less than µ1 − µ2 . We can state with confidence 0.95 that the recovery time without vitamin treatment takes longer than with vitamin treatment. 1.2 Estimating the Difference between Two Proportions Instead of comparing two population means lets now compare two population proportions. Assume you want to compare • the rate of people who play computer games in the age groups of 20 to 30 and 30 to 40 • The proportion of defective items manufactured in two production lines The statistic for estimating the difference in two population proportions that comes to mind is the difference in the sample proportion (p̂1 − p̂2 ). Let study the sampling distribution of this statistic to construct a confidence interval. Properties of the Sampling Distribution of the Difference between two Sample Proportions (p̂1 − p̂2 ) Consider that you have two independent samples of sizes n1 and n2 from binomial populations with parameters p1 and p2 , respectively. The sampling distribution of (p̂1 − p̂2 ) has these properties: 1. The mean of (p̂1 − p̂2 ) is µp̂1 −p̂2 = p1 − p2 and the standard error is s SE = which is estimated by s ˆ = SE p1 (1 − p1 ) p2 (1 − p2 ) + n1 n2 p̂1 (1 − p̂1 ) p̂2 (1 − p̂2 ) + n1 n2 2. The sampling distribution of (p̂1 − p̂2 ) is approximately normal distributed, when the sample sizes n1 and n2 are large, that is when n1 p1 > 5 and n1 (1 − p1 ) > 5 and n2 p2 > 5 and n2 (1 − p2 ) > 5 These results now lead to the description of the estimation of (p1 − p2 ). Large Sample Point Estimation of (p1 − p2 ) Point estimate: (p̂1 − p̂2 ) q Margin of error: ±1.96SE = ±1.96 p1 (1−p1 ) n1 7 + p2 (1−p2 ) n2 Large Sample (1 − α)100% Confidence Interval for (p1 − p2 ) s (p̂1 − p̂2 ) ± zα/2 p1 (1 − p1 ) p2 (1 − p2 ) + n, 1 n2 For this we have to assume again that n1 and n2 are large, that is n1 p1 5, n1 (1 − p1 ), n2 p2 , n2 (1 − p2 ) are greater than 5. In order to apply the tools described above, find that p1 and p2 , the population proportions, are unknown. In order to use the above procedures, we have to replace the population proportions by their estimates pˆ1 and p̂2 . So that you will estimate the margin of error by s ˆ = ±1.96 p̂1 (1 − p̂1 ) + p̂2 (1 − p̂2 ) ±1.96SE n1 n2 and use the following Approximate Large Sample (1 − α)100% Confidence Interval for (p1 − p2 ) s (p̂1 − p̂2 ) ± zα/2 p̂1 (1 − p̂1 ) p̂2 (1 − p̂2 ) + n1 n2 For this we have to assume again that n1 and n2 are large, that is n1 p1 , n1 (1 − p1 ), n2 p2 , n2 (1 − p2 ) are greater than 5. Example: Suppose we want to compare therapies. The criteria for the comparison is the probability to survive at least 5 years after therapy. The study produced the following data: n x p̂ = x/n Population 1 Population 2 100 80 90 70 0.9 0.875 That is 90 out of 100 patients, who underwent therapy 1 survived at least 5 years. If we use p̂1 as estimate for p1 and p̂2 as estimate for p2 , we find that n1 p1 , n1 (1 − p1 ), n2 p2 , n2 (1 − p2 ) are all greater than 5. So we can use the formula from above for calculating a 95% confidence interval for p1 − p2 . s (p̂1 − p̂2 ) ± zα/2 p̂1 (1 − p̂1 ) p̂2 (1 − p̂2 ) + = (0.025) ± 1.96 · n1 n2 s 0.9(0.1) 0.875(0.125) + = 0.025 ± 0.093 100 80 or [-0.068 ; 0.118]. Since 0 is captured in this interval, we find, that this data does not provide evidence, that the two therapies result in different probabilities to survive 5 years. They can be different, but this data does not show it. 8 1.3 Statistical Test for Two Population Proportions p1 and p2 Notation: population 1 population 2 population proportion p1 p2 sample size proportion n1 p̂1 n2 p̂2 Large-Sample z Test for comparing p1 and p2 • Hypotheses Test type Upper tail H0 : p1 − p2 ≤ 0 versus Ha : p1 − p2 > 0 Lower tail H0 : p1 − p2 ≥ 0 versus Ha : p1 − p2 < 0 Two tail H0 : p1 − p2 = 0 versus Ha : p1 − p2 6= 0 Assumption: Both sample sizes are large: Random samples, n1 p̂1 > 5, n1 (1 − p̂1 ) > 5, n2 p̂2 > 5, n2 (1 − p̂2 ) > 5 Test statistic: (p̂1 − p̂2 ) z = q p̂ c (1−p̂c ) n1 + p̂c (1−p̂c ) n2 P-value and Rejection Region: Test type P-value Rejection Region Upper tail P (z > z0 ) z0 > zα Lower tail P (z < z0 ) z0 < −zα Two tail 2 · P (z < −abs(z0 )) abs(z0 ) > zα/2 • Decision • Context Example: Find if the proportions of red M&M’s in the plain and peanut variety do differ at a significance level of 0.05. The sample Plain(1) Peanut(2) Sample Size 56 32 Number of red M&Ms 12 8 This results in p̂1 = 12/56 = 0.214 and p̂2 = 8/32 = 0.25 and p̂c = (12+8)/(56+32) = 20/88 = 0.227 1. The question asks for a test of H0 : p1 − p2 = 0 vs. Ha : p1 − p2 6= 0. α = 0.05 9 2. Assumption: Since p̂1 n1 , (1 − p̂1 )n1 , p̂2 n2 , (1 − p̂2 )n2 are all greater than 5, the assumptions are met and the test will deliver a reliable result. 3. Test statistic: z0 = q p̂ (p̂1 − p̂2 ) c (1−p̂c ) n1 + p̂c (1−p̂c ) n2 =q (0.214 − 0.25) 0.227(0.773) 56 + 0.227(0.773) 32 =√ −0.036 = −0.3882 0.0031 + 0.0055 4. Rejection region: With α = 0.05 the rejection region for a two tailed test is: abs(z0 ) > zα/2 = 1.96. or using the p-value: 2-tailed p-value=2P (z > abs(z0 )) = 2P (z > 0.3882) = 2(0.5 − 0.1517) = 0.6966 5. Decision: Since abs(−0.3882) = 0.3882 <1.96, we find that zo is not falling into the rejection region and the null hypothesis is not rejected at a significance level of 0.05. 6. At significance level of 5% we conclude that we do not have enough evidence, that the proportion of red M&M’s is different for the plain and peanut variety. 10