Download 1 Comparison of Two Populations

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

History of statistics wikipedia , lookup

Psychometrics wikipedia , lookup

Bootstrapping (statistics) wikipedia , lookup

Confidence interval wikipedia , lookup

Taylor's law wikipedia , lookup

Statistical inference wikipedia , lookup

German tank problem wikipedia , lookup

Misuse of statistics wikipedia , lookup

Resampling (statistics) wikipedia , lookup

Student's t-test wikipedia , lookup

Transcript
1
Comparison of Two Populations
In this section we will study how to use inferential statistics for comparing two populations.
Again we will consider being interested in a mean of a population and in the proportion of a population.
1.1
Inference for the difference between two population means µ1 − µ2
Consider a study for comparing the effect of two drugs on the blood pressure. If µ1 and µ2 are the
mean changes in the blood pressure for drug one and two, respectively, then here one might want
to compare the two means based on sample data from the two populations including people being
treated with one or the other drug. This would call for a method using the information in the two
samples and estimating and testing properties for the two means.
The sample data will give us:
sample size
mean
s.d.
Sample 1
n1
x̄1
s1
Sample 2
n2
x̄2
s2
When comparing two population means we need to distinguish the following two situations describing
the type of samples we will base the comparison on.
1. independent samples – example: height of males and females, where the samples of males and
females were independently obtained.
The two samples under investigation don’t have any interaction, they are independent.
2. paired samples – example: blood pressure measured in the morning and evening in the same
group of people
The two samples are connected. For every measurement in one sample you have a corresponding
measurement in the second sample (like in the example, the measurements might be even taken
for the same individual).
1.1.1
t-Procedure for Two Paired Samples
In order to control extraneous factors in some studies you can use paired samples. In this case for
every individual in the sample from population 1 you find a matching individual from population 2.
And the decision is made based on the resulting sample data. In this case we always get n = n1 = n2 .
Example:
• Compare the resting pulse and pulse after exercise.
To control for all other influences, you take both measurements for every individual resulting
in two samples (before and after exercise).
1
We are interested in the difference in the population means µd = µ1 − µ2 .
For statistical inference the differences of the paired observations
sample 1 value − sample 2 value
are used.
Which then will create one sample of size n of measurements of pairwise differences.
x̄d and sd denote the mean and the standard deviation,respectively, for those differences.
For the distribution of x̄d
1. µx̄d = µ1 − µ2 , x̄d is an unbiased estimator for µ1 − µ2
√
2. σx̄d = σ/ n, where σ is the population standard deviations of the pairwise differences.
3. If n is large than x̄d is normally distributed.
So we get for the t-score
t=
x̄d − (µ1 − µ2 )
√
sd / n
is t-distributed with df = n − 1, if n is large or the pairwise differences come from a normal distribution.
These facts lead to the following
Paired t-Test for Comparing Two Population Means
1. Hypotheses:
Test type
Upper tail H0 : µd ≤ d0 ⇔ µ1 − µ2 ≤ d0 versus Ha : µd > d0 ⇔ µ1 − µ2 > d0
Lower tail H0 : µd ≥ d0 ⇔ µ1 − µ2 ≥ d0 versus Ha : µd < d0 ⇔ µ1 − µ2 < d0
Two tail
H0 : µd = d0 ⇔ µ1 − µ2 = d0 versus Ha : µd 6= d0 ⇔ µ1 − µ2 6= d0
Assumption: Random sample of differences, and n is large or the population distribution of
differences is approximately normal.
Test statistic:
t0 =
x̄d − d0
sd
√
n
with n − 1 df.
P-value:
Test type
Upper tail
Lower tail
Two tail
P-value
P (t > t0 )
P (t < t0 )
2 · P (t > abs(t0 ))
2
Rejection Region
t0 > t α
t0 < −tα
abs(t0 ) > tα/2
2. Decision: If the P-value≤ α reject H0
If the P-value> α do not reject H0
3. Context:
Beside testing for difference or trend of the means it is also beneficial to have a confidence interval:
Paired t-Confidence Interval for µ1 − µ2
Assumption: n is large or the population distribution of differences is approximately normal.
The (1 − α) Confidence Interval for µd :
sd
x̄d ± tn−1
(1−α/2) √
n
and tn−1
(1−α/2) is the critical value of the t-distribution with n − 1 degrees of freedom (Table 4).
Example:
The effect of exercise on the amount of lactic acid in the blood was examined.
Blood lactate levels were measured in eight males before and after playing three games of racquetball.
Player Before
1
13
2
20
3
17
4
13
5
13
6
16
7
15
8
16
After
18
37
40
35
30
20
33
19
Difference
-5
-17
-23
-22
-17
-4
-18
-3
This data results in x̄d = −13.63, sd = 8.28, n = 8
Lets test if the decrease in mean lactate levels is significant at a significance level of 0.05. That is
1.
H0 : µb − µa ≥ 0 vs. Ha : µb − µa < 0
.
where µb (µa ) is the mean lactate level before (after) three games of racquetball.
α = 0.05.
2. Assumption: The sample is a random sample and it is appropriate to assume that the difference in lactate level is normal distributed.
3. Test statistic: with d0 = 0
t=
x̄d − d0
sd
√
n
=
−13.63
with 7 df.
3
8.28
√
8
= −4.65597
4. P-value: Since we perform a lower tail test the P-value=P (t > abs(t0 )).
Use table VI from the text book. Focus on row with df = 7 find the 2 numbers that enclose
abs(t0 ) = 4.656.
Here:3.499< 4.6 < 4.785. Then the P-value falls between the two upper tail probabilities
corresponding to these values.
Here: 0.001 <P-value< 0.005.
5. Decision: Since P-value< 0.005 < α = 0.05, we reject H0 and accept Ha .
6. Result: The lactate level after three games of racquet ball is significantly lower than before
at a significance level of 0.05.
Lets give an estimate (95% Confidence interval) for the reduction in the mean lactate level through
three games of racquetball in males.
sd
8.28
√ = −13.63 ± 6.938
x̄d ± tn−1
(1−α/2) √ = −13.63 ± 2.365
n
8
7
or (−20.568; −6.696). tn−1
(1−α/2) = t0.975 = 2.365.
Based on the sample data, we can be 95 % confident that the mean decrease in lactate level is
between 6.692 and 20.568 after three racquetball games.
1.1.2
Independent Random Samples
In this section we will see how to perform statistical tests for the difference of the means µ1 − µ2 and
how to calculate a confidence interval for the difference based on two independent samples.
The point estimate for µ1 − µ2 that comes first to mind is the difference of the sample means x¯1 − x¯2 .
In order to do inferential statistics using this difference we have to investigate the distribution of this
statistic.
Sample Distribution of x̄1 − x̄2 from two independent samples.
• For the mean: µx̄1 −x̄2 = µx̄1 − µx̄2 = µ1 − µ2 , so that x̄1 − x̄2 is an unbiased estimate for µ1 − µ2 .
• For the variance:
σx̄21 −x̄2
=
σx̄21
+
• For the standard deviation:
σx̄22
s
σx̄1 −x̄2 =
σ12 σ22
=
+
n1 n2
σ12 σ22
+
n1 n2
• If n1 and n2 are both large or both populations are normal distributed, then is the sampling
distribution of x̄1 − x̄2 (approximately) normal.
The t-statistic
t=
x̄1 − x̄2 − (µ1 − µ2 )
r
s21
n1
4
+
s22
n2
Mathematical results tell us that t is t-distributed with
df = min(n1 − 1, n2 − 1)
min(n1 − 1, n2 − 1) is the smaller of the two numbers n1 − 1 and n2 − 1.
This is all the information we need to put together a
Two–sample t-Test for Comparing Two Population Means
1. Hypotheses
Test type
Upper tail H0 : µ1 − µ2 ≤ d0 versus Ha : µ1 − µ2 > d0
Lower tail H0 : µ1 − µ2 ≥ d0 versus Ha : µ1 − µ2 < d0
Two tail
H0 : µ1 − µ2 = d0 versus Ha : µ1 − µ2 6= d0
2. Assumption: random samples, n1 and n2 are large or both populations are approximately
normal distributed.
3. Test statistic:
t0 =
x̄1 − x̄2 − d0
r
s21
n1
+
s22
n2
with
df = min(n1 − 1, n2 − 1).
4. P-value:
Test type P-value
Upper tail P (t > t0 )
Lower tail P (t < t0 )
Two tail
2 · P (t > abs(t0 ))
Rejection region
t0 > t α
t0 < −tα
abs(t0 ) > tα/2
5. Decision
if P-value≤ α, reject H0
if P-value> α, do not reject H0 .
6. Context
Example :
A company wants to show, that a vitamin supplement decreases the recover time from a common
cold. They selected randomly 70 adults with a cold. 35 of those were randomly selected to receive
the vitamin supplement. The data on the recover time for both samples is shown below.
population
1
no vitamin
sample size
35
sample mean
6.9
sample standard deviation 2.9
2
vitamin
35
5.8
1.2
Now test the claim of the company: H0 : µ1 − µ2 ≤ 0 versus Ha : µ1 − µ2 > 0 at a significance level
of α=0.05.
5
Assumption: The recovery time is approximately normally distributed.
Test statistic: with d0 = 0
t0 =
x̄1 − x̄2 − d0
r
s21
n1
+
s22
n2
6.9 − 5.8
1.1
=q 2
=
= 2.07
2.9
1.22
0.53
+
35
35
and
df = 35 − 1
Rejection Region: Since this is an upper tail test the rejection region is t0 > tα for a
t-distribution with 35 df and α = 0.05. From Table VI we get tα = 1.697 (for df = 30).
Decision: Since the value of the test statistic falls into the rejection region, we reject H0 and
accept Ha , that vitamins decrease the mean recovery time for common colds at a significance
level of 0.05.
P-value approach: We found t0 = 2.07 for 30 df (largest number in the table that falls below
35), we find that t0 falls between t0.025 = 2.042 and t0.02 = 2.0147, so the p-value falls between
0.02 and 0.025. So
p − value ≤ 0.025 < 0.05
so the p-value is less than α = 0.05. The test is significant at significance level 0.05, we can
reject H0 .
Beside testing for difference or trend of the means it is also beneficial to have a confidence interval:
Two–sample t-Confidence Interval for Comparing Two Population Means
Assumption: n1 and n2 are large or both populations are approximately normal distributed.
The (1 − α) Confidence Interval for µ1 − µ2 :
s
(x̄1 − x̄2 ) ± tdf
(α/2)
s21
s2
+ 2
n1 n2
with
df = min(n1 − 1, n2 − 1)
and tdf
(α/2) is the critical value of the t-distribution with the given number of degrees of freedom
(Table D).
Continue Example:
Calculate a 95% Confidence Interval for the difference in recovery time µ1 − µ2 .
The degrees of freedom are 45:
s
s22
s21
x̄1 − x̄2 ± tdf
+
(α/2)
n1 n2
6
6.9 − 5.8 ± 2.042 · 0.53
1.1 ± 1.082
or (0.018 ; 2.182). The 95% confidence interval lies entirely above 0. So that 0 is with a confidence
of 95% less than µ1 − µ2 . We can state with confidence 0.95 that the recovery time without vitamin
treatment takes longer than with vitamin treatment.
1.2
Estimating the Difference between Two Proportions
Instead of comparing two population means lets now compare two population proportions.
Assume you want to compare
• the rate of people who play computer games in the age groups of 20 to 30 and 30 to 40
• The proportion of defective items manufactured in two production lines
The statistic for estimating the difference in two population proportions that comes to mind is the
difference in the sample proportion (p̂1 − p̂2 ). Let study the sampling distribution of this statistic to
construct a confidence interval.
Properties of the Sampling Distribution of the Difference between two Sample Proportions (p̂1 − p̂2 )
Consider that you have two independent samples of sizes n1 and n2 from binomial populations with
parameters p1 and p2 , respectively.
The sampling distribution of (p̂1 − p̂2 ) has these properties:
1. The mean of (p̂1 − p̂2 ) is
µp̂1 −p̂2 = p1 − p2
and the standard error is
s
SE =
which is estimated by
s
ˆ =
SE
p1 (1 − p1 ) p2 (1 − p2 )
+
n1
n2
p̂1 (1 − p̂1 ) p̂2 (1 − p̂2 )
+
n1
n2
2. The sampling distribution of (p̂1 − p̂2 ) is approximately normal distributed, when the sample
sizes n1 and n2 are large, that is when
n1 p1 > 5 and n1 (1 − p1 ) > 5 and n2 p2 > 5 and n2 (1 − p2 ) > 5
These results now lead to the description of the estimation of (p1 − p2 ).
Large Sample Point Estimation of (p1 − p2 )
Point estimate: (p̂1 − p̂2 )
q
Margin of error: ±1.96SE = ±1.96
p1 (1−p1 )
n1
7
+
p2 (1−p2 )
n2
Large Sample (1 − α)100% Confidence Interval for (p1 − p2 )
s
(p̂1 − p̂2 ) ± zα/2
p1 (1 − p1 ) p2 (1 − p2 )
+
n, 1
n2
For this we have to assume again that n1 and n2 are large, that is
n1 p1 5, n1 (1 − p1 ), n2 p2 , n2 (1 − p2 ) are greater than 5.
In order to apply the tools described above, find that p1 and p2 , the population proportions, are
unknown. In order to use the above procedures, we have to replace the population proportions by
their estimates pˆ1 and p̂2 .
So that you will estimate the margin of error by
s
ˆ = ±1.96 p̂1 (1 − p̂1 ) + p̂2 (1 − p̂2 )
±1.96SE
n1
n2
and use the following
Approximate Large Sample (1 − α)100% Confidence Interval for (p1 − p2 )
s
(p̂1 − p̂2 ) ± zα/2
p̂1 (1 − p̂1 ) p̂2 (1 − p̂2 )
+
n1
n2
For this we have to assume again that n1 and n2 are large, that is
n1 p1 , n1 (1 − p1 ), n2 p2 , n2 (1 − p2 ) are greater than 5.
Example:
Suppose we want to compare therapies. The criteria for the comparison is the probability to survive
at least 5 years after therapy.
The study produced the following data:
n
x
p̂ = x/n
Population 1 Population 2
100
80
90
70
0.9
0.875
That is 90 out of 100 patients, who underwent therapy 1 survived at least 5 years.
If we use p̂1 as estimate for p1 and p̂2 as estimate for p2 , we find that n1 p1 , n1 (1 − p1 ), n2 p2 , n2 (1 − p2 )
are all greater than 5. So we can use the formula from above for calculating a 95% confidence interval
for p1 − p2 .
s
(p̂1 − p̂2 ) ± zα/2
p̂1 (1 − p̂1 ) p̂2 (1 − p̂2 )
+
= (0.025) ± 1.96 ·
n1
n2
s
0.9(0.1) 0.875(0.125)
+
= 0.025 ± 0.093
100
80
or [-0.068 ; 0.118]. Since 0 is captured in this interval, we find, that this data does not provide
evidence, that the two therapies result in different probabilities to survive 5 years.
They can be different, but this data does not show it.
8
1.3
Statistical Test for Two Population Proportions p1 and p2
Notation:
population 1
population 2
population
proportion
p1
p2
sample
size proportion
n1
p̂1
n2
p̂2
Large-Sample z Test for comparing p1 and p2
• Hypotheses
Test type
Upper tail H0 : p1 − p2 ≤ 0 versus Ha : p1 − p2 > 0
Lower tail H0 : p1 − p2 ≥ 0 versus Ha : p1 − p2 < 0
Two tail
H0 : p1 − p2 = 0 versus Ha : p1 − p2 6= 0
Assumption: Both sample sizes are large: Random samples,
n1 p̂1 > 5, n1 (1 − p̂1 ) > 5, n2 p̂2 > 5, n2 (1 − p̂2 ) > 5
Test statistic:
(p̂1 − p̂2 )
z = q p̂
c (1−p̂c )
n1
+
p̂c (1−p̂c )
n2
P-value and Rejection Region:
Test type P-value
Rejection Region
Upper tail P (z > z0 )
z0 > zα
Lower tail P (z < z0 )
z0 < −zα
Two tail
2 · P (z < −abs(z0 )) abs(z0 ) > zα/2
• Decision
• Context
Example:
Find if the proportions of red M&M’s in the plain and peanut variety do differ at a significance level
of 0.05.
The sample
Plain(1) Peanut(2)
Sample Size
56
32
Number of red M&Ms 12
8
This results in p̂1 = 12/56 = 0.214 and p̂2 = 8/32 = 0.25 and p̂c = (12+8)/(56+32) = 20/88 = 0.227
1. The question asks for a test of H0 : p1 − p2 = 0 vs. Ha : p1 − p2 6= 0.
α = 0.05
9
2. Assumption: Since p̂1 n1 , (1 − p̂1 )n1 , p̂2 n2 , (1 − p̂2 )n2 are all greater than 5, the assumptions are
met and the test will deliver a reliable result.
3. Test statistic:
z0 = q p̂
(p̂1 − p̂2 )
c (1−p̂c )
n1
+
p̂c (1−p̂c )
n2
=q
(0.214 − 0.25)
0.227(0.773)
56
+
0.227(0.773)
32
=√
−0.036
= −0.3882
0.0031 + 0.0055
4. Rejection region: With α = 0.05 the rejection region for a two tailed test is:
abs(z0 ) > zα/2 = 1.96.
or using the p-value:
2-tailed p-value=2P (z > abs(z0 )) = 2P (z > 0.3882) = 2(0.5 − 0.1517) = 0.6966
5. Decision: Since abs(−0.3882) = 0.3882 <1.96, we find that zo is not falling into the rejection
region and the null hypothesis is not rejected at a significance level of 0.05.
6. At significance level of 5% we conclude that we do not have enough evidence, that the proportion
of red M&M’s is different for the plain and peanut variety.
10