Topic 20 – Two Populations
20-1
Topic 20 – COMPARING TWO POPULATIONS (OR
TREATMENTS)
A) Two Population Means Using Independent
Samples
EXAMPLE A scientist is interested in determining which
of two butterfly subspecies has a larger wingspan.
Subspecies 1 is found on forest understory plants and
tends to feed on its nursery plants. Thus it doesn’t travel
far. The other species is found on open field flowers and
migrates seasonally. She hypothesizes that the migrating
species has larger average wingspans than the forest
species and plans to take two samples to test her
hypothesis.
Notation:

Population   Population    Population           Sample   Sample        Sample
             Mean          Standard Deviation   Size     Mean          Standard Deviation
1            $\mu_1$       $\sigma_1$           $n_1$    $\bar{x}_1$   $s_1$
2            $\mu_2$       $\sigma_2$           $n_2$    $\bar{x}_2$   $s_2$
To compare 2 population means we shall consider the size
of the difference μ1 − μ 2 :
μ1 − μ 2 = 0 ⇒ μ1 = μ 2
μ1 − μ 2 > 0 ⇒ μ1 > μ 2
μ1 − μ 2 < 0 ⇒ μ1 < μ 2
Our estimator of this population difference is the difference in sample means, $\bar{x}_1 - \bar{x}_2$, when the two samples are independent of one another.
Sampling Distribution of $\bar{x}_1 - \bar{x}_2$ when the two samples are independently and randomly taken:
1) the mean of the distribution is
$$\mu_{\bar{X}_1 - \bar{X}_2} = \mu_1 - \mu_2$$
(that is, $\bar{x}_1 - \bar{x}_2$ is unbiased)
2) the standard deviation of the distribution is
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
3) the shape of the sampling distribution is approximately normal (a bell curve) if
a) both $n_1$ and $n_2$ are large, or
b) both of the populations being sampled are approximately normally distributed
The estimator of $\mu_1 - \mu_2$ is $\hat{\mu}_1 - \hat{\mu}_2 = \bar{x}_1 - \bar{x}_2$.
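The estimator of
$$\sigma_{\bar{X}_1 - \bar{X}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
depends on whether $\sigma_1 \neq \sigma_2$ (unequal variance case) or $\sigma_1 = \sigma_2$ (equal variance case).
Equal Variance Case: When $\sigma_1 = \sigma_2$, the estimator of $\sigma_{\bar{X}_1 - \bar{X}_2}$ is given by
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_c^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
where the estimator of the common variance is
$$s_c^2 = \frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2}.$$
The degrees of freedom for this estimator are $n_1 + n_2 - 2$.
Unequal Variance Case: When $\sigma_1 \neq \sigma_2$, the estimator of $\sigma_{\bar{X}_1 - \bar{X}_2}$ is given by
$$s'_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}.$$
What are the degrees of freedom for $s'_{\bar{x}_1 - \bar{x}_2}$?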
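Satterthwaite showed that the appropriate degrees of freedom for this estimator are
$$df = \frac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}}$$
where $V_1 = \dfrac{s_1^2}{n_1}$ and $V_2 = \dfrac{s_2^2}{n_2}$.
These reduce to $df = n_1 + n_2 - 2$ when the two sample variances are equal and the two sample sizes are equal. Otherwise the value is smaller than $n_1 + n_2 - 2$.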
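The degrees-of-freedom formula is easy to mistype by hand, so a minimal Python sketch may help. The function name `satterthwaite_df` and the numeric inputs are made up for illustration and are not part of the notes; the formula is the one given above.

```python
def satterthwaite_df(s1_sq, n1, s2_sq, n2):
    """Approximate df for the unequal-variance t statistic.

    s1_sq, s2_sq are the sample variances; n1, n2 are the sample sizes.
    """
    v1 = s1_sq / n1
    v2 = s2_sq / n2
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Hypothetical inputs: equal sample variances and equal sample sizes.
df_balanced = satterthwaite_df(4.0, 10, 4.0, 10)

# Hypothetical inputs: very different sample variances, same sample sizes.
df_unbalanced = satterthwaite_df(1.0, 10, 25.0, 10)

print(df_balanced, df_unbalanced)
```

With equal variances and equal sample sizes the result matches n1 + n2 - 2 = 18, while the second call gives a noticeably smaller value.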
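So, if $\bar{x}_1 - \bar{x}_2$ is at least approximately normally distributed, we get that
$$t'_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
or
$$t_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_c^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$
have approximate T-distributions on the associated degrees of freedom.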
Hypothesis Test of the Difference in Two Population
Means Based on Two Independent Samples:
Hypotheses are one of three:
a) H0: μ1 − μ 2 ≤ D0 vs. HA: μ1 − μ2 > D0
b) H0: μ1 − μ 2 ≥ D0 vs. HA: μ1 − μ2 < D0
c) H0: μ1 − μ2 = D0 vs. HA: μ1 − μ 2 ≠ D0
where D0 is the hypothesized difference between the
means
Test Statistic: depends on whether the variances in the two populations are different or the same, so the statistic is either
(1) $$t'_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
or
(2) $$t_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_c^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}}$$
The degrees of freedom are
for (1): $$df = \frac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}}$$ where $V_1 = \dfrac{s_1^2}{n_1}$ and $V_2 = \dfrac{s_2^2}{n_2}$
and for (2): $n_1 + n_2 - 2$.
P-value: depends on the alternative hypothesis:
a) P-value = Pr( T > t)
b) P-value = Pr( T < t)
c) P-value = 2 Pr( T > |t|)
Decision Rule: reject Ho if P-value ≤ α
Assumptions:
1. n1 and n2 are large enough for the sample means to
be approximately normally distributed
2. the sampling was random and not more than 5% of
the population.
3. the two samples are independently taken
EXAMPLE Nitrogen is the most common nutrient
applied to soils. In tropical areas with warm temperatures
and heavy rainfall, only part of the applied nitrogen is
used by crops and the rest is lost. Information about the
mean nitrogen loss (N-loss) is important for research on
optimal growth of plants.
To that end, two nitrogen fertilizer treatments are to be
compared for their average N-loss: Urea alone
(population 1) and Urea+N-Serve (population 2).
A sugarcane field was divided into equal size plots and
plots were randomly assigned to one of the two
treatments. There were sufficient numbers of plots so that
no treated plots were adjacent on any side.
Important Point about Experimental Design: when
planning an experiment to compare two or more
treatments:
1) experimental units (plants, field plots, people, etc)
should be randomly selected from the larger group
from which they could be selected (the population of
potential experimental units)
2) treatments should be randomly assigned to the
experimental units
3) extraneous or confounding factors should be
considered and minimized when assigning and
running the experiment (e.g. all units should be the
same size, have the same weather conditions, etc)
The following data represent Nitrogen loss (% of total N
applied) at the end of a 16 week period:
Fertilizer   Percentage N-loss
UN           10.8, 10.5, 14.0, 13.5, 8.0, 9.5, 11.8, 10.0,
             8.7, 9.0, 9.8, 13.8, 14.7, 10.3, 12.8
U            8.0, 7.3, 14.1, 9.8, 7.1, 6.3, 10.0, 7.1, 7.9,
             6.1, 6.9, 11.0, 10.0

Group   Treatment ID   N    Mean     SD      S2
U       1              13   8.585    2.288   5.235
UN      2              15   11.147   2.140   4.580
Question: Is there sufficient evidence to support the
hypothesis that the two treatments differ in their mean
percentage N-loss?
Hypotheses:
Ho: μ1 − μ 2 = 0
HA: μ1 − μ 2 ≠ 0
Significance level:
α = 0.05
Test Statistic (assuming unequal variances)
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(8.58 - 11.15) - 0}{\sqrt{\dfrac{(2.29)^2}{13} + \dfrac{(2.14)^2}{15}}} = -3.045$$
Degrees of Freedom:
$$V_1 = \frac{s_1^2}{n_1} = \frac{(2.29)^2}{13} = 0.4034$$
$$V_2 = \frac{s_2^2}{n_2} = \frac{(2.14)^2}{15} = 0.3053$$
$$df = \frac{(V_1 + V_2)^2}{\dfrac{V_1^2}{n_1 - 1} + \dfrac{V_2^2}{n_2 - 1}} = \frac{(0.4034 + 0.3053)^2}{\dfrac{(0.4034)^2}{13 - 1} + \dfrac{(0.3053)^2}{15 - 1}} = 24.8 \approx 24$$
(always round down)
P-value: 2 Pr ( T > |t| ) =2 Pr(T> 3.0). From the table in
the book, we see that for 24 df, tobs = 3.0 lies between
2.797 (p-value=0.005) and 3.467 (p-value=0.001). Hence,
we have that the p-value for our test lies between 2(0.001)
= 0.002 and 2(0.005) = 0.01.
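The hand calculation above can be cross-checked from the raw N-loss data. The following is a minimal sketch using only the Python standard library; the variable names are chosen here for illustration, and the data are the U and UN values listed earlier.

```python
import math
import statistics

# Percentage N-loss: Urea alone (population 1) and Urea+N-Serve (population 2).
u = [8.0, 7.3, 14.1, 9.8, 7.1, 6.3, 10.0, 7.1, 7.9, 6.1, 6.9, 11.0, 10.0]
un = [10.8, 10.5, 14.0, 13.5, 8.0, 9.5, 11.8, 10.0,
      8.7, 9.0, 9.8, 13.8, 14.7, 10.3, 12.8]

n1, n2 = len(u), len(un)
mean1, mean2 = statistics.mean(u), statistics.mean(un)
var1, var2 = statistics.variance(u), statistics.variance(un)  # sample variances

d0 = 0.0  # hypothesized difference
v1, v2 = var1 / n1, var2 / n2

# Unequal-variance test statistic and Satterthwaite degrees of freedom.
t_unequal = ((mean1 - mean2) - d0) / math.sqrt(v1 + v2)
df_unequal = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(round(t_unequal, 3), round(df_unequal, 1), math.floor(df_unequal))
```

The statistic agrees with the value of about -3.045 found by hand, and the degrees of freedom come out near 24.8, which rounds down to 24 for table lookup.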
Test Statistic (assuming equal variances)
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{s_c^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \frac{(8.58 - 11.15) - 0}{\sqrt{4.882 \left( \dfrac{1}{13} + \dfrac{1}{15} \right)}} = -3.06$$
where
$$s_c^2 = \frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1)}{n_1 + n_2 - 2} = \frac{5.235(12) + 4.580(14)}{13 + 15 - 2} = 4.882$$
Degrees of Freedom: $n_1 + n_2 - 2 = 26$.
P-value: 2 Pr ( T > |t| ) =2 Pr(T> 3.0). From the table in
the book, we see that for 26 df, tobs = 3.0 lies between
2.779 (p-value=0.005) and 3.435 (p-value=0.001). Hence,
we have that the p-value for our test lies between 2(0.001)
= 0.002 and 2(0.005) = 0.01.
Conclusion: Regardless of the choice of test, the p-value
is less than α=0.05, so we reject the null hypothesis.
There is sufficient evidence to indicate that the two
nitrogen treatments differ in their average percentage
nitrogen loss.
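For comparison, the equal-variance version of the same test can be sketched the same way. This is an illustrative standard-library calculation with variable names chosen here, not output from the notes.

```python
import math
import statistics

# Percentage N-loss: Urea alone (population 1) and Urea+N-Serve (population 2).
u = [8.0, 7.3, 14.1, 9.8, 7.1, 6.3, 10.0, 7.1, 7.9, 6.1, 6.9, 11.0, 10.0]
un = [10.8, 10.5, 14.0, 13.5, 8.0, 9.5, 11.8, 10.0,
      8.7, 9.0, 9.8, 13.8, 14.7, 10.3, 12.8]

n1, n2 = len(u), len(un)
var1, var2 = statistics.variance(u), statistics.variance(un)

# Pooled estimate of the common variance, weighted by each sample's df.
pooled_var = (var1 * (n1 - 1) + var2 * (n2 - 1)) / (n1 + n2 - 2)

# Equal-variance test statistic with hypothesized difference D0 = 0.
se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_pooled = (statistics.mean(u) - statistics.mean(un) - 0.0) / se_pooled
df_pooled = n1 + n2 - 2

print(round(pooled_var, 3), round(t_pooled, 2), df_pooled)
```

The pooled variance (about 4.882), the statistic (about -3.06) and the 26 degrees of freedom match the hand calculation.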
Had we used SAS, the code and output would be:
/* read treatment label and N-loss value, several pairs per line */
data nloss;
  input treatment$ loss @@;
  cards;
UN 10.8 UN 10.5 UN 14.0 UN 13.5 UN 8.0
UN 9.5 UN 11.8 UN 10.0 UN 8.7 UN 9.0
UN 9.8 UN 13.8 UN 14.7 UN 10.3 UN 12.8
U 8.0 U 7.3 U 14.1 U 9.8 U 7.1
U 6.3 U 10.0 U 7.1 U 7.9 U 6.1
U 6.9 U 11.0 U 10.0
;
/* two-sample t-tests (pooled and Satterthwaite) comparing the treatments */
proc ttest data=nloss;
  class treatment;
  var loss;
run;
Statistics

                              Lower CL            Upper CL   Lower CL             Upper CL
Variable   treatment     N    Mean       Mean     Mean       Std Dev    Std Dev   Std Dev    Std Err
loss       U            13    7.201      8.584    9.967      1.640      2.288     3.777      0.6347
loss       UN           15    9.961      11.14    12.33      1.566      2.139     3.374      0.5525
loss       Diff (1-2)         -4.283     -2.56    -0.84      1.740      2.209     3.028      0.8373

In the Diff (1-2) row, Mean is $\bar{y}_1 - \bar{y}_2$, Std Dev is $\sqrt{s_c^2}$, and Std Err is $\sqrt{s_c^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$.

T-Tests

Variable   Method          Variances   DF     t Value   Pr > |t|
loss       Pooled          Equal       26     -3.06     0.0051
loss       Satterthwaite   Unequal     24.8   -3.04     0.0054

Equality of Variances

Variable   Method     Num DF   Den DF   F Value   Pr > F
loss       Folded F   12       14       1.14      0.8014
Note that the p-values are exact here and are equal to
0.0051 or 0.0054 depending on whether we use the test
assuming equal variance or not.
There are two questions here:
1) We just saw that when the two sample variances are
close in value, the two test statistics are almost identical.
So, why not just use the unequal variance test all of the
time?
Actually, that is not unreasonable since the test for
unequal variances reduces to the equal variance test when
the two sample variances are identical. In reality though,
even when the two populations have equal variance, the
sample variances can be quite different. This is especially
true when the sample sizes are not the same. As a result,
the unequal variance test is not as good as the equal
variance test when the two population variances are equal.
It tends to have higher type II error when the two
variances are equal but we assume they are not. So, the
next question is …
2) How do we identify which test statistic should be used?
Well, we can either
• use the rule of thumb that the larger sample variance should be no more than 3 times the smaller one OR
• do a test of equality of the two variances.
We will learn the test for equality of variances next after
CI estimation of the difference in two population means.
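The rule of thumb can be written as a one-line check. The sketch below assumes the rule means "larger sample variance divided by smaller sample variance is at most 3"; the function name `variances_look_equal` and the `max_ratio` parameter are hypothetical names introduced here.

```python
def variances_look_equal(s1_sq, s2_sq, max_ratio=3.0):
    """Return (decision, ratio) for the factor-of-3 rule of thumb.

    The ratio always puts the larger sample variance on top, so it is >= 1.
    """
    ratio = max(s1_sq, s2_sq) / min(s1_sq, s2_sq)
    return ratio <= max_ratio, ratio


# Sample variances from the N-loss example: U = 5.235, UN = 4.580.
use_pooled, ratio = variances_look_equal(5.235, 4.580)
print(use_pooled, round(ratio, 2))
```

For the N-loss data the ratio is about 1.14, well under 3, so the rule of thumb points to the equal variance test.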
Confidence Interval Estimation of the Difference of
Two Means Based on Independent Samples:
Interval Estimator:
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,df} \times \left( \text{estimator of } \sigma_{\bar{X}_1 - \bar{X}_2} \right)$$
where the t-value is based on the desired confidence level (1 − α) and has degrees of freedom determined by which estimator of the standard deviation you use (equal or unequal variances).
Assumptions:
1. n1 and n2 are large enough for the sample means to
be approximately normally distributed
2. the sampling was random and not more than 5% of
the population.
3. the two samples are independently taken
EXAMPLE N-loss experiment. A 95% confidence interval based on two independent samples is given by either
$$(\bar{x}_1 - \bar{x}_2) \pm t_{0.025,\,24} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
or
$$(\bar{x}_1 - \bar{x}_2) \pm t_{0.025,\,26} \sqrt{s_c^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
From earlier:

Group   Treatment ID   N    Mean     SD      S2
U       1              13   8.585    2.288   5.235
UN      2              15   11.147   2.140   4.580
For unequal variances, the t-value for 95% confidence and 24 df = 2.06. So, the confidence interval is
$$(8.58 - 11.15) \pm 2.06 \sqrt{\frac{(2.29)^2}{13} + \frac{(2.14)^2}{15}} = (-4.2870, -0.8371)$$
Similarly, if we assume equal variances the t-value for 95% and 26 df = 2.05, so we obtain
$$(8.58 - 11.15) \pm 2.05 \sqrt{4.882 \left( \frac{1}{13} + \frac{1}{15} \right)} = (-4.2831, -0.8410)$$
Thus, with 95% confidence, the mean nitrogen loss (%)
from Urea alone is between 0.8% and 4.3% below the
mean loss of the Urea+N-Serve combination, regardless
of which method we use.
Note that the SAS output reports the 95% confidence
interval of the difference assuming equal variances. If you
want the interval for unequal variances, you will have to
calculate it yourself.
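Since the unequal-variance interval has to be computed by hand, a short Python sketch may be useful. It assumes the summary statistics from the table above and the table t-values quoted in the text (2.06 for 24 df, 2.05 for 26 df); because of rounding in those inputs, the endpoints may differ from the ones above in the second decimal place.

```python
import math

# Summary statistics from the N-loss table (group U is 1, group UN is 2).
mean1, var1, n1 = 8.585, 5.235, 13
mean2, var2, n2 = 11.147, 4.580, 15
diff = mean1 - mean2

# Unequal variances: table t-value for 95% confidence and 24 df (from the text).
t_crit_unequal = 2.06
se_unequal = math.sqrt(var1 / n1 + var2 / n2)
ci_unequal = (diff - t_crit_unequal * se_unequal,
              diff + t_crit_unequal * se_unequal)

# Equal variances: table t-value for 95% confidence and 26 df (from the text).
t_crit_pooled = 2.05
pooled_var = (var1 * (n1 - 1) + var2 * (n2 - 1)) / (n1 + n2 - 2)
se_pooled = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
ci_pooled = (diff - t_crit_pooled * se_pooled,
             diff + t_crit_pooled * se_pooled)

print([round(x, 2) for x in ci_unequal], [round(x, 2) for x in ci_pooled])
```

Both intervals round to roughly -4.3 and -0.8, in line with the interpretation given above.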
EXAMPLE Discharge of industrial waste into rivers
affects water quality. To assess the effect of a power plant
on water quality, 24 samples were taken 16 km upstream
of the plant and another 24 were taken at 4 km
downstream. Alkalinity (mg/l) was measured on each
water sample. Do the data suggest that the true mean
alkalinity below the plant is more than 50 mg/l higher
than the true mean alkalinity upstream of the plant?
Since the two tests report similar results, we will use the
unequal variance test here. Output from a statistical
software program:
Group        N    Mean    SD     Pop'ln
upstream     24   75.9    1.83   2
downstream   24   183.6   1.70   1

tobs = 113.2
df = 45
2-sided P-value = 0+
Hypotheses:
Ho: μ1 − μ2 = 50
HA: μ1 − μ2 > 50
Check the t-score:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(183.6 - 75.9) - 50}{\sqrt{\dfrac{(1.70)^2}{24} + \dfrac{(1.83)^2}{24}}} = 113.17$$
Assumptions:
1) sample sizes large enough?
2) samples independent and random?
Conclusion: There is strong evidence to suggest that the
average alkalinity of the water below the power plant is
more than 50 mg/l higher than the mean alkalinity of the
water above the power plant.
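The reported t-score and degrees of freedom can be reproduced from the summary table alone. This is a minimal sketch with variable names chosen here; it uses the unequal variance statistic and the Satterthwaite degrees of freedom, rounded down.

```python
import math

# Summary statistics from the output: downstream is population 1, upstream is 2.
mean_down, sd_down, n_down = 183.6, 1.70, 24
mean_up, sd_up, n_up = 75.9, 1.83, 24
d0 = 50.0  # hypothesized difference in mean alkalinity (mg/l)

v1 = sd_down ** 2 / n_down
v2 = sd_up ** 2 / n_up

# Unequal-variance statistic for H0: mu1 - mu2 = 50.
t_obs = ((mean_down - mean_up) - d0) / math.sqrt(v1 + v2)

# Satterthwaite degrees of freedom, rounded down.
df = math.floor((v1 + v2) ** 2 / (v1 ** 2 / (n_down - 1) + v2 ** 2 / (n_up - 1)))

print(round(t_obs, 2), df)
```

The result agrees with the t-score of about 113.17 and the 45 degrees of freedom shown in the output.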
For a 95% confidence interval estimate of the difference we have: the t-value for 95% and 45 df ≈ 2.02. So,
$$(183.6 - 75.9) \pm 2.02 \sqrt{\frac{(1.70)^2}{24} + \frac{(1.83)^2}{24}} = 107.7 \pm 1.03 = (106.67, 108.73)$$
We conclude with 95% confidence that the mean alkalinity below the power plant is between 106.7 and 108.7 mg/l higher than the mean alkalinity of the water above the power plant!
B) Comparing Two Population Variances Using
Independent Samples
Suppose we are interested in determining which t-test to use to compare two means, or we want to compare two population variances for other purposes.
A simple test of two population variances based on two
independent samples is called Hartley’s Fmax test or the
folded F-test. An underlying assumption is that the two
populations being tested are Normally distributed.
To test hypotheses about population variances we look at the ratio of the two sample variances:
$$F_{obs} = \frac{s_{max}^2}{s_{min}^2}$$
where
$$s_{max}^2 = \max(s_1^2, s_2^2) \geq s_{min}^2 = \min(s_1^2, s_2^2).$$
Hence, the larger sample variance is always put in the numerator.
This test statistic, $F_{obs}$, has a sampling distribution known as the F-distribution with two sets of degrees of freedom, the numerator and the denominator degrees of freedom. For the Fmax test:
• the numerator df are $n_{max} - 1$ ($n_{max}$ is the sample size for $s_{max}^2$) and
• the denominator df are $n_{min} - 1$ ($n_{min}$ is the sample size for $s_{min}^2$).
Note that $n_{max}$ need not be larger than $n_{min}$!
The F-distribution is positively skewed with a long right
tail and a shape that depends on the two df values. It is a
probability distribution for random variables whose
values are > 0 (like the Chi-Square distribution).
Like the chi-square distribution, we can use a table of
cutoff values to determine whether to reject the null
hypothesis. See pages 625-635 of Freund & Wilson. For
the Fmax test, use the table on page 635 if the two sample
sizes are the same, i.e. n1 = n2.
Hartley’s Fmax Test of Equality of Two Population Variances Based on Two Independent Samples:
Hypotheses:
H0: $\sigma_1^2 = \sigma_2^2$ vs. HA: $\sigma_1^2 \neq \sigma_2^2$
Test Statistic: $$F_{obs} = \frac{s_{max}^2}{s_{min}^2}$$ where $s_{max}^2 > s_{min}^2$
The numerator df are $n_{max} - 1$ ($n_{max}$ is the sample size for $s_{max}^2$) and the denominator df are $n_{min} - 1$ ($n_{min}$ is the sample size for $s_{min}^2$)
Decision Rule (2 approaches):
1) reject H0 if Fobs > tabulated F-value for α and the two
sets of df.
2) reject H0 if the p-value of the test < α.
EXAMPLE A wildlife biologist is interested in
comparing the variability in weights for two populations
of deer: those raised in the wild and those raised in a zoo.
She randomly selected eight deer from each population
and weighed them (lbs) at the age of 1 year. The data are:
Location   Weight (lbs) at 1 year
W (wild)   114.7, 134.5, 128.9, 126.7, 111.5, 120.6, 116.4, 129.6
Z (zoo)    103.1, 182.5, 90.7, 76.8, 129.5, 87.3, 75.8, 77.3

H0: $\sigma_W^2 = \sigma_Z^2$ vs. HA: $\sigma_W^2 \neq \sigma_Z^2$
From SAS we have (subset of the output):

Statistics

Variable   location     N   Mean     Std Dev   Std Err
weight     W            8   122.86   8.2342    2.9112
weight     Z            8   102.88   36.853    13.029
weight     Diff (1-2)       19.988   26.701    13.351

T-Tests

Variable   Method          Variances   DF    t Value   Pr > |t|
weight     Pooled          Equal       14    1.50      0.1566
weight     Satterthwaite   Unequal     7.7   1.50      0.1742

Equality of Variances

Variable   Method     Num DF   Den DF   F Value   Pr > F
weight     Folded F   7        7        20.03     0.0008

Test statistic:
$$F_{obs} = \frac{s_{max}^2}{s_{min}^2} = \frac{36.85^2}{8.23^2} = 20.03$$
Choose α = 0.05 .
Decision:
1) From the table on pg. 635, with the denominator df =
8-1 = 7, we have a cutoff value of 4.99. Since Fobs = 20.03
> cutoff = 4.99, we reject the null hypothesis and
conclude that the two populations of deer, those raised in
the wild and those raised in zoos, differ in the variability
of their weights at 1 year of age.
2) From the SAS output, the p-value for Fobs is 0.0008
which is less than α = 0.05. Hence, we reject the null
hypothesis and conclude that there is sufficient evidence
to indicate that the variability of weights of deer raised in
zoos differs from the variability of weights of wild deer.
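The folded F statistic can be reproduced directly from the deer weights. The sketch below uses only the Python standard library, with variable names chosen here for illustration; it puts the larger sample variance in the numerator and tracks the matching degrees of freedom.

```python
import statistics

# Weights (lbs) at 1 year of age for wild (W) and zoo-raised (Z) deer.
wild = [114.7, 134.5, 128.9, 126.7, 111.5, 120.6, 116.4, 129.6]
zoo = [103.1, 182.5, 90.7, 76.8, 129.5, 87.3, 75.8, 77.3]

var_wild = statistics.variance(wild)  # sample variance
var_zoo = statistics.variance(zoo)

# The larger sample variance goes in the numerator; its sample supplies
# the numerator degrees of freedom.
if var_zoo >= var_wild:
    f_obs = var_zoo / var_wild
    num_df, den_df = len(zoo) - 1, len(wild) - 1
else:
    f_obs = var_wild / var_zoo
    num_df, den_df = len(wild) - 1, len(zoo) - 1

print(round(f_obs, 2), num_df, den_df)
```

This gives an F value of about 20.03 on 7 and 7 degrees of freedom, matching the SAS output.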
C) Comparing Two Population Means Using Paired
Samples
Consider the following experiments:
1. In order to determine if two IQ tests yield similar
results (means and standard deviations), the researcher
selected 50 college students at random to take both tests.
The order in which any given student took the tests was
randomized and the tests were taken 1 month apart to
minimize crossover effects. The hypothesis is that test # 1
is biased in that it yields a higher average score than test
#2 which has been in use for many years.
Hypotheses: Ho: μ1 − μ 2 = 0 vs. HA: μ1 − μ 2 > 0
Note the experimental design here as well as the
hypotheses being tested. We can’t use the independent
samples test for this case.
2. A swine nutritionist wished to compare a nitrogen
poor + enzyme diet (#1) to a nitrogen rich diet (#2) for
pigs. Rather than take one piglet from each new litter and
assign it a diet at random, he chose instead to take 2
piglets from each litter and randomly assign one pig to
one diet and the other to the other diet. The hypothesis is
that the nitrogen rich diet results in a higher average
weight gain than the nitrogen poor + enzyme diet.
Hypotheses: Ho: μ1 − μ 2 = 0 vs. HA: μ1 − μ 2 < 0
3. A researcher is interested in the effect of oxygen
exposure on cell fluidity in pulmonary artery cells in
dogs. She intends to collect cells from ten dogs for the
experiment. For each dog, two agar plates of artery cells
are prepared and each plate is randomly assigned to either
receive O2 or not receive O2 treatment. She wishes to test
the hypothesis that the mean fluidity for oxygen treated
cells (2) differs from the mean for untreated cells (1).
Hypotheses: Ho: μ1 − μ 2 = 0 vs. HA: μ1 − μ 2 ≠ 0
In all three cases, the samples are NOT independent of
each other. In fact, they are deliberately dependent.
One reason for this is that the estimator of the difference
between two means based on 2 independent samples has a
large standard deviation (recall that it is the square root of
the SUM of two variances).
When samples are paired as is done here, the standard
deviation of the estimator of μ1 − μ 2 used for a paired
experiment is often smaller.
Defn: A PAIRED or “BLOCKED” experiment is one in
which each randomly selected experimental unit in the
first sample is paired deliberately with a selected unit in
the second sample. The units in the second sample are
chosen so that they have characteristics similar to the unit
in the first sample to which they have been paired.
The characteristics used for pairing are usually those that
likely have an effect on the response variable being
studied in the experiment but are not of direct interest.
It is this last statement that often leads to the standard
deviation being smaller in paired experiments.
Example #1. Perfect pairing since each experimental unit
in sample 1 is also used in sample 2. By having each
student take both tests and looking at the differences in
scores we have removed variability due to intelligence or
test taking ability or other things that influence an
individual's test taking ability.
Intuitively, comparing how several people react to each
test is more informative and accurate than comparing
results for independently chosen people for each test.
Example #2. Genetics has a relatively large influence on
adult size and growth in most animals. Hence it would not
be surprising that two pigs from the same litter would
respond to each of the two diets similarly in the sense that
one would respond as the other would have had it been on
the first pig’s diet as well. Hence, the two littermates are
paired in this experiment and we will look at the
difference in responses of litter mates on different diets.
Example #3. Although the cells in each of the 2
treatments are not exactly the same, they are as close as
possible, being from the same animal. Hence any effect
due to animal variability is controlled somewhat by using
the same dogs for both treatments.
For paired samples, the estimator of the difference $\mu_1 - \mu_2$ is the average of the paired sample differences, $\bar{D}$.
To obtain this mean difference:
For each pair, calculate the difference in Y of the two
paired experimental units under the two treatments. Call
this difference D.
EXAMPLE: Cell fluidity

Dog    Without O2 (Y1)   With O2 (Y2)   Difference D = (Y1 - Y2)
1      0.308             0.308           0.000
2      0.304             0.309          -0.005
3      0.305             0.305           0.000
4      0.304             0.311          -0.007
5      0.301             0.303          -0.002
6      0.278             0.293          -0.015
7      0.296             0.302          -0.006
8      0.301             0.300           0.001
9      0.302             0.308          -0.006
10     0.237             0.250          -0.013
Mean   0.294             0.299          -0.0053
SD     0.022             0.018           0.00542
We have transformed the original data from two
dependent samples -- the new data consist of a single
sample of n differences. The average of the sample
differences is
$$\bar{D} = \frac{1}{n} \sum_{i=1}^{n} D_i$$
and the standard deviation is
$$s_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}.$$
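As a check on the Mean and SD rows of the cell fluidity table, the differences and their summary statistics can be computed in a few lines of Python. This is an illustrative sketch with variable names chosen here.

```python
import statistics

# Cell fluidity for the ten dogs: without O2 (Y1) and with O2 (Y2).
y1 = [0.308, 0.304, 0.305, 0.304, 0.301, 0.278, 0.296, 0.301, 0.302, 0.237]
y2 = [0.308, 0.309, 0.305, 0.311, 0.303, 0.293, 0.302, 0.300, 0.308, 0.250]

# One difference per dog turns two dependent samples into a single sample.
d = [a - b for a, b in zip(y1, y2)]

d_bar = statistics.mean(d)   # average of the sample differences
s_d = statistics.stdev(d)    # sample standard deviation (divides by n - 1)

print(round(d_bar, 4), round(s_d, 5))
```

The results agree with the table: a mean difference of about -0.0053 and a standard deviation of about 0.00542.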
The sample of differences can be regarded as a random
sample from a population of differences if the
experimental units (e.g. the ten dogs) can be regarded as a
random selection from among all experimental units. In
that case, we have
SAMPLING DISTRIBUTION of $\bar{D}$:
1) the mean of the distribution is $\mu_{\bar{D}} = \mu_D = \mu_1 - \mu_2$
2) the standard deviation of the distribution is
$$\sigma_{\bar{D}} = \frac{\sigma_D}{\sqrt{n}}$$
where $\sigma_D$ is the standard deviation of the population of differences from which we sampled n differences.
3) the shape of the distribution is approximately normal (a bell curve) if n is large or the two populations being sampled are approximately normally distributed.
The estimator of $\mu_D$ is $\bar{D}$, the sample mean difference, and the estimator of $\sigma_D$ is $s_D$, the sample standard deviation of the differences. The problem reverts to a test of the mean $\mu_D$ based on a single sample.
Hypothesis Test of the Difference in Two Population
Means Using Paired Samples:
Hypotheses are one of three:
a) H0: μ D ≤ D0 vs. HA: μ D > D0
b) H0: μ D ≥ D0 vs. HA: μ D < D0
c) H0: μ D = D0 vs. HA: μ D ≠ D0
Test Statistic: $$t_{obs} = \frac{\bar{D} - D_0}{s_D / \sqrt{n}}$$ on n – 1 df
P-value: depends on the alternative hypothesis:
a) P-value = Pr( T > tobs)
b) P-value = Pr( T < tobs)
c) P-value = 2 Pr( T > |tobs|)
Decision Rule: reject Ho if p-value ≤ α
Assumptions:
1. $\bar{D}$ is approximately normally distributed
2. the sampling was random and not more than 5% of
the population.
EXAMPLE dog fluidity study
Hypotheses: Ho: μD = 0 vs. HA: μD ≠ 0
(i.e. D0 = 0)
Significance Level: we’ll choose α=0.025.
Now, the numbers we need are:
$\bar{D} = -0.0053$, $s_D = 0.00542$, and $n = 10$
Test Statistic: $$t_{obs} = \frac{\bar{D} - 0}{s_D / \sqrt{n}} = \frac{-0.00530}{0.00542 / \sqrt{10}} = -3.0939$$
df = n − 1 = 9.
P-value: 2Pr(T>|tobs|) = 2Pr(T>+3.09) is between 0.01
and 0.02 using the T-table in the book.
Conclusion: p-value < α=0.025. Hence we reject Ho and
conclude that the data provide sufficient evidence at
α=0.025 to indicate that oxygen treatment changes the
mean fluidity of pulmonary artery cells in dogs.
Assumptions: The sample size is small but it is likely
that the population of differences is not too skewed.
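The paired test statistic can be recomputed from the raw fluidity values. The sketch below is a standard-library illustration with variable names chosen here; it treats the ten differences as a single sample, as described above.

```python
import math
import statistics

# Cell fluidity for the ten dogs: without O2 (Y1) and with O2 (Y2).
y1 = [0.308, 0.304, 0.305, 0.304, 0.301, 0.278, 0.296, 0.301, 0.302, 0.237]
y2 = [0.308, 0.309, 0.305, 0.311, 0.303, 0.293, 0.302, 0.300, 0.308, 0.250]

d = [a - b for a, b in zip(y1, y2)]
n = len(d)
d0 = 0.0  # hypothesized mean difference

# One-sample t statistic applied to the differences.
std_err = statistics.stdev(d) / math.sqrt(n)
t_paired = (statistics.mean(d) - d0) / std_err
df_paired = n - 1

print(round(t_paired, 2), df_paired, round(std_err, 4))
```

The statistic is about -3.09 on 9 degrees of freedom, in agreement with the hand calculation.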
In SAS, the code and output are:
/* one record per dog: ID, fluidity without O2 (Y1), fluidity with O2 (Y2) */
data dogs;
  input dogID Y1 Y2 @@;
  datalines;
1 0.308 0.308 2 0.304 0.309
3 0.305 0.305 4 0.304 0.311
5 0.301 0.303 6 0.278 0.293
7 0.296 0.302 8 0.301 0.300
9 0.302 0.308 10 0.237 0.250
;
/* paired t-test on the differences Y1 - Y2 */
proc ttest data=dogs;
  paired Y1*Y2;
run;
The TTEST Procedure

Statistics

                   Lower CL            Upper CL   Lower CL             Upper CL
Difference    N    Mean       Mean     Mean       Std Dev    Std Dev   Std Dev    Std Err
Y1 - Y2       10   -0.009     -0.005   -0.001     0.0037     0.0054    0.0099     0.0017

T-Tests

Difference   DF   t Value   Pr > |t|
Y1 - Y2      9    -3.09     0.0128
The output gives the exact p-value of 0.0128 < α =0.025.
Confidence Interval For the Difference of Two Means Based on a Paired Sample:
$$\bar{D} \pm t_{\alpha/2,\,n-1} \left( \frac{s_D}{\sqrt{n}} \right)$$
Assumptions:
1) sampling is random and
2) either the sample size is large so we can use the CLT or
the original population of differences has a frequency
distribution that is bell-curve shaped.
EXAMPLE: dog fluidity study. For a 95% confidence
interval of the difference of two means we need the t
critical value for 95% and 9 df. It is t = 2.26.
Hence, the 95% C. I. of the difference between mean fluidity in cells with and without oxygen is
$$-0.0053 \pm 2.26 \left( \frac{0.00542}{\sqrt{10}} \right) = -0.0053 \pm 0.0039 = (-0.0092, -0.0014)$$
which implies that the mean fluidity in the cells without
oxygen is below the mean fluidity for those that receive
oxygen with 95% confidence. The SAS output provides
the same confidence limits.
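The same interval can be reproduced from the raw data. This sketch assumes the table t-value of 2.26 quoted above for 95% confidence and 9 df; the variable names are chosen here for illustration.

```python
import math
import statistics

# Cell fluidity for the ten dogs: without O2 (Y1) and with O2 (Y2).
y1 = [0.308, 0.304, 0.305, 0.304, 0.301, 0.278, 0.296, 0.301, 0.302, 0.237]
y2 = [0.308, 0.309, 0.305, 0.311, 0.303, 0.293, 0.302, 0.300, 0.308, 0.250]

d = [a - b for a, b in zip(y1, y2)]
n = len(d)

t_crit = 2.26  # table t-value for 95% confidence and 9 df (from the text)
half_width = t_crit * statistics.stdev(d) / math.sqrt(n)

d_bar = statistics.mean(d)
ci_low, ci_high = d_bar - half_width, d_bar + half_width

print(round(ci_low, 4), round(ci_high, 4))
```

Both limits are negative, which is why the interval supports the conclusion that mean fluidity is lower without oxygen.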