Chapter 11
Comparing Two Populations or
Treatments
Section 11.1
Inferences Concerning the
Difference between Two
Populations or Treatment
Means Using Independent
Samples
Suppose we have a population of adult men with a mean height of 71 inches and standard deviation of 2.5 inches. We also have a population of adult women with a mean height of 65 inches and standard deviation of 2.3 inches. Assume heights are normally distributed.
Suppose we take a random sample of 30 men and a random sample of 25 women from their respective populations and calculate the difference in their sample mean heights (men's mean height - women's mean height).
If we did this many times, what would the distribution of differences be like?
On the next slide we will investigate this distribution.
Male Heights: $\mu_M = 71$, $\sigma_M = 2.5$
Female Heights: $\mu_F = 65$, $\sigma_F = 2.3$
Suppose we took repeated samples of size n = 30 from the population of male heights and calculated the sample means. We would have the sampling distribution of $\bar{x}_M$, centered at 71 with
$$\sigma_{\bar{x}_M} = \frac{2.5}{\sqrt{30}}$$
Suppose we took repeated samples of size n = 25 from the population of female heights and calculated the sample means. We would have the sampling distribution of $\bar{x}_F$, centered at 65 with
$$\sigma_{\bar{x}_F} = \frac{2.3}{\sqrt{25}}$$
Randomly take one of the sample means for the males and one of the sample means for the females and find the difference in mean heights. Doing this repeatedly, we will create the sampling distribution of $(\bar{x}_M - \bar{x}_F)$, centered at 6 with
$$\sigma_{\bar{x}_M - \bar{x}_F} = \sqrt{\frac{2.5^2}{30} + \frac{2.3^2}{25}}$$
Heights Continued . . .
Describe the sampling distribution of the difference in mean heights between men and women.
The sampling distribution is normally distributed with
$$\mu_{\bar{x}_M - \bar{x}_F} = 71 - 65 = 6$$
$$\sigma_{\bar{x}_M - \bar{x}_F} = \sqrt{\frac{2.5^2}{30} + \frac{2.3^2}{25}} \approx 0.648$$
What is the probability that the difference in mean heights of a random sample of 30 men and a random sample of 25 women is less than 5 inches?
$$P\big((\bar{x}_M - \bar{x}_F) < 5\big) = .0618$$
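The probability above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper name `normal_cdf` is an illustrative choice, not something from the slides. Because it does not round z to two decimals the way a table lookup does, the result differs slightly in the last digit.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Mean and standard deviation of the sampling distribution of (x̄_M - x̄_F)
mu_diff = 71 - 65
sd_diff = math.sqrt(2.5**2 / 30 + 2.3**2 / 25)

# Standardize the value 5 and find the area to its left
z = (5 - mu_diff) / sd_diff
prob = normal_cdf(z)

print(f"sd of difference = {sd_diff:.3f}")
print(f"z = {z:.2f}, P(difference < 5) = {prob:.4f}")
```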
Notation - Comparing Two Means
                             Mean Value   Variance       Standard Deviation
Population or Treatment 1    $\mu_1$      $\sigma_1^2$   $\sigma_1$
Population or Treatment 2    $\mu_2$      $\sigma_2^2$   $\sigma_2$
More Notation - Comparing Two Means
                             Sample Size   Mean          Variance   Standard Deviation
Population or Treatment 1    $n_1$         $\bar{x}_1$   $s_1^2$    $s_1$
Population or Treatment 2    $n_2$         $\bar{x}_2$   $s_2^2$    $s_2$
Properties of the Sampling
Distribution of x1 – x2
If the random samples on which x1 and x2 are based are
selected independently of one another, then
1. $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_{\bar{x}_1} - \mu_{\bar{x}_2} = \mu_1 - \mu_2$
The sampling distribution of $\bar{x}_1 - \bar{x}_2$ is always centered at the value of $\mu_1 - \mu_2$, so $\bar{x}_1 - \bar{x}_2$ is an unbiased statistic for estimating $\mu_1 - \mu_2$.
2. $\sigma^2_{\bar{x}_1 - \bar{x}_2} = \sigma^2_{\bar{x}_1} + \sigma^2_{\bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$ and $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
The variance of the difference is the sum of the variances.
3. If $n_1$ and $n_2$ are both large or the population distributions are (at least approximately) normal, $\bar{x}_1$ and $\bar{x}_2$ each have (at least approximately) normal distributions. This implies that the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is also (approximately) normal.
The properties of the sampling distribution of $\bar{x}_1 - \bar{x}_2$ imply that $\bar{x}_1 - \bar{x}_2$ can be standardized to obtain a variable with a sampling distribution that is approximately the standard normal (z) distribution.
When two random samples are independently selected and $n_1$ and $n_2$ are both large or the population distributions are (at least approximately) normal, the distribution of
$$z = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
is described (at least approximately) by the standard normal (z) distribution.
We must know $\sigma_1$ and $\sigma_2$ in order to use this procedure. If $\sigma_1$ and $\sigma_2$ are unknown, we must use t distributions.
Two-Sample t Test for Comparing
Two Populations
Null Hypothesis: $H_0: \mu_1 - \mu_2 =$ hypothesized value
The hypothesized value is often 0, but there are times when we are interested in testing for a difference that is not 0.
Test Statistic:
$$t = \frac{\bar{x}_1 - \bar{x}_2 - \text{hypothesized value}}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
The appropriate df for the two-sample t test is
$$df = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}} \quad \text{where } V_1 = \frac{s_1^2}{n_1} \text{ and } V_2 = \frac{s_2^2}{n_2}$$
The computed number of df should be truncated to an integer.
A conservative estimate of the P-value can be found by using the t curve with the number of degrees of freedom equal to the smaller of $(n_1 - 1)$ or $(n_2 - 1)$.
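The df formula is easy to mistype by hand, so a small helper can be useful. The sketch below is illustrative only: the function name `welch_df` is an assumption, and the inputs are the summary statistics from the thread example later in this chapter (s = 5.62 and 6.31, n = 50 for each sample).

```python
import math

def welch_df(s1, n1, s2, n2):
    """Degrees of freedom for the two-sample t procedures (before truncation)."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

df_raw = welch_df(5.62, 50, 6.31, 50)
df_truncated = math.floor(df_raw)      # truncate to an integer, as the slide instructs
df_conservative = min(50 - 1, 50 - 1)  # smaller of (n1 - 1) and (n2 - 1)

print(f"df = {df_raw:.1f}, truncated to {df_truncated}")
print(f"conservative df = {df_conservative}")
```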
Two-Sample t Test for Comparing
Two Populations Continued . . .
Null Hypothesis: $H_0: \mu_1 - \mu_2 =$ hypothesized value
Alternative Hypothesis and P-value:
$H_a: \mu_1 - \mu_2 >$ hypothesized value. P-value: area under the appropriate t curve to the right of the computed t
$H_a: \mu_1 - \mu_2 <$ hypothesized value. P-value: area under the appropriate t curve to the left of the computed t
$H_a: \mu_1 - \mu_2 \neq$ hypothesized value. P-value: 2(area to the right of the computed t) if t is positive, or 2(area to the left of the computed t) if t is negative
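Another Way to Write Hypothesis Statements:
When the hypothesized value is 0, we can rewrite these hypothesis statements:
$H_0: \mu_1 - \mu_2 = 0$ becomes $H_0: \mu_1 = \mu_2$
$H_a: \mu_1 - \mu_2 < 0$ becomes $H_a: \mu_1 < \mu_2$
$H_a: \mu_1 - \mu_2 > 0$ becomes $H_a: \mu_1 > \mu_2$
$H_a: \mu_1 - \mu_2 \neq 0$ becomes $H_a: \mu_1 \neq \mu_2$
Be sure to define BOTH $\mu_1$ and $\mu_2$!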
Two-Sample t Test for Comparing
Two Populations Continued . . .
Assumptions:
1) The two samples are independently selected random
samples from the populations of interest
2) The sample sizes are large (generally 30 or larger) or
the population distributions are (at least
approximately) normal.
When comparing two treatment groups, use the following
assumptions:
1) Individuals or objects are randomly assigned to
treatments (or vice versa)
2) The sample sizes are large (generally 30 or larger) or
the treatment response distributions are
approximately normal.
Are women still paid less than men for comparable work? A study was carried out in which salary data were collected from a random sample of men and from a random sample of women who worked as purchasing managers and who were subscribers to Purchasing magazine. Annual salaries (in thousands of dollars) appear below (the actual sample sizes were much larger). Use $\alpha = .05$ to determine if there is convincing evidence that the mean annual salary for male purchasing managers is greater than the mean annual salary for female purchasing managers.
Men     81 69 81 76 76 74 69 76 79 65
Women   78 60 67 61 62 73 71 58 68 48
State the hypotheses:
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 > 0$
Where $\mu_1$ = mean annual salary for male purchasing managers and $\mu_2$ = mean annual salary for female purchasing managers
If we had defined $\mu_1$ as the mean salary for female purchasing managers and $\mu_2$ as the mean salary for male purchasing managers, then the correct alternative hypothesis would be that the difference in the means is less than 0.
Salary War Continued . . .
Verify the assumptions.
Assumptions:
1) Given two independently selected random samples of male and female purchasing managers. Even though these are samples from subscribers of Purchasing magazine, the authors of the study believed it was reasonable to view the samples as representative of the populations of interest.
2) Since the sample sizes are small, we must determine if it is plausible that the salary distributions for each of the two populations are approximately normal. Since the boxplots are reasonably symmetrical with no outliers, it is plausible that the population distributions are approximately normal.
[Figure: boxplots of annual salaries for men and women]
Salary War Continued . . .
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 > 0$
Where $\mu_1$ = mean annual salary for male purchasing managers and $\mu_2$ = mean annual salary for female purchasing managers
Compute the test statistic and P-value.
Test Statistic:
$$t = \frac{74.6 - 64.6 - 0}{\sqrt{\frac{5.4^2}{10} + \frac{8.62^2}{10}}} = 3.11$$
To find the P-value, first find the appropriate df.
$$df = \frac{(2.916 + 7.396)^2}{\frac{2.916^2}{9} + \frac{7.396^2}{9}} = 15.14 \rightarrow 15$$
Truncate (round down) this value.
Now find the area to the right of t = 3.11 in the t-curve with df = 15.
P-value = .004
$\alpha = .05$
Since the P-value < $\alpha$, we reject $H_0$. There is convincing evidence that the mean salary for male purchasing managers is higher than the mean salary for female purchasing managers.
What potential type of error could we have made with this conclusion? Type I
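The test statistic and df for the salary data can also be computed directly from the raw observations. This is a sketch under the assumption that only the Python standard library is available; the function name `two_sample_t` is illustrative. Working from unrounded sample variances, the df comes out near 15.1 and truncates to 15.

```python
import math
import statistics

men = [81, 69, 81, 76, 76, 74, 69, 76, 79, 65]
women = [78, 60, 67, 61, 62, 73, 71, 58, 68, 48]

def two_sample_t(sample1, sample2, hypothesized=0.0):
    """Return (t, df) for the two-sample t test without pooling variances."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1   # V1 = s1^2 / n1
    v2 = statistics.variance(sample2) / n2   # V2 = s2^2 / n2
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    t = (diff - hypothesized) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df

t_stat, df = two_sample_t(men, women)
print(f"t = {t_stat:.2f}, df = {df:.2f}, truncated df = {math.floor(df)}")
```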
Mean Fill Example
We would like to compare the mean fill of 32
ounce cans of beer from two adjacent filling
machines. Past experience has shown that
the population standard deviations of fills for
the two machines are known to be $\sigma_1 = 0.043$
and $\sigma_2 = 0.052$ respectively.
A sample of 35 cans from machine 1 gave a
mean of 16.031 and a sample of 31 cans from
machine 2 gave a mean of 16.009. State,
perform and interpret an appropriate
hypothesis test using the 0.05 level of
significance.
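Mean Fill Example Continued
$\mu_1$ = mean fill from machine 1
$\mu_2$ = mean fill from machine 2
$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 \neq 0$
Significance level: $\alpha = 0.05$
Test statistic:
$$z = \frac{\bar{x}_1 - \bar{x}_2 - \text{hypothesized value}}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$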
Mean Fill Example Continued
Since n1 and n2 are both large (> 30) we do not
have to make any assumptions about the
nature of the distributions of the fills.
This example is a bit of a stretch, since knowing
the population standard deviations (without
knowing the population means) is very unusual.
Accept this example for what it is, just a sample
of the calculation. Generally this statistic is used
when dealing with “what if” type of scenarios
and we will move on to another technique that
is somewhat more commonly used when $\sigma_1$ and
$\sigma_2$ are not known.
Mean Fill Example Continued
Calculation:
$$z = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{16.031 - 16.009}{\sqrt{\frac{0.043^2}{35} + \frac{0.052^2}{31}}} = 1.86$$
P-value:
P-value = 2P(z > 1.86) = 2P(z < -1.86) = 2(0.0314) = 0.0628
Mean Fill Example Continued
Since the P-value > a, we fail to reject H0.
There is not convincing evidence that the
two machines produce bottles with
different mean fills.
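The mean fill calculation can be reproduced in a few lines. This is a minimal sketch using the standard library only; `normal_cdf` is an illustrative helper. It keeps z unrounded, so the P-value lands very close to, but not exactly on, the table-based 0.0628.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Summary values from the mean fill example
xbar1, sigma1, n1 = 16.031, 0.043, 35
xbar2, sigma2, n2 = 16.009, 0.052, 31

z = (xbar1 - xbar2 - 0) / math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
p_value = 2 * (1 - normal_cdf(abs(z)))   # two-tailed alternative

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```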
Cold Medicine Example
In an attempt to determine if two competing brands of cold
medicine contain, on the average, the same amount of
acetaminophen, twelve different tablets from each of the
two competing brands were randomly selected and tested
for the amount of acetaminophen each contains. The
results (in milligrams) follow. Use a significance level of
0.01.
Brand A
517, 495, 503, 491
503, 493, 505, 495
498, 481, 499, 494
Brand B
493, 508, 513, 521
541, 533, 500, 515
536, 498, 515, 515
State and perform an appropriate hypothesis test.
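Cold Medicine Example Continued
$\mu_1$ = the mean amount of acetaminophen in cold tablet brand A
$\mu_2$ = the mean amount of acetaminophen in cold tablet brand B
$H_0: \mu_1 = \mu_2$ ($\mu_1 - \mu_2 = 0$)
$H_a: \mu_1 \neq \mu_2$ ($\mu_1 - \mu_2 \neq 0$)
Significance level: $\alpha = 0.01$
Test Statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2 - \text{hypothesized value}}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$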
Cold Medicine Example Continued
Assumptions: The samples were selected
independently and randomly. Since the
samples are not large, we need to be able to
assume that the populations (of amounts of
acetaminophen) are both normally distributed.
Cold Medicine Example Continued
Assumptions (continued):
As we can see from the normality plots and the
boxplots, the assumption that the underlying
distributions are normally distributed appears to be
quite reasonable.
Cold Medicine Example Continued
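Calculation:
$n_1 = 12,\ \bar{x}_1 = 497.83,\ s_1 = 8.830$
$n_2 = 12,\ \bar{x}_2 = 515.67,\ s_2 = 15.144$
$$t = \frac{\bar{x}_1 - \bar{x}_2 - 0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{497.83 - 515.67 - 0}{\sqrt{\frac{8.830^2}{12} + \frac{15.144^2}{12}}} = -3.52$$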
Cold Medicine Example Continued
Calculation:
$$V_1 = \frac{s_1^2}{n_1} = \frac{8.830^2}{12} = 6.4974$$
$$V_2 = \frac{s_2^2}{n_2} = \frac{15.144^2}{12} = 19.112$$
$$df = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}} = \frac{(6.4974 + 19.112)^2}{\frac{6.4974^2}{11} + \frac{19.112^2}{11}} = 17.7$$
We truncate the degrees of freedom to give df = 17.
Cold Medicine Example Continued
P-value: From the table of tail areas for t curve
(Table IV) we look up a t value of 3.5 with df = 17
to get 0.001. Since this is a two-tailed alternate
hypothesis, P-value = 2(0.001) = 0.002.
Conclusion: Since P-value = 0.002 < 0.01 = a,
H0 is rejected. The data provides strong
evidence that the mean amount of
acetaminophen is not the same for both brands.
Specifically, there is strong evidence that the
average amount per tablet for brand A is less
than that for brand B.
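A table lookup rounds t to one decimal place, so it can be instructive to compute the tail area numerically. The sketch below is one possible approach, not the method used on the slides: it integrates the t density with Simpson's rule using only the standard library. The function names `t_pdf` and `t_upper_tail` are illustrative. The numeric two-tailed P-value is slightly larger than the table-based 0.002 but still well below 0.01, so the conclusion is the same.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Area to the right of |t|: 0.5 minus the Simpson's-rule integral on [0, |t|]."""
    b = abs(t)
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t_stat, df = -3.52, 17                 # values from the cold medicine example
p_two_sided = 2 * t_upper_tail(t_stat, df)

print(f"two-tailed P-value = {p_two_sided:.4f}")
print("reject H0" if p_two_sided < 0.01 else "fail to reject H0")
```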
The Two-Sample t Confidence Interval for the
Difference Between Two Population or Treatment
Means
The general formula for a confidence interval for $\mu_1 - \mu_2$ when
1) The two samples are independently selected random samples from the populations of interest
2) The sample sizes are large (generally 30 or larger) or the population distributions are (at least approximately) normal.
is
$$(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value})\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
The t critical value is based on
$$df = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}} \quad \text{where } V_1 = \frac{s_1^2}{n_1} \text{ and } V_2 = \frac{s_2^2}{n_2}$$
df should be truncated to an integer.
For a comparison of two treatments, use the following assumptions:
1) Individuals or objects are randomly assigned to treatments (or vice versa)
2) The sample sizes are large (generally 30 or larger) or the treatment response distributions are approximately normal.
In a study on food intake after sleep deprivation,
men were randomly assigned to one of two
treatment groups. The experimental group was required to
sleep only 4 hours on each of two nights, while the control
group was required to sleep 8 hours on each of two nights.
The amount of food intake (Kcal) on the day following the two
nights of sleep was measured. Compute a 95% confidence
interval for the true difference in the mean food intake for the
two sleeping conditions.
4-hour sleep: 3585 4470 3068 5338 2221 4791 4435 3187 3901 3868 3869 4878 3632 4518 3099
8-hour sleep: 4965 3918 1987 4993 5220 3653 3510 4100 5792 4547 3319 3336 4304 4057 3338
Find the mean and standard deviation for each treatment.
$\bar{x}_4 = 3924$, $s_4 = 829.67$
$\bar{x}_8 = 4069.27$, $s_8 = 952.90$
Food Intake Study Continued . . .
Verify the assumptions.
Assumptions:
1) Men were randomly assigned to two treatment groups
2) The assumption of normal response distributions is plausible because both boxplots are approximately symmetrical with no outliers.
[Figure: boxplots of food intake for the 4-hour and 8-hour sleep groups]
Food Intake Study Continued . . .
$\bar{x}_4 = 3924$, $s_4 = 829.67$
$\bar{x}_8 = 4069.27$, $s_8 = 952.90$
Calculate the interval.
$$(3924 - 4069.27) \pm 2.05\sqrt{\frac{829.67^2}{15} + \frac{952.90^2}{15}} = (-814.0,\ 523.5)$$
Interpret the interval in context.
We are 95% confident that the true difference in the mean food intake for the two sleeping conditions is between -814.0 Kcal and 523.5 Kcal.
Based upon this interval, is there a significant difference in the mean food intake for the two sleeping conditions?
No, since 0 is in the confidence interval, there is not convincing evidence that the mean food intakes for the two sleep conditions are different.
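The interval arithmetic for the food intake study can be sketched as follows. The function name `two_sample_ci` is illustrative, and the t critical value 2.05 is taken from the slide rather than computed, since the standard library has no t quantile function.

```python
import math

def two_sample_ci(xbar1, s1, n1, xbar2, s2, n2, t_crit):
    """Two-sample t confidence interval for mu1 - mu2 from summary statistics."""
    margin = t_crit * math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - margin, diff + margin

# 4-hour sleep group first, 8-hour sleep group second
lower, upper = two_sample_ci(3924, 829.67, 15, 4069.27, 952.90, 15, t_crit=2.05)

print(f"95% CI: ({lower:.1f}, {upper:.1f})")
print("0 is inside the interval" if lower < 0 < upper else "0 is outside the interval")
```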
Thread Example
Two kinds of thread are being compared for
strength. Fifty pieces of each type of thread
are tested under similar conditions. The
sample data is given in the following table.
Construct a 98% confidence interval for the
difference of the population means.
            Sample Size   Sample Mean   Sample Standard Deviation
Thread A    50            78.3          5.62
Thread B    50            87.2          6.31
Thread Example Continued
$$V_1 = \frac{s_1^2}{n_1} = \frac{5.62^2}{50} = 0.632$$
$$V_2 = \frac{s_2^2}{n_2} = \frac{6.31^2}{50} = 0.796$$
$$df = \frac{(0.632 + 0.796)^2}{\frac{0.632^2}{49} + \frac{0.796^2}{49}} = 96.7$$
Truncating, we have df = 96.
Thread Example Continued
Looking on the table of t critical values (table
III) under 98% confidence level for
df = 96, (we take the closest value for df ,
specifically df = 120) and have the t critical
value = 2.36.
$$(78.3 - 87.2) \pm 2.36\sqrt{\frac{5.62^2}{50} + \frac{6.31^2}{50}}$$
$$-8.9 \pm 2.82$$
The 98% confidence interval estimate for the difference of the means of the tensile strengths is (-11.72, -6.08)
Octane Example
A student recorded the mileage he obtained
while commuting to school in his car. He
kept track of the mileage for twelve different
tanksful of fuel, involving gasoline of two
different octane ratings. Compute the 95%
confidence interval for the difference of
mean mileages. His data follow:
87 Octane
26.4, 27.6, 29.7
28.9, 29.3, 28.8
90 Octane
30.5, 30.9, 29.2
31.7, 32.8, 29.3
Octane Example Continued
Let the 87 octane fuel be the first group and the 90
octane fuel the second group, so we have
n1 = n2 = 6 and
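$\bar{x}_1 = 28.45$, $s_1 = 1.228$, $\bar{x}_2 = 30.73$, $s_2 = 1.392$
$$V_1 = \frac{s_1^2}{n_1} = \frac{1.228^2}{6} = 0.2512$$
$$V_2 = \frac{s_2^2}{n_2} = \frac{1.392^2}{6} = 0.3231$$
$$df = \frac{(0.2512 + 0.3231)^2}{\frac{0.2512^2}{5} + \frac{0.3231^2}{5}} = 9.8$$
Truncating, we have df = 9.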
Octane Example Continued
Looking on the table under 95% with 9 degrees of freedom, the critical value of t is 2.26.
$$(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value})\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
$$(28.45 - 30.73) \pm 2.26\sqrt{\frac{1.228^2}{6} + \frac{1.392^2}{6}}$$
$$-2.28 \pm 1.71$$
The 95% confidence interval for the true difference of the mean mileages is (-3.99, -0.57).
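The whole octane calculation, from raw mileages to the interval, can be sketched as below. The function name `two_sample_ci_from_data` is illustrative, and the critical value 2.26 is the table value quoted above. Because the code carries unrounded means and standard deviations, the endpoints may differ from the hand calculation in the second decimal place.

```python
import math
import statistics

octane_87 = [26.4, 27.6, 29.7, 28.9, 29.3, 28.8]
octane_90 = [30.5, 30.9, 29.2, 31.7, 32.8, 29.3]

def two_sample_ci_from_data(sample1, sample2, t_crit):
    """Return (lower, upper, df) for the two-sample t interval for mu1 - mu2."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1) / n1
    v2 = statistics.variance(sample2) / n2
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    margin = t_crit * math.sqrt(v1 + v2)
    return diff - margin, diff + margin, df

lower, upper, df = two_sample_ci_from_data(octane_87, octane_90, t_crit=2.26)
print(f"df = {df:.1f} (truncated to {math.floor(df)})")
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```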
Octane Example Continued
Comments: We had to assume that the samples
were independent and random and that the
underlying populations were normally
distributed since the sample sizes were small.
If we randomized the order of the tankfuls of
the two different types of gasoline we can
reasonably assume that the samples were
random and independent. By using all of the
observations from one car we are simply
controlling the effects of other variables such as
year, model, weight, etc.
Octane Example Continued
By looking at the following normality plots, we see
that the assumption of normality for each of the two
populations of mileages appears reasonable.
Given the small sample sizes, the assumption of normality
is very important, so one would be a bit careful utilizing
this result.
Pooled t Test
• Used when the variances of the two populations are equal ($\sigma_1 = \sigma_2$)
• Combines information from both samples to create a “pooled” estimate of the common variance which is used in place of the two sample standard deviations
• Is not widely used due to its sensitivity to any departure from the equal variance assumption
P-values computed using the pooled t procedure can be far from the actual P-value if the population variances are not equal.
When the population variances are equal, the pooled t procedure is better at detecting deviations from $H_0$ than the two-sample t test.
Section 11.2
Difference in Means with Paired
Samples
Suppose that an investigator wants to determine if
regular aerobic exercise improves blood pressure. A
random sample of people who jog regularly and a
second random sample of people who do not
exercise regularly are selected independently of one
another.
Can we conclude that the difference in mean blood
pressure is attributed to jogging?
What about other factors like weight?
One way to avoid these difficulties
would be to pair subjects by weight
then assign one of the pairs to jogging
and the other to no exercise.
Summary of the Paired t test for Comparing
Two Population or Treatment Means
Null Hypothesis: $H_0: \mu_d =$ hypothesized value
Where $\mu_d$ is the mean of the differences in the paired observations. The hypothesized value is usually 0, meaning that there is no difference.
Test Statistic:
$$t = \frac{\bar{x}_d - \text{hypothesized value}}{s_d / \sqrt{n}}$$
Where n is the number of sample differences and $\bar{x}_d$ and $s_d$ are the mean and standard deviation of the sample differences. This test is based on df = n - 1.
Alternative Hypothesis and P-value:
$H_a: \mu_d >$ hypothesized value. P-value: area to the right of calculated t
$H_a: \mu_d <$ hypothesized value. P-value: area to the left of calculated t
$H_a: \mu_d \neq$ hypothesized value. P-value: 2(area to the right of t) if t is positive, or 2(area to the left of t) if t is negative
Summary of the Paired t test for Comparing
Two Population or Treatment Means Continued
Assumptions:
1. The samples are paired.
2. The n sample differences can be viewed as a
random sample from a population of differences.
3. The number of sample differences is large
(generally 30 or more) or the population
distribution of differences is (at least
approximately) normal.
Is this an example of paired samples?
An engineering association wants to see if there
is a difference in the mean annual salary for
electrical engineers and chemical engineers. A
random sample of electrical engineers is
surveyed about their annual income. Another
random sample of chemical engineers is
surveyed about their annual income.
No, there is no pairing of
individuals, you have two
independent samples
Is this an example of paired samples?
A pharmaceutical company wants to test its
new weight-loss drug. Before giving the drug to
volunteers, company researchers weigh each
person. After a month of using the drug, each
person’s weight is measured again.
Yes, you have two observations on
each individual, resulting in paired
data.
Can playing chess improve your memory? In a study, students who had not previously played chess participated in a program in which they took chess lessons and played chess daily for 9 months. Each student took a memory test before starting the chess program and again at the end of the 9-month period. Test the claim at the 0.05 level of significance.
Student       1    2    3    4    5    6    7    8    9   10   11   12
Pre-test    510  610  640  675  600  550  610  625  450  720  575  675
Post-test   850  790  850  775  700  775  700  850  690  775  540  680
Difference -340 -180 -210 -100 -100 -225  -90 -225 -240  -55   35   -5
First, find the differences: pre-test minus post-test.
State the hypotheses.
$H_0: \mu_d = 0$
$H_a: \mu_d < 0$
Where $\mu_d$ is the mean memory score difference between students with no chess training and students who have completed chess training
If we had subtracted Post-test minus Pre-test, then the alternative hypothesis would be that the mean difference is greater than 0.
Playing Chess Continued . . .
Verify assumptions.
Assumptions:
1) Although the sample of students is not a random sample, the investigator believed that it was reasonable to view the 12 sample differences as representative of all such differences.
2) A boxplot of the differences is approximately symmetrical with no outliers so the assumption of normality is plausible.
Playing Chess Continued . . .
$H_0: \mu_d = 0$
$H_a: \mu_d < 0$
Where $\mu_d$ is the mean memory score difference between students with no chess training and students who have completed chess training
Compute the test statistic and P-value.
Test Statistic:
$$t = \frac{-144.6 - 0}{109.74 / \sqrt{12}} = -4.56$$
P-value ≈ 0
df = 11
$\alpha = .05$
State the conclusion in context.
Since the P-value < $\alpha$, we reject $H_0$. There is convincing evidence to suggest that the mean memory score after chess training is higher than the mean memory score before training.
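The paired t statistic for the chess data can be computed from the pre-test and post-test scores directly. This is a minimal sketch with the standard library; the function name `paired_t` is illustrative.

```python
import math
import statistics

pre = [510, 610, 640, 675, 600, 550, 610, 625, 450, 720, 575, 675]
post = [850, 790, 850, 775, 700, 775, 700, 850, 690, 775, 540, 680]

def paired_t(first, second, hypothesized=0.0):
    """Return (mean of differences, sd of differences, t, df) for paired data."""
    diffs = [a - b for a, b in zip(first, second)]   # first minus second
    n = len(diffs)
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    t = (mean_d - hypothesized) / (sd_d / math.sqrt(n))
    return mean_d, sd_d, t, n - 1

mean_d, sd_d, t_stat, df = paired_t(pre, post)   # pre-test minus post-test
print(f"mean difference = {mean_d:.1f}, sd = {sd_d:.2f}")
print(f"t = {t_stat:.2f}, df = {df}")
```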
Paired t Confidence Interval for $\mu_d$
When
1. The samples are paired.
2. The n sample differences can be viewed as a random sample from a population of differences.
3. The number of sample differences is large (generally 30 or more) or the population distribution of differences is (at least approximately) normal.
the paired t interval for $\mu_d$ is
$$\bar{x}_d \pm (t \text{ critical value})\left(\frac{s_d}{\sqrt{n}}\right)$$
Where df = n - 1
Playing Chess Revisited . . .
Compute a 90% confidence interval for the mean difference in memory scores before chess training and the memory scores after chess training.
$$-144.6 \pm 1.796\left(\frac{109.74}{\sqrt{12}}\right) = (-201.5,\ -87.69)$$
We are 90% confident that the true mean difference in memory scores before chess training and the memory scores after chess training is between -201.5 and -87.69.
Weight Loss Example
A weight reduction center advertises that
participants in its program lose an average of
at least 5 pounds during the first week of the
participation. Because of numerous
complaints, the state’s consumer protection
agency doubts this claim. To test the claim
at the 0.05 level of significance, 12
participants were randomly selected. Their
initial weights and their weights after 1 week
in the program appear on the next slide. Set
up and perform an appropriate hypothesis
test.
Weight Loss Example Continued
Member   Initial Weight   One Week Weight
1        195              195
2        153              151
3        174              170
4        125              123
5        149              144
6        152              149
7        135              131
8        143              147
9        139              138
10       198              192
11       215              211
12       153              152
Weight Loss Example Continued
Member   Initial Weight   One Week Weight   Difference (Initial - 1 week)
1        195              195               0
2        153              151               2
3        174              170               4
4        125              123               2
5        149              144               5
6        152              149               3
7        135              131               4
8        143              147               -4
9        139              138               1
10       198              192               6
11       215              211               4
12       153              152               1
Weight Loss Example Continued
$\mu_d$ = mean of the individual weight changes (initial weight - weight after one week)
This is equivalent to the difference of means: $\mu_d = \mu_1 - \mu_2$ = mean initial weight - mean 1 week weight
$H_0: \mu_d = 5$
$H_a: \mu_d < 5$
Significance level: $\alpha = 0.05$
Test statistic:
$$t = \frac{\bar{x}_d - \text{hypothesized value}}{s_d / \sqrt{n}} = \frac{\bar{x}_d - 5}{s_d / \sqrt{n}}$$
Weight Loss Example Continued
Assumptions: According to the statement of the example, we can assume that the sampling is random. The sample size (12) is small. The boxplot shows one outlier, but nevertheless the distribution is reasonably symmetric, and the normal plot confirms that it is reasonable to assume that the population of differences (weight losses) is normally distributed.
Weight Loss Example Continued
Calculations:
$n = 12,\ \bar{x}_d = 2.333,\ s_d = 2.674$
$$t = \frac{\bar{x}_d - 5}{s_d / \sqrt{n}} = \frac{2.333 - 5}{2.674 / \sqrt{12}} = -3.45$$
P-value: This is a lower tail test, so looking up the t value of 3.5 under df = 11 in the table of tail areas for t curves (table IV) we find that the P-value = 0.002.
Weight Loss Example Continued
Conclusions: Since P-value = 0.002 < 0.05 = a,
we reject H0.
We draw the following conclusion.
There is strong evidence that the mean weight
loss for those who took the program for one
week is less than 5 pounds.
Weight Loss Example Continued
Minitab returns the following when asked to perform this test.
Paired T-Test and CI: Initial, One week
Paired T for Initial - One week
              N     Mean    StDev   SE Mean
Initial      12   160.92    28.19      8.14
One week     12   158.58    27.49      7.93
Difference   12    2.333    2.674     0.772
95% upper bound for mean difference: 3.720
T-Test of mean difference = 5 (vs < 5): T-Value = -3.45  P-Value = 0.003
This is substantially the same result.
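The T-Value and the 95% upper bound in the Minitab output can be reproduced from the twelve differences. This is a sketch using the standard library only. The one-sided critical value 1.796 for df = 11 is an assumption taken from a t table (it is the same value used for the 90% two-sided paired interval earlier in this section), not something computed here.

```python
import math
import statistics

initial = [195, 153, 174, 125, 149, 152, 135, 143, 139, 198, 215, 153]
one_week = [195, 151, 170, 123, 144, 149, 131, 147, 138, 192, 211, 152]

diffs = [a - b for a, b in zip(initial, one_week)]   # initial minus one week
n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)
se_mean = sd_d / math.sqrt(n)

t_stat = (mean_d - 5) / se_mean          # H0: mu_d = 5 versus Ha: mu_d < 5
upper_bound = mean_d + 1.796 * se_mean   # 95% one-sided upper bound, df = 11

print(f"mean = {mean_d:.3f}, sd = {sd_d:.3f}, SE = {se_mean:.3f}")
print(f"t = {t_stat:.2f}, 95% upper bound = {upper_bound:.3f}")
```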
Section 11.3
Large-Sample Inferences
Concerning the Difference
Between Two Population or
Treatment Proportions
Some people seem to think that duct tape
can fix anything . . . even remove warts!
Investigators at Madigan Army Medical Center
tested using duct tape to remove warts versus the
more traditional freezing treatment.
Suppose that the duct tape treatment will
successfully remove 50% of warts and that the
traditional freezing treatment will successfully
remove 60% of warts.
Let's investigate the sampling distribution of $\hat{p}_{freeze} - \hat{p}_{tape}$
$p_{freeze}$ = the true proportion of warts that are successfully removed by freezing
$p_{tape}$ = the true proportion of warts that are successfully removed by using duct tape
$p_{freeze} = .6$, $p_{tape} = .5$
Suppose we repeatedly treated 100 warts using the traditional freezing treatment and calculated the proportion of warts that are successfully removed. We would have the sampling distribution of $\hat{p}_{freeze}$, centered at .6 with
$$\sigma_{\hat{p}_{freeze}} = \sqrt{\frac{.6(.4)}{100}}$$
Suppose we repeatedly treated 100 warts using the duct tape method and calculated the proportion of warts that are successfully removed. We would have the sampling distribution of $\hat{p}_{tape}$, centered at .5 with
$$\sigma_{\hat{p}_{tape}} = \sqrt{\frac{.5(.5)}{100}}$$
Randomly take one of the sample proportions for the freezing treatment and one of the sample proportions for the duct tape treatment and find the difference. Doing this repeatedly, we will create the sampling distribution of $(\hat{p}_{freeze} - \hat{p}_{tape})$, centered at .1 with
$$\sigma_{\hat{p}_{freeze} - \hat{p}_{tape}} = \sqrt{\frac{.6(.4)}{100} + \frac{.5(.5)}{100}}$$
Properties of the Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
If two random samples are selected independently of one another, the following properties hold:
1. $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$
This says that the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is centered at $p_1 - p_2$, so $\hat{p}_1 - \hat{p}_2$ is an unbiased statistic for estimating $p_1 - p_2$.
2. $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$
3. If both $n_1$ and $n_2$ are large (that is, if $n_1 p_1 > 10$, $n_1(1 - p_1) > 10$, $n_2 p_2 > 10$, and $n_2(1 - p_2) > 10$), then $\hat{p}_1$ and $\hat{p}_2$ each have a sampling distribution that is approximately normal, and their difference $\hat{p}_1 - \hat{p}_2$ also has a sampling distribution that is approximately normal.
When performing a hypothesis test, we will use the null hypothesis that $p_1$ and $p_2$ are equal. We will not know the common value for $p_1$ and $p_2$.
Since the values of $p_1$ and $p_2$ are unknown, we will combine $\hat{p}_1$ and $\hat{p}_2$ to estimate the common value of $p_1$ and $p_2$. Use:
$$\hat{p}_c = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$
Summary of Large-Sample z Test
for p1 – p2 = 0
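Null Hypothesis: $H_0: p_1 - p_2 = 0$
Test Statistic:
$$z = \frac{\hat{p}_1 - \hat{p}_2 - (p_1 - p_2)}{\sqrt{\frac{\hat{p}_c(1 - \hat{p}_c)}{n_1} + \frac{\hat{p}_c(1 - \hat{p}_c)}{n_2}}}$$
where $p_1 - p_2 = 0$ under $H_0$. Use:
$$\hat{p}_c = \frac{n_1 \hat{p}_1 + n_2 \hat{p}_2}{n_1 + n_2}$$
Alternative Hypothesis and P-value:
$H_a: p_1 - p_2 > 0$. P-value: area to the right of calculated z
$H_a: p_1 - p_2 < 0$. P-value: area to the left of calculated z
$H_a: p_1 - p_2 \neq 0$. P-value: 2(area to the right of z) if z is positive, or 2(area to the left of z) if z is negative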
Another Way to Write Hypothesis Statements:
$H_0: p_1 - p_2 = 0$ becomes $H_0: p_1 = p_2$
$H_a: p_1 - p_2 > 0$ becomes $H_a: p_1 > p_2$
$H_a: p_1 - p_2 < 0$ becomes $H_a: p_1 < p_2$
$H_a: p_1 - p_2 \neq 0$ becomes $H_a: p_1 \neq p_2$
Be sure to define both $p_1$ and $p_2$!
Summary of Large-Sample z Test
for p1 – p2 = 0 Continued
Assumptions:
1) The samples are independently chosen random samples or treatments were assigned at random to individuals or objects.
2) Both sample sizes are large:
$n_1 \hat{p}_1 > 10$, $n_1(1 - \hat{p}_1) > 10$, $n_2 \hat{p}_2 > 10$, $n_2(1 - \hat{p}_2) > 10$
Since $p_1$ and $p_2$ are unknown we must use $\hat{p}_1$ and $\hat{p}_2$ to verify that the samples are large enough.
Investigators at Madigan Army Medical Center tested using
duct tape to remove warts. Patients with warts were
randomly assigned to either the duct tape treatment or to the
more traditional freezing treatment. Those in the duct tape
group wore duct tape over the wart for 6 days, then removed
the tape, soaked the area in water, and used an emery board
to scrape the area. This process was repeated for a
maximum of 2 months or until the wart was gone. The data
follows:
Treatment                  n     Number with wart successfully removed
Liquid nitrogen freezing   100   60
Duct tape                  104   88
Do these data suggest that freezing is less successful than duct tape in removing warts?
Duct Tape Continued . . .
$H_0: p_1 - p_2 = 0$
$H_a: p_1 - p_2 < 0$
Where $p_1$ is the true proportion of warts that would be successfully removed by freezing and $p_2$ is the true proportion of warts that would be successfully removed by duct tape
Assumptions:
1) Subjects were randomly assigned to the two treatments.
2) The sample sizes are large enough because:
$n_1 \hat{p}_1 = 100(60/100) = 60 > 10$
$n_1(1 - \hat{p}_1) = 100(40/100) = 40 > 10$
$n_2 \hat{p}_2 = 104(88/104) = 88 > 10$
$n_2(1 - \hat{p}_2) = 104(16/104) = 16 > 10$
Duct Tape Continued . . .
$H_0: p_1 - p_2 = 0$
$H_a: p_1 - p_2 < 0$
$$\hat{p}_c = \frac{60 + 88}{100 + 104} = 0.73$$
$$z = \frac{.60 - .85 - 0}{\sqrt{\frac{.73(.27)}{100} + \frac{.73(.27)}{104}}} \approx -4.02$$
P-value ≈ 0
$\alpha = .01$
Since the P-value < $\alpha$, we reject $H_0$. There is convincing evidence to suggest the proportion of warts successfully removed is lower for freezing than for the duct tape treatment.
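The large-sample z test for two proportions can be sketched from the raw counts. The function names `two_proportion_z` and `normal_cdf` are illustrative. Working from unrounded proportions, the statistic comes out near -3.94; rounding the proportions to two decimals first, as a hand calculation would, gives a slightly larger magnitude. Either way the P-value is essentially 0.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def two_proportion_z(successes1, n1, successes2, n2):
    """Return (pooled proportion, z) for H0: p1 - p2 = 0."""
    p1_hat = successes1 / n1
    p2_hat = successes2 / n2
    p_c = (successes1 + successes2) / (n1 + n2)   # combined estimate of the common p
    se = math.sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
    return p_c, (p1_hat - p2_hat) / se

# Freezing is treatment 1, duct tape is treatment 2
p_c, z = two_proportion_z(60, 100, 88, 104)
p_value = normal_cdf(z)   # lower-tailed alternative: p1 - p2 < 0

print(f"pooled proportion = {p_c:.4f}, z = {z:.2f}, P-value = {p_value:.5f}")
```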
Student Retention Example
A group of college students was asked what they
thought was the “issue of the day.” Without a pause, the
class almost to a person said “student retention.” The
class then went out and obtained a random sample
(questionable) and asked the question, “Do you plan
on returning next year?”
The responses along with the gender of the person
responding are summarized in the following table.
Gender    Response
          Yes    No    Maybe
Male      211    45    19
Female    141    32    9
Test to see if the proportion of students planning on returning is
the same for both genders at the 0.05 level of significance.
Student Retention Example Continued
p1 = true proportion of males who plan on returning
p2 = true proportion of females who plan on returning
n1 = number of males surveyed
n2 = number of females surveyed
p̂1 = sample proportion of males who plan on returning
p̂2 = sample proportion of females who plan on returning
Null hypothesis: H0: p1 – p2 = 0
Alternate hypothesis: Ha: p1 – p2 ≠ 0
Student Retention Example Continued
Significance level: α = 0.05
Test statistic:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_c(1-\hat{p}_c)}{n_1} + \dfrac{\hat{p}_c(1-\hat{p}_c)}{n_2}}}$$
Assumptions: The two samples are independently
chosen random samples. Furthermore, the sample sizes
are large enough since
n1p̂1 = 211 ≥ 10, n1(1 - p̂1) = 64 ≥ 10
n2p̂2 = 141 ≥ 10, n2(1 - p̂2) = 41 ≥ 10
Student Retention Example Continued
Calculations:
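$$\hat{p}_c = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2} = \frac{211 + 141}{275 + 182} = \frac{352}{457} = 0.7702$$
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_c(1-\hat{p}_c)}{275} + \dfrac{\hat{p}_c(1-\hat{p}_c)}{182}}} = \frac{0.76727 - 0.77473}{\sqrt{\dfrac{0.77024(1-0.77024)}{275} + \dfrac{0.77024(1-0.77024)}{182}}} = \frac{-0.0074525}{0.040198} \approx -0.19$$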
Student Retention Example Continued
P-value:
The P-value for this test is 2 times the area
under the z curve to the left of the computed
z = -0.19.
P-value = 2(0.4247) = 0.8494
Conclusion:
Since P-value = 0.849 > 0.05 = α, the hypothesis H0 is
not rejected at significance level 0.05.
There is not convincing evidence that the return rate is
different for males and females.
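The student retention calculation can be retraced directly from the response table. The sketch below is illustrative only (the variable names are assumptions): each sample size is the row total, "Yes" is treated as the success category, and the two-sided P-value doubles the tail area beyond |z|.

```python
from math import sqrt
from statistics import NormalDist

# Response counts from the table: (Yes, No, Maybe)
counts = {"Male": (211, 45, 19), "Female": (141, 32, 9)}

n1, n2 = sum(counts["Male"]), sum(counts["Female"])     # 275 and 182
yes1, yes2 = counts["Male"][0], counts["Female"][0]     # 211 and 141

p1_hat, p2_hat = yes1 / n1, yes2 / n2
p_c = (yes1 + yes2) / (n1 + n2)                         # combined estimate
z = (p1_hat - p2_hat) / sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))

# Two-sided alternative: double the tail area beyond |z|
p_value = 2 * NormalDist().cdf(-abs(z))
print(f"n1 = {n1}, n2 = {n2}, z = {z:.2f}, P-value = {p_value:.3f}")
```

Because the code does not round z to -0.19 before finding the tail area, its P-value (roughly 0.85) differs slightly in the third decimal place from the table-based 0.8494. The conclusion is the same.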
Washing Machine Example
A consumer agency spokesman stated that he
thought that the proportion of households having
a washing machine was higher for suburban
households than for urban households. To test
whether that statement was correct at the 0.05 level
of significance, a reporter randomly selected a
number of households in both suburban and
urban environments and obtained the following
data.
            Number      Number having       Proportion having
            surveyed    washing machines    washing machines
Suburban    300         243                 0.810
Urban       250         181                 0.724
Washing Machine Example Continued
p1 = proportion of suburban households having
washing machines
p2 = proportion of urban households having washing
machines
p1 - p2 is the difference between the proportions of
suburban households and urban households that
have washing machines.
H0: p1 - p2 = 0
Ha: p1 - p2 > 0
Washing Machine Example Continued
Significance level: α = 0.05
Test statistic:
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_c(1-\hat{p}_c)}{n_1} + \dfrac{\hat{p}_c(1-\hat{p}_c)}{n_2}}}$$
Assumptions: The two samples are independently
chosen random samples. Furthermore, the sample sizes
are large enough since
n1p̂1 = 243 ≥ 10, n1(1 - p̂1) = 57 ≥ 10
n2p̂2 = 181 ≥ 10, n2(1 - p̂2) = 69 ≥ 10
Washing Machine Example Continued
Calculations:
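$$\hat{p}_c = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2} = \frac{243 + 181}{300 + 250} = \frac{424}{550} = 0.7709$$
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_c(1-\hat{p}_c)}{n_1} + \dfrac{\hat{p}_c(1-\hat{p}_c)}{n_2}}} = \frac{0.810 - 0.724}{\sqrt{0.7709(1 - 0.7709)\left(\dfrac{1}{300} + \dfrac{1}{250}\right)}} = 2.390$$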
Washing Machine Example Continued
P-value:
The P-value for this test is the area under the z
curve to the right of the computed z = 2.39.
The P-value = 1 - 0.9916 = 0.0084
Conclusion:
Since P-value = 0.0084 < 0.05 = α, the hypothesis H0 is
rejected at significance level 0.05. There is sufficient
evidence at the 0.05 level of significance that the
proportion of suburban households that have washers is
greater than the proportion of urban households that have
washers.
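In the washing machine calculation, the standard error is written with the common factor p̂c(1 – p̂c) pulled out. The short worked check below is added for illustration (the intermediate values are rounded). It shows that the factored form matches the two-term form in the test statistic:

```latex
\begin{align*}
\sqrt{\frac{\hat{p}_c(1-\hat{p}_c)}{n_1} + \frac{\hat{p}_c(1-\hat{p}_c)}{n_2}}
  &= \sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} \\
  &= \sqrt{0.7709(0.2291)\left(\frac{1}{300} + \frac{1}{250}\right)} \\
  &\approx \sqrt{(0.17661)(0.007333)} \approx 0.0360, \\
z &= \frac{0.810 - 0.724}{0.0360} \approx 2.39 .
\end{align*}
```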
A Large-Sample Confidence
Interval for p1 – p2
When
1) The samples are independently chosen random samples or
treatments were assigned at random to individuals or objects
2) Both sample sizes are large:
n1p̂1 > 10, n1(1 - p̂1) > 10, n2p̂2 > 10, n2(1 - p̂2) > 10
a large-sample confidence interval for p1 – p2 is calculated by:
$$(\hat{p}_1 - \hat{p}_2) \pm (z \text{ critical value})\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
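The interval formula translates almost line for line into code. The sketch below is illustrative only; the function name `two_proportion_ci` and its parameter names are assumptions. Unlike the test statistic, the interval uses the two sample proportions separately and not a combined estimate.

```python
from math import sqrt

def two_proportion_ci(p1_hat, n1, p2_hat, n2, z_crit):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error).

    p1_hat, p2_hat are sample proportions; n1, n2 are sample sizes;
    z_crit is the z critical value for the chosen confidence level.
    Returns (lower, upper).
    """
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    margin = z_crit * se          # half-width of the interval
    return diff - margin, diff + margin
```

The function returns the lower and upper endpoints as a pair. The caller supplies the z critical value, for example 1.645 for 90% confidence.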
The article “Freedom of What?” (Associated Press, February
1, 2005) described a study in which high school students and
high school teachers were asked whether they agreed with
the following statement: “Students should be allowed to
report controversial issues in their student newspapers
without the approval of school authorities.” It was reported
that 58% of students surveyed and 39% of teachers
surveyed agreed with the statement. The two samples –
10,000 high school students and 8000 high school teachers
– were selected from schools across the country.
Compute a 90% confidence interval for the difference in
proportion of students who agreed with the statement
and the proportion of teachers who agreed with the
statement.
Newspaper Problem Continued . . .
p̂1 = .58
p̂2 = .39
1) Assume that it is reasonable to regard these two samples as
being independently selected and representative of the populations
of interest.
2) Both sample sizes are large enough:
n1p̂1 = 10000(.58) > 10, n1(1 – p̂1) = 10000(.42) > 10,
n2p̂2 = 8000(.39) > 10, n2(1 – p̂2) = 8000(.61) > 10
$$(.58 - .39) \pm 1.645\sqrt{\frac{.58(.42)}{10000} + \frac{.39(.61)}{8000}} = (.178, .202)$$
We are 90% confident that the difference between the proportion
of students who agreed with the statement and the
proportion of teachers who agreed with the statement
is between .178 and .202.
Based on this confidence interval, does there appear to be a
significant difference in proportion of students who agreed with the
statement and the proportion of teachers who agreed with the
statement? Explain.
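One common way to use such an interval to judge whether the two proportions differ is to check whether 0 is among the plausible values for p1 – p2. The sketch below is illustrative only (the helper name `excludes_zero` is an assumption). It relies on the usual correspondence between a 90% interval and a two-sided test at the 0.10 level.

```python
def excludes_zero(lower, upper):
    """True when 0 is not inside the interval (lower, upper)."""
    return lower > 0 or upper < 0

# Interval for p1 - p2 from the newspaper problem
print(excludes_zero(0.178, 0.202))
```

When the interval excludes 0, a difference of 0 between the two population proportions is not a plausible value at that confidence level.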
Survey Example
A student assignment called for the students to survey
both male and female students (independently and
randomly chosen) to see whether the proportions that
approve of the College’s new drug and alcohol policy differ.
A student went and randomly selected 200 male
students and 100 female students and obtained the
data summarized below.
          Number      Number that    Proportion
          surveyed    approve        that approve
Female    100         43             0.430
Male      200         61             0.305
Use this data to obtain a 90% confidence interval estimate
for the difference of the proportions of female and male
students that approve of the new policy.
Survey Example Continued
For a 90% confidence interval the z value to use is
1.645. This value is obtained from the bottom row of
the table of t critical values (Table III).
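As an alternative to reading the value from a table, the z critical value can be computed from the standard normal distribution. The sketch below is illustrative only; the function name `z_critical` is an assumption, and it uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided z critical value for a confidence level such as 0.90."""
    tail = (1 - confidence) / 2               # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

print(round(z_critical(0.90), 3))  # 1.645
```

For 90% confidence, 5% of the area is left in each tail, so the critical value is the 95th percentile of the standard normal curve.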
We take p̂1 to be the females’ sample approval
proportion and p̂2 to be the males’ sample approval
proportion.
$$(0.430 - 0.305) \pm 1.645\sqrt{\frac{0.430(1 - 0.430)}{100} + \frac{0.305(1 - 0.305)}{200}}$$
(0.125) ± 0.097
or
(0.028, 0.222)
We are 90% confident that the proportion of females that
approve of the policy exceeds the proportion of males that
approve of the policy by somewhere between 0.028 and
0.222.