Chapter 9
Inferences from Two Samples
Objective
Compare the parameters of two populations
using one sample from each population.
Use Confidence Intervals and Hypothesis Tests
For the first population use index 1
For the second population use index 2
9.2 Inferences About Two Proportions: Compare p1 , p2
9.3 Inferences About Two Means (Independent): Compare µ1 , µ2 (Independent)
9.4 Inferences About Two Means (Matched Pairs): Compare µ1 , µ2 (Matched Pairs)
9.5 Comparing Variation in Two Samples: Compare σ1² , σ2²
Section 9.2
Inferences About Two Proportions
Objective
Compare the proportions of two populations
using one sample from each population.
Hypothesis Tests and Confidence Intervals
for two proportions use the z-distribution
Notation
First Population
p1 : First population proportion
n1 : First sample size
x1 : Number of successes in first sample
p̂1 = x1 / n1 : First sample proportion
Notation
Second Population
p2 : Second population proportion
n2 : Second sample size
x2 : Number of successes in second sample
p̂2 = x2 / n2 : Second sample proportion
Definition
The pooled sample proportion p̄ and its complement q̄
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad \bar{q} = 1 - \bar{p}$$
Requirements
(1) Have two independent random samples
(2) For each sample:
The number of successes is at least 5
The number of failures is at least 5
Both requirements must be satisfied to make a
Hypothesis Test or to find a Confidence Interval
Tests for Two Proportions
The goal is to compare the two proportions
H0 : p1 = p2 , H1 : p1 ≠ p2 (Two tailed)
H0 : p1 = p2 , H1 : p1 < p2 (Left tailed)
H0 : p1 = p2 , H1 : p1 > p2 (Right tailed)
Note: We only test the relation between p1 and p2
(not the actual numerical values)
Finding the Test Statistic
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}}$$
Note: p1 – p2 = 0 according to H0
This equation is an altered form of the test
statistic for a single proportion (see Ch. 8-3)
Steps for Performing a Hypothesis
Test on Two Proportions
• Write what we know
• State H0 and H1
• Draw a diagram
• Calculate the sample and pooled proportions
• Find the Test Statistic
• Find the Critical Value(s)
• State the Initial Conclusion and Final Conclusion
Note: Hypothesis Tests are done in the same way as
in Ch. 8 (but with different test statistics)
Example 1
The table below lists results from a simple random sample
of front-seat occupants involved in car crashes.
Use a 0.05 significance level to test the claim that the
fatality rate of occupants is lower for those in cars
equipped with airbags.
                        With airbags    No airbags
Fatalities                        41            52
Number of occupants            11541          9853
p1 : Proportion of fatalities with airbags
p2 : Proportion of fatalities with no airbags
What we know:
x1 = 41
n1 = 11541
x2 = 52
n2 = 9853
α = 0.05
Claim: p1 < p2
Note: Each sample has more than 5 successes and failures, thus fulfilling the requirements
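The requirement check in the note above can be written out as a minimal sketch. The helper name `meets_requirements` and the dictionary labels are illustrative assumptions, not part of the slides; the counts are the ones given for Example 1.

```python
def meets_requirements(x, n, minimum=5):
    """Return True when a sample has at least `minimum` successes and failures.

    x is the number of successes and n the sample size, so n - x is the
    number of failures (requirement (2) for two-proportion inference).
    """
    successes = x
    failures = n - x
    return successes >= minimum and failures >= minimum


# Example 1 data: (fatalities, occupants) for each group.
samples = {"with airbags": (41, 11541), "no airbags": (52, 9853)}
results = {label: meets_requirements(x, n) for label, (x, n) in samples.items()}

for label, ok in results.items():
    print(label, "meets requirement (2):", ok)
```

Requirement (1), independent random samples, depends on how the data were collected and cannot be checked from the counts alone.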
Example 1
Given:
x1 = 41
n1 = 11541
x2 = 52
n2 = 9853
α = 0.05
Claim: p1 < p2
H0 : p1 = p2
H1 : p1 < p2
Left-Tailed
H1 = Claim
Sample Proportions
p̂1 = x1 / n1 = 41 / 11541 = 0.003553
p̂2 = x2 / n2 = 52 / 9853 = 0.005278
Pooled Proportion
p̄ = (x1 + x2) / (n1 + n2) = 93 / 21394 = 0.004347
Test Statistic
z = –1.9116
Critical Value
–zα = –1.645
(Using StatCrunch)
Diagram
z-dist., left-tailed: the test statistic z = –1.9116 lies to the left of the critical value –zα = –1.645
Initial Conclusion: Since z is in the critical region, reject H0
Using StatCrunch
Stat → Proportions → Two sample → With summary
Sample 1: Number of successes: 41
Number of observations: 11541
Sample 2: Number of successes: 52
Number of observations: 9853
● Hypothesis Test
Null: prop. diff. = 0
Alternative: <
P-value = 0.028
Initial Conclusion: Since P-value is less than α (with α = 0.05), reject H0
Final Conclusion: We accept the claim that the fatality rate of occupants is
lower for those in cars equipped with airbags
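As a rough cross-check of Example 1, the pooled two-proportion z test can be sketched with the Python standard library. The function name `two_proportion_z_test` is a hypothetical helper, and `statistics.NormalDist` stands in for the z-table or StatCrunch lookup used in the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Return sample proportions, pooled proportion, and the z statistic.

    Under H0 the hypothesized difference p1 - p2 is 0, so it drops out
    of the numerator.
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)   # pooled sample proportion
    q_bar = 1 - p_bar
    z = (p1_hat - p2_hat) / sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return p1_hat, p2_hat, p_bar, z


# Example 1 data (airbags vs. no airbags), left-tailed test at alpha = 0.05.
p1_hat, p2_hat, p_bar, z = two_proportion_z_test(41, 11541, 52, 9853)
alpha = 0.05
critical = NormalDist().inv_cdf(alpha)   # left-tail critical value, about -1.645
p_value = NormalDist().cdf(z)            # area to the left of z
reject_h0 = z < critical

print(f"z = {z:.4f}, critical = {critical:.3f}, P-value = {p_value:.3f}")
print("reject H0:", reject_h0)
```

For a right-tailed or two-tailed claim, only the critical value and the P-value lines would change; the test statistic is computed the same way.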
Confidence Interval Estimate
We can observe how the two proportions relate by
looking at the Confidence Interval Estimate of p1–p2
CI = ( (p̂1–p̂2) – E, (p̂1–p̂2) + E )
Where
$$E = z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}, \qquad \hat{q}_1 = 1 - \hat{p}_1,\quad \hat{q}_2 = 1 - \hat{p}_2$$
Example 2
Use the same sample data in Example 1 to construct a
90% Confidence Interval Estimate of the difference
between the two population proportions (p1–p2)
x1 = 41
n1 = 11541
x2 = 52
n2 = 9853
p̂1 = 0.003553
p̂2 = 0.005278
CI = (-0.003232, -0.000218 )
Note: Both CI limits are negative, which implies p1–p2 is negative. This implies p1<p2
Using StatCrunch
Stat → Proportions → Two sample → With summary
Sample 1: Number of successes: 41
Number of observations: 11541
Sample 2: Number of successes: 52
Number of observations: 9853
● Confidence Interval
Level: 0.9
CI = (-0.003232, -0.000218 )
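The 90% interval for p1–p2 can be reproduced with a short sketch. It assumes the usual unpooled margin of error (each sample proportion with its own standard error); the function name `two_proportion_ci` is illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Two-sided critical value z_{alpha/2}, where alpha = 1 - confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_crit * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin


# Example 2: same airbag data, 90% confidence level.
lower, upper = two_proportion_ci(41, 11541, 52, 9853, 0.90)
print(f"CI = ({lower:.6f}, {upper:.6f})")
print("interval entirely below 0:", upper < 0)
```

Note that the interval uses the individual sample proportions rather than the pooled proportion, since no assumption p1 = p2 is made when estimating the difference.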
Interpreting Confidence Intervals
If the limits of a confidence interval do not contain 0, it
implies there is a significant difference between
the two proportions (i.e. p1 ≠ p2).
Thus, we can interpret a relation between the two
proportions from the confidence interval.
In general:
• If p1 = p2 then the CI should contain 0
• If p1 > p2 then the CI should be mostly positive
• If p1 < p2 then the CI should be mostly negative
Example 3
Drug Clinical Trial
Chantix is a drug used as an aid to stop smoking. The number of
subjects experiencing insomnia for each of two treatment groups in
a clinical trial of the drug Chantix are given below:
                                Chantix Treatment    Placebo
Number in group                               129        805
Number experiencing insomnia                   19         13
(a) Use a 0.01 significance level to test the claim that the proportion of
subjects experiencing insomnia is the same for both groups.
(b) Find the 99% confidence interval estimate of the difference of the
two proportions. Does it support the result of the test?
Example 3a
What we know:
x1 = 19
n1 = 129
x2 = 13
n2 = 805
α = 0.01
Claim: p1 = p2
Note: Each sample has more than 5 successes and failures, thus fulfilling the requirements
H0 : p1 = p2
H1 : p1 ≠ p2
Two-Tailed
H0 = Claim
Sample Proportions
p̂1 = 19 / 129 = 0.14729
p̂2 = 13 / 805 = 0.01615
Pooled Proportion
p̄ = (x1 + x2) / (n1 + n2) = 32 / 934 = 0.03426
Test Statistic
z = 7.602
Critical Values
-zα/2 = -2.576 and zα/2 = 2.576
(Using StatCrunch)
Diagram
z-dist., two-tailed: the test statistic z = 7.602 lies far to the right of zα/2 = 2.576
Initial Conclusion: Since z is in the critical region, reject H0
Using StatCrunch
Stat → Proportions → Two sample → With summary
Sample 1: Number of successes: 19
Number of observations: 129
Sample 2: Number of successes: 13
Number of observations: 805
● Hypothesis Test
Null: prop. diff. = 0
Alternative: ≠
P-value < 0.0001
i.e. the P-value is very small
Initial Conclusion: Since the P-value is less than α (0.01), reject H0
Final Conclusion: We reject the claim that the proportion of subjects
experiencing insomnia is the same in both groups.
Example 3b
Use the same sample data in Example 3 to construct a
99% Confidence Interval Estimate of the difference
between the two population proportions (p1–p2)
x1 = 19
n1 = 129
x2 = 13
n2 = 805
p̂1 = 0.14729
p̂2 = 0.01615
CI = (0.0500, 0.2123 )
Note: The CI does not contain 0, which implies p1 and p2 have a significant difference. This supports the result of the test in part (a).
Using StatCrunch
Stat → Proportions → Two sample → With summary
Sample 1: Number of successes: 19
Number of observations: 129
Sample 2: Number of successes: 13
Number of observations: 805
● Confidence Interval
Level: 0.99
CI = (0.0500, 0.2123 )
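To illustrate how the confidence interval and the two-tailed test tell the same story for the Chantix data, here is a small sketch. All variable names are illustrative, and the normal quantile comes from `statistics.NormalDist` rather than StatCrunch.

```python
from math import sqrt
from statistics import NormalDist

# Example 3 data: Chantix group (sample 1) and placebo group (sample 2).
x1, n1 = 19, 129
x2, n2 = 13, 805
alpha = 0.01

p1_hat, p2_hat = x1 / n1, x2 / n2
diff = p1_hat - p2_hat
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 2.576

# (a) Two-tailed test: pooled proportion in the standard error.
p_bar = (x1 + x2) / (n1 + n2)
z = diff / sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
reject_h0 = abs(z) > z_crit

# (b) 99% confidence interval: unpooled standard error.
margin = z_crit * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
lower, upper = diff - margin, diff + margin
ci_excludes_zero = lower > 0 or upper < 0

print(f"z = {z:.3f}, reject H0: {reject_h0}")
print(f"CI = ({lower:.4f}, {upper:.4f}), excludes 0: {ci_excludes_zero}")
```

Because the test uses the pooled standard error and the interval uses the unpooled one, the two methods can occasionally disagree in borderline cases; here the evidence is strong enough that both point the same way.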
Section 9.3
Inferences About Two Means
(Independent)
Objective
Compare the means of two independent
populations using one sample from each population.
Hypothesis Tests and Confidence Intervals for
two means use the t-distribution
Definitions
Two samples are independent if the sample
values selected from one population are not
related to or somehow paired or matched with
the sample values from the other population
Examples:
Flipping two coins (Independent)
Drawing two cards (not independent)
Notation
First Population
μ1 : First population mean
σ1 : First population standard deviation
n1 : First sample size
x̄1 : First sample mean
s1 : First sample standard deviation
Second Population
μ2 : Second population mean
σ2 : Second population standard deviation
n2 : Second sample size
x̄2 : Second sample mean
s2 : Second sample standard deviation
Requirements
(1) Have two independent random samples
(2) σ1 and σ2 are unknown and no assumption is
made about their equality
(3) Either or both of the following hold:
Both sample sizes are large (n1>30, n2>30)
or
Both populations have normal distributions
All requirements must be satisfied to make a
Hypothesis Test or to find a Confidence Interval
Tests for Two Independent Means
The goal is to compare the two Means
H0 : μ1 = μ2 , H1 : μ1 ≠ μ2 (Two tailed)
H0 : μ1 = μ2 , H1 : μ1 < μ2 (Left tailed)
H0 : μ1 = μ2 , H1 : μ1 > μ2 (Right tailed)
Note: We only test the relation between μ1 and μ2
(not the actual numerical values)
Finding the Test Statistic
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Note: μ1 – μ2 = 0 according to H0
Degrees of freedom: df = smaller of n1 – 1 and n2 – 1.
This equation is an altered form of the test statistic
for a single mean when σ unknown (see Ch. 8-5)
Steps for Performing a Hypothesis
Test on Two Independent Means
• Write what we know
• State H0 and H1
• Draw a diagram
• Find the Test Statistic
• Find the Degrees of Freedom
• Find the Critical Value(s)
• State the Initial Conclusion and Final Conclusion
Degrees of freedom
df = min(n1 – 1, n2 – 1)
Note: Hypothesis Tests are done in the same way as
in Ch. 8 (but with different test statistics)
Example 1
A headline in USA Today proclaimed that “Men,
women are equal talkers.” That headline referred to
a study of the numbers of words that men and
women spoke in a day.
Use a 0.05 significance level to test the claim that
men and women speak the same mean number of
words in a day.
n1 = 186
x̄1 = 15668.5
s1 = 8632.5
n2 = 210
x̄2 = 16215.0
s2 = 7301.2
α = 0.05
Claim: μ1 = μ2
H0 : µ1 = µ2
H1 : µ1 ≠ µ2
Two-Tailed
H0 = Claim
Test Statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{15668.5 - 16215.0}{\sqrt{\dfrac{(8632.5)^2}{186} + \dfrac{(7301.2)^2}{210}}} \approx -0.676$$
Degrees of Freedom
df = min(n1 – 1, n2 – 1) = min(185, 209) = 185
Critical Values
-tα/2 = -1.97 and tα/2 = t0.025 = 1.97
(Using StatCrunch)
Diagram
t-dist., df = 185, two-tailed: the test statistic lies between -tα/2 = -1.97 and tα/2 = 1.97
Initial Conclusion: Since t is not in the critical region, fail to reject H0
Using StatCrunch
Stat → T statistics → Two sample → With summary
(Be sure to not use pooled variance)
Sample 1: Mean 15668.5, Std. Dev. 8632.5, Size 186
Sample 2: Mean 16215.0, Std. Dev. 7301.2, Size 210
● Hypothesis Test
Null: difference = 0
Alternative: ≠
(No pooled variance)
P-value = 0.4998
Initial Conclusion: Since P-value > α (0.05), fail to reject H0
Final Conclusion: There is not sufficient evidence to reject the claim that men and
women speak the same average number of words a day.
Confidence Interval Estimate
We can observe how the two means relate by
looking at the Confidence Interval Estimate of μ1–μ2
CI = ( (x̄1–x̄2) – E, (x̄1–x̄2) + E )
Where
$$E = t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
df = min(n1–1, n2–1)
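The test statistic and the conservative degrees of freedom for the word-count example can be checked with a few lines of Python. This is only a sketch: the helper `two_mean_t_statistic` is a made-up name, and the critical value 1.97 is taken from the example (StatCrunch) rather than computed, since the standard library has no t-distribution quantile function.

```python
from math import sqrt


def two_mean_t_statistic(n1, mean1, s1, n2, mean2, s2):
    """t statistic for H0: mu1 = mu2 without pooling the variances.

    Returns the statistic and the conservative df = min(n1 - 1, n2 - 1).
    """
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    t = (mean1 - mean2) / standard_error
    df = min(n1 - 1, n2 - 1)
    return t, df


# Summary statistics from the example (words spoken per day).
t, df = two_mean_t_statistic(186, 15668.5, 8632.5, 210, 16215.0, 7301.2)

t_critical = 1.97                   # t_{0.025} for df = 185, value quoted in the example
reject_h0 = abs(t) > t_critical     # two-tailed decision rule

print(f"t = {t:.3f}, df = {df}, reject H0: {reject_h0}")
```

With these summary statistics the statistic comes out at roughly -0.68, well inside the interval between the two critical values, which matches a P-value as large as the 0.4998 reported by StatCrunch.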
Example 2
Use the same sample data in Example 1 to construct a
95% Confidence Interval Estimate of the difference
between the two population means (µ1–µ2)
n1 = 186
x̄1 = 15668.5
s1 = 8632.5
n2 = 210
x̄2 = 16215.0
s2 = 7301.2
df = min(n1–1, n2–1) = min(185, 209) = 185
tα/2 = t0.05/2 = t0.025 = 1.973
x̄1 - x̄2 = 15668.5 – 16215.0 = -546.5
$$E = t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 1.973\sqrt{\frac{(8632.5)^2}{186} + \frac{(7301.2)^2}{210}} = 1596.17$$
(x̄1 - x̄2) + E = -546.5 + 1596.17 = 1049.67
(x̄1 - x̄2) – E = -546.5 – 1596.17 = -2142.67
CI = (-2142.7, 1049.7)
Using StatCrunch
Stat → T statistics → Two sample → With summary
Sample 1: Mean 15668.5, Std. Dev. 8632.5, Size 186
Sample 2: Mean 16215.0, Std. Dev. 7301.2, Size 210
● Confidence Interval
Level: 0.95
(No pooled variance)
CI = (-2137.4, 1044.4)
Note: The StatCrunch interval is slightly different, most likely because the
software computes the degrees of freedom with a more precise formula than
min(n1–1, n2–1), which gives a slightly smaller critical value
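The by-hand interval can be reproduced as follows. The sketch takes the critical value 1.973 from the example as an input instead of computing it, and the function name `two_mean_ci` is an illustrative choice.

```python
from math import sqrt


def two_mean_ci(n1, mean1, s1, n2, mean2, s2, t_critical):
    """Confidence interval for mu1 - mu2 without pooling the variances.

    t_critical is t_{alpha/2} for the chosen df, supplied by the caller
    (here it is read from the example, not computed).
    """
    margin = t_critical * sqrt(s1**2 / n1 + s2**2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin, margin


lower, upper, margin = two_mean_ci(
    186, 15668.5, 8632.5, 210, 16215.0, 7301.2, t_critical=1.973
)
print(f"E = {margin:.2f}")
print(f"CI = ({lower:.1f}, {upper:.1f})")
print("interval contains 0:", lower < 0 < upper)
```

The interval contains 0, which is consistent with the hypothesis test in Example 1 not finding a significant difference between the two means.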
Example 3
Consider two different classes. The students in the first
class are thought to generally be older than those in the
second. The students’ ages for this semester are summarized
as follows:
n1 = 93
x̄1 = 21.2
s1 = 2.42
n2 = 67
x̄2 = 19.8
s2 = 4.77
(a) Use a 0.1 significance level to test the claim that the
average age of students in the first class is greater than the
average age of students in the second class.
(b) Construct a 90% confidence interval estimate of the
difference in average ages.
Example 3a
α = 0.1
Claim: µ1 > µ2
H0 : µ1 = µ2
H1 : µ1 > µ2
Right-Tailed
H1 = Claim
Test Statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{21.2 - 19.8}{\sqrt{\dfrac{(2.42)^2}{93} + \dfrac{(4.77)^2}{67}}} = 2.207$$
Degrees of Freedom
df = min(n1 – 1, n2 – 1) = min(92, 66) = 66
Critical Value
tα = t0.1 = 1.295
(a right-tailed test puts the whole area α in one tail; the value t0.05 = 1.668 is the one needed for the 90% confidence interval in part (b))
Diagram
t-dist., df = 66, right-tailed: the test statistic t = 2.207 lies to the right of the critical value
Initial Conclusion: Since t is in the critical region, reject H0
Final Conclusion: We accept the claim that the average age of students
in the first class is greater than that in the second.
Example 3a
Using StatCrunch
(Be sure to not use pooled variance)
n1 = 93
x̄1 = 21.2
s1 = 2.42
n2 = 67
x̄2 = 19.8
s2 = 4.77
α = 0.1
Claim: µ1 > µ2
H0 : µ1 = µ2
H1 : µ1 > µ2
Right-Tailed
H1 = Claim
Stat → T statistics → Two sample → With summary
Sample 1: Mean 21.2, Std. Dev. 2.42, Size 93
Sample 2: Mean 19.8, Std. Dev. 4.77, Size 67
● Hypothesis Test
Null: difference = 0
Alternative: ≠
(No pooled variance)
P-value = 0.0299
Note: This P-value belongs to the two-sided alternative (≠) shown above. For the
right-tailed claim the P-value is half of it (about 0.015), which is also less than α.
Initial Conclusion: Since P-value < α (0.1), reject H0
Final Conclusion: We accept the claim that the average age of students
in the first class is greater than that in the second.
Example 3b
(90% Confidence Interval)
n1 = 93
x̄1 = 21.2
s1 = 2.42
n2 = 67
x̄2 = 19.8
s2 = 4.77
α = 0.1
df = min(n1–1, n2–1) = min(92, 66) = 66
tα/2 = t0.1/2 = t0.05 = 1.668
x̄1 - x̄2 = 21.2 – 19.8 = 1.4
$$E = t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 1.668\sqrt{\frac{(2.42)^2}{93} + \frac{(4.77)^2}{67}} = 1.058$$
(x̄1 - x̄2) + E = 1.4 + 1.058 = 2.458
(x̄1 - x̄2) – E = 1.4 – 1.058 = 0.342
CI = (0.34, 2.46)
Example 3b
(90% Confidence Interval)
Using StatCrunch
(Be sure to not use pooled variance)
n1 = 93
x̄1 = 21.2
s1 = 2.42
n2 = 67
x̄2 = 19.8
s2 = 4.77
α = 0.1
Stat → T statistics → Two sample → With summary
Sample 1: Mean 21.2, Std. Dev. 2.42, Size 93
Sample 2: Mean 19.8, Std. Dev. 4.77, Size 67
● Confidence Interval
Level: 0.9
(No pooled variance)
CI = (0.35, 2.45)