Chapter 9
Inferences from Two Samples
In this chapter we will deal with two samples
from two populations.
The general goal is to compare the
parameters of the two populations.
For the first population we use index 1, for the
second population index 2.
Section 9-2
Two Proportions
Notation for Two Proportions
For the first population, we let:
p1 = first population proportion
n1 = size of the first sample
x1 = number of successes in the first sample
$\hat{p}_1 = \dfrac{x_1}{n_1}$ (the first sample proportion)
$\hat{q}_1 = 1 - \hat{p}_1$
$p_2$, $n_2$, $x_2$, $\hat{p}_2$, and $\hat{q}_2$ are used for the second population.
Pooled Sample Proportion
The pooled sample proportion is denoted by $\bar{p}$ and is given by:
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2}$$
We denote $\bar{q} = 1 - \bar{p}$.
Requirements
1. We have two independent random
samples.
2. For each of the two samples, the
number of successes is at least 5 and
the number of failures is at least 5.
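The second requirement can be checked mechanically. The sketch below is only an illustration: the function name `meets_success_failure_rule` is hypothetical, and the counts are the ones used in the airbag example later in this chapter (41 of 11,541 and 52 of 9,853).

```python
def meets_success_failure_rule(x, n, minimum=5):
    """Return True if a sample of size n with x successes has
    at least `minimum` successes and at least `minimum` failures."""
    successes = x
    failures = n - x
    return successes >= minimum and failures >= minimum


# (x, n) pairs for the two samples in the airbag example.
samples = [(41, 11541), (52, 9853)]
all_ok = all(meets_success_failure_rule(x, n) for x, n in samples)
print(all_ok)
```

Independence and randomness of the samples (requirement 1) depend on how the data were collected and cannot be checked from the counts alone.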
Tests for Two Proportions
The goal is to compare the two proportions.
H0: p1 = p2
H1: p1 ≠ p2 (two tails)
H1: p1 < p2 (left tail)
H1: p1 > p2 (right tail)
Note: no numerical values for p1 or p2 are
claimed in the hypotheses.
Test Statistic for Two Proportions
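$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}}$$
Note: $p_1 - p_2 = 0$ according to H0, and
$$\hat{p}_1 = \frac{x_1}{n_1}, \qquad \hat{p}_2 = \frac{x_2}{n_2}, \qquad \bar{p} = \frac{x_1 + x_2}{n_1 + n_2}, \qquad \bar{q} = 1 - \bar{p}$$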
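As a rough sketch, the test statistic can be computed directly from the four counts. The function name `two_proportion_z` is an illustrative choice, not part of the slides. The counts in the call are those of the airbag example that follows.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled sample proportion."""
    p_hat1 = x1 / n1
    p_hat2 = x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)   # pooled sample proportion
    q_bar = 1 - p_bar
    standard_error = math.sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    # Under H0 the hypothesized difference p1 - p2 is 0.
    return (p_hat1 - p_hat2 - 0) / standard_error


z = two_proportion_z(41, 11541, 52, 9853)
print(round(z, 2))
```

A negative value means the first sample proportion is smaller than the second.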
Example:
The table below lists results from a simple
random sample of front-seat occupants
involved in car crashes.
                        Airbag available   No airbag available
Occupant fatalities            41                  52
Total occupants            11,541               9,853
Use a 0.05 significance level to test the claim
that the fatality rate of occupants is lower for
those in cars equipped with airbags.
Requirements are satisfied: there are two simple
random samples, the samples are independent, and each
has at least 5 successes and at least 5 failures.
Step 1: Express the claim as p1 < p2.
Step 2: p1 < p2 does not contain equality so it is
the alternative hypothesis. The null
hypothesis is the statement of equality.
H0: p1 = p2
H1: p1 < p2 (original claim)
Step 3: Significance level is 0.05
Step 4: Compute the pooled proportion:
$$\bar{p} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{41 + 52}{11{,}541 + 9{,}853} = 0.004347$$
With $\bar{p} = 0.004347$ it follows that $\bar{q} = 1 - \bar{p} = 0.995653$.
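Step 5: Find the value of the test statistic.
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\,\bar{q}}{n_1} + \dfrac{\bar{p}\,\bar{q}}{n_2}}} = \frac{\left(\dfrac{41}{11{,}541} - \dfrac{52}{9{,}853}\right) - 0}{\sqrt{\dfrac{(0.004347)(0.995653)}{11{,}541} + \dfrac{(0.004347)(0.995653)}{9{,}853}}} = -1.91$$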
Left-tailed test. Area to left of z = –1.91 is
0.0281 (Table A-2), so the P-value is 0.0281.
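Instead of a table lookup, the left-tail area can be obtained from the standard normal CDF. This is a minimal sketch using Python's standard library. The variable names are illustrative, and the rounded statistic z = –1.91 is taken from the example.

```python
from statistics import NormalDist

z = -1.91       # test statistic from the example
alpha = 0.05    # significance level

# Left-tailed test: the P-value is the area to the left of z.
p_value = NormalDist().cdf(z)

print(round(p_value, 4))
print(p_value < alpha)   # True means "reject H0"
```

For a right-tailed test the area would be `1 - NormalDist().cdf(z)`, and for a two-tailed test twice the smaller tail area.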
Step 6: Because the P-value of 0.0281 is less
than the significance level of α = 0.05, we
reject the null hypothesis of p1 = p2.
Because we reject the null hypothesis, we
conclude that there is sufficient evidence to
support the original claim.
Final conclusion:
the proportion of accident fatalities for
occupants in cars with airbags is less than the
proportion of fatalities for occupants in cars
without airbags.
Example: Using the Traditional Method
With a significance level of α = 0.05 in a left-tailed test,
the critical value is z = –1.645. The test statistic
of z = –1.91 falls in the critical region
bounded by the critical value of z = –1.645.
We again reject the null hypothesis.
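The critical value itself can be recovered from the inverse normal CDF. The following sketch assumes a left-tailed test and uses the rounded test statistic from the example. The variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
z = -1.91   # test statistic from the example

# Left-tailed test: the critical region is everything below the
# z value that has area alpha to its left.
z_critical = NormalDist().inv_cdf(alpha)
reject_h0 = z < z_critical

print(round(z_critical, 3))
print(reject_h0)
```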
Confidence Interval
Estimate of p1 – p2
$$(\hat{p}_1 - \hat{p}_2) - E < (p_1 - p_2) < (\hat{p}_1 - \hat{p}_2) + E$$
where
$$E = z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$
Example:
Use the same sample data to construct a
90% confidence interval estimate of the
difference between the two population
proportions.
Note: 1 – α = 0.90, so α = 0.10 and α/2 = 0.05.
Requirements are satisfied as we saw earlier.
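90% confidence interval: $z_{\alpha/2} = 1.645$
Calculate the margin of error, E:
$$E = z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} = 1.645 \sqrt{\frac{\left(\frac{41}{11{,}541}\right)\left(\frac{11{,}500}{11{,}541}\right)}{11{,}541} + \frac{\left(\frac{52}{9{,}853}\right)\left(\frac{9{,}801}{9{,}853}\right)}{9{,}853}} = 0.001507$$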
Construct the confidence interval
$$(\hat{p}_1 - \hat{p}_2) - E < (p_1 - p_2) < (\hat{p}_1 - \hat{p}_2) + E$$
$$(0.003553 - 0.005278) - 0.001507 < (p_1 - p_2) < (0.003553 - 0.005278) + 0.001507$$
$$-0.00323 < (p_1 - p_2) < -0.000218$$
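The interval can be reproduced numerically. This is a sketch only: `two_proportion_ci` is a hypothetical helper name, and it takes the critical value from the inverse normal CDF rather than from the rounded table value 1.645, so the last digit may differ slightly from a hand calculation.

```python
import math
from statistics import NormalDist


def two_proportion_ci(x1, n1, x2, n2, confidence=0.90):
    """Confidence interval for p1 - p2 using unpooled sample proportions.

    Returns (lower limit, upper limit, margin of error).
    """
    p_hat1, p_hat2 = x1 / n1, x2 / n2
    q_hat1, q_hat2 = 1 - p_hat1, 1 - p_hat2
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z_crit * math.sqrt(p_hat1 * q_hat1 / n1 + p_hat2 * q_hat2 / n2)
    diff = p_hat1 - p_hat2
    return diff - margin, diff + margin, margin


lower, upper, margin = two_proportion_ci(41, 11541, 52, 9853, confidence=0.90)
print(round(margin, 6), round(lower, 5), round(upper, 6))
```

Note that the interval uses the individual sample proportions, not the pooled proportion used in the hypothesis test.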
Final note:
The confidence interval limits do not contain 0,
implying that there is a significant difference
between the two proportions.
Thus the confidence interval, too, suggests
that the fatality rate is lower for occupants in
cars with air bags than for occupants in cars
without air bags.
Two proportions by TI-83/84
• Press STAT and select TESTS
• Scroll down to 2-PropZTest and press ENTER
• Type in x1: (number of successes in 1st sample)
• n1: (number of trials in 1st sample)
• x2: (number of successes in 2nd sample)
• n2: (number of trials in 2nd sample)
• Choose H1: p1 ≠ p2 (two tails), <p2 (left tail), or >p2 (right tail)
• Press on Calculate
• Read the test statistic z=… and the P-value p=…
Two proportions by TI-83/84
• Press STAT and select TESTS
• Scroll down to 2-PropZInt and press ENTER
• Type in x1: (number of successes in 1st sample)
• n1: (number of trials in 1st sample)
• x2: (number of successes in 2nd sample)
• n2: (number of trials in 2nd sample)
• C-Level: (confidence level)
• Press on Calculate
• Read the interval (…,…)
Section 9-3
Two Means:
Independent Samples
Definitions
Two samples are independent if
the sample values selected from
one population are not related to
or somehow paired or matched
with the sample values from the
other population.
Notation for the first population:
µ1 = population mean
σ1 = population standard deviation
n1 = size of the first sample
x̄1 = sample mean
s1 = sample standard deviation
Corresponding notations µ2, σ2, x̄2, s2, and n2 apply to the second population.
Requirements
1. σ1 and σ2 are unknown and no assumption is
made about the equality of σ1 and σ2.
2. The two samples are independent.
3. Both samples are random samples.
4. Either or both of these conditions are
satisfied: The two sample sizes are both
large (with n1 > 30 and n2 > 30) or both
populations have normal distributions.
Tests for Two Means
The goal is to compare the two means.
H0: µ1 = µ2
H1: µ1 ≠ µ2 (two tails)
H1: µ1 < µ2 (left tail)
H1: µ1 > µ2 (right tail)
Note: no numerical values for µ1 or µ2 are
claimed in the hypotheses.
Hypothesis Test for Two Means
with Independent Samples:
Test Statistic is
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$
Note: µ1 – µ2 = 0 according to H0.
Degrees of freedom: df = smaller of n1 – 1 and n2 – 1.
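A minimal sketch of this test statistic, working from summary statistics, is shown below. The function name `two_sample_t` is illustrative. The numbers in the call are the summary statistics used in the word-count example in this section.

```python
import math


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """t statistic for H0: mu1 = mu2 without assuming equal variances.

    Returns the statistic and the conservative degrees of freedom
    (the smaller of n1 - 1 and n2 - 1).
    """
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    # Under H0 the hypothesized difference mu1 - mu2 is 0.
    t = (mean1 - mean2 - 0) / standard_error
    df = min(n1 - 1, n2 - 1)
    return t, df


t, df = two_sample_t(15668.5, 8632.5, 186, 16215.0, 7301.2, 210)
print(round(t, 3), df)
```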
Example:
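A headline in USA Today proclaimed that “Men,
women are equal talkers.” That headline
referred to a study of the numbers of words that
men and women spoke in a day; the sample statistics are:
Men: n1 = 186, x̄1 = 15,668.5, s1 = 8632.5
Women: n2 = 210, x̄2 = 16,215.0, s2 = 7301.2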
Use a 0.05 significance level to test the claim
that men and women speak the same mean
number of words in a day.
Requirements are satisfied: two population
standard deviations are not known and not
assumed to be equal, independent samples,
both samples are large.
Step 1: Express claim as µ1 = µ2.
Step 2: If original claim is false, then µ1 ≠ µ2.
Step 3: Alternative hypothesis does not
contain equality, null hypothesis does.
H0: µ1 = µ2 (original claim)
H1: µ1 ≠ µ2
Step 4: Significance level is 0.05
Step 5: Use a t distribution
Step 6: Calculate the test statistic
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(15{,}668.5 - 16{,}215.0) - 0}{\sqrt{\dfrac{8632.5^2}{186} + \dfrac{7301.2^2}{210}}} = -0.676$$
Use Table A-3: the area in two tails is 0.05 and df = 185,
which is not in the table; using the closest table value,
the critical values are t = ±1.972.
Step 7: Because the test statistic does not fall
within the critical region, fail to reject
the null hypothesis:
µ1 = µ2 (or µ1 – µ2 = 0).
Final conclusion:
There is not sufficient evidence to warrant
rejection of the claim that men and women speak
the same mean number of words in a day.
Confidence Interval Estimate of
µ1 – µ2: Independent Samples
$$(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$$
where
$$E = t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
df = smaller of n1 – 1 and n2 – 1
Example:
Using the given sample data, construct a
95% confidence interval estimate of the
difference between the mean number of
words spoken by men and the mean
number of words spoken by women.
Find the margin of error, E; use $t_{\alpha/2} = 1.972$:
$$E = t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = 1.972 \sqrt{\frac{8632.5^2}{186} + \frac{7301.2^2}{210}} = 1595.4$$
Construct the confidence interval using E = 1595.4,
x̄1 = 15,668.5, and x̄2 = 16,215.0:
$$(\bar{x}_1 - \bar{x}_2) - E < (\mu_1 - \mu_2) < (\bar{x}_1 - \bar{x}_2) + E$$
$$-2141.9 < (\mu_1 - \mu_2) < 1048.9$$
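The same interval can be sketched in a few lines. Python's standard library has no t distribution, so this sketch assumes the critical value is supplied by the caller (here the table value 1.972 from the example). The helper name `two_mean_ci` is hypothetical.

```python
import math


def two_mean_ci(mean1, s1, n1, mean2, s2, n2, t_critical):
    """Confidence interval for mu1 - mu2 from summary statistics.

    t_critical is t_{alpha/2} for the chosen confidence level and df;
    it must be looked up separately (assumption of this sketch).
    Returns (lower limit, upper limit, margin of error).
    """
    margin = t_critical * math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin, margin


low, high, margin = two_mean_ci(
    15668.5, 8632.5, 186, 16215.0, 7301.2, 210, t_critical=1.972
)
print(round(margin, 1), round(low, 1), round(high, 1))
```

Because this interval contains 0, it is consistent with failing to reject µ1 = µ2 in the hypothesis test above.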
Tests about two means by TI-83/84
• Press STAT and select TESTS
• Scroll down to 2-SampTTest and press ENTER
• Select Input: Data or Stats. For Stats:
• Type in x̄1: (1st sample mean)
• sx1: (1st sample st. deviation)
• n1: (1st sample size)
• x̄2: (2nd sample mean)
• sx2: (2nd sample st. deviation)
• n2: (2nd sample size)
• Choose H1: µ1 ≠ µ2 (two tails), <µ2 (left tail), or >µ2 (right tail)
Tests about two means (continued)
• Choose Pooled: No or Yes (always choose No)
• Press on Calculate
• Read the test statistic t=… and the P-value p=…
Note: the calculator gives a more accurate
P-value than the book does, because it uses
a more accurate formula for degrees of
freedom (see the line df=… in the
calculator). The book adopts a simple but
inaccurate rule df=smaller of n1-1 and n2-1.
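The slides do not name the calculator's formula. A common choice for unpooled two-sample t procedures is the Welch–Satterthwaite approximation, and the sketch below assumes that is the formula meant. The function name `welch_df` is illustrative, and the inputs are the summary statistics from the word-count example.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom
    for a two-sample t procedure with unequal variances."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


df_welch = welch_df(8632.5, 186, 7301.2, 210)
df_simple = min(186 - 1, 210 - 1)   # the book's conservative rule

print(round(df_welch, 1), df_simple)
```

The approximation is never smaller than the conservative rule, so the simple rule gives somewhat wider intervals and larger P-values.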
Intervals for two means by TI-83/84
• Press STAT and select TESTS
• Scroll down to 2-SampTInt and press ENTER
• Select Input: Data or Stats. For Stats:
• Type in x̄1: (1st sample mean)
• sx1: (1st sample st. deviation)
• n1: (1st sample size)
• x̄2: (2nd sample mean)
• sx2: (2nd sample st. deviation)
• n2: (2nd sample size)
• C-Level: (confidence level)
Intervals for two means (continued)
• Choose Pooled: No or Yes (always choose No)
• Press on Calculate
• Read the confidence interval (…,…)
Note: the calculator gives a more accurate
confidence interval than the book does,
because it uses a more accurate formula for
degrees of freedom (see the line df=… in
the calculator). The book adopts a simple but
inaccurate rule df=smaller of n1-1 and n2-1.