5-3 Inference on the Means of Two
Populations, Variances Unknown
5-3.1 Hypothesis Testing on the Difference in
Means
Use the data in both samples to estimate $\sigma^2$.
Each $(X_{ij} - \bar{X}_i)^2$ is an estimator of $\sigma^2$.
The pooled sample variance is the sum of all those terms divided by the total degrees of freedom:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$$
Replace $\sigma$ in the test statistic by $S_p$.
The degrees of freedom of $T$ is the degrees of freedom of $S_p^2$.
p.226
p.227
$$s_p^2 = \frac{7(2.39)^2 + 7(2.98)^2}{8 + 8 - 2} = 7.30$$
$$t_0 = \frac{92.255 - 92.733}{\sqrt{7.30\left(\frac{1}{8} + \frac{1}{8}\right)}} = -0.35, \qquad |t_0| = 0.35 < t_{0.025,14} = 2.145$$
Do not reject $H_0$.
Both samples are assumed to come from normal distributions with equal
variance. The data give no evidence that the means differ, so the two
samples are consistent with a single common distribution. See p.210 for plots.
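The pooled-variance calculation above can be checked numerically. The following is a minimal sketch, assuming the summary statistics read off the slide (sample means 92.255 and 92.733, sample standard deviations 2.39 and 2.98, $n_1 = n_2 = 8$); the function name `pooled_t` is hypothetical, and the critical value 2.145 is taken from a t table rather than computed.

```python
import math

def pooled_t(xbar1, xbar2, s1, s2, n1, n2, delta0=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t0 = (xbar1 - xbar2 - delta0) / se
    return sp2, t0, df

# Summary statistics as read from the slide (assumed, not raw data).
sp2, t0, df = pooled_t(92.255, 92.733, 2.39, 2.98, 8, 8)
T_CRIT = 2.145  # t_{0.025,14}, looked up in a t table

print(round(sp2, 2), round(t0, 2), df)
print("reject H0" if abs(t0) > T_CRIT else "do not reject H0")
```

The two-sided decision compares $|t_0|$ with the table value, which is why the sign of $t_0$ does not matter here.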
Estimate the two variances separately, by $S_1^2$ and $S_2^2$.
(1) Equal sample sizes ($n_1 = n_2 = n$):
$$T = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{S_1^2 + S_2^2}{n}}} \sim t(2(n - 1))$$
(2) Unequal sample sizes ($n_1 \neq n_2$): this is the Behrens-Fisher
problem; the test statistic $T$ is not exactly t-distributed.
p.229
If the sample sizes are moderate, $t(n_1 + n_2 - 2)$ can be used.
If the sample sizes are large, use the z-value.
Compare to the plots on Fig.5-2
$$\nu = \frac{\left(\dfrac{7.63^2}{10} + \dfrac{15.3^2}{10}\right)^2}{\dfrac{(7.63^2/10)^2}{9} + \dfrac{(15.3^2/10)^2}{9}} = 13.2$$
Round down to 13.
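The degrees-of-freedom value 13.2 for the unequal-variance case can be reproduced with a few lines of code. This is a sketch under the assumption that 7.63 and 15.3 are the two sample standard deviations and that $n_1 = n_2 = 10$ (the reading that gives 13.2); the function name `approx_df` is hypothetical.

```python
import math

def approx_df(s1, s2, n1, n2):
    """Approximate degrees of freedom for the unequal-variance t statistic."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

nu = approx_df(7.63, 15.3, 10, 10)
print(round(nu, 1), math.floor(nu))  # 13.2 13
```

Rounding down rather than to the nearest integer gives a slightly larger critical value, which is the conservative choice.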
p.231
$t_{0.025,13} = 2.16$
Since the CI does not contain 0, the difference in means is significantly
different from 0.
Type II error probability and sample size can be obtained from Table V for the case of
$\sigma_1^2 = \sigma_2^2 = \sigma^2$ and $n_1 = n_2 = n$, with
$$d = \frac{|\Delta - \Delta_0|}{2\sigma}$$
5-3.3 Confidence Interval on the Difference in
Means
p.232
$\sigma_1^2 = \sigma_2^2 = \sigma^2$
p.234
Example #5-19 (p.235): $\bar{x}_A = 36.51$, $\bar{x}_B = 34.21$, $s_A = 1.43$, $s_B = 0.93$
a) 1) The parameter of interest is the difference in mean battery
life, $\mu_A - \mu_B$, assuming $\sigma_1^2 = \sigma_2^2 = \sigma^2$.
2) $H_0: \mu_A - \mu_B = 0$ vs. $H_1: \mu_A - \mu_B > 0$
3) $\alpha = 0.01$
4) The test statistic is
$$t_0 = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$
5)
$$s_p = \sqrt{\frac{11(1.43)^2 + 11(0.93)^2}{22}} = 1.206$$
$$t_0 = \frac{36.51 - 34.21}{1.206\sqrt{\dfrac{1}{12} + \dfrac{1}{12}}} = 4.67 > 2.508 = t_{0.01,22}$$
6) Since 4.67 > 2.508, reject the null hypothesis and conclude
that the mean battery life of Type A significantly exceeds that
of Type B at $\alpha = 0.01$.
b) Calculate the P-value for this test.
P-value = P(t > 4.67) < 0.0005
c) Construct a 99% CI.
99% lower-side confidence bound:
$$(36.51 - 34.21) - 2.508(1.206)\sqrt{\frac{1}{12} + \frac{1}{12}} \le \mu_A - \mu_B$$
$$1.065 \le \mu_A - \mu_B$$
Since zero is not contained in this interval, we conclude that the
null hypothesis can be rejected and the alternative accepted.
The life of Type A is significantly longer than that of Type B.
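Parts a) and c) of the battery example use the same standard error, so they can be checked together. The sketch below assumes $n_1 = n_2 = 12$ (inferred from the factors 11 and 1/12 in the slide) and takes $t_{0.01,22} = 2.508$ from a t table; the variable names are illustrative only.

```python
import math

n1 = n2 = 12                    # assumed from the 1/12 terms on the slide
xbar_a, xbar_b = 36.51, 34.21   # sample means
s_a, s_b = 1.43, 0.93           # sample standard deviations
T_CRIT = 2.508                  # t_{0.01,22}, looked up in a t table

# Pooled standard deviation and standard error of the difference in means.
sp = math.sqrt(((n1 - 1) * s_a**2 + (n2 - 1) * s_b**2) / (n1 + n2 - 2))
se = sp * math.sqrt(1 / n1 + 1 / n2)

t0 = (xbar_a - xbar_b) / se                      # one-sided test statistic
lower_bound = (xbar_a - xbar_b) - T_CRIT * se    # 99% lower confidence bound

print(round(sp, 3), round(t0, 2), round(lower_bound, 3))
print("reject H0" if t0 > T_CRIT else "do not reject H0")
```

Because the lower bound is above zero, the one-sided test and the confidence bound lead to the same conclusion.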
d) Suppose that if the mean life of A batteries exceeds that of B by
2 months, it is important to detect this difference with
probability at least 0.95. Is the choice of $n_1 = n_2 = n$
in this problem adequate?
$$d = \frac{|2 - 0|}{2(1.206)} = 0.83$$
From Table V(d), we obtain that
$$20 \le \tilde{n} \le 30 \;\Rightarrow\; 11 \le n = \frac{\tilde{n} + 1}{2} \le 16$$
Should be adequate.
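The arithmetic in part d) can be sketched as follows. It assumes the reading of the slide in which the chart is indexed by $\tilde{n}$ and the per-group sample size is $n = (\tilde{n} + 1)/2$ rounded up; the chart lookup itself (the range 20 to 30) is taken from the slide, not computed, and the helper name `n_from_chart` is hypothetical.

```python
import math

delta, delta0 = 2.0, 0.0   # difference to detect and hypothesized difference
sigma_hat = 1.206          # pooled standard deviation used as the estimate of sigma

d = abs(delta - delta0) / (2 * sigma_hat)

def n_from_chart(n_tilde):
    """Convert the chart's sample-size index to a per-group sample size."""
    return math.ceil((n_tilde + 1) / 2)

print(round(d, 2), n_from_chart(20), n_from_chart(30))  # 0.83 11 16
```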
Hw for 5-3
5-20,5-21
5-4 The Paired t-Test
• A special case of the two-sample t-tests of Section 5-3
occurs when the observations on the two populations of
interest are collected in pairs.
• Each pair of observations, say (X1j , X2j ), is taken under
homogeneous conditions, but these conditions may change
from one pair to another.
• The test procedure consists of analyzing the differences
between two means.
• We will transform the two-sample problem to a one-sample
problem by defining $D_i = X_{1i} - X_{2i}$.
Let $(X_{1i}, X_{2i})$, $i = 1, \ldots, n$, be $n$ pairs of independent observations
from a bivariate normal distribution, with
$E(X_{1i}) = \mu_1$, $E(X_{2i}) = \mu_2$,
$V(X_{1i}) = \sigma_1^2$, $V(X_{2i}) = \sigma_2^2$,
$\mathrm{COV}(X_{1i}, X_{2i}) = \rho\sigma_1\sigma_2$, $i = 1, \ldots, n$,
where $\rho$ is the correlation coefficient between $X_1$ and $X_2$.
Define $D_i = X_{1i} - X_{2i}$. Then the distribution of $D_i$ is normal
with
mean $\mu_D = E(X_1 - X_2) = \mu_1 - \mu_2$
and
variance $\sigma_D^2 = V(X_1 - X_2) = \sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2$, which is reduced if $\rho > 0$,
i.e.,
$D_i \sim N(\mu_D, \sigma_D^2)$, $i = 1, \ldots, n$.
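The variance of the difference follows from the general rule for the variance of a difference of two correlated random variables. A short derivation, using only the moments listed above:

```latex
\begin{aligned}
\sigma_D^2 &= V(X_{1i} - X_{2i}) \\
           &= V(X_{1i}) + V(X_{2i}) - 2\,\mathrm{COV}(X_{1i}, X_{2i}) \\
           &= \sigma_1^2 + \sigma_2^2 - 2\rho\,\sigma_1\sigma_2 .
\end{aligned}
```

For independent samples the covariance term vanishes and the variance is $\sigma_1^2 + \sigma_2^2$, so positive correlation within pairs is what makes the paired analysis more precise.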
Original null hypothesis: $H_0: \mu_1 - \mu_2 = 0$
$$\bar{D} = \frac{1}{n}\sum_{i=1}^{n} D_i$$
$$S_D^2 = \frac{1}{n - 1}\sum_{i=1}^{n}(D_i - \bar{D})^2$$
n=9
p.220
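Example #5-32
$\bar{d} = 0.2736$, $s_D = 0.1356$, $n = 9$, $t_{0.025,8} = 2.306$
95% confidence interval for example 5-8:
$$\bar{d} - t_{\alpha/2,n-1}\frac{s_d}{\sqrt{n}} \le \mu_d \le \bar{d} + t_{\alpha/2,n-1}\frac{s_d}{\sqrt{n}}$$
$$0.2736 - 2.306\left(\frac{0.1356}{\sqrt{9}}\right) \le \mu_d \le 0.2736 + 2.306\left(\frac{0.1356}{\sqrt{9}}\right)$$
$$0.1694 \le \mu_d \le 0.3778$$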
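The paired-difference interval is a one-sample t interval applied to the differences, so it needs only the mean and standard deviation of the differences. A minimal sketch, assuming the summary values quoted in the example and a critical value read from a t table; the function name `paired_ci` is hypothetical.

```python
import math

def paired_ci(d_bar, s_d, n, t_crit):
    """Two-sided t confidence interval for the mean of paired differences."""
    half_width = t_crit * s_d / math.sqrt(n)
    return d_bar - half_width, d_bar + half_width

# d_bar = 0.2736, s_D = 0.1356, n = 9, t_{0.025,8} = 2.306 (table value)
lo, hi = paired_ci(0.2736, 0.1356, 9, 2.306)
print(round(lo, 4), round(hi, 4))
```

The same function applies to any confidence level; only the table value passed as `t_crit` changes.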
Paired Versus Unpaired Comparisons
Confidence Interval for $\mu_D$
Example 5-9
$\bar{d} = 1.21$, $s_D = 12.68$, $t_{0.05,13} = 1.771$
$$1.21 \pm 1.771\left(\frac{12.68}{\sqrt{14}}\right)$$
The 90% CI: $-4.79 \le \mu_D \le 7.21$
p.241
Hw for 5-4
5-39, 5-40
5-5 Inference on the Ratio of Variances
of Two Normal Populations
5-5.1 The F Distribution
Let $X_{11}, X_{12}, \ldots, X_{1n_1} \sim N(\mu_1, \sigma_1^2)$ and
$X_{21}, X_{22}, \ldots, X_{2n_2} \sim N(\mu_2, \sigma_2^2)$
be two independent samples with both $\mu_1$ and $\mu_2$ unknown.
We wish to test the hypotheses: $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \neq \sigma_2^2$
• The development of a test procedure for these hypotheses
requires a new probability distribution, the F distribution.
Example 5-44 (p.249)
$f_{0.25,5,10} = 1.59$, $f_{0.75,5,10} = \dfrac{1}{f_{0.25,10,5}} = \dfrac{1}{1.89} = 0.529$ (p.464)
$f_{0.1,24,9} = 2.28$, $f_{0.9,24,9} = \dfrac{1}{f_{0.1,9,24}} = \dfrac{1}{1.91} = 0.524$ (p.465)
$f_{0.05,8,15} = 2.64$, $f_{0.95,8,15} = \dfrac{1}{f_{0.05,15,8}} = \dfrac{1}{3.22} = 0.311$ (p.466)
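Lower-tail percentage points are not tabulated, so each one is obtained from an upper-tail entry with the degrees of freedom swapped. The sketch below encodes that lookup for the three table values quoted in this example; the dictionary `UPPER` and the function `lower_point` are hypothetical names, and the table entries are copied from the slide rather than computed.

```python
# Upper-tail table entries quoted on the slide: (alpha, u, v) -> f_{alpha,u,v}
UPPER = {
    (0.25, 10, 5): 1.89,
    (0.10, 9, 24): 1.91,
    (0.05, 15, 8): 3.22,
}

def lower_point(alpha, u, v):
    """Return f_{1-alpha,u,v} = 1 / f_{alpha,v,u} (note the swapped order)."""
    return 1.0 / UPPER[(alpha, v, u)]

for alpha, u, v in [(0.25, 5, 10), (0.10, 24, 9), (0.05, 8, 15)]:
    print(alpha, u, v, round(lower_point(alpha, u, v), 3))
```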
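Why?
$$\alpha = P(F_{v,u} > f_{\alpha,v,u}) = P\left(\frac{Y/v}{W/u} > f_{\alpha,v,u}\right)$$
$$= P\left(\frac{1}{F_{v,u}} < \frac{1}{f_{\alpha,v,u}}\right) = P\left(\frac{W/u}{Y/v} < \frac{1}{f_{\alpha,v,u}}\right)$$
$$= 1 - P\left(\frac{W/u}{Y/v} > \frac{1}{f_{\alpha,v,u}}\right)$$
and, since $\dfrac{W/u}{Y/v} \sim F_{u,v}$,
$$\alpha = 1 - P(F_{u,v} > f_{1-\alpha,u,v})$$
Therefore, we have
$$f_{1-\alpha,u,v} = \frac{1}{f_{\alpha,v,u}}$$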
The Test Procedure
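Recall: If $X_1, X_2, \ldots, X_n \sim N(\mu, \sigma^2)$, then
$$\frac{(n - 1)S^2}{\sigma^2} \sim \chi^2_{n-1}$$
$$F = \frac{\dfrac{(n_1 - 1)S_1^2/\sigma_1^2}{n_1 - 1}}{\dfrac{(n_2 - 1)S_2^2/\sigma_2^2}{n_2 - 1}} = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2} \sim F_{n_1-1,\,n_2-1}$$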
$$H_0: \frac{\sigma_1^2}{\sigma_2^2} = 1$$
5-21
Example #5-49 (p.249): $s_1 = 0.422$, $s_2 = 0.231$, $n_1 = n_2 = 10$
$H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \neq \sigma_2^2$
Reject $H_0$ if $f_0 < f_{0.975,9,9} = 0.248$ or $f_0 > f_{0.025,9,9} = 4.03$.
$$f_0 = \frac{(0.422)^2}{(0.231)^2} = 3.34$$
Since 0.248 < 3.34 < 4.03, do not reject the null hypothesis; there is
no significant evidence that the etch rate variances differ at the 0.05 level
of significance.
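The two-sided variance-ratio test in this example reduces to one division and two comparisons. A minimal sketch, assuming the upper critical value 4.03 is read from an F table and the lower one is obtained as its reciprocal (valid here because both degrees of freedom are 9); the variable names are illustrative.

```python
s1, s2 = 0.422, 0.231    # sample standard deviations, n1 = n2 = 10
F_UPPER = 4.03           # f_{0.025,9,9}, looked up in an F table
F_LOWER = 1 / F_UPPER    # f_{0.975,9,9} by the reciprocal relation

f0 = s1**2 / s2**2       # ratio of sample variances
reject = f0 < F_LOWER or f0 > F_UPPER

print(round(f0, 2), round(F_LOWER, 3), reject)  # 3.34 0.248 False
```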
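Confidence Interval
$$1 - \alpha = P\left(f_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{S_2^2/\sigma_2^2}{S_1^2/\sigma_1^2} \le f_{\alpha/2,\,n_2-1,\,n_1-1}\right)$$
$$= P\left(\frac{S_1^2}{S_2^2}\,f_{1-\alpha/2,\,n_2-1,\,n_1-1} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2}\,f_{\alpha/2,\,n_2-1,\,n_1-1}\right)$$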
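As an illustration of the interval for $\sigma_1^2/\sigma_2^2$, the sketch below applies it to the etch-rate summary statistics ($s_1 = 0.422$, $s_2 = 0.231$, $n_1 = n_2 = 10$). Applying the interval to that data set is an assumption made here for illustration, not a calculation shown in the slides; the F percentage points are table values and the function name `variance_ratio_ci` is hypothetical.

```python
def variance_ratio_ci(s1, s2, f_lower, f_upper):
    """CI for sigma1^2 / sigma2^2.

    f_lower = f_{1-alpha/2, n2-1, n1-1}, f_upper = f_{alpha/2, n2-1, n1-1}.
    """
    ratio = s1**2 / s2**2
    return ratio * f_lower, ratio * f_upper

# 95% interval with 9 and 9 degrees of freedom (table values).
lo, hi = variance_ratio_ci(0.422, 0.231, f_lower=0.248, f_upper=4.03)
print(round(lo, 2), round(hi, 2))
print("interval contains 1:", lo <= 1 <= hi)
```

An interval that contains 1 is consistent with not rejecting the hypothesis of equal variances at the matching significance level.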
5-6 Inference on Two Population
Proportions
5-6.1 Hypothesis Testing on the Equality of
Two Binomial Proportions
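Let $X_1 \sim B(n_1, p_1)$ and $X_2 \sim B(n_2, p_2)$ be two independent
binomial random variables. With $q_i = 1 - p_i$, approximately
$$\hat{p}_1 = \frac{X_1}{n_1} \approx N\left(p_1, \frac{p_1 q_1}{n_1}\right), \qquad \hat{p}_2 = \frac{X_2}{n_2} \approx N\left(p_2, \frac{p_2 q_2}{n_2}\right)$$
The sampling distribution of $\hat{p}_1 - \hat{p}_2$ is approximately
$$N\left(p_1 - p_2,\; \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}\right)$$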
Under the null hypothesis $H_0: p_1 = p_2 = p$,
we pool the information in both samples to estimate the
variance of $\hat{p}_1 - \hat{p}_2$,
$$V(\hat{p}_1 - \hat{p}_2) = \frac{pq}{n_1} + \frac{pq}{n_2},$$
by first estimating $p$ by
$$\hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2}$$
The sampling distribution becomes approximately
$$N\left(p_1 - p_2,\; pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right)$$
p.251
5-6.2 Type II Error and Choice of Sample Size
Where
$$\bar{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$
5-6.3 Confidence Interval on the Difference in
Binomial Proportions
Discuss Example 5-13
Example #5-60 (p.255): Two types of injection-molding machine.
$n_1 = n_2 = 300$, $x_1 = 15$, $x_2 = 8$
Is it reasonable to conclude that the two machines produce
the same fraction of defective parts? Use $\alpha = 0.05$.
$H_0: p_1 = p_2 = p$ vs. $H_1: p_1 \neq p_2$
$\hat{p}_1 = 0.05$, $\hat{p}_2 = 0.0267$, $\hat{p} = \dfrac{15 + 8}{600} = 0.0383$
$$z_0 = \frac{0.05 - 0.0267}{\sqrt{0.0383(1 - 0.0383)\left(\dfrac{1}{300} + \dfrac{1}{300}\right)}} = 1.49 < z_{0.025} = 1.96$$
Since $-1.96 < 1.49 < 1.96$, do not reject the null hypothesis and
conclude that the evidence indicates that there is not a significant
difference in the fraction of defective parts produced by the two
machines at the 0.05 level of significance.
P-value $= 2(1 - P(z < 1.49)) = 0.13622$
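The pooled two-proportion test can be reproduced with the Python standard library. This is a minimal sketch using the counts from the example; the function name `two_proportion_z` is hypothetical, and the normal CDF comes from `statistics.NormalDist`. Because the code keeps full precision instead of rounding $z_0$ to 1.49 first, its P-value differs from the slide's 0.13622 in the third decimal place.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic and two-sided P-value for H0: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)  # pooled estimate of the common p
    se = math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
    z0 = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z0)))
    return z0, p_value

z0, p_value = two_proportion_z(15, 300, 8, 300)
Z_CRIT = 1.96  # z_{0.025}

print(round(z0, 2), round(p_value, 3))
print("reject H0" if abs(z0) > Z_CRIT else "do not reject H0")
```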