Download here

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Statistics for
Business and Economics
6th Edition
Chapter 9
Estimation: Additional Topics
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-1
Chapter Goals
After completing this chapter, you should be able to:
 Form confidence intervals for the mean difference from dependent
samples
 Form confidence intervals for the difference between two
independent population means (standard deviations known or
unknown)
 Compute confidence interval limits for the difference between two
independent population proportions
 Create confidence intervals for a population variance
 Find chi-square values from the chi-square distribution table
 Determine the required sample size to estimate a mean or
proportion within a specified margin of error
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-2
Estimation: Additional Topics
Chapter Topics
Population
Means,
Dependent
Samples
Population
Means,
Independent
Samples
Population
Proportions
Population
Variance
Proportion 1 vs.
Proportion 2
Variance of a
normal distribution
Examples:
Same group
before vs. after
treatment
Group 1 vs.
independent
Group 2
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-3
Dependent Samples
Tests Means of 2 Related Populations
Dependent
samples



Paired or matched samples
Repeated measures (before/after)
Use difference between paired values:
di = xi - yi
 Eliminates Variation Among Subjects
 Assumptions:
 Both Populations Are Normally Distributed
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-4
Mean Difference
The ith paired difference is di , where
Dependent
samples
di = xi - yi
The point estimate for
the population mean
paired difference is d :
The sample
standard
deviation is:
n
d
d
i 1
i
n
n
Sd 
2
(d

d
)
 i
i1
n 1
n is the number of matched pairs in the sample
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-5
Confidence Interval for
Mean Difference
Dependent
samples
The confidence interval for difference
between population means, μd , is
d  t n1,α/2
Sd
Sd
 μd  d  t n1,α/2
n
n
Where
n = the sample size
(number of matched pairs in the paired sample)
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-6
Confidence Interval for
Mean Difference
(continued)
Dependent
samples
 The margin of error is
ME  t n1,α/2
sd
n
 tn-1,/2 is the value from the Student’s t
distribution with (n – 1) degrees of freedom
for which
α
P(t n1  t n1,α/2 ) 
2
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-7
Paired Samples Example
 Six people sign up for a weight loss program. You
collect the following data:
Person
1
2
3
4
5
6
Weight:
Before (x)
After (y)
136
205
157
138
175
166
125
195
150
140
165
160
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Difference, di
11
10
7
-2
10
6
42
di

d = n
= 7.0
Sd 
2
(d

d
)
 i
n 1
 4.82
Chap 9-8
Paired Samples Example
(continued)
 For a 95% confidence level, the appropriate t value is
tn-1,/2 = t5,.025 = 2.571
 The 95% confidence interval for the difference between
means, μd , is
d  t n1,α/2
7  (2.571)
Sd
S
 μd  d  t n1,α/2 d
n
n
4.82
4.82
 μd  7  (2.571)
6
6
 1.94  μd  12.06
Since this interval contains zero, we cannot be 95% confident, given this
limited data, that the weight loss program helps people lose weight
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-9
Difference Between Two Means
Population means,
independent
samples
Goal: Form a confidence interval
for the difference between two
population means, μx – μy
 Different data sources
 Unrelated
 Independent
 Sample selected from one population has no effect on the
sample selected from the other population
 The point estimate is the difference between the two
sample means:
x–y
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-10
Difference Between Two Means
(continued)
Population means,
independent
samples
σx2 and σy2 known
Confidence interval uses z/2
σx2 and σy2 unknown
σx2 and σy2
assumed equal
σx2 and σy2
assumed unequal
Confidence interval uses a value
from the Student’s t distribution
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-11
σx2 and σy2 Known
Population means,
independent
samples
σx2 and σy2 known
σx2 and σy2 unknown
Assumptions:
*
 Samples are randomly and
independently drawn
 both population distributions
are normal
 Population variances are
known
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-12
σx2 and σy2 Known
(continued)
When σx and σy are known and
both populations are normal, the
variance of X – Y is
Population means,
independent
samples
2
σx2 and σy2 known
σx2 and σy2 unknown
σ 2X Y
*
2
σy
σx


nx
ny
…and the random variable
Z
(x  y)  (μX  μY )
2
σ 2x σ y

nX nY
has a standard normal distribution
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-13
Confidence Interval,
σx2 and σy2 Known
Population means,
independent
samples
σx2 and σy2 known
σx2 and σy2 unknown
(x  y)  z α/2
*
The confidence interval for
μx – μy is:
σ 2X σ 2Y
σ 2X σ 2Y

 μX  μY  (x  y)  z α/2

nx ny
nx ny
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-14
σx2 and σy2 Unknown,
Assumed Equal
Assumptions:
Population means,
independent
samples
 Samples are randomly and
independently drawn
σx2 and σy2 known
 Populations are normally
distributed
σx2 and σy2 unknown
σx2 and σy2
assumed equal
*
 Population variances are
unknown but assumed equal
σx2 and σy2
assumed unequal
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-15
σx2 and σy2 Unknown,
Assumed Equal
(continued)
Forming interval
estimates:
Population means,
independent
samples
 The population variances
are assumed equal, so use
the two sample standard
deviations and pool them to
estimate σ
σx2 and σy2 known
σx2 and σy2 unknown
σx2 and σy2
assumed equal
σx2 and σy2
assumed unequal
*
 use a t value with
(nx + ny – 2) degrees of
freedom
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-16
σx2 and σy2 Unknown,
Assumed Equal
(continued)
Population means,
independent
samples
The pooled variance is
σx2 and σy2 known
σx2 and σy2 unknown
σx2 and σy2
assumed equal
*
sp2 
(n x  1)s2x  (n y  1)s2y
nx  ny  2
σx2 and σy2
assumed unequal
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-17
Confidence Interval,
σx2 and σy2 Unknown, Equal
σx2 and σy2 unknown
σx2 and σy2
assumed equal
*
The confidence interval for
μ1 – μ2 is:
sp2
sp2
σx2 and σy2
assumed unequal
(x  y)  t nx ny 2,α/2
Where
sp2 
sp2
nx

ny
 μX  μY  (x  y)  t nx ny 2,α/2
nx

sp2
ny
(n x  1)s2x  (n y  1)s2y
nx  ny  2
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-18
Pooled Variance Example
You are testing two computer processors for speed.
Form a confidence interval for the difference in CPU
speed. You collect the following speed data (in Mhz):
CPUx
Number Tested
17
Sample mean
3004
Sample std dev
74
CPUy
14
2538
56
Assume both populations are
normal with equal variances,
and use 95% confidence
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-19
Calculating the Pooled Variance
The pooled variance is:
2
2




n

1
S

n

1
S

17  1742  14  1562
x
x
y
y
2
S 

p
(n x  1)  (ny  1)
(17 - 1)  (14  1)
 4427.03
The t value for a 95% confidence interval is:
tnx ny 2 , α/2  t 29 , 0.025  2.045
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-20
Calculating the Confidence Limits
 The 95% confidence interval is
(x  y)  t nx ny 2,α/2
(3004  2538)  (2.054)
sp2
nx

sp2
ny
 μX  μY  (x  y)  t nx ny 2,α/2
sp2
nx

sp2
ny
4427.03 4427.03
4427.03 4427.03

 μX  μY  (3004  2538)  (2.054)

17
14
17
14
416.69  μX  μY  515.31
We are 95% confident that the mean difference in
CPU speed is between 416.69 and 515.31 Mhz.
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-21
σx2 and σy2 Unknown,
Assumed Unequal
Assumptions:
Population means,
independent
samples
 Samples are randomly and
independently drawn
σx2 and σy2 known
 Populations are normally
distributed
σx2 and σy2 unknown
 Population variances are
unknown and assumed
unequal
σx2 and σy2
assumed equal
σx2 and σy2
assumed unequal
*
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-22
σx2 and σy2 Unknown,
Assumed Unequal
(continued)
Forming interval estimates:
Population means,
independent
samples
 The population variances are
assumed unequal, so a pooled
variance is not appropriate
σx2 and σy2 known
 use a t value with  degrees
of freedom, where
σx2 and σy2 unknown
2
σx2 and σy2
assumed equal
σx2 and σy2
assumed unequal
*
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
 s2x
s2y 
( )  ( )
n y 
 n x
v
2
2
2 2


s
 sx 
  /(n x  1)   y  /(n y  1)
n 
 nx 
 y
Chap 9-23
Confidence Interval,
σx2 and σy2 Unknown, Unequal
σx2 and σy2 unknown
σx2 and σy2
assumed equal
σx2 and σy2
assumed unequal
(x  y)  t ,α/2
*
The confidence interval for
μ1 – μ2 is:
2
2
s2x s y
s2x s y

 μX  μY  (x  y)  t ,α/2

nx ny
nx ny
Where
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
v
 s2x
s2y 
( )  ( )
n y 
 n x
2
2
 s2 
 s2x 
  /(n x  1)   y  /(n y  1)
n 
 nx 
 y
2
Chap 9-24
Two Population Proportions
Population
proportions
Goal: Form a confidence interval for
the difference between two
population proportions, Px – Py
Assumptions:
Both sample sizes are large (generally at
least 40 observations in each sample)
The point estimate for
the difference is
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
pˆ x  pˆ y
Chap 9-25
Two Population Proportions
(continued)
Population
proportions
 The random variable
Z
(pˆ x  pˆ y )  (p x  p y )
pˆ x (1 pˆ x ) pˆ y (1 pˆ y )

nx
ny
is approximately normally distributed
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-26
Confidence Interval for
Two Population Proportions
Population
proportions
The confidence limits for
Px – Py are:
(pˆ x  pˆ y )  Z / 2
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
pˆ x (1 pˆ x ) pˆ y (1 pˆ y )

nx
ny
Chap 9-27
Example:
Two Population Proportions
Form a 90% confidence interval for the
difference between the proportion of
men and the proportion of women who
have college degrees.
 In a random sample, 26 of 50 men and
28 of 40 women had an earned college
degree
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-28
Example:
Two Population Proportions
(continued)
Men:
ˆp x  26  0.52
50
Women:
ˆp y  28  0.70
40
pˆ x (1 pˆ x ) pˆ y (1 pˆ y )
0.52(0.48) 0.70(0.30)



 0.1012
nx
ny
50
40
For 90% confidence, Z/2 = 1.645
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-29
Example:
Two Population Proportions
(continued)
The confidence limits are:
(pˆ x  pˆ y )  Z α/2
pˆ x (1  pˆ x ) pˆ y (1  pˆ y )

nx
ny
 (.52  .70)  1.645 (0.1012)
so the confidence interval is
-0.3465 < Px – Py < -0.0135
Since this interval does not contain zero we are 90% confident that the
two proportions are not equal
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-30
Confidence Intervals for the
Population Variance
Population
Variance
 Goal: Form a confidence interval
for the population variance, σ2
 The confidence interval is based on
the sample variance, s2
 Assumed: the population is
normally distributed
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-31
Confidence Intervals for the
Population Variance
(continued)
Population
Variance
The random variable

2
n1
(n  1)s

2
σ
2
follows a chi-square distribution
with (n – 1) degrees of freedom
2

The chi-square value n1,  denotes the number for which
P( χn21  χn21, α )  α
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-32
Confidence Intervals for the
Population Variance
(continued)
Population
Variance
The (1 - )% confidence interval
for the population variance is
(n  1)s
(n  1)s
2
σ  2
2
χ n1, α/2
χn1, 1 - α/2
2
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
2
Chap 9-33
Example
You are testing the speed of a computer processor. You
collect the following data (in Mhz):
CPUx
Sample size
17
Sample mean
3004
Sample std dev
74
Assume the population is normal.
Determine the 95% confidence interval for σx2
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-34
Finding the Chi-square Values
 n = 17 so the chi-square distribution has (n – 1) = 16
degrees of freedom
  = 0.05, so use the the chi-square values with area
0.025 in each tail:
2
χn21, α/2  χ16
, 0.025  28.85
2
χn21, 1 - α/2  χ16
, 0.975  6.91
probability
α/2 = .025
probability
α/2 = .025
216 = 6.91
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
216 = 28.85
216
Chap 9-35
Calculating the Confidence Limits
 The 95% confidence interval is
2
(n  1)s 2
(n

1)s
2

σ
 2
2
χ n1, α/2
χn1, 1 - α/2
2
(17  1)(74) 2
(17

1)(74)
 σ2 
28.85
6.91
3037  σ 2  12683
Converting to standard deviation, we are 95%
confident that the population standard deviation of
CPU speed is between 55.1 and 112.6 Mhz
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-36
Sample PHStat Output
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-37
Sample PHStat Output
(continued)
Input
Output
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-38
Sample Size Determination
Determining
Sample Size
For the
Mean
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
For the
Proportion
Chap 9-39
Margin of Error
 The required sample size can be found to reach
a desired margin of error (ME) with a specified
level of confidence (1 - )
 The margin of error is also called sampling error
 the amount of imprecision in the estimate of the
population parameter
 the amount added and subtracted to the point
estimate to form the confidence interval
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-40
Sample Size Determination
Determining
Sample Size
For the
Mean
x  z α/2
σ
n
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Margin of Error
(sampling error)
ME  z α/2
σ
n
Chap 9-41
Sample Size Determination
(continued)
Determining
Sample Size
For the
Mean
ME  z α/2
σ
n
Now solve
for n to get
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
z σ
n
2
ME
2
α/2
2
Chap 9-42
Sample Size Determination
(continued)
 To determine the required sample size for the
mean, you must know:
 The desired level of confidence (1 - ), which
determines the z/2 value
 The acceptable margin of error (sampling error), ME
 The standard deviation, σ
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-43
Required Sample Size Example
If  = 45, what sample size is needed to
estimate the mean within ± 5 with 90%
confidence?
z σ
(1.645) (45)
n

 219.19
2
2
ME
5
2
α/2
2
2
2
So the required sample size is n = 220
(Always round up)
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-44
Sample Size Determination
Determining
Sample Size
For the
Proportion
pˆ  z α/2
pˆ (1 pˆ )
n
ME  z α/2
pˆ (1  pˆ )
n
Margin of Error
(sampling error)
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-45
Sample Size Determination
(continued)
Determining
Sample Size
For the
Proportion
ME  z α/2
pˆ (1  pˆ )
n
pˆ (1 pˆ ) cannot
be larger than
0.25, when p̂ =
0.5
Substitute
0.25 for pˆ (1 pˆ )
and solve for
n to get
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
0.25 z
n
2
ME
2
α/2
Chap 9-46
Sample Size Determination
(continued)
 The sample and population proportions, p̂ and P, are
generally not known (since no sample has been taken
yet)
 P(1 – P) = 0.25 generates the largest possible margin
of error (so guarantees that the resulting sample size
will meet the desired level of confidence)
 To determine the required sample size for the
proportion, you must know:
 The desired level of confidence (1 - ), which determines the
critical z/2 value
 The acceptable sampling error (margin of error), ME
 Estimate P(1 – P) = 0.25
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-47
Required Sample Size Example
How large a sample would be necessary
to estimate the true proportion defective in
a large population within ±3%, with 95%
confidence?
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-48
Required Sample Size Example
(continued)
Solution:
For 95% confidence, use z0.025 = 1.96
ME = 0.03
Estimate P(1 – P) = 0.25
0.25 z
n
2
ME
2
α/2
2
(0.25)(1.9 6)

 1067.11
2
(0.03)
So use n = 1068
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-49
PHStat Sample Size Options
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-50
Chapter Summary
 Compared two dependent samples (paired samples)
 Formed confidence intervals for the paired difference
 Compared two independent samples
 Formed confidence intervals for the difference between two
means, population variance known, using z
 Formed confidence intervals for the differences between two
means, population variance unknown, using t
 Formed confidence intervals for the differences between two
population proportions
 Formed confidence intervals for the population variance
using the chi-square distribution
 Determined required sample size to meet confidence
and margin of error requirements
Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc.
Chap 9-51