Copyright © 2014, 2011, and 2008 Pearson Education, Inc.
Statistics for Business and
Economics
Chapter 6
Inferences Based on a Single Sample
Content
1. Identifying and Estimating the Target
Parameter
2. Confidence Interval for a Population Mean:
Normal (z) Statistic
3. Confidence Interval for a Population Mean:
Student’s t-Statistic
4. Large-Sample Confidence Interval for a
Population Proportion
5. Determining the Sample Size
6. Finite Population Correction for Simple
Random Sampling
7. Confidence Interval for a Population Variance
Learning Objectives
1. Estimate a population parameter (means,
proportion, or variance) based on a large
sample selected from the population
2. Use the sampling distribution of a statistic to
form a confidence interval for the population
parameter
3. Show how to select the proper sample size
for estimating a population parameter
Thinking Challenge
Suppose you’re
interested in the
average amount of
money that students
in this class (the
population) have on
them. How would
you find out?
Statistical Methods
• Descriptive Statistics
• Inferential Statistics
– Estimation
– Hypothesis Testing
6.1
Identifying and Estimating
the Target Parameter
Estimation Methods
• Point Estimation
• Interval Estimation
Target Parameter
The unknown population parameter (e.g., mean or
proportion) that we are interested in estimating is
called the target parameter.
Target Parameter
Determining the Target Parameter
Parameter    Key Words or Phrases                      Type of Data
µ            Mean; average                             Quantitative
p            Proportion; percentage; fraction; rate    Qualitative
Point Estimator
A point estimator of a population parameter is a
rule or formula that tells us how to use the sample
data to calculate a single number that can be used
as an estimate of the target parameter.
Point Estimation
1. Provides a single value
• Based on observations from one sample
2. Gives no information about how close the
value is to the unknown population
parameter
3. Example: Sample mean x̄ = 3 is the
point estimate of the unknown
population mean µ
Interval Estimator
An interval estimator (or confidence interval) is
a formula that tells us how to use the sample data
to calculate an interval that estimates the target
parameter.
Interval Estimation
1. Provides a range of values
• Based on observations from one sample
2. Gives information about closeness to unknown
population parameter
• Stated in terms of probability
– Knowing exact closeness requires knowing
unknown population parameter
3. Example: Unknown population mean lies between
50 and 70 with 95% confidence
6.2
Confidence Interval for a
Population Mean:
Normal (z) Statistic
Estimation Process
[Figure: a random sample with mean x̄ = 50 is drawn from a population whose mean, µ, is unknown, supporting the statement "I am 95% confident that µ is between 40 & 60."]
Key Elements of
Interval Estimation
[Figure: a confidence interval centered on the sample statistic (point estimate), running from the lower confidence limit to the upper confidence limit.]
A confidence interval provides a range of
plausible values for the population parameter.
Confidence Interval
According to the Central Limit Theorem, the
sampling distribution of the sample mean is
approximately normal for large samples. Let us
calculate the interval estimator:
$$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = \bar{x} \pm \frac{1.96\,\sigma}{\sqrt{n}}$$
That is, we form an interval from 1.96 standard
deviations below the sample mean to 1.96 standard
deviations above the mean. Prior to drawing the
sample, what are the chances that this interval will
enclose µ, the population mean?
Confidence Interval
If sample measurements yield a value of x̄ that falls
between the two lines on either side of µ, then the
interval $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$ will contain µ.
The area under the normal curve between these two
boundaries is exactly .95. Thus, the probability that a
randomly selected interval will contain µ is equal to .95.
Confidence Coefficient
The confidence coefficient is the probability that
a randomly selected confidence interval encloses
the population parameter - that is, the relative
frequency with which similarly constructed
intervals enclose the population parameter when
the estimator is used repeatedly a very large
number of times. The confidence level is the
confidence coefficient expressed as a percentage.
95% Confidence Level
If our confidence level is 95%, then in the long run,
95% of our confidence intervals will contain µ and 5%
will not.
For a confidence coefficient of .95, the area in the
two tails is .05. To choose a different confidence
coefficient we increase or decrease the area (call it α)
assigned to the tails. If we place α/2 in each tail
and $z_{\alpha/2}$ is the z-value with an area α/2 to its right, the
confidence interval with coefficient (1 – α) is
$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$$
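The long-run reading of "95% confidence" can be checked with a small simulation. The sketch below is only an illustration under assumed values (µ = 50, σ = 10, n = 36, a normal population, and the hypothetical function name coverage_rate); none of these come from the slides. It draws many samples, builds the z-interval from each, and counts how often the interval encloses µ.

```python
import math
import random
from statistics import NormalDist


def coverage_rate(mu=50.0, sigma=10.0, n=36, level=0.95, trials=10_000, seed=1):
    """Fraction of simulated z-intervals (sigma known) that contain mu.

    All default values are illustrative assumptions, not data from the slides.
    """
    rng = random.Random(seed)
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)        # z_{alpha/2} * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"Share of intervals containing mu: {rate:.3f}")
```

The printed share should land close to .95, and changing level moves it accordingly.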
Conditions Required for a Valid
Large-Sample
Confidence Interval for µ
1. A random sample is selected from the target
population.
2. The sample size n is large (i.e., n ≥ 30). Due to
the Central Limit Theorem, this condition
guarantees that the sampling distribution of x̄ is
approximately normal. Also, for large n, s will be
a good estimator of σ.
Large-Sample 100(1 – α)% Confidence
Interval for µ
$$\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}} = \bar{x} \pm z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$
where $z_{\alpha/2}$ is the z-value with an area α/2 to its right
in the standard normal distribution. The
parameter σ is the standard deviation of the
sampled population, and n is the sample size.
Note: When σ is unknown and n is large (n ≥ 30),
the confidence interval is approximately equal to
$$\bar{x} \pm z_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
where s is the sample standard deviation.
Thinking Challenge
You’re a Q/C inspector for
Gallo. The σ for 2-liter bottles
is .05 liters. A random
sample of 100 bottles showed
x̄ = 1.99 liters. What is the
90% confidence interval
estimate of the true mean
amount in 2-liter bottles?
Confidence Interval
Solution*
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
$$1.99 - 1.645\,\frac{.05}{\sqrt{100}} \le \mu \le 1.99 + 1.645\,\frac{.05}{\sqrt{100}}$$
$$1.982 \le \mu \le 1.998$$
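The large-sample interval for µ can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name z_interval is hypothetical, and the z-value comes from the standard library's NormalDist rather than a printed table, so it is 1.6449 rather than the rounded 1.645.

```python
import math
from statistics import NormalDist


def z_interval(xbar, sigma, n, level):
    """Large-sample confidence interval for mu: xbar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z-value with area alpha/2 to its right
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# The 2-liter bottle example: sigma = .05, n = 100, xbar = 1.99, 90% confidence.
low, high = z_interval(xbar=1.99, sigma=0.05, n=100, level=0.90)
print(f"{low:.3f} <= mu <= {high:.3f}")  # 1.982 <= mu <= 1.998
```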
6.3
Confidence Interval for a
Population Mean:
Student’s t-Statistic
Small Sample, σ Unknown
Instead of using the standard normal statistic
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$
use the t-statistic
$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$
in which the sample standard deviation, s, replaces
the population standard deviation, σ.
Student’s t-Statistic
The t-statistic has a sampling distribution very
much like that of the z-statistic: mound-shaped,
symmetric, with mean 0.
The primary
difference between
the sampling
distributions of t and
z is that the t-statistic
is more variable than
the z-statistic.
Degrees of Freedom
The actual amount of variability in the sampling
distribution of t depends on the sample size n. A
convenient way of expressing this dependence is
to say that the t-statistic has (n – 1) degrees of
freedom (df).
Student’s t Distribution
[Figure: the standard normal (z) curve plotted with t curves for df = 13 and df = 5. All three are bell-shaped and symmetric about 0; the t curves have ‘fatter’ tails.]
t - Table
t-value
If we want the t-value with an area of .025 to its
right and 4 df, we look in the table under the
column t.025 for the entry in the row corresponding
to 4 df. This entry is t.025 = 2.776. The
corresponding standard normal z-score is z.025 =
1.96.
Small-Sample
Confidence Interval for µ
$$\bar{x} \pm t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
where $t_{\alpha/2}$ is based on (n – 1) degrees of freedom.
Conditions Required for a
Valid Small-Sample
Confidence Interval for µ
1. A random sample is selected from the target
population.
2. The population has a relative frequency
distribution that is approximately normal.
Estimation Example
Mean (σ Unknown)
A random sample of n = 25 has x̄ = 50 and s = 8.
Set up a 95% confidence interval estimate for µ.
$$\bar{x} - t_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$
$$50 - 2.064\,\frac{8}{\sqrt{25}} \le \mu \le 50 + 2.064\,\frac{8}{\sqrt{25}}$$
$$46.70 \le \mu \le 53.30$$
Thinking Challenge
You’re a time study analyst
in manufacturing. You’ve
recorded the following task
times (min.):
3.6, 4.2, 4.0, 3.5, 3.8, 3.1.
What is the 90% confidence
interval estimate of the
population mean task time?
Confidence Interval Solution*
• x̄ = 3.7
• s = .38987
• n = 6, df = n – 1 = 6 – 1 = 5
• t.05 = 2.015
$$3.7 - 2.015\,\frac{.38987}{\sqrt{6}} \le \mu \le 3.7 + 2.015\,\frac{.38987}{\sqrt{6}}$$
$$3.379 \le \mu \le 4.021$$
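The small-sample interval can be computed directly from the raw task times. The sketch below assumes the critical value t.05 = 2.015 (5 df) is read from a t-table and passed in, because Python's standard library has no t distribution; the function name t_interval is hypothetical.

```python
import math
from statistics import mean, stdev


def t_interval(data, t_crit):
    """Small-sample confidence interval for mu: xbar +/- t_{alpha/2} * s / sqrt(n).

    t_crit must be looked up for (n - 1) degrees of freedom by the caller.
    """
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    half_width = t_crit * s / math.sqrt(n)
    return xbar, s, xbar - half_width, xbar + half_width


times = [3.6, 4.2, 4.0, 3.5, 3.8, 3.1]  # task times in minutes
xbar, s, low, high = t_interval(times, t_crit=2.015)
print(f"xbar = {xbar:.1f}, s = {s:.5f}")
print(f"{low:.3f} <= mu <= {high:.3f}")
```

For these six task times the sketch gives s ≈ .38987 and an interval of roughly 3.379 to 4.021 minutes.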
6.4
Large-Sample Confidence
Interval for a Population
Proportion
Sampling Distribution of p̂
1. The mean of the sampling distribution of p̂ is p;
that is, p̂ is an unbiased estimator of p.
2. The standard deviation of the sampling
distribution of p̂ is $\sigma_{\hat{p}} = \sqrt{pq/n}$,
where q = 1 – p.
3. For large samples, the sampling distribution of p̂
is approximately normal. A sample size is
considered large if both np̂ ≥ 15 and nq̂ ≥ 15.
Large-Sample Confidence
Interval for p
$$\hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{pq}{n}} \approx \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
where $\hat{p} = x/n$ and $\hat{q} = 1 - \hat{p}$.
Note: When n is large, p̂ can approximate the
value of p in the formula for $\sigma_{\hat{p}}$.
Conditions Required for a
Valid Large-Sample
Confidence Interval for p
1. A random sample is selected from the target
population.
2. The sample size n is large. (This condition will be
satisfied if both np̂ ≥ 15 and nq̂ ≥ 15. Note that np̂
and nq̂ are simply the number of successes and
number of failures, respectively, in the sample.)
Estimation Example
Proportion
A random sample of 400 graduates showed
32 went to graduate school. Set up a 95%
confidence interval estimate for p.
$$\hat{p} = \frac{x}{n} = \frac{32}{400} = .08$$
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
$$.08 - 1.96\sqrt{\frac{(.08)(.92)}{400}} \le p \le .08 + 1.96\sqrt{\frac{(.08)(.92)}{400}}$$
$$.053 \le p \le .107$$
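A short Python sketch of the large-sample interval for p, including the "at least 15 successes and 15 failures" check. The function name proportion_interval is hypothetical, and the z-value is computed with the standard library instead of being read from a table.

```python
import math
from statistics import NormalDist


def proportion_interval(x, n, level):
    """Large-sample confidence interval for p: p_hat +/- z_{alpha/2} * sqrt(p_hat*q_hat/n)."""
    # n*p_hat and n*q_hat are just the counts of successes and failures.
    if x < 15 or n - x < 15:
        raise ValueError("need at least 15 successes and 15 failures")
    p_hat = x / n
    q_hat = 1.0 - p_hat
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z * math.sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


# 32 of 400 graduates went to graduate school, 95% confidence.
low, high = proportion_interval(x=32, n=400, level=0.95)
print(f"{low:.3f} <= p <= {high:.3f}")  # 0.053 <= p <= 0.107
```

The same function applies to any count that meets the sample-size condition; a call with fewer than 15 successes or failures raises an error instead of returning an unreliable interval.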
Thinking Challenge
You’re a production
manager for a newspaper.
You want to find the %
defective. Of 200
newspapers, 35 had
defects. What is the 90%
confidence interval estimate
of the population
proportion defective?
Confidence Interval
Solution*
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
$$.175 - 1.645\sqrt{\frac{(.175)(.825)}{200}} \le p \le .175 + 1.645\sqrt{\frac{(.175)(.825)}{200}}$$
$$.1308 \le p \le .2192$$
Adjusted (1 – α)100%
Confidence Interval for a
Population Proportion, p
$$\tilde{p} \pm z_{\alpha/2}\sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}}$$
where $\tilde{p} = \frac{x + 2}{n + 4}$ is the adjusted sample proportion
of observations with the characteristic of interest, x
is the number of successes in the sample, and n is
the sample size.
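The adjusted interval translates to code almost term by term. In this sketch the function name adjusted_proportion_interval and the data (x = 3 successes in n = 20 trials) are hypothetical, chosen only to show a case where the large-sample condition of 15 successes and 15 failures fails.

```python
import math
from statistics import NormalDist


def adjusted_proportion_interval(x, n, level):
    """Adjusted interval: p_tilde +/- z_{alpha/2} * sqrt(p_tilde*(1 - p_tilde)/(n + 4))."""
    p_tilde = (x + 2) / (n + 4)  # adjusted sample proportion
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z * math.sqrt(p_tilde * (1.0 - p_tilde) / (n + 4))
    return p_tilde, p_tilde - half_width, p_tilde + half_width


# Hypothetical data: 3 successes in 20 trials, 95% confidence.
p_tilde, low, high = adjusted_proportion_interval(x=3, n=20, level=0.95)
print(f"p_tilde = {p_tilde:.4f}, interval = ({low:.4f}, {high:.4f})")
```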
6.5
Determining the Sample Size
Sampling Error
In general, we express the reliability associated
with a confidence interval for the population mean
µ by specifying the sampling error within which
we want to estimate µ with 100(1 – α)% confidence.
The sampling error (denoted SE), then, is equal to
the half-width of the confidence interval.
Sample Size Determination
for 100(1 – α)%
Confidence Interval for µ
In order to estimate µ with a sampling error (SE)
and with 100(1 – α)% confidence, the required
sample size is found as follows:
$$z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) = SE$$
The solution for n is given by the equation
$$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2}$$
Sample Size Example
What sample size is needed to be 90%
confident the mean is within ±5? A pilot
study suggested that the standard deviation
is 45.
$$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.2$$
Rounding up, n = 220.
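The sample-size formula for µ is easy to wrap in a helper that always rounds up. This is a sketch with a hypothetical function name (sample_size_for_mean); σ is assumed to come from a pilot study or other prior estimate.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, se, level):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= se, i.e. n = (z*sigma/se)^2 rounded up."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return math.ceil((z * sigma / se) ** 2)


# Pilot-study sigma = 45, sampling error 5, 90% confidence.
n = sample_size_for_mean(sigma=45, se=5, level=0.90)
print(n)  # 220
```

Using math.ceil rather than round matters: rounding 219.2 down to 219 would leave the sampling error slightly above the target.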
Sample Size Determination
for 100(1 – α)%
Confidence Interval for p
In order to estimate p with a sampling error SE and
with 100(1 – α)% confidence, the required sample
size is found by solving the following equation for
n:
$$z_{\alpha/2}\sqrt{\frac{pq}{n}} = SE$$
The solution for n can be written as follows:
$$n = \frac{(z_{\alpha/2})^2\,(pq)}{(SE)^2}$$
Note: Always round n up to the nearest
integer value.
Sample Size Example
What sample size is needed to estimate p
with a 90% confidence interval of width .03?
$$SE = \frac{\text{width}}{2} = \frac{.03}{2} = .015$$
$$n = \frac{(z_{\alpha/2})^2\,(pq)}{(SE)^2} = \frac{(1.645)^2 (.5)(.5)}{(.015)^2} = 3006.69$$
Rounding up, n = 3007. With no prior estimate of p, the
solution uses p = q = .5, the values that maximize pq.
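A sketch of the sample-size calculation for p. The function name sample_size_for_proportion is hypothetical, and the default p = .5 is the conservative choice used when no prior estimate of p is available.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(se, level, p=0.5):
    """n = (z_{alpha/2})^2 * p * q / se^2, rounded up; p = .5 gives the largest n."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return math.ceil(z ** 2 * p * (1.0 - p) / se ** 2)


# Interval width .03, so the sampling error is the half-width .015; 90% confidence.
n = sample_size_for_proportion(se=0.03 / 2, level=0.90)
print(n)  # 3007
```

Supplying a prior estimate, for example p = 0.2, reduces the required n because pq is smaller than .25.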
Thinking Challenge
You work in Human
Resources at Merrill Lynch.
You plan to survey
employees to find their
average medical expenses.
You want to be 95% confident
that the sample mean is
within ±$50 of the population mean.
A pilot study showed that σ
was about $400. What
sample size do you use?
Sample Size Solution*
$$n = \frac{(z_{\alpha/2})^2\,\sigma^2}{(SE)^2} = \frac{(1.96)^2 (400)^2}{(50)^2} = 245.86$$
Rounding up, n = 246.
Key Ideas
Population Parameters, Estimators, and
Standard Errors
Parameter (θ)    Estimator (θ̂)    Standard Error of Estimator ($\sigma_{\hat{\theta}}$)    Estimated Std Error ($\hat{\sigma}_{\hat{\theta}}$)
Mean, µ          x̄                $\sigma/\sqrt{n}$                                        $s/\sqrt{n}$
Proportion, p    p̂                $\sqrt{pq/n}$                                            $\sqrt{\hat{p}\hat{q}/n}$
Key Ideas
Population Parameters, Estimators, and
Standard Errors
Confidence Interval: An interval that encloses
an unknown population parameter with a certain
level of confidence (1 – α)
Confidence Coefficient: The probability (1 – α)
that a randomly selected confidence interval
encloses the true value of the population
parameter.
Key Ideas
Key Words for Identifying the Target
Parameter
µ – Mean, Average
p – Proportion, Fraction, Percentage, Rate,
Probability
σ² – Variance
Key Ideas
Commonly Used z-Values for a Large-Sample
Confidence Interval
90% CI: α = .10, z.05 = 1.645
95% CI: α = .05, z.025 = 1.96
98% CI: α = .02, z.01 = 2.326
99% CI: α = .01, z.005 = 2.575
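These table values can be reproduced with the inverse normal CDF in Python's standard library. The sketch below is only a check on the table; the dictionary name z_values is hypothetical. Expect small rounding differences: the computed value for 99% is about 2.576, where the table above lists 2.575.

```python
from statistics import NormalDist

# z_{alpha/2} is the z-value with area alpha/2 to its right,
# i.e. the (1 - alpha/2) quantile of the standard normal distribution.
z_values = {}
for level in (0.90, 0.95, 0.98, 0.99):
    alpha = 1.0 - level
    z_values[level] = NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level, z in z_values.items():
    print(f"{level:.0%} CI: alpha = {1.0 - level:.2f}, z = {z:.3f}")
```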
Key Ideas
Determining the Sample Size n
Estimating µ: $n = (z_{\alpha/2})^2\,\sigma^2 / (ME)^2$
Estimating p: $n = (z_{\alpha/2})^2\,(pq) / (ME)^2$
where ME (margin of error) is the sampling error denoted SE in Section 6.5.
Key Ideas
Finite Population Correction Factor
Required when n/N > .05
Key Ideas
Confidence Interval for Population
Variance
Uses chi-square (χ²) distribution

Need to know
and df.
2
Key Ideas
Illustrating the Notion of “95%
Confidence”