• Study Resource
• Explore

Survey

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
```Business Statistics:
A Decision-Making Approach
8th Edition
Chapter 8
Estimating Single
Population Parameters
8-1
Chapter Goals
After completing this chapter, you should be able to:
 Distinguish between a point estimate and a confidence
interval estimate

Construct and interpret a confidence interval estimate for a
single population mean using both the z and t distributions

Determine the required sample size to estimate a single
population mean or a proportion within a specified margin
of error

Form and interpret a confidence interval estimate for a
single population proportion
8-2
Overview of the Chapter


Builds upon the material from Chapter 1 and 7
Introduces using sample statistics to estimate population
parameters


Confidence Intervals for the Population Mean, μ




expensive, time consuming and sometimes not feasible
when Population Standard Deviation σ is Known
when Population Standard Deviation σ is Unknown
Confidence Intervals for the Population Proportion, p
Determining the Required Sample Size for means and
proportions
8-3
Estimation Process
Random Sample
(point estimate)
Population
(mean, μ, is
unknown)
Sample
Mean
x = 50
Confidence Level
I am 95%
confident that
μ is between
40 & 60.
confidence
interval
8-4
Point Estimate

Suppose a poll indicate that 62% (sample mean) of the
people favor limiting property taxes to 1% of the market
value of the property.



The 62% is the point estimate of the true population of people
who favor the property-tax limitation.
EPA Automobile Mileage Test Result (point estimate)
A point estimate is a single number, used to estimate
an unknown population parameter
8-5
Confidence Interval

The point estimate is not likely to exactly equal the
population parameter because of sampling error.



Probability of “sample mean = population mean” is zero
With sample mean, it is impossible to determining how
far the sample mean is from the population mean.
To overcome this problem, “confidence interval” can be
used as the most common procedure.

Stated in terms of level of confidence: Never 100% sure
8-6
Confidence Level

Confidence Level (not same as critical value, α)


A percentage (less than 100%)


Describes how strongly we believe that a particular
sampling method will produce a confidence interval that
includes the true population parameter.
Most common: 90% (α = 0.1), 95% (α = 0.05), 99%
Suppose confidence level = 95%

In the long run, 95% of all the confidence intervals will
contain the unknown true parameter
8-7
General Formula

The general formula for the
confidence interval is:
σ
xz
n
Point Estimate  (Critical Value)(Standard Error)
z-value (or t value) based on the level of
confidence desired
8-8
How to Find the Critical Value

The Central Limit Theorem states that the sampling
distribution of a statistic will be normal or nearly normal,
if any of the following conditions apply.


n > 30: will give a sampling distribution that is nearly normal
The sampling distribution of the mean is normally distributed
because the population distribution is normally distributed.
8-9
How to Find the Critical Value

When one of these conditions is satisfied, the critical value
can be expressed as a z score or as a t score. To find the


Compute alpha (α): α = 1 - (confidence level / 100) = 1 - 0.95 = 0.05
Find the critical probability (p*): p* = 1 – α/2 (because there are lower
confidence limit and upper confidence limit) =1 - 0.05/2 = 0.975


To express the critical value as a z score, find the z score having a
cumulative probability equal to the critical probability (p*).
See the example 8-1 on page 335
8-10
t score as the critical value

When the population standard deviation is unknown or
when the sample size is small, the t score is preferred.


Find the degrees of freedom (DF). When estimating a mean
score or a proportion from a single sample, DF is equal to the
sample size minus one. For other applications, the degrees of
freedom may be calculated differently.
The critical t score (t*) is the t score having degrees of freedom
equal to DF and a cumulative probability equal to the critical
probability (p*).
8-11
From Chapter 7

The standard deviation of the possible sample means
computed from all random samples of size n is
equal to the population standard deviation divided
by the square root of the sample size:
σ
xz
n
Also called the
standard error
σ
σx 
n
8-12
Margin of Error

Margin of Error (e): the amount added and subtracted
to the point estimate to form the confidence interval
Example: Margin of error for estimating μ, σ known:
σ
xz
n
σ
ez
n
8-13
Point and Interval Estimates

So, a confidence interval provides additional
information about variability within a range of z-values

Lower
Confidence
Limit
The interval incorporates the sampling error
Point Estimate
Upper
Confidence
Limit
Width of
confidence interval
8-14
Using Analysis ToolPak
8-15
Using Analysis ToolPak
Confidence Interval
119.9 (mean) ± 2.59
117.31 ------ 122.49
Don’t even worry
about p* = 1 - α/2
Margin of
Error
8-16
Using Analysis ToolPak
(large sample: use normal (z) distribution automatically )
8-17
Using Analysis ToolPak
(small sample: use (t) distribution automatically)
8-18
Factors Affecting Margin of Error
σ
ez
n

Data variation, σ :
e
as σ

Sample size, n :
e
as n

Level of confidence, 1 -  :
e
if 1 - 

Video Lecture: Confidence Intervals
8-19
Confidence Intervals
Confidence
Intervals
Population
Mean
σ Known
Population
Proportion
σ Unknown
8-20
Confidence Interval for μ
(σ Known)

Assumptions



Population standard deviation σ is known
If population is not normal, use larger sample n > 30
(Central Limit Theorem)
Confidence interval estimate
σ
xz
n
8-21
Finding the Critical Value

0.95/2 = 0.475
Consider a 95% find
on z table
confidence
Normsinv(0.5 – 0.475)=1.959963…
interval:
z  1.96
(1   ) / 2  .95/2
α
 .025
2
z units:
x units:
α
 .025
2
-z = -1.96
Lower
Confidence
Limit
xz
σ
n
0
Point Estimate
x
z = 1.96
Upper
Confidence
Limit
xz
σ
n
8-22
Common Levels of Confidence

Commonly used confidence levels are 90%,
95%, and 99%
Confidence
Level
80%
90%
95%
98%
99%
99.8%
99.9%
Critical
value, z
1.28
1.645
1.96
2.33
2.58
3.08
3.27
8-23
Computing a Confidence Interval
Estimate for the Mean (s known)
1.
2.
3.
4.
5.
Select a random sample of size n
Specify the confidence level
Compute the sample mean
Determine the standard error
Determine the critical value (z) from the
normal table
6. Compute the confidence interval estimate
8-24
Example


A sample of 11 circuits from a large normal
population has a mean resistance of 2.20
ohms. We know from past testing that the
population standard deviation is 0.35 ohms.
Determine a 95% (α = 0.05) confidence
interval for the true mean resistance of the
population.
8-25
Example Solution
(continued)
Solution:
σ
xz
n
 2.20  1.96 (0.35/ 11)
 2.20  .2068
1.9932
2.4068
8-26
Interpretation

We are 95% confident that the true mean resistance is
between 1.9932 and 2.4068 ohms


Although the true mean may or may not be in this interval, 95% of
intervals formed in this manner will contain the true mean
An incorrect interpretation is that there is 95% probability that this
interval contains the true population mean.
8-27
Confidence Interval for μ
(σ Unknown)

In most real world situations, population mean and
StdDev are NOT KNOWN.

When the population standard deviation is unknown or
when the sample size is small, the t score is preferred.


When the sample size is large (n > 30), it doesn't make much
difference. Both approaches yield similar results.
So we use the t distribution instead of the normal
distribution.
8-28
Confidence Interval for μ
(σ Unknown)
(continued)

Assumptions

Population standard deviation is unknown




Sample size is small
If population is not normal, use large sample n > 30
Use Student’s t Distribution
Confidence Interval Estimate
s
xt
n
8-29
Student’s t Distribution

The t is a family of distributions

The t value depends on degrees of freedom (d.f.)


For example, if n = 28, then the d.f. is 27.
As the d.f. increase, the t distribution approaches the
normal distribution (see the t distribution simulation
one the class website)
d.f. = n – 1

Only n-1 independent pieces of data information left in the
sample because the sample mean has already been obtained
8-30
Student’s t Distribution
Note: t compared to z as n increases
As n the estimate of s
becomes better so t
converges to z
Standard
Normal
(t with df = )
t (df = 13)
t-distributions are bellshaped and symmetric, but
have ‘fatter’ tails than the
normal
t (df = 5)
0
t
8-31
Student’s t Table
Confidence Level
df 0.50 0.80 0.90
Let: n = 3
df = n - 1 = 2
confidence level: 90%
1 1.000 3.078 6.314
2 0.817 1.886 2.920
/2 = 0.05
3 0.765 1.638 2.353
The body of the table
contains t values, not
probabilities
0
2.920 t
8-32
t Distribution Values
With comparison to the z value
Confidence
t
Level
(10 d.f.)
t
(20 d.f.)
t
(30 d.f.)
z
____
0.80
1.372
1.325
1.310
1.28
0.90
1.812
1.725
1.697
1.64
0.95
2.228
2.086
2.042
1.96
0.99
3.169
2.845
2.750
2.58
Note: t compared to z as n increases
8-33
Example
A random sample of n = 25 has x = 50 and
s = 8. Form a 95% confidence interval for μ

d.f. = n – 1 = 24, so
t/2 , n 1  t 0.025,24  2.0639
The confidence interval is
s
8
xt
 50  (2.0639)
n
25
46.698
53.302
8-34
TINV



NORMSINV(p) gives the z-value that puts probability
(area) p to the left of that value of z.
TINV(p,DF) gives the t-value that puts one-half the
probability (area) to the right with DF degrees of
freedom.

Only need to review “TINV”
8-35
TINV Example

For 95% confidence intervals we use α = .05,
so that we are looking t.025.


Suppose d.f. is 17
t.025,17 = t value that puts .025 to the right of t with 17
degrees of freedom. Since TINV splits α = .05 to
.025, this value is =TINV(.05,17).
8-36
In-class Practice Example

The file Excel file Coffee on the class website contains
a random sample of 144 German coffee drinkers and
measures the annual coffee consumption in kilograms
for each sampled coffee drinker. A marketing research
firm wants to use this information to develop an
advertising campaign to increase German coffee
consumption.

Develop a 95% , 90% and 75% confidence interval estimates
for the mean annual coffee consumption of German coffee
drinkers.
8-37
Determining Required Sample Size

Wishful situation


High confidence level, low margin of error, and small sample size
In reality, conflict among three….



For a given sample size, a high confidence level will tend to
generate a large margin of error
For a given confidence level, a small sample size will result in an
increased margin of error
Reducing of margin of error requires either reducing the confidence
level or increasing the sample size, or both
8-38
Determining Required Sample Size

How large a sample size do I really need?


Sampling budget constraint
Cost of selecting each item in the sample
8-39
Determining Required Sample Size
When σ is known

The required sample size can be found to
reach a desired margin of error (e) and
level of confidence (1 - )

Required sample size, σ known:
2
zσ
z σ
  2
n  
e
 e 
2
2
8-40
Required Sample Size Example
If s = 45 (known), what sample size is
needed to be 90% confident of being correct
within ± 5?
zσ
1.645 (45)
n 2 
 219.19
2
e
5
2
2
2
2
So the required sample size is n = 220
(Always round up)
8-41
If σ is unknown (most real situation)

If σ is unknown, three possible approaches
1.
2.
3.
Use a value for σ that is expected to be at least as large as
the true σ
Select a pilot sample (smaller than anticipated sample size)
and then estimate σ with the pilot sample standard
deviation, s
Use the range of the population to estimate the population’s
Std Dev. As we know, µ ± 3σ contains virtually all of the
data. Range = max – min.
Thus, R = (µ + 3σ) – (µ - 3σ) = 6σ. Therefore, σ = R/6 (or
R/4 for a more conservative estimate, producing a larger
sample size)
8-42
Example when σ is unknown

Jackson’s Convenience Stores


Using a pilot sample approach
Available on the class website
8-43
Confidence Intervals for the
Population Proportion, π

An interval estimate for the population
proportion ( π ) can be calculated by adding an
allowance for uncertainty to the sample
proportion ( p ).

For example, estimation of the proportion of
customers who are satisfied with the service

Sample proportion, p = x/n
8-44
Confidence Intervals for the
Population Proportion, π
(continued)

Recall that the distribution of the sample
proportion is approximately normal if the
sample size is large, with standard deviation
π(1 π)
σπ 
n

See Chpt. 7!!
We will estimate this with sample data:
p(1 p)
sp 
n
8-45
Confidence Interval Endpoints

Upper and lower confidence limits for the
population proportion are calculated with the
formula
p(1 p)
pz
n

where



z is the standard normal value for the level of confidence desired
p is the sample proportion
n is the sample size
8-46
Example

A random sample of 100 people
shows that 25 are left-handed.

Form a 95% confidence interval for
the true proportion of left-handers
8-47
Example
(continued)

A random sample of 100 people shows
that 25 are left-handed. Form a 95%
confidence interval for the true proportion
of left-handers.
1. p  25/100  0.25
2. Sp  p(1  p)/n  0.25(0.75)/100  0.0433
3.
0.25  1.96 (0.0433)
0.1651
0.3349
8-48
Interpretation

We are 95% confident that the true
percentage of left-handers in the population
is between
16.51% and 33.49%

Although this range may or may not contain
the true proportion, 95% of intervals formed
from samples of size 100 in this manner will
contain the true proportion.
8-49
Changing the sample size

Increases in the sample size reduce
the width of the confidence interval.
Example:

If the sample size in the above example is
doubled to 200, and if 50 are left-handed in the
sample, then the interval is still centered at
0.25, but the width shrinks to
0.19
0.31
8-50
Finding the Required Sample Size
for Proportion Problems
Define the
margin of error:
π(1 π)
ez
n
z π (1  π)
2
n
Solve for n:
Will be in %
units
e
2
π can be estimated with a pilot sample, if
necessary (or conservatively use π = 0.50 –
worst possible variation thus the largest sample size)
8-51
What sample size...?

How large a sample would be necessary
to estimate the true proportion defective in
a large population within 3%, with 95%
confidence?
(Assume a pilot sample yields p = 0.12)
8-52
What sample size...?
(continued)
Solution:
For 95% confidence, use Z = 1.96
e = 0.03
p = 0.12, so use this to estimate π
n
z 2 π (1  π)
e2
(1.96) 2 (0.12)(1  0.12)

 450.74
2
(0.03)
So use n = 451
8-53
Using PHStat




PHStat can be used for confidence
intervals for the mean or proportion
Two options for the mean: known and
unknown population standard deviation
Required sample size can also be found

The link available on the class website
8-54
PHStat Interval Options
options
8-55
PHStat Sample Size Options
8-56
Using PHStat
(for μ, σ unknown)
A random sample of n = 25 has x = 50 and
s = 8. Form a 95% confidence interval for μ
8-57
Using PHStat
(sample size for proportion)
How large a sample would be necessary to estimate the true
proportion defective in a large population within 3%, with
95% confidence?
(Assume a pilot sample yields p = 0.12)
8-58
Chapter Summary






Discussed point estimates
Introduced interval estimates
Discussed confidence interval estimation for
the mean [σ known]
Discussed confidence interval estimation for
the mean [σ unknown]
Addressed determining sample size for mean
and proportion problems
Discussed confidence interval estimation for
the proportion