Chap. 10: Estimation
Estimation: Confidence Intervals
Learning Objectives
1. State What Is Estimated
2. Distinguish Point & Interval Estimates
3. Explain Interval Estimates
4. Compute Confidence Interval Estimates for Population Mean & Proportion
5. Compute Sample Size
Thinking Challenge
Suppose you’re interested in the average amount of money that students in this class (the population) have on them. How would you find out?
Statistical Methods
[Diagram: Statistical Methods divide into Descriptive Statistics and Inferential Statistics; Inferential Statistics divides into Estimation and Hypothesis Testing.]
Estimation Process
[Diagram: The population mean, $\mu$, is unknown. A random sample is drawn and gives a sample mean of $\bar{X} = 50$. Conclusion: "I am 95% confident that $\mu$ is between 40 & 60."]
Unknown Population Parameters Are Estimated
Estimate Population Parameter...      with Sample Statistic
Mean          $\mu$                   $\bar{x}$
Proportion    $p$                     $\hat{p}$
Variance      $\sigma^2$              $s^2$
Differences   $\mu_1 - \mu_2$         $\bar{x}_1 - \bar{x}_2$
Estimator and Estimate
1. The estimator is a random variable used to estimate a population parameter
  – Examples: sample mean, sample proportion, sample median
  – The sample mean $\bar{X}$ is an estimator of the population mean $\mu$
2. The estimate is the numerical value of the estimator
  – If $\bar{X} = 3$, then 3 is the estimate of $\mu$
Properties of Mean
• Unbiasedness
  – Mean of sampling distribution equals population mean
• Efficiency
  – Sample mean comes closer to population mean than any other unbiased estimator
• Consistency
  – As sample size increases, variation of sample mean from population mean decreases
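Unbiasedness
[Figure: sampling distributions $P(\bar{X})$ plotted against $\bar{X}$ for an unbiased estimator (A), centered where $\mu_{\bar{x}} = \mu_x$, and a biased estimator (C), centered away from $\mu_x$.]
Efficiency
[Figure: sampling distribution of the mean and sampling distribution of the median (curves A and B), both centered at $\mu_x$; axes $P(\bar{X})$ against $\bar{X}$.]
Consistency
[Figure: sampling distributions for a larger and a smaller sample size (curves A and B), both centered at $\mu_x$; axes $P(\bar{X})$ against $\bar{X}$.]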
Estimation Methods
[Diagram: Estimation divides into Point Estimation and Interval Estimation; Interval Estimation covers Confidence Interval, Bootstrapping, and Prediction Interval.]
Point Estimation
• Provides a single value
  – Based on observations from 1 sample
• Gives no information about how close the value is to the unknown population parameter
• Example: sample mean $\bar{X} = 3$ is a point estimate of the unknown population mean
Interval Estimation
• Provides a range of values
  – Based on observations from 1 sample
• Gives information about closeness to the unknown population parameter
  – Stated in terms of probability
  – Knowing exact closeness requires knowing the unknown population parameter
• Example: unknown population mean lies between 50 & 70 with 95% confidence
Key Elements of Interval Estimation
A probability that the population parameter falls somewhere within the interval.
[Diagram: a confidence interval running from the lower confidence limit to the upper confidence limit, with the sample statistic (point estimate) between them.]
Confidence Limits for Population Mean
Parameter = Statistic ± Error
© 1984-1994 T/Maker Co.
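(1) $\mu = \bar{X} \pm \text{Error}$
(2) $\text{Error} = \bar{X} - \mu$ or $\mu - \bar{X}$
(3) $Z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{x}}} = \dfrac{\text{Error}}{\sigma_{\bar{x}}}$
(4) $\text{Error} = Z\sigma_{\bar{x}}$
(5) $\mu = \bar{X} \pm Z\sigma_{\bar{x}}$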
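Many Samples Have Same Confidence Interval
$\bar{X} = \mu \pm Z\sigma_{\bar{x}}$
[Figure: sampling distribution of $\bar{X}$ centered at $\mu$. 90% of samples fall within $\mu \pm 1.65\sigma_{\bar{x}}$, 95% within $\mu \pm 1.96\sigma_{\bar{x}}$, and 99% within $\mu \pm 2.58\sigma_{\bar{x}}$.]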
Level of Confidence
• Probability that the unknown population parameter falls within the interval
• Denoted $100(1 - \alpha)\%$
  – $\alpha$ is the probability that the parameter is not within the interval
• Typical values are 99%, 95%, 90%
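Intervals & Level of Confidence
[Figure: sampling distribution of the mean, centered at $\mu_{\bar{x}} = \mu$ with standard error $\sigma_{\bar{x}}$; the central area is $1 - \alpha$ and each tail has area $\alpha/2$. Beneath it, a large number of intervals computed from repeated samples.]
Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$.
$100(1 - \alpha)\%$ of intervals contain $\mu$; $100\alpha\%$ do not.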
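The idea that a fixed share of intervals from repeated samples contains the population mean can be checked by simulation. The sketch below is illustrative only: the function name `coverage`, the population values (mean 50, standard deviation 10), the sample size, the number of trials, and the seed are all assumptions chosen for the demonstration, not values from the slides.

```python
import random
from statistics import NormalDist, mean


def coverage(mu=50.0, sigma=10.0, n=25, confidence=0.95, trials=2000, seed=1):
    """Return the fraction of simulated z-intervals that contain mu.

    All defaults are hypothetical values picked for illustration.
    """
    rng = random.Random(seed)
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one random sample of size n and compute its mean.
        x_bar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials


print(coverage())                  # close to 0.95
print(coverage(confidence=0.90))   # close to 0.90
```

The printed fractions vary slightly with the seed and the number of trials, but they should sit near the chosen confidence level.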
Factors Affecting Interval Width
• Data dispersion
  – Measured by $\sigma$
• Sample size
  – $\sigma_{\bar{x}} = \sigma / \sqrt{n}$
• Level of confidence $(1 - \alpha)$
  – Affects $Z$
Intervals extend from $\bar{X} - Z\sigma_{\bar{X}}$ to $\bar{X} + Z\sigma_{\bar{X}}$.
Confidence Interval Estimates
[Diagram: Confidence Intervals for the Mean ($\sigma$ known or $\sigma$ unknown), Proportion, Variance, and Finite Population.]
Confidence Interval Estimate
Mean ($\sigma_X$ Known)
• Assumptions
  – Population standard deviation is known
  – Population is normally distributed
  – If not normal, can be approximated by normal distribution ($n \ge 30$)
• Confidence interval estimate
$$\bar{X} - Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
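Estimation Example
Mean ($\sigma_X$ Known)
The mean of a random sample of $n = 25$ is $\bar{X} = 50$. Set up a 95% confidence interval estimate for $\mu_X$ if $\sigma = 10$.
$$\bar{X} - Z \cdot \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z \cdot \frac{\sigma}{\sqrt{n}}$$
$$50 - 1.96 \cdot \frac{10}{\sqrt{25}} \le \mu \le 50 + 1.96 \cdot \frac{10}{\sqrt{25}}$$
$$46.08 \le \mu \le 53.92$$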
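The known-standard-deviation interval can also be computed in a few lines. This is a minimal sketch using only the Python standard library; the function name `z_interval` is a hypothetical helper, and the critical value comes from `statistics.NormalDist` rather than a printed Z table, so it is slightly more precise than 1.96.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when the population sigma is known."""
    # Two-sided critical value Z_{alpha/2}.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Numbers from the example: n = 25, sample mean 50, sigma = 10, 95% confidence.
low, high = z_interval(50, 10, 25, 0.95)
print(round(low, 2), round(high, 2))  # 46.08 53.92
```

The same helper applies to any confidence level; only the `confidence` argument changes.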
Thinking Challenge
You’re a Q/C inspector for Gallo. The $\sigma$ for 2-liter bottles is .05 liters. A random sample of 100 bottles showed $\bar{X} = 1.99$ liters. What is the 90% confidence interval estimate of the true mean amount in 2-liter bottles?
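Confidence Interval Solution*
$$\bar{X} - Z \cdot \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z \cdot \frac{\sigma}{\sqrt{n}}$$
$$1.99 - 1.645 \cdot \frac{.05}{\sqrt{100}} \le \mu \le 1.99 + 1.645 \cdot \frac{.05}{\sqrt{100}}$$
$$1.982 \le \mu \le 1.998$$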
Confidence Interval Estimate
Mean ($\sigma$ Unknown)
• Assumptions
  – Population standard deviation is unknown
  – Population must be normally distributed
  – $n < 30$ (See comment)
• Use Student’s t distribution
• Confidence interval estimate
$$\bar{X} - t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}}$$
Student’s t Distribution
[Figure: the standard normal curve ($Z$) compared with t curves for df = 13 and df = 5, all centered at 0. The t curves are bell-shaped and symmetric, with ‘fatter’ tails.]
Student’s t Table
Area in Both Tails
df    .50     .20     .10
1     1.000   3.078   6.314
2     0.817   1.886   2.920
3     0.765   1.638   2.353
(t values)
Assume: $n = 3$, $df = n - 1 = 2$, $\alpha = .10$, $\alpha/2 = .05$. The table gives $t = 2.920$, which leaves an area of .05 in each tail.
Degrees of Freedom (df)
• Number of observations that are free to vary after the sample statistic has been calculated
• Example: sum of 3 numbers is 6
  – $X_1 = 1$ (or any number)
  – $X_2 = 2$ (or any number)
  – $X_3 = 3$ (cannot vary)
  – Sum = 6
• Degrees of freedom $= n - 1 = 3 - 1 = 2$
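Estimation Example
Mean ($\sigma_X$ Unknown)
A random sample of $n = 25$ has $\bar{X} = 50$ & $S = 8$. Set up a 95% confidence interval estimate for $\mu$.
$$\bar{X} - t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}}$$
$$50 - 2.064 \cdot \frac{8}{\sqrt{25}} \le \mu \le 50 + 2.064 \cdot \frac{8}{\sqrt{25}}$$
$$46.70 \le \mu \le 53.30$$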
Thinking Challenge
You’re a time study analyst in manufacturing. You’ve recorded the following task times (min.): 3.6, 4.2, 4.0, 3.5, 3.8, 3.1. What is the 90% confidence interval estimate of the population mean task time?

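Confidence Interval Solution*
$$\bar{X} - t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}}$$
• $\bar{X} = (3.6 + 4.2 + 4.0 + 3.5 + 3.8 + 3.1)/6 = 3.7$
• $S = .38987$
• $n = 6$, $df = n - 1 = 6 - 1 = 5$
• $S_{\bar{x}} = S / \sqrt{n} = .38987 / \sqrt{6} = .1592$
• $t_{.05,5} = 2.0150$
• $3.7 - (2.015)(.1592) \le \mu \le 3.7 + (2.015)(.1592)$
• $3.38 \le \mu \le 4.02$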
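The task-time calculation can be reproduced from the raw data. This is a minimal sketch, assuming the critical value is looked up in a t table and passed in by hand (the Python standard library has no t distribution); `t_interval` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(data, t_crit):
    """Confidence interval for a mean when sigma is unknown.

    t_crit is the table value t_{alpha/2, n-1}, supplied by the caller.
    """
    n = len(data)
    x_bar = mean(data)
    std_err = stdev(data) / sqrt(n)  # stdev() uses the n - 1 divisor
    return x_bar - t_crit * std_err, x_bar + t_crit * std_err


# Task times from the challenge; t_{.05,5} = 2.015 for 90% confidence.
times = [3.6, 4.2, 4.0, 3.5, 3.8, 3.1]
low, high = t_interval(times, t_crit=2.015)
print(round(low, 3), round(high, 3))  # 3.379 4.021
```

Passing the table value explicitly keeps the sketch dependency-free; a statistics package with a t quantile function could replace the manual lookup.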
Computer Printout
Descriptives: MINUTES
                                                  Statistic   Std. Error
Mean                                              3.700       .159
90% Confidence Interval for Mean   Lower Bound    3.379
                                   Upper Bound    4.021
5% Trimmed Mean                                   3.706
Median                                            3.700
Variance                                          .152
Std. Deviation                                    .390
Minimum                                           3.1
Maximum                                           4.2
Range                                             1.1
Interquartile Range                               .650
Skewness                                          -.364       .845
Kurtosis                                          -.130       1.741
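Estimation for Finite Populations
• Assumptions
  – Sample is large relative to population: $n / N > .05$
• Use finite population correction factor
• Confidence interval (mean, $\sigma$ unknown)
$$\bar{X} - t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}} \le \mu \le \bar{X} + t_{\alpha/2,\,n-1} \cdot \frac{S}{\sqrt{n}} \cdot \sqrt{\frac{N-n}{N-1}}$$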
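The effect of the finite population correction is easiest to see numerically. The sketch below is illustrative: the helper names `fpc` and `t_interval_finite` are hypothetical, and the inputs (sample mean 50, S = 8, n = 25, a population of N = 200, table value t = 2.064) are assumed values chosen so that n / N = .125 exceeds .05; they do not come from a worked example in the slides.

```python
from math import sqrt


def fpc(n, N):
    """Finite population correction factor sqrt((N - n) / (N - 1))."""
    return sqrt((N - n) / (N - 1))


def t_interval_finite(x_bar, s, n, N, t_crit):
    """t-based interval for a mean with the finite population correction.

    t_crit is the table value t_{alpha/2, n-1}, supplied by the caller.
    """
    half_width = t_crit * s / sqrt(n) * fpc(n, N)
    return x_bar - half_width, x_bar + half_width


# Hypothetical inputs: n / N = 25 / 200 = .125 > .05, so the correction applies.
low, high = t_interval_finite(x_bar=50, s=8, n=25, N=200, t_crit=2.064)
print(round(fpc(25, 200), 4))
print(round(low, 2), round(high, 2))
```

Because the factor is below 1 whenever n > 1, the corrected interval is narrower than the uncorrected one built from the same sample.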
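Confidence Interval Estimate
Proportion
• Assumptions
  – Two categorical outcomes
  – Population follows binomial distribution
  – Normal approximation can be used: $n \cdot p \ge 5$ & $n \cdot (1 - p) \ge 5$
• Confidence interval estimate, where $\hat{p}$ is the sample proportion
$$\hat{p} - Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le p \le \hat{p} + Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$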
Estimation Example
Proportion
A random sample of 400 graduates showed 32 went to grad school. Set up a 95% confidence interval estimate for $p$.
$$\hat{p} - Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le p \le \hat{p} + Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
$$.08 - 1.96 \cdot \sqrt{\frac{.08(1 - .08)}{400}} \le p \le .08 + 1.96 \cdot \sqrt{\frac{.08(1 - .08)}{400}}$$
$$.053 \le p \le .107$$
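The proportion interval follows the same pattern as the interval for a mean. This is a minimal standard-library sketch; `proportion_interval` is a hypothetical helper, and the check that both counts are at least 5 mirrors the normal-approximation condition stated in the assumptions.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Normal approximation condition: n*p >= 5 and n*(1 - p) >= 5.
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Graduates example: 32 of 400 went to grad school, 95% confidence.
low, high = proportion_interval(32, 400, 0.95)
print(round(low, 3), round(high, 3))  # 0.053 0.107
```

Raising an error when the condition fails is a design choice for the sketch; other small-sample methods exist but are outside what the slides cover.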
Thinking Challenge
You’re a production manager for a newspaper. You want to find the % defective. Of 200 newspapers, 35 had defects. What is the 90% confidence interval estimate of the population proportion defective?
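Confidence Interval Solution*
• $n \cdot p \ge 5$
• $n \cdot (1 - p) \ge 5$
$$\hat{p} - Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \le p \le \hat{p} + Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
$$.175 - 1.645 \cdot \sqrt{\frac{.175(.825)}{200}} \le p \le .175 + 1.645 \cdot \sqrt{\frac{.175(.825)}{200}}$$
$$.1308 \le p \le .2192$$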
Confidence Interval of the Difference between Two Means
• Two independent samples
• Two large samples - both sample sizes ≥ 30
• Population standard deviations are unknown
• Answer finds an interval for $\mu_1 - \mu_2$
$$(\bar{x}_1 - \bar{x}_2) \pm z \cdot s_{\bar{x}_1 - \bar{x}_2}$$
where
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Example 6.3, Page 283
Sample 1: $\bar{x}_1 = \$76$, $s_1 = \$25$, $n_1 = 100$
Sample 2: $\bar{x}_2 = \$65$, $s_2 = \$22$, $n_2 = 100$
$$(\bar{x}_1 - \bar{x}_2) \pm z \cdot s_{\bar{x}_1 - \bar{x}_2} \quad \text{where} \quad s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{(25)^2}{100} + \frac{(22)^2}{100}} = 3.33$$
$$(76 - 65) \pm 3(3.33) = 11 \pm 9.99$$
$$\$1 \le \mu_1 - \mu_2 \le \$21$$
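The two-sample calculation can be sketched as a short function. The helper name `two_mean_interval` is hypothetical, and `z` is passed in directly because the example uses z = 3 without stating the confidence level it corresponds to.

```python
from math import sqrt


def two_mean_interval(x1, s1, n1, x2, s2, n2, z):
    """Large-sample interval for the difference between two population means."""
    # Standard error of the difference between the two sample means.
    std_err = sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x1 - x2
    return diff - z * std_err, diff + z * std_err


# Example 6.3 inputs, with z = 3 as used in the worked solution.
low, high = two_mean_interval(76, 25, 100, 65, 22, 100, z=3)
print(round(low, 2), round(high, 2))  # 1.01 20.99
```

Rounded to whole dollars, this matches the $1 to $21 interval in the example.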
Confidence Interval of the Difference between Two Proportions
• Two independent samples
• Two large samples - both sample sizes ≥ 30
• Answer finds an interval for the difference between the population proportions
$$(p_1 - p_2) \pm z \cdot s_{p_1 - p_2}$$
where
$$s_{p_1 - p_2} = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
In these formulas $p_1$ and $p_2$ are the sample proportions and $q_i = 1 - p_i$.
Selecting a Sample Size
• The degree of confidence selected
• The maximum allowable error
• The population standard deviation
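Finding Sample Sizes
(1) $Z = \dfrac{\bar{X} - \mu_x}{\sigma_{\bar{x}}} = \dfrac{\text{Error}}{\sigma_{\bar{x}}}$
(2) $\text{Error} = Z\sigma_{\bar{x}} = Z\dfrac{\sigma_x}{\sqrt{n}}$
(3) $n = \dfrac{Z^2 \sigma_x^2}{\text{Error}^2}$
"I don’t want to sample too much or too little!"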
Sample Size for Means
$$n = \frac{z^2 \sigma^2}{E^2} = \left(\frac{z\sigma}{E}\right)^2$$
E is the allowable error
z is the z score associated with the degree of confidence
$\sigma$ is the population standard deviation
The marketing manager would like to estimate the population mean annual usage of home heating oil to within 50 gallons of the true value and desires to be 95% confident of correctly estimating the true mean. Based on a previous study taken last year, the marketing manager feels that the standard deviation can be estimated as 325 gallons. What is the sample size needed to obtain these results?
Confidence = 95%, so $z = 1.96$
$E = 50$
$\sigma = 325$
$$n = \frac{z^2 \sigma^2}{E^2} = \frac{(1.96)^2 (325)^2}{(50)^2} = \frac{(3.8416)(105{,}625)}{2500} = 162.31$$
$\therefore n = 163$ homes need to be sampled
Sample Size for Proportions
$$n = \frac{p(1 - p)z^2}{E^2}$$
E is the maximum allowable error
z is the z value associated with the degree of confidence
p is the estimated proportion
A political pollster would like to estimate the proportion of voters who will vote for the Democratic candidate in a presidential campaign. The pollster would like 95% confidence that her prediction is correct to within .04 of the true proportion. What sample size is needed?
Confidence = 95%, so $Z = 1.96$
$E = .04$
$p$ = unknown, so use $p = .5$
$$n = \frac{p(1 - p)z^2}{E^2} = \frac{.5(1 - .5)(1.96)^2}{(.04)^2} = 600.25$$
$n = 601$ voters
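The proportion version of the sample-size calculation can be sketched the same way. The helper name `sample_size_proportion` is hypothetical; its default of p = .5 follows the approach used in the example when no estimate of the proportion is available, and p = .5 gives the largest required sample because p(1 - p) peaks there.

```python
from math import ceil
from statistics import NormalDist


def sample_size_proportion(error, confidence, p=0.5):
    """Sample size needed to estimate a proportion to within the given error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up so the achieved error is no larger than requested.
    return ceil(p * (1 - p) * z ** 2 / error ** 2)


# Pollster example: E = .04, 95% confidence, p unknown so p = .5.
print(sample_size_proportion(0.04, 0.95))  # 601
```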
Conclusion
• Stated What Is Estimated
• Distinguished Point & Interval Estimates
• Explained Interval Estimates
• Computed Confidence Interval Estimates for Population Mean & Proportion
• Computed Sample Size