Chapter 10
Estimating Means
and Proportions
Stat-Slide-Show, Copyright 1994-95 by Quant Systems Inc.
The Problem
Process or
Population
$\mu = ?$
$\sigma^2 = ?$
$p = ?$
How do you
estimate these
unknown
parameters?
10 - 2
Definition
Using properly drawn sample
data to draw conclusions about
the population is called
statistical inference.
Process or
Population
m=?
Sample
$\bar{x}$
$\bar{x}$ is a sample estimate of $\mu$.
10 - 3
Definitions
• An estimator is a strategy or rule
that is used to estimate a
population parameter.
• For example, use
$\bar{x}$ to estimate $\mu$
$s^2$ to estimate $\sigma^2$
$\bar{x}$ and $s^2$ are
estimators.
• If the rule is applied to a specific
set of data, the result is an
estimate.
• Example:
$\bar{x} = 33.2$
This is an
estimate of
the population
mean $\mu$.
10 - 4
Statistical Inference
• Statistical inference permits
the estimation of statistical
measures (means,
proportions, etc.) with a known
degree of confidence.
• This ability to assess the
quality (confidence) of
estimates is one of the
significant benefits statistics
brings to decision making and
problem solving.
10 - 5
Randomly Selected
Samples
• If samples are selected
randomly, the uncertainty of the
inference is measurable.
• The ability to measure the
confidence associated with a
statistical inference is the value
received for drawing random
samples.
• If samples are not selected
randomly, there will be no known
relationship between the sample
results and the population.
10 - 6
The One-sample
Problem
• This chapter is devoted to the
one-sample problem.
• That is, a sample consisting of
n measurements, x1, x2,..., xn,
of some population
characteristic will be analyzed
with the objective of making
inferences about the
population or process.
$\mu = ?$
$\sigma^2 = ?$
$p = ?$
10 - 7
Estimation
10 - 8
Judgment Estimates
• Many estimates are subjective,
that is, a person with
experience in the field is
utilized to estimate an
unknown population value.
• The problem with judgment
estimates is that their degree
of accuracy or inaccuracy
cannot be determined.
• Even if experts exist, statistics
offers estimates with known
reliability.
10 - 9
Point Estimation of the
Population Mean
10 - 10
How can you tell a good
estimator from a bad one?
• Good estimators
conform to the
rules of
horseshoes: the closer
to the true population
measure, the better.
• Since the objective in this
instance is to estimate the
population mean, closeness is
measured in terms of the
distance the estimate is from
the actual population mean.
10 - 11
Estimate Accuracy
How can you judge how accurate
your estimate is without knowing
the true value of the population
parameter?
It’s similar to shooting an arrow at
the bull's-eye without being able to
see the bull’s-eye.
If you can’t see the bull's-eye, how
do you know how close you were?
10 - 12
Mean Squared Error
An estimator’s average squared
distance from the true parameter
is referred to as its mean
squared error (MSE).
The mean squared error for the
sample mean is given by:
$$\mathrm{MSE}(\bar{x}) = E\left[(\bar{x} - \mu)^2\right]$$
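The mean squared error can be approximated by simulation when the population is known. The sketch below assumes a normal population with hypothetical values for the mean, standard deviation, and sample size. It compares the simulated MSE of the sample mean with $\sigma^2/n$, the variance of the sample mean.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population and sample size, chosen for illustration.
MU, SIGMA, N, TRIALS = 50.0, 10.0, 25, 20_000

def draw_sample_mean():
    """Draw one random sample of size N and return its mean."""
    return sum(random.gauss(MU, SIGMA) for _ in range(N)) / N

# Average squared distance of the sample mean from the true mean.
squared_errors = [(draw_sample_mean() - MU) ** 2 for _ in range(TRIALS)]
mse_estimate = sum(squared_errors) / TRIALS

print(f"simulated MSE: {mse_estimate:.2f}")
print(f"sigma^2 / n:   {SIGMA**2 / N:.2f}")
```

In practice the true mean is unknown, so the MSE cannot be computed this way from a single sample. The simulation only illustrates what the definition measures.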
10 - 13
Finding an Estimator
• A perfect estimator would have
a mean squared error of zero,
but there is no such thing as a
perfect estimator.
• Since statistical estimators
depend on data which is
randomly drawn, they are
random variables and cannot
always be equal to the true
population characteristic.
• The goal is to find an estimator
whose average squared error
is the smallest.
10 - 14
Restricting Estimators
There are an infinite number of
possible estimators and without
restricting the kinds of
estimators that will be
considered, very little progress
can be made.
10 - 15
Unbiasedness
• One desirable restriction is
unbiasedness.
• To be an unbiased estimator,
the expected value of the
estimator must be equal to the
parameter that is being
estimated.
• For example, $\bar{x}$ is an unbiased
estimator of the population
mean since
$E(\bar{x}) = \mu$.
10 - 16
Unbiased Estimators
• There are many estimators that are
unbiased estimators of the population
mean, including the sample mean, any
single sample value, and (when the
population is symmetric) the sample median.
• Among unbiased estimators the mean
squared error is equal to the variance of
the estimator.
• For a normal population, the sample
mean has the smallest mean squared
error among unbiased estimators.
• Consequently, in that setting there is no
other unbiased estimator that can
consistently do a better
job of estimating the population mean.
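A small simulation can compare three unbiased estimators of the mean. It is a sketch that assumes a normal population; the parameter values are hypothetical. Each estimator should average close to the true mean, while their mean squared errors differ.

```python
import random
from statistics import median

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical normal population and sample size.
MU, SIGMA, N, TRIALS = 50.0, 10.0, 25, 5_000

def mse(estimates):
    """Average squared distance of the estimates from the true mean."""
    return sum((e - MU) ** 2 for e in estimates) / len(estimates)

means, medians, singles = [], [], []
for _ in range(TRIALS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(sum(sample) / N)    # sample mean
    medians.append(median(sample))   # sample median
    singles.append(sample[0])        # a single sample value

for name, est in [("mean", means), ("median", medians), ("single value", singles)]:
    average = sum(est) / len(est)
    print(f"{name:>12}: average = {average:6.2f}, MSE = {mse(est):6.2f}")
```

For a non-normal population the ranking can differ, so the result should not be read as a general ordering of estimators.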
10 - 17
Interval Estimation of
the Population Mean
10 - 18
Precision of the
Estimate
• One of the limitations of simply
reporting a point estimate is the
lack of information concerning
the estimator’s accuracy.
• Example: If 33.2 is a point
estimate of the population mean,
how good is this estimate?
• Interval estimates, however, are
constructed to provide additional
information about the precision
of the estimate.
10 - 19
Constructing an
Interval estimator
• An interval estimator is made
by developing an upper and a
lower boundary for an interval
that will hopefully contain the
population parameter.
• It would be easy to construct
an interval estimator that would
definitely contain a population
parameter, namely minus
infinity to positive infinity.
[Figure: number line extending from $-\infty$ to $+\infty$, with 0 at the center]
10 - 20
Constructing an
Interval estimator
• However, this particular
interval estimator would not
contain any useful information
about the location of the
population parameter.
• In interval estimation, the
smaller the interval for a given
amount of confidence, the
better.
10 - 21
Central Limit Theorem
Recall that if the sample size is
reasonably large (n > 30), the
central limit theorem ensures
that $\bar{x}$ has an approximately
normal distribution
with mean $\mu$
and variance $\sigma^2/n$.
[Figure: normal curve centered at $\mu$]
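The central limit theorem can be illustrated by sampling from a clearly non-normal population. The sketch below assumes an exponential population with mean 1 and variance 1 and a hypothetical sample size of 36. It checks that the sample means have mean close to $\mu$ and variance close to $\sigma^2/n$, and that about 95% of them fall within 1.96 standard errors of $\mu$, as a normal distribution would predict.

```python
import random

random.seed(2)  # fixed seed so the run is repeatable

# Hypothetical setup: exponential population with rate 1,
# so the population mean and variance are both 1.
RATE, N, TRIALS = 1.0, 36, 20_000
POP_MEAN, POP_VAR = 1.0, 1.0

sample_means = [
    sum(random.expovariate(RATE) for _ in range(N)) / N
    for _ in range(TRIALS)
]

mean_of_means = sum(sample_means) / TRIALS
var_of_means = sum((m - mean_of_means) ** 2 for m in sample_means) / (TRIALS - 1)

# Share of sample means within 1.96 standard errors of the true mean.
standard_error = (POP_VAR / N) ** 0.5
within = sum(abs(m - POP_MEAN) < 1.96 * standard_error for m in sample_means) / TRIALS

print(f"mean of sample means:     {mean_of_means:.3f} (mu = {POP_MEAN})")
print(f"variance of sample means: {var_of_means:.4f} (sigma^2/n = {POP_VAR / N:.4f})")
print(f"share within 1.96 SE:     {within:.3f}")
```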
10 - 22
Example 1
• The sampling distribution can
be used to develop an interval
estimator.
• For the standard normal
random variable,
P(-2.17 < z < 2.17) = .97.
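The probability above can be checked with Python's standard library instead of a printed z-table. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Area under the curve between -2.17 and 2.17.
central_area = std_normal.cdf(2.17) - std_normal.cdf(-2.17)
print(f"P(-2.17 < z < 2.17) = {central_area:.4f}")

# Going the other way: the z-value that leaves (1 - .97)/2 in each tail.
z_value = std_normal.inv_cdf(1 - (1 - 0.97) / 2)
print(f"z for a central area of .97 = {z_value:.2f}")
```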
10 - 23
Example 1
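Since $\bar{x}$ can be transformed into the
standard normal random variable
by using the z-transform,
$$z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}},$$
then by substitution,
$$P\left(-2.17 < \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < 2.17\right) = .97,$$
and with some algebraic
manipulation we obtain
$$P(\bar{x} - 2.17\,\sigma_{\bar{x}} < \mu < \bar{x} + 2.17\,\sigma_{\bar{x}}) = .97.$$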
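The algebraic manipulation can be written out step by step. This is one possible route, working on the inequality inside the probability statement; the standard error $\sigma_{\bar{x}}$ is positive, so multiplying by it keeps the direction of the inequalities, while multiplying by $-1$ reverses it.

```latex
\begin{align*}
-2.17 &< \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} < 2.17 \\
-2.17\,\sigma_{\bar{x}} &< \bar{x} - \mu < 2.17\,\sigma_{\bar{x}}
  && \text{multiply by } \sigma_{\bar{x}} > 0 \\
-\bar{x} - 2.17\,\sigma_{\bar{x}} &< -\mu < -\bar{x} + 2.17\,\sigma_{\bar{x}}
  && \text{subtract } \bar{x} \\
\bar{x} + 2.17\,\sigma_{\bar{x}} &> \mu > \bar{x} - 2.17\,\sigma_{\bar{x}}
  && \text{multiply by } -1 \\
\bar{x} - 2.17\,\sigma_{\bar{x}} &< \mu < \bar{x} + 2.17\,\sigma_{\bar{x}}
  && \text{rewrite in increasing order}
\end{align*}
```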
10 - 24
Example 1
$$P(\bar{x} - 2.17\,\sigma_{\bar{x}} < \mu < \bar{x} + 2.17\,\sigma_{\bar{x}}) = .97$$
The expression above suggests
a specific form for the interval.
The population mean will fall
within the interval
$$\bar{x} \pm 2.17\,\sigma_{\bar{x}}$$
97% of the time.
10 - 25
Example 1
• After the sample is selected, the
sample mean is no longer a random
variable.
• $\bar{x}$ is a random variable, but $\bar{x} = 33.2$
is the sample mean for a particular
sample.
• Suppose a sample has been drawn
from a population with a standard
deviation of 200, and the following
characteristics have been observed:
n = 100, and $\bar{x} = 150$.
Note: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{200}{\sqrt{100}} = \frac{200}{10} = 20$.
10 - 26
Example 1
The resulting interval would be
$$150 \pm 2.17\left(\frac{200}{10}\right).$$
That is,
[Figure: interval centered at 150, from $150 - 2.17\left(\frac{200}{10}\right) = 106.6$ to $150 + 2.17\left(\frac{200}{10}\right) = 193.4$]
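The arithmetic for Example 1 can be reproduced in a few lines. The numbers are the ones given in the example (a population standard deviation of 200, n = 100, a sample mean of 150, and a z-value of 2.17); the variable names are illustrative.

```python
import math

# Values from Example 1.
sigma = 200.0   # population standard deviation
n = 100         # sample size
x_bar = 150.0   # observed sample mean
z = 2.17        # z-value for a central area of .97

standard_error = sigma / math.sqrt(n)   # standard deviation of the sample mean
margin = z * standard_error             # half-width of the interval
lower, upper = x_bar - margin, x_bar + margin

print(f"standard error = {standard_error:.1f}")
print(f"interval = [{lower:.1f}, {upper:.1f}]")
```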
10 - 27
Example 1
[Figure: the interval from 106.6 to 193.4, centered at 150]
Is the population mean ($\mu$) inside
this interval?
10 - 28
Example 1
Even though the interval is
calculated using a technique that
captures the population mean
97% of the time, it would not be
appropriate, from a relative
frequency point of view, to state
that
$$P(106.6 < \mu < 193.4) = .97$$
since the population mean is an
unknown but constant quantity.
10 - 29
Example 1
• Either $\mu$ will always be inside
the interval or will always be
outside the interval.
• What information do we have
about the interval?
10 - 30
Example 1
• Since it was constructed from a
technique that will include the
true population mean in the
interval 97% of the time, we are
97% confident in the technique.
• Confidence is one way of
expressing a subjective
probability.
• Hence, the term confidence
interval is used to describe the
method of construction rather
than a particular interval.
10 - 31
Example 1
A 97% confidence interval can
be interpreted to mean that if all
possible samples of a given size
are taken from a population,
97% of the samples would
produce intervals that captured
the true population mean and
3% would not.
The idea of the confidence of a
confidence interval is a general
one and can be extended to any
specified degree of confidence.
80%, 85%, 88%, 95%, 98%, ...
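The repeated-sampling interpretation can be illustrated by simulation. The sketch below assumes a normal population whose true mean is known to the simulation (a hypothetical value of 150), with the standard deviation and sample size from Example 1. It draws many samples rather than all possible samples, builds a 97% interval from each, and counts how often the interval captures the true mean.

```python
import math
import random
from statistics import NormalDist

random.seed(3)  # fixed seed so the run is repeatable

# MU is a hypothetical "true" mean; SIGMA and N follow Example 1.
MU, SIGMA, N, TRIALS = 150.0, 200.0, 100, 5_000
CONFIDENCE = 0.97

# z-value leaving (1 - CONFIDENCE)/2 in each tail, about 2.17.
z = NormalDist().inv_cdf(1 - (1 - CONFIDENCE) / 2)
margin = z * SIGMA / math.sqrt(N)

captured = 0
for _ in range(TRIALS):
    x_bar = sum(random.gauss(MU, SIGMA) for _ in range(N)) / N
    if x_bar - margin < MU < x_bar + margin:
        captured += 1

coverage = captured / TRIALS
print(f"share of intervals capturing the true mean: {coverage:.3f}")
```

Each individual interval either contains the true mean or does not; the 97% describes the long-run share of intervals that do.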
10 - 32
Confidence Interval for
the Population Mean
Definition:
If n > 30, or if $\sigma$ is known and the
population being sampled is
normal, a $(1 - \alpha)$ confidence
interval for the population mean
is given by
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
If $\sigma$ is unknown and n > 30, s can be
used as an approximation for $\sigma$.
10 - 33
Confidence Interval for
the Population Mean
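The expression $\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
creates the interval shown below.
[Figure: interval centered at $\bar{x}$, from $\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$ to $\bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$]
The term $z_{\alpha/2}$ represents the
z-value required to obtain an
area of $1 - \alpha$ centered under the
standard normal curve.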
10 - 34
Various Z-values
The z-values for obtaining
various $(1 - \alpha)$ areas centered
under the standard normal curve
are given in the table below.

Area $1 - \alpha$    $z_{\alpha/2}$
.80                  1.28
.90                  1.645
.95                  1.96
.99                  2.58
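The tabled z-values can be regenerated from the inverse of the standard normal distribution function. This is a minimal sketch using Python's standard library; the helper name `z_for_confidence` is hypothetical.

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """z-value that bounds a central area of `confidence` under the standard normal curve."""
    alpha = 1 - confidence
    # alpha/2 is left in each tail, so the area to the left of z is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

for confidence in (0.80, 0.90, 0.95, 0.99):
    print(f"{confidence:.2f}   {z_for_confidence(confidence):.3f}")
```

The computed values carry more decimal places than a printed table, so they agree with the tabled figures only after rounding.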
10 - 35
Graphs of the Various Z-values
[Figure: four standard normal curves, each centered at 0, with central areas $(1 - \alpha)$ of .80, .90, .95, and .99 bounded by ±1.28, ±1.645, ±1.96, and ±2.58, respectively]
10 - 36
To achieve more confidence
we must pay a price.
• For a fixed sample size, the
larger the desired confidence,
the greater the number of
standard deviations that must
be used to form the boundary
points for the confidence
interval.
• When the interval becomes
wider, the resulting information
provides a less precise
location of the population
mean.
10 - 37
Error of Estimation
We can also think about the
confidence interval as a means
of describing the quality of a
point estimate.
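$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
Here $\bar{x}$ is the point estimate, and
$z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$ is the
maximum error of
estimation with a
specific level of
confidence $(1 - \alpha)$.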
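The maximum error of estimation can be computed directly, which makes the cost of extra confidence visible. The sketch below reuses the standard deviation and sample size from Example 1; the helper name `margin_of_error` is hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma, n, confidence):
    """Maximum error of estimation: z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / math.sqrt(n)

sigma, n = 200.0, 100  # values from Example 1

# Fixed sample size: more confidence means a larger margin.
margins = {c: margin_of_error(sigma, n, c) for c in (0.80, 0.90, 0.95, 0.97, 0.99)}
for confidence, margin in margins.items():
    print(f"confidence {confidence:.2f}: margin = {margin:.1f}")

# Fixed confidence: quadrupling the sample size halves the margin.
print(f"n = 400 at .95: margin = {margin_of_error(sigma, 400, 0.95):.1f}")
```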
10 - 38
Example 2
Find $z_{\alpha/2}$ for the following levels of $\alpha$.
1. $\alpha = .02$
2. $\alpha = .08$
10 - 39
Example 2 - Solution
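1. $\alpha = .02$
$\frac{\alpha}{2} = \frac{.02}{2} = .01$
[Figure: standard normal curve with area .49 between 0 and $z_{.01}$ and area .01 in the upper tail]
$z_{.01} = 2.33$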
10 - 40
Example 2 - Solution
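2. $\alpha = .08$
$\frac{\alpha}{2} = \frac{.08}{2} = .04$
[Figure: standard normal curve with area .46 between 0 and $z_{.04}$ and area .04 in the upper tail]
$z_{.04} = 1.75$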
10 - 41
Example 3
Find $z_{\alpha/2}$ for the following confidence
levels:
1. 96%
2. 88%
10 - 42
Example 3 - Solution
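1. $1 - \alpha = .96$
$\alpha = .04$
$\frac{\alpha}{2} = \frac{.04}{2} = .02$
[Figure: standard normal curve with area .48 between 0 and $z_{.02}$ and area .02 in the upper tail]
$z_{.02} = 2.05$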
10 - 43
Example 3 - Solution
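2. $1 - \alpha = .88$
$\alpha = .12$
$\frac{\alpha}{2} = \frac{.12}{2} = .06$
[Figure: standard normal curve with area .44 between 0 and $z_{.06}$ and area .06 in the upper tail]
$z_{.06} = 1.555$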
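The lookups in Examples 2 and 3 follow the same steps each time: halve $\alpha$ to get the tail area, subtract it from .5 to get the area between 0 and z, then find the matching z-value. A minimal sketch of those steps, using the standard library in place of a printed table (the helper name `z_alpha_over_2` is hypothetical):

```python
from statistics import NormalDist

def z_alpha_over_2(alpha):
    """Mirror the table lookup: tail area, area from 0 to z, then the z-value."""
    tail_area = alpha / 2              # e.g. .01 when alpha = .02
    area_0_to_z = 0.5 - tail_area      # e.g. .49, the area shown in the sketch
    # The area to the left of z is .5 (left half) plus the area from 0 to z.
    return NormalDist().inv_cdf(0.5 + area_0_to_z)

# Example 2 gives alpha directly; Example 3 gives the confidence level 1 - alpha.
for alpha in (0.02, 0.08, 0.04, 0.12):
    print(f"alpha = {alpha:.2f}: z = {z_alpha_over_2(alpha):.3f}")
```

Printed tables are read to two or three decimals, so small differences from the computed values are expected.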
10 - 44
Example 4
A paint manufacturer is
developing a new type of paint.
Thirty panels were exposed to
various corrosive conditions to
measure the protective ability of
the paint.
The mean life for the samples
was 168 hours before corrosive
failure.
10 - 45
Example 4
The life of paint samples is
assumed to be normally
distributed with population
standard deviation of 30 hours.
Find the 95%
confidence interval
for the mean life of
the paint.
10 - 46
Example 4 - Solution
We are given:
X = time before corrosive failure
of the paint has a normal
distribution,
$\sigma = 30$, n = 30, $\bar{x} = 168$,
and the confidence level = .95.
10 - 47
Example 4 - Solution
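$1 - \alpha = .95$
$\alpha = .05$
$\frac{\alpha}{2} = \frac{.05}{2} = .025$
[Figure: standard normal curve with area .475 between 0 and $z_{.025}$ and area .025 in the upper tail]
$z_{.025} = 1.96$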
10 - 48
Example 4 - Solution
We want to determine a 95%
confidence interval for the true
mean life before corrosive failure.
Since X is normal and $\sigma$ is known,
the confidence interval is given by
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 168 \pm 1.96\,\frac{30}{\sqrt{30}} = 168 \pm 10.7354 = (157.2646,\ 178.7354).$$
10 - 49
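The calculation in Example 4 can be verified with a short script. The inputs are the values given in the example, and the z-value is the tabled 1.96 rather than a more precise figure, so the result matches the slide to four decimals. The function name `z_interval` is hypothetical.

```python
import math

def z_interval(x_bar, sigma, n, z):
    """Confidence interval for the mean when sigma is known: x_bar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin, margin

# Values from Example 4: mean life 168 hours, sigma = 30 hours, 30 panels, 95% confidence.
lower, upper, margin = z_interval(x_bar=168.0, sigma=30.0, n=30, z=1.96)

print(f"margin of error = {margin:.4f}")
print(f"95% confidence interval = ({lower:.4f}, {upper:.4f})")
```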