HAWKES LEARNING SYSTEMS
math courseware specialists
Copyright © 2010 by Hawkes Learning
Systems/Quant Systems, Inc.
All rights reserved.
Chapter 10
Estimating Means and Proportions
Sections 10.1-10.3 Introduction to Estimating Means
Objectives:
• Determine the best point estimates for population parameters.
• Choose which method would be most appropriate when
calculating the margin of error for the population mean.
Section 10.1 What is an Estimator?
Definitions:
• Estimator – a strategy or rule that is used to estimate a population parameter.
• Estimate – the result of applying an estimator to a specific set of data.
• Point Estimate – a single-number estimate of a population parameter. The best point estimate of a population mean is the sample mean.
• Interval Estimate – a range of possible values for the population parameter.
Section 10.3 Point Estimation of the Population Mean
Definitions:
• Unbiased Estimator – an estimator whose expected value is equal to the parameter that is being estimated.
• Mean Squared Error (MSE) – an estimator’s average squared distance from the true parameter.
• The Mean Squared Error (MSE) for the sample mean is given by:
  $\mathrm{MSE}(\bar{x}) = E\left[(\bar{x} - \mu)^2\right]$
Among unbiased estimators of the population mean, the sample mean has the smallest mean squared error.
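As a worked step that the slide does not spell out: assuming the sample consists of n independent observations from a population with mean μ and variance σ² (an assumption added here, not stated on the slide), the MSE splits into variance plus squared bias. Because the sample mean is unbiased, only the variance term remains:

```latex
\begin{align*}
\mathrm{MSE}(\bar{x}) &= E\left[(\bar{x}-\mu)^2\right] \\
&= \operatorname{Var}(\bar{x}) + \left(E[\bar{x}] - \mu\right)^2 \\
&= \frac{\sigma^2}{n} + 0 = \frac{\sigma^2}{n}
\end{align*}
```

Under that assumption, the MSE of the sample mean shrinks in proportion to 1/n as the sample grows.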
Sections 10.4, 10.6 Interval Estimation of the
Population Mean
Objectives:
• To learn the meaning of a confidence interval.
• To determine the required sample size for a particular
confidence level.
Section 10.4 Interval Estimation of the Population Mean
Definitions:
• Confidence Interval – a particular interval estimate of a population parameter.
• A Confidence Interval for the Population Mean is given by
  $\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
  if either of the following conditions is true:
  • n > 30 (use s as an approximation for σ), or
  • σ is known and the population being studied is normal.
Section 10.4 Interval Estimation of the Population Mean
Definitions:
• The expression $\bar{x} \pm z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$ creates the “generalized” confidence interval. Here $\bar{x}$ is the point estimate, while $z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$ is the maximum error of estimation with a specific level of confidence, given a sample size n.
• The term $z_{\alpha/2}$ represents the z-value required to obtain an area of 1−α centered under the standard normal curve. The z-values for obtaining various (1−α) areas centered under the standard normal curve are given below.

Confidence 1−α    $z_{\alpha/2}$
0.80              1.28
0.90              1.645
0.95              1.96
0.99              2.575
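The tabulated z-values can be reproduced numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}: the z-value that leaves (1 - confidence)/2 in each tail."""
    alpha = 1 - confidence
    # inv_cdf gives the point with the requested area to its left.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.2f}  {z_critical(level):.4f}")
```

The computed values are approximately 1.2816, 1.6449, 1.9600, and 2.5758. The table's 2.575 for 99% confidence is a common textbook rounding of that last value.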
Section 10.4 Interval Estimation of the Population Mean
Example:
• Construct a 95% confidence interval for the population mean if the standard deviation of the population is 900. Use the following sample data: $n = 100$ and $\bar{x} = 500$.
Solution:
95% Confidence Interval
$500 \pm 1.96 \cdot \dfrac{900}{\sqrt{100}}$
$500 \pm 176.4$, or 323.6 to 676.4
We are 95% confident that the point estimate of μ, $\bar{x} = 500$, has an error of estimation no larger than 176.4. Being able to assess the error of an estimate is one of the most useful applications of statistical methods.
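The arithmetic in this example can be checked with a short script. This is a sketch only; the function name `z_confidence_interval` is hypothetical, and it assumes σ is known, as in the example.

```python
import math

def z_confidence_interval(xbar, sigma, n, z):
    """Return (lower, upper, margin) for xbar +/- z * sigma / sqrt(n)."""
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin, margin

# Values from the example: sigma = 900, n = 100, xbar = 500, 95% confidence.
low, high, margin = z_confidence_interval(xbar=500, sigma=900, n=100, z=1.96)
print(round(margin, 1), round(low, 1), round(high, 1))  # 176.4 323.6 676.4
```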
Section 10.6 Precision and Sample Size
Precision and Sample Size:
• The more accurate an estimate, the greater its potential value in decision making. The only way to perfectly determine an unknown population parameter is to perform a census. This is usually impractical because of cost and/or time considerations.
• The width of the confidence interval defines the precision; the smaller the interval, the greater the precision.
• The three components which affect the width of the confidence interval for the population mean are:
  $z_{\alpha/2}$ – represents the distance the confidence interval boundary is from the estimated mean $\bar{x}$ in standard deviation units. The distance is related to the specified level of confidence.
  $\sigma$ – represents the population standard deviation.
  $n$ – represents the sample size.
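To see how the sample size drives the width, the sketch below reuses σ = 900 and z = 1.96 from the earlier example and varies n. The sample sizes 400 and 1600 are illustrative choices, not values from the slides.

```python
import math

def interval_width(z, sigma, n):
    """Full width of the z-interval: twice the margin of error."""
    return 2 * z * sigma / math.sqrt(n)

# Quadrupling n halves the width, because n enters through its square root.
for n in (100, 400, 1600):
    print(n, round(interval_width(1.96, 900, n), 1))
```

The printed widths are 352.8, 176.4, and 88.2. Halving the width therefore costs four times as many observations.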
Section 10.6 Precision and Sample Size
Precision and Sample Size:
• Level of confidence can vary in order to reduce or expand the confidence interval width, but doing so does not increase the information; it only presents it differently.
• σ, the population standard deviation, is a constant and does not change.
• n, the sample size, can vary, and the larger the sample, the smaller the width of the resulting confidence interval for a given level of confidence.
• How large should the sample size be? The sample size should be selected relative to the size of the maximum positive or negative error the decision maker is willing to accept, given by:
  $\text{error} = z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$
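Solving the error expression for n gives the sample size formula in general form. The steps are plain algebra on the expression above:

```latex
\begin{align*}
\text{error} &= z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{z_{\alpha/2}\,\sigma}{\text{error}} \\
n &= \left(\frac{z_{\alpha/2}\,\sigma}{\text{error}}\right)^{2}
\end{align*}
```

Because n must be a whole number, the result is rounded up to the next integer.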
Section 10.6 Precision and Sample Size
Example:
Suppose that a quality control manager wishes to measure the average amount of cleaning fluid the company puts in its 24-ounce bottles. From previous samples, the standard deviation is believed to be 0.4 ounces. How large must the sample be in order to be 95% confident of estimating the mean amount of cleaning fluid in a 24-ounce bottle to within 0.08 ounces?
Solution:
• Using $\text{error} = z_{\alpha/2} \dfrac{\sigma}{\sqrt{n}}$ we get $0.08 = \dfrac{1.96 \cdot 0.4}{\sqrt{n}}$, from which we can get $\sqrt{n} = \dfrac{1.96 \cdot 0.4}{0.08}$, so $n = \left(\dfrac{1.96 \cdot 0.4}{0.08}\right)^2 = 96.04$.
• We need to round 96.04 up to the next integer, 97, in order to assure the desired level of confidence. With a sample size of 97 we are 95% confident of estimating the mean amount of cleaning fluid in a 24-ounce bottle to within 0.08 ounces.
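The same calculation, including the rounding-up step, can be sketched in a few lines. The function name `sample_size_for_mean` is an illustrative assumption.

```python
import math

def sample_size_for_mean(z, sigma, error):
    """Smallest integer n with z * sigma / sqrt(n) <= error."""
    return math.ceil((z * sigma / error) ** 2)

# Values from the example: 95% confidence, sigma = 0.4 oz, error = 0.08 oz.
n = sample_size_for_mean(z=1.96, sigma=0.4, error=0.08)
print(n)  # 97
```

Using `math.ceil` rather than ordinary rounding matters here: rounding 96.04 down to 96 would leave the margin of error slightly above 0.08 ounces.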
Section 10.6 Precision and Sample Size
Sample Size for σ Unknown:
• In the previous slides σ was assumed to be known. This assumption is usually unreasonable.
• The most obvious method for obtaining an estimate of σ is to take a small sample and use the sample standard deviation as an estimate of the population standard deviation. Replacing σ with s in the sample size formula will provide an initial estimate of the required sample size.
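A rough sketch of this pilot-sample approach follows. The pilot measurements are invented purely for illustration and do not come from the slides; the confidence level and target error are borrowed from the cleaning fluid example.

```python
import math
import statistics

# Hypothetical pilot measurements in ounces (illustrative data, not from the slides).
pilot = [23.9, 24.3, 24.1, 23.6, 24.5, 24.0, 23.8, 24.2]

s = statistics.stdev(pilot)   # sample standard deviation (n - 1 divisor)
z, error = 1.96, 0.08         # 95% confidence, target error of 0.08 oz

# Replace sigma with s in n = (z * sigma / error)^2 and round up.
n_initial = math.ceil((z * s / error) ** 2)
print(round(s, 3), n_initial)
```

Since s is itself an estimate, the result is best treated as a starting point that can be revised once more data are collected.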
Section 10.5 Interval Estimation of the Population
Mean for a Normal Population with σ Unknown
Objectives:
• To determine t if given a probability.
• To determine the value of $t_{\alpha/2}$.
• To construct a confidence interval for a given situation.
Section 10.5 Interval Estimation of the Population
Mean for a Normal Population with σ Unknown
Constructing a Confidence Interval with σ Unknown:
• If σ is not known and $n \le 30$, the confidence interval must be changed slightly. As long as the population is normally distributed, the quantity
  $t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$,
  where s is the standard deviation of the sample, has a Student’s t-distribution.
Section 10.5 Interval Estimation of the Population
Mean for a Normal Population with σ Unknown
The t-Distribution:
• The t-distribution is very much like the normal distribution. It is a symmetrical, bell-shaped distribution with slightly thicker tails than a normal distribution. The t-distribution approaches the normal distribution as the degrees of freedom become larger.
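The thicker tails can be illustrated by comparing density heights far from the center. The sketch below evaluates the standard t density formula at x = 3 for several degrees of freedom and compares it with the standard normal density. The evaluation point and the degrees of freedom are arbitrary illustrative choices.

```python
import math

def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Student's t density; lgamma keeps the constant stable for large df."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef) * (1 + x * x / df) ** (-(df + 1) / 2)

x = 3.0
for df in (2, 6, 30, 1000):
    print(df, round(t_pdf(x, df), 5))
print("normal", round(normal_pdf(x), 5))
```

The t density at x = 3 is largest for small degrees of freedom and falls toward the normal value as the degrees of freedom increase.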
Section 10.5 Interval Estimation of the Population
Mean for a Normal Population with σ Unknown
Degrees of Freedom:
• The t-distribution has one parameter, degrees of freedom. The degrees of freedom for any t-distribution are computed in the following manner:
  d.f. = number of sample observations − 1 = n − 1
Confidence Interval for the Population Mean:
• If σ is unknown and the sample is drawn from a normal population, a (1−α) confidence interval for the population mean is given by
  $\bar{x} \pm t_{\alpha/2,\,n-1} \dfrac{s}{\sqrt{n}}$
  where $t_{\alpha/2,\,n-1}$ is the critical value for a t-distribution with n−1 degrees of freedom which captures an area of α/2 in the right tail of the distribution.
Section 10.5 Interval Estimation of the Population
Mean for a Normal Population with σ Unknown
Example:
• Given the following data drawn from a normal population with unknown mean and variance, construct a 95% confidence interval for the population mean.
• Seven data values have been selected randomly from the population:
  25, 19, 37, 29, 40, 28, 31
Solution:
• The sample mean and standard deviation are $\bar{x} = 29.86$ and s = 7.08, respectively. The degrees of freedom are d.f. = n − 1 = 7 − 1 = 6.
Section 10.5 Interval Estimation of the Population
Mean for a Normal Population with σ Unknown
Solution:
• The t-value corresponding to 6 degrees of freedom and 95% confidence is given in Table D of Appendix A as $t_{0.025,\,6} = 2.447$. The corresponding confidence interval is
  $29.86 \pm 2.447 \left(\dfrac{7.08}{\sqrt{7}}\right)$
  $29.86 \pm 6.55$.
  Thus, we are 95 percent confident that the interval 23.31 to 36.41 will contain the population mean.
• An alternate interpretation would be that we are 95% confident that the point estimate (29.86) has a maximum error of estimation of 6.55.
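The whole example can be reproduced without a printed t-table. The sketch below is one possible approach, not the method used in the slides, which read the critical value from Table D. It integrates the t density with Simpson's rule and finds the critical value by bisection, using only the standard library. All function names are illustrative.

```python
import math
import statistics

def t_pdf(x, df):
    """Student's t density with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(alpha, df):
    """Value t with right-tail area alpha/2, located by bisection."""
    target = 1 - alpha / 2
    lo, hi = 0.0, 1.0
    while t_cdf(hi, df) < target:   # widen the bracket until it contains the target
        hi *= 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

data = [25, 19, 37, 29, 40, 28, 31]   # the seven values from the example
n = len(data)
xbar = statistics.mean(data)
s = statistics.stdev(data)            # sample standard deviation (n - 1 divisor)
t_val = t_critical(0.05, n - 1)
margin = t_val * s / math.sqrt(n)
low, high = xbar - margin, xbar + margin
print(round(xbar, 2), round(s, 2), round(t_val, 3), round(margin, 2))
print(round(low, 2), round(high, 2))
```

The results agree with the slide values: a mean of 29.86, s of 7.08, a critical value of 2.447, and an interval of roughly 23.31 to 36.41.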
Sections 10.7-10.9 Estimation (Proportions)
Objectives:
• To calculate the minimum sample size for proportions.
• To estimate the standard deviation of proportions.
Section 10.7 Estimating Population Attributes
Population Attributes:
• An attribute is a characteristic that members of a population either possess or do not possess. Attributes are almost always measured as the proportion of the population that possesses the characteristic.
• Examples include: the percentage of television viewers who are watching a particular program, the fraction of teachers who believe group learning is a beneficial instructional method, and the percentage defective in a lot of goods.
Section 10.7 Estimating Population Attributes
Estimating Population Attributes:
• Estimating the proportion of the population that possesses an attribute is straightforward. A random sample is selected and the sample proportion is computed as follows:
  X = number in the sample that possess the attribute,
  n = sample size, and
  $\hat{p} = \dfrac{X}{n}$.
• The symbol ^ (hat) above the p indicates an estimate of the quantity specified.
Section 10.7 Estimating Population Attributes
Example:
• Estimate the fraction of defective transistors in a lot containing 10,000 transistors. Suppose a sample of size 400 is drawn from the lot, and 3 transistors are found to be defective.
Solution:
• Let X = 3 and n = 400. Then $\hat{p} = \dfrac{3}{400} = 0.0075$.
  The question, though, is “How good is the estimate of the fraction of defective transistors?” The answer to this question arises in the discussion of interval estimation for proportions.
Section 10.8 Interval Estimation of a
Population Attribute
Interval Estimation of a Population Attribute:
• The concept of confidence intervals can also be applied to estimating proportions. In order to develop the confidence interval for a population proportion, the sampling distribution of the point estimate must be developed.
• For sufficiently large samples, the sample proportion, $\hat{p}$, is approximately normally distributed with mean p and variance
  $\sigma_{\hat{p}}^2 = \dfrac{p(1-p)}{n}$.
• If the sample size is sufficiently large, $np \ge 5$ and $n(1-p) \ge 5$, the (1−α) confidence interval for the population proportion is given by the expression $\hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}}$,
  where $z_{\alpha/2}$ is the distance from the point estimate to either end of the interval in standard deviation units, and $\sigma_{\hat{p}}$ is the standard deviation of $\hat{p}$.
Section 10.8 Interval Estimation of a
Population Attribute
Example:
• Suppose a sample of 410 randomly selected radio listeners revealed that 48 listened to WXQI.
  $\hat{p} = \dfrac{48}{410} = 0.117$
  This is a point estimate of the proportion that listen to WXQI.
Solution:
• To obtain an interval estimate, the amount of confidence to be placed in the interval must be specified. Suppose we desire 95% confidence.
  $z_{\alpha/2} = z_{0.05/2} = z_{0.025} = 1.96$, and
  $\sigma_{\hat{p}} \approx \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\dfrac{0.117(1-0.117)}{410}} \approx 0.0159$
Section 10.8 Interval Estimation of a
Population Attribute
Solution:
• Note on the previous slide that the sample proportion $\hat{p}$ is used in place of p in the computation of $\sigma_{\hat{p}}$. For any realistic problem, this will always be the case. Fortunately, unless $\hat{p}$ and p are far apart, the value of $\sigma_{\hat{p}}$ will not be greatly affected.
• Computing the confidence interval results in
  $\hat{p} \pm z_{\alpha/2}\,\sigma_{\hat{p}} = 0.117 \pm 1.96(0.0159)$
  $= 0.117 \pm 0.03116$
  $= 0.0858$ to $0.1482$
  We are 95% confident that the point estimate, 0.117, has a maximum error of estimation of 0.03116. A maximum error of only 0.03116 with 95% confidence suggests a rather high level of accuracy in the estimation of the proportion.
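A short script can reproduce the radio listener interval. This is a sketch: the function name is hypothetical, and the large-sample check substitutes $\hat{p}$ for the unknown p, which is an assumption of the sketch. It carries full precision rather than the rounded 0.117 and 0.0159, so the endpoints differ from the slide values in the fourth decimal place.

```python
import math

def proportion_confidence_interval(x, n, z):
    """Return (p_hat, standard_error, lower, upper) for p_hat +/- z * se."""
    p_hat = x / n
    # Large-sample condition from the slides, with p_hat standing in for p.
    if n * p_hat < 5 or n * (1 - p_hat) < 5:
        raise ValueError("sample too small for the normal approximation")
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * se
    return p_hat, se, p_hat - margin, p_hat + margin

# Values from the example: 48 of 410 listeners, 95% confidence.
p_hat, se, low, high = proportion_confidence_interval(x=48, n=410, z=1.96)
print(round(p_hat, 3), round(se, 4), round(low, 3), round(high, 3))
# 0.117 0.0159 0.086 0.148
```

Under the same check, the earlier transistor sample (3 defectives in 400) would be rejected, since $n\hat{p} = 3$ falls below 5.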
Section 10.9 Precision and Sample Size of
Population Attributes
Precision and Sample Size of Population Attributes:
• Just as for the population mean, a specific level of accuracy in estimating a population proportion is desirable. In order to estimate extremely small quantities, highly precise estimates are necessary. The technique for deriving the sample size parallels the discussion of the sample mean. Setting one-half of the entire width of the confidence interval equal to the maximum allowable error yields
  $\text{error} = z_{\alpha/2}\,\sigma_{\hat{p}} = z_{\alpha/2} \sqrt{\dfrac{p(1-p)}{n}}$.
  Solving for n yields
  $n = \dfrac{z_{\alpha/2}^2\, p(1-p)}{\text{error}^2}$.
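The intermediate algebra, which the slide skips, is to square both sides and then isolate n:

```latex
\begin{align*}
\text{error} &= z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \\
\text{error}^2 &= z_{\alpha/2}^2\,\frac{p(1-p)}{n} \\
n &= \frac{z_{\alpha/2}^2\,p(1-p)}{\text{error}^2}
\end{align*}
```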
Section 10.9 Precision and Sample Size of
Population Attributes
Precision and Sample Size of Population Attributes:
• Usually the population proportion is unknown and is estimated from a previous study. In this case the sample size is given by
  $n = \dfrac{z_{\alpha/2}^2\, \hat{p}(1-\hat{p})}{\text{error}^2}$,
  where $\hat{p}$ is the estimate of the population proportion obtained from the previous study.
• If an estimate of the population proportion is not available, then the population proportion is set equal to 0.5. The value 0.5 maximizes the quantity p(1 − p) and thus provides the most conservative estimate of the sample size possible. Remember to always round the sample size up to the next larger integer to assure the desired level of accuracy.
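The claim that 0.5 maximizes p(1 − p) follows from completing the square, and it leads to a simplified form of the sample size formula for the conservative case:

```latex
\begin{align*}
p(1-p) &= \frac{1}{4} - \left(p - \frac{1}{2}\right)^{2} \le \frac{1}{4},
  \quad \text{with equality only at } p = \tfrac{1}{2} \\
n_{\text{conservative}} &= \frac{z_{\alpha/2}^2 \cdot \tfrac{1}{4}}{\text{error}^2}
  = \frac{z_{\alpha/2}^2}{4\,\text{error}^2}
\end{align*}
```

Any other value of p gives a smaller p(1 − p) and therefore a smaller required sample, which is why 0.5 is the safe default.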
Section 10.9 Precision and Sample Size of
Population Attributes
Example:
• How large a sample would be required to estimate the proportion of buyers on a mailing list with an accuracy of 0.002, a 95% degree of confidence, and a previous proportion of 0.008?
Solution:
• Using the sample size formula yields
  $n = \dfrac{z_{\alpha/2}^2\, \hat{p}(1-\hat{p})}{\text{error}^2} = \dfrac{1.96^2 \cdot 0.008\,(1 - 0.008)}{0.002^2} = 7621.7344 \approx 7622$ (always round up).
  Thus, to be 95% confident that the proportion is estimated with an error of at most 0.002 requires a sample size of 7,622.
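The mailing list calculation can be checked with a small function. The name `sample_size_for_proportion` is an illustrative assumption; the inputs are the values from the example.

```python
import math

def sample_size_for_proportion(z, p, error):
    """Smallest integer n with z * sqrt(p * (1 - p) / n) <= error."""
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)

# Values from the example: 95% confidence, prior proportion 0.008, error 0.002.
n = sample_size_for_proportion(z=1.96, p=0.008, error=0.002)
print(n)  # 7622
```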