Interval Estimation of the Population Mean for a Normal Population with σ Unknown
Student’s t-distribution
• If σ is not known and n < 30, the derivation of the confidence interval must be changed slightly.
• Provided the population from which the sample is drawn is normally distributed, the quantity
$t = \dfrac{\bar{x} - \mu}{s / \sqrt{n}}$
has a Student’s t-distribution, which was discovered by W. S. Gosset in 1908.
Degrees of Freedom
• The t-distribution has one
parameter, degrees of
freedom.
• The degrees of freedom for the t-distribution used here are computed as
d.f. = n - 1,
where n is the number of sample observations.
Shape of the t-distribution
• The t-distribution is very much like the normal; it is a symmetrical, bell-shaped distribution with slightly heavier tails than the normal.
• In fact, as the degrees of freedom become larger, the t-distribution approaches the normal distribution.
[Figure: a normal curve and a t curve drawn on the same axes]
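The comparison between the two curves can be checked numerically. The sketch below is an illustration only: it evaluates the textbook t density with `math.gamma` and compares it with the standard normal density at a point in the tail. The helper name `t_pdf`, the evaluation point x = 3, and the chosen degrees of freedom are assumptions made for the example, not part of the slides.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


standard_normal = NormalDist()  # mean 0, standard deviation 1

# Compare the densities three units from the center, where tail weight shows up.
x = 3.0
tail_gap = {df: t_pdf(x, df) - standard_normal.pdf(x) for df in (2, 5, 30, 100)}

for df, gap in tail_gap.items():
    print(f"d.f. = {df:3d}: t density exceeds the normal density at x = {x} by {gap:.5f}")
```

The gap stays positive (heavier tails) and shrinks as the degrees of freedom grow, which is the behavior the slide describes.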
Confidence Interval for the Population Mean
Definition:
If n < 30, σ is unknown, and the sample is drawn from a normal population, a (1 - α) confidence interval for the population mean is given by
$\bar{x} \pm t_{\alpha/2,\,n-1}\, \dfrac{s}{\sqrt{n}}$
where $t_{\alpha/2,\,n-1}$ is the critical value for a t-distribution with n - 1 degrees of freedom which captures an area of α/2 in the right tail of the distribution.
Example 5
Construct an 80% confidence
interval for the mean of a normal
population assuming that the
values listed below comprise a
random sample taken from the
population.
83.9  80.3  91.9  87.4  92.7
71.1  65.2  87.5  79.1  86.0
69.3  72.4  73.1  77.5  88.2
The population standard deviation is unknown.
Example 5 - Solution
We are given:
X has a normal distribution,
the variance is unknown,
n = 15,
and the confidence level = .80.
1 - α = .80
α = .20
α/2 = .20/2 = .10
d.f. = n - 1 = 15 - 1 = 14
[Figure: t-distribution with 14 degrees of freedom; central area .80 (.40 on each side of the center) and area .10 in each tail]
$t_{.10,\,14} = 1.345$
From the sample we calculate x̄ = 80.37 and s = 8.68. We want to determine an 80% confidence interval for the mean. Since X is normal, σ is unknown, and n < 30, the confidence interval is given by
$\bar{x} \pm t_{\alpha/2,\,n-1}\, \dfrac{s}{\sqrt{n}} = 80.37 \pm 1.345\, \dfrac{8.68}{\sqrt{15}} = 80.37 \pm 3.01 = (77.36,\ 83.38)$.
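The arithmetic in Example 5 can be reproduced with a short script. This is a minimal sketch using only the Python standard library. The critical value 1.345 is copied from the t-table lookup in the example rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Sample from Example 5.
sample = [83.9, 80.3, 91.9, 87.4, 92.7,
          71.1, 65.2, 87.5, 79.1, 86.0,
          69.3, 72.4, 73.1, 77.5, 88.2]

n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Critical value t_{.10, 14}, taken from the example's table lookup (not computed here).
T_CRITICAL = 1.345

margin = T_CRITICAL * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.2f}")
print(f"80% confidence interval: ({lower:.2f}, {upper:.2f})")
```

Because the script carries the unrounded mean and standard deviation through the calculation, its endpoints can differ from the hand-worked values in the second decimal place.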
Precision and Sample Size
Precision
• The best interval estimates a decision maker could hope for would be those which are very narrow and carry a high level of confidence.
• The width of the confidence interval defines the precision with which the population mean is estimated; the narrower the interval, the greater the precision.
Components Which Affect the Width
Note: The entire confidence interval width = $2\, z_{\alpha/2}\, \dfrac{\sigma}{\sqrt{n}}$.
$z_{\alpha/2}$ represents the distance the confidence interval boundary is from the mean in standard deviation units. The distance is related to the specified level of confidence.
σ represents the population standard deviation.
n represents the sample size.
Which component can change?
• Since the population standard deviation (σ) is a constant, it does not change.
• However, the sample size, n, is selected by the decision maker.
• Since the sample size can enlarge or reduce the width of the confidence interval, how large should the sample be?
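To see how strongly the sample size drives the width, the sketch below evaluates width = 2 z σ / √n for a few sample sizes. The values σ = 15, 95% confidence, and the sample sizes 25, 100, and 400 are assumptions chosen for illustration, and `interval_width` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def interval_width(sigma: float, n: int, confidence: float) -> float:
    """Full width of the z-based confidence interval for the mean."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    return 2 * z * sigma / math.sqrt(n)


SIGMA = 15.0        # assumed population standard deviation (illustrative)
CONFIDENCE = 0.95   # assumed confidence level (illustrative)

widths = {n: interval_width(SIGMA, n, CONFIDENCE) for n in (25, 100, 400)}

for n, width in widths.items():
    print(f"n = {n:3d}: width = {width:.2f}")
```

Since n enters through its square root, quadrupling the sample size only halves the width.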
Error
• The sample size should be
selected in relation to the size
of the maximum positive or
negative error the decision
maker is willing to accept.
• This can be achieved by setting the error equal to one half the confidence interval width,
error = $z_{\alpha/2}\, \dfrac{\sigma}{\sqrt{n}}$.
Determining the Sample Size
• The equation for error can be solved for the sample size,
$n = \left( \dfrac{z_{\alpha/2}\, \sigma}{\text{error}} \right)^2$.
• By selecting a level of confidence and the maximum error, the relationship can be used to determine the sample size necessary to estimate the population mean with the desired accuracy.
• In order to assure the desired level of confidence, always round the value obtained for the sample size up to the next whole number.
Example 6
A computer software company
would like to estimate how long
it will take a beginner to become
proficient at creating a graph
using their new spreadsheet
package.
• Past experience has indicated that the time required for a beginner to become proficient with a particular function of new software products has an approximately normal distribution with a standard deviation of 15 minutes.
• Find the sample size necessary to
estimate the average time required
for a beginner to become proficient
at creating a graph with the new
spreadsheet package to within 5
minutes with 95% confidence.
Example 6 - Solution
We are given:
σ = 15,
error = E = 5,
and the confidence level = .95.
1 - α = .95
α = .05
α/2 = .05/2 = .025
[Figure: standard normal curve; area .475 between the mean and z, area .025 in the right tail]
$z_{.025} = 1.96$
We want to determine the sample size necessary to estimate the mean time required for a beginner to become proficient at creating a graph with the new spreadsheet package.
The sample size is given by
$n = \left[ \dfrac{z_{\alpha/2}\, \sigma}{E} \right]^2 = \left[ \dfrac{1.96(15)}{5} \right]^2 = 34.5744$.
To get the desired accuracy, we must round up to 35.
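The sample-size calculation in Example 6 can be sketched in a few lines. The function name `sample_size_for_mean` is hypothetical. The critical value comes from `statistics.NormalDist` instead of a table, so the intermediate value differs slightly from 34.5744, but the rounded-up answer is the same.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma: float, error: float, confidence: float) -> int:
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) does not exceed the error."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95% confidence
    return math.ceil((z * sigma / error) ** 2)  # always round up


# Values from Example 6: sigma = 15 minutes, error = 5 minutes, 95% confidence.
n_required = sample_size_for_mean(sigma=15, error=5, confidence=0.95)
print(f"Required sample size: {n_required}")
```

`math.ceil` implements the "always round up" rule, since rounding down would leave the error slightly above the target.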
Sample Size for σ Unknown
• The most obvious method for obtaining an estimate of σ is to take a small sample and use the sample standard deviation as an estimate of the population standard deviation.
• Replacing σ with s in the sample size determination relationship will provide an initial estimate of the required sample size.
Estimating Population Attributes
Attribute
• An attribute is a characteristic that
members of a population either do or
do not possess.
• Attributes are almost always measured as the proportion of the population that possesses the characteristic.
• Example:
An attribute of a person would be whether they smoke cigarettes or not.
p = proportion of the population that smokes cigarettes
Estimating the Proportion
• Estimating the proportion of the population that possesses an attribute is straightforward.
• A random sample is selected and the sample proportion is computed as follows:
X = number in the sample that possess the attribute,
n = sample size, and
$\hat{p} = \dfrac{X}{n}$.
Example 7
The Richland Gazette, a local
newspaper, conducted a poll of
1,000 randomly selected
readers to determine their views
concerning the city's handling of
snow removal. The paper found
that 650 people in the sample
felt the city did a good job.
Compute the best point estimate
for the percentage of readers
who believe the city is doing a
good job in snow removal.
Example 7 - Solution
X = number of people who believe the city is doing a good job
X = 650
n = number of randomly selected readers
n = 1000
$\hat{p} = \dfrac{X}{n} = \dfrac{650}{1000} = .65$
Interval Estimation of a Population Attribute
Sampling Distribution of the Point Estimate
• In order to develop the confidence interval for a population proportion, the sampling distribution of the point estimate must be developed.
• The count X has a binomial distribution, so the point estimate p̂ = X/n has a scaled binomial distribution, which is approximated with a normal random variable.
The Sample Proportion
The sample proportion, p̂, is approximately normally distributed with mean p and variance $\dfrac{p(1-p)}{n}$.
• The standard deviation of the sample proportion (p̂) is denoted symbolically as $\sigma_{\hat{p}}$ and is given by
$\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}} \approx \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$,
where p̂ is used as an estimate of p if the population proportion is unknown.
Confidence Interval for the Population Proportion
Definition:
If the sample size is sufficiently large, np ≥ 5 and n(1 - p) ≥ 5, the 1 - α confidence interval for the population proportion is given by the expression
$\hat{p} \pm z_{\alpha/2}\, \sigma_{\hat{p}}$
where $z_{\alpha/2}$ is the distance from the point estimate to the end of the interval in standard deviation units, and $\sigma_{\hat{p}}$ is the standard deviation of p̂.
Example 8
The Peacock Cable Television
Company thinks that 40% of their
customers have more outlets wired
than they are paying for.
A random sample of 400 houses
reveals that 110 of the houses have
excessive outlets.
Construct a 99% confidence
interval for the true proportion of
houses having too many outlets.
Do you feel the company is
accurate in its belief about the
proportion of customers who
have more outlets wired than
they are paying for?
Example 8 - Solution
We are given:
X = number of houses that have excessive outlets = 110,
n = 400,
and the confidence level = .99.
Thus, $\hat{p} = \dfrac{110}{400} = .275$.
1 - α = .99
α = .01
α/2 = .01/2 = .005
[Figure: standard normal curve; area .495 between the mean and z, area .005 in the right tail]
$z_{.005} = 2.575$
A 99% confidence interval for the true proportion of houses having too many outlets is given by
$\hat{p} \pm z_{\alpha/2} \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = .275 \pm 2.575 \sqrt{\dfrac{.275(1-.275)}{400}} = .275 \pm .0575 = (.2175,\ .3325)$.
• Interpretation: We are 99%
confident that the true
proportion of houses having
too many outlets is between
.2175 and .3325.
• Note: We do not believe that
40% of customers have more
outlets wired than they are
paying for because we are
99% confident that the true
proportion is in the interval
21.75% to 33.25% and 40% is
not in that interval.
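The interval in Example 8 and the check of the company's 40% belief can be reproduced as follows. This is a sketch under the example's own assumptions: the critical value 2.575 is the table value used in the example, and the variable names are illustrative.

```python
import math

# Data from Example 8.
x, n = 110, 400
Z_CRITICAL = 2.575  # z_{.005} as read from the normal table in the example

p_hat = x / n
std_err = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard deviation of p-hat
margin = Z_CRITICAL * std_err
lower, upper = p_hat - margin, p_hat + margin

# Large-sample condition, checked with p-hat standing in for the unknown p.
large_enough = n * p_hat >= 5 and n * (1 - p_hat) >= 5

# The company's belief that 40% of customers have extra outlets.
claimed = 0.40
claim_in_interval = lower <= claimed <= upper

print(f"p-hat = {p_hat:.3f}, margin = {margin:.4f}")
print(f"99% confidence interval: ({lower:.4f}, {upper:.4f})")
print(f"Large-sample condition met: {large_enough}")
print(f"Claimed value {claimed} inside interval: {claim_in_interval}")
```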
Precision and Sample Size for Population Attributes
Accuracy in Estimating a
Population Proportion
• Just as for the population mean,
a specific level of accuracy in
estimating a population
proportion is desirable.
• When we estimate extremely
small quantities, highly precise
estimates are necessary.
Deriving the Sample Size
• The technique for deriving the sample size parallels that used for the population mean.
• Setting one half the entire width of the confidence interval equal to the maximum allowable error yields
error = $z_{\alpha/2}\, \sigma_{\hat{p}} = z_{\alpha/2} \sqrt{\dfrac{p(1-p)}{n}}$.
• Solving for n yields
$n = \dfrac{z_{\alpha/2}^2\; p(1-p)}{\text{error}^2}$.
Sample Size when p is Known
• Generally the population proportion is unknown and is estimated from a pilot study.
• In this case, the sample size necessary to estimate the population proportion to within a particular error with a certain level of confidence is given by
$n = \dfrac{z_{\alpha/2}^2\; \hat{p}(1-\hat{p})}{\text{error}^2}$,
where p̂ is the estimate obtained from the pilot study.
Sample Size when p is Unknown
• If an estimate of the population proportion is not available, then the population proportion is set equal to .5 and the sample size is given by
$n = \dfrac{z_{\alpha/2}^2\; (.5)(1-.5)}{\text{error}^2}$.
• The value .5 maximizes the quantity p(1 - p) and thus provides the most conservative estimate of the sample size.
Example 9
Researchers working in a remote area of Africa feel that 40% of the families in the area are without adequate drinking water either through contamination or unavailability. What sample size will be necessary to estimate the percent without adequate water to within 5% with 95% confidence?
Example 9 - Solution
We are given:
$\hat{p} = .40$,
error = E = .05,
and the confidence level = .95.
1 - α = .95
α = .05
α/2 = .05/2 = .025
[Figure: standard normal curve; area .475 between the mean and z, area .025 in the right tail]
$z_{.025} = 1.96$
We want to determine the sample size necessary to estimate the proportion of families in the area without adequate drinking water. Since we have an estimate of p, the sample size is given by
$n = \dfrac{z_{\alpha/2}^2\; \hat{p}(1-\hat{p})}{E^2} = \dfrac{(1.96)^2 (.40)(1-.40)}{(.05)^2} = 368.79$.
To get the desired accuracy, we must round up to 369 families.
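The calculation in Example 9 can be sketched as a small function that also covers the case with no prior estimate of p. The function name `sample_size_for_proportion` and its default of .5 are illustrative choices. The critical value comes from `statistics.NormalDist` rather than a table, so the intermediate value differs slightly from 368.79 while the rounded-up result is unchanged.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(error: float, confidence: float,
                               p_estimate: float = 0.5) -> int:
    """Smallest n that estimates a proportion to within `error`.

    p_estimate defaults to .5, the conservative choice when no pilot
    estimate of the population proportion is available.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95% confidence
    return math.ceil(z ** 2 * p_estimate * (1 - p_estimate) / error ** 2)


# Example 9: prior estimate .40, error .05, 95% confidence.
n_with_estimate = sample_size_for_proportion(0.05, 0.95, p_estimate=0.40)

# Same error and confidence, but with no prior estimate of p.
n_conservative = sample_size_for_proportion(0.05, 0.95)

print(f"With p-hat = .40: n = {n_with_estimate}")
print(f"Conservative (p = .5): n = {n_conservative}")
```

The conservative choice asks for a larger sample because p(1 - p) is largest at p = .5.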