Chapter Eight
Estimation
Understanding Basic Statistics
Fourth Edition
By Brase and Brase
Prepared by: Lynn Smith
Gloucester County College
Estimating μ When σ is Known
Assumptions:
• We have a simple random sample of
size n.
• If the distribution is normal, methods
work for any sample size.
• If the distribution is unknown, a sample
size of at least 30 (sometimes even
more) is required.
Copyright © Houghton Mifflin Company. All rights reserved.
Point Estimate
• an estimate of a population parameter
given by a single number
Examples of Point Estimates
• x̄ is used as a point estimate for μ.
• s is used as a point estimate for σ.
Margin of Error
• the magnitude of the difference between
the point estimate and the true
parameter value
Margin of Error
The margin of error using x̄
as a point estimate for μ is
|x̄ – μ|
Confidence Level
• A confidence level, c, is a measure of
the degree of assurance we have in our
results.
• The value of c may be any number
between zero and one.
• Typical values for c include 0.90, 0.95,
and 0.99.
Critical Value for a Confidence Level, c
• the value zc such that the area under
the standard normal curve falling
between – zc and zc is equal to c.
[Figure: Confidence Level]
Find z0.90 such that 90% of the area
under the normal curve lies between
–z0.90 and z0.90
P(– z0.90 < z < z0.90 ) = 0.90
Find z0.90 such that 90% of the area under
the normal curve lies
between –z0.90 and z0.90
P(z < –z0.90) = (1 – 0.90)/2 = 0.05
Find z0.90 such that 90% of the area under
the normal curve lies
between –z0.90 and z0.90
• According to Appendix Table 3, 0.0500 lies exactly
halfway between two values in the table (.0505 and
.0495).
• Averaging the z values associated with these areas
(–1.64 and –1.65) gives –1.645, so z0.90 = 1.645.
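The table lookup above can also be sketched in code. This is a minimal illustration, not part of the original slides: it assumes Python 3.8+ and uses the standard library's NormalDist in place of Appendix Table 3. The function name z_critical is hypothetical.

```python
from statistics import NormalDist


def z_critical(c):
    """Return zc such that the area under the standard normal curve
    between -zc and zc equals the confidence level c."""
    if not 0 < c < 1:
        raise ValueError("confidence level c must be between 0 and 1")
    left_tail = (1 - c) / 2  # area to the left of -zc
    # inv_cdf(left_tail) is -zc, so flip the sign.
    return -NormalDist().inv_cdf(left_tail)


for c in (0.90, 0.95, 0.99):
    print(f"c = {c:.2f}  zc = {z_critical(c):.3f}")
```

For c = 0.99 this computes about 2.576, which the slides round to 2.58.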
Common Levels of Confidence and Their
Corresponding Critical Values
Confidence Interval
• A c confidence interval for μ is an
interval computed from sample data.
• In a c confidence interval for μ, c is the
probability of generating an interval
containing the actual value of μ.
To Find a Confidence
Interval for μ When σ is Known:
• Let x represent the appropriate random
variable.
• Obtain a simple random sample (of size n)
of x values.
• Compute the sample mean, x̄.
• If you cannot assume x has a normal
distribution, use a sample size of 30 or
more.
Confidence Interval for μ
When σ is Known:
x̄ – E < μ < x̄ + E
where
x̄ = sample mean
$$E = z_c \frac{\sigma}{\sqrt{n}}$$
c = confidence level
zc = critical value for confidence level c,
based on the standard normal distribution
Create a 95% confidence interval for
the mean driving time between
Philadelphia and Boston.
Assume that the mean driving time of
64 trips was 6.4 hours and that the
standard deviation is 0.9 hours.
Creating a 95% confidence interval
x̄ = 6.4 hours
σ = 0.9 hours
c = 95%, so zc = 1.96
n = 64
x̄ = 6.4 hours
σ = 0.9 hours
95% confidence interval will be from
x̄ – E to x̄ + E
x̄ = 6.4 hours
σ = 0.9 hours
c = 95%, so zc = 1.96
n = 64
$$E = z_c \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{0.9}{\sqrt{64}} = 0.2205$$
95% Confidence Interval:
6.4 – 0.2205 < μ < 6.4 + 0.2205
6.1795 < μ < 6.6205
We are 95% confident that the true mean driving time is
between 6.18 and 6.62 hours.
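The driving-time interval can be reproduced with a short script. This is a sketch, assuming Python 3.8+; the function name mean_ci_known_sigma is hypothetical, and the critical value is computed from the standard library rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_known_sigma(x_bar, sigma, n, c):
    """Return (low, high, E) for a c confidence interval for the mean
    when the population standard deviation sigma is known."""
    z_c = NormalDist().inv_cdf((1 + c) / 2)  # critical value zc
    E = z_c * sigma / sqrt(n)                # margin of error
    return x_bar - E, x_bar + E, E


low, high, E = mean_ci_known_sigma(x_bar=6.4, sigma=0.9, n=64, c=0.95)
print(f"E = {E:.4f}; {low:.2f} < mu < {high:.2f}")
```

With the slide's values this gives E of about 0.2205 and an interval from about 6.18 to 6.62 hours.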
We may get different confidence intervals
for different samples.
• For each sample the c confidence interval
goes from
x̄ – E to x̄ + E.
• If we select many samples of the same size
and find the corresponding confidence
intervals, then the proportion of these
intervals that actually contain μ is c.
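The long-run claim above can be checked with a small simulation. This is only an illustration under stated assumptions: the population is taken to be normal with μ = 6.4 and σ = 0.9 (values borrowed from the driving-time example), and the function name coverage, the trial count, and the seed are all hypothetical choices.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def coverage(mu, sigma, n, c, trials, seed=0):
    """Draw many samples of size n from a normal population and return
    the fraction of c confidence intervals that contain mu."""
    rng = random.Random(seed)
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    E = z_c * sigma / sqrt(n)  # same margin of error for every sample
    hits = 0
    for _ in range(trials):
        x_bar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - E < mu < x_bar + E:
            hits += 1
    return hits / trials


rate = coverage(mu=6.4, sigma=0.9, n=64, c=0.95, trials=4000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

The printed fraction should land close to 0.95, though it varies a little with the seed and the number of trials.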
When estimating the mean, how
large a sample must be used in order
to assure a given level of
confidence?
Use the formula:
$$n = \left(\frac{z_c \sigma}{E}\right)^2$$
How do we determine the value of the
population standard deviation, σ?
• Use the standard deviation, s, of a
preliminary sample of size 30 or larger
to estimate σ.
Determine the sample size necessary to
determine (with 99% confidence) the mean
time it takes to drive from Philadelphia to
Boston. We wish to be within 15 minutes of
the true time. Assume that a preliminary
sample of 45 trips had a standard deviation
of 0.8 hours.
... determine with 99% confidence...
• z0.99 = 2.58
... We wish to be within 15 minutes of the
true time. ...
• E = 15 minutes = 0.25 hours
...a preliminary sample of 45 trips had a
standard deviation of 0.8 hours.
• Since the preliminary sample is large
enough, we can assume that the
population standard deviation is
approximately equal to 0.8 hours.
Minimum Sample Size =
$$n = \left(\frac{z_c \sigma}{E}\right)^2 = \left(\frac{2.58(0.8)}{0.25}\right)^2 \approx 68.16$$
Rounding Sample Size
• Any fractional value of n is always
rounded to the next higher whole
number.
Minimum Sample Size
• n = 68.16
• Round to the next higher whole
number.
• To be 99% confident in our results, the
minimum sample size = 69.
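The sample-size calculation and the round-up rule can be sketched as follows. The function name min_sample_size_mean is hypothetical, and the critical value is passed in as the table value 2.58 used on the slides. An unrounded critical value (about 2.576) gives a value of n just under 68, so the answer depends on how zc is rounded.

```python
from math import ceil


def min_sample_size_mean(z_c, sigma, E):
    """Smallest whole n with margin of error at most E, given the
    critical value z_c and an estimate sigma of the population
    standard deviation."""
    n_exact = (z_c * sigma / E) ** 2
    return ceil(n_exact)  # always round up to the next whole number


n = min_sample_size_mean(z_c=2.58, sigma=0.8, E=0.25)
print(n)
```

With the slide's inputs this returns 69.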
Estimating μ When σ is Unknown
• Apply the Student’s t distribution.
Student’s t Variable
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
The shape of the t distribution
depends only on the sample size, n,
if the basic variable x has a normal
distribution.
When using the t distribution, we will
assume that the x distribution is
normal.
Appendix Table 4
(Page A8)
gives values of the variable t
corresponding to the number of
degrees of freedom (d.f.)
Degrees of Freedom
• d.f. = n – 1
• where n = sample size
The t Distribution has a Shape
Similar to that of the Normal
Distribution
Properties of a Student’s t Distribution
• Symmetric about the mean 0.
• Depends on the degrees of freedom.
• Bell-shaped with thicker tails than the
normal distribution.
• As the degrees of freedom increase,
the t distribution approaches the
standard normal distribution.
Appendix Table 4
• Gives various t values for different
degrees of freedom
Using Table 4 to Find Critical Values tc for a
c Confidence Level
If the required d.f. are not in the table:
• Use the closest d.f. that is smaller than
the needed d.f.
• This results in a larger critical value tc.
• The resulting confidence interval will be
longer and have a probability slightly
higher than c.
Using Table 4 to Find Critical Values of tc
• Find the column in the table with the
given c heading
• Compute the number of degrees of
freedom:
d.f. = n – 1
• Read down the column under the
appropriate c value until we reach the
row headed by the appropriate d.f.
To find the critical value tc for a 95%
confidence interval if n = 8.
• Find the column in the table with c
heading 0.950
To find the critical value tc for a 95%
confidence interval if n = 8.
• Compute the number of degrees of
freedom:
d.f. = n – 1 = 8 – 1 = 7
To find the critical value tc for a 95%
confidence interval if n = 8.
• Read down the column under the appropriate c
value until we reach the row headed by d.f. = 7
Find the critical value tc for a 95%
confidence interval if n = 8.
tc = 2.365
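The table value can be cross-checked numerically. The sketch below is an assumption-laden stand-in for Appendix Table 4, not the textbook's method: it integrates the Student's t density with Simpson's rule and then bisects for the value tc whose central area equals c. All function names and the step size are hypothetical, and only the Python standard library is used.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def central_area(t, df, steps_per_unit=200):
    """Area under the t density between -t and t (composite Simpson)."""
    if t <= 0:
        return 0.0
    n = 2 * max(1, math.ceil(t * steps_per_unit / 2))  # even interval count
    h = t / n
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3  # density is symmetric about 0


def t_critical(c, df):
    """Return tc such that the area between -tc and tc equals c."""
    lo, hi = 0.0, 1.0
    while central_area(hi, df) < c:  # widen until the target is bracketed
        hi *= 2
    for _ in range(60):              # bisection
        mid = (lo + hi) / 2
        if central_area(mid, df) < c:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(f"{t_critical(0.95, 7):.3f}")
```

For c = 0.95 and d.f. = 7 this should agree with the table value 2.365 to three decimal places.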
Finding Confidence Intervals
for μ When σ is Unknown
x̄ – E < μ < x̄ + E
where
x̄ = sample mean
$$E = t_c \frac{s}{\sqrt{n}}$$
c = confidence level (0 < c < 1)
tc = critical value for confidence level c,
and degrees of freedom = n – 1
The mean weight of eight fish caught in
a local lake is 15.7 ounces with a
standard deviation of 2.3 ounces.
Construct a 90% confidence interval for
the mean weight of the population of
fish in the lake.
Mean = 15.7 ounces
Standard deviation = 2.3 ounces.
• n = 8, so d.f. = n – 1 = 7
• For c = 0.90, Appendix Table 4
gives t0.90 = 1.895.
$$E = t_c \frac{s}{\sqrt{n}} = 1.895 \cdot \frac{2.3}{\sqrt{8}} \approx 1.54$$
Mean = 15.7 ounces
Standard deviation = 2.3 ounces.
E = 1.54
The 90% confidence interval is:
x̄ – E < μ < x̄ + E
15.7 – 1.54 < μ < 15.7 + 1.54
14.16 < μ < 17.24
The 90% Confidence Interval:
14.16 < μ < 17.24
• We are 90% sure that the true mean
weight of the fish in the lake is between
14.16 and 17.24 ounces.
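The fish-weight interval can be reproduced in a few lines. This is a sketch only: the function name mean_ci_t is hypothetical, and the critical value 1.895 is passed in as read from the table (c = 0.90, d.f. = 7) rather than computed.

```python
from math import sqrt


def mean_ci_t(x_bar, s, n, t_c):
    """Return (low, high, E) for a confidence interval for the mean when
    sigma is unknown; t_c is the critical value for n - 1 degrees of
    freedom, supplied by the caller."""
    E = t_c * s / sqrt(n)  # margin of error
    return x_bar - E, x_bar + E, E


low, high, E = mean_ci_t(x_bar=15.7, s=2.3, n=8, t_c=1.895)
print(f"E = {E:.2f}; {low:.2f} < mu < {high:.2f}")
```

With the slide's values this gives E of about 1.54 and an interval from about 14.16 to 17.24 ounces.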
Review of the Binomial Distribution
• Completely determined by the number
of trials (n) and the probability of
success (p) in a single trial.
• q = 1 – p
• If np and nq are both > 5, the binomial
distribution can be approximated by the
normal distribution.
A Point Estimate for p, the Population
Proportion of Successes
$$\hat{p} = \frac{r}{n}$$
(p̂ is read as "p hat")
Point Estimate for q
(Population Proportion of Failures)
q̂ = 1 – p̂
(q̂ is read as "q hat")
For a sample of 500 airplane departures,
370 departed on time. Use this
information to estimate the probability
that an airplane from the entire
population departs on time.
$$\hat{p} = \frac{r}{n} = \frac{370}{500} = 0.74$$
We estimate that there is a 74% chance that any
given flight will depart on time.
Margin of Error for p̂
as a Point Estimate for p
|p̂ – p|
Maximal Margin of Error
• the maximal error of estimate E for a
confidence interval
$$E = z_c \sqrt{\frac{pq}{n}}$$
Confidence Interval for p for Large Samples
(np and nq > 5)
p̂ – E < p < p̂ + E
where
$$\hat{p} = \frac{r}{n} \quad \text{and} \quad E = z_c \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$
c = confidence level
zc = critical value for confidence level c
taken from a normal distribution
For a sample of 500 airplane departures, 370
departed on time. Find a 99% confidence
interval for the proportion of airplanes that
depart on time.
Is the use of the normal distribution justified?
n = 500
p̂ = 0.74
For a sample of 500 airplane departures, 370
departed on time. Find a 99% confidence
interval for the proportion of airplanes that
depart on time.
Can we use the normal distribution?
np̂ = 370
nq̂ = 130
For a sample of 500 airplane departures, 370
departed on time. Find a 99% confidence
interval for the proportion of airplanes that
depart on time.
np̂ and nq̂ are both > 5,
so the use of the normal distribution is justified.
Out of 500 departures, 370 departed on time.
Find a 99% confidence interval.
$$\hat{p} = \frac{r}{n} = \frac{370}{500} = 0.74$$
$$E = 2.58 \sqrt{\frac{0.74(0.26)}{500}} \approx 0.0506$$
99% confidence interval for the proportion
of airplanes that depart on time:
E = 0.0506
Confidence interval is:
p̂ – E < p < p̂ + E
0.74 – 0.0506 < p < 0.74 + 0.0506
0.6894 < p < 0.7906
99% confidence interval for the
proportion of airplanes that depart on
time
Confidence interval is
0.6894 < p < 0.7906
We are 99% confident that between 69%
and 79% of the planes depart on time.
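The airplane example, including the np̂ and nq̂ check, can be sketched as below. The function name proportion_ci is hypothetical, and the critical value 2.58 is the rounded table value used on the slides.

```python
from math import sqrt


def proportion_ci(r, n, z_c):
    """Return (low, high, E) for a large-sample confidence interval for
    a proportion, given r successes in n trials and critical value z_c."""
    p_hat = r / n
    q_hat = 1 - p_hat
    # Normal approximation requires both n*p_hat and n*q_hat above 5.
    if n * p_hat <= 5 or n * q_hat <= 5:
        raise ValueError("sample too small for the normal approximation")
    E = z_c * sqrt(p_hat * q_hat / n)  # margin of error
    return p_hat - E, p_hat + E, E


low, high, E = proportion_ci(r=370, n=500, z_c=2.58)
print(f"E = {E:.4f}; {low:.4f} < p < {high:.4f}")
```

With 370 on-time departures out of 500 this gives E of about 0.0506 and an interval from about 0.6894 to 0.7906.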
The point estimate and the confidence
interval do not depend on the size of
the population.
The sample size, however, does affect
the accuracy of the statistical estimate.
Interpretation of Poll Results
The proportion responding in a certain way is p̂,
the sample estimate of the population proportion.
A 95% confidence interval for
population proportion p is:
p̂ – margin of error < p < p̂ + margin of error
p̂ = poll report
Interpret the following poll results:
“A recent survey of 400 households
indicated that 84% of the households
surveyed preferred a new breakfast
cereal to their previous brand.
Chances are 19 out of 20 that if all
households had been surveyed, the
results would differ by no more than 3.5
percentage points in either direction.”
“Chances are 19 out of 20 …”
• 19/20 = 0.95
• A 95% confidence interval is being
used.
“... 84% of the households surveyed
preferred …”
• 84% represents the percentage of
households who preferred the new
cereal.
• 84% represents p̂.
“... the results would differ by no more than
3.5 percentage points in either direction.”
• 3.5% represents the margin of error, E.
• The confidence interval is:
84% – 3.5% < p < 84% + 3.5%
80.5% < p < 87.5%
The poll indicates (with 95% confidence):
• between 80.5% and 87.5% of the
population prefer the new cereal.
Sample Size for Estimating p for the
Binomial Distribution
Formula for Minimum Sample Size for
Estimating p for the Binomial Distribution
If p is an estimate of the true
population proportion,
$$n = p(1 - p)\left(\frac{z_c}{E}\right)^2$$
Formula for Minimum Sample Size for
Estimating p for the Binomial Distribution
If we have no preliminary estimate for
p, the probability is at least c that the
point estimate r/n for p will be in error
by less than the quantity E if n is at
least:
$$n = \frac{1}{4}\left(\frac{z_c}{E}\right)^2$$
The manager of a furniture store wishes
to estimate the proportion of orders
delivered by the manufacturer in less
than three weeks. She wishes to be 95%
sure that her point estimate is in error
either way by less than 0.05. Assume no
preliminary study is done to estimate p.
She wishes to be 95% sure ...
• z0.95 = 1.96
... that her point estimate is in error either
way by less than 0.05.
• E = 0.05
... no preliminary study is
done to estimate p.
$$n = \frac{1}{4}\left(\frac{z_c}{E}\right)^2 = \frac{1}{4}\left(\frac{1.96}{0.05}\right)^2 = 384.16$$
The minimum required sample size (if no
preliminary study is done to estimate p) is 385.
If a preliminary estimate of p indicated that p
was approximately equal to 0.75:
$$n = pq\left(\frac{z_c}{E}\right)^2 = 0.75(0.25)\left(\frac{1.96}{0.05}\right)^2 = 288.12$$
The minimum required sample size (if this
preliminary study is done to estimate p) is 289.
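Both sample-size cases for the furniture store example can be sketched with one function. The name min_sample_size_proportion and the optional argument p are hypothetical. When no preliminary estimate is supplied, the factor 1/4 is used, which is the largest value p(1 – p) can take.

```python
from math import ceil


def min_sample_size_proportion(z_c, E, p=None):
    """Smallest whole n so that the point estimate for a proportion is
    within E of the true value at the confidence level matching z_c.
    If no preliminary estimate p is given, use p(1 - p) = 1/4."""
    pq = 0.25 if p is None else p * (1 - p)
    n_exact = pq * (z_c / E) ** 2
    return ceil(n_exact)  # always round up to the next whole number


print(min_sample_size_proportion(z_c=1.96, E=0.05))           # no estimate
print(min_sample_size_proportion(z_c=1.96, E=0.05, p=0.75))   # p near 0.75
```

These two calls return 385 and 289, matching the two cases worked on the slides.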