CHAPTER 7
Confidence Intervals and Sample Size
Objectives
• Find the confidence interval for the mean when $\sigma$ is known or n > 30.
• Determine the minimum sample size for finding a confidence interval for the mean.
• Find the confidence interval for the mean when $\sigma$ is unknown and n < 30.
• Find the confidence interval for a proportion.
• Determine the minimum sample size for finding a confidence interval for a proportion.
• Find a confidence interval for a variance and a standard deviation.
7.1 Introduction
• Estimation is the process of estimating the value of a parameter from information
obtained from a sample.
• The procedures for estimating the population mean, estimating the population
proportion, and estimating a sample size will be explained.
7.2 Confidence Intervals for the Mean ($\sigma$ Known or $n \ge 30$)
Three Properties of a Good Estimator
• The estimator should be an unbiased estimator. That is, the expected value or the
mean of the estimates obtained from samples of a given size is equal to the
parameter being estimated.
• The estimator should be consistent. For a consistent estimator, as sample size
increases, the value of the estimator approaches the value of the parameter
estimated.
• The estimator should be a relatively efficient estimator; that is, of all the statistics that
can be used to estimate a parameter, the relatively efficient estimator has the
smallest variance.
Point and Interval Estimates
• A point estimate is a specific numerical value used to estimate a parameter. The best
point estimate of the population mean $\mu$ is the sample mean $\bar{x}$.
• An interval estimate of a parameter is an interval or a range of values used to
estimate the parameter. This estimate may or may not contain the value of the
parameter being estimated.
I. Confidence Intervals
• Although the sample mean is the best point estimate of the population mean, the
sample mean $\bar{x}$ will usually differ from the population mean $\mu$ because of
sampling error. For this reason, statisticians prefer an interval estimate. The
confidence level of an interval estimate of a population mean $\mu$ is the probability
that the interval estimate will contain $\mu$.
Confidence Level and Confidence Interval
• The confidence level of an interval estimate of a parameter is the probability that the
interval estimate will contain the parameter.
• A confidence interval is a specific interval estimate of a parameter determined by
using data obtained from a sample and by using the specific confidence level of the
estimate.
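The meaning of "confidence level" can be checked by simulation. The sketch below is only an illustration: the population values (mean 50, standard deviation 10), the sample size, and the number of trials are made-up assumptions, not data from this chapter. It draws many samples, builds a 95% interval from each, and counts how often the interval contains the true mean.

```python
import random
from statistics import NormalDist, mean

def coverage(mu=50.0, sigma=10.0, n=36, conf=0.95, trials=2000, seed=1):
    """Fraction of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu.

    mu, sigma, n, trials and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # critical value z_{alpha/2}
    margin = z * sigma / n ** 0.5                 # half-width of each interval
    hits = 0
    for _ in range(trials):
        x_bar = mean(rng.gauss(mu, sigma) for _ in range(n))
        if x_bar - margin < mu < x_bar + margin:
            hits += 1
    return hits / trials

print(coverage())  # a value close to 0.95 is expected
```

Any single interval either contains the parameter or it does not; the confidence level describes the long-run fraction of intervals that do.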
Formula
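• Formula for the Confidence Interval of the Mean for a Specific $\alpha$
$$\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$
For a 90% confidence interval, $z_{\alpha/2} \approx 1.645$;
for a 95% confidence interval, $z_{\alpha/2} \approx 1.96$;
and for a 99% confidence interval, $z_{\alpha/2} \approx 2.576$.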
Example 1:
Find the critical values for each.
(a) $z_{\alpha/2}$ for the 99% confidence interval
(b) $z_{\alpha/2}$ for the 95% confidence interval
(c) $z_{\alpha/2}$ for the 90% confidence interval
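One way to check critical values like these without a printed table is the inverse normal CDF in Python's standard library. This is a minimal sketch; the helper name `z_critical` is an illustrative assumption.

```python
from statistics import NormalDist

def z_critical(conf):
    """Return z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1 - conf
    # The area to the left of z_{alpha/2} is 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

for conf in (0.99, 0.95, 0.90):
    print(f"{conf:.0%}: z = {z_critical(conf):.3f}")
# 99%: z = 2.576
# 95%: z = 1.960
# 90%: z = 1.645
```

Printed tables often round these values to two decimal places, so hand calculations may differ slightly in the last digit.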
II. Maximum Error of the Estimate
The maximum error of estimate is the maximum difference between the point estimate of
a parameter and the actual value of the parameter.
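Definition:
When estimating $\mu$ by $\bar{x}$ from a large sample, the maximum error of the estimate,
with level of confidence $1 - \alpha$, is
$$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
When $\sigma$ is unknown, we can estimate it by $s$, as long as $n \ge 30$ (so that $\sigma \approx s$):
$$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \approx z_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$$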
95% Confidence Interval
For $\alpha = 0.05$, 95% of the sample means will fall within the error value on either side
of the population mean.
Example 1: Find the maximum error for $\mu$ based on $\bar{x} = 128.3$, n = 64, s = 32.4, and a
confidence level of 98%.
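The maximum-error calculation in this example can be sketched as follows. Because n = 64 is at least 30, s stands in for the unknown population standard deviation; the function name `max_error` is an illustrative assumption.

```python
from math import sqrt
from statistics import NormalDist

def max_error(s, n, conf):
    """Maximum error E = z_{alpha/2} * s / sqrt(n) for a large sample."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return z * s / sqrt(n)

E = max_error(s=32.4, n=64, conf=0.98)
print(round(E, 2))  # about 9.42
```

The sample mean 128.3 does not enter the calculation; it only locates the center of the interval.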
III. Rounding Rule for a Confidence Interval for a Mean
• When you are computing a confidence interval for a population mean by using raw
data, round off to one more decimal place than the number of decimal places in the
original data.
• When you are computing a confidence interval for a population mean by using a
sample mean and a standard deviation, round off to the same number of decimal
places as given for the mean.
Example 1:
(Ref: General Statistics by Chase/Bown, 4th Ed.)
A physician wanted to estimate the mean length of time $\mu$ that a patient
had to wait to see him after arriving at the office. A random sample of 50
patients showed a mean waiting time of 23.4 minutes and a standard
deviation of 7.1 minutes. Find a 95% confidence interval for $\mu$.
Example 2:
(Ref: General Statistics by Chase/Bown, 4th Ed.)
A union official wanted to estimate the mean hourly wage $\mu$ of its members. A
random sample of 100 members gave $\bar{x}$ = $18.30 and s = $3.25 per hour.
(a)
Find an 80% confidence interval for $\mu$.
(b)
Find a 95% confidence interval for $\mu$.
(c)
If you were to construct a 90% confidence interval for $\mu$ (do not construct
it), would the interval be longer or shorter than the 80% confidence
interval? Longer or shorter than the 95% confidence interval?
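Examples 1 and 2 above both start from a sample mean and a sample standard deviation with n at least 30, so the same large-sample interval applies to each. The sketch below is one way to compute it; the helper name `z_interval` is an illustrative assumption, and it uses the exact normal quantile rather than a rounded table value.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, s, n, conf):
    """Large-sample confidence interval x_bar +/- z_{alpha/2} * s / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    e = z * s / sqrt(n)  # maximum error of the estimate
    return x_bar - e, x_bar + e

# Example 1: waiting times, n = 50
lo, hi = z_interval(23.4, 7.1, 50, 0.95)
print(f"95%: {lo:.1f} to {hi:.1f}")

# Example 2: hourly wages, n = 100; the interval widens as confidence rises
for conf in (0.80, 0.90, 0.95):
    lo, hi = z_interval(18.30, 3.25, 100, conf)
    print(f"{conf:.0%}: {lo:.2f} to {hi:.2f}")
```

Since only the critical value changes between confidence levels, a higher confidence level always gives a longer interval for the same sample.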
Example 3:
(Ref: General Statistics by Chase/Bown, 4th Ed.)
A restaurant owner believed that customer spending was below normal
at tables manned by one of the waiters. The owner sampled 36 checks from
the waiter’s tables and got the following amounts (rounded to the nearest
dollar):
47 46 56 70 52 58 48 57 49 61 52 40 60 22 74 59 60 30
61 44 62 41 53 57 50 52 57 59 69 51 58 56 44 36 47 51
Find a 95% confidence interval for the true mean amount of money spent at
the waiter’s tables.
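When only raw data are given, as in the restaurant example, the sample mean and sample standard deviation have to be computed first. The following sketch does this with the standard library and then applies the large-sample formula (n = 36 is at least 30). Variable names are illustrative; results are printed to one decimal place, following the rounding rule for raw data.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

checks = [47, 46, 56, 70, 52, 58, 48, 57, 49, 61, 52, 40, 60, 22, 74, 59, 60, 30,
          61, 44, 62, 41, 53, 57, 50, 52, 57, 59, 69, 51, 58, 56, 44, 36, 47, 51]

n = len(checks)
x_bar = mean(checks)
s = stdev(checks)                # sample standard deviation (n - 1 divisor)
z = NormalDist().inv_cdf(0.975)  # z_{alpha/2} for 95% confidence
e = z * s / sqrt(n)              # maximum error of the estimate

print(f"n = {n}, mean = {x_bar:.1f}, s = {s:.1f}")
print(f"95% CI: {x_bar - e:.1f} to {x_bar + e:.1f}")
```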
IV. Determining the Sample Size for $\mu$
Maximum error of estimate for $\mu$:
$$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
Solve for n:
$$\sqrt{n} = \frac{z_{\alpha/2} \cdot \sigma}{E} \quad\Longrightarrow\quad n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2$$
Round the answer up to obtain a whole number.
Example 1:
To estimate $\mu$, what sample size is required so that the maximum error of
the estimate is only 8 square feet? Assume $\sigma$ is 42 square feet.
Example 2:
(Ref: General Statistics by Chase/Bown, 4th Ed.)
Consider a population with unknown mean $\mu$ and population standard
deviation $\sigma$ = 15.
(a) How large a sample size is needed to estimate $\mu$ to within five units
with 95% confidence?
(b) Suppose you wanted to estimate $\mu$ to within five units with 90%
confidence. Without calculating, would the sample size required be
larger or smaller than that found in part (a)?
(c) Suppose you wanted to estimate $\mu$ to within six units with 95%
confidence. Without calculating, would the sample size required
be larger or smaller than that found in part (a)?
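The sample-size formula for the mean, including the round-up step, can be sketched as below. The function name `min_sample_size` is an illustrative assumption. The calls use the numbers from the example with population standard deviation 15, so the "without calculating" answers to parts (b) and (c) can be checked afterwards.

```python
from math import ceil
from statistics import NormalDist

def min_sample_size(sigma, E, conf):
    """Smallest whole n with z_{alpha/2} * sigma / sqrt(n) <= E."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * sigma / E) ** 2)  # always round up, never to nearest

print(min_sample_size(sigma=15, E=5, conf=0.95))  # part (a): 35
print(min_sample_size(sigma=15, E=5, conf=0.90))  # lower confidence -> smaller n
print(min_sample_size(sigma=15, E=6, conf=0.95))  # larger allowed error -> smaller n
```

Rounding up rather than to the nearest integer guarantees that the maximum error does not exceed the stated E.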
7.3 Confidence Intervals for the Mean ($\sigma$ Unknown and $n < 30$)
• When the sample size is less than 30, the population standard deviation is
unknown, and the population is approximately normally distributed, the t
distribution must be used.
Characteristics of the t Distribution
The t distribution is similar to the standard normal distribution in the following ways:
• It is bell shaped.
• It is symmetrical about the mean.
• The mean, median, and mode are equal to 0 and are located at the center of the
distribution.
• The curve never touches the x axis.
The t distribution differs from the standard normal distribution in the following ways.
• The variance is greater than 1.
• The t distribution is actually a family of curves based on the concept of degrees
of freedom, which is related to sample size.
• As the sample size increases, the t distribution approaches the standard normal
distribution.
t Distribution
• The degrees of freedom are the number of values that are free to vary after a sample
statistic has been computed.
• The degrees of freedom for the confidence interval for the mean are found by
subtracting 1 from the sample size. That is, d.f. = n - 1.
Formula
Formula for the Confidence Interval of the Mean When $\sigma$ is Unknown and $n < 30$
$$\bar{X} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
Maximum Error for $\mu$: $E = t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$
Degrees of Freedom: d.f. = n - 1
Example 1:
Find $t_{\alpha/2}$ with the following information.
(a) Level of confidence is 98% with n = 19
(b) Level of confidence is 90% with n = 25
Example 2:
A sample of 25 two-year-old chickens shows that they lay an
average of 21 eggs per month. The standard deviation of the
sample was 2 eggs. Assume the population is approximately
normal. Construct a 99% confidence interval for the true mean.
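A small-sample interval such as the one in Example 2 can be sketched as follows. Python's standard library has no t distribution, so this sketch assumes the critical values are read from a printed t table; the dictionary `T_TABLE` and the function name are illustrative, and only the two table entries needed for the examples in this section are included.

```python
from math import sqrt

# Assumption: two-tailed critical values copied from a standard t table,
# keyed by (confidence level, degrees of freedom).
T_TABLE = {
    (0.99, 24): 2.797,
    (0.95, 19): 2.093,
}

def t_interval(x_bar, s, n, conf):
    """Small-sample interval x_bar +/- t_{alpha/2} * s / sqrt(n), d.f. = n - 1."""
    t = T_TABLE[(conf, n - 1)]
    e = t * s / sqrt(n)  # maximum error of the estimate
    return x_bar - e, x_bar + e

# Example 2: 25 chickens, mean 21 eggs, s = 2 eggs, 99% confidence
lo, hi = t_interval(21, 2, 25, 0.99)
print(f"{lo:.2f} to {hi:.2f}")
```

With a statistics package that provides a t quantile function, the table lookup could be replaced by a direct computation.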
Example 3:
A random sample of 20 parking meters in a large municipality
showed the following incomes for a day.
$2.60 $2.00 $2.10 $1.05 $2.40
$1.75 $2.45 $2.35 $1.00 $2.90
$2.40 $2.75 $1.30 $1.95 $1.80
$3.10 $2.80 $1.95 $2.35 $2.50
Assume the population is approximately normal. Find the 95%
confidence interval of the true mean.
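When to Use the z or t Distribution
• If $\sigma$ is known, use $z_{\alpha/2}$ values with $\sigma$ in the formula.
• If $\sigma$ is unknown and $n \ge 30$, use $z_{\alpha/2}$ values with s in place of $\sigma$.
• If $\sigma$ is unknown and $n < 30$, use $t_{\alpha/2}$ values with s and d.f. = n - 1,
provided the population is approximately normal.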
7.4 Inference Concerning a Population Proportion
I. Confidence Intervals for Proportions
• Symbols Used in Proportion Notation
p = symbol for the population proportion
p̂ (read p “hat”) = symbol for the sample proportion
For a sample proportion,
$$\hat{p} = \frac{X}{n} \quad\text{and}\quad \hat{q} = \frac{n - X}{n} \ \text{or}\ 1 - \hat{p}$$
where X = number of sample units that possess the characteristic of interest and n =
sample size.
Formula
• Formula for a specific confidence interval for a proportion
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
when $np$ and $nq$ are each greater than or equal to 5.
II. Determining the Sample Size for p
$$n = \hat{p}\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2$$
Round the answer up to obtain a whole number.
Since the sample has not yet been obtained, we do not know the values of $\hat{p}$ and $\hat{q}$.
However, it can be shown that regardless of the values of $\hat{p}$ and $\hat{q}$, the value of
$\hat{p} \cdot \hat{q}$ will never be more than ¼. Therefore, to be on the safe side, we should take
the sample size to be at least
$$n = \frac{1}{4}\left(\frac{z_{\alpha/2}}{E}\right)^2 = \left(\frac{z_{\alpha/2}}{E}\right)^2 (0.25)$$
Round the answer up to obtain a whole number.
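The bound of ¼ used above follows from completing the square. This short derivation is added for illustration and uses only the definition of q-hat as 1 minus p-hat:

```latex
\hat{p}\,\hat{q} = \hat{p}\,(1 - \hat{p})
                 = \tfrac{1}{4} - \left(\hat{p} - \tfrac{1}{2}\right)^{2}
                 \le \tfrac{1}{4},
\qquad \text{with equality only when } \hat{p} = \hat{q} = \tfrac{1}{2}.
```

So using 0.25 in place of the unknown product gives the largest sample size the formula could ever require.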
Example 1:
(Ref: General Statistics by Chase/Bown, 4th Ed.)
A city council commissioned a statistician to estimate the proportion
p of voters in favor of a proposal to build a new library. The statistician
obtained a random sample of 200 voters, with 112 indicating approval
of the proposal.
(a) What is a point estimate for p ?
(b) What is the maximum error of estimate for p ?
(c) Find a 90% confidence interval for p .
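The three parts of this example can be sketched in one function. Part (b) does not state a confidence level, so the sketch assumes the 90% level from part (c); that choice, and the function name `proportion_interval`, are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, conf):
    """Return (p_hat, E, (lower, upper)) for a population proportion."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # The normal approximation requires n*p_hat and n*q_hat to be at least 5.
    assert n * p_hat >= 5 and n * q_hat >= 5
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    e = z * sqrt(p_hat * q_hat / n)  # maximum error of the estimate
    return p_hat, e, (p_hat - e, p_hat + e)

p_hat, e, (lo, hi) = proportion_interval(112, 200, 0.90)
print(f"p_hat = {p_hat:.2f}, E = {e:.3f}, CI = {lo:.3f} to {hi:.3f}")
```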
Example 2:
A Roper poll of 2,000 American adults showed that 1,440 thought that
chemical dumps are among the most serious environmental problems.
Estimate with a 98% confidence interval the proportion of the population who
consider chemical dumps among the most serious environmental
problems.
Example 3: A recent study indicated that 29% of the 100 women over age 55
in the study were widows.
(a) How large a sample must one take to be 90% confident that the
estimate is within 0.05 of the true proportion of women over 55 who are
widows?
(b) If no estimate of the sample proportion is available, how large
should the sample be?
Example 4: How large a sample is necessary to estimate the true proportion of adults
who are overweight to within 2% with 95% confidence?
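The sample-size calculations in Examples 3 and 4 can be sketched with one function. When no prior estimate is available, p-hat defaults to 0.5, which is the conservative choice because it makes the product of p-hat and q-hat as large as possible. The function name is an illustrative assumption, and the exact normal quantile is used instead of a rounded table value.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(E, conf, p_hat=0.5):
    """Smallest whole n so the maximum error for a proportion is at most E."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(p_hat * (1 - p_hat) * (z / E) ** 2)

print(sample_size_proportion(0.05, 0.90, p_hat=0.29))  # Example 3(a): 223
print(sample_size_proportion(0.05, 0.90))              # Example 3(b): 271
print(sample_size_proportion(0.02, 0.95))              # Example 4: 2401
```

Hand calculations that use a table value rounded to two decimals, such as 1.65 for 90% confidence, give slightly larger answers for Example 3.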
Summary
• A good estimator must be unbiased, consistent, and relatively efficient.
• There are two types of estimates of a parameter: point estimates and interval
estimates.
• A point estimate is a single value. The problem with point estimates is that the
accuracy of the estimate cannot be determined, so the interval estimate is preferred.
• By calculating a 95% or 99% confidence interval about the sample value, statisticians
can be 95% or 99% confident that the interval contains the true parameter.
• When the confidence interval of the mean is calculated, the z or t values are used
depending on the sample size and whether the population standard deviation is known.
• The following information is needed to determine the minimum sample size
necessary to make an estimate of the mean:
• The degree of confidence must be stated.
• The population standard deviation must be known or be able to be estimated.
• The maximum error of the estimate must be stated.
Conclusions
Estimation is an important aspect of inferential statistics.