Chapter Seven: Confidence Intervals and Sample Size
A point estimate is:
The best point estimate of the population mean µ is the sample mean X̄.
Three Properties of a Good Estimator
1. Unbiased
2. Consistent
3. Relatively efficient
Why do statisticians prefer interval estimates to point estimates?
The confidence level of an interval estimate is:
A confidence interval is:
Example: Consider the pennies.mtw data set. Generate a random sample of 50 pennies. What is your point
estimate of the mean µ? What is the sampling error?
Let’s construct a confidence interval for the mean. We have a large sample size, so the distribution of the
sample means is:
By the Empirical Rule, approximately ____ of the sample means will lie within ____ standard deviations of
the true population mean. What is the 95% confidence interval for your sample? (In Minitab, go to Stat > Basic
Statistics > 1-Sample Z. The Options button allows you to specify the desired level of confidence.)
Derivation of Confidence Interval Formula:
Formula for Confidence Interval of the Mean for a Specific n:
$$\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
where zα/2 is the z value with an area of α/2 to its right. For a 90% confidence interval zα/2 = 1.65; for a 95%
confidence interval zα/2 = 1.96; for a 99% confidence interval zα/2 = 2.58
The term zα/2 · σ/√n is called the ____
Rounding Rule: When using raw data, round off one more decimal place than that found in the data. When
using a sample mean and standard deviation, use the same number of decimal places as given in the mean.
Example: Find zα/2 for the 98% confidence interval.
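The critical values quoted above can be checked numerically, and the same calculation applies to the 98% example. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is made up for illustration and is not part of the notes.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2}: the z value with an area of alpha/2 to its right."""
    alpha = 1 - confidence
    # inv_cdf expects the area to the LEFT of z, so pass 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

The notes quote these values rounded to two decimal places, as a printed z table would.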
Warning!! Here we are assuming:
Example: A large airline wants to estimate its average number of unoccupied seats per flight over the past
year. The records of 225 flights are randomly selected and the number of unoccupied seats is noted for each of
the sampled flights. The sample mean is 11.6 seats and the sample standard deviation is 4.1 seats.
(a) What is the best point estimate for µ, the average number of unoccupied seats per flight during the past
year?
(b) Estimate µ using a 90% confidence interval.
(c) Interpret your result in (b). We express our confidence as “We can be 90% confident that ____.”
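Part (b) of the airline example can be worked through in a few lines. This is a sketch under the notes' stated assumption that the sample standard deviation may stand in for σ when n ≥ 30; the function name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(mean, sd, n, confidence):
    """Confidence interval for the mean: mean -/+ z_{alpha/2} * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sd / sqrt(n)  # the term z_{alpha/2} * sigma / sqrt(n)
    return mean - margin, mean + margin

# Airline example: n = 225 flights, sample mean 11.6 seats, sample sd 4.1 seats.
# s is used in place of sigma, which the notes allow because n >= 30.
low, high = z_interval(11.6, 4.1, 225, 0.90)
print(f"90% CI: {low:.2f} < mu < {high:.2f}")
```

Using the table value 1.65 instead of the unrounded critical value changes the endpoints only in the third decimal place.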
Example: A random sample of 100 observations from a normally distributed population possesses a mean equal
to 83.2 and a standard deviation equal to 6.4. Find a 99% confidence interval for µ.
This procedure is used when:
Sample Size
Sometimes we may want to determine the sample size necessary to make an accurate estimate. To do so, we
use the following formula:
Example: A university president would like to estimate the average age of students at the university. How
large a sample is necessary if she wishes to be 95% confident that the estimate should be accurate to within 1
year? A previous study determined that the standard deviation of the ages is known to be 3.5 years.
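The notes leave the sample-size formula blank. The sketch below assumes the standard textbook form n = (zα/2 · σ / E)², where E is the maximum error allowed, and applies it to the university example. The function name `sample_size_for_mean` is illustrative only.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, max_error, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= max_error (assumed formula)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Always round UP: rounding down would give a margin larger than max_error.
    return ceil((z * sigma / max_error) ** 2)

# University example: sigma = 3.5 years, accurate to within 1 year, 95% confidence.
n = sample_size_for_mean(sigma=3.5, max_error=1, confidence=0.95)
print(n)
```

The raw value is about 47.06, so the required sample size is 48 students.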
Note: When finding sample size, the size of the population is irrelevant when the population is large or infinite
or when sampling is done with replacement. If σ is unknown, one can estimate it using s from a previous study.
We have been working under the assumption that σ is known and the variable is normally distributed or that
σ is unknown and n ≥ 30. What happens if n < 30 and σ is not known?
When σ is unknown and the sample size is less than 30, we use the t-distribution.
Characteristics of the t-Distribution
The t-distribution is similar to the standard normal distribution in that:
1. It is bell-shaped.
2. It is symmetric about the mean.
3. The mean, median, and mode are all equal to 0 and are located at the center of the distribution.
4. The curve never touches the x-axis.
The t-distribution differs from the standard normal distribution in the following ways:
1. The variance is greater than 1.
2. The t-distribution is a family of curves based on the degrees of freedom, which is related to the sample
size.
3. As the sample size increases, the t-distribution approaches the standard normal distribution.
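Differences 1 and 3 can be made concrete with the variance of the t-distribution, which for more than 2 degrees of freedom is df / (df − 2). This closed form is a standard result that the notes do not state; the short sketch below (with the hypothetical helper `t_variance`) shows the variance staying above 1 and shrinking toward 1, the variance of the standard normal, as the degrees of freedom grow.

```python
def t_variance(df):
    """Variance of Student's t-distribution; only defined here for df > 2."""
    if df <= 2:
        raise ValueError("this closed form requires df > 2")
    return df / (df - 2)

for df in (5, 10, 30, 100, 1000):
    print(f"df = {df:4d}: variance = {t_variance(df):.4f}")
```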
What are the degrees of freedom?
Formula for Confidence Interval of the Mean when σ is unknown and n < 30:
$$\bar{X} - t_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{\alpha/2}\frac{s}{\sqrt{n}}$$
where tα/2 is the t value with an area of α/2 to its right. The degrees of freedom are n − 1.
Example: Find the tα/2 value for a 95% confidence interval when the sample size is 20.
Example: A manufacturer of printers wishes to estimate the mean number of characters printed before a
printhead fails. The printer manufacturer tests 15 printheads and finds the mean number of characters is 1.24
and standard deviation 0.19.
(a) Form a 99% confidence interval for the mean number of characters printed before the printhead fails.
Interpret the result.
(b) What assumption is required for the interval to be valid?
What happens if the population distribution departs greatly from normality?
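The two t examples above can be checked without a printed table. The sketch below is one possible approach, not the method the notes prescribe: it integrates the t density numerically with Simpson's rule and finds the critical value by bisection, using only the standard library. The names `t_cdf`, `t_critical`, and `t_interval` are hypothetical. In practice a t table or a statistics library (for example `scipy.stats.t.ppf`) would normally be used instead.

```python
from math import gamma, pi, sqrt

def t_cdf(x, df, steps=2000):
    """P(T <= x) for Student's t: Simpson's rule on [0, |x|], then symmetry."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_critical(confidence, df):
    """t_{alpha/2}: the t value with an area of alpha/2 to its right (bisection)."""
    target = 1 - (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(mean, s, n, confidence):
    """Confidence interval for the mean with sigma unknown: df = n - 1."""
    margin = t_critical(confidence, n - 1) * s / sqrt(n)
    return mean - margin, mean + margin

# Sample size 20 gives df = 19 for the 95% critical value.
print(f"t for 95%, n = 20: {t_critical(0.95, 19):.3f}")

# Printhead example: n = 15, mean 1.24, sd 0.19, 99% confidence (df = 14).
low, high = t_interval(1.24, 0.19, 15, 0.99)
print(f"99% CI: {low:.3f} < mu < {high:.3f}")
```

The interval is only valid under the assumption asked about in part (b), so the code checks the arithmetic, not the assumption.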
Section 7.3: Confidence Intervals and Sample Size for Proportions
Notation: Let p be the population proportion and p̂ be the sample proportion. Then
$$\hat{p} = \frac{X}{n} \quad \text{and} \quad \hat{q} = \frac{n - X}{n} = 1 - \hat{p}$$
where X is the number of sample units possessing the characteristic of interest and n is the sample size.
Example: In a recent survey of 150 households, 54 had central air. Find p̂ and q̂, where p̂ is the proportion that
has central air.
For proportions, the confidence interval is given by:
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} < p < \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}}$$
Example: A survey conducted by Gallup of 1404 respondents found 323 students paid for their education using
student loans. Find the 90% confidence interval of the true proportion of students who paid for their education
using student loans.
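The student-loan example can be computed directly from the proportion formula. This is a minimal sketch using the standard-library normal distribution; the function name `proportion_interval` is made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence):
    """Confidence interval for a proportion from x successes in n trials."""
    p_hat = x / n
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin

# Student-loan example: 323 of 1404 respondents, 90% confidence.
low, high = proportion_interval(323, 1404, 0.90)
print(f"90% CI: {low:.3f} < p < {high:.3f}")
```

The same ratio x / n gives the sample proportion 54 / 150 = 0.36 for the central-air example, with the complement 0.64.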
To determine the sample size for proportions, use the formula
$$n = \hat{p}\hat{q}\left(\frac{z_{\alpha/2}}{E}\right)^2$$
where E is the level of accuracy desired.
Example: A researcher wants to estimate, with 95% confidence, the proportion of people who own a home
computer. The previous study indicated that 45% of those surveyed owned a computer at home. The researcher
wants to be accurate within 2% of the true proportion. Find the minimum sample size required for the study.
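The home-computer example follows from the sample-size formula for proportions given above. The sketch below takes the 45% figure from the previous study as the planning value of the sample proportion; the function name `sample_size_for_proportion` is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_hat, max_error, confidence):
    """n = p_hat * q_hat * (z_{alpha/2} / E)^2, rounded up to a whole unit."""
    q_hat = 1 - p_hat
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round UP so the achieved margin of error does not exceed max_error.
    return ceil(p_hat * q_hat * (z / max_error) ** 2)

# Home-computer example: prior estimate 45%, within 2 percentage points, 95% confidence.
n = sample_size_for_proportion(p_hat=0.45, max_error=0.02, confidence=0.95)
print(n)
```

The raw value is just under 2377, so the minimum sample size is 2377 people.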