Confidence Task answers A

1. The Empirical Rule
a. Suppose that the mean SATM score for seniors in Georgia was 550 with a standard deviation of
50 points. Consider a simple random sample of 100 Georgia seniors who take the SAT. Describe
the distribution of the sample mean scores.
Solution: Provided that more than 1000 seniors take the SAT (at least 10 times the sample size), the
distribution of the sample mean should be approximately normal.
b. What are the mean and standard deviation of this sampling distribution?
Solution: The mean is 550 and the standard deviation is 50/sqrt(100) = 5.
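The arithmetic in part b can be checked with a few lines of Python. This is only a sketch; the function name `standard_error` is an illustrative choice, not something the task defines.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean for simple random samples of size n."""
    return sigma / math.sqrt(n)

# Values from the problem: population SD of 50 points, samples of 100 seniors.
print(standard_error(50, 100))  # 5.0
```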
c. Use the Empirical Rule to determine between what two scores 68% of the data falls, 95% of the
data falls, and 99.7% of the data falls.
Solutions:
- 68% of the data would be within one standard deviation of the mean, or (550 - 5, 550 + 5) = (545, 555).
- 95% of the data would be within two standard deviations of the mean, or (540, 560).
- 99.7% of the data would be within three standard deviations of the mean, or (535, 565).
For the 95% interval, this means that in 95% of all samples of 100 students from this population, the mean
score for the sample will fall within ___ standard deviations of the true population mean or ____ points
from the mean.
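The three Empirical Rule intervals from part c can be reproduced with a short Python sketch. The helper name `empirical_rule_intervals` is hypothetical; the mean of 550 and standard deviation of 5 come from part b.

```python
def empirical_rule_intervals(mean, sd):
    """Return the intervals within 1, 2, and 3 standard deviations of the mean."""
    labels = {1: "68%", 2: "95%", 3: "99.7%"}
    return {label: (mean - k * sd, mean + k * sd) for k, label in labels.items()}

# Sampling distribution from part b: mean 550, standard deviation 5.
for label, interval in empirical_rule_intervals(550, 5).items():
    print(label, interval)
```

With these inputs the intervals are (545, 555), (540, 560), and (535, 565), matching the solutions above.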
2. Confidence Intervals
In the above problem, we took the mean and added/subtracted a certain number of standard deviations.
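That is, we calculated $\mu_{\bar{x}} \pm 2\sigma_{\bar{x}} = \mu_{\bar{x}} \pm 2\frac{\sigma}{\sqrt{n}}$ for the 95% interval and
$\mu_{\bar{x}} \pm 3\sigma_{\bar{x}} = \mu_{\bar{x}} \pm 3\frac{\sigma}{\sqrt{n}}$ for the 99.7% interval.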
The interval of numbers found, i.e. (540, 560), is called a 95% confidence interval for the population
mean.
Above, we knew the population mean, but in practice, we often do not. So we take samples and create
confidence intervals as a method of estimating the true value of the parameter. When we find a 95%
confidence interval, we believe with 95% confidence that the true parameter falls within our interval.
However, we must accept that 5% of all samples will give intervals which do not include the parameter.
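The claim that about 5% of samples give intervals that miss the parameter can be illustrated by simulation. The sketch below assumes the individual scores are normally distributed, which the task does not state, and the number of trials and the random seed are arbitrary choices. Each interval is the sample mean plus or minus two standard deviations of the sample mean, as in the Empirical Rule interval above.

```python
import random
import statistics

POP_MEAN = 550   # population mean from the task
POP_SD = 50      # population standard deviation from the task
N = 100          # sample size from the task

def coverage(trials=2000, seed=0):
    """Fraction of simulated samples whose interval contains the population mean."""
    rng = random.Random(seed)
    margin = 2 * POP_SD / N ** 0.5  # two standard deviations of the sample mean = 10
    hits = 0
    for _ in range(trials):
        # Assumption: individual scores are drawn from a normal population.
        sample = [rng.gauss(POP_MEAN, POP_SD) for _ in range(N)]
        xbar = statistics.fmean(sample)
        if xbar - margin <= POP_MEAN <= xbar + margin:
            hits += 1
    return hits / trials

print(coverage())
```

The printed proportion should land near 0.95, with some run-to-run variation if the seed is changed.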
Every confidence interval takes the same shape: estimate ± margin of error. In the interval (540, 560), the
margin of error is 10.
The margin of error has two main components: the number of standard deviations from the mean (i.e. the
z-score) and the standard deviation. (Margin of error = $z\sigma$.)
Because we do not usually know the details of a population parameter (e.g. mean and standard deviation),
we must use estimates of these values. So our margin of error becomes $m = z\,\sigma_{\text{estimate}}$. Therefore, the
confidence interval becomes
estimate ± margin of error = estimate ± $z\,\sigma_{\text{estimate}}$.
The z-score used in the confidence interval depends on how confident one wants to be. There are a few
common levels of confidence used in practice: 90%, 95%, and 99%.
The Empirical Rule provides estimates for the amount of data within specified numbers of standard
deviations, and therefore, can help us find approximate intervals for being 68%, 95%, and 99.7%
confident that we have included the true population parameter. Let’s find closer estimates for the number
of standard deviations from the mean within which certain percentages of data lie.
a. Within how many standard deviations of the mean would one locate the middle 95% of the data?
(Hint: Draw a picture and use the normal table or invNorm on your calculator.)
Solution: This is a question about z-scores. Using either feature, consider the following: If 95% is in
the middle, then 2.5% is on each end. So the area under each end is 0.025. Using the standard normal
table, an area of .025 is located exactly at the z-score of -1.96. Because the curve is symmetric, the
upper z-score is 1.96. Using the invNorm feature, first determine the lower tail area as above (0.025).
Then input invNorm(.025). The output is approximately -1.96. One could also input invNorm(.975) to
get the upper z-score; however, this is not necessary because of the symmetry of the curve.
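For readers without a calculator, the same lookup can be sketched in Python, where `statistics.NormalDist().inv_cdf` plays the role of invNorm. The helper name `middle_z` is an illustrative choice.

```python
from statistics import NormalDist

def middle_z(confidence):
    """Return the z-scores that bound the middle `confidence` share of a standard normal."""
    tail = (1 - confidence) / 2            # area in each tail, e.g. 0.025 for 95%
    standard_normal = NormalDist()         # mean 0, standard deviation 1
    lower = standard_normal.inv_cdf(tail)
    upper = standard_normal.inv_cdf(1 - tail)
    return lower, upper

lower, upper = middle_z(0.95)
print(round(lower, 2), round(upper, 2))  # -1.96 1.96
```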
b. Within how many standard deviations of the mean would one locate the middle 90% of the data?
(Hint: Draw a picture and use the normal table or invNorm on your calculator.)
Solution: If 90% is in the middle, then 5% is on each end. So the area under each end is 0.05. Using
the standard normal table, an area of .05 is halfway between the area of .0495 (z-score of -1.65) and
an area of .0505 (z-score of -1.64). We can reason that the appropriate lower z-score is -1.645. Because
the curve is symmetric, the upper z-score is 1.645. Using the invNorm feature, first determine the lower
tail area as above (0.05). Then input invNorm(.05). The output is approximately -1.645. One could also
input invNorm(.95) to get the upper z-score; however, this is not necessary because of the symmetry of
the curve.
c. For the common confidence levels, then, we have the following z-scores, called z*. Complete the
following table using your answers above.
Confidence level    90%      95%      99%
z*                  ____     ____     2.576
Solutions: z* = 1.645 for 90%; z* = 1.96 for 95%.
For any confidence intervals you are expected to compute by hand in Math 4, you will use these z*
values. Thus, our final form of the confidence interval is estimate ± $z^*\sigma_{\text{estimate}}$.
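As a rough illustration of this final form, the sketch below builds intervals from the three z* values found in this task. The sample mean of 548 is a hypothetical value chosen only for the example, and the function name `confidence_interval` is likewise an assumption; the standard deviation of 5 comes from problem 1.

```python
# z* values for the common confidence levels, as found in this task.
Z_STAR = {0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def confidence_interval(estimate, sd_of_estimate, level=0.95):
    """Return (low, high) for estimate +/- z* times the standard deviation of the estimate."""
    margin = Z_STAR[level] * sd_of_estimate
    return estimate - margin, estimate + margin

sample_mean = 548              # hypothetical sample mean, not from the task
sd_of_mean = 50 / 100 ** 0.5   # 5.0, from problem 1

for level in Z_STAR:
    low, high = confidence_interval(sample_mean, sd_of_mean, level)
    print(f"{level:.0%}: ({low:.2f}, {high:.2f})")
```

Higher confidence levels use a larger z*, so the interval widens as the confidence level increases.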
You will continue investigating confidence intervals and, specifically, margin of error, through the next
two activities.