Review of Confidence Interval Concepts

One-Proportion and One-Mean Confidence Intervals
• Say we wanted to estimate the population proportion, or percentage, of female
undergraduate students at PSU-UP. I could instruct each of you to take a random sample
of, say, 100 students and record the gender of each. Then you would calculate the
percentage of your sample that was female. Do you believe that each of these sample
proportions would be the same? No, but each by itself is a point estimate of the true
proportion. What if we wanted to estimate the true population mean age of PSU
undergraduate students? Again I could instruct you to take a random sample of some
size, record the age of each person in your sample, and then calculate the sample mean.
Even though each sample mean is a point estimate of the population mean, would
you expect every sample mean to be the same?
• A confidence interval is an interval of values that is likely to "capture" the unknown
value of a population parameter of interest, such as the true population proportion, p,
the true population mean, μ, or the true mean difference, μd. A related task is to
estimate the difference between the parameters of two independent populations.
However, we will save that discussion for a future lesson.
• The confidence level is the probability (fraction of times) that the procedure used to
determine an interval gives an interval that actually captures the true population value.
For example, say we repeatedly drew samples of the same size from a population, 1000
times in all, and constructed a 95% confidence interval from each sample. Then we would
expect about 95%, or 950, of these confidence intervals to contain the true population
parameter. In practice, though, we typically construct only one such interval, and we
say we are X% confident, where X% is the confidence level, that this interval has
captured the true parameter. That particular interval might or might not contain the
true value. Confidence is therefore a statement about how often the procedure succeeds,
not about any single interval. In particular, a 95% confidence interval should not be
interpreted to say that there is a 95% probability that the true value is in this
interval: the true value either is in the interval (probability 1) or is not in the
interval (probability 0).
• In most situations considered in our text, the general format for determining
a confidence interval is
Sample statistic ± Multiplier × Standard error

In other words, we form a confidence interval by adding and subtracting an appropriate
number of standard errors to (and from) the sample estimate. The common levels of
confidence will be 90%, 95%, 98% and 99%.
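The repeated-sampling meaning of the confidence level can be checked with a small simulation. The sketch below is only an illustration under assumed values: the true proportion (0.5), the sample size (100), the seed, and the function names `proportion_ci` and `coverage` are all hypothetical choices, not part of the course materials. It builds 1000 intervals of the form sample statistic ± multiplier × standard error and counts how many capture the true proportion.

```python
import math
import random


def proportion_ci(p_hat, n, z_star):
    """Sample statistic +/- multiplier * standard error for one proportion."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z_star * se, p_hat + z_star * se


def coverage(true_p, n, z_star, repetitions, seed):
    """Fraction of simulated intervals that capture true_p."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    hits = 0
    for _ in range(repetitions):
        # Draw one random sample of size n and count the "yes" answers.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = proportion_ci(successes / n, n, z_star)
        if low <= true_p <= high:
            hits += 1
    return hits / repetitions


# Assumed values for illustration only.
rate = coverage(true_p=0.5, n=100, z_star=1.96, repetitions=1000, seed=1)
print(f"{rate:.1%} of 1000 intervals captured the true proportion")
```

The printed fraction should land near 95%, though not exactly on it, because the simulation itself is random and the interval formula is an approximation. Any single interval in the loop either captures 0.5 or misses it; the 95% describes the procedure.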
This week we considered confidence intervals for 1-proportion and 1-mean. For the proportion
the formula is:
\[ \hat{p} \pm Z^{*} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \]
and the multipliers Z* are standard:
Confidence Level    Z-Multiplier
90%                 1.65
95%                 1.96
98%                 2.33
99%                 2.58
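For anyone curious where these Z-multipliers come from, here is a minimal sketch using Python's standard library. The helper name `z_multiplier` is made up for this illustration. The idea is that a two-sided interval at a given confidence level leaves half of the remaining probability in each tail of the standard normal distribution.

```python
from statistics import NormalDist


def z_multiplier(confidence_level):
    """Two-sided multiplier: the z-value with (1 - level) / 2 in the upper tail."""
    upper_tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - upper_tail)


for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}: z* = {z_multiplier(level):.3f}")
```

The values printed to three decimals agree with the table above to within rounding.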
But what if our variable of interest is a quantitative variable (e.g. GPA, Age, Height) and we want
to estimate the population mean? In such a situation proportion confidence intervals are not
appropriate since our interest is in a mean amount and not a proportion.
For 1-mean the confidence interval will involve a new concept: Degrees of Freedom, or df. We
will use this df in conjunction with Table A2 to find the multiplier. The formula for a 1-mean
confidence interval is:
\[ \bar{x} \pm t^{*} \frac{s}{\sqrt{n}} \]
Therefore we apply similar techniques, but now we are interested in estimating the population
mean, μ, using the sample mean, and the multiplier is a t-value. Until now we assumed that
our random variable came from a normal distribution with a known population standard
deviation, σ. Typically, however, we do not know this parameter and must estimate it with the
sample standard deviation, s. Because s varies from sample to sample, substituting it for σ
adds uncertainty, and the standardized sample mean no longer follows the standard normal
distribution. Instead, the multipliers come from a t-distribution, which is similar to the
standard normal distribution from which the z-values came: both are symmetrical and
centered on 0. The difference is that when using a t-table we need to consider a new feature,
degrees of freedom (df), which is based on the sample size, n. For a 1-mean confidence
interval, df = n − 1.
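To make the steps concrete, here is a small sketch of the 1-mean calculation starting from raw data. Everything specific in it is an assumption for illustration: the ten GPA values are made up, the function name `one_mean_ci` is hypothetical, and the multiplier 2.262 is the 95% value for df = 9 as read from a standard t-table.

```python
import math
import statistics


def one_mean_ci(values, t_star):
    """x-bar +/- t* * s / sqrt(n), with t* looked up for df = n - 1."""
    n = len(values)
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    margin = t_star * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


# Hypothetical GPAs, made up for illustration; n = 10, so df = 9.
gpas = [3.1, 3.4, 2.9, 3.8, 3.5, 3.0, 3.6, 3.3, 3.7, 3.2]
T_STAR_95_DF9 = 2.262  # 95% multiplier for df = 9 from a standard t-table
low, high = one_mean_ci(gpas, T_STAR_95_DF9)
print(f"df = {len(gpas) - 1}, 95% CI: ({low:.2f}, {high:.2f})")
```

The multiplier is passed in rather than computed so that the sketch mirrors the table lookup described above.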
Example of 1-proportion and 1-mean confidence intervals
Assume our class survey represents a random sample taken from the PSU undergraduate
population. Find the following confidence intervals:
1. A 95% confidence interval for the proportion answering yes to "Do you think marijuana should be legalized?"
2. A 90% confidence interval for the proportion answering yes to "Do you believe in same-sex marriages?"
3. A 95% confidence interval for mean GPA.
4. A 99% confidence interval for the mean amount of money students spent on books.
Solutions:
1. \[ \hat{p} \pm Z^{*} \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 0.615 \pm 1.96 \sqrt{\frac{0.615(1 - 0.615)}{65}} \;\Rightarrow\; 0.497 \le p \le 0.734 \]
In Minitab we use Stat > Basic Statistics > 1-Proportion
Event = Yes
Variable   X   N   Sample p  95% CI
LegalMJ?   40  65  0.615385  (0.497114, 0.733656)
Interpretation: We are 95% confident that the proportion of PSU-UP undergraduate students
who think marijuana should be legalized is between 49.7% and 73.4%.
2. \[ 0.707 \pm 1.65 \sqrt{\frac{0.707(1 - 0.707)}{65}} \;\Rightarrow\; 0.614 \le p \le 0.800 \]
Event = Yes
Variable    X   N   Sample p  90% CI
SameSexMar  46  65  0.707692  (0.614900, 0.800485)
Interpretation: We are 90% confident that the proportion of PSU-UP undergraduate students
who believe in same-sex marriage is between 61.5% and 80.0%.
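The two proportion intervals can be double-checked with a few lines of Python. This is a sketch only, and `one_proportion_ci` is a name invented here, not a Minitab or textbook function. It uses the counts from the Minitab output (40 of 65 and 46 of 65) and computes the multiplier from the standard normal distribution rather than rounding it to 1.96 or 1.65.

```python
import math
from statistics import NormalDist


def one_proportion_ci(successes, n, confidence_level):
    """p-hat +/- z* * sqrt(p-hat * (1 - p-hat) / n)."""
    p_hat = successes / n
    # Unrounded two-sided multiplier for the requested confidence level.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


legal_mj = one_proportion_ci(40, 65, 0.95)
same_sex = one_proportion_ci(46, 65, 0.90)
print("LegalMJ?   95%% CI: (%.6f, %.6f)" % legal_mj)
print("SameSexMar 90%% CI: (%.6f, %.6f)" % same_sex)
```

Because the multiplier is not rounded, the limits should track the Minitab output closely. The hand calculation for question 2 uses 1.65, so it differs slightly in the third decimal place.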
3. \[ \bar{x} \pm t^{*} \frac{s}{\sqrt{n}} = 3.38 \pm 2.00 \times \frac{0.473}{\sqrt{66}} \;\Rightarrow\; 3.26 \le \mu \le 3.49 \]
In Minitab we use Stat > Basic Statistics > 1-Sample t
Variable  N   Mean    StDev   SE Mean  95% CI
GPA       66  3.3785  0.4733  0.0583   (3.2621, 3.4948)
Interpretation: We are 95% confident that the mean GPA of PSU-UP undergraduates is between
3.26 and 3.49.
4. \[ \bar{x} \pm t^{*} \frac{s}{\sqrt{n}} = 350.2 \pm 2.66 \times \frac{153.5}{\sqrt{65}} \;\Rightarrow\; 299.6 \le \mu \le 400.8 \]
In Minitab we use Stat > Basic Statistics > 1-Sample t
Variable  N   Mean   StDev  SE Mean  99% CI
TextSpd   65  350.2  153.5  19.0     (299.6, 400.8)
Interpretation: We are 99% confident that the mean amount of money PSU-UP undergraduates
spent on books is between $299.60 and $400.80.
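As a check on the two mean intervals, the same arithmetic can be run from the summary statistics in the Minitab output. In this sketch the function name `one_mean_ci_from_summary` is hypothetical, and the t-multipliers (2.00 and 2.66) are simply the table values used in the solutions above, so small differences from Minitab in the last digit are expected.

```python
import math


def one_mean_ci_from_summary(x_bar, s, n, t_star):
    """x-bar +/- t* * s / sqrt(n), using summary statistics only."""
    standard_error = s / math.sqrt(n)
    margin = t_star * standard_error
    return x_bar - margin, x_bar + margin


# Summary statistics copied from the Minitab output; t* values from the solutions.
gpa = one_mean_ci_from_summary(x_bar=3.3785, s=0.4733, n=66, t_star=2.00)
books = one_mean_ci_from_summary(x_bar=350.2, s=153.5, n=65, t_star=2.66)
print("GPA     95%% CI: (%.2f, %.2f)" % gpa)
print("TextSpd 99%% CI: (%.1f, %.1f)" % books)
```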