Review of Confidence Interval Concepts
One-Proportion and One-Mean Confidence Intervals

Say we wanted to estimate the population proportion (or percentage) of female undergraduate students at PSU-UP. I could instruct each of you to take a random sample of, say, 100 students and record the gender of each. You would then calculate the percentage of your sample that was female. Do you believe that each of these sample proportions would be the same? No, but each one by itself is a point estimate of the true proportion.

What if we wanted to estimate the true population mean age of PSU undergraduate students? Again, I could instruct you to take a random sample of some size, record the age of each person in your sample, and then calculate the sample mean. Even though each sample mean is a point estimate of the population mean, would you expect each sample mean to be the same?

A confidence interval is an interval of values that is likely to "capture" the unknown value of a population parameter of interest, such as the true population mean, μ, or the true difference, μd. Another application is estimating the difference between two independent samples; we will save that discussion for a future lesson.

The confidence level is the probability (the fraction of times) that the procedure used to determine an interval gives an interval that actually captures the true population value. For example, say we repeatedly drew samples of the same size from a population, constructed a 95% confidence interval from each sample, and repeated this process 1000 times. Then we would expect about 95%, or roughly 950, of these confidence intervals to contain the true population parameter. In reality, though, we typically construct only one such confidence interval, and so we say we are X% confident that this interval has captured the true parameter. That single interval might or might not contain the true value. As a result, confidence intervals are exactly that: statements of how confident you are.
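The repeated-sampling reading of the confidence level can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the true proportion of 0.5 is made up, the function names `proportion_ci` and `coverage` are hypothetical, and the interval used is the large-sample proportion interval (sample proportion plus or minus a z-multiplier times the standard error). It draws 1000 samples of size 100 and counts how often the 95% interval contains the assumed true proportion.

```python
import random
from statistics import NormalDist


def proportion_ci(successes, n, level=0.95):
    """Large-sample interval: p-hat +/- z* x sqrt(p-hat(1 - p-hat) / n)."""
    p_hat = successes / n
    # z* leaves (1 - level) / 2 in each tail of the standard normal curve.
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z_star * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - margin, p_hat + margin


def coverage(true_p=0.5, n=100, level=0.95, repeats=1000, seed=1):
    """Fraction of simulated intervals that contain true_p (an assumed value)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # One random sample of size n; count the "yes" responses.
        successes = sum(rng.random() < true_p for _ in range(n))
        low, high = proportion_ci(successes, n, level)
        if low <= true_p <= high:
            hits += 1
    return hits / repeats


print(coverage())  # fraction of the 1000 intervals that captured true_p
```

The printed fraction should land near 0.95, though not exactly on it: the count varies from run to run, and the formula itself is a large-sample approximation. Each individual interval either contains 0.5 or it does not; the 95% describes the procedure over many repetitions.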
A confidence interval should not be interpreted, for example, as saying that there is a 95% probability that the true value is in this interval. That statement is not true because the true value is either in the interval (probability of 1) or not in the interval (probability of 0).

In most situations considered in our text, the general format for determining a confidence interval is

Sample statistic ± Multiplier × Standard error

In other words, we form a confidence interval by adding an appropriate number of standard errors to the sample estimate and subtracting the same number from it. The common levels of confidence will be 90%, 95%, 98% and 99%.

This week we considered confidence intervals for 1-proportion and 1-mean. For the proportion the formula is:

$$\hat{p} \pm z^{*} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

and the multipliers are standard:

Confidence Level    90%     95%     98%     99%
Z-Multiplier        1.65    1.96    2.33    2.58

But what if our variable of interest is a quantitative variable (e.g. GPA, Age, Height) and we want to estimate the population mean? In such a situation proportion confidence intervals are not appropriate, since our interest is in a mean amount and not a proportion. For 1-mean the confidence interval will involve a new concept: Degrees of Freedom, or df. We will use this df in conjunction with Table A2 to find the multiplier. The formula for a 1-mean confidence interval is:

$$\bar{x} \pm t^{*} \frac{s}{\sqrt{n}}$$

We therefore apply similar techniques, but now we are interested in estimating the population mean, μ, by using the sample mean, and the multiplier is a t-value. Until now we assumed that our random variable came from a normal distribution with a known population standard deviation, σ. Typically, however, we do not know this parameter and must estimate it. This is done by using the standard deviation of the sample, written "s". Because we have to make this estimate, the standardized sample mean no longer follows the standard normal distribution, so z-multipliers no longer apply.
These t-values come from a t-distribution, which is similar to the standard normal distribution from which the z-values came. The similarities are that the distribution is symmetric and centered on 0. The difference is that when using a t-table we need to consider a new feature: degrees of freedom (df). The degrees of freedom are based on the sample size, n; for a 1-mean interval, df = n − 1.

Example of 1-proportion and 1-mean confidence intervals

Assume our class survey represents a random sample taken from the PSU undergraduate population. Find confidence intervals for the following:

1. Find a 95% confidence interval for the proportion who answered yes to: Do you think marijuana should be legalized?
2. Find a 90% confidence interval for the proportion who answered yes to: Do you believe in same-sex marriages?
3. Find a 95% confidence interval for mean GPA.
4. Find a 99% confidence interval for the mean amount of money students spent on books.

Solutions:

1. $$\hat{p} \pm z^{*} \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 0.615 \pm 1.96 \sqrt{\frac{0.615(1-0.615)}{65}}, \quad \text{which gives } 0.497 \le p \le 0.734$$

In Minitab we use Stat > Basic Statistics > 1-Proportion

Event = Yes
Variable    X    N    Sample p    95% CI
LegalMJ?    40   65   0.615385    (0.497114, 0.733656)

Interpretation: We are 95% confident that the proportion of PSU-UP undergraduate students who think marijuana should be legalized is between 49.7% and 73.4%.

2. $$0.707 \pm 1.65 \sqrt{\frac{0.707(1-0.707)}{65}}, \quad \text{which gives } 0.614 \le p \le 0.800$$

Event = Yes
Variable      X    N    Sample p    90% CI
SameSexMar    46   65   0.707692    (0.614900, 0.800485)

Interpretation: We are 90% confident that the proportion of PSU-UP undergraduate students who believe in same-sex marriage is between 61.5% and 80.0%.

3. $$\bar{x} \pm t^{*} \frac{s}{\sqrt{n}} = 3.38 \pm 2.00 \times \frac{0.473}{\sqrt{66}}, \quad \text{which gives } 3.26 \le \mu \le 3.49$$

In Minitab we use Stat > Basic Statistics > 1-Sample t

Variable    N    Mean      StDev     SE Mean    95% CI
GPA         66   3.3785    0.4733    0.0583     (3.2621, 3.4948)

Interpretation: We are 95% confident that the mean GPA of PSU-UP undergraduates is between 3.26 and 3.49.

4. $$\bar{x} \pm t^{*} \frac{s}{\sqrt{n}} = 350.2 \pm 2.66 \times \frac{153.5}{\sqrt{65}}, \quad \text{which gives } 299.6 \le \mu \le 400.8$$

In Minitab we use Stat > Basic Statistics > 1-Sample t

Variable    N    Mean     StDev    SE Mean    99% CI
TextSpd     65   350.2    153.5    19.0       (299.6, 400.8)

Interpretation: We are 99% confident that the mean amount of money PSU-UP undergraduates spent on books is between $299.60 and $400.80.
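The four worked solutions above can be re-derived with a few lines of Python as a cross-check on the hand calculations. This is a minimal sketch, not the Minitab procedure: the helper names `z_multiplier`, `proportion_ci`, and `mean_ci` are hypothetical, the z-multiplier is computed from the standard normal distribution rather than read from the rounded table, and the t-multipliers 2.00 and 2.66 are simply copied from the solutions, because the Python standard library has no t-distribution quantile function.

```python
from math import sqrt
from statistics import NormalDist


def z_multiplier(level):
    """z* for a two-sided interval, e.g. about 1.96 for level = 0.95."""
    return NormalDist().inv_cdf(1 - (1 - level) / 2)


def proportion_ci(successes, n, level):
    """p-hat +/- z* x sqrt(p-hat(1 - p-hat) / n)."""
    p_hat = successes / n
    margin = z_multiplier(level) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def mean_ci(mean, sd, n, t_star):
    """x-bar +/- t* x s / sqrt(n); t_star must be looked up in a t-table."""
    margin = t_star * sd / sqrt(n)
    return mean - margin, mean + margin


# Summary statistics are taken from the Minitab output in the solutions.
print(proportion_ci(40, 65, 0.95))         # 1. LegalMJ?
print(proportion_ci(46, 65, 0.90))         # 2. SameSexMar
print(mean_ci(3.3785, 0.4733, 66, 2.00))   # 3. GPA, t* = 2.00 from the table
print(mean_ci(350.2, 153.5, 65, 2.66))     # 4. TextSpd, t* = 2.66 from the table
```

The proportion intervals should agree with the Minitab output to the displayed precision. The mean intervals can differ from Minitab in the third or fourth significant digit, because the table multipliers are rounded while Minitab uses the t-multiplier for the exact degrees of freedom.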