Kerns confidence intervals / ch14

Ch14 (confidence intervals) exercises 14.7, 14.8, 14.9, 14.22, 14.23, 14.24

14.7 IQ test scores. Here are the IQ test scores of 31 seventh-grade girls in a Midwest school district:

(a) These 31 girls are an SRS of all seventh-grade girls in the school district. Suppose that the standard deviation of IQ scores in this population is known to be σ = 15. We expect the distribution of IQ scores to be close to Normal. Make a stemplot of the distribution of these 31 scores (split the stems) to verify that there are no major departures from Normality. You have now checked the “simple conditions” to the extent possible.

(b) Estimate the mean IQ score for all seventh-grade girls in the school district, using a 99% confidence interval. Follow the four-step process as illustrated in Example 14.3.

Answer

(a) A stemplot is provided. The two low scores (72 and 74) are both possible outliers, but there are no other apparent deviations from Normality.

(b) The problem states that these girls are an SRS of the population, which is very large, so conditions for inference are met. In part (a), we saw that the scores are consistent with having come from a Normal population. With x̄ = 105.84, σ = 15, and n = 31, our 99% confidence interval for μ is 105.84 ± 2.576(15/√31) = 98.90 to 112.78. We are 99% confident that the mean IQ of seventh-grade girls in this district is between 98.90 and 112.78 points.

14.8 Confidence level and margin of error. Example 14.1 described NHANES survey data on the body mass index (BMI) of 654 young women. The mean BMI in the sample was x̄ = 26.8. We treated these data as an SRS from a Normally distributed population with standard deviation σ = 7.5.

(a) Give three confidence intervals for the mean BMI μ in this population, using 90%, 95%, and 99% confidence.

(b) What are the margins of error for 90%, 95%, and 99% confidence?
How does increasing the confidence level change the margin of error of a confidence interval when the sample size and population standard deviation remain the same?

Answer

(a) The three confidence intervals are given in the table below. In all three cases, x̄ = 26.8 and σ/√n = 7.5/√654 = 0.2933, so the confidence interval x̄ ± z* σ/√n is computed as 26.8 ± z*(0.2933), where z* changes with the confidence level.

(b) The margins of error, given in the “m.e.” column of the table, increase as the confidence level increases.

14.9 Sample size and margin of error. Example 14.1 described NHANES survey data on the body mass index (BMI) of 654 young women. The mean BMI in the sample was x̄ = 26.8. We treated these data as an SRS from a Normally distributed population with standard deviation σ = 7.5.

(a) Suppose that we had an SRS of just 100 young women. What would be the margin of error for 95% confidence?

(b) Find the margins of error for 95% confidence based on SRSs of 400 young women and 1600 young women.

(c) Compare the three margins of error. How does increasing the sample size change the margin of error of a confidence interval when the confidence level and population standard deviation remain the same?

Answer

With z* = 1.96 and σ = 7.5, the margin of error is z* σ/√n = 1.96(7.5)/√n = 14.7/√n.

(a) and (b) The margins of error are given in the table.

(c) Margin of error decreases as n increases. (Specifically, every time the sample size n is quadrupled, the margin of error is halved.)

14.22 Explaining confidence. A student reads that a 95% confidence interval for the mean ideal weight given by adult American women is 140 ± 1.4 pounds. Asked to explain the meaning of this interval, the student says, “95% of all adult American women would say that their ideal weight is between 138.6 and 141.4 pounds.” Is the student right? Explain your answer.

Answer

The student is wrong. A 95% confidence interval does not contain 95% of population values.
Instead, all we can say is that if we repeatedly drew samples of the same number of women, each time computing a 95% confidence interval for their average perceived ideal weight, then in the long run 95% of these confidence intervals would capture the true, unknown average ideal weight as perceived by all American women.

14.23 Explaining confidence. You ask another student to explain the confidence interval for mean ideal weight described in the previous exercise. The student answers, “We can be 95% confident that future samples of adult American women will say that their mean ideal weight is between 138.6 and 141.4 pounds.” Is this explanation correct? Explain your answer.

Answer

This student is also confused. If we repeated the sample over and over, 95% of all future sample means would be within 1.96 standard deviations of the sample mean (that is, within 1.96 σ/√n) of the true, unknown value of μ. Future samples will have no memory of our sample.

14.24 Explaining confidence. Here is an explanation from the Associated Press concerning one of its opinion polls. Explain briefly but clearly in what way this explanation is incorrect.

“For a poll of 1,600 adults, the variation due to sampling error is no more than three percentage points either way. The error margin is said to be valid at the 95 percent confidence level. This means that, if the same questions were repeated in 20 polls, the results of at least 19 surveys would be within three percentage points of the results of this survey.”

Answer

The mistake is in saying that 95% of other polls would have results close to the results of this poll. Other surveys should be close to the truth, not necessarily close to the results of this survey. (Additionally, there is the suggestion that 95% means “exactly 19 out of 20,” when really 95% refers to repeating the survey infinitely often.)
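The z-interval arithmetic used in exercises 14.7, 14.8, and 14.9 can be sketched in R as follows. This is only an illustrative sketch: the helper name z_interval and the variable names are hypothetical (they do not come from the exercises), and σ is treated as known, as the exercises assume.

```r
# Hypothetical helper (not from the source): z confidence interval for a mean
# when the population standard deviation sigma is treated as known.
z_interval <- function(xbar, sigma, n, conf.level = 0.95) {
  z_star <- qnorm(1 - (1 - conf.level) / 2)  # critical value z*
  me <- z_star * sigma / sqrt(n)             # margin of error z* sigma / sqrt(n)
  c(lower = xbar - me, upper = xbar + me, margin = me)
}

# Exercise 14.7(b): xbar = 105.84, sigma = 15, n = 31, 99% confidence
iq_ci <- z_interval(105.84, sigma = 15, n = 31, conf.level = 0.99)

# Exercise 14.8: xbar = 26.8, sigma = 7.5, n = 654 at three confidence levels
conf_levels <- c(0.90, 0.95, 0.99)
bmi_by_level <- t(sapply(conf_levels,
                         function(cl) z_interval(26.8, 7.5, 654, cl)))
rownames(bmi_by_level) <- paste0(100 * conf_levels, "%")

# Exercise 14.9: 95% margin of error for n = 100, 400, 1600
sizes <- c(100, 400, 1600)
me_by_n <- sapply(sizes, function(n) z_interval(26.8, 7.5, n)[["margin"]])
names(me_by_n) <- sizes

print(round(iq_ci, 2))
print(round(bmi_by_level, 4))
print(round(me_by_n, 4))
```

Because qnorm() returns unrounded critical values (for example about 1.959964 rather than 1.96), the printed margins can differ from hand calculations in the later decimal places.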
R confidence intervals

set.seed(12345)
# First, in R, install.packages("TeachingDemos")
library(TeachingDemos)

# Draw 25 observations from a normal distribution
x <- rnorm(25, mean = 100, sd = 5)

## Compute a Z-test of the hypothesis mu = 120
z.test(x, mu = 120, stdev = 5, conf.level = 0.95)

#>         One Sample z-test
#>
#> data:  x
#> z = -20.0059, n = 25, Std. Dev. = 5, Std. Dev. of the sample mean = 1, p-value < 2.2e-16
#> alternative hypothesis: true mean is not equal to 120
#> 95 percent confidence interval:
#>   98.03415 101.95408
#> sample estimates:
#> mean of x
#>  99.99411

set.seed(12345)
x <- rnorm(25, mean = 100, sd = 5)

## Compute a t-test of the hypothesis mu = 120
t.test(x, mu = 120, conf.level = 0.95)

#>         One Sample t-test
#>
#> data:  x
#> t = -21.1677, df = 24, p-value < 2.2e-16
#> alternative hypothesis: true mean is not equal to 120
#> 95 percent confidence interval:
#>   98.04349 101.94473
#> sample estimates:
#> mean of x
#>  99.99411

# plots of Z and t
x <- seq(-4, 4, length = 100)
plot(x, dnorm(x), type = "l", ylab = "Density", xlab = "Z, t")
lines(x, dt(x, df = 10), lty = 2, col = 2)
legend(-4, max(dnorm(x)), c("Z", "t (df=10)"), lty = c(1, 2), col = c(1, 2), cex = .5)

set.seed(12345)
x <- rnorm(100, mean = 10)

# Use the t.test() function to compute a confidence interval
# for mu.x when the variance is unknown
t.test(x, conf.level = 0.95)$conf.int

# Of course, you could do it manually
mean(x) - qt(0.975, df = length(x) - 1) * sqrt(var(x) / length(x))
mean(x) + qt(0.975, df = length(x) - 1) * sqrt(var(x) / length(x))

set.seed(12345)
x <- rnorm(100, mean = 10)
y <- rnorm(100, mean = 5)

# Use the t.test() function to compute a confidence interval
# for mu.x - mu.y when the variances are unknown and unequal
t.test(x, y, conf.level = 0.95, var.equal = FALSE)$conf.int
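The long-run reading of "95% confidence" from exercises 14.22 to 14.24 can also be checked by simulation. The sketch below rests on assumptions that are not in the exercises: a Normal population with μ = 140 and σ = 20, and samples of size n = 784. These values were picked only because they give a margin of error of about 1.4, matching the interval 140 ± 1.4. The seed and the number of repetitions are arbitrary.

```r
set.seed(12345)  # arbitrary seed, used only for reproducibility

# Assumed (illustrative) population and design, not taken from the exercises
mu <- 140        # true mean, unknown in practice
sigma <- 20      # population standard deviation, treated as known
n <- 784         # sample size; sigma / sqrt(n) = 20 / 28
n_reps <- 10000  # number of repeated samples

z_star <- qnorm(0.975)
me <- z_star * sigma / sqrt(n)  # margin of error, roughly 1.4

# For each repeated sample, record whether its 95% z interval captures mu
captures <- replicate(n_reps, {
  xbar <- mean(rnorm(n, mean = mu, sd = sigma))
  (xbar - me <= mu) && (mu <= xbar + me)
})
coverage <- mean(captures)

# Share of individual population values lying between 138.6 and 141.4
pop_share <- pnorm(141.4, mean = mu, sd = sigma) -
  pnorm(138.6, mean = mu, sd = sigma)

print(coverage)
print(pop_share)
```

Under these assumptions, coverage should come out near 0.95, while pop_share is far smaller. That contrast is the point of exercise 14.22: the confidence level describes how often the interval-building procedure captures μ, not the fraction of individual values that fall inside one interval.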