AP Statistics: Section 10.1 A
Confidence Interval Basics

How long does a new laptop battery last? What proportion of college undergraduates have engaged in binge drinking? It would not be feasible to test every laptop or question every college undergraduate. Instead, we choose a sample from the population of interest and collect data from these subjects. Our goal is to use the sample statistic to estimate the unknown population parameter.

Statistical inference provides methods for drawing conclusions about a population from sample data. The two most common types of formal statistical inference are confidence intervals and significance tests. Inference is most reliable when the data are produced by a properly randomized design.

Sample values, such as a proportion or mean, will vary from sample to sample, but there is only one true population proportion or mean. Only by considering our sample as one of many such samples can we draw inferences.

The sampling distribution of $\bar{x}$ describes how the values of $\bar{x}$ vary in repeated samples. Recall from Chapter 9 some important facts about the sampling distribution of $\bar{x}$.

CENTER: $\mu_{\bar{x}} = \mu$
SPREAD: $\sigma_{\bar{x}} = \sigma / \sqrt{n}$ (Note: this requires $N \ge 10n$.)
SHAPE:
1. If the population is Normally distributed, then the distribution of $\bar{x}$ will be Normally distributed regardless of our sample size.
2. Central Limit Theorem: If $n$ is sufficiently large, the sampling distribution of $\bar{x}$ will be approximately Normal regardless of the shape of the population distribution.

Example: The admissions director at Big City University proposes using the IQ scores of current students as a marketing tool. The university decides to provide him with enough money to administer IQ tests to an SRS of 50 of the university’s 5000 freshmen. The mean IQ score for the sample is 112. What can the director say about the mean score $\mu$ of the population of all 5000 freshmen? Because $n = 50$ is fairly large, the Law of Large Numbers says that $\bar{x}$ will be close to $\mu$, so $\mu$ is probably close to 112.
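The center and spread facts recalled above can be checked with a small simulation. The sketch below is only an illustration: the population (5000 values drawn from a right-skewed exponential distribution with mean about 10), the seeds, and the helper name `simulate_sample_means` are assumptions made for the example and are not part of the original notes.

```python
import random
import statistics

def simulate_sample_means(population, n, reps, seed=0):
    """Draw `reps` simple random samples of size n (without replacement)
    from `population` and return the list of their sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.sample(population, n)) for _ in range(reps)]

# Hypothetical right-skewed population of N = 5000 values (an assumption).
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1 / 10) for _ in range(5000)]

mu = statistics.fmean(population)      # population mean
sigma = statistics.pstdev(population)  # population standard deviation

n = 50  # the condition N >= 10n holds here: 5000 >= 500
means = simulate_sample_means(population, n, reps=2000)

# CENTER: the mean of the sample means should be close to mu.
print("population mean:      ", round(mu, 2))
print("mean of sample means: ", round(statistics.fmean(means), 2))

# SPREAD: the sd of the sample means should be close to sigma / sqrt(n).
print("sigma / sqrt(n):      ", round(sigma / n ** 0.5, 2))
print("sd of sample means:   ", round(statistics.pstdev(means), 2))
```

Even though this made-up population is strongly skewed, a histogram of `means` would look roughly Normal, which is what the Central Limit Theorem leads us to expect for a sample size of 50.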
Now, 112 is probably not the true population mean IQ of Big City University freshmen. The goal of a confidence interval is to give a range of values that we are “confident” the true population mean lies within. The following gives a glimpse of how this is done.

When the distribution of $\bar{x}$ is Normal, the 68-95-99.7 rule for Normal distributions says that in about 95% of all samples, the sample mean $\bar{x}$ will be within 2 standard deviations of the population mean $\mu$. So, whenever $\bar{x}$ is within 2 standard deviations of $\mu$, $\mu$ is within 2 standard deviations of $\bar{x}$. The unknown $\mu$ therefore lies between $\bar{x} - 2\sigma_{\bar{x}}$ and $\bar{x} + 2\sigma_{\bar{x}}$ in about 95% of all samples.

For the example above, let’s assume the standard deviation of freshman IQ scores at BCU is $\sigma = 15$, so the standard deviation of $\bar{x}$ is $\sigma_{\bar{x}} = 15 / \sqrt{50} \approx 2.1$.

So, we estimate that $\mu$ lies somewhere in $\bar{x} \pm 2(2.1)$, that is, in the interval from $\bar{x} - 2(2.1)$ to $\bar{x} + 2(2.1)$, or $(\bar{x} - 4.2,\ \bar{x} + 4.2)$.

Our sample of 50 freshmen gave $\bar{x} = 112$. The resulting interval is $112 \pm 4.2$, or $(107.8,\ 116.2)$.

The key idea is that the sampling distribution of $\bar{x}$ tells us how big the error is likely to be when we use $\bar{x}$ to estimate $\mu$.

Understand that our confidence is in the procedure used to generate the interval. It is incorrect to attach a probability to an interval that has already been computed, because there are only two possibilities:
1. The interval between 107.8 and 116.2 contains the true $\mu$.
2. Our SRS was one of the few samples for which $\bar{x}$ is not within 4.2 points of the true $\mu$. Only 5% of our samples give such inaccurate results.

The interval of numbers is called a 95% confidence interval for $\mu$. It catches the unknown $\mu$ in 95% of all possible samples.
The 4.2 is called the margin of error.
The 95% is the confidence level.
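The arithmetic of the example, and the idea that the 95% describes the procedure and not any single interval, can be sketched in a few lines of Python. The function names `approx_95_interval` and `coverage`, the "true" mean of 110 used in the simulation, the Normal population, and the seed are all assumptions chosen for illustration; in practice $\mu$ is unknown.

```python
import math
import random

def approx_95_interval(xbar, sigma, n):
    """Return (low, high, margin) for the interval xbar +/- 2 * sigma / sqrt(n)."""
    sd_xbar = sigma / math.sqrt(n)  # standard deviation of the sample mean
    margin = 2 * sd_xbar            # the "2 standard deviations" rule of thumb
    return xbar - margin, xbar + margin, margin

# The worked example: xbar = 112, sigma = 15, n = 50.
low, high, margin = approx_95_interval(xbar=112, sigma=15, n=50)
print(round(margin, 1), round(low, 1), round(high, 1))  # 4.2 107.8 116.2

def coverage(mu, sigma, n, reps, seed=0):
    """Fraction of simulated samples whose interval contains the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        lo, hi, _ = approx_95_interval(xbar, sigma, n)
        if lo <= mu <= hi:
            hits += 1
    return hits / reps

# Hypothetical true mean of 110 (an assumption made only for the simulation).
rate = coverage(mu=110, sigma=15, n=50, reps=2000)
print("share of intervals that caught mu:", rate)
```

Each simulated interval either contains 110 or it does not; it is the long-run share of intervals that capture the mean, which should come out near 0.95, that the confidence level refers to.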