Central Limit Theorem

The Central Limit Theorem (CLT) is one of the most important ideas in Statistics. It allows us to model a wide variety of phenomena and make astoundingly accurate predictions. However, very specific conditions need to apply in order for the CLT to be valid. If you use the CLT where it is not valid, all sorts of disasters may await. Consider yourself warned!!

Before we get to the CLT itself, we need a few definitions:

A sampling distribution is the probability distribution of a set of sample means when samples of a fixed size n are repeatedly taken from a population (with replacement). Note that this set no longer consists of data points, as it has for the entirety of the class up to now, but of the means of various samples.

For example, take a data set A. Its elements are data points and can be written as:
$A = \{x_1, x_2, x_3, x_4, \ldots\}$

However, we are now dealing with sets of means. Take the set of means B. Its elements are means and can be written as:
$B = \{\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4, \ldots\}$

The mean of a sampling distribution is the same as the mean of the population from which it was drawn:
$\mu_{\bar{x}} = \mu$

The variance of a sampling distribution is the variance of the population from which it was drawn, divided by n, the sample size:
$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n}$

And thus the standard deviation can be given as:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$

We now have in mathematical form the very important idea that we've been talking about for most of the year: The larger the sample size, the less uncertainty in the result. (Remember, standard deviation is a measure of risk or uncertainty.) By the way, the standard deviation of a set of sample means has a special name: the standard error.

Now we are ready for the CLT itself. It states:
1.) If samples of a fixed size n, with $n \geq 30$, are drawn from any population, then the set of sample means approximates a normal distribution.
2.) OR, if samples of any fixed size are drawn from a normally-distributed population, then the set of sample means is also normally distributed.

p. 251 Example 4:
First you have to read the graph and realize the population we're concerned with is only very young drivers (between 15 and 19). The mean for this group is 25 minutes; since the calculation below centers the sampling distribution at this value, we treat it as the population mean, $\mu = 25$. We are told that the standard deviation of the population is $\sigma = 1.5$. (The fact that we are just given this parameter is the one unrealistic thing about this problem.) The sample size is 50, which is greater than 30, so we're justified in using the CLT, even though the problem doesn't tell us that the original population was normal (it doesn't matter).

So we know that the mean of any sample of 50 comes from somewhere within a normally distributed set of all possible sample means from this population. What are the mean and standard deviation of this set? Well, we know that its mean is still 25, just like the original population, and its standard deviation is given by
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{1.5}{\sqrt{50}} = 0.2121$

We are now asked to answer the question "What is the probability that the sample mean is somewhere between 24.7 and 25.5 minutes?" In other words,
$P(24.7 < \bar{x} < 25.5) = $ _____

Well, we know how to do these problems already! They're just the "between" problems from the last section! Just make sure you're using the "new" standard deviation, 0.2121, and not the "original"!! (This is the most common mistake in this section.) Evaluating the expression gives us
$P = 0.9116$

In other words, there is a 91.16% chance that the mean of a sample of 50 of these drivers falls between 24.7 and 25.5 minutes. This same reasoning is what lies behind confidence intervals. (Although most confidence intervals are symmetric about the mean, they don't have to be.)

Now try p.252 "Try it yourself" #4. Note that nothing changes from Example 4 except the sample size goes from 50 to 100.
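If you want to check the Example 4 arithmetic without a z-table, here is a minimal Python sketch. The function name `prob_sample_mean_between` is made up for illustration (it is not from the textbook), and it assumes the sampling distribution is normal with mean 25 and standard error σ/√n, as the CLT says. Because it does not round the z-scores to two decimals the way a table lookup does, expect roughly 0.912 for n = 50 and not exactly 0.9116.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def prob_sample_mean_between(mu, sigma, n, low, high):
    """P(low < x-bar < high) when x-bar is normal with mean mu and sd sigma/sqrt(n)."""
    standard_error = sigma / sqrt(n)  # the "new" standard deviation
    sampling_dist = NormalDist(mu=mu, sigma=standard_error)
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)


# Example 4: mean 25, sigma = 1.5, n = 50
standard_error_50 = 1.5 / sqrt(50)
p_50 = prob_sample_mean_between(25, 1.5, 50, 24.7, 25.5)

# Same bounds, but with the sample size raised to 100
p_100 = prob_sample_mean_between(25, 1.5, 100, 24.7, 25.5)

print(round(standard_error_50, 4))  # 0.2121
print(round(p_50, 4), round(p_100, 4))
```

The only thing that changes between the two calls is n, so any difference in the two printed probabilities comes entirely from the smaller standard error.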
Notice how the larger sample size affects the probability for the same range of bounds…

Continue with examples 5, 6.
HW: p.254 #1-8, p.256-7 #21-34
Continue with CLT worksheets.
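The two facts these notes rely on (the mean of the sample means equals μ, and the standard error equals σ/√n) can also be checked by simulation. The sketch below is only an illustration under assumptions that are not in the notes: the population is exponential with rate 1 (so μ = 1 and σ = 1, and it is clearly not normal), the sample size is 36, and the function name `simulate_sample_means` is hypothetical.

```python
import random
import statistics


def simulate_sample_means(n, num_samples, seed=0):
    """Draw num_samples samples of size n and return the list of their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Assumed population: exponential with rate 1 (mu = 1, sigma = 1, skewed)
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(num_samples)
    ]


sample_means = simulate_sample_means(n=36, num_samples=20000)

mean_of_means = statistics.fmean(sample_means)  # should be close to mu = 1
sd_of_means = statistics.stdev(sample_means)    # should be close to 1/sqrt(36)

print(mean_of_means, sd_of_means, 1 / 36 ** 0.5)
```

Plotting a histogram of `sample_means` should show a roughly bell-shaped curve even though the population itself is strongly skewed, which is the first statement of the CLT in action.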