5.3 Central Limit Theorem (CLT)
* Also a very important section in the book.
  - In the previous section, we computed probabilities related to an individual observation, such as $P(X < 120) = ?$
  - We now move to statements about a group of observations. Specifically, we compute probabilities related to the sample mean (written $\bar{X}$), such as $P(\bar{X} < 120) = ?$

Performance by a group of individuals
* Suppose a class of 25 students is taking the Stanford-Binet IQ test.
* The teacher would like to consider her students' performance as a whole.
* For example, what is the probability that the class average is above 110?

Performance by a group of individuals
* We know how to compute probabilities (or percentiles) for any single (individual) student taking the Stanford-Binet IQ test because we know the test scores have a normal distribution with a mean of 100 and a standard deviation of 16. Or, in shorthand, for an IQ score $X$, we have
  $X \sim N(\mu = 100, \sigma = 16)$

Performance by a group of individuals
* But what about a sample mean $\bar{X}$?
  - To compute probabilities for $\bar{X}$, we need to know its distribution. Does $\bar{X}$ also have a normal distribution?
  - In this case, YES! So we can use our previously learned procedure for computing probabilities for $\bar{X}$ (i.e., z-tables).

Performance by a group of individuals
* It turns out that $\bar{X}$ is also normally distributed, and the mean and standard deviation of $\bar{X}$ are related to those of the normal distribution for $X$.
  - If $X \sim N(\mu, \sigma)$, then $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$, that is, $\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \sigma/\sqrt{n}$.
* The sample mean $\bar{X}$ is normally distributed with a mean equal to the population mean $\mu$ and a variance of $\sigma^2/n$.

Performance by a group of individuals
  - If $X \sim N(\mu, \sigma)$, then $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$.
* And if $n$ is large enough (say $n \ge 30$), then $\bar{X}$ has approximately this same distribution, even if $X$ is not normally distributed (by the Central Limit Theorem).

* It turns out that an average $\bar{X}$ is less variable than an individual observation.
  - The variance of a single observation $X$ is $\sigma^2$, while the variance of a sample mean taken from $n$ observations is $\sigma^2/n$.
* If you had to predict the points scored in a single NFL football game, what range of points would be relevant? 0 to 54?
* If you had to predict the AVERAGE points scored per game over the whole season, what range of points would be relevant? 10 to 30?

Performance by a group of individuals
* For a random sample of n=25 students taking the Stanford-Binet IQ test, we have
  $X \sim N(\mu = 100, \sigma = 16)$,
  so we have
  $\bar{X} \sim N(\mu_{\bar{X}} = 100, \sigma_{\bar{X}} = 16/\sqrt{25})$,
  where $16/\sqrt{25}$ as a decimal is 3.2.

Compare the distribution of scores for an individual to the distribution of scores for a mean (n=25).
[Figure: two panels, "Individual IQ Scores" and "Mean IQ Scores (n=25)"; x-axis: IQ Score (60 to 140); y-axis: relative frequency (0.00 to 0.15).]
* The distributions for $X$ and $\bar{X}$ are both normal.
* The distributions for $X$ and $\bar{X}$ both have a mean equal to 100, or $\mu = 100$ and $\mu_{\bar{X}} = 100$.
* The spread of $\bar{X}$ IS MUCH SMALLER than the spread of $X$. Specifically, $\sigma = 16$ and $\sigma_{\bar{X}} = 16/\sqrt{25} = 3.2$.

Performance by a group of individuals
* For a random sample of n=25 taking the Stanford-Binet IQ test, what is the probability that the group average is more than 110?
  $P(\bar{X} > 110) = 1 - P(\bar{X} \le 110) = ?$
  Using $\bar{X} \sim N(\mu_{\bar{X}} = 100, \sigma_{\bar{X}} = 3.2)$, we will convert 110 to a z-score and compute the probability.

Performance by a group of individuals
Probability distribution for the average score from 25 students taking the Stanford-Binet test:
  $\bar{X} \sim N(\mu_{\bar{X}} = 100, \sigma_{\bar{X}} = 16/\sqrt{25})$
The z-score for 110 is $(110 - 100)/3.2 = 3.125$, which rounds to 3.13 for the z-table:
  $P(\bar{X} > 110) = 1 - P(\bar{X} \le 110)$
  $= 1 - P\left(z \le \frac{110 - 100}{16/\sqrt{25}}\right)$
  $= 1 - P(z \le 3.13)$
  $= 1 - 0.9991 = 0.0009$

Performance by a group of individuals
* There is a very small chance of getting a mean $\bar{X}$ of 110 or higher.
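The tail probability for the class average (n=25, mean 100, standard deviation 16) can also be checked numerically instead of with a z-table. The following is a minimal sketch using Python's standard library; the function names `standard_error` and `prob_mean_above` are illustrative choices, not part of the course material.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_mean_above(threshold, mu, sigma, n):
    """P(sample mean > threshold), assuming the sample mean is
    N(mu, sigma / sqrt(n))."""
    z = (threshold - mu) / standard_error(sigma, n)
    # Upper-tail area of the standard normal distribution at z.
    return 1.0 - NormalDist().cdf(z)


# Stanford-Binet example from the slides: mu = 100, sigma = 16, n = 25.
se = standard_error(16, 25)              # 3.2
p = prob_mean_above(110, mu=100, sigma=16, n=25)
print(f"standard error = {se:.1f}")      # 3.2
print(f"P(mean > 110)  = {p:.4f}")       # 0.0009
```

The code uses the unrounded z-score 3.125 rather than the table value 3.13, so the result matches the slide's 0.0009 only after rounding to four decimal places.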
[Figure: "Mean IQ Scores (n=25)"; x-axis: IQ Score (60 to 140); y-axis: relative frequency (0.00 to 0.15). The area under the curve in red is 0.0009.]

Performance by a single individual
  $X \sim N(\mu, \sigma)$
  Z-score: $z = \frac{x - \mu}{\sigma}$
Performance by a group of n individuals
  $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$
  Z-score: $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

Exercise 1:
* For a random sample of size n=9, find the probability that the average IQ score is 110 or higher.

Exercise 2:
* For a random sample of size n=30, find the probability that the average IQ score is 98 or lower.

What if the original distribution was NOT normal?
* In the previous example, the distribution of individual IQ scores was normal. So it naturally followed that the average IQ score ($\bar{X}$) would also have a normal distribution.
* But what about other distributions? There are many other possibilities (uniform, right-skewed, left-skewed, etc.). What then?

What if the original distribution was NOT normal?
* Enter… THE CENTRAL LIMIT THEOREM.
* An incredibly useful rule.
* NO MATTER WHAT DISTRIBUTION you're drawing from (as long as it has a finite mean and standard deviation), $\bar{X}$ will be approximately normally distributed as long as you take a large enough random sample ($n \ge 30$).

What if the original distribution was NOT normal?
* See applet linked at our website: http://onlinestatbook.com/stat_sim/sampling_dist/index.html

Comment
* Central Limit Theorem
  - Gives us $\bar{X} \sim N(\mu, \sigma/\sqrt{n})$, approximately.
* If the parent population is VERY non-normal, need $n \ge 30$.
* If the parent population is nearly normal, any size n is OK.
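The claim that the sample mean is approximately N(µ, σ/√n) even when the parent population is not normal can be checked with a small simulation, in the same spirit as the applet. The sketch below is an illustration under stated assumptions: the parent distribution (a right-skewed exponential with mean 1 and standard deviation 1), the number of repetitions, the seed, and the helper name `simulate_sample_means` are all choices made here, not taken from the slides.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean, pstdev


def simulate_sample_means(draw, n, reps, seed=0):
    """Return `reps` sample means, each computed from `n` draws.

    `draw` is a function taking a random.Random instance and
    returning one observation from the parent population.
    """
    rng = random.Random(seed)
    return [fmean(draw(rng) for _ in range(n)) for _ in range(reps)]


# Assumed parent population: exponential with rate 1 (right-skewed),
# which has mean 1 and standard deviation 1.
MU, SIGMA = 1.0, 1.0
N, REPS = 30, 10_000

means = simulate_sample_means(lambda rng: rng.expovariate(1.0), N, REPS)

se_theory = SIGMA / sqrt(N)          # sigma / sqrt(n) from the CLT
mean_of_means = fmean(means)         # should be close to MU
sd_of_means = pstdev(means)          # should be close to se_theory

# If the sample mean is roughly normal, about 95% of the simulated
# means should fall within 1.96 standard errors of MU.
z_cut = NormalDist().inv_cdf(0.975)
within = sum(abs(m - MU) <= z_cut * se_theory for m in means) / REPS

print(f"mean of sample means: {mean_of_means:.3f} (theory {MU:.3f})")
print(f"sd of sample means:   {sd_of_means:.3f} (theory {se_theory:.3f})")
print(f"share within 1.96 SE: {within:.3f} (normal theory 0.950)")
```

Changing `N` to a small value such as 2 or 5 shows the other half of the comment: with a strongly skewed parent, the sample means remain visibly skewed until n grows.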