Math/Stat 352 Lecture 10
Section 4.11 The Central Limit Theorem

Summing random variables

• Generally, summation changes the shape of the distribution: range of values, spread, mean, etc.
• There is no simple way to find the distribution of $X + Y$ from the distributions of $X$ and $Y$: the calculation is complicated and typically involves multiple integration.
• So what do we do when we have more summands, like 10, 20, …, 1000? We need some magic to deal with it: enter the Central Limit Theorem.

The Central Limit Effect

Start with one random variable (one summand). Consider two types of random variables: continuous (sample on left) and discrete (sample on right).

[Figure: histograms of 100 observations from each sum; continuous case on the left, discrete case on the right]
• One summand: $X_1$ (left) or $Y_1$ (right)
• Two summands: $X_1 + X_2$ (left) or $Y_1 + Y_2$ (right)
• Three summands: $X_1 + X_2 + X_3$ (left) or $Y_1 + Y_2 + Y_3$ (right)
• Twenty summands: $X_1 + X_2 + \cdots + X_{20}$ (left) or $Y_1 + Y_2 + \cdots + Y_{20}$ (right)
• 100 summands: $X_1 + X_2 + \cdots + X_{100}$ (left) or $Y_1 + Y_2 + \cdots + Y_{100}$ (right)
• 1000 summands: $X_1 + X_2 + \cdots + X_{1000}$ (left) or $Y_1 + Y_2 + \cdots + Y_{1000}$ (right), with the label "Normal distribution"

The Central Limit Theorem

Let $X_1, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2$.
Let $\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}$ be the sample mean.
Let $S_n = X_1 + \cdots + X_n$ be the sum of the sample observations.
Then, if $n$ is sufficiently large,

$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ and $S_n \sim N(n\mu, n\sigma^2)$ approximately.

How large a sample is large enough?

Rule of Thumb for the CLT: For most populations, if the sample size is greater than 30, the Central Limit Theorem approximation is good.

Two Examples of the CLT

Normal approximation to the Binomial: If $X \sim \mathrm{Bin}(n, p)$ and if $np > 10$ and $n(1 - p) > 10$, then

$X \sim N(np, np(1 - p))$ approximately and $\hat{p} \sim N\left(p, \frac{p(1 - p)}{n}\right)$ approximately,

where $\hat{p} = X/n$ is the sample proportion.

Normal approximation to the Poisson: If $X \sim \mathrm{Poisson}(\lambda)$, where $\lambda > 10$, then $X \sim N(\lambda, \lambda)$ approximately.

Example

The manufacture of a certain part requires two different machine operations. The time on machine 1 has mean 0.4 hours and standard deviation 0.1 hours. The time on machine 2 has mean 0.45 hours and standard deviation 0.15 hours. The times needed on the machines are independent. Suppose that 65 parts are manufactured. What is the distribution of the total time on machine 1? On machine 2? What is the probability that the total time used by both machines together is between 50 and 55 hours?

Soln: Let $X$ = time on machine 1 in hours, with $E(X) = 0.4$ and standard deviation 0.1. Let $Y$ = time on machine 2 in hours, with $E(Y) = 0.45$ and standard deviation 0.15. $X$ and $Y$ are independent.

$X_1, X_2, \ldots, X_{65}$ are the times for the 65 parts on machine 1.
$Y_1, Y_2, \ldots, Y_{65}$ are the times for the 65 parts on machine 2.

Let $S_X = X_1 + X_2 + \cdots + X_{65}$ = total time on machine 1. $S_X$ has approximately a normal distribution with mean $E(S_X) = 65(0.4) = 26$ and variance $\mathrm{Var}(S_X) = 65(0.1)^2 = 0.65$, so the distribution of $S_X$ is approximately $N(26, 0.65)$.

Let $S_Y = Y_1 + Y_2 + \cdots + Y_{65}$ = total time on machine 2. $S_Y$ has approximately a normal distribution with mean $E(S_Y) = 65(0.45) = 29.25$ and variance $\mathrm{Var}(S_Y) = 65(0.15)^2 = 1.4625$, so the distribution of $S_Y$ is approximately $N(29.25, 1.4625)$.

Example contd.

What is the probability that the total time used by both machines together is between 50 and 55 hours?

We need the distribution of the total time on machines 1 and 2. Let $T$ = total time for the 65 parts on both machines, so $T = S_X + S_Y$. Since $S_X$ and $S_Y$ are independent and each is approximately normal, their sum is also approximately normal, with mean $E(T) = E(S_X) + E(S_Y) = 26 + 29.25 = 55.25$ and variance $\mathrm{Var}(T) = \mathrm{Var}(S_X) + \mathrm{Var}(S_Y) = 0.65 + 1.4625 = 2.1125$. So the distribution of $T$ is approximately $N(55.25, 2.1125)$.

Standardizing with $\sqrt{2.1125} \approx 1.453$:

$P(50 < T < 55) = P\left(\frac{50 - 55.25}{1.453} < Z < \frac{55 - 55.25}{1.453}\right) = P(-3.61 < Z < -0.17) = 0.4325 - 0.0002 = 0.4323.$

Continuity Correction

The binomial distribution is discrete, while the normal distribution is continuous.
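To make the discrete-versus-continuous mismatch concrete, the sketch below compares an exact binomial probability for $X \sim \mathrm{Bin}(100, 0.5)$ with two normal approximations: one using the endpoints as they are, and one widening each endpoint by 0.5, the adjustment that the following text describes. This is an illustration only, not part of the lecture: the helper names (`normal_cdf`, `binomial_prob_between`, `normal_approx_between`) and the `shift` parameter are hypothetical, and only the Python standard library is assumed.

```python
from math import comb, erf, sqrt


def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


def binomial_prob_between(n, p, lo, hi):
    """Exact P(lo <= X <= hi) for X ~ Bin(n, p), summing the probability mass function."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(lo, hi + 1))


def normal_approx_between(n, p, lo, hi, shift=0.0):
    """Normal approximation to P(lo <= X <= hi).

    `shift` (hypothetical parameter) widens each endpoint; 0.5 corresponds to
    the half-unit adjustment for a probability that includes both endpoints.
    """
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return normal_cdf(hi + shift, mean, sd) - normal_cdf(lo - shift, mean, sd)


exact = binomial_prob_between(100, 0.5, 45, 55)
naive = normal_approx_between(100, 0.5, 45, 55)
widened = normal_approx_between(100, 0.5, 45, 55, shift=0.5)

print(f"exact binomial:                   {exact:.4f}")
print(f"normal, endpoints as-is:          {naive:.4f}")
print(f"normal, endpoints widened by 0.5: {widened:.4f}")
```

In this case the unadjusted approximation falls noticeably short of the exact value, because it ignores half of the probability bars at 45 and 55, while the widened version is much closer.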
The continuity correction is an adjustment, made when approximating a discrete distribution with a continuous one, that can improve the accuracy of the approximation.

If you want to include the endpoints in your probability calculation, extend each endpoint outward by 0.5, then proceed with the calculation.

[Figure: histogram of $X \sim \mathrm{Bin}(100, 0.5)$, approximated by $Y \sim N(50, 25)$]
$P(45 \le X \le 55) \approx P(44.5 < Y < 55.5)$

If you want to exclude the endpoints from your probability calculation, move each endpoint inward by 0.5.

[Figure: histogram of $X \sim \mathrm{Bin}(100, 0.5)$, approximated by $Y \sim N(50, 25)$]
$P(45 < X < 55) \approx P(45.5 < Y < 54.5)$

We use the continuity correction for the normal approximation to the binomial distribution, but not for the normal approximation to the Poisson distribution.

Example

If a fair coin is tossed 100 times, use the normal curve to approximate the probability that the number of heads is between 45 and 55 inclusive.

Soln. Let $X$ = number of heads. Then $X \sim \mathrm{Bin}(100, 0.5)$, approximated by $Y \sim N(50, 25)$, so the standard deviation is 5.

$P(45 \le X \le 55) \approx P(44.5 < Y < 55.5) = P\left(\frac{44.5 - 50}{5} < Z < \frac{55.5 - 50}{5}\right) = P(-1.1 < Z < 1.1) = 0.7286.$

How about the probability that the number of heads is between 45 and 55 exclusive?

$P(45 < X < 55) \approx P(45.5 < Y < 54.5) = P\left(\frac{45.5 - 50}{5} < Z < \frac{54.5 - 50}{5}\right) = P(-0.9 < Z < 0.9) = 0.6318.$

Example

Suppose $X$ is the score on a test and $X \sim N(500, 100^2)$. Let $X_1, X_2, \ldots, X_{16}$ be a sample of scores for 16 individuals and $\bar{X}$ their average score. Find $P(550 < \bar{X} \le 600)$.

Solution: Since the data come from a normal distribution, $\bar{X}$ has a normal distribution with mean $\mu_{\bar{X}} = \mu = 500$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n} = 100/\sqrt{16} = 25$. Thus,

$P(550 < \bar{X} \le 600) = P\left(\frac{550 - 500}{25} < \frac{\bar{X} - 500}{25} \le \frac{600 - 500}{25}\right) = P(2 < Z \le 4) = P(Z \le 4) - P(Z \le 2) \approx 1 - 0.9772 = 0.0228.$

Example

Suppose $X_1, X_2, \ldots, X_{25}$ are lifetimes of electronic components, with $\mu = 700$ hours and $\sigma = 10$ hours. Find $P(\bar{X} \le 702)$, where $\bar{X}$ is the sample mean of the lifetimes of the 25 components.

Solution. Usually lifetime data is skewed to the right, so not normal (Why?). Since $n = 25$ (reasonably large), we will use the CLT: $\bar{X}$ has approximately a $N(\mu, \sigma^2/n) = N(700, 10^2/25) = N(700, 4)$ distribution. So,

$P(\bar{X} \le 702) = P\left(\frac{\bar{X} - 700}{2} \le \frac{702 - 700}{2}\right) = P(Z \le 1) = 0.8413.$

Example – Water Taxi Safety

A water taxi has a capacity of 3500 lb. Given that the population of men has normally distributed weights with a mean of 172 lb and a standard deviation of 29 lb,
a) if one man is randomly selected, find the probability that his weight is greater than 175 lb.
b) if 20 different men are randomly selected, find the probability that their mean weight is greater than 175 lb (so that their total weight exceeds the safe capacity of 3500 pounds).

Example – cont

a) One man, weight greater than 175 lb:

$z = \frac{175 - 172}{29} = 0.10$

b) Twenty men, mean weight greater than 175 lb:

$z = \frac{175 - 172}{29/\sqrt{20}} = 0.46$

Example – conclusion

a) If one man is randomly selected, the probability that his weight is greater than 175 lb is $P(X > 175) = 0.4602$.
b) If 20 different men are randomly selected, the probability that their mean weight is greater than 175 lb is $P(\bar{X} > 175) = 0.3228$.

It is much easier (larger probability) for an individual to deviate from the mean than it is for a group of 20 to deviate from the mean.
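The water taxi calculation can be checked numerically. The sketch below is an illustration rather than part of the lecture: it assumes only the Python standard library, and the names `normal_cdf`, `p_single`, and `p_mean` are hypothetical. Because it does not round $z$ to two decimals before looking up the normal probability, its results differ from the table-based values 0.4602 and 0.3228 in the third decimal place.

```python
from math import erf, sqrt


def normal_cdf(x, mean, sd):
    """CDF of a normal distribution with the given mean and standard deviation."""
    return 0.5 * (1.0 + erf((x - mean) / (sd * sqrt(2.0))))


mu = 172.0         # population mean weight (lb)
sigma = 29.0       # population standard deviation (lb)
n = 20             # number of men in the group
threshold = 175.0  # 3500 lb capacity divided by 20 passengers

# a) One randomly selected man: X ~ N(mu, sigma^2)
p_single = 1.0 - normal_cdf(threshold, mu, sigma)

# b) Mean of n men: the sample mean is N(mu, sigma^2 / n),
#    so its standard deviation is sigma / sqrt(n)
p_mean = 1.0 - normal_cdf(threshold, mu, sigma / sqrt(n))

print(f"P(X > 175)     = {p_single:.4f}")
print(f"P(Xbar > 175)  = {p_mean:.4f}")
```

The only difference between the two lines is the standard deviation: dividing by $\sqrt{20}$ shrinks the spread of the sample mean, which is why the group probability is smaller.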