ECO 72 - INTRODUCTION TO ECONOMIC STATISTICS
Topic 7: The Central Limit Theorem

These slides are copyright © 2003 by Tavis Barr. This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/).

• Random Samples
• Sampling Distribution
• The Law of Large Numbers
• The Central Limit Theorem
  o Known population mean and variance
  o Known population mean, unknown variance

Random Sample
• Every element of the population has the same chance of being included in the sample at each draw.
  o Simple random sampling: all samples of the same size have an equal chance of being selected from the population.
  o Cluster sampling: the population is divided into clusters and then a random sample of the clusters is selected. (Clusters are similar to one another, while the elements within each cluster are heterogeneous.)
  o Stratified sampling: the population is divided into strata and then a random sample is selected from each stratum. (Strata differ from one another, while the elements within each stratum are homogeneous.)

Sampling Distribution (SD)
• The distribution of a given statistic based on a random sample.
• It may be considered as the distribution of the statistic over all possible samples of a given size.
• The sampling distribution depends on the underlying distribution of the population, the statistic being considered, and the sample size used.

SD of the Sample Mean
• In order to obtain information about the population mean $\mu$, a sample is taken and the sample mean $\bar{X}$ is calculated.
• When the sample changes, the sample mean changes.
• The set of all possible values of the sample mean, together with their probabilities of occurrence, is called the SD of the sample mean.
• So the sample mean itself has a mean and a standard deviation.

The IID assumption
• We assume that the observations in the sample are independent and identically distributed (IID).
• Random sampling gives IID sample observations.
• Observations are independent when knowing the value of one observation in a sample tells us nothing about the values of the other observations in that sample.
• Observations are identically distributed if they are all draws from a random variable with the same distribution and parameters (we make no assumption about what that distribution is).

• It turns out that if the observations in our sample are independent and identically distributed, we can predict the behavior of large samples.
• The law of large numbers and the central limit theorem are two of the basic tools for doing this.

The Law of Large Numbers
The Law of Large Numbers (LLN) states that if the sample is IID and the population has a finite mean and variance, then the sample mean approaches the population mean with probability one as the sample becomes infinitely large.

Central Limit Theorem - Result
The Central Limit Theorem (CLT) states that if the sample is IID and the population has a finite mean $\mu$ and variance $\sigma^2$, then

$$\bar{X} \approx N\left(\mu, \frac{\sigma^2}{n}\right)$$

approximately, with the approximation improving as $n$ (the sample size) approaches infinity.

Central Limit Theorem - Remarks
• The observations don't have to be normal for the CLT to work!
• As a rule of thumb, you need n ≥ 30 observations for the approximation to work well. (Fewer observations are needed if they come from a symmetric distribution.)
• The standard deviation of the sample mean (also known as the standard error) is $\sigma/\sqrt{n}$.

Example of Central Limit Theorem
● Suppose we produce soda. Our quality control engineer claims that our bottles of soda have mean contents of 2000 ml and a standard deviation of 2 ml.
● We take a sample of 100 bottles.
  How likely is it that the mean contents of the bottles in our sample is 1999.5 ml or less?
  – The sample mean will be approximately normally distributed, with an expected value of 2000 and a standard error of $2/\sqrt{100} = 0.2$.
  – So we want to know the probability that a Normally distributed variable with mean 2000 and standard deviation 0.2 is less than 1999.5.
  – This is the same as the probability that a standard normal variable is less than (1999.5 - 2000)/0.2 = -2.5.
  – From the table, P(z < -2.5) is about 0.006.

Another Example of CLT
● Suppose we know that the mean age at marriage of men in the U.S. is 24.8 years and the standard deviation is 2.5 years.
● If we take a sample of 60 married men, what is the probability that the mean age at marriage in the sample will be 25.1 years or more?
  – The sample mean will be approximately a Normal variable with mean 24.8 and standard deviation $2.5/\sqrt{60} = 2.5/7.75 = 0.32$.
  – What is the probability that a Normal variable with mean 24.8 and standard deviation 0.32 is at least 25.1?
  – This is the same as the probability that a standard normal is at least (25.1 - 24.8)/0.32 = 0.3/0.32 = 0.9375.
  – From the table, P(z < 0.94) is 0.826.
  – So P(z > 0.94) = 1 - 0.826 = 0.174.

What if we don't know $\sigma$?
● Sometimes we know the population mean, but not the population standard deviation.
● In this case, we can substitute the sample standard deviation, $s$, for the population standard deviation $\sigma$.
● Then, the result is that the sample mean is approximately normally distributed with expected value $\mu$ and standard error $s/\sqrt{n}$.

Example with unknown $\sigma$
● Suppose a company claims that its light bulbs last an average of a thousand hours.
● We take a sample of 500 light bulbs.
● The average bulb in the sample lasts 950 hours, and the sample standard deviation is 100 hours.
  What is the probability of observing a sample mean this small?
  – Here $\mu = 1000$, $\sigma$ is unknown, $n = 500$, $\bar{X} = 950$, and $s = 100$.

Example with unknown $\sigma$
● Recap:
  – Population mean ($\mu$) of 1000; population standard deviation ($\sigma$) unknown.
  – Sample size ($n$) 500; sample mean ($\bar{X}$) 950; sample standard deviation ($s$) 100.
● What is the probability of $\bar{X}$ this small or smaller?
  – $\bar{X}$ is approximately Normal with mean 1000 and standard error $100/\sqrt{500} = 100/22.36 = 4.47$.
  – $P(\bar{X} < 950)$ is the same as $P(z < [950 - 1000]/4.47)$, i.e., $P(z < -11.18)$, which is essentially zero.
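
The three worked examples (soda, age at marriage, light bulbs) all follow the same recipe: compute the standard error, convert the sample mean to a z-score, and look up a tail probability. The sketch below is one possible way to check those calculations in Python rather than with a table. The function names `std_normal_cdf` and `sample_mean_tail` are illustrative choices, not part of the slides, and the code assumes the normal approximation from the CLT applies; for the light bulb example it plugs in the sample standard deviation s in place of the unknown population standard deviation.

```python
from math import erfc, sqrt


def std_normal_cdf(z):
    """Return P(Z <= z) for a standard normal Z.

    Uses the complementary error function, which stays accurate
    far out in the lower tail (e.g. z = -11).
    """
    return 0.5 * erfc(-z / sqrt(2.0))


def sample_mean_tail(mu, sd, n, xbar, upper=False):
    """Return (standard error, z-score, tail probability) for a sample mean.

    mu    : population mean
    sd    : population standard deviation, or the sample standard
            deviation s when the population value is unknown
    n     : sample size
    xbar  : observed (or hypothetical) sample mean
    upper : if True, return P(sample mean >= xbar);
            otherwise return P(sample mean <= xbar)
    """
    se = sd / sqrt(n)          # standard error of the sample mean
    z = (xbar - mu) / se       # standardize the sample mean
    # By symmetry of the normal distribution, P(Z >= z) = P(Z <= -z).
    prob = std_normal_cdf(-z) if upper else std_normal_cdf(z)
    return se, z, prob


# Inputs are taken from the three examples in the slides.
examples = {
    "soda (mean <= 1999.5 ml)": sample_mean_tail(2000, 2, 100, 1999.5),
    "marriage (mean >= 25.1 years)": sample_mean_tail(24.8, 2.5, 60, 25.1, upper=True),
    "bulbs (mean <= 950 hours)": sample_mean_tail(1000, 100, 500, 950),
}

for label, (se, z, prob) in examples.items():
    print(f"{label}: SE = {se:.4f}, z = {z:.4f}, probability = {prob:.3g}")
```

Because the code does not round intermediate values, its results can differ slightly from the hand calculations: in the marriage example the unrounded z-score is about 0.93 rather than 0.94, which gives a probability of roughly 0.176 instead of the table-based 0.174.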