PHP2510: Principles of Biostatistics & Data Analysis
Lecture VIII: Law of Large numbers and central limit theorem
PHP 2510 – Lec 8: law of large numbers, CLT
Properties of the Sample Mean
Example I: Consider the experiment of rolling a die.
Ω = {1, 2, 3, 4, 5, 6}
E[X] = Σ_{i=1}^{6} P(X = i) × i = Σ_{i=1}^{6} (1/6) × i = 3.5
If we roll a die repeatedly and calculate the average of all the rolls, what
happens to that average (the sample mean)?
Let X̄n represent the sample mean when we have rolled n times.
• X̄n is a random variable
• E[X̄n] = E[X] = 3.5
• Can we say anything more about X̄n?
Simulation: Below is the result of one such attempt of rolling a die
many times, simulated by a computer:
5 6 5 6 3 1 2 4 5 6 1 1 5 1 3 3 3 3 2 6 ...
This is one realization of that experiment. Let x̄n denote the
sample mean for the first n rolls. We have

n    x̄n
1    5
2    (5+6)/2 = 5.5
3    (5+6+5)/3 = 5.333
4    (5+6+5+6)/4 = 5.5
5    (5+6+5+6+3)/5 = 5
6    (5+6+5+6+3+1)/6 = 4.333
...  ...
Let's observe how the sample mean x̄n changes as we follow
n = 1, n = 2, . . . up to a large number
[Figure: running sample mean x̄k plotted against k for three simulated sequences of die rolls; panels show sequence 1 for the first 100 rolls and the first 10000 rolls, and sequences 2 and 3 for the first 10000 rolls.]
The LAW OF LARGE NUMBERS states that: If we take larger
and larger independent samples of a random variable X, then the
sample mean converges to the expected value.
This applies to all random variables with an expectation.
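The die-rolling simulation above can be reproduced with a short sketch. The course uses Stata; the following is a rough Python equivalent, with an arbitrary seed and roll count (not the sequence shown on the slide):

```python
import random

random.seed(111)  # arbitrary seed, for reproducibility only

n_rolls = 10_000
total = 0
running_means = []
for k in range(1, n_rolls + 1):
    total += random.randint(1, 6)    # one fair die roll
    running_means.append(total / k)  # sample mean of the first k rolls

# early means are noisy; the last one should sit near E[X] = 3.5
print(running_means[0], running_means[9], running_means[-1])
```

Plotting `running_means` against k would reproduce the trace in the figure above.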
Another Example: Suppose X follows a Binomial distribution,
Binom(n=15, p=.3). You have done a similar simulation in lab.
set obs 100
set seed 111
gen x1=rbinomial(15,.3)
Now let's compute the sample means:
gen k=_n
gen xbar1=sum(x1)/k
twoway scatter xbar1 k
E[X] = np = 15 × .3 = 4.5, if the LAW OF LARGE NUMBERS
applies, we should see the sample mean converge to 4.5 as n
increases.
The first 13 numbers and the sample means...

list k x1 xbar1 if k<=13

     +--------------------+
     |  k   x1      xbar1 |
     |--------------------|
  1. |  1    6          6 |
  2. |  2    5        5.5 |
  3. |  3    5   5.333333 |
  4. |  4    5       5.25 |
  5. |  5    1        4.4 |
     |--------------------|
  6. |  6    4   4.333333 |
  7. |  7    4   4.285714 |
  8. |  8    8       4.75 |
  9. |  9    4   4.666667 |
 10. | 10    1        4.3 |
     |--------------------|
 11. | 11    5   4.363636 |
 12. | 12    5   4.416667 |
 13. | 13    8   4.692307 |
     +--------------------+
[Figure: sample means x̄k from iid Binomial(15, .3) draws plotted against k; panels show the first 100 and the first 1000 observations. The traces settle near E[X] = 4.5.]
Sample means from iid Poisson(7.5)

clear
set obs 1000
set seed 111
gen x1=rpoisson(7.5)
gen k=_n
gen xbar1=sum(x1)/k
twoway scatter xbar1 k,c(l) msize(.4)
[Figure: sample means x̄k from iid Poisson(7.5) draws plotted against k for several simulated sequences; each trace settles near E[X] = 7.5.]
Sample means from iid normal with mean 15 and standard deviation
2.5, 1, or 10.

clear
set obs 5000
set seed 111
gen x1=rnormal(15,2.5)
gen k=_n
gen xbar1=sum(x1)/k
twoway scatter xbar1 k if xbar1<15.5&xbar1>14.75,c(l) msize(.4)

clear
set obs 5000
set seed 222
gen x1=rnormal(15,1)
gen k=_n
gen xbar1=sum(x1)/k
twoway scatter xbar1 k if xbar1<15.5&xbar1>14.75,c(l) msize(.4)
[Figure: sample means x̄k plotted against k for normal draws with σ = 2.5, σ = 1, and σ = 10 (the σ = 10 case shown both at full scale and zoomed in); all traces settle near µ = 15.]
Notice that in each example, the sample mean converges to the
expectation µ = 15.
Also notice that for most n, the sample mean X̄n does not equal
15. And for the last simulation from N(µ = 15, σ² = 10²), the
sample mean varies more than in the first example
(N(µ = 15, σ² = 2.5²)) and the second example (N(µ = 15, σ² = 1)).
Recall if Xi ∼ N(µ, σ²) we have X̄n ∼ N(µ, σ²/n).
Thus the variance of X̄ depends on both the original σ² as well as
the sample size n.
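The σ²/n relationship can be checked directly by simulating many sample means and comparing their spread to σ/√n. Below is a Python sketch (seed, sample size, and replication count are arbitrary choices for illustration):

```python
import random, math, statistics

random.seed(42)  # arbitrary seed
mu, sigma, n, reps = 15.0, 2.5, 50, 2000

# draw many independent sample means, each based on n normal observations
means = []
for _ in range(reps):
    means.append(sum(random.gauss(mu, sigma) for _ in range(n)) / n)

# empirical SD of the sample means vs the theoretical sigma / sqrt(n)
print(statistics.stdev(means), sigma / math.sqrt(n))
```

The two printed values should agree closely, illustrating SD(X̄) = σ/√n.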
We have seen examples of Binomial, Poisson and Normal
distributions.
In each case, the sample mean converges to the expected value as
the sample size increases.
In fact, the law of large numbers applies to all probability
distributions that have expectations.
If X1, X2, . . . are independent identically distributed random variables from a
distribution with expectation E[X],

X̄n = (X1 + X2 + . . . + Xn)/n → E[X] as n → ∞
FYI: Cauchy distribution
Exceptions exist, but you may never run into these in your study.
For those interested, the most famous example is the Cauchy
distribution with probability density

f(x) = 1 / (π(1 + x²))

[Figure: density of the standard Cauchy distribution, plotted for x from −4 to 4.]

This distribution has no expected value, and when you take a
sequence of independent observations from a Cauchy distribution,
the sample mean does not converge to the "center".
FYI: law of large numbers on variance estimate
The law of large numbers applies to all distributions with
expectations. Recall that we have defined variance as the
"expectation of squared distance to the mean": for a random
variable X with mean µ, var(X) = E[(X − µ)²].
If we take an infinite sequence of independent random variables
X1, X2, . . . , Xk, . . . where each Xk has the same probability
distribution as X, then

Y1 = (X1 − µ)², Y2 = (X2 − µ)², . . . , Yk = (Xk − µ)², . . .

are still independent of each other and each has the expectation
E[Yk] = E[(Xk − µ)²] = var(X).
Thus

Ȳn = ((X1 − µ)² + . . . + (Xn − µ)²)/n → var(X) as n → ∞
FYI: law of large numbers on variance estimate
Let's see the Bernoulli example. Consider X ∼ Bernoulli(p = .35).

k    X    (X − .35)²    running average of (X − .35)²
1    0    0.1225        0.1225
2    1    0.4225        (.1225+.4225)/2 = 0.2725
3    0    0.1225        (.1225+.4225+.1225)/3 = 0.2225
4    1    0.4225        0.2725
5    1    0.4225        0.3025
6    0    0.1225        0.2725
7    0    0.1225        0.2511
8    1    0.4225        0.2725
9    0    0.1225        0.2558
10   0    0.1225        0.2425
...  ...  ...           ...

Recall var(X) = p(1 − p) = .35 × .65 = .2275
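The convergence of the running average of (X − .35)² to var(X) = .2275 can be checked with a quick simulation, sketched in Python (arbitrary seed and sample size):

```python
import random

random.seed(2510)  # arbitrary seed
p, mu, n = 0.35, 0.35, 100_000

# running sum of Y_k = (X_k - mu)^2 for Bernoulli(p) draws;
# by the law of large numbers the average approaches var(X) = p(1-p) = 0.2275
total = 0.0
for _ in range(n):
    x = 1 if random.random() < p else 0
    total += (x - mu) ** 2

print(total / n)
```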
FYI: law of large numbers on variance estimate
[Figure: running average of (X − .35)² plotted against k for k up to 1000; the trace settles near var(X) = .2275.]
FYI: law of large numbers on a function of X
In general, if we take a long sequence of independent
observations of a random variable X, and if a function of X, g(X),
has expectation E[g(X)], then the law of large numbers applies to
g(X) as well:

(g(X1) + g(X2) + . . . + g(Xn))/n → E[g(X)] as n → ∞
FYI: law of large numbers on a function of X
For example, if X1, X2, . . . , Xn all have the same probability
distribution Normal(µ = 5, σ² = 2²):
• X̄ = (X1 + X2 + . . . + Xn)/n → 5 as n → ∞
• Consider the function g(X) = X². Then
(X1² + X2² + . . . + Xn²)/n → E[X²] = var(X) + (E[X])² = 2² + 5² = 29 as n → ∞
(since var(X) = E[X²] − (E[X])², we have
E[X²] = var(X) + (E[X])²)
Exercise: if X1, X2, . . . , Xn all have the same probability
distribution Normal(µ = 0, σ² = 3²), what does the sample mean
of X² converge to?
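The first bullet's limit of 29 is easy to check numerically; a Python sketch (arbitrary seed and sample size, not from the slides):

```python
import random

random.seed(7)  # arbitrary seed
mu, sigma, n = 5.0, 2.0, 200_000

# sample mean of g(X) = X^2 for X ~ N(5, 2^2);
# by the LLN it approaches E[X^2] = var(X) + (E[X])^2 = 4 + 25 = 29
mean_sq = sum(random.gauss(mu, sigma) ** 2 for _ in range(n)) / n
print(mean_sq)
```

Changing the parameters to µ = 0, σ = 3 answers the exercise in the same way.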
The law of large numbers is the reason statistics works: by taking a
large sample of observations, we can use an average of the
observations to estimate unknown parameters.
• For Bernoulli trials, we can use Xn to estimate p
• For Poisson distributions, we can use Xn to estimate λ since
Xn → λ
• For normal distributions, we can use Xn to estimate µ
We know these estimates should be close to the (unknown) truth
when n gets larger and larger because of the law of large numbers.
But how close do we get? By "close" do we mean equally probable
to be greater than or less than the truth? The sample mean X̄n
from n independent observations of a random variable X is itself a
random variable. What is the probability distribution of X̄n like?
Central Limit Theorem
The Central Limit Theorem answers what the probability
distribution of X̄n looks like as n increases toward infinity.
Central Limit Theorem
Recall that for random variables
X1, X2, . . . , Xn that are independent identically distributed Normal
N(µ, σ²),
we know X̄n ∼ N(µ, σ²/n) exactly, and n does not have to be a large
number.
For other random variables, the distribution of the sample mean
converges to a normal distribution as n → ∞:

√n (X̄n − µ) → N(0, σ²) as n → ∞.

For a large n, X̄n is approximately distributed as
N(µ, σ²/n).
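The approximation can be seen in a simulation: sample means of Binomial(15, .3) draws (a non-normal distribution) center at 4.5, spread like σ/√n, and roughly 68% land within one standard deviation of the mean, as a normal distribution would predict. A Python sketch with arbitrary seed and sizes:

```python
import random, math, statistics

random.seed(111)  # arbitrary seed

def rbinom(n_trials, p):
    """One Binomial(n_trials, p) draw as a sum of Bernoulli trials."""
    return sum(1 for _ in range(n_trials) if random.random() < p)

n, reps = 30, 3000
mu = 15 * 0.3                             # E[X] = 4.5
sd_xbar = math.sqrt(15 * 0.3 * 0.7 / n)   # sigma / sqrt(n)

# many independent sample means, each from n Binomial(15, .3) draws
means = [sum(rbinom(15, 0.3) for _ in range(n)) / n for _ in range(reps)]

# if Xbar is roughly N(mu, sigma^2/n), about 68% of the sample means
# should fall within one standard deviation of mu
frac_within_1sd = sum(abs(m - mu) < sd_xbar for m in means) / reps
print(statistics.mean(means), statistics.stdev(means), frac_within_1sd)
```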
Central Limit Theorem
Example:
Suppose that white blood cell count (per microliter) in adults is
normally distributed with mean 8000 and standard deviation 1200.
If we have a random sample of 15 adults, what is the probability
that the average from the 15 adults
• is less than 7500?
• is greater than 8600?
• is between 8000 and 8600?
Central Limit Theorem
Since X1, X2, . . . , X15 are i.i.d. N(8000, 1200²),

X̄ ∼ N(µ, σ²/n) = N(8000, 1200²/15)

Standardization:

(X̄ − µ) / (σ/√n) ∼ N(0, 1) = Z

P(X̄ < 7500) = P( (X̄ − 8000)/(1200/√15) < (7500 − 8000)/(1200/√15) )
            = P( Z < (7500 − 8000)/(1200/√15) )
            = P(Z < −1.61) = .053
Central Limit Theorem
The probability that any individual has a wbc count less than
7500 is, in contrast,

P(X1 < 7500) = P( (X1 − 8000)/1200 < (7500 − 8000)/1200 )
             = P( Z < (7500 − 8000)/1200 )
             = P(Z < −0.41) = .34

So we would not be surprised to see one individual's wbc count
below 7500. But we may be somewhat surprised if the
average from 15 individuals is below 7500.
Central Limit Theorem
P(X̄ > 8600) = P( (X̄ − 8000)/(1200/√15) > (8600 − 8000)/(1200/√15) )
            = P( Z > (8600 − 8000)/(1200/√15) )
            = P(Z > 1.936) ≈ 0.026

Since P(X̄ > 8000) = 0.5,

P(8000 < X̄ < 8600) = 0.50 − .026 = 0.474
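All four wbc probabilities can be verified without a Z-table, using the standard normal CDF written in terms of the error function. A Python sketch:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma, n = 8000, 1200, 15
se = sigma / sqrt(n)   # SD of the sample mean, 1200/sqrt(15)

p_mean_below = phi((7500 - mu) / se)       # P(Xbar < 7500), about .053
p_mean_above = 1 - phi((8600 - mu) / se)   # P(Xbar > 8600), about .026
p_between = phi((8600 - mu) / se) - 0.5    # P(8000 < Xbar < 8600), about .474
p_indiv_below = phi((7500 - mu) / sigma)   # P(X1 < 7500), about .34

print(p_mean_below, p_mean_above, p_between, p_indiv_below)
```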
Central Limit Theorem
Example:
X1, X2, . . . , XN are i.i.d. Binomial(n = 8, p = .95).
(NOTE: here N is the number of X's you observe, and n is the parameter
of each binomial distribution; n = 8 is fixed.) Suppose N is a
large number.
What do we know about X̄N?
• E[X̄N] = E[X] = np = 8 × .95 = 7.6
• var(X̄N) = var(X1)/N = 8(.95)(1 − .95)/N = .38/N
What about the distribution of X̄N?
The Central Limit Theorem says the distribution of X̄N is approximately
N(7.6, .38/N)
Central Limit Theorem
Distribution of X̄, when N = 5
[Figure: histogram of simulated sample means from Binomial(8, .95) with N = 5.]
Central Limit Theorem
[Figure: histograms of simulated sample means from Binomial(8, .95) for N = 5, 10, 20, and 100; the distributions tighten around 7.6 as N grows.]
Central Limit Theorem
[Figure: histogram of sample means from Binom(8, .95) with N = 100, with a smooth normal density curve overlaid.]
The red smooth curve is a normal distribution with mean 7.6 and
variance 0.38/100 = .0038, standard deviation √.0038 = 0.0616
Recall we showed that
• E[X̄N] = 8 × .95 = 7.6
• var(X̄N) = var(X1)/N = 8(.95)(1 − .95)/N = .38/N
• thus SD(X̄N) = √.38/√N = √.38/√100 = .0616
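These numbers can be checked by simulating many sample means directly; a Python sketch with arbitrary seed and replication count:

```python
import random, math, statistics

random.seed(8)  # arbitrary seed

def rbinom(n_trials, p):
    """One Binomial(n_trials, p) draw as a sum of Bernoulli trials."""
    return sum(1 for _ in range(n_trials) if random.random() < p)

N, reps = 100, 2000
# many independent sample means, each averaging N Binomial(8, .95) draws
means = [sum(rbinom(8, 0.95) for _ in range(N)) / N for _ in range(reps)]

print(statistics.mean(means))    # near E[X] = 7.6
print(statistics.stdev(means))   # near sqrt(.38/100) = .0616
print(math.sqrt(0.38 / N))
```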
Central Limit Theorem
Example: if we have i.i.d. Poisson random variables, what is
the distribution of X̄n?
Suppose X1, X2, . . . , Xn are i.i.d. Poisson(λ = 1.5)
Central Limit Theorem
[Figure: histograms of simulated sample means from Poisson(1.5) for N = 5, 10, 20, and 100; the distributions tighten around 1.5 as N grows.]
Central Limit Theorem
[Figure: histogram of sample means from Poisson(1.5) with N = 100, with a smooth normal density curve overlaid.]
The red smooth curve is a normal distribution with mean 1.5 and
variance 0.015.
Recall each X ∼ Pois(1.5), so
• E[X̄n] = E[X] = 1.5
• var(X̄n) = var(X1)/N = 1.5/N = 1.5/100 = .015
• thus SD(X̄n) = √.015 = .122
Central Limit Theorem
Exercise: A department of a Rhode Island hospital has daily visits
following a Poisson distribution with mean 5. Assume the number of
visits is independent from day to day.
• What is the expected value of the "average daily visits in October"?
• What is the variance of the average daily visits in October?
• What is the probability that there are more than 6 visits in a
day?
• What is the probability that the average daily visit in October
is greater than 5.5?
• What is the probability that the average daily visit from a year
is greater than 5.5?
Central Limit Theorem
Since we have X ∼ Poisson(5), n = 31 for October and n = 365 for a year:
• E[X̄n] = E[X] = 5
• var(X̄31) = var(X)/31 = 5/31
• P(X > 6) = 1 − P(X ≤ 6) = 1 − Σ_{k=0}^{6} e⁻⁵ 5ᵏ/k! = 0.238
• X̄31 is approximately Normal(5, 5/31):
P(X̄31 > 5.5) = P( (X̄31 − 5)/√(5/31) > (5.5 − 5)/√(5/31) ) = P(Z > 1.245) = 0.1067
• X̄365 is approximately Normal(5, 5/365):
P(X̄365 > 5.5) = P( (X̄365 − 5)/√(5/365) > (5.5 − 5)/√(5/365) ) = P(Z > 4.27) ≈ 0
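The exact Poisson tail and the two CLT approximations above can be computed in a few lines; a Python sketch:

```python
from math import erf, exp, factorial, sqrt

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

lam = 5

# exact Poisson tail for one day: P(X > 6) = 1 - P(X <= 6)
p_day = 1 - sum(exp(-lam) * lam**k / factorial(k) for k in range(7))

# CLT approximations for the monthly (n = 31) and yearly (n = 365) averages
p_oct = 1 - phi((5.5 - lam) / sqrt(lam / 31))
p_year = 1 - phi((5.5 - lam) / sqrt(lam / 365))

print(p_day, p_oct, p_year)
```

Note how the yearly probability is essentially zero: with 365 days of averaging, the sample mean is very unlikely to be half a visit above its expectation.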
Central Limit Theorem
Example: Rice Chapter 5 problem 15
Suppose you bet $5 on a fair game. Use the central limit theorem
to approximate your probability of losing more than $75 in a
sequence of 50 independent games.
Central Limit Theorem
Let Xi be your winnings in the i-th game. A fair game means

P(Xi = 5) = P(Xi = −5) = 0.5

Event of interest:

Σ_{i=1}^{50} Xi < −75, equivalent to X̄ < −75/50 = −1.5

In order to apply the CLT we need the mean and variance of X.

E[X] = 0
E[X²] = Σ_{x ∈ {−5, 5}} x² P(X = x) = 5² × 0.5 + (−5)² × 0.5 = 25
σ² = var(X) = E[X²] − (E[X])² = 25 − 0 = 25

Central Limit Theorem
By the CLT, X̄ is approximately N(0, σ²/n); here X̄ ∼ N(0, 25/50), thus

P(X̄ < −1.5) = P( Z < (−1.5 − 0)/√(25/50) )
            = P(Z < −2.12)
            = 0.017
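The arithmetic of the standardization can be double-checked with the error-function form of the normal CDF; a Python sketch:

```python
from math import erf, sqrt

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Xbar is approximately N(0, 25/50); losing more than $75 over 50 games
# means Xbar < -75/50 = -1.5
z = (-1.5 - 0) / sqrt(25 / 50)
p_lose = phi(z)
print(z, p_lose)   # z is about -2.12, p_lose about 0.017
```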
Central Limit Theorem
EXERCISE: Rice, Chapter 5, Problem 17
Suppose that a measurement has mean µ and variance σ² = 25. We
would like to estimate µ by taking a number of measurements and
calculating the average X̄. We want to be 95% sure that the estimate
is close enough to µ that the difference between X̄ and µ is
less than 1. How many measurements do you need to take?
Central Limit Theorem
X̄ ∼ N(µ, σ²/n) ⟹ (X̄ − µ)/(σ/√n) = Z

P(|X̄ − µ| < 1) = 0.95
⟹ P( |X̄ − µ|/(σ/√n) < 1/(σ/√n) ) = 0.95
⟹ P( |Z| < 1/(σ/√n) ) = 0.95

Looking up the Z-table,

1/(σ/√n) = 1.96

Thus

n = 1.96² σ² = 1.96² × 25 = 96.04 ≈ 96
(since n must be an integer, rounding up to n = 97 guarantees the bound).
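The sample-size calculation, and a check that the resulting n actually delivers at least 95% coverage, can be sketched in Python (note 1.96² × 25 = 96.04, so the integer sample size is rounded up to 97):

```python
from math import erf, sqrt, ceil

def phi(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

sigma = 5.0   # sigma^2 = 25
z95 = 1.96    # two-sided 95% normal critical value

n_exact = (z95 * sigma) ** 2   # 96.04
n = ceil(n_exact)              # round up: 97 measurements

# coverage check: with n measurements, P(|Xbar - mu| < 1) should exceed 0.95
coverage = 2 * phi(1 / (sigma / sqrt(n))) - 1
print(n_exact, n, coverage)
```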
Central Limit Theorem
Summary:
For i.i.d. random variables X1, X2, . . . , Xn that each follow a
distribution with mean µ and variance σ²:
• The sample mean X̄n is a random variable
• The observed data points x1, x2, . . . , xn are one set of
realizations of the random variables X1, X2, . . . , Xn, and the
sample mean calculated on that particular dataset, x̄n, is one
realization of the r.v. X̄n
• The sample mean X̄n has mean µ and variance σ²/n
• When n is large, the distribution of X̄n is
approximately normal with mean µ and variance σ²/n