Sums of Independent Random Variables

If X and Y are independent random variables:
Var(X + Y) = Var(X) + Var(Y)
sd(X + Y) = √( sd(X)² + sd(Y)² )
For constants a and b:
Var(aX + bY) = a² Var(X) + b² Var(Y)

Sums of Independent Bernoulli Random Variables

Suppose X1, X2, ..., Xn are independent Bernoulli(p) random variables. Consider
S = X1 + X2 + · · · + Xn
• E[S] = E[X1 + X2 + · · · + Xn] = E[X1] + E[X2] + · · · + E[Xn] = np
• Var(S) = Var(X1 + X2 + · · · + Xn) = Var(X1) + Var(X2) + · · · + Var(Xn) = np(1 − p)
S ∼ Binomial(n, p)

Sums of Independent Poisson Random Variables

Suppose X1, X2, ..., Xn are independent Poisson random variables with parameters λ1, λ2, ..., λn respectively. Consider
S = X1 + X2 + · · · + Xn
• E[S] = E[X1 + X2 + · · · + Xn] = E[X1] + E[X2] + · · · + E[Xn] = λ1 + λ2 + · · · + λn
• Var(S) = Var(X1 + X2 + · · · + Xn) = Var(X1) + Var(X2) + · · · + Var(Xn) = λ1 + λ2 + · · · + λn
S ∼ Poisson(λ1 + λ2 + · · · + λn)

Central Limit Theorem

• Sums of independent normal random variables are normally distributed
X1, X2 ∼ N(3, 4) ⟹ X1 + X2 ∼ N(6, 8)
X1, X2, ..., Xn ∼ N(µ, σ²) ⟹ X1 + X2 + · · · + Xn ∼ N(nµ, nσ²)
• What about sums of other random variables?

Central Limit Theorem

If X1, X2, ..., Xn are independent and have mean µ and variance σ² then, for large n,
S = X1 + X2 + · · · + Xn ≈ N(nµ, nσ²)

Central Limit Effect

[Figure: (a) Triangular and (b) Uniform parent distributions, with panels for n = 1, 2, 4 and 10. From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.]
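Returning to the sum of independent Bernoulli(p) random variables, the results E[S] = np and Var(S) = np(1 − p) can be checked numerically. The following is a minimal Python sketch that builds the exact distribution of S by repeated convolution and compares it with the Binomial(n, p) probabilities. The values n = 10 and p = 0.3 and the helper names `convolve` and `mean_var` are illustrative assumptions, not part of the slides.

```python
from math import comb


def convolve(pmf_a, pmf_b):
    """PMF of the sum of two independent non-negative integer variables."""
    out = [0.0] * (len(pmf_a) + len(pmf_b) - 1)
    for i, a in enumerate(pmf_a):
        for j, b in enumerate(pmf_b):
            out[i + j] += a * b
    return out


def mean_var(pmf):
    """Mean and variance of a PMF given as a list indexed by value."""
    mean = sum(k * prob for k, prob in enumerate(pmf))
    var = sum((k - mean) ** 2 * prob for k, prob in enumerate(pmf))
    return mean, var


n, p = 10, 0.3              # illustrative values (assumption)
bernoulli = [1 - p, p]      # P(X = 0), P(X = 1)

# Distribution of S = X1 + ... + Xn, built one summand at a time.
pmf_s = [1.0]
for _ in range(n):
    pmf_s = convolve(pmf_s, bernoulli)

mean_s, var_s = mean_var(pmf_s)

# Binomial(n, p) probabilities for comparison.
binomial = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
max_gap = max(abs(a - b) for a, b in zip(pmf_s, binomial))

print("E[S]   =", mean_s, " vs np      =", n * p)
print("Var(S) =", var_s, " vs np(1-p) =", n * p * (1 - p))
print("largest difference from Binomial pmf:", max_gap)
```

The same `convolve` helper could be reused with other small discrete distributions to see how quickly the shape of the sum becomes bell-like.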
[Figure: Central Limit Effect for (a) Exponential and (b) Quadratic U parent distributions, with panels for n = 1, 2, 4 and 10. From Chance Encounters by C.J. Wild and G.A.F. Seber, © John Wiley & Sons, 2000.]

Normal Approximation to the Binomial

A fair coin is tossed 1,000 times. What is the probability of more than 520 heads?
H ∼ Binomial(1000, 0.5)
H can be considered as the sum of 1000 independent Bernoulli random variables with mean 0.5 and variance 0.25 = 0.5(1 − 0.5).
By the Central Limit Theorem
H ≈ N(1000 × 0.5, 1000 × 0.25) ≡ N(500, 250)
So, using the continuity correction described next,
P(H > 520) ≈ P(Z > (520.5 − 500)/√250) ≈ P(Z > 1.30) ≈ 0.097

Continuity Correction

When using the normal distribution to approximate a discrete distribution (e.g. binomial), we need to take into account the fact that the normal distribution is continuous.

Discrete               Continuous
P(X > k)          −→   P(X > k + 1/2)
P(X ≥ k)          −→   P(X > k − 1/2)
P(X < k)          −→   P(X < k − 1/2)
P(X ≤ k)          −→   P(X < k + 1/2)
P(k1 < X < k2)    −→   P(k1 + 1/2 < X < k2 − 1/2)
P(k1 ≤ X ≤ k2)    −→   P(k1 − 1/2 < X < k2 + 1/2)

Normal Approximation to the Binomial

If X ∼ Binomial(n, p) then X ≈ N(np, npq), where q = 1 − p, provided that both np ≥ 5 and nq ≥ 5.
• i.e. the approximation is OK if the expected numbers of successes and failures are both at least 5
• the approximation is best for values of p ≈ 0.5
• more extreme values of p need larger sample sizes, n

Normal Approximation to the Poisson Distribution

We have seen that the sum of independent Poisson random variables is Poisson. So a Poisson(100) distribution can be thought of as the sum of 100 independent Poisson(1) distributions and hence may be approximately normal.
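The coin-tossing example can be checked numerically. The sketch below, using only the Python standard library, computes the exact Binomial(1000, 0.5) probability of more than 520 heads and compares it with the N(500, 250) approximation, with and without the continuity correction. The variable names are illustrative assumptions.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 1000, 0.5, 520

# Exact P(H > 520): for a fair coin each outcome has probability 1 / 2**n.
exact = sum(comb(n, h) for h in range(k + 1, n + 1)) / 2**n

# Normal approximation N(np, np(1 - p)) = N(500, 250).
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

plain = 1 - approx.cdf(k)            # no continuity correction
corrected = 1 - approx.cdf(k + 0.5)  # P(X > k) -> P(X > k + 1/2)

print(f"exact binomial          : {exact:.4f}")
print(f"normal, no correction   : {plain:.4f}")
print(f"normal, with correction : {corrected:.4f}")
```

In this example the corrected value lies closer to the exact binomial probability than the uncorrected one.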
If X ∼ Poisson(λ) then X ≈ N(λ, λ) for λ ≥ 20
• approximation improves as λ gets larger
• use continuity correction for calculating probabilities using the normal approximation

The number of phone calls at a call centre is Poisson distributed with mean 64 per hour.
1. What is the probability of 70 or more calls in a given hour?
2. What is the probability of less than 240 calls in a 4 hour period?

The Sample Mean

Sums of independent random variables are approximately normal. What about the mean?
If X1, X2, ..., Xn are independent with the same distribution and have mean µ and variance σ² then
X̄ = (X1 + X2 + · · · + Xn)/n ≈ N(µ, σ²/n)
This is the basis of statistics!

Salaries of employees in MegaRipOff Corporation are normally distributed with mean 50,000 and variance 20,000².
• What is the probability that a random sample of 25 employees has an average salary of more than 58,000?
• What is the probability that a random sample of 100 employees has an average salary between 48,000 and 52,000?
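One possible way to work the call centre and salary questions is sketched below in Python, using `statistics.NormalDist` from the standard library. It assumes that calls in different hours are independent (so the 4 hour total is Poisson with mean 4 × 64 = 256) and applies the continuity correction to the Poisson questions. The variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

# Call centre: X ~ Poisson(64) per hour, approximated by N(64, 64).
one_hour = NormalDist(mu=64, sigma=sqrt(64))
# P(X >= 70) -> P(X > 69.5) under the normal approximation.
p_70_or_more = 1 - one_hour.cdf(69.5)

# Four hours: assuming independent hours, the total is Poisson(256),
# approximated by N(256, 256).
four_hours = NormalDist(mu=256, sigma=sqrt(256))
# P(Y < 240) -> P(Y < 239.5) under the normal approximation.
p_under_240 = four_hours.cdf(239.5)

# Salaries: mean 50,000 and sd 20,000, so the mean of n salaries
# is N(50000, 20000**2 / n), i.e. sd 20000 / sqrt(n).
mean_of_25 = NormalDist(mu=50_000, sigma=20_000 / sqrt(25))
p_mean_over_58k = 1 - mean_of_25.cdf(58_000)

mean_of_100 = NormalDist(mu=50_000, sigma=20_000 / sqrt(100))
p_mean_between = mean_of_100.cdf(52_000) - mean_of_100.cdf(48_000)

print(f"P(70 or more calls in an hour)     ~ {p_70_or_more:.4f}")
print(f"P(less than 240 calls in 4 hours)  ~ {p_under_240:.4f}")
print(f"P(mean of 25 salaries > 58,000)    = {p_mean_over_58k:.4f}")
print(f"P(48,000 < mean of 100 < 52,000)   = {p_mean_between:.4f}")
```

No continuity correction is used for the salary questions because salaries are modelled as continuous, and the sample mean of normal observations is exactly normal.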