MTHE/STAT 351: 5 – Special Discrete Distributions
T. Linder, Queen's University, Fall 2016

Bernoulli and Binomial Random Variables

Suppose a random experiment is conducted in which the occurrence of an event $A$ having probability $P(A) = p$ is observed. Let's call the experiment a success if $A$ occurs and a failure if $A^c$ occurs. This simple experiment is called a Bernoulli trial with probability of success $p$.

The indicator of $A$,
$$Y = \begin{cases} 1 & \text{if } A \text{ occurs} \\ 0 & \text{if } A \text{ does not occur} \end{cases}$$
is called a Bernoulli random variable with parameter $p$. The pmf of $Y$ is
$$p(0) = P(Y = 0) = 1 - p, \qquad p(1) = P(Y = 1) = p$$

The prototypical example of a Bernoulli trial is flipping a biased coin. We can call heads success and tails failure, so that $P(\text{heads}) = P(Y = 1) = p$ and $P(\text{tails}) = P(Y = 0) = 1 - p$.

Let's calculate the mean and the variance of $Y$:
$$E(Y) = 0 \cdot (1-p) + 1 \cdot p = p$$
and
$$E(Y^2) = 0^2 \cdot (1-p) + 1^2 \cdot p = p$$
Thus
$$\mathrm{Var}(Y) = E(Y^2) - [E(Y)]^2 = p - p^2 = p(1-p)$$

Now consider repeating the trial $n$ times. We assume that the trials are independent of each other. The number of successes $X$ is a random variable which can take the values $0, 1, 2, \ldots, n$. Since each trial is a success with probability $p$ and a failure with probability $1-p$, the probability that there are exactly $k$ successes is the same as the probability of getting $k$ heads in $n$ tosses of a biased coin with $P(\text{heads}) = p$. Thus, as we have seen earlier,
$$P(X = k) = P(\text{exactly } k \text{ heads}) = \binom{n}{k} p^k (1-p)^{n-k}$$

Definition. The number of successes $X$ in $n$ independent Bernoulli trials with success probability $p$ is called a binomial random variable with parameters $(n, p)$. Its pmf is given by
$$p(k) = P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}, \qquad k = 0, 1, 2, \ldots, n$$

Example: 5 fair dice are rolled.
What is the probability that more than one 6 is obtained?

Solution: We assume that rolling 5 fair dice at once is equivalent to rolling one fair die 5 times. If rolling a 6 is called success, then the number of 6's is a binomial r.v. $X$ with parameters $(5, 1/6)$. We want $P(X > 1)$:
$$P(X > 1) = 1 - P(X = 0 \text{ or } X = 1) = 1 - P(X = 0) - P(X = 1)$$
$$= 1 - \binom{5}{0}\left(\frac{1}{6}\right)^0\left(\frac{5}{6}\right)^5 - \binom{5}{1}\left(\frac{1}{6}\right)^1\left(\frac{5}{6}\right)^4 \approx 0.2$$

Let's check that $p(k)$ is indeed a valid pmf. Recalling the binomial theorem $(x+y)^n = \sum_{k=0}^{n} \binom{n}{k} x^k y^{n-k}$, we have
$$\sum_{k=0}^{n} p(k) = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} = [p + (1-p)]^n = 1^n = 1$$

Expected value and variance of a binomial r.v.

Let $X$ be binomial with parameters $(n, p)$. Intuitively, since $X$ is the number of successes in $n$ trials for an event with probability $p$, the average number of successes should be $np$. Let's show that this is indeed the case. By definition,
$$E(X) = \sum_{k=0}^{n} k\,p(k) = \sum_{k=0}^{n} k \binom{n}{k} p^k (1-p)^{n-k}$$
We notice that
$$k\binom{n}{k} = k\,\frac{n!}{k!\,(n-k)!} = \frac{n!}{(k-1)!\,(n-k)!} = n\,\frac{(n-1)!}{(k-1)!\,(n-k)!} = n\binom{n-1}{k-1}$$
Thus
$$E(X) = \sum_{k=1}^{n} n\binom{n-1}{k-1} p^k (1-p)^{n-k} = np \sum_{k=1}^{n} \binom{n-1}{k-1} p^{k-1}(1-p)^{n-k}$$
$$= np \sum_{j=0}^{n-1} \binom{n-1}{j} p^{j}(1-p)^{n-1-j} \quad (\text{letting } j = k-1)$$
$$= np\,[p + (1-p)]^{n-1} = np \quad (\text{by the binomial theorem})$$
We obtained $E(X) = np$.

Example: In a factory manufacturing a certain item, on average each batch of 10,000 items contains 50 defective items. What is the probability that there are fewer than 2 defective items in a batch of 100?

Solution: Assuming the items are defective independently of each other with probability $p$, the number of defective items in a group of 100 is a binomial r.v. $X$ with parameters $(100, p)$.
We know that on average there are 0.5 defectives among 100 items, so
$$0.5 = E(X) = 100p \implies p = \frac{0.5}{100} = 0.005$$
Thus
$$P(X < 2) = P(X = 0) + P(X = 1) = \binom{100}{0}(0.005)^0(0.995)^{100} + \binom{100}{1}(0.005)^1(0.995)^{99} \approx 0.91$$

A similar, but more involved manipulation can be used to calculate $E(X^2)$, from which one can obtain
$$\mathrm{Var}(X) = np(1-p)$$

Geometric Random Variables

Suppose we keep repeating Bernoulli trials (success with prob. $p$) until the first success occurs. The number of trials $X$ needed is called a geometric random variable with parameter $p$.

Note: the range of $X$ is $\{1, 2, 3, \ldots\}$, the set of positive integers.

Example: A biased coin with $P(\text{heads}) = p$ is flipped until the first head is obtained. The number of flips $X$ is a geometric r.v. By the independence of the tosses,
$$P(X = n) = P(T_1 T_2 \cdots T_{n-1} H_n) = (1-p)^{n-1} p, \qquad n = 1, 2, 3, \ldots$$

The previous argument holds in general. Thus the pmf of a geometric r.v. with parameter $p$ $(0 < p < 1)$ is
$$p(n) = P(X = n) = (1-p)^{n-1} p, \qquad n = 1, 2, 3, \ldots$$

Let's check that this is indeed a valid pmf. Let $q = 1 - p$. Then
$$\sum_{n=1}^{\infty} p(n) = \sum_{n=1}^{\infty} q^{n-1} p = p \sum_{n=1}^{\infty} q^{n-1} = p\,\frac{1}{1-q} = p\,\frac{1}{p} = 1$$

Example: Suppose a card is randomly drawn from a deck of 52 until the first spade comes up. If each drawn card is replaced before the next is drawn, what is the probability that

(a) exactly $n$ draws are needed?
(b) at least 6 draws are needed?

Solution: Let's call it a success if a spade is picked. Since we replace the drawn cards, the trials are independent, each with probability of success $p = 13/52 = 1/4$. The number of draws is a geometric r.v. $X$ with parameter $1/4$.
(a) From the pmf of a geometric r.v.,
$$P(X = n) = \left(\frac{3}{4}\right)^{n-1}\frac{1}{4}$$

(b) Let $q = 1 - p = 3/4$. Then
$$P(X \ge 6) = \sum_{k=6}^{\infty} P(X = k) = \sum_{k=6}^{\infty} q^{k-1} p = q^5 p \sum_{j=0}^{\infty} q^{j} \quad (\text{letting } j = k - 6)$$
$$= q^5 p\,\frac{1}{1-q} = q^5 = \left(\frac{3}{4}\right)^5 \approx 0.237$$

Second solution: The event $\{X \ge 6\}$ occurs if and only if the first 5 cards are not spades. Thus
$$P(X \ge 6) = P(\text{first 5 cards not spades}) = \left(\frac{3}{4}\right)^5$$

Mean and variance of a geometric r.v.

Let $X$ be a geometric r.v. with parameter $p$ and set $q = 1 - p$. Then
$$E(X) = \sum_{n=1}^{\infty} n\,p(n) = \sum_{n=1}^{\infty} n q^{n-1} p = p \sum_{n=1}^{\infty} \frac{d}{dq}\,q^n \quad \left(\text{since } \frac{d}{dx}x^n = n x^{n-1}\right)$$
$$= p\,\frac{d}{dq}\left(\sum_{n=0}^{\infty} q^n\right) = p\,\frac{d}{dq}\left(\frac{1}{1-q}\right) = \frac{p}{(1-q)^2} = \frac{1}{p}$$
Thus
$$E(X) = \frac{1}{p}$$
Using the same technique, one can show that
$$E(X^2) = \frac{2}{p^2} - \frac{1}{p}$$
Since $E(X) = 1/p$, we get
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \frac{2}{p^2} - \frac{1}{p} - \frac{1}{p^2} = \frac{1-p}{p^2}$$

Negative Binomial Random Variables

Suppose we keep repeating Bernoulli trials (success with prob. $p$) until a total of $r$ successes accumulate. The number of trials $X$ needed is called a negative binomial random variable with parameters $(r, p)$. The possible values of $X$ are $r, r+1, r+2, \ldots$

Define the events
$$A_{r-1,\,n-1} = \text{``$r-1$ successes in the first $n-1$ trials''}, \qquad B_n = \text{``$n$th trial is a success''}$$
For $n = r, r+1, r+2, \ldots$ we have
$$P(X = n) = P(A_{r-1,\,n-1} \cap B_n) = P(A_{r-1,\,n-1})\,P(B_n) \quad (\text{since the trials are independent})$$
$$= \binom{n-1}{r-1} p^{r-1}(1-p)^{(n-1)-(r-1)} \cdot p = \binom{n-1}{r-1} p^{r}(1-p)^{n-r}$$

Thus the pmf of a negative binomial random variable with parameters $(r, p)$ is
$$p(n) = \binom{n-1}{r-1} p^{r}(1-p)^{n-r}, \qquad n = r, r+1, r+2, \ldots$$
One can analytically check that $\sum_{n=r}^{\infty} p(n) = 1$, and that
$$E(X) = \frac{r}{p} \qquad \text{and} \qquad \mathrm{Var}(X) = \frac{r(1-p)}{p^2}$$

The following distinction between the binomial and negative binomial (e.g. geometric) random variables is worth emphasizing: a binomial random variable counts the number of successes $k$ in a fixed number $n$ of Bernoulli trials, while a negative binomial random variable counts the number of trials $n$ needed for a fixed number $r$ of successes to occur.

Note: A geometric r.v. with parameter $p$ is a negative binomial r.v. with parameters $(1, p)$.

Example: In the American League Championship Series (ALCS) the Yankees play the Red Sox. The team that records its 4th win wins the series. Suppose $P(\text{Yankees win a game}) = 0.6$ and that the games are won and lost independently of each other. Find the probability $P(\text{Yankees win in 7 games})$.

Solution: If we call a Yankees' win a success, then we want to find the probability that 7 trials are needed until 4 successes occur. If $X$ is a negative binomial r.v. with parameters $(4, 0.6)$, then
$$P(\text{Yankees win in 7 games}) = P(X = 7) = \binom{6}{3}(0.6)^4(0.4)^3 \approx 0.166$$

Note: Because the ALCS is a best-of-seven playoff, it is not true that $Y$ = # of games until the Yankees win 4 is a negative binomial r.v.

The Poisson R.V. and the Poisson Process

Definition. A discrete r.v. $X$ with possible values $0, 1, 2, \ldots$ is said to be a Poisson random variable with parameter $\lambda > 0$ if its pmf is
$$p(i) = P(X = i) = \frac{e^{-\lambda}\lambda^{i}}{i!}, \qquad i = 0, 1, 2, \ldots$$

Check that this is a valid pmf:
$$\sum_{i=0}^{\infty} p(i) = \sum_{i=0}^{\infty} \frac{e^{-\lambda}\lambda^{i}}{i!} = e^{-\lambda}\underbrace{\sum_{i=0}^{\infty} \frac{\lambda^{i}}{i!}}_{=\,e^{\lambda}} = 1$$
since $\sum_{i=0}^{\infty} x^i/i! = e^x$ for all $x \in \mathbb{R}$.

Poisson r.v.'s do not arise as naturally from simple experiments as, say, binomial r.v.'s, but in fact they have a tremendous scope of applications in a large number of seemingly unrelated areas of science. This is mostly due to the fact that the pmf of a binomial $(n, p)$ r.v. can be approximated by the Poisson pmf if $n$ is large and $p$ is small such that $np$ is "moderate." In particular, if $X$ is such a binomial r.v., then
$$P(X = i) = \binom{n}{i} p^{i}(1-p)^{n-i} \approx \frac{e^{-\lambda}\lambda^{i}}{i!}, \qquad \text{where } \lambda = np.$$
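As a numerical sanity check of this approximation (a Python sketch, not part of the original slides; the function names `binom_pmf` and `poisson_pmf` are my own), we can compare the binomial pmf for large $n$ and small $p$ with the Poisson pmf with $\lambda = np$:

```python
import math

def binom_pmf(n, p, k):
    # P(X = k) for a binomial (n, p) random variable
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(lam, i):
    # P(X = i) for a Poisson random variable with parameter lam
    return math.exp(-lam) * lam**i / math.factorial(i)

# n large, p small, np moderate (here np = 5): the two pmfs nearly agree
n, p = 1000, 0.005
for k in range(8):
    print(f"k={k}: binomial {binom_pmf(n, p, k):.5f}, Poisson {poisson_pmf(n*p, k):.5f}")
```

The two printed columns agree closely term by term, which is exactly the approximation property stated above.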
We will prove the following more precise statement: for any $i = 0, 1, 2, \ldots$,
$$\lim_{\substack{n \to \infty,\; p \to 0 \\ np \to \lambda}} \binom{n}{i} p^{i}(1-p)^{n-i} = \frac{e^{-\lambda}\lambda^{i}}{i!}$$

We have
$$\binom{n}{i} p^{i}(1-p)^{n-i} = \frac{n!}{(n-i)!\,i!}\,p^{i}(1-p)^{n-i} = \frac{n(n-1)\cdots(n-i+1)}{i!}\,p^{i}\,\frac{(1-p)^{n}}{(1-p)^{i}}$$
$$= \frac{n(n-1)\cdots(n-i+1)}{n^{i}} \cdot \frac{(pn)^{i}}{i!} \cdot \frac{(1-p)^{n}}{(1-p)^{i}}$$
Now
$$\lim_{n\to\infty} \frac{n(n-1)\cdots(n-i+1)}{n^{i}} = 1, \qquad \lim_{p\to 0}\,(1-p)^{i} = 1, \qquad \lim_{np\to\lambda} \frac{(pn)^{i}}{i!} = \frac{\lambda^{i}}{i!}$$
and
$$\lim_{\substack{n\to\infty \\ np\to\lambda}} (1-p)^{n} = \lim_{n\to\infty}\left(1 - \frac{\lambda}{n}\right)^{n} = e^{-\lambda}$$
Thus
$$\lim_{\substack{n \to \infty,\; p \to 0 \\ np \to \lambda}} \binom{n}{i} p^{i}(1-p)^{n-i} = \frac{e^{-\lambda}\lambda^{i}}{i!}$$

How small is small for $p$, and what does "moderate" mean for $np$? According to the text, the Poisson approximation to the binomial is quite accurate if $p < 0.1$ and $np \le 10$.

The following examples are just a few of the many random variables that obey the Poisson distribution due to this approximation property:

- The number of misprints on the frames shown in this lecture
- The number of wrong phone numbers dialed in a day in a given city
- The number of customers entering a post office in a day
- The number of video games bought in a day in a local electronics store

Example: Suppose eggs sold at a certain place are spoiled with probability 0.1. What is the probability that a carton of a dozen eggs contains no more than one spoiled egg?

Solution: We assume that the eggs are spoiled independently. Thus the number of spoiled eggs in a dozen is a binomial r.v. $X$ with parameters $(12, 0.1)$. We obtain
$$P(X \le 1) = P(X = 0) + P(X = 1) = \binom{12}{0}(0.1)^0(0.9)^{12} + \binom{12}{1}(0.1)^1(0.9)^{11} = 0.65900$$
On the other hand, we can use the Poisson approximation with $\lambda = np = 1.2$ to obtain
$$P(X \le 1) \approx \frac{e^{-1.2}(1.2)^0}{0!} + \frac{e^{-1.2}(1.2)^1}{1!} = e^{-1.2} + 1.2\,e^{-1.2} = 0.66263$$

Example: Suppose the number of misprints on a book page is Poisson with parameter $\lambda = 1/3$. What is the probability that a given page has at least one misprint?

Solution: If $X$ denotes the number of misprints, then
$$P(X \ge 1) = 1 - P(X = 0) = 1 - \frac{e^{-1/3}(1/3)^0}{0!} = 1 - e^{-1/3} \approx 0.283$$

Expectation of a Poisson r.v.
$$E(X) = \sum_{i=0}^{\infty} i\,p(i) = \sum_{i=0}^{\infty} i\,\frac{e^{-\lambda}\lambda^{i}}{i!} = \sum_{i=1}^{\infty} \frac{e^{-\lambda}\lambda^{i}}{(i-1)!} = \lambda \underbrace{\sum_{j=0}^{\infty} \frac{e^{-\lambda}\lambda^{j}}{j!}}_{=\,1} = \lambda \quad (\text{letting } j = i-1)$$
Thus $E(X) = \lambda$.

Note: This could have been guessed from the Poisson approximation to the binomial, since the expected value of a binomial $(n, p)$ r.v. is $np = \lambda$.

Variance of a Poisson r.v.
$$E(X^2) = \sum_{i=0}^{\infty} i^{2}\,\frac{e^{-\lambda}\lambda^{i}}{i!} = \lambda \sum_{i=1}^{\infty} i\,\frac{e^{-\lambda}\lambda^{i-1}}{(i-1)!} = \lambda \sum_{j=0}^{\infty} (j+1)\,\frac{e^{-\lambda}\lambda^{j}}{j!} \quad (\text{letting } j = i-1)$$
$$= \lambda\Bigg(\underbrace{\sum_{j=0}^{\infty} j\,\frac{e^{-\lambda}\lambda^{j}}{j!}}_{=\,E(X)\,=\,\lambda} + \underbrace{\sum_{j=0}^{\infty} \frac{e^{-\lambda}\lambda^{j}}{j!}}_{=\,1}\Bigg) = \lambda(\lambda + 1) = \lambda^{2} + \lambda$$
Thus
$$\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = (\lambda^{2} + \lambda) - \lambda^{2} = \lambda$$
Again, this could have been guessed since the variance of a binomial $(n, p)$ r.v. is $np(1-p)$, and if $p$ is small, then $np(1-p) \approx np = \lambda$.

Example: On a typical week 35 spam emails get through my spam filter. If I don't read my email for an entire day, what is the probability that there will be more than 3 spams in my mailbox when I log in?

Solution: We assume that the number of spams I receive during a 24-hour period is a Poisson r.v. $X$ with parameter $\lambda$. Since I get an average of $35/7 = 5$ spams a day, $\lambda = E(X) = 5$. Thus
$$P(X > 3) = 1 - P(X \le 3) = 1 - \sum_{i=0}^{3} \frac{e^{-5}\,5^{i}}{i!} = 1 - e^{-5}\left(\frac{5^0}{0!} + \frac{5^1}{1!} + \frac{5^2}{2!} + \frac{5^3}{3!}\right) \approx 0.735$$
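The spam calculation can be reproduced in a few lines of Python (a sketch for checking the arithmetic, not part of the original slides):

```python
import math

def poisson_pmf(lam, i):
    # P(X = i) for a Poisson random variable with parameter lam
    return math.exp(-lam) * lam**i / math.factorial(i)

lam = 35 / 7                                              # average of 5 spams per day
p_at_most_3 = sum(poisson_pmf(lam, i) for i in range(4))  # P(X <= 3)
p_more_than_3 = 1 - p_at_most_3                           # P(X > 3)
print(round(p_more_than_3, 3))                            # ≈ 0.735, as computed above
```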
The Poisson process

Recall that a binomial r.v. counts the number of successes in $n$ trials. A possible generalization of such a "counting" random variable arises when events occur at certain points in time. For example:

- The number of customers entering a bank, post office, etc. in a time interval of length $t$
- The number of earthquakes in a certain area during a time interval of length $t$
- The number of telephone calls coming into a hotel switchboard in a time interval of length $t$

In all these examples, the number of "events" is a random variable which obviously depends on the length $t$ of the time interval. Such a collection of random variables, parametrized by a time index, is called a process.

Let
$$N(t) = \#\text{ of events in the time interval } [0, t]$$
Then for each $t$, $N(t)$ is a random variable taking the values $0, 1, 2, \ldots$. We will see that the pmf of $N(t)$ can be found if we assume some regularity conditions on the way the events occur in time.

Recall the $o(t)$ notation: if $f(t)$ is a nonnegative function of $t \ge 0$, then we write
$$f(t) = o(t) \quad \text{if} \quad \lim_{t\to 0} \frac{f(t)}{t} = 0$$
That is, $f(t) = o(t)$ if $f(t)$ converges to zero faster than $t$ as $t \to 0$. For example, $f(t) = t^{\alpha}$ is $o(t)$ if $\alpha > 1$, but $f(t) = t^{1/2}$ is not $o(t)$.

We make the following assumptions on $N(t)$:

(1) Orderliness. The probability that exactly one event occurs in $[0, t]$ is proportional to the length $t$ of the interval if $t$ is small. Formally, there is a $\lambda > 0$ such that
$$P(N(t) = 1) = \lambda t + o(t)$$
Also, the probability that 2 or more events occur in $[0, t]$ is negligible compared to the length $t$ of the interval if $t$ is small:
$$P(N(t) \ge 2) = o(t)$$

(2) Stationarity. The probability that $k$ events occur in the interval $[0, t]$ is the same as the probability that $k$ events occur in the interval $[\tau, \tau + t]$ for all $\tau \ge 0$.

(3) Independent increments. Let $I_1, I_2, \ldots, I_n$ be non-overlapping intervals and $j_1, j_2, \ldots, j_n$ nonnegative integers. If $A_i$ denotes the event that $j_i$ events occur in $I_i$, then $A_1, A_2, \ldots, A_n$ are independent events.
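The $o(t)$ notation can be illustrated numerically (a quick sketch, not part of the original slides): $f(t)/t$ tends to 0 for $f(t) = t^2$, but blows up for $f(t) = t^{1/2}$.

```python
# Compare f(t)/t as t -> 0 for f(t) = t**2 (which is o(t))
# and f(t) = t**0.5 (which is not o(t)).
for t in [0.1, 0.01, 0.001, 0.0001]:
    print(f"t={t}: t^2/t = {t**2 / t:.4f}, t^0.5/t = {t**0.5 / t:.1f}")
```

The first ratio shrinks with $t$ while the second grows without bound, matching the examples above.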
The following can be proved (but we won't do it here):

Theorem 1. If the event counting process $N(t)$ satisfies Assumptions 1, 2, and 3, then
$$P\big(N(t) = i\big) = \frac{e^{-\lambda t}(\lambda t)^{i}}{i!}, \qquad i = 0, 1, 2, \ldots$$

The theorem states that for any $t > 0$, the random variable $N(t)$ is Poisson with parameter $\lambda t$. From our previous calculations,
$$E[N(t)] = \lambda t, \qquad \mathrm{Var}\big(N(t)\big) = \lambda t$$

Note: Since $E[N(t)] = \lambda t$, the parameter $\lambda$ represents the expected number of events per unit time. For this reason, $\lambda$ is called the rate of the process.

Example: It is observed that on average 10 customers enter a bank in an hour.

(a) Assuming the Poisson model, what is the probability that in the next hour 4 customers come in?

Solution: Let $N(t)$ be the number of customers entering in $t$ hours. Then $N(t)$ is a Poisson process with rate $\lambda = E[N(1)] = 10$. Thus
$$P\big(N(1) = 4\big) = \frac{e^{-10}\,10^{4}}{4!} \approx 0.019$$

(b) What is the probability that during the next 5 hours 52 customers come in?

Solution: We have $\lambda = 10$ and $t = 5$, so
$$P\big(N(5) = 52\big) = \frac{e^{-10 \cdot 5}(10 \cdot 5)^{52}}{52!} \approx 0.053$$

(c) What is the probability that in the next half hour the bank gets no more than 2 customers?

Solution:
$$P\big(N(1/2) \le 2\big) = \sum_{i=0}^{2} P\big(N(1/2) = i\big) = \sum_{i=0}^{2} \frac{e^{-10/2}(10/2)^{i}}{i!} \approx 0.125$$

Example from text: Suppose that in a certain region of California earthquakes occur according to a Poisson process at a rate of 7 a year.

(a) What is the probability of no earthquakes in a year?

Solution: Use years as the unit of time. Then $\lambda = E[N(1)] = 7$ and
$$P\big(N(1) = 0\big) = \frac{e^{-7} \cdot 7^{0}}{0!} = e^{-7} \approx 0.00091$$

(b) What is the probability that during the next 10 years there will be a year without earthquakes?

Solution: Let's call it a "success" if no earthquakes occur in a given year. We know from part (a) that $P(\text{success}) = e^{-7}$. By Assumption 3, the occurrences of successes in 10 consecutive years are independent events. Thus if
$$X = \#\text{ of years out of the next 10 in which no earthquakes occur}$$
then $X$ is a binomial r.v. with parameters $(n, p) = (10, e^{-7})$. This gives
$$P(X > 0) = 1 - P(X = 0) = 1 - \binom{n}{0}p^{0}(1-p)^{n-0} = 1 - (1 - e^{-7})^{10} \approx 0.009$$
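Both parts of the earthquake example can be verified numerically (a Python sketch, not part of the original slides):

```python
import math

# Part (a): P(no earthquakes in a year) for a Poisson process with rate 7 per year
p_quiet_year = math.exp(-7)

# Part (b): X = number of earthquake-free years among the next 10 is
# binomial with parameters (10, e^-7), so P(X > 0) = 1 - (1 - p)^10
p_some_quiet_year = 1 - (1 - p_quiet_year) ** 10

print(round(p_quiet_year, 5), round(p_some_quiet_year, 3))  # ≈ 0.00091 and 0.009
```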