ST2334: SOME NOTES ON THE GEOMETRIC AND NEGATIVE BINOMIAL DISTRIBUTIONS AND MOMENT GENERATING FUNCTIONS

Geometric Distribution

Consider a sequence of independent and identically distributed Bernoulli trials with success probability $p \in (0, 1)$. Define the random variable $X$ as the number of trials until we see a success, including the successful trial (for example, if I flip a coin which shows heads (a success) with probability $p$, then $X$ is the number of flips to obtain a head, including the successful flip). We know that
\[
X \in \mathcal{X} = \{1, 2, \dots\},
\]
that is, $X$ is a positive integer. Now, what is the probability that $X$ takes the value $x$? Suppose $X = 1$; then we must have
\[
P(X = 1) = p,
\]
because we have only one Bernoulli trial and it is a success. Suppose now $X = 2$; then
\[
P(X = 2) = (1 - p)p,
\]
because we have two Bernoulli trials, the first a failure and the second a success. Similarly, $P(X = 3) = (1 - p)^2 p$. Thus it follows that
\[
P(X = x) = f(x) = (1 - p)^{x - 1} p, \qquad x \in \mathcal{X} = \{1, 2, \dots\}.
\]
Any random variable with the above PMF is said to have a geometric distribution, and we write $X \sim \mathrm{Ge}(p)$. The distribution function, for $x \in \mathcal{X}$, is
\[
F(x) = \sum_{y=1}^{x} (1 - p)^{y - 1} p
     = p \sum_{y=1}^{x} (1 - p)^{y - 1}
     = p \cdot \frac{1 - (1 - p)^x}{p}
     = 1 - (1 - p)^x.
\]
Calculating the expectation directly is somewhat tedious:
\begin{align*}
E[X] &= \sum_{x=1}^{\infty} x (1 - p)^{x - 1} p
      = p \sum_{x=1}^{\infty} \frac{d}{dp}\Big[ -(1 - p)^x \Big]
      = p \, \frac{d}{dp}\Big[ -\sum_{x=1}^{\infty} (1 - p)^x \Big] \\
     &= p \, \frac{d}{dp}\Big[ -\frac{1 - p}{p} \Big]
      = p \cdot \frac{1}{p^2}
      = \frac{1}{p}.
\end{align*}
Perhaps an easier way (and this is the case for $E[X^q]$, $q \ge 1$) is via the MGF:
\begin{align*}
M(t) = E[e^{Xt}] &= \sum_{x=1}^{\infty} e^{xt} (1 - p)^{x - 1} p
                  = \frac{p}{1 - p} \sum_{x=1}^{\infty} \big[ (1 - p) e^t \big]^x \\
                 &= \frac{p}{1 - p} \cdot \frac{(1 - p) e^t}{1 - (1 - p) e^t}
                  = \frac{p e^t}{1 - (1 - p) e^t},
\end{align*}
where we have assumed that $(1 - p) e^t < 1$, i.e. $t \in T = \{ t \in \mathbb{R} : t < \log(1/(1 - p)) \}$. Then
\[
M'(t) = \frac{p e^t}{1 - (1 - p) e^t} + \frac{p (1 - p) e^{2t}}{\big(1 - (1 - p) e^t\big)^2}.
\]
Setting $t = 0$, we have
\[
M'(0) = 1 + \frac{1 - p}{p} = \frac{1}{p}.
\]
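These identities are easy to check numerically. The following Python sketch is an illustration that is not part of the original notes; it assumes NumPy and SciPy are available, and the choice $p = 0.3$ is arbitrary. Note that `scipy.stats.geom` uses the same trials-including-the-success convention as above, and $E[X] = 1/p$ is recovered by numerically differentiating $M(t)$ at $t = 0$.

```python
import numpy as np
from scipy.stats import geom

p = 0.3                   # arbitrary success probability
x = np.arange(1, 11)      # a few support points x = 1, ..., 10

# PMF f(x) = (1 - p)^(x - 1) p and CDF F(x) = 1 - (1 - p)^x, as derived above
pmf_manual = (1 - p) ** (x - 1) * p
cdf_manual = 1 - (1 - p) ** x
assert np.allclose(pmf_manual, geom.pmf(x, p))
assert np.allclose(cdf_manual, geom.cdf(x, p))

def M(t):
    # MGF of Ge(p), valid for t < log(1/(1 - p))
    return p * np.exp(t) / (1 - (1 - p) * np.exp(t))

# Central-difference approximation to M'(0), which should equal E[X] = 1/p
h = 1e-6
print((M(h) - M(-h)) / (2 * h), 1 / p)  # both approximately 3.3333
```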
Negative Binomial Distribution

Consider again a sequence of independent and identically distributed Bernoulli trials with success probability $p \in (0, 1)$. Define the random variable $X$ as the number of trials until we see $r \ge 1$ successes, including the $r$th successful trial (for example, if I flip a coin which shows heads (a success) with probability $p$, then $X$ is the number of flips to obtain $r$ heads, including the $r$th successful flip). We know that $X \in \mathcal{X} = \{r, r + 1, \dots\}$; that is, to achieve $r$ successes, the minimum number of trials we can have is $r$, and since $p \in (0, 1)$, we do not know when the trials will stop.

Now, what is the probability that $X$ takes the value $x$? For $r = 1$ we have already obtained the solution (that is the geometric distribution). Suppose that $r = 2$. Then we have, for $x \in \{2, 3, \dots\}$,
\[
P(X = x) = (x - 1)(1 - p)^{x - 2} p^2.
\]
The logic is as follows: we must have two successes and hence $x - 2$ failures, which accounts for the $(1 - p)^{x - 2} p^2$ part; then, since the last success is at trial $x$, the first success must lie in one of the first $x - 1$ trials, which is why we multiply by $x - 1$ (remember the trials are identical). Now suppose that $r = 3$. Then we have, for $x \in \{3, 4, \dots\}$,
\[
P(X = x) = \binom{x - 1}{2} (1 - p)^{x - 3} p^3.
\]
The logic is as follows: we must have three successes and hence $x - 3$ failures, which accounts for the $(1 - p)^{x - 3} p^3$ part; then, since the last success is at trial $x$, the first and second successes must lie among the first $x - 1$ trials, which is why we multiply by $\binom{x - 1}{2}$, the number of ways of picking two out of $x - 1$ when order does not matter.

Then, following this reasoning, we have for any $r \ge 1$
\[
P(X = x) = f(x) = \binom{x - 1}{r - 1} (1 - p)^{x - r} p^r, \qquad x \in \mathcal{X} = \{r, r + 1, \dots\}.
\]
A random variable with the above PMF is said to have a negative binomial distribution with parameters $r, p$, denoted $X \sim \mathrm{Ne}(r, p)$. The distribution function cannot typically be written down as an analytic expression (i.e. without a summation), and computing the expectation from the definition is a very tedious and tricky exercise. We focus on calculating the moment generating function:
\begin{align*}
M(t) = E[e^{Xt}] &= \sum_{x=r}^{\infty} e^{xt} \binom{x - 1}{r - 1} (1 - p)^{x - r} p^r \\
&= \sum_{x=r}^{\infty} \binom{x - 1}{r - 1} \big( (1 - p) e^t \big)^{x - r} (p e^t)^r \\
&= \left( \frac{p e^t}{1 - (1 - p) e^t} \right)^r \sum_{x=r}^{\infty} \binom{x - 1}{r - 1} \big( (1 - p) e^t \big)^{x - r} \big( 1 - (1 - p) e^t \big)^r \\
&= \left( \frac{p e^t}{1 - (1 - p) e^t} \right)^r
\end{align*}
if $t < \log(1/(1 - p))$, i.e. $t \in T = \{ t \in \mathbb{R} : t < \log(1/(1 - p)) \}$, where we have used the fact that the last summation is 1, as we are summing the PMF of a $\mathrm{Ne}\big(r, 1 - (1 - p) e^t\big)$ random variable over its support. Now
\[
M'(t) = r \left( \frac{p e^t}{1 - (1 - p) e^t} \right)^{r - 1} \left[ \frac{p e^t}{1 - (1 - p) e^t} + \frac{p (1 - p) e^{2t}}{\big(1 - (1 - p) e^t\big)^2} \right],
\]
so
\[
E[X] = M'(0) = \frac{r}{p}.
\]

Moment Generating Functions

We note that
\[
M'(t) = \frac{d}{dt} M(t) = \frac{d}{dt} E[e^{Xt}] = E\Big[ \frac{d}{dt} e^{Xt} \Big] = E[X e^{Xt}].
\]
Thus $M'(0) = E[X]$. We are assuming that it is legitimate to swap the order of summation and differentiation, which holds for all cases in this course. Using a similar approach, one can show that $M^{(2)}(0) = E[X^2]$.
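As a companion illustration (again not part of the original notes; the choices $r = 3$, $p = 0.3$ are arbitrary), the sketch below checks the $\mathrm{Ne}(r, p)$ PMF against `scipy.stats.nbinom`, which counts failures before the $r$th success rather than total trials, so its support is shifted by $r$. It also recovers $E[X] = r/p$ and $E[X^2]$ from numerical derivatives of $M(t)$ at $t = 0$, in the spirit of the final section above.

```python
import numpy as np
from scipy.stats import nbinom
from scipy.special import comb

r, p = 3, 0.3                # arbitrary parameters
x = np.arange(r, r + 20)     # a few support points x = r, r + 1, ...

# PMF f(x) = C(x-1, r-1) (1 - p)^(x - r) p^r, as derived above
pmf_manual = comb(x - 1, r - 1) * (1 - p) ** (x - r) * p ** r

# scipy's nbinom counts failures before the r-th success, so shift by r
assert np.allclose(pmf_manual, nbinom.pmf(x - r, r, p))

def M(t):
    # MGF of Ne(r, p), valid for t < log(1/(1 - p))
    return (p * np.exp(t) / (1 - (1 - p) * np.exp(t))) ** r

h = 1e-5
mean = (M(h) - M(-h)) / (2 * h)              # M'(0)  = E[X]  = r/p
second = (M(h) - 2 * M(0) + M(-h)) / h**2    # M''(0) = E[X^2]
print(mean, r / p)                           # both approximately 10.0
print(second - mean**2, nbinom.var(r, p))    # Var(X) approximately matches scipy
```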