Notes for ISyE 3232, Spring 2002
by Christos Alexopoulos
School of Industrial and Systems Engineering, Georgia Tech

Chapter 1  The Poisson Distribution and Poisson Process

This chapter contains a self-contained introduction to Poisson random variables and their applications in Poisson processes. Also included are applications of exponential and gamma random variables.

1.1 Definition of a Poisson Random Variable

A discrete random variable X has the Poisson distribution if

    p(x) = P(X = x) = λ^x e^{-λ} / x!   for x = 0, 1, ...,

where λ is a positive constant. The property Σ_{x=0}^∞ p(x) = 1 follows from the infinite series formula

    e^λ = Σ_{x=0}^∞ λ^x / x!.

This formula is also used to show that

    E(X) = λ,   Var(X) = λ.

A Poisson random variable is typically used to represent quantities like the following:

• The number of sales of a product in a week.
• The number of units of a product purchased in one sale.
• The number of sick-days an employee takes in a year.
• The number of vehicles entering an exit ramp in 30 minutes.
• The number of cases a court tries in a year.

Keep in mind that such quantities may also be represented by other discrete distributions (geometric, negative binomial, uniform, ...), depending on the particular situation. One chooses the type of distribution that best fits the situation based on logical principles, laws of nature, subjective judgement, or on a frequency study that yields a sample distribution that "fits" the distribution (there are statistical "goodness-of-fit" tests for this, covered in other courses).

The Poisson distribution has a wide range of applications dealing with the number of occurrences of an event over time. The rest of these notes describe such phenomena, which are called Poisson processes.

1.2 Poisson Process

In this section we shall describe a framework for representing random occurrences of an event over the entire (continuous) time axis instead of only at integer time points. Consider an event that may occur at any point in time.
Similar to the (discrete-time) assumptions A1, A2, we make the following (continuous-time) assumptions:

C1. The numbers of occurrences of the event in disjoint time intervals are independent.

C2. For any s ≥ 0,

    lim_{t→0} t^{-1} P(one occurrence of the event in [s, s+t]) = λ,
    lim_{t→0} t^{-1} P(2 or more occurrences of the event in [s, s+t]) = 0.

This implies that for the time interval [s, s+t], we have the approximations

    P(one occurrence of the event in [s, s+t]) ≈ λt,
    P(2 or more occurrences of the event in [s, s+t]) ≈ 0,

provided the length t of the interval [s, s+t] is very small.

Our interest is in the random variable N(t) that denotes the number of occurrences of the event in the time interval [0, t] (or up to time t). To determine the distribution of this random variable, it is convenient to consider the probability

    p_x(t) = P(N(t) = x)   for x = 0, 1, ...

as a function of t. Under the assumptions C1, C2, one can show that the function p_x(t) satisfies the differential equations

    p_0'(t) = -λ p_0(t),
    p_x'(t) = λ p_{x-1}(t) - λ p_x(t)   for x ≥ 1.

It is known that the solution to these equations is

    p_x(t) = (λt)^x e^{-λt} / x!.

(The procedure for obtaining this solution is involved and we will not cover it, but you can easily check that this is indeed the solution by substituting it into the differential equations.) This justifies that the random variable N(t), denoting the number of occurrences of the event up to time t, has the Poisson distribution with parameter λt. Also,

    E[N(t)] = Var[N(t)] = λt.

Note that λ = E[N(1)] is the expected number of occurrences of the event per unit time (or the rate of its occurrence). The family of random variables {N(t) : t ≥ 0} is called a Poisson process with rate λ.

In using a Poisson process for modeling a particular situation, one typically gives an argument that assumptions C1, C2 are reasonable.
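The approximations under C1, C2 also suggest a simulation sketch: chop [0, t] into many tiny intervals, let an event occur independently in each with probability λ·dt, and check that the resulting count has mean and variance close to λt. The parameters below (λ = 2, t = 1, the step and replication counts, and the seed) are arbitrary choices, not part of the notes:

```python
import random

random.seed(42)
lam, t = 2.0, 1.0
n_steps, n_reps = 1000, 4000
dt = t / n_steps  # each tiny interval holds one event w.p. ~ lam*dt (assumption C2)

counts = []
for _ in range(n_reps):
    # Bernoulli approximation of the process on [0, t]
    n = sum(1 for _ in range(n_steps) if random.random() < lam * dt)
    counts.append(n)

emp_mean = sum(counts) / n_reps
emp_var = sum((c - emp_mean) ** 2 for c in counts) / n_reps
print(emp_mean, emp_var)  # both should be close to lam*t = 2
```

As the number of steps grows, the Bernoulli approximation converges to a Poisson process, which is why both sample moments land near λt.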
Also, if it is feasible to obtain data on actual occurrences of the event over time, then one can use statistical procedures (covered in other courses) to test whether the occurrences "fit" a Poisson process. Here is a typical application of a Poisson process.

Example 1  A mail-order company is studying the efficiency of its operation. Its aim is to reduce the processing time per order for certain high-priority orders by phone. An important factor is the number of these phone orders that arrive throughout the day, and that will be our focus here. A typical approach for describing the random characteristics of these orders is to model them by a Poisson process as follows.

For simplicity, we shall assume that orders are placed throughout the day at about the same rate. (When this is not true, one can partition the day into periods in which the rate of orders within each period is about the same and do the following analysis for each period separately.) The orders are coming from all over the country from myriad potential customers, and so it is reasonable to assume that the numbers of orders in disjoint time intervals are independent. This means that if you observe, say, 20 orders in 10 minutes, it tells you nothing about the arrivals of orders in the next 10 minutes. Next, the orders do not arrive at exactly the same time, and so it is reasonable to assume that the probability of two or more arrivals in a small period of time is essentially 0. Finally, orders are arriving at about the same rate, and so it is reasonable to assume that the probability of exactly one arrival in a small time period of length t is directly proportional to t (call it λt). In summary, the Poisson assumptions C1, C2 are satisfied. Therefore, we can say that the orders arrive according to a Poisson process with rate λ. Then the number of orders N(t) in a time period of length t has a Poisson distribution with mean λt.
To answer specific questions about this random variable, we would need to know the value of λ. The typical approach is to estimate λ from past orders. (In some cases, one might choose the rate based on other criteria; for instance, when designing a new system one may choose an "anticipated" rate.) For this example, suppose that in a twenty-day period (160 hours), it was observed that the total number of orders was 2080. Then we use λ = 2080/160 = 13 orders per hour as the arrival rate.

Now we can answer questions about the number of orders in various time intervals. For instance, in the first half hour (t = .5), the number of orders N(.5) has a mean of E[N(.5)] = λt = (13)(.5) = 6.5, and some probabilities for it are

    P[N(.5) = 0] = e^{-6.5} = .0015,

    P[N(.5) ≤ 4] = Σ_{x=0}^{4} (6.5)^x e^{-6.5} / x! = .2237,

    P[N(.5) > 6] = 1 − P[N(.5) ≤ 6] = 1 − .5265 = .4735.

Also, the 90th percentile of N(.5) is 10, since an easy check (or a table lookup) shows that

    P[N(.5) ≤ 9] = .877 < .90 < .933 = P[N(.5) ≤ 10].

These probabilities are the same for the number of orders in any 30-minute period; it need not be the first 30 minutes. The reason is that, under the assumptions C1, C2, we can take any point in real time to be the time origin for the Poisson process. Also, keep in mind that the independence assumption implies, for instance, that the number of orders in the first hour is independent of the number of orders in the second hour (or in any following time period).

1.3 Time Between Occurrences of an Event

Suppose {N(t) : t ≥ 0} is a Poisson process with rate λ. Let Y denote the time to the first occurrence of an event in this process. (This is like the geometrically distributed time to the first occurrence of an event in a Bernoulli process.) The time Y is a continuous random variable. We can derive its distribution by relating it to the Poisson random variables.
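The calculations of Example 1 are easy to reproduce in a few lines of Python; `poisson_pmf` and `poisson_cdf` below are small helper functions written for this sketch, not library calls:

```python
import math

def poisson_pmf(x, lam):
    return lam ** x * math.exp(-lam) / math.factorial(x)

def poisson_cdf(x, lam):
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam_t = 13 * 0.5  # rate 13 orders/hour over a half hour -> mean 6.5

p0 = poisson_pmf(0, lam_t)         # P[N(.5) = 0]  ~ .0015
p_le4 = poisson_cdf(4, lam_t)      # P[N(.5) <= 4] ~ .2237
p_gt6 = 1 - poisson_cdf(6, lam_t)  # P[N(.5) > 6]  ~ .4735

# 90th percentile: smallest x with CDF >= .90
pct90 = next(x for x in range(100) if poisson_cdf(x, lam_t) >= 0.90)

print(p0, p_le4, p_gt6, pct90)
```

Running this recovers the three probabilities above and confirms that the 90th percentile is 10.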
Namely, note that the event {Y > t} (the first occurrence is after time t) is the same as the event {N(t) = 0} (there are no occurrences up to time t); they are just two different ways of saying the same thing. Then,

    P(Y > t) = P[N(t) = 0] = e^{-λt}   for t > 0,

and so

    P(Y ≤ t) = 1 − P(Y > t) = 1 − e^{-λt}.

Taking the derivative of this, we see that the density of Y is

    f(t) = λ e^{-λt}   for t > 0.

Therefore, we recognize the time Y to the first occurrence of the event as being an exponential random variable with mean λ^{-1} (or rate λ). Under the assumptions of the Poisson process, it also follows that the time between any two successive occurrences of the event is exponential with rate λ.

Example 2  In the preceding example for phone-in orders, consider the time Y between two successive orders. From the last paragraph, we know that Y is an exponential random variable with rate λ = 13. Remember that time is measured in hours. Then E(Y) = 1/λ = 1/13 = .077 hours, and typical probabilities for this time are

    P(Y > .10) = e^{-(13)(.1)} = .272,

    P(.1 < Y ≤ .2) = (1 − e^{-(13)(.2)}) − (1 − e^{-(13)(.1)}) = .198.

1.4 Time for Several Occurrences of an Event

Now, let's consider the time Z to the kth occurrence of the event. Again, this is related to the Poisson random variables. Namely, the event {Z ≤ t} (the kth event occurs before or at time t) is equivalent to the event {N(t) ≥ k} (at least k events occur up to time t). Then

    P(Z ≤ t) = P[N(t) ≥ k] = 1 − Σ_{x=0}^{k−1} (λt)^x e^{-λt} / x!.

Taking the derivative of this, we see that the density of Z is

    f(t) = λ^k t^{k−1} e^{-λt} / (k − 1)!   for t ≥ 0.

(Verify this as an exercise with derivatives.) This is called a gamma (or Erlang) density with order k and rate λ. For such a gamma random variable, we know (see the text) that

    E(Z) = k/λ,   Var(Z) = k/λ².

Example 3  In the mail-order example, consider the time Z to obtain 10 orders. We know that this is a gamma random variable with order 10 and rate λ = 13.
Typical values of interest are

    E(Z) = k/λ = 10/13 = .769 hours,   Var(Z) = k/λ² = 10/169 = .0592,

    P(Z ≤ 1.1) = 1 − Σ_{x=0}^{9} (λt)^x e^{-λt} / x! = 1 − .09594 = .90406,   where λt = (13)(1.1) = 14.3,

and

    P(Z > .8) = Σ_{x=0}^{9} (λt)^x e^{-λt} / x! = .40899,   where λt = (13)(.8) = 10.4.

1.5 Summary

A Poisson process of rate λ representing the occurrences of an event has the following properties:

• The number of occurrences N(t) of the event in an interval of length t has a Poisson distribution with mean λt.

• The time Y between occurrences of the event is exponentially distributed with mean 1/λ (or with rate λ).

• The time Z for k occurrences of the event has a gamma distribution with order k and rate λ.

These random variables are analogous to the binomial/geometric/negative binomial trinity for describing occurrences of an event in discrete time.
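As a closing check, the exponential and gamma quantities from Examples 2 and 3 can be computed directly; the Python sketch below uses only the formulas derived above (the helper `poisson_cdf` is written here, not a library call):

```python
import math

lam = 13  # orders per hour, estimated in Example 1

# Example 2: exponential time Y between successive orders.
mean_Y = 1 / lam                                           # E(Y), hours
p_Y_gt_01 = math.exp(-lam * 0.10)                          # P(Y > .10)
p_Y_between = math.exp(-lam * 0.1) - math.exp(-lam * 0.2)  # P(.1 < Y <= .2)

# Example 3: gamma (Erlang) time Z to the k-th order,
# computed through P(Z <= t) = P[N(t) >= k].
def poisson_cdf(x, mean):
    term, total = math.exp(-mean), 0.0
    for j in range(x + 1):
        total += term
        term *= mean / (j + 1)  # pmf recursion p(j+1) = p(j) * mean/(j+1)
    return total

k = 10
mean_Z = k / lam
var_Z = k / lam ** 2
p_Z_le_11 = 1 - poisson_cdf(k - 1, lam * 1.1)  # P(Z <= 1.1)
p_Z_gt_08 = poisson_cdf(k - 1, lam * 0.8)      # P(Z > .8) = P[N(.8) <= 9]

print(mean_Y, p_Y_gt_01, p_Y_between)
print(mean_Z, var_Z, p_Z_le_11, p_Z_gt_08)
```

The printed values match the ones worked out in Examples 2 and 3 (.077, .272, .198 and .769, .0592, .90406, .40899) up to rounding.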