Math 5652: Introduction to Stochastic Processes
Homework 4 solutions

(1) (10 points) During normal working hours (9am to 5pm), customers arrive to a queue as a Poisson process with rate 6 per hour. Between 9am and 11am, there were 10 arrivals. Compute the distribution (give the probability mass function) of the number of arrivals between the hours of 10am and noon.

Solution: Let x be the number of customers arriving between 9am and 10am, y the number between 10 and 11, and z the number between 11 and 12. There are two solution approaches here. The first is to write out the conditional probability explicitly:

P(y + z = k | x + y = 10) = \sum_a \frac{P(x = a,\; y = 10 - a,\; z = k - (10 - a))}{P(x + y = 10)}.

In the numerator we have three independent increments of a Poisson process, which are independent Poisson random variables, each with parameter 6/hr × 1 hr = 6; the probability in the numerator is therefore the product of the three individual probabilities. In the denominator we have a single Poisson random variable (with parameter 12).

What is the range of values for a (the number of arrivals 9 to 10)? It can't be more than 10 (since otherwise P(y = 10 − a) = 0), and if k is small then k − (10 − a) ≥ 0 forces a ≥ 10 − k (if there's one arrival 10 to 12 and 10 arrivals 9 to 11, then there are at least 9 arrivals 9 to 10). So a ranges from max(0, 10 − k) (if k > 10 then a can be 0, but it can't be negative) to 10. Thus,

P(y + z = k | x + y = 10) = \sum_{a=\max(0,\,10-k)}^{10} \frac{e^{-6}\frac{6^a}{a!}\; e^{-6}\frac{6^{10-a}}{(10-a)!}\; e^{-6}\frac{6^{k-(10-a)}}{(k-(10-a))!}}{e^{-12}\frac{12^{10}}{10!}}
= \sum_{a=\max(0,\,10-k)}^{10} \frac{10!}{a!\,(10-a)!}\Bigl(\frac{1}{2}\Bigr)^{a}\Bigl(\frac{1}{2}\Bigr)^{10-a}\; e^{-6}\,\frac{6^{k-(10-a)}}{(k-(10-a))!}.

You should recognize the first factor as the binomial distribution with parameters 10 and 1/2, and the second as the Poisson distribution with parameter 6.

An alternative way of doing this computation is to notice that, conditionally on the number of arrivals 9 to 11, the number of arrivals 10-11am is binomial with parameters 10 and 1/2 (the probability that a uniform arrival time falls in the interval from 10 to 11). The number of arrivals from 11 to 12 is Poisson with parameter 6, independent of anything that happened 9 to 11. Thus we are looking for the sum of a binomial and a Poisson random variable, and the answer is

P(y + z = k | x + y = 10) = \sum_{b=0}^{\min(10,\,k)} \binom{10}{b}\Bigl(\frac{1}{2}\Bigr)^{b}\Bigl(\frac{1}{2}\Bigr)^{10-b}\; e^{-6}\,\frac{6^{k-b}}{(k-b)!}.

Here b is the number of arrivals 10 to 11, so b = 10 − a in the previous expression.
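As a sanity check (not part of the original solution), here is a short Monte Carlo sketch comparing the conditional distribution above with the binomial-plus-Poisson formula; the use of Python/numpy and all variable names are my own choices.

import numpy as np
from math import comb, exp, factorial

rng = np.random.default_rng(0)
n = 2_000_000

# three hourly increments, each Poisson(6); condition on x + y = 10
x, y, z = rng.poisson(6, n), rng.poisson(6, n), rng.poisson(6, n)
samples = (y + z)[(x + y) == 10]

def pmf(k):
    # sum over b of the Binomial(10, 1/2) pmf at b times the Poisson(6) pmf at k - b
    return sum(comb(10, b) * 0.5**10 * exp(-6) * 6**(k - b) / factorial(k - b)
               for b in range(0, min(10, k) + 1))

for k in (3, 6, 10, 14):
    print(k, round(np.mean(samples == k), 4), round(pmf(k), 4))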
(2) (10 points) (Durrett 2.27) A math professor waits at the bus stop at the Mittag-Leffler Institute in the suburbs of Stockholm, Sweden. Since he has forgotten to find out about the bus schedule, his waiting time until the next bus is uniform on (0, 30 min). [Footnote: I haven't been to Mittag-Leffler, but I'm pretty sure buses come by it more often than the once an hour in the problem in the book.] Cars drive by the bus stop as a Poisson process at a rate of 6 per hour, and each car will take the professor into town with probability 1/3. What is the probability that he will end up riding the bus, i.e. that none of the cars he sees before the bus arrives agrees to take him into town?

Solution: Let's call a car "friendly" if it's willing to give the professor a lift. Each car is friendly with probability 1/3, so we're looking at a thinned Poisson process: the process counting friendly cars is Poisson with rate 1/3 times the total rate of car arrivals, i.e. it has rate 2 per hour. Consequently, the time until the arrival of the first friendly car is exponential with parameter 2/hr (mean 30 minutes). The time until the bus arrival is uniform on the interval from 0 to 30 minutes, i.e. from 0 to 1/2 hour. Measuring all times in hours,

P(bus comes before 1st friendly car) = \int_{x=0}^{1/2} \underbrace{2\,dx}_{\text{density of the uniform at } x} \int_{y=x}^{\infty} \underbrace{2e^{-2y}\,dy}_{\text{density of the exponential at } y} = \int_{x=0}^{1/2} 2\,\underbrace{e^{-2x}}_{P(E(2) > x)}\,dx = 1 - e^{-1} \approx 0.632.

(3) (20 points) (Durrett 2.50, 2.51) A copy editor reads a 200-page manuscript, finding 108 typos. [Footnote: this is actually a small number, at least for a textbook.] Suppose the author's typos follow a Poisson process with some unknown rate λ per page, while from long experience we know that the copyeditor finds 90% of the mistakes that are there.
(a) Compute the expected number of typos found as a function of the arrival rate λ.
(b) Use your answer to find an estimate of λ and the number of undiscovered typos.
Now suppose that two different copyeditors read a 300-page manuscript. The first one finds 100 typos, the second one finds 120, and their lists contain 80 errors in common. Suppose that the author's typos follow a Poisson process with some unknown rate λ per page (this is a different author, so a different λ from the previous book), and the two copyeditors catch mistakes independently, with unknown probabilities p_1 and p_2. Let X_0 be the number of typos not found by either copyeditor, X_1 the number found by copyeditor 1 alone, X_2 the number found by copyeditor 2 alone, and X_3 the number found by both of them.
(c) As a function of λ, p_1, and p_2, find the joint distribution of (X_0, X_1, X_2, X_3).
(d) Use your answer to estimate λ, p_1, p_2, and the number of undiscovered typos in the manuscript.

Solution:
(a) The process of typos that are found is Poisson with rate 0.9λ, by the theory of thinning Poisson processes. Thus, the mean number of typos found in 200 pages is 0.9λ · 200 = 180λ.
(b) Our goal here is to find a reasonable value for λ given that an observation of a Poisson random variable with parameter 180λ was equal to 108. There is a rich theory of how to do this sort of parameter fitting; let me briefly mention some possibilities:
• We could (and will) fit λ to the mean of the distribution.
• We could just as well look for a λ that makes 108 the mode of the Poisson(180λ) distribution. It is harder to find analytically, but it certainly can be done.
• The maximum likelihood approach says that the correct value of λ is the one that maximizes the probability of observing the number 108: that is, we could look for the λ that maximizes e^{-180λ} (180λ)^{108} / 108!. This is called "maximum likelihood estimation", or MLE for short.
• In a Bayesian approach, we could have a prior distribution describing our beliefs about λ (e.g. "uniform on the interval 0 to 1", or "truncated normal with mean 1 and standard deviation 1", truncated because λ should be nonnegative), and we could update that distribution based on the data. (The output would then be a distribution, rather than a point estimate.)
We will take the analytically simplest route of fitting λ to the mean of the data: then 180λ = 108, so λ = 0.6 errors per page. This gives an expected number of 200λ = 120 typos actually there, of which 12 have not been discovered by the copyeditor.
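To illustrate the MLE bullet above (a side check of mine, not part of the assigned solution): for a single Poisson observation, the maximum likelihood estimate coincides with the mean fit, which a small grid search confirms. The grid and step size below are arbitrary choices.

from math import lgamma, log

def loglik(lam):
    # log-likelihood of observing 108 found typos when the count is Poisson(180 * lam)
    mu = 180 * lam
    return -mu + 108 * log(mu) - lgamma(109)

candidates = [k / 1000 for k in range(1, 2001)]   # lambda from 0.001 to 2.000 errors per page
print(max(candidates, key=loglik))                # 0.6, matching the mean fit 108/180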
(c) Consider the four processes of errors, indexed by the number of pages: X_0(t) is the number of typos not found by either copyeditor in t pages, X_1(t) is the number of typos in t pages found by editor 1 but not 2, etc. Each error is of type 0, 1, 2, or 3 with probabilities (1 − p_1)(1 − p_2), p_1(1 − p_2), (1 − p_1)p_2, and p_1 p_2 respectively. The theory of thinning Poisson processes tells us that X_0(t) through X_3(t) are independent Poisson processes with rates λ(1 − p_1)(1 − p_2) through λ p_1 p_2 respectively. Setting t = 300 gives four independent Poisson random variables. Consequently, the joint distribution of X_0 through X_3 is that of four independent Poisson random variables with parameters 300λ(1 − p_1)(1 − p_2) through 300λ p_1 p_2. The joint pmf is

P(X_0 = n_0, X_1 = n_1, X_2 = n_2, X_3 = n_3)
= e^{-300λ(1-p_1)(1-p_2)} \frac{(300λ(1-p_1)(1-p_2))^{n_0}}{n_0!} \cdot e^{-300λ p_1(1-p_2)} \frac{(300λ p_1(1-p_2))^{n_1}}{n_1!} \cdot e^{-300λ(1-p_1)p_2} \frac{(300λ(1-p_1)p_2)^{n_2}}{n_2!} \cdot e^{-300λ p_1 p_2} \frac{(300λ p_1 p_2)^{n_3}}{n_3!}
= e^{-300λ}\,(300λ)^{n_0 + \dots + n_3}\, \frac{((1-p_1)(1-p_2))^{n_0}\,(p_1(1-p_2))^{n_1}\,((1-p_1)p_2)^{n_2}\,(p_1 p_2)^{n_3}}{n_0!\, n_1!\, n_2!\, n_3!}.

If you're familiar with the multinomial distribution, you'll see that what is happening here is the product of the Poisson distribution with parameter 300λ, evaluated at the total number of errors n_0 + ... + n_3, and the multinomial probability that these errors are partitioned into n_0 errors of type 0, n_1 of type 1, etc. (If you haven't seen the multinomial distribution before, it's a generalization of the binomial distribution to more than two types, but you don't really need to worry about it.)

(d) We again have a variety of ways of fitting parameters based on observations, but the simplest is still to fit the means. Copyeditor 1 found 100 typos of which 80 were also found by copyeditor 2, so X_1 = 20; similarly, X_2 = 120 − 80 = 40 and X_3 = 80. Thus we take

E[X_0] = 300λ(1 − p_1)(1 − p_2) = ?
E[X_1] = 300λ p_1(1 − p_2) = 20
E[X_2] = 300λ(1 − p_1) p_2 = 40
E[X_3] = 300λ p_1 p_2 = 80

and solving we find

E[X_3]/E[X_2] = p_1/(1 − p_1) = 2, so p_1 = 2/3
E[X_3]/E[X_1] = p_2/(1 − p_2) = 4, so p_2 = 4/5
E[X_3] = 300λ · (2/3) · (4/5) = 80, so λ = 1/2
E[X_0] = 300λ(1 − p_1)(1 − p_2) = 150 · (1/3) · (1/5) = 10.

Thus we estimate a rate of one mistake every 2 pages; editor 1 finds 2/3 of the mistakes that are there, editor 2 finds 4/5 of them, and 10 typos were undiscovered by either editor. (An interesting statistical question is how to put error bounds on all these numbers. We won't go there.)

(4) (10 points) Note: the numbers in this problem have been pulled out of a hat, and do not represent true statistics. New lawyers arrive in Los Angeles at a rate of 300 per year. (The bar exam is given twice a year, and on average, each round produces 150 new arrivals.) In the years 2004 through 2013, the number of lawyers in Los Angeles was as follows:

year:               2004  2005  2006  2007  2008  2009  2010  2011  2012  2013
number of lawyers:  7390  7610  7702  7523  7487  7610  7384  7401  7387  7506

Use this data to estimate the average time for which a lawyer practices.

Solution: We use Little's law. The number of lawyers in LA appears to be fairly stable, with mean 7500. The arrival rate is given as about 300 per year. Thus, in L = λW, the value of W is 25 years: on average, a lawyer practices for 25 years (and then retires, moves out of LA, switches to a different profession, or potentially dies).
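A quick numerical check of this Little's law computation (my own verification, in Python, not part of the original solution):

counts = [7390, 7610, 7702, 7523, 7487, 7610, 7384, 7401, 7387, 7506]
L = sum(counts) / len(counts)   # long-run average number of lawyers in the system
lam = 300                       # arrival rate, lawyers per year
W = L / lam                     # Little's law: L = lam * W
print(L, W)                     # 7500.0 lawyers on average, W = 25.0 years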
(5) (10 points) (Durrett 3.3) Thousands of people are going to a Grateful Dead concert in Pauley Pavilion at UCLA. They park their 10-foot cars on several of the long streets near the arena. There are no lines delineating parking spaces, so they end up leaving spacings between the cars that are independent and uniform on the interval (0, 10). In the long run, what fraction of the street is covered with cars?

Solution: We model this as a renewal process, indexed by space rather than time. The renewal cycle is a car plus the empty space in front of it; the reward per cycle is the space covered by the car. The expected length of a cycle is 10 + E[U(0, 10)] = 15 feet; the expected reward per cycle is 10 feet; thus, the fraction of the street covered by cars is the ratio of the two, namely 10/15 = 2/3.

(6) (10 points) This question concerns a simple model of queues with abandonment. Customers arrive to a queue as a Poisson process of rate λ. When a customer finds the server available, the customer and the server talk for an amount of time which is uniform on the interval (a, b). Any customer who arrives to find the server already busy abandons, meaning they go away and never come back. What fraction of the customers abandon the queue?

Solution: This is an alternating renewal process: the server alternates between periods of being available, which last an E(λ) amount of time, and periods of being busy, which last a U(a, b) amount of time. The proportion of time that the server is busy is

\frac{E[U(a, b)]}{E[U(a, b)] + E[E(λ)]} = \frac{\frac{1}{2}(a + b)}{\frac{1}{2}(a + b) + 1/λ}.

By the PASTA property (Poisson arrivals see time averages), an arriving customer sees the server busy with probability equal to the probability that the server is actually busy. Hence, the proportion of customers who abandon is equal to that number.

(7) (10 points) (Durrett 3.18) A scientist has a machine for measuring ozone in the atmosphere, which is located in the mountains just north of Los Angeles. At times of a Poisson process with rate 1, storms or animals disturb the equipment so that it can no longer collect data. The scientist comes every L units of time to check the equipment, and fix it if necessary. Repairs take very little time, which we will model as 0.
(a) What is the limiting fraction of time the machine is working?
(b) Suppose that the data being collected is worth a units per unit of (working) time, while each inspection costs c, with c < a. Find the optimal value of the inspection time L.

Solution:
(a) This is a renewal-reward process. I'm aware of two possibilities for what a "cycle" can be here: one is to define the renewal event as "scientist visits the set-up", and the other is to define it as "scientist fixes the machine". The first makes the cycle time easy; the second makes the working time per cycle easy.

Let's first work through the solution approach where the renewal event is "scientist visits the set-up", and the reward per cycle is the time the set-up is working during the cycle. The expected length of a cycle is L; the expected reward per cycle takes a little more work, because the reward is min(E(1), L): if the scientist comes and the equipment is still working, then the reward in that cycle is L, not the (larger) value of E(1). Thus,

E[r_i] = \int_0^\infty \underbrace{e^{-x}\,dx}_{\text{density of } E(1)}\; \underbrace{\min(x, L)}_{\text{reward, given } E(1) = x} = \int_0^L x e^{-x}\,dx + \int_L^\infty L e^{-x}\,dx = \bigl(1 - e^{-L} - L e^{-L}\bigr) + L\,\underbrace{e^{-L}}_{P(E(1) > L)} = 1 - e^{-L}.

We see that the expected working time per cycle is a little smaller than 1, since we're truncating off the large values of the exponential. Consequently, the proportion of working time is (1 − e^{-L})/L.
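A quick Monte Carlo check of the truncated-exponential mean computed above (my own sanity check, assuming Python/numpy, not part of the solution):

import numpy as np

rng = np.random.default_rng(1)
for L in (0.5, 1.0, 3.0):
    t = rng.exponential(1.0, 1_000_000)   # time until the equipment breaks, Exp(1)
    working = np.minimum(t, L).mean()     # E[min(E(1), L)], the working time per cycle
    print(L, round(working, 4), round(1 - np.exp(-L), 4), round(working / L, 4))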
Now let the renewal event be "the equipment is fixed". Then the cycle time may be L or 2L or 3L or so on; in general, the number of visits the scientist makes to the machine in one cycle has a geometric distribution. On the other hand, the working time per cycle is simply E(1). Here, computing the mean cycle length is a little harder:

E[cycle length] = L(1 - e^{-L}) + 2L\, e^{-L}(1 - e^{-L}) + 3L\,(e^{-L})^2(1 - e^{-L}) + \dots = L(1 - e^{-L})\bigl(1 + 2e^{-L} + 3(e^{-L})^2 + \dots\bigr).

How do we find the infinite sum? If you know the mean of the geometric distribution, you can just write down the answer: the mean cycle length is L(1 − e^{-L})^{-1}. If you don't, here is how to go about it: let f(p) = 1 + 2p + 3p^2 + 4p^3 + ...; then

\int f(p)\,dp = p + p^2 + p^3 + p^4 + \dots = \frac{p}{1 - p} \implies f(p) = \frac{d}{dp}\,\frac{p}{1 - p} = \frac{1}{(1 - p)^2}.

Plugging in p = e^{-L}, we get

E[cycle length] = L(1 - e^{-L}) \cdot \frac{1}{(1 - e^{-L})^2} = \frac{L}{1 - e^{-L}},

so the fraction of working time is

\frac{E[E(1)]}{E[cycle length]} = \frac{1}{L/(1 - e^{-L})} = \frac{1 - e^{-L}}{L}.

(b) In a renewal cycle of length L, we get an expected profit of a(1 − e^{-L}) units and pay a cost of c units (once per inspection, so once per cycle). Thus, the expected monetary reward per cycle is a(1 − e^{-L}) − c, and the rate of accumulating profits is

\frac{a(1 - e^{-L}) - c}{L}.

We want to maximize this over L, which requires taking the derivative and setting it to 0:

\frac{a\bigl((1 + L)e^{-L} - 1\bigr) + c}{L^2} = 0 \implies a\bigl((1 + L)e^{-L} - 1\bigr) + c = 0.

This can't actually be solved analytically (because of the Le^{-L} term), so we'll leave the answer in this form. (Some computer algebra systems give an answer in terms of the Lambert W function, which solves Le^{-L} = x for L as a function of x; but since this isn't one of the standard functions, people rarely have an intuition for it.)

In the other version of what a cycle is, notice that the cost of scientist visits per cycle isn't c, it's c times the expected number of visits! So there the benefit per unit time would be computed as

\frac{\text{mean benefit per cycle}}{\text{mean cycle length}} = \frac{a - c/(1 - e^{-L})}{L/(1 - e^{-L})}.

Of course, the answer is still the same.
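Since the first-order condition has no closed-form solution, here is a small numerical sketch (mine, not part of the solution) that maximizes the profit rate directly for hypothetical values of a and c; the values 5 and 1 are chosen only for illustration.

import numpy as np

a, c = 5.0, 1.0                                  # hypothetical: data worth a per unit working time, inspection costs c
L = np.linspace(0.01, 10, 100_000)
rate = (a * (1 - np.exp(-L)) - c) / L            # long-run profit per unit time
best = L[np.argmax(rate)]
print(best)                                      # numerically optimal inspection interval
print(a * ((1 + best) * np.exp(-best) - 1) + c)  # first-order condition, approximately 0 at the optimum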