STAT512/432 – 2011

Notes on order statistics of discrete random variables

In STAT 512/432 we will almost always focus on the order statistics of continuous random variables. Despite this, these notes discuss order statistics, in particular the maximum and the minimum, of n discrete random variables. We start with some basic background theory.

Some background theory

Example: the uniform distribution

For a discrete uniform distribution, all possible values of the (discrete) random variable have the same probability. In these notes we focus attention on the discrete random variable $Y$ that is equally likely to take each of the values 0, 1, 2, ..., 9. That is,

$$\text{Probability}(Y = y) = \frac{1}{10}, \qquad y = 0, 1, \ldots, 9. \tag{1}$$

The mean of $Y$ is, clearly,

$$\text{mean of } Y = \sum_{y=0}^{9} y \cdot \frac{1}{10} = 4.5. \tag{2}$$

A slightly more complicated calculation shows that the variance of $Y$ is 8.25.

The cumulative distribution function of a discrete random variable

The cumulative distribution function $F(y)$ of any discrete random variable $Y$ is the probability that the random variable takes a value less than or equal to $y$. Thus, using a standard notation,

$$F(y) = \sum_{a \le y} \Pr(Y = a). \tag{3}$$

The maximum of n discrete random variables

Definitions

The maximum of a number of independent random variables is frequently used in statistics. As an introduction to the relevant theory, we now consider properties of the maximum of n iid discrete random variables $Y_1, Y_2, \ldots, Y_n$, each having the same probability distribution with common cumulative distribution function $F(y)$. We denote this maximum by $Y_{\max}$. Given data $y_1, y_2, \ldots, y_n$, we write the observed (data) value of $Y_{\max}$ as $y_{\max}$. There can be ties: for example, if $n = 5$ and $y_1 = 7$, $y_2 = 9$, $y_3 = 6$, $y_4 = 9$, $y_5 = 6$, then $y_{\max} = 9$. In this case two of the observations are tied at the maximum value.
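As a quick numerical check of the background material, the following Python sketch recomputes the mean and variance of the uniform distribution on 0, 1, ..., 9, evaluates its cumulative distribution function, and finds the observed maximum in the ties example. The names `pmf`, `cdf`, `data` and `y_max` are illustrative choices, not part of the course notes, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

# Probabilities of Y from (1): each of 0, 1, ..., 9 has probability 1/10.
pmf = {y: Fraction(1, 10) for y in range(10)}

def cdf(y):
    """F(y) as in (3): the sum of Pr(Y = a) over all a <= y."""
    return sum(p for a, p in pmf.items() if a <= y)

# Mean as in (2), and variance as the expected squared deviation from the mean.
mean = sum(y * p for y, p in pmf.items())
variance = sum((y - mean) ** 2 * p for y, p in pmf.items())

print(float(mean))      # 4.5
print(float(variance))  # 8.25
print(cdf(3))           # 2/5

# Observed maximum, and the number of observations tied at it,
# for the data in the ties example.
data = [7, 9, 6, 9, 6]
y_max = max(data)
print(y_max, data.count(y_max))  # 9 2
```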
Theory

Since the maximum of n quantities is less than or equal to any number $y$ if and only if all of these quantities are less than or equal to $y$, we have

$$\text{Prob}(Y_{\max} \le y) = \bigl(F(y)\bigr)^n. \tag{4}$$

Because the random variables considered here take integer values, the event $Y_{\max} \ge y$ is the complement of the event $Y_{\max} \le y - 1$, and so

$$\text{Prob}(Y_{\max} \ge y) = 1 - \bigl(F(y-1)\bigr)^n, \tag{5}$$

so that

$$\text{Prob}(Y_{\max} = y) = \bigl(F(y)\bigr)^n - \bigl(F(y-1)\bigr)^n. \tag{6}$$

Example: the uniform distribution again

As an example, we use the theory above to consider properties of the maximum of n random variables, each having the uniform distribution considered above. The cumulative distribution function $F(y)$ of a random variable having this uniform distribution is easily seen to be given by

$$F(y) = \frac{y+1}{10}, \qquad y = 0, 1, 2, \ldots, 9. \tag{7}$$

Then from (6) and (7), the probability that $Y_{\max} = y$ is given by

$$P_{\max}(y) = \left(\frac{y+1}{10}\right)^n - \left(\frac{y}{10}\right)^n, \qquad y = 0, 1, 2, \ldots, 9. \tag{8}$$

Mean and variance of a maximum of n random variables from the above uniform distribution

From the above theory, the mean value $\mu_{\max}$ of $Y_{\max}$, the maximum of n random variables from the uniform distribution considered above, is given by

$$\mu_{\max} = \sum_{y=0}^{9} y \left[\left(\frac{y+1}{10}\right)^n - \left(\frac{y}{10}\right)^n\right]. \tag{9}$$

This simplifies, after some algebra, to

$$\mu_{\max} = 9 - \left(\frac{1}{10}\right)^n - \left(\frac{2}{10}\right)^n - \cdots - \left(\frac{9}{10}\right)^n. \tag{10}$$

In Homework 3 you will be asked to verify this calculation.

As a check on this calculation, the case $n = 1$ gives

$$\mu_{\max} = 9 - \frac{1}{10} - \frac{2}{10} - \cdots - \frac{9}{10} = 9 - \frac{1 + 2 + \cdots + 9}{10} = 4.5. \tag{11}$$

One can also find the value of the variance of $Y_{\max}$, but we do not do this here.

One immediate conclusion that can be drawn from (10) is that, as $n \to \infty$, the mean of $Y_{\max}$ approaches 9. It can also be shown that the variance of $Y_{\max}$ approaches 0 as $n \to \infty$. Both these conclusions "make sense": for large n it is almost certain that at least one of the n observations equals 9. The following table indicates the rate at which these limits are approached.
value of n               5           10          20
mean of Ymax             7.79175     8.508566    8.866059
variance of Ymax         1.928782    0.622878    0.142478

The minimum of n discrete random variables

Properties of the minimum $Y_{\min}$ of n independently and identically distributed random variables can be found in a manner similar to that used for the maximum. The minimum of n quantities is greater than or equal to $y$ if and only if all of these quantities are greater than or equal to $y$. Using the notation above, we get

$$\text{Prob}(Y_{\min} \ge y) = \bigl(1 - F(y-1)\bigr)^n, \tag{12}$$

$$\text{Prob}(Y_{\min} \ge y+1) = \bigl(1 - F(y)\bigr)^n, \tag{13}$$

so that

$$\text{Prob}(Y_{\min} = y) = \bigl(1 - F(y-1)\bigr)^n - \bigl(1 - F(y)\bigr)^n. \tag{14}$$

From (14) one can find the mean and the variance of $Y_{\min}$. We do not give the details here, and note only that in the case of the uniform distribution considered above, the mean of $Y_{\min}$ is

$$\left(\frac{1}{10}\right)^n + \left(\frac{2}{10}\right)^n + \cdots + \left(\frac{9}{10}\right)^n. \tag{15}$$

Note that as $n \to \infty$, this mean approaches 0. This makes sense: for large n it is almost certain that at least one of the n observations equals 0.

The general order statistic of n discrete random variables

$Y_{\min}$ and $Y_{\max}$ are examples of order statistics: specifically, $Y_{\min}$ is the first order statistic and $Y_{\max}$ is the nth order statistic. One can also define the second, third, ..., (n − 1)th order statistics. However, the theory for these order statistics becomes quite complicated for a discrete random variable, so we do not consider it here.
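The formulas for the maximum and the minimum can be checked numerically. The Python sketch below builds the probability functions (6) and (14) for the uniform distribution on 0, 1, ..., 9, computes the mean and variance of $Y_{\max}$ and the mean of $Y_{\min}$ for n = 5, 10 and 20, and confirms that the means agree with the closed forms (10) and (15). The function names `F`, `pmf_max`, `pmf_min` and `mean_var` are illustrative choices, not part of the course notes, and extending `F` to 0 below the support is an assumption made so that (6) and (14) can be evaluated at y = 0.

```python
from fractions import Fraction

def F(y):
    """CDF (7) of the uniform distribution on 0..9 (taken to be 0 for y < 0)."""
    if y < 0:
        return Fraction(0)
    return Fraction(min(y, 9) + 1, 10)

def pmf_max(y, n):
    """Prob(Ymax = y) from (6)."""
    return F(y) ** n - F(y - 1) ** n

def pmf_min(y, n):
    """Prob(Ymin = y) from (14)."""
    return (1 - F(y - 1)) ** n - (1 - F(y)) ** n

def mean_var(pmf, n):
    """Mean and variance of a distribution on 0..9 with probability function pmf."""
    mean = sum(y * pmf(y, n) for y in range(10))
    second_moment = sum(y * y * pmf(y, n) for y in range(10))
    return mean, second_moment - mean ** 2

for n in (5, 10, 20):
    m_max, v_max = mean_var(pmf_max, n)
    m_min, _ = mean_var(pmf_min, n)
    # The sum (1/10)^n + ... + (9/10)^n that appears in (10) and (15).
    tail = sum(Fraction(k, 10) ** n for k in range(1, 10))
    assert m_max == 9 - tail   # closed form (10)
    assert m_min == tail       # closed form (15)
    print(n, float(m_max), float(v_max), float(m_min))
```

The printed means and variances of $Y_{\max}$ can be compared with the tabulated values. Because the arithmetic is done with exact fractions, the comparisons with (10) and (15) are exact identities rather than floating-point approximations.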