STAT512/432 – 2011
Notes on order statistics of discrete random
variables
In STAT 512/432 we will almost always focus on the order statistics of continuous random variables. Despite this, these notes discuss order statistics, in
particular the maximum and the minimum, of n discrete random variables. We
start with some basic background theory.
Some background theory
Example: the uniform distribution
For a discrete uniform distribution, all possible values of the (discrete) random
variable have the same probability. In these notes we focus attention on the
discrete random variable Y that is equally likely to take each of the values
0, 1, 2, . . . , 9. That is,
$$\mathrm{Prob}(Y = y) = \frac{1}{10}, \qquad y = 0, 1, \ldots, 9. \tag{1}$$
The mean of Y is, clearly,
$$\text{mean of } Y = \sum_{y=0}^{9} y \cdot \frac{1}{10} = 4.5. \tag{2}$$
A slightly more complicated calculation shows that the variance of Y is 8.25.
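Both values can be checked with a few lines of Python. This is only a minimal sketch and not part of the original notes: the names `pmf` and `mean_and_variance` are illustrative, and exact fractions are used so that no rounding enters the calculation.

```python
from fractions import Fraction

# pmf of Y from (1): each of 0, 1, ..., 9 has probability 1/10
pmf = {y: Fraction(1, 10) for y in range(10)}

def mean_and_variance(pmf):
    """Return (mean, variance) of a distribution given as {value: probability}."""
    mean = sum(y * p for y, p in pmf.items())
    variance = sum((y - mean) ** 2 * p for y, p in pmf.items())
    return mean, variance

mean, variance = mean_and_variance(pmf)
print(float(mean), float(variance))  # 4.5 8.25
```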
The cumulative distribution function of a discrete random variable
The cumulative distribution function F (y) of any discrete random variable Y is
the probability that the random variable takes a value less than or equal to y.
Thus using a standard notation,
$$F(y) = \sum_{a \le y} \mathrm{Prob}(Y = a). \tag{3}$$
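As an illustration of (3), the following sketch computes F(y) by summing probabilities over the possible values a ≤ y. The function name `cdf` and the dictionary representation of the distribution are assumptions made for this example only.

```python
from fractions import Fraction

# Uniform distribution on 0, 1, ..., 9, as in (1)
pmf = {y: Fraction(1, 10) for y in range(10)}

def cdf(pmf, y):
    """F(y): sum of Prob(Y = a) over all possible values a <= y, as in (3)."""
    return sum((p for a, p in pmf.items() if a <= y), Fraction(0))

print(cdf(pmf, 3))   # 2/5
print(cdf(pmf, -1))  # 0
print(cdf(pmf, 9))   # 1
```

Note that F(y) is 0 below the smallest possible value and 1 at or above the largest.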
The maximum of n discrete random variables
Definitions
The maximum of a number of independent random variables is
frequently used in statistics. As an introduction to the relevant theory, we
now consider properties of the maximum of n iid discrete random variables
Y1 , Y2 , . . . , Yn , each having the same probability distribution with common cumulative distribution function F (y). We denote this maximum by Ymax . Given
data y1 , y2 , . . . , yn , we write the observed (data) value of Ymax as ymax . There
can be ties: for example, if n = 5 and
y1 = 7, y2 = 9, y3 = 6, y4 = 9, y5 = 6,
then ymax = 9. In this case two of the observations tied at the maximum value.
Theory
Since the maximum of n quantities is less than or equal to any number y if and
only if all of these quantities are less than or equal to y, we have
$$\mathrm{Prob}(Y_{\max} \le y) = [F(y)]^n, \tag{4}$$
$$\mathrm{Prob}(Y_{\max} \ge y) = 1 - [F(y-1)]^n, \tag{5}$$
so that
$$\mathrm{Prob}(Y_{\max} = y) = [F(y)]^n - [F(y-1)]^n. \tag{6}$$
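Formula (6) can be checked against a direct enumeration of all outcomes (y1, ..., yn) for a small case. The sketch below is not from the notes: the three-point distribution is a hypothetical example, the function names are illustrative, and the formula is applied assuming Y takes integer values only.

```python
from fractions import Fraction
from itertools import product

def cdf(pmf, y):
    """F(y) for a distribution given as {value: probability}."""
    return sum((p for a, p in pmf.items() if a <= y), Fraction(0))

def max_pmf(pmf, n):
    """Prob(Ymax = y) from (6): F(y)^n - F(y-1)^n, for integer-valued Y."""
    return {y: cdf(pmf, y) ** n - cdf(pmf, y - 1) ** n for y in pmf}

def max_pmf_brute_force(pmf, n):
    """The same distribution, found by enumerating all outcomes (y1, ..., yn)."""
    result = {y: Fraction(0) for y in pmf}
    for outcome in product(pmf, repeat=n):
        prob = Fraction(1)
        for y in outcome:
            prob *= pmf[y]  # independence: probabilities multiply
        result[max(outcome)] += prob
    return result

# Hypothetical non-uniform distribution on 0, 1, 2 (an assumption for this example)
pmf = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
print(max_pmf(pmf, 3) == max_pmf_brute_force(pmf, 3))  # True
```

The enumeration grows exponentially in n, so it is useful only as a check on small cases.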
Example: the uniform distribution again
As an example, we use the theory above to consider properties of the maximum
of n random variables, each having the uniform distribution considered above.
The cumulative distribution function F (y) of a random variable having
this distribution is easily seen to be given by
$$F(y) = \frac{y+1}{10}, \qquad y = 0, 1, 2, \ldots, 9. \tag{7}$$
Then from (6) and (7), the probability that Ymax = y is given by
$$P_{\max}(y) = \left(\frac{y+1}{10}\right)^n - \left(\frac{y}{10}\right)^n, \qquad y = 0, 1, 2, \ldots, 9. \tag{8}$$
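To get a feel for (8), the sketch below tabulates Pmax(y) for n = 5 and confirms that the probabilities add up to 1. The function name `p_max` and the choice n = 5 are assumptions made for this illustration.

```python
from fractions import Fraction

def p_max(y, n):
    """Pmax(y) from (8) for the uniform distribution on 0, 1, ..., 9."""
    return Fraction(y + 1, 10) ** n - Fraction(y, 10) ** n

n = 5
for y in range(10):
    print(y, float(p_max(y, n)))

# The sum telescopes to (10/10)^n - (0/10)^n = 1
total = sum(p_max(y, n) for y in range(10))
print(total)  # 1
```

Most of the probability sits on the largest values: for n = 5, the value 9 alone has probability 1 − (9/10)^5.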
Mean and variance of a maximum of n random variables from the
above uniform distribution
From the above theory, the mean value µmax of Ymax , the maximum of n random
variables from the uniform distribution considered above, is given by
$$\mu_{\max} = \sum_{y=0}^{9} y \left[ \left(\frac{y+1}{10}\right)^n - \left(\frac{y}{10}\right)^n \right]. \tag{9}$$
This simplifies, after some algebra, to
$$\mu_{\max} = 9 - \left(\frac{1}{10}\right)^n - \left(\frac{2}{10}\right)^n - \cdots - \left(\frac{9}{10}\right)^n. \tag{10}$$
In Homework 3 you will be asked to verify this calculation.
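A numerical comparison of (9) and (10) is no substitute for the algebra, but it is a quick sanity check. The following sketch evaluates both expressions with exact fractions for a few values of n; the function names are illustrative only.

```python
from fractions import Fraction

def mean_max_sum(n):
    """Mean of Ymax computed from the sum in (9)."""
    return sum(y * (Fraction(y + 1, 10) ** n - Fraction(y, 10) ** n)
               for y in range(10))

def mean_max_closed(n):
    """Mean of Ymax computed from the simplified form (10)."""
    return 9 - sum(Fraction(k, 10) ** n for k in range(1, 10))

# The two expressions agree exactly for each n tried
for n in (1, 2, 5, 10, 20):
    assert mean_max_sum(n) == mean_max_closed(n)

print(float(mean_max_closed(5)))  # 7.79175
```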
As a check on this calculation, the case n = 1 gives
$$\mu_{\max} = 9 - \frac{1}{10} - \frac{2}{10} - \cdots - \frac{9}{10} = 9 - \frac{1 + 2 + \cdots + 9}{10} = 4.5. \tag{11}$$
One can also find the value of the variance of Ymax , but we do not do this here.
One immediate conclusion that can be drawn from (10) is that, as n → ∞,
the mean of Ymax approaches 9. It can also be shown that the variance
of Ymax approaches 0 as n → ∞. Both these conclusions “make sense”: when n is
large, at least one of the n observations is almost certain to equal 9. The
following table indicates the rate at which these limits are approached.
value of n    mean of Ymax    variance of Ymax
5             7.79175         1.928782
10            8.508566        0.622878
20            8.866059        0.142478
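The entries of such a table can be recomputed directly from the probabilities in (8). The sketch below uses ordinary floating-point arithmetic and an illustrative function name `max_moments`; it obtains the variance as the mean of Ymax squared minus the square of the mean.

```python
def max_moments(n):
    """Mean and variance of Ymax for the uniform distribution on 0..9, using (8)."""
    probs = [((y + 1) / 10) ** n - (y / 10) ** n for y in range(10)]
    mean = sum(y * p for y, p in enumerate(probs))
    second_moment = sum(y * y * p for y, p in enumerate(probs))
    return mean, second_moment - mean ** 2

for n in (5, 10, 20):
    mean, variance = max_moments(n)
    print(f"n = {n:2d}   mean = {mean:.6f}   variance = {variance:.6f}")
```

Other values of n can be added to the loop to see how quickly the mean approaches 9 and the variance approaches 0.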
The minimum of n discrete random variables
Properties of the minimum Ymin of n independently and identically distributed
random variables can be found in the same way as those of the maximum. The
minimum of n quantities is greater than or equal to y if and only if all of
them are, and Prob (Y ≥ y) = 1 − F (y − 1). Using the notation above, we get
$$\mathrm{Prob}(Y_{\min} \ge y) = [1 - F(y-1)]^n, \tag{12}$$
$$\mathrm{Prob}(Y_{\min} \ge y+1) = [1 - F(y)]^n, \tag{13}$$
so that
$$\mathrm{Prob}(Y_{\min} = y) = [1 - F(y-1)]^n - [1 - F(y)]^n. \tag{14}$$
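As with the maximum, (14) can be checked against a direct enumeration for a small case. This is a sketch under the same assumptions as before: integer-valued Y, illustrative function names, and a hypothetical three-point distribution that does not come from the notes.

```python
from fractions import Fraction
from itertools import product

def cdf(pmf, y):
    """F(y) for a distribution given as {value: probability}."""
    return sum((p for a, p in pmf.items() if a <= y), Fraction(0))

def min_pmf(pmf, n):
    """Prob(Ymin = y) from (14): (1 - F(y-1))^n - (1 - F(y))^n, integer-valued Y."""
    return {y: (1 - cdf(pmf, y - 1)) ** n - (1 - cdf(pmf, y)) ** n for y in pmf}

def min_pmf_brute_force(pmf, n):
    """The same distribution, found by enumerating all outcomes (y1, ..., yn)."""
    result = {y: Fraction(0) for y in pmf}
    for outcome in product(pmf, repeat=n):
        prob = Fraction(1)
        for y in outcome:
            prob *= pmf[y]  # independence: probabilities multiply
        result[min(outcome)] += prob
    return result

# Hypothetical non-uniform distribution on 0, 1, 2 (an assumption for this example)
pmf = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
print(min_pmf(pmf, 3) == min_pmf_brute_force(pmf, 3))  # True
```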
From (14) one can find the mean and the variance of Ymin . We do not give
the details here, and note only that in the case of the uniform distribution
considered above, the mean of Ymin is
$$\left(\frac{1}{10}\right)^n + \left(\frac{2}{10}\right)^n + \cdots + \left(\frac{9}{10}\right)^n. \tag{15}$$
Note that as n → ∞, this mean approaches 0. This makes sense.
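Expression (15) can be checked numerically against the mean computed from (14) with F(y) = (y + 1)/10. The sketch below does this with exact fractions; the function names are illustrative and not part of the notes.

```python
from fractions import Fraction

def mean_min_from_pmf(n):
    """Mean of Ymin from (14), using 1 - F(y-1) = (10 - y)/10 and 1 - F(y) = (9 - y)/10."""
    return sum(y * (Fraction(10 - y, 10) ** n - Fraction(9 - y, 10) ** n)
               for y in range(10))

def mean_min_closed(n):
    """Mean of Ymin from (15)."""
    return sum(Fraction(k, 10) ** n for k in range(1, 10))

# The two expressions agree exactly for each n tried
for n in (1, 2, 5, 10, 20):
    assert mean_min_from_pmf(n) == mean_min_closed(n)

print(float(mean_min_closed(1)))  # 4.5
print(float(mean_min_closed(5)))  # 1.20825
```

For n = 1 the minimum is just Y itself, so the mean is 4.5, which matches (2).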
The general order statistic of n discrete random variables
Ymin and Ymax are examples of order statistics: specifically, Ymin is the first
order statistic and Ymax is the nth order statistic. One can also define the
second, third, . . . , (n − 1)th order statistics. However, the theory for these
becomes quite complicated for discrete random variables, so we do not consider
it here.