Discrete Random Variables
Introduction
• In previous lectures
• we established a foundation of probability theory;
• we applied probability theory to some applications;
• we covered combinatorial problems;
• we studied Bayes' theorem.
• In this lecture
• we will define and describe the random variables;
• we will give a definition of probability mass function and
probability distribution functions;
• a number of discrete probability mass functions will be given.
• Properties of probability mass and cumulative functions will be
given.
Definition of Discrete Random Variables
• The sample space of a die is {1,2,3,4,5,6}, because each face of a die has a dot pattern containing 1, 2, 3, 4, 5, or 6 dots.
• This type of experiment is called a numerically valued random phenomenon.
• What if outcomes of an experiment are nonnumerical?
• The sample space of a coin toss is S = {tail, head}. It is often
useful to assign numeric values to such experiments.
$$X(s_i) = \begin{cases} 0, & s_1 = \text{tail} \\ 1, & s_2 = \text{head} \end{cases}$$
Example: the total number of heads observed in M tosses can be found as
$$N_H = \sum_{i=1}^{M} X(s_i)$$
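A minimal Python sketch of this counting idea, assuming a fair coin simulated with the standard `random` module. The names `X`, `tosses`, and `n_heads` are illustrative and do not come from the lecture.

```python
import random

def X(outcome):
    """Random variable: map the nonnumerical outcome to a number."""
    return {"tail": 0, "head": 1}[outcome]

# Simulate M = 10 tosses of a fair coin (seeded only for repeatability).
rng = random.Random(1)
tosses = [rng.choice(["tail", "head"]) for _ in range(10)]

# N_H = sum over the tosses of X(s_i)
n_heads = sum(X(s) for s in tosses)
print(tosses, n_heads)
```

Because X assigns 1 to a head and 0 to a tail, summing X over the tosses is the same as counting the heads.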
Definition of Discrete Random Variables
• The function that maps S into SX and which is denoted by X(.)
is called a random variable.
• The name random variable is a poor one in that the function
X(.) is not random but a known one and usually one of our
own choosing.
• Example: In a phase-shift keyed (PSK) digital system a "0" or "1" is communicated to a receiver by sending
$$s(t) = \begin{cases} -A\cos 2\pi F_0 t & \text{for a 0} \\ A\cos 2\pi F_0 t & \text{for a 1} \end{cases}$$
• A 1 or a 0 occurs with equal probability, so we model the choice of a bit as a fair coin tossing experiment:
$$s_i(t) = X(s_i)\, A\cos 2\pi F_0 t$$
where the random variable is
$$X(s_i) = \begin{cases} -1, & s_1 = \text{tail} \\ 1, & s_2 = \text{head} \end{cases}$$
Definition of Discrete Random Variables
• A random variable is a function that maps the sample space S into a subset of the real line:
$$X : S \to S_X, \qquad S_X \subset R = \{x : -\infty < x < \infty\}$$
• For a discrete random variable this subset is a finite or countably infinite set of points.
• A discrete random variable may also map multiple elements of the sample space into the same number, so the mapping can be one-to-one or many-to-one.
Example of a many-to-one mapping:
$$X(s_i) = \begin{cases} 0 & \text{if } s_i = 1, 3, 5 \text{ dots} \\ 1 & \text{if } s_i = 2, 4, 6 \text{ dots} \end{cases}$$
Probability of Discrete Random Variables
• What is the probability P[X(si) = xi] for each xi ∈ SX?
• If X(.) maps each si into a different xi (X(.) is one-to-one), then because si and xi are just two different names for the same event
$$P[X(s) = x_i] = P[\{s_j : X(s_j) = x_i\}] = P[\{s_i\}]$$
• If there are multiple outcomes in S that map into the same value xi (X(.) is many-to-one), then
$$P[X(s) = x_i] = P[\{s_j : X(s_j) = x_i\}] = \sum_{\{j : X(s_j) = x_i\}} P[\{s_j\}]$$
since the sj's are simple events in S and are therefore mutually exclusive.
• The probability mass function (PMF) can be summarized as
$$p_X[x_i] = P[X(s) = x_i]$$
where the subscript X names the random variable (RV) and the square brackets [ ] indicate a discrete RV.
The PMF is the probability that the RV X takes on the value xi for each possible xi.
Examples
• Coin toss – one-to-one mapping
• The experiment consists of a single coin toss with a probability of heads equal to p. The sample space is S = {head, tail} and the RV is
$$X(s_i) = \begin{cases} 0, & s_i = \text{tail} \\ 1, & s_i = \text{head} \end{cases}$$
• The PMF is therefore
$$p_X[0] = P[X(s) = 0] = 1 - p$$
$$p_X[1] = P[X(s) = 1] = p$$
• Die toss – many-to-one mapping
• The experiment consists of a single fair die toss. With a sample space of S = {1,2,3,4,5,6} and an interest only in whether the outcome is even or odd, we define the RV as
$$X(s_i) = \begin{cases} 0 & \text{if } s_i = 1, 3, 5 \text{ dots} \\ 1 & \text{if } s_i = 2, 4, 6 \text{ dots} \end{cases}$$
• The event is just the subset of S for which X(s) = xi holds, so
$$p_X[0] = P[X(s) = 0] = \sum_{j=1,3,5} P[\{s_j\}] = \frac{3}{6}$$
$$p_X[1] = P[X(s) = 1] = \sum_{j=2,4,6} P[\{s_j\}] = \frac{3}{6}$$
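The many-to-one rule (add the probabilities of all outcomes that map to the same value) can be sketched in Python as follows. This is only an illustration: the helper name `pmf_from_mapping` is hypothetical, and exact fractions are used so that the result 3/6 = 1/2 is visible without rounding.

```python
from collections import defaultdict
from fractions import Fraction

def pmf_from_mapping(outcome_probs, rv):
    """Sum P[{s_j}] over all outcomes s_j that rv maps to the same value."""
    pmf = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for outcome, prob in outcome_probs.items():
        pmf[rv(outcome)] += prob
    return dict(pmf)

# Fair die: each face has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# Many-to-one RV: 0 for an odd number of dots, 1 for an even number.
parity_pmf = pmf_from_mapping(die, lambda face: 0 if face % 2 else 1)
print(parity_pmf)  # {0: Fraction(1, 2), 1: Fraction(1, 2)}
```

The same helper covers the one-to-one case, where each value of the RV collects the probability of exactly one outcome.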
Properties
• Because PMFs pX[xi] are just new names for the probabilities P[X(s) = xi], they must satisfy the usual properties:
• Property 1: Range of values
$$0 \le p_X[x_i] \le 1$$
• Property 2: Sum of values
$$\sum_{i=1}^{M} p_X[x_i] = 1 \quad \text{if } S_X \text{ consists of } M \text{ outcomes}$$
$$\sum_{i=1}^{\infty} p_X[x_i] = 1 \quad \text{if } S_X \text{ is countably infinite}$$
• We will omit the s argument of X to write pX[xi] = P[X = xi].
• Thus, the probability of an event A defined on SX is given by
$$P[X \in A] = \sum_{\{i : x_i \in A\}} p_X[x_i]$$
Example
• Calculating probabilities based on the PMF
• Consider a die with sides S = {side1, side2, side3, side4, side5, side6} whose sides have been labeled with 1, 2, or 3 dots.
$$X(s_i) = \begin{cases} 1, & i = 1, 2 \\ 2, & i = 3, 4 \\ 3, & i = 5, 6 \end{cases}$$
• Thus, if the die is fair, pX[k] = 1/3 for k = 1, 2, 3.
• Assume we are interested in the probability that a 2 or 3 occurs, that is, A = {2,3}; then
$$P[X \in \{2, 3\}] = p_X[2] + p_X[3] = \frac{2}{3}$$
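As a small check of Properties 1 and 2 and of the event formula, here is a Python sketch under the assumption that the die is fair, so each label 1, 2, 3 has probability 1/3. The function name `prob_event` is made up for this illustration.

```python
from fractions import Fraction

# PMF of the relabeled die, assuming a fair die (two sides per label).
pmf = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}

def prob_event(pmf, event):
    """P[X in A]: add p_X[x_i] over all x_i that belong to the event A."""
    return sum((p for x, p in pmf.items() if x in event), Fraction(0))

# Property 1: every value lies in [0, 1]. Property 2: the values sum to 1.
property_1 = all(0 <= p <= 1 for p in pmf.values())
property_2 = sum(pmf.values()) == 1

print(property_1, property_2, prob_event(pmf, {2, 3}))  # True True 2/3
```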
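Important Probability Mass Functions
• Bernoulli
$$\text{Ber}(p) \sim p_X[k] = \begin{cases} 1 - p, & k = 0 \\ p, & k = 1 \end{cases}$$
Here ~ means "is distributed according to".
[Figure: Bernoulli PMF for p = 0.25]
• Binomial
$$\text{bin}(M, p) \sim p_X[k] = \binom{M}{k} p^k (1 - p)^{M-k}, \quad k = 0, 1, \dots, M$$
[Figure: binomial PMF for M = 10, p = 0.25, with the maximum located at the integer part of (M + 1)p]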
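Important Probability Mass Functions
• Geometric
$$\text{geom}(p) \sim p_X[k] = (1 - p)^{k-1} p, \quad k = 1, 2, \dots$$
[Figure: geometric PMF for p = 0.25]
• Poisson
$$\text{Pois}(\lambda) \sim p_X[k] = \exp(-\lambda) \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \dots$$
[Figure: Poisson PMFs for λ = 2 and λ = 5]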
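The four PMFs can be written directly from their formulas. The following Python sketch uses only the standard `math` module; the function names are chosen for this illustration, and values of k outside the support return 0.

```python
import math

def bernoulli_pmf(k, p):
    """Ber(p): 1 - p for k = 0, p for k = 1."""
    if k not in (0, 1):
        return 0.0
    return p if k == 1 else 1.0 - p

def binomial_pmf(k, M, p):
    """bin(M, p): C(M, k) p^k (1 - p)^(M - k) for k = 0, ..., M."""
    if not 0 <= k <= M:
        return 0.0
    return math.comb(M, k) * p**k * (1 - p)**(M - k)

def geometric_pmf(k, p):
    """geom(p): (1 - p)^(k - 1) p for k = 1, 2, ..."""
    if k < 1:
        return 0.0
    return (1 - p)**(k - 1) * p

def poisson_pmf(k, lam):
    """Pois(lam): exp(-lam) lam^k / k! for k = 0, 1, 2, ..."""
    if k < 0:
        return 0.0
    return math.exp(-lam) * lam**k / math.factorial(k)

# Location of the largest binomial PMF value for M = 10, p = 0.25.
mode = max(range(11), key=lambda k: binomial_pmf(k, 10, 0.25))
print(mode, math.floor((10 + 1) * 0.25))
```

For M = 10 and p = 0.25 both printed numbers agree, which matches the usual rule that the binomial PMF peaks at the integer part of (M + 1)p.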
Approximation of Binomial PMF by Poisson PMF
• If in a binomial PMF M → ∞ and p → 0 such that the product λ = Mp remains constant, then bin(M,p) → Pois(λ), and λ represents the average number of successes in M Bernoulli trials.
In other words, keeping the average number of successes fixed while assuming more and more trials with smaller and smaller probabilities of success on each trial leads to a Poisson PMF.
[Figure: binomial PMFs for M = 10, p = 0.5 and for M = 100, p = 0.05]
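Approximation of Binomial PMF by Poisson PMF
• We have for the binomial PMF with p = λ/M → 0 as M → ∞
$$p_X[k] = \frac{M!}{(M - k)!\,k!} \left(\frac{\lambda}{M}\right)^k \left(1 - \frac{\lambda}{M}\right)^{M-k}$$
• For a fixed k, as M → ∞, we have that
$$\frac{M!}{(M - k)!\,M^k} \to 1$$
• Using L'Hospital's rule we have
$$\left(1 - \frac{\lambda}{M}\right)^{M-k} \to \exp(-\lambda)$$
and therefore
$$p_X[k] \to \exp(-\lambda) \frac{\lambda^k}{k!}$$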
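A numerical sketch of this limit in Python: with λ = Mp held at 5, the largest gap between the binomial and Poisson PMFs should shrink as M grows. The helper `max_abs_error` and the cutoff `k_max = 30` are assumptions made for this illustration. The first two parameter pairs are the ones shown on the slides; the third is added here.

```python
import math

def binomial_pmf(k, M, p):
    if not 0 <= k <= M:
        return 0.0
    return math.comb(M, k) * p**k * (1 - p)**(M - k)

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

def max_abs_error(M, p, k_max=30):
    """Largest |bin(M, p) - Pois(Mp)| over k = 0, ..., k_max."""
    lam = M * p
    return max(abs(binomial_pmf(k, M, p) - poisson_pmf(k, lam))
               for k in range(k_max + 1))

# lambda = M p = 5 in every case; only M and p change.
err_10 = max_abs_error(10, 0.5)
err_100 = max_abs_error(100, 0.05)
err_1000 = max_abs_error(1000, 0.005)
print(err_10, err_100, err_1000)
```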
Transformation of Discrete Random Variable
• It is frequently of interest to determine the PMF of a new random variable Y = g(X).
• Consider a die whose sides are labeled with the numbers 0,0,1,1,2,2. Then the transformation is many-to-one:
$$Y = \begin{cases} y_1 = 0 & \text{if } x = x_1 = 1 \text{ or } x = x_2 = 2 \\ y_2 = 1 & \text{if } x = x_3 = 3 \text{ or } x = x_4 = 4 \\ y_3 = 2 & \text{if } x = x_5 = 5 \text{ or } x = x_6 = 6 \end{cases}$$
$$p_Y[y_i] = \begin{cases} p_X[1] + p_X[2], & i = 1 \\ p_X[3] + p_X[4], & i = 2 \\ p_X[5] + p_X[6], & i = 3 \end{cases}$$
Examples
• In general,
$$p_Y[y_i] = \sum_{\{j : g(x_j) = y_i\}} p_X[x_j] \qquad (*)$$
• Example: One-to-one transformation of Bernoulli random variable.
• If X ~ Ber(p) and Y = 2X – 1, determine the PMF of Y.
• The sample space is SX = {0, 1} and consequently SY = {-1, 1}. It follows that x1 = 0 maps into y1 = -1 and x2 = 1 maps into y2 = 1. Thus,
$$p_Y[-1] = p_X[0] = 1 - p$$
$$p_Y[1] = p_X[1] = p$$
• Example: Many-to-one transformation
• Let the transformation be Y = g(X) = X^2, which is defined on the sample space SX = {-1, 0, 1} so that SY = {0, 1}. Clearly, g(xj) = xj^2 = 0 only for xj = 0, so
$$p_Y[0] = p_X[0]$$
• However, g(xj) = xj^2 = 1 for xj = -1 and xj = 1. Thus, using (*) we have
$$p_Y[1] = \sum_{\{j : g(x_j) = 1\}} p_X[x_j] = p_X[-1] + p_X[1]$$
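Both examples follow the same recipe: push each value of X through g and add up the probabilities that land on the same y. A minimal Python sketch follows. The helper name `transform_pmf` and the numerical probabilities (p = 1/4, and 1/5, 1/2, 3/10 for X = -1, 0, 1) are assumptions chosen only to make the example concrete.

```python
from collections import defaultdict
from fractions import Fraction

def transform_pmf(pmf_x, g):
    """PMF of Y = g(X): p_Y[y] is the sum of p_X[x] over all x with g(x) = y."""
    pmf_y = defaultdict(Fraction)
    for x, prob in pmf_x.items():
        pmf_y[g(x)] += prob
    return dict(pmf_y)

# One-to-one: X ~ Ber(p) with an assumed p = 1/4, Y = 2X - 1.
p = Fraction(1, 4)
pmf_bernoulli = {0: 1 - p, 1: p}
pmf_y1 = transform_pmf(pmf_bernoulli, lambda x: 2 * x - 1)

# Many-to-one: Y = X^2 on S_X = {-1, 0, 1} with assumed probabilities.
pmf_x = {-1: Fraction(1, 5), 0: Fraction(1, 2), 1: Fraction(3, 10)}
pmf_y2 = transform_pmf(pmf_x, lambda x: x**2)

print(pmf_y1)
print(pmf_y2)
```

In the second case the entries for x = -1 and x = 1 are merged into the single value y = 1, as the general formula requires.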
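Example: Many-to-one transformation of Poisson random variable
• Consider X ~ Pois(λ) and define the transformation Y = g(X) as
$$Y = \begin{cases} 1 & \text{if } X = k \text{ is even} \\ -1 & \text{if } X = k \text{ is odd} \end{cases}$$
• To find the PMF for Y we use
$$p_Y[k] = P[Y = k] = \begin{cases} P[X \text{ is even}], & k = 1 \\ P[X \text{ is odd}], & k = -1 \end{cases}$$
• We need only determine pY[1] since pY[-1] = 1 - pY[1]. Thus,
$$p_Y[1] = \sum_{j=0,\ j \text{ even}}^{\infty} p_X[j] = \sum_{j=0,\ j \text{ even}}^{\infty} \exp(-\lambda) \frac{\lambda^j}{j!}$$
Using the Taylor expansion of the exponential,
$$\sum_{j=0,\ j \text{ even}}^{\infty} \frac{\lambda^j}{j!} = \frac{1}{2} \sum_{j=0}^{\infty} \frac{\lambda^j}{j!} + \frac{1}{2} \sum_{j=0}^{\infty} \frac{(-\lambda)^j}{j!} = \frac{1}{2} \exp(\lambda) + \frac{1}{2} \exp(-\lambda)$$
so that
$$p_Y[1] = \exp(-\lambda) \left[ \frac{1}{2} \exp(\lambda) + \frac{1}{2} \exp(-\lambda) \right] = \frac{1}{2} \left( 1 + \exp(-2\lambda) \right)$$
$$p_Y[-1] = 1 - p_Y[1] = \frac{1}{2} \left( 1 - \exp(-2\lambda) \right)$$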
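The closed form for the probability that a Poisson RV is even can be checked numerically by summing the Poisson PMF over even j. The sketch below is an illustration only: the function name `prob_poisson_even`, the truncation at 200 terms, and the test value λ = 2 are assumptions.

```python
import math

def prob_poisson_even(lam, terms=200):
    """Sum exp(-lam) lam^j / j! over even j = 0, 2, 4, ... (truncated series)."""
    total = 0.0
    term = math.exp(-lam)        # value of the PMF at j = 0
    for j in range(terms):
        if j % 2 == 0:
            total += term
        term *= lam / (j + 1)    # update the PMF from j to j + 1
    return total

lam = 2.0
series_value = prob_poisson_even(lam)
closed_form = 0.5 * (1.0 + math.exp(-2.0 * lam))
print(series_value, closed_form)
```

The two printed numbers should agree to within floating-point rounding, since the neglected tail of the series is negligible for moderate λ.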
Cumulative Distribution Function
• An alternative means of summarizing the probabilities of a discrete random variable is the cumulative distribution function (CDF).
• The CDF for a RV X is defined as
$$F_X(x) = P[X \le x], \quad -\infty < x < \infty$$
• Note that the value X = x is included in the interval.
• Example: if X ~ Ber(p), then the PMF and the corresponding CDF are
[Figure: PMF and CDF of Ber(p) for p = 0.25]
Example: CDF for geometric random variable
• Since $p_X[k] = (1 - p)^{k-1} p$ for k = 1,2,…, we have the CDF
$$F_X(x) = \begin{cases} 0, & x < 1 \\ \sum_{i=1}^{[x]} (1 - p)^{i-1} p, & x \ge 1 \end{cases}$$
• where [x] denotes the largest integer not exceeding x.
• The PMF can be recovered from the CDF by $p_X[x] = F_X(x^+) - F_X(x^-)$.
Example: CDF for geometric random variable
• The PMF and CDF are equivalent descriptions of the probability and can be used to find the probability of X being in an interval.
• To determine P[3/2 < X ≤ 7/2] for a geometric RV we have
$$P\left[\frac{3}{2} < X \le \frac{7}{2}\right] = p_X[2] + p_X[3] = F_X\left(\frac{7}{2}\right) - F_X\left(\frac{3}{2}\right)$$
• In general, the intervals (a,b], (a,b), [a,b), and [a,b] will have different probabilities. For example, with p = 0.5,
$$P[2 < X \le 3] = F_X(3^+) - F_X(2^+) = p_X[3] = (1 - p)^2 p = 0.125$$
• but
$$P[2 \le X \le 3] = F_X(3^+) - F_X(2^-) = (1 - p)p + (1 - p)^2 p = 0.375$$
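These interval calculations can be reproduced with a short Python sketch. It assumes p = 0.5, which is the value consistent with the numbers 0.125 and 0.375 above. The function names are illustrative, and the left limit of the CDF at 2 is obtained by subtracting the PMF value there.

```python
import math

def geom_pmf(k, p):
    """Geometric PMF (1 - p)^(k - 1) p for k = 1, 2, ..."""
    return (1 - p)**(k - 1) * p if k >= 1 else 0.0

def geom_cdf(x, p):
    """F_X(x) = sum of the PMF for i = 1, ..., [x]; zero for x < 1."""
    if x < 1:
        return 0.0
    return sum(geom_pmf(i, p) for i in range(1, math.floor(x) + 1))

p = 0.5  # assumed value, consistent with the slide's 0.125 and 0.375

# P[3/2 < X <= 7/2] = F(7/2) - F(3/2) = p_X[2] + p_X[3]
prob_a = geom_cdf(3.5, p) - geom_cdf(1.5, p)

# P[2 < X <= 3] = F(3) - F(2): the point X = 2 is excluded.
prob_open_left = geom_cdf(3, p) - geom_cdf(2, p)

# P[2 <= X <= 3] = F(3) - F(2-), where F(2-) = F(2) - p_X[2].
prob_closed = geom_cdf(3, p) - (geom_cdf(2, p) - geom_pmf(2, p))

print(prob_a, prob_open_left, prob_closed)
```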
Properties of CDF
• Property 3. CDF is between 0 and 1
$$0 \le F_X(x) \le 1, \quad -\infty < x < \infty$$
• Proof: Since by definition FX(x) = P[X ≤ x] is a probability for all x, it must lie between 0 and 1.
• Property 4. Limits of CDF as x → -∞ and x → ∞
$$\lim_{x \to -\infty} F_X(x) = 0, \qquad \lim_{x \to +\infty} F_X(x) = 1$$
• Proof:
$$\lim_{x \to -\infty} F_X(x) = P[\{s : X(s) < -\infty\}] = P[\emptyset] = 0$$
• since the values that X(s) can take on do not include -∞. Also
$$\lim_{x \to +\infty} F_X(x) = P[\{s : X(s) < +\infty\}] = P[S] = 1$$
• since the values that X(s) can take on are all included on the real line.
Properties of CDF
• Property 5. CDF is monotonically increasing. A monotonically increasing function g(.) is one in which for every x1 and x2 with x1 ≤ x2, it follows that g(x1) ≤ g(x2).
• Proof: For x1 ≤ x2,
$$\begin{aligned} F_X(x_2) &= P[X \le x_2] && \text{(definition)} \\ &= P[(X \le x_1) \cup (x_1 < X \le x_2)] \\ &= P[X \le x_1] + P[x_1 < X \le x_2] && \text{(Axiom 3)} \\ &= F_X(x_1) + P[x_1 < X \le x_2] \ge F_X(x_1) && \text{(definition and Axiom 1)} \end{aligned}$$
• Property 6. CDF is right-continuous
• By right-continuous is meant that, as we approach the point x0 from the right, the limiting value of the CDF should be the value of the CDF at that point:
$$\lim_{x \to x_0^+} F_X(x) = F_X(x_0)$$
Properties of CDF
• Property 7. Probability of interval found using the CDF
$$P[a < X \le b] = F_X(b) - F_X(a)$$
• Proof: Since a < b
$$\{-\infty < X \le b\} = \{-\infty < X \le a\} \cup \{a < X \le b\}$$
and the intervals on the right-hand side are disjoint, by Axiom 3
$$P[-\infty < X \le b] = P[-\infty < X \le a] + P[a < X \le b]$$
Or, rearranging the terms, we have that
$$P[a < X \le b] = P[-\infty < X \le b] - P[-\infty < X \le a] = F_X(b) - F_X(a)$$
Computer Simulations
• Assume that X can take on values SX = {1,2,3} with PMF
$$p_X[x] = \begin{cases} p_1 = 0.2, & x = x_1 = 1 \\ p_2 = 0.6, & x = x_2 = 2 \\ p_3 = 0.2, & x = x_3 = 3 \end{cases}$$
Computer Simulations
• u is a random variable whose values are equally likely to fall anywhere within the interval (0,1). It is called a uniform RV.
• To estimate the PMF pX[k] = P[X = k], a relative frequency interpretation of probability yields
$$\hat{p}_X[k] = \frac{\text{Number of outcomes equal to } k}{M}, \quad k = 1, 2, 3$$
• The CDF is estimated for all x via
$$\hat{F}_X(x) = \frac{\text{Number of outcomes} \le x}{M}$$
or equivalently
$$\hat{F}_X(x) = \sum_{\{k : k \le x\}} \hat{p}_X[k]$$
[Figures: estimated PMF and estimated CDF for M = 100]
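The relative-frequency estimators can be sketched in Python as follows. To keep the example short, the M = 100 outcomes are drawn with the standard library's `random.choices` as a stand-in generator, which is an assumption and not the lecture's method. The function names are likewise illustrative.

```python
import random
from collections import Counter

def estimate_pmf(samples, support):
    """Relative frequency of each value k: (number of outcomes equal to k) / M."""
    M = len(samples)
    counts = Counter(samples)
    return {k: counts[k] / M for k in support}

def estimate_cdf(samples, x):
    """Relative frequency of outcomes not exceeding x: (number of outcomes <= x) / M."""
    return sum(1 for s in samples if s <= x) / len(samples)

# Stand-in generator: M = 100 outcomes from the PMF {1: 0.2, 2: 0.6, 3: 0.2}.
rng = random.Random(0)
samples = rng.choices([1, 2, 3], weights=[0.2, 0.6, 0.2], k=100)

pmf_hat = estimate_pmf(samples, [1, 2, 3])
cdf_hat_at_2 = estimate_cdf(samples, 2)
print(pmf_hat, cdf_hat_at_2)
```

With only M = 100 outcomes the estimates will scatter around 0.2, 0.6, and 0.2. Increasing M should tighten them.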
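Computer Simulations
• Note that an inverse CDF is given by
$$x = F_X^{-1}(u) = \begin{cases} 1, & \text{if } 0 < u \le 0.2 \\ 2, & \text{if } 0.2 < u \le 0.8 \\ 3, & \text{if } 0.8 < u < 1 \end{cases}$$
so we choose the value of x according to the interval in which the uniform outcome u falls.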
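A possible Python sketch of this inverse-CDF generator, assuming the uniform outcome u comes from the standard library's `random.random()`. That function returns values in [0, 1), so the rare value u = 0 is simply assigned to x = 1. The names `inverse_cdf` and `generate` are hypothetical.

```python
import random

def inverse_cdf(u):
    """Map a uniform outcome u to x using the intervals of the inverse CDF."""
    if u <= 0.2:
        return 1
    if u <= 0.8:
        return 2
    return 3

def generate(M, seed=None):
    """Generate M realizations of X from M uniform outcomes."""
    rng = random.Random(seed)
    return [inverse_cdf(rng.random()) for _ in range(M)]

realizations = generate(20000, seed=0)
freq = {k: realizations.count(k) / len(realizations) for k in (1, 2, 3)}
print(freq)
```

The relative frequencies should come out close to 0.2, 0.6, and 0.2, because the length of each interval for u equals the probability of the corresponding value of x.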
Exercise
1. Draw a picture depicting a mapping of the outcome of a clock
hand that is randomly rotated with discrete steps 1 to 12.
2. Consider a random experiment for which S = {si : si = -3, -2, -1, 0, 1, 2, 3} and the outcomes are equally likely. If a random variable is defined as X(si) = si^2, find SX and the PMF.
Exercise
1. If X is a geometric RV with p =0.25, what is the probability
that X ≥ 4?
2. Compare the PMFs for Pois(1) and bin(100, 0.01) RVs.
3. Generate realizations of a Pois(1) RV by using a binomial
approximation.
4. Find and plot the CDF of Y = - X if X ~ Ber(1/4).
Homework problems
1. A horizontal bar of negligible weight is loaded with three
weights as shown in Figure. Assuming that the weights are
concentrated at their center locations, plot the total mass of
the bar starting at the left (where x = 0 meters) to any point
on the bar. How does this relate to a PMF and a CDF?
2. Find the PMF if X is a discrete random variable with the CDF
$$F_X(x) = \begin{cases} 0, & x < 0 \\ [x]/5, & 0 \le x \le 5 \\ 1, & x > 5 \end{cases}$$
Homework problems
3. Prove that the function g(x) = exp(x) is a monotonically
increasing function by showing that g(x2) ≥ g(x1) if x2 ≥ x1.
4. Estimate the PMF for a geom(0.25) RV for k = 1, 2, …, 20 using a computer simulation and compare it to the true PMF. Also, estimate the CDF from your computer simulation.