CPSC 531
Systems Modeling and Simulation
Discrete Random Variables
Dr. Anirban Mahanti
Department of Computer Science
University of Calgary
[email protected]
Random Variables
A random variable is a real-valued mapping that
assigns a numerical value to each possible
outcome of an experiment.
Formally, a random variable X on a sample
space S is a function X : S → ℜ that assigns
to each s ∈ S a real number X(s).
Notation: upper-case characters (e.g., X, Y, Z)
denote random variables; lower-case characters
(e.g., x, y, z) denote values attained by a
random variable.
Discrete Random Variable
2
Examples
Consider arrival of packets at a router. Let X be
the number of packets that arrive per unit time. X
is a random variable that can take the values
{0,1,2,…}.
Consider rolling a pair of dice. Let X equal the
sum of the dice on a roll. If we think of the
sample points as a pair (i, j), where i = value
rolled by the first die and j = value rolled by the
second die, we have X(s) = i + j. X is a random
variable that can take any value between 2 and
12.
Another Example
Consider tossing two fair coins (i.e., one toss followed
by another). Define a random variable X as the number
of heads seen in the experiment. For this experiment:
The sample space S = {(H,H), (H, T), (T,H), (T, T)}
The mapping of s ∈ S to a real number X(s) is as
follows:
s        X(s)
(H,H)    2
(H,T)    1
(T,H)    1
(T,T)    0
Event Space of Random Variables
A random variable partitions its sample space into
mutually exclusive and collectively exhaustive events.
Define A_x = {X = x} to be the subset of S consisting of all
sample points s to which the random variable X assigns
the value x:
A_x = {s ∈ S | X(s) = x}
Clearly, A_x satisfies the following properties:
A_x ∩ A_y = ∅, ∀x ≠ y
∪_{x ∈ ℜ} A_x = S
Event Space - An Example
Consider again the coin-tossing experiment.
Enumerate the events A_x induced by the
random variable X:
A_0 = {s ∈ S | X(s) = 0} = {(T,T)}
A_1 = {s ∈ S | X(s) = 1} = {(H,T), (T,H)}
A_2 = {s ∈ S | X(s) = 2} = {(H,H)}
Discrete Random Variables and PMF
A random variable X is said to be discrete if the set of
possible values of X is finite or countably infinite.
Discrete random variables are characterized by the
probabilities of the values they attain. These
probabilities are referred to as the Probability Mass
Function (PMF) of X. Mathematically, we define the PMF as:
p_X(x) = P(X = x) = P({s | X(s) = x}) = Σ_{s : X(s) = x} P(s)
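As a concrete illustration of this definition, the PMF of the two-coin example can be computed directly from the sample points. The following is a minimal Python sketch; the sample space and the 1/4 probabilities follow the earlier slides.

```python
from fractions import Fraction
from itertools import product
from collections import defaultdict

# Sample space for two fair coin tosses; each point s has P(s) = 1/4
sample_space = list(product("HT", repeat=2))
P = {s: Fraction(1, 4) for s in sample_space}

def X(s):
    # X(s) = number of heads in the outcome s
    return s.count("H")

# p_X(x) = sum of P(s) over all sample points s with X(s) = x
pmf = defaultdict(Fraction)
for s in sample_space:
    pmf[X(s)] += P[s]

print(dict(pmf))  # {2: Fraction(1, 4), 1: Fraction(1, 2), 0: Fraction(1, 4)}
```

The result matches the table on the earlier slide: p_X(0) = 1/4, p_X(1) = 1/2, p_X(2) = 1/4.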
Properties of PMF
0 ≤ p_X(x) ≤ 1, ∀x ∈ ℜ; this follows from the axioms of
probability.
Σ_{x ∈ ℜ} p_X(x) = 1; this is true because the random variable X
assigns some value x ∈ ℜ to each sample point s ∈ S.
A discrete random variable can take finitely or countably
infinitely many different values, say x1, x2, …. Thus, the
condition above can be restated as follows:
Σ_i p_X(x_i) = 1
Terminology: the Probability Mass Function is also referred
to as the discrete density function.
Cumulative Distribution Function
The PMF is defined for a specific value x of the random
variable. How can we compute the probability of a set A ⊂ ℜ?
Since the events {X = x_i} are mutually exclusive, compute
P({s | X(s) ∈ A}) = Σ_{x_i ∈ A} P({s | X(s) = x_i})
Write the above as P({s | X(s) ∈ A}) = P(X ∈ A);
if A = (a, b), we write P(X ∈ A) as P(a < X < b).
The cumulative distribution function (CDF) of a random
variable X is a function FX(t), −∞ < t < ∞, defined as:
F_X(t) = P(−∞ < X ≤ t) = P(X ≤ t) = Σ_{x ≤ t} p_X(x)
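For the two-coin example, the CDF can be sketched by summing the PMF over all values x ≤ t. The helper below is hypothetical, written only to illustrate the definition above.

```python
# PMF of the number of heads in two fair coin tosses
pmf = {0: 0.25, 1: 0.5, 2: 0.25}

def cdf(t):
    # F_X(t) = sum of p_X(x) over all x <= t
    return sum(p for x, p in pmf.items() if x <= t)

# The CDF is a step function, flat between the values of X
print(cdf(-1), cdf(0), cdf(0.5), cdf(1), cdf(2))  # 0 0.25 0.25 0.75 1.0
```

Note the jumps at x = 0, 1, 2 and the flat segments in between, which is exactly the step-function shape of a discrete CDF.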
Properties of CDF
0 ≤ F(x) ≤ 1, ∀ −∞ < x < ∞.
F(x) is a monotone nondecreasing function of x:
if x1 ≤ x2, then F(x1) ≤ F(x2).
lim_{x→∞} F(x) = 1 and lim_{x→−∞} F(x) = 0.
Note: In the above, the random variable is not explicitly
specified. If we are talking about two random variables,
say X and Y at the same time, then we should explicitly
specify them in the CDFs, e.g., FX(t) and FY(t).
Terminology: Cumulative Distribution Function may be
called Probability Distribution Function or simply a
Distribution Function.
Expectation
The PMF of a random variable provides several
numbers: one probability for each possible value.
The expectation (or mean) of X is one way to
summarize the information provided by the PMF.
Definition: E[X] = Σ_x x p_X(x) = μ,
the weighted average of the possible values of X.
Properties of Mean
E[cX] = c E[X]
E[Σ_{i=1}^n c_i X_i] = Σ_{i=1}^n c_i E[X_i]
• the c_i's are constants
• holds even if the X_i's are not independent
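Linearity of expectation can be checked numerically even when the X_i are dependent. A small sketch follows; the die-roll variables are hypothetical, chosen only so that the dependence between X1 and X2 is obvious.

```python
from fractions import Fraction

outcomes = range(1, 7)        # one fair die roll
P = Fraction(1, 6)            # probability of each outcome

def E(f):
    # Expectation of f over the sample space
    return sum(f(s) * P for s in outcomes)

X1 = lambda s: s              # face value
X2 = lambda s: s % 2          # 1 if odd: clearly dependent on X1

# E[2*X1 + 3*X2] equals 2*E[X1] + 3*E[X2] despite the dependence
lhs = E(lambda s: 2 * X1(s) + 3 * X2(s))
rhs = 2 * E(X1) + 3 * E(X2)
print(lhs, rhs)  # 17/2 17/2
```

Here E[X1] = 7/2 and E[X2] = 1/2, so both sides equal 17/2, with no independence assumption anywhere.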
Variance
The variance of a random variable X, denoted as
Var(X) or σ², is defined as the expected value of
(X − E[X])²:
Var(X) = E[(X − E[X])²]
Var(X) can also be calculated as:
Var(X) = E[X²] − (E[X])²
where E[X²] is the second moment of X.
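The two formulas can be checked against each other on a small example; this sketch uses a fair die, which is an illustrative choice not taken from the slides.

```python
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}   # fair six-sided die

mean = sum(x * p for x, p in pmf.items())                      # E[X] = 7/2
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())     # E[(X - E[X])^2]
var_alt = sum(x * x * p for x, p in pmf.items()) - mean ** 2   # E[X^2] - (E[X])^2

print(var_def, var_alt)  # 35/12 35/12
```

Both expressions give 35/12, as the identity Var(X) = E[X²] − (E[X])² requires.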
Properties of Variance
Variance is a measure of the dispersion of a
random variable about its mean.
Var(X) ≥ 0
Var(cX) = c² Var(X)
Var(Σ_{i=1}^n X_i) = Σ_{i=1}^n Var(X_i),
if the X_i's are independent
Common Discrete Random Variables
Bernoulli Random Variable
Binomial Random Variable
Geometric Random Variable
Poisson Random Variable
Bernoulli Random Variable
Consider an experiment whose outcome can be
either success or failure. If X is a random
variable that characterizes this experiment, we
can say X = 1 for success, and X = 0 for failure.
The PMF for this random variable is given by:
pX(1) = P({X = 1}) = p
pX(0) = P({X = 0}) = 1 – p = q
where p is the probability of success of an
experiment.
Bernoulli Distribution
The CDF of a Bernoulli random variable X with
parameter p = 1 − q is given by:
F(x) = 0,  x < 0
F(x) = q,  0 ≤ x < 1
F(x) = 1,  x ≥ 1
[Figure 1: CDF of a Bernoulli random variable]
Mean and variance:
E[X] = p
Var(X) = p(1 − p)
Binomial Random Variable
Consider n Bernoulli trials, where each trial can result
in a success with probability p. The number of
successes X in such an n-trial sequence is a binomial
random variable.
The PMF for this random variable is given by:
p_X(k) = P(X = k) = C(n, k) p^k (1 − p)^(n−k),  k = 0, 1, 2, …, n
p_X(k) = 0,  otherwise
where p is the probability of success of a Bernoulli trial.
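The PMF can be evaluated directly with `math.comb`. A sketch that also checks two basic facts, that the PMF sums to 1 and that the mean is np (the parameter values are illustrative):

```python
from math import comb

def binom_pmf(k, n, p):
    # P(X = k) = C(n, k) * p^k * (1 - p)^(n - k); zero outside 0..n
    if not 0 <= k <= n:
        return 0.0
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 10, 0.5
total = sum(binom_pmf(k, n, p) for k in range(n + 1))
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
print(total, mean)  # 1.0 5.0
```

With n = 10 and p = 1/2 the mean np = 5 falls in the middle of the range 0..10, matching the symmetric PMF shown in the figure on the next slide.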
Binomial PMF
[Figure: Binomial PMF for n = 10 with p = 1/4, 1/2, and 3/4; x-axis: number of successes k (0–10), y-axis: P(X = k).]
Binomial PMF (large n, small p)
[Figure: Binomial PMF for n = 100, p = 0.02; x-axis: number of successes k (0–20), y-axis: P(X = k).]
CDF of Binomial R.V.
F_X(t) = Σ_{i=0}^{⌊t⌋} C(n, i) p^i (1 − p)^(n−i),  t ≥ 0
How did we compute p_X(k)?
From the independent-trial assumption, we know
p^k (1 − p)^(n−k) is the probability of any particular sequence
of outcomes that results in k successes. There are
C(n, k) such sequences.
Hence, we calculate p_X(k) as shown above.
Mean and variance:
E[X] = np
Var(X) = np(1 − p)
Geometric Random Variable
The number of Bernoulli trials, X, until the first success is a
geometric random variable.
The PMF is given as:
p_X(k) = (1 − p)^(k−1) p,  k = 1, 2, …
p_X(k) = 0,  otherwise
The CDF is given as:
F_X(t) = Σ_{i=1}^{⌊t⌋} (1 − p)^(i−1) p = 1 − (1 − p)^⌊t⌋,  t ≥ 0
Mean and variance:
E[X] = 1/p
Var(X) = (1 − p)/p²
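A quick numerical sanity check of the geometric PMF, the closed-form CDF, and the mean; p = 0.3 is an arbitrary choice for illustration.

```python
p = 0.3   # arbitrary success probability for the check

def geom_pmf(k):
    # P(X = k) = (1 - p)^(k - 1) * p for k >= 1
    return (1 - p) ** (k - 1) * p if k >= 1 else 0.0

def cdf_sum(t):
    # Partial sum of the PMF up to floor(t)
    return sum(geom_pmf(i) for i in range(1, int(t) + 1))

def cdf_closed(t):
    # Closed form from the slide: 1 - (1 - p)^floor(t)
    return 1 - (1 - p) ** int(t)

# The two CDF expressions agree, and the truncated mean converges to 1/p
mean = sum(k * geom_pmf(k) for k in range(1, 2000))
print(abs(cdf_sum(7) - cdf_closed(7)) < 1e-12, abs(mean - 1 / p) < 1e-9)
```

The geometric series telescopes, which is exactly why the partial sum of the PMF collapses to the closed form 1 − (1 − p)^⌊t⌋.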
Geometric PMF
[Figure: Geometric PMF for p = 0.5; x-axis: number of trials until first success k (1–10), y-axis: P(X = k).]
Example: Modeling Packet Loss
A geometric r.v. gives the number of trials required to get the
first success.
It is easy to see that p_X(k) = (1 − p)^(k−1) p, k = 1, 2, …,
where p is the probability of success of a trial.
Modeling packet losses seen at a router:
We can model losses using a Bernoulli process
{Y0, Y1, Y2, …}, where Yi represents a Bernoulli trial for packet
number i.
We can say:
P{Yi = 1} = p (i.e., a packet loss)
P{Yi = 0} = 1 − p (i.e., no loss)
So the number of transmissions up to and including the first
loss, X (i.e., X − 1 successful transmissions followed by a
loss), is geometrically distributed:
P(X = n) = p (1 − p)^(n−1),  n = 1, 2, … (the "good length" distribution)
Example: Modeling Packet Loss (…)
Model
Suppose each bit transmitted through a channel is received
correctly with probability 1 − p and corrupted with probability p.
Each transmission is an independent Bernoulli experiment.
Assume p is constant over time.
[Diagram: sender S transmits PKT to receiver R; R returns an ACK. A timer times out if no ACK is received.]
Assume each packet has length l bits.
Questions
1) How many trials do we need to successfully deliver a packet?
2) How does (1) depend on the channel BER (p)?
Example: Modeling Packet Loss (…)
P(no error in transmission of packet) = (1 − p)^l = q
P1, P2, P3, … are packet transmission trials.
X = number of trials needed to successfully transmit a
packet.
⇒ X is geometrically distributed with success probability q
⇒ E[X] = 1/q
As p → 1, q → 0 ⇒ E[X] → ∞,
which coincides well with intuition.
Also, for fixed p, as l → ∞, q → 0 ⇒ E[X] → ∞.
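The dependence of E[X] = 1/q on the bit error rate p and packet length l can be sketched numerically; the specific BER values below are illustrative, not from the slides.

```python
def expected_transmissions(p, l):
    # E[X] = 1/q, where q = (1 - p)^l is the probability that
    # all l bits of the packet arrive uncorrupted
    q = (1 - p) ** l
    return 1 / q

# Expected number of trials grows quickly with the BER p (l = 1000 bits)
for p in (1e-5, 1e-4, 1e-3):
    print(p, round(expected_transmissions(p, 1000), 3))
```

For a 1000-bit packet, raising the BER from 10^-5 to 10^-3 pushes the expected number of transmissions from just over 1 to nearly 3, illustrating the exponential sensitivity to l·p.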
Poisson Random Variable
A discrete random variable X that takes only nonnegative
integer values is said to be Poisson with
parameter λ > 0 if X has the following PMF:
p_X(k) = e^(−λ) λ^k / k!,  k = 0, 1, 2, …
p_X(k) = 0,  otherwise
The Poisson PMF with parameter λ is a good approximation
of the Binomial PMF with parameters n and p, provided
λ = np, n is very large, and p is very small.
Poisson Random Variable (…)
Note that
Σ_{k=0}^∞ z^k / k! = e^z,  for any real or complex number z
⇒ Σ_{k=0}^∞ p_X(k) = e^(−λ) e^λ = 1
Poisson PMF
[Figure: Poisson PMF for λ = 0.5, 1, and 5; x-axis: number of events k (0–10), y-axis: P(X = k).]
Poisson Approximation to Binomial
[Figure: side-by-side PMFs of the Binomial distribution (n = 100, p = 0.02) and the Poisson distribution (λ = 2) for k = 0–10.]
A Binomial distribution with large n and small p can be approximated by a
Poisson distribution with λ = np.
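This approximation is easy to verify numerically. A sketch comparing the two PMFs for the parameters used in the figure:

```python
from math import comb, exp, factorial

n, p = 100, 0.02
lam = n * p   # λ = 2

def binom(k):
    # Exact Binomial PMF for parameters n, p
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson(k):
    # Poisson PMF with rate λ = np
    return exp(-lam) * lam ** k / factorial(k)

# Largest pointwise gap between the two PMFs over k = 0..10
gap = max(abs(binom(k) - poisson(k)) for k in range(11))
print(gap < 0.01)  # True
```

The largest pointwise difference is only a few thousandths, which is why the two panels in the figure look nearly identical.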
Poisson Random Variable (cont.)
CDF of a Poisson random variable:
F_X(t) = Σ_{k=0}^{⌊t⌋} e^(−λ) λ^k / k!,  t ≥ 0
Mean and variance:
E[X] = λ
Var(X) = λ
Consider N independent Poisson random variables X_i,
i = 1, 2, …, N, with parameters λ_i. Then X = X_1 + X_2 + … + X_N is
also a Poisson r.v., with parameter λ = λ_1 + λ_2 + … + λ_N.
Deriving the Mean of Poisson R.V.
A Poisson r.v. has PMF: p_X(k) = e^(−λ) λ^k / k!,  k = 0, 1, 2, …
E[X] can be calculated as:
E[X] = Σ_{k=0}^∞ k e^(−λ) λ^k / k!
     = Σ_{k=1}^∞ k e^(−λ) λ^k / k!          (the k = 0 term vanishes)
     = λ Σ_{k=1}^∞ e^(−λ) λ^(k−1) / (k−1)!
     = λ Σ_{m=0}^∞ e^(−λ) λ^m / m!          (substituting m = k − 1)
     = λ Σ_{m=0}^∞ p_X(m)
     = λ · 1 = λ                            (the sum is 1, from the axioms of probability)
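The result E[X] = λ (and likewise Var(X) = λ from the earlier slide) can be confirmed numerically by truncating the series; λ = 5 below is an arbitrary choice for the check.

```python
from math import exp, factorial

lam = 5.0   # arbitrary rate for the check

def pois(k):
    # Poisson PMF: e^(-λ) λ^k / k!
    return exp(-lam) * lam ** k / factorial(k)

# Truncate the infinite sums well past the mass of the distribution
mean = sum(k * pois(k) for k in range(100))
var = sum((k - mean) ** 2 * pois(k) for k in range(100))
print(round(mean, 6), round(var, 6))  # 5.0 5.0
```

One hundred terms are far more than enough here: for λ = 5 the probability mass beyond k = 100 is astronomically small.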
Example: Job arrivals
Consider modeling the number of job arrivals at a
shop in an interval (0, t].
Let λ be the rate of arrival of jobs.
In an interval Δt → 0:
P{one arrival in Δt} = λ Δt
P{two or more arrivals in Δt} is negligible
Divide the interval (0, t] into n subintervals of
equal length.
Assume arrivals of jobs in each subinterval are
independent of arrivals in other subintervals.
Example: Job arrivals (…)
If n → ∞, the interval can be viewed as a sequence of
Bernoulli trials with
p = λ Δt = λt / n
The number of successes k in n trials is given by the
Binomial PMF:
p_X(k) = C(n, k) p^k (1 − p)^(n−k)
Example: Job arrivals (…)
Substitute p = λt/n to get:
C(n, k) (λt/n)^k (1 − λt/n)^(n−k),  k = 0, 1, …, n
Letting n → ∞, the above reduces to:
e^(−λt) (λt)^k / k!,  k = 0, 1, 2, …
Setting t = 1, the probability of k events in the time interval (0, 1] is:
e^(−λ) λ^k / k!
which is the Poisson distribution.
Example: ALOHA Protocol
The ALOHA protocol was developed in the 1970s
at the University of Hawaii.
It is a Medium Access Control (MAC) layer
protocol developed for sharing wireless channels.
The ALOHA protocol is designed to allow
multiple users to share a single
communication channel.
Example: ALOHA - Basic Idea
Very, very simple…
Let users transmit whenever they have data to
be sent
If two or more users send data at the same time,
a collision occurs and the packets are destroyed
Upon collision, the sender waits for a random
amount of time and resends the packet
Modeling question: What is the throughput of an
ALOHA channel?
Example: ALOHA - Model
Infinite population of users generating N frames per
frame time
Frame time: the amount of time required to transmit a fixed-length
frame
0 < N < 1
"Poisson model": the Poisson distribution predicts the number of
events that occur in a time period
Question: what is the rate of generation of frames, G?
Frames generated include new + retransmitted frames
At low loads, G ≈ N
At high loads, G > N
Example: ALOHA - Model (…)
Let
S = rate of successful frame transmissions
p0 = probability of successful transmission = S/G
Question: What is the vulnerable period?
Answer: 2t, i.e., two frame times: a frame collides with any
other frame whose transmission starts within one frame time
before or after its own start.
Example: ALOHA - Model (…)
Mean # of frames generated in the vulnerable
period 2t is 2G.
Why? E[X] = 2G, where X = X1 + X2.
We assume X1 and X2 are each Poisson distributed with rate
G, so X is also Poisson, with rate 2G.
The probability that no other traffic is initiated in the
vulnerable period is p_X(0) = e^(−2G), by the
Poisson model.
Example: ALOHA - Throughput
p_0 = S/G = e^(−2G) ⇒ S = G e^(−2G) is the throughput
dS/dG = 0 at maximum throughput:
1 · e^(−2G) + G (−2) e^(−2G) = 0
e^(−2G) (1 − 2G) = 0 ⇒ G = 1/2
S_max = (1/2) e^(−2·(1/2)) = 1/(2e) ≈ 0.184
The ALOHA protocol's maximum channel utilization is 18.4%.
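The calculus above can be double-checked numerically; a sketch that scans the throughput S(G) = G·e^(−2G) over a fine grid of G values:

```python
from math import exp

def S(G):
    # Pure ALOHA throughput as a function of the attempt rate G
    return G * exp(-2 * G)

# Scan a fine grid of G values and pick the maximizer
grid = [i / 10000 for i in range(1, 20001)]
G_best = max(grid, key=S)
print(round(G_best, 3), round(S(G_best), 4))  # 0.5 0.1839
```

The grid search lands on G = 1/2 with S_max ≈ 0.1839, matching the analytical result 1/(2e).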
Jointly Distributed Random Variables
Joint PMFs of Multiple R.V.’s
Probabilistic models may involve more than one
random variable.
These random variables are defined on the same
experiment and sample space, and they may have
relationships among them.
Let Z: (X, Y) be defined on sample space S.
For each sample point s in S, the random
variables X and Y take one of their possible values,
e.g., X(s) = x, Y(s) = y.
Z is then a 2-dimensional vector satisfying the
following relationship:
Z : S → ℜ², with Z(s) = z = (x, y)
Joint PMF (...)
The joint PMF of X and Y (or the joint PMF of the random
vector Z) is defined as:
p_Z(z) = P(Z = z) = P(X = x, Y = y)
Properties of this PMF:
p_Z(z) ≥ 0, z ∈ ℜ²
{z | p_Z(z) ≠ 0} is a countable subset of ℜ²
p_X(x) = Σ_y p_{X,Y}(x, y)
p_Y(y) = Σ_x p_{X,Y}(x, y)
p_X(x) and p_Y(y) are the marginal PMFs of X and Y, respectively.
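Marginalization can be sketched on a small joint PMF; the probabilities below are invented purely for illustration.

```python
from fractions import Fraction

# Hypothetical joint PMF of (X, Y); values chosen only for illustration
p_xy = {
    (0, 0): Fraction(1, 8), (0, 1): Fraction(3, 8),
    (1, 0): Fraction(1, 4), (1, 1): Fraction(1, 4),
}

def marginal_x(x):
    # p_X(x) = sum over y of p_{X,Y}(x, y)
    return sum(p for (xi, y), p in p_xy.items() if xi == x)

def marginal_y(y):
    # p_Y(y) = sum over x of p_{X,Y}(x, y)
    return sum(p for (x, yi), p in p_xy.items() if yi == y)

print(marginal_x(0), marginal_y(1))  # 1/2 5/8
```

Each marginal is obtained by summing the joint PMF over the other variable, and each marginal itself sums to 1.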
Conditional PMF
Conditioning On An Event
We look at conditional PMFs given the occurrence of a
certain event, or given the value of another random
variable.
The conditional PMF of a random variable X, conditioned on
an event A with P(A) > 0, is defined as:
p_{X|A}(x) = P(X = x | A) = P({X = x} ∩ A) / P(A)
Calculate p_{X|A}(x) by adding the probabilities of the outcomes
that result in X = x and belong to the conditioning event A,
and then normalize by dividing by P(A).
Example: A Web Surfer
A web surfer will repeatedly attempt to connect
to a Web server, up to a maximum of n times.
Each attempt has probability p of being
successful.
What is the PMF of the number of attempts, given
that the surfer successfully connects to the Web
server?
Example: A Web Surfer (…)
Let A be the event that the web surfer successfully
connects (within at most n attempts).
Let X be the number of attempts needed to establish a
connection, assuming an unlimited number of attempts
could be made.
Clearly, X is a geometric random variable with
parameter p, and A = {X ≤ n}.
P(A) = Σ_{j=1}^n (1 − p)^(j−1) p, and
p_{X|A}(k) = (1 − p)^(k−1) p / Σ_{j=1}^n (1 − p)^(j−1) p,  k = 1, 2, …, n
p_{X|A}(k) = 0,  otherwise
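The truncated-geometric conditional PMF above can be sketched directly; p = 1/3 and n = 4 are arbitrary illustrative values, not from the slides.

```python
from fractions import Fraction

p, n = Fraction(1, 3), 4   # illustrative connection probability and retry cap

def geom(k):
    # Unconditional geometric PMF: (1 - p)^(k - 1) * p
    return (1 - p) ** (k - 1) * p

P_A = sum(geom(j) for j in range(1, n + 1))   # P(A) = P(X <= n)

def pmf_given_A(k):
    # Conditional PMF: the geometric PMF renormalized over k = 1..n
    return geom(k) / P_A if 1 <= k <= n else Fraction(0)

print(P_A, sum(pmf_given_A(k) for k in range(1, n + 1)))  # 65/81 1
```

Conditioning simply rescales the geometric PMF on {1, …, n} by 1/P(A), so the conditional PMF sums to 1 as required.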
Conditioning on Another R.V.
Let X and Y be two random variables associated with
the same experiment. Suppose Y equals y; this
provides some information regarding the value of X.
This information is captured by the conditional PMF:
p_{X|Y}(x | y) = P(X = x, Y = y) / P(Y = y) = p_{X,Y}(x, y) / p_Y(y)
The conditional PMF p_{X|Y}(x | y) satisfies the normalization
property:
Σ_x p_{X|Y}(x | y) = 1