Discrete Distributions
Tieming Ji
Fall 2012
1 / 31
Definition: A random variable is discrete if it can
take a countable number of values.
Example 1: Flip a coin. If head, X = 1; otherwise, X = 0.
Example 2: Roll a die. Variable X = outcome.
X = {1, 2, · · · , 6}.
Example 3: Flip a coin, and stop when you get a tail. Let
X be the number of flips. Then,
the sample space S={T, HT, HHT, HHHT, · · · }, and
the random variable X = {1, 2, 3, 4, · · · }.
2 / 31
Definition: The probability function (or the
probability mass function, or the probability density
function) for a discrete variable X , f (x), is a
function describing the probability that X = x, i.e.
f (x) = P(X = x),
such that
1. f (x) ≥ 0 for every x; and
2. Σ_{all x} f (x) = 1.
3 / 31
Example 1. Flip a coin with probability p to get a head. Let X = 1,
if a head; otherwise X = 0. We have X = {0, 1}, and
f (1) = P(X = 1) = p,
f (0) = P(X = 0) = 1 − p,
f (1) + f (0) = 1.
Example 2. Flip a coin with probability p to get a head. Stop when
you get a tail. Let X denote the number of total flips. We have
X = {1, 2, 3, · · · }, and
f (k) = P(X = k) = p^{k−1} (1 − p), k = 1, 2, · · ·

Σ_{k=1}^∞ f (k) = Σ_{k=1}^∞ P(X = k) = Σ_{k=1}^∞ p^{k−1} (1 − p) = 1.
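The identity above can be checked numerically; a minimal sketch in Python (the head probability p = 0.6 is an arbitrary illustrative value):

```python
# Verify that the geometric probabilities p^(k-1) * (1 - p) sum to 1.
p = 0.6  # assumed head probability, for illustration only
total = sum(p ** (k - 1) * (1 - p) for k in range(1, 1000))
print(round(total, 10))  # prints 1.0; the partial sum over k = 1..999
                         # equals 1 - p^999, which is 1 to machine precision
```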
4 / 31
Theorem: (Convergence of geometric series)

Σ_{k=1}^∞ a r^{k−1} = a / (1 − r), where |r| < 1.
(Note: a is the first term, and r is the ratio.)
Proof:
5 / 31
Theorem: (Sum of first K terms)

Σ_{k=1}^K a r^{k−1} = a (1 − r^K) / (1 − r), where r ≠ 1.

Exercise. Compute Σ_{k=1}^∞ (1/3) (1 − 1/3)^k . Correctly define the first term
a, and the ratio r .
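A quick numerical sanity check for the exercise (a sketch; the key point is that the series starts at k = 1, so the first term a is (1/3)(2/3), not 1/3, and the ratio r is 2/3):

```python
# Numerically sum (1/3) * (1 - 1/3)^k for k = 1, 2, ... and compare with
# the closed form a / (1 - r), taking a = (1/3)(2/3) and r = 2/3.
partial = sum((1 / 3) * (2 / 3) ** k for k in range(1, 200))
closed_form = ((1 / 3) * (2 / 3)) / (1 - 2 / 3)
print(partial, closed_form)  # both are approximately 2/3
```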
6 / 31
Definition: The cumulative distribution function for
random variable X is denoted by F , and defined as
F (x) = P(X ≤ x), for real x.
Example 1. Roll a fair die. Let X = outcome. Compute F (4).
Example 2. Flip a coin with probability p landing a head. Stop when
you get a tail. Let X denote the number of total flips. Compute
F (10).
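Both examples can be checked in a few lines of Python (for Example 2, p = 0.5 is an assumed head probability, chosen only for illustration):

```python
# Example 1: F(4) = P(X <= 4) for a fair die.
F4 = sum(1 / 6 for outcome in range(1, 7) if outcome <= 4)
print(F4)  # 4/6, approximately 0.667

# Example 2: F(10) = P(X <= 10), where f(k) = p^(k-1) * (1 - p).
p = 0.5  # assumed head probability for illustration
F10 = sum(p ** (k - 1) * (1 - p) for k in range(1, 11))
print(F10)  # the finite sum telescopes to 1 - p^10
```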
7 / 31
Definition: The expectation (or mean) of a discrete
random variable X , E (X ), is defined as

E (X ) = Σ_{all x} x f (x).

Extension: The expectation of a function h(X ) of a
discrete random variable X is denoted by E (h(X )),
and computed as

E (h(X )) = Σ_{all x} h(x) f (x).
8 / 31
Example 3.3.1 (on page 53). Let X denote the number of heartbeats per
minute obtained per patient in a hospital.

x      40    60    68    70    72    80    100
f (x)  0.01  0.04  0.05  0.80  0.05  0.04  0.01

What is E (X )?
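The table lends itself to a direct computation; a minimal sketch:

```python
# Expectation of the heartbeat table: E(X) = sum of x * f(x) over all x.
x = [40, 60, 68, 70, 72, 80, 100]
f = [0.01, 0.04, 0.05, 0.80, 0.05, 0.04, 0.01]
EX = sum(xi * fi for xi, fi in zip(x, f))
print(round(EX, 6))  # 70.0
```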
9 / 31
Properties: X and Y are random variables, and c is
a constant. Then,
(1) E (c) = c,
(2) E (cX ) = cE (X ),
(3) E (X + Y ) = E (X ) + E (Y ).
Example 3.3.2 (on page 54). We know E (X ) = 7 and E (Y ) = −5,
compute E (4X − 2Y + 6).
10 / 31
Definition: The variance of a random variable X ,
Var(X ), is defined as

Var(X ) = σ_X^2 = E (X − E (X ))^2 .

Extension 1:

Var(X ) = E (X − E (X ))^2 = E (X^2 ) − (E (X ))^2 .

Extension 2: The variance of a function h(X ) of a
random variable X is

Var(h(X )) = σ_{h(X)}^2 = E (h(X ) − E (h(X )))^2 .
11 / 31
Example 3.3.3 (page 54). Let X and Y denote the number of heartbeats
per minute for two groups of patients, respectively. Compute the
expectations and variances for these two variables.

x      40    60    68    70    72    80    100
f (x)  0.01  0.04  0.05  0.80  0.05  0.04  0.01

y      40    60    68    70    72    80    100
f (y)  0.40  0.05  0.04  0.02  0.04  0.05  0.40
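A sketch of the computation, using Var(X) = E(X^2) − (E(X))^2 for both tables; the instructive point is that the two groups have the same mean but very different spreads:

```python
# Mean and variance for the two heartbeat tables.
vals = [40, 60, 68, 70, 72, 80, 100]
fx = [0.01, 0.04, 0.05, 0.80, 0.05, 0.04, 0.01]
fy = [0.40, 0.05, 0.04, 0.02, 0.04, 0.05, 0.40]

def mean_var(values, probs):
    # E(X) = sum x f(x);  Var(X) = E(X^2) - (E(X))^2
    m = sum(v * p for v, p in zip(values, probs))
    var = sum(v ** 2 * p for v, p in zip(values, probs)) - m ** 2
    return m, var

print(mean_var(vals, fx))  # approximately (70.0, 26.4)
print(mean_var(vals, fy))  # approximately (70.0, 730.32)
```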
12 / 31
Properties: X and Y are random variables, c is a
constant. Then
(1) Var(c) = 0,
(2) Var(cX ) = c 2 Var(X ),
(3) If X and Y are independent, then
Var(X + Y ) = Var(X ) + Var(Y ).
Exercise 3.3.6 (on page 58): X and Y are independent with σX2 = 9 and
σY2 = 3. Compute Var(4X − 2Y + 6).
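A one-line check of the exercise using properties (1)–(3): the constant drops out, and the coefficients enter squared.

```python
# Var(4X - 2Y + 6) for independent X, Y with Var(X) = 9, Var(Y) = 3:
# Var(aX + bY + c) = a^2 Var(X) + b^2 Var(Y), since Var(c) = 0.
var_x, var_y = 9, 3
var = 4 ** 2 * var_x + (-2) ** 2 * var_y
print(var)  # prints 156
```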
13 / 31
Definition: The standard deviation of a random
variable X , σ_X , is computed as σ_X = √(σ_X^2).
14 / 31
Families of Discrete Distributions: Some families of
distributions are frequently used and more
important than others. We summarize a few of the
most useful ones:
Uniform Distribution;
Geometric Distribution;
Bernoulli Distribution;
Binomial Distribution;
Poisson Distribution.
15 / 31
Uniform Distribution
Definition A random variable X follows a discrete
uniform distribution if the probability of X taking
each possible value x is equally likely.
Example. Roll a fair die. Let X = outcome. X can take 6 values, and
f (X = x) = 1/6, where x = 1, · · · , 6.
Compute E(X ) and Var(X ) when X can take n values, i.e.
x = x1 , · · · , xn .
E(X ) = (Σ_{k=1}^n x_k) / n ;

Var(X ) = (Σ_{k=1}^n x_k^2) / n − ((Σ_{k=1}^n x_k) / n)^2 ;

F (x) = #{x_k ≤ x} / n .
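The formulas above, applied to the fair-die example (a sketch; F(3) is an arbitrary point at which to evaluate the cdf):

```python
# Discrete uniform on x_1, ..., x_n, illustrated with a fair die (n = 6).
xs = [1, 2, 3, 4, 5, 6]
n = len(xs)
mean = sum(xs) / n                              # (sum x_k) / n
var = sum(x ** 2 for x in xs) / n - mean ** 2   # (sum x_k^2)/n - mean^2
F3 = sum(1 for x in xs if x <= 3) / n           # F(3) = #{x_k <= 3} / n
print(mean, round(var, 4), F3)  # 3.5, about 2.9167, 0.5
```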
16 / 31
Geometric Distribution
Definition A random variable X follows a geometric
distribution with parameter p if its probability
density function f is given by
f (x) = (1 − p)^{x−1} p, 0 < p < 1, x = 1, 2, · · · .
Example. Flip a coin with probability p landing a head. Stop when
getting a head. Let X denote the total number of flips.
With parameter p,
E(X ) = 1/p ;
Var(X ) = (1 − p) / p^2 ;
F (x) = 1 − (1 − p)^{[x]} , where [x] is the largest integer less than or
equal to x (the floor of x).
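The mean and variance formulas can be checked directly against the pmf; a sketch, truncating the infinite sums (p = 0.3 is an arbitrary illustrative value):

```python
# Check E(X) = 1/p and Var(X) = (1 - p)/p^2 against the geometric pmf.
p = 0.3  # assumed success probability, for illustration only
ks = range(1, 2000)  # truncation; the omitted tail is negligible
mean = sum(k * (1 - p) ** (k - 1) * p for k in ks)
ex2 = sum(k ** 2 * (1 - p) ** (k - 1) * p for k in ks)
var = ex2 - mean ** 2
print(round(mean, 4), round(var, 4))  # close to 1/p and (1 - p)/p^2
```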
17 / 31
Bernoulli Distribution
Definition A random variable X follows a Bernoulli
distribution with parameter p if it takes value 1
(success) with probability p, and value 0 (failure)
with probability (1 − p).
Example. Flip a coin. If a head occurs, let X =1; otherwise, X =0.
With parameter p,
E(X ) = p;
Var(X ) = p(1 − p);
F (x) = 0,      if x < 0;
F (x) = 1 − p,  if 0 ≤ x < 1;
F (x) = 1,      if 1 ≤ x.
18 / 31
Binomial Distribution
Definition A random variable X follows a binomial
distribution with parameters n and p if its density
function f is given by

f (x) = C(n, x) p^x (1 − p)^{n−x} ,

x = 0, 1, 2, · · · , n, 0 < p < 1, where n is a positive
integer and C(n, x) = n!/(x! (n − x)!) is the binomial
coefficient.
Example. Flip a coin n times with probability p to get a head each time.
Let X denote the total number of heads in the n flips.
19 / 31
Binomial Distribution (continued)
With parameters n and p, the binomial distribution has
E(X ) = np ;
Var(X ) = np(1 − p) ;
F (t) = Σ_{x=0}^{[t]} C(n, x) p^x (1 − p)^{n−x} .
20 / 31
Binomial Distribution (continued)
Example 3.5.1 (on page 67) In a study on air traffic controllers, let X
denote the number of radar signals correctly identified in a 30-minute
time span in which 10 signals arrive. The probability of correctly
identifying a signal that arrives at random is 0.5.

What is the density function of X ?

f (x) = C(10, x) 0.5^x (1 − 0.5)^{10−x} , where x = 0, · · · , 10.

Calculate the mean and standard deviation of X .

E(X ) = 10 × 0.5 = 5, σ_X = √(σ_X^2) = √(10 × 0.5 × 0.5) ≈ 1.58.
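The density and moments for this example can be reproduced in a few lines; a sketch using the pmf directly:

```python
from math import comb, sqrt

# Binomial(10, 0.5) pmf for the radar-signal example, plus mean and sd
# computed from the pmf rather than from the np and np(1-p) shortcuts.
n, p = 10, 0.5
f = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]
mean = sum(x * fx for x, fx in enumerate(f))
sd = sqrt(sum(x ** 2 * fx for x, fx in enumerate(f)) - mean ** 2)
print(round(mean, 6), round(sd, 2))  # 5.0 and 1.58
```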
21 / 31
Binomial Distribution (continued)
What is the probability that at most 7 signals will be identified
correctly?

F (7) = 1 − f (10) − f (9) − f (8)
      = 1 − C(10, 10) 0.5^{10} (1 − 0.5)^{10−10}
          − C(10, 9) 0.5^9 (1 − 0.5)^{10−9}
          − C(10, 8) 0.5^8 (1 − 0.5)^{10−8}
      ≈ 0.945

What is the probability that 2 ≤ X ≤ 7?

P(2 ≤ X ≤ 7) = F (7) − F (1)
             = F (7) − f (0) − f (1)
             ≈ 0.945 − C(10, 0) 0.5^0 (1 − 0.5)^{10−0}
                     − C(10, 1) 0.5^1 (1 − 0.5)^{10−1}
             ≈ 0.935
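Both tail probabilities can be verified by summing the pmf directly; a minimal sketch:

```python
from math import comb

# P(X <= 7) and P(2 <= X <= 7) for X ~ Binomial(10, 0.5).
n, p = 10, 0.5

def f(x):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

F7 = sum(f(x) for x in range(8))        # F(7) = f(0) + ... + f(7)
print(round(F7, 3))                     # 0.945
print(round(F7 - f(0) - f(1), 3))       # 0.935
```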
22 / 31
Poisson Distribution
Definition A random variable X follows a Poisson
distribution with parameter λ if its density function
f is given by

f (x) = e^{−λ} λ^x / x! , x = 0, 1, 2, · · · .

Theorem For any real number z, we have

e^z = 1 + z + z^2/2! + z^3/3! + z^4/4! + · · · .

This is the Maclaurin series for e^z .
23 / 31
Poisson Distribution (continued)
With parameter λ, a Poisson distribution has
E(X ) = λ ;
Var(X ) = λ ;
F (t) = Σ_{x=0}^{[t]} f (x) = Σ_{x=0}^{[t]} e^{−λ} λ^x / x! .

A random variable follows a Poisson distribution in many
waiting-for-occurrence applications. Generally speaking, X follows a
Poisson distribution in the following cases:
X denotes the number of car accidents in a month.
X denotes the number of customers coming for service in 5 minutes.
X denotes the number of incoming calls in a period of time.
24 / 31
Poisson Distribution (continued)
Example. Consider a telephone operator who, on average, handles 5
calls every 3 minutes. Model the number of calls in a minute by a
Poisson distribution. What is the probability that there will be no calls in
the next minute? At least two calls?

Solution:
Define X as the number of calls in a minute. Then E(X ) = λ = 5/3. So

P(no calls in the next minute) = P(X = 0)
                               = e^{−5/3} (5/3)^0 / 0!
                               = e^{−5/3} ≈ 0.189;

P(at least two calls in the next minute) = P(X ≥ 2)
                                         = 1 − f (0) − f (1)
                                         = 1 − 0.189 − e^{−5/3} (5/3)^1 / 1!
                                         ≈ 0.496.
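The two probabilities can be reproduced directly from the Poisson pmf; a sketch:

```python
from math import exp, factorial

# Calls per minute modeled as Poisson with lambda = 5/3.
lam = 5 / 3

def f(x):
    return exp(-lam) * lam ** x / factorial(x)

print(round(f(0), 3))             # P(X = 0): 0.189
print(round(1 - f(0) - f(1), 3))  # P(X >= 2): 0.496
```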
25 / 31
Definition: Let X be a random variable. The kth
ordinary moment of X is defined as E(X^k ).
So, by definition, the mean is the first ordinary moment. The moments
E(X^k ) help to describe the characteristics of the distribution of X .
Definition: The moment generating function (MGF)
for a random variable X is denoted by m_X (t), and is
given by

m_X (t) = E(e^{tX} ),

provided this expectation is finite for all real
numbers t in some open interval (−h, h).
Why the MGF? Because the MGF helps to find E(X^k ) for any k:
E(X^k ) is the kth derivative of m_X (t) evaluated at t = 0.
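The derivative-at-zero property can be illustrated numerically. A sketch using a Bernoulli(p) variable, whose MGF m_X(t) = (1 − p) + p e^t follows from the definition (p = 0.4 is an arbitrary illustrative value, and the finite differences below are only a numerical approximation of the derivatives):

```python
from math import exp

# For X ~ Bernoulli(p), E(X^k) = p for every k >= 1, so every derivative
# of the MGF at t = 0 should be approximately p.
p = 0.4  # assumed parameter, for illustration only
def mgf(t):
    return (1 - p) + p * exp(t)

h = 1e-4  # finite-difference step
first_deriv = (mgf(h) - mgf(-h)) / (2 * h)              # central difference
second_deriv = (mgf(h) - 2 * mgf(0) + mgf(-h)) / h ** 2
print(round(first_deriv, 4), round(second_deriv, 4))    # both about 0.4
```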
26 / 31
Example: Random variable X follows a discrete
uniform distribution with x = x1 , x2 , · · · , xn . Find
the moment generating function for X .
27 / 31
Example: Random variable X follows a Bernoulli
distribution with parameter p. Find the moment
generating function for X .
28 / 31
Example: Random variable X follows a binomial
distribution with parameters n and p. Find the
moment generating function for X .
29 / 31
Example: Random variable X follows a geometric
distribution with parameter p. Find the moment
generating function for X .
30 / 31
Example: Random variable X follows a Poisson
distribution with parameter λ. Find the moment
generating function for X .
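A worked sketch for this last example, using the definition of the MGF and the Maclaurin series for e^z stated earlier:

```latex
\begin{align*}
m_X(t) = E(e^{tX})
  &= \sum_{x=0}^{\infty} e^{tx}\,\frac{e^{-\lambda}\lambda^x}{x!}
   = e^{-\lambda}\sum_{x=0}^{\infty}\frac{(\lambda e^{t})^x}{x!} \\
  &= e^{-\lambda}\, e^{\lambda e^{t}}
   = e^{\lambda(e^{t}-1)}.
\end{align*}
```

The same definition-plus-series pattern applies to the other exercises on the preceding slides.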
31 / 31