Random Variables
M. George Akritas
Outline
Random Variables and Their Distribution
  Discrete and Continuous Random Variables
  The Probability Mass Function
  The (Cumulative) Distribution Function
The Expected Value of Discrete Random Variables
  The Expected Value in the Simplest Case
  General Definition for Discrete RVs
  Types of Random Variables
Definition
Let S be the sample space of some probabilistic experiment. A
function X : S → R is called a random variable.
Example
1. A unit is selected at random from a population of units. Thus
S is the collection of units in the population. Suppose a
characteristic (weight, volume, or opinion on a certain matter)
is recorded. A numerical description of the outcome is a
random variable.
2. S = {s = (x₁, …, xₙ) : xᵢ ∈ R, ∀i}, X(s) = Σᵢ xᵢ, or X(s) = x̄, or X(s) = max{x₁, …, xₙ}.
3. S = {s : 0 ≤ s < ∞} (e.g. we may be recording the lifetime of an electrical component), X(s) = I(s > 1500), or X(s) = √s, or X(s) = log(s).
- A random variable X induces a probability measure on the range of its values, which is denoted by X(S). (S_X in the book.)
- X(S) can be thought of as the sample space of a compound experiment which consists of the original experiment and the subsequent transformation of the outcome into a numerical value.
- Because the value X(s) of the random variable X is determined from the outcome s, we may assign probabilities to the possible values of X.
- For example, if a die is rolled and we define X(s) = 1 for s = 1, 2, 3, 4, and X(s) = 0 for s = 5, 6, then P(X = 1) = 4/6, P(X = 0) = 2/6.
- The probability measure P_X, induced on X(S) by the random variable X, is called the (probability) distribution of X.
- The distribution of a random variable is considered known if the probabilities P_X((a, b]) = P(a < X ≤ b) are known for all a < b.

Definition
A random variable X is called discrete if X(S) is a finite or a countably infinite set. If X(S) is uncountably infinite, then X is called continuous.

- For discrete r.v.'s X, P_X is completely specified by the probabilities P_X({k}) = P(X = k), for each k ∈ X(S).
- The function p(x) = P(X = x) is called the probability mass function of X.
Example
Consider a batch of size N = 10 products, 3 of which are defective.
Draw 3 at random and without replacement and let the r.v. X
denote the number of defective items. Find the pmf of X .
Solution: The sample space of X is SX = {0, 1, 2, 3}, and:
P(X = 0) = C(7,3)/C(10,3),  P(X = 1) = C(3,1)C(7,2)/C(10,3),
P(X = 2) = C(3,2)C(7,1)/C(10,3),  P(X = 3) = C(3,3)/C(10,3),

where C(n, k) denotes the binomial coefficient "n choose k". Thus, the pmf of X is

x      0      1      2      3
p(x)   0.292  0.525  0.175  0.008
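These probabilities are easy to check numerically; a minimal sketch using Python's math.comb (the function name hypergeom_pmf is ours):

```python
from math import comb

def hypergeom_pmf(x, n, N, M):
    # P(X = x) when n units are drawn without replacement from N units, M of which are defective
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Batch of N = 10 with M = 3 defectives, draw n = 3
pmf = {x: hypergeom_pmf(x, n=3, N=10, M=3) for x in range(4)}
```

Rounding pmf[x] to three decimals reproduces the table above.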
Figure: Bar graph of the probability mass function of X (probabilities 0.292, 0.525, 0.175, 0.008 at x = 0, 1, 2, 3).
Example
Three balls are selected at random and without
replacement from an urn containing 20 balls numbered 1 through
20. Find the probability that at least one of the balls will have
number ≥ 17.
Solution: Here S = {s = (i1 , i2 , i3 ) : 1 ≤ i1 , i2 , i3 ≤ 20},
X (s) = max{i1 , i2 , i3 }, X (S) = {3, 4, . . . , 20} and we want to find
P(X ≥ 17) = P(X = 17) + P(X = 18) + P(X = 19) + P(X = 20).
These are found from the formula

P(X = k) = C(k − 1, 2) / C(20, 3)    (why?)

The end result is P(X ≥ 17) = 0.508.
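A quick numerical check of this formula (the function name is ours):

```python
from math import comb

# Max of 3 distinct numbers drawn from 1..20: P(X = k) = C(k - 1, 2) / C(20, 3),
# since the other two balls must come from the k - 1 numbers below k
def pmf_max(k, n_balls=20, draws=3):
    return comb(k - 1, draws - 1) / comb(n_balls, draws)

p_at_least_17 = sum(pmf_max(k) for k in range(17, 21))
```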
The PMF of a Function of X
Let X be a discrete random variable with range (i.e. set of possible
values) X and distribution PX , and let Y = g (X ) be a function of
X with range Y. Then the pmf pY (y ) of Y is given in terms of the
pmf pX (x) of X by
- p_Y(y) = Σ_{x ∈ X : g(x) = y} p_X(x), for all y ∈ Y.
Example
Roll a die and let X denote the outcome. If X = 1 or 2, you win
$1; if X = 3 you win $2, and if X ≥ 4 you win $4. Let Y denote
your prize. Find the pmf of Y .
Solution: The pmf of Y is:
y        1      2      4
p_Y(y)   0.333  0.167  0.5
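The formula above translates directly into code: accumulate p_X(x) into the bucket g(x). A sketch for this example, in exact arithmetic:

```python
from fractions import Fraction

# Die outcome X is uniform on 1..6; g maps each outcome to the prize
p_X = {x: Fraction(1, 6) for x in range(1, 7)}
g = {1: 1, 2: 1, 3: 2, 4: 4, 5: 4, 6: 4}

# p_Y(y) = sum of p_X(x) over all x with g(x) = y
p_Y = {}
for x, p in p_X.items():
    p_Y[g[x]] = p_Y.get(g[x], Fraction(0)) + p
```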
Definition
The function FX : R → [0, 1] (or simply F if no confusion is
possible) defined by
FX (x) = P(X ≤ x) = PX ((−∞, x])
is called the (cumulative) distribution function of the rv X .
Proposition
FX determines the probability distribution, PX , of X .
Proof: We have that PX is determined by its value PX ((a, b]) for
any interval (a, b]. However, PX ((a, b]) is determined from FX by
PX ((a, b]) = FX (b) − FX (a).
Example
Consider a batch of size N = 10 products, 3 of which are defective.
Draw 3 at random and without replacement and let the r.v. X
denote the number of defectives. Find the cdf of X .
Solution:

x      0      1      2      3
p(x)   0.292  0.525  0.175  0.008
F(x)   0.292  0.817  0.992  1.000

Moreover, F(−1) = 0 and F(1.5) = 0.817. Also, p(1) = F(1) − F(0).
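The cdf row is just the running sum of the pmf row, and between jump points F stays flat; a sketch:

```python
from itertools import accumulate

xs = [0, 1, 2, 3]
pmf = [0.292, 0.525, 0.175, 0.008]
cdf = list(accumulate(pmf))  # F at the jump points: 0.292, 0.817, 0.992, 1.000

def F(x):
    # Right-continuous step function: F(x) is the cdf value at the largest jump point <= x
    result = 0.0
    for xi, Fi in zip(xs, cdf):
        if xi <= x:
            result = Fi
    return result
```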
Example
Consider a random variable X with cumulative distribution
function given by
F(x) = 0 for x < 1,
F(x) = 0.4 for 1 ≤ x < 2,
F(x) = 0.7 for 2 ≤ x < 3,
F(x) = 0.9 for 3 ≤ x < 4,
F(x) = 1 for x ≥ 4.
Figure: The CDF of a Discrete Distribution is a Step or Jump Function
Example
Let X have cdf as shown above. Use the form of the cdf to deduce
the distribution of X .
Solution. Since its cdf is a jump function, we conclude that X is
discrete with sample space the jump points of its cdf, i.e. 1,2,3,
and 4. Finally, the probability with which X takes each value
equals the size of the jump at that value (for example,
P(X = 1) = 0.4). These deductions are justified as follows:
a) P(X < 1) = 0 means that X cannot take a value less than one.
b) F(1) = 0.4, combined with a), implies that P(X = 1) = 0.4.
c) The second of the equations defining F also implies that
P(1 < X < 2) = 0, and so on.
Proposition (Properties of the CDF)
1. If a ≤ b then F (a) ≤ F (b).
2. F (−∞) = 0, F (∞) = 1.
3. If a < b, then P(a < X ≤ b) = F(b) − F(a).
4. F (x) is right continuous.
5. If p(x) is the pmf, then
   - F(x) = Σ_{k ≤ x} p(k) and p(k) = F(k) − F(k − 1)
   - F is a jump or step function.
   - The flat regions of F correspond to regions where X takes no values.
   - The size of the jump at each x ∈ S_X equals p(x).
The Expected Value in the Simplest Case
A unit is selected at random from a population of N units,
and let the random variable X be the (numerical) value of a
characteristic of interest. Let v1 , v2 , . . . , vN be the values of
the characteristic of interest of each of the N units. Then, the
expected value of X, denoted by µ_X or E(X), is defined by

E(X) = (1/N) Σ_{i=1}^{N} v_i.
Example
1. Let X denote the outcome of a roll of a die. Find E (X ).
2. Let X denote the outcome of a roll of a die that has the six
on four sides and the number 8 on the other two sides. Find
E (X ).
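In code, this definition is just a population average; a sketch, applied to the two dice of the example:

```python
# E(X) = (1/N) * sum(v_i): the population average of the characteristic
def expected_value(values):
    return sum(values) / len(values)

fair_die = expected_value([1, 2, 3, 4, 5, 6])      # part 1
modified_die = expected_value([6, 6, 6, 6, 8, 8])  # part 2: six on four sides, eight on two
```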
General Definition for Discrete RVs
The expected value, E (X ) or µX , of a discrete r.v. X having a
possibly infinite sample space SX and pmf p(x) = P(X = x), for
x ∈ SX , is defined as
µ_X = Σ_{x ∈ S_X} x p(x).
Example
Roll a die and let X denote the outcome. If X = 1 or 2, you win
$1; if X = 3 you win $2, and if X ≥ 4 you win $4. Let Y denote
your prize. Find E (Y ).
Solution: The pmf of Y is:

y        1      2      4
p_Y(y)   0.333  0.167  0.5

Thus, E(Y) = 1 × 0.333 + 2 × 0.167 + 4 × 0.5 = 2.667.
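The same sum, computed with exact fractions for the prize pmf (p_Y(1) = 2/6, p_Y(2) = 1/6, p_Y(4) = 3/6):

```python
from fractions import Fraction

# E(Y) = sum over the support of y * p_Y(y)
p_Y = {1: Fraction(2, 6), 2: Fraction(1, 6), 4: Fraction(3, 6)}
E_Y = sum(y * p for y, p in p_Y.items())
```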
Example
You are given a choice between accepting $3.5² = $12.25 or rolling a die and winning $X², where X is the outcome. What will you choose and why?
Solution: If the game will be played several times, your decision should be based on the value of E(X²). (Why?) To find it, use 1 + 4 + 9 + 16 + 25 + 36 = 91, so E(X²) = 91/6 ≈ 15.17 > 12.25.
Proposition
Let X be a discrete r.v. taking values xi , i ≥ 1, having pmf pX .
Then,
E[g(X)] = Σ_i g(x_i) p_X(x_i).
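Applying the proposition with g(x) = x² settles the die game; a sketch in exact arithmetic:

```python
from fractions import Fraction

# E[g(X)] = sum of g(x_i) * p_X(x_i), here g(x) = x**2 for a fair die
p_X = {x: Fraction(1, 6) for x in range(1, 7)}
E_X2 = sum(x**2 * p for x, p in p_X.items())
```

E_X2 is 91/6 ≈ 15.17, which exceeds 12.25, so rolling is the better long-run choice.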
Example
A product that a particular store location stocks monthly yields a
net profit of b dollars for each unit sold and a net loss of ℓ dollars
for each unit left unsold at the end of the month. The monthly
demand (i.e. # of units ordered) for this product is a rv having
pmf p(k), k ≥ 0. If the store stocks s units, find the expected
profit, and determine the number of units the store should stock to
maximize the expected profit.
Solution: Let X be the monthly demand. The random variable of
interest here is the profit
Y_s = g_s(X) = bX − (s − X)ℓ, if X ≤ s,
             = bs,            if X > s.
Next, E(Y_s) = sb + (b + ℓ) Σ_{x=0}^{s} (x − s) p(x) (details in class).

To determine the optimum value of s, note that the difference E(Y_{s+1}) − E(Y_s) > 0 provided

Σ_{x=0}^{s} p(x) < b / (b + ℓ)

(details in class). Thus, if this inequality holds, stocking s + 1 units is better than stocking s. Let s_L be the largest value of s that satisfies the inequality. Then, stocking s_opt = s_L + 1 maximizes the expected profit.
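The stocking rule can be sketched numerically. The demand pmf below (uniform on 0..9) is invented purely for illustration, as are the function names:

```python
# Expected profit when stocking s units: E(Y_s) = s*b + (b + l) * sum_{x<=s} (x - s) * p(x)
def expected_profit(s, b, l, pmf):
    return s * b + (b + l) * sum((x - s) * p for x, p in pmf.items() if x <= s)

def optimal_stock(b, l, pmf, s_max=100):
    # s_opt = s_L + 1, where s_L is the largest s with F(s) < b / (b + l)
    threshold = b / (b + l)
    s_L = max(s for s in range(s_max)
              if sum(p for x, p in pmf.items() if x <= s) < threshold)
    return s_L + 1

demand = {x: 0.1 for x in range(10)}  # hypothetical demand pmf, uniform on 0..9
s_opt = optimal_stock(2, 1, demand)   # b = 2, l = 1, so the threshold is 2/3
```

With these numbers the cumulative demand first exceeds 2/3 at s = 6, which also maximizes expected_profit directly.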
For constants a, b we have E(aX + b) = aE(X) + b.

Definition
The variance, σ²_X or Var(X), and standard deviation, σ_X or SD(X), of a rv X are

σ²_X = Var(X) = E(X − µ_X)²,   σ_X = √(σ²_X).

Proposition
Two common properties of the variance are
1. Var(X) = E(X²) − µ²_X
2. Var(aX + b) = a² Var(X)
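Property 1 is easy to confirm numerically for a fair die (a sketch):

```python
# Check Var(X) = E(X^2) - mu^2 against the defining formula E(X - mu)^2
support = range(1, 7)
p = 1 / 6  # fair die

mu = sum(x * p for x in support)
var_def = sum((x - mu) ** 2 * p for x in support)
var_shortcut = sum(x * x * p for x in support) - mu**2
```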
Types of Random Variables
The Bernoulli Random Variable
- A r.v. X is called Bernoulli if it takes only two values.
- The two values are referred to as success (S) and failure (F), or are re-coded as 1 and 0. Thus, always, S_X = {0, 1}.
- Experiments resulting in a Bernoulli r.v. are called Bernoulli.

Example
1. A product is inspected. Set X = 1 if defective, X = 0 if non-defective.
2. A product is put to a life test. Set X = 1 if it lasts more than 1000 hours, X = 0 otherwise.

- If P(X = 1) = p, we write X ∼ Bernoulli(p) to indicate that X is Bernoulli with probability of success p.
If X ∼ Bernoulli(p), then
- Its pmf is:

x      0      1
p(x)   1 − p  p

- Its expected value is E(X) = p.
- Its variance is σ²_X = p(1 − p). (Why?)
The Binomial Random Variable
- An experiment consisting of n independent replications of a Bernoulli experiment is called a Binomial experiment.
- If X₁, X₂, …, Xₙ are the Bernoulli r.v.'s for the n Bernoulli experiments,

Y = Σ_{i=1}^{n} Xᵢ = the total number of 1s

is the Binomial r.v. Clearly S_Y = {0, 1, …, n}.
- We write Y ∼ Bin(n, p) to indicate that Y is binomial with probability of success equal to p for each Bernoulli trial.
If Y ∼ Bin(n, p), then
- Its pmf is:

P(Y = k) = C(n, k) p^k (1 − p)^{n−k},  k = 0, 1, …, n

- Use x C(n, x) = n C(n − 1, x − 1) and x² C(n, x) = n x C(n − 1, x − 1) to get:
1. Its expected value is E(Y) = np
2. Its variance is σ²_Y = np(1 − p)
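Rather than prove the identities, one can at least confirm the two moment formulas by brute-force summation; a sketch:

```python
from math import comb

def binom_pmf(k, n, p):
    # P(Y = k) = C(n, k) * p^k * (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3
mean = sum(k * binom_pmf(k, n, p) for k in range(n + 1))
var = sum(k**2 * binom_pmf(k, n, p) for k in range(n + 1)) - mean**2
```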
Example
A company sells screws in packages of 10 and offers a money-back guarantee if two or more of the screws are defective. If a screw is defective with probability 0.01, independently of the other screws, what proportion of the packages sold will the company replace?
Solution: With X ∼ Bin(10, 0.01), the answer is 1 − P(X = 0) − P(X = 1) ≈ 0.004.
Example
Physical traits, such as eye color, are determined by a pair of genes, each of which can be either dominant (d) or recessive (r); one is inherited from the mother and one from the father. Persons with gene pairs (dd), (dr), and (rd) are alike in that physical trait. Assume that a child is equally likely to inherit either of the two genes from each parent. If both parents are hybrid with respect to a particular trait (i.e. both have gene pairs (dr) or (rd)), find the probability that three of their four children will be like their parents in that physical trait.
Solution: The probability that an offspring of two hybrid parents has at least one dominant gene, and is thus like its parents in that physical trait, is 0.75. Thus, the desired probability is

C(4, 3) × 0.75³ × 0.25¹ ≈ 0.422.
Example
In order for the defendant to be convicted in a jury trial, at least
eight of the twelve jurors must enter a guilty vote. Assume each
juror makes the correct decision with probability 0.7 independently
of other jurors. If 40% of defendants in such jury trials are innocent, what is the probability that the jury renders the correct verdict for a randomly selected defendant?
Solution: Let B = {jury renders the correct verdict}, and
A = {defendant is innocent}. Then, according to the Law of Total
Probability,
P(B) = P(B|A)P(A) + P(B|Ac )P(Ac )
= P(B|A)0.4 + P(B|Ac )0.6.
Solution Continued:
Next, let X denote the number of jurors who reach the correct verdict in a particular trial. Thus, X ∼ Bin(12, 0.7), and

P(B|A) = P(X ≥ 5) = 1 − Σ_{k=0}^{4} C(12, k) 0.7^k 0.3^{12−k} = 0.9905,

P(B|A^c) = P(X ≥ 8) = Σ_{k=8}^{12} C(12, k) 0.7^k 0.3^{12−k} = 0.724.

Thus,

P(B) = P(B|A)0.4 + P(B|A^c)0.6 = 0.8306
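The two binomial tail sums and the total-probability step, in code (note that 0.8306 comes from using the rounded values 0.9905 and 0.724; the unrounded answer is closer to 0.8304):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 12, 0.7  # 12 jurors, each correct with probability 0.7

# Innocent defendant: correct verdict needs at least 5 correct (not-guilty) votes
p_B_given_A = sum(binom_pmf(k, n, p) for k in range(5, n + 1))
# Guilty defendant: correct verdict needs at least 8 correct (guilty) votes
p_B_given_Ac = sum(binom_pmf(k, n, p) for k in range(8, n + 1))

p_B = 0.4 * p_B_given_A + 0.6 * p_B_given_Ac  # Law of Total Probability
```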
Example
A communications system consisting of n components works if at
least half of its components work. Suppose it is possible to add
components to the system, and that currently the system has
n = 2k − 1 components.
1. Show that by adding one component the system becomes
more reliable for all integers k ≥ 1.
2. Show that this is not necessarily the case if we add two
components to the system.
Solution: 1. Let A_n = {the system works when it has n components}. Then

A_{2k−1} = {k or more of the 2k − 1 work}
A_{2k} = A_{2k−1} ∪ {k − 1 of the original 2k − 1 work, and the 2kth works}
Solution Continued:
It follows that A2k−1 ⊆ A2k . Thus, P(A2k−1 ) ≤ P(A2k ).
2. Using the same notation,
A2k+1 = {k + 1 or more of the original 2k − 1 work} ∪ {k of the original
2k − 1 work, and at least one of the 2kth and (2k + 1)th work} ∪ {k − 1
of the original 2k − 1 work, and both the 2kth and (2k + 1)th work}.
It is seen that A2k−1 is not a subset of A2k+1 , since, for example,
A2k−1 includes the outcome {k of the original 2k − 1 work} but
A2k+1 does not. It is also clear that A2k+1 is not a subset of
A2k−1 . Thus, more information is needed to compare the reliability
of the two systems.
Example (Example Continued)
Suppose each component functions with probability p
independently of the others. For what value of p is a
(2k + 1)-component system more reliable than a
(2k − 1)-component system?
Solution: Let X denote the number of the first 2k − 1 that
function. Then,
P(A2k−1 ) = P(X ≥ k) = P(X = k) + P(X ≥ k + 1)
P(A2k+1 ) = P(X ≥ k + 1) + P(X = k)(1 − (1 − p)2 ) + P(X = k − 1)p 2
and P(A2k+1 ) − P(A2k−1 ) > 0 iff p > 0.5. (Details in class.)
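The conclusion can be checked numerically; here reliability means P(at least half of the components work), and the function names are ours:

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def reliability(n, p):
    # P(at least half of the n independent components work)
    need = (n + 1) // 2  # for n = 2k - 1 this equals k
    return sum(binom_pmf(j, n, p) for j in range(need, n + 1))
```

Comparing reliability(5, p) with reliability(3, p) for p above, below, and at 0.5 reproduces the conclusion for k = 2.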
The Binomial CDF Tables
Example
Suppose 70% of all purchases in a certain store are made with
credit card. Let X denote the number of credit card uses in the
next 10 purchases. Find a) µX and σX2 , and b) P(5 ≤ X ≤ 8).
Solution. It seems reasonable to assume that X ∼ Bin(10, 0.7).
a) E (X ) = np = 10(0.7) = 7, σX2 = 10(0.7)(0.3) = 2.1.
b) Using the binomial table, we have
P(5 ≤ X ≤ 8) = P(4 < X ≤ 8) = F (8) − F (4)
= 0.851 − 0.047 = 0.804.
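When no table is at hand, the binomial cdf is a one-line sum; a sketch:

```python
from math import comb

def binom_cdf(x, n, p):
    # F(x) = P(X <= x) for X ~ Bin(n, p)
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))

answer = binom_cdf(8, 10, 0.7) - binom_cdf(4, 10, 0.7)  # P(5 <= X <= 8)
```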
The Hypergeometric Random Variable
The hypergeometric distribution arises when a simple random sample of size n is taken from a finite population of N units, of which M are labeled 1 and the rest are labeled 0.
The number X of units labeled 1 in the sample is a hypergeometric random variable with parameters n, M and N. This is denoted by X ∼ Hypergeo(n, N, M).
- If X ∼ Hypergeo(n, N, M), its pmf is

P(X = x) = C(M, x) C(N − M, n − x) / C(N, n)

Note that P(X = x) = 0 if x > M, or if n − x > N − M.
Figure: Some Hypergeometric PMFs: P(X = k), k = 0, …, 10, for Hypergeo(10, 60, M) with M = 15, 30, 45.
Applications of the Hypergeometric Distribution
Example (Quality Control)
A company buys electrical components in batches of size 10.
Quality inspection consists of choosing 3 components at random
and accepting the batch only if all 3 are nondefective. If 30% of
the batches have 4 defective components and 70% have only 1,
what proportion of batches does the company accept?
Solution: Let A be the event a batch is accepted.
P(A) = P(A | 4 defectives)(0.3) + P(A | 1 defective)(0.7)
     = [C(4,0)C(6,3)/C(10,3)](0.3) + [C(1,0)C(9,3)/C(10,3)](0.7) = 0.54.
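Both conditional acceptance probabilities are hypergeometric probabilities of drawing zero defectives; a sketch:

```python
from math import comb

def hypergeom_pmf(x, n, N, M):
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

# Accept iff all 3 sampled components are non-defective
p_accept = hypergeom_pmf(0, 3, 10, 4) * 0.3 + hypergeom_pmf(0, 3, 10, 1) * 0.7
```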
Example (The Capture/Recapture Method)
This method is used to estimate the size N of a wildlife population. Suppose that 10 animals are captured, tagged and released. On a later occasion, 20 animals are captured. Let X be the number of recaptured (i.e. previously tagged) animals. If all C(N, 20) possible groups are equally likely, then X ∼ Hypergeo(20, N, 10), and X is more likely to take small values if N is large. The precise form of the hypergeometric pmf can be used to estimate N from the value that X takes.
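One way to make "estimate N" concrete is maximum likelihood: pick the N that maximizes the hypergeometric probability of the observed X. The sketch below uses our own names and an illustrative observed value x = 3, recovering the classical estimate N̂ ≈ (10 × 20)/x:

```python
from math import comb

def likelihood(N, x, tagged=10, sample=20):
    # P(X = x | population size N): hypergeometric probability of x tagged in the recapture
    if N < tagged + sample - x:
        return 0.0
    return comb(tagged, x) * comb(N - tagged, sample - x) / comb(N, sample)

x_observed = 3  # hypothetical recapture count
N_hat = max(range(27, 500), key=lambda N: likelihood(N, x_observed))
```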
If X ∼ Hypergeo(n, N, M), then
- Its expected value is: µ_X = n(M/N)
- Its variance is: σ²_X = n(M/N)(1 − M/N)[(N − n)/(N − 1)]
- (N − n)/(N − 1) is called the finite population correction factor
- Binomial Approximation to Hypergeometric Probabilities: If n ≤ 0.05 × N, then P(X = x) ≈ P(Y = x), where Y ∼ Bin(n, p = M/N).
Example (Illustration of the binomial approximation)
We will contrast P(X = 2) for X ∼ Hypergeo(n = 10, N, M), when M/N = 0.25, with its binomial approximation P(Y = 2) = 0.282 for Y ∼ Bin(n = 10, p = 0.25).
1. If N = 20 and M = 5, then P(X = 2) = C(5,2)C(15,8)/C(20,10) = 0.348.
2. If N = 100 and M = 25, then P(X = 2) = C(25,2)C(75,8)/C(100,10) = 0.292.
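The three numbers, recomputed (function names are ours):

```python
from math import comb

def hypergeom_pmf(x, n, N, M):
    return comb(M, x) * comb(N - M, n - x) / comb(N, n)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

exact_N20 = hypergeom_pmf(2, 10, 20, 5)     # N = 20, M = 5
exact_N100 = hypergeom_pmf(2, 10, 100, 25)  # N = 100, M = 25
approx = binom_pmf(2, 10, 0.25)
```

Note how the approximation error shrinks as N grows with M/N held fixed.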
Binomial Approximation to Hypergeometric Probabilities
- As N, M → ∞ with M/N → p and n remaining fixed,

C(M, x) C(N − M, n − x) / C(N, n) → C(n, x) p^x (1 − p)^{n−x},  ∀x = 0, 1, …, n.

One way to show this is via Stirling's formula for approximating factorials: n! ≈ √(2πn) (n/e)^n, or more precisely

n! = √(2πn) (n/e)^n e^{λ_n},  where 1/(12n + 1) < λ_n < 1/(12n).

Use this on the left-hand side and note that the terms resulting from √(2πn) tend to 1 and the powers of e cancel. Thus,
C(M, x) C(N − M, n − x) / C(N, n) ≈ C(n, x) · [M^M (N − M)^{N−M} (N − n)^{N−n}] / [N^N (M − x)^{M−x} (N − M − n + x)^{N−M−n+x}].

Now write

M^M / (M − x)^{M−x} = (1 + x/(M − x))^{M−x} M^x,
(N − n)^{N−n} / N^N = (1 − n/N)^N (N − n)^{−n},
(N − M)^{N−M} / (N − M − n + x)^{N−M−n+x} = (1 + (n − x)/(N − M − n + x))^{N−M−n+x} (N − M)^{n−x}.

As N, M → ∞ the three parenthesized factors tend to e^x, e^{−n} and e^{n−x}, whose product is 1, while the remaining factor M^x (N − M)^{n−x} (N − n)^{−n} → p^x (1 − p)^{n−x}, completing the argument.
The Negative Binomial Random Variable
- In the negative binomial experiment, a Bernoulli experiment is repeated independently until the rth 1 is observed. For example, products are inspected, as they come off the assembly line, until the rth defective is found.
- The number, Y, of Bernoulli experiments until the rth 1 is observed is the negative binomial r.v.
- If p is the probability of 1 in a Bernoulli trial, we write Y ∼ NBin(r, p).
- If r = 1, Y is called the geometric r.v.
If Y ∼ NBin(r, p), then
- Its pmf is:

P(Y = y) = C(y − 1, r − 1) p^r (1 − p)^{y−r},  y = r, r + 1, …

- Its expected value is: E(Y) = r/p
- Its variance is: σ²_Y = r(1 − p)/p²
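A truncated-sum check of the two moment formulas (the truncation point is ours; the tail is negligible for the chosen p):

```python
from math import comb

def nbinom_pmf(y, r, p):
    # P(Y = y) = C(y - 1, r - 1) * p^r * (1 - p)^(y - r), y = r, r + 1, ...
    return comb(y - 1, r - 1) * p**r * (1 - p) ** (y - r)

r, p = 2, 0.6
ys = range(r, 200)  # tail beyond 200 is negligible when p = 0.6
mean = sum(y * nbinom_pmf(y, r, p) for y in ys)
var = sum(y**2 * nbinom_pmf(y, r, p) for y in ys) - mean**2
```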
If r = 1 the Negative Binomial is called Geometric:
- P(X = x | p) = p(1 − p)^{x−1}, x ≥ 1.
- The "memoryless" property: for integers s > t, P(X > s | X > t) = P(X > s − t).
Example
Independent Bernoulli trials are performed with probability of
success p. Find the probability that r successes will occur before m
failures.
Solution: r successes will occur before m failures iff the rth success occurs no later than the (r + m − 1)th trial. Hence the desired probability is found from

Σ_{k=r}^{r+m−1} C(k − 1, r − 1) p^r (1 − p)^{k−r}.
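A sketch of this sum, sanity-checked for r = 1 against a direct computation (one success occurs before m failures unless the first m trials all fail):

```python
from math import comb

def nbinom_pmf(y, r, p):
    return comb(y - 1, r - 1) * p**r * (1 - p) ** (y - r)

def p_successes_before_failures(r, m, p):
    # the r-th success must arrive by trial r + m - 1
    return sum(nbinom_pmf(k, r, p) for k in range(r, r + m))
```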
Example
Two athletic teams, A and B, play a best-of-three series of games.
Suppose team A is the stronger team and will win any game with
probability 0.6, independently from other games. Find the
probability that the stronger team will be the overall winner.
Solution: Let X be the number of games needed for team A to win
twice. Then X has the negative binomial distribution with r = 2
and p = 0.6. Team A will win the series if X = 2 or X = 3. Thus,
P(Team A wins the series) = P(X = 2) + P(X = 3)
  = C(1, 1) 0.6² (1 − 0.6)^{2−2} + C(2, 1) 0.6² (1 − 0.6)^{3−2}
  = 0.36 + 0.288 = 0.648
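A Monte Carlo cross-check of the series probability (the simulation parameters and seed are ours):

```python
import random

random.seed(0)  # reproducible

def a_wins_series(p=0.6):
    # play best-of-three: first team to 2 wins takes the series
    wins_a = wins_b = 0
    while wins_a < 2 and wins_b < 2:
        if random.random() < p:
            wins_a += 1
        else:
            wins_b += 1
    return wins_a == 2

trials = 100_000
estimate = sum(a_wins_series() for _ in range(trials)) / trials
```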
Example
A candle is lit every evening at dinner time with a match taken
from one of two match boxes. Assume each box is equally likely to
be chosen and that initially both contained N matches. What is
the probability that there are exactly k matches left,
k = 0, 1, . . . , N, when one of the match boxes is first discovered
empty?
Solution: Let E be the event that box #1 is discovered empty and there are k matches in box #2. E will occur iff the (N + 1)th choice of box #1 is made at the (N + 1 + N − k)th trial. Thus,

P(E) = C(2N − k, N) 0.5^{2N−k+1},

and the desired probability is 2P(E).
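A sketch that also confirms these probabilities sum to 1 over k = 0, …, N:

```python
from math import comb

def matchbox_pmf(k, N):
    # P(exactly k matches remain in the other box when one box is first found empty)
    return 2 * comb(2 * N - k, N) * 0.5 ** (2 * N - k + 1)

N = 4
probs = [matchbox_pmf(k, N) for k in range(N + 1)]
```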
Example
Three electrical engineers toss coins to see who pays for coffee. If
all three match, they toss another round. Otherwise the ’odd
person’ pays for coffee.
1. Find the probability of a round of tossing resulting in a match.
Answer: 0.5³ + 0.5³ = 0.25.
2. Let Y be the number of times they toss coins until the odd
person is determined. What is the distribution of Y ? Answer:
Geometric with p = 0.75.
3. Find P(Y ≥ 3). Answer: P(Y ≥ 3) =
1 − P(Y = 1) − P(Y = 2) = 1 − 0.75 − 0.75 × 0.25 = 0.0625.
The Poisson Random Variable
- A RV X with S_X = {0, 1, 2, …} is a Poisson RV with parameter λ, X ∼ Poisson(λ), if its pmf is

p(x) = P(X = x) = e^{−λ} λ^x / x!,  x = 0, 1, 2, …,

for some λ > 0.
- Σ_{x=0}^{∞} p(x) = 1 follows from e^λ = Σ_{k=0}^{∞} λ^k / k!.
- µ_X = λ, σ²_X = λ.
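A truncated-sum check of these facts (the cut-off is ours, chosen so factorials stay within float range; the tail for λ = 5 is negligible):

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    # p(x) = e^(-lam) * lam^x / x!
    return exp(-lam) * lam**x / factorial(x)

lam = 5.0
xs = range(100)
total = sum(poisson_pmf(x, lam) for x in xs)
mean = sum(x * poisson_pmf(x, lam) for x in xs)
var = sum(x**2 * poisson_pmf(x, lam) for x in xs) - mean**2
```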
The Poisson random variable X can be:
1. the number of fish caught by an angler in an afternoon,
2. the number of new potholes in a stretch of I-80 during the
winter months,
3. the number of disabled vehicles abandoned on I-95 in a year,
4. the number of earthquakes (or other natural disasters) in a
region of the United States in a month,
5. the number of wrongly dialed telephone numbers in a given
city in an hour,
6. the number of freak accidents, such as falls in the shower, in a
given time period,
7. the number of hits on a website in a day.
▶ In general, the Poisson distribution is used to model the
probability that a given number of events occurs in a
specified period of time (or distance, area, or volume).
▶ The events must occur at random and at a constant rate.
▶ The occurrence of an event must not influence the timing of
subsequent events (i.e., events occur independently).
▶ Its earliest use dealt with the number of alpha particles
emitted from a radioactive source in a given period of time.
▶ Current applications include areas such as the insurance
industry, the tourist industry, traffic engineering, demography,
forestry, and astronomy.
Example (Use of the Poisson Table)
Let X ∼ Poisson(5). Find: a) P(X ≤ 5), b) P(6 ≤ X ≤ 9), and c)
P(X ≥ 10).
Solution. a) P(X ≤ 5) = F (5) = 0.616.
b) Write
P(6 ≤ X ≤ 9) = P(5 < X ≤ 9) = P(X ≤ 9) − P(X ≤ 5)
= F (9) − F (5) = 0.968 − 0.616 = 0.352.
c) Write
P(X ≥ 10) = 1 − P(X ≤ 9) = 1 − F (9) = 1 − 0.968 = 0.032.
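The table values quoted above can also be reproduced by summing the pmf directly (a sketch; the CDF helper below is illustrative, not the text's table):

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    # F(k) = P(X <= k), by direct summation of the pmf
    return sum(exp(-lam) * lam ** x / factorial(x) for x in range(k + 1))

F5, F9 = poisson_cdf(5, 5), poisson_cdf(9, 5)
print(round(F5, 3))       # 0.616  (a)
print(round(F9 - F5, 3))  # 0.352  (b)
print(round(1 - F9, 3))   # 0.032  (c)
```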
Example
Suppose that a person taking Vitamin C supplements contracts an
average of 3 colds per year, and that this average increases to 5
colds per year for persons not taking Vitamin C supplements.
Suppose further that the number of colds a person contracts in a
year is a Poisson random variable.
1. Compare the probability that a person taking Vitamin C
supplements catches no more than two colds with the
corresponding probability for a person not taking supplements.
2. Suppose 70% of the population takes Vitamin C supplements.
Compute the probability that a randomly selected person will
have no more than two colds in a given year.
3. Suppose that a randomly selected person contracts no more
than two colds in a given year. What is the probability that
he/she takes Vitamin C supplements?
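All three parts can be sketched with the law of total probability and Bayes' rule; the numerical answers in the comments are computed here, not taken from the text:

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    # F(k) = P(X <= k) for X ~ Poisson(lam)
    return sum(exp(-lam) * lam ** x / factorial(x) for x in range(k + 1))

p_taker = poisson_cdf(2, 3)  # part 1: supplement takers, mean 3 colds
p_non = poisson_cdf(2, 5)    # part 1: non-takers, mean 5 colds
print(round(p_taker, 4), round(p_non, 4))  # 0.4232 0.1247

# part 2: law of total probability with P(taker) = 0.7
p_total = 0.7 * p_taker + 0.3 * p_non
print(round(p_total, 4))  # 0.3336

# part 3: Bayes' rule
print(round(0.7 * p_taker / p_total, 4))  # 0.8879
```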
Proposition (Poisson Approximation to Binomial Probabilities)
If Y ∼ Bin(n, p), with n ≥ 100, p ≤ 0.01, and np ≤ 20, then
P(Y ≥ k) ≈ P(X ≥ k), k = 0, 1, 2, . . . , n,
where X ∼ Poisson(λ = np).
▶ The enormous range of applications of the Poisson
distribution is due to this proposition. Read the discussion on
p. 137 (following the proof of Proposition 3.5.1).
Example
For the following four binomial RVs, np = 3: a) Y1 ∼ Bin(9, 1/3),
b) Y2 ∼ Bin(18, 1/6), c) Y3 ∼ Bin(30, 0.1), d) Y4 ∼ Bin(60, 0.05).
Compare the P(Yi ≤ 2) with P(X ≤ 2) where X ∼ Poisson(3).
Comparison: First, P(X ≤ 2) = e^{−3} (1 + 3 + 3²/2) = 0.4232. Next,
a) P(Y1 ≤ 2) = 0.3772,
b) P(Y2 ≤ 2) = 0.4027,
c) P(Y3 ≤ 2) = 0.4114,
d) P(Y4 ≤ 2) = 0.4174.
Note: The conditions of the Proposition on n and p are not
satisfied for any of the four binomial RVs.
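The comparison can be reproduced by direct summation of the two CDFs (a sketch; the helper names are illustrative):

```python
from math import comb, exp, factorial

def binom_cdf(k, n, p):
    # P(Y <= k) for Y ~ Bin(n, p)
    return sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(k + 1))

def poisson_cdf(k, lam):
    # P(X <= k) for X ~ Poisson(lam)
    return sum(exp(-lam) * lam ** x / factorial(x) for x in range(k + 1))

print(round(poisson_cdf(2, 3), 4))  # 0.4232
# the binomial probabilities approach the Poisson value as n grows
for n, p in [(9, 1/3), (18, 1/6), (30, 0.1), (60, 0.05)]:
    print(n, round(binom_cdf(2, n, p), 4))
```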
Example
Due to a serious defect, n = 10, 000 cars are recalled. The
probability that a car is defective is p = 0.0005. If Y is the number
of defective cars, find: (a) P(Y ≥ 10), and (b) P(Y = 0).
Solution. Here Y ∼ Bin(10, 000, 0.0005), and all conditions of the
above Proposition are satisfied. Let X ∼ Poisson(λ = np = 5).
Then,
(a) P(Y ≥ 10) ≈ P(X ≥ 10) = 1 − P(X ≤ 9) = 1 − 0.968 = 0.032.
(b) P(Y = 0) ≈ P(X = 0) = e^{−5} = 0.007.
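Since Python's integer arithmetic handles the exact binomial here, a sketch can also quantify how good the approximation is; the error bound n·p² comes from Le Cam's inequality, which is not stated in the text:

```python
from math import comb, exp, factorial

n, p = 10_000, 0.0005
lam = n * p  # np = 5

# exact binomial tail vs. its Poisson approximation
exact = 1 - sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(10))
approx = 1 - sum(exp(-lam) * lam ** x / factorial(x) for x in range(10))
print(round(approx, 3))              # 0.032
print(abs(exact - approx) < n * p**2)  # True: within Le Cam's bound n*p^2
```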