Lecture 02: Discrete Random Variables
Ba Chu
E-mail: ba [email protected]
Web: http://www.carleton.ca/~bchu
(Note: these are lecture notes. Please refer to the textbooks suggested in the course outline for details. Examples will be given and explained in class.)
1 Objectives
The introduction of random variables in Lecture 1 was completely general – the principles that we
discussed apply to all kinds of random variables. In this lecture, we will study an important special
class of random variables, the discrete random variables, such as the Bernoulli random variables,
the binomial random variables, the geometric random variables, and the Poisson random variables.
I will illustrate the usefulness of these random variables with interesting examples.
2 Basic Concepts
We begin with a formal definition.
Definition 1. A random variable X is discrete if the set of possible values of X, X(S), is countable.
Definition 2. Let X be a discrete random variable. The probability density function (pdf) of X is the function f : R −→ R defined by f(x) = P(X = x), which satisfies the following properties:
1. f(x) ≥ 0 for every x ∈ R.
2. If x ∉ X(S), then f(x) = 0.
3. By the definition of X(S),
∑_{x∈X(S)} f(x) = ∑_{x∈X(S)} P(X = x) = P(∪_{x∈X(S)} {x}) = P(X ∈ X(S)) = 1.
Example 1. A fair coin is tossed and the outcome is Heads (H) or Tails (T). Define a random variable X by X(H) = 1 and X(T) = 0. The pdf of X is the function f(x) defined by f(0) = 0.5, f(1) = 0.5, and f(x) = 0 for all x ∉ X(S) = {0, 1}.
Example 2. A fair die is tossed and the number of dots on the upper face is observed. The sample space is S = {1, 2, 3, 4, 5, 6}. Define a random variable X by X(s) = 1 if s is a prime number and X(s) = 0 if s is not a prime number. Since the primes in S are 2, 3, and 5, the pdf of X is f(0) = 1/2, f(1) = 1/2, and f(x) = 0 for all x ∉ X(S) = {0, 1}.
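One way to check a pdf like this numerically is to count equally likely outcomes; a minimal Python sketch of Example 2 (an illustration, not part of the course material):

    from fractions import Fraction

    S = [1, 2, 3, 4, 5, 6]                # sample space of the fair die
    primes = {2, 3, 5}                    # the primes among the faces
    X = {s: 1 if s in primes else 0 for s in S}

    # f(x) = P(X = x) = (# outcomes mapped to x) / #(S)
    f = {x: Fraction(sum(1 for s in S if X[s] == x), len(S)) for x in (0, 1)}
    print(f)                              # {0: Fraction(1, 2), 1: Fraction(1, 2)}
    assert sum(f.values()) == 1           # property 3 of Definition 2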
Definition 3. A random variable X is a Bernoulli trial if X(S) = {0, 1}. Traditionally, we call X = 1 a ‘success’ and X = 0 a ‘failure’. The pdf of this random variable is defined by f(0) = 1 − p, f(1) = p, and f(x) = 0 for all x ∉ X(S).
Definition 4. Consider a series of n independent Bernoulli trials, each with success probability p, and let Y denote the number of successes. We call Y a binomial random variable. The pdf of Y is given by
f(k) = P(Y = k) = C(n, k) p^k (1 − p)^{n−k}.
Definition 5. Let X_1, X_2, . . . denote independent Bernoulli trials, each with success probability p, and let Y denote the number of failures that precede the first success. We call Y a geometric random variable. The pdf of Y is given by
f(j) = P(Y = j) = P(X_1 = 0, . . . , X_j = 0, X_{j+1} = 1) = (1 − p)^j p.
Thus, we can obtain the cdf of Y, i.e.,
F(k) = P(Y ≤ k) = 1 − P(Y > k) = 1 − P(Y ≥ k + 1) = 1 − (1 − p)^{k+1}.
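A quick numerical check, in Python, that the closed-form cdf matches the sum of the pdf (the values p = 1/3 and k = 9 anticipate Example 3 below):

    p, k = 1/3, 9
    pdf_sum = sum((1 - p)**j * p for j in range(k + 1))
    closed_form = 1 - (1 - p)**(k + 1)
    print(pdf_sum, closed_form)   # both ≈ 0.9827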
Example 3. John is a fourth-year undergraduate student who is determined to have a formal date. He believes that each woman he asks is twice as likely to decline his invitation as to accept it, but he resolves to extend invitations until one is accepted. However, each of his first ten invitations is declined. Assuming that John’s assumptions about his own desirability are correct, what is the probability that he would encounter such a run of failures?
John evidently believes that he can model his invitations as a sequence of independent Bernoulli trials, each with success probability p = 1/3. If so, the number of unsuccessful invitations that he extends is a geometric random variable Y and
P(Y ≥ 10) = 1 − P(Y ≤ 9) = 1 − F(9) = 1 − [1 − (2/3)^{10}] = (2/3)^{10} ≈ 0.0173.
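The same number in Python, as a one-line check:

    p = 1/3
    print(round((1 - p)**10, 4))   # P(Y >= 10) = (2/3)^10 = 0.0173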
Imagine an urn that contains m red balls and n black balls. Our experiment involves selecting k balls from the urn in such a way that each of the C(m + n, k) possible outcomes is equally likely. Let X denote the number of red balls selected in this manner. If we observe X = x, then x red balls were selected from a total of m red balls and k − x black balls were selected from a total of n black balls. We call X a hypergeometric random variable, and its pdf is given by
f(x) = P(X = x) = #(X = x) / #(S) = C(m, x) C(n, k − x) / C(m + n, k).
Example 4. Suppose that, in an ECON 4002 class with 30 female students and 20 male students, 20 students are selected at random to receive grade A (the remaining 30 receive B). What is the probability that exactly 15 female students and 5 male students receive an A?
Answer: C(30, 15) C(20, 5) / C(50, 20).
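Evaluating this in Python gives a concrete value:

    from math import comb
    prob = comb(30, 15) * comb(20, 5) / comb(50, 20)
    print(round(prob, 4))   # ≈ 0.051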
3 Expectation
Since decision-makers often face many different prospects, it is plausible to use decision criteria in which each prospect is weighted by the chance of realizing that prospect. This motivates the notion of expectation.
Definition 6. The expected value of a discrete random variable X, which we will denote E(X), is the probability-weighted average of the possible values of X, i.e.,
E(X) = ∑_{x∈X(S)} x P(X = x) = ∑_{x∈X(S)} x f(x).
The expected value of X, E(X), is often called the population mean and denoted µ.
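Definition 6 translates directly into code; a minimal Python sketch (representing the pdf as a dict over X(S) is an assumption made here for illustration):

    def expected_value(pdf):
        """pdf: dict mapping each x in X(S) to f(x) = P(X = x)."""
        return sum(x * fx for x, fx in pdf.items())

    print(expected_value({0: 2/3, 1: 1/3}))   # Bernoulli(1/3): E(X) = 1/3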
Example 5. If X ∼ Bernoulli(p), then µ = ∑_{x∈{0,1}} x P(X = x) = p.
Example 6. Suppose that you own a slot machine that pays a jackpot of $1000 with p = 0.0005 and $0 with 1 − p. How much should you charge a customer to play this machine? Let X denote the payoff; the expected payoff per play is E[X] = 1000 × 0.0005 + 0 × 0.9995 = 0.5. Thus, you should charge more than $0.50.
Example 7. You are offered the choice between receiving a certain $5 and playing the following
game: a fair coin is tossed and you receive $10 or $0 according to whether H or T is observed. Will
you play the game?
Example 8. Consider a lottery game, in which 6 numbers are drawn (without replacement) from the set {1, 2, . . . , 39, 40}. For $2, you can purchase a ticket that specifies 6 such numbers. If the numbers on your ticket match the numbers selected by the state, then you win $1 million; otherwise, you win nothing. Is buying a lottery ticket ‘completely irrational’?
The probability of winning is p = 1/C(40, 6) = 2.6053 × 10^{−7}, so the expected value of a ticket is approximately $0.26, which is considerably less than its cost. Evidently, it is completely irrational to buy tickets for this lottery as an investment strategy.
In most state lotteries, the fair value of the game is less than the cost of a ticket. This is natural – lotteries exist because they generate revenue for the states.
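The numbers in Example 8 are easy to reproduce in Python:

    from math import comb
    p = 1 / comb(40, 6)      # probability of matching all 6 numbers
    print(p)                 # ≈ 2.6053e-07
    print(1_000_000 * p)     # expected winnings per ticket: ≈ $0.26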
Definition 7. Suppose that X : S −→ R is a discrete random variable and φ : R −→ R is a function. Let Y = φ(X). Then Y : S −→ R is a random variable and
E[φ(X)] = ∑_{x∈X(S)} φ(x) f(x).
Example 9. Consider a game in which the jackpot starts at $1 and doubles each time that Tails is observed when a fair coin is tossed. The game terminates when Heads is observed for the first time. What payoff would you expect to receive from this game?
This is a very curious game. The payoff will most probably be rather small, but there is a small chance of a very large payoff. Let X denote the number of Tails that are observed before the game terminates. Then X(S) = {0, 1, 2, . . . } and X is a geometric random variable with p = 0.5, so its pdf is f(x) = (1 − p)^x p = 0.5^{x+1}. The payoff from this game is Y = 2^X; thus, the expected payoff is
E[2^X] = ∑_{x=0}^{∞} 2^x · 0.5^{x+1} = ∑_{x=0}^{∞} 1/2 = ∞.
This is quite remarkable! This example is famous in probability theory – it is known as the St. Petersburg Paradox.
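A short simulation makes the paradox vivid: the sample mean of the payoffs never settles down as the number of plays grows. A Python sketch (the play counts chosen here are arbitrary):

    import random

    def play():
        payoff = 1
        while random.random() < 0.5:   # Tails: the jackpot doubles
            payoff *= 2
        return payoff                  # Heads: game ends, payoff = 2^X

    for n in (10**3, 10**5, 10**6):
        mean = sum(play() for _ in range(n)) / n
        print(n, mean)   # no convergence: the mean tends to grow with n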
3.1 Properties of expectation
• E[cφ(X)] = cE[φ(X)] for any constant c ∈ R.
• E[φ_1(X) + φ_2(X)] = E[φ_1(X)] + E[φ_2(X)].
Definition 8. The variance of a discrete random variable X, which we denote Var(X), is the probability-weighted average of the squared deviations of X from E(X) = µ, i.e.,
Var(X) = E[(X − µ)^2] = ∑_{x∈X(S)} (x − µ)^2 f(x).
The variance of X is often called the population variance and denoted by σ^2.
Example 10. If X is a Bernoulli random variable with success probability p, then
Var(X) = E[(X − µ)^2] = p(1 − p).
Theorem 1. If X is a discrete random variable, then Var(X) = E[X^2] − (E[X])^2.
Example 11. Suppose that X is a random variable whose possible values are X(S) = {2, 3, 5, 10}. Suppose that the probability of each of these values is given by the formula f(x) = P(X = x) = x/20. Then, the expected value is µ = E[X] = 6.9, the variance is σ^2 = Var(X) = 10.39, and the standard deviation is σ ≈ 3.2234.
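These three numbers follow directly from Definitions 6 and 8; a Python check:

    pdf = {x: x / 20 for x in (2, 3, 5, 10)}
    mu = sum(x * fx for x, fx in pdf.items())
    var = sum((x - mu)**2 * fx for x, fx in pdf.items())
    print(mu, var, var**0.5)   # 6.9, 10.39, ≈ 3.2234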
Theorem 2. Let X denote a discrete random variable and suppose that c ∈ R is a constant. Then
Var(cX) = c^2 Var(X).
If the discrete random variables X_1 and X_2 are independent, then
Var(X_1 + X_2) = Var(X_1) + Var(X_2).
3.2 Binomial distribution
Suppose that a fair coin is tossed twice, and let Y denote the total number of Heads (H) observed. Because the sample space consists of 4 equally likely outcomes, S = {HH, HT, TH, TT}, the pdf of Y is given by
f(0) = P(Y = 0) = 0.25,
f(1) = P(Y = 1) = 0.5,
f(2) = P(Y = 2) = 0.25,
and f(y) = 0 if y ∉ Y(S) = {0, 1, 2}. Now, we state the formal definition of the binomial random variable.
Definition 9. Let X_1, . . . , X_n be mutually independent Bernoulli trials, each with success probability p. Then,
Y = ∑_{i=1}^{n} X_i
is a binomial random variable, denoted Y ∼ Bin(n, p).
The mean of Y is np, and the variance of Y is np(1 − p).
The general formula for the binomial pdf is
f(j) = P(Y = j) = C(n, j) p^j (1 − p)^{n−j},    (3.1)
and the binomial cdf is
F(k) = P(Y ≤ k) = ∑_{j=0}^{k} C(n, j) p^j (1 − p)^{n−j}.
Example 12. In 12 trials with success probability 0.5, what is the probability that no more than 4 successes will be observed? That is, P(Y ≤ 4) = F(4).
In 15 trials with success probability 0.8, what is the probability that at least 4 but no more than 10 successes will be observed? That is, P(4 ≤ Y ≤ 10) = F(10) − F(3).
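Both probabilities in Example 12 can be evaluated directly from (3.1); a Python sketch:

    from math import comb

    def binom_cdf(k, n, p):
        # F(k) = sum of C(n, j) p^j (1-p)^(n-j) for j = 0, ..., k
        return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

    print(binom_cdf(4, 12, 0.5))                            # F(4) ≈ 0.1938
    print(binom_cdf(10, 15, 0.8) - binom_cdf(3, 15, 0.8))   # ≈ 0.1642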
Theorem 3. If, in (3.1), we let n −→ ∞ and p −→ 0 in such a way that λ = np remains constant, then
lim_{n→∞, p→0, np=λ} P(Y = j) = lim_{n→∞, p→0, np=λ} C(n, j) p^j (1 − p)^{n−j} = e^{−λ} λ^j / j!.
In this case, we call Y a Poisson random variable, with E[Y] = λ and Var(Y) = λ.
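The convergence in Theorem 3 is easy to see numerically; a Python illustration with λ = 2 and j = 3 (these particular values are arbitrary):

    from math import comb, exp, factorial

    lam, j = 2.0, 3
    for n in (10, 100, 10_000):
        p = lam / n
        print(n, comb(n, j) * p**j * (1 - p)**(n - j))    # binomial pdf
    print("Poisson:", exp(-lam) * lam**j / factorial(j))  # ≈ 0.1804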
4 Exercises
1. Suppose that a weighted die is tossed. Let X denote the number of dots that appear on
the upper face of the die, and suppose that P (X = x) = (7 − x)/20 for x = 1, 2, 3, 4, 5 and
P (X = 6) = 0. Determine each of the following:
(a) The probability density function (pdf) of X.
(b) The cumulative distribution function (cdf) of X.
(c) The expected value (mean) of X.
(d) The variance of X.
2. Paul is planning a dinner party at which he will be able to accommodate 7 guests. From past experience, he knows that each person invited to the party will accept his invitation with probability 0.5. He also knows that each person who accepts will actually attend with probability 0.8. Suppose that Paul invites 12 people. Assuming that they behave independently of each other, what is the probability that he will end up with more guests than he can accommodate?
3. If X ∼ Bin(n, p), find the pdf of Y = n − X.
4. John teaches 2 sections of Statistics for a total of 1500 students. Each of his students spins a penny 89 times and counts the number of Heads. Assuming that each of these 1500 pennies has P(H) = 0.3 for a single spin, what is the probability that John will encounter at least 1 student who observes no more than 2 Heads?
5. Problems 3.3.6, 3.3.10, 3.3.14, and 3.3.15 in LM06.