2 Random Variables
2.1 Random Variables
Real-valued functions defined on the sample space are known as random variables (r.v.'s):
X : S → R
Example.
• X is a randomly selected number from the set {1, 2, 4, 5, 6, 10}.
• Y is the number of heads that occur in tossing a coin 10 times.
• V is the height of a randomly selected student.
• U is a randomly selected number from the interval (0, 1).
Discrete and Continuous Random Variables
Random variables may take either a finite or a countable number of possible values. Such
random variables are called discrete. However, there also exist random variables that take
on a continuum of possible values. These are known as continuous random variables.
Example Let X be the number of tosses needed to get the first head.
Example Let U be a number randomly selected from the interval [0,1].
Distribution Function
The cumulative distribution function (c.d.f) (or simply the distribution function) of the
random variable X, denoted F , is the function defined by
F (x) = P (X ≤ x),  ∀x ∈ R.
Here are some properties of the c.d.f F ,
(i) F (x) is a nondecreasing function,
(ii) limx→∞ F (x) = 1
(iii) limx→−∞ F (x) = 0
All probability questions about X can be answered in terms of the c.d.f F . For instance,
P (a < X ≤ b) = F (b) − F (a)
If we desire the probability that X is strictly smaller than b, we may calculate this
probability by
P (X < b) = lim_{h→0+} P (X ≤ b − h) = lim_{h→0+} F (b − h)
Remark. Note that P (X < b) does not necessarily equal F (b).
2.2 Discrete Random Variables
Definition. (Discrete Random Variable) A random variable that can take on at most a
countable number of possible values is said to be discrete.
For a discrete random variable X, we define the probability mass function (or probability
density function, p.d.f) of X by
p(a) = P (X = a).
Let X be a random variable taking the values x1 , x2 , . . .. Then we must have
∑_{i=1}^{∞} p(xi ) = 1.
The distribution function F can be expressed in terms of the mass function by
F (a) = ∑_{all xi ≤ a} p(xi )
Example. Let X be a number randomly selected from the set of numbers 0, 1, 2, 3, 4, 5.
Find P (X ≤ 4).
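A quick numerical check of this example (a minimal Python sketch, assuming each of the six values is equally likely):

```python
from fractions import Fraction

values = [0, 1, 2, 3, 4, 5]
p = Fraction(1, len(values))            # each value has probability 1/6

# P(X <= 4) = sum of p(x) over all x <= 4
prob = sum(p for x in values if x <= 4)
print(prob)                             # 5/6
```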
The Binomial Random Variable
Suppose that n independent trials, each of which results in a "success" with probability p
and in a "failure" with probability 1 − p, are to be performed. If X represents the number
of successes that occur in the n trials, then X is said to be a binomial random variable with
parameters (n, p). Denote X ∼ B(n, p).
The probability mass function of a binomial random variable with parameters (n, p) is
given by
P (X = k) = p(k) = C(n, k) p^k (1 − p)^{n−k} ,  k = 0, 1, 2, . . . , n
where
C(n, k) = n! / (k!(n − k)!)
denotes the binomial coefficient. Note that
∑_{k=0}^{n} p(k) = ∑_{k=0}^{n} C(n, k) p^k (1 − p)^{n−k} = (p + (1 − p))^n = 1
Example. According to a CNN/USA Today poll, approximately 70% of Americans believe
the IRS abuses its power. Let X equal the number of people who believe the IRS abuses its
power in a random sample of n = 20 Americans. Assuming that the poll results are still valid,
find the probability that
(a) X is at least 13
(b) X is at most 11
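A numerical sketch for this example (summing the binomial mass function directly; n = 20 and p = 0.7 are taken from the problem statement):

```python
from math import comb

n, p = 20, 0.7

def binom_pmf(k: int) -> float:
    # P(X = k) for X ~ B(n, p)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# (a) P(X >= 13) = sum_{k=13}^{20} p(k)
p_at_least_13 = sum(binom_pmf(k) for k in range(13, n + 1))

# (b) P(X <= 11) = sum_{k=0}^{11} p(k)
p_at_most_11 = sum(binom_pmf(k) for k in range(12))

print(round(p_at_least_13, 4), round(p_at_most_11, 4))
```

This gives roughly 0.77 for (a) and 0.11 for (b).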
The Geometric Random Variable
Suppose that independent trials, each having probability p of being a success, are performed
until a success occurs. If we let X be the number of trials required until the first success,
then X is said to be a geometric random variable with parameter p.
Its probability mass function is given by
p(n) = P (X = n) = (1 − p)^{n−1} p ,  n = 1, 2, . . .
Note that
∑_{n=1}^{∞} p(n) = p ∑_{n=1}^{∞} (1 − p)^{n−1} = p / (1 − (1 − p)) = 1
Example. Let X be the number of tosses needed to get the first head.
P (X = n) = 1/2^n ,  n = 1, 2, 3, . . .
The mass function of X is then p(x) = 1/2^x . Hence,
∑_{all x} p(x) = ∑_{x=1}^{∞} 1/2^x = 1.
Example. Signals are transmitted according to a Poisson process with rate λ. Each signal
is successfully transmitted with probability p and lost with probability 1 − p. The fates
of different signals are independent. What is the distribution of the number of signals lost
before the first one is successfully transmitted?
The Poisson Random Variable
A random variable X, taking on one of the values 0, 1, 2, . . . is said to be a Poisson random
variable with parameter λ, if
p(k) = P (X = k) = e^{−λ} λ^k / k! ,  k = 0, 1, 2, . . .
This equation defines a probability mass function since
∑_{k=0}^{∞} p(k) = e^{−λ} ∑_{k=0}^{∞} λ^k / k! = e^{−λ} e^{λ} = 1.
A Poisson random variable involves observing discrete events in a continuous "interval"
of time, length, or space.
Example. Suppose that the number of typographical errors on a single page of a book has a
Poisson distribution with parameter λ = 1. Calculate the probability that there is at least
one error on a page.
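Since P (X ≥ 1) = 1 − P (X = 0) = 1 − e^{−λ}, the answer for λ = 1 is 1 − e^{−1} ≈ 0.632; a one-line check in Python:

```python
from math import exp

lam = 1.0
print(1 - exp(-lam))   # P(X >= 1) = 1 - P(X = 0) = 1 - e^{-1}, about 0.632
```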
Assume that the average number of occurrences of the event per unit of "time" is λ. Let
N (t) be the number of occurrences of the event in t units of "time". Then N (t) is a Poisson
random variable with parameter λt, that is
P (N (t) = k) = e^{−λt} (λt)^k / k! ,  k = 0, 1, 2, . . .
Example. People enter a casino at a rate of 1 for every 2 minutes.
(a) What is the probability that no one enters between 12:00 and 12:05?
(b) What is the probability that at least 4 people enter the casino during that time?
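A sketch of the computation in Python (one arrival per 2 minutes means a rate of 1/2 per minute, so the five-minute count is Poisson with parameter λt = 2.5):

```python
from math import exp, factorial

lam_t = 5 * 0.5                      # expected number of arrivals in 5 minutes

def poisson_pmf(k: int) -> float:
    # P(N(t) = k) for a Poisson(lam_t) count
    return exp(-lam_t) * lam_t**k / factorial(k)

p_none = poisson_pmf(0)                                   # (a) P(N(t) = 0) = e^{-2.5}
p_at_least_4 = 1 - sum(poisson_pmf(k) for k in range(4))  # (b) 1 - P(N(t) <= 3)
print(round(p_none, 4), round(p_at_least_4, 4))
```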
Theorem. Let X ∼ B(n, p). If n is large and p is small, with λ = np, then
P (X = x) = C(n, x) p^x (1 − p)^{n−x} ≈ e^{−λ} λ^x / x! .
In other words, B(n, p) ≈ Poisson(λ), where λ = np.
Proof. Write p = λ/n. Then
P (X = x) = C(n, x) p^x (1 − p)^{n−x}
= [n! / (x!(n − x)!)] (λ/n)^x (1 − λ/n)^{n−x}
= [n! / (x!(n − x)!)] [λ^x / (n − λ)^x ] (1 − λ/n)^n
= [n(n − 1) · · · (n − x + 1) / (n − λ)^x ] (λ^x / x!) (1 − λ/n)^n .
For large n, the first factor is approximately 1 and (1 − λ/n)^n ≈ e^{−λ}, so
P (X = x) ≈ e^{−λ} λ^x / x! .
Example. Suppose that the probability that a randomly chosen item is defective is 0.01, and
800 items are shipped to a warehouse. What is the probability that there will be at most 5
defective items among those 800 items?
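A numerical sketch comparing the exact binomial answer with the Poisson approximation of the theorem (n = 800, p = 0.01, so λ = np = 8):

```python
from math import comb, exp, factorial

n, p = 800, 0.01
lam = n * p                          # lambda = 8

# Exact: P(X <= 5) for X ~ B(800, 0.01)
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(6))

# Approximate: P(Y <= 5) for Y ~ Poisson(8)
approx = sum(exp(-lam) * lam**k / factorial(k) for k in range(6))

print(round(exact, 4), round(approx, 4))   # the two values are very close
```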
2.3 Continuous Random Variables
Let X be a random variable whose set of possible values is uncountable. Such a random
variable is called continuous.
Definition. A random variable X is continuous if there exists a nonnegative function f (x),
defined for all real x ∈ (−∞, ∞), having the property that for any set B of real numbers
P (X ∈ B) = ∫_B f (x) dx.
The function f (x) is called the probability density function of the random variable X.
A density function must have
∫_{−∞}^{∞} f (x) dx = P (X ∈ (−∞, ∞)) = 1
and
P (a ≤ X ≤ b) = ∫_{a}^{b} f (x) dx.
The relationship between the c.d.f F (x) and the p.d.f f (x) is expressed by
d/dx F (x) = f (x).
Remark. The density function is not a probability.
P (a − ε ≤ X ≤ a + ε) = ∫_{a−ε}^{a+ε} f (x) dx ≈ 2ε f (a)
when ε is small. From this, we see that f (a) is a measure of how likely it is that the random
variable will be near a.
The Uniform Random Variable
A random variable is said to be uniformly distributed over the interval (0, 1) if its probability
density function is given by
f (x) = 1 for 0 < x < 1, and f (x) = 0 otherwise.
Note that the preceding is a density function since f (x) ≥ 0 and
∫_{−∞}^{∞} f (x) dx = ∫_{0}^{1} 1 dx = 1,
and for any 0 < a < b < 1,
P (a ≤ X ≤ b) = ∫_{a}^{b} f (x) dx = ∫_{a}^{b} 1 dx = b − a.
In general, we say that X is a uniform random variable on the interval (α, β) if its p.d.f is
given by
 1

, α<x<β
f (x) = α − β

0,
otherwise.
Exponential Random Variables
A continuous random variable whose p.d.f is given, for some λ > 0, by
f (x) = λ e^{−λx} for x ≥ 0, and f (x) = 0 for x < 0,
is said to be an exponential random variable with parameter λ.
The c.d.f of X is
F (x) = ∫_{0}^{x} f (t) dt = ∫_{0}^{x} λ e^{−λt} dt = 1 − e^{−λx} ,  x ≥ 0.
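A quick numerical check of this c.d.f (a midpoint Riemann-sum sketch; λ = 2 and x = 1.5 are chosen only for illustration):

```python
from math import exp

lam, x = 2.0, 1.5
f = lambda t: lam * exp(-lam * t)    # exponential density on [0, infinity)

# integrate the density over [0, x] numerically and compare with 1 - e^{-lam * x}
n = 100_000
h = x / n
riemann = sum(f((i + 0.5) * h) for i in range(n)) * h
print(round(riemann, 6), round(1 - exp(-lam * x), 6))
```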
2.4 Expectation of a Random Variable
The Discrete Case
If X is a discrete random variable having a probability mass function p(x), then the
expected value of X is defined by
E(X) = ∑_{all x} x p(x)
provided ∑_{all x} |x| p(x) < ∞.
Lemma. If X is a non-negative integer-valued random variable, then
E(X) = ∑_{k=0}^{∞} P (X > k).
Example.
(a) (Expectation of a Binomial Random Variable) Let X ∼ B(n, p). Calculate E(X).
(b) (Expectation of a Geometric Random Variable) Calculate the expectation of a geometric random variable having parameter p.
(c) (Expectation of a Poisson Random Variable) Calculate the expectation of a Poisson
random variable having parameter λ.
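These expectations work out to np, 1/p, and λ respectively; a small sketch checking them numerically by summing x p(x) (the parameter values below are arbitrary, and the two infinite sums are truncated):

```python
from math import comb, exp, factorial

n, p, lam = 10, 0.3, 4.0     # illustrative parameters only

e_binom = sum(k * comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1))
e_geom = sum(k * (1 - p)**(k - 1) * p for k in range(1, 2000))               # truncated
e_poisson = sum(k * exp(-lam) * lam**k / factorial(k) for k in range(200))   # truncated

print(e_binom, e_geom, e_poisson)   # close to n*p = 3, 1/p = 3.33..., lam = 4
```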
The Continuous Case
The expected value of a continuous random variable is defined by
E(X) = ∫_{−∞}^{∞} x f (x) dx
provided ∫_{−∞}^{∞} |x| f (x) dx < ∞.
Lemma. If X is a non-negative random variable, then
E(X) = ∫_{0}^{∞} P (X > x) dx.
Example.
(a) (Expectation of a Uniform Random Variable) Let X be uniformly distributed over (α, β). Calculate E(X).
(b) (Expectation of an Exponential Random Variable) Calculate the expectation of an
exponential random variable having parameter λ.
(c) (Expectation of a Normal Random Variable) Calculate the expectation of a normal
random variable having parameters µ and σ^2 .
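Similarly, the uniform and exponential means come out to (α + β)/2 and 1/λ; a numerical sketch using a midpoint Riemann sum for E(X) = ∫ x f (x) dx (α = 2, β = 5, λ = 0.5 are arbitrary illustrative values):

```python
from math import exp

def mean_by_integration(f, a, b, n=200_000):
    # midpoint Riemann sum of x * f(x) over [a, b]
    h = (b - a) / n
    return sum((a + (i + 0.5) * h) * f(a + (i + 0.5) * h) for i in range(n)) * h

alpha, beta, lam = 2.0, 5.0, 0.5

uniform_mean = mean_by_integration(lambda x: 1 / (beta - alpha), alpha, beta)
expo_mean = mean_by_integration(lambda x: lam * exp(-lam * x), 0.0, 60.0)  # tail beyond 60 is negligible

print(round(uniform_mean, 4), round(expo_mean, 4))  # near (alpha + beta)/2 = 3.5 and 1/lam = 2
```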
2.5 Expectation of a Function of a Random Variable
Now, we are interested in calculating, not the expected value of X, but the expected
value of some function of X, say, g(X).
Proposition 1.
(a) If X is a discrete random variable with probability mass function p(x), then for any
real-valued function g,
E[g(X)] = ∑_{all x} g(x) p(x)
(b) If X is a continuous random variable with probability density function f (x), then for
any real-valued function g,
E[g(X)] = ∫_{−∞}^{∞} g(x) f (x) dx
Proposition 2. If a and b are constants, then
E(aX + b) = a E(X) + b.
and
E(X + Y ) = E(X) + E(Y )
Variance of a Random Variable
The expected value of a random variable X, E(X), is also referred to as the mean or the
first moment. The quantity E(X^n ), n ≥ 1, is called the nth moment of X.
The variance of X, denoted by Var(X), is defined by
Var(X) = E[X − E(X)]^2 .
A useful formula to compute the variance is
Var(X) = E(X^2 ) − [E(X)]^2 .
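As a quick illustration of this identity, a sketch computing the variance of a fair die roll both ways (the die is not from the notes, just a convenient discrete example):

```python
from fractions import Fraction

outcomes = range(1, 7)
p = Fraction(1, 6)                                   # fair die: p(x) = 1/6

ex = sum(x * p for x in outcomes)                    # E(X) = 7/2
ex2 = sum(x * x * p for x in outcomes)               # E(X^2) = 91/6

var_def = sum((x - ex) ** 2 * p for x in outcomes)   # E[(X - E(X))^2]
var_formula = ex2 - ex ** 2                          # E(X^2) - [E(X)]^2

print(var_def, var_formula)                          # both equal 35/12
```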
2.6 Jointly Distributed Random Variables
Thus far, we have concerned ourselves with the probability distribution of a single random
variable. However, we are often interested in probability statements concerning two or more
random variables.
Joint Distribution Function
To deal with probabilities of two random variables X and Y , we define the joint distribution function of X and Y by
FX,Y (a, b) = P (X ≤ a, Y ≤ b),
−∞ < a, b < ∞.
The distribution function of X can be obtained from the joint c.d.f as follows:
FX (a) = P (X ≤ a, Y < ∞) = F (a, ∞).
Similarly, the c.d.f. of Y is given by
FY (b) = P (X < ∞, Y ≤ b) = F (∞, b).
Joint Probability Mass Function
Let X and Y be both discrete random variables, then the joint mass function of X and
Y is given by
p(x, y) = P (X = x, Y = y).
The probability mass function of X may be obtained from p(x, y) by
pX (x) = ∑_{all y} p(x, y)
and similarly, the mass function of Y is
pY (y) = ∑_{all x} p(x, y).
Joint Probability Density Function
We say that X and Y are jointly continuous if there exists a function f (x, y), defined for
all real x and y, having the property that for all sets A and B of real numbers
P (X ∈ A, Y ∈ B) = ∫_B ∫_A f (x, y) dx dy.
The function f (x, y) is called the joint probability density function of X and Y . The p.d.f.
of X and Y can be obtained from their joint p.d.f. by
P (X ∈ A) = ∫_A ∫_{−∞}^{∞} f (x, y) dy dx
and
P (Y ∈ B) = ∫_B ∫_{−∞}^{∞} f (x, y) dx dy.
The integrals
fX (x) = ∫_{−∞}^{∞} f (x, y) dy   and   fY (y) = ∫_{−∞}^{∞} f (x, y) dx
are called the density functions of X and Y , respectively.
Expectation of a Function of Two Random Variables
If X and Y are random variables and g is a function of two variables, then
E[g(X, Y )] = ∑_y ∑_x g(x, y) p(x, y)   in the discrete case
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} g(x, y) f (x, y) dx dy   in the continuous case
For instance, if g(X, Y ) = X + Y , then, in the continuous case,
E(X + Y ) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} (x + y) f (x, y) dx dy
= ∫_{−∞}^{∞} ∫_{−∞}^{∞} x f (x, y) dx dy + ∫_{−∞}^{∞} ∫_{−∞}^{∞} y f (x, y) dx dy
= E(X) + E(Y )
Proposition. For any constants a and b,
E(aX + bY ) = aE(X) + bE(Y ).
Example. Let us compute the expectation of a binomial variable with parameters n and p.
X ∼ B(n, p).
Solution.
X = X1 + X2 + · · · + Xn
where Xi = 1 if the ith trial is a success, and Xi = 0 if the ith trial is a failure.
Hence, E(Xi ) = 0 · (1 − p) + 1 · p = p. Thus E(X) = E(X1 + X2 + · · · + Xn ) = n p.
Example. At a party N men throw their hats into the center of a room. The hats are mixed
up and each man randomly selects one. Find the expected number of men who select their
own hat.
Solution. Let X denote the number of men that select their own hats. Define Xi by
Xi = 1 if the ith man selects his own hat, and Xi = 0 otherwise.
Hence, X = X1 + X2 + · · · + XN and E(Xi ) = 1/N . Thus, E(X) = 1. That is, no matter
how many people are at the party, on average just one of them will select his own hat.
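A simulation sketch of this example (N = 20 is an arbitrary choice; the empirical average number of matches should be close to 1 for any N):

```python
import random

def matches(n: int) -> int:
    # one party: hats are returned in a random order; count men who get their own hat
    perm = list(range(n))
    random.shuffle(perm)
    return sum(1 for man, hat in enumerate(perm) if man == hat)

N, trials = 20, 100_000
print(sum(matches(N) for _ in range(trials)) / trials)   # close to 1
```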
Example. Suppose there are 4 different types of coupons and suppose that each time one
obtains a coupon, it is equally likely to be any one of the 4 types. Compute the expected
number of different types that are contained in a set of 10 coupons.
Solution. Define
Xi = 1 if at least one type-i coupon is in the set of 10, and Xi = 0 otherwise.
Hence X = X1 + X2 + X3 + X4 . Now,
E(Xi ) = P (Xi = 1)
= P (at least one type-i coupon is in the set of 10)
= 1 − P (no type-i coupons are in the set of 10)
= 1 − (3/4)^{10} .
Hence,
E(X) = E(X1 ) + E(X2 ) + E(X3 ) + E(X4 ) = 4 [1 − (3/4)^{10} ].
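Numerically this is about 3.77; a short sketch comparing the formula with a simulation (assuming each coupon is drawn independently and uniformly over the 4 types):

```python
import random

exact = 4 * (1 - (3 / 4) ** 10)

trials = 100_000
total = 0
for _ in range(trials):
    coupons = [random.randrange(4) for _ in range(10)]   # 10 coupons, 4 equally likely types
    total += len(set(coupons))                           # number of distinct types collected

print(round(exact, 4), round(total / trials, 4))
```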
2.7 Independent Random Variables
The random variables X and Y are said to be independent if, for all a, b,
P (X ≤ a, Y ≤ b) = P (X ≤ a) P (Y ≤ b).
When X and Y are discrete, the condition of independence reduces to
p(x, y) = pX (x) pY (y),
and if X and Y are jointly continuous, independence reduces to
f (x, y) = fX (x) fY (y).
Proposition. If X and Y are independent, then for any functions h and g, g(X) and h(Y )
are independent and
E[g(X) h(Y )] = E[g(X)] E[h(Y )].
Remark. In general, E[g(X) h(Y )] = E[g(X)] E[h(Y )] does NOT imply independence.
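A standard counterexample (not from these notes) behind the remark: let X be uniform on {−1, 0, 1} and Y = X^2. Then E(XY ) = E(X)E(Y ) = 0, yet Y is a function of X, so they are not independent. A quick enumeration:

```python
from fractions import Fraction

p = Fraction(1, 3)
xs = [-1, 0, 1]                        # X uniform on {-1, 0, 1}, Y = X^2

e_x  = sum(x * p for x in xs)          # E(X)  = 0
e_y  = sum(x * x * p for x in xs)      # E(Y)  = 2/3
e_xy = sum(x ** 3 * p for x in xs)     # E(XY) = E(X^3) = 0

print(e_xy == e_x * e_y)               # True: E(XY) = E(X) E(Y)

p_x1_y1 = sum(p for x in xs if x == 1)       # P(X = 1, Y = 1) = 1/3
p_x1 = sum(p for x in xs if x == 1)          # P(X = 1) = 1/3
p_y1 = sum(p for x in xs if x * x == 1)      # P(Y = 1) = 2/3
print(p_x1_y1 == p_x1 * p_y1)          # False: X and Y are not independent
```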
2.8 Covariance
The covariance of any two random variables X and Y , denoted by Cov(X, Y ), is defined
by
Cov(X, Y ) = E[(X − E(X))(Y − E(Y ))].
The following is a useful formula to compute the covariance:
Cov(X, Y ) = E(XY ) − E(X)E(Y ).
Proposition. If X and Y are independent, then Cov(X, Y ) = 0.
Properties of Covariance
For any random variables X, Y , Z, and a constant c,
• Cov(X, X) = Var(X)
• Cov(X, Y ) = Cov(Y, X)
• Cov(c X, Y ) = c Cov(X, Y )
• Cov(X, Y + Z) = Cov(X, Y ) + Cov(X, Z)
Sums of Random Variables
• Let X1 , X2 , . . . , Xn be a sequence of random variables. Then
Var(∑_{i=1}^{n} Xi ) = ∑_{i=1}^{n} Var(Xi ) + 2 ∑_{i=2}^{n} ∑_{j=1}^{i−1} Cov(Xi , Xj )
• If X1 , X2 , . . . , Xn are independent, then
Var(∑_{i=1}^{n} Xi ) = ∑_{i=1}^{n} Var(Xi )
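A sketch verifying the first formula on a small, made-up joint distribution of two random variables (the joint probabilities below are illustrative only):

```python
# made-up joint mass function p(x, y) on {0, 1} x {0, 1, 2}; probabilities sum to 1
joint = {(0, 0): 0.10, (0, 1): 0.25, (0, 2): 0.15,
         (1, 0): 0.20, (1, 1): 0.10, (1, 2): 0.20}

def expect(g):
    # E[g(X, Y)] = sum over (x, y) of g(x, y) p(x, y)
    return sum(g(x, y) * pr for (x, y), pr in joint.items())

e_x, e_y = expect(lambda x, y: x), expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - e_x) ** 2)
var_y = expect(lambda x, y: (y - e_y) ** 2)
cov = expect(lambda x, y: (x - e_x) * (y - e_y))

e_s = expect(lambda x, y: x + y)
var_sum = expect(lambda x, y: (x + y - e_s) ** 2)

print(round(var_sum, 6), round(var_x + var_y + 2 * cov, 6))   # the two numbers agree
```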