Chapter 5
Continuous random variable
5.1 Definitions, Mean and Variance
A continuous random variable is one which can take any value in an interval. Some
continuous random variables are defined on a finite interval, others on the whole
real line or the positive half of the real line. Whichever is the case, we cannot give
a non-zero probability to any particular value. Roughly speaking, if we could, then
adding up the uncountably many non-zero probabilities would give a total greater
than one. Instead, we find the probability that a continuous random variable
lies in an interval. Because the probability of each end-point is zero, it
does not matter whether we include or exclude the end-points.
Definition If X is a continuous random variable then there exists a non-negative
function $f(x)$, called the probability density function of X, such that
$$f(x) \ge 0, \qquad \int_{-\infty}^{\infty} f(x)\,dx = 1,$$
and
$$P(a < X < b) = \int_{a}^{b} f(x)\,dx.$$
If X is a continuous random variable, for any x1 and x2 :
P (x1 ≤ X ≤ x2 ) = P (x1 < X ≤ x2 ) = P (x1 ≤ X < x2 ) = P (x1 < X < x2 )
and
P (X = x) = 0
Note that any function which is non-negative and integrates to one is a possible
probability density function for a random variable X. As with discrete random
variables some density functions are commonly used to model continuous random
variables.
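As a rough numerical illustration of the two defining properties, the sketch below checks that a candidate function integrates to one and then computes an interval probability. The density $f(x) = 3x^2$ on $[0, 1]$ and the helper name `integrate` are assumptions chosen for illustration; they do not come from the text.

```python
def integrate(f, a, b, steps=10_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def density(x):
    # Hypothetical density (an assumption for this sketch):
    # f(x) = 3x^2 on [0, 1] and 0 elsewhere.
    return 3 * x**2 if 0 <= x <= 1 else 0.0


# Property 1: the density integrates to one over its support.
total = integrate(density, 0.0, 1.0)

# Property 2: P(a < X < b) is the integral of the density over (a, b).
prob = integrate(density, 0.2, 0.5)

print("integral over [0, 1]:", total)
print("P(0.2 < X < 0.5):", prob)
```

For this particular density the interval probability can also be found by hand as $0.5^3 - 0.2^3$, which gives a way to sanity-check the numerical result.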
It is also convenient to define the following function.
Definition The cumulative distribution function $F(x)$ of a continuous random
variable X is defined by
$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du.$$
Note that for a discrete random variable the cumulative distribution function
P (X ≤ x) will be a step function with steps of height P (X = xi ) at the points
at which X is defined. The continuous version can be thought of as a limiting
case when all values of x in an interval are possible. Note that the cumulative
distribution function is always non-decreasing and
$$\lim_{x \to -\infty} F(x) = 0, \qquad \lim_{x \to \infty} F(x) = 1.$$
Knowledge of either the density function or the cumulative distribution function
is enough to define X.
We define the mean of a continuous random variable as follows.
Definition If X is a continuous random variable with probability density function
$f(x)$ then the mean or expected value of X, $E[X]$ or $\mu$, is defined by
$$E[X] = \mu = \int_{-\infty}^{\infty} x f(x)\,dx.$$
The expected value is linear: for any constants c and d,
$$E[cX + d] = cE[X] + d.$$
Similarly, the variance is defined as follows.
Definition If X is a continuous random variable with probability density function
$f(x)$ then the variance of X, $\mathrm{Var}[X]$, is defined by
$$\mathrm{Var}[X] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu^2.$$
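The two expressions for the variance can be compared numerically. The following is a minimal sketch, assuming a hypothetical density $f(x) = 3x^2$ on $[0, 1]$ and a simple midpoint-rule helper named `integrate`; neither is taken from the text.

```python
def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def density(x):
    # Hypothetical density (an assumption for this sketch): f(x) = 3x^2 on [0, 1].
    return 3 * x**2


# Mean: integral of x * f(x).
mean = integrate(lambda x: x * density(x), 0.0, 1.0)

# Variance from the definition: integral of (x - mu)^2 * f(x).
var_definition = integrate(lambda x: (x - mean) ** 2 * density(x), 0.0, 1.0)

# Variance from the shortcut: integral of x^2 * f(x), minus mu^2.
second_moment = integrate(lambda x: x**2 * density(x), 0.0, 1.0)
var_shortcut = second_moment - mean**2

print("mean:", mean)
print("variance (definition):", var_definition)
print("variance (shortcut):  ", var_shortcut)
```

The two variance values should agree up to the accuracy of the numerical integration, which is what the identity in the definition asserts.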
We can also define the median of a continuous random variable.
Definition If X is a continuous random variable with probability density function
$f(x)$ then the median of X is the value m satisfying the equation
$$\int_{-\infty}^{m} f(x)\,dx = \int_{m}^{\infty} f(x)\,dx = \frac{1}{2}.$$
It is the value such that X is equally likely to be more than the median as less
than it.
Example 5.1
The recorded temperature in an engine is a r.v. X whose p.d.f. is given by:
$f(x) = n(1 - x)^{n-1}$, 0 ≤ x ≤ 1 (and 0 otherwise), where n ≥ 1 is a known integer.
1. Show that f is, indeed, a p.d.f.
2. Determine the corresponding c.d.f. F
Solution
1. Because $f(x) \ge 0$ for all x, we simply have to check that $\int_0^1 f(x)\,dx = 1$. To
this end,
$$\int_0^1 f(x)\,dx = \int_0^1 n(1 - x)^{n-1}\,dx = -n\,\frac{(1 - x)^n}{n}\Big|_0^1 = -(1 - x)^n\Big|_0^1 = 1.$$
2. First, $F(x) = 0$ for x < 0, whereas for 0 ≤ x ≤ 1,
$$F(x) = \int_0^x n(1 - t)^{n-1}\,dt = -(1 - t)^n\Big|_0^x = -(1 - x)^n + 1 = 1 - (1 - x)^n.$$
Thus,
$$F(x) = \begin{cases} 0, & x < 0 \\ 1 - (1 - x)^n, & 0 \le x \le 1 \\ 1, & x > 1. \end{cases}$$
△
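The closed-form c.d.f. of Example 5.1 can be cross-checked by integrating the p.d.f. numerically. This is only a sketch: the value n = 4 and the function names are assumptions made for illustration.

```python
def pdf(x, n):
    """p.d.f. of Example 5.1: n(1 - x)^(n - 1) on [0, 1], 0 otherwise."""
    return n * (1 - x) ** (n - 1) if 0 <= x <= 1 else 0.0


def cdf_closed(x, n):
    """Closed-form c.d.f. of Example 5.1."""
    if x < 0:
        return 0.0
    if x > 1:
        return 1.0
    return 1 - (1 - x) ** n


def cdf_numeric(x, n, steps=10_000):
    """Midpoint-rule integral of the p.d.f. from 0 to x (for 0 <= x <= 1)."""
    h = x / steps
    return h * sum(pdf((i + 0.5) * h, n) for i in range(steps))


n = 4  # illustrative choice, not specified in the example
for x in (0.1, 0.5, 0.9, 1.0):
    print(x, cdf_closed(x, n), cdf_numeric(x, n))
```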
Exercises
Exercise 5.1
A random variable X has probability density function
$$f(x) = c\,x^2 (1 - x)$$
if 0 ≤ x ≤ 1, 0 otherwise.
1. Determine c.
2. Find $E[X]$.
3. Find $\mathrm{Var}[X]$.
4. Show that the median m satisfies the equation
$$6m^4 - 8m^3 + 1 = 0.$$
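Solution
1. Since $\int_{-\infty}^{\infty} f(x)\,dx = 1$,
$$\int_0^1 c\,(x^2 - x^3)\,dx = c\left[\frac{x^3}{3} - \frac{x^4}{4}\right]_0^1 = c\,\frac{1}{12} = 1,$$
so c = 12.
2.
$$E[X] = \int_0^1 x \cdot 12(x^2 - x^3)\,dx = 12\left[\frac{x^4}{4} - \frac{x^5}{5}\right]_0^1 = \frac{3}{5}.$$
3.
$$E[X^2] = \int_0^1 x^2 \cdot 12(x^2 - x^3)\,dx = 12\left[\frac{x^5}{5} - \frac{x^6}{6}\right]_0^1 = \frac{12}{30} = \frac{2}{5},$$
$$\mathrm{Var}[X] = E[X^2] - E[X]^2 = \frac{2}{5} - \left(\frac{3}{5}\right)^2 = \frac{1}{25}.$$
4. The median m satisfies
$$\int_0^m 12(x^2 - x^3)\,dx = 0.5,$$
$$12\left[\frac{x^3}{3} - \frac{x^4}{4}\right]_0^m = 0.5,$$
$$12\left(\frac{m^3}{3} - \frac{m^4}{4}\right) = 0.5,$$
$$4m^3 - 3m^4 = 0.5,$$
$$6m^4 - 8m^3 + 1 = 0.$$
△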
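The results of Exercise 5.1 can be checked with exact rational arithmetic, and the median equation can be solved numerically. The sketch below is one possible check, not part of the original solution; the helper `integrate_poly` and the use of bisection are assumptions made for illustration.

```python
from fractions import Fraction


def integrate_poly(coeffs, a, b):
    """Exactly integrate sum(coeffs[k] * x**k) over [a, b] using fractions."""
    def antiderivative(x):
        return sum(Fraction(c) * x ** (k + 1) / (k + 1)
                   for k, c in enumerate(coeffs))
    return antiderivative(Fraction(b)) - antiderivative(Fraction(a))


# Unnormalised density x^2 - x^3, written as coefficients of 1, x, x^2, x^3.
c = 1 / integrate_poly([0, 0, 1, -1], 0, 1)

# With the density c*(x^2 - x^3): multiply by x and x^2 by shifting coefficients.
mean = integrate_poly([0, 0, 0, c, -c], 0, 1)
second_moment = integrate_poly([0, 0, 0, 0, c, -c], 0, 1)
variance = second_moment - mean**2


def median_poly(m):
    """Left-hand side of the median equation 6m^4 - 8m^3 + 1 = 0."""
    return 6 * m**4 - 8 * m**3 + 1


# Bisection on [0, 1]: the polynomial is positive at 0 and negative at 1.
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    if median_poly(mid) > 0:
        lo = mid
    else:
        hi = mid
median = (lo + hi) / 2

print("c =", c, " E[X] =", mean, " Var[X] =", variance)
print("median is approximately", median)
```

Bisection is used here only because the polynomial changes sign exactly once on [0, 1]; any other root-finding method would serve equally well.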
5.2 The exponential distribution
The exponential distribution can be used to model the amount of time until a
specific event occurs or to model the time between independent events.
The exponential probability density function with parameter λ is
$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0; \qquad \lambda > 0.$$
[Figure: the p.d.f. of an exponential r.v. with λ = 1.]
The mean and variance of an exponential random variable are given by
$$E[X] = \frac{1}{\lambda} \quad \text{and} \quad \mathrm{Var}[X] = \frac{1}{\lambda^2}.$$
The cumulative distribution function of an exponential random variable is given
by
$$F(x) = \begin{cases} 0, & x < 0 \\ 1 - e^{-\lambda x}, & x \ge 0. \end{cases}$$
The exponential distribution is the only continuous distribution that has the memoryless property. This property describes the fact that the remaining lifetime of an
object (whose lifetime follows an exponential distribution) does not depend on the
amount of time it has already lived. This property is represented by the following
equality, where s ≥ 0 and t ≥ 0:
$$P(X > s + t \mid X > s) = P(X > t).$$
In words, this means that the probability that the object will operate for time s+t,
given it has already operated for time s, is simply the probability that it operates
for time t.
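The memoryless property follows from the exponential survival function $P(X > x) = e^{-\lambda x}$, since the conditional probability is a ratio of two such terms. The short sketch below checks the equality numerically; the rate and the (s, t) pairs are arbitrary illustrative values.

```python
import math


def survival(x, lam):
    """P(X > x) = exp(-lam * x) for an exponential r.v. with rate lam, x >= 0."""
    return math.exp(-lam * x)


def conditional_survival(s, t, lam):
    """P(X > s + t | X > s), computed as P(X > s + t) / P(X > s)."""
    return survival(s + t, lam) / survival(s, lam)


lam = 0.5  # illustrative rate, not taken from the text
for s, t in [(0.0, 1.0), (2.0, 1.0), (5.0, 3.0)]:
    left = conditional_survival(s, t, lam)
    right = survival(t, lam)
    print(f"s={s}, t={t}: conditional={left:.6f}, unconditional={right:.6f}")
```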
Example 5.2
The lifetime of an automobile battery is described by a r.v. X having the Exponential distribution with parameter λ = 1/3.
1. Determine the expected lifetime of the battery and the variation around this
mean.
2. Calculate the probability that the lifetime will be between 2 and 4 time units.
3. If the battery has lasted for 3 time units, what is the (conditional) probability
that it will last for at least an additional time unit?
Solution
1. Since $E[X] = \frac{1}{\lambda}$ and $\mathrm{Var}(X) = \frac{1}{\lambda^2}$, we have here: $E[X] = 3$, $\mathrm{Var}(X) = 9$,
and s.d.(X) = 3.
2. Since $F(x) = 1 - e^{-x/3}$ for x > 0, we have
$$P(2 < X < 4) = P(X \le 4) - P(X \le 2) = F(4) - F(2) = (1 - e^{-4/3}) - (1 - e^{-2/3}) = e^{-2/3} - e^{-4/3} \approx 0.250.$$
3. The required probability is $P(X > 4 \mid X > 3) = P(X > 1)$, by the memoryless property of this distribution, and
$$P(X > 1) = 1 - P(X \le 1) = 1 - F(1) = e^{-1/3} \approx 0.717.$$
△
Answer: 3, 9; 0.250; 0.717.
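The quantities in Example 5.2 can be evaluated directly from the formulas for the exponential mean, variance and c.d.f. The sketch below does so with the standard `math` module; the variable names are illustrative choices.

```python
import math


def exp_cdf(x, lam):
    """c.d.f. of the exponential distribution with rate lam."""
    return 1 - math.exp(-lam * x) if x >= 0 else 0.0


lam = 1 / 3  # parameter given in Example 5.2

mean = 1 / lam
variance = 1 / lam**2
std_dev = math.sqrt(variance)

# Part 2: P(2 < X < 4) = F(4) - F(2).
p_between = exp_cdf(4, lam) - exp_cdf(2, lam)

# Part 3: by the memoryless property, P(X > 4 | X > 3) = P(X > 1) = 1 - F(1).
p_extra_unit = 1 - exp_cdf(1, lam)

print(f"mean={mean:.3f}, variance={variance:.3f}, s.d.={std_dev:.3f}")
print(f"P(2 < X < 4) = {p_between:.4f}")
print(f"P(X > 1)     = {p_extra_unit:.4f}")
```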
When the exponential is used to represent interarrival times, then the parameter λ is a rate with units of arrivals per time period. When the exponential is used
to model the time until a failure occurs, then λ is the failure rate.
Exercises
Exercise 5.2
The time between arrivals of vehicles at an intersection follows an exponential
distribution with a mean of 12 seconds. What is the probability that the time
between arrivals is 10 seconds or less?
Solution
We are given the mean interarrival time, 1/λ = 12 seconds, so λ = 1/12. The required probability
is obtained as follows:
$$P(X \le 10) = 1 - e^{-(1/12) \cdot 10} \approx 0.57.$$
△
Answer: 0.57
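The answer to Exercise 5.2 can also be approximated by simulation. The sketch below draws interarrival times with `random.expovariate` from the Python standard library; the sample size, seed and function name are arbitrary assumptions, and the estimates will vary slightly with them.

```python
import random


def simulate_interarrivals(mean_gap, n, seed=0):
    """Draw n exponential interarrival times with the given mean (rate = 1 / mean)."""
    rng = random.Random(seed)
    lam = 1.0 / mean_gap
    return [rng.expovariate(lam) for _ in range(n)]


gaps = simulate_interarrivals(mean_gap=12.0, n=100_000)

# The sample mean should be close to 1 / lambda = 12 seconds.
sample_mean = sum(gaps) / len(gaps)

# The fraction of gaps of at most 10 seconds estimates P(X <= 10).
frac_short = sum(g <= 10 for g in gaps) / len(gaps)

print(f"sample mean = {sample_mean:.3f}")
print(f"estimated P(X <= 10) = {frac_short:.3f}")
```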
5.3 The uniform distribution
A random variable that is uniformly distributed over the interval (a, b) follows the
probability density function given by
$$f(x) = \frac{1}{b - a}, \qquad a < x < b.$$
[Figure: the p.d.f. of the U[0, 5] distribution.]
The parameters for the uniform are the interval endpoints, a and b. The mean
and variance of a uniform random variable are given by
$$E[X] = \frac{a + b}{2} \quad \text{and} \quad \mathrm{Var}[X] = \frac{(b - a)^2}{12}.$$
The cumulative distribution function for a uniform random variable is
$$F(x) = \begin{cases} 0, & x \le a \\ \dfrac{x - a}{b - a}, & a < x < b \\ 1, & x \ge b. \end{cases}$$
Example 5.3
If the r.v. X is distributed as U(−α, α), (α > 0), determine the parameter α, so
that each of the following equalities holds:
1. $P(-1 < X < 2) = 0.75$.
2. $P(|X| < 1) = P(|X| > 2)$.
Solution
Since X has density $\frac{1}{2\alpha}$ on (−α, α):
1. $P(-1 < X < 2) = \frac{3}{2\alpha} = 0.75$ and α = 2.
2. $P(|X| < 1) = P(|X| > 2)$ is equivalent to $\frac{1}{\alpha} = 1 - \frac{2}{\alpha}$, from which α = 3.
△
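The two values of α found in Example 5.3 can be verified by plugging them into the uniform c.d.f. The sketch below is one way to do this; the helper names `uniform_cdf` and `prob_between` are assumptions made for illustration.

```python
def uniform_cdf(x, a, b):
    """c.d.f. of the uniform distribution on (a, b)."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


def prob_between(lo, hi, a, b):
    """P(lo < X < hi) for X uniform on (a, b)."""
    return uniform_cdf(hi, a, b) - uniform_cdf(lo, a, b)


# Part 1: with alpha = 2, P(-1 < X < 2) should equal 0.75.
alpha = 2.0
p_part1 = prob_between(-1, 2, -alpha, alpha)

# Part 2: with alpha = 3, P(|X| < 1) should equal P(|X| > 2).
alpha = 3.0
p_inside = prob_between(-1, 1, -alpha, alpha)
p_outside = 1 - prob_between(-2, 2, -alpha, alpha)

print("part 1:", p_part1)
print("part 2:", p_inside, p_outside)
```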
Exercises
Exercise 5.3
If the r.v. X is distributed as U(0, 1), compute the expectations:
1. $E(3X^2 - 7X + 2)$.
2. $E(2e^X)$.
Solution
$E[X] = \frac{1}{2}$.
1.
$$E(3X^2 - 7X + 2) = 3E[X^2] - 7E[X] + 2 = 3\int_0^1 \frac{x^2}{1 - 0}\,dx - 7 \cdot \frac{1}{2} + 2 = -0.5.$$
2.
$$E(2e^X) = 2E[e^X] = 2\int_0^1 \frac{e^x}{1 - 0}\,dx = 2e^x\Big|_0^1 = 2(e - 1) \approx 3.44.$$
△
Answer: −0.5; 3.44.
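For X distributed as U(0, 1) the density equals one on the unit interval, so $E[g(X)]$ is simply the integral of g over [0, 1]. The sketch below approximates both expectations of Exercise 5.3 with a midpoint rule; the helper name `expect_uniform01` is an assumption made for illustration.

```python
import math


def expect_uniform01(g, steps=10_000):
    """Approximate E[g(X)] for X ~ U(0, 1): the integral of g over [0, 1]."""
    h = 1.0 / steps
    return h * sum(g((i + 0.5) * h) for i in range(steps))


# Part 1: E(3X^2 - 7X + 2).
e_poly = expect_uniform01(lambda x: 3 * x**2 - 7 * x + 2)

# Part 2: E(2 e^X).
e_exp = expect_uniform01(lambda x: 2 * math.exp(x))

print(f"E(3X^2 - 7X + 2) = {e_poly:.4f}")
print(f"E(2e^X)          = {e_exp:.4f}")
```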
5.4 The normal distribution
A well known distribution in statistics and engineering is the normal distribution. Also called the Gaussian distribution, it has a continuous probability density
function given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right),$$
where −∞ < x < ∞; −∞ < µ < ∞; σ² > 0.
The normal distribution is completely determined by its parameters (µ and σ 2 ),
which are also the expected value and variance for a normal random variable. The
notation X ∼ N (µ, σ 2 ) is used to indicate that a random variable X is normally
distributed with mean µ and variance σ 2 .
Some special properties of the normal distribution are given below
• The value of the probability density function approaches zero as x approaches
positive and negative infinity.
• The probability density function is centered at the mean µ, and the maximum
value of the function occurs at x = µ.
• The probability density function for the normal distribution is symmetric
about the mean µ.
[Figure: normal p.d.f.s for µ = 1.5 and several values of σ.]
The special case of a standard normal random variable is one whose mean is
zero (µ = 0), and whose standard deviation is one (σ = 1). If X is normally
distributed, then
$$Z = \frac{X - \mu}{\sigma}$$
is a standard normal random variable.
[Figure: the p.d.f. of the standard normal distribution.]
The cumulative distribution function of a standard normal random variable is
denoted by
$$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} \exp\left(-\frac{y^2}{2}\right) dy.$$
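If X ∼ N(µ, σ²) and a, b are real numbers then:
$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < \frac{X - \mu}{\sigma} < \frac{b - \mu}{\sigma}\right) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right).$$
That is,
$$P(a < X < b) = \Phi\left(\frac{b - \mu}{\sigma}\right) - \Phi\left(\frac{a - \mu}{\sigma}\right).$$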
If Z ∼ N (0, 1), it is seen from the Normal tables that:
P (−1 < Z < 1) = 0.68269,
P (−2 < Z < 2) = 0.95450,
P (−3 < Z < 3) = 0.99730,
so that almost all of the probability mass lies within 3 standard deviations from
the mean. The same is true in case X ∼ N (µ, σ 2 ). That is:
P (µ − σ < X < µ + σ) = 0.68269,
P (µ − 2σ < X < µ + 2σ) = 0.95450,
P (µ − 3σ < X < µ + 3σ) = 0.99730.
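Instead of reading Normal tables, Φ can be evaluated through the error function, using the identity $\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)$. The sketch below uses `math.erf` from the Python standard library to recompute the one, two and three standard deviation probabilities; the function names and the parameters µ = 65, σ = 15 in the second loop are illustrative assumptions.

```python
import math


def phi(z):
    """Standard normal c.d.f., expressed through the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_interval_prob(a, b, mu, sigma):
    """P(a < X < b) for X ~ N(mu, sigma^2), by standardising the end-points."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)


# Probability within k standard deviations for the standard normal ...
for k in (1, 2, 3):
    print(k, normal_interval_prob(-k, k, 0.0, 1.0))

# ... and for an arbitrary normal distribution (illustrative parameters).
mu, sigma = 65.0, 15.0
for k in (1, 2, 3):
    print(k, normal_interval_prob(mu - k * sigma, mu + k * sigma, mu, sigma))
```

Both loops should print the same three probabilities, since standardising removes the dependence on µ and σ.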
Example 5.4
Suppose that numerical grades in a statistics class are values of a r.v. X which
is (approximately) Normally distributed with mean µ = 65 and s.d. σ = 15.
Furthermore, suppose that letter grades are assigned according to the following
rule: the student receives an A if X ≥ 85; B if 70 ≤ X < 85; C if 55 ≤ X < 70;
D if 45 ≤ X < 55; and F if X < 45. Identify the expected proportions of letter
grades to be assigned.
Solution
The student earns an A with probability
$$P(X \ge 85) = 1 - P(X < 85) = 1 - P\left(\frac{X - \mu}{\sigma} < \frac{85 - 65}{15}\right) = 1 - P(Z < 1.34) = 1 - \Phi(1.34) = 1 - 0.909877 = 0.090123 \approx 0.09.$$
The student earns a B with probability
$$P(70 \le X < 85) = P\left(\frac{70 - 65}{15} \le \frac{X - \mu}{\sigma} < \frac{85 - 65}{15}\right) = P(0.34 \le Z < 1.34) = \Phi(1.34) - \Phi(0.34) = 0.909877 - 0.633072 = 0.276805 \approx 0.28.$$
Similarly, the student earns a C with probability
$$P(55 \le X < 70) = \Phi(0.34) - \Phi(-0.67) = \Phi(0.34) + \Phi(0.67) - 1 = 0.381643 \approx 0.38.$$
The student earns a D with probability
$$P(45 \le X < 55) = \Phi(-0.67) - \Phi(-1.34) = \Phi(1.34) - \Phi(0.67) = 0.161306 \approx 0.16,$$
and the student is assigned an F with probability
$$P(X < 45) = \Phi(-1.34) = 1 - \Phi(1.34) = 0.090123 \approx 0.09.$$
△
Answer: 9%, 28%, 38%, 16%, and 9%.
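The grade proportions of Example 5.4 can be recomputed without tables. The sketch below evaluates Φ through `math.erf` at the unrounded z-values, so the later decimal places may differ slightly from table-based figures, while the percentages rounded to whole numbers should agree. The dictionary layout and names are illustrative assumptions.

```python
import math


def phi(z):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mu, sigma = 65.0, 15.0  # parameters given in Example 5.4


def cdf(x):
    """P(X <= x) for X ~ N(mu, sigma^2)."""
    return phi((x - mu) / sigma)


proportions = {
    "A": 1 - cdf(85),          # X >= 85
    "B": cdf(85) - cdf(70),    # 70 <= X < 85
    "C": cdf(70) - cdf(55),    # 55 <= X < 70
    "D": cdf(55) - cdf(45),    # 45 <= X < 55
    "F": cdf(45),              # X < 45
}

for grade, p in proportions.items():
    print(f"{grade}: {p:.4f} (about {round(100 * p)}%)")
```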
Exercises
Exercise 5.4
The distribution of I.Q.s of the people in a given group is approximated well by the
Normal distribution with µ = 105 and σ = 20. What proportion of the individuals
in the group in question has an I.Q.:
1. At least 50?
2. At most 80?
3. Between 95 and 125?
Solution
X ∼ N(105, 20²)
1.
$$P(X > 50) = P\left(Z > \frac{50 - 105}{20}\right) = P(Z > -2.75) = P(Z < 2.75) = 0.997020.$$
2.
$$P(X < 80) = P\left(Z < \frac{80 - 105}{20}\right) = P(Z < -1.25) = 1 - P(Z < 1.25) = 0.10565.$$
3.
$$P(95 < X < 125) = P\left(\frac{95 - 105}{20} < Z < \frac{125 - 105}{20}\right) = P(-0.5 < Z < 1) = P(Z < 1) - P(Z < -0.5) = P(Z < 1) + P(Z < 0.5) - 1 = 0.532807.$$
△
Answer: 0.997020; 0.10565; 0.532807.
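The three probabilities of Exercise 5.4 can be checked with `statistics.NormalDist`, which is part of the Python standard library (version 3.8 and later). This is a minimal sketch; the variable names are illustrative.

```python
from statistics import NormalDist

# I.Q. distribution from Exercise 5.4: mean 105, standard deviation 20.
iq = NormalDist(mu=105, sigma=20)

# 1. At least 50: P(X > 50) = 1 - F(50).
p_at_least_50 = 1 - iq.cdf(50)

# 2. At most 80: P(X < 80) = F(80).
p_at_most_80 = iq.cdf(80)

# 3. Between 95 and 125: F(125) - F(95).
p_between = iq.cdf(125) - iq.cdf(95)

print(f"P(X > 50)        = {p_at_least_50:.6f}")
print(f"P(X < 80)        = {p_at_most_80:.6f}")
print(f"P(95 < X < 125)  = {p_between:.6f}")
```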