Download Continuous Distributions

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Continuous Distributions
Continuous Distributions
A normal distribution and other density functions involving exponential play the most important role
in probability and statistics. They are related in a certain way, as summarized in a diagram later in
this topic.
10
10.1
Exponential and Gamma Distributions
Exponential distribution
The exponential density is defined by
(
λe−λx
f (x) =
0
x ≥ 0;
x<0
Then, the cdf is computed as
(
1 − e−λt
F (t) =
0
t ≥ 0;
t<0
Memoryless property. The function S(t) = 1 − F (t) = P (X > t) is called the survival function.
When F (t) is an exponential distribution, we have S(t) = e−λt for t ≥ 0. Furthermore, we can find
that
S(t + s)
P (X > t + s)
=
= S(t) = P (X > t),
(9)
P (X > t + s | X > s) =
P (X > s)
S(s)
which is referred as the memoryless property of exponential distribution.
10.2
Gamma distribution
The gamma function, denoted by Γ(α), is defined as
Z ∞
uα−1 e−u du,
Γ(α) =
α > 0.
0
Then the gamma density is defined as
f (x) =
λα α−1 −λx
x
e
Γ(α)
x≥0
which depends on two parameters α > 0 and λ > 0. We call the parameter α a shape parameter,
because changing α changes the shape of the density. We call the parameter λ a rate parameter, which
rescales the density without changing its shape. This is equivalent to changing the units of measurement
(feet to meters, or seconds to minutes). Suppose that the parameter α is an integer n. Then we have
Γ(n) = (n − 1)!. In particular, the gamma distribution with α = 1 becomes an exponential distribution
(with parameter λ).
n
and λ = 12 is called the chi-square
Chi-square distribution. The gamma distribution with α =
2
distribution with n degrees of freedom.
Page 26
MATH 4470-5470/November 3, 2010
Continuous Distributions
Expectation, variance and mgf of a gamma random variable Let X be a gamma random
variable with parameter (α, λ). For t < λ, the mgf MX (t) of X can be computed as follows.
Z ∞
Z ∞
α
λα
α−1 −λx
tx λ
x
e
dx =
xα−1 e−(λ−t)x dx
e
MX (t) =
Γ(α)
Γ(α) 0
0
α
α
Z ∞
Z ∞
λα
1
λ
λ
1
α−1 −(λ−t)x
α−1 −u
=
((λ
−
t)x)
e
(λ
−
t)dx
=
u
e
du
=
(λ − t)α Γ(α) 0
λ−t
Γ(α) 0
λ−t
Then we can find the first and second moment
′
E[X] = MX
(0) =
Thus, we can obtain Var(X) =
10.3
α
λ
and
′′
E[X 2 ] = MX
(0) =
α(α + 1)
λ2
α
α(α + 1) α2
− 2 = 2.
2
λ
λ
λ
Exercises
1. Let X be a gamma random variable with parameter (α, λ), and let Y = cX with c > 0.
(a) Show that
P (Y ≤ t) =
where
f (x) =
Z
t
f (x) dx,
−∞
(λ/c)α α−1 −(λ/c)x
x
e
Γ(α)
(b) Conclude that Y is a gamma random variable with parameter (α, λ/c). Thus, only λ is
affected by the transformation Y = cX, which justifies to call λ a rate parameter.
2. Suppose that the lifetime of an electronic component follows an exponential distribution with
λ = 0.2.
(a) Find the probability that the lifetime is less than 10.
(b) Find the probability that the lifetime is between 5 and 15.
3. X is an exponential random variable, and P (X < 1) = 0.05. What is λ?
4. The lifetime in months of each component in a microchip is exponentially distributed with pa1
rameter µ = 10000
.
(a) For a particular component, we want to find the probability that the component breaks down
before 5 months, and the probability that the component breaks down before 50 months.
Use the approximation e−x ≈ 1 − x for a small number x to calculate these probabilities.
(b) Suppose that the microchip contains 1000 of such components and is operational until three
of these components break down. The lifetimes of these components are independent and ex1
. By using the Poisson approximation, find
ponentially distributed with parameter µ = 10000
the probability that the microchip operates functionally for 5 months, and the probability
that the microchip operates functionally for 50 months.
5. The length of a phone call can be represented by an exponential distribution. Then answer to
the following questions.
(a) A report says that 50% of phone call ends in 20 minutes. Assuming that this report is true,
calculate the parameter λ for the exponential distribution.
Page 27
MATH 4470-5470/November 3, 2010
Continuous Distributions
(b) Suppose that someone arrived immediately ahead of you at a public telephone booth. Using
the result of (a), find the probability that you have to wait between 10 and 20 minutes.
Optional Problems.
1. Suppose that a non-negative random variable X has the memoryless property
P (X > t + s | X > s) = P (X > t).
By completing the following questions, we will show that X must be an exponential random
variable.
(a) Let S(t) is the survival function of X. Show that the memoryless property implies that
S(s + t) = S(s)S(t) for s, t ≥ 0.
(b) Let κ = S(1). Show that κ > 0.
1
m
n
(c) Show that S n1 = κ n and S m
n =κ .
Therefore, we can obtain S(t) = κt for any t ≥ 0. By letting λ = − ln κ, we can write S(t) = e−λt .
2. The gamma function is a generalized factorial function.
(a) Show that Γ(1) = 1.
(b) Show that Γ(α + 1) = αΓ(α). Hint: Use integration by parts.
(c) Conclude that Γ(n) = (n − 1)! for n = 1, 2, 3, . . ..
√
(d) Use the formula Γ( 12 ) = π to show that if n is an odd integer then we have
√
n
π(n − 1)!
= n−1 n−1 Γ
2
2
!
2
11
Normal Distributions
A normal distribution is represented by a family of distributions which have the same general shape,
sometimes described as “bell shaped.” The normal distribution has the pdf
1
(x − µ)2
f (x) := √ exp −
,
(10)
2σ 2
σ 2π
which depends upon two parameters µ and σ 2 . Here π = 3.14159 . . . is the famous “pi” (the ratio of
the circumference of a circle to its diameter), and exp(u) is the exponential function eu with the base
e = 2.71828 . . . of the natural logarithm.
Much celebrated integrations are the following:
Z ∞
(x−µ)2
1
√
e− 2σ2 dx = 1,
σ 2π −∞
Z ∞
(x−µ)2
1
√
xe− 2σ2 dx = µ,
σ 2π −∞
Z ∞
(x−µ)2
1
√
(x − µ)2 e− 2σ2 dx = σ 2 .
σ 2π −∞
(11)
The integration (11) guarantees that the density function (10) always represents a “probability density”
no matter what values the parameters µ and σ 2 are chosen. We say that a random variable X is normally
Page 28
MATH 4470-5470/November 3, 2010
Continuous Distributions
distributed with parameter (µ, σ 2 ) when X has the density function (10). The parameter µ, called a
mean (or, location parameter), provides the center of the density, and the density function f (x) is
symmetric around µ. The parameter σ is a standard deviation (or, a scale parameter); small values
of σ lead to high peaks but sharp drops. Larger values of σ lead to flatter densities. The shorthand
notation
X ∼ N (µ, σ 2 )
is often used to express that X is a normal random variable with parameter (µ, σ 2 ).
11.1
Linear transform of normal random variable
One of the important properties of normal distribution is that if X is a normal random variable with
parameter (µ, σ 2 ) [that is, the pdf of X is given by density function (10)]. then Y = aX + b is also a
normal random variable having parameter (aµ + b, (aσ)2 ). In particular,
X −µ
σ
(12)
becomes a normal random variable with parameter (0, 1), called the z-score. Then the normal density
φ(x) with parameter (0, 1) becomes
x2
1
φ(x) := √ e− 2 ,
2π
which is known as the standard normal density. We define the cdf Φ by
Z t
φ(x) dx.
Φ(t) :=
−∞
11.2
Moment generating function
Let X be a standard normal random variable. Then we can compute the mgf of X as follows.
2
Z ∞
Z ∞
Z ∞
2
2
2
t2
t
1
1
1
tx − x2
− 12 (x−t)2 + t2
− (x−t)
2
2
e e
e
e
dx = √
dx = e √
dx = exp
MX (t) = √
2
2π −∞
2π −∞
2π −∞
Since X = (Y − µ)/σ becomes a standard normal random variable and Y = σX + µ, the mgf M (t) of
Y can be given by
2 2
σ t
µt
M (t) = e MX (σt) = exp
+ µt .
2
Then we can find the first and second moment
E[X] = M ′ (0) = µ
E[X 2 ] = M ′′ (0) = µ2 + σ 2
and
Thus, we can obtain Var(X) = (µ2 + σ 2 ) − µ2 = σ 2 .
11.3
Calculating probability
Suppose that a tomato plant height X is normally distributed with parameter (µ, σ 2 ). Then what is
the probability that the tomato plant height is between a and b?
How to calculate. The integration
P (a ≤ X ≤ b) =
Page 29
Z
b
a
(x−µ)2
1
√ e− 2σ2 dx
σ 2π
MATH 4470-5470/November 3, 2010
Continuous Distributions
X −µ
seems too complicated. But if we consider the random variable
, then the event {a ≤ X ≤ b} is
σ
equivalent to the event
X −µ
b−µ
a−µ
≤
≤
.
σ
σ
σ
In terms of probability, this means that
P (a ≤ X ≤ b) = P
a−µ
X −µ
b−µ
≤
≤
σ
σ
σ
.
(13)
And it is easy to calculate
b−µ
a−µ
and b′ =
.
σ
σ
Since the z-score (12) is distributed as the standard normal distribution, the above probability becomes
a′ =
P (a ≤ X ≤ b) = P
a′ ≤
X −µ
≤ b′
σ
Z
=
b′
a′
φ(x) dx = Φ(b′ ) − Φ(a′ ).
(14)
Then we will be able to look at the values for Φ(a′ ) and Φ(b′ ) from the table of standard normal
distribution. For example, if a′ = −0.19 and b′ = 0.29, then Φ(−0.19) = 0.4247 and Φ(0.29) = 0.6141,
and the probability of interest becomes 0.1894, or approximately 0.19.
Suppose that a random variable X is normally distributed with mean µ and standard deviation σ.
Then we can obtain
a−µ
b−µ
−Φ
;
1. P (a ≤ X ≤ b) = Φ
σ
σ
b−µ
;
2. P (X ≤ b) = Φ
σ
a−µ
3. P (a ≤ X) = 1 − Φ
.
σ
11.4
Exercises
1. Let X be a normal random variable with parameter (µ, σ 2 ), and let Y = aX + b with a 6= 0.
(a) Show that
P (Y ≤ t) =
where
Z
t
f (x) dx,
−∞
1
(x − aµ − b)2
√ exp −
f (x) =
.
2(aσ)2
|a|σ 2π
(b) Conclude that Y is a normal random variable with parameter (aµ + b, (aσ)2 ).
2. Let X be a normal random variable with parameter (µ, σ 2 ), and let Y = eX . Find the density
function f of Y by forming
Z t
f (x) dx.
P (Y ≤ t) =
0
This density function f is called the lognormal density since ln Y is normally distributed.
3. Show that the standard normal density integrates to unity via (a) and (b).
Page 30
MATH 4470-5470/November 3, 2010
Continuous Distributions
Z ∞Z
(a) Show that
−∞
1
(b) Show that √
σ 2π
∞
e−
−∞
Z ∞
x2 +y 2
2
dx dy =
Z
0
(x−µ)2
−
2σ 2
e
−∞
∞ Z 2π
r2
e− 2 r dθ dr = 2π.
0
1
dx = √
2π
Z
∞
e−
y2
2
dy = 1.
−∞
4. Suppose that in a certain population, individual’s heights are approximately normally distributed
with parameters µ = 70 and σ = 3in.
(a) What proportion of the population is over 6ft. tall?
(b) What is the distribution of heights if they are expressed in centimeters?
5. Let X be a normal random variable with µ = 5 and σ = 10. Find (a) P (X > 10), (b) P (−20 <
X < 15), and (c) the value of x such that P (X > x) = 0.05.
6. If X ∼ N (µ, σ 2 ), then show that P (|X − µ| ≤ 0.675σ) = 0.5.
7. X ∼ N (µ, σ 2 ), find the value of c in terms of σ such that P (µ − c ≤ X ≤ µ + c) = 0.95.
12
12.1
Sum of Independent Random Variables
Normal and gamma random variables
Normal random variables. Let X and Y be independent normal random variables with the respective parameters (µx , σx2 ) and (µy , σy2 ). Then the sum Z = X + Y of random variables has the
mgf
!
!
2 2
σy2 t2
(σx2 + σy2 )t2
σx t
+ µx t · exp
+ µy t = exp
+ (µx + µy )t
MZ (t) = MX (t) · MY (t) = exp
2
2
2
which is the mgf of normal distribution with parameter (µx + µy , σx2 + σy2 ). By the property (a) of mgf,
we can find that Z is a normal random variable with parameter (µx + µy , σx2 + σy2 ).
Gamma random variables. Let X and Y be independent gamma random variables with the respective parameters (α1 , λ) and (α2 , λ). Then the sum Z = X + Y of random variables has the mgf
MZ (t) = MX (t) · MY (t) =
λ
λ−t
α1 α2 α1 +α2
λ
λ
·
=
λ−t
λ−t
which is the mgf of gamma distribution with parameter (α1 + α2 , λ). Thus, Z is a gamma random
variable with parameter (α1 + α2 , λ).
12.2
Chi-square random variables
The gamma distribution
f (x) =
1
xn/2−1 e−x/2 ,
2n/2 Γ(n/2)
x≥0
with parameter (n/2, 1/2) is called the chi-square distribution with n degrees of freedom.
Relation between normal and chi-square random variable. Let X be a standard normal random
variable, and let
Y = X2
Page 31
MATH 4470-5470/November 3, 2010
Continuous Distributions
√
be
Then we can express the cdf FY of Y as FY (t) = P (X 2 ≤ t) = P (− t ≤ X ≤
√ of X. √
√ the square
t) = Φ( t) − Φ(− t) in terms of the standard normal cdf Φ. By differentiating the cdf, we can
obtain the pdf fY as follows.
fY (t) =
where we use Γ(1/2) =
of freedom.
√
1
t
d
1
1
t−1/2 e−t/2 ,
FY (t) = √ t− 2 e− 2 = 1/2
dt
2 Γ(1/2)
2π
π. Thus, the square X 2 of X is the chi-square random variable with 1 degree
Now let X1 , . . . , Xn be iid standard normal random variables. Since Xi2 ’s have the gamma distribution
with parameter (1/2, 1/2), the sum
n
X
Xi2
Y =
i=1
has the gamma distribution with parameter (n/2, 1/2). That is, the sum Y has the chi-square distribution with n degree of freedom.
Page 32
MATH 4470-5470/November 3, 2010
Continuous Distributions
13
Summary Diagram
(4)
Normal(µ, σ 2 )
f (x) = √
(x−µ)2
1
e− 2σ2 ,
2πσ
E[X] = µ,
? Exponential(λ)
f (x) = λe−λx
−∞ < x < ∞
E[X] =
V ar(X) = σ 2
6
1
,
λ
x≥0
V ar(X) =
1
λ2
6
(5)
(2)
(1)
?
(3)
Chi-square (with
of freedom)
? m degrees ?
f (x) =
1 − x2 x m
( 2 ) 2 −1
2e
,
Γ( m
2)
E[X] = m,
x≥0
V ar(X) = 2m
(6)
Gamma(α, λ)
λα α−1 −λx †
x
e
, x≥0
Γ(α)
α
α
E[X] = , V ar(X) = 2
λ
λ
f (x) =
(7)
6
† Γ(n) = (n − 1)!, when n
is a positive integer.
1. If X1 and X2 are independent normal random variables with respective parameters (µ1 , σ12 ) and
(µ2 , σ22 ), then X = X1 + X2 is a normal random variable with (µ1 + µ2 , σ12 + σ22 ).
m X
Xk − µ 2
2
2. If X1 , · · · , Xm are iid normal random variables with parameter (µ, σ ), then X =
σ
is a chi-square (χ2 ) random variable with m degrees of freedom.
3. α =
m
2
k=1
and λ = 12 .
4. If X1 , . . . , Xn are independent exponential random variables with respective parameters
λ1 , · · · , λn ,
Pn
then X = min{X1 , . . . , Xn } is an exponential random variable with parameter i=1 λi .
5. If X1 , · · · , Xn are iid exponential random variables with parameter λ, then X =
gamma random variable with (α = n, λ).
n
X
Xk is a
k=1
6. α = 1.
7. If X1 and X2 are independent random variables having gamma distributions with respective
parameters (α1 , λ) and (α2 , λ), then X = X1 + X2 is a gamma random variable with parameter
(α1 + α2 , λ).
Page 33
MATH 4470-5470/November 3, 2010
Continuous Distributions
14
14.1
Exercise Solutions
Exponential and gamma distributions
Z t
(λ/c)α α−1 −(λ/c)y
λα α−1 −λx
x
e
dx =
y
e
dy
1. (a) P (Y ≤ t) = P (cX ≤ t) = P (X ≤ t/c) =
−∞ Γ(α)
−∞ Γ(α)
by substituting x = y/c. (b) The integrand f (y) is a gamma density with (α, λ/c).
Z
t/c
2. Let X be the lifetime of the electronic component.
(a) P (X ≤ 10) = 1 − e−(0.2)(10) ≈ 0.8647.
(b) P (5 ≤ X ≤ 15) = P (X ≤ 15) − P (X ≤ 5) = e−(0.2)(5) − e−(0.2)(15) ≈ 0.3181.
3. By solving P (X < 1) = 1 − e−λ = 0.05, we obtain λ = − ln 0.95 ≈ 0.0513.
4. (a) Let p1 be the probability that the component breaks down before 5 months, and let p2 be the
probability that the component breaks down before 50 months. Then,
p1 = 1 − e−5µ ≈ 1 − (1 − 0.0005) = 0.0005;
p2 = 1 − e−50µ ≈ 1 − (1 − 0.005) = 0.005.
(b) Let X1 be the number of components breaking down before 5 months, and let X2 be the
number of components breaking down before 50 months. Then Xi has the binomial distribution
with parameter (pi , 1000) for i = 1, 2. By using the Poisson approximation with λ1 = 1000p1 = 0.5
and λ2 = 1000p2 = 5, we can obtain
λ1 2
−λ1
P (X1 ≤ 2) = e
1 + λ1 +
≈ 0.986;
2
λ2 2
−λ2
P (X2 ≤ 2) = e
1 + λ2 +
≈ 0.125.
2
5. Let X be the length of phone call in minutes.
(a) By solving P (X ≤ 20) = 1 − e−20λ = 0.5, we obtain λ =
(b) P (10 ≤ X ≤ 20) = (1 −
14.2
e−20λ )
− (1 −
e−10λ )
ln 0.5
20
≈ 0.207.
≈ 0.0347.
Normal distributions
1. (a) Assume a > 0. Then
P (Y ≤ t) = P (aX + b ≤ t) = P (X ≤ (t − b)/a) =
Z t
1
(y − aµ − b)2
√ exp −
=
dy
2(aσ)2
−∞ aσ 2π
Z
(t−b)/a
−∞
(x − µ)2
1
√ exp −
dx
2σ 2
σ 2π
by substituting x = (y − b)/a. When a < 0, we can similarly obtain
Z t
1
(y − aµ − b)2
√ exp −
P (Y ≤ t) =
dy
2(aσ)2
−∞ |a|σ 2π
(b) The integrand f (y) is a normal density with (aµ + b, (aσ)2 ).
Page 34
MATH 4470-5470/November 3, 2010
Continuous Distributions
2. For x > 0 we obtain
Z ln t
1
(x − µ)2
√ exp −
dx
P (Y ≤ t) = P (eX ≤ t) = P (X ≤ ln t) =
2σ 2
−∞ σ 2π
Z t
1
(ln y − µ)2
√ exp −
=
dy
2σ 2
−∞ yσ 2π
h
i
(ln y−µ)2
1
by substituting x = ln y. Thus, the density is given by f (y) = yσ√
exp
−
.
2σ2
2π
3. (a)
Z
0
∞ Z 2π
2
− r2
e
r dθ dr = 2π
0
(b) Substitute x = σy + µ.
Z
∞
2
− r2
re
0
dr = 2π
Z
∞
e−u du = 2π.
0
4. Let X be the height in inches normally distributed with µ = 70in and σ = 3in.
(a) P (X ≥ (6)(12)) ≈ 1 − Φ(0.67) ≈ 0.25.
(b) (2.54)X is the height in centimeters, and normally distributed with µ = (2.54)(70) = 177.8cm
and σ = (2.54)(3) = 7.62cm.
5. (a) P (X > 10) = 1 − Φ(0.5) ≈ 0.3085. (b) P (−20 < X < 15) = Φ(1) − Φ(−2.5) ≈ 0.8351.
(c) Observe that P (X > x) = 1 − Φ((x − 5)/10) = 0.05. Since Φ(1.64) ≈ 0.95, we can find
(x − 5)/10 ≈ 1.64, and therefore, x ≈ 21.4.
6. P (|X − µ| ≤ 0.675σ) = P (−0.675σ ≤ X − µ ≤ 0.675σ) = P (−0.675 ≤ (X − µ)/σ ≤ 0.675) =
Φ(0.675) − Φ(−0.675) ≈ 0.5.
7. Observing that P (µ − c ≤ X ≤ µ + c) = P (|(X − µ)/σ| ≤ c/σ) = 0.95, we obtain P ((X − µ)/σ ≤
c/σ) = Φ(c/σ) = 0.975. Since Φ(1.96) ≈ 0.975, we can find c/σ ≈ 1.96, and therefore, c = 1.96σ.
Page 35
MATH 4470-5470/November 3, 2010
Related documents