Download Chapter 4: CONTINUOUS RANDOM VARIABLES AND

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the work of artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
Chapter 4: CONTINUOUS RANDOM
VARIABLES AND
PROBABILITY DISTRIBUTIONS
Part 4:
Gamma Distribution
Weibull Distribution
Lognormal Distribution
Sections 4-9 through 4-11
Another exponential distribution example first...
1
• Example: Magnitude of Earthquakes
The magnitude of earthquakes in a region
can be modeled as having an exponential distribution where the mean of the distribution
is 2.4, as measured on the Richter scale.
Let X represent the magnitude of an earthquake. Then, X ∼ exponential(λ).
First, determine λ:
1
E(X) = = 2.4 ⇒
λ
1
λ=
≈ 0.4167
2.4
In this problem, X is not a ‘wait time’ and is
not related to the Poisson process. Here, λ is
not a rate parameter, but is simply a parameter that tells you the shape of the distribution
of earthquake magnitudes.
2
f(x)
0
2
4
6
8
10
x
Find the probability that an earthquake striking this region will...
a) exceed 3.0 on the Richter scale.
b) fall between 2.0 and 3.0 on the Richter
scale.
3
a) exceed 3.0 on the Richter scale.
b) fall between 2.0 and 3.0 on the Richter
scale.
4
Gamma Distribution
Section 4-9
Another continuous distribution on x > 0 is the
gamma distribution.
• Gamma Distribution
The random variable X with probability density function
λr xr−1e−λx
f (x) =
for x > 0
Γ (r)
is a gamma random variable with parameters λ > 0 and r > 0.
• Mean and Variance
For a gamma random variable with parameters λ and r,
r
µ = E(X) =
λ
5
and
r
2
σ = V (X) = 2
λ
• The Gamma function: Γ (r)
The value in the denominator of f (x) is a
constant dependent on r. This value is:
Z ∞
Γ (r) =
xr−1e−xdx for r > 0
0
and this is a finite integral.
You can think of Γ (r) as a necessary constant in f (x) to make sure the area under
f (x) is 1.0
6
The gamma family (expressed as choices of λ
and r) is very flexible:
0.6
0.8
Gamma distributions with fixed scale parameter (lambda=1)
0.4
0.0
0.2
f(x)
r=0.2
r=1
r=5
0
5
10
15
X
λ is called the scale parameter as it most influences the spread.
r is called the shape parameter as it most influences the peaked-ness of the distribution.
7
• Example: Gene expression data
As technology progresses, so does the kind of
data we can collect. We can now gather information on the amount of protein (or mRNA)
outputted by a certain gene in an organism.
Below is a cDNA microarray slide showing
the amount of mRNA (as intensity of fluorescence) for each of thousands of genes for
a single organism.
8
If we consider the expression values from many
individuals for a single gene:
0.3
0.0
0.1
0.2
Density
0.4
0.5
Gene 4 expression values
0
1
2
3
expression
We can model these expression values all coming from a specific gene with a gamma distribution...
9
4
0.3
0.0
0.1
0.2
Density
0.4
0.5
Gene 4 expression values
0
1
2
3
expression
The solid curve is the ‘best fitting’ gamma
distribution to the observed data.
10
4
Modeling the observed data with a common
distribution allows us to compute theoretical
probabilities, and compare different groups
(such as healthy patients vs. cancer patients).
——————————————————–
• Gamma distribution modeling
examples:
– Gene expression data
– Climatology models for monthly
precipitation
– The sum of k independent exponential random variables
The integration for gamma probabilities would
come from tables (like we saw for the normal distribution)... your book does not include these.
11
Instead...
book homework problems are about recognizing
the gamma probability density function, setting
up f (x), and recognizing the mean µ and variance σ 2 (which can be computed from λ and r),
and seeing the connection of the gamma to the
exponential and the Poisson process.
• Example: The time between failures of a
laser machine is exponentially distributed with
a mean of 25,000 hours.
a) What is the expected time until the second
failure?
12
13
b) What is the probability that the time until the third failure exceeds 50,000 hours?
14
15
• Thus, we have another gamma
distribution modeling example:
– Time until rth failure in a Poisson Process with rate parameter λ is distributed
gamma(r, λ).
• Some comments on the gamma(r, λ)
distribution:
– When r = 1, f (x) is an exponential distribution with parameter λ. The exponential
distribution is a special case of the gamma
distribution.
– If r is a positive integer, the distribution
is called an Erlang distribution.
16
– Some relationships:
Γ (1) = 0! = 1
Γ (r + 1) = rΓ (r)
{Γ (4) = 3·2·1 = 6}
Γ (r+1) = r!
√
1
Γ (2) = π
– For the gamma(r, λ) distribution, when
λ = 1/2 and r = p/2 where p is a positive integer, then we have a chi-squared
distribution with parameter p, another special case of the gamma.
17
Weibull Distribution
Section 4-10
Another continuous distribution for x > 0.
It can be used to model a situation where the
number of failures increases with time, decreases
with time, or remains constant with time. (So,
it’s used for more complicated situations than a
Poisson process).
• Weibull Distribution
The random variable X with probability density function
β−1
β x
x β
f (x) =
exp −
for x > 0
δ δ
δ
is a Weibull random variable with scale parameter δ > 0 and shape parameter β > 0.
18
• Mean and Variance
For a Weibull random variable with parameters β and δ,
1
µ = E(X) = δΓ 1 +
β
and
2
2
1
2
2
2
σ = V (X) = δ Γ 1 +
−δ Γ 1 +
β
β
19
Also a flexible family:
20
• Cumulative Distribution Function
If X has a Weibull distribution with parameters δ and β, then the cumulative distribution function of X is
β
−( xδ )
F (x) = 1 − e
• A comment on the W eibull(δ, β) distribution:
– When β = 1, f (x) is an exponential distribution with parameter 1/δ. The exponential is a special case of the Weibull.
21
• Example: Manufacture of semiconductor
In an industrial engineering article, the authors suggest using a Weibull distribution to
model the duration of a bake step in the manufacture of a semiconductor.
Let T represent the duration in hours of the
bake step for a randomly chosen lot.
Suppose T ∼ W eibull(δ = 10, β = 0.3).
a) What is the probability that the bake step
takes longer than four hours?
22
b) What is the probability that the bake step
takes between two and seven hours?
23
Lognormal Distribution
Section 4-11
The last continuous distribution we will consider
is also for x > 0.
Let W be a normally distributed random variable. Suppose we create a new random variable X with the transformation X = exp(W ).
Then, X is a lognormal random variable. The
name follows from the fact that
ln(X) = W
so we have ln(X) being normally distributed.
W can take on values from −∞ to ∞.
But the domain (or range) of X is the positive
real numbers.
24
• Lognormal Distribution
Let W have a normal distribution with mean
θ and variance ω 2, then X = exp(W ) is a
lognormal random variable with probability
density function
"
#
1
(ln x − θ)2
√
f (x) =
exp −
for x > 0.
2
2ω
xω 2π
25
• Mean and Variance
For a lognormal distribution with parameters
θ and ω 2,
2/2
θ+ω
µ = E(X) = e
and
σ 2 = V (X) = e
2θ+ω 2
e
ω2
−1
Note that for the lognormal r.v. X, the mean
and variance are µ and σ 2 and these are
functions of θ and ω 2, which are the mean
and variance of W , a normal random variable such that X = exp(W ).
So, θ and ω 2 show up in W ∼ N (θ, ω 2).
26
• Example: Component lifetimes
Lifetimes of a randomly chosen component
are lognormally distributed with parameters
θ = 1 and ω = 0.5 days.
a) Find the mean lifetime of these components.
b) Find the standard deviation of the lifetimes.
27
c) Find the probability that a component
lasts longer than 4 days.
28
Related documents