Chapter 3.3 – Continuous distributions
• In this section we study several continuous distributions and their properties.
• Here are a few, classified by their support $S_X$. There are of course many, many more.
• For each of these, we will discuss the PDF/CDF, moments, parameters, relationship with
other distributions, and potential applications.
ST521
Support $S_X$    Families
(a, b)           Uniform
(0, 1)           Beta
(0, ∞)           Exponential; Gamma; Log normal; Chi-squared
(−∞, ∞)          Normal; Double exponential; T; Cauchy
Uniform distribution
• The Uniform(a, b) distribution has support $S_X = (a, b)$.
• The family has two parameters a and b with a < b.
• The PDF is $f_X(x) = \frac{1}{b-a}\, I(a < x < b)$.
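• The CDF is $F_X(x) = 0$ for $x \le a$, $F_X(x) = \frac{x-a}{b-a}$ for $a < x < b$, and $F_X(x) = 1$ for $x \ge b$.
• The mean is $E(X) = \frac{a+b}{2}$.
• The variance is $V(X) = \frac{(b-a)^2}{12}$.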
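As a rough check on the uniform family, here is a minimal Python sketch of the Uniform(a, b) PDF and CDF, together with a Monte Carlo comparison against the standard closed forms E(X) = (a + b)/2 and V(X) = (b − a)²/12. The function names, the endpoints 2 and 5, the sample size, and the seed are illustrative assumptions, not part of the notes.

```python
import random

def uniform_pdf(x, a, b):
    """PDF of Uniform(a, b): constant 1/(b - a) on (a, b), zero elsewhere."""
    return 1.0 / (b - a) if a < x < b else 0.0

def uniform_cdf(x, a, b):
    """CDF of Uniform(a, b): rises linearly from 0 at a to 1 at b."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)

def simulate_moments(a, b, n=200_000, seed=0):
    """Monte Carlo estimates of the mean and variance of Uniform(a, b)."""
    rng = random.Random(seed)
    draws = [rng.uniform(a, b) for _ in range(n)]
    mean = sum(draws) / n
    var = sum((d - mean) ** 2 for d in draws) / (n - 1)
    return mean, var

a, b = 2.0, 5.0  # arbitrary illustrative endpoints (an assumption)
mean_hat, var_hat = simulate_moments(a, b)
print("simulated mean:", mean_hat, "closed form:", (a + b) / 2)
print("simulated variance:", var_hat, "closed form:", (b - a) ** 2 / 12)
```

The simulated values should land close to the closed forms, with the gap shrinking as the number of draws grows.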
Beta distribution
• The Beta(a, b) distribution has support $S_X = (0, 1)$, and a more flexible shape than a uniform.
• The PDF is $f_X(x) = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)}\, x^{a-1} (1-x)^{b-1}$.
• The gamma function for a > 0 is $\Gamma(a) = \int_0^\infty t^{a-1} \exp(-t)\, dt$, so that (integration by parts) $\Gamma(a+1) = a\Gamma(a)$ for any a > 0, and thus $\Gamma(a) = (a-1)!$ if a is a positive integer.
• The two parameters a > 0 and b > 0 control the shape of the PDF.
• The Uniform(0, 1) is the special case a = b = 1.
• What type of data might be modeled with a beta?
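The gamma-function identities and the Beta PDF can be checked numerically. The sketch below uses Python's built-in `math.gamma`; the helper names `beta_pdf` and `integrate`, and the shape values 2 and 5, are assumptions chosen only for illustration.

```python
import math

def beta_pdf(x, a, b):
    """PDF of Beta(a, b) on (0, 1), built from the gamma function."""
    if not 0.0 < x < 1.0:
        return 0.0
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return const * x ** (a - 1) * (1 - x) ** (b - 1)

def integrate(f, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

# Recursion Gamma(a + 1) = a * Gamma(a), checked at a non-integer value.
print(math.gamma(3.5), 2.5 * math.gamma(2.5))
# Gamma(a) = (a - 1)! for a positive integer a.
print(math.gamma(5), math.factorial(4))
# Beta(1, 1) has constant density on (0, 1), i.e. it is Uniform(0, 1).
flat_density = beta_pdf(0.3, 1, 1)
print(flat_density)
# The density should integrate to about 1 for other shapes too.
total = integrate(lambda x: beta_pdf(x, 2.0, 5.0), 0.0, 1.0)
print(total)
```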
• The mean and variance of the beta are: $E(X) = \frac{a}{a+b}$ and $V(X) = \frac{ab}{(a+b)^2 (a+b+1)}$.
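One way to obtain the beta moments is to use the fact that any Beta(c, d) density integrates to one, so $\int_0^1 x^{c-1}(1-x)^{d-1}\,dx = \Gamma(c)\Gamma(d)/\Gamma(c+d)$, and then simplify with $\Gamma(a+1) = a\Gamma(a)$. A sketch of that calculation:

```latex
\begin{align*}
E(X) &= \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \int_0^1 x^{a} (1-x)^{b-1}\, dx
      = \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \cdot \frac{\Gamma(a+1)\Gamma(b)}{\Gamma(a+b+1)}
      = \frac{a}{a+b}, \\
E(X^2) &= \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \cdot \frac{\Gamma(a+2)\Gamma(b)}{\Gamma(a+b+2)}
      = \frac{a(a+1)}{(a+b)(a+b+1)}, \\
V(X) &= E(X^2) - [E(X)]^2
      = \frac{a(a+1)}{(a+b)(a+b+1)} - \frac{a^2}{(a+b)^2}
      = \frac{ab}{(a+b)^2 (a+b+1)}.
\end{align*}
```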
• What if Y is a test score between 0 and 30? Can we model it with a beta?
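One possible approach, offered here as an assumption rather than as the notes' intended answer, is to rescale: model X = Y/30 as Beta(a, b), so that Y = 30X has support (0, 30) and E(Y) = 30a/(a + b). The sketch below simulates such scores with `random.betavariate`; the shape values 6 and 2 are hypothetical.

```python
import random

def simulate_scores(a, b, max_score=30.0, n=100_000, seed=1):
    """Draw scores Y = max_score * X with X ~ Beta(a, b)."""
    rng = random.Random(seed)
    return [max_score * rng.betavariate(a, b) for _ in range(n)]

a, b = 6.0, 2.0  # hypothetical shape values, chosen only for illustration
scores = simulate_scores(a, b)
mean_hat = sum(scores) / len(scores)
mean_theory = 30.0 * a / (a + b)  # E(Y) = 30 * E(X) = 30 * a / (a + b)
print("simulated mean score:", mean_hat, "theory:", mean_theory)
```

Note that a beta puts zero probability on the endpoints, so scores of exactly 0 or 30 would need separate handling under this sketch.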
Gamma distribution
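• The Gamma(a, b) distribution has support $S_X = \mathbb{R}^+$.
• The PDF is $f_X(x) = \frac{x^{a-1} \exp(-x/b)}{\Gamma(a)\, b^a}$.
• Sometimes the PDF is written $f_X(x) = \frac{b^a x^{a-1} \exp(-xb)}{\Gamma(a)}$, where b acts as a rate (inverse scale) rather than a scale, so be careful!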
• The two parameters a > 0 and b > 0 control the PDF: a is the shape parameter and b is the scale.
• The shape of the PDF changes from very skewed for small a to symmetric for large a.
• To see that b sets the scale, note that if c > 0 and X ∼ Gamma(a, b), then Y = cX ∼
Gamma(a, cb).
• What types of data might be modeled with a Gamma?
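The two parameterizations and the scaling property can be compared directly. The following sketch, with hypothetical function names and arbitrary parameter values, checks that the scale and rate forms agree when rate = 1/scale, and that the density of Y = cX obtained by a change of variables matches the Gamma(a, cb) density.

```python
import math

def gamma_pdf_scale(x, a, b):
    """Gamma PDF with shape a and scale b: x^(a-1) exp(-x/b) / (Gamma(a) b^a)."""
    if x <= 0:
        return 0.0
    return x ** (a - 1) * math.exp(-x / b) / (math.gamma(a) * b ** a)

def gamma_pdf_rate(x, a, r):
    """Gamma PDF with shape a and rate r: r^a x^(a-1) exp(-r x) / Gamma(a)."""
    if x <= 0:
        return 0.0
    return r ** a * x ** (a - 1) * math.exp(-r * x) / math.gamma(a)

a, b, c = 2.5, 1.5, 4.0  # arbitrary illustrative values (assumptions)
y = 7.0

# The two forms agree when rate = 1 / scale.
scale_form = gamma_pdf_scale(y, a, b)
rate_form = gamma_pdf_rate(y, a, 1.0 / b)
print(scale_form, rate_form)

# Change of variables: if Y = cX then f_Y(y) = f_X(y / c) / c,
# which should equal the Gamma(a, c * b) density at y.
density_of_scaled = gamma_pdf_scale(y / c, a, b) / c
density_direct = gamma_pdf_scale(y, a, c * b)
print(density_of_scaled, density_direct)
```

When using a library, it is worth checking which convention it follows; for example, Python's `random.gammavariate(alpha, beta)` documents `beta` as a scale.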
• The mean and variance of the gamma are: $E(X) = ab$ and $V(X) = ab^2$ (with b the scale).
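A sketch of the moment calculation under the scale parameterization: the integrand below is an unnormalized Gamma(a + k, b) density, so the integral equals $\Gamma(a+k)\, b^{a+k}$.

```latex
\begin{align*}
E(X^k) &= \int_0^\infty x^k \, \frac{x^{a-1} \exp(-x/b)}{\Gamma(a)\, b^a}\, dx
        = \frac{\Gamma(a+k)\, b^{a+k}}{\Gamma(a)\, b^a}
        = b^k \, \frac{\Gamma(a+k)}{\Gamma(a)}, \\
E(X) &= ab, \qquad E(X^2) = a(a+1)\, b^2, \\
V(X) &= E(X^2) - [E(X)]^2 = a(a+1)\, b^2 - a^2 b^2 = ab^2.
\end{align*}
```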
• The gamma has two important special cases, the exponential and the chi-square.
• If X ∼ Gamma(a, b) and a = p/2 and b = 2, then X ∼ Chi-squared(p).
• If the data are normal, the (suitably scaled) sample variance follows a chi-square distribution (Chapter 5).
• If a = 1, then X ∼ Exponential(b) with PDF $f_X(x) = \frac{1}{b} \exp(-x/b)$.
• The exponential has the memoryless property, sometimes used in reliability analysis:
$P(X > t + c \mid X > c) = P(X > t)$.
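The memoryless property follows from the survival function P(X > x) = exp(−x/b), since exp(−(t + c)/b) / exp(−c/b) = exp(−t/b). A small Python sketch of both the exact identity and a simulated version is given below; the values of b, t, c, the sample size, and the seed are arbitrary assumptions.

```python
import math
import random

def exp_survival(x, b):
    """P(X > x) for X ~ Exponential(b) with scale b."""
    return math.exp(-x / b) if x > 0 else 1.0

b, t, c = 2.0, 1.5, 3.0  # arbitrary illustrative values (assumptions)

# Conditional probability P(X > t + c | X > c) = P(X > t + c) / P(X > c).
conditional = exp_survival(t + c, b) / exp_survival(c, b)
unconditional = exp_survival(t, b)
print(conditional, unconditional)

# Monte Carlo version of the same comparison.
rng = random.Random(2)
draws = [rng.expovariate(1.0 / b) for _ in range(200_000)]  # expovariate takes a rate
survivors = [x for x in draws if x > c]
conditional_hat = sum(x > t + c for x in survivors) / len(survivors)
print(conditional_hat)
```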
Double exponential distribution
• The double exponential has support on all real numbers.
• If X ∼ DE(µ, σ), then
$f_X(x) = \frac{1}{2\sigma} \exp\left( -\frac{|x-\mu|}{\sigma} \right)$.
• The mean is E(X) = µ and the variance is $V(X) = 2\sigma^2$.
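The stated mean and variance can be checked by numerical integration. This sketch uses a simple midpoint rule over a wide window around µ; the helper names, the values µ = 1 and σ = 2, and the truncation at 40σ are assumptions made for illustration.

```python
import math

def de_pdf(x, mu, sigma):
    """Double exponential PDF: exp(-|x - mu| / sigma) / (2 sigma)."""
    return math.exp(-abs(x - mu) / sigma) / (2.0 * sigma)

def integrate(f, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of f over (lo, hi)."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

mu, sigma = 1.0, 2.0  # arbitrary illustrative values (assumptions)
lo, hi = mu - 40 * sigma, mu + 40 * sigma  # tails beyond this are negligible

total = integrate(lambda x: de_pdf(x, mu, sigma), lo, hi)
mean = integrate(lambda x: x * de_pdf(x, mu, sigma), lo, hi)
var = integrate(lambda x: (x - mu) ** 2 * de_pdf(x, mu, sigma), lo, hi)
print("total probability:", total)
print("mean:", mean, "expected:", mu)
print("variance:", var, "expected:", 2 * sigma ** 2)
```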
Normal distribution
• By far, the most common continuous distribution used in statistics is the normal (also called
Gaussian) distribution.
• It is extremely useful because of:
1. The central limit theorem
2. Mathematical tractability
• If $X \sim N(\mu, \sigma^2)$, then
$f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{(x-\mu)^2}{2\sigma^2} \right]$.
• The two parameters are the mean E(X) = µ and the variance $V(X) = \sigma^2$.
• Since the log PDF is quadratic in the error x − µ, it turns out there is a connection between
the normal distribution and sum of squared errors in a least squares analysis (Chapter 11).
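To illustrate the link between the quadratic log PDF and least squares, the sketch below codes the normal PDF and log PDF, compares the PDF with Python's `statistics.NormalDist`, and checks on a few made-up observations that the candidate mean maximizing the summed log PDF is the same one minimizing the sum of squared errors. The data values, the grid of candidates, and the function names are all assumptions.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """PDF of N(mu, sigma^2); note that sigma is the standard deviation."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def normal_log_pdf(x, mu, sigma):
    """Log PDF: a constant minus the squared error divided by 2 sigma^2."""
    const = -math.log(math.sqrt(2.0 * math.pi) * sigma)
    return const - (x - mu) ** 2 / (2.0 * sigma ** 2)

mu, sigma = 1.0, 2.0  # arbitrary illustrative values (assumptions)
ours = normal_pdf(2.5, mu, sigma)
reference = NormalDist(mu, sigma).pdf(2.5)
print(ours, reference)

data = [0.2, 1.1, 1.9, 2.4]  # made-up observations with sample mean 1.4

def log_lik(m):
    """Sum of log PDFs of the data for a candidate mean m."""
    return sum(normal_log_pdf(x, m, sigma) for x in data)

def sse(m):
    """Sum of squared errors of the data around a candidate mean m."""
    return sum((x - m) ** 2 for x in data)

candidates = [i / 100 for i in range(0, 301)]  # grid 0.00, 0.01, ..., 3.00
best_by_loglik = max(candidates, key=log_lik)
best_by_sse = min(candidates, key=sse)
print(best_by_loglik, best_by_sse)
```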
• The moment generating function is: $M_X(t) = E[\exp(tX)] = \exp(\mu t + \sigma^2 t^2/2)$.
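A sketch of the usual derivation of the normal MGF: complete the square in the exponent, then recognize the remaining integrand as the $N(\mu + \sigma^2 t, \sigma^2)$ density, which integrates to one.

```latex
\begin{align*}
M_X(t) &= \int_{-\infty}^{\infty} \exp(tx)\,
          \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[ -\frac{(x-\mu)^2}{2\sigma^2} \right] dx, \\
tx - \frac{(x-\mu)^2}{2\sigma^2}
       &= -\frac{(x - \mu - \sigma^2 t)^2}{2\sigma^2} + \mu t + \frac{\sigma^2 t^2}{2}, \\
M_X(t) &= \exp\left( \mu t + \frac{\sigma^2 t^2}{2} \right)
          \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}\,\sigma}
          \exp\left[ -\frac{(x - \mu - \sigma^2 t)^2}{2\sigma^2} \right] dx
        = \exp\left( \mu t + \frac{\sigma^2 t^2}{2} \right).
\end{align*}
```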
• Therefore the mean and variance are: $E(X) = M_X'(0) = \mu$ and $V(X) = M_X''(0) - [M_X'(0)]^2 = \sigma^2$.
• Setting µ = 0 and $\sigma^2 = 1$ gives the standard normal distribution, Z ∼ N(0, 1).
• Empirical rule:
– P(−1 < Z < 1) ≈ 0.68
– P(−2 < Z < 2) ≈ 0.95
– P(−3 < Z < 3) ≈ 0.997
• If Y = µ + σZ, then
– E(Y) = µ + σE(Z) = µ
– $V(Y) = \sigma^2 V(Z) = \sigma^2$
• In fact, a linear combination of normals is normal (Chapter 5), so $Y \sim N(\mu, \sigma^2)$. That is, the normal distribution is a location-scale distribution (Chapter 3.5).
• This works the other way too: If $Y \sim N(\mu, \sigma^2)$ and Z = (Y − µ)/σ, then Z ∼ N(0, 1).
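The location-scale relationship can be illustrated by simulation: draw standard normals, transform with Y = µ + σZ, and compare the sample mean and variance of Y with µ and σ². Standardizing Y recovers the original Z. The values µ = 10, σ = 3, the sample size, and the seed are arbitrary assumptions.

```python
import random

mu, sigma = 10.0, 3.0  # arbitrary illustrative values (assumptions)
rng = random.Random(3)

# Draw Z ~ N(0, 1) and apply the location-scale transform Y = mu + sigma * Z.
z_draws = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
y_draws = [mu + sigma * z for z in z_draws]

n = len(y_draws)
y_mean = sum(y_draws) / n
y_var = sum((y - y_mean) ** 2 for y in y_draws) / (n - 1)
print("sample mean of Y:", y_mean, "target:", mu)
print("sample variance of Y:", y_var, "target:", sigma ** 2)

# Standardizing Y undoes the transform, up to floating-point rounding.
z_back = [(y - mu) / sigma for y in y_draws]
max_gap = max(abs(zb - z) for zb, z in zip(z_back, z_draws))
print("largest gap after standardizing:", max_gap)
```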
• Another version of the empirical rule: If Y is normal, then
– Y is within one standard deviation of the mean with probability approximately 0.68.
– Y is within two standard deviations of the mean with probability approximately 0.95.
– Y is within three standard deviations of the mean with probability approximately 0.997.
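The empirical-rule probabilities can be computed from the standard normal CDF. Because (Y − µ)/σ is standard normal, the same values apply to any normal Y. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name and the example values µ = 50, σ = 8 are assumptions.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and standard deviation 1

def within_k_sd(k):
    """P(-k < Z < k) for a standard normal Z."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(within_k_sd(k), 4))

# The same probability for a non-standard normal, here with hypothetical
# mean 50 and standard deviation 8.
Y = NormalDist(50.0, 8.0)
p_two_sd = Y.cdf(50.0 + 2 * 8.0) - Y.cdf(50.0 - 2 * 8.0)
print(round(p_two_sd, 4))
```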
• The normal distribution can be used to approximate many other distributions.
• For example, if Y ∼ Binomial(n, p), then E(Y ) = np and V(Y ) = np(1 − p), and if n is
large,
Y ≈ N [np, np(1 − p)] .
• This can be used to avoid tedious look-ups in the binomial table.
• Example: If Y ∼ Binomial(1000, 0.1), what is P (Y < 90)? Use both the exact binomial
and approximate normal computation.
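A sketch of how the example could be computed without tables. The exact probability sums the binomial PMF over 0, ..., 89, since P(Y < 90) = P(Y ≤ 89) for an integer-valued Y. The normal approximation is shown both plainly and with a continuity correction; the correction is a common refinement added here as an assumption, not something stated in the notes. The PMF is evaluated on the log scale with `math.lgamma` to avoid overflow.

```python
import math
from statistics import NormalDist

def binom_pmf(k, n, p):
    """Binomial(n, p) PMF at k, computed on the log scale to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1 - p))
    return math.exp(log_pmf)

n, p = 1000, 0.1
mean, sd = n * p, math.sqrt(n * p * (1 - p))  # np = 100, np(1 - p) = 90

# Exact: P(Y < 90) = P(Y <= 89) because Y is integer valued.
exact = sum(binom_pmf(k, n, p) for k in range(90))

# Normal approximation Y ~ N[np, np(1 - p)], with and without a
# continuity correction (evaluating the CDF at 89.5 instead of 90).
approx = NormalDist(mean, sd)
approx_plain = approx.cdf(90)
approx_corrected = approx.cdf(89.5)

print("exact binomial:", exact)
print("normal, no correction:", approx_plain)
print("normal, continuity corrected:", approx_corrected)
```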
Log normal distribution
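• If $X \sim N(\mu, \sigma^2)$, then $Y = \exp(X) \sim \text{LogNormal}(\mu, \sigma^2)$.
• Y's domain is (0, ∞).
• $f_Y(y) = \frac{1}{\sqrt{2\pi}\,\sigma y} \exp\left[ -\frac{(\log y - \mu)^2}{2\sigma^2} \right]$ for y > 0.
• $E(Y) = \exp(\mu + \sigma^2/2)$.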
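The log normal can be explored by exponentiating normal draws. The sketch below writes the PDF via the change of variables f_Y(y) = f_X(log y)/y and compares a simulated mean with exp(µ + σ²/2), which is the normal MGF evaluated at t = 1. The values µ = 0.5 and σ = 0.4, the sample size, and the seed are arbitrary assumptions.

```python
import math
import random

def lognormal_pdf(y, mu, sigma):
    """PDF of Y = exp(X), X ~ N(mu, sigma^2), via f_Y(y) = f_X(log y) / y."""
    if y <= 0:
        return 0.0
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2.0 * math.pi))

mu, sigma = 0.5, 0.4  # arbitrary illustrative values (assumptions)
rng = random.Random(4)

# Simulate Y by exponentiating normal draws.
draws = [math.exp(rng.gauss(mu, sigma)) for _ in range(200_000)]
mean_hat = sum(draws) / len(draws)
mean_theory = math.exp(mu + sigma ** 2 / 2)
print("simulated mean:", mean_hat, "theory:", mean_theory)
print("density at y = 1:", lognormal_pdf(1.0, mu, sigma))
```

Because exp is convex, E(Y) is larger than exp(µ), the value obtained by exponentiating the mean of X.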