STAT 315: LECTURE 4
CHAPTER 4: CONTINUOUS RANDOM VARIABLES
TROY BUTLER
1. The basic concepts, definitions, notation, and some results
Some basic definitions, notation, and results from basic calculus:
• A continuous random variable X is a rv that can take on any value within an interval, or union
of disjoint intervals.
– For a continuous rv X, the probability of any particular real number is zero, i.e. if X is a
continuous rv, then P (X = x) = 0 for all x ∈ R.
– For a continuous rv X, it only makes sense to ask about probabilities of events that contain an
infinite number of outcomes (e.g. a small interval). In practice, we wish to use many i.i.d. samples
of X to approximate the probabilities of various events in S.
• Let X be a continuous rv. The probability density function (pdf) of X is a function f(x) such that for any two real numbers a and b with a ≤ b,
\[ P(a \le X \le b) = \int_a^b f(x)\,dx. \]
– f(x) ≥ 0 for all x (i.e., f must be nonnegative)
– $\int_{-\infty}^{\infty} f(x)\,dx = 1$ (i.e., f must be integrable, with total area 1)
• Using basic calculus we have the following:
– Consider a fixed value a in the range of a continuous rv X with pdf f(x) such that f(a) > 0. Then
\[ P(X = a) = \int_a^a f(x)\,dx = \lim_{\epsilon \downarrow 0} \int_{a-\epsilon}^{a+\epsilon} f(x)\,dx = 0. \]
– Moreover, P (a ≤ X ≤ b) = P (a < X ≤ b) = P (a ≤ X < b) = P (a < X < b).
• The cumulative distribution function (cdf) F(x) for a continuous rv X is defined for every number x by
\[ F(x) = P(X \le x) = \int_{-\infty}^{x} f(y)\,dy. \]
– F(x) ∈ [0, 1] for all x ∈ R, $\lim_{x \to -\infty} F(x) = 0$ and $\lim_{x \to \infty} F(x) = 1$.
– If X is a continuous rv with cdf F(x), then for any number a, P(X > a) = 1 − F(a).
– For any two numbers a and b with a < b, P (a ≤ X ≤ b) = F (b) − F (a).
– F (x) is simply a specific antiderivative of f (x).
• If X is a continuous rv with pdf f (x) and cdf F (x), then at every x for which the derivative F ′ (x)
exists, F ′ (x) = f (x).
• If X is a continuous rv with pdf f(x), then the expectation of X is
\[ E(X) = \mu_X = \int_{-\infty}^{\infty} x f(x)\,dx. \]
– If h(X) is any function of X, then
\[ E(h(X)) = \mu_{h(X)} = \int_{-\infty}^{\infty} h(x) f(x)\,dx. \]
– When there is no room for confusion, we often just use µ to indicate the expectation of the rv
under question.
• If X is a continuous rv with pdf f(x) and expectation µ, then the variance of X is
\[ V(X) = \sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx. \]
– Sometimes it is easier to compute $V(X) = E(X^2) - (E(X))^2 = E(X^2) - \mu^2$.
– The standard deviation of X is given by $SD(X) = \sigma = \sqrt{V(X)}$.
• Let p be a number between 0 and 1. The (100p)th percentile of the distribution of a continuous rv X, denoted by η(p), is defined by
\[ p = F(\eta(p)) = \int_{-\infty}^{\eta(p)} f(x)\,dx. \]
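As a rough numerical illustration of the definitions above, the following Python sketch approximates a probability, the mean, the variance, and a percentile for one pdf. The pdf f(x) = 2x on [0, 1], the helper names, and the midpoint-rule and bisection methods are all assumptions chosen for this example; none of them come from the lecture.

```python
def f(x):
    """Hypothetical pdf for illustration: f(x) = 2x on [0, 1], 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def integrate(g, lo, hi, n=2000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def cdf(x):
    """F(x) = P(X <= x); the support starts at 0, so integrate from there."""
    if x <= 0.0:
        return 0.0
    return integrate(f, 0.0, min(x, 1.0))


def percentile(p, lo=0.0, hi=1.0, tol=1e-8):
    """Solve F(eta) = p by bisection, using that F is nondecreasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


total = integrate(f, 0.0, 1.0)                          # total area, should be close to 1
prob = cdf(0.75) - cdf(0.25)                            # P(0.25 <= X <= 0.75) = F(b) - F(a)
mean = integrate(lambda x: x * f(x), 0.0, 1.0)          # E(X)
second = integrate(lambda x: x * x * f(x), 0.0, 1.0)    # E(X^2)
var = second - mean ** 2                                # V(X) = E(X^2) - mu^2
median = percentile(0.5)                                # eta(0.5), the 50th percentile

print(total, prob, mean, var, median)
```

For this particular pdf the exact values can be worked out by hand (F(x) = x² on [0, 1], E(X) = 2/3, V(X) = 1/18), which gives a way to check the numerical approximations.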
2. The Uniform Distribution
A uniform rv X is a continuous rv with a pdf that can be written as
\[ f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b, \\ 0, & \text{otherwise.} \end{cases} \]
We denote this as X ∼ U (a, b).
Theorem 1. If X ∼ U (a, b), then
\[ E(X) = \frac{a+b}{2}, \quad \text{and} \quad V(X) = \frac{(b-a)^2}{12}. \]
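The result for the mean and variance of U(a, b) follows from the definitions of expectation and variance in Section 1. A brief derivation, included here only as a sketch, is:

```latex
\begin{align*}
E(X)   &= \int_a^b \frac{x}{b-a}\,dx = \frac{b^2 - a^2}{2(b-a)} = \frac{a+b}{2}, \\
E(X^2) &= \int_a^b \frac{x^2}{b-a}\,dx = \frac{b^3 - a^3}{3(b-a)} = \frac{a^2 + ab + b^2}{3}, \\
V(X)   &= E(X^2) - (E(X))^2
        = \frac{a^2 + ab + b^2}{3} - \frac{(a+b)^2}{4}
        = \frac{(b-a)^2}{12}.
\end{align*}
```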
3. The Exponential Distribution
The exponential distribution is related to the Poisson distribution we studied in chapter 3. It is useful for modeling many engineering and physical phenomena, especially the times between successive events, and in certain instances it can be used to model component lifetime.
A continuous rv X is said to have an exponential distribution with parameter λ > 0 if its pdf is
\[ f(x) = \begin{cases} \lambda \exp(-\lambda x), & x \ge 0, \\ 0, & \text{otherwise.} \end{cases} \]
We often write X ∼ Exp(λ) to indicate that rv X has an exponential distribution with specified parameter
λ.
Theorem 2. If X ∼ Exp(λ), then its cdf is given by
\[ F(x) = \begin{cases} 1 - \exp(-\lambda x), & x \ge 0, \\ 0, & x < 0. \end{cases} \]
The expectation is $E(X) = 1/\lambda$, and the variance of X is $V(X) = 1/\lambda^2$.
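One informal way to check the exponential cdf, mean, and variance formulas is by simulation. The Python sketch below draws samples with the standard library's `random.expovariate` and compares sample statistics with 1/λ, 1/λ², and F(x). The value λ = 2, the seed, the sample size, and the evaluation point x0 are arbitrary choices made for this illustration.

```python
import math
import random
import statistics

lam = 2.0                   # rate parameter lambda (illustrative value)
rng = random.Random(315)    # fixed seed so the run is repeatable
samples = [rng.expovariate(lam) for _ in range(200_000)]


def exp_cdf(x, lam):
    """Cdf of Exp(lam): 1 - exp(-lam * x) for x >= 0, and 0 for x < 0."""
    return 1.0 - math.exp(-lam * x) if x >= 0 else 0.0


sample_mean = statistics.fmean(samples)        # compare with 1 / lambda
sample_var = statistics.pvariance(samples)     # compare with 1 / lambda^2
x0 = 0.5
empirical_cdf = sum(s <= x0 for s in samples) / len(samples)  # compare with F(x0)

print("mean:", sample_mean, "theory:", 1 / lam)
print("variance:", sample_var, "theory:", 1 / lam ** 2)
print("cdf at x0:", empirical_cdf, "theory:", exp_cdf(x0, lam))
```

The sample values will not match the theoretical ones exactly; they should only get closer as the number of samples grows.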
Suppose we model a component lifetime as X ∼ Exp(λ), and assume the component is still working after
some fixed t0 hours from when it was first put into use. We can show that P (X ≥ t+t0 | X ≥ t0 ) = P (X ≥ t).
This means that the exponential distribution has a memoryless property.
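The memoryless property can be verified directly from the cdf. A short derivation, sketched here for t ≥ 0 and t0 ≥ 0 and using P(X ≥ x) = 1 − F(x) = exp(−λx) for x ≥ 0, is:

```latex
\begin{align*}
P(X \ge t + t_0 \mid X \ge t_0)
  &= \frac{P(X \ge t + t_0 \text{ and } X \ge t_0)}{P(X \ge t_0)}
   = \frac{P(X \ge t + t_0)}{P(X \ge t_0)} \\
  &= \frac{\exp(-\lambda (t + t_0))}{\exp(-\lambda t_0)}
   = \exp(-\lambda t)
   = P(X \ge t).
\end{align*}
```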
4. The Normal Distribution
Meet your new best friend. The normal curve is a symmetric bell-shaped curve. A distribution represented by a normal curve is called a normal distribution. Many populations can be approximated by a normal distribution. Thanks to the Central Limit Theorem, point estimates of the mean of a distribution with finite variance, computed from sufficiently many i.i.d. samples, are approximately normally distributed. This makes the normal distribution central to the rest of the course, including hypothesis testing.
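A continuous rv X has a normal distribution with parameters µ and σ (or µ and σ²), where µ ∈ (−∞, ∞) and σ > 0, if its pdf is
\[ f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty. \]
The cdf of X is
\[ F(x) = P(X \le x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(t-\mu)^2}{2\sigma^2}}\,dt. \]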
The mean of X is E(X) = µ, and its variance is Var(X) = σ².
Figure 1. Left: pdfs of normal rv’s. Right: cdfs of normal rv’s.
The normal distribution with parameters µ = 0 and σ = 1 is called the standard normal distribution.
Its pdf is
\[ f(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}, \qquad -\infty < x < \infty. \]
We generally denote a standard normal rv by Z.
The cdf of Z is often denoted by Φ(z) and is defined by
\[ \Phi(z) := F(z) = P(Z \le z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}\,dx. \]
The erf function in Matlab is quite handy in computing probabilities of normally distributed rv’s.
The Table in Appendix A.3 gives Φ(z) = P (Z ≤ z).
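The connection between the error function and Φ is the identity Φ(z) = (1 + erf(z/√2))/2. The notes mention Matlab; as a sketch in Python instead (a choice made for this example), the standard library's `math.erf` can be used the same way, and `statistics.NormalDist` offers a cross-check. The function name `phi` is hypothetical.

```python
import math
from statistics import NormalDist


def phi(z):
    """Standard normal cdf via the error function: (1 + erf(z / sqrt(2))) / 2."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# P(-1.96 <= Z <= 1.96) computed as a difference of cdf values.
prob_central = phi(1.96) - phi(-1.96)

# Cross-check against the standard library's own normal cdf.
max_gap = max(abs(phi(z) - NormalDist().cdf(z)) for z in (-3.0, -1.0, 0.0, 0.5, 2.0))

print(prob_central, max_gap)
```

This plays the same role as looking up Φ(z) in a printed table, without the rounding to a fixed number of decimals.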
Remark 1. We define the notation $z_\alpha$ for α ∈ (0, 1) to represent the z value corresponding to $P(Z > z_\alpha) = \alpha$. Note this is equivalent to $P(Z \le z_\alpha) = 1 - \alpha$.
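Since P(Z ≤ z) = 1 − α at the critical value, that value is the inverse of Φ evaluated at 1 − α. A minimal Python sketch, assuming the standard library's `statistics.NormalDist.inv_cdf` as the inverse cdf and using the hypothetical helper name `z_alpha`:

```python
from statistics import NormalDist


def z_alpha(alpha):
    """Return the z value with P(Z > z) = alpha, i.e. the inverse cdf at 1 - alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - alpha)


for alpha in (0.10, 0.05, 0.025):
    print(alpha, z_alpha(alpha))
```

Commonly tabulated cases such as α = 0.05 and α = 0.025 correspond to the familiar values of about 1.645 and 1.96.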
Theorem 3. If $X \sim N(\mu_X, \sigma_X^2)$, then for any two constants a and b with a ≠ 0, Y = aX + b follows a normal distribution with mean $\mu_Y = a\mu_X + b$ and variance $\sigma_Y^2 = a^2 \sigma_X^2$.
Remark 2. The theorem above means that we can transform any normal rv into a standard normal rv. We will do this often, and it is relatively simple.
If X has a normal distribution with mean µ and standard deviation σ, then the standardized variable
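\[ Z = \frac{X - \mu}{\sigma} \]
has a standard normal distribution. Thus,
\[ P(a \le X \le b) = P\left( \frac{a-\mu}{\sigma} \le Z \le \frac{b-\mu}{\sigma} \right), \]
\[ P(X \le a) = P\left( Z \le \frac{a-\mu}{\sigma} \right), \qquad P(X \ge b) = 1 - P\left( Z \le \frac{b-\mu}{\sigma} \right). \]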
We can always go back from the standardized variable to the non-standardized variable using X = µ+Zσ.
Thus, to find the (100p)th percentile of a normal distribution with mean µ and standard deviation σ, follow
the two steps below:
(1) Find the (100p)th percentile of the standard normal distribution.
(2) The (100p)th percentile for a normal distribution with mean µ and standard deviation σ is µ + [(100p)th percentile for Z] × σ.
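The standardization formulas and the two-step percentile recipe above can be sketched in Python as follows. The function names and the values µ = 100 and σ = 15 are hypothetical, chosen only for illustration; `statistics.NormalDist` supplies Φ and its inverse.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def normal_prob_between(a, b, mu, sigma):
    """P(a <= X <= b) for normal X, computed by standardizing both endpoints."""
    za = (a - mu) / sigma
    zb = (b - mu) / sigma
    return Z.cdf(zb) - Z.cdf(za)


def normal_percentile(p, mu, sigma):
    """(100p)th percentile of a normal distribution via the two-step recipe."""
    z_p = Z.inv_cdf(p)         # step (1): percentile of the standard normal
    return mu + z_p * sigma    # step (2): map back with X = mu + Z * sigma


mu, sigma = 100.0, 15.0        # hypothetical mean and standard deviation
prob = normal_prob_between(85.0, 115.0, mu, sigma)   # within 1 SD of the mean
p90 = normal_percentile(0.90, mu, sigma)             # 90th percentile

print(prob, p90)
```

Working through Z keeps all table lookups (or library calls) on a single distribution, which is the practical point of standardizing.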
5. Determining when to use normal distributions
There are several ways we may verify that data is modeled reasonably well by a normal distribution. A
population distribution is approximately normal if the empirical rule holds. This rule states that data that
fits the normal distribution will have approximately:
• 68% of the observations within 1 SD of the mean.
• 95% of the observations within 2 SD of the mean.
• 99.7% of the observations within 3 SD of the mean.
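The percentages in the empirical rule come from P(|Z| ≤ k) = Φ(k) − Φ(−k) = erf(k/√2) for k = 1, 2, 3. The Python sketch below computes these theoretical proportions and compares them with the observed proportions in a data set. The data here are simulated normal values with a made-up mean and standard deviation; in practice `data` would be the sample being checked.

```python
import math
import random
import statistics


def within_k_sd_normal(k):
    """Theoretical P(|Z| <= k) for a standard normal rv: erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2.0))


def within_k_sd_data(data, k):
    """Fraction of observations within k sample SDs of the sample mean."""
    m = statistics.fmean(data)
    s = statistics.stdev(data)
    return sum(abs(x - m) <= k * s for x in data) / len(data)


rng = random.Random(4)                                  # fixed seed for repeatability
data = [rng.gauss(10.0, 2.0) for _ in range(50_000)]    # hypothetical sample

for k in (1, 2, 3):
    print(k, within_k_sd_normal(k), within_k_sd_data(data, k))
```

Observed proportions far from the theoretical ones suggest the data are not well modeled by a normal distribution; close agreement is consistent with normality but does not prove it.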
We can always “eyeball” it by plotting a normal curve over a density histogram.
Probably the most systematic eyeball method is to look at a normal probability plot (a.k.a. QQ plot).
This is actually a decent method for determining which named distribution (normal, uniform, exponential,
or other) should be parametrically fit (by using data to estimate the parameters defining the distribution)
to data before we even go through the trouble of fitting the parameters (e.g. why estimate the mean and
standard deviation if the underlying distribution looks to be uniform instead of normal?). A probability plot
or QQ plot graphs the sample percentiles against the theoretical percentiles of a particular distribution. The
essence of such a plot is that if the distribution on which the plot is based is a good fit, then the points in
the plot will fall close to a straight line. If the distribution is a poor fit, then the points will depart from a
linear pattern.
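The points of a normal QQ plot can be computed without any plotting library. In the Python sketch below, the plotting positions (i − 0.5)/n are one common convention (an assumption here, since the notes do not specify one), and the correlation of the points is used as a rough numerical stand-in for "close to a straight line". The two simulated data sets are hypothetical.

```python
import math
import random
from statistics import NormalDist


def normal_qq_points(data):
    """Pair theoretical standard normal percentiles with the sorted sample values."""
    n = len(data)
    ordered = sorted(data)
    # Plotting positions (i - 0.5) / n: one common convention, assumed here.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def correlation(points):
    """Pearson correlation of (x, y) pairs; values near 1 indicate a nearly linear plot."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(315)
normal_data = [rng.gauss(50.0, 5.0) for _ in range(500)]    # should look linear
expo_data = [rng.expovariate(1.0) for _ in range(500)]      # should bend away from a line

r_normal = correlation(normal_qq_points(normal_data))
r_expo = correlation(normal_qq_points(expo_data))
print(r_normal, r_expo)
```

To test a different named distribution, the theoretical percentiles would come from that distribution's inverse cdf instead of the normal one.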
Sometimes our data appears non-normal, but a transformation of the data gives us a symmetric bell-shaped curve.
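As one illustration of such a transformation, the Python sketch below applies a logarithm to right-skewed data. The choice of the log transform, the simulated lognormal data, and the use of sample skewness as a symmetry check are all assumptions made for this example; the notes do not name a specific transformation.

```python
import math
import random
import statistics


def skewness(data):
    """Sample skewness: mean cubed standardized deviation (near 0 for symmetric data)."""
    m = statistics.fmean(data)
    s = statistics.pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)


rng = random.Random(315)
# Hypothetical right-skewed data: lognormal, so the log of each value is normal.
raw = [rng.lognormvariate(0.0, 0.5) for _ in range(20_000)]
logged = [math.log(x) for x in raw]

skew_raw = skewness(raw)          # clearly positive: long right tail
skew_logged = skewness(logged)    # close to 0: roughly symmetric

print(skew_raw, skew_logged)
```

Whether a transformation helps in a given case can be judged with the same tools as before, for example a QQ plot of the transformed data.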
6. Exercises to do in class
Chapter 4 exercises: 6 (b)-(e) and 12 (b)-(e), 10 (b)-(d) and 24, 30, 32, 52, 60