STAT 421 Lecture Notes
3.2 Continuous Distributions
The previous section emphasized discrete random variables. Attention now shifts
to continuous random variables. Earlier it was said that a continuous random variable
X differs from a discrete random variable in that X takes on every value in some
interval, so it is not possible to enumerate the set X(S), where S is the sample
space.[1] This section begins a more in-depth discussion of continuous random variables.
Definition X is a continuous random variable (r.v.) if there exists a non-negative function f defined on R such that for every interval I = [a, b] of R, the probability that X takes on a value in I is
$$\Pr(a \le X \le b) = \int_a^b f(x)\,dx.$$
Similarly, if X is continuous, then the probability that X takes on a value in an unbounded
interval (−∞, b] is
$$\Pr(X \le b) = \int_{-\infty}^{b} f(x)\,dx$$
and the probability that X takes on a value in [a, ∞) is
$$\Pr(X \ge a) = \int_a^{\infty} f(x)\,dx.$$
f is the probability density function (p.d.f.) of X and the closure of {x|f (x) > 0} is the
support of X. For instance, the closure of [−1, 0) ∪ (0, 1] is [−1, 1].
Every p.d.f. satisfies two properties:
1. f(x) ≥ 0 for all x ∈ R.
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
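The two properties can be checked numerically for a candidate density. The following is a minimal sketch, not part of the original notes: it assumes the candidate is zero outside a known bounded interval, uses a simple midpoint rule, and the names integrate, looks_like_pdf, g, and h_bad are hypothetical.

```python
def integrate(f, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def looks_like_pdf(f, lo, hi, tol=1e-6, n=10_000):
    """Check both p.d.f. properties, assuming f is 0 outside [lo, hi].

    Non-negativity is only tested at the midpoint grid, so this is a
    sanity check rather than a proof.
    """
    h = (hi - lo) / n
    nonnegative = all(f(lo + (i + 0.5) * h) >= 0 for i in range(n))
    total = integrate(f, lo, hi, n)
    return nonnegative and abs(total - 1.0) < tol

# Hypothetical candidates on [0, 1] (illustrative only).
def g(x):
    return 3 * x ** 2   # integrates to 1 over [0, 1]

def h_bad(x):
    return x            # integrates to 1/2 over [0, 1], so not a p.d.f.

print(looks_like_pdf(g, 0.0, 1.0))
print(looks_like_pdf(h_bad, 0.0, 1.0))
```

The same integrate helper approximates Pr(a ≤ X ≤ b) for any interval inside the support, for instance integrate(g, 0.0, 0.5).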
The probability of obtaining a value in an interval does not depend on whether it is open or
closed, since
$$\Pr(a \le X \le b) = \int_a^b f(x)\,dx = \Pr(a \le X < b).$$
Furthermore, 0 = Pr(a ≤ X ≤ b) − Pr(a ≤ X < b) = Pr(X = b). As this relationship holds
for all b, we conclude that Pr(X = x) = 0 for all x ∈ R; that is, the probability that a
continuous r.v. takes on any single value is 0.
Because Pr(X = x) = 0 for all x, f can be modified at countably many points in R, yet the
probability that X takes on a value in a particular interval remains the same. Thus, p.d.f.s
are not unique (the p.d.f. of a random variable can be defined differently, yet all variations
yield the same probabilities). The convention, however, is to say that f is the p.d.f. of X
though, properly, we should say that f is a p.d.f. for X.
[1] In contrast, the support of a discrete random variable is countable.
Example A continuous random variable X has a uniform distribution on an interval [a, b], with a < b, if
f(x) is constant on [a, b] and zero elsewhere. That is, X is uniformly distributed on [a, b] if
$$f(x) = \begin{cases} \dfrac{1}{b-a}, & a \le x \le b, \\ 0, & \text{otherwise.} \end{cases}$$
The denominator b − a is determined by the condition that the p.d.f. integrates to 1. We
can verify that the denominator is correct by setting f(x) = c on [a, b] and solving for c:
$$1 = \int_a^b c\,dx = c\,x\Big|_a^b = c(b - a),$$
so c = 1/(b − a).
For a ≤ x1 ≤ x2 ≤ b,
$$\Pr(x_1 \le X \le x_2) = \frac{x_2 - x_1}{b - a}.$$
We write X ∼ Unif(a, b).
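As a rough illustration of the interval formula for a uniform random variable, the sketch below compares the closed form with a simulated relative frequency. The endpoints, sample size, seed, and function names are arbitrary choices made for the example, not values from the notes.

```python
import random

def uniform_interval_prob(a, b, x1, x2):
    """Pr(x1 <= X <= x2) for X ~ Unif(a, b), assuming a <= x1 <= x2 <= b."""
    if a == b or not (a <= x1 <= x2 <= b):
        raise ValueError("need a < b and a <= x1 <= x2 <= b")
    return (x2 - x1) / (b - a)

def simulated_interval_prob(a, b, x1, x2, n=100_000, seed=421):
    """Relative frequency of draws from Unif(a, b) that land in [x1, x2]."""
    rng = random.Random(seed)
    hits = sum(x1 <= rng.uniform(a, b) <= x2 for _ in range(n))
    return hits / n

# Arbitrary illustrative values: X ~ Unif(2, 7), interval [3, 4.5].
exact = uniform_interval_prob(2.0, 7.0, 3.0, 4.5)    # (4.5 - 3) / (7 - 2)
approx = simulated_interval_prob(2.0, 7.0, 3.0, 4.5)
print(exact, approx)
```

The simulated value should be close to the exact one, with the gap shrinking as n grows.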
Example Consider a slightly different p.d.f. given by f(x) = cx, where the support of X
is [a, b] and 0 ≤ a < b. The value of c is determined from the two requisite properties of p.d.f.s
(f(x) ≥ 0 for all x ∈ R, and $\int_{-\infty}^{\infty} f(x)\,dx = 1$). The first property implies c > 0. The second
property implies
$$1 = \int_a^b cx\,dx = c\,\frac{b^2 - a^2}{2} \;\Rightarrow\; c = \frac{2}{b^2 - a^2}.$$
If [a, b] = [0, 1], then
$$f(x) = \begin{cases} 2x, & 0 \le x \le 1, \\ 0, & \text{otherwise.} \end{cases}$$
If [a, b] = [0, 2], then
$$f(x) = \begin{cases} \tfrac{1}{2}x, & 0 \le x \le 2, \\ 0, & \text{otherwise.} \end{cases}$$
The coefficient c is called a normalizing constant.
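The normalizing constant for f(x) = cx can be computed and checked for any support [a, b] with 0 ≤ a < b. This is a minimal sketch: the function names are hypothetical, and the third interval, [1, 3], is an extra illustrative case that does not appear in the notes.

```python
def normalizing_constant(a, b):
    """c such that f(x) = c*x integrates to 1 over [a, b], assuming 0 <= a < b."""
    if not (0 <= a < b):
        raise ValueError("need 0 <= a < b")
    return 2.0 / (b ** 2 - a ** 2)

def integrate(f, lo, hi, n=10_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

for a, b in [(0.0, 1.0), (0.0, 2.0), (1.0, 3.0)]:
    c = normalizing_constant(a, b)
    total = integrate(lambda x: c * x, a, b)   # should be close to 1
    print(a, b, c, total)
```

For [0, 1] and [0, 2] the computed constants match the values 2 and 1/2 given above.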
Computing probabilities from probability density functions
The probability that a continuous random variable takes on a value in some interval [a, b]
is computed by integrating the p.d.f. over the interval, that is, by evaluating the antiderivative
at the endpoints, as illustrated above using the uniform random variable. The interval need not be bounded. For example,
suppose that
$$f(x) = \begin{cases} \dfrac{1}{(1 + x)^2}, & 0 \le x, \\ 0, & \text{otherwise.} \end{cases}$$
Then,
$$\Pr(X \ge 2) = \int_2^{\infty} \frac{1}{(1 + x)^2}\,dx = -\frac{1}{1 + x}\Big|_2^{\infty} = \frac{1}{3}.$$
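The value 1/3 can be cross-checked numerically. The sketch below is one way to do it, under stated assumptions: the integral is truncated at an arbitrary upper limit, and the neglected tail is added back using the antiderivative −1/(1 + x). The function names and the truncation point are hypothetical.

```python
def density(x):
    """p.d.f. from the example: 1/(1 + x)^2 for x >= 0, and 0 otherwise."""
    return 1.0 / (1.0 + x) ** 2 if x >= 0 else 0.0

def tail_prob(t):
    """Pr(X >= t), from evaluating -1/(1 + x) between t and infinity."""
    return 1.0 if t < 0 else 1.0 / (1.0 + t)

def integrate(f, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

upper = 10_000.0                           # arbitrary truncation point
numeric = integrate(density, 2.0, upper)   # integral over [2, upper]
remainder = tail_prob(upper)               # mass beyond the truncation point
print(tail_prob(2.0), numeric + remainder)
```

Both printed values should agree with 1/3 to several decimal places.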
Remarks
1. Online symbolic integrators are available for computing some indefinite integrals, for
example, http://integrals.wolfram.com/index.jsp. Use them to check your work
but do not become dependent on them.
2. The probability density function evaluated at x ∈ R is not a probability. For
example,
$$f(x) = \begin{cases} 5e^{-5x}, & 0 < x, \\ 0, & \text{otherwise} \end{cases}$$
is a p.d.f. and f(1/5) = 5e^{−1} ≈ 1.84 > 1. Since a probability cannot exceed 1, f(1/5) is not a probability in any sense.
3. Mixed distributions are modified continuous distributions that allow for specific, countably many values of x to be taken on with positive probability. For example, a measuring device may have a threshold value below which it cannot yield an accurate reading.
Whenever the underlying random variable producing the measurements yields a value
below the threshold, then the machine has been set up to report the threshold.
Suppose that the underlying random variable is X and its p.d.f. is f(x). Further,
suppose that p is the probability that X will be less than or equal to the threshold
value x0, i.e., 0 < p = Pr(X ≤ x0). Then, the random variable representing a measurement made of a randomly selected population unit, say Y, may be modeled according
to
$$\Pr(Y \in A) = p\,I(x_0 \in A) + (1 - p)\int_A f(y)\,dy.$$
The set A is some interval of R and I(x ∈ A) is an indicator variable that takes on the
value 1 if x ∈ A and 0 otherwise.
4. Many of the textbook problems ask for plots of the p.d.f. and related functions. It's
best to learn now how to use R for plotting. A script called DistributionPlotter.R is posted at
http://www.math.umt.edu/steele/STAT421/Homework/
It constructs two plots: the probability function for a binomial random variable
and the p.d.f. of an important continuous random variable. Try modifying the program
when you are asked to plot these functions.
To use the program, download it to your computer. Open R, go to the File dropdown menu (leftmost in the toolbar), and open the file. A simple editor will open with
the program. Highlight some or all of the lines, hold down the Ctrl key and press the
R key. The highlighted lines will execute.
Output appears in the console window, and a graphics device window will open if
the code contains plotting functions. You can copy the graphic to the clipboard
or save it as a file by placing the cursor in the graphics device window and right-clicking
with the mouse.
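Returning to Remark 2, the difference between a density value and a probability can be seen numerically. The sketch below uses the density 5e^{−5x} from that remark. The interval [0.15, 0.25] is an arbitrary illustrative choice and the function names are hypothetical.

```python
import math

def exp_density(x, rate=5.0):
    """p.d.f. from Remark 2: rate * exp(-rate * x) for x > 0, and 0 otherwise."""
    return rate * math.exp(-rate * x) if x > 0 else 0.0

def exp_interval_prob(lo, hi, rate=5.0):
    """Pr(lo <= X <= hi) for 0 <= lo <= hi, from the antiderivative -exp(-rate * x)."""
    return math.exp(-rate * lo) - math.exp(-rate * hi)

height = exp_density(0.2)               # density value at x = 1/5
prob = exp_interval_prob(0.15, 0.25)    # probability of an interval around 1/5
print(round(height, 2), round(prob, 3))
```

The density value exceeds 1, while the probability of any interval, obtained by integrating the density, stays between 0 and 1.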