Lecture 3. Continuous random variables
Mathematical Statistics and Discrete Mathematics
November 9th, 2015
Continuous random variables
Let us measure the temperature outside. The question is: what is the probability that
the outcome is in the interval
[7, 9], [8 − 10⁻¹, 8 + 10⁻¹], [8 − 10⁻², 8 + 10⁻²], [8 − 10⁻³, 8 + 10⁻³], . . .
The intuition is that this probability should get closer and closer to 0 as the interval
gets shorter and shorter, and hence the probability of the outcome being exactly 8
degrees is 0.
Main motivating question: how do we mathematically model random quantities with
this property?
Continuous random variables
Recall that a random variable is a function X taking real values and defined on a
sample space S together with a probability measure P on the events contained in S.
A random variable X is called continuous if
P(X = x) = 0
for any real number x.
Examples:
• Measure the temperature outside.
• Weigh a random person.
• Choose a real number uniformly at random from [0, 1].
Continuous PDF
• We want a tool, like the discrete PDF, that fully describes the distribution of X
and that we can use for computations.
• We cannot use the definition for discrete random variables since P(X = x) = 0
for all x.
• We will use an equivalent formulation: recall that a function f is a discrete
probability density function of a discrete random variable X if and only if

P(X ∈ [a, b]) = P(a ≤ X ≤ b) = ∑_{a≤x≤b} f(x)

for any a ≤ b.
• One has to replace summation with integration.
Continuous PDF
Let X be a continuous random variable. A function f = fX is called a continuous PDF
of X, or simply a density of X, if
P(X ∈ [a, b]) = P(a ≤ X ≤ b) = ∫_a^b f(x) dx

for all a ≤ b.
Note that if X is continuous, then
P(a ≤ X ≤ b) = P(a < X ≤ b) = P(a ≤ X < b) = P(a < X < b).
Probability defined in this way satisfies the axioms of probability, and hence also all
their consequences.
Recall the interpretation of the integral as the area below the graph of the function.
Continuous PDF
A function f is a density of some random variable X if and only if

f(x) ≥ 0 for all x,   and   ∫_{−∞}^{∞} f(x) dx = 1.
Compare this with the discrete case: recall that f is a PDF of a discrete random
variable if and only if
f(x) ≥ 0 for all x,   and   ∑_{all possible values x of X} f(x) = 1.
Let E ⊂ R. The indicator (function) 1_E : R → {0, 1} of E is

1_E(x) = 1 if x ∈ E,   and   1_E(x) = 0 if x ∉ E.
Ex. Show that 1_{E1∩E2} = 1_{E1} · 1_{E2} for all E1, E2, and 1_{E1∪E2} = 1_{E1} + 1_{E2} if E1 ∩ E2 = ∅.
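The first identity in the exercise can be checked numerically. In this sketch (the function and variable names are mine), a subset of R is represented by a membership predicate:

```python
# Represent a subset E of R by a membership predicate and build 1_E.
def indicator(is_in_E):
    return lambda x: 1 if is_in_E(x) else 0

ind_E1 = indicator(lambda x: 0 <= x <= 2)                          # E1 = [0, 2]
ind_E2 = indicator(lambda x: 1 <= x <= 3)                          # E2 = [1, 3]
ind_E1_and_E2 = indicator(lambda x: 0 <= x <= 2 and 1 <= x <= 3)   # E1 ∩ E2

# 1_{E1 ∩ E2} = 1_{E1} * 1_{E2}, checked pointwise on a small grid.
grid = [-1, 0, 0.5, 1, 1.5, 2, 2.5, 3, 4]
assert all(ind_E1_and_E2(x) == ind_E1(x) * ind_E2(x) for x in grid)
```

A pointwise check on a grid is of course not a proof, but it illustrates what the identity says.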
We are interested in some random quantity X whose PDF is well approximated by the
cosine curve:

fX(x) = c·1_[−π/2,π/2](x) cos x,  i.e.  fX(x) = c cos x for −π/2 ≤ x ≤ π/2, and fX(x) = 0 otherwise.
We have to compute c so that the density integrates to 1:
c⁻¹ = c⁻¹ ∫_{−∞}^{∞} fX(x) dx = ∫_{−∞}^{∞} 1_[−π/2,π/2](x) cos x dx = ∫_{−π/2}^{π/2} cos x dx
    = [sin x]_{−π/2}^{π/2} = sin(π/2) − sin(−π/2) = 1 − (−1) = 2,

so c = 1/2.
[Figure: graph of the density fX; the area of the shaded part is equal to P(0.5 ≤ X ≤ 1).]
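The normalization can be double-checked numerically; this is an illustrative sketch using a midpoint Riemann sum (the name f_X is mine):

```python
import math

# Midpoint Riemann sum checking that c = 1/2 normalizes the density.
def f_X(x, c=0.5):
    return c * math.cos(x) if -math.pi / 2 <= x <= math.pi / 2 else 0.0

n = 100_000
a, b = -math.pi / 2, math.pi / 2
dx = (b - a) / n
total = sum(f_X(a + (i + 0.5) * dx) * dx for i in range(n))
assert abs(total - 1.0) < 1e-6
```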
Continuous CDF
The cumulative distribution function (CDF) FX : R → [0, 1] of a random variable X is
defined by
FX (x) = P(X ≤ x).
Note that the CDF is more convenient than the PDF for computing probabilities:

P(a ≤ X ≤ b) = ∫_a^b f(x) dx = FX(b) − FX(a).
Continuous CDF
Let, as before, fX = (1/2)·1_[−π/2,π/2] cos x. We have
FX(x) = ∫_{−∞}^{x} fX(s) ds =
  0                                              for x < −π/2;
  ∫_{−π/2}^{x} (cos t)/2 dt = (1 + sin x)/2      for −π/2 ≤ x ≤ π/2;
  1                                              for x > π/2.
[Figure: graph of the CDF FX; the marked increment FX(1) − FX(0.5) equals P(0.5 < X < 1).]
We have P(0.5 ≤ X ≤ 1) = FX(1) − FX(0.5) = (sin(1) − sin(0.5))/2 ≈ 0.18.
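A quick numerical check of this value, using the piecewise CDF derived above (the helper name F_X is mine):

```python
import math

# CDF of the cosine density, following the piecewise formula above.
def F_X(x):
    if x < -math.pi / 2:
        return 0.0
    if x > math.pi / 2:
        return 1.0
    return (1 + math.sin(x)) / 2

p = F_X(1) - F_X(0.5)   # P(0.5 <= X <= 1)
assert abs(p - (math.sin(1) - math.sin(0.5)) / 2) < 1e-12
```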
From PDF to CDF and back
Let f and F be the PDF and CDF of the same continuous random variable. Then,
F(x) = ∫_{−∞}^{x} f(s) ds   and   f(x) = F′(x) = (d/dx) F(x).
Compare this with the fact that if f and F are the discrete PDF and CDF of the same
discrete random variable, then
F(x) = ∑_{s≤x} f(s)   and   f(x) = lim_{s→x⁻} (F(x) − F(s)).
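The relation f = F′ can be illustrated numerically for the cosine example: a central finite difference of the CDF should recover the PDF (the names and the step size are my choices):

```python
import math

# CDF of the cosine example, as derived above.
def F_X(x):
    if x < -math.pi / 2:
        return 0.0
    if x > math.pi / 2:
        return 1.0
    return (1 + math.sin(x)) / 2

# A central finite difference of F approximates f(x) = F'(x) = cos(x)/2.
h = 1e-6
x0 = 0.3
approx_pdf = (F_X(x0 + h) - F_X(x0 - h)) / (2 * h)
assert abs(approx_pdf - 0.5 * math.cos(0.3)) < 1e-6
```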
Expectation
The expectation E[X] of a continuous random variable X is defined by
E[X] = µX = ∫_{−∞}^{∞} x·fX(x) dx.
Compare this with the same formula for a discrete random variable X:
E[X] = ∑_{all possible values x of X} x·fX(x).
Expectations of continuous random variables have the same properties as expectations
of discrete random variables.
Expectation
Recall integration by parts:
∫_a^b f(x) g′(x) dx = [f(x) g(x)]_a^b − ∫_a^b f′(x) g(x) dx.
Let, again, fX = (1/2)·1_[−π/2,π/2] cos x. Integrating by parts, we have:
E[X] = ∫_{−∞}^{∞} (1/2)·1_[−π/2,π/2](x)·x cos x dx = (1/2) ∫_{−π/2}^{π/2} x cos x dx
     = (1/2)[x sin x]_{−π/2}^{π/2} − (1/2) ∫_{−π/2}^{π/2} sin x dx
     = (1/2)[x sin x]_{−π/2}^{π/2} + (1/2)[cos x]_{−π/2}^{π/2}
     = (1/2)(π/2 − (π/2) + 0 − 0) = 0.
If fX is symmetric with respect to a vertical line passing through µ on the x-axis, then
E[X] = µ.
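The symmetry argument can be checked with a midpoint-rule sketch of the integral (an illustrative computation; names are mine):

```python
import math

# Midpoint-rule sketch: the integrand x * fX(x) is odd, so E[X] = 0.
n = 100_000
a, b = -math.pi / 2, math.pi / 2
dx = (b - a) / n
xs = [a + (i + 0.5) * dx for i in range(n)]
mean = sum(x * 0.5 * math.cos(x) * dx for x in xs)
assert abs(mean) < 1e-9
```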
Expectation of functions of random variables
Let X be a continuous random variable and let H : R → R be a function. The
expectation E[H(X)] of H(X) is defined by
E[H(X)] = ∫_{−∞}^{∞} H(x)·fX(x) dx.
Let, as usual, fX = (1/2)·1_[−π/2,π/2] cos x and let H(x) = x². Using integration by parts
twice, we get:

E[X²] = ∫_{−∞}^{∞} (1/2)·1_[−π/2,π/2](x)·x² cos x dx = (1/2) ∫_{−π/2}^{π/2} x² cos x dx = (π² − 8)/4.
Ex 1. Do the computation above.
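As a sanity check for Ex 1, the same integral can be approximated numerically (an illustrative midpoint-rule sketch, not a substitute for the integration by parts):

```python
import math

# Midpoint-rule approximation of E[X^2] for the cosine density.
n = 100_000
a, b = -math.pi / 2, math.pi / 2
dx = (b - a) / n
xs = [a + (i + 0.5) * dx for i in range(n)]
second_moment = sum(x ** 2 * 0.5 * math.cos(x) * dx for x in xs)
assert abs(second_moment - (math.pi ** 2 - 8) / 4) < 1e-6
```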
Variance and standard deviation
The variance of a random variable X is defined to be
Var[X] = σX² = E[(X − E[X])²].
It is usually easier to use the following computational formula:
Var[X] = E[X²] − (E[X])².
Variance of continuous random variables has the same properties as variance of
discrete random variables.
The standard deviation of a random variable X is defined to be
σX = √(σX²) = √(Var[X]).
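For the cosine example, E[X] = 0, so the computational formula gives Var[X] = E[X²] = (π² − 8)/4. A numerical sketch (names are mine):

```python
import math

# Var[X] = E[X^2] - (E[X])^2 for the cosine density, via midpoint sums.
n = 100_000
a, b = -math.pi / 2, math.pi / 2
dx = (b - a) / n
xs = [a + (i + 0.5) * dx for i in range(n)]
m1 = sum(x * 0.5 * math.cos(x) * dx for x in xs)        # E[X] = 0
m2 = sum(x ** 2 * 0.5 * math.cos(x) * dx for x in xs)   # E[X^2]
var = m2 - m1 ** 2
sd = math.sqrt(var)
assert abs(var - (math.pi ** 2 - 8) / 4) < 1e-6
```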
Uniform distribution
A random variable X has uniform distribution on the interval [a, b], for a < b, if
fX = (1/(b − a))·1_[a,b].
We denote this by writing X ∼ unif(a, b).
Properties:
• Let [c, d] ⊂ [a, b]. Then P(X ∈ [c, d]) = (d − c)/(b − a),
• E[X] = (a + b)/2,
• Var[X] = (b − a)²/12,
• FX(x) = 0 for x < a;  (x − a)/(b − a) for a ≤ x ≤ b;  1 for x > b.
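These formulas can be illustrated with a small Monte Carlo sketch (the interval [2, 5] and the tolerances are my choices; sampling is approximate by nature):

```python
import random

# Monte Carlo sketch of unif(a, b) = unif(2, 5): the sample mean and
# variance should be close to (a + b)/2 and (b - a)^2/12.
random.seed(0)
a, b = 2.0, 5.0
xs = [random.uniform(a, b) for _ in range(200_000)]
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
assert abs(mean - (a + b) / 2) < 0.02
assert abs(var - (b - a) ** 2 / 12) < 0.02
```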
Uniform distribution
Figure: Left: PDFs of uniform variables on the intervals [0, 1], [0, 2], [0, 4]. Right: CDFs of the
same variables.
Uniform distribution
The bus comes to the bus stop every 15 minutes starting at 7am. You arrive at the bus
stop at a time uniformly distributed over the interval [7.10, 7.45]. What is the probability
that you will wait less than 5 (more than 10) minutes for the bus?
Let X be the time that you arrive at the stop. We have
P(You wait less than 5 min)
= P(X ∈ [7.10, 7.15] or X ∈ [7.25, 7.30] or X ∈ [7.40, 7.45])
= P(X ∈ [7.10, 7.15]) + P(X ∈ [7.25, 7.30]) + P(X ∈ [7.40, 7.45])
= 5/35 + 5/35 + 5/35 = 3/7.
P(You wait more than 10 min) = P(X ∈ [7.15, 7.20] or X ∈ [7.30, 7.35])
= P(X ∈ [7.15, 7.20]) + P(X ∈ [7.30, 7.35])
= 5/35 + 5/35 = 2/7.
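The same answers come out of a simulation sketch (the encoding of times as minutes after 7:00 and the tolerances are my choices):

```python
import random

# Simulation sketch: buses arrive at minutes 15, 30, 45 past 7:00 within
# the arrival window; you arrive uniformly on minutes [10, 45].
random.seed(1)
n = 200_000
short_wait = long_wait = 0
for _ in range(n):
    t = random.uniform(10, 45)                     # arrival time, minutes after 7:00
    wait = min(bus - t for bus in (15, 30, 45, 60) if bus >= t)
    short_wait += wait < 5
    long_wait += wait > 10
assert abs(short_wait / n - 3 / 7) < 0.01
assert abs(long_wait / n - 2 / 7) < 0.01
```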
Normal distribution
A random variable X has normal distribution (Gaussian distribution) with mean
(location) µ ∈ R and variance (squared scale) σ 2 > 0 if
fX(x) = (1/(σ√(2π)))·e^{−(x−µ)²/(2σ²)}.
We denote this by writing X ∼ N (µ, σ 2 ). A random variable X ∼ N (0, 1) is called
the standard normal variable.
Properties:
• E[X] = µ,
• Var[X] = σ 2 .
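The density can be transcribed directly as a small function, with an illustrative check that it is symmetric about µ and integrates to 1 (names, defaults, and the truncation to [−8, 8] are my choices):

```python
import math

# Sketch of the N(mu, sigma^2) density from the formula above.
def normal_pdf(x, mu=0.0, sigma=1.0):
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Symmetry about mu, and total mass ~ 1 (midpoint rule on [-8, 8]).
assert normal_pdf(1.3) == normal_pdf(-1.3)
n, lo, hi = 100_000, -8.0, 8.0
dx = (hi - lo) / n
total_mass = sum(normal_pdf(lo + (i + 0.5) * dx) * dx for i in range(n))
assert abs(total_mass - 1.0) < 1e-6
```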
Normal distribution
[Figure: Left: PDFs of normal variables with µ = 0 and σ² = 0.75, 1, 2. Right: PDFs of normal
variables with σ = 1.5 and µ = −1, 1, 2.]
It is in general true that if fX is the PDF of X, and Y = aX + b with a ≠ 0, then

fY(x) = (1/|a|)·fX((x − b)/a).

Geometrically, this means that fY is obtained from fX by stretching/squeezing it by a factor
of |a| along the x-axis, squeezing/stretching it by 1/|a| along the y-axis, reflecting it
across the y-axis if a < 0, and shifting it by b along the x-axis.
Normal distribution
If X ∼ N (µ, σ 2 ), and a, b are constants, then
aX + b ∼ N(aµ + b, a²σ²).
Hence,

Z = (X − µ)/σ ∼ N(0, 1).
Let X ∼ N (µ, σ 2 ) and k > 0. We have
P(X ∈ [µ − kσ, µ + kσ]) = P(X − µ ∈ [−kσ, kσ]) = P((X − µ)/σ ∈ [−k, k]) = P(Z ∈ [−k, k]).
For k = 1, 2, 3 this probability is approximately equal to 0.68, 0.95, 0.997.
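These values can be recovered from the error function, since P(Z ∈ [−k, k]) = erf(k/√2) for a standard normal Z (a standard identity; the helper name is mine):

```python
import math

# P(Z in [-k, k]) = erf(k / sqrt(2)) for a standard normal Z.
def prob_within(k):
    return math.erf(k / math.sqrt(2))

# Approximately 0.68, 0.95, 0.997 for k = 1, 2, 3.
for k, target in [(1, 0.68), (2, 0.95), (3, 0.997)]:
    assert abs(prob_within(k) - target) < 0.005
```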