Download 7. continuous random variables

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts
no text concepts found
Transcript
CSE 312, 2011 Winter, W.L.Ruzzo
7. continuous random variables
continuous random variables
Discrete random variable: takes values in a finite or
countable set, e.g.
X ∈ {1,2, ..., 6} with equal probability
X is positive integer i with probability 2-i
Continuous random variable: takes values in an
uncountable set, e.g.
X is the weight of a random person (a real number)
X is a randomly selected point inside a unit square
X is the waiting time until the next packet arrives at
the server
2
pdf and cdf
f(x) : the probability density function (or simply “density”)
f(x)
a
F(a) = ∫−∞
f(x) dx
a
b
P(X < a) = F(x): the cumulative distribution function (or
simply “distribution”)
P(a < X < b) = F(b) - F(a)
+∞
Need ∫-∞
f(x) dx (= F(+∞)) = 1
A key relationship:
d F(x), since F(a) = ∫a f(x) dx,
f(x) = dx
−∞
3
densities
Densities are not probabilities; e.g. may be > 1
P(x = a) = P(a ≤ X ≤ a)
= F(a)-F(a) = 0
P(a ≤ X ≤ a+ε) = F(a+ε) - F(a)
≈ ε • f(a)
I.e., the probability that a continuous random variable
falls at a specified point is zero
The probability that it falls near that point is proportional
to the density
4
sums and integrals; expectation
5
example
6
properties of expectation
still true, just as
for discrete
just as for discrete,
but w/integral
(e.g., see slides 25-27)
^
7
variance
8
example
9
continuous random variables: summary
Continuous random variable X has density f(x), and
uniform random variable
X ~ Uni(α,β) is uniform in [α,β]
1.0
0.5
0.0
f(x)
1.5
2.0
The Uniform Density Function Uni(0.5,1.0)
0.0
0.5
α
1.0
x
β
1.5
uniform random variable
2.0
The Uniform Density Function Uni(0.5,1.0)
1.0
0.0
0.5
f(x)
1.5
X ~ Uni(α,β) is uniform in [α,β]
0.0
0.5
1.0
x
if α≤a≤b≤β:
1.5
uniform random variable: example
X ~ Uni(α,β) is uniform in [α,β]
You want to read a disk sector from a 7200rpm disk drive.
Let T be the time you wait, in milliseconds, after the disk
head is positioned over the
correct track, until the desired
sector rotates under the head.
T ~ Uni(0, 8.33)
13
exponential random variable
X ~ Exp(λ)
2.0
The Exponential Density Function
1.0
0.5
!=1
0.0
f(x)
1.5
!=2
-1
0
1
2
x
3
4
exponential random variable
X ~ Exp(λ)
=1-F(t)
Memorylessness:
Examples
Radioactive decay: How long until the next alpha
particle?
Customers: how long until the next customer/packet
arrives at the checkout stand/server?
Buses: How long until the next northbound #71 bus
arrives on the Ave?
Yes, they have a schedule, but given the vagaries of traffic, riders
with-bikes-and-baby-carriages, etc., can they stick to it?
Gambler’s fallacy: “I’m due for a win”
Relation to the Poisson: same process, different
measures.
Poisson says how many events in a fixed time
Exponential says how long until the next event
16
normal random variable
X is a normal (aka Gaussian) random variable X ~ N(μ, σ2)
0.5
The Standard Normal Density Function
0.3
0.1
0.2
!=1
0.0
f(x)
0.4
µ=0
-3
-2
-1
0
x
1
2
3
0.5
0.5
changing μ, σ
0.4
0.3
0.3
0.4
µ=0
µ=0
0.2
0
0.1
!=2
0.0
0.0
0.1
0.2
!=1
10
µ=4
-10
-5
0
5
10
0
0.3
µ=4
0.2
0.1
!=2
0.0
0.0
0.1
0.2
!=1
0
0.3
0.4
0
5
0.5
0
0.4
-5
0.5
-10
-10
-5
0
5
10
-10
density at μ is ≈ .399/σ
-5
0
5
10
18
normal random variable
X is a normal random variable X ~ N(μ,σ2)
Y = aX + b
E[Y] = E[aX+b] = aμ + b
Var[Y] = Var[aX+b] = a2σ2
Y ~ N(aμ + b, a2σ2)
Important special case: Z = (X-μ)/σ ~ N(0,1)
Z ~ N(0,1) “standard (or unit) normal”
Use Φ(z) to denote CDF, i.e.
no closed form 
Φ(.46)
0.5
The Standard Normal Density Function
0.3
0.1
0.2
!=1
0.0
f(x)
0.4
µ=0
-3
-2
-1
0
1
2
3
x
20
0.5
The Standard Normal Density Function
0.3
0.1
0.2
!=1
0.0
f(x)
0.4
µ=0
-3
-2
-1
0
1
2
3
x
If Z ~N(µ,σ) what is P( µ-σ < Z < µ+σ )?
P( µ - σ < Z < µ + σ ) = Φ(1) - Φ(-1) ≈ 68%
P( µ - 2σ < Z < µ + 2σ ) = Φ(2) - Φ(-2) ≈ 95%
P( µ - 3σ < Z < µ + 3σ ) = Φ(3) - Φ(-3) ≈ 99%
21
normal approximation to binomial
X ~ Bin(n,p)
E[X] = np Var[X] = np(1-p)
Poisson approx. good for n large, p small (np constant)
For large n: X ≈ Y ~ N(E[X],Var[X]) = N(np,np(1-p))
(p stays fixed)
Normal approximation good when np(1-p) ≥ 10
DeMoivre-Laplace Theorem:
Sn = number of success in n trials (with prob. p), as n→∞
normal approximation to binomial
Ross Ch5 ex 4f
Fair coin flipped 40 times. Probability of 20 heads?
Exact answer:
24
distribution of g(X)
-0.2
0.6
1.0
Ross 5.7 ex 7a
FX
-0.2
0.2
CDF
fY
0.6
1.0
x
0.0 0.2 0.4 0.6 0.8 1.0
1.0
2.0
3.0
x
0.0
density
I.e., X has half its probability density
left of .5, but Y has that squished left
of 1/4. Similarly, X has 0.1 prob
left of 1/10, but Y pushes that left
of 0.01. Generalizing, for any
constant 0 < y < 1:
0.2
CDF
fX
0.0 0.2 0.4 0.6 0.8 1.0
0.0 0.2 0.4 0.6 0.8 1.0
P(Y < 1/4) = P(X2 < 1/4)
= P(X < sqrt(1/4))
“The square of a
fraction is more
= P(X < 1/2)
likely to be near
0 than 1”
= 1/2
density
Suppose X ~Unif(0,1). What is the distribution of Y =
X2?
FY
-0.2
0.2
0.6
P(Y < y) = P(X < sqrt(y)) = sqrt(y) -0.2 0.2 0.6 1.0
y
y
which is the CDF FY of Y, and the PDF of Y is the derivative of that:
fY(y) = 1/(2 sqrt(y)) for 0 < y < 1.
25
1.0
distribution of g(X)
Ross 5.7 ex 7a
Suppose X ~Unif(0,1). What is the distribution of Y = Xn?
Generalizing again, for 0 ≤ y ≤1, the CDF is found as
follows:
And taking the derivative, the density function is:
26
distribution of g(X)
27
the central limit theorem (CLT)
Consider i.i.d. (independent, identically distributed) random
vars X1, X2, X3, …
Xi has μ = E[Xi] and σ2 = Var[Xi]
As n → ∞,
Restated: As n → ∞,
Credit: Annie Leibovitz, © 1987 ?
How tall are you? Why?
Willie Shoemaker & Wilt Chamberlain
29
in the real world…
Why might that be
true?
Frequency
Human height is
approximately normal.
R.A. Fisher (1918)
noted it would follow
from CLT if height
Male Height in Inches
were the sum of
many independent random effects, e.g. many genetic
factors (plus some environmental ones like diet). I.e., suggested
part of mechanism by looking at shape of the curve.
30
Meta-analysis of Dense Genecentric Association Studies Reveals Common
and Uncommon Variants Associated with Height, Lanktree, et al.
The American Journal of Human Genetics 88, 6–18, January 7, 2011
Sixty-Four
(and hundreds more
probably exist)
31
in the real world…
32
in the real world…
33
in the real world…
34
in the real world…
35
Related documents