University of Rhode Island
DigitalCommons@URI
Nonequilibrium Statistical Physics
Physics Course Materials
2015
04. Random Variables: Concepts
Gerhard Müller
University of Rhode Island, [email protected]
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License.
Follow this and additional works at: http://digitalcommons.uri.edu/nonequilibrium_statistical_physics
Part of the Physics Commons
Abstract
Part four of course materials for Nonequilibrium Statistical Physics (Physics 626), taught by Gerhard
Müller at the University of Rhode Island. Entries listed in the table of contents, but not shown in the
document, exist only in handwritten form. Documents will be updated periodically as more entries
become presentable.
Updated with version 2 on 5/3/2016.
Recommended Citation
Müller, Gerhard, "04. Random Variables: Concepts" (2015). Nonequilibrium Statistical Physics. Paper 4.
http://digitalcommons.uri.edu/nonequilibrium_statistical_physics/4
This Course Material is brought to you for free and open access by the Physics Course Materials at DigitalCommons@URI. It has been accepted for
inclusion in Nonequilibrium Statistical Physics by an authorized administrator of DigitalCommons@URI. For more information, please contact
[email protected].
Contents of this Document
[ntc4]
4. Random Variables: Concepts
• Probability distributions [nln46]
• Characteristic function, moments, and cumulants [nln47]
• Cumulants expressed in terms of moments [nex126]
• Generating function and factorial moments [nln48]
• Multivariate distributions [nln7]
• Transformation of random variables [nln49]
• Sums of independent exponentials [nex127]
• Propagation of statistical uncertainty [nex24]
• Chebyshev’s inequality [nex6]
• Law of large numbers [nex7]
• Binomial, Poisson, and Gaussian distribution [nln8]
• Binomial to Poisson distribution [nex15]
• De Moivre-Laplace limit theorem [nex21]
• Central limit theorem [nln9]
• Multivariate Gaussian distribution
• Robust probability distributions [nex19]
• Stable probability distributions [nex81]
• Exponential distribution [nln10]
• Waiting time problem [nln11]
• Pascal distribution [nex22]
Probability Distribution
[nln46]
Experiment represented by events in a sample space: $S = \{A_1, A_2, \dots\}$.
Measurements represented by stochastic variable: $X = \{x_1, x_2, \dots\}$.
Maximum amount of information experimentally obtainable is contained in the probability distribution:
$$P_X(x_i) \geq 0, \qquad \sum_i P_X(x_i) = 1.$$
Partial information is contained in moments,
$$\langle X^n\rangle = \sum_i x_i^n\, P_X(x_i), \qquad n = 1, 2, \dots,$$
or cumulants (as defined in [nln47]):
• $\langle\langle X\rangle\rangle = \langle X\rangle$ (mean value)
• $\langle\langle X^2\rangle\rangle = \langle X^2\rangle - \langle X\rangle^2$ (variance)
• $\langle\langle X^3\rangle\rangle = \langle X^3\rangle - 3\langle X\rangle\langle X^2\rangle + 2\langle X\rangle^3$
The variance is the square of the standard deviation: $\langle\langle X^2\rangle\rangle = \sigma_X^2$.
For continuous stochastic variables we have
$$P_X(x) \geq 0, \qquad \int dx\, P_X(x) = 1, \qquad \langle X^n\rangle = \int dx\, x^n P_X(x).$$
In the literature $P_X(x)$ is often named 'probability density' and the term 'distribution' is used in a cumulative sense for
$$F_X(x) = \sum_{x_i < x} P_X(x_i) \qquad\text{or}\qquad F_X(x) = \int_{-\infty}^{x} dx'\, P_X(x').$$
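As a minimal numerical sketch (a toy example added here for illustration, not part of the original notes), take a fair die as the discrete stochastic variable $X$ and compute the first moments and the variance directly from $P_X(x_i) = 1/6$:

```python
# Minimal sketch: moments and variance of a fair die, X in {1,...,6}.
import numpy as np

x = np.arange(1, 7)
p = np.full(6, 1/6)                                  # P_X(x_i)

print("normalization:", p.sum())                     # sum_i P_X(x_i) = 1
moments = {n: float(np.sum(x**n * p)) for n in (1, 2, 3)}   # <X^n>
print("moments      :", moments)
print("variance     :", moments[2] - moments[1]**2)  # <<X^2>> = <X^2> - <X>^2
```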
Characteristic Function
[nln47]
Fourier transform: $\Phi_X(k) \doteq \langle e^{ikx}\rangle = \int_{-\infty}^{+\infty} dx\, e^{ikx}\, P_X(x)$.
Attributes: $\Phi_X(0) = 1$, $\quad |\Phi_X(k)| \leq 1$.
Moment generating function:
$$\Phi_X(k) = \int_{-\infty}^{+\infty} dx\, P_X(x) \left[\sum_{n=0}^{\infty} \frac{(ik)^n}{n!}\, x^n\right] = \sum_{n=0}^{\infty} \frac{(ik)^n}{n!}\, \langle X^n\rangle$$
$$\Rightarrow\quad \langle X^n\rangle = \int_{-\infty}^{+\infty} dx\, x^n P_X(x) = (-i)^n \left.\frac{d^n}{dk^n}\, \Phi_X(k)\right|_{k=0}.$$
Cumulant generating function:
$$\ln \Phi_X(k) \doteq \sum_{n=1}^{\infty} \frac{(ik)^n}{n!}\, \langle\langle X^n\rangle\rangle \quad\Rightarrow\quad \langle\langle X^n\rangle\rangle = (-i)^n \left.\frac{d^n}{dk^n}\, \ln \Phi_X(k)\right|_{k=0}.$$
Cumulants in terms of moments (with $\Delta X \doteq X - \langle X\rangle$): [nex126]
• $\langle\langle X\rangle\rangle = \langle X\rangle$
• $\langle\langle X^2\rangle\rangle = \langle X^2\rangle - \langle X\rangle^2 = \langle(\Delta X)^2\rangle$
• $\langle\langle X^3\rangle\rangle = \langle(\Delta X)^3\rangle$
• $\langle\langle X^4\rangle\rangle = \langle(\Delta X)^4\rangle - 3\langle(\Delta X)^2\rangle^2$
Theorem of Marcinkiewicz:
$\ln \Phi_X(k)$ can only be a polynomial if its degree is $n \leq 2$.
• $n = 1$: $\ln \Phi_X(k) = ika \quad\Rightarrow\quad P_X(x) = \delta(x - a)$
• $n = 2$: $\ln \Phi_X(k) = ika - \tfrac{1}{2}bk^2 \quad\Rightarrow\quad P_X(x) = \dfrac{1}{\sqrt{2\pi b}}\, \exp\left(-\dfrac{(x-a)^2}{2b}\right)$
Consequence: any probability distribution has either one, two, or infinitely many non-vanishing cumulants.
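The four cumulant-moment relations listed above can be checked symbolically. The sketch below (an illustration added here, not part of the original notes) expands $\ln \Phi_X(k)$ with sympy, treating the first four moments as symbols, and reads off the cumulants order by order in $k$.

```python
# Symbolic check of the cumulant-moment relations: expand ln Phi_X(k) in k
# and extract <<X^n>> for n = 1..4. The symbols m1..m4 stand for <X>..<X^4>.
import sympy as sp

k = sp.symbols('k')
m = sp.symbols('m1:5')

# Characteristic function as a moment expansion, truncated after <X^4>.
Phi = 1 + sum((sp.I*k)**n / sp.factorial(n) * m[n-1] for n in range(1, 5))

lnPhi = sp.series(sp.log(Phi), k, 0, 5).removeO()
for n in range(1, 5):
    cumulant = sp.expand(lnPhi.coeff(k, n) * sp.factorial(n) / sp.I**n)
    print(f"<<X^{n}>> =", cumulant)
# Expected output reproduces the list above, e.g. <<X^2>> = m2 - m1**2.
```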
[nex126] Cumulants expressed in terms of moments
The characteristic function $\Phi_X(k)$ of a probability distribution $P_X(x)$, obtained via Fourier transform as described in [nln47], can be used to generate the moments $\langle X^n\rangle$ and the cumulants $\langle\langle X^n\rangle\rangle$ via the expansions
$$\Phi_X(k) = \sum_{n=0}^{\infty} \frac{(ik)^n}{n!}\, \langle X^n\rangle, \qquad \ln \Phi_X(k) = \sum_{n=1}^{\infty} \frac{(ik)^n}{n!}\, \langle\langle X^n\rangle\rangle.$$
Use these relations to express the first four cumulants in terms of the first four moments. The
results are stated in [nln47]. Describe your work in some detail.
Solution:
Generating function
[nln48]
The generating function GX (z) is a representation of the characteristic function ΦX (k) that is most commonly used, along with factorial moments and
factorial cumulants, if the stochastic variable X is integer valued.
Definition: $G_X(z) \doteq \langle z^x\rangle$ with $|z| = 1$.
Application to continuous and discrete (integer-valued) stochastic variables:
$$G_X(z) = \int dx\, z^x\, P_X(x), \qquad G_X(z) = \sum_n z^n\, P_X(n).$$
Definition of factorial moments:
$$\langle X^m\rangle_f \doteq \langle X(X-1)\cdots(X-m+1)\rangle, \quad m \geq 1; \qquad \langle X^0\rangle_f \doteq 1.$$
Function generating factorial moments:
$$G_X(z) = \sum_{m=0}^{\infty} \frac{(z-1)^m}{m!}\, \langle X^m\rangle_f, \qquad \langle X^m\rangle_f = \left.\frac{d^m}{dz^m}\, G_X(z)\right|_{z=1}.$$
Function generating factorial cumulants:
$$\ln G_X(z) = \sum_{m=1}^{\infty} \frac{(z-1)^m}{m!}\, \langle\langle X^m\rangle\rangle_f, \qquad \langle\langle X^m\rangle\rangle_f = \left.\frac{d^m}{dz^m}\, \ln G_X(z)\right|_{z=1}.$$
Applications:
• Moments and cumulants of the Poisson distribution [nex16]
• Pascal distribution [nex22]
• Reconstructing probability distributions [nex14]
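As a small symbolic sketch of this machinery (added for illustration; the Poisson generating function $G_X(z) = e^{a(z-1)}$ is standard and is also the subject of [nex16]), the factorial moments and factorial cumulants follow by differentiation at $z = 1$:

```python
# Factorial moments <X^m>_f and factorial cumulants <<X^m>>_f of the Poisson
# distribution, generated from G_X(z) = exp(a(z-1)) by differentiation at z=1.
import sympy as sp

z, a = sp.symbols('z a', positive=True)
G = sp.exp(a*(z - 1))

for m in range(1, 4):
    fac_moment   = sp.diff(G, z, m).subs(z, 1)            # <X^m>_f
    fac_cumulant = sp.diff(sp.log(G), z, m).subs(z, 1)    # <<X^m>>_f
    print(m, sp.simplify(fac_moment), sp.simplify(fac_cumulant))
# Expected: <X^m>_f = a**m and <<X^m>>_f = a for m=1, 0 for m>1 (a*delta_{m,1}).
```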
Multivariate Distributions
[nln7]
Let X = (X1 , . . . , Xn ) be a random vector variable with n components.
Joint probability distribution: P (x1 , . . . , xn ).
Marginal probability distribution:
$$P(x_1, \dots, x_m) = \int dx_{m+1} \cdots dx_n\, P(x_1, \dots, x_n).$$
Conditional probability distribution: $P(x_1, \dots, x_m | x_{m+1}, \dots, x_n)$.
$$P(x_1, \dots, x_n) = P(x_1, \dots, x_m | x_{m+1}, \dots, x_n)\, P(x_{m+1}, \dots, x_n).$$
Moments: $\langle X_1^{m_1} \cdots X_n^{m_n}\rangle = \int dx_1 \cdots dx_n\, x_1^{m_1} \cdots x_n^{m_n}\, P(x_1, \dots, x_n)$.
Characteristic function: $\Phi(\mathbf{k}) = \langle e^{i\mathbf{k}\cdot\mathbf{X}}\rangle$.
Moment expansion:
$$\Phi(\mathbf{k}) = \sum_{m_1, \dots, m_n = 0}^{\infty} \frac{(ik_1)^{m_1} \cdots (ik_n)^{m_n}}{m_1! \cdots m_n!}\, \langle X_1^{m_1} \cdots X_n^{m_n}\rangle.$$
Cumulant expansion:
$$\ln \Phi(\mathbf{k}) = {\sum_{m_1, \dots, m_n = 0}^{\infty}}{}' \frac{(ik_1)^{m_1} \cdots (ik_n)^{m_n}}{m_1! \cdots m_n!}\, \langle\langle X_1^{m_1} \cdots X_n^{m_n}\rangle\rangle$$
(prime indicates absence of the term with $m_1 = \cdots = m_n = 0$).
Covariance matrix: $\langle\langle X_i X_j\rangle\rangle = \langle (X_i - \langle X_i\rangle)(X_j - \langle X_j\rangle)\rangle$ ($i = j$: variances, $i \neq j$: covariances).
Correlations: $C(X_i, X_j) = \dfrac{\langle\langle X_i X_j\rangle\rangle}{\sqrt{\langle\langle X_i^2\rangle\rangle\, \langle\langle X_j^2\rangle\rangle}}$.
Statistical independence of $X_1, X_2$: $P(x_1, x_2) = P_1(x_1)\, P_2(x_2)$.
Equivalent criteria for statistical independence:
• all moments factorize: $\langle X_1^{m_1} X_2^{m_2}\rangle = \langle X_1^{m_1}\rangle \langle X_2^{m_2}\rangle$;
• characteristic function factorizes: $\Phi(k_1, k_2) = \Phi_1(k_1)\, \Phi_2(k_2)$;
• all cumulants $\langle\langle X_1^{m_1} X_2^{m_2}\rangle\rangle$ with $m_1 m_2 \neq 0$ vanish.
If $\langle\langle X_1 X_2\rangle\rangle = 0$ then $X_1, X_2$ are called uncorrelated.
This property does not imply statistical independence.
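A minimal numerical sketch of that last remark (added here for illustration): with $X_2 = X_1^2$ and $X_1$ symmetric about zero, the covariance vanishes although higher moments clearly do not factorize.

```python
# Zero covariance does not imply statistical independence: X2 = X1^2 with
# X1 ~ N(0,1) is uncorrelated with X1, yet <X1^2 X2> != <X1^2><X2>.
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=1_000_000)
x2 = x1**2

cov = np.mean(x1*x2) - np.mean(x1)*np.mean(x2)               # <<X1 X2>>
print("covariance :", round(cov, 3))                          # ~ 0
print("<X1^2 X2>  :", round(np.mean(x1**2 * x2), 3))          # ~ 3
print("<X1^2><X2> :", round(np.mean(x1**2)*np.mean(x2), 3))   # ~ 1
```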
Transformation of Random Variables
[nln49]
Consider two random variables X and Y that are functionally related:
Y = F (X) or X = G(Y ).
If the probability distribution for X is known then the probability distribution for Y is determined as follows:
$$P_Y(y)\, \Delta y = \int_{y < f(x) < y + \Delta y} dx\, P_X(x)$$
$$\Rightarrow\quad P_Y(y) = \int dx\, P_X(x)\, \delta\bigl(y - f(x)\bigr) = P_X\bigl(g(y)\bigr)\, |g'(y)|.$$
Consider two random variables X1 , X2 with a joint probability distribution
P12 (x1 , x2 ).
The probability distribution of the random variable Y = X1 + X2 is then
determined as
$$P_Y(y) = \int dx_1 \int dx_2\, P_{12}(x_1, x_2)\, \delta(y - x_1 - x_2) = \int dx_1\, P_{12}(x_1, y - x_1),$$
and the probability distribution of the random variable Z = X1 X2 as
$$P_Z(z) = \int dx_1 \int dx_2\, P_{12}(x_1, x_2)\, \delta(z - x_1 x_2) = \int \frac{dx_1}{|x_1|}\, P_{12}(x_1, z/x_1).$$
If the two random variables X1 , X2 are statistically independent we can substitute P12 (x1 , x2 ) = P1 (x1 )P2 (x2 ) in the above integrals.
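A quick Monte Carlo sketch of the single-variable formula above (an illustration added here; the specific choice $Y = e^X$ with Gaussian $X$ is an assumption of the example): the histogram of $Y$ should match $P_X(g(y))\, |g'(y)|$ with $g(y) = \ln y$.

```python
# Check P_Y(y) = P_X(g(y)) |g'(y)| for X ~ N(0,1) and Y = exp(X), g(y) = ln y.
import numpy as np

rng = np.random.default_rng(1)
y = np.exp(rng.normal(size=200_000))

edges = np.linspace(0.2, 5.0, 25)
hist, _ = np.histogram(y, bins=edges, density=True)
centers = 0.5*(edges[:-1] + edges[1:])

p_x = lambda x: np.exp(-x**2/2) / np.sqrt(2*np.pi)       # Gaussian P_X
p_y = p_x(np.log(centers)) / centers                     # P_X(ln y) * |1/y|

print("max deviation:", np.max(np.abs(hist - p_y)))      # small
```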
Applications:
• Transformation of statistical uncertainty [nex24]
• Chebyshev inequality [nex6]
• Robust probability distributions [nex19]
• Statistically independent or merely uncorrelated? [nex23]
• Sum and product of uniform distributions [nex96]
• Exponential integral distribution [nex79]
• Generating exponential and Lorentzian random numbers [nex80]
• From Gaussian to exponential distribution [nex8]
• Transforming a pair of random variables [nex78]
[nex127] Sums of independent exponentials
Consider $n$ independent random variables $X_1, \dots, X_n$ with range $x_i \geq 0$ and identical exponential distributions,
$$P_1(x_i) = \frac{1}{\xi}\, e^{-x_i/\xi}, \qquad i = 1, \dots, n.$$
Use the transformation relation from [nln49],
$$P_2(x) = \int dx_1 \int dx_2\, P_1(x_1) P_1(x_2)\, \delta(x - x_1 - x_2) = \int dx_1\, P_1(x_1) P_1(x - x_1),$$
inductively to calculate the probability distribution $P_n(x)$, $n \geq 2$, of the stochastic variable $X = X_1 + \cdots + X_n$.
Find the mean value $\langle X\rangle$, the variance $\langle\langle X^2\rangle\rangle$, and the value $x_p$ where $P_n(x)$ has its peak value.
Solution:
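The handwritten solution is not reproduced here; the following is only a numerical sanity check (added for illustration), using the standard fact that the sum of $n$ independent exponentials with mean $\xi$ follows a Gamma (Erlang) distribution with shape $n$ and scale $\xi$.

```python
# Sanity check for [nex127]: compare the sampled sum of n exponentials with
# the Gamma(n, scale=xi) distribution and with the expected mean and variance.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, xi = 4, 1.5
x = rng.exponential(scale=xi, size=(200_000, n)).sum(axis=1)

print("mean    :", x.mean(), " expected:", n*xi)
print("variance:", x.var(),  " expected:", n*xi**2)
print("KS stat :", stats.kstest(x, stats.gamma(a=n, scale=xi).cdf).statistic)
```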
[nex24] Transformation of statistical uncertainty.
From a given stochastic variable X with probability distribution PX (x) we can calculate the probability distribution of the stochastic variable Y = f (X) via the relation
$$P_Y(y) = \int dx\, P_X(x)\, \delta\bigl(y - f(x)\bigr).$$
Show by systematic expansion that if PX (x) is sufficiently narrow and f (x) sufficiently smooth,
then the mean values and the standard deviations of the two stochastic variables are related to
each other as follows:
$$\langle Y\rangle = f(\langle X\rangle), \qquad \sigma_Y = |f'(\langle X\rangle)|\, \sigma_X.$$
Solution:
[nex6] Chebyshev’s inequality
Chebyshev's inequality is a rigorous relation between the standard deviation $\sigma_X = \sqrt{\langle X^2\rangle - \langle X\rangle^2}$ of the random variable $X$ and the probability of deviations from the mean value $\langle X\rangle$ greater than a given magnitude $a$:
$$P\bigl[(x - \langle X\rangle)^2 > a^2\bigr] \leq \frac{\sigma_X^2}{a^2}.$$
Prove Chebyshev's inequality starting from the following relation, commonly used for the transformation of stochastic variables (as in [nln49]):
$$P_Y(y) = \int dx\, \delta\bigl(y - f(x)\bigr)\, P_X(x) \quad\text{with}\quad f(x) = (x - \langle X\rangle)^2.$$
Solution:
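Not the handwritten solution; just a quick numerical illustration (added here) of how conservative the bound is for a specific case, a standard Gaussian with $a = 2$:

```python
# Compare P[(X - <X>)^2 > a^2] with the Chebyshev bound sigma_X^2 / a^2
# for X ~ N(0,1) and a = 2.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=1_000_000)          # sigma_X = 1
a = 2.0

prob = np.mean((x - x.mean())**2 > a**2)
print("P[(x-<X>)^2 > a^2]:", prob)       # ~ 0.046 for the Gaussian
print("Chebyshev bound   :", 1.0/a**2)   # 0.25, a much weaker statement
```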
[nex7] Law of large numbers
Let $X_1, \dots, X_N$ be $N$ statistically independent random variables described by the same probability distribution $P_X(x)$ with mean value $\langle X\rangle$ and standard deviation $\sigma_X = \sqrt{\langle X^2\rangle - \langle X\rangle^2}$. These random variables might represent, for example, a series of measurements under the same (controllable) conditions. The law of large numbers states that the uncertainty (as measured by the standard deviation) of the stochastic variable $Y = (X_1 + \cdots + X_N)/N$ is
$$\sigma_Y = \frac{\sigma_X}{\sqrt{N}}.$$
Prove this result.
Solution:
Binomial, Poisson, and Gaussian Distributions
[nln8]
Consider a set of $N$ independent experiments, each having two possible outcomes occurring with given probabilities.
events: $A + B = S$
probabilities: $p + q = 1$
random variables: $n + m = N$
Binomial distribution:
$$P_N(n) = \frac{N!}{n!\,(N-n)!}\, p^n (1-p)^{N-n}.$$
Mean value: $\langle n\rangle = Np$. Variance: $\langle\langle n^2\rangle\rangle = Npq$. [nex15]
In the following we consider two different asymptotic distributions in the limit $N \to \infty$.
Poisson distribution:
Limit #1: $N \to \infty$, $p \to 0$ such that $Np = \langle n\rangle = a$ stays finite [nex15].
$$P(n) = \frac{a^n}{n!}\, e^{-a}.$$
Cumulants: $\langle\langle n^m\rangle\rangle = a$. Factorial cumulants: $\langle\langle n^m\rangle\rangle_f = a\, \delta_{m,1}$. [nex16]
Single parameter: $\langle n\rangle = \langle\langle n^2\rangle\rangle = a$.
Gaussian distribution:
Limit #2: $N \gg 1$, $p > 0$ with $Np \gg \sqrt{Npq}$.
$$P_N(n) = \frac{1}{\sqrt{2\pi \langle\langle n^2\rangle\rangle}}\, \exp\left(-\frac{(n - \langle n\rangle)^2}{2\langle\langle n^2\rangle\rangle}\right).$$
Derivation: De Moivre-Laplace limit theorem [nex21].
Two parameters: $\langle n\rangle = Np$, $\langle\langle n^2\rangle\rangle = Npq$.
Special case of central limit theorem [nln9].
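Both limits can be eyeballed numerically. The sketch below (added for illustration; the particular values of $N$ and $p$ are arbitrary choices) compares binomial probabilities with their Poisson and Gaussian approximations using scipy.stats.

```python
# Binomial vs. Poisson (limit #1) and vs. Gaussian (limit #2).
import numpy as np
from scipy import stats

# Limit #1: N large, p small, a = N*p fixed and moderate.
N, p = 2000, 0.002
n = np.arange(0, 15)
print("binomial vs Poisson :",
      np.max(np.abs(stats.binom.pmf(n, N, p) - stats.poisson.pmf(n, N*p))))

# Limit #2: N*p and N*p*q both large.
N, p = 2000, 0.4
n = np.arange(600, 1000)
gauss = stats.norm.pdf(n, loc=N*p, scale=np.sqrt(N*p*(1-p)))
print("binomial vs Gaussian:",
      np.max(np.abs(stats.binom.pmf(n, N, p) - gauss)))
```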
[nex15] Binomial to Poisson distribution
Consider the binomial distribution for two events A, B that occur with probabilities P (A) ≡ p,
P (B) = 1 − p ≡ q, respectively:
$$P_N(n) = \frac{N!}{n!\,(N-n)!}\, p^n q^{N-n},$$
where N is the number of (independent) experiments performed, and n is the stochastic variable
that counts the number of realizations of event A.
(a) Find the mean value $\langle n\rangle$ and the variance $\langle\langle n^2\rangle\rangle$ of the stochastic variable $n$.
(b) Show that for $N \to \infty$, $p \to 0$ with $Np \to a > 0$, the binomial distribution turns into the Poisson distribution
$$P_\infty(n) = \frac{a^n}{n!}\, e^{-a}.$$
Solution:
[nex21] De Moivre−Laplace limit theorem.
Show that for large $Np$ and large $Npq$ the binomial distribution turns into the Gaussian distribution with the same mean value $\langle n\rangle = Np$ and variance $\langle\langle n^2\rangle\rangle = Npq$:
$$P_N(n) = \frac{N!}{n!\,(N-n)!}\, p^n q^{N-n} \;\longrightarrow\; P_N(n) \simeq \frac{1}{\sqrt{2\pi \langle\langle n^2\rangle\rangle}}\, \exp\left(-\frac{(n - \langle n\rangle)^2}{2\langle\langle n^2\rangle\rangle}\right).$$
Solution:
Central Limit Theorem
[nln9]
The central limit theorem is a major extension of the law of large numbers.
It explains the unique role of the Gaussian distribution in statistical physics.
Given are a large number of statistically independent random variables $X_i$, $i = 1, \dots, N$, with equal probability distributions $P_X(x_i)$. The only restriction on the shape of $P_X(x_i)$ is that the moments $\langle X_i^n\rangle = \langle X^n\rangle$ are finite for all $n$.
Goal: Find the probability distribution $P_Y(y)$ for the random variable $Y = (X_1 - \langle X\rangle + \cdots + X_N - \langle X\rangle)/N$.
$$P_Y(y) = \int dx_1\, P_X(x_1) \cdots \int dx_N\, P_X(x_N)\, \delta\!\left(y - \frac{1}{N}\sum_{i=1}^{N} [x_i - \langle X\rangle]\right).$$
Characteristic function:
$$\Phi_Y(k) \doteq \int dy\, e^{iky}\, P_Y(y), \qquad P_Y(y) = \frac{1}{2\pi}\int dk\, e^{-iky}\, \Phi_Y(k).$$
$$\Rightarrow\quad \Phi_Y(k) = \int dx_1\, P_X(x_1) \cdots \int dx_N\, P_X(x_N)\, \exp\!\left(i\frac{k}{N}\sum_{i=1}^{N} [x_i - \langle X\rangle]\right) = \left[\bar\Phi\!\left(\frac{k}{N}\right)\right]^N,$$
$$\bar\Phi\!\left(\frac{k}{N}\right) = \int dx\, e^{i(k/N)(x - \langle X\rangle)}\, P_X(x) = \exp\!\left(-\frac{1}{2}\frac{k^2}{N^2}\, \langle\langle X^2\rangle\rangle + \cdots\right) = 1 - \frac{1}{2}\frac{k^2}{N^2}\, \langle\langle X^2\rangle\rangle + O\!\left(\frac{k^3}{N^3}\right),$$
where we have performed a cumulant expansion to leading order.
$$\Rightarrow\quad \Phi_Y(k) = \left[1 - \frac{k^2 \langle\langle X^2\rangle\rangle}{2N^2} + O\!\left(\frac{k^3}{N^3}\right)\right]^N \xrightarrow{N\to\infty} \exp\!\left(-\frac{k^2 \langle\langle X^2\rangle\rangle}{2N}\right),$$
where we have used $\lim_{N\to\infty}(1 + z/N)^N = e^z$.
$$\Rightarrow\quad P_Y(y) = \sqrt{\frac{N}{2\pi \langle\langle X^2\rangle\rangle}}\, \exp\!\left(-\frac{N y^2}{2\langle\langle X^2\rangle\rangle}\right) = \frac{1}{\sqrt{2\pi \langle\langle Y^2\rangle\rangle}}\, e^{-y^2/2\langle\langle Y^2\rangle\rangle}$$
with variance $\langle\langle Y^2\rangle\rangle = \langle\langle X^2\rangle\rangle/N$.
Note that regardless of the form of $P_X(x)$, the average of a large number of (independent) measurements of $X$ will be a Gaussian with standard deviation $\sigma_Y = \sigma_X/\sqrt{N}$.
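The statement above is easy to see numerically. The sketch below (added for illustration; the exponential parent distribution is an arbitrary non-Gaussian choice) averages $N = 100$ independent draws and checks the width and Gaussian shape of the result.

```python
# CLT sketch: averages of N exponential variates approach a Gaussian with
# standard deviation sigma_X / sqrt(N).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
N, trials = 100, 50_000
x = rng.exponential(scale=1.0, size=(trials, N))       # sigma_X = 1, <X> = 1
y = x.mean(axis=1) - 1.0                               # Y as defined in [nln9]

print("std of Y :", y.std(), " expected:", 1/np.sqrt(N))
print("KS vs Gaussian:",
      stats.kstest(y, stats.norm(scale=1/np.sqrt(N)).cdf).statistic)
```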
[nex19] Robust probability distributions
Consider two independent stochastic variables X1 and X2 , each specified by the same probability
distribution PX (x). Show that if PX (x) is either a Gaussian, a Lorentzian, or a Poisson distribution,
(i) $P_X(x) = \dfrac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^2/2\sigma^2}$,
(ii) $P_X(x) = \dfrac{1}{\pi}\, \dfrac{a}{x^2 + a^2}$,
(iii) $P_X(x = n) = \dfrac{a^n}{n!}\, e^{-a}$,
then the probability distribution PY (y) of the stochastic variable Y = X1 + X2 is also a Gaussian,
a Lorentzian, or a Poisson distribution, respectively. What property of the characteristic function
ΦX (k) guarantees the robustness of PX (x)?
Solution:
[nex81] Stable probability distributions
Consider $N$ independent random variables $X_1, \dots, X_N$, each having the same probability distribution $P_X(x)$. If the probability distribution of the random variable $Y_N = X_1 + \cdots + X_N$ can be written in the form $P_Y(y) = P_X(y/c_N + \gamma_N)/c_N$, then $P_X(x)$ is stable. The multiplicative constant must be of the form $c_N = N^{1/\alpha}$, where $\alpha$ is the index of the stable distribution. $P_X(x)$ is strictly stable if $\gamma_N = 0$.
Use the results of [nex19] to determine the indices $\alpha$ of the Gaussian and Lorentzian distributions, both of which are strictly stable. Show that the Poisson distribution is not stable in the technical sense used here.
Solution:
Exponential distribution
[nln10]
Busses arrive randomly at a bus station.
The average interval between successive bus arrivals is τ .
f (t)dt: probability that the interval is between t and t + dt.
$P_0(t) = \int_t^\infty dt'\, f(t')$: probability that the interval is larger than $t$.
Relation: $f(t) = -\dfrac{dP_0}{dt}$.
Normalizations: $P_0(0) = 1$, $\quad \int_0^\infty dt\, f(t) = 1$.
Mean value: $\langle t\rangle \equiv \int_0^\infty dt\, t\, f(t) = \tau$.
Start the clock when a bus has arrived and consider the events A and B.
Event A: the next bus has not arrived by time t.
Event B: a bus arrives between times t and t + dt.
Assumptions:
1. P (AB) = P (A)P (B) (statistical independence).
2. P (B) = cdt with c to be determined.
Consequence: P0 (t + dt) = P (AB̄) = P (A)P (B̄) = P0 (t)[1 − cdt].
$$\Rightarrow\quad \frac{d}{dt}P_0(t) = -c\,P_0(t) \quad\Rightarrow\quad P_0(t) = e^{-ct} \quad\Rightarrow\quad f(t) = c\,e^{-ct}.$$
Adjust mean value: $\langle t\rangle = \tau \ \Rightarrow\ c = 1/\tau$.
Exponential distribution: $P_0(t) = e^{-t/\tau}$, $\quad f(t) = \dfrac{1}{\tau}\, e^{-t/\tau}$.
Find the probability Pn (t) that n busses arrive before time t.
First consider the probabilities $f(t')dt'$ and $P_0(t - t')$ of the two statistically independent events that the first bus arrives between $t'$ and $t' + dt'$ and that no further bus arrives until time $t$.
Probability that exactly one bus arrives until time $t$:
$$P_1(t) = \int_0^t dt'\, f(t')\, P_0(t - t') = \frac{t}{\tau}\, e^{-t/\tau}.$$
Then calculate $P_n(t)$ by induction.
Poisson distribution:
$$P_n(t) = \int_0^t dt'\, f(t')\, P_{n-1}(t - t') = \frac{(t/\tau)^n}{n!}\, e^{-t/\tau}.$$
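A Monte Carlo sketch of this result (added for illustration): with exponential inter-arrival times of mean $\tau$, the number of busses arriving before time $t$ should be Poisson distributed with parameter $t/\tau$.

```python
# Count arrivals of a process with exponential inter-arrival times and compare
# the empirical distribution of the count with P_n(t) = (t/tau)^n/n! e^{-t/tau}.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
tau, t, trials = 1.0, 3.0, 100_000

gaps = rng.exponential(scale=tau, size=(trials, 50))   # 50 >> t/tau intervals
arrivals = np.cumsum(gaps, axis=1)
n_before_t = (arrivals < t).sum(axis=1)

for n in range(6):
    print(n, np.mean(n_before_t == n), stats.poisson.pmf(n, t/tau))
```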
Waiting Time Problem
[nln11]
Busses arrive more or less randomly at a bus station.
Given is the probability distribution f (t) for intervals between bus arrivals.
Normalization: $\int_0^\infty dt\, f(t) = 1$.
Probability that the interval is larger than $t$: $P_0(t) = \int_t^\infty dt'\, f(t')$.
Mean time interval between arrivals: $\tau_B = \int_0^\infty dt\, t\, f(t) = \int_0^\infty dt\, P_0(t)$.
Find the probability Q0 (t) that no arrivals occur in a randomly chosen time
interval of length t.
First consider the probability P0 (t0 + t) for this to be the case if the interval
starts at time t0 after the last bus arrival. Then average P0 (t0 + t) over the
range of elapsed time t0 .
$$\Rightarrow\quad Q_0(t) = c \int_0^\infty dt_0\, P_0(t_0 + t) \quad\text{with normalization } Q_0(0) = 1$$
$$\Rightarrow\quad Q_0(t) = \frac{1}{\tau_B} \int_t^\infty dt_0\, P_0(t_0).$$
Passengers come to the station at random times. The probability that a
passenger has to wait at least a time t before the next bus is then Q0 (t):
Probability distribution of passenger waiting times:
$$g(t) = -\frac{d}{dt} Q_0(t) = \frac{1}{\tau_B}\, P_0(t).$$
Mean passenger waiting time:
$$\tau_P = \int_0^\infty dt\, t\, g(t) = \int_0^\infty dt\, Q_0(t).$$
The relationship between τB and τP depends on the distribution f (t). In
general, we have τP ≤ τB . The equality sign holds for the exponential distribution.
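A simulation sketch of the comparison between $\tau_B$ and $\tau_P$ (added for illustration; the exponential and uniform interval distributions are arbitrary example choices): passengers arriving at random times wait, on average, $\tau_P \approx \tau_B$ for exponential intervals and $\tau_P \approx 2\tau_B/3$ for intervals uniform on $[0, 2\tau_B]$.

```python
# Mean passenger waiting time for a long, randomly generated bus schedule.
import numpy as np

rng = np.random.default_rng(6)

def mean_wait(intervals, samples=200_000):
    """Average wait of passengers arriving uniformly in time."""
    arrivals = np.cumsum(intervals)
    t = rng.uniform(0, arrivals[-1], size=samples)   # passenger arrival times
    nxt = np.searchsorted(arrivals, t)               # index of the next bus
    return np.mean(arrivals[nxt] - t)

exp_gaps = rng.exponential(1.0, size=1_000_000)       # tau_B = 1
uni_gaps = rng.uniform(0.0, 2.0, size=1_000_000)      # tau_B = 1
print("exponential intervals:", mean_wait(exp_gaps))  # ~ 1.0  (tau_P = tau_B)
print("uniform intervals    :", mean_wait(uni_gaps))  # ~ 0.67 (tau_P < tau_B)
```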
[nex22] Pascal distribution.
Consider the quantum harmonic oscillator in thermal equilibrium at temperature $T$. The energy levels (relative to the ground state) are $E_n = n\hbar\omega$, $n = 0, 1, 2, \dots$
(a) Show that the system is in level $n$ with probability
$$P(n) = (1 - \gamma)\gamma^n, \qquad \gamma = \exp(-\hbar\omega/k_BT).$$
$P(n)$ is called Pascal distribution or geometric distribution.
(b) Calculate the factorial moments $\langle n^m\rangle_f$ and the factorial cumulants $\langle\langle n^m\rangle\rangle_f$ of this distribution.
(c) Show that the Pascal distribution has a larger variance $\langle\langle n^2\rangle\rangle$ than the Poisson distribution with the same mean value $\langle n\rangle$.
Solution: