Common Distributions

Binomial B(n, p):
  PMF: (n choose x) p^x (1 − p)^{n−x}
  MGF: (1 − p + pe^t)^n;  Mean: np;  Variance: np(1 − p)

Poisson Pois(λ):
  PMF: λ^x e^{−λ} / x!
  MGF: e^{λ(e^t − 1)};  Mean: λ;  Variance: λ

Exponential Exp(λ):
  PDF: λe^{−λx}, 0 < x < ∞;  CDF: 1 − e^{−λx}
  MGF: (1 − t/λ)^{−1};  Mean: 1/λ;  Variance: 1/λ²

Normal N(µ, σ²):
  PDF: (1/√(2πσ²)) e^{−(1/2)((x−µ)/σ)²};  CDF: Φ((x−µ)/σ)
  MGF: e^{µt + σ²t²/2};  Mean: µ;  Variance: σ²

Uniform U(a, b):
  PDF: 1/(b−a), a < x < b;  CDF: (x−a)/(b−a)
  MGF: (e^{tb} − e^{ta}) / (t(b−a));  Mean: (a+b)/2;  Variance: (b−a)²/12

Special Properties
Binomial:
  X + Y ∼ B(n + m, p) if X ∼ B(n, p), Y ∼ B(m, p)
  If X ∼ B(n, p) and Y|X ∼ B(X, q), then Y ∼ B(n, pq)
Normal:
  Order-2 raw moment: E[X²] = µ² + σ²
Exponential:
  If X1, ..., Xn are independent with Xi ∼ Exp(λi), then min{X1, ..., Xn} ∼ Exp(Σ_{i=1}^n λi)
  Memoryless: P(T > s + t | T > s) = P(T > t)
Poisson:
  Expected number of occurrences in a given interval: λ
  Probability of exactly x occurrences: λ^x e^{−λ} / x!
  If Xi ∼ Pois(λi) are independent, then Σ_i Xi ∼ Pois(Σ_i λi)
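A quick worked check of the Poisson entries above: with λ = 2,
P(X = 3) = 2³e^{−2}/3! = 8(0.1353)/6 ≈ 0.180.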
Probability Laws and Properties

Total Probability: P(A) = Σ_{i=1}^n P(A|Ci)P(Ci)
Total Expectation: E(X) = Σ_y E(X|Y = y)P(Y = y) = E(E(X|Y))
Total Variance: Var(Y) = E[Var(Y|X)] + Var(E[Y|X])
Total Covariance: Cov(X, Y) = E[Cov(X, Y|Z)] + Cov(E[X|Z], E[Y|Z])
Variance: Var(Y) = E[Y²] − E[Y]²
Covariance: Cov(X, Y) = E(XY) − E(X)E(Y)
Exp. of Cond. Var.: E[Var(X2|X1)] = E(X2²) − E[E(X2|X1)²]
Cond. Variance: Var(Y|X) = E[Y²|X] − E[Y|X]²
Bayes’ Theorem: P(Ck|A) = P(A|Ck)P(Ck) / Σ_{i=1}^n P(A|Ci)P(Ci)
Implied by Probability Space:
  P(B) = P(A ∩ B) + P(A^c ∩ B)
  A ⊂ B ⇒ P(A) ≤ P(B)
  P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
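A worked instance of total probability and Bayes’ theorem: with two equally
likely classes (P(C1) = P(C2) = 0.5) and P(A|C1) = 0.8, P(A|C2) = 0.2,
P(A) = 0.5(0.8) + 0.5(0.2) = 0.5 and P(C1|A) = 0.8(0.5)/0.5 = 0.8.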
Transformation

Single Variable: Y = g(X), f_Y(y) = f_X(g^{−1}(y)) |d g^{−1}(y)/dy|
Two Variable: g(y1, y2) = f(x1, x2)|J|, where J = det[∂xi/∂yj] and each xi is written in terms of y.
Support needs to be transformed too!
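A worked single-variable case: if X ∼ U(0, 1) and Y = −(1/λ)ln X, then
g^{−1}(y) = e^{−λy} and |d g^{−1}(y)/dy| = λe^{−λy}, so f_Y(y) = 1 · λe^{−λy}
on the transformed support 0 < y < ∞, i.e. Y ∼ Exp(λ).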
Inequalities

Boole’s: P(∪_{i=1}^n Ci) ≤ Σ_{i=1}^n P(Ci)
Bonferroni: min{P(C1), P(C2)} ≥ P(C1 ∩ C2) ≥ P(C1) + P(C2) − 1
Markov: P(|X| ≥ a) ≤ E(|X|)/a
Chebyshev’s: P(|(X − µ_X)/σ_X| > b) ≤ 1/b²
Jensen’s: if φ is a convex function, φ(E[X]) ≤ E[φ(X)]
Cauchy-Schwarz: E(XY)² ≤ E(X²)E(Y²)
Var. of Exp.: Var[E(X2|X1)] ≤ Var(X2)
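For instance, Chebyshev’s with b = 2 gives P(|X − µ| > 2σ) ≤ 1/4 for any
distribution with finite variance.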
Common Distributions Continued...

Gamma Γ(α, β):
  PDF: (1/(Γ(α)β^α)) x^{α−1} e^{−x/β}
  MGF: (1 − βt)^{−α};  Mean: αβ;  Variance: αβ²

Multinomial:
  PMF: (n!/(x1!···xk!)) p1^{x1}···pk^{xk}
  MGF: (Σ_{i=1}^k pi e^{ti})^n;  Mean: npi;  Variance: npi(1 − pi)

Pareto:
  PDF: βα^β / x^{β+1}, α < x < ∞;  CDF: 1 − (α/x)^β
  Mean: βα/(β − 1);  Variance: βα² / ((β − 2)(β − 1)²)

Chi-square χ²(r):
  PDF: (1/(Γ(r/2) 2^{r/2})) x^{r/2 − 1} e^{−x/2}
  MGF: (1 − 2t)^{−r/2};  Mean: r;  Variance: 2r

Properties of (Multivariate) Normal Distribution

Given X ∼ Nn(µ, Σ):
PDF: f(x) = (2π)^{−n/2} |Σ|^{−1/2} e^{−(1/2)(x−µ)′Σ^{−1}(x−µ)}
MGF: e^{t′µ + (1/2)t′Σt}
Y = A_{m×n}X + b ⇒ Y ∼ Nm(Aµ + b, AΣA′)
Cov(Xi, Xj) = 0 ⇔ Xi ⊥⊥ Xj for i ≠ j
X1|X2 ∼ Nm(E(X1|X2), Var(X1|X2)), where
  E(X1|X2) = µ1 + Σ12 Σ22^{−1}(x2 − µ2)
  Var(X1|X2) = Σ11 − Σ12 Σ22^{−1} Σ21
If X1, X2, ... ∼ N(µi, σi²) are independent and Y = Σ_{i=1}^n αi Xi, then Y ∼ N(Σ_{i=1}^n αi µi, Σ_{i=1}^n αi² σi²)
If X̄ = (1/n) Σ_{i=1}^n Xi, then X̄ ∼ N(µ, σ²/n)
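A worked bivariate case of the conditional formulas: with µ = (0, 0)′ and
Σ = [1 ρ; ρ 1], Σ12Σ22^{−1} = ρ, so E(X1|X2 = x2) = ρx2 and
Var(X1|X2) = 1 − ρ².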
Independence

X ⊥⊥ Y iff ∃ g(x), h(y) s.t. f(x, y) = g(x)h(y) ⇔
P(a < X < b, c < Y < d) = P(a < X < b)P(c < Y < d) ⇔
M(tx, ty) = M(tx, 0)M(0, ty) ⇔
F_{XY}(x, y) = F_X(x)F_Y(y) ⇔
f_{XY}(x, y) = f_X(x)f_Y(y) ⇔
E[u(X)v(Y)] = E[u(X)]E[v(Y)]
Conditional Independence
1) X ⊥⊥ (A, B) ⇒ X ⊥⊥ A and X ⊥⊥ B
2) X ⊥⊥ A|B and X ⊥⊥ B ⇒ X ⊥⊥ (A, B)
3) X ⊥⊥ A|B and X ⊥⊥ B|A ⇒ X ⊥⊥ (A, B)
4) If X ⊥⊥ Y|Z and U is a function of X, then i) U ⊥⊥ Y|Z and ii) X ⊥⊥ Y|(Z, U)
Definitions and Theorems
σ-Algebra
1) Nonempty: S ∈ Γ ⇒ ∅ ∈ Γ
2) Closed under Complementation: A ∈ Γ ⇒ A^c ∈ Γ
3) Closed under Countable Unions: A1, A2, ... ∈ Γ ⇒ ∪_{i=1}^∞ Ai ∈ Γ

Kolmogorov Axioms of a Probability Measure
1) ∀A ∈ Γ, P(A) ≥ 0
2) P(S) = 1
3) ∀{Ai}_{i=1}^∞ in Γ s.t. Ai ∩ Aj = ∅ for i ≠ j, P(∪_{i=1}^∞ Ai) = Σ_{i=1}^∞ P(Ai)
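For example, the smallest σ-algebra containing a single event A is
Γ = {∅, A, A^c, S}; the Kolmogorov axioms then force, e.g., P(A) + P(A^c) = 1.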
Slutsky’s Theorem
If Xn →_D X and Yn →_P a, then YnXn →_D aX and Xn + Yn →_D X + a
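A standard application: if √n(X̄n − µ) →_D N(0, σ²) and Sn →_P σ, then by
Slutsky’s Theorem √n(X̄n − µ)/Sn →_D N(0, 1).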
Def. of Convergence in Probability
Xn →_P X if ∀ε > 0, lim_{n→∞} P(|Xn − X| ≥ ε) = 0, or equivalently
lim_{n→∞} P(|Xn − X| < ε) = 1

Weak Law of Large Numbers
Let {Xn} be iid with mean µ and σ² < ∞; then (1/n) Σ_{i=1}^n Xi →_P µ

Results from WLLN
1) Xn →_P X and Yn →_P Y, then Xn + Yn →_P X + Y
2) Xn →_P X, then aXn →_P aX
3) Xn →_P a and g(·) is continuous at a, then g(Xn) →_P g(a)
4) Xn →_P X and Yn →_P Y, then XnYn →_P XY
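The WLLN above is a one-line consequence of Chebyshev’s inequality:
P(|X̄n − µ| ≥ ε) ≤ Var(X̄n)/ε² = σ²/(nε²) → 0 for every ε > 0.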
Def. of Convergence in Distribution and Additional Theorems
F_{Xn} and F_X are the cdfs of Xn and X. Xn →_D X if
lim_{n→∞} F_{Xn}(x) = F_X(x) ∀x ∈ C(F_X) (the set of all points where F_X is continuous)
1) Xn →_P X, then Xn →_D X
2) Xn →_D b, then Xn →_P b
3) Xn →_D X and Yn →_D 0, then Xn + Yn →_D X
4) Xn →_D X and g(·) is continuous on the support of X, then g(Xn) →_D g(X)

Moment Generating Function Technique
If M_{Xn}(t) exists for −h < t < h for all n, X has MGF M(t) which exists for
|t| ≤ h1 ≤ h, and lim_{n→∞} M_{Xn}(t) = M(t) for |t| ≤ h1, then Xn →_D X
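A classic use of the MGF technique: if Xn ∼ B(n, λ/n), then
M_{Xn}(t) = (1 + (λ/n)(e^t − 1))^n → e^{λ(e^t − 1)}, so Xn →_D Pois(λ).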
Misc.

Inverse of 2x2: A = [a b; c d] ⇒ A^{−1} = (1/(ad − bc)) [d −b; −c a]
Binomial Formula: (a + b)^n = Σ_{x=0}^n (n choose x) a^x b^{n−x}
Gamma Function: Γ(α) = ∫_0^∞ y^{α−1} e^{−y} dy
Sum of Cubes: a³ + b³ = (a + b)(a² − ab + b²)
Difference of Cubes: a³ − b³ = (a − b)(a² + ab + b²)
Linearity in E(Y|X): E(Y|X) = µ_Y + ρ(σ_Y/σ_X)(x − µ_X), with E[Var(Y|X)] = σ_Y²(1 − ρ²)
Convolution: Z = X + Y ⇒ f_Z(z) = ∫_{−∞}^∞ f_X(z − y) f_Y(y) dy
Calculus (Leibniz rule): d/dy ∫_{h(y)}^{g(y)} f(x) dx = f(g(y))g′(y) − f(h(y))h′(y)
∫ x e^{−ax} dx = −e^{−ax}(ax + 1)/a²
∫ e^{−ax} dx = −e^{−ax}/a
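A worked convolution: if X, Y are iid Exp(λ), then
f_Z(z) = ∫_0^z λe^{−λ(z−y)} λe^{−λy} dy = λ²z e^{−λz}, i.e. Z = X + Y ∼ Γ(2, 1/λ).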
Convergence Concepts
Consistency of Extremum Estimators Theorem
Assume the following:
1) Θ is a compact set in R^K
2) For some function Q : Θ → R, plim_{N→∞} sup_{θ∈Θ} |Q_N(θ) − Q(θ)| = 0
3) The function Q is continuous
4) Q is uniquely maximized over Θ at θ = θ*
Then plim_{N→∞} θ̂_N = θ*
Useful Results/Strategies
Delta Method
Assume the sequence Yn satisfies √n(Yn − θ) →_D N(0, σ²), and that for the
given function g(·) and specific value θ, g′(θ) exists and g′(θ) ≠ 0.
Then √n[g(Yn) − g(θ)] →_D N(0, σ²(g′(θ))²)
PF: The Taylor expansion of g(Yn) around Yn = θ is
g(Yn) = g(θ) + g′(θ)(Yn − θ) + remainder. Since Yn →_P θ, √n · remainder →_P 0,
so by Slutsky’s Theorem √n[g(Yn) − g(θ)] has the same limiting distribution as
g′(θ)√n(Yn − θ), i.e. N(0, σ²(g′(θ))²).
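For instance, with g(x) = x² and g′(µ) = 2µ ≠ 0, the Delta Method gives
√n(X̄n² − µ²) →_D N(0, 4µ²σ²).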
Central Limit Theorem
√n(X̄n − µ) →_D N(0, σ²), where X̄n = (1/n) Σ_{i=1}^n Xi
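A minimal Python sketch that checks the CLT numerically; the Exp(2) population,
seed, and sample sizes are arbitrary illustrative choices, not from the sheet:

    import numpy as np

    # Standardized means of an Exp(lambda = 2) sample should be roughly N(0, 1).
    rng = np.random.default_rng(0)
    lam, n, reps = 2.0, 500, 10_000
    mu, sigma = 1 / lam, 1 / lam      # Exp(lambda): mean 1/lambda, sd 1/lambda

    samples = rng.exponential(scale=1 / lam, size=(reps, n))
    z = np.sqrt(n) * (samples.mean(axis=1) - mu) / sigma
    print(z.mean(), z.std())          # both should be close to 0 and 1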
Distributions of Order Statistics
Suppose X ⊥⊥ Y with cdfs F_X, F_Y, and let Z1 = min{X, Y}, Z2 = max{X, Y}:
1) F_{Z1}(t) = 1 − (1 − F_X(t))(1 − F_Y(t))
2) F_{Z2}(w) = F_X(w)F_Y(w)
3) F_{Z1Z2}(t, w) = F_X(w)F_Y(w) if w < t, and
   F_X(t)F_Y(w) + F_X(w)F_Y(t) − F_X(t)F_Y(t) if t ≤ w
4) f_{Z1Z2}(t, w) = f_X(t)f_Y(w) + f_X(w)f_Y(t) if t ≤ w
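A quick check with X, Y iid U(0, 1): F_{Z1}(t) = 1 − (1 − t)² = 2t − t² and
F_{Z2}(w) = w², matching 1) and 2).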
Identification
1) Multiply by a constant or add/subtract a constant to show the distributions remain the same
2) Check if the mean/variance differ if you assume something is different
3) Choose different obs. x and x′; take the difference; assume the cdfs are the same; conclude the parameter is the same
4) Take the difference of moments
5) Assume the parameters are different; conclude the cdf is different

Partial Identification
1) If linear, assume norm = 1 for the vector of coefficients
2) Assume the inverse cdf is linear (get slope and intercept terms)
3) Assume the first vector element = 1 (identify a cdf)
4) Assume two unknown parameters are the same
5) Assume a function returns 1 or 0 when x = 0
6) Assume no constant term
7) Specify the support (e.g. {1} × R²; use it to get the mean/variance)
Solutions to HW/Past Exams