REVIEW OF STATISTICS
• Sample Space: S = the set of all possible outcomes, not known for sure by the investigator at the current time.
§ event A = a subset of the sample space S (i.e., A ⊂ S).
§ events A and B are disjoint if A ∩ B = ∅ (where ∅ is the empty set or null set).
• Axioms of probability: P(⋅) is a probability function if it satisfies:
§ P(A) ≥ 0 for any event A ⊂ S
§ P(S) = 1
§ P(A ∪ B) = P(A) + P(B) if A and B are disjoint events.
The probability of an event can be intuitively interpreted as measuring the relative frequency or
relative likelihood of this event.
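As a quick illustration of this interpretation, here is a minimal Python sketch (the fair die and the sample size are assumptions chosen only for illustration) that estimates a probability as a simulated relative frequency:

```python
# Estimate P(A) for A = "an even face" on a fair six-sided die by
# simulation; the relative frequency approaches P(A) = 1/2.
import numpy as np

rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=100_000)   # outcomes in S = {1, ..., 6}
rel_freq = np.mean(rolls % 2 == 0)         # relative frequency of A
print(rel_freq)                            # close to 0.5
```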
• A random variable X is a function that takes a specific real value at each point of the sample space. For any event A ⊂ S, P(A) is the probability that X ∈ A.
• A cumulative distribution function (CDF) for a random variable X is the function F(t) = P(X ≤ t) such that:
§ F(t) is non-decreasing and continuous from the right
§ F(−∞) = 0
§ F(+∞) = 1.
• A probability density function (PDF) = f(x), where x can be a discrete or a continuous variable.
§ Discrete Case: when the random variable X can take a countable number of distinct values x1, x2, x3, ... Then S = {x1, x2, x3, ...},
♦ f(xi) = P(X = xi), and
♦ for A ⊂ S, P(X ∈ A) = Σ_{x∈A} f(x).
§ Continuous Case: when the random variable X can take all possible values between a and b: a ≤ x ≤ b. Then S = [a, b] = {x: a ≤ x ≤ b} and, for A ⊂ S,
♦ P(X ∈ A) = ∫_{x∈A} f(x) dx,
♦ where f(x) = dF(x)/dx when F is differentiable,
♦ F(x0) = P(X ≤ x0) = ∫_{−∞}^{x0} f(x) dx, and
♦ P(X = x) = 0 ≠ f(x).
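A minimal Python sketch of these continuous-case relationships, assuming X is standard normal so that scipy supplies both f (pdf) and F (cdf):

```python
# Check F(x0) = integral of f up to x0, and f = dF/dx numerically.
import numpy as np
from scipy import stats
from scipy.integrate import quad

x0 = 1.3
F_direct = stats.norm.cdf(x0)
F_by_integration, _ = quad(stats.norm.pdf, -np.inf, x0)
print(F_direct, F_by_integration)          # agree to integration tolerance

h = 1e-6                                   # numerical derivative of the CDF
f_approx = (stats.norm.cdf(x0 + h) - stats.norm.cdf(x0 - h)) / (2 * h)
print(f_approx, stats.norm.pdf(x0))        # f(x0) = dF(x0)/dx
```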
• Multivariate Case: consider X, Y, Z, … to be random variables. For simplicity, we consider the case of two random variables, X and Y.
§ Joint Cumulative Distribution Function of (X, Y) is F(x, y) = P(X ≤ x, Y ≤ y).
§ Marginal Cumulative Distribution of X is Fx(x) = F(x, ∞).
§ Marginal Cumulative Distribution of Y is Fy(y) = F(∞, y).
• The Joint Probability Density Function of X and Y is f(x, y), where x and y are assumed to be in the sample space.
§ Discrete Case: f(x, y) = P(X = x, Y = y) and
F(x0, y0) = Σ_{x≤x0} Σ_{y≤y0} f(x, y).
§ Continuous Case: f(x, y) = ∂²F(x, y)/∂x∂y and
F(x0, y0) = ∫_{−∞}^{x0} ∫_{−∞}^{y0} f(x, y) dy dx.
• The Marginal Probability Density Functions are
§ fx(x) = Σ_y f(x, y) in the discrete case
= ∫_{−∞}^{+∞} f(x, y) dy in the continuous case,
§ fy(y) = Σ_x f(x, y) in the discrete case
= ∫_{−∞}^{+∞} f(x, y) dx in the continuous case.
• The random variables (X, Y) are independent if
§ F(x, y) = Fx(x) Fy(y) or
§ f(x, y) = fx(x) fy(y) for all x and y in the sample space.
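A minimal sketch tying the joint, marginal, and independence definitions together, using an assumed 2×2 discrete joint probability table:

```python
# Compute both marginals by summing the joint pmf, then test the
# independence condition f(x, y) = fx(x) * fy(y) at every point.
import numpy as np

f = np.array([[0.1, 0.2],      # f(x, y); rows index x, columns index y
              [0.3, 0.4]])
fx = f.sum(axis=1)             # marginal of X: sum over y
fy = f.sum(axis=0)             # marginal of Y: sum over x
independent = np.allclose(f, np.outer(fx, fy))
print(fx, fy, independent)
```

(This particular table fails the product condition, so the test prints False.)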
• Conditional probability: Let f(x, y) be the joint probability density function at (x, y). Then, for all x and y in the sample space,
§ f(y| x) = f(x, y)/fx(x) represents the conditional probability function of y given x (assuming fx(x) ≠ 0), and
§ f(x| y) = f(x, y)/fy(y) is the conditional probability function of x given y (when fy(y) ≠ 0).
• Bayes theorem: Assuming that fx(x) ≠ 0,
f(y| x) = f(x| y) fy(y) / fx(x)
= f(x| y) fy(y) / [Σ_y f(x| y) fy(y)] in the discrete case
= f(x| y) fy(y) / [∫_{−∞}^{+∞} f(x| y) fy(y) dy] in the continuous case.
In the case where x corresponds to sample information, fy(y) is called the prior probability, f(x| y)
is called the likelihood function of the sample, and f(y| x) is called the posterior probability.
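A minimal sketch of the discrete case; the two-state prior and the likelihood values are assumed numbers chosen only for illustration:

```python
# Posterior = likelihood * prior, normalized by the marginal of x.
import numpy as np

prior = np.array([0.5, 0.5])          # fy(y) for y in {0, 1}
likelihood = np.array([0.9, 0.3])     # f(x | y) for the observed x
posterior = likelihood * prior / np.sum(likelihood * prior)
print(posterior)                      # f(y | x) = [0.75, 0.25]
```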
• Expectations: The expected value of some function g(X) is given by
E[g(X)] = Σ_x g(x) f(x) in the discrete case, and
= ∫_{−∞}^{+∞} g(x) f(x) dx in the continuous case,
where E is the "expectation operator".
§ If g(X) = X, then E(X) is called the mean or average of X.
§ If g(X) = (X − E(X))², then E[(X − E(X))²] is called the variance of X, denoted by V(X). V(X) ≥ 0.
§ V(X) = E[(X − E(X))²]
= E[X² + (E(X))² − 2 X E(X)]
= E(X²) − (E(X))².
Standard Deviation of X = [V(X)]½.
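A minimal simulation sketch of the variance shortcut above (the exponential distribution and the sample size are assumptions):

```python
# Check V(X) = E(X^2) - (E(X))^2 on simulated draws.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=1_000_000)
v_direct = np.mean((x - x.mean()) ** 2)        # E[(X - E(X))^2]
v_shortcut = np.mean(x ** 2) - x.mean() ** 2   # E(X^2) - (E(X))^2
print(v_direct, v_shortcut, np.sqrt(v_direct)) # variance and std. dev., ~4 and ~2
```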
• Covariance between X and Y = Cov(X, Y)
= E[(X − E(X))(Y − E(Y))]
= E[X⋅Y − X⋅E(Y) − Y⋅E(X) + E(X)⋅E(Y)]
= E(X⋅Y) − E(X)⋅E(Y).
• Correlation between X and Y = ρ(X, Y) = Cov(X, Y)/[V(X)⋅V(Y)]½; −1 ≤ ρ ≤ 1.
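A minimal sketch checking the covariance shortcut and the range of ρ; the linear dependence of Y on X is an assumed construction:

```python
# Check Cov(X, Y) = E(XY) - E(X)E(Y) and compute the correlation.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1_000_000)
y = 2.0 * x + rng.normal(size=1_000_000)       # Y correlated with X
cov = np.mean(x * y) - np.mean(x) * np.mean(y) # E(XY) - E(X)E(Y)
rho = cov / np.sqrt(np.var(x) * np.var(y))
print(cov, rho)                                # cov ~ 2, rho ~ 0.894, inside [-1, 1]
```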
• Let X = (X1, X2, ..., Xn)' be an (n×1) random vector with mean E(X) = µ = (µ1, µ2, ..., µn)' (an (n×1) vector) and variance
V(X) = Σ =
⎡ σ11 σ12 … σ1n ⎤
⎢ σ21 σ22 … σ2n ⎥
⎢  ⋮    ⋮   ⋱   ⋮  ⎥
⎣ σn1 σn2 … σnn ⎦ ,
where σii = V(Xi) is the variance of Xi, σij = Cov(Xi, Xj) is the covariance of Xi with Xj, and Σ is an (n×n) symmetric positive semi-definite matrix.
• Let Y = A⋅X + b, where Y = (Y1, Y2, …, Ym)' is an (m×1) random vector, A is an (m×n) known matrix, and b is an (m×1) known vector. Then,
§ E(Y) = A⋅E(X) + b = A⋅µ + b
§ V(Y) = A⋅V(X)⋅A' = A⋅Σ⋅A'.
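A minimal sketch checking both moment formulas by simulation; the particular µ, Σ, A, and b are assumed values:

```python
# Verify E(Y) = A mu + b and V(Y) = A Sigma A' for Y = AX + b.
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
A = np.array([[1.0, 2.0]])                 # (1 x 2), so Y is scalar
b = np.array([3.0])

X = rng.multivariate_normal(mu, Sigma, size=1_000_000)
Y = X @ A.T + b
print(Y.mean(axis=0), A @ mu + b)          # E(Y) = A mu + b = 2
print(np.cov(Y.T), A @ Sigma @ A.T)        # V(Y) = A Sigma A' = 8
```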
• Note: If X and Y are independently distributed with finite variances, then
Cov(X, Y) = 0 and V(X+Y) = V(X) + V(Y).
• Chebyshev inequality: If V(X) exists (i.e., if it is finite), then P[|X − E(X)| ≥ t] ≤ V(X)/t² for any t > 0.
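A minimal sketch of the bound; the unit exponential is an assumed example, while the bound itself holds for any distribution with finite variance:

```python
# Compare the empirical tail probability with the Chebyshev bound V(X)/t^2.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)   # V(X) = 1
for t in (1.0, 2.0, 3.0):
    tail = np.mean(np.abs(x - x.mean()) >= t)
    print(t, tail, x.var() / t ** 2)             # tail <= V(X)/t^2
```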
• Conditional Expectation: Let f(x, y) be a joint probability function, fy(y) be the marginal probability function of y, and h(x| y) = f(x, y)/fy(y) be the conditional probability of x given y.
§ The conditional expectation Ex|y of some function r(x, y) over the random variable x given y is the expectation of r(x, y) based on the conditional probability h(x| y):
Ex|y r(x, y) = Σ_x r(x, y) h(x| y) (assuming that x is a discrete random variable).
§ The unconditional expectation Ex,y of the function r(x, y) is given by
Ex,y r(x, y) = Ey[Ex|y r(x, y)],
where Ex|y is the conditional expectation operator and Ey is the expectation based on the marginal probability of y.
Proof: Ex,y r(x, y) = Σ_{x,y} r(x, y) f(x, y)
= Σ_{x,y} r(x, y) h(x| y) fy(y)
= Σ_y [Σ_x r(x, y) h(x| y)] fy(y)
= Ey[Ex|y r(x, y)].
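A minimal sketch of this law of iterated expectations on an assumed 2×2 discrete joint table, with r(x, y) = x⋅y as an arbitrary test function:

```python
# Check E_{x,y} r = E_y[ E_{x|y} r ] on a small discrete joint pmf.
import numpy as np

f = np.array([[0.1, 0.2],          # f(x, y); rows x in {0,1}, cols y in {0,1}
              [0.3, 0.4]])
xs, ys = np.array([0, 1]), np.array([0, 1])
r = np.outer(xs, ys)               # r(x, y) = x * y

fy = f.sum(axis=0)                 # marginal of y
h = f / fy                         # h(x | y) = f(x, y)/fy(y), column-wise
inner = (r * h).sum(axis=0)        # E_{x|y} r(x, y), one value per y
lhs = (r * f).sum()                # direct unconditional expectation
rhs = (inner * fy).sum()           # E_y of the conditional expectation
print(lhs, rhs)                    # both 0.4
```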
Some Special Continuous Distributions
For each distribution: probability function f(x), mean E(X), and variance V(X).
• Uniform: for a < x < b
f(x) = 1/(b−a); E(X) = (b+a)/2; V(X) = (b−a)²/12
• Normal: x ∼ N(µ, σ²), x = scalar
f(x) = exp[−(x−µ)²/2σ²]/[σ(2π)½]; E(X) = µ; V(X) = σ²
Note: (x − µ)/σ ∼ N(0, 1) is called a standard normal random variable.
• Multivariate Normal: x ∼ N(µ, Σ), x = (n×1) vector
f(x) = (2π)^(−n/2) |Σ|^(−1/2) exp[(−1/2)(x−µ)' Σ⁻¹ (x−µ)]; E(X) = µ = (n×1) vector; V(X) = Σ = (n×n) matrix
• Gamma: for α > 0, β > 0, x > 0
f(x) = [β^α/Γ(α)]⋅x^(α−1)⋅e^(−βx); E(X) = α/β; V(X) = α/β²
• Exponential = Gamma with α = 1
• Chi Square = χ²(k) = Gamma with α = k/2, β = 1/2; k = positive integer (= "degrees of freedom")
If Zi ∼ N(0, 1) and (Z1, …, Zk) are independently distributed, then
Y = (Z1² + Z2² + … + Zk²) ∼ χ²(k) (checked numerically in the sketch after this table).
• T-distribution = t(k)
If Z ∼ N(0, 1), C ∼ χ²(k) (i.e., C has a chi-square distribution with k degrees of freedom), and Z and C are independently distributed, then
t = Z/[C/k]½ has a t-distribution with k degrees of freedom, i.e., t ∼ t(k).
• F-distribution = F(k1, k2)
If C1 ∼ χ²(k1) and C2 ∼ χ²(k2) are independent chi-square random variables with k1 and k2 degrees of freedom, respectively, then
§ F = [C1/k1]/[C2/k2] has an F-distribution with k1 and k2 degrees of freedom, i.e., F ∼ F(k1, k2).
• Pareto: for x > k > 0, α > 0
f(x) = α⋅k^α/x^(α+1); E(X) = α⋅k/(α−1) for α > 1; V(X) = α⋅k²/[(α−2)(α−1)²] for α > 2
• Lognormal: for x > 0, σ > 0
f(x) = exp[−(log(x)−m)²/2σ²]/[x⋅σ⋅(2π)½]; E(X) = exp(m + σ²/2); V(X) = [exp(σ²) − 1]⋅exp(2m + σ²)
Note: Γ(α) = ∫_0^∞ y^(α−1) e^(−y) dy
= 1 if α = 1,
= (α−1)! if α is a positive integer.
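A minimal sketch checking a few rows of this table numerically (the sample size and degrees of freedom are assumptions):

```python
# A sum of k squared standard normals has chi-square(k) moments, which
# match the Gamma(alpha = k/2, beta = 1/2) formulas; and Gamma(alpha)
# equals (alpha - 1)! for a positive integer alpha.
import math
import numpy as np
from scipy.special import gamma as gamma_fn

rng = np.random.default_rng(0)
k = 5
z = rng.normal(size=(1_000_000, k))
y = (z ** 2).sum(axis=1)                   # Z1^2 + ... + Zk^2
print(y.mean(), y.var())                   # ~k and ~2k: chi-square(k) moments

alpha, beta = k / 2, 1 / 2                 # chi-square(k) as a Gamma
print(alpha / beta, alpha / beta ** 2)     # same mean k and variance 2k

print(gamma_fn(6), math.factorial(5))      # Gamma(6) = 5! = 120
```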
Some Special Discrete Distributions
For each distribution: probability function f(x), mean E(X), and variance V(X).
• Binomial: for 0 < p < 1; x = 0, 1, ..., n
f(x) = [n!/(x!(n−x)!)]⋅p^x⋅(1−p)^(n−x); E(X) = n⋅p; V(X) = n⋅p⋅(1−p)
• Bernoulli = Binomial when n = 1
• Negative Binomial: for 0 < p < 1; r = positive integer; x = 0, 1, 2, ...
f(x) = [(r+x−1)!/(x!⋅(r−1)!)]⋅p^r⋅(1−p)^x; E(X) = r⋅(1−p)/p; V(X) = r⋅(1−p)/p²
• Geometric = Negative Binomial when r = 1
• Poisson: for x = 0, 1, 2, ...; λ > 0
f(x) = e^(−λ)⋅λ^x/x!; E(X) = λ; V(X) = λ
• Uniform: for x = 1, 2, ..., n; n = integer
f(x) = 1/n; E(X) = (n+1)/2; V(X) = (n²−1)/12
Note: n! = n ⋅ (n−1) ⋅ (n−2) ⋯ 2 ⋅ 1 = the factorial of n.
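A minimal sketch confirming two rows of this table with scipy.stats (the parameter values are assumptions):

```python
# Binomial mean/variance are np and np(1-p); Poisson has mean = variance.
from scipy import stats

n, p = 10, 0.3
print(stats.binom.mean(n, p), n * p)                    # both 3.0
print(stats.binom.var(n, p), n * p * (1 - p))           # both 2.1

lam = 4.0
print(stats.poisson.mean(lam), stats.poisson.var(lam))  # both 4.0
```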
Some important relationships
§ F(1, k) = t(k)²
§ J⋅F(J, K) ≈ χ²(J) as K → ∞
§ t(K) ≈ N(0, 1) as K → ∞.
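A minimal sketch of the first and third relationships via quantile comparison in scipy.stats (the quantile level and degrees of freedom are assumptions):

```python
# Compare quantiles: F(1, k) vs t(k)^2, and t(K) vs N(0, 1) for large K.
from scipy import stats

k = 20
q = 0.95
print(stats.f.ppf(q, 1, k), stats.t.ppf((1 + q) / 2, k) ** 2)
# F(1, k) = t(k)^2: both ~4.35

print(stats.t.ppf(q, 10_000), stats.norm.ppf(q))
# t(K) ~ N(0, 1) as K grows: both ~1.645
```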