Common discrete distributions
Continuous distributions
PAS04 - Important discrete and continuous
distributions
Jan Březina
Technical University of Liberec
October 30, 2014
Bernoulli trials
Experiment with two possible outcomes:
• yes/no questions
• tossing a coin
• newborn girl/boy
• defect on a product
• test of quality
• success/failure
The probability of success is p.
Bernoulli/Alternative Alt(p)
description: value 1 with probability p, value 0 with prob. 1 − p
values: 0, 1
parameter: probability p
P (X = 1) = p
EX = 1·p + 0·(1 − p) = p
DX = (1 − p)^2 p + (0 − p)^2 (1 − p) = p(1 − p)
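As a quick numerical check of the formulas above, the mean and variance of Alt(p) can be computed directly from the two-point probability table. The slides use R; this is a small Python sketch with an arbitrarily chosen p = 0.3:

```python
# Alt(p): value 1 with probability p, value 0 with probability 1 - p.
p = 0.3                                              # arbitrary example value
pmf = {0: 1 - p, 1: p}

EX = sum(x * q for x, q in pmf.items())              # expectation
DX = sum((x - EX) ** 2 * q for x, q in pmf.items())  # variance

print(round(EX, 10))  # 0.3  = p
print(round(DX, 10))  # 0.21 = p*(1-p)
```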
Binomial Bi(n, p)
description: number of successes in n independent trials
example: k defects on n products; selection with replacement
(non-destructive)
values: 0, . . . , n
parameters: probability p, number of trials n
P (X = k) = C(n, k) p^k (1 − p)^(n−k) for 0 ≤ k ≤ n, where C(n, k) = n!/(k!(n − k)!) is the binomial coefficient
EX = np (calculation with shifting)
DX = np(1 − p)
Notes: Alt(p) = Bi(1, p)
R> plot( dbinom(0:n, n, p) )
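The binomial PMF is easy to check numerically: it must sum to 1 over k = 0, …, n and its mean must be np. A Python sketch with arbitrary example values n = 10, p = 0.4:

```python
from math import comb

def dbinom(k, n, p):
    """P(X = k) for X ~ Bi(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.4                                  # arbitrary example values
probs = [dbinom(k, n, p) for k in range(n + 1)]

total = sum(probs)                              # should be 1
mean = sum(k * q for k, q in enumerate(probs))  # should be n*p
print(round(total, 10), round(mean, 10))        # 1.0 4.0
```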
Hypergeometric H(N, M, n)
description: number of successes in n draws from a set of size N
containing M successes without replacement
example: k cracked eggs among n drawn when there are M cracked eggs and N eggs total in the basket;
quality tests, destructive tests
values: max(0, n + M − N ), . . . , min(M, n)
parameters: number drawn n, total N , total of successes M
P (X = k) = C(M, k) C(N − M, n − k) / C(N, n) = (# of n-draws with k successes) / (# of all n-draws)
EX = nM/N
DX = (nM/N)(1 − M/N)(N − n)/(N − 1)
Notes: H(N, M, n) ≈ Bi(n, M/N ) for big values N/n
R> plot( dhyper( 0:n, M, N-M, n) )
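The hypergeometric PMF and the mean formula EX = nM/N can be cross-checked in Python with `math.comb`; N = 20, M = 6, n = 5 are arbitrary example values:

```python
from math import comb

def dhyper(k, N, M, n):
    """P(X = k) for X ~ H(N, M, n): k successes among n draws without replacement."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

N, M, n = 20, 6, 5                               # arbitrary example values
lo, hi = max(0, n + M - N), min(M, n)            # support of the distribution
probs = {k: dhyper(k, N, M, n) for k in range(lo, hi + 1)}

total = sum(probs.values())
mean = sum(k * q for k, q in probs.items())
print(round(total, 10), round(mean, 10), n * M / N)  # 1.0 1.5 1.5
```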
Geometric G(p)
description: number of trials up to and including the first success
example: number of production cycles until defect
values: 1, . . . , ∞
parameter: probability of success p
P (X = k) = (1 − p)k−1 p
EX = 1/p
DX = (1 − p)/p^2
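The moments EX = 1/p and DX = (1 − p)/p² can be checked by truncating the infinite sums; with p = 0.25 (an arbitrary choice) the tail beyond k = 1000 is negligible:

```python
p = 0.25                                        # arbitrary example value
# Truncate the infinite sums; the tail beyond k = 1000 is negligible here.
terms = [(k, (1 - p)**(k - 1) * p) for k in range(1, 1001)]

EX = sum(k * q for k, q in terms)
DX = sum(k**2 * q for k, q in terms) - EX**2
print(round(EX, 6), 1 / p)                      # 4.0 4.0
print(round(DX, 6), (1 - p) / p**2)             # 12.0 12.0
```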
Negative binomial N B(k, p)
description: number of trials up to and including the k-th success
values: k, . . . , ∞
parameters: probability of success p, number of successes k
P (X = n) = C(n − 1, k − 1) (1 − p)^(n−k) p^k
. . . last success is fixed, selecting k − 1 successes from the first n − 1 trials
EX = k/p
DX = k(1 − p)/p^2
N B(k, p) is sum of k RV G(p)
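A numerical sanity check of the NB(k, p) PMF: truncated over n, it sums to 1 and its mean is k/p. Python sketch with arbitrary example values k = 3, p = 0.5:

```python
from math import comb

def dnbinom(n, k, p):
    """P(X = n): exactly n trials are needed to reach k successes (NB(k, p) above)."""
    return comb(n - 1, k - 1) * (1 - p)**(n - k) * p**k

k, p = 3, 0.5                                        # arbitrary example values
probs = {n: dnbinom(n, k, p) for n in range(k, 200)} # tail beyond 200 is negligible

total = sum(probs.values())
mean = sum(n * q for n, q in probs.items())
print(round(total, 10), round(mean, 10), k / p)      # 1.0 6.0 6.0
```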
Example
Oil company; geological study reveals: 0.2 chance to strike oil per well.
What is prob. that there will be 2 strikes out of 7 wells?
What is prob. that we need to drill 7 wells to gain 2 strikes?
What is prob. that we need to drill more than 5 wells to gain 2 strikes?
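The three questions are answered by the binomial and negative binomial formulas above; a Python sketch of the computation (the third probability is rewritten as "at most 1 strike in the first 5 wells"):

```python
from math import comb

p = 0.2  # chance to strike oil per well

# (a) exactly 2 strikes out of 7 wells: Bi(7, 0.2)
a = comb(7, 2) * p**2 * (1 - p)**5

# (b) exactly 7 wells needed for 2 strikes: NB(2, 0.2),
#     the 7th well is the 2nd strike
b = comb(6, 1) * (1 - p)**5 * p**2

# (c) more than 5 wells needed for 2 strikes
#     = at most 1 strike among the first 5 wells: Bi(5, 0.2)
c = sum(comb(5, k) * p**k * (1 - p)**(5 - k) for k in range(2))

print(round(a, 4), round(b, 4), round(c, 4))  # 0.2753 0.0786 0.7373
```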
Poisson distribution
Poisson process: number of events during (time) interval, assuming that:
• events are evenly distributed with density λ events per unit of time
• events are independent
Example: number of nuclear decays over given time, number of defects
on given length of fabric
values: 0, . . . , ∞
parameters: density λ, period t
P (X = k) = (λt)^k e^(−λt) / k!
Poisson distribution - derivation
Divide the interval t into n pieces, use Bi(n, λt/n) and pass to the limit:

p_k = lim_{n→∞} [n!/(k!(n − k)!)] (λt/n)^k (1 − λt/n)^(n−k)
    = (λt)^k/k! · [n!/(n^k (n − k)!)] · (1 − λt/n)^n · (1 − λt/n)^(−k)

where n!/(n^k (n − k)!) → 1, (1 − λt/n)^n → exp(−λt), and (1 − λt/n)^(−k) → 1, so p_k = (λt)^k e^(−λt)/k!.
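The limit above can be watched numerically: for fixed λt and k, the Bi(n, λt/n) probabilities approach the Poisson value as n grows. A Python sketch with arbitrary example values λt = 2, k = 3:

```python
from math import comb, exp, factorial

lam_t, k = 2.0, 3                  # arbitrary example: lambda*t = 2, k = 3 events
poisson = lam_t**k * exp(-lam_t) / factorial(k)

for n in (10, 100, 10000):
    # Bi(n, lam_t/n) probability of exactly k events
    binom = comb(n, k) * (lam_t / n)**k * (1 - lam_t / n)**(n - k)
    print(n, round(binom, 6))

print(round(poisson, 6))  # the binomial values above approach this
```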
. . . expectation and variance
Using expansion for exp(λt):
Σ_{k=0}^∞ (λt)^k/k! · e^(−λt) = e^(−λt) e^(λt) = 1

similarly for expectation:

EX = Σ_{k=0}^∞ k (λt)^k/k! · e^(−λt) = λt Σ_{k=1}^∞ (λt)^(k−1)/(k − 1)! · e^(−λt) = λt

. . . and variance:

DX = Σ_{k=0}^∞ (k^2 − (EX)^2) p_k = Σ_{k=0}^∞ [k(k − 1) p_k + k p_k − (λt)^2 p_k] = (λt)^2 + λt − (λt)^2 = λt
Exponential distribution E(λ)
X is the time between two events in a Poisson process with density λ; e.g. time until failure.
Consider the random variable Nt ∼ Po(λ, t).
The event {X ≤ t} (the time until the next Poisson event is less than t) is identical with the event {Nt ≥ 1} (there will be at least one Poisson event during time t).
FX (t) = P (X ≤ t) = 1 − P (Nt < 1) = 1 − (λt)^0 e^(−λt)/0! = 1 − e^(−λt)

fX (t) = d/dt FX (t) = λe^(−λt)

. . . we have to assume t > 0.

EX = ∫_0^∞ t λe^(−λt) dt = [−t e^(−λt)]_0^∞ + ∫_0^∞ e^(−λt) dt = [e^(−λt)/(−λ)]_0^∞ = 1/λ

DX = 1/λ^2
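The CDF and the mean EX = 1/λ can be cross-checked numerically; the integral of t·f(t) is approximated by a crude Riemann sum (λ = 0.5 is an arbitrary choice):

```python
from math import exp

lam = 0.5                               # arbitrary example value
F = lambda t: 1 - exp(-lam * t)         # CDF derived above

# EX = 1/lam, checked by a crude Riemann sum of t*f(t) up to t = 40 >> 1/lam
dt = 0.001
EX = sum(i * dt * lam * exp(-lam * i * dt) * dt for i in range(1, 40000))
print(round(EX, 2), 1 / lam)            # 2.0 2.0
print(round(F(1 / lam), 4))             # F(1/lam) = 1 - e^{-1} ≈ 0.6321
```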
Exponential distribution “is without memory”
Time until failure is independent of the history:
the probability of no failure until time a + b, given no failure until time a,
is the same as the probability of no failure until time b:
P (X > a + b | X > a) = (1 − F (a + b))/(1 − F (a)) = e^(−λ(a+b))/e^(−λa) = e^(−λb) = 1 − F (b) = P (X > b)
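The memoryless identity is a one-liner to verify numerically; λ, a, b below are arbitrary example values:

```python
from math import exp

lam, a, b = 1.5, 2.0, 0.7               # arbitrary example values
F = lambda t: 1 - exp(-lam * t)

lhs = (1 - F(a + b)) / (1 - F(a))       # P(X > a+b | X > a)
rhs = 1 - F(b)                          # P(X > b)
print(round(lhs, 10), round(rhs, 10))   # both equal e^{-lam*b}
```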
Erlang distribution Erlang(k, λ)
X is the time until the k-th event in a Poisson process with density λ. A particular case of the more general Gamma distribution (defined also for non-integer k).
fX (t) = λe^(−λt) (λt)^(k−1)/(k − 1)!

FX (t) = 1 − e^(−λt) Σ_{i=0}^{k−1} (λt)^i/i!

EX = k/λ
DX = k/λ^2
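The closed-form CDF (a Poisson tail sum) can be checked against a crude numeric integral of the density; k = 3, λ = 2, t = 1.5 are arbitrary example values:

```python
from math import exp, factorial

k, lam, t = 3, 2.0, 1.5                 # arbitrary example values

# closed form from the Poisson-process view
F_closed = 1 - exp(-lam * t) * sum((lam * t)**i / factorial(i) for i in range(k))

# crude numeric integral of the density f(s) = lam*e^{-lam*s}*(lam*s)^{k-1}/(k-1)!
ds = 1e-4
steps = int(t / ds)
F_num = sum(lam * exp(-lam * s) * (lam * s)**(k - 1) / factorial(k - 1) * ds
            for s in (i * ds for i in range(1, steps + 1)))

print(round(F_closed, 4), round(F_num, 4))  # nearly equal
```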
Relation between Bernoulli and Poisson process
Uniform distribution U (a, b)
density:

f (x) = 1/(b − a) for x ∈ [a, b], 0 elsewhere

CDF:

F (x) = 0 for x < a, (x − a)/(b − a) for x ∈ [a, b], 1 for x > b

mean value:

EX = ∫_a^b x/(b − a) dx = (b^2 − a^2)/(2(b − a)) = (a + b)/2

variance:

DX = ∫_a^b ((a + b)/2 − x)^2 · 1/(b − a) dx = (a − b)^2/12
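Both formulas can be verified with a crude Riemann sum over [a, b]; the interval [2, 5] is an arbitrary example:

```python
a, b = 2.0, 5.0                         # arbitrary example interval
dx = 1e-4
xs = [a + i * dx for i in range(int((b - a) / dx))]

EX = sum(x / (b - a) * dx for x in xs)                 # integral of x*f(x)
DX = sum((x - (a + b) / 2)**2 / (b - a) * dx for x in xs)  # integral of (x-EX)^2*f(x)
print(round(EX, 3), (a + b) / 2)        # 3.5 3.5
print(round(DX, 3), (b - a)**2 / 12)    # 0.75 0.75
```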
Properties of uniform distribution.
Theorem
For arbitrary RV X with continuous increasing CDF FX the random
variable
Y = FX (X)
has uniform distribution U (0, 1).
proof: P (Y ≤ y) = P (X ≤ F_X^(−1)(y)) = F_X(F_X^(−1)(y)) = y
It also holds in the other direction:
Theorem
Let Y ∼ U(0, 1) and let F be a distribution function; then X = F^(−1)(Y) is a random variable with CDF F.
• In the latter theorem, F can be an arbitrary CDF (even discontinuous).
• Computer generators of (pseudo)random numbers usually produce numbers with distribution U(0, 1).
• The second theorem can be used to generate random numbers with a prescribed distribution (an approximation is used in practice).
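The second theorem in action: a sketch of inverse-transform sampling, generating E(λ) values from U(0, 1) via F^(−1)(y) = −ln(1 − y)/λ (seed and λ are arbitrary choices):

```python
import random
from math import log

random.seed(1)                          # arbitrary seed for reproducibility
lam = 2.0

# F(t) = 1 - e^{-lam*t}  =>  F^{-1}(y) = -ln(1 - y)/lam
sample = [-log(1 - random.random()) / lam for _ in range(100_000)]

mean = sum(sample) / len(sample)
print(round(mean, 2))                   # close to 1/lam = 0.5
```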
Weibull distribution W (λ, β)
Time until failure with shape/age parameter β.
FX (t) = 1 − exp(−(t/λ)^β) . . . similar to the exponential distribution.
Meaning of parameter β
• β < 1: failure rate decreases over time, “infant mortality”
• β = 1: failure rate is constant over time, exponential distribution
• β > 1: failure rate increases with time, “aging” process
R> rweibull(n, beta, lambda)
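The three regimes of β can be seen directly from the failure intensity λ(t) = f(t)/(1 − F(t)), which for the Weibull CDF above simplifies to (β/λ)(t/λ)^(β−1). A Python sketch with arbitrary evaluation points t = 0.5 and t = 2:

```python
def hazard(t, lam, beta):
    """Failure intensity lambda(t) = f(t)/(1 - F(t)) of W(lam, beta);
    for the Weibull CDF it simplifies to (beta/lam) * (t/lam)**(beta - 1)."""
    return (beta / lam) * (t / lam)**(beta - 1)

lam = 1.0                               # arbitrary scale
for beta in (0.5, 1.0, 2.0):
    early, late = hazard(0.5, lam, beta), hazard(2.0, lam, beta)
    print(beta, round(early, 3), round(late, 3))
# beta < 1: rate decreases; beta = 1: constant; beta > 1: rate increases
```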
Weibull distribution - influence of parameter β
intensity of failures λ(t) = f (t)/(1 − F (t)):
Weibull distribution - influence of parameter β
probability density function:
Normal distribution N (µ, σ 2 )
Arises as a sum of a large number of independent RVs. Natural events, NOT social events.
density:
f (x) = 1/√(2πσ^2) exp(−(x − µ)^2/(2σ^2))

CDF: F (x) = Φ((x − µ)/σ) . . . integral of the density, no closed formula
EX = µ
DX = σ 2
ND - density
Standard normal distribution N (0, 1)
Standardization of a normal random variable X ∼ N (µ, σ^2):

Y = (X − µ)/σ ∼ N (0, 1)

and vice versa, X = µ + σY.
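Standardization can be illustrated on simulated data: a normal sample shifted by µ and scaled by σ has sample mean ≈ 0 and sample variance ≈ 1. The parameters and seed below are arbitrary choices:

```python
import random

random.seed(7)                          # arbitrary seed
mu, sigma = 10.0, 2.0                   # arbitrary example parameters

xs = [random.gauss(mu, sigma) for _ in range(100_000)]
ys = [(x - mu) / sigma for x in xs]     # standardized values

m = sum(ys) / len(ys)
v = sum((y - m)**2 for y in ys) / len(ys)
print(round(m, 1), round(v, 1))         # close to 0.0 and 1.0
```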
Log-normal distribution LN (µ, σ 2 )
X is log-normal if ln X has normal distribution, so X = exp(µ + σZ)
where Z ∼ N (0, 1).
density:

f (x) = 1/(x√(2πσ^2)) exp(−(ln x − µ)^2/(2σ^2))
Presence of normality
When many random factors add up (central limit theorem):
• velocity of molecules
• measurements
• biological values (often log-normal, after separating male/female)