L2: Lecture notes: Distributions

Distributions of random variables
A random variable (r.v.) is a real function
X: S → R on the sample space S (a quantitative
aspect of the random experiment).
The range is $S_X = \{X(s) \mid s \in S\}$.
We distinguish discrete and continuous r.v.’s:
X is discrete iff (= if and only if) $S_X$ is countable,
that is: $S_X = \{x_1, x_2, \ldots, x_n\}$ or $S_X = \{x_1, x_2, \ldots\}$
If X is continuous, then SX is usually an interval
in R. (a definition follows later)
Discrete r.v. X: the event X = x is $\{s \in S \mid X(s) = x\}$
The probability (mass) function of X:
P: x → P(X=x)
Requirements for a probability function P:
1. $P(X = x) \ge 0$ for every $x \in S_X$
2. $\sum_{x \in S_X} P(X = x) = 1$
The distribution of a discrete r.v. X consists of a
table or formula for P(X=x) for all $x \in S_X$.
The measure for the centre of the distribution is the
Expectation or Expected value of X:
$E(X) = \sum_{x \in S_X} x \, P(X = x)$,
provided that the sum is absolutely convergent.
Notation: E(X) = EX = µX = µ.
Interpretation: “ Weighted average”.
Properties of E(X):
1. If P(X=x) is symmetric about x = c,
then E(X) = c
2. $E\,g(X) = \sum_{x \in S_X} g(x) \, P(X = x)$
3. E(aX + b) = aE(X) + b
4. E[a g(X) + b h(X)] = a Eg(X) + b Eh(X)
Note that in general $E(X^2) \ne (E(X))^2$!
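The properties above can be checked on a small example. The following is a minimal Python sketch, assuming a fair six-sided die as a hypothetical distribution (the die and the helper name `expectation` are illustrative and not taken from the notes). Exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction

# Assumed example: a fair die, P(X = x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E g(X): the sum of g(x) * P(X = x) over the range of X."""
    return sum(g(x) * p for x, p in pmf.items())

mean = expectation(pmf)                            # E(X)
second_moment = expectation(pmf, lambda x: x * x)  # E(X^2)

print(mean)           # 7/2, the point of symmetry of the distribution
print(second_moment)  # 91/6
print(mean ** 2)      # 49/4, which differs from E(X^2)
```

The same helper also illustrates linearity: `expectation(pmf, lambda x: 2 * x + 3)` equals `2 * mean + 3`.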
Functions of X and their expectation:
$E(X^k)$ is the kth moment of X
$\mathrm{var}(X) = E\left[(X - \mu_X)^2\right]$ is the variance of X
Notation: $\mathrm{var}(X) = \sigma_X^2 = \sigma^2$.
$\sigma_X = \sqrt{\mathrm{var}(X)}$ is the standard deviation of the
r.v. X
var(X) and σX are both measures of spread for
the distribution.
Properties of var(X) and σX :
1. var(X) ≥ 0 and σX ≥ 0
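2. $\mathrm{var}(X) = E(X^2) - \mu_X^2$
(formula for computations)
3. $\mathrm{var}(X) > 0 \Rightarrow E(X^2) > \mu_X^2$
$\mathrm{var}(X) = 0 \Rightarrow P(X = \mu_X) = 1$
4. $\mathrm{var}(aX + b) = a^2 \, \mathrm{var}(X)$ and $\sigma_{aX+b} = |a| \, \sigma_X$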
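A short Python sketch of the computational formula and the scaling rule, again assuming a fair die as a hypothetical example; the helper names `variance` and `transform` are illustrative only.

```python
from fractions import Fraction

# Assumed example: a fair die, P(X = x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E g(X) for a discrete distribution given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """var(X) via the computational formula E(X^2) - mu^2."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: x * x) - mu ** 2

def transform(pmf, a, b):
    """Distribution of aX + b, assuming a != 0 so that values stay distinct."""
    return {a * x + b: p for x, p in pmf.items()}

var_x = variance(pmf)                    # 35/12
var_y = variance(transform(pmf, -2, 5))  # (-2)^2 * 35/12 = 35/3

print(var_x, var_y)
```

The definition $E[(X - \mu_X)^2]$ gives the same value as the computational formula, which can be confirmed with `expectation(pmf, lambda x: (x - expectation(pmf)) ** 2)`.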
Chebyshev’s inequality:
$P(|X - \mu_X| \ge c) \le \dfrac{\mathrm{var}(X)}{c^2}$ for all c > 0
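To see how loose or tight the bound can be, the sketch below compares the exact tail probability with the Chebyshev bound. It assumes a fair die as a hypothetical distribution and a few arbitrarily chosen values of c.

```python
from fractions import Fraction

# Assumed example: a fair die, P(X = x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}
mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

def tail_probability(c):
    """Exact value of P(|X - mu| >= c)."""
    return sum(p for x, p in pmf.items() if abs(x - mu) >= c)

def chebyshev_bound(c):
    """Upper bound var(X) / c^2 from Chebyshev's inequality."""
    return var / c ** 2

for c in (Fraction(1), Fraction(2), Fraction(5, 2)):
    print(c, tail_probability(c), chebyshev_bound(c))
```

For c = 1 the bound exceeds 1 and carries no information; it becomes useful only when c is larger than the standard deviation.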
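Discrete distributions and characteristics:
Name: P(X = k); E(X); var(X)
Binomial B(n, p): $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$ for k = 0, 1, …, n; E(X) = np; var(X) = np(1-p)
Poisson (µ): $P(X = k) = \frac{\mu^k}{k!} e^{-\mu}$, k = 0, 1, …; E(X) = µ; var(X) = µ
Geometrical (p): $P(X = k) = (1-p)^{k-1} p$, k = 1, 2, …; E(X) = $\frac{1}{p}$; var(X) = $\frac{1-p}{p^2}$
Hypergeometrical: $P(X = k) = \binom{R}{k} \binom{N-R}{n-k} \Big/ \binom{N}{n}$, k = 0, 1, …, n; E(X) = np; var(X) = $np(1-p) \times \frac{N-n}{N-1}$ (with $p = \frac{R}{N}$)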
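The formulas for the mean and variance in the table can be checked numerically. The sketch below is one possible way to do so using only the Python standard library; the parameter values are arbitrary assumptions, and the infinite ranges of the Poisson and geometrical distributions are truncated where the remaining probability is negligible.

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, mu):
    return mu**k * math.exp(-mu) / math.factorial(k)

def geometric_pmf(k, p):
    return (1 - p) ** (k - 1) * p

def hypergeometric_pmf(k, N, R, n):
    return math.comb(R, k) * math.comb(N - R, n - k) / math.comb(N, n)

def mean_and_variance(pmf, support):
    """Mean and variance by direct summation over the given support."""
    mean = sum(k * pmf(k) for k in support)
    var = sum(k * k * pmf(k) for k in support) - mean**2
    return mean, var

# Assumed parameter values, chosen only for illustration.
binom = mean_and_variance(lambda k: binomial_pmf(k, 25, 1 / 6), range(26))
poisson = mean_and_variance(lambda k: poisson_pmf(k, 3.0), range(100))
geometric = mean_and_variance(lambda k: geometric_pmf(k, 1 / 6), range(1, 2000))
hyper = mean_and_variance(lambda k: hypergeometric_pmf(k, 20, 12, 5), range(6))

print(binom, poisson, geometric, hyper)
```

Each pair should agree, up to floating-point error, with the E(X) and var(X) entries of the corresponding row.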
Properties (linking the distributions):
1. If the parts of the population are large
compared to the sample size (> $5n^2$), the
hypergeometrical probabilities can be
approximated by binomial probabilities.
2. If X ~ B(n, p) for large n and small p so that
np > 10, X is approximately Poisson distributed with µ = np.
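Both approximations can be inspected numerically. The following sketch compares the probability functions pointwise; the population and sample sizes are assumptions chosen to satisfy the stated conditions, not values from the notes.

```python
import math

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, mu):
    return mu**k * math.exp(-mu) / math.factorial(k)

def hypergeometric_pmf(k, N, R, n):
    return math.comb(R, k) * math.comb(N - R, n - k) / math.comb(N, n)

# Property 1 (assumed values): both parts of the population exceed 5 * n^2 = 500.
n1, N, R = 10, 10_000, 2_000
diff_hyper = max(
    abs(hypergeometric_pmf(k, N, R, n1) - binomial_pmf(k, n1, R / N))
    for k in range(n1 + 1)
)

# Property 2 (assumed values): large n, small p, np = 12.
n2, p2 = 1000, 0.012
diff_poisson = max(
    abs(binomial_pmf(k, n2, p2) - poisson_pmf(k, n2 * p2))
    for k in range(41)  # probabilities beyond k = 40 are negligible here
)

print(diff_hyper, diff_poisson)
```

Both maximal differences are small compared with the probabilities themselves, which is what the two properties claim.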
When to use these common discrete
distributions?
Binomial
“the number of successes in n Bernoulli trials”
Ex: X = “# sixes in 25 rolls of a die”
X ~ B(25, 1/6)
Geometrical
“the trial number of the first success when
performing Bernoulli trials”
Ex: X = “# of the first roll of a die that results
in a 6”
Property: $P(X > k) = (1-p)^k$, k = 0, 1, 2, …
Hypergeometrical
“The number of red balls selected when n
balls are selected at random without
replacement from an urn that contains R red and
N-R white balls”
Ex: X = “# of girls when 5 persons are
selected at random from a group of 8 boys and
12 girls”
Poisson
“The number of rare events in a period and/or
space”
Ex: X = “# of traffic accidents on a busy road
on a day”.
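Two of the examples above can be worked out exactly. The sketch below is illustrative only: it assumes the die example with p = 1/6 for the geometrical distribution, and treats the 12 girls as the R marked items in a population of N = 20 for the hypergeometrical one.

```python
from fractions import Fraction
from math import comb

# Geometrical example: number of the first roll of a die that gives a 6.
p = Fraction(1, 6)

def geometric_pmf(k):
    return (1 - p) ** (k - 1) * p

def geometric_tail(k):
    """P(X > k), computed as 1 minus the sum of P(X = j) for j = 1, ..., k."""
    return 1 - sum(geometric_pmf(j) for j in range(1, k + 1))

# The tail agrees with the closed form (1 - p)^k.
print(geometric_tail(3), (1 - p) ** 3)

# Hypergeometrical example: 5 persons drawn from 8 boys and 12 girls.
N, R, n = 20, 12, 5

def girls_pmf(k):
    return Fraction(comb(R, k) * comb(N - R, n - k), comb(N, n))

p_three_girls = girls_pmf(3)
print(p_three_girls)  # 385/969
```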
Continuous random variables
X is a continuous random variable if there exists
a non-negative function f(x), defined for all real x, so that
for every (measurable) set B:
$P(X \in B) = \int_B f(x)\,dx$
f(x) or $f_X(x)$ is the probability density function.
Requirements: 1. $f(x) \ge 0$
2. $\int_{-\infty}^{\infty} f(x)\,dx = 1$
Note: f(x) is not a probability, but for
small dx > 0 we have $P(x < X \le x + dx) \approx f(x)\,dx$
$F(x) = P(X \le x) = \int_{-\infty}^{x} f(u)\,du$ is the cumulative
distribution function (c.d.f.).
Notation: $F_X(x) = F(x)$
Properties of F(x) for every random variable X:
1. $a < b \Rightarrow F(a) \le F(b)$
(F is non-decreasing)
2. $\lim_{x \to \infty} F(x) = 1$ and $\lim_{x \to -\infty} F(x) = 0$
3. $\lim_{x \downarrow a} F(x) = F(a)$
(F is right continuous)
4. $P(X > x) = 1 - F(x)$
5. $P(X = x) = F(x) - \lim_{u \uparrow x} F(u)$
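Since these properties hold for every random variable, they can be illustrated with a discrete one, where F is a step function. The sketch assumes a fair die as a hypothetical example; the function names are illustrative.

```python
from fractions import Fraction

# Assumed example: a fair die, P(X = x) = 1/6 for x = 1, ..., 6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x)."""
    return sum(p for k, p in pmf.items() if k <= x)

def cdf_left_limit(x):
    """Limit of F(u) as u increases to x, which equals P(X < x)."""
    return sum(p for k, p in pmf.items() if k < x)

print(cdf(3))                      # 1/2
print(1 - cdf(4))                  # 1/3, this is P(X > 4)
print(cdf(3) - cdf_left_limit(3))  # 1/6, the jump of F at x = 3 is P(X = 3)
```

At a point that is not in the range of X, such as x = 3.5, the jump is 0 and F is continuous there.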
Properties of density function f and c.d.f. F of a
continuous r.v. X:
1. F(x) is a continuous function
2. $f(x) = \frac{d}{dx} F(x)$
3. P(X = x) = 0
4. $P(a < X < b) = P(a \le X \le b) = F(b) - F(a) = \int_a^b f(x)\,dx$
The expectation of a continuous r.v. X
$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
(provided that the integral is absolutely
convergent).
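A numerical sketch of the density, the c.d.f. and the expectation, assuming the hypothetical density f(x) = 2x on [0, 1] (so that F(x) = x² there). The midpoint rule used for integration is a simple choice for illustration, not a method prescribed by the notes.

```python
def density(x):
    """Assumed density: f(x) = 2x on [0, 1] and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def cdf(x):
    """F(x) = x^2 on [0, 1], obtained by integrating the density."""
    if x < 0.0:
        return 0.0
    if x > 1.0:
        return 1.0
    return x * x

def integrate(func, lower, upper, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

total = integrate(density, 0.0, 1.0)                  # close to 1
prob = integrate(density, 0.2, 0.5)                   # close to F(0.5) - F(0.2)
mean = integrate(lambda x: x * density(x), 0.0, 1.0)  # close to 2/3

print(total, prob, cdf(0.5) - cdf(0.2), mean)
```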
Distribution of a function Y = g(X) of a continuous r.v. X, when
the density function $f_X$ is known:
The density function $f_Y(y)$ can be determined in
3 steps:
1. Express $F_Y(y) = P(g(X) \le y)$ in $F_X$.
2. Determine $f_Y(y) = \frac{d}{dy} F_Y(y)$.
3. Use the known distribution $f_X$.
$E(Y) = E\,g(X) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$
Especially: $E(X^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$
All properties of E(X) and var(X) hold for
continuous random variables as well,
e.g. $\mathrm{var}(X) = E(X^2) - \mu_X^2$
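As a worked illustration of the three steps for a function Y = g(X), take the assumed example X ~ U(0, 1) and g(x) = x² (this example is not from the notes). For 0 < y < 1, and using that X only takes non-negative values:

```latex
% Assumed example: X ~ U(0,1), so F_X(x) = x and f_X(x) = 1 on (0,1); Y = X^2.
\begin{align*}
F_Y(y) &= P(X^2 \le y) = P(X \le \sqrt{y}) = F_X(\sqrt{y}) = \sqrt{y}, \\
f_Y(y) &= \frac{d}{dy} F_Y(y) = \frac{1}{2\sqrt{y}}\, f_X(\sqrt{y}) = \frac{1}{2\sqrt{y}}, \\
E(Y) &= \int_0^1 x^2 f_X(x)\,dx = \frac{1}{3}
        = \int_0^1 y \cdot \frac{1}{2\sqrt{y}}\,dy, \\
\mathrm{var}(Y) &= E(Y^2) - (E\,Y)^2 = \int_0^1 x^4\,dx - \frac{1}{9}
        = \frac{1}{5} - \frac{1}{9} = \frac{4}{45}.
\end{align*}
```

The expectation comes out the same whether it is computed from $f_X$ through $E\,g(X)$ or directly from the derived density $f_Y$.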
Properties of $f_X(x)$:
1. If $f_X(x)$ is symmetric about x = c, then E(X) = c
2. Linear transformation: Y = aX + b with a ≠ 0
(known $f_X(x)$): $f_Y(y) = \frac{1}{|a|} f_X\left(\frac{y-b}{a}\right)$
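The linear-transformation rule $f_Y(y) = \frac{1}{|a|} f_X\left(\frac{y-b}{a}\right)$ can be checked numerically. The sketch below assumes the hypothetical density f_X(x) = 2x on [0, 1] and the arbitrary values a = -2, b = 5, so that Y takes values in [3, 5].

```python
def density_x(x):
    """Assumed density of X: f_X(x) = 2x on [0, 1] and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def density_y(y, a, b):
    """Density of Y = aX + b (a != 0): f_X((y - b) / a) / |a|."""
    return density_x((y - b) / a) / abs(a)

def integrate(func, lower, upper, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (i + 0.5) * width) for i in range(steps))

a, b = -2.0, 5.0
total = integrate(lambda y: density_y(y, a, b), 3.0, 5.0)       # close to 1
mean_y = integrate(lambda y: y * density_y(y, a, b), 3.0, 5.0)  # close to a * E(X) + b

print(total, mean_y, a * (2.0 / 3.0) + b)
```

The transformed density integrates to 1 and its mean matches aE(X) + b, with E(X) = 2/3 for this assumed density.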
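Common continuous distributions
Name: probability density function; E(X); var(X)
Uniform U(a, b): $f(x) = \frac{1}{b-a}$ for x in [a, b]; E(X) = $\frac{a+b}{2}$; var(X) = $\frac{(b-a)^2}{12}$
Exponential (λ): $f(x) = \lambda e^{-\lambda x}$ for x ≥ 0; E(X) = $\frac{1}{\lambda}$; var(X) = $\frac{1}{\lambda^2}$
Standard normal N(0, 1): $\varphi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{1}{2}x^2}$; E(X) = 0; var(X) = 1
Normal N(µ, σ²): $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$; E(X) = µ; var(X) = σ²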
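The E(X) and var(X) entries for these densities can be reproduced by numerical integration. The sketch below uses a midpoint rule and arbitrary assumed parameter values; the exponential rate is called `lam` here, and the infinite ranges are cut off where the density is negligible.

```python
import math

def uniform_pdf(x, a, b):
    return 1.0 / (b - a) if a <= x <= b else 0.0

def exponential_pdf(x, lam):
    return lam * math.exp(-lam * x) if x >= 0.0 else 0.0

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / math.sqrt(2 * math.pi * sigma**2)

def mean_and_variance(pdf, lower, upper, steps=20_000):
    """Midpoint-rule approximations of E(X) and var(X) = E(X^2) - E(X)^2."""
    width = (upper - lower) / steps
    points = [lower + (i + 0.5) * width for i in range(steps)]
    mean = width * sum(x * pdf(x) for x in points)
    second_moment = width * sum(x * x * pdf(x) for x in points)
    return mean, second_moment - mean**2

# Assumed parameter values, chosen only for illustration.
uniform_stats = mean_and_variance(lambda x: uniform_pdf(x, 2.0, 5.0), 2.0, 5.0)
exponential_stats = mean_and_variance(lambda x: exponential_pdf(x, 2.0), 0.0, 40.0)
normal_stats = mean_and_variance(lambda x: normal_pdf(x, 1.0, 2.0), -19.0, 21.0)

print(uniform_stats)      # compare with (a + b) / 2 and (b - a)^2 / 12
print(exponential_stats)  # compare with 1 / lam and 1 / lam^2
print(normal_stats)       # compare with mu and sigma^2
```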
These distributions are often used as a model
of the stochastic reality:
Uniform: random numbers from an interval
Exponential: waiting times, serving times
Normal: quantities or variables in nature,
economy etc, varying around an average
Some properties of these continuous
distributions
1. An exponential variable has no memory:
P(X > x + y | X > x) = P(X > y).
This follows from the exponential property
$P(X > x) = e^{-\lambda x}$, x ≥ 0
2. If X ~ U(0, 1),
then Y = (b – a)X + a ~ U(a, b).
3. If X ~ N(µ, σ²),
then Y = aX + b ~ N(aµ+b, a²σ²) for a ≠ 0.
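The three properties can be explored with a short simulation. This is a sketch under stated assumptions: the rate `lam`, the interval [2, 5], and the normal parameters are arbitrary, the exponential tail is taken as P(X > x) = exp(-lam * x), and the uniform variable is mapped with Y = (b - a)X + a. A fixed seed keeps the run reproducible.

```python
import math
import random

# 1. Memoryless property, assuming P(X > t) = exp(-lam * t).
lam, x, y = 0.5, 2.0, 3.0

def survival(t):
    return math.exp(-lam * t)

conditional = survival(x + y) / survival(x)  # P(X > x + y | X > x)
print(conditional, survival(y))

def sample_mean(values):
    return sum(values) / len(values)

def sample_variance(values):
    centre = sample_mean(values)
    return sum((v - centre) ** 2 for v in values) / (len(values) - 1)

rng = random.Random(0)  # fixed seed for reproducibility

# 2. Y = (b - a) * X + a with X ~ U(0, 1) lands in [a, b].
a, b = 2.0, 5.0
uniform_samples = [(b - a) * rng.random() + a for _ in range(100_000)]
print(min(uniform_samples), max(uniform_samples), sample_mean(uniform_samples))

# 3. Y = c * X + d with X ~ N(mu, sigma^2) has mean c * mu + d and variance c^2 * sigma^2.
mu, sigma, c, d = 1.0, 2.0, -3.0, 4.0
normal_samples = [c * rng.gauss(mu, sigma) + d for _ in range(100_000)]
print(sample_mean(normal_samples), sample_variance(normal_samples))
```

A simulation only makes the properties plausible: sample means and variances fluctuate around the theoretical values (a + b)/2, cµ + d and c²σ².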