Download X - math.fme.vutbr.cz

Function of a random variable
Let X be a random variable in a probabilistic space $(\Omega, S, P)$
with a probability distribution F(x).
Sometimes we may be interested in another random variable Y
that is a function of X, that is, Y = g(X). The question is whether
we can establish the probability distribution G(y) of Y.
Let g(x) be increasing on R(X).
Since F(x) is the probability distribution of X and G(y) of Y, we
can write
$$F(a) = P(X < a)$$
and
$$G(b) = P(Y < b) = P(g(X) < b).$$
Since g(x) is increasing, it has an inverse $g^{-1}(x)$, which is also increasing, so
$$P(g(X) < b) = P(g^{-1}(g(X)) < g^{-1}(b)) = P(X < g^{-1}(b)) = F(g^{-1}(b)).$$
This means that
$$G(y) = F(g^{-1}(y)).$$
If g(x) is decreasing on R(X), we can proceed in a similar
way, but since the inverse of a decreasing function is again
decreasing, applying it reverses the inequality, and so
$$P(g(X) < b) = P(g^{-1}(g(X)) > g^{-1}(b)) = P(X > g^{-1}(b)) = 1 - P(X \le g^{-1}(b)) = 1 - F(g^{-1}(b)),$$
so that $G(y) = 1 - F(g^{-1}(y))$. The last equality assumes that $P(X = g^{-1}(b)) = 0$, which holds, for example, when F is continuous.
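The two formulas above can be checked numerically. The following is a minimal sketch, assuming X is uniform on [0,1] (so F(x) = x on that interval) and two illustrative transformations, g(x) = x² (increasing on [0,1]) and g(x) = 1 − x (decreasing). These choices, the sample size and the helper name `empirical_cdf` are assumptions made for illustration only.

```python
import random


def empirical_cdf(samples, y):
    """Fraction of samples strictly below y, an estimate of G(y) = P(Y < y)."""
    return sum(1 for s in samples if s < y) / len(samples)


rng = random.Random(0)  # fixed seed so that the run is repeatable
xs = [rng.random() for _ in range(100_000)]  # X uniform on [0, 1], F(x) = x

y = 0.25

# Increasing case: g(x) = x**2, inverse sqrt(y), so G(y) = F(sqrt(y)) = sqrt(y).
g_increasing_empirical = empirical_cdf([x ** 2 for x in xs], y)
g_increasing_formula = y ** 0.5

# Decreasing case: g(x) = 1 - x, inverse 1 - y, so G(y) = 1 - F(1 - y) = y.
g_decreasing_empirical = empirical_cdf([1 - x for x in xs], y)
g_decreasing_formula = 1 - (1 - y)

print(g_increasing_empirical, g_increasing_formula)
print(g_decreasing_empirical, g_decreasing_formula)
```

In both cases the empirical value should agree with the formula up to sampling noise.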
Sometimes we can use even more sophisticated methods as shown
in the following example.
Random variable X has the standardized normal distribution $\Phi(x)$;
calculate the distribution of the random variable $Y = X^2$.
$$\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2}\,dt, \qquad G(y) = \,?$$
For $y > 0$,
$$G(y) = P(X^2 < y) = P(-\sqrt{y} < X < \sqrt{y}) = \Phi(\sqrt{y}) - \Phi(-\sqrt{y})$$
$$= \Phi(\sqrt{y}) - (1 - \Phi(\sqrt{y})) = 2\Phi(\sqrt{y}) - 1,$$
$$G(y) = \frac{2}{\sqrt{2\pi}} \int_{-\infty}^{\sqrt{y}} e^{-t^2/2}\,dt - 1.$$
We cannot express the last integral in closed form, but we can establish the probability density of Y by differentiating G(y):
$$g(y) = \frac{dG(y)}{dy} = \frac{2}{\sqrt{2\pi}}\, e^{-y/2} \cdot \frac{1}{2\sqrt{y}} = \frac{1}{\sqrt{2\pi}}\, y^{-1/2} e^{-y/2}, \qquad y > 0.$$
This is the probability density of the random variable $X^2$. We
have, in fact, calculated the density of the chi-squared distribution
with one degree of freedom:
$$\chi^2_1(x) = \frac{1}{\sqrt{2\pi}}\, x^{-1/2} e^{-x/2}, \qquad x > 0.$$
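The differentiation step can be verified numerically. The sketch below is only an illustration: it evaluates $\Phi$ through the error function from Python's standard library, and the function names (`phi`, `G`, `chi2_1_density`, `numerical_derivative`) and the step size are assumptions introduced here, not part of the original text.

```python
import math


def phi(x):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def G(y):
    """Distribution function of Y = X**2 for y > 0: G(y) = 2*Phi(sqrt(y)) - 1."""
    return 2.0 * phi(math.sqrt(y)) - 1.0


def chi2_1_density(y):
    """Density obtained by differentiating G: y**(-1/2) * exp(-y/2) / sqrt(2*pi)."""
    return math.exp(-y / 2.0) / math.sqrt(2.0 * math.pi * y)


def numerical_derivative(func, y, h=1e-5):
    """Central difference approximation of the derivative of func at y."""
    return (func(y + h) - func(y - h)) / (2.0 * h)


for y in (0.5, 1.0, 2.0):
    print(y, numerical_derivative(G, y), chi2_1_density(y))
```

For each y the two printed numbers should agree to several decimal places.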
Example
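Random variable X has a uniform distribution on [0,1]. Find a
transformation g(X) such that Y = g(X) has a given distribution F(y).
X has the density $f_X(x) = 1$ for $x \in [0,1]$, $f_X(x) = 0$ otherwise,
and the distribution
$$F_X(x) = 0 \text{ for } x < 0, \quad F_X(x) = x \text{ for } x \in [0,1], \quad F_X(x) = 1 \text{ for } x > 1$$
(written with the subscript X to distinguish it from the target distribution F).
[Figure: graph of the distribution of X, the uniform distribution on [0,1]]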
The target F(x) is a distribution and as such has R as its domain and [0,1]
as its range. This means that, if F(x) is increasing, it has an
inverse $F^{-1}(x)$ that is also increasing, with range R and domain
[0,1].
Consider the transformation $Y = F^{-1}(X)$. We have
$$P(Y < a) = P(F^{-1}(X) < a) = P(F(F^{-1}(X)) < F(a)) = P(X < F(a)) = F(a),$$
where the last equality holds because X is uniform on [0,1], so that $P(X < u) = u$ for $u \in [0,1]$.
Thus, we have shown that F(y) is the distribution of $Y = F^{-1}(X)$.
This can be used for example for simulating a distribution using
a pocket calculator.
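The simulation idea can be sketched in a few lines. As an illustration only, the target is assumed to be the exponential distribution $F(y) = 1 - e^{-\lambda y}$ for $y \ge 0$, whose inverse is $F^{-1}(u) = -\ln(1 - u)/\lambda$. The rate value, the sample size and the function name `inverse_exponential_cdf` are assumptions made for this example.

```python
import math
import random


def inverse_exponential_cdf(u, rate):
    """Inverse of F(y) = 1 - exp(-rate * y), defined for 0 <= u < 1."""
    return -math.log(1.0 - u) / rate


rng = random.Random(1)  # fixed seed so that the run is repeatable
rate = 2.0              # illustrative parameter of the target distribution

# Y = F^{-1}(X) with X uniform on [0, 1)
samples = [inverse_exponential_cdf(rng.random(), rate) for _ in range(100_000)]

a = 0.5
empirical = sum(1 for s in samples if s < a) / len(samples)  # estimate of P(Y < a)
exact = 1.0 - math.exp(-rate * a)                            # F(a)
print(empirical, exact)
```

The empirical fraction of samples below a should be close to F(a), which is what the derivation above predicts.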
Random vector
POPULATION
Persons chosen at random:
Person 1: Height: 115 cm, Weight: 17 kg, No. of children: 0, Employed: No
Person 2: Height: 195 cm, Weight: 98 kg, No. of children: 4, Employed: Yes
Person 3: Height: 170 cm, Weight: 80 kg, No. of children: 2, Employed: No
The sequence of random variables
X1 = Height
X2 = Weight
X3 = Number of children
X4 = Employed
is an example of a random vector.
Generally, a random vector $X = (X_1, X_2, \ldots, X_n)$ assigns a
vector $(x_1, x_2, \ldots, x_n)$ of real numbers to each outcome $\omega \in \Omega$:
$$X : \Omega \to \underbrace{R \times R \times \cdots \times R}_{n \text{ times}}$$
Given a probabilistic space $\mathcal{P} = (\Omega, S, P)$, a mapping
$$X = (X_1, X_2, \ldots, X_n) : \Omega \to R^n$$
is called a random vector if
$$\{X_1 < a_1\} \cap \{X_2 < a_2\} \cap \cdots \cap \{X_n < a_n\} \in S$$
for every $(a_1, a_2, \ldots, a_n) \in R^n$.
Probability distribution of a random vector
Let us consider a probabilistic space $\mathcal{P} = (\Omega, S, P)$.
For a random vector $X = (X_1, X_2, \ldots, X_n)$ we define its
probability distribution $F(x_1, x_2, \ldots, x_n)$ as follows:
$$F(x_1, x_2, \ldots, x_n) = P(X_1 < x_1 \cap X_2 < x_2 \cap \cdots \cap X_n < x_n)$$
Properties of the distribution of a random vector
The probability distribution $F(x_1, x_2, \ldots, x_n)$ of a random
vector $X = (X_1, X_2, \ldots, X_n)$ has the following properties:
$F(x_1, x_2, \ldots, x_n)$ is non-decreasing and continuous on the left
in each of its independent variables,
$$\lim_{x_i \to -\infty} F(x_1, x_2, \ldots, x_n) = 0, \quad i = 1, 2, \ldots, n,$$
$$\lim_{x_1 \to \infty,\, x_2 \to \infty,\, \ldots,\, x_n \to \infty} F(x_1, x_2, \ldots, x_n) = 1.$$
Discrete random vectors
A random vector $X = (X_1, X_2, \ldots, X_n)$ is called discrete if its
range is a finite or a countable set of real vectors
$$(x_1^1, x_2^1, \ldots, x_n^1),\ (x_1^2, x_2^2, \ldots, x_n^2),\ \ldots$$
For a discrete random vector, we can define the probability
function $p(x_1, x_2, \ldots, x_n)$:
$$p(x_1, x_2, \ldots, x_n) = P(X_1 = x_1 \cap X_2 = x_2 \cap \cdots \cap X_n = x_n)$$
The relationship between a probability distribution and
a probability function is
$$F(x_1, x_2, \ldots, x_n) = \sum_{t_1 < x_1,\, \ldots,\, t_n < x_n} p(t_1, t_2, \ldots, t_n)$$
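The summation above can be illustrated for n = 2. This is a minimal sketch: the probability table `p` is hypothetical, and the strict inequalities follow the convention F(x) = P(X < x) suggested by the left-continuity property stated in this text.

```python
# Hypothetical probability function of a discrete random vector (X1, X2).
p = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}


def F(x1, x2):
    """F(x1, x2) = sum of p(t1, t2) over all points with t1 < x1 and t2 < x2."""
    return sum(prob for (t1, t2), prob in p.items() if t1 < x1 and t2 < x2)


print(F(1, 1))  # only the point (0, 0) lies strictly below (1, 1)
print(F(2, 1))  # the points (0, 0) and (1, 0) are included
print(F(2, 2))  # every point is included, so the total probability is returned
```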
Continuous random vectors
A random vector $X = (X_1, X_2, \ldots, X_n)$ is called continuous if
its range includes a Cartesian product of n intervals
$$[a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n] \subset R^n.$$
If a function $f(x_1, x_2, \ldots, x_n)$ exists such that
$$F(x_1, x_2, \ldots, x_n) = \int_{-\infty}^{x_1} \cdots \int_{-\infty}^{x_n} f(t_1, t_2, \ldots, t_n)\, dt_n \cdots dt_1,$$
we say that $f(x_1, x_2, \ldots, x_n)$ is the probability density of
the random vector $X = (X_1, X_2, \ldots, X_n)$.
Marginal distributions
For a random vector $X = (X_1, X_2, \ldots, X_n)$ with a distribution
$F(x_1, x_2, \ldots, x_n)$ we define marginal distributions
$F_1(x_1), F_2(x_2), \ldots, F_n(x_n)$:
$$F_i(x_i) = \lim_{\substack{x_1 \to \infty,\, \ldots,\, x_{i-1} \to \infty \\ x_{i+1} \to \infty,\, \ldots,\, x_n \to \infty}} F(x_1, x_2, \ldots, x_n)$$
That is, $F_i(x_i)$ is the limit of $F(x_1, x_2, \ldots, x_n)$ with all variables except
$x_i$ tending to infinity.
If a random vector $X = (X_1, X_2, \ldots, X_n)$ is discrete, we define
its marginal probability functions $p_1(x_1), p_2(x_2), \ldots, p_n(x_n)$:
$$p_i(x_i) = \sum_{x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n} p(x_1, x_2, \ldots, x_n)$$
where the summation is done over all the values of the variables
$x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$.
If a random vector $X = (X_1, X_2, \ldots, X_n)$ is continuous, we define
its marginal probability densities $f_1(x_1), f_2(x_2), \ldots, f_n(x_n)$:
$$f_i(x_i) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, x_2, \ldots, x_n)\, dx_1 \cdots dx_{i-1}\, dx_{i+1} \cdots dx_n$$
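A short worked instance of the marginal density formula for n = 2 may help. The density used here, $f(x, y) = x + y$ on the unit square, is an assumed illustration and does not come from the original text.

```latex
% Assumed example density: f(x,y) = x + y for (x,y) in [0,1]^2, and 0 otherwise.
% It is a valid density because it is non-negative and integrates to 1:
\int_0^1 \int_0^1 (x + y)\, dx\, dy = \tfrac{1}{2} + \tfrac{1}{2} = 1 .
% Marginal density of X: integrate out y.
f_1(x) = \int_{-\infty}^{\infty} f(x, y)\, dy = \int_0^1 (x + y)\, dy = x + \tfrac{1}{2},
\qquad x \in [0,1] .
% Marginal density of Y: integrate out x (by symmetry).
f_2(y) = \int_0^1 (x + y)\, dx = y + \tfrac{1}{2}, \qquad y \in [0,1] .
```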
By considering a marginal distribution $F_i(x_i)$ of a random
vector $X = (X_1, X_2, \ldots, X_n)$ we actually define a single random
variable $X_i$ by "neglecting" all other variables.
Example for n = 2
Let (X,Y) be a discrete random vector with X taking on values
from the set {1,2,3,4}, Y from the set {-1,1,3,5,7} and the
probability function given by the table below. Calculate the
marginal probability functions of the random variables X and Y.
           X = 1    X = 2    X = 3    X = 4
Y = -1     0.008    0.020    0.240    0.130
Y = 1      0.020    0.001    0.021    0.114
Y = 3      0.110    0.022    0.014    0.002
Y = 5      0.013    0.115    0.003    0.010
Y = 7      0.003    0.010    0.050    0.094

The same table with the marginal probability functions added as row and column sums:

           X = 1    X = 2    X = 3    X = 4    p2(y)
Y = -1     0.008    0.020    0.240    0.130    0.398
Y = 1      0.020    0.001    0.021    0.114    0.156
Y = 3      0.110    0.022    0.014    0.002    0.148
Y = 5      0.013    0.115    0.003    0.010    0.141
Y = 7      0.003    0.010    0.050    0.094    0.157
p1(x)      0.154    0.168    0.328    0.350    1
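The marginal probability functions in this example are just the column and row sums of the table. A minimal sketch of that computation follows; the variable names and the row/column layout of `joint` are choices made for this illustration.

```python
x_values = [1, 2, 3, 4]
y_values = [-1, 1, 3, 5, 7]

# joint[i][j] = p(x_values[j], y_values[i]); rows follow Y, columns follow X.
joint = [
    [0.008, 0.020, 0.240, 0.130],
    [0.020, 0.001, 0.021, 0.114],
    [0.110, 0.022, 0.014, 0.002],
    [0.013, 0.115, 0.003, 0.010],
    [0.003, 0.010, 0.050, 0.094],
]

# p1(x): sum over all values of y (column sums).
p1 = [sum(row[j] for row in joint) for j in range(len(x_values))]
# p2(y): sum over all values of x (row sums).
p2 = [sum(row) for row in joint]

print([round(v, 3) for v in p1])
print([round(v, 3) for v in p2])
```

Up to floating-point rounding, the printed values should match the p1(x) row and the p2(y) column of the table.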
Let $X = (X_1, X_2, \ldots, X_n)$ be a random vector with a distribution
$F(x_1, x_2, \ldots, x_n)$ and marginal distributions
$F_1(x_1), F_2(x_2), \ldots, F_n(x_n)$.
If $F(x_1, x_2, \ldots, x_n) = F_1(x_1) F_2(x_2) \cdots F_n(x_n)$ we say that
$X_1, X_2, \ldots, X_n$ are independent random variables.
If $X_1, X_2, \ldots, X_n$ are independent and have a probability
function p or density f, it can be proved that also
$$p(x_1, x_2, \ldots, x_n) = p_1(x_1)\, p_2(x_2) \cdots p_n(x_n)$$
or
$$f(x_1, x_2, \ldots, x_n) = f_1(x_1)\, f_2(x_2) \cdots f_n(x_n).$$
Correlation coefficient of two random variables
Let us consider a random vector (X,Y) with a distribution F(x,y).
Using the marginal distributions $F_X(x)$ and $F_Y(y)$ we can define
the expectations E(X) and E(Y) and the variances D(X) and D(Y). We
define the covariance of the random vector (X,Y) as
$$\operatorname{cov}(X,Y) = E(XY) - E(X)E(Y)$$
with
$$E(XY) = \sum_{x_i,\, y_j} x_i\, y_j\, p(x_i, y_j)$$
or
$$E(XY) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} xy\, f(x, y)\, dx\, dy$$
depending on whether (X,Y) is discrete or continuous. Here
$p(x, y)$ or $f(x, y)$ is the probability function or density, respectively.
The correlation coefficient is then defined as
$$\rho(X,Y) = \frac{\operatorname{cov}(X,Y)}{\sqrt{D(X)\,D(Y)}}$$
$\rho(X,Y)$ has the following properties:
$$-1 \le \rho(X,Y) \le 1$$
$$X \text{ and } Y \text{ are independent} \Rightarrow \rho(X,Y) = 0$$
$$Y = aX + b,\ a > 0 \Rightarrow \rho(X,Y) = 1$$
$$Y = aX + b,\ a < 0 \Rightarrow \rho(X,Y) = -1$$
Example
Calculate the correlation coefficient of the discrete random vector
(X,Y) with a probability function given by the table below.
           X = 1    X = 2    X = 3    X = 4    p2(y)
Y = -1     0.008    0.020    0.240    0.130    0.398
Y = 1      0.020    0.001    0.021    0.114    0.156
Y = 3      0.110    0.022    0.014    0.002    0.148
Y = 5      0.013    0.115    0.003    0.010    0.141
Y = 7      0.003    0.010    0.050    0.094    0.157
p1(x)      0.154    0.168    0.328    0.350    1
$E(X) = 2.874$, $E(X^2) = 9.378$, $D(X) = 1.118124$
$E(Y) = 2.006$, $E(Y^2) = 13.104$, $D(Y) = 9.079964$
$E(XY) = 5.168$, $\operatorname{cov}(X,Y) = -0.59724$, $\rho(X,Y) = -0.18744$
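The figures in this example can be reproduced directly from the table with the formulas for E, D, cov and the correlation coefficient given earlier. This is a minimal sketch; the helper `expectation` and the variable names are assumptions introduced for illustration.

```python
import math

x_values = [1, 2, 3, 4]
y_values = [-1, 1, 3, 5, 7]

# joint[i][j] = p(x_values[j], y_values[i]); rows follow Y, columns follow X.
joint = [
    [0.008, 0.020, 0.240, 0.130],
    [0.020, 0.001, 0.021, 0.114],
    [0.110, 0.022, 0.014, 0.002],
    [0.013, 0.115, 0.003, 0.010],
    [0.003, 0.010, 0.050, 0.094],
]


def expectation(h):
    """E[h(X, Y)] computed as a sum over all cells of the table."""
    return sum(
        h(x, y) * joint[i][j]
        for i, y in enumerate(y_values)
        for j, x in enumerate(x_values)
    )


e_x = expectation(lambda x, y: x)
e_y = expectation(lambda x, y: y)
d_x = expectation(lambda x, y: x * x) - e_x ** 2   # D(X) = E(X^2) - E(X)^2
d_y = expectation(lambda x, y: y * y) - e_y ** 2   # D(Y) = E(Y^2) - E(Y)^2
cov = expectation(lambda x, y: x * y) - e_x * e_y  # cov(X,Y) = E(XY) - E(X)E(Y)
rho = cov / math.sqrt(d_x * d_y)

print(e_x, e_y, d_x, d_y, cov, rho)
```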
Note that, generally, $\rho(X,Y) = 0$ does not imply that X
and Y are independent, as shown by the following example.
           X = 1    X = 2    X = 3    p2(y)
Y = -1     0.10     0.15     0.05     0.3
Y = 0      0.20     0.10     0.10     0.4
Y = 1      0.15     0.05     0.10     0.3
p1(x)      0.45     0.30     0.25     1
The correlation coefficient is zero, as can be easily calculated,
but we have, for example, $p_1(1)\, p_2(-1) = 0.45 \cdot 0.3 = 0.135 \ne 0.1 = p(1,-1)$,
so that X and Y cannot be independent.
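Both claims about this table, zero covariance and lack of independence, can be checked mechanically. The sketch below is an illustration only; the tolerance `1e-12` and the variable names are assumptions chosen to absorb floating-point rounding.

```python
x_values = [1, 2, 3]
y_values = [-1, 0, 1]

# joint[i][j] = p(x_values[j], y_values[i]); rows follow Y, columns follow X.
joint = [
    [0.10, 0.15, 0.05],
    [0.20, 0.10, 0.10],
    [0.15, 0.05, 0.10],
]

p1 = [sum(row[j] for row in joint) for j in range(len(x_values))]  # marginal of X
p2 = [sum(row) for row in joint]                                   # marginal of Y

e_x = sum(x * p for x, p in zip(x_values, p1))
e_y = sum(y * p for y, p in zip(y_values, p2))
e_xy = sum(
    x * y * joint[i][j]
    for i, y in enumerate(y_values)
    for j, x in enumerate(x_values)
)
cov = e_xy - e_x * e_y  # zero covariance means zero correlation coefficient

# Independence would require p(x, y) = p1(x) * p2(y) in every cell.
independent = all(
    abs(joint[i][j] - p1[j] * p2[i]) < 1e-12
    for i in range(len(y_values))
    for j in range(len(x_values))
)

print(cov, independent)
```

The covariance comes out as zero up to rounding, while the product test fails already in the cell X = 1, Y = -1.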