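Probability Theory
Chapter 1
Multivariate random variables
Definition 1.1. An n-dimensional random variable, or random vector, $\mathbf{X}$ is a function from the probability space to $\mathbb{R}^n$, that is,
$$\mathbf{X} : \Omega \to \mathbb{R}^n.$$
The joint distribution function is defined by
$$F_{\mathbf{X}}(x_1, x_2, \dots, x_n) = P(X_1 \le x_1, X_2 \le x_2, \dots, X_n \le x_n)$$
for $x_1, x_2, \dots, x_n \in \mathbb{R}$. The joint distribution function can also be written in a more compact way by using vector notation:
$$F_{\mathbf{X}}(\mathbf{x}) = P(\mathbf{X} \le \mathbf{x}), \quad \mathbf{x} \in \mathbb{R}^n.$$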
Thommy Perlinger, Probability Theory
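In the discrete case the joint probability function is defined by
$$p_{\mathbf{X}}(x_1, x_2, \dots, x_n) = P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n)$$
for $x_1, x_2, \dots, x_n \in \mathbb{R}$, or by using vector notation
$$p_{\mathbf{X}}(\mathbf{x}) = P(\mathbf{X} = \mathbf{x}).$$
In the continuous case the joint density function is defined by
$$f_{\mathbf{X}}(x_1, x_2, \dots, x_n) = \frac{\partial^n F_{\mathbf{X}}(x_1, x_2, \dots, x_n)}{\partial x_1 \, \partial x_2 \cdots \partial x_n}$$
for $x_1, x_2, \dots, x_n \in \mathbb{R}$, or by using vector notation
$$f_{\mathbf{X}}(\mathbf{x}) = \frac{d^n F_{\mathbf{X}}(\mathbf{x})}{d\mathbf{x}^n}.$$
Marginal distributions
When the joint distribution of $\mathbf{X}$ is known it is possible to determine the marginal distribution of any sub-vector of $\mathbf{X}$.
In the special case n=2, i.e. $\mathbf{X} = (X, Y)'$, it is easily shown that
$$p_X(x) = \sum_{y} p_{X,Y}(x, y) \quad \text{and} \quad p_Y(y) = \sum_{x} p_{X,Y}(x, y)$$
in the discrete case, and that in the continuous case
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dy \quad \text{and} \quad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x, y)\,dx.$$
Care must be taken when determining the limits of summation/integration: the sum or integral runs only over the values for which the joint function is positive.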
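As a small illustration of the marginal sums in the discrete case, the sketch below uses a hypothetical joint probability function that is uniform on the triangular support 0 ≤ y ≤ x ≤ 2. The support, the probabilities and the helper name `marginal` are assumptions made for this example only. Because the support is not rectangular, the range of summation over y depends on x.

```python
from fractions import Fraction

# Hypothetical joint probability function: uniform on the triangular
# support {(x, y) : 0 <= y <= x <= 2}, chosen only for illustration.
support = [(x, y) for x in range(3) for y in range(x + 1)]
joint = {point: Fraction(1, len(support)) for point in support}


def marginal(joint_pmf, axis):
    """Sum the joint probability function over the other coordinate."""
    result = {}
    for point, prob in joint_pmf.items():
        key = point[axis]
        result[key] = result.get(key, Fraction(0)) + prob
    return result


p_x = marginal(joint, axis=0)  # p_X(x) = sum over y of p_{X,Y}(x, y)
p_y = marginal(joint, axis=1)  # p_Y(y) = sum over x of p_{X,Y}(x, y)
print(p_x)
print(p_y)
```

For this stand-in the marginal of X puts mass 1/6, 1/3, 1/2 on 0, 1, 2, and the marginal of Y puts mass 1/2, 1/3, 1/6 on the same points.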
Problem 1.3.13 (part of)
Let the joint density function of $\mathbf{X} = (X, Y)'$ be given by
[equation missing]
Determine the marginal density function of Y. The easiest way to correctly determine the limits of integration is by describing the domain of $\mathbf{X} = (X, Y)'$ graphically.
It is now easy to derive that
$$f_Y(y) = \int f_{X,Y}(x, y)\,dx = 1, \quad 0 < y < 1.$$
That is, Y is U(0,1).
Independent random variables
It is not in general possible to determine the joint distribution of a random vector $\mathbf{X}$ knowing only the marginal distributions of its components.
The components of $\mathbf{X}$ are independent if and only if (iff)
$$p_{\mathbf{X}}(\mathbf{x}) = \prod_{k=1}^{n} p_{X_k}(x_k)$$
in the discrete case and
$$f_{\mathbf{X}}(\mathbf{x}) = \prod_{k=1}^{n} f_{X_k}(x_k)$$
in the continuous case.
Exercise 1.2 and 1.3
It is often cumbersome to determine whether two random variables X and Y are independent. The following result is useful.
X and Y are independent if and only if
1. …the domain of (X,Y) is rectangular, that is, there exist constants a, b, c and d (possibly infinite) such that a ≤ x ≤ b, c ≤ y ≤ d, and
2. …the joint density/probability function can be written as a product of two functions g(x) and h(y), where g is a function of x only and h is a function of y only.
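The two conditions can be checked mechanically in the discrete case. The sketch below is only an illustration: the probability tables and the names `marginal` and `is_independent` are hypothetical. It compares the joint probability function with the product of its marginals on the full rectangle spanned by the marginal supports, so a non-rectangular domain fails the test automatically.

```python
from fractions import Fraction
from itertools import product


def marginal(joint_pmf, axis):
    """Sum the joint probability function over the other coordinate."""
    result = {}
    for point, prob in joint_pmf.items():
        result[point[axis]] = result.get(point[axis], Fraction(0)) + prob
    return result


def is_independent(joint_pmf):
    """True iff p(x, y) = p_X(x) * p_Y(y) on the whole rectangle of values."""
    p_x = marginal(joint_pmf, 0)
    p_y = marginal(joint_pmf, 1)
    return all(
        joint_pmf.get((x, y), Fraction(0)) == p_x[x] * p_y[y]
        for x, y in product(p_x, p_y)
    )


# Rectangular support and joint function g(x) * h(y): independent.
g = {0: Fraction(1, 4), 1: Fraction(3, 4)}
h = {0: Fraction(1, 3), 1: Fraction(2, 3)}
rectangular = {(x, y): g[x] * h[y] for x in g for y in h}

# Triangular support {0 <= y <= x <= 2}: not rectangular, so dependent.
triangular = {(x, y): Fraction(1, 6) for x in range(3) for y in range(x + 1)}

print(is_independent(rectangular))  # True
print(is_independent(triangular))   # False
```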
Covariance and correlation
The concepts of covariance and correlation are used to measure the strength of the (linear) dependence between X and Y. The covariance is defined by
$$\operatorname{Cov}(X, Y) = E\big[(X - E X)(Y - E Y)\big].$$
When deriving the covariance manually it is easier to use the formula
$$\operatorname{Cov}(X, Y) = E(XY) - E(X)\,E(Y).$$
When X and Y are independent there is no dependence, and Cov(X,Y)=0. The converse does not hold in general.
The covariance is scale-dependent. A measure of linear dependence that is scale-invariant is given by the correlation coefficient
$$\rho_{X,Y} = \frac{\operatorname{Cov}(X, Y)}{\sqrt{\operatorname{Var}(X)\,\operatorname{Var}(Y)}}.$$
The Transformation Theorem
One-dimensional case
Let g(x) be strictly increasing, and suppose that we are interested in the random variable Y=g(X), where X has density $f_X(x)$.
Since g(x) is strictly increasing it has an inverse, which is also strictly increasing, i.e.,
$$F_Y(y) = P(g(X) \le y) = P(X \le g^{-1}(y)) = F_X(g^{-1}(y)).$$
We differentiate with respect to y, and by the chain rule it follows that
$$f_Y(y) = f_X(g^{-1}(y)) \cdot \frac{d}{dy} g^{-1}(y).$$
When g(x) is strictly decreasing the inverse is also strictly decreasing, i.e.,
$$F_Y(y) = P(g(X) \le y) = P(X \ge g^{-1}(y)) = 1 - F_X(g^{-1}(y)).$$
We differentiate with respect to y, and by the chain rule it follows that
$$f_Y(y) = -f_X(g^{-1}(y)) \cdot \frac{d}{dy} g^{-1}(y).$$
Since $g^{-1}(y)$ is strictly decreasing, $dg^{-1}(y)/dy$ is negative, which means that it is possible to formulate one single transformation theorem for strictly monotone functions.
Theorem. Let X be a continuous random variable with density function $f_X(x)$. Further, let g(x) be a function that is strictly monotone on the domain of X. It then follows that Y=g(X) is a continuous random variable with density function
$$f_Y(y) = f_X(g^{-1}(y)) \cdot \left| \frac{d}{dy} g^{-1}(y) \right|.$$
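The one-dimensional transformation theorem translates almost literally into code. The sketch below is an illustration under assumed inputs: the helper name `transformed_density`, the choice X ∈ Exp(1) and the decreasing map g(x) = exp(-x) are not taken from the slides. For this choice the inverse is -ln y with derivative -1/y, so the theorem gives a constant density on (0, 1).

```python
import math


def transformed_density(f_x, g_inv, dg_inv_dy):
    """Return f_Y for Y = g(X), with g strictly monotone.

    f_Y(y) = f_X(g^{-1}(y)) * |d g^{-1}(y) / dy|
    """
    def f_y(y):
        return f_x(g_inv(y)) * abs(dg_inv_dy(y))
    return f_y


# Assumed example: X ~ Exp(1) and g(x) = exp(-x), strictly decreasing on x > 0.
def f_x(x):
    return math.exp(-x) if x > 0 else 0.0


def g_inv(y):
    return -math.log(y)   # inverse of g on 0 < y < 1


def dg_inv_dy(y):
    return -1.0 / y       # negative, since g is decreasing


f_y = transformed_density(f_x, g_inv, dg_inv_dy)
for y in (0.1, 0.5, 0.9):
    print(y, f_y(y))      # each value is 1 up to rounding: Y is U(0, 1)
```

The absolute value is what lets one function cover both the increasing and the decreasing case.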
Example.
The Rayleigh – Exponential relationship
A density function sometimes used by engineers to model lengths of life of electronic components is the Rayleigh density, given by
$$f_X(x) = \frac{2x}{\theta}\, e^{-x^2/\theta}, \quad x > 0.$$
Consider $Y = g(X) = X^2$. Since the domain of X is the positive reals, g is strictly increasing there, which means that the transformation theorem can be applied.
Since $Y = g(X) = X^2$ it follows that
$$g^{-1}(y) = \sqrt{y}, \qquad \frac{d}{dy} g^{-1}(y) = \frac{1}{2\sqrt{y}}.$$
Thus by the transformation theorem
$$f_Y(y) = \frac{2\sqrt{y}}{\theta}\, e^{-y/\theta} \cdot \frac{1}{2\sqrt{y}} = \frac{1}{\theta}\, e^{-y/\theta}, \quad y > 0,$$
which we recognize as the density function of Exp(θ).
When we observe that g(x) is strictly increasing it is, however, not necessary to do it the "formal way". We can just as easily use the distribution function:
$$F_Y(y) = P(X^2 \le y) = P(X \le \sqrt{y}) = F_X(\sqrt{y}).$$
We differentiate with respect to y, and by the chain rule it follows that
$$f_Y(y) = f_X(\sqrt{y}) \cdot \frac{1}{2\sqrt{y}} = \frac{1}{\theta}\, e^{-y/\theta}, \quad y > 0.$$
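The Rayleigh to exponential relationship can be checked numerically. The sketch below assumes the Rayleigh density in the form (2x/θ)·exp(-x²/θ) for x > 0 and an exponential density with mean θ; the value of `THETA` and the function names are hypothetical choices for the illustration.

```python
import math

THETA = 2.0  # hypothetical parameter value, used only for the check


def rayleigh_density(x, theta=THETA):
    """Assumed Rayleigh form: (2x / theta) * exp(-x^2 / theta) for x > 0."""
    return (2.0 * x / theta) * math.exp(-x * x / theta) if x > 0 else 0.0


def density_of_square(y, theta=THETA):
    """f_Y(y) = f_X(sqrt(y)) * d sqrt(y)/dy for Y = X^2 and y > 0."""
    root = math.sqrt(y)
    return rayleigh_density(root, theta) / (2.0 * root)


def exponential_density(y, theta=THETA):
    """Density of Exp(theta), parametrized by its mean theta."""
    return math.exp(-y / theta) / theta


for y in (0.5, 1.0, 3.0):
    # The two columns agree up to floating-point rounding.
    print(y, density_of_square(y), exponential_density(y))
```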
The Transformation Theorem
n-dimensional case (Conditions)
Let $\mathbf{X}$ be a continuous random vector with density function $f_{\mathbf{X}}(\mathbf{x})$ with its mass concentrated on $S \subset \mathbb{R}^n$.
Further, let $g = (g_1, g_2, \dots, g_n)$ be a bijection from S to $T \subset \mathbb{R}^n$.
Now consider the n-dimensional random vector $\mathbf{Y} = g(\mathbf{X})$, that is, the n one-dimensional random variables
$$Y_1 = g_1(X_1, X_2, \dots, X_n), \quad Y_2 = g_2(X_1, X_2, \dots, X_n), \quad \dots, \quad Y_n = g_n(X_1, X_2, \dots, X_n).$$
We finally assume that g and its inverse are continuously differentiable.
Determinants
Notation. The determinant of a square matrix A is denoted by det A or |A|.
Computation. The determinant of a 2×2 matrix A is given by
$$\det A = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{12} a_{21}.$$
The algebraic complement $A_{ij}$ of the element $a_{ij}$ is the matrix that remains after deleting the i-th row and the j-th column of A.
The determinant of an n×n matrix A can be derived recursively via
$$\det A = \sum_{j=1}^{n} (-1)^{i+j}\, a_{ij} \det A_{ij}$$
for any fixed row i.
The purpose. The absolute value of the determinant is the generalized volume, in n-dimensional space, of the parallelepiped spanned by the column vectors of A.
Example: Solving a 3×3 determinant
Let us find the determinant of a certain 3×3 matrix A, where A is given by
[equation missing]
In order to make the calculations as easy as possible we expand along the first row. It then follows that
[equation missing]
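The recursive expansion along the first row can be sketched in a few lines. The matrix `A` below is a hypothetical example, not the matrix from the slide, and the function names `minor` and `det` are chosen for this illustration.

```python
def minor(matrix, i, j):
    """Matrix that remains after deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(matrix) if k != i]


def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    if n == 2:
        return matrix[0][0] * matrix[1][1] - matrix[0][1] * matrix[1][0]
    return sum(
        (-1) ** j * matrix[0][j] * det(minor(matrix, 0, j))
        for j in range(n)
    )


# Hypothetical 3x3 matrix used only to exercise the recursion.
A = [[1, 2, 3],
     [0, 1, 4],
     [5, 6, 0]]
print(det(A))  # 1
```

Here the expansion gives 1·(0 − 24) − 2·(0 − 20) + 3·(0 − 5) = 1. The recursion is convenient for small matrices by hand, but its cost grows factorially with n, so it is not how determinants are computed for large matrices.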
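The Transformation Theorem
n-dimensional case
Theorem 2.1. The density function of $\mathbf{Y}$ is
$$f_{\mathbf{Y}}(\mathbf{y}) = f_{\mathbf{X}}\big(h_1(\mathbf{y}), h_2(\mathbf{y}), \dots, h_n(\mathbf{y})\big) \cdot |J|,$$
where h is the inverse of g and where
$$J = \det \begin{pmatrix} \dfrac{\partial x_1}{\partial y_1} & \cdots & \dfrac{\partial x_1}{\partial y_n} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial x_n}{\partial y_1} & \cdots & \dfrac{\partial x_n}{\partial y_n} \end{pmatrix}$$
is the Jacobian.
Problem 1.3.21
Let the joint density function of $\mathbf{X} = (X, Y)'$ be given by
[equation missing]
Determine the joint density function of $\mathbf{U} = (U, V)'$ where U=XY and V=X.
It is obvious that this is a bijection and therefore Theorem 2.1 is applicable. Inversion yields
$$X = V, \qquad Y = \frac{U}{V},$$
that is, $h_1(u, v) = v$ and $h_2(u, v) = u/v$, so that
$$J = \det \begin{pmatrix} 0 & 1 \\ 1/v & -u/v^2 \end{pmatrix} = -\frac{1}{v}.$$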
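The change of variables U = XY, V = X can be tried out numerically. Since the joint density of the problem is not reproduced here, the sketch below uses a stand-in density, x·exp(-x(1+y)) for x, y > 0, which is purely an assumption chosen because it integrates to one and gives a simple result. With x = v, y = u/v and |J| = 1/v, Theorem 2.1 turns this stand-in into a product of two Exp(1) densities.

```python
import math


def f_xy(x, y):
    """Stand-in joint density (an assumption, not taken from the problem)."""
    return x * math.exp(-x * (1.0 + y)) if x > 0 and y > 0 else 0.0


def f_uv(u, v):
    """Theorem 2.1 for U = XY, V = X: x = v, y = u/v, |J| = 1/v."""
    if u <= 0 or v <= 0:
        return 0.0
    return f_xy(v, u / v) / v


for u, v in [(0.5, 0.5), (1.0, 2.0), (3.0, 0.7)]:
    # Second column: exp(-u) * exp(-v), a product of a function of u only
    # and a function of v only on a rectangular domain.
    print(f_uv(u, v), math.exp(-u) * math.exp(-v))
```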
Problem 1.3.21 (continued)
By Theorem 2.1 we now obtain
[equation missing]
and it is clear that U and V are independent, equidistributed random variables with density function
[equation missing]
The Transformation Theorem
Auxiliary variables
In many situations we are only interested in the probability distribution of a one-dimensional function of a random vector.
In order to use Theorem 2.1 to find this distribution we have to introduce auxiliary variables.
These auxiliary variables can be chosen freely, as long as the resulting mapping is a bijection, which means that we define them to make the computations as easy as possible.
When the joint density function $f_{\mathbf{Y}}(\mathbf{y})$ is obtained we find the sought marginal distribution by integrating over the auxiliary variables.
Problem 1.3.18
Let X ∈ Exp(1) and Y ∈ U(0,1) be independent random variables. Determine the density function of U=X+Y.
To find the density function for U we introduce the auxiliary variable V=Y, which means that
$$U = X + Y, \qquad V = Y,$$
that is,
$$X = U - V, \qquad Y = V,$$
and so
$$J = \det \begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix} = 1.$$
By Theorem 2.1 we now obtain
$$f_{U,V}(u, v) = f_X(u - v)\, f_Y(v) \cdot |J| = e^{-(u - v)}$$
for 0<v<u<∞, v<1.
In order to find the marginal distribution of U we have to integrate the joint density function over v. However, care has to be taken in order to find the correct limits of integration.
We have to break up the problem into two parts: 0<u<1 and u≥1.
In the case 0<u<1 we get
$$f_U(u) = \int_0^u e^{-(u - v)}\,dv = 1 - e^{-u},$$
and in the case u≥1 we get
$$f_U(u) = \int_0^1 e^{-(u - v)}\,dv = (e - 1)\, e^{-u}.$$
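The two-part answer for U = X + Y, with X ∈ Exp(1) and Y ∈ U(0,1), can be cross-checked by integrating the joint density exp(-(u - v)) over v numerically. The sketch below is only a sanity check: the midpoint rule, the step count and the function names are choices made for this illustration. Encoding the domain 0 < v < u, v < 1 as an indicator lets the code pick the limits of integration automatically.

```python
import math


def joint_density(u, v):
    """f_{U,V}(u, v) = exp(-(u - v)) on 0 < v < u, v < 1, and 0 elsewhere."""
    return math.exp(-(u - v)) if 0 < v < min(u, 1.0) else 0.0


def marginal_numeric(u, steps=20000):
    """Midpoint-rule integral of the joint density over 0 < v < 1."""
    h = 1.0 / steps
    return h * sum(joint_density(u, (k + 0.5) * h) for k in range(steps))


def marginal_closed_form(u):
    """Piecewise density of U obtained by integrating over v by hand."""
    if u <= 0:
        return 0.0
    if u < 1:
        return 1.0 - math.exp(-u)
    return (math.e - 1.0) * math.exp(-u)


for u in (0.25, 0.75, 1.5, 4.0):
    # Numerical and closed-form values agree to several decimals.
    print(u, marginal_numeric(u), marginal_closed_form(u))
```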
The Transformation Theorem
Many-to-one
Theorem 2.1 requires that the function g is a bijection from
S to T. What if g is not injective?
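Suppose $S \subset \mathbb{R}^n$ can be partitioned into m disjoint subsets $S_1, S_2, \dots, S_m$, such that $g: S_k \to T$ is injective for each k. Then
$$f_{\mathbf{Y}}(\mathbf{y}) = \sum_{k=1}^{m} f_{\mathbf{X}}\big(h_{1k}(\mathbf{y}), h_{2k}(\mathbf{y}), \dots, h_{nk}(\mathbf{y})\big) \cdot |J_k|,$$
where $h_k = (h_{1k}, h_{2k}, \dots, h_{nk})$ is the inverse corresponding to the mapping from $S_k \to T$ and $J_k$ is the corresponding Jacobian.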
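A standard one-dimensional instance of the many-to-one case is Y = X² with X standard normal. This example is an assumption added for illustration and does not come from the slides. The real line is split into the negative and the positive half-line (the single point 0 has probability zero), with inverses -√y and +√y, and the two contributions are added. The result should match the chi-square density with one degree of freedom.

```python
import math


def normal_density(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def density_of_square(y):
    """f_Y(y) for Y = X^2, summing over the branches x = +sqrt(y), -sqrt(y)."""
    if y <= 0:
        return 0.0
    root = math.sqrt(y)
    jacobian = 1.0 / (2.0 * root)  # |d(+-sqrt(y))/dy| is the same on both branches
    return (normal_density(root) + normal_density(-root)) * jacobian


def chi_square_1_density(y):
    """Chi-square density with one degree of freedom, for y > 0."""
    return math.exp(-0.5 * y) / math.sqrt(2.0 * math.pi * y)


for y in (0.2, 1.0, 4.0):
    # The two columns agree up to floating-point rounding.
    print(y, density_of_square(y), chi_square_1_density(y))
```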