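Probability Theory
Chapter 1: Multivariate random variables
Thommy Perlinger

Multivariate random variables

Definition 1.1. An n-dimensional random variable, or random vector, $\mathbf{X}$ is a function from the probability space to $\mathbb{R}^n$, that is,
$$\mathbf{X} : \Omega \to \mathbb{R}^n.$$

The joint distribution function is defined by
$$F_{X_1,\dots,X_n}(x_1,\dots,x_n) = P(X_1 \le x_1, \dots, X_n \le x_n)$$
for $x_1, x_2, \dots, x_n \in \mathbb{R}$. The joint distribution function can also be written in a more compact way by using vector notation:
$$F_{\mathbf{X}}(\mathbf{x}) = P(\mathbf{X} \le \mathbf{x}).$$

In the discrete case the joint probability function is defined by
$$p_{X_1,\dots,X_n}(x_1,\dots,x_n) = P(X_1 = x_1, \dots, X_n = x_n)$$
for $x_1, x_2, \dots, x_n \in \mathbb{R}$, or by using vector notation
$$p_{\mathbf{X}}(\mathbf{x}) = P(\mathbf{X} = \mathbf{x}).$$

In the continuous case the joint density function is defined by
$$f_{X_1,\dots,X_n}(x_1,\dots,x_n) = \frac{\partial^n F_{X_1,\dots,X_n}(x_1,\dots,x_n)}{\partial x_1 \cdots \partial x_n}$$
for $x_1, x_2, \dots, x_n \in \mathbb{R}$, or by using vector notation
$$f_{\mathbf{X}}(\mathbf{x}) = \frac{\partial^n F_{\mathbf{X}}(\mathbf{x})}{\partial x_1 \cdots \partial x_n}.$$

Marginal distributions

When the joint distribution of $\mathbf{X}$ is known it is possible to determine the marginal distribution of any sub-vector of $\mathbf{X}$. In the special case n=2, i.e. $\mathbf{X} = (X, Y)'$, it is easily shown that
$$p_X(x) = \sum_{y} p_{X,Y}(x,y) \quad\text{and}\quad p_Y(y) = \sum_{x} p_{X,Y}(x,y)$$
in the discrete case, and that
$$f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dy \quad\text{and}\quad f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(x,y)\,dx$$
in the continuous case. Care must be taken when determining the limits of summation/integration.

Problem 1.3.13 (part of)

Let the joint density function of $\mathbf{X} = (X, Y)'$ be given by [formula missing]. Determine the marginal density function of Y.

The easiest way to correctly determine the limits of integration is by describing the domain of $\mathbf{X} = (X, Y)'$ graphically. It is now easy to derive the marginal density of Y by integrating the joint density over x, which gives $f_Y(y) = 1$ for $0 < y < 1$. That is, Y is U(0,1).

Independent random variables

It is not in general possible to determine the joint distribution of a random vector $\mathbf{X}$ knowing only the marginal distributions of its components.

The components of $\mathbf{X}$ are independent if and only if (iff)
$$p_{\mathbf{X}}(\mathbf{x}) = \prod_{k=1}^{n} p_{X_k}(x_k)$$
in the discrete case and
$$f_{\mathbf{X}}(\mathbf{x}) = \prod_{k=1}^{n} f_{X_k}(x_k)$$
in the continuous case.

Exercise 1.2 and 1.3

Independent random variables

It is often cumbersome to determine whether two random variables X and Y are independent. The following result is useful.

X and Y are independent if and only if
1.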
…the domain of $(X, Y)$ is rectangular, that is, there exist constants a, b, c and d such that $a \le x \le b$, $c \le y \le d$, and
2. …the joint density/probability function can be written as a product of two functions g(x) and h(y), where g is a function of x only and h is a function of y only.

Covariance and correlation

The concepts of covariance and correlation are used to determine the magnitude of (linear) dependence between X and Y. The covariance is defined by
$$\mathrm{Cov}(X,Y) = E\big[(X - E X)(Y - E Y)\big].$$
When deriving the covariance manually it is easier to use the formula
$$\mathrm{Cov}(X,Y) = E(XY) - E(X)\,E(Y).$$
When X and Y are independent there is no dependence, i.e. Cov(X,Y)=0.

The covariance is scale-dependent. A measure of linear dependence that is scale-invariant is given by the correlation coefficient
$$\rho_{X,Y} = \frac{\mathrm{Cov}(X,Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}}.$$

The Transformation Theorem: One-dimensional case

Let g be strictly increasing and suppose that we are interested in the random variable Y=g(X), where X has density $f_X(x)$. Since g is strictly increasing it has an inverse, which is also strictly increasing. Hence
$$F_Y(y) = P(g(X) \le y) = P\big(X \le g^{-1}(y)\big) = F_X\big(g^{-1}(y)\big).$$
We differentiate with respect to y and by the chain rule it follows that
$$f_Y(y) = f_X\big(g^{-1}(y)\big)\,\frac{d}{dy}\,g^{-1}(y).$$

When g is strictly decreasing the inverse is also strictly decreasing, i.e.,
$$F_Y(y) = P(g(X) \le y) = P\big(X \ge g^{-1}(y)\big) = 1 - F_X\big(g^{-1}(y)\big).$$
We differentiate with respect to y and by the chain rule it follows that
$$f_Y(y) = -f_X\big(g^{-1}(y)\big)\,\frac{d}{dy}\,g^{-1}(y).$$
Since $g^{-1}$ is strictly decreasing, $\frac{d}{dy} g^{-1}(y)$ is negative, which means that it is possible to formulate one single transformation theorem for strictly monotone functions.

Theorem. Let X be a continuous random variable with density function $f_X(x)$. Further, let g be a function that is strictly monotone on the domain of X. It then follows that Y=g(X) is a continuous random variable with density function
$$f_Y(y) = f_X\big(g^{-1}(y)\big)\,\left|\frac{d}{dy}\,g^{-1}(y)\right|.$$

Example.
The Rayleigh – Exponential relationship

A density function sometimes used by engineers to model lengths of life of electronic components is the Rayleigh density, given by
$$f_X(x) = \frac{2x}{\theta}\,e^{-x^2/\theta}, \qquad x > 0.$$

Consider $Y = g(X) = X^2$. Since the domain of X is the positive reals, g is strictly increasing there, which means that the transformation theorem can be applied.

Since $Y = g(X) = X^2$ it follows that
$$g^{-1}(y) = \sqrt{y}, \qquad \frac{d}{dy}\,g^{-1}(y) = \frac{1}{2\sqrt{y}}.$$
Thus by the transformation theorem
$$f_Y(y) = f_X(\sqrt{y})\cdot\frac{1}{2\sqrt{y}} = \frac{2\sqrt{y}}{\theta}\,e^{-y/\theta}\cdot\frac{1}{2\sqrt{y}} = \frac{1}{\theta}\,e^{-y/\theta}, \qquad y > 0,$$
which we recognize as the density function of Exp(θ).

When we observe that g is strictly increasing it is however not necessary to do it the "formal way". We can just as easily use the distribution function:
$$F_Y(y) = P(X^2 \le y) = P(X \le \sqrt{y}) = F_X(\sqrt{y}), \qquad y > 0.$$
We differentiate with respect to y and by the chain rule it follows that
$$f_Y(y) = f_X(\sqrt{y})\cdot\frac{1}{2\sqrt{y}} = \frac{1}{\theta}\,e^{-y/\theta}, \qquad y > 0.$$

The Transformation Theorem: n-dimensional case (Conditions)

Let $\mathbf{X}$ be a continuous random vector with density function $f_{\mathbf{X}}(\mathbf{x})$ with its mass concentrated on $S \subset \mathbb{R}^n$. Further, let $g = (g_1, g_2, \dots, g_n)$ be a bijection from S to $T \subset \mathbb{R}^n$.

Now consider the n-dimensional random vector $\mathbf{Y} = g(\mathbf{X})$, that is, the n one-dimensional random variables
$$Y_1 = g_1(X_1,\dots,X_n),\quad Y_2 = g_2(X_1,\dots,X_n),\quad \dots,\quad Y_n = g_n(X_1,\dots,X_n).$$
We finally assume that g and its inverse are continuously differentiable.

Determinants

Notation. The determinant of a square matrix A is denoted by det A or |A|.

Computation. The determinant of a 2×2 matrix A is given by
$$\det A = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}.$$
The algebraic complement $A_{ij}$ of the element $a_{ij}$ is the matrix that remains after deleting the i-th row and the j-th column of A. The determinant of an n×n matrix A can be derived recursively via
$$\det A = \sum_{j=1}^{n} (-1)^{i+j}\,a_{ij}\,\det A_{ij}$$
for any fixed row i.

Example: Solving a 3×3 determinant. Let us find the determinant of a certain 3×3 matrix A, where A is given by [matrix missing]. In order to make the calculations as easy as possible we develop along the first row. It then follows that [computation missing].

The purpose.
The absolute value of the determinant is the generalized volume, in n-dimensional space, of the parallelepiped spanned by the column vectors of A.

The Transformation Theorem: n-dimensional case

Theorem 2.1. The density function of $\mathbf{Y}$ is
$$f_{\mathbf{Y}}(\mathbf{y}) = \begin{cases} f_{\mathbf{X}}\big(h_1(\mathbf{y}), h_2(\mathbf{y}), \dots, h_n(\mathbf{y})\big)\,|J|, & \mathbf{y} \in T, \\ 0, & \text{otherwise,} \end{cases}$$
where h is the inverse of g and where
$$J = \det\left(\frac{\partial x_i}{\partial y_j}\right)_{i,j=1}^{n}$$
is the Jacobian.

Problem 1.3.21

Let the joint density function of $\mathbf{X} = (X, Y)'$ be given by [formula missing]. Determine the joint density function of $\mathbf{U} = (U, V)'$ where U=XY and V=X.

It is obvious that this is a bijection and therefore Theorem 2.1 is applicable. Inversion yields
$$x = v, \qquad y = \frac{u}{v},$$
that is,
$$J = \begin{vmatrix} \partial x/\partial u & \partial x/\partial v \\ \partial y/\partial u & \partial y/\partial v \end{vmatrix} = \begin{vmatrix} 0 & 1 \\ 1/v & -u/v^2 \end{vmatrix} = -\frac{1}{v}, \qquad |J| = \frac{1}{|v|}.$$
By Theorem 2.1 we now obtain [formula missing], and it is clear that U and V are independent, equidistributed random variables with density function [formula missing].

The Transformation Theorem: Auxiliary variables

In many situations we are only interested in the probability distribution of a one-dimensional function of a random vector. In order to use Theorem 2.1 to find this distribution we have to introduce auxiliary variables. These auxiliary variables can be chosen arbitrarily, which means that we define them to make the computations as easy as possible.

When the joint density function $f_{\mathbf{Y}}(\mathbf{y})$ is obtained we find the sought marginal distribution by integrating over the auxiliary variables.

Problem 1.3.18

Let X ∈ Exp(1) and Y ∈ U(0,1) be independent random variables. Determine the density function of U=X+Y.

To find the density function of U we introduce the auxiliary variable V=Y, which means that
$$U = X + Y, \qquad V = Y,$$
that is,
$$X = U - V, \qquad Y = V,$$
and so
$$J = \begin{vmatrix} 1 & -1 \\ 0 & 1 \end{vmatrix} = 1.$$
By Theorem 2.1 we now obtain
$$f_{U,V}(u,v) = f_X(u-v)\,f_Y(v)\,|J| = e^{-(u-v)}$$
for $0 < v < u < \infty$, $v < 1$.

In order to find the marginal distribution of U we have to integrate the joint density function over v. However, care has to be taken in order to find the correct limits of integration. We have to break up the problem in two parts: $0 < u < 1$ and $u \ge 1$. In the case $0 < u < 1$ we get
$$f_U(u) = \int_0^u e^{-(u-v)}\,dv = 1 - e^{-u},$$
and in the case $u \ge 1$ we get
$$f_U(u) = \int_0^1 e^{-(u-v)}\,dv = (e - 1)\,e^{-u}.$$

The Transformation Theorem: Many-to-one

Theorem 2.1 requires that the function g is a bijection from S to T. What if g is not injective?

Suppose $S \subset \mathbb{R}^n$ can be partitioned into m disjoint subsets $S_1, S_2, \dots, S_m$ such that $g : S_k \to T$ is injective for each k. Then
$$f_{\mathbf{Y}}(\mathbf{y}) = \sum_{k=1}^{m} f_{\mathbf{X}}\big(h_{1k}(\mathbf{y}), h_{2k}(\mathbf{y}), \dots, h_{nk}(\mathbf{y})\big)\,|J_k|,$$
where $h_k = (h_{1k}, h_{2k}, \dots, h_{nk})$ is the inverse corresponding to the mapping $S_k \to T$ and $J_k$ is the corresponding Jacobian.
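A small illustration of the many-to-one case may help. The example is not taken from the slides: it assumes X ∈ N(0,1) and Y = X², so that g is injective on S1 = (−∞, 0) and on S2 = (0, ∞), with inverse branches −√y and √y. The function names are hypothetical, and the sketch only treats the one-dimensional case, where the Jacobian reduces to the derivative of each inverse branch.

```python
import math

def normal_pdf(x):
    """Density of N(0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def many_to_one_density(y, f_x, branches):
    """Sum f_X(h_k(y)) * |h_k'(y)| over the inverse branches h_k.

    `branches` is a list of (h_k, dh_k) pairs: an inverse branch and its
    derivative, which plays the role of the Jacobian J_k when n = 1.
    """
    return sum(f_x(h(y)) * abs(dh(y)) for h, dh in branches)

# Assumed example: Y = X^2 is injective on S1 = (-inf, 0) and on S2 = (0, inf).
SQUARE_BRANCHES = [
    (lambda y: -math.sqrt(y), lambda y: -0.5 / math.sqrt(y)),  # inverse on S1
    (lambda y: math.sqrt(y), lambda y: 0.5 / math.sqrt(y)),    # inverse on S2
]

def density_of_square(y):
    """Density of Y = X^2 for X in N(0, 1), via the many-to-one formula."""
    if y <= 0.0:
        return 0.0
    return many_to_one_density(y, normal_pdf, SQUARE_BRANCHES)

def chi2_1_pdf(y):
    """Chi-square(1) density, used only as a reference for comparison."""
    if y <= 0.0:
        return 0.0
    return math.exp(-0.5 * y) / math.sqrt(2.0 * math.pi * y)

def prob_y_at_most(c, steps=20000):
    """P(Y <= c) by the midpoint rule after substituting y = t^2,
    which removes the 1/sqrt(y) singularity at the origin."""
    width = math.sqrt(c) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        total += density_of_square(t * t) * 2.0 * t
    return total * width
```

Under these assumptions `density_of_square` agrees with the χ²(1) density, and `prob_y_at_most(1.0)` should agree with P(|X| ≤ 1) = erf(1/√2), which is a quick way to check that both branches were counted.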
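The limits of integration in Problem 1.3.18 are easy to get wrong, so a numerical check can be useful. The sketch below assumes Exp(1) has density e^{-x} for x > 0, integrates the joint density of (U, V) over v with a plain midpoint rule, and compares the result with the closed forms that this integration gives in the two cases 0 < u < 1 and u ≥ 1. All function names are hypothetical.

```python
import math

def joint_density(u, v):
    """f_{U,V}(u, v) = f_X(u - v) * f_Y(v) * |J| with |J| = 1,
    for X in Exp(1), Y in U(0, 1), U = X + Y and V = Y."""
    if 0.0 < v < 1.0 and v < u:
        return math.exp(-(u - v))
    return 0.0

def marginal_numeric(u, steps=20000):
    """Integrate the joint density over v in (0, 1) with the midpoint rule."""
    width = 1.0 / steps
    return width * sum(joint_density(u, (i + 0.5) * width) for i in range(steps))

def marginal_closed_form(u):
    """Piecewise density of U obtained by integrating over v by hand."""
    if u <= 0.0:
        return 0.0
    if u < 1.0:
        return 1.0 - math.exp(-u)           # v runs over (0, u)
    return (math.e - 1.0) * math.exp(-u)    # v runs over (0, 1)

if __name__ == "__main__":
    for u in (0.5, 1.0, 2.0):
        print(u, marginal_numeric(u), marginal_closed_form(u))
```

The indicator inside `joint_density` carries the domain 0 < v < u, v < 1, so the numerical integral picks the right limits by itself. That is why it works as an independent check on the case split.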