Stats 241.3
Probability Theory
Summary
The Sample Space, S
The sample space, S, for a random phenomenon
is the set of all possible outcomes.
An Event, E
The event, E, is any subset of the sample space, S,
i.e. any set of outcomes (not necessarily all
outcomes) of the random phenomenon.
Probability
Suppose we are observing a random phenomenon.
Let S denote the sample space for the phenomenon, the
set of all possible outcomes.
An event E is a subset of S.
A probability measure P is defined on S by defining,
for each event E, P[E] with the following properties:
1. P[E] ≥ 0, for each E.
2. P[S] = 1.
3. P[∪i Ei] = ∑i P[Ei] if Ei ∩ Ej = ∅ for all i ≠ j,
   i.e. P[E1 ∪ E2 ∪ …] = P[E1] + P[E2] + …
Finite uniform probability space
Many examples fall into this category
1. Finite number of outcomes
2. All outcomes are equally likely
3. P[E] = n(E)/n(S) = n(E)/N
   = (no. of outcomes in E)/(total no. of outcomes)
Note: n(A) = the number of elements of A.
To handle problems of this kind we have to be able to
count: count n(E) and n(S).
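As a quick illustration of the rule P[E] = n(E)/n(S) (a minimal sketch of my own, not from the original slides), the following Python snippet counts outcomes for two fair dice; the event chosen here, the faces summing to 7, is just an example.

```python
from itertools import product

# Sample space for two fair dice: all ordered pairs of faces.
S = list(product(range(1, 7), repeat=2))

# Example event: the two faces sum to 7.
E = [outcome for outcome in S if sum(outcome) == 7]

# Finite uniform probability space: P[E] = n(E) / n(S).
print(len(E), len(S), len(E) / len(S))   # 6 36 0.1666...
```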
Techniques for counting
Basic Rule of counting
Suppose we carry out k operations in sequence
Let
n1 = the number of ways the first operation
can be performed
ni = the number of ways the ith operation can be
performed once the first (i - 1) operations
have been completed. i = 2, 3, … , k
Then N = n1n2 … nk = the number of ways the
k operations can be performed in sequence.
Diagram: a tree with n1 branches for the first operation, each followed by
n2 branches for the second operation, then n3 for the third, and so on.
Basic Counting Formulae
1. Permutations: How many ways can you order n
objects?
n!
2. Permutations of size k (< n): How many ways can
you choose k objects from n objects in a specific
order?
nPk = n(n − 1)⋯(n − k + 1) = n!/(n − k)!
3. Combinations of size k (≤ n): A combination of
size k chosen from n objects is a subset of size k
where the order of selection is irrelevant. How many
ways can you choose a combination of size k objects
from n objects (order of selection is irrelevant)?
nCk = (n choose k) = n(n − 1)(n − 2)⋯(n − k + 1) / (k(k − 1)(k − 2)⋯1)
    = n!/((n − k)! k!)
(A small numeric check of these formulae follows the notes below.)
Important Notes
1. In combinations ordering is irrelevant.
Different orderings result in the same
combination.
2. In permutations order is relevant. Different
orderings result in different permutations.
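The counting formulae above can be checked directly; this small sketch (not part of the original slides) uses the standard-library functions math.perm and math.comb, which compute nPk and nCk.

```python
import math

n, k = 10, 3

# Permutations of size k: nPk = n! / (n - k)!
print(math.perm(n, k), math.factorial(n) // math.factorial(n - k))   # 720 720

# Combinations of size k: nCk = n! / ((n - k)! k!)
print(math.comb(n, k), math.perm(n, k) // math.factorial(k))         # 120 120
```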
Rules of Probability
The additive rule
P[A  B] = P[A] + P[B] – P[A  B]
and
P[A  B] = P[A] + P[B] if P[A  B] = 
The additive rule for more than two events
n  n
P  Ai    P  Ai    P  Ai  Aj 
i
j
 i 1  i 1
 P  Ai  Aj  Ak  
i
j
  1
k
n 1
P  A1  A2 
and if Ai  Aj =  for all i ≠ j.
then
n  n
P  Ai    P  Ai 
 i 1  i 1
 An 
The Rule for complements
for any event E
P  E   1  P  E 
Conditional Probability,
Independence
and
The Multiplicative Rule
The conditional probability of A given B is
defined to be:
P[A|B] = P[A ∩ B] / P[B],  if P[B] ≠ 0
The multiplicative rule of probability
P[A ∩ B] = P[A] P[B|A]  if P[A] ≠ 0
         = P[B] P[A|B]  if P[B] ≠ 0
and
P[A ∩ B] = P[A] P[B]
if A and B are independent.
This is the definition of independence.
The multiplicative rule for more than
two events
P  A1  A2 
 An  
P  A1  P  A2 A1  P  A3 A2  A1 
P  An An 1  An 2
 A1 
Independence
for more than 2 events
Definition:
The set of k events A1, A2, … , Ak are called
mutually independent if:
P[Ai1 ∩ Ai2 ∩… ∩ Aim] = P[Ai1] P[Ai2] …P[Aim]
For every subset {i1, i2, … , im } of {1, 2, …, k }
i.e. for k = 3 A1, A2, … , Ak are mutually independent if:
P[A1 ∩ A2] = P[A1] P[A2], P[A1 ∩ A3] = P[A1] P[A3],
P[A2 ∩ A3] = P[A2] P[A3],
P[A1 ∩ A2 ∩ A3] = P[A1] P[A2] P[A3]
Definition:
The set of k events A1, A2, … , Ak are called
pairwise independent if:
P[Ai ∩ Aj] = P[Ai] P[Aj] for all i and j.
i.e. for k = 3 A1, A2, … , Ak are pairwise independent if:
P[A1 ∩ A2] = P[A1] P[A2], P[A1 ∩ A3] = P[A1] P[A3],
P[A2 ∩ A3] = P[A2] P[A3],
It is not necessarily true that P[A1 ∩ A2 ∩ A3] = P[A1]
P[A2] P[A3]
Bayes Rule for probability
P  A P  B A
P  A B  
P  A P  B A  P  A  P  B A 
A generalization of Bayes Rule
Let A1, A2, … , Ak denote a set of events such that
S = A1 ∪ A2 ∪ ⋯ ∪ Ak and Ai ∩ Aj = ∅
for all i ≠ j. Then
P[Ai|B] = P[Ai] P[B|Ai] / (P[A1] P[B|A1] + ⋯ + P[Ak] P[B|Ak])
Random Variables
an important concept in probability
A random variable, X, is a numerical quantity
whose value is determined by a random
experiment.
Definition – The probability function, p(x), of
a random variable, X.
For any random variable, X, and any real
number, x, we define
p  x   P  X  x   P  X  x
where {X = x} = the set of all outcomes (event)
with X = x.
For continuous random variables p(x) = 0 for all
values of x.
Definition – The cumulative distribution
function, F(x), of a random variable, X.
For any random variable, X, and any real
number, x, we define
F  x   P  X  x   P  X  x
where {X ≤ x} = the set of all outcomes (event)
with X ≤ x.
Discrete Random Variables
For a discrete random variable X the probability
distribution is described by the probability
function p(x), which has the following properties
1. 0 ≤ p(x) ≤ 1
2. ∑_x p(x) = ∑_i p(xi) = 1
3. P[a ≤ X ≤ b] = ∑_{a ≤ x ≤ b} p(x)
Graph: Discrete Random Variable
P[a ≤ X ≤ b] = ∑_{a ≤ x ≤ b} p(x) is the sum of the heights of p(x) for the values x between a and b.
Continuous random variables
For a continuous random variable X the probability
distribution is described by the probability density
function f(x), which has the following properties :
1. f(x) ≥ 0
2. ∫_{−∞}^{∞} f(x) dx = 1
3. P[a ≤ X ≤ b] = ∫_a^b f(x) dx
Graph: Continuous Random Variable
The probability density function f(x) has total area 1 under it
(∫_{−∞}^{∞} f(x) dx = 1), and P[a ≤ X ≤ b] = ∫_a^b f(x) dx is the area under f(x) between a and b.
The distribution function F(x)
This is defined for any random variable, X.
F(x) = P[X ≤ x]
Properties
1. F(−∞) = 0 and F(∞) = 1.
2. F(x) is non-decreasing (i.e. if x1 < x2 then F(x1) ≤ F(x2)).
3. F(b) − F(a) = P[a < X ≤ b].
4. p(x) = P[X = x] = F(x) − F(x−)
Here
F(x−) = lim_{u→x−} F(u)
5. If p(x) = 0 for all x (i.e. X is continuous)
then F(x) is continuous.
6. For Discrete Random Variables
F(x) = P[X ≤ x] = ∑_{u ≤ x} p(u)
F(x) is a non-decreasing step function with
F(−∞) = 0 and F(∞) = 1
p(x) = F(x) − F(x−) = the jump in F(x) at x.
Graph: F(x) for a discrete random variable is a step function; p(x) is the size of the jump at each x.
7. For Continuous Random Variables
F(x) = P[X ≤ x] = ∫_{−∞}^{x} f(u) du
F(x) is a non-decreasing continuous function with
F(−∞) = 0 and F(∞) = 1
f(x) = F′(x).
Graph: F(x) for a continuous random variable; f(x) is the slope of F(x).
To find the probability density function, f(x), one first
finds F(x); then
f(x) = F′(x).
Some Important Discrete
distributions
The Bernoulli distribution
Suppose that we have an experiment that has two
outcomes:
1. Success (S)
2. Failure (F)
These terms are used in reliability testing.
Suppose that p is the probability of success (S) and
q = 1 – p is the probability of failure (F)
This experiment is sometimes called a Bernoulli Trial
Let
X = 0 if the outcome is F, and X = 1 if the outcome is S.
Then p(x) = P[X = x] = q if x = 0, and p if x = 1.
The probability distribution with probability function
p(x) = P[X = x] = { q for x = 0; p for x = 1 }
is called the Bernoulli distribution.
Graph: the Bernoulli probability function has height q = 1 − p at x = 0 and height p at x = 1.
The Binomial distribution
We observe a Bernoulli trial (S,F) n times.
Let X denote the number of successes in the n trials.
Then X has a binomial distribution, i.e.
p(x) = P[X = x] = (n choose x) p^x q^(n−x),  x = 0, 1, 2, …, n
where
1. p = the probability of success (S), and
2. q = 1 − p = the probability of failure (F).
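A minimal sketch (not from the slides) of the binomial probability function, using math.comb for the binomial coefficient; the parameters n = 10, p = 0.3 are arbitrary examples.

```python
import math

def binomial_pmf(x, n, p):
    """p(x) = (n choose x) p^x q^(n-x), with q = 1 - p."""
    q = 1.0 - p
    return math.comb(n, x) * p**x * q**(n - x)

n, p = 10, 0.3
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(round(sum(probs), 10))            # 1.0 (the pmf sums to one)
print(round(binomial_pmf(3, n, p), 4))  # P[X = 3], about 0.2668
```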
The Poisson distribution
• Suppose events are occurring randomly and
uniformly in time.
• Let X be the number of events occurring in a
fixed period of time. Then X will have a
Poisson distribution with parameter λ.
p(x) = (λ^x / x!) e^(−λ),  x = 0, 1, 2, 3, 4, …
The Geometric distribution
Suppose a Bernoulli trial (S,F) is repeated until a
success occurs.
X = the trial on which the first success (S)
occurs.
The probability function of X is:
p(x) = P[X = x] = (1 − p)^(x−1) p = p q^(x−1),  x = 1, 2, 3, …
The Negative Binomial distribution
Suppose a Bernoulli trial (S,F) is repeated until k
successes occur.
Let X = the trial on which the kth success (S)
occurs.
The probability function of X is:
p(x) = P[X = x] = (x−1 choose k−1) p^k q^(x−k),  x = k, k+1, k+2, …
The Hypergeometric distribution
Suppose we have a population containing N objects.
Suppose the elements of the population are partitioned into two
groups. Let a = the number of elements in group A and let b = the
number of elements in the other group (group B). Note N = a + b.
Now suppose that n elements are selected from the population at
random. Let X denote the number of selected elements that come from group A.
The probability distribution of X is
p(x) = P[X = x] = (a choose x)(b choose n−x) / (N choose n)
Continuous Distributions
The Uniform distribution from a to b
f(x) = 1/(b − a) for a ≤ x ≤ b, and 0 otherwise
Graph: the uniform density f(x) is constant at height 1/(b − a) between a and b, and 0 elsewhere.
The Normal distribution
(mean μ, standard deviation σ)
f(x) = (1/(√(2π) σ)) e^(−(x − μ)² / (2σ²))
The Exponential distribution
lel x
f  x  
 0
x0
x0
0.2
0.1
0
-2
0
2
4
6
8
10
The Weibull distribution
A model for the lifetime of objects
that do age.
The Weibull distribution with parameters α and β:
f(x) = α β x^(β−1) e^(−α x^β) for x ≥ 0, and 0 for x < 0
Graph: the Weibull density f(x) for β = 2 and α = 0.5, 0.7, 0.9.
The Gamma distribution
An important family of distributions
The Gamma distribution
Let the continuous random variable X have
density function:
f(x) = (λ^α / Γ(α)) x^(α−1) e^(−λx) for x ≥ 0, and 0 for x < 0
Then X is said to have a Gamma distribution
with parameters α and λ.
Graph: gamma densities for (α = 2, λ = 0.9), (α = 2, λ = 0.6) and (α = 3, λ = 0.6).
Comments
1. The set of gamma distributions is a family of
distributions (parameterized by α and λ).
2. Contained within this family are other distributions:
a. The Exponential distribution – in this case α = 1, and the
gamma distribution becomes the exponential distribution
with parameter λ. The exponential distribution arises if
we are measuring the lifetime, X, of an object that does
not age. It is also used as a distribution for waiting times
between events occurring uniformly in time.
b. The Chi-square distribution – in the case α = ν/2 and
λ = ½, the gamma distribution becomes the chi-square
(χ²) distribution with ν degrees of freedom. Later we
will see that a sum of squares of independent standard
normal variates has a chi-square distribution, with degrees
of freedom = the number of independent terms in the
sum of squares.
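The comment that the gamma density with α = 1 reduces to the exponential density can be checked numerically; this sketch (not from the slides) evaluates both at a few points, with λ = 0.5 chosen arbitrarily.

```python
import math

def gamma_density(x, alpha, lam):
    """f(x) = lambda^alpha / Gamma(alpha) * x^(alpha - 1) * e^(-lambda x), x >= 0."""
    if x < 0:
        return 0.0
    return lam**alpha / math.gamma(alpha) * x**(alpha - 1) * math.exp(-lam * x)

def exponential_density(x, lam):
    """f(x) = lambda * e^(-lambda x), x >= 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

lam = 0.5
for x in (0.5, 1.0, 2.0, 5.0):
    # The two columns agree: gamma(alpha = 1, lambda) is the exponential(lambda) density.
    print(x, round(gamma_density(x, 1.0, lam), 6), round(exponential_density(x, lam), 6))
```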
Expectation
Let X denote a discrete random variable with
probability function p(x) (probability density function
f(x) if X is continuous) then the expected value of X,
E(X) is defined to be:
E  X    xp  x    xi p  xi 
x
i
and if X is continuous with probability density function
f(x)
EX  

 xf  x  dx

Expectation of functions
Let X denote a discrete random variable with
probability function p(x); then the expected value of g(X),
E[g(X)], is defined to be:
E[g(X)] = ∑_x g(x) p(x)
and if X is continuous with probability density function
f(x)
E[g(X)] = ∫_{−∞}^{∞} g(x) f(x) dx
Moments of a Random Variable
the kth moment of X:
μk = E[X^k]
   = ∑_x x^k p(x)  if X is discrete
   = ∫_{−∞}^{∞} x^k f(x) dx  if X is continuous
• The first moment of X, μ = μ1 = E(X), is the center of gravity
of the distribution of X.
• The higher moments give different information regarding the
distribution of X.
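As a small numeric illustration (my own example, not from the slides), the first and second moments of a fair six-sided die follow directly from the definitions of E[X] and E[g(X)] above.

```python
# Fair die: p(x) = 1/6 for x = 1, ..., 6.
p = {x: 1 / 6 for x in range(1, 7)}

# First moment: E[X] = sum over x of x p(x)
expected_X = sum(x * px for x, px in p.items())

# Second moment: E[X^2] = sum over x of x^2 p(x)
expected_X_squared = sum(x**2 * px for x, px in p.items())

print(expected_X, expected_X_squared)   # 3.5 and about 15.167
```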
the kth central moment of X:
μk⁰ = E[(X − μ)^k]
    = ∑_x (x − μ)^k p(x)  if X is discrete
    = ∫_{−∞}^{∞} (x − μ)^k f(x) dx  if X is continuous
Moment generating functions
Definition
Let X denote a random variable. Then the moment
generating function of X, mX(t), is defined by:
mX(t) = E[e^(tX)] = ∑_x e^(tx) p(x)  if X is discrete
                  = ∫_{−∞}^{∞} e^(tx) f(x) dx  if X is continuous
Properties
1. mX(0) = 1
2. mXk   0   k th derivative of mX  t  at t  0.
 
 mk  E X
mk  E  X
3.
k

k
k

 x f  x  dx

k

  x p  x
mX  t   1  m1t 
m2
2!
t 
2
X continuous
X discrete
m3
3!
t 
3

mk
k!
t 
k
.
4. Let X be a random variable with moment
generating function mX(t). Let Y = bX + a
Then mY(t) = m(bX+a)(t)
= E(e^((bX + a)t)) = e^(at) E(e^(X(bt)))
= e^(at) mX(bt)
5. Let X and Y be two independent random
variables with moment generating function
mX(t) and mY(t) .
Then mX+Y(t) = E(e^((X + Y)t)) = E(e^(Xt) e^(Yt))
= E(e^(Xt)) E(e^(Yt))
= mX(t) mY(t)
6. Let X and Y be two random variables with
moment generating function mX(t) and mY(t)
and two distribution functions FX(x) and
FY(y) respectively.
If mX(t) = mY(t), then FX(x) = FY(x).
This ensures that the distribution of a random
variable can be identified by its moment
generating function.
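Property 2 can be checked numerically; this sketch (not from the slides) approximates the first two derivatives of the binomial moment generating function mX(t) = (q + pe^t)^n at t = 0 by finite differences and compares them with E[X] = np and E[X²] = npq + (np)².

```python
import math

n, p = 10, 0.3
q = 1.0 - p

def m(t):
    """Binomial moment generating function m_X(t) = (q + p e^t)^n."""
    return (q + p * math.exp(t)) ** n

h = 1e-4
first_deriv = (m(h) - m(-h)) / (2 * h)           # approximates m'_X(0) = E[X]
second_deriv = (m(h) - 2 * m(0) + m(-h)) / h**2  # approximates m''_X(0) = E[X^2]

print(round(first_deriv, 4), n * p)                      # about 3.0
print(round(second_deriv, 4), n * p * q + (n * p) ** 2)  # about 11.1
```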
M. G. F.’s - Continuous distributions
Name: Moment generating function MX(t)
Continuous Uniform: (e^(bt) − e^(at)) / ((b − a)t)
Exponential: λ/(λ − t), for t < λ
Gamma: (λ/(λ − t))^α, for t < λ
χ² (ν d.f.): (1/(1 − 2t))^(ν/2), for t < 1/2
Normal: e^(tμ + (1/2)t²σ²)
M. G. F.’s - Discrete distributions
Name: Moment generating function MX(t)
Discrete Uniform: e^t(e^(tN) − 1) / (N(e^t − 1))
Bernoulli: q + pe^t
Binomial: (q + pe^t)^n
Geometric: pe^t / (1 − qe^t)
Negative Binomial: (pe^t / (1 − qe^t))^k
Poisson: e^(λ(e^t − 1))
Note:
The distribution of a random variable X can be described by:
1. Probability function / density:
p(x) = probability function, if X is discrete
f(x) = probability density function, if X is continuous
2. Distribution function:
F(x) = ∑_{u ≤ x} p(u)  if X is discrete
     = ∫_{−∞}^{x} f(u) du  if X is continuous
3. Moment generating function:
mX(t) = E[e^(tX)] = ∑_x e^(tx) p(x)  if X is discrete
                  = ∫_{−∞}^{∞} e^(tx) f(x) dx  if X is continuous
Summary of Discrete Distributions
Discrete Uniform: p(x) = 1/N, x = 1, 2, …, N; Mean = (N + 1)/2; Variance = (N² − 1)/12; MX(t) = e^t(e^(tN) − 1)/(N(e^t − 1))

Bernoulli: p(x) = p for x = 1, q for x = 0; Mean = p; Variance = pq; MX(t) = q + pe^t

Binomial: p(x) = (n choose x) p^x q^(n−x), x = 0, 1, …, n; Mean = np; Variance = npq; MX(t) = (q + pe^t)^n

Geometric: p(x) = p q^(x−1), x = 1, 2, …; Mean = 1/p; Variance = q/p²; MX(t) = pe^t/(1 − qe^t)

Negative Binomial: p(x) = (x−1 choose k−1) p^k q^(x−k), x = k, k+1, …; Mean = k/p; Variance = kq/p²; MX(t) = (pe^t/(1 − qe^t))^k

Poisson: p(x) = (λ^x/x!) e^(−λ), x = 0, 1, 2, …; Mean = λ; Variance = λ; MX(t) = e^(λ(e^t − 1))

Hypergeometric: p(x) = (A choose x)(N−A choose n−x)/(N choose n); Mean = n(A/N); Variance = n(A/N)(1 − A/N)(N − n)/(N − 1); MX(t) not useful
Summary of Continuous Distributions
Continuous Uniform: f(x) = 1/(b − a) for a ≤ x ≤ b, 0 otherwise; Mean = (a + b)/2; Variance = (b − a)²/12; MX(t) = (e^(bt) − e^(at))/((b − a)t)

Exponential: f(x) = λe^(−λx) for x ≥ 0, 0 for x < 0; Mean = 1/λ; Variance = 1/λ²; MX(t) = λ/(λ − t) for t < λ

Gamma: f(x) = (λ^α/Γ(α)) x^(α−1) e^(−λx) for x ≥ 0, 0 for x < 0; Mean = α/λ; Variance = α/λ²; MX(t) = (λ/(λ − t))^α for t < λ

χ² (ν d.f.): f(x) = ((1/2)^(ν/2)/Γ(ν/2)) x^(ν/2 − 1) e^(−x/2) for x ≥ 0, 0 for x < 0; Mean = ν; Variance = 2ν; MX(t) = (1/(1 − 2t))^(ν/2) for t < 1/2

Normal: f(x) = (1/(√(2π)σ)) e^(−(x − μ)²/(2σ²)); Mean = μ; Variance = σ²; MX(t) = e^(tμ + (1/2)t²σ²)

Weibull: f(x) = αβ x^(β−1) e^(−αx^β) for x ≥ 0, 0 for x < 0; Mean = Γ(1/β + 1)/α^(1/β); Variance = [Γ(2/β + 1) − Γ(1/β + 1)²]/α^(2/β); MX(t) not available
Jointly distributed Random
variables
Multivariate distributions
Discrete Random Variables
The joint probability function:
p(x,y) = P[X = x, Y = y]
1. 0 ≤ p(x, y) ≤ 1
2. ∑_x ∑_y p(x, y) = 1
3. P[(X, Y) ∈ A] = ∑_{(x, y) ∈ A} p(x, y)
Continuous Random Variables
Definition: Two random variables are said to have
joint probability density function f(x, y) if
1. 0 ≤ f(x, y)
2. ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1
3. P[(X, Y) ∈ A] = ∫∫_A f(x, y) dx dy
Marginal and conditional
distributions
Marginal Distributions (Discrete case):
Let X and Y denote two random variables with
joint probability function p(x,y) then
the marginal density of X is
pX(x) = ∑_y p(x, y)
the marginal density of Y is
pY(y) = ∑_x p(x, y)
Marginal Distributions (Continuous case):
Let X and Y denote two random variables with
joint probability density function f(x,y) then
the marginal density of X is
fX(x) = ∫_{−∞}^{∞} f(x, y) dy
the marginal density of Y is
fY(y) = ∫_{−∞}^{∞} f(x, y) dx
Conditional Distributions (Discrete Case):
Let X and Y denote two random variables with
joint probability function p(x,y) and marginal
probability functions pX(x), pY(y) then
the conditional density of Y given X = x
pY|X(y|x) = p(x, y) / pX(x)
the conditional density of X given Y = y
pX|Y(x|y) = p(x, y) / pY(y)
Conditional Distributions (Continuous Case):
Let X and Y denote two random variables with
joint probability density function f(x,y) and
marginal densities fX(x), fY(y) then
the conditional density of Y given X = x
fY|X(y|x) = f(x, y) / fX(x)
the conditional density of X given Y = y
fX|Y(x|y) = f(x, y) / fY(y)
The bivariate Normal distribution
Let
f(x1, x2) = (1 / (2π σ1 σ2 √(1 − ρ²))) e^(−Q(x1, x2)/2)
where
Q(x1, x2) = (1/(1 − ρ²)) [ ((x1 − μ1)/σ1)² − 2ρ((x1 − μ1)/σ1)((x2 − μ2)/σ2) + ((x2 − μ2)/σ2)² ]
This distribution is called the bivariate
Normal distribution.
The parameters are μ1, μ2, σ1, σ2 and ρ.
Surface Plots of the bivariate
Normal distribution
Marginal distributions
1. The marginal distribution of x1 is Normal with
mean μ1 and standard deviation σ1.
2. The marginal distribution of x2 is Normal with
mean μ2 and standard deviation σ2.
Conditional distributions
1. The conditional distribution of x1 given x2 is
Normal with:
mean μ1|2 = μ1 + ρ(σ1/σ2)(x2 − μ2), and
standard deviation σ1|2 = σ1 √(1 − ρ²)
2. The conditional distribution of x2 given x1 is
Normal with:
mean μ2|1 = μ2 + ρ(σ2/σ1)(x1 − μ1), and
standard deviation σ2|1 = σ2 √(1 − ρ²)
Independence
Definition:
Two random variables X and Y are defined to be
independent if
p  x, y   pX  x  pY  y 
if X and Y are discrete
f  x, y   f X  x  fY  y 
if X and Y are continuous
Multivariate distributions (k ≥ 2)
Definition
Let X1, X2, …, Xn denote n discrete random
variables, then
p(x1, x2, …, xn )
is the joint probability function of X1, X2, …, Xn if
1. 0 ≤ p(x1, …, xn) ≤ 1
2. ∑_{x1} ⋯ ∑_{xn} p(x1, …, xn) = 1
3. P[(X1, …, Xn) ∈ A] = ∑_{(x1, …, xn) ∈ A} p(x1, …, xn)
Definition
Let X1, X2, …, Xk denote k continuous random
variables, then
f(x1, x2, …, xk )
is the joint density function of X1, X2, …, Xk if
1. f(x1, …, xk) ≥ 0
2. ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} f(x1, …, xk) dx1 ⋯ dxk = 1
3. P[(X1, …, Xk) ∈ A] = ∫⋯∫_A f(x1, …, xk) dx1 ⋯ dxk
The Multinomial distribution
Suppose that we observe an experiment that has k
possible outcomes {O1, O2, …, Ok } independently n
times.
Let p1, p2, …, pk denote probabilities of O1, O2, …,
Ok respectively.
Let Xi denote the number of times that outcome Oi
occurs in the n repetitions of the experiment.
The joint probability function of X1, X2, …, Xk is
p(x1, …, xk) = (n!/(x1! x2! ⋯ xk!)) p1^(x1) p2^(x2) ⋯ pk^(xk)
             = (n choose x1 x2 ⋯ xk) p1^(x1) p2^(x2) ⋯ pk^(xk)
This distribution is called the Multinomial distribution.
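A minimal sketch (not from the slides) of the multinomial probability function, built from math.factorial; the example evaluates p(x1, x2, x3) for n = 10 trials with three hypothetical outcome probabilities.

```python
import math

def multinomial_pmf(counts, probs):
    """p(x1,...,xk) = n!/(x1!...xk!) * p1^x1 * ... * pk^xk, with n = sum of the counts."""
    n = sum(counts)
    coef = math.factorial(n)
    for x in counts:
        coef //= math.factorial(x)   # exact integer division at each step
    value = float(coef)
    for x, p in zip(counts, probs):
        value *= p ** x
    return value

# Hypothetical example: n = 10, outcome probabilities 0.5, 0.3, 0.2.
print(round(multinomial_pmf((5, 3, 2), (0.5, 0.3, 0.2)), 6))   # about 0.085
```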
The Multivariate Normal distribution
Recall the univariate normal distribution
f(x) = (1/(√(2π) σ)) e^(−(1/2)((x − μ)/σ)²)
the bivariate normal distribution
f  x, y  
1
2s xs y 1  
2
e
 12
2 1 





   2   
xm x 2
sx
xm x
sx
xm y
sy
  
xm y 2 
sy
The k-variate Normal distribution
f  x1 ,
, xk   f  x  
1
 2 
k /2

1/ 2
e
 12  x μ   1  x μ 
where
 x1 
x 
2

x
 
 
 xk 
 m1 
m 
μ   2
 
 
 mk 
s 11 s 12
s
s
12
22



s 1k s 2 k
s 1k 
s 2 k 


s kk 
Marginal distributions
Definition
Let X1, X2, …, Xq, Xq+1 …, Xk denote k discrete
random variables with joint probability function
p(x1, x2, …, xq, xq+1 …, xk )
then the marginal joint probability function
of X1, X2, …, Xq is
p12⋯q(x1, …, xq) = ∑_{xq+1} ⋯ ∑_{xk} p(x1, …, xk)
Definition
Let X1, X2, …, Xq, Xq+1 …, Xk denote k continuous
random variables with joint probability density
function
f(x1, x2, …, xq, xq+1 …, xk )
then the marginal joint density function
of X1, X2, …, Xq is
f12⋯q(x1, …, xq) = ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} f(x1, …, xk) dxq+1 ⋯ dxk
Conditional distributions
Definition
Let X1, X2, …, Xq, Xq+1 …, Xk denote k discrete
random variables with joint probability function
p(x1, x2, …, xq, xq+1 …, xk )
then the conditional joint probability function
of X1, X2, …, Xq given Xq+1 = xq+1 , …, Xk = xk is
p1⋯q|q+1⋯k(x1, …, xq | xq+1, …, xk) = p(x1, …, xk) / pq+1⋯k(xq+1, …, xk)
Definition
Let X1, X2, …, Xq, Xq+1 …, Xk denote k continuous
random variables with joint probability density
function
f(x1, x2, …, xq, xq+1 …, xk )
then the conditional joint density function
of X1, X2, …, Xq given Xq+1 = xq+1, …, Xk = xk is
f1⋯q|q+1⋯k(x1, …, xq | xq+1, …, xk) = f(x1, …, xk) / fq+1⋯k(xq+1, …, xk)
Definition – Independence of sets of vectors
Let X1, X2, …, Xq, Xq+1 …, Xk denote k continuous
random variables with joint probability density
function
f(x1, x2, …, xq, xq+1 …, xk )
then the variables X1, X2, …, Xq are independent
of Xq+1, …, Xk if
f  x1 ,
, xk   f1
q
 x , , x  f
1
q
q 1 k
x
q 1
, , xk 
A similar definition for discrete random variables.
Definition – Mutual Independence
Let X1, X2, …, Xk denote k continuous random
variables with joint probability density function
f(x1, x2, …, xk )
then the variables X1, X2, …, Xk are called
mutually independent if
f  x1 ,
, xk   f1  x1  f 2  x2  f k  xk 
A similar definition for discrete random variables.
Expectation
for multivariate distributions
Definition
Let X1, X2, …, Xn denote n jointly distributed
random variables with joint density function
f(x1, x2, …, xn )
then
E  g  X 1 ,


, X n  

  g x ,
1


, xn  f  x1 ,
, xn  dx1 ,
, dxn
Some Rules for Expectation
1. E[Xi] = ∫_{−∞}^{∞} ⋯ ∫_{−∞}^{∞} xi f(x1, …, xn) dx1 ⋯ dxn
         = ∫_{−∞}^{∞} xi fi(xi) dxi
Thus you can calculate E[Xi] either from the joint distribution of
X1, … , Xn or the marginal distribution of Xi.
2. (The Linearity property)
E[a1X1 + ⋯ + anXn + b] = a1E[X1] + ⋯ + anE[Xn] + b
3. (The Multiplicative property) Suppose X1, … , Xq
are independent of Xq+1, … , Xk; then
E[g(X1, …, Xq) h(Xq+1, …, Xk)] = E[g(X1, …, Xq)] E[h(Xq+1, …, Xk)]
In the simple case when k = 2,
E[XY] = E[X] E[Y]
if X and Y are independent.
Some Rules for Variance
Var(X) = E[(X − μX)²] = E[X²] − μX²
Tchebychev’s inequality
P[|X − μ| < kσ] ≥ 1 − 1/k²
Ex:
P[|X − μ| < 2σ] ≥ 3/4
P[|X − μ| < 3σ] ≥ 8/9
P[|X − μ| < 4σ] ≥ 15/16
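Tchebychev's inequality can be checked by simulation; this sketch (not from the slides) draws exponential samples, for which the bound is far from tight, and compares the observed proportion within kσ of the mean to the guaranteed lower bound 1 − 1/k².

```python
import random

random.seed(0)
lam = 1.0                       # exponential with mean 1/lam and sd 1/lam
mu, sigma = 1.0 / lam, 1.0 / lam
sample = [random.expovariate(lam) for _ in range(100_000)]

for k in (2, 3, 4):
    inside = sum(abs(x - mu) < k * sigma for x in sample) / len(sample)
    print(k, round(inside, 4), ">=", 1 - 1 / k**2)
```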
1. Var(X + Y) = Var(X) + Var(Y) + 2 Cov(X, Y)
where
Cov(X, Y) = E[(X − μX)(Y − μY)]
Note: If X and Y are independent, then
Cov(X, Y) = 0
and
Var(X + Y) = Var(X) + Var(Y)
The correlation coefficient ρXY
ρXY = Cov(X, Y) / √(Var(X) Var(Y)) = Cov(X, Y) / (σX σY)
Properties:
1. If X and Y are independent then ρXY = 0.
2. −1 ≤ ρXY ≤ 1,
and |ρXY| = 1 if there exist a and b such that
P[Y = bX + a] = 1,
where ρXY = +1 if b > 0 and ρXY = −1 if b < 0.
Some other properties of variance
2. Var(aX + bY) = a²Var(X) + b²Var(Y) + 2ab Cov(X, Y)
3. Var(a1X1 + ⋯ + anXn)
   = a1²Var(X1) + ⋯ + an²Var(Xn)
     + 2a1a2Cov(X1, X2) + ⋯ + 2a1anCov(X1, Xn)
     + 2a2a3Cov(X2, X3) + ⋯ + 2a2anCov(X2, Xn)
     + ⋯ + 2an−1anCov(Xn−1, Xn)
   = ∑_{i=1}^{n} ai²Var(Xi) + 2 ∑_{i<j} ai aj Cov(Xi, Xj)
4. (Multiplicative rule for independent random variables)
Suppose that X and Y are independent random variables;
then:
Var(XY) = Var(X)Var(Y) + μX²Var(Y) + μY²Var(X)
Mean and Variance of averages
Let X1, … , Xn be n mutually independent random
variables each having mean m and standard deviation s
(variance σ²).
Let X̄ = (1/n) ∑_{i=1}^{n} Xi
Then
μX̄ = E[X̄] = μ
and
σX̄² = Var[X̄] = σ²/n
The Law of Large Numbers
Let X1, … , Xn be n mutually independent random
variables each having mean m.
Let X̄ = (1/n) ∑_{i=1}^{n} Xi
Then for any δ > 0 (no matter how small)
P[|X̄ − μ| < δ] = P[μ − δ < X̄ < μ + δ] → 1 as n → ∞
Conditional Expectation:
Definition
Let X1, X2, …, Xq, Xq+1 …, Xk denote k continuous
random variables with joint probability density function
f(x1, x2, …, xq, xq+1 …, xk )
then the conditional joint probability function
of X1, X2, …, Xq
given Xq+1 = xq+1 , …, Xk = xk is
f1⋯q|q+1⋯k(x1, …, xq | xq+1, …, xk) = f(x1, …, xk) / fq+1⋯k(xq+1, …, xk)
Definition
Let U = h( X1, X2, …, Xq, Xq+1 …, Xk )
then the Conditional Expectation of U
given Xq+1 = xq+1 , …, Xk = xk is
E U xq 1 ,, xk  


  h  x , , x  f
1


k
1 q q 1 k
 x , , x
1
q

xq 1 ,, xk dx1  dxq
Note this will be a function of xq+1 , …, xk.
A very useful rule
Let (x1, x2, … , xq, y1, y2, … , ym) = (x, y) denote q + m
random variables.
Let U  g  x1 , , xq , y1 , , ym   g  x, y 
Then
E U   Ey  E U y  
Var U   Ey Var U y    Vary  E U y  
Functions of Random Variables
Methods for determining the distribution of
functions of Random Variables
1. Distribution function method
2. Moment generating function method
3. Transformation method
Distribution function method
Let X, Y, Z …. have joint density f(x,y,z, …)
Let W = h( X, Y, Z, …)
First step
Find the distribution function of W
G(w) = P[W ≤ w] = P[h( X, Y, Z, …) ≤ w]
Second step
Find the density function of W
g(w) = G'(w).
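As a sketch of the distribution function method (the worked case below is my own example, not from the slides), take X uniform on (0, 1) and W = X². Then G(w) = P[X² ≤ w] = P[X ≤ √w] = √w for 0 < w < 1, so g(w) = G′(w) = 1/(2√w). The code checks the first step by simulation and evaluates the resulting density at one point.

```python
import random

random.seed(2)
n = 200_000
w_values = [0.1, 0.25, 0.5, 0.9]

# Step 1: G(w) = P[W <= w] for W = X^2, X uniform on (0, 1).
sample_W = [random.random() ** 2 for _ in range(n)]
for w in w_values:
    G_hat = sum(x <= w for x in sample_W) / n
    print(w, round(G_hat, 3), "vs", round(w ** 0.5, 3))   # G(w) = sqrt(w)

# Step 2: g(w) = G'(w) = 1 / (2 sqrt(w)); for example g(0.25) = 1.0
print(1 / (2 * 0.25 ** 0.5))
```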
Use of moment generating
functions
1. Using the moment generating functions of
X, Y, Z, …determine the moment
generating function of W = h(X, Y, Z, …).
2. Identify the distribution of W from its
moment generating function
This procedure works well for sums, linear
combinations, averages etc.
Let x1, x2, … denote a sequence of independent
random variables
Sums
Let S = x1 + x2 + … + xn then
mS t   mx1  x2 
 xn
t  =mx t  mx t 
1
2
mxn t 
Linear Combinations
Let L = a1x1 + a2x2 + … + anxn then
mL  t   ma1x1 a2 x2 
 an xn
t  =mx  a1t  mx  a2t 
1
2
mxn  ant 
Arithmetic Means
Let x1, x2, … denote a sequence of independent
random variables coming from a distribution with
moment generating function m(t)
x1  x2 
Let x 
n
mx  t   m1
1
x1  x2 
n
n
 xn
, then
1  1 
t m t 
1 t   m 
 xn
n  n 
n
  t 
 m  
  n 
n
1 
m t 
n 
The Transformation Method
Theorem
Let X denote a random variable with
probability density function f(x) and U = h(X).
Assume that h(x) is either strictly increasing
(or decreasing) then the probability density of
U is:
g(u) = f(h⁻¹(u)) |dh⁻¹(u)/du| = f(x) |dx/du|
The Transformation Method
Theorem
(many variables)
Let x1, x2,…, xn denote random variables
with joint probability density function
f(x1, x2,…, xn )
Let u1 = h1(x1, x2,…, xn).
u2 = h2(x1, x2,…, xn).
un = hn(x1, x2,…, xn).
define an invertible transformation from the x’s to the u’s
Then the joint probability density function of
u1, u2,…, un is given by:
g  u1 ,
, un   f  x1 ,
 f  x1 ,
, xn 
d  x1 ,
d  u1 ,
, xn  J
, xn 
, un 
 dx1
 du
 1
d  x1 , , xn 
where J 
 det 
d  u1 , , un 

 dxn
 du1
Jacobian of the transformation
dx1 
dun 



dxn 
dun 
Some important results
Distribution of functions of random
variables
The method used to derive these results will be
indicated by:
1. DF - Distribution Function Method.
2. MGF - Moment generating function method
3. TF - Transformation method
Student’s t distribution
Let Z and U be two independent random variables with:
1. Z having a Standard Normal distribution
and
2. U having a χ² distribution with ν degrees of
freedom,
then the distribution of:
t = Z / √(U/ν)
is:
g(t) = K (1 + t²/ν)^(−(ν+1)/2)
where
K = Γ((ν + 1)/2) / (√(νπ) Γ(ν/2))
DF
The Chi-square distribution
Let Z1, Z2, … , Zν be ν independent random variables
having a Standard Normal distribution, then
U = ∑_{i=1}^{ν} Zi²
has a χ² distribution with ν degrees of freedom.
(Derived via DF for ν = 1, and via MGF for ν > 1.)
Distribution of the sample mean
Let x1, x2, …, xn denote a sample from the normal
distribution with mean m and variance s2.
then
x̄ = (∑_{i=1}^{n} xi)/n
has a Normal distribution with:
mean μx̄ = μ and standard deviation σx̄ = σ/√n
MGF
The Central Limit theorem
If x1, x2, …, xn is a sample from a distribution
with mean μ and standard deviation σ, then
if n is large, x̄ (the sample mean)
has a normal distribution with mean
μx̄ = μ
and variance
σx̄² = σ²/n
(standard deviation σx̄ = σ/√n)
MGF
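A simulation sketch of the central limit theorem (not from the slides): means of n uniform(0, 1) variables, which have μ = 1/2 and σ² = 1/12, should be approximately normal with variance 1/(12n); the code compares the simulated standard deviation of x̄ with σ/√n.

```python
import random
import statistics

random.seed(4)
n, reps = 50, 20_000

# Each entry is the mean of n independent uniform(0, 1) draws.
means = [statistics.fmean(random.random() for _ in range(n)) for _ in range(reps)]

print(round(statistics.fmean(means), 4))      # about 0.5  (mu)
print(round(statistics.stdev(means), 4))      # about sigma / sqrt(n)
print(round((1 / 12) ** 0.5 / n ** 0.5, 4))   # 0.0408
```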
Distribution of the sample variance
Let x1, x2, …, xn denote a sample from the normal
distribution with mean m and variance s2.
Let
x̄ = (∑_{i=1}^{n} xi)/n  and  s² = ∑_{i=1}^{n} (xi − x̄)²/(n − 1)
then
U = ∑_{i=1}^{n} (xi − x̄)²/σ² = (n − 1)s²/σ²
has a χ² distribution with ν = n − 1 degrees of freedom.
MGF
Distribution of sums of Gamma R. V.’s
Let X1, X2, … , Xn denote n independent random variables
each having a gamma distribution with parameters
(λ, αi), i = 1, 2, …, n.
Then W = X1 + X2 + … + Xn has a gamma distribution with
parameters (λ, α1 + α2 + … + αn).
MGF
Distribution of a multiple of a Gamma R. V.
Suppose that X is a random variable having a gamma
distribution with parameters (λ, α).
Then W = aX has a gamma distribution with parameters
(λ/a, α).
MGF
Distribution of sums of Binomial R. V.’s
Let X1, X2, … , Xk denote k independent random variables each
having a binomial distribution with parameters
(p,ni), i = 1, 2, …, k.
Then W = X1 + X2 + … + Xk has a binomial distribution with
parameters (p, n1 + n2 +… + nk).
MGF
Distribution of sums of Negative Binomial R. V.’s
Let X1, X2, … , Xn denote n independent random variables each
having a negative binomial distribution with parameters
(p,ki), i = 1, 2, …, n.
Then W = X1 + X2 + … + Xn has a negative binomial distribution
with parameters (p, k1 + k2 +… + kn).
MGF