RS – 4 - Jointly distributed RV (b)

Chapter 4: Jointly Distributed Random Variables

Multivariate distributions

Joint probability function
• For discrete RVs, the joint probability function is
  p(x, y) = P[X = x, Y = y]
• Marginal distributions:
  p_X(x) = Σ_y p(x, y),        p_Y(y) = Σ_x p(x, y)
• Conditional distributions:
  p_{Y|X}(y|x) = p(x, y)/p_X(x),        p_{X|Y}(x|y) = p(x, y)/p_Y(y)

Joint density function
• For continuous RVs, the joint density function f(x, y) satisfies
  P[(X, Y) ∈ A] = ∫∫_A f(x, y) dx dy
  (Note: for continuous RVs, f(x, y) is a density, not the probability P[X = x, Y = y], which is zero.)
• Marginal distributions:
  f_X(x) = ∫ f(x, y) dy,        f_Y(y) = ∫ f(x, y) dx
• Conditional distributions:
  f_{Y|X}(y|x) = f(x, y)/f_X(x),        f_{X|Y}(x|y) = f(x, y)/f_Y(y)

Independence
Definition: Two random variables X and Y are independent if
  p(x, y) = p_X(x) p_Y(y)   for all (x, y), if X and Y are discrete
  f(x, y) = f_X(x) f_Y(y)   for all (x, y), if X and Y are continuous
Note: under independence,
  p_{Y|X}(y|x) = p(x, y)/p_X(x) = p_X(x) p_Y(y)/p_X(x) = p_Y(y)
  p_{X|Y}(x|y) = p(x, y)/p_Y(y) = p_X(x) p_Y(y)/p_Y(y) = p_X(x)
Thus, in the case of independence, marginal distributions ≡ conditional distributions.

The Multiplicative Rule for densities
  p(x, y) = p_X(x) p_{Y|X}(y|x) = p_Y(y) p_{X|Y}(x|y)
          = p_X(x) p_Y(y)        if X and Y are independent
and, in the continuous case,
  f(x, y) = f_X(x) f_{Y|X}(y|x) = f_Y(y) f_{X|Y}(x|y)
          = f_X(x) f_Y(y)        if X and Y are independent
Proof: This follows from the definition of conditional densities:
  p_{Y|X}(y|x) = p(x, y)/p_X(x)   and   p_{X|Y}(x|y) = p(x, y)/p_Y(y)
Hence p(x, y) = p_X(x) p_{Y|X}(y|x) and p(x, y) = p_Y(y) p_{X|Y}(x|y).
The same is true for continuous random variables.

Finding CDFs (or pdfs) is difficult - Example
Suppose that a rectangle is constructed by first choosing its length, X, and then choosing its width, Y. The length X is selected from an exponential distribution with mean 1/λ = 5. Once the length has been chosen, the width Y is selected from a uniform distribution from 0 to half the length. Find the probability that the area A = XY is less than 4.
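Before doing the double integral, a quick Monte Carlo sketch (my own check, not part of the notes; the notes use R elsewhere, Python is used here) gives a target value for this probability:

```python
# Monte Carlo sketch of the rectangle problem (illustrative, not from the notes):
# X ~ Exponential with mean 5; given X = x, Y ~ Uniform(0, x/2); estimate P(XY < 4).
import random

random.seed(90)
n = 200_000
hits = 0
for _ in range(n):
    x = random.expovariate(1 / 5)   # rate lambda = 1/5, so mean = 5
    y = random.uniform(0, x / 2)    # width given the realized length
    if x * y < 4:
        hits += 1
p_hat = hits / n
print(p_hat)  # should land near 0.6; the exact decomposition is derived below
```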
Solution:
  f_X(x) = (1/5) e^{-x/5}        for x > 0
  f_{Y|X}(y|x) = 2/x             if 0 < y < x/2
so the joint density is
  f(x, y) = f_X(x) f_{Y|X}(y|x) = (1/5) e^{-x/5} (2/x) = (2/(5x)) e^{-x/5}   if 0 < y < x/2, x > 0

The region where xy < 4 is bounded by y = x/2 and the hyperbola xy = 4, which intersect where x(x/2) = 4, i.e., x² = 8, or x = 2√2 (with y = √2).
[Figure: the region below y = x/2 for x < 2√2 and below y = 4/x for x > 2√2, with intersection point (2√2, √2).]

P[XY < 4] = ∫_0^{2√2} ∫_0^{x/2} f(x, y) dy dx + ∫_{2√2}^∞ ∫_0^{4/x} f(x, y) dy dx

The first term:
  ∫_0^{2√2} ∫_0^{x/2} (2/(5x)) e^{-x/5} dy dx = ∫_0^{2√2} (2/(5x)) e^{-x/5} (x/2) dx = ∫_0^{2√2} (1/5) e^{-x/5} dx = 1 - e^{-2√2/5}
This part can be evaluated in closed form.

The second term:
  ∫_{2√2}^∞ ∫_0^{4/x} (2/(5x)) e^{-x/5} dy dx = ∫_{2√2}^∞ (2/(5x)) e^{-x/5} (4/x) dx = (8/5) ∫_{2√2}^∞ x^{-2} e^{-x/5} dx
This part may require numerical evaluation.

Functions of Random Variables

Methods for determining the distribution of functions of Random Variables
Given some random variable X, we want to study some function h(X). X is a transformation from (Ω, Σ, P) into the probability space (χ, A_X, P_X). Now, h(X) is a transformation from (χ, A_X, P_X) into the probability space (Υ, A_Y, P_Y). That is, we need to discover how the probability measure P_Y relates to the measure P_X.

For some F ∈ A_Y, we have that
  P_Y[F] = P_X{x ∈ χ : h(x) ∈ F} = P{ω ∈ Ω : h(X(ω)) ∈ F} = P{ω ∈ Ω : X(ω) ∈ h⁻¹(F)} = P_X[h⁻¹(F)]

With non-transformed variables, we step "backwards" from the values of X to the set of events in Ω. In the transformed case, we take two steps backwards: 1) once from the range of the transformation back to the values of X, and 2) again back to the set of events in Ω.

Potential problem: the transformation h(x) -- we need to work with h⁻¹(x) -- may not yield unique results if h(X) is not monotonic.

Methods to get P_Y:
1. Distribution function method
2. Transformation method
3. Moment generating function method

Method 1: Distribution function method
Let X, Y, Z, …
have joint density f(x, y, z, …), and let W = h(X, Y, Z, …).
First step: Find the distribution function of W:
  G(w) = P[W ≤ w] = P[h(X, Y, Z, …) ≤ w]
Second step: Find the density function of W:
  g(w) = G'(w)

Distribution function method: Example 1 - χ²
Let X have a normal distribution with mean 0 and variance 1 -i.e., a standard normal distribution:
  f(x) = (1/√(2π)) e^{-x²/2}
Let W = X². Find the distribution of W.

Example 1: Chi-square - χ²
First step: Find the distribution function of W:
  G(w) = P[W ≤ w] = P[X² ≤ w]
       = P[-√w ≤ X ≤ √w]                      if w ≥ 0
       = ∫_{-√w}^{√w} (1/√(2π)) e^{-x²/2} dx
       = F(√w) - F(-√w)
where F is the standard normal CDF, with F'(x) = f(x) = (1/√(2π)) e^{-x²/2}.

Second step: Find the density function of W, g(w) = G'(w):
  g(w) = f(√w) (d√w/dw) - f(-√w) (d(-√w)/dw)
       = f(√w) (1/2) w^{-1/2} + f(-√w) (1/2) w^{-1/2}
       = (1/√(2π)) w^{-1/2} e^{-w/2}           if w ≥ 0

Example 1: Chi-square - χ²
Thus, if X has a standard normal distribution, W = X² has density
  g(w) = (1/√(2π)) w^{-1/2} e^{-w/2}           if w ≥ 0
This is the Gamma distribution with α = ½ and λ = ½. It is also the χ² distribution with 1 degree of freedom.
• Using the additive properties of the gamma distribution, the sum of T independent χ²₁ RVs produces a χ²_T distributed RV. Equivalently, the sum of T independent squared N(0, 1) RVs produces a χ²_T distributed RV.
• Note: If we add T squared independent N(μᵢ, σᵢ²) RVs, standardized by their σᵢ, then Σᵢ (Xᵢ/σᵢ)² follows a non-central χ² distribution, with non-centrality parameter Σᵢ (μᵢ/σᵢ)². This distribution is common in the power analysis of statistical tests in which the null distribution is (asymptotically) a χ².

Distribution function method: Example 2
Suppose that X and Y are independent random variables, each having an exponential distribution with parameter λ (so E(X) = 1/λ):
  f₁(x) = λ e^{-λx}   for x > 0
  f₂(y) = λ e^{-λy}   for y > 0
  f(x, y) = f₁(x) f₂(y) = λ² e^{-λ(x+y)}   for x > 0, y > 0
Find the distribution of W = X + Y.
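A simulation sketch (mine, not from the notes; Python rather than the notes' R) previews the result derived next: the sum of two independent Exponential(λ) draws should have mean 2/λ and variance 2/λ², matching a Gamma distribution with α = 2:

```python
# Simulate W = X + Y for independent Exponential(lambda) X, Y (illustrative sketch).
import random
import statistics

random.seed(90)
lam = 3.0
w = [random.expovariate(lam) + random.expovariate(lam) for _ in range(100_000)]
print(statistics.mean(w), statistics.variance(w))  # near 2/3 and 2/9
```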
Example 2: Sum of Exponentials
First step: Find the distribution function of W = X + Y:
  G(w) = P[W ≤ w] = P[X + Y ≤ w]
       = ∫_0^w ∫_0^{w-x} f₁(x) f₂(y) dy dx
       = ∫_0^w ∫_0^{w-x} λ² e^{-λ(x+y)} dy dx
       = ∫_0^w λ e^{-λx} [ ∫_0^{w-x} λ e^{-λy} dy ] dx
       = ∫_0^w λ e^{-λx} [ -e^{-λy} ]_0^{w-x} dx
       = ∫_0^w λ e^{-λx} ( 1 - e^{-λ(w-x)} ) dx
       = ∫_0^w ( λ e^{-λx} - λ e^{-λw} ) dx
       = [ -e^{-λx} - λ x e^{-λw} ]_0^w
       = 1 - e^{-λw} - λ w e^{-λw}

Second step: Find the density function of W, g(w) = G'(w):
  g(w) = d/dw [ 1 - e^{-λw} - λ w e^{-λw} ]
       = λ e^{-λw} - λ e^{-λw} + λ² w e^{-λw}
       = λ² w e^{-λw}        for w > 0

Example 2: Sum of Exponentials
Hence, if X and Y are independent random variables, each having an exponential distribution with parameter λ, then W = X + Y has density
  g(w) = λ² w e^{-λw}        for w > 0
This distribution can be recognized to be the Gamma distribution with parameters α = 2 and λ.

Distribution function method: Example 3 - Student's t distribution
Let Z and U be two independent random variables with:
1. Z having a standard normal distribution -i.e., Z ~ N(0, 1):
   f(z) = (1/√(2π)) e^{-z²/2}
2.
U having a χ² distribution with ν degrees of freedom:
   h(u) = [1/(Γ(ν/2) 2^{ν/2})] u^{ν/2 - 1} e^{-u/2}        for u > 0
Find the distribution of
   t = Z / √(U/ν)

Example 3: Student-t Distribution
Therefore the joint density of Z and U is:
  f(z, u) = f(z) h(u) = [1/(√(2π) Γ(ν/2) 2^{ν/2})] u^{ν/2 - 1} e^{-(z² + u)/2}
The distribution function of T is:
  G(t) = P[T ≤ t] = P[Z/√(U/ν) ≤ t] = P[Z ≤ t √(U/ν)]
       = ∫_0^∞ ∫_{-∞}^{t√(u/ν)} f(z) h(u) dz du
Then:
  g(t) = G'(t) = ∫_0^∞ [ d/dt ∫_{-∞}^{t√(u/ν)} f(z) dz ] h(u) du
Using the fundamental theorem of calculus,
  d/dt ∫_{-∞}^{t√(u/ν)} f(z) dz = f(t√(u/ν)) √(u/ν)
so
  g(t) = ∫_0^∞ f(t√(u/ν)) √(u/ν) h(u) du
       = [1/(√(2πν) Γ(ν/2) 2^{ν/2})] ∫_0^∞ u^{(ν+1)/2 - 1} e^{-(u/2)(1 + t²/ν)} du
Using ∫_0^∞ x^{a-1} e^{-λx} dx = Γ(a)/λ^a with a = (ν+1)/2 and λ = (1 + t²/ν)/2:
  g(t) = [Γ((ν+1)/2) / (√(νπ) Γ(ν/2))] (1 + t²/ν)^{-(ν+1)/2}

Example 3: Student-t Distribution
or
  g(t) = K (1 + t²/ν)^{-(ν+1)/2}        where K = Γ((ν+1)/2) / (√(νπ) Γ(ν/2))
This is the Student's t distribution. It is denoted t ~ t_ν, where ν is referred to as the degrees of freedom (df). [William Gosset (1876 - 1937)]

[Figure: the t distribution has fatter tails than the standard normal distribution.]

Example of a t-distributed variable
Let
  z = √n (x̄ - μ)/σ ~ N(0, 1)
and let
  U = (n-1)s²/σ² ~ χ²_{n-1}
Assume that z and U are independent (later, we will learn how to check this assumption). Then,
  t = z / √(U/(n-1)) = [√n (x̄ - μ)/σ] / √{[(n-1)s²/σ²]/(n-1)} = √n (x̄ - μ)/s ~ t_{n-1}

Distribution function method: Example 4 - Distribution of the Max and Min Statistics
Let x₁, x₂, …, x_n denote a sample of size n from the density f(x). Find the distribution of M = max(xᵢ). Repeat this computation for m = min(xᵢ).
Assume that the density is the uniform density from 0 to θ. Thus,
  f(x) = 1/θ   for 0 ≤ x ≤ θ;   0 elsewhere
  F(x) = P[X ≤ x] = 0 for x < 0;   x/θ for 0 ≤ x ≤ θ;   1 for x > θ

Example 4: Distribution of Max & Min
Finding the distribution function of M.
G(t) = P[M ≤ t] = P[max xᵢ ≤ t] = P[x₁ ≤ t, …, x_n ≤ t] = P[x₁ ≤ t] ⋯ P[x_n ≤ t] = [F(t)]^n
     = 0 for t < 0;   (t/θ)^n for 0 ≤ t ≤ θ;   1 for t > θ

Example 4: Distribution of Max & Min
Differentiating, we find the density function of M:
  g(t) = G'(t) = n t^{n-1}/θ^n   for 0 ≤ t ≤ θ;   0 otherwise
[Figure: the uniform density f(x) and the density g(t) of the maximum, which piles up near θ.]

Finding the distribution function of m:
  G(t) = P[m ≤ t] = P[min xᵢ ≤ t] = 1 - P[x₁ > t, …, x_n > t] = 1 - P[x₁ > t] ⋯ P[x_n > t] = 1 - [1 - F(t)]^n
       = 0 for t < 0;   1 - (1 - t/θ)^n for 0 ≤ t ≤ θ;   1 for t > θ

Differentiating, we find the density function of m:
  g(t) = G'(t) = (n/θ)(1 - t/θ)^{n-1}   for 0 ≤ t ≤ θ;   0 otherwise
[Figure: the uniform density f(x) and the density g(t) of the minimum, which piles up near 0.]

The Probability Integral Transformation
Theorem: Let U ~ Uniform(0, 1) and let F be a CDF. Then a RV with CDF F can be generated as F⁻¹(U), where F⁻¹ is the inverse of F, if it exists (more generally, it is the quantile function for F).
This transformation allows one to convert observations that come from a uniform distribution from 0 to 1 (easy to generate using a standard function) to observations that come from an arbitrary pdf.
Let U denote an observation having a U(0, 1) distribution:
  g(u) = 1 for 0 ≤ u ≤ 1;   0 elsewhere
Let f(x) denote an arbitrary pdf and F(x) its corresponding CDF. Let X = F⁻¹(U); we want to find the distribution of X.

The Probability Integral Transformation
Now, we go from the u's -i.e., from (0, 1)- to the x's:
  G(x) = P[X ≤ x] = P[F⁻¹(U) ≤ x] = P[U ≤ F(x)] = F(x)        (because U is uniform on (0, 1))
Hence:
  g(x) = G'(x) = F'(x) = f(x)
Thus, if U ~ Uniform(0, 1), then X = F⁻¹(U) has density f(x).

• The goal of some estimation methods is to simulate an expectation, say E[h(Z)]. To do this, we need to simulate Z from its distribution.
The probability integral transformation is very handy for this task.

Example: Exponential distribution (X ~ Exponential(λ)).
Let U ~ Uniform(0, 1) and F(x) = 1 - exp(-λx).
  => X = -log(1 - U)/λ ~ F (exponential distribution)

Code in R
> set.seed(90)
> lambda <- 3
> U <- runif(500, 0, 1)
> XX <- -log(1 - U)/lambda
> mean(XX)
[1] 0.3127794

Example (continuation):
> hist(XX, main = "Histogram for Simulated Draws", xlab = "Freq", breaks = 20)

The Probability Integral Transformation
Example: If F is the standard normal CDF, F⁻¹ has no closed-form solution. Most computer programs have a routine to approximate F⁻¹ for the standard normal distribution. We can use this to simulate other distributions -for example, truncated normals.
Say we want to sample from X ~ N(μ, σ²) with truncation points a and b. Suppose we can evaluate the untruncated normal CDF F and its inverse F⁻¹. Then:
1. Calculate the endpoints p_a = F(a) and p_b = F(b).
2. Sample U ~ Uniform(p_a, p_b).
3. Set X = F⁻¹(U).
Steps (1) and (2) avoid calculating the normalizing constant for the truncated pdf -i.e., 1/[F(b) - F(a)].

Example: Code in R (a = -1, b = 5, μ = 0, σ = 1)
> set.seed(90)
> pa <- pnorm(-1)
> pb <- pnorm(5)
> U <- runif(500, pa, pb)
> XX <- qnorm(U)   # inverse normal with mu = 0, sd = 1
> mean(XX)
[1] 0.2416027
> hist(XX, main = "Histogram for Simulated Draws", xlab = "Observations", breaks = 20)

Method 2: The Transformation Method
Theorem: Let X denote a RV with pdf f(x) and let U = h(X). Assume that h(x) is either strictly increasing or strictly decreasing. Then the probability density of U is:
  g(u) = f(h⁻¹(u)) |d h⁻¹(u)/du|
Proof: Use the distribution function method.
Step 1: Find the distribution function, G(u).
Step 2: Differentiate G(u) to find the probability density function g(u).

Method 2: The Transformation Method
  G(u) = P[U ≤ u] = P[h(X) ≤ u]
       = P[X ≤ h⁻¹(u)]        if h is strictly increasing
       = P[X ≥ h⁻¹(u)]        if h is strictly decreasing
Thus,
  G(u) = F(h⁻¹(u))            if h is strictly increasing
  G(u) = 1 - F(h⁻¹(u))        if h is strictly decreasing
Differentiating,
  g(u) = G'(u) = f(h⁻¹(u)) (d h⁻¹(u)/du)        h strictly increasing
  g(u) = G'(u) = -f(h⁻¹(u)) (d h⁻¹(u)/du)       h strictly decreasing
Or, combining the two cases:
  g(u) = f(h⁻¹(u)) |d h⁻¹(u)/du|

Example 1: Log Normal Distribution
Suppose that X ~ N(μ, σ²). Find the distribution of U = h(X) = e^X.
Solution: Recall the transformation formula:
  g(u) = f(h⁻¹(u)) |d h⁻¹(u)/du|
Here
  f(x) = (1/(σ√(2π))) e^{-(x-μ)²/(2σ²)}
  h⁻¹(u) = ln u   and   d h⁻¹(u)/du = d(ln u)/du = 1/u

Example 1: Log Normal Distribution
Replacing in the transformation formula:
  g(u) = (1/(uσ√(2π))) e^{-(ln u - μ)²/(2σ²)}        for u > 0
Since the distribution of log(U) is normal, the distribution of U is called the log-normal distribution.

Note: It is easy to obtain the mean of a log-normal variable:
  E[U] = ∫_0^∞ u g(u) du = ∫_0^∞ (1/(σ√(2π))) e^{-(ln u - μ)²/(2σ²)} du
Substitution: y = (ln u - μ)/σ => u = e^{σy + μ}, and σ dy = (1/u) du => du = σ e^{σy + μ} dy. Then,
  E[U] = ∫_{-∞}^∞ (1/√(2π)) e^{-y²/2} e^{σy + μ} dy
       = e^μ ∫_{-∞}^∞ (1/√(2π)) e^{-(y² - 2σy)/2} dy
       = e^{μ + σ²/2} ∫_{-∞}^∞ (1/√(2π)) e^{-(y - σ)²/2} dy
       = e^{μ + σ²/2}

Example 1: Log Normal Distribution - Graph
In finance, a popular assumption is that stock prices follow a log-normal distribution. We have the following links between normal and log-normal variables.
                        Asset (Price)              (Return)
  Variable              S_T   (ln(S_T) normal)     (S_T - S_{T-1})/S_{T-1},  ln(S_T) - ln(S_{T-1})
  Distribution          Lognormal                  Normal
  First moment          exp(μ + 0.5σ²)             μ
  Second moment         exp(2μ + 2σ²)              σ²

Example 1: Log Normal Distribution - Graph
[Figure: log-normal density.]

Example 1: Log Normal Distribution
Example: We value a call option, c, by discounting the expected option payoff at expiration, using r as the discount rate, where the expectation is calculated in a risk-neutral world (under the "Q measure"):
  c = e^{-rT} E_Q[max(S_T - K, 0)]
• Under the Black-Scholes model, S is lognormally distributed, so ln(S) follows a normal distribution. Then,
  ln(S_T) ~ N(μ, σ²),   E[S_T] = e^{μ + σ²/2},   Var[S_T] = e^{2μ + σ²}(e^{σ²} - 1)
where μ is the expected return and σ is the volatility. Rates of return -i.e., ln(S_T) - ln(S_t)- are also normally distributed!
We want to value a call option, c, with a strike price K:
  c = e^{-rT} E_Q[max(S_T - K, 0)]

Example 1: Log Normal Distribution
Divide into two terms:
  c = e^{-rT} E_Q[S_T | S_T > K] - e^{-rT} E_Q[K | S_T > K]
Looking at the second part:
  E_Q[K | S_T > K] = K P[ln S_T > ln K]
                   = K P[(ln S_T - μ)/σ > (ln K - μ)/σ]
                   = K P[Z_T > (ln K - μ)/σ]
                   = K N(d₂)        with d₂ = (μ - ln K)/σ
The first part is more complicated and relies on the following result:
  If ln(S_T) ~ N(μ, σ²), then E[S_T | S_T > K] = E[S_T] N(σ - (ln K - μ)/σ).
Then,
  E_Q[S_T | S_T > K] = E_Q[S_T] N(σ - (ln K - μ)/σ) = E_Q[S_T] N(d₂ + σ) = E_Q[S_T] N(d₁) = e^{μ + 0.5σ²} N(d₁)

Example 2: Inverse Gamma Distribution
Suppose that X has a Gamma distribution with parameters α and λ:
  f(x; α, λ) = (λ^α/Γ(α)) x^{α-1} e^{-λx}        x > 0
Find the distribution of U = h(X) = 1/X.
Solution: Recall the transformation formula:
  g(u) = f(h⁻¹(u)) |d h⁻¹(u)/du|
  h⁻¹(u) = 1/u   and   d h⁻¹(u)/du = -1/u²

Example 2: Inverse Gamma Distribution
Replacing in the transformation formula:
  g(u) = (λ^α/Γ(α)) (1/u)^{α-1} e^{-λ/u} (1/u²) = (λ^α/Γ(α)) u^{-(α+1)} e^{-λ/u}        u > 0
Since the distribution of X is gamma, the distribution of U is called the inverse-gamma distribution.
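A simulation sketch (mine, not part of the notes; Python rather than the notes' R) previews the mean computed next: if X ~ Gamma(α, λ), draws of U = 1/X should average λ/(α - 1) for α > 1:

```python
# Inverse-gamma sketch: U = 1/X for X ~ Gamma(alpha, lambda); E[U] = lambda/(alpha - 1).
# Note: random.gammavariate(shape, scale) takes a SCALE parameter, so scale = 1/lambda.
import random
import statistics

random.seed(90)
alpha, lam = 4.0, 2.0
u = [1.0 / random.gammavariate(alpha, 1.0 / lam) for _ in range(200_000)]
print(statistics.mean(u))  # near lambda/(alpha - 1) = 2/3
```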
It’s straightforward to calculate its mean:
  E[U] = ∫_0^∞ u g(u) du = (λ^α/Γ(α)) ∫_0^∞ u^{-α} e^{-λ/u} du
Substituting v = λ/u (so u = λ/v and du = -(λ/v²) dv):
  E[U] = (λ^α/Γ(α)) ∫_0^∞ (v/λ)^α e^{-v} (λ/v²) dv = (λ/Γ(α)) ∫_0^∞ v^{α-2} e^{-v} dv = λ Γ(α-1)/Γ(α) = λ/(α-1)        for α > 1

Example 2: Inverse Gamma Distribution - Graph
[Figure: inverse-gamma densities.]
• In Bayesian econometrics, it is common to assume that σ² follows an inverse-gamma distribution (and σ⁻² follows a gamma!). Usual values for (α, λ): α = T/2 and λ = 1/(2η²) = Φ/2, where η² is related to the variance of the T N(0, η²) variables we are implicitly adding.

Method 3: Use of moment generating functions

Review: MGF - Definition
Let X denote a random variable with probability density function f(x) if continuous (probability mass function p(x) if discrete). Then the moment generating function of X is
  m_X(t) = E[e^{tX}] = ∫ e^{tx} f(x) dx        if X is continuous
         = Σ_x e^{tx} p(x)                      if X is discrete
The distribution of a random variable X is described by either:
1. The density function f(x) if X is continuous (probability mass function p(x) if X is discrete), or
2. The cumulative distribution function F(x), or
3. The moment generating function m_X(t).

Review: MGF - Properties
1. m_X(0) = 1
2. m_X^{(k)}(0) = k-th derivative of m_X(t) at t = 0 = E[X^k] = μ_k,
   where μ_k = ∫ x^k f(x) dx (X continuous) or Σ_x x^k p(x) (X discrete)
3. m_X(t) = 1 + μ₁ t + μ₂ t²/2! + μ₃ t³/3! + … + μ_k t^k/k! + …
4. Let X be a RV with MGF m_X(t), and let Y = bX + a. Then,
   m_Y(t) = m_{bX+a}(t) = E[e^{(bX+a)t}] = e^{at} m_X(bt)
5. Let X and Y be two independent random variables with moment generating functions m_X(t) and m_Y(t). Then
   m_{X+Y}(t) = m_X(t) m_Y(t)
6. Let X and Y be two random variables with moment generating functions m_X(t) and m_Y(t) and distribution functions F_X(x) and F_Y(y), respectively. If m_X(t) = m_Y(t), then F_X(x) = F_Y(x). This ensures that the distribution of a random variable can be identified by its moment generating function.
7.
m_{X+b}(t) = e^{bt} m_X(t)

Using moment generating functions to find the distribution of functions of Random Variables

Example: Sum of 2 independent exponentials
Suppose that X and Y are two independent exponential random variables with pdfs
  f(x) = λ e^{-λx},   f(y) = λ e^{-λy}
Solution:
  m_X(t) = λ/(λ - t) = (1 - t/λ)⁻¹
  m_Y(t) = λ/(λ - t) = (1 - t/λ)⁻¹
  m_{X+Y}(t) = m_X(t) m_Y(t) = (1 - t/λ)⁻¹ (1 - t/λ)⁻¹ = (1 - t/λ)⁻²
= the MGF of the gamma distribution with α = 2.

Example: Affine Transformation of a Normal
• Suppose that X ~ N(μ, σ²). What is the distribution of Y = aX + b?
• Solution:
  m_X(t) = e^{μt + σ²t²/2}
  m_{aX+b}(t) = e^{bt} m_X(at) = e^{bt} e^{μat + σ²a²t²/2} = e^{(aμ + b)t + a²σ²t²/2}
= the MGF of the normal distribution with mean aμ + b and variance a²σ².
Thus, Y = aX + b ~ N(aμ + b, a²σ²).

Example: Affine Transformation of a Normal
Special case: the z transformation
  Z = (X - μ)/σ = aX + b        with a = 1/σ, b = -μ/σ
  μ_Z = aμ + b = μ/σ - μ/σ = 0
  σ_Z² = a²σ² = σ²/σ² = 1
Thus Z has a standard normal distribution.

Example 1: Distribution of X + Y
Suppose that X and Y are independent, with X ~ N(μ_X, σ_X²) and Y ~ N(μ_Y, σ_Y²). Find the distribution of S = X + Y.
Solution:
  m_X(t) = e^{μ_X t + σ_X² t²/2},   m_Y(t) = e^{μ_Y t + σ_Y² t²/2}
  m_{X+Y}(t) = m_X(t) m_Y(t) = e^{(μ_X + μ_Y)t + (σ_X² + σ_Y²)t²/2}
Thus, S = X + Y ~ N(μ_X + μ_Y, σ_X² + σ_Y²).

Example 2: Distribution of aX + bY
Suppose that X and Y are independent, with X ~ N(μ_X, σ_X²) and Y ~ N(μ_Y, σ_Y²). Find the distribution of L = aX + bY.
Solution:
  m_{aX+bY}(t) = m_{aX}(t) m_{bY}(t) = m_X(at) m_Y(bt)
              = e^{μ_X at + σ_X² a²t²/2} e^{μ_Y bt + σ_Y² b²t²/2}
              = e^{(aμ_X + bμ_Y)t + (a²σ_X² + b²σ_Y²)t²/2}
Thus, L = aX + bY ~ N(aμ_X + bμ_Y, a²σ_X² + b²σ_Y²).

Example 2: Distribution of aX + bY
Special Case: a = +1 and b = -1.
Thus, X - Y has a normal distribution with mean μ_X - μ_Y and variance
  (1)²σ_X² + (-1)²σ_Y² = σ_X² + σ_Y²

Example 3: (Extension to n independent RVs)
Suppose that X₁, X₂, …, X_n are independent, each having a normal distribution with mean μᵢ and standard deviation σᵢ (for i = 1, 2, …, n). Find the distribution of L = a₁X₁ + a₂X₂ + … + a_nX_n.
Solution:
  m_{Xᵢ}(t) = e^{μᵢt + σᵢ²t²/2}        (for i = 1, 2, …, n)
Now,
  m_{a₁X₁ + … + a_nX_n}(t) = m_{a₁X₁}(t) ⋯ m_{a_nX_n}(t) = m_{X₁}(a₁t) ⋯ m_{X_n}(a_nt)
    = e^{μ₁a₁t + σ₁²a₁²t²/2} ⋯ e^{μ_na_nt + σ_n²a_n²t²/2}
    = e^{(a₁μ₁ + … + a_nμ_n)t + (a₁²σ₁² + … + a_n²σ_n²)t²/2}
= the MGF of the normal distribution with mean a₁μ₁ + … + a_nμ_n and variance a₁²σ₁² + … + a_n²σ_n².
Thus, L = a₁X₁ + … + a_nX_n has a normal distribution with mean a₁μ₁ + … + a_nμ_n and variance a₁²σ₁² + … + a_n²σ_n².

Example 3: (Extension to n independent RVs)
Special case:
  a₁ = a₂ = … = a_n = 1/n,   μ₁ = μ₂ = … = μ_n = μ,   σ₁² = σ₂² = … = σ_n² = σ²
In this case X₁, X₂, …, X_n is a sample from a normal distribution with mean μ and standard deviation σ, and
  L = (1/n)(X₁ + X₂ + … + X_n) = X̄, the sample mean.
Thus X̄ = (1/n)x₁ + … + (1/n)x_n has a normal distribution with mean
  μ_X̄ = a₁μ₁ + … + a_nμ_n = (1/n)μ + … + (1/n)μ = μ
and variance
  σ_X̄² = a₁²σ₁² + … + a_n²σ_n² = (1/n)²σ² + … + (1/n)²σ² = n (1/n²) σ² = σ²/n

Summary: Let x₁, x₂, …, x_n be a sample from a normal distribution with mean μ and standard deviation σ. Then
  X̄ ~ N(μ, σ²/n)

Example 3: (Extension to n independent RVs)
[Figure: the distribution of X̄ is centered at the population mean and much more concentrated than the population distribution.]

Application: Simple Regression Framework
Let the random variable Yᵢ be determined by:
  Yᵢ = α + βXᵢ + εᵢ,   i = 1, 2, ..., n
where the Xᵢ's are (exogenous or pre-determined) numbers, and α and β are constants. Let the random variable εᵢ follow a normal distribution with zero mean and constant variance. That is,
  εᵢ ~ N(0, σ²),   i = 1, 2, ..., n
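The sample-mean summary above, X̄ ~ N(μ, σ²/n), can be checked with a simulation sketch (mine, not part of the notes; Python rather than the notes' R):

```python
# Check that the sample mean of n normal draws has mean mu and variance sigma^2/n.
import random
import statistics

random.seed(90)
mu, sigma, n, reps = 40.0, 5.0, 25, 20_000
means = [
    statistics.mean(random.gauss(mu, sigma) for _ in range(n))
    for _ in range(reps)
]
print(statistics.mean(means), statistics.variance(means))  # near 40 and 25/25 = 1
```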
Then,
  E(Yᵢ) = E(α + βXᵢ + εᵢ) = α + βXᵢ
  Var(Yᵢ) = Var(α + βXᵢ + εᵢ) = Var(εᵢ) = σ²
Since Yᵢ is a linear function of a normal RV, Yᵢ ~ N(α + βXᵢ, σ²).

Central and Non-central Distributions
• Noncentrality parameters are parameters of families of probability distributions which are related to other, central families of distributions. If the noncentrality parameter of a distribution is zero, the distribution is identical to a distribution in the central family. For example, the standard Student's t-distribution is the central family of distributions for the noncentral t-distribution family.
• Noncentrality parameters are used in the following distributions:
  - t-distribution
  - F-distribution
  - χ² distribution
  - χ distribution
  - Beta distribution

Central and Non-central Distributions
• In general, noncentrality parameters occur in distributions that are transformations of a normal distribution. The central versions are derived from zero-mean normal distributions; the noncentral versions generalize to arbitrary means.
Example: The standard (central) χ² distribution is the distribution of a sum of squared independent N(0, 1) RVs. The noncentral χ² distribution generalizes this to N(μ, σ²) RVs.
• There are extended versions of these distributions with two noncentrality parameters: the doubly noncentral beta distribution, the doubly noncentral F-distribution and the doubly noncentral t-distribution. This happens for distributions that are defined as the quotient of two independent distributions.
• When both source distributions are central, the result is a central distribution. When one is noncentral, a (singly) noncentral distribution results, while if both are noncentral, the result is a doubly noncentral distribution.
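As a sketch (mine, not from the notes; Python rather than the notes' R), a noncentral χ² can be simulated directly from its definition as a sum of squared normals with nonzero means:

```python
# Noncentral chi-square sketch: sum of squared independent N(mu_i, 1) draws.
# Its mean is k + lambda, where lambda = sum(mu_i^2) is the noncentrality parameter.
# The mu_i values below are hypothetical, chosen only for illustration.
import random
import statistics

random.seed(90)
mus = [1.0, 2.0, 0.5]                 # hypothetical means mu_i (each sigma_i = 1)
lam = sum(m * m for m in mus)         # noncentrality parameter = 5.25
k = len(mus)                          # degrees of freedom = 3
draws = [sum(random.gauss(m, 1) ** 2 for m in mus) for _ in range(100_000)]
print(statistics.mean(draws))         # near k + lam = 8.25
```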
Example: A t-distribution is defined (ignoring constants) as the ratio of a N(0, 1) RV and the square root of an independent χ² RV. Both RVs can have noncentrality parameters. This produces a doubly noncentral t-distribution.

The Central Limit Theorem (CLT): Preliminaries
The proof of the CLT is very simple using moment generating functions. We will rely on the following result:
Let m₁(t), m₂(t), … be a sequence of moment generating functions corresponding to the sequence of distribution functions F₁(x), F₂(x), …. Let m(t) be a moment generating function corresponding to the distribution function F(x). Then, if
  lim_{i→∞} mᵢ(t) = m(t)   for all t in an interval about 0,
then
  lim_{i→∞} Fᵢ(x) = F(x)   for all x.

The Central Limit Theorem (CLT)
Let x₁, x₂, …, x_n be a sequence of independent and identically distributed RVs with finite mean μ and finite variance σ². Then, as n increases, x̄, the sample mean, approaches the normal distribution with mean μ and variance σ²/n.
This theorem is sometimes stated as
  √n (x̄ - μ)/σ →d N(0, 1)
where →d means "the limiting distribution (asymptotic distribution) is" (convergence in distribution).
Note: This CLT is sometimes referred to as the Lindeberg-Lévy CLT.

The Central Limit Theorem (CLT)
Proof: (using moment generating functions)
Let x₁, x₂, … be a sequence of independent random variables from a distribution with moment generating function m(t) and CDF F(x).
Let S_n = x₁ + x₂ + … + x_n. Then
  m_{S_n}(t) = m_{x₁}(t) m_{x₂}(t) ⋯ m_{x_n}(t) = [m(t)]^n
Now x̄ = S_n/n, so
  m_{x̄}(t) = m_{S_n/n}(t) = m_{S_n}(t/n) = [m(t/n)]^n

The Central Limit Theorem (CLT)
Let z = √n (x̄ - μ)/σ. Then
  m_z(t) = e^{-√n μt/σ} m_{x̄}(√n t/σ) = e^{-√n μt/σ} [m(t/(√n σ))]^n
and
  ln m_z(t) = -√n μt/σ + n ln m(t/(√n σ))
Let u = t/(√n σ), so that n = t²/(σ²u²) and √n/σ = t/(σ²u). Then
  ln m_z(t) = -μt²/(σ²u) + (t²/(σ²u²)) ln m(u) = (t²/σ²) [ln m(u) - μu]/u²

The Central Limit Theorem (CLT)
As n → ∞, u → 0, so
  lim_{n→∞} ln m_z(t) = (t²/σ²) lim_{u→0} [ln m(u) - μu]/u²
                      = (t²/σ²) lim_{u→0} [m'(u)/m(u) - μ]/(2u)        (L'Hôpital's rule)
                      = (t²/σ²) lim_{u→0} [m''(u) m(u) - (m'(u))²] / (2 m(u)²)        (L'Hôpital's rule again)

The Central Limit Theorem (CLT)
At u = 0: m(0) = 1, m'(0) = E[xᵢ] = μ, and m''(0) = E[xᵢ²] = μ² + σ². Thus
  lim_{n→∞} ln m_z(t) = (t²/σ²) [(μ² + σ²)(1) - μ²]/2 = t²/2
and
  lim_{n→∞} m_z(t) = e^{t²/2}
• This is the moment generating function of the standard normal distribution.
• Thus, the limiting distribution of z is the standard normal distribution: i.e.,
  lim_{n→∞} F_z(x) = ∫_{-∞}^x (1/√(2π)) e^{-u²/2} du.        Q.E.D.

CLT: Asymptotic Distribution
The CLT states that the asymptotic distribution of the sample mean is a normal distribution.
Q: What is meant by "asymptotic distribution"?
In general, the "asymptotic distribution of X_n" means a non-constant RV X, along with real-number sequences {a_n} and {b_n}, such that a_n(X_n - b_n) has as limiting distribution the distribution of X. In this case, the distribution of X might be referred to as the asymptotic or limiting distribution of either X_n or of a_n(X_n - b_n), depending on the context.
The real-number sequence {a_n} plays the role of a stabilizing transformation, making sure the transformed RV -i.e., a_n(X_n - b_n)- does not have zero variance as n → ∞.
Example: x̄ has zero variance as n → ∞. But n^{1/2} x̄ has a finite variance, σ².

CLT: Remarks
• The CLT gives only an asymptotic distribution.
As an approximation for a finite number of observations, it provides a reasonable approximation only when the observations are close to the mean; it requires a very large number of observations to stretch it into the tails.
• The CLT also applies to sequences that are not identically distributed, provided extra conditions are imposed on the RVs.
• Lindeberg found a condition on the sequence {X_n} which guarantees that the distribution of S_n is asymptotically normally distributed. W. Feller showed that Lindeberg's condition is necessary as well (if the condition does not hold, then the sum S_n is not asymptotically normally distributed).
• A sufficient condition that is stronger (but easier to state) than Lindeberg's condition is that there exists a constant A such that |X_n| < A for all n.

Jarl W. Lindeberg (1876 – 1932)    Vilibald S. (Willy) Feller (1906 – 1970)    Paul Pierre Lévy (1886 – 1971)
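A closing illustration (my own sketch, not part of the notes; Python rather than the notes' R): standardizing the mean of n Exponential(1) draws (for which μ = σ = 1) gives z-values whose mean and standard deviation approach 0 and 1, as the Lindeberg-Lévy CLT predicts.

```python
# CLT sketch: z = sqrt(n) * (xbar - mu) / sigma for exponential draws (mu = sigma = 1).
import math
import random
import statistics

random.seed(90)
n, reps = 50, 20_000
zs = []
for _ in range(reps):
    xbar = statistics.mean(random.expovariate(1.0) for _ in range(n))
    zs.append(math.sqrt(n) * (xbar - 1.0) / 1.0)
print(statistics.mean(zs), statistics.stdev(zs))  # near 0 and 1
```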