Short Guides to Microeconometrics
Fall 2016
Kurt Schmidheiny, Universität Basel
Elements of Probability Theory
Version: 22-9-2016, 09:06

Contents

1 Random Variables and Distributions
  1.1 Univariate Random Variables and Distributions
  1.2 Bivariate Random Variables and Distributions
2 Moments
  2.1 Expected Value or Mean
  2.2 Variance and Standard Deviation
  2.3 Higher Order Moments
  2.4 Covariance and Correlation
  2.5 Conditional Expectation and Variance
3 Random Vectors and Random Matrices
4 Important Distributions
  4.1 Univariate Normal Distribution
  4.2 Bivariate Normal Distribution
  4.3 Multivariate Normal Distribution

1 Random Variables and Distributions

A random variable is a variable whose values are determined by a probability distribution. This is a casual way of defining random variables, but it is sufficient for our level of analysis. In more advanced probability theory, a random variable is defined as a real-valued function over some probability space.

In sections 1 to 3, a random variable is denoted by a capital letter, e.g. X, whereas its realizations are denoted by a small letter, e.g. x.

1.1 Univariate Random Variables and Distributions

A univariate discrete random variable is a variable that takes a countable number K of real values with certain probabilities; K may be infinite. The probability that the random variable X takes the value x_k among the K possible realizations is given by the probability distribution

  P(X = x_k) = P(x_k) = p_k   with k = 1, 2, ..., K.

This can also be written as

  P(x_k) = p_1 if X = x_1,
           p_2 if X = x_2,
           ...
           p_K if X = x_K.

Note that Σ_{k=1}^{K} p_k = 1.
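As a quick numerical check of these definitions, here is a minimal Python sketch; the fair six-sided die and its probabilities are assumed purely for illustration:

```python
import math

# Hypothetical discrete distribution: a fair six-sided die, K = 6,
# with P(X = x_k) = p_k = 1/6 for x_k = 1, ..., 6.
p = {x: 1.0 / 6.0 for x in range(1, 7)}

assert all(pk >= 0.0 for pk in p.values())   # every p_k is nonnegative
assert math.isclose(sum(p.values()), 1.0)    # Σ_k p_k = 1
```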
A univariate continuous random variable is a variable that takes a continuum of values on the real line. The distribution of a continuous random variable X can be characterized by a density function or probability density function (pdf) f(x). The nonnegative function f(x) is such that

  P(x_1 ≤ X ≤ x_2) = ∫_{x_1}^{x_2} f(x) dx

defines the probability that X takes a value in the interval [x_1, x_2]. Note that there is no chance that X takes exactly the value x: P(X = x) = 0. The probability that X takes any value on the real line is

  ∫_{-∞}^{∞} f(x) dx = 1.

The distribution of a univariate random variable X is alternatively described by the cumulative distribution function (cdf)

  F(x) = P(X < x).

The cdf of a discrete random variable X is

  F(x) = Σ_{x_k < x} P(X = x_k) = Σ_{x_k < x} p_k

and of a continuous random variable X

  F(x) = ∫_{-∞}^{x} f(t) dt   such that   P(x_1 ≤ X ≤ x_2) = ∫_{x_1}^{x_2} f(x) dx.

F(x) has the following properties:

• F(x) is monotonically nondecreasing
• F(-∞) = 0 and F(∞) = 1
• F(x) is continuous to the left

1.2 Bivariate Random Variables and Distributions

A bivariate continuous random variable is a variable that takes a continuum of values in the plane. The distribution of a bivariate continuous random variable (X, Y) can be characterized by a joint density function or joint probability density function, f(x, y). The nonnegative function f(x, y) is such that

  P(x_1 ≤ X ≤ x_2, y_1 ≤ Y ≤ y_2) = ∫_{x_1}^{x_2} ∫_{y_1}^{y_2} f(x, y) dy dx

defines the probability that X and Y take values in the intervals [x_1, x_2] and [y_1, y_2], respectively. Note that

  ∫_{-∞}^{∞} ∫_{-∞}^{∞} f(x, y) dy dx = 1.

The marginal density function or marginal probability density function is given by

  f(x) = ∫_{-∞}^{∞} f(x, y) dy

such that

  P(x_1 ≤ X ≤ x_2) = P(x_1 ≤ X ≤ x_2, -∞ ≤ Y ≤ ∞) = ∫_{x_1}^{x_2} f(x) dx.

The conditional density function or conditional probability density function with respect to the event {X = x} is given by

  f(y|x) = f(x, y) / f(x)

provided that f(x) > 0. Note that

  ∫_{-∞}^{∞} f(y|x) dy = 1.
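The link between a univariate pdf and its cdf can be checked by numerical integration. A minimal sketch, assuming an exponential density with rate λ = 2 (any valid pdf would do):

```python
import math

# Assumed example: the exponential density f(x) = lam * exp(-lam * x)
# for x >= 0, with lam = 2 chosen arbitrarily.
lam = 2.0

def pdf(x):
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def cdf(x, steps=100_000):
    # F(x) = P(X < x): midpoint-rule approximation of the integral of f
    if x <= 0:
        return 0.0
    h = x / steps
    return sum(pdf((i + 0.5) * h) for i in range(steps)) * h

assert cdf(1.0) <= cdf(2.0)                            # F is nondecreasing
assert math.isclose(cdf(50.0), 1.0, rel_tol=1e-4)      # F(x) -> 1
assert math.isclose(cdf(1.0), 1.0 - math.exp(-lam),    # closed form
                    rel_tol=1e-4)
```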
Two random variables X and Y are called independent if and only if

  f(x, y) = f(x) · f(y).

If X and Y are independent, then:

• f(y|x) = f(y)
• P(x_1 ≤ X ≤ x_2, y_1 ≤ Y ≤ y_2) = P(x_1 ≤ X ≤ x_2) · P(y_1 ≤ Y ≤ y_2)

More generally, if a finite set of n continuous random variables X_1, X_2, X_3, ..., X_n are mutually independent, then

  f(x_1, x_2, x_3, ..., x_n) = f(x_1) · f(x_2) · f(x_3) · ... · f(x_n).

2 Moments

2.1 Expected Value or Mean

The expected value or mean of a discrete random variable with probability distribution P(x_k), k = 1, 2, ..., K, is defined as

  E[X] = Σ_{k=1}^{K} x_k P(x_k)

if the series converges absolutely. The expected value or mean of a continuous univariate random variable with density function f(x) is defined as

  E[X] = ∫_{-∞}^{∞} x f(x) dx

if the integral exists.
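Independence and its consequences can be verified on a small discrete example. A sketch with assumed Bernoulli marginals (P(X = 1) = 0.5 and P(Y = 1) = 0.7 are made up for illustration):

```python
import math
from itertools import product

# Assumed marginals; under independence the joint pmf factorizes,
# f(x, y) = f(x) * f(y).
px = {0: 0.5, 1: 0.5}
py = {0: 0.3, 1: 0.7}
pxy = {(x, y): px[x] * py[y] for x, y in product(px, py)}

def mean(pmf):
    return sum(v * pr for v, pr in pmf.items())

# The conditional pmf equals the marginal: f(y|x) = f(y).
cond = {y: pxy[(1, y)] / px[1] for y in py}
assert all(math.isclose(cond[y], py[y]) for y in py)

# And E[XY] = E[X] E[Y] (this rule appears in the list of
# expectation rules below).
exy = sum(x * y * pr for (x, y), pr in pxy.items())
assert math.isclose(sum(pxy.values()), 1.0)
assert math.isclose(exy, mean(px) * mean(py))
```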
For a random variable Z which is a continuous function φ of a discrete random variable X, we have

  E[Z] = E[φ(X)] = Σ_{k=1}^{K} φ(x_k) P(x_k).

For a random variable Z which is a continuous function φ of the continuous random variable X, or of the continuous random variables X and Y, we have

  E[Z] = E[φ(X)] = ∫_{-∞}^{∞} φ(x) f(x) dx

  E[Z] = E[φ(X, Y)] = ∫_{-∞}^{∞} ∫_{-∞}^{∞} φ(x, y) f(x, y) dx dy.

The following rules hold in general, i.e. for discrete, continuous and mixed types of random variables:

• E[α] = α
• E[αX + βY] = α E[X] + β E[Y]
• E[Σ_{i=1}^{n} X_i] = Σ_{i=1}^{n} E[X_i]
• E[XY] = E[X] E[Y] if X and Y are independent

where α ∈ R and β ∈ R are constants.

2.2 Variance and Standard Deviation

The variance of a univariate random variable X is defined as

  V[X] = E[(X - E[X])²] = E[X²] - (E[X])².

The variance has the following properties:

• V[X] ≥ 0
• V[X] = 0 if and only if X = E[X] (with probability one)

The following rules hold in general, i.e. for discrete, continuous and mixed types of random variables:

• V[αX + β] = α² V[X]
• V[X + Y] = V[X] + V[Y] + 2 Cov[X, Y]
• V[X - Y] = V[X] + V[Y] - 2 Cov[X, Y]
• V[Σ_{i=1}^{n} X_i] = Σ_{i=1}^{n} V[X_i] if X_i and X_j are independent for all i ≠ j

where α ∈ R and β ∈ R are constants. Instead of the variance, one often considers the standard deviation

  σ_X = √(V[X]).

2.3 Higher Order Moments

The j-th central moment (the j-th moment around the mean) is defined as E[(X - E[X])^j].

2.4 Covariance and Correlation

The covariance between two random variables X and Y is defined as

  Cov[X, Y] = E[(X - E[X])(Y - E[Y])]
            = E[XY] - E[X] E[Y]
            = E[(X - E[X]) Y]
            = E[X (Y - E[Y])].

The following rules hold in general, i.e. for discrete, continuous and mixed types of random variables:

• Cov[αX + γ, βY + µ] = αβ Cov[X, Y]
• Cov[X_1 + X_2, Y_1 + Y_2] = Cov[X_1, Y_1] + Cov[X_1, Y_2] + Cov[X_2, Y_1] + Cov[X_2, Y_2]
• Cov[X, Y] = 0 if X and Y are independent

where α ∈ R, β ∈ R, γ ∈ R and µ ∈ R are constants.

The correlation coefficient between two random variables X and Y is defined as

  ρ_{X,Y} = Cov[X, Y] / (σ_X σ_Y)

where σ_X and σ_Y denote the corresponding standard deviations. The correlation coefficient has the following property:

• -1 ≤ ρ_{X,Y} ≤ 1

The following rules hold:

• ρ_{αX+γ, βY+µ} = ρ_{X,Y} if αβ > 0
• ρ_{X,Y} = 0 if X and Y are independent

where α ∈ R, β ∈ R, γ ∈ R and µ ∈ R are constants. We say that

• X and Y are uncorrelated if ρ = 0
• X and Y are positively correlated if ρ > 0
• X and Y are negatively correlated if ρ < 0
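The variance and standard deviation rules above can be checked numerically. A sketch on an assumed three-point distribution (the values and probabilities are made up):

```python
import math

# Assumed discrete distribution P(X = x_k) = p_k.
p = {0: 0.2, 1: 0.5, 4: 0.3}

def mean(pmf):
    return sum(x * pr for x, pr in pmf.items())

def var(pmf):
    # V[X] = E[(X - E[X])^2]
    m = mean(pmf)
    return sum((x - m) ** 2 * pr for x, pr in pmf.items())

a, b = 3.0, -2.0
p_transformed = {a * x + b: pr for x, pr in p.items()}  # distribution of aX + b

# V[aX + b] = a^2 V[X], and sigma scales with |a|.
assert math.isclose(var(p_transformed), a ** 2 * var(p))
assert math.isclose(math.sqrt(var(p_transformed)), abs(a) * math.sqrt(var(p)))
```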
2.5 Conditional Expectation and Variance

Let (X, Y) be a bivariate discrete random variable and P(y_k|X) the conditional probability of Y = y_k given X. Then the conditional expected value or conditional mean of Y given X is

  E[Y|X] = E_{Y|X}[Y] = Σ_{k=1}^{K} y_k P(y_k|X).

Let (X, Y) be a bivariate continuous random variable and f(y|x) the conditional density of Y given X. Then the conditional expected value or conditional mean of Y given X is

  E[Y|X] = E_{Y|X}[Y] = ∫_{-∞}^{∞} y f(y|X) dy.

The law of iterated means or law of iterated expectations holds in general, i.e. for discrete, continuous or mixed random variables:

  E_X[E[Y|X]] = E[Y].

The conditional variance of Y given X is given by

  V[Y|X] = E[(Y - E[Y|X])² | X] = E[Y² | X] - (E[Y|X])².

The law of total variance is

  V[Y] = E_X[V[Y|X]] + V_X[E[Y|X]].
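Both laws can be verified on a small example. A sketch with an assumed joint pmf (the four probabilities are made up):

```python
import math

# Assumed bivariate discrete distribution P(X = x, Y = y).
pxy = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.4}

px = {x: sum(pr for (x2, _), pr in pxy.items() if x2 == x) for x in (0, 1)}
Ey = sum(y * pr for (_, y), pr in pxy.items())

def E_y_given_x(x):
    # E[Y | X = x] = sum_y y P(y|x), with P(y|x) = P(x, y) / P(x)
    return sum(y * pr / px[x] for (x2, y), pr in pxy.items() if x2 == x)

# Law of iterated expectations: E_X[ E[Y|X] ] = E[Y].
Ex_EyX = sum(E_y_given_x(x) * px[x] for x in px)
assert math.isclose(Ex_EyX, Ey)

# Law of total variance: V[Y] = E_X[ V[Y|X] ] + V_X[ E[Y|X] ].
Vy = sum((y - Ey) ** 2 * pr for (_, y), pr in pxy.items())
EV = sum(
    sum((y - E_y_given_x(x)) ** 2 * pr / px[x]
        for (x2, y), pr in pxy.items() if x2 == x) * px[x]
    for x in px
)
VE = sum((E_y_given_x(x) - Ey) ** 2 * px[x] for x in px)
assert math.isclose(Vy, EV + VE)
```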
3 Random Vectors and Random Matrices

In this section we denote matrices (random or non-random) by bold capital letters, e.g. X, and vectors by small letters, e.g. x.

Let x = (x_1, ..., x_n)' be an (n × 1)-dimensional vector such that each element x_i is a random variable. Let X be an (n × k)-dimensional matrix such that each element x_ij is a random variable. Let a = (a_1, ..., a_n)' be an (n × 1)-dimensional vector of constants and A an (m × n) matrix of constants.

The expectation of a random vector, E[x], and of a random matrix, E[X], summarize the expected values of their elements:

  E[x] = (E[x_1], E[x_2], ..., E[x_n])'

and E[X] is the (n × k) matrix with typical element E[x_ij].

The following rules hold:

• E[a'x] = a' E[x]
• E[Ax] = A E[x]
• E[AX] = A E[X]
• E[tr(X)] = tr(E[X]) for X a square matrix

The variance-covariance matrix of a random vector, V[x], summarizes all variances and covariances of its elements:

  V[x] = E[(x - E[x])(x - E[x])'] = E[xx'] - E[x](E[x])'

       = [ V[x_1]         Cov[x_1, x_2]  ...  Cov[x_1, x_n] ]
         [ Cov[x_2, x_1]  V[x_2]         ...  Cov[x_2, x_n] ]
         [ ...            ...            ...  ...           ]
         [ Cov[x_n, x_1]  Cov[x_n, x_2]  ...  V[x_n]        ]

The following rules hold:

• V[a'x] = a' V[x] a
• V[Ax] = A V[x] A'

where the (m × n)-dimensional matrix A with m ≤ n has full row rank.

If the variance-covariance matrix V[x] is positive definite (p.d.), then all random elements and all linear combinations of its random elements have strictly positive variance:

  V[a'x] = a' V[x] a > 0 for all a ≠ 0.

4 Important Distributions

4.1 Univariate Normal Distribution

The density of the univariate normal distribution is given by

  f(x) = 1/(σ √(2π)) exp(-(1/2) ((x - µ)/σ)²).

The normal distribution is characterized by the two parameters µ and σ. The mean of the normal distribution is E[X] = µ and the variance is V[X] = σ². We write X ~ N(µ, σ²).

The univariate normal distribution with mean µ = 0 and variance σ² = 1 is called the standard normal distribution, N(0, 1).

4.2 Bivariate Normal Distribution

The density of the bivariate normal distribution is

  f(x, y) = 1/(2π σ_X σ_Y √(1 - ρ²)) ·
            exp{ -1/(2(1 - ρ²)) [ ((x - µ_X)/σ_X)² + ((y - µ_Y)/σ_Y)²
                                  - 2ρ ((x - µ_X)/σ_X)((y - µ_Y)/σ_Y) ] }.

If (X, Y) follows a bivariate normal distribution, then:

• The marginal densities f(x) and f(y) are univariate normal.
• The conditional densities f(x|y) and f(y|x) are univariate normal.
• E[X] = µ_X, V[X] = σ_X², E[Y] = µ_Y, V[Y] = σ_Y².
• The correlation coefficient between X and Y is ρ_{X,Y} = ρ.
• E[Y|X] = µ_Y + ρ (σ_Y/σ_X)(X - µ_X) and V[Y|X] = σ_Y² (1 - ρ²).

The above properties characterize the normal distribution: it is the only distribution with all these properties.
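As a numerical illustration of the two densities above, a sketch checking that with ρ = 0 the bivariate normal density factorizes into the product of its univariate marginals (the parameter values are assumed for illustration):

```python
import math

def npdf(x, mu, s):
    # Univariate normal density f(x) = 1/(s sqrt(2 pi)) exp(-((x-mu)/s)^2 / 2)
    z = (x - mu) / s
    return math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))

def bnpdf(x, y, mx, my, sx, sy, rho):
    # Bivariate normal density as given above
    zx, zy = (x - mx) / sx, (y - my) / sy
    q = (zx * zx + zy * zy - 2.0 * rho * zx * zy) / (2.0 * (1.0 - rho ** 2))
    return math.exp(-q) / (2.0 * math.pi * sx * sy * math.sqrt(1.0 - rho ** 2))

# Peak of the standard normal density at the mean.
assert math.isclose(npdf(0.0, 0.0, 1.0), 1.0 / math.sqrt(2.0 * math.pi))

# With rho = 0 the joint density factorizes: f(x, y) = f(x) f(y).
val = bnpdf(0.3, -1.2, 0.0, 1.0, 1.0, 2.0, 0.0)
assert math.isclose(val, npdf(0.3, 0.0, 1.0) * npdf(-1.2, 1.0, 2.0))
```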
Further important properties:

• If (X, Y) follows a bivariate normal distribution, then aX + bY is also normally distributed:

  aX + bY ~ N(a µ_X + b µ_Y, a² σ_X² + b² σ_Y² + 2ab ρ σ_X σ_Y).

• If X and Y are bivariate normally distributed with Cov[X, Y] = 0, then X and Y are independent. The reverse implication does not hold in general: without joint normality, zero covariance does not imply independence.

4.3 Multivariate Normal Distribution

Let x = (x_1, ..., x_n)' be an n-dimensional vector such that each element x_i is a random variable. In addition, let E[x] = µ = (µ_1, ..., µ_n)' and V[x] = Σ, where Σ is the (n × n) matrix with elements σ_ij = Cov[x_i, x_j].

An n-dimensional random variable x is multivariate normally distributed with mean µ and variance-covariance matrix Σ, x ~ N(µ, Σ), if its density is

  f(x) = (2π)^{-n/2} (det Σ)^{-1/2} exp(-(1/2) (x - µ)' Σ^{-1} (x - µ)).

Let x ~ N(µ, Σ) and A an (m × n) matrix with m ≤ n and m linearly independent rows. Then

  Ax ~ N(Aµ, AΣA').
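The moments of the linear transformation Ax can be computed directly. A sketch using NumPy, with µ, Σ and A assumed for illustration:

```python
import numpy as np

# Assumed mean vector, variance-covariance matrix, and constant matrix A.
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
A = np.array([[1.0, 1.0]])  # one row (m = 1), so Ax = x1 + x2 is scalar

mean_Ax = A @ mu          # A mu
cov_Ax = A @ Sigma @ A.T  # A Sigma A'

# For Ax = x1 + x2 this reduces to V[x1 + x2] = V[x1] + V[x2] + 2 Cov[x1, x2].
assert np.isclose(mean_Ax[0], 1.0 + 2.0)
assert np.isclose(cov_Ax[0, 0], 2.0 + 1.0 + 2.0 * 0.5)
```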