Causality in Econometrics (3)

Alessio Moneta
Max Planck Institute of Economics, Jena
26 April 2011
GSBC Lecture, Friedrich-Schiller-Universität Jena

Outline: Graphical Causal Models (Introduction; (In)dependence; Probabilistic Inference); References

Graphical Causal Models
Terminology and Representation of Statistical Dependence

Sources and Motivations

▷ The graphical-models approach to causal inference was mainly developed by:
   • Spirtes, Glymour, Scheines (2000), Causation, Prediction, and Search, 2nd edition.
   • Pearl (2000), Causality: Models, Reasoning, and Inference.
▷ Forerunners:
   • J.S. Mill
   • C. Spearman
   • T. Haavelmo, H. Wold, H. Simon
   • H. Reichenbach, P. Suppes

Sources and Motivations (continued)

▷ Ideas:
   • Use of probability + diagrams to represent associations in the data
   • Use of graph theory to represent and analyze causal relations
   • This permits, in particular:
      • addressing the symmetry problem, typical of probabilistic approaches
      • representation of structures where interventions are possible
   • Formalization of the relationship between probabilistic and causal representation
   • Emphasis on inference, agnosticism about causal ontology. But: many points of contact with
      • the probabilistic approach (Reichenbach)
      • manipulability theory (Woodward).

Formal preliminaries

▷ Graph: $\langle V, M, E \rangle$
   • set $V$ of vertices (or nodes) to represent variables.
   • set $M$ of marks such as '>', '−' (or EM ≡ empty mark), 'o', to represent directions of causal influences.
   • set $E$ of edges, which are pairs of the form $\{[V_1, M_1], [V_2, M_2]\}$, to represent causal relationships.
[Figure: graph $G$ over $V_1, V_2, V_3$ with directed edges $V_1 \to V_2$ and $V_3 \to V_2$, and an undirected edge between $V_1$ and $V_3$]

$G$: $\langle \{V_1, V_2, V_3\}, \{\mathrm{EM}, >\}, \{\{[V_1, \mathrm{EM}], [V_2, >]\}, \{[V_1, \mathrm{EM}], [V_3, \mathrm{EM}]\}, \{[V_3, \mathrm{EM}], [V_2, >]\}\} \rangle$

Formal preliminaries (continued)

▷ Undirected graph:
   • graph in which the set of marks $M = \{\mathrm{EM}\}$
▷ Directed graph:
   • graph in which the set of marks $M = \{\mathrm{EM}, >\}$ and for each edge in $E$ the marks are always $\mathrm{EM}$, $>$
▷ Directed edges: $A \to B$ ($\equiv \{[A, \mathrm{EM}], [B, >]\}$)
   • $A$: parent, $B$: child (descendant).

Formal preliminaries (continued)

▷ Path:
   • undirected path: a sequence of vertices $A, \dots, B$ such that for every pair of vertices $X, Y$ adjacent in the sequence there is a connecting edge $\{[X, M_1], [Y, M_2]\}$.
   • directed path: a sequence of vertices $A, \dots, B$ such that for every pair of vertices $X, Y$ adjacent in the sequence there is a connecting edge $\{[X, \mathrm{EM}], [Y, >]\}$.
   • acyclic path: a path that contains no vertex more than once; otherwise the path is cyclic.

Example

[Figure: graph over $V_1, \dots, V_5$ with directed edges $V_1 \to V_2$, $V_3 \to V_2$, $V_2 \to V_4$, $V_4 \to V_5$, and an edge between $V_1$ and $V_3$]

• Directed paths: $\langle V_1, V_2, V_4, V_5 \rangle$; $\langle V_3, V_2, V_4, V_5 \rangle$; $\langle V_2, V_4, V_5 \rangle$, etc.
• Undirected paths: $\langle V_1, V_3, V_2, V_4, V_5 \rangle$; $\langle V_1, V_2, V_3 \rangle$, etc.
• Undirected cyclic path: $\langle V_1, V_2, V_3, V_1 \rangle$
• No directed cyclic paths.
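The path notions in the example above can be checked mechanically. The following is a minimal Python sketch, not part of the original lecture. It assumes the example graph is stored as a list of directed (parent, child) pairs, and it assumes the orientation V3 → V1 for the edge between V1 and V3, since the direction of that edge cannot be read off the slide. The helper names `children`, `directed_paths` and `has_directed_cycle` are hypothetical.

```python
# Directed edges of the example graph as (parent, child) pairs.
# ASSUMPTION: the V1-V3 edge is oriented V3 -> V1 for illustration only.
EDGES = [("V1", "V2"), ("V3", "V2"), ("V3", "V1"), ("V2", "V4"), ("V4", "V5")]


def children(edges):
    """Map every vertex to the list of its children."""
    out = {}
    for parent, child in edges:
        out.setdefault(parent, []).append(child)
        out.setdefault(child, [])
    return out


def directed_paths(edges, start, end):
    """Return all acyclic directed paths from start to end."""
    kids = children(edges)
    paths = []

    def walk(node, path):
        if node == end:
            paths.append(path)
            return
        for child in kids.get(node, []):
            if child not in path:  # acyclic: no vertex more than once
                walk(child, path + [child])

    walk(start, [start])
    return paths


def has_directed_cycle(edges):
    """Depth-first search; a vertex revisited while still in progress closes a cycle."""
    kids = children(edges)
    state = {}  # 1 = in progress, 2 = finished

    def visit(node):
        if state.get(node) == 1:
            return True
        if state.get(node) == 2:
            return False
        state[node] = 1
        if any(visit(child) for child in kids[node]):
            return True
        state[node] = 2
        return False

    return any(visit(node) for node in kids)


if __name__ == "__main__":
    print(directed_paths(EDGES, "V1", "V5"))
    print(has_directed_cycle(EDGES))
```

Under the assumed orientation the graph has no directed cyclic path, so it is a DAG; adding a hypothetical edge such as V5 → V1 would create one.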

More terminology

▷ Collider: vertex $V$ such that $A \to V \leftarrow B$
▷ Unshielded collider: vertex $V$ such that $A \to V \leftarrow B$ and $A$ and $B$ are not adjacent in the graph (adjacent ≡ connected by an edge)
▷ Complete graph: graph in which every pair of vertices is adjacent
▷ Directed Acyclic Graph (DAG): directed graph that contains no directed cyclic paths
▷ Directed Cyclic Graph (DCG): directed graph that contains directed cyclic paths

Graphs and probabilistic dependence

▷ First use of graphs: representation of probabilistic dependence and independence
▷ Nodes: random variables (discrete or continuous).
▷ Edges: probabilistic dependence.
▷ Bayesian networks (Pearl 1985).

Conditional Independence

▷ If $X, Y, Z$ are random variables, we say that $X$ is conditionally independent of $Y$ given $Z$, and write

   $$X \perp\!\!\!\perp Y \mid Z \qquad (1)$$

   if
   • for discrete variables: $P(X = x, Y = y \mid Z = z) = P(X = x \mid Z = z)\,P(Y = y \mid Z = z)$
   • for continuous variables: $f_{XY|Z}(x, y \mid z) = f_{X|Z}(x \mid z)\,f_{Y|Z}(y \mid z)$
   • We can also write (simplifying the notation): $X \perp\!\!\!\perp Y \mid Z \iff f(x, y, z)\,f(z) = f(x, z)\,f(y, z)$

Conditional independence (continued)

▷ Some equalities:
   • $X \perp\!\!\!\perp Y \mid Z \iff f(x, y \mid z) = f(x \mid z)\,f(y \mid z)$
   • $X \perp\!\!\!\perp Y \mid Z \iff f(x, y, z)\,f(z) = f(x, z)\,f(y, z)$
   • $X \perp\!\!\!\perp Y \mid Z \iff f(x \mid y, z) = f(x \mid z)$
   • $X \perp\!\!\!\perp Y \mid Z \iff f(x, z \mid y) = f(x \mid z)\,f(z \mid y)$
   • $X \perp\!\!\!\perp Y \mid Z \iff f(x, y, z) = f(x \mid z)\,f(y, z)$
   Note: $f(x, y \mid z) = f(x, y, z)/f(z)$

Conditional independence (continued)

▷ It also holds that:
   • $X \perp\!\!\!\perp Y \mid Z \iff Y \perp\!\!\!\perp X \mid Z$ (symmetry)
   • If $Z$ is empty (trivial), $X \perp\!\!\!\perp Y$: $X$ is independent of $Y$.
▷ Other properties:
   • $X \perp\!\!\!\perp YW \mid Z \implies X \perp\!\!\!\perp Y \mid Z$ (decomposition)
   • $X \perp\!\!\!\perp YW \mid Z \implies X \perp\!\!\!\perp Y \mid ZW$ (weak union)
   See Pearl 2000: 11

Interpretations of C.I.

▷ Useful interpretations of C.I. $X \perp\!\!\!\perp Y \mid Z$:
   • once we know $Z$, learning the value of $Y$ does not provide additional information about $X$.
   • once we know $Z$, reading $X$ is irrelevant for reading $Y$.
   • once we observe realizations of $Z$, observing realizations of $Y$ is irrelevant for predicting the frequent realizations of $X$.

Independence and uncorrelatedness

▷ It is important to distinguish between (conditional) independence and (conditional or partial) correlation.
• Recall:
   ▷ Variance of $X$: $\sigma_X^2 := E[(X - E(X))^2]$
   ▷ Covariance between $X$ and $Y$: $\sigma_{XY} := E[(X - E(X))(Y - E(Y))]$
   ▷ Correlation coefficient (Pearson):
     $$\rho_{XY} := \frac{\sigma_{XY}}{\sigma_X \sigma_Y}$$
   ▷ Linear regression coefficient:
     $$r_{XY} := \frac{\sigma_{XY}}{\sigma_Y^2} = \rho_{XY}\,\frac{\sigma_X}{\sigma_Y}$$
   ▷ This suggests that correlation is a measure of linear dependence
   ▷ Notice: $\sigma_{XY} = \sigma_{YX}$ and $\rho_{XY} = \rho_{YX}$, but in general $r_{XY} \neq r_{YX}$

Independence and uncorrelatedness (continued)

• Recall:
   ▷ Partial correlation between $X$ and $Y$ given $Z$:
     $$\rho_{XY.Z} = \frac{\rho_{XY} - \rho_{XZ}\,\rho_{YZ}}{\sqrt{1 - \rho_{XZ}^2}\,\sqrt{1 - \rho_{YZ}^2}}$$
   ▷ Conditional independence $X \perp\!\!\!\perp Y \mid Z$: $f_{XY|Z}(x, y \mid z) = f_{X|Z}(x \mid z)\,f_{Y|Z}(y \mid z)$
   ▷ It holds:
      • $X \perp\!\!\!\perp Y \implies \rho_{XY} = 0$
      • $X \perp\!\!\!\perp Y \mid Z \implies \rho_{XY.Z} = 0$
   ▷ and (of course):
      • $\rho_{XY} \neq 0 \implies X \not\perp\!\!\!\perp Y$
      • $\rho_{XY.Z} \neq 0 \implies X \not\perp\!\!\!\perp Y \mid Z$

Independence and uncorrelatedness (continued)

▷ In general:
   • $\rho_{XY} = 0 \nRightarrow X \perp\!\!\!\perp Y$
   • $\rho_{XY.Z} = 0 \nRightarrow X \perp\!\!\!\perp Y \mid Z$
▷ However, if the joint distribution $F(X, Y, Z)$ is normal:
   • $\rho_{XY} = 0 \implies X \perp\!\!\!\perp Y$
   • $\rho_{XY.Z} = 0 \implies X \perp\!\!\!\perp Y \mid Z$

Population and sample

▷ Notice also the difference between population parameters and sample statistics.

   Population parameters:
   $$\rho_{XY} = \frac{\sigma_{XY}}{\sigma_X \sigma_Y}, \qquad r_{YX} = \frac{\sigma_{XY}}{\sigma_X^2}$$

   Sample statistics:
   $$\hat{\rho}_{XY} = \frac{\sum_{k=1}^{n} (X_k - \bar{X})(Y_k - \bar{Y})}{\sqrt{\sum_{k=1}^{n} (X_k - \bar{X})^2 \,\sum_{k=1}^{n} (Y_k - \bar{Y})^2}}, \qquad \hat{r}_{YX} = \frac{\sum_{k=1}^{n} (X_k - \bar{X})(Y_k - \bar{Y})}{\sum_{k=1}^{n} (X_k - \bar{X})^2}$$

   $$\hat{\beta}_{OLS} = (X'X)^{-1} X'Y,$$

   for vectors of data $X \equiv (X_1, \dots, X_n)'$, $Y \equiv (Y_1, \dots, Y_n)'$, and where $\bar{X} = n^{-1} \sum_{i=1}^{n} X_i$.
   Notice that when $\bar{X} = 0$ and $\bar{Y} = 0$, $\hat{r}_{YX} = \hat{\beta}_{OLS}$.

Other concepts related to independence

▷ If, given the r.v. $X$ and $Y$, the moments $E(X^k) < \infty$ and $E(Y^m) < \infty$, it turns out that $X \perp\!\!\!\perp Y$ iff $E(X^k Y^m) = E(X^k)E(Y^m)$ for all $k, m = 1, 2, \dots$
▷ $X$ and $Y$ are $(k, m)$-order dependent iff $E(X^k Y^m) \neq E(X^k)E(Y^m)$ for any $k, m = 1, 2, \dots$
▷ (1-1)-order linear dependence: $E(XY) \neq E(X)E(Y)$
▷ (1-1)-order independence: $E(XY) = E(X)E(Y) \iff E\{[X - E(X)][Y - E(Y)]\} = 0 \iff \sigma_{XY} = 0 \iff \rho_{XY} = 0$
▷ Orthogonality: $E(XY) = 0$
▷ Note:
   1. if $X$ and $Y$ are uncorrelated ($\rho_{XY} = 0$), this is equivalent to saying that their mean deviations are orthogonal (if $X$ and $Y$ are "centered" by subtracting their means, they become orthogonal).
   2. if $X$ and $Y$ are orthogonal, $\rho_{XY} = 0$ only if $E(X) = 0$ or $E(Y) = 0$.

Other concepts related to independence (continued)

▷ $r$-th order independence: $E(Y^r \mid X = x) = E(Y^r)$ for all $x \in \mathbb{R}_X$
▷ In summary:
   • independence $\implies$ 1st-order independence $\implies$ non-correlation $\iff$ orthogonality of mean-subtracted variables
   • non-correlation $\nRightarrow$ independence (there could be non-linear dependencies!)
   (cf. Spanos 1999: 272-279)

Statistical model

▷ Importance of defining a statistical model.
▷ Typical statistical model for a set of $n$ continuous random variables $X$:
   • Probability model: defines a family of density functions $f(x; \theta)$ defined over the range of values of $X$;
   • Sampling model: $X$ (a $(T \times n)$ matrix of data) is a random sample.
   (cf. Spanos 1999: 33)

The Markov Condition

▷ The Markov condition permits the representation of probabilistic dependence through a DAG. In particular, it imposes a relationship between the Bayesian network (a DAG in which nodes are random variables) and the probabilistic structure.
   • A directed acyclic graph $G$ over $V$ (set of vertices) and a probability distribution $P(V)$ satisfy the Markov condition iff for every $W \in V$, $W \perp\!\!\!\perp V \setminus (\mathrm{Descendants}(W) \cup \mathrm{Parents}(W))$ given $\mathrm{Parents}(W)$. (Spirtes et al. 2000: 11)
   • or, in other words: any vertex (node) is conditionally independent of its nondescendants (except parents), given its parents.

Markov Condition (example)

[Figure: DAG over $V_1, \dots, V_5$ with edges $V_3 \to V_1$, $V_1 \to V_2$, $V_3 \to V_2$, $V_2 \to V_4$, $V_4 \to V_5$]

• The DAG above and the probability distribution $P(V_1, V_2, V_3, V_4, V_5)$ satisfy the MC iff:
   (1) $V_4 \perp\!\!\!\perp \{V_1, V_3\} \mid V_2$
   (2) $V_5 \perp\!\!\!\perp \{V_1, V_2, V_3\} \mid V_4$
• Notice that many other c.i. relations follow from (1) and (2) by applying symmetry, decomposition, and weak union (see the properties of conditional independence above). For example:
   • $\{V_1, V_3\} \perp\!\!\!\perp V_4 \mid V_2$; $V_1 \perp\!\!\!\perp V_4 \mid V_2$; $V_3 \perp\!\!\!\perp V_4 \mid V_2$; $V_1 \perp\!\!\!\perp V_4 \mid \{V_2, V_3\}$; etc.
   • $\{V_1, V_2, V_3\} \perp\!\!\!\perp V_5 \mid V_4$; $V_5 \perp\!\!\!\perp \{V_1, V_2\} \mid V_4$; etc.

Markov condition (factorization)

▷ The M.C. permits the following factorization:
   • discrete case: $P(V_1, \dots, V_n) = \prod_{i=1}^{n} P(V_i \mid \mathrm{Parents}(V_i))$, where if $\mathrm{Parents}(V_i) = \emptyset$, $P(V_i \mid \mathrm{Parents}(V_i)) = P(V_i)$
   • continuous case: $f(V_1, \dots, V_n) = \prod_{i=1}^{n} f(V_i \mid \mathrm{Parents}(V_i))$, where if $\mathrm{Parents}(V_i) = \emptyset$, $f(V_i \mid \mathrm{Parents}(V_i)) = f(V_i)$

[Figure: DAG over $V_1, \dots, V_5$ with edges $V_3 \to V_1$, $V_1 \to V_2$, $V_3 \to V_2$, $V_2 \to V_4$, $V_4 \to V_5$]

• We have: $P(V_1, V_2, V_3, V_4, V_5) = P(V_1 \mid V_3)\,P(V_2 \mid V_1, V_3)\,P(V_3)\,P(V_4 \mid V_2)\,P(V_5 \mid V_4)$

Recall the chain rule: in general, $P(V_1, \dots, V_n) = P(V_n \mid V_{n-1}, \dots, V_2, V_1) \cdots P(V_2 \mid V_1)\,P(V_1)$

The d-separation criterion

▷ d-separation: a graphical criterion which captures exactly all the C.I. relationships that are implied by the M.C.∗
▷ Consider a graph $G$, with distinct nodes $X$, $Y$ and a set of nodes $W$, where neither $X$ nor $Y$ belongs to $W$. We say that $X$ and $Y$ are d-separated given $W$ in $G$ iff there exists no undirected path $U$ between $X$ and $Y$ such that:
   1. every collider $C$ ($\to C \leftarrow$) on $U$ is in $W$ or has a descendant in $W$, and
   2. no other vertex on $U$ is in $W$.
   • if there is such a path, then $X$ and $Y$ are d-connected.
   (cf. Spirtes et al. 2000: 14).
∗ Including those derived from the MC through symmetry, decomposition and weak union.

The d-separation criterion (Pearl's definition)

▷ d-separation:
▷ Consider a graph $G$, with distinct nodes $X$, $Y$ and a set of nodes $W$, where neither $X$ nor $Y$ belongs to $W$. A path $U$ is said to be d-separated by a set of nodes $W$ iff
   1. $U$ contains a chain ($\to C \to$ or $\leftarrow C \leftarrow$) or a fork ($\leftarrow C \to$) such that the middle node $C \in W$, or
   2. $U$ contains a collider $C$ ($\to C \leftarrow$) s.t. $C \notin W$ and s.t. no descendant of $C$ is in $W$.
   • A set $W$ is said to d-separate $X$ from $Y$ iff every path from $X$ to $Y$ is d-separated by $W$.
   • Otherwise $X$ and $Y$ are d-connected by $W$.
   (cf. Pearl 2000: 16-17).
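Pearl's path-based definition can be turned into a brute-force check for small graphs. The following is a minimal Python sketch, not part of the original lecture. It assumes the DAG is encoded by its parent sets, here read off the factorization P(V1|V3) P(V2|V1,V3) P(V3) P(V4|V2) P(V5|V4). The names `PARENTS`, `descendants`, `undirected_paths`, `path_blocked` and `d_separated` are hypothetical. The sketch enumerates every acyclic undirected path, so its cost grows exponentially with graph size and it is meant only to illustrate the definition.

```python
# Parent sets of the example DAG, read off the factorization above.
PARENTS = {
    "V1": {"V3"},
    "V2": {"V1", "V3"},
    "V3": set(),
    "V4": {"V2"},
    "V5": {"V4"},
}


def descendants(parents, node):
    """All vertices reachable from node along directed edges."""
    kids = {v: {c for c, ps in parents.items() if v in ps} for v in parents}
    seen, stack = set(), [node]
    while stack:
        for child in kids[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def undirected_paths(parents, x, y):
    """Yield all acyclic paths from x to y, ignoring edge direction."""
    adjacent = {v: set(ps) for v, ps in parents.items()}
    for v, ps in parents.items():
        for p in ps:
            adjacent[p].add(v)

    def walk(path):
        if path[-1] == y:
            yield path
            return
        for nxt in sorted(adjacent[path[-1]]):
            if nxt not in path:
                yield from walk(path + [nxt])

    yield from walk([x])


def path_blocked(parents, path, w):
    """Pearl's two conditions, applied to each interior vertex of the path."""
    for a, c, b in zip(path, path[1:], path[2:]):
        is_collider = a in parents[c] and b in parents[c]
        if is_collider:
            # Condition 2: collider outside W with no descendant in W.
            if c not in w and not (descendants(parents, c) & w):
                return True
        elif c in w:
            # Condition 1: chain or fork whose middle node is in W.
            return True
    return False


def d_separated(parents, x, y, w):
    """True iff every path between x and y is blocked by the set w."""
    w = set(w)
    return all(path_blocked(parents, p, w)
               for p in undirected_paths(parents, x, y))


if __name__ == "__main__":
    print(d_separated(PARENTS, "V4", "V1", {"V2"}))
    print(d_separated(PARENTS, "V4", "V1", set()))
```

With this encoding, statements (1) and (2) of the Markov condition example can be checked one vertex at a time, for instance `d_separated(PARENTS, "V5", v, {"V4"})` for each `v` in V1, V2, V3.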

Reading List

• Spirtes, Glymour, Scheines (2000), Causation, Prediction, and Search, 2nd edition, MIT Press:
   • Chapters 1 and 2
• Pearl (2000), Causality: Models, Reasoning, and Inference, CUP:
   • Sections 1.1 and 1.2
• Spanos, A. (1999), Probability Theory and Statistical Inference, CUP:
   • Sections 2.2 and 6.4

Further reading:
• Cooper, G.F. (1999), An Overview of the Representation and Discovery of Causal Relationships Using Bayesian Networks, in C. Glymour, G.F. Cooper (eds.), Computation, Causation, and Discovery, MIT Press.
• Scheines, R. (1997), An Introduction to causal inference. www
