SIMULATIONS MODELS FOR INTERNATIONAL TRADE
GRAVITY EQUATIONS FOR INTERNATIONAL TRADE MODELS
Paris-Dauphine / September 2016

DOCUMENT 1: CLASS SCRIPT: BASIC DEFINITIONS AND CONCEPTS
Ramón Mahía - UAM (based on the material provided by UNCTAD-WTO) [1]

SOME DOCS AND SOURCES OF INTEREST

Apart from the main reference mentioned in the footnote, the following is a selection of relevant works (related to this document) for those who want to go further into this specific topic:

- Head, K. (2003), "Gravity for beginners", mimeo, University of British Columbia.
  http://vi.unctad.org/tda/background/Introduction%20to%20Gravity%20Models/gravity.pdf
- Head, K., & Mayer, T. (2013), "Gravity equations: Workhorse, toolkit, and cookbook", Handbook of International Economics, 4.
  http://strategy.sauder.ubc.ca/head/papers/headmayer_revised.pdf
  A very complete, in-depth work on gravity equations and their use in empirical exercises (almost 70 pages).
- Salvatici, L. (2013), "The Gravity Model in International Trade", AGRODEP Technical Note TN-04.
  http://www.agrodep.org/sites/default/files/Technical_notes/AGRODEP-TN-04-2_1.pdf
  For those seeking technical details about some of the main issues on the econometric side of gravity equations.
- Silva, J. S., & Tenreyro, S. (2006), "The log of gravity", The Review of Economics and Statistics, 88(4), 641-658.
  http://personal.lse.ac.uk/tenreyro/jensen08k.pdf
  Seminal work on the econometric implications of Jensen's inequality in the context of the gravity equation.
- Anderson, J. E., & van Wincoop, E. (2003), "Gravity with gravitas: a solution to the border puzzle", American Economic Review 93: 170-92.
  http://www.econ.ku.dk/Nguyen/teaching/Anderson%20van%20Wincoop%202003%20Gravitas.pdf
  Seminal paper on the role of the Multilateral Trade Resistance term in the gravity equation.
- http://www.agrodep.org/sites/default/files/Technical_notes/AGRODEP-TN-05_2.pdf
  Another "accessible" example of a STATA gravity model exercise.

[1] IMPORTANT NOTE: The content of this document, and especially the exercise section, is based on the document prepared by UNCTAD-WTO entitled "A Practical Guide to Trade Policy Analysis" (Chapter 3, "Analyzing bilateral trade using the gravity equation"). To access the online version of this UNCTAD-WTO document, visit: http://vi.unctad.org/tpa/index.html

1.- SOME BASIC THEORY AND DEFINITIONS

1.1.- Foundations of a Gravity Model for international trade:

- What is it about?

  Following Newton's expression for masses, the Gravity Model for trade [2] states that, broadly construed, trade (attraction) between two countries is proportional to their size (the size of their economies) and inversely proportional to their "distance":

  "the trade flow from country i to country j (X_ij) is proportional to the PRODUCT of the "masses" of the two countries (namely their GDPs, Y_i and Y_j) and inversely proportional to their distance, D_ij (broadly construed to include all factors that might create trade resistance)".

  The basic expression is formulated as something like:

  $$X_{ij} = \alpha_0 \, Y_i^{\alpha_1} \, Y_j^{\alpha_2} \, D_{ij}^{\alpha_3} \, \varepsilon_{ij}$$

  o α0: coefficient (intercept) not depending on "i" or "j"
  o X_ij: exports from "i" to "j" (or imports of "j" from "i")
  o Y_i: exporter factors (GDP_i, for example)
  o Y_j: importer factors (GDP_j, for example)
  o D_ij: distance / trade barriers faced by exporter "i" to enter or reach market "j"
  o ε_ij: random term (stating the random nature of the equation, in contrast to the determinism of physics)

- Does this naïve approach work?

  Even before a theoretical basis was found, and surprisingly enough, simple gravity equations did a pretty successful job of explaining bilateral trade. The gravity approach has withstood the test of time and is still one of the most widespread models for international trade simulation and forecasting.

- Is it really a useful tool for empirical analysis?
  YES. The gravity approach is, in fact, widely used to infer trade flow potentials, or to estimate the effects on trade of institutions such as customs unions, monetary agreements, exchange rate mechanisms, ethnic ties, linguistic identity or international borders. Some widespread applications are:

  a. Provide a baseline specification. Apart from simulation, the simple gravity equation offers a very robust baseline specification for a structural bilateral trade model, and thus it is commonly used to estimate the marginal effects of other relevant factors facilitating or inhibiting trade.
  b. Simulate scenarios for international trade adjustments:
     o Related to regular changes in the main explanatory variables (such as GDPs or tariffs).
     o Related to particular events or policy issues that may be linked to trade distortions: Free Trade Agreements (trade creation vs. trade diversion), currency unions, political federations, "tariffication", ...
  c. Quantify trade potential between two countries. Basically, a gravity equation is estimated using data only for those countries that have supposedly reached their trade potential (for example, those belonging to a free trade area) and then, after some adjustments, the estimated equation is applied to pairs of countries outside the sample: actual trade is then compared with predicted (potential) trade.
  d. Measure determinants of bilateral trade resistance / cost. A gravity equation can also be used "in reverse" to measure bilateral trade costs at an aggregate level. The idea is to solve a theoretical gravity equation for the trade costs term instead of trade flows, and to express these costs as a function of the observable trade data.
  e. Tariffication of a quota. Sometimes non-tariff barriers (such as quotas) are replaced with tariff-only measures. The idea is to estimate a gravity equation for a market (dataset) where tariff barriers (between some countries) and quotas (between others) coexist. Then, using the coefficients for both resistance terms (tariffs and quotas), a tariff equivalent of the quota can be "easily" computed [3].

[2] Suggested in the sixties by Tinbergen, J. (1962), Shaping the World Economy: Suggestions for an International Economic Policy, New York: The Twentieth Century Fund, and Pöyhönen (1963), "A tentative model for the volume of trade between countries", Weltwirtschaftliches Archiv 90: 93-99.

[3] See the example in section 2 of the UNCTAD-WTO document already mentioned in this text, "A Practical Guide to Trade Policy Analysis" (Chapter 3, "Analyzing bilateral trade using the gravity equation"). Online version: http://vi.unctad.org/tpa/index.html

- Is physical distance really a variable related to trade?

  Newcomers feel comfortable with the link between GDPs and trade [4], but is distance a matter of real interest? The answer is: yes, it is. Empirical studies clearly illustrate that, overall, distance matters so much because it captures many relevant sources of trade resistance: transport costs, transaction costs, perishability or losses of goods during transport, synchronization costs, communication costs, cultural distance, ...

[4] Try plotting the relationship between the GDP of the exporter and imports for a given country, using the data of example 1 for OECD countries (2000), to see how clear it looks.

- Some other, almost evident, main features of the equation:

  o It has a multiplicative form (with crucial empirical and econometric implications).
  o It has a very flexible "nature":
     - Unit of observation: "i" and "j" might represent countries, regions, provinces, or even firms.
     - "X" might represent TOTAL trade, trade in a given product, sectoral trade, the share of trade [5], ...
     - ... or even a NON-TRADE flow: FDI, shopping trips, migration [6], commuting, passengers, fleets, ...
     - "Distance" could be understood as "positive" (in the sense of obstructing trade, such as physical distance or tariff barriers) or "negative" (and thus related to factors that facilitate trade, such as the existence of a special transport infrastructure, sharing a common border or a common language, the existence of regular migration, cultural ties, a common past, ...).
     - A time subindex "t" might be added to every term (if a panel dataset is used instead of a cross section).

- Far beyond an intuitive understanding: is there any solid theoretical background to support the use of Gravity Models?

  Yes. Apart from being intuitive and empirically true (at least apparently), some important studies have been able to provide THEORETICAL foundations for the gravity model/equation [7], deriving gravity equations as a valid representation of a variety of different theoretical trade models [8]. This was, and is, in fact one of the reasons why gravity models are living a "second youth".

  Theory was developed after empirical work had already used gravity models successfully but, at the same time, these theoretical foundations have become critical for understanding the right specification of the empirical model, the "correct" signs and sizes of the estimated coefficients, and the proper way of conducting econometric inference.
  For example:

  o According to the theoretical background, the elasticities of GDPs and distance should be around 1 (in absolute value).
  o According to the underlying theory, the gravity equation SHOULD follow a multiplicative function.
  o According to the theoretical derivation, the gravity equation MUST also include a Multilateral Resistance Term or "remoteness" term (we will talk about it later on).
  o ... and many more interesting facts that frame and condition empirical exercises.

[5] BUT be careful when dealing with disaggregated data, because some critical issues appear in empirical work.

[6] There are a number of research papers focusing on the impact of migration on international trade (or on FDI) and, very frequently, a gravity equation is used as the empirical framework. A good meta-analysis can be found in Genc, M., Gheasi, M., Nijkamp, P., & Poot, J. (2012), "The impact of immigration on international trade: a meta-analysis", Migration Impact Assessment: New Horizons, 301. https://books.google.es/books?hl=es&lr=&id=_na1YGdwHucC&oi=fnd&pg=PA301&ots=9amGaDVC_9&sig=aXMdwv4fMUmDYy9Tm7x2Ti7mU8c#v=onepage&q&f=false

[7] See an intuitive sketch in Head, K. (2003), "Gravity for beginners", University of British Columbia. http://vi.unctad.org/tda/background/Introduction%20to%20Gravity%20Models/gravity.pdf

[8] Based on increasing returns to scale for the case of imperfectly competitive markets with firm-level product differentiation and/or, alternatively, on a perfectly competitive market with product differentiation at the national level.
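To make the multiplicative form of the basic gravity expression in section 1.1 more concrete, the following minimal Python sketch evaluates its deterministic part for made-up numbers. The function name `gravity_flow`, the GDP and distance values, and the choice of unit elasticities (1 for both GDPs, -1 for distance) are illustrative assumptions, not estimates from this document.

```python
def gravity_flow(y_i, y_j, d_ij, a0=1.0, a1=1.0, a2=1.0, a3=-1.0):
    """Deterministic part of X_ij = a0 * Y_i^a1 * Y_j^a2 * D_ij^a3.

    The default elasticities (1, 1, -1) are assumed values that mirror
    the "around 1" benchmark mentioned in the text.
    """
    return a0 * (y_i ** a1) * (y_j ** a2) * (d_ij ** a3)


# Hypothetical exporter GDP, importer GDP and distance (arbitrary units).
base = gravity_flow(y_i=2000.0, y_j=500.0, d_ij=1000.0)
farther = gravity_flow(y_i=2000.0, y_j=500.0, d_ij=2000.0)  # distance doubled
bigger = gravity_flow(y_i=4000.0, y_j=500.0, d_ij=1000.0)   # exporter GDP doubled

print("flow ratio when distance doubles:    ", farther / base)
print("flow ratio when exporter GDP doubles:", bigger / base)
```

With these assumed unit elasticities, doubling distance roughly halves the predicted flow and doubling the exporter's GDP roughly doubles it; with other elasticities the ratios would be 2 raised to the corresponding exponent.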
2.- INTRODUCTION TO EMPIRICAL ESTIMATION OF A GRAVITY MODEL

2.1.- Basic expression:

Normally, a log-log version of a generalized gravity model takes the form:

$$\ln X_{ij} = a_0 + a_1 \ln Y_i + a_2 \ln Y_j + a_3 \ln t_{ij} + \varepsilon_{ij}$$

or, once the multilateral resistance terms discussed in section 2.2 are included:

$$\ln X_{ij} = a_0 + a_1 \ln Y_i + a_2 \ln Y_j + a_3 \ln t_{ij} + a_4 \ln \Pi_i + a_5 \ln P_j + \varepsilon_{ij}$$

o X_ij: the "ij" trade flow. ATTENTION:
   - X_ij represents only ONE flow from "i" to "j" (exports, for example), so there will be a second, different observation in the model for X_ji (exports from "j" to "i"). We should not aggregate "ij" and "ji" by averaging the reciprocal trade flows [9].
   - Normally, IMPORTS are more commonly used as flows than exports.
   - FOB or CIF values? One might think that it depends on whether we measure X_ij as exports from "i" to "j" or as imports of "j" from "i", but it does not: basically, FOB values are the best option (if available) [10].

o Y_i, Y_j: size of economies "i" and "j" (usually a measure of income such as GDP).
   - GDPs and trade flows (X) should be measured in nominal terms (not in real terms) [11].
   - Sometimes, instead of using origin and destination GDPs, the product of the "i" and "j" GDPs is used as a single variable (or the share of that product over total GDP).

o t_ij: could contain both BARRIERS and/or INCENTIVES to trade between "i" and "j":
   - d_ij: bilateral distance "ij" (computed in several ways [12]).
   - Natural barriers and/or cost-related barriers (0/1 dummies): one of the two being an island, one of the two being landlocked (a country entirely enclosed by land), ...
   - Incentives:
      o Access to sea
      o Common border or adjacency
      o Existence of a bilateral, regional or multilateral trade agreement (usually captured with a dummy, although some caveats should be made [13])
   - Other factors related to lower information costs:
      o Common language
      o Cultural / historical ties (one having been a colony of the other, or both having had a common colonizer)

o The "a's" are elasticities (as always in a log-log expression), and a3 = 1-σ (σ being the elasticity of substitution [14]).

[9] We will have one observation for each flow. Imagine the case of trade between a big and a small country: one of "ij" or "ji" might be very high and the other very low. The equation could explain both by attending to the relative sizes of "i" and "j" (as exporter and importer), but it could never explain the average.

[10] Under FOB, the seller bears only the cost of moving the goods on board the aircraft or ship at origin, while all further costs of getting the goods to the buyer's place (mainly related to distance) have to be met by the buyer. Using CIF data may lead to simultaneous equation biases, as the dependent variable includes costs that are correlated with the right-hand-side variables for distance and other trade costs.

[11] "Gravity is an expenditure function allocating nominal GDP into nominal imports; therefore inappropriate deflation probably creates biases via spurious correlations". UNCTAD-WTO, A Practical Guide to Trade Policy Analysis.

[12] Both in terms of the distance measurement (Euclidean, great-circle, orthodromic, ...) and in terms of "where" to measure it (economic "poles", capitals, main ports, ...).

2.2.- A basic amendment: the importance of Multilateral Trade Resistance (MTR) terms

The basic idea is that, in the modern gravity equation, we NEED TO ACCOUNT for the so-called MTR terms.

"The gravity equation tells us that bilateral trade, after controlling for size, depends on the bilateral trade barriers between "i" and "j" BUT RELATIVE to the product of their Multilateral Resistance Indexes" [15].

- What does MTR mean?

  The Multilateral Resistance Index for a pair of countries "ij" represents the trade resistance / barrier / distance "ij" relative to the average trade resistance / barriers that both countries face with all their trading partners.

- Why do we need to account for the MTR terms?
  Two neighbouring countries will trade more with each other if they are close to each other and isolated from the rest of the world. A good example of the idea is found in Head, K. (2003):

  "The importance of remoteness in actual trade patterns can be illustrated by comparing trade between Australia and New Zealand with trade between Austria and Portugal. The distance between each pair's major cities is approximately the same: Lisbon–Vienna and Auckland–Canberra both happen to be 1430 miles apart. Furthermore the product of their GDP's are similar (Australia–New Zealand is only 20% smaller). Hence, omitting remoteness, the gravity equation would predict that Austria–Portugal trade would be slightly larger. In fact, however, in 1993 Australia–New Zealand trade was nine times greater than Austria–Portugal Trade".

  o It is easy to see why higher multilateral resistance of the importer "j" raises its trade with "i". It works through a price/cost advantage: for a given bilateral barrier between "i" and "j", higher barriers between "j" and its other trading partners will reduce the relative price of goods from "i" and raise imports from "i".
  o Higher multilateral resistance of the exporter also raises trade: higher trade barriers faced by an exporter will lower the demand for its goods and therefore its supply price "p_i". For a given bilateral barrier between "i" and "j", this raises the level of trade between them.

[13] A dummy is not able to capture trade agreements with asymmetrical benefits or conditions for their members, with different treatment of different products, or simply with a progressive effect. A single dummy supposes that the "treatment" effect is the same for all the countries participating in the agreement.

[14] Remember that this term refers to the elasticity of substitution between domestic and traded goods. The higher the elasticity, the greater the effect on trade of the resistance / promotion terms.

[15] See Anderson and van Wincoop (2003) for a much more detailed explanation.

- What if we omit MTR?
  The empirical problem is that the MTR terms are obviously highly correlated with bilateral trade barriers (by and large, the higher the MTR, the higher the trade barriers). Omitting MTR therefore induces a potentially severe bias in the coefficients of distance, border variables and other trade resistance measures.

- The gravity equation considering MTR

  The previous caveat means that, in the gravity equation, we have to include the MTR terms (or their inverse, that is, ease of access to trade) for BOTH "i" and "j". Following Anderson and van Wincoop's derivation, the gravity equation turns out to be:

  $$X_{ij} = \frac{Y_i \, Y_j}{Y} \left( \frac{t_{ij}}{\Pi_i \, P_j} \right)^{1-\sigma}$$

  where, in the "resistance to trade" term, we find:

  o t_ij: vector of bilateral trade resistance for imports of "j" from "i"
  o Π_i: multilateral resistance of exporter "i" (the inverse of its ease of access to world markets)
  o P_j: multilateral resistance of importer "j" (the inverse of its ease of access to world markets)
  o σ: elasticity of substitution

  Ease of access is low, and therefore Π_i and P_j are high, if the countries are "remote" from world markets (physically, or in terms of high trade protection, for example).

- How to deal with the non-observable nature of the MTR terms:

  In their seminal work, Anderson and van Wincoop [16] proposed the use of a nonlinear iterative method to estimate MTR effects but, commonly, the empirical solution is to use a linear estimator following one of these two empirical strategies:

  1. If we use a cross-section dataset:

     a. Proxy MTR using remoteness-like indexes [17]. For example, compute the (GDP-weighted) average distance of trader "i" to all countries/regions other than "j" (and vice versa):

        $$REM_i = \sum_{m \neq j} \frac{d_{im}}{Y_m} \qquad \text{or} \qquad REM_i = \sum_{m \neq j} d_{im} \, Y_m / Y$$

[16] For details see section 3.2 (Iterative structural estimation) in Head, K., and Mayer, T., "Gravity Equations: Workhorse, Toolkit, and Cookbook". http://www.cepii.fr/PDF_PUB/wp/2013/wp2013-27.pdf

[17] See section 3.1 (Proxies for multilateral resistance term) in Head, K., and Mayer, T., "Gravity Equations: Workhorse, Toolkit, and Cookbook". http://www.cepii.fr/PDF_PUB/wp/2013/wp2013-27.pdf

        The major drawback relates to the use of distance.
        It is difficult to find a good way of measuring the distance "ij" and, even if we manage to, it might be a simplistic way of measuring the MTR terms. Still, this is the only option if we are interested in country-specific variables (see later how country-specific dummies preclude the estimation of such parameters).

     b. Use country dummies for countries "i" and "j" to capture this unobserved heterogeneity:

        $$\ln X_{ij} = a_0 + a_1 \ln(GDP_i) + a_2 \ln(GDP_j) + a_3 I_i + a_4 I_j + a_5 \ln(t_{ij}) + \varepsilon_{ij}$$

        The problem with this approach (as we will see later) is that, if we include country dummies in a cross-section, we cannot estimate/identify the parameters of other country-specific variables (such as GDP_i or GDP_j!). This approach is therefore only valid when the interest is in bilateral / country-pair coefficients (such as the effect of a common border or the impact of a bilateral trade agreement).

  2. If we use a panel dataset:

     Apart from controlling for the MTR terms, the use of standard panel data estimators helps us to control for other non-observable bilateral / country-pair heterogeneity (bilateral / country-pair fixed effects) or for trend (secular) effects.

     a. TIME-INVARIANT country dummies for countries "i" and "j" (assuming MTR is constant over time):

        $$\ln X_{ijt} = a_0 + a_1 \ln(GDP_{it}) + a_2 \ln(GDP_{jt}) + a_3 I_i + a_4 I_j + a_5 \ln(t_{ij}) + (a_6 I_t) + \varepsilon_{ijt}$$

        This allows the estimation of OTHER country-specific factors (such as GDPs), as long as those factors vary over time, but it rests on the assumption that the MTR terms do not. We need to ponder whether this time-invariance assumption is plausible [18].

        REMEMBER THAT country-pair time-invariant (fixed) effects cannot be used with a fixed-effects panel data estimator if our interest lies in other time-invariant variables (such as being landlocked, distance, common borders, common language, trade agreements, ...), because those variables are absorbed by the fixed effects. In this case, random effects are an option if we want to obtain the time-invariant coefficients (at the obvious risk, as always, of bias in the rest of the coefficients).

[18] With some exceptions, MTR is only constant within short periods of time.

     b. TIME-VARYING country dummies for countries "i" and "j" and period "t" (allowing MTR to change over time): (I_it) and (I_jt).

        ATTENTION: if we include time-varying country dummies, we cannot estimate the parameters of other time-varying country-specific variables (such as GDP), so again this approach is only valid when the interest is in bilateral / country-pair coefficients:

        $$\ln X_{ijt} = a_0 + a_1 I_{it} + a_2 I_{jt} + a_3 \ln(t_{ij}) + (a_4 I_t) + \varepsilon_{ijt}$$

2.3.- Other empirical (and more advanced) issues of interest

- There are a handful of relevant issues (some of them critical):

  o Critical omitted variables biasing coefficients systematically (migration, for example, which might be related to trade and to other exogenous variables)
  o How to deal with zero trade values (X_ij = 0) [19]
  o The log-log form inducing bias in the (highly probable) presence of heteroskedasticity
  o Some extra cautions when working with disaggregated data such as sectors or firms
  o Endogeneity in the gravity equation: causation between trade and trade policy could be reversed when, for example in the case of the signature of an FTA, countries select themselves into the agreement based on the intensity of their trade, and not the other way round
  o Spatial correlation
  o Static nature: several authors have proposed a dynamic gravity equation in place of the traditional static one
  o ... and some others

- How to deal with zeros:

  o Reasons for zeros:
     - Real zero trade
     - Values rounded to zero
     - Missing values, sometimes for unknown reasons (not reported by countries, errors in dissemination or manipulation, ...)

    The problem is that, in many cases, the researcher does not know the proportions of the different types of "zeros" in the dataset.
  o Alternatives (remember that we are using a log-log model):
     - Substitute zeros with small numbers (1, for example): only appropriate if the zeros reflect real zero or almost zero trade, BUT the arbitrary level chosen has an unpredictable impact on the parameters.
     - Simply truncate the sample to avoid zero cases (delete the "ij" cases with zeros): only "reasonable" if the zeros are true missing values and thus randomly distributed across "n" and "t", BUT there is an obvious selection bias problem.
     - Control for the "selection bias" using a Heckman procedure (Heckman two-step estimation, which introduces the inverse Mills ratio into the specification). However, that method requires "instrumental" variables that explain the selection (zero or positive trade) but not the value of positive trade.
     - Do not use logs, and estimate the model in levels:
        o With a linear estimator: NO, because the theoretical foundation of the gravity equation implies a multiplicative form.
        o With a non-linear estimator: the Poisson (pseudo) maximum likelihood (ML) estimator, applied to the levels of trade, directly estimates the non-linear form [20].
     - Tobit (which allows a significant proportion of zeros) on the log of trade plus a constant: the critique is that Tobit applies to left-censored zeros only when those zeros have the economic meaning of zeros (no trade, or almost no trade, because of prohibitive trade costs, for example).

- Logs inducing bias in the presence of heteroskedasticity

  The gravity model is basically defined as:

  $$X_{ij} = \alpha_0 \, Y_i^{\alpha_1} \, Y_j^{\alpha_2} \, D_{ij}^{\alpha_3} \, \varepsilon_{ij}$$

  where the random term ε_ij has the standard property:

  $$E[\varepsilon_{ij} \mid Y_i, Y_j, D_{ij}] = 1$$

  The model is then usually linearized, taking the following simple form:

  $$\ln X_{ij} = \ln \alpha_0 + \alpha_1 \ln Y_i + \alpha_2 \ln Y_j + \alpha_3 \ln D_{ij} + \ln(\varepsilon_{ij})$$

  The problem is that, as Jensen's inequality states:

  $$E[\ln(\varepsilon)] \neq \ln E[\varepsilon]$$

  Instead, the expectation of the log of a random variable, E[ln(ε)], depends not only on the mean of that variable, E[ε], but also on its higher-order moments (such as the variance, V[ε]).
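A small numerical illustration of this point, as a sketch under an assumed lognormal error: if ln(ε) is normal with standard deviation s and mean -s²/2, then E[ε] = 1 for every s, while E[ln(ε)] = -s²/2 changes with the dispersion. The lognormal choice, the function name `mean_log_error`, the sample size and the seed are illustrative assumptions, not part of the original material.

```python
import math
import random


def mean_log_error(s, n=100_000, seed=0):
    """Simulate eps with E[eps] = 1 and return (sample mean of eps, sample mean of ln eps).

    Assumption: ln(eps) ~ Normal(mu, s^2) with mu = -s^2 / 2, so that
    E[eps] = exp(mu + s^2 / 2) = 1 regardless of the dispersion s.
    """
    rng = random.Random(seed)
    mu = -0.5 * s ** 2
    draws = [rng.lognormvariate(mu, s) for _ in range(n)]
    mean_eps = sum(draws) / n
    mean_log = sum(math.log(e) for e in draws) / n
    return mean_eps, mean_log


for s in (0.5, 1.0):
    mean_eps, mean_log = mean_log_error(s)
    print(f"s={s}: mean(eps)={mean_eps:.3f}, "
          f"mean(ln eps)={mean_log:.3f}, theoretical E[ln eps]={-0.5 * s ** 2:.3f}")
```

In both cases the mean of ε stays close to 1, so ln E[ε] is close to 0, but the mean of ln(ε) moves away from 0 as s grows. If s depended on the regressors, E[ln(ε)] would depend on them too.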
  Then, when the random term in the original (nonlinear) model presents heteroskedasticity,

  $$V[\varepsilon_{ij}] = \sigma_{ij}^2 = f(Y_i, Y_j, D_{ij})$$

  the expected value of its logarithm, E[ln(ε)], which is also a function of σ²_ij, is in turn a function of the explanatory variables:

  $$E[\ln(\varepsilon_{ij})] = f(\sigma_{ij}^2) = f(Y_i, Y_j, D_{ij})$$

  The result is therefore that some of the parameters of the log model might be biased and inconsistent.

  A way of solving this problem is to assume a pattern of heteroskedasticity in which the conditional variance depends on the conditional mean, as in a Poisson distribution for the endogenous variable, where E[y_i | X] = V[y_i | X]. Then the PPML (Poisson pseudo-maximum likelihood) estimator, traditionally used for count data, can be used to estimate the model consistently. One interesting extra point is that this estimator is applied to the model in LEVELS, so no logs are needed and thus, at the same time, we fix the problem of zeros.

[20] Silva, J. S., & Tenreyro, S. (2006), "The log of gravity", The Review of Economics and Statistics, 88(4), 641-658.

SIMULATIONS MODELS FOR INTERNATIONAL TRADE
GRAVITY EQUATIONS FOR INTERNATIONAL TRADE MODELS
Paris-Dauphine / September 2016

DOCUMENT 2: STATA HANDS ON SESSION
Ramón Mahía - UAM (based on the material provided by UNCTAD-WTO) [21]

Complete modified and commented DO file: DO_MODIFIED_COMMENTED

1.- MANIPULATION OF DATA (prior to econometric estimation)

The purpose of this section is to understand how to build the type of dataset we will normally need for the estimation of a gravity equation. The STATA commands themselves are NOT of particular interest, but they could help those facing a similar exercise with similar source datasets.
[Figure: workflow of the data preparation, from the original to the final datasets.
 Original datasets: tradeflows.csv (cross panel: i,j;t), gdp.csv (panel: i;t), gdp.dat (panel: i;t), joinWTO.txt (country: i), dist_cepi224.dta (country pairs: i,j).
 Intermediate datasets: tradeflows.dta (cross panel: i,j;t), gdp_exporter.dta (panel: i;t), gdp_importer.dta (panel: j;t), join_exporter.dta (panel: i;t), join_importer.dta (panel: j;t), cepii.dta (country pairs: i,j), religion.dta (country: i), gravity_temp1.dta to gravity_temp4.dta.
 Final datasets: gravity.dta, gravity_1996_2005.dta, gravity_OECD_2000_2005.dta.
 Operations shown: reshape (all pairs); reshape and duplicate; WTO dummies and fixes of some minor issues; oecd_ex, oecd_im; time dummies (year_); country dummies (exporter_, importer_); country + time dummies (exportertime_, importertime_); selection 1996-2005; selection of a balanced panel; logs; 5-year averages.]

[21] IMPORTANT NOTE: This document, and especially the exercise section, is based on the excellent work published by UNCTAD-WTO entitled "A Practical Guide to Trade Policy Analysis" (Chapter 3, "Analyzing bilateral trade using the gravity equation"). To access the online version of this UNCTAD-WTO document, visit: http://vi.unctad.org/tpa/index.html

(Steps 1 to 7 as described in Chapter 3, UNCTAD/WTO.)

Several operations have to be performed before estimation:

- Download the datasets from their sources and import them into a single software format (STATA dta, EViews wf, ...)
- Homogenize the formats of the different datasets: list of countries, names of countries, names of variables, "names" of years
- Replace missing values (zeros for trade, a functional 999 for real missing values, ...)
- Generate the structure of the gravity model dataset: all possible combinations of countries (and years, if a panel is used)
- Merge the different files into a single one
- Generate dummies (if needed) for year, country, and year x country
- Compute log variables (for GDP, trade and distance)

Step 1:

- Import the CSV trade flows file (tradeflows.csv), label the variables and save it in .dta format
- Import the txt file "joinwto.txt", with the year of WTO accession for each country, and save it in .dta format
- Import the CSV file "GDP.csv", with GDP data for each country from 1960 to 2006; replace BELGIUM and LUXEMBOURG by BENELUX, computing BENELUX GDP as the sum of both countries; change the names of the year variables; and save it in .dta format
- Open the STATA data file containing the rest of the explanatory variables, fix the BENELUX problem, change some variable names, label some other variables and save it in .dta format

Basically, at the end of Step 1, four different STATA files have been created and stored in the default directory:

1. tradeflows.dta (endogenous variable): a panel dataset for YEARS and PAIRS of countries, in LONG format
2. joinwto.dta (for the explanatory variable "wtoaccesion"): a cross-section dataset for INDIVIDUAL countries
3. GDP.dta (from GDP.csv, for the explanatory GDP variables): a panel dataset for YEARS and INDIVIDUAL countries, in WIDE format
4. CEPII.dta (other explanatory variables): a CROSS-SECTION dataset for PAIRS of countries, in LONG format

Step 2:

- Starting with "tradeflows.dta", create the FULL structure of the data file: PANEL DATA for YEARS and every possible combination (PAIR) of countries, filling the newly created pairs with "zeros".
  The temporary file created is "gravity_temp1.dta".
- Reshape GDP.dta to a LONG panel dataset and create a duplicate (GDP is going to be used both as the importer's GDP and as the exporter's GDP). General syntax of the command, followed by the commands used here:

      reshape long stub, i(i) j(j)      (j is a new variable)

      reshape long yr, i(countrycode) j(year)
      rename yr gdp

Step 3:

- MERGE those two new files ("GDP_exporter.dta" and "GDP_importer.dta") with "gravity_temp1.dta" into "gravity_temp2.dta", keeping those observations (PAIRS of countries) with information in both files.
- MERGE "joinwto.dta" with that file, creating two new variables: join_exporter and join_importer.
- The new temporary file created is "gravity_temp3.dta".

Step 4:

- MERGE the data of the two files "CEPII.dta" (previously saved) and "religion.dta" with the previous one. The new temporary file created is "gravity_temp4.dta".

Step 5:

- Create WTO accession dummies depending on whether one, none or both countries are members of the WTO (onein, nonein, bothin).
- The new PERMANENT file created is "gravity.dta". It basically contains the core dataset (endogenous and exogenous variables, except for the country / country x time / time dummies and some remaining transformations).

[Screenshot: structure of the main dataset.] Each row contains a trade flow (import) and the variables for the gravity equation (GDPs, and the terms for barriers and incentives) EXCEPT FOR the MTR dummies.

Step 6:

- Create country / country x time / time dummies for the specification of the MTR terms and the time fixed effects.

  In this block, due to memory restrictions, three different options are offered in case the number of dummies exceeds STATA's capacity:

  o Reduce the number of years (>1995, that is, 1996-2005). This is the option selected in this example.
  o Compute country-period (and not country-year) dummies.
  o Make a balanced panel (reducing the sample to those countries having information for the same time period).
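The logic of Steps 2, 3 and 6 can be sketched outside STATA. The following Python fragment is only an illustration of the data structure, not a translation of the do-file: the country codes, years, flows and GDP values are made up, and the dictionary keys (`imports`, `gdp_importer`, `exportertime`, ...) are hypothetical names chosen to resemble those used above.

```python
from itertools import permutations

# Hypothetical toy inputs (not the course data).
countries = ["AAA", "BBB", "CCC"]
years = [2004, 2005]

# Observed flows keyed by (importer, exporter, year); other pairs are absent.
observed = {
    ("AAA", "BBB", 2004): 10.0,
    ("BBB", "AAA", 2004): 3.0,
    ("AAA", "CCC", 2005): 7.5,
}

# GDP already in LONG format, i.e. one value per (country, year),
# which is what the reshape in Step 2 produces.
gdp = {(c, y): 100.0 * (k + 1) + (y - 2004)
       for k, c in enumerate(countries) for y in years}

rows = []
for importer, exporter in permutations(countries, 2):  # every ordered pair, i != j
    for year in years:
        rows.append({
            "importer": importer,
            "exporter": exporter,
            "year": year,
            # Step 2: newly created pair-years are filled with zeros.
            "imports": observed.get((importer, exporter, year), 0.0),
            # Step 3: the same GDP table is merged twice, once per role.
            "gdp_importer": gdp[(importer, year)],
            "gdp_exporter": gdp[(exporter, year)],
            # Step 6: labels from which country x time dummies can be built.
            "importertime": f"{importer}_{year}",
            "exportertime": f"{exporter}_{year}",
        })

n_zero = sum(1 for r in rows if r["imports"] == 0.0)
print(len(rows), "rows,", n_zero, "of them filled with zeros")
```

With N countries and T years the full structure has N x (N-1) x T rows, so the number of country x time dummies grows quickly, which is why Step 6 offers ways of reducing the sample.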
Step 7:

- Create logs of the variables (GDPs and distance)
- Compute five-year averages of some variables
- Create a subset for the period 1996-2005
- Create a subset with OECD countries for the period 2000-2005

2.- ECONOMETRIC ESTIMATIONS OF GRAVITY EQUATIONS

- REG1: ESTIMATE A BASIC LOG-LOG CROSS-SECTION REGRESSION FOR OECD COUNTRIES, 2000 AND 2005, WITHOUT MTR TERMS, AND PERFORM SOME BASIC CHECKS

  o Load the dataset "gravity_OECD_2000_2005.dta":
     - 33 countries
     - 6 years
     - 32*6 = 192 observations for each country as importer
     - 192*33 = 6336 records

  o Check the number of valid observations for the endogenous variable "limports" in 2000 and 2005 [22]. There should be 33*32 = 1056 valid values, but there are only 992 because of 64 missing values due to zero trade with origin or destination in BLX.

  o Estimate the simplest log-linear gravity regression for the year 2005 using only lgdp_exporter, lgdp_importer and ldistance.

  o Check the elasticities (according to theory and meta-analysis):
     - Theory predicts a value around 1 for the GDP elasticities (both importer and exporter).
     - A difference between origin and destination GDP coefficients is to be expected; a lower estimate for the importer's GDP would suggest evidence of home market effects (due to barriers to entry or national product differentiation).
     - Meta-analysis shows that the distance coefficient is also around -1.

    [Table: meta-analysis of 2500 gravity equation estimations. Extracted from Head, K., & Mayer, T. (2013), "Gravity equations: Workhorse, toolkit, and cookbook", Handbook of International Economics, 4.]

[22] STATA: inspect limports if year==2000
    o Check whether trade is significantly more sensitive to trade barriers (proxied by distance) in 2005 than in 2000.
        Procedure: compare the basic estimation for different years (2000 vs 2005) using seemingly unrelated estimation (STATA suest command [23]).
        It looks like there is no statistically significant difference between the 2000 and 2005 estimates.

[23] The seemingly unrelated estimation procedure combines the estimation results (parameter vectors and variance matrices) of several equations into one parameter vector and one simultaneous (co)variance matrix. It is run after each equation has been estimated in isolation. The idea is that the error terms of different equations might be correlated, which affects the estimated covariance of the parameters and therefore every cross-model hypothesis involving parameters of those different equations.

- REG2: ESTIMATE ANOTHER CROSS SECTION REGRESSION INCLUDING ADDITIONAL REGRESSORS
    o Estimate for 2005, adding more variables and using robust inference to adjust for heteroskedasticity:
        reg limports contig comlang_off onein colony REPlandlocked PARTlandlocked religion ldist lgdp* if year==2005, robust
      The “onein” coefficient cannot be estimated (only zero values), and the same holds for “bothin” (only value 1) (tab onein if year==2005).
    o Compare the REG1 and REG2 regressions [24]. Check the elasticities obtained:
        The GDP coefficients for exporter and importer appear to be slightly overestimated (biased) in the first regression. We should always expect this kind of bias in the simplest estimation, but its size, and even its sign, depends on the particular relationship between the included regressors and the omitted variables (mostly related to trade resistance / incentives) for the countries in the sample.
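The 2000 vs 2005 comparison with suest described above could be sketched as follows. The stored-estimate names reg2000 and reg2005 are hypothetical, and ldist is assumed to be the log-distance variable. Note that the two regressions are run without the robust option, because suest computes its own robust (co)variance matrix after combining them.

```stata
* Basic gravity equation, estimated separately for each year
regress limports lgdp_exporter lgdp_importer ldist if year == 2000
estimates store reg2000

regress limports lgdp_exporter lgdp_importer ldist if year == 2005
estimates store reg2005

* Combine both sets of results into one parameter vector and one covariance matrix
suest reg2000 reg2005

* H0: the distance elasticity is the same in 2000 and 2005
test [reg2000_mean]ldist = [reg2005_mean]ldist
```

A large p-value in the last test is what the text summarises as "no statistically significant difference" between the two years.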
[24] For that, it is useful to use the “eststo” command (download it first if not already installed).

        The adjacency coefficient (“contig”) usually lies in the vicinity of 0.5 (Head, K. 2003), suggesting that trade is around 65% [25] higher as a result of sharing a border. That means that the omission of this simple variable may cause (as in our case) an upward bias (in absolute value) in the distance parameter (contiguity and distance are negatively related to each other).
        Contiguity and common language seem to have very comparable effects, with coefficients around 0.5 (Head, K., & Mayer, T. (2013), see table above). According to some papers, common links (language, colony, ...) may cause very significant rises in trade (up to two or three times, or even more).
        Colonial links are not significant in our regression given the particular nature of the sample (only OECD countries included).
        The “landlocked” variables are weakly significant: a small and positive coefficient for exporters and a much larger one for the PARTNER (importer), resulting, in that case, in a reduction of imports of around 42% (coeff.=0,357).

- REG3: ADDING DUMMIES TO CONTROL FOR THE MRT EFFECT
    o According to the meta-analysis (table shown before), gravity models estimated without controlling for MRT terms are biased (compare the “all gravity” and “structural gravity” sections).
    o REG 3.1. Try to estimate the previous REG2 for a cross section in 2005, with robust inference, adding country dummies importer_* and exporter_* to control for MRT. Compare this estimation with the previous one [26] (without MRT terms).

[25] Remember that, in a log-log model, the raw coefficients of dummies do not represent percentage changes. The percentage effect can be easily derived as exp(β)-1, so for a coefficient of 0.5 we get exp(0.5)-1=0.649.
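The conversion from a dummy coefficient to a percentage effect follows directly from the log specification. As a worked example (the symbols M for imports and D for the dummy are introduced here for illustration only), holding all other regressors fixed:

```latex
\ln M_{D=1} - \ln M_{D=0} = \beta
\quad\Longrightarrow\quad
\frac{M_{D=1} - M_{D=0}}{M_{D=0}} = e^{\beta} - 1
\\[6pt]
\beta = 0.5:\quad e^{0.5} - 1 \approx 0.649 \;(\text{about } +65\%)
\qquad
\beta = -0.5:\quad e^{-0.5} - 1 \approx -0.393 \;(\text{about } -39\%)
```

The two cases show that the transformation is not symmetric: coefficients of equal size and opposite sign imply different percentage changes.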
[26] To compare common coefficients using “esttab”, remember to store this equation in memory [STATA: eststo est2] and then compare the common coefficients, dropping the MRT dummies *exporternum and *importernum [STATA: esttab, r2 ar2 se scalar(rmse) drop(*exporternum *importernum)].

        Important differences appear for the common coefficients. In the case of distance (“ldist”), the elasticity is greater than the previous one and well above 1 in absolute value (as expected according to the meta-analysis).
        Contiguity is no longer significant, and the rest of the trade resistance or incentive variables change their values.
        Given that importer_* and exporter_* are country specific (not pair specific), they are perfectly collinear with other country-specific variables such as REPlandlocked, PARTlandlocked, lgdp_importer and lgdp_exporter. So, as expected, after adding country dummies we CAN NO LONGER estimate the parameters of other country-level variables (GDP, *landlocked) [27].
    o REG 3.2. How can we add country dummies to control for MRT without losing the estimates of country-specific variables such as GDP? A pooled OLS regression (NOT A PANEL) for a short period (2000-2005) could be a solution, at least for those lost country-specific variables that DO VARY over time (GDP, for example) but, obviously, not for country time-INVARIANT variables (such as REPlandlocked / PARTlandlocked). Let's then repeat the previous regression for the period 2000-2005 (adding also year* dummies [28]).
        In effect, the GDP coefficients can now be estimated and, in line with the literature, the elasticities drop substantially (down to 0.6) in this “structural” version compared to the previous estimates (without controlling for MRT).
        Additionally, some variables related to trade incentives appear to be clearly significant (colony, comlang_off, contiguity, religion, ...).

[27] A better way to say it is that the coefficients of the country dummies capture the aggregate value of ALL the country specificities.
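REG 3.1 and REG 3.2 could be sketched as follows. This is an illustration under assumptions: exporternum and importernum are taken to be numeric country identifiers (consistent with the names dropped in the esttab call of the footnote), the country dummies are entered with factor-variable notation instead of pre-built importer_* and exporter_* variables, and est3 and the local macro x are hypothetical names. The eststo and esttab commands come from the user-written estout package.

```stata
* ssc install estout      // run once if eststo / esttab are not installed

* Regressors of REG2
local x contig comlang_off onein colony REPlandlocked PARTlandlocked religion ldist lgdp*

* REG 3.1: 2005 cross section with exporter and importer dummies.
* Country-level regressors (GDPs, landlocked) are collinear with the dummies and are omitted.
regress limports `x' i.exporternum i.importernum if year == 2005, robust
eststo est2

* REG 3.2: pooled OLS for 2000-2005 with country and year dummies.
* Time-varying country variables (GDPs) can now be estimated.
regress limports `x' i.exporternum i.importernum i.year, robust
eststo est3

* Compare the common coefficients, hiding the country dummies
esttab est2 est3, r2 ar2 se scalar(rmse) drop(*exporternum *importernum)
```

Because a local macro is used, the block has to be run as a whole from a do-file rather than line by line.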
[28] Commonly, year dummies control for omitted terms causing secular / trend variation in panel data models (affecting, in our example, world trade for every single exporter-importer pair).

    o REG 3.3. What if we now add country x time dummies, allowing for time-varying MRT? (In the previous regression, the MRT terms were assumed to be constant over time.) The answer is that, given that the MRT terms now vary over time, we lose again the estimates of the country-specific time-varying variables (such as GDP).
    o REG 3.4. What if we now add country-pair dummies to control for pair heterogeneity? This is problematic, because adding “pairid” fixed effects does not allow us to estimate the coefficients of any country-pair variable such as distance, colony, onein, ...

SO IF WE CONTROL FOR ALL FIXED EFFECTS AT THE SAME TIME (COUNTRY, YEAR, COUNTRY X YEAR, AND COUNTRY-PAIRS) WE THEN LOSE THE REST OF THE PARAMETERS (except for the fixed effects).

- REG4: PANEL DATA (Step 8 in UNCTAD-WTO document)
    o Adding country-pair dummies in a POOLED estimation is roughly equivalent to using a panel data estimation with fixed effects.
    o Set the panel data structure (remember that the panel observation refers to “ij” pairs).
    o Estimate a simple panel data FIXED effects model to control for bilateral MRT (including, also, time effects).
      Notice again that, when controlling for bilateral MRT terms with FE, we will be unable to estimate the coefficients of any TIME-INVARIANT variable, either at the level of “ij” pairs (such as distance, colony, common language, FTA) or at the level of “i” and/or “j” (such as landlocked).
    o Using RANDOM effects, we may estimate every coefficient (missed with FE) but, as always when we move from FE to RE, at the risk of biased estimates.
    o Check the possibility of RE vs FE (using a simple Hausman test).
    o Apparently, RE is not the right option, so we need to stick to fixed effects.
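The REG4 sequence (declare the panel, estimate FE and RE, run the Hausman test) could be sketched as follows. The regressor lists are illustrative assumptions rather than the specification of the UNCTAD-WTO do-file; pairid is the country-pair identifier mentioned above, and fe_model and re_model are hypothetical names. Both models are estimated without robust standard errors because the standard hausman command requires it.

```stata
* Panel unit is the country pair "ij", time variable is the year
xtset pairid year

* Fixed effects: only time-varying regressors can be estimated
xtreg limports lgdp_exporter lgdp_importer i.year, fe
estimates store fe_model

* Random effects: time-invariant pair variables can be added
xtreg limports lgdp_exporter lgdp_importer ldist contig comlang_off colony i.year, re
estimates store re_model

* Hausman test on the coefficients common to both models
* (consistent estimator first, efficient estimator second)
hausman fe_model re_model
```

A small p-value rejects the hypothesis that the RE estimates are consistent, which is the situation the text describes when it concludes that fixed effects should be kept.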
SIMULATIONS MODELS FOR INTERNATIONAL TRADE
GRAVITY EQUATIONS FOR INTERNATIONAL TRADE MODELS
Paris-Dauphine / September 2016
DOCUMENT 3: Work to do: Simulation Exercise for the effect of NAFTA on trade [29]
Ramón Mahía – UAM (Based on the material provided by UNCTAD-WTO) [30]

1.- BACKGROUND

NAFTA was conceived as a regional trilateral trade agreement between Canada, Mexico, and the United States that came into force in 1994.

The basic idea is to use a gravity model to empirically test the effects of the NAFTA agreement in terms of trade creation (an increase in total trade as a result of the FTA) and/or trade diversion (trade reallocated from non-FTA members to FTA members).

Suppose that countries “i” and “j” belong to a common FTA, whereas country “k” does not. If, after the FTA’s formation, “i” imports more from “j” and less from “k”, trade diversion is likely. If, in contrast, country “i” imports more from “j” and from “k”, trade creation is likely (see Box 3.1, page 109 of the WTO Manual to understand how to empirically test “trade creation” and “trade diversion”).

The task is to complete the exercise that you can find in the WTO Manual, Chapter 3, page 131, sections 2, 3 and 4. Instructions are very clear in the text, and a do-file is also provided by WTO in case you need to explore and work through some extra details.

The MINIMUM work to do is to go through the following instructions and make some comments on the basic econometric results obtained.

EXTRA POINTS will be obtained for extra work such as:
    o Adding some preliminary graphs or descriptive analysis (before the econometric estimation)
    o Enriching the econometric exercise:
        Using additional explanatory variables as covariates
        Testing the variation of the NAFTA effect over time
        Trying alternative estimation / specification strategies (for example, using random effects or alternative ways of addressing the MRT issue without a panel)

Preliminary steps:
- We don’t need to build up the data file according to section 1 (Preliminaries).
The file is already prepared as agGravityData.dta.

[29] A regional trilateral trade agreement signed by Canada, Mexico, and the United States, that came into force on January 1, 1994.
[30] IMPORTANT NOTE: The content of this document, and specially the exercise section, is based on the document prepared by UNCTAD-WTO entitled “A Practical Guide to Trade Policy Analysis” (Chapter 3. Analyzing bilateral trade using the gravity equation). To access the on-line version of this UNCTAD-WTO doc, visit the WEB page: http://vi.unctad.org/tpa/index.html

- The file agGravityData.dta contains enough information to build a gravity model, including trade flows and some other basic information for around 80 countries and for the period 1982-2004.

    use "C:\Users\RAMON\Desktop\GRAVITY\Practical guide to TPA\Chapter3\Datasets\agGravityData.dta", clear

- Nevertheless, one thing to do is to create four NAFTA dummies in order to test the impact of NAFTA on trade:
    o One dummy to simply identify NAFTA members, “nafta” (from year 1994)
    o A second one to identify intra-NAFTA bilateral trade, “intra_nafta” (observations where both the importing and the exporting country are NAFTA members) (from year 1994)
    o A third one, “imp_nafta_rest”, to identify imports of a NAFTA member from a NON-NAFTA member (from year 1994)
    o A fourth one, “exp_nafta_rest”, to identify exports from a NAFTA member to a NON-NAFTA member (from year 1994)

    gen nafta = (ccode=="CAN" | ccode=="MEX" | ccode=="USA")
    label var nafta "1 if home is nafta member"
    gen pnafta = (pcode=="CAN" | pcode=="MEX" | pcode=="USA")
    label var pnafta "1 if partner is nafta member"

    gen intra_nafta = (ccode=="CAN" | ccode=="MEX" | ccode=="USA") & (pcode=="CAN" | pcode=="MEX" | pcode=="USA")
    replace intra_nafta = 0 if year < 1994
    label var intra_nafta "1 if trade between nafta members"

    gen imp_nafta_rest = (ccode=="CAN" | ccode=="MEX" | ccode=="USA") & (pcode!="CAN" & pcode!="MEX" & pcode!="USA")
    replace imp_nafta_rest = 0 if year < 1994
    label var imp_nafta_rest "1 if nafta's imports from the rest of the world"

    gen exp_nafta_rest = (pcode=="CAN" | pcode=="MEX" | pcode=="USA") & (ccode!="CAN" & ccode!="MEX" & ccode!="USA")
    replace exp_nafta_rest = 0 if year < 1994
    label var exp_nafta_rest "1 if nafta's exports to the rest of the world"

Econometric estimations:
- We start by declaring the panel structure (and generating some logarithms):

    egen id = group(ccode pcode)
    tsset id year
    gen lnV = log(imp_tv)
    label var lnV "value of imported goods in logarithm"
    gen lncGDP = log(cgdp_current)
    label var lncGDP "home's current GDP in logarithm"
    gen lnpGDP = log(pgdp_current)
    label var lnpGDP "partner's current GDP in logarithm"
    gen lnD = log(km)
    label var lnD "bilateral distance in logarithm"

- We will then estimate a first fixed effects panel gravity equation for the logarithm of the value of imports (lnV), including the logs of GDPs, distance, the NAFTA trade creation and trade diversion dummies (“intra_nafta” and “imp_nafta_rest”) and year dummies. The coefficients of both dummies can be used as an empirical test of trade creation and/or trade diversion:

    * lnD is constant within each country pair, so the fixed effects estimator will omit it
    xtreg lnV lncGDP lnpGDP lnD intra_nafta imp_nafta_rest exp_nafta_rest i.year, fe robust
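Once the fixed effects equation has been estimated, the NAFTA dummies can be read against the definitions given in the Background section. The lines below are a minimal sketch, to be run right after the xtreg command; the conversion to percentages uses the exp(β)-1 rule for dummy coefficients in a log specification.

```stata
* Percentage change in trade implied by each dummy coefficient
display "intra-NAFTA trade (%): " 100*(exp(_b[intra_nafta]) - 1)
display "NAFTA imports from the rest of the world (%): " 100*(exp(_b[imp_nafta_rest]) - 1)

* H0: neither margin changed after 1994
test intra_nafta imp_nafta_rest
```

Following the Background definitions, a positive intra_nafta effect together with a negative imp_nafta_rest effect would point to trade diversion, whereas a positive intra_nafta effect with a non-negative imp_nafta_rest effect would be more consistent with trade creation.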