SIMULATIONS MODELS FOR INTERNATIONAL TRADE
GRAVITY EQUATIONS FOR INTERNATIONAL TRADE MODELS
Paris-Dauphine / September 2016
DOCUMENT 1:
CLASS SCRIPT: BASIC DEFINITIONS AND CONCEPTS
Ramón Mahía – UAM
(Based on the material provided by UNCTAD-WTO)1
SOME DOCS AND SOURCES OF INTEREST
Apart from the main reference mentioned in the footnote, here is a selection of RELEVANT WORKS (related
to this document) for those who want to go further into this specific topic:

Head, K. (2003), “Gravity for beginners”, mimeo, University of British Columbia.
http://vi.unctad.org/tda/background/Introduction%20to%20Gravity%20Models/gravity.pdf

Head, K., & Mayer, T. (2013). Gravity equations: Workhorse, toolkit, and cookbook. Handbook of
international economics, 4. http://strategy.sauder.ubc.ca/head/papers/headmayer_revised.pdf
A really complete and in-depth work on gravity equations and their use in empirical exercises
(almost 70 pages!).

Luca Salvatici (2013). The Gravity Model in International Trade AGRODEP Technical Note TN -04.
http://www.agrodep.org/sites/default/files/Technical_notes/AGRODEP-TN-04-2_1.pdf
For those seeking technical details on some of the main issues on the econometric side of gravity
equations.

Silva, J. S., & Tenreyro, S. (2006). The log of gravity. The Review of Economics and statistics,
88(4), 641-658.
http://personal.lse.ac.uk/tenreyro/jensen08k.pdf
Seminal work on the econometric implications of Jensen’s inequality in the context of the gravity
equation.

Anderson, J. E. and van Wincoop, E. (2003), “Gravity with gravitas: a solution to the border
puzzle”, American Economic Review 93: 170–92.
http://www.econ.ku.dk/Nguyen/teaching/Anderson%20van%20Wincoop%202003%20Gravitas.pdf
Seminal paper on the role of the Multilateral Trade Resistance term in the gravity equation.

http://www.agrodep.org/sites/default/files/Technical_notes/AGRODEP-TN-05_2.pdf
Another “accessible” example of a STATA Gravity Model exercise.

1 IMPORTANT NOTE: The content of this document, and especially the exercise section, is
based on the document prepared by UNCTAD-WTO entitled “A Practical Guide to Trade
Policy Analysis” (Chapter 3. Analyzing bilateral trade using the gravity equation). To access
the on-line version of this UNCTAD-WTO doc, visit the WEB page:
http://vi.unctad.org/tpa/index.html
1.- SOME BASIC THEORY AND DEFINITIONS
1.1.- Foundations of a Gravity Model for international trade:
- What is it about?
Following Newton's expression for masses, the Gravity Model for trade2 states that, broadly speaking,
trade (attraction) between two countries is proportional to their size (size of the economy) and
inversely proportional to their “distance”.
- The basic expression is formulated as something like:
“the trade flow from country i to country j (Xij) is proportional to the PRODUCT of the “masses” of
the two countries (namely their GDPs, Yi and Yj) and inversely proportional to their distance, Dij (broadly
construed to include all factors that might create trade resistance)”.
$$X_{ij} = \alpha_0 \, Y_i^{\alpha_1} \, Y_j^{\alpha_2} \, D_{ij}^{\alpha_3} \, \varepsilon_{ij}$$
o α0: Coefficient (intercept) not depending on “i” or “j”
o Xij: Exports from “i” to “j” (or imports of “j” from “i”)
o Yi: Exporter factors (GDPi, for example…)
o Yj: Importer factors (GDPj, for example...)
o Dij: Distance / trade barriers faced by exporter “i” to enter / reach market “j”
o εij: Random term (reflecting the random nature of the equation, in contrast to the determinism of Physics)
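As a quick numerical illustration of the basic expression above, the following Python sketch evaluates the multiplicative form for made-up values. Python is used here only as a neutral calculator (the course exercises use STATA). The function name `gravity_flow`, the parameter values (α0 = 1, α1 = α2 = 1, α3 = -1) and the GDP and distance figures are assumptions chosen for illustration, not estimates from the course material.

```python
def gravity_flow(y_i, y_j, d_ij, a0=1.0, a1=1.0, a2=1.0, a3=-1.0, eps=1.0):
    """Evaluate X_ij = a0 * Y_i^a1 * Y_j^a2 * D_ij^a3 * eps (illustrative values only)."""
    return a0 * (y_i ** a1) * (y_j ** a2) * (d_ij ** a3) * eps


# Hypothetical exporter GDP, importer GDP and bilateral distance.
base = gravity_flow(y_i=2000.0, y_j=500.0, d_ij=1000.0)

# With a distance elasticity of -1, doubling the distance halves the flow.
ratio_far = gravity_flow(2000.0, 500.0, 2000.0) / base

# With a GDP elasticity of 1, doubling the exporter's GDP doubles the flow.
ratio_big = gravity_flow(4000.0, 500.0, 1000.0) / base

print(base, ratio_far, ratio_big)
```

Because the form is multiplicative, each exponent reads directly as an elasticity, which is why the log-log version is so convenient in empirical work.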
- Does this naïve approach work?
Surprisingly enough, even before a theoretical basis was found, simple gravity equations did a
pretty successful job of explaining bilateral trade. The gravity approach has withstood the test of time and
is still one of the most widely used models for international trade simulation / forecasting.
2 Suggested in the sixties by Tinbergen, J. (1962) Shaping the World Economy: Suggestions for an International Economic Policy. New York: The
Twentieth Century Fund, and Pöyhönen (1963) A tentative model for the volume of trade between countries. Weltwirtschaftliches Archiv 90: 93–99.
- Is it really a useful tool for empirical analysis?
YES, the gravity approach is, in fact, widely used to infer trade flow potentials, or to estimate the effects
on trade of institutions such as customs unions, monetary agreements, exchange rate mechanisms,
ethnic ties, linguistic identity or international borders. Some widespread applications are:
a. Apart from simulation, the simple gravity equation offers a very robust baseline specification for a structural bilateral trade model and thus it is commonly used to estimate
the marginal effects of other relevant factors facilitating or inhibiting trade.
b. Simulate scenarios for international trade adjustments:
   ▪ Related to regular changes in the main explanatory variables (such as GDPs, tariffs)
   ▪ Related to particular events / policy issues that may create trade distortions:
     Free Trade Agreements (Trade Creation vs Trade Diversion), Currency Unions, Political
     Federations, “Tariffication”,…
c. Quantify trade potential between two countries. Basically, a gravity equation is estimated
using data only for those countries that have supposedly reached their trade potential
(for example, those belonging to a free trade area) and then, after some adjustments, the
estimated equation is applied to pairs of countries outside the sample: actual trade is then
compared with predicted (potential) trade.
d. Measure determinants of bilateral trade resistance / cost. A gravity equation can also be
used “in reverse” to measure bilateral trade costs at an aggregate level. The idea is to solve a
theoretical gravity equation for the trade costs term instead of trade flows and to express
these costs as a function of the observable trade data.
e. Tariff(ication) of a Quota. Sometimes non-tariff barriers (such as quotas) are replaced by
tariff-only measures. The idea is to estimate a gravity equation for a market
(dataset) where tariff barriers (between some countries) and quotas (between others)
coexist. Then, using the coefficients of both resistance terms (tariffs and quotas), a tariff
equivalent of the quota can be “easily” computed.3
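To make application (e) concrete, here is a hedged worked example. The specification below is an assumption of this sketch (a log tariff term plus a 0/1 quota dummy), not a formula quoted from the source, and the coefficient values are invented for illustration.

```latex
% Assumed specification: tariffs enter as ln(1 + tau_ij), quotas as a dummy Q_ij
\ln X_{ij} = \dots + \beta_{\tau} \ln(1 + \tau_{ij}) + \beta_{q} \, Q_{ij} + \varepsilon_{ij}

% The tariff equivalent tau^* of the quota is the tariff with the same trade effect:
\beta_{\tau} \ln(1 + \tau^{*}) = \beta_{q}
\quad \Longrightarrow \quad
\tau^{*} = \exp\!\left( \frac{\beta_{q}}{\beta_{\tau}} \right) - 1

% Illustrative (made-up) estimates: beta_tau = -2, beta_q = -0.5
\tau^{*} = \exp(0.25) - 1 \approx 0.284 \quad \text{(a tariff of about 28\%)}
```

The result is only as reliable as the two estimated coefficients, so in practice their standard errors carry over to the computed equivalent.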
- Is physical distance really a variable related to trade?
Newcomers feel comfortable with the link between GDPs and trade4, but is distance really a matter of interest?
The answer is yes. EMPIRICAL studies clearly illustrate that, overall, distance matters SO
MUCH because it proxies many relevant sources of trade resistance: transport costs,
transaction costs, perishability / losses of goods during transport, synchronization costs,
communication costs, cultural distance…
- Some other, almost evident, main features of the equation:
o It has a multiplicative form (with crucial empirical / econometric implications)
o It has a very flexible “nature”:
   ▪ Unit of observation: “i” and “j” might represent countries, regions, provinces, or
     even firms…
   ▪ “X” might represent TOTAL trade, trade in a given product, sectoral trade, the share of
     trade5,…
   ▪ …or even a NON-TRADE flow: FDI, shopping trips, migration6, commuting,
     passengers, fleets…
   ▪ “distance” could be understood as “positive” (in the sense of obstructing trade, such
     as physical distance, tariff barriers,…) or “negative” (and thus related to factors that
     facilitate trade, such as the existence of special transport infrastructure, sharing a
     common border or a common language, the existence of regular migration, cultural
     ties, a common past…)
   ▪ A time subindex “t” might be added to every term (if a panel dataset is used instead
     of a cross section)
3 See the example in section 2 of the UNCTAD-WTO doc (already mentioned in this text) entitled “A Practical Guide to Trade Policy Analysis” (Chapter 3.
Analyzing bilateral trade using the gravity equation). To access the on-line version of this UNCTAD-WTO doc, visit the WEB page:
http://vi.unctad.org/tpa/index.html
4 Try to plot the relationship between the GDP of the exporter and imports for a given country, using the data of example 1 for OECD countries, to
see how clear it looks (2000)
- Far beyond an intuitive understanding: is there any solid theoretical background to support the use
of Gravity Models?
Yes. Apart from being intuitive and empirically successful (at least apparently), some important studies have
provided THEORETICAL foundations for the gravity model/equation7, deriving gravity
equations as a valid representation of a variety of theoretical trade models8. This was, and still is,
one of the reasons why gravity models are enjoying a “second youth”.
Theory was largely developed after Gravity Models had proved empirically successful but, at the same
time, these theoretical foundations have become critical to understanding the right specification of the
empirical model, the “correct” signs and sizes of the estimated coefficients, and the proper way of
conducting econometric inference. For example:
o According to the theoretical background, the elasticities of trade with respect to GDPs and distance should be around
1 in absolute value (positive for GDPs, negative for distance)
o According to the underlying theory, the gravity equation SHOULD follow a multiplicative function
o According to the theoretical derivation, the gravity equation MUST also include a Multilateral
Resistance Term OR “remoteness” term (we will talk about it later on)
o …and many more interesting facts that frame and condition empirical exercises
5 BUT… be careful when dealing with disaggregated data, because some critical issues appear in empirical work
6 There are a number of research papers focusing on the impact of migration on international trade (or on FDI) and, very frequently, a gravity equation
is used as the empirical framework. A good meta-analysis can be found in Genc, M., Gheasi, M., Nijkamp, P., & Poot, J. (2012). The impact of
immigration on international trade: a meta-analysis. Migration Impact Assessment: New Horizons, 301.
https://books.google.es/books?hl=es&lr=&id=_na1YGdwHucC&oi=fnd&pg=PA301&ots=9amGaDVC_9&sig=aXMdwv4fMUmDYy9Tm7x2Ti7mU8c#v=onepage&q&f=false
7 See an intuitive sketch in the text Head, K. (2003), “Gravity for beginners”, University of British Columbia.
http://vi.unctad.org/tda/background/Introduction%20to%20Gravity%20Models/gravity.pdf
8 Based on increasing returns to scale for the case of imperfectly competitive markets with firm-level product differentiation and/or, alternatively,
on perfectly competitive markets with product differentiation at the national level.
2.- INTRODUCTION TO EMPIRICAL ESTIMATION OF A GRAVITY MODEL
2.1.- Basic expression:
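o Normally, a log-log version of a generalized gravity model takes the form:
$$\ln X_{ij} = a_0 + a_1 \ln Y_i + a_2 \ln Y_j + a_3 \ln t_{ij} + \varepsilon_{ij}$$
or, once the multilateral resistance terms discussed in section 2.2 are included:
$$\ln X_{ij} = a_0 + a_1 \ln Y_i + a_2 \ln Y_j + a_3 \ln t_{ij} + a_4 \ln \Pi_i + a_5 \ln P_j + \varepsilon_{ij}$$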
o “Xij”: “ij” trade flow
   ▪ ATTENTION:
     - Xij represents only ONE flow, from “i” to “j” (exports, for example), so we will
       have a second, different observation in the model for Xji (exports
       from “j” to “i”); we should not aggregate “ij” and “ji” by averaging the
       reciprocal trade flows9
     - Normally, IMPORTS are more commonly used as flows than exports
     - FOB or CIF values? One might think that it depends on whether we are
       measuring Xij as exports from “i” to “j” or as imports of “j” from “i”, but NO:
       basically, FOB values are the best option (if available)10.
o Yi, Yj: Size of economies “i” and “j” (usually a measure of income such as GDP)
   ▪ GDPs and trade flows (X) should be measured in nominal terms (not real terms)11
   ▪ Sometimes, instead of using origin and destination GDPs, the product of both
     GDPs is used as a single variable (or the share of that product over total GDP)
o tij: could contain both BARRIERS and/or INCENTIVES to trade between “i” and “j”
   ▪ (dij): Bilateral distance “ij” (computed in several ways12)
   ▪ Natural barriers and/or cost-related barriers (dummies 0/1):
     one of the two being an island, one of the two being a landlocked country
     (a country entirely enclosed by land)…
   ▪ Incentives:
     - Access to sea
     - Common border or adjacency
     - Existence of a Bilateral, Regional or Multilateral Trade Agreement (usually with a dummy, although some
       caveats should be made13)
     - Other factors related to lower information costs:
       o Common language
       o Cultural / Historical ties (one having been a colony of the other, or sharing a
         common colonizer)
o “a’s” = elasticities (as always in a log-log expression) and a3=1-σ (σ being the elasticity of
substitution14)
9 We will have one observation for each flow. Imagine the case of trade between a big and a small country: one of “ij” or “ji” might be very high
and the other one might be very low. The equation could explain both, given the relative sizes of “i” and “j” (as exporter and importer), but it
could never explain the average.
10 With FOB values, the seller bears only the cost of moving the goods on board the aircraft or ship at origin, BUT all further costs of
getting the goods to the buyer's place (mainly related to distance) have to be met by the buyer. Using CIF data may lead to simultaneity
bias, as the dependent variable includes costs that are correlated with the right-hand-side variables for distance and other trade costs.
11 “Gravity is an expenditure function allocating nominal GDP into nominal imports; therefore inappropriate deflation probably creates biases
via spurious correlations”. UNCTAD-WTO A Practical Guide to Trade Policy Analysis
12 Both in terms of the distance measurement (Euclidean, great-circle, orthodromic…) and in terms of “where” to measure (economic “poles”,
capitals, main ports,…)
2.2.- A basic amendment: the importance of Multilateral Trade Resistance terms
(MTR)
The basic idea is that, in the modern Gravity Equation, we NEED TO ACCOUNT for the so-called MTR
terms. “The gravity equation tells us that bilateral trade, after controlling for size, depends on the bilateral
trade barriers between “i” and “j” BUT RELATIVE to the product of their Multilateral Resistance Indexes”15.
- What does MTR mean?
The Multilateral Resistance Index between a pair of countries “ij” represents the trade
resistance/barrier/distance “ij” relative to the average trade resistance/barriers that both countries
face with all their trading partners.
- Why do we need to account for these MTR terms?
Two neighboring countries will trade more with each other if they are isolated from the rest of the world.
A good example of this idea is found in Head, K. (2003): “The importance of remoteness in actual trade
patterns can be illustrated by comparing trade between Australia and New Zealand with trade
between Austria and Portugal. The distance between each pair’s major cities is approximately the
same: Lisbon–Vienna and Auckland–Canberra both happen to be 1430 miles apart. Furthermore the
product of their GDP’s are similar (Australia–New Zealand is only 20% smaller). Hence, omitting
remoteness, the gravity equation would predict that Austria–Portugal trade would be slightly larger.
In fact, however, in 1993 Australia–New Zealand trade was nine times greater than Austria–Portugal
Trade”.
- It is easy to see why higher multilateral resistance of the importer “j” raises its trade with “i”.
Price/Cost advantage: for a given bilateral barrier between “i” and “j”, higher barriers between “j”
and its other trading partners will reduce the relative price of goods from “i” and raise imports from
“i”.
- Higher multilateral resistance of the exporter also raises trade: higher trade barriers faced by an
exporter will lower the demand for its goods and therefore its supply price “pi”. For a given bilateral
barrier between “i” and “j”, this raises the level of trade between them.
13 A dummy is not able to capture trade agreements with asymmetrical benefits/conditions for their members, with different treatment for different products, or
simply with a progressive effect. A single dummy supposes that the “treatment” effect is the same for all the countries participating in the
agreement.
14 Remember that this term refers to the elasticity of substitution between domestic and traded goods. The higher the elasticity,
the greater the effect of resistance / promotion terms on trade.
15 See Anderson and van Wincoop (2003) for a much more detailed explanation
- What if we omit MTR?
The empirical problem is that these MTRs are obviously highly correlated with bilateral trade barriers
(by and large, the higher the MTR, the higher the trade barriers). Omitting MTR induces a potentially
severe bias in the coefficients of distance, border variables and other trade resistance measures.
- The Gravity Equation considering MTR
The previous caveat means that, in the gravity equation, we have to include MTR (or its inverse, that
is, ease of trade) for BOTH “i” and “j”. Following Anderson and van Wincoop’s derivation, the
gravity equation turns out to be:
$$X_{ij} = \frac{Y_i Y_j}{Y} \left( \frac{t_{ij}}{\Pi_i P_j} \right)^{1-\sigma}$$
Where Y is world (total) GDP and, in the “resistance to trade” term, we find:
o tij: (vector of) bilateral trade resistance for imports of “j” from “i”
o Πi: Ease of access of exporter “i”
o Pj: Ease of access of importer “j”
  Πi and Pj are low if the countries are “remote” from world markets (physically or in terms
  of high trade protection, for example).
o σ: elasticity of substitution
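Taking logs of the structural equation above shows how it maps into the log-log specification of section 2.1. The derivation is a sketch that uses only the symbols already defined in the text; the value σ = 5 in the last lines is an illustrative assumption, not an estimate.

```latex
\ln X_{ij} = -\ln Y + \ln Y_i + \ln Y_j
             + (1-\sigma)\ln t_{ij}
             - (1-\sigma)\ln \Pi_i
             - (1-\sigma)\ln P_j

% Matching terms with  a_0 + a_1 ln Y_i + a_2 ln Y_j + a_3 ln t_ij + a_4 ln Pi_i + a_5 ln P_j :
a_0 = -\ln Y, \qquad a_1 = a_2 = 1, \qquad a_3 = 1-\sigma, \qquad a_4 = a_5 = \sigma - 1

% Illustration with an assumed sigma = 5, so a_3 = -4:
% a 10% rise in bilateral trade costs t_ij changes trade by the factor
(1.1)^{-4} \approx 0.683 \quad \text{(a fall of about 32\%)}
```

This is why the theory predicts unit GDP elasticities and why the estimated trade cost coefficient mixes the cost effect with the elasticity of substitution.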
- How to deal with the non-observable nature of MTRs:
In their seminal work, Anderson and van Wincoop16 proposed the use of a nonlinear iterative method
to estimate MTR effects but, commonly, the empirical solution is to use a linear estimator following
one of these two empirical strategies:
1. If we use a cross-section dataset:
a. Proxy MTR using remoteness-like indexes17.
For example, compute the (GDP-weighted) average distance of trader “i” to all
countries/regions other than “j” (and vice versa):
$$REM_i = \sum_{m \neq j} \frac{d_{im}}{Y_m} \quad \text{or} \quad REM_i = \sum_{m \neq j} \frac{d_{im}}{Y_m / Y}$$
   ▪ The major drawback relates to the use of distance. It is difficult to find a good
     way of measuring the distance “ij” and, even if we manage to, it might be a
     simplistic way of measuring MTRs.
   ▪ This is the only option if we are interested in country-specific variables (see
     later how country-specific dummies would preclude the estimation of such
     parameters)
16 For details see section 3.2. (Iterative structural estimation) in “Gravity Equations: Workhorse, Toolkit, and Cookbook”, Keith Head and Thierry
Mayer, http://www.cepii.fr/PDF_PUB/wp/2013/wp2013-27.pdf
17 See section 3.1. (Proxies for multilateral resistance term) in “Gravity Equations: Workhorse, Toolkit, and Cookbook”, Keith Head and Thierry
Mayer, http://www.cepii.fr/PDF_PUB/wp/2013/wp2013-27.pdf
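The second remoteness formula can be sketched in a few lines of Python. This is an illustrative sketch, not the course's STATA code: the function name `remoteness`, the four-country data and the choice to also skip the country itself (m = i, since no internal distance is defined here) are assumptions.

```python
def remoteness(i, j, dist, gdp):
    """Remoteness of country i, leaving out partner j.

    Computes REM_i = sum over m != j of d_im / (Y_m / Y), where Y is the sum of all GDPs.
    Assumption of this sketch: the country itself (m == i) is skipped as well.
    """
    world_gdp = sum(gdp.values())
    total = 0.0
    for m in gdp:
        if m in (i, j):
            continue
        total += dist[frozenset((i, m))] / (gdp[m] / world_gdp)
    return total


# Hypothetical GDPs and symmetric bilateral distances.
gdp = {"A": 100.0, "B": 300.0, "C": 400.0, "D": 200.0}
dist = {
    frozenset(("A", "B")): 800.0,
    frozenset(("A", "C")): 1000.0,
    frozenset(("A", "D")): 500.0,
    frozenset(("B", "C")): 600.0,
    frozenset(("B", "D")): 900.0,
    frozenset(("C", "D")): 700.0,
}

rem_a = remoteness("A", "B", dist, gdp)  # 1000 / 0.4 + 500 / 0.2
rem_b = remoteness("B", "A", dist, gdp)  # 600 / 0.4 + 900 / 0.2
print(rem_a, rem_b)
```

In a regression, the logs of both indexes (for the exporter and for the importer) would enter as additional regressors standing in for the unobserved MTR terms.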
b. Use country dummies for countries “i” and “j” to capture this unobserved
heterogeneity:
$$\ln X_{ij} = a_0 + a_1 \ln(GDP_i) + a_2 \ln(GDP_j) + a_3 I_i + a_4 I_j + a_5 \ln(t_{ij}) + \varepsilon_{ij}$$
The problem with this approach (as we will see later) is that, if we include
country dummies in a cross-section, we cannot estimate/identify the
parameters of other country-specific variables (such as GDPi or GDPj!), so
this approach is only valid when the interest is in bilateral / country-pair
coefficients (such as the effect of a common border or the impact of a
bilateral Trade Agreement, for example)
2. If we use a panel dataset:
Apart from controlling for MTR terms, standard Panel Data estimators help us
control for other unobservable bilateral / country-pair heterogeneity (bilateral / country-pair
fixed effects) or secular trend effects.
a. TIME-INVARIANT country dummies for countries “i” and “j” (assuming MTR
constant over time):
$$\ln X_{ijt} = a_0 + a_1 \ln(GDP_{it}) + a_2 \ln(GDP_{jt}) + a_3 I_i + a_4 I_j + a_5 \ln(t_{ij}) + (a_6 I_t) + \varepsilon_{ijt}$$
   ▪ In this case, OTHER country-specific factors (such as GDPs) can be estimated,
     but only as long as those factors vary over time.
   ▪ We need to consider whether the time-invariant MTR assumption is plausible18
   ▪ REMEMBER THAT time-invariant country-pair effects cannot be included
     (Fixed Effects panel data estimator) if our interest lies in other time-invariant
     variables (such as being landlocked, distance, common borders,
     common language, trade agreement,…), because their coefficients would not be
     identified. In this case, Random Effects is an option if we want to obtain
     coefficients for time-invariant variables (at the
     obvious risk, as ever, of bias in the rest of the coefficients)
18 With some exceptions, MTR is constant only within short periods of time.
b. TIME-VARYING country dummies for countries “i” and “j” and period “t”
(allowing MTR to change over time): (Iit) and (Ijt).
   ▪ ATTENTION: if we allow time-varying country dummies, we cannot estimate
     parameters for other time-varying country-specific variables (such as GDP, for
     example), so again this approach is only valid when the interest is in
     bilateral / country-pair coefficients
$$\ln X_{ijt} = a_0 + a_1 I_{it} + a_2 I_{jt} + a_3 \ln(t_{ij}) + (a_4 I_t) + \varepsilon_{ijt}$$
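The identification warnings above can be checked mechanically: a country-specific variable is an exact linear combination of the country dummies unless it varies within country. The Python sketch below is illustrative only; the helper `matrix_rank`, the three countries and the log GDP values are assumptions made for this example.

```python
from itertools import permutations


def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix (list of rows) by Gauss-Jordan elimination."""
    m = [[float(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        if rank == len(m):
            break
        pivot = max(range(rank, len(m)), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot: this column adds no new information
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank:
                f = m[r][col] / m[rank][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


countries = ["A", "B", "C"]

# Cross section: exporter dummies plus the exporter's log GDP (hypothetical values).
lgdp = {"A": 1.0, "B": 2.0, "C": 3.5}
cs_dummies = [[int(i == c) for c in countries] for i, j in permutations(countries, 2)]
cs_with_gdp = [row + [lgdp[i]]
               for row, (i, j) in zip(cs_dummies, permutations(countries, 2))]
rank_cs_dummies = matrix_rank(cs_dummies)
rank_cs_with_gdp = matrix_rank(cs_with_gdp)   # unchanged: GDP is not identified

# Panel: time-invariant exporter dummies, log GDP varying over two years.
lgdp_t = {("A", 0): 1.0, ("A", 1): 1.1, ("B", 0): 2.0,
          ("B", 1): 2.3, ("C", 0): 3.5, ("C", 1): 3.6}
panel = [[int(i == c) for c in countries] + [lgdp_t[(i, t)]]
         for i in countries for t in (0, 1)]
rank_panel = matrix_rank(panel)               # one more: GDP is now identified
print(rank_cs_dummies, rank_cs_with_gdp, rank_panel)
```

The same logic explains case (b): once the dummies are indexed by country and year, any country-year variable such as GDP becomes collinear with them again.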
2.3.- Other empirical (and more advanced) issues of interest.
- There are a handful of relevant issues (some of them critical):
o Critical omitted variables biasing coefficients systematically (migration, for example, which
  might be related to trade and to other exogenous variables)
o How to deal with zero trade values (Xij=0)19
o Log-log form inducing bias in the (highly probable) presence of heteroskedasticity
o Extra caution needed when working with disaggregated data such as sectors, firms,…
o Endogeneity in the gravity equation: causation between trade and trade policy could be reversed
  when, for example in the case of the signature of an FTA, countries select themselves on the
  basis of the intensity of their trade, and not the other way round.
o Spatial correlation
o Static nature: several authors have proposed a dynamic gravity equation in place of the
  traditional static gravity equation
o …and some others…
- How to deal with zeros:
o Reasons for zeros:
   ▪ Real zero trade
   ▪ Values rounded to zero
   ▪ Missing values, sometimes for unknown reasons (not reported by countries, errors in
     dissemination or manipulation,…)
   ▪ …
The problem is that, in many cases, the researcher does not know the proportions
of the different types of “zeros” in the dataset.
o Alternatives (remember that we are using a log-log model, and the log of zero is not defined):
   ▪ Substitute zeros by small numbers (1, for example): only appropriate if the zeros are real zeros or
     almost-zero trade, BUT the arbitrary value chosen has an unpredictable impact on the parameters
   ▪ Simply truncate the sample to avoid zero cases (delete the “ij” cases with zeros): only
     “reasonable” if zeros are true missing values and thus randomly distributed across “n” and
     “t”, BUT there is an obvious selection bias problem otherwise
   ▪ Control for the “selection bias” using a Heckman procedure (Heckman two-step
     estimation, which introduces the inverse Mills ratio in the specification).
     However, that method requires “instrumental” variables that
     explain the selection (zero or positive trade) but not the value of positive trade.
   ▪ Do not use logs, and estimate the model in levels:
     - With a linear estimator = NO: the theoretical foundation of the gravity equation
       implies a multiplicative form
     - With a non-linear estimator:
       o (Pseudo) Poisson maximum likelihood (ML) estimator applied to the levels of
         trade, estimating the non-linear form directly20.
       o Tobit (which allows for a significant proportion of zeros) on the log of trade plus a
         constant: the critique is that Tobit applies to left-censored zeros only when
         those “zeros” have the economic meaning of zeros (no trade or almost no
         trade because of prohibitive trade costs, for example)
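The first two alternatives can be illustrated with a tiny Python sketch. The five trade values and the two additive constants are invented for this example; the point is only that the log of zero does not exist and that the "fix" chosen changes the summary of the data noticeably.

```python
import math

# Hypothetical bilateral flows, two of them equal to zero.
flows = [0.0, 0.0, 120.0, 3500.0, 80.0]

# The log of zero is not defined: math.log raises ValueError.
try:
    math.log(flows[0])
    log_zero_defined = True
except ValueError:
    log_zero_defined = False


def mean_log(values, add=0.0):
    """Average of ln(value + add)."""
    return sum(math.log(v + add) for v in values) / len(values)


# Alternative 1: add an arbitrary constant before taking logs.
mean_plus_one = mean_log(flows, add=1.0)
mean_plus_small = mean_log(flows, add=0.001)

# Alternative 2: truncate the sample, keeping positive flows only.
positive = [v for v in flows if v > 0]
mean_truncated = mean_log(positive)

print(log_zero_defined, mean_plus_small, mean_plus_one, mean_truncated)
```

The three averages differ widely, and regression coefficients inherit that sensitivity, which is the arbitrariness the text warns about.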
20 Silva, J. S., & Tenreyro, S. (2006). The log of gravity. The Review of Economics and statistics, 88(4), 641-658.
- Log inducing bias in the presence of heteroskedasticity
The Gravity model is basically defined as:
$$X_{ij} = \alpha_0 \, Y_i^{\alpha_1} \, Y_j^{\alpha_2} \, D_{ij}^{\alpha_3} \, \varepsilon_{ij}$$
where the random term εij has the standard property:
$$E[\varepsilon_{ij} \mid Y_i, Y_j, D_{ij}] = 1$$
The model is then usually linearized, taking the following simple form:
$$\ln X_{ij} = \ln \alpha_0 + \alpha_1 \ln Y_i + \alpha_2 \ln Y_j + \alpha_3 \ln D_{ij} + \ln(\varepsilon_{ij})$$
The problem is that, as Jensen’s inequality states:
$$E[\ln(\varepsilon)] \neq \ln E[\varepsilon]$$
Instead, the expectation of the log of a random variable, E[ln(ε)], depends not only on the mean
of that variable, E[ε], but also on higher-order moments (such as the variance, V[ε]). Then, when the
random term in the original (nonlinear) model presents heteroskedasticity,
$$V[\varepsilon_{ij}] = \sigma_{ij} = f(Y_i, Y_j, D_{ij})$$
the expected value of its logarithm E[ln(ε)], which is a function of σij, is also a function of the
explanatory variables:
$$E[\ln(\varepsilon_{ij})] = f(\sigma_{ij}) = f(Y_i, Y_j, D_{ij})$$
The result is therefore that some of the parameters in the log model might be biased and inconsistent.
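A worked special case makes the dependence on the variance explicit. The log-normal assumption below is made only for illustration (the argument in the text does not require it); σij denotes the variance V[εij], as above.

```latex
% Assume ln(eps_ij) ~ N(mu, s^2). Then
E[\varepsilon_{ij}] = \exp\!\left(\mu + \tfrac{1}{2}s^{2}\right) = 1
\;\Longrightarrow\; \mu = -\tfrac{1}{2}s^{2},
\qquad
V[\varepsilon_{ij}] = \exp(s^{2}) - 1 = \sigma_{ij}
\;\Longrightarrow\; s^{2} = \ln(1 + \sigma_{ij})

% Hence the mean of the log error is
E[\ln \varepsilon_{ij}] = \mu = -\tfrac{1}{2}\ln\!\left(1 + \sigma_{ij}\right)

% Example: sigma_ij = 1 gives E[ln eps] = -0.5 ln 2, about -0.347,
% while sigma_ij = 0 gives E[ln eps] = 0.
```

If σij grows with, say, distance, then E[ln εij] falls with distance, and OLS on the log model attributes part of that fall to the distance coefficient.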
One way of solving this problem is to assume a pattern of heteroskedasticity in which the conditional variance
is proportional to the conditional mean, as in a Poisson distribution for the endogenous variable, where
E[yi|X]=V[yi|X]. Then PPML (the Poisson pseudo-maximum likelihood estimator), traditionally used for
count data, can be used to estimate the model consistently; the dependent variable does not need to be a count
or to follow a Poisson distribution, as long as the conditional mean is correctly specified. One interesting extra point is that this
estimator is applied to the model in LEVELS, so no logs are needed and thus, at the same time, we fix
the problem of zeros.
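To show what a PPML estimator does under the hood, here is a minimal Python sketch that maximizes the Poisson pseudo-likelihood by Newton-Raphson with step halving. It is a teaching sketch under stated assumptions, not the routine used in the course (which relies on STATA): the function names, the eight data points and the "true" coefficients are all invented for illustration.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fitted(beta, X):
    """Conditional mean mu_i = exp(x_i' beta)."""
    return [math.exp(sum(b * v for b, v in zip(beta, row))) for row in X]


def pseudo_loglik(beta, X, y):
    """Poisson pseudo log-likelihood, constants dropped."""
    return sum(yi * math.log(mi) - mi for yi, mi in zip(y, fitted(beta, X)))


def score(beta, X, y):
    """First-order conditions: sum_i (y_i - mu_i) x_i."""
    mu = fitted(beta, X)
    return [sum((yi - mi) * row[c] for yi, mi, row in zip(y, mu, X))
            for c in range(len(beta))]


def ppml(X, y, max_iter=100, tol=1e-10):
    """Newton-Raphson with step halving; the first column of X must be a constant."""
    k = len(X[0])
    beta = [math.log(sum(y) / len(y))] + [0.0] * (k - 1)
    for _ in range(max_iter):
        mu = fitted(beta, X)
        hess = [[sum(mi * row[c] * row[d] for mi, row in zip(mu, X))
                 for d in range(k)] for c in range(k)]
        step = solve(hess, score(beta, X, y))
        current, scale = pseudo_loglik(beta, X, y), 1.0
        while scale > 1e-8:
            trial = [b + scale * s for b, s in zip(beta, step)]
            if pseudo_loglik(trial, X, y) >= current:
                break
            scale /= 2.0
        beta = trial
        if max(abs(scale * s) for s in step) < tol:
            break
    return beta


# Hypothetical regressors: constant, log of GDP product, log of distance.
points = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2), (2, 2), (2, 0)]
X = [[1.0, float(ly), float(ld)] for ly, ld in points]
true_beta = [0.5, 1.0, -1.0]
y_exact = fitted(true_beta, X)          # trade in LEVELS, generated without noise
beta_exact = ppml(X, y_exact)

y_zeros = list(y_exact)
y_zeros[3] = 0.0                        # zero flows stay in the sample
y_zeros[4] = 0.0
beta_zeros = ppml(X, y_zeros)
print(beta_exact, beta_zeros)
```

Note that the dependent variable enters in levels and is not integer-valued, and that the two zero flows are simply kept in the estimation sample rather than dropped or replaced.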
SIMULATIONS MODELS FOR INTERNATIONAL TRADE
GRAVITY EQUATIONS FOR INTERNATIONAL TRADE MODELS
Paris-Dauphine / September 2016
DOCUMENT 2:
STATA HANDS ON SESSION
Ramón Mahía – UAM
(Based on the material provided by UNCTAD-WTO)21
Complete modified and commented DO File: DO_MODIFIED_COMMENTED
1.- MANIPULATION OF DATA (Previous to Econometric estimation)
The interest of this section is to understand how to build up the type of dataset we will normally need
for a gravity equation estimation. The STATA commands are NOT of particular interest but could help
those facing a particular exercise with similar source datasets.
[Diagram: data preparation workflow, from the original datasets to the final datasets]
ORIGINAL DATASETS: tradeflows.csv (cross panel: i,j;t); gdp.csv (panel: i;t); gdp.dat (panel: i;t); joinWTO.txt (country: i); dist_cepi224.dta (country pairs: i,j); religion.dta (country: i)
INTERMEDIATE DATASETS: tradeflows.dta (cross panel: i,j;t); gdp_exporter.dta (panel: i;t); gdp_importer.dta (panel: j;t); join_exporter.dta (panel: i;t); join_importer.dta (panel: j;t); cepii.dta (country pairs: i,j); gravity_temp1.dta (cross panel: i,j;t); gravity_temp2.dta; gravity_temp3.dta; gravity_temp4.dta
FINAL DATASETS: gravity.dta; gravity_1996_2005.dta; gravity_OECD_2000_2005.dta
Operations shown in the diagram: reshape (all pairs); reshape and duplicate; duplicate; WTO dummies & fix some minor issues; logs; 5-year averages; oecd_ex, oecd_im; time dummies (year_); country dummies (exporter_, importer_); country + time dummies (exportertime_, importertime_); selection 1996-2005; selection of a balanced panel
21 IMPORTANT NOTE: This document, and especially the exercise section, is based on the excellent work published by
UNCTAD-WTO entitled “A Practical Guide to Trade Policy Analysis” (Chapter 3. Analyzing bilateral trade using the gravity
equation). To access the on-line version of this UNCTAD-WTO doc, visit the WEB page: http://vi.unctad.org/tpa/index.html
(Steps 1 to 7 as described in Chapter 3 – UNCTAD/WTO.)
Several operations to perform before estimation:
▪ Download the datasets from their sources and import them into a single software format (Stata dta, EViews wf,…)
▪ Homogenize the formats of the different datasets: list of countries, names for countries, names for
  variables, “names” for years
▪ Replace missing values (zeros for trade, functional 999 for real missing values…)
▪ Generate the structure for the gravity model dataset: all possible combinations of countries (and
  years if a panel is used)
▪ Merge the different files into a single one
▪ Generate dummies (if needed) for year, country, and Year x Country
▪ Compute log variables (for GDP, trade and distance)
Step 1:
▪ Import the CSV trade flows file (tradeflows.csv), label variables and save it in .dta format
▪ Import the txt file “joinwto.txt” with the year of accession for each country and save it in .dta format
▪ Import the CSV file “GDP.csv” with GDP data for each country from 1960 to 2006
▪ Replace BELGIUM and LUXEMBOURG by BENELUX, compute BENELUX GDP as the sum of both
  countries, change the names of the year variables and save it in .dta format
▪ Open the STATA datafile containing the rest of the explanatory variables, fix the BENELUX problem, change
  some variable names, label some other variables and save it in .dta format
Basically, at the end of Step 1, four different STATA files have been created and stored in the default
directory:
1. tradeflows.dta (endogenous variable) in a Panel dataset for YEARS and PAIRS of countries
   in LONG format
2. joinwto.dta (for the explanatory variable “wtoaccesion”) in a Cross Section dataset for
   INDIVIDUAL countries
3. GDP.dta (from GDP.csv for explanatory variables GDP’s) in a Panel dataset for YEARS and
   INDIVIDUAL countries in WIDE format
4. CEPII.dta (other explanatory variables in LONG format) in a CROSS SECTION dataset for
   PAIRS of countries
Step 2:
- Starting with “tradeflows.dta”, create the FULL structure of the datafile: PANEL DATA for YEARS and
  every possible combination (PAIR) of countries, filling the newly created pairs with “zeros”. The
  temporary file created is "gravity_temp1.dta"
- Reshape GDP.dta to a LONG panel dataset and create a duplicate (GDP is going to be used as both the importer’s
  GDP and the exporter’s GDP)
  STATA (general syntax): reshape long stub, i(i) j(j)     [j is the new variable]
  STATA: reshape long yr, i(countrycode) j(year)
  STATA: rename yr gdp
Step 3:
- MERGE those two new files (“GDP_exporter.dta” and “GDP_importer.dta”) with
  "gravity_temp1.dta" into "gravity_temp2.dta", keeping those observations (PAIRS of countries) with
  information in both files.
- MERGE “joinwto.dta” with that file, creating two new variables: join_exporter and join_importer.
- The new temporary file created is "gravity_temp3.dta"
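The logic of Steps 2 and 3 (full grid of pairs, wide-to-long reshape, merge) can be sketched independently of STATA. The Python below is only an illustration of the data operations: the three countries, the flows and the GDP figures are invented, and the column names `yr2000`, `yr2001` simply mimic the `yr` stub used in the reshape command above.

```python
from itertools import permutations

countries = ["DEU", "ESP", "FRA"]
years = [2000, 2001]

# Reported flows only (exporter, importer, year): most pairs are absent.
reported = {("FRA", "DEU", 2000): 50.0,
            ("DEU", "FRA", 2000): 60.0,
            ("FRA", "ESP", 2001): 20.0}

# Step 2a: full structure, every ordered pair and year, zeros for the new cells.
grid = {(i, j, t): reported.get((i, j, t), 0.0)
        for i, j in permutations(countries, 2) for t in years}

# Step 2b: reshape GDP from WIDE (one column per year) to LONG (country, year).
gdp_wide = {"DEU": {"yr2000": 1900.0, "yr2001": 1950.0},
            "ESP": {"yr2000": 600.0, "yr2001": 630.0},
            "FRA": {"yr2000": 1350.0, "yr2001": 1400.0}}
gdp_long = {(c, int(name[2:])): value
            for c, cols in gdp_wide.items() for name, value in cols.items()}

# Step 3: merge GDP twice (as exporter and as importer), keeping matched rows only.
rows = []
for (i, j, t), flow in sorted(grid.items()):
    if (i, t) in gdp_long and (j, t) in gdp_long:
        rows.append({"exporter": i, "importer": j, "year": t, "flow": flow,
                     "gdp_exporter": gdp_long[(i, t)],
                     "gdp_importer": gdp_long[(j, t)]})

print(len(grid), len(rows), rows[0])
```

Filling the new cells with zeros is exactly where the "types of zeros" problem discussed in section 2.3 enters the dataset, so it is worth keeping track of which zeros were created at this step.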
Step 4:
- MERGE the data of the two files “CEPII.dta” (previously saved) and “religion.dta” with the
  previous file.
- The new temporary file created is "gravity_temp4.dta"
Step 5:
- Create WTO accession dummies depending on whether one, none or both countries are members of
  the WTO (onein, nonein, bothin)
- The new PERMANENT file created is "gravity.dta" and basically contains the core dataset (endogenous
  and exogenous variables, except for the country / country x time / time dummies and some remaining
  transformations)
The structure of the main dataset is shown in the next screenshot: each row contains a trade flow
(import) and the variables for the gravity equation (GDPs, and the terms for barriers and incentives)
EXCEPT FOR the MTR dummies.
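The three WTO dummies of Step 5 are mutually exclusive and can be derived from the two accession years. The sketch below is a hedged illustration: the function name `wto_dummies` and the membership rule (a country counts as a member from its accession year onwards, `None` meaning it never joined) are assumptions, not taken from the course do-file.

```python
def wto_dummies(join_exporter, join_importer, year):
    """Return the bothin / onein / nonein dummies for one observation.

    Assumed rule: a country is a member in `year` if its accession year
    is known (not None) and not later than `year`.
    """
    exporter_in = join_exporter is not None and join_exporter <= year
    importer_in = join_importer is not None and join_importer <= year
    members = int(exporter_in) + int(importer_in)
    return {"bothin": int(members == 2),
            "onein": int(members == 1),
            "nonein": int(members == 0)}


# Hypothetical accession years for three exporter-importer pairs observed in 2000.
both = wto_dummies(1995, 1995, 2000)
one = wto_dummies(1995, 2001, 2000)
none = wto_dummies(None, 2001, 2000)
print(both, one, none)
```

Since the three dummies always sum to one, only two of them can enter a regression that also includes a constant.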
Step 6:
- Create country / country x time / time dummies for the specification of MTR terms and time fixed effects
In this block, due to memory restrictions, three different options are offered if the number of dummies
exceeds the STATA capacity:
o Option selected in this example: reduce the number of years (>1995 → 1996–2005)
o Compute country-period (and not country-year) dummies
o Make a balanced panel (reducing the sample to those countries having the information for the
  same time period).
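To see why the number of dummies becomes a constraint, the following Python sketch counts them for a given number of countries and years. The label formats and the placeholder country codes are hypothetical; the prefixes (`year_`, `exporter_`, `importer_`, `exportertime_`, `importertime_`) follow the names used in the workflow diagram, and the dimensions (33 countries, 6 years) are those of the OECD subsample used in the estimation section.

```python
def dummy_names(countries, years):
    """Build illustrative labels for each family of dummies."""
    return {
        "year_": [f"year_{t}" for t in years],
        "exporter_": [f"exporter_{c}" for c in countries],
        "importer_": [f"importer_{c}" for c in countries],
        "exportertime_": [f"exportertime_{c}_{t}" for c in countries for t in years],
        "importertime_": [f"importertime_{c}_{t}" for c in countries for t in years],
    }


countries = [f"C{k:02d}" for k in range(1, 34)]   # 33 placeholder country codes
years = list(range(2000, 2006))                   # 6 years

counts = {family: len(names) for family, names in dummy_names(countries, years).items()}
total = sum(counts.values())
print(counts, total)
```

The country-time families grow with the product of countries and years, which is why shortening the period or using multi-year periods reduces the count so effectively.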
Step 7:
- Create logs of the variables (GDPs and distance)
- Compute five-year averages of some variables
- Create a subset for the period 1996-2005
- Create a subset with OECD countries for the period 2000-2005
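The transformations of Step 7 can be sketched as follows. This is an illustration under assumptions: the helper names, the rule that non-positive values get a missing log, the way the five-year windows are defined (1996-2000, 2001-2005) and the GDP series are all invented for the example and may differ from the course do-file.

```python
import math
from collections import defaultdict


def safe_log(value):
    """Natural log, or None when the value is missing or not positive."""
    return math.log(value) if value is not None and value > 0 else None


def period_start(year, first_year=1996, width=5):
    """First year of the five-year window containing `year` (assumed windows)."""
    return first_year + width * ((year - first_year) // width)


# Hypothetical GDP series for one country, 1996 to 2005.
gdp = {("FRA", year): 100.0 + 10.0 * k for k, year in enumerate(range(1996, 2006))}

# Five-year averages by country and window.
groups = defaultdict(list)
for (country, year), value in gdp.items():
    groups[(country, period_start(year))].append(value)
gdp_5y = {key: sum(values) / len(values) for key, values in groups.items()}

# Logs of the yearly values.
lgdp = {key: safe_log(value) for key, value in gdp.items()}
print(gdp_5y, lgdp[("FRA", 1996)])
```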
2.- ECONOMETRIC ESTIMATIONS OF GRAVITY EQUATIONS
-
-
REG1: ESTIMATE A LOG-LOG CROSS SECTION BASIC REGRESION FOR OECD COUNTRIES 2000 AND
2005, WITHOUT MRT’s AND PERFORM SOME BASIC CHECKS
Load dataset “gravity_OECD_2000_2005.dta”:
o 33 countries
o 6 years
o 32*6=192 observations for each country as importer
o 192*33=6336 records
REG1: ESTIMATE A LOG-LOG CROSS SECTION BASIC REGRESION FOR OECD COUNTRIES 2000 AND
2005, WITHOUT MRT’s AND PERFORM SOME BASIC CHECKS
o Check the number of valid observations for the endogenous variable “limports” in 2000 and 2005 [22]. There should be 33*32 = 1056 valid values, but there are only 992 because of 64 missing values due to zero trade flows with origin or destination in BLX (32 flows with BLX as origin plus 32 with BLX as destination; the logarithm of zero is undefined).
o Estimate the simplest log-linear gravity model regression for the year 2005 using only lgdp_exporter, lgdp_importer and ldistance
[22] STATA: inspect limports if year==2000
o Check the elasticities (according to theory and meta-analysis):
 Theory predicts a value around 1 for the GDP elasticities (both importer and exporter)
 A difference between the origin GDP and destination GDP elasticities is expected; a lower estimate for the importer GDP would suggest evidence of home market effects (due to barriers to entry or national product differentiation).
 Meta-analysis shows that the distance coefficient is also around -1
Meta-analysis of 2500 gravity equation estimations. Table extracted from Head, K., & Mayer, T. (2013). Gravity equations: Workhorse, toolkit, and cookbook. Handbook of international economics, 4.
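The checks described above can be sketched as follows, using the variable names mentioned in the text and assuming that the dataset sits in the working directory. The Wald tests are one possible way of comparing the estimates with the reference values of 1 and -1.

```stata
* Sketch: assumes gravity_OECD_2000_2005.dta is in the working directory
use "gravity_OECD_2000_2005.dta", clear

* Valid (non-missing) observations of the dependent variable in 2005
count if !missing(limports) & year == 2005

* REG1: simplest log-log gravity equation, 2005 cross section
regress limports lgdp_exporter lgdp_importer ldistance if year == 2005

* Are the GDP elasticities equal to 1 and the distance elasticity to -1?
test lgdp_exporter = 1
test lgdp_importer = 1
test ldistance = -1

* Do the exporter and importer GDP elasticities differ?
test lgdp_exporter = lgdp_importer
```

A small p-value in any of these tests signals that the estimate is statistically different from the reference value, which is not unusual for such a stripped-down specification.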
o Check whether trade is significantly more sensitive to trade barriers (proxied by distance) in 2005 than in 2000
 Procedure: compare the basic estimation for different years (2000 vs 2005) using seemingly unrelated estimation (STATA suest command [23])
[23] The seemingly unrelated estimation procedure combines the estimation results (parameter and variance matrices) into one parameter vector and a simultaneous (co)variance matrix. The procedure is run after the separate estimation of each equation. The reasoning is that the error terms of different equations might be correlated, which may affect the estimated covariance of the parameters and thus every cross-model hypothesis test involving parameters of those different equations.
It looks like there is no statistically significant difference between the 2000 and 2005 estimates.
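The comparison between years can be sketched as below. The stored-result names y2000 and y2005 are arbitrary; note that suest expects the individual models to be fitted without the robust option, since it computes its own robust covariance matrix.

```stata
* Sketch: assumes gravity_OECD_2000_2005.dta is in the working directory
use "gravity_OECD_2000_2005.dta", clear

* Fit the basic equation separately for each year and store the results
regress limports lgdp_exporter lgdp_importer ldistance if year == 2000
estimates store y2000
regress limports lgdp_exporter lgdp_importer ldistance if year == 2005
estimates store y2005

* Combine both sets of estimates into one simultaneous covariance matrix
suest y2000 y2005

* Cross-equation test: is the distance elasticity the same in both years?
test [y2000_mean]ldistance = [y2005_mean]ldistance

* Size of the change in the distance elasticity, with its standard error
lincom [y2005_mean]ldistance - [y2000_mean]ldistance
```

After regress, suest names the coefficient equations with the suffix _mean, which is why the tests refer to y2000_mean and y2005_mean.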
- REG2: ESTIMATE ANOTHER CROSS SECTION REGRESSION INCLUDING ADDITIONAL REGRESSORS
o Estimate the model for 2005 adding more variables, with robust inference to adjust for heteroskedasticity:
reg limports contig comlang_off onein colony REPlandlocked PARTlandlocked religion ldist lgdp* if year==2005, robust
 The “onein” coefficient cannot be estimated (only zero values), and the same holds for “bothin” (only value 1) (STATA: tab onein if year==2005)
o Compare the REG1 and REG2 regressions [24]. Check the elasticities obtained:
 The GDP coefficients for exporter and importer appear to be slightly overestimated (biased) in the first regression. Some omitted-variable bias is always to be expected in the simplest estimation, but its size, and even its sign, depends on the relationship between the included regressors and the omitted variables (mostly related to trade resistance / incentives) for the particular countries comprised in the sample.
[24] For that, it is useful to use the “eststo” command (download it first if not already installed)
 The adjacency coefficient (“contig”) usually lies in the vicinity of 0.5 (Head, K., 2003), suggesting that trade is around 65% [25] higher as a result of sharing a border. That means that the omission of this simple variable may cause (as in our case) an upward bias (in absolute value) in the distance parameter (contiguity and distance are negatively related to each other).
 Contiguity and common language seem to have very comparable effects, with coefficients around 0.5 (Head, K., & Mayer, T. (2013), see table above).
 According to some papers, common links (language, colony, …) may cause very significant rises in trade (up to two or three times, or even more). Colonial links are not significant in our regression given the particular nature of the sample (only OECD countries included).
 “Landlocked” variables are weakly significant. The coefficient is small and positive for exporters and much larger in size for the PARTNER (importer), resulting, in that case, in a reduction of imports of around 42% (coeff. = 0.357).
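The comparison of REG1 and REG2 and the conversion of a dummy coefficient into a percentage effect can be sketched as follows. The stored-result names reg1 and reg2 are arbitrary, “onein” is left out because it cannot be estimated in this sample, and eststo / esttab belong to the user-written estout package.

```stata
* Percentage effect of a dummy in a log-linear model: 100*(exp(b) - 1)
display 100 * (exp(0.5) - 1)

* Sketch: assumes gravity_OECD_2000_2005.dta is in the working directory
* and that estout is installed (ssc install estout)
use "gravity_OECD_2000_2005.dta", clear
eststo clear
eststo reg1: regress limports lgdp_exporter lgdp_importer ldistance ///
    if year == 2005
eststo reg2: regress limports contig comlang_off colony REPlandlocked ///
    PARTlandlocked religion ldistance lgdp_exporter lgdp_importer     ///
    if year == 2005, vce(robust)

* Percentage effect of contiguity in REG2, with a delta-method
* standard error (reg2 is the active estimation result at this point)
nlcom 100 * (exp(_b[contig]) - 1)

* REG1 and REG2 side by side
esttab reg1 reg2, r2 ar2 se scalar(rmse)
```

The same transformation applies to a negative coefficient b: the implied percentage reduction is 100*(1 - exp(b)), which is smaller in size than 100*(exp(|b|) - 1).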
- REG3: ADDING DUMMIES TO CONTROL FOR THE MRT EFFECT
o According to the meta-analysis (table shown before), gravity models estimated without controlling for the MRT terms are biased (compare the “all gravity” and “structural gravity” sections)
o REG 3.1. Try to estimate the previous REG2 for a cross section in 2005, with robust inference, adding the country dummies importer_* and exporter_* to control for the MRTs. Compare this estimation with the previous one [26] (without MRT terms)
[25] Remember that, in a log-log model, the raw coefficients of dummies do not represent elasticities (% changes). The percentage effect can be easily derived as exp(β)-1, so for a coefficient of 0.5 we get exp(0.5)-1 ≈ 0.649.
[26] To compare common coefficients using “esttab”, remember to store this equation into memory [STATA: eststo est2] and then compare the common coefficients dropping the MRT dummies *exporternum, *importernum [STATA: esttab, r2 ar2 se scalar(rmse) drop(*exporternum *importernum)]
 Important differences appear for the common coefficients.
 In the case of distance (“ldist”), the elasticity is greater in absolute value than the previous one and well above 1 (as expected according to the meta-analysis)
 Contiguity is no longer significant, and the coefficients of the remaining trade resistance / incentive variables change
 Given that importer_* and exporter_* are country specific (not pair specific), they are perfectly collinear with other country-specific variables such as REPlandlocked, PARTlandlocked, lgdp_importer and lgdp_exporter. So, as expected, after adding the country dummies we CAN NO LONGER estimate the parameters of the other country-level variables (GDP, *landlocked) [27]
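REG 3.1 and its comparison with the previous regression can be sketched as below. The sketch assumes that exporternum and importernum are numeric country identifiers (they appear in the esttab call of this script), enters them as factor variables instead of the importer_* / exporter_* dummies, and leaves the country-level regressors out of the second equation because they are collinear with the country dummies.

```stata
* Sketch: assumes gravity_OECD_2000_2005.dta is in the working directory
* and that estout is installed (ssc install estout)
use "gravity_OECD_2000_2005.dta", clear
local pairvars contig comlang_off colony religion ldistance

* Previous regression (no MRT controls), stored for comparison
eststo clear
eststo est1: regress limports `pairvars' REPlandlocked PARTlandlocked ///
    lgdp_exporter lgdp_importer if year == 2005, vce(robust)

* REG 3.1: exporter and importer dummies as proxies for the MRTs
eststo est2: regress limports `pairvars' i.exporternum i.importernum ///
    if year == 2005, vce(robust)

* Common coefficients side by side, hiding the country dummies
esttab est1 est2, r2 ar2 se scalar(rmse) drop(*exporternum *importernum)
```

If GDP or the landlocked variables were kept in the second equation, Stata would have to omit some collinear term, and any coefficient still reported for them would only reflect which dummy was dropped.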
o REG 3.2. How can we add country dummies to control for the MRTs without losing the estimates of country-specific variables such as the GDPs?
A pooled OLS regression (NOT A PANEL) for a short period (2000-2005) could be a solution, at least for those lost country-specific variables that DO VARY over time (the GDPs, for example) but, obviously, not for country-specific time-INVARIANT variables (such as REPlandlocked / PARTlandlocked). Let's then repeat the previous regression for the period 2000-2005 (also adding year* dummies [28])
 In effect, the GDP coefficients can now be estimated and, in line with the literature, the elasticities drop substantially (down to 0.6) in this “structural” version compared to the previous estimates (without controlling for the MRTs)
 Additionally, some variables related to trade incentives appear to be clearly significant (colony, comlang_off, contiguity, religion…)
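The pooled regression of REG 3.2 can be sketched as follows, again assuming that exporternum and importernum are numeric country identifiers and using factor variables for the country and year dummies.

```stata
* Sketch: assumes gravity_OECD_2000_2005.dta is in the working directory
use "gravity_OECD_2000_2005.dta", clear

* REG 3.2: pooled OLS 2000-2005 with time-invariant country dummies
* (MRTs assumed constant over the period) and year dummies
regress limports contig comlang_off colony religion ldistance ///
    lgdp_exporter lgdp_importer                                ///
    i.exporternum i.importernum i.year, vce(robust)

* Joint significance of the year dummies
testparm i.year
```

The GDP coefficients are identified here only through the variation of GDP over time within each country, which is why a short period delivers rather imprecise estimates.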
[27] A better way to say it is that the coefficients of the country dummies capture the aggregated value of ALL the country specificities.
[28] Commonly, year dummies control for omitted terms causing secular / trend variation in panel data models (affecting, in our example, world trade for every single exporter-importer pair)
o REG 3.3. What if we now add country x time dummies, allowing the MRTs to vary over time? (In the previous regression, the MRT terms were assumed to be constant over time)
 The answer is that, given that the MRTs now vary over time, we again lose the estimates of country-specific time-variant variables (such as the GDPs)
o REG 3.4. What if we now add country-pair dummies to control for pair heterogeneity?
 This comes at a cost, because adding “pairid” fixed effects does not allow us to estimate the coefficient of any variable that is constant within a country pair, such as distance or colony (or “onein”, which shows no variation in this sample)
 SO IF WE CONTROL FOR ALL FIXED EFFECTS AT THE SAME TIME (COUNTRY, YEAR, COUNTRY X YEAR, AND COUNTRY-PAIRS) WE THEN LOSE THE REST OF THE PARAMETERS (except for the fixed effects); only bilateral variables that change over time, if any, would remain identified
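Both specifications can be sketched with factor-variable interactions and areg. The sketch assumes numeric identifiers exporternum and importernum, and creates pairid only if the dataset does not already contain it.

```stata
* Sketch: assumes gravity_OECD_2000_2005.dta is in the working directory
use "gravity_OECD_2000_2005.dta", clear

* REG 3.3: exporter-year and importer-year effects (time-varying MRTs);
* only pair-level regressors can be identified
regress limports contig comlang_off colony religion ldistance ///
    i.exporternum#i.year i.importernum#i.year, vce(robust)

* Country-pair identifier, created only if it is not in the file yet
capture confirm variable pairid
if _rc != 0 {
    egen pairid = group(exporternum importernum)
}

* REG 3.4: pair fixed effects absorbed rather than estimated one by one;
* only regressors that vary over time within a pair can be identified
areg limports lgdp_exporter lgdp_importer i.year, ///
    absorb(pairid) vce(robust)
```

Absorbing the pair effects with areg avoids creating one dummy per country pair, which matters given the memory restrictions mentioned in Step 6.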
- REG4: PANEL DATA (Step 8 in UNCTAD-WTO document)
o Adding country-pair dummies in a POOLED estimation is essentially equivalent to using a panel data estimation with fixed effects.
 Set the panel data structure (remember that the panel unit refers to “ij” pairs)
 Estimate a simple panel data FIXED effects model (to control for bilateral MRTs), also including time effects
o We have to notice again that, controlling with FE for bilateral MRT terms, we will be unable to estimate coefficients for any TIME-INVARIANT variable, either bilateral for “ij” pairs (such as distance, colony, common language, FTA) or simply at the level of “i” and/or “j” (such as landlocked)
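The fixed effects estimation can be sketched as follows. The sketch assumes that pairid identifies each “ij” pair (it is created from the assumed numeric identifiers exporternum and importernum if missing) and keeps some time-invariant regressors in the list on purpose, so that Stata shows them as omitted.

```stata
* Sketch: assumes gravity_OECD_2000_2005.dta is in the working directory
use "gravity_OECD_2000_2005.dta", clear

* Country-pair identifier, created only if it is not in the file yet
capture confirm variable pairid
if _rc != 0 {
    egen pairid = group(exporternum importernum)
}

* Panel structure: the unit is the country pair, the time variable is year
xtset pairid year

* Fixed effects with year dummies; contig, colony and ldistance do not
* vary within a pair, so they are reported as omitted
xtreg limports lgdp_exporter lgdp_importer contig colony ldistance ///
    i.year, fe vce(robust)
```

With xtreg, the robust option produces standard errors clustered at the panel (country-pair) level.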
o Using RANDOM effects, we can estimate every coefficient (lost with FE) but, as always when we move from FE to RE, at the risk of biased estimates:
o Check RE vs FE (using a simple Hausman test)
o Apparently, RE is not the right option, so we need to stick to fixed effects.
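The Hausman check can be sketched as below. The stored-result names fe and re are arbitrary, pairid is assumed to identify each country pair, and both models are fitted with conventional standard errors because hausman does not accept robust or clustered covariance matrices.

```stata
* Sketch: assumes gravity_OECD_2000_2005.dta is in the working directory
use "gravity_OECD_2000_2005.dta", clear
capture confirm variable pairid
if _rc != 0 {
    egen pairid = group(exporternum importernum)
}
xtset pairid year

* Fixed effects (consistent under both hypotheses)
xtreg limports lgdp_exporter lgdp_importer i.year, fe
estimates store fe

* Random effects (efficient only if the pair effects are uncorrelated
* with the regressors); time-invariant regressors can be included here
xtreg limports lgdp_exporter lgdp_importer contig comlang_off colony ///
    ldistance i.year, re
estimates store re

* Hausman test on the coefficients common to both models
hausman fe re
```

A small p-value rejects the hypothesis that the RE estimates are consistent, which is the situation described in the text.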
SIMULATIONS MODELS FOR INTERNATIONAL TRADE
GRAVITY EQUATIONS FOR INTERNATIONAL TRADE MODELS
Paris-Dauphine / September 2016
DOCUMENT 3:
Work to do: Simulation Exercise for the effect of NAFTA on trade [29]
Ramón Mahía – UAM
(Based on the material provided by UNCTAD-WTO) [30]
1.- BACKGROUND
 NAFTA is a regional trilateral trade agreement between Canada, Mexico, and the United States that came into force in 1994.
 The basic idea is to use a gravity model to empirically test the effects of the NAFTA agreement in terms of trade creation (total trade increases as a result of the FTA) and/or trade diversion (trade is reallocated from non-FTA members to FTA members). Suppose that countries “i” and “j” belong to a common FTA, whereas country “k” does not. If, after the FTA's formation, “i” imports more from “j” and less from “k”, trade diversion is likely. If, in contrast, country “i” imports more from both “j” and “k”, trade creation is likely (see Box 3.1, page 109 of the WTO Manual, to understand how to empirically test “trade creation” and “trade diversion”).
 The task is to complete the exercise that you can find in the WTO Manual, Chapter 3, page 131, sections 2, 3 and 4. The instructions in the text are very clear, and a do-file is also provided by WTO in case you need to explore and work out some extra details.
 The MINIMUM work to do is to go through the following instructions and comment on the basic econometric results obtained.
 EXTRA POINTS will be obtained for extra work such as:
o Adding some preliminary graphs or descriptive analysis (before the econometric estimation)
o Enriching the econometric exercise:
 Using additional explanatory variables as covariates
 Testing the variation of the NAFTA effect over time
 Trying alternative estimation / specification strategies (for example using random effects or alternative ways of addressing the MRT issue without a panel)
Preliminary steps:
- We don't need to build the data file according to section 1 (Preliminaries). The file is already prepared as agGravityData.dta.
- This file, agGravityData.dta, contains enough information to build a gravity model, including trade flows and some other basic information for around 80 countries for the period 1982-2004.
use "C:\Users\RAMON\Desktop\GRAVITY\Practical guide to TPA\Chapter3\Datasets\agGravityData.dta", clear
[29] A regional trilateral trade agreement signed by Canada, Mexico, and the United States, which came into force on January 1, 1994.
[30] IMPORTANT NOTE: The content of this document, and especially the exercise section, is based on the document prepared by UNCTAD-WTO entitled “A Practical Guide to Trade Policy Analysis” (Chapter 3. Analyzing bilateral trade using the gravity equation). To access the on-line version of this UNCTAD-WTO document, visit the web page: http://vi.unctad.org/tpa/index.html
- Nevertheless, we still need to create four NAFTA dummies in order to test the impact of NAFTA on trade:
o A first dummy, “nafta”, to simply identify NAFTA members (from year 1994)
o A second one, “intra_nafta”, to identify intra-NAFTA bilateral trade (observations where both the importing and the exporting country are NAFTA members) (from year 1994)
o A third one, “imp_nafta_rest”, to identify imports of a NAFTA member from a NON-NAFTA member (from year 1994)
o A fourth one, “exp_nafta_rest”, to identify exports from a NAFTA member to a NON-NAFTA member (from year 1994)
* Membership of the home country (ccode) and of the partner (pcode);
* as written, these two indicators are not restricted to year >= 1994
gen nafta = (ccode=="CAN" | ccode=="MEX" | ccode=="USA")
label var nafta "1 if home is nafta member"
gen pnafta = (pcode=="CAN" | pcode=="MEX" | pcode=="USA")
label var pnafta "1 if partner is nafta member"
* Trade between two NAFTA members, from 1994 onwards
gen intra_nafta = (ccode=="CAN" | ccode=="MEX" | ccode=="USA") & ///
    (pcode=="CAN" | pcode=="MEX" | pcode=="USA")
replace intra_nafta = 0 if year < 1994
label var intra_nafta "1 if trade between nafta members"
* Imports of a NAFTA member from a non-member, from 1994 onwards
gen imp_nafta_rest = (ccode=="CAN" | ccode=="MEX" | ccode=="USA") & ///
    (pcode!="CAN" & pcode!="MEX" & pcode!="USA")
replace imp_nafta_rest = 0 if year < 1994
label var imp_nafta_rest "1 if nafta's imports from the rest of the world"
* Exports of a NAFTA member to a non-member, from 1994 onwards
gen exp_nafta_rest = (pcode=="CAN" | pcode=="MEX" | pcode=="USA") & ///
    (ccode!="CAN" & ccode!="MEX" & ccode!="USA")
replace exp_nafta_rest = 0 if year < 1994
label var exp_nafta_rest "1 if nafta's exports to the rest of the world"
Econometric estimations:
- We start by declaring the panel structure (and generating some logarithms)
egen id = group(ccode pcode)
tsset id year
gen lnV = log(imp_tv)
label var lnV "value of imported goods in logarithm"
gen lncGDP = log(cgdp_current)
label var lncGDP "home's current GDP in logarithm"
gen lnpGDP = log(pgdp_current)
label var lnpGDP "partner's current GDP in logarithm"
gen lnD = log(km)
label var lnD "bilateral distance in logarithm"
- We will then estimate a first fixed effects panel gravity equation for the logarithm of the value of imports (lnV), including the logs of the GDPs, distance, the NAFTA trade creation and trade diversion dummies (“intra_nafta”, “imp_nafta_rest” and “exp_nafta_rest”) and year dummies. The coefficients of these dummies can be read as an empirical test of trade creation and/or trade diversion. Note that lnD is constant within each country pair, so the fixed effects estimator cannot identify its coefficient and Stata reports it as omitted.
xtreg lnV lncGDP lnpGDP lnD intra_nafta imp_nafta_rest exp_nafta_rest i.year, fe robust
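Once the equation is estimated, the NAFTA coefficients can be turned into percentage effects and tested. The following lines are a sketch of one possible set of post-estimation commands, not part of the WTO do-file; the labels intra, imports_rest and exports_rest are arbitrary.

```stata
* Sketch: run after the panel has been declared and the logs and NAFTA
* dummies have been generated as shown above
xtreg lnV lncGDP lnpGDP lnD intra_nafta imp_nafta_rest exp_nafta_rest ///
    i.year, fe vce(robust)

* Percentage effects implied by the NAFTA dummies: 100*(exp(b) - 1)
nlcom (intra:        100 * (exp(_b[intra_nafta]) - 1))    ///
      (imports_rest: 100 * (exp(_b[imp_nafta_rest]) - 1)) ///
      (exports_rest: 100 * (exp(_b[exp_nafta_rest]) - 1))

* Individual and joint significance of the NAFTA dummies
test intra_nafta
test intra_nafta imp_nafta_rest exp_nafta_rest
```

Following the reasoning in the background section, a positive effect for intra_nafta together with a negative effect for imp_nafta_rest would point to trade diversion, whereas positive effects for both would point to trade creation.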