Shedding Light on Conflict: Can Night Lights Data be used to Understand the Economic Impacts
of Conflict?
Marla Spivack
DHP 207 GIS for International Applications
Introduction and Project Description
Violent conflict represents a significant impediment to both human security and
economic development. Violence stunts growth by undermining economic and political
institutions and tearing at social fabrics in complex ways. Conflict presents such a significant
development barrier that the World Bank chose to focus the 2011 World Development Report
on “Conflict, Security, and Development.” This text compiles work from a wide range of
scholars on the economic impacts of conflict and policy steps that neighboring and donor
countries can take to mitigate the effects of violence and conflict on vulnerable states. The report
emphasizes the development gap that conflict creates between countries and their neighbors.
Low income fragile states have not achieved any of the Millennium Development Goals, and
countries that have experienced a civil war typically take 14 years to return to their previous
growth paths (World Bank, 63). Furthermore, ending conflict and creating permanent stability
can have extremely positive economic consequences: countries like Mozambique, Rwanda, and
Ethiopia have demonstrated how post-conflict recovery can bring rapid improvements in
indicators such as nutrition, education, and sanitation (World Bank, 51). The World Bank and a
variety of other scholars also point out that conflicts in developing countries are regionally
specific. Increasingly we are seeing intrastate, as opposed to interstate, conflict. As Mary Kaldor
points out in her book New and Old Wars: Organized Violence in a Global Era, conflict is
increasingly characterized by localized violence in which a variety of outside actors become
involved in an intrastate conflict. The conflicts in Sierra Leone and the Democratic Republic of
Congo are both excellent examples of this new phenomenon.
Traditionally, economic growth is measured using Gross Domestic Product (GDP). GDP
is an accounting statistic, and while it is a very effective measure of economic activity, it may
not be the most effective measure of growth for developing countries, particularly fragile states.
These states often suffer from poor data collection, and may have missing data for certain years
or not be able to collect data in some areas. Additionally, a great deal of economic activity in
developing countries takes place in the informal sector, and is not taken into account as part of
GDP. Finally, GDP data can only be used to estimate economic activity on a national scale; there
are rarely GDP data at the provincial or district level for developing countries. However, as
previously mentioned, conflicts in developing countries in the 21st century are increasingly
localized, and methods of studying their effects on a regional level need to be devised.
Recently Adam Storeygard, Vernon Henderson, and David N. Weil published a paper in
the American Economic Review titled “Measuring Economic Growth from Outer Space.” This
paper examined satellite images of nighttime lights and conducted econometric tests to
determine if these lights could be used to accurately estimate GDP. They found that the lights
are a very good predictor of GDP, and that they can be used to augment GDP growth data for
developing countries with poor data and to produce a more accurate estimate of growth. They
also point out that the night lights are useful for estimating growth at sub-national levels,
addressing one of the key problems with GDP data.
The lights data have been shown to be an effective proxy for GDP and they are sensitive
at a sub-regional level, which makes them ideal for studying the impacts of conflict, which often
varies regionally within a country. Because the data are raster data, I do not have to restrict my
analysis to particular political boundaries. I can examine the correlation between light and
conflict density using a raster data set. Spatial analysis is uniquely positioned to address the
effects of subnational conflict on subnational economic indicators, since any other statistical
method would require aggregating the data at a political boundary, but raster analysis in GIS will
allow us to test the correlation without aggregating the data and losing geographic precision.
Literature Sources
World Development Report. New York: Oxford University Press (2011). The 2011 World
Development Report focuses on the economic impacts of conflict and how and why countries in
conflict suffer from economic stagnation and recession as a result of both violent and non-violent
conflict and political instability. It compiles a wide variety of literature and scholarly opinions on
this subject. The wide range of opinions brought together in this one text all provide support to
my hypothesis that areas of conflict will experience less growth than areas without conflict. This
text makes this argument at the national level; my analysis will be able to show this at a
subnational level.
Rodrik, D. (1999). Where did all of the growth go? External Shocks, Social Conflict, and Growth
Collapses. Journal of Economic Growth, 4(4), 385-412. Retrieved from
http://www.jstor.org/stable/40216016
In this paper Dani Rodrik argues that social conflicts within a country can be directly linked to
inconsistencies in growth rates over time. His analysis is based on the use of national income
numbers, which while helpful, may not be as accurate as necessary, especially in countries with
poor governance that are likely to experience conflict. Additionally, these numbers are not
available at a sub-national level. My analysis will support and extend Rodrik’s findings by
focusing on a specific country, but looking sub-nationally.
Alesina, A., Özler, S., Roubini, N., & Swagel, P. (1996). Political instability and economic
growth. Journal of Economic Growth, 1(2), 189-211. Retrieved from
http://www.jstor.org/stable/40215915
This paper also highlights the relationship between instability and stagnant or recessionary
growth.
Kaldor, Mary. New and Old Wars: Organized Violence in a Global Era. Stanford: Stanford
University Press. (1999).
This book led me to take notice of sub-national conflict and helped me solidify my
understanding of the potential for the lights data to identify regional effects, rather than simply
national effects. It was part of what led me to decide to do the analysis using raster subtraction
and raster correlation rather than zonal statistics.
Henderson, J. Vernon, Adam Storeygard, and David N. Weil. 2012. "Measuring Economic
Growth from Outer Space." American Economic Review, 102(2): 994–1028. This paper forms
the basis of my use of the lights data as a proxy for economic growth. In it the authors show
that the night lights can be used as a proxy for growth, and provide an estimated formula for the
relationship between lights and actual GDP growth, which will allow me to project GDP growth
numbers for sub-regional areas based on the light saturation of these areas.
Data Sources
The two primary data sources used to conduct the analysis of this project were the National
Oceanic and Atmospheric Administration (NOAA) Nighttime Lights data set and the Armed Conflict
Location and Event Dataset (ACLED). I also used administrative boundaries and Columbia
University’s Gridded Population of the World population density raster to make a map for my poster.
The Night time lights data can be accessed on the NOAA website at
http://www.ngdc.noaa.gov/dmsp/downloadV4composites.html. The specific data set used in this
paper is called the Version 4 DMSP-OLS Nighttime Lights Time Series. This data set includes
cloud cover images, average lights, and average stable lights. Each of these data sets is
composed by averaging images of the earth at night taken from satellites in orbit multiple
times every night. Daytime data are eliminated using solar elevation angle, and moonlit data are
excluded based on calculations of lunar light. The data are compiled into rasters, with each grid
cell measuring 30 arc seconds on a side, or about 0.86 square kilometers at the equator. In this
analysis I use the stable lights data, which have been cleaned and processed to include only
stable light observed over the course of
the year. Lights from gas flares, fires, and other abnormal observations are eliminated in the
stable lights dataset. This is the same data set used by Storeygard et al. in their analysis, which
is why I chose it. The lights data are available for every year between 1992 and 2010, but for
the analysis in this project I only considered the data from 1995 and 2010. I made this choice
because the ACLED conflict data are only available for the years between 1997 and 2010, and I
wanted to look at the relationship between increases in light and these conflict events. I had
originally planned to use the data from 1996, as this is one year before the beginning of the
ACLED data set, but I was not able to project the 1996 stable lights raster, so I chose to use the
1995 data instead. I do not believe that this substitution affected the outcome of my analysis.
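As a rough check on the cell size described above, the sketch below converts 30 arc seconds into a ground distance. It assumes about 111.32 km per degree of arc at the equator (a standard approximation, not a figure taken from the NOAA documentation), and the function name is illustrative only.

```python
import math

# Assumed approximate length of one degree of arc at the equator, in km.
KM_PER_DEGREE = 111.32


def cell_side_km(arc_seconds, latitude_deg=0.0):
    """East-west side length of a grid cell, in km, at a given latitude."""
    degrees = arc_seconds / 3600.0
    return degrees * KM_PER_DEGREE * math.cos(math.radians(latitude_deg))


# East-west side of a 30 arc-second cell at the equator.
side = cell_side_km(30)
# North-south side does not depend on latitude in this approximation.
north_south = 30 / 3600.0 * KM_PER_DEGREE
area = side * north_south

print(round(side, 3), "km on a side;", round(area, 2), "square km")
```

The east-west width of a cell shrinks with the cosine of latitude, which is one reason an equal-area projection is useful when values from different parts of a continent are compared cell by cell.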
To measure conflict I used the ACLED data set, which is available at
http://www.acleddata.com/. The ACLED data are feature points compiled by researchers from a
combination of humanitarian agency reports, scholarly publications, and media reports. Each
event is dated, georeferenced, and categorized into one of eight types of conflict: battle with no
territory change, battle in which rebels gain territory, battle in which the government gains
territory, base or HQ established, non-violent rebel activity, rioting, violence against civilians,
and non-violent acquisition of territory. It is also important to note that ACLED is missing data for Somalia.
There are a few conflict points located within Somalia, but these are actions of rebel groups or
national armies of other countries which have taken place in the country. Because of lack of data
and the potential for confounding I excluded Somalia from my density analysis and correlation
calculation, which I will explain later.
In addition to these two primary data sets I also used Version 3 of the Population Density
Raster from 2010 from Columbia University’s Gridded Population of the World, which assigns
values to raster grid cells based on their population density. This data set can be found at
http://sedac.ciesin.columbia.edu/gpw/. I did not use these data in my analysis, but I
did use them to provide a visual explanation for some of my findings, which I will discuss later.
Finally, I used ESRI administrative area data. These data are from 2008 and therefore do
not include South Sudan. This is
consistent with my analysis, which only includes data from before South Sudan’s founding.
These maps should therefore be considered historical.
Data Preparation
ACLED: The country level data in this dataset must be downloaded as separate files. I
initially downloaded the shapefiles, but was then not able to append them together using GIS,
because there were some indicator variables that were present in certain data sets but not others.
Instead I downloaded the data for each country into Excel and then wrote a STATA do-file that
imported them into STATA and appended them together. I was then able to manipulate the data
in STATA, which proved useful. I eliminated any conflicts which had taken place after 2010. I
was also able to use STATA to generate a year variable for some of the observations which had
dates, but no year. I then exported the data from STATA into Excel and then used the given XY
coordinates to georeference the data. I used the WGS_1984 coordinate system, the same system
used in our class exercise, to georeference these coordinates. At the advice of Professor Stieve, I
decided to restrict my sample to only look at battles and violence against civilians, since I
suppose that this type of very violent conflict would have the biggest development implications.
Also, riots tend to be an urban phenomenon, which I confirmed by selecting out the riots and
observing that they typically occur in cities, which could confound my results. I then projected
these data into the Albers Equal Area Conic Projection. I used the select tool to remove conflict
incidents which had taken place over the ocean (pirate activity).
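The append-and-filter steps described above were done in a STATA do-file; the sketch below shows the same logic in plain Python for illustration only. The column names (`date`, `year`, `event_type`, `notes`), the date format, and the sample rows are all hypothetical and are not taken from the ACLED files.

```python
def append_tables(tables):
    """Stack lists of row dicts; columns missing from a table are filled with None."""
    columns = []
    for table in tables:
        for row in table:
            for key in row:
                if key not in columns:
                    columns.append(key)
    return [{col: row.get(col) for col in columns}
            for table in tables for row in table]


def keep_violent_events(rows, last_year=2010):
    """Fill in missing years from the date, then keep battles and violence
    against civilians that took place no later than last_year."""
    kept = []
    for row in rows:
        row = dict(row)
        if row.get("year") is None and row.get("date"):
            # Assumes the date string ends in a four-digit year.
            row["year"] = int(row["date"][-4:])
        event_type = (row.get("event_type") or "").lower()
        violent = (event_type.startswith("battle")
                   or event_type == "violence against civilians")
        if row.get("year") is not None and row["year"] <= last_year and violent:
            kept.append(row)
    return kept


# Two invented country tables with different columns.
country_a = [
    {"date": "12/05/1999", "year": 1999, "event_type": "Battle with no territory change"},
    {"date": "03/02/2004", "year": 2004, "event_type": "Rioting"},
]
country_b = [
    {"date": "21/08/2003", "event_type": "Violence against civilians", "notes": "no year column"},
    {"date": "01/02/2011", "event_type": "Battle with no territory change", "notes": ""},
]

events = keep_violent_events(append_tables([country_a, country_b]))
print([(e["year"], e["event_type"]) for e in events])
```

Filling absent columns with an explicit missing value is what makes the append possible when some country files carry variables that others lack.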
NOAA Night Lights: I downloaded these large files and used PowerArchiver to
decompress them. There was a different compressed file for each year, which contained all
three NOAA datasets (average lights, cloud cover, and stable lights). I opened the stable lights
rasters for 1995 (as I said earlier, I tried to use the 1996 raster but was not able to project it) and
2010 in ArcGIS and used the clip tool to clip them to the geometry of Africa (compiled by selecting
by location the countries that intersected the outline of the ESRI African continent, and
unselecting Spain, Portugal, and Israel before creating a new layer with which to clip the raster).
I later set this clipped raster outline to be the snap raster and the mask, and set the geometry of
Africa as the processing extent so that all calculations would only be done for the area of Africa.
Next I projected the rasters into the Albers Equal Area Conic Projection with a grid cell size of
1000 meters.
Gridded Population of the World: I downloaded the 2010 data, clipped them to the
outline of Africa, and projected them into the Albers Equal Area Conic Projection.
Analysis Part I
The goal of my analysis was to discover if a correlation existed between the growth (or
shrinking) of lit areas and the density of conflict across the continent. I predicted that I would
find a negative correlation between the change in light between 1995 and 2010 and the density of
conflict across countries over that time period.
First, I subtracted the 1995 light raster from the 2010 light raster, producing a “Change in
Light Raster” that quantified the degree of increase or decrease in light between 1995 and 2010.
Moving forward this was my basic light dataset for analysis.
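The subtraction step amounts to a cell-by-cell difference between two rasters on the same grid. A minimal sketch, using nested lists in place of real rasters and None as a stand-in for NoData; the values are invented (stable lights values are small integers, here assumed to run from 0 to 63).

```python
def subtract_rasters(later, earlier):
    """Cell-by-cell difference (later minus earlier); NoData in either input stays NoData."""
    if len(later) != len(earlier) or any(
            len(a) != len(b) for a, b in zip(later, earlier)):
        raise ValueError("rasters must share the same grid")
    return [
        [None if a is None or b is None else a - b
         for a, b in zip(row_later, row_earlier)]
        for row_later, row_earlier in zip(later, earlier)
    ]


# Invented 2 x 3 grids; None marks cells outside the study area.
lights_1995 = [[0, 5, None],
               [10, 63, 0]]
lights_2010 = [[3, 5, None],
               [7, 63, 12]]

change_in_light = subtract_rasters(lights_2010, lights_1995)
print(change_in_light)
```

Positive cells brightened between the two years, negative cells dimmed, and a cell that is saturated in both years shows no change even if activity there grew.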
Next, I conducted a density analysis on the conflict data. I set the cell size to 1000 and
set the area units to square kilometers (so the density values on my final maps are conflicts
per square kilometer). I conducted density analysis at a variety of different search radii, but
decided to include the 20 mile search radius in the final map because that radius returned a
statistically significant result when I conducted the analysis on the West African countries only,
which I will discuss later. I also realized towards the end of my project that I did not have data
for Somalia, so I should not include Somalia in my analysis. I went back and redid the analysis
for the 20 mile search radius only, excluding Somalia. I did this by creating a layer of the Africa
administrative areas that excluded Somalia, and setting the processing extent to that area. I also
clipped the light differences raster to exclude Somalia (not the version that is used in the map on
my poster but the version I used to conduct the correlation analysis), and used this raster as a mask
and snap raster in my processing environments.
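The density step can be illustrated with a small pure-Python sketch. It uses a quartic kernel, a common choice for this kind of surface; whether it reproduces the ArcGIS tool's output exactly is an assumption, and the grid size and event coordinates are invented. Coordinates are in kilometers, so densities come out as events per square kilometer, and the 20 mile radius is converted to about 32.2 km.

```python
import math

KM_PER_MILE = 1.609344


def kernel_density(points, radius, cell_size, n_rows, n_cols, x_min=0.0, y_min=0.0):
    """Quartic-kernel density (events per square map unit) at each cell centre.

    Row 0 is the row of cells starting at y_min.
    """
    grid = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            cx = x_min + (c + 0.5) * cell_size
            cy = y_min + (r + 0.5) * cell_size
            total = 0.0
            for px, py in points:
                d2 = (px - cx) ** 2 + (py - cy) ** 2
                if d2 < radius ** 2:
                    # Each event contributes a bump that integrates to one event.
                    total += (3.0 / math.pi) * (1.0 - d2 / radius ** 2) ** 2
            grid[r][c] = total / radius ** 2
    return grid


radius_km = 20 * KM_PER_MILE                 # about 32.19 km
conflict_points = [(5.5, 5.5), (2.0, 8.0)]   # hypothetical event locations, in km
density = kernel_density(conflict_points, radius_km, 1.0, 10, 10)
print(max(max(row) for row in density))
```

A larger search radius spreads each event over more cells and lowers the peak, which is why results can differ noticeably from one radius to another.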
After conducting the Kernel Density Analysis I used the Band Collection Statistics tool
(Spatial Analyst > Multivariate > Band Collection Statistics) to calculate a correlation coefficient
between the two rasters. Because I projected the lights raster to have a cell size of 1000,
created the kernel density with a cell size of 1000, and set the light difference raster as the
snap raster, I am confident that a correlation calculation between the two is valid. The
correlation coefficient between these two rasters was 0.04975, which is not statistically
significant (80% statistical significance on a one tailed test with a sample this size would be .05),
but is positive, which surprised me. I thought that there would be a negative correlation, and
while this result could be due to random data noise, I was interested in exploring whether there were
areas of Africa with a statistically significant positive correlation, and thinking about why that
might be the case. I was concerned that the lack of statistical significance was due to a variety of
factors, one of which was the presence of large, relatively dark and relatively conflict-free areas
which could be skewing both data sets down. To see if this might be the case I wanted to focus
in on a relatively small area.
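What the correlation step computes can be sketched as a Pearson coefficient over the cells that are valid in both rasters. This illustrates the statistic only and is not the Band Collection Statistics implementation; None stands in for NoData and the small grids are invented.

```python
import math


def raster_correlation(a, b):
    """Pearson correlation over cells that hold data in both rasters.

    Returns the coefficient and the number of cell pairs used.
    """
    pairs = [(x, y)
             for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)
             if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two overlapping cells")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant raster")
    return cov / math.sqrt(var_x * var_y), n


# Invented grids standing in for the change-in-light and conflict density rasters.
light_change = [[1, 2, None],
                [3, 4, 5]]
conflict_density = [[2, 4, 9],
                    [6, 8, None]]

r, n_cells = raster_correlation(light_change, conflict_density)
print(r, n_cells)
```

Because every cell counts equally, large areas with no lights and no conflict contribute many near-identical pairs, which is the concern raised above about dark, conflict-free regions.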
Analysis Part II
I decided to focus in on one regional area where visually it seemed that there might be a
positive correlation between light change and conflict, or at least where there was a lot of
variation in both which a correlation calculation might help me explain. I decided to focus on the
North West Coast of Africa between Senegal and Benin. This relatively small area has a great
degree of political and economic variation. Historically violent and under-developed countries
such as Sierra Leone and Liberia lie adjacent to economically prosperous and stable countries
like Senegal and Ghana. It was the ideal region for a closer analysis.
I conducted the exact same analysis on this small area that I had conducted on the entire
continent, restricting the density analysis and correlation to these countries only. I found a
correlation of .1231, which for a sample size this large is actually slightly statistically significant
(between 80% and 90% on a two tailed test). This larger and slightly significant correlation led me to
another hypothesis. Perhaps conflict and lights data are slightly positively correlated because in
the aftermath of conflict, reconstruction and aid efforts lead to more rapid growth than before the
conflict. The World Development Report points out that post-conflict countries grow (in terms of
GDP) faster than others (World Bank, 51). The bright spots in Liberia and the growth in lights
we can see in countries like Rwanda and Northern Uganda do seem to indicate this. To test
this theory I would need to conduct a time series regression.
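The significance levels quoted for the two correlations depend on how many independent observations the pixels represent. The sketch below shows the conventional t statistic for a Pearson coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2). The sample sizes are hypothetical, since the pixel counts are not reported here, and neighboring pixels are not independent, so the effective sample size is likely far smaller than the raw pixel count.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero,
    assuming n independent observations."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# The coefficients are the two reported above; the sample sizes are assumptions.
for r in (0.04975, 0.1231):
    for n in (100, 1000):
        print(r, n, round(correlation_t_statistic(r, n), 2))
```

For a fixed coefficient the statistic grows with the square root of the sample size, so the conclusion drawn from a small correlation is sensitive to how the number of independent observations is counted.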
Another possibility is that the lights data and conflict data are not directly related to
each other, but are rather both correlated with a third, confounding variable. Based on visual
analysis, I believe that another reasonable explanation could be that both conflict and light can
only occur in areas that are populated, and thus happen to occur in the same places, particularly
when selecting locations from a very large area, such as the entire continent of Africa. To
illustrate that population may be driving both variables, I included the Gridded Population of the
World map in my poster.
It is important to note that my findings may simply be true. Perhaps, as my analysis
concludes, there is no correlation between conflict and night lights, which implies no correlation
between conflict and economic growth. I am hesitant to draw this conclusion because it
contradicts the consensus of years of political and economic research, but it is a possibility
which emphasizes the need for more rigorous statistical analysis in this same vein.
Difficulties Encountered
I faced a number of setbacks throughout the course of conducting this project. I’ve compiled
some bullet points to summarize them.
• Data Overload – I think one of the challenges I faced with this project was that I had too
much data. I had access to the lights data from every year and the conflict data for the
entire continent, and this actually led me to waste a great deal of time initially
manipulating the data in a not particularly productive way. It also prevented me from
narrowing the scope of my project and determining a specific plan initially.
• Raster Problems – There were some basic facts and functions which I did not know or
understand about raster data when I began my project, which led me to difficulties. I
didn’t realize that they were such large datasets, and before I learned to clip and mask I
had lots of issues with GIS crashing because I had so much data in ArcMap at a time. I
also thought that processing extent performed the same function as clipping or
masking, and ran into issues when it did not.
• Data Management and Versioning – I do not think I did as good a job as I could have at
data management and versioning with this project. I think it’s a challenge with GIS
because you create so many files and sub-files and layers, and I kept running into errors
and needing to redo things. I think I would have been more efficient and avoided
repeating work and spending time searching for things if I had constructed a more effective
filing system initially.
• Kernel Density – I encountered major issues when I first tried to use this tool. I did not
understand the units of the cell size setting and ended up crashing GIS every time I tried
to run it because I was setting the cell size at 10 to 100 meters. I thought I was setting 10 to
100 kilometers, and so believed that GIS was crashing because I had too many conflict
observations. I eventually sorted the problem out with Barbra’s help, but for a while I
thought I would have to change the direction of my project because the density analysis
that I wanted to do would not be possible.
• Append in ArcMap – Initially I had planned to append the data sets for the conflicts in
each country together in ArcMap, but when that was not possible I was forced to
download the Excel files and write a STATA do-file that appended them into one data set.
This was a major time setback, but demonstrated the importance of multiple data
management tools.
• Python – Storeygard et al. have their entire data management process up in do-files and
AML files online, but because much of it was in Python I was not able to really understand
their steps. I think it would have been helpful to know how they manipulated the data. I
also think that I would like to learn to use Python to make my GIS analysis more efficient.
This project has motivated me to try to learn some Python programming, at least in GIS.
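The cell size problem described under Kernel Density comes down to simple arithmetic: the number of output cells grows with the inverse square of the cell size. A rough sketch, assuming a square extent of about 8,000 km on a side (a made-up round figure for a continental study area, not a value from this project):

```python
import math


def cell_count(width_m, height_m, cell_size_m):
    """Number of cells needed to cover a rectangular extent at a given cell size."""
    return math.ceil(width_m / cell_size_m) * math.ceil(height_m / cell_size_m)


# Hypothetical extent of roughly 8,000 km by 8,000 km, chosen only for illustration.
EXTENT_M = 8_000_000

for size_m in (1000, 100, 10):
    print(size_m, "m cells:", cell_count(EXTENT_M, EXTENT_M, size_m))
```

Under this assumption a 1000 meter cell gives tens of millions of cells, while a 10 meter cell gives ten thousand times as many, which is consistent with the tool failing regardless of how many conflict observations were loaded.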
Limitations and Future Work
My project faced a variety of limitations, mostly due to the type of statistical analysis I
could conduct given the time I had and my knowledge of GIS at the start of the project. It
would have been ideal to conduct some sort of regression analysis rather than a simple
correlation. In the future I would caution students interested in conducting this type of analysis
for a GIS project. I am afraid that good statistical analysis using GIS requires more data
preparation than the scope of this project allows for. I also think that the poster presentation
format lends itself to a visual conclusion, which can be challenging when your main output is
statistical. I was able to visualize my data effectively, and I am excited with how my poster has
turned out, but I was initially concerned about producing interesting maps for my project.
The limitations to the validity of my analysis have to do with the fact that I did not
control for a variety of potential confounding variables including area, population, and
population growth. Additionally, I am looking at change from 1995 to 2010, a large jump. My
analysis could have been more effective if I had been able to look at the dynamics of change
between each year, and compare it to the density of conflict in the preceding year.
While my results were not statistically significant at the continent level and were only
slightly significant at the regional level, they are interesting. Because the sample size (the number
of pixels) is so large, even the result on the continent is nearly statistically significant (at 80%)
for a one tailed but not a two tailed test. My analysis uses crude statistical techniques and
is not able to control for a variety of potentially confounding variables, which means that my
results are not definitive. However, my findings are unexpected, particularly because of the
positive sign on the correlation coefficient I calculated. This is the opposite of what I predicted
and highlights the need for further, more statistically powerful inquiry using both GIS and other
statistical tools. Below I have outlined a potential research plan for a more thorough inquiry into
the same question I asked in my project.
In the future the raster data for every year between 1997 and 2010 should be used to produce
zonal light means at the provincial or district level. These numbers should then be
normalized for area. Next, a conflict density analysis should be performed on the ACLED data,
disaggregating by year (one density analysis per year). Then the densities should be used to
calculate the density of conflict in each province or district. These two complete datasets should
be exported into STATA where a regression analysis of light on conflict can be conducted with
controls for administrative area and year fixed effects, as well as controls for population growth
and other potentially confounding variables. If this analysis were conducted we would be able to
draw more decisive conclusions about the relationship between light and conflict. It could also
serve as an interesting supplement to Storeygard et al.’s work.
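The data preparation for that research plan could be sketched as follows: compute a mean per zone for each year's rasters and stack the results into one row per zone and year, ready for export to a regression package. The zone labels, years, and values are invented, and nested lists with None for NoData stand in for real rasters.

```python
from collections import defaultdict


def zonal_mean(values, zones):
    """Mean of the valid cells of `values` within each zone of the `zones` raster."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row_v, row_z in zip(values, zones):
        for v, z in zip(row_v, row_z):
            if v is not None and z is not None:
                sums[z] += v
                counts[z] += 1
    return {z: sums[z] / counts[z] for z in sums}


def build_panel(lights_by_year, conflict_by_year, zones):
    """One row per zone and year, holding mean light and mean conflict density."""
    panel = []
    for year in sorted(lights_by_year):
        light = zonal_mean(lights_by_year[year], zones)
        conflict = zonal_mean(conflict_by_year[year], zones)
        for zone in sorted(light):
            panel.append({"zone": zone, "year": year,
                          "mean_light": light[zone],
                          "conflict_density": conflict.get(zone)})
    return panel


# Invented 2 x 3 rasters; "A" and "B" are hypothetical districts.
zones = [["A", "A", "B"],
         ["A", "B", "B"]]
lights = {1997: [[0, 6, 10], [3, None, 20]],
          1998: [[1, 7, 12], [4, 9, 21]]}
conflict = {1997: [[0.0, 0.2, 0.0], [0.1, 0.0, 0.0]],
            1998: [[0.0, 0.0, 0.3], [0.0, 0.3, 0.0]]}

panel = build_panel(lights, conflict, zones)
for row in panel:
    print(row)
```

On an equal-area grid a zonal mean is already a per-area measure, so it takes care of the normalization for area in a single step.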
Overall, despite the limitations that my project faced and the lack of definitive results I
am very glad that I pursued this project. I believe that while I did not find conclusive results this
type of analysis using GIS can potentially address questions about the subnational economic
effects of conflict. Through this project I acquired the skills and approach that would be
needed to undertake a more comprehensive analysis. I also encountered many of the challenges
and programming difficulties that come from working with raster and statistical datasets in
GIS, and learned how to overcome them. I am very glad that I had a project that allowed me to work
with both raster and vector data, because now I feel confident in my ability to manipulate both.
This project was very enjoyable and enabled me to solidify the things I have learned in this
course and develop many new skills that I can apply in professional settings in the future. It has
made me confident enough in my abilities to count GIS among my key skills, and I hope that I
am able to use what I have learned in the coming years.