The Indonesian Sub-National Growth and Governance Dataset
14 March 2011
Acknowledgements
This dataset was constructed by a research team led by Dr. Neil McCulloch of the Institute of
Development Studies, UK, under an AusAID research project on “Measuring the Economic Benefit
of Better Local Economic Governance in Indonesia” No. ABN 62 921 558 838. The research
team included Pak Agung Pambudhi and his staff at KPPOD, Ms. Sukma Yuningsih at the
World Bank, Jakarta and Dr. Eddy Malesky at the University of San Diego.
A large part of the data compilation and documentation was done by Ms. Sukma Yuningsih.
Pak Boedi Rheza of KPPOD also helped to integrate KPPOD’s data and prepare the final
dataset. We are grateful to BPS for permission to use the underlying data drawn from a variety
of BPS surveys, as well as to the Asia Foundation for the use of the Economic Governance
Index dataset.
Disclaimer
This dataset is distributed as a resource for researchers. We do not guarantee the accuracy of
the data and accept no responsibility for it. Any questions regarding the data should be
directed to the organisations responsible for the production of the original data from which this
dataset is constructed. We are unable to offer support, assistance or updates of any kind.
INTRODUCTION
It is widely believed that good local economic governance is important for boosting local
economic performance. A research project funded by AusAID and led by Dr. Neil McCulloch
at the Institute of Development Studies in the UK set out to test whether this is indeed the case
(see http://www.ids.ac.uk/go/idsproject/measuring-the-impact-of-better-local-governance-in-indonesia
for details of the project and the final report). It did so by compiling a unique
dataset which draws together data on the economic characteristics and performance of
Indonesia’s districts (Kabupaten/Kota) between 2001 and 2007 with data from
a 2007 survey by KPPOD/Asia Foundation which measured the quality of economic
governance at the district level (see http://asiafoundation.org/program/overview/economic-governance-index).
This document gives background information to help researchers use
the dataset for their own research. The dataset is in Stata 11 format.
TYPES OF VARIABLES AND DATA SOURCES
The bulk of the variables in the dataset are drawn from a range of standard surveys undertaken
by the Badan Pusat Statistik (BPS). However, the number of
districts is not consistent across the original sources of data. For example, the regional GDP
(GRDP) publication dataset, BPS’s Susenas household survey and the Village Potential (Podes)
surveys do not always have the same number of districts, even when they are carried out in the same
year. This is because different sampling frames are used at different times of the year. In
addition, the de jure and de facto status of a new district is recorded differently by different
institutions.
To be consistent, we have used the definition of the Ministry of Finance: an autonomous
province/district is one that receives DAU (Dana Alokasi Umum, the general allocation grant) at
the beginning of the fiscal year. Between 2001
and 2009, the number of districts in Indonesia (excluding six non-autonomous district-level
governments in Jakarta) was: 2001 = 336, 2002 = 348, 2003 = 370, 2004 = 410, 2005-07 = 434,
2008 = 451 and 2009 = 477.
In this dataset, we use 2001 as our reference point, with 342 districts (336 districts and six
non-autonomous district-level governments in Jakarta), to avoid spurious changes resulting from the
splitting of districts. That is, if districts split after 2001, we aggregated the data
from the child districts so that our dataset shows a consistent series of variables for the
geographical regions that comprised the districts in 2001. Table 1 shows the identification
variables in our dataset.
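The aggregation of child districts back to their 2001 parents can be sketched as follows. This is a minimal Python illustration, not the authors' Stata code: the district codes, the values and the helper name aggregate_to_parents are hypothetical, and the mapping dictionary simply plays the role that parent_336 plays in the dataset.

```python
from collections import defaultdict

def aggregate_to_parents(rows, parent_of):
    """Sum a level variable (e.g. GRDP or population) over child districts.

    rows: iterable of (district_code, year, value)
    parent_of: dict mapping each district code to its 2001 parent code
               (hypothetical stand-in for parent_336).
    """
    totals = defaultdict(float)
    for district, year, value in rows:
        # Districts that never split map to themselves.
        parent = parent_of.get(district, district)
        totals[(parent, year)] += value
    return dict(totals)

# Hypothetical codes: district 1101 split after 2001 into 1101 and 1110.
parent_of = {1101: 1101, 1110: 1101, 1102: 1102}
rows = [
    (1101, 2001, 100.0),
    (1102, 2001, 50.0),
    (1101, 2005, 70.0),
    (1110, 2005, 40.0),
    (1102, 2005, 55.0),
]
series = aggregate_to_parents(rows, parent_of)
```

Summing is only appropriate for level variables; rates and shares would need to be recomputed from the aggregated numerators and denominators.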
Identification data
Table 1: Identification variables
Variable      Description
id_m          WB coding for Kabupaten/Kota
kp09          Coding for Province (2009: 33 provinces)
kkk09         Coding for Kabupaten/Kota (2009)
name09        Name of regions (2009)
province09    Name of province (2009)
island        Name of Island
dummy_kota    1 = Kota, 0 = Kabupaten
jawa          1 = Java, 0 = Off-Java
EASTINDO      Dummy for Eastern Indonesia = 1
Dsumat        Dummy Sumatra Island
Djawa         Dummy Java Island
Dkalim        Dummy Kalimantan Island
Dsulw         Dummy Sulawesi Island
Dnusa         Dummy Nusa Tenggara-Maluku Island
Dpapua        Dummy Papua Island
parent_336    WB code (base 2001, collapse districts into 336 districts)
name_336      Names for 336 parent regions
split_342     Dummy for the 342 districts since 2001 (split to new regions = 1; never split = 0)
National Income data
The Gross Regional Domestic Product (Pendapatan Domestik Regional Bruto, PDRB) is the
market value of all final goods and services produced within a region during a given period of time.1 The
value of intermediate goods is not counted separately because the value of the final good already contains the
value of all intermediate goods. The GRDP can be used as a measure of economic activity.2
The data are provided by the Central Bureau of Statistics (Badan Pusat Statistik, BPS) on a yearly
basis. The GRDP data used in this paper are taken from GRDP by production sectors (years
2000–2007), which were kindly provided by BPS on request.
Based on the prices used, GRDP is classified into:
a. Nominal GRDP: production is valued as the quantity produced in a specific year
at the current prices of the end products.
b. Real GRDP: production is valued as the quantity produced in a specific year at the
constant prices of the base year (2000). This calculation shows real
production changes regardless of variations in end product prices.
Based on the sectors included, GRDP is classified into:
a. GRDP with oil and gas (PDRB Migas): the aggregate of all sectors within a specific
year.
b. GRDP without oil and gas (PDRB Non Migas): the aggregate of all sectors excluding the Oil
and Gas Mining and Oil and Gas Manufacturing subsectors.
1 Adding income earned by domestic residents from their investments abroad, and subtracting income paid from the country to
investors abroad, gives the country's gross national product (GNP).
2 GRDP can be calculated in three ways. The income method adds the income of residents (individuals and firms) derived from
the production of goods and services. The output method adds the value of output from the different sectors of the economy.
The expenditure method totals spending on goods and services produced by residents, before allowing for depreciation and
capital consumption. As one person's output is another person's income, which in turn becomes expenditure, these three
measures ought to be identical. They rarely are because of statistical imperfections. Furthermore, the output and income
measures exclude unreported economic activity in the black economy, which may be captured by the
expenditure measure.
The data are at the Kabupaten/Kota level (PDRB Kabupaten/Kota) and are available for both
nominal GRDP and real GRDP.
In addition, the GRDP is broken down into sectoral groupings. Table 2 shows how the three-digit
sectoral codes in the raw data have been converted into the single-digit sectoral
classifications in the dataset.
Table 2: Conversion from 3-digit classification of GRDP to GRDP items in the dataset
(both classifications cover 2000-2007; "-" marks a raw item with no separate single-digit item)
3-digit item  Sector (raw data)                       Dataset item  Sector (dataset)
100           Agriculture Total                       1             Agriculture Total
200           Mining and Quarrying Total              2             Mining and Quarrying, Oil and Gas Manufacturing Total
210           Mining (Oil)                            2             Mining (Oil)
220           Mining (Others)                         2             Mining (Others)
230           Quarrying                               2             Quarrying
300           Manufacturing Total                     -             -
310           Manufacturing Oil and Gas               2             Manufacturing Oil and Gas
320           Manufacturing Non-Oil and Gas           3             Manufacturing Non-Oil and Gas
400           Electricity, Gas & Water Supply Total   4             Electricity, Gas & Water Supply Total
500           Construction Total                      5             Construction Total
600           Trade, Restaurant & Hotel Total         6             Trade, Restaurant & Hotel Total
700           Transport and Communication Total       7             Transport and Communication Total
800           Financial Services                      8             Financial Services
900           Public Administration & Services        9             Public Administration & Services
998           Without Oil and Gas                     -             Without Oil and Gas
999           Gross Domestic Product                  -             Gross Domestic Product
In addition there are dummy variables indicating whether the district has oil and gas or not
(migas* and MIGAS* and D1 and D2) and whether this is the main sector or not.
Table 3 provides a list of the key national income variables in the dataset.
Variable      Description
cy            Real Income (GRDP) BY=2000
RGDPnoil_     Real Income (GRDP) without oil & gas BY=2000
y             Nominal Income (GRDP)
GDPnoil       Nominal Income (GRDP) without oil & gas
agr_          Agriculture, GRDP
min_          Mining, Quarrying, Oil & Gas Manufacturing, GRDP
man_          Non Oil & Gas Manufacturing, GRDP
enr_          Electricity, Gas & Water Supply, GRDP
con_          Construction, GRDP
trd_          Trade, Restaurant & Hotel, GRDP
trs_          Transportation and Communication, GRDP
fin_          Financial Services, GRDP
ser_          Services, GRDP
Shagr         Share of agriculture to total GRDP
Shmin         Share of mining to total GRDP
Shman         Share of non oil & gas to total GRDP
Shenr         Share of electricity to total GRDP
Shcon         Share of construction to total GRDP
Shtrd         Share of trade to total GRDP
Shtrs         Share of transportation to total GRDP
Shfin         Share of financial service to total GRDP
Shser         Share of service to total GRDP
Population data
There are in fact three different sources of population data:
1. Interpolations from the Population Census;
2. Susenas data; and
3. The population measures used by the Ministry of Finance to calculate fiscal
transfers.
In our analysis we use the first source because these are the official population figures
published by the BPS. However, the Susenas population figures are also provided in the
dataset.
Economic Performance Variables
We calculate and include a range of measures of economic performance and growth over the
period of the dataset.
To calculate per capita growth we have used the Gross Regional Domestic Product (GRDP)
divided by the population data from the BPS. GRDP per worker is calculated using the estimate
of the labour force from Susenas.3 The measure of growth used in the analysis is the geometric
growth rate over the period (i.e. (final value/initial value)^(1/number of
periods) - 1). However, linear growth rates (i.e. year-on-year) are also calculated, as are
logarithmic growth rates (i.e. [ln(final value) - ln(initial value)]/number of periods) and the
average annual growth rate (i.e. the mean of the annual growth rates).
3 We do not use the Labour Force Survey, Sakernas, because it is inappropriate for calculating district level
averages.
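The growth measures described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the function names and the example series are hypothetical, and the geometric rate uses the standard definition (final/initial)^(1/n) - 1.

```python
import math

def geometric_growth(initial, final, periods):
    """Geometric average growth rate per period."""
    return (final / initial) ** (1.0 / periods) - 1.0

def logarithmic_growth(initial, final, periods):
    """Average difference in logs per period."""
    return (math.log(final) - math.log(initial)) / periods

def linear_growth(previous, current):
    """Year-on-year growth rate."""
    return (current - previous) / previous

def average_annual_growth(series):
    """Mean of the year-on-year growth rates of a series."""
    rates = [linear_growth(a, b) for a, b in zip(series, series[1:])]
    return sum(rates) / len(rates)

# Hypothetical per capita GRDP for three consecutive years.
series = [100.0, 110.0, 121.0]
g_geo = geometric_growth(series[0], series[-1], len(series) - 1)
g_log = logarithmic_growth(series[0], series[-1], len(series) - 1)
g_avg = average_annual_growth(series)
```

For a series that grows at a constant rate the geometric and average annual rates coincide, while the logarithmic rate is slightly lower.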
In addition, we have calculated the weighted (by GRDP) and unweighted average growth rates
of the districts surrounding each district, to allow the exploration of spillovers between districts
(see GAU and GAW variables).
See the section below on Consumption Expenditure for details of per capita consumption
growth rates.
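As an illustration of the neighbouring-district averages (the GAU and GAW variables), the sketch below computes unweighted and GRDP-weighted mean growth over a district's neighbours. The adjacency list, district labels, values and the function name neighbour_growth are all hypothetical; how neighbours were actually identified is not described here.

```python
def neighbour_growth(district, neighbours, growth, grdp=None):
    """Average growth of the districts adjacent to `district`.

    If `grdp` is given, each neighbour is weighted by its GRDP;
    otherwise a simple (unweighted) mean is returned.
    Returns None when the district has no neighbours with data.
    """
    adjacent = [d for d in neighbours.get(district, []) if d in growth]
    if not adjacent:
        return None
    if grdp is None:
        return sum(growth[d] for d in adjacent) / len(adjacent)
    total = sum(grdp[d] for d in adjacent)
    return sum(growth[d] * grdp[d] for d in adjacent) / total

# Hypothetical adjacency, growth rates and GRDP levels.
neighbours = {"A": ["B", "C"], "B": ["A"], "C": ["A"], "D": []}
growth = {"A": 0.03, "B": 0.02, "C": 0.06, "D": 0.01}
grdp = {"A": 500.0, "B": 300.0, "C": 100.0, "D": 50.0}

unweighted = neighbour_growth("A", neighbours, growth)
weighted = neighbour_growth("A", neighbours, growth, grdp)
```

The two measures differ whenever the neighbours are of unequal economic size, as in this example where the larger neighbour B pulls the weighted average down.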
To get a sense of economic concentration (that is, whether the local economy is
dominated by a particular sector), we also calculate the sectoral Gini coefficient (gini_sector). The Stata
command for this variable is either
egen gini_sector = inequal(sh_sector), by(year parent_336) index(gini)
or
egen gini_sector = gini(sh_sector), by(year parent_336)
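For readers without Stata, the idea behind the sectoral Gini can be sketched in Python. This uses the textbook formula on sorted values and is an assumption about what the egen functions compute; it does not claim to reproduce any small-sample adjustment they may apply. The example shares are hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 = perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values, ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

# Hypothetical sectoral shares of GRDP for two districts.
diversified = [0.25, 0.25, 0.25, 0.25]   # four equally sized sectors
concentrated = [0.0, 0.0, 0.0, 1.0]      # one sector produces everything

g_div = gini(diversified)
g_con = gini(concentrated)
```

With n sectors this formula has a maximum of (n - 1)/n, so a fully concentrated four-sector economy scores 0.75 rather than 1.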
Table 4 shows a list of the key per capita and per worker economic performance variables.
Variable      Description
PCY_          Income per total population
PCYnoil_      Income without oil & gas per total population
PLY_          Income per total workers
lnPLY_        Ln per worker real GDP
lnPCY_        Ln per capita real GDP
lnPCYnoil_    Ln per capita real GDP without oil & gas
y0706         Linear growth of per capita RGDP 2006-2007
y0605         Linear growth of per capita RGDP 2005-2006
y0504         Linear growth of per capita RGDP 2004-2005
y0403         Linear growth of per capita RGDP 2003-2004
y0302         Linear growth of per capita RGDP 2002-2003
y0201         Linear growth of per capita RGDP 2001-2002
y_noil0706    Linear growth of per capita RGDP without oil & gas 2006-2007
y_noil0605    Linear growth of per capita RGDP without oil & gas 2005-2006
y_noil0504    Linear growth of per capita RGDP without oil & gas 2004-2005
y_noil0403    Linear growth of per capita RGDP without oil & gas 2003-2004
y_noil0302    Linear growth of per capita RGDP without oil & gas 2002-2003
y_noil0201    Linear growth of per capita RGDP without oil & gas 2001-2002
yl0706        Linear growth of income per labor 2006-2007
yl0605        Linear growth of income per labor 2005-2006
yl0504        Linear growth of income per labor 2004-2005
yl0403        Linear growth of income per labor 2003-2004
yl0302        Linear growth of income per labor 2002-2003
yl0201        Linear growth of income per labor 2001-2002
gy0701        Geometric average growth 2001-2007; post-decentralization income per capita
gy0501        Geometric average growth 2001-2005; post-decentralization income per capita
gy_noil0701   Geometric average growth without oil & gas 2001-2007; post-decentralization income per capita
gyl0701       Geometric average growth 2001-2007; post-decentralization income per labor
lny0701       Logarithmic growth 2001-2007; post-decentralization income per capita
avy0701       Average growth 2001-2007; post-decentralization income per capita
ary0701       Arithmetic growth 2001-2007; post-decentralization income per capita
GAW0107       Weighted average growth of neighbouring districts 01-07
GAW0105       Weighted average growth of neighbouring districts 01-05
GAUn0107      Unweighted average growth of neighbouring districts 01-07
GAUn0105      Unweighted average growth of neighbouring districts 01-05
gini_sector   Gini coefficient of the structure of the economy by GRDP
Socio-economic variables
The dataset contains a large number of socio-economic variables drawn from the Susenas Core
datasets from 2001 to 2007.
Education outcome indicators
- Net Enrolment Rate
The net enrolment rate (NER) is the number of pupils enrolled at a given level of education
(primary, junior secondary or senior high secondary) who belong to the theoretical school-age
group for that level, divided by the population of the same age group.
The primary school-age group is 7-12 years old, the junior school-age group is 13-15
years old, and the senior high school-age group is 16-18 years old.
\[ NER_{primary} = \frac{\text{Number of pupils aged 7--12 enrolled in primary school}}{\text{Number of people aged 7--12}} \]
\[ NER_{junior} = \frac{\text{Number of pupils aged 13--15 enrolled in junior school}}{\text{Number of people aged 13--15}} \]
\[ NER_{senior} = \frac{\text{Number of pupils aged 16--18 enrolled in senior high school}}{\text{Number of people aged 16--18}} \]
- Gross Enrolment Rate
The gross enrolment rate (GER) is the number of students enrolled at a given level of education
(primary, junior or senior high secondary), regardless of age, divided by the population of the
theoretical school-age group for that level.
\[ GER_{primary} = \frac{\text{Number of pupils enrolled in primary school}}{\text{Number of people aged 7--12}} \]
\[ GER_{junior} = \frac{\text{Number of pupils enrolled in junior school}}{\text{Number of people aged 13--15}} \]
\[ GER_{senior} = \frac{\text{Number of pupils enrolled in senior high school}}{\text{Number of people aged 16--18}} \]
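The difference between the net and gross rates can be seen in a small worked example. The Python sketch below counts unweighted individuals for simplicity (the actual variables would use the Susenas survey weights), and the records and the helper name enrolment_rates are hypothetical.

```python
# Theoretical school-age groups as stated in the text.
AGE_GROUPS = {"primary": (7, 12), "junior": (13, 15), "senior": (16, 18)}

def enrolment_rates(people, level):
    """Return (NER, GER) for one level of education.

    people: list of (age, level_enrolled) where level_enrolled is
            'primary', 'junior', 'senior' or None.
    """
    lo, hi = AGE_GROUPS[level]
    in_age_group = [p for p in people if lo <= p[0] <= hi]
    enrolled = [p for p in people if p[1] == level]
    enrolled_in_age_group = [p for p in enrolled if lo <= p[0] <= hi]
    ner = len(enrolled_in_age_group) / len(in_age_group)
    ger = len(enrolled) / len(in_age_group)
    return ner, ger

# Hypothetical records: one 11-year-old is out of school and one
# 13-year-old is still enrolled in primary school.
people = [
    (7, "primary"), (9, "primary"), (12, "primary"),
    (11, None), (13, "primary"), (14, "junior"),
]
ner, ger = enrolment_rates(people, "primary")
```

Because over-age and under-age pupils count in the numerator of the gross rate but not in its denominator, the GER can exceed 1, whereas the NER cannot.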
Labour indicators
Since the Labour Force Survey, Sakernas, is designed only to be representative at national and
provincial levels, it cannot be used to obtain district level averages. Therefore, we used the
Susenas data to provide labour indicators of the labour force, employment and unemployment
at the district level. (Unfortunately, in the Susenas 2005, there is no information about labor
issues in the individual data from BPS, so we have missing labour data in 2005.) Table 5
shows the questions (since 2001) that determine the classification of an individual as
participating in the labour force, and employed or unemployed. Figure 1 shows how these
questions determine the classification.
Table 5: Questions that determine the classification into employed/unemployed
1. Did you work last week?
2. Did you work at least 1 hour last week?
3. Do you have work or a business in which you are currently not active?
4. Are you looking for a job?
5. Are you preparing a new business?
6. What are your reasons for not looking for a job or preparing a new business?
Figure 1: Classification of Employed/Unemployed
[Figure: the population of 15 years old and above is divided into Employment (working; temporarily
absent from work, but having jobs) and Unemployment (1. looking for a job; 2. preparing for work;
3. it is impossible to get a job; 4. already have a job, but not started to work yet).]
Definition of labour force:
Labor Force: Persons of 15 years old and over who were working, temporarily absent from
work but having jobs, or without work and looking for work:
1. Working: an activity done by a person who worked for pay or assisted others in
obtaining pay or profit for at least one hour during the survey week.
2. Temporarily absent from work, but having jobs: a person who
had a job but was temporarily absent from work for some reason during the survey
week.
3. Did not have work and looking for work: all persons who did not have any job but
were looking for work during the survey week. This is usually called open
unemployment.
4. Preparing for work
5. Not looking for a job or preparing a new business, for one of the following reasons:
- It is impossible to get a job
- Already have a job, but have not started work yet
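One way to read the classification rules above is as a simple decision function. The Python sketch below is an interpretation, not the authors' code: the argument names and the reason strings are hypothetical stand-ins for the Susenas questionnaire responses.

```python
# Hypothetical labels for the two reasons that still count as unemployment.
UNEMPLOYED_REASONS = {"impossible to get a job", "has job, not yet started"}

def classify(age, worked_last_week, has_job_but_absent,
             looking_for_job, preparing_business, reason_not_looking=None):
    """Classify one individual following the labour force definition."""
    if age < 15:
        return "outside working-age population"
    # Employed: worked at least one hour, or temporarily absent from a job.
    if worked_last_week or has_job_but_absent:
        return "employed"
    # Unemployed: searching, preparing a business, or one of the two
    # listed reasons for not searching.
    if looking_for_job or preparing_business:
        return "unemployed"
    if reason_not_looking in UNEMPLOYED_REASONS:
        return "unemployed"
    return "not in labour force"

status_worker = classify(30, True, False, False, False)
status_seeker = classify(22, False, False, True, False)
status_discouraged = classify(40, False, False, False, False,
                              "impossible to get a job")
status_student = classify(17, False, False, False, False, "in school")
```

The labour force is then the sum of the employed and unemployed groups.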
Field sector of work classification
Starting from 2001, BPS renewed its field of work classification system, from a simple 9 sector
classification to a 3 digit KLUI system. To avoid possible errors in the coding of sub-sectors,
we only use the first digit sectoral breakdown as in the old coding system.
1. Agriculture sector
2. Mining and excavation
3. Manufacturing industry
4. Electricity, gas, and water
5. Building construction
6. Accommodation services
7. Transportation, storing, and communication
8. Financial institution, real estate, and leasing
9. Public, social, personal services
10. Activity that does not have clear limitation rule
Because the raw Susenas data has information on labour issues for all individuals aged 10 or
over, we have calculated the number of people working in each sector aged 10 or over, rather
than 15 or over.
We have also calculated the proportion of people living in urban areas using Susenas data.
In addition, we include a measure of the concentration of the labour force (complementing the
sectoral concentration of GDP above): the gini_secsus* variables, which are the Gini
coefficient of the sectoral shares of employment (as opposed to GDP) for each district.
Consumption expenditure
To get an estimate of overall welfare, we use the per capita expenditure data from Susenas i.e.
household expenditure divided by household size. Household expenditure is divided into food
and non-food consumption expenditure.
These are the steps used to create the key variables:
- Per capita consumption expenditure is created from the Susenas Core (household data). The
average per capita expenditure per month is household expenditure per month
divided by household size; the average annual per capita expenditure is
household expenditure per month multiplied by 12 and then divided by household size.
- The individual weights from the Susenas Core (individual data) are kept and merged with
the household data.
- The data are then collapsed to give average per capita expenditure per month by district
code (b1r1 b1r2) using individual weights.
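The steps above amount to a weighted mean of household per capita expenditure within each district. A minimal Python sketch follows; the field names, district code and numbers are hypothetical, and it assumes each household record already carries the merged individual weight.

```python
from collections import defaultdict

def district_pcexp(households):
    """Weighted average monthly per capita expenditure by district.

    households: iterable of dicts with keys
        'district'    - district code (stand-in for b1r1 b1r2)
        'monthly_exp' - total household expenditure per month
        'hh_size'     - number of household members
        'weight'      - individual weight merged onto the household record
    """
    weighted_sum = defaultdict(float)
    weight_sum = defaultdict(float)
    for hh in households:
        pcexp = hh["monthly_exp"] / hh["hh_size"]
        weighted_sum[hh["district"]] += pcexp * hh["weight"]
        weight_sum[hh["district"]] += hh["weight"]
    return {d: weighted_sum[d] / weight_sum[d] for d in weighted_sum}

# Hypothetical households in one district.
households = [
    {"district": "3171", "monthly_exp": 400.0, "hh_size": 4, "weight": 40.0},
    {"district": "3171", "monthly_exp": 600.0, "hh_size": 2, "weight": 10.0},
]
pcexp = district_pcexp(households)
ypcexp = {d: 12 * v for d, v in pcexp.items()}  # annual figure
```

Weighting matters here: the unweighted mean of the two households would be 200, while the weighted mean is 140 because the poorer household represents more people.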
Ethnic and Religious Fragmentation Indices
The dataset includes indices of ethnic and religious fragmentation, similar to those calculated
by Easterly and Levine (1997)4. We use the Population Census 2000 from the Indonesian Bureau of
Statistics (BPS).
Ethnolinguistic Fractionalization (ELF):
The ELF index can be defined as follows:
\[ ELF = 1 - \sum_{i=1}^{n} X_{ij}^{2}, \qquad X_{ij} = \frac{\text{Population belonging to ethnolinguistic group } i \text{ in district } j}{\text{Population in district } j} \]
where X_{ij} is the population share of ethnic group i in district j. The index takes values between
zero and one, where a value close to 1 implies a highly heterogeneous district and a value of
0 a perfectly homogeneous district. In the 2000 Census the Indonesian Central Bureau of Statistics
identified 1068 ethnic groups across the regions of Indonesia.
4 Easterly and Levine (1997), “Africa’s Growth Tragedy: Policies and Ethnic Divisions”, Quarterly Journal of
Economics, November 1997.
Religion Fractionalization:
In the 2000 Census only five religions are officially categorized, namely
Islam, Catholicism, Protestantism, Hinduism and Buddhism; all others are recorded as Other.
\[ RF = 1 - \sum_{i=1}^{n} S_{ij}^{2}, \qquad S_{ij} = \frac{\text{Population belonging to religion group } i \text{ in district } j}{\text{Population in district } j} \]
where S_{ij} is the population share of religion group i in district j. The index takes values
between zero and one, where a value close to 1 implies a highly heterogeneous district and a value of 0
a perfectly homogeneous district.
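Both indices are one minus the sum of squared population shares, so a single function covers them. The Python sketch below is illustrative only; the group counts are hypothetical.

```python
def fractionalization(group_populations):
    """One minus the sum of squared group shares within a district."""
    total = sum(group_populations)
    return 1.0 - sum((p / total) ** 2 for p in group_populations)

# Hypothetical districts.
homogeneous = fractionalization([1000])              # a single group
two_equal = fractionalization([500, 500])            # two equal groups
four_equal = fractionalization([250, 250, 250, 250]) # four equal groups
```

With n equally sized groups the index equals 1 - 1/n, so it approaches 1 only as the number of groups grows; with the six religious categories its maximum is 5/6.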
Table 6 shows the key variables drawn from the Susenas surveys.
Variable         Description
POPUR            Population in urban area (Susenas)
popsus           Population (Susenas)
popeduc          Population aged >=5 years (Susenas)
age_prim         People of primary school age, 7-12 years (Susenas)
age_secj         People of junior school age, 13-15 years (Susenas)
age_sech         People of senior school age, 16-18 years (Susenas)
Enrolp           People aged 7-12 enrolled in primary school (Susenas)
Enroly           People aged 13-15 enrolled in junior school (Susenas)
enrolls          People aged 16-18 enrolled in senior school (Susenas)
scl_sd           People ever/being in primary school (Susenas)
scl_smp          People ever/being in junior school (Susenas)
scl_sma          People ever/being in senior school (Susenas)
none_scl         People never in school
Scl              People ever/being in school
prim             People enrolled in primary school (Susenas)
secj             People enrolled in junior school (Susenas)
sech             People enrolled in senior high school (Susenas)
NER_sd           NER - primary school (Susenas)
NER_smp          NER - junior school (Susenas)
NER_sma          NER - senior school (Susenas)
enPRIM           Gross enrollment rate - primary school (Susenas)
enSECJ           Gross enrollment rate - junior school (Susenas)
enSECH           Gross enrollment rate - senior school (Susenas)
enSECT           Gross enrollment rate - junior+senior school (Susenas)
PRIM             Share of people ever/being in primary school per total population (Susenas)
SECJ             Share of people ever/being in junior secondary school per total population (Susenas)
SECH             Share of people ever/being in high secondary school per total population (Susenas)
lf_              Labor force (age >=15 years) (Susenas)
employ           Employed population (Susenas)
unempl           Unemployed population (Susenas)
POPEMP           Number of people working (age >=10 years) (Susenas)
AGR              Number of people who work in Agriculture (Susenas)
MIN              Number of people who work in Mining (Susenas)
MAN              Number of people who work in Manufacturing (Susenas)
ENR              Number of people who work in Energy & Electricity (Susenas)
CON              Number of people who work in Construction (Susenas)
TRD              Number of people who work in Trade (Susenas)
TRS              Number of people who work in Transportation (Susenas)
FIN              Number of people who work in Finance (Susenas)
SER              Number of people who work in Services (Susenas)
OTHER_SECTOR     Number of people who work in Other Sector (Susenas)
SH_AGR           Share of Agriculture workers to total population (Susenas)
SH_MIN           Share of Mining workers to total population (Susenas)
SH_MAN           Share of Manufacturing workers to total population (Susenas)
SH_ENR           Share of Energy & Electricity workers to total population (Susenas)
SH_CON           Share of Construction workers to total population (Susenas)
SH_TRD           Share of Trade workers to total population (Susenas)
SH_TRS           Share of Transport workers to total population (Susenas)
SH_FIN           Share of Finance workers to total population (Susenas)
SH_SER           Share of Services workers to total population (Susenas)
SH_OTHER_SECTOR  Share of Other Sector workers to total population (Susenas)
sh_urban         Urbanization (portion of population that is urban) (Susenas)
pcexp            Average monthly per capita consumption (Susenas)
ypcexp           Average annual per capita consumption (Susenas)
Lnypexp2001      Ln average annual per capita consumption 2001 (Susenas)
ypcexp0706       Linear growth of annual per capita expenditure 2006-2007
ypcexp0605       Linear growth of annual per capita expenditure 2005-2006
ypcexp0504       Linear growth of annual per capita expenditure 2004-2005
ypcexp0403       Linear growth of annual per capita expenditure 2003-2004
ypcexp0302       Linear growth of annual per capita expenditure 2002-2003
ypcexp0201       Linear growth of annual per capita expenditure 2001-2002
GAUnypcexp0107   Unweighted average growth of per capita expenditure of neighbouring districts 01-07
GAUnypcexp0105   Unweighted average growth of per capita expenditure of neighbouring districts 01-05
gini_sectsus     Gini coefficient of workers (>=10 years old) by sector (Susenas)
Consumer Price Indices and Real Per Capita Expenditure
Figures on per capita expenditure per month for each district are in nominal terms. However,
calculating welfare changes over time requires real values. BPS collects price data from a large
number of cities (42 in 2002, now 63). We matched each district with the nearest city for
which price data were available and used the CPI for that city, with 2002 as the base
year, as the deflator (the cpi variables are cpi<year>). Per capita expenditure figures were then deflated to
give real per capita expenditure figures (rpcexp) for each year. Finally, we calculated the
geometric average growth of real per capita expenditure (gy_rpexp0701) between 2001 and
2007.
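The deflation and growth calculation can be sketched as follows. This is a Python illustration under assumptions: the CPI is taken to be indexed so that the base year 2002 equals 100, and the expenditure and CPI figures are hypothetical.

```python
def real_expenditure(nominal, cpi, base=100.0):
    """Deflate a nominal value by a CPI indexed to `base` in the base year."""
    return nominal * base / cpi

def geometric_growth(initial, final, periods):
    """Geometric average growth rate per period."""
    return (final / initial) ** (1.0 / periods) - 1.0

# Hypothetical nominal per capita expenditure and matched-city CPI.
nominal = {2001: 200000.0, 2007: 400000.0}
cpi = {2001: 90.0, 2007: 150.0}   # 2002 = 100 by assumption

rpcexp = {year: real_expenditure(nominal[year], cpi[year]) for year in nominal}
gy_rpexp = geometric_growth(rpcexp[2001], rpcexp[2007], 2007 - 2001)
```

In this example nominal expenditure doubles, but real expenditure rises by only 20 percent over the six years once price changes are removed.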
Farmers Prices
The dataset also contains data on the Farmers Terms of Trade based on a survey of prices
experienced by farmers for their inputs (IB*) and outputs (IT*).
Village characteristics data
The PODES (Village Potential) surveys provide information about village characteristics for
all of Indonesia, with a sample of approximately 65,000 villages (desa). These surveys are carried out in
the run-up to each of the periodic censuses (Agriculture, Economy, Population).5 To get
district-level data we aggregated the PODES data up to the district level and calculated
the proportion of villages in the district with particular characteristics.
- Road quality
There are two kinds of road variable. First, road_asphalt00 is the number of villages in the
district where the main road surface is asphalt. Similarly, road_rocks, road_soil and
road_other show the number of villages with other forms of road surface. SH_road_asphalt00
shows the proportion of villages in the district with asphalt roads.
In addition, PODES asks village heads perception questions about the quality of the
roads on a scale from 1 to 4, where 1 represents the best road quality (paved road) and 4 the
worst road quality. The ROAD_00 variable is the average of these scores across all
villages in the district. Hence the better the perceived quality of the roads, the closer
the average is to 1, and the worse the quality of the roads, the closer the average is
to 4.
5 See: http://www.rand.org/labor/bps.data/webdocs/podes/podes.htm
- Geographical areas
Data are taken from the geographical location of each village. In the original dataset,
geographical location is categorized into four types: 1 is coastal area, 2 is valley or riverside,
3 is hill, and 4 is plain land. These were used to create new variables showing the
number of households living in each of these types of land in each year. In addition,
SH_coastal00 etc. show the share of villages in the district that are coastal, and likewise for the other types.
We also created a dummy variable for landlocked districts, LLOCK_<year>, with the
value 1 indicating that none of the villages within a district has a coastal area and 0 if at
least one village has a coastal area. Unfortunately, this is not consistent over time. We
therefore manually reconciled those districts with unclear classification to provide a
single landlocked variable, LLOCK.
- Telephone Lines
The telephone line variable is defined as the number of telephones per household. It is
calculated by dividing the total number of households in each district that are connected
to telephone lines (TELP_<year>) by the total number of households in each district
(hh_<year>).
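The aggregation from village records to district indicators described in this section can be sketched as follows. This Python example is illustrative: the record fields and values are hypothetical, while the output keys borrow the dataset's variable names to show which quantity each one corresponds to.

```python
def district_indicators(villages):
    """Aggregate village-level PODES-style records for one district.

    Each village is a dict with hypothetical keys:
        road_surface  - 'asphalt', 'rocks', 'soil' or 'other'
        road_score    - perceived road quality, 1 (best) to 4 (worst)
        location      - 1 coastal, 2 valley/riverside, 3 hill, 4 plain land
        households    - number of households
        phone_households - households connected to telephone lines
    """
    n = len(villages)
    asphalt = sum(1 for v in villages if v["road_surface"] == "asphalt")
    coastal = sum(1 for v in villages if v["location"] == 1)
    households = sum(v["households"] for v in villages)
    phones = sum(v["phone_households"] for v in villages)
    return {
        "road_asphalt": asphalt,                  # count of villages
        "SH_road_asphalt": asphalt / n,           # share of villages
        "ROAD": sum(v["road_score"] for v in villages) / n,
        "SH_coastal": coastal / n,
        "LLOCK": 1 if coastal == 0 else 0,        # landlocked dummy
        "TELPHH": phones / households,
    }

# Hypothetical district with four villages.
villages = [
    {"road_surface": "asphalt", "road_score": 1, "location": 1,
     "households": 100, "phone_households": 10},
    {"road_surface": "asphalt", "road_score": 2, "location": 4,
     "households": 100, "phone_households": 20},
    {"road_surface": "soil", "road_score": 3, "location": 3,
     "households": 100, "phone_households": 30},
    {"road_surface": "rocks", "road_score": 2, "location": 2,
     "households": 100, "phone_households": 40},
]
indicators = district_indicators(villages)
```

For districts that split after 2001, the same function would be applied to the pooled villages of all child districts.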
Table 7 provides a list of the key infrastructure variables from PODES.
Variable         Description
SH_road_asphalt  Share of villages in the district with asphalt road (PODES)
SH_road_rocks    Share of villages in the district with rock road (PODES)
SH_road_soil     Share of villages in the district with soil road (PODES)
SH_road_other    Share of villages in the district with other road surface (PODES)
SH_coastal       Share of villages in the district on the coast (PODES)
SH_valey         Share of villages in the district in a valley (PODES)
SH_hill          Share of villages in the district on a hill (PODES)
SH_plainland     Share of villages in the district on plain land (PODES)
TELPHH           Telephone access per household (PODES)
LLOCK            Dummy for geo-location: landlocked = 1; coastal = 0 (PODES)
ROAD             Road quality: 1 good to 4 worst (PODES)
- Natural Disaster
PODES has detailed information on the number of villages that have experienced
various forms of natural disaster (drought, flood, volcanic eruption, earthquake, etc.).
For some years there are data on the total frequency of such events. For 2008 there is
information on the number of victims of such disasters and the losses that
resulted.
- Crime
PODES also contains detailed information on whether a range of different crimes have
been reported in each village (fighting, theft, robbery, mistreatment, arson,
rape, drug abuse, murder, suicide, etc.). We therefore have variables indicating the
number of villages in a district which reported each of these crimes for each PODES
year, the share of villages reporting these crimes, and the share of villages reporting any
kind of crime (SHCRIME<year>).
Distance and Remoteness
The dataset contains jarak_* variables which show the straight-line distance from the capital of
the district to each of the largest cities, as well as to the provincial capital and the nearest large
city. In addition, the dlaut_* dummy variables show whether it is necessary to take a boat to
reach these locations, i.e. whether the district is separated by sea from these places.
Fiscal data
We also include some variables from the SIKD dataset maintained by the Ministry of Finance.
These include data on revenues received by the district (rev*), as well as development (dev*)
and routine expenditures (rtn*).
The Governance Variables
The governance variables in the dataset (tz* and TZ*) are taken from the KPPOD/Asia
Foundation survey of Local Economic Governance in 2007 (see weblink to report for full
details of the methodology). This surveyed around 50 firms in each of 243 districts
(subsequently further surveys have been done covering the remainder of the country, but these
are not included in this dataset).6 The 243 districts consisted of all regencies and cities in 15
provinces. The firm survey is designed to be representative of all non-primary sector private
sector firms with 10 employees or more.7 The Firm Survey asked questions about nine major
aspects of local economic governance (Box 1) chosen to be consistent with theories of how
local economic governance should influence local economic performance. These aspects of
local economic governance were chosen because they are directly under the control of the local
government.
Box 1: Key Aspects Of The Local Economic Governance Survey 2007
- Access to Information
- Access to Land and Security of Tenure
- Business Licensing
- Local Government and Business Interaction
- Business Development Programs
- Capacity and Integrity of the Mayor/Regent
- Local Taxes, User Charges and Other Transaction Costs
- Local Infrastructure
- Security and Conflict Resolution
6 In addition to the firm survey in each district, two or three business associations were interviewed and local
regulations (perda) were collected. These are not included in our dataset.
7 The sampling method was chosen to make the results representative at the district level. Within each district,
small (10-19 employees), medium (20-99) and large (100+) firms were sampled roughly in proportion to their
presence in the district. Within each size class, firms were sampled from three aggregate sectors – production,
trade, and services – again in proportion to their presence in the local economy.
In order to aggregate different variables into a single sub-index for each of the
concepts in Box 1, it was necessary to put them all onto the same scale. The EGI therefore uses
within-country comparisons of best and worst practice as its key benchmarks. That is, district
performance is measured against a scale that is determined by the best and worst performance
of other districts in Indonesia. For example, the speed of licensing for each district is measured
against the fastest and slowest licensing procedures of other districts in Indonesia. Using
current best and worst practice to define the benchmark ensures that districts are compared
against a standard that is relevant for Indonesia and achievable (because it has actually been
achieved by other districts).
Given this choice of benchmarks, each sub-index is calculated as follows:
1. Determine the variables chosen to represent each sub-index and calculate a district
level mean for each variable
Each sub-index represents a particular concept, as shown in Box 1. The variables used in
constructing each sub-index must therefore reflect that concept. In addition, to ensure that
meaningful comparisons can be made across different districts, the response rate in each
district must be sufficiently high that we can have confidence in the average value calculated.
For example, the variables used in the Land Access and Security of Tenure sub-index are:
- Time taken to obtain a land certificate
- Perceived ease of obtaining land
- Frequency of evictions in the region
- Perceived frequency of land conflicts
- Overall assessment of the significance of land problems
The time taken to obtain a land certificate and the perceived ease of obtaining land are useful
variables for assessing the ease of accessing land, while the frequency of evictions and the
frequency of land conflicts are useful indications of the level of insecurity about land tenure.
For all sub-indices derived from the firm survey, firms were also asked to give their overall
assessment of the extent to which that issue was a constraint on their activities – this is the
overall assessment variable.
The “hard” data on the time to obtain a land certificate is measured in weeks. Perception data,
on the other hand, typically used a simple 4-point scale. For example, when firms were asked about
the ease of obtaining land they were given the choice of: very difficult (1), difficult (2), easy
(3), and very easy (4). Similarly, when firms were asked about the likelihood of eviction they
chose between extremely likely, likely, unlikely, and very unlikely. To obtain a general
picture, the average value of each of these variables was calculated for the district.
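As an illustration of step 1, the sketch below computes a district-level mean from firm answers and withholds the mean where too few firms responded. It is a minimal sketch, not the procedure used to build the dataset: the function name `district_means`, the input layout and the 50% response-rate threshold are all assumptions made for the example, since the text does not state what response rate was considered sufficient.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical threshold: minimum share of surveyed firms that must answer
# a question before the district mean is used. Not taken from the EGI report.
MIN_RESPONSE_RATE = 0.5


def district_means(responses, min_response_rate=MIN_RESPONSE_RATE):
    """Return {district: mean answer}, or None where too few firms answered.

    `responses` is a list of (district, answer) pairs, with answer=None
    when a firm did not answer the question.
    """
    by_district = defaultdict(list)
    for district, answer in responses:
        by_district[district].append(answer)

    result = {}
    for district, answers in by_district.items():
        valid = [a for a in answers if a is not None]
        rate = len(valid) / len(answers)
        if valid and rate >= min_response_rate:
            result[district] = mean(valid)
        else:
            result[district] = None  # response rate too low to trust the mean
    return result


# Made-up answers on the 4-point "ease of obtaining land" scale
sample = [("A", 3), ("A", 2), ("A", None), ("A", 4),
          ("B", 1), ("B", None), ("B", None)]
means = district_means(sample)
```

Here district A keeps its mean because three of four firms answered, while district B is set to `None` because only one of three did.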
2. Normalize the average values by calculating a tz-score
Once district average values have been calculated for each variable in a sub-index, it is
necessary to normalize these values so that they can be compared. Without some mechanism
for standardizing or normalizing the variables, it is impossible to combine a mean value of,
say, 20 days to obtain a land certificate, with a mean score of, say, 3.2 for perceptions about the
likelihood of eviction, because these variables use different scales of measurement. We
therefore use a simple normalization to put all the variables into the same scale. This is
achieved by putting all variables on a scale from 0 to 100, where 0 is the worst district (for that
variable) and 100 indicates the best district for that variable. The normalization used to
calculate the tz value for each variable is:
$$tz = 100 \times \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$
Where x is the average value of the variable for the district; xmin is the lowest average value of
the variable across all the districts; and xmax is the highest average value across all the
districts. This generates for every variable, a value between 0 and 100 indicating where the
district lies on the scale of worst to best districts for that variable.
Example: The tz_score for the time to obtain a land certificate in the City of Bima
The average time reported by firms to obtain a land certificate in the City of Bima is 19.3 days.
But the minimum time across all 243 districts is found in Regency of Timor Tengah Timur in
Nusa Tenggara Barat where it only takes 4.17 days to obtain a certificate. By contrast, the
longest process can be found in the City of Cimahi in West Java where it takes 42 days on
average to obtain a land certificate. The tz-score for the City of Bima for the “time to obtain a
land certificate” variable is therefore t = 100 × (19.3 – 4.17)/(42 – 4.17) = 40.05.
Reversing the scale
The variables in the index are constructed so that a higher score indicates better performance.
However, for some variables, larger numbers indicate worse performance. For example, a
longer time to obtain a land certificate makes access to land more difficult rather than easier.
For such variables, the tz_scores are reversed simply by calculating trev = 100 – t. Thus, for
the Bima example the final tz_score is trev = 100 – 40.05 = 59.9. This indicates that the City
of Bima is 60% along the scale that runs from the worst performing district to the best
performing district.
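The normalization and reversal can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used to build the dataset: the function name `tz_score` and its `reverse` flag are hypothetical, and the inputs are the rounded figures quoted in the Bima example. With those rounded inputs the unreversed score comes out at roughly 40.0 rather than the 40.05 reported, presumably because the published inputs were rounded.

```python
def tz_score(x, x_min, x_max, reverse=False):
    """Min-max normalise a district mean onto the 0-100 scale.

    Set reverse=True for variables where a larger raw value means worse
    performance (e.g. time taken to obtain a land certificate).
    """
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    t = 100 * (x - x_min) / (x_max - x_min)
    return 100 - t if reverse else t


# Figures quoted in the text for the time to obtain a land certificate
bima = tz_score(19.3, x_min=4.17, x_max=42.0)
bima_rev = tz_score(19.3, x_min=4.17, x_max=42.0, reverse=True)
print(round(bima, 1), round(bima_rev, 1))
```

By construction the district with the lowest raw value scores 0 and the one with the highest scores 100 before any reversal.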
3. Average the tz_scores for the variables in the sub-index
Once the tz_scores have been calculated for all variables in a sub-index (and reversed when
appropriate to ensure that larger scores correspond to better performance), they can be simply
averaged to obtain an overall score for the sub-index (TZ_*).
Example: the Land Access and Legal Certainty Sub-index for the City of Bima
The tz_scores for each of the variables in the land sub-index for the City of Bima are shown
below:
Variable                                                      tz_score
Time taken to obtain a land certificate*                      59.9
Perceived ease of obtaining land                              35.7
Frequency of evictions in the region*                         48.9
Perceived frequency of land conflicts*                        44.3
Overall assessment of the significance of land problems       18.7
Land Access and Legal Certainty Sub-index for City of Bima    41.5
* These variables have had their tz_scores reversed.
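The sub-index for the City of Bima is the simple average of the five tz_scores listed for it. The short check below reproduces that arithmetic; the dictionary keys are descriptive labels chosen for this example, not variable names from the dataset.

```python
from statistics import mean

# tz_scores for the City of Bima as reported in the text
# (already reversed where the source marks them with *)
land_tz_scores = {
    "time_to_obtain_land_certificate": 59.9,
    "perceived_ease_of_obtaining_land": 35.7,
    "frequency_of_evictions": 48.9,
    "perceived_frequency_of_land_conflicts": 44.3,
    "overall_assessment_of_land_problems": 18.7,
}

land_sub_index = mean(list(land_tz_scores.values()))
print(round(land_sub_index, 1))  # 41.5
```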
Finally, in the original Local Economic Governance Report these sub-indices were combined
into an aggregate Economic Governance Index by weighting each of the sub-indices by their
perceived importance to firms. We have not done this in the dataset since researchers may
wish to combine sub-indices in different ways for different purposes (with or without weights).
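A researcher who wants an aggregate index could combine the sub-indices along the following lines. This is only a sketch: the function `aggregate_index`, the sub-index labels and every weight and score below are made up for illustration (apart from the 41.5 land score from the Bima example), and they are not the weights used in the original Local Economic Governance Report.

```python
def aggregate_index(sub_indices, weights=None):
    """Combine sub-index scores (each 0-100) into a single weighted average.

    With weights=None every sub-index counts equally.
    """
    if weights is None:
        weights = {name: 1.0 for name in sub_indices}
    total_weight = sum(weights[name] for name in sub_indices)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(sub_indices[name] * weights[name] for name in sub_indices)
    return weighted_sum / total_weight


# Illustrative scores only; the licensing and infrastructure values are invented
scores = {"land": 41.5, "licensing": 60.0, "infrastructure": 50.0}
unweighted = aggregate_index(scores)
weighted = aggregate_index(
    scores, {"land": 1.0, "licensing": 2.0, "infrastructure": 1.0}
)
```

Dividing by the sum of the weights keeps the aggregate on the same 0 to 100 scale as the sub-indices, whatever weights are chosen.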
For full details of individual variables see the corresponding question (indicated in the variable
label) in the EGI Firm Survey questionnaire which is attached as an Annex of this document.
For comparison, the dataset also contains a different set of governance sub-indices produced by
KPPOD in 2002 in an earlier version of this survey. These use a different methodology and
are not directly comparable but are included for reference (see KPPOD 2002 for more details).
Political variables
The dataset also contains political variables about the names and parties of the district leaders.
These are drawn from the Asia Foundation’s database on election monitoring. The variables
show the name of the Bupati (nm_bup) (including Walikota); their party (party_bup); as well
as an indication of the system under which they were elected/appointed. These could be New
Order (i.e. appointed under the Suharto regime); indirectly elected (the system in place
immediately after the fall of the regime); directly elected (from 2005 onwards); and caretaker
leaders (e.g. when a leader is sick or away for some reason). In addition there are dummy
variables for each of these categories.
There is also a range of variables on the incidence and extent of corruption, based on
media reports (CORN). These are based on ICW data for 2004.
Data from the Industrial Census (Survei Industri) and the Investment Coordinating Board
BPS carries out an industrial census of large (more than 100 employees) manufacturing
enterprises each year. From this data we include variables on:
- The number of large manufacturing firms (firm*)
- The value of their production (YPRVCU*)
- The total number of workers (LTLNOU*)
- Value-added (VTLVCU*)
- Investment (FTTLCU*)
In addition, we calculated a measure of economic concentration, the Herfindahl-Hirschman
Index (HHI*), to show how concentrated production, employment, value-added and investment are in
each district:

$$H = \sum_{i=1}^{N} s_i^2$$

where $s_i$ is the share of firm i in the district total of the variable of interest (production,
employment etc.) and N is the number of firms in the district. A high value of the HHI shows a
high level of concentration.
In addition, we calculated the share of production, workers, value-added and investment
accounted for by the top 1 firm and the top 5 firms in each district.
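The concentration measures can be sketched as follows. The function names and the firm-level figures are hypothetical, and the sketch assumes shares are expressed as fractions between 0 and 1; the text does not say whether the HHI* variables in the dataset use that scale or percentage shares.

```python
def shares(values):
    """Convert firm-level values into shares of the district total."""
    total = sum(values)
    if total <= 0:
        raise ValueError("district total must be positive")
    return [v / total for v in values]


def hhi(values):
    """Herfindahl-Hirschman Index: the sum of squared firm shares."""
    return sum(s ** 2 for s in shares(values))


def top_k_share(values, k):
    """Share of the district total accounted for by the k largest firms."""
    return sum(sorted(shares(values), reverse=True)[:k])


# Made-up production values for three large firms in one district
production = [50.0, 30.0, 20.0]
district_hhi = hhi(production)
top1 = top_k_share(production, 1)
top5 = top_k_share(production, 5)  # fewer than 5 firms, so this covers all of them
```

With fractional shares the index ranges from 1/N, when all N firms are the same size, to 1, when a single firm accounts for the whole district total.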
The dataset also includes some variables drawn from the Investment Coordinating Board’s data
on investment licenses. This includes: the total number of licenses (izin) issued for foreign
direct investment (pma) or domestic investment (pmdn) by the Investment Coordinating Board,
as well as the total value of investment for the firms in the district.
ACCESSING THE DATA
The dataset, this documentation and the final report on district level governance and growth
can be downloaded from both the KPPOD website www.kppod.org