The Indonesian Sub-National Growth and Governance Dataset

14 March 2011

Acknowledgements

This dataset was constructed by a research team led by Dr. Neil McCulloch of the Institute of Development Studies, UK, under the AusAID research project “Measuring the Economic Benefit of Better Local Economic Governance in Indonesia”, No. ABN 62 921 558 838. The research team included Pak Agung Pambudhi and his staff at KPPOD, Ms. Sukma Yuningsih at the World Bank, Jakarta, and Dr. Eddy Malesky at the University of San Diego. A large part of the data compilation and documentation was done by Ms. Sukma Yuningsih. Pak Boedi Rheza of KPPOD also helped to integrate KPPOD’s data and prepare the final dataset. We are grateful to BPS for permission to use the underlying data drawn from a variety of BPS surveys, as well as to the Asia Foundation for the use of the Economic Governance Index dataset.

Disclaimer

This dataset is distributed as a resource for researchers. We do not guarantee the accuracy of the data and accept no responsibility for it. Any questions regarding the data should be directed to the organisations responsible for producing the original data from which this dataset is constructed. We are unable to offer support, assistance or updates of any kind.

INTRODUCTION

It is widely believed that good local economic governance is important for boosting local economic performance. A research project, funded by AusAID and led by Dr. Neil McCulloch at the Institute of Development Studies in the UK, set out to test whether this is indeed the case (see http://www.ids.ac.uk/go/idsproject/measuring-the-impact-of-better-local-governance-inindonesia for details of the project and the final report).
It did so by compiling a unique dataset which draws together data on the economic characteristics and performance of Indonesia’s districts (Kabupaten/Kota) between 2001 and 2007, along with data from a 2007 survey by KPPOD/Asia Foundation which measured the quality of economic governance at the district level (see http://asiafoundation.org/program/overview/economicgovernance-index). This document gives background information to help researchers use the dataset in their own research. The dataset is in STATA 11 format.

TYPES OF VARIABLES AND DATA SOURCES

The bulk of the variables in the dataset are drawn from a range of standard surveys undertaken by the Badan Pusat Statistik (BPS). However, the number of districts is not consistent across the original sources of data. For example, the regional GDP (GRDP) publication dataset, BPS’s Susenas household survey and the Village Potential (Podes) surveys do not always have the same number of districts, even when they are carried out in the same year. This is because different sampling frames are used at different times of the year. In addition, the de jure and de facto status of a new district is recorded differently by different institutions. To be consistent, we have used the definition of the Ministry of Finance: an autonomous province/district is one that receives DAU (Dana Alokasi Umum, the general allocation grant) at the beginning of the fiscal year.

Between 2001 and 2009, the number of districts in Indonesia (excluding six non-autonomous district-level governments in Jakarta) was: 2001 = 336, 2002 = 348, 2003 = 370, 2004 = 410, 2005-07 = 434, 2008 = 451, and 2009 = 477. In this dataset, we use 2001 as our reference point, with 342 districts (336 districts and six non-autonomous district-level governments in Jakarta), to avoid spurious changes resulting from the splitting of districts.
That is, if districts split after 2001, we aggregated the data from the child districts so that our dataset shows a consistent series of variables for the geographical regions that comprised the districts in 2001. Table 1 shows the identification variables in our dataset.

Identification data

Table 1: Identification variables

id_m          WB coding for Kabupaten/Kota
kp09          Coding for province (2009: 33 provinces)
kkk09         Coding for Kabupaten/Kota (2009)
name09        Name of region (2009)
province09    Name of province (2009)
island        Name of island
dummy_kota    1 = Kota, 0 = Kabupaten
jawa          1 = Java, 0 = off-Java
EASTINDO      Dummy for Eastern Indonesia (= 1)
Dsumat        Dummy for Sumatra Island
Djawa         Dummy for Java Island
Dkalim        Dummy for Kalimantan Island
Dsulw         Dummy for Sulawesi Island
Dnusa         Dummy for Nusa Tenggara-Maluku Islands
Dpapua        Dummy for Papua Island
parent_336    WB code (base 2001, districts collapsed into 336 districts)
name_336      Names of the 336 parent regions
split_342     Dummy for the 342 districts since 2001 (split into new regions = 1; never split = 0)

National Income data

The Gross Regional Domestic Product (Produk Domestik Regional Bruto, PDRB) is the market value of all final goods and services produced within a region during a given period of time. [1] The value of intermediate goods is not counted separately because the value of the final good already contains the value of all intermediate goods. The GRDP can be used as a measure of economic activity. [2] The data are provided by the Central Bureau of Statistics (Badan Pusat Statistik, BPS) on a yearly basis. The GRDP data used here are taken from regional GDP by production sector (2000-2007), kindly provided by BPS on request.

Based on the prices used, GRDP is classified into:
a. Nominal GRDP: production is valued as the quantity produced in a specific year at the current prices of the end products.
b. Real GRDP: production is valued as the quantity produced in a specific year at the constant prices of the base year (2000).
This calculation enables one to see real production changes regardless of variations in end product prices.

Based on the sector’s contribution, GRDP is classified into:
a. GRDP with oil and gas (PDRB Migas): the aggregate of all sectors within a specific year.
b. GRDP without oil and gas (PDRB Non Migas): the aggregate of all sectors excluding the Oil and Gas Mining and Oil and Gas Manufacturing subsectors.

[1] Adding income earned by domestic residents from their investments abroad, and subtracting income paid from the country to investors abroad, gives the country's gross national product (GNP).

[2] GRDP can be calculated in three ways. The income method adds the income of residents (individuals and firms) derived from the production of goods and services. The output method adds the value of output from the different sectors of the economy. The expenditure method totals spending on goods and services produced by residents, before allowing for depreciation and capital consumption. As one person's output is another person's income, which in turn becomes expenditure, these three measures ought to be identical. They rarely are, because of statistical imperfections. Furthermore, the output and income measures exclude unreported economic activity in the black economy, which may be captured by the expenditure measure.

The data are at the Kabupaten/Kota level (PDRB Kabupaten/Kota) and are available for both nominal and real GRDP. In addition, the GRDP is broken down into sectoral groupings. Table 2 shows how the three-digit sectoral codes in the raw data have been converted into the single-digit sectoral classifications in the dataset.
Table 2: Conversion from 3-digit classification of GRDP to GRDP items in the dataset

3-digit item (raw data, 2000-2007)               Item in the dataset (2000-2007)
Code  Sector                                     Code  Sector
100   Agriculture Total                          1     Agriculture Total
200   Mining and Quarrying Total                 2     Mining and Quarrying, Oil and Gas Manufacturing Total
210   Mining (Oil)                                     Mining (Oil)
220   Mining (Others)                                  Mining (Others)
230   Quarrying                                        Quarrying
300   Manufacturing Total
310   Manufacturing Oil and Gas                        Manufacturing Oil and Gas
320   Manufacturing Non-Oil and Gas              3     Manufacturing Non-Oil and Gas
400   Electricity, Gas & Water Supply Total      4     Electricity, Gas & Water Supply Total
500   Construction Total                         5     Construction Total
600   Trade, Restaurant & Hotel Total            6     Trade, Restaurant & Hotel Total
700   Transport and Communication Total          7     Transport and Communication Total
800   Financial Services                         8     Financial Services
900   Public Administration & Services           9     Public Administration & Services
998   Without Oil and Gas                              Without Oil and Gas
999   Gross Domestic Product                           Gross Domestic Product

In addition, there are dummy variables indicating whether or not the district has oil and gas (migas*, MIGAS*, D1 and D2) and whether or not this is the main sector. Table 3 provides a list of the key national income variables in the dataset.
cy            Real income (GRDP), base year = 2000
RGDPnoil_     Real income (GRDP) without oil & gas, base year = 2000
y             Nominal income (GRDP)
GDPnoil       Nominal income (GRDP) without oil & gas
agr_          Agriculture, GRDP
min_          Mining, Quarrying, Oil & Gas Manufacturing, GRDP
man_          Non Oil & Gas Manufacturing, GRDP
enr_          Electricity, Gas & Water Supply, GRDP
con_          Construction, GRDP
trd_          Trade, Restaurant & Hotel, GRDP
trs_          Transportation and Communication, GRDP
fin_          Financial Services, GRDP
ser_          Services, GRDP
Shagr         Share of agriculture in total GRDP
Shmin         Share of mining in total GRDP
Shman         Share of non oil & gas manufacturing in total GRDP
Shenr         Share of electricity in total GRDP
Shcon         Share of construction in total GRDP
Shtrd         Share of trade in total GRDP
Shtrs         Share of transportation in total GRDP
Shfin         Share of financial services in total GRDP
Shser         Share of services in total GRDP

Population data

There are three different sources of population data:
1. Interpolations from the Population Census;
2. Susenas data; and
3. The population measures used by the Ministry of Finance to calculate fiscal transfers.

In our analysis we use the first source because these are the official population figures published by BPS. However, the Susenas population figures are also provided in the dataset.

Economic Performance Variables

We calculate and include a range of measures of economic performance and growth over the period of the dataset. To calculate per capita growth we have used the Gross Regional Domestic Product (GRDP) divided by the population data from BPS. GRDP per worker is calculated using the estimate of the labour force from Susenas. [3] The measure of growth used in the analysis is the geometric growth rate over the period (i.e. (final value / initial value)^(1/number of periods) - 1). However, linear growth rates (i.e. year-on-year) are also calculated, as are logarithmic growth rates (i.e. [ln(final value) - ln(initial value)]/number of periods) and the average annual growth rate (i.e.
the mean of the annual growth rates).

[3] We do not use the Labour Force Survey, Sakernas, because it is inappropriate for calculating district level averages.

In addition, we have calculated the weighted (by GRDP) and unweighted average growth rates of the districts surrounding each district, to allow the exploration of spillovers between districts (see the GAU and GAW variables). See the section below on Consumption Expenditure for details of per capita consumption growth rates.

To get a sense of economic concentration (that is, whether the local economy is dominated by a particular sector), we also calculate the sectoral Gini coefficient (gini_sector). The Stata command for this variable is:

    egen gini_sector = inequal(sh_sector), by(year parent_336) index(gini)

or

    egen gini_sector = gini(sh_sector), by(year parent_336)

Table 4 shows a list of the key per capita and per worker economic performance variables.

PCY_          Income per total population
PCYnoil_      Income without oil & gas per total population
PLY_          Income per total workers
lnPLY_        Ln of per worker real GDP
lnPCY_        Ln of per capita real GDP
lnPCYnoil_    Ln of per capita real GDP without oil & gas
y0706         Linear growth of per capita RGDP, 2006-2007
y0605         Linear growth of per capita RGDP, 2005-2006
y0504         Linear growth of per capita RGDP, 2004-2005
y0403         Linear growth of per capita RGDP, 2003-2004
y0302         Linear growth of per capita RGDP, 2002-2003
y0201         Linear growth of per capita RGDP, 2001-2002
y_noil0706    Linear growth of per capita RGDP without oil & gas, 2006-2007
y_noil0605    Linear growth of per capita RGDP without oil & gas, 2005-2006
y_noil0504    Linear growth of per capita RGDP without oil & gas, 2004-2005
y_noil0403    Linear growth of per capita RGDP without oil & gas, 2003-2004
y_noil0302    Linear growth of per capita RGDP without oil & gas, 2002-2003
y_noil0201    Linear growth of per capita RGDP without oil & gas, 2001-2002
yl0706        Linear growth of income per labor, 2006-2007
yl0605        Linear growth of income per labor, 2005-2006
yl0504        Linear growth of income per labor, 2004-2005
yl0403        Linear growth of income per labor, 2003-2004
yl0302        Linear growth of income per labor, 2002-2003
yl0201        Linear growth of income per labor, 2001-2002
gy0701        Geometric average growth 2001-2007; post-decentralization income per capita
gy0501        Geometric average growth 2001-2005; post-decentralization income per capita
gy_noil0701   Geometric average growth without oil & gas 2001-2007; post-decentralization income per capita
gyl0701       Geometric average growth 2001-2007; post-decentralization income per labor
lny0701       Logarithmic growth 2001-2007; post-decentralization income per capita
avy0701       Average growth 2001-2007; post-decentralization income per capita
ary0701       Arithmetic growth 2001-2007; post-decentralization income per capita
GAW0107       Weighted average growth of neighbouring districts, 2001-07
GAW0105       Weighted average growth of neighbouring districts, 2001-05
GAUn0107      Unweighted average growth of neighbouring districts, 2001-07
GAUn0105      Unweighted average growth of neighbouring districts, 2001-05
gini_sector   Gini coefficient of the structure of the economy, by GRDP

Socio-economic variables

The dataset contains a large number of socio-economic variables drawn from the Susenas Core datasets from 2001 to 2007.

Education outcome indicators

- Net Enrolment Rate

The net enrolment rate is the number of pupils of the theoretical school-age group who are enrolled at a given level of education (primary, junior secondary or senior high secondary), divided by the population of the same age group. The school-age groups are 7-12 years old for primary school, 13-15 years old for junior secondary school and 16-18 years old for senior high school.
\[ \mathrm{NER}_{\text{primary}} = \frac{\text{Number of pupils enrolled in primary school (7-12 years old)}}{\text{Population (7-12 years old)}} \]

\[ \mathrm{NER}_{\text{junior}} = \frac{\text{Number of pupils enrolled in junior school (13-15 years old)}}{\text{Population (13-15 years old)}} \]

\[ \mathrm{NER}_{\text{senior}} = \frac{\text{Number of pupils enrolled in senior high school (16-18 years old)}}{\text{Population (16-18 years old)}} \]

- Gross Enrolment Rate

The gross enrolment rate is the number of students enrolled at a given level of education (primary, junior secondary or senior high secondary), regardless of age, divided by the population of the school-age group for that level.

\[ \mathrm{GER}_{\text{primary}} = \frac{\text{Number of pupils enrolled in primary school}}{\text{Population (7-12 years old)}} \]

\[ \mathrm{GER}_{\text{junior}} = \frac{\text{Number of pupils enrolled in junior school}}{\text{Population (13-15 years old)}} \]

\[ \mathrm{GER}_{\text{senior}} = \frac{\text{Number of pupils enrolled in senior high school}}{\text{Population (16-18 years old)}} \]

Labour indicators

Since the Labour Force Survey, Sakernas, is designed to be representative only at national and provincial levels, it cannot be used to obtain district level averages. Therefore, we used the Susenas data to provide indicators of the labour force, employment and unemployment at the district level. (Unfortunately, the individual data from BPS for Susenas 2005 contain no information on labour issues, so labour data are missing for 2005.) Table 5 shows the questions (asked since 2001) that determine the classification of an individual as participating in the labour force, and as employed or unemployed. Figure 1 shows how these questions determine the classification.

Table 5: Questions that determine the classification into employed/unemployed

1. Did you work last week?
2. Did you work at least 1 hour last week?
3. Do you have a job or business in which you are currently not active?
4. Are you looking for a job?
5. Are you preparing a new business?
6. What are your reasons for not looking for a job/preparing a new business?
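The questions in Table 5 can be read as a simple decision rule. The Python sketch below is one hedged reading of that rule, following the grouping shown in Figure 1; it is not the code used to build the dataset. The function name, argument names and the two reason codes are illustrative assumptions, not Susenas variable names, and questions 1 and 2 are merged into a single `worked` flag.

```python
from typing import Optional

# Hypothetical codes for the answers to question 6; the actual
# Susenas questionnaire codes are not given in this document.
REASON_IMPOSSIBLE = "impossible_to_get_job"
REASON_NOT_STARTED = "has_job_not_started"


def classify(age: int, worked: bool, temporarily_absent: bool,
             looking_for_job: bool, preparing_business: bool,
             reason_not_looking: Optional[str] = None) -> str:
    """Classify one respondent from (assumed) answers to Table 5."""
    if age < 15:
        # The labour force definition covers persons aged 15 and over.
        return "below working age"
    if worked or temporarily_absent:
        # Questions 1-3: working, or has a job but is temporarily absent.
        return "employed"
    if looking_for_job or preparing_business:
        # Questions 4-5: looking for a job or preparing a business.
        return "unemployed"
    if reason_not_looking in (REASON_IMPOSSIBLE, REASON_NOT_STARTED):
        # Question 6: the two reasons that Figure 1 lists under unemployment.
        return "unemployed"
    return "not in labour force"


status = classify(age=22, worked=False, temporarily_absent=False,
                  looking_for_job=True, preparing_business=False)
```

Under these assumptions `status` is "unemployed"; a respondent giving any other reason at question 6 would fall outside the labour force.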
Figure 1: Classification of Employed/Unemployed

Population of 15 years old and above
    Employment
        - Working
        - Temporarily absent from work, but having a job
    Unemployment
        1. Looking for a job
        2. Preparing for work
        3. It’s impossible to get a job
        4. Already have a job, but have not started work yet

Definition of labour force

Labour force: persons aged 15 years and over who were working, were temporarily absent from work but had jobs, or did not have work and were looking for work:
1. Working: an activity done by a person who worked for pay, or assisted others in obtaining pay or profit, for at least one hour during the survey week.
2. Temporarily absent from work, but having a job: a person who had a job but was temporarily absent from work for some reason during the survey week.
3. Did not have work and looking for work: all persons who did not have any job but were looking for work during the survey week. This is usually called open unemployment.
4. Preparing for work
5. The reason for not looking for a job/preparing a new business:
   - It’s impossible to get a job
   - Already have a job, but have not started work yet

Field sector of work classification

Starting from 2001, BPS renewed its field of work classification system, moving from a simple 9 sector classification to a 3 digit KLUI system. To avoid possible errors in the coding of sub-sectors, we use only the first digit sectoral breakdown, as in the old coding system.
1. Agriculture sector
2. Mining and excavation
3. Manufacturing industry
4. Electricity, gas, and water
5. Building construction
6. Accommodation services
7. Transportation, storage, and communication
8. Financial institutions, real estate, and leasing
9. Public, social, personal services
10.
Activities that are not clearly defined

Because the raw Susenas data have information on labour issues for all individuals aged 10 or over, we have calculated the number of people working in each sector aged 10 or over, rather than 15 or over. We have also calculated the proportion of people living in urban areas using Susenas data. In addition, we include a measure of the concentration of the labour force (complementing the sectoral concentration of GDP above): the gini_secsus* variables. They are the Gini coefficients of the sectoral shares of employment (as opposed to GDP) for each district.

Consumption expenditure

To get an estimate of overall welfare, we use the per capita expenditure data from Susenas, i.e. household expenditure divided by household size. Household expenditure is divided into food and non-food consumption expenditure. The key variables were created in the following steps:
- Per capita consumption expenditure was created from the Susenas Core (household data). Average per capita expenditure per month is average household expenditure per month divided by household size; average annual per capita expenditure is average household expenditure per month multiplied by 12 and then divided by household size.
- The individual weights from the Susenas Core (individual data) were kept and merged with the household data.
- The data were then collapsed to give average per capita expenditure per month by district code (b1r1 b1r2), using individual weights.

Ethnic and Religious Fragmentation Indices

The dataset includes indices of ethnic and religious fragmentation, similar to those calculated by Easterly and Levine (1997). [4] They are calculated from the Population Census 2000, Indonesian Bureau of Statistics (BPS).

[4] Easterly and Levine (1997), “Africa’s Growth Tragedy: Policies and Ethnic Divisions”, Quarterly Journal of Economics, November 1997.

Ethnolinguistic Fractionalization (ELF): The ELF index is defined as follows:

\[ \mathrm{ELF} = 1 - \sum_{i=1}^{n} X_{ij}^{2} \]

\[ X_{ij} = \frac{\text{Population belonging to ethnolinguistic group } i \text{ in district } j}{\text{Population in district } j} \]

where \(X_{ij}\) is the population share of ethnic group i in district j. The index takes values between zero and one: a value close to 1 indicates a highly heterogeneous district and a value of 0 indicates a perfectly homogeneous district. The 2000 Census recorded 1,068 ethnic groups across the regions of Indonesia.

Religion Fractionalization (RF): In the 2000 Census only five religions are officially categorized, namely Islam, Catholicism, Protestantism, Hinduism and Buddhism, plus an “Other” category.

\[ \mathrm{RF} = 1 - \sum_{i=1}^{n} S_{ij}^{2} \]

\[ S_{ij} = \frac{\text{Population belonging to religion group } i \text{ in district } j}{\text{Population in district } j} \]

where \(S_{ij}\) is the population share of religion group i in district j. The index takes values between zero and one: a value close to 1 indicates a highly heterogeneous district and a value of 0 indicates a perfectly homogeneous district.

Table 6 shows the key variables drawn from the Susenas surveys.
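As a brief aside before the variable list: the ethnic and religious fractionalization indices defined above share one formula, one minus the sum of squared population shares within a district. The Python sketch below illustrates that formula only; the function name is an assumption made for illustration, and the group counts are made up rather than taken from the census.

```python
def fractionalization(group_counts):
    """Return 1 minus the sum of squared population shares for one district.

    group_counts holds the population of each ethnic (or religious)
    group in the district.
    """
    total = sum(group_counts)
    if total <= 0:
        raise ValueError("district population must be positive")
    return 1.0 - sum((count / total) ** 2 for count in group_counts)


# Made-up counts for illustration, not census figures.
one_group = fractionalization([1000])                    # perfectly homogeneous
two_equal = fractionalization([500, 500])                # two equal groups
four_equal = fractionalization([250, 250, 250, 250])     # four equal groups
```

With n equally sized groups the index equals 1 - 1/n, so it approaches 1 only as the number of groups grows. This is relevant when comparing the ethnic index (many groups) with the religion index (six categories).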
POPUR             Population in urban areas (Susenas)
popsus            Population (Susenas)
popeduc           Population aged >= 5 years (Susenas)
age_prim          People of primary school age, 7-12 years (Susenas)
age_secj          People of junior school age, 13-15 years (Susenas)
age_sech          People of senior school age, 16-18 years (Susenas)
Enrolp            People aged 7-12 enrolled in primary school (Susenas)
Enroly            People aged 13-15 enrolled in junior school (Susenas)
enrolls           People aged 16-18 enrolled in senior school (Susenas)
scl_sd            People ever/being in primary school (Susenas)
scl_smp           People ever/being in junior school (Susenas)
scl_sma           People ever/being in senior school (Susenas)
none_scl          People who never attended school
Scl               People ever/being in school
prim              People enrolled in primary school (Susenas)
secj              People enrolled in junior school (Susenas)
sech              People enrolled in senior high school (Susenas)
NER_sd            NER, primary school (Susenas)
NER_smp           NER, junior school (Susenas)
NER_sma           NER, senior school (Susenas)
enPRIM            Gross enrolment rate, primary school (Susenas)
enSECJ            Gross enrolment rate, junior school (Susenas)
enSECH            Gross enrolment rate, senior school (Susenas)
enSECT            Gross enrolment rate, junior + senior school (Susenas)
PRIM              Share of people ever/being in primary school in total population (Susenas)
SECJ              Share of people ever/being in junior secondary school in total population (Susenas)
SECH              Share of people ever/being in high secondary school in total population (Susenas)
lf_               Labor force (age >= 15 years) (Susenas)
employ            Employed population (Susenas)
unempl            Unemployed population (Susenas)
POPEMP            Number of people working (age >= 10 years) (Susenas)
AGR               Number of people working in agriculture (Susenas)
MIN               Number of people working in mining (Susenas)
MAN               Number of people working in manufacturing (Susenas)
ENR               Number of people working in energy & electricity (Susenas)
CON               Number of people working in construction (Susenas)
TRD               Number of people working in trade (Susenas)
TRS               Number of people working in transportation (Susenas)
FIN               Number of people working in finance (Susenas)
SER               Number of people working in services (Susenas)
OTHER_SECTOR      Number of people working in other sectors (Susenas)
SH_AGR            Share of agriculture workers in total population (Susenas)
SH_MIN            Share of mining workers in total population (Susenas)
SH_MAN            Share of manufacturing workers in total population (Susenas)
SH_ENR            Share of energy & electricity workers in total population (Susenas)
SH_CON            Share of construction workers in total population (Susenas)
SH_TRD            Share of trade workers in total population (Susenas)
SH_TRS            Share of transport workers in total population (Susenas)
SH_FIN            Share of finance workers in total population (Susenas)
SH_SER            Share of services workers in total population (Susenas)
SH_OTHER_SECTOR   Share of other sector workers in total population (Susenas)
sh_urban          Urbanization (portion of population that is urban) (Susenas)
pcexp             Average monthly per capita consumption (Susenas)
ypcexp            Average annual per capita consumption (Susenas)
Lnypexp2001       Ln of average annual per capita consumption 2001 (Susenas)
ypcexp0706        Linear growth of annual per capita expenditure, 2006-2007
ypcexp0605        Linear growth of annual per capita expenditure, 2005-2006
ypcexp0504        Linear growth of annual per capita expenditure, 2004-2005
ypcexp0403        Linear growth of annual per capita expenditure, 2003-2004
ypcexp0302        Linear growth of annual per capita expenditure, 2002-2003
ypcexp0201        Linear growth of annual per capita expenditure, 2001-2002
GAUnypcexp0107    Unweighted average growth of per capita expenditure of neighbouring districts, 2001-07
GAUnypcexp0105    Unweighted average growth of per capita expenditure of neighbouring districts, 2001-05
gini_sectsus      Gini coefficient of workers (>= 10 years old) by sector (Susenas)

Consumer Price Indices and Real Per Capita Expenditure

Figures on per capita expenditure per month for each district are in nominal terms. However, calculating welfare changes over time requires real values. BPS collects price data from a large number of cities (42 in 2002, now 63).
We matched each district with the nearest city for which price data were available and then deflated by the CPI for that city, using 2002 as the base year (the CPI variables are cpi<year>). Per capita expenditure figures were thus deflated to give real per capita expenditure figures (rpcexp) for each year. Finally, we calculated the geometric average growth of real per capita expenditure (gy_rpexp0701) between 2001 and 2007.

Farmers Prices

The dataset also contains data on the Farmers Terms of Trade, based on a survey of prices experienced by farmers for their inputs (IB*) and outputs (IT*).

Village characteristics data

The PODES (Village Potential) surveys provide information about village characteristics for all of Indonesia, covering approximately 65,000 villages (desa). These surveys are carried out in the run up to each of the periodic censuses (Agriculture, Economy, Population). [5] To get district level data we aggregated the PODES data up to the district level, and also calculated the proportion of villages in the district with particular characteristics.

[5] See: http://www.rand.org/labor/bps.data/webdocs/podes/podes.htm

- Road quality

There are two kinds of road variable. First, road_asphalt00 is the number of villages in the district where the main road surface is asphalt. Similarly, road_rocks, road_soil and road_other show the number of villages with other forms of road surface. SH_road_asphalt00 shows the proportion of villages in the district with asphalt roads. Second, PODES asks village heads perception questions about the quality of the roads on a scale from 1 to 4, where 1 represents the best road quality (paved road) and 4 the worst. The ROAD_00 variable is the average of these scores across all villages in the district. Hence the better the perceived quality of the roads, the closer the average is to 1, and the worse the quality of the roads, the closer the average is to 4.

- Geographical areas

Data are taken from the geographical location of a village.
In the original dataset, geographical location is categorized into four types: 1 is coastal area, 2 is valley or river side, 3 is hill, and 4 is plain land. These were used to create new variables showing the number of households living on each of these types of land in each year. In addition, SH_coastal00 etc. show the share of villages in the district that are coastal, and so on. We also created a dummy variable for landlocked districts, LLOCK_<year>, which takes the value 1 if none of the villages within a district is coastal and 0 if at least one village is coastal. Unfortunately, this is not consistent over time. We therefore manually reconciled those districts with an unclear classification to provide a single landlocked variable, LLOCK.

- Telephone Lines

The telephone line variable is defined as the number of telephones per household. It is calculated by dividing the total number of households in each district that are connected to telephone lines (TELP_<year>) by the total number of households in each district (hh_<year>).
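The district-level measures described above (the share of coastal villages, the landlocked dummy and telephone access per household) amount to simple aggregations of village records. The Python sketch below illustrates this under stated assumptions: the record layout, the field names and the output keys are hypothetical, the villages are made up, and only the location code 1 = coastal is taken from the text. It is not the code used to build the dataset.

```python
COASTAL = 1  # location code for a coastal village, as described in the text


def district_summary(villages):
    """Aggregate a list of village records (dicts) for one district."""
    n_villages = len(villages)
    n_coastal = sum(1 for v in villages if v["location"] == COASTAL)
    households = sum(v["households"] for v in villages)
    with_phone = sum(v["households_with_phone"] for v in villages)
    return {
        # Share of villages in the district that are coastal.
        "sh_coastal": n_coastal / n_villages,
        # Landlocked dummy: 1 if no village in the district is coastal.
        "llock": 1 if n_coastal == 0 else 0,
        # Connected households divided by all households.
        "telphh": with_phone / households,
    }


# Made-up villages for one hypothetical district.
example = [
    {"location": 1, "households": 200, "households_with_phone": 20},
    {"location": 4, "households": 300, "households_with_phone": 30},
    {"location": 3, "households": 500, "households_with_phone": 50},
    {"location": 4, "households": 1000, "households_with_phone": 100},
]
summary = district_summary(example)
```

In this made-up district one village in four is coastal, so the share is 0.25 and the landlocked dummy is 0; 200 of 2,000 households have a telephone connection, giving 0.1 per household.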
Table 7 provides a list of the key infrastructure variables from PODES.

SH_road_asphalt   Share of villages in the district with asphalt road (PODES)
SH_road_rocks     Share of villages in the district with rock road (PODES)
SH_road_soil      Share of villages in the district with soil road (PODES)
SH_road_other     Share of villages in the district with other road surface (PODES)
SH_coastal        Share of villages in the district on the coast (PODES)
SH_valey          Share of villages in the district in a valley (PODES)
SH_hill           Share of villages in the district on a hill (PODES)
SH_plainland      Share of villages in the district on plain land (PODES)
TELPHH            Telephone access per household (PODES)
LLOCK             Geo-location dummy: landlocked = 1, coastal = 0 (PODES)
ROAD              Road quality: 1 good to 4 worst (PODES)

- Natural Disaster

PODES has detailed information on the number of villages that have experienced various forms of natural disaster (drought, flood, volcanic eruption, earthquake, etc.). For some years there are data on the total frequency of such events. For 2008 there is information on the number of victims of such disasters and the losses that resulted.

- Crime

PODES also contains detailed information on whether a range of different crimes have been reported in each village (fighting, theft, robbery, mistreatment, arson, rape, drug abuse, murder, suicide, etc.). We therefore have variables indicating the number of villages in a district which reported each of these crimes for each PODES year, the share of villages reporting these crimes, and the share of villages reporting any kind of crime (SHCRIME<year>).

Distance and Remoteness

The dataset contains variables (jarak_) which show the straight line distance from the capital of the district to each of the largest cities, as well as to the provincial capital and the nearest large city. In addition, the dlaut_ dummy variables show whether it is necessary to get on a boat to reach these locations, i.e.
whether the district is separated by sea from these places.

Fiscal data

We also include some variables from the SIKD dataset maintained by the Ministry of Finance. These include data on revenues received by the district (rev*), as well as development (dev*) and routine (rtn*) expenditures.

The Governance Variables

The governance variables in the dataset (tz* and TZ*) are taken from the KPPOD/Asia Foundation survey of Local Economic Governance in 2007 (see weblink to report for full details of the methodology). This surveyed around 50 firms in each of 243 districts (further surveys have subsequently been done covering the remainder of the country, but these are not included in this dataset). [6] The 243 districts consisted of all regencies and cities in 15 provinces. The firm survey is designed to be representative of all non-primary sector private sector firms with 10 employees or more. [7] The firm survey asked questions about nine major aspects of local economic governance (Box 1), chosen to be consistent with theories of how local economic governance should influence local economic performance. These aspects of local economic governance were chosen because they are directly under the control of the local government.

Box 1: Key Aspects Of The Local Economic Governance Survey 2007

- Access to Information
- Access to Land and Security of Tenure
- Business Licensing
- Local Government and Business Interaction
- Business Development Programs
- Capacity and Integrity of the Mayor/Regent
- Local Taxes, User Charges and Other Transaction Costs
- Local Infrastructure
- Security and Conflict Resolution

[6] In addition to the firm survey in each district, two or three business associations were interviewed and local regulations (perda) were collected. These are not included in our dataset.

[7] The sampling method was chosen to make the results representative at the district level.
Within each district, small (10-19 employees), medium (20-99) and large (100+) firms were sampled roughly in proportion to their presence in the district. Within each size class, firms were sampled from three aggregate sectors (production, trade, and services), again in proportion to their presence in the local economy.

In order to aggregate different variables into a single sub-index for each of the concepts in Box 1, it was necessary to put them all on the same scale. The EGI therefore uses within-country comparisons of best and worst practice as its key benchmarks. That is, district performance is measured against a scale that is determined by the best and worst performance of other districts in Indonesia. For example, the speed of licensing in each district is measured against the fastest and slowest licensing procedures of other districts in Indonesia. Using current best and worst practice to define the benchmark ensures that districts are compared against a standard that is relevant for Indonesia and achievable (because it has actually been achieved by other districts). Given this choice of benchmarks, each sub-index is calculated as follows:

1. Determine the variables chosen to represent each sub-index and calculate a district level mean for each variable

Each sub-index represents a particular concept, as shown in Box 1. The variables used in constructing each sub-index must therefore reflect that concept. In addition, to ensure that meaningful comparisons can be made across different districts, the response rate in each district must be high enough for us to have confidence in the average value calculated.
For example, the variables used in the Land Access and Security of Tenure sub-index are:
- Time taken to obtain a land certificate
- Perceived ease of obtaining land
- Frequency of evictions in the region
- Perceived frequency of land conflicts
- Overall assessment of the significance of land problems

The time taken to obtain a land certificate and the perceived ease of obtaining land are useful variables for assessing the ease of accessing land, while the frequency of evictions and the frequency of land conflicts are useful indications of the level of insecurity about land tenure. For all sub-indices derived from the firm survey, firms were also asked to give their overall assessment of the extent to which that issue was a constraint on their activities; this is the overall assessment variable.

The “hard” data on the time to obtain a land certificate is measured in weeks. Perception data on the other variables typically used a simple 4-point scale. For example, when firms were asked about the ease of obtaining land they were given the choice of: very difficult (1), difficult (2), easy (3), and very easy (4). Similarly, when firms were asked about the likelihood of eviction they chose between extremely likely, likely, unlikely, and very unlikely. To obtain a general picture, the average values of each of these variables were calculated for the district.

2. Normalize the average values by calculating a tz-score

Once district average values have been calculated for each variable in a sub-index, it is necessary to normalize these values so that they can be compared. Without some mechanism for standardizing or normalizing the variables, it is impossible to combine a mean value of, say, 20 days to obtain a land certificate with a mean score of, say, 3.2 for perceptions about the likelihood of eviction, because these variables use different scales of measurement. We therefore use a simple normalization to put all the variables into the same scale.
This is achieved by putting all variables on a scale from 0 to 100, where 0 indicates the worst district for that variable and 100 indicates the best district for that variable. The normalization used to calculate the tz value for each variable is:

$$tz = 100 \times \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

where $x$ is the average value of the variable for the district; $x_{\min}$ is the lowest average value of the variable across all the districts; and $x_{\max}$ is the highest average value across all the districts. This generates, for every variable, a value between 0 and 100 indicating where the district lies on the scale of worst to best districts for that variable.

Example: The tz_score for the time to obtain a land certificate in the City of Bima

The average time reported by firms to obtain a land certificate in the City of Bima is 19.3 days. The minimum time across all 243 districts is found in the Regency of Timor Tengah Timur in Nusa Tenggara Barat, where it takes only 4.17 days to obtain a certificate. By contrast, the longest process can be found in the City of Cimahi in West Java, where it takes 42 days on average to obtain a land certificate. The tz_score for the City of Bima for the “time to obtain a land certificate” variable is therefore

t = 100 × (19.3 - 4.17)/(42 - 4.17) = 40.05

Reversing the scale

The variables in the index are constructed so that a higher score indicates better performance. However, for some variables, larger numbers indicate worse performance. For example, a longer time to obtain a land certificate makes access to land more difficult rather than easier. For such variables, the tz_scores are reversed simply by calculating t_rev = 100 - t. Thus, for the Bima example the final tz_score is t_rev = 100 - 40.05 = 59.9. This indicates that the City of Bima is 60% along the scale that runs from the worst performing district to the best performing district.

3.
Average the tz_scores for the variables in the sub-index

Once the tz_scores have been calculated for all variables in a sub-index (and reversed when appropriate to ensure that larger scores correspond to better performance), they can be simply averaged to obtain an overall score for the sub-index (TZ_*).

Example: the Land Access and Legal Certainty Sub-index for the City of Bima

The tz_scores for each of the variables in the land sub-index for the City of Bima are shown below:

Variable                                                      tz_score
------------------------------------------------------------  --------
Time taken to obtain a land certificate*                      59.9
Perceived ease of obtaining land                              35.7
Frequency of evictions in the region*                         48.9
Perceived frequency of land conflicts*                        44.3
Overall assessment of the significance of land problems       18.7
Land Access and Legal Certainty Sub-index for City of Bima    41.5

* These variables have had their tz_scores reversed.

Finally, in the original Local Economic Governance Report these sub-indices were combined into an aggregate Economic Governance Index by weighting each of the sub-indices by their perceived importance to firms. We have not done this in the dataset since researchers may wish to combine sub-indices in different ways for different purposes (with or without weights). For full details of individual variables see the corresponding question (indicated in the variable label) in the EGI Firm Survey questionnaire, which is attached as an Annex of this document.

For comparison, the dataset also contains a different set of governance sub-indices produced by KPPOD in 2002 in an earlier version of this survey. These use a different methodology and are not directly comparable but are included for reference (see KPPOD 2002 for more details).

Political variables

The dataset also contains political variables about the names and parties of the district leaders. These are drawn from the Asia Foundation’s database on election monitoring.
The variables show the name of the Bupati (nm_bup) (including Walikota); their party (party_bup); as well as an indication of the system under which they were elected/appointed. These could be New Order (i.e. appointed under the Suharto regime); indirectly elected (the system in place immediately after the fall of the regime); directly elected (from 2005 onwards); and caretaker leaders (e.g. when a leader is sick or away for some reason). There are also dummy variables for each of these categories.

In addition, there is a range of variables on the incidence and extent of corruption, based on media reports (CORN). These are based on ICW data for 2004.

Data from the Industrial Census (Survei Industri) and the Investment Coordinating Board

BPS carries out an industrial census of large (more than 100 employees) manufacturing enterprises each year. From this data we include variables on:
- The number of large manufacturing firms (firm*)
- The value of their production (YPRVCU*)
- The total number of workers (LTLNOU*)
- Value-added (VTLVCU*)
- Investment (FTTLCU*)

In addition, we calculated a measure of economic concentration, the Herfindahl-Hirschman Index, to show how concentrated production, employment, value-added, and investment are in each district (HHI*):

$$H = \sum_{i=1}^{N} s_i^2$$

where $s_i^2$ is the squared share of firm $i$ in the variable of interest (production, employment etc.) and $N$ is the number of firms in the district. A high value of the HHI shows a high level of concentration. We also calculated the share of production, workers, value-added and investment accounted for by the top 1 firm and the top 5 firms in each district.

The dataset also includes some variables drawn from the Investment Coordinating Board’s data on investment licenses. This includes: the total number of licenses (izin) issued for foreign direct investment (pma) or domestic investment (pmdn) by the Investment Coordinating Board, as well as the total value of investment for the firms in the district.
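
The index calculations described in this document (the tz-score normalization with optional reversal, the unweighted sub-index average, and the Herfindahl-Hirschman Index with top-firm shares) can be illustrated with a short Python sketch. This is not the code used to build the dataset. The function names are hypothetical, the four-firm district is invented purely for illustration, and the use of shares expressed as fractions between 0 and 1 (rather than percentages) in the HHI is an assumption, since the documentation does not state the scale.

```python
from statistics import mean


def tz_score(x, x_min, x_max, reverse=False):
    """Min-max normalise a district mean onto a 0-100 scale.

    Set reverse=True for variables where a larger raw value means worse
    performance (e.g. time to obtain a land certificate).
    """
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    t = 100 * (x - x_min) / (x_max - x_min)
    return 100 - t if reverse else t


def sub_index(tz_scores):
    """Unweighted average of the tz_scores that make up one sub-index."""
    return mean(tz_scores)


def hhi(values):
    """Herfindahl-Hirschman Index: sum of squared firm shares.

    Assumption: shares are fractions in [0, 1], so the result lies in (0, 1].
    """
    total = sum(values)
    if total <= 0:
        raise ValueError("total must be positive")
    return sum((v / total) ** 2 for v in values)


def top_share(values, k):
    """Share of the district total accounted for by the k largest firms."""
    total = sum(values)
    if total <= 0:
        raise ValueError("total must be positive")
    return sum(sorted(values, reverse=True)[:k]) / total


# City of Bima, using the rounded figures quoted in the text.
bima_time = tz_score(19.3, 4.17, 42, reverse=True)

# Land sub-index for Bima from the five tz_scores in the table.
bima_land = sub_index([59.9, 35.7, 48.9, 44.3, 18.7])

# Hypothetical district with four firms (production values are invented).
example_production = [50, 30, 10, 10]
example_hhi = hhi(example_production)
example_top1 = top_share(example_production, 1)
```

With the rounded inputs quoted in the text, `bima_time` comes out at approximately 60.0 rather than exactly the reported 59.9; the small difference presumably reflects rounding in the published figures. The sub-index average of the five tabulated tz_scores reproduces the reported 41.5.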
ACCESSING THE DATA

The dataset, this documentation and the final report on district level governance and growth can be downloaded from both the KPPOD website www.kppod.org