Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
The location of the Italian manufacturing industry, 1871-1911: a sectoral analysis Roberto Basile∗ Carlo Ciccarelli† May 2015 Abstract Using a new dataset on value added at 1911 prices at province level for 12 industries, we analyze the spatial location patterns of manufacturing activity in Italy during the period 1871-1911. We test the eect of domestic market potential and factor endowment, focusing on water supply. The results show that, as transportation costs decreased and barriers to domestic trade were eliminated, Italian provinces became more and more specialized, and manufacturing activity increasingly concentrated in a few provinces, mostly belonging to the North-West part of the country. The estimation results corroborate the hypothesis that both comparative advantages (water endowment eect) and market potential (home-market eect) have been responsible of this process of spatial concentration. The location of some traditional industries characterized by a low capital-labor ratio (such as clothing and wood) was mainly driven by water endowment, while the location of fast growing new sectors characterized by a medium/high capital labor ratio (such as engineering, metal-making, chemicals, textile and paper) was mainly driven by the domestic market potential. Keywords : Market potential, Factor endowment, Concentration, Italy. Jel codes : R12, R15, N13 PAPER PREPARED FOR THE EHES 2015 CONF. - PISA 4-5 SEPT. 2015. PRELIMINARY AND INCOMPLETE VERSION - PLEASE DO NOT QUOTE WITHOUT AUTHORS PERMISSION. ∗ Department of Economics, Second University of Naples, Corso Gran Priorato di Malta, 1 - 81043, Capua (CE), Italy. † Email : [email protected] Department of Economics and Finance, University of Rome Tor Vergata, Via Columbia 2, 00133 Rome, Italy. [email protected] 1 Introduction Economic theory suggests that the spatial distribution of natural advantages and market access are the key determinants of industrial location. Regional dierences in natural endowments (such as water supply, climate, and mineral deposits) contribute to determine regional comparative advantages and, therefore, regional specialization. An uneven spatial distribution of market access encourages rms to concentrate in regions with higher market potential, to benet from increasing returns, and to export goods and services to other regions. These agglomeration (or centripetal) forces tend to contrast market competition (or centrifugal) forces arising from the concentration of rms and inducing to both lower local market prices and higher local factor prices. The equilibrium be- tween these osetting forces mainly depends on the degree of market integration and transportation costs. Under autarky, each region produces essentially the goods that consumes, the location of industries is stable, and the level of industrial concentration is low. When trade costs decrease and product markets tend to integrate, the neoclassical trade model (Heckscher-Ohlin) predicts that regional specialization will arise as regions produce and export products that are relatively intensive in their abundant resources (comparative advantages). Moreover, when transportation costs decrease, the New Economic Geography (NEG) literature predicts that manufacturing activities characterized by increasing returns to scale tend to concentrate in the regions with higher demand (home-market eect), while the remaining regions suer de-industrialization. Therefore, a regional division of labor spontaneously arises through a process of uneven development. The relative eect of comparative advantages (factor endowment) and increasing returns (market access) on industrial location can be hardly quantied using empirical data referring to modern economies, as the two factors tend to coexist and interact in a very complex way along with the eect of (endogenous) policy interventions. MidelfartKnarvik and Overman (2002) show for instance that the European industrial policy inuenced strongly the industrial location patterns across the EU regions; Basile, Castellani, and Zanfei (2008) also provide evidence that EU Structural funds aected signicantly the location choice of multinational rms in Europe. But if industrial policies are endogenously driven by the actual spatial distribution of economic activity, then it can be quite hard to quantify the genuine (net of industrial policies) eect of comparative advantages and/or market potential on industrial location. However, the use of historical data covering the years following the political unication of a country (that is when the domestic market integration was full of obstacles, and no systematic national industrial policy took place), may provide an opportunity to better contrast the dierent explanations for the spatial concentration of industry and, in particular, to appreciate the increasing role of the home-market eect. Examples of studies in this direction are Wolf (2007), for the case of Poland, Rosés (2003), for the case of Spain, and A'Hearn and Venables (2013), for the case of Italy. In particular, the latter explores the interactions between external trade and regional disparities in the Italian economy since its unication (1861). The authors argue that the economic superiority of Northern regions over the rest of the country was initially based on natural advantages (in particular the endowment of water), while from the late 1880s onwards domestic market access became a key determinant of industrial location, inducing fast growing new sectors (such as engineering) to locate in regions with a large domestic market, that is, essentially, in the North. From 1945 onwards, with the 2 gradual process of European integration, foreign market access represented instead the decisive factor; and the North, again, had the advantage of proximity to these markets. Following these recent contributions, in the present paper we analyze the spatial location patterns of industrial activity in Italy during 1871-1911. During this period the country experimented a strong process of domestic market integration, with the dismantling of institutional barriers to domestic trade, the adoption of a common currency (with the national central bank, the Bank f Italy, instituted in 1893), the harmonization of local institutions and the creation of a national railway network. Going along the line of reasoning put forwards by A'Hearn and Venables (2013), we test the hypothesis that the process of internal economic integration has amplied the regional asymmetries in domestic market accessibility, thus aecting the industrial location, especially for the fast growing modern sectors, characterized by increasing returns technologies. On the other hand, the lack of a modern electricity transmission system during the sample period imposed the need to exploit local sources of power. In the Italian case, as widely pointed out (see, for instance, Cafagna, 1989; Bardini, 1998; Fenoaltea, 2011), abundance of water 1 represented an essential source of power in a country without coal. The second hypothesis is therefore that the location of manufacturing activity was strongly correlated to water supply. We test the relative importance of factors' endowment (water supply) and market potential in the early phases of Italian industrialization using recently developed value added data at 1911 prices at the provincial NUTS-3 level (69 provinces) for the following 12 manufacturing sectors: foodstus, textile, tobacco, clothing, leather, wood, non-metallic mineral products, engineering, metalmaking, chemicals, paper, and sundry manufacturing (Ciccarelli and Fenoaltea, 2013, 2014). In other words, we try to assess the arguments developed by A'Hearn and Venables (2013) using the most detailed information (both from the territorial and the sectoral point of view) on the spatial distribution of economic activity available for the post-unication period. Our results clearly show that as transportation costs decreased and barriers to domestic trade were eliminated, Italian provinces became more and more specialized, and manufacturing activity became increasingly concentrated in a few provinces, mostly belonging to the North-West part of the country. The estimation results corroborate the hypothesis that both comparative advantages (water endowment eect) and market potential (home-market eect) have been responsible of this process of spatial concentration. The location of some traditional industries (such as clothing) was mainly driven by water endowment, while the location of fast growing new sectors (such as engineering, metalmaking, chemicals, textile, and paper) was increasingly driven by the domestic market potential. The rest of the paper is organized as follows. Section 2 describes the basic historical and economic features of the manufacturing industry. distribution and diusion. Section 3 illustrates its spatial Section 4 presents the estimation results of a Geoadditive model of industrial location considering factor endowment and market potential. Section 5 concludes. 1 Fernihough and O'Rourke (2014) consider the importance of geographical proximity to coal as a factor underpinning comparative European economic development during the Industrial revolution. 3 2 Setting the scene Figure 1 illustrates the division of the Italian territory that roughly prevailed during the 1815-1860 period. Before the country's unication in 1861 the Italian territory was divided in seven States (Kingdom of Sardinia, Kingdom of Lombardy-Venetia, Duchy of Parma and Piacenza, Duchy of Modena and Reggio, Gran Duchy of Tuscany, Papal States, and the Kingdom of Two Sicilies) characterized by extremely dierent institutions and economic policies The coin was dierent among pre-unitarian states. Primary school was mandatory in certain states but not in other and trade-policy diered considerably. 2 Figure 1 Italian provinces at 1911 borders grouped into pre-unitarian States (1850 ca.) Italy was unied in 1861, although Venetia and Latium were annexed to the country only in 1866 and 1870, respectively. Between 1861 and 1870 the national capital was moved from Turin to Florence, and nally to Rome. Soon after the political unication, policy makers realized that there was an urgent need of statistical evidence. Population censuses every ten years were established, and dozens of annual reports to the Italian Parliament, and other ocial publications concerning the main economic sectors (public budget and taxation, international trade, railroads, public school system) were regularly produced. The new ocial historical statistics divided the Italian territory into 16 regions compartimenti, ( province, roughly NUTS-2 units), and 69 provinces ( roughly NUTS-3 units). The borders of these administrative units did not change during 1871-1911. As we will document along the rest of the paper, economists and economic historians largely agree in considering factor endowment and market potential as the main factors that inuenced industrial location in nineteenth century Italy. The present study 2 The analysis of the impact of these economic and institutional dierences among pre-unitarian states on the industrial and more generally socioeconomic evolution of post-unication Italy is clearly beyond the scope of this paper. Felice (2014) provides an account of the long-term role played by institutions on regional growth and development in Italy. The author essentially follows Acemoglu, Johnson, and Robinson (2004) and argues that extractive institutions historically prevailed in southern regions, while inclusive institutions characterized essentially the North of Italy. 4 acknowledges the point, and provides a rst attempt to quantify the matter within a proper statistical model estimated at the province level. 2.1 Factor endowment: the role of water supply Factor endowment matters economically depending on the prevailing technology. In 1862, the Francesco Rossi wool mill was built in Schio (province of Vicenza, in the North-East of Italy). The factory was 80 meters in length, 13.9 in width, with 6 oors above the ground, 125 cast-iron columns, 300 windows, with all the convenience of water and steam for normal use and lost steam heating. In the rst oor there were 13 mule-jennies, alternatively opening and closing in three rows, with 3,600 spindles. The cutting took place on the second oor. On the third oor there were 60 Jacquard looms. Under the roof there was a hall in which a hundred and twenty women did the darning. 3 The Rossi wool factory in Schio represented an example of vertical-factory, that is a particularly tall buildings characterizing the industrial architecture of the time. Far from being an isolated case, these vertical-factories became increasingly diused in the North (the Cantoni cotton 4 factory in Lombardy provides another famous example). The main raison d'être of the vertical factories was the need to have machines as close as possible to the drive shaft of the driving force, to avoid energy loss, and to obtain a regular movement of machinery. So a single tree moved by the force of the water (or of the steam) could serve many overlapping levels in each of which the machines could not be too far away from the power source. As ecaciously summarized by Negri and Negri (1981), pp. 138-139 The placement of the rst industrial complexes in the territory is strongly determined by the availability of water power. It is a waterfall in fact that provides the energy that guarantees - through a system of wheels and shafts - that the machines are in constant motion. 5 The central role of natural endowment (water, above all) for industrial location in 19th century Italy was recently reiterated by Fenoaltea (2011), providing perhaps the most careful and sharp account of Italian industrialization over the 1871-1911 period. According to Fenoaltea, the roots of the success of the Northern regions seem [...] environmental rather than historical (p. 231): factor endowment thus, more than socioeconomic variables such as human and social capital. Modern (or factory-based) system of production replaced gradually artisanal production, and gave a strong advantage to the locations with a year-round supply of water (for power, and also, in the specic case of textiles, for the repeated washing of the material); and in Italy such locations abound only on the northern edge of the Po valley, where the Alpine run-o osets the lack of summer rain. And, when analyzing the evolution of the location of the textile industry during 1871-1911, Fenoaltea (2011, p. 232) mentions among the natural advantages of northern regions the water that owed in the rivers, the water suspended in the air [...] and the presence of mountain glaciers. 3 The description of the Rossi wool factory summarized in the text was made by Francesco Rossi himself in a letter of 22nd April 1863 to his friend Fedele Lampertico and available at http://www.schioindustrialheritage.com/uk, last accessed June 2015. 4 The Biellese, Vicentino, Novarese, Varesotto, Brianza, Cremasco, Val Trompia, and Val Brembana industrial districts in Northern Italy are examples of little Manchester considered in Cafagna (1999). A detailed account on the matter can be also found in Romano (1990), pp. 209 . 5 On this point also see Cafagna (1989, p. 196). 5 To measure the relative water endowment at provincial level during 1871-1911, the present study considers ultimately two variables. 6 The rst one refers to the economic "relevance" of Italian rivers. The historical source Annuario Statistico Italiano (1886, pp. 22-27) provides a list of 100 rivers owing through the 69 Italian provinces. These rivers are here ranked, with a value ranging from 1 to 5, according to the weight assigned to them by the experts of the Italian Military Geographic Institute (IGM) in a publication 7 that culminates a research project started in the 1960s (IGM, 2007). The importance (or weight) of each river is established on the base of the length of each river and its socioeconomic relevance (IGM, 2007, preface). In the IGM ranking, the value of 1 means "high relevance", while the value of 5 means "low relevance". Thus, our rst measure of water endowment in each province i is: RIV ERi = where k P k∈i (6 − IGM rankk ), refers to each river owing through province i.8 The second variable refers to presence of waterfalls in the various provinces (source: Pavolini, 1985).Interestingly, as we will document in our econometric analysis, the interaction between the continuous variable RIV ERi and the waterfall dummy variable proved to add signicant help in understanding of industrial location of certain manufacturing sectors, and supports nicely the need for industrial use of a year-round supply of water considered by Fenoaltea (2011). The provincial distribution of these water-related variables is illustrated in Figure 2. The map on the left panel (based on the IGM data) shows a rather homogeneous distribution of rivers among Italian provinces. The only exception are the provinces in the deep South of continental Italy. The map in the right panel (based on Pavolini, 1985) shows instead that the main Italian waterfalls where concentrated in northwestern provinces. To summarize, the rst factories, above all wool and cotton mills, were located where the rivers were steeper. They moved gradually towards the plain with the adoption of steam rst, and of the electricity then. Over time, the original riverforestvertical-mill triad was gradually replaced by large horizontal -factories located in plain, and often bordering the main urban settlements. The point is clearly stated in Ciccarelli and Fenoaltea (2013), pp. 81-82 noticing that Already in the nineteenth century the spread and increasing eciency of the steam engine had meant that energy could be moved by moving fuel, that power could be generated anywhere; but only electricity brought the eective separation of generation and use, and by the early twentieth century power could be economically transmitted over previously inconceivable distances. Industrial location pulls were revolutionized: the most energy intensive industries alone remained tied to the waterfalls, most manufacturing could protably move closer to the sources of the raw materials, closer to the market, saving on the transportation of goods at a now aordable cost in the transmission of energy. In practice, at the margin industry abandoned the remote sources of power for the major nodes of communication, the centres of trade, in short for the very locations that had already nurtured the largest urban concentrations. 6 See the Appendix for details on sources and methods. 7 The Military Geographic Institute (IGM), created in 1861, is the national mapping agency for Italy. 8 The IGM (2007) list includes, of course, the rivers Strona (in Piedmont), Olona (in Lombardy), and Astice (in Venetia) entered in the Italian economic history for the role they played in the early steps of Italian industrialization Cafagna (1989), p. 196. 6 (a) Rivers Figure 2 Water endowment: (b) Waterfalls panel (a) shows the geographical distribution of the RIV ERi variable, a measure of the "relevance" of the rivers owing through province i; panel (b) shows the geographical distribution of the number of waterfalls. The previous arguments lead us naturally to consider the role played by domestic market potential. 2.2 Domestic market integration during the post-unication period In order to analyze the importance of domestic market access as a key driver of the spatial distribution of economic activity (as suggested by the NEG theory; Fujita, Krugman, and Venables, 1999; Combes, Mayer, and Thisse, 2008), a sound measure of accessibility to demand is required. In line with Crafts (2005), we constructs market potential estimates for each Italian province i between 1871 and 1911 using Harris (1954)'s formula, that is as a weighted average of total value added of all provinces M KT P OTit = N X V Ajt j=1 with dij the great circle distance in km dij , j: j 6= i (1) between the centroids of provinces i and j. In practice, this indicator equates the potential demand for goods and services produced in a location with that location's proximity to consumer markets. Thus, it can be interpreted as the volume of economic activity to which a region has access after having subtracted 9 the necessary transport costs to cover the distance to reach all the other provinces. An important point is that historical GDP estimates at the provincial level for the case of Italy are not available. We thus proxy for provincial GDP by allocating total regional GDP (NUTS-2 units) estimates for 1871, 1881, 1901, and 1911 to provinces (NUTS-3 9 The trade cost to access the market internal to the province is assumed to be proportional to one p third of the radius of a circle with an area equal to that of the province, i.e. 7 dii = 2/3 Areai /π . 10 units) using the provincial shares of regional population obtained by population census. As shown in Figure 3, the values of Harris market potential appear to be more and more concentrated in the northwest. It is important to stress that within northern regions there 11 is a substantial amount of heterogeneity. In Lombardy, the undisputed economic leader among Italian regions, only the provinces of Milan (MI) and Como (CO) neighboring the region of Piedmont (in the very northwest of Italy and bordering France) score particularly well. 12 Similarly in Liguria only the province of Genoa registers high market potential while Porto Maurizio (PM) does not. But there is more than that. Estimated market potential is also high in Florence, Rome, and Naples, that is the provinces with the preunitarian city capitals of the Gran Duchy of Tuscany, the Papal States, and the Kingdom of Two Sicilies. This last evidence matches perfectly well with the intuition by Fenoaltea (2003) that in his (NUTS-2) study on Italian regional industrialization noticed that in the early 1870s The industrial, manufacturing regions are those with the former capitals, of the preceding decades and centuries and that In such a context the appropriate unit of analysis is not in fact the region, but (in Italy) the much smaller province. The evolution of trade-policy and the improvement of transport and communication technologies have inuenced the dynamics of transport and trade costs and, thus, the dynamics of the domestic market potential. The historical literature generally agree on the following periodization of the post-unication trade policy: i) the free-trade period: 1861-1878; ii) the start of protectionism: 1878-1887; iii) the full embracement of protectionism: 1887-1913. As any periodization, the one proposed has of course a certain degree of arbitrariness. During the free trade period (1861-1878) the mild tari of Piedmont (the political core of the Kingdom of Sardinia) was extended to the rest of the country. Southern Italy, where protectionism was the rule under the Kingdom of Two Sicilies, was particularly aected. Table 1 summarizes this for selected products. The eects on the economy of Southern regions of the mild tari in force in Italy in the aftermath of the unication has been evaluated in dierent ways. Certain scholars argue that it was particularly detrimental for Southern industry (see, for instance Milone, 1953), other scholars points rather to its positive eects (see, for instance, Ciccarelli and Fenoaltea, 2013). In 1878 and 1887 import duties were generally increased, and the historical literature focused in particular on that on wheat, iron, textile, and sugar (see Federico, 13 Natoli, Tattara, and Vasta, 2011; Fenoaltea, 2011, for an exhaustive account). Ciccarelli and Proietti (2013) discuss the possible eect of protectionism on industrial specialization of Italian provinces during 1871-1911 by means of a graphical tool named dynamic 10 GDP estimates at historical NUTS-2 borders have been kindly provided by Emanuele Felice. These estimates dier from the ones published by same author (Felice, 2013), which are at current borders. 11 This is, we believe, a neat example of the advantage of using provincial (NUTS-3) over regional (NUTS-2) gures. 12 Up to the late 1880s, when Italy started a nonsense trade-war with France, nearly a half of Italy's total exports, albeit rather limited in size, were directed to the France market (Federico, Natoli, Tattara, and Vasta, 2011, pp. 42-43). 13 The Italian import duties of the time were essentially tied to the physical weight of the imported goods, and not to their value. As a consequence their eectiveness depended, other things being equal, on the price level. The price level was lowered in the last decades of the 19th century, reach a minimum in 1895 ca, and start growing rapidly between 1900 and 1913. As a result the degree of protectionism on Italian goods was, for a given structure of import duties, less eective during the pre-WWI years. 8 belle époque of the (a) Year: 1871 (b) Year: 1881 (c) Year: 1901 (d) Year: 1911 Figure 3 Harris market potential 9 Table 1 Import duties on selected products before and after the Unication (lire per 100 kg) (1) (2) (3) PRE-1861 a South POST-1861 North b Italy c TEXTILES: cotton: yarn 28 20 10 cloth 90 75 40 wool: ber 21 0 0 yarn 193 60 40 0 300 140 8 2.48 0 raw sugar 40 17.85 18 coee 47 34.96 30 bar iron 18 0 0 cloth OTHER PRODUCTS: cereals a South denotes the tari of the Kingdom of Two Sicilies; of Sardinia; c b North denotes the tari of the Kingdom Italy denotes the tari in force in Italy after the 1863 Franco-Italian trade-agreement. Source: C. Correnti, P. Maestri, Annuario Statistico Italiano. Anno II - 1864 (Turin, Tipograa Letteraria, 1864), p. 513. specialization Biplot. 14 The study suggests that the adoption of protectionism during the late nineteenth century did change, at least for industrial sectors like the textile one, the specialization trajectory of selected northern provinces. Protectionism generally reduce the international ows of commodities. Despite rising import duties, especially after the 1887 tari, the share of export over GDP increased from about 6.5 percent in 1862-1887 to about 9.3% in 1887-1914, while the share of import over GDP increased from about 8% in 1862-1887 to about 12% during the same period. 15 In 1862, more then 90% of Italian imports were from European countries, while in 1911 only 69%. Imports from USA increased from about 5% in 1862 to about 20% in 1911 (Federico, Natoli, Tattara, and Vasta, 2011, p. 32). The temporal evolution of the structure of the Italian international trade is also particularly informative on the transformation of the industrial system. In the aftermath of the country's unication the most signicant imports were represented by textile goods (of cotton, wool and silk), while imports of industrial raw material (coal, raw cotton) were of relatively reduced importance. In the pre-WWI years, raw cotton and coal represented instead the most signicant items among Italian imports. Over time thus, following a standard process of import-substitution, the imports of nal goods was replaced by that of raw materials, 14 The dynamic specialization Biplot graphical tool proposed in Ciccarelli and Proietti (2013) is largely based on the BiplotGUI package for R elaborated by la Grange, le Roux, and Gardner-Lubbe (2009). 15 The degree of openness (absolute value of import plus export over GDP) increased accordingly, passing from about 14.5 to 21.2. 10 1861 1886 1909 Figure 4 The evolution of the Italian railway network, 1861-1909 whereas nal goods were increasingly produced domestically. The structure of export also changed considerably: while olive oil, silk, and citrus fruit reduced their importance, the exports of manufactures (including cotton textile and machinery) started growing (Fenoaltea, 2011; Federico, Natoli, Tattara, and Vasta, 2011). As for the contribution of the improvement of transport and communication technologies to the dynamics of market potential, it is worth noticing that while both roadsand sea-transports were relevant to the Italian peninsula, the extension of the railroad network represented the main novelty brought out by the political unication of 1861. To illustrate the reduction in transport costs over 1871-1911, Figure 4 summarizes the rapid progress of this network. Railroads were already active in 1861 in the North-West and Tuscany while, with the exception of the area surrounding Naples, they were completely absent in the South. The ports of Genoa, Venice, and Ancona were connected to Milan. The port of Genoa was connected by railways to Turin in 1853. Turin and Milan were connected by railways between 1855 and 1859. These connections contribute substantially to the early industrial development of the North-West area. 16 High quality coal, an input of growing importance with the diusion of factory-based industry, was largely imported from the UK, especially from Cardi and Newcastle. It was delivered to the main Italian ports of Genoa and Naples, and distributed across the country by railways. The coast of imported coal was about 50 to 80 lire per ton depending on the distance from the nearest port (against a cost of about 7 lire per ton in England, Sapori, 1961, p. 8). Coal was also imported by railways. According to Mario Abrate, among the leading expert of industrialization in Piedmont, in the early 1870s the coal delivered to Turin from St. Etienne (central eastern France) by railways had a price of 35 lire per tons (of which about 15 lire for the coal 16 Venetia was part of the Habsburg Empire up to 1866. The Austrian policy aimed at strengthening the commercial relation between Wien and Trieste (in the region of Venetia), and its port on the upper Adriatic sea, and, at the same time, to downsize Genoa's commercial ambitions, tied naturally to the development of its port in the upper Mediterranean sea. The opening of the railway line connecting the capital city of Turin with the Port of Genoa in 1853 represented thus a clear attempt of the Italians, above all Cavour, to hinder the development of north-eastern trade routes dear to Austrian policy makers, and to foster the economic development of the Italian North-West. 11 itself, and 20 lire as transport cost, while the the coal shipped from Cardi to the port of Genoa, and then to Turin by railways had a total cost of about 55 lire per ton (Abrate, 17 1970, pp. 28-29). The subsequent development of the Italian railways was surprisingly fast. While Northern provinces beneted from the early availability of railroads, the main network was extended quickly to the rest of the country and already in 1886 (see, again, Figure 4) it was almost completed. Over the next couple of decades the national network 18 was then lled in completely. 2.3 The structure of economic activity During the second half of the nineteenth century the sectoral composition of total gross value added (GVA) changed considerably. Industry and services increased at the expense of agriculture (see Table 2). From 1871 to 1911 the share of GVA covered by agriculture decreased from 49 to 38 percent, while that of industry increased from about 16 to about 24 percent. 19 Table 2 The changing composition of Italy's GDP: sectoral shares (percentages), 1871, 1881, 1891, 1901, 1911 1871 1881 1891 1901 1911 AGR 49.11 47.27 45.72 44.11 38.41 IND 15.73 16.87 18.20 19.29 24.46 manufacturing SER TOT 12.37 13.08 14.09 15.73 18.98 35.16 35.86 36.08 36.6 37.13 100.00 100.00 100.00 100.00 100.00 Source: authors' elaboration on Fenoaltea (2005), providing value added estimates at 1911 prices; industry includes four major aggregates: extractive, manufacturing, utilities, and constructions. Using a new dataset developed by Ciccarelli and Fenoaltea (2013, 2014), Table 3 focuses on the manufacturing sector and reports value added percentage shares at benchmark years (1871, 1881, 1901, and 1911) when population censuses were taken. With a few exceptions, sectors tied to the production of consumption goods (such as foodstus and leather) show a constant reduction in their share of value added. Sector tied to the production of durable goods (such as metalmaking and paper) show an opposite longterm trend, with a rapid acceleration in the 1901-1911 decade. The last column of Table 3 summarizes 1871-1911 trends, with numbers below one prevailing in the upper rows of the table, while those above one in the lower ones. 17 The rst French railway line was completed in the late 1820s, and connected the coal mines near Saint-Étienne and the banks of the Loire River in Andrézieux, near Lyone. The trains, which consisted of three mine cars, were powered by gravity on downhill trips. Horses pulled them back up until mid 1840s. 18 Scholars also stressed that the Italian rail fares of the time were excessively expensive, and that traditional costal shipping played and important role within the Italian transport system even after the introduction of railways (Fenoaltea, 2011; Schram, 1997). 19 The sectoral composition of Italian GVA was similar to that of other Southern European countries, while Northern countries, above all England, in 1911 ca had only about one fth of the labor force employed in the agricultural sector (Broadberry, Federico, and Klein, 2010). 12 The above level of disaggregation, albeit rather ne, is not fully satisfactory. Within the vast and variegated engineering sector, to exemplify, the manufacturing of naval ship and rolling stock was certainly more sophisticated from a technological point of view, than the maintenance of agricultural tools (such as spades, hoes, and ploughs) representing possibly a substantial part of blacksmiths' traditional activity. Within textile, to give a second example, traditional bers (hemp, linen, jute, silk, wool) were processed very dierently from cotton and articial silk. The non-metallic mineral product sectors, to provide a nal example, includes the famous white marble of Carrara (with its high value per-ton making it a relatively mobile product) and bricks and tiles, tied to the construction sector, with low value per-ton and, as consequence, a production activity of local nature, with kilns diused virtually everywhere. One can further note that in 1911 the foodstus, textile, and engineering sectors represented alone more the 50 percent of total value added in manufacturing. According to Ciccarelli and Proietti (2013), describing industrial specialization trajectories of Italian provinces during 18711911, these three sectors act as sucient statistics for the whole manufacturing sector in that are able to capture much of the total variability of provincial specialization within 20 manufacturing. Table 3 Manufacturing sectors: value added shares Sectors 1871 1881 1901 1911 1911/1871 Food 33.5 30.4 25.4 21.5 0.6 Tobacco 1.6 1.3 0.9 0.7 0.4 Textile 10.3 10.3 12.8 11.1 1.1 Clothing 6.9 7.4 6.8 6.3 0.9 Wood 10.0 9.3 9.7 10.0 1.0 Leather 10.5 11.5 11.4 7.8 0.7 Metal Engin NonMet Chem Paper Sundry 0.6 17.5 3.6 2.1 2.7 0.7 1.0 17.8 4.2 2.4 3.5 0.7 1.7 18.6 4.2 3.0 4.8 0.6 3.1 21.5 6.6 4.3 6.3 0.7 5.2 1.2 1.8 2.0 2.3 1.0 Source: our elaborations on data produced by Ciccarelli and Fenoaltea (2013, 2014). The next sections discuss the geographical distribution of manufacturing activity and its possible determinants. The remaining of this section illustrates the procedure that we followed to obtained the new industrial value added estimates for Italy's provinces. The rst ever estimates of provincial value added at 1911 prices for Italian industry and for the years 1871, 1881, 1901, and 1911, were presented by Ciccarelli and Fenoaltea (2010) and further discussed in Ciccarelli and Fenoaltea (2013). The authors allocated existing regional estimates to provinces, by using the provincial labor force shares of 20 Much of this section's content is inspired by the The structure of industrial production section in Ciccarelli and Proietti (2013). However, we use here updated provincial gures of valued added for the engineering sector, obtained by allocating to provinces trough sectoral labor force shares of the regional total, the new regional estimates by Ciccarelli and Fenoaltea (2014). 13 regional totals, separately for each of the 15 industrial sectors (extractive, manufacturing, constructions, and utilities; with manufacturing further disaggregated into 12 sectors as illustrated in Table 3). While annual 1861-1913 regional estimates (NUTS-2 units) are based on a rich set of detailed historical sources, provincial estimates (NUTS-3 units) rest essentially on the information on labor force as reported in the population censuses 21 and are thus only available for the years 1871, 1881, 1901, and 1911. In this paper we update the provincial estimates by Ciccarelli and Fenoaltea (2013) by incorporating at the provincial level the new regional estimates on the engineering industry provided by Ciccarelli and Fenoaltea (2014). While Ciccarelli (2015) summarizes in detail the state of play of regional historical statistics for Italy's industry, the important point here is that the new provincial estimates (NUTS-3) on industrial value added used in this paper are strictly related to estimates at the regional level (NUTS-2 units). Labor force gures obtained from population censuses allow one to map value added estimates on each industrial sector from 16 regions to 69 provinces. The implicit assumption is that within a given region, within a given industrial sector, and within a given census year, the productivity of labor is homogeneous across provinces. In other words, dierences in labor productivity are here only accounted for by the regional dimension of the data not the provincial one. This is an unfortunate, yet hardly avoidable, assumption. Detailed historical sources that allow one to avoid this assumption are only available at the regional (NUTS-2 units). 22 3 The spatial diusion of manufacturing activity In this section, we use the value added data at 1911 prices described above to answer several questions about the spatial diusion of industrial activity in 19th century Italy. How similar was the industrial structure across dierent provinces in Italy? during the 1871-1911 period? Did it change How concentrated was the manufacturing activity as a whole and how concentrated was a given industry? Which industries tend to agglomerate, which industries were rather dispersed? And do we nd an increase or decrease in concentration over time? In order to answer these questions, we clarify some measurement issues. compute the share of a certain manufacturing industry k activity of province i si (t) , dened as k First, we in the total manufacturing xk (t) ski (t) = P i k k xi (t) (2) 21 Population censuses only reports information on individual professions, and not on industrial sectors. Data on the hundreds of individuals professions have to be grouped to industrial sectors, separately by province, and the procedure is in part arbitrary. To exemplify: a nineteenth century chemist produced and sold the drugs at the same time. But only the former activity (production of drugs) belongs to the manufacturing sector. The latter (sale of drugs) belongs instead to the service sector. Ciccarelli and Missiaia (2013) reports and analyzes the main pros and cons of the provincial labor force data used by Ciccarelli and Fenoaltea (2010, 2013) to obtain their provincial value added estimates. 22 The and long-term Fenoaltea project (2009, on 2014), regional and industrial sponsored by production the Bank carried of Italy on is by Ciccarelli illustrated https://www.bancaditalia.it/pubblicazioni/altre-pubblicazioni-storiche/produzione-industriale-18611913/. 14 at where xki (t) measures the level of economic activity k at location Second, we compute the share of a certain location activity of industry k i i and time t. in the total manufacturing as xk (t) lik (t) = P i k i xi (t) (3) Third, we dene the location quotient as LQki (t) lik (t) ski (t) P P P P P k = k k k x (t)/ x (t) x (t)/ i i i k i k i i k xi (t) =P (4) In section 4, we will use the location quotient as our dependent variable in an econometric analysis of the determinants of industrial location. 3.1 Regional specialization and geographical concentration of economic activities Using the shares ski (t) we can address the question of how specialized were Italian provinces by using Krugman's specialization index Ki (t) = X Ki (t), dened as: |(ski (t) − s−k i (t))| (5) k s−k i (t) is the share of industry k in the total production of all provinces except province i. Thus, the Krugman index summarizes a province's dierence in specialization where with respect to the rest of Italy over all industries. It takes the value of zero if a province's industrial structure is identical to the rest of Italy, and the value of two if the region has no industries in common with the rest of Italy. Table 4 shows means and standard deviations of Krugman index for the four sample periods, while gure 5 maps its spatial distribution. The results clearly show that on average the degree of industrial specialization of Italian provinces monotonically increased over the sample period (from 0.26 to 0.38), while its dispersion slightly raised. The increment in the value of Krugman index was particularly pronounced for selected provinces of the North-West (including Turin, Milan, Genoa, Novara, Como, Cremona, Bergamo), and Tuscany (Massa Carrara, Lucca, Pisa, and Leghorn). Table 4 Industry specialization of Italian provinces. Krugman index (value added data) Year 1871 1881 1901 1911 Mean 0.26 0.29 0.35 0.38 St.dev 0.10 0.11 0.14 0.13 CV 0.38 0.36 0.39 0.35 Source: our elaborations on data produced by Ciccarelli and Fenoaltea (2013, 2014). 15 (a) Year: 1871 (b) Year: 1881 (c) Year: 1901 Figure 5 Specialization: (d) Year: 1911 Krugman index 16 Next, we try to assess whether this increase in specialization corresponded to a higher spatial concentration of industries, by using a relative Theil index of industrial concentration: Ck (t) = X vi i where vi V LQki (t) ln LQki (t) is the value added in province higher the value of Ck (t), i and V (6) is the total value added in Italy. The the higher the concentration of industry k. Table 5 reports the value of the Theil index for both the whole manufacturing sector (the location quotient is in this case computed by normalizing with respect to total economic activity) and for the 12 individual industries (the location quotient is in this case computed by normalizing with respect to total manufacturing activity) at benchmark years during 1871-1911. We observe that, along with the progressive integration of commodities and factor markets (falling transport costs), the manufacturing activity became slightly more concentrated throughout the entire period (the value of the Theil index for the manufacturing industry as a whole was 0.03 in 1871 and became 0.08 in 1911). The aggregate coecient hides the notable tendency towards the concentration of several industries (namely, foodstus, textile, wood, leather, metalmaking, and engineering). Instead, the relative level of concentration remained stable in two sectors (clothing and chemicals) and only three industries became relatively more dispersed (tobacco, paper and non-metallic mineral products). Nevertheless, tobacco remained the most concen- trated sector, followed by metalmaking and textile. 23 The most dispersed sectors were instead Wood, Food, and Engineering. However, the ranking of the Theil indices remained very stable over time (the Spearman rank correlation between the Theil indices in 1871 and in 1911 is 0.95 with a p-value of 0.000). Interestingly, the stability in the level of concentration across industries observed for 19th century Italy is a pattern also found in studies concerning more recent periods, both for Italy (Arbia, Dedominicis, and De Groot, 2011) and other countries (Alonso-Villar, Chamorro-Rivas, and González-Cerdeira, 2004; Devereux, Grith, and Simpson, 2004; Dumais, Ellison, and Glaeser, 2002). 3.2 Global spatial dependence The relative Theil index Ck (t) provides useful information about the extent to which industries in post-unication Italy were concentrated in a limited number of areas, but does not take into consideration whether those areas were close together or far apart. In other words, it does not take into account the spatial structure of the data. Every region is treated as an island, and its position in space relative to other regions is not taken into account. Thus, the relative Theil index Ck (t) is an a-spatial measure of concentration: the same degree of concentration can be compatible with very dierent localization schemes. For example, two industries may appear equally geographically concentrated, while one is located in two neighboring regions, and the other splits between the northern and the southern part of the country. 23 The tobacco sector was managed by the State in a regime a public monopoly. Manufacturing was concentrated in a few State-owned factories. In 1911, tobacco products were manufactured in Bari, Bologna, Cagliari, Catania, Chiaravalle, Florence, Lecce, Lucca, Milan, Modena, Naples, Palermo, Rome, Sestri, Turin, and Venice. 17 Table 5 Industry concentration: Theil index (value added data) Sectors 1871 1881 1901 1911 Food 0.02 0.02 0.04 0.06 Tobacco 1.23 1.05 1.04 0.81 Textile 0.26 0.3 0.45 0.45 Clothing 0.11 0.13 0.1 0.11 Wood 0.02 0.02 0.03 0.05 Leather 0.05 0.06 0.11 0.16 Metal 0.38 0.57 0.86 0.74 Engin 0.05 0.03 0.06 0.07 NonMet 0.17 0.17 0.19 0.09 Chem 0.18 0.16 0.23 0.17 Paper 0.21 0.21 0.17 0.16 Sundry 0.33 0.89 0.62 0.46 Manufacturing 0.03 0.04 0.07 0.08 Source: our elaborations on data produced by Ciccarelli and Fenoaltea (2013, 2014). As pointed out by Arbia, Basile, and Mantuano (2005); Basile and Mantuano (2008); Arbia, Dedominicis, and De Groot (2011), a more accurate analysis of the spatial distribution of economic activities requires the combination of traditional measures of geographical concentration and methodologies that account for spatial dependence, in that they provide dierent and complementary information about the concentration of the various sectors. Spatial autocorrelation is present when the values of one variable ob- served at nearby locations are more similar than those observed in locations that are far apart. More precisely, positive spatial autocorrelation occurs when high or low values of a variable tend to cluster together in space and negative spatial autocorrelation when high values are surrounded by low values and vice-versa. Among the spatial dependence measures, the most widely used is the Moran's I= N P P i P P i wij j index (Moran, 1950): ! wij LQi − LQ LQj − LQ 2 P i LQi − LQ is the total number of provinces, LQi (7) LQj are the observed values of the location quotient for the locations i and j (with mean LQ), and the rst term is a scaling where N j ! I and constant. This statistic compares the value of a continuous variable at any location with the value of the same variable at surrounding locations. The spatial structure of the data W (Anselin, 2013) with generic elements i 6= j ). In the rest of the paper we will employ row-standardized spatial weights matrix (W ), whose elements wij on the main diagonal are set to zero whereas wij = 1 if dij < d and wij = 0 if dij > d, with dij the great circle distance between the centroids of region i and region j and d a cut-o distance. Table 6 reports the values of the Moran's I statistics for the whole manufacturing sector and for the 12 sectors, as well the respective p-values. First of all, it is worth noticing that the degree of spatial autocorrelation of LQ for the manufacturing as whole is formally expressed in a spatial weight matrix wij (with increased over time passing from a null to a statistically signicant and positive value. 18 Thus, both a-spatial and spatial concentration of manufacturing activity increased over time. An increase of spatial autocorrelation is also observed for traditional sectors (leather, wood, clothing, textile, and foodstus). In 1911, Textile and Leather were the sectors 24 with the highest levels of spatial autocorrelation. For three sectors (non-metallic mineral products, metalmaking and engineering) we found a decrease of spatial autocorrelation, albeit the Moran's I statistic remained positive and statistically signicant except for engineering. Only for three sectors (paper, chemicals and tobacco) we found no evidence of signicant spatial dependence over the whole sample period. Also the rank correlation between the Moran's a p-value I indices in 1871 and in 1911 is positive and quite high (0.67) with of 0.015. Table 6 Spatial autocorrelation: global Moran's I index (value added data) Sectors Food 1871 1881 1901 1911 0.15 0.31 0.25 0.29 0.02 0.00 0.00 0.00 Tobacco -0.04 0.00 0.02 0.01 0.63 0.41 0.32 0.38 Textile 0.33 0.38 0.47 0.49 0.00 0.00 0.00 0.00 Clothing 0.29 0.30 0.40 0.35 0.00 0.00 0.00 0.00 Wood 0.11 0.10 0.22 0.37 0.05 0.06 0.00 0.00 Leather 0.60 0.62 0.69 0.73 0.00 0.00 0.00 0.00 Metal 0.28 0.26 0.17 0.17 0.00 0.00 0.00 0.00 Engin 0.17 0.19 0.12 0.02 0.01 0.00 0.03 0.34 NonMet 0.20 0.08 0.12 0.11 0.00 0.03 0.00 0.02 Chem -0.04 0.15 0.00 0.05 0.61 0.02 0.40 0.19 Paper 0.11 0.07 0.06 0.07 0.05 0.12 0.17 0.13 Sundry 0.05 -0.00 0.01 0.15 0.19 0.37 0.36 0.01 Manufacturing 0.05 -0.00 0.01 0.15 0.47 0.41 0.09 0.02 Source: our elaborations on data produced by Ciccarelli and Fenoaltea (2013, 2014). 24 Interestingly, an upsurge of spatial dependence in Italian traditional sectors characterized by the use of basic technologies also emerged for more recent periods (Arbia, Dedominicis, and De Groot, 2011). These are sectors in which operate rms of small and medium size localized in well-dened industrial clusters in the Northern and Central part of the country (Emilia, Tuscany and Marches). 19 As a further step of our analysis, we can jointly consider the results produced by the Theil measure of geographical concentration and the Moran's I measure of spatial depen- dence described above. Similarly to Guillain and Le Gallo (2010), we can identify four dierent patterns in the distribution of economic activities where either or both geograph- i ical concentration and spatial dependence occur: ( ) high concentration and low spatial dependence, ( ii ) high concentration and high spatial dependence, (iii ) low concentration iv ) low concentration and low spatial dependence. To this and high spatial dependence, ( purpose, we combine the a-spatial Theil index and the Moran's I index in a scatterplot (Figure 6) for each period, excluding tobacco and sundry. The four patterns in the distribution of economic activities are identied by including a vertical and an horizontal dashed line at the median values of the each indicator. During 1871-1911, the composition of the four patterns of spatial distribution changed considerably. Three sectors (chemicals, paper and metalmaking) registered a high level of concentration and a low degree of spatial dependence in 1911 (pattern i )). This means that these three industries concentrated their activity in a small number of areas that were not close to each other. In these sectors, indeed, economies of scale may be reached only by increasing the plant size and concentrating the production in a small number of locations. However, in 1871, metalmaking belonged to the pattern ii ) (high concentration and high degree of spatial dependence), along with textile and non-metallic mineral products. Textile was still in this second group in 1911, along with leather. Thus, textile was the mostly "spatially" clustered sector over the whole sample period, being highly concentrated and at the same time strongly agglomerated in all census periods. This feature still characterizes the textile sector in the modern period (see Arbia, Dedominicis, and De Groot, 2011). Clothing was instead the sector that persistently belonged to the third spatial pattern (low concentration and high spatial dependence) over the while sample period. In 1871, leather was in the low-concentration/high-dependence cluster, but then moved to the high-high group in 1911. In 1911, instead, foodstus and wood were in the low-concentration/high-dependence group, but they belonged to the low-low club in 1871. Finally, we nd that engineering, foodstus and wood were the mostly "spatially" spread industries (pattern iv )) in 1871. Engineering was still in this group in 1911, along with non-metallic mineral products that was in pattern ii ) (high Theil and high Moran) in 1871. 3.3 Local spatial patterns in the distribution of economic activities The Moran's I statistic of spatial autocorrelation considered in the previous section cap- tures the overall spatial clustering present in the data. A positive and signicant measure of Moran's I captures the existence of both high-value clustering and low-value clustering, while a negative autocorrelation captures the presence of high-values next to low-values. In other words, only one dominant type of autocorrelation can be detected. If two structures, such as high-value clustering and low-value clustering, coexist, the global statistic of spatial autocorrelation cannot distinguish them (Zhang and Lin, 2007). In contrast, the local version of the Moran's I statistic is specically designed to nd evidence of local spatial patterns in the empirical data, that is to detect local spatial associations (' hot spots '), such as high-value clusters, low-value clusters, and negative autocorrelation 20 (a) Year: 1871 (b) Year: 1881 (c) Year: 1901 Figure 6 Spatial concentration: (d) Year: 1911 scatterplot between the a-spatial Theil index of concentration and the global I Moran index of spatial autocorrelation. Value added data 21 (Anselin, 1995). The local Moran statistic for an observation Ii = LQi X i is dened as: wij LQj (8) j6=i The results of the local Moran's I tests for the whole manufacturing are shown in Figure 7, along with the chroroplet maps of the spatial distribution of the location quotients. 25 HH for locations with high LQ for a specic sector surrounded by regions with high levels of the LQ; LL for locations with low levels surrounded by locations where the LQ are also low; HL for locations with high values surrounded by locations with low values, and LH for locations Signicant values of the local Moran statistic are classied as: levels of the with low values surrounded by locations with high values. While the rst two typologies (namely and HH LH ) and LL) suggest clustering of similar values, the last two situations (HL capture the presence of regional outliers in the spatial distribution of economic activities. In 1871, the manufacturing activity as a whole mainly clustered in a few Northern provinces (Figure 7). In particular, Milan, Novara, Bergamo, Como, Brescia, Verona and Vicenza formed the HH group. In the same period, however, some other provinces (such Genoa, Florence, Venice, Ancona, Naples and Palermo) appeared as spatial outliers, showing a high value of LQ surrounded by low values (HL). Actually, although in 1871 the South as a whole was less industrialized than the North, Naples registered the second highest LQ value in manufacturing (1.41) after Milan (1.57) and higher than Turin (1.35), while Palermo had a LQ value in manufacturing (1.32) higher than Genoa (1.20). In 1911, however, the main HH cluster moved towards the North-West side of the country, thus extending to Turin and excluding Verona and Vicenza. At the same time, only Genoa, Ancona and Naples remained in the cluster of HH HL category, while Pisa and Leghorn formed a new values. On the other hand, the big cluster of low values (LL) moved more and more to the South. As for the single industrial branches, Figures 8-17 show several heterogeneous spatial distributions of the location quotients and of the resulting local Moran's I signicance maps. Let's start with the case of textile. Already one of the country's most important sectors, the textile industries' development would be particularly dynamic over the coming decades, and would remain particularly concentrated in the Northwest. Although in 1871 there was an outpost of textile production in Naples (this province appears as HL outlier), the main cluster was already in the North-western provinces (Lombardy, Piedmont and Liguria); in 1911 Naples was not an outlier anymore, while production concentrated even more in the North. As for the other sectors, Clothing tended to concentrate in Central-Adriatic provinces; Non-metal mineral products clustered in Tuscany and in a few outliers. The number of provinces relatively more specialized in Metal-making and in Engineering reduced over time, thus conrming the tendency towards spatial concentration discussed above. In particular, in 1911 the manufacturing of metal-making products was concentrated in Pisa, Leghorn, Porto Maurizo, Perugia and Arezzo, while Engineering was clustered in Porto 25 It is interesting to consider the two maps jointly in order to fully appreciate the value of using local indicators. In fact, a pure inspection of the spatial patterns of the location quotients is not able to capture the existence of production clusters. Once we inspect the local Moran map we can easily identify signicant industrial local clusters. 22 Figure 7 MAN: Choroplet (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 maps and local Moran's I of LQ values. Value added data 23 Maurizio, Alessandria, Genoa, Milan, Brescia, and Venice, while older positive clusters disappeared. Also the spatial clustering of Chemical activity changed over time, in favor of some provinces in the Center and other in the North. Leather and Foodstu clustered in the South, while the spatial distribution of the Wood industry showed relevant changes over time, mostly moving from the North to the South. The empirical spatial data analysis consider so far provided a clear evidence of an upsurge of the interregional division of labor and of the spatial concentration of industrial activity across the Italian provinces during the post-unication period (1871-1911). The above ndings corroborate, with updated value added data and proper exploratory spatial data analysis, the main evidence suggested by Ciccarelli and Proietti (2013). However, this evidence does not point to any particular set of explanations. An increased interregional division of labor might be seen as evidence in favor of HO-type mechanisms of industrial location. It might equally be seen as the other side of a process of concentration in some industries, due to NEG-type mechanisms. The balance of these osetting forces is a priori uncertain and is left to the econometric analysis carried out in the next section. 24 Figure 8 FOOD: Choroplet (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 maps and local Moran's I of LQ values. Value added data 25 Figure 9 TEXTILE: (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 Choroplet maps and local Moran's I of LQ 26 values. Value added data Figure 10 CLOTHING: (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 Choroplet maps and local Moran's I 27 of LQ values. Value added data Figure 11 WOOD: Choroplet (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 maps and local Moran's I of LQ 28 values. Value added data Figure 12 LEATHER: (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 Choroplet maps and local Moran's I of 29 LQ values. Value added data Figure 13 NONMET: (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 Choroplet maps and local Moran's I of 30 LQ values. Value added data Figure 14 METAL: Choroplet (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 maps and local Moran's I of LQ 31 values. Value added data Figure 15 ENGIN: Choroplet (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 maps and local Moran's I of LQ 32 values. Value added data Figure 16 CHEM: Choroplet (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 maps and local Moran's I of LQ 33 values. Value added data Figure 17 PAPER: Choroplet (a) 1871 (b) 1911 (c) Local I Moran: 1871 (d) Local I Moran: 1911 maps and local Moran's I of LQ values. Value added data 4 Testing the eect of water endowments and market potential on industry location 4.1 Econometric specication The previous section described the spatial distribution of manufacturing industries during 1871-1911. As already pointed out in the introduction, the location of industrial HO hypothesis ) sectors over the sample period is mainly driven by natural advantages ( NEG hypothesis ). and/or (domestic) market access ( A'Hearn and Venables (2013) and Midelfart-Knarvik and Overman (2002) provided theoretical frameworks to show how these economic geography forces operate. Generally speaking, the spatial distribution of perfectly competitive sectors - where rms produce with constant returns to scale - is mainly driven by the comparative advantages of the dierent regions; while, the location of monopolistically competitive sectors characterized by increasing-return technologies is mainly driven by the market potential, so as rms tend to concentrate in one or few markets and to export to the other markets in order to take advantage of scale economies. The industrial location patterns can be thought as the equilibrium outcome between these agglomeration forces, that encourage rms to concentrate over space, and 34 dispersion/competitive forces, that encourage economic activities to spread out. This section examines the relevance of factor endowment and market potential in shaping the location of industries across Italian provinces during 1871-1911. Industrial location is measured in relative terms, that is using the log of the location quotients LQki,t . As for factor endowment, we focus on water abundance, measured with the log of the variable RIV ER described in section 2.1 and its interaction with the dummy variable W AT ERF ALLS , indicating the existence of waterfalls in the province. Market potential is measured by the log of the Harris (1954)'s formula (M KT P OT ) (see section 2.2). Obviously, W AT ERF ALLS and RIV ER are time invariant, while M KT P OT is time varying. To avoid mis-specication biases due to a wrong functional form, we adopt a semiparametric additive model where all variables enter additively as smooth functions, without imposing any specic functional relationship (linear or nonlinear) with the response variable. Thus, for each sector [M 1] ln LQki,t k, the model is specied as: = α +βW AT ERF ALLSi +f1 [ln (RIV ERi |W AT ERF ALLSi = 0)] +f2 [ln (RIV ERi |W AT ERF ALLSi = 1)] +f3 [ln (M KT P OTi,t )] +h (noi , ei ) + γt + i,t ∼ iidN 0, σε2 i = 1, ..., N t = 1, ..., T εi,t f[p] s represent unknown smooth functions of the univariate terms. Namely, f1 (.) f2 (.) capture the smooth eect of water endowment; while f3 (.) captures the smooth where and 26 eect of market potential. Time xed eects (γt ) are introduced in the model to control for time-related factors biases. Moreover, a geoadditive term, h (noi , ei ), that is a smooth interaction between the spatial coordinates of the provinces (or smooth spatial trend), is introduced in order to avoid mis-specication biases due to unobserved spatial heterogeneity. Controlling for unobserved heterogeneity is indeed a fundamental task in our empirical analysis, as other natural advantages (apart from water endowment) may inuence industry location. Thus, failing to control for unobserved heterogeneity can introduce omitted-variable biases and preclude causal inference. Unobserved heterogeneity may also be spatially correlated, thus introducing spatial autocorrelation in the residuals. As McMillen (2012) shows, in this case a simple semiparametric model, with a smooth spatial trend (the so-called Geoadditive Model ), can remove both unobserved spatial heterogeneity and spatial de- pendence (see also Basile, Durban, Minguez, Montero, and Mur, 2014). In the spatial econometrics literature, spatial dependence due to (spatially correlated) unobserved heterogeneity is classied as `nuisance' spatial dependence, in contrast to 26 Apart from the semiparametric form, this specication is dierent from the one used in the related literature, for example by Wolf (2007); Ellison and Glaeser (1999); Midelfart-Knarvik, Overman, Redding, and Venables (2000); Midelfart-Knarvik, Overman, and Venables (2001). These authors pool the data by regions, sectors and time, and regress the location quotient on a set of interactions between the vectors of location characteristics (factors endowment and market potential) and a vector of industry characteristics (measuring industries' factor intensities and the share of intermediate inputs in GDP). Our dataset is reach enough to avoid considering the above interaction and pooling of the data, and allow us to estimate separate models for homogeneous groups of industrial activity. 35 `substantive' spatial dependence due to spillovers, pure contagion and imitation eects. While nuisance spatial dependence is well captured by the spatial trend surface, substantive spatial dependence should be modeled through traditional spatial econometric models. Given these considerations, we will also consider an alternative specication of the location regression equation, a semiparametric Spatial Autoregressive model (SAR) model, that is a semiparametric model without spatial trend but with the spatial lag of the dependent variable: [M 2] ln LQki,t =α + X wij ln LQkj,t + βW AT ERF ALLSi j εi,t +f1 [ln (RIV ERi |W AT ERF ALLSi = 0)] +f2 [ln (RIV ERi |W AT ERF ALLSi = 1)] +f3 [ln (M KT P OTi,t )] +γt + i,t iidN 0, σε2 i = 1, ..., N t = 1, ..., T ∼ Each univariate smooth term in equations [ linear combination of qk M1 ] known basis functions fk (xk ) = X and [ M2 ] can be approximated by a bqk (xk ): βqk bqk (xk ) qk with βqk unknown parameters to be estimated. To reduce mis-specication bias, qk0 s should be large enough, which results in a danger of over-tting. The smoothness of the functions can be controlled by penalizing 'wiggly' functions when tting the model. A measure of 'wiggliness', 0 Jk ≡ β k Sk β k , is associated with each k smooth function, with Sk a positive semi-denite matrix. Equivalently, each univariate smooth spatial lag term can mk (W xk ) = P qk δqk bqk (W xk ). The penalized spline baselearners can be extended to the two dimensions to handle interactions by using tensor products. be approximated by In this case, smooth bases are built up from products of 'marginal' bases functions. For example, h(no, e) = XX qno βqno ,qe bqno (xno )bqe (xe ) qe As is well known, any semiparametric model can be expressed as a mixed model and, thus, it is possible to estimate all the parameters (including the smoothing parameters) using restricted maximum likelihood methods (REML) (Ruppert, Wand, and Carroll, 2003). 27 4.2 Endogeneity ] is given by the presence of P M1 ] and [kM2 and j wij ln LQj,t ) on the right hand side. The A complication with the estimation of models [ endogenous variables (ln M KT P OTi,t 27 To estimate this model, we have used the method described by Wood (2006) which allows for automatic and integrated smoothing parameters selection through the minimization of the Restricted Maximum Likelihood (REML). Wood has implemented this approach in the R package 36 mgcv. k j wij ln LQj,t is the spatial lag of the dependent variable and, thus, is endogenous by construction. As for ln M KT P OTi,t , New Economic Geography models describe a term P process characterized by reverse causality in which market potential, by attracting rms and workers, increases production in a particular location, and this, in turn, raises its market potential. To address these problems, we extend the REML methodology to estimate the parameters of models [ M1 ] and [M2 ] in a 2-step control function (CF) approach (Blundell and Powell, 2003), that is an alternative to standard instrumental variable (IV) methods. In the rst step, each endogenous variable is regressed on a set of conformable instrumental variables, using a semiparametric model. The residuals from the rst steps M1 M2 are then included in the original model ([ ] or [ ]) to control for the endogeneity of P ln M KT P OTi,t and/or j wij ln LQkj,t . Since the second step contains generated regressors (i.e. the rst-step residuals), a bootstrap procedure is recommended to compute p -values. This procedure requires nding good instruments, that is, variables that are correlated with the endogenous explanatory variables but not with the residuals of the regression. Among the instruments used in the empirical literature for market potential, the most common include geographical distances from the main economic centres (Redding and Venables, 2004; Klein and Crafts, 2012) or the use of lagged variables for the market potential (Wolf, 2007; Martinez-Galarraga, 2012). We use the distance from Milan as instrumental variable. As for the endogeneity of the spatial lag of the dependent variable, we follow the standard literature in using the spatial lags of the exogenous variables (i.e. water endowment) as good instruments (Kelejian and Prucha, 1997). As the rst-step equations of models [ M1 ] and [ M2 ] are the same for all sectors analyzed, we report here the results of their estimation (see Table 7). These results clearly show that the distance from Milan is a highly correlated with the market potential and the spatial lags of water endowment variables are highly correlated with the spatial lag of the dependent variable. Table 7 Estimation results rst step of the control function approach Variable First step M1 F test edf f1(ln(River)|Waterfalls=0) 16.781 1.993 (0.000) f2(ln(River)|Waterfalls=1) 11.900 1.000 (0.001) f3(no,e) 14.349 18.531 (0.000) f4(DistMi) 25.073 1.640 (0.000) f5(ln(River)) f6(Waterfalls) 37 First step a M2 First step b M2 F test edf F test edf 1.129 1.770 4.544 3.347 (0.324) (0.002) 7.961 3.040 4.040 2.594 (0.000) (0.008) 29.460 (0.000) 5.819 (0.017) 5.588 (0.001) 1.983 1.000 3.176 40.931 (0.000) 7.431 (0.000) 7.645 (0.001) 1.930 3.638 1.506 4.3 Hypotheses The aim of this section is to provide a possible guidance in the interpretation of the estimation results resulting form the second step of our CF approach (see section 4.5). To this end, we group manufacturing sectors into three categories, depending on the magnitude of their capital intensity, that is their capital to labor ratio (K/L). By doing so, we can make specic hypotheses on the relative eect of market potential on the location of the industry. Indeed, just as the eect of water endowment o industrial location may depend on the prevailing technology, the eect of market potential may depend on the degree of scale economies prevailing in the various sectors. Economic activities with increasing returns to scale tend to establish themselves in regions that enjoy good market access, while economic activities with constant returns technologies are essentially inuenced by factor endowment. We argue thus that the K/L ratio represents a good measure of scale economies. We expect in particular that the location of "high K/L" sectors is relatively more inuenced by the market potential, while the location of "low K/L" sectors is relatively more inuenced by the water endowment. In the years following the country's unication many manufacturing sectors were still organized on an artisanal basis. In the absence of economies of scale, we might expect low-medium K/L industries to locate close to their customers (the eect of market potential is null or negligible), a point made in Fenoaltea (2003, 2011). Dierently, in the case of industrial activities with "medium" and "high" K/L, one may expect an increasing role of market potential during the second part of our sample period, as these kinds of activities became more and more organized on a factory basis. Grouping manufacturing sectors on the base of capital intensity is not new in the literature. Zamagni (1993, pp. 86-87) groups industrial sectors into and traditional advanced, intermediate, industries depending on their capital intensity and number of workers per plant. Federico (2006) groups industrial sectors into light industries and heavy industries. The former are those that are featured by a (relatively) low capital intensity and by the prevailing orientation towards the nal consumer, the latter are more capital-intensive and produce mainly inputs for other sectors. However, from a practical point of view, grouping manufacturing sectors on the base of capital intensity is not an immediate task. First of all, the rst Italian historical source with information on capital stock K is the industrial census of 1911 (IC 1911), while no systematic information is available for earlier years. IC 1911 reports in particular data on horse-power per worker, a variable that is frequently used to proxy capital intensity both in comparative studies (Rostas, 1948; Broadberry and Crafts, 1990; Broadberry and Fremdling, 1990; de Jong, 2003), and studies on the Italian case (Zamagni, 1978, 1993; Bardini, 1998; Federico, 2006). Second, IC (1911) omits shops with only one worker or those who work at home, and more generally understates actual industrial employment (Fenoaltea, 2014). Third, sectoral data on K/L (actually on HP/L) represent averages and, as such, are not per se informative of 28 within-sector heterogeneity. With these important caveats in mind, it is reassuring that Italy's economic histori- 28 An example referring to the vast and variegated engineering sector should help clarifying the point. Fenoaltea (2014) disaggregates the engineering sector in 11 elementary components (ranging from blacksmiths, to shipyards, to precious-metals products). The K/L ratio for the engineering sector in 1911 is there estimated to be equal to 0.33. However, this last gure is an average that includes an estimate of K/L = 0.05 for precious-metals products, of about 0.20 for blacksmith, and of about 0.60 for shipyards. 38 ans substantially agree on classifying clothing, leather, wood, and non-metallic mineral products as "low K/L" sectors; metal-making, chemicals, foodstus, and paper as "high K/L"; and the textile and engineering sector as "medium K/L". 29 Table 8, mainly based on Zamagni (1978, 1993) summarizes our grouping of manufacturing sectors in three categories depending on their capital intensity. It is nally comforting that our "K/L" thresholds (roughly 0 to 0.4, 0.4 to 0.7, and above 0.7) used to group 1911 industrial sectors depending on their capital intensity dovetail nicely with those (0.4 and 0.6) reported in Federico (2006), Table 2.6 and used by the author to group 1927 industrial sectors into light and heavy industries. Table 8 Horse-power per worker (HP/L) in 1911 High HP/L Metalmaking Chemicals Foodstus (excl. sugar) sugar Paper 2.62 1.30 0.94 2.18 0.73 Textile Engineering 0.52 0.51 Medium HP/L Low HP/L Non-metallic mineral products Wood Leather Clothing 0.36 0.23 0.09 0.07 4.4 Model selection and diagnostics Using the CF approach we have estimated models M2 M1 (i.e. the geoadditive model) and (i.e. the semiparametric SAR model), as well as three restricted specications of and M1 M2, that is: i ) a semiparametric model without spatial trend [M3 ]; ii ) a geoadditive model with linear terms for the main explanatory variables (water endowment and market M4 ]; and iii ) a linear SAR model [M5 ]. potential) [ We compare the performance of these BIC ), residual spatial ve competing models in terms of Bayesian Information Criterion ( autocorrelation (through a Moran I test of the residuals of the second step) and the presence of a spatial trend in the second step residuals In the case of whole manufacturing, 30 (Table 9). the geoadditive model [ M1 ] outperforms all the others: it reaches the lowest BIC value and there is no evidence of residual trend and spatial autocorrelation in the residuals. It is worth noticing the the geoadditive model with linear terms (model [M4]) has the second best performance, while the SAR model (both 29 As noticed in Zamagni (1993, p. 85), the relatively high level of horsepower per worker in the foodstus industry should not be misinterpreted, since it is largely due to the our-milling industry, 61% of whose power capacity still consisted in hydraulic energy. 30 To check for the presence of a residual spatial trend, we regress the residuals of second step on the smooth interaction between latitude and longitude by using a P-spline method. 39 semiparametric [ M2 ] M5 ]) and linear [ and the semiparametric model without spatial M3 ] are not able to capture the spatial pattern in the data. Similar comments apply for each group of sectors, high K/L sectors (i.e., paper, chemicals and metal-making), medium K/L sectors and low K/L sectors (i.e. clothing, leather, wood and non-metallic trend [ minerals). The above evidence points clearly to the superiority of the geoadditive model model in capturing the spatial pattern present in the data. This suggests that, over the sample period, spatial spillovers (that is substantive spatial dependence) were not really important, while most of the spatial autocorrelation in the relative location values might be attributed to the presence of spatially autocorrelated unobserved heterogeneity. Thus, in the next section we focus on the results of the geoadditive model M1. Table 9 Model comparison M1 BIC 2.494 Moran test stat. 1.024 p-value (0.153) Trend No trend BIC 4.184 Moran test stat. 0.749 p-value (0.227) Trend No trend BIC 2.477 Moran test stat. 0.371 p-value (0.355) Trend No trend BIC 2.022 Moran test stat. -0.789 p-value (0.785) Trend No trend M2 M3 M4 M5 Manufacturing 2.767 2.862 2.702 2.927 2.895 3.713 1.330 3.008 (0.002) (0.000) (0.092) (0.001) Trend Trend No trend Trend High K/L 4.254 4.443 4.180 4.304 1.671 6.591 0.736 1.500 (0.047) (0.000) (0.231) (0.067) Trend Trend No trend Trend Medium K/L 2.621 2.980 2.664 2.686 -2.266 8.381 1.508 -2.526 (0.988) (0.000) (0.066) (0.994) Trend Trend No trend No trend Low K/L 2.116 2.307 2.134 2.212 -1.032 4.783 0.908 -0.448 (0.849) (0.000) (0.182) (0.673) No trend Trend No trend Trend 4.5 Empirical results M4 ], Table 10 gives the estimation results of the second step of the geoadditive model [ obtained by pooling over the 69 provinces and the 4 years (1871, 1881, 1901, and 1911) in our sample and including the rst-step residuals in the original semiparametric regression M KT P OT . For the parametric terms (the intercept and the dummy variable W AT ERF ALLS ), we report the estimated to correct the inconsistency due to the endogeneity of coecients and the corresponding bootstrapped p -values. All the estimated models in- clude time dummies, but the results for these controls are not reported. For the the edf ), nonparametric smooth terms, the table also shows the estimated degree of freedom ( 40 a broad measure of nonlinearity (an edf equal to 1 indicates linearity, while a value higher than 1 indicates nonlinearity). Table 10 Estimation results second step of the control function approach - M1 MAN Parametric terms Intercept Dummy Waterfalls Nonparametric terms f1(ln(River)|Waterfalls=0) f2(ln(River)|Waterfalls=1) f3(ln(Mktpot)) f4(no,e) f5(ln(Res. F.S.)) Coe. p-value Coe. p-value -0.023 (0.407) -0.265 (0.000) -0.217 (0.000) 0.083 (0.334) -0.067 (0.012) -0.199 (0.000) 0.027 (0.276) 0.015 (0.711) edf edf edf edf edf R2-adj. Dev.expl. REML 2.091 3.032 2.624 18.747 1.000 0.715 0.747 43.156 3.073 1.000 1.000 9.970 3.392 0.525 0.564 -166.306 3.839 2.988 2.388 13.127 1.762 0.734 0.762 57.551 1.000 2.285 2.155 23.024 1.000 0.709 0.745 124.715 The estimation results for the whole manufacturing the three univariate smooth terms (f1 (RIV f3 (M KT P OT )) High K/L Medium K/L Low K/L ER), f2 (RIV ER ∗ W AT ERF ALLS) enter nonlinearly the model: the edf earity assumption is also conrmed by a more formal model M1 and the geoadditive model M4 and is always higher than 1. The lin- F test, comparing the geoadditive (the results are available upon request). Fig- ure 18 portrays the smooth partial eect of M KT P OT , activity clearly suggest that RIV ER, RIV ER ∗ W AT ERF ALLS and along with the point-wise 95% condence bands (obtained using a boot- strap procedure). Strong nonlinearities in the partial eect of W AT ERF ALLS RIV ER are clearly detected. Specically, it emerges a between water endowment and LQ, when W AT ERF ALLS = 0. and U -shaped RIV ER ∗ relationship However, the large con- dence bands indicate a poor precision of the estimate and suggest to be very cautious in the interpretation of the plot. Instead, the eect of water endowment is magnied when W AT ERF ALLS = 1, that is when there is a presence of waterfalls in the province. In this case the precision of the estimate is much higher and the results indicate a positive eect of water endowment up to a certain threshold. Net of the eect of water endowment and after correcting for the endogeneity bias, it also emerges a positive and monotonic relationship between industrial location and market potential, conrming that even at an early stage of Italy's industrialization rms tended to settle in regions with M KT P OT is of M KT P OT . the highest market potential to minimize costs. Specically, an increase in associated with an increase in LQ, but only after a threshold in the level And, as discussed in Section 2, the North West was the area with the highest market potential (and with the highest endowment of waterfalls as well) at the beginning of the sample period. The country's unication improved its relative position within the domestic market. The graphical representation of the estimated smooth trend surface, f4 (no, e), is not reported for reason of space, but this term is also highly signicant, suggesting the presence of unexplained spatial heterogeneity. Finally, the smooth term 41 f5 (Res.F.S.), capturing the eect of the rst-step residuals is also signicant, conrming the endogeneity of M KT P OT . Tables 10 looks within manufacturing and reports the results for the three groups of high K/L, medium K/L and low K/L sectors. In line with our expectations, the location of "high K/L" sectors is more inuenced by the market potential, while the location of "low K/L" sectors is more inuenced by the water endowment. Indeed, in the case of high K/L industries, only the market potential turns out to be a key driver industries: of industrial location (a monotonic positive eect emerges), while the eect of water endowment is not signicant. On the other hand, the variable and positively in the regression equation for " low K/L" River enters signicantly industries. These ndings are also consistent with the prediction of the Core-Periphery NEG model (Krugman, 1991), according to which only economic activities with increasing returns to scale tend to establish themselves in regions that enjoy good market access. K/L" In the case of " medium industries (textile and engineering), both water abundance and market potential appear to play some role in the industrial location. As discussed in section 4.3, the role of market potential might have changed during the sample period. In fact, we expect that with the ongoing domestic market integration, market access increased its role in driving industrial location of medium and high K/L sectors. To assess this hypothesis, we also estimated an alternative model [ M1a ] that includes the interaction between market potential and time dummies: [M 1a] ln LQki,t = α +βW AT ERF ALLSi +f1 [ln (RIV ERi |W AT ERF ALLSi = 0)] +f2 [ln (RIV ERi |W AT ERF ALLSi = 1)] T X + fδ [ln (M KT P OTi,t )] × t δ=1 εi,t ∼ +h (noi , ei ) + γt + i,t iidN 0, σε2 i = 1, ..., N t = 1, ..., T For the sake of completeness, we also estimated a model that imposes the linearity in the eect of market potential (model [ [M 1b] ln LQki,t M1b ]): = α +βW AT ERF ALLSi +f1 [ln (RIV ERi |W AT ERF ALLSi = 0)] +f2 [ln (RIV ERi |W AT ERF ALLSi = 1)] T X + πδ ln (M KT P OTi,t ) × t δ=1 εi,t ∼ +h (noi , ei ) + γt + i,t iidN 0, σε2 i = 1, ..., N t = 1, ..., T Table 11 reports the estimation results of the second steps of models [ [ M1b ] M1a ] and (obviously a CF approach has been used to control for the endogeneity of market potential). Since the eect of water abundance is the same as the discussed above, we only report the results for market potential. 42 As the edf of the nonparametric terms M1a ] of model [ are all equal to one, we can safely approximate them by linear terms. M1b ]. Thus, we focus on the estimation results of model [ In the case of medium K/L the coecients are signicant in all sub-periods and their amount is stable around 0.5, thus indicating the absence of time heterogeneity of the eect of market potential for this 31 group of industries. In the case of high K/L, instead, the coecients are signicant only in the last two period and the elasticity increases over time passing from 0.04 to 0.228, thus indicating a strong time heterogeneity of the eect of market potential for this group of industries. 32 These results strongly corroborate our expectation of an increasing role of market potential for engineering and textile as these kinds of activities became more and more organized on a factory basis. Table 11 Estimation results second step of the control function approach - M1a and M1b Nonparametric terms (Model M1a) f3a(ln(Mktpot))|year=1871 f3b(ln(Mktpot))|year=1881 f3c(ln(Mktpot))|year=1901 f3d(ln(Mktpot))|year=1911 Parametric terms (Model M1b) ln(Mktpot)|year=1871 ln(Mktpot)|year=1881 ln(Mktpot)|year=1901 ln(Mktpot)|year=1911 High K/L Medium K/L edf edf edf edf Coe. p-value Coe. p-value Coe. p-value Coe. p-value 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 0.587 (0.000) 0.550 (0.000) 0.494 (0.000) 0.505 (0.000) 0.040 (0.481) 0.059 (0.301) 0.164 (0.004) 0.228 (0.000) 31 We re-estimated model [M1b ] without time heterogeneity in the linear eect of market potential. A simple likelihood ratio test reveals that this restricted model is not dierent from [M1b ]: the with 3 d.f. is 0.48 and the associated p -value χ2 statistic is 0.923. 32 In this case the likelihood ratio test suggests that the restricted model without time heterogeneity is signicantly dierent from [M1b ]: the χ2 statistic with 3 d.f. is 14.63 and the associated 0.002. 43 p -value is (a) Figure 18 Whole manufacturing. (b) Smooth eects from semipar. M.1 (a) Figure 19 High K/L. (c) (b) Smooth eects from semipar. M.1 44 (c) (a) Figure 20 Medium K/L. (b) (c) Smooth eects from semipar. M.1 (a) Figure 21 Low K/L. Smooth (b) (c) eects from semipar. M.1 5 Conclusions Using a new set of provincial (NUTS 3 units) data on industrial value added at 1911 prices, this paper analyzed the spatial location patterns of manufacturing activity in Italy during the period 1871-1911. Specically, we have tested the relative eect of tangible factors, such as market size, transport costs and factor endowment (water abundance), as main drivers of industrial location, ruling out the role of intangible factors such as knowledge spillovers. The results show that, as transportation costs decreased and barriers to domestic trade were eliminated, Italian provinces became more and more specialized, and manufacturing activity became increasingly concentrated in a few provinces, mostly belonging to the North-West part of the country. The estimation results corroborate the hypothesis that both comparative advantages (water endowment eect) and market potential (home-market eect) have been responsible of this process of spatial concentration. The location of some traditional industries characterized by a low capital-labor ratio (such as clothing and wood) was mainly driven by water endowment, while the location of fast growing new sectors characterized by a medium/high capital labor ratio 45 (such as engineering, metalmaking, chemicals, textile and paper) was increasingly driven by the domestic market potential. In the case of the last group of industries, indeed, increasing returns at the plant level created an incentive for geographical concentration of the production and falling transport costs created an incentive to locate plants close to large markets. This centripetal forces more than counterbalanced the centrifugal force of dispersed natural resources. Our results are quite in line with the Core-Periphery NEG model (Krugman, 1991) that predicts increasing polarization and regional specialization as a result of economic integration. In this model, agglomeration economies derive from the interaction among economies of scale, transportation costs and market size, while intangible external economies (such as information spillovers) do not play any role. Today NEG models are strongly criticized on the base of the observation that they focus on forces and processes that were important a century ago but are much less relevant today. The NEG approach seems less and less applicable to the actual location patterns of advanced economies. Nevertheless, the current economic geography of fast-growing countries like Brazil, China, and India is highly reminiscent of the economic geography of Western World countries during the nineteenth century, and it ts well into the NEG framework. Historical analyses like the one presented in this paper might thus help understanding the current process of economic modernization of developing countries. Data Appendix This appendix describes the sources and methods used to build the provincial dataset used in this paper. Manufacturing value added Industrial value added data are from Ciccarelli and Fenoaltea (2013), duly updated with the new regional estimates by Ciccarelli and Fenoaltea (2014). The basic idea is to allocate to provinces the regional value added estimates at 1911 prices as follows. For each of the 12 manufacturing sectors, provincial labor force shares of the regional total are computed. Sector-specic regional value added estimates are then allocated to provinces using the above province-specic labor force shares. The calculation has been done separately for each benchmark year (1871, 1881, 1901, and 1911). Using formulas: vap,j,t = V Ar,j,t ∗ lf shp,j,t p = 1, 2, . . . , 69 denotes provinces; j = 1, 2, . . . , 12 denotes manufacturing sectors, t = 1871, 1881, 1901, and 1911 denotes the time index, va denote provincial value added, V A denotes regional value added, and lf sh denote the labor force share (of the regional were total). Insert gure ?? about here Water endowment The water endowment variables entering the regression models were obtained as follows. 46 Rivers: Data on Italian rivers are from the Italian Geographic Military Institute: Istituto Geograco Militare (2007), Classicazione dei corsi d'acqua d'Italia, Florence. The source include a detailed list of the most important 100 rivers classied by province. To each river the experts of the IGM assign a score (ranging from 1 to 5) depending on its length in km, and its socio-economic relevance. The data were inspected again historical sources such as the Annuario statistico italiano, also provided us useful information. various years. Many other sources The main consulted are: i) Baccarini A. (1877), Appunti di statistica idrograca italiana: i umi, Rome; iii) Ministero Agricoltura In- dustria e Commercio (1884), Forza idraulica utilizzata nelle diverse provinci d'Italia, Bollettini di notizie agrarie, luglio, pp. 893-896; iii) Nitti, F.S. (1905), La conquista della forza : l'elettricità a buon mercato : la nazionalizzazione delle forze idrauliche, Roux e Viarengo; iv) Ministero dei lavori pubblici (1926), Statistica delle grandi utilizzazioni idrauliche per forza motrice, Rome ; Waterfalls: Quantitative data (height and main jumps, in meters) on Italian waterfalls are from the book: Pavolini, M. (1995), Cascate d'Italia, Udine. The data were compared, when- ever possible, against the world waterfalls database (WWD) available at www.worldwaterfalldatabase.com last accessed March 2015. Market potential The market potential variable is measured by the logarithm of the Harris market potential. Historical GDP estimates at the provincial level for the case of Italy are not available. We thus proxy for provincial GDP by allocating total regional GDP (NUTS-2 units) estimates for 1871, 1881, 1901, and 1911 to provinces (NUTS-3 units) using the provincial shares of regional population obtained by population census. GDP estimates at historical NUTS-2 borders have been kindly provided by Emanuele Felice. These estimates dier from the ones published by same author (Felice, 2013), which are at current borders. 47 References Abrate, M. (1970): Il comitato dell'inchiesta industriale a Torino (1872), economiche, pp. 2530. Acemoglu, D., S. Johnson, and Cronache J. Robinson (2004): Institutions as the Funda- mental Cause of Long-Run Growth, NBER Working Papers 10481, National Bureau of Economic Research, Inc. and A. Venables (2013): Regional disparities: internal geography and external trade, Toniolo G.(a cura di), The Oxford Handbook of the Italian Economy since Unication, Oxford University Press, Oxford, pp. 599630. A'Hearn, B., and X. González-Cerdeira (2004): industries: the case of Spain, Applied Alonso-Villar, O., J.-M. Chamorro-Rivas, Agglomeration economies in manufacturing Economics, 36(18), 21032116. Anselin, L. (1995): Local indicators of spatial association-LISA, Geographical analysis, 27(2), 93115. (2013): Spatial econometrics: methods and models, vol. 4. Springer Science & Business Media. Arbia, G., R. Basile, and M. Mantuano (2005): Does Spatial Concentration Fos- Trivez, FJ, Mur, J., Angulo, A. Kaabia, M e Catalan, B.(eds.), Contributions in Spatial Econometrics, Zaragoza. ter Economic Growth? Empirical Evidence from EU Regions, Arbia, G., L. Dedominicis, and H. De Groot (2011): The spatial distribution of economic activities in Italy. A sectorally detailed spatial data analysis using local labour market areas, Bardini, C. (1998): Regional Studies, (Giugno), 114. Senza carbone nell'età del vapore: gli inizi dell'industrializzazione italiana. Mondadori. and A. Zanfei (2008): Location choices of multinational The role of EU cohesion policy, Journal of International Economics, Basile, R., D. Castellani, rms in Europe: 74(2), 328340. Basile, R., M. Durban, R. Minguez, J. Montero, and J. Mur (2014): Modeling regional economic dynamics: spatial dependence, spatial heterogeneity and nonlinearities, Journal of Economic Dynamics and Control. Basile, R., and M. Mantuano (2008): La concentrazione geograca dell?industria in Italia: 1971-2001, Scienze regionali. and J. Powell (2003): Advances in Economics and Econometrics Theory and Application chap. Endogeneity in Nonparametric and Semiparametric Re- Blundell, R., gression Models. Cambridge University Press. 48 and A. Klein (2010): The Cambridge Economic History of Modern Europe: Volume 2, 1870 to the Present chap. Sectoral Developments, Broadberry, S., G. Federico, 1870-1914. Cambridge University Press. and Broadberry, S. N., N. F. Crafts (1990): Explaining anglo-american produc- tivity dierences in the mid-twentieth century, Statistics, 52(4), 375402. Oxford Bulletin of Economics and and R. Fremdling (1990): Comparative Productivity in British Industry 190737, Oxford Bulletin of Economics and Statistics, 52(4), Broadberry, S. N., and German 403421. Cafagna, L. (1989): (1999): Dualismo e sviluppo nella storia d'Italia. Marsilio. Storia economica d'Italia, vol. 1: Interpretazioni chap. Contro tre pregiudizi sulla storia dello sviluppo economico italiano. Laterza. Ciccarelli, C. (2015): Industry, Sectors, Regions: The State of Play on Italy's Histor- ical Statistics, Scienze Regionali. Italian Journal of Regional Science, (forthcoming). and S. Fenoaltea (2009): La produzione industriale delle regioni d'Italia, 1861-1913: una ricostruzione quantitativa. 1. Le industrie non manifatturiere. Ciccarelli, C., Bank of Italy. (2010): Through the magnifying glass: Provincial aspects of industrial growth in post-Unication Italy, Economic History Working Papers, (4). (2013): Through the magnifying glass: Provincial aspects of industrial growth in post-Unication Italy, Economic History Review, 66(1), 5785. (2014): La produzione industriale delle regioni d'Italia, 1861-1913: una ricostruzione quantitativa. 2. Le industrie estrattivo-manifatturiere. Bank of Italy. Ciccarelli, C., and A. Missiaia (2013): The Industrial Labor Force of Italy's Provinces: Estimates from the Population Censuses, 1871-1911, nomica, 29(2), 141192. Ciccarelli, C., and T. Proietti (2013): post-Unication Italy, Rivista di storia eco- Patterns of industrial specialisation in Scandinavian Economic History Review, 61(3), 259286. and J.-F. Thisse (2008): Economic geography: The integration of regions and nations. Princeton University Press. Combes, P.-P., T. Mayer, Crafts, N. (2005): Market potential in British regions, 18711931, Regional Studies, 39(9), 11591166. Catching up twice: The nature of Dutch industrial growth during the 20th century in a comparative perspective, Jahrbuch für Wirtschaftsgeschichte. Beihefte. de Jong, H. (2003): De Gruyter. 49 Devereux, M. P., R. Griffith, and tion of production activity in the UK, H. Simpson (2004): The geographic distribu- Regional Science and Urban Economics, 34(5), 533564. and E. L. Glaeser (2002): Geographic concentration as a Review of Economics and Statistics, 84(2), 193204. Dumais, G., G. Ellison, dynamic process, Ellison, G., and E. L. Glaeser (1999): The geographic concentration of industry: does natural advantage explain agglomeration?, American Economic Review, pp. 311 316. Federico, G. (2006): Industrial structure (19112001), in prises in the 20th century, pp. 1547. Springer. Evolution of Italian Enter- and M. Vasta (2011): Il commercio estero italiano: 1862-1950, Collana storica della Banca d'Italia / Serie "Statistiche storiche": Federico, G., S. Natoli, G. Tattara, Collana storica della Banca d'Italia. Laterza. Felice, E. (2013): Regional income inequality in Italy in the long run (1871?2001). Patterns and determinants, . Perché il Sud è rimasto indietro. il Mulino. Felice, E. (2014): Fenoaltea, S. (2003): Peeking Backward: Regional Aspects of Industrial Growth in Post-Unication Italy, (2011): The Journal of Economic History, 63(4), 10591102. The reinterpretation of Italian economic history: from unication to the Great War. Cambridge University Press. (2014): The measurement of production: lessons from the engineering industry in Italy, 1911, Discussion paper, Collegio Carlo Alberto. Fernihough, A., and K. O'Rourke (2014): Coal and the European industrial rev- olution, University of Oxford, Discussion papers in economic and social history, No. 124, http://www.economics.ox.ac.uk/materials/papers/13183/Coal%20-%20O'Rourke Fujita, M., P. Krugman, and A. Venables (1999): The spatial economics: Cities, regions and international trade, . Guillain, R., and J. Le Gallo (2010): Agglomeration and dispersion of economic activities in and around Paris: an exploratory spatial data analysis, planning. B, Planning & design, 37(6), 961. Environment and Harris, C. D. (1954): The Market as a Factor in the Localization of Industry in the United States, IGM (2007): Annals of the Association of American Geographers, 44(4), 315348. Classicazione dei corsi d'acqua d'Italia. Istituto Geograco Militare. Kelejian, H., and I. Prucha (1997): Estimation of Spatial Regression Models with Autoregressive Errors by Two-Stage Least Squares Procedures: A Serious Problem, International Regional Science Review, 20, 103111. 50 Klein, A., and N. Crafts (2012): Making sense of the manufacturing belt: deter- minants of US industrial location, 18801920, Journal of Economic Geography, 12(4), 775807. Krugman, P. R. (1991): Geography and trade. MIT press. and S. Gardner-Lubbe (2009): BiplotGUI: InteracJournal of Statistical Software, 30(12), 137. la Grange, A., N. le Roux, tive Biplots in R, Martinez-Galarraga, J. (2012): The determinants of industrial location in Spain, Explorations in Economic History, 49(2), 255275. 18561929, McMillen, D. P. (2012): Perspectives on spatial econometrics: Linear smoothing with Journal of Regional Science, 52(2), 192209. structured models, and H. G. Overman (2002): Delocation and European is structural spending justied?, Economic policy, 17(35), 321359. Midelfart-Knarvik, K. H., integration: and A. J. Venables The location of European industry. European Commission, Directorate General Midelfart-Knarvik, K. H., H. G. Overman, S. J. Redding, (2000): for Economic and Financial Aairs. Midelfart-Knarvik, K. H., H. G. Overman, and A. Venables (2001): Compar- ative advantage and economic geography: estimating the determinants of industrial location in the EU, . Milone, F. (1953): Della distribuzione delle industrie in Italia, economica, (43,1), 2055. Rivista di politica Moran, P. A. (1950): Notes on continuous stochastic phenomena, Biometrika, pp. 1723. Negri, A., and M. Negri (1981): Campagna e industria. I segni del lavoro chap. La fabbrica nel territorio. Touring Club Italiano. Pavolini, M. (1985): Cascate d'Italia. Carlo Lorenzini Editore. and A. J. Venables (2004): Economic geography Journal of international Economics, 62(1), 5382. Redding, S., inequality, and international La modernizzazione periferica: l'Alto Milanese e la formazione di una società industriale, 1750-1914. F. Angeli. Romano, R. (1990): Rosés, J. R. (2003): Why isn't the whole of Spain industrialized? geography and early industrialization, 17971910, New economic The Journal of Economic History, 63(04), 9951022. Rostas, L. (1948): International comparisons of productivity, Ruppert, D., M. Wand, and R. Carroll (2003): bridge University Press, Cambridge. 51 Int'l Lab. Rev., 58, 283. Semiparametric Regression. Cam- Sapori, R. (1961): L'industria e il problema del carbone nel primo cinquantenario di unità nazionale, Economia e Storia, 8(1), 715. Schram, A. (1997): Century, Railways and the Formation of the Italian State in the Nineteenth Cambridge Studies in Italian History and Culture. Cambridge University Press. Wolf, N. (2007): Endowments vs. market potential: What explains the relocation of industry after the Polish reunication in 1918?, Explorations in Economic history, 44(1), 2242. Wood, S. (2006): Generalized additive models: an introduction with R. CRC press. Zamagni, V. (1993): Zamagni, The economic history of Italy 1860-1990. Clarendon Press. V. N. (1978): Industrializzazione e squilibri regionali in Italia: bilancio dell'età giolittiana. Mulino. and G. Lin (2007): A decomposition of Moran's I for clustering detection, Computational Statistics & Data Analysis, 51(12), 61236137. Zhang, T., 52