The separation of land cover from land use using data primitives
A.J. Comber, Department of Geography, Leicester University, Leicester, UK. E-mail: [email protected]
Abstract
The common confusion of land cover and land use in many datasets is problematic for many data integration activities. This paper proposes an approach for the separation of land cover and land use based on data primitives. Data primitives are those dimensions that describe at the most fundamental level the processes under investigation. In this case they provide building blocks that allow land cover and land use to be separated. A series of data primitives were identified from the literature and applied to the US National Land Cover Dataset (2001). Mapped outputs, separating the concepts of form (land cover) and function (land use), show the degree of land use, the degree of land cover and locations where the concepts of use and cover are confused. The separation of land cover and land use facilitates the integration of land data for environmental modelling and planning activities. The work promotes the need for land use and land cover to be maintained as distinct concepts in data collection activities.
Key words: semantics, expert, integration, land use, land cover, data primitive
1. Introduction
The concepts of ‘land cover’ and ‘land use’ are commonly confused in most land surveys, including those derived from satellite imagery. This confusion creates problems for research and other activities that seek to integrate different land data, as land cover and land use are fundamentally distinct (GLP, 2005). Land cover is determined by direct observation of the earth’s surface and land use is a socio-economic interpretation of the activities that take place on that surface (Fisher et al, 2005). Mixing of the concepts of land cover and land use has become so prevalent that classifications of ‘pure’ land use or land cover are rare even when that is the stated objective (Di Gregorio and Jansen, 2000).
The historical reasons for the blurring of these concepts have been reviewed by Fisher et al (2005) and Comber et al (submitted). In brief, they relate to the different mapping needs of the different agencies involved and to the ability to statistically process digital remotely sensed data. The recording of land cover is a relatively recent phenomenon. Historically the overriding trend has been for land use to be recorded, as documented by Fisher et al (2005), who note that until the 1960s there was little evidence of land cover being recorded. The availability of digital satellite data in the 1970s resulted in a shift away from demand driven classification (e.g. of aerial photography) to data driven statistical classifications of remotely sensed imagery. The classification techniques employed by those involved in recording land cover from remotely sensed imagery have become dependent on and determined by the imagery available to them (Comber, 2002). For these reasons, some workers have identified the need to step back from such data driven approaches to the manipulation of remote sensing data. For example, Skelsey et al (2003) propose and implement Task Oriented approaches to the classification of remote sensing data that focus on the task in hand rather than on the specifications and characteristics of the data.
Confusing the concepts of land cover and land use hampers data integration and modelling activities such as evaluating the impacts of climate change and the interaction between terrestrial and atmospheric environments. Land cover is needed for the development of physical environmental models and land use is required for policy and planning purposes. The IGBP have called for the explicit separation of the concepts of land use and land cover in order to facilitate such modelling activities (GLP, 2005).
Comber et al (submitted) noted that the concepts of land use and land cover need to be separated in order to foster a culture of consistency in land data recording. Similarly, Brown et al (2000) argue for land cover to be separated from land use in order to better link socioeconomic changes with observed changes in land cover. As a step in this direction, this paper proposes a method to separate land use and land cover in existing data, based on the concept of data primitives.
Land cover and land use have very different characteristics. For instance, there may be many different simultaneous or alternate land uses at any given place, whereas the classification of land cover records the physical material on the surface of the earth and is static. This means that the relationship between land use and land cover cannot be directly inferred, as they have a complex many-to-many relationship. As models require one or the other, the separation of land cover and land use will facilitate activities that require either as an input. In order to integrate data into models, measures of semantic and conceptual overlap between the data and the intended application are needed (see Wadsworth et al this volume). As Wastfelt (2005) comments “if improved technologies such as remote sensing and GIS are to become more useful to society, they need new strategies for interpreting their source material that will bridge the wide epistemological gap between physical and social-scientific understandings of space” (p398).
The problem is that data are commonly collected for a specific purpose. The classes and concepts embedded in the data reflect the interests of those involved in the commissioning of the data (e.g. the steering group). The epistemology of data collection and measurement and the ontology of data classification will be specific to those interests (Comber et al., 2003).
Other users of the data therefore have to rework the data in some way to fit the objectives and concepts of their analyses or their data. The work presented in this paper makes an attempt to do this by separating the concepts of land use and land cover using scores in different data primitive dimensions.
2. Background
Land use and land cover are fundamentally different concepts. Their conflation is convenient and commonplace but illogical, for the following reasons.
First, the cover that exists at a given place is a single land class (notwithstanding work that seeks to record the heterogeneity of cover, such as fuzzy set theory). This is because land cover is defined by what is observed at any given point on the earth’s surface. Whilst there may be variation in precisely what is observed due to different data, sensors, classification algorithms and operators, the phenomenon of land cover is agreed to be a single phenomenon at any given point in time.
Second, the land use that exists at any given place is likely to be multidimensional. This is because land uses are defined by human activity on the land, which may be single, simultaneous or alternate. For example, Fisher et al (2005) note that a single patch of plantation forestry may also be used simultaneously for several forms of recreation (hunting, hiking) and for grazing as well as timber production, and that the land uses may alternate (grazing, hunting). Some land uses also vary seasonally: a reservoir provides flood control in the spring, hydro-electric power in the winter, fishing in season and boating all year round. Jansen and Di Gregorio (2003) comment that land use is influenced by cultural factors such as agricultural practices, with the result that different land uses are practised on the same type of land in different areas.
Third, land cover and land use classes are not directly compatible.
They rarely have a one-to-one relationship and more commonly have a one-to-many or many-to-many relationship. For example, the cover type ‘Grass’ may occur in a number of uses (sports grounds, urban parks, residential land, pasture, etc). Similarly the use type ‘Residential’ may be composed of many covers including trees, grass, buildings, and asphalt. Jansen and Di Gregorio (2003) comment that relations between land cover and land use vary between rural and urban contexts, noting that in rural contexts (composed mainly of agricultural and forest land uses) direct one-to-one relations between cover and use are more likely, whereas in urban contexts (i.e. where there are more people) there are fewer one-to-one relations.
Fourth, relationships between land cover and land use may change depending on the scale or level of analysis at which they are observed, due to the complexity of the links between socio-economic and environmental systems (Veldkamp and Lambin, 2001; Monroe, 2007). Veldkamp et al (2001) noted that interactions between land use and land cover at different scales can create local, spatially dependent processes.
2.1 Classification
Classification is the process by which objects or individuals are allocated to classes or categories. Many scientific endeavours are based on the need to simplify the real world into some ordered aggregation, and classification provides a method to do this. The assumptions that underpin classification are that objects of the same class can be treated as a single phenomenon and that generalisations can be made about their behaviour, characteristics or attributes. The grouping provided by classification (or stratification) therefore allows scientists to analyse variations in behaviour between and within groups. In this section, the objectives and characteristics of classification are reviewed in light of the classification of remotely sensed imagery.
The classification of remote sensing data commonly employs one of two approaches: identifying regions with similar characteristics or matching regions to pre-defined prototypes. In the first approach only the number of classes is predefined. The characteristics of the classes are not. An iterative statistical process considers the digital numbers of the image objects (pixels or segments) in the N layers of the image data. It identifies statistical clusters in this N-dimensional space and allocates each image object to the class to which it is nearest under some criterion of distance in this N-dimensional feature space. The clusters represent classes based on spectral similarity. In this approach it is assumed that the number of classes specified matches the number of spectral classes, that the selection of image bands captures variation between objects on the ground and that the spectral clusters represent land classes of interest. This unsupervised classification technique can be used in situations where little ground based data or few samples are available.
In the second approach image objects such as segmented regions or pixels are compared with prototypes (also known as exemplars, training data or sample data) of different classes. The prototypes, or training regions, represent categories of information and serve as abstractions of the most typical or central members of that category. The image object is allocated to the class to which it is nearest under some criterion: distance in spectral space, fulfilment of some condition, or probability calculated from a set of beliefs. Brown (1998) notes that this approach has traditionally been used in vegetation mapping, where vegetation stands are represented as discrete spatial units and allocated to one of a number of predefined categories.
There is an obvious relationship between this discrete model of classification (Brown, 1998) and the allocation of homogeneous areas of vegetation into specific categories based on their characteristics. In this context a prototype for a class can be seen as a theoretical cognitive structure (Lloyd, 1994). A number of studies have considered the encoding of cognitive structures as semantic and spatial reference points (Rosch 1975a; 1975b), family resemblance (Rosch and Mervis 1975), and basic-level categories in a hierarchy (Rosch et al. 1976; Tversky and Hemenway 1983). There are known issues relating to the use of prototypes in classifying remote sensing data, including the number of samples used to define the prototype, the selection of the image bands used to differentiate between the land features of interest, and the number of classes, all of which produce variation in the mapped outcome. This supervised classification technique is the most commonly used generic method for classifying remote sensing data as it allows the operator to have a degree of control over the classes that are created. Comber et al (2005a) provide a wider discussion of classification theories in relation to assigning image objects to classes or categories.
It is instructive to reflect on the implications of classifications for mapping land cover and land use from remote sensing data. The process of classification allocates individuals, in this case image objects, uniquely to one class based on their characteristics, predefined (supervised) or not (unsupervised). In supervised classification, the class assigned to each individual is that of the ‘closest’ prototype, where the closeness is usually defined by some measure of distance in the N-dimensional image space.
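The 'closest prototype' rule described above can be illustrated with a minimal sketch. It assumes Euclidean distance as the measure of closeness, which is only one of the criteria mentioned; the class names, band values and the function `nearest_prototype` are hypothetical and are not taken from any dataset discussed in this paper.

```python
import math

def nearest_prototype(pixel, prototypes):
    """Return the class whose prototype is closest to the pixel in feature space."""
    # math.dist gives the Euclidean distance between two equal-length points.
    return min(prototypes, key=lambda name: math.dist(pixel, prototypes[name]))

# Hypothetical prototypes: typical values in two image bands for each class.
prototypes = {
    "water": (0.05, 0.02),
    "forest": (0.04, 0.45),
    "urban": (0.30, 0.25),
}

# Hypothetical image objects (pixels) described by the same two bands.
pixels = [(0.06, 0.03), (0.05, 0.40), (0.28, 0.22)]

# Each pixel receives exactly one label: a crisp allocation.
labels = [nearest_prototype(p, prototypes) for p in pixels]
print(labels)
```

Each image object is forced into exactly one class, however far it lies from every prototype, which is the crisp behaviour that fuzzy approaches seek to relax.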
Since the nineteenth century most land surveys have used such approaches, often with a taxonomic hierarchy, as the basis for mapping, with the objective of defining relations between mapped objects in order to understand them better. Thus conventional land mapping defines classes, identifies areas of land to which those descriptions can be applied and produces a crisp choropleth map of spatially discrete mapped areas without gaps or overlaps. Recent developments such as fuzzy set theory (Wang, 1990) acknowledge the shortcomings of crisp classifications. In fuzzy classifications membership functions are generated for each class for each image object, usually based on the distance in N-dimensional space of that object to the centre of the category cluster.
In summary, classification is the process of simplifying and ordering the real world. In remote sensing, unsupervised classification approaches are ‘bottom up’ and the closeness of objects to each other is calculated through statistical clustering algorithms or by matching characteristics to classes. Supervised approaches to classifying remote sensing data are ‘top down’ and match image objects with predefined prototypes. In each case, objects are allocated a class which itself has a position within a wider taxonomy of land.
2.2 Mapping land cover and land use
Bibby and Shepherd (1999) comment that land use objects “are best regarded as objects by convention, that is, they are objects by virtue of the fact that they are held to be so” and that such objects are “grounded in discourse and projected onto the physical world” (p584). Conversely, land cover is concerned with pre-existing physical matter. The authors make two salient points: that land-use categories lack an intrinsic relation to physical matter, and that they form various hierarchies whose structures reflect different economic and social organizations.
The implications for mapping use from remotely sensed imagery are that the same land use can be described at many different levels and that membership of a land use in one category does not preclude membership of another. Monroe (2007) also noted that the allocation of a particular land use class is not an objective, observable process: use categories are allocated for other reasons such as institutional objectives, maximising profit or production factors. Heoschele (2000) observed subsistence farmers being disadvantaged in the recording of land use compared to large land owners who are more interested in the forestry. Thus the choice of land-use class is not transparent and the specific circumstances of this choice are not directly measurable (Anselin, 2002).
Remote sensing captures the spectral reflectance from the earth’s surface. The classification of remote sensing data necessarily allocates image objects into land cover classes based on reflectance values. A second stage of interpretation, often requiring ancillary data or local knowledge, is needed to infer land use. In this context the classification of data on the reflectance of the earth’s surface into land use categories is an unscientific sleight of hand. As Monroe (2007) notes, “Empirically, the discrete representation of land use is often proxied by a discrete representation of land cover” (p522).
A number of authors have highlighted the problem of confusing land use and land cover in the classification of remotely sensed imagery. Brown et al (2000) observe that the phenomena of land use and land cover are linked to one another. However, changes observed in remotely sensed imagery may not relate to changes in socio-economic conditions, and the two therefore need to be mapped as separate processes for theoretical and practical reasons. Barnsley and Barr (2001) comment that spectral radiance values recorded in remotely sensed data are only indirectly related to the attributes and dimensions of land use.
A number of workers have sought to tackle the problems of identifying land use from remotely sensed data using secondary techniques based on the spatial configuration of land cover elements. Barr and Barnsley (2004) proposed an approach based on the morphological properties of buildings to identify a range of land use categories. They concluded that different types of urban land use may be identified through analysis of the spatial disposition of their constituent land cover parcels and suggest that a quantifiable mapping exists between urban form (land cover) and urban function (land use). Herold et al (2002) used landscape metrics to describe urban land-use structures and land-cover changes. Their results showed that different urban land-use types could be identified and land use changes quantified. Jansen and Di Gregorio (2003) identified agricultural production systems based on analysis of field patterns together with the presence and type of built-up structures: a regular shape of land cover polygons indicates commercial production systems, and irregular forms may indicate protective and conservation uses. They note that relations between land cover and land use are more complicated in the case of forestry, as the presence of tree stands may not indicate their use(s), and that the relation between land cover in built-up areas and land use is extremely weak.
In a more local study Harrison (2006) has developed a classification system that explicitly separates land cover and land use. The objectives of his work were to promote a coordinated and consistent approach to data recording across government sectors in the UK. The clear separation of land use and land cover allows the requirements of different user communities to be explicitly supported without having to re-work land cover data to land use for policy and planning purposes and land use data to land cover for environmental objectives.
This explicit distinction between use and cover facilitates analysis of relationships between the drivers and patterns of land change.
2.3 Data primitives for land cover and land use
Data primitives are here defined as those dimensions or measurements that describe the processes under investigation at the most fundamental level. They provide information about the building blocks that underpin the concepts of the phenomenon: what they mean and what they represent. The objective in describing data and land concepts using primitive dimensions (variously referred to as “conceptual spaces”, “approximation spaces”, “domains”) is not to generate a hierarchical taxonomy (another classification). Rather, the approach seeks to generate descriptions of different data features that allow the amount of overlap between them to be quantified.
A number of workers have used data primitives to facilitate the integration of data with different ontologies and epistemologies. Ahlqvist (2004) uses conceptual overlap in four dimensions (or primitives) to describe the classes of a simple land taxonomy with 2 agricultural and 2 forest land classes. Each class was given a value in each dimension, allowing the amount of overlap between different classes to be quantified. Wadsworth et al (this volume) adapted and extended this approach, defining 5 domains in order to quantify the overlap between 3 Siberian land cover datasets from 3 classifications (IGBP classification of 1km AVHRR, GLC based on 1km AVHRR and SUC based on MODIS 500 meter). The domains were specific to the objectives of the Siberia study: photosynthetic activity / biomass accumulation, wetness, human disturbance, seasonality / phenology and vegetation height.
The FAO Land Cover Classification System (LCCS) developed by Di Gregorio and Jansen (2000) provides a method for integrating land cover based on a 2-phase process. In the first phase, land cover classes are allocated to one of eight major land cover categories.
In the second phase a further set of classifiers is used to refine the class description based on environmental attributes (e.g. climate, land form, soils / lithology and erosion) and specific technical attributes (e.g. floristic composition, crop type and soil type). One of the criticisms of the LCCS is that the classifiers and categories are fixed and Boolean. For example LCCS classifiers describe forest height to be 2-7m (B1), >3m (B2), >14m (B5), 7m-15m (B6) and 3m-7m (B7). The application of these sub-classes can result in ambiguities and can create second order uncertainties when data are being described using the LCCS. For these reasons Ahlqvist (2007) proposed modifications to LCCS and suggested that LCCS-style reference systems should define the dimensions and unit of measurement for quantitative data attributes in order to allow users freedom to define any threshold values in that dimension.
3. Method
The US National Land Cover Dataset 2001 for the state of Connecticut and surrounds was downloaded from the Multi-Resolution Land Characteristics Consortium hosted by the USGS (see footnote 1). A description of the NLCD project and methodology can be found in Homer et al (2004). The dataset records 29 land types, with a mixture of cover and use classes, classified from composites of Landsat satellite imagery and modified from the Anderson et al (1976) classification.
A series of primitives were identified from other work that has sought to harmonise classifications, including the FAO Land Cover Classification System (Di Gregorio and Jansen, 2000), the conceptual overlaps of Wadsworth et al (this volume) and Ahlqvist (2004), and Wyatt and Gerrard (2001). In setting the criteria the aim was to capture the range of alternatives and information required by applications using land cover or land use data. The objective was to define those parameters that characterise the important features of land use and land cover.
The selection of primitives will be returned to in the discussion.
Land use studies are concerned with the nature and degree of human activity and there is much interest in land use classifications that capture the diversity of activity in both urban and non-urban environments. In urban contexts the modelling and research interest is in being able to identify the economic value or social value/appreciation of activities as well as the nature of that activity itself. Often in agricultural contexts the aim is to capture land use information that relates to food production. The elements that land cover classifications seek to report on are the physical properties of the surface. These may be related to the naturalness of the surface or human activity (impervious surfaces and the like) or may be related to the nature of the vegetation present. Many land cover classes relating to vegetation define it in relation to its structure. Often this is in terms of percentage cover, vegetation height, seasonality, and the prevailing environmental conditions, often wetness. As a result of this review the following dimensions or primitives were identified for application to a land cover / land use dataset (NB where a primitive has units they are listed; otherwise the primitive is an index):
1. Naturalness: the extent to which the class was a naturally occurring feature or was directly the result of anthropogenic activity i.e. the cover primitive;
2. Vegetation height: the minimum height in metres of the vegetation;
3. Vegetation canopy coverage: the minimum percentage of vegetation coverage;
4. Homogeneity of appearance;
5. Seasonality: the extent to which the class is seasonal or perennial;
6. Structure: the complexity of vegetation structure;
7. Wetness: the dependency on specific wetness conditions (e.g. soil, growing medium, climate);
8. Biomass production: relating to the amount of energy fixed through photosynthesis by the class;
9. Human activity: the amount of human related activity in the class;
10. Human disturbance: the extent to which the existence and nature of this class reflect anthropogenic activity;
11. Economic value: the importance economically of this class, i.e. how much money can be earned or how much it is worth;
12. Production of crop related food;
13. Production of animal related food;
14. Artificiality: the extent to which the surface has been artificially created.
Footnote 1: http://gisdata.usgs.net/website/MRLC/viewer.php
Each class was given a score in each primitive of between 1 (least) and 9 (most) if the class was thought to have some properties in that primitive. These were allocated based on examination of the class definitions in Anderson et al (1976) and Homer et al (2004). If the class was thought not to have any attributes in that primitive then no score was allocated. The allocation of the primitive scores is subjective as it was done by a data user with experience of other land cover classifications, but with no direct experience of the USGS NLCD classification. The scores for the different classes in these 14 dimensions are shown in Table 1. The scores were applied to the data so that each NLCD class was attributed with a score in each of the 14 primitives.
(Insert Table 1 about here)
4. Results
The objective was to apply the primitive scores to the NLCD classes and to test the separability of land use and land cover elements in the NLCD data. The primitives were allocated as being primarily related to ‘Use’ or ‘Cover’ and overall scores for each NLCD class were calculated by normalising use and cover average scores. The primitives allocated to the Cover group were numbers 1) to 7) in the list above and numbers 8) to 14) were allocated to Use. The mean scores for the 2 groups were then normalised using a cumulative distribution function:
(Equation 1)
where x is the original score, µ the group mean and σ the group standard deviation.
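As a hedged illustration of this normalisation step, the sketch below assumes that Equation 1 is the cumulative distribution function of a normal distribution with mean µ and standard deviation σ, that σ is the sample standard deviation across classes, and that a class with no scored primitives in a group receives a mean of 0. None of these choices is confirmed by the text; the class names and scores are hypothetical and are not values from Table 1.

```python
from statistics import NormalDist, mean, stdev

def class_mean(scores):
    """Mean of the primitives that received a score; None marks 'no score allocated'."""
    scored = [s for s in scores if s is not None]
    # Assumption: a class with no scored primitives in this group gets 0.
    return mean(scored) if scored else 0.0

def normalise(group_means):
    """Map each class mean to [0, 1] with a normal CDF (assumed form of Equation 1)."""
    values = list(group_means.values())
    mu = mean(values)          # group mean
    sigma = stdev(values)      # group (sample) standard deviation, an assumption
    dist = NormalDist(mu, sigma)
    return {name: dist.cdf(x) for name, x in group_means.items()}

# Hypothetical scores (1 to 9) in three Cover primitives for three classes.
cover_scores = {
    "Class A": [2, None, 2],
    "Class B": [5, 5, None],
    "Class C": [8, 8, 8],
}

cover_means = {name: class_mean(s) for name, s in cover_scores.items()}
cover_norm = normalise(cover_means)
for name, value in cover_norm.items():
    print(f"{name}: {value:.3f}")
```

Under these assumptions a class whose mean equals the group mean receives a normalised score of 0.5, so the 50th percentile threshold used in the results corresponds to the group mean.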
The Use and Cover scores for the different NLCD classes generated in this way are shown in Table 2.
(Insert Table 2 about here)
The normalised scores were used to identify the classes with a high degree of use and / or cover. The classes with high Cover scores above the 50th percentile are: Open Water; Perennial Ice / Snow; Developed High Intensity; Barren Land; Deciduous Forest; Evergreen Forest; Mixed Forest; Shrub / Scrub; Orchards / Vineyards / Other; Lichens; Moss; Woody Wetlands; Palustrine Forested Wetland; and Palustrine Scrub / Shrub Wetland. The classes with high Use scores above the 50th percentile are: Developed Open Space; Developed, Low Intensity; Developed, Medium Intensity; Developed High Intensity; Dwarf Scrub; Shrub / Scrub; Orchards / Vineyards / Other; Pasture / Hay; Cultivated Crops; Urban / Recreational Grasses; Woody Wetlands; Palustrine Forested Wetland; and Palustrine Scrub / Shrub Wetland.
Many of the classes identified as being strongly Cover or strongly Use are perhaps unsurprising and may have been identifiable in advance as belonging to a general use or cover category without calculating a score from the 14 data primitives. Others are unexpected, especially the strong use scores for the woody wetland and scrub classes. These high use scores resulted from Biomass production being the sole use primitive given a score for these classes. The classes Developed High Intensity; Unconsolidated Shore; Transitional; Dwarf Scrub; Shrub / Scrub; Orchards / Vineyards / Other; Grassland / Herbaceous; Sedge / Herbaceous; Woody Wetlands; Palustrine Forested Wetland; and Palustrine Scrub / Shrub Wetland all had similar Use and Cover scores. This provides an indication of the extent to which they may be ambiguously defined.
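The screening described above can be sketched as follows. The sketch takes the 50th percentile to be the median of the normalised scores and treats Use and Cover scores as 'similar' when they differ by no more than a tolerance; the tolerance of 0.1, the function `label_classes` and all scores shown are hypothetical illustrations rather than the values in Table 2.

```python
from statistics import median

def label_classes(cover, use, tolerance=0.1):
    """Flag classes with high cover, high use, and similar use and cover scores."""
    cover_cut = median(cover.values())   # 50th percentile of Cover scores
    use_cut = median(use.values())       # 50th percentile of Use scores
    labels = {}
    for name in cover:
        labels[name] = {
            "high_cover": cover[name] > cover_cut,
            "high_use": use[name] > use_cut,
            # Hypothetical rule: similar scores suggest an ambiguous definition.
            "similar": abs(cover[name] - use[name]) <= tolerance,
        }
    return labels

# Hypothetical normalised scores for four classes.
cover = {"A": 0.9, "B": 0.2, "C": 0.5, "D": 0.6}
use = {"A": 0.1, "B": 0.8, "C": 0.55, "D": 0.3}

labels = label_classes(cover, use)
for name, flags in labels.items():
    print(name, flags)
```

A different tolerance, or a rank-based comparison, would flag a different set of ambiguous classes, so the choice should follow the intended application.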
If the classes with high use scores based on Biomass production are excluded, a set of highly mixed classes is identified.
Developed High Intensity has strong cover attributes as it is highly homogeneous (spectrally and in terms of the land cover) as well as having a degree of use.
Unconsolidated Shore is defined as being composed of “unconsolidated material such as silt, sand, or gravel that is subject to inundation and redistribution due to the action of water” and is “characterized by substrates lacking vegetation except for pioneering plants that become established”. It has weak use and cover scores, with little anthropogenic use and little homogeneity or structure of cover.
The Transitional class has a high degree of both use and cover: “Areas of sparse vegetative cover (less than 25 percent of cover) that are dynamically changing from one land cover to another, often because of land use activities. Examples include forest clearcuts, a transition phase between forest and agricultural land, the temporary clearing of vegetation, and changes due to natural causes (e.g. fire, flood, etc.).”
Orchards / Vineyards / Other has a short NLCD description grounded in use (“Orchards, vineyards, and other areas planted or maintained for the production of fruits, nuts, berries, or ornamentals”), but also has a distinct cover dimension: orchards are not forest (>5m) but could be included under Shrub / Scrub (“Areas dominated by shrubs; less than 5 meters tall with shrub canopy typically greater than 20% of total vegetation. This class includes true shrubs, young trees in an early successional stage or trees stunted from environmental conditions.”).
Grassland / Herbaceous has a mix of cover and use, reflected in the definition: “Areas dominated by grammanoid or herbaceous vegetation, generally greater than 80% of total vegetation. These areas are not subject to intensive management such as tilling, but can be utilized for grazing”.
The scores can be applied as weights to the NLCD to visualise the distribution of confusions between use and cover. NLCD data for Connecticut was downloaded from the USGS website. The scores from the above tables were joined to the NLCD classes. The weights as applied to the different classes were interpreted as beliefs, then areas of differential belief in land use, land cover and land use with land cover were identified. Figure 1 a-e shows the NLCD data, weighted land cover and land use, areas where both land use and land cover weights were high (>50th percentile of weights) and areas where they were both low (<50th percentile).
(Insert Figure 1 about here)
5. Discussion and Conclusions
This paper has presented a method for separating the concepts of land use and land cover that are embedded in most land datasets. There is a need for their separation to support modelling activities (GLP, 2005), to better link observed changes in the earth’s surface with socio-economic processes (Brown et al, 2000) and to foster a culture of consistency in land survey reporting (Comber et al. submitted). The method presented in this paper has applied a set of data primitives to an existing land dataset and sought to capture the essence of both land cover and land use. The aim was to develop and illustrate a generic process by which the concepts of land cover and land use could be separated relative to the task in hand. The approach applied data primitive scores to land classes in a dataset where the concepts of land cover and land use were confused, the USGS National Land Cover Dataset, in order to generate measures of the degree of land use and the degree of land cover for each class.
The results of this analysis are necessarily subjective for a number of reasons. First, fourteen primitives were identified by an expert user as being representative of the essential elements of land cover and land use. Other primitives may be more important for specific applications.
For instance, many of the LCCS classifiers relate to spatial context. Possible spatial primitives include patch size, landscape ecology indices, relationships with adjacent classes and bio-geographic context. Second, some of the primitives are not orthogonal to each other and may essentially record the same thing. For example, Naturalness and Artificiality may be the inverse of each other, and Human activity and Economic value may be describing the same processes. However, the objective of this work was to explore an approach for separating use and cover as demanded by scientists and modellers. Third, the expert allocated scores based on their limited experience of NLCD and the Anderson et al. (1976) classification. Other work has shown that different experts have varying opinions of how landscape features relate to data (Comber et al., 2005b). Fourth, each of the primitives was related to either Use or Cover. For some applications, the specific combination of primitives may vary. For example, Biomass production may be directly related to vegetation cover or to the land use of the area. However, the analysis also shows that it is possible to identify classes which are closely related to the surface cover and those which are related to the activity on that surface. It is also possible to identify those classes which have similar degrees of use and cover from the perspective of the interpreted primitives. More importantly, using the primitives it is possible to generate weights that relate to the degree of land cover or land use and to the intended application (e.g. data integration). For example, data users would be able to reclassify the classes of an existing database based on their understanding of the essential data primitives related to their application, identifying relevant land use categories. Equally, other applications of primitives could be used to generate maps that highlight uncertain areas in terms of their use or cover classification.
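The reclassification described here, in which class-level weights replace the class codes of an existing dataset, can be sketched as a simple lookup. The sketch below is illustrative only and is not the workflow used to produce Figure 1: the function name `reclassify`, the tiny grid of NLCD codes and the choice of four classes are assumptions made for the example, while the weights are the Cover and Use scores reported for those classes in Table 2.

```python
# Hypothetical lookup: NLCD code -> (cover weight, use weight).
# The four rows are taken from Table 2; a real application would hold all classes.
WEIGHTS = {
    11: (0.862, 0.058),  # Open Water
    24: (0.976, 0.963),  # Developed, High Intensity
    41: (0.866, 0.439),  # Deciduous Forest
    82: (0.229, 0.879),  # Cultivated Crops
}

COVER, USE = 0, 1  # position of each weight in the tuples above


def reclassify(grid, weights, band):
    """Replace every class code in a 2-D grid with one of its weights.

    Codes missing from the lookup become None, so gaps stay visible
    instead of being silently scored as zero.
    """
    return [
        [weights[code][band] if code in weights else None for code in row]
        for row in grid
    ]


# Illustrative 2 x 3 patch of NLCD codes (not real Connecticut data).
grid = [
    [11, 11, 41],
    [24, 82, 41],
]

cover_map = reclassify(grid, WEIGHTS, COVER)
use_map = reclassify(grid, WEIGHTS, USE)
```

A user with a different application in mind would only need to swap the lookup for weights derived from their own choice of primitives; the grid itself is left unchanged.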
An investigation of spatial patterns would reveal the extent of any spatial autocorrelation aside from the possible autocorrelation in the selected data primitives, as is evident from the clustering in Figures 1d and 1e. In this work the weights were generated from the perspective of an expert; for other applications alternative weights may be generated from different expert perspectives depending on the task in hand. The use of experts is problematic: opinion between experts varies, they change their minds and may not give consistent opinions, their reasoning is not always transparent and their time is scarce. For these reasons a number of workers have explored the use of text mining approaches to determine the semantic relations between spatial data concepts. Comber et al. (submitted) and Wadsworth et al. (in prep) have used text mining with measures to weight the importance and overlap of each term. In conclusion, this paper has argued that the concepts of land cover and land use should be separated and has presented a method based on the application of data primitives to do this. It is acknowledged that some of the dimensions used in this work may be dependent rather than orthogonal: they may overlap, and they may be redundant for some workers. However, the aim of this work was to illustrate how such an approach could be applied and used rather than to propose a definitive separation of use and cover. The separation of land cover and land use facilitates data integration and modelling activities and allows existing data resources to be better utilised. It also allows the separation of the concepts of form (land cover) and function (land use), which underpins much research in, for example, urban planning, monitoring of resources and climate change modelling.
The fourteen data primitives suggested in this work are not intended to be definitive but to act as an illustration of how capturing the essence of use and cover offers the potential to separate these two concepts in existing data. The advantage (and possibly disadvantage) of the proposed methodology is that it can be applied to any existing data set by any worker. It is hoped that this will act as a starting point and promote discussion within the land use and land cover communities about the nature of the primitives that are relevant to different application areas and the need to maintain land use and land cover as distinct concepts in data collection activities.

References

AHLQVIST, O., 2004. A parameterized representation of uncertain conceptual spaces. Transactions in GIS, 8(4): 493-514.
AHLQVIST, O., 2007. In search for classification that support the dynamics of science – The FAO Land Cover Classification System and proposed modifications, paper to be published in Environment and Planning B, advance online publication, DOI:10.1068/b3344.
ANDERSON, J.R., HARDY, E.E., ROACH, J.T. and WITMER, R.E., 1976. A Land Use and Land Cover Classification System for Use with Remote Sensor Data. U.S. Geological Survey, Professional Paper 964, p 28, Reston, VA.
ANSELIN, L., 2002. Under the hood - Issues in the specification and interpretation of spatial regression models. Agricultural Economics, 27(3): 247-267.
BARNSLEY, M. and BARR, S., 2001. Monitoring urban land use by Earth observation. Surveys in Geophysics, 21: 269-289.
BARR, S.L. and BARNSLEY, M.J., 2004. On the separability of urban land-use categories in fine spatial scale land-cover data using structural pattern recognition. Environment and Planning B: Planning and Design, 31: 397-418.
BIBBY, P. and SHEPHERD, J., 1999. GIS, land use, and representation. Environment and Planning B: Planning and Design, 27: 583-598.
BROWN, D.G., 1998. Classification and boundary vagueness in mapping presettlement forest types. International Journal of Geographical Information Science, 12(2): 105-129.
BROWN, D.G., PIJANOWSKI, B.C. and DUH, J.D., 2000. Modeling the relationships between land use and land cover on private lands in the Upper Midwest, USA. Journal of Environmental Management, 59(4): 247-263.
COMBER, A.J., FISHER, P.F. and WADSWORTH, R.A., 2005b. Combining expert relations of how land cover ontologies relate. International Journal of Applied Earth Observation and Geoinformation, 7(3): 163-182.
COMBER, A.J., FISHER, P. and WADSWORTH, R. (submitted). Using semantics to clarify the conceptual confusion between land cover and land use: the example of ‘forest’. Paper submitted to Journal of Land Use Science.
COMBER, A., FISHER, P. and WADSWORTH, R., 2003. Actor Network Theory: a suitable framework to understand how land cover mapping projects develop? Land Use Policy, 20: 299-309.
COMBER, A.J., FISHER, P.F. and WADSWORTH, R.A., 2005a. What is land cover? Environment and Planning B: Planning and Design, 32: 199-209.
COMBER, A.J., 2002. Automated land cover change detection. PhD thesis, Aberdeen University.
DI GREGORIO, A. and JANSEN, L.J.M., 2000. Land Cover Classification System: Classification Concepts and User Manual. Rome, FAO.
FISHER, P.F., COMBER, A.J. and WADSWORTH, R.A., 2005. Land use and Land cover: Contradiction or Complement. Pp. 85-98 in Re-Presenting GIS (eds. Peter Fisher, David Unwin), Wiley, Chichester.
GLP, 2005. Science Plan and Implementation Strategy. IGBP Report No. 53/IHDP Report No. 19. IGBP Secretariat, Stockholm. 64pp.
HARRISON, A., 2006. National Land Use Database: Land Use and Land Cover Classification. ODPM, London.
HOESCHELE, W., 2000. Geographic Information Engineering and Social Ground Truth in Attappadi, Kerala State, India. Annals of the Association of American Geographers, 90(2): 293-321.
HEROLD, M., SCEPAN, J. and CLARKE, K.C., 2002. The use of remote sensing and landscape metrics to describe structures and changes in urban land uses. Environment and Planning A, 34(8): 1443-1458.
HOMER, C., HUANG, C.Q., YANG, L.M., WYLIE, B. and COAN, M., 2004. Development of a 2001 National Land-Cover Database for the United States. Photogrammetric Engineering and Remote Sensing, 70(7): 829-840.
JANSEN, L.M. and DI GREGORIO, A., 2003. Land-use data collection using the “land cover classification system”: results from a case study in Kenya. Land Use Policy, 20: 131-148.
LLOYD, R., 1994. Learning Spatial Prototypes. Annals of the Association of American Geographers, 84(31): 418-440.
MONROE, D.K. and MULLER, D., 2007. Issues in spatially explicit statistical landuse/cover change (LUCC) models: Examples from western Honduras and the Central Highlands of Vietnam. Land Use Policy, 24: 521-530.
ROSCH, E., 1975a. Cognitive Representations of Semantic Categories. Journal of Experimental Psychology: General, 104: 192-233.
ROSCH, E., 1975b. Cognitive Reference Points. Cognitive Psychology, 7: 532-547.
ROSCH, E. and MERVIS, C., 1975. Family Resemblances, Studies in the Internal Structure of Categories. Cognitive Psychology, 7: 573-605.
ROSCH, E., MERVIS, C., GRAY, W., JOHNSON, D. and BOYES-BRIAN, P., 1976. Basic Objects in Natural Categories. Cognitive Psychology, 8: 382-439.
SKELSEY, C., LAW, A.N.R., WINTER, M. and LISHMAN, J.R., 2003. A system for monitoring land cover. International Journal of Remote Sensing, 24(23): 4853-4869.
TVERSKY, B. and HEMENWAY, K., 1983. Categories of Environmental Scenes. Cognitive Psychology, 15: 121-149.
VELDKAMP, A. and LAMBIN, E.F., 2001. Predicting land-use change. Agriculture Ecosystems and Environment, 85(1-3): 1-6.
VELDKAMP, A., VERBURG, P.H., KOK, K., DE KONING, G.H.J., PRIESS, J. and BERGSMA, A.R., 2001. The need for scale sensitive approaches in spatially explicit land use change modeling. Environmental Modeling and Assessment, 6(2): 111-121.
WADSWORTH, R., COMBER, A.J. and FISHER, P.F., in prep. Text Mining Applied to Class Descriptions Associated with Natural Resource Mappings. Paper to be submitted to International Journal of Geographical Information Science, November 2007.
WADSWORTH, R., BALZTER, H., GERRARD, F., GEORGE, C., COMBER, A.J. and FISHER, P.F. (this volume). An environmental assessment of land cover and land use change in Central Siberia using Quantified Conceptual Overlaps to reconcile inconsistent data sets. Journal of Land Use Science, this volume.
WANG, F., 1990. Fuzzy Supervised Classification of Remote-Sensing Images. IEEE Transactions on Geoscience and Remote Sensing, 28(2): 194-201.
WASTFELT, A., 2005. Satellite Images – A Source for Social Scientists? On Handling Multiple Conceptualisations of Space in Geographical Information Systems. In A.G. Cohn and D.M. Mark (Eds.): COSIT 2005, Lecture Notes in Computer Science 3693: 397-408.
WYATT, B.K. and GERARD, F.F., 2001. What’s in a Name? Approaches to the intercomparison of Land Use and Land Cover Classifications. In Strategic Landscape Monitoring for the Nordic Countries, G. Groom (Ed.), NORDLAM, Copenhagen.

List of Tables and Figures

Table 1. The primitive scores for the NLCD classes
Table 2. The normalised Use and Cover scores for the NLCD classes, values greater than the 50th percentile are highlighted
Figure 1.
a) NLCD data for Connecticut b) weighted land cover c) weighted land use d) land use and land cover with weights >50th percentile e) land use and land cover with weights <50th percentile

Table 1. The primitive scores for the NLCD classes. Columns: NLCD Code, Class Name, primitives 1 to 14; rows are the 25 classes listed in Table 2. The score grid could not be realigned from the source, so the values are retained below in their original order.

1 9 9 3 2 1 7 9 4 9 9 9 9 9 3 9 9 9 2 9 9 9 2 3 2 2 4 3 1 2 9 9 9 9 2 9 5 2 2 1 3 9 9 9 3 3 6 1 1 1 1 1 1 1 5 5 4 2 5 2 9 9 7
4 5 9 9 6 5 7 9 7 6 3 7 7 5 4 4 6 7 4 5 5 7 7 9 3 3 3 2 6 3 2 1 3 2 1 2 5 7 2 5 4 5 6 2 2 5 7 7 7 2 2 7 2 3 5 5 1 8 7 6 2 2 1 3 3 3
7 8 9 10 11 9 1 1 1 6 7 8 9 2 1 2 9 9 9 9 2 7 7 8 9 6 6 2 2 2 6 1 1 1 4 6 6 6 5 3 3 7 3 7 3 8 4 7 6 3 7 7 4 7 9 9 7 9 9 8 9 7 9 9 1 9 2 1 4 7 7 3 2 1 7 3 3 3 3 2 2 2 2 4 4 4 2 2 2 7 9 9 1 5 9 9 9 6 6 8 7 4 1 1 7 8 3 7 6 6
12 13 14 1 6 7 8 9 2

NLCD Code  Class Name                         Cover   Use
11         Open Water                         0.862   0.058
12         Perennial Ice/Snow                 0.976   0.048
21         Developed, Open Space              0.204   0.719
22         Developed, Low Intensity           0.105   0.747
23         Developed, Medium Intensity        0.068   0.799
24         Developed, High Intensity          0.976   0.963
31         Barren Land                        0.832   0.212
32         Unconsolidated Shore               0.376   0.048
33         Transitional                       0.39    0.451
41         Deciduous Forest                   0.866   0.439
42         Evergreen Forest                   0.771   0.439
43         Mixed Forest                       0.793   0.439
51         Dwarf Scrub                        0.237   0.689
52         Shrub/Scrub                        0.447   0.689
61         Orchards/Vineyards/Other           0.544   0.788
71         Grassland/Herbaceous               0.11    0.5
72         Sedge/Herbaceous                   0.215   0.356
73         Lichens                            0.426   0.048
74         Moss                               0.426   0.048
81         Pasture/Hay                        0.16    0.852
82         Cultivated Crops                   0.229   0.879
85         Urban/Recreational Grasses         0.097   0.574
90         Woody Wetlands                     0.725   0.822
91         Palustrine Forested Wetland        0.749   0.689
92         Palustrine Scrub / Shrub Wetland   0.647   0.689

Table 2. The normalised Use and Cover scores for the NLCD classes, values greater than the 50th percentile are highlighted.

Figure 1. a) NLCD data for Connecticut b) weighted land cover c) weighted land use d) land use and land cover with weights >50th percentile e) land use and land cover with weights <50th percentile
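
The 50th-percentile thresholding behind Figures 1d and 1e can be illustrated at class level with the Table 2 scores. The following is a minimal sketch under stated assumptions rather than the procedure used for the maps: it takes the median of the 25 class scores in each column as the 50th percentile (the paper does not say whether the percentile was computed over classes or over mapped cells), it uses strict inequalities, and the function name `split_by_median` is hypothetical.

```python
from statistics import median

# NLCD code -> (Cover, Use), transcribed from Table 2.
SCORES = {
    11: (0.862, 0.058), 12: (0.976, 0.048), 21: (0.204, 0.719),
    22: (0.105, 0.747), 23: (0.068, 0.799), 24: (0.976, 0.963),
    31: (0.832, 0.212), 32: (0.376, 0.048), 33: (0.39, 0.451),
    41: (0.866, 0.439), 42: (0.771, 0.439), 43: (0.793, 0.439),
    51: (0.237, 0.689), 52: (0.447, 0.689), 61: (0.544, 0.788),
    71: (0.11, 0.5), 72: (0.215, 0.356), 73: (0.426, 0.048),
    74: (0.426, 0.048), 81: (0.16, 0.852), 82: (0.229, 0.879),
    85: (0.097, 0.574), 90: (0.725, 0.822), 91: (0.749, 0.689),
    92: (0.647, 0.689),
}


def split_by_median(scores):
    """Return the two medians and the codes that are high, or low, on both scores.

    Assumption: the 50th percentile is the median over classes, and a class
    sitting exactly on a median counts as neither high nor low.
    """
    cover_med = median(cover for cover, _ in scores.values())
    use_med = median(use for _, use in scores.values())
    both_high = sorted(
        code for code, (cover, use) in scores.items()
        if cover > cover_med and use > use_med
    )
    both_low = sorted(
        code for code, (cover, use) in scores.items()
        if cover < cover_med and use < use_med
    )
    return cover_med, use_med, both_high, both_low


cover_med, use_med, both_high, both_low = split_by_median(SCORES)
print("Cover median:", cover_med, "Use median:", use_med)
print("High on both:", both_high)
print("Low on both:", both_low)
```

Under these assumptions the classes high on both scores are candidates for the confusion shown in Figure 1d, and those low on both correspond to Figure 1e; a cell-based percentile computed over the mapped area could give a different split.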