Download Updating and re-developing structure, data bases and models for

Survey
yes no Was this document useful for you?
   Thank you for your participation!

* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project

Document related concepts

Microsoft Jet Database Engine wikipedia , lookup

Extensible Storage Engine wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Database wikipedia , lookup

Relational model wikipedia , lookup

Clusterpoint wikipedia , lookup

Database model wikipedia , lookup

Transcript
General enquiries on this form should be made to:
Defra, Science Directorate, Management Support and Finance Team,
Telephone No. 020 7238 1612
E-mail:
[email protected]
SID 5

Research Project Final Report
Note
In line with the Freedom of Information
Act 2000, Defra aims to place the results
of its completed research projects in the
public domain wherever possible. The
SID 5 (Research Project Final Report) is
designed to capture the information on
the results and outputs of Defra-funded
research in a format that is easily
publishable through the Defra website. A
SID 5 must be completed for all projects.
A SID 5A form must be completed where
a project is paid on a monthly basis or
against quarterly invoices. No SID 5A is
required where payments are made at
milestone points. When a SID 5A is
required, no SID 5 form will be accepted
without the accompanying SID 5A.


This form is in Word format and the
boxes may be expanded or reduced, as
appropriate.
ACCESS TO INFORMATION
The information collected on this form will
be stored electronically and may be sent
to any part of Defra, or to individual
researchers or organisations outside
Defra for the purposes of reviewing the
project. Defra may also disclose the
information to any outside organisation
acting as an agent authorised by Defra to
process final research reports on its
behalf. Defra intends to publish this form
on its website, unless there are strong
reasons not to, which fully comply with
exemptions under the Environmental
Information Regulations or the Freedom
of Information Act 2000.
Defra may be required to release
information, including personal data and
commercial information, on request under
the Environmental Information
Regulations or the Freedom of
Information Act 2000. However, Defra will
not permit any unwarranted breach of
confidentiality or act in contravention of
its obligations under the Data Protection
Act 1998. Defra or its appointed agents
may use the name, address or other
details on your form to contact you in
connection with occasional customer
research aimed at improving the
processes through which Defra works
with its contractors.
SID 5 (2/05)
Project identification
1.
Defra Project code
2.
Project title
NT2503
MAGPIE: Updating and re-developing structure,
databases and models for wider application
3.
Contractor
organisation(s)
ADAS Research
Werts Road
Wolverhampton
WV6 8TQ
54. Total Defra project costs
5. Project:
Page 1 of 24
£
194,944
start date ................
01/10/2001
end date .................
31 March 2005
6. It is Defra’s intention to publish this form.
Please confirm your agreement to do so. ................................................................................... YES
NO
(a) When preparing SID 5s contractors should bear in mind that Defra intends that they be made public. They
should be written in a clear and concise manner and represent a full account of the research project
which someone not closely associated with the project can follow.
Defra recognises that in a small minority of cases there may be information, such as intellectual property
or commercially confidential data, used in or generated by the research project, which should not be
disclosed. In these cases, such information should be detailed in a separate annex (not to be published)
so that the SID 5 can be placed in the public domain. Where it is impossible to complete the Final Report
without including references to any sensitive or confidential data, the information should be included and
section (b) completed. NB: only in exceptional circumstances will Defra expect contractors to give a "No"
answer.
In all cases, reasons for withholding information must be fully in line with exemptions under the
Environmental Information Regulations or the Freedom of Information Act 2000.
(b) If you have answered NO, please explain why the Final report should not be released into public domain
Executive Summary
7.
The executive summary must not exceed 2 sides in total of A4 and should be understandable to the
intelligent non-scientist. It should cover the main objectives, methods and findings of the research, together
with any other significant events and options for new work.
The original Magpie system (Modelling Agricultural Pollution and Interactions with the
Environment), was created in 1998 for Defra by ADAS for their use in support of nitrate policy
development. MAGPIE brought together agricultural and environmental data in a single GIS
system for England and Wales. The purpose of the original system was to provide input data for
catchment and national-scale modelling of nitrate leaching. Since the system was implemented
the range of potential applications has expanded to include other pollutants and policy questions.
Thus the main objective of the project was to extend and redevelop the Magpie system. The
new Magpie system provides the spatial and temporal data required by catchment or coarsescale models, and a framework on which new models can be attached relatively easily; thus a
valuable tool for catchment and national scale environmental/water policy development. The
main requirement of the new Magpie 2 system was both the restructuring of the database to
accommodate more complex models (e.g. of time series of water quality) and to provide the
framework from which models can be easily integrated into the system. A further aim of the
project was to make use of recent advances in proprietary software capabilities thus simplifying
maintenance and updating, and rendering the system more robust.
The Magpie 2 architecture adopts a 3-tier structure comprising
- a database tier, which holds the source data, or data for a given project area and provides a
generic structure for handling the data within the system
- a model tier, containing data relating to all models implemented on the system, and
-a browser tier.
The browser tier provides the graphical user interface to the system and defines the
programmers object library, which is used to interrogate the database for all user data
requirements and is integrated into all models.
The browser has been designed to make it easy for the user to examine and display the
underlying data for a catchment. This ability to explore spatial data and results was found to be
one of the most powerful and useful features of MAGPIE 1 when used for policy support.
The Magpie 2 database forms a crucial part of the system. Key datasets within Magpie 2
include, for each 1 km cell:
SID 5 (2/05)
Page 2 of 24
- the ADAS Landcover 2000, comprising an integration of data from remote sensing and other
data sets with the agricultural census for 2000. This contains also the crop areas and livestock
numbers within each cell;
- Equivalent land cover data for 1970, 1980, and 1995
- human population
- soils data from the NSRI NATMAP 1km2 National Soil map, with associated soil properties,
enhanced by ADAS to ensure there are no gaps and all properties required by the models are
present;
- long term average climate data from both CRU and UKCIP;
- ‘lookup table’ datasets required for agricultural pollution model, including data on manure
production and management by livestock type
- ‘’lookup table’ data sets required by the N model NEAP-N and the N and P Balance models.
- Spatial hydrological data including EA catchment boundaries and monitoring points;
- Contextual spatial data e.g. location of towns
These datasets are held within the Magpie 2 Core database in read-only status from which
copies are extracted as a subset into a Project database by the user as required for interrogation
and to run within their models.
Three models have been implemented within Magpie 2 to demonstrate its capability: an updated
version of NEAP-N; an N and P Balance model; and a water balance model estimating daily
values of drainage, runoff, potential evapotranspiration and soil moisture deficit.
The methodology provides a table structure and associated relationships to allow all types of
data (e.g. daily weather data from individual locations, annual spatial coverages, lookup data
such as soil horizon data) to be stored in one consistent way regardless of the data type held
within the dataset. The Programmers Object Library defines a standardised programmatic
‘interface’ detailing methods and properties that another programming languages can access
and use in a third party application. The Library provides functionality to allow the modeller to
interrogate the database for all data required by a model; save model results back into the
project database and launch the Magpie 2 browsers to view the model results graphically. This
approach means that each model does not have its own database querying engine, but a
generic one which can be used by all.
The Magpie 2 system contains the spatial datasets required by most annual catchment-scale
water quality models recently or currently being developed under Defra funding, including
PSYCHIC (P and sediment), PIT (P) and EVENFLOW (N); the NARSES model (ammonia);
models of pathogen emission from agriculture; and (with the addition of daily weather data) the
NIPPER model of nitrate leaching being developed for assessment of the impact of the NVZ
Action Programme. Implementation of a number of models within the MAGPIE system could
facilitate evaluation of impacts of measures on a range of pollutants, and thereby provide
invaluable support for development of mitigation approaches under the Water Framework
Directive and other environmental legislation.
An important future step linkage for Magpie 2 is to integrate a database of the national Manure
Management survey data which would allow localised estimation of the timing and methods of
application of manure on a spatially distributed basis.
The Magpie 2 project has underpinned developments with SEPA/SNIFFER to define a screening
tool to help identify water bodies at risk of failing the WFD standards. In addition, the EA have
expressed an interest in developing a similar system to support their own catchment
management policy development.
Project Report to Defra
8.
As a guide this report should be no longer than 20 sides of A4. This report is to provide Defra with
details of the outputs of the research project for internal purposes; to meet the terms of the contract; and
to allow Defra to publish details of the outputs to meet Environmental Information Regulation or
Freedom of Information obligations. This short report to Defra does not preclude contractors from also
SID 5 (2/05)
Page 3 of 24
seeking to publish a full, formal scientific report/paper in an appropriate scientific or other
journal/publication. Indeed, Defra actively encourages such publications as part of the contract terms.
The report to Defra should include:
 the scientific objectives as set out in the contract;
 the extent to which the objectives set out in the contract have been met;
 details of methods used and the results obtained, including statistical analysis (if appropriate);
 a discussion of the results and their reliability;
 the main implications of the findings;
 possible future work; and
 any action resulting from the research (e.g. IP, Knowledge Transfer).
SID 5 (2/05)
Page 4 of 24
1. Introduction
The original Magpie system (Modelling Agricultural Pollution and Interactions with the
Environment), created in 1998 for Defra by ADAS for policy support, brought together
agricultural and environmental data in a single GIS system for England and Wales (1). This
system linked the data to visualisation software, query systems and models so that the
implications of current agricultural practice and future changes for nitrate in waters could be
assessed by policy makers within Defra. A number of other Defra policy issues were explored
using data derived from Magpie. These include, for example, the impacts of CAP reform on land
use (3); the impacts of strategies for manure disposal on N losses to air and water (4,5), a study
of the effects of the wet autumn of 2000 on agriculture (6) and to answer policy questions such
as the potential for injection of slurry to reduce ammonia loss. It was therefore apparent that a
high-quality agri-environmental database could be exploited by a wide range of uses and applied
to many policy issues.
Since the system was implemented the range of potential applications has expanded to include
other pollutants and policy questions. It was considered by Defra that the further development of
MAGPIE could provide data for a number of models of air and water quality; improve the
reliability of results for the effort involved, and aid communication between all concerned. The
database was already in use not only in nitrate policy development but also as the basis of the
national P Loss Indicator (PIT) (2) and the PSYCHIC model being developed on the Wye and
Avon catchments. Such an approach is timely in the light of the Water Framework Directive,
which lays such stress on whole-catchment approaches to water quality management (the whole
range of diffuse pollutants - N, P, pathogens, pesticides, ammonia, farm chemicals etc could be
addressed from the same data source). The new Magpie system is designed to provide Defra
with a valuable tool for catchment and national scale environmental/water policy development at
several levels. The system provides the spatial and temporal data required by catchment or
coarse-scale models, and a framework on which new models can be attached relatively easily.
Given the newly realised potential for application of the system, the main requirements identified
during the past 3 years of using the Magpie system were to restructure the database to
accommodate both more complex models (e.g. of time series of water quality) and a wider range
of models. The process of updating the existing nitrate model would address both recent
scientific developments and the desire for greater flexibility in scenario running for use both now
and in the future. As the range of potential applications of the database has grown the need for
greater flexibility in access, handling and presenting data within the systems has also risen.
2. Objectives
The objectives as set out in the original contract were: to extend and redevelop the MAGPIE
system, created in 1998 for DEFRA by ADAS, to incorporate new requirements, updated data
and science, and newly realised potential for application of the system. To this end the following
main objectives were identified:
1. Redevelop the MAGPIE database and model system to extend the scope for its application
to policy issues, facilitate implementation of new or extended models, and simplify
maintenance and updating.
2. Update the MAGPIE land use / land cover database; add historic data
3. Redevelop the science of the nitrate leaching model; implement the new model together with
N and P balance models
4. Technology transfer (Liaison with customer; documentation; reporting; papers and final
product handover.)
The redesign results in a system where the database is independent of the models, and will
therefore be accessible to new models without the need for modifying the data/utilities core.
Recent software advances by Microsoft (e.g. in ActiveX controls / documents as used within the
SID 5 (2/05)
Page 5 of 24
Visual Basic language; and the SQL Server relational database) make this approach easier to
design and manage than previously, and increase the degree of independence attainable. New
protocols and modules developed to process and display data are designed to improve access
for users.
A second important component in the development of Magpie 2 was the generation and
implementation of a new, updated land cover / land use database at 1 km scale for England and
Wales. This uses an amalgam of updated Agricultural Census Data; Land Cover 2000 mapping
derived from satellite imagery; and OS-based coverages of specific non-agricultural land uses.
Previously-collated historic land use coverages have also be implemented, extending the scope
for scenario testing. Coupled with improvements in integration of data such as OS vector data for
delineation of built-up areas, roads, forests this database provides Defra with the most up-todate and precise spatial database of national agricultural and rural land use available.
Developments in nitrate leaching models (in a range of other projects) to incorporate recent
advances in science, and to allow greater power and flexibility for scenario testing has been
undertaken so to improve the robustness of the predictions of surface water quality in particular,
and of the fate of manure-derived N. The extension of nitrate loss models (NT2201;
EUROHARP) to predict the daily time course of river flow and N concentration has also required
methods for interpolating weather data across a catchment. MAGPIE 1 worked only with annual
data. Implementation of daily water quality modelling requires the ability to store and process
many years of data for a network of weather stations throughout the country; together with daily
river flow and concentrations. This ability to work with time series of weather, flow and nutrient
flux is equally important for future modelling of loss of P, pesticides, sediment and microorganisms, whose transport is even more sensitive to timing and intensity of rainfall events. New
structures for storing and displaying the time course of flow and pollutant concentration at a
sampling point have been built into the system in anticipation of the addition of such models into
the Magpie 2 system.
3. Magpie 2 Framework Architecture
A key aim of this project was to ensure that Magpie 2 was developed to be as flexible and
extendable as possible. Firstly, the database allows a wide variety of data types to be stored
and enables extra data to be added with no restructuring of the database required. Secondly,
the Magpie 2 framework architecture has been developed to allow for any number of models to
be accommodated within it and allows for extra models to be seamlessly integrated into the
framework by a competent user, thus avoiding the need for recoding and redeployment. This is
particularly important; firstly, in view of the number of ongoing Defra funded policy related
models being developed which allows them to be potentially integrated into one system for the
policy maker to use; and secondly, as it provides a very valuable research based modelling tool.
A number of different strategies for the system architecture of Magpie 2 were considered.
Magpie 2 adopts a 3-tier data structure, as presented in Figure 1, and comprises a database tier,
a model tier and a browser tier.
SID 5 (2/05)
Page 6 of 24
Core Database
Project
Database
Database Tier
Data Browser
Programmers Object
Library
Browser Tier
Model Tier
Magpie 2 Model
e.g. NEAP-N
Magpie 2 Model
e.g. NP Balance
Figure 1. Magpie 2 Modelling Framework System Architecture
3.1 Browser Tier
The Magpie 2 application is a single automation object (ActiveX Exe) that provides two roles.
1. It is the graphical user interface (GUI) of the system comprising a simple multiple document
interface (MDI) shell. This hosts a number of browser controls (e.g. for mapping, graphing and
tabulating of data) and data manipulation tools that provide access to the source datasets and
model simulation results and functionality to simulate and interact with models that are available.
2. It provides the definition of the programmers object library. This object library can be used by
the modeller to: interrogate the database for all data required by a model; to save model results
back into the database; and to launch the Magpie 2 GUI to view the model results within one of
the browsers. Further details of the programmers object library is described in a later section.
3.2 Database Tier
Two types of database comprise the database tier: Core and Project.
Source datasets are stored in the Core database and are comprised of a combination of ‘raw’
and pre-processed data at a number of different scales and have a full UK coverage. The
content of the Core database can be accessed via the browsers. Access to data in the Core
database is always read-only.
Models do not have direct access to data in the Core database. To run a model, data for the
area of interest are extracted from the Core database and stored in a Project database. This
data can be edited (‘scenarios’) if required and the results of the model simulations are saved
within the Project database. More detail about the database structure and the procedure for
creating a project database is given below.
3.3 Model Tier
SID 5 (2/05)
Page 7 of 24
The modelling tier contains essential data on each of the models that are integrated within the
Magpie 2 framework. By default Magpie 2 is deployed with three models (NEAP-N, N and P
Balance and a time series model) although there is no limit to the number of models that can be
integrated.
Each model is developed as an automation object (ActiveX Exe). Access to data (including both
input data and model coefficients) stored within the database required by a model is obtained by
referencing the programmers object library. This data can then be manipulated to provide the
model results. The programmers object library is then used to save the data back into the
database.
A particular benefit of adopting the system architecture described above is that it provides
increased power and implementation flexibility. Firstly, the user can run the Magpie 2 application
via the GUI and simulate and evaluate the results of any model that is available. This is the
normal way of running Magpie 2. Secondly, if the programmer of a model chooses, the model
can be run independently (e.g. double clicking the model executable file) and not via the Magpie
2 user interface. In this case the model will still draw data from the database, run the simulation
and save the results. This may be a more favoured approach by a researcher who has a large
number scenario runs to undertake and evaluate, for example in Monte Carlo simulations.
4. Programmers Object Library
As previously discussed the Magpie 2 application provides a graphical user interface to allow
browsing of Core and Project databases. Magpie 2 also supports a Programmers Object Library
which provides functionality to allow the modeller to interrogate the database for all data required
by a model; save model results back into the project database and launch the Magpie 2
browsers to view the model results graphically. This approach means that each model does not
have its own database querying engine, but a generic one can be used by all.
The Programmers Object Library defines a standardised programmatic ‘interface’ detailing
methods and properties that another programming languages can assess and use in a third
party application. The interface allows the user to use Magpie 2 using standard Automation
techniques. The automation interface for Magpie2 is called Magpie2.ClsMagpie2.
‘Behind the scenes’ of the interface, the communication with the database and the majority of
data handling is undertaken using Microsoft ActiveX Data Objects (ADO). ADO is an industry
standard technology for accessing and manipulating data from a database server.
The majority of the methods and properties within the interface can be used to retrieve any of the
data in the database regardless of its type (e.g. time series data, model parameter coefficients,
1km2 data).
The functionality of the methods and properties can be broadly split into the following areas:
4.1 Dataset retrieval
A number of different functions are available giving functionality to return datasets. Either a
complete dataset can the retrieved, e.g. with all the parameters within it (for example, in the case
of Census data each of the census classes would be returned) or a dataset can be returned
specifically based on a particular parameter. For each record in a dataset, in addition to the
physical parameter values (e.g. number of pigs), attribute data associated with the spatial
location are available (e.g. northing, easting, county id, catchment id). These attributes can then
be used to find records relating to a specific requirement (e.g. find the number of pigs that are
contained within 1km2 that are situated within a specific catchment.
Functionality exists to allow the user to undertake a search of spatial details based upon over
lapping datasets. For example, this could be used to find a list of all the weather stations
existing within the CRU Climate dataset that are located within a particular catchment area.
SID 5 (2/05)
Page 8 of 24
4.2 Dataset modification\creation
Functionality exists for adding datasets to the database in a number of different ways. Firstly, a
dataset can be added into the database but with no actual data inserted. Secondly, a dataset
can be added and data inserted simultaneously. Thirdly, an existing dataset can be retrieved,
modified and either saved back to the database with the same name or saved back as a new
dataset with a different name.
4.3 Database\Dataset interrogation
A number of properties and methods exist to allow the user to filter the number of records within
an existing dataset based upon a specific criteria. This is very useful to quickly find a subset of
records from a large dataset. For example, records in a dataset can be filtered by Catchment Id,
County Id, Country Id or Parish Id etc.
A number of methods also exist to query the database to find out information about specific
parameters. For example, the attributes of all the available parameters in the database can be
returned e.g. parameter name, parameter report label, or the units a parameter is associated
with. Alternatively, specific information about an individual parameter can be found.
4.4 Database administration
A number of functions exist to help administer the database. New parameters can be added,
and their properties changed. The properties determining how a specific parameter is displayed
within the browsers and can be updated. For example, you can set the colour ramp shown on
the map, or the symbol displayed. Summary information associated with datasets can be
updated (e.g. basic number of values, minimum value, maximum value, mean value). If
summaries do not exist for a dataset when they are displayed in the browsers they need to be
calculated. The delay time involved in this calculation can be avoided by updating the summary
as the dataset is created.
4.5 Dataset Display
Two methods exist for displaying datasets visually. Both background data and model results can
be shown either in map or tabular format. If the Magpie 2 application GUI is already visible then
the data will be displayed within it, if it not it will be launched and the data displayed. These
methods would commonly be used at the end of a model run such that the results can be
automatically displayed to the user.
Meta-data describing a particular dataset can be uploaded in the database. This could be
straight text but could also be a Word document or web page link.
The interface provides the facility to visually display a progress screen whilst running a model.
Two progress bars are displayed: one to indicate the progress of a sub process and another to
indicate the progress of the overall process.
Details of all the methods and properties defined within the automation interface are provided in
Appendix 1.
The Programmers Object Library has been used to develop each of the models that are
deployed with Magpie 2. A simple example of model code using the object library is provided in
Appendix 2
5. Magpie 2 Database Design
5.1 Basic Requirements
The Magpie 2 database forms a crucial part of the overall system architecture. A fundamental
requirement of this database is that it must be capable of holding a wide variety of types of data.
SID 5 (2/05)
Page 9 of 24
Furthermore, it is important that the design and structure of the database does not just reflect the
datasets necessary for this immediate project but must accommodate the requirements of other
models currently being developed, such as PSYCHIC, PIT, and EVENFLOW and those of the
future.
5.2 Methodology
A number of different database structures were evaluated to meet the objectives. One approach
had specific tables for different types of dataset (e.g. time series data, spatial coverages). The
software code used to query the data accessed a ‘Data Dictionary’ to determine which tables
stored what type of data. A disadvantage of this approach was the strong likelihood that any
new data of a different type added in the future would require new tables to be defined and
added to the database. This was considered inflexible.
The chosen methodology provides a table structure and associated relationships to allow all
types of data (e.g. daily weather data from individual locations, annual spatial coverages, lookup
data such as soil horizon data) to be stored in one consistent way regardless of the data type
held within the dataset. Taking this approach allows the Magpie 2 application to access any
data using the same methods and database queries, which significantly reduces the
maintenance of the actual software code in the application itself and it enables the
implementation of models to be a simple and easy task.
Both the Magpie 2 Core and Project databases have identical structures and Microsoft SQL
Server (7 and 2000) and Microsoft Access (97, 2000, XP) are both supported. SQL Server is
typically used to store the Core data due to its ability to handle large quantities of data and
performance when located on a ‘server’. Project databases are typically in Microsoft Access.
5.3 Details of the chosen approach
The Magpie 2 core database is designed around three fundamental concepts; datasets, spatial
details and data details.

The ‘Dataset’ is basically a label for a group of spatial (or non spatial) items and their related
data. The dataset defines the ‘name’ of this group of data, its type, measurement frequency
and its relationship with other datasets. The dataset is top of the hierarchical storage of data
in the database.

The ‘Spatial Details’ define the location of a measurement point i.e. gives the database a
point to fix any related data to. The majority of spatial details have an easting and northing.
However, non-spatial items, such as look up items (for example, fertilizer rates) also have an
entry in the spatial details table even though they do not have an easting and northing.

The ‘Data Details’ table is the main store of actual measured, predicted or modelled data.
The data details table is the lowest point in the hierarchical storage of data in the database.
The datasets and spatial details are linked together via a dataset group table. This dataset group
table allows several datasets to be based on the same spatial details without duplicating the
spatial details in the database. For example, there are several datasets relating to 1 km2 cells,
such as Land Cover and soil details, both datasets will have an entry in the dataset group table
but only one entry per 1 km2 cell in the spatial details table.
The data values are stored in the data details table and are related directly to an entry in the
spatial details table and an entry in the dataset table. To reduce the number of individual rows in
the data details table, several data values can be assigned to a spatial item on one row of the
table. To allow the Magpie 2 system to distinguish the parameter type of each value field in the
data details table, there is a table in the database for parameter details which links the
parameters defined for each dataset to the ‘value’ fields in data details table.
Comprehensive information detailing the database tables and relationships can be found in
SID 5 (2/05)
Page 10 of 24
Appendix 3.
6. Magpie 2 Data Sets
This section describes briefly the main spatial data sets within MAGPIE 2.
6.1 ADAS Landcover 2000
A methodology has been developed (8) to improve the spatial precision of agricultural census
data. This was necessary because census data are detailed with regards to the cropping and
livestock, but spatially imprecise as data are recorded against the Parish in which the farm office
sits. This means that census data alone cannot be used to identify the land use within a
catchment (9). To overcome this, firstly an improved land use mask was developed. This was
created by linking the CEH LandCover 2000 coverage derived from satellite imagery to
Ordnance Survey (OS) and other maps of the location of major blocks of non-agricultural land
(e.g. urban, woodland). A normalisation process was used to ensure that at district level, total
areas of agricultural grassland and arable land matched the census data. This overcame certain
problems with Landcover 2000 (e.g. inability to distinguish agricultural from non-agricultural
grass; and some confusion between urban and arable land ).
Updated Agricultural Census data has been obtained at the maximum resolution permitted by
Defra Census Branch (different spatial resolutions were available for England and Wales
individually). The data has been integrated with Landcover 2000 and OS digital data using an
updated version of the methodology described in the report on Defra project NT2203. The
resulting updated land use, land cover and livestock map has been mounted on the Magpie 2
database.
Under previous MAFF funding (11), census data for 1970, 1980 and 1995 have been worked up
and provide valuable historical context for policy work. These datasets can be made available
for use within Magpie 2.
6.2 Soils Data
The Soils data is from the NATMAP 1km² version of the NSRI National Soil Map. The National
Soil Map is the product of sixty years of soil survey work in England and Wales. It is the most
detailed comprehensive map of the soils of the two countries and describes the composition and
distribution of 300 soil map units. Each of the soil map units can be linked to the soil property
datasets from the National Soil Resources Institute.
The National Soil Map details the distribution of 300 soil associations, each of which contains
three to five soil series. Together, these soil associations describe the wide range of soil
conditions encountered across England and Wales. Some amendments have been included
with the data to ensure that the properties relevant to water quality modelling are represented
within each soil type.
This dataset has been mounted onto the Magpie 2 database with lookup tables for Soil Horizons
and Soil Series Name.
6.3 CRU Climate Data (1961 – 1990) Long Term Averages
This data exists on a 10km² of the UK. The climatologies were constructed by E.Barrow in
September 1993. Data exists for:
 Temperature
 Maximum Temperature
 Minimum Temperature
 Precipitation
 Relative Humidity
 Sun Hours
 Wind speed
SID 5 (2/05)
Page 11 of 24
All of the above data is available for 3 elevations in the 10 Km²:
 Min
 Mean
 Max
Raw station data were interpolated from the UK Met. Office through the Climate Impact LINK
Project (Dept of the Environment Contract PECD 7/12/96). The construction of this climatology
has been fully described (12) and the work has been funded by the Natural Environment
Research Council through a TIGER IV sub-contract from the Environmental Change Unit at the
University of Oxford. These data provide the basis for water balance calculations, and also for
the storm size and frequency calculations necessary to run recent models of erosion, P transfer
etc.
6.4 UKCIP Climate Data (1961 – 1990) Long Term Averages
The UKCIP (UK Climate Impacts Programme) data sets have been created for 26 weather
parameters, based on the archive of UK weather observations held at the Met Office. For most
parameters approximately 500 stations are used to create each grid; for rainfall approximately
3,500 stations are used. The regression and interpolation process used to obtain the 5 km grids
alleviates the impacts of station openings and closures on homogeneity.
The list of available parameters is given below:















Mean monthly daily maximum temperature (°C)
Mean monthly daily minimum temperature (°C)
Monthly mean temperature (average of the mean monthly maximum and minimum) (°C)
Number of days of air frost
Number of days of ground frost
Mean monthly sea-level pressure (hPa)
Mean monthly vapour pressure (hPa)
Mean monthly wind speed (knots)
Mean monthly cloud cover (%)
Monthly mean hours of bright sunshine per day
Number of days per month having a rainfall >= 1 mm (Rain Days)
Number of days per month having a rainfall >= 10 mm (Wet Days)
Number of days per month with snow falling (Snow Fall)
Number of days per month with snow cover > 50%
Total monthly precipitation (mm)
6.5 NEAP-N Datasets
Spatial data were specifically pre-processed for use with the NEAP-N model. The variables
derived were:



PET Potential Evapotranspiration
AAR Average Annual Rainfall
Soils Variables including
 HOST Class
 Water Content at Field Capacity (Phi05), Averaged Over Depth to Rock
 Water Content at Wilting Point (Phi15), Averaged Over Depth to Rock
 Depth to Rock (cm)
 Percent Sand, Silt, Clay and Carbon of Topsoil
The Soils variables were obtained from the NATMAP 1km² version of the National Soil (see
above) while the AAR data came from the UKCIP Climate data (12).
The PET dataset was derived using a Mean Climate Drainage Model (13). This model uses
mean monthly climate data, landscape and soil management information (e.g. soil texture, slope,
SID 5 (2/05)
Page 12 of 24
surface roughness and compaction) to calculate PET. The model has been validated against
established water balance models such as MORECS (14).
6.6 Lookup tables: N and P Balance Datasets
Coefficient values for N and P surplus (kg/ha) for each crop and livestock type, calculated using
the methodology described in Section 8.2 below are saved in a lookup table within the database.
7. Magpie 2 Graphical User Interface (GUI)
The Magpie 2 graphical user interface (GUI) provides flexible tools to view, analyse and query
data existing in both Core and Project databases and to select, configure and simulate available
models. Multiple maps, graphs and tables can be displayed simultaneously which would allow
easy comparison of datasets.
The Magpie 2 GUI provides the following main areas of functionality:

Data Browsing:
1.
2.
3.
4.
5.
Summarise all the datasets in their database.
View their data spatially on a map.
View data within a table.
View data on a graph.
Data querying.

Create a project database for further analysis or modelling

Models
1. Insert new models into the system
2. Run models registered within the system

Export dataset
On entry to the system the user is able to login and select the specified database to work from.
If a Core database is opened, data can be browsed and Project databases created. If a Project
database is opened, models can then be run.
7.1 Database Browser
Once logged in, the user is presented with the ‘Database Browser’ screen (Figure 2) which
allows the user to view and examine the datasets in the open database. The user can navigate
down the dataset tree on the right hand side of the browser and the left of side of the browser
will display details of the selected item.
SID 5 (2/05)
Page 13 of 24
Figure 2. Database Browser displaying a sample number of datasets available within the Core
Database.
In order to view a specific dataset it needs to be selected within the ‘Database Browser’ and
either Map, Graph or Table option from either the File menu on the menu bar or from the ‘right
click’ menu.
7.2 Map Browser
The ‘Map Browser’ (Figure 3) is split into two main parts. On the left, a list of datasets is
displayed and to the right a map showing the spatial coverage of the selected dataset. To the
left of the map an inset map and map legend are shown. A number of standard GIS based tools
are available to manipulate the map, which can be accessed via the toolbar, including pan, zoom
in, zoom out and identify. Multiple datasets can be displayed within the same Map Browser.
The map legend displays a legend for each of the layers shown on the map. The map legend
provides a variety of different functions to customise the way the map looks. The different
functional options are available by right clicking the mouse over the legend. Map layers can be
added or removed via the Add Layer or Remove Layer option respectively. The Promote,
Demote, Send to Top and Send to Bottom options can be used to determine the order in which
the layers are drawn.
The ‘Map Data Query’ option allows the user to select the cells or nodes on the current map
using a data query that can be built up via an editor. This option is very useful for quickly
identifying data that meets a specific criteria. For example, the user could select all the cells on a
land cover map where the percentage of grass was above 50%.
SID 5 (2/05)
Page 14 of 24
Figure 3 Magpie 2 Map Browser showing the % Arable Landcover
7. 3 Data and Graph Browser
In addition to the Map Browser, data can be displayed in either a Graph or a Table (Figure 4).
The ‘Data Browser’ allows the user to view the contents of a dataset within a spreadsheet like
grid view. Datasets from a Project Database can be modified in the Data Browser and the data
can be either exported either to a file or to the clipboard.
Figure 4. Daily UKCIP data displayed in the Magpie 2 Data browser and Minimum and
Maximum temperatures for Newbry Bridge displayed in the Graph Browser
SID 5 (2/05)
Page 15 of 24
Within the ‘Graph Browser’, data from a spatial based dataset is displayed as a cumulative, while
data that is based from a time-series dataset is displayed in XY format (e.g. a simple data versus
date plot).
7.4 Creation of a Project Database
A Project database is a subset of data either extracted from the Core database or from an
existing Project database. Models are run using data from a Project database. In order to
create a Project database the user needs to choose a spatial area of interest, the datasets
required and the data range of data required. Theses options can be specified via the Create
Project Database option from the View menu on the toolbar. A number of different options
facilitate the quick creation of a project database including: the saving of area selections for
future use; dataset parameters required for Magpie 2 models can be automatically detected and
selected; whole project definitions can be saved and reused to recreate projects.
7.5 Data Export
The Magpie 2 Export screen allows the user to export data from the core or project database.
The available export formats from Magpie 2 are listed below
 Comma Delimited File
 Tab Delimited File
 Shape File
 XML File
 Excel File
 Access Database
 DBF File
7.6 Model Implementation
Any models that are registered for use within Magpie 2 can be located from the View|Models
toolbar menu. In addition to providing functionality to run the model, the author of the model may
provide the following: the ability to view all the dataset parameters required by the model; a
facility to modify the parameters of the model.
New models can be registered to operate within the Magpie 2 framework, via the ‘Configure’
option. The user simply has to navigate to, and select the model executable. If the model
conforms to Magpie 2 standards it will be accepted and registered for use within the application.
To aid the development of new models, a Microsoft Visual Basic Project template can be
automatically created from within Magpie 2. This Visual Basic project has reference to the
Magpie 2 programmers object library and all the necessary methods added to allow the modeler
to get started.
8. Models
Three models have been implemented within Magpie 2 to demonstrate its capability: NEAP-N;
N and P Balance; and a model estimating daily values of runoff, PET and SMD. Details of these
models are provided below
8.1 NEAP-N
The NEAP-N model is a simple model of nitrate loss for use at catchment scale. Potential N
leaching (kg/ha N) is assigned to each crop and to each livestock type. These coefficients may
be obtained from measured data or by running more detailed field scale models for the range of
management appropriate to the scenario. The values assigned to livestock take due account of
SID 5 (2/05)
Page 16 of 24
manure management, adjustment for fertiliser N, and the impact of the manure organic matter on
the risk of leaching in the medium to long term. For grassland, the coefficients take account of
nitrate leaching from the whole grassland system. Actual nitrate leaching is then modified as a
function of soil type and water balance.
The coefficients were reviewed using more recent experimental data, and a secondary set of
coefficients was developed to represent the imposition of closed periods for manure application
and improved fertiliser management. The leaching function for clay soils was modified to take
account of recent experimental data (e.g. from Defra project NT1850) which shows that nitrate
concentrations in water draining from clay soils is reduced during heavy rainfall, due to rapid flow
via cracks.
Scenario analysis can be effectively undertaken using Magpie 2: coupled input parameter sets
and model output results are registered to allow the user to quickly see the effect of a input
change to the model results. Figure 6 shows the Neap-N dialogue, which enables the user to
change model parameter values and to save down sets of parameters as a scenario. Scenarios
can be recalled by reselecting from the ‘Scenario Name’ drop down list.
As described above, updates to the NEAP-N model are reflected in a change to the values of
some parameters. Sets of parameter values have been saved into the database, which can be
selected by the user by Scenario Name. Parameter values may also be changed by the user to
explore further scenarios. In addition, as within MAGPIE 1, land use and livestock numbers may
be changed by the user within a project database.
Example outputs for nitrate concentration (mg l-1 NO3-N
1)
mean
) and leached nitrogen (Kg N ha -1 y -
are shown in Figures 4 and 5 respectively.
Figure 4. Example NEAPN output: Nitrate Concentration (mg l-1 NO3-N mean)
SID 5 (2/05)
Page 17 of 24
Figure 5. Example NEAPN Output : Leached Nitrogen (Kg N ha -1 y -1)
Figure 6. Dialogue showing editable parameter values used within NEAP-N. These values can
be quickly changed and the model re-run.
8.2 N and P Balance
8.2.1 Model Methodology
A ‘farm gate ’ nutrient balance for N and P was implemented within MAGPIE 2. In the farm gate
balance, inputs include fertiliser, livestock feed and wastes imported onto the farm and outputs
include harvested crops, livestock products and wastes exported from farm. For the purpose of
simplicity, all arable crops harvested are counted as exports, and feed eaten by livestock is
counted as an import, even if grown on the same farm.
SID 5 (2/05)
Page 18 of 24
The ‘farm gate approach’ uses widely available annually updated feed and production statistics
and does not suffer from the problems of estimating for grassland systems representative values
of N off take, N input from organic manures, and yield which are inherent with the ‘soil surface’
approach.
The methodology used for calculating N surplus replicates the approach reported by Lord et al.,
(15) which broadly follows that defined by PARCOM (16). The nitrogen surplus is equal to the
sum of nutrient loses due to denitrification, volatilisation, leaching and erosion, and that retained
in the soil. However, inputs via atmospheric deposition are excluded from the model as farmers
have no control over this. The methodology is summarised below.
Arable Crop
N arable surplus are calculated based on inputs due to fertiliser, fixation and seed, and outputs
due to off take on a crop basis.
Livestock
The livestock N surplus is subdivided into grassland, pigs and poultry sectors; and within this,
separate surplus values are assigned to each sub category (e.g. calves, dairy young stock, dairy
adults).
The resulting N surplus coefficients are expressed per unit of crop area and livestock unit type.
This allows the surplus to be distributed spatially. The same approach is applied to P . It is
envisaged that current work re-evaluating P balance of livestock may give rise to modifications to
these values. These coefficients are saved within the Magpie 2 database.
The spatial distribution of N and P surplus is calculated as follows:
S   an  sn    lm  sm 
where S is the total surplus per spatial unit (e.g. 1km2); a n is the total area of crop type n ; l m is
m ; s n is the calculated surplus for crop type n (kg/ha); and s m is the
calculated surplus for livestock type m .
the number of livestock units of type
8.2.2 Model Scenarios
The facility exists to give the user the opportunity to modify the N and P surplus coefficients
before running the model. Changes to coefficients can be made from the ‘model coefficients’
dialogue box. The livestock and arable crop coefficients are displayed on separate tabs. Any
changes made can be saved down with a user-defined description to aid easy recall in future
scenario testing. The model results also use this description to ensure coupling between
coefficients and outputs is maintained.
An example output from the P Balance Model is shown in Figure 7. This illustrates that the P
surplus per ha of agricultural land tends to be greater in livestock areas.
SID 5 (2/05)
Page 19 of 24
Figure 7. P Surplus (kg/ha) for England and Wales
8.3 Time Series Demonstration Model
An important aspect of the Magpie 2 system is its ability to store and manipulate time series
data. A simple model has been developed to demonstrate this capability which calculates daily
estimates of runoff, soil moisture deficit (SMD) and potential evapo-transpiration (PET) based on
inputs of the HOST class definition (17) for a site and values of daily maximum and minimum
temperature. Thequantity of surface runoff is an important input to models of erosion, sediment
transfer, and transfer of pollutants such as P, ammonium, pathogens, pesticides and BOD.
9. Weather Interpolation Processor
Catchment scale models run at a daily time step necessitate the use of daily weather data. The
weather interpolation processor allows interpolation of weather station data into a spatial
coverage for use with daily models. Estimates of rainfall, rain days, minimum and maximum air
temperatures, wind speed, total sunshine hours, relative humidity, and vapour pressure deficit
are calculated. The interpolation methodology is described below.
A time series of daily weather data is generated for a target site by spatial interpolation from
observed weather stations located within the catchment. In order to determine which weather
stations should contribute to the time series estimates, the method of delaunay triangulation is
adopted. The observed data at the selected weather stations are expressed as a ratio of longterm (1961-1990) monthly mean observations. A weighted mean of the ratios at each weather
station is then calculated. The weights are inversely proportional to the distance between the
weather station and the target site according to the following equation:
SID 5 (2/05)
Page 20 of 24

Di  

Wi 

 0.01  Ei  E X 

2
 
   0.01  N i  N x 
 



2



0. 5
1
1  0.005  Di 
3
where: Ei and Ni are the easting and northing of the weather station; Ex and Nx are the easting
and northing of the target site; Di is the distance between the weather station and target site;
and Wi is the weighted mean ratio.
The weighted mean ratios are then multiplied by the long-term (1961-1990) monthly mean
climate data for the target site to calculate a new time series of weather data.
Long term monthly mean values are obtained from the climate surfaces (10km2) generated for
the UK (12). These values are transformed to make them more representative for the target site
based on its altitude. Monthly means are available at mean, maximum and mean altitude for
each 10km2 cell. Lapse rates are calculated for each monthly climate variable using the means
reported at the maximum and minimum altitude. A climate estimate for the target site is then
calculated by using the value reported at the mean altitude (within the 10km2 cell of the site)
combined with a correction factor calculated as the lapse rate multiplied by the difference
between target site altitude and mean altitude of the 10km2 cell. Estimated rainfall figures may
be further adjusted by the known long-term mean annual rainfall at a site if known. The monthly
figures are multiplied by the ratio of the known and estimated annual rainfall totals.
The existing water quality models implemented within Magpie 2 do not require daily estimations
of weather data. For this reason the weather interpolator has been developed as an
independent application and is not integrated within Magpie 2. However, when such a need is
required, the coding structures used will enable quick migration. At present, the results of the
interpolator can be easily uploaded into the Magpie 2 database for visualisation should this be
required.
10 Future Work and Knowledge Transfer
The Magpie 2 system contains datasets, which are also used by other catchment or national
scale models currently being developed under Defra funding, including PSYCHIC, PIT and
EVENFLOW; and the NIPPER model being developed for assessment of the impact of the NVZ
Action Programme. ADAS has involvement with these model developments and has ensured
that the data requirements can be met by the MAGPIE 2 framework. The needs of other models
being developed in the UK and elsewhere have also been considered. It would be
advantageous for multiple models to run directly from the same data sources and be integrated
into the Magpie 2 modelling framework.
Throughout the course of this work project discussions with the Environment Agency (EA) have
taken place with a view to ensure that Magpie 2 takes into account EA interests. More recently,
the EA they have expressed an interest of ADAS assistance to help them develop a similar
system to Magpie 2, in support of their catchment management policy development. Several
meetings have occurred and an agreement is currently being drafted. Magpie 2 was
demonstrated at a recent workshop of ADAS research to Defra.
The Magpie 2 project has underpinned developments with SEPA/SNIFFER to define a screening
tool to help identify water bodies at risk of failing the WFD standards. The system could be
readily extended to include data from Northern Ireland and Scotland. The system is also being
used as part of the NVZ monitoring project (NIT18), for policy work including the NVZ action
programme revision (NIT15) and as part of negotiations within the EA working towards improved
river basin management planning.
SID 5 (2/05)
Page 21 of 24
Linkage of the Magpie 2 database to the national Manure Management survey data would allow
localised estimation of the timing and methods of application of manure on a spatially distributed
basis, providing much improved pollution risk assessment for pathogens, phosphate, ammonia,
BOD and nitrate. This opportunity is the subject of a current bid for funding, and is particularly
relevant in the light of re-appraisal of NVZ Action Measures and other environmental schemes.
The system is providing underpinning catchment data in support of catchment management
within the Teme, Bassenthwaite and Wensum catchments, as part of the government initiative to
implement Entry Level Schemes within a water quality framework.
11 Conclusions
This project has resulted in a spatial database structure capable of supporting a wide range of
agri-environmental models operating at catchment scale. As such it is an important tool in the
development of environmental policy, particularly in relation to pollution from agriculture, at
catchment and national scale.
The system has been designed to contain the highest quality spatial data on agricultural land
use and agricultural practices, at 1 km resolution, such that it can provide the data for any
location within England and Wales. The system architecture allows new data structures and
new models to be added as required, without recompilation. The system can be run on a PC.
The User Interface has been designed to allow display and querying of the underlying data as
well as the results, and export of the data and results to other packages such as Excel. This
facilitates policy development by allowing exploration of linkages between cause and effect. Use
of such data especially as maps can promote interest in catchment-specific issues and thereby
help greatly in generating stakeholder involvement in policy development.
The ability to integrate a number of models within the same data framework provides a powerful
tool for integrated policy development. For example, the effect of changing livestock numbers;
or changes to manure management; on a wide range of pollutants could be investigated using
exactly the same input data and scenarios for all. This capability will be important in
development of optimal strategies for water quality improvement under the Water Framework
Directive. Development of MAGPIE 2 has provided a unifying theme within a number of Defrafunded catchment model developments, such that they all use the same input data, and can
therefore be run on the same data sets. They will therefore be suitable for running within the
MAGPIE 2 framework without major modification.
The datasets, although collated to provide inputs to water quality models, are equally
appropriate to other pollutants (ammonia; methane ; nitrous oxide) and could equally be used to
investigate entirely different questions, such as transport costs of food policy or waste disposal
policies; impacts of CAP reform on land use; or the area of land on which novel crops may be
grown under climate change scenarios.
.
References
1. Lord, E.I & Anthony, S.G. (2000) MAGPIE: A modelling framework for evaluating nitrate
losses at national and catchment scales. Soil Use and Management 16, 167-174
2. a) DEFRA Project PE0105: Towards a National Consensus on Indicators of P Loss to Water
from Agriculture
b) DEFRA Project subject to contract: Development of a risk assessment and decisionmaking tool to control diffuse loads of phosphorus and particulates from agricultural land
3. DEFRA Project SA0111: CAP reform : potential for effects on environmental impact of
farming
4. DEFRA Project NT1841: Desk study: management of manures from pig and poultry
holdings: Impact of alternative approaches on nitrogen losses to water and air
SID 5 (2/05)
Page 22 of 24
5. DEFRA Project NT1843: Mixed farming desk study: impact of management on nitrogen
losses to water and air
6. DEFRA Project CC0372: The wet autumn of 2000: implications for agriculture
7. DEFRA Project NT2203: Development of an agricultural land use spatial dataset to aid
agricultural/environmental spatial modelling and assessment
8. Miles, A. R., ANTHONY, S. G., and Askew. D. (1995) A Comparison of Agricultural Land Use
and Environmental Land Cover Data. Association for Geographical Information, 1995
Conference Proceedings, UK, pp. 1-12.
9. Miles, A., ANTHONY, S. G., Lord, E. I. and Fuller, R. (1996) Spatial agricultural land use
data for regional scale modelling. Proceedings Aspects of Applied Biology 46, Modelling in
Applied Biology: Spatial Aspects, 25-28th June.
10. DEFRA Project NT2201: An Integrated Approach to Modelling The Fate of Agricultural
Pollutants at National Scale
11. DEFRA Project NT2202: Improvement of nutrient loss modelling system and data: national
and catchment scales.
12. Barrow,E.M., Hulme,M. and Jiang,T. (1993) A 1961-90 baseline climatology and future
climate change scenarios for Great Britain and Europe. Part I: 1961-90 Great Britain baseline
climatology Climatic Research Unit, Norwich, 43pp. (plus maps).
13. Anthony (2003). The MCDM model: a monthly calculation of water balance for use in
decision support systems. ADAS internal document.
14. Hough,M., Palmer,S., Weir,A., Lee,M. and Barrie,I. (1996). The Meteorological Office Rainfall
and Evaporation Calculation System: MORECS Version 2.0 (1995), An Update to the
Hydrological Memorandum 45, The Meteorological Office.
15. Lord, E. I., Anthony, S.G. and Goodlass, G. (2002). Agricultural nitrogen balance and water
quality in the UK. Soil Use and Management, 18, 363-369.
16. PARCOM (1988). PARCOM Guidelines for calculating mineral balances. European
Commission, Westraat 200, B-1049, Brussels.
17. Boorman, D. Hollis, J. and Liley, A. (1995) Hydrology of soil types: a hydrologically based
classification of the soils of the United Kingdom. Institute of Hydrology Report No. 126,
Wallingford, Oxfordshire, 134pp.
References to published material
9.
This section should be used to record links (hypertext links where possible) or references to other
published material generated by, or relating to this project.
SID 5 (2/05)
Page 23 of 24
SID 5 (2/05)
Page 24 of 24