JRA1 - Del. 4.3
Aims
The central aim of the work of WP4 was to create an online predictive tool for curators.
Originally termed PrediCtoR, the tool was moved to a new website, thermal-age.eu, as a result of the
shift to next-generation sequencing as the primary method of DNA sequencing for museum remains.
Thermal-age.eu
1. Calculates the thermal history of a site and uses this to predict DNA fragment length.
2. Keeps indexed records of past calculations so these may be published and subject to
scrutiny.
3. Collects data from users to help refine the quality of predictions.
4. Provides detailed explanations and supporting materials to help users understand the
strengths and limitations of the numbers we can produce.
Deliverable 4.3
Thermal-age.eu is designed to help collections managers and users quantify the risks
associated with destructive analysis of specimens. The website, originally to be entitled PrediCtoR,
was to predict the amplification success of PCR, and in particular to highlight the importance of
sample size vs. PCR amplicon length. It appears non-intuitive to most researchers that copy number
scales in direct proportion to the size of the sample, whilst the survival of DNA fragments decreases
exponentially with fragment length. PrediCtoR was therefore a web tool to encourage researchers to
reduce sample size for destructive analysis (see WP6). Changes in DNA sequencing technologies
(so-called next-generation sequencing platforms) required us to change the focus of the site. The
site now reports predicted fragment length, rather than the copy number of a PCR fragment of a
specific length. As a result, the original objective in D4.3 of normalising results to sample
size is no longer relevant to users and, because next-generation sequencing has largely replaced
PCR experiments, data on sample size are not available from those undertaking experimental
work. Instead, the model now reports a probability distribution for the recovery of different fragment
lengths, which is more useful as a tool for excluding specimens where DNA has degraded below a
recoverable threshold.
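The contrast between linear scaling with sample size and exponential loss with fragment length can be
illustrated with a minimal sketch. It assumes that strand breaks occur independently at every site,
so the fraction of intact templates of length L is exp(-lambda * L). This model, the function names
and all numerical inputs are assumptions made for illustration and are not taken from the
thermal-age.eu code.

```python
import math


def surviving_fraction(fragment_length_bp: float, breaks_per_site: float) -> float:
    """Fraction of templates of the given length that contain no strand break.

    Assumes breaks occur independently with the same probability at every
    site (an assumption of this sketch, not a statement of the site's model).
    """
    return math.exp(-breaks_per_site * fragment_length_bp)


def expected_templates(sample_mass_mg: float, copies_per_mg: float,
                       fragment_length_bp: float, breaks_per_site: float) -> float:
    """Expected number of intact templates of the target length in a sample."""
    intact = surviving_fraction(fragment_length_bp, breaks_per_site)
    return sample_mass_mg * copies_per_mg * intact


# Hypothetical inputs chosen only to illustrate the scaling.
BREAKS_PER_SITE = 0.02   # corresponds to a mean fragment length of about 50 bp
COPIES_PER_MG = 1000.0

base = expected_templates(50, COPIES_PER_MG, 100, BREAKS_PER_SITE)
double_mass = expected_templates(100, COPIES_PER_MG, 100, BREAKS_PER_SITE)
double_length = expected_templates(50, COPIES_PER_MG, 200, BREAKS_PER_SITE)

print(f"doubling sample mass:     x{double_mass / base:.2f}")
print(f"doubling amplicon length: x{double_length / base:.3f}")
```

Under these assumptions, doubling the sample only doubles the number of templates, whereas doubling the
target length from 100 bp to 200 bp multiplies it by exp(-2), roughly 0.14. Choosing a shorter target
therefore gains far more than taking a larger sample.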
Refinement of the web tool and model code
Following user feedback and in response to the advances in technology discussed above, both the
web tool and the underlying modelling software were refined as follows:
1. Improved temperature resolution, switching from a 1° x 1° resolution (ISLSCP) to ~1 km x
1 km (0.05° x 0.05° WorldClim).
2. Improved altitude correction, using WorldClim, Google Maps API and Wikipedia (e.g.
elevation for central Brussels: PMIP = 128 m, WorldClim = 62.5 m, Google = 64.84 m (153 m
resolution)).
3. Ability to search for all places on Wikipedia with a latitude and longitude: this has the benefit
(over searching e.g. Google Maps API for places) of including many archaeological sites
and geographical features in addition to place names. Search results can be previewed on
the map, and selecting one automatically fills in the elevation and a brief description of the place.
4. Improved soil types input - using a sliding scale to estimate thermal diffusivity in different
soils of typical granularity based on a collection of known values given soil type and water
content. The model can use multiple soil layers.
5. Improved processing speed by refining calculation software in parallel with adding new
features and ability to process arbitrarily large datasets in a spreadsheet.
6. Ensuring the site is mobile compatible, for use on phones and tablet computers.
7. Values previously entered into any job can be loaded instantly into any screen of the wizard,
allowing users to easily run additional jobs where some of the input is the same as in previous
runs (e.g. where more than one specimen comes from the same site).
8. Enabling large numbers of specimens to be added using a spreadsheet input (thermal-age.eu
generates a ‘quick start’ spreadsheet based on the user’s requirements, including the correct column
headings and example rows to show how to enter data correctly).
9. Enabling many users to simultaneously queue up spreadsheets as well as wizard runs for a
single specimen. The processing of large (spreadsheet) jobs is suspended and resumed
such that the longer a job has been running, the lower its priority against competing jobs in
the queue. This means smaller jobs are always turned around as quickly as possible while
the system cannot be “blocked” by one very large job.
10. Providing a Dashboard which lists all your activity on the site and shows the status of
currently running jobs. This is especially useful as large spreadsheets of results can take
some time to process.
11. Providing an interactive report in PDF form, which can be printed (see 4 in Roll-out).
12. Providing a means of making searches publicly accessible (including an ability to embargo
the results to a set date - and change this at will).
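The queueing behaviour described in item 9 can be sketched as follows. This is a minimal illustration
that assumes jobs are processed in fixed-size slices of rows and that the job which has received the
least processing so far always runs next. The function name, the slice size and the job names are
hypothetical, and the actual scheduler used by thermal-age.eu may differ.

```python
import heapq
from itertools import count


def run_queue(jobs: dict, slice_rows: int = 10) -> list:
    """Return the order in which jobs finish.

    jobs maps a job name to its number of rows. Each job is run one slice
    at a time, then suspended and re-queued with a priority equal to the
    work it has already received, so long-running jobs yield to newer ones.
    """
    tie_breaker = count()  # keeps ordering stable when priorities are equal
    heap = [(0, next(tie_breaker), name, rows) for name, rows in jobs.items()]
    heapq.heapify(heap)

    finished = []
    while heap:
        rows_done, _, name, rows_left = heapq.heappop(heap)
        step = min(slice_rows, rows_left)
        rows_done += step
        rows_left -= step
        if rows_left == 0:
            finished.append(name)
        else:
            # Suspend the job and resume it later at a lower priority.
            heapq.heappush(heap, (rows_done, next(tie_breaker), name, rows_left))
    return finished


# A large spreadsheet submitted first, followed by two single-specimen wizard runs.
order = run_queue({"big_spreadsheet": 100, "wizard_a": 1, "wizard_b": 1})
print(order)
```

In this sketch the spreadsheet is submitted first but gives way after its first slice, so both wizard
runs complete before it and the queue is never blocked by the large job.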
User-entered database of results & analysis
Researchers are able to upload actual results of DNA sequencing and compare them against previously
run predictions, allowing data capture and comparison between predicted and measured DNA survival.
These can optionally be published en masse, again with the option to embargo results.
Once results are made public they are assigned a permanent canonical URL (suitable for publication)
and may not then be removed from the site.
A graph showing a comparison between the predicted and actual experimental results (where these have
been provided) is automatically generated. An example is shown to the right. This provides a qualitative
indicator of the quality of predictions (which appears strong) as well as giving an overview of the
relative predicted and actual survival of DNA. The “traffic lights” colour coding indicates
preservation from good (green) to complete destruction (black).
Improving access to collections
Thermal-age.eu enables curators to assess the likelihood of DNA preservation, the original aim of
the project. However, the Synthesys II management team realised that a more effective method of
using the tool was to offload the analysis onto the researcher collecting material. Therefore an
interactive PDF Report was produced (which enables each of the figures to be downloaded in PDF,
PNG, or SVG format).
This has a number of advantages:
1) If the sample is identified as unsuitable, the request will never be made.
2) The researcher gains an insight into where and when the samples they are interested in analysing
are predicted to fail.
3) The reporting tool offers a way to report on thermal-age.eu data, highlighting instances in which
the predicted fragment length is incorrect.
How good is it?
The tool is currently slightly ahead of its time, as there have been relatively few studies which have
reported DNA fragment length. Google Scholar lists three articles which report having used
the tool. A total of 145 unique visitors have made 237 visits to the site, spending an average of
6 minutes per visit, which is the time needed to run at least one analysis.
However, work with leading researchers in the field of ancient DNA has produced results that appear
very promising. The following, from a recent submission to a high-profile publication, illustrates the
steps within thermal-age.eu and the quality of the prediction (references not given).
A thermal age (Smith et al., 2003), the equivalent age if held at a constant 10 °C, of 4200 years was
estimated for the sample, based upon an effective burial temperature Teff of 4.3 °C. This
estimation assumed the sample was buried to 1 m in soil with a thermal diffusivity of 0.029 m² day⁻¹
(silt-loam, 10% water) for 12,740 years (until 1968; maximum age 12,722 CAL BP 2σ C.I.),
then held in highly variable conditions (15 °C +/- 7 °C) until the present day. Seasonal fluctuations in
monthly temperature at the site were estimated from the WorldClim dataset (Hijmans et al., 2005).
The altitude difference between the 1 x 1 km squares of WorldClim and the site was corrected by
comparing the altitude of the WorldClim grid with the altitude from the DEM in Google Earth (1552
m) and applying a standard environmental lapse rate of 6.49 °C/1,000 m. The extent to
which temperature decreased beyond the Holocene was estimated from the difference in the 1° ×
1° PMIP2 grid for the region at three time intervals: Modern (pre-industrial), Holocene (6 ka) and
LGM (Braconnot et al., 2007). These data were correlated against the equivalent time intervals
from the Bintanja et al. (2005) curve, and the correlation was used to transform the latter temperature
series to reflect the temperature change in this region. Using the rate data estimated in Allentoft et
al. (2012), our thermal age estimates slightly under-predict the extent of degradation (rate of
0.99E-6 yr⁻¹) when compared with that estimated from the observed fragment length (1.3E-6 yr⁻¹).
We do not have detailed data on the net thermal diffusivity of the soil, precise burial depth, burial
depth over time, or storage history post 1968. The sensitivity of the model to these estimated
factors is illustrated by the fact that if the burial depth is reduced to 0.5 m, the estimated rate
(1.4E-6 yr⁻¹) is higher than the observed value.
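Three of the steps in the passage above can be sketched in a few lines: the lapse-rate altitude
correction, the damping of seasonal temperature swings with burial depth, and the conversion of a
burial period at Teff into an equivalent age at 10 °C. This is a simplified illustration and not the
thermal-age.eu code. The lapse rate and diffusivity are the values quoted above. The activation energy
is an assumed illustrative value that is not stated in the text, and the use of the standard
heat-conduction damping depth and of a simple Arrhenius scaling are likewise assumptions of this sketch.

```python
import math

GAS_CONSTANT = 8.314462618        # J mol^-1 K^-1
LAPSE_RATE_C_PER_M = 6.49 / 1000  # environmental lapse rate quoted in the text
ACTIVATION_ENERGY = 127_000.0     # J mol^-1; ASSUMED for illustration only


def lapse_corrected(grid_temp_c: float, grid_altitude_m: float,
                    site_altitude_m: float) -> float:
    """Adjust a gridded temperature to the altitude of the site."""
    return grid_temp_c - LAPSE_RATE_C_PER_M * (site_altitude_m - grid_altitude_m)


def damped_amplitude(surface_amplitude_c: float, depth_m: float,
                     diffusivity_m2_per_day: float,
                     period_days: float = 365.25) -> float:
    """Amplitude of a periodic surface temperature wave at a given depth.

    Uses the damping depth sqrt(diffusivity * period / pi) for a uniform soil.
    """
    damping_depth_m = math.sqrt(diffusivity_m2_per_day * period_days / math.pi)
    return surface_amplitude_c * math.exp(-depth_m / damping_depth_m)


def thermal_age(years: float, effective_temp_c: float,
                reference_temp_c: float = 10.0,
                activation_energy: float = ACTIVATION_ENERGY) -> float:
    """Equivalent age at the reference temperature (Arrhenius scaling)."""
    t_eff = effective_temp_c + 273.15
    t_ref = reference_temp_c + 273.15
    relative_rate = math.exp(-activation_energy / GAS_CONSTANT
                             * (1.0 / t_eff - 1.0 / t_ref))
    return years * relative_rate


# Burial period and Teff from the text; the remaining inputs are hypothetical.
print(f"thermal age: {thermal_age(12_740, 4.3):.0f} years at 10 °C")
print(f"amplitude at 1.0 m: {damped_amplitude(7.0, 1.0, 0.029):.2f} °C")
print(f"amplitude at 0.5 m: {damped_amplitude(7.0, 0.5, 0.029):.2f} °C")
```

With the assumed activation energy, 12,740 years at 4.3 °C corresponds to a thermal age on the order of
4,200 years, in line with the figure quoted above. The larger seasonal amplitude at 0.5 m than at 1 m
is consistent with the higher estimated rate reported for the shallower burial depth.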
Achievements
Scheduled Deliverables
4.1 Production of pilot website for PrediCtoR now http://thermal-age.eu
4.2 Proof of concept using both temperature plus large DNA datasets & DNA validation
(presented as report)
4.3 Software refined: thermal modelling, normalisation to sample size; development and testing of
user-entered database, analysis and reporting
Other benefits
Funding obtained as a result of JRA1 & JRA3 activities
Systematics Association (2011-12) £9 501. Collagen: the Barcode of Death. PI Matthew Collins
(York: JRA 1), co-I with Sam Turvey (IoZ).
What next?
The site is now live and stable, generating reports, and requesting feedback on predictions (to
improve the accuracy of the site). We would like to increase the number of types of tissue modelled
to include (for example) dried plants and insects.
We are seeking additional funding to support this, either via a UK-funded CASE studentship
between York and the NHM or via a fellowship application.