De novo Genome Sequencing and Gene Prediction in Lolium perenne, Perennial Ryegrass

Ewan Mollison
The James Hutton Institute

31st International EUCARPIA Symposium, Section Fodder Crops and Amenity Grasses: BREEDING IN A WORLD OF SCARCITY
13 – 17 September 2015, Ghent, Belgium

Methods

Source plant material
  Inbred and partially inbred Lolium perenne lines

Genome sequencing strategy
  207x by Illumina sequencing of PE, MP and LJD libraries; reduced to 105x
  Assembled using CLC Bio with k-mer length 41; scaffolded with SSPACE

Estimate of gene-space coverage
  CEGMA pipeline used to identify coverage of highly conserved genes

Gene prediction
  Ab initio gene prediction using Augustus with wheat-based model
  22 RNA-Seq experiments aligned to Lolium assembly using Tuxedo pipeline

Results

Assembly (genomic scaffolds)

  Total length (Gbp)    1.11
  % GC                  44.16
  No. scaffolds         424,745
  N50                   25,193
  Max. scaffold (bp)    274,411
  Scaffolds >= N50      10,875

CEGMA coverage estimate
  239/248 (96.37%) complete coverage
  246/248 (99.19%) complete or partial coverage

Gene prediction

                               RNA-Seq                      Augustus
                               Genomic    Mt.     Ch.       Genomic    Mt.     Ch.
  Predicted genes              67,706     109     12        188,822    20      0
  Predicted transcripts        111,464    109     18        n/a        n/a     n/a
  Scaffolds with predictions   33,212     3       2         59,900     3       0
  Genes / kb *                 0.051      0.209   0.095     0.23       0.038   0

  Mt. = mitochondrial; Ch. = chloroplast
  * Genes / kb gene-containing scaffolds

Discussion & conclusion

Assembly and coverage of gene-space
  Around 40% of the expected genome size has been captured by assembly
  CEGMA analysis indicates a good level of coverage of the gene-space has been achieved

Overlapping predictions with transcripts
  44,252 predictions from genomic scaffolds and 3 from mitochondrial have supporting evidence from RNA-Seq, based on reciprocal overlap of 20% using BEDTools intersect

Acknowledgements
  This work is funded as part of a Teagasc Walsh Fellowship PhD studentship
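
The 20% reciprocal-overlap criterion used to match Augustus predictions with RNA-Seq transcripts can be illustrated with a short Python sketch. It is not the poster's pipeline: the poster only states that BEDTools intersect was used, and the exact command is not given (in BEDTools this kind of filter is usually written with the `-f 0.2 -r` options). The `Interval` type, the function names and the example coordinates below are hypothetical, and intervals are assumed to follow the BED convention (0-based start, exclusive end).

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    """A feature on a scaffold, BED-style: 0-based start, exclusive end."""
    scaffold: str
    start: int
    end: int


def overlap_length(a: Interval, b: Interval) -> int:
    """Number of bases shared by a and b (0 if on different scaffolds)."""
    if a.scaffold != b.scaffold:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def reciprocal_overlap(a: Interval, b: Interval, min_fraction: float = 0.2) -> bool:
    """True if the shared region covers at least min_fraction of BOTH features."""
    shared = overlap_length(a, b)
    if shared == 0:
        return False
    return (shared >= min_fraction * (a.end - a.start)
            and shared >= min_fraction * (b.end - b.start))


def supported_predictions(predictions: List[Interval],
                          transcripts: List[Interval],
                          min_fraction: float = 0.2) -> List[Interval]:
    """Predictions that reciprocally overlap at least one transcript."""
    return [p for p in predictions
            if any(reciprocal_overlap(p, t, min_fraction) for t in transcripts)]


# Hypothetical example coordinates, not data from the poster.
prediction = Interval("scaffold_1", 100, 1100)        # 1,000 bp gene model
good_transcript = Interval("scaffold_1", 800, 1300)   # shares 300 bp with it
long_transcript = Interval("scaffold_1", 0, 10000)    # contains it entirely
```

Requiring the overlap in both directions is what distinguishes this from a one-sided filter: `long_transcript` contains the whole prediction, yet the shared 1,000 bp is only 10% of the transcript, so the pair is rejected. This all-pairs comparison is only meant to show the criterion; for an assembly with hundreds of thousands of scaffolds a sorted or indexed approach, as BEDTools uses, would be needed.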