De novo Genome Sequencing and Gene Prediction in Lolium perenne, Perennial Ryegrass

Ewan Mollison
The James Hutton Institute

31st International EUCARPIA Symposium, Section Fodder Crops and Amenity Grasses: BREEDING IN A WORLD OF SCARCITY
13 – 17 September 2015, Ghent, Belgium

Methods

Source plant material
  Inbred and partially inbred Lolium perenne lines

Genome sequencing strategy
  207x by Illumina sequencing of PE, MP and LJD libraries; reduced to 105x
  Assembled using CLC Bio with k-mer length 41; scaffolded with SSPACE

Estimate of gene-space coverage
  CEGMA pipeline used to identify coverage of highly conserved genes

Gene prediction
  Ab initio gene prediction using Augustus with wheat-based model
  22 RNA-Seq experiments aligned to Lolium assembly using Tuxedo pipeline

Results

Assembly (genomic scaffolds)
  Total length (Gbp)      1.11
  % GC                    44.16
  No. scaffolds           424,745
  N50                     25,193
  Max. scaffold (bp)      274,411
  Scaffolds >= N50        10,875

CEGMA coverage estimate
  239/248 (96.37%) complete coverage
  246/248 (99.19%) complete or partial coverage

Gene prediction
                              RNA-Seq                    Augustus
                              Genomic   Mt.     Ch.      Genomic   Mt.     Ch.
  Predicted genes             67,706    109     12       188,822   20      0
  Predicted transcripts       111,464   109     18       n/a       n/a     n/a
  Scaffolds with predictions  33,212    3       2        59,900    3       0
  Genes / kb *                0.051     0.209   0.095    0.23      0.038   0

  Mt. = mitochondrial; Ch. = chloroplast
  * Genes / kb gene-containing scaffolds

Discussion & conclusion

Assembly and coverage of gene-space
  Around 40% of the expected genome size has been captured by assembly
  CEGMA analysis indicates a good level of coverage of the gene-space has been achieved

Overlapping predictions with transcripts
  44,252 predictions from genomic scaffolds and 3 from mitochondrial have supporting evidence from RNA-Seq, based on reciprocal overlap of 20% using BEDTools intersect

Acknowledgements
  This work is funded as part of a Teagasc Walsh Fellowship PhD studentship
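Note on the reciprocal overlap criterion

The 20% reciprocal overlap used to match Augustus predictions to RNA-Seq transcripts can be illustrated with a minimal Python sketch. It assumes BED-style 0-based, half-open coordinates on the same scaffold, and reads "reciprocal" as the shared region covering at least 20% of each of the two features, which is how BEDTools intersect behaves when a minimum overlap fraction is combined with its reciprocal option. The exact command line used for the poster is not given, and the function names and example coordinates below are hypothetical.

```python
def overlap_length(a_start, a_end, b_start, b_end):
    """Bases shared by two half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def reciprocal_overlap(a, b, min_fraction=0.20):
    """Return True if the shared region covers at least `min_fraction`
    of BOTH intervals. `a` and `b` are (start, end) tuples assumed to
    lie on the same scaffold (hypothetical helper, not part of BEDTools)."""
    len_a = a[1] - a[0]
    len_b = b[1] - b[0]
    if len_a <= 0 or len_b <= 0:
        return False  # empty or malformed interval
    shared = overlap_length(a[0], a[1], b[0], b[1])
    return shared >= min_fraction * len_a and shared >= min_fraction * len_b


# Hypothetical coordinates for one predicted gene and two transcripts.
prediction = (0, 100)
transcript_supported = (70, 170)   # shares 30 bp: 30% of each feature
transcript_too_long = (90, 1090)   # shares 10 bp: 10% of the prediction

print(reciprocal_overlap(prediction, transcript_supported))
print(reciprocal_overlap(prediction, transcript_too_long))
```

Requiring the threshold in both directions stops a short prediction from counting as supported merely because it falls inside a much longer transcript, and vice versa.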