Gene Prediction
Chengwei Luo, Amanda McCook, Nadeem Bulsara, Phillip Lee, Neha Gupta, and Divya Anjan Kumar

Gene Prediction
• Introduction
• Protein-coding gene prediction
• RNA gene prediction
• Modification and finishing
• Project schema

Why gene prediction?
• Experimental way?
• Exponential growth of sequences
• New sequencing technology
• Metagenomics: only ~1% of organisms grow in the lab

How to do it?
It is a complicated task, so let's break it into parts.
Genome
• Protein-coding gene prediction
    • Homology search: Phillip Lee & Divya Anjan Kumar
    • ab initio approach: Nadeem Bulsara & Neha Gupta
• RNA gene prediction: Amanda McCook & Chengwei Luo
    • tRNA
    • rRNA
    • sRNA

Homology Search
• Homology search strategy
• Open reading frame (ORF)
• How/why find ORFs?
• Protein database searches
• Domain searches
• Limits of extrinsic prediction

ab initio Prediction
Homology search is not enough!
• Biased and incomplete database: sequenced genomes are not evenly distributed across the tree of life, so they do not reflect its diversity either.

ab initio Gene Prediction
• Features
    • ORFs (6 frames)
    • Codon statistics
• Features (contd.)
• Probabilistic view
• Supervised techniques
• Unsupervised techniques

Commonly Used Tools
• GeneMark
• Glimmer
• EasyGene
• PRODIGAL

GeneMark
• GeneMark.hmm
• GeneMarkS

Glimmer
• Glimmer journey
• Glimmer3.02

PRODIGAL
Prokaryotic Dynamic Programming Gene Finding Algorithm
• Developed at Oak Ridge National Laboratory and the University of Tennessee
• Features

EasyGene
• Developed at the University of Copenhagen
• Statistical significance is the measure for gene prediction.
• A high-quality data set, based on similarity to Swiss-Prot, is extracted from the genome.
• The data set is used to estimate the HMM; statistical significance is then calculated from the ORF score and length.
Problem:
• No standalone version available

Comparison of Different Tools

RNA Gene Prediction
• Why predict RNA?
• Regulatory sRNA
• sRNA challenges
• Fundamental methodology
• RFAM

What Is Covariance?
Fig: Christian Weile et al. BMC Genomics (2007) 8:244

Noncomparative Prediction
Fig: James A. Goodrich & Jennifer F. Kugel, Nature Rev. Mol. Cell Biol. (2006) 7:612

Noncomparative Prediction
*Rolf Backofen & Wolfgang R. Hess, RNA Biol. (2010) 7:1

Comparative + Noncomparative
• Effective sRNA prediction in V. cholerae
• Non-enterobacteria
• sRNAPredict2
• 32 novel sRNAs predicted
• 9 tested
• 6 confirmed
Jonathan Livny et al. Nucleic Acids Res. (2005) 33:4096

Software
*Rolf Backofen & Wolfgang R. Hess, RNA Biol. (2010) 7:1
Eva K. Freyhult et al. Genome Res. (2007) 17:117

Modification & finishing
• Consensus strategy to integrate ab initio results
• Broken gene recruiting
• TIS correcting
• IS calling
• Operon annotating
• Gene presence/absence analysis

Modification & finishing: consensus strategy, broken gene recruiting
[Figure: flowchart with the labels "ab initio results", "homology search", "pass", "fail", and "candidate fragments"]

Modification & finishing: TIS correcting
• Start codon redundancy: ATG, GTG, TTG, CTG
• Leaderless genes
• Markov iteration, experimentally verified data

Modification & finishing: IS calling, operon annotating
• IS Finder DB

Modification & finishing: gene presence/absence analysis

Schema (proposed)
• Assembly group
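
Appendix: ORF scanning sketch
The slides list "ORFs (6 frames)" as an ab initio feature and note the start codon redundancy (ATG, GTG, TTG, CTG) in TIS correcting. The following is a minimal Python sketch of a naive six-frame ORF scan under those two ideas. It is not the algorithm used by GeneMark, Glimmer, PRODIGAL, or EasyGene. The function names, the `min_codons` threshold, and the choice of TAA/TAG/TGA as stop codons (the usual bacterial set) are assumptions made for illustration.

```python
# Illustrative sketch only: a naive six-frame ORF scan.
# It does not reproduce any of the tools named in the slides.

START_CODONS = {"ATG", "GTG", "TTG", "CTG"}  # start codon redundancy from the TIS slide
STOP_CODONS = {"TAA", "TAG", "TGA"}          # assumption: standard bacterial stop codons

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Scan all six reading frames for start-to-stop ORFs.

    Returns tuples (strand, frame, start, end, orf_sequence).
    start/end are 0-based, end-exclusive, and refer to the scanned
    strand (for "-", the reverse-complemented sequence).
    min_codons is a hypothetical length cutoff that counts the start
    and stop codons; real tools use far larger, tuned thresholds.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first start codon seen in this frame
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3
                    if (end - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, end, s[start:end]))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATAGCC"):
        print(orf)
```

Because the scan always keeps the most upstream start codon in each frame, it tends to over-extend genes at the 5' end. That is one reason a separate TIS correcting step, as listed under Modification & finishing, is needed after candidate ORFs have been collected.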