Genome Annotation Techniques: New Approaches and Challenges
Alistair G. Rust, Emmanuel Mongin and Ewan Birney
Drug Discovery Today Vol.7, No.11, S70-76, May 6, 2002
1. Introduction
2. Automatic Genome Annotation Pipelines
• In essence, pipelines integrate suites of bioinformatics software tools
with multiple databases to automatically manage the analysis and storage
of genomic sequence.
• The basic structure of an annotation pipeline has three stages:
I. Analysis of raw sequence data to produce gene predictions.
II. Storage of predictions and features within a relational database.
III. Integration of databases and distribution of annotation data via
websites and downloadable data files.
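
The storage and distribution stages (II and III) can be sketched in a few lines. This is a minimal illustration only, not the schema of any real annotation pipeline: the table name `feature`, its columns, the `toy_predictor` label and the hard-coded predictions are all hypothetical, and an in-memory SQLite database stands in for a production relational database.

```python
import csv
import io
import sqlite3

# Hypothetical output of stage I (gene prediction):
# (sequence id, start, end, strand, source tool)
predictions = [
    ("contig1", 120, 980, "+", "toy_predictor"),
    ("contig1", 1500, 2300, "-", "toy_predictor"),
]

def store_predictions(conn, rows):
    """Stage II: store predicted features in a relational table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS feature ("
        "seq_id TEXT, seq_start INTEGER, seq_end INTEGER, "
        "strand TEXT, source TEXT)"
    )
    conn.executemany("INSERT INTO feature VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()

def export_tsv(conn):
    """Stage III: dump stored features as tab-delimited text for download."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["seq_id", "seq_start", "seq_end", "strand", "source"])
    query = (
        "SELECT seq_id, seq_start, seq_end, strand, source "
        "FROM feature ORDER BY seq_id, seq_start"
    )
    for row in conn.execute(query):
        writer.writerow(row)
    return out.getvalue()

conn = sqlite3.connect(":memory:")  # stand-in for a real database server
store_predictions(conn, predictions)
tsv = export_tsv(conn)
print(tsv)
```

Keeping storage separate from export means the same table could feed both a website and downloadable files, which is the split the three stages describe.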
• From raw sequence to gene predictions:
I. Raw sequence pre-processing
II. Gene structure prediction
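
As a rough illustration of these two steps, the sketch below cleans a raw sequence and then scans the forward strand for open reading frames (ORFs). Both functions are simplifying assumptions: `preprocess` only normalises characters, and `find_orfs` is a naive stand-in for gene structure prediction, which in practice must also handle introns, the reverse strand and supporting evidence.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}

def preprocess(raw):
    """Drop whitespace and digits, uppercase, and turn non-ACGT symbols into N."""
    seq = re.sub(r"[\s\d]", "", raw).upper()
    return re.sub(r"[^ACGT]", "N", seq)

def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) pairs, 0-based half-open, for forward-strand ORFs.

    An ORF here runs from an ATG to the first in-frame stop codon, and
    min_codons counts every codon in that span, including the stop.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ATG in this frame
    return sorted(orfs)

seq = preprocess("cc atg aaa ccc tag gg")
orfs = find_orfs(seq)
print(seq, orfs)
```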
• Gene function characterization:
I. Mapping to known genes
II. Protein domain annotation
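
A toy version of sequence-based functional annotation is scanning a protein for a known short motif. The sketch below uses the PROSITE N-glycosylation site pattern N-{P}-[ST]-{P} written as a regular expression. It is only a stand-in: real domain annotation relies on curated domain databases and profile-based searches, and the function name `find_motif` and the example protein are hypothetical.

```python
import re

# PROSITE-style pattern N-{P}-[ST]-{P} expressed as a regular expression.
# The lookahead lets overlapping matches be reported.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")

def find_motif(protein, pattern=N_GLYCOSYLATION):
    """Return (0-based position, matched residues) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(protein)]

hits = find_motif("MNGTAANPSA")  # hypothetical protein sequence
print(hits)
```

The second asparagine in the example is followed by a proline, so the pattern rejects it.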
3. Future opportunities
• Cross-species analyses will improve the accuracy of gene predictions
and refine the definition of gene function.
• Protein-coding genes are likely to be highly conserved between closely
related species. Comparing genomes could also help elucidate other
conserved regions, such as RNA genes and regulatory regions.
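
One simple way to picture cross-species comparison is to score sequence identity in sliding windows over two sequences that have already been aligned. The sketch below is an illustration under stated assumptions, not a method from the paper: the function names, the window size and the 0.8 threshold are arbitrary choices, and producing the alignment is left out.

```python
def window_identity(aln_a, aln_b, window=10, step=5):
    """Return (window start, fraction of identical non-gap columns) pairs."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for start in range(0, len(aln_a) - window + 1, step):
        pairs = zip(aln_a[start:start + window], aln_b[start:start + window])
        matches = sum(1 for x, y in pairs if x == y and x != "-")
        scores.append((start, matches / window))
    return scores

def conserved_windows(aln_a, aln_b, threshold=0.8, **kwargs):
    """Return start positions of windows whose identity meets the threshold."""
    scored = window_identity(aln_a, aln_b, **kwargs)
    return [start for start, identity in scored if identity >= threshold]

# Hypothetical pre-aligned sequences from two species.
species_a = "ACGTACGTACGGGGGGGGGG"
species_b = "ACGTACGTACTTTTTTTTTT"
scores = window_identity(species_a, species_b, window=10, step=10)
conserved = conserved_windows(species_a, species_b, window=10, step=10)
print(scores, conserved)
```

Windows that stay above the threshold are candidates for conserved features, whether coding or not.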
Created by: Jih-Wei Huang
Date: Aug. 22, 2002