Genome Annotation Techniques: New Approaches and Challenges
Alistair G. Rust, Emmanuel Mongin and Ewan Birney
Drug Discovery Today Vol. 7, No. 11, S70-76, May 6, 2002

1. Introduction

2. Automatic Genome Annotation Pipelines

In essence, pipelines integrate suites of bioinformatics software tools with multiple databases to manage the analysis and storage of genomic sequence automatically.

The basic structure of an annotation pipeline is as follows:
I. Analysis of raw sequence data into gene predictions.
II. Storage of predictions and features within a relational database.
III. Integration of databases and distribution of annotation data via websites and downloadable data files.

From raw sequence to gene predictions:
I. Raw sequence pre-processing
II. Gene structure prediction

Gene function characterization:
I. Mapping to known genes
II. Protein domain annotation

3. Future opportunities

Cross-species analyses will improve the accuracy of gene predictions and refine the definition of gene function. Protein-coding genes are likely to be highly conserved between closely related species, and other regions, such as RNA genes and regulatory regions, could also be elucidated.

Created by: Jih-Wei Huang
Date: Aug. 22, 2002
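
Illustrative sketch (not from the article): the three-stage pipeline structure outlined in Section 2 can be mimicked in a few lines of Python. Everything here is an assumption made for illustration: the function names (predict_genes, store_predictions, export_tsv) and the table name gene_prediction are hypothetical, a naive forward-strand open-reading-frame scan stands in for real gene structure prediction, and an in-memory SQLite database stands in for the relational database. It shows only the flow of data between the stages, not how a production pipeline works.

```python
import csv
import io
import sqlite3

STOP_CODONS = {"TAA", "TAG", "TGA"}


def predict_genes(sequence, min_codons=3):
    """Stage I (toy stand-in): find forward-strand ORFs (ATG ... stop codon).

    Returns sorted (start, end) pairs, 1-based and inclusive of the stop codon.
    Real pipelines use dedicated gene-prediction tools; this is only a sketch.
    """
    seq = sequence.upper()
    predictions = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count the codons before the stop, including the ATG.
                if (i - start) // 3 >= min_codons:
                    predictions.append((start + 1, i + 3))
                start = None
    return sorted(predictions)


def store_predictions(conn, seq_name, predictions):
    """Stage II: store predictions in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS gene_prediction "
        "(seq_name TEXT, seq_start INTEGER, seq_end INTEGER)"
    )
    conn.executemany(
        "INSERT INTO gene_prediction VALUES (?, ?, ?)",
        [(seq_name, start, end) for start, end in predictions],
    )
    conn.commit()


def export_tsv(conn):
    """Stage III: distribute the annotation as a downloadable tab-separated file."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["seq_name", "start", "end"])
    query = (
        "SELECT seq_name, seq_start, seq_end "
        "FROM gene_prediction ORDER BY seq_start"
    )
    for row in conn.execute(query):
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    found = predict_genes("CCATGAAACCCGGGTAGTT")
    store_predictions(connection, "toy_sequence", found)
    print(export_tsv(connection), end="")
```

Keeping the stages as separate functions that communicate only through the database mirrors the outline above: analysis results are stored first, and distribution reads from the store rather than from the analysis step directly.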