FunCat™, a controlled vocabulary encompassing the biology of prokaryotes, plants and animals from cellular to systemic level

Dr. Dieter Maier
Manchester Ontologies Workshop 23/24.3.02
Biomax Informatics AG, Lochhamer Str. 11, 82152 Martinsried, Germany
Biomax Informatics AG: Bioinformatics designed with you in mind.

Outline
• Objectives
• Structure
• Content
• Development
• Use

Objectives
• Automatic data management
• No prior knowledge of vocabulary required
• Group genes by functional categories
• Extensible
• Organism independent
• Compatible with other ontologies

Disclaimer
What the FunCat is not:
- A tool for the complete description of functions at the single-gene level

Structure
• Organized hierarchically
• Related functions grouped on different levels
• Internally consistent
=> Provides a data warehouse
- overview of the available selection
- progress from general to specific
- infer from specific to general

Hierarchical structure
- Transcription
- rRNA-transcription
- rRNA-processing
- tRNA-transcription
- mRNA-transcription
- mRNA-processing
- 5'-end processing

Content
• Covers cellular processes, systemic physiology, development and anatomy, from prokaryotes to humans
• 25 main categories with ~1500 sub-categories
• Categories are independent of organism
• Genes can belong to multiple categories
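
The "infer from specific to general" property of the hierarchy can be illustrated with a minimal Python sketch. The category names come from the Transcription example, but the nesting shown here is only an assumed reading of that slide, and the gene names and the helper functions `ancestors` and `roll_up` are hypothetical.

```python
from collections import defaultdict

# Assumed parent links: the nesting is one possible reading of the
# Transcription example, not a confirmed part of FunCat.
PARENT = {
    "Transcription": None,
    "rRNA-transcription": "Transcription",
    "rRNA-processing": "rRNA-transcription",
    "tRNA-transcription": "Transcription",
    "mRNA-transcription": "Transcription",
    "mRNA-processing": "mRNA-transcription",
    "5'-end processing": "mRNA-processing",
}


def ancestors(category):
    """Return the category followed by all of its parents, specific to general."""
    path = []
    while category is not None:
        path.append(category)
        category = PARENT[category]
    return path


def roll_up(annotations):
    """Map each category to the genes annotated to it or to any of its descendants."""
    genes_by_category = defaultdict(set)
    for gene, categories in annotations.items():
        for category in categories:
            for node in ancestors(category):
                genes_by_category[node].add(gene)
    return dict(genes_by_category)


# Made-up genes; geneB shows that a gene can belong to multiple categories.
ANNOTATIONS = {
    "geneA": {"5'-end processing"},
    "geneB": {"rRNA-processing", "tRNA-transcription"},
}
GENES_BY_CATEGORY = roll_up(ANNOTATIONS)
```

Under these assumptions, looking up "Transcription" in `GENES_BY_CATEGORY` yields both genes, while "mRNA-transcription" yields only geneA: an annotation made at a specific level is visible at every more general level.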
Biological process: 1061
- Metabolism: 247
- Energy: 60
- Cell cycle and DNA processing: 54
- Transcription: 31
- Protein synthesis (translation): 11
- Protein fate (folding, modification, destination): 25
- Cellular transport: 32
- Cellular communication: 47
- Cell rescue, defense and virulence: 50
- Regulation / interaction with cellular environment: 45
- Cell fate: 54
- Systemic regulation / interaction with environment: 89
- Development (systemic): 51
- Transposable elements, viral and plasmid proteins: 8
- Control of cellular organisation: 57
- Cell type differentiation: 69
- Tissue differentiation: 40
- Organ differentiation: 91

Localisation: 256
- Subcellular localisation: 63
- Cell type localisation: 69
- Tissue localisation: 41
- Organ localisation: 91

Molecular function: 122
- Enzymatic activity => EC ~4400
- Protein activity regulation: 23
- Protein with binding function / cofactor requirement: 49
- Transport facilitation: 49

Development
• Historical
• Pathways
• Thesaurus
• Complex relations

Structural development
• Proven flexibility: easy to extend
• Stable overall structure
• Compatible with other ontologies such as
- Enzyme Catalogue
- Gene Ontology
- EcoCyc

Development in numbers

                   S. cerevisiae   Plant (A. thaliana) and prokaryotes   Animals (human)
Year               1996            1998                                  2001
Main categories    16              20                                    25
Depth              4               6                                     6
Total              182             528                                   1448

Integrating pathways into processes
The hierarchical structure allows:
- Univocal attribution
- Test for completeness
- Test for consistency
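
One way to read the completeness and consistency tests above is as simple checks over a parent-child table. The following Python sketch assumes that reading: a pathway is modelled as a category whose children are its steps. The category names, the gene names, and both check functions are hypothetical illustrations, not part of FunCat.

```python
# Hypothetical hierarchy: a pathway modelled as a category whose children are its steps.
PARENT = {
    "Metabolism": None,
    "Example pathway": "Metabolism",
    "Step 1": "Example pathway",
    "Step 2": "Example pathway",
    "Step 3": "Example pathway",
}

# Made-up annotations: gene -> the single pathway step it is attributed to in this sketch.
ANNOTATIONS = {"geneA": "Step 1", "geneB": "Step 3"}


def inconsistent_entries(parent):
    """Consistency check: list categories whose parent is missing from the table."""
    return sorted(c for c, p in parent.items() if p is not None and p not in parent)


def missing_steps(parent, annotations, pathway):
    """Completeness check: list steps of a pathway that have no annotated gene."""
    steps = [c for c, p in parent.items() if p == pathway]
    covered = set(annotations.values())
    return [s for s in steps if s not in covered]


CONSISTENCY_ERRORS = inconsistent_entries(PARENT)
GAPS = missing_steps(PARENT, ANNOTATIONS, "Example pathway")
```

With this toy data the hierarchy has no dangling parents, and the completeness check reports "Step 2" as the only pathway step without an annotated gene.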
Integrating additional information
• Create a dynamic ontology from existing ontologies, keywords and linguistic extraction of descriptors from the literature
• Semi-automatic mapping of the dynamic ontology to FunCat

Enabling complex relations
• Intensify multidimensionality
• Enable if ... then ... relations

Use
• Manual annotation
• Automatic annotation
• Data mining

Manual annotation
- multidimensional
- stepwise
Four dimensions

Manual annotation
• 17 manually annotated genomes (5 eukaryotes, 12 prokaryotes)
• H. sapiens, A. thaliana, S. cerevisiae, N. crassa; proprietary: A. niger
• B. subtilis, T. acidophilum, Listeria, 6 public prokaryotes in progress; proprietary: C. glutamicum, C. pneumoniae, 1 undisclosed
• Used for annotation of transcriptomes

Automatic annotation
Sequence similarity to manually annotated proteins (distinguishing experimentally verified from similarity-associated function):
- H. sapiens
- A. thaliana
- S. cerevisiae
- B. subtilis
- T. acidophilum

PEDANT Genome Database
Currently more than 170 genomes (600 000 ORFs)
[Figure: groups covered in the three domains. Bacteria: Green non-sulfur bacteria, Gram positives, Proteobacteria, Cyanobacteria, Flavobacteria, Thermotogales. Archaea: Methanosarcina, Methanobacterium, Methanococcus, Extreme halophiles, Thermoproteus, Pyrodictium. Eucarya: Entamoeba, Slime molds, Animals, Fungi, Plants, Ciliates, Flagellates, Trichomonads, Microsporida, Diplomonads]

Data mining
• Retrieval
• Visualisation
• Mining
• Integration
Queries using the FunCat: group level
- Looking for groups of genes

Single-molecule level
- Retrieving protein entries

The human FunCat
[Figure: categories shown: Cell cycle, Transcription, Translation, Energy, Metabolism, Protein fate, Intracellular transport, Signalling, Unclassified, Cell physiology, Defense]

Comparing genomes
- Sequence similarity: "functional homology"
- Identification of organism-specific functions

Comparing H. sapiens and B. subtilis
[Figure: chart with series H. sapiens and B. subtilis; axis ticks from 0 to 30]

Integrative analysis
- Protein-protein interaction data
- Protein expression data
- Gene expression data
- Functional catalogue

Topological clustering (SOM)

Distribution of the genes

Limitations
Co-expression is no proof of functional association. Integrate evidence from multiple sources.

Integration with annotation
Analyse gene expression data by integrating it with annotation catalogues:
- Functional catalogue
- Phenotypes
- Interaction

Functional projection

Looking at the gene lists

FunCat
- Tool to structure information
- Tool to connect information

Thank you!