Bio-Ontologies in the Context of the BOOTStrep Project

Andrea Splendiani1, Elena Beisswanger2, Jung-Jae Kim3, Vivian Lee3, Olivier Dameron1, Udo Hahn2 and Dietrich Rebholz-Schuhmann3

1 Laboratoire d'Informatique Médicale, Université de Rennes 1, Rennes, FR
2 Language and Information Engineering (JULIE) Lab, Friedrich-Schiller-Universität, Jena, DE
3 Rebholz Group, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, U.K.

Aims of the BOOTStrep Project
• Construction of a wide-coverage integrated lexical, conceptual and factual knowledge resource for the biology domain, consisting of:
  - Bio-Lexicon
  - Bio-Ontology
  - Bio-Fact Store
• Development of a comprehensive natural language processing (NLP) pipeline
• Incremental enrichment of the BOOTStrep knowledge resources by information automatically extracted from scientific literature

Bio-Ontology: Pivotal Knowledge Resource
• OWL DL as the knowledge representation language for the Bio-Ontology
• Bio-Ontology provides:
  - Vocabulary for semantic annotation of a selected text corpus (only a subset of the ontology is used)
  - Semantics for accessing the Bio-Fact Store and for populating the Bio-Lexicon
• Bio-Ontology enables:
  - Consistency checking of ontological terms and facts in the Bio-Fact Store (using a terminological classifier)
  - Logical inferencing that allows for advanced text analysis and interpretation (disambiguation of terms, event identification, anaphora resolution, etc.) and interpretation of facts in the Bio-Fact Store

From a Specific GRO to an Integrated Architecture for the Bio-Ontology
• The current version of GRO (V 0.1) covers fundamental types of the gene regulation domain
• Bottom-up construction process that was primarily driven by the task of semantic annotation of gene regulation events in a corpus of scientific abstracts
• Top-down definition according to the upper level ontology BFO1 is under way
• (Semi-automatic) enhancement towards a broader and more comprehensive three-layer Bio-Ontology reusing existing alternative terminological and database resources
1 http://www.ifomis.org/bfo/

Three-Layer Architecture for Bio-Ontology
[Figure: Three-Layer Architecture for Bio-Ontology. Components: Text, Bio-Lexicon, Bio-Fact Store, Bio-Ontology (Three Layers). Link labels: Type-to-Linguistic-Variants-Linking, Semantic Annotation, Inference / Consistency Checking, Population and Cleaning of Ontology. Layer labels: Top-Level Ontology; GRO and Parts of Other Bio-Ontologies; Database Entries: Genes and Proteins.]

Characteristics of the Gene Regulation Ontology (GRO, V 0.1)
- 250 classes
- 150 property restrictions
- Taxonomic is-a relation plus five fundamental semantic relation types (has-part, has-participant with the subrelations has-agent and has-patient, and species; plus inverses and reciprocals)
- Two major branches describing physical objects and processes in the field of gene regulation, and their interrelations
- Integrates parts of existing bio-ontologies and terminological resources, such as:
  • Gene Ontology1
  • INOH Molecular Role Ontology1
  • INOH Event Ontology1
  • Sequence Ontology1
  • TransFac2
1 http://obofoundry.org
2 http://www.gene-regulation.com/

GRO (0.1) – Fragment of Continuants
[Figure: fragment of the GRO class hierarchy. Node labels: Entity; Continuant; Function; Biological Function; Cell; Occurrent; Independent Cont.; Dependent Cont.; Quality; Process Boundary; Object; Organism; E.coli; Molecular Entity; Nucleotide Sequence; Protein; Gene; Transcription Factor; Biological Process; Gene Expression; Regulation of Biological Process; Regulation of Gene Expression; DNA-Binding Transcription Repressor; Process Regulation; Mouse; Human; Translation; Transcription. Relation labels: is-a, instance-of, part-of, has-agent, has-patient, species.]

Focus on Gene Regulation
• Domain chosen: gene regulation (in E. coli, though scaling to other model organisms is anticipated in the design)

Example abstract: Multiple regulation involved in the expression of the uxuR regulatory gene in Escherichia coli K-12
"... These results indicate that the expression of the uxuR gene is repressed by its own product but also by the exuR repressor. ..."

Bio-Lexicon
Escherichia coli
  Synonym: E.coli
uxuR
  Synonym: ECK4315
  Synonym: JW4287
  Variant: …
exuR
  Synonym: ECK3085
  Synonym: JW3065

Bio-Fact Store
Regulates (uxuR, uxuR expression, E.coli K-12)
Regulates (exuR, uxuR expression, E.coli K-12)

Acknowledgements
BOOTStrep (FP6 - 028099) is a Specific Targeted Research Project (STREP) funded by the European Union
http://www.bootstrep.eu/
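
Illustrative sketch (not part of the poster)
The Bio-Lexicon and Bio-Fact Store entries shown for the uxuR example can be pictured as two small data structures: a synonym table and a list of Regulates facts. The Python sketch below is only an illustration under stated assumptions. The field names agent, patient and species are a guess at how the three arguments of Regulates map onto GRO's has-agent, has-patient and species relations, and the helper names (normalize, regulators_of, facts_with_agent) are hypothetical. It does not reproduce the project's OWL DL representation or its terminological classifier.

```python
from typing import NamedTuple

# Bio-Lexicon entries copied from the poster: canonical name -> synonyms.
LEXICON = {
    "Escherichia coli": {"E.coli"},
    "uxuR": {"ECK4315", "JW4287"},
    "exuR": {"ECK3085", "JW3065"},
}

# Hypothetical reverse index: any surface form -> canonical name.
SURFACE_TO_CANONICAL = {
    form: name
    for name, synonyms in LEXICON.items()
    for form in {name, *synonyms}
}


class Regulates(NamedTuple):
    """One fact in the store. Field names are an assumption (see lead-in)."""
    agent: str
    patient: str
    species: str


# The two facts listed on the poster for the uxuR abstract.
FACTS = [
    Regulates("uxuR", "uxuR expression", "E.coli K-12"),
    Regulates("exuR", "uxuR expression", "E.coli K-12"),
]


def normalize(surface_form: str) -> str:
    """Map a synonym to its canonical name; unknown forms pass through."""
    return SURFACE_TO_CANONICAL.get(surface_form, surface_form)


def regulators_of(patient: str) -> list[str]:
    """Return the agents of all Regulates facts with the given patient."""
    return [fact.agent for fact in FACTS if fact.patient == patient]


def facts_with_agent(surface_form: str) -> list[Regulates]:
    """Look up facts by any lexicon synonym of the agent."""
    agent = normalize(surface_form)
    return [fact for fact in FACTS if fact.agent == agent]
```

For example, facts_with_agent("JW4287") resolves the synonym to uxuR before querying, which mirrors the role the poster assigns to the Bio-Lexicon: linking linguistic variants in text to the types and facts held in the other two resources.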