* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Download Biological ontologies for human functional annotation and
Endogenous retrovirus wikipedia , lookup
Biosynthesis wikipedia , lookup
DNA supercoil wikipedia , lookup
Signal transduction wikipedia , lookup
Gene regulatory network wikipedia , lookup
Magnesium transporter wikipedia , lookup
Paracrine signalling wikipedia , lookup
Deoxyribozyme wikipedia , lookup
Transcriptional regulation wikipedia , lookup
Western blot wikipedia , lookup
Expression vector wikipedia , lookup
Interactome wikipedia , lookup
Metalloprotein wikipedia , lookup
Silencer (genetics) wikipedia , lookup
Protein purification wikipedia , lookup
Biochemistry wikipedia , lookup
Proteolysis wikipedia , lookup
Artificial gene synthesis wikipedia , lookup
Gene expression wikipedia , lookup
Protein–protein interaction wikipedia , lookup
Point mutation wikipedia , lookup
From GENIA to BioTop – Towards a Top-Level Ontology for Biology Stefan Schulz, Elena Beisswanger, Udo Hahn, Joachim Wermter, Anand Kumar, Holger Stenzhorn Freiburg University, Jena University, Saarland University Germany Need for Ontologies in Biology Raw Data Experimental Data Need for Ontologies in Biology Raw Data Experimental Data MEDLINE Literature Collection Need for Ontologies in Biology UniProt Raw Data Experimental Data Yeast FlyBase Databases Mouse MEDLINE Literature Collection Need for Ontologies in Biology UniProt Raw Data Experimental Data Yeast Ontologies FlyBase Databases Mouse MEDLINE Literature Collection Flood of Semi-Formal Bio-Ontologies OBO (Open Biomedical Ontologies) • About 60 ‘ontologies’ - • • Gene Ontology (GO) Sequence Ontology (SO) Cell Ontology (CO) Chemical entities of biological interest (ChEBI) Anatomy of model organisms … and others Ontologies unconnected, upper level missing OBO foundry project - Standards for bio-medical ontology development Ontology integration Ontological Layers Upper Ontology / Top Ontology Domain Upper Ontology / Top Domain Ontology Domain Ontology Ontological Layers Upper Ontology / Top Ontology Domain Upper Ontology / Top Domain Ontology Domain Ontology OBO (GO, SO, ChEBI, …) Ontological Layers Upper Ontology / Top Ontology • Domain Upper Ontology / Top Domain Ontology Transcription • DNA-dependent transcription • antisense RNA transcription • mRNA transcription • rRNA transcription • tRNA transcription • … (from Gene Ontology) Domain Ontology OBO (GO, SO, ChEBI, …) Ontological Layers DOLCE BFO Upper Ontology / Top Ontology Domain Upper Ontology / Top Domain Ontology Domain Ontology OBO (GO, SO, ChEBI, …) Ontological Layers DOLCE BFO •Entity • Continuant • Dependent Continuant • Realizable Entity • Function • Role • Independent Continuant • Object • Object Aggregate • Occurrent (from BFO) Upper Ontology / Top Ontology Domain Upper Ontology / Top Domain Ontology Domain Ontology OBO (GO, SO, ChEBI, …) Ontological Layers DOLCE BFO Upper Ontology / Top Ontology Domain Upper Ontology / Top Domain Ontology Domain Ontology OBO (GO, SO, ChEBI, …) Ontological Layers DOLCE BFO Upper Ontology / Top Ontology Domain Upper Ontology / Top Domain Ontology Domain Ontology OBO (GO, SO, ChEBI, …) Ontological Layers DOLCE BFO Upper Ontology / Top Ontology Biology Domain: • Organism • Body Part • Cell • Cell Component • Tissue • Protein • Nucleic Acid • DNA • RNA • Biological Function • Biological Process Domain Upper Ontology / Top Domain Ontology Domain Ontology OBO (GO, SO, ChEBI, …) Ontological Layers DOLCE BFO Upper Ontology / Top Ontology Simple Bio Upper Ontology Domain Upper Ontology / Top Domain Ontology Domain Ontology OBO (GO, SO, ChEBI, …) GFO-Bio GENIA Ontological Layers DOLCE BFO BioTop Upper Ontology / Top Ontology Simple Bio Upper Ontology Domain Upper Ontology / Top Domain Ontology Domain Ontology OBO (GO, SO, ChEBI, …) GFO-Bio GENIA GENIA Ontology “The GENIA ontology is intended to be a formal model of cell signaling reactions in human. It is to be used as a basis of thesauri and semantic dictionaries for natural language processing applications […] Another use of the GENIA ontology is to provide a basis for integrated view of multiple databases” • • • • Developed at Tsujii Laboratory, University Tokyo Taxonomy, 48 classes Verbal definitions (“scope notes”) Purpose: Semantic annotation of biological papers Annotation of Biological Entities We have shown that <PROTEIN_MOLECULE> interleukin-1 (IL-1) </PROTEIN_MOLECULE > and <PROTEIN_MOLECULE> IL-2 </PROTEIN_MOLECULE> control <DNA_DOMAIN> IL-2 receptor alpha (IL-2R alpha) gene </DNA_DOMAIN> transcription in <CELL_LINE> CD4-CD8- murine Tlymphocyte precursors </CELL_LINE>. Here we map the <DNA_FAMILY> cisacting elements </DNA_FAMILY> that mediate <PROTEIN_FAMILY> interleukin </PROTEIN_FAMILY> responsiveness of the <DNA_DOMAIN> mouse IL-2R alpha gene </DNA_FAMILY> using a <CELL_LINE> thymic lymphoma-derived hybridoma (PC60) </CELL_LINE>. The transcriptional response of the <DNA_DOMAIN> IL-2R alpha gene </DNA_DOMAIN> to stimulation by <PROTEIN_MOLECULE> IL-1 </PROTEIN_MOLECULE> + <PROTEIN_MOLECULE> IL-2 </PROTEIN_MOLECULE> is biphasic. <PROTEIN_MOLECULE> IL-1 </PROTEIN_MOLECULE> induces a rapid, protein synthesis-independent appearance of <RNA_MOLECULE> IL-2R alpha mRNA </RNA_MOLECULE> that is blocked by inhibitors of <PROTEIN_MOLECULE> NF-kappa B </PROTEIN_MOLECULE> activation. Annotation of Biological Entities We have shown that <PROTEIN_MOLECULE> interleukin-1 (IL-1) </PROTEIN_MOLECULE > and <PROTEIN_MOLECULE> IL-2 </PROTEIN_MOLECULE> control <DNA_DOMAIN> IL-2 receptor alpha (IL-2R alpha) gene </DNA_DOMAIN> transcription in <CELL_LINE> CD4-CD8- murine Tlymphocyte precursors </CELL_LINE>. Here we map the <DNA_FAMILY> cisacting elements </DNA_FAMILY> that mediate <PROTEIN_FAMILY> interleukin </PROTEIN_FAMILY> responsiveness of the <DNA_DOMAIN> mouse IL-2R alpha gene </DNA_FAMILY> using a <CELL_LINE> thymic lymphoma-derived hybridoma (PC60) </CELL_LINE>. The transcriptional response of the <DNA_DOMAIN> IL-2R alpha gene </DNA_DOMAIN> to stimulation by <PROTEIN_MOLECULE> IL-1 </PROTEIN_MOLECULE> + <PROTEIN_MOLECULE> IL-2 </PROTEIN_MOLECULE> is biphasic. <PROTEIN_MOLECULE> IL-1 </PROTEIN_MOLECULE> induces a rapid, protein synthesis-independent appearance of <RNA_MOLECULE> IL-2R alpha mRNA </RNA_MOLECULE> that is blocked by inhibitors of <PROTEIN_MOLECULE> NF-kappa B </PROTEIN_MOLECULE> activation. Annotation of Biological Entities We have shown that <PROTEIN_MOLECULE> interleukin-1 (IL-1) </PROTEIN_MOLECULE > and <PROTEIN_MOLECULE> IL-2 </PROTEIN_MOLECULE> control <DNA_DOMAIN> IL-2 receptor alpha (IL-2R alpha) gene </DNA_DOMAIN> transcription in <CELL_LINE> CD4-CD8- murine Tlymphocyte precursors </CELL_LINE>. Here we map the <DNA_FAMILY> cisacting elements </DNA_FAMILY> that mediate <PROTEIN_FAMILY> interleukin </PROTEIN_FAMILY> responsiveness of the <DNA_DOMAIN> mouse IL-2R alpha gene </DNA_FAMILY> using a <CELL_LINE> thymic lymphoma-derived hybridoma (PC60) </CELL_LINE>. The transcriptional response of the <DNA_DOMAIN> IL-2R alpha gene </DNA_DOMAIN> to stimulation by <PROTEIN_MOLECULE> IL-1 </PROTEIN_MOLECULE> + <PROTEIN_MOLECULE> IL-2 </PROTEIN_MOLECULE> is biphasic. <PROTEIN_MOLECULE> IL-1 </PROTEIN_MOLECULE> induces a rapid, protein synthesis-independent appearance of <RNA_MOLECULE> IL-2R alpha mRNA </RNA_MOLECULE> that is blocked by inhibitors of <PROTEIN_MOLECULE> NF-kappa B </PROTEIN_MOLECULE> activation. Annotation of Biological Entities We have shown that <PROTEIN_MOLECULE> interleukin-1 (IL-1) </PROTEIN_MOLECULE > and <PROTEIN_MOLECULE> IL-2 </PROTEIN_MOLECULE> control <DNA_DOMAIN> IL-2 receptor alpha (IL-2R alpha) gene </DNA_DOMAIN> transcription in <CELL_LINE> CD4-CD8- murine Tlymphocyte precursors </CELL_LINE>. Here we map the <DNA_FAMILY> cisacting elements </DNA_FAMILY> that mediate <PROTEIN_FAMILY> interleukin </PROTEIN_FAMILY> responsiveness of the <DNA_DOMAIN> mouse IL-2R alpha gene </DNA_FAMILY> using a <CELL_LINE> thymic lymphoma-derived hybridoma (PC60) </CELL_LINE>. The transcriptional response of the <DNA_DOMAIN> IL-2R alpha gene </DNA_DOMAIN> to stimulation by <PROTEIN_MOLECULE> IL-1 </PROTEIN_MOLECULE> + <PROTEIN_MOLECULE> IL-2 </PROTEIN_MOLECULE> is biphasic. <PROTEIN_MOLECULE> IL-1 </PROTEIN_MOLECULE> induces a rapid, protein synthesis-independent appearance of <RNA_MOLECULE> IL-2R alpha mRNA </RNA_MOLECULE> that is blocked by inhibitors of <PROTEIN_MOLECULE> NF-kappa B </PROTEIN_MOLECULE> activation. GENIA Ontology http://www-tsujii.is.s.u-tokyo.ac.jp/~genia/ corpus/GENIAontology.owl GENIA Scope Notes • Amino Acid Monomer: “An amino acid monomer, e.g. tyrosine, serine, tyr, ser” • DNA: “DNAs include DNA groups, families, molecules, domains, and regions” False Ontological Parent GENIA: Source “Sources are biological locations where substances are found …” • ‘Organism’, ‘Tissue’, … are objects and not primarily sources! • ‘Source’ is a role … Misconception of Type GENIA: Cell Type “A cell type, e.g. TLymphocyte, T-cell, astrocyte, fibroblast” • Are the instances of ‘Cell Type’ classes ? • Otherwise rename the class as ‘Cell’! Non-Conformant Naming Policy GENIA: Amino Acid, GENIA: Protein “Proteins include protein groups, families, molecules, complexes, and substructures.” “An amino acid molecule or the compounds that consist of amino acids.” • Names suggest single molecules • Don’t work against biologists’ intuition! BioTop • Top level ontology for biology • Ontologically founded • Formal semantics • Use in bio text mining OBO Relations Ontology foundational spatial temporal participation OBO Relations Ontology foundational spatial temporal participation BioTop Relations BioTop Relations • partOf • properPartOf • locatedIn • derivesFrom • hasParticipant BioTop Relations • partOf • properPartOf • locatedIn • derivesFrom • hasParticipant • hasFunction (functionOf) BioTop Relations grainOf (hasGrain) componentOf (hasComponent) • partOf • properPartOf • locatedIn • derivesFrom • hasParticipant • hasFunction (functionOf) Compound hasComponent • • hasComponent non-transitive subrelation of hasPart Example: Definition of Protein Oxygen Carbon NH2 Protein • Arginine Compounds are determined by their components Collective hasGrain • • hasGrain non-transitive subrelation of hasPart Example: “Collective of Cell hasGrain only Cell” • Collectives can gain or lose grains without changing their identity Full Definitions • • • Classes defined by necessary and sufficient conditions Rationales - Precise understanding of meaning - Empowering the classifier for automated validation processes Required introduction of new Nucleotide classes, e.g. Phosphate, Ribose Base Necessary conditions of class ‘Nucleotide’ Phosphate Ribose Rearranged Classes GENIA BioTop properPartOf componentOf hasComponent BioTop Results • BioTop.owl: - 143 non-taxonomic relation instances (seven relation types) 128 classes • BioTopGenia.owl: - Imports GENIA Links BioTop classes to GENIA classes (if possible) Ontology Integration BioTop OBO Ontologies Biological Process ↔ Biological ProcessGene Ontology Protein Function ↔ Molecular FunctionGene Ontology Cell Component ↔ Cellular ComponentGene Ontology Cell ↔ CellCell Ontology and CellFMA Atom ↔ AtomsChEBI Organic Compound ↔ Organic Molecular EntitiesChEBI Tissue ↔ TissueFMA DNA, RNA ↔ DNASequence Ontology, RNASequence Ontology Protein ↔ ProteinSequence Ontology Conclusions • • • GENIA Ontology as de facto standard for bio text mining but with major shortcomings BioTop as a new proposal for an upper-level ontology for biology - principled - formalized - enhanced Coupling to OBO ontologies BioTop: www.ifomis.org/biotop Current Projects on Bio Text Mining at Jena University: BOOTStrep: www.bootstrep.eu Bootstrapping ontologies and terminologies StemNet: www.stemnet.de A knowledge management system for stem cell biology