Download Introduction-to-OBO - Buffalo Ontology Site
OBO Foundry Principles BFO RO
Barry Smith

OBO Foundry Principles
• open
• common formal language (OBO Format, OWL DL, CL)
• commitment to collaboration
• maintenance in light of scientific advance
• unique identifier space (Alan)
• naming conventions (Susanna / EBI) – metadata for changes
• versioning

OBO Foundry Principles
• common architecture (= RO + BFO)
• clearly delineated content (redundant – overlaps with orthogonality)
• the ontology is well-documented (overlaps with rules for definitions; needs expanding, for developers, for users, minimal metadata)
• plurality of independent users
• single locus of authority, trackers, help desk

OBO Foundry Principles
• textual definitions plus formal definitions
• all definitions should be of the genus-species form, utilizing cross-products
• therefore: single is_a inheritance (= each ontology should be conceived as consisting of a core of asserted single inheritance with further is_a relations inferred)

Orthogonality
• For each domain, there should be convergence upon a single ontology that is recommended for use by those who wish to become involved with the Foundry initiative
• Compare what happens in other parts of science: for each domain, there should be convergence upon a single theory
Preventing silos on the side of annotated data = preventing forking of the ontologies used for annotation

Strategy to ensure orthogonality
• If the Foundry already has an ontology O1 covering a domain D, and an outside group creates a second ontology O2 covering D (or part of D), we need to ask:
– is it in every respect better? (then replace O1 with O2)
– is it in some respects better?
(then negotiate an improved synthesis, O3)
ASSUMPTION: ontologies are always comparable
PROBLEM: need better measures of ontology quality

Benefits of orthogonality
• Offers a solution to the problem of silos that is
– modular
– incremental
– empirically based
– incorporates a strategy for motivating potential developers and users

Orthogonality = non-redundancy for the reference ontologies inside the Foundry
• CARO-Mammal will not be orthogonal to CARO
• IDO-Malaria will not be orthogonal to IDO
• IDO will not be orthogonal to DO
• DO will be orthogonal to CL

Absolute redundancy for application ontologies = all terms in application ontologies should be taken from orthogonal reference ontologies within the Foundry

Benefits of orthogonality
• Modularity brings benefits of division of labor, division of authority, minimizes redundancy

Benefits of orthogonality
• Scientists become motivated to commit themselves to developing an ontology falling within their domain of expertise because they themselves will need to use this ontology in their own work in the future.
• Forking would erode this motivation

Benefits of orthogonality
• Incrementality means that the strategy will still work even if ontologies are still only partial
• this allows adoption and application at early stages

Benefits of orthogonality
• Empirically based means that we can always go back and start again if some ontology module does not work (compare the problem of non-modular approaches like SNOMED CT, where it is all or nothing)

Benefits of orthogonality
• Modularity brings ownership, and motivates scientist-developers to commit themselves long term to developing the ontology
• This in turn motivates users to commit themselves to adoption
– they see strong positive network effects from use of the ontology
– they gain reassurance from long-term commitment

Benefits of orthogonality
• It helps those new to ontology who need to know where to look in finding an ontology relating to their subject-matter
• it obviates the need for ‘mappings’ between ontologies, which are
– difficult to create and use
– error-prone
– hard to keep up-to-date when mapped ontologies change

Benefits of orthogonality
• modularity (orthogonality) ensures the mutual consistency of ontologies, and thereby also the additivity of the annotations created with their aid by different groups of annotators describing common bodies of data.
• thereby contributes to the cumulativity of science and allows new forms of unmanaged collaboration.
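
As a rough illustration of the additivity claim above: when two groups annotate a common body of data with identifiers from the same ontology, their annotations can be combined by plain set union, with no mapping step in between. The sketch below is hypothetical throughout. The sample names, the `EX:` identifiers, and the `merge_annotations` helper are assumptions made for illustration and do not come from the slides.

```python
from collections import defaultdict


def merge_annotations(*annotation_sets):
    """Combine annotations from several groups by set union per data item."""
    merged = defaultdict(set)
    for annotations in annotation_sets:
        for item, terms in annotations.items():
            merged[item] |= set(terms)
    return dict(merged)


# Hypothetical annotations made independently by two groups.
# Both groups use identifiers from the same (made-up) ontology "EX".
group_a = {"sample_1": {"EX:0001", "EX:0002"}, "sample_2": {"EX:0003"}}
group_b = {"sample_1": {"EX:0002", "EX:0004"}, "sample_3": {"EX:0001"}}

combined = merge_annotations(group_a, group_b)
print(sorted(combined["sample_1"]))  # ['EX:0001', 'EX:0002', 'EX:0004']
```

If the two groups had used different, overlapping ontologies, the union would only be meaningful after a mapping between their identifiers had been built and maintained, which is the cost the slides argue orthogonality avoids.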
Benefits of orthogonality
• brings grave responsibilities to those in charge of ensuring for each domain that the Foundry includes an ontology for that domain
• they must commit to perpetual striving for scientific accuracy and domain-completeness in their work
• orthogonality rules out the sorts of simplification and partiality which may be acceptable under more pluralistic regimes

Benefits of orthogonality
• it supports the strategy of utilizing cross-products in composing terms and definitions
• this strategy will work only if we can
– minimize the degree of arbitrariness involved in selecting the terms to be composed
– and thereby maximize the degree to which the Foundry ontologies are networked together through the cross-product links

Misunderstandings of Orthogonality
• Orthogonality does not mean that all ontologies must be developed within the Foundry framework
• We welcome the development of competing approaches to open-access ontology development – which can only make the Foundry stronger

Problems with Orthogonality
• what if researchers need purpose-built ontologies to meet their own specific needs?
• OBO Foundry provides orthogonal reference ontologies, so that they can as far as possible build their application ontologies using terms composed as cross-products
• thereby avoiding silos
• and contributing new terms back to the Foundry in case of need

Problems with Orthogonality
• For each domain, there should be convergence upon a single ontology that is recommended for use by those who wish to become involved with the Foundry initiative
Q: WHAT DOES ORTHOGONALITY MEAN?
minimally: two ontologies are not orthogonal if they share a single term with the same meaning
Q: WHAT DOES DOMAIN MEAN?
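
The minimal criterion for non-orthogonality stated above can be sketched in code. This is a simplification under stated assumptions: each ontology is modelled as a plain set of term labels, and "same meaning" is approximated by equality of a normalized label, which real ontologies would not settle so easily. The function names and the toy term lists are hypothetical and are not taken from the actual ontologies.

```python
def normalize(label):
    """Crude stand-in for 'same meaning': ignore case and extra whitespace."""
    return " ".join(label.lower().split())


def shared_terms(ontology_a, ontology_b):
    """Return the normalized labels that occur in both ontologies."""
    return {normalize(t) for t in ontology_a} & {normalize(t) for t in ontology_b}


def are_orthogonal(ontology_a, ontology_b):
    """Minimal criterion: a single shared term is enough to break orthogonality."""
    return not shared_terms(ontology_a, ontology_b)


# Toy term lists, for illustration only.
cell_terms = {"dendritic cell", "B cell"}
disease_terms = {"malaria", "tuberculosis"}
infectious_terms = {"Malaria", "pathogen", "host"}

print(are_orthogonal(cell_terms, disease_terms))        # True
print(are_orthogonal(disease_terms, infectious_terms))  # False
print(shared_terms(disease_terms, infectious_terms))    # {'malaria'}
```

The second question on the slide, what counts as a domain, is not addressed by this sketch; it only tests the term-sharing condition.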
[Figure: grid of the initial OBO Foundry reference ontologies. Columns, by RELATION TO TIME: CONTINUANT (INDEPENDENT, DEPENDENT) and OCCURRENT. Rows, by GRANULARITY: ORGAN AND ORGANISM; CELL AND CELLULAR COMPONENT; MOLECULE. Cells: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Biological Process (GO); Cell (CL); Cellular Component (FMA, GO); Cellular Function (GO); Molecule (ChEBI, SO, RnaO, PrO); Molecular Function (GO); Molecular Process (GO).]
Initial OBO Foundry Reference Ontologies (jigsaw)

Homesteading
Recommendation: Ontology developers should register their claim on territory not yet occupied, as soon as possible, because the Foundry is designed to serve as an attractor for collaboration

[Figure: reference-ontology grid]
Orthogonality = Westphalian principles of national sovereignty for reference ontologies
no shared territory

Varieties of application ontology
• cross-border national parks
• Slims
• Fractal ontologies
• Cross-product ontologies
– Template ontologies (CARO, IDO, GDO …)

[Figure: reference-ontology grid]
cross-border national parks: an ontology for studying the effects of viral infection on cell function in shrimp

[Figure: reference-ontology grid]
Slims = an ontology of dendritic cells

[Figure: reference-ontology grid]
Slims = an ontology of dendritic cells, with definitions composed using terms from other ontologies

[Figure: reference-ontology grid]
fractal ontologies, employing small portions of many ontologies (e.g. MSO Multiple Sclerosis Ontology)

[Figure: reference-ontology grid, with the OCCURRENT column divided into Organism-Level Process (GO), Cellular Process (GO), and Molecular Process (GO)]
rationale of OBO Foundry coverage + BFO

[Figure: reference-ontology grid]
types plus instances

Fundamental Dichotomy
Continuants (aka endurants)
– have continuous existence in time
– preserve their identity through change
– exist in toto whenever they exist at all
Occurrents (aka processes)
– have temporal parts
– unfold themselves in successive phases
– exist only in their phases

Functions are continuants
Functionings are occurrents

[Figures: a sequence of variants of the reference-ontology grid. The variants add Disease (DO) and Biomedical Investigations (OBI); mark Organ Function and part of the Organism cell (NCBI Taxonomy / placeholder) as placeholders; flag open questions with "Cellular Pathology ????", "Cellular Function ???? (GO???)" and "Molecular Process (GO) ?????"; divide the MOLECULE row into Small Molecule (ChEBI), 1-D Sequence (SO), and 2- and 3-D Structure (RNAO) (PRO); and add Molecular Pathway, Reactome, and "Phenotypic Quality of Molecule ????".]
Orthogonality can be preserved by expanding the territory (land reclamation)

[Figure: reference-ontology grid]
GO already started to deal with biological processes involving multiple organisms

[Figure: reference-ontology grid with an added entry above ORGAN AND ORGANISM: Family, Community, Deme, Population]
http://obofoundry.org

[Figure: reference-ontology grid with an added row COMPLEX OF ORGANISMS: Family, Community, Deme, Population; Population Phenotype; Population Process]

[Figure: the same grid with an ENVIRONMENT label added alongside the COMPLEX OF ORGANISMS row]

[Figure: independent continuants by granularity (Family, Community, Deme, Population; Organism; Cell, Cell Component; Molecule), with an ENVIRONMENT column:]
• Environment of population
• Environment of single organism*
• Environment of cell
• Molecular environment
* The sum total of the conditions and elements that make up the surroundings and influence the development and actions of an individual.

[Figure: the ENVIRONMENT column filled in with examples at each level of granularity:]
– biome / biotope, territory, habitat, neighborhood, ...
– work environment, home environment; host/symbiont environment; ...
– extracellular matrix; chemokine gradient; ...
– hydrophobic surface; virus localized to cellular substructure; active site on protein; pharmacophore ...
[Figure: reference-ontology grid crossed (X) with Organism Taxonomy]
Template ontologies (CARO, IDO, CL?)

The case of IDO
Human Disease Ontology
– unitary hierarchy with root node: human disease
– refers only to dependent realizable continuants
Infectious Disease Ontology
– draws terms from all BFO categories
– template exists in many copies: specializing to different hosts, pathogens, vectors, etc.

We have data
– TBDB: Tuberculosis Database, including Microarray data
– VFDB: Virulence Factor DB
– TropNetEurop Dengue Case Data
– ISD: Influenza Sequence Database at LANL
– PathPort: Pathogen Portal Project
– ...

We need common controlled vocabularies to describe these data in ways that will assure comparability and cumulation
What content is needed to adequately cover the infectious domain?
– Host-related terms (e.g. carrier, susceptibility)
– Pathogen-related terms (e.g. virulence)
– Vector-related terms (e.g. reservoir, ...)
– Terms for the biology of disease pathogenesis (e.g. evasion of host defense)
– Population-level terms (e.g. epidemic, endemic, pandemic)

We need to annotate this data to allow retrieval and integration of
– sequence and protein data for pathogens
– case report data for patients
– clinical trial data for drugs, vaccines
– epidemiological data for surveillance, prevention
– ...
Goal: to make data deriving from different sources comparable and computable

IDO needs to work with
– Disease Ontology (DO) + SNOMED CT
– Gene Ontology Immunology Branch
– Phenotypic Quality Ontology (PATO)
– Protein Ontology (PRO)
– Sequence Ontology (SO)
– ...

IDO provides a common template
IDO works like CARO.
It contains terms (like ‘pathogen’, ‘vector’, ‘host’) which apply to organisms of all species involved in infectious disease and its transmission
Disease- and organism-specific ontologies are then built as specifications of the IDO core

Proposed additions to list of OBO Foundry Principles
• INSTANTIABILITY: Terms in an ontology should correspond to instances in reality
Even disposition terms correspond to instances in reality
There are no absent nipples
There are no cancelled studies

Proposed additions to list of OBO Foundry Principles
INSTANTIABILITY: Terms in an ontology should represent types all of which have instances in reality
types = what are described in textbooks
instances = (roughly) what are described in data

Proposed additions to list of OBO Foundry Principles
Ontologies consist of representations of types in reality – therefore, their terms should consist entirely of singular nouns (preferred terms blah blah)
Ontologies should use singular nouns and noun phrases belonging to ordinary English as extended by technical terms already established in the relevant discipline – they should not use phrases like ‘EV-EXP-IGI’, no lab slang, no ellipses

Proposed additions to list of OBO Foundry Principles
EVALUATION
• each ontology should be subject to evaluation (as far as possible quantitative):
• software (conversion OBO format OWL)
• specialist review (OWL natural language)
• when one version is used for a given purpose, later versions should be applied to the same purpose and results compared

Proposed additions to list of OBO Foundry Principles
each ontology should be built on the basis of BFO top-level distinctions (common top level):
• continuants vs. occurrents
• independent continuants (molecules, cells, organisms …)
• specifically dependent continuants (qualities, functions, roles …)
• generically dependent continuants (information artifacts, sequences …)
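
The top-level distinctions listed above, together with the single is_a inheritance principle from the start of the talk, can be illustrated with a small sketch. It is a toy model, not BFO itself: asserted is_a links are stored as (child, parent) pairs, the labels follow the slide wording, and the `entity` root, the example leaf terms, and the helper names are assumptions added for illustration.

```python
# Asserted is_a edges as (child, parent) pairs.
# The "entity" root and the leaf examples are assumptions for this sketch.
ASSERTED_IS_A = [
    ("continuant", "entity"),
    ("occurrent", "entity"),
    ("independent continuant", "continuant"),
    ("specifically dependent continuant", "continuant"),
    ("generically dependent continuant", "continuant"),
    ("cell", "independent continuant"),
    ("quality", "specifically dependent continuant"),
    ("function", "specifically dependent continuant"),
    ("sequence", "generically dependent continuant"),
]


def multiple_parents(edges):
    """Return terms that violate single inheritance, with their asserted parents."""
    parents = {}
    for child, parent in edges:
        parents.setdefault(child, set()).add(parent)
    return {child: ps for child, ps in parents.items() if len(ps) > 1}


def ancestors(term, edges):
    """Walk the asserted is_a chain upward; assumes single inheritance and no cycles."""
    parent_of = {child: parent for child, parent in edges}
    chain = []
    while term in parent_of:
        term = parent_of[term]
        chain.append(term)
    return chain


print(multiple_parents(ASSERTED_IS_A))
print(ancestors("function", ASSERTED_IS_A))

# Adding a second asserted parent for "cell" is reported as a violation.
broken = ASSERTED_IS_A + [("cell", "occurrent")]
print(multiple_parents(broken))
```

Under the principle as stated in the slides, further is_a relations beyond this asserted core would be inferred rather than asserted, so a check like `multiple_parents` would apply only to the asserted edges.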