A biological ontology is:
  A machine-interpretable representation of some aspect of biological reality
  What kinds of things exist?
  What are the relationships between these things?
  Example (from the diagram): eye is_a sense organ; ommatidium part_of eye; eye develops_from eye disc

Following basic rules helps make better ontologies
  Ontologies must be intelligible both to humans (for annotation) and to machines (for reasoning and error-checking)
  Unintuitive rules for classification lead to entry errors (problematic links)
  Facilitate training of curators
  Overcome obstacles to alignment with other ontology and terminology systems
  Enhance harvesting of content through automatic reasoning systems

Animal disease models
  Humans:        Mutant gene -> Mutant or missing protein -> Mutant phenotype (disease)
  Animal models: Mutant gene -> Mutant or missing protein -> Mutant phenotype (disease model)
  Genotypes shown: SHH-/+, SHH-/- (human); shh-/+, shh-/- (animal model)

Phenotype (clinical sign) = entity + attribute
  P1 = eye + hypoteloric
  P2 = midface + hypoplastic
  P3 = kidney + hypertrophied
  Entities (ZFIN): eye, midface, kidney
  Attributes (PATO): hypoteloric, hypoplastic, hypertrophied

Phenotype (clinical sign) = entity + attribute
  Anatomical ontology
  Cell & tissue ontology
  Developmental ontology
  Gene ontology: biological process,
  molecular function, cellular component
  + PATO (phenotype and trait ontology)

Syndrome (disease) = P1 + P2 + P3 = holoprosencephaly
  where P1 = eye + hypoteloric, P2 = midface + hypoplastic, P3 = kidney + hypertrophied
  Human holoprosencephaly
  Zebrafish shh
  Zebrafish oep

EA model
  entity          attribute              attribute
  fin             shape                  irregular shape
  eye             color hue              blue
  mesenchyme      relative thickness     thin
  brain           structure              fused
  retinal cells   relative orientation   disoriented

Proposed schema
  Association = Genotype Phenotype Environment Assay
  Phenotype = Stage* Entity Attribute Value
  Entity = OBOClassID
  Attribute = PATOVersion2ClassID

Monadic and relational attributes
  Monadic: the quality/attribute inheres in a single entity
  Relational: the quality/attribute inheres in two or more entities
    sensitivity of an organism to a kind of drug
    sensitivity of an eye to a wavelength of light
  Relational attributes can be turned into cross-product monadic attributes, e.g. sensitivityToRedLight
  It is better to use relational attributes: this avoids redundancy with existing ontologies

Incorporating relational attributes
  Association = Genotype Phenotype Environment Assay
  Phenotype = Stage* Entity Attribute Entity*
  Entity = OBOClassID
  Attribute = PATOVersion2ClassID
  Example data record: Phenotype = “organism” sensitiveTo “puromycin”

Measurable attributes
  Some attributes are inexact and implicitly relative to a wild-type or normal attribute
    relatively short, relatively long, relatively reduced
    easier than explicitly representing: this tail length shorter-than ‘canonical mouse’ wild-type tail length
  Some attributes are determinable
    use a measure function: unit, value, {time}
    this tail has length L
    measure(L, cm) = 2
  Keep measurements separate from (but linked to) the attribute ontology

Incorporating measurements
  Association = Genotype Phenotype Environment Assay
  Phenotype = Stage* Entity Attribute Entity* Measurement*
  Measurement = Unit Value (Time)
  Entity = OBOClassID
  Attribute = PATOVersion2ClassID
  Example data record: Phenotype = “gut” “acidic”; Measurement = “pH” 5

Composite phenotype classes
  The Mammalian Phenotype ontology has composite phenotype classes, e.g. “reduced B cell number”
  Compose at annotation time or at ontology curation time? A false dichotomy
  Core 2 will help map between composite-class-based annotation and EA annotation

Interpreting annotations
  Annotations are data records
    they typically use class IDs
    they implicitly refer to instances
  How do we map an annotation to instances?
    Important for using annotations computationally

Interpreting annotations (I)
  What does an EA (or EAV) annotation mean?
  Annotation: Genotype=“FBal00123” E=“brain” A=“fused”
  Presumed implied meaning:
    this organism has_part x, where x instance_of “brain”
    x has_quality “fused”
  Or in natural language: “this organism has a fused brain”
  Various built-in assumptions

Interpreting annotations (II)
  What does this mean?
  Annotation: Genotype=“FBal00123” E=“wing” A=“absent”
  Using the same mapping as annotation I:
    this organism has_part x, where x instance_of “wing”
    x has_quality “absent”
  Or in natural language: this fly has a wing which is not there!
  What we really intend:
    NOT(this organism has_part x, where x instance_of “wing”)
  Alternatively:
    this organism has_quality “wingless”
    “wingless” = the property of having count(has_part “wing”) = 0

Are our computational representations intended to capture linguistic statements or reality?
  Does this matter?
    Logical reasoners will compute incorrect results unless explicitly provided with specific rules for certain attributes such as “absent”
  What are the consequences?
    Basic search will be fine, e.g. “find all wing phenotypes”
    But computers will not be able to reason correctly

Interpreting annotations (III)
  What does this mean?
  Annotation: E=“digit” A=“supernumerary”
  Using the same interpretation as annotation I:
    this organism has_part x, where x instance_of “digit”
    x has_quality “supernumerary”
  Or in natural language: this organism has a particular finger which is supernumerary
  What we really intend:
    this person has_quality “supernumerary finger”
    “supernumerary finger” = the property of having count(has_part “digit”) > wild-type

Interpreting annotations (IV)
  What does this mean?
  Annotation: Gt=“mp001” E=“brown fat cell” A=“increased quantity”
  Using the same mapping as annotation I:
    this organism has_part x, where x instance_of “brown fat cell”
    x has_quality “increased quantity”
  Or in natural language: this organism has a particular brown fat cell which is increased in quantity
  What we really intend:
    this organism has_part population_of(“brown fat cell”) which has_quality increased size

Other use cases
  spermatocyte devoid of asters
  Homeotic transformations
  increased distance between wing veins
  Some vs all

Alternate perspectives
  process vs state
    regulatory processes: acidification of midgut has_quality reduced rate
    midgut has_quality low acidity
  development vs behavior
    wing development has_quality abnormal
    flight has_quality intermittent
  granularity (scale)
    chemical vs molecular vs cell vs tissue vs anatomical part

Summary
  Define attributes in terms of instances
  Evaluate proposed new schema
    measurement proposal
    relational attribute proposal
  Complexity trade-off
    create library of use cases
    Core 2 will create tools to present a user-friendly layer
  Alternate perspective annotations are useful

Before: domain knowledge is embedded in the db schema
  Gene table
  Exon table
  RNA table
  Protein table
After: domain knowledge is embedded in the ontology
  feature table

An ontology-driven db schema is less expensive to maintain
  The logical description and the physical database description of the biology are developed independently
  Therefore new biological knowledge will only require:
    Ontology changes: e.g. new terms
    GUI changes: display
    No schema changes
    No query changes
    No middleware changes

Step 1: Build an ontology that reflects reality
Step 2: Data capture
  Database: UIDs serving as proxies for instances
Step 3: Classify data using the ontology

Ontologies must adapt over time
Getting it right
  It is impossible to get it right the 1st (or 2nd, or 3rd, …) time.
  What we know about biology is continually growing
  This “standard” requires versioning.
  Improve, Collaborate and Learn

Image Ontologies: Matthew Fielding
From RadLex to RadiO
  A unified language for radiology information sources (e.g. teaching files, research data, and radiology reports).
  Will describe all the salient aspects of an imaging examination (e.g., modality, technique, visual features, anatomy, and pathology).
  Will emphasize adoption of or linkage to established terminology and standards when possible, such as the ACR Index, SNOMED, the Unified Medical Language System (UMLS), the Fleischner Society Glossaries, and DICOM.
  Will be used to organize and retrieve radiology images.

Image Ontologies: C. Forbes Dewey
Experibase
  A common technology that will capture data from all of the major experimental systems generating biological data.
  Being implemented for gel electrophoresis, microarrays, fluorescence-activated cell sorting, mass spectrometry and optical microscopy.
  Coordinating with the Interoperable Informatics Infrastructure Consortium (I3C)
  Will be used to organize and interrogate these experimental data

Image Ontologies: Bill Lorensen

Image Ontologies: William Bug
Image Ontology Requirements
  Linking databases created at multiple centers concerned with human disease and associated animal models.
  BIRN Ontology Task Force (OTF) reviews different ontological reference interpretations by its audience: anatomists, clinicians, genomicists, pathologists, diagnosticians, and neurologists
  Using existing ontologies, tools, and formalisms wherever possible, and extending them only as necessary. Any ontology work performed by BIRN should be aligned with other efforts and provided back to the maintainers
  Developing a set of ontologies that are approved for use and a set of policies and procedures for extensions

Image Ontologies: Louis Goldberg
On Reasoning with Images
  What different approaches are available for spatial, temporal, and spatio-temporal representation and reasoning formalisms used in computer applications?
  What is the expressive power of those formalisms?
  Formalizations for commonsense reasoning about space and time.
  Formalisms for the representation of vagueness
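
Sketch: the proposed phenotype schema and annotation interpretation
The slides on the proposed schema (Association = Genotype Phenotype Environment Assay; Phenotype = Stage* Entity Attribute Entity* Measurement*) and on interpreting annotations can be illustrated with a small Python sketch. This is an illustration under stated assumptions, not the project's implementation: the class and field names, the COUNT_RULES table and the interpret function are all hypothetical, and plain labels such as "wing" stand in for real OBO and PATO class IDs. It shows why attributes such as "absent" need their own rule instead of the default "has_part x, x has_quality A" reading.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Measurement:
    """Measurement = Unit Value (Time)."""
    unit: str
    value: float
    time: Optional[str] = None


@dataclass(frozen=True)
class Phenotype:
    """Phenotype = Stage* Entity Attribute Entity* Measurement*."""
    entity: str                                  # stands in for an OBO class ID
    attribute: str                               # stands in for a PATO class ID
    stages: Tuple[str, ...] = ()
    related_entities: Tuple[str, ...] = ()       # used by relational attributes
    measurements: Tuple[Measurement, ...] = ()


@dataclass(frozen=True)
class Association:
    """Association = Genotype Phenotype Environment Assay."""
    genotype: str
    phenotype: Phenotype
    environment: Optional[str] = None
    assay: Optional[str] = None


# Hypothetical rule table: attributes that describe how many parts exist,
# so they must not be read as a quality of one existing part.
COUNT_RULES = {
    "absent": "count(has_part {entity}) = 0",
    "supernumerary": "count(has_part {entity}) > wild-type",
}


def interpret(phenotype: Phenotype) -> str:
    """Map an EA annotation to an instance-level statement (assumed wording)."""
    entity = f'"{phenotype.entity}"'
    rule = COUNT_RULES.get(phenotype.attribute)
    if rule is not None:
        return "this organism has " + rule.format(entity=entity)
    # Default reading, as in "Interpreting annotations (I)".
    return (
        f"this organism has_part x, where x instance_of {entity}; "
        f'x has_quality "{phenotype.attribute}"'
    )


fused_brain = Association("FBal00123", Phenotype("brain", "fused"))
absent_wing = Association("FBal00123", Phenotype("wing", "absent"))
acidic_gut = Phenotype("gut", "acidic", measurements=(Measurement("pH", 5),))

print(interpret(fused_brain.phenotype))
print(interpret(absent_wing.phenotype))
```

Keeping the count-style attributes in a separate table mirrors the point made in the slides: basic search over entities and attributes works either way, but a reasoner only gets the intended meaning if such attributes are given explicit rules.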