An Introduction to Anatomy Ontologies
Phenotype RCN, Feb 23, 2012
Melissa Haendel

Setting the stage
1. Who are we? What do we need? Why are we here?
2. What is an anatomy ontology?
3. What kinds of anatomy ontologies exist?
4. How are anatomy ontologies used?
5. Anatomical evidence

Who are we?
• Domain experts: anatomists, comparative morphologists, developmental biologists, immunologists, neuroscientists, etc. They want to query for gene expression and phenotypes across species.
• Engineers: our tool builders. They have to build tools that can consume ontologies and give the domain experts the right results.
• Ontologists: biologists-gone-informatics, computer scientists and logicians. They have to be able to interpret and represent domain knowledge computationally.

We want to enable:
• Comparison of structures across different organisms and scales
• Standardization of anatomical vocabulary among and between communities
• Integration of anatomical data across databases
• Query across large amounts of data
• Automatic reasoning to infer related classes
• Error checking
• Annotation consistency

Therefore, we build ontologies that are intelligible to:
• Engineers
• Domain experts
• Ontologists
• Machines

Anatomical information retrieval from text-based resources
OMIM query | # Records
“large bone” | 785
“enlarged bone” | 156
“big bone” | 16
“huge bones” | 4
“massive bones” | 28
“hyperplastic bones” | 12
“hyperplastic bone” | 40
“bone hyperplasia” | 134
“increased bone growth” | 612
Less than ideal.

Why build an anatomy ontology? A simple example
Number of genes annotated to each of the following brain parts in an ontology:
• brain: 20
• hindbrain (part_of brain): 15
• rhombomere (part_of hindbrain): 10
Query brain without ontology: 20
Query brain with ontology: 45 (20 + 15 + 10)
Ontologies can facilitate grouping and retrieval of data

There are many useful ways to classify parts of organisms:
• its parts and their arrangement
• its relation to other structures (what is it: part of, connected to, adjacent to, overlapping?)
• its shape
• its function
• its developmental origins
• its species or clade
• its evolutionary history

Cajal 1915, “Accept the view that nothing in nature is useless, even from the human point of view.”

An ontology is a classification
Diagram labels: appendage; wing; antenna; fore wing; hind wing

Relationships record classifications too
‘leg’ SubClassOf part_of some ‘thoracic segment’
Diagram labels: thoracic segment; wing; leg

Multiple inheritance is very hard to manage by hand
It is difficult to keep track of multiple classification chains to:
• ensure completeness;
• avoid redundancy;
• avoid incorrect inheritance of classification criteria from a distant superclass

The knowledge in an ontology can make the reasons for classification explicit
Any sense organ that functions in the detection of smell is an olfactory sense organ
Diagram labels: sense organ; olfactory sense organ; capable_of some detection of smell

Classifying
Definition: olfactory sense organ is a sense organ that is capable_of some detection of smell
Asserted: nose is a sense organ; nose capable_of some detection of smell
Inferred: nose is an olfactory sense organ

Compositionality and avoiding asserted multiple inheritance
Let the reasoner do the work!
We can logically define composed classes and create complex definitions from simpler ones
aka: building blocks, cross-products, logical definitions
Descriptions can be composed at any time:
• Ontology construction time (pre-composition)
• Annotation time (post-composition)
Formal necessary and sufficient definitions plus a reasoner give automatic (and therefore manageable) classification
Requires subtype classification, so apart from the root term(s), no term should lack an is_a parent.
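
The nose example can be sketched in a few lines of Python. This is a toy illustration only, not how an OWL reasoner is actually implemented: the class names `Restriction`, `OntClass` and `Definition`, the functions `ancestors` and `classify`, and the extra `eye` class (added purely for contrast) are all assumptions made for this sketch. It shows a necessary and sufficient definition (genus plus differentia) placing ‘nose’ under ‘olfactory sense organ’ without anyone asserting that link by hand.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Restriction:
    """An existential restriction such as: capable_of some 'detection of smell'."""
    relation: str
    filler: str


@dataclass
class OntClass:
    name: str
    is_a: set = field(default_factory=set)          # asserted (and later inferred) parents
    restrictions: set = field(default_factory=set)  # asserted relationships


@dataclass(frozen=True)
class Definition:
    """Necessary and sufficient definition: defined = genus AND differentia."""
    defined: str
    genus: str
    differentia: Restriction


def ancestors(name, classes):
    """Return the class itself plus everything reachable over is_a."""
    seen, stack = set(), [name]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(classes[current].is_a)
    return seen


def classify(classes, definitions):
    """Add inferred is_a links until nothing changes; return the new links."""
    inferred, changed = [], True
    while changed:
        changed = False
        for d in definitions:
            for c in classes.values():
                anc = ancestors(c.name, classes)
                if c.name == d.defined or d.defined in anc:
                    continue  # already classified under the defined class
                has_genus = d.genus in anc
                has_differentia = any(d.differentia in classes[a].restrictions for a in anc)
                if has_genus and has_differentia:
                    c.is_a.add(d.defined)
                    inferred.append((c.name, d.defined))
                    changed = True
    return inferred


smell = Restriction("capable_of", "detection of smell")
light = Restriction("capable_of", "detection of light")  # hypothetical, for contrast

classes = {
    "sense organ": OntClass("sense organ"),
    "olfactory sense organ": OntClass("olfactory sense organ", {"sense organ"}),
    "nose": OntClass("nose", {"sense organ"}, {smell}),
    "eye": OntClass("eye", {"sense organ"}, {light}),
}
definitions = [Definition("olfactory sense organ", "sense organ", smell)]

inferred = classify(classes, definitions)
print(inferred)
```

Only the single-parent assertion ‘nose is_a sense organ’ is maintained by hand; the second parent comes from the definition, which is the point of avoiding asserted multiple inheritance.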
Example of a post-composed anatomical entity
Plasma membrane of spermatocyte
• Plasma membrane [GO CC]
• Spermatocyte [Cell Ontology]
Genus: a plasma membrane
Differentia: which is part_of a spermatocyte
Diagram labels: Gene Ontology; Basic Formal Ontology; Cell Ontology

Many perspectives, many ontologies
Diagram labels: behavior; clinical disorders; reactions; proteins; cellular processes; chemical entities; evolutionary characters; nervous system; neural crest cells; cell anatomy; physiological processes; development; tissues; phenotypes; processes; gross anatomy

What kinds of anatomy ontologies exist?
Species-centric and multi-species ontologies:
• Mouse: MA (adult); EMAP / EMAPA (embryonic)
• Human: FMA (adult); EHDAA2 (CS1-CS20)
• Amphibian: AAO; XAO
• Fish: ZFA (zebrafish); MFO (medaka); TAO (teleosts)
• Nematode: WBbt (C. elegans)
• Arthropod: FBbt (Drosophila); TGMA (mosquito); HAO (Hymenoptera); Arthropod anatomy ontology
• Plant ontology
Species-neutral ontologies:
• CARO (common anatomy reference ontology)
• Uberon (cross-species anatomy)
• vHOG (vertebrate homologous organs)
• CL (cell ontology)
• GO (gene ontology)
Phenotype ontologies:
• MP (mammalian phenotype)
• HP (human phenotype)
• WB (worm phenotype)

Species-centric ontologies
The Zebrafish Anatomy Ontology
Used to record gene expression and phenotypes at different stages of development
Ontologies built for one species will not work for others
http://ccm.ucdavis.edu/bcancercd/22/mouse_figure.html
http://fme.biostr.washington.edu:8080/FME/index.html

Multi-species anatomy ontologies
The Plant Ontology
• Seed plants (Angiosperms and Gymnosperms)
• Pteridophytes (Ferns and Lycopods)
• Bryophytes (Mosses, Hornworts and Liverworts)
• Algae
Challenge is in representing diversity in anatomy, morphology, life cycles, growth patterns
Bowman et al, Cell, 2007

Example of complexity arising from multiple species-contexts
Diagram labels: cell; nucleate cell; erythrocyte; enucleate cell; not applicable in all contexts

Example of complexity arising from multiple species-contexts
Diagram labels: cell; enucleate cell; nucleate cell; species ontologies attached
at appropriate level; CL:0000232; CL:0000562 nucleate erythrocyte; ZFA:0009256; … erythrocyte …; zebrafish nucleate erythrocyte; CL:0000592 enucleate erythrocyte; human erythrocyte FMA:81100

Using reasoners to detect errors
Diagram labels: UBERON: bone; disjoint with; Drosophila melanogaster; part_of; only_in_taxon; is_a; is_a; Homo sapiens; UBERON: tibia; is_a; Fruit fly FBbt ‘tibia’; Vertebrata; ✗; is_a; part_of; Human FMA ‘tibia’
Developmental Biology, Scott Gilbert, 6th ed.

The Gene Ontology has an anatomy ontology
Diagram labels: zebrafish; Look ma, no pons!; human

Phenotype ontologies also have inherent anatomy
Diagram labels: WBbt; C. elegans phenotype
Designed primarily for annotation of phenotypes within a single species

Representing different levels of granularity
Diagram labels: GO; lateral line development; neuromast development; ?; hair cell development; ?; neuromast part_of lateral line; hair cell part_of neuromast; cilium development; cilium part_of hair cell part_of neuromast

The problem: Data Silos
Relations: is_a (SubClassOf); part_of; develops_from; surrounded_by
Diagram labels: GO; FMA; multicellular organismal process; EHDAA2; pharyngeal region; organ system; solid organ; respiratory system; parenchymatous organ; Lower respiratory tract; lobular organ; pleural sac; lung; respiratory gaseous exchange; respiratory primordium; respiratory system process; lung bud; lung; MPO; abnormal respiratory system morphology; MA; thoracic cavity; organ system; thoracic cavity organ; respiratory system; abnormal lung morphology; lung; abnormal pulmonary acinus morphology; pulmonary acinus; abnormal pulmonary alveolus morphology; lung alveolus; alveolar sac

How to synchronize anatomy ontologies
Three approaches:
• Mapping
• Direct reconciliation
• Synchronization using imports/MIREOT

There are issues with mappings
Class A | Class B | In Bioportal? | Useful?
FMA extensor retinaculum of wrist | MA retina | Yes | No
FMA portion of blood | MA blood | No | Yes
ZFA Macula | MA macula | Yes | No
ZFA aortic arch | MA arch of aorta | Yes | Dubious
ZFA hypophysis | MA pituitary | No | Yes
FMA tibia | FBbt tibia | Yes | No
FMA colon | GAZ Colón, Panama | Yes | No
PATO male | ChEBI maleate(2-) | Yes | No

Reconciliation and linking between TAO and ZFA
Zebrafish terms are is_a subtypes of teleost terms
Diagram labels: Teleost Anatomy Ontology; Zebrafish Anatomy; is_a
Logic implemented via Xrefs: difficult to keep synchronized

The Common Anatomy Reference Ontology
CARO is a structural classification based on granularity
From the bottom up:
• Cell component
• Cell
• Portion of tissue
• Multi-tissue structure
From the top down:
• Organism subdivision
• Anatomical system
• Acellular structures
Note: CARO is being updated to be more interoperable and to include logical definitions and functional differentia

Synchronization by import across ontologies
Diagram labels: CARO; VAO; Present; TAO; Modularized ontology
One can import a whole ontology or just portions of another ontology
MIREOT: Minimum information to reference an external ontology term

Uberon: a multi-species ontology for phenomics and evo-devo analyses
Uberon.org
Uberon classes generalize species-specific ones, and connect to other ontologies via a variety of relations
Relations: is_a (SubClassOf); part_of; develops_from; capable_of; is_a (taxon equivalent); only_in_taxon
Diagram labels: anatomical structure; endoderm; organ part; foregut; swim bladder; organ; NCBITaxon: Actinopterygii; respiration organ; endoderm of foregut; respiratory primordium; GO: respiratory gaseous exchange; pulmonary acinus; alveolus; NCBITaxon: Mammalia; lung; alveolus of lung; MA:lung alveolus; FMA: pulmonary alveolus; alveolar sac; lung primordium; lung bud; FMA:lung; MA:lung; EHDAA: lung bud

OntoFox: a Web Server for MIREOTing
Good things:
• Based on MIREOT principle
• Web-based data input and output
• Output OWL file can be directly imported in your ontology
• No programming needed
• Programmatically accessible
Improvements:
• Integration into ontology editing
tools
• More customizable
http://ontofox.hegroup.org

Proposed model moving forward
• Maintain series of ontologies at different taxonomic levels: euk, plant, metazoan, vertebrate, mollusc, arthropod, insect, mammal, human, drosophila
• Each ontology imports/MIREOTs relevant subset of ontology “above” it (this is recursive)
• Subtypes are only introduced as needed
• Work together on commonalities at appropriate level above your ontology

Leveraging an integrated set of ontologies
Diagram labels: cross-ontology link (sample); cell; import; nervous system; gut; circulatory system; gland; mollusca; mantle; shell; foot; cephalopod; tentacle; brachial lobe; gonad; arthropoda; mushroom body; mesoderm; caro / uberon / all metazoa; drosophila; neuron types XYZ; skeleton; appendage; respiratory airway; muscle tissue; larva; skeletal tissue; vertebrata; trachea; limb; vertebra; bone; fin; tibia; vertebral column; cuticle; antenna; tissue; parietal bone; mesonephros; teleost; amphibia; weberian ossicle; mammalia; mammary gland; tibiafibula; mouse; zebrafish; NO pons; human

Not all classification is useful
About thirty years ago there was much talk that geologists ought only to observe and not theorise; and I well remember some one saying that at this rate a man might as well go into a gravel-pit and count the pebbles and describe the colours.
C. Darwin
Be practical: Build ontologies for what you need and for what can be reused

Ontologies can help reconcile annotation inconsistencies
Semantic Similarity of Phenotypes
Diagram labels: FMA+PATO; MP; ZFA+PATO; FBbt+PATO
"Linking Human Diseases to Animal Models Using Ontology-Based Phenotype Annotation." PLoS Biol 7(11): e1000247.
doi:10.1371/journal.pbio.1000247
Washington NL, Haendel MA, Mungall CJ, Ashburner M, Westerfield M, Lewis SE

Querying for genes in similar structures across species
Figure panels A-E: ascidian ampulla; sea urchin tube feet; mouse limb; polychaete parapodia
Tree labels: Vertebrata: tetrapod limbs; Ascidians: ampullae; Echinodermata: tube feet; Arthropoda; Annelida: parapodia; Mollusca
Distal-less orthologs participate in distal-proximal pattern formation and appendage morphogenesis
Panganiban et al., PNAS, 1997

Anatomy ontologies in 2012
• Identify key points of integration between ontologies
• Modularize based on domain or taxon
• Import and reuse rather than cross-referencing or “aligning”
• Let the reasoner help do the work
• Work together to distribute work
Reproduced with permission, Jason Freeny http://web.mac.com/moistproduction/flash/index.html

Anatomical evidence: what is it, and why do we care about it?

What is evidence?
Diagram labels: ECO:000000X Imaging assay evidence; Synaptolaemus cingulatus AMNH 91095 (OBI:Specimen); material_processing; Drawing about anatomical entity (OBI:Image); Phenotype (character) annotation: S. cingulatus: mesethmoid narrow (OBI:Conclusion (textual entity)); Draw prepared specimen (OBI:imaging assay); cleared and stained for cartilage and bone (OBI:processed specimen); Brian, 2008, maybe in Venezuela; OBI: Interpreting Data; phenotypic assessment
Sidlauskas and Vari, Zoological Journal of the Linnean Society, 2008, 154, 70–210

Anatomical evidence is cumulative and synergistic
Diagram labels: Synaptolaemus cingulatus AMNH 91095; ECO:0000080 phylogenetic evidence; mesethmoid narrow; ECO:0000071 morphological similarity evidence; Caenotropus maculosus USNM 231545: mesethmoid narrow; Schizodon fasciatus INPA 21606:
mesethmoid wide; Brian, 2008; Phylogeny construction using PAUP* 4.0 Beta 10 (OBI: Interpreting Data); phylogeny (OBI:Conclusion)

The means to the end matters
Diagram labels: Synaptolaemus cingulatus AMNH 91095; ECO:0000080 phylogenetic evidence; Mesethmoid; ECO:0000071 sequence similarity evidence; Caenotropus maculosus USNM 231545: mesethmoid narrow; Schizodon fasciatus INPA 21606: mesethmoid wide; Brian, 2008; Phylogeny construction using PAUP* 4.0 Beta 10 (OBI: Interpreting Data); phylogeny (OBI:Conclusion)

So what should one do about evidence?
• Keep in mind that as you record your phenotype data, the means by which you obtained it can matter later on
• Others may want to use your data, and they too will care
• You may find that how you know what you know depends on the means to the end
• You can work with ECO and OBI to get the terms you need for your work

Acknowledgments
Jonathan Bard, Marcus Chibucos, Wasila Dahdul, Paula Mabee, Chris Mungall, David Osumi-Sutherland, Alan Ruttenberg, Erik Segerdell, Carlo Torniai, Matt Yoder, Jie Zheng
AND numerous others

Larson, October 1987