What is an ontology and why should you care?
Barry Smith
http://ontology.buffalo.edu/smith

What I do
• Gene Ontology (NHGRI) (Scientific Advisor)
• National Center for Biomedical Ontology (NHGRI)
• Protein Ontology (NIGMS)
• Infectious Disease Ontology (NIAID)
• Biometrics Ontology (US Army)
• Ontology for Integration of Cross-Border Emergency Data (European Union)

Uses of ‘ontology’ in PubMed abstracts

By far the most successful: GO (Gene Ontology)

You’re interested in which genes control heart muscle development
17,536 results

Microarray data shows changed expression of thousands of genes.
How will you spot the patterns?
[Figure: microarray heat map over time, attacked vs. control. Cluster labels: Defense response, Immune response, Response to stimulus, Toll regulated genes, JAK-STAT regulated genes, Puparial adhesion, Molting cycle, Hemocyanin, Amino acid catabolism, Lipid metabolism, Peptidase activity, Protein catabolism]

You’re interested in which of your hospital’s patient data is relevant to understanding how genes control heart muscle development

• Lab / pathology data
• EHR data
• Clinical trial data
• Family history data
• Medical imaging
• Microarray data
• Model organism data
• Flow cytometry
• Mass spec
• Genotype / SNP data
How will you spot the patterns?
How will you find the data you need?

How does the Gene Ontology work?
with thanks to Jane Lomax, Gene Ontology Consortium

1. GO provides a controlled system of representations for use in annotating data
• multi-species, multi-disciplinary, open source
• contributing to the cumulativity of scientific results achieved by distinct research communities
• compare use of kilograms, meters, seconds … in formulating experimental results

Definitions

Gene products involved in cardiac muscle development in humans
http://wiki.geneontology.org/index.php/Priority_Cardiovascular_genes

Questions for annotation
Where is a particular gene product involved
• in what type of cell or cell part?
• in what part of the normal body?
• in what anatomical abnormality?
When is a particular gene product involved
• in the course of normal development?
• in the process leading to abnormality?
With what functions is the gene product associated in other biological processes?

2. GO provides a tool for algorithmic reasoning

Hierarchical view representing relations between represented types

3. GO allows a new kind of clinical research, based on analysis of the massive quantities of annotations linking GO terms to gene products

Uses of GO in studies of
• pathways associated with heart failure development correlated with cardiac remodeling (PMID 18780759)
• molecular signature of cardiomyocyte clusters derived from human embryonic stem cells (PMID 18436862)
• contrast between cardiac left ventricle and diaphragm muscle in expression of genes involved in carbohydrate and lipid metabolism (PMID 18207466)
• immune system involvement in abdominal aortic aneurysms in humans (PMID 17634102)

GO is amazingly successful, but it covers only generic biological entities of three sorts:
• cellular components
• molecular functions
• biological processes
and it does not provide representations of disease-related phenomena

Extending the GO methodology to other domains of biology and of clinical and translational medicine

The Open Biomedical Ontologies (OBO) Foundry
Ontologies arranged by relation to time (continuant, independent or dependent; occurrent) and by granularity (organ and organism; cell and cellular component; molecule):
• Independent continuants: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO); Cell (CL); Cellular Component (FMA, GO); Molecule (ChEBI, SO, RnaO, PrO)
• Dependent continuants: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO); Cellular Function (GO); Molecular Function (GO)
• Occurrents: Biological Process (GO); Molecular Process (GO)

Foundational Model of Anatomy

An A is_a B
All instances of A are instances of B
What are types? What are instances? (Buckets, thresholds)

Definitions
Cell =Def. an anatomical structure which consists of cytoplasm surrounded by a plasma membrane
Anatomical structure =Def. a material anatomical entity which is generated by coordinated expression of the organism’s own genes
An A =Def. a B which Cs

[Figure: Foundational Model of Anatomy hierarchy. Node labels: Anatomical Structure, Anatomical Space, Organ Cavity Subdivision, Organ Cavity, Organ, Serous Sac Cavity Subdivision, Serous Sac Cavity, Serous Sac, Organ Component, Organ Subdivision, Pleural Sac, Pleural Cavity, Parietal Pleura, Interlobar recess, Organ Part, Mediastinal Pleura, Pleura (Wall of Sac), Visceral Pleura, Mesothelium of Pleura, Tissue]

Heterotaxy =Def. the abnormal arrangement of organs or viscera across the left-right axis differing from “complete situs solitus” and “complete situs inversus”
Left isomerism =Def. a subset of heterotaxy where some paired structures on opposite sides of the left-right axis of the body are symmetrical mirror images of each other, and have the morphology of the normal left-sided structures.
Jacobs, et al., 2007

OBO Foundry recognized by NIH as framework to address mandates for re-usability of data collected through Federally funded research
see NIH PAR-07-425: Data Ontologies for Biomedical Research (R01)

Analysis of outcomes for congenital cardiac disease: can we do better?
Jeffrey P. Jacobs, et al., 2007
• Improving methodologies for verification of data
• Clarifying the relationship between administrative databases [such as ICD] and clinical databases
• Establishing links between databases
• Moving beyond geographical barriers
• Moving beyond sub-specialty barriers

OBO Foundry provides
• tested guidelines enabling new groups to develop the ontologies they need in ways which counteract forking and dispersion of effort
• an incremental bottom-up approach to evidence-based terminology practices in medicine that is rooted in basic biology
• automatic web-based linkage between medical terminologies and biological knowledge resources (massive integration of databases across species and biological system)

A good solution to the silo problem must be:
• modular
• incremental
• bottom-up
• based on consistent, intuitive structure
• evidence-based and thus revisable
• incorporate a strategy for motivating potential developers and users

An ontology is not a database
• New databases for each new kind of data
• New databases for each new project
Ontologies like the GO are a solution to the silo problems databases cause

An ontology is not a terminology
Existing term lists
• built to serve specific data-processing
• in ad hoc ways
Ontologies
• designed from the start to ensure integratability and reusability of data
• by incorporating a common logical structure

Can existing CHD terminologies serve as ontologies?
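
The "common logical structure" that separates an ontology from a term list can be illustrated with the is_a relation defined earlier ("All instances of A are instances of B"). The following is a minimal sketch, not the Gene Ontology's or the FMA's actual tooling. The first two edges are read off the slide definitions of the form "An A =Def. a B which Cs"; the term "cardiac muscle cell", the sample identifiers, and all function names are hypothetical and introduced only for illustration.

```python
from typing import Dict, List, Set

# Child term -> direct is_a parents.
# The first two edges follow the slide definitions ("An A =Def. a B which Cs").
# The third edge is a hypothetical example, not taken from the slides.
IS_A: Dict[str, Set[str]] = {
    "cell": {"anatomical structure"},
    "anatomical structure": {"material anatomical entity"},
    "cardiac muscle cell": {"cell"},  # hypothetical
}

# Hypothetical annotations: data item -> the term it was annotated with.
ANNOTATIONS: Dict[str, str] = {
    "sample-1": "cardiac muscle cell",
    "sample-2": "cell",
}


def ancestors(term: str, is_a: Dict[str, Set[str]] = IS_A) -> Set[str]:
    """Return every term reachable from `term` by following is_a upward."""
    seen: Set[str] = set()
    stack = list(is_a.get(term, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, ()))
    return seen


def is_subtype(a: str, b: str, is_a: Dict[str, Set[str]] = IS_A) -> bool:
    """True if every instance of `a` is an instance of `b` under is_a."""
    return a == b or b in ancestors(a, is_a)


def items_annotated_under(
    term: str,
    annotations: Dict[str, str] = ANNOTATIONS,
    is_a: Dict[str, Set[str]] = IS_A,
) -> List[str]:
    """Find data items annotated with `term` or with any of its subtypes."""
    return sorted(
        item for item, t in annotations.items() if is_subtype(t, term, is_a)
    )
```

Because is_a is transitive, a query for "cell" also retrieves data annotated with the more specific hypothetical term. A flat term list cannot support this kind of retrieval without hand-built mappings.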
An ontology is a representation of the types of entities in a given domain of reality and of the relations between types
What happens if we apply evidence-based rules for ontology construction?

Rule
• Every node in the ontology must represent some type of entity in reality

CardioAccess Tree View
Rule: Each term in an ontology represents a type of biological entity instantiated in biological reality
1. Syntactic Consequences
2. No ‘Other’, No ‘Miscellaneous’, No ‘NOS’
3. Hierarchical organization of types and subtypes
5. Non-redundancy
6. An instance of a process type is never an instance of a thing type
7. Consistent principles for classification not applied

Strategy for building a CHD ontology within the OBO Foundry
A good solution to the silo problem must be:
• modular
• incremental
• bottom-up
• evidence-based
• revisable
• incorporate a strategy for motivating potential developers and users
• work well with other ontologies for neighboring domains

OBO Foundry principle of modularity
• one ontology for each domain
• once you’ve annotated existing data, then no need for mappings (which are in any case too expensive, too fragile, too difficult to keep up-to-date as mapped ontologies change)
• everyone knows where to look to find out how to annotate each kind of data

Modularity fosters division of labor
• allows distributed development
• but only if there is a well-tested, principles-based structure in place
• to ensure that the separate modules work well together

Extending the OBO Foundry to other domains of biology and of clinical and translational medicine

Congenital Heart Disease Ontology Modules
[Table: modules arranged by relation to time (continuant, independent or dependent; occurrent) and by granularity (organ and organism; cell and cellular component; molecule). Module labels: Normal Anatomical Entity, Abnormal Anatomical Entity, Normal Organ Function, Abnormal Organ Function, Developmental Process, Cellular Component, Cellular Function (GO), Genes and Gene Products, Genetic Predispositions, Disease, Molecular Function, Molecular Process, Embryology, Morphology, Surgical Processes]
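
Two of the construction rules applied to the CardioAccess tree earlier (no ‘Other’, ‘Miscellaneous’ or ‘NOS’ terms; non-redundancy) lend themselves to a mechanical check. The following is a minimal sketch under stated assumptions: the function names and the example labels in the usage note are hypothetical, redundancy is approximated as duplicate labels after case and whitespace normalization, and the remaining rules (for instance, that a process is never a thing) would need type information that a bare label list does not carry.

```python
import re
from collections import Counter
from typing import List

# Catch-all words the slides rule out as ontology terms.
CATCH_ALL = re.compile(r"\b(other|miscellaneous|nos)\b", re.IGNORECASE)


def normalize(label: str) -> str:
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def lint_terms(labels: List[str]) -> List[str]:
    """Report labels that break the catch-all or non-redundancy rules."""
    problems: List[str] = []

    # Rule 2: no 'Other', no 'Miscellaneous', no 'NOS'.
    for label in labels:
        if CATCH_ALL.search(label):
            problems.append(f"catch-all term: {label!r}")

    # Rule 5: non-redundancy, approximated here as duplicate normalized labels.
    counts = Counter(normalize(label) for label in labels)
    for norm, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"redundant term: {norm!r} appears {n} times")

    return problems
```

Calling `lint_terms(["Pleural Sac", "Cardiac anomaly, NOS", "Other", "pleural  sac"])` on this hypothetical list flags the two catch-all labels and the duplicated "pleural sac". A check like this only finds candidates; deciding what type in reality a flagged term should represent remains a matter for domain experts.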