What is an ontology and why should you care?
Barry Smith
http://ontology.buffalo.edu/smith
with thanks to Jane Lomax, Gene Ontology Consortium

You're interested in which genes control heart muscle development
17,536 results

Microarray data shows changed expression of thousands of genes. How will you spot the patterns?
[Figure: clustered microarray expression data over time, attacked vs. control. Cluster labels: defense response, immune response, response to stimulus, Toll-regulated genes, JAK-STAT-regulated genes, puparial adhesion, molting cycle, hemocyanin, amino acid catabolism, lipid metabolism, peptidase activity, protein catabolism.]

Ontologies provide a way to capture and represent all this knowledge in a computable form

Uses of 'ontology' in PubMed abstracts

By far the most successful: The Gene Ontology

Definitions

Gene products involved in cardiac muscle development in humans

Term Search Results

Hierarchical view representing relations between represented types

How GO can be used to help analyse microarray data
• Treat samples
• Collect mRNA
• Label
• Hybridize
• Scan
• Normalize
• Select differentially regulated genes
• Understand the biological phenomena involved

Traditional analysis operates via literature search for each successive gene
Gene 1: Apoptosis, Cell-cell signaling, Protein phosphorylation, Mitosis, …
Gene 2: Growth control, Mitosis, Oncogenesis, Protein phosphorylation, …
Gene 3: Growth control, Mitosis, Oncogenesis, Protein phosphorylation, …
Gene 4: Nervous system, Pregnancy, Oncogenesis, Mitosis, …
Gene 100: Positive control of cell proliferation, Mitosis, Oncogenesis, Glucose transport, …

But by using GO annotations, this work has already been done
GO:0006915 : apoptosis

GO allows grouping by process
Apoptosis: Gene 1, Gene 53
Positive control of cell proliferation: Gene 7, Gene 3, Gene 12, …
Mitosis: Gene 2, Gene 5, Gene 45, Gene 7, Gene 35, …
Glucose transport: Gene 7, Gene 3, Gene 6, …
Growth: Gene 5, Gene 2, Gene 6, …
This allows us to ask meaningful questions of microarray data, e.g. which genes are involved in the same process, with the same or different expression patterns?

How does the Gene Ontology work?

1. It provides a controlled vocabulary contributing to the cumulativity of scientific results achieved by distinct research communities (if we all use kilograms, meters, seconds, …, our results are calibrated)

2. It provides a tool for algorithmic reasoning

Hierarchical view representing relations between represented types

The massive quantities of annotations to gene products in terms of the GO allow a new kind of research

Uses of GO in studies of
• pathways associated with heart failure development correlated with cardiac remodeling (PMID 18780759)
• sex-specific pathways in early cardiac response to pressure overload in mice (PMID 18665344)
• molecular signature of cardiomyocyte clusters derived from human embryonic stem cells (PMID 18436862)
• contrast between cardiac left ventricle and diaphragm muscle in expression of genes involved in carbohydrate and lipid metabolism (PMID 18207466)
• immune system involvement in abdominal aortic aneurysms in humans (PMID 17634102)
• …

But GO covers only three sorts of biological entities
– cellular components
– molecular functions
– biological processes
and does not provide representations of disease-related phenomena

How can we extend the GO to
– help integrate complex representations of reality
– help human beings find things in complex representations of reality
– help computers reason with complex representations of reality
in other areas of biomedicine?
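
The grouping by process shown earlier can be sketched in a few lines of Python. This is only an illustrative sketch, not the Gene Ontology tooling itself: the gene-to-process annotations are a small subset of the example groupings above, and the expression calls ("up"/"down") and the function names group_by_process and split_by_expression are hypothetical, introduced here for illustration. Real annotations would be read from a GO annotation file.

```python
from collections import defaultdict

# Hypothetical gene -> process annotations, a subset of the slide's example.
# Real annotations would be read from a GO annotation file.
ANNOTATIONS = {
    "Gene 1": ["apoptosis"],
    "Gene 2": ["mitosis", "growth"],
    "Gene 3": ["positive control of cell proliferation", "glucose transport"],
    "Gene 5": ["mitosis", "growth"],
    "Gene 7": ["positive control of cell proliferation", "mitosis",
               "glucose transport"],
}

# Hypothetical expression calls from a microarray experiment (invented values).
EXPRESSION = {
    "Gene 1": "up",
    "Gene 2": "up",
    "Gene 3": "down",
    "Gene 5": "down",
    "Gene 7": "up",
}


def group_by_process(annotations):
    """Invert a gene -> processes mapping into process -> genes."""
    groups = defaultdict(list)
    for gene, processes in annotations.items():
        for process in processes:
            groups[process].append(gene)
    return dict(groups)


def split_by_expression(groups, expression):
    """For each process, separate its genes by expression call."""
    result = {}
    for process, genes in groups.items():
        by_call = defaultdict(list)
        for gene in genes:
            by_call[expression[gene]].append(gene)
        result[process] = dict(by_call)
    return result


if __name__ == "__main__":
    groups = group_by_process(ANNOTATIONS)
    for process, calls in split_by_expression(groups, EXPRESSION).items():
        print(process, calls)
```

With the annotations already in place, the question "which genes are involved in the same process, with the same or different expression patterns?" reduces to a lookup in the grouped result rather than a literature search per gene.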

The Open Biomedical Ontologies (OBO) Foundry
Ontologies arranged by relation to time (continuant, independent or dependent; occurrent) and by granularity (organ and organism; cell and cellular component; molecule)
Independent continuants:
• Organ and organism: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO)
• Cell and cellular component: Cell (CL), Cellular Component (FMA, GO)
• Molecule: Molecule (ChEBI, SO, RnaO, PrO)
Dependent continuants:
• Organ Function (FMP, CPRO)
• Phenotypic Quality (PaTO)
• Cellular Function (GO)
• Molecular Function (GO)
Occurrents:
• Biological Process (GO)
• Molecular Process (GO)

Initial OBO Foundry coverage
The same arrangement by relation to time and granularity, with the occurrents divided by level:
Independent continuants:
• Organ and organism: Organism (NCBI Taxonomy), Anatomical Entity (FMA, CARO)
• Cell and cellular component: Cell (CL), Cellular Component (FMA, GO)
• Molecule: Molecule (ChEBI, SO, RnaO, PrO)
Dependent continuants:
• Organ Function (FMP, CPRO)
• Phenotypic Quality (PaTO)
• Cellular Function (GO)
• Molecular Function (GO)
Occurrents:
• Organism-Level Process (GO)
• Cellular Process (GO)
• Molecular Process (GO)

CRITERIA
• openness
• common formal language
• collaborative development
• evidence-based maintenance
• identifiers
• versioning
• textual and formal definitions

CRITERIA
• COMMON ARCHITECTURE: The ontology uses common formal relations
• ORTHOGONALITY: One ontology for each domain

LEADERSHIP
• Michael Ashburner, Suzanna Lewis, Chris Mungall (GO Consortium)
• Alan Ruttenberg (Science Commons, OWL Working Group, HCLS/Semantic Web)
• Richard Scheuermann (ImmPort, CTSA)
• Barry Smith

OBO Foundry provides
• tested guidelines enabling new groups to develop the ontologies they need in ways which counteract forking and dispersion of effort
• an incremental bottom-up approach to evidence-based terminology practices in medicine that is rooted in basic biology
• automatic web-based linkage between medical terminologies and biological knowledge resources