Ontology in Buffalo
August 27, 2012
Barry Smith

Problems
• How to find data
• How to reason with data when you find it
• How to integrate with other data
• How to label the data you are collecting
Answer: annotate your data with a common ontology.
How to build a common ontology, that is, an ontology that will integrate well with ontologies built for neighboring domains?

Science requires a common suite of ontologies covering all scientific domains
• Science is global and seamless
• Scientific data is public

Ontologists in UB
• Barry Smith (Philosophy, Bioinformatics)
• Werner Ceusters (Psychiatry, Bioinformatics)
• Alan Ruttenberg (Director of UB Clinical and Translational Data Exchange)
• Alex Diehl (Neurology, Director of Ontology Services for School of Medicine)

• Pain Ontology grant with NIDCR
• Protein Ontology grant with NIGMS
• Infectious Disease Ontology grant with NIAID
• National Center for Biomedical Ontology grant with NHGRI
• Cell Ontology grant with NHGRI
• SNOMED grant with NLM
• ARGOS on EU/US cooperation in Health IT
• VIVO / eagle-i collaboration

Collaborations
• Center for Brain and Behavior Informatics (http://cbbi.buffalo.edu)
• Stroke Patient Registry
• Alzheimer's Patient Registry
• Degenerative Disease Ontology
• Immunology Ontology
• Roswell Park Cancer Institute (Malignancy Ontology)
• School of Dental Medicine (Pain Ontology, Picasso EHR)
• Institute for Healthcare Informatics (http://ahc.buffalo.edu/ihi/)
• Center of Excellence in Bioinformatics and Life Sciences (http://www.bioinformatics.buffalo.edu/)

Ontologists in Buffalo
• Jason Corso (Computer Science – video analysis)
• Albert Goldfain (Blue Highway, Inc. – Infectious Disease Ontology, data exchange between devices)
• Dagobert Soergel (Information Studies – online advanced certificate program in ontology)

National Center for Biomedical Ontology (NCBO)
• Stanford University Biomedical Research
• Mayo Clinic
• University at Buffalo

[Figure: uses of ‘ontology’ in PubMed abstracts]

By far the most successful: GO (Gene Ontology)

GO provides a controlled system of terms for use in annotating (describing, tagging) data
• multi-species, multi-disciplinary, open source
• contributing to the cumulativity of scientific results obtained by distinct research communities
• compare use of kilograms, meters, seconds in formulating experimental results

US $200 million invested in literature and data curation using GO
• over 11 million annotations relating gene products described in the UniProt, Ensembl and other databases to terms in the GO
• experimental results reported in 52,000 scientific journal articles manually annotated by expert biologists using GO

GO is amazingly successful in overcoming the data balkanization problem, but it covers only generic biological entities of three sorts:
– cellular components
– molecular functions
– biological processes
and it does not provide representations of diseases, symptoms, …

Original OBO Foundry ontologies (Gene Ontology in yellow)
Columns give the relation to time (continuant, split into independent and dependent; occurrent); rows give the granularity.
ORGAN AND ORGANISM
  Independent continuant: Organism (NCBI Taxonomy); Anatomical Entity (FMA, CARO)
  Dependent continuant: Organ Function (FMP, CPRO); Phenotypic Quality (PaTO)
  Occurrent: Biological Process (GO)
CELL AND CELLULAR COMPONENT
  Independent continuant: Cell (CL); Cellular Component (FMA, GO)
  Dependent continuant: Cellular Function (GO)
MOLECULE
  Independent continuant: Molecule (ChEBI, SO, RnaO, PrO)
  Dependent continuant: Molecular Function (GO)
  Occurrent: Molecular Process (GO)

Environment Ontology
[Figure: the same matrix, with the annotation ‘environments are here’]

[Figure: the same matrix extended with a further row]
COMPLEX OF ORGANISMS
  Independent continuant: Family, Community, Deme, Population
  Dependent continuant: Population Phenotype
  Occurrent: Population Process
http://obofoundry.org

The OBO Foundry: a step-by-step, evidence-based approach to expand the GO
Developers commit to working to ensure that, for each domain, there is community convergence on a single ontology, and agree in advance to collaborate with developers of ontologies in adjacent domains.
http://obofoundry.org

OBO Foundry Principles
• Common governance (coordinating editors)
• Common training
• Common architecture
  – simple shared top level ontology
  – shared Relation Ontology: www.obofoundry.org/ro

Open Biomedical Ontologies Foundry
Seeks to create high quality, validated terminology modules across all of the life sciences which will be
• non-redundant
• close to language use of experts
• evidence-based
• incorporate a strategy for motivating potential developers and users
• revisable as science advances

The OBO Foundry is a collective experiment involving many biological and clinical communities attempting to create terminology resources which will support the goal of modularity: one ontology for each domain.
No need for ‘mappings’.

OBO Foundry (example ontologies)
• GO Gene Ontology
• CL Cell Ontology
• SO Sequence Ontology
• ChEBI Chemical Entities of Biological Interest
• PATO Phenotype (Quality) Ontology
• FMA Foundational Model of Anatomy Ontology
• PRO Protein Ontology
• Plant Ontology
• Environment Ontology
• Ontology for Biomedical Investigations
• RNA Ontology

Introduction to Basic Formal Ontology

The central distinction: universal vs. instance
• human being vs. Arnold Schwarzenegger
• science text vs. diary
• catalog vs. inventory

Ontologies are representations of universals in reality, aka kinds, types, categories, species, genera, ...

[Figure: a parts catalog (universals: part numbers 515287, 521683, 521682; DC3300 Dust Collector Fan, Gilmer Belt, Motor Drive Belt) contrasted with an inventory of individual items A, B, C (instances)]

[Figure: a taxonomy of universals (object, organism, animal, mammal, cat, siamese, frog) above their instances]

[Figure: fragment of an anatomy hierarchy under Anatomical Structure and Anatomical Space: Organ, Organ Cavity, Organ Cavity Subdivision, Serous Sac, Serous Sac Cavity, Serous Sac Cavity Subdivision, Organ Component, Organ Subdivision, Organ Part, Tissue, Pleural Sac, Pleural Cavity, Parietal Pleura, Visceral Pleura, Mediastinal Pleura, Interlobar recess, Pleura (Wall of Sac), Mesothelium of Pleura]

An example of a simple rule
Each term in an ontology represents exactly one universal.
For this reason ontology terms should be singular nouns:
• organism
• headache
• drug administration

The Pre-History of BFO
• Aristotle (4th Century BC)
• Edmund Husserl’s Logical Investigations (1900-01)
• “Truthmaker” (1984)
• Patrick Hayes, “Naïve Physics Manifesto” (1985)
• Qualitative spatial reasoning (1990 – )
• DOLCE (1991 – )
• GO, FMA (2004 – )

Aristotle’s Ontological Square
Universal, substantial (second substance): man, cat, ox
Universal, accidental (second accident): headache, sun-tan, dread
Particular, substantial (first substance): this man, this cat, this ox
Particular, accidental (first accident): this headache, this sun-tan, this dread

Edmund Husserl
• Coined ‘formal ontology’
• Introduced formal mereology
• First formal account of dependence relations
[Figure: the ontological square again, with particulars this man, this ox (substantial) and this man’s headache, that man’s knowledge of Greek (accidental)]

Truthmaker (1984)
Q: What is it in reality in virtue of which a true assertion such as “John has a headache” is true?
A: John’s current headache
Kevin Mulligan, Peter M. Simons and Barry Smith, “Truth-Makers”, Philosophy and Phenomenological Research, 44 (1984), 287–321.

John Searle
Mind-to-world direction of fit (these have truthmakers):
• Belief
• Statement
• Photograph
• Scientific theory
World-to-mind direction of fit:
• Plan
• Instruction
• Request
• Command

Hayes’ Naïve Physics Manifesto
BFO 1.n: How can we construct a formal ontology (= an ontology formalized using first-order predicate logic) that will represent the entities we experience in our everyday perception and action?
BFO 2.n: How can we do this in a way that will also be compatible with what we know from physics?

Qualitative spatial reasoning / mereotopology
• COSIT Conferences on Spatial Information Theory – http://www.cosit.info/
• Leeds Qualitative Spatial Reasoning Group – http://www.comp.leeds.ac.uk/qsr/
• Anthony Galton – http://empslocal.ex.ac.uk/people/staff/apgalton/
• Thomas Bittner – http://www.buffalo.edu/~bittner3
• Roberto Casati and Achille Varzi, Parts and Places (MIT Press, 1999)

The History of BFO
2004 BFO 1.0
2005 OBO Relation Ontology (RO)
2006 BFO 1.1 adds generically dependent continuants
2012 BFO 2.0 incorporates top-level relations from RO and addresses the problem of process measurement data (e.g. heart rates)

BFO: A First Look
[Figure: universals Continuant (Independent Continuant, Dependent Continuant) and Occurrent (Process, Event), each with instances below]

Basic Formal Ontology
• a true upper level ontology
• no interference with domain ontologies
• no interference with issues of cognition
• no putative fictions

Main reason to use BFO
BFO has the largest body of users (compare: this telephone network has the largest number of subscribers).
Snowballing network effects:
• data annotated using BFO-conformant ontologies becomes more valuable
• the number of people with expertise in building BFO-conformant ontologies increases

How BFO is constructed and maintained
Simplicity
• BFO has objects
• BFO has qualities of objects
• BFO has no qualities of qualities
Simplicity
• BFO has particulars
• BFO has universals
• Only particulars instantiate universals (no ‘meta-universals’)

How BFO is constructed and maintained
Perspectivalism: ontologies are windows on reality.
There is a multiplicity of windows (perspectives), all equally veridical, i.e. transparent to reality.
For example, we can view an organism as a single object or as a collection of molecules (granular perspectives).

Ontological realism
Reality exists behind a transparent grid = a veridical partition.
Barry Smith, “Beyond Concepts, or: Ontology as Reality Representation”, (FOIS 2004), http://ontology.buffalo.edu/bfo/Beyond_Concepts.pdf

[Figure: Alberti’s Grid]

Many veridical partitions
Common sense involves many veridical partitions; otherwise we would all be dead.
The common sense partitions of folk physics, folk psychology and folk biology are to a large degree transparent to reality.
It is such common sense partitions that are involved, for instance, when someone takes your temperature in the hospital.

The fundamental thesis of ontological realism, that many of our natural-language and scientific partitions are transparent to reality, is in fact quite trivial.

BFO 1.0

Three Fundamental Dichotomies
• Universal/Type vs. instance
• Continuant vs. occurrent
• Dependent vs. independent
http://ontology.buffalo.edu/bfo/

Basic Formal Ontology
[Figure: Continuant (Independent Continuant, Dependent Continuant) and Occurrent (Process, Event)]
http://ifomis.uni-saarland.de/bfo/

Blinding Flash of the Obvious
[Figure: the same diagram]
http://ifomis.uni-saarland.de/bfo/

Continuant entities
- have continuous existence in time
- preserve their identity through change
- exist in toto if they exist at all
Occurrent entities
- have temporal parts
- unfold themselves phase by phase
- exist only in their phases/stages

You are a substance. Your life is a process.
You are 3-dimensional. Your life is 4-dimensional.

BFO: the very top
• Continuant
  – Independent Continuant
  – Specifically Dependent Continuant
• Occurrent (always dependent on one or more independent continuants)

[Figure: types Continuant (Independent Continuant: thing; Specifically Dependent Continuant: quality) and Occurrent (process, event), linked by instance_of to their instances]

Specifically dependent continuants
• Qualities: the whiteness of this cheese, the mass of this banana, the rigidity of this stone

• Continuant
  – Independent Continuant
  – Specifically Dependent Continuant
    · Non-realizable Dependent Continuant (quality)
    · Realizable Dependent Continuant (function, role, disposition)

Realizable dependent continuants
• Role: nurse role, pathogen role, food role
• Disposition: fragility, virulence, susceptibility, genetic disposition to disease X
• Function: to pump (of the heart), to unlock (of the key)

[Figure: a realizable Specifically Dependent Continuant (disposition) specifically_depends_on its bearer (Independent Continuant); its realization is an Occurrent (process of realization)]

BFO
• Continuant
  – Independent Continuant (molecule, cell, organ, organism)
  – Dependent Continuant (quality, function, disease)
• Occurrent (Process), e.g. Functioning; e.g. Side-Effect, Stochastic Process, ...
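The top-level taxonomy on the slides above can be sketched as a small is_a hierarchy. This is only an illustration: the dictionary layout, the helper name is_a and the root label "entity" are assumptions made for the example, not part of the slides.

```python
# Hypothetical sketch: parent links for the BFO 1.0 top-level universals
# named on the slides. "entity" as a common root is an assumption.
PARENT = {
    "continuant": "entity",
    "occurrent": "entity",
    "independent continuant": "continuant",
    "specifically dependent continuant": "continuant",
    "quality": "specifically dependent continuant",
    "realizable dependent continuant": "specifically dependent continuant",
    "role": "realizable dependent continuant",
    "disposition": "realizable dependent continuant",
    "function": "realizable dependent continuant",
    "process": "occurrent",
}


def is_a(child: str, ancestor: str) -> bool:
    """Return True if `child` is `ancestor` or lies below it in the hierarchy."""
    node = child
    while node is not None:
        if node == ancestor:
            return True
        node = PARENT.get(node)  # None once we step past the root
    return False


# A function is a continuant; a process is not.
print(is_a("function", "continuant"))
print(is_a("process", "continuant"))
```

Because every universal has a single parent here, subsumption reduces to walking one chain of links upward.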
BFO partitions reality
All terms included in the ontology are intended to designate universals in reality, in conformity with the basic principle of science-based ontology.
But this means that science-based ontologies are on the one hand windows on the universals in reality, but on the other hand windows on the instances in reality.

Realizable dependent entities (continuants)
• role
• disposition
• function

Their realizations (occurrents)
• execution
• expression
• exercise
• application
• course

• Continuant
  – Independent Continuant
  – Specifically Dependent Continuant
    · Quality
    · Realizable Dependent Continuant: Disposition (e.g. Disease), Function, Role
• Occurrent, e.g. Functioning

BFO 1.1

Specifically Dependent Continuants
If any bearer ceases to exist, then the quality or function ceases to exist.
• the color of my skin
• the function of my heart
Subtypes: Quality, Pattern; Realizable Dependent Continuant

Generically Dependent Continuants
If one bearer ceases to exist, then the entity can survive, because there are other bearers (copyability).
• Information Object: the pdf file on my laptop
• Sequence: the DNA (sequence) in this chromosome

Information objects
• pdf file
• poem
• symphony
• algorithm
• symbol
• sequence
• molecular structure

Generically dependent continuants such as plans, laws … are concretized in specifically dependent continuants (the plan in your head, the protocol being realized by your research team, the law being implemented by this government agency).

Generically dependent continuants are concretized in specifically dependent continuants
Beethoven’s 9th Symphony is concretized in the pattern of ink marks which make up this score in my hand.

Universal or instance
• Continuant
  – Independent Continuant: human being, protocol document
  – Dependent Continuant: pattern of ink marks
• Occurrent (Process): applying the protocol, side-effect, …
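The contrast between specific and generic dependence described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the class names, the single-bearer simplification for specifically dependent continuants, and the "backup server" bearer are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Bearer:
    name: str
    exists: bool = True


@dataclass
class SpecificallyDependent:
    """E.g. a pattern of ink marks: tied to one particular bearer (simplification)."""
    name: str
    bearer: Bearer

    def exists(self) -> bool:
        # Ceases to exist as soon as its bearer does.
        return self.bearer.exists


@dataclass
class GenericallyDependent:
    """E.g. a pdf file: survives as long as at least one bearer remains."""
    name: str
    bearers: list = field(default_factory=list)

    def exists(self) -> bool:
        return any(b.exists for b in self.bearers)


laptop = Bearer("my laptop")
server = Bearer("backup server")  # hypothetical second bearer
pdf = GenericallyDependent("the pdf file", [laptop, server])
pattern = SpecificallyDependent("bit pattern on my laptop's disk", laptop)

laptop.exists = False
print(pdf.exists(), pattern.exists())  # True False
```

The copy on the second bearer is what lets the file outlive the laptop; the concrete pattern on the laptop's disk does not.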
Generically Dependent Continuants
• Generically Dependent Continuant
  – Information Entity: .pdf file, .doc file
  – Sequence
with instances below

BFO 2.0

BFO:object_aggregate
Not a sum of objects, but something like a set: a certain part of material reality, picked out by a certain granular partition.

[Figure: The Beatles]

We use the John Paul George Ringo partition to pick out a certain object aggregate in this particular portion of material reality; then we mask all other portions of reality, both external (the water) and internal (the cells and molecules).

John Paul George Ringo is a veridical partition. The Beatles truly do (did) exist.

[Figure: a chess set partitioned in several ways: chess pieces (king, queen, rook, bishop, knight, pawn), table top, surrounding space, molecules]
All these partitions are veridical.

BFO:object_aggregate
An object aggregate can, however, change its members over time (e.g. the aggregate of members of the International Association for Ontology and Its Applications).
Examples: populations, families, tribes, species, planetary systems – anything associated with a count, a registry, an inventory, a census.

[Figure: inventory]

member_part_of
a member_part_of b at t =Def. a is an object at t & there is at t a mutually exhaustive and pairwise disjoint partition of b into objects x1, …, xn with a = xi for some i with 1 ≤ i ≤ n.
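The member_part_of definition can be checked mechanically once a candidate partition is supplied. The sketch below is an illustration only: it models objects as frozensets of labels for portions of matter, which is an assumption made for the example, and it tests one given partition at a fixed time rather than searching for one.

```python
from itertools import combinations


def is_partition(parts, whole):
    """True if `parts` are non-empty, pairwise disjoint and jointly exhaust `whole`."""
    if any(not p for p in parts):
        return False
    disjoint = all(p.isdisjoint(q) for p, q in combinations(parts, 2))
    exhaustive = set().union(*parts) == set(whole)
    return disjoint and exhaustive


def member_part_of(a, partition_of_b, b):
    """a member_part_of b at t, given one candidate partition of b at t."""
    return is_partition(partition_of_b, b) and a in partition_of_b


# Hypothetical modelling: each Beatle is one object; the aggregate is their union.
john, paul = frozenset({"john"}), frozenset({"paul"})
george, ringo = frozenset({"george"}), frozenset({"ringo"})
beatles = john | paul | george | ringo
partition = [john, paul, george, ringo]

print(member_part_of(john, partition, beatles))         # True
print(member_part_of(john | paul, partition, beatles))  # False
```

The pair John-and-Paul is a part of the aggregate but not a member part, because it is not one of the cells of the partition.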
Use this as a basis for a theory of groups, organizations and other social objects.

Non-rigid universals
= universals which (may) hold of a continuant only for a certain time in the life of the continuant
John, a human, instantiates:
• embryo at t1
• fetus at t2
• neonate at t3
• infant at t4
• child at t5
• adult at t6
In nature, there are no sharp boundaries here.

Phase transitions
This portion of H2O, a portion of water, instantiates:
• portion of ice at t1
• portion of liquid water at t2
• portion of gas at t3

When we measure temperatures we impose a quantitative partition on a portion of reality
John’s temperature endures through time and instantiates:
• 37ºC at t1
• 37.1ºC at t2
• 37.2ºC at t3
• 37.3ºC at t4
• 37.4ºC at t5
• 37.5ºC at t6

Determinable and determinate qualities
Temperature is the rigid determinable; 37ºC, 37.1ºC, …, 37.5ºC are the determinates which John’s temperature (a quality instance) instantiates at t1, …, t6.
In nature, there are no sharp boundaries here.

Recall how we deal with phase sortals
John instance_of nurse at t =Def. John instance_of human being at t & for some x, x instance_of nurse role & x inheres_in John at t

Role universals are rigid universals
Nurse role is_a role.
If x instance_of role at t, then x instance_of role at all times at which x exists.
Quality, disposition, region, material entity: these too are rigid universals.
Is object a rigid universal?

Full processes
p is a full process =Def. for some spatiotemporal region s, p occupies s & every process q which occupies some part of s is part of p
All full processes occupying any given spatiotemporal region are identical.

History
history of a material entity m = the full process which is the sum of processes taking place in the spatiotemporal region occupied by m

History
The relation between a material entity and its history is one-to-one: for any material entity a, there is exactly one process which is the history of a; for every history h, there is exactly one material entity which h is the history of.
Histories are additive. Thus for any two material entities a and b, the history of the sum of a and b is the sum of their histories.

Lives (for OGMS)
The life of an organism is the history of the corresponding OGMS:extended organism.

Partial processes
p is a partial process = p is a process & p is not a full process

A spinning top is simultaneously getting warmer
Two distinguishable (indeed separately measurable) process profiles in a single region of spacetime.

Typically, processes are very complicated
A single running process p might be an instance of multiple universals such as
– 3.12 m/s motion process,
– 9.2 calories per minute energy burning process,
– 30.12 liters per kilometer oxygen utilizing process,
– cardiovascular exercise process of type #16
and so on. Each of these corresponds to a partial process within p.

Solution
• focus not on ‘thick’ processes, such as runnings or hearts’ beating
• but on ‘thin’ structural parts of processes, called ‘process profiles’
• (event patterns, …)

Single quality process profile
• a process of the sort that can be represented by a chart plotting quality measurement results on a single dimension against a time axis
• a quality process profile is a truthmaker for a time series graph of this sort

Examples of single quality process profiles
1. the course of Jim’s temperature
2. the course of Jim’s weight
3. the course of Jim’s height
4. the course of Jim’s fortune
Each is depictable by means of a time series graph.

Process profile
that which the output of a correct device would represent = that which a correct time-series graph would represent

Temperature
[Figure: time series graph of temperature]
Call the process represented by this graph a (temperature) quality process profile.
The graph picks out just one dimension of qualitative change within a much larger conglomerate of processes.
Hence ‘quality process profile’.

What did your temperature do over the last month, Jim?
A quality process profile is a target of a certain sort of cognitive selection, or cognitive profiling.

Cardiac Cycle, Left Ventricle
[Figure: cardiac cycle, left ventricle; a multi-quality process profile]
Some processes can incorporate multiple quality process profiles, corresponding to the multiple different sorts of partition of the same reality involved during measurement.

Compare perception of polyphonic music
• Cognitive selection of the cello part when you listen to a string quartet
• Picking out a certain process profile within a larger body of vibrations
• Ignoring sneezes, coughs, …
• (sometimes focusing on sneezes and coughs for diagnostic purposes)

Simultaneous causality
[Figure: Continuant (Independent Continuant: thing; Dependent Continuant: quality) and Occurrent (process); the quality temperature specifically_depends_on its bearer]

[Figure: The Beatles]

Quality partitions
[Figure: quality partitions of color, coarser (red, orange, yellow, green, ...) and finer (crimson, deep red, blood red, ...)]

Example: a chess game
W: Pawn to King 4
B: Pawn to Queen’s Bishop 3
W: Pawn to Queen 3
...
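Returning to the temperature example: a single quality process profile is what a time series graph depicts, and measuring imposes a partition into determinates on it. The sketch below illustrates that idea only; the function names, the 0.1-degree step and the sample readings are assumptions made for the example, not data from the slides.

```python
def determinate_label(celsius: float) -> str:
    """Name the 0.1-degree determinate temperature that a reading falls under."""
    return f"{celsius:.1f} C"


def quality_process_profile(readings):
    """Turn (time, value) readings into (time, determinate) pairs, ordered by time.

    The result is the kind of record a time series graph would plot:
    one quality dimension against a time axis.
    """
    return [(t, determinate_label(v)) for t, v in sorted(readings)]


# Hypothetical readings of one person's temperature at times 1, 2, 3.
readings = [(2, 37.12), (1, 37.04), (3, 37.21)]
profile = quality_process_profile(readings)
for time, determinate in profile:
    print(time, determinate)
```

The underlying temperature varies continuously; the labels reflect the partition chosen for measurement, which is why a coarser or finer step would yield a different but equally correct record.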
Two directions of fit
world-to-mind and mind-to-world
What begins as a plan ends as a record (with truthmaker, if it is a true record: the journey you took).

Example: An airline ticket
7:00am LH 465 Vienna, arrive London Heathrow 8:15am
9:45am LH 05 London Heathrow, arrive New York (JFK) 3:45pm
5:50pm UA 1492 New York (JFK), arrive Columbus, OH 7:05pm

When you understood the airline ticket, when you understood the reality of the corresponding portion of spacetime at the level of granularity dictated by the airline ticket, you were directed towards a process profile called a journey.
The journey is the truthmaker for the ticket.

Process profiles and the role of standardized notations
• music
• chess
• choreography
• ship stow planning
• military manoeuvres
• language
• traffic law

Process profiles and linguistics
Phonetics deals with the production of speech sounds by humans, often without prior knowledge of the language being spoken.
Phonology is about patterns of sounds in all spoken languages.

Protocol
#1 protocol (GDC) instance_of OBI: type plan specification.
#1 concretized_in #2 (= plan in mind of leader of research team, a realizable SDC) to carry out some experiment.
Realization of #2 starts with the creation of a series of sub-protocols, which are plan specifications for each team member.
The experiment itself is the sum of the realizations of these plans, having as outputs further GDCs such as publications, databases …

Music
Beethoven’s 9th Symphony is a certain abstract pattern (generically dependent continuant), which we shall call #9.
#9 instance_of symphony
symphony is_a musical work
#9 instance_of musical work
#9 concretized_in specifically dependent continuant: pattern of ink marks borne by this printed copy of the score #10
#9 concretized_in specifically dependent continuant: pattern of grooves in this vinyl disk

Music
#10 instance_of generically dependent continuant type OBI:plan specification
#10 specifies how to create a performance of #9.
#10 is concretized_in this network of subplans (complex realizable SDC) distributed across the minds of the conductor and members of this orchestra: #11
#11 realized_in this performance: #12
#12 “copied” in what you hear (a process inside your head)

Are mental processes process profiles?