Politics and Pragmatism in Scientific Ontology Construction
Mike Travers
Inconsistency Robustness 2011

Overview
• Introduction
• Two kinds of knowledge infrastructure
• Ontological controversies: some examples
• The nature of actual scientific representation
• Representational pragmatism
• Technical directions
• Conclusions

My background
• Artificial Intelligence, Media Science, Human Interface, Constructionism, Visual Programming, Knowledge Representation, Agent-based systems, Programming Languages, Scientific Software, Philosophy of Science, Narrative Theory (@startups, large companies, open source projects, and now SRI)
• Scientific KM, Collaboration, Decision Support, Publishing Standards, Sociology, Cognitive Science

Synopsis
• Knowledge representation inevitably involves inconsistency, controversy, hence politics;
• Scientific representation does too, but it has worked-out practices for dealing with it;
• KR should work more like science rather than the other way around;
• Representational Pragmatism: a conceptual framework to make it happen

What’s a knowledge infrastructure?
• A system of
  – Technologies,
  – Institutions,
  – Standards, and
  – Practices
• that serve to support knowledge
  – Collection
  – Storage
  – Curation
  – Sharing
  – Validation
  – …

Knowledge Infrastructure #1: Science
• The scientific community
• An elaborate web of
  – People (scientists and others)
  – Institutions (labs, journals, funding agencies, instrument makers…)
  – Practices (publishing criteria, protocols, conferences)
• Works pretty well! The gold standard for knowledge, in fact.
• But there are issues of scaling, quality, inertia, siloing, epistemological closure…

Knowledge Infrastructure #2: The Semantic Web
• Set of technical standards for sharing formalized knowledge
• Aspires to be a universal framework for knowledge
• A grand vision of global-scale knowledge representation
• And tremendously important and needed.
[Diagram layers: Provenance; Reasoning; Classification; Relations & Properties; Naming]

These two are becoming one…
• Bioscience is by far the largest application area for semantic web technology

Some non-robust properties of the semantic web
• Too inexpressive (can’t represent default reasoning or n-way predicates)
• Too complex (prevents widespread acceptance)
• Too logic-based (emphasizes the wrong things)

Convergence and Controversy
• Ontologies are supposed to define a common understanding of a domain
• But “common” is easier said than done
• In practice:
  – Many different constituencies
  – With different ideas about what’s important
  – Many side-factors complicate things (implementation cost, personal status, existing non-rigorous usages…)
  – Compromise is necessary but rarely produces elegant results

Example: psychiatric illness
• What constitutes a mental illness?
  – Not at all obvious that categories correspond to real phenomena
  – Huge changes over time
  – Currently defined by DSM-IV through a highly politicized process
  – History of PTSD (Scott, 1990)
    • “Combat fatigue” or cowardice
    • In and out of the DSM
    • Finally recognized as PTSD, partly as a response to the Vietnam War

Psychiatric illness (2)
• Homosexuality
  – Formerly a pathology, now not, through a highly politicized process
• Attention Deficit Disorder
  – Cluster of symptoms, not clear what the boundaries should be
  – Opinions often determined by theories of child-rearing or institutional aspects of school
• Insurers and economics are important actors in the debate
• Summary:
  – these disorders are socially constructed categories
  – over a definite but unclear underlying reality.

Example: category fudging
• In Pathway Tools, SRI’s bioinformatics knowledge base
• This is a widely used system for curating genomes and metabolic pathways
• Underlying frame system
• Web-based interface

Example: Gene/Protein conflation
• Genes and proteins are different things
• But biologists tend to want to use the same name for a gene and its product
• Tension between formal ontology and actual scientific usage
• Equivalently, an argument between the computer scientists who build the system and the biologists who use and curate it
[Figure labels: Gene (DNA) trpA; Gene product (Protein); Search for “trpA”]

Moral of this somewhat trivial example
• There are tensions (inconsistencies) between formal representation and actual usage
• And software makers end up having to cope with these tensions in design decisions
• Usually in a kludgy way!
• E.g., papering over the conflict in the user interface layer
• It would be nice to have a better theory of how to do this.

Example: how do we classify mitochondria?
• Organelles (part of the cell)
• But descended from separate endosymbiotic organisms
• With their own DNA
• (Generally but not universally accepted theory)

There are consequences
• “If we accept that mitochondria are bacteria, then the record books have to be rewritten. The first bacterial genome sequence was completed not by American arriviste Craig Venter …in 1995, but instead by … Fred Sanger, who completed the human mitochondrial genome sequence in 1981!”

Expressivity in Description Logics
• Description Logics (DL) are the basis for semantic web ontologies.
• Selected largely for computational tractability
• But DL make it hard to do simple things such as representing defaults
  – All cats have hair
  – Except for this one!
• Expressivity has been traded away
• A compromise, and perhaps not the right one

Bruno Latour
• French philosopher and sociologist of science
• Roundly reviled for perceived anti-realism
• Started with anthropological studies of science in labs and in the field
• Ends with a rather unique view of representation and even metaphysics

Latour for dummies
• Science is a social construction (but not an arbitrary one)
• Network based: a network consists of human and non-human actors (lab animals, instruments, funding institutions…)
• Agonistic: trials of strength between networks
• Understand how science works by tracing the flow of inscriptions, abstractions, and power through these networks
• An enriched realism that provides a rich account of the relation between phenomena and representation

Dual face of science
• Settled science: “That’s the way it is”
  – Objective
  – Black-boxed
  – Politically Established
  – Natural
• Science under construction:
  – Unsettled
  – Contentious
  – Searching for allies (people, funding, …)
  – Building networks of alliance
  – Social

• Science in the making:
  – E.g., Watson and Crick’s work on the structure of DNA
    • Speculations (a three-strand model was proposed)
    • Contending theories
    • Eventually a winner emerges
• Science made
  – Now that the structure of DNA is known,
    • it’s a “black box”
    • we can make instruments that measure it
    • representations of its sequence
[Figure labels: Under construction; Black boxed]

Where the representation meets the road
• Science is: “the transformation of rats and mice into paper”
• Situated representations
  – From phenomena
  – Lab notebook
  – Tables in articles
  – Laws of nature
[Diagram axis: from concrete, situated to abstract, objective]

Jeff Shrager, “Diary of an Insane Cell Mechanic”
• Intercalation of representations and the phenomenon

Analogizing to KR
• Knowledge Representation:
  – Realist
  – Objective
  – Settled
  – Factual
  – Established
  – Abstract
  – Graph structures
• Knowledge Construction:
  – Situated representations
  – Unsettled
  – Bottom-up
  – User interfaces
  – Ad-hoc structures

A new view of the relation between world and representation
• Latour refocuses epistemology
  – Less on the truth of representations,
  – More on their connection to the world via networks of actants.
• Should be a natural fit for computationalists
  – Who also make systems of symbols with causal connections to the world and each other

Realism vs Conceptualism
• Realism: a movement in philosophy of KR
• Led mostly by Barry Smith, SUNY Buffalo (e.g. “Beyond Concepts: Ontology as Reality Representation”, 2004)
• The problem: nobody knows what makes a good ontology
• His solution: Aristotelian universals
  – “Bad ontologies are…those whose general terms lack the relation to corresponding universals in reality, and thereby also to corresponding instances. Good ontologies are reality representations...”

Realism is extremely annoying
• Both vacuous and wrong
• Vacuous: because it presupposes we know what is real beforehand
• Wrong: because it doesn’t correspond to actual scientific knowledge representation
• Examples of failure:
  – Higgs bosons: we don’t know if they are real
  – Genes: were hypothesized before their “implementation” was known; when were they real?
  – Software for synthetic chemistry: mixes real and not-yet-real molecular structures
[Screenshot caption: Afferent: software for drug discovery chemists]

But Realism is Winning
• Basis of BFO (Basic Formal Ontology)
• Which is used by OBO Foundry and other bio-ontology efforts
• Nobody wants to be against “realism”… so they picked a good name

Realism only deals with half of science
• May work for ready-made science,
• hopeless for science-in-the-making
• Where we don’t know what’s real
• And which is where the action is

Representational Pragmatism
• Needed: a term with good connotations to compete with “realism”.
• Connects to a philosophical tradition (James, Peirce, Dewey, Rorty)
  – “It is astonishing how many philosophical disputes collapse into insignificance the moment that you subject them to this simple test of tracing a concrete consequence” (James)
• Bottom-up rather than top-down; opposed to premature ontologizing; Latourian
• Support the divergent representational practices of actual science
• Help science towards convergence, objectivity, and realism, rather than demanding it upfront.
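
One way to picture "supporting divergent representational practices" in software is a store that records competing claims side by side, each with its source, instead of rejecting the second one as an inconsistency. The sketch below is only an illustration under assumptions: the `Claim` and `ClaimStore` names, the `is_a` predicate, and the source labels are all hypothetical and are not taken from Pathway Tools, BioBike, or any other system named in the talk. It reuses the mitochondria example from earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    subject: str    # the thing being described
    predicate: str  # the relation being asserted
    value: str      # the asserted value
    source: str     # who holds this view (hypothetical label)


class ClaimStore:
    """Keeps every claim; disagreement is recorded rather than rejected."""

    def __init__(self):
        self._claims = {}  # (subject, predicate) -> list of Claim

    def assert_claim(self, claim):
        key = (claim.subject, claim.predicate)
        self._claims.setdefault(key, []).append(claim)

    def views(self, subject, predicate):
        """All recorded claims for this subject and predicate, in order."""
        return list(self._claims.get((subject, predicate), []))

    def is_contested(self, subject, predicate):
        """True when sources disagree about the value."""
        values = {c.value for c in self.views(subject, predicate)}
        return len(values) > 1


store = ClaimStore()
store.assert_claim(
    Claim("mitochondrion", "is_a", "organelle", "cell-biology view"))
store.assert_claim(
    Claim("mitochondrion", "is_a", "bacterium", "endosymbiosis view"))

for claim in store.views("mitochondrion", "is_a"):
    print(f"{claim.source}: mitochondrion is_a {claim.value}")
```

Convergence is then something a community can work toward by examining the contested entries, rather than a precondition for entering data at all.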

Some encouraging developments
• Linked data vs semantic web
  – A somewhat more bottom-up, pragmatic approach to universal knowledge infrastructure
• Freebase, DBpedia: similar efforts
• Open Science movement
  – Open Access Journals (PLoS, etc.)
  – Open Data (standards)
  – Open Notebook (practices)

BioBike: a platform for symbolic biocomputing
• A web-based, programmable tool for advanced biocomputing
  – Knowledge-based
  – Programmable
  – Social
• Really the inspiration for many of the ideas here
• Joint work with Jeff Shrager (Stanford), Jeff Elhai (VCU), and others

Reworked to be more social
[Screenshot labels: Biocomputation; Bio-blog menu; Knowledge/data analysis; Integration with services; Commentary]

Prototype-based KR
• How the mind categorizes (Rosch, Lakoff)
• A perennial minority theme in computation:
  – 60s: Sutherland, Sketchpad
  – 70s: Early frame-based KR systems
  – 80s: Ungar and Smith, SELF programming language
  – 90s: Ken Haase, Framer
  – Now: JavaScript
• A structured way to manage inconsistency

Biology is prototype-based
• Every feature of a biological class started out as an exception to a general case!
• aka mutation
• Classes are Aristotelian
• Prototypes are Darwinian

The Problems
• Ontologies are plagued with inconsistencies (or compromise) because they are inevitably the product of different interests.
• Ontologies generally only try to capture the settled science
• Realism is vacuous and question-begging; if we knew at the start what was real we wouldn't need to do science
• Knowledge construction is social, tentative, situated, multi-viewpoint, and only objective at its endpoints.

The Solutions
• Tools that support how science is actually done, at web scale and with greater visibility and traceability
• A pragmatic view of scientific representation
  – that lets scientists work bottom-up from their results
  – that foregrounds the concrete relations between representation and reality (circulating reference)
  – that connects science in progress with settled science, supporting and preserving controversy, unsettledness, and argument structure
• More simply: integrate data and knowledge and the processes that connect them.
• Open Science: institutions, standards, practices.
• A representational infrastructure that supports prototypes, default reasoning, and exceptions.

Thank you!
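
As a small illustration of the prototype idea (defaults that hold unless a more specific frame says otherwise), here is a minimal sketch. It assumes a toy `Frame` class invented for this example; it is not the frame system underlying Pathway Tools, nor Framer or SELF. It replays the "all cats have hair, except for this one" case that is awkward to express in Description Logics.

```python
class Frame:
    """A prototype-style frame: local slots override inherited defaults."""

    def __init__(self, name, prototype=None, **slots):
        self.name = name
        self.prototype = prototype  # frame to inherit defaults from, or None
        self.slots = dict(slots)    # values stated directly on this frame

    def get(self, slot):
        """Return the nearest value for slot, walking up the prototype chain."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.prototype
        raise KeyError(slot)

    def is_exception(self, slot):
        """True if this frame overrides an inherited value with a different one."""
        if slot not in self.slots or self.prototype is None:
            return False
        try:
            inherited = self.prototype.get(slot)
        except KeyError:
            return False
        return inherited != self.slots[slot]


# Hypothetical frames: the default lives on the prototype,
# and the exception is stated only where it applies.
cat = Frame("cat", has_hair=True, legs=4)
ordinary_cat = Frame("ordinary_cat", prototype=cat)
hairless_cat = Frame("hairless_cat", prototype=cat, has_hair=False)

print(ordinary_cat.get("has_hair"))
print(hairless_cat.get("has_hair"))
print(hairless_cat.is_exception("has_hair"))
```

The exception does not contradict the default; it simply shadows it locally, and `is_exception` keeps the divergence visible instead of hiding it. That is the sense in which prototypes offer a structured way to manage inconsistency.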