CS690L
Ontologies Interoperability (Integration, Mapping, Query)
Yugi Lee
STB #555
(816) 235-5932
[email protected]
www.sice.umkc.edu/~leeyu
CS690L - Lecture 4

Semantic Web Fabric
• Bootstrapping, Creation and Maintenance of Semantic Knowledge
  – Collaborative and Sociological Processes, Statistical Techniques
  – Ontology Building, Maintenance and Versioning Tools
• Re-use of Existing Semantic Knowledge (Ontologies)
• Annotation/Association/Extraction of Knowledge with/from Underlying Data
• Information Retrieval and Analysis (Distributed Querying, Search, Inference Middleware)
• Semantic Discovery and Composition of Services
• Distributed Computing/Communication Infrastructures
  – Component based technologies, Agent based systems, Web Services
• Repositories for managing data and semantic knowledge
  – Relational Databases, Content Management Systems, Knowledge Base Systems
[V. Kashyap, 2002]

What have DB researchers done?
• Semantic Data Models
• Multi-database Schema Heterogeneity
• Multi-database/Federated Database Schema Integration
• Schema Evolution
• Object Oriented/XML/Deductive Databases/Rule Based Systems
• Mediators and Wrappers
• Multidatabase/Federated Database Query Processing
• Data Mining
• Probabilistic Databases
• Workflow-based Coordination Systems
• Security in Database Systems
• Multimedia Databases
  – Text and Information Retrieval Systems
  – Image Databases
• DB Research is well positioned to contribute to the Semantic Web, but:
  – there has been little interest in issues related to Semantics in the DB community
  – the Semantic Web can be the underlying theme that ties in all the disparate pieces of work
[V. Kashyap, 2002]

What are the missing gaps?
• Ontology Integration/Interoperation
  – Problem is different from Schema Integration
  – Need to address “semantics” of relationships such as “synonyms”, “hyponyms”, etc.
• Ontology Impedance/Mismatch
  – Relax the requirements of consistency and completeness
  – Should be able to characterize the “information error/loss” that occurs
• Dynamic Ontologies
  – Need to relax the assumption of the “staticness” of database schemas
• Inferences based on Semantics of the Data
  – Has been relatively ignored by the DB community
[V. Kashyap, 2002]

What are the missing gaps?
• Semantics of Multimedia Data
  – Need to focus more on non-traditional data such as text, images, etc.
  – Need to focus on “annotation mechanisms” as an addition to wrappers/mediators
• Semantics of Processes/Plans/Workflows
• Performance/Scalability
  – A traditional strong point of DB research
• The next wave of research (esp. in the context of the Semantic Web) will focus on re-use of pre-existing data models/schemas/ontologies that describe the content of information sources…
[V. Kashyap, 2002]

Inter-ontological relationships
• Synonyms
  – lead to semantics-preserving translations
• Hyponyms/Hypernyms
  – lead to semantics-altering translations
  – typically result in loss of recall and precision
• List of Hyponyms
  – technical-manual hyponym manual
  – book hyponym book
  – proceedings hyponym book
  – thesis hyponym book
  – misc-publication hyponym book
  – technical-reports hyponym book
  – press hyponym periodical-publication
  – periodical hyponym periodical-publication
[V. Kashyap, 2002]

Role of Ontologies
• Content explication
  Ontologies are used for the explicit description of the information source
  Approaches:
  – Single ontology
  – Multiple ontology
  – Hybrid ontology
• Query model
• Verification (query containment)
[H. Wache, 2002]

Single Ontology Approach
• SIMS
• One global ontology
• Hierarchical terminological database
• Combination of several specialized ontologies (for modularization)
• Can be used when all information sources to be integrated provide nearly the same view on a domain
• Minimal ontology commitment
• Susceptible to changes in the information sources
[H. Wache, 2002]

Multiple Ontologies
• OBSERVER
• Each information source is described by its own ontology (source ontology)
• No shared vocabulary
• No common and minimal ontology commitment is needed
• Simplifies integration and supports changes in sources
• Difficult to compare different source ontologies
• Inter-ontology mapping is needed
[H. Wache, 2002]

Hybrid Ontology Approach
• COIN
• Semantics of each source is described by its own ontology
• Built from a global shared vocabulary
• Shared vocabulary contains basic terms of a domain
• New sources can easily be added
• Supports acquisition and evolution of ontologies
• Source ontologies are comparable because of shared vocabulary
• Existing ontologies cannot easily be reused, but have to be redeveloped from scratch
[H. Wache, 2002]

Query Model
• Integrated global view
• Global query schema
• User formulates query in terms of the ontology
• System reformulates queries in terms of subqueries for each source
• Structure of the query model should be more intuitive for the user
[H. Wache, 2002]

Mappings: Connecting to Information Sources
• Relate the ontologies to the actual content of an information source
• Approaches
  – Structure resemblance: produce a one-to-one copy of the structure of the database and encode it in a language that makes automated reasoning possible
  – Definition of terms: use the ontology to define terms from the database or the database schema
  – Structure enrichment (most common): a logical model is built that resembles the structure of the information source and contains additional definitions and concepts; can be done using description logics (DLs)
  – Meta-annotation: add semantic information to an information source (Ontobroker, SHOE)
[H. Wache, 2002]

Inter-Ontological Mapping
• Defined Mappings (KRAFT)
  – Special customized mediator agents
  – Great flexibility
  – Fails to ensure preservation of semantics (no verification)
• Lexical Relations (OBSERVER)
  – Extend a common DL model by quantified inter-ontology relationships
  – Synonym, hypernym, overlap, covering, disjoint
  – Do not have formal semantics
[H. Wache, 2002]

Inter-Ontological Mapping
• Top-level grounding (DWQ)
  – Relate all ontologies used to a single top-level ontology
  – Inheriting concepts from a common top-level ontology
  – Can resolve conflicts and ambiguities
• Semantic correspondences
  – Rely on a common vocabulary
  – Use semantic labels in order to compute correspondences
  – Subsumption reasoning can be used to establish relations between different terminologies
[H. Wache, 2002]

Conclusions
• Data Models/Schemas/Ontologies will form the critical infrastructure for the Semantic Web
• Re-use of pre-existing data models/schemas/ontologies is crucial in describing the semantics of various information sources
• There is a need to relax consistency and completeness requirements and estimate the “error” in the results returned.
• Semantics of information should be used to minimize “error” in the information obtained
• The new environment is likely to be more “dynamic” in nature
  – schemas, workflows, queries, etc. can no longer be assumed to be static…
• DB research is well positioned to participate in the Semantic Web if it “adapts” to these new requirements…

References
• Vipul Kashyap, The Semantic Web: Has the DB Community Missed the Bus (again?), NSF Workshop on DB & IS Research for Semantic Web and Enterprises, April 3, 2002
• H. Wache, T. Vogele, U. Visser, H. Stuckenschmidt, G. Schuster, H. Neumann and S. Hubner, Ontology-Based Integration of Information: A Survey of Existing Approaches
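
Sketch: translating a query term across ontologies
The inter-ontological relationships slide says that synonyms give semantics-preserving translations while hyponyms/hypernyms give semantics-altering ones. The Python sketch below illustrates that idea only; it is not taken from OBSERVER or any other system named in the lecture. The hyponym pairs are a subset of the list on the slide; the synonym pair, the `Translation` type and the `translate` function are hypothetical names introduced for illustration. It assumes that replacing a term by a hypernym broadens the query (possible loss of precision) and replacing it by its known hyponyms narrows it (possible loss of recall).

```python
from typing import List, NamedTuple

# (term_a, relation, term_b): "term_a hyponym term_b" means term_a is narrower.
# Hyponym pairs are a subset of the slide's list; the synonym pair is an
# illustrative assumption, not from the lecture.
RELATIONS = [
    ("technical-manual", "hyponym", "manual"),
    ("proceedings", "hyponym", "book"),
    ("thesis", "hyponym", "book"),
    ("press", "hyponym", "periodical-publication"),
    ("periodical", "hyponym", "periodical-publication"),
    ("document", "synonym", "publication"),  # assumed
]


class Translation(NamedTuple):
    terms: List[str]  # replacement terms, sorted alphabetically
    kind: str         # "synonym", "hypernym", "hyponyms" or "none"


def translate(term: str, relations=RELATIONS) -> Translation:
    """Pick a replacement for `term`, preferring the least lossy relation."""
    # 1. Synonyms (either direction): semantics-preserving.
    synonyms = sorted(
        {b for a, rel, b in relations if rel == "synonym" and a == term}
        | {a for a, rel, b in relations if rel == "synonym" and b == term}
    )
    if synonyms:
        return Translation(synonyms, "synonym")
    # 2. Hypernym: broader term, so extra results may lower precision.
    hypernyms = sorted({b for a, rel, b in relations if rel == "hyponym" and a == term})
    if hypernyms:
        return Translation(hypernyms, "hypernym")
    # 3. Union of hyponyms: narrower terms, so missed results may lower recall.
    hyponyms = sorted({a for a, rel, b in relations if rel == "hyponym" and b == term})
    if hyponyms:
        return Translation(hyponyms, "hyponyms")
    return Translation([], "none")


if __name__ == "__main__":
    for query_term in ("document", "thesis", "book", "unknown"):
        print(query_term, "->", translate(query_term))
```

The fixed preference order (synonym, then hypernym, then hyponyms) is a simplification; a system that characterizes the “information error/loss” would instead compare the estimated loss of each candidate translation before choosing.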