A Knowledge-Based Approach to Querying Heterogeneous Spatial Databases
Andrea Rodríguez
Universidad de Concepción - Chile
Department of Information Engineering and Computer Science
{andrea,mvaras}@udec.cl
Outline
• Problem
• General Approach
• Ontology-Based Similarity
• DB Schema Similarity
• Conclusions and Future Work
Heterogeneous databases
• Syntactic level (data types and formats)
• Schematic level (schematic integration, query languages, and interfaces)
• Semantic level
Heterogeneous databases
[Diagram labels: Entity Classes; Semantic Relations; Role; Structure; Instances or occurrences; Geometry; Attributes; Spatial Relations]
Heterogeneous databases
• Syntactic level (data types and formats)
• Schematic level (schematic integration, query languages, and interfaces)
• Semantic level
  • Abstraction of irrelevant information
  • Independence of data representation
  • Rich semantic description
Ontology-Based Approach
• Single Ontology
• Mapping entities in the database onto concepts in the ontology
• Mapping queries onto ontological definitions
[Diagram labels: User query; Single Ontology; Database]
Issues
• Different conceptualizations imply different ontologies
• A single ontology forces commitments and limits updates
• There exist well-defined ontologies for specific domains
[Diagram labels: Stadium; Athletic field]
Issues of Multiple Ontologies
• Ontology mismatches (polysemy, synonymy, overlapping)
• Differences in the level of explicitness and formalization
General Approach
• By using a user ontology, we allow users to express queries in their own terms.
• We expand the query to extract not only equivalent but also similar concepts.
• Databases have no ontological descriptions of their stored entities, so we cannot compare different databases at the ontological level.
General Approach
Query = {C_1, ..., C_n}
• A query is expressed in terms of a user ontology
[Diagram label: Ontology]
General Approach
Query = {C_1, ..., C_n}
Similarity_1
Query' = {C_1, ..., C_m}
• A query is expressed in terms of a user ontology
• A similarity measure expands this query with similar concepts of this ontology
[Diagram label: Ontology]
General Approach
Query = {C_1, ..., C_n}
Similarity_1
Query' = {C_1, ..., C_m}
Mapping
Query'' = {E^q_1, ..., E^q_m}
• A query is expressed in terms of a user ontology
• A similarity measure expands this query with similar concepts of this ontology
• The expanded query is mapped onto a database schema, the query schema
[Diagram label: Database_a]
General Approach
Query'' = {E^q_1, ..., E^q_m}
Similarity_2
Solution_a = {E^a_1, ..., E^a_p}
• The query schema is compared with each database schema
• Ranking based on the degree of similarity
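The expansion and ranking steps of the general approach can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the similarity threshold of 0.5, the toy similarity values, and the database names are all assumptions made for the example and do not come from the slides. The mapping step (ontology concepts to query schema) is left out here.

```python
def expand_query(query, concepts, similarity, threshold=0.5):
    """Return the query plus every ontology concept similar to a query concept.

    The threshold is an assumption of this sketch; the slides do not state one.
    """
    expanded = list(query)
    for concept in concepts:
        if concept in expanded:
            continue
        if any(similarity(q, concept) >= threshold for q in query):
            expanded.append(concept)
    return expanded


def rank_databases(schema_scores):
    """Order database names by decreasing similarity to the query schema."""
    return sorted(schema_scores, key=schema_scores.get, reverse=True)


# Hypothetical similarity values between concepts of a toy user ontology.
TOY_SIMILARITY = {
    ("stadium", "athletic_field"): 0.7,
    ("stadium", "hospital"): 0.1,
}


def toy_similarity(c1, c2):
    """Look up a hypothetical similarity value, defaulting to 0.0."""
    return TOY_SIMILARITY.get((c1, c2), 0.0)


expanded = expand_query(
    ["stadium"], ["stadium", "athletic_field", "hospital"], toy_similarity
)
ranking = rank_databases({"db_a": 0.4, "db_b": 0.9})
print(expanded)
print(ranking)
```

Passing the similarity measure in as a function keeps the sketch independent of how Similarity_1 is actually computed.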
User Ontology: Entity Classes
Semantic relations
  Terminological relation
    Synonymy (building-edifice)
  Inclusion relations
    Is-a (Class) (hospital-building)
    Meronymy
      Part-of (room-building)
      Whole-of (stadium-athletic_field)
Distinguishing features (Stadium)
  Parts (athletic_field, stands)
  Functions (play, practice)
  Attributes (administration, sports)
User Ontology: Entity Classes
[Diagram labels: WordNet; SDTS; Entity Class Definition; Nouns (entity types); Spatial Concepts; Synonyms; Parts; Attributes; IS-A relation; Partial definition of semantic relation IS-A; Part-Whole Relations]
User Ontology: Similarity Assessment
\[
S_t(c_1, c_2) = \frac{|C_1 \cap C_2|}{|C_1 \cap C_2| + \alpha(c_1, c_2)\,|C_1 - C_2| + (1 - \alpha(c_1, c_2))\,|C_2 - C_1|}
\]
where
c_1, c_2: entity classes
S_t: similarity function for distinguishing features of type t
C_1, C_2: distinguishing features of type t for entity classes c_1 and c_2
\alpha(): coefficient of asymmetry
| |: set cardinality
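As a minimal sketch, the similarity function S_t can be written directly over Python sets. The function name is hypothetical, the coefficient of asymmetry is passed in as a plain number rather than computed from the two classes, and returning 0.0 for two empty feature sets is an assumption of this sketch. The second feature set in the example is invented for illustration.

```python
def feature_similarity(features_1, features_2, alpha):
    """Compute S_t for one type of distinguishing feature.

    features_1, features_2: feature sets C1 and C2 of entity classes c1 and c2.
    alpha: coefficient of asymmetry, assumed here to lie in [0, 1].
    """
    c1, c2 = set(features_1), set(features_2)
    common = len(c1 & c2)
    denominator = common + alpha * len(c1 - c2) + (1 - alpha) * len(c2 - c1)
    if denominator == 0:
        return 0.0  # assumption: two empty feature sets are treated as dissimilar
    return common / denominator


stadium_parts = {"athletic_field", "stands"}  # parts of Stadium, from the slide
other_parts = {"athletic_field"}              # hypothetical second entity class

print(feature_similarity(stadium_parts, other_parts, alpha=0.5))
print(feature_similarity(other_parts, stadium_parts, alpha=1.0))
```

With alpha different from 0.5 the measure is asymmetric: swapping the two classes changes which set difference receives the larger weight.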
Semantic Similarity Evaluation
Database Schema: Relational Database
• Entities: names, attributes, primary key, and foreign keys
• Foreign keys (FK): the relations they belong to and refer to
• Attributes: names
From Ontology to DB Schema
Semantic Relation    Transformation
Is-A                 (1) An entity for each child
Part-Whole           (1) New relation; (2) Add foreign key
Whole-Of             (1) New relation; (2) Add foreign key
From Ontology to DB Schema
Entity_class {
  name: {heating_system}
  is_a: {utility}
  part_of: {}
  whole_of: {{pipeline, piping, pipe}}
}
Heating_system (FKutility, FKpipeline)
  Foreign_key: FKutility references Utility
  Foreign_key: FKpipeline references Pipeline
Pipeline (FKheating_system, FKplumbing_system)
  Foreign_key: FKheating_system references Heating_system
  Foreign_key: FKplumbing_system references Plumbing_system
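The heating_system example suggests a simple mechanical transformation, sketched below. The dictionary layout, the function name, and the rule that the first term of each whole_of synonym set names the referenced relation are assumptions of this sketch, chosen so that the output matches the Heating_system relation on the slide. The part_of field is ignored because the slide shows it empty.

```python
def to_relation(entity_class):
    """Derive a relation name and its foreign keys from an entity-class definition.

    Returns (relation_name, {foreign_key_name: referenced_relation}).
    """
    name = entity_class["name"][0]
    # is_a parents are assumed to be plain class names.
    targets = list(entity_class.get("is_a", []))
    # whole_of holds synonym sets; assume the first term names the relation.
    targets += [syn_set[0] for syn_set in entity_class.get("whole_of", [])]
    foreign_keys = {f"FK{target}": target.capitalize() for target in targets}
    return name.capitalize(), foreign_keys


heating_system = {
    "name": ["heating_system"],
    "is_a": ["utility"],
    "part_of": [],
    "whole_of": [["pipeline", "piping", "pipe"]],
}

relation, foreign_keys = to_relation(heating_system)
print(relation, foreign_keys)
```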
Comparing DB Schemas
• String matching over entity names, attribute domains, and foreign-key references.
• Matching of semantic neighborhood.
String Matching
\[
S_w(e_i^p, e_j^o) = \max_{t_j \in synSet_{e_j^o}} \left[ \frac{|t_{e_i^p} \cap t_j|}{|t_{e_i^p} \cap t_j| + |t_{e_i^p} - t_j| + |t_j - t_{e_i^p}|} \right]
\]
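A small sketch of the string-matching measure S_w follows. The slides do not say how a term is turned into a set, so this example assumes each name is split into lowercase words on underscores; the function names are hypothetical as well.

```python
def tokens(term):
    """Split a term such as 'heating_system' into a set of lowercase words.

    Word-level tokens are an assumption of this sketch.
    """
    return set(term.lower().split("_"))


def string_similarity(query_term, syn_set):
    """Return the best set-overlap ratio between a term and a synonym set."""
    t_query = tokens(query_term)
    best = 0.0
    for candidate in syn_set:
        t_j = tokens(candidate)
        common = len(t_query & t_j)
        denominator = common + len(t_query - t_j) + len(t_j - t_query)
        if denominator:
            best = max(best, common / denominator)
    return best


print(string_similarity("pipe", ["pipeline", "piping", "pipe"]))
print(string_similarity("heating_system", ["plumbing_system"]))
```

Taking the maximum over the synonym set means a single exact synonym is enough for a perfect score, whatever the other synonyms look like.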
Semantic Neighborhood
• Semantic Neighborhood ⇒ Similarity
• Semantic relations:
  • In the query, relations are represented by foreign keys
  • In the DB, relations are represented by foreign keys or by the domain value of an attribute
\[
S_n(e_i^p, e_j^o) = \frac{\sum_{ii=1}^{m} \max_{jj} S_w(FK^p_{ii,e_i}, FK^o_{jj,e_j}) + \sum_{ii=1}^{n+b} \max_{l,jj} S_w(D^p_{l,ii,e_i}, FK^o_{jj,e_j})}{n}
\]
Extending Comparisons to Instances
• Attributes: based on type of domains
• Spatial Relations: based on content measure
• Geometry: based on area, diagonal, and geometric type
Conclusions
• Semantic Similarity combines:
  • Matching of attributes
  • Semantic distance
• Comparing DB Schemas combines:
  • String matching of names, attributes, and foreign keys
  • Matching of semantic neighborhood
Future Work
• Attributes
• Experimental results with large databases
• Extending this approach to the domain of the WWW