TU/e
eindhoven university of technology
Exporting Databases in XML DTD
A Conceptual and Generic Approach
Philippe Thiran
Computer Science Department
Technische Universiteit Eindhoven
The Netherlands
/faculty of mathematics and informatics
Exporting Databases in XML
• Current Situation
– XML as the standard for publishing and exchanging data over the Web
– Data recorded and maintained in existing Databases
• Heterogeneous databases: different data models
• Limitation of database models
– Database schema incompleteness (implicit/hidden structures)
– Explicit and implicit interconnections among entities
Oracle V5 Model (no primary and foreign keys)
Order (OrderID, Customer, Date, Total[0-1])
Detail (OrderID, Reference, Quantity, Amount)
Product (Reference, Label[0-1], UnitPrice, Supplier)
The same schema with its identifiers (id) and references (ref) shown
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID
Detail (OrderID, Reference, Quantity, Amount); id: Reference, OrderID; ref: Reference; ref: OrderID
Product (Reference, Label[0-1], UnitPrice, Supplier[1-5]); id: Reference
Exporting Databases in XML
• Migrating existing databases to XML
– Principle
• XML description in DTD
• Bottom-up Approach
• Exploiting as much as possible the meaning of source data
– Method and Tool
• Method
– Not limited to any specific database model
– Capturing the explicit and implicit structures and interconnections of the database schema
• Tool for supporting the method
Exporting Databases in XML
Schema Representation
Database models and DTD
Schema Manipulation
Database schemas and DTD
Exporting Databases in XML
• Schema Representation
– Expressing database schemas and XML in terms of GER
• Extended object-entity relationship data model
• One rich and expressive model able to express data schemas whatever their operational data models
– Operational database models like IMS, Relational, OO
– XML-family models: XML DTD or XML Schema
Exporting Databases in XML
• Schema Representation
– Expressing XML in terms of GER
• DTD expressed in terms of GER
– DTD concepts
– Hierarchical organization
– Sequence organization
DTD Concepts | GER Interpretation
Element types | Entity types
Hierarchy of element types | (root) entity types, relationship types, father roles
Content type ELEMENT | Relationship types
Sequence organization (order of elements in the sequence) | Seq groups
Occurrence operators on sub-elements ?, *, + | Role cardinalities
IDREF, GID attributes | IDREF, GID groups
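The correspondence between occurrence operators and role cardinalities can be sketched in a few lines of Python. This is only an illustration: the mapping (no operator gives 1-1, ? gives 0-1, * gives 0-N, + gives 1-N) is read off the Catalog example used in this deck, and the function names are hypothetical, not part of DB-MAIN. Only flat sequences without nested groups or choices are handled.

```python
# Assumed mapping, read off the Catalog example:
# no operator -> 1-1, ? -> 0-1, * -> 0-N, + -> 1-N.
OCCURRENCE_TO_CARDINALITY = {"": "1-1", "?": "0-1", "*": "0-N", "+": "1-N"}


def father_role_cardinality(particle: str) -> tuple[str, str]:
    """Split a DTD content particle such as 'Detail+' into (child, cardinality)."""
    particle = particle.strip()
    operator = particle[-1] if particle[-1:] in ("?", "*", "+") else ""
    name = particle[: len(particle) - len(operator)]
    return name, OCCURRENCE_TO_CARDINALITY[operator]


def sequence_roles(content_model: str) -> list[tuple[str, str]]:
    """Interpret a flat sequence model like '(Customer, Date, Total?, Detail+)'."""
    inner = content_model.strip().strip("()")
    return [father_role_cardinality(p) for p in inner.split(",") if p.strip()]


roles = sequence_roles("(Customer, Date, Total?, Detail+)")
print(roles)
```

The list order is kept on purpose, since it is what a seq group records.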
Exporting Databases in XML
• Schema Representation
– Expressing XML in terms of GER
<!ELEMENT Catalog (Order*, Product*)>
<!ELEMENT Order (Customer, Date, Total?, Detail+)>
<!ATTLIST Order OrderID ID #REQUIRED>
<!ELEMENT Customer ANY>
<!ELEMENT Date (#PCDATA)>
<!ELEMENT Total (#PCDATA)>
<!ELEMENT Detail (Quantity, Amount)>
<!ATTLIST Detail Product IDREF #REQUIRED>
<!ELEMENT Quantity (#PCDATA)>
<!ELEMENT Amount (#PCDATA)>
<!ELEMENT Product (Supplier+)>
<!ATTLIST Product
Reference ID #REQUIRED
Label CDATA #IMPLIED
UnitPrice CDATA #REQUIRED>
<!ELEMENT Supplier ANY>
GER interpretation of the DTD (f marks a father role, followed by its cardinality and child)
Catalog; seq: .Order[*], .Product[*]; f 0-N Order, f 0-N Product
Order (OrderID); gid: OrderID; seq: .Customer, .Date, .Total, .Detail[*]; f 1-1 Customer, f 1-1 Date, f 0-1 Total, f 1-N Detail
Customer #any
Date #pcdata
Total #pcdata
Detail (Product); idref: Product; seq: .Quantity, .Amount; f 1-1 Quantity, f 1-1 Amount
Quantity #pcdata
Amount #pcdata
Product (Reference, Label[0-1], UnitPrice); gid: Reference; seq: .Supplier[*]; f 1-N Supplier
Supplier #any
Exporting Databases in XML
• Schema Manipulation
– Transforming XML DTD within GER
• Schema transformations defined on GER
– Reverse transformations, semantics-preserving transformations
– Transformation operators
• Standard transformations
– For manipulating schemas expressed in operational database models
– Example: transforming an entity type into an attribute
• DTD-specific transformations
Exporting Databases in XML
• Schema Manipulation
– Transforming XML DTD within GER
• Standard transformations
– For manipulating schemas expressed in classical structured models
– Example of a semantics-preserving transformation: transforming a relationship type into an entity type
RT-ET: Transforming a relationship type into an entity type. Inverse: ET-RT
[Figure: before, relationship type R links A (A1) and B (B1), both with 0-N roles. After, R is an entity type with id: rB.B, rA.A, linked to A through rA (A: 0-N, R: 1-1) and to B through rB (B: 0-N, R: 1-1).]
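A minimal Python sketch of the RT-ET operator, assuming a deliberately simplified schema representation. The classes RelType and EntityType and the function rt_to_et are hypothetical names for illustration, not the DB-MAIN API. Roles are keyed by entity type name, so recursive relationship types are out of scope.

```python
from dataclasses import dataclass, field


@dataclass
class RelType:
    name: str
    roles: dict[str, str]  # entity type name -> cardinality of its role
    attributes: list[str] = field(default_factory=list)


@dataclass
class EntityType:
    name: str
    attributes: list[str] = field(default_factory=list)
    identifier: list[str] = field(default_factory=list)


def rt_to_et(rel: RelType) -> tuple[EntityType, list[RelType]]:
    """RT-ET: replace relationship type `rel` by an entity type of the same name.

    Each original role becomes a new binary relationship type r<Entity> in which
    the original entity keeps its cardinality and the new entity plays a 1-1 role.
    """
    links = [
        RelType(f"r{entity}", {entity: card, rel.name: "1-1"})
        for entity, card in rel.roles.items()
    ]
    # The new entity type is identified by the entities it connects.
    identifier = [f"{link.name}.{entity}" for link, entity in zip(links, rel.roles)]
    return EntityType(rel.name, list(rel.attributes), identifier), links


entity, links = rt_to_et(RelType("R", {"A": "0-N", "B": "0-N"}))
print(entity)
print(links)
```

Because the original cardinalities and the identifier are kept, the inverse ET-RT could rebuild R from the two links, which is what makes the transformation semantics-preserving.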
Exporting Databases in XML
• Schema Manipulation
– Transforming XML DTD within GER
• DTD-specific transformations (example)
– Suited to derive a DTD from a structured data schema
DTD-RT-to-HIER: Transforms a one-to-many (or one-to-one) binary relationship type into a hierarchical relation. The 1-1 role becomes the child role. Inverse: DTD-HIER-to-RT
[Figure: relationship type R between A (0-N) and B (1-1) becomes a hierarchical relation in which A plays the father role f (0-N) and B the child role (1-1).]
Create-SEQ-GROUP: Adds a seq group to an entity type. That group contains the child roles played by its children (in an arbitrary order). Inverse: Del-SEQ-GROUP
[Figure: an entity type that is father in R1 (f 0-N, child A) and in R2 (f 0-1, child B) receives the group seq: R1.A[*], R2.B.]
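The two DTD-specific operators can be sketched as follows, again as a simplified illustration rather than the tool's implementation. A relationship type is given as a plain dict from entity type name to role cardinality. The function names, the dict keys and the father name P are assumptions made for this example.

```python
def rt_to_hier(name: str, roles: dict[str, str]) -> dict[str, str]:
    """DTD-RT-to-HIER on a binary relationship type given as {entity: cardinality}."""
    if len(roles) != 2:
        raise ValueError("only binary relationship types can become hierarchical")
    children = [entity for entity, card in roles.items() if card == "1-1"]
    if not children:
        raise ValueError("one role must be 1-1 to become the child role")
    child = children[0]  # for a one-to-one type this pick is a design choice
    father = next(entity for entity in roles if entity != child)
    return {"name": name, "father": father, "father_card": roles[father], "child": child}


def create_seq_group(father: str, hier_rels: list[dict[str, str]]) -> list[str]:
    """Create-SEQ-GROUP: list the child roles of `father`, in the order given."""
    group = []
    for rel in hier_rels:
        if rel["father"] == father:
            many = rel["father_card"].endswith("N")
            group.append(f"{rel['name']}.{rel['child']}" + ("[*]" if many else ""))
    return group


r1 = rt_to_hier("R1", {"P": "0-N", "A": "1-1"})  # P is a hypothetical father entity type
r2 = rt_to_hier("R2", {"P": "0-1", "B": "1-1"})
seq = create_seq_group("P", [r1, r2])
print(seq)
```

The order of the seq group is simply the order in which the hierarchical relations are passed, which mirrors the arbitrary initial order mentioned above. A later step can reorder it.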
Exporting Databases in XML
Converting (legacy) databases into DTD
Exploiting as much as possible the meaning of source data
Capturing the explicit and implicit structures and interconnections
Exporting Databases in XML
• Exporting Databases
– Bottom-up approach (from the source to the target)
– Semi-automated 4-step method
• Extraction of the database schema (automated)
– Extraction of the explicit structures and constraints
• Semantics recovery (semi-automated)
– Recovery of the implicit structures and constraints
• Model translation (semi-automated)
– Translation of a schema expressed in the GER into a schema expressed in the GER DTD
– Use of the relations among entities
• DTD export (automated)
– Generation of the DTD document
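The last, automated step can be illustrated with a small Python sketch that prints DTD declarations from a DTD-compliant schema. The input format (lists of tuples) and the helper names element_decl and attlist_decl are assumptions made for this example. The actual generator in the tool is not described in these slides.

```python
# Father-role cardinality -> DTD occurrence operator.
SUFFIX = {"1-1": "", "0-1": "?", "0-N": "*", "1-N": "+"}


def element_decl(name: str, children: list[tuple[str, str]], content: str = "ANY") -> str:
    """Return one <!ELEMENT ...> declaration.

    `children` is the seq group as (child name, father-role cardinality) pairs;
    a leaf element uses `content` instead, e.g. '(#PCDATA)' or 'ANY'.
    """
    if not children:
        return f"<!ELEMENT {name} {content}>"
    model = ", ".join(child + SUFFIX[card] for child, card in children)
    return f"<!ELEMENT {name} ({model})>"


def attlist_decl(name: str, attributes: list[tuple[str, str, str]]) -> str:
    """Return one <!ATTLIST ...> declaration from (name, type, default) triples."""
    body = " ".join(f"{attr} {kind} {default}" for attr, kind, default in attributes)
    return f"<!ATTLIST {name} {body}>"


lines = [
    element_decl("Order", [("Customer", "1-1"), ("Date", "1-1"),
                           ("Total", "0-1"), ("Detail", "1-N")]),
    attlist_decl("Order", [("OrderID", "ID", "#REQUIRED")]),
    element_decl("Date", [], "(#PCDATA)"),
]
print("\n".join(lines))
```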
Exporting Databases in XML
• Exporting XML
– Reverse Engineering
• Recovery of the conceptual schema of an existing database
– Augmentation of the knowledge about the data semantics
– Elicitation of hidden structures and constraints
– Database reverse engineering process (DB-MAIN)
Database Schema (FileCatalog: Product, Order, Detail)
Order (OrderID, Customer, Date, Total[0-1]); acc: OrderID
Detail (OrderID, Reference, Quantity, Amount); acc: Reference, OrderID
Product (Reference, Label[0-1], UnitPrice, Supplier); acc: Reference
Schema transformations lead to the Conceptual Schema
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID
Detail (Quantity, Amount), between Order (1-N) and Product (0-N)
Product (Reference, Label[0-1], UnitPrice, Supplier[1-5]); id: Reference
Exporting Databases in XML
• Exporting XML
– Model Translation
• DTD-specific transformation
• Non-deterministic process
– It requires some design choices
– The user inputs might have consequences on the properties and the semantics of the resulting schema
• 5-step transformation process
– Schema preparation
– Hierarchy structure creation
– Constraint relaxation
– Attribute representation
– Ordering definition
Exporting Databases in XML
• Exporting XML
– Model Translation
• Schema preparation
– Removing invalid constructs
» Multivalued/compound attributes
» Complex relationship types
Conceptual Schema
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID
Detail (Quantity, Amount), between Order (1-N) and Product (0-N)
Product (Reference, Label[0-1], UnitPrice, Supplier[1-5]); id: Reference
Schema after preparation
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID
Detail (Quantity, Amount); id: of.Product, consists.Order
Product (Reference, Label[0-1], UnitPrice); id: Reference
Supplier (Supplier); id: supplied.Product, Supplier
consists: Order (1-N), Detail (1-1)
of: Product (0-N), Detail (1-1)
supplied: Product (1-5), Supplier (1-1)
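Removing a multivalued attribute can be sketched like this in Python. The attribute notation Name[min-max] follows the slides. The function split_multivalued, its dict-based output and the fallback naming rule has_<Name> are assumptions made for this illustration. Compound attributes and complex relationship types are not covered.

```python
import re

# Matches the slide notation 'Name[min-max]', e.g. 'Supplier[1-5]' or 'Label[0-1]'.
MULTI = re.compile(r"^(\w+)\[(\d+)-(\d+|N)\]$")


def split_multivalued(entity: str, attributes: list[str], rel_names: dict[str, str]):
    """Turn each attribute whose maximum cardinality exceeds 1 into an entity type."""
    kept, new_entities, new_rels = [], [], []
    for attr in attributes:
        match = MULTI.match(attr)
        if match and match.group(3) not in ("0", "1"):
            name, low, high = match.groups()
            rel = rel_names.get(name, f"has_{name}")  # hypothetical fallback name
            new_entities.append(
                {"name": name, "attributes": [name], "id": [f"{rel}.{entity}", name]}
            )
            new_rels.append({"name": rel, "roles": {entity: f"{low}-{high}", name: "1-1"}})
        else:
            kept.append(attr)  # single-valued, possibly optional: stays in place
    return kept, new_entities, new_rels


kept, entities, rels = split_multivalued(
    "Product",
    ["Reference", "Label[0-1]", "UnitPrice", "Supplier[1-5]"],
    {"Supplier": "supplied"},
)
print(kept, entities, rels)
```

With the Product entity type of the running example, this yields the Supplier entity type and the supplied relationship type (Product 1-5, Supplier 1-1) listed above.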
Exporting Databases in XML
• Exporting XML
– Model Translation
• Hierarchical structure creation
Entity types and relationship types are transformed into a tree
• by electing natural roots (significant concepts)
• by resolving father conflicts
• by breaking cycles
• by (possibly) adding a unique root
Schema before
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID
Detail (Quantity, Amount); id: of.Product, consists.Order
Product (Reference, Label[0-1], UnitPrice); id: Reference
Supplier (Supplier); id: supplied.Product, Supplier
consists: Order (1-N), Detail (1-1)
of: Product (0-N), Detail (1-1)
supplied: Product (1-5), Supplier (1-1)
Schema after
Catalog; f 0-N Order, f 0-N Product
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID; f 1-N Detail
Detail (Reference, Quantity, Amount); id: Reference, .f; ref: Reference
Product (Reference, Label[0-1], UnitPrice); id: Reference; f 1-5 Supplier
Supplier (Supplier); id: .f, Supplier
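The tree-building rules above can be approximated by the following Python sketch. It is one possible strategy under stated assumptions, not the method's actual algorithm: candidate links are (father, child) pairs taken from the one-to-many relationship types, the designer's choice is passed in as `preferred`, and a link that loses a father conflict or would close a cycle is kept as a plain reference. All names are hypothetical.

```python
def creates_cycle(father_of: dict[str, str], father: str, child: str) -> bool:
    """True if making `father` the father of `child` would close a loop."""
    node = father
    while node is not None:
        if node == child:
            return True
        node = father_of.get(node)
    return False


def build_hierarchy(entities, candidates, preferred=None, root="Catalog"):
    """Return (father_of, references) for a list of (father, child) candidates."""
    preferred = preferred or {}
    father_of: dict[str, str] = {}
    references: list[tuple[str, str]] = []
    for father, child in candidates:
        wanted = preferred.get(child, father)
        if child in father_of or father != wanted or creates_cycle(father_of, father, child):
            references.append((child, father))  # resolved as a reference, not nesting
        else:
            father_of[child] = father
    roots = [entity for entity in entities if entity not in father_of]
    if len(roots) > 1:  # several natural roots: add a unique root above them
        for entity in roots:
            father_of[entity] = root
    return father_of, references


father_of, references = build_hierarchy(
    ["Order", "Detail", "Product", "Supplier"],
    [("Order", "Detail"), ("Product", "Detail"), ("Product", "Supplier")],
    preferred={"Detail": "Order"},
)
print(father_of, references)
```

On the running example, Detail is nested under Order, its link to Product survives as a reference, and Order and Product are placed under the added root Catalog.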
Exporting Databases in XML
• Exporting XML
– Model Translation
• Constraint relaxation
– Role cardinalities extension
– Gid and idref groups creation
Schema before
Catalog; f 0-N Order, f 0-N Product
Order (OrderID, Customer, Date, Total[0-1]); id: OrderID; f 1-N Detail
Detail (Reference, Quantity, Amount); id: Reference, .f; ref: Reference
Product (Reference, Label[0-1], UnitPrice); id: Reference; f 1-5 Supplier
Supplier (Supplier); id: .f, Supplier
Schema after
Catalog; f 0-N Order, f 0-N Product
Order (OrderID, Customer, Date, Total[0-1]); gid: OrderID; f 1-N Detail
Detail (Reference, Quantity, Amount); id: Reference, .f; idref: Reference
Product (Reference, Label[0-1], UnitPrice); gid: Reference; f 1-N Supplier
Supplier (Supplier); gid: .f, Supplier
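Role cardinality extension can be sketched in Python as below, assuming that only the four cardinalities a DTD occurrence operator can express (0-1, 1-1, 0-N, 1-N) are allowed in the target. The function name and the boolean flag reporting a weakened constraint are illustrative choices, not taken from the slides.

```python
def relax_cardinality(card: str) -> tuple[str, bool]:
    """Extend a role cardinality to the nearest one a DTD can express.

    Returns the relaxed cardinality and whether the original constraint was
    weakened, so that the lost constraint can be reported to the designer.
    """
    low, high = card.split("-")
    relaxed = ("0" if low == "0" else "1") + "-" + ("1" if high == "1" else "N")
    return relaxed, relaxed != card


print(relax_cardinality("1-5"))
print(relax_cardinality("0-N"))
```

For the father role of Product towards Supplier, 1-5 becomes 1-N and the upper bound of five suppliers is no longer enforced by the DTD.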
Exporting Databases in XML
• Exporting XML
– Model Translation
• Attribute representation
• Ordering definition
Resulting schema (f marks a father role, followed by its cardinality and child)
Catalog; seq: .Order[*], .Product[*]; f 0-N Order, f 0-N Product
Order (OrderID); gid: OrderID; seq: .Customer, .Date, .Total, .Detail[*]; f 1-1 Customer, f 1-1 Date, f 0-1 Total, f 1-N Detail
Customer #any
Date #pcdata
Total #pcdata
Detail (Product); idref: Product; seq: .Quantity, .Amount; f 1-1 Quantity, f 1-1 Amount
Quantity #pcdata
Amount #pcdata
Product (Reference, Label[0-1], UnitPrice); gid: Reference; seq: .Supplier[*]; f 1-N Supplier
Supplier #any
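Attribute representation is a design choice: each GER attribute either stays an XML attribute or becomes a child element. A minimal Python sketch, assuming the designer supplies the set of attributes to turn into elements; the function to_subelements and its return format are hypothetical, and the rule that optional attributes get #IMPLIED while mandatory ones get #REQUIRED is read off the Product declaration in the example DTD.

```python
import re

# 'Name' or 'Name[min-max]' as written in the schemas of this deck.
ATTR = re.compile(r"^(\w+)(?:\[(\d+)-(\d+|N)\])?$")


def to_subelements(attributes: list[str], as_elements: set[str]):
    """Split GER attributes into XML attributes and child elements."""
    xml_attributes, children = [], []
    for attr in attributes:
        match = ATTR.match(attr)
        if match is None:
            raise ValueError(f"unrecognised attribute notation: {attr}")
        name, low, high = match.groups()
        card = f"{low}-{high}" if low is not None else "1-1"
        if name in as_elements:
            children.append((name, card))  # father role f with this cardinality
        else:
            default = "#IMPLIED" if card.startswith("0") else "#REQUIRED"
            xml_attributes.append((name, default))
    return xml_attributes, children


order_attrs, order_children = to_subelements(
    ["OrderID", "Customer", "Date", "Total[0-1]"], {"Customer", "Date", "Total"}
)
product_attrs, product_children = to_subelements(
    ["Reference", "Label[0-1]", "UnitPrice"], set()
)
print(order_attrs, order_children)
print(product_attrs, product_children)
```

The order of the returned children is the order in which a seq group would list them, so the ordering definition step amounts to permuting that list.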
Exporting Databases in XML
CASE Support – DB-MAIN
Model Expression
Database models and DTD
Model Translation
DTD-specific transformation
Exporting Databases in XML
• CASE Support – DB-MAIN
– Basic Features
• Dedicated to database application engineering
• Based on the GER
• Includes transformation operators, reverse engineering processors and schema analysis tools
• Extraction facilities (SQL, Codasyl, RPG, IMS, etc.)
Exporting Databases in XML
• CASE Support
– *-to-DTD Transformation
• DTD-Specific transformations
• Assistant
Exporting Databases in XML
• Conclusions
– Rich and expressive data model
• Translating semantics of both database and XML models
– Non-deterministic aspect of the model translation
• The same database schema can lead to a large set of equivalent XML structures
– CASE Support (application)
• Automatic production of XML documents
– that comply with the DTD that has been computed
– based on the schema transformations used to convert the database schema into an XML DTD
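As an illustration of what producing such documents could look like, here is a Python sketch that builds one Order element following the content model of the example DTD (Customer, Date, optional Total, one or more Detail). The input dictionary layout and all sample values are invented for the example. The standard library module xml.etree.ElementTree used here does not validate against a DTD, so compliance rests on following the content model by hand.

```python
import xml.etree.ElementTree as ET


def order_element(order: dict) -> ET.Element:
    """Build an <Order> element in the child order required by the example DTD."""
    elem = ET.Element("Order", OrderID=order["id"])
    ET.SubElement(elem, "Customer").text = order["customer"]
    ET.SubElement(elem, "Date").text = order["date"]
    if order.get("total") is not None:  # Total? is optional
        ET.SubElement(elem, "Total").text = str(order["total"])
    for line in order["details"]:  # Detail+ needs at least one entry
        detail = ET.SubElement(elem, "Detail", Product=line["product"])
        ET.SubElement(detail, "Quantity").text = str(line["quantity"])
        ET.SubElement(detail, "Amount").text = str(line["amount"])
    return elem


# Invented sample data; ID and IDREF values must be valid XML names.
sample = {
    "id": "o1",
    "customer": "ACME",
    "date": "2001-05-02",
    "details": [{"product": "p1", "quantity": 2, "amount": 30}],
}
elem = order_element(sample)
print(ET.tostring(elem, encoding="unicode"))
```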