Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database
Mo Yuanying and Ling Tok Wang
Contents
1. Main accomplishment
2. Related Works
3. ORA-SS
4. Storing Algorithm
5. Comparison with Related Works
6. Conclusion
Main Accomplishment

This study provides efficient and consistent storage for semistructured data by developing algorithms that map an XML document to the logical ORA-SS model and then to an object-relational data store.
Related Works
(1) The file system

Store each XML document as a separate operating system file and use a DOM or SAX parser whenever the document is accessed by a query.

Disadvantages
-- XML files in ASCII format must be parsed every time they are accessed, whether for browsing or querying.
-- With DOM, the entire parsed file must be memory-resident during query processing.
-- It is hard to build and maintain indices on documents stored this way.
-- Update operations are difficult to implement.
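The first two disadvantages can be seen in a minimal sketch using Python's standard xml.dom.minidom parser. The sample document and the helper names (query_text, demo) are hypothetical, chosen only for illustration.

```python
import os
import tempfile
from xml.dom import minidom

# Hypothetical sample document, used only for illustration.
SAMPLE = "<library><book><title>A</title></book><book><title>B</title></book></library>"

def query_text(path, tag):
    # Every call re-parses the file and builds the full DOM tree in memory.
    dom = minidom.parse(path)
    return [n.firstChild.data for n in dom.getElementsByTagName(tag) if n.firstChild]

def demo():
    # Write the sample to a temporary file so the sketch is self-contained.
    with tempfile.NamedTemporaryFile("w", suffix=".xml", delete=False) as f:
        f.write(SAMPLE)
        path = f.name
    try:
        first = query_text(path, "title")   # first full parse
        second = query_text(path, "title")  # same parsing work repeated
        return first, second
    finally:
        os.remove(path)
```

Two identical queries trigger two full parses, and no index survives between them.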
Related Works
(2) Using a relational DBMS

XML data is stored in relations, and the XML query language (for example, XQuery) is translated to SQL and executed by the underlying relational database system.

Approaches
-- The Edge Approach
-- The Attribute Approach
-- Universal Table
-- Normalized Universal Approach
-- STORED

Disadvantages
-- A great deal of redundancy
-- Difficult to search or update
-- Handling multi-valued attributes is expensive
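To make the flavor of these approaches concrete, here is a rough sketch of edge-style shredding, where every element of the XML tree becomes one row in a single table. The column layout (source_id, ordinal, label, target_id, value) and the function name shred_to_edges are assumptions for illustration; the slides do not give the exact schema.

```python
import xml.etree.ElementTree as ET
from itertools import count

def shred_to_edges(xml_text):
    """Turn an XML document into rows of one generic edge table."""
    root = ET.fromstring(xml_text)
    ids = count(1)   # generated node IDs; they carry no meaning of their own
    edges = []       # rows of (source_id, ordinal, label, target_id, value)

    def visit(elem, source_id, ordinal):
        node_id = next(ids)
        text = (elem.text or "").strip() or None
        edges.append((source_id, ordinal, elem.tag, node_id, text))
        for position, child in enumerate(elem, start=1):
            visit(child, node_id, position)

    visit(root, 0, 1)  # 0 stands for the document root's (virtual) parent
    return edges

rows = shred_to_edges(
    "<supplier><name>S1</name><part>P1</part><part>P2</part></supplier>"
)
```

Every element needs a generated node ID, and the table itself does not record which labels are object classes, relationship types, or attributes.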
Related Works
(3) Using a storage manager

The XML query is parsed, translated to a suitable operator tree representation, optimized, and then executed by an XML query engine.
-- Shore
-- B-tree

Disadvantage
-- Searching and updating are inconvenient.
Related Works
(4) Our approach
-- Store ORA-SS in nested relations

Problems in existing storage approaches
-- Flat files: querying or updating is time-consuming and difficult.
-- Relational DBMS: these approaches cannot capture the semantic information.

ORA-SS reflects the nested structure of semi-structured data and distinguishes between object classes, relationship types and attributes. It can specify the degree of n-ary relationship types and indicate whether an attribute belongs to a relationship type or to an object class. Such information is essential for designing an efficient and non-redundant storage organization for semi-structured data.

Nested relations also handle multi-valued attributes better.
ORA-SS


A semantically richer data model for semistructured data.

Three main concepts
-- Object class
-- Relationship type
-- Attribute
ORA-SS
Example: binary relationship type [figure]
ORA-SS
Example (cont.): ternary relationship type [figure]
ORA-SS
Example (cont.)

Other semi-structured data models cannot distinguish between binary and ternary relationship types.
ORA-SS



-- ORA-SS can specify the degree of n-ary relationship types.
-- ORA-SS can indicate whether an attribute belongs to a relationship type or to an object class.
-- Existing semi-structured data models cannot specify this information, even though it is essential for storage.
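The three concepts can be pictured as simple data structures. The sketch below is illustrative only: the class and field names are hypothetical, and the assumption that price belongs to a binary supplier-part relationship while qty belongs to a ternary project-supplier-part relationship is made for the example, not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Attribute:
    name: str
    multi_valued: bool = False

@dataclass
class ObjectClass:
    name: str
    identifier: str
    attributes: List[Attribute] = field(default_factory=list)

@dataclass
class RelationshipType:
    name: str
    participants: List[ObjectClass]   # object classes, in nesting order
    attributes: List[Attribute] = field(default_factory=list)

    @property
    def degree(self) -> int:
        # The degree is explicit: it is the number of participating classes.
        return len(self.participants)

project = ObjectClass("project", "J#")
supplier = ObjectClass("supplier", "S#")
part = ObjectClass("part", "P#")

# Assumed for illustration: price hangs off the binary relationship,
# qty hangs off the ternary one.
sp = RelationshipType("sp", [supplier, part], [Attribute("price")])
jsp = RelationshipType("jsp", [project, supplier, part], [Attribute("qty")])
```

Because each attribute is attached either to an object class or to a specific relationship type, a storage algorithm can tell where it belongs without guessing from the nesting alone.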
Storing Algorithm
ORA-SS to OR database

-- An object-relational database can handle multi-valued attributes efficiently.
-- Multi-valued attributes are treated as repeating groups in nested relations.
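A small sketch of what a repeating group buys, using plain Python structures in place of a real object-relational store. The student data and the flatten helper are hypothetical.

```python
def flatten(nested_rows, group):
    """Unnest one repeating group into flat rows, as a flat relation would need."""
    flat = []
    for row in nested_rows:
        single = {k: v for k, v in row.items() if k != group}
        for value in row[group]:
            flat.append({**single, group: value})
    return flat

# Nested relation: one row per student, hobby is a repeating group.
students = [{"sno": "S1", "name": "Ann", "hobby": ["chess", "golf", "tennis"]}]

# Flat equivalent: the single-valued attributes repeat once per hobby.
flat_students = flatten(students, "hobby")
```

The nested form keeps one row and stores the name once; the flat form needs three rows that all repeat it.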
Storing Algorithm
ORA-SS to OR database

Main rules
-- Each object class, together with its attributes, forms a nested relation in which multi-valued attributes appear as repeating groups (object relation).
-- Each relationship type (that is, the object classes involved in it), together with its attributes, forms a nested relation in which multi-valued attributes appear as repeating groups (relationship relation).
Storing Algorithm
(1) Object class translation algorithm

O1. The identifier and candidate keys of the object class become the primary key and candidate keys of the generated relation.
O2. Each single-valued attribute of the object class is a single-valued attribute of the generated relation.
O3. Composite attributes of the object class are represented directly: they are replaced by their components in the generated relation.
Storing Algorithm
Object class translation algorithm (cont.)

O4. Each multi-valued attribute of the object class forms a repeating group in the relation.
O5. Each reference is a foreign key in the relation.
O6. Each disjunctive attribute is treated as two attributes.
O7. For an ID dependency relationship type, the ID-dependent object class is translated by the same rules as a regular object class. The ID-dependent object class, together with its attributes, forms a nested relation within its parent object class.
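Rules O1 to O7 can be sketched as one recursive function over a dictionary description of an object class. This is a minimal sketch, not the authors' implementation: the input keys (identifier, candidate_keys, attributes, components, alternatives, multi_valued, references, id_dependents) and the output layout are assumptions made for the example.

```python
def translate_object_class(oc):
    """Return a nested-relation schema (a dict) for one object class description."""
    relation = {
        "name": oc["name"],
        "primary_key": oc["identifier"],                 # O1
        "candidate_keys": oc.get("candidate_keys", []),  # O1
        "columns": [oc["identifier"]],
        "repeating_groups": {},
        "foreign_keys": list(oc.get("references", [])),  # O5
        "nested": [],
    }
    for attr in oc.get("attributes", []):
        # O3: a composite attribute is replaced by its components.
        # O6: a disjunctive attribute becomes one attribute per alternative.
        parts = attr.get("components") or attr.get("alternatives") or [attr["name"]]
        if attr.get("multi_valued"):
            relation["repeating_groups"][attr["name"]] = parts  # O4
        else:
            relation["columns"].extend(parts)                   # O2
    for child in oc.get("id_dependents", []):
        # O7: an ID-dependent class is translated by the same rules
        # and nested inside its parent relation.
        relation["nested"].append(translate_object_class(child))
    return relation

# Hypothetical object class used only to exercise the rules.
student = {
    "name": "student",
    "identifier": "sno",
    "attributes": [
        {"name": "name"},
        {"name": "address", "components": ["street", "city"]},
        {"name": "hobby", "multi_valued": True},
    ],
    "references": ["dept_no"],
}
student_relation = translate_object_class(student)
```

For this input the single-valued columns are sno, name, street and city, while hobby becomes a repeating group and dept_no a foreign key.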
Storing Algorithm
Translation Example 1
Storing Algorithm
(2) Relationship type translation algorithm

R1. The identifiers of all object classes participating in the relationship type form single-valued attributes of the nested relation.
    -- The key of the relationship type can be determined from its participation constraints.
R2. Each single-valued attribute of the relationship type is a single-valued attribute of the generated relation.
Storing Algorithm
Relationship type translation algorithm (cont.)

R3. Composite attributes of the relationship type are represented directly: they are replaced by their components in the generated relation.
R4. Each multi-valued attribute of the relationship type forms a repeating group in the relation.
R5. A disjunctive relationship type is treated as two relationship types.
R6. There is no need to translate an ID dependency relationship type.
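Rules R1 to R6 can be sketched in the same style. The input keys (participant_identifiers, key, attributes, components, multi_valued, id_dependency) are hypothetical, and defaulting the key to all participant identifiers is an assumption that only fits a many-to-many participation constraint.

```python
def translate_relationship_type(rt):
    """Return a nested-relation schema (a dict) for one relationship type."""
    identifiers = list(rt["participant_identifiers"])
    relation = {
        "name": rt["name"],
        "columns": list(identifiers),                    # R1
        # Assumption: without an explicit key, use every participant identifier.
        "key": list(rt.get("key", identifiers)),
        "repeating_groups": {},
    }
    for attr in rt.get("attributes", []):
        parts = attr.get("components") or [attr["name"]]         # R3
        if attr.get("multi_valued"):
            relation["repeating_groups"][attr["name"]] = parts   # R4
        else:
            relation["columns"].extend(parts)                    # R2
    return relation

def translate_all(relationship_types):
    # R5: the caller lists a disjunctive relationship type as two entries.
    # R6: ID dependency relationship types are skipped.
    return [
        translate_relationship_type(rt)
        for rt in relationship_types
        if not rt.get("id_dependency")
    ]

# Hypothetical ternary relationship type among project, supplier and part.
relations = translate_all([
    {"name": "jsp", "participant_identifiers": ["J#", "S#", "P#"],
     "attributes": [{"name": "qty"}]},
    {"name": "has_section", "participant_identifiers": ["C#", "Sec#"],
     "id_dependency": True},
])
```

Only jsp produces a relation; the ID dependency entry is dropped because the object class rules already nest the dependent class inside its parent.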
Storing Algorithm
Translation Example 1
Storing Algorithm
Translation for Ordering and ANY

(3) Translation for Ordering
-- We define an additional attribute named ordinal within the ordered object class (i.e., the ordered attribute).

(4) Translation for ANY
-- ANY denotes an attribute whose structure is unknown or may differ between instances.
-- We define a separate table (Identifier, ANY, ANY-value):
    -- Identifier is the identifier of the object class or relationship type to which the ANY belongs.
    -- ANY is the structure name (the tag) used by the particular instance.
    -- ANY-value is its value.
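Both translations are small enough to sketch directly. The function names and sample values below are hypothetical; only the ordinal attribute and the (Identifier, ANY, ANY-value) layout come from the slide.

```python
def add_ordinals(values):
    """Attach an explicit ordinal so that the original order can be restored."""
    return [{"ordinal": i, "value": v} for i, v in enumerate(values, start=1)]

def any_rows(identifier, content):
    """Build rows of the separate (Identifier, ANY, ANY-value) table.

    content is a list of (tag, value) pairs found under the ANY attribute.
    """
    return [(identifier, tag, value) for tag, value in content]

authors = add_ordinals(["Mo", "Ling"])
extras = any_rows("P1", [("color", "red"), ("weight", "12kg")])
```

Instances whose ANY content uses different tags all land in the same three-column table, so the main relation's schema does not change per instance.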
Storing Algorithm
Translation Results

-- By following these algorithms, a normal form ORA-SS schema is translated into normal form nested relations.
-- Undesirable update anomalies in semistructured databases are removed, and any redundancy due to many-to-many relationships and n-ary relationships is controlled.
Comparison
Other models
Supply(J#, S#, P#, price, Qty)
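One way to read the single Supply relation is sketched below. It rests on an assumption the slide does not state: that price depends only on the supplier-part pair, while Qty depends on the full project-supplier-part combination. The sample rows are invented.

```python
# Flat relation Supply(J#, S#, P#, price, Qty) with invented sample rows.
supply = [
    ("J1", "S1", "P1", 10, 100),
    ("J2", "S1", "P1", 10, 250),
    ("J3", "S1", "P1", 10, 40),
]

def split(rows):
    """Separate the binary (S#, P#) facts from the ternary (J#, S#, P#) facts."""
    sp = {(s, p): price for _, s, p, price, _ in rows}   # price per supplier-part
    jsp = {(j, s, p): qty for j, s, p, _, qty in rows}   # qty per project-supplier-part
    return sp, jsp

sp, jsp = split(supply)
```

Under that assumption the flat relation stores the same price three times, while the split form stores it once, which is the kind of redundancy that knowing the degree of each relationship type helps avoid.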
Conclusion



-- Our approach uses ORA-SS as the data model and an object-relational database as the database management system.
-- We can store and access semi-structured data correctly, more efficiently, and without avoidable redundancy.
-- No node IDs are needed in our approach.
Conclusion (cont.)

-- Our approach captures the semantic information that is essential for storage.
-- Our approach can represent the degree of n-ary relationship types.
-- Our approach can represent an attribute as belonging either to an object class or to a relationship type.