Storing and Maintaining Semistructured Data Efficiently in an Object-Relational Database
Mo Yuanying and Ling Tok Wang

Contents
1. Main accomplishment
2. Related Works
3. ORA-SS
4. Storing Algorithm
5. Comparison with Related Works
6. Conclusion

Main Accomplishment
This study provides efficient and consistent storage for semistructured data by developing algorithms that map an XML document to the logical ORA-SS model and then to an object-relational data store.

Related Works (1) The file system
Each XML document is stored as a separate operating system file, and a DOM or SAX parser is used whenever the document is accessed by a query.
Disadvantages:
XML files in ASCII format must be parsed every time they are accessed, whether for browsing or querying.
With DOM, the entire parsed file must be memory-resident during query processing.
It is hard to build and maintain indices on documents stored this way.
Update operations are difficult to implement.
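
The parsing cost described for the file system approach can be illustrated with a minimal Python sketch using the standard library DOM parser. The document text, the element names and the function `query_supplier_name` are hypothetical and do not come from the slides; the point is only that each query re-parses the whole document into memory before it can look at a single element.

```python
from xml.dom import minidom

# Hypothetical document; in the file system approach this text would be
# read from a separate operating system file on every access.
XML_TEXT = (
    "<suppliers>"
    "<supplier id='s1'><name>Acme</name></supplier>"
    "<supplier id='s2'><name>Bolt</name></supplier>"
    "</suppliers>"
)


def query_supplier_name(xml_text, supplier_id):
    # The entire document is parsed into an in-memory DOM tree on each call,
    # even though the query only needs one element.
    dom = minidom.parseString(xml_text)
    # Without an index, the lookup is a linear scan over all elements.
    for node in dom.getElementsByTagName("supplier"):
        if node.getAttribute("id") == supplier_id:
            return node.getElementsByTagName("name")[0].firstChild.data
    return None


print(query_supplier_name(XML_TEXT, "s2"))
```

Two calls to `query_supplier_name` parse the same text twice, which is the repeated work the disadvantages above refer to.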

Related Works (2) Using a relational DBMS
XML data is stored in relations, and the XML query language (for example, XQuery) is translated to SQL and executed by the underlying relational database system.
Approaches:
The Edge Approach
The Attribute Approach
Universal Table
Normalized Universal Approach
STORED
Disadvantages:
A great deal of redundancy
Difficult to search or update
Handling multi-valued attributes is expensive

Related Works (3) Using a storage manager
The XML query is parsed, translated to a suitable operator tree representation, optimized, and then executed by an XML query engine.
Shore
B-tree
Disadvantage:
Search and update are inconvenient.

Related Works (4) Our approach
Store ORA-SS in nested relations.
Problems in existing storage approaches:
Flat files: the data is long and difficult to query or update.
Relational DBMS: these approaches cannot capture the semantic information.
ORA-SS reflects the nested structure of semi-structured data and distinguishes between object classes, relationship types and attributes. It is possible to specify the degree of n-ary relationship types and to indicate whether an attribute is an attribute of a relationship type or an attribute of an object class. Such information is essential for designing an efficient and non-redundant storage organization for semi-structured data.
Multi-valued attributes are handled better in nested relations.

ORA-SS
A semantically richer data model for semistructured data
3 main concepts:
Object class
Relationship type
Attribute

ORA-SS Example
Binary relationship type

ORA-SS Example (Cont)
Ternary relationship type

ORA-SS Example (Cont)
The distinction between binary and ternary relationship types cannot be made in other semi-structured data models.
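
The three main concepts and the binary versus ternary distinction can be written down as plain data structures. The sketch below is only an illustration in Python: ORA-SS itself is a diagram notation, and the class names, field names and the supplier/part/project example are assumptions, not taken from the slides' figures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    name: str
    multi_valued: bool = False
    is_identifier: bool = False


@dataclass
class ObjectClass:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


@dataclass
class RelationshipType:
    name: str
    participants: List[ObjectClass]
    # Attributes listed here belong to the relationship type,
    # not to any one of the participating object classes.
    attributes: List[Attribute] = field(default_factory=list)

    @property
    def degree(self) -> int:
        # 2 for a binary relationship type, 3 for a ternary one, and so on.
        return len(self.participants)


# Hypothetical example data.
supplier = ObjectClass("supplier", [Attribute("S#", is_identifier=True)])
part = ObjectClass("part", [Attribute("P#", is_identifier=True)])
project = ObjectClass("project", [Attribute("J#", is_identifier=True)])

sp = RelationshipType("sp", [supplier, part], [Attribute("price")])
spj = RelationshipType("spj", [project, supplier, part], [Attribute("qty")])

print(sp.degree, spj.degree)
```

Because the degree and the owner of each attribute are explicit, a storage algorithm can tell that `price` belongs to the binary relationship type while `qty` belongs to the ternary one.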

ORA-SS
ORA-SS can specify the degree of n-ary relationship types.
ORA-SS can indicate whether an attribute is an attribute of a relationship type or an attribute of an object class.
Existing semi-structured data models cannot specify such information, although it is essential for storage.

Storing Algorithm
ORA-SS to OR database
An object-relational database can handle multi-valued attributes efficiently.
Multi-valued attributes are treated as repeating groups in nested relations.

Storing Algorithm
ORA-SS to OR database
Main rules:
Each object class, together with its attributes, forms a nested relation in which the multi-valued attributes become repeating groups (an object relation).
Each relationship type (that is, the object classes involved in it), together with its attributes, forms a nested relation in which the multi-valued attributes become repeating groups (a relationship relation).

Storing Algorithm
(1) Object class translation algorithm
O1 The identifier and candidate keys of the object class become the primary key and candidate keys of the generated relation.
O2 Each single-valued attribute of the object class is a single-valued attribute of the generated relation.
O3 Composite attributes of the object class are represented directly. They are replaced by their components in the generated relation.

Storing Algorithm
Object class translation algorithm (cont)
O4 Each multi-valued attribute of the object class forms a repeating group in the relation.
O5 Each reference is a foreign key in the relation.
O6 Each disjunctive attribute is treated as two attributes.
O7 For an ID dependency relationship type, the rule for the ID dependent object class is the same as the rule for a regular object class. The ID dependent object class, together with its attributes, forms a nested relation within its parent object class.
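
Rules O1 to O5 can be sketched as a small function that turns an object class description into a nested relation schema. This is a minimal Python illustration under stated assumptions, not the authors' implementation: the `Attr` class, the dictionary layout of the result and the `student` example are all hypothetical, and rules O6 and O7 are left out.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Attr:
    name: str
    multi_valued: bool = False
    is_identifier: bool = False
    components: List[str] = field(default_factory=list)  # non-empty if composite
    references: Optional[str] = None  # referenced object class, if any


def translate_object_class(name: str, attrs: List[Attr]) -> Dict[str, Any]:
    relation: Dict[str, Any] = {
        "relation": name,
        "primary_key": [],
        "columns": [],
        "repeating_groups": [],
        "foreign_keys": {},
    }
    for a in attrs:
        # O3: a composite attribute is replaced by its components.
        columns = a.components if a.components else [a.name]
        if a.multi_valued:
            # O4: a multi-valued attribute forms a repeating group.
            relation["repeating_groups"].append(
                {"group": a.name, "columns": columns}
            )
            continue
        # O2: a single-valued attribute stays single-valued.
        relation["columns"].extend(columns)
        if a.is_identifier:
            # O1: the identifier becomes the primary key.
            relation["primary_key"].extend(columns)
        if a.references:
            # O5: a reference becomes a foreign key.
            relation["foreign_keys"][a.name] = a.references
    return relation


# Hypothetical object class used only for illustration.
student = translate_object_class("student", [
    Attr("stuNo", is_identifier=True),
    Attr("name", components=["firstName", "lastName"]),
    Attr("hobby", multi_valued=True),
    Attr("advisor", references="lecturer"),
])
print(student)
```

The result keeps `hobby` inside the `student` relation as a repeating group instead of moving it to a separate table, which is the property the object-relational store is chosen for.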

Storing Algorithm
Translation Example 1

Storing Algorithm
(2) Relationship type translation algorithm
R1 The identifiers of all object classes participating in the relationship type form single-valued attributes of the nested relation. The key of the relationship type can be determined from the participation constraint of the relationship type.
R2 Each single-valued attribute of the relationship type is a single-valued attribute of the generated relation.

Storing Algorithm
Relationship type translation algorithm (cont)
R3 Composite attributes of the relationship type are represented directly. They are replaced by their components in the generated relation.
R4 Each multi-valued attribute of the relationship type forms a repeating group in the relation.
R5 A disjunctive relationship type is treated as two relationship types.
R6 There is no need to translate an ID dependency relationship type.

Storing Algorithm
Translation Example 1

Storing Algorithm
Translation for Ordering and ANY
(3) Translation for Ordering
We define another attribute named ordinal within the ordered object class (i.e., the ordered attribute).
(4) Translation for ANY
An attribute whose structure is unknown, or which may have a different structure for different instances, is denoted as ANY.
We define a separate table (Identifier, ANY, ANY-value).
Identifier is the identifier of the object class or relationship type to which the ANY belongs.
ANY is the structure name (the tag) for the particular instance.
ANY-value is its value.

Storing Algorithm
Translation Results
Following these algorithms, a Normal Form ORA-SS schema results in normal form nested relations.
The undesirable update anomalies in semistructured databases are removed, and any redundancy due to many-to-many relationships and n-ary relationships is controlled.

Comparison
Other models
Supply(J#, S#, P#, price, Qty)

Conclusion
Our approach uses ORA-SS as the data model and an object-relational database as the database management system.
We can store and access semi-structured data correctly, more efficiently, and without avoidable redundancy.
No node ID is needed in our approach.

Conclusion (cont)
Our approach can capture the semantic information that is essential for storage.
Our approach can represent the degree of n-ary relationship types.
Our approach can represent an attribute as an attribute of an object class or as an attribute of a relationship type.
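
The redundancy that the Comparison slide attributes to other models can be illustrated with a small Python sketch built on the relation Supply(J#, S#, P#, price, Qty). The sketch assumes that price is an attribute of a binary relationship type between supplier and part, and that Qty is an attribute of the ternary relationship type among project, supplier and part. That assignment, the instance data and the variable names are assumptions made for illustration; the slide's figure is not available in this transcript.

```python
# Hypothetical instance of the flat relation Supply(J#, S#, P#, price, Qty).
# Assumption: price depends on (S#, P#) only, Qty depends on (J#, S#, P#).
flat_supply = [
    ("j1", "s1", "p1", 10, 100),
    ("j2", "s1", "p1", 10, 250),
    ("j3", "s1", "p1", 10, 40),
]

# Relationship relation for the assumed binary relationship type:
# key (S#, P#), single-valued attribute price (rules R1 and R2).
sp = {(s, p): price for (_, s, p, price, _) in flat_supply}

# Relationship relation for the assumed ternary relationship type:
# key (J#, S#, P#), single-valued attribute Qty (rules R1 and R2).
spj = {(j, s, p): qty for (j, s, p, _, qty) in flat_supply}

# Count how many times the price of part p1 from supplier s1 is stored.
price_copies_flat = sum(1 for row in flat_supply if row[1:3] == ("s1", "p1"))
price_copies_nested = sum(1 for key in sp if key == ("s1", "p1"))

print(price_copies_flat, price_copies_nested)
```

Under these assumptions the flat relation stores the same price once per project, so a price change has to touch every such row, while the relationship relation stores it once.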