A Performance Evaluation of Alternative Mapping Schemes for Storing XML Data in a Relational Database
By Daniela Florescu and Donald Kossmann
Presented by: Intakhab Mehboob Khan

Table of Contents
• Introduction
• Approaches to Store Semi-Structured Data
• Data Model for Semi-Structured Data
• Query Language and XML-QL
• Storing XML Data in Relational Database
  – Mapping Attributes
  – Mapping Values
• Evaluating the Mapping Schemes
• Conclusion

Introduction
• August 3, 1999
• How XML data can be stored and queried
• Presents alternative mapping schemes to store XML data
• Performance experiments that analyze the tradeoffs of the schemes

Approaches to Store Semi-Structured Data
• Special-Purpose Database System
  – Examples are Lore, Rufus and Strudel
  – Stores and retrieves XML data using specially designed structures and indices
• Object-Oriented Database
  – Examples are O2 and ObjectStore
  – The rich data modeling capabilities of the OODBMS are exploited
• Standard Relational Database System
  – Data is mapped into tables of a relational schema

Data Model for Semi-Structured Data
• Characteristics of Semi-Structured Data
  – Schema is not given in advance and may be implicit
  – Schema is relatively large and may change frequently
  – Schema is descriptive rather than prescriptive
  – Data is not strongly typed
• Simple graph data model, similar to the OEM model

Query Language and XML-QL
• All query languages for semi-structured data are based on a labeled graph
• Features of semi-structured query languages
  – Regular path expressions
  – Ability to query the schema
• In addition, XML-QL offers a restructuring mechanism

Storing XML Data in Relational Database [Mapping Attributes]
• Edge Approach
  – Store all attributes in a single table
  – Edge(source, ordinal, name, flag, target)
  – Indexing supports forward and backward traversals
  – Variant of the Edge approach: store attribute names in a separate table
• Attribute Approach
  – All the attributes with the same name are stored in one table
  – Resembles the binary storage scheme proposed for storing semi-structured data
  – Aname(source, ordinal, flag, target)
  – Indexing
• Universal Table
  – Single Universal table to store all attributes of the XML document
  – Universal(source, ordinal_n1, flag_n1, target_n1, …)
• Normalized Universal Table
  – Multi-valued attributes are stored in separate Overflow tables
  – UnivNorm(source, ordinal_n1, flag_n1, target_n1, …)
  – Overflow(source, ordinal, flag, target), …

Storing XML Data in Relational Database [Mapping Values]
• Storing values in separate tables
  – Value tables store all integers, dates, and strings
  – Vtype(vid, value)
• Storing values together with attributes
  – Inlining: one column for each data type
  – No flag is needed
  – Indexes are built on each value column separately, in addition to source and target

Evaluating the Mapping Schemes
• Plan of Attack
  – The size of the relational database for each mapping scheme
  – The time to bulkload the relational database given an XML document
  – The time to reconstruct the XML document from the relational data
  – The time to execute different classes of XML queries
  – The time to execute different kinds of update functions
• Experimental Platform
  – Commercial relational database system, installed on a Sun SPARCstation 20 with
    · Two 75 MHz processors
    · 128 MB of main memory and a disk that stores the database and intermediate results of query processing
  – Machine runs Solaris 2.6, with the main-memory buffer limited to 6.4 MB
  – Calls to the relational database from the Java programs are implemented with JDBC
• Benchmark Specification
  – Benchmark Database
  – Benchmark Queries
  – Update Functions
  – Database Size
  – Bulkloading Times
  – Reconstructing the XML Document
  – Running Times of the Queries
  – Running Times of the Update Functions

Conclusion
• Advantages of a relational database
  – RDBMSs are mature and scale very well
  – Traditional and semi-structured data can coexist in a relational database
  – RDBMSs are capable of performing complex XML queries on large databases
• Disadvantages
  – Very expensive to reconstruct the original XML data from the relational database
  – Components such as authorization and concurrency control need to be implemented outside the RDBMS
• Results for the alternative mapping schemes show:
  – Creating an Attribute table for every attribute name that occurs in an XML document, and inlining values into these Attribute tables, is the best approach
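
The Edge and Attribute mapping schemes described in the slides can be compared on a toy example. The following is only a minimal sketch, not the authors' implementation: it uses Python's built-in sqlite3 module instead of the commercial RDBMS and JDBC setup from the experiments, and the sample data, the flag values ("ref", "string", "int"), the A_<name> table naming, and all function names are assumptions made for illustration. For brevity, values are kept directly in the target column rather than in separate Value tables or typed inlined columns.

```python
import sqlite3

# Hypothetical sample graph: (source, ordinal, name, flag, target).
# Assumed flag meaning: "ref" = target is another object id,
# anything else names the type of a literal value.
EDGES = [
    (1, 1, "name", "string", "Alice"),
    (1, 2, "age", "int", 30),
    (1, 3, "child", "ref", 2),
    (2, 1, "name", "string", "Bob"),
    (2, 2, "age", "int", 5),
]


def build_edge(conn, edges):
    """Edge approach: every attribute goes into one Edge table."""
    conn.execute(
        "CREATE TABLE Edge (source INTEGER, ordinal INTEGER, "
        "name TEXT, flag TEXT, target)"
    )
    # Indexes for forward (by source) and backward (by target) traversal.
    conn.execute("CREATE INDEX edge_fwd ON Edge (source, name)")
    conn.execute("CREATE INDEX edge_bwd ON Edge (target, name)")
    conn.executemany("INSERT INTO Edge VALUES (?, ?, ?, ?, ?)", edges)


def build_attribute_tables(conn, edges):
    """Attribute approach: one table per distinct attribute name."""
    for name in sorted({e[2] for e in edges}):
        # Table names cannot be bound as parameters, so only accept
        # plain identifiers here.
        if not name.isidentifier():
            raise ValueError(f"unsupported attribute name: {name!r}")
        conn.execute(
            f"CREATE TABLE A_{name} (source INTEGER, ordinal INTEGER, "
            "flag TEXT, target)"
        )
    for source, ordinal, name, flag, target in edges:
        conn.execute(
            f"INSERT INTO A_{name} VALUES (?, ?, ?, ?)",
            (source, ordinal, flag, target),
        )


def children_names_edge(conn, parent):
    """Names of the children of `parent`, via a self-join on Edge."""
    rows = conn.execute(
        "SELECT n.target FROM Edge c JOIN Edge n ON n.source = c.target "
        "WHERE c.source = ? AND c.name = 'child' AND n.name = 'name' "
        "ORDER BY c.ordinal",
        (parent,),
    )
    return [r[0] for r in rows]


def children_names_attr(conn, parent):
    """Same query against the per-attribute tables A_child and A_name."""
    rows = conn.execute(
        "SELECT n.target FROM A_child c JOIN A_name n "
        "ON n.source = c.target WHERE c.source = ? ORDER BY c.ordinal",
        (parent,),
    )
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_edge(conn, EDGES)
    build_attribute_tables(conn, EDGES)
    print(children_names_edge(conn, 1))
    print(children_names_attr(conn, 1))
```

Both queries return the same answer. The difference is that the Attribute version joins two small per-name tables, while the Edge version self-joins the single large table and filters on name.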