Download jayavel - Berkeley Database Research

Document related concepts

Extensible Storage Engine wikipedia , lookup

Microsoft Jet Database Engine wikipedia , lookup

Database wikipedia , lookup

Open Database Connectivity wikipedia , lookup

Microsoft SQL Server wikipedia , lookup

Entity–attribute–value model wikipedia , lookup

Relational algebra wikipedia , lookup

PL/SQL wikipedia , lookup

Clusterpoint wikipedia , lookup

SQL wikipedia , lookup

Null (SQL) wikipedia , lookup

Database model wikipedia , lookup

Relational model wikipedia , lookup

Transcript
Bridging Relational Technology
and XML
Jayavel Shanmugasundaram
University of Wisconsin &
IBM Almaden Research Center
Business to Business Interactions
Cars R Us
Tires R Us
eXtensible Markup Language (XML)
Internet
Order Fulfillment
Application
Purchasing
Application
Relational Database
System
Relational Database
System
Shift in Application Developers’
Conceptual Data Model
XML application
development
XML
XML application
development
Code to convert XML
data to relational data
Code to convert relational
data to XML data
Relations
Relational data
manipulation
Are XML Database Systems
the Answer?
Cars R Us
Tires R Us
eXtensible Markup Language (XML)
Internet
Order Fulfillment
Application
Purchasing
Application
XML Database
System
XML Database
System
Why use Relational Database Systems?
• Highly reliable, scalable, optimized for
performance, advanced functionality
– Result of 30+ years of Research & Development
– XML database systems are not “industrial strength”
… and not expected to be in the foreseeable future
• Existing data and applications
– XML applications have to inter-operate with existing
relational data and applications
– Not enough incentive to move all existing business
applications to XML database systems
• Remember object-oriented database systems?
A Solution
Cars R Us
Tires R Us
eXtensible Markup Language (XML)
Internet
Order Fulfillment
Application
Purchasing
Application
XML Translation Layer
XML Translation Layer
Relational Database
System
Relational Database
System
XML Translation Layer
(Contributions)
• Store and query XML documents
– Harnesses relational database technology for
this purpose
• Publish existing relational data as XML
documents
– Allows relational data to be viewed in XML
terms
Bridging Relational Technology
and XML
XML application
development
XML
Code to convert XML
data to relational data
XML application
development
XML Translation
Layer
Relations
Relational data
manipulation
Code to convert relational
data to XML data
Outline
•
•
•
•
•
Motivation & High-level Solution
Background (Relations, XML)
Storing and Querying XML Documents
Publishing Relational Data as XML Documents
Conclusion
Relational Schema and Data
PurchaseOrder
Id
Customer
200I Cars R Us
Day Month Year
10
300I Bikes R Us null
June 1999
July
1999
Item
Quantity
Cost
50
2000.00
200I
1
40%
300I Schwinn Tire 100
300I Trek Tire
20
200I Goodyear Tire 200
2500.00
400.00
8000.00
200I
300I
2
1
60%
100%
Pid
Name
Payment
200I Firestone Tire
Pid Installment Percentage
XML Document
Self-describing tags
<PurchaseOrder id=“200I” customer=“Cars R Us”>
<Date>
<Day> 10 </Day>
<Month> June </Month>
<Year> 1999 </Year>
</Date>
<Item name=“Firestone Tire” cost=“2000.00”>
<Quantity> 50 </Quantity>
</Item>
<Item name=“Goodyear Tire” cost=“8000.00”>
<Quantity> 200 </Quantity>
</Item>
<Payment> 40% </Payment>
<Payment> 60% </Payment>
</PurchaseOrder>
XML Document
Self-describing tags
Nested structure
<PurchaseOrder id=“200I” customer=“Cars R Us”>
<Date>
<Day> 10 </Day>
<Month> June </Month>
<Year> 1999 </Year>
</Date>
<Item name=“Firestone Tire” cost=“2000.00”>
<Quantity> 50 </Quantity>
</Item>
<Item name=“Goodyear Tire” cost=“8000.00”>
<Quantity> 200 </Quantity>
</Item>
<Payment> 40% </Payment>
<Payment> 60% </Payment>
</PurchaseOrder>
XML Document
Self-describing tags
Nested structure
Nested sets
<PurchaseOrder id=“200I” customer=“Cars R Us”>
<Date>
<Day> 10 </Day>
<Month> June </Month>
<Year> 1999 </Year>
</Date>
<Item name=“Firestone Tire” cost=“2000.00”>
<Quantity> 50 </Quantity>
</Item>
<Item name=“Goodyear Tire” cost=“8000.00”>
<Quantity> 200 </Quantity>
</Item>
<Payment> 40% </Payment>
<Payment> 60% </Payment>
</PurchaseOrder>
XML Document
Self-describing tags
Nested structure
Nested sets
Order
<PurchaseOrder id=“200I” customer=“Cars R Us”>
<Date>
<Day> 10 </Day>
<Month> June </Month>
<Year> 1999 </Year>
</Date>
<Item name=“Firestone Tire” cost=“2000.00”>
<Quantity> 50 </Quantity>
</Item>
<Item name=“Goodyear Tire” cost=“8000.00”>
<Quantity> 200 </Quantity>
</Item>
<Payment> 40% </Payment>
<Payment> 60% </Payment>
</PurchaseOrder>
XML Schema
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
Date (Item)* (Payment)*
</PurchaseOrder>
Date
<Date>
Day? Month Year
</Date>
Day
<Day> {integer} </Day>
Month
<Month> {string} </Month>
Year
<Year> {integer} </Year>
Item
<Item name={string} cost={float}>
Quantity
</Item>
… and so on
XML Schema (contd.)
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
Date? (Item | Payment)*
</PurchaseOrder>
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
(Date | Payment*) (Item (Item Item)* Payment)*
</PurchaseOrder>
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
Date Item (PurchaseOrder)* Payment
</PurchaseOrder>
XML Query
Find all the items bought by “Cars R Us” in 1999
For
$po in /PurchaseOrder
Where $po/@customer = “Cars R Us” and $po/date/year = 1999
Return $po/Item
XML Query (contd.)
//Item
//Item[5]
//Item Before //Payment
/(Item/(Item/Payment)*/(Payment | Item))*/Date
Outline
•
•
•
•
•
Motivation & High-level Solution
Background (Relations, XML)
Storing and Querying XML Documents
Publishing Relational Data as XML Documents
Conclusion
Storing and Querying XML Documents
[Shanmugasundaram et. al., VLDB’99]
XML
Schema
XML
Documents
XML
Query
XML
Result
XML Translation Layer
Relational
Schema
SQL
Query
Tuples
Translation
Information
Relational Database System
Relational
Result
Outline
• Motivation & High-level Solution
• Background (Relations, XML)
• Storing and Querying XML Documents
– Relational Schema Design and XML Storage
– Query Mapping and Result Construction
• Publishing Relational Data as XML Documents
• Conclusion
XML Schema
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
(Date | (Payment)*) (Item (Item Item)* Payment)*
</PurchaseOrder>
Desired Properties of Generated
Relational Schema R
• All XML documents conforming to XML
schema should be “mappable” to tuples in R
• All queries over XML documents should be
“mappable” to SQL queries over R
• Not Required: Ability to re-generate XML
schema from R
Simplifying XML Schemas
• XML schemas can be “simplified” for
translation purposes
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
(Date | (Payment)*) (Item (Item Item)* Payment)*
</PurchaseOrder>
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
Date? (Item)* (Payment)*
</PurchaseOrder>
• All without undermining storage and query
functionality!
Why is Simplification Possible?
• Structure in XML schemas can be captured:
– Partly in relational schema
– Partly as data values
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
Date? (Item)* (Payment)*
</PurchaseOrder>
• Order field to capture order among siblings
• Sufficient to answer ordered XML queries
– PurchaseOrder/Item[5]
– PurchaseOrder/Item AFTER PurchaseOrder/Payment
• Sufficient to reconstruct XML document
Simplification Desiderata
• Simplify structure, but preserve differences
that matter in relational model
– Single occurrence (attribute)
– Zero or one occurrences (nullable attribute)
– Zero or more occurrences (relation)
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
(Date | (Payment)*) (Item (Item Item)* Payment)*
</PurchaseOrder>
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
Date? (Item)* (Payment)*
</PurchaseOrder>
Translation Normal Form
• An XML schema production is either of the
form:
P
<P attr1={type1} … attrm={typem}>
a1 … ap ap+1? … aq? aq+1*… ar*
</P>
where ai  aj
• … or of the form:
P
<P attr1={type1} … attrm={typem}>
{type}
</P>
Example Simplification Rules
(e1 | e2)
e1 ? e 2 ?
(Date | (Payment)*) (Item (Item Item)* Payment)*
Date? (Item)*? (Item (Item Item)* Payment)*
e*?
e*
Date? (Payment)*? (Item (Item Item)* Payment)*
Date? (Item)* (Item (Item Item)* Payment)*
Simplified XML Schema
PurchaseOrder
<PurchaseOrder id={integer} customer={string}>
Date (Item)* (Payment)*
</PurchaseOrder>
Date
<Date>
Day? Month Year
</Date>
Day
<Day> {integer} </Day>
Month
<Month> {string} </Month>
Year
<Year> {integer} </Year>
Item
<Item name={string} cost={float}>
Quantity
</Item>
… and so on
Relational Schema Generation
PurchaseOrder (id, customer)
1
Date
?
Day
1
Month
*
Item (name, cost)
1
Year
*
Payment
1
Quantity
Satisfy: Fourth normal form
Minimize: Number of joins for path expressions
Generated Relational Schema
and Shredded XML Document
PurchaseOrder
Id
Customer
200I Cars R Us
Day Month Year
10
Item
Pid Order
Name
June 1999
Payment
Cost
Quantity
Pid Order Value
200I
1
Firestone Tire 2000.00
50
200I
2
40%
200I
3
Goodyear Tire 8000.00
200
200I
4
60%
Recursive XML Schema
PurchaseOrder
PurchaseOrder (id, customer)
Id
Customer Order Pid
200I Cars R Us
*
900I Us Again
Item (name)
1
Quantity
null null
1
5
*
Item
Pid Order
Name
Quantity Id
200I
1
Firestone Tire
50
200I
2
Goodyear Tire
200
5
6
Relational Schema Generation
and XML Document Shredding
(Completeness and Optimality)
• Any XML Schema X can be mapped to a
relational schema R, and …
• Any XML document XD conforming to X
can be converted to tuples in R
• Further, XD can be recovered from the tuples
in R
• Also minimizes the number of joins for path
expressions (given fourth normal form)
Outline
• Motivation & High-level Solution
• Background (Relations, XML)
• Storing and Querying XML Documents
– Relational Schema Design and XML Storage
– Query Mapping and Result Construction
• Publishing Relational Data as XML Documents
• Conclusion
XML Query
Find all the items bought by “Cars R Us” in 1999
For
$po in /PurchaseOrder
Where $po/@customer = “Cars R Us” and $po/date/year = 1999
Return $po/Item
Path Expression Automata
(Moore Machines)
PurchaseOrder
/PurchaseOrder/Item
Item
#
//Item
Item
XML Schema Automaton
(Mealy Machine)
PurchaseOrder (id, customer)
Date
Day
Month
Item (name, cost)
Year
Quantity
Payment
Intersected Automaton
PurchaseOrder (customer)
= “Cars R Us”
Date
Item (name, cost)
Year
= 1999
Quantity
Generated SQL Query
Select i.name, i.cost, i.quantity
From PurchaseOrder p, Item i
Where p.customer = “Cars R Us” and
p.year = 1999 and
p.id = i.pid
Predicates
Join condition
Recursive XML Query
PurchaseOrder (id, customer)
Find all items (directly or indirectly)
under a “Cars R Us” purchase order
Item (name)
For
$po in /PurchaseOrder
Where $po/@customer = “Cars R Us”
Return $po//Item
Quantity
Recursive Automata Intersection
PurchaseOrder (id, customer)
PurchaseOrder (customer)
= “Cars R Us”
Item (name)
Quantity
Item (name)
Quantity
PurchaseOrder
Recursive SQL Generation
PurchaseOrder (customer)
= “Cars R Us”
ResultItems (id, name, quantity) as (
Select it.id, it.name, it.quantity
From PurchaseOrder po, Item it
Where po.customer = “Cars R Us”
and po.id = it.pid
Item (name)
Quantity
Union all
Select it.id, it.name, it.quantity
From ResultItems rit,
PurchaseOrder po, Item it
Where rit.id = po.pid and
po.id = it.pid
PurchaseOrder
)
SQL Generation for Path
Expressions (Completeness)
• (Almost) all path expressions can be
translated to SQL
• SQL does not support seamless querying
across data and meta-data (schema)
– SQL based on first-order logic (with least fixpoint recursion)
• Meta-data query capability provided in the
XML translation layer
– Implemented and optimized using a higher-order
operator
Constructing XML Results
<item name = “Firestone Tire” cost=“2000.00”>
<quantity> 50 </quantity>
</item>
<item name=“Goodyear Tire” cost=“8000.00”>
<quantity> 200 </quantity>
</item>
(“Firestone Tire”, 2000.00, 50)
(“Goodyear Tire”, 8000.00, 200)
Complex XML Construction
PurchaseOrder (id, customer)
= “Cars R Us”
Date
Day
Month
Item (name, cost)
Year
= 1999
Quantity
Payment
Outline
•
•
•
•
•
Motivation & High-level Solution
Background (Relations, XML)
Storing and Querying XML Documents
Publishing Relational Data as XML Documents
Conclusion
Relational Schema and Data
PurchaseOrder
Id
Customer
200I Cars R Us
Day Month Year
10
June 1999
Item
Quantity
Cost
50
2000.00
200I
1
40%
200I Goodyear Tire 200
8000.00
200I
2
60%
Pid
Name
Payment
200I Firestone Tire
Pid Installment Percentage
XML Document
<PurchaseOrder id=“200I” customer=“Cars R Us”>
<Date>
<Day> 10 </Day>
<Month> June </Month>
<Year> 1999 </Year>
</Date>
<Item name=“Firestone Tire” cost=“2000.00”>
<Quantity> 50 </Quantity>
</Item>
<Item name=“Goodyear Tire” cost=“8000.00”>
<Quantity> 200 </Quantity>
</Item>
<Payment> 40% </Payment>
<Payment> 60% </Payment>
</PurchaseOrder>
Naïve Approach
• Issue many SQL queries that mirror the structure of
the XML document to be constructed
• Tag nested structures as they are produced
DBMS Engine
(200I, “Cars R Us”, 10, June, 1999)
PurchaseOrder
Item
Payment
(“Firestone Tire”, 2000.00, 50)
(“Goodyear Tire”, 8000.00, 200)
(40%)
(60%)
Problem 1: Too many SQL queries
Problem 2: Fixed (nested loop) join strategy
Relations to XML: Issues
[Shanmugasundaram et. al., VLDB’00]
• Two main differences:
– Ordered nested structures
– Self-describing tags
• Space of alternatives:
Early Tagging
Outside Engine
Late Tagging
Outside Engine
Early Structuring
Inside Engine
Inside Engine
Outside Engine
Late Structuring
Inside Engine
Early Tagging, Early Structuring, Outside Engine
Naïve (Stored Procedure) Approach
• Issue queries for sub-structures and tag them
• Could be a Stored Procedure to maximize performance
DBMS Engine
(200I, “Cars R Us”, 10, June, 1999)
PurchaseOrder
Item
Payment
(“Firestone Tire”, 2000.00, 50)
(“Goodyear Tire”, 8000.00, 200)
(40%)
(60%)
Problem 1: Too many SQL queries
Problem 2: Fixed (nested loop) join strategy
Early Tagging, Early Structuring, Inside Engine
CLOB Approach
• Push down computation inside engine
• Use extensibility features of SQL
– Scalar functions for tagging
– Aggregate functions for creating nested
structures
• Character Large Objects (CLOBs) to hold
intermediate tagged results
Problem: CLOBs during query execution
(storage, size estimation, copying)
Late Tagging, Late Structuring
• XML document content produced without
tags or structure (in arbitrary order)
• Tagger adds tags and structure as final step
Result XML Document
Tagging
Unstructured content
Relational Query
Processing
Late Tagging, Late Structuring
Outer Union Approach
(200I, null
, null, null , null ,
null
, null , null, 1 , 40%, 2)
(200I, “Cars R Us”, 10 , June, 1999,
null
, null , null, null, null, 0)
(200I, null
, null, null , null ,
null
, null , null, 2 , 60%, 2)
(200I, null
, null, null , null , “Firestone Tire” , 2000.00, 50 , null, null, 1)
(200I, null
, null, null , null , “Goodyear Tire”, 8000.00, 200, null, null, 1)
Union
(200I, “Firestone Tire”, 2000.00, 50)
(200I, “Goodyear Tire”, 8000.00, 200)
Item
Payment
(200I, 2, 60%)
(200I, 1, 40%)
PurchaseOrder
(200I, “Cars R Us”, 10, June, 1999)
Problem: Wide tuples (having many columns)
Late Tagging, Late Structuring
Hash-based Tagger
• Outer union results not structured early
– In arbitrary order
• Tagger has to enforce nesting order during tagging
– Essentially a nested group-by operation
• Hash-based approach
– Hash on ancestor ids
– Group siblings together and tag them
• Inside/Outside engine tagger
Problem: Requires memory for entire document
Late Tagging, Early Structuring
• Structured XML document content produced
– In document order
– “Sorted Outer Union” approach
• Tagger just adds tags
– In constant space
– Inside/outside engine
Result XML Document
Tagging
Structured content
Relational Query
Processing
Late Tagging, Late Structuring
Sorted Outer Union Approach
(200I, null
, null, null , null ,
null
, null , null, 1 , 40%, 2)
(200I, “Cars R Us”, 10 , June, 1999,
null
, null , null, null, null, 0)
(200I, null
, null, null , null ,
null
, null , null, 2 , 60%, 2)
(200I, null
, null, null , null , “Firestone Tire” , 2000.00, 50 , null, null, 1)
(200I, null
, null, null , null , “Goodyear Tire”, 8000.00, 200, null, null, 1)
1
3
Sort
2
Union
(200I, “Firestone Tire”, 2000.00, 50)
(200I, “Goodyear Tire”, 8000.00, 200)
Item
Payment
PurchaseOrder
(200I, “Cars R Us”, 10, June, 1999)
(200I, 2, 60%)
(200I, 1, 40%)
Late Tagging, Late Structuring
Sorted Outer Union Approach
(200I, “Cars R Us”, 10 , June, 1999,
null
, null , null, null, null, 0)
(200I, null
, null, null , null , “Firestone Tire” , 2000.00, 50 , null, null, 1)
(200I, null
, null, null , null , “Goodyear Tire”, 8000.00, 200, null, null, 1)
(200I, null
, null, null , null ,
null
, null , null, 1 , 40%, 2)
(200I, null
, null, null , null ,
null
, null , null, 2 , 60%, 2)
1
3
Sort
2
Union
(200I, “Firestone Tire”, 2000.00, 50)
(200I, “Goodyear Tire”, 8000.00, 200)
Item
Payment
PurchaseOrder
(200I, “Cars R Us”, 10, June, 1999)
Problem: Enforces Total Order
(200I, 2, 60%)
(200I, 1, 40%)
Classification of Alternatives
Inside Engine
CLOB
Late Tagging
Outside Engine
Early
Structuring
Outside Engine
Early Tagging
Late
Structuring
Outside Engine
Stored Procedure
Inside Engine
Sorted Outer Union
(Tagging inside)
Sorted Outer Union
(Tagging outside)
Inside Engine
Unsorted Outer Union
(Tagging inside)
Unsorted Outer Union
(Tagging outside)
Performance Evaluation
366 MHz Pentium II, 256MB
Database Size = 10MB – 100MB
TABLE0
TABLE00
TABLE000
TABLE001
Query Fan Out
TABLE01
TABLE010 TABLE011
Query Depth
Where Does Time Go?
Time (in seconds)
35
30
25
XML File
20
Tagging
15
Bind Out
10
Execution
5
0
Stored
Proc
CLOB Unsorted Unsorted Sorted Sorted
OU (Out) OU (In) OU (Out) OU (In)
Database Size = 10MB, Query Fan Out = 2, Query Depth = 2
Time (in seconds)
Effect of Query Depth
60
CLOB
40
Unsorted OU
20
Sorted OU
0
2
3
4
Query Depth
Database Size = 10MB, Query Fan Out = 2
Memory Considerations
• Sufficient memory
– Unsorted outer union is method of choice
– Efficient hash-based tagger
• Limited memory (large documents)
– Sorted outer union is method of choice
– Relational sort is highly scalable
XML Document Construction
(Completeness and Performance)
• Any nested XML document can be constructed
using “outer union” approaches
• 9x faster than previous approaches
– 10 MB of data
– 17 seconds for sorted outer union approach
– 160 seconds for “naïve XML application developer”
approach
Outline
•
•
•
•
•
Motivation & High-level Solution
Background (Relations, XML)
Storing and Querying XML Documents
Publishing Relational Data as XML Documents
Conclusion
Conclusion
• XML has emerged as the Internet data format
• But relational database systems will continue
to be used for data management tasks
• Internet application developers currently have
to explicitly bridge this “data model gap”
• Can we design a system that automatically
bridges this gap for application developers?
For $cust in /Customer
Where $cust/name = “Jack”
Return $cust
// First prepare all the SQL statements to be executed and create cursors for them
Exec SQL Prepare CustStmt From “select cust.id, cust.name from Customer cust where cust.name = ‘Jack’“
Exec SQL Declare CustCursor Cursor For CustStmt
Exec SQL Prepare AcctStmt From “select acct.id, acct.acctnum from Account acct where acct.custId = ?“
Exec SQL Declare AcctCursor Cursor For AcctStmt
Exec SQL Prepare PorderStmt From “select porder.id, porder.acct, porder.date from PurchOrder porder
where porder.custId = ?“
Exec SQL Declare PorderCursor Cursor For PorderStmt
Exec SQL Prepare ItemStmt From “select item.id, item.desc from Item item where item.poId = ?“
Exec SQL Declare ItemCursor Cursor For ItemStmt
Exec SQL Prepare PayStmt From “select pay.id, pay.desc from Payment pay where item.poId = ?“
Exec SQL Declare PayCursor Cursor For PayStmt
// Now execute SQL statements in nested order of XML document result. Start with customer
XMLresult = ““
Exec SQL Open CustCursor
while (CustCursor has more rows) {
Exec SQL Fetch CustCursor Into :custId, :custName
XMLResult += “<customer id=“ + custId + “><name>“ + custName + “</name><accounts>“
// For each customer, issue sub-query to get account information and add to custAccts
Exec SQL Open AcctCursor Using :custId
while (AcctCursor has more rows) {
Exec SQL Fetch AcctCursor Into :acctId, :acctNum
XMLResult += “<account id=“ + acctId + “> “ + acctNum + “</account>“
}
XMLResult += “</accounts><porders>“
// For each customer, issue sub-query to get purchase order information and add to custPorders
Exec SQL Open PorderCursor Using :custId
while (PorderCursor has more rows) {
Exec SQL Fetch PorderCursor Into :poId, :poAcct, :poDate
XMLResult += “<porder id=“+ poId + “ acct=“+poAcct +“><date>“+poDate +“</date><items>“
// For each purchase order, issue a sub-query to get item information and add to porderItems
Exec SQL Open ItemCursor Using :poId
while (ItemCursor has more rows) {
Exec SQL Fetch ItemCursor Into :itemId, :itemDesc
XMLResult += “<item id=“ + itemId + “>“ + itemDesc + “</item>“
}
XMLResult += “</items><payments>“
// For each purchase order, issue a sub-query to get payment information and add to porderPays
Exec SQL Open PayCursor Using :poId
while (PayCursor has more rows) {
Exec SQL Fetch PayCursor Into :payId, :payDesc
XMLResult += “<payment id=“ + payId + “>“ + payDesc + “</payment>“
}
XMLResult += “</payments></porder>“
} // End of looping over all purchase orders associated with a customer
XMLResult += “</customer>“
Return XMLResult as one result row; reset XMLResult = ““
} // loop until all customers are tagged and output
Conclusion (Contd.)
• Yes! XPERANTO is the first such system
• Allows users to …
– Store and query XML documents using a
relational database system
– Publish existing relational data as XML
documents
… using a high-level XML query language
• Also provides a dramatic improvement in
performance
Relational Database System
Vendors
• IBM, Microsoft, Oracle, Informix, …
– SQL extensions for XML
• XML Translation Layer
– “Pure XML” philosophy … provides high-level
XML query interface
• SQL extensions for XML, while better than writing
applications, is still low-level
– More powerful than XML-extended SQL
• SQL just not designed with nifty XML features in
mind
Related Research
• Storing and querying “schema-less” XML
documents using a relational database system
– STORED [Deutsch et. al.]
– Binary decomposition [Florescu & Kossmann]
• XML Views of relational databases
– SilkRoute [Fernandez et. al.]
– Mediators
Future Work
• Functionality
– Update XML documents
– Insert/Update XML views of relational data
– Information-retrieval style queries
• Performance
– Query-aware relational schema generation
– Caching
– Relational engine extensions