Comparison of XML Support in IBM
DB2 9, Microsoft SQL Server 2005,
Oracle 10g
O. Beza¹, M. Patsala², E. Keramopoulos³
¹Dept. of Information Technology, Alexander Technology Educational Institute
(ATEI), Thessaloniki, Greece, E-mail: [email protected]
²Dept. of Information Technology, Alexander Technology Educational Institute
(ATEI), Thessaloniki, Greece, E-mail: [email protected]
³Dept. of Information Technology, Alexander Technology Educational Institute
(ATEI), Thessaloniki, Greece, E-mail: [email protected]
Abstract
In this paper we present the relation between XML (Extensible Markup Language)
documents and three Relational Database Management Systems (RDBMSs): IBM DB2 9,
Microsoft SQL Server 2005 and Oracle 10g. The research aims to describe the ways in
which documents of this type can be manipulated using these three XML-enabled
databases and to compare their XML support. The paper discusses the basic concepts
of XML and presents the structure of XML documents, the related technologies
(DTDs, XML Schemata, etc.) and two of the most important XML query languages,
XPath and XQuery. Moreover, we outline the basic concepts of database systems and
how they can benefit from using XML. The emphasis of the paper is on the comparison
analysis, which is based on a list of basic XML features that an RDBMS should support.
We introduce these features and illustrate the comparison with examples of using XML
with IBM DB2 9, Microsoft SQL Server 2005 and Oracle 10g. Finally, we summarize our
conclusions in a comparison table that lists the XML operations supported by the
three RDBMSs.
Keywords: XML, XML-enabled Databases, DTD, XML Schema
1. Introduction
XML (Extensible Markup Language) [1, 2] is a markup language
developed by the World Wide Web Consortium (W3C) [3] to deliver
structured content over the web. XML was originally developed as an
application profile of SGML [4], but it soon became successful in a
variety of other application domains. This is because XML offers many
advantages over other data formats, including:
1. Built-in support for internationalization, since it uses Unicode.
2. Platform independence.
3. Human-readable format, which makes it easier for developers to
locate and fix errors than with other data storage formats.
4. Extensibility in a manner that allows developers to add extra
information to a format without breaking applications that are
based on older versions of that format.
5. Large number of off-the-shelf tools for processing existing XML
documents.
XML databases have become widely accepted for all applications where
the storage of XML data is necessary. There are three different types of
XML databases [5], namely:
• XML-Enabled Database: A database that holds data in some format
other than XML. An interface presents XML to the application even
though the data is stored in a different format. An XML-enabled
database might be a relational database, an object-relational database,
or an object-oriented database.
• Native XML Database: This type of database allows XML data to be
stored directly. It defines a (logical) model for an XML document and
stores and retrieves documents according to that model. Native XML
databases are likely to perform better than XML-enabled databases
since there is little need for converting the data. The data conversion
in an enabled database is almost always going to be more significant
and time-consuming than in a native database.
• Hybrid XML Database: A database that has characteristics of both
Native XML Databases and XML-Enabled Databases.
IBM DB2 9, Microsoft SQL Server 2005 and Oracle 10g are XML-enabled
relational database management systems (RDBMSs). DB2 offers two ways of
processing XML documents: XML Extender [6, 7] and PureXML [8, 9]. In this
paper, we present the XML features that the three RDBMSs should support.
In particular, in Section 2 the three DBMSs are examined against several
XML technologies, namely DTDs, XML Schema, XPath, XQuery and XSL. The
method that each DBMS uses to store an XML document is given in Section 3.
In Section 4, we focus on the mapping between XML documents and DBMSs and
on XML indexes. In Section 5, we examine how the three DBMSs decompose XML
documents into relational table columns and compose them from such columns.
In Section 6, we study how the three DBMSs use SQL/XML, the XML-related
extension of SQL3 [10]. Finally, in Section 7 we conclude with a comparison
table that contains all the described XML features that an RDBMS should
support.
2. XML Technologies
2.1 DTD (Document Type Definition) Documents
DTDs [11] are documents that define markup rules as a vocabulary.
DTDs have a different syntax from XML and are generally used to
specify the order and occurrence of elements in an XML document. The
use of DTDs has become less popular since XML Schemata were
introduced. However, some programmers still prefer DTDs, mainly
because they find them easier to code and validate than XML Schemata.
SQL Server does not support the use of DTD files. DB2 PureXML does not
support DTD validation, but it permits the insertion of documents that
contain a DOCTYPE declaration referring to a DTD. On the other hand,
DB2 XML Extender and Oracle fully support DTD validation.
2.2 XML Schemata
An XML Schema [12] is a mechanism introduced by the W3C that can be
used in place of a DTD to define the specifications for the content of XML
documents.
All three DBMSs can register an XML schema in the database. Oracle and
DB2 provide a repository that holds all registered XML technologies
used for validation and stores them in their hierarchical structure; the
repositories are named XML DB Repository and XML Schema Repository
respectively. SQL Server does not provide such a tool, but it provides a
statement for modifying a registered XML Schema, alter xml schema collection.
Similarly, DB2 provides the command add xmlschema document to. In Oracle,
once we register a schema in the database we cannot modify it. Moreover,
Oracle provides two methods: isSchemaBased, which checks whether the
inserted XML document conforms to a schema, and isSchemaValidated, which
checks whether the document inserted in a column is valid. Finally, in
Oracle and SQL Server we can create XML schemata in the database.
One of the most important reasons that DBMSs use XML Schemata is to
validate XML documents before they are inserted in the columns of a
table. In SQL Server we have to define, when the table is created,
whether the XML column will contain XML documents that conform to an
XML Schema. Thus, if we create a table with a typed XML column, each
XML document should contain all the tags, with the same names, that the
schema defines. In contrast, in Oracle we only have to name the root
element and the rest of the document may differ. In DB2 we define in the
insert command whether the document will be validated against a schema
or not, as we can see in the following example.
insert into PurchaseOrder (poid, info) values (2002, XMLVALIDATE(
XMLPARSE(DOCUMENT '<purchaseOrder poid="2002" orderDate="1999-10-20" status=""> … </purchaseOrder>')
ACCORDING TO XMLSCHEMA ID migrate.po));
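The idea of validating a document at insert time can also be illustrated outside any particular DBMS. The following is a minimal sketch in Python (chosen because it runs without a database server), not DB2 code. The Python standard library has no XML Schema validator, so the REQUIRED rule table is a hypothetical stand-in for a registered schema; the function names and the sample documents are likewise assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a registered XML Schema: expected root element,
# required attributes on the root, and required child elements.
REQUIRED = {
    "root": "purchaseOrder",
    "attributes": ["poid", "orderDate"],
    "children": ["shipTo", "items"],
}

def validate_purchase_order(text, rules=REQUIRED):
    """Parse text and return the root element, or raise ValueError."""
    try:
        root = ET.fromstring(text)  # well-formedness check
    except ET.ParseError as exc:
        raise ValueError(f"not well-formed: {exc}") from exc
    if root.tag != rules["root"]:
        raise ValueError(f"unexpected root element <{root.tag}>")
    for name in rules["attributes"]:
        if name not in root.attrib:
            raise ValueError(f"missing attribute {name!r}")
    for name in rules["children"]:
        if root.find(name) is None:
            raise ValueError(f"missing child element <{name}>")
    return root

def insert_validated(table, poid, text):
    """Store the document only if it passes validation."""
    validate_purchase_order(text)
    table[poid] = text

# Made-up sample documents: GOOD satisfies the rules, BAD does not.
GOOD = ('<purchaseOrder poid="2002" orderDate="1999-10-20">'
        '<shipTo><name>Alice</name></shipTo><items/></purchaseOrder>')
BAD = '<purchaseOrder poid="2003"><items/></purchaseOrder>'
```

Calling insert_validated with GOOD stores the document, while BAD raises ValueError before anything is stored, which mirrors the reject-on-insert behaviour described above.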
2.3 XPath-XQuery-XSL
XPath [13] is a language for addressing parts of an XML document; it
operates on a hierarchical (tree) representation of the document. All
three DBMSs use it to navigate through elements and attributes in an
XML document that is stored in a column of XML type.
XQuery [14] is a W3C Recommendation and shares its data model with
XPath. XQuery is used for finding and extracting elements and attributes
from XML documents. According to our research, DB2 PureXML’s support of
XQuery is superior to that of the other two, since it treats XQuery as
a first-class language. DB2 XML Extender, by contrast, does not support
XQuery.
DB2 PureXML's XQuery:
select xmlquery('$cinfo/purchaseOrder/shipTo/name'
passing info as "cinfo") from purchaseOrder;
XQuery:
xquery
for $y in db2fn:xmlcolumn('PURCHASEORDER.INFO')/purchaseOrder/items
return $y
SQL Server's XQuery:
select poid, info.query('for $y in /purchaseOrder/items
return <topic>{$y/item[@pid]}</topic>')
from purchaseorder
Oracle's XQuery:
select XMLQuery('$cinfo/product/description/name[ora:contains(., "Roll") > 0]'
passing info as "cinfo" returning content)
from product;
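The path expressions used in the queries above can be tried without a database. The sketch below is a vendor-neutral illustration in Python using the limited XPath subset of the standard library's xml.etree.ElementTree; it is not XQuery and does not reproduce any of the three DBMSs' behaviour. The sample document is made up and only shaped like the purchaseOrder examples.

```python
import xml.etree.ElementTree as ET

# Made-up sample document shaped like the purchaseOrder examples above.
DOC = """
<purchaseOrder poid="2002">
  <shipTo><name>Alice Smith</name></shipTo>
  <items>
    <item pid="100-101"><quantity>2</quantity></item>
    <item><quantity>1</quantity></item>
  </items>
</purchaseOrder>
"""

root = ET.fromstring(DOC)  # root is the <purchaseOrder> element

# Counterpart of the path /purchaseOrder/shipTo/name, relative to the root.
ship_to_name = root.findtext("shipTo/name")

# Counterpart of /purchaseOrder/items/item[@pid]:
# only the items that carry a pid attribute are selected.
items_with_pid = root.findall("items/item[@pid]")
pids = [item.get("pid") for item in items_with_pid]

print(ship_to_name)
print(pids)
```

Paths are written relative to the root element because ElementTree evaluates them from the element on which findall is called.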
XSL (Extensible Stylesheet Language) [15] is a language for expressing
style sheets. In other words it defines how an XML document should be
presented. All three DBMSs fully support the use of XSL.
3. Storage Methods
In this section we examine the method that each of the three DBMSs uses
to store an XML document in a database. In particular, SQL Server [16,
17] stores XML documents in table columns of XML type, like BLOBs
(Binary Large Objects). If the XML document is stored in an untyped¹
column, it is stored as Unicode (UTF-16), whereas if it is stored in a
typed² column, its values are stored with the types that the XML schema
defines. For example,
Create table Product (pid varchar(10) not null primary key,
name varchar(128), category varchar(32), price decimal(30,2), info xml);
Create table purchaseorder ( POid bigint not null primary key,
Status varchar(10) not null default 'New', Info XML(content PO));
Oracle [18, 19] stores XML documents intact in table columns of type
xmltype, like CLOBs (Character Large Objects) or BLOBs (Binary Large
Objects), or in a distinct xmltype table. For example,
Create table purchaseOrder ( POid bigint not null primary key,
status varchar (10) not null, info xmltype)
xmltype info xmlschema "http://www.w3.org/2001/XMLSchema"
element "purchaseOrder";
¹ Untyped is the term used in SQL Server for columns of XML type that are not bound to an
XML Schema.
² Typed is the term used in SQL Server for columns of XML type that are bound to an
XML Schema.
Create table purchaseOrder of xmltype XMLSCHEMA
"http://www.w3.org/2001/XMLSchema" element "purchaseOrder";
DB2 XML Extender stores the XML document in a single column as
character data, extracting values into "side tables". For example,
Create table PurchaseOrder ( POid bigint not null primary key,
Status varchar (10) not null with default 'New',Info db2xml.xmlclob not
logged not null);
In the case of DB2 PureXML, the XML document is stored in a column of
XML type. It is worth mentioning that PureXML does not store documents
as plain text and does not map XML to relational or object-relational
tables. Instead, it stores XML in its inherent hierarchical format,
which matches the XML data model. Any XML document is a well-defined
tree of elements and attributes, and XML queries are expressed in terms
of tree traversal. An example of a table that stores XML documents in
DB2 PureXML is given below:
Create table PurchaseOrder ( POid bigint not null primary key,Status
varchar (10) not null with default 'shipped', Info xml not null);
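To make the "tree of elements and attributes" view concrete, the following minimal sketch in Python walks a parsed document and lists one path per node. It only illustrates the hierarchical data model; it says nothing about how PureXML lays out data on disk. The function name node_paths and the sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

def node_paths(element, prefix=""):
    """Yield a path for every element and attribute, in document order."""
    path = f"{prefix}/{element.tag}"
    yield path
    for name in element.attrib:
        yield f"{path}/@{name}"
    for child in element:
        # Recurse into child elements: a depth-first tree traversal.
        yield from node_paths(child, path)

# Made-up sample document.
DOC = ('<purchaseOrder poid="2002"><shipTo><name>Alice</name></shipTo>'
       '<items/></purchaseOrder>')
paths = list(node_paths(ET.fromstring(DOC)))
```

A path query such as /purchaseOrder/shipTo/name then corresponds to following one branch of this tree, rather than scanning the document text.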
4. XML Mapping and Indexing
The concept of mapping [20] is of great importance for XML-enabled
databases, because the data transfer between the XML document and the
database is based on the mapping between them.
Using DB2’s XML Extender, the mapping between the tables of the
database and the structure of the XML document is defined by a document
called a DAD (Data Access Definition). This document maps the elements
of the XML document to the columns of the table. In contrast, DB2
PureXML uses annotated XML Schemata [12] instead of DAD files.
Generally, annotated schemata, which are also referred to as mapping
schemata, are used by all three RDBMSs for mapping. Annotations can be
defined on tables (sql:relation annotation), on fields (sql:field annotation)
and on referential integrity relationships (sql:relationship annotation).
If an XML schema is not registered in the database, each of the DBMSs
uses a default mapping. SQL Server also uses FOR XML clauses [21], which
define how the results of select statements are mapped to XML
documents.
A common characteristic of all the XML-enabled databases is the support
of XML indexes [17, 22], which are built over elements and attributes of
XML documents. Just like relational indexes, XML indexes are used to
improve the performance of queries. The user should create indexes over
frequently accessed data, which results in much better performance of
the select statements executed over the indexed data.
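The benefit of an XML index can be sketched independently of any DBMS. The Python example below builds a simple lookup from the values found at one path to the documents that contain them, so that a later query does not have to parse every document again. It is an illustrative assumption about the general idea only; the index structures actually used by DB2, SQL Server and Oracle are not described in this paper, and the function name and sample data are made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

def build_index(documents, path):
    """Map each value found at `path` to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        root = ET.fromstring(text)
        for node in root.findall(path):
            if node.text:
                index[node.text.strip()].add(doc_id)
    return index

# Made-up product documents keyed by a document id.
DOCS = {
    1: "<product><description><name>Roll</name></description></product>",
    2: "<product><description><name>Tape</name></description></product>",
    3: "<product><description><name>Roll</name></description></product>",
}

# The index is built once over a frequently queried path.
name_index = build_index(DOCS, "description/name")

# A lookup by value now needs no parsing of the stored documents.
roll_ids = sorted(name_index.get("Roll", set()))
```

As with relational indexes, the cost is paid when documents are inserted or updated, since the index has to be maintained alongside the data.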
5. Managing XML Documents and Relational Data
When working with XML documents and databases we can either store
the documents intact in columns of XML type or decompose the XML data
into relational tables. Another operation we can perform is to compose
XML documents from existing relational data.
For decomposition, DB2 XML Extender provides the XML Collection method
[23]. An XML Collection is defined by a DAD document that determines
how the elements and attributes are mapped to one or more relational
tables. After we enable the database for XML Collection
(dxxadm enable_collection database_name name “path”), we insert the XML
data into the tables using the DAD and the XML documents. DB2 PureXML,
on the other hand, uses an annotated XML Schema that has been registered
in the database and enabled for decomposition; the command
decompose xml document then inserts the desired XML data into the
tables.
SQL Server uses the stored procedure sp_xml_preparedocument and
OPENXML clauses [16] for decomposing XML data into relational tables.
This procedure does not require an XML Schema; we only have to supply
the XML document. This approach is not automated like DB2’s and it does
not support insertion into more than one table.
Oracle performs a similar procedure using dbms_xmlgen. By defining an
XML document, the names of the tables we want to create and the columns
they contain, we can insert XML data into them. This approach is far
more complicated than DB2’s, especially when we want to update many
tables or insert more than one XML document.
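Decomposition (shredding) as discussed above can be sketched in a few lines. The example below is in Python with the standard library's sqlite3 module standing in for the relational side; it is not the DB2, SQL Server or Oracle mechanism. The table names po and po_item, the function shred and the sample document are hypothetical and chosen only for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Made-up sample document shaped like the purchaseOrder examples.
DOC = """
<purchaseOrder poid="2002" status="New">
  <items>
    <item pid="100-101"><quantity>2</quantity></item>
    <item pid="100-103"><quantity>5</quantity></item>
  </items>
</purchaseOrder>
"""

def shred(conn, text):
    """Decompose one purchaseOrder document into two relational tables."""
    root = ET.fromstring(text)
    poid = int(root.get("poid"))
    # Root attributes become one row in the parent table.
    conn.execute("INSERT INTO po (poid, status) VALUES (?, ?)",
                 (poid, root.get("status")))
    # Each repeating <item> element becomes one row in the child table.
    rows = [(poid, item.get("pid"), int(item.findtext("quantity")))
            for item in root.findall("items/item")]
    conn.executemany(
        "INSERT INTO po_item (poid, pid, quantity) VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE po (poid INTEGER PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE po_item (poid INTEGER, pid TEXT, quantity INTEGER)")
shred(conn, DOC)
items = conn.execute(
    "SELECT pid, quantity FROM po_item ORDER BY pid").fetchall()
```

Here the mapping from XML nodes to columns is hard-coded in shred; a DAD file or an annotated schema plays that role declaratively in the systems compared in this paper.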
Apart from these functions, we can create XML documents and Schemata
from existing relational data. Oracle and DB2 use SQL/XML functions to
produce XML documents, whereas SQL Server uses FOR XML clauses.
Finally, DB2 does not support XML Schema creation.
6. SQL/XML
SQL/XML [24] is an extension of SQL that is part of ANSI/ISO SQL:2003.
SQL/XML was developed by INCITS H2.3 [25], with participation from
Oracle, IBM, Microsoft (which does not plan to implement SQL/XML),
Sybase [26], and DataDirect Technologies [27]. Its extensions include the
following:
• Mapping SQL tables, schemas, and catalogs to XML documents.
• Generation of an XML schema corresponding to an XML document
generated from SQL data.
• An XML data type to allow columns of SQL tables to contain XML data.
• Publishing functions that allow SQL queries to create XML structures,
including XMLELEMENT, XMLATTRIBUTES, XMLFOREST, XMLCONCAT, XMLAGG, and
XMLGEN.
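What the publishing functions do can be approximated outside SQL. The sketch below, in Python, defines rough analogs of XMLELEMENT with XMLATTRIBUTES, XMLFOREST and XMLAGG and uses them to compose one XML document from relational-style rows. The helper names, their signatures and the sample rows are assumptions made for illustration; they are not the SQL/XML functions themselves and do not follow their exact semantics.

```python
import xml.etree.ElementTree as ET

def xml_element(name, attributes=None, *content):
    """Rough analog of XMLELEMENT with XMLATTRIBUTES: build one element.

    Text parts are assumed to come before any child elements.
    """
    element = ET.Element(
        name, {k: str(v) for k, v in (attributes or {}).items()})
    for part in content:
        if isinstance(part, ET.Element):
            element.append(part)
        else:
            element.text = (element.text or "") + str(part)
    return element

def xml_forest(row):
    """Rough analog of XMLFOREST: one child element per column."""
    return [xml_element(column, None, value) for column, value in row.items()]

def xml_agg(name, elements):
    """Rough analog of XMLAGG inside a wrapper element: gather rows under one parent."""
    return xml_element(name, None, *elements)

# Made-up relational rows.
ROWS = [
    {"pid": "100-101", "name": "Roll", "price": 9.99},
    {"pid": "100-103", "name": "Tape", "price": 3.5},
]

products = xml_agg("products", [
    xml_element("product", {"pid": row["pid"]},
                *xml_forest({"name": row["name"], "price": row["price"]}))
    for row in ROWS
])
result = ET.tostring(products, encoding="unicode")
```

Each row becomes a product element whose key column is published as an attribute and whose remaining columns are published as child elements.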
7. Conclusions
Summing up, we report some observations made during our research.
Oracle was rather slow when running on Windows XP and less
user-friendly compared to the other two DBMSs. What we found quite
convenient when working with SQL Server and DB2 was the existence of
hyperlinks in the columns that contain XML documents and in query results.
One limitation of DB2 9 is that, when working with XML Extender, a
number of steps are necessary in order to enable the database, and in
the case of PureXML we have to work in a database with codeset UTF-8.
Below we present a comparison table that lists all the functions and
tools that the DBMSs use for XML support.
FEATURES (rows of the table):
XML Technologies: DTD, XML Schemas, XPath, XQuery, XSL
Storage Methods: BLOB, CLOB, VARCHAR, Native
XML Data Type: XMLType, XML, XMLVarchar, XMLClob, XMLFile
Columns of XML Type
Tables of XML Type
XML Validation
XML Shredding
Composition of XML Documents
Composition of XML Schema
XML Mapping
XML Indexing
SQL/XML
XML Repository
DBMSs (columns of the table): DB2 PureXML, DB2 XML Extender, Oracle, SQL Server
[Table cells: the per-DBMS support marks are not recoverable from this transcript.]
Figure 1: Comparative Table
As we can see from the above table, DB2 XML Extender does not support
the use of XML Schemata and XQuery, whereas SQL Server falls short in
the use of DTDs.
One big advantage of DB2 PureXML is the native storage of XML data.
This approach contributes to faster query performance and data access.
One of the most interesting functions that SQL Server and Oracle offer is
the creation of XML Schemata, something that DB2 does not support.
Finally, DB2 PureXML and Oracle provide a very helpful tool for managing
XML schemata and other technologies used for validation, the XML
Repository.
REFERENCES
[1] Extensible Markup Language (XML) 1.0 (Fourth Edition), W3C Recommendation,
29 September 2006. Available at: http://www.w3.org/TR/xml/
[2] Extensible Markup Language (XML) Tutorial.
Available at: http://www.w3schools.com/xml/
[3] World Wide Web Consortium (W3C). Available at: http://www.w3.org/
[4] SGML. Available at: http://www.w3.org/TR/html4/intro/sgmltut.html
[5] What is an XML database? Available at: http://xmldb-org.sourceforge.net/faqs.html
[6] IBM Redbooks, XML Guide, db2xge90
[7] IBM Redbooks, XML Extender Administration and Programming, Version 8.2,
db2sxe81
[8] IBM Redbooks, DB2 9: pureXML Overview and Fast Start, sg247298
[9] IBM Redbooks, DB2 9 pureXML Guide, sg247315
[10] ISO/IEC 9075-1:2003. Information technology - Database languages - SQL -
Part 1: Framework (SQL/Framework).
[11] Kelvin Williams, Professional XML Databases, Wrox Press, Ltd, 2000
[12] Introduction to Annotated XSD Schemas (SQLXML 4.0).
Available at: http://technet.microsoft.com/en-us/library/ms171870.aspx
[13] XML Path Language (XPath) Version 1.0, W3C Recommendation, 16 November
1999. Available at: http://www.w3.org/TR/xpath
[14] XQuery 1.0: An XML Query Language, W3C Recommendation.
Available at: http://www.w3.org/TR/xquery
[15] Extensible Stylesheet Language (XSL) Version 1.1.
Available at: http://www.w3.org/TR/xsl/
[16] Scott Klein, Professional SQL Server 2005 XML, Wiley Publishing, 2006.
[17] Mitch Ruebush, Comparing SQL Server 2005 and Oracle 10g as a Database
Platform for Microsoft .NET Developers, April 2005
[18] Shelley Higgins, Oracle Application Developer’s Guide - XML, 10g (9.0.4), Part No.
B12099-01, Oracle Corporation, 2003
[19] Geoff Lee, Mastering XML DB Queries in Oracle 10g, Release 2, Oracle Corporation,
March 2006.
[20] Igor Dayen, Storing XML in Relational Databases, June 20, 2001
[21] Srinivas Sampath, Beginning SQL Server 2005 XML Programming, 21 February 2006
[22] IBM Redbooks, DB2 9: Indexing XML documents with DB2 9 pureXML
[23] IBM Redbooks, XML for DB2 Information Integration, SG24-6994
[24] SQL/XML. Available at: http://www.stylusstudio.com/sqlxml_tutorial.html
[25] INCITS H2.3. Available at: http://www.incits.org/
[26] Sybase. See also: www.sybase.com/
[27] DataDirect Technologies. Available at: www.datadirect.com/