ELAG 2005 at CERN, Geneva
XML Databases – do they really exist?
Jan Erik Kofoed
BIBSYS Library Automation
Design goals for XML
1. XML shall be straightforwardly usable over the Internet.
2. XML shall support a wide variety of applications.
3. XML shall be compatible with SGML.
4. It shall be easy to write programs which process XML documents.
5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
6. XML documents should be human-legible and reasonably clear.
7. The XML design should be prepared quickly.
8. The design of XML shall be formal and concise.
9. XML documents shall be easy to create.
10. Terseness in XML markup is of minimal importance.
2005-06-03
2
ELAG 2005
A simple example
<?xml version="1.0" encoding="UTF-8"?>
<book id="231456">
  <author>Henrik Ibsen</author>
  <title>The Wild Duck</title>
  <published>
    <place>London</place>
    <year>1890</year>
  </published>
</book>
A relational model for the example
book
  id (primary key)
  publ-id (foreign key)
  author
  title
published
  publ-id (primary key)
  place
  year
Realisation as tables
book
id      publ-id  author        title
231456  0001     Henrik Ibsen  The Wild Duck

published
publ-id  place   year
0001     London  1890
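
As a rough illustration of the two tables above, the following sketch loads them into an in-memory SQLite database and joins them back into one record. SQLite is an assumption made for the example (the slides do not name a relational product), and publ-id is spelled publ_id so that it is a plain SQL identifier.

```python
import sqlite3

# In-memory database; table and column names follow the slide,
# with publ-id written as publ_id (an assumption for valid SQL).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE published (
        publ_id TEXT PRIMARY KEY,
        place   TEXT,
        year    TEXT
    );
    CREATE TABLE book (
        id      INTEGER PRIMARY KEY,
        publ_id TEXT REFERENCES published(publ_id),
        author  TEXT,
        title   TEXT
    );
    INSERT INTO published VALUES ('0001', 'London', '1890');
    INSERT INTO book VALUES (231456, '0001', 'Henrik Ibsen', 'The Wild Duck');
""")

# Reassembling the original record takes a join across both tables.
row = conn.execute("""
    SELECT b.id, b.author, b.title, p.place, p.year
    FROM book AS b
    JOIN published AS p ON p.publ_id = b.publ_id
    WHERE b.id = ?
""", (231456,)).fetchone()
print(row)
```

The join is the point of the example: the nested published element of the XML document becomes a second table that has to be re-attached at query time.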
Realisation as DOM (Document Object Model)
book [element node]
  id = 231456 [attribute node]
  author [element node]
    Henrik Ibsen [text node]
  title [element node]
    The Wild Duck [text node]
  published [element node]
    place [element node]
      London [text node]
    year [element node]
      1890 [text node]
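
The node tree for the book example can be reproduced with a few lines of Python. This is only a sketch using the standard library's xml.dom.minidom; the describe helper is a hypothetical name introduced here, and whitespace-only text nodes (the line breaks between elements) are skipped to keep the output short.

```python
from xml.dom import minidom

# The book document from the earlier slide (XML declaration omitted).
BOOK_XML = """<book id="231456">
  <author>Henrik Ibsen</author>
  <title>The Wild Duck</title>
  <published>
    <place>London</place>
    <year>1890</year>
  </published>
</book>"""


def describe(node, depth=0):
    """Return one line per DOM node, indented by nesting depth."""
    pad = "  " * depth
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append(f"{pad}element: {node.tagName}")
        # Attributes hang off the element rather than being child nodes.
        for name, value in node.attributes.items():
            lines.append(f"{pad}  attribute: {name}={value}")
        for child in node.childNodes:
            lines.extend(describe(child, depth + 1))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        lines.append(f"{pad}text: {node.data.strip()}")
    return lines


doc = minidom.parseString(BOOK_XML)
tree_lines = describe(doc.documentElement)
print("\n".join(tree_lines))
```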
Important W3C XML technologies
• XML Schema
– defining database schema, instance validation
• XPath
– query expressions
– addressing content
• XQuery
– a new query language for XML
– based on XPath and XML Schema type hierarchy
• Namespaces in XML
– qualification of content
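
To give a feel for XPath-style addressing, here is a small sketch against the book example. It assumes Python's xml.etree.ElementTree, which implements only a limited subset of XPath, so it shows the idea rather than the full language.

```python
import xml.etree.ElementTree as ET

# The book document from the earlier slide (XML declaration omitted).
BOOK_XML = """<book id="231456">
  <author>Henrik Ibsen</author>
  <title>The Wild Duck</title>
  <published>
    <place>London</place>
    <year>1890</year>
  </published>
</book>"""

book = ET.fromstring(BOOK_XML)  # the root element <book>

# Paths are relative to the root element.
author = book.findtext("author")
year = book.findtext("published/year")

# ElementTree paths select elements only, so the attribute is read directly.
book_id = book.get("id")

# A wildcard step selects every child element of <published>.
children = [e.tag for e in book.findall("published/*")]

print(author, year, book_id, children)
```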
XML Schema – text representation
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="book">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="author" type="xs:string"/>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="published">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="place" type="xs:string"/>
              <xs:element name="year" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
      <xs:attribute name="id" use="required" type="xs:int"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
XML Schema – graphical representation
Assigning a schema to an XML document
<?xml version="1.0" encoding="UTF-8"?>
<book id="231456"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="book.xsd">
  <author>Henrik Ibsen</author>
  <title>The Wild Duck</title>
  <published>
    <place>London</place>
    <year>1890</year>
  </published>
</book>
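
The schema reference is an ordinary attribute in the XMLSchema-instance namespace, so a program can read it like any other attribute. The sketch below, assuming Python's xml.etree.ElementTree, only locates the hint; it does not validate the document, which would need a schema-aware processor.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"

# The instance document from the slide (XML declaration omitted).
INSTANCE = """<book id="231456"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:noNamespaceSchemaLocation="book.xsd">
  <author>Henrik Ibsen</author>
  <title>The Wild Duck</title>
  <published>
    <place>London</place>
    <year>1890</year>
  </published>
</book>"""

root = ET.fromstring(INSTANCE)

# ElementTree spells qualified attribute names as {namespace-uri}local-name.
schema_hint = root.get(f"{{{XSI}}}noNamespaceSchemaLocation")
print(schema_hint)
```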
XQuery
The query:
for $query in doc("book.xml")
where $query/book/author = "Henrik Ibsen"
return
  <result>
    {$query/book/title}
  </result>
The path expressions ($query/book/author, $query/book/title) are XPath.
The query gives this result:
<result>
  <title>The Wild Duck</title>
</result>
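
For readers without an XQuery processor at hand, the same selection can be approximated in plain Python. This is a sketch of the query's logic only, not an XQuery implementation; the function name query is hypothetical and the document is passed in as a string instead of being loaded with doc("book.xml").

```python
import xml.etree.ElementTree as ET

# The book document from the earlier slide (XML declaration omitted).
BOOK_XML = """<book id="231456">
  <author>Henrik Ibsen</author>
  <title>The Wild Duck</title>
  <published>
    <place>London</place>
    <year>1890</year>
  </published>
</book>"""


def query(xml_text, author):
    """Return a <result> element holding the titles, or None if no match."""
    book = ET.fromstring(xml_text)
    # "where": true if any <author> child equals the given name.
    authors = [a.text for a in book.findall("author")]
    if book.tag != "book" or author not in authors:
        return None
    # "return": wrap copies of the matching <title> elements in <result>.
    result = ET.Element("result")
    for title in book.findall("title"):
        ET.SubElement(result, "title").text = title.text
    return result


hit = query(BOOK_XML, "Henrik Ibsen")
miss = query(BOOK_XML, "August Strindberg")
serialized = ET.tostring(hit, encoding="unicode")
print(serialized)
```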
Native XML database
1. Defines a (logical) model for an XML document -- as opposed to the data in that document -- and stores and retrieves documents according to that model.
2. Has an XML document as its fundamental unit of (logical) storage.
3. Is not required to have any particular underlying physical storage model.
SAG Tamino XML Server
1. Database schema based on W3C XML Schema.
2. The basic record-type is a well-formed XML document type. Different document types can be grouped into collections.
3. Stores data as serialised DOM objects.
Tamino System Architecture
ORACLE XML Database
• Includes a native XML datatype
• SQL operators on XML content
• Support for W3C XML Schema
• XML/SQL duality
– XML operations on relational data
– SQL operations on XML data
• Support for XPath and SQL/XML
• XML mapped to file/folder structure
ORACLE – storage options
ORACLE – Create statement
CREATE TABLE purchase_order_table
(
po_number NUMBER(16),
purchase_order XMLTYPE
)
ORACLE – Insert statement
INSERT INTO purchase_order_table
VALUES (1234, XMLTYPE(
  '<PurchaseOrder>
     <Reference>BLAKE-2002-015</Reference>
     <Actions/>
     <Reject/>
     <Requestor>David E. Blake</Requestor>
     <User>BLAKE</User>
     <CostCenter>S30</CostCenter>
   </PurchaseOrder>'
  )
)
ORACLE – Select statement
Database content:
<PurchaseOrder>
  <Reference>BLAKE-2002-015</Reference>
  <Actions/>
  <Reject/>
  <Requestor>David E. Blake</Requestor>
  <User>BLAKE</User>
  <CostCenter>S30</CostCenter>
</PurchaseOrder>
SQL query (the quoted path arguments are XPath expressions):
SELECT extractValue(p.purchase_order, '/PurchaseOrder/User')
FROM purchase_order_table p
WHERE existsNode(p.purchase_order, '/PurchaseOrder[CostCenter="S30"]') = 1
Result:
EXTRACTVALUE(P.PURCHASEORDER,'/PURCHASEORDER/USER')
---------------------------------------------------
BLAKE
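
The effect of the SELECT above can be imitated outside Oracle to see what the two XPath arguments do. The sketch below is not Oracle's API: it assumes SQLite with the XML kept in an ordinary TEXT column, and it applies the existsNode-style filter and the extractValue-style lookup in Python after fetching the rows.

```python
import sqlite3
import xml.etree.ElementTree as ET

PURCHASE_ORDER = """<PurchaseOrder>
  <Reference>BLAKE-2002-015</Reference>
  <Actions/>
  <Reject/>
  <Requestor>David E. Blake</Requestor>
  <User>BLAKE</User>
  <CostCenter>S30</CostCenter>
</PurchaseOrder>"""

conn = sqlite3.connect(":memory:")
# XML stored as plain text here; Oracle's XMLTYPE column is not emulated.
conn.execute(
    "CREATE TABLE purchase_order_table (po_number INTEGER, purchase_order TEXT)"
)
conn.execute(
    "INSERT INTO purchase_order_table VALUES (?, ?)", (1234, PURCHASE_ORDER)
)

users = []
for (xml_text,) in conn.execute("SELECT purchase_order FROM purchase_order_table"):
    order = ET.fromstring(xml_text)
    # Counterpart of existsNode(..., '/PurchaseOrder[CostCenter="S30"]') = 1
    if order.tag == "PurchaseOrder" and order.findtext("CostCenter") == "S30":
        # Counterpart of extractValue(..., '/PurchaseOrder/User')
        users.append(order.findtext("User"))

print(users)
```

In a database with a native XML type the filter runs inside the engine, which is what lets it use indexes on the XML content instead of parsing every row in the client.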
SQL 2003 and SQL/XML
• New ANSI/ISO SQL standard:
Information technology — Database
languages — SQL — Part 14: XML-Related
Specifications (SQL/XML). Final committee draft
– new XML type
– mapping between SQL- and XML-constructs
– functions for generating XML from SQL data.
XML database implementations
Ronald Bourret: XML Database Products
• Native XML databases
– 24 commercial
– 14 open source
• XML enabled databases
– mostly RDBMS
– 16 commercial
• XML Servers
– mostly based on RDBMS
– 19 commercial
– 5 open source
XML Databases at BIBSYS
• Software: Tamino XML Server from Software AG
– Native XML database
– Supports XML Schema, Namespaces, XPath, and XQuery
– Stores both XML and binary objects (images, video, and others)
– Communication based on HTTP
– Uses Apache web server as frontend
– Java API used for programming
• BIBSYS Galleri
– a database of images
• BIBSYS Subject Portal
– a metadata database for high quality web resources
BIBSYS Galleri
• A database of images
• Metadata in MARC wrapped in XML
• Images stored in JPEG format (non-XML data)
• Database schema generated from DTD (Document Type Definition)
• XPath used as query language
• XML content transformed using XSLT into HTML for presentation
• URL: http://bilde.bibsys.no
BIBSYS Galleri – XML format (extract)
<?xml version="1.0" encoding="iso-8859-1" ?>
<marc id="UBT-TO-004680A" type="BILDEMARC" utgave="1.0">
  <f012>
    <f012e>2002-04-27</f012e>
    <f012k>IJGR</f012k>
  </f012>
  <f096>
    <f096a>UBiT</f096a>
    <f096b>topografisk</f096b>
    <f096c>VII-Uhj md-004680A</f096c>
    <f096f>Prospektkort s/h</f096f>
  </f096>
  <f100>
    <f100a>Hovde, L.E.</f100a>
    <f100c>forlag</f100c>
  </f100>
  <f245>
    <f245a>Olav Tryggvasons gate fra Bakke bro med trikk og hest.</f245a>
  </f245>
</marc>
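
Since each MARC field and subfield becomes its own element, simple path expressions are enough to pull values out of such a record. The following sketch assumes Python's xml.etree.ElementTree and a shortened copy of the record; it does not reflect how Tamino evaluates XPath internally.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record above (XML declaration omitted).
MARC_XML = """<marc id="UBT-TO-004680A" type="BILDEMARC" utgave="1.0">
  <f096>
    <f096a>UBiT</f096a>
    <f096f>Prospektkort s/h</f096f>
  </f096>
  <f100>
    <f100a>Hovde, L.E.</f100a>
    <f100c>forlag</f100c>
  </f100>
  <f245>
    <f245a>Olav Tryggvasons gate fra Bakke bro med trikk og hest.</f245a>
  </f245>
</marc>"""

record = ET.fromstring(MARC_XML)

# field/subfield paths, relative to the <marc> root element
image_title = record.findtext("f245/f245a")
creator = record.findtext("f100/f100a")
record_id = record.get("id")

# Subfields that are absent simply yield None.
missing = record.findtext("f012/f012e")

print(record_id, creator, image_title)
```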
BIBSYS Subject Portal
• Metadata for high quality web resources
• Uses a subject hierarchy based on Dewey
• Data in XML using several namespaces
• Database schema written as several XML Schemas, one for each namespace
• XQuery (working draft from 2002-08-16) used as query language
• URL: http://emneportal.bibsys.no
BIBSYS Subject Portal – XML format (extract)
<ep:eprecord xmlns:ep="http://www.bibsys.no/eprecord/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:epres="http://www.bibsys.no/res_type/"
    xmlns:epadm="http://www.bibsys.no/epadm/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    recid="983870854-8343">
  <ep:head>
    <epadm:identifier>983870854-8343</epadm:identifier>
    <epadm:status>godkjent</epadm:status>
  </ep:head>
  <ep:biblio>
    <dc:title>Cognitive Psychology Online Laboratory : CogLab</dc:title>
    <dc:type xsi:type="epres:resource">BM</dc:type>
    <ep:uriandaccess>
      <dc:identifier xsi:type="dcterms:URI">http://coglab.wadsworth.com/</dc:identifier>
      <dcterms:accessRights>FREEE</dcterms:accessRights>
    </ep:uriandaccess>
  </ep:biblio>
  <ep:admin>
    <epadm:created>2001-03-06</epadm:created>
  </ep:admin>
</ep:eprecord>
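
Querying a record that mixes several namespaces means every step of a path has to be qualified. As a sketch, assuming Python's xml.etree.ElementTree and a shortened copy of the record, a prefix-to-URI map makes the paths readable; the prefixes in the map are local to the program and need not match those in the document.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the record above.
EP_XML = """<ep:eprecord xmlns:ep="http://www.bibsys.no/eprecord/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:epadm="http://www.bibsys.no/epadm/"
    recid="983870854-8343">
  <ep:head>
    <epadm:identifier>983870854-8343</epadm:identifier>
    <epadm:status>godkjent</epadm:status>
  </ep:head>
  <ep:biblio>
    <dc:title>Cognitive Psychology Online Laboratory : CogLab</dc:title>
    <ep:uriandaccess>
      <dc:identifier>http://coglab.wadsworth.com/</dc:identifier>
    </ep:uriandaccess>
  </ep:biblio>
</ep:eprecord>"""

# Prefix-to-URI map used when resolving the qualified paths below.
NS = {
    "ep": "http://www.bibsys.no/eprecord/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "epadm": "http://www.bibsys.no/epadm/",
}

record = ET.fromstring(EP_XML)

resource_title = record.findtext("ep:biblio/dc:title", namespaces=NS)
resource_url = record.findtext(
    "ep:biblio/ep:uriandaccess/dc:identifier", namespaces=NS
)
status = record.findtext("ep:head/epadm:status", namespaces=NS)

print(resource_title, resource_url, status)
```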
Conclusion
• Two main kinds of XML databases
– native
– RDBMS with XML extensions
• XML databases are well suited for storing hierarchical structures
• Work is in progress to join SQL and XML based functionality
• In future, DBMSs will handle relational data and XML-based data equally well
• Yes, XML databases do indeed exist!