ELAG 2005 at CERN, Geneva
XML Databases – do they really exist?
Jan Erik Kofoed, BIBSYS Library Automation
2005-06-03

Design goals for XML

1. XML shall be straightforwardly usable over the Internet.
2. XML shall support a wide variety of applications.
3. XML shall be compatible with SGML.
4. It shall be easy to write programs which process XML documents.
5. The number of optional features in XML is to be kept to the absolute minimum, ideally zero.
6. XML documents should be human-legible and reasonably clear.
7. The XML design should be prepared quickly.
8. The design of XML shall be formal and concise.
9. XML documents shall be easy to create.
10. Terseness in XML markup is of minimal importance.

A simple example

    <?xml version="1.0" encoding="UTF-8"?>
    <book id="231456">
      <author>Henrik Ibsen</author>
      <title>The Wild Duck</title>
      <published>
        <place>London</place>
        <year>1890</year>
      </published>
    </book>

A relational model for the example

    book                        published
    ----                        ---------
    id (primary key)            publ-id (primary key)
    publ-id (foreign key)       place
    author                      year
    title

Realisation as tables

    book
    id       publ-id   author         title
    231456   0001      Henrik Ibsen   The Wild Duck

    published
    publ-id   place    year
    0001      London   1890

Realisation as DOM (Document Object Model)

    book                      element node
      id = 231456             attribute node
      author                  element node
        Henrik Ibsen          text node
      title                   element node
        The Wild Duck         text node
      published               element node
        place                 element node
          London              text node
        year                  element node
          1890                text node

Important W3C XML technologies

• XML Schema
  – defining database schema, instance validation
• XPath
  – query expressions
  – addressing content
• XQuery
  – a new query language for XML
  – based on XPath and the XML Schema type hierarchy
• Namespaces in XML
  – qualification of content

XML Schema – text representation

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="book">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="author" type="xs:string"/>
            <xs:element name="title" type="xs:string"/>
            <xs:element name="published">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="place" type="xs:string"/>
                  <xs:element name="year" type="xs:string"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
          <xs:attribute name="id" use="required" type="xs:int"/>
        </xs:complexType>
      </xs:element>
    </xs:schema>

XML Schema – graphical representation

[Figure: graphical representation of the book schema]

Assigning a schema to an XML document

    <?xml version="1.0" encoding="UTF-8"?>
    <book id="231456"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="book.xsd">
      <author>Henrik Ibsen</author>
      <title>The Wild Duck</title>
      <published>
        <place>London</place>
        <year>1890</year>
      </published>
    </book>

XQuery

The query:

    for $query in doc("book.xml")
    where $query/book/author = "Henrik Ibsen"
    return
      <result>
        {$query/book/title}
      </result>

(the path expressions, such as $query/book/title, are XPath)

gives this result:

    <result>
      <title>The Wild Duck</title>
    </result>

Native XML database

1. Defines a (logical) model for an XML document (as opposed to the data in that document) and stores and retrieves documents according to that model.
2. Has an XML document as its fundamental unit of (logical) storage.
3. Is not required to have any particular underlying physical storage model.

SAG Tamino XML Server

1. Database schema based on W3C XML Schema.
2. The basic record type is a well-formed XML document type. Different document types can be grouped into collections.
3. Stores data as serialised DOM objects.
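
The XQuery slide above selects the title of every book whose author is "Henrik Ibsen". As a rough illustration outside any XML database, the same selection can be sketched with the limited XPath subset in Python's standard library. This is only a sketch under stated assumptions: the `<library>` wrapper element and the function name `titles_by_author` are hypothetical and not part of the talk, and ElementTree does not implement XQuery.

```python
import xml.etree.ElementTree as ET

# The book document from the "A simple example" slide, wrapped in a
# hypothetical <library> element so that the path can start at <book>.
LIBRARY_XML = """\
<library>
  <book id="231456">
    <author>Henrik Ibsen</author>
    <title>The Wild Duck</title>
    <published>
      <place>London</place>
      <year>1890</year>
    </published>
  </book>
</library>
"""


def titles_by_author(xml_text, author):
    """Return the <title> text of every <book> whose <author> equals `author`."""
    root = ET.fromstring(xml_text)
    # ElementTree's [tag='text'] predicate matches on the text of a child element.
    # Assumption: `author` contains no single quote (no escaping is done here).
    path = "book[author='{}']/title".format(author)
    return [title.text for title in root.findall(path)]


print(titles_by_author(LIBRARY_XML, "Henrik Ibsen"))  # ['The Wild Duck']
```

The predicate plays the role of the `where` clause and the trailing `/title` step plays the role of the `return` expression in the slide's query.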

Tamino System Architecture

[Figure: Tamino system architecture]

ORACLE XML Database

• Includes a native XML datatype
• SQL operators on XML content
• Support for W3C XML Schema
• XML/SQL duality
  – XML operations on relational data
  – SQL operations on XML data
• Support for XPath and SQL/XML
• XML mapped to file/folder structure

ORACLE – storage options

[Figure: ORACLE XML storage options]

ORACLE – Create statement

    CREATE TABLE purchase_order_table (
      po_number      NUMBER(16),
      purchase_order XMLTYPE
    )

ORACLE – Insert statement

    INSERT INTO purchase_order_table
    VALUES (1234,
      XMLTYPE(
        '<PurchaseOrder>
           <Reference>BLAKE-2002-015</Reference>
           <Actions/>
           <Reject/>
           <Requestor>David E. Blake</Requestor>
           <User>BLAKE</User>
           <CostCenter>S30</CostCenter>
         </PurchaseOrder>'
      )
    )

ORACLE – Select statement

Database content:

    <PurchaseOrder>
      <Reference>BLAKE-2002-015</Reference>
      <Actions/>
      <Reject/>
      <Requestor>David E. Blake</Requestor>
      <User>BLAKE</User>
      <CostCenter>S30</CostCenter>
    </PurchaseOrder>

SQL query (the quoted path expressions are XPath):

    SELECT extractValue(p.purchase_order, '/PurchaseOrder/User')
    FROM purchase_order_table p
    WHERE existsNode(p.purchase_order,
                     '/PurchaseOrder[CostCenter="S30"]') = 1

Result:

    EXTRACTVALUE(P.PURCHASEORDER,'/PURCHASEORDER/USER')
    ---------------------------------------------------
    BLAKE

SQL 2003 and SQL/XML

• New ANSI/ISO SQL standard: Information technology – Database languages – SQL – Part 14: XML-Related Specifications (SQL/XML). Final committee draft.
  – new XML type
  – mapping between SQL and XML constructs
  – functions for generating XML from SQL data
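
The ORACLE slides above store an XML document in a table column and query inside it from SQL. The idea can be imitated, very loosely, with SQLite and a user-defined SQL function. This is a sketch under assumptions and not Oracle's implementation: SQLite has no XMLTYPE, so the document is kept in a TEXT column, and `xml_child_text` is a hypothetical helper standing in for `extractValue` and `existsNode` that only reads direct children of the root element.

```python
import sqlite3
import xml.etree.ElementTree as ET

# The purchase order from the "ORACLE - Insert statement" slide.
PURCHASE_ORDER = """\
<PurchaseOrder>
  <Reference>BLAKE-2002-015</Reference>
  <Actions/>
  <Reject/>
  <Requestor>David E. Blake</Requestor>
  <User>BLAKE</User>
  <CostCenter>S30</CostCenter>
</PurchaseOrder>"""


def xml_child_text(xml_text, child_tag):
    """Hypothetical helper: text of a direct child of the root element, or None."""
    return ET.fromstring(xml_text).findtext(child_tag)


def find_users_by_cost_center(cost_center):
    """Return the <User> of every stored purchase order with the given cost center."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python helper to SQL under the same (hypothetical) name.
    conn.create_function("xml_child_text", 2, xml_child_text)
    conn.execute(
        "CREATE TABLE purchase_order_table (po_number INTEGER, purchase_order TEXT)"
    )
    conn.execute(
        "INSERT INTO purchase_order_table VALUES (?, ?)", (1234, PURCHASE_ORDER)
    )
    rows = conn.execute(
        "SELECT xml_child_text(purchase_order, 'User') "
        "FROM purchase_order_table "
        "WHERE xml_child_text(purchase_order, 'CostCenter') = ?",
        (cost_center,),
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


print(find_users_by_cost_center("S30"))  # ['BLAKE']
```

Unlike a database with a native XML type, this stand-in reparses the document for every row and cannot use an index on the XML content.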

XML database implementations

Ronald Bourret: XML Database Products

• Native XML databases
  – 24 commercial
  – 14 open source
• XML-enabled databases (mostly RDBMS)
  – 16 commercial
• XML servers (mostly based on RDBMS)
  – 19 commercial
  – 5 open source

XML Databases at BIBSYS

• Software: Tamino XML Server from Software AG
  – Native XML database
  – Supports XML Schema, Namespaces, XPath, and XQuery
  – Stores both XML and binary objects (images, video, etc.)
  – Communication based on HTTP
  – Uses Apache web server as frontend
  – Java API used for programming
• BIBSYS Galleri – a database of images
• BIBSYS Subject Portal – a metadata database for high-quality web resources

BIBSYS Galleri

• A database of images
• Metadata in MARC wrapped in XML
• Images stored in JPEG format (non-XML data)
• Database schema generated from DTD (Document Type Definition)
• XPath used as query language
• XML content transformed using XSLT into HTML for presentation
• URL: http://bilde.bibsys.no

BIBSYS Galleri – XML format (extract)

    <?xml version="1.0" encoding="iso-8859-1" ?>
    <marc id="UBT-TO-004680A" type="BILDEMARC" utgave="1.0">
      <f012>
        <f012e>2002-04-27</f012e>
        <f012k>IJGR</f012k>
      </f012>
      <f096>
        <f096a>UBiT</f096a>
        <f096b>topografisk</f096b>
        <f096c>VII-Uhj md-004680A</f096c>
        <f096f>Prospektkort s/h</f096f>
      </f096>
      <f100>
        <f100a>Hovde, L.E.</f100a>
        <f100c>forlag</f100c>
      </f100>
      <f245>
        <f245a>Olav Tryggvasons gate fra Bakke bro med trikk og hest.</f245a>
      </f245>
    </marc>

BIBSYS Subject Portal

• Metadata for high-quality web resources
• Uses a subject hierarchy based on Dewey
• Data in XML using several namespaces
• Database schema written as several XML Schemas, one for each namespace
• XQuery (working draft from 2002-08-16) used as query language
• URL: http://emneportal.bibsys.no

BIBSYS Subject Portal – XML format (extract)

    <ep:eprecord
        xmlns:ep="http://www.bibsys.no/eprecord/"
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:dcterms="http://purl.org/dc/terms/"
        xmlns:epres="http://www.bibsys.no/res_type/"
        xmlns:epadm="http://www.bibsys.no/epadm/"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        recid="983870854-8343">
      <ep:head>
        <epadm:identifier>983870854-8343</epadm:identifier>
        <epadm:status>godkjent</epadm:status>
      </ep:head>
      <ep:biblio>
        <dc:title>Cognitive Psychology Online Laboratory : CogLab</dc:title>
        <dc:type xsi:type="epres:resource">BM</dc:type>
        <ep:uriandaccess>
          <dc:identifier xsi:type="dcterms:URI">http://coglab.wadsworth.com/</dc:identifier>
          <dcterms:accessRights>FREEE</dcterms:accessRights>
        </ep:uriandaccess>
      </ep:biblio>
      <ep:admin>
        <epadm:created>2001-03-06</epadm:created>
      </ep:admin>
    </ep:eprecord>

Conclusion

• Two main kinds of XML databases
  – native
  – RDBMS with extension
• XML databases are well suited for storing hierarchical structures
• Work is in progress to join SQL-based and XML-based functionality
• In future, DBMSs will handle relational data and XML-based data equally well
• Yes, XML databases do indeed exist!
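
The convergence of SQL-based and XML-based functionality mentioned in the conclusion can be illustrated with the talk's own book example: the two tables from the "Realisation as tables" slide are joined and each row is turned back into the XML document from "A simple example". This is a minimal sketch under assumptions, not SQL/XML itself: it uses SQLite and Python's ElementTree in place of the standard's XML-generating functions, the column name `publ_id` replaces the slide's `publ-id` (a hyphen is not valid in an unquoted SQL identifier), and `book_rows_to_xml` is a hypothetical name.

```python
import sqlite3
import xml.etree.ElementTree as ET


def book_rows_to_xml(conn):
    """Build one <book> element per joined row of the book and published tables."""
    rows = conn.execute(
        "SELECT b.id, b.author, b.title, p.place, p.year "
        "FROM book b JOIN published p ON b.publ_id = p.publ_id"
    )
    books = []
    for book_id, author, title, place, year in rows:
        book = ET.Element("book", {"id": str(book_id)})
        ET.SubElement(book, "author").text = author
        ET.SubElement(book, "title").text = title
        published = ET.SubElement(book, "published")
        ET.SubElement(published, "place").text = place
        ET.SubElement(published, "year").text = year
        books.append(book)
    return books


# Recreate the two tables and the single row from the slides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE published (publ_id TEXT PRIMARY KEY, place TEXT, year TEXT);
    CREATE TABLE book (
        id INTEGER PRIMARY KEY,
        publ_id TEXT REFERENCES published (publ_id),
        author TEXT,
        title TEXT
    );
    INSERT INTO published VALUES ('0001', 'London', '1890');
    INSERT INTO book VALUES (231456, '0001', 'Henrik Ibsen', 'The Wild Duck');
""")

for book in book_rows_to_xml(conn):
    print(ET.tostring(book, encoding="unicode"))
```

The nesting of `<published>` inside `<book>` is what the join reconstructs; in the relational form that hierarchy is only implied by the foreign key.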