The Semistructured-Data Model
Programming Languages for XML
Spring 2011
Instructor: Hassan Khosravi
Semistructured Data
• Another data model, based on trees.
• Self-describing:
    ◦ The data implicitly carries information about what its schema is.
    ◦ May carry only the names of attributes (so possibly untyped), and has a lower degree of organization than the data in a relational database.
    ◦ May have no associated schema (i.e., may be schema-less).
• Motivation:
    ◦ Flexible representation of data.
    ◦ Sharing of documents among systems and databases.
    ◦ Information integration
        – E.g., we want to "merge" or query two databases.
    ◦ Data exchange
        – E.g., two enterprises (such as buyers and sellers) may want to exchange data.
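As a rough illustration of "self-describing" and "schema-less", the sketch below stores two records as nested Python dictionaries and recovers each record's structure from the data itself. The records, field names, and the helper `attribute_paths` are hypothetical and chosen only for this example.

```python
# Hypothetical records: each one carries its own attribute names,
# and the two records do not share a fixed set of fields.
people = [
    {"name": "Alice", "email": "alice@example.com"},
    {"name": "Bob", "phone": {"home": "555-0100", "work": "555-0101"}},
]


def attribute_paths(node, prefix=""):
    """Yield the path to every leaf value in a nested dict (a tree)."""
    for key, value in node.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            # Nested structure: descend one level in the tree.
            yield from attribute_paths(value, path)
        else:
            yield path


# The "schema" is read off the data, one record at a time.
schemas = [sorted(attribute_paths(person)) for person in people]
print(schemas)
```

No table definition is needed before the second record introduces a nested `phone` field, which is the flexibility the slide refers to.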
Semistructured Data Representation
                  Relational             Semistructured
Structure         Tables                 Hierarchical tree, graph
Schema            Fixed in advance       Flexible, self-describing
Queries           Simple nice language   Less so
Ordering          None (has order by)    Implied
Implementation    Mature and native      Add-on
Comparison with Relational Data
• Inefficient: tags, which in effect represent schema information, are repeated.
• Access: data is structured hierarchically.
• Better than relational tuples as a data-exchange format:
    ◦ Unlike relational tuples, semistructured data is self-documenting due to the presence of tags.
    ◦ Flexible, non-rigid format: tags can be added.
    ◦ Allows nested structures.
    ◦ Wide acceptance, not only in database systems, but also in browsers, tools, and applications.
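To make the "tags are repeated" point concrete, here is a small sketch that writes the same rows once as relational-style tuples and once as XML, using Python's standard `xml.etree.ElementTree`. The column names and row values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical relation: the schema is stated once, outside the tuples.
columns = ("title", "author", "price")
rows = [
    ("Book A", "Author X", "40"),
    ("Book B", "Author Y", "85"),
]

# The same data as XML: every record repeats its own tag names.
root = ET.Element("books")
for row in rows:
    book = ET.SubElement(root, "book")
    for column, value in zip(columns, row):
        ET.SubElement(book, column).text = value

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Each tag name appears once per record, not once per table.
title_tag_count = xml_text.count("<title>")
print(title_tag_count)
```

The repetition costs space, but it is also what makes each record readable on its own when it is shipped to another system.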
Flexibility in Schema
XML
• XML: Extensible Markup Language
    ◦ A standard adopted in 1998.
• While HTML uses tags for formatting (e.g., "italic"), XML uses tags for semantics (e.g., indicating "this is an address" or "this is a title").
• Key idea: create tag sets for a domain (e.g., genomics), and translate all data into properly tagged XML documents.
• There are two different modes of use of XML:
    ◦ Well-formed XML allows you to invent your own tags.
        – No predefined schema.
    ◦ Valid XML conforms to a certain Document Type Definition (DTD).
        – The DTD describes allowable tags and their nesting.
        – But still reasonably flexible, e.g. it may allow optional or missing fields.
Well-Formed XML
• Begins with a declaration that it is XML.
• It has a root element that is the entire body of the text.
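A minimal sketch of the two rules above, assuming Python's standard `xml.etree.ElementTree` parser is an acceptable stand-in for "a well-formedness check". The tag names and the helper `is_well_formed` are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with invented tags: a declaration, then one root element.
well_formed = """<?xml version="1.0"?>
<library>
  <book><title>Book A</title></book>
</library>"""

# Same content, but the <book> element is never closed.
malformed = """<?xml version="1.0"?>
<library>
  <book><title>Book A</title>
</library>"""


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(well_formed))
print(is_well_formed(malformed))
```

This only checks well-formedness (proper nesting under a single root); it says nothing about whether the tags follow any schema.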
Valid XML
• Document Type Definition (DTD)
    ◦ Grammar-like language for specifying elements, attributes, nesting, ordering, and number of occurrences.
    ◦ Special attribute types ID and IDREF.
    ◦ Example
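The slide's original example is not in this transcript. As a stand-in, the sketch below embeds a small hypothetical DTD in a document and follows an IDREF to its ID by hand. All element and attribute names are assumptions made for illustration. Note that Python's standard `xml.etree.ElementTree` reads the DOCTYPE but does not validate against it, so this shows what a DTD looks like rather than how validation is enforced.

```python
import xml.etree.ElementTree as ET

# Hypothetical document with an internal DTD. The DTD says that <library>
# holds any number of <author> elements followed by any number of <book>
# elements, that each author has an ID, and that each book names its
# author through an IDREF.
document = """<?xml version="1.0"?>
<!DOCTYPE library [
  <!ELEMENT library (author*, book*)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT book (title)>
  <!ELEMENT title (#PCDATA)>
  <!ATTLIST author id ID #REQUIRED>
  <!ATTLIST book writer IDREF #REQUIRED>
]>
<library>
  <author id="a1">Author X</author>
  <book writer="a1"><title>Book A</title></book>
</library>"""

root = ET.fromstring(document)

# Resolve each IDREF manually: map author IDs to names, then look up
# the writer attribute of every book.
authors_by_id = {a.get("id"): a.text for a in root.iter("author")}
book_authors = {
    b.findtext("title"): authors_by_id[b.get("writer")]
    for b in root.iter("book")
}
print(book_authors)
```

A validating parser would additionally reject a document whose `writer` value matched no `id`, or whose elements appeared in an order the DTD does not allow.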
QUERYING SEMISTRUCTURED DATA
Querying XML
• Not nearly as mature as querying relational data
    ◦ Newer
    ◦ No underlying theory as in the relational model
• Sequence of development
    ◦ XPath: path expressions + conditions
    ◦ XQuery: XPath + full-featured query language
XPath
• Think of XML as a tree
    ◦ Path expressions + conditions
XPath Syntax
• /  →  root element
• Name of an element, e.g. "book"
• Use * as the name to match everything
• @ISBN  →  the attribute ISBN
• //  →  matches all descendants
• Conditions, e.g. [@price < 50]
• [N]  →  nth child, e.g. author[2]
• Axes (to navigate around the tree; 13 axes), e.g.
    ◦ parent::
    ◦ following-sibling::
    ◦ descendant::
    ◦ self::
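Several of these constructs can be tried with Python's standard `xml.etree.ElementTree`, which implements only a subset of XPath: element names, `*`, `//`, `[N]`, and attribute equality tests, but not comparisons such as `[@price < 50]` and not named axes. The sketch below uses a hypothetical catalog and falls back to plain Python where the subset runs out.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog used only for illustration.
catalog = ET.fromstring("""<catalog>
  <book ISBN="111" price="40">
    <title>Book A</title><author>Author X</author><author>Author Y</author>
  </book>
  <book ISBN="222" price="85">
    <title>Book B</title><author>Author Z</author>
  </book>
</catalog>""")

# Element name: the <book> children of the root.
books = catalog.findall("book")

# //: all <author> descendants at any depth.
all_authors = [a.text for a in catalog.findall(".//author")]

# *: every child of each book, whatever its name.
child_tags = [c.tag for c in catalog.findall("book/*")]

# [N]: the second <author> child of a book.
second_authors = [a.text for a in catalog.findall("book/author[2]")]

# @ISBN: attribute test (ElementTree supports equality only).
isbn_222_title = catalog.find("book[@ISBN='222']").findtext("title")

# ElementTree cannot evaluate [@price < 50], so the comparison is done in Python.
cheap_titles = [b.findtext("title") for b in books if float(b.get("price")) < 50]

print(all_authors, child_tags, second_authors, isbn_222_title, cheap_titles)
```

A full XPath processor would accept the condition and the axes directly inside the path expression.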
XPath Demo
• Example
XQuery
XQuery Demo
• Example