An Extension to XML Schema for Structured Data Processing
Presented by: Jacky Ma
Date: 10 April 2002

Presentation Outline

- The Problems
- Research Objectives
- The Schema Extension: MMX
- MMX Query System
- Discussion
- Conclusion

The Problems

- Mapping XML data into relational tables
  - Not natural to the XML structure
  - Efficient, but may not be an effective method
- Legacy application-specific structured data
  - Similar modeling but proprietary implementations
  - Not interoperable, and difficult to maintain
  - Lacks modular design, and is thus difficult to combine into more complex data structures
- Meta-data can serve a wide range of needs, yet XML Schema is currently used solely for physical data validation

Research Objectives

- To facilitate more effective searching and storing of XML content by making use of meta-data (XML Schema)
- To propose a data-oriented model that allows different storage mechanisms, processing models, and query models on XML content

Our Approach – MMX

- Use meta-data to map XML data into structured data objects
- Define the structured data models “conceptually” and link the models to the XML document structure “syntactically”
- Meta-data is defined as an extension of XML Schema
- The extension is called MMX (Multi Model XML)

Program Driven vs. Data Driven

[Diagram: a spectrum from program driven to data driven]
- Program Driven: information for processing is hard-coded in the program
- Raw Data
- Structured Data (XML)
- Data with Modeling Information (MMX!)
- Data with Program Codes
- Data Driven: processing instruction is hard-coded in data?!

A Glance at XML Data
A Glance at the Linked Schema

Schema Extension

- The extended schema is associated with a namespace
- The extended schema goes within a schema element, like <tree:element> in the example
  - <tree:element> specifies a single structure object instance
  - Name association for elements and attributes
  - Class hierarchies:
    - <tree:element> -> <tree:internal> -> <tree:leafNode>
    - finally to the structure specified in <tree:leafNodeValue>
    - Additional properties in <rootNodeAttr>, <internalNodeAttr> and <leafNodeAttr>
- The schema writer has to know the structure model specification, while the XML writer only needs to know the given schema

Modeling

- An instance of an “MMX data object” can be viewed:
  - As an encapsulated information object accessible only from its root, and thus as a “single tree node”
  - As a mapping from the root node, query method, and query parameters to the values at the leaf nodes
  - Leaf nodes may contain any valid XML content, as long as it is defined in the Schema
    - I.e., a leaf may contain another “MMX data object”
- A query is modeled as a 3-tuple:
  - [accessing-node, query-method, query-parameters]
  - Accessing-node is specified by an XPath expression
  - Query-method is specified as a string value
  - Query-parameters may be multi-dimensional, depending on the current model

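The query tuple can be illustrated with a short sketch. This is a minimal Python illustration, not the presenter's implementation (which targets JDK 1.4). The names `Query`, `MMXObject`, `execute`, and the `lookup` method are hypothetical, and the XPath is kept as an opaque string rather than evaluated.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Query:
    """The 3-tuple [accessing-node, query-method, query-parameters]."""
    accessing_node: str                 # XPath of the object's root (plain string here)
    query_method: str                   # method name as a string value
    query_parameters: Tuple[Any, ...]   # arity depends on the model


class MMXObject:
    """Hypothetical wrapper: an encapsulated object reachable only through its root."""

    def __init__(self, root_xpath: str, methods: Dict[str, Callable[..., Any]]):
        self.root_xpath = root_xpath
        self._methods = methods         # query-method name -> function over the parameters

    def execute(self, query: Query) -> Any:
        if query.accessing_node != self.root_xpath:
            raise ValueError("object is only accessible from its root node")
        if query.query_method not in self._methods:
            raise KeyError(f"unsupported query method: {query.query_method}")
        return self._methods[query.query_method](*query.query_parameters)


# Toy model (an assumption for illustration): leaf values stored by key.
leaves = {"a": "<item>first</item>", "b": "<item>second</item>"}
obj = MMXObject("/doc/tree", {"lookup": lambda key: leaves[key]})
result = obj.execute(Query("/doc/tree", "lookup", ("b",)))
```

Here the object behaves as a mapping from (root node, method, parameters) to a leaf value, and any attempt to address it through a node other than its root is rejected.
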
Modeling (2)

[Diagram labels: A, Tree(1), B, XML Elements, Tree(2)]

Tree(1) is accessible from point A. A query on it (e.g. [A, “spatial-search”, (3, 5)], assuming Tree(1) accepts a spatial search with two coordinates) may return point B as the answer, either as the XPath of B or as the XML subtree of B. From point B, the user may drill down further by issuing another query on Tree(2).

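The drill-down from Tree(1) into Tree(2) can be sketched as nested objects. This is an illustrative Python sketch under stated assumptions: the class `PointTree`, its exact-match behaviour for `spatial-search`, and every coordinate other than (3, 5) are hypothetical. The slide does not say how Tree(1) resolves the search.

```python
from typing import Dict, Optional, Tuple, Union

Point = Tuple[int, int]


class PointTree:
    """Hypothetical MMX object: maps 2-D coordinates to leaf content.

    A leaf holds either plain XML text or another PointTree, so a query
    result can itself be queried (the drill-down from B into Tree(2)).
    """

    def __init__(self, leaves: Dict[Point, Union[str, "PointTree"]]):
        self._leaves = leaves

    def query(self, method: str, params: Point) -> Optional[Union[str, "PointTree"]]:
        if method != "spatial-search":
            raise KeyError(f"unsupported query method: {method}")
        return self._leaves.get(params)  # exact-match lookup; None if absent


# Tree(2) hangs below point B, which is a leaf of Tree(1).
tree2 = PointTree({(1, 1): "<name>leaf in Tree(2)</name>"})
tree1 = PointTree({(3, 5): tree2, (0, 0): "<name>plain leaf</name>"})

b = tree1.query("spatial-search", (3, 5))   # the query [A, "spatial-search", (3, 5)]
inner = b.query("spatial-search", (1, 1))   # second query issued from point B
```
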
Query with and without MMX

- From the original XML data alone, we cannot assume the semantics of the data:
  - We can ONLY run XML-based queries such as XPath
  - We can run a spatial query ONLY IF we can map the data into an R-Tree
- After mapping the data into an R-Tree:
  - Spatial queries
    - Give me the point at (2,7)
    - Give me the point nearest to (4,4)
  - Nearest neighbor search
    - Give me the point nearest to “Franklin”

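The three example queries can be made concrete with a small sketch. The Python code below uses a linear scan as a stand-in for an R-Tree, so it shows only what the queries mean, not how an index would answer them efficiently. The point names other than "Franklin" and all coordinates are invented sample data.

```python
import math
from typing import Dict, Optional, Tuple

Point = Tuple[float, float]

# Hypothetical sample data: named points as they might be extracted from the XML.
points: Dict[str, Point] = {
    "Franklin": (2.0, 7.0),
    "Adams": (5.0, 4.0),
    "Lincoln": (1.0, 6.0),
}


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def point_at(target: Point) -> Optional[str]:
    """Exact-match spatial query: name of the point at `target`, or None."""
    for name, coords in points.items():
        if coords == target:
            return name
    return None


def nearest_to(target: Point, exclude: Optional[str] = None) -> str:
    """Nearest-neighbor query by linear scan (not an R-Tree)."""
    candidates = {n: c for n, c in points.items() if n != exclude}
    return min(candidates, key=lambda n: distance(candidates[n], target))


def nearest_to_name(name: str) -> str:
    """Nearest neighbor of a named point, excluding the point itself."""
    return nearest_to(points[name], exclude=name)
```

With this sample data, `point_at((2, 7))` answers the first query, `nearest_to((4, 4))` the second, and `nearest_to_name("Franklin")` the third. None of them can be expressed without knowing that the XML values are coordinates.
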
Processing

- Users might not know the “type” of a node (and do not need to); they are interested in what they can do with it
- Users retrieve the list of possible operations by issuing a LIST-OPERATION method to the root element of an MMX object
- Possible operations may include queries, updates, and other model-specific operations

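Operation discovery can be sketched from the client's side. In the Python sketch below, the class `MMXNode`, the `invoke` method, and the operation name `UPDATE` are hypothetical. Only `LIST-OPERATION` and `QUERY` are names that appear in the slides.

```python
from typing import Any, Callable, Dict, List


class MMXNode:
    """Hypothetical root element of an MMX object with a table of named operations."""

    def __init__(self, operations: Dict[str, Callable[..., Any]]):
        self._operations = dict(operations)

    def invoke(self, method: str, *params: Any) -> Any:
        if method == "LIST-OPERATION":
            return sorted(self._operations)  # names only; the node type stays hidden
        return self._operations[method](*params)


def describe(node: MMXNode) -> List[str]:
    """Client code: asks the node what it can do instead of checking its type."""
    return node.invoke("LIST-OPERATION")


store = {"k": 1}
node = MMXNode({
    "QUERY": lambda key: store[key],
    "UPDATE": lambda key, value: store.__setitem__(key, value),
})
available = describe(node)
if "UPDATE" in available:
    node.invoke("UPDATE", "k", 42)
```

The client never inspects the underlying model; it branches only on the names returned by LIST-OPERATION.
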
MMX Query System

- To show that the schema, modeling, and processing of the MMX extension are workable
- To illustrate how it assists in querying XML data
- To serve as a platform for testing implementations of arbitrary structured models
- Implemented with JDK 1.4

System Design

[Diagram labels: XML, DOM, Node Data, Schema, Parse Schema, Fetch Classes, MMX Element, Abstract MMX Element, Extends class, (Partly) Defines, Maps, R-Tree, VP-Tree, X-Tree, …, MMX Document, Clients]

The Abstract Class defines the common interface that each MMX Element has to implement, such as LIST-OPERATION, QUERY, BUILD, etc.

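The role of the abstract class can be sketched as follows. The original system is implemented in Java; this Python version only illustrates the shape of such an interface. The method signatures, the class names, and the toy `SortedListElement` model with its `range` query are assumptions, not taken from the slides.

```python
from abc import ABC, abstractmethod
from typing import Any, List, Sequence, Tuple


class AbstractMMXElement(ABC):
    """Common interface every concrete MMX element would implement."""

    @abstractmethod
    def build(self, node_data: Sequence[Any]) -> None:
        """BUILD: construct the in-memory structure from parsed node data."""

    @abstractmethod
    def query(self, method: str, params: Tuple[Any, ...]) -> Any:
        """QUERY: run a model-specific query."""

    @abstractmethod
    def list_operation(self) -> List[str]:
        """LIST-OPERATION: names of the operations this element supports."""


class SortedListElement(AbstractMMXElement):
    """Toy stand-in for a real model such as an R-Tree (assumption, not from the slides)."""

    def __init__(self) -> None:
        self._items: List[int] = []

    def build(self, node_data: Sequence[int]) -> None:
        self._items = sorted(node_data)

    def query(self, method: str, params: Tuple[Any, ...]) -> Any:
        if method == "range":
            low, high = params
            return [x for x in self._items if low <= x <= high]
        raise KeyError(f"unsupported query method: {method}")

    def list_operation(self) -> List[str]:
        return ["BUILD", "QUERY", "LIST-OPERATION"]


element = SortedListElement()
element.build([7, 2, 9, 4])
in_range = element.query("range", (3, 8))
```

Because the base class is abstract, a new structure model only has to supply these three methods to plug into the same client code.
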
Discussion - Pros

- Compatible with the relational approach, and supersedes it
- Modular design promotes reusability and maintainability
- XML “flattens” the legacy structured data, making it text-editable and easy to transport and process across different systems

Discussion - Cons

- There is no generic syntax to precisely describe all kinds of structure models
- The XML file is often larger than the legacy data file
- Each structure model needs additional implementation effort
- The schema specification quickly grows longer as the number of supported models increases

Conclusion

- Propose a representation to encapsulate data structures
- Describe XML data with the Schema conceptually as well as syntactically
- Map legacy structure models into the Schema, and map XML data to the structure models through the Schema
- A structured data repository with increased interoperability, reusability, and transportability

Q&A