An Extension to XML Schema for Structured Data Processing
Presented by: Jacky Ma
Date: 10 April 2002

Presentation Outline
- The Problems
- Research Objectives
- The Schema Extension: MMX
- MMX Query System
- Discussion
- Conclusion

The Problems
- Mapping XML data into relational tables
  - Not natural to XML structure
  - Efficient, but may not be an effective method
- Legacy application-specific structured data
  - Similar modeling but proprietary implementation
  - Not interoperable, and difficult to maintain
  - Lack of modular design, and thus difficult to combine to form more complex data structures
- Meta-data can serve a wide range of needs, while XML Schema is nowadays used solely for physical data validation

Research Objectives
- To facilitate more effective searching and storing of XML contents by making use of meta-data (XML Schema)
- To propose a data-oriented model that allows different storage mechanisms, processing models, and query models on XML contents

Our Approach: MMX
- Use meta-data to map XML data into structured data objects
- Define the structured data models “conceptually” and link the models to the XML document structure “syntactically”
- Meta-data is defined as an extension of XML Schema
- The extension is called MMX (Multi Model XML)

Program Driven vs. Data Driven
- Program driven: information for processing is hard-coded in the program
- Data driven: processing instruction is hard-coded in the data?!
- Diagram labels, from the program-driven end to the data-driven end:
  - Raw Data
  - Structured Data (XML)
  - Data with Modeling Information (MMX!)
  - Data with Program Codes
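The contrast between program-driven and data-driven processing can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the XML sample, the element and attribute names, the city "Salem", all coordinates, and the `MODEL_INFO` dictionary are hypothetical. The dictionary merely stands in for the modeling information that an MMX schema extension would carry; it is not MMX syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML data; element and attribute names are illustrative only.
XML_DATA = """
<cities>
  <city name="Franklin" x="2" y="7"/>
  <city name="Salem" x="4" y="5"/>
</cities>
"""

# Hypothetical stand-in for modeling information supplied with the data:
# which element is an entry, which attributes hold coordinates and label.
MODEL_INFO = {"entry": "city", "coords": ("x", "y"), "label": "name"}


def program_driven(root):
    """Element and attribute names are hard-coded in the program."""
    return [(c.get("name"), (float(c.get("x")), float(c.get("y"))))
            for c in root.iter("city")]


def data_driven(root, model_info):
    """The same mapping, driven by meta-data supplied alongside the data."""
    points = []
    for entry in root.iter(model_info["entry"]):
        coords = tuple(float(entry.get(a)) for a in model_info["coords"])
        points.append((entry.get(model_info["label"]), coords))
    return points


root = ET.fromstring(XML_DATA)
points = data_driven(root, MODEL_INFO)
```

Both functions produce the same structured objects; the difference is that `data_driven` can handle a differently named document by changing only the meta-data, not the program.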
A Glance at the XML Data (figure)

A Glance at the Linked Schema (figure)

Schema Extension
- The extended schema is associated with a namespace
- The extended schema goes within a schema element, like <tree:element> in the example
- <tree:element> specifies a single structure object instance
- Name association for elements and attributes
- Class hierarchies: <tree:element> -> <tree:internal> -> <tree:leafNode>, and finally to the structure specified in <tree:leafNodeValue>
- Additional properties in <rootNodeAttr>, <internalNodeAttr> and <leafNodeAttr>
- The schema writer has to know the structure model specification, while the XML writer only needs to know the given schema

Modeling
- An instance of an “MMX data object” is modeled:
  - As an encapsulated information object only accessible from the root, thus as a “single tree node”
  - As a mapping from root node, query method and query parameters to the value at leaf nodes
- Leaf nodes may contain any valid XML content, as long as it is defined in the Schema
  - I.e. a leaf may contain another “MMX data object”
- A query is modeled as a three-part tuple: [accessing-node, query-method, query-parameters]
  - Accessing-node is specified by XPath
  - Query-method is specified as a string value
  - Query-parameters are multi-dimensional, depending on the current model

Modeling (2)
- Diagram labels: point A, Tree(1), point B, XML elements, Tree(2)
- Tree(1) is accessible from point A
- A query (e.g. [A, “spatial-search”, (3, 5)], assuming Tree(1) accepts spatial-search with two coordinates) may return point B as the answer, either as the XPath of B or as the XML subtree of B
- From this point B, the user may drill down the tree by issuing another query on Tree(2)
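The query tuple [accessing-node, query-method, query-parameters] can be sketched as follows. Everything here is an assumption made for illustration: the class `PointTree`, the XPath strings, the leaf coordinates, and the reading of “spatial-search” as "return the leaf nearest to the given coordinates" (the slides do not define its semantics). A linear scan stands in for whatever index a real model would use.

```python
from collections import namedtuple
import math

# The three-part query from the Modeling slide.
Query = namedtuple("Query",
                   ["accessing_node", "query_method", "query_parameters"])


class PointTree:
    """Hypothetical stand-in for a tree-structured MMX data object."""

    def __init__(self, xpath, leaves):
        self.xpath = xpath    # XPath of the root, the only access point
        self.leaves = leaves  # maps each leaf's XPath to its (x, y) value

    def query(self, q):
        # The object is encapsulated: it answers only at its root node.
        if q.accessing_node != self.xpath:
            raise ValueError("object is only accessible from its root")
        if q.query_method == "spatial-search":
            x, y = q.query_parameters
            # Answer with the XPath of the leaf nearest to (x, y).
            return min(
                self.leaves,
                key=lambda p: math.hypot(self.leaves[p][0] - x,
                                         self.leaves[p][1] - y),
            )
        raise ValueError("unsupported query method: " + q.query_method)


tree1 = PointTree("/map/tree[1]", {
    "/map/tree[1]/leaf[1]": (2, 7),
    "/map/tree[1]/leaf[2]": (4, 4),
})
answer = tree1.query(Query("/map/tree[1]", "spatial-search", (3, 5)))
```

The answer is returned as an XPath, so a client could use it as the accessing-node of a follow-up query on whatever object sits at that leaf, which is the drill-down step described on the slide.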
Query with and without MMX
- From the original XML data, we cannot assume the semantics of the data:
  - We can ONLY do XML-based queries such as XPath
  - We can do a spatial query ONLY IF we can map the data into an R-Tree
- After mapping the data into an R-Tree:
  - Spatial queries
    - Give me the point at (2,7)
    - Give me the point nearest to (4,4)
  - Nearest neighbor search
    - Give me the point nearest to “Franklin” (0,0)

Processing
- Users might not know the “type” of the node (and do not need to know it). They are interested in what they can do
- Users retrieve the list of possible operations by issuing a LIST-OPERATION method to the root element of an MMX object
- Possible operations may include queries, updates, and other model-specific operations

MMX Query System
- To show that the schema, modeling, and processing of the MMX extension are workable
- To illustrate how it assists in querying XML data
- To serve as a platform for testing the implementation of arbitrary structured models
- Implemented with JDK 1.4

System Design
- Diagram labels: XML DOM Node, Data, Schema, Parse Schema, Fetch Classes, MMX Element, Abstract MMX Element, Extends class, (Partly) Defines, R-Tree, Maps Schema, VP-Tree, X-Tree, R-Tree, MMX Document, …, Clients
- The abstract class defines the common interface that has to be implemented in each MMX Element, such as LIST-OPERATION, QUERY, BUILD, etc.

Discussion - Pros
- Compatible with the relational approach, and supersedes it
- Modular design promotes reusability and maintainability
- XML “flattens” the legacy structured data to make it text-editable and easy to transport and process by different systems

Discussion - Cons
- There is no generic syntax to precisely describe all kinds of structure models
- An XML file is often larger than the legacy data file
- Each structure model needs additional implementation effort
- The schema specification quickly becomes longer as the number of supported models increases

Conclusion
- Propose a representation to encapsulate data structures
- Describe XML data with the Schema conceptually as well as syntactically
- Map legacy structure models into the Schema, and map XML data to the structure models by the Schema
- Structured data repository with increased interoperability, reusability, and transportability

Q&A
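The System Design slide describes an abstract class whose interface (LIST-OPERATION, QUERY, BUILD) each MMX Element must implement, and the Processing slide has clients discover operations through LIST-OPERATION. The original system was implemented with JDK 1.4; the Python sketch below only illustrates the shape of such an interface. The class names, method signatures, operation names (“point-at”, “nearest”), the city "Salem", and the coordinates are all hypothetical, and a linear scan stands in for a real R-Tree.

```python
from abc import ABC, abstractmethod
import math


class AbstractMMXElement(ABC):
    """Common interface each MMX element is expected to implement."""

    @abstractmethod
    def build(self, entries):
        """Build the structure from (label, value) entries."""

    @abstractmethod
    def list_operations(self):
        """Return the names of the operations this element supports."""

    @abstractmethod
    def query(self, method, params):
        """Run the named query method with model-specific parameters."""


class PointSetElement(AbstractMMXElement):
    """Hypothetical spatial model; a linear scan stands in for an R-Tree."""

    def build(self, entries):
        self.entries = list(entries)

    def list_operations(self):
        return ["point-at", "nearest"]

    def query(self, method, params):
        if method == "point-at":
            # Exact-match lookup on the coordinates.
            return [name for name, xy in self.entries if xy == tuple(params)]
        if method == "nearest":
            x, y = params
            # Label of the entry closest to (x, y) by Euclidean distance.
            return min(
                self.entries,
                key=lambda e: math.hypot(e[1][0] - x, e[1][1] - y),
            )[0]
        raise ValueError("unsupported operation: " + method)


element = PointSetElement()
element.build([("Franklin", (2, 7)), ("Salem", (4, 5))])
operations = element.list_operations()
```

A client that does not know the element's type can call `list_operations()` first and then pick one of the returned names for `query`, which mirrors the LIST-OPERATION step described on the Processing slide.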