DT228/3 Web Development
Introduction to XML: XML Parsers

Uses of XML
• To exchange data between incompatible systems (just send an XML document with an agreed definition of the tags)
• For B2B e-commerce: exchange of business documents between businesses. XML is flexible enough to describe any logical text structure, e.g. purchase order, invoice
• To store data, as plain text files or in databases
• To create new markup languages (i.e. languages that use tags). XML can be used to agree what the tags mean. Many markup languages have already been based on XML, e.g. JSTL, WML, VoiceXML, XHTML

Using an XML document
Need an XML parser to “use” the document, i.e. to parse out the data held in it

XML Parsers
An XML parser does the following:
• Retrieves and reads an XML document, i.e. “parses” the document to figure out what’s in it
• Ensures the document adheres to specific standards (e.g. is it well formed? does it adhere to its DTD?)
• Makes the document contents available to your application

XML Document Parsers
• If your application is going to use XML documents, you could write your own parser
• But it makes sense to use a pre-built parser
• E.g. Java provides an XML parser API that can be used in any Java application that processes XML documents
• Saves on development work

XML Document Parsers
• Hundreds of parsers available
• Most parsers are based on two main interfaces:
  – Tree based: Document Object Model (DOM)
  – Event based: Simple API for XML (SAX)

XML Parsers: Tree based DOM interface
• Uses Document Object Model (DOM)
• Tree based interface (navigates through the document)
• Developed by W3C
• XML parsers that use DOM exist for Java, JavaScript, Perl, C++

Tree based DOM parser - example
Object/Tree Interface (DOM)
Definition: Parser reads the XML document and creates an in-memory “tree” of data, i.e. an object model of the data
For example: Given a sample XML document on the next slide, what kind of tree would be produced?
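As a rough illustration of the tree-based idea, the sketch below builds a DOM tree for the weather sample document used in these notes and walks it. It uses Python's standard xml.dom.minidom module rather than the Java API mentioned above; that choice, and the names WEATHER_XML and describe, are assumptions made only for this example. The DOCTYPE line is left out so the sketch does not need a Weather.dtd file.

```python
from xml.dom import minidom

# Weather sample from the notes (DOCTYPE omitted, see the lead-in).
WEATHER_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<WEATHER>
  <CITY NAME="Hong Kong">
    <HI>87</HI>
    <LOW>78</LOW>
  </CITY>
</WEATHER>"""


def describe(node, depth=0):
    """Return one indented line per element or non-blank text node."""
    lines = []
    indent = "  " * depth
    if node.nodeType == node.ELEMENT_NODE:
        attrs = "".join(
            f' {name}="{value}"' for name, value in node.attributes.items()
        )
        lines.append(f"{indent}Element {node.tagName}{attrs}")
        for child in node.childNodes:
            lines.extend(describe(child, depth + 1))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        # Whitespace-only text nodes (the indentation) are skipped.
        lines.append(f"{indent}Text {node.data.strip()}")
    return lines


# The parser reads the whole document and keeps it in memory as a tree.
document = minidom.parseString(WEATHER_XML)
tree_lines = describe(document.documentElement)
print("\n".join(tree_lines))

# Because the full tree is in memory, the program can jump to any element.
city = document.getElementsByTagName("CITY")[0]
high = city.getElementsByTagName("HI")[0].firstChild.data
print(high)
```

The printed outline has WEATHER at the root, CITY (with its NAME attribute) beneath it, and HI and LOW beneath CITY, each holding a single text node. The last few lines show the kind of random access that a tree makes easy.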
Tree based DOM parser - example
Sample XML Document
  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE WEATHER SYSTEM "Weather.dtd">
  <WEATHER>
    <CITY NAME="Hong Kong">
      <HI>87</HI>
      <LOW>78</LOW>
    </CITY>
  </WEATHER>

XML Parsers: Event based SAX parser
• Simple API for XML
• Event based
• Developed by volunteers on the xml-dev mailing list
• http://www.megginson.com/SAX/

Event based SAX parser
Event Based Parser
Definition: Parser reads the XML document and generates an event for each item it encounters while parsing. It does not create an in-memory object model of the document; it’s up to the programmer to write the code to interpret the events
For example: Given the same XML document, what kind of events would be produced?

Event based SAX parser: example
Events generated for the sample XML document:
1. Start of <WEATHER> Element
2. Start of <CITY> Element
3. Start of <HI> Element
4. Character Event: 87
5. End of </HI> Element
6. Start of <LOW> Element
7. Character Event: 78
8. End of </LOW> Element
9. End of </CITY> Element
10. End of </WEATHER> Element

Event based parsers
For each of these events, your application implements “event handlers.” Each time an event occurs, the corresponding event handler is called. Your application intercepts these events and handles them in any way you want.

Comparing tree based DOM parser with event based SAX parser
Questions:
• Which parser is faster?
• Which parser is more efficient?
• Which parser is suitable for which type of XML documents?

Comparing tree based DOM parser with event based SAX parser
Tree based:
• Slower
• Takes up more memory
• Simpler to use
• More suitable for documents that are less structured, with less repetition of tags
• More suitable where the program needs to move around the document a lot, or needs to keep easy access to the full document at all times
Event based:
• Faster
• Takes up much less memory
• But more complex to implement
• Good for large, machine-generated, structured documents, e.g. book contents (the repetitive nature of the tags allows re-use of event handling code and therefore less work for the programmer)
• Good where only parts of the document are needed at any one time (event based parsers cannot “skip around” from one part of the document to another)

Comparing tree based DOM parser with event based SAX parser
Performance and Memory
Therefore, when high performance and low memory use are the most important criteria, use an event-based parser.
Examples:
• Java applets
• Palm Pilot applications
• Parsing huge data files

Storing XML documents
• Can use XML for data storage, e.g. to store news headlines, business documents
• Q: How to store XML documents in a database?

Storing XML documents
• Choices:
  – Keep as XML files (filename.xml)
  – Put into a relational database and convert to/from XML format
  – Use a native XML database

Storing XML documents
• Keep as XML files (filename.xml)
  – Fast for a small number of users
  – Eliminates the overhead of database connections
  – Large number of users -> concurrency issues
  – Poor for high volume read/write
  – Security/visibility

Storing XML documents
• Put into a relational database and convert to/from XML format as needed
  – Provides “ACID” support to ensure integrity of access to the data
  – Assumes data can become “tabular” in format (usually data used for transport)
  – Poor for data that is not easily transformed into table-based structures, e.g. word processor documents

Storing XML documents
• Store in a native XML database
  – Native XML databases are databases designed especially to store XML documents
  – A native XML database is one that treats XML documents and elements as the fundamental structures rather than tables, records, and fields
  – Good for XML documents that are for human consumption, i.e. “content” (e.g. books, emails)
  – Provides “ACID” support to ensure integrity of access to the data

Storing XML documents
• Store in a native XML database (continued)
  – Good when XML documents need to be returned (but most applications need data returned in other formats)
• Query languages evolving (e.g. XQuery) but no equivalent yet of SQL update/insert/delete
• New technology (e.g. open source database eXist)
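The relational option above ("put into a relational database and convert to/from XML format") can be sketched together with the event-based parsing described earlier in these notes. The example below is only a sketch under stated assumptions: it uses Python's standard xml.sax and sqlite3 modules, an in-memory database, and a hypothetical table named weather. The names WeatherHandler and to_xml are also invented for illustration, and the DOCTYPE line of the sample is left out so no Weather.dtd file is needed.

```python
import sqlite3
import xml.sax
from xml.sax.saxutils import quoteattr

WEATHER_XML = b"""<?xml version="1.0" encoding="UTF-8"?>
<WEATHER>
  <CITY NAME="Hong Kong">
    <HI>87</HI>
    <LOW>78</LOW>
  </CITY>
</WEATHER>"""


class WeatherHandler(xml.sax.ContentHandler):
    """Event handlers that turn each CITY element into one table row."""

    def __init__(self, connection):
        super().__init__()
        self.connection = connection
        self.events = []   # log of events, for comparison with the notes
        self.city = None   # NAME of the CITY currently being read
        self.values = {}   # HI / LOW readings collected for that city
        self.text = []     # character data seen since the last tag

    def startElement(self, name, attrs):
        self.events.append(f"start {name}")
        self.text = []
        if name == "CITY":
            self.city = attrs.getValue("NAME")
            self.values = {}

    def characters(self, content):
        if content.strip():  # ignore the whitespace used for indentation
            self.events.append(f"characters {content.strip()}")
        self.text.append(content)

    def endElement(self, name):
        self.events.append(f"end {name}")
        if name in ("HI", "LOW"):
            self.values[name] = int("".join(self.text))
        elif name == "CITY":
            # The document is now "tabular": one row per city.
            self.connection.execute(
                "INSERT INTO weather (city, hi, low) VALUES (?, ?, ?)",
                (self.city, self.values["HI"], self.values["LOW"]),
            )
        self.text = []


def to_xml(connection):
    """Rebuild a WEATHER document from the rows in the table."""
    parts = ["<WEATHER>"]
    query = "SELECT city, hi, low FROM weather ORDER BY city"
    for city, hi, low in connection.execute(query):
        parts.append(
            f"<CITY NAME={quoteattr(city)}><HI>{hi}</HI><LOW>{low}</LOW></CITY>"
        )
    parts.append("</WEATHER>")
    return "".join(parts)


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE weather (city TEXT, hi INTEGER, low INTEGER)")
handler = WeatherHandler(connection)
xml.sax.parseString(WEATHER_XML, handler)  # handlers fire as the parser reads

rows = connection.execute("SELECT city, hi, low FROM weather").fetchall()
rebuilt = to_xml(connection)
print(handler.events)
print(rows)
print(rebuilt)
```

The handler never holds the whole document in memory; it keeps only the current city and its readings. This is why the event-based approach suits large, repetitive files. The round trip also shows the cost noted in the bullets: the data has to fit a table shape, and details such as the original indentation are not preserved.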