XML and Java
Alex Chaffee, [email protected]
http://www.purpletech.com
©1996-2000 jGuru.com

Overview
• Why Java and XML?
• Parsers: DOM, JDOM, SAX
• Using XML from JSP
• Java/XML Object Mapping
• Resources

Why Java/XML?
• XML maps well to Java
  – late binding
  – hierarchical (OO) data model
• Unicode support in Java
• XML Structures map well to Java Objects
• Portability
• Network friendly

XML Parsers
• Validating/Non-Validating
• Tree-based
• Event-based
• SAX-compliance
• Not technically parsers
  – XSL
  – XPath

Some Java XML Parsers
• DOM
  – Sun JAXP
  – IBM XML4J
  – Apache Xerces
  – Resin (Caucho)
  – DXP (DataChannel)
• SAX
  – Sun JAXP
  – SAXON
• JDOM

DOM API
• Tree-based
• Node classes
• Part of W3C spec
• Sorting/Modifying of Elements
• Sharing document with other applications

XML is a Tree
    <?xml version="1.0"?>
    <!DOCTYPE menu SYSTEM "menu.dtd">
    <menu>
      <meal name="breakfast">
        <food>Scrambled Eggs</food>
        <food>Hash Browns</food>
        <drink>Orange Juice</drink>
      </meal>
      <meal name="snack">
        <food>Chips</food>
      </meal>
    </menu>
[Tree diagram: menu -> meal (name = "breakfast") -> food "Scrambled Eggs", food "Hash Browns", drink "Orange Juice"; menu -> meal]

DOM API (cont’d)
• Based on Interfaces
  – Good design style - separate interface from implementation
  – Document, Text, ProcessingInstruction, Element ALL are interfaces
  – All extend interface Node
  – Including interface Attr (parentNode is null, etc)

DOM Example
    public void print(Node node) { // recursive method call using DOM API...
        int type = node.getNodeType();
        switch (type) {
        case Node.ELEMENT_NODE:
            // print element with attributes
            out.print('<');
            out.print(node.getNodeName());
            // getAttributes() returns a NamedNodeMap, not an array
            NamedNodeMap attrs = node.getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                out.print(' ');
                out.print(attr.getNodeName());
                out.print("=\"");
                out.print(normalize(attr.getNodeValue()));
                out.print('"');
            }
            out.print('>');
            NodeList children = node.getChildNodes();
            if (children != null) {
                int len = children.getLength();
                for (int i = 0; i < len; i++) {
                    print(children.item(i));
                }
            }
            break;
        case Node.ENTITY_REFERENCE_NODE:
            // handle entity reference nodes
            // ...
        }
    }

DOM API Highlights
• Node
  – getNodeType()
  – getNodeName()
  – getNodeValue()
    • returns null for Elements
  – getAttributes()
    • returns null for non-Elements
  – getChildNodes()
  – getParentNode()
• Element
  – getTagName()
    • same as getNodeName()
  – getElementsByTagName(String tag)
    • get all descendants of this name, recursively
  – normalize()
    • smooshes adjacent text nodes together
• Attr
  – attributes are not technically child nodes
  – getParentNode() et al. return null
  – getName(), getValue()
• Document
  – has one child element - the root element
    • call getDocumentElement()
  – contains factory methods for creating attributes, comments, etc.

DOM Level 2
• Adds namespace support, extra methods
• Not supported by Java XML processors yet

The Trouble With DOM
• Written by C programmers
• Cumbersome API
  – Node does double-duty as collection
  – Multiple ways to traverse, with different interfaces
• Tedious to walk around tree to do simple tasks
• Doesn't support Java standards (java.util collections)

JDOM: Better than DOM
• Java from the ground up
• Open source
• Clean, simple API
• Uses Java Collections

JDOM vs. DOM
• Classes / Interfaces
• Java / Many languages
• Java Collections / Idiosyncratic collections
• getChildText() and other useful methods / getNextSibling() and other useless methods

JDOM: The Best of Both Worlds
• Clean, easy to use API
  – document.getRootElement().getChild("book").
    getChildText("title")
• Random-access tree model (like DOM)
• Can use SAX for backing parser
• Open Source, not Standards Committee
  – Allowed benevolent dictatorship -> clean design

JDOM Example
    XMLOutputter out = new XMLOutputter();
    out.output( element, System.out );

Or…

    public void print(Element node) { // recursive method call using JDOM API...
        out.print('<');
        out.print(node.getName());
        List attrs = node.getAttributes();
        for (int i = 0; i < attrs.size(); i++) {
            Attribute attr = (Attribute)attrs.get(i);
            out.print(' ');
            out.print(attr.getName());
            out.print("=\"");
            out.print(attr.getValue());
            out.print('"');
        }
        out.print('>');
        List children = node.getChildren();
        if (children != null) {
            for (int i = 0; i < children.size(); i++) {
                // a java.util.List has get(), not item()
                print((Element)children.get(i));
            }
        }
    }

JDOM Example
    // addAttribute()/addChild() are the JDOM beta API of the time;
    // later JDOM releases use setAttribute()/addContent()
    public Element toElement(User dobj) throws IOException {
        User obj = (User)dobj;
        Element element = new Element("user");
        element.addAttribute("userid", ""+obj.getUserid());
        String val;
        val = obj.getUsername();
        if (val != null) {
            element.addChild(new Element("username").setText(val));
        }
        val = obj.getPasswordEncrypted();
        if (val != null) {
            element.addChild(new Element("passwordEncrypted").setText(val));
        }
        return element;
    }

JDOM Example
    public User fromElement(Element element) throws DataObjectException {
        List list;
        User obj = new User();
        String value = null;
        Attribute userid = element.getAttribute("userid");
        if (userid != null) {
            obj.setUserid( userid.getIntValue() );
        }
        value = element.getChildText("username");
        if (value != null) {
            obj.setUsername( value );
        }
        value = element.getChildText("passwordEncrypted");
        if (value != null) {
            obj.setPasswordEncrypted( value );
        }
        return obj;
    }

DOMUtils
• DOM is clunky
• DOMUtils.java - set of utilities on top of DOM
• http://www.purpletech.com/code
• Or just use JDOM

Event-Based Parsers
• Scans document top to bottom
• Invokes callback methods
• Treats XML not like a tree, but like a list (of tags and content)
• Pro:
  – Not necessary to cache entire document
  – Faster,
    smaller, simpler
• Con:
  – must maintain state on your own
  – can't easily backtrack or skip around

SAX API
• Grew out of xml-dev mailing list (grassroots)
• Event-based
• startElement(), endElement()
• Application intercepts events
• Not necessary to cache entire document

SAX API (cont’d)
    public void startElement(String name, AttributeList atts) {
        // perform implementation
        out.print("Element name is " + name);
        out.print(", first attribute is " + atts.getName(0) +
                  ", value is " + atts.getValue(0));
    }

XPath
• The stuff inside the quotes in XSL
• Directory-path metaphor for navigating XML document
  – "/curriculum/class[4]/student[first()]"
• Implementations
  – Resin (Caucho) built on DOM
  – JDOM has one in the "contrib" package
• Very efficient API for extracting specific info from an XML tree
  – Don't have to walk the DOM or wait for the SAX
  – Con: yet another syntax / language, without full access to Java libraries

XSL
• eXtensible Stylesheet Language
• transforms one XML document into another
• XSL file is a list of rules
• Java XSL processors exist
  – Apache Xalan
    • (not to be confused with Apache Xerces)
  – IBM LotusXSL
  – Resin
  – SAXON
  – XT

Trouble with XSL
• It's a programming language masquerading as a markup language
• Difficult to debug
• Turns traditional programming mindset on its head
  – Declarative vs. procedural
  – Recursive, like Prolog
• Doesn't really separate presentation from code

JSP
• JavaServer Pages
• Outputting XML
    <%
      User user = loadUser(request.getParameter("username"));
      response.setContentType("text/xml");
    %>
    <user>
      <username><%=user.getUsername()%></username>
      <realname><%=user.getRealname()%></realname>
    </user>
• Can also output HTML based on XML parser, naturally (see my "JSP and XML" talk, or http://www.purpletech.com)

XMLC
• A radical solution to the problem of how to separate presentation template from logic…
• …to actually separate the presentation template from the logic!
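To make the idea concrete, here is a minimal sketch of template/logic separation. It is not XMLC and does not use XMLC's generated classes; it assumes only the JDK's built-in W3C DOM and JAXP APIs. The class name TemplateSketch, the TEMPLATE string and the findById() helper are hypothetical names chosen for illustration. The designer owns the markup; the Java code only locates an element by its id attribute and replaces its text.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TemplateSketch {
    // Hypothetical template: well-formed markup authored separately from the code.
    static final String TEMPLATE =
        "<html><body><h1 id=\"title\">Hello</h1></body></html>";

    /** Returns the first element whose id attribute equals the given value, or null. */
    public static Element findById(Document doc, String id) {
        NodeList all = doc.getElementsByTagName("*"); // "*" matches every element
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if (id.equals(e.getAttribute("id"))) {
                return e;
            }
        }
        return null;
    }

    /** Parses the template, sets the text of the "title" element, and serializes it. */
    public static String render(String title) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(TEMPLATE)));
        Element heading = findById(doc, "title");
        heading.setTextContent(title); // replaces the placeholder text

        // An identity transform is the standard JAXP way to write a DOM back out.
        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.METHOD, "xml");
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(render("Goodbye"));
    }
}
```

The Java side never contains markup and the template never contains code, so the two can change independently as long as the id stays stable.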
XMLC Architecture
[Diagram: HTML (with ID tags) -> XMLC -> HTML Object (automatically generated); Java Class (e.g. Servlet), reading Data and setting values on the HTML Object -> HTML (dynamically-generated)]

XMLC Details
• Open-source (xmlc.enhydra.org)
• Uses W3C DOM APIs
• Generates "set" methods per tag
  – Source: <H1 id="title">Hello</H1>
  – Code: obj.setElementTitle("Goodbye")
  – Output: <H1>Goodbye</H1>
• Allows graphic designers and database programmers to develop in parallel
• Works with XML source too

XML and Java in 2001
• Many apps' config files are in XML
  – Ant
  – Tomcat
  – Servlets
• Several XML-based Sun APIs
  – JAXP
  – JAXM
  – ebXML
  – SOAP (half-heartedly supported)

Java XML Documentation
• Jdox
  – Javadoc -> single XML file
  – http://www.componentregistry.com/
  – Ready for transformation (e.g. XSL)
• Java Doclet
  – http://www.sun.com/xml/developers/doclet
  – Javadoc -> multiple XML files (one per class)
• Cocoon
  – Has alpha XML doclet

Soapbox: DTDs are irrelevant
• DTDs describe structure of an unknown document
• But in most applications, you already know the structure
  – it's implicit in the code
• If the document does not conform, there will be a runtime error, and/or corrupt/null data
• This is as it should be! GIGO.
• You could have a separate "sanity check" phase, but parsing with validation "on" just slows down your app
• Useful for large-scale document-processing applications, but not for custom apps or transformations

XML and Server-Side Java

Server-Side Java-XML Architecture
• Many possible architectures
  – XML Data Source
    • disk or database or other data feed
  – Java API
    • DOM or SAX or XPath or XSL
  – XSL
    • optional transformation into final HTML, or HTML snippets, or intermediate XML
  – Java Business Logic
    • JavaBeans and/or EJB
  – Java Presentation Code
    • Servlets and/or JSP and/or XMLC

Server-Side Java-XML Architecture
[Diagram, four layers:
  Java UI: JSP, Servlet
  Java Business Logic: JavaBeans, EJB
  XML Processors: DOM, SAX, XPath, XSL
  XML Data Sources: Filesystem, XML-savvy RDBMS, XML Data Feed
 Also labeled: HTML]

Server-Side Architecture Notes
• Note that you can skip any layer, and/or call within layers
  – e.g. XML->XSL->DOM->JSP, or
  – JSP->Servlet->DOM->XML

Cache as Cache Can
• Caching is essential
• Whatever its advantages, XML is slow
• Cache results on disk and/or in memory

XML <-> Java Object Mapping

XML and Object Mapping
• Java -> XML
  – Start with Java class definitions
  – Serialize them - write them to an XML stream
  – Deserialize them - read values in from previously serialized file
• XML -> Java
  – Start with XML document type
  – Generate Java classes that correspond to elements
  – Classes can read in data, and write in compatible format (shareable)

Java -> XML Implementations
• Java -> XML
  – BeanML
  – Coins / BML
  – Sun's XMLOutputStream/XMLInputStream
  – XwingML (Bluestone)
  – JDOM BeanMapper
  – Quick?
  – JSP (must roll your own)

BeanML Code (Extract)
    <?xml version="1.0"?>
    <bean class="java.awt.Panel">
      <property name="background" value="0xeeeeee"/>
      <property name="layout">
        <bean class="java.awt.BorderLayout"/>
      </property>
      <add>
        <bean class="demos.juggler.Juggler" id="Juggler">
          <property name="animationRate" value="50"/>
          <call-method name="start"/>
        </bean>
        <string>Center</string>
      </add>
      …
    </bean>

Coins
• Part of MDSAX
• Connect XML Elements and JavaBeans
• Uses SAX parser, Docuverse DOM to convert XML into JavaBean
• Uses BML (Bindings Markup Language) to define mapping of XML elements to Java classes

JDOM BeanMapper
• Written by Alex Chaffee
• Default implementation outputs element-only XML, one element per property, named after property
• Also goes other direction (XML->Java)
  – Doesn't (yet) automatically build bean classes
• Can set mapping to other custom element names / attributes

XMLOutputStream/XMLInputStream
• From some Sun engineers
  – http://java.sun.com/products/jfc/tsc/articles/persistence/
• May possibly become core, but unclear
• Serializes Java classes to and from XML
• Works with existing Java Serialization
• Not tied to a specific XML representation
  – You can build your own plug-in parser
• Theoretically, can be used for XML->Java as well

XMLOutputStream/XMLInputStream

Sample XWingML code
    <?xml version="1.0"?>
    <!DOCTYPE XwingML SYSTEM "file:///c:/XwingML/xml/xwingml.dtd">
    <XwingML>
      <Classes>
        <Instance name="OpenFile" className="XMLOpenFile"/>
        <Instance name="SaveFile" className="XMLSaveFile"/>
        <Instance name="ParseFile" className="XMLParseFile"/>
        <Instance name="About" className="XMLAbout"/>
      </Classes>
      <JFrame name="MainFrame" title="Bluestone XMLEdit" image="icon.gif"
              x="10%" y="10%" width="80%" height="80%">
        <JMenuBar>
          <JMenu text="File" mnemonic="F">
            <JMenuItem icon="open.gif" text="Open..."
                       mnemonic="O" accelerator="VK_O,CTRL_MASK"
                       actionListener="OpenFile"/>
            <JMenuItem icon="save.gif" text="Save" mnemonic="S"
                       accelerator="VK_S,CTRL_MASK" actionCommand="save"
                       actionListener="SaveFile"/>
            <JMenuItem icon="save.gif" text="Save As..." mnemonic="a"
                       actionCommand="saveas" actionListener="SaveFile"/>
            <Separator/>
            <JMenuItem text="Exit" mnemonic="x" accelerator="VK_X,CTRL_MASK"
                       actionListener="com.bluestone.xml.swing.XwingMLExit"/>

XML -> Java Implementations
• XML -> Java
  – Java-XML Data Binding (JSR 31 / Adelard)
  – IBM XML Master (Xmas)
  – Purple Technology XDB
  – Breeze XML Studio (v2)

Adelard (Java-XML Data Binding)
• Java Specification Request 31
• Still vapor! (?)

Castor
• Implementation of JSR 31
  – http://castor.exolab.org
• Open-source

IBM XML Master ("XMas")
• Not vaporware - it works!!!
• Same idea as Java-XML Data Binding
• From IBM Alphaworks
• Two parts
  – builder application
  – visual XML editor beans

Brett McLaughlin's Data Binding Package
• See JavaWorld articles

Purple Technology XDB
• In progress (still vapor)
  – Currently rewriting to use JDOM
  – JDOMBean helps
• Three parts
  – XML utility classes
  – XML->Java data binding system
  – Caching filesystem-based XML database (with searching)

Conclusion
• Java and XML are two great tastes that taste great together

Resources
• XML Developments:
  – Elliotte Rusty Harold:
    • Café Con Leche - metalab.unc.edu/xml
    • Author, XML Bible
  – Simon St. Laurent
    • www.simonstl.com
    • Author, Building XML Applications
• General
  – www.xmlinfo.com
  – www.oasis-open.org/cover/xml.html
  – www.xml.com
  – www.jdm.com
  – www.purpletech.com/xml

Resources: Java-XML Object Mapping
• JSR 31
  – http://java.sun.com/aboutJava/communityprocess/jsr/jsr_031_xmld.html
  – http://java.sun.com/xml/docs/binding/DataBinding.html
• XMas
  – http://alphaworks.ibm.com/tech/xmas

Resources
• XSL:
  – James Tauber:
    • xsl tutorial: www.xmlsoftware.com/articles/xsl-byexample.html
  – Michael Kay
    • Saxon
    • home.iclweb.com/icl2/mhkay/Saxon.html
  – James Clark
    • XP Parser, XT
    • editor, XSL Transformations W3C Spec

Resources:
• JDOM
  – www.jdom.org

Thanks To
• John McGann
• Daniel Zen
• David Orchard
• My Mom
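
Appendix sketch: tree-based vs. event-based
As a closing illustration of the two parser styles covered in the deck, the following is a minimal sketch that extracts every food item from the menu document of the "XML is a Tree" slide, once through DOM and once through SAX. It assumes the JDK's built-in JAXP parsers and the SAX2 DefaultHandler (the deck's own snippet uses the older SAX1 signature). The class name MenuSketch and both method names are hypothetical, and the DOCTYPE line is left out so that no menu.dtd file is needed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class MenuSketch {
    // The menu document from the slides, without the DOCTYPE declaration.
    static final String MENU =
        "<menu>"
      + "<meal name=\"breakfast\"><food>Scrambled Eggs</food>"
      + "<food>Hash Browns</food><drink>Orange Juice</drink></meal>"
      + "<meal name=\"snack\"><food>Chips</food></meal>"
      + "</menu>";

    /** Tree-based: build the whole DOM, then query it for <food> elements. */
    public static List<String> foodsViaDom() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(MENU)));
        NodeList foods = doc.getElementsByTagName("food");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < foods.getLength(); i++) {
            result.add(foods.item(i).getTextContent());
        }
        return result;
    }

    /** Event-based: no tree is built; the handler maintains its own state. */
    public static List<String> foodsViaSax() throws Exception {
        final List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <food>

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if ("food".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (text != null) {
                    text.append(ch, start, length); // may be called more than once per element
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("food".equals(qName)) {
                    result.add(text.toString());
                    text = null;
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(MENU)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(foodsViaDom());
        System.out.println(foodsViaSax());
    }
}
```

The DOM version is shorter but holds the whole document in memory; the SAX version needs the extra text field because, as the Event-Based Parsers slide notes, the application must maintain state on its own.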