
Best practices for AFP Resource caching in an IBM® Content Manager OnDemand Web Enablement Kit Java API application

12/15/2007
Author: Bob Lichens, Software Engineer, IBM Content Management OnDemand Development

Rev: 12/15/2007 © Copyright International Business Machines Corporation 2007. All rights reserved. US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

This document describes how the IBM® Content Manager OnDemand Web Enablement Kit Java APIs (referred to hereafter simply as ODWEK) cache AFP Resource data in memory, and how you can avoid excessive system load by caching large AFP Resources to file using the ODWEK Java APIs. It also discusses how you can avoid JVM heap fragmentation caused by large JNI allocations by retrieving large document data buffers directly to file.

The target audience for this article is application developers, testers, and those who will support the application. This document assumes that the reader is proficient with the Java™ programming language and with Java™ Virtual Machine configuration and management, and has a good working knowledge of the IBM Content Manager OnDemand products and AFP print streams.

What are IBM OnDemand Web Enablement Kit Java APIs (ODWEK)?

The ODWEK Java APIs provide industry-standard Java™ classes that a customer can use to write a custom web application that accesses data stored on the OnDemand Server. Such an application could, for example, allow the end user to log on to an OnDemand server, get a list of folders, search a specific folder, generate a hitlist of matching documents, and retrieve those documents for viewing. Many other APIs provide more advanced functionality.
Prerequisites: This document addresses features and functionality that are only available in ODWEK Version 7.1.2.5 or later (including ODWEK 8.4.0.0 and later).

ODWEK Overview of AFP Resource handling and memory cache.

The ODWEK Java APIs are optimized for quick retrieval of documents in high-volume production environments, and they handle high-volume retrievals of AFP documents by caching the AFP Resources in memory. Each unique resource is compressed and stored in a memory cache that is shared by all ODWEK ODServer sessions in the web application environment. This sharing avoids repeated data requests to the OnDemand Server, frees TCP/IP bandwidth, and reduces OnDemand Server load.

Each Application Group defined within the OnDemand Server has its own unique set of AFP resources and corresponding AFP Resource IDs. These application-group-specific resources are added to the memory cache as needed for document retrievals, and they are not freed from the memory cache until all ODServer objects have been terminated.

Problems that can occur.

With the advent of full-color printing and other data being added to print streams, the size of a typical document stored in OnDemand has increased dramatically. Historically, with limited bandwidth and system resources, a typical AFP document had an average resource size of 5 KB. Today's AFP documents have resource files that may include full-color graphics, custom fonts, images, and any number of inserts, and these AFP Resource files can approach 10 MB in size. Because every unique resource stays in the shared memory cache, an application that is not sized for resources this large may run out of memory.
The Solution.

The solution to the sizing and memory problems described above is to have your application retrieve the resource data from a file cache rather than from memory. This way you avoid repeated retrievals of large resource buffers across the network. You can also read the data into the JVM directly from file and avoid passing a large buffer across the JNI/JVM boundary.

The ODWEK Java APIs (version 7.1.2.5 and higher) include two methods that let you retrieve document data and AFP resource data separately and write them directly to file:

The ODHit.getResources() API writes the AFP Resource data directly to file and bypasses the ODWEK memory cache for these resources. This not only avoids the overhead of holding many large resources in memory but, together with ODHit.getResourceID(), lets you design a file-based cache that suits your particular AFP resource caching needs.

The ODHit.getDocument() API writes the data to file in the JNI layer, which can alleviate some of the JVM heap fragmentation problems that occur with larger data buffers. This API also eliminates much of the overhead of the legacy ODHit.retrieve() API when handling documents, because it performs no data conversion of any kind. This improves ODWEK performance and allows the data to be handed off to methods outside of ODWEK (for example, data transforms and parsing routines).
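A file-based cache keyed on the resource ID might be organized along the following lines. This is only a minimal sketch under stated assumptions, not ODWEK code: the names ResourceFileCache, ResourceFetcher, fileNameFor, and getOrFetch are hypothetical, and the fetcher is a stand-in for whatever call writes the resource to a file (such as the ODHit method described in this document).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical file-based cache for AFP resources, keyed on the resource ID. */
final class ResourceFileCache {

    /** Hypothetical stand-in for the ODWEK call that writes a resource to a file. */
    interface ResourceFetcher {
        void fetchTo(String filename) throws Exception;
    }

    private final Path cacheDir;

    ResourceFileCache(Path cacheDir) throws IOException {
        this.cacheDir = Files.createDirectories(cacheDir);
    }

    /** Maps a resource ID to a file name, replacing characters that are unsafe in file names. */
    static String fileNameFor(String resourceId) {
        return resourceId.replaceAll("[^A-Za-z0-9._-]", "_") + ".res";
    }

    /** Returns the cached file for resourceId, fetching it only if it is not cached yet. */
    synchronized Path getOrFetch(String resourceId, ResourceFetcher fetcher) throws Exception {
        Path target = cacheDir.resolve(fileNameFor(resourceId));
        if (Files.exists(target)) {
            return target; // cache hit: no request to the OnDemand server
        }
        // Write to a temporary file first so that a failed retrieval
        // never leaves a partial entry in the cache.
        Path tmp = Files.createTempFile(cacheDir, "res", ".tmp");
        try {
            fetcher.fetchTo(tmp.toString());
            Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            Files.deleteIfExists(tmp);
        }
        return target;
    }
}
```

In an application, the fetcher would presumably wrap the ODWEK call, for example cache.getOrFetch(odHit.getResourceID(), name -> odHit.getResources(name)). The method is synchronized so that two threads in the same JVM do not retrieve the same resource twice; a cache shared between several JVMs would need additional coordination.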
API Documentation

ODHit.getResourceID()
Get the unique resource ID for the AFP resources required to view or transform the document. The resource ID is constructed from the OnDemand server, the OnDemand application ID, and the OnDemand resource ID. Some examples of the values returned are:
"akira-1209-9"
"akira.boulder.ibm.com-1209-9"
"9.17.00.201-1209-9"
This ID can then be used with ODHit.getResources(filename) to save the resource file for later use.

ODHit.getResources() / ODHit.getResources(filename)
Retrieve the AFP resources for this ODHit. The data is either returned to the caller as a byte array or written to the specified file. These methods do not use the standard ODWEK memory cache and will not cache the resources in memory.

ODHit.getDocument()
Retrieve the document for this ODHit. The document is retrieved uncompressed and in its native format. Conversion is not provided; see ODServer/ODHit.retrieve() if you require conversion processing. If the current ODHit is for a large object document, only the first segment is returned. See ODHit.getDocument(filename, allsegs) or ODHit.retrieveSegment() for large document handling.

ODHit.getDocument(String filename, boolean allsegs)
Retrieve the document for this ODHit. The document is retrieved uncompressed and in its native format. Conversion is not provided; see ODServer/ODHit.retrieve() if you require conversion processing. If allsegs = true, all segments of a large object document are written to the specified file. If allsegs = false, only the first segment is retrieved and written to file.

ODHit.isLargeObject()
Returns true if the ODHit is for a large object document. This can be used to select the appropriate retrieval method.
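If a file cache needs to group resources by server or application, the resource ID string can be split into its three parts. The sketch below is an illustration based only on the three example values shown above; it assumes the layout server-applicationId-resourceId and assumes that the last two fields never contain a hyphen. ResourceIdParts and parse are hypothetical names, not part of ODWEK.

```java
/** Hypothetical holder for the three parts of a resource ID string. */
final class ResourceIdParts {
    final String server;
    final String applicationId;
    final String resourceId;

    private ResourceIdParts(String server, String applicationId, String resourceId) {
        this.server = server;
        this.applicationId = applicationId;
        this.resourceId = resourceId;
    }

    /**
     * Splits "server-applicationId-resourceId" from the right, because a
     * host name may itself contain hyphens. Assumes the last two fields do not.
     */
    static ResourceIdParts parse(String id) {
        int last = id.lastIndexOf('-');
        int prev = (last > 0) ? id.lastIndexOf('-', last - 1) : -1;
        // Reject IDs with fewer than three fields or with an empty field.
        if (prev <= 0 || last == id.length() - 1 || last - prev == 1) {
            throw new IllegalArgumentException("Unexpected resource ID format: " + id);
        }
        return new ResourceIdParts(
                id.substring(0, prev),
                id.substring(prev + 1, last),
                id.substring(last + 1));
    }
}
```

Splitting from the right keeps a fully qualified host name or an IP address intact as the server part.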
Code Samples

The following code snippets are examples of how you might use these ODWEK Java APIs.

Disclaimer: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs.

    import java.util.Vector;
    import com.ibm.edms.od.*;

    public class TcHitGetDocLO {
        // Arguments as used below: argv[0] server, argv[1] user ID, argv[2] password,
        // argv[3] folder name, argv[4] output path prefix,
        // argv[5] first argument to ODServer.initialize()
        public static void main(String[] argv) {
            try {
                //----------
                // Log on to the specified server
                //----------
                ODServer odServer = new ODServer();
                odServer.initialize(argv[5], "TcHitGetDocLO.java");
                System.out.println("Logging on to " + argv[0] + "...");
                odServer.logon(argv[0], argv[1], argv[2]);

                //----------
                // Open the specified folder and search with the default criteria
                //----------
                System.out.println("Opening " + argv[3] + " folder...");
                ODFolder odFolder = odServer.openFolder(argv[3]);
                System.out.println("Searching with default criteria...");
                Vector hits = odFolder.search();
                System.out.println("Number of hits: " + hits.size());

                if (hits.size() > 0) {
                    ODHit odHit = (ODHit) hits.elementAt(0);
                    String rid = odHit.getResourceID();
                    System.out.println("Resource id is " + rid);

                    // You can add code here to check whether a file for rid
                    // already exists; if it does not, retrieve the resource:
                    String resFile = rid + ".res";
                    System.out.println("\nCall getResources with filename");
                    odHit.getResources(resFile);
                    System.out.println("Resource data written to " + resFile);

                    // With allsegs = false, only the first segment of a
                    // large object document is retrieved.
                    String docFile = argv[4] + "TcHitGetDocLO1.out";
                    System.out.println("\nCall getDocument with filename. "
                            + "If LargeObject data, this will retrieve only the first segment.");
                    odHit.getDocument(docFile, false);
                    System.out.println("Doc data written to " + docFile);
                }

                //----------
                // Cleanup
                //----------
                odFolder.close();
                odServer.logoff();
                odServer.terminate();
            } catch (ODException e) {
                System.out.println("ODException: " + e);
                System.out.println(" id = " + e.getErrorId());
                System.out.println(" msg = " + e.getErrorMsg());
                e.printStackTrace();
            } catch (Exception e2) {
                System.out.println("exception: " + e2);
                e2.printStackTrace();
            }
        }
    }
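To write a complete document to file regardless of whether it is a large object, the allsegs flag can be derived from isLargeObject(). The following is a minimal sketch of that selection logic only. The Hit interface is a hypothetical stand-in that mirrors the two ODHit methods described in this document, so that the example compiles without the ODWEK libraries; the real methods may declare exceptions or return values that are not shown here.

```java
/** Sketch of choosing the allsegs flag from the large-object indicator. */
final class DocumentRetrieval {

    /** Hypothetical stand-in for the two ODHit methods used here. */
    interface Hit {
        boolean isLargeObject();
        void getDocument(String filename, boolean allSegs);
    }

    /**
     * Writes the whole document to filename and returns the allsegs value used.
     * All segments are requested only when the hit is a large object document.
     */
    static boolean retrieveWholeDocument(Hit hit, String filename) {
        boolean allSegs = hit.isLargeObject();
        hit.getDocument(filename, allSegs);
        return allSegs;
    }
}
```

Requesting all segments only for large object documents keeps the call for ordinary documents identical to the single-segment case.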
