Best practices for AFP Resource caching in an IBM®
Content Manager OnDemand Web Enablement Kit Java
API application
12/15/2007
Author: Bob Lichens
Software Engineer
IBM Content Management OnDemand Development
“Best practices for AFP Resource caching in a IBM Content Manager OnDemand Web Enablement Kit Java API
application”. Rev: 12/15/2007
© Copyright International Business Machines Corporation 2007. All rights reserved. US Government Users Restricted
Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Page 1 of 6
This document describes how the IBM® Content Manager OnDemand Web Enablement Kit
Java APIs (referred to hereafter as simply ODWEK) handle the caching of AFP Resource data in
memory, and how you can avoid excessive system load by caching large AFP Resources to file using
the ODWEK Java APIs. It also discusses how you can avoid JVM heap fragmentation caused by large
JNI allocations by retrieving large document data buffers directly to file.
The target audience for this article is application developers, testers, and those who support the
application. This document assumes that the reader is proficient with the Java™ programming language
and with Java™ Virtual Machine configuration and management, and has a good working knowledge of the
IBM Content Manager OnDemand products and AFP print streams.
What are IBM OnDemand Web Enablement Kit Java APIs (ODWEK)?
The ODWEK Java APIs provide industry standard Java™ classes that a customer can use to write
a custom web application that accesses data stored on the OnDemand Server. This custom application
could, for example, allow the end user to log on to an OnDemand server, get a list of folders, search a
specific folder, generate a hitlist of matching documents and retrieve those documents for viewing.
Many other APIs provide more advanced functionality.
Prerequisites: This document addresses features and functionality that are only available in ODWEK
Version 7.1.2.5 or later (including ODWEK 8.4.0.0 and later).
ODWEK Overview of AFP Resource handling and memory cache.
The ODWEK Java APIs are optimized for quick retrieval of documents in high-volume, production
environments and are enhanced to handle high-volume retrievals of AFP documents by caching the AFP
Resources in memory. Each unique resource will be compressed and stored in a memory cache which
is shared by all ODWEK ODServer sessions in the web application environment. This resource sharing
will avoid multiple data requests to the OnDemand Server, free TCP/IP bandwidth, and reduce
OnDemand Server load.
Each Application Group defined within the OnDemand Server has its own unique set of AFP resources
and corresponding AFP Resource IDs. These application group specific resources are added to the
Memory Cache (as needed) for document retrievals and will not be freed from the Memory Cache until all
ODServer objects have been terminated.
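To make the behaviour described above more concrete, here is a conceptual sketch only. It is not ODWEK's implementation, which is internal to the product and may differ. The class name CompressedResourceCache and its methods are hypothetical; the sketch simply shows one shared map, keyed by resource ID, that stores each resource compressed once and calls the loader (standing in for a request to the OnDemand Server) only on a cache miss.

```java
import java.io.ByteArrayOutputStream;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Conceptual model only; ODWEK's internal cache is not exposed and may differ.
final class CompressedResourceCache {
    // One map shared by every caller in the JVM, keyed by resource ID.
    private static final ConcurrentHashMap<String, byte[]> CACHE = new ConcurrentHashMap<>();

    // Returns the resource bytes; the loader runs only when the ID is not cached yet.
    static byte[] get(String resourceId, Supplier<byte[]> loader) {
        byte[] compressed = CACHE.computeIfAbsent(resourceId, id -> compress(loader.get()));
        return decompress(compressed);
    }

    // Entries are held until the cache is explicitly torn down.
    static void clear() {
        CACHE.clear();
    }

    private static byte[] compress(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    private static byte[] decompress(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    break; // truncated input; stop rather than loop forever
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt cache entry", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }
}
```

Because nothing in this model evicts entries, its memory use grows with the number of distinct resources, which is the behaviour the following section is concerned with.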
Problems that can occur.
With the advent of full color printing and other additional data being added to print streams, the size of a
typical document that is stored in OnDemand has increased dramatically. Historically, with limited
bandwidth and system resources, a typical AFP document would have an average resource size of 5KB.
Today’s AFP documents have resource files that may include full color graphics, custom fonts, images
and any number of inserts, and these AFP Resource files can reach sizes approaching 10MB. If the JVM
and the memory cache are not sized for resources of this size, your application may run out of memory.
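As a rough illustration of the scale involved, the arithmetic below compares the two resource sizes mentioned above. The count of 200 distinct resources is a purely hypothetical workload, and the figure is an uncompressed upper bound: the memory cache compresses resources, and this document does not state a compression ratio.

```java
final class ResourceCacheEstimate {
    // Upper bound on cached bytes if every distinct resource were held uncompressed.
    static long upperBoundBytes(int distinctResources, long averageResourceBytes) {
        return Math.multiplyExact((long) distinctResources, averageResourceBytes);
    }

    public static void main(String[] args) {
        long kb = 1024L;
        long mb = 1024L * kb;
        int distinctResources = 200; // hypothetical workload, not a measured value
        System.out.println(upperBoundBytes(distinctResources, 5 * kb) / kb + " KB at 5 KB each");
        System.out.println(upperBoundBytes(distinctResources, 10 * mb) / mb + " MB at 10 MB each");
    }
}
```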
The Solution.
The solution to the sizing and memory problems delineated above is to have your application retrieve
the resource data from a file cache, rather than memory. This way you can avoid multiple retrievals of
large resource file buffers across the network. Also, you can read the data into the JVM directly from
file and avoid passing a large buffer across the JNI/JVM boundary.
The ODWEK Java APIs (version 7.1.2.5 and higher) now include two methods which allow you to
retrieve document data and AFP resource data separately, and output them directly to file:
The ODHit.getResources() API writes the AFP Resource data directly to file and bypasses the ODWEK
memory cache for these resources. This not only avoids the overhead of holding many large resources
in memory but, together with ODHit.getResourceID(), also lets you design a file-based cache that
suits your particular AFP resource caching needs.
The ODHit.getDocument() API allows the data to be written to file in the JNI layer, which can
alleviate some of the JVM heap fragmentation problems seen with larger data buffers. This API is
also optimized to eliminate much of the overhead of the legacy ODHit.retrieve() API when handling
documents, since it performs no data conversion of any kind. This improves ODWEK performance and
also allows the data to be handed off to other methods outside of ODWEK (for example, data
transforms and parsing routines).
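Once the document is on disk, the Java side can read it in small chunks instead of holding the whole buffer on the heap. The following is a minimal sketch of that hand-off using only standard Java I/O; the class DocumentFileReader and its ChunkHandler callback are hypothetical names introduced for illustration and are not part of ODWEK.

```java
import java.io.BufferedInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;

final class DocumentFileReader {
    // Hypothetical callback for a downstream transform or parsing routine.
    interface ChunkHandler {
        void accept(byte[] buffer, int length);
    }

    // Streams the file through the handler in fixed-size chunks; returns total bytes read.
    static long stream(File documentFile, int chunkSize, ChunkHandler handler) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(documentFile.toPath()))) {
            int n;
            while ((n = in.read(buffer)) != -1) {
                handler.accept(buffer, n); // only the first n bytes of buffer are valid
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

With this shape, the largest allocation on the Java heap is the chunk buffer, regardless of how large the retrieved document is.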
API Documentation
ODHit.getResourceID()
Get the unique resource ID for the AFP resources required to view/transform the document. The
resource ID will be constructed from the OnDemand server, OnDemand application ID, and OnDemand
resource ID. Some examples of the values returned are:
"akira-1209-9"
"akira.boulder.ibm.com-1209-9"
"9.17.00.201-1209-9"
This ID can then be used with ODHit.getResources(filename) to save the resource file for later use.
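One way such a file-based cache might be structured is sketched below. So that the example compiles without the ODWEK jar, the hypothetical ResourceHit interface stands in for the two ODHit methods involved (the real methods report errors through ODException, which this stand-in omits). FileResourceCache and FakeHit are likewise illustrative names, and the ".res" naming and temporary-file step are assumptions, not ODWEK requirements.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for ODHit.getResourceID() and ODHit.getResources(filename).
interface ResourceHit {
    String getResourceID();
    void getResources(String filename);
}

final class FileResourceCache {
    private final File cacheDir;
    // One lock object per resource ID so concurrent sessions fetch a resource only once.
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    FileResourceCache(File cacheDir) {
        this.cacheDir = cacheDir;
    }

    // Returns the cached resource file, retrieving it only when it is not on disk yet.
    File resourceFor(ResourceHit hit) {
        String rid = hit.getResourceID();
        File cached = new File(cacheDir, rid + ".res");
        synchronized (locks.computeIfAbsent(rid, k -> new Object())) {
            if (!cached.isFile()) {
                // Write under a temporary name first so readers never see a partial file.
                File tmp = new File(cacheDir, rid + ".res.tmp");
                hit.getResources(tmp.getPath());
                try {
                    Files.move(tmp.toPath(), cached.toPath(), StandardCopyOption.REPLACE_EXISTING);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
        return cached;
    }
}

// Test double: writes three bytes and counts how many times a fetch was requested.
final class FakeHit implements ResourceHit {
    final AtomicInteger fetches = new AtomicInteger();
    private final String rid;

    FakeHit(String rid) {
        this.rid = rid;
    }

    public String getResourceID() {
        return rid;
    }

    public void getResources(String filename) {
        fetches.incrementAndGet();
        try {
            Files.write(new File(filename).toPath(), new byte[] {1, 2, 3});
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The per-ID lock only coordinates threads within one JVM. If several JVMs share the cache directory, some additional coordination would be needed; that is outside the scope of this sketch.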
ODHit.getResources()/ODHit.getResources(filename)
Retrieve the AFP resources for this ODHit. The resources are either returned to the caller as a byte
array or written to the file specified. These methods do not use the standard ODWEK memory cache, and
will not cache the resources in memory.
ODHit.getDocument()
Retrieve the document for this ODHit. The document will be retrieved uncompressed and in its native
format. Conversion is not provided. See ODServer/ODHit.retrieve() if you require conversion
processing. If the current ODHit is for a large object document, only the first segment is returned. See
ODHit.getDocument(filename,allsegs) or ODHit.retrieveSegment() for large document handling.
ODHit.getDocument(String filename, boolean allsegs)
Retrieve the document for this ODHit. The document will be retrieved uncompressed and in its native
format. Conversion is not provided. See ODServer/ODHit.retrieve() if you require conversion
processing. If allsegs = true, then all segments of a large object document will be written to the file
specified. If allsegs = false, then only the first segment will be retrieved and written to file.
ODHit.isLargeObject()
Returns true if the ODHit is for a large object document. This can be used to select the appropriate
retrieval method.
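A minimal sketch of that selection follows. The DocumentHit interface is a hypothetical stand-in for the three ODHit methods documented above (without their ODException handling), and the policy shown, sending large object documents to file with all segments and returning other documents in memory, is only one possible choice.

```java
import java.util.Optional;

// Hypothetical stand-in mirroring the ODHit methods documented above.
interface DocumentHit {
    boolean isLargeObject();
    byte[] getDocument();
    void getDocument(String filename, boolean allsegs);
}

final class DocumentRetriever {
    // Large object documents are written to the file with all segments and
    // Optional.empty() is returned; other documents are returned in memory.
    static Optional<byte[]> retrieve(DocumentHit hit, String filename) {
        if (hit.isLargeObject()) {
            hit.getDocument(filename, true);
            return Optional.empty();
        }
        return Optional.of(hit.getDocument());
    }
}
```

An application that wants to keep all document buffers off the JVM heap could instead write every document to file and pass the result of isLargeObject() as the allsegs argument.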
Code Samples
The following code snippets are examples of how you might use these ODWEK Java APIs.
Disclaimer: This information contains sample application programs in source language, which illustrate
programming techniques on various operating platforms. You may copy, modify, and distribute these
sample programs in any form without payment to IBM, for the purposes of developing, using, marketing
or distributing application programs conforming to the application programming interface for the
operating platform for which the sample programs are written. These examples have not been
thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability,
serviceability, or function of these programs.
// odServer, odFolder, hits and odHit are assumed to be declared earlier
// in the enclosing method (not shown).
try {
    //----------
    // Logon to specified server
    //----------
    odServer = new ODServer();
    odServer.initialize(argv[5], "TcHitGetDocLO.java");
    System.out.println("Logging on to " + argv[0] + "...");
    odServer.logon(argv[0], argv[1], argv[2]);

    //----------
    // Open the specified folder and search with the default criteria
    //----------
    System.out.println("Opening " + argv[3] + " folder...");
    odFolder = odServer.openFolder(argv[3]);
    System.out.println("Searching with default criteria...");
    hits = odFolder.search();
    System.out.println("Number of hits: " + hits.size());
    if (hits.size() > 0)
    {
        odHit = (ODHit) hits.elementAt(0);
        String rid = odHit.getResourceID();
        System.out.println("Resource id is " + rid);

        // You can add code here to check whether a file for rid already exists;
        // if it does not, retrieve the resources to file.
        System.out.println("\nCall getResources with filename");
        String resFileName = rid + ".res";
        odHit.getResources(resFileName);
        System.out.println(new java.io.File(resFileName).length()
            + " bytes written to " + resFileName);

        System.out.println("\nCall getDocument with filename. "
            + "If LargeObject data, this will retrieve only the 1st segment.");
        odHit.getDocument(argv[4] + "TcHitGetDocLO1.out", false);
        System.out.println("Doc data written to " + argv[4] + "TcHitGetDocLO1.out");
    }

    //----------
    // Cleanup
    //----------
    odFolder.close();
    odServer.logoff();
    odServer.terminate();
}
catch (ODException e) {
    System.out.println("ODException: " + e);
    System.out.println(" id = " + e.getErrorId());
    System.out.println(" msg = " + e.getErrorMsg());
    e.printStackTrace();
}
catch (Exception e2) {
    System.out.println("exception: " + e2);
    e2.printStackTrace();
}
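In the sample above, cleanup runs inside the try block, so an exception thrown during the search or retrieval would skip logoff() and terminate(). Since cached resources are held until ODServer objects are terminated, an application may prefer to make cleanup unconditional. The sketch below shows that pattern with a hypothetical ServerSession interface standing in for the two ODServer methods (it omits their ODException handling); SessionCleanup is an illustrative name, not an ODWEK class.

```java
// Hypothetical stand-in for ODServer.logoff() and ODServer.terminate().
interface ServerSession {
    void logoff();
    void terminate();
}

final class SessionCleanup {
    // Runs the work, then always logs off and terminates, even if the work throws.
    static void runThenCleanUp(ServerSession session, Runnable work) {
        try {
            work.run();
        } finally {
            try {
                session.logoff();
            } finally {
                // terminate() still runs if logoff() itself fails.
                session.terminate();
            }
        }
    }
}
```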