Collection Analysis Technical Documentation

Yue Ji
February 26, 2007

This is the Collection Analysis Version 2 Revision 1.

Table of Contents:
1. Collection Analysis Version 2 Revision 1 Interface Structure
2. Interface Programming Technique
3. Collection Analysis Version 2 Revision 1 Output Description
4. Special Technique Used in Programming
5. Related Documentations
6. Contact Staff

1. Collection Analysis Version 2 Revision 1 Interface Structure.

1.1 Interface overview.
The call number is the key to finding records. This version deals only with LC call numbers. The rules for forming call numbers, and the formats in which they are stored in the database, are complex. It is hard for users to work out the call number from the record category they are interested in, and it is very easy to pull the wrong records because of the call number’s complex format. This version of the Collection Analysis Tool no longer asks users to type in a call number. Instead, it presents a list of call numbers based on the users’ interests. There are four select boxes, and the values in each box depend on the selections made in the previous one.

Here is the interface screen shot. The highlights are the examples that will be used to explain the interface design.

1.2 First select box.
The first select box displays the location code, the location name, and the count of MFHDs with LC call numbers in each location. The data in this box is loaded whenever the application is invoked by a browser. Here is an example:

yulint [6] -> Yale Internet Resource

Location code: yulint.
Location name: Yale Internet Resource.
Total MFHD count of LC call number in yulint location: 6.

1.3 Second select box.
The second select box displays the location code, LC class letter, class label, and MFHD count in its location. The data in this box is produced by clicking the SELECT button under “1.
Select Location(s)” after making selections in the first select box. For example, after selecting the first-box entry shown above, the data in this box is:

yulint ->D[2] HISTORY (GENERAL) AND HISTORY OF EUROPE
yulint ->G[2] GEOGRAPHY. ANTHROPOLOGY. RECREATION
yulint ->Q[3] SCIENCE

These seven MFHDs from yulint are:
Two are in LC class D, whose label is HISTORY (GENERAL) AND HISTORY OF EUROPE.
Two are in LC class G, whose label is GEOGRAPHY. ANTHROPOLOGY. RECREATION.
Three are in LC class Q, whose label is SCIENCE.

If you want to see more detail, for example which subclasses of Q yulint has, highlight the Q line, then click the SELECT button under “2. Select Class(es)”. The results will show up in the third select box.

1.4 Third select box.
The third select box displays the location code, LC subclass letter, subclass label, and MFHD count in its location. For example, after selecting all the entries shown above in the second box, the data in this box is:

yulint ->DS[2] Asia
yulint ->G[2] Geography (General). Atlases. Maps
yulint ->QC[1] Physics
yulint ->QE[1] Geology
yulint ->QL[1] Zoology

The subclass of the two D MFHDs in yulint is DS, whose label is Asia.
The subclass of the two G MFHDs in yulint is G, whose label is Geography (General). Atlases. Maps.
The subclasses of the three Q MFHDs in yulint are:
One is QC, whose label is Physics.
One is QE, whose label is Geology.
One is QL, whose label is Zoology.

If you also want to know which call numbers these subclasses cover in yulint, for example the call number range of QC, highlight the QC line, then click the SELECT button under “3. Select Subclass(es)”. The results will show up in the fourth select box.

1.5 Fourth select box.
The fourth select box displays the location code, LC subclass call number range, subclass call number range label, and MFHD count in its location.
For example, after selecting the third-box entries shown above, the data in the fourth box is:

yulint ->***DS1-937[2] History of Asia***
yulint ->     DS501-518[1] East Asia. The Far East
yulint ->     DS801-897[1] Japan
yulint ->***G1-922[1] Geography (General) ***
yulint ->***G3180-9980[1] Maps***
yulint ->     G3290-9880[1] By region or country
yulint ->***QC1-999[1] Physics***
yulint ->     QC81-114[1] Weights and measures
yulint ->***QE1-996[1] Geology***
yulint ->     QE500-639[1] Dynamic and structural geology
yulint ->***QL1-991[1] Zoology***
yulint ->     QL605-739[1] Chordates. Vertebrates

LC subclasses have a parent-child hierarchy. For example, “DS1-937 History of Asia” is the parent; “DS501-518 East Asia. The Far East” and “DS801-897 Japan” are its children. So if you wonder where the two MFHDs in DS1-937 fall, the answer is that one is in DS501-518 and the other is in DS801-897. This parent-child hierarchy is displayed in the browser as shown above.

Parent level: the lines start and end with ***.
Child level: indented away from the arrow and placed under its parent.

A parent may have no children at all. The sum of all MFHD counts at the parent level equals this section’s total MFHD count.

Example of LC Subclass Hierarchy:

Parent / MFHD count     Children / MFHD count
DS1-937 / 2             DS501-518 / 1, DS801-897 / 1 (two children)
G1-922 / 1              No child
G3180-9980 / 1          G3290-9880 / 1 (one child)
QC1-999 / 1             QC81-114 / 1 (one child)
QE1-996 / 1             QE500-639 / 1 (one child)
QL1-991 / 1             QL605-739 / 1 (one child)

1.6 Conversion of “Library of Congress Classification Outline”.
The resource used to apply the call number hierarchy is the “Library of Congress Classification Outline”. The URL is http://www.loc.gov/catdir/cpso/lcco/lcco.html

The “Library of Congress Classification Outline” is entered into three Microsoft Excel sheets: LC_MAIN_CLASS.xls, LC_SUB_CLASS.xls, LC_RANGE.xls.
A separate standalone Java application, EXPORTLCCLASS, processes these three Excel files and imports the data into the following three Oracle tables in LIBSYS.

Creation of these three tables.

1).
create table "LC_MAIN_CLASS" (
    "CLASS_LETTER" VARCHAR2(1) not null constraint "CLASS_LETTER_PK" primary key,
    "CLASS_TITLE" VARCHAR2(70) not null
)

The data sample from LC_MAIN_CLASS table screen shot:

2).
create table "LC_SUB_CLASS" (
    "CLASS_LETTER" VARCHAR2(1) not null,
    "SUBCLASS_LETTER" VARCHAR2(3) not null constraint "SUBCLASS_LETTER_PK" primary key,
    "SUBCLASS_TITLE" VARCHAR2(70) not null
)

The data sample from LC_SUB_CLASS table screen shot:

3).
create table "LC_RANGE" (
    "RANGE_ID" DECIMAL(22) not null constraint "RANGE_ID_PK" primary key,
    "SUBCLASS" VARCHAR2(3) not null,
    "START_NUMBER" VARCHAR2(6),
    "END_NUMBER" VARCHAR2(6),
    "RANGE_TITLE" VARCHAR2(70) not null,
    "HIERARCHY" DECIMAL(22) not null,
    "SEQUENCE" DECIMAL(22) not null,
    "CATEGORY_ID" DECIMAL(22) not null
)

The data sample from LC_RANGE table screen shot:

1.7 Queries behind SELECT buttons.

1). For the first select box:

select * from LIBSYS.LC_MAIN_CLASS

select location_id, count(*) as class_count, substr(normalized_call_no,1,1) as class_letter
from (select * from mfhd_master
      where location_id in [list of location id] and call_no_type = '0')
group by location_id, substr(normalized_call_no,1,1)
order by location_id, substr(normalized_call_no,1,1)

2). For the second select box:

select * from LIBSYS.LC_SUB_CLASS

select count(*) as sub_count,
       substr(normalized_call_no,1,instr(normalized_call_no,' ')) as sub_letter
from (select * from (select * from MFHD_MASTER where location_id = ?)
      where call_no_type = '0')
where substr(normalized_call_no,1,1) = ?
group by substr(normalized_call_no,1,instr(normalized_call_no,' '))
order by 2

3). For the third select box:

select * from LIBSYS.LC_RANGE where SUBCLASS = ? order by range_id

select count(*) as range_count
from (select * from MFHD_MASTER where location_id = ?
      and call_no_type='0')
where substr(normalized_call_no,1,8) between ? AND ?

‘?’ represents data that the programs generate dynamically.

The structure of the first part of the NORMALIZED_CALL_NO field in the MFHD_MASTER table:
Subclass letter (one or more) + four spaces + one digit (subclass number part).
Subclass letter (one or more) + three spaces + two digits (subclass number part).
Subclass letter (one or more) + two spaces + three digits (subclass number part).
Subclass letter (one or more) + one space + four digits (subclass number part).

2. Interface Programming Technique.

2.1 Programming language.
Client: JSP.
Middle tier: JavaScript, AJAX - DWR.
Server: Java.
Special Use: Java Thread, Java TreeMap, PreparedStatement, Java Encoding output.

2.2 Three tiers connections.
The following example explains what is needed to move data from the first select box to the second select box.

1). In JSP - CARevision1Main.jsp:

In the HEAD:
<script src='src/MoveSelection.js'> </script>
<script src='dwr/interface/StartMoveSelection.js'> </script>
<script src='dwr/interface/SelectLocation.js'> </script>
<script src='dwr/interface/ThreadGetLocation.js'> </script>

In the BODY:
<select name="selectlocation" size=8 multiple class="selectlocation">
<button type="button" name="move1" onClick="moveFrom1To2(this.form.selectlocation)" style=" background:images/yellowbackground.jpg;border-width:0px">
<img src="images/selectbutton.gif" width="98" height="23"></button>

moveFrom1To2 is a function in MoveSelection.js.

2). In JavaScript - MoveSelection.js:

Clear the second select box.
StartMoveSelection.startLocation(refreshLocation,wholeOptions).
StartMoveSelection is a Java program, and startLocation is one of its methods.
refreshLocation is a function in MoveSelection.js.
wholeOptions is the processed string of the data selected in the first select box.

In refreshLocation:
ThreadGetLocation.isRunning(updateLocation).
ThreadGetLocation is a Java program, and isRunning is one of its methods.
updateLocation is a function in MoveSelection.js.
updateLocation is called as updateLocation(runStatusBean).
runStatusBean corresponds to the Java program RunStatusBean.java, a bean made up of setters and getters. It needs to be declared in dwr.xml:
<convert converter="bean" match="JavaCodes.RunStatusBean"/>.
This program is the connection between client and server.

In updateLocation(runStatusBean):
runStatusBean.finishRunning reports the running status of the query on the server. The result is represented as a number:
0: Query is running. refreshLocation is called again every 1000 milliseconds (thousandths of a second).
1: Query is finished. Populate the results: SelectLocation.queryResults(popListIn2). SelectLocation is a Java program, and queryResults is one of its methods. popListIn2 is a function in MoveSelection.js. It uses a DWR method to write the results back into the second HTML select box.
2: An error happened on the server. Display the error message in the browser.

3). In Java:

The communication from the middle tier to the server starts in StartMoveSelection.java. This program starts Java threads through the following two Java classes:

ThreadPutLocation putData = new ThreadPutLocation(thisApplication,inputOpt);
ThreadGetLocation getData = new ThreadGetLocation(thisApplication);
putData.start();
getData.start();

ThreadPutLocation is a thread that invokes the query-running Java class on the server. That class is SelectLocation, with the method runQuery.
ThreadGetLocation is a thread that checks the query’s running status and assigns the status, as one of the numbers described above, through RunStatusBean’s setters.
In updateLocation (MoveSelection.js), RunStatusBean’s getters are called. The JavaScript function updateLocation(runStatusBean) checks this number periodically (every 1000 milliseconds). If the query is finished, SelectLocation.queryResults is called to get the query results.
RunStatusBean.java is a setter/getter bean. The setters are: setFinishRunning, setCountRunning.
The getters are: getFinishRunning, getCountRunning.

2.3 Programs and their methods/functions behind SELECT button.

1). Each SELECT button’s background functions with their parameters in MoveSelection.js:

First SELECT button         Second SELECT button        Third SELECT button
moveFrom1To2(fbox)          moveFrom2To3(fbox)          moveFrom3To4(fbox)
refreshLocation()           refreshClass()              refreshSubclass()
updateLocation(runStatusBean)   updateClass(runStatusBean)   updateSubclass(runStatusBean)
popListIn2(selectLocation)  popListIn3(selectClass)     popListIn4(selectSubclass)

2). Each SELECT button’s background methods of Java programs:

Java Program (.java)                    Methods Included

All SELECT Buttons
StartMoveSelection                      startLocation, startClass, startSubclass
RunStatusBean                           setFinishRunning, getFinishRunning, setCountRunning, getCountRunning

First SELECT Button
SelectLocation                          queryStatus, runQuery, queryResults
ThreadPutLocation, ThreadGetLocation    run (in each), isRunning, isCompleted

Second SELECT Button
SelectClass                             queryStatus, runQuery, queryResults
ThreadPutClass, ThreadGetClass          run (in each), isRunning, isCompleted

Third SELECT Button
SelectSubclass                          queryStatus, runQuery, queryResults
ThreadPutSubclass, ThreadGetSubclass    run (in each), isRunning, isCompleted

Although some method names are the same in different Java classes, their contents differ.

3. Collection Analysis Version 2 Revision 1 Output Description.

3.1 Output overview.
You can output data from each of the four select boxes. The output file is a “|” delimited text file. The file name pattern is netid_timestamped_CA.txt. You can import the text file into Microsoft Excel or Access to review and manipulate the data.

The maximum number of records that the text file can contain depends on multiple factors, such as the capability of the Oracle functions, the maximum size of the Oracle result set, the maximum size of a text file, the length limit that Excel or Access imposes on imported files, the memory size of the desktop, and server etc.
It is hard to say what the maximum number of records that can be output is. Fewer than 40,000 records is recommended.

The time needed to produce the output data varies with the request. It can range from seconds to hours. The application uses the AJAX technique to decouple the client from the server. After the client submits the request, the client does not need to wait for the server’s response. The connection is over at that point, but the server continues with its own job. After the job is done, the server notifies the user by sending an email with a URL that points to the file path.

3.2 Output button queries.
There are 4 OUTPUT buttons, each with 4 output types, so there are 16 queries in total behind the OUTPUT buttons. These 16 queries are documented in the following four files:
OutputLocationQueries.
OutputClassQueries.
OutputSubclassQueries.
OutputRangeQueries.

3.3 Output programming summary.
The following example explains what is needed for the OUTPUT of the first select box.

1). In JSP - CARevision1Main.jsp:

In the HEAD:
<script src='src/InvokeOutput.js'> </script>
<script src='dwr/interface/OutputLocation.js'> </script>

In the BODY:
<button type="button" name="out1" onClick="output1(this.form,'<% out.print(passData); %>', '<% out.print(lastName); %>','<% out.print(netID); %>')" style=" background:images/yellowbackground.jpg;border-width:0px">
<img src="images/output.gif" width="98" height="23"></button>

output1 is a function in InvokeOutput.js.

2). In JavaScript - InvokeOutput.js:

output1 parses the parameters passed in from the JSP, then concatenates them into the parameters that are passed on to the Java server program. Depending on the parameters passed in, output1 invokes one of the following four methods of the Java program on the server.

OutputLocation.OutputBM(wholeOptions,passdata): Output Bibliographic and holdings in selected location information.
OutputLocation.OutputBMA(wholeOptions,passdata): Output Bibliographic and holdings in all related locations information.
OutputLocation.OutputBMI(wholeOptions,passdata): Output Bibliographic and holdings plus items in selected location information.
OutputLocation.OutputBMIA(wholeOptions,passdata): Output Bibliographic and holdings plus items in all related locations information.

OutputLocation is a Java program with four methods: OutputBM, OutputBMA, OutputBMI, OutputBMIA.

After you click the OUTPUT button, the message “Your report URL link will be sent to your email” is displayed. At this point the interactive transaction between client and server is over. The client and the server do not wait for each other’s response.

3). In the Java OutputLocation.java:

Each method follows a similar procedure. The steps are listed below in execution order.
Parse the parameters that have been passed in from InvokeOutput.js.
Get the system date, then create the timestamped file name. The file name pattern is netid_YYYYMMDD_hh-mm-ss_CA.txt. "YYYYMMDD_hh-mm-ss" is the date and time at which the file is created.
Assign the output text file’s path (where to get this file).
Set up the environment for sending email.
Dynamically build the queries.
Run the queries.
Write the results into the text file.
Send an email to notify the user that the output file is ready.

3.4 Email servers.

1). There are two domains on campus.

Central campus.
Incoming mail server: netid.mail.yale.edu
Email address: [email protected]

Medical campus.
Incoming mail server: email.med.yale.edu
Email address: [email protected]

Both have the same outgoing mail server: mail.yale.edu

The correct domain name has to be used for the user to receive his/her output file. For example, for staff who work at SML, the incoming mail server should be netid.mail.yale.edu. If the program assigns email.med.yale.edu as their incoming mail server, the email will fail to be delivered.

2). How to decide the user’s email domain name?
In the Voyager OPERATOR table, LAST_NAME contains the staff group data. Most groups are located on the central campus, except for the following 4 groups on the medical campus: Medical Library, Medical Library Student, EPH Library, EPH Library Student. The program uses the netid to find the staff group, then selects the incoming mail server based on that group.

3.5 Output file link.
The output file can be very large. Sending a large file by email may crash the email system, so this application sends only the file’s URL link in the email. When the user clicks the link, it takes the user to the file path on the server. Because the file name starts with the netid, the user can easily find his/her file on the server, then right click the file to save it to his/her desktop. Be cautious: DON’T double click to open the file. If the file is too large, it can freeze the browser, or even the whole desktop. After the file has been downloaded to the desktop, open a new Excel sheet and import the file.

3.6 Output file structure.
The output file is a text file. The fields are delimited by the pipe sign ‘|’.

The fields in the bib and holding file:
MFHD_LOCATION_CODE|CALL_NUMBER|BIB_FORMAT|AUTHOR|BRIEF_TITLE|IMPRINT|BEGIN_PUB_YEAR|PHYSICAL_DESC|LANGUAGE|BIB_ENCODING_LEVEL|BIB_ID|MFHD_ID|SUCCEEDING|BIB_DATE_TYPE|HOLDING

The fields in the bib, holding, and item file:
MFHD_LOCATION_CODE|CALL_NUMBER|BIB_FORMAT|AUTHOR|BRIEF_TITLE|IMPRINT|BEGIN_PUB_YEAR|PHYSICAL_DESC|LANGUAGE|BIB_ENCODING_LEVEL|BIB_ID|MFHD_ID|SUCCEEDING|BIB_DATE_TYPE|HOLDING|ITEM_PERM_LOC_CODE|ITEM_TEMP_LOC_CODE|LAST_CIRC_DATE|CHARGES|BROWSES|BARCODE|ITEM_ID

3.7 Reason of output file disordered in Excel file.
After the text file is imported into an Excel sheet, any record that contains a non-display character or a pipe sign ‘|’ inside a field will appear disordered in the sheet, because its fields shift into the wrong columns. Such a record should be fixed by the Cataloging Department.
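The disorder described in 3.7 arises when a field value itself contains the delimiter or a non-display character. A minimal sketch of one possible safeguard, cleaning each field before it is written, is given below. The class name PipeRecord, its methods, and the choice to replace offending characters with a space are assumptions made for illustration. They are not taken from the application’s code, which leaves the correction to the Cataloging Department.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Collection Analysis code base.
final class PipeRecord {

    private PipeRecord() { }

    // Replace the pipe delimiter and any control (non-display) character with a space.
    static String cleanField(String value) {
        if (value == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder(value.length());
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            boolean unsafe = (c == '|') || Character.isISOControl(c);
            sb.append(unsafe ? ' ' : c);
        }
        return sb.toString();
    }

    // Join the cleaned fields with the pipe sign to form one output line.
    static String toLine(List<String> fields) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                line.append('|');
            }
            line.append(cleanField(fields.get(i)));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Sample values only; the third field contains a stray pipe sign.
        List<String> fields = Arrays.asList("yulint", "QC81", "Weights|measures");
        System.out.println(toLine(fields));   // prints yulint|QC81|Weights measures
    }
}
```

With this kind of cleaning every line keeps the same number of delimiters, so each record lands in the expected columns on import. The cost is that the original character is lost, which is why correcting the source record remains the preferable fix.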

3.8 Programs and their methods/functions behind OUTPUT button.

1). Each OUTPUT button’s background functions with their parameters in InvokeOutput.js:

Button                  Function
First OUTPUT Button     output1(fbox,passdata,lastname,netid)
Second OUTPUT Button    output2(fbox,passdata,lastname,netid)
Third OUTPUT Button     output3(fbox,passdata,lastname,netid)
Fourth OUTPUT Button    output4(fbox,passdata,lastname,netid)

2). Each OUTPUT button’s background methods of Java programs:

Button                  Java Program (.java)    Methods Included
First OUTPUT Button     OutputLocation          OutputBM, OutputBMA, OutputBMI, OutputBMIA
Second OUTPUT Button    OutputClass             OutputCBM, OutputCBMA, OutputCBMI, OutputCBMIA
Third OUTPUT Button     OutputSubclass          OutputSBM, OutputSBMA, OutputSBMI, OutputSBMIA
Fourth OUTPUT Button    OutputRange             OutputRBM, OutputRBMA, OutputRBMI, OutputRBMIA

4. Special Technique Used in Programming.

4.1 prepareStatement vs. createStatement.
This application uses prepareStatement instead of createStatement. The decision is based on the following explanation.

When should you actually use a PreparedStatement rather than a Statement object? It depends on your usage. If you plan on executing your statement infrequently, you might want to consider the createStatement() approach. If you plan on executing that statement frequently, and do not want to incur the repeated cost of creating and compiling the statement, you may be better off using prepared statements.

PreparedStatement objects are best used when you will be executing a large number of identical queries with different values. If you are going to be looping through code and adding or updating rows in bulk, go for the PreparedStatement; otherwise, Statement is your answer.

Example code 1:
PreparedStatement pstmt = conn.prepareStatement("insert into table (column2) values ("My Value") where id = 1000");
pstmt.execute();

Example code 2:
PreparedStatement pstmt = conn.prepareStatement("insert into table (column2) values (?)
where id = ?");
pstmt.setString(1, "My Value");
pstmt.setInt(2, 1000);
pstmt.execute();

The first one is blatantly wrong, but what is wrong with the second one? It is prepared every time you run through the code. Why is it bad to do it this way? You are DOUBLING your number of calls to the database. When you call conn.prepareStatement(String) you are sending a message to the database to pre-compile the SQL string. You then send another message to the database when you call execute() after you set the variables. The correct way of using a prepared statement would be in a situation like this:

Example code 3:
PreparedStatement pstmt = conn.prepareStatement("insert into table (column2) values (?) where id = ?");
while (true) // some kind of terminating loop here not just while true
{
    pstmt.setString(1, valueVar);
    pstmt.setInt(2, intVar);
    pstmt.execute();
}

However, there is a large speed difference for the first 50-60 records sent. If you are doing fewer than 50-60 iterations of this query, it is still faster to use a Statement rather than a PreparedStatement. Once you have iterated about 1000 times, however, the PreparedStatement is twice as fast.

Statements are good for one-time inserts/updates and also for sending batches of several different inserts/updates.

Example code 4:
String sql1 = "insert into...";
String sql2 = "update table set ...";
Statement st = conn.createStatement();
st.addBatch(sql1);
st.addBatch(sql2);
int[] returnRows = st.executeBatch();

4.2 Precompile JSP and Servlet.

1). What data is “loaded into” the browser every time CARevision1Main.jsp is invoked?

The first load of data is all locations that hold Library of Congress call numbers. There are two steps to get the data:
Get all locations that have holding counts from the LOCATION table. These holding counts include all classifications, such as LC, Government Documents etc. So the program needs to go to the MFHD_MASTER table to find the LC holding counts only.
For each location, the whole MFHD_MASTER table has to be scanned to count the LC call numbers it holds.

Because MFHD_MASTER is a huge table of around 8 million records, counting the LC holdings of every location is time consuming and takes about 1 minute. Users facing a blank page for a minute will find that too long. The results do not change after the JSP page has been loaded for the first time, so there is no need to execute the two steps above on every request. To achieve this and improve performance, the application uses the init() method in PrecompileInit.java and the jspInit() method in CARevision1Main.jsp.

2). init() method in Java servlet.

PrecompileInit.java is a servlet. It is located at WEB-INF/classes/Precompile. If the servlet is declared in web.xml as below, its init() method is precompiled and executed only once, and its results are cached, when the application is loaded into the Tomcat container for any reason, such as deploying this application, starting the whole Tomcat, starting this application, or reloading this application. If it is not declared in web.xml, init() is not precompiled and executed. The <init-param> part is not required for precompiling, but it is a useful feature that brings changeable external key-value pairs into the code.

<servlet>
  <servlet-name>PrecompileInit</servlet-name>
  <display-name>Servlet Precompile Init</display-name>
  <description>Fast servlet for listing location with LC MFHD counts.
  </description>
  <servlet-class>Precompile.PrecompileInit</servlet-class>
  <init-param>
    <param-name>Incoming_central_mail_server</param-name>
    <param-value>netid.mail.yale.edu</param-value>
  </init-param>
  <init-param>
    <param-name>Incoming_medical_mail_server</param-name>
    <param-value>email.med.yale.edu</param-value>
  </init-param>
  <init-param>
    <param-name>Outgoing_mail_server</param-name>
    <param-value>mail.yale.edu</param-value>
  </init-param>
  <init-param>
    <param-name>Send_email_address</param-name>
    <param-value>[email protected]</param-value>
  </init-param>
  <init-param>
    <param-name>Output_file_path</param-name>
    <!--param-value> /usr/local/tomcat/webapps/DownloadFiles/Collection_Analysis_Files </param-value-->
    <param-value>c:/temp</param-value>
  </init-param>
  <init-param>
    <param-name>Output_file_URL</param-name>
    <param-value> http://magellan.library.yale.edu:8085/DownloadFiles/Collection_Analysis_Files </param-value>
  </init-param>
  <load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
  <servlet-name>PrecompileInit</servlet-name>
  <url-pattern>/servlet/PrecompileInit</url-pattern>
</servlet-mapping>

Explanation about <load-on-startup>.
This tag specifies that the servlet should be loaded automatically when the web application is started. The value is a single positive integer, which specifies the loading order. Servlets with lower values are loaded before servlets with higher values (i.e. a servlet with a load-on-startup value of 1 or 5 is loaded before a servlet with a value of 10 or 20). When loaded, the init() method of the servlet is called.
Therefore this tag provides a good way to do the following:
- start any daemon threads, such as a server listening on a TCP/IP port, or a background maintenance thread
- perform initialization of the application, such as parsing a settings file which provides data to other servlets/JSPs

If no <load-on-startup> value is specified, the servlet will be loaded when the container decides it needs to be loaded, typically on its first access. This is suitable for servlets that don't need to perform special initialization.

If the servlet is not declared in web.xml, init() is not precompiled and executed when Tomcat starts the application.
If the servlet is declared in web.xml, init() is precompiled and executed only once, when Tomcat starts the application.
The data generated by init() is cached from the time Tomcat starts the application and for the application’s whole lifetime.

The init() of PrecompileInit.java makes the data source connection, gets all locations with LC MFHD counts as described above, and saves them in a cached temp file for jspInit() to use.

3). jspInit() method in CARevision1Main.jsp.

The jspInit() method is compiled and executed only once, with its results cached, the first time the JSP is invoked, whether or not the JSP is declared in web.xml. If the JSP is declared in web.xml as below, jspInit() is compiled and executed before the JSP is invoked, but its results cannot be cached; when the JSP is then invoked for the first time, jspInit() is compiled and executed again, and this time the results are cached for the lifetime. If the JSP is not declared in web.xml, jspInit() is not compiled and executed before the JSP is invoked. There is no need to add the declaration to web.xml, because it causes jspInit() to be compiled and executed twice.

<servlet>
  <servlet-name>JSPINIT Preload</servlet-name>
  <jsp-file>/CARevision1Main.jsp</jsp-file>
  <load-on-startup>1</load-on-startup>
</servlet>

The JSP’s preload does not need a mapping section as the servlet does.
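The behaviour described above for init(), running the expensive location query once at start-up and serving every later request from the cached result, can be sketched in plain Java without a servlet container. Everything in this sketch is an assumption for illustration: the class LocationCountCache, the loadCounts() stand-in for the MFHD_MASTER query, and the sample count. In the application the container calls init() itself; here main() plays that role.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for PrecompileInit; no servlet API is used in this sketch.
final class LocationCountCache {

    private static Map<String, Integer> cached;   // filled once by init()
    private static int loadCalls = 0;             // how many times the slow query ran

    private LocationCountCache() { }

    // Stand-in for the slow count over MFHD_MASTER (sample data only).
    private static Map<String, Integer> loadCounts() {
        loadCalls++;
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("yulint", 6);
        return counts;
    }

    // Runs the load on the first call only; later calls leave the cache untouched.
    static synchronized void init() {
        if (cached == null) {
            cached = Collections.unmodifiableMap(loadCounts());
        }
    }

    // Every request reads the cached map instead of querying again.
    static synchronized Map<String, Integer> locations() {
        if (cached == null) {
            throw new IllegalStateException("init() has not been called");
        }
        return cached;
    }

    static synchronized int loadCalls() {
        return loadCalls;
    }

    public static void main(String[] args) {
        init();
        init();   // the second call does not repeat the query
        System.out.println(locations() + " loaded " + loadCalls() + " time(s)");
    }
}
```

The cached map is wrapped as unmodifiable so that request-handling code can read it freely without being able to alter the shared copy.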

In the Tomcat environment, a JSP's jspInit() method is called only once, the first time the JSP is invoked, and its results are cached for the JSP’s lifetime. Be aware that this must happen at the first invocation of the JSP.

Here is a trick that you can use to improve performance with the jspInit() method: use it to cache static data. Generally a JSP generates not only dynamic data but also static data. Programmers often make the mistake of creating both dynamic and static data in the JSP page on every request. Dynamic data has to be created each time because of its nature, but there is no need to create static data for every request in a JSP page.

If the JSP is not declared in web.xml, jspInit() is not precompiled and executed when Tomcat starts the application.
If the JSP is declared in web.xml, jspInit() is precompiled and executed when Tomcat starts the application.
Whether or not the JSP is declared in web.xml, the JSP is compiled and executed only once when it is invoked by a browser.
The data generated by jspInit() is cached for the lifetime.

The jspInit() of CARevision1Main.jsp parses the data acquired from the init() of PrecompileInit.java and caches it in string arrays for lifetime use. That means that after CARevision1Main.jsp has been invoked for the first time, jspInit() is not compiled and executed any more. CARevision1Main.jsp simply reads the cached data every time it runs.

4.3 Diacritics.
Displaying the diacritics of foreign languages correctly is a complex issue. For the most common European languages, the encoding is ISO8859_1. Whether the diacritics can be displayed or retrieved correctly depends on whether every environment involved supports ISO8859_1. The environments include the Java language, SQL, text editors, browsers, MS Excel, Access, Word etc. From the Java programming point of view, if the code uses inappropriate methods, the output function still works, but the diacritics come out as the wrong characters.
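A minimal sketch of how the wrong characters arise: text encoded as ISO8859_1 comes back intact only when it is decoded with the same charset. The class name EncodingDemo and the sample word are assumptions for illustration, and the sketch relies on java.nio.charset.StandardCharsets, which requires Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demonstration class; not part of the application.
final class EncodingDemo {

    private EncodingDemo() { }

    // Encode as ISO-8859-1, then decode with the matching charset.
    static String roundTripLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Encode as ISO-8859-1, then decode with a different charset (UTF-8).
    static String mismatchedDecode(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Caf" followed by e-acute, written as an escape so the source file encoding does not matter.
        String title = "Caf\u00e9";
        System.out.println(title.equals(roundTripLatin1(title)));    // prints true
        System.out.println(title.equals(mismatchedDecode(title)));   // prints false
    }
}
```

In the mismatched case no exception is thrown. The single byte for the accented letter is not valid UTF-8 on its own, so the decoder silently substitutes a replacement character, which matches the symptom described above: the output works but the diacritic is wrong.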

Here is the code used in this application to write the correct diacritics into the file:

OutputStream fout = new FileOutputStream(txtName);
OutputStream bout = new BufferedOutputStream(fout);
OutputStreamWriter txtOutput = new OutputStreamWriter(bout, "8859_1");
txtOutput.write(dataLine);
txtOutput.close();

This code outputs the diacritics successfully. However, displaying them correctly also depends on whether the display environment supports ISO8859_1. For example, NOTEPAD displays the diacritics correctly, but VEDIT does not.

4.4 The feature of output file’s path on the server.
This path has to be accessible through a URL link. It cannot be an arbitrary path on the server. The path has to be inside the Tomcat container, under the root of webapps. A simple web application, DownloadFiles, was created for this purpose. “DownloadFiles” is the root directory for URL-accessible file paths. All files and directories that need to be accessed by URL are under “DownloadFiles”. For this version of Collection Analysis, the directory is named Collection_Analysis_Files. All output files are saved in the Collection_Analysis_Files directory. After the user OUTPUTs his/her file, he/she will get an email with the URL link that indicates the file path.

The Collection_Analysis_Files directory needs to be cleaned up daily. Files are kept in this directory for 14 days from their creation date.

Here is an example of the email message that users receive after they click the OUTPUT button:

Here is your Collection Analysis File:
yj33_20070305_16-55-36_CA.txt
Please click the link to find your file. Then right click your file name to save on your desktop.
http://magellan.library.yale.edu:8085/DownloadFiles/Collection_Analysis_Files
This file will be saved for 14 days.

5. Related Documentations.
- ReadMe_CARevision1_Deploy.txt.
- HowToUseDWR.doc.
- OutputLocationQueries.
- OutputClassQueries.
- OutputSubclassQueries.
- OutputRangeQueries.

6. Contact Staff.
IS&P: Estelle Pope < [email protected] >
ITS: Gail Barnett < [email protected] >
Bob Rice < [email protected] >