XML with AuctionBase
For Lab Use
17 January 2005
Mehmet Cihan Kurt
9702413, Cmpe Dept.
INDEX
1 Introduction
  1.1 MSSQL and Web Warm-up I
  1.2 MSSQL and Web Warm-up II
  1.3 AuctionBase Schema and Data
  1.4 MSSQL Features
  1.5 AuctionBase Web Site
2 MSSQL and Web Warm-up Part I
  2.1 Section A: Getting familiar with MSSQL
    2.1.1 Logging In to Query Analyzer
    2.1.2 Creating a Table
    2.1.3 Creating a Table With a Primary Key
    2.1.4 Inserting Tuples
    2.1.5 Getting the Value of a Relation
    2.1.6 Getting Rid of Your Tables
  2.2 Section B: Simple Web Interface and Servlets
    2.2.1 Overview
    2.2.2 Setting up Java and Tomcat
    2.2.3 Configuring & Using Apache Tomcat [2]
      2.2.3.1 Summary
      2.2.3.2 Steps for Installation & Setup
      2.2.3.3 Install the JDK
      2.2.3.4 Set the JAVA_HOME Variable
      2.2.3.5 Change the Port to 80
      2.2.3.6 Turn on Servlet Reloading
      2.2.3.7 Enable the Invoker Servlet
      2.2.3.8 Test Server
        2.2.3.8.1 Verify That the Server Can Start
        2.2.3.8.2 Try Some Simple HTML and JSP Pages
        2.2.3.8.3 Setup Your Development Environment
        2.2.3.8.4 Create a Development Directory
        2.2.3.8.5 Make Shortcuts to Start and Stop the Server
        2.2.3.8.6 Set Your CLASSPATH
      2.2.3.9 Installing JDBC Drivers for Windows XP & Windows 2000 [4]
    2.2.4 Java and HTML Warm up [9]
    2.2.5 Retrieving Input from the User
    2.2.6 Forms
    2.2.7 Java Server-Side Input Handling [10]
    2.2.8 Returning Output to the User
    2.2.9 Java Code Output
    2.2.10 A Complete JDBC Example [5]
      2.2.10.1 Creating a Database
      2.2.10.2 Getting Information from a Database
      2.2.10.3 Obtaining Result MetaData Type Information
    2.2.11 Handling Special Characters in HTML
3 MSSQL and Web Warm-up Part II
  3.1 Preliminary Information about XML and DTD’s
    3.1.1 What is XML?
    3.1.2 What Do XML Documents Look Like? [12]
    3.1.3 DTD (Data Type Definitions) for XML [11]
      3.1.3.1 Elements
      3.1.3.2 Attributes
      3.1.3.3 Comments
  3.2 Section A: Examining XML Files
    3.2.1 Sample eBay Data
  3.3 Section B: Designing Relational Schema
  3.4 Section C: Creating Tables in MS SQL
  3.5 Section D: Writing a data transformation program [10]
    3.5.1 Creating the Skeleton
    3.5.2 Importing Classes
    3.5.3 Setting up for I/O
    3.5.4 Implementing the ContentHandler Interface
    3.5.5 Setting up the Parser
    3.5.6 Writing the Output
    3.5.7 Spacing the Output
    3.5.8 Handling Content Events
      3.5.8.1 Document Events
      3.5.8.2 Element Events
      3.5.8.3 Character Events
    3.5.9 Compiling and Running the Program
  3.6 Section D: Load the data into MSSQL [13]
4 AuctionBase Schema and Data
  4.1 Section A: Indexes
  4.2 Section B: Views
    4.2.1 What is a View? [14]
    4.2.2 Views of AuctionBase [13]
5 MSSQL Features
  5.1 Section A: Current Time
  5.2 Section B: Constraints and Triggers [15] [14]
    5.2.1 What is a CONSTRAINT?
    5.2.2 What is a TRIGGER? [15]
      5.2.2.1.1 Triggers Compared to Constraints [15]
    5.2.3 CONSTRAINTs & TRIGGERs of AuctionBase DB [13]
      5.2.3.1 CONSTRAINTs of AuctionBase DB
      5.2.3.2 TRIGGERs of AuctionBase DB
6 AuctionBase Web Site
  6.1 Functionality
  6.2 Web Interface [13]
  6.3 System testing
7 Conclusion
8 References & Resources
1 Introduction
“XML with AuctionBase for Lab Use” is a complete implementation of an auction web site:
the database design and the working web site, supported by examples, implementation
details, references, and links, so that a student with this manual in hand can carry
out the project independently. Because this is a project “For Lab Use”, particular
attention is given to examples, references, and World Wide Web links, so that the
student can follow this booklet by reading the supporting material and doing its
exercises, gain working knowledge, and then apply it to the project. Real data from
the auction site eBay is supplied to the student in XML form, so that s/he can work on
the data and design his/her own MSSQL database in fourth normal form (4NF). After the
database is designed, some special MSSQL features (e.g. CONSTRAINTS) should be learnt
and applied so that database consistency is preserved and the required functionality
is implemented. Finally, a simple web interface is needed for user interaction. This
lab booklet covers a Java implementation (including Servlets, JSP, and JDBC). The
assumed operating systems are Windows XP or Windows 2000 with the latest patches and
service packs applied. Since Java and HTML code is portable, you can set up the final
project on any Java-compatible platform with minor changes.
This booklet is divided into 5 parts, which are summarized below. Each part ends with
references and links, so that additional information can be found in books and printed
documents or in internet resources.
1.1 MSSQL and Web Warm-up I
The student will become familiar with MSSQL and Java by implementing a very simple
end-to-end system: running queries on the database and displaying the results as
HTML through Java Servlets.
1.2 MSSQL and Web Warm-up II
The simple end-to-end system from Part I will be extended with additional
features of MSSQL and HTML such as input boxes, menus, parameterized queries,
database updates, and result browsing.
1.3 AuctionBase Schema and Data
A large amount of data will be supplied to the student in XML form. The student will
examine the given data and design a relational schema for it. A small program or
script will be written to parse the XML data and convert it to an importable form
for loading into MSSQL.
1.4 MSSQL Features
The student will experiment with indexes and their effect on performance, and will
work with jobs and views. Some advanced features of MSSQL will be used to implement
“current time” and other features of an auction system under identified real-world
constraints. CONSTRAINTS and TRIGGERS will be applied in this part.
1.5 AuctionBase Web Site
The AuctionBase Web Site will be designed with the necessary queries and updates on
the database, adapting the Part II web interface to the new requirements and
functionality. A friendly and simple web interface is enough, but further details can
be implemented if time permits or for bonus credit.
2 MSSQL and Web Warm-up Part I
2.1 Section A: Getting familiar with MSSQL
This section introduces the MSSQL Query Analyzer interface: connecting to the database
with a username and password, trying some SQL commands, creating a table, making
selections over the table, and dropping the table, along with some experimentation
with the interface.
2.1.1 Logging In to Query Analyzer
Query Analyzer can be reached from the Microsoft SQL Server entry on the Programs
menu. It opens with a window where you choose the server and then log in with your
username and password. Since many people will connect to the database from the same
lab computer or from their home computers, SQL Server Authentication will be used
instead of Windows Authentication. The database administrator in the department
should provide you with a username and password with the necessary privileges.
2.1.2 Creating a Table
In Query Analyzer we can execute any SQL command. One simple type of command
creates a table (relation). The form is
CREATE TABLE <tableName> (
<list of attributes and their types>
);
You may enter a command on one line or on several lines. If your command runs over
several lines, end it with a semicolon, which terminates any command. An example
table-creation command is:
CREATE TABLE test (
i int,
s char(10)
);
This command creates a table named test with two attributes. The first, named i, is an
integer, and the second, named s, is a character string of length (up to) 10.
If a command executes successfully and does not return any results, you will get the
message “The command(s) completed successfully.” If you want to run just a single
command or one line of a command, select it and click the “Run” button in Query
Analyzer.
2.1.3 Creating a Table With a Primary Key
To create a table that declares attribute a to be a primary key:
CREATE TABLE <tableName> (..., a <type> PRIMARY KEY, b, ...);
To create a table that declares the set of attributes (a,b,c) to be a primary key:
CREATE TABLE <tableName> (<attrs and their types>, PRIMARY KEY
(a,b,c));
2.1.4 Inserting Tuples
Having created a table, we can insert tuples into it. The simplest way to insert is with
the INSERT command:
INSERT INTO <tableName>
VALUES( <list of values for attributes, in order> );
For instance, we can insert the tuple (10, 'foobar') into relation test by
INSERT INTO test VALUES(10, 'foobar');
2.1.5 Getting the Value of a Relation
We can see the tuples in a relation with the command:
SELECT *
FROM <tableName>;
For instance, after the above create and insert statements, the command
SELECT * FROM test;
produces the result
I          S
---------- ----------
10         foobar
2.1.6 Getting Rid of Your Tables
To remove a table from your database, execute
DROP TABLE <tableName>;
We suggest you execute
DROP TABLE test;
after trying out this sequence of commands, to avoid leaving a lot of garbage around
that will still be there the next time you use the MSSQL system.
2.2 Section B: Simple Web Interface and Servlets
This section gives a simple introduction to web interfaces and their implementation with
HTML and Java. JDBC (Java Database Connectivity) will be used for database interaction.
It covers setting up the Servlet & JSP web environment, compiling Servlets, and writing
the HTML and Java code that retrieves data from the database and displays it on the web
interface.
2.2.1 Overview
Java Servlets and JSP (JavaServer Pages) are the Java solution for providing web-based
services. They provide an interface for interacting with client queries and providing
server responses. As such, discussion of much of the input and output in terms of HTML
will overlap. Students will interface with MSSQL using JDBC from Java Servlets.
2.2.2 Setting up Java and Tomcat
Java Servlets interact with the user through HTML forms. To run them, you need a servlet
engine of your choice listening on a specific port on your PC. This project uses Tomcat,
which is part of the Apache Jakarta Project. You can find many useful documents and
manuals, and download the free server for the Windows platform, from
http://jakarta.apache.org/tomcat/ ; the latest version at the time of writing is Tomcat
5.5. Tomcat requires a Java development environment to be set up beforehand, so you
should get the latest version from http://java.sun.com and install it. Instead of
deploying the whole development environment, you can choose a smaller package that fits
your needs: Java WSDP (Java Web Services Developer Pack) for web applications, or J2SE,
which targets desktop environments. Our suggestion is to install J2SE, a general-purpose
package that will remain useful if you later write Java code unrelated to web services.
2.2.3 Configuring & Using Apache Tomcat [2]
2.2.3.1 Summary
Using Tomcat as a deployment server or integrating Tomcat as a plugin within the
regular Apache server or a commercial Web server is more complicated than what is
described in this tutorial. Although such integration is valuable for a deployment scenario
(see http://jakarta.apache.org/tomcat/tomcat-5.5-doc/), our goal here is to show how to
use Tomcat as a development server on your desktop. Regardless of what deployment
server you use, you'll want a standalone server on your desktop to use for development.
The examples here assume you are using Windows, but they can be easily adapted for
Linux, Solaris, and other versions of Unix.
Steps for Installation & Setup
1. Install the JDK. Make sure JDK 5.0 is installed and your PATH is set so that both
   "java -version" and "javac -help" give a result.
2. Configure Tomcat.
   1. Download the software. Go to
      http://jakarta.apache.org/site/binindex.cgi#tomcat and download and run
      the latest stable release, which at the time of writing is Tomcat 5.5.4.
   2. Set the JAVA_HOME variable. Set it to refer to the base JDK directory, not
      the bin subdirectory.
   3. Change the port to 80. Edit install_dir/conf/server.xml and change the port
      attribute of the Connector element from 8080 to 80.
   4. Turn on servlet reloading. Edit install_dir/conf/context.xml and change
      <Context> to <Context reloadable="true">.
   5. Enable the invoker servlet. Go to install_dir/conf/web.xml and uncomment
      the servlet and servlet-mapping elements that map the invoker
      servlet to /servlet/*.
   6. Set the CATALINA_HOME variable. Optionally, set CATALINA_HOME to refer
      to the top-level Tomcat installation directory. This is not necessary unless
      you copy the startup scripts instead of making shortcuts to them.
3. Test the server.
   1. Verify that you can start the server. Double-click
      install_dir/bin/startup.bat and try accessing http://localhost/.
   2. Check that you can access your own HTML & JSP pages. Drop some
      simple HTML and JSP pages into install_dir/webapps/ROOT and access
      them with http://localhost/filename.
4. Set up your development environment.
   1. Create a development directory. Put it anywhere except within the Tomcat
      installation hierarchy.
   2. Make shortcuts to the Tomcat startup & shutdown scripts. Put shortcuts to
      install_dir/bin/startup.bat and install_dir/bin/shutdown.bat in your
      development directory and/or on your desktop.
   3. Set your CLASSPATH. Include the current directory ("."), the servlet / JSP
      JAR files (install_dir/common/lib/servlet-api.jar and
      install_dir/common/lib/jsp-api.jar), and your main development directory
      from Step 1.
   4. Bookmark the servlet & JSP javadocs. Add
      install_dir/webapps/tomcat-docs/servletapi/index.html and
      install_dir/webapps/tomcat-docs/jspapi/index.html to your
      bookmarks/favorites list.
5. Compile and test some simple servlets.
   1. Test a packageless servlet. Compile a simple servlet, put the .class file in
      install_dir/webapps/ROOT/WEB-INF/classes, and access it with
      http://localhost/servlet/ServletName.
   2. Test a servlet that uses packages. Compile the servlet, put the .class file in
      install_dir/webapps/ROOT/WEB-INF/classes/packageName, and access it
      with http://localhost/servlet/packageName.ServletName.
   3. Test a servlet that uses packages and utility classes. Compile a servlet, put
      both the servlet .class file and the utility .class file in
      install_dir/webapps/ROOT/WEB-INF/classes/packageName, and access
      the servlet with http://localhost/servlet/packageName.ServletName. This
      third step verifies that the CLASSPATH includes the top level of your
      development directory.
6. Establish a simplified deployment method.
   1. Copy to a shortcut. Make a shortcut to install_dir/webapps/ROOT. Copy
      packageless .class files directly there. With packages, copy the entire
      directory there.
   2. Use the -d option of javac. Use -d to tell Java where the deployment
      directory is.
   3. Let your IDE take care of deployment. Tell your IDE where the
      deployment directory is and let it copy the necessary files.
   4. Use ant or a similar tool. Use the Apache make-like tool to automate
      copying of files.
7. Get more info. Access the complete set of Tomcat docs, get free JSP and servlet
   tutorials, read the official servlet and JSP specifications, get JSP-savvy editors
   and IDEs, look for J2EE jobs, etc.
2.2.3.2 Install the JDK
Your first step is to download and install Java. The servlet 2.4 (JSP 2.0) specification
requires JDK 1.3 or later; J2EE 1.4 (which includes servlets 2.4 and JSP 2.0) requires
JDK 1.4 or later. You might as well get a recent Java version, so use JDK 5.0. If you
know which of those Java versions will be used on your project, get that one. See the
following site for download and installation information.
• JDK 5.0 for Windows, Linux, and Solaris:
  http://java.sun.com/j2se/1.5.0/download.jsp
  Be sure you download the full SDK (Software Development Kit), not just the JRE
  (Java Runtime Environment). The JRE is only for running already-compiled .class
  files, and lacks a compiler.
Once you've installed Java, confirm that everything including your PATH is configured
properly by opening a DOS window and typing "java -version" and "javac -help".
You should see a real result both times, not an error message about an unknown
command. Or, if you use an IDE, compile and run a simple program to confirm that the
IDE knows where you installed Java.
2.2.3.3 Set the JAVA_HOME Variable
Next, you must set the JAVA_HOME environment variable to tell Tomcat where to find
Java. Failing to properly set this variable prevents Tomcat from compiling JSP pages.
This variable should list the base JDK installation directory, not the bin subdirectory. For
example, on almost any version of Windows, if you installed the JDK in C:\j2sdk1.5.0,
you might put the following line in your C:\autoexec.bat file.
set JAVA_HOME=C:\j2sdk1.5.0
On Windows XP, you could also go to the Start menu, select Control Panel, choose
System, click on the Advanced tab, press the Environment Variables button at the
bottom, and enter the JAVA_HOME variable and value directly. On Windows 2000 and NT,
you do Start, Settings, Control Panel, System, then Environment. However, you can use
C:\autoexec.bat on those versions of Windows also (unless a system administrator has set
your PC to ignore it).
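The rule above (base JDK directory, not the bin subdirectory) can be checked with a few lines of Java. The following is only a minimal sketch: the class name JavaHomeCheck and its method are hypothetical helpers written for this booklet, not part of Tomcat or the JDK, and the check is deliberately simple.

```java
import java.io.File;

// Hypothetical helper (not part of Tomcat or the JDK): checks that a JAVA_HOME
// value names the base JDK directory rather than its bin subdirectory.
class JavaHomeCheck {

    // Returns true if the value is non-empty and its last path component is not "bin".
    static boolean looksLikeBaseDir(String javaHome) {
        if (javaHome == null || javaHome.trim().length() == 0) {
            return false;
        }
        // Normalize Windows separators and drop any trailing separators.
        String p = javaHome.trim().replace('\\', '/');
        while (p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        String last = p.substring(p.lastIndexOf('/') + 1);
        return !last.equalsIgnoreCase("bin");
    }

    public static void main(String[] args) {
        String javaHome = System.getenv("JAVA_HOME");
        System.out.println("JAVA_HOME = " + javaHome);
        if (!looksLikeBaseDir(javaHome)) {
            System.out.println("JAVA_HOME is unset or points at a bin directory.");
        } else if (!new File(javaHome, "bin").isDirectory()) {
            System.out.println("No bin subdirectory found under JAVA_HOME.");
        } else {
            System.out.println("JAVA_HOME looks like a base JDK directory.");
        }
    }
}
```

Compile it with javac and run it from a newly opened DOS window, so that it sees the environment variable you just set.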
2.2.3.4 Change the Port to 80
Assuming you have no other server already running on port 80, you'll find it convenient
to configure Tomcat to run on the default HTTP port (80) instead of the out-of-the-box
port of 8080. Making this change lets you use URLs of the form http://localhost/blah
instead of http://localhost:8080/blah. Note that you need admin privileges to make this
change on Unix/Linux. Also note that some versions of Windows XP automatically start
IIS on port 80. So, if you use XP and want to use port 80 for Tomcat, you may need to
disable IIS (see the Administrative Tools section of the Control Panel).
To change the port, edit install_dir/conf/server.xml and change the port attribute of the
Connector element from 8080 to 80, yielding a result similar to that below.
<Connector port="80" ...
maxThreads="150" minSpareThreads="25" ...
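Before restarting Tomcat on the new port, it can help to confirm that nothing else (IIS, for instance) is already holding it. The sketch below is an illustration only: PortCheck is a hypothetical helper, and it assumes that failing to bind a port means the port is taken or that you lack the privilege to use it.

```java
import java.io.IOException;
import java.net.ServerSocket;

// Hypothetical helper: reports whether a TCP port can currently be bound on this machine.
class PortCheck {

    // Tries to bind the port. A failure usually means another server already
    // holds it, or that the current user is not allowed to bind it.
    static boolean isPortFree(int port) {
        ServerSocket socket = null;
        try {
            socket = new ServerSocket(port);
            return true;
        } catch (IOException e) {
            return false;
        } finally {
            if (socket != null) {
                try {
                    socket.close();
                } catch (IOException ignored) {
                    // Nothing useful can be done if closing fails.
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] ports = {80, 8080};
        for (int i = 0; i < ports.length; i++) {
            String state = isPortFree(ports[i]) ? "is free" : "is in use or not bindable";
            System.out.println("Port " + ports[i] + " " + state);
        }
    }
}
```

Run it while Tomcat is stopped; if port 80 is reported as in use, some other server is occupying it.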
2.2.3.5 Turn on Servlet Reloading
The next step is to tell Tomcat to check the modification dates of the class files of
requested servlets, and reload ones that have changed since they were loaded into the
server's memory. This slightly degrades performance in deployment situations, so is
turned off by default. However, if you fail to turn it on for your development server,
you'll have to restart the server every time you recompile a servlet that has already been
loaded into the server's memory. Since this tutorial discusses the use of Tomcat for
development, this change is strongly recommended.
To turn on servlet reloading, edit install_dir/conf/context.xml and change
<Context>
to
<Context reloadable="true">
2.2.3.6 Enable the Invoker Servlet
The invoker servlet lets you run servlets without first making changes to your Web
application's deployment descriptor (i.e., the WEB-INF/web.xml file). Instead, you just
drop your servlet into WEB-INF/classes and use the URL http://host/servlet/ServletName
(or http://host/webAppName/servlet/ServletName once you start using your own Web
applications). The invoker servlet is extremely convenient when you are learning and even
when you are doing your initial development. You almost certainly want to enable it
when learning, but you should disable it again before deploying any real applications.
To enable the invoker servlet, uncomment the following servlet and servlet-mapping
elements in install_dir/conf/web.xml. Do not confuse this Apache Tomcat-specific
web.xml file with the standard one that goes in the WEB-INF directory of each Web
application.
<servlet>
<servlet-name>invoker</servlet-name>
<servlet-class>
org.apache.catalina.servlets.InvokerServlet
</servlet-class>
...
</servlet>
...
<servlet-mapping>
<servlet-name>invoker</servlet-name>
<url-pattern>/servlet/*</url-pattern>
</servlet-mapping>
2.2.3.7 Test Server
2.2.3.7.1 Verify That the Server Can Start
Before trying your own servlets or JSP pages, you should make sure that the server is
installed and configured properly. For Tomcat, click on install_dir/bin/startup.bat (or
execute install_dir/bin/startup.sh on Unix/Linux). Next, enter the URL http://localhost/
in your browser and make sure you get the Tomcat welcome page, not an error message
saying that the page could not be displayed or that the server could not be found. If you
chose not to change the port number to 80 as described above, you will need to use a
URL like http://localhost:8080/ that includes the port number.
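The same check can be made without a browser. The following is a minimal sketch, assuming only the standard java.net classes; ServerProbe and its two methods are hypothetical names introduced here for illustration. It requests the welcome page and prints the HTTP status code, or reports that no connection could be made.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical helper: requests the local server's welcome page and reports the HTTP status.
class ServerProbe {

    // Builds the local server URL; port 80 is the HTTP default and is omitted.
    static String serverUrl(int port) {
        return port == 80 ? "http://localhost/" : "http://localhost:" + port + "/";
    }

    // Returns the HTTP status code, or -1 if the URL is malformed or no
    // connection could be made within the timeout.
    static int probe(String url) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) new URL(url).openConnection();
            conn.setConnectTimeout(2000);
            conn.setReadTimeout(2000);
            return conn.getResponseCode();
        } catch (IOException e) {
            return -1;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }

    public static void main(String[] args) {
        // Pass 8080 as the first argument if you did not change the port.
        int port = args.length > 0 ? Integer.parseInt(args[0]) : 80;
        String url = serverUrl(port);
        int status = probe(url);
        System.out.println(url + " -> " + (status == -1 ? "no response" : "HTTP " + status));
    }
}
```

A status of 200 indicates that the server answered the request; "no response" points to the startup problems discussed in this section.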
If this does not work, there are a couple of things to check:
• Did the Tomcat window pop up and stay open? If not, the error messages are
  lost and it is hard to know what you did wrong. So, open a DOS window, go to
  install_dir/bin and type "catalina run" to start Tomcat without popping up a
  new window. Now, the error messages should help you figure out the problem
  (e.g., JAVA_HOME not set properly or IIS already reserving port 80).
• Does the server appear to be running but you cannot access the home page?
  Maybe your browser is using a proxy and you have not set it to bypass proxies for
  local addresses. To fix this:
  o On Internet Explorer, go to Tools, Internet Options, Connections, and
    LAN Settings. If the "Use a proxy server" checkbox is selected, make sure
    the "Bypass proxy server for local addresses" box is also selected.
  o On Netscape 6/7, go to the Edit menu, then select Preferences, Advanced,
    and Proxies. Then enter "localhost" in the textfield labeled "No Proxy
    for:".
  o On Mozilla Firefox go to Tools, Internet Options, and Connections. Make
    sure "localhost" is in the textfield labeled "No Proxy for:". Note that this
    entry is the default with Firefox, so you probably do not need to change it.
To halt the server, double click on install_dir/bin/shutdown.bat. I recommend that you
make shortcuts to (not copies of) the startup and shutdown scripts and place those
shortcuts on the desktop or in your main development directory. If you put them on the
desktop, you can assign keyboard shortcuts, which is convenient.
2.2.3.7.2 Try Some Simple HTML and JSP Pages
After you have verified that the server is running, you should make sure that you can
install and access simple HTML and JSP pages. This test, if successful, shows two
important things. First, successfully accessing an HTML page shows that you understand
which directories should hold HTML and JSP files, and what URLs correspond to them.
Second, successfully accessing a new JSP page shows that the Java compiler (not just the
Java virtual machine) is configured properly.
Eventually, you will almost certainly want to create and use your own Web applications,
but for initial testing many people prefer to use the default Web application. With Tomcat
and the default Web application, you put HTML and JSP pages in
install_dir/webapps/ROOT or install_dir/webapps/ROOT/somePath and access them with
http://localhost/filename or http://localhost/somePath/filename.
For your first tests, I suggest you simply take this Hello.jsp and another simple HTML
file:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<!--
Simple JSP file to test server setup and configuration.
-->
<HTML>
<HEAD><TITLE>JSP Test</TITLE></HEAD>
<BODY BGCOLOR="#FDF5E6">
<H1>JSP Test</H1>
Time: <%= new java.util.Date() %>
</BODY></HTML>
and drop them into the appropriate locations. If you put the files in the top-level
directory of the default Web application (i.e., in install_dir/webapps/ROOT), access the
JSP page with the URL http://localhost/Hello.jsp. If you put them in a subdirectory of
install_dir/webapps/ROOT, use the URL http://localhost/directoryName/Hello.jsp.
If you successfully started the server as described above, but the JSP file does not work
(e.g., you get File Not Found--404--errors), you likely are using the wrong directory for
the files. If the HTML file works but the JSP file fails, you probably have incorrectly
specified the base JDK directory (i.e., with the JAVA_HOME variable).
2.2.3.7.3 Setup Your Development Environment
The server startup script startup.bat automatically sets the server's CLASSPATH to include
the standard servlet and JSP classes and the WEB-INF/classes directory (containing
compiled servlets) of each Web application. But you need similar settings, or you will be
unable to compile servlets in the first place. Configuring your system for servlet
development involves the following four steps:
1. Creating a development directory
2. Making shortcuts to the Tomcat startup and shutdown scripts
3. Setting your CLASSPATH
4. Bookmarking the servlet & JSP javadocs
Details on each step are given below.
2.2.3.7.4 Create a Development Directory
The first thing you should do is create a directory in which to place the servlets and JSP
pages that you develop. This directory can be in your home directory (e.g., C:\Documents
and Settings\Your Name\My Documents\Servlets+JSP on Windows 2000) or in a
convenient general location (e.g., C:\Servlets+JSP). It should not, however, be in the
Tomcat deployment directory (e.g., anywhere within install_dir/webapps).
Eventually, you will organize this development directory into different Web applications.
For initial testing of your environment, however, you can just put servlets either directly
in the development directory (for packageless servlets) or in a subdirectory that matches
the servlet package name. Many developers simply put all their code in the server's
deployment directory (within install_dir/webapps). I strongly discourage this practice and
instead recommend one of the approaches described in the deployment section. Although
developing in the deployment directory seems simpler at the beginning since it requires
no copying of files, it significantly complicates matters in the long run. Mixing locations
makes it hard to separate an operational version from a version you are testing, makes it
difficult to test on multiple servers, and makes organization much more complicated.
Besides, your desktop is almost certainly not the final deployment server, so you'll
eventually have to develop a good system for deploying anyhow.
2.2.3.7.5 Make Shortcuts to Start and Stop the Server
Since I find myself frequently restarting the server, I find it convenient to use the
Tomcat Configuration Monitor in the system tray and to stop and start the server from
the relevant service links. You can also use a batch file containing the following
commands:
net stop tomcat5
net start tomcat5
Put these commands in a batch file named restart.bat and place it in your Quick
Launch toolbar.
2.2.3.7.6 Set Your CLASSPATH
Since servlets and JSP are not part of the Java 2 platform, standard edition, you have to
identify the servlet classes to the compiler. The server already knows about the servlet
classes, but the compiler (i.e., javac) you use for development probably doesn't. So, if
you don't set your CLASSPATH, attempts to compile servlets, tag libraries, filters, Web app
listeners, or other classes that use the servlet and JSP APIs will fail with error messages
about unknown classes. Here are the standard Tomcat locations:
• install_dir/common/lib/servlet-api.jar
• install_dir/common/lib/jsp-api.jar
You need to include both files in your CLASSPATH.
Now, in addition to the servlet JAR file, you also need to put your development directory
in the CLASSPATH. Although this is not necessary for simple packageless servlets, once
you gain experience you will almost certainly use packages. Compiling a file that is in a
package and that uses another class in a user-defined package requires the CLASSPATH to
include the directory that is at the top of the package hierarchy. In this case, that's the
development directory I just discussed. Forgetting this setting is perhaps the most
common mistake made by beginning servlet programmers!
Finally, you should include "." (the current directory) in the CLASSPATH. Otherwise, you
will only be able to compile packageless classes that are in the top-level development
directory.
Here is a representative method of setting the CLASSPATH. It assumes that your
development directory is C:\Servlets+JSP. Replace install_dir with the actual Tomcat
installation path (e.g., C:\jakarta-tomcat-5.5.4). Also, be sure to use the appropriate case
for the filenames, and enclose your pathnames in double quotes if they contain spaces.
Note that this example represents only one approach for setting the CLASSPATH.
Many Java integrated development environments have global or project-specific
settings that accomplish the same result, but these settings are IDE-specific and
won't be discussed here. Another alternative is to make a .bat file or ant build script
whereby -classpath ... is automatically appended onto calls to javac.

Windows NT/2000/XP. On WinXP, go to the Start menu and select Control
Panel, then System, then the Advanced tab, then the Environment Variables
button. On Win2K/WinNT, go to the Start menu and select Settings, then Control
Panel, then System, then Environment. Either way, enter a CLASSPATH value that lists
the current directory, your development directory, and the two JAR files:
.;C:\Servlets+JSP;install_dir\common\lib\servlet-api.jar;install_dir\common\lib\jsp-api.jar
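One way to confirm that the CLASSPATH setting took effect is to ask the JVM whether it can locate the servlet and JSP classes. The following is a minimal sketch, not part of the original setup steps; the class name ClasspathCheck is hypothetical, and the two class names it probes are the standard entry points of servlet-api.jar and jsp-api.jar.

```java
final class ClasspathCheck {
    /** Returns true if the named class can be located on the current classpath. */
    static boolean isVisible(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Show the classpath the JVM is actually using
        System.out.println("java.class.path = " + System.getProperty("java.class.path"));
        System.out.println("Servlet API visible: " + isVisible("javax.servlet.http.HttpServlet"));
        System.out.println("JSP API visible: " + isVisible("javax.servlet.jsp.JspPage"));
    }
}
```

Compile and run it from a new command window (environment variable changes are not seen by windows that were already open). If either line reports false, the corresponding JAR file is probably missing from the CLASSPATH value.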
2.2.3.8 Installing JDBC Drivers for Windows XP & Windows 2000 [4]
JDBC is a standard Application Programming Interface (API) that lets Java programs
communicate with databases regardless of the driver and the database product. We have
to install the necessary driver so that our code can communicate with SQL Server.
Besides the JDK, which is always required for Java programs, we need the Microsoft
implementation of the JDBC specification: the Microsoft SQL Server 2000 JDBC Driver.
Download Site: http://www.microsoft.com/downloads/details.aspx?FamilyID=9f1874b6-f8e1-4bd6-947c-0fc5bf05bf71&displaylang=en
OR
search Google for the keywords “jdbc sql server windows” and follow the first link.
The Microsoft® SQL Server 2000 Driver for JDBC is a Type 4 JDBC driver that
provides highly scalable and reliable connectivity for the enterprise Java environment.
This driver provides JDBC access to SQL Server 2000 through any Java-enabled applet,
application, or application server.
Drivers for Windows and Unix are available there. If we download the Windows and
Solaris drivers we get the following two files:
- setup.exe (Windows)
- mssqlserver.tar (Solaris)
When we tried to run setup.exe on Windows, we realised that we did not know exactly
what the installation program does (registry changes?), and in addition we got about
30 DLLs. That's not what we want!
JDBC does not need any installation or environment settings, and we don't want DLLs;
a pure Java driver is what we are looking for.
The Solaris mssqlserver.tar file looks much better. Why not take the required
jar files out of this tar? If the driver is pure Java it will run on Windows too ... and it does.
Installation on all above platforms
- Create any directory on your system
- Untar the file mssqlserver.tar and you get:
    install.ksh
    msjdbc.tar
    read.me
- Untar the file msjdbc.tar and you get:
    lib/msbase.jar
    lib/mssqlserver.jar
    lib/msutil.jar
A small Test Program: Test.java
Here's a sample program that shows how to establish a connection to Microsoft SQL
Server. Don't forget to import the java.sql package to get access to DriverManager and
many other related classes and methods. Take a careful look at the connection string,
where <Host> is the IP address or the name of the SQL Server machine defined in the
domain you are working on. 1433 is the default SQL Server connection port, which can
of course be different in your development environment; ask your DB Administrator for
the relevant information. <UID> and <PWD> are your user ID and password, which you
should already have at hand; if not, contact your DB Administrator.
import java.sql.*;

/**
 * Microsoft SQL Server JDBC test program
 */
public class Test {
    public Test() throws Exception {
        // Register the driver, then get a connection
        DriverManager.registerDriver(
            new com.microsoft.jdbc.sqlserver.SQLServerDriver());
        Connection connection = DriverManager.getConnection(
            "jdbc:microsoft:sqlserver://<Host>:1433", "<UID>", "<PWD>");
        if (connection != null) {
            System.out.println();
            System.out.println("Successfully connected");
            System.out.println();
            // Meta data
            DatabaseMetaData meta = connection.getMetaData();
            System.out.println("\nDriver Information");
            System.out.println("Driver Name: "
                + meta.getDriverName());
            System.out.println("Driver Version: "
                + meta.getDriverVersion());
            System.out.println("\nDatabase Information ");
            System.out.println("Database Name: "
                + meta.getDatabaseProductName());
            System.out.println("Database Version: "
                + meta.getDatabaseProductVersion());
            // Release the connection when done
            connection.close();
        }
    } // Test

    public static void main (String args[]) throws Exception {
        Test test = new Test();
    }
}
Compile it
Compile the Java Source: Test.java (all in one line):
$ javac -classpath ".;./lib/mssqlserver.jar;
./lib/msbase.jar;./lib/msutil.jar" Test.java
Be aware that you need access to a javac program on your computer or media. If javac is
not on your PATH, specify its full path in front of javac. The above command is good for
Java 2. If you are using, for instance, Java 1.1.8, add your JDK's classes.zip to the
classpath. On Unix systems replace the semicolons " ; " with colons " : ". The forward
slashes " / " are fine on both platforms; you do not have to use backslashes " \ " on Windows.
Run it
Similar to the compilation you may run it like this (again all in one line):
$ java -classpath ".;./lib/mssqlserver.jar;
./lib/msbase.jar;./lib/msutil.jar" Test
The output looks something like this:
Successfully connected

Driver Information
Driver Name: SQLServer
Driver Version: 2.2.0022

Database Information
Database Name: Microsoft SQL Server
Database Version: Microsoft SQL Server 2000 8.00.194 (Intel X86)
Aug 6 2000 00:57:48
Copyright (c) 1988-2000 Microsoft Corporation
Enterprise Edition on Windows NT 5.0
(Build 2195: Service Pack 2)
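When the connection attempt fails instead, the test program above ends with a bare stack trace. A small helper that prints the message, SQLState, and vendor error code of every exception in the chain usually makes the cause easier to report to your DB Administrator. This is a sketch only; the class name SqlErrors is hypothetical, and it uses nothing beyond the standard java.sql.SQLException accessors.

```java
import java.sql.SQLException;

final class SqlErrors {
    /** Formats every exception in the chain as "message [SQLState=..., code=...]", one per line. */
    static String describe(SQLException e) {
        StringBuilder sb = new StringBuilder();
        // A driver may chain several exceptions; walk them with getNextException()
        for (SQLException cur = e; cur != null; cur = cur.getNextException()) {
            if (sb.length() > 0) {
                sb.append('\n');
            }
            sb.append(cur.getMessage())
              .append(" [SQLState=").append(cur.getSQLState())
              .append(", code=").append(cur.getErrorCode())
              .append("]");
        }
        return sb.toString();
    }
}
```

To use it, wrap the DriverManager.getConnection call in a try block and print SqlErrors.describe(e) in a catch (SQLException e) clause. The actual codes you see depend on the driver and the server.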
2.2.4 Java and HTML Warm up [9]
Now that you have installed and fine-tuned your development environment and installed
the JDBC driver for SQL Server, it's time to play with it a bit, since our aim is to build
a web application that utilizes a DBMS.
2.2.5 Retrieving Input from the User
Input to Servlet programs is passed to the program using web forms. Forms include text
fields, radio buttons, check boxes, popup boxes, scroll tables, and the like.
Thus retrieving input is a two-step process: you must create an HTML document that
provides forms to allow users to pass information to the server, and your Servlet program
must have a means for parsing the input data and determining the action to take. This
mechanism is provided for you in Java Servlets.
2.2.6 Forms
Forms are designated within an HTML document by the fill-out form tag:
<FORM>
... Contents of the form ...
</FORM>
Within the form you may have anything except another form. The tags used to create user
interface objects are INPUT, SELECT, and TEXTAREA.
The INPUT tag specifies a simple input interface:
<INPUT TYPE="text" NAME="thisinput" VALUE="default" SIZE=10
MAXLENGTH=20>
<INPUT TYPE="checkbox" NAME="thisbox" VALUE="on" CHECKED>
<INPUT TYPE="radio" NAME="radio1" VALUE="1">
<INPUT TYPE="submit" VALUE="done">
<INPUT TYPE="radio" NAME="radio1" VALUE="2" CHECKED>
<INPUT TYPE="hidden" NAME="notvisible" VALUE="5">
Which would produce the following form:
Figure 1
The different attributes are mostly self-explanatory. The TYPE is the variety of input
object that you are presenting. Valid types include "text", "password", "checkbox",
"radio", "submit", "reset", and "hidden". Every input but "submit" and "reset" has a
NAME which will be associated with the value returned in the input to the program. This
will not be visible to the user (unless they read the HTML source). The other fields will
be explained with the types:
"text" - refers to a simple text entry field. The VALUE refers to the default text within the
text field, the SIZE represents the visual length of the field, and the MAXLENGTH
indicates the maximum number of characters the textfield will allow. There are defaults
to all of these (nothing, 20, unlimited).
"password" - the same as a normal text entry field, but characters entered are obscured.
"checkbox" - refers to a toggle button that is independently either on or off. The VALUE
refers to the string sent to the server when the button is checked (unchecked boxes are
disregarded). The default value is "on".
"radio" - refers to a toggle button that may be grouped with other toggle buttons such that
only one in the group can be on. It's essentially the same as the checkbox, but any radio
button with the same NAME attribute will be grouped with this one.
"submit" and "reset" - these are the pushbuttons on the bottom of most forms you'll see
that submit the form or clear it. These are not required to have a NAME, and the VALUE
refers to the label on the button. The default names are "Submit Query" and "Reset"
respectively.
"hidden" - this input is invisible as far as the user interface is concerned (though don't be
fooled into thinking this is some kind of security feature -- it's easy to find "hidden" fields
by perusing a document source or examining the URL for a GET method). It simply
creates an attribute/value binding without need for user action that gets passed
transparently along when the form is submitted.
The second type of interface is the SELECT interface, which includes popup menus and
scrolling tables. Here are examples of both:
<SELECT NAME="menu">
<OPTION>option 1
<OPTION>option 2
<OPTION>option 3
<OPTION SELECTED>option 4
<OPTION>option 5
<OPTION>option 6
<OPTION>option 7
</SELECT>
Figure 2
The SIZE attribute determines whether it is a menu or a scrolled list. If it is 1 or it is
absent, the default is a popup menu. If it is greater than 1, then you will see a scrolled list
with SIZE elements. The MULTIPLE option, which forces the select to be a scrolled list,
signifies that more than one value may be selected (by default only one value can be
selected in a scrolled list).
OPTION is more or less self-explanatory -- it gives the names and values of each field in
the menu or scrolled table, and you can specify which are SELECTED by default.
2.2.7 Java Server-Side Input Handling [10]
The parsing of the input is done for you by Java, so you are separated from the actual
format of the input data completely. Your program will be an object subclassed off of
HttpServlet, the generalized Java Servlet class for handling web services.
Servlet programs must override the doGet() or doPost() methods, which are
executed in response to a client request. There are two arguments to these methods,
HttpServletRequest request and HttpServletResponse response. Let's take a look at a very
simple servlet program, the traditional HelloWorld (this time with a doGet method):
import java.io.*;
import java.text.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class Hello extends HttpServlet {
    public void doGet(HttpServletRequest request,
                      HttpServletResponse response)
            throws IOException, ServletException {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html>");
        out.println("<head>");
        String title = "Hello World";
        out.println("<title>" + title + "</title>");
        out.println("</head>");
        out.println("<body bgcolor=white>");
        out.println("<h1>" + title + "</h1>");
        // Echo the "param" binding if the request carried one
        String param = request.getParameter("param");
        if (param != null)
            out.println("Thanks for the lovely param='" + param + "' binding.");
        out.println("</body>");
        out.println("</html>");
    }
}
We'll discuss points in this code again in the section on Java Output, but for now, we will
focus on the input side. The argument HttpServletRequest request represents the client
request, and the values of the parameters passed from the HTML FORM can be retrieved
by calling the HttpServletRequest getParameter method. This method takes as its
argument the name of the parameter (the name of the HTML INPUT object), and returns
as a Java String the value assigned to the parameter. In cases where the parameter may
have multiple bindings, the method getParameterValues can be used to retrieve the values
in an array of Java Strings -- note that getParameter will return the first value of this
array. It is through these mechanisms that you can retrieve any of the values entered or
implicit in the form.
As might be inferred from the example above, getParameter returns null if the request
contains no parameter with the name you ask for. Recall that unchecked buttons' bindings
are not passed in a POST message -- you can check for null to determine when buttons
are off.
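The null-handling rules above can be sketched without a servlet container. In the sketch below a plain Map from parameter names to value arrays stands in for the request (in a servlet, request.getParameterMap() supplies such a map); the class FormParams and its method names are hypothetical helpers, not part of the Servlet API.

```java
import java.util.Map;

final class FormParams {
    /** Same contract as getParameter: the first value, or null when the name is absent. */
    static String first(Map<String, String[]> params, String name) {
        String[] values = params.get(name);
        return (values == null || values.length == 0) ? null : values[0];
    }

    /** A checkbox or radio button is "on" exactly when the browser sent a binding for it. */
    static boolean isChecked(Map<String, String[]> params, String name) {
        return first(params, name) != null;
    }

    /** Parses an integer field, falling back to a default for missing or malformed input. */
    static int intOrDefault(Map<String, String[]> params, String name, int fallback) {
        String raw = first(params, name);
        if (raw == null) {
            return fallback;
        }
        try {
            return Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return fallback;
        }
    }
}
```

With the form from section 2.2.6, isChecked(params, "thisbox") is false when the user cleared the checkbox, and intOrDefault(params, "radio1", 0) turns the selected radio value into a number without risking a NullPointerException.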
2.2.8 Returning Output to the User
In your project, you are going to be concerned with returning HTML documents to the
user. The documents will be dynamically created based on the output of the query. You
can format it however you like, using ordinary HTML formatting routines. Before we get
into producing output with Java, let's look at a very simple Hello World servlet:
/********************************************************************
*
* Hello.java
*
* A simple servlet that returns a single page.
* It looks for a binding called "param" and if present incorporates
* it into its response.
*
********************************************************************/
import java.io.*;
import java.text.*;
import java.util.*;
import javax.servlet.*;
import javax.servlet.http.*;

public class Hello extends HttpServlet {
    public void doGet(HttpServletRequest request,
                      HttpServletResponse response)
            throws IOException, ServletException
    {
        response.setContentType("text/html");
        PrintWriter out = response.getWriter();
        out.println("<html>");
        out.println("<head>");
        String title = "Hello World";
        out.println("<title>" + title + "</title>");
        out.println("</head>");
        out.println("<body bgcolor=white>");
        out.println("<h1>" + title + "</h1>");
        String param = request.getParameter("param");
        if (param != null)
        {
            out.println("Thanks for the lovely param='" + param +
                "' binding.");
        }
        out.println("</body>");
        out.println("</html>");
    }
}
2.2.9 Java Code Output
Now look at the output side of our Java code example. Output is all handled by the
HttpServletResponse object, which allows you to set the content type through the
setContentType method. Instead of printing the HTTP header yourself, you tell the
HttpServletResponse object explicitly that you want the content type to be "text/html".
All HTML is returned to the user through a PrintWriter object, which is retrieved from the
response object using the getWriter method. HTML code is then returned line by line
using the println method.
We assume that you all have a basic background in Java, so we won't provide a detailed
treatment of exceptions here, but do note that IOException and ServletException
must both be either handled or declared in a throws clause.
2.2.10 A Complete JDBC Example [5]
Running through a simple, but complete, example will help you grasp the overall
concepts of JDBC. The fundamental issues encountered when writing any database
application are:

- Creating a database. You can either create the database outside of Java, via tools
supplied by the database vendor, or via SQL statements fed to the database from a
Java program.
- Connecting to a data source. From Java you can use either the JDBC to ODBC bridge,
or JDBC and a vendor-specific driver, to connect to the data source. In this project
we are going to use the MS SQL Driver for JDBC, which is a vendor-specific driver.
- Inserting information into a database. Again, you can either enter data outside
of Java, using database-specific tools, or with SQL statements sent by a Java
program.
- Selectively retrieving information. You use SQL commands from Java to get
results and then use Java to display or manipulate that data.
2.2.10.1 Creating a Database
For this example, consider the scenario of tracking coffee usage at the Boğaziçi
University Computer Engineering Department. A weekly report must be generated for
University management that includes total coffee sales and the maximum coffee
consumed by a programmer in one day. Here is the data:

Coffee Consumption at CMPE Dept, Boğaziçi University
"Caffeinating the World, one programmer at a time"

Programmer   Day   # Cups
Gilbert      Mon   1
Wally        Mon   2
Edgar        Tue   8
Wally        Tue   2
Eugene       Tue   3
Josephine    Wed   2
Eugene       Thu   3
Gilbert      Thu   1
Clarence     Fri   9
Edgar        Fri   3
Josephine    Fri   4

To create this database, you can feed SQL statements to MS SQL via the JDBC-MS SQL
bridge.
To enter the data into the CafeJolt database, create a Java application that follows these
steps:
1. Load the JDBC Driver for MS SQL. You must load a driver that tells the JDBC
classes how to talk to a data source.
Class.forName("com.microsoft.jdbc.sqlserver.SQLServerDriver");
2. Connect to a data source. A URL is used to connect to a particular JDBC data
source. Using the DriverManager class, you request a connection to a URL and
the DriverManager selects the appropriate driver; here, only the MS SQL driver
is loaded.
Connection con = DriverManager.getConnection(
URL,
username,
password);
3. Send SQL statements to create the table. Ask the connection object for a
Statement object:
Statement stmt = con.createStatement();
Then, execute the following SQL statement to create a table called JoltData.
create table JoltData (
programmer varchar (32),
day char (3),
cups integer);
The Java code to do this is:
stmt.execute(
"create table JoltData ("+
"programmer varchar (32),"+
"day char (3),"+
"cups integer);"
);
After you have created the table, you can then insert the appropriate values such as:
insert into JoltData values ('Gilbert', 'Mon', 1);
insert into JoltData values ('Wally', 'Mon', 2);
insert into JoltData values ('Edgar', 'Tue', 8);
...
Review what you have done so far. After loading the driver, you connected to the
data source via the JDBC-MS SQL Driver and sent a series of SQL
statements to create a table called JoltData filled with rows of data.
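Typing eleven insert statements by hand is error-prone, so the rows can also be sent through a single parameterized statement. The following is a sketch under stated assumptions: the class name JoltLoader is hypothetical, the caller supplies an open Connection obtained as in step 2, and the JoltData table already exists with the three columns programmer, day, and cups.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

final class JoltLoader {
    static final String INSERT_SQL =
        "insert into JoltData (programmer, day, cups) values (?, ?, ?)";

    // The eleven rows of the coffee consumption table.
    static final Object[][] ROWS = {
        {"Gilbert", "Mon", 1}, {"Wally", "Mon", 2}, {"Edgar", "Tue", 8},
        {"Wally", "Tue", 2}, {"Eugene", "Tue", 3}, {"Josephine", "Wed", 2},
        {"Eugene", "Thu", 3}, {"Gilbert", "Thu", 1}, {"Clarence", "Fri", 9},
        {"Edgar", "Fri", 3}, {"Josephine", "Fri", 4}
    };

    /** Inserts every row through one PreparedStatement and returns the number of rows sent. */
    static int load(Connection con) throws SQLException {
        PreparedStatement ps = con.prepareStatement(INSERT_SQL);
        try {
            for (Object[] row : ROWS) {
                ps.setString(1, (String) row[0]);
                ps.setString(2, (String) row[1]);
                ps.setInt(3, ((Integer) row[2]).intValue());
                ps.executeUpdate();
            }
            return ROWS.length;
        } finally {
            ps.close(); // release the statement even if an insert fails
        }
    }
}
```

Binding values with setString and setInt also avoids quoting problems when a name contains an apostrophe, which matters once the data comes from web forms rather than from a fixed table.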
2.2.10.2 Getting Information from a Database
To retrieve information from a database, use SQL select statements via the Java
Statement.executeQuery method, which returns results as rows of data in a ResultSet
object. The results are examined row-by-row using the ResultSet.next and
ResultSet.getXXX methods.
Consider how you would obtain the maximum number of cups of coffee consumed by a
programmer in one day. In terms of SQL, one way to get the maximum value is to sort
the table by the cups column in descending order. The programmer column is selected, so
the name attached to the most coffee consumption can also be printed. Use the SQL
statement:
SELECT programmer, cups FROM JoltData ORDER BY cups DESC;
From Java, execute the statement with:
ResultSet result = stmt.executeQuery(
"SELECT programmer, cups " +
"FROM JoltData ORDER BY cups DESC;");
The cups column of the first row of the result set will contain the largest number of cups:
Clarence    9
Edgar       8
Josephine   4
Eugene      3
Eugene      3
Edgar       3
Wally       2
Wally       2
Josephine   2
Gilbert     1
Gilbert     1
Examine the ResultSet by:
1. "Moving" to the first row of data. Perform:
result.next();
2. Extracting data from the columns of that row. Perform:
String name = result.getString("programmer");
int cups = result.getInt("cups");
The information can be printed easily via:
System.out.println("Programmer "+name+
" consumed the most coffee: "+cups+" cups.");
resulting in the following output:
Programmer Clarence consumed the most coffee: 9 cups.
Computing the total sales for the week is a matter of adding up the cups column. Use an
SQL select statement to retrieve the cups column:
result = stmt.executeQuery(
"SELECT cups FROM JoltData;");
Peruse the results by calling method next until it returns false, indicating that there are no
more rows of data:
// for each row of data
cups = 0;
while(result.next()) {
cups += result.getInt("cups");
}
Print the total number of cups sold:
System.out.println("Total sales of "
+cups+" cups of coffee.");
The output should be:
Total sales of 38 cups of coffee.
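Sorting and summing in Java is one way to produce the two report figures; another is to let the database compute them with the SQL aggregate functions SUM and MAX, so that only a single value travels back per query. The sketch below assumes an open Connection and the JoltData table from the previous section; the class JoltReport and its method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

final class JoltReport {
    static final String TOTAL_SQL = "SELECT SUM(cups) FROM JoltData";
    static final String MAX_SQL = "SELECT MAX(cups) FROM JoltData";

    /** Runs a query that yields a single integer and returns it (0 if NULL or no row). */
    static int queryInt(Connection con, String sql) throws SQLException {
        Statement stmt = con.createStatement();
        try {
            ResultSet rs = stmt.executeQuery(sql);
            // getInt reports SQL NULL (e.g. SUM over an empty table) as 0
            return rs.next() ? rs.getInt(1) : 0;
        } finally {
            stmt.close(); // closing the Statement also closes its ResultSet
        }
    }

    /** Builds the weekly report line from the two aggregate queries. */
    static String weeklyReport(Connection con) throws SQLException {
        return "Total sales of " + queryInt(con, TOTAL_SQL) + " cups of coffee; "
             + "most cups by one programmer in one day: " + queryInt(con, MAX_SQL) + ".";
    }
}
```

For the data above the two queries should agree with the figures already computed in Java: a total of 38 cups and a daily maximum of 9.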
2.2.10.3 Obtaining Result MetaData Type Information
You will occasionally need to obtain type information about the result of a query. For
example, the SQL statement:
SELECT * from JoltData
will return a ResultSet with the same number of columns (and rows) as the table,
JoltData. If you do not know how many columns there are beforehand, you must use
metadata via the ResultSetMetaData class to find out. Continuing the Cafe Jolt scenario,
determine the number and type of columns returned by the same SQL query
SELECT programmer, cups FROM JoltData ORDER BY cups DESC;
First, perform the usual executeQuery method call:
ResultSet result = stmt.executeQuery(
"SELECT programmer, cups " +
"FROM JoltData ORDER BY cups DESC;");
Then obtain the column and type metadata from the ResultSet:
ResultSetMetaData meta = result.getMetaData();
You can query the ResultSetMetaData easily to determine how many columns there are:
int columns = meta.getColumnCount();
and then walk the list of columns printing out their name and type:
int numbers = 0;
for (int i=1;i<=columns;i++) {
System.out.println (meta.getColumnLabel(i) + "\t"
+ meta.getColumnTypeName(i));
if (meta.isSigned(i)) { // is it a signed number?
numbers++;
}
}
System.out.println ("Columns: " +
columns + " Numeric: " + numbers);
2.2.11 Handling Special Characters in HTML
The special characters &, <, and >, need to be escaped as &amp;, &lt;, and &gt;,
respectively in HTML text (see NCSA Beginner's Guide to HTML). Moreover, special
characters appearing in URLs need to be escaped differently than when they appear in
HTML text. For example, if a parameter value contains special characters and you want
to embed it in a URL, you need to escape them: convert space to
+ or %20, convert & to %26, convert = to %3D, convert % to %25, etc. (In general, any
special character can be escaped by a percent sign followed by the character's
hexadecimal ASCII value.) Important: Do NOT escape the & that actually separates
parameters! For example, if you want two parameters p1 and p2 to have the values 3 and
M&M, you should write something like:
http://cgi-courses.stanford.edu/~username/cgi-bin/cgiprog?p1=3&p2=M%26M
Be careful not to confuse the escape strings for HTML text with those for URL's.
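Both kinds of escaping can be done in Java before text is written to the page. The sketch below is one possible helper, with a hypothetical class name Escaper: html() handles the three characters listed above plus the double quote (an addition, needed when text lands inside a quoted attribute value), and urlParam() delegates to the standard java.net.URLEncoder for parameter values.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class Escaper {
    /** Escapes text for use inside HTML content or a double-quoted attribute value. */
    static String html(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Escapes a single parameter value (not a whole URL) for use in a query string. */
    static String urlParam(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }
}
```

Only the values are encoded, never the & that separates parameters: "?p1=" + Escaper.urlParam("3") + "&p2=" + Escaper.urlParam("M&M") gives ?p1=3&p2=M%26M. In the Hello servlet, writing Escaper.html(param) instead of param keeps user input from being interpreted as markup.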
3 MSSQL and Web Warm-up Part II
After the first warm-up part and getting acquainted with the web interface and servlet
coding, it's time to get in touch with the real data, which is in XML form. We'll analyze
and work on that real data from eBay, design our AuctionBase database accordingly,
and bulk load the data into the database.
3.1 Preliminary Information about XML and DTD’s
3.1.1 What is XML ?
XML is a markup language for documents containing structured information.
Structured information contains both content (words, pictures, etc.) and some indication
of what role that content plays (for example, content in a section heading has a different
meaning from content in a footnote, which means something different than content in a
figure caption or content in a database table, etc.). Almost all documents have some
structure.
A markup language is a mechanism to identify structures in a document. The XML
specification defines a standard way to add markup to documents.
3.1.2 What Do XML Documents Look Like? [12]
If you are conversant with HTML, XML documents will look familiar. A simple XML
document is presented in Example 1.
Example: A Simple XML Document
<?xml version="1.0"?>
<oldjoke>
<burns>Say <quote>goodnight</quote>,
Gracie.</burns>
<allen><quote>Goodnight,
Gracie.</quote></allen>
<applause/>
</oldjoke>
A few things may stand out to you:

The document begins with a processing instruction: <?xml ...?>. This is the XML
declaration. While it is not required, its presence explicitly identifies the
document as an XML document and indicates the version of XML to which it was
authored.

There's no document type declaration. Unlike SGML, XML does not require a
document type declaration. However, a document type declaration can be
supplied, and some documents will require one in order to be understood
unambiguously.

Empty elements (<applause/> in this example) have a modified syntax. While
most elements in a document are wrappers around some content, empty elements
are simply markers where something occurs (a horizontal rule for HTML's <hr>
tag, for example). The trailing /> in the modified syntax indicates to a program
processing the XML document that the element is empty and no matching end-tag
should be sought. Since XML documents do not require a document type
declaration, without this clue it could be impossible for an XML parser to
determine which tags were intentionally empty and which had been left empty by
mistake.
XML has softened the distinction between elements which are declared as
EMPTY and elements which merely have no content. In XML, it is legal to use
the empty-element tag syntax in either case. It's also legal to use a start-tag/end-tag
pair for empty elements: <applause></applause>. If interoperability is of any
concern, it's best to reserve empty-element tag syntax for elements which are
declared as EMPTY and to only use the empty-element tag form for those
elements.
XML documents are composed of markup and content. There are six kinds of markup
that can occur in an XML document: elements, entity references, comments, processing
instructions, marked sections, and document type declarations. The following sections
introduce each of these markup concepts.
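To see how a program reads such a document, the simple example above can be parsed with the DOM parser that ships with the JDK (javax.xml.parsers). This is a minimal sketch; the class name OldJoke is hypothetical, and the document is held in a string instead of a file so the example is self-contained.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class OldJoke {
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
      + "<oldjoke>\n"
      + "<burns>Say <quote>goodnight</quote>, Gracie.</burns>\n"
      + "<allen><quote>Goodnight, Gracie.</quote></allen>\n"
      + "<applause/>\n"
      + "</oldjoke>";

    /** Parses a string into a DOM tree; throws IllegalArgumentException if it is not well-formed. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML: " + e.getMessage(), e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        Element root = doc.getDocumentElement();
        System.out.println("root element: " + root.getTagName());
        System.out.println("quote elements: " + doc.getElementsByTagName("quote").getLength());
        // An empty element is an ordinary element that simply has no children
        Element applause = (Element) doc.getElementsByTagName("applause").item(0);
        System.out.println("applause is empty: " + !applause.hasChildNodes());
    }
}
```

The parser needs no document type declaration here: it only checks that the document is well-formed, that is, that tags nest properly under a single root.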
3.1.3 DTD (Document Type Definitions) for XML [11]
An XML document primarily consists of a strictly nested hierarchy of elements with a
single root. Elements can contain character data, child elements, or a mixture of both. In
addition, they can have attributes. Child character data and child elements are strictly
ordered; attributes are not. For example:
<?xml version="1.0" ?>
<Book Author="Anonymous">
<Title>Sample Book</Title>
<Chapter id="ch1">
This is chapter 1. It is not very long or interesting.
</Chapter>
<Chapter id="ch2">
This is chapter 2. Although it is longer than chapter 1,
it is not any more interesting.
</Chapter>
</Book>
The names of the elements and attributes and their order in the hierarchy (among other
things) form the XML markup language used by the document. This language can be
defined by the document author or it can be inferred from the document's structure. In the
example shown above, the language contains three elements: Book, Title, and Chapter.
The Book element contains a single Title element and one or more Chapter elements. The
Book element has an Author attribute and the Chapter element has an id attribute.
The main reason to explicitly define the language is so that documents can be checked to
conform to it. For example, if we defined a grammar for the Book language, authors
using this grammar could use a validating parser to ensure that their documents
conformed to the language.
An XML markup language is defined in a Document Type Definition (DTD). The DTD
is either contained in a <!DOCTYPE> tag, contained in an external file and referenced
from a <!DOCTYPE> tag, or both. For example, the document shown above could
contain the following <!DOCTYPE> tag:
<!DOCTYPE Book [
<!ELEMENT Book (Title, Chapter+)>
<!ATTLIST Book Author CDATA #REQUIRED>
<!ELEMENT Title (#PCDATA)>
<!ELEMENT Chapter (#PCDATA)>
<!ATTLIST Chapter id ID #REQUIRED>
]>
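A validating parser can check a document against this declaration. The sketch below uses the JDK's DocumentBuilderFactory with validation switched on and collects the validity errors it reports; the class name BookValidator and the two sample documents are assumptions made for illustration. The chapter ids are written as ch1 and ch2 because a value of type ID must be an XML name, which cannot start with a digit.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class BookValidator {
    // XML declaration plus the internal DTD subset for the Book language.
    static final String PROLOG =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE Book [\n"
      + "<!ELEMENT Book (Title, Chapter+)>\n"
      + "<!ATTLIST Book Author CDATA #REQUIRED>\n"
      + "<!ELEMENT Title (#PCDATA)>\n"
      + "<!ELEMENT Chapter (#PCDATA)>\n"
      + "<!ATTLIST Chapter id ID #REQUIRED>\n"
      + "]>\n";

    static final String VALID = PROLOG
      + "<Book Author=\"Anonymous\"><Title>Sample Book</Title>"
      + "<Chapter id=\"ch1\">Short.</Chapter><Chapter id=\"ch2\">Longer.</Chapter></Book>";

    // Violates (Title, Chapter+): the Title element is missing.
    static final String MISSING_TITLE = PROLOG
      + "<Book Author=\"Anonymous\"><Chapter id=\"ch1\">Short.</Chapter></Book>";

    /** Returns the validity problems found in the document; an empty list means it is valid. */
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<String>();
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) { problems.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Not well-formed or unreadable: report that as a problem too
            problems.add("fatal: " + e.getMessage());
        }
        return problems;
    }
}
```

Validity errors are not fatal by default, which is why the sketch installs its own ErrorHandler: without one, the parser would only print a message and still return a document tree.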
3.1.3.1 Elements
1) An element is defined as a group of one or more subelements/subgroups, character
data, EMPTY, or ANY. For example:
Group:
<!ELEMENT A (B, C)>
Character data:
<!ELEMENT A (#PCDATA)>
EMPTY:
<!ELEMENT A EMPTY>
ANY:
<!ELEMENT A ANY>
2) Elements defined as groups of subelements/subgroups constitute non-terminals in the
language. Elements defined as character data, EMPTY, or ANY constitute terminals. For
example:
<!-- Element A is a non-terminal. -->
<!ELEMENT A (B)>
<!-- Element B is a terminal. -->
<!ELEMENT B (#PCDATA)>
Although it is legal to define a language containing non-terminals that never resolve to
terminals, such as one with purely circular definitions, it is generally impossible and/or
useless to create any valid documents for such languages.
3) Groups can be either a sequence or choice of subelements and/or subgroups. For
example:
Sequence:
<!-- Element A consists of a single element B. -->
<!ELEMENT A (B)>
<!-- Element A consists of element B followed by element C. -->
<!ELEMENT A (B, C)>
<!-- Element A consists of a sequence, including a choice
subgroup. -->
<!ELEMENT A (B, (C | D), E)>
Choice:
<!-- Element A consists of either element B or element C. -->
<!ELEMENT A (B | C)>
<!-- Element A consists of a choice, including a sequence
subgroup. -->
<!ELEMENT A (B | C | (D, E))>
4) Optional (?), one-or-more (+), and zero-or-more (*) operators can be applied to
groups, subgroups, and subelements. For example:
Optional:
<!-- Subelement B is optional. -->
<!ELEMENT A (B?, C)>
One or more:
<!-- Subgroup (C | D) occurs one or more times. -->
<!ELEMENT A (B, (C | D)+, E)>
Zero or more:
<!-- Group (B, C) occurs zero or more times, i.e. A can be
empty. -->
<!ELEMENT A (B, C)*>
5) Elements containing character data can be declared as containing only character data:
<!ELEMENT A (#PCDATA)>
or as containing a mixture of character data and elements in any order:
<!ELEMENT A (#PCDATA | B | C)*>
In the latter case, the declaration must place #PCDATA first in the group, the group must
be a choice, and the group must appear zero or more times. Such groups are generally
referred to as "mixed content" (as opposed to element-only groups or "element content").
Technically, mixed content refers to any element containing character data. However, in
common usage it refers only to the latter case.
Note: "PCDATA" in the declarations is short for "Parsed Character DATA". The term is
inherited from SGML and comes from the fact that the text in the XML document
following the element tag is parsed looking for more markup tags. Although it is possible
to include unparsed character data through the use of CDATA sections, these can occur
only where PCDATA occurs. While this is of interest to parser writers, it does not affect
the syntax of DTDs, nor does it affect the resulting elements -- they still contain character
data.
6) EMPTY means that the element has no child elements or character data. Empty
elements often have attributes -- see below.
7) ANY means that the element can contain zero or more child elements of any declared
type, as well as character data. It is therefore a shorthand for mixed content containing all
declared elements.
3.1.3.2 Attributes
1) Elements can have zero or more attributes. For example:
<!ELEMENT A (#PCDATA)>
<!-- Declare an attribute a for element A -->
<!ATTLIST A a CDATA #IMPLIED>
2) A single ATTLIST statement can declare multiple attributes for the same element.
Multiple ATTLIST statements can declare attributes for the same element. That is, the
following are equivalent:
Single ATTLIST statement declaring multiple attributes for an element:
<!-- Element A has attributes a and b -->
<!ATTLIST A
a CDATA #IMPLIED
b CDATA #IMPLIED>
Multiple ATTLIST statements declaring attributes for the same element:
<!-- Element A has attributes a and b -->
<!ATTLIST A a CDATA #IMPLIED>
<!ATTLIST A b CDATA #IMPLIED>
3) Attributes can be optional, required, or have a fixed value. Optional attributes can have
a default; fixed attributes must have a default. For example:
Optional without a default:
<!-- Element A has an attribute a. #IMPLIED = "optional, no
default" -->
<!ATTLIST A a CDATA #IMPLIED>
Optional with a default:
<!-- If attribute a is not provided, a default of "aaa"
will be used. -->
<!ATTLIST A a CDATA "aaa">
Required:
<!ATTLIST A a CDATA #REQUIRED>
Fixed:
<!-- The value of attribute a is always "aaa" -->
<!ATTLIST A a CDATA #FIXED "aaa">
4) Each attribute has a type:
Character data:
<!ATTLIST A a CDATA #IMPLIED>
A user-defined enumerated type:
<!-- Attribute a uses a simple enumeration. -->
<!ATTLIST A a (yes | no) #IMPLIED>
<!-- Attribute a uses an enumeration of notation types.
See the XML specification for complete details. -->
<!ATTLIST A a NOTATION (ps | pdf) #IMPLIED>
ID, IDREF: These attributes point from one element to another. The value of the
IDREF attribute on the pointing element is the same as the value of the ID
attribute on the pointed-to element.
<!-- Attribute id gives the ID of element A -->
<!ATTLIST A id ID #REQUIRED>
<!-- Attribute ref points to the ID of another element -->
<!ATTLIST A ref IDREF #IMPLIED>
ENTITY, ENTITIES. These attributes point to external data in the form of
unparsed entities. For complete details, see the XML specification.
<!-- Attribute a points to a single unparsed entity -->
<!ATTLIST A a ENTITY #IMPLIED>
<!-- Attribute b points to multiple unparsed entities -->
<!ATTLIST A b ENTITIES #IMPLIED>
NMTOKEN, NMTOKENS. These attributes have single/multiple tokens as
values.
<!ATTLIST A a NMTOKEN #IMPLIED>
<!ATTLIST A b NMTOKENS #IMPLIED>
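To see how the default declarations behave at parse time, the sketch below reads attribute a from a one-element document under three of the declarations shown above. It assumes the JDK's bundled SAX parser, which reads the internal DTD subset even when it is not validating; the class name AttributeDefaultDemo and the helper names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: shows what the parser reports for attribute "a" of element A.
class AttributeDefaultDemo {

    // Builds a document whose internal DTD subset contains the given ATTLIST declaration.
    static String document(String attlist, String body) {
        return "<!DOCTYPE A [<!ELEMENT A (#PCDATA)>" + attlist + "]>" + body;
    }

    // Returns the value of attribute "a" on element A, or null if the parser reports none.
    static String valueOfA(String xml) {
        final String[] result = new String[1];
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes attrs) {
                        if ("A".equals(qName)) {
                            result[0] = attrs.getValue("a");
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        // Optional without a default: nothing is reported when the attribute is absent.
        System.out.println(valueOfA(document("<!ATTLIST A a CDATA #IMPLIED>", "<A/>")));
        // Optional with a default: the parser supplies the default value.
        System.out.println(valueOfA(document("<!ATTLIST A a CDATA \"aaa\">", "<A/>")));
        // An explicit value overrides the default.
        System.out.println(valueOfA(document("<!ATTLIST A a CDATA \"aaa\">", "<A a=\"x\"/>")));
    }
}
```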
3.1.3.3 Comments
1) DTDs can contain comments. Comments are delimited by <!-- and -->. For example:
<!-- This is a comment in an XML file. -->
3.2 Section A: Examining XML Files
In this section we examine the XML and DTD files to make sure that the data we are
working with is fully understood. This data will then be transformed into a relational
schema and loaded into the AuctionBase system. One of the most important aspects of the
data is that it represents a single point in time, starting with 1st January 2005, one
second after midnight. It contains items that were auctioned in the past as well as items
that are open for bidding right now.
3.2.1 Sample eBay Data
This auction data is gathered from real auctions on ebay.com, crawled by Yuan Wang at
University of Wisconsin. This is a somewhat-random, somewhat-edited sample of the
data. The Buy_Price and Bids elements were synthetically generated, but all other data is
real. The data is available at
http://www-2.cs.cmu.edu/~olston/streamseminar/project.html ; follow the links further down
the page to the DTD and the item data, which are packaged in a 10 MB zip file.
The data is organized into files items-*.xml, where each items-*.xml file contains
information about 500 auctions. Each auction corresponds to one Item element, which
gives complete information about the auction. The files conform to the DTD given in the
file data-xml/items.dtd, reproduced here:
<!ELEMENT Items            (Item*)>
<!ELEMENT Item             (Name, Category+, Currently, Buy_Price?,
                            First_Bid, Quantity?, Number_of_Bids,
                            Bids, Location, Country, Started, Ends,
                            Seller, Description)>
<!ATTLIST Item             ItemID CDATA #REQUIRED>
<!ELEMENT Name             (#PCDATA)>
<!ELEMENT Category         (#PCDATA)>
<!ELEMENT Currently        (#PCDATA)>
<!ELEMENT Buy_Price        (#PCDATA)>
<!ELEMENT First_Bid        (#PCDATA)>
<!ELEMENT Quantity         (#PCDATA)>
<!ELEMENT Number_of_Bids   (#PCDATA)>
<!ELEMENT Bids             (Bid*)>
<!ELEMENT Bid              (Bidder, Time, Amount, Quantity?)>
<!ATTLIST Bidder           UserID CDATA #REQUIRED
                           Rating CDATA #REQUIRED>
<!ELEMENT Bidder           (Location?, Country?)>
<!ELEMENT Time             (#PCDATA)>
<!ELEMENT Amount           (#PCDATA)>
<!ELEMENT Location         (#PCDATA)>
<!ELEMENT Country          (#PCDATA)>
<!ELEMENT Started          (#PCDATA)>
<!ELEMENT Ends             (#PCDATA)>
<!ELEMENT Seller           EMPTY>
<!ATTLIST Seller           UserID CDATA #REQUIRED
                           Rating CDATA #REQUIRED>
<!ELEMENT Description      (#PCDATA)>
The meaning of each element and attribute is explained below:
ItemID (attribute):
An identifier unique across all items.
Name:
A short item description used as the auction's title.
Category:
A category to which the item belongs. An item may belong to multiple categories.
Currently:
The current highest bid. This amount is always equal to the amount of the highest bid, or
First_Bid if there are no bids.
Buy_Price: This element was synthetically generated.
First_Bid: The minimum qualifying first-bid amount, as determined by the seller before
the auction starts. It does not mean there is a bid at all.
Quantity:
The number of copies of the item up for sale. Usually this number is 1, although some
auctions are for multiple copies. In such auctions, each bidder may bid on more than 1
copy, and there may be multiple winners. (Note that auction winners are not encoded in
our data. It is up to you to determine winners.) Assumed 1 if missing.
Number_of_Bids:
Number of Bids/Bid elements, each corresponding to a bid.
Bids:
This element was synthetically generated.
Bids/Bid/Bidder:
Attribute UserID uniquely identifies a user. Attribute Rating is the user's rating. Note that
a user may be a bidder in one auction and a seller in another. However, his Rating,
Location, and Country information are the same wherever he appears in our data (which
reflects a snapshot in time). Note this implies that UserID's with missing location or
country information cannot be sellers in another auction.
Bids/Bid/Time:
The time the bid was placed. Note that bids must be placed after the auction starts and
before it ends. A user may bid on an item multiple times, but not at the same time.
Bids/Bid/Amount:
Bid amount. If bid quantity is greater than 1, this is the price per copy.
Bids/Bid/Quantity:
The number of copies bid on. Must be less than or equal to the number of copies up for
auction. Assumed 1 if missing.
Location:
The seller's location information (e.g., city, state). See comment under Bids/Bid/Bidder.
Country:
Seller's country. See comment under Bids/Bid/Bidder.
Started:
Auction start time.
Ends:
Auction end time. If this is in the past with respect to the current
system time, the auction is closed. If in the future, the auction is
still open.
Seller:
Attributes give the seller's UserID and rating.
Description:
The item's full description.
All money values are in the form $x,xxx.xx and are in US dollars. All times are in
24-hour format. See the actual data for the exact time format.
The auctions in the data set range in time from November to December of
2001. Both open and closed auctions are included, and it is up to you to determine which
auctions are still open based on the current system time, taken to be Dec. 20, 2001
00:00:01. Times in the data are consistent with the current system time, so all bid times
and auction start times are earlier.
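The money and time conventions above can be turned into typed values before loading. The sketch below is one possible way to do it, not part of the data set's tooling. The timestamp pattern is an assumption inferred from sample values such as Dec-03-01 18:10:40, the class name AuctionValues is hypothetical, and java.time requires Java 8 or later.

```java
import java.math.BigDecimal;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper for the $x,xxx.xx money format and the data's timestamps.
class AuctionValues {

    // Assumed pattern, inferred from values like "Dec-03-01 18:10:40".
    static final DateTimeFormatter TIME_FORMAT =
        DateTimeFormatter.ofPattern("MMM-dd-yy HH:mm:ss", Locale.US);

    // Current system time stated in the text: Dec. 20, 2001 00:00:01.
    static final LocalDateTime NOW = LocalDateTime.of(2001, 12, 20, 0, 0, 1);

    // Strips the dollar sign and thousands separators, e.g. "$1,234.50" becomes 1234.50.
    static BigDecimal parseMoney(String text) {
        return new BigDecimal(text.replace("$", "").replace(",", ""));
    }

    // Two-digit years are read as 20xx by DateTimeFormatter.
    static LocalDateTime parseTime(String text) {
        return LocalDateTime.parse(text, TIME_FORMAT);
    }

    // An auction is open if its end time lies after the current system time.
    static boolean isOpen(String ends) {
        return parseTime(ends).isAfter(NOW);
    }

    public static void main(String[] args) {
        System.out.println(parseMoney("$30.00"));
        System.out.println(parseTime("Dec-03-01 18:10:40"));
        System.out.println(isOpen("Dec-13-01 18:10:40")); // ended before Dec. 20, so closed
    }
}
```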
The example XML file we will be working on is from yahoo; an instance of a single item is
as follows:
eBay1.xml
<Items>
<Item ItemID="1043374545">
<Name>christopher radko | fritz n_ frosty sledding</Name>
<Category>Collectibles</Category>
<Category>Decorative &amp; Holiday</Category>
<Category>Decorative by Brand</Category>
<Category>Christopher Radko</Category>
<Currently>$30.00</Currently>
<First_Bid>$30.00</First_Bid>
<Number_of_Bids>0</Number_of_Bids>
<Bids />
<Location>its a dry heat</Location>
<Country>USA</Country>
<Started>Dec-03-01 18:10:40</Started>
<Ends>Dec-13-01 18:10:40</Ends>
<Seller UserID="rulabula" Rating="1035" />
<Description>brand new beautiful handmade european blown glass ornament from
christopher radko. this particular ornament features a snowman paired with a
little girl bundled up in here pale blue coat sledding along on a silver and
blue sled filled with packages. the ornament is approximately 5_ tall and 4_
wide. brand new and never displayed, it is in its clear plastic packaging and
comes in the signature black radko gift box. PLEASE READ CAREFULLY!!!! payment
by cashier's check, money order, or personal check. personal checks must clear
before shipping. the hold period will be a minimum of 14 days. I ship with UPS
and the buyer is responsible for shipping charges. the shipping rate is
dependent on both the weight of the package and the distance that package will
travel. the minimum shipping/handling charge is $6 and will increase with
distance and weight. shipment will occur within 2 to 5 days after the deposit
of funds. a $2 surcharge will apply for all USPS shipments if you cannot have
or do not want ups service. If you are in need of rush shipping, please let me
know and I_will furnish quotes on availability. the BUY-IT-NOW price includes
free domestic shipping (international winners and residents of alaska and
hawaii receive a credit of like value applied towards their total) and, as an
added convenience, you can pay with paypal if you utilize the feature. paypal
is not accepted if you win the auction during the course of the regular
bidding-I only accept paypal if the buy it now feature is utilized. thank you
for your understanding and good luck! Free Honesty Counters powered by Andale!
Payment Details See item description and Payment Instructions, or contact
seller for more information. Payment Instructions See item description or
contact seller for more information.</Description>
</Item>
</Items>
The DTD (Document Type Definition) of the sample XML file, eBay.dtd, is identical to the
items.dtd listing reproduced in Section 3.2.1 above.
3.3 Section B: Designing Relational Schema
1. Designing a relational schema, with all the keys
auctions(item_id, name, seller, current_price, buy_price, min_price,
starts, ends, description)
bids(item_id, bidder, time, amount)
itemInCategory(item_id, cat_id)
categories(cat_id, name)
users(user_id, location, country, rating, last_rated)
2. Listing all nontrivial functional dependencies that hold on each relation,
excluding those that effectively specify keys.
In auctions: item_id → name, seller, current_price, buy_price,
min_price, starts, ends, description
In bids: item_id, bidder, time → amount
In categories: cat_id → name
In users: user_id → location, country, rating, last_rated
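A functional dependency can be tested against actual rows: whenever two rows agree on the left-hand columns, they must also agree on the right-hand columns. The sketch below is a minimal illustration of that test. The class name FunctionalDependencyCheck is hypothetical, and the row with location "elsewhere" is made up to show a violation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: checks whether lhs columns determine rhs columns over some rows.
class FunctionalDependencyCheck {

    // True if equal values in the lhs columns always come with equal values in the rhs columns.
    static boolean holds(List<String[]> rows, int[] lhs, int[] rhs) {
        Map<List<String>, List<String>> seen = new HashMap<List<String>, List<String>>();
        for (String[] row : rows) {
            List<String> key = project(row, lhs);
            List<String> value = project(row, rhs);
            List<String> previous = seen.put(key, value);
            if (previous != null && !previous.equals(value)) {
                return false; // same key, different dependent values
            }
        }
        return true;
    }

    // Copies the chosen columns of a row into a list that can be compared and hashed.
    private static List<String> project(String[] row, int[] columns) {
        String[] out = new String[columns.length];
        for (int i = 0; i < columns.length; i++) {
            out[i] = row[columns[i]];
        }
        return Arrays.asList(out);
    }

    public static void main(String[] args) {
        // Columns: user_id, location, country, rating
        List<String[]> users = new ArrayList<String[]>();
        users.add(new String[] {"rulabula", "its a dry heat", "USA", "1035"});
        users.add(new String[] {"rulabula", "its a dry heat", "USA", "1035"});
        System.out.println(holds(users, new int[] {0}, new int[] {1, 2, 3}));

        // A made-up conflicting row breaks user_id -> location.
        users.add(new String[] {"rulabula", "elsewhere", "USA", "1035"});
        System.out.println(holds(users, new int[] {0}, new int[] {1, 2, 3}));
    }
}
```

A check like this only shows that a dependency is not violated by the data at hand; whether it holds in general is a design decision.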
3.4 Section C: Creating Tables in MS SQL
After completing the relational schema, it is time to create our tables in MS SQL Query
Analyzer by issuing the following commands.
CREATE TABLE auctions(
item_id int PRIMARY KEY,
name varchar(255) NOT NULL,
seller varchar(50) NOT NULL,
current_price money NOT NULL,
buy_price money,
min_price money NOT NULL,
starts datetime NOT NULL,
ends datetime NOT NULL,
description varchar(4000)
);
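-- Columns follow bids(item_id, bidder, time, amount) from Section 3.3;
-- the column types mirror the matching columns of auctions and users.
CREATE TABLE bids(
item_id int,
bidder varchar(50),
time datetime,
amount money,
PRIMARY KEY(item_id, bidder, time)
);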
CREATE TABLE itemInCategory(
item_id int,
cat_id int,
PRIMARY KEY(item_id, cat_id)
);
CREATE TABLE categories(
cat_id int PRIMARY KEY,
name varchar(50)
);
CREATE TABLE users(
user_id varchar(50) PRIMARY KEY,
location varchar(120),
country varchar(50),
rating int,
last_rated datetime
);
3.5 Section D: Writing a data transformation program [10]
We will use the SAX XML parser that comes with the Java platform to write a program in
Java that takes the XML data and puts it into MSSQL loader format. The code should either
eliminate duplicate entries itself or rely on MSSQL reporting errors for them during
loading while continuing the process.
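One way to eliminate duplicates in the program itself is to remember which keys have already been written. The following is a minimal sketch under that assumption. The class name DuplicateFilter, the simplified row layout, and the rating 12 are hypothetical and are used only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: keeps the first row seen for each key and ignores repeats.
class DuplicateFilter {

    // LinkedHashMap preserves the order in which keys were first seen.
    private final Map<String, String> rowsByKey = new LinkedHashMap<String, String>();

    // Stores the row only if the key is new; returns true when the row was added.
    boolean add(String key, String row) {
        if (rowsByKey.containsKey(key)) {
            return false;
        }
        rowsByKey.put(key, row);
        return true;
    }

    // Returns the unique rows in first-seen order.
    List<String> rows() {
        return new ArrayList<String>(rowsByKey.values());
    }

    public static void main(String[] args) {
        DuplicateFilter users = new DuplicateFilter();
        users.add("rulabula", "rulabula, USA, 1035");
        users.add("dosouth", "dosouth, USA, 12");      // rating 12 is made up
        users.add("rulabula", "rulabula, USA, 1035");  // same user_id again: ignored
        System.out.println(users.rows().size() + " unique users");
    }
}
```

The same idea applies to categories, where the key would be the category name.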
In this exercise, you'll echo SAX parser events to System.out. Consider it the "Hello
World" version of an XML-processing program. It shows you how to use the SAX parser
to get at the data, and then echoes it to show you what you've got.
3.5.1 Creating the Skeleton
Start by creating a file named Echo.java and enter the skeleton for the application:
public class Echo
{
public static void main(String argv[])
{
}
}
Since we're going to run it standalone, we need a main method. And we need command-line
arguments so we can tell the application which file to echo.
3.5.2 Importing Classes
Next, add the import statements for the classes the application will use:
import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
public class Echo
{
...
The classes in java.io, of course, are needed to do output. The org.xml.sax package
defines all the interfaces we use for the SAX parser. The SAXParserFactory class creates
the instance we use. It throws a ParserConfigurationException if it is unable to produce a
parser that matches the specified configuration of options. (You'll see more about the
configuration options later.) The SAXParser is what the factory returns for parsing, and
the DefaultHandler defines the class that will handle the SAX events that the parser
generates.
3.5.3 Setting up for I/O
The first order of business is to process the command line argument, get the name of the
file to echo, and set up the output stream. Add the text highlighted below to take care of
those tasks and do a bit of additional housekeeping:
public static void main(String argv[])
{
if (argv.length != 1) {
System.err.println("Usage: cmd filename");
System.exit(1);
}
try {
// Set up output stream
out = new OutputStreamWriter(System.out, "UTF8");
}
catch (Throwable t) {
t.printStackTrace();
}
System.exit(0);
}
static private Writer out;
When we create the output stream writer, we are selecting the UTF-8 character encoding.
We could also have chosen US-ASCII, or UTF-16, which the Java platform also
supports.
3.5.4 Implementing the ContentHandler Interface
The most important interface for our current purposes is the ContentHandler interface.
That interface requires a number of methods that the SAX parser invokes in response to
different parsing events. The major event handling methods are: startDocument,
endDocument, startElement, endElement, and characters.
The easiest way to implement that interface is to extend the DefaultHandler class, defined
in the org.xml.sax.helpers package. That class provides do-nothing methods for all of the
ContentHandler events. Enter the code highlighted below to extend that class:
public class Echo extends DefaultHandler
{
...
}
Note: DefaultHandler also defines do-nothing methods for the other major events,
defined in the DTDHandler, EntityResolver, and ErrorHandler interfaces. You'll learn
more about those methods as we go along.
Each of these methods is required by the interface to throw a SAXException. An
exception thrown here is sent back to the parser, which sends it on to the code that
invoked the parser. In the current program, that means it winds up back at the Throwable
exception handler at the bottom of the main method.
When a start tag or end tag is encountered, the name of the tag is passed as a String to the
startElement or endElement method, as appropriate. When a start tag is encountered, any
attributes it defines are also passed in an Attributes list. Characters found within the
element are passed as an array of characters, along with the number of characters (length)
and an offset into the array that points to the first character.
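The event flow described above can be observed with a small self-contained handler. This is a sketch separate from the Echo program; the class name EventDemo and the shortened Item document are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: records the element events and the character data the parser delivers.
class EventDemo extends DefaultHandler {

    final List<String> events = new ArrayList<String>();
    final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        StringBuilder event = new StringBuilder("start " + qName);
        for (int i = 0; i < attrs.getLength(); i++) {
            event.append(' ').append(attrs.getQName(i)).append('=').append(attrs.getValue(i));
        }
        events.add(event.toString());
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end " + qName);
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        // Only the slice buf[offset .. offset+len) belongs to this event.
        text.append(buf, offset, len);
    }

    // Parses the document with a fresh handler and returns the handler.
    static EventDemo run(String xml) {
        EventDemo handler = new EventDemo();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        EventDemo demo = run("<Item ItemID=\"1043374545\"><Name>sled</Name>"
            + "<Seller UserID=\"rulabula\" Rating=\"1035\"/></Item>");
        for (String event : demo.events) {
            System.out.println(event);
        }
        System.out.println("text: " + demo.text);
    }
}
```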
3.5.5 Setting up the Parser
Now (at last) you're ready to set up the parser. Add the text highlighted below to set it up
and get it started:
public static void main(String argv[])
{
if (argv.length != 1) {
System.err.println("Usage: cmd filename");
System.exit(1);
}
// Use an instance of ourselves as the SAX event handler
DefaultHandler handler = new Echo();
// Use the default (non-validating) parser
SAXParserFactory factory = SAXParserFactory.newInstance();
try {
// Set up output stream
out = new OutputStreamWriter(System.out, "UTF8");
// Parse the input
SAXParser saxParser = factory.newSAXParser();
saxParser.parse( new File(argv[0]), handler );
} catch (Throwable t) {
t.printStackTrace();
}
System.exit(0);
}
With these lines of code, you created a SAXParserFactory instance, as determined by the
setting of the javax.xml.parsers.SAXParserFactory system property. You then got a
parser from the factory and gave the parser an instance of this class to handle the parsing
events, telling it which input file to process.
Note: The javax.xml.parsers.SAXParser class is a wrapper that defines a number of
convenience methods. It wraps the (somewhat-less friendly) org.xml.sax.Parser object. If
needed, you can obtain that parser using the SAXParser's getParser() method.
3.5.6 Writing the Output
The ContentHandler methods throw SAXExceptions but not IOExceptions, which can
occur while writing. The SAXException can wrap another exception, though, so it makes
sense to do the output in a method that takes care of the exception-handling details. Add
the code highlighted below to define an emit method that does that:
static private Writer out;
private void emit(String s)
throws SAXException
{
try {
out.write(s);
out.flush();
} catch (IOException e) {
throw new SAXException("I/O error", e);
}
}
...
When emit is called, any I/O error is wrapped in SAXException along with a message
that identifies it. That exception is then thrown back to the SAX parser. You'll learn more
about SAX exceptions later on. For now, keep in mind that emit is a small method that
handles the string output. (You'll see it called a lot in the code ahead.)
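The wrapping performed by emit can be seen in isolation. In the sketch below, a simulated write failure is wrapped in a SAXException and then recovered with getException(). The class name WrappedExceptionDemo and the "disk full" message are made up for illustration.

```java
import java.io.IOException;
import org.xml.sax.SAXException;

// Hypothetical demo: wraps an I/O failure the same way emit does, then unwraps it.
class WrappedExceptionDemo {

    // Stands in for emit, but always fails so that the wrapping is visible.
    static void failingEmit() throws SAXException {
        try {
            throw new IOException("disk full"); // simulated write failure
        } catch (IOException e) {
            throw new SAXException("I/O error", e);
        }
    }

    // Calls failingEmit and describes both the wrapper and the wrapped exception.
    static String describeFailure() {
        try {
            failingEmit();
            return "no error";
        } catch (SAXException e) {
            Exception cause = e.getException(); // the original IOException
            return e.getMessage() + " caused by "
                + cause.getClass().getSimpleName() + ": " + cause.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFailure());
    }
}
```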
3.5.7 Spacing the Output
Here is another bit of infrastructure we need before doing some real processing. Add the
code highlighted below to define a nl() method that writes the kind of line-ending
character used by the current system:
private void emit(String s)
...
}
private void nl()
throws SAXException
{
String lineEnd = System.getProperty("line.separator");
try {
out.write(lineEnd);
} catch (IOException e) {
throw new SAXException("I/O error", e);
}
}
Note: Although it seems like a bit of a nuisance, you will be invoking nl() many times in
the code ahead. Defining it now will simplify the code later on. It also provides a place to
indent the output when we get to that section of the tutorial.
3.5.8 Handling Content Events
Finally, let's write some code that actually processes the ContentHandler events.
3.5.8.1 Document Events
Add the code highlighted below to handle the start-document and end-document events:
static private Writer out;
public void startDocument()
throws SAXException
{
emit("<?xml version='1.0' encoding='UTF-8'?>");
nl();
}
public void endDocument()
throws SAXException
{
try {
nl();
out.flush();
} catch (IOException e) {
throw new SAXException("I/O error", e);
}
}
private void echoText()
...
Here, you are echoing an XML declaration when the parser encounters the start of the
document. Since you set up the OutputStreamWriter using the UTF-8 encoding, you
include that specification as part of the declaration.
3.5.8.2 Element Events
Now for the interesting stuff. Add the code highlighted below to process the start-element
and end-element events:
public void startElement(String namespaceURI,
String sName, // simple name
String qName, // qualified name
Attributes attrs)
throws SAXException
{
String eName = sName; // element name
if ("".equals(eName)) eName = qName; // not namespaceAware
emit("<"+eName);
if (attrs != null) {
for (int i = 0; i < attrs.getLength(); i++) {
String aName = attrs.getLocalName(i); // Attr name
if ("".equals(aName)) aName = attrs.getQName(i);
emit(" ");
emit(aName+"=\""+attrs.getValue(i)+"\"");
}
}
emit(">");
}
public void endElement(String namespaceURI,
String sName, // simple name
String qName // qualified name
)
throws SAXException
{
String eName = sName; // element name
if ("".equals(eName)) eName = qName; // not namespaceAware
emit("</"+eName+">");
}
private void emit(String s)
...
With this code, you echoed the element tags, including any attributes defined in the start
tag. Note that when the startElement() method is invoked, the simple name ("local
name") for elements and attributes could turn out to be the empty string, if namespace
processing was not enabled. The code handles that case by using the qualified name
whenever the simple name is the empty string.
3.5.8.3 Character Events
To finish handling the content events, you need to handle the characters that the parser
delivers to your application.
Parsers are not required to return any particular number of characters at one time. A
parser can return anything from a single character at a time up to several thousand, and
still be a standard-conforming implementation. So, if your application needs to process the
characters it sees, it is wise to accumulate the characters in a buffer, and operate on them
only when you are sure they have all been found.
Add the line highlighted below to define the text buffer:
public class Echo extends DefaultHandler
{
StringBuffer textBuffer;
public static void main(String argv[])
{
...
Then add the code highlighted below to accumulate the characters the parser delivers in
the buffer:
public void endElement(...)
throws SAXException
{
...
}
public void characters(char buf[], int offset, int len)
throws SAXException
{
String s = new String(buf, offset, len);
if (textBuffer == null) {
textBuffer = new StringBuffer(s);
} else {
textBuffer.append(s);
}
}
private void emit(String s)
...
Next, add this method highlighted below to send the contents of the buffer to the output
stream.
public void characters(char buf[], int offset, int len)
throws SAXException
{
...
}
private void echoText()
throws SAXException
{
if (textBuffer == null) return;
String s = ""+textBuffer;
emit(s);
textBuffer = null;
}
private void emit(String s)
...
When this method is called twice in a row (which will happen at times, as we'll see
next), the buffer will be null. So in that case, the method simply returns. When the buffer
is non-null, however, its contents are sent to the output stream.
Finally, add the code highlighted below to echo the contents of the buffer whenever an
element starts or ends:
public void startElement(...)
throws SAXException
{
echoText();
String eName = sName; // element name
...
}
public void endElement(...)
throws SAXException
{
echoText();
String eName = sName; // element name
...
}
You're done accumulating text when an element ends, of course. So you echo it at that
point, which clears the buffer before the next element starts.
But you also want to echo the accumulated text when an element starts! That's necessary
for document-style data, which can contain XML elements that are intermixed with text.
For example, in this document fragment:
<para>This paragraph contains <bold>important</bold>
ideas.</para>
The initial text, "This paragraph contains" is terminated by the start of the <bold>
element. The text, "important" is terminated by the end tag, </bold>, and the final text,
"ideas.", is terminated by the end tag, </para>.
Note: Most of the time, though, the accumulated text will be echoed when an
endElement() event occurs. When a startElement() event occurs after that, the buffer will
be empty. The first line in the echoText() method checks for that case, and simply
returns.
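The flushing behaviour just described can be traced on the fragment above. The following sketch is independent of the Echo program: it buffers character data and flushes it at every element boundary, recording each piece in a list. The class name MixedContentDemo and the text[...] notation are assumptions made for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo: flushes buffered text whenever an element starts or ends.
class MixedContentDemo extends DefaultHandler {

    final List<String> output = new ArrayList<String>();
    private StringBuilder textBuffer;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        flushText(); // text before a start tag belongs to the enclosing element
        output.add("<" + qName + ">");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText(); // text before an end tag belongs to the element being closed
        output.add("</" + qName + ">");
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        if (textBuffer == null) {
            textBuffer = new StringBuilder();
        }
        textBuffer.append(buf, offset, len); // the parser may deliver text in several chunks
    }

    // Records the accumulated text, if any, and clears the buffer.
    private void flushText() {
        if (textBuffer == null) return;
        output.add("text[" + textBuffer + "]");
        textBuffer = null;
    }

    // Parses the document with a fresh handler and returns the recorded pieces.
    static List<String> run(String xml) {
        MixedContentDemo handler = new MixedContentDemo();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return handler.output;
    }

    public static void main(String[] args) {
        for (String piece : run(
                "<para>This paragraph contains <bold>important</bold> ideas.</para>")) {
            System.out.println(piece);
        }
    }
}
```

Because the text is buffered, the recorded pieces do not depend on how the parser splits the character data into chunks.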
3.5.9 Compiling and Running the Program
In the Java SDK, the JAXP libraries are distributed in the directory
<JAVA_HOME>/common/lib. To compile the program you created, you'll first need to
install the JAXP JAR files in the appropriate location. (The names of the JAR files
depend on which version of JAXP you are using, and their location depends on which
version of the Java platform you are using. See the Java XML release notes at
<JAVA_HOME>/docs/jaxp/ReleaseNotes.html for the latest details.)
Note: Since JAXP 1.1 is built into version 1.4 of the Java 2 platform, you can also
execute the majority of the JAXP tutorial (SAX, DOM, and XSLT) sections, without
doing any special installation of the JAR files. However, to make use of the added
features in JAXP -- XML Schema and the XSLTC compiling translator -- you will need
to install JAXP 1.2, as described in the release notes.
For versions 1.2 and 1.3 of the Java 2 platform, you can execute the following commands
to compile and run the program:
javac -classpath jaxp-jar-files Echo.java
java -cp jaxp-jar-files Echo slideSample.xml
Alternatively, you could place the JAR files in the platform extensions directory and use
the simpler commands:
javac Echo.java
java Echo slideSample.xml
For version 1.4 of the Java 2 platform, you must identify the JAR files as newer versions
of the "endorsed standards" that are built into the Java 2 platform. To do that, put the JAR
files in the endorsed standards directory, jre/lib/endorsed. (You copy all of the JAR files,
except for jaxp-api.jar. You ignore that one because the JAXP APIs are already built into
the 1.4 platform.)
You can then compile and run the program with these commands:
javac Echo.java
java Echo slideSample.xml
Note: You could also elect to set the java.endorsed.dirs system property on the command
line so that it points to a directory containing the necessary JAR files, using a
command-line option like this: -D"java.endorsed.dirs=somePath".
slideSample01.xml
<?xml version='1.0' encoding='utf-8'?>
<!--
A SAMPLE set of slides
-->
<slideshow
title="Sample Slide Show"
date="Date of publication"
author="Yours Truly"
>
<!-- TITLE SLIDE -->
<slide type="all">
<title>Wake up to WonderWidgets!</title>
</slide>
<!-- OVERVIEW -->
<slide type="all">
<title>Overview</title>
<item>Why <em>WonderWidgets</em> are great</item>
<item/>
<item>Who <em>buys</em> WonderWidgets</item>
</slide>
</slideshow>
Echo01.java
/*
* @(#)Echo01.java 1.5 99/02/09
*
* Copyright 2002 Sun Microsystems, Inc. All Rights Reserved.
*/
import java.io.*;
import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
public class Echo01 extends DefaultHandler
{
StringBuffer textBuffer;
public static void main(String argv[])
{
if (argv.length != 1) {
System.err.println("Usage: cmd filename");
System.exit(1);
}
// Use an instance of ourselves as the SAX event handler
DefaultHandler handler = new Echo01();
// Use the default (non-validating) parser
SAXParserFactory factory = SAXParserFactory.newInstance();
try {
// Set up output stream
out = new OutputStreamWriter(System.out, "UTF8");
// Parse the input
SAXParser saxParser = factory.newSAXParser();
saxParser.parse( new File(argv[0]), handler);
} catch (Throwable t) {
t.printStackTrace();
}
System.exit(0);
}
static private Writer out;
//===========================================================
// SAX DocumentHandler methods
//===========================================================
public void startDocument()
throws SAXException
{
emit("<?xml version='1.0' encoding='UTF-8'?>");
nl();
}
public void endDocument()
throws SAXException
{
try {
nl();
out.flush();
} catch (IOException e) {
throw new SAXException("I/O error", e);
}
}
public void startElement(String namespaceURI,
String sName, // simple name
String qName, // qualified name
Attributes attrs)
throws SAXException
{
echoText();
String eName = sName; // element name
if ("".equals(eName)) eName = qName; // not namespaceAware
emit("<"+eName);
if (attrs != null) {
for (int i = 0; i < attrs.getLength(); i++) {
String aName = attrs.getLocalName(i); // Attr name
if ("".equals(aName)) aName = attrs.getQName(i);
emit(" ");
emit(aName+"=\""+attrs.getValue(i)+"\"");
}
}
emit(">");
}
public void endElement(String namespaceURI,
String sName, // simple name
String qName // qualified name
)
throws SAXException
{
echoText();
String eName = sName; // element name
if ("".equals(eName)) eName = qName; // not namespaceAware
emit("</"+eName+">");
}
public void characters(char buf[], int offset, int len)
throws SAXException
{
String s = new String(buf, offset, len);
if (textBuffer == null) {
textBuffer = new StringBuffer(s);
} else {
textBuffer.append(s);
}
}
//===========================================================
// Utility Methods ...
//===========================================================
// Display text accumulated in the character buffer
private void echoText()
throws SAXException
{
if (textBuffer == null) return;
String s = ""+textBuffer;
emit(s);
textBuffer = null;
}
// Wrap I/O exceptions in SAX exceptions, to
// suit handler signature requirements
private void emit(String s)
throws SAXException
{
try {
out.write(s);
out.flush();
} catch (IOException e) {
throw new SAXException("I/O error", e);
}
}
// Start a new line
private void nl()
throws SAXException
{
String lineEnd = System.getProperty("line.separator");
try {
out.write(lineEnd);
} catch (IOException e) {
throw new SAXException("I/O error", e);
}
}
}
3.6 Section D: Load the data into MSSQL [13]
Now we are ready to load the data prepared in the previous section into the tables created
according to the schema. All tables should have at least a primary key for efficiency.
Two data types deserve attention: MSSQL uses the datetime type for date and time values
and the money type for attributes representing money. The parser we have written in the
previous sections will produce a text file delimited by |*|, which we can import directly
with MS SQL's Import capability. Our parser will produce .dat files that look like the
example below:
Auctions.dat
1045769659|*|SPRINGERLE COOKIE BOARD ** NO RESERVE**|*|dosouth|*|14.50|*||*|14.50|*|Dec-08-01 16:23:53|*|Dec-15-01
16:23:53|*|Wood Springerle cookie borad depicting a FISH, flowers &
birds. It will imprint 8 designs in all. It
|*| is the delimiter that separates columns, and the carriage return is the row delimiter,
so that MS SQL can map the columns of the text file to the actual columns of the table in
the database.
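A row in this format can be produced by joining the field values with the column delimiter. The sketch below shows one way to do that. The class name LoaderFormat is hypothetical, and two choices are assumptions rather than requirements stated here: rows end with carriage return plus line feed, and line breaks inside a field are replaced by spaces so that they cannot be mistaken for row ends.

```java
// Hypothetical helper: builds one row of a .dat file in the |*| delimited format.
class LoaderFormat {

    static final String COLUMN_DELIMITER = "|*|";
    static final String ROW_DELIMITER = "\r\n"; // assumed: CR LF row ends

    // Joins the fields into one row; a null field becomes an empty column.
    static String toRow(String... fields) {
        StringBuilder row = new StringBuilder();
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                row.append(COLUMN_DELIMITER);
            }
            String field = fields[i] == null ? "" : fields[i];
            if (field.contains(COLUMN_DELIMITER)) {
                throw new IllegalArgumentException("field contains the column delimiter: " + field);
            }
            // Assumed clean-up: keep each row on a single line.
            row.append(field.replace('\r', ' ').replace('\n', ' '));
        }
        return row.append(ROW_DELIMITER).toString();
    }

    public static void main(String[] args) {
        // Values taken from the Auctions.dat example; buy_price is missing, hence null.
        System.out.print(toRow("1045769659", "SPRINGERLE COOKIE BOARD ** NO RESERVE**",
            "dosouth", "14.50", null, "14.50", "Dec-08-01 16:23:53", "Dec-15-01 16:23:53"));
    }
}
```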
To import auction.dat into the database, open the Enterprise Manager for MSSQL, select
the auction table, right-click on it, and click Tasks>Import.
At that stage it should ask for the source database; choose Text File from the bottom of
the drop-down menu. Then browse to and select the auction.dat file, and you should see the
window below.
Figure 3.6.1
We did not wrap text in quotation marks because it is unsafe, so no text qualifier is used.
The row delimiter is Carriage Return and Line Feed, which is standard on Windows systems,
whereas Unix-like systems use only Line Feed. Click Next to continue and it should parse
the columns.
In the next screen, select |*| as the delimiter and you will successfully import the parsed
data files into the relational database.
4 Auctionbase Schema and Data
4.1 Section A: Indexes
An important technique for improving the performance of queries is to create indexes. An
index on an attribute A of relation R allows the DBMS to quickly find all tuples in R
matching a given value or range of values for attribute A (useful when evaluating
selection or join conditions involving attribute A). An index can be created on any
attribute of any relation, or on several attributes combined.
Create at least one useful index on each table in your large AuctionBase schema. Run
several queries over your large AuctionBase database with the indexes and without the
indexes. Try to write queries that are realistic, that are complex enough to take a while to
execute, and that can exploit the indexes you chose so you can best experiment with the
performance differences. Turn in a transcript showing your commands to create indexes,
and showing the relative times of query execution with and without indexes.
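The idea behind an index can be illustrated outside the database with a sorted map from attribute value to matching rows. This is only a conceptual sketch, not how MSSQL stores its indexes. The class name PriceIndexDemo, the use of prices in cents, and the sample item ids are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical demo: an in-memory "index" on current_price, mapping price to item ids.
class PriceIndexDemo {

    // A sorted map supports both exact-match and range lookups without scanning every row.
    private final TreeMap<Long, List<Integer>> index = new TreeMap<Long, List<Integer>>();

    // Registers an item under its price (in cents).
    void insert(int itemId, long priceCents) {
        List<Integer> ids = index.get(priceCents);
        if (ids == null) {
            ids = new ArrayList<Integer>();
            index.put(priceCents, ids);
        }
        ids.add(itemId);
    }

    // Items whose price equals the given value.
    List<Integer> lookup(long priceCents) {
        List<Integer> ids = index.get(priceCents);
        return ids == null ? new ArrayList<Integer>() : new ArrayList<Integer>(ids);
    }

    // Items whose price lies in [lowCents, highCents], in ascending price order.
    List<Integer> range(long lowCents, long highCents) {
        List<Integer> result = new ArrayList<Integer>();
        for (List<Integer> ids : index.subMap(lowCents, true, highCents, true).values()) {
            result.addAll(ids);
        }
        return result;
    }

    public static void main(String[] args) {
        PriceIndexDemo byPrice = new PriceIndexDemo();
        byPrice.insert(1, 3000L); // item ids 1..3 are made up
        byPrice.insert(2, 1450L);
        byPrice.insert(3, 3000L);
        System.out.println(byPrice.lookup(3000L));
        System.out.println(byPrice.range(1000L, 2000L));
    }
}
```

Without such a structure, both lookups would have to examine every item, which is the difference the timing experiment above is meant to expose.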
4.2 Section B: Views
4.2.1 What is a View ? [14]
A view can be thought of as either a virtual table or a stored query. The data accessible
through a view is not stored in the database as a distinct object. What is stored in the
database is a SELECT statement. The result set of the SELECT statement forms the
virtual table returned by the view. A user can use this virtual table by referencing the
view name in Transact-SQL statements the same way a table is referenced. A view is
used to do any or all of these functions:

Restrict a user to specific rows in a table.
For example, allow an employee to see only the rows recording his or her work in
a labor-tracking table.

Restrict a user to specific columns.
For example, allow employees who do not work in payroll to see the name, office,
work phone, and department columns in an employee table, but do not allow them
to see any columns with salary information or personal information.

Join columns from multiple tables so that they look like a single table.

Aggregate information instead of supplying details.
For example, present the sum of a column, or the maximum or minimum value
from a column.
The syntax for a VIEW is:
CREATE VIEW view_name AS
SELECT columns
FROM table
WHERE predicates;
Whether INSERT, DELETE, and/or UPDATE statements can be performed on a view is an
interesting question. Views fall into two groups:

views that meet the updatability criteria and can be updated, and

views that do not meet the criteria and cannot be updated.
Views in all versions of SQL Server are updatable (can be the target of UPDATE,
DELETE, or INSERT statements), as long as the modification affects only one of the
base tables referenced by the view.
4.2.2 Views of AuctionBase [13]
Here are three simple views we can use on our database:
-- Open auctions with the hours left; assumes ends and curTime are datetime columns.
CREATE VIEW openItemSummary AS
SELECT item_id, name, seller, current_price, starts,
       DATEDIFF(minute, curTime, ends) / 60.0 AS hours_left
FROM auctions, time
WHERE ends > curTime AND current_price < buy_price
GO
-- Number of bids per item.
CREATE VIEW num_bids AS
SELECT b.item_id, COUNT(b.item_id) AS num_bids
FROM auctions a, bids b
WHERE a.item_id = b.item_id
GROUP BY b.item_id
GO
-- Highest bid(s) on each item.
CREATE VIEW auction_winner AS
SELECT * FROM bids b1 WHERE amount >= ALL
(SELECT b2.amount FROM bids b2 WHERE b1.item_id = b2.item_id)
GO
5 MSSQL Features
5.1 Section A: Current Time
The original auction data that we provided for you in XML, which you translated into
relations and loaded into your AuctionBase database, represents a single point in time,
specifically one second after midnight on January 17th, 2005 ("Jan-17-05 00:00:01"). In
the final part of the project, outlined below, we will develop full auction functionality:
users will be able to browse items, enter and retrieve bids, create new auctions, run
statistics, etc.
To fully test our functionality, and to simulate the true operation of an online auction
system in which auctions close as time passes, we suggest that you maintain a fictitious
"current time" in your database. Add a new one-attribute table to your AuctionBase
schema. This table should at all times contain a single row (i.e., a single value)
representing the "current time," which can be updated to represent time passing. (It's up
to you whether you also want to permit backward time-travel.) Initialize the table by
inserting the current time for the initial state of your database: Jan-17-05 00:00:01.
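On the Java side, the web application has to turn the textual form of the current time into a value that JDBC can store. The following is a minimal sketch under two assumptions that are not from the original manual: Java 8 or later (for java.time) and the hypothetical class name CurrentTimeSketch.

```java
import java.sql.Timestamp;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

/** Hypothetical helper for the fictitious "current time" of AuctionBase. */
final class CurrentTimeSketch {
    // Matches the notation used in this manual, e.g. "Jan-17-05 00:00:01".
    // With this pattern a two-digit year is read as a year in 2000-2099.
    private static final DateTimeFormatter AUCTION_TIME =
            DateTimeFormatter.ofPattern("MMM-dd-yy HH:mm:ss", Locale.ENGLISH);

    /** Parses the textual form into a date-time value. */
    static LocalDateTime parseAuctionTime(String text) {
        return LocalDateTime.parse(text, AUCTION_TIME);
    }

    /** Converts to the type expected by PreparedStatement.setTimestamp. */
    static Timestamp toSqlTimestamp(LocalDateTime time) {
        return Timestamp.valueOf(time);
    }
}
```

The resulting Timestamp can then be passed to a PreparedStatement that inserts or updates the single row of the current-time table.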
5.2 Section B: Constraints and Triggers [15] [14]
5.2.1 What is a CONSTRAINT ?
A constraint is a property assigned to a column or the set of columns in a table that
prevents certain types of inconsistent data values from being placed in the column(s).
Constraints are used to enforce the data integrity. This ensures the accuracy and
reliability of the data in the database. The following categories of the data integrity exist:

Entity Integrity

Domain Integrity

Referential integrity

User-Defined Integrity
Entity Integrity ensures that there are no duplicate rows in a table.
Domain Integrity enforces valid entries for a given column by restricting the type, the
format, or the range of possible values.
Referential integrity ensures that rows which are used by other records cannot be
deleted (that is, corresponding data values between tables are kept consistent).
User-Defined Integrity enforces some specific business rules that do not fall into entity,
domain, or referential integrity categories.
Each of these categories of the data integrity can be enforced by the appropriate
constraints. Microsoft SQL Server supports the following constraints:

PRIMARY KEY

UNIQUE

FOREIGN KEY

CHECK

NOT NULL
A PRIMARY KEY constraint is a unique identifier for a row within a database table.
Every table should have a primary key constraint to uniquely identify each row and only
one primary key constraint can be created for each table. The primary key constraints are
used to enforce entity integrity.
A UNIQUE constraint enforces the uniqueness of the values in a set of columns, so no
duplicate values are entered. Like primary key constraints, unique key constraints are
used to enforce entity integrity.
A FOREIGN KEY constraint prevents any actions that would destroy the links between
tables with the corresponding data values. A foreign key in one table points to a primary
key in another table. Foreign keys prevent actions that would leave rows with foreign key
values when there are no primary keys with that value. The foreign key constraints are
used to enforce referential integrity.
A CHECK constraint is used to limit the values that can be placed in a column. The
check constraints are used to enforce domain integrity.
A NOT NULL constraint enforces that the column will not accept null values. Like check
constraints, not null constraints are used to enforce domain integrity.
You can create constraints when the table is created, as part of the table definition by
using the CREATE TABLE statement.
Example:
CREATE TABLE cust_sample
(
    cust_id           int PRIMARY KEY,
    cust_name         char(50),
    cust_address      char(50),
    cust_credit_limit money,
    CONSTRAINT chk_id CHECK (cust_id BETWEEN 0 AND 10000)
)
5.2.2 What is a TRIGGER? [15]
A trigger is a special type of stored procedure that automatically takes effect when the
data in a specified table is modified. A trigger is invoked in response to an INSERT,
UPDATE, or DELETE statement. A trigger can query other tables and can include
complex Transact-SQL statements. The trigger and the statement that fires it are treated
as a single transaction, which can be rolled back from within the trigger. If a severe error
is detected (for example, insufficient disk space), the entire transaction automatically
rolls back.
Triggers are useful in these ways:

Triggers can cascade changes through related tables in the database; however,
these changes can be executed more efficiently using cascading referential
integrity constraints.

Triggers can enforce restrictions that are more complex than those defined with
CHECK constraints.
Unlike CHECK constraints, triggers can reference columns in other tables. For
example, a trigger can use a SELECT from another table to compare to the
inserted or updated data and to perform additional actions, such as modify the
data or display a user-defined error message.

Triggers can also evaluate the state of a table before and after a data modification
and take action(s) based on that difference.

Multiple triggers of the same type (INSERT, UPDATE, or DELETE) on a table
allow multiple, different actions to take place in response to the same
modification statement.
5.2.2.1.1 Triggers Compared to Constraints [15]
Constraints and triggers each have benefits that make them useful in special situations.
The primary benefit of triggers is that they can contain complex processing logic that
uses Transact-SQL code. Therefore, triggers can support all of the functionality of
constraints; however, triggers are not always the best method for a given feature.
Entity integrity should always be enforced at the lowest level by indexes that are part of
PRIMARY KEY and UNIQUE constraints or are created independently of constraints.
Domain integrity should be enforced through CHECK constraints, and referential
integrity (RI) should be enforced through FOREIGN KEY constraints, assuming their
features meet the functional needs of the application.
Triggers are most useful when the features supported by constraints cannot meet the
functional needs of the application. For example:

FOREIGN KEY constraints can validate a column value only with an exact match
to a value in another column, unless the REFERENCES clause defines a
cascading referential action.

A CHECK constraint can validate a column value only against a logical
expression or another column in the same table. If your application requires that a
column value be validated against a column in another table, you must use a
trigger.

Constraints can communicate about errors only through standardized system error
messages. If your application requires (or can benefit from) customized messages
and more complex error handling, you must use a trigger.
Triggers can cascade changes through related tables in the database; however, these
changes can be executed more efficiently through cascading referential integrity
constraints.

Triggers can disallow or roll back changes that violate referential integrity,
thereby canceling the attempted data modification. Such a trigger might go into
effect when you change a foreign key and the new value does not match its
primary key. For example, you can create an insert trigger on titleauthor.title_id
that rolls back an insert if the new value does not match some value in
titles.title_id. However, FOREIGN KEY constraints are usually used for this
purpose.

If constraints exist on the trigger table, they are checked after the INSTEAD OF
trigger execution but prior to the AFTER trigger execution. If the constraints are
violated, the INSTEAD OF trigger actions are rolled back and the AFTER trigger
is not executed.
5.2.3 CONSTRAINTs & TRIGGERs of AuctionBase DB [13]
If the data in your AuctionBase system at a given point in time represents a correct state
of the real world, a number of constraints are expected to hold. Here are a few possible
examples, some of which depend on a particular schema:

In every auction the number-of-bids field (if included) corresponds to the actual
number of bids.

In every auction and every bid the quantity (if present) must be greater than 0.

The item-id in every bid corresponds to an actual item.

No auction may have a bid before its start time or after its end time.

There are no bids after the current time.

The quantity in a bid must not exceed the quantity available.

A user may not bid on an item he or she is offering. (This one is a judgment call.)

All sellers and bidders must exist as users. (Whether this one makes sense
depends on your relational schema.)
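Some of the rules listed above can also be checked in the web application before a bid is sent to the database, so that the user gets a readable message early. The sketch below covers only three of the rules; the class name, method signature, and messages are hypothetical, and it assumes Java 8 or later. It complements the database-side checks and does not replace them.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical application-side check of a few AuctionBase bid rules. */
final class BidRuleSketch {
    /** Returns one message per violated rule; an empty list means the bid passes these checks. */
    static List<String> violations(String bidder, LocalDateTime bidTime, String seller,
                                   LocalDateTime starts, LocalDateTime ends, LocalDateTime now) {
        List<String> problems = new ArrayList<>();
        // No auction may have a bid before its start time or after its end time.
        if (bidTime.isBefore(starts) || bidTime.isAfter(ends)) {
            problems.add("Bid time is outside the auction's start and end time");
        }
        // There are no bids after the (fictitious) current time.
        if (bidTime.isAfter(now)) {
            problems.add("Bid time is after the current time");
        }
        // A user may not bid on an item he or she is offering.
        if (bidder.equals(seller)) {
            problems.add("A user may not bid on his or her own item");
        }
        return problems;
    }
}
```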
5.2.3.1 CONSTRAINTs of AuctionBase DB
The following constraints enforce referential integrity: every foreign key value must
exist as a key in the table it references. A constraint name cannot be reused for a
second table, so the two item_id constraints are given different names.
ALTER TABLE bids ADD CONSTRAINT itemIDRef FOREIGN KEY (item_id)
REFERENCES auctions(item_id);
ALTER TABLE bids ADD CONSTRAINT bidderIDRef FOREIGN KEY (bidder)
REFERENCES users(user_id);
ALTER TABLE auctions ADD CONSTRAINT sellerIDRef FOREIGN KEY (seller)
REFERENCES users(user_id);
ALTER TABLE itemInCategory ADD CONSTRAINT catItemIDRef FOREIGN KEY (item_id)
REFERENCES auctions(item_id);
ALTER TABLE itemInCategory ADD CONSTRAINT catIDRef FOREIGN KEY (cat_id)
REFERENCES categories(cat_id);
5.2.3.2 TRIGGERs of AuctionBase DB
The bidsBetweenStartAndEnd trigger enforces that the bid time always lies between the start
and the end of the auction; it runs this check on every insert into and update of the bids table.
-- Transact-SQL version; the inserted pseudo-table holds the new or changed rows.
CREATE TRIGGER bidsBetweenStartAndEnd
ON bids
AFTER INSERT, UPDATE
AS
BEGIN
    IF EXISTS (SELECT 1 FROM inserted i
               JOIN auctions a ON a.item_id = i.item_id
               WHERE i.time < a.starts OR i.time > a.ends)
    BEGIN
        RAISERROR('Bid time must be between the starting and ending time of the item being bid upon', 16, 1);
        ROLLBACK TRANSACTION;
    END
END
GO
The cannotBidOnOwnItem trigger prevents the auction owner from bidding on his/her own
item. The check must hold on every insert into and update of the bids table; otherwise the
trigger raises an error and rolls the change back.
-- Transact-SQL version; the inserted pseudo-table holds the new or changed rows.
CREATE TRIGGER cannotBidOnOwnItem
ON bids
AFTER INSERT, UPDATE
AS
BEGIN
    IF EXISTS (SELECT 1 FROM inserted i
               JOIN auctions a ON a.item_id = i.item_id
               WHERE i.bidder = a.seller)
    BEGIN
        RAISERROR('A user may not bid upon his or her own items', 16, 1);
        ROLLBACK TRANSACTION;
    END
END
GO
The setCurrentPriceOnNewBid trigger sets the current price of an auction to the new
bidding price after every insert into the bids table, that is, whenever a new bid is made.
-- Transact-SQL version; assumes one new bid per item in each INSERT statement.
CREATE TRIGGER setCurrentPriceOnNewBid
ON bids
AFTER INSERT
AS
BEGIN
    UPDATE a SET current_price = i.amount
    FROM auctions a
    JOIN inserted i ON a.item_id = i.item_id;
END
GO
6 AuctionBase Web Site
6.1 Functionality
The functionality of our AuctionBase system is quite flexible and open-ended. However,
we would want to implement some basic capabilities:

Ability to manually change the "current time."

Automatic auction closing. An auction is "open" after its start time and "closed"
when its end time is past or its buy price is reached for its entire quantity. Your
design may be such that an auction closes implicitly with high enough bids or a
time update, or you may have chosen to represent open/closed status with an
explicit data field.

Ability for new auction users to provide their information to be entered into the
database (name, initial rating if not assigned automatically, optional location and
country), if relevant in your schema.

Ability to browse auctions of interest based on a variety of input choices. Possible
parameters include open/closed status, category, date, price, substring match in
description, etc. Use your imagination.

Ability to see the winner(s) of a closed auction.

Ability for auction users to enter bids on open auctions.

Ability for auction users to add new items up for auction.

Ability to retrieve auction or bidding history for a given user, including current
auctions or bids.

Ability to run various statistics over the auctions. Possibilities include average
number of bids per user, highest selling price over initial bid, average time to
reach buy-price, etc. Use your imagination.
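The automatic auction closing rule from the list above can be computed implicitly from the stored values. The following is a minimal sketch, assuming Java 8 or later and a single-quantity auction; the class and parameter names are hypothetical. It uses the same boundaries as the openItemSummary view (ends > curTime and current_price < buy_price) and treats an auction as open from its start time onward.

```java
import java.math.BigDecimal;
import java.time.LocalDateTime;

/** Hypothetical helper that derives the open/closed status instead of storing it. */
final class AuctionStatusSketch {
    /**
     * buyPrice may be null when an auction has no buy price
     * (an assumption about the schema, not stated in the manual).
     */
    static boolean isOpen(LocalDateTime now, LocalDateTime starts, LocalDateTime ends,
                          BigDecimal currentPrice, BigDecimal buyPrice) {
        boolean started = !now.isBefore(starts);          // starts <= now
        boolean ended = !now.isBefore(ends);              // ends <= now
        boolean boughtOut = buyPrice != null
                && currentPrice.compareTo(buyPrice) >= 0; // buy price reached
        return started && !ended && !boughtOut;
    }
}
```

With this design, changing the "current time" or inserting a high enough bid closes the auction without any extra update; an explicit status field would instead have to be maintained on every such change.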
6.2 Web Interface [13]
Implementation of the web interface is left entirely to the student, but a clean and
user-friendly interface is expected. Below are some template pages:
Figure 6.2.1
Figure 6.2.2
Figure 6.2.3
6.3 System testing
You should debug your queries directly on MSSQL before hooking them into your Web
interface. JDBC is not particularly friendly when it comes to "runaway" queries, so you
will benefit yourself and the rest of the class by using Query Analyzer first to ensure that
your queries are working properly and are finishing in a reasonable amount of time. Once
you are certain your queries are working correctly, incorporate them into your Web
interface.
Even with prior debugging, it is prudent to set a timeout mechanism in JDBC for all of
your queries. Use setQueryTimeout([time in seconds]) on each of your statement objects,
for example:
Statement stmt = conn.createStatement( );
stmt.setQueryTimeout(180);
...
7 Conclusion
In this project, we have covered some practical issues of building a simple web
application: its design, its technical background, the setup, installation and fine-tuning
of its development environment, and finally its implementation. There is endless detail in
even a small project like this, because there is usually no limit to optimizing
performance, writing better code, and building a good, user-friendly interface and a
secure system; it is a small-scale software engineering challenge. This manual is written as a
help document for a student who wants to do a database project as part of an introductory
database course. In the references and resources section, a good number of book
references and World Wide Web links are provided so that the student can find more
detail in any area that needs clarification or deeper knowledge. I hope that this tutorial
will be of use to many database students.
8 References & Resources

XML References & Resources
o Books

Inside XML, Steven Holzner, 2001, New Riders Publishing

XML Schema: The W3C's Object-Oriented Descriptions for XML,
Eric van der Vlist, 2002, O’Reilly

XML in a Nutshell, Elliotte Rusty Harold, W. Scott Means , 3rd
Edition, 2004, O’Reilly
o World Wide Web

W3C Application Domain: http://www.w3.org/XML/

A Technical Introduction:
http://www.xml.com/pub/a/98/10/guide0.html

[12] XML from the Inside Out: http://www.xml.com

Well-formedness Checker:
http://www.cogsci.ed.ac.uk/~richard/xml-check.html

W3C Markup Validator: http://validator.w3.org/

Microsoft’s XML Perspective: http://msdn.microsoft.com/xml/

Oracle XML Technology Center:
http://www.oracle.com/technology/tech/xml/index.html

[1] SAX (Simple Api for XML): http://www.saxproject.org/

[11] Declaring Elements and Attributes in an XML DTD :
http://www.rpbourret.com/xml/xmldtd.htm

Java References & Resources
o Books

Java Servlet & JSP Cookbook, Bruce W. Perry 1st Edition January
2004, O’Reilly

Java Servlet Programming, 2nd Edition, Jason Hunter, O’Reilly

Tomcat: The Definitive Guide, Jason Brittain, Ian F. Darwin,
O’Reilly

Java: How To Program 5/E, 2004, Deitel & Deitel
o World Wide Web

[10] Sun Microsystems: http://java.sun.com

Everything about Tomcat, Apache Jakarta Project:
http://jakarta.apache.org/


[2] Tomcat Configuration for Windows XP & 2000:
http://www.coreservlets.com/Apache-Tomcat-Tutorial/

[3] A Servlet and JSP Tutorial:
http://www.apl.jhu.edu/~hall/java/Servlet-Tutorial/ServletTutorial-First-Servlets.html

J2SE v1.4.2 API Specification:
http://java.sun.com/j2se/1.4.2/docs/api/index.html

Apache Tomcat 5.0 Documentation:
http://jakarta.apache.org/tomcat/tomcat-5.0-doc/

JDBC Microsoft SQL Driver for Windows XP/2000:
http://www.microsoft.com/downloads/details.aspx?FamilyID=9f1874b6-f8e1-4bd6-947c-0fc5bf05bf71&displaylang=en

[4] JDBC Driver for Microsoft SQL Server Installation How-to:
http://www.akadia.com/services/sqlsrv_jdbc.html

[5] A Complete JDBC Example:
http://www.eas.asu.edu/~cse494db/IonJDBC/JDBCExample.html

Sun’s JDBC Course on Web:
http://java.sun.com/developer/onlineTraining/Database/JDBCShortCourse/jdbc/jdbc.html

[6] Java Developers Almanac: JSP & Servlet Examples:
http://javaalmanac.com/egs/javax.servlet.jsp/pkg.html

JSP & Servlet Tutorials: http://www.coreservlets.com
Database Design
o Books

An Introduction to Database Systems, C. J. Date, 7/E, Addison-Wesley, 2000
o World Wide Web

Database Modeling Using UML:
http://www.sparxsystems.com.au/uml_topics/uml_datamodel/uml_datamodel.htm

Data Warehousing – SQL for Nerds:
http://philip.greenspun.com/sql/data-warehousing.html


SQL & Microsoft SQL Server References & Resources
o Books

SQL Bible, Alex Kriegel, Boris M. Trukhnov, 2002, Wiley

SQL Queries for Mere Mortals, A Hands-On Guide to Data
Manipulation in SQL, Michael J. Hernandez, John L. Viescas,
2000, Addison-Wesley

SQL - The Complete Reference, Paul N. Weinberg, James R.
Groff, 2002, McGraw-Hill

Inside Microsoft SQL Server 2000, Kalen Delaney, 2000,
Microsoft Press
o World Wide Web

SQL for Web Nerds:
http://philip.greenspun.com/sql/introduction.html

SQL for Web Nerds – Queries:
http://philip.greenspun.com/sql/queries.html

SQL for Web Nerds – Complex Queries:
http://philip.greenspun.com/sql/complex-queries.html

[7] Microsoft SQL Server Programming Guide:
http://www.informit.com/guides/content.asp?g=sqlserver&seqNum=46

[15] MS SQL Constraints:
http://www.mssqlcity.com/Articles/General/using_constraints.htm

[8] SQL – Web Nerds Triggers & Constraints:
http://philip.greenspun.com/sql/triggers.html

[14] Microsoft SQL Home: www.microsoft.com/sql/

Sample Parser, DB Code and Help from Stanford Database Group
o [13] Josh Sandberg
o [9] Stanford CS 145 Page: http://www.stanford.edu/class/cs145/